Hall of Merit: A Look at Baseball's All-Time Best

Monday, August 25, 2008

Election Results: Williams, Musial, Delahanty, Yaz and Raines are Tops in Left!

By unanimous support, the electorate decided that Ted Williams is the best left fielder in the Hall of Merit. Not that far behind, Stan Musial obtained an impressive 95% of all possible points. The only 19th century player with at least 75%, Ed Delahanty earned a strong 88%. Carl Yastrzemski trailed Big Ed only slightly with his 85%. The last player with 75%, Tim Raines was comfortably over that threshold with 79%. Thanks to OCF and Ron Wargo for their help with the tally!

RK Player 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 PTS
1 Ted Williams 22 462
2 Stan Musial 22 440
3 Ed Delahanty 16 3 2 1 407
4 Carl Yastrzemski 6 8 6 2 392
5 Tim Raines 8 8 3 1 2 365
6 Jesse Burkett 2 1 9 5 3 2 338
7 Al Simmons 1 4 5 4 3 5 333
8 Fred Clarke 1 3 6 7 3 1 1 313
9 Billy Williams 2 2 6 4 2 3 1 1 1 267
10 Monte Irvin 3 2 2 1 1 3 2 3 4 1 255
11 Willie Stargell 1 1 1 2 2 3 3 2 2 1 3 1 220
12 Sherry Magee 1 2 5 1 4 3 1 2 1 1 1 192
13 Jimmy Sheckard 1 3 2 3 3 1 1 1 2 1 2 2 171
14T Zack Wheat 2 3 1 3 2 5 1 2 2 1 170
14T Charlie Keller 1 2 3 1 1 2 1 2 4 1 4 170
16 Joe Kelley 4 4 1 3 4 1 1 2 2 168
17 Goose Goslin 2 1 1 3 4 2 3 1 2 3 161
18 Minnie Minoso 1 3 2 2 3 3 3 4 1 132
19 Charley Jones 3 1 2 1 2 1 1 5 6 109
20 Joe Medwick 3 2 3 2 3 3 4 2 97
21 Harry Stovey 1 1 1 1 2 1 2 4 6 3 90
22 Ralph Kiner 1 2 2 1 2 4 2 8 85
John (You Can Call Me Grandma) Murphy
Posted: August 25, 2008 at 12:26 AM  34 comment(s)
Reader Comments and Retorts
1. John (You Can Call Me Grandma) Murphy Posted: August 25, 2008 at 02:01 AM (#2915011)
Here are the results expressed in a slightly different, but entirely equivalent way. Instead of stating each candidate's point total, I'll give the average placement on everyone's ballots. Then I'll also give the standard deviation of that placement. The first number in the chart below is the average placement; the second number is the standard deviation. The 1.00 and 2.00 with zero standard deviation for Williams and Musial are signs of unanimity.
1. T.Williams 1.00 0.00
2. Musial . . 2.00 0.00
3. Delahanty . 3.50 0.99
4. Yaz . . . . 4.18 0.94
5. Raines . . 5.41 1.61
6. Burkett . . 6.64 1.49
7. Simmons . . 6.86 1.55
8. Clarke . . 7.77 1.62
9. B.Williams 9.86 3.40
10. Stargell . 12.00 3.18
11. Magee . . 13.27 2.77
12. Sheckard . 14.23 3.64
13. Wheat . . 14.27 2.80
13. Keller . . 14.27 3.74
15. Kelley . . 14.36 2.79
16. Goslin . . 14.68 2.96
17. Minoso . . 16.00 2.70
18. Jones . . 17.05 4.24
19. Medwick . 17.59 2.27
20. Stovey . . 17.91 3.03
21. Kiner . . 18.15 2.90
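The "entirely equivalent" claim can be checked with a few lines of Python. This is a minimal sketch, assuming a scoring rule the thread never states but which Williams's 462 (22 x 21) and Musial's 440 (22 x 20) imply: 21 points for a 1st-place vote and one point less for each later place, across 22 ballots. The function names are made up for illustration, and the Delahanty vote split is inferred from his point total and the 3rd- and 4th-place counts quoted in the thread, not read directly from the garbled tally.

```python
from math import sqrt

BALLOTS = 22      # voters in this election
TOP_POINTS = 21   # assumed: 1st place = 21 points, each later place one point less


def average_placement(points, ballots=BALLOTS, top_points=TOP_POINTS):
    """Convert a point total into the mean ballot position."""
    # A vote for place p is worth (top_points + 1 - p) points, so the mean
    # placement is (top_points + 1) minus the mean points per ballot.
    return (top_points + 1) - points / ballots


def placement_stats(votes):
    """Mean and population SD of placement from a {place: vote count} dict."""
    n = sum(votes.values())
    mean = sum(place * count for place, count in votes.items()) / n
    var = sum(count * (place - mean) ** 2 for place, count in votes.items()) / n
    return mean, sqrt(var)


print(round(average_placement(462), 2))             # Ted Williams
print(round(average_placement(407), 2))             # Ed Delahanty
print(placement_stats({1: 22}))                     # unanimous first place
print(placement_stats({3: 16, 4: 3, 5: 2, 7: 1}))   # inferred Delahanty split
```

Under these assumptions the population (divide-by-n) standard deviation, rather than the sample version, is the one that matches the 0.99 listed for Delahanty.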
Rusty Priske: 87
andrew siegel: 86
whoisalhedges: 86
DL from MN: 86
Chris Cobb: 86
Devin McCullen: 85
Joe Dimino: 85
Howie Menckel: 85
OCF: 84
Dan R: 84
ronw: 84
== median ==
Tiboreau: 84
bjhanke: 83
Sean Gilman: 82
TomH: 82
Mark Donelson: 80
John Murphy: 78
mulder & scully: 78
Esteban Rivera: 76
EricC: 76
Rick A: 70
sunnyday2: 70
What gives?
For instance:
3rd place votes: Delahanty 16, Yaz 6.
4th place votes: Delahanty 3, Yaz 8, Raines 8, Burkett 2, Simmons 1.
And so on. They do all add up to 22.
Yaz's totals are shifted, methinks. His name is too damn long.
With the old tabulator, Yastrzemski's name wouldn't be a problem. However, the makeshift ballot counter that I created doesn't format properly here for some reason, despite it looking fine on my Excel spreadsheet.
As briefly as possible, here's the concept. Almost all (actually, all that I know of) of the measures we use to rank players (Win Shares, Total Player Ranking, etc.) express their results in a form that is not related to Standard Deviation. Now, when a league season is a "high standard deviation" season, that means that the "size" of the standard deviation for that league is greater than the size of the normal league's SD. Well, "size" means Win Shares (or whatever). A high-SD league has more Win Shares per standard deviation than a normal league. And that means that the height of the SD does not apply evenly to all players. The more SDs the player has, the more he gains from the high-SD league. Burkett and Magee are, of course, players whose normal seasons are more than one standard deviation above the league mean. And so, they gain more from a high-SD league like 1901, 1914, or 1915 than a normal player does. Their large numbers of Win Shares in those leagues are, at least in large part, not a result of their having a hot season. They are the result of having a large boost from the high-SD league.
OK, that was as short as I can write. What follows is an example, using Magee and the hypothetical player "Joe Ordinary". Please don't obsess over the exact numbers or the fact that I'm using Win Shares. I'm making the numbers up in order to make the example easy to read and understand. I'm using Win Shares because they provide nice, easy numbers. That's all.
Let's say, just for argument's sake, that a league with a normal standard deviation has ten Win Shares in one SD for a player who plays the entire year. Again, that's just a convenient number I made up. Let's postulate a player called Joe Ordinary. Joe's normal season is one SD above the league mean. That does have value, of course: ten WS a year. Joe is a starter, but no more than an ordinary one. The concept, though, is important. What I'm doing is measuring a player's performance by his standard deviations, rather than WS or TPR or anything else. Joe Ordinary is a one-SD player, NOT a ten-WS player. That's important. And I think it's correct for purposes of Hall of Merit voting. I have reasons for that, but they triple the length of the post, so they'll have to wait for the center field discussion thread that is coming.
Now consider Sherry Magee. He is hardly Joe Ordinary. His normal season comes in at what? Three SD above the mean? That sounds close to correct. There should be, in a population the size of baseball's, about 5 SD between the mean and the absolute best player. I'm giving Magee three SD. Again, I'm making that up, but at least there is some logic to it. Magee might be a 4-SD player. Or a 2.5. I don't know. Just bear with me.
Now let's consider 1914 and 1915, when the Federal League drained some talent away from the NL and AL. That increases the size of the SD in the two established leagues. Let's say the gain in Magee's NL is 2 Win Shares per SD. Again, I'm making this number up for convenience.
What is the effect of this on Joe Ordinary? Well, his typical season goes from 10 WS to 12, because he has one SD. And no one notices or cares. It just looks like a normal variance among Joe's seasons. No problem.
But how about Magee? Well, he's a 3-SD player. So his gain is not 2 Win Shares, but 2x3=6. His normal season (again, I know this is high for Sherry Magee, but bear with me) runs about 30 WS if he plays every single day. But in 1914 and 1915, he turns in 36 WS. That looks huge. It looks like career years. In any analysis I know of, those two seasons define his peak ranking. But is it that huge? Well, not really. Sherry was still the same old 3-SD player he always was. His Win Shares are higher only because there are more of them in each of Sherry's 3 SDs.
That's my argument for making deductions for Magee, Burkett, and 1961's Cash, Gentile, Mantle, and Maris. These guys may have turned in their best years ever, in terms of WS or TPR, but some, maybe all, of what we see is simply the effect of a high SD on a multiple-SD player.
The argument turns on the idea of SD/% Dissonance. Win Shares (or TPR or whatever) are not statistically related to Standard Deviations. So, if a league has a high SD, that means that each SD, NOT each player (sorry, Paul, but this is where we disagree), gains a WS or two or whatever. Multiple-SD players benefit more from that than lesser players do.
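The arithmetic in this example can be sketched in Python. Like the post's own figures, the numbers are made up for readability; the function name and constants are hypothetical and only restate the "Win Shares = SDs x Win Shares per SD" idea.

```python
def win_shares(player_sds, ws_per_sd):
    """Full-season Win Shares for a player who sits `player_sds` standard
    deviations above the mean, in a league where one SD is worth `ws_per_sd`."""
    return player_sds * ws_per_sd


NORMAL_WS_PER_SD = 10   # made-up value from the example
HIGH_SD_BONUS = 2       # made-up extra Win Shares per SD in the high-SD league

for name, sds in [("Joe Ordinary", 1), ("Sherry Magee", 3)]:
    normal = win_shares(sds, NORMAL_WS_PER_SD)
    high = win_shares(sds, NORMAL_WS_PER_SD + HIGH_SD_BONUS)
    print(f"{name}: {normal} WS normally, {high} WS in the high-SD league (+{high - normal})")
```

The same 2-WS change per SD adds 2 Win Shares to the one-SD player and 6 to the three-SD player, which is the whole point of the example.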
I rely on this concept a LOT. I make a LOT of ad hoc deductions based on it. And right now, I'm begging for help. It appears, from some of the comments made in these here HoM threads, that at least some of you actually have worked out the SDs for every league in the history of the majors. Can I get a copy of this data? It would help me a LOT. I am willing to pay for it, as I imagine the effort took a great deal of time and trouble. But I really, really want this tool. If you have it, and are willing to share, PLEASE post up here and let me know.
OK, enough begging. Now you know why I made the deductions. 1901, 1914 and 1915 are high-SD seasons. Jesse Burkett and Sherry Magee are multiple-SD players. The benefit per SD gets multiplied by the players' SDs.
I am fully aware that the counter argument goes "but those extra WS contribute actual runs and actual wins to the player's actual team." And that is correct. In a high-SD league, the best players contribute even more to their teams' efforts than they normally do. That is, basically, why Bill James timelines. If he doesn't, too many of the best rankings end up coming from the dead ball era (and would come from the 19th c. if Bill used FSEs), because early SDs are higher, in general, than later ones. But his timeline is just linear. It's not related to SD at all. This concept is. As far as I know, using SDs to rank players automatically timelines, because it automatically adjusts for high and low SDs. War years. Early baseball. Expansion years. It catches them all.
At this point, I'm stopping. If I go any further, it means discussing the difference between "WS contribute actual runs" and "SDs rank players correctly within their contexts." That is what would triple the length of this post. I promise to get into that one, but not right now. Balloting for left field is closed. I'll post more in center, unless people are just sick and tired of reading my stuff.
Yours, and Thanks! - Brock
No, really, I guess it means all those weird adjustments I made weren't altogether ridiculous, after all. Whew!
- Brock
"However, the makeshift ballot counter that I created doesn't format properly here for some reason,
despite it looking fine on my Excel spreadsheet."
I think the problem is that whatever translator went from Excel to the web here placed the tabs at half an inch.
My display is very, very wide, just as wide as the ballot displays.
It's true that Yaz' name is too long and runs over the first tab,
but I don't know if that's what threw the whole tab setting off.
BTW, the reason this post doesn't have the width problem is that I forced carriage returns,
not because I somehow solved the problem.
- Brock
Really?
I'm usually near the bottom.
Finally you are all coming around to my way of thinking!
:)
The display is very wide because of the TAB-alignment and the extra characters (TABs) at the far right. Highlight the region and see that clearly.

Brock #11
The argument turns on the idea of SD/% Dissonance. Win Shares (or TPR or whatever) are not statistically related to Standard Deviations.
So, if a league has a high SD, that means that each SD, NOT each player (sorry, Paul, but this is where we disagree), gains a WS or two or whatever.
Multiple-SD players benefit more from that than lesser players do.
I don't disagree with that, but I disagree that it should make much difference. Regarding Sherry Magee 1913-1916, when you knock
Magee 19-29-26-16 down to 19-26-24-16 (losing five win shares)
you must knock
Sam Crawford 27-31-28-13 down to 27-28-25-13 (losing six) and
Joe Jackson 36-20-18-34 down to 36-18-16-34 (losing four).
(Crawford and Jackson show up in right field next month.)
And when you look at 1969-70, making the 1970 effect a bit smaller in this illustration,
you must knock
Yaz 26-36 down to 23-34 (losing five)
Williams 24-29 down to 22-27 (losing four)
Stargell 27-17 down to 25-16 (losing three)
How will such season effects, in contrast to knocking decades or half-centuries up and down, show up as notable differences in a ranking such as this election?
Only for someone who puts a big emphasis on "peak" defined by best non-consecutive seasons, such as the Bill James 3-yr peak.

For Jesse Burkett, when you knock down his league-first 38 win shares in 1901 you should also boost his league-sixth 25 in 1900, perhaps 25-38 ==> 27-35 (losing one).
Only for someone who puts a big emphasis on "peak" defined by best non-consecutive seasons, such as the Bill James 3-yr peak.
I'm not certain that the difference should be remarkable even for any such 3-yr peakist.
Someone who counts "MVP type seasons" where 30=yes, 29=no might fortunately discount two such seasons, with remarkable consequences.
1. The counterargument you propose is a straw man. First, let's not use Win Shares even in this theoretical analysis, because they are poorly thought out and lead to all sorts of problems in this type of discussion (I can explain why if you're interested). Let's use something that measures actual value, some indicator of wins above replacement. Doesn't matter whether you use mine, Baseball Prospectus's, or a homegrown version, it's just the concept. Let's also just assume that player performance is normally distributed about the mean (it isn't, but it's close enough for our purposes). OK, take a league where the standard deviation of player performance is 2 wins per season. If we call the bottom 2-3% of major leaguers replacement players, that means they will be four wins below average per season, while the All-Stars will be four wins above average per season. This makes league average a four-WARP player, and an All-Star an eight-WARP player. Let's also say that the top two teams in the league win 95 and 90 games.
OK, now let's say that something, some real external factor, actually causes this stdev to double (as opposed to simply the addition of a bunch of superstars or superscrubs to the league). (In practice, this would most likely be an increase in run scoring or an expansion.) Now, replacement is eight wins below average, while All-Stars are eight wins above average, meaning that league average players have become eight-WARP players, and All-Stars have become 16-WARP players, overnight, with no change to their underlying ability. What happens?
Well, assuming the distribution of talent between teams doesn't change, you'd see a corresponding increase in the standard deviation of wins between teams. So the 90-win team (nine wins above average) will become a 99-win team (18 wins above average), while the 95-win team will become a 109-win team.
Why is winning games important? Because it leads to pennants. When you increase standard deviation, ceteris paribus, you change not only the stats-wins relationship, but also the wins-pennants relationship by the same amount. 97 wins is enough to eke out a pennant in the low-stdev league, but is only good for third place in the high-stdev league. So, if what we are interested in is "pennants added," then we most definitely DO need to correct for standard deviation when assessing players' value.
2. The correct standard deviation adjustment is multiplicative, not additive. You don't "subtract 2" to discount the effects of a high-stdev league, you "multiply by .9" or whatever. As you say, this will reduce the value attributed to the stars more than to the scrubs.
3. That said, we need to distinguish between a "true" increase in standard deviation (one that actually makes a league "easier to dominate") and the inevitable year-to-year fluctuation that takes place due to random noise and to the actual distribution of talent in a league. The clearest example of this is that the highest observed major league standard deviations since 1893 are clustered in the 1920's AL. Was this because the league was easy to dominate? No, it's because the league had a one-man star glut by the name of George Herman Ruth, who single-handedly was increasing the overall league stdev by massive proportions.
The way to do this is with a regression analysis, which determines the relationships between league factors like run scoring, expansion, and population per team, and observed stdevs over the course of baseball history. By applying the resulting equation to each league-season, we can then determine how easy it was to dominate based on these factors, without making any reference to the actual performance of the players in the league, thus avoiding the temptation to give extra credit to George Burns for playing in low standard deviation leagues just because all of the stars were in the AL. The result of this is the standard deviation adjustment I use in my WARP.
4. You say, "Now let's consider 1914 and 1915, when the Federal League drained some talent away from the NL and AL. That increases the size of the SD in the two established leagues." It does? This is true neither conceptually (Why would it? If all the superstars jumped to the FL, the major league stdev would go *down*) nor empirically (the stdev for 1914-15 was 2.69 wins per player-season, compared to 2.83 wins per player-season over the 1910-13 and 1916-19 period). There is no reason to deduct for 1914-15. There *is* a reason to deduct for the 1901 expansion, but not more than there is to deduct for the mid 1890's NL being such a high scoring league and producing astronomical stdevs.
5. Someone wants to pay for my data? Are you f'ing kidding me? Did I just die and go to heaven???? It's available for free in the StDevs and Rep Levels.xls file in the Rosenheck WARP.zip archive posted to the Hall of Merit Yahoo group. I can send you an offense-only version as well if you'd like, which gets a higher r-squared on the regression and, therefore, bigger standard deviation corrections than the ones I am currently using.
6. Using SDs to rank players does not "automatically timeline." It simply allows us to compare players' contributions in terms of pennants added on a level playing field, across league-seasons where the wins-pennants relationship can differ substantially. Timelining, as it is generally understood here, means discounting pennants from earlier eras, on the grounds that the quality of competition is not as high. My stdev adjustment only goes far enough to equalize pennants from the 1890s and the 1980's.
7. If you are interested in learning more about this subject, there's a 550-post thread on my WARP data that you might find informative. I have been deluging the group with analysis and arguments about standard deviations ever since I rejoined the project, to the great consternation of a large number of participants and lurkers (see around post 170 on the 2005 results thread for criticism of my approach).
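Points 1 and 2 above lend themselves to a short numerical sketch. It only restates the hypothetical league from point 1, assuming an 81-win average team (implied by "nine wins above average" for a 90-win club) and replacement level and All-Stars sitting two standard deviations either side of average. The function names are illustrative and are not taken from the WARP spreadsheet mentioned in point 5.

```python
AVERAGE_WINS = 81  # implied by the 90-win team being nine wins above average


def rescale_wins(wins, stdev_factor, average=AVERAGE_WINS):
    """Stretch a team's distance from average by the change in league stdev."""
    return average + (wins - average) * stdev_factor


def warp_levels(stdev):
    """WARP of an average player and of an All-Star, with replacement level
    two stdevs below average and All-Stars two stdevs above it."""
    replacement = -2 * stdev
    return 0 - replacement, 2 * stdev - replacement


def stdev_adjust(value, factor):
    """Multiplicative correction: every player is scaled by the same factor."""
    return value * factor


print(warp_levels(2), warp_levels(4))              # before and after the stdev doubles
print(rescale_wins(90, 2), rescale_wins(95, 2))    # the top two teams after doubling

# 97 wins beats the 95-win leader in the low-stdev league,
# but trails both rescaled teams in the high-stdev league.
print(97 > 95, 97 < rescale_wins(90, 2))

# Scaling by 0.9 costs the eight-WARP star more than the four-WARP regular.
for warp in (8, 4):
    print(warp, round(warp - stdev_adjust(warp, 0.9), 2))
```

A subtractive correction would take the same amount from both players; the multiplicative one removes twice as much from the eight-WARP player as from the four-WARP player.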
The error bar on Stargell is huge. This group ranked him from 6th to 19th. Also Sheckard: 9th to 21st.
We have a jarring lack of consensus on defensive value.
I'm not a big fan of ending elections on weekends. I usually don't check this page on the weekends (especially gorgeous weekends in August).
Le Samourai, I don't at the moment, although if there are any particular players you are interested in I can calculate it for them. I plan on producing a 2.1 build of my WARP (no methodological changes, just incorporating the new baserunning and defensive statistics that have become available since I first developed the metric), and would definitely update it for '06-'07 (and '08, presumably) then. Version 3.0 would be a fully integrated model including pitchers, but until I can figure out how to approach the pitching/fielding split over time (so I don't have 1930's SS getting 5 FWAA), that's going to remain on the back burner.
Do you see the last column of my post #2? That's the "error bar." And the runaway most-disagreed-about candidate is Jones. That (the standard deviation of ballot placement) should be higher in the middle of the ranking and lower at both ends. As such, the 3.18 for Stargell isn't really out of line, and we disagreed more about Williams, Sheckard, and Kiner.
Kiner is just a peak/career disagreement.
I don't get Billy Williams at 17th or 20th. He just seems to have an unusually large "tail" and I would expect more voters would reduce his
standard deviation. I don't expect the picture would get clearer on Stargell or Sheckard with more samples.
Ah, well I'm more than willing to wait. I'm very excited to see what your WARP has to say about pitchers when you eventually figure it out.
As to my consensus rating, I am pleased. I just don't think of "merit" as being necessarily tied so closely to career value. I sorta like guys who were "the best of their time" among some cohort or other, i.e. best player, best hitter, best OF, best LF, etc. etc. etc. Also I've never been much of a "rate" voter, but somehow over our long layoff I developed somewhat of a like for a high rate. I suppose I should revisit John McGraw when we resume the regular HoM balloting.
Jones: +7.05
Stargell: +5.00
Minoso: +3.00
B. Williams: -7.14
Sheckard: -6.77
Kelley: -5.64
Goslin: -4.32
Wheat: -3.73
Raines: -3.59
Anyone else can do the same with their own votes; just look at #2. For instance, I had these deviations:
Stargell: +4.00
Medwick: +3.59
Sheckard: -4.77
Jones: -3.95
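The calculation behind these lists can be sketched as follows. It assumes the deviation is the consensus average placement minus the voter's own ballot position, which is the reading under which the listed figures correspond to whole-number ballot positions. The ballot positions in the example are back-calculated on that assumption rather than taken from a posted ballot, and the 3.0 cutoff is a guess at what counted as worth listing.

```python
# Consensus average placements, copied from the table in post #2.
CONSENSUS = {"Stargell": 12.00, "Medwick": 17.59, "Sheckard": 14.23, "Jones": 17.05}


def deviations(ballot, consensus=CONSENSUS, threshold=3.0):
    """Return {player: consensus average - own placement} for players whose
    placement differs from the consensus by at least `threshold` places."""
    out = {}
    for player, place in ballot.items():
        diff = round(consensus[player] - place, 2)
        if abs(diff) >= threshold:
            out[player] = diff
    return out


# Hypothetical ballot positions, back-calculated from the deviations listed above.
ballot = {"Stargell": 8, "Medwick": 14, "Sheckard": 19, "Jones": 21}
for player, diff in deviations(ballot).items():
    print(f"{player}: {diff:+.2f}")
```

A positive value means the voter ranked the player higher (closer to 1st) than the electorate did; a negative value means lower.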
There is some slack around #9-10 in the election report, Williams and Stargell, but that is reasonably good separation, Billy Williams a clear number nine, ranked in the top ten by a big majority. The top eight are almost unanimously ranked in the first nine and the bottom eleven are almost unanimously ranked in the last twelve.
Can one of the ballot counters easily tell who won the 'head-to-head' battle? If it was 11-11 then the tie would just stand.
Much obliged, Joe. Did you do something easy that I should know about or did you spend some time fixing it manually?
I can't do it easily, Joe, since my makeshift tabulator didn't take that into account.
It's mostly manual: I copy the results you post into a spreadsheet (which is probably where you copy them from).
Then I copy and paste into a text file. I do a find and replace where I copy the 'tab' type character that separates each number and replace that with two space characters.
This gets everything 'close'; from there I manually add and remove space characters until it all lines up. Probably takes about 15-30 minutes depending on the particular ballot.
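The manual clean-up described here could probably be automated. Below is a minimal sketch, not the procedure actually used in the thread: it pads each tab-separated column to its widest entry plus a two-space gap, so the rows line up in a fixed-width font even when a name like Yastrzemski runs long. The function name and sample rows are illustrative.

```python
def align_columns(text, gap=2):
    """Turn tab-separated rows into space-padded, fixed-width columns."""
    rows = [line.split("\t") for line in text.splitlines()]
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width + gap) for cell, width in zip(row, widths)]
        lines.append("".join(cells).rstrip())
    return "\n".join(lines)


sample = "RK\tPlayer\tPTS\n1\tTed Williams\t462\n4\tCarl Yastrzemski\t392"
print(align_columns(sample))
```

Because the widths are computed from the data, no hand adjustment of spaces should be needed as long as the result is displayed in a monospaced font.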
I don't mind doing it at all (others do much more work than that). I was out of town until Monday afternoon and then slept most of the rest of the day and finally had a chance to work on it this morning.