Hall of Merit— A Look at Baseball's All-Time Best
Monday, April 04, 2005
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. John (You Can Call Me Grandma) Murphy Posted: April 04, 2005 at 01:51 AM (#1230290)
Short version: everything I find out about the guy makes him appear better than his numbers up at b-ref. Not only are his numbers great, but he's got the 2nd best MOWP+ of the 40 guys I've checked on (the best among those with at least four fingers); tied for the best MOWP+6, 4th best MOWP+4 (first among serious candidates here). Oh yeah, his MOWP itself is pretty good, too (tied for fourth). And he was apparently terrific in big games, too.
And please, if you click on my link, scroll down and notice what happened in his last 19 starts in '33. Pretty impressive sh1t.
Walter Johnson & Hal Newhouser are the only other pitchers to win multiple MVPs.
He'll be my #1 in 1949.
I. By win shares:
A. Peak/Prime: Hubbell and Ferrell are of similar value over their best nine years, though Hubbell is slightly ahead (233 vs. 223). (Hubbell wins the year-by-year comparison in every season except the 4th-, 5th-, and 6th-best, where Ferrell is better.)
B. Career: Hubbell tacks on another 72 win shares while Ferrell only picks up 10 for a resounding Hubbell victory.
II. By WARP-1:
A. Peak/Prime: Ferrell is substantially ahead over their best nine seasons (84.6 to 78.8) and wins most of the season-to-season comparisons (Hubbell has a better 2nd-best season and they are tied in their 4th-best seasons).
B. Career: Hubbell tacks on another 23.2, dwarfing Ferrell's 2.5 and giving him about 15 more career WARP-1.
If WARP has it right, Hubbell and Ferrell should be awfully close on our ballots. If WS is correct, Hubbell is Ferrell-plus-a-smidge at their peak and with 25% more value added to the end of his career as a bonus. As always, I will split the difference.
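For anyone who wants to check the arithmetic behind "about 15 more career WARP-1," here is a minimal sketch that adds the best-nine-seasons totals to the tacked-on remainder for each pitcher. The numbers are the ones quoted above; the dictionary layout and function names are just illustrative choices, not anything from either rating system.

```python
# Best-nine-seasons total and remaining-career total, as quoted in the post.
# The structure and names here are illustrative assumptions.
totals = {
    "win_shares": {"Hubbell": (233, 72), "Ferrell": (223, 10)},
    "warp1": {"Hubbell": (78.8, 23.2), "Ferrell": (84.6, 2.5)},
}

def career(prime, rest):
    # Career value = best nine seasons + everything else.
    return prime + rest

def summarize(system):
    # Returns (Hubbell career, Ferrell career, Hubbell minus Ferrell).
    h = career(*totals[system]["Hubbell"])
    f = career(*totals[system]["Ferrell"])
    return h, f, h - f

for system in totals:
    h, f, gap = summarize(system)
    print(f"{system}: Hubbell {h:.1f}, Ferrell {f:.1f}, gap {gap:.1f}")
```

On these figures the career gap is 72 win shares but only about 15 WARP-1, which is the disagreement between the two systems described above.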
To me it looks like this: if you trust WS on pitchers, Hubbell is a clear first-ballot HoMer, and Ferrell is borderline, whereas if you trust WARP, Hubbell is a clear first-ballot HoMer, and Ferrell is a strong candidate. The two systems see Hubbell similarly, but they differ on Ferrell.
WS/162 Games
Hubbell: 31.25
Ferrell: 30.25
I think these are apt comparisons as neither Gibson nor Palmer's careers were long in the Perry/Niekro/Young/Johnson/Alex sense, and most of their value was concentrated over a similar number of prime seasons.
By contrast, my system sees Ferrell as like Miner Brown and Iron Joe McGinnity and a bit behind Ed Walsh.
Gibson and Palmer will likely receive much stronger support than Brown and McGinnity did, which should tell you why I feel confident ranking Hubbell much higher than Ferrell.
On the other hand, if you trust WARP1 but don't subscribe to the W2/3 league quality adjustments, then Ferrell could come out ahead of Hubbell -- especially if you give a peak bonus.
Age 25: Bill McGunnigle, a 19th-century pitcher whose career had already ended.
Age 26: Atlee Hammaker. 25-20 with ~120 ERA+ at 26, 34-47 with ~88 ERA+ after 26.
Age 27: Jarrod Washburn. 46-26 with ~125 ERA+ at 27, 21-23 with ~97 ERA+ after 27 (thus far).
Age 28: Bill Walker. His regression began in his age-28 year: 50-37 with ~130 ERA+ at 27, 37-40 with ~100 ERA+ after 27.
Age 29: Doug Rau. 79-53 with ~110 ERA+ at 29, 2-7 with ~62 ERA+ (!!!) after 29.
Hubbell's age-30 comp is Hall of Meriter Stan Coveleski, and his comps are quality pitchers thereafter. I don't know what to make of this.
Yeah... those similarity scores are not era adjusted. Doug Rau also shows up as Grove's best comp for age 28. Dodger Stadium in the 70s was a friendlier place for pitchers than MLB in the late 1920s. Fun list, though.
Hubbell's 3.87 ERA in 1930 was actually good for 2nd place in the NL.
Top 10 in NL pitchers (with top-10 finishes among all NL players noted)
1929: 19, tied for 7th with Vance. Red Lucas leads with 26
1930: 18, tied for 5th with Fitzsimmons. Vance leads with 26
1931: 20, tied for 7th with Tom Zachary. Brandt leads with 27
1932: 25, best in NL. 7th overall.
1933: 33, best in NL. 3rd overall.
1934: 32, 2nd. Dean leads with 37. 5th overall tied with Ripper Collins.
1935: 26, 2nd. Dean leads with 31. 9th overall tied with Gabby Hartnett.
1936: 37, best player in the National League.
1937: 23, tied for 3rd with Lou Fette. J Turner leads with 27.
1938 and after, no mas.
So, 9 straight years in league top 10 pitchers. 6 straight years in top 3 pitchers. 3 years best pitcher. 5 years in top 10 of all players in NL. 1 year best player in National League.
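The tallies in that summary can be recomputed from the year-by-year list. A rough sketch follows, with two assumptions on my part: tied finishes are recorded at the tied rank, and the 1936 "best player in the National League" line is taken to mean rank 1 among pitchers as well as overall. The tuple layout and helper name are made up for the example.

```python
# (year, rank among NL pitchers, rank among all NL players or None), from the list above.
# Assumptions: ties are stored at the tied rank; 1936 is rank 1 on both lists.
seasons = [
    (1929, 7, None),
    (1930, 5, None),
    (1931, 7, None),
    (1932, 1, 7),
    (1933, 1, 3),
    (1934, 2, 5),
    (1935, 2, 9),
    (1936, 1, 1),
    (1937, 3, None),
]

def longest_streak(years):
    # Length of the longest run of consecutive calendar years.
    best = run = 0
    prev = None
    for y in sorted(years):
        run = run + 1 if prev is not None and y == prev + 1 else 1
        best = max(best, run)
        prev = y
    return best

summary = {
    "top10_pitcher_streak": longest_streak(y for y, p, o in seasons if p <= 10),
    "top3_pitcher_streak": longest_streak(y for y, p, o in seasons if p <= 3),
    "best_pitcher_years": sum(1 for y, p, o in seasons if p == 1),
    "top10_overall_years": sum(1 for y, p, o in seasons if o is not None and o <= 10),
    "best_player_years": sum(1 for y, p, o in seasons if o == 1),
}
print(summary)
```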
Also, among league leaders:
ERA: 10 top tens, 3 firsts
ERA+: 9 top tens, 3 firsts
Wins: 8 top tens, 3 firsts
WHIP: 11 top tens, 6 firsts
IP: 9 top tens, 1 first
K/9: 9 top tens, 1 first
walks/9: 11 top tens, 1 first
Also, among eligible pitchers,
only 2 post-1893 pitchers have more career win shares (Rixey and Lyons),
only 4 are comparable for 3-year peak (Willis, Waddell, and Dean, who have contextual or career-length problems, and Ferrell),
only Willis ahead on 7 year prime (Ferrell right behind),
only Dean and Ferrell are ahead on win shares per 275 innings pitched.
He has great peak/prime/career/per year numbers compared to the other candidates, he was consistently in top 10 in the meaningful statistical categories for 10 years, he was the best/second best pitcher in the NL for 5 straight years, and no other eligible pitcher has all the strengths Hubbell does.
W    L   Sv   ERA    Gms   InnP
20   8   33   1.93   103   224
He had 8 years where he pitched either 5 times or 9 innings in relief with ERAs below 2.00.
In 1933 and 1934, his 2 busiest years, he went 4-1 with 13 saves in 27 apps over 62.1 innings with an ERA of 0.87. He also led the league in ERA both years, innings once, shutouts once, and completed 45 of 67 starts.
He started his career with Cushing in the Oklahoma State league in 1923 and 1924, but no records were listed.
He also pitched for Ardmore and Oklahoma City in 1924 but walked more than he struck out (in 4 listed games).
In 1925, age 22, for Oklahoma City of the Western League (was that like our current AA level?) he finished 17-13 with no listed ERA and walked more than he struck out. Detroit acquires him at this point and brings him to spring training in 1926 and 1927.
1926, age 23. I don't know if he was hurt, but he is listed as pitching 31 games with only 93 innings, a 7-7 record, giving up a hit an inning and walking 4.5 men / 9 innings. ERA was 3.77.
1927, age 24. In the III league, seems to put it together: 14-7, 2.53 ERA, only 48 walks in 185 innings. Detroit releases him outright to Beaumont of Texas League. Cannot blame Cobb for this one as he was with the Athletics.
1928, age 25. Half a year with Beaumont, half a year with the Giants.
Looks like normal career growth. Does not have an amazing year at any time in minors.
I wonder if WARP3 may have the adjustments to these two pitchers exactly backwards. My understanding is that the adjustment BP makes in moving from WARP1 to WARP2 and 3 is based on a measure of relative batter quality in the two leagues. Basically, they think that the NL had better hitting, so Hubbell loses little in the adjustment for league quality, while Ferrell loses several points because the AL supposedly had slightly inferior hitting.
I've always been skeptical of their adjustments for league quality. The only things they can be hanging them on are movement of players from one league to the other, which occurred only rarely during the first half of the 20th century, and usually only near the beginning or end of a player's career, or on interleague play, which also was quite infrequent.
On the other hand, the one big difference between the leagues in the 1930s that is readily observable is the gap in run scoring environment. The AL remained in the very high scoring environment of the 1920s, while scoring in the NL dropped by nearly a run per game. So it should be apparent that to obtain a given ERA+ for a given number of innings, the AL pitcher would have to work quite a bit harder, probably facing 5 to 10 percent more batters per inning than the NL pitcher.
I presume that most of us make adjustments for this in comparing deadball era pitchers with post-1920 performances. But does WARP3 adjust for this? The comparison of Hubbell and Ferrell suggests not; they adjust away the effects of the high scoring environment by making everything relative, so no extra allowance appears to be given for the extra work required of the pitcher in the higher scoring league.
For the comparison of Hubbell and Ferrell these differences are amplified because Ferrell pitched in hitters' parks, while Hubbell pitched in a more pitcher-friendly environment.
If one wishes to adjust their WARP1 scores for differences in league environment, I think a good case can be made for raising Ferrell's rating relative to Hubbell, rather than lowering it as is done by WARP3.
No. It is based on the relative quality of the league in adjacent seasons. Each player is compared with his performance in adjoining seasons in the same league. This gives a large sample base, and yields a history of how the league changed in internal quality over the years. Then all of the cross-league comparisons over all time can be used to calibrate the two league histories relative to each other. A shortage of samples during the 10's/20's/30's is irrelevant when there are a large number of samples in the early 00's and in the modern era.
Where things get shaky is when there are a small number of internal comparisons, such as pitching measures pre-1885 (particularly during the late 1870's), and the fielding by position.
So anyway, most of you don't agree but my desert island uber-pitching stat is ERA+. It's a place to start and then, yes, consider a lot of other stuff.
My new formula as just a quick and dirty starting place is Career (ERA+-100 x IP/100) + Prime (ERA+-100 x IP/100). Of course now you're gonna want to know my definition of prime. No?
So anyway, here are my results. Any pitcher with a score of 100 BTW definitely deserves some consideration, a score of 120 gets most elected. Galvin, Vance, Faber and Caruthers are the only pitchers under 115 elected, I think. Faber would be the worst pitcher elected, er rather the worst player who was mostly a pitcher, well, you know what I mean. I have not tested every pitcher of course.
19th Century
1. Nichols 80 (career) + 87 (prime)= 177
2. Young 77 + 81 = 158 if he had died in 1899
3. Clarkson 153
4. Rusie 148
5. Keefe 144
6. Radbourn 134
7. MULLANE 127 before AA discount
8. Spalding 126
9. WELCH 118
10. MCCORMICK 115
11. Galvin 114
12. GRIFFITH 112
13. Caruthers 108 not including any hitting
14. BOND 105
15. CORCORAN 100
This is all of the 100s.
Deadball
1. Johnson 240
2. Young 220 career
3. Mathewson 187
4. Brown 158
5. Walsh 157
6. Alexander 155 if he died in 1919
7. Young 152 1900ff only
8. JOSS 135
9. WADDELL 135
10. Coveleski 125
11. Plank 125
12. McGinnity 119
13. CICOTTE 113
14. HAHN 109
15. WILLIS 109
This is again all of the 100s.
Golden Age eligible so far
1. Grove 189
2. Alexander 165 total career
3. HUBBELL 137 this is the point of this post, #1 eligible though by a surprisingly small margin over Joss and Waddell
4. LYONS 124
5. Vance 113
6. BRIDGES 111 this the second point of this post, I am surprised at Tommy's ranking here, though OTOH it generally takes about a 120 to get elected
7. RIXEY 109
8. Faber 109
9. GOMEZ 108
10. LUQUE 104
11. W. COOPER 104
12. MAYS 102
13. DEAN 100
14. GRIMES 97
15. SHOCKER 96
Ferrell is below that. Again, I haven't tested everybody that I plan on. But among the currently eligible and tested I get:
1. Hubbell 137
2. Joss 135
3. Waddell 135
4. Mullane 127 but cries out for a discount
5. Lyons 124
6. Welch 118
7. McCormick 115
8. Cicotte 113
9. Griffith 112
10. Bridges 111 again, I am shocked that he scores this well. Warneke by comparison is not even on the radar.
I expect to hear the problems with ERA+ as an uber-stat, but 1) this is just a place to start and 2) if you've got a better one (spare me WARP or WS, especially WS for pitchers), I'm all ears.
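The "currently eligible and tested" top ten can be rebuilt by pooling the three era lists and sorting. This is only a sketch: it assumes the names written in capitals in those lists are the eligible, unelected pitchers (the post does not say so outright), and it breaks ties by the order the names appear in the lists.

```python
# Scores for the capitalized names in the 19th Century, Deadball and Golden Age lists.
# Assumption: capitals mark the pitchers who are still eligible.
eligible_scores = {
    "Mullane": 127, "Welch": 118, "McCormick": 115, "Griffith": 112,
    "Bond": 105, "Corcoran": 100,
    "Joss": 135, "Waddell": 135, "Cicotte": 113, "Hahn": 109, "Willis": 109,
    "Hubbell": 137, "Lyons": 124, "Bridges": 111, "Rixey": 109, "Gomez": 108,
    "Luque": 104, "W. Cooper": 104, "Mays": 102, "Dean": 100,
    "Grimes": 97, "Shocker": 96,
}

def rank(scores, n=10):
    # sorted() is stable, so tied pitchers keep their original list order.
    ordered = sorted(scores.items(), key=lambda item: -item[1])
    return ordered[:n]

top10 = rank(eligible_scores)
for place, (name, score) in enumerate(top10, start=1):
    print(f"{place}. {name} {score}")
```

Under those assumptions the output matches the ten names and scores listed above, from Hubbell at 137 down to Bridges at 111.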
1. I understand why you have Ferrell lower - it's a pitching-only stat. The only way to get Ferrell into contention is to give substantial credit for his hitting.
2. The one subtle difficulty in comparing pitchers from the Oughts (e.g. Waddell) with pitchers from the 20's and 30's (e.g., Hubbell) is that the former could and did pitch more innings per year. The conditions of the game did not allow for that in later years. Pitching fewer innings, they should be held to a higher ERA+ standard, but I think your bias is a little high in favor of the extra IP.
3. I also have Bridges on the radar.
So that's how they do it!? To compare the two leagues for the 1930s, they compare not only the handful of players who switched leagues during the 1930s, but also all the players who were active in the 1930s with their own earlier and later seasons, which are in turn compared with other players' earlier and later seasons, so that the comparison is actually based more on the players who switched leagues in 1901 or in the 1970s and 80s? Pardon me if that doesn't build my confidence in BP's league quality adjustments.
When I took graduate statistics (many years ago), I was taught that due to multicollinearity it is impossible to fully disentangle the effects of age, cohort, and time on a population of individuals, regardless of how many observations one has. (If you'd like to see some references, google "age cohort time effects".) In the original Bill James Historical Abstract I think he intuitively understood the same point when he trashed a study by Dick Cramer that sounds an awful lot like the methods you describe for WARP2 and 3.
I like WARP1 and WS - I think they both make good efforts to control for most of the outside influences that confound baseball statistics. Certainly much better than simple rate statistics like ERA+. But reading your description of how the league quality adjustments are calculated for WARP2 and 3, I'm becoming more convinced that they are statistical hogwash.
I wouldn't go that far, Brent. James did say that the Cramer study made sense when comparing league strength for a particular season or when comparing the WWII years with contiguous seasons.
My concern is the "special sauce" that Clay may have thrown in above and beyond the season-to-season comparisons.
I'm sure that Davenport understands this as well.
The slope of the MLB evolution from 1947 to date is used as a zero baseline for the whole thing. This at least approximates the removal of those effects (though it will have problems when e.g. an aging effect locally is substantially smaller/larger than is typical for the last 57 year period). The relative league quality is then the residue, the deviations from that baseline.
When this project started, I attempted to reverse-engineer the league quality corrections and came up with the following broad overview.
The base period follows the conventional wisdom that I remember: small quality drops for expansion followed by rises in quality (best baseball ever relative to baseline is the early 1990's), NL dominates AL during the late 1950's and throughout the 1960's.
Before the base period: the decade before is not much different from the baseline, except for WWII. Before that, there is a significant difference in quality between the baseball pre-1925 and that post-1935, with a "ramp-up" for the decade between. (This difference is mostly in the NL, though the AL also improves.) The AL was usually better than the NL during this period (about the same amount as the NL/AL differential of the 1960's), and the NL was better than the AA (at best it approached the worst AL/NL differences), and the NA has some quality issues.
The overall pre-1925 quality is roughly "constant" between the 8 team early 1880's, the 12-team 1890's, and the 16 team 1910's, with quality drops for the various expansions, followed by slow recovery. Another way of putting that is that pre-1920 baseball appears to have been evolving at about the same rate as post-war baseball has evolved, but that there was a significant acceleration during the 1920's at the same time as the development of the farm systems. (Just my interpretation of the results.)
Hmmmm....I never thought of Luque and Mendez as being that similar in value. Mendez's Cuban League record was 76-28, .731 ; Luque's was 106-71, .599. I don't think Mendez was playing for considerably better teams. You could argue that the Cuban League was tougher during Luque's tenure there (mostly mid-10s to 1930) than during Mendez's (mostly 1908-14), but I'm not sure there was a huge difference. And of course Mendez made a comeback in the U.S. in the early 1920s (while not pitching much in Cuba after 1914).
Since the majority of hitters are RHB, it makes sense for the best-known screwballers to be lefties. In fact the only RHP I associate with the pitch is Mike Marshall.
1. Carl Hubbell
2. Fernando Valenzuela
3. Christy Mathewson
4. Mike Cuellar
5. Mike Marshall
6. Tug McGraw
7. Luis Tiant, Sr.
8. Harry Brecheen
9. Jim Bagby, Sr.
10. Ruben Gomez
Honorable mention to: Luis Arroyo, Jack Baldschun, Huck Betts, Cy Blanton, Jim Brewer, Willie Hernandez, Carmen Hill, Jim Mecir, Fred Norman, Nels Potter, Sloppy Thurston
Neyer goes on to mention how hard it is to master the screwball, how difficult it is on the arm and how other pitches like the forkball, split-fingered fastball and circle change have the same effect with less arm strain.
I don't think James 'trashed' the study. He said that year to year the comparisons were actually quite good and Cramer was definitely on to something. James said the problem was the study was so large, that the little errors added up to an unacceptable total error over the course of 125 years.
James also said that Cramer knew this, and James hoped that Cramer would redo the study if he could fix it. I imagine that is something like what WARP does.
Restricting the range of the comparisons to a year or two greatly reduces the impact of this problem.
Consider the case where a typical player's performance as a function of age is the function p(a). Let the strength of the major leagues over time be the function s(t). In the kind of cross-comparison study that Cramer did, the pair of functions (p(a) + x a) and (s(t) + x t) will fit the data exactly as well as p(a) and s(t). There is no way to determine the value x.
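A quick numerical check of that claim, as a sketch under stated assumptions: suppose what the study observes is a player's performance relative to his league, p(a) - s(t), and that it compares the same player across two seasons. The aging and league-strength curves below are made-up placeholders, and the function names are hypothetical. Because a player's age and the calendar year advance in lockstep, the x terms cancel in every paired comparison.

```python
import math

def p(age):
    # Hypothetical aging curve (made-up numbers): performance peaks at age 27.
    return -0.02 * (age - 27) ** 2

def s(year):
    # Hypothetical league-strength curve (made-up numbers): slow, steady improvement.
    return 0.01 * (year - 1900)

def relative(birth_year, year, x=0.0):
    # Assumed observation model: performance relative to league = p(a) - s(t),
    # here with the tilted pair (p(a) + x*a, s(t) + x*t) substituted in.
    age = year - birth_year
    return (p(age) + x * age) - (s(year) + x * year)

def paired_difference(birth_year, year1, year2, x=0.0):
    # What a cross-season comparison of the same player would see.
    return relative(birth_year, year2, x) - relative(birth_year, year1, x)

# (birth year, first season, second season) for three hypothetical players.
players = [(1903, 1930, 1934), (1908, 1930, 1931), (1895, 1925, 1935)]
for birth, y1, y2 in players:
    base = paired_difference(birth, y1, y2)
    for x in (-0.5, 0.3, 2.0):
        assert math.isclose(base, paired_difference(birth, y1, y2, x), abs_tol=1e-9)
print("paired differences are identical for every x tried")
```

The tilt only shifts each player's level by a constant that depends on his birth year, so no amount of paired-season data can pin down x.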
Suppose that baseball were perfectly efficient, and that players' careers began when they exceeded replacement value and ended when they dropped below replacement value. The results of a Cramer-type study would still be biased, because performance tends to peak quickly and fade slowly as a player ages. When a study includes pairs of seasons that are more than one year apart, the typical cross-year comparison will have the expected player performance being worse in the latter year. Effectively, x in the above equations is more negative than Cramer implicitly assumes, and thus the improvement of performance over time has not been as great.
I think that a better way to try to do cross-era comparisons is not to worry about whether performance has actually improved over time, but to simply use population increase over time to determine what would be the most fair relative numbers of players from different eras, and to bend the scales of one's favorite performance metric, if necessary, to fit these proportions.
There's also a problem with that method, Eric. Just because there is a 10% increase in the population doesn't necessarily mean that there is a 10% increase in talent.
Cramer was on the right track, but the little problems that he glossed over were magnified so greatly by his cross-era comparisons that they derailed his study almost completely.