Hall of Merit— A Look at Baseball's All-Time Best
Monday, December 12, 2005
1967 Ballot Discussion
1967 (December 26)-elect 2
WS W3 Rookie Name-Pos (Died)
166 75.6 1948 Ned Garver-P (living)
203 48.1 1948 Ted Kluszewski-1B (1988)
187 53.2 1950 Jackie Jensen-RF (1982)
184 53.6 1947 Earl Torgeson-1B (1990)
187 48.5 1942 Elmer Valo-RF (1998)
160 57.0 1949 Mike Garcia-P (1986)
180 47.7 1949 Hank Bauer-RF (living)
146 60.0 1949 Johnny Antonelli-P (living)
138 51.8 1947 Gerry Staley-RP (living)
110 36.3 1945 Del Rice-C (1983)
099 40.3 1949 Chuck Stobbs-P (living)
096 35.8 1951 Clem Labine-RP (living)
107 26.3 1952 Jim Rivera-RF/CF (living)
1967 (December 11)—elect 2
HF% Career Name-pos (born) BJ – MVP - All-Star
04% 46-61 Bob Boyd-1B (1926) #9 1b – 0 – 1*
00% 43-61 Marvin Williams-2B (1923) – 0 – 1*
Players Passing Away in 1966
HoMers
Age Elected
None
Candidates
Age Eligible
78 1927 Hippo Vaughn-P
76 1925 Red Smith-3b
76 1925 Lee Magee-CF/2b
76 1931 George J. Burns-LF
76 1931 Hank Gowdy-C
73 1941 Sad Sam Jones-P
72 1938 Rube Bressler-LF/P
71 1942 Bing Miller-RF
70 1936 Johnny Morrison-P
67 1936 Chuck Dressen-3B/Mgr
65 1940 Marty McManus-2B/3B
62 1942 Bill Walker-P
57 1951 Pete Fox-RF
55 1952 Lou Finney-RF/1B
52 1955 Mike Tresh-C
49 1959 Bob Elliott-3B
Gracias, Dan and Chris!
Reader Comments and Retorts
Burns, Doyle, and Roush are hammered in W3 by a double whammy of weak league and steep devaluing of their defense from its contextual value to an all-time standard.
The most problematic element of WARP2/3, in my view, is their translation of fielding to an all-time standard, which makes hash of the tradeoffs between hitting and fielding value that are sensible in a particular historical context.
The thorough and sophisticated jimd has, if I understand correctly, worked out how to tweak WARP2/3 to eliminate the fielding contextual shift but keep the adjustment for quality of competition (I believe he does this by accepting changes to BRAA and FRAA but rejecting changes to the difference between FRAA and FRAR, which is where the normalization to an all-time fielding context takes place.)
But I was going to say that you can just start with WARP1 and make your own adjustments, which is what I used to do until it got to where I had to redo everything about every two weeks.
Burns is 96.5/70.5
Doyle is 93.1/57.5
Though now that I think of it, I don't really know that the fielding issues don't taint WARP1, too.
And WARP's assessments let us get inside the different components of value more easily than WS does. We _can_ get inside the WS system even to the extent of calculating our own win shares, but unless you do all this work yourself it's ****ed hard to figure out things like how many pws a pitcher is losing because of low batting value. We can't recalculate WARP ourselves, but we can see the details of its conclusions much more easily. That's a helpful thing.
I am unequipped to judge the superiority of one methodology versus another, but it is heartwarming to see such a high degree of introspection about this subject, as such an analysis can only lead the collective voters to scrutinize their ballots more closely and elect the most deserving candidates.
I saw Bob Lemon pitch but I was only a kid and have no real memory of his effectiveness/prowess.
I can't say the same for any of the older candidates, but soon you will begin electing members from a group of candidates whose many exploits many, if not most, voters will have witnessed firsthand.
I'm curious how this aspect of "additional data" will play out in both the discussions and in the weighting of the data.
Keep up the good work guys; this is the single most worthwhile endeavor I've observed on the web (as it relates to sports).
I looked at players with similar OPS+/OPS/Slg/OBP in each season in the AL during his career. Then I looked at their win shares compared to Johnson's. Johnson was usually below the average of the other players. There was one obvious reason - traditional defensive positions had some great hitters during Johnson's career - catcher had Mickey Cochrane, Bill Dickey, and Rudy York. Second base had Charlie Gehringer, Tony Lazzeri, and Buddy Myer. Third base had Harlond Clift. Shortstop had Luke Appling. So there were players with high OPSs playing defensive positions where they were able to earn more defensive win shares.
Second, I had not noticed it before, but his batting average hurts him. Sometimes I forget how consistently high so many players' batting averages were in the AL in the 30s. Johnson's .296 looks good and he walked a lot. But many of his contemporaries either hit for a higher average with similar walk totals or hit for a similar average and walked even more. Johnson was in the top 10 in AL in walks 8 times in his career, but in the top 10 in OBP only 4 times. He doesn't have the singles that other players did. I expected his OBP to be higher, but he simply didn't get as many hits as many of his competitors did.
I see the career .393 OBP (93rd all-time), the 1075 walks (71st all-time), and the 138 OPS+ (85th) and I think wow, that's really good. But compared to other AL players of his era, his career .296 AVG doesn't measure up. There is Earl Averill - OBP of .395, he hit .318. Ed Morgan (Cle first baseman before Hal Trosky) OBP .398 / Avg .313. DiMaggio OBP .398, Avg .325. Appling OBP .399, Avg .310. Gehringer OBP .404, Avg .320. Greenberg OBP .412, Avg .313. Cochrane OBP .419 / Avg .320. Then there are Foxx, Gehrig, Ruth, and Ted Williams.
On the other hand, there are a few hitters similar to Johnson in Avg., but they walked even more than Johnson did. Selkirk OBP .400 , Avg. .290. Roy Cullenbine OBP .408 , Avg .276. Charlie Keller OBP .410 , Avg. .286.
As a consequence, Johnson does have a top 100 career OBP, but there are 14 players in his own league during his career that have better. Similarly, his 62nd career OPS (.899) is bettered by 10 of his AL contemporaries. His 80th career SLG of .506 is bettered by 10 of his AL contemporaries. His 85th career OPS+ of 138 is bettered by 8 of his AL contemporaries.
To my mind, Johnson is hurt by three things compared to his contemporaries. One, he did play for a bad team that under-performed its pythagorean projections. Two, he played an "unimportant" defensive position while there were great hitters at "important" defensive positions. Three, his strengths (OBP, SLG, OPS) are bettered by 8 to 14 players in his own league who played at the same time as he.
Am I off base? It wouldn't be the first time.
Remember, it is only worth my 2 cents.
I haven't lost this. I never had it.
I just vote.
Thanks, jingoist. I think I can speak for everyone involved with the project that we all appreciate your words.
"To my mind, Johnson is hurt by three things compared to his contemporaries. One, he did play for a bad team that under-performed its pythagorean projections. Two, he played an "unimportant" defensive position while there were great hitters at "important" defensive positions. Three, his strengths (OBP, SLG, OPS) are bettered by 8 to 14 players in his own league who played at the same time as he."
It's "one" that bothers me a bit.
I am starting to wonder if "Win Shares" for players is a little like "Wins" for pitchers.
Yes, the better pitchers in the long run win more games.
But some guys have rotten luck, and others are lucky. And it doesn't balance out 100 pct.
I can't see penalizing Johnson because his teams sucked and they underperformed as well. It's not HIS fault, just as a pitcher with a 14-14 record and a 135 ERA+ had a good year no matter what the W-L says.
I have voted for Johnson before, but my problem with him is the war years being his best years. He just seems like a guy who was good but not great, but when the talent dipped, he took advantage.
That's nice, but I suspect a lot of consistent 130 OPS+ guys (which is quite good) might be able to do that.
As for Rixey vs Lemon:
Rixey 1916-21-23-24-25 knocked in 38 runs during those five best seasons.
Lemon in 1948-49-52-54-56 knocked in 71 runs during those five best seasons.
Hey, that helps a team win. But is that enough of a tiebreaker in putting Lemon ahead if he wasn't already?
Pitcher's hitting is something that never occurred to me before this project as a differentiator.
Does anyone have any feeling for what the good hitter is worth compared to the bad one?
That is, suppose one guy has 270 IP of 133 ERA+ and an 85 OPS+. How does that rate, say, compared to 270 IP of 138 ERA+ and a 32 OPS+?
We're looking at roughly 110 ABs by these Ps, generally with not much power. The good hitter is getting a dozen more hits than the bad one, let's say.
I've never considered that a major edge, more like a small bonus. But I'm open to someone quantifying it better.
And yes, I realize Rixey may face a league discount, but it seems like that came up more in his discussion than it did with Lemon.
Anyway, good lively discussion here.
And thanks as well to jingoist for his thoughts.
Just looking at Runs Created Above Position, we can see that Lemon, in his best years was worth about 1 1/2 wins above an 'average' hitting pitcher with his bat alone, and about 8 wins over his career.
And I do believe that Lemon was the better pitcher during his peak and prime; his hitting only furthers that peak/prime.
Some of the big difference between Lemon voters and Rixey voters is whether you are a peak/prime guy or a career/prime guy (I think we all vote for great primes). If you value peak over career, Lemon is your guy. If not, then it is Rixey.
Unrelated observation – I am a few years late in pointing out how cool it is to be able to visualize the baseball cards of my youth for each player on the list of new eligibles. I’m glad my mom didn’t throw them away.
Pitcher's hitting is something that never occurred to me before this project as a differentiator.
Does anyone have any feeling for what the good hitter is worth compared to the bad one?
Karlmagnus replied:
I think the effect of pitcher hitting has to be non-linear; between say 0 and 80 OPS+ it makes only a modest difference.
Since wins vary with the square of runs scored and allowed, the effect of pitcher hitting is certainly non-linear, but there isn't a sudden tipping point from "significant" to "insignificant." We can calculate pretty easily the drag of a pitcher who _never_ does anything productive at the plate, and then scale up from there.
Here's a quick set of models that can suggest the impact of pitcher hitting on team performance.
Elements:
1) A 4.5 r/g environment
2) An average pitcher produces 30% of the offense of an average position player
3) In this environment, consider the hitting of six pitchers who are average as pitchers but varied as hitters
i) Wes Ferrell (best hitting real pitcher), appx. 100 OPS+, equal to an average position player, creates .535 r/g
ii) Bob Lemon, appx. 80 OPS+, creates .40 r/g
iii) Average Hitting Pitcher, appx. 30-35 OPS+, creates .22 r/g
iv) Eppa Rixey, appx. 20-22 OPS+, creates .17 r/g
v) Sandy Koufax (the worst-hitting real pitcher), appx. -10 OPS+, creates .065 r/g
vi) The Worst Hitter ever, -100 OPS+, creates 0 r/g
By adding or subtracting each pitcher's runs created from the amount an average-hitting pitcher contributes to his team, using the Pythagorean method, and assuming 30 decisions per pitcher, we get the following winning percentages and wins above or below average for each pitcher, created purely by his hitting (since his pitching is average).
Wes Ferrell, .534, +1 win (3 win shares)
Bob Lemon, .520, +.6 win (1.8 win shares)
Average Hitting pitcher, .500, +/-0 wins (0 win shares)
Eppa Rixey, .492, -.25 win (-.75 ws)
Sandy Koufax, .482, -.5 win (-1.5 ws)
Worst Hitter Ever, .475, -.75 win (-2.25 ws)
Now, if each were changed to an average-hitting pitcher, here's the ERA+ that pitcher would need to achieve the same winning percentage:
Wes Ferrell, 107 ERA+
Bob Lemon, 104 ERA+
Average Hitting Pitcher, 100 ERA+
Eppa Rixey, 98 ERA+
Sandy Koufax, 96 ERA+
Worst Hitter Ever, 95 ERA+
Since ERA+ isn't linear, the exact size of the swing will vary, as will the win-value of a given ERA+, but this should demonstrate (as jimd has demonstrated before in better detail), the effect of a pitcher's hitting on his value to his team.
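The model above can be sketched in a few lines. This is only a sketch, not the poster's actual worksheet: it assumes the plain Pythagorean formula with exponent 2, a 4.5 r/g league on both sides of the ball, and that a pitcher's extra (or missing) runs created are simply added to his team's 4.5 r/g. The function names are made up for illustration. Under those assumptions it tracks the posted figures closely for Ferrell, Lemon, Koufax and the worst-ever hitter; the Rixey row comes out a little closer to average than the posted .492 and 98.

```python
import math

LEAGUE_RPG = 4.5        # run environment from the post
AVG_PITCHER_RC = 0.22   # r/g created by an average-hitting pitcher (from the post)
DECISIONS = 30          # decisions per pitcher assumed in the post

def win_pct(pitcher_rc):
    """Pythagorean W% (exponent 2 is an assumption) for a pitcher who is
    average on the mound and creates pitcher_rc runs per game at the plate."""
    rs = LEAGUE_RPG + (pitcher_rc - AVG_PITCHER_RC)
    return rs ** 2 / (rs ** 2 + LEAGUE_RPG ** 2)

def wins_above_average(pitcher_rc):
    """Wins gained or lost through hitting over DECISIONS decisions."""
    return (win_pct(pitcher_rc) - 0.5) * DECISIONS

def equivalent_era_plus(pitcher_rc):
    """ERA+ an average-hitting pitcher would need to reach the same W%."""
    p = win_pct(pitcher_rc)
    ra = LEAGUE_RPG * math.sqrt((1 - p) / p)  # invert the Pythagorean formula
    return 100 * LEAGUE_RPG / ra

# r/g created by each model pitcher, as listed in the post
hitters = {
    "Wes Ferrell": 0.535,
    "Bob Lemon": 0.40,
    "Average pitcher": 0.22,
    "Eppa Rixey": 0.17,
    "Sandy Koufax": 0.065,
    "Worst hitter ever": 0.0,
}

for name, rc in hitters.items():
    print(f"{name:<18} {win_pct(rc):.3f} "
          f"{wins_above_average(rc):+.2f} wins  "
          f"{equivalent_era_plus(rc):.0f} ERA+")
```

Multiplying the wins column by 3 gives the win share figures in the same way the post does.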
Finally, just to show how the small, but noticeable impact of hitting on a 30-decision subset of a pitcher's career plays out on larger scales, one might notice this:
Over a 300-decision career, the difference between a Wes Ferrell-type hitting pitcher and a Sandy Koufax-type hitting pitcher is around 45 win shares.
Over a 35-decision peak season, the difference between the two types is 5.2 win shares.
This is the most extreme real difference between pitchers' hitting that we will observe. The difference between the Lemon-type and the Rixey-type is about half as large, but that's still nearly a win per year in their peak seasons.
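The scaling in the last few lines is straight proportion, and it can be checked from the rounded win share figures posted above. A small sketch, assuming the per-30-decision values are used exactly as posted (the dictionary and function names are illustrative):

```python
# Batting win shares per 30 decisions, as posted in the table above.
WS_PER_30 = {"Ferrell": 3.0, "Lemon": 1.8, "Rixey": -0.75, "Koufax": -1.5}

def ws_gap(better, worse, decisions):
    """Batting win-share gap between two pitcher types over `decisions`,
    scaled linearly from the 30-decision figures."""
    per_30 = WS_PER_30[better] - WS_PER_30[worse]
    return per_30 * decisions / 30

print(ws_gap("Ferrell", "Koufax", 300))  # career-length gap
print(ws_gap("Ferrell", "Koufax", 35))   # peak-season gap
print(ws_gap("Lemon", "Rixey", 35))      # Lemon-type vs Rixey-type peak season
```

At 3 win shares per win, the Lemon vs. Rixey peak-season gap works out to just under one win, which matches the "nearly a win per year" remark.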
P.S. I wrote most of this up before KJOK's post: My "Lemon" model pitcher is based on Lemon's average value, not his peak value, which was a good bit higher.
In my own home-brewed system, I would have Lemon's career equivalent record as 176-141 if I considered only his pitching, and 184-133 if I consider the effect of his own hitting on his run environment. There's also something like a 1.5 win gain over his career for his work as a PH, but only if I ignore his early years as a 3B, which were pretty bad. That's 1.5 wins above some amalgamation of reserve outfielders and infielders who aren't good enough to start - the ones who usually get the PH opportunities.
Mind you, I don't know how he hit as a PH compared to how he hit as a P; I was just allocating it proportionally by PA. The offense while pitching is much more valuable than the offense while a PH for two reasons: there's more of it (many more PA), and the baseline is lower (versus typical pitcher rather than versus OF/IF reserve).
So that's very much in line with the estimate KJOK provides.
Would you be able to post Rixey's career RCAAP as a hitter?
I ask partly because I've become curious about the ways that WARP and WS evaluate pitcher hitting. I have a suspicion that both systems are evaluating pitcher hitting in a way that systematically underrates pitchers a little bit.
In WARP's system, it appears to me that they compare pitchers as hitters to a generic replacement level, docking pitchers' WARP if they hit below replacement level. But if an average pitcher hits below replacement level for position players, isn't it the case that a pitcher who is above average for position, even though he is below replacement overall, is helping his team rather than hurting it?
In the WS system, a pitcher starts having his pws docked if he creates fewer runs per out than the marginal runs per out rate of the league (.52 * league average runs/out). Again, if an average pitcher creates runs below the league marginal rate, isn't a pitcher who is hitting above average for his position helping his team rather than hurting it, even if he's hitting below the league marginal run rate?
These questions make little difference for the Rixey vs. Lemon issue, but, if my concerns about systematic underrating are correct, then a long-career, weak-hit pitcher like Rixey is being exceptionally disadvantaged by this problem and might deserve a bit of a boost relative to non-pitchers.
Thoughts?
YEAR TEAM RC RCAA RCAP OWP
1912 Phillies 2 -7 -2 .063
1913 Phillies 3 -5 0 .152
1914 Phillies 0 -4 -2 .000
1915 Phillies 3 -4 0 .164
1916 Phillies 3 -9 -2 .075
1917 Phillies 4 -9 -2 .103
1919 Phillies 2 -6 -2 .080
1920 Phillies 9 -5 2 .300
1921 Reds 3 -16 -5 .034
1922 Reds 6 -13 -1 .106
1923 Reds 4 -15 -4 .055
1924 Reds 7 -7 1 .227
1925 Reds 6 -11 -1 .124
1926 Reds 5 -7 0 .159
1927 Reds 7 -5 2 .267
1928 Reds 4 -13 -2 .065
1929 Reds 4 -9 0 .109
1930 Reds 3 -7 -1 .095
1931 Reds 1 -6 -1 .031
1932 Reds 3 -2 1 .311
1933 Reds 3 -2 2 .317
TOTALS 82 -162 -17 .120
LG AVERAGE 248 0 0 .500
POS AVERAGE 100 -148 0 .158
Rixey's career of course. Submitted it before I labeled it.
IF he was only -17 for his career, I would guess that means he only cost himself 2 to 4 games more than the average pitcher?
CAREER
P
RCAP RCAP
1 Gus Weyhing -98
2 Pud Galvin -79
3 Red Donahue -55
4 Ed Morris -54
5 Bobby Mathews -53
6 Stump Wiedman -51
7 Mark Baldwin -44
T8 Will White -43
T8 Lefty Grove -43
10 Henry Porter -41
T11 Sadie McMahon -38
T11 Togie Pittinger -38
13 One Arm Daily -37
14 Vic Willis -36
T15 Bill Doak -34
T15 Red Faber -34
T15 Lee Viau -34
T15 Jimmy Ring -34
T15 Casey Patten -34
T20 Red Ehret -33
T20 Dan Casey -33
T20 Dupee Shaw -33
T20 Cy Young -33
T24 Stan Coveleski -32
T24 Billy Rhines -32
CAREER
MODERN (1900-)
P
RCAP RCAP
1 Lefty Grove -43
2 Togie Pittinger -38
T3 Bill Doak -34
T3 Jimmy Ring -34
T3 Red Faber -34
T3 Casey Patten -34
7 Stan Coveleski -32
8 Red Donahue -31
T9 Slim Harriss -30
T9 Al Benton -30
T11 Bob Groom -29
T11 Bob Buhl -29
T13 Lefty Gomez -28
T13 Mel Harder -28
T13 Earl Moore -28
16 Jack Knott -27
T17 Tully Sparks -26
T17 Vic Willis -26
T17 Danny MacFayden -26
20 Bob Friend -25
T21 Alex Ferguson -24
T21 Si Johnson -24
T21 Ned Garvin -24
T21 Barney Pelty -24
T25 Bobo Newsom -23
T25 Ed Siever -23
T25 Flint Rhem -23
T25 Vic Sorrell -23
Sabermetric Baseball Encyclopedia
New editions are available every October
http://www.baseball-encyclopedia.com
Koufax -20
Rixey -17
Ferrell +120
Lemon +81
Heck I guess I should post the top 25 too - post 1900:
CAREER
MODERN (1900-)
P
RCAP RCAP
1 Red Ruffing 131
2 Wes Ferrell 120
T3 Red Lucas 102
T3 George Mullin 102
5 Walter Johnson 96
6 George Uhle 95
7 Bob Lemon 82
8 Don Newcombe 74
9 Schoolboy Rowe 72
10 Carl Mays 61
11 Bucky Walters 57
12 Al Orth 56
13 Doc Crandall 55
14 Dutch Ruether 54
T15 Early Wynn 53
T15 Warren Spahn 53
T15 Joe Bush 53
18 Bob Gibson 52
19 Claude Hendrix 51
20 Jesse Tannehill 50
21 Ray Caldwell 49
22 Earl Wilson 48
23 Jim Tobin 47
24 Babe Ruth 45
T25 Burleigh Grimes 44
T25 Christy Mathewson 44
Ruth is 1914 thru 1917, just in case you are curious.
If you examine the records of Rixey and Lemon at BP carefully, you'll see that Rixey was, at his best, a better _pitcher_ than Lemon. His peak value as a player is a bit lower than Lemon's, however, because he is so much worse as a hitter and slightly worse as a fielder.
Actually the story is a bit more complicated than that—some of the BP measures show Rixey as a better pitcher than Lemon (DERA, PRAA), while others show Lemon ahead (PRAR, which feeds into WARP1). And I’m only looking at the statistics that are “adjusted for season” (i.e., WARP1), which in principle don’t adjust for league quality. Understanding these differences helps us understand some of the adjustments made by WARP, and also points to some important changes in pitching context between the 1920s and 1950s.
To illustrate, let’s compare each pitcher’s “peak” season, which I’ll pick as 1925 for Rixey and 1949 for Lemon. (These are their highest seasons as measured by WARP1, valued at 11.6 for Lemon and 8.9 for Rixey; WS also shows 1949 as Lemon’s best season with 31, while Rixey’s 1925 and 1923 seasons are tied for his best with 26.) Here are their pitching lines:
As we see, in many respects these two seasons are strikingly similar—with similar W-L records, IP, ERA, and run scoring environments (both pitched for good defensive teams in pitchers’ parks). Other parts of their pitching lines are quite different, with Rixey giving up many fewer walks and home runs and many more hits, while striking out many fewer batters. As we’ll see, these differences reflect both the characteristics of the two pitchers and context changes that took place in baseball between 1925 and 49.
Based simply on ERA+ and IP, Rixey appears to have a small, but significant edge as a pitcher (that is, without adjusting for hitting). When we move to the BP statistics, the first adjustment is to add back in the unearned runs and normalize to a 4.5 run/game context, giving us NRA (normalized runs allowed). Although Rixey allowed 17 unearned runs compared to Lemon’s 8, most of this difference reflects the drop in errors between 1925 and 49. If we calculate NRA+ by dividing 4.5 by NRA (placing the measure in the same units as ERA+), we get 141 for Rixey and 134 for Lemon—not much different from their ERA+.
The next adjustment BP makes is for defensive support, giving us DERA (defense-adjusted ERA). According to BP, “if DERA is higher than NRA, you can safely assume he pitched in front of an above-average defense.” BP sees both the 1925 Reds and the 1949 Indians as good defensive teams. It scores the ’25 Reds at 19 FRAA (fielding runs above average), second in the league, and the ’49 Indians as leading the league with 43 FRAA. (WS concurs, also placing the ’25 Reds second in the NL in fielding win shares and the ’49 Indians first in the AL.)
Consequently, both Rixey and Lemon see their DERA go up relative to their NRA, reflecting the adjustment for defensive support. Rixey’s 1925 DERA is 3.33 (or DERA+ = 135), while the adjustment hits Lemon’s record a little harder, giving him a DERA of 3.63 (DERA+ = 124).
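The "+" versions quoted here all come from the same conversion: divide the 4.5 baseline by the normalized rate and scale to 100. A minimal sketch of that arithmetic, using only the DERA and NRA+ figures from the post (the function names are illustrative, not BP's):

```python
BASELINE_RA = 4.5  # the normalized run context described in the post

def plus_index(runs_per_nine):
    """Turn a normalized rate (NRA or DERA) into an ERA+-style index."""
    return 100 * BASELINE_RA / runs_per_nine

def rate_from_index(index):
    """Go the other way: recover the normalized rate from a '+' index."""
    return 100 * BASELINE_RA / index

print(round(plus_index(3.33)))  # Rixey 1925 DERA+
print(round(plus_index(3.63)))  # Lemon 1949 DERA+
print(round(rate_from_index(141), 2), round(rate_from_index(134), 2))  # implied NRA
```

Run in reverse, the NRA+ figures of 141 and 134 imply NRAs of roughly 3.19 and 3.36, so the defensive adjustment costs Rixey about 0.14 runs per nine and Lemon about 0.27.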
While the larger adjustment to Lemon’s record reflects BP’s evaluation of the Indians as the stronger defensive team, there is another consideration that also plays a role. BP adjusts the defensive support for the pitcher’s own fielding contributions, and BP sees Lemon as the better fielder (6 FRAA, compared to 2 for Rixey). In other words, to avoid double counting the runs that a pitcher saves through fielding, it removes them from his pitching record.
Now, frankly, I think that this adjustment for pitcher fielding is problematic. I assume that Lemon is credited as a good fielder because of his fielding statistics (mostly assists). We also understand Lemon to have been a ground-ball pitcher, so presumably he had more opportunities to field ground balls than fly-ball pitchers would have had. Should a tendency for a pitcher to give up ground balls that he then can field be counted as “pitching” or “fielding”? —without detailed zone information on batted balls, I don’t think there’s any way to answer the question. I don’t think BP gets the total defensive contribution of a pitcher (pitching plus fielding) wrong, but I am not confident that they have the split right. Win shares avoids the problem by simply not attempting to split out pitcher fielding from pitching, which I think is the preferable approach given the uncertainties in making the split.
The next step for BP is to calculate measures of pitching runs above average (PRAA) and pitching runs above replacement (PRAR). The PRAA statistics are consistent with Rixey’s better (lower) DERA: Rixey is credited with 37 PRAA, while Lemon is credited with 27. PRAR, on the other hand, tells a different story, with Rixey credited with 90 and Lemon credited with 102. Why the difference?
The answer is buried in the BP documentation, which says that PRAR is “similar to PRAA, except that the comparison is made to a replacement level player instead of average. The nominal RA for a replacement pitcher is 6.11.” It then adds: “This assumes that there is a 50/50 split between pitching and fielding. If the pitch/field split is less than that, as it was in the 1800s, the replacement ERA is reduced.”
This is very important. BP assumes that the share of fielding in overall defense started out quite high in the 1800s, and then as the importance of pitching gradually grew, the replacement point is gradually shifted to give pitching more and more weight. I agree with the folks at BP that pitching became more important relative to fielding, as the fielding independent components of defense (walks, strikeouts, and home runs) gradually became more important. However, BP’s mechanism of adding to the weight of pitching by shifting the replacement level doesn’t make a lot of sense. In essence, they are saying that the replacement pitchers in 1900 were quite good, but they got worse and worse relative to the average ML pitcher as the century progressed. It’s logical to think that the development of farm systems actually led to an improvement in the average replacement pitcher, rather than a deterioration. (Win shares potentially has a different way to shift the weights applied to pitching and fielding, and it actually makes some modest shifts in the weights to reflect strikeouts. IMO, however, these shifts do not go nearly far enough.)
Compared to the 1925 NL, the number of walks in the 1949 AL has increased 63 percent, the number of home runs by 21 percent, and the number of strikeouts by 30 percent. Meanwhile, the fielder-dependent components have gone down—hits are down 12 percent and errors are down 34 percent. These changes all imply that for each inning pitched, pure pitching performance has become significantly more important in the late 1940s than it was in the 1920s. This trend is also reflected in a reduction in the standards for innings pitched. A measure I use is based on the # 5 pitcher in each league in innings pitched (data posted on the pitchers thread). According to this measure, typical pitcher workloads declined as follows:
Decade Innings
1900-09 324
1910-19 292
1920-29 276
1930-39 265
1940-49 252
1950-59 252
Consequently, we see that although Lemon pitched fewer innings in 1949 than Rixey did in 1925, relative to the standards of the time Lemon actually had the larger workload.
BP’s final step is to calculate WARP1 based on the sum of pitching (PRAR), fielding (FRAR) and batting (BRAR). For Rixey-1925 this sum is
90 + 2 – 7,
which results in a WARP1 of 8.9. For Lemon-1949 the sum is
102 + 6 + 9,
which results in a WARP1 of 11.6. Batting (16 runs) and fielding (4 runs) account for much of the difference, but another 22 runs are contributed by the differences between PRAR and PRAA—that is, by changes in the playing environment that BP shows as changes in the replacement level. I think that BP is adjusting in the right direction, though I won’t vouch for their exact numbers.
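The bookkeeping behind that breakdown can be laid out explicitly. This sketch only adds and subtracts the run figures quoted in the post; the `Season` class and its field names are illustrative, and nothing here reproduces BP's actual runs-to-wins conversion.

```python
from dataclasses import dataclass

@dataclass
class Season:
    praa: int  # pitching runs above average
    prar: int  # pitching runs above replacement
    frar: int  # fielding term in the sum quoted in the post
    brar: int  # batting term in the sum quoted in the post

    def total_runs(self):
        """The three-term sum the post gives for each season."""
        return self.prar + self.frar + self.brar

    def replacement_gap(self):
        """Runs between average and replacement for this pitcher-season."""
        return self.prar - self.praa

rixey_1925 = Season(praa=37, prar=90, frar=2, brar=-7)
lemon_1949 = Season(praa=27, prar=102, frar=6, brar=9)

total_gap = lemon_1949.total_runs() - rixey_1925.total_runs()
batting_gap = lemon_1949.brar - rixey_1925.brar
fielding_gap = lemon_1949.frar - rixey_1925.frar
replacement_shift = lemon_1949.replacement_gap() - rixey_1925.replacement_gap()
praa_gap = lemon_1949.praa - rixey_1925.praa  # negative: Rixey leads on PRAA

print(total_gap, batting_gap, fielding_gap, replacement_shift, praa_gap)
```

The pieces reconcile: 16 batting runs, 4 fielding runs and the 22-run replacement shift favor Lemon, Rixey's 10-run PRAA lead pulls the other way, and the net is the 32-run difference between the two sums.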
Some conclusions:
- BP has two sets of estimates, relative to average and relative to replacement, which sometimes can tell very different stories.
- Be aware that even in their WARP1 or “adjusted for season” statistics, BP is making some timeline-type adjustments to replacement levels that can fairly dramatically affect the ratings of pitchers (and of hitters, since the opposite adjustments show up in the fielding statistics).
- The playing environment did change fairly dramatically between 1925 and 1949 in ways that made pitching more important. (Similar changes have occurred almost every decade). Therefore, if you don’t use or agree with BP’s adjustments, I still recommend that some adjustment should be made to recognize that an inning pitched in 1949 was more important defensively than an inning pitched in 1925. Direct comparisons of pitching statistics across the decades are misleading.
- The greater importance of pitching in the 1950s is another important reason, along with the batting and higher peak, that I have Lemon ranked well above Rixey.
Depends on the run environment, of course, but 2-4 looks about right. With a few scrawls on a napkin I came up with about 2.
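One way to put a number on "depends on the run environment" is to ask the Pythagorean formula how many extra runs buy one extra win for a .500 team. A sketch, assuming exponent 2: the slope of W% at RS = RA = R is 1/(2R) per run per game, so roughly 2R runs buy a win. The 4.5 r/g default is borrowed from the model earlier in the thread, and the function names are illustrative.

```python
def runs_per_win(rpg=4.5):
    """Marginal runs per win for an average team, from the exponent-2
    Pythagorean formula (an assumption): d(W%)/d(RS) = 1 / (2 * rpg)."""
    return 2 * rpg

def batting_wins(rcap, rpg=4.5):
    """Convert career runs created above position into wins."""
    return rcap / runs_per_win(rpg)

for rpg in (4.0, 4.5, 5.0):
    print(rpg, round(batting_wins(-17, rpg), 2))  # Rixey's -17 RCAP
```

At 4.5 r/g, Rixey's -17 comes to a little under 2 wins lost, in line with the napkin estimate; a lower-scoring environment pushes the cost slightly higher.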
Great walk through the intricacies of WARP! The statement I made, with which you started off, was based on DERA and pitching runs above average. I don't agree with WARP's "replacement level" fudge, so I believe that their PRAR measure is a bit misleading about _quality_. I agree, of course, that an inning pitched in 1949 was more important defensively than an inning pitched in 1925, but I also think that a run saved is still a run saved. That is, if analysis tells us that a pitcher saved 37 (normalized, fielding-neutral) runs above average in 287 innings, it doesn't matter whether those innings were thrown in 1925 or 1949, the pitcher's defensive contribution in terms of quality is still the same.
I don't believe, therefore, that it is misleading to argue that pitcher A, who has a better rate of runs saved above average per inning than pitcher B, is a better pitcher, regardless of whether pitcher A and pitcher B pitched in the same or in different eras.
When you throw in innings pitched to try to find value, rather than quality, things become more complicated, of course, and that's where WARP gets into PRAR, and where I lose confidence in WARP.
Your data shows IP totals declining by about 10% from the 1920s to the 1950s, but WARP has the value of an average pitcher climbing from about 18 PRAR/100 IP to 27 PRAR/100 IP. That seems a bit out of proportion to the change in usage to me.
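To see how far out of proportion the two changes are, put them side by side. This is an illustrative calculation only: it pairs the #5-pitcher innings from the decade table above with the average PRAR rates quoted in this comment, which is a hypothetical combination rather than any real pitcher's line.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old

def season_prar(innings, prar_per_100_ip):
    """PRAR for a pitcher throwing `innings` at the given rate per 100 IP."""
    return innings * prar_per_100_ip / 100

ip_1920s, ip_1950s = 276, 252    # #5-pitcher innings from the decade table
rate_1920s, rate_1950s = 18, 27  # average PRAR per 100 IP, as quoted above

print(round(pct_change(ip_1920s, ip_1950s), 1))      # change in workload
print(round(pct_change(rate_1920s, rate_1950s), 1))  # change in PRAR rate
print(round(pct_change(season_prar(ip_1920s, rate_1920s),
                       season_prar(ip_1950s, rate_1950s)), 1))  # net change
```

Innings fall by a bit under 9 percent while the per-inning rate rises by 50 percent, so the same hypothetical workhorse gains well over a third in seasonal PRAR without pitching any better relative to his league.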
I did, but now it's in the recycling because it filled up with scribbles.
I agree, of course, that a run saved is a run saved. Where I think BP is on the right track, though, is in saying that the "average" pitcher in 1949 had more value than the average pitcher in 1925 -- he's taken on more responsibility for the team's overall defense, leaving less responsibility for the gloves. But I'm not confident that BP is giving us the right numbers for the adjustment. I look at them and also at win shares, but ultimately I adjust the pitching data to get the decades to even out in a way that doesn't concentrate too much pitching value in any one era. (I don't think I'm disagreeing with you; just trying to clarify my thinking.) I've spent much more time tweaking my pitching ratings than anything else, but the main reason is that other than wins and losses, the context never stays the same.
In about 50 "years" many of us will end up arguing that Craig Biggio is roughly the 6th to 8th best 2B ever. We'll suggest that the little things with him add up, especially the HBP. He's got 273 right now. There are two ways you can estimate how much those HBPs have been worth to his value.
First you can use a linear weight and say, "An HBP is worth about 1/3 of a run, so all those HBPs are worth 91 runs, or 9 wins."
Or you could figure his runs created in the basic way, then figure his RC without his HBP (and their PAs) and subtract one from the other. The difference in his career is about 70 runs or 7 wins.
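The first estimate is simple enough to check by hand or in a couple of lines. A sketch assuming the 1/3-run weight from the comment and the 10 runs per win implied by "91 runs, or 9 wins"; the second method needs Biggio's full batting line, which isn't in the thread, so only its posted result of about 70 runs is used for comparison.

```python
HBP_RUN_VALUE = 1 / 3  # linear weight quoted in the comment
RUNS_PER_WIN = 10      # implied by "91 runs, or 9 wins"

def hbp_value(hbp_count):
    """Linear-weights value of a career HBP total, as (runs, wins)."""
    runs = hbp_count * HBP_RUN_VALUE
    return runs, runs / RUNS_PER_WIN

runs, wins = hbp_value(273)
print(round(runs), round(wins, 1))
print(round(70 / runs, 2))  # the runs-created method as a share of the linear estimate
```

The runs-created approach lands at roughly three quarters of the linear-weights figure, so the two methods agree on the scale of the effect even if not on the exact total.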
The reason I'm saying this is that I think pitcher batting is kind of similar to HBP. It's a secondary skill (for the position at least) which accrues value slowly, but surely, for those who are good at it. Most batters get hit four or five times a year, but for guys like Biggio, who average 15-20 plunkings, while the raw total isn't all that far off from the norm, it adds up over time, despite the fact that in most seasons, the difference is barely enough to give us pause.
Anyway, like I said, a strange analogy.
Similarly, Orel Hershiser made a run at .400 one year. In fact, if my very cloudy memory serves, Hershiser went into the final day of the season needing a 1/1 to hit .400. He didn't appear in the game as a pitcher but Lasorda did allow him to pinch hit and go for it. I think he bounced to third, but, again, my memory is fuzzy.
I'd agree that the average pitcher in 1949 had more value than the average pitcher in 1925 -- on a per inning basis.
However, the number of innings thrown by an average starting pitcher (the number by which I track changes in usage patterns) was greater in 1925 than it was in 1949 -- 229 IP vs. 206 IP (the change is pretty similar in magnitude to the change you observe in the 5th-highest totals). The value of the average starting pitcher thus changes little, as I see it, between 1925 and 1949. The 1925 pitcher threw about 10% more innings at about 10% less value per inning.
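The "10% more innings at 10% less value per inning" argument can be checked with a few lines of Python. This is only a sketch: the innings are the figures quoted above, while the per-inning values (0.90 and 1.00) are stand-ins for "about 10% less", not measured numbers.

```python
# Innings thrown by an average starting pitcher, as quoted above.
IP_1925 = 229
IP_1949 = 206

# Assumption: per-inning value in 1925 is about 90% of the 1949 level.
VALUE_PER_IP_1925 = 0.90
VALUE_PER_IP_1949 = 1.00

innings_ratio = IP_1925 / IP_1949                 # extra workload in 1925
season_value_1925 = IP_1925 * VALUE_PER_IP_1925   # workload times rate
season_value_1949 = IP_1949 * VALUE_PER_IP_1949
relative_gap = season_value_1925 / season_value_1949 - 1

print(f"1925 starters threw {innings_ratio - 1:.1%} more innings")
print(f"season value gap under these assumptions: {relative_gap:+.2%}")
```

Under those assumptions the two seasons come out essentially even, which is all the argument claims.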
I agree with WARP's conclusion that Lemon's 1949 was better than Rixey's 1925, also, but I disagree with their assessment of the magnitude. They show Lemon as being 2.7 wins better than Rixey (11.6 to 8.9 WARP1), where I see Lemon as being 1.4 wins better (32.8 ws to 28.6 ws). This may seem like a small disagreement with WARP's conclusions, and it is, but Lemon seems to be represented as a pitcher with an excellent peak, and Rixey seems to be represented as a pitcher with no peak, and that overstates the difference between the two. Lemon has an excellent peak, but not one of the best all time. Rixey has a good peak, one that might not make him a HoMer in and of itself, but he also has a great career.
I see Rixey's career advantage as outweighing Lemon's peak advantage by a good bit. Lemon, if elected, would not be a bad choice, but Rixey would be a better one in 1967.
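To spell out where the 2.7 and 1.4 come from: WARP1 is already in wins, and win shares convert at three per win. A short Python sketch using only the season figures quoted above:

```python
WIN_SHARES_PER_WIN = 3  # win shares are defined as three per team win

warp1 = {"Lemon 1949": 11.6, "Rixey 1925": 8.9}
win_shares = {"Lemon 1949": 32.8, "Rixey 1925": 28.6}

# Lemon's edge over Rixey, in wins, under each system.
warp_gap_wins = warp1["Lemon 1949"] - warp1["Rixey 1925"]
ws_gap_wins = (win_shares["Lemon 1949"] - win_shares["Rixey 1925"]) / WIN_SHARES_PER_WIN

print(f"WARP1 gap: {warp_gap_wins:.1f} wins")
print(f"WS gap:    {ws_gap_wins:.1f} wins")
```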
Orel Hershiser had 24 RCAP for his career, with 9 of them in 1993, when he hit .356 with a .784 OPS. Orel had zero career homers.
How about one more list, Pitcher RCAP career - post WW2: Orel barely makes the list.
CAREER
1945-2005
P
RCAP RCAP
1 Bob Lemon 82
2 Don Newcombe 74
3 Warren Spahn 53
4 Bob Gibson 52
5 Earl Wilson 48
6 Early Wynn 45
7 Mike Hampton 42
8 Gary Peters 40
9 Don Larsen 38
T10 Bob Forsch 37
T10 Rick Rhoden 37
T10 Mickey McDermott 37
13 Don Robinson 36
14 Don Drysdale 35
15 Tommy Byrne 34
16 Fred Hutchinson 33
17 Vern Law 31
T18 Steve Carlton 30
T18 Ken Brett 30
20 Jim Kaat 29
T21 Rick Wise 28
T21 Johnny Sain 28
T21 Harvey Haddix 28
T24 Claude Osteen 24
T24 Tom Glavine 24
T24 Jack Harshman 24
T24 Orel Hershiser 24
Sabermetric Baseball Encyclopedia
New editions are available every October
http://www.baseball-encyclopedia.com
SEASON
1945-2005
P
RCAP YEAR RCAP
1 Don Newcombe 1955 23
2 Johnny Lindell 1953 17
T3 Don Drysdale 1965 16
T3 Bob Lemon 1950 16
T3 Warren Spahn 1958 16
6 Don Newcombe 1959 15
7 Bob Lemon 1949 14
T8 Bob Lemon 1948 13
T8 Robin Roberts 1955 13
T10 Bob Lemon 1947 12
T10 Fred Hutchinson 1950 12
T10 Brooks Kieschnick 2003 12
T10 Clint Hartung 1947 12
T10 Earl Wilson 1966 12
T15 Catfish Hunter 1971 11
T15 Carl Scheib 1951 11
T15 Johnny Sain 1947 11
T15 Carl Scheib 1948 11
T15 Don Newcombe 1958 11
T20 Ken Brett 1974 10
T20 Bob Gibson 1970 10
T20 Bob Forsch 1975 10
T20 Ferguson Jenkins 1971 10
T20 Boo Ferriss 1945 10
T20 Mike Hampton 2001 10
T20 Fred Hutchinson 1947 10
T20 Blue Moon Odom 1969 10
Overnight, Brent explained WARP for Rixey 1925 and Lemon 1949.
I don't have a vote, but I recommend
Brent, 1967 Ballot Discussion #22-23, for the "HOMey" award.
Even the "good" hitters here typically drew very few walks.
The legendary Terry Forster only got about 3 AB per year, so there's no way he winds up on any lists - but his record is one to behold.
I agree with the folks at BP that pitching became more important relative to fielding, as the fielding independent components of defense (walks, strikeouts, and home runs) gradually became more important. However, BP’s mechanism of adding to the weight of pitching by shifting the replacement level doesn’t make a lot of sense.
Hear, hear!
I think Chris Cobb says "hear, hear!" in other words.
--
Kelly #6:
To my mind, Johnson is hurt by three things compared to his contemporaries. One, he did play for a bad team that under-performed its pythagorean projections. Two, he played an "unimportant" defensive position while there were great hitters at "important" defensive positions. Three, his strengths (OBP, SLG, OPS) are bettered by 8 to 14 players in his own league who played at the same time as he.
Am I off base? It wouldn't be the first time.
Remember, it is only worth my 2 cents.
Good job, Kelly, easily worth my nickel.
--
Mike Webber penultimately:
> how about one more list, Pitcher RCAP career - post WW2:
> Orel barely makes the list.
That 1945-2005 list is still dominated by old timers.
Is that true also within the DH era, 1973-2005?
Bob Lemon put up a great peak as a batter.
Is RCAP denominated in league runs or normalized runs?
Whereas many pitchers have "a little 'pop' in the bat".
Fernando Valenzuela, 10 HR, 8 BB
Blue Moon Odom, talent may have approached .250 .300 .400
I think Chris Cobb says "hear, hear!" in other words.
Put me down in the "hear, hear" corner, too.
In a response to an earlier post of mine, we saw
In Sisler’s prime, 1915-22, he was 5th in the majors in RCAA (behind Ruth Cobb Speaker Hornsby). From 1903-08, Chance was 3rd in the majors in RCAA (behind Wagner and Lajoie). Which is the better accomplishment?
Because 8 seasons is not 6, we need to see Sisler's for 6 and Chance for 8.
Good point. Here are MLB RCAP rankings for peak/prime of Sisler's and Chance's careers:
consec Frank George
.yrs.. Chance Sisler
6 .......3rd .....3rd
7 .......3rd .....4th
8 .......3rd .....5th
9 .......3rd .....6th
10 ......3rd .....6th
Chance's best years vary from 1902-07 (6 yrs) to 1901-1910 (10 yrs). Sisler's go from 1917-22 to 1914-23.
Does your chart mean that between 1901-1910 Frank Chance had the third most RCAP of anyone in baseball? If so, that is amazing given how few games he played.
.......................RCAP games
1 Honus Wagner 749 1406
2 Nap Lajoie..... 530 1281
3 Frank Chance 283 1068
4 Sam Crawford 270 1463
5 Ty Cobb......... 255 .735 (rookie in '05)
This really shows how excellent Wagner was. And as good as Sam Crawford was, his highest RCAP was 46; Chance had three years with more (47, 53, and 66).
It's less valuable to rack up a high 'above average' number in less playing time. You need to go back in and add in replacement level (whatever you deem that to be) before you say he was more valuable as a hitter than Sam Crawford.
It wasn't a short career in the sense that you could go find an average player to make up missing seasons (not that I buy that replacement becomes average over time anyway). You'd need to make up the missing games within the seasons.
Who would you rather have - Crawford, or Chance and 395 games from a replacement player (who is going to be well below the extra 13 RCAP Chance gives you)?
That's the whole problem with using average as the baseline. Playing time counts.
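The Crawford-versus-Chance-plus-fill-in comparison comes down to one break-even number. Here is a rough Python sketch using the 1901-1910 RCAP and games totals from the list above. The 154-game schedule used for scaling is my assumption (some of those seasons were shorter), and the sketch does not settle what level the fill-in actually plays at.

```python
GAMES_PER_SEASON = 154  # assumed schedule length, used only for scaling

chance = {"rcap": 283, "games": 1068}
crawford = {"rcap": 270, "games": 1463}

missing_games = crawford["games"] - chance["games"]   # games Chance did not play
rcap_edge = chance["rcap"] - crawford["rcap"]         # Chance's raw RCAP lead


def combined_rcap(fill_in_rcap_per_season):
    """Chance's RCAP plus a fill-in covering the missing games."""
    return chance["rcap"] + fill_in_rcap_per_season * missing_games / GAMES_PER_SEASON


# Fill-in level (RCAP per full season) at which the package equals Crawford.
break_even = -rcap_edge * GAMES_PER_SEASON / missing_games

print(f"missing games: {missing_games}, RCAP edge: {rcap_edge}")
print(f"break-even fill-in: {break_even:.1f} RCAP per {GAMES_PER_SEASON} games")
```

If the fill-in is an average player (0 RCAP), Chance's package stays ahead; if he is worse than roughly five runs below average per full season, Crawford moves ahead.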
Since Joe's opening up this argument again...
No, you don't "need" to go back in and add replacement level. That is an opinion, not a fact.
> It wasn't a short career in the sense that you could go find an average player to make up missing seasons (not that I buy that replacement becomes average over time anyway). You'd need to make up the missing games within the seasons.
> Who would you rather have - Crawford, or Chance and 395 games from a replacement player (who is going to be well below the extra 13 RCAP Chance gives you)?
> That's the whole problem with using average as the baseline. Playing time counts.
No, Chance is likely NOT going to be replaced by a 13 RCAP below player. He's likely to be replaced in the starting lineup by an average player - about 50% likely to be BETTER than average. And actually, starting players are really more like .517 players ON AVERAGE than .500 players. And if you want to get specific, the Cubs being the Cubs, the player who might replace Chance could be even better than that.
And if we want to take the argument even further, if we're looking not for the "most valuable" players but the "most meritorious", then we should be using a cutoff worthy of the "HOM", something like a .600 baseline...
If you are talking about "off-season" replacement, I agree. But, "in-season"? Frank Chance forces you to have a guy on your roster to back him up. You can't assume the back-up is going to have a 0 RCAP.
The very term replacement player is derived from the value of the average player who is freely available to replace an injured player.
> The very term replacement player is derived from the value of the average player who is freely available to replace an injured player.
But yet "replacement" level for measures such as Win Shares and WARP is set below what even the worst team in the league would insert into the starting lineup for more than a few games? And no player is "freely available" - he will have a contract cost plus a roster-place cost at a minimum, plus if you have to trade for him, he could cost some talent. Anyway, "cost" is a bit of a red herring as it's an economic issue as opposed to a value measurement issue.
IF you're going to use "replacement" level, it would seem you need to use the level where the 129th best player in the league is at (8 starters for 16 teams = 128 players) as your theoretical replacement benchmark?
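As a sketch of that benchmark: rank everyone by whatever value measure you prefer and take the first player outside the 128 starting jobs. The function name and the sample values below are hypothetical, made up only to exercise the idea.

```python
def replacement_benchmark(values, teams=16, starters_per_team=8):
    """Return the value of the first player outside the starting pool.

    values: per-player values for one season, in any metric.
    """
    pool = teams * starters_per_team        # 16 * 8 = 128 starting jobs
    ranked = sorted(values, reverse=True)   # best player first
    if len(ranked) <= pool:
        raise ValueError("need more players than starting jobs")
    return ranked[pool]                     # index 128 is the 129th best


# Hypothetical values, not real player data.
made_up_values = [200 - i for i in range(150)]
benchmark = replacement_benchmark(made_up_values)
print(f"129th-best value in the made-up pool: {benchmark}")
```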
I think Karl is right to a small degree here....I am letting what I just wrote sink in....in that 35 and 45 may not be a big difference but maybe 75 to 85 is. Or something like that. Wherever a pitcher would begin to pass bench players.
Chris Cobb #114
Karlmagnus replied:
> I think the effect of pitcher hitting has to be non-linear;
> between say 0 and 80 OPS+ it makes only a modest difference.
Since wins vary with the square of runs scored and allowed, the effect of pitcher hitting is certainly non-linear, but there isn't a sudden tipping point from "significant" to "insignificant."
Right, no sharp corner or discontinuity (effect of tipping point), in those games where the "pitcher" in question does pitch --that is, where the project of adjusting ERA+ for batting makes sense. There may be a "bend" because the strong-batting pitcher occasionally saves a pinch-hitter, bats a little more often than the weak-batting pitcher in the same number of innings.
In a tabletop game, there would be a tipping point where the strong-batting pitcher is consistently used as a pinch-hitter (perhaps, eg, consistently used as the team's second right-handed pinch-hitter). Of course, that would show up in playing time, especially PAs. In the game on the field, there would not be sudden tipping (on the day after he pitches, we use Babe Ruth only against RHP, at least one baserunner, at least two runs down, needing a home run to tie or go ahead, something like that), but there might be a remarkable bend.
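The "non-linear but no tipping point" claim can be illustrated with the usual Pythagorean expectation. This is a sketch under stated assumptions: an exponent of 2 (matching "the square of runs") and a hypothetical 700-run team, with pitcher batting modeled simply as extra runs scored.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


BASE_RUNS = 700               # hypothetical team, 700 scored and 700 allowed
steps = [0, 10, 20, 30, 40]   # extra runs added by better pitcher batting

win_pcts = [pythagorean_win_pct(BASE_RUNS + extra, BASE_RUNS) for extra in steps]
gains = [later - earlier for earlier, later in zip(win_pcts, win_pcts[1:])]

for extra, pct in zip(steps, win_pcts):
    print(f"+{extra:2d} runs -> {pct:.4f}")
print("gain per 10-run step:", [round(g, 5) for g in gains])
```

Each extra 10 runs buys slightly less than the previous 10, so the curve bends gently without any sudden corner.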
> No, Chance is likely NOT going to be replaced by a 13 RCAP below player. He's likely to be replaced in the starting lineup by an average player - about 50% likely to be BETTER than average. And actually, starting players are really more like .517 players ON AVERAGE than .500 players. And if you want to get specific, the Cubs being the Cubs, the player who might replace Chance could be even better than that.
First, since most Major League players are below average, clearly it is not likely that Chance would be replaced by an average player. He'd be replaced by someone who wasn't good enough to be a starter otherwise, who would be below average. And to be specific, we know who he would be replaced by. Most years it was Artie Hofman, with a career OPS+ of 104. A little low for a first baseman, but then the Cubs being the Cubs, they had more talent available than, say, the Braves. In 1909, he would have been replaced by Del Howard, with an OPS+ that year of 67, and a career OPS+ of 98. If Chance had been on the Braves that year, his replacement would have been Fred Stem (career OPS+ of 60) or Chick Autry (49.) If average players were so abundant, why didn't the Braves play one of them at first?
Year Lg PRAR %_def FRAR %_def
1925 NL 2031 50.2% 2016 49.8%
1949 AL 2933 70.1% 1254 29.9%
Year Lg pit_ws %_def fld_ws %_def
1925 NL 638.3 66.9% 316.5 33.1%
1949 AL 640.1 66.7% 319.1 33.3%
We see that for WARP1 the weight given to pitching increased tremendously between 1925 and '49 (and the weight given to fielding dropped), whereas for WS there was little change in relative importance of the two contributors to defense. Although I think BP is correct that the importance of pitching increased between 1925 and '49 with the increases in fielding-independent components of baseball such as walks, home runs, and strikeouts, I also think they've greatly overstated the magnitude of the shift. On the other hand, the shift doesn't even register in the WS statistics.
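The percentage columns in the two tables are just each component's share of the pitching-plus-fielding total. A short Python sketch that recomputes them from the league totals quoted above and measures the shift in pitching's share under each system:

```python
def defense_split(pitching, fielding):
    """Return (pitching share, fielding share) of total defensive value."""
    total = pitching + fielding
    return pitching / total, fielding / total


# League totals as quoted: (pitching, fielding).
league_totals = {
    ("WARP1", 1925, "NL"): (2031, 2016),      # PRAR, FRAR
    ("WARP1", 1949, "AL"): (2933, 1254),
    ("WS", 1925, "NL"): (638.3, 316.5),       # pitching WS, fielding WS
    ("WS", 1949, "AL"): (640.1, 319.1),
}

splits = {key: defense_split(*pair) for key, pair in league_totals.items()}

# Change in pitching's share of defense from 1925 NL to 1949 AL.
warp_shift = splits[("WARP1", 1949, "AL")][0] - splits[("WARP1", 1925, "NL")][0]
ws_shift = splits[("WS", 1949, "AL")][0] - splits[("WS", 1925, "NL")][0]

for key, (pit, fld) in splits.items():
    print(key, f"pitching {pit:.1%}, fielding {fld:.1%}")
print(f"WARP1 shift: {warp_shift:+.1%}, WS shift: {ws_shift:+.1%}")
```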
It's good to have it shown clearly that WS's normalization strategy does make it insensitive to the shift of responsibility from fielders to pitchers that the stats show was taking place, as well as to see what a huge shift WARP finds.
Do you have any thoughts on how to derive a more accurate ratio?
Simply because the % of WS or WARP by defense changes by x% does not necessarily mean the VARIATION changes by that amount; we'd need to see how much more is simply baseline, if we wished to assess the impact of fielding ratings between 1925 and 1949. However, for the particular discussion of runs saved over replacement for pitchers, this shouldn't be an issue.
From bb-ref, there was about a 10% decrease in chance-of-ball-in-play per batter between the '25 NL and the '49 AL. I made a crude RC formula, and varied the BAPIP (batting average when the ball is in play), and only showed a 6% change in runs created between the two league models; not much of an effect. However, the UNearned runs dropped dramatically during this time (about 3/4 or 4/5 per game to about 1/2), so fielding percentage certainly grew in importance. I don't know how this would affect the fielding 'baseline' for the two systems (WARP and WS) in question.
A 104 OPS+ for that time wasn't too shabby for a first baseman. What was the average OPS+ at first during Chance's career?
Innings
Joss: 2327
Koufax: 2324
Therefore same career length.
Career ERA+
Joss: 142
Koufax: 131
Significant edge for Joss.
Peak ERA+ (top 5 years)
Joss: 205, 160, 151, 149, 137
Koufax: 190, 187, 161, 160, 143
This is the one major stat in which Koufax has an edge, but not a great one.
Prime (as measured by seasons with 120 ERA+ or greater)
Joss: 8
Koufax: 6
Edge to Joss.
Career rank in ERA (obviously affected by era, but still):
Joss: 2
Koufax: 91
Career rank in WHIP:
Joss: 1
Koufax: 21
Edges to Joss (or, some might say, irrelevant).
Wins
Joss: 160
Koufax: 165
Another slight edge for Koufax.
I could see voters preferring Koufax to Joss, but Joss made 2 ballots last year. I'll bet Koufax eclipses that by the time the third ballot is cast in 1972.
Joss: 2nd, 5th, otherwise not in top 10.
Koufax: 1st, 1st, 3rd, 4th
For peak value in his own time, compare Joss to Waddell, Walsh, McGinnity. Compare Koufax to ??
In 1906, Addie Joss had a nice 1.72 ERA in 280ish innings, but was only the third most valuable pitcher on his own team (Cleveland overall ERA 2.09; great deadball era, musta been a fine team defense also!).
Addie Joss is a viable candidate for top 15, especially if you give him credit for his great 1908 pennant race exploits. But there's a bunch of great reasons many of us will have Sandy K up high when he shows up.
First, J comes before K in the alphabet and I don't think that could be disputed, so Joss gets a point for that.
Wait, using the alphabet as a way to order names disregards the context. How do we know that abcdefghijklmnopqrstuvwxyz is the optimal arrangement. Maybe we should rank by which letter is used more or look from z to a?
Well, do you mean as the first letter in words or any position within a word? And what about just as the first letter in names or do you mean any position within a name?
OK, I need to get more sleep.
Comparing the number of times each pitcher was a leader or top 10'er in various pitching categories.
Wins
Joss: 6 top 10s, 1 best
Koufax: 5 top 10s, 3 bests
Percentage:
Joss: 5 top 10s, 0 bests
Koufax: 5 top 10s, 2 bests
Innings Pitched:
Joss: 2 top 10s, 0 bests
Koufax: 4 top 10s, 2 bests
WHIP:
Joss: 8 top 10s, 2 bests
Koufax: 6 top 10s, 4 bests
K/9
Joss: 1 top 10, 0 bests
Koufax: 8 top 10s, 6 bests
BB/9:
Joss: 7 top 10s, 2 bests
Koufax: 1 top 10, 0 bests
ERA:
Joss: 8 top 10s, 2 bests
Koufax: 6 top 10s, 5 bests
ERA+:
Joss: 8 top 10s, 1 best
Koufax: 6 top 10s, 2 bests
ERA stuff: ERA, LERA, RA, UERA
Joss: 1.89 / 2.70 / 2.82 / .93
Koufax: 2.76 / 3.63 / 3.12 / .36
Saves:
Joss: 5
Koufax: 9
That's a joke...
Looks like Koufax had more years where he was among the best at his position.
Park pitching factors:
BBREF:
Joss: League Park I: 96, 96, 98, 99, 97, 99, 100, 103, League Park II: 103
Koufax: Ebbets Field: 101, 105, 107, LA Memorial Coliseum: 104, 107, 106, 108, Dodger Stadium: 92, 92, 92, 92, 91
STATS All-Time (the Runs listing, not the homerun one)
Joss: 91, 92, 104, 99, 101, 85, 106, 105, 105
Koufax: 106, 103, 133, 115, 103, 132, 104, 82, 84, 78, 76, 86
More Esoteric:
STATS All-Stars:
Joss: 3
Koufax: 6
Win Shares All-Star (top 4)
Joss: 2
Koufax: 5
Win Shares Best:
Joss: 0
Koufax: 3 (including a virtual tie with Ellsworth.)
Joss had in-season health issues. I don't have it in front of me, but I'll print it tonight - all the gaps in Joss' usage during his career. There are a lot.
Koufax played on better teams, but Koufax was a big reason they achieved what they did. Bill James has pointed out that Koufax at his peak may have been the best ever at pitching to score. If he got 2 runs, he pitched a shutout or gave up just one run. When his time comes, I am sure one of us will reprint all the info from various James' comments.
Even though I loved Firefly, like Angel, and own the Slayer Collection, I have to go with Koufax.
> Joss: 2nd, 5th, otherwise not in top 10.
> Koufax: 1st, 1st, 3rd, 4th
Exactly, OCF. Addie is demolished by Koufax here. Joss wasn't remotely as durable. That's why Koufax will make a good showing when he's eligible in a few years and why Joss will stay on the outside looking in.
Joss' strength is in his effectiveness. Of course he was not a workhorse. Still:
But when it comes time to enshrine somebody who is not both over-the-top effective AND workhorse-durable, which do you grab? 4000 IP at 110? Or 2500 IP at 140?
To me there is no question that 2500 IP at 140 fits the definition of "greatness" a lot better than the converse.
But how does 2,500 IP during Joss' time translate in Koufax's time?
Don't care. Not in 1967.
The easier question is: when was 140 ERA+ not great?
But it does matter when some here are trying to equate Joss with Koufax, Marc.
> The easier question is: when was 140 ERA+ not great?
But rate stats are only one shoe. Games, PAs, and IPs are the other shoe.
As to the other shoe--yes, exactly. My question was if 110 for 4000 IP is better than 140 for 2500.
And my answer is that 140 for 2500 fits my definition of "greatness" better.
That's because 110 doesn't say great. 4000 all by itself doesn't say great. 2500 doesn't say great. 140 (for a career) says great.
Very true, Joe.
The easy answer is "never," but the more accurate answer is "well, it's not quite so great in the aughts as it would be in any other decade."
Joss's career 142 ERA+ is tied for 10th all time. Among his exact contemporaries, he trails only Ed Walsh, who has a career ERA+ of 145 (Walter Johnson and Joe Wood are also ahead of him, but we'll call them teens pitchers . . . )
However, another four pitchers who were active during all or almost all of Joss's career--Brown, Mathewson, Young, and Waddell--are also among the top 24 in career ERA+, so fully 25% of the top 24 come from basically a single decade.
When one takes Joss-sized (or even more-than-Joss-sized) pieces of the careers of these other four pitchers, one finds that all these pitchers put up ERA+ scores comparable to or better than Joss's.
So the ERA+ leaders for the 1901-1910 period, for continuous stretches of 2327 IP or more, are:
Three-Finger Brown, 1903-12, 2481.7 IP, 153 ERA+
Christy Mathewson, 1899-1912, 4208 IP, 147 ERA+
Cy Young, 1901-1908, 2728.3 IP, 147 ERA+
Ed Walsh, 1904-1917, 2964.3 IP, 145 ERA+
Addie Joss, 1902-1910, 2327 IP, 142 ERA+
Rube Waddell, 1897-1907, 2422.3 IP, 140 ERA+
FWIW, Joss was the worst hitter of this group as well.
Joss is being compared here to a group that is mainly made up of HoMers so he's in good company, surely, but when his 142 ERA+ is compared to the ERA+ scores of pitchers from other eras, it needs to be remembered how that ERA+ score looks in context.
It is high enough to show that he was an outstanding pitcher, among the best but not the best of his time with respect to run prevention on a per-inning basis (as ERA+ measures it).
The question is how much should the relative weight of pitchers increase as their role in defense grew-- due to more strikeouts, walks, and home runs, and therefore fewer balls in play. I'll use the following simplified linear weights formula:
.47 1B + .78 2B + 1.09 3B + 1.40 HR + .33 BB - .25(AB-H)
I'll ignore the contributions of baserunning (defensive responsibility for which would need to be split between pitchers and fielders) and also ignore hit by pitch. I give strikeouts the same weight (-.25) as other outs.
Since components of this formula have both negative and positive signs, I'll weight each component by the absolute value of its coefficient. Using these weights, the purely defense-independent components account for 17.7% of the total in the 1925 NL, and 24.9% in the 1949 AL.
It's surely unrealistic, however, to give fielders 100% of the responsibility for the outcomes of balls in play. If we assign 30% of the responsibility for balls in play to pitchers, the weighted share of the pitcher components increases from 42.4% in 1925 to 47.4% in 1949. If we assign 50% of responsibility for balls in play to pitchers, the weighted share of the pitchers components increases from 58.9% in 1925 to 62.4% in 1949.
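Here is a Python sketch of the two steps above. The first function is my reading of "weight each component by the absolute value of its coefficient", with home runs, walks, and strikeouts treated as defense-independent; the argument names are mine, and no league totals are supplied because none are quoted here. The second function reproduces the 30% and 50% scenarios from the 17.7% and 24.9% figures.

```python
# Absolute values of the coefficients in the simplified formula above.
WEIGHTS = {"1B": 0.47, "2B": 0.78, "3B": 1.09, "HR": 1.40, "BB": 0.33, "OUT": 0.25}


def defense_independent_share(singles, doubles, triples, hr, bb, so, other_outs):
    """Weighted share of HR, BB, and SO among all events.

    other_outs is AB - H - SO, i.e. outs made on balls in play.
    """
    independent = WEIGHTS["HR"] * hr + WEIGHTS["BB"] * bb + WEIGHTS["OUT"] * so
    in_play = (WEIGHTS["1B"] * singles + WEIGHTS["2B"] * doubles
               + WEIGHTS["3B"] * triples + WEIGHTS["OUT"] * other_outs)
    return independent / (independent + in_play)


def pitcher_share(di_share, pitcher_bip_credit):
    """Pitchers get all defense-independent weight plus a cut of balls in play."""
    return di_share + pitcher_bip_credit * (1.0 - di_share)


DI_SHARE = {"1925 NL": 0.177, "1949 AL": 0.249}   # figures quoted above

for credit in (0.30, 0.50):
    for league, share in DI_SHARE.items():
        print(f"{league}, {credit:.0%} of balls in play to pitchers: "
              f"{pitcher_share(share, credit):.1%}")
```

The more of the ball-in-play responsibility you hand to pitchers, the smaller the 1925-to-1949 change becomes, since less weight is left to move.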
As TomH observed in # 155, errors are a bigger problem - they don't figure into the linear weights formula, so I haven't accounted for them in these calculations, although there was a substantial decrease in errors over this period. (Errors and unearned runs also should figure into the discussion of Joss, but I'll hold off on entering into that discussion.) The consensus of researchers is that responsibility for errors also needs to be shared between fielders and pitchers, so I'm not sure that the overall picture will change.
The bottom line still seems to be that win shares fails to adjust for the growing role of pitchers, while WARP over-adjusts. But I don't think we know enough to say what the ideal adjustment ought to be.
>What if the 110 for 4000 was at 140 after 2500 IP? The higher the IP, the harder it is to have big rate numbers . . .
Then the 110 for 4000 with a 140 for 2500 prime is probably better, unless the 140 for 2500 also had a peak of 175 for 1250.
Trust me, if 110 for 4000 had a 140 for 2500 prime it's on my spread sheet. I'm a peak/prime voter. But the 110 for 4000 pitchers we've been talking about didn't have that kind of peak/prime.
I don't ignore context, but I also don't care nearly as much as most here about how people rated directly against their peers. The sixth best 1b of the 1985-2005 era is still immensely good, as was the fifth or sixth best pitcher of Welch's era (Welch). Some eras have a lot of talent. I'm not arguing against someone who says Joss should be ranked last on a list with Brown, Mathewson, Young, Walsh and Koufax -- I'm just saying that the difference isn't enough to justify Joss on only two ballots. Well, three next year.
I'm also not here to compare resumes and education, but it would seem more in the spirit of the HoM that if you disagree with someone's apparently more simplistic method of analysis you should do so politely and perhaps mask your disdain a little better. Now, I'm off to litigate in the Supreme Court...
No disagreement, Daryn. But Koufax also pitched in a time of fine pitchers and, while not nearly to the level his numbers may suggest at first glance, stood out among his peers at his peak. This can't be said for Joss.
> Now, I'm off to litigate in the Supreme Court...
This is probably a stupid question, but I'm not always the sharpest tool in the shed in the morning :-): do barristers in Canada wear a wig in court? I'm just curious if the British custom passed over to "Upper and Lower" Canada. Obviously, the US discarded it many years ago, so maybe our friends up north did the same, too.
Or did I just do it?
If the pitcher is credited with a larger portion of defensive value in say, the 1960's than in the 1900's, then wouldn't Koufax's ERA+ be more impressive than that of Joss? Or maybe we could say that Early Wynn's career is more impressive than that of a similar player like Eppa Rixey (I don't know if they are actually that similar)?
The thought process was that I have posted a lot of comparison lists and I usually list them alphabetically so as not to imply that one player is better than another. Also, there have been more discussions about WARP/Win Shares/Pitching vs. Defense changing contexts, so I thought I would play with that.
Sometimes I am too much of a smart-mouth.
I am actually a fan of Joss and seeing his poor vote-gathering in the 1920s elections is what caused me to get involved in the HOM. I thought he's a HoFer with lots of great rate stats, why isn't he better loved. So I did the research and got involved.
Once again, I am sorry.
To me the question is not how Jake Beckley compares to Mark McGwire or Mark Grace or Jake (sic) Jones. It's: Was Jake Beckley really more valuable when and where than, say, Addie Joss or Rube Waddell? If the rules and strategies and etc. etc. etc. of the day made pitching a more decisive commodity, well, then it was more decisive and the pitchers were more valuable. The fact that lots of pitchers from 1900-1920 had great careers--and lots of hitters had great careers in the '20s and '30s--is overrated as a reason to downgrade the #5 or #6 or #7 pitcher or hitter from those times.
The game came to them and they responded.
So why do I have Browning #1 and Kiner and Keller (and Charley Jones) off ballot?
For raw batting WS, unadjusted by anything, they are:
Kiner...... - 216.3 in 1472 games, 23.8 BWS/162 G, C- fielder
Browning - 198.0 in 1183 games, 27.1 BWS/162 G, C+ fielder
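The BWS/162 G column is just batting win shares prorated to a 162-game schedule. A quick Python check of the two rates from the career totals above:

```python
def per_162(batting_win_shares, games):
    """Prorate career batting win shares to a 162-game schedule."""
    return batting_win_shares / games * 162


kiner_rate = per_162(216.3, 1472)
browning_rate = per_162(198.0, 1183)

print(f"Kiner:    {kiner_rate:.1f} BWS/162 G")
print(f"Browning: {browning_rate:.1f} BWS/162 G")
```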
Looking at WARP adjusted for season only
Kiner...... - 74.9 W1, .319 EQA, 608 BRAR, 436 BRAA, 83 FRAR, -70 FRAA
Browning - 94.7 W1, .335 EQA, 643 BRAR, 495 BRAA, 247 FRAR, -40 FRAA
What if we timeline and adjust for league quality?
Kiner...... - 70.8 W2, 72.4 W3
Browning - 50.4 W2, 57.9 W3
I suppose the answer to my own question is that I don't timeline or adjust for league quality as much as many voters do.
Ron answered his own query pretty accurately. While none of us can say we have the Best league quality discount figured out, here are some thots:
AA discount: Over their careers, Kiner’s OWP is .693; Browning’s is .745. If I break out Pete’s early years in the AA, from 1882-85 (his age 21-24), his OWP is .815, and for the rest of his career, .707.
An .815 OWP through age 24. How rare is that? It is the fourth highest ever, behind Ted, the Babe, and Joe Jackson; ahead of Cobb and Mantle and everyone else.
Some have put forth the argument that true superstars can't take full advantage of a lesser league; if true, this would mean that Browning’s discounted stats using methods like BP’s WARP3 would be undervalued. But even if this IS true, I think it’s obvious that he probably would have been performing at a level closer to a .680 OWP before age 25 than .815.
I would say that Browning’s .707 ‘rest of career’ OWP is more indicative of his true ability in the 1880s, unless you believe he really peaked at age 22-23. If so, he was then slightly more dominant than Ralph, but then you still might have some general quality issues to consider. Kiner’s NL was very probably stronger than the AL in his time. The NL was quickly adding players from the NegLgs. The NL of 1950 was probably stronger than MLB play of 1940 or before.
Another way to look at this: Pete had stronger rate stats like OPS+ and OWP and RCAA and EqA, but Kiner kills him in black ink; obviously it was much easier to rack up impressive stats in Pete’s leagues, but Kiner more often led his league.
And so, I cannot put Pete on my ballot above Ralph. Kiner led his league three times in OPS+, and 7 consecutive home run titles is a record that never has been broken, and maybe never will be. He consistently finished in the top 10 in MVP voting, despite playing for lousy teams. I do give him a small bump down since Branch Rickey didn’t think anything of him. He is making his way on to my ballot, and he will remain a number of places ahead of another great hitter, the Louisville Slugger.
Thanks, Cliff! So the Cubs weren't hurt that dramatically by Chance's absences as one might think at first glance.
> Do barristers in Canada wear a wig in court?
No, but we do wear the silly robes much like your Supreme Court Justices wear.
If a 140 ERA+ means a pitcher is great, does that mean Mark Eichhorn will be on your ballot when his time comes?
This is obviously a joke, but being from Toronto, Eichhorn has always been one of my favourites. I consider his 1986 to be one of the best relief seasons of all time, and have always cursed Jimy one M for not starting him on the last weekend of the season so he could steal the ERA title. If Eichhorn could have just racked up another 1500 innings, he'd have made my ballot.
Personally, if I had the choice between the wig and the robe, I'd take the robe any day of the week. :-)
Joss has been in my PHOM for decades; I think I'm the biggest FOAJ. I think the electorate needs to upgrade him substantially. There are no UERA or high loss level issues like there are with Waddell.
ERA+'s don't average like that, because ERA is in the *denominator* of ERA+. You have to use harmonic means, I believe -- or it's safer to just work out the raw ER totals and then take the difference if you're not confident as to which mean to take.
For the problem stated above I get an ERA+ of 81 which would be required to depress a 140ERA+/2500IP pitcher down to 110ERA+/4000IP.
I tried 1500 IP of 100 ERA+ just for yucks and that depresses the 140ERA+/2500IP down to 121ERA+/4000IP.
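The "work out the raw ER totals" approach can be sketched in Python. The key assumption, mine and not stated above, is that league and park context is the same across both stretches, so earned runs relative to league are proportional to IP divided by ERA+. Under that assumption the combined ERA+ is an innings-weighted harmonic mean.

```python
def combined_era_plus(segments):
    """Innings-weighted harmonic mean of ERA+ over (ip, era_plus) segments."""
    total_ip = sum(ip for ip, _ in segments)
    relative_er = sum(ip / era_plus for ip, era_plus in segments)
    return total_ip / relative_er


def required_era_plus(ip_known, era_plus_known, ip_total, era_plus_total):
    """ERA+ needed over the remaining innings to end at era_plus_total."""
    ip_rest = ip_total - ip_known
    relative_er_rest = ip_total / era_plus_total - ip_known / era_plus_known
    return ip_rest / relative_er_rest


# What the last 1500 IP must look like to drag 140 over 2500 IP down to 110 over 4000 IP.
decline = required_era_plus(2500, 140, 4000, 110)

# What 1500 league-average innings do to a 140 ERA+ over 2500 IP.
with_average_tail = combined_era_plus([(2500, 140), (1500, 100)])

print(f"required ERA+ over the last 1500 IP: {decline:.1f}")
print(f"career ERA+ with a league-average tail: {with_average_tail:.1f}")
```

A plain innings-weighted arithmetic mean of 140 and 100 would give 125, so the choice of mean matters by a few points.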
ERA+ Pitcher Yrs IP
153 Grove 25-32 2125.7
140 Johnson 18-25 2072
139 Vance 23-30 2066.7
133 Alexander 19-27 2274.7
133 Coveleski 18-25 2213
129 Luque 20-27 2069.3
125 Shocker 18-27 2452.3
123 Rommel 20-31 2491
123 Rixey 21-28 2193
123 Faber 20-28 2129.7
121 Quinn 19-28 2147
121 Mays 18-26 2088
121 Cooper 18-24 2072
121 Lyons 25-32 1982.7
120 Pennock 20-28 2155.7
118 Hoyt 21-28 2023
113 Kremer 24-33 1954.7
112 Grimes 20-29 2797.7
112 Uhle 23-31 2131.7
108 Jones 21-30 2131.7
Kiner - 74.9 W1, .319 EQA, 608 BRAR, 436 BRAA, 83 FRAR, -70 FRAA
Browning - 94.7 W1, .335 EQA, 643 BRAR, 495 BRAA, 247 FRAR, -40 FRAA
Keller - 67.1 W1, .324 EQA, 472 BRAR, 347 BRAA, 145 FRAR, 18 FRAA
Jones - 70.5 W1, .319 EQA, 423 BRAR, 303 BRAA, 230 FRAR, 9 FRAA
Jackson - 86.9 W1, .331 EQA, 645 BRAR, 488 BRAA, 161 FRAR, -32 FRAA
Notice Keller and Jones are both rated as slightly above average fielders (18 FRAA for Keller, 9 FRAA for Jones), but Jones is credited with many more fielding runs above replacement level than Keller (230 FRAR in 894 games for Jones, versus 145 FRAR in 1170 games for Keller). Although some of the difference is due to Jones spending some time in center field, mostly this difference illustrates the flip side of BP's shifting replacement level for pitchers that I was discussing above. Just as BP shifts down the replacement level for pitchers over time (thereby increasing the WARP1 of recent pitchers) they also shift up the replacement level for position players, thereby increasing the WARP1 of earlier players. BP is saying that the fielding of the average outfielder of the 1870s and 80s was worth substantially more than the average outfielder of the 1940s. While I think there may be some validity to this shift (as there were many more balls in play in the 19th century), it seems to me that BP has substantially overstated the effect.
Ron W also wrote:
I suppose the answer to my own question is that I don't timeline or adjust for league quality as much as many voters do.
Voters who base their ratings on BP's WARP1 numbers are implicitly accepting their timelining of pitchers and their reverse timelining of position players, due to the shifts in their replacement levels over time.
I don't know about the 110, 4000 pitcher, but I've got Rixey rated well above Joss. Rixey prevented 623 runs versus a replacement level pitcher, while Joss was at 398. This means to me that Rixey was a greater pitcher. No arbitrary ERA or innings standards needed.
Of course, I could be wrong.
I know I'm late to the party, sorry about that.
But Chris - I've always thought the same thing about WARP.
They (and anyone evaluating) should be comparing pitcher hitting to that of an 'average' hitting pitcher for the particular era. This is replacement level - no one makes a decision on pitchers based on their hitting. So a pitcher off the scrap heap will hit like an average pitcher (for the era). It's similar to fielding replacement level for position players, which tends to be average (for different reasons). I think this causes problems with WARP as you suggest.
This is really important.
WS is off in that for bad hitting pitchers, below 1/2 of League Average RC/27, their hitting is zeroed out. So pitchers worse than this level are overrated by WS. But the problem above also plays into it as well, which makes the overall impact pretty complicated to figure out.
I'll tell you what, Eichhorn's 1986 is as valuable as any year Joss had, when you figure in leverage and innings relative to league norms . . . that was a monster year!
I say throw them all on our ballot . . . and let the voters decide!
***********
Great post on Chance in #20 Tom. I'm thinking about it. I think I like him better than Keller now, but that still doesn't put him really high on my ballot - but I can see the argument.
But I said it and now I'm stuck with it. Now I HAVE TO vote for every pitcher who ever had a 140, and I'm not allowed to ask how many innings they pitched.
My bad.
Cliff, please post a list of every pitcher that ever threw a 140. I sure hope there aren't more than 14. I would hate to have to throw Dobie Moore off my ballot, but I guess if there are 15 then that's just what I'll have to do.
Jeez.
The fact is that everyone in major league history with a 140 ERA+ and more than 2000 innings is a slam dunk Hall of Meriter. You can claim arbitrary endpoints all you want, but 2000 innings is a sensible cutoff for starters, representing 10 modern-day seasons.
Joe Wood is an interesting comparison as well: 1500 innings/2000 at bats, 140 ERA+, 110 OPS+, and not a candidate.
I'll also be away from Jan. 3 through Jan. 6, but that shouldn't affect anything major here. Besides, the Commish will still be minding the store. :-)
I hope everyone has a wonderful Christmas and Hanukkah!
Some have put forth the argument that true superstars can't take full advantage of a lesser league; if true, this would mean that Browning’s discounted stats using methods like BP’s WARP3 would be undervalued. But even if this IS true, I think it’s obvious that he probably would have been performing at a level closer to a .680 OWP before age 25 than .815.
I would say that Browning’s .707 ‘rest of career’ OWP is more indicative of his true ability in the 1880s, unless you believe he really peaked at age 22-23. If so, he was then slightly more dominant than Ralph, but then you still might have some general quality issues to consider. Kiner’s NL was very probably stronger than the AL in his time. The NL was quickly adding players from the NegLgs. The NL of 1950 was probably stronger than MLB play of 1940 or before.
Am I understanding you correctly? Are you penalizing Browning for his quality of competition relative to Kiner to make Browning only slightly better, then giving a bonus to Kiner for his quality of competition relative to Browning to make Kiner better? In other words, aren't you doubly penalizing Browning?
Another way to look at this: Pete had stronger rate stats like OPS+ and OWP and RCAA and EqA, but Kiner kills him in black ink; obviously it was much easier to rack up impressive stats in Pete’s leagues, but Kiner more often led his league.
Well, this is a misuse of Black Ink. It's not designed to be a measure of one's dominance over a league, and certainly not how easy or hard a league is. It measures what kinds of things likely Hall Of Fame voters would look at when casting their ballots.
As for Kiner and Browning's relative dominance over their leagues, they're pretty even, except Browning has 9 good seasons to Kiner's 6 or 7:
OPS+:
Browning: 2 times led league, 3 times 2nd, 1 3rd, 1 5th, 2 6th, 9 times top 6
Kiner: 3 times led league, 2 4th place, 1 7th, 6 times top 7
OBP:
Browning: 2 times led league, 3 times 2nd, 1 3rd, 2 4th, 1 9th, 9 times top 9
Kiner: 1 time led league, 2 times 3rd, 2 times 6th, 1 7th, 6 times top 7
SLG:
Browning: 1 time led league, 1 2nd, 1 3rd, 1 4th, 4 5ths, 1 7th, 9 times top 7
Kiner: 3 times led league, 1 3rd, 2 4ths, 1 9th, 7 times top 9
It's the home run titles that give Kiner his big Black Ink lead, not any actual dominance over Browning or his league. Not that the 7 HR titles aren't impressive.
Browning actually leads Kiner in Grey Ink, 147 to 145.
"In March, 1951, with the Korean War heating up, Antonelli--still only twenty years old--was drafted. In the early fifties the commanders of several Army bases maintained quality baseball teams as a source of competitive pride among the bases. Antonelli pitched for the Fort Myer team, in Virginia. He started 44 games in his two years in the Army, completed all 44 and won 42 of them, one of the two losses being to a team from Fort Eustice led by Willie Mays. Capping his military career, Antonelli led the Washington Military District team to the National Baseball Congress Championship in Wichita in 1952 and was included on an All-Star team which toured Japan. It was the tonic he needed after three years of little activity."
Antonelli's story is a fascinating example of the stupidity of the bonus baby rules of the era.
Anyway, you wrote in 167: That's because 110 doesn't say great. 4000 all by itself doesn't say great. 2500 doesn't say great. 140 (for a career) says great.