Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Friday, March 01, 2002

Estimating League Quality - Part 1 (the concept)

First of all, let me apologize for the lack of material posted to the Hall of Merit BLOG. In the coming weeks, I’m confident this will no longer be a problem.

When we consider players who played over 100 years ago, it is vital to look at the quality of the leagues they played in. Using a method that is similar to what Clay Davenport has been doing for some time (for examples of this kind of work, see Clay’s recent postings on Baseball Prospectus concerning the quality of play in the Japanese Baseball Leagues), I attempted to estimate the quality of baseball in the “major” leagues of the 19th century.

I focused on hitting stats, since at this time there were only a handful of pitchers active at a given time in a given league.

My method assumes that a player’s overall batting skill does not change appreciably from one year to the next. This assumption is not true on an individual basis, but it starts to make sense when we are talking about a large group of players. The individual changes in skill should become less important as the size of the group increases.

In leagues that are stable, there isn’t a very high turnover in personnel from year to year. In the 19th century National League, in most years, about 70%-80% of the players returned to play regularly the following year. In cases where new leagues started up and players jumped, the percentage of holdovers was much much lower - and this makes comparison much more difficult.

I estimated the quality of each hitter’s batting by using a runs produced ratio [(R+RBI)/PA] and compared it to a league average performance. The reason I chose this, and not Runs Created or Linear Weights, is that I wasn’t going to adjust for park and I assumed that the batting order bias of the R and RBI stats was not going to be relevant for a large group of players either.

In the 19th century, where more advanced run estimation formulas are much less accurate than for “modern” baseball, I opted for the simplicity of using Runs Scored and RBI.

Because we are comparing each group of players to the league average, the result shouldn’t be far from 1.00 for a relatively stable league (where the majority of regulars return the next year). In practice, of course, it’s unlikely to be exactly 1.00.

If the newcomers to the league in a given year were better than typical newcomers, the performance of the holdovers would be worse than in a typical league and this would be a sign that the league was getting stronger. On the other hand, if a lot of good players jumped to a rival league and their places were filled by less skilled batsmen, the holdovers would improve their performance relative to league average and this would be a sign that the league was weakening.

By comparing the overall performance of the SAME group of players from year to year and league to league, it should be possible to track the changes in the overall quality of play.
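As a rough illustration of the comparison described above, here is a minimal Python sketch. The data layout (a dict per season, keyed by player name), the function names and the numbers are all assumptions made up for this example; it is not the actual worksheet behind the study. It also pools the holdovers into one group rate rather than averaging individual ratios, which is a simplification.

```python
def group_rate(season, players):
    """Pooled runs produced per plate appearance, (R + RBI) / PA, for a group."""
    r = sum(season[p]["R"] for p in players)
    rbi = sum(season[p]["RBI"] for p in players)
    pa = sum(season[p]["PA"] for p in players)
    return (r + rbi) / pa


def holdover_index(year1, year2):
    """Relative performance of the SAME players in two consecutive seasons.

    Each season maps player name -> {"R": ..., "RBI": ..., "PA": ...}.
    Returns (relative rate in year 1, relative rate in year 2, year2 / year1).
    A final ratio below 1.00 means the holdovers lost ground against the
    league average, which the method reads as the league getting stronger.
    """
    holdovers = sorted(set(year1) & set(year2))
    rel1 = group_rate(year1, holdovers) / group_rate(year1, list(year1))
    rel2 = group_rate(year2, holdovers) / group_rate(year2, list(year2))
    return rel1, rel2, rel2 / rel1


# Made-up three-man "league": C leaves after year 1, strong newcomer D arrives.
year1 = {
    "A": {"R": 60, "RBI": 50, "PA": 400},
    "B": {"R": 40, "RBI": 40, "PA": 400},
    "C": {"R": 20, "RBI": 20, "PA": 200},
}
year2 = {
    "A": {"R": 60, "RBI": 50, "PA": 400},
    "B": {"R": 40, "RBI": 40, "PA": 400},
    "D": {"R": 50, "RBI": 50, "PA": 200},
}

rel1, rel2, change = holdover_index(year1, year2)
print(f"holdovers vs league: {rel1:.3f} -> {rel2:.3f} (ratio {change:.3f})")
```

In this toy case the holdovers hit exactly as well in both years, but their standing relative to the league drops because the newcomer is better than the man he replaced.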

In the next part, I’ll apply these methods to a specific example.



This thread will now be included with the Hall of Merit links.

-John Murphy
August 6, 2004

Robert Dudek Posted: March 01, 2002 at 05:49 PM | 172 comment(s)

Reader Comments and Retorts

   1. tangotiger Posted: March 14, 2002 at 12:45 PM (#509664)
No matter what method is chosen, we will end up with problems. I personally would not choose R+RBI, as they are heavily influenced by a player's teammates. Using LWTS or RC also has problems, because they are dependent on the league context, and the impact of a single in 1993 might not necessarily be the same as in 1903. The pool of players selected also has an influence.

As Robert said though, by looking at the changes over a SMALL set period of time, we might be able to learn something GENERALLY, without getting too specific.
   2. scruff Posted: September 16, 2002 at 08:27 PM (#509666)
I'm reposting to reactivate this thread. Tom H asked about a thread to discuss the quality of league play, so here it is!
   3. Charles Saeger Posted: September 18, 2002 at 02:11 PM (#509668)
I've had some thoughts on league quality. I figure we need to look at several tools, not just one ultimate tool. As such, I propose the following tools:

* Five year floating comparisons. A player's performance can be compared with what he did up to five years before or after that year. This is the Cramer study, with a partial buffer against the effects of aging.

* Pitchers hitting, relative to the league. After 1975, this becomes less useful because pitchers bat less often in the minors due to the DH rule.

* Fielding percentage, adjusted for strikeouts (PO-SO)/(PO-SO+E). Less useful early in history since players did not start using gloves at the same time.
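The third tool can be written out as a one-line function. This is only a sketch of the formula as posted, (PO-SO)/(PO-SO+E); the function name and the league totals in the example are made up, and like the posted formula it leaves assists out.

```python
def adjusted_fielding_pct(putouts, strikeouts, errors):
    """(PO - SO) / (PO - SO + E): fielding percentage with strikeout putouts removed."""
    chances = putouts - strikeouts + errors
    if chances <= 0:
        raise ValueError("need at least one non-strikeout chance")
    return (putouts - strikeouts) / chances


# Hypothetical league totals, for illustration only.
example = adjusted_fielding_pct(putouts=30000, strikeouts=4000, errors=2000)
print(round(example, 3))
```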
   4. Marc Posted: September 21, 2002 at 05:56 PM (#509670)
Are there numbers for the NA? It's probably useless to speculate, but with a little extrapolation it kind of looks like the NA may have been about equal to the early and late AA?

On the other hand, that is not a fair comparison. The NA represented the best baseball, the state of the art, at the time. The AA did not. So I think it is two different things to discount Ross Barnes or Al Spalding versus discounting Stovey, Browning, Caruthers and Mullane. At their best Stovey, Browning, Caruthers and Mullane may never have been as good as their more or less exact contemporaries Brouthers, Connor, Ewing, Glasscock and Clarkson. But at their peak, there was nobody, no rough contemporaries anywhere near as good as Spalding, Barnes and Wright. To me, those are two quite different things that just happen to be described by the same concept--that is, "-.020."
   5. Charles Saeger Posted: September 23, 2002 at 05:06 PM (#509671)
Actually, that study was done by Dick Cramer. It is an interesting study, but it has a fatal flaw -- it makes changes for player aging. The steady upcreep of player skill is probably real but nowhere near as great as the Cramer study says it is. The other study outcomes probably are valid.
   6. Marc Posted: September 23, 2002 at 10:32 PM (#509675)
Tom wrote:

>I would subjectively put the league quality at a level so that the early stars (Barnes-Spalding-Wright) come out as no-better-than-even in terms of peak value with Brouthers-Glasscock-Nichols....

That seems sensible. And in fact they would be more or less equal in terms of peak value, and then the short careers of the early stars would have to be factored in. So, no, none of the stars of the NA is probably the very best of the 19th century at his position, but in terms of peak value (in terms of their contributions toward winning pennants) they were "in the ballpark."
   7. Charles Saeger Posted: September 24, 2002 at 11:44 AM (#509676)
I don't think the quality of the best players has advanced much over time, if at all. The quality of the tofu, marginal regulars, however, is where the increase has occurred.

It is interesting to note from the Cramer study that the rates of increase levelled off some in both 1920 and 1960, so we can assume the increase in quality of play has slowed. I do not think the baseball of today is all that much better than the baseball of 1960 -- were you to have an average team from 1960 play an average team from 2000 for 1000 games, my guess is the 2000 team would win 504 games on average, or something like that.
   8. jimd Posted: September 25, 2002 at 07:23 PM (#509677)
As to the quality of the early NA, given the incredible disparity among teams (a 71-8 record?!?) and individual player stats, I would subjectively put the league quality at a level so that the early stars (Barnes-Spalding-Wright) come out as no-better-than-even in terms of peak value with Brouthers-Glasscock-Nichols, and not fear to tread any further.

As I've pointed out on the Pitchers thread, there should be little difference in quality between the NA of 1875 (when Boston went 71-8) and the NL of 1876. I'll let those with better numbers track the evolving quality of the earlier NA years and the succeeding ones of the NL. (scruff has published indications that he believes the NA of 1875 may have been tougher.)

However Harry Wright did it, there is no question that he determined who the best players of that era were and got most of them onto his team. He managed to keep them together and winning with apparently few ego problems (at least none that I've read about). This was good for them, but ultimately, this was bad for the NA (we think that there is no hope of competing against the Yankees now; they are nowhere near this dominant). A priori, they are the NA All-Star team; it will be interesting to see if anyone else can crack that lineup.
   9. scruff Posted: September 26, 2002 at 02:28 PM (#509679)
Tom, I don't know that hitters batting avg relative to the league is a good stat to use. Batting avg is affected by many things.

I'd try to come up with something a little more comprehensive, like XR or RC or something. But then you run into problems with the formulas, etc.

This is what I'd do, if I had the time. Figure runs created for each player on each team, normalize the totals of the individuals to the team total runs, so they come out the same, i.e. the total for Chicago individuals in 1876 equals the actual number of runs the 1876 Cubs scored.

I know some people are opposed to this in theory, but the accuracy of RC and XR is questionable for that time period, and I think the adjustment is necessary, for the stabilizing effect.

Once you have the RC, figure RC per 27 outs (don't use 27, use the league average per game of batting outs recorded, will make a big difference back then, because of all the errors).

Finally, adjust that for the league. Now you've got a number you can make meaningful comparisons with.
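The steps above can be sketched in Python as follows. The function names and the sample numbers are hypothetical, and reading "adjust that for the league" as dividing by the league's own rate is an assumption, not something spelled out in the post.

```python
def normalize_to_team(raw_rc, team_runs):
    """Scale individual runs-created estimates so they sum to actual team runs."""
    scale = team_runs / sum(raw_rc.values())
    return {name: rc * scale for name, rc in raw_rc.items()}


def rc_per_game(rc, outs, league_outs_per_game):
    """Runs created per game's worth of batting outs (league average, not a flat 27)."""
    return rc / outs * league_outs_per_game


def relative_rate(player_rate, league_rate):
    """Player rate as a percentage of the league rate (100 = league average)."""
    return 100.0 * player_rate / league_rate


# Made-up team: raw estimates sum to 100, but the team actually scored 120.
raw = {"A": 50.0, "B": 30.0, "C": 20.0}
normalized = normalize_to_team(raw, team_runs=120)

# Player A made 200 batting outs; assume the league averaged 25 batting outs
# per game and 6.0 runs per game's worth of outs (both invented numbers).
rate_a = rc_per_game(normalized["A"], outs=200, league_outs_per_game=25)
index_a = relative_rate(rate_a, league_rate=6.0)
print(round(rate_a, 2), round(index_a, 1))
```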

Alternatively, if you have the WS book (and the digital update from Stats), you can figure out WS per 162 team games or something like that (adjusting for the player's relative playing time), and use that as your comparison number. That would be a helluva lot easier, and the number is meaningful. But I don't think batting average is a good number to use. Just because fielding gets better (moving batting averages down) doesn't mean the quality of league play goes up. Take a league of Ozzie Smiths everywhere, even out in LF. Add Barry Bonds and Rickey Henderson and Jeff Bagwell to the mix (removing two Ozzie LF's and an Ozzie 1B), and the league fielding will get worse, but you're going to have a higher quality league. WS is the kind of number that is perfect for a study like this, even though it is seriously flawed with the pitcher/fielder split on defense for this era.

If you use WS, you can include pitchers as well, or do two separate studies.

Just my .02 . . .
   10. John (You Can Call Me Grandma) Murphy Posted: September 27, 2002 at 10:58 AM (#509683)
Tom, your points are all well taken. All three of them. :-)
   11. Joe Dimino Posted: August 07, 2004 at 11:08 AM (#783170)
moving to hot topics.
   12. John (You Can Call Me Grandma) Murphy Posted: August 07, 2004 at 11:47 AM (#783207)
Joe rules!
   13. Paul Wendt Posted: August 07, 2004 at 03:13 PM (#783385)
Transferred from 1932 Ballot Discussion
David Foss:
It's possible that the AL stars may not have affected the "base-level" of play for the league. Is there any evidence that would support that the AL also had more weak players and weak teams... not enough to reverse the discount, but enough to balance things?

Michael Schell (Biostatistics, UNC) believes that "standard deviation measures league talent" inversely. He uses a version of league SD as a measure, which is not an argument, but it is clear that he believes it. [Baseball's All-Time Best Hitters, Princeton U P, 1999, chapter 4.]

For example, "Many of Cobb's American League compatriots would likely have ridden the National League bench during the same era."

That era is 1910-1914, when SD batting average was AL .043, NL .030. In the AL, 4% batted above .350 and 12% below .220 (mean-adjusted). In the contemporary NL and in both leagues 1980-1984, the shares were 0% and 6%. By those measures, the number of extra "low outliers" in AL1910-1914 was greater than the number of extra "high outliers".[*]
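For readers who want to try this kind of measure themselves, here is a minimal Python sketch of a spread summary. It is not Schell's procedure: the additive mean adjustment, the function name, the target mean and the five sample averages are all assumptions for illustration. Only the .350 and .220 cutoffs come from the paragraph above.

```python
from statistics import mean, pstdev


def spread_summary(averages, target_mean, high=0.350, low=0.220):
    """Shift a league's batting averages to a common mean, then summarize spread.

    Returns the standard deviation plus the share of players above `high`
    and below `low` after the shift.
    """
    shift = target_mean - mean(averages)
    adjusted = [a + shift for a in averages]
    n = len(adjusted)
    return {
        "sd": pstdev(adjusted),
        "share_high": sum(a > high for a in adjusted) / n,
        "share_low": sum(a < low for a in adjusted) / n,
    }


# Five invented batting averages, shifted to an assumed common mean of .260.
summary = spread_summary([0.200, 0.240, 0.260, 0.280, 0.360], target_mean=0.260)
print(summary)
```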

Schell's argument implies that the AL was weaker than the NL, 1910-1930; stronger, 1901 and 1960-1975. That isn't plausible.

[*]
jimd suggested that standard deviation is low in NL1910-1914 (s.d. mean-adjusted batting average) because the NL lacked outliers such as Cobb, Jackson, Speaker, Collins, Lajoie on the high side.
http://www.baseballthinkfactory.org/files/primer/hom_discussion/1932_ballot_discussion/P200/#9
   14. Paul Wendt Posted: August 07, 2004 at 03:22 PM (#783393)
Shibe Park 1938-1954 was home to the NL Phillies and AL Athletics. For that period, Michael Schell found normalized ballpark batting average .253 in the NL ("a slightly underaverage hitting park") and .256 in the AL --where .255 is the norm.

Such calculations do not measure league quality, but they have a place in interleague comparisons.
   15. jimd Posted: August 09, 2004 at 01:11 PM (#786004)
Paul, have comparable studies been done for the other shared parks of this era? Polo Grounds 1913-1922; Sportsman's Park 1921?-1952. Informal studies I've done of their park factors during the teens and twenties show that the AL was the "pitchers league" while the NL was the "hitters league"; both of those parks would play as better for hitters in the AL.

This is largely irrelevant to those who use OPS+/ERA+, Win Shares or WARP to evaluate players.
   16. DavidFoss Posted: August 09, 2004 at 02:23 PM (#786184)
Along the same lines as jimd's post, the NL took steps to decrease offense following the rabbit ball 1930 season while the AL did not. Scoring levels in the NL were lower than the AL between 1931 and WWII... often by as much as a full run per game.

Again, this is irrelevant for those who use statistics that are scaled like OPS+/ERA+, etc.

This has been mentioned before, but this thread looks like a good place to restate this.
   17. Paul Wendt Posted: August 09, 2004 at 03:38 PM (#786430)
park factors during the teens and twenties show that the AL was the "pitchers league" while the NL was the "hitters league"; both of those parks would play as better for hitters in the AL.

Schell's estimates for Shibe Park suggest the same for 1938-1954. Shibe was a better batting park in the AL because the other AL parks were not so good for batting as the other NL parks.

It should be easy to study the matter by using a consistent series of park factors as a resource, eg the Total Baseball park factors.
   18. John (You Can Call Me Grandma) Murphy Posted: August 09, 2004 at 03:57 PM (#786474)
Schell's estimates for Shibe Park suggest the same for 1938-1954. Shibe was a better batting park in the AL because the other AL parks were not so good for batting as the other NL parks.

Isn't this also true for Sportsman's Park?
   19. jimd Posted: August 09, 2004 at 05:08 PM (#786602)
Polo Grounds 1913-1922; Sportsman's Park 1921?-1952. Informal studies I've done of their park factors during the teens and twenties show that the AL was the "pitchers league" while the NL was the "hitters league"; both of those parks would play as better for hitters in the AL.

Isn't this also true for Sportsman's Park?

Yep. See above.
   20. John (You Can Call Me Grandma) Murphy Posted: August 09, 2004 at 05:14 PM (#786608)
Yep. See above.

Sorry, Jim. I missed that post.
   21. Paul Wendt Posted: August 12, 2004 at 10:13 AM (#792041)
I doubt error rates as a basis for judging quality in competing leagues. That is how participants probably viewed errors, so liberal scoring may have been a means of competition.

(By eye and mind) Error rates seem to show that NL fielding was significantly better than AL and FL in 1914; that NL and FL fielding improved significantly in 1915.

BTW, the FL signed many more NL regulars than AL.

That's enough for me, since someone else probably has an electronic database with the relevant data for every league-season. (IP, SO and E, I think)
   22. Paul Wendt Posted: September 01, 2004 at 05:51 PM (#831677)
jimd supposes or estimates that Babe Ruth is 16+ wins above replacement; call it 49 win shares.

OK, if I understand correctly:
Suppose that a replacement-level full-time player is worth 6.5 win shares (thus 78 WS or 26 wins for a team of replacement-level players). 6.5 is someone's estimate.

Ruth joins one of 16 teams, which increases the MLB average talent by 3 WS per team on the old scale, or 0.25 per full-time player-position (8 regulars, 4 pitchers). Because the number of wins is fixed, that decreases measured talent by 0.25 per full-time player-position for all slots but his own. Because he joins one team in one league, his impact on the talent scale is 0 in the other league and -0.5 per full-time player-position in his own league. After his entry, the measured talent of replacement-level players is 6.5 WS in the other league, 6.0 in his own league.

Right?
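The arithmetic in this post can be checked with a few lines of Python. The constant names are made up for this sketch; the inputs (49 win shares, 16 teams, 8 per league, 12 full-time slots per team, 6.5 replacement level) are the ones given above, and 3 win shares per win is the standard Win Shares scale. The one slot that belongs to Ruth himself is ignored, which barely moves the result.

```python
RUTH_WS = 49            # 16+ wins above replacement, at 3 win shares per win
TEAMS_IN_MLB = 16
TEAMS_PER_LEAGUE = 8
SLOTS_PER_TEAM = 12     # 8 regulars + 4 pitchers
REPLACEMENT_WS = 6.5    # someone's estimate, per the post

# A full team of replacement-level players.
team_of_replacements = SLOTS_PER_TEAM * REPLACEMENT_WS      # 78 win shares
wins_of_replacements = team_of_replacements / 3             # 26 wins

# Ruth's win shares spread over every full-time slot.
mlb_cost_per_slot = RUTH_WS / (TEAMS_IN_MLB * SLOTS_PER_TEAM)          # about 0.25
league_cost_per_slot = RUTH_WS / (TEAMS_PER_LEAGUE * SLOTS_PER_TEAM)   # about 0.5

# Measured replacement level in Ruth's own league after he arrives.
new_replacement = REPLACEMENT_WS - round(league_cost_per_slot, 1)
print(mlb_cost_per_slot, league_cost_per_slot, new_replacement)
```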
   23. jimd Posted: September 01, 2004 at 09:05 PM (#832239)
Paul, my original argument was made using WARP. It looks like you've constructed the Win Shares version of it here. Adding Ruth to the league cost every regular about 0.5 Win Share per year; having a 7-2 imbalance of super-stars (Johnson, Cobb, Speaker, Collins, Jackson, Baker, Ruth vs Alexander, Hornsby) could cost each regular in the stronger league about 2.5 Win Shares per year or 40 Win Shares over the course of a 17 year career.

For Wheat vs Hooper, they start at 385-330 (all seasons adjusted to 154 G), and this narrows the gap to 385-370 (or 345-330). I don't see them as being that different. Maybe the line between being barely in and being barely out passes between them, but not the line between being first ballot and having no chance at all.
   24. Chris Cobb Posted: September 01, 2004 at 11:03 PM (#832740)
having a 7-2 imbalance of super-stars (Johnson, Cobb, Speaker, Collins, Jackson, Baker, Ruth vs Alexander, Hornsby) could cost each regular in the stronger league about 2.5 Win Shares per year or 40 Win Shares over the course of a 17 year career.

jimd, I know these are back-of-the-envelope estimates, but looking at the careers of these players, I don't think the imbalance was typically 7-2, or that each superstar had the effect of Ruth at the top of his game. If my modeling of great-player effects is correct, pitchers pull more WS from other pitchers than from position players, and vice versa. So let's have Johnson and Alexander cancel each other out, and look at the hitters.

Superstar years
Cobb, 07-19
Speaker, 09-23
Collins, 09-20
Jackson, 11-13, 16-17, 19-20
Baker, 11-14
Ruth, 16-32

Hornsby 17, 20-29

Prior to 1911, there's another set of superstars to deal with, so let's start in 1911.

1911-13 5-0, AL
1914 4-0 AL
1915 3-0 AL
1916 5-0 AL
1917 5-1 AL
1918 4-0 AL
1919 5-0 AL
1920 4-1 AL
1921 2-1 AL

After 1920 we also would need to look at a different set of players in both leagues. On average, from 1911-1920 there's a 4.3 AL advantage in superstars. If we estimate that these superstars average 40 ws/season as a group (which is generous), each one drops a regular in his own league about .33 ws. 4.3 superstars together is 1.4 ws/year during this period. Assuming the average superstar advantage for the AL remains constant throughout the Wheat/Hooper period (I doubt it would be wider), that would mean more like a 25 win-share cost to Hooper for better competition during his career, leading to a 385-355 edge for Wheat.
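A quick Python check of the averages in this paragraph, using the season-by-season counts listed above. The variable names are invented for the sketch, the .33 cost per superstar is taken from the post as given, and the 17-year career length is borrowed from jimd's earlier estimate, so treat the career figure as approximate.

```python
# (AL superstars, NL superstars) by season, copied from the list above.
superstars = {
    1911: (5, 0), 1912: (5, 0), 1913: (5, 0), 1914: (4, 0), 1915: (3, 0),
    1916: (5, 0), 1917: (5, 1), 1918: (4, 0), 1919: (5, 0), 1920: (4, 1),
}

edges = [al - nl for al, nl in superstars.values()]
avg_edge = sum(edges) / len(edges)              # average AL advantage, 1911-1920

COST_PER_SUPERSTAR = 0.33   # win shares per regular per year (the post's estimate)
CAREER_YEARS = 17           # assumption: career length used earlier in the thread

yearly_cost = avg_edge * COST_PER_SUPERSTAR     # roughly 1.4 win shares per year
career_cost = yearly_cost * CAREER_YEARS        # a bit under 25 win shares
print(avg_edge, round(yearly_cost, 2), round(career_cost, 1))
```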

Maybe the line between being barely in and being barely out passes between them, but not the line between being first ballot and having no chance at all.

I see the gap as a bit wider than that, but remember 1933 was an odd ballot. It's not clear that players who finished 4th and 5th in that election, Jake Beckley and George Van Haltren, will manage election in, say, the next 20 years, so the barely in/barely out line was very close to the top of this ballot. The electorate saw Wheat as above the Beckley/Van Haltren line, Hooper as below it.

If the case for Hooper is to be made successfully, competition quality arguments need to be advanced to run him past George Van Haltren and Jake Beckley as the top career-value candidates from the backlog. I'm not ready to make that move (at least with respect to Van Haltren) but I hope the electorate is paying attention to jimd's case on league quality differences; there's absolutely an issue here!
   25. jimd Posted: September 01, 2004 at 11:17 PM (#832756)
My case was not so much as pro-Hooper (he's not high on my ballot) as urging caution on Wheat. Obviously, it's too late for that now.

I believe that Win Shares overrates OF'ers from this era because of the fixed ratios for allocating Win Shares to positions. It makes no sense to me to see this long-term evolution that converts defensive plays by the infielders into strikeouts and to then assert (as Win Shares does) that infielders were no more important then than they are now even though they were making many more plays per game.
   26. PhillyBooster Posted: September 01, 2004 at 11:25 PM (#832764)
Adding more meat to Chris's bones, the 7 AL superstars mentioned by jimd only played together for 3 years (1916-1918).

Nonetheless, I added up the WARP-1's for the 9 players mentioned in each year, from 1910-1925 (Hooper and Wheat's overlapping careers). The difference between the groups ran from 18.2 WARP-1 advantage in 1925 to an 84.5 WARP-1 advantage in 1912.

Now 84.5 certainly seems like a large gap, but that's what you get when you compare 6 AL stars (all but Ruth) to 1 NL Star (Alexander). Left out of the calculation are that the ACTUAL stars of 1912 were Honus Wagner, Heinie Zimmerman, Chief Meyers, Johnny Evers, and Christy Mathewson (one could also include Rube Marquard and Larry Doyle here, but I won't). So that's 3 HoMers, plus at least Meyers and Evers, who are out of the HoM due to short careers, not to any challenge of how good they were in their primes.

Compare THOSE 6 to the AL 6, and the AL's WARP-1 advantage in 1912 drops to 23.5.

The fact that the NL may have had fewer All-Time Greats does not mean that it had fewer stars in any given year.

It's only a first step to say 7 is more than 2, and assume a difference of 5. If you look year by year, the actual difference in quality between each league's Top X players (for that year) will be relatively small.
   27. DavidFoss Posted: September 01, 2004 at 11:46 PM (#832779)
Good discussion.

What do we do with Gavvy Cravath? Even with the park adjustments, he's putting up big numbers. We mentally discount him, but does he act like a "superstar" for the purposes of this discussion?
   28. DavidFoss Posted: September 01, 2004 at 11:48 PM (#832781)
Here are some tables:

1000 PA to make the lists:


AMERICAN LEAGUE
CAREER
1911-1920

RUNS CREATED
 #  Player              RATE  PLAYER  LEAGUE
 1  Babe Ruth            277     498     180
 2  Ty Cobb              248    1299     525
 3  Joe Jackson          214    1141     532
 4  Tris Speaker         211    1239     588
 5  George Sisler        178     587     330
 6  Eddie Collins        176    1076     613
 7  Sam Crawford         168     641     381
 8  Home Run Baker       159     771     486
 9  Jack Fournier        149     240     161
10  Birdie Cree          148     301     203
11  Wally Schang         144     414     288
12  Bobby Veach          143     735     512
13  Harry Wolter         141     161     114
14  Braggo Roth          139     441     318
15  Joe Judge            131     362     275
16  Sam Rice             130     308     236
17  Happy Felsch         129     428     331
18  Baby Doll Jacobson   129     272     211
19  Eddie Murphy         128     336     262
20  Harry Heilmann       128     388     304


NATIONAL LEAGUE
CAREER
1911-1920

RUNS CREATED
 #  Player              RATE  PLAYER  LEAGUE
 1  Rogers Hornsby       184     485     264
 2  Gavvy Cravath        181     723     399
 3  Ross Youngs          169     280     165
 4  Edd Roush            158     399     252
 5  Joe Connolly         154     216     140
 6  John Titus           152     194     128
 7  Heine Groh           146     653     448
 8  Benny Kauff          144     316     219
 9  Bill Hinchman        144     223     155
10  Zack Wheat           143     792     552
11  Sherry Magee         142     632     445
12  Johnny Bates         141     238     169
13  George Burns         139     718     516
14  Larry Doyle          138     749     541
15  Honus Wagner         137     506     368
16  Chief Meyers         136     347     255
17  Hal Chase            136     254     187
18  Charlie Hollocher    135     190     141
19  Beals Becker         134     290     217
20  Heinie Zimmerman     133     692     521
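The column labels are not spelled out, but the RATE figures look like 100 times PLAYER divided by LEAGUE (the player's runs created against a league-level figure for the same playing time). That reading is an assumption; the short Python check below simply tests it against a few rows copied from the tables, allowing for rounding in the published numbers.

```python
# (RATE, PLAYER, LEAGUE) for a few rows copied from the tables above.
rows = {
    "Babe Ruth": (277, 498, 180),
    "Ty Cobb": (248, 1299, 525),
    "Tris Speaker": (211, 1239, 588),
    "Rogers Hornsby": (184, 485, 264),
    "Zack Wheat": (143, 792, 552),
}


def implied_rate(player_rc, league_rc):
    """Rate implied by the assumed definition: 100 * PLAYER / LEAGUE."""
    return 100.0 * player_rc / league_rc


# Difference between the listed rate and the implied one, per player.
diffs = {name: rate - implied_rate(p, lg) for name, (rate, p, lg) in rows.items()}
for name, diff in diffs.items():
    print(f"{name}: listed minus implied = {diff:+.2f}")
```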
   30. DavidFoss Posted: September 01, 2004 at 11:55 PM (#832785)
Sorry for the double-post... maybe someone can clean that up... still doesn't really line up. ugh... sorry.

The AL does have more superstars, but the NL catches up in the second ten. There are 82 players at 100 RC+ or better in the NL and 65 in the AL.

It's not clear to me how much of that is due to the averages shifting from the superstars.

Anyways, the proper way to do this is to examine what happens to players who switch leagues. Unfortunately, this was extremely rare in this era.
   31. andrew siegel Posted: September 02, 2004 at 08:54 AM (#832997)
Leaving the players' names out and just speaking theoretically, if one league has 7 players who are multiple levels above the average player and another league has 2 such players, isn't it more likely that the second league has the better overall quality, not the first? Doesn't the number of outliers normally decrease as the overall quality of any group improves?
   32. Paul Wendt Posted: September 02, 2004 at 09:14 AM (#833012)
[Paul Wendt:]
jimd supposes or estimates that Babe Ruth is 16+ wins above replacement; call it 49 win shares.
. . .
Posted by jimd on September 01, 2004 at 09:05 PM (#832239)
Paul, my original argument was made using WARP. It looks like you've constructed the Win Shares version of it here. Adding Ruth to the league cost every regular about 0.5 Win Share per year


Thanks for the clarification.

Relying on someone's estimate of full season replacement level 6.5 win shares. (JoeD?)

Babe Ruth is 48-49 win shares above replacement only in 1923.
01 37 36 40 43 - 1915-1919
51 53 29 55 45 - 1920-1924
13 45 45 45 32 - 1925-1929
38 38 36 29 20 - 1930-1934 (finale, 2 in 1935)
Best 5-year record (a period much shorter than the career of anyone considered here), 233 win shares; annually 46-47 or 40 above replacement.
Best 10-year record, 419; p.a. 42 or 35-36 above replacement.
Best 15-year record, 608; p.a. 40-41 or 34 above replacement.

I don't know how the impact is distributed to pitchers and others, so that's all for now.
   33. PhillyBooster Posted: September 02, 2004 at 09:25 AM (#833018)
Andrew does have a point.

While on the one hand I completely understand how the addition of a Babe Ruth to one league could depress other players' Win Shares, I could also see how if the top two or three teams in terms of drafting marginal talent or making overlooked "finds" were all in the same league, the effect could be the same or greater.

Cravath was "found" by the NL at 31. So was George Suggs at age 27. If one league's teams are systematically replacing lost talent with higher quality "replacement players", that could be like spreading an All-Star throughout the lineup.

Put another way:

Win totals for last (8th) place teams, 1910-1920:

AL: 47 wins (range, 36 to 57 wins; 4 different teams finished last, but the A's finished last 6 consecutive years)
NL: 55 wins (range, 44 to 69 wins; 6 different teams finished last)

AL players got years of feasting on the 36 win Philly A's in 1916 -- with every other team finishing within one game of .500! (Lots more Win Shares to spread around), meanwhile the NL had enough parity that the Giants finished in last place in 1915 with 69 wins -- two years after and two years before winning the pennant.
   34. jimd Posted: September 02, 2004 at 01:47 PM (#833387)
Leaving the players' names out and just speaking theoretically, if one league has 7 players who are multiple levels above the average player and another league has 2 such players, isn't it more likely that the second league has the better overall quality, not the first? Doesn't the number of outliers normally decrease as the overall quality of any group improves?

It all depends. If we're comparing 1912 with 2002, I'm on your side Andrew. The fact that so many "all-time greats" are playing at the same time indicates to me that they probably weren't so great because the competition wasn't either.

But if we're comparing AL and NL in 1912, it's a different story. It could be that Cobb, Collins, Speaker, etc., are tearing up a minor league like Browning did in 1882, but there are other factors that indicate it isn't so. The NL All-Star team of 1882 is mostly HOMers, the NL All-Star team of 1912 is not close. The AL of 1912 has been in operation for over a decade; they'd pretty much have to deliberately run their league into the ground to turn it into a minor league quality operation. The AA of 1882 is just getting started and so highly likely to be full of replacement level players.

The hypothesis that the two leagues were basically equal but that the NL had had a bad run of luck at snagging their share of the high-impact superstars is much more plausible than the hypothesis that the NL had been quietly getting the huge majority of the mid-level players leaving the AL with the leftovers and a handful of inflated faux superstars.
   35. jimd Posted: September 02, 2004 at 02:32 PM (#833473)
Compare THOSE 6 to the AL 6, and the AL's WARP-1 advantage in 1912 drops to 23.5.

24 WARP difference for 6 players is an average of 4.0 WARP each. That's pretty significant. Assuming the rest of the leagues are of equal quality, it's enough to say that a 77-77 NL team was equivalent to a 74-80 AL team. (3=24/8) It's enough to consider a 4% discount.
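Spelled out as a small Python sketch, with invented variable names. Reading the 4% discount as 3 wins out of 77 is an assumption about how that figure was reached.

```python
warp_gap = 24            # 23.5 from the earlier post, rounded
star_players = 6
teams_per_league = 8

per_star = warp_gap / star_players        # 4.0 WARP per star
per_team = warp_gap / teams_per_league    # 3.0 wins per team

nl_wins, nl_losses = 77, 77
al_equivalent = (nl_wins - per_team, nl_losses + per_team)   # (74.0, 80.0)

discount = per_team / nl_wins             # about 0.039, i.e. roughly 4%
print(per_star, al_equivalent, round(discount, 3))
```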

jimd, I know these are back-of-the-envelope estimates,

Agreed. They prove nothing, except that there is a plausibility to Davenport's calculations. A similar superstar imbalance will appear again in the 1950's, and his calculations will show a similar league imbalance, which will again corroborate the public opinion of the time.
   36. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 02:46 PM (#833501)
A similar superstar imbalance will appear again in the 1950's, and his calculations will show a similar league imbalance, which will again corroborate the public opinion of the time.

Except the fifties imbalance can be corroborated by year-to-year comparisons player by player, while the Deadball Era can't.

I'm more in the "the AL had more of the stars, but overall was basically the same as the NL" camp for now.
   37. Chris Cobb Posted: September 02, 2004 at 03:21 PM (#833551)
I'm more in the "the AL had more of the stars, but overall was basically the same as the NL" camp for now.

But John, if the AL had more of the stars, then it can't have been basically the same as the NL. The only way that the value of an average player in the two leagues could have been the same, with the AL having more stars, is if the NL had either more good players, or fewer bad players than the AL to offset the effects of the AL's stars.

If the AL and the NL were basically the same except for the superstars in the AL, then the next tier of above average players -- the Harry Hoopers and the Larry Gardners -- of the AL are going to have their totals suppressed relative to players of the same ability in the NL, because of the stiffer competition.
   38. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 03:51 PM (#833594)
But John, if the AL had more of the stars, then it can't have been basically the same as the NL.

Overall, I think the two leagues were equal. If that hurts the lower tier of AL players, then that's the case until I'm proven wrong.
   39. DavidFoss Posted: September 02, 2004 at 04:01 PM (#833605)
If the AL and the NL were basically the same except for the superstars in the AL, then the next tier of above average players -- the Harry Hoopers and the Larry Gardners -- of the AL are going to have their totals suppressed relative to players of the same ability in the NL, because of the stiffer competition.

It's possible that the NL had a tighter distribution of talent than the AL, yet still had the same average level of play. This way, the best AL teams would be better than the best NL teams (though the worst NL teams would be better than the worst AL teams).

This would explain why the AL had a larger stdev of talent yet still won the WS every year.
   40. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 04:07 PM (#833612)
It's possible that the NL had a tighter distribution of talent than the AL, yet still had the same average level of play. This way, the best AL teams would be better than the best NL teams (though the worst NL teams would be better than the worst AL teams).

This would explain why the AL had a larger stdev of talent yet still won the WS every year.


That's exactly what I was trying to say.
   41. jimd Posted: September 02, 2004 at 04:13 PM (#833617)
Except the fifties imbalance can be corroborated by year-to-year comparisons player by player, while the Deadball Era can't.

Don't know exactly what you mean by this, but I'll guess. Davenport builds those ratings by year-to-year comparisons player-by-player within each league so that each league-season is rated compared to the adjacent ones. Those ratings use the largest comparison set available, every player, and are much more solid than anything built on a handful of trades in any given season. All of the interleague movement over a hundred years would then calibrate the leagues relative to each other.

Just because it's harder to verify doesn't mean it doesn't exist. I bring up the topics of public opinion, and superstar imbalance, in an effort to find alternative approaches to verifying the disparity. If people have ideas on disproving it, those are welcome too.
   42. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 04:17 PM (#833623)
Just because it's harder to verify doesn't mean it doesn't exist.

I'll agree with that, Jim.
   43. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 04:20 PM (#833627)
Don't know exactly what you mean by this, but I'll guess. Davenport builds those ratings by year-to-year comparisons player-by-player within each league so that each league-season is rated compared to the adjacent ones. Those ratings use the largest comparison set available, every player, and are much more solid than anything built on a handful of trades in any given season. All of the interleague movement over a hundred years would then calibrate the leagues relative to each other.

I've pointed this out quite a few times, but the Dick Cramer study has both leagues as roughly equal during the Deadball Era. Either Davenport or Cramer is wrong. Which one it is I haven't a clue.
   44. jimd Posted: September 02, 2004 at 05:48 PM (#833768)
Yes, but the peripheral evidence favors Davenport. I remember first encountering the Cramer study description in "Hidden Game of Baseball" nearly 20 years ago and being very surprised at the absence of a difference for this era, which contradicted my impression built up from much other reading about this period.
   45. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 06:14 PM (#833796)
But both studies run parallel to each other using ostensibly the same comparisons except when the Deadball Era pops up. My question is: is Davenport altering his comparisons for that era so that it matches his perceptions of that period?
   46. jimd Posted: September 02, 2004 at 06:33 PM (#833826)
All things are possible, but my question would be "Why do that"? I don't remember the methodology described for the Cramer study, other than it uses his own quantitative metric, just like Davenport uses his own. The difference may lie in distortions of either metric. I also don't remember whether Cramer's metric is league-normalized. If it isn't, the difference may lie there, because during this period, the NL collectively is playing in a "hitter's park" relative to the AL collectively playing in a "pitcher's park" (this is revealed by comparing the park effects of the NY teams 1913-1922 and the STL teams 1921?-1952; the same park generally plays as a better hitters park and a better home-run park in the AL than the NL). Then again, this may be irrelevant; I just don't know enough about the Cramer study's details.
   47. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 06:45 PM (#833848)
Batter Win Average, the metric that Cramer used as the base for his study, is normalized, Jim.

As for distortions with the metrics, they seem to agree for the rest of baseball history, so that's probably not the case (but who knows?).
   48. jimd Posted: September 02, 2004 at 07:35 PM (#833959)
Is the study available on-line somewhere?
   49. Chris Cobb Posted: September 02, 2004 at 07:55 PM (#834031)
during this period, the NL collectively is playing in a "hitter's park" relative to the AL collectively playing in a "pitcher's park" (this is revealed by comparing the park effects of the NY teams 1913-1922 and the STL teams 1921?-1952; the same park generally plays as a better hitters park and a better home-run park in the AL than the NL). Then again, this may be irrelevant; I just don't know enough about the Cramer study's details.

It may be relevant to demonstrating the existence of differences in levels of competition. If by comparing park factors, it can be convincingly demonstrated that the AL was a "pitcher's league" during this period, does then the following argument hold true?

I calculate estimated win shares for Negro League players by matching their translated batting statistics to major-league contemporaries and then using the batting win shares of the closest matching player (prorated by PA) as the total for the Negro-Leaguer. I have consistently found that the same batting totals will get you more win shares in the National League than in the American League, at least during the teens.

If the American League is demonstrably a pitchers' league by way of park factors, then one would expect to find just the opposite: American league players should get _more_ win shares for the same OPS than National League players would. Is there any explanation for this finding (if what I have observed can be systematically demonstrated) other than that the level of competition, at least for hitters, was higher in the American League than in the National?
   50. John (You Can Call Me Grandma) Murphy Posted: September 02, 2004 at 08:04 PM (#834061)
Is the study available on-line somewhere?

I Googled, but no luck finding it, Jim.
   51. jimd Posted: September 02, 2004 at 08:15 PM (#834111)
An isolated example: 1915 (All stats from B-R.com)

The NL hit .248/.304/.331 and scored 3.62 runs/game.
The AL hit .248/.319/.326 and scored 3.96 runs/game.
The Polo Grounds played as an extreme pitcher's park in the NL (94/93).
The Polo Grounds played as an average park in the AL (100/100).

The Polo Grounds being a typical AL park but a pitcher's park in the NL implies that the rest of the AL parks (on average) would also be considered pitcher's parks in the NL.

If this is typical of other seasons in the vicinity, the implication is that the NL cannot hit or has fantastic defense (or the AL cannot pitch or has great hitting, or some mixture thereof).
   52. jimd Posted: September 02, 2004 at 08:30 PM (#834183)
If the American League is demonstrably a pitchers' league by way of park factors, then one would expect to find just the opposite: American league players should get _more_ win shares for the same OPS than National League players would.

The same batting line will get more Win Shares in 1908 or 1968 than it will in 1930 or 1894. That the same batting line gets more Win Shares in the NL than the AL indicates that the AL was the higher scoring league (see 1915 above for example). That they were the higher scoring league despite playing in parks that depressed scoring overall indicates a dramatically different balance between offense and defense than the NL. How this relates to overall league quality, beats me.
   53. jimd Posted: September 02, 2004 at 08:38 PM (#834228)
Which do people see overall evidence for? The AL having better hitters or the NL having better pitching/defense?
   54. DavidFoss Posted: September 02, 2004 at 08:48 PM (#834290)
Which do people see overall evidence for? The AL having better hitters or the NL having better pitching/defense?

Or something else entirely. What about the sizes of the umpires' strike zones? Did they use the same balls? Change them as often for the deadball years?

This doesn't pertain to our discussions, yet, but offense levels in the NL dropped quite a bit compared to the AL starting in 1931. This was a response to the record-breaking 1930 season. Offense levels in the AL stayed high until WWII. How did the NL manage to do that independent of the AL?
   55. Don F Posted: September 02, 2004 at 10:02 PM (#834629)
Is the study available on-line somewhere?

I have the Cramer batting numbers on a spreadsheet; I can post it to the Yahoo! group site a little later.

His averages are normalized and create the appearance of batters getting "worse" when BA & SA spike up, as in ~1911-13 and in the 20s.

This probably wasn't the right way to do it, but on one of the tabs, I tried subtracting out the differences between the real league BA & SA and those of the reference league, the 1976 NL. The picture looks a little different that way; whether it removes illusions or creates different ones, I can't say. My math skills are largely limited to the basic functions. I'm sure one of you smarter guys can do better.

You can probably also figure out how to carry the numbers forward from 1979 to the present.
   56. Don F Posted: September 02, 2004 at 10:47 PM (#834858)
The Cramer spreadsheet is now uploaded.
   57. Paul Wendt Posted: September 03, 2004 at 11:32 AM (#835269)
Dick Cramer's analysis has been criticized because it includes no age data and thus attributes no differences in player performance across league-seasons to age differences.

That certainly matters in the intertemporal comparisons. There may be a systematic bias in the Cramer measure of general improvement (especially around large changes in the number of MLB teams, I think). It matters to interleague comparisons if there are interleague differences in age patterns.

How looks the population of players who change leagues? (That can't be said succinctly in English.) Is the subgroup that moves from AL to NL different from the subgroup that moves from NL to AL? If yes, that implies a bias in interleague comparisons a la Cramer. A significant YES is most likely around a disruptive event. 1891. 1901. 1915? (numerous Federal Leaguers moved from NL to AL, but not vice versa). 1977? (expansion in AL only).

That is my three cents. All I have today.
   58. John (You Can Call Me Grandma) Murphy Posted: September 03, 2004 at 12:16 PM (#835329)
Dick Cramer's analysis has been criticized because it includes no age data and thus attributes no differences in player performance across league-seasons to age differences.

Correct. Cramer admitted that he was wrong about this in the eighties.

That shouldn't matter, however, for inter-league comparisons. As I have pointed out, Cramer agrees with Davenport except for the Deadball Era. If Cramer's lack of age data was the culprit for the difference, I think we would be noticing the same problem throughout Cramer's study (which we're not).
   59. jimd Posted: September 03, 2004 at 01:42 PM (#835432)
That shouldn't matter, however, for inter-league comparisons.

It's probably worth investigating. The local validity of the study assumes that league aging patterns are pretty similar through time. I do remember reading that the two greatest youth revolutions (good rookie crops over a period of a few years) occurred during the early 60's and the early 10's. Maybe that's a factor.

Another factor might be the disproportionate impact of superstars. A player with a 4 year career contributes 12 comparisons to the study, 3 for each of the 4 years he played. Ty Cobb contributes 476 comparisons to the study, 23 for each of the 24 seasons he played. He played 6 times longer but has an impact on the study 46 times greater. Since he peaked around 1915, those samples all say that the AL was weakest then when compared to the years when Cobb was younger or older. This is mitigated by other players in other stages of their career, but the point is that superstars are given disproportionate weight due to the lengths of their careers. Much better would be to only compare adjacent seasons, or to adjust the season weights some other way.
   60. jimd Posted: September 03, 2004 at 01:49 PM (#835450)
Did they use the same balls? Change them as often for the deadball years?

Each league had their own official baseball. Both leagues attempted to minimize the number used until the Chapman tragedy changed that attitude.

IIRC, the NL deadened their ball somewhat after the 1930 season.
   61. jimd Posted: September 03, 2004 at 02:11 PM (#835483)
Typo alert: 46 times greater should read 40 times greater
   62. Chris Cobb Posted: September 03, 2004 at 02:29 PM (#835508)
IIRC, the NL deadened their ball somewhat after the 1930 season.

That's what James says in NBJHBA, but he doesn't provide any details.
   63. John (You Can Call Me Grandma) Murphy Posted: September 03, 2004 at 02:45 PM (#835528)
IIRC, the NL deadened their ball somewhat after the 1930 season.

It appears they juiced the ball in '34 to increase attendance.
   64. Paul Wendt Posted: September 03, 2004 at 03:57 PM (#835621)
jimd:
It's probably worth investigating. The local validity of the study assumes that league aging patterns are pretty similar through time.

Yes, and local is "far enough" from a disruption such as 1898-1903 or maybe 1913-1916. How persistent over time is a measured difference in league quality? If very persistent, then "far enough" is very far. (This point holds for any bias. Age pattern is merely a plausible source of bias re which we suppose a difference between Cramer and Davenport methods.)

Fewer league changes, as in the deadball era, implies more persistent measured differences. (Right?) Get it wrong in 1902 and that may have some impact even in 1912.

Another factor might be the disproportionate impact of superstars. A player with a 4 year career contributes 12 comparisons to the study, 3 for each of the 4 years he played. Ty Cobb contributes 476 comparisons to the study, 23 for each of the 24 seasons he played.

There may be a difference between C and D in the time span of the elementary intertemporal data. Eg, five years for Cramer(?): 1928 is compared with 1923,33 but not with 1922,34.
   65. DavidFoss Posted: September 03, 2004 at 03:59 PM (#835628)
It appears they juiced the ball in '34 to increase attendance.

1933 looks like the outlier to me.

Anyhow, year-by-year micro-analysis of the variations of offense is not going to be too fruitful. The point I was trying to make was that the AL had much higher offense levels from 1931-1941. The biggest differences were in 31, 33, 36-39.

This was following a period of 1922-1930 where the offense-levels of the two leagues more closely tracked each other. 1920-21 had the AL exiting the deadball era a little earlier than the NL. Then the two leagues tracked each other fairly closely for over a decade before that.

A plot would help here. :-)
   66. John (You Can Call Me Grandma) Murphy Posted: September 03, 2004 at 04:07 PM (#835644)
1933 looks like the outlier to me.

I've been rereading The Dizziest Season lately and one of the big stories that year was the "rabbit ball" of '34 in the NL.

Offense did increase 17% that year, FWIW.
   67. DavidFoss Posted: September 03, 2004 at 04:33 PM (#835672)
Offense did increase 17% that year, FWIW.

And it dropped 16-17% the year before. Attendance was down quite a bit in the NL, so they could have made some sort of correction.

Here is what I was talking about with the NL/AL:

year   NL     AL

1942   3.90   4.26
1941   4.23   4.74
1940   4.39   4.97
1939   4.44   5.21
1938   4.42   5.37
1937   4.51   5.23
1936   4.71   5.67
1935   4.71   5.09
1934   4.68   5.13
1933   3.97   5.00
1932   4.60   5.23
1931   4.48   5.14

1930   5.68   5.41
1929   5.36   5.01
1928   4.70   4.76
1927   4.58   4.92
1926   4.54   4.73
1925   5.06   5.20
1924   4.54   4.98
1923   4.85   4.78
1922   5.00   4.75

1921   4.59   5.12
1920   3.97   4.76
1919   3.65   4.09

1918   3.62   3.64
1917   3.53   3.65
1916   3.45   3.68
1915   3.62   3.96
1914   3.84   3.65
   68. DavidFoss Posted: September 03, 2004 at 04:40 PM (#835682)
ACK! The pre-tag interprets tabs as "no-space"?

OK... that's the second table I've messed up this week. I may give up. One more try... sorry guys...

year - NL - AL


1942---3.90---4.26
1941---4.23---4.74
1940---4.39---4.97
1939---4.44---5.21
1938---4.42---5.37
1937---4.51---5.23
1936---4.71---5.67
1935---4.71---5.09
1934---4.68---5.13
1933---3.97---5.00
1932---4.60---5.23
1931---4.48---5.14

1930---5.68---5.41
1929---5.36---5.01
1928---4.70---4.76
1927---4.58---4.92
1926---4.54---4.73
1925---5.06---5.20
1924---4.54---4.98
1923---4.85---4.78
1922---5.00---4.75

1921---4.59---5.12
1920---3.97---4.76

1919---3.65---4.09
1918---3.62---3.64
1917---3.53---3.65
1916---3.45---3.68
1915---3.62---3.96
1914---3.84---3.65
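
In lieu of a plot, a small sketch that ranks the seasons by the AL minus NL gap in runs per game. The figures are copied from the table above (1931-1942 only); the variable names are just for illustration.

```python
# Runs per game as (NL, AL), copied from the table above, 1931-1942 only.
runs = {
    1942: (3.90, 4.26), 1941: (4.23, 4.74), 1940: (4.39, 4.97),
    1939: (4.44, 5.21), 1938: (4.42, 5.37), 1937: (4.51, 5.23),
    1936: (4.71, 5.67), 1935: (4.71, 5.09), 1934: (4.68, 5.13),
    1933: (3.97, 5.00), 1932: (4.60, 5.23), 1931: (4.48, 5.14),
}

# AL minus NL for each season, rounded to the precision of the table.
gaps = {year: round(al - nl, 2) for year, (nl, al) in runs.items()}

# Seasons ordered from the largest gap to the smallest.
ranked = sorted(gaps, key=gaps.get, reverse=True)
for year in ranked[:6]:
    print(year, f"AL - NL = {gaps[year]:+.2f}")
```

The six largest gaps fall in 1931, 1933 and 1936-1939, with 1933 the widest at just over a run per game.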

   69. DavidFoss Posted: September 03, 2004 at 04:41 PM (#835683)
Anyhow, "Dizziest of Seasons" sounds like a cool book. I may put that one on my wish list.
   70. jimd Posted: September 03, 2004 at 06:00 PM (#835769)
Eg, five years for Cramer(?):

IIRC, Cramer used all possible comparisons (over some PA threshold), but that memory could be wrong, and IAC it's a memory of a summary, as I have never seen the original study.

If he did use all possible comparisons, then the fluke circumstance of having a number of long-career superstars peaking at around the same time in the same league would severely distort the league measurements at that peak.
   71. Paul Wendt Posted: September 04, 2004 at 12:08 PM (#836899)
OK, I found the 1980 BRJ. "Average Batting Skill" is six pages long, including the figure and table reproduced in The Hidden Game of Baseball.

Yes, Cramer used all pairs (one player, two league-seasons) with at least 20 PA each season.

Perhaps Davenport limits the comparisons. I was recalling a conversation that I initiated in the lobby at SABR34 this July. Dick Cramer observed that Davenport's method must be fairly close to his. He alluded to limiting the comparisons or the differences in some way. I am not sure that timespan of comparisons was the point, but I know I mentioned that Pete Palmer utilized only 1913-1916 data in his assessment of FL 1914-1915; only 1883-1885 data for UA 1884.

The approach should have been implemented many times. (Cramer agrees.) Does anyone know why that has not happened? Databases are widely available; computation is cheap; there are more sabermetricians. The empirical question is exceptionally interesting to many people.

Given any implementation of the approach, it should be trivial to vary the weights on observed differences according to timespan and number of BFP, and learn whether the results are robust. (Excluding all comparisons across time greater than some threshold is a special case using weight 0.)
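
A minimal sketch of what such a robustness check could look like. Everything here is hypothetical: the `Comparison` record, the use of PA as the weight (standing in for BFP), and the sample numbers are made up for illustration and are not taken from Cramer, Palmer or Davenport.

```python
from typing import NamedTuple, Optional, Sequence


class Comparison(NamedTuple):
    span: int     # years between the two league-seasons being compared
    pa: int       # playing-time weight (hypothetical: smaller of the two PA totals)
    diff: float   # relative batting in one league-season minus the other


def weighted_diff(comps: Sequence[Comparison], max_span: int) -> Optional[float]:
    """PA-weighted mean difference; comparisons beyond max_span get weight 0."""
    kept = [c for c in comps if c.span <= max_span]
    total = sum(c.pa for c in kept)
    if total == 0:
        return None  # nothing survives the cutoff
    return sum(c.pa * c.diff for c in kept) / total


# Made-up comparisons: two short-span pairs and one long-span pair.
sample = [
    Comparison(span=1, pa=500, diff=-0.02),
    Comparison(span=2, pa=400, diff=-0.03),
    Comparison(span=10, pa=600, diff=-0.12),
]

print(weighted_diff(sample, max_span=2))    # long-span pair excluded
print(weighted_diff(sample, max_span=20))   # all pairs included
```

If the estimate moves a lot as `max_span` changes, the result depends on the long-span comparisons; if it barely moves, it is robust to that choice.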
   72. Paul Wendt Posted: September 04, 2004 at 12:25 PM (#836920)
Pete Palmer utilized only 1913-1916 data in his assessment of FL 1914-1915; only 1883-1885 data for UA 1884.

League-average performance, UA and FL.
Presuming contemporary NL=AA=1 and NL=AL=1.

UA 1884
OPS .76
ERA .875

FL 1914-1915
OPS .90
ERA .924

"League Performance" in the Glossary, Total Baseball 6 (1999).
   73. Paul Wendt Posted: September 06, 2004 at 02:22 PM (#838807)
Based on the Palmer study, old Total Baseball (Thorn & Palmer) incorporated a gross adjustment for league quality in OPS+ and ERA+ for players and teams (editions 6-7 only?).

UA 1884, stipulated league averages
OPS+ = ERA+ = 80

FL 1914-1915, stipulated league averages
OPS+ = ERA+ = 90

Average is 100 for every other MLB league-season.


How is league quality handled by TB7's descendants, Total Baseball 8 (Thorn) and The Baseball Encyclopedia (Palmer)?
   74. Jeff M Posted: September 10, 2004 at 07:28 PM (#847684)
In one of the many discussions on this topic over the life of the HoM, I mentioned a conceptual difficulty I have with all measures of league quality that I've tried and seen discussed here:

Even if we can show, for example, that the 1904 NL has statistically better hitters than the 1904 AL (whether by judging the average player using various methods, or by looking at standard deviations or through an analysis of the outliers), does that mean the NL was actually better than the AL that season? Couldn't that mean the NL pitching that year was worse than the AL pitching, thus reflecting better on the NL hitters? If so, then the NL wouldn't overall be a better league that season.

In other words, since all hitting stats are dependent not only on the quality of the hitters but also on the quality of the pitchers, and vice versa with respect to pitching stats, how can any hitting study (or pitching study) produce a conclusive result about league quality? Not to mention the fielding component.

Assuming that hurdle is overcome, quantifying it will be a separate thorny issue.
   75. Paul Wendt Posted: September 13, 2004 at 11:04 AM (#851609)
JeffM,
As you suggest, such intraleague analysis of batting and pitching statistics (not to mention one without the other) cannot support any interleague quality judgments. But Cramer (batters only), Palmer, and Davenport share a general interleague method, analysing only the records achieved in two leagues by the people who played in both.

There isn't much migration within a season, so the comparison of NL04 and AL04, for example, is mainly derived from comparisons of NL04 and AL03, AL04 and NL03, AL04 and NL02, and so on.
   76. Jeff M Posted: September 13, 2004 at 01:56 PM (#851904)
Paul:

I agree. But I see two lingering issues:

1. When I was looking at the NL vs. AA, the problem was there weren't a significant number of players who played in both leagues as regulars. That might be less of a problem with NL/AL, but you have to confine the analysis to a few years (maybe five years on either side), and that REALLY cuts down the sample size. You also have to make sure the average age of the sampled players is about the same, or you have other factors creeping in.

2. Even if you only analyze players who played in both leagues -- hitters for example -- you still have to know the level of pitching. So, it seems you'd need to know the records of hitters who played in both leagues during a specified time against the pitchers who pitched in both leagues during the same time. There is a hypothetical hybrid league in that scenario, but the sample size gets even smaller.

I'm not suggesting it not be studied; only that it is a very difficult problem.
   77. Paul Wendt Posted: September 14, 2004 at 02:57 PM (#854280)
Yes, the estimates for some league-season pairs are biased if the share of improving players who played in both league-seasons is different from the share of declining players.

The pitcher-batter simultaneity should not be a source of bias. Cramer, Palmer, and Davenport, at least, use batting statistics that are relative to league average, which incorporates the quality of league pitchers; and vice versa, except that Cramer does not look at pitchers.

--
By the way, Cramer and (I am practically certain) Davenport also use the data for NL04 and NL03, AL04 and AL03, etc, generated by those who play multiple years in the "same league" in the ordinary sense.

In effect, all of the interleague quality measures are estimated simultaneously. The estimated difference between NL09 and AL09 is not much influenced by the sparse data on NL08-AL09, AL08-NL09, etc, when there was little movement between NL and AL. Most of the data supporting relative quality in 1909 is ample data on NL03-NL04 ... NL08-NL09 and AL03-AL04 ... AL08-AL09 and ample data on NL00-AL01, AL00-NL01, ... AL02-NL03.

jimd and I alluded to this in #52 and #75, or something like that.

It's time to try posting this much.
   78. Paul Wendt Posted: September 14, 2004 at 03:15 PM (#854323)
C, P, and D use statistics that are relative to league average.

Eg, Pete Palmer's estimates for UA1884 mean that an average UA1884 pitcher, who also played in AA/NL/1883/1885, was 12.5% below average in the latter leagues (ERA+ .875). Sea-level in the UA was 12.5% below 1883/85 major league sea-level for pitchers; 24% below, for batters (OPS+ .76). Rolling hills in the UA pitcher's box appear to be mountains. Molehills in the UA batter's box appear to be mountains.
   79. Cblau Posted: December 05, 2004 at 07:49 PM (#998191)
For those wondering about Davenport's 19th C. league adjustments, this is from his post to SABR-L in 2000. His methodology was similar to the study TomH describes in post 7, but comparisons are limited to two years (e.g. a player's 1883 performance is compared to his 1881, 1882, 1884, and 1885 numbers.)

These figures mean for instance that an AA player in 1882 with a .260 EQA would be equivalent to an NL player that year with a .196 EQA.


1882 AA 64.0
1883 38.6
1884 30.5
1885 22.0
1886 14.6
1887 12.1
1888 12.9
1889 11.5
1890 29.1
1891 20.4


1884 UA 70.9
1890 PL -3.0
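
Going by the one worked example above (.260 in the 1882 AA equals .196 in the NL), the figures read as EQA points, in thousandths, to subtract. A sketch under that assumption; the function name is mine, and treating the adjustment as a flat subtraction is an inference from that single example rather than Davenport's stated method.

```python
# League adjustments in EQA points (thousandths), copied from the list above.
ADJUSTMENT = {
    ("AA", 1882): 64.0, ("AA", 1883): 38.6, ("AA", 1884): 30.5,
    ("AA", 1885): 22.0, ("AA", 1886): 14.6, ("AA", 1887): 12.1,
    ("AA", 1888): 12.9, ("AA", 1889): 11.5, ("AA", 1890): 29.1,
    ("AA", 1891): 20.4, ("UA", 1884): 70.9, ("PL", 1890): -3.0,
}


def nl_equivalent_eqa(eqa: float, league: str, year: int) -> float:
    """Subtract the league's adjustment (assumed to be flat EQA points)."""
    return round(eqa - ADJUSTMENT[(league, year)] / 1000, 3)


print(nl_equivalent_eqa(0.260, "AA", 1882))
print(nl_equivalent_eqa(0.260, "AA", 1889))
print(nl_equivalent_eqa(0.260, "PL", 1890))
```

The negative figure for the 1890 PL means the adjustment goes the other way: the same .260 would translate to slightly more in the NL.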
   80. KJOK Posted: December 05, 2004 at 08:08 PM (#998233)
OK, so if I'm understanding these numbers correctly, they would roughly be on the same scale that Davenport uses to gauge that the modern AAA is about 12.0 and 21st century Japanese Leagues are around 8.0, meaning that the AA was never, even at its most talented point in 1889, as close to NL calibre as today's Japanese Central League or Pacific League are to the NL and AL?!
   81. Joe Dimino Posted: December 06, 2004 at 03:59 AM (#999406)
I think that's a reasonable comparison. The best players (H.Matsui, Ichiro!) would still be very good major leaguers, stars even, but the rank and file players were roughly of AAA quality, making it easier for the good players over there to dominate.
   82. Joe Dimino Posted: December 06, 2004 at 04:00 AM (#999408)
The system also shows the UA and PL about where I've eyeballed them in my head, so that gives it a little more credibility in my eyes.
   83. jimd Posted: December 06, 2004 at 03:06 PM (#1000890)
His methodology was similar to the [The Hidden Game of Baseball] study TomH describes in post 7, but comparisons are limited to two years (e.g. a player's 1883 performance is compared to his 1881, 1882, 1884, and 1885 numbers.)

This is a very important difference in methodology.

When every available year comparison is used (as does the study cited in The Hidden Game of Baseball), the superstars receive a lot of extra weight, simply because they play so many years. A 5 year player contributes 4 samples to each of his 5 years, and a 21 year player contributes 20 samples to each of his 21 seasons. For his peak seasons, there are 10-15 samples implying that the league was "weak" those seasons, due to the assumption that the player's performance is constant over his career. Put a number of such stars in parallel in the same league (e.g. Cobb, Collins, Jackson, Baker) and there is most likely a noticeable impact on the results. In Davenport's study, most comparisons of peak seasons are only with other peak seasons or near-peak seasons; the problem is not completely eliminated, but it is greatly reduced.

Note: the total sample universe for each league season during the 1910's is around 500-600 samples from players that were full-time in both seasons plus a number of partial samples from non-regular players; Davenport's study has about half that number of full-time samples per season.
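
To put numbers on the weighting difference, here is a small counting sketch. It assumes a career of consecutive seasons that all qualify, and it counts each season's comparisons separately, as in the 4-per-season and 20-per-season figures above; the function names are mine.

```python
def all_pairs(n_seasons: int) -> int:
    """Comparisons per career when every season is paired with every other."""
    return n_seasons * (n_seasons - 1)


def windowed(n_seasons: int, window: int = 2) -> int:
    """Comparisons when a season is paired only with seasons within `window` years."""
    total = 0
    for i in range(n_seasons):
        first = max(0, i - window)
        last = min(n_seasons - 1, i + window)
        total += last - first  # seasons in range, excluding season i itself
    return total


for n in (5, 21):
    print(n, "seasons:", all_pairs(n), "all-pairs,", windowed(n), "two-year window")
```

With every pair used, the 21-year player carries 21 times the weight of the 5-year player (420 comparisons to 20). With the two-year window it is 78 to 14, under 6 times, which is much closer to the 4.2 to 1 ratio in seasons played.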
   84. jimd Posted: December 06, 2004 at 08:56 PM (#1001722)

Year n0 n1 n2 n3 n4 n5 n6
------ ----------- ----------- ----------- ----------- ----------- ----------- -----------
Testing.
   85. jimd Posted: December 06, 2004 at 08:59 PM (#1001731)

Year n0 n1 n2 n3 n4 n5 n6
------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
1930 263 187 165 149 124 95 84
1931 251 186 158 135 109 95 72
1932 263 202 163 133 114 98 74
1933 260 184 150 126 112 85 72
1934 261 185 158 133 105 96 75
1935 252 193 159 127 109 89 70
1936 256 181 149 132 107 86 66
1937 260 182 154 126 111 80 68
1938 257 189 158 134 102 80 58
1939 263 198 170 121 91 72 60
1940 274 202 157 114 82 69 95
1941 276 183 124 87 72 112 81
1942 269 155 100 77 123 99 82
1943 245 146 118 111 94 71 56
1944 242 150 106 88 70 52 37
1945 239 99 84 62 44 35 23
1946 283 195 156 130 94 81 64
1947 271 189 155 116 105 87 75
1948 286 192 151 137 120 93 72
1949 267 175 163 138 113 87 79
1950 249 187 157 130 109 98 77
1951 258 184 157 127 116 91 78
1952 262 180 142 135 111 100 83
1953 256 172 150 127 115 104 75
1954 277 201 169 150 128 105 89
1955 283 201 180 157 124 104 91
n0: the number of regulars (my definition) in MLB that year
n1: the number of regulars that were also regulars the following year
n2: the number of regulars that were also regulars two years in the future
n6: the number of regulars that were also regulars six years in the future

The point of providing the 25 year span is to allow one to get an idea of typical turnover, and to then compare the effect of the WWII years on that typical turnover.

Some specific points: the transition from 1941 to 1942 (1941-n1) is not way out-of-line, though it is a little low. The war did not have a major impact on MLB in 1942. The following four years show significant turnover, culminating in the dramatic return in 1946 (1945-n1) when only about 40% of the 1945 regulars kept their jobs.

Look at the data points 1942-n4, 1941-n5, 1940-n6. These represent the number of players in these years who were regulars in 1946. They were MUCH more likely to have retained/regained their jobs than typical MLB regulars after the same time interval with no war situation. I don't know if this represented an effort on MLB's part to give the returning veterans every opportunity to regain their jobs, or the impact of the war on the development of the minor league players that would normally have replaced some of these players.
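
For reference, the one-year retention rates (n1 divided by n0) for a few of the seasons in the table can be worked out as below. The seasons chosen are just a sample for illustration; the counts are copied from the table.

```python
# (n0, n1) for selected seasons, copied from the table above.
regulars = {
    1930: (263, 187),
    1941: (276, 183),
    1942: (269, 155),
    1943: (245, 146),
    1944: (242, 150),
    1945: (239, 99),
    1950: (249, 187),
}

# Share of each season's regulars who were also regulars the following year.
retention = {year: n1 / n0 for year, (n0, n1) in regulars.items()}

for year, rate in retention.items():
    print(year, f"{rate:.1%}")
```

The peacetime seasons in this sample sit around 71-75%, 1941 is a bit lower at about 66%, and 1945 drops to roughly 41%.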
   86. Michael Bass Posted: December 28, 2004 at 01:54 PM (#1043643)


1882 AA 64.0
1883 38.6
1884 30.5
1885 22.0
1886 14.6
1887 12.1
1888 12.9
1889 11.5
1890 29.1
1891 20.4


1884 UA 70.9
1890 PL -3.0


Anyone off hand happen to have Davenport's Federal League adjustment on this same scale?
   87. jimd Posted: December 28, 2004 at 03:10 PM (#1043733)
The Davenport adjustments are not simply a percentage. They are more complex, having an adjustment involving the league replacement levels as well.

The following table shows 3 Federal League CF'ers from 1915:
W-1   W-2  Delt   AdjGm  Name
13.2  9.4  (3.8)  136.3  Kauff
7.0   3.4  (3.6)  145.9  Roush
4.4   1.0  (3.4)  154.3  Oakes</pre>
As you can see, the amount of value lost going from WARP-1 to WARP-2 is fairly constant (though not completely). Kauff loses more absolute value, showing that there is also a percentage involved, but Oakes loses almost all of his value, presumably based on the notion that he was very close to AL/NL replacement level.

So some Federal League value is removed purely because it has no Major League value, because it is sub-replacement value. The residue from this adjustment is apparently then modified by applying a percentage.
   88. Michael Bass Posted: December 28, 2004 at 03:59 PM (#1043805)
FWIW, as I posted in the other thread, a regression on the 3-player example above gives a subtraction of about 3.25 WARP, and then a discount of about 4.3%.


This is obviously a system that is going to be much more forgiving to star players in an inferior league than a straight % discount.
   89. jimd Posted: December 28, 2004 at 04:18 PM (#1043827)
This is obviously a system that is going to be much more forgiving to star players in an inferior league than a straight % discount.

And it should be. A typical Federal Leaguer had positive value in that league, but would be unable to land a Major League starting job after the collapse (near-zero real value). OTOH, Kauff moved into the majors and was a second-tier star (though not the Ty Cobb/Tris Speaker that his raw FL stats might indicate).

A straight discount does not capture this, and so doesn't correspond to the true situation.
   90. jimd Posted: December 28, 2004 at 05:01 PM (#1043891)
A simple regression analysis of Jim's 3 player example:

Adjusted WARP = .957 * Raw WARP - 3.25

As you can see, after the subtraction takes place, the actual percentage adjustment is only 4.3%.


Of course, it's more complicated than that, because batting and fielding are regressed separately.
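
For anyone who wants to check the fit, here is a minimal sketch of an ordinary least-squares line through the three (WARP-1, WARP-2) pairs from the Federal League table. The `least_squares` helper is hypothetical, written only for this illustration; it says nothing about how Davenport actually derives his adjustments.

```python
# (WARP-1, WARP-2) pairs from the 1915 Federal League table: Kauff, Roush, Oakes
POINTS = [(13.2, 9.4), (7.0, 3.4), (4.4, 1.0)]


def least_squares(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares(POINTS)
print(f"Adjusted WARP = {slope:.3f} * Raw WARP {intercept:+.2f}")
```

With only three data points the fit is very loose, so the slope and intercept should be read as a rough description of the pattern and nothing more.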
   91. Paul Wendt Posted: December 29, 2004 at 11:24 AM (#1044727)
jimd #96
Look at the data points 1942-n4, 1941-n5, 1940-n6. These represent the number of players in these years who were regulars in 1946. They were MUCH more likely to have retained/regained their jobs than typical MLB regulars after the same time interval with no war situation. I don't know if this represented an effort on MLB's part to give the returning veterans every opportunity to regain their jobs, or the impact of the war on the development of the minor league players that would normally have replaced some of these players.

Some right of return to a civilian job was provided by law. I don't know details.

For a time including the 1945 and 1946 seasons, MLB roster limits were increased by 20%, partly to make compliance easy.

Cliff Blau, "League Operating Rules"
   92. Joe Dimino Posted: January 03, 2005 at 04:03 AM (#1052602)
Thanks for that WARP2 info jim!

I'm hating the Bob Caruthers induction more and more . . .
   93. karlmagnus Posted: January 03, 2005 at 09:39 AM (#1052668)
If your system says that Buzz Arlett should be in and Caruthers shouldn't, I'd junk the system, if I were you.
   94. EricC Posted: February 18, 2005 at 08:36 PM (#1153490)
Thinking that this is the most appropriate thread for this.

I've calculated career league-adjusted Win Shares above replacement. Replacement level is defined as 1.27 WS per 100 plate appearances, a value determined empirically for the 1901-1940 2-league seasons.

League adjustments are only done when there are multiple leagues in a single season, and are done to "equalize" the leagues. No attempt is made to compare leagues across seasons. The parameters in the league adjustments are determined by comparing the performances of individual players between seasons. To prevent uncontrolled divergences, 9 average players are added to the data set as players who switched leagues between seasons without change of performance.

The league parameters are determined for each season. Rather than give the full set of data, I just give the 20th century decade-averaged factors that I use to convert actual performance to neutral-league performance.

1900s NL: 0.9473 * WS + 0.00043 * PA
1900s AL: 1.0530 * WS - 0.00044 * PA
1910s FL: 1.0576 * WS - 0.00631 * PA
1910s NL: 1.0037 * WS - 0.00206 * PA
1910s AL: 0.9849 * WS + 0.00332 * PA
1920s NL: 0.9994 * WS - 0.00287 * PA
1920s AL: 1.0007 * WS + 0.00286 * PA
1930s NL: 0.9940 * WS - 0.00126 * PA
1930s AL: 1.0061 * WS + 0.00126 * PA
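
As an illustration of how these factors might be applied, here is a small sketch. The decade factors and the 1.27 WS per 100 PA replacement level are taken from this post; the function names, the example season (30 WS in 600 PA), and the assumption that replacement is subtracted after the league adjustment are hypothetical and only meant to show the arithmetic.

```python
# Replacement level from the post: 1.27 WS per 100 plate appearances
REPLACEMENT_WS_PER_PA = 1.27 / 100

# Decade-averaged factors from the list above: (decade, league) -> (a, b),
# where neutral WS = a * WS + b * PA
FACTORS = {
    ("1900s", "NL"): (0.9473, 0.00043),
    ("1900s", "AL"): (1.0530, -0.00044),
    ("1910s", "FL"): (1.0576, -0.00631),
    ("1910s", "NL"): (1.0037, -0.00206),
    ("1910s", "AL"): (0.9849, 0.00332),
    ("1920s", "NL"): (0.9994, -0.00287),
    ("1920s", "AL"): (1.0007, 0.00286),
    ("1930s", "NL"): (0.9940, -0.00126),
    ("1930s", "AL"): (1.0061, 0.00126),
}


def neutral_ws(decade, league, ws, pa):
    """Convert actual Win Shares to neutral-league Win Shares."""
    a, b = FACTORS[(decade, league)]
    return a * ws + b * pa


def ws_above_replacement(decade, league, ws, pa):
    """League-adjusted WS minus replacement level (the ordering is an assumption)."""
    return neutral_ws(decade, league, ws, pa) - REPLACEMENT_WS_PER_PA * pa


if __name__ == "__main__":
    # Hypothetical season: 30 WS in 600 PA, once in the FL and once in the 1910s AL
    for league in ("FL", "AL"):
        adj = neutral_ws("1910s", league, 30, 600)
        war = ws_above_replacement("1910s", league, 30, 600)
        print(f"1910s {league}: neutral WS {adj:.2f}, above replacement {war:.2f}")
```

Because the PA term is a flat per-appearance shift, the same made-up season loses a larger share of its value in the FL than a straight percentage discount would suggest.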


In presenting the position player leaders in career LAWSAR, I divide players' seasons into 4 roles: C, 1B, "IF" (2B/3B/SS), "OF" (LF/CF/RF), according to the position where they played the plurality of their games. If a position player played some games in a season as a pitcher, I subtracted estimated pitching WS from their totals; if they played a plurality of games as a pitcher, I regretfully did not include that season. Recognizing that the following data is not appropriate for short-season 19th century players, I nonetheless give the top 100 career LAWSAR, 1876-1940 for each role, as well as grand totals for players with more than one role in their career, and the top 100 overall:


C
Hartnet 214 Cochran 211 Dickey_ 195 Schang_ 161 Schalk_ 131
Ewing_B 124 Bresnah 114 Bennett 113 Ruel_Mu 104 McGuire 100
O'Neill 99 Ferrell 96 Zimmer_ 92 Clement 90 Kling_J 90
Lombard 90 OFarrel 88 Severei 87 Farrell 86 Meyers_ 83
Carroll 80 Davis_S 78 McFarla 74 Bassler 73 Gowdy_H 72
Snyder_ 71 Hargrav 66 Sewell_ 64 Grady_M 64 Mancuso 64
Schreck 61 Hogan_S 60 Milliga 59 Danning 56 Miller_ 56
Gibson_ 55 Smith_E 54 Lopez_A 53 Peitz_H 53 Wingo_I 53
Wilson_ 52 Hemsley 52 Wilson_ 50 Perkins 49 Criger_ 49
Nunamak 48 Carriga 48 Pytlak_ 46 Phelps_ 45 Rowe_Ja 45
Lapp_Ja 45 Picinic 44 Flint_S 43 Kelly_K 43 Warner_ 43
Rariden 42 Easterl 42 Robinso 42 Ainsmit 42 Snyder_ 42
York_Ru 41 Killefe 39 OConnor 38 Schrive 38 Moran_P 38
OBrien_ 37 Clarke_ 36 Gharrit 36 Myatt_G 36 Hargrav 35
Archer_ 34 Collins 34 Clapp_J 34 McLean_ 34 Ganzel_ 33
Clarke_ 33 Dooin_R 32 Gonzale 32 Henry_J 32 Gross_E 32
Keenan_ 31 Bowerma 31 Mack_Co 30 Hayes_F 30 Sulliva 30<