Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Wednesday, January 19, 2005

Major League Equivalencies

This thread will be used for examining and analyzing MLE’s throughout baseball history.

John (You Can Call Me Grandma) Murphy Posted: January 19, 2005 at 04:22 AM | 213 comment(s)

Reader Comments and Retorts

   101. Chris Cobb Posted: February 17, 2005 at 04:08 AM (#1149688)
The former. The latter has no basis whatever in mathematical reality -it's double counting the divergence of the small sample from the mean.

Thanks, Karl. That makes sense. Unless I hear other persuasive arguments, I'll adjust my regressions accordingly. This change shouldn't affect career value at all, but it should bring the seasonal totals towards a degree of variance congruent with the variance of major-league careers.

It would be useful to have some major-league data to run comparative studies on. Is there any on-line source for, say, half-season splits for players? I know that ESPN has some data of that sort available, but it may not be very comprehensive.

I have raised Beckwith several notches following your analysis, which I support wholeheartedly in principle, while believing it tends still to round up in practice.

Comparison of the regressed figures to the straight MLEs confirms a small round-up from regression. I'm going to see if creating artificial sub-major league seasons to create rolling 5-year sequences for the ends corrects for that, or if not regressing the first and last years of the career is better.

I have raised Beckwith several notches following your analysis, which I support wholeheartedly in principle, while believing it tends still to round up in practice. But fiddling with the regression formula because the peak's not high enough, or twisting your .87 conversion to .95 to please Gadfly would lose it all credibility, as far as I'm concerned.

You'll notice that I haven't simply adopted Gadfly's recommendations to please him :-) , but if I decide that the evidence warrants it, I will change the conversion ratio after putting the evidence up for discussion.

I figure with you on one side and Gadfly on the other, I'll steer a reasonable course . . .
   102. Gadfly Posted: February 19, 2005 at 01:12 AM (#1153464)
Jonesy (and Paul Wendt)-

One of the interesting things that I've found out while analyzing the 1920s Eastern League is that Major League teams of that time often simply dipped down to try out players from cities close to them. In other words, playing in proximity to a Major League city at that time would help you get a shot at the Majors.

And, back to our favorite subject, I did a little research (Pittsburgh Courier, New York Age, Baltimore Afro-American) to find an earlier game in 1928 between Will Jackman's Quaker Giants and the Lincoln Giants. The Quaker Giants played a double-header against the Lincoln Giants in New York on Sunday, April 22, 1928. However, I can't find any info on the results of the games.

Maybe Gary A can help with some info from the Amsterdam News?

Also, there is (was) a broadside for sale on eBay: the Brooklyn Cuban Giants versus the Philadelphia Giants for the Colored World Championship of the World. If you didn't see it, I copied it down, printed it out, and saved it. Let me know if you want that info.

Jim D-

Good point. Presently there is one Triple-A team for every Major League team. The Minors are a funnel and have been since at least 1969. From the 1920s until expansion, by contrast, the Minors were a pyramid, and there were 24 Triple-A teams for 16 Major League teams.

However, if my observation that the available talent has been halved is correct, the 1920s-to-1950s period should still have had higher Triple-A talent, since 24 is less than 32.

Chris Cobb-

Unfortunately, there really is no way to regress, without flattening out the peaks and valleys, unless you double count or otherwise give more credit to single seasons with less than a full season of at bats.

However, the fact that the rolling five-year talent level has raised the peaks shows that it works. And, since a player's talent changes over the course of his career, it is correct. But, of course, there is still the problem of decreasing the beginning and end slightly to balance the actual career path of players.

Karl Magnus-

I did not call you a racist and better people than you have called me nuts before. I have no idea if you are a racist or not; and to accuse someone of racism without proof is to be as intolerant as a racist themselves would be. I would rather approach things with tolerance and an attempt at understanding.

I have noted that you have a high animus to the evaluation of the credentials of Negro League players. But you also have an animus to the evaluation of Minor Leaguers. I assume that you will later have an animus to any evaluation of Sadaharu Oh.

As they said about Dirty Harry: "he's not a bigot, he hates everybody equally."

From my understanding of your posts, you champion evaluating players mostly by analysis of their Major League careers alone, and you give little or no credit for outside influences that may have stunted those careers or denied the players a Major League career altogether. And, as your championing of Jake Beckley shows, you also advocate quantity over quality as far as Major League careers go.

Of course, this is a completely defensible position despite its ignorance of history.

On the other hand, you ignore any evidence that does not conform to your already made opinions. In fact, when confronted with such evidence, you simply declare that it has only made you more convinced in your opinion.

There is a word for that, but it is not racist.

Of course, if you later start denigrating Willie Mays, Hank Aaron, et al., when the time comes to evaluate their careers, I will have to change my opinion.
   103. karlmagnus Posted: February 19, 2005 at 01:38 AM (#1153493)
Gadfly, if the Japanese leagues in Oh's time were 80% as good as the Majors, Oh had 868 x 0.8 x 0.8 = 556 Major League equivalent home runs. He's a HOM'er :-))
   104. Chris Cobb Posted: February 27, 2005 at 04:48 AM (#1168003)
Batting Average Conversion Factor from NeL to MLE: New analysis and conclusions

Method

1) Sample: I used only players with full seasons in both the Negro Leagues and the major leagues, and I considered only seasons in which players were no younger than 21 and no older than 36. This gave me a group of 8 players: Bob Boyd, Roy Campanella, Larry Doby, Luke Easter, Monte Irvin, Sam Jethroe, Jackie Robinson, and Hank Thompson. Conveniently, this group was evenly divided between eastern and western players and between players who played in the Negro Leagues only well before their prime and players who reached the majors only after their prime.

2) For each player I used all available Negro-League seasons that were a) not rookie seasons and b) for which a league average was available, 1944-1949, so that I could normalize these averages. Exception: I used Jackie Robinson’s rookie season, since the one season was all he had.

3) For each player I used a five-year consecutive sample of major-league seasons, starting with the second season in the majors and continuing until either the five-year limit was reached or the player’s level of play seriously and permanently declined. Exception: I used Sam Jethroe’s rookie season, because his major-league career was so short prior to his decline due to age and injury. It seemed better to have more data on him than less, and it seemed right to include one major-league rookie season to balance the inclusion of Robinson’s NeL rookie season.

4) I normalized batting averages for all included seasons to the same offensive level; I selected a .262 league average, which was the league average in the NL in 1949, which seemed like a fairly central year, chronologically.

5) I multiplied all batting averages in war-time seasons by .95 to adjust for weakened competition.

6) Although the sample I had was balanced between players who played in the Negro Leagues only well before their prime and players who reached the majors only after their prime, I was concerned that aging factors would bias the results. Therefore, I used Tangotiger’s calculations of normal aging patterns, 1919-1997, to calculate a projected major-league average for the seasons in which the player was active in the Negro Leagues. These are available at http://www.tangotiger.net/agepatterns.txt . For example, with Luke Easter, I found his normalized average for his age 34-36 seasons as a major-league regular, then used that data to find a projected average for his age 31-32 seasons in the NeL.

7) For each player, I used the ratio between the projected ML average for his NeL seasons and his normalized average for those NeL seasons (all seasonal data taken from Holway) as a conversion factor.

8) I averaged the conversion factors to find a mean, or standard conversion factor for batting average.
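
The eight steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the sample inputs, and the way the age adjustment enters (as a single multiplicative `age_adjustment` argument) are hypothetical and are not taken from the post. Only the .262 target average and the .95 war-time multiplier come from the method as described.

```python
TARGET_LEAGUE_AVG = 0.262   # 1949 NL average, the normalization target named in step 4
WARTIME_MULTIPLIER = 0.95   # step 5 discount for war-time seasons


def normalize(avg, league_avg, wartime=False):
    """Scale a batting average to the .262 environment, discounting war years."""
    normalized = avg * TARGET_LEAGUE_AVG / league_avg
    return normalized * WARTIME_MULTIPLIER if wartime else normalized


def conversion_factor(ml_normalized_avg, age_adjustment, nel_normalized_avg):
    """Step 7: projected ML average for the NeL seasons over the NeL average.

    age_adjustment is a hypothetical stand-in for whatever the aging curve
    implies when moving from the player's ML ages to his NeL ages (step 6).
    """
    projected_ml_avg = ml_normalized_avg * age_adjustment
    return projected_ml_avg / nel_normalized_avg


def standard_factor(factors):
    """Step 8: plain mean of the per-player conversion factors."""
    return sum(factors) / len(factors)


# Made-up inputs for one imaginary player, not real data.
nel_avg = normalize(0.340, league_avg=0.275)
ml_avg = normalize(0.285, league_avg=0.260)
factor = conversion_factor(ml_avg, age_adjustment=1.02, nel_normalized_avg=nel_avg)
print(round(factor, 3))
```

Keeping the age adjustment as an explicit argument makes it easy to see how much of any one player's factor depends on the aging curve rather than on the raw averages.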

Results

Factor  Player      NeL team  Region  Age type  Moved to
.907    Boyd        MEM       West    Young     AL
.932    Campanella  BAL       East    Young     NL
.843    Doby        NWK       East    Young     AL
.840    Easter      WAS       East    Old       AL
.960    Irvin       NWK       East    Old       NL
.897    Jethroe     CLE       West    Old       NL
.994    Robinson    KC        West    Old       NL
.809    Thompson    KC        West    Young     NL

Standard Conversion Factor = .897 --> .90

Notes

1) This is an increase from the .87 I had been using. I believe normalizing for leagues and adjusting for age corrected systematic errors in the earlier study. Other systematic errors may still exist, of course.
2) This is not as high as the conversion factor Gadfly has advocated. While my results for Monte Irvin closely match the results of his more intensive study of Irvin’s career, the full consideration set shows individual conversion factors ranging quite widely, with Irvin clearly towards the high end. Some of this variation is surely due to the randomizing effects of the small sample sizes for NeL play, but it seems unlikely that this is the only factor. It appears that, generally, the better players have higher conversion factors, but this is not always true. The electorate should consider, therefore, that a standard conversion factor, even if it accurately captures the average, may significantly overrate or underrate a player. The question is whether there is any way to judge whether a particular player would have a higher or lower factor than the standard, except by seeing the ML statistics that we don’t have.
3) I see no pattern of differences between conversion factors for eastern players and western players, or between older players and younger players.
4) Players moving into the AL tend to have lower conversion factors than players moving into the NL. WARP finds that the AL was the slightly stronger league until fuller integration caught the NL up, which might explain this difference. It is the only difference between groups of players that I have noticed so far, aside from the better players having higher conversion factors.
5) It seems likely that the quality of play during the final years of the Negro Leagues, as the young-to-mid-career stars were drawn into the majors and minors, was somewhat lower than the quality of play between the early 1920s and the mid 1930s, when the top black talent was more effectively concentrated in the Negro Leagues.
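
Notes 3 and 4 can be checked directly against the results table. The snippet below is just a convenience sketch that groups the eight posted factors by region, age type, and destination league and prints each group's mean. The variable and function names are illustrative.

```python
from collections import defaultdict

# (factor, player, region, age type, moved to), as posted in the results table
RESULTS = [
    (0.907, "Boyd", "West", "Young", "AL"),
    (0.932, "Campanella", "East", "Young", "NL"),
    (0.843, "Doby", "East", "Young", "AL"),
    (0.840, "Easter", "East", "Old", "AL"),
    (0.960, "Irvin", "East", "Old", "NL"),
    (0.897, "Jethroe", "West", "Old", "NL"),
    (0.994, "Robinson", "West", "Old", "NL"),
    (0.809, "Thompson", "West", "Young", "NL"),
]


def group_means(column):
    """Mean conversion factor for each distinct value in the given column."""
    groups = defaultdict(list)
    for row in RESULTS:
        groups[row[column]].append(row[0])
    return {key: sum(values) / len(values) for key, values in groups.items()}


for label, column in (("Region", 2), ("Age type", 3), ("Moved to", 4)):
    for key, mean in sorted(group_means(column).items()):
        print(f"{label:8s} {key:5s} {mean:.3f}")
```

With only three to five players in each group, any gap between group means is suggestive at best.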

A full slugging average study remains to be done. Since the square root relationship between batting and slugging would suggest a ba/sa conversion ratio set of .90/.82 anyway, I see no harm in leaving the slugging ratio unchanged for the moment.
   105. jimd Posted: February 27, 2005 at 05:49 AM (#1168069)
It appears that, generally, the better players have higher conversion factors

This matches the model used by Davenport in his league translations for the other leagues. That is, there is a "constant" that needs to be subtracted before the percentage is applied. This "constant" represents the difference between the replacement levels in the two leagues (and as such, is not really constant but can vary over time).
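
A toy illustration of why subtracting a constant before applying the percentage makes the effective conversion factor rise with the player's raw average. The constant of .015 and the rate of .95 below are invented for the example; they are not Davenport's values, which are not given in this thread.

```python
def translate(raw_avg, constant=0.015, rate=0.95):
    """Subtract a replacement-level gap, then apply the percentage."""
    return (raw_avg - constant) * rate


def effective_factor(raw_avg, constant=0.015, rate=0.95):
    """Translated average divided by raw average."""
    return translate(raw_avg, constant, rate) / raw_avg


for raw in (0.260, 0.300, 0.340):
    print(f"{raw:.3f} -> {translate(raw):.3f} "
          f"(effective factor {effective_factor(raw):.3f})")
```

Because the constant is a bigger share of a low average than of a high one, the better hitter keeps a larger fraction of his raw number.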
   106. karlmagnus Posted: February 27, 2005 at 05:25 PM (#1168496)
Chris, I think that a very fair study, made shaky only by the question of small sample size, which you can't do anything about.

Where I do disagree, though, is your assumption that the 20s-30s NL would be better than the last years. The Negro Leagues went bust and re-formed in the early 30s; this must have had a devastating effect on quality level, particularly the amount of new talent entering. Conversely, with Gibson, Paige et al. the Negro Leagues had become famous and fashionable by 1940, and the money had also increased, so the talent level immediately before integration (say 1941-46, subject to a dip for war service) must surely have been at the highest point ever reached.

You have successfully proved that a conversion of .90 (plus or minus about .04, I would guess) is right for the 1940s NL (ex the war years -- did they affect NL and ML equally?) I still think that .87 at most would be correct earlier, perhaps .88-.90 in the 1920s, then down to .8 or so while the leagues were in chaos, then rising back to .9 by 1941.
   107. John (You Can Call Me Grandma) Murphy Posted: February 27, 2005 at 05:47 PM (#1168504)
(if he can call me racist, I can call him mad!)

I missed the post where that charge was leveled at karlmagnus (it appears to have been a misunderstanding on karlmagnus' part), but I can vouch for him and say that he's not. While I disagree with him somewhat regarding the Negro Leaguers' quality (though I think it's good to have someone say "Whoa!" every once in a while), I'm 100% certain that race has nothing to do with it from his posts here and from numerous AIMs and e-mail messages between us. In his real persona, he's an economist/historian and his reasoning for leaving off the Negro Leaguers rests there, not because of any racism on his part.

Not that he isn't a pain, though. :-D

As for Gadfly, I'm also confident enough to say that stories about his madness have been greatly exaggerated. :-)

Now if anyone needs to be cranky around here, leave it to professionals like me. :-)
   108. Chris Cobb Posted: February 27, 2005 at 10:44 PM (#1168773)
Karlmagnus,

I'm glad you find the latest study reasonable. For myself, the aging patterns addition did much to raise my confidence in the results.

I think a study of player careers--who was coming, who was going--during the first half of the 1930s might provide good evidence about the direction that competition levels were going in that era of economic distress. My view is that contraction concentrated talent in fewer teams, just as it did in the majors in the 1890s, raising the level of competition. I think our views of what happened during that decade differ similarly to our views on the first half of the 1930s in the Negro leagues. A study of careers might provide useful evidence.

From the mid-1930s on, I wonder if you are taking account of the extent to which the top black players were playing outside of the Negro Leagues. Paige and Gibson might have been inspiring new talent, but they weren't doing much otherwise to strengthen the leagues themselves. Paige especially was doing more to draw the top talent already in the leagues out of it. He spent a season in Bismarck in a white semipro league there. His hiring led other teams in the league to hire top black pitchers. He took time off from the league to pitch in the Denver Post semipro tournament, along with other black stars. He took a bunch of black stars to the Dominican Republic for the season there in 1937.

From 1939-1944, many of the top black players left the Negro Leagues for one or more seasons in Mexico. Cool Papa Bell, Josh Gibson, Ray Dandridge, Monte Irvin, Roy Campanella, Wild Bill Wright, Martin Dihigo, Willie Wells, and other lesser stars, all left the NeL for Mexico; some never returned.

Just as the competition with the Mexican League was ending, WWII depleted the NeL talent pool. Just as the war ended, integration in turn began to pull the young black stars out of the NeL.

It seems to me that there was never a time after the mid-1930s when the Negro Leagues contained as large a percentage of the known black baseball stars as they did from the early 1920s to the early 1930s, and it's on that information that I make my analysis of league talent levels (which is heavily indebted to gadfly's comments on the subject, I should note). It's imaginable that talent was flowing into the leagues faster after 1935, but at best I think it could only serve to counterbalance the documented flow of talent out of the leagues.
   109. Paul Wendt Posted: February 28, 2005 at 02:29 AM (#1169164)
John Murphy #107
he's an economist/historian and his reasoning for leaving off the Negro Leaguers rests there, not because of any racism on his part.

Chris Cobb #108
I think a study of player careers--who was coming, who was going--during the first half of the 1930s might provide good evidence about the direction that competition levels were going in that era of economic distress. My view is that contraction concentrated talent in fewer teams, just as it did in the majors in the 1890s, raising the level of competition. I think our views of what happened during that decade differ similarly to our views on the first half of the 1930s in the Negro leagues.

Yes, in a thread on demography, migration, etc --with jimd and me, at least-- karlmagnus argued that MLB talent probably decreased in the 1890s in response to economic factors. I suppose that is mainly a response to the wage decrease within baseball, but I don't recall that.

If the response to wage rates is significant, it remains an important and open question how quickly it operates on ballplayers of what age.
   110. Brent Posted: February 28, 2005 at 05:05 AM (#1169571)
Chris,

This is great work. Just a note that with the small sample and large variation, plus or minus 2 standard errors will cover a pretty broad range; I believe it's from .853 to .943.
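
For anyone who wants to reproduce that range, here is a short sketch. It assumes the interval is the mean of the eight posted factors plus or minus two standard errors, with the standard error taken as the sample standard deviation (n - 1 in the denominator) divided by the square root of n. Under that assumption the numbers come out to the .853 to .943 range.

```python
import math
import statistics

# The eight per-player conversion factors from comment #104
factors = [0.907, 0.932, 0.843, 0.840, 0.960, 0.897, 0.994, 0.809]

mean = statistics.mean(factors)
# statistics.stdev is the sample standard deviation (divides by n - 1)
std_error = statistics.stdev(factors) / math.sqrt(len(factors))
low, high = mean - 2 * std_error, mean + 2 * std_error

print(f"mean {mean:.4f}, SE {std_error:.4f}, range {low:.3f} to {high:.3f}")
```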

If you did these calculations in Excel, I wonder if you'd mind posting them on the Yahoo group. I'm working on a similar exercise for the 1920s PCL, and it would be good to try to do it consistently with your work.

Thanks again.
   111. KJOK Posted: February 28, 2005 at 08:45 AM (#1170076)
Chris:

Fantastic.

I would say that if you're going to adjust for aging, then maybe you could go ahead and use a larger sample size instead of just these 8 players?
   112. Chris Cobb Posted: February 28, 2005 at 04:54 PM (#1170358)
Brent,

I didn't do the calculations in Excel, though I may move the data into Excel at some point. If I do, I will post it.

KJOK,

I've used the aging data to extend the sample size as far as I can. Tangotiger only produced numbers for ages 21 to 37, and all of the remaining players were either too young or too old.

Here are the remaining players with seasons that meet all criteria except for fitting into the age range:

Junior Gilliam -- 20 in 1949

Minnie Minoso -- 21 in 1947 (non-rookie, in age range)

Bob Thurman -- 39 in 1956 (first suitable ML season)

Elston Howard -- 20 in 1949

Al Smith -- 20 in 1949

Resurveying, I see I could add Minnie Minoso based on his 1947 season, but that's the only one.

Have I missed any hitter who played a season (preferably two) in the NeL between 1944 and 1949 and who also played a full season (preferably two) in the major leagues?
   113. Tango Tiger Posted: February 28, 2005 at 05:49 PM (#1170465)
League factors using Park factors

I haven't gone through this thread in detail, but I thought the above link might have some relevancy to the discussion at hand.

What would also be nice is minor league park data, especially if minor league teams ever shared the same park with a MLB team.
   114. karlmagnus Posted: February 28, 2005 at 05:50 PM (#1170469)
Mays and Aaron would be interesting, though I suppose you may believe that the quality of the NL had dropped off by their time. I always like the anecdote of Leo Durocher, talking to Willie Mays, who's hitting .477 for the Birmingham Black Barons, but isn't sure he can cut it in the majors: "Listen bud, you're hitting .477; if you can hit .277 in the National League you can help the Giants!" Presumably Mays' 1951 NL and major conversion would depress the factor a bit, since while good even in his rookie season he hit nothing like .477 in the majors!
   115. Chris Cobb Posted: February 28, 2005 at 06:47 PM (#1170591)
The .477 average would have been against all levels of competition.

Mays hit .263 from 1948 to 1950 vs. NeL competition, ages 17-19.
   116. Gary A Posted: February 28, 2005 at 11:28 PM (#1171287)
Actually, in 1951 Willie Mays hit .477 for the Minneapolis Millers of the American Association in 35 games before being called up to the Giants.

His numbers for the Birmingham Black Barons (from the Negro Leagues Book):

Year-Age--G--AB--H---D--T-HR---R-RBI-SB--Ave--Slg
1948-17--25-084-022-03--0--1--10--07-01-.262-.333
1949-18--75-270-084------------63---------.311-----
1950-19--27-106-035-07--2--4--22--28-02-.330-.547

In 1950, he entered organized ball with the Class B Trenton Giants:
1950-19--81-306-108-20--8--4--50--55-07-.353-.510
Then started 1951 with AAA Minneapolis:
1951-20--35-149-071-18--3--8--38--30-05-.477-.799

In Minneapolis, aside from spending only a month there, he probably benefited from ultra-hitter-friendly Nicollet Park, which appears to have retained its characteristics over many decades--Perry Werden was hitting 40-45 homers a year there in the 1890s; Gavy Cravath and Buzz Arlett, among many others, saw their offensive stats helped greatly by Nicollet.
   117. KJOK Posted: March 01, 2005 at 01:17 AM (#1171519)
KJOK,

I've used the aging data to extend the sample size as far as I can. Tangotiger only produced numbers for ages 21 to 37, and all of the remaining players were either too young or too old.


OK, perhaps I extrapolated them on my own, because my own table based on Tango's has adjustments for age 15 thru age 49.....
   118. KJOK Posted: March 01, 2005 at 01:19 AM (#1171522)
Age Age_Factor
49 2.262431
48 2.171934
47 2.085056
46 2.001654
45 1.921588
44 1.844724
43 1.770935
42 1.700098
41 1.632094
40 1.566810
39 1.504138
38 1.443972
37 1.386214
36 1.330765
35 1.277534
34 1.226433
33 1.177376
32 1.130281
31 1.085069
30 1.041667
29 1.000000
28 1.000000
27 1.000000
26 1.000000
25 1.041667
24 1.085069
23 1.130281
22 1.177376
21 1.226433
20 1.277534
19 1.330765
18 1.386214
17 1.443972
16 1.504138
15 1.566810
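
The posted values look like they follow a simple rule: 1.000 for ages 26 through 29, with each additional year outside that window dividing by 0.96 (that is, multiplying by 25/24). That is a reading of the numbers, not something stated in the post, and the function name below is made up. A sketch that regenerates the table under that assumption:

```python
def age_factor(age, peak_start=26, peak_end=29, yearly_ratio=0.96):
    """Age factor assuming a flat 26-29 peak and a 4% step per year outside it."""
    years_from_peak = max(0, peak_start - age, age - peak_end)
    return (1 / yearly_ratio) ** years_from_peak


print("Age Age_Factor")
for age in range(49, 14, -1):
    print(f"{age} {age_factor(age):.6f}")
```

If the rule is right, the curve is symmetric in years from the peak window, so age 15 and age 40 (both 11 years out) share the same factor.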
   119. KJOK Posted: March 01, 2005 at 01:46 AM (#1171586)
Have I missed any hitter who played a season (preferably two) in the NeL between 1944 and 1949 and who also played a full season (preferably two) in the major leagues?
Bob Thurman
Ray Noble
Harry Simpson
George Crowe
Dave Pope
Jim Pendleton
Al Smith
Gene Baker
Luis Marquez
Chuck Harmon
Joe Taylor
   120. KJOK Posted: March 01, 2005 at 01:50 AM (#1171592)
and Hector Rodriguez had 462 PA's in one MLB season....
   121. Chris Cobb Posted: March 08, 2005 at 11:15 PM (#1188745)
There are a couple of issues with MLEs for the 1930s that I am looking for feedback on.

Here's one.

In the 1930s, the offensive levels in the National League and in the American League were quite different. Here's the NL BA/SA expressed as a factor of the AL for each season in this decade:

1930: 1.052/1.064
1931: 0.996/0.977
1932: 0.996/0.980
1933: 0.974/0.928
1934: 1.000/0.987
1935: 0.989/0.972
1936: 0.961/0.917
1937: 0.968/0.920
1938: 0.950/0.907
1939: 0.975/0.948

In some seasons, the leagues are very close, but except for 1930, the NL is consistently lower, especially in slugging. The slugging ratio appears to be consistently close to the square of the ba ratio, as we might expect.
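
The square relationship is easy to eyeball with a few lines of code. This sketch simply squares each posted BA ratio and shows how far the posted SA ratio is from it; nothing here goes beyond the ten rows above, and the names are illustrative.

```python
# year: (NL/AL batting average ratio, NL/AL slugging average ratio), as posted
RATIOS = {
    1930: (1.052, 1.064),
    1931: (0.996, 0.977),
    1932: (0.996, 0.980),
    1933: (0.974, 0.928),
    1934: (1.000, 0.987),
    1935: (0.989, 0.972),
    1936: (0.961, 0.917),
    1937: (0.968, 0.920),
    1938: (0.950, 0.907),
    1939: (0.975, 0.948),
}


def gap_from_square(ba_ratio, sa_ratio):
    """Posted slugging ratio minus the square of the batting ratio."""
    return sa_ratio - ba_ratio ** 2


for year, (ba, sa) in RATIOS.items():
    print(f"{year}: BA^2 = {ba ** 2:.3f}, SA = {sa:.3f}, "
          f"gap = {gap_from_square(ba, sa):+.3f}")
```

By this check 1930 is the outlier (the squared BA ratio is about 1.107 against a posted SA ratio of 1.064); every other season comes out within roughly .02.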

Since adjusting for the league offensive level is crucial to accurate MLEs, the strong divergence between the leagues in many seasons (any divergence of 5% or more has a large impact on projected ws) means that I need to decide which league I am projecting the NeL statistics into.

For the 1920s, after the early twenties the leagues are generally close in offensive level, but often different in competition level, as far as we can tell, so I used a single offensive-level adjustment in my calculations and alternated leagues season by season for win share projections on the basis of raw stats.

In the 1930s, the leagues are closer in competition level, with the NL a bit ahead, but farther apart in offensive level.

Would it be preferable to continue to see projections based on alternating NL/AL seasons, or would it be preferable to see the NeL players projected consistently into one offensive environment or the other?

Comments?

I'll be posting a little later on the issue of assessing NeL offensive levels during this decade.
   122. OCF Posted: March 08, 2005 at 11:40 PM (#1188787)
Chris: I'd probably pick a consistent environment. Real careers do have real ups and downs, good years and bad years. I remember an article Bill James did in a mid-80's Abstract in which he presented the season-by-season statistics for two players, both of whom were completely fictional - and proceeded to discuss their careers from the evidence of the statistical lines. His point was that the numbers have storytelling power. The problem with the alternating-environment thing is that it introduces year-to-year fluctuations that aren't part of the story you're trying to tell.

Which consistent environment? You've got three choices: AL, NL, and hybrid/midpoint. I don't think it matters that much which one you go for, as long as you make the choice clear to us.
   123. Gadfly Posted: March 09, 2005 at 12:15 AM (#1188836)
Chris Cobb-

Since it's a Major League equivalency, perhaps you should combine both the AL and NL to give one Major League statistic. Then, using park and league factors, give a variable showing how much the MLE could vary up or down depending on league and location. I know that seems like poor advice, but it's the best I can think of.

On NeL offensive levels, I must admit that I was very interested in seeing Gary A's league info for one season in the 1930s. As I mentioned before, I adjusted up in the 1930s, contending that the level of play rose as the Depression killed off the weak teams.

Now I think this other factor, how low the Negro League offensive levels seem to have been in the early 1930s, is perhaps a much bigger factor than any change in the talent level.

The level might not have actually risen that much, or perhaps even at all, and my study could have simply been picking up the huge dip in offensive levels in the 1930s, presumably brought about by the Negro Leagues' enthusiastic embrace of night baseball.

Gary A-

Willie Mays' complete 1949 NAL stats are:
G-AB-R-H-2B-3B-HR-RBI-BA-SA-TB-SB
80-289-68-90-17-4-5-40-.311-.450-130-11
Negro League Career (1948-50):
132-479-100-147-27-6-10-75-.307-.451-216-14
First Year in Majors (1951)
121-464-059-127-22-5-20-68-.274-.472-219-07

His BB and SO rates are known for 48 and 50 in the Negro Leagues and compare pretty well with his Major League totals, though the Negro League sample size is small:

BB-SO-AB
17-16-190 (NeL 1948 and 1950)
57-60-464 (NY Giants 1951)

Two other points:

1) His statistics in Minneapolis are impressive but certainly show the dangers of small sample size; and Trenton was evidently an impossible home run park. Just looking at his Minor League stats with no info would lead you to believe that the American Association was a much less talented League than the tough old Interstate League.

2) Any analysis of Mays' Negro League statistics versus his first Major League season has to, as always, take into account the adjustment effect. Mays spent two full Spring trainings and two and a half years with Birmingham.

On the other hand, he was thrown into the line-up in mid-season for the Giants. The immediate effect of this was he went hitless in something like his first 20 ABs. Without those 20 ABs, his BA/SA is .286-.493 or so.
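
The arithmetic behind that last figure, as a quick sketch. It takes the 1951 line posted above (127 hits and 219 total bases in 464 at-bats) and assumes exactly 20 hitless at-bats are removed; the post itself only says "something like" 20.

```python
hits, total_bases, at_bats = 127, 219, 464   # Mays, NY Giants 1951, as posted
hitless_start = 20                            # assumed length of the opening slump


def rates(h, tb, ab):
    """Return (batting average, slugging average)."""
    return h / ab, tb / ab


full_ba, full_sa = rates(hits, total_bases, at_bats)
adj_ba, adj_sa = rates(hits, total_bases, at_bats - hitless_start)

print(f"full season:         {full_ba:.3f}/{full_sa:.3f}")
print(f"without first {hitless_start} ABs: {adj_ba:.3f}/{adj_sa:.3f}")
```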
   124. Chris Cobb Posted: March 09, 2005 at 03:37 AM (#1189163)
Thanks, OCF and gadfly!

Picking the midpoint might be the best choice for ba/sa, but to get to win shares it's much easier for me to have set the ba and sa to an actual league environment, so I'll probably pick one or the other. I'll wait for more comments, though, so if others want to weigh in, please do.

Discussion of NeL offensive levels based on study of median batting averages to follow.
   125. Chris Cobb Posted: March 09, 2005 at 03:39 AM (#1189166)
Here's the other bit I've been working on:

Use of Median Averages to Normalize League Offense Levels

In setting the conversion factor from the NeL to the majors, having league offense levels was important to arriving at a reliable conclusion. League offensive data for the NeL in the 1920s also played an important role in the conversions for that decade. There is little comparable data for the 1930s, however, with uncorrected NeL norms for 1934 provided by Gary A. as the only data so far available.

In an effort to get some numerical view of the offense levels in the NeL in the 1930s that could be used to adjust the records of individual players, I have worked with the one set of comprehensive league data I have access to: Holway’s list of the batting averages for the starters at each position (except pitcher) for each team in each Negro League in each season. Holway doesn’t provide at bats to go with these averages, so there is no way to get from player averages to league averages. However, one can still find a median average for each season, which can be compared to the median averages for the major leagues in each season also.

I am considering how to use this data as part of conversions. From the major-league data it is clear that 1) the median average of starting position players is always higher than the league average, 2) the difference between the median average and the league average varies, but the range of values for this difference is less than .015 (lowest is .006, highest .018), and 3) the higher the median average and the league average, the larger the difference between the two is likely to be. It seems that the American League has larger differences than the National League, but I have so far not been able to determine whether this is a league-based difference or only an effect of the American League having a generally higher offensive level.

From comparing the major-league data to the Negro-League data, it is clear that the median average in the majors is typically higher than the median average in the NeL, from which I infer that offensive levels in major leagues were typically higher than the offensive levels in the Negro-Leagues, with occasional exceptions.

If this inference is valid, the question arises of how best to adjust for this difference, given the degree of inconsistency observed in the relationship between the median and the mean in the major-league data (and the lack of means in the NeL data that force us to work with median averages).

If anyone with statistical know-how has ideas about this, I would welcome input.

Here’s what I have so far.

By playing around with the major-league data, I have found so far that adjusting offense levels by the ratio between league median averages diminishes the level of error when the league median averages are separated by .010 or more. When the league medians are separated by .005-.010, adjusting the league median averages by the ratio of the median does not diminish the level of error.

For example: The AL in 1935 had a league avg. of .280 and a median avg. of .292. The NL in 1930 had a league avg. of .303 and a median avg. of .313. If we did not have the league averages and assumed that the league offense levels were the same, the value of the hitters in the 1935 AL would be underestimated by 7.6% [1-(.280/.303)]. If we multiply the AL league average by the ratio of the medians (.313/.292), .280 --> .300. If we now assume that the league offense levels have been normalized, we would be underestimating the hitters in the 1935 AL by 1.0% [1-(.300/.303)], reducing the inaccuracy of our estimate by 6.6 percentage points. This example is more completely successful than many, but no adjustments by this method increased the error when the league median averages differed by .010 or more.
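For anyone who wants to check the arithmetic in that example, here is a minimal Python sketch. The function names are made up for illustration; the inputs are the 1935 AL and 1930 NL figures quoted above. Note that the sketch carries the adjusted average unrounded (about .3001), so the remaining error comes out a shade under the 1.0% obtained from the rounded .300.

```python
def underestimate(true_level, assumed_level):
    """Share by which hitters are underrated when their league's true
    offense level is `true_level` but we assume `assumed_level`."""
    return 1 - true_level / assumed_level


def adjust_by_median_ratio(avg_low, med_low, med_high):
    """Scale the lower league's average by the ratio of the two medians."""
    return avg_low * (med_high / med_low)


al35_avg, al35_med = 0.280, 0.292   # 1935 AL: league average, starter median
nl30_avg, nl30_med = 0.303, 0.313   # 1930 NL: league average, starter median

before = underestimate(al35_avg, nl30_avg)
adjusted = adjust_by_median_ratio(al35_avg, al35_med, nl30_med)
after = underestimate(adjusted, nl30_avg)

print(f"error with no adjustment: {before:.1%}")
print(f"AL average after median-ratio adjustment: {adjusted:.3f}")
print(f"error after adjustment: {after:.2%}")
```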

Since the difference between the median and the mean tends to rise as the median and the mean rise, I believe that subtracting the typical difference between median and mean for a given range of median values (say, for example, .010 for medians between .270 and .280) from the NeL median and finding the ratio between that estimated mean and the actual major-league mean would further increase the accuracy of the estimates, but I have not yet tested it. My playing around with the major-league data suggested that, with no such adjustment, the league with the lower offensive level tended to be overestimated by .5 – 3% (though this did not happen in the example above). Since the NeL is consistently the lower-offense league, this error would systematically benefit the NeL players. I think that the subtraction, if set properly, should eliminate systematic error of this sort.
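A rough sketch of how the subtraction step might be set up in Python. This is untested as a method; the only gap bracket filled in is the "for example" one above (.010 for medians from .270 to .280), and any other brackets would have to be fitted to the major-league data. The function names are hypothetical, and the choice of the 1932 NL as the comparison league is just for illustration.

```python
# (low, high, typical median-minus-mean gap). Only this one row comes from
# the example in the text; further rows are deliberately left out.
TYPICAL_GAP = [
    (0.270, 0.280, 0.010),
]


def estimated_mean(median):
    """Estimate a league mean from a starter median by subtracting the
    typical gap for the bracket the median falls in."""
    for low, high, gap in TYPICAL_GAP:
        if low <= median < high:
            return median - gap
    raise ValueError(f"no gap bracket covers a median of {median:.3f}")


def offense_ratio(nel_median, ml_mean):
    """Ratio of the estimated NeL mean to the actual major-league mean."""
    return estimated_mean(nel_median) / ml_mean


# 1932: East-West median .272 against the NL league average of .276
ratio_1932 = offense_ratio(0.272, 0.276)
print(round(ratio_1932, 3))
```

A median outside the filled-in bracket raises an error rather than guessing a gap, which seems the safer behavior until the brackets are actually worked out.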

In some Negro League seasons, the number of games on which the batting averages are based is obviously quite small (20-30 games per team), and these small seasons tend to have more extreme values. I would like to regress them, but I’m not sure how to set that up.

I’ll post the median data set in a moment, for those interested.
   126. Chris Cobb Posted: March 09, 2005 at 03:40 AM (#1189168)
Median Averages, majors and NeL, 1930-1939

1930
NL 313 med, 303 avg
AL 300 med, 288 avg
W 286 med
E 311 med

1931
NL 292 med, 277 avg.
AL 294 med, 278 avg.
W 278 med
E 246 med (small data base)

1932
NL 286 med, 276 avg
AL 293 med, 277 avg.
EW 272 med
S 272 med

1933
NL 276 med, 266 avg
AL 289 med, 273 avg.
EW 287 med

1934
NL 293 med, 279 avg
AL 298 med, 279 avg
EW 267 med, 261 avg.

1935
NL 284 med, 277 avg
AL 292 med, 280 avg.
EW 287 med

1936
NL 289 med, 278 avg
AL 295 med, 289 avg.
W 269 med (small data base)
E 277 med

1937
NL 282 med, 272 avg.
AL 299 med, 281 avg.
W 271 med (small data base)
E 287 med

1938
NL 273 med, 267 avg
AL 296 med, 281 avg
W 280 med (small data base)
E 274 med

1939
NL 285 med, 272 avg
AL 291 med, 279 avg.
W 253 med (small data base)
E 279 med
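The median-minus-average gaps for the majors can be pulled straight out of the table above. The sketch below (Python, with the figures typed in as points of batting average) is only a convenience for checking the table; the variable and function names are made up. Run on the table as posted, the gaps range from 6 to 19 points and average 10.6 points in the NL against 14.2 in the AL.

```python
# (year, league): (starter median, league average), in points of batting
# average, copied from the table above
MAJORS = {
    (1930, "NL"): (313, 303), (1930, "AL"): (300, 288),
    (1931, "NL"): (292, 277), (1931, "AL"): (294, 278),
    (1932, "NL"): (286, 276), (1932, "AL"): (293, 277),
    (1933, "NL"): (276, 266), (1933, "AL"): (289, 273),
    (1934, "NL"): (293, 279), (1934, "AL"): (298, 279),
    (1935, "NL"): (284, 277), (1935, "AL"): (292, 280),
    (1936, "NL"): (289, 278), (1936, "AL"): (295, 289),
    (1937, "NL"): (282, 272), (1937, "AL"): (299, 281),
    (1938, "NL"): (273, 267), (1938, "AL"): (296, 281),
    (1939, "NL"): (285, 272), (1939, "AL"): (291, 279),
}

# Gap between starter median and league average for each league-season
gaps = {key: med - avg for key, (med, avg) in MAJORS.items()}


def mean_gap(league):
    """Average median-minus-mean gap for one league across 1930-39."""
    values = [gap for (year, lg), gap in gaps.items() if lg == league]
    return sum(values) / len(values)


for league in ("NL", "AL"):
    print(league, "average gap:", round(mean_gap(league), 1), "points")
print("smallest gap:", min(gaps.values()), "largest gap:", max(gaps.values()))
```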
   127. karlmagnus Posted: March 09, 2005 at 03:48 AM (#1189176)
If you notice, the 1930 NL/AL comparison differs from the others. I have a vague memory that the NL panicked after the Hack Wilson 1930 season and changed the rules or possibly ball, so NL and AL weren't strictly comparable through the 30s. If we're getting to this level of detail, I think that needs to be checked.
   128. OCF Posted: March 09, 2005 at 03:56 AM (#1189184)
karl, go read this article.
   129. jimd Posted: March 09, 2005 at 04:18 AM (#1189213)
The NL/AL difference is even more dramatic than the raw stats indicate. The parks in the NL are probably more hitter friendly than the parks in the AL. Consistently, Sportsman's Park in St. Louis plays as a more extreme hitters park in the AL than it does in the NL, indicating that the other AL parks average out to be more pitcher friendly than the other NL parks. (This effect is less dramatic in the 1930's than it is in the 1920's, perhaps due to the Fenway rehab.)

This effect goes away after Baker Bowl is condemned in 1938 and the Phillies move into Shibe Park with the A's (there is a slight reversal at Sportsman's, though Shibe tends to play as near neutral in both leagues.)
   130. KJOK Posted: March 09, 2005 at 06:20 AM (#1189358)
And on NL vs. AL parks, you should read this:

NL vs. AL Parks Using Common Parks Analysis
   131. John (You Can Call Me Grandma) Murphy Posted: March 16, 2005 at 10:16 PM (#1201843)
From the '47 Ballot thread:

Let's see--here are some players that moved from NeL to ML, with their last NeL and ML position when they settled in:

Jackie Robinson SS-->2B
Monte Irvin CF-->LF
Larry Doby 2B-->CF
Roy Campanella C-->C
Hank Thompson UT/OF*-->3B
Bob Boyd 1B-->1B
Sam Jethroe CF-->CF
Luke Easter LF**-->1B
Minnie Minoso 3B-->LF
Gene Baker SS-->2B


Robinson - the Dodgers already had a great shortstop, so moving Jackie to second made sense (after Stanky was traded)

Minoso - he wasn't going to knock Rosen out at third.

Baker - ex-NeL Banks was manning short starting in '54

Irvin - the Staten Island Scot was doing well in CF

Doby - Flash Gordon at second? Doby moves to CF

I might be wrong, but I don't see a definitive defensive spectrum correction going on here for any of these examples (except possibly Easter). Established stars were already manning the positions they had played in the NeL, so they moved to positions that were open. Besides, how many GM's would want to displace star white players with black players?
   132. Gary A Posted: March 16, 2005 at 10:43 PM (#1201893)
Good point. After all, Robinson entered the ML as a first baseman, because they weren't going to move Stanky off second for him.

Also--am I remembering this right?--wasn't there supposed to have been some worry that baserunners would try to spike him? Or am I thinking of Frank Grant?
   133. John (You Can Call Me Grandma) Murphy Posted: March 16, 2005 at 10:49 PM (#1201904)
Also--am I remembering this right?--wasn't there supposed to have been some worry that baserunners would try to spike him?

That does sound familiar, Gary. Not that he didn't have problems at first in '47 from racists.
   134. jimd Posted: March 16, 2005 at 11:30 PM (#1201964)
Irvin - the Staten Island Scot was doing well in CF

They had no problem moving the Staten Island Scot in 1951 when the Say Hey Kid arrived.
   135. John (You Can Call Me Grandma) Murphy Posted: March 16, 2005 at 11:36 PM (#1201974)
They had no problem moving the Staten Island Scot in 1951 when the Say Hey Kid arrived.

I think we can safely conclude that Irvin (or anybody else alive at the time, for that matter) wasn't Mays in center. :-)
   136. jimd Posted: March 16, 2005 at 11:54 PM (#1202000)
Minoso - he wasn't going to knock Rosen out at third.

Ken Keltner was the Indian 3B during Minoso's brief tryout in '49; Rosen the prospect in waiting.

Minoso was traded to the White Sox in late April '51, his first season as a regular. After the trade, the Sox purchased Bob Dillinger and the two split the 3B duties. The team was so impressed with both that Hector Rodriguez got the job in 1952. They were so impressed with Hector that they had an open tryout in 1953 (Bob Elliott, Vern Stephens, Rocky Krsnich, etc.)
   137. Mark Shirk (jsch) Posted: March 17, 2005 at 12:02 AM (#1202014)
I would like to say that I don't have all NeL stars moving to less demanding defensive positions. I would have Lundy, Lloyd, Charleston, Santop, Grant (was 2B a defensive position in Grant's day?), Johnson, Marcelle, Poles, etc. playing their NeL positions. Only for guys like Beckwith, Wilson, and Suttles (1B instead of LF) do I have some sort of position shift.

The shift is probably the biggest for Beckwith. I move him from SS/3B to 3B/1B (which is different from 1B/3B). Suttles is a 1B, and Wilson a 3B/1B, but without as much defensive value as Beckwith.

I hope that clears things up. I don't think my point of view is that extreme. I would say that it makes more sense than giving every NeL player full defensive credit while shifting him to a harder league. It also makes more sense than downgrading all players.
   138. Gary A Posted: March 17, 2005 at 12:17 AM (#1202033)
Actually, what you've got for those guys isn't that different from their NeL positions. Beckwith was a 3B/1B (with a little time at catcher) for two seasons in 1922-23; Suttles was overwhelmingly a 1B (he played LF a couple of years in the beginning of his career, then for a year or two later on while Giles displaced him at 1B); and Wilson was a 3B/1B (he played second a little, too).
   139. John (You Can Call Me Grandma) Murphy Posted: March 17, 2005 at 12:22 AM (#1202039)
The team was so impressed with both that Hector Rodriguez got the job in 1952. They were so impressed with Hector that they had an open tryout in 1953 (Bob Elliott, Vern Stephens, Rocky Krsnich, etc.)

Since WS has him as the leader for third basemen in 1952 (in only 124 games, mind you), it doesn't sound like Rodriguez lost his job due to his fielding. Sounds like he was pretty impressive in the field.

How was Minoso viewed as a third baseman in the NeL? If he were just OK, it would make sense to replace him with a superior fielding third baseman without it being a NeL-to-majors defensive spectrum correction.
   140. jimd Posted: March 17, 2005 at 12:33 AM (#1202045)
I think we can safely conclude that Irvin (or anybody else alive at the time, for that matter) wasn't Mays in center. :-)

The point, of course, is that they weren't inflexible about the job. I don't know what Irvin's NeL rep at center was, but if he was gold-glove quality, they probably would have moved Bobby Thomson, who wasn't long-established in center anyway.

Thomson was the regular CF in 1947. He moved to left in 1948 when Whitey Lockman came up. They were then swapped in 1949, and left alone in 1950. Mays arrives in 1951, so Bobby is tried at 3rd, which, surprisingly, lasts a little more than a year. Thomson gets his old job back when Mays goes in the service. When Mays gets out, Thomson is traded to Milwaukee, which moves him to LF.

Meanwhile, Irvin plays only 1 game in CF during his career, and that's probably while Mays was away. The Giants had the opportunity to give him the job for the duration, but chose not to.
   141. jimd Posted: March 17, 2005 at 12:38 AM (#1202054)
Since WS has him as the leader for third basemen in 1952 (in only 124 games, mind you), it doesn't sound like Rodriguez lost his job due to his fielding. Sounds like he was pretty impressive in the field.

Must have been the reverse then, that Hector wasn't that impressed with the White Sox ;-) He was from Cuba, 32, and played only that one season. Nevertheless, with Hector gone, the White Sox didn't turn to Minoso, but attempted a number of alternate solutions in 1953.
   142. jimd Posted: March 17, 2005 at 01:17 AM (#1202113)
I would like to say that I don't have all NeL stars moving to less demanding defensive positions.

Neither do I, and I'm not making any argument for a "systematic" shift. I think it's just common-sense to note that young NeL players arriving in the majors would have been just as likely to be shifted as AAA players. Those with "glowing" defensive reps would stay in place; those who get the "Fielding? He was pretty good." treatment are likely to get shifted down a position or two, just like Hornsby/Frisch/etc.
   143. Mark Shirk (jsch) Posted: March 17, 2005 at 02:05 AM (#1202177)
How much time did Wilson spend at 1B and when (age) did he shift? I must have missed this in the Wilson thread.

Also, I would give Beckwith no time (less than a season) at C and SS, and a little more time at 1B. Somewhere in between Dick Allen and George Brett when it comes to the percentage of time spent at 3B and the percentage of time spent at 1B.

Either way, both are in my top 10, but these position shifts (and a few other things like Beckwith's negative influence) keep them out of my top 3 (Grove, Hartnett, Jennings).
   144. John (You Can Call Me Grandma) Murphy Posted: March 17, 2005 at 02:40 AM (#1202223)
Meanwhile, Irvin plays only 1 game in CF during his career, and that's probably while Mays was away. The Giants had the opportunity to give him the job for the duration, but chose not to.

Irvin was thirty when he finally made it to the majors, so he might have lost a step in the outfield. Maybe others here know something about his fielding.

Either way, both are in my top 10,

If only others here had Beckwith in their top-ten, we wouldn't be having this conversation now. :-)
   145. Gary A Posted: March 17, 2005 at 04:39 AM (#1202451)
Chris Cobb has Wilson's positions year by year according to Holway over on the Wilson thread. He played 1b 1923-26; 3b 1927-33; 1b 1934-37; 3b 1938-41 (ut in 40). In 1928 he did NOT play primarily second (which is where Holway puts him); he was a third baseman, with a few games at second. Holway is very unreliable on 1928, especially the east.

On Beckwith's "negative influence": that seems to come mostly from Riley. Both Gadfly's contributions and research into contemporary newspapers have made Riley's characterizations extremely suspect, especially as he makes factual assertions that are demonstrably false (e.g., that Beckwith left Chicago without finishing the season in 1923 because he was in trouble with the law). I'm going to keep researching this, but I'm afraid I don't trust Riley on Beckwith at all.
   146. jimd Posted: March 17, 2005 at 05:14 AM (#1202505)
Irvin was thirty when he finally made it to the majors, so he might have lost a step in the outfield.

Which of course brings up the other end of the career. Excellent fielders such as Sewell and Maranville were not immune to position shifts when they got older. I'm sure the same would have happened sooner to some (but not all) NeL players during the individual's decline phase if they had played in the majors.
   147. Mark Shirk (jsch) Posted: March 17, 2005 at 03:42 PM (#1202855)
Wilson had a big injury at an advanced age in 1937 right? Would it be plausible that he may not have been a regular at any other position than say 1B or maybe LF (in the Majors) after that injury to keep his bat in the lineup? For how many of those years should he be slated as a part-timer?

Taking his full-time career (1923-1937) he seems to have played 8 seasons at 1B and 7 seasons at 3B. Who played 3B instead of him from 1923-1927? Was he a good 3B, or did the team just not trust the kid to play 3B?

Now, it is possible, maybe even probable that had he come to a Major League team in 1923 that didn't have a 3B he would have played there. However, one could also argue that he would have switched over to 1B a little sooner because MLB had a higher replacement level (or at least they do in my mind). So 8 seasons at 1B and 7 seasons at 3B, with his best seasons coming at 3B, doesn't sound unreasonable. The more I read about the guy the more I see Edgar Martinez had he been left alone in the field. A HOMer but not an inner circle guy.
   148. DavidFoss Posted: March 17, 2005 at 04:13 PM (#1202897)
Wilson had a big injury at an advanced age in 1937 right? Would it be plausible that he may not have been a regular at any other position than say 1B or maybe LF (in the Majors) after that injury to keep his bat in the lineup? For how many of those years should he be slated as a part-timer?

Several of the career estimates in the Jud Wilson thread are cutting off his career at 1938. Seems reasonable.
   149. Gary A Posted: March 17, 2005 at 05:52 PM (#1203138)
Black Sox third basemen 1923-26:
1923 Julio Rojo
1924 Harry Blackman (Beckwith was at ss)
1925 Harry Jeffries (Beckwith at ss)

Actually, Wilson was at third base starting in 1926, when Ben Taylor arrived to manage and play first.

Rojo was normally a catcher; Blackman was a very well regarded third baseman who died during the season (I don't think Wilson took his place at third, but I can check); Jeffries had a longish career but was not regarded as a top player (don't know about his fielding).
   150. Mark Shirk (jsch) Posted: March 17, 2005 at 10:19 PM (#1203884)
So why might Wilson have spent his first three years as a 1B? Could it be because 1B defense was valued more than 3B defense? Only Blackman seems to be a guy who would have been good enough to keep a younger player off the position.
   151. Gary A Posted: March 18, 2005 at 02:03 AM (#1204421)
No idea. One thing to keep in mind is that this positional information comes from Holway, and he's been known to get such things spectacularly wrong (I have quite a few examples, if anybody's interested). In the next day or two I'll go through what box scores I have for those years and see if that's really an accurate reflection of the positions those guys were playing.
   152. Mark Shirk (jsch) Posted: March 18, 2005 at 02:06 AM (#1204431)
Thank You Gary
   153. Paul Wendt Posted: March 18, 2005 at 06:09 AM (#1204755)
Chris Cobb #126
1934
NL 293 med, 279 avg
AL 298 med, 279 avg
EW 267 med, 261 avg


This is troubling: this is the only season for which we have both median (starter median) and average (league mean) data for any Negro League, and the difference between med and avg is much less --about half?-- than usual in the majors.

Note: All quantitative judgments are relative, NeL relative to MLB.

The black median and average might, statistically, be close because the black substitute and regular players were close in quality. Fat chance. HOM theorists have argued the contrary, that the quality range within the black player pool is large.

Probably, the black median and average are close because the black teams use their regulars more, their substitutes less; thus the regulars who define the median have greater weight in the mean. (Probably in turn because the black teams use smaller rosters, because they are close to the margin economically.)

But heavy use of the regular players threatens the whole comparative analysis.
   154. Gary A Posted: March 18, 2005 at 07:28 AM (#1204801)
Well, one issue is that black teams had much smaller rosters (15-16 players), thus fewer substitutes to use. Another issue is that, in this case, the median and mean come from different sources. I could produce median figures for 1921 and 1928 (and soon 1916) to compare with averages I've already got.
   155. OCF Posted: March 19, 2005 at 02:03 AM (#1205981)
The fact that the means and medians are closer together in the Negro Leagues may well be an artifact of bad data, but if it is real, it may very well be the result of and quite consistent with a broader distribution of talent.

To take an old Bill James point from one of his mid-80's Abstracts: Let's say the league mean BA is .260. How many players can hit 90 points above that, .350? Precious few, if there are any at all, and their names are household words. Now, how many players can hit 90 points below that, .170? One answer is "hundreds", but the real answer is none - you can't keep a major league job hitting that. The result is that BA (and any other stat you care to name, like OPS) has a very skewed distribution, with a blunted off lower end and long tail at the high end. In such a skewed distribution, the mean is higher than the median, with the size of the gap attributable to the degree to which the distribution is skewed.

Now, what if the Negro Leagues had fewer resources available for producing "ordinary" players? It's been noted that they used small rosters, so everyone on such a roster would get plenty of playing time. Did more far-below-average players hang onto jobs than would be true in the major league setting? If so, that would make the distribution less skewed and bring the mean and median closer together.
   156. KJOK Posted: March 19, 2005 at 04:38 AM (#1206331)
Now, what if the Negro Leagues had fewer resources available for producing "ordinary" players? It's been noted that they used small rosters, so everyone on such a roster would get plenty of playing time. Did more far-below-average players hang onto jobs than would be true in the major league setting? If so, that would make the distribution less skewed and bring the mean and median closer together.

This seems very plausible to me. You didn't really have a 'feeder' system of minor leagues the way organized baseball had, so if a player had at least established some level of competence and he didn't give his employers any other reason to remove him from the league, he probably could stay employed on a lot of teams.
   157. jimd Posted: March 19, 2005 at 04:39 AM (#1206333)
Did more far-below-average players hang onto jobs than would be true in the major league setting? If so, that would make the distribution less skewed and bring the mean and median closer together.

Instead of a pyramid, a "Washington monument", a "uniform" distribution, with a similar number of players below average as above, and replacement level nearly as far below average as the greats were above.

How far below average does this imply?

Personally, I lean towards the bad data. This talent distribution is difficult to comprehend.
   158. Paul Wendt Posted: March 19, 2005 at 05:30 AM (#1206429)
OCF makes a good point but it is not a candidate explanation for the superficially high league mean in the Negro Leagues. "Superficially" because the NeL league mean is high relative to the median regular player not the median player. But OCF's point explains a low league mean, not a high one.

In order to proceed, we should have not only Gary A's resolution of the "two sources" problem but also four rather than two statistics for each league-season:
- league mean (in hand)
- league median - properly calculated!
- mean regular player
- median regular player (in hand)

--
My guess: the NeL 1934 phenomenon is real and the explanation is chiefly in roster size & usage.
   159. Gary A Posted: March 22, 2005 at 06:06 AM (#1210672)
I've just finished the first stage of one project, compiling the 1916 NeL season. I found 102 games played between top teams in the west (midwest) that season, including the NY Lincoln Stars, who toured for a few weeks. Of those 102 games, I have box scores for 86 (84%). This represents substantially more games than Holway has in his Complete Book (Holway's W-L in parens):

Chicago American Gts 34-21-3 (10-5)
Indianapolis ABCs 22-15-3 (12-18)
St. Louis Giants 11-8-1 (0-2)
All-Nations Club 4-4-2 (6-2)
Lincoln Stars 4-10 (listed in east as 3-7)
Bowser's ABCs 1-4 (0-3)
Montgomery Gray Sox* 0-3 (not listed)
Chicago Giants 0-3 (not listed)
*-Eventually, I will probably take out Montgomery's three games with the ABCs.

My research has so far only encompassed three of the easiest papers to get, the Chicago Defender, Chicago Tribune, and Indianapolis Freeman.

Anyway, of more interest here are the "league" totals:

G-86
AB-5489
H-1370
D-193
T-68
HR-18
R-771
W-572
K-689
HP-53
SH-110
SB-195
AVE-.250
OBA-.326
SLG-.319
R/G-4.48 (per team)
R/9 inn-4.79
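For anyone who wants to check the rate stats against the raw totals, here is a small Python sketch. It assumes OBA is figured as (H + W + HP) / (AB + W + HP), with sacrifice hits left out of the denominator; that is my assumption, though it does reproduce the posted figure. R/9 inn is not computed because the totals above don't include innings.

```python
# 1916 "league" totals as posted above
totals_1916 = {
    "G": 86, "AB": 5489, "H": 1370, "D": 193, "T": 68, "HR": 18,
    "R": 771, "W": 572, "HP": 53,
}


def rate_stats(t):
    """Batting average, on-base average, slugging, and runs per team game."""
    total_bases = t["H"] + t["D"] + 2 * t["T"] + 3 * t["HR"]
    times_on_base = t["H"] + t["W"] + t["HP"]
    return {
        "AVE": t["H"] / t["AB"],
        # Assumption: sacrifice hits are excluded from the OBA denominator
        "OBA": times_on_base / (t["AB"] + t["W"] + t["HP"]),
        "SLG": total_bases / t["AB"],
        # Each game has two teams, so per-team scoring divides by 2 * G
        "R/G": t["R"] / (2 * t["G"]),
    }


stats_1916 = rate_stats(totals_1916)
for name, value in stats_1916.items():
    print(name, round(value, 3))
```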
   160. Gary A Posted: March 22, 2005 at 06:24 AM (#1210687)
1916 Park Factors

There was no league of course, so there was no "schedule" as we understand it. Most games were played in either Chicago or Indianapolis (with a few in St. Louis and some neutral locations like Redland Field in Cincinnati), and the ABCs and American Giants tended to stay at home most of the time. So, for what it's worth, here are "raw" park factors for runs, or home/away ratio:

team/park/pf/(home games, road games)
Chi Am Gts / Schorling's Park / 86 (49,9)
Ind ABCs / Federal League Park / 116 (29,11)
St Louis / Federal League Park / 135 (15,5)
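To spell out what a "raw" home/away ratio means, here is a minimal Python sketch. It assumes the factor is 100 times runs per game (both teams combined) at home divided by the same rate on the road. The function name is made up, and the run totals in the example are hypothetical placeholders, NOT the 1916 figures behind the numbers above.

```python
def raw_park_factor(home_runs, home_games, road_runs, road_games):
    """100 * (runs per game at home) / (runs per game on the road),
    counting runs scored by both teams in each game."""
    home_rate = home_runs / home_games
    road_rate = road_runs / road_games
    return 100 * home_rate / road_rate


# Hypothetical totals for illustration only
pf = raw_park_factor(home_runs=300, home_games=40, road_runs=70, road_games=10)
print(round(pf))
```

With as few as 5 to 11 road games behind some of these ratios, a single high-scoring game can move the factor a long way, which is why they are labeled "raw".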

The Cuban Stars alone played in all three parks; runs/game for both teams combined in each:

Schorling's Park 29 games, 6.45
Indianapolis FL Park 9 games, 7.44
St. Louis FL Park 4 games, 12.00

Interestingly, the STATS Sourcebook lists the Indianapolis FL Park runs park factor as 130; the St. Louis park is 114.

Oh, and by the way: the "league" fielding percentage was .951.

This is just the first stage of work on this season, but I thought it would be of interest for anyone trying to make sense of pre-league NeL careers.
   161. Gary A Posted: March 22, 2005 at 06:44 AM (#1210720)
By the way, the combined record for the Lincoln Stars, east and west, is 8-13 (Holway has 3-7).

I only have box scores for 17 of 35 known games in the east. Here are the eastern standings, if anyone's interested, with Holway's W-L in parens again:

Bacharach Gts 2-1 (0-4)
Lincoln Gts 12-7 (10-9)
Brooklyn Royal Gts 11-7 (3-0)
Lincoln Stars 4-3 (3-7; might include games in west)
Long Branch Cubans 5-5 (not listed)
Cuban Stars (East) 1-7 (1-3)
Baltimore Black Sox 0-5 (not listed)
   162. Gary A Posted: March 22, 2005 at 06:51 AM (#1210732)
Duh--I left the Cuban Stars out of the western standings. Their record was 21-29-1; Holway has them at 9-14.
   163. Brent Posted: March 22, 2005 at 07:03 AM (#1210748)
This is great stuff, Gary. I notice that the W/AB rate is actually higher than for either of the major leagues.
   164. king_of_rock Posted: March 28, 2005 at 09:34 PM (#1221375)
I am getting desperate! Been reading this thread and have a simple query...

Is there a formula to quickly make MLE conversions? I really would like to convert probably only about 30 to 50 players' minor-league stats from the last few years, and surely SOMEONE must have made some sort of conversion factor for this purpose...if you know of the formula, or can point me in the right direction, I would be VERY grateful.

Everyone seems to mention Ron Shandler's book, but I am loath to shell out $30 for something I can do myself with a tiny bit of guidance.

Really appreciate your help, and hope you do not mind this interruption to your scholarly discussion!

Have a great day!

Sincerely,

Ed

   165. Paul Wendt Posted: March 29, 2005 at 05:22 AM (#1221978)
Gary,
How commonly do the box scores report team totals for lob (left on base), roe (reached on error), or simply errors?

Have you tried to balance box scores and, if so, with what success rate?
   166. Gary A Posted: March 29, 2005 at 10:28 PM (#1222823)
Paul,

As you can imagine, the quality of box scores varies widely according to era, city, and newspaper.

LOB--This appears in many "good" box scores--possibly 35-40% or so of all box scores.

ROE--Almost always a team total ("reached first base on errors--Kansas City 2, St. Louis 1")--appears with less frequency than LOB. Maybe 15-20% of box scores.

Errors appear in almost every box score (or else you can be pretty confident there were no errors in the game). Errors, putouts, assists, hits, extra base hits, and pitchers' walks and strikeouts all appear in almost every box score (and in every one I have for 1916).

I try to balance every box score, if there is enough information. I'd say they balance about 75% of the time. When they don't, I believe the culprit is most often sac hits (either counted mistakenly as at bats, or not counted as at bats but not listed in the bottom section) or missing players (usually pinch hitters or other late inning subs).
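For readers who haven't balanced a box score before, the usual check is that a team's plate appearances (at bats plus walks, hit batsmen, sacrifices, and any catcher's interference) equal its runs plus its men left on base plus the opposing team's putouts. A minimal Python sketch follows; the function name and the sample team line are hypothetical, not taken from an actual 1916 box score.

```python
def box_score_balances(ab, bb, hp, sh, runs, lob, opp_putouts, sf=0, ci=0):
    """True when every plate appearance is accounted for: each batter
    either scored, was left on base, or was put out."""
    plate_appearances = ab + bb + hp + sh + sf + ci
    accounted_for = runs + lob + opp_putouts
    return plate_appearances == accounted_for


# Hypothetical team line that balances: 39 plate appearances, 39 accounted for
good = box_score_balances(ab=34, bb=3, hp=1, sh=1, runs=5, lob=7, opp_putouts=27)

# Same game, but the sacrifice hit was left out of the bottom section
missing_sh = box_score_balances(ab=34, bb=3, hp=1, sh=0, runs=5, lob=7, opp_putouts=27)

print(good, missing_sh)
```

The second call shows the sac-hit problem described above: one unlisted sacrifice leaves the batting side a plate appearance short.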
   167. Gary A Posted: April 03, 2005 at 06:22 AM (#1228778)
I looked into the issue of league mean, median, etc. Using the 1928 NNL, I calculated the following:

For regular players--top 72 in plate appearances, that is, 9 per team--(with league ave. in parens)

batting ave.--.289 (.278)
OBA--.346 (.333)
SLG--.404 (.384)

"mean batting ave" (or ave. calculated with every player given equal weight)--.284 (.278)
mean OBA--.340 (.333)
mean SLG-.390 (.384)

median (that is, halfway between the 36th and 37th player) ave.--.280 (.278)
median OBA--.339 (.333)
median SLG--.381 (.384)

Interestingly, the median for regular players is the closest to the overall league average, though the "mean average" is fairly close.

For all players:

Mean ave.--.237 (.278)
Mean OBA--.286 (.333)
Mean SLG--.310 (.384)

Median ave.--.250 (.278)
Median OBA--.308 (.333)
Median SLG--.316 (.384)

For the regulars, I chose 9 players per team because most teams had at least 2 catchers who shared most of the load, plus Holway commonly lists one or more utility men.
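Since three different "averages" are in play here, a small Python sketch may help keep them straight: the league average weights every at bat equally, the "mean average" weights every player equally, and the median is the middle player's average. The five player lines below are hypothetical, not real 1928 players.

```python
from statistics import mean, median

# Hypothetical (hits, at bats) lines for illustration only
players = [(60, 200), (45, 180), (30, 120), (10, 50), (2, 20)]

# League average: total hits over total at bats, so heavy-use players dominate
league_ave = sum(h for h, ab in players) / sum(ab for h, ab in players)

# Each player's own average, then every player given equal weight
individual = [h / ab for h, ab in players]
mean_ave = mean(individual)
median_ave = median(individual)

print(round(league_ave, 3), round(mean_ave, 3), round(median_ave, 3))
```

In this toy example the light-hitting players with few at bats drag the equal-weight mean well below the league average, which is the same direction as the "all players" figures above.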
   168. Gary A Posted: April 03, 2005 at 06:30 AM (#1228787)
The same figures for the top 64, or 8 per team (again with league aves. in parens):

batting ave.--.292 (.278)
OBA--.348 (.333)
SLG--.410 (.384)

mean ave.--.289 (.278)
mean OBA--.345 (.333)
mean SLG--.399 (.384)

median ave.--.284 (.278)
median OBA--.344 (.333)
median SLG--.384 (.384)
   169. Gary A Posted: April 03, 2005 at 06:35 AM (#1228799)
Ranges for the 64 regulars:

ave.--.400 (Lowe, Mem.) to .136 (Montalvo, Cuban Stars)
OBA--.449 (Lowe, Mem.) to .211 (Montalvo, Cuban Stars)
SLG--.699 (Wells, StL) to .189 (Montalvo, Cuban Stars)

William Lowe was a utility player with 144 plate appearances (which tied for 63rd). Montalvo was a regular OF for the Cuban Stars who had been known as a slugger in the past, but couldn't hit the side of a barn in 1928.
   170. Chris Cobb Posted: April 03, 2005 at 07:54 PM (#1229321)
Median average, 8 players per team, is the figure I have been using from Holway's data.

Gary's 1928 data, coupled with the data from 1934 comparing the median from Holway with the league average Gary calculated, certainly suggest strongly that the difference between that median and the league average is somewhat smaller in the NeL than it typically was in the majors at that time.
   171. Gary A Posted: April 11, 2005 at 06:02 PM (#1247739)
I found some team totals for the 1930 NNL, in the article from the 1989 Baseball Research Journal that presents NNL stats from the original Negro League Committee project. They're not complete (no hits or at bats, for example, just team batting averages), and there are some other problems (teams' runs and runs allowed don't balance), but I think I was able to work out some useful league-wide information.

I won't bore you with the details of how I worked all this out (but I can if anyone wants!). Here's 1930, with comparable 1928 figures for comparison:

(All figures per team game.)
        1928     1930
AB    33.633   33.667
H      9.361    9.395
D      1.415    1.521
T      0.498    0.696
HR     0.379    0.472
BB     2.361    2.768
SO     3.798    4.431
SB     0.802    0.778
AVE     .278     .279
OBA     .333     .341*
SLG     .384     .408
R/G     5.03     5.36

*assuming HP rates are same in '30 as in '28.

Holway has some team figures for the 1929 NNL--they are more limited, with just team batting averages, home runs, stolen bases, and runs/game. Weighting the team averages by each team's games played (this works perfectly for 1928--the average I obtain that way matches the real league average exactly), I came up with a .290 league batting average. The league scored 5.37 runs/game. The HR/G is .563, SB/G .773.
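The games-weighted league average described above can be sketched in a few lines of Python. The three teams below are hypothetical placeholders, not Holway's 1929 figures, and the function name is made up. Weighting by games is a stand-in for weighting by at bats, which is what a true league average would use.

```python
def games_weighted_average(teams):
    """League batting average estimated from (team average, games) pairs,
    weighting each team's average by its games played."""
    total_games = sum(games for _, games in teams)
    return sum(avg * games for avg, games in teams) / total_games


# Hypothetical (team batting average, games played) pairs
teams = [(0.310, 80), (0.280, 60), (0.250, 40)]

weighted = games_weighted_average(teams)
unweighted = sum(avg for avg, _ in teams) / len(teams)
print(round(weighted, 3), round(unweighted, 3))
```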
   172. Gary A Posted: May 04, 2005 at 05:28 PM (#1310573)
I seem to recall somebody mentioning that they'd like to see NeL totals for 1921 and 1928. I'm surprised I haven't posted them, but here is 1921 (there are some minor balancing issues between pitching and batting stats--these are batting stats):

1921 NNL+ (NNL+ associate members Bacharach Gts, Hilldale, Cleveland Tate Stars, Pittsburgh Keystones)

G-357
AB-23869
H-6372
D-882
T-377
HR-238
R-3634
W-1872
K-2910
HP-301
SH-867
SB-858
AVE-.267
OBA-.329
SLG-.366
R/G-5.10
R/9 inn.-5.28
   173. Gary A Posted: May 04, 2005 at 08:35 PM (#1311075)
1928 NNL (+ a few games involving independent teams: Chicago Giants, Nashville Elite Giants, plus the Bacharachs' western tour). These stats are balanced between pitching and batting.

G-305
AB-20516
H-863
T-304
HR-231
R-3071
W-1440
K-2317
HP-240
SH-746
SB-489
AVE-.278
OBA-.333
SLG-.384
R/G-5.03
R/9 inn.-5.26
   174. Gary A Posted: May 04, 2005 at 08:36 PM (#1311079)
The correct totals for hits and doubles:
H: 5710
D: 863
   175. Dr. Chaleeko Posted: May 04, 2005 at 09:37 PM (#1311276)
For 1928 that's 1 walk per every 14 AB, for 1921 it's 13 AB per walk, and 1930 (listed above) chimes in at 12 AB per walk.
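Those rates can be checked in Python from the figures Gary posted: season totals for 1921 and 1928, and per-team-game rates for 1930 (the ratio works out the same either way). The variable names are made up for illustration.

```python
# year: (at bats, walks) as posted above
walk_data = {
    1921: (23869, 1872),     # season totals
    1928: (20516, 1440),     # season totals
    1930: (33.667, 2.768),   # per-team-game rates
}

# At bats per walk for each season
ab_per_walk = {year: ab / walks for year, (ab, walks) in walk_data.items()}

for year, rate in sorted(ab_per_walk.items()):
    print(year, round(rate, 1))
```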
   176. Brent Posted: May 05, 2005 at 02:21 AM (#1312492)
1921 NNL+ ...

AVE-.267
OBA-.329
SLG-.366
R/G-5.10


Compare with 1919 American League:

AVE-.268
OBA-.329
SLG-.359
R/G-4.09

A lot more runs are being scored with similar rate statistics. Any ideas why?
   177. Chris Cobb Posted: May 05, 2005 at 03:33 AM (#1312788)
Well, check out the National League, 1889

Avg. .264
obp. .329
slg. .359
r/g 5.84

It's almost certainly a much higher incidence of errors.
   178. Brent Posted: May 05, 2005 at 03:43 AM (#1312816)
Which benefited batters like... maybe Cool Papa Bell?
   179. Gary A Posted: May 05, 2005 at 04:36 AM (#1312903)
Yep, it's the errors:

The 1921 NNL
AVE-.267
OBA-.327
SLG-.366
FPCT-.948
R/G-5.10

1900 National League
AVE-.279
OBA-.321
SLG-.366
FPCT-.942
R/G-5.21

1901 American League
AVE-.277
OBA-.326
SLG-.371
FPCT-.938
R/G-5.35

1902 American League:
AVE-.275
OBA-.331
SLG-.369
FPCT-.949
R/G-4.89

1911 National League:
AVE-.260
OBA-.329
SLG-.358
FPCT-.958
R/G-4.42

*******

1928 NNL
AVE-.278
OBA-.333
SLG-.384
FPCT-.955
R/G-5.03

1901 American League
AVE-.277
OBA-.326
SLG-.371
FPCT-.938
R/G-5.35

1926 National League
AVE-.280
OBA-.335
SLG-.386
FPCT-.968
R/G-4.54

1927 National League
AVE-.282
OBA-.335
SLG-.386
FPCT-.968
R/G-4.58

1931 National League
AVE-.277
OBA-.331
SLG-.387
FPCT-.971
R/G-4.48

1936 National League
AVE-.278
OBA-.332
SLG-.386
FPCT-.969
R/G-4.71
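One rough way to put a number on the error effect is to correlate fielding percentage with runs per game across the seasons listed in this post. This is only a sketch: the sample is ten hand-picked seasons with similar (not identical) rate stats, and the duplicated 1901 AL line is entered once. The correlation comes out strongly negative, which fits the errors explanation but does not prove it.

```python
import math

# (season, FPCT, R/G) as posted above; 1901 AL appears once.
seasons = [
    ("1921 NNL", 0.948, 5.10),
    ("1900 NL", 0.942, 5.21),
    ("1901 AL", 0.938, 5.35),
    ("1902 AL", 0.949, 4.89),
    ("1911 NL", 0.958, 4.42),
    ("1928 NNL", 0.955, 5.03),
    ("1926 NL", 0.968, 4.54),
    ("1927 NL", 0.968, 4.58),
    ("1931 NL", 0.971, 4.48),
    ("1936 NL", 0.969, 4.71),
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

fpct = [s[1] for s in seasons]
runs = [s[2] for s in seasons]
r = pearson(fpct, runs)
print(round(r, 2))
```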
   180. KJOK Posted: June 30, 2005 at 02:52 AM (#1440654)
Was trying to put together a spreadsheet of all the MLE's, and from what I can see, we are "missing" good MLE's for the following major players:

C - Ted Radcliffe
2B- Newt Allen, Sammy Hughes
SS- Dobie Moore, Pedro Cepeda
3B- Judy Johnson
LF- Turkey Stearnes
RF- Chino Smith
P - Chet Brewer

If these are out there and I've just missed them, please let me know.

Also, anyone interested in doing any of these? I think it helps with comparisons to have as many as possible...
   181. KJOK Posted: June 30, 2005 at 03:02 AM (#1440681)
Maybe more importantly, the guys about to come up who'll need MLE's:

Buck O'Neil
Willard Brown
Sam Jethroe
Luke Easter
Ray Dandridge
Bill Wright
Satchel Paige
   182. Jeff M Posted: July 03, 2005 at 02:51 PM (#1446645)
Stearnes is in, so I wouldn't worry about him, but I think Moore and Brewer are particularly important from the old list. Also, I would add Ted Strong and Joe Greene, since they did not even get separate threads.
   183. Gary A Posted: July 03, 2005 at 05:06 PM (#1446697)
I would add James "Bus" (or "Buster") Clarkson, another SS/3B who has been largely overlooked. He hit for power and walked a good deal, so he fits the profile of Negro Leaguers who have benefited from our analysis.
   184. Dr. Chaleeko Posted: July 05, 2005 at 01:28 PM (#1449694)
Since it's fast becoming an issue for us, I thought I'd take a look at the Mexican League in comparison to MLB over the period 1938-1970 to see what players who were in both could tell us about the quality of play South of the Border. I looked mostly at batting, but also took a cursory look at pitching.

This is definitely a quick-n-dirty approach.... After identifying players who were in both leagues during the period, I noted their AB, AGE, AVG, OBP, SLG, AB/BB, AB/K, K/BB, and SB/ToB. No adjustments for park, league, era, age, or anything. From there I simply computed the ratio of their play from Mexico over the play from MLB. I grouped together seasons to avoid small samples, so I typically looked at two and three year groupings in Mexico, then looked at the surrounding MLB seasons where available...or vice versa when the information available dictated doing so. I looked at both cases and an aggregate total as well.

In total there were about 30 players in the study; however, not all of them presented appropriate cases for direct comparison due to small samples (for instance Willard Brown). In those instances, I retained their numbers for the overall average, but didn't use them for individual cases.

OK, so the basic gist is that in the aggregate, I looked at about 18,000-20,000 ABs in each league, and here are the results.
         RATIO MEX/MLB
AVG      1.12
OBP      1.17
SLG      1.15
SB/TOB   1.50
AB/K     1.22
AB/BB    0.67
K/BB     0.55


So in Mexico they ran a lot more, struck out less often, walked more often. Based on how conservatively MLB played in this era and based on the MxL pitching data we've seen recently, these should not come as much of a surprise.

What's more interesting, however, is when we narrow down the time frame to 1938-1950:

         RATIO MEX/MLB
AVG      1.10
OBP      1.15
SLG      1.09
SB/TOB   1.66
AB/K     1.33
AB/BB    0.50
K/BB     0.67

Play was closer to MLB in the pre- and post-war periods and perhaps strongly suggestive of a 10-15% discount rate.

On the pitching side... I looked at Lanier, Maglie, Paige, Solis, and D. Bankhead to just skim the surface.
         1938-1950
         RATIO MEX/MLB
ERA      1.00
H/9      1.04
BB/9     1.13
K/9      0.89
K/BB     0.79

More walks, fewer Ks, but stable ERAs. Actually this sample is mostly loaded up with Maglie and Lanier whose performances in 1946-1947 were extremely close to their performances in surrounding MLB seasons. So there's some selection bias there.

Anyway, this isn't the most rigorous study anyone's done on league quality, but it is potentially instructive for MLEs. Based on the fact that the MxL (in the crucial 1938-1950 period) walked 50 percent more often per AB than MLB in the period and that it struck out one-third less often than MLB, we may need to revisit Wells', Leonard's and Bell's walk totals to see if they are inflated by MxL context.

In addition, we'll want to be careful going forward that we note this issue with players like Campy, Irvin, Wright, and W Brown.

Finally, we'll need to consider the impact this data has on Maglie and Lanier as candidates.
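A quick arithmetic check on the batting ratio tables above, offered as a sketch rather than a correction. An AB-per-event ratio of r means events per AB change by 1/r - 1, and for pooled totals K/BB equals (AB/BB) / (AB/K), so the K/BB ratio should be close to the AB/BB ratio divided by the AB/K ratio. That identity is exact only for pooled totals; if the ratios were averaged player by player (the post does not say), some drift is expected.

```python
def per_ab_change(ab_per_event_ratio):
    """MEX/MLB ratio of AB-per-event -> fractional change in events per AB."""
    return 1.0 / ab_per_event_ratio - 1.0

def implied_k_per_bb_ratio(ab_per_k_ratio, ab_per_bb_ratio):
    # K/BB = (K/AB) / (BB/AB) = (AB/BB) / (AB/K)
    return ab_per_bb_ratio / ab_per_k_ratio

# Ratios as posted in the two batting tables.
tables = {
    "1938-1970": {"AB/K": 1.22, "AB/BB": 0.67, "K/BB": 0.55},
    "1938-1950": {"AB/K": 1.33, "AB/BB": 0.50, "K/BB": 0.67},
}

for period, t in tables.items():
    implied = implied_k_per_bb_ratio(t["AB/K"], t["AB/BB"])
    print(period,
          "walks per AB %+.0f%%" % (100 * per_ab_change(t["AB/BB"])),
          "Ks per AB %+.0f%%" % (100 * per_ab_change(t["AB/K"])),
          "implied K/BB %.2f vs listed %.2f" % (implied, t["K/BB"]))
```

The 1938-1970 table is self-consistent (implied K/BB of about 0.55). In the 1938-1950 table the implied K/BB is about 0.38 against a listed 0.67, and an AB/BB ratio of 0.50 would mean twice as many walks per AB rather than 50 percent more. An AB/BB ratio of 0.67 would match the "50 percent more" statement, so the AB/BB and K/BB rows may be worth double-checking against the worksheet.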
   185. Chris Cobb Posted: July 05, 2005 at 03:56 PM (#1449919)
On Leonard's and Wells' walk rates:

Once Dr. Chaleeko's data initially pointed out that walk rates were higher in the MeL than in the majors, I went back over the walk rates I had used for Wells and for Leonard, and I concluded that neither was in need of significant adjustment. In Wells' case, I had curved the numbers in with his NeL walk rate and done some adjusting to major-league walk averages already.
In Leonard's case, the walk numbers were so high that they obviously needed adjustment for a major-league context. I pegged Leonard's rates to those of high-walk players like Camilli and Ott, and I think the results fit reasonably with the fuller data set.

I have not taken a look at Bell's walk rates yet. I am planning to do new MLEs for him, since we now have a much better sense of MeL conversions than when he came on the scene -- he was our first candidate with MeL playing time.
   186. Dr. Chaleeko Posted: July 05, 2005 at 06:14 PM (#1450282)
Chris, thanks for clarifying, that eases the concerns I had.
   187. Paul Wendt Posted: May 10, 2006 at 07:42 PM (#2013030)
Willard Brown is the subject of much 1976 Ballot Discussion. Besides his short MLB trial and his low walk rate in the Negro AL, the "weak league" is a third point against him, meaning the NAL in contrast to the NNL.

But I don't care about Willard Brown in particular. Since the point is general, here it goes. (And this gives me an opportunity to alert the new and the forgetful. This is a good thread and a crucial one in the history of the project.)

How many players moved between the contemporary Negro Leagues, either directly or via stints with independent and foreign clubs? Has anyone estimated translation rates (alias equivalency, alias interleague quality) for the two Negro leagues around 1925 or around 1940? In theory, separately derived estimates for the six two-league comparisons will be inconsistent. For example, three two-league estimates might be NL:NNL .90, NNL:NAL .90 (which jointly imply NL:NAL .81) and NL:NAL .90. Here I am using one dimension to illustrate the arithmetic point clearly, and "NAL" means the "other Negro League" beside the National. But using all of the interleague career data should support a weightier joint estimate of league qualities.
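The arithmetic point can be sketched in a few lines. The example reads the middle estimate as NNL:NAL .90, so that chaining it with NL:NNL .90 gives the implied NL:NAL of .81, which conflicts with the direct NL:NAL estimate of .90. The joint estimate shown here is a simple equal-weight least-squares fit on log ratios; that technique is my own stand-in for illustration, not a method anyone in the thread has proposed, and the function name and inputs are hypothetical.

```python
import math

def fit_league_quality(estimates, reference, sweeps=200):
    """Least-squares fit of one quality number per league (reference = 1.0).

    estimates: list of (base, other, ratio), each read as
    quality[other] / quality[base] ~= ratio. All weights are equal.
    """
    leagues = {name for base, other, _ in estimates for name in (base, other)}
    log_q = {name: 0.0 for name in leagues}
    for _ in range(sweeps):
        for league in leagues:
            if league == reference:
                continue
            # Each estimate touching this league implies a value for it;
            # the least-squares update is the mean of those implied values.
            implied = []
            for base, other, ratio in estimates:
                if other == league:
                    implied.append(log_q[base] + math.log(ratio))
                elif base == league:
                    implied.append(log_q[other] - math.log(ratio))
            log_q[league] = sum(implied) / len(implied)
    return {name: math.exp(value) for name, value in log_q.items()}

estimates = [("NL", "NNL", 0.90), ("NNL", "NAL", 0.90), ("NL", "NAL", 0.90)]
chained_nl_nal = 0.90 * 0.90  # via NNL: .81, versus .90 estimated directly
quality = fit_league_quality(estimates, reference="NL")
print(round(chained_nl_nal, 2), {k: round(v, 3) for k, v in quality.items()})
```

With these three inputs the fit settles on roughly .93 for the NNL and .87 for the NAL relative to the NL, splitting the difference between the chained and direct figures. In practice each estimate would presumably be weighted by the amount of playing time behind it.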
   188. Paul Wendt Posted: May 10, 2006 at 07:46 PM (#2013037)
The same point pertains to the playing records for major/negro leaguers in other leagues or venues such as Cuba or Puerto Rico in the fall or winter.
   189. sunnyday2 Posted: May 10, 2006 at 08:12 PM (#2013072)
This is not really along the lines of Paul's comment, but Paul's comment brought it to mind:

Didn't somebody once posit a "transition" theory, i.e. that when a player changes leagues, even moving to a weaker league, that there is an adjustment period and that in order to "fairly" evaluate a player you have to isolate and remove the "adjustment"? I think it was apropos of Gavvy Cravath. It posited a Gavvy Cravath who played in the MLs continuously. It posited that he performed above his MLEs from his MiL record, because it eliminated all the adjustment periods from his record.

Theoretically I think this is right. As a practical matter, I would never try to account for it in my evaluation because 1) every player has adjustment periods and so they more or less wash out and 2) if you're giving Gavvy Cravath or any player MLE credit, you're already straining against the bonds of earth anyway and I never thought it necessary to be any more generous than that.

But in the odd case of a Willard Brown (i.e. of a VG to great player who had nothing but an adjustment in the real MLs), the theory makes a lot of sense. And it is of course this theory for which Chris Cobb provided excellent data points recently.
   190. Chris Cobb Posted: May 10, 2006 at 08:48 PM (#2013124)
Re Paul Wendt's questions about studies of league quality:

No one in the HoM project, so far as I know, has undertaken such a systematic study for the different Negro Leagues. I have done rough studies of that sort to try to calculate conversion factors for the Mexican League and the Cuban Winter League.

Re Sunnyday2's comment on the "transition" theory:

Gadfly is the great advocate of transition theory, which he illustrated in his intensive analysis of Gavvy Cravath, and also with reference to the Negro Leagues. I believe the data shows pretty clearly that some period of adjustment is nearly always necessary when a player moves to a league of higher quality. I don't fully accept Gadfly's claims about how long the transition period lasts and how one should account for the transition period when evaluating players with career paths that took them into and out of major league baseball at unusual points in their careers, but the basic claim is very strong. A rookie, regardless of age, is going to have a learning curve to climb before playing up to potential.
   191. Dr. Chaleeko Posted: May 11, 2006 at 03:52 PM (#2014562)
But in the odd case of a Willard Brown (i.e. of a VG to great player who had nothing but an adjustment in the real MLs), the theory makes a lot of sense. And it is of course this theory for which Chris Cobb provided excellent data points recently.

For a player in the NgLs in the 1935-1955 period, this is an especially important point. Brown's peripatetic journeys are nothing if not typical of a player who was a NgL stalwart then suddenly found himself fighting for jobs every year. A number of candidates have similar career paths, some so mind-bogglingly complex that they make David Cone's "hired gun" reputation or Mike Morgan's career arc look downright settled.

So if there's any credence to the "transition theory," and I think there is some (if not tons), then many NgLers of the integration period were constantly in the transition phase, between the rise of new leagues, play in regional and industrial leagues, the war, integration, going into different minor leagues, going to foreign leagues, winter ball, summer ball, etc. If the theory holds, then there is probably some effect on their numbers. It may be cancelled out in instances where players had to settle into lesser leagues, but by the same token, their stellar performances in the Man-Dak or some similar operation also led to opportunities in better leagues.

Know what I mean, Vern?
   192. jimd Posted: May 11, 2006 at 10:42 PM (#2015184)
but the basic claim is very strong. A rookie, regardless of age, is going to have a learning curve to climb before playing up to potential.

Dr.C, we have different impressions of "transition" theory. Mine is that while indeed there is always some adjustment needed for lateral transfers (AL to NL, etc.), the big adjustment is a change of quality level.

A hitter in MLB faced a pitcher who was a team staff ace (a Feller or Ford, etc.) once or twice a week on average. A hitter in the high minors faced pitchers of that quality almost never (except for the occasional top pitching prospect having a good day). So for Cravath to face Walter, Waddell or Walsh level pitching on a regular basis is a big adjustment.

The NeL players have a similar (but smaller) problem. They DO face pitchers of that quality, but just not as often. Paige and Brown are events, 3 or 4 times a year apiece. There were others that were nearly as good, at least for a short time. But they were balanced by pitchers that couldn't stick in AAA. The NeL's had a much higher variance in day-to-day quality. So the transition is to get accustomed to a steady diet of high-quality pitching with much less filler.
   193. Chris Cobb Posted: May 11, 2006 at 11:43 PM (#2015411)
The NeL players have a similar (but smaller) problem. They DO face pitchers of that quality, but just not as often.

I'd agree; I'd also suggest that they had to face pitchers of that quality who were used to facing high-quality hitters all the time. A pitcher of Satchel Paige's savvy was surely adjusting brilliantly hitter to hitter, but I would guess that other top NeL pitchers would develop the sort of habits that minor-league pitchers with great stuff have to unlearn when they get to the majors.
   194. jimd Posted: May 12, 2006 at 12:49 AM (#2015718)
That's an excellent point Chris.
   195. Dr. Chaleeko Posted: May 12, 2006 at 01:27 PM (#2016439)
Jim and Chris,

I agree 95% with you, but I think in the case of NgL players of this generation there are other issues at hand, including playing in foreign countries, playing in newly integrated leagues, and simply the ping-ponging itself. Clay Davenport did a study a couple years ago suggesting that there was some minor adjustment period for US players going to Japan (http://www.baseballprospectus.com/article.php?articleid=1348), the implication being that even players at the top rank of US baseball faced difficulty when moving to a different cultural area, despite playing in leagues of slightly lesser overall quality. Negro Leaguers were facing these same or similar pressures in addition to moving to new leagues and parks. I don't know that it's a big thing, nor that we could realistically isolate it, but I think it's worth remembering, and it may contribute in small but important ways to why so many guys' careers look so weird and why their MLB tryouts weren't initially successful (though, yes, mostly it's about adjusting to pitching).

I should say, too, that this is a fairly unique subset of players (NgL vets of the integration period). They may be comparable to the Asians and the Cuban defectors who have entered the US game in the last ten years. Each group faced very different pressures, but the NgLers probably faced the most hostility within the game since the Japanese and Cubans were free agents courted by the teams.

(Speaking of Cuban defectors, does El Duque have any shot at the HOM? Since I've never seen his Cuban stats and I don't know if he's 37---in which case the answer is no--- or 42ish---in which case, maybe?---I don't have a handle on it, which is why I'm trolling for opinions.)
   196. Mark Shirk (jsch) Posted: May 12, 2006 at 02:04 PM (#2016474)
Interesting about El Duque, does anyone know where one can find Cuban state of the Castro era? I would also like to know how Jose Contreras (who was the ace of that staff) and Omar Linares fared.
   197. Mark Shirk (jsch) Posted: May 12, 2006 at 02:04 PM (#2016475)
state = stats
   198. sunnyday2 Posted: May 12, 2006 at 03:04 PM (#2016549)
Excellent question. Are there going to be MLEs for players who played on the Cuban national team, then defected?

(Easy for me to ask, since I clearly won't be the one doing any MLEs.)
   199. Dr. Chaleeko Posted: May 13, 2006 at 08:44 PM (#2018359)
Clay Davenport ran an article about Kendry Morales last year which suggested that recent Cuban teams are of A or Hi-A quality (don't remember which now). What I can't recollect is whether or not the team was stronger in El Dookoo's era because those were pre-defection days.

I do think the Cuban numbers are out there to be found because IIRC, Ron Shandler was putting some of them into his most recent versions of the Baseball Forecaster. I don't know where they are though.
   200. John (You Can Call Me Grandma) Murphy Posted: May 13, 2006 at 08:47 PM (#2018363)
Excellent question. Are there going to be MLEs for players who played on the Cuban national team, then defected?

I'll say yes, but I'm not the final authority on matters such as this.