Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Tuesday, August 17, 2004

Cristóbal Torriente

Rube Foster’s “big gun” for the Chicago American Giants is another centerfielder to compare with Cobb, Speaker, Roush and Carey.

John (You Can Call Me Grandma) Murphy Posted: August 17, 2004 at 09:17 PM | 87 comment(s)

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. Chris Cobb Posted: August 18, 2004 at 02:19 AM (#803318)
Here's the data I've gathered for Christobal Torriente

Expert Evaluations
Bill James -- #2 Negro-League CF, #67 in top 100 players all-time
CPPD -- 96% of experts vote for HoF induction
Holway -- 9 all-star selections; first-string outfielder on his all-time Negro-League all-star team, with Oscar Charleston and Turkey Stearnes (second team is Cool Papa Bell, Pete Hill, and Wild Bill Wright)

From Holway

1913 .383 (23-60) for Cubans (west); Holway all-star
35-104 in Cuban play (.337)
1914 .358 for Cuban Stars, 4-3 as pitcher; Holway all-star
48-186 in Cuban Play (.387)
1915 .317 for Cuban Stars; Holway all-star
9-23 (.391) vs. ABCs in Cuba
56-139 (.403) in Cuban Play
1916 .357 for Cuban Stars; Holway all-star
1917 .500 for Cuban Stars, .250 for All Nations
1918 .318 for Cuban Stars; 2-2 as pitcher; Holway all-star
1919 .280 for Chi Am Giants; 5-1 as pitcher; Holway all-star (dh)
21-73 vs. major-league competition in Cuba
36-100 in Cuban Play
1920 .361 for Chi Am Giants; Holway all-star, 0-1 as pitcher
9-31 vs. major-league competition in Cuba
29-98 (.296) in Cuban Play
1921 .346 for Chi Am Giants, 6-1 as pitcher
4-15 in playoff vs. Hilldale
7-20 (.350) in Cuban Play
1922 .393 for Chi Am Giants
3-19 vs. Bacharachs in World Series
61-194 (.351) in Cuban Play
1923 .395 for Chi Am Giants; Holway all-star
1-6 vs. Detroit Tigers
.377 in Cuban Play
1924 .333 for Chi Am Giants
62-163 (.380) in Cuban Play
1925 .241 for Chi Am Giants
43-122 (.344) in Cuban Play
1926 .371 for KC Monarchs; Holway all-star
11-31 in playoff vs. Chi Am Giants
1927 .326 for Detroit Stars
1928 .333 for Detroit Stars; listed in a utility role

Career Totals
15 documented seasons as regular
856-2548 (.336) lifetime in Negro-League Play, 53 HR
mean avg. .343 in 15 seasons as regular
mean avg. .352 adj. for Chi Am Giants park
48-110 (.436) vs. major-league competition
386-1149 (.336) lifetime in Cuban Play
17-9 as pitcher

From Riley

Avgs. 1919-23 of .325, .411, .338, .342, .412
Avg. 1926 .381
Avgs. 1927-28 .339, .320

Hit .352 in 13 Cuban Winter League seasons, with avgs. of
.265, .337, .387, .402, .360, .296, .350, .346, .380, .344, .375, .222

Avg. vs. major-league competition, .313

Played centerfield most of his career

i9s career projections
1913-28
8630 ab, 2768 hits, 567 2b, 188 3b, 170 hr, 592 bb, .321 avg., .364 obp, .489 slg, .853 ops, 1511 rc

I've already run WS projections based on the i9s data using my standard 5% reduction; I'll post those shortly.
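
A quick consistency check on the Holway lines above: wherever a line gives hits, at-bats, and an average, the average can be recomputed from the other two. The sketch below does only that. The labels are informal descriptions rather than Holway's own wording, and where a line fails to match, the thread alone does not say whether the hits, the at-bats, or the average is the mistyped figure.

```python
# Hits, at-bats, and listed average for each Holway line that gives all three.
# Labels are informal descriptions added for this sketch, not Holway's wording.
HOLWAY_LINES = [
    ("1913 Cubans (west)", 23, 60, 0.383),
    ("1913 Cuban play", 35, 104, 0.337),
    ("1914 Cuban play", 48, 186, 0.387),
    ("1915 vs. ABCs", 9, 23, 0.391),
    ("1915 Cuban play", 56, 139, 0.403),
    ("1920 Cuban play", 29, 98, 0.296),
    ("1921 Cuban play", 7, 20, 0.350),
    ("1922 Cuban play", 61, 194, 0.351),
    ("1924 Cuban play", 62, 163, 0.380),
    ("1925 Cuban play", 43, 122, 0.344),
    ("Career Negro-League play", 856, 2548, 0.336),
    ("Career vs. major leaguers", 48, 110, 0.436),
    ("Career Cuban play", 386, 1149, 0.336),
]


def find_mismatches(lines):
    """Return (label, listed, recomputed) for lines where hits/AB != listed avg."""
    mismatches = []
    for label, hits, at_bats, listed in lines:
        recomputed = round(hits / at_bats, 3)
        if abs(recomputed - listed) > 1e-9:
            mismatches.append((label, listed, recomputed))
    return mismatches


for label, listed, recomputed in find_mismatches(HOLWAY_LINES):
    print(f"{label}: listed {listed:.3f}, hits/AB gives {recomputed:.3f}")
```

Any line it prints is worth re-checking against the print source before leaning on it.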
   2. Chris Cobb Posted: August 18, 2004 at 02:37 AM (#803401)
Torriente projected WS

These are based on a 5% reduction of i9s, with totals produced by finding closest major-league equivalents to Torriente’s projected production and modifying those totals.

Torriente projected as a B+ centerfielder, with his seasonal WS distribution modeled on his near-contemporary Clyde Milan. Those who see Torriente as better than that, or worse, should bump the fws accordingly.

Year – bws – fws – total
1913 – 10.3 – 1.0 – 11.3
1914 – 22.0 – 2.6 – 24.6
1915 – 26.1 – 3.0 – 29.1
1916 – 31.7 – 4.6 – 36.3
1917 – 29.7 – 6.0 – 35.7
1918 – 28.1 – 5.6 – 33.7*
1919 – 24.6 – 3.7 – 28.3*
1920 – 31.8 – 6.0 – 37.8
1921 – 19.1 – 5.3 – 24.4
1922 – 14.6 – 1.5 – 16.1
1923 – 28.8 – 5.5 – 34.3
1924 – 10.7 – 4.4 – 15.1
1925 – 10.8 – 4.8 – 15.6
1926 – 12.0 – 4.0 – 16.0
1927 – 9.0 – 2.7 – 11.7
1928 – 3.6 – 1.2 – 4.8
tot – 312.9 – 61.9 – 374.8

*These seasons projected to 154 games, not to ML-season equivalents, which would have been 128 and 140 games, respectively.
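
As a cross-check, the career line can be re-added from the bws and fws columns alone. This is a minimal sketch that only sums the posted figures; the function name is made up for illustration, and anyone who bumps the fws up or down can edit the rows and rerun it.

```python
# (year, batting win shares, fielding win shares) as posted in the table above.
ROWS = [
    (1913, 10.3, 1.0), (1914, 22.0, 2.6), (1915, 26.1, 3.0), (1916, 31.7, 4.6),
    (1917, 29.7, 6.0), (1918, 28.1, 5.6), (1919, 24.6, 3.7), (1920, 31.8, 6.0),
    (1921, 19.1, 5.3), (1922, 14.6, 1.5), (1923, 28.8, 5.5), (1924, 10.7, 4.4),
    (1925, 10.8, 4.8), (1926, 12.0, 4.0), (1927, 9.0, 2.7), (1928, 3.6, 1.2),
]


def career_totals(rows):
    """Sum the bws and fws columns and return (bws, fws, combined)."""
    bws = sum(b for _, b, _ in rows)
    fws = sum(f for _, _, f in rows)
    return round(bws, 1), round(fws, 1), round(bws + fws, 1)


for year, b, f in ROWS:
    print(f"{year}: {b:.1f} + {f:.1f} = {b + f:.1f}")
print("career:", career_totals(ROWS))
```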
   3. John (You Can Call Me Grandma) Murphy Posted: August 18, 2004 at 02:43 AM (#803424)
Are there any centerfielders from the teens that don't have over 300 WS? :-)
   4. Chris Cobb Posted: August 18, 2004 at 02:52 AM (#803459)
I think Jules Thomas will be a bit short of 300 ws. But by 1938, I think we can be confident that centerfield will no longer be under-represented in the HoM . . .
   5. The definitely immoral Eric Enders Posted: August 18, 2004 at 12:00 PM (#803786)
I would hereby like to propose that from now on, we make a concerted effort to spell Cristobal Torriente's name correctly in Hall of Merit records and discussions. Other historians, especially Bill James, have consistently failed to do so, and frankly it drives me up the wall.

The man died unknown and destitute despite being one of the greatest ballplayers who ever lived. We can at least give him the respect of spelling his name right. :)
   6. Chris Cobb Posted: August 18, 2004 at 12:52 PM (#803805)
Sorry. It's particularly difficult for me to leave the "h" out -- force of habit, you know, since I type "Chris" about 50 times a day . . .
   7. The definitely immoral Eric Enders Posted: August 18, 2004 at 12:59 PM (#803809)
Chris, that post wasn't directed at you. It was directed at Bill James et al, who keep misspelling it in the print sources and thereby confuse everybody in the process.
   8. John (You Can Call Me Grandma) Murphy Posted: August 18, 2004 at 02:01 PM (#803862)
I would hereby like to propose that from now on, we make a concerted effort to spell Cristobal Torriente's name correctly in Hall of Merit records and discussions.

For a minute there, I thought I misspelled his name. I'm glad it was directed at James rather than me. I will do my damnedest to make sure that he gets the proper respect that he deserves, Eric.
   9. Dag Nabbit: Sockless Psychopath Posted: August 18, 2004 at 05:37 PM (#804307)
In looking at his stats in the MacMillan 'cyclopedia, his numbers really impressed me. He seemed to be a better hitter than Pete Hill. In almost any other ballot he'd be 1 or 2, but he'll be a ridiculously strong 6th in 1934.
   10. The definitely immoral Eric Enders Posted: August 18, 2004 at 06:02 PM (#804355)
Somewhere in my collection I have a book, published in Havana with both English and Spanish text, which describes the famous 3-home run game Torriente once had against the New York Giants.

For those who don't know the story, the New York Giants Plus Babe Ruth traveled to Cuba after the 1920 season to play a bunch of exhibition games. Everybody wanted to see Ruth hit home runs, but instead the show was stolen by Torriente, who hit 3 homers in one game off Giants pitchers. (Only 2 were off legitimate pitchers, though; the third was off Ross Youngs.)

Anyway, I'll dig around for the book and see if there's anything worthwhile in it.
   11. KJOK Posted: August 19, 2004 at 12:06 AM (#805020)
Torriente's 1928 Stat Line:
G-39
AB-118
H-38
D-7
T-3
HR-1
R-15
BB-7
SB-2
Ave-.322
OBP-.360
SLG-.458

Fielding:
RF
G-16
Inn-121.3
PO-14
A-0
E-0
DP-0

LF
G-6
Inn-45.3
PO-9
A-0
E-0
DP-0
   12. yest Posted: August 19, 2004 at 12:44 AM (#805200)
I didn't do a thorough investigation of Torriente yet, but from a quick search on him I'm starting to feel he was worse than Santop, slightly worse than Poles, slightly better than Moore, and significantly better than HR Johnson.
Since obviously I'm alone on this one, I'm asking: what am I missing?
   13. karlmagnus Posted: August 19, 2004 at 12:57 AM (#805247)
To me it looks like Lloyd-Torriente-Hill-Santop-Johnson-Poles. These interpretations are VERY tricky.
   14. Chris Cobb Posted: August 19, 2004 at 01:15 AM (#805333)
I didn't do a thorough investigation of Torriente yet, but from a quick search on him I'm starting to feel he was worse than Santop, slightly worse than Poles, slightly better than Moore, and significantly better than HR Johnson.
Since obviously I'm alone on this one, I'm asking: what am I missing?


yest, you haven't shown or described the evidence that you're using for reaching these conclusions about these players. All the evidence that I'm using for Torriente is posted at the beginning of this thread, as is the data for Santop and Poles on theirs. If you can show data that places Torriente in a different light from that data or can explain how the posted data shows Torriente as slightly worse than Poles or Santop (neither of whom you see as ballotworthy), I might be able to answer your question. But I think some burden has to fall on you to show clearly how you arrive at conclusions that differ so greatly from the more generally held view.
   15. yest Posted: August 19, 2004 at 02:55 AM (#805637)
There are basically 4 different types of evidence given for Negro Leaguers here:
anecdotal
projections
expert opinions
and statistical
I'll try to explain myself on each of them.

Anecdotal: I haven’t seen any anecdotes on him yet.

Projections: I think any projections made on any given player are going to be inaccurate.


Expert opinions: I don’t pay too much attention to the opinions of experts or fans when they saw the players play, so why should I when they didn’t see them play? If I did, I would also be putting Grich and Maris in my PHoM, both of whom I think clearly don’t belong.

CPPD: if all the players that they wanted to put in were put in, there would be almost an equal number of Negro Leaguers and Major Leaguers from the twenties, thirties and forties in the Hall, and something seems wrong to me about those numbers.

Bill James: I don’t really agree with him on his ranking of major leaguers, whom he admits he knows better, so why should I agree with him on Negro Leaguers?

Holway: please look at this site http://baseballguru.com/jholway/analysisjholway30.html
I disagree with his major leaguers, so once again why should I agree with him on Negro Leaguers?

From Riley
Santop’s record
.406 lifetime avg. vs. all competition

Torriente’s record
Avgs. 1919-23 of .325, .411, .338, .342, .412
Avg. 1926 .381
Avgs. 1927-28 .339, .320
Hit .352 in 13 Cuban Winter League seasons, with avgs. of
.265, .337, .387, .402, .360, .296, .350, .346, .380, .344, .375, .222
Avg. vs. major-league competition, .313
It seems from Riley that Santop was clearly better.

Using Holway’s info, I’m taking Torriente’s record and comparing it to 4 eligible major leaguers, because I’m not sure his stats from Holway are much better than their actual stats were.
Torriente’s record: 1290 hits, 3807 at bats, .339 batting avg
Player A’s record: 1646 hits, 4820 at bats, .341 batting avg
Player B’s record: 1309 hits, 3924 at bats, .334 batting avg
Player C’s record: 1282 hits, 3854 at bats, .333 batting avg
Player D’s record: 1024 hits, 3024 at bats, .339 batting avg

Players A, B, C and D played in the ML
Players A, C and D were outfielders
Player B was a third baseman
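
For reference, the Torriente line in this comparison appears to be the sum of the three Holway career totals posted at the top of the thread (Negro-League play, Cuban play, and games against major-league competition). That reading is an inference from the numbers, not something stated in the post. A short sketch of the addition:

```python
# Holway career totals (hits, at-bats) as posted in comment 1 of this thread.
HOLWAY_CAREER = {
    "Negro-League play": (856, 2548),
    "Cuban play": (386, 1149),
    "vs. major-league competition": (48, 110),
}


def combine(records):
    """Add up hits and at-bats across record types and compute the average."""
    hits = sum(h for h, _ in records.values())
    at_bats = sum(ab for _, ab in records.values())
    return hits, at_bats, round(hits / at_bats, 3)


hits, at_bats, avg = combine(HOLWAY_CAREER)
print(f"{hits} hits, {at_bats} at bats, {avg:.3f} batting avg")
```

Note that this lumps together three quite different levels of competition, which matters for any comparison with a major-league batting line.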

I think any projections made are going to be inaccurate.
   16. Chris Cobb Posted: August 19, 2004 at 03:36 AM (#805722)
yest, your ideas about how to interpret evidence are so wildly at variance to mine that I doubt there's anything I can say that will carry weight with you. But since you've offered some data, here are a few things you might consider.

Let's start with data.

Santop -- .406 average vs. all competition
Torriente .313 average vs. major-league competition

This data suggests that Torriente is a better hitter than Santop.

Why?

Batting averages that were compiled largely against semi-pro competition (and that's mostly who the black teams played, prior to 1920) need to be substantially reduced. I think a factor of .7 is about right. This turns Santop's .406 into .284 MLE. Torriente's .313 stays at .313 . He's substantially ahead.

That's only one data point, so it's not worth a lot, but it certainly doesn't indicate that Santop was a better hitter.
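
The arithmetic behind that comparison, as a small sketch. The .7 factor is the rough estimate given above for averages compiled largely against semi-pro competition, not an established constant, and the function name is made up for illustration.

```python
def discount_average(raw_avg, factor=0.7):
    """Scale a batting average by a competition-quality factor (an estimate)."""
    return round(raw_avg * factor, 3)


santop_mle = discount_average(0.406)   # .406 vs. all competition
torriente_vs_ml = 0.313                # already vs. major-league competition

print(f"Santop discounted: {santop_mle:.3f}")
print(f"Torriente vs. ML:  {torriente_vs_ml:.3f}")
print(f"Gap: {torriente_vs_ml - santop_mle:.3f}")
```

Changing the factor shows how sensitive the comparison is: the two figures only meet at a factor of roughly .77.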

I don't know what to make of your comparison of Torriente to players a,b,c, and d. What do you think it shows?

Without knowing more about players a, b, c, and d -- when they played, what parts of their career are represented by the record you've provided -- these comparisons don't provide any data that I can usefully comment on.


I'm troubled by your approach to expert opinion. If you don't agree with the experts about something or other, you appear to completely dismiss what the expert has to say. I don't agree with Bill James or the CPPD panel on many things, but I acknowledge that Bill James knows a lot more about baseball, including the Negro Leagues, than I do, and the same goes for the CPPD folks about Negro-League baseball. Sure, McNeil wants to bring too many players into the Hall of Fame, but that doesn't mean that the experts he's polled have no capacity to identify who the best Negro-League players outside the HoM are. If they all agree that a player was great, then the burden is on us to disprove their claim. The same goes for James. Better to try to understand why he reaches different conclusions than you do than simply dismissing him -- that's throwing out an awful lot of valuable knowledge.

I think any projections made are going to be inaccurate.

Why? Of course no projection can be "accurate," since it is presenting a statistical image of events that didn't actually happen, but a projection can be reasonable, and without making estimates (which are different from projections) about what a player's whole career looked like, we're not treating a player fairly in comparison to players with fully documented careers.

I think any assessment that doesn't attempt to project Negro-League players into a complete career is likely to seriously misrepresent their value.
   17. yest Posted: August 19, 2004 at 10:38 PM (#807592)
These are both from Riley:
Torriente .313 average vs. major-league competition
Santop hit .316 lifetime avg. vs. major-league pitching
I don't understand what you're talking about
   18. Chris Cobb Posted: August 19, 2004 at 11:52 PM (#807656)
These are both from Riley:
Torriente .313 average vs. major-league competition
Santop hit .316 lifetime avg. vs. major-league pitching
I don't understand what you're talking about


Well, I'm glad I didn't try your pop quiz setup comparing Torriente to four unidentified major leaguers. . . .

Sure, _this_ one data point suggests that Santop was a better hitter.

Then, of course, we could look at Holway's numbers for hitting versus major-league competition:

Torriente, 48-110, .436
Santop, 13-45, .289

I don't understand what you're talking about

Thanks for letting me know.
   19. OCF Posted: August 20, 2004 at 02:50 AM (#808403)
How far off target would I be in thinking of Roberto Clemente? In what ways was Torriente different?
   20. Gary A Posted: December 16, 2004 at 05:22 AM (#1023858)
1921 Cristobal Torriente
Chicago American Giants

G-70
AB-236
H-80
D-5
T-12
HR-9
R-61
W-28
HP-5
SH-9
SB-17
AVE-.339 (NeL .263)
OBA-.420 (NeL .324)
SLG-.576 (NeL .361)

Remember, he played in the American Giants' park. Torriente was far and away the best player on the best team in the league that year.
   21. Gary A Posted: December 16, 2004 at 05:26 AM (#1023866)
From Patrick Rock's research:

1923 Cristobal Torriente

G-74
AB-261
H-101
D-22
T-5
HR-4
R-69
RBI-63
W-44 (third in league)
SB-12
AVE-.387
OBA-.470* (led league)
SLG-.556

Should also note that Torriente led the league in triples in 1921.
   22. Gary A Posted: December 16, 2004 at 04:30 PM (#1024901)
To answer an earlier question about how Torriente was different from Clemente: Torriente walked more (especially considering the low walk levels in the NeL) and was a superb defensive center fielder (clearly better than Charleston, for example). Plus he pitched a little.
   23. Dr. Chaleeko Posted: May 22, 2005 at 10:23 PM (#1354999)
Gary, re post #22, he was a really good pitcher, too!

PLACEMENT ON LEADERBOARDS

WINS t-155th with 18 wins

LOSSES t-294th with 6 losses

DECISIONS t-206th with 24 decisions

WINNING PCT .750
(50 Decisions Minimum) DNQ
(25 Decisions Minimum) DNQ
(10 Decisions Minimum) 11th

ADJ PCT OF TEAM DECISIONS 5.8%
(50 Decisions Minimum) DNQ
(25 Decisions Minimum) DNQ
(10 Decisions Minimum) 329th (of 225)

WAT 69th with 3.4

WAT PER DECISION .142
(50 Decisions Minimum) DNQ
(25 Decisions Minimum) DNQ
(10 Decisions Minimum) 27th

He didn't pitch much, but when he did, he was very effective.
   24. Big Banjo Posted: August 22, 2005 at 12:55 PM (#1563010)
Does anyone have Torriente's pitching statistics for 1928? Supposedly he was 7-3 with several saves?
   25. Gary A Posted: August 22, 2005 at 02:49 PM (#1563178)
1928 Cristobal Torriente
NNL Detroit Stars

Pitching

W-6
L-2
TRA-7.32 (NNL 5.26)
SV-1
G-14
ST-9
CG-3
SHO-1
IP-75
H-101
HR-6
BB-25
K-19
HB-0

I only include games I have box scores for; sometimes other NeL researchers will include pitchers' data (usually innings and hits allowed) for games with just line scores.
   26. Gary A Posted: August 22, 2005 at 03:31 PM (#1563268)
I also have:

1916 Torriente Pitching
Cuban Stars (West)

W-0
L-2
TRA-8.38 (western black teams 4.79)
SV-0
G-2
ST-2
CG-0
IP-9.7
H-11
HR-0
BB-9
K-4
HB-0

--AND--

1921 Torriente Pitching
Chicago American Giants

W-2
L-1
TRA-6.04 (NNL 5.20)
G-6
GS-4
CG-2
SHO-0
IP-28.3
H-35
HR-0
BB-3
K-14
HB-1
   27. Gary A Posted: August 22, 2005 at 03:48 PM (#1563305)
While I'm dumping stats, here's something I never posted:

1916 Torriente Batting
Cuban Stars (West)/All-Nations Club

G-43
AB-151
H-52 (tied for 3rd)
D-8 (tied for 4th)
T-2
HR-0
R-25 (5th)
BB-24 (tied for 2nd)
HP-0
SH-1
SB-6
AVE-.344 (*led west--western ave .250)
OBA-.434 (*led west--western ave .326)
SLG-.424 (4th--western ave .319)

Since this is a non-league year, schedules are wildly imbalanced. For rate stat leaders, I used an arbitrary cutoff of 100 plate appearances.
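
The rate stats in these lines can be recomputed from the counting stats. The sketch below assumes OBA is (H + BB + HP) / (AB + BB + HP), with sacrifice hits left out of the denominator; that formula is an assumption, though it does reproduce the posted figures for both the 1916 line here and the 1921 line in post 20.

```python
def rate_stats(ab, h, d, t, hr, bb, hp):
    """Return (AVE, OBA, SLG) rounded to three places from counting stats."""
    singles = h - d - t - hr
    total_bases = singles + 2 * d + 3 * t + 4 * hr
    ave = h / ab
    oba = (h + bb + hp) / (ab + bb + hp)   # assumed formula, see lead-in
    slg = total_bases / ab
    return round(ave, 3), round(oba, 3), round(slg, 3)


# 1916, Cuban Stars (West)/All-Nations: posted .344 / .434 / .424
print(rate_stats(ab=151, h=52, d=8, t=2, hr=0, bb=24, hp=0))

# 1921, Chicago American Giants: posted .339 / .420 / .576
print(rate_stats(ab=236, h=80, d=5, t=12, hr=9, bb=28, hp=5))
```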
   28. Big Banjo Posted: August 22, 2005 at 08:33 PM (#1563870)
Thanks for the data, Gary A. As a pitcher, Torriente seems to have been a hell of an outfielder! I love the detail you’ve culled from the box scores. This is going to seem like it’s coming out of left field, but I’m interested in starting an on-line repository for Negro League box scores and game accounts. A site where ALL baseball historians and researchers can have access to photocopies of primary source material from the Negro Leagues. Outsiders like me have been blessed (and sometimes cursed) by the heavy lifting done by guys like you. You’ve spent the hours combing through microfiche and microfilm, but to this point the public at large is at a loss for accessing primary source material, unless we decide to go through the same painstaking process of hours spent inside the research libraries. I realize Out of the Shadows has their big project on the way, but is this what we all want (an actual encyclopedia), or when it arrives, will we be left with the same empty feeling in our stomachs that Holway’s Complete Book inspired? I think a public repository would take Negro League discussion and understanding to a new level. Let me know your thoughts.
   29. Chris Cobb Posted: August 29, 2008 at 02:40 AM (#2921116)
Cristobal Torriente MLEs

Age Year  Team  G    PA   Hits TB   BB  SB   BA     OBP    SA     OPS+
22  1916  CSW*  145  580  162  209  53  14  0.306  0.369  0.397  135
26  1920  Chi   154  647  239  365  58  16  0.406  0.459  0.621  208
27  1921  Chi   150  630  203  330  73  25  0.365  0.438  0.592  170
28  1922  Chi   105  441  121  190  56  17  0.314  0.401  0.494  130
29  1923  Chi   154  647  196  286  96  18  0.357  0.452  0.520  156
30  1924  Chi   134  563  164  258  77  15  0.337  0.428  0.531  156
31  1925  Chi   146  613  146  225  90   6  0.278  0.385  0.431  108
32  1926  KC    153  643  185  258  70  17  0.323  0.396  0.450  127
33  1927  Det   145  609  159  223  65   7  0.293  0.368  0.409  108
34  1928  Det    70  245   68   94  16   3  0.296  0.342  0.410   95
career         1356 5617 1643 2438 654 136  0.331  0.409  0.491  143

*1916: played for Cuban Stars West, and also for the All-Nations


Career hit distribution
1154 1b
289 2b
94 3b
106 HR
1643 hits

Notes
1913-15 and 1917-19 not included in MLEs because of insufficient data. Therefore, 1916 is not regressed and 1920 is regressed only to avg. of 1920-22. That may contribute to height of 1920 peak.

Source of NeL data on which MLEs are based is the HoF project, published in _Out of the Shadows_, except for
1916 Gary A. http://agatetype.typepad.com/agate_type/
1921 Gary A. http://agatetype.typepad.com/agate_type/
1922 Gary A. http://agatetype.typepad.com/agate_type/
1928 data posted on Cristobal Torriente discussion thread at HoM by KJOK.

Conversion factors used were identical to those used for Oscar Charleston MLEs, which are explained in the posting that includes those figures.
   30. Brent Posted: August 29, 2008 at 05:04 AM (#2921235)
Chris - Here are some Cuban data from the '10s (from Figueredo). Can you use them to help fill in some of the gaps in your MLEs? Please note that these are his actual playing statistics, not MLEs.

Age Year     Team  G  TmG  AB   H  2B 3B HR  SB    BA   SLG  LgBA LgSLG
19  1912-13  Hab   -   32  102  27   1  3  1   6  .265  .363  .278  .339
20  1913-14  Alm  30   33  104  35   5  2  2  13  .337  .481  .254  .294
21  1914-15  Alm  34   34  124  48   5  5  0  19  .387  .508  .261  .306
22  1915-16  Alm  33   40  139  56   5  6  2  28  .403  .568  .280  .334
26  1919-20  Alm   -   23  100  36   5  5  1  10  .360  .540  .259  .317


Notes - Torriente didn't play in the Cuban League in the winters 1916-17 or 1918-19 (presumably, he was playing winter ball in the United States). The Cuban League didn't operate during the winter of 1917-18. Torriente led the Cuban League in AB (1914-15), H ('14-15,15-16), 2B ('19-20), 3B ('14-15,15-16,19-20), HR ('13-14,15-16,19-20), SB ('15-16,19-20), BA ('14-15,19-20), and SLG ('13-14,14-15,15-16,19-20).

Gary provides more detailed data for series played against the Lincoln Stars in 1914 and the Indianapolis ABCs in 1915:

Year Team   G TmG AB  H 2B 3B HR BB HP SB   BA  OBA  SLG LgBA LOBA LSLG
1914 Lin St 6   7 20  7  1  0  0  4  2  2 .350 .500 .400 .203 .297 .230
1915 In ABC 8   9 26 10  1  1  0  7  0  4 .385 .515 .500 .274 .367 .331 
   31. Brent Posted: August 29, 2008 at 05:34 AM (#2921247)
A couple of clarifications - in the last table, the second column should be labeled "Opponent" (Torriente played for Almendares in both series). In the 1915 series against the ABCs, Torriente was caught stealing once (in five attempts). All league rates exclude pitchers.
   32. Chris Cobb Posted: August 29, 2008 at 11:40 AM (#2921294)
Thanks, Brent!

Yes, I can use these. This should allow me to fill in every year but 1917 (Holway has ab-h-ba for 1918, which in isolation I hadn't bothered to use, but if I can fill in 1919, having a 1918 batting average will do well).

I'll post a new set of MLEs tonight or tomorrow.

Please note that because of regression, the projections for 1916, 1920, and 1921 in the existing MLEs will be affected by the new data.
   33. Chris Cobb Posted: August 29, 2008 at 02:23 PM (#2921355)
Brent,

Quick question: do you know with what CWL seasons the two series against NeL teams should be combined?

That is, was the 1914 series with the 1913-14 season or the 1914-15 season?

Was the 1915 series with the 1914-15 season or the 1915-16 season?

Thanks! If you don't know off the top of your head, I can go look on Gary A's site: I'm sure he must have the dates of the two series as part of his posts about them.
   34. Brent Posted: August 29, 2008 at 04:03 PM (#2921453)
Chris,

The two series against NeL teams took place in late fall before the start of the winter league season, so I'd combine the 1914 series with 1914-15 and the 1915 series with 1915-16.
   35. Chris Cobb Posted: August 29, 2008 at 04:32 PM (#2921509)
Thanks, Brent!
   36. DL from MN Posted: September 02, 2008 at 03:33 PM (#2925855)
I've been pleased with Dan R's work on Stearnes and Charleston but it didn't change my placement. I'd love to see the same done on Torriente, Doby, Oms, Hill and Cool Papa Bell. Doby should be the least amount of effort with Bell next easiest.
   37. Paul Wendt Posted: September 02, 2008 at 03:42 PM (#2925871)
DL, There's a lot of work for Chris Cobb to do! See #29-35.
   38. Chris Cobb Posted: September 02, 2008 at 06:36 PM (#2926099)
I'm almost done with Torriente's batting record, so I hope to post data tonight. Given that competition levels in the CWL are rather uncertain (and certainly quite varied) during the teens, I would like to post two different projections, based on different assumptions about CWL competition, so I have some work yet to do.
   39. DL from MN Posted: September 03, 2008 at 05:51 PM (#2927588)
Sorry, I meant Dan and Chris' work. I really do appreciate this stuff. If there is a procedure written up that I can follow I'd be glad to crunch some numbers. I would hesitate to do so otherwise because I'm sure I'd overlook something.
   40. Chris Cobb Posted: September 04, 2008 at 07:05 PM (#2929248)
Cristobal Torriente MLEs

Age Year  Team  G    PA   Hits TB   BB  SB   BA     OBP    SA     OPS+
18  1912  Hab   140  560  138  217  48  14  0.269  0.332  0.425  106
19  1913  Alm   140  560  152  257  52  30  0.300  0.365  0.506  147
20  1914  Alm   150  630  175  260  67  39  0.311  0.385  0.462  151
21  1915  Alm   129  542  150  232  58  50  0.310  0.383  0.480  160
22  1916  CSW*  131  550  155  210  51  12  0.310  0.374  0.421  143
23  1917  No data (played for CSW and All-Nations)
24  1918  CSW   116  487  143  208  52  15  0.330  0.401  0.478  168
25  1919  Alm   140  588  174  267  61  30  0.331  0.400  0.506  171
26  1920  Chi   154  647  235  354  54  16  0.397  0.447  0.597  198
27  1921  Chi   152  638  212  328  57  25  0.365  0.422  0.564  159
28  1922  Chi   105  441  123  196  55  17  0.320  0.405  0.509  134
29  1923  Chi   154  647  202  287  84  18  0.359  0.442  0.511  151
30  1924  Chi   134  563  165  261  75  15  0.337  0.426  0.536  156
31  1925  Chi   146  613  147  230  89   6  0.281  0.385  0.438  110
32  1926  KC    153  643  182  251  75  17  0.321  0.399  0.442  126
33  1927  Det   145  580  150  204  65   7  0.291  0.370  0.396  105
34  1928  Det    70  210   58   80  15   3  0.297  0.348  0.410   97
career         2159 8899 2563 3843 958 315  0.323  0.396  0.484  145

*1916: also played for the All-Nations


Notes. These projections use the Cuban Winter League data provided by Brent for the 1912-15 and 1919 seasons. In this projection, I have used a .85/.72 conversion factor for these seasons, except for 1913, which featured a large number of top American Blacks, so I projected it at .87/.76. I also projected the series played against the Lincoln Stars and ABCs in 1914 and 1915 at the usual NeL conversion rate of .9/.81. These conversion rates are guesses: I am pretty sure, from conversion work that I did a while back, that the 1913 season rates are about right, since those were anchored by comparing the performance of white Cubans in the league to their major-league performance, but I haven’t evaluated other seasons on that basis. Studies that I did of the CWL in the 1920s suggest that, because it was such a small league, its quality varied significantly from year to year, depending upon how many North American players it attracted. In my view, the conversion factors I have used here seem a bit too high: I am doubtful that Torriente was a 160 OPS+ before he was 22. I will present a projection that uses rates that are 3%/6% lower on BA and SA, which looks to me like it fits better with the projections from Torriente’s NeL years in the 1920s.
   41. Chris Cobb Posted: September 04, 2008 at 07:09 PM (#2929255)
Cristobal Torriente with 3%/6% reduction to CWL MLES

Age Year  Team  G   PA   Hits TB   BB  SB   BA     OBP    SA   OPS+
18  1912  Hab  140  560  133  207  48  14  0.260  0.324  0.405   99
19  1913  Alm  140  560  147  245  52  30  0.289  0.355  0.482  138
20  1914  Alm  150  630  170  250  67  39  0.302  0.377  0.445  144
21  1915  Alm  129  542  146  223  58  50  0.301  0.375  0.461  152
22  1916  CSW  131  550  154  208  51  12  0.308  0.371  0.416  141
23  1917  No Data 
24  1918  CSW  116  487  143  207  52  15  0.328  0.400  0.475  167
25  1919  Alm  140  588  171  260  61  30  0.325  0.395  0.494  166
26  1920  Chi  154  647  235  353  54  16  0.396  0.447  0.596  197
27  1921  Chi  152  638  212  327  57  25  0.365  0.422  0.563  158
28  1922  Chi  105  441  123  196  55  17  0.320  0.405  0.509  134
29  1923  Chi  154  647  202  287  84  18  0.359  0.442  0.511  151
30  1924  Chi  134  563  165  261  75  15  0.337  0.426  0.536  156
31  1925  Chi  146  613  147  230  89   6  0.281  0.385  0.438  110
32  1926  KC   153  643  182  251  75  17  0.321  0.399  0.442  126
33  1927  Det  145  580  150  204  65   7  0.291  0.370  0.396  105
34  1928  Det   70  210   58   80  15   3  0.297  0.348  0.410   97
career        2159 8899 2538 3791 958 315  0.320  0.393  0.477  142 
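
To see what the lower CWL rates do season by season, the two projections can be compared directly. The sketch below takes the projected hits and total bases for the CWL-based seasons from post 40 and from the table above and divides one by the other. It only reads the two posted tables; it does not attempt to reproduce the conversion method itself, and the names are made up for illustration.

```python
# (hits, total bases) for the CWL-based seasons in each projection.
ORIGINAL = {1912: (138, 217), 1913: (152, 257), 1914: (175, 260),
            1915: (150, 232), 1919: (174, 267)}   # post 40
REDUCED = {1912: (133, 207), 1913: (147, 245), 1914: (170, 250),
           1915: (146, 223), 1919: (171, 260)}    # post 41


def season_ratios(original, reduced):
    """Return {year: (hits ratio, total-bases ratio)} of reduced to original."""
    ratios = {}
    for year, (hits, tb) in original.items():
        new_hits, new_tb = reduced[year]
        ratios[year] = (new_hits / hits, new_tb / tb)
    return ratios


for year, (hits_ratio, tb_ratio) in season_ratios(ORIGINAL, REDUCED).items():
    print(f"{year}: hits x{hits_ratio:.3f}, total bases x{tb_ratio:.3f}")
```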
   42. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 04, 2008 at 07:13 PM (#2929258)
Fielding WS? Was he a CF for his whole career? Was he definitely playing ball in 1917?
   43. Chris Cobb Posted: September 04, 2008 at 09:29 PM (#2929390)
I'm working on fielding win shares now.

Here's what I know now about seasons in which he was not exclusively a center fielder:
In 1919 he split time between CF and LF: when he and Charleston were teammates, Charleston was in center, it appears.
In 1927 he played right field, according to Holway (Turkey Stearnes was in center for Detroit).
In 1928 he split time between left field and right field (seasonal fielding data posted on this thread).
   44. Gary A Posted: September 05, 2008 at 03:10 AM (#2929580)
I think you might have his age a year off on the Cuban seasons. He was born November 16, 1893; the 1912/13 season actually started in January 1913, so he was already 19 then. It might make more sense to annex the Cuban seasons to the following summer (i.e., 1912/13 to 1913, 1913/14 to 1914 and so on). Maybe then his Cuban stats would fit in better with his U.S. seasons without the extra 3%/6% reduction.

Also--I forgot about his series against the Brooklyn Dodgers in the fall of 1913 (played November 1 through November 30, 1913), for what it's worth:

G: 8 (team 8)
AB: 32
H: 5
D: 1
T: 0
HR: 0
R: 0
W: 1
HBP: 0
SF: 1
SH: 1
SB: 0
AVE: .156 (series .281)
OBA: .176 (series .342)
SLG: .188 (series .345)
runs/game: 4.57
(Pitchers excluded from series averages.)

positions played:
rf-6 games (49 di); lf-2 (20); p-1 (5).

Dan: he definitely played in 1917, for the Cuban Stars and All-Nations.

In Cuba in 1917/18 there was no formal Cuban League because Almendares Park was being rebuilt, but the three teams (Habana, Almendares, Cuban Stars) played various series around the island. I only have box scores for about half the games; if we ever found all of them, we'd have enough to put together a decent substitute for the Cuban League. Anyway, Torriente played in these games, both for Almendares and the Cuban Stars.
   45. Gary A Posted: September 05, 2008 at 03:49 AM (#2929604)
A quick note on Torriente's 1917 U.S. season: I just checked through the Chicago Defender. I didn't find Torriente with the Cuban Stars at all, only with All-Nations. But the All-Nations don't show up in the Defender until the beginning of September.

I know the club was around earlier in the season, because Bullet Rogan appeared briefly with them in May or June (in K.C.) while on a break from the Army. But they must have spent much of their time barnstorming through the upper Midwest or the southwest before showing up in Indianapolis, St. Louis, and Chicago in September and October. The Defender doesn't give a clue as to where they were before then. For that month or so (Sept/Oct), Torriente seems to have appeared in every game I found.

Torriente and Mendez arrived in Tampa from Havana together on May 9, 1917; the passenger list has J.L. Wilkinson (owner of the All-Nations Club and later the Monarchs) as their U.S. contact. So I'm pretty sure Torriente spent the whole 1917 season with All-Nations (though he could have gotten into a game or two with the Cuban Stars--you can never be sure).

I'll probably do 1917 stats for the blog after finishing the 1920-22 NNL book--the 1910s are really important, yet still pretty murky in the books we have.
   46. Chris Cobb Posted: September 05, 2008 at 02:32 PM (#2929809)
Gary,

Thank you for the additional information on Torriente! The birthdate is particularly helpful. The HoF project data gives his age for each NeL season, but not the actual birthday, and the only listed birth information I had was from Riley, who gave only a year, "1895," which didn't fit with the HoF data, so I knew it had to be wrong.

In my past MLEs, I have always treated the CWL as following the major-league season, rather than preceding it, so that was the practice I followed with Torriente, which also reduced the number of seasons for which I had to combine data from leagues with two different levels of competition--a time-consuming process, but I'll take a look at what happens if I shift them back a year.

I'll update my work on Torriente with this new information and data, but I want to get a quick run through the other NeL CF candidates done first.
   47. Chris Cobb Posted: September 06, 2008 at 07:07 PM (#2931219)
Cristobal Torriente Fielding Win Shares

Age Year   Team    G   rate*  FWS
19  1912   Hab    140  1.60   2.0
20  1913   Alm    140  2.06   2.6
21  1914   Alm    150  2.96   4.0
22  1915   Alm    129  3.58   4.2
23  1916   CSW    131  3.87   4.6
24  1917*  All-N    0  3.32   0.0
25  1918   CSW    116  4.09   4.3
26  1919*  Alm    140  3.22   4.7
27  1920   Chi    154  3.87   5.4
28  1921   Chi    152  3.62   5.0
29  1922   Chi    105  3.24   3.1
30  1923   Chi    154  3.11   4.3
31  1924   Chi    134  2.95   3.6
32  1925   Chi    146  3.25   4.3
33  1926   KC     153  2.54   3.5
34  1927   Det    145  2.01   2.6
35  1928   Det     70  1.26   0.8
career           2159        58.1


rate* is fielding win shares per 1000 defensive innings, estimated here as 111.11 games.

1917* Also played a game or two for the Cuban Stars West. The All-Nations barnstormed for most of this season; full data for their games against top black teams is not yet available.

1919* Played for Chicago American Giants in North America, but insufficient data exists for projection, so this year is based off of his CWL season.
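
For anyone who wants to check the table, the rate* footnote implies a simple conversion from rate and games to seasonal fielding win shares. A small Python sketch, assuming FWS is just the rate scaled by games over 111.11 (the function name is made up for illustration):

```python
GAMES_PER_1000_INNINGS = 111.11  # conversion given in the rate* footnote

def fielding_win_shares(rate, games):
    """Seasonal FWS from a per-1000-innings rate and games played."""
    return rate * games / GAMES_PER_1000_INNINGS

# (year, games, rate) taken from three rows of the table above.
for year, games, rate in [(1912, 140, 1.60), (1914, 150, 2.96), (1920, 154, 3.87)]:
    print(year, round(fielding_win_shares(rate, games), 1))
```

These three rows come back as 2.0, 4.0, and 5.4, matching the FWS column.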



Notes.

1) Torriente is projected as playing right field 1912-14, 1/2 of 1919, and 1927-28 (data show that he split 1928 between left and right, but my projection methods aren’t sophisticated enough to register a difference between the corners). For other seasons, he is projected as a centerfielder.

2) The quality of his prime is modeled on Benny Kauff, who was a slightly above average centerfielder from ages 24-30, when he was banned by Landis. His decline phase is modeled on Hack Wilson’s, since both players appear to have drunk their careers and lives to early ends. His first three seasons model a very raw player who is about average in RF in year one, rising to excellent in RF in year 3, leading to the move to CF in year 4.
   48. Chris Cobb Posted: September 06, 2008 at 07:09 PM (#2931220)
Argh. Sorry about introducing the long lines: I should have put those notes outside the "pre" tags. Hopefully someone can fix that easily?
   49. Chris Cobb Posted: September 06, 2008 at 07:28 PM (#2931236)
Torriente XBH projections, for higher estimate:

1777 singles
447 doubles
184 triples
155 home runs
   50. Chris Cobb Posted: September 06, 2008 at 07:40 PM (#2931241)
Torriente XBH projections, for lower estimate:

1771 singles
434 doubles
180 triples
153 home runs
   51. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 06, 2008 at 09:10 PM (#2931301)
OK, here's the "lesser" set of MLE's translated to my WARP. I have added in hit by pitch at the league average rate, and filled in 1917 using my war credit equation.

Year SFrac BWAA BRWAA FWAA Replc  WARP
1912  0.90  0.2  -0.1 -0.3  -0.6   0.5
1913  0.90  3.3   0.1  0.4  -0.6   4.4
1914  1.02  4.5   0.3  0.5  -0.7   6.0
1915  0.88  4.6   0.6  0.3  -0.9   6.5
1916  0.90  4.0   0.0  0.3  -0.9   5.2
1917  0.98  5.0   0.2  0.4  -1.0   6.5
1918  0.97  6.0   0.1  0.5  -1.0   7.6
1919  1.06  6.4   0.3  0.8  -0.9   8.4
1920  1.03  9.1   0.1  0.4  -1.0  10.5
1921  1.01  5.7   0.2  0.5  -1.0   7.4
1922  0.68  2.5   0.1  0.1  -0.7   3.4
1923  1.00  5.3   0.1 -0.1  -1.0   6.2
1924  0.89  4.9   0.1  0.0  -0.8   5.8
1925  0.95  1.8   0.0  0.1  -0.9   2.9
1926  1.02  3.2   0.2 -0.1  -1.0   4.3
1927  0.91  1.3   0.0 -0.1  -0.7   1.9
1928  0.33  0.1   0.0 -0.3  -0.2   0.2
TOTL 15.42 67.9   2.3  3.3 -14.1  87.6
AVRG  1.00  4.4   0.1  0.2  -0.9   5.7
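
A note on reading the table: the rows look consistent with WARP being the three above-average components minus the (negative) replacement level. That identity is an assumption drawn from the numbers, not something stated in the post, and some rows are off by a tenth or two because each column is rounded separately. A quick Python check on rows that match exactly:

```python
def warp(bwaa, brwaa, fwaa, replc):
    """Assumed identity: wins above average minus the replacement level."""
    return bwaa + brwaa + fwaa - replc

# (BWAA, BRWAA, FWAA, Replc, listed WARP) copied from the table.
rows = {
    "1913": (3.3, 0.1, 0.4, -0.6, 4.4),
    "1919": (6.4, 0.3, 0.8, -0.9, 8.4),
    "TOTL": (67.9, 2.3, 3.3, -14.1, 87.6),
}
for label, (bwaa, brwaa, fwaa, replc, listed) in rows.items():
    print(label, round(warp(bwaa, brwaa, fwaa, replc), 1), listed)
```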


3-year peak: 26.6
7-year prime: 52.6
Career: 87.6
Salary: $271,297,033, at the doorstep of the inner circle, below DiMaggio and Stearnes, similar to Mize and Mathews, way above Griffey.

Again, this is an extremely bullish MLE: there aren't 30 post-1893 MLB position players who project to better than this. Can we get a reputational spot-check here? That's an absolutely beastly peak you're showing there from 1918-21--it's far superior to, say, DiMaggio's, as a comp.

That said, maybe these MLE's really are accurate--because the segregated majors were really weak. After all, the upper inner circle is absolutely dominated by segregation-era players. Perhaps I just need to take a much bigger bite out of all 1893-1947 stats, NgL'ers and MLB'ers alike, than I have been doing. That would make a lot more sense intuitively. But how to quantify it?
   52. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 06, 2008 at 09:19 PM (#2931302)
Another thought on why these MLE's are turning up such high peaks: as I've stressed time and again, league quality and standard deviation are two separate factors. The NgL's may have been AAA-quality on the whole, but they encapsulated a much higher variance of talent: virtually everyone in AAA is roughly AAA-caliber, whereas the NgL's probably had everyone from High-A guys to inner-circle Hall members. As a result, we'd expect the NgL's to show an enormous standard deviation. The correct approach is to first regress NgL'ers to the league mean, so that the NgL stdev is equal to the league you want to translate it to. Only then can you apply an overall league quality conversion factor. I remember this thought first dawned on me looking at the raw numbers on Oscar Charleston's 1921 team, when he had a 1.200 OPS, along with some other guy I'd never heard of, and no one else on the team was on the right side of .700. Chris, do you think there's anything to this, and is there any way to implement it? Charleston at $330M, and Gibson in the mid-$500's with a full catcher bonus (which I don't think he deserves) "feel" right to me, but Stearnes *and* Torriente in DiMaggio-ville seems a little hard for me to stomach.
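
The two-step order described here (equalize the spread first, then apply league quality) can be sketched in Python. Everything numeric below is made up for illustration: the six OPS values, the target standard deviation, and the 0.90 factor are assumptions, not measured figures, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def compress_to_target_sd(values, target_sd):
    """Rescale deviations from the league mean so the spread equals target_sd."""
    mu = mean(values)
    sd = pstdev(values)
    return [mu + (v - mu) * target_sd / sd for v in values]

def translate(values, target_sd, quality_factor):
    """Step 1: equalize the spread. Step 2: apply the league-quality factor."""
    return [v * quality_factor for v in compress_to_target_sd(values, target_sd)]

ngl_ops = [1.200, 0.900, 0.750, 0.700, 0.650, 0.600]  # hypothetical hitters
mlb_sd = 0.100    # assumed target spread
quality = 0.90    # illustrative league-quality factor

translated = translate(ngl_ops, mlb_sd, quality)
print([round(v, 3) for v in translated])
```

The design point is only the ordering: the mean is left alone in step 1, so the quality factor in step 2 acts on a league that already has the target spread.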
   53. Blackadder Posted: September 06, 2008 at 09:20 PM (#2931305)
Well, one annoying way would be to do what you suggest elsewhere, namely translating all NgL players into the majors and figuring the numbers of the hypothetical league whose talent pool is everyone. What is the problem with the availability of NgL stats again? I guess that we don't have the data to do that. What if you just did it for every NgL player for whom we have the relevant data? Are there enough of them so that this would give at the very least some useful information as to how much to deflate everyone's stats?
   54. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 06, 2008 at 09:34 PM (#2931326)
You'd still need to equalize the standard deviations of the separate racial player pools before trying that, or else you'd have the NgL'ers too high relative to the MLB'ers. But if you had the complete data set, doing so would be a cinch.
   55. Chris Cobb Posted: September 06, 2008 at 11:37 PM (#2931382)
What is the problem with the availability of NgL stats again?

Up until about four years ago, only rather fragmentary Negro-League stats had ever been compiled. At that time, the Hall of Fame sponsored a project to compile as complete statistics for the Negro Leagues as was possible from box scores, given that not all games are documented with complete box scores.

Of the data gathered by this project, only a sliver has been published: career totals for Negro-Leaguers elected to the Hall of Fame and selected candidates who made the final ballot but were not elected.

We had heard that a complete Negro-League statistical encyclopedia was going to be published on the basis of this data, but so far nothing has come of it.
   56. Chris Cobb Posted: September 06, 2008 at 11:45 PM (#2931388)
The correct approach is to first regress NgL'ers to the league mean, so that the NgL stdev is equal to the league you want to translate it to. Only then can you apply an overall league quality conversion factor. I remember this thought first dawned on me looking at the raw numbers on Oscar Charleston's 1921 team, when he had a 1.200 OPS, along with some other guy I'd never heard of, and no one else on the team was on the right side of .700. Chris, do you think there's anything to this, and is there any way to implement it?

Well, it should be possible to do this kind of standard deviation study for the seasons compiled by Gary A. or available through his website. These are 1916, 1921, 1922, and 1923. Someone has similarly complete data for 1928, because it's been excerpted at various times over the years, but I'm not sure who has it.

Calculating standard deviations for these seasons could give us an idea of how big the SDs in the NeL actually were, though I would hesitate to take the SDs for these early seasons as representative of the league when it was more mature: the first years of the NeL were almost certainly like years following expansion in the majors.

If you could readily do that analysis, I think that would be a great help to our understanding of NeL statistics.

The method as you outline it above makes sense. The one question that I would have about methodology concerns the short-season issue: if NeL players' seasons were regressed to the league mean, would there still be need to regress them to their own several-season mean?
   57. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 07, 2008 at 12:08 AM (#2931401)
Well, I'd be dealing with the same problem I confronted with the majors: how to distinguish true ease of domination from how much the players in a given league-season happened to dominate. In MLB, I have a sample of 218 league-seasons, which is sufficiently large to accomplish this task with a fairly strong degree of confidence. I definitely do not see how I could do something similar with a mere four league-seasons of NgL data. It's perfectly easy to simply set the NgL stdev to the same as the MLB stdev, but that ignores the possibility that the given NgL league-seasons I'm dealing with happen to be similar to, say, the 2001 NL (when everyone just happened to go ape $h!t, probably with some chemical help) or the 1976 AL (when there wasn't a single MVP candidate worthy of the award). That may simply be a risk I have to take.

I can't help the feeling that something is making these MLE's come in high. 9- and 10-WARP2 seasons don't grow on trees in the majors--the median MVP winner is under 9--and they're definitely popping up with some consistency in your MLE's, including for players that don't have the mega-reputations of a Charleston or a Gibson. At the same time, the giants of the first 30 years of the game did put them up with some regularity. I'm correcting for overall standard deviations, but not for kurtosis, and those years do seem to have "fat tails" on the high end (with the rest of the league clustered more closely around average). Certainly I also suspect that a much bigger segregation penalty is appropriate than the tiebreak factor I have been using to date. It IS improbable that eight of the top 14 1893-2005 MLB position players (Ruth, Wagner, Cobb, Speaker, Hornsby, Collins, Lajoie, Gehrig) would have played between 1900 and 1935, and also that only one (Bonds) would have debuted after 1960! There's something going on here that a simple correction for standard deviations is not picking up. The reverse is true of the pitchers, of course, where you had three inner circle guys (Clemens, RJohnson, Maddux) going at once, and a fourth who would have been one if his body had held up (Pedro). You had four more in the deadball era (Young, WJohnson, Alexander, Mathewson), and really only three in the intervening 65 years (Grove, Spahn, Seaver). These issues will require a lot more thought.
   58. Gary A Posted: September 07, 2008 at 01:00 AM (#2931448)
Just to be a pest, I think I need to point out that I had talked about the 1921 St. Louis Giants precisely because there was an *unusually* high spread between the best and worst hitters on the team. This is the OPS for the team’s top ten position players (latest stats; NNL games only):

1.290 Oscar Charleston CF
1.192 Charles Blackwell RF
.877 Dan Kennard C
.784 Charles Dudley LF
.753 Sidney Brooks UT
.660 Joe Hewitt SS
.652 Sam Mongin 3B
.649 Tullie McAdoo 1B
.642 Sam Bennett C
.453 Eddie Holtz 2B

Blackwell and Charleston, as it happens, were the only players in the 1921 NNL with an OPS above 1.000; and Eddie Holtz was the only player in the whole league (over 150 plate appearances) with an OPS below .520.

Nevertheless, of course Dan’s larger point is correct: we need to think about regression and standard deviation. But for that we need...reasonably complete statistics. They will be available at some point, though I couldn’t say when.
   59. Blackadder Posted: September 07, 2008 at 01:01 AM (#2931450)
Thanks for the explanation, Chris.

Dan, to be fair ARod is basically guaranteed to join Bonds in the top 14; indeed, depending on how he finishes the season, he may pass Gehrig this year. The larger point is obviously right, and needs consideration. As I think you imply, this is not simply a "technical" problem, one of e.g. running the right regression, but instead requires some conceptual thought about what one SHOULD do.
   60. Chris Cobb Posted: September 07, 2008 at 01:10 AM (#2931460)
That may simply be a risk I have to take.

The way I'd put it is that doing standard deviation for these seasons would be a place to start. It would give us a few real data points, where currently we have none. Those points might prove to be misleading in some respects, but I'd venture that even if, in the long run, they were to prove misleading in some ways, we would still get closer to the truth with them than we would by not trying to deal with standard deviation at all.
   61. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 07, 2008 at 01:26 AM (#2931490)
Yes, it's clearly a very major issue. What we really want to do, Chris Cobb, is to re-do your initial NgL/MLB league-switcher study that gave us our conversion factors, but instead of using either raw BA/BBrate/ISO or percentage-based BA+/BB+/ISO+, we should use the z-score of each component. E.g., players whose park-adjusted batting averages were a given number of stdevs above the NgL mean were how many stdevs above the MLB league-average BA once they played in the bigs? Is there enough data available to answer that question?
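
To make the z-score idea concrete, here is a rough Python sketch. The batting averages, the MLB mean and stdev, and especially the linear z-to-z relation (slope and intercept) are all hypothetical; estimating that relation is exactly what the league-switcher data would be needed for.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standard deviations above or below the league mean for each player."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def to_mlb(z_ngl, slope, intercept, mlb_mean, mlb_sd):
    """Map an NgL z-score to an MLB stat via an assumed linear z-to-z relation."""
    z_mlb = intercept + slope * z_ngl
    return mlb_mean + z_mlb * mlb_sd

ngl_ba = [0.380, 0.320, 0.290, 0.270, 0.240]   # made-up park-adjusted averages
z = z_scores(ngl_ba)

# A hitter 2.0 stdevs above the NgL mean, under invented conversion parameters.
projected = to_mlb(2.0, slope=0.8, intercept=-0.5, mlb_mean=0.280, mlb_sd=0.030)
print([round(v, 2) for v in z], round(projected, 3))
```

The same two functions would be run separately for BA, BB rate, and ISO, each with its own fitted slope and intercept.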
   62. Chris Cobb Posted: September 07, 2008 at 02:26 AM (#2931590)
I don't know enough about the finer points of calculating standard deviation to say for sure. My guess would be that we don't have nearly enough data.

Let me assume that we want this data for each NeL season from 1940 to 1948 in order to cover the seasons relevant to the league-switcher study. Monte Irvin's NeL debut was a bit earlier than that, but aside from his first two full NeL seasons, all of the league-switcher seasons are from 1940-on, and Irvin has enough later seasons that we wouldn't need those first two.

As far as I know, this is the data available to us.

We have: league batting averages and slugging averages for the relevant NeL seasons, but no league OBP or BB rates.
We have: some players' non-park-adjusted seasonal statistics that are part of the same data set that gives us the league BA and SA. The total number of players in this set is about 20. The number of seasons played from 1940-48 ranges from all 9 to only 1. The player statistics include BA, OBP, SA, and the raw stats from which these averages are derived, except for HBP.
We have: other seasonal, non-park-adjusted statistics for quite a few more players (at a guess, I'd say another 20) covering the 1940-48 period. These are derived from earlier, less complete and less rigorously conducted data collection projects for the Negro Leagues.
We have: the contemporary published league BA and SA totals from 1946-48 for the NNL and the NAL. I do not know to what extent their reliability can be vouched for.

So the questions are: how many player seasons, out of all the player seasons in the data set, do we need for any given NeL season to calculate z-scores? How far can we rely on data derived from different studies and still have valid results?

If we can draw on all the data available and we don't separate out the NAL from the NNL, there might be seasons for which we have up to 30 player seasons to work with. Our number of players per season will be dampened because so many NeL stars spent time in Mexico or in the army during these years, of course, but I'm guessing some years would have as many as 30.

My intuition tells me that won't be nearly enough, but I know very little about statistical sampling.
   63. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 07, 2008 at 02:40 AM (#2931607)
I mean, technically, you can calculate a standard deviation with two data points. The more information you have, the better. Certainly 30 player-seasons would be a strong sample for a given year. If you just email me a data dump with all the stats you have for every available NgL player-season between 1940 and 1948, it's rudimentary arithmetic for me to determine z-scores. They may not be very reliable, but perhaps they're better than nothing. Let's find out!
   64. Chris Cobb Posted: September 07, 2008 at 02:51 AM (#2931611)
If I had all that data in electronic form, that would be easy, but everything except what I have entered for the purpose of calculating MLEs is in print sources (or at best a pdf), and I would have to enter the data by hand.

I can do that, but it will take weeks, so I'm afraid there's no way to reach a quick result.
   65. Brent Posted: September 07, 2008 at 02:52 AM (#2931612)
I mean, technically, you can calculate a standard deviation with two data points. The more information you have, the better. Certainly 30 player-seasons would be a strong sample for a given year.

Be careful. You can calculate a standard deviation from a small sample if the sample is randomly selected, but the samples of player seasons that Chris described are definitely not random.
   66. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 07, 2008 at 02:58 AM (#2931614)
Well, that wouldn't matter, so long as the selection bias itself doesn't change from year to year. Let's say Chris only has data for the top 10% of NgL'ers in every year. That's fine; I can calculate conversions from a z-score among the top 10% of NgL'ers to a MLB z-score. What would be problematic is if one year Chris is giving me data for the top 10%, and another year he's giving me data for the middle 30%. That would indubitably lead to errors.
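
A simulated illustration of why the slice matters, in Python. The league here is 1,000 random draws from a normal distribution with a fixed seed, which is purely an assumption for demonstration and not a model of real NgL talent:

```python
import random
from statistics import pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
league = sorted(rng.gauss(0.0, 1.0) for _ in range(1000))  # made-up talent levels

top_10 = league[-100:]       # best 10% of players
middle_30 = league[350:650]  # middle 30% of players

sd_full = pstdev(league)
sd_top = pstdev(top_10)
sd_mid = pstdev(middle_30)
print(round(sd_full, 2), round(sd_top, 2), round(sd_mid, 2))
```

Both slices show a much smaller spread than the full league, and the two slices differ from each other, so z-scores computed within one kind of slice are not comparable to those computed within another.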
   67. Paul Wendt Posted: September 07, 2008 at 05:08 PM (#2931752)
Maybe you need to measure within-league variation after regression to the mean of neighboring seasons.
Or account later for that part of greater NeL variation due to shorter seasons. How?


the 1976 AL (when there wasn't a single MVP candidate worthy of the award)
I remember April of '76. Finley trades Reggie to the Orioles. Reggie's mother lives in Baltimore. She hopes that he will play here for a long time and probably he will. . . . (debut in May)

From NYT 1976-04-14 coverage of home opening day, the first game at renovated Yankee Stadium:
"Gabe Paul ... was chortling about something else . . . .
Reggie Jackson ... was still not signed by the Baltimore team.
The longer Jackson stays away from Baltimore, he said, the better for us."
   68. Brent Posted: September 08, 2008 at 04:19 AM (#2932657)
Dan wrote:

Can we get a reputational spot-check here? That's an absolutely beastly peak you're showing there from 1918-21--it's far superior to, say, DiMaggio's, as a comp...

Charleston at $330M, and Gibson in the mid-$500's with a full catcher bonus (which I don't think he deserves) "feel" right to me, but Stearnes *and* Torriente in DiMaggio-ville seems a little hard for me to stomach.


One spot check is the set of Negro leaguers ranked in the top 100 by Bill James in the NBJHBA. As far as I can tell, James was relying more on reputation than on statistics, and appears to have especially relied on Holway.

James ranked 12 Negro leaguers (9 position players and 3 pitchers--counting Dihigo as a pitcher) in his top 100:
4. Charleston
9. Gibson
17. Paige
25. Stearnes
27. Lloyd
43. Suttles
52. Williams
65. Leonard
67. Torriente
76. Bell
86. Wells
95. Dihigo

In comparison, we elected 234 players to the HoM, of which 28 (my quick count; I may have missed one or two) were primarily Negro Leaguers. In other words, Negro leaguers constitute about the same proportion of our top 234 players as they do of James's top 100, so it would make sense that we'd also have about six in the top 50, about 12 in the top 100, and so forth.
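
The arithmetic behind that comparison, as a short Python sketch. It assumes the share simply scales proportionally to any top-N cutoff, which is the simplification used in the paragraph above:

```python
hom_total = 234           # players elected to the HoM
hom_negro_leaguers = 28   # quick count from the post

share = hom_negro_leaguers / hom_total

def expected_in_top(n, share):
    """Expected Negro leaguers in a top-n list under proportional scaling."""
    return n * share

print(round(share, 3),
      round(expected_in_top(50, share), 1),
      round(expected_in_top(100, share), 1))
```

The share works out to just under 12 percent, which rounds to six in a top 50 and 12 in a top 100.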

Now our rankings of individual players differ quite a bit from James's. For example, Suttles and Bell aren't going to be in our top 100, while Jud Wilson will be. Among the players within our top 100, we'll rank Charleston and probably Paige lower than they were ranked by James, while perhaps Williams and Torriente may be ranked higher. But the bottom line, it seems to me, is that the distribution of Negro leaguers within our top 25-50-100 seems to be coming out just about exactly what I'd expect given that we decided that about 12 percent of HoMers are Negro leaguers. And where we've differed from reputational ranking (such as James), we've been as likely to lower a player's ranking as to raise it.

My reading indicates that Torriente was certainly the best hitter in the Negro Leagues during the late '10s and early '20s until surpassed by Charleston and Beckwith. Chris's MLEs for Torriente and Stearnes are not setting off any alarms for me.
   69. Paul Wendt Posted: September 08, 2008 at 05:26 AM (#2932678)
Peterson in 1970 covered the best players with short bio's and thumbnail photos, usually.
"The names of about sixty men inevitably crop up in any discussion of the greatest players in Negro baseball."

His coverage is
- not including Irvin & co.
- not including 19th century players "because the sport of choosing all-time teams did not get under way for Negro baseball until they were nearly forgotten.
The players are listed alphabetically by position. The first group for each position is made up of those most often selected as the best. Following them is a second group of players who are occasionally mentioned among the greats."

I suppose that he relied on published argument and on his own conversations.

Following 3,2,3,5,3 = 16 players at c,1b,2b,ss,3b it is remarkable that (1) there are only five outfielders including Martin Dihigo, and perhaps (2) when Dihigo made his US debut in 1923 he was the latest of the five.
- Bell, Charleston, Hill, Torrienti
If this is reliable there was an unusual degree of consensus. Rube Foster could have named Charleston, Hill, and Torriente in the early 1920s and Cum Posey could have rounded out the list in the early 1930s.
   70. Paul Wendt Posted: September 08, 2008 at 03:12 PM (#2932815)
On the other hand, only a few 1930s debuts crack this standard at any position:
p Slim Jones
c Josh Gibson
1b Buck Leonard
2b Sammy T. Hughes
3b Ray Dandridge

Deference to early opinion become conventional wisdom?
Or was it more difficult to stand out in the later leagues?

--
Back to DanR's question speaking directly. By reputation Torriente was a great player but the projections represent the upside of what reputation would suggest.
   71. Gary A Posted: September 08, 2008 at 09:32 PM (#2933317)
A note on Torriente's whereabouts during the winter of 1916/17: he was in Puerto Rico. Due to the rebuilding of Almendares Park, there was at first no Cuban League season planned. Two Cuban squads—the “Havana Stars,” organized by the players Inocente Mendieta and Pastor Pareda, and the Linares/Molina Cuban Stars—travelled to Puerto Rico to play a three-cornered series in November and December with the Brooklyn Royal Giants (featuring Dick Redding). Torriente was on the Cuban Stars—I have two Cuban Stars box scores from Puerto Rico, both with Torriente in the lineup.

When the Cuban teams returned to Havana in January, Torriente came with them, appearing in at least one (non-league) game for Almendares against Matanzas on January 6. A shortened league season was finally organized, to take place in February at the race track (Oriental Park), but Torriente was never listed with any of the teams, nor could I find any hint of where he was. Juan Padrón was advertised as a member of Tinti Molina’s White Sox, but chose instead to play for Rube Foster in the Breakers/Royal Poinciana series at Palm Beach. Torriente, though, didn’t go to Florida. Nor was he with the L.A. White Sox (where John Donaldson and George Carr, among others, played that winter).

As noted above, Torriente and José Méndez arrived in the U.S. from Havana in May, on their way to join the All-Nations Club. But what Torriente was doing between January and May, I don’t know. It’s possible he was playing elsewhere in Cuba (there was a LOT of pro and semi-pro baseball outside Havana).
   72. DL from MN Posted: September 08, 2008 at 10:07 PM (#2933341)
Just for a sanity check assume the standard deviations are high (good assumption) and use the highest stdev you've seen in the majors for that run environment. I think that is more reasonable than using the actual stdevs for the NL for that season. I'd guess the 20s and 30s have the lowest NgL stdev and the NgL stdev for the 40s and 50s is huge. That might not be correct but it's certainly more reasonable.
   73. Gary A Posted: September 08, 2008 at 11:24 PM (#2933396)
Reputation-wise: Torriente is the subject of one of the Negro Leagues’ more frequently recycled quotations, from C. I. Taylor: “If I should see Torriente walking up the other side of the street, I would say, ‘There walks a ballclub.’”

And Cumberland Posey of the Homestead Grays put Torriente on his all-time team in 1937; his outfield was Torriente lf, Charleston cf, Hill rf.

For those who don’t know, Taylor and Posey are probably two of the three most respected managers in Negro League history, the third being Foster, who of course employed Torriente as his center fielder for several years.
   74. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 09, 2008 at 12:05 AM (#2933454)
No, I don't think that's enough, DL from MN. Just look at the data from that one team Gary A posted--the leader's OPS is nearly three times that of the trailer. Yes, I understand it's a coincidence that they were on the same team, but still, you'd just never, ever see that in the majors.

One other approach if we lack the complete data to calculate overall standard deviations would just be to look at leaderboards (similar to what Joe Dimino does for innings translation). Do we have enough info to put together Top 10 lists for BA, ISO, and BB rate for all the relevant NgL seasons? If we know that, say, the average 5th place finisher in the NgL's has a 125 AVG+, it wouldn't take much effort to come up with a broad standard deviation estimate based on that that we could use as an anchor.
   75. Chris Cobb Posted: September 09, 2008 at 12:26 AM (#2933493)
Do we have enough info to put together Top 10 lists for BA, ISO, and BB rate for all the relevant NgL seasons?

No. Not even close.
   76. Blackadder Posted: September 09, 2008 at 12:42 AM (#2933519)
Can someone fix the page?
   77. Gary A Posted: September 09, 2008 at 01:59 AM (#2933613)
Just look at the data from that one team Gary A posted--the leader's OPS is nearly three times that of the trailer. Yes, I understand it's a coincidence that they were on the same team, but still, you'd just never, ever see that in the majors.

How about 1920 Babe Ruth (1.382) and Chick Galloway (.511)?

Charleston was 2.85 times Holtz; Ruth was 2.7 times Galloway.
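
For the record, those ratios come straight from dividing the OPS figures quoted in this thread. A trivial Python check (the helper name is just for illustration):

```python
def ops_ratio(best, worst):
    """Ratio of the higher OPS to the lower OPS."""
    return best / worst

stl_1921 = ops_ratio(1.290, 0.453)   # Charleston vs. Holtz
mlb_1920 = ops_ratio(1.382, 0.511)   # Ruth vs. Galloway
print(f"{stl_1921:.2f} {mlb_1920:.2f}")
```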

Sorry, couldn't resist. ;-)

No, seriously, of course the Negro Leagues should have a higher standard deviation than the majors, generally speaking. I'm curious, however, as to how much of it is due to the talent distribution (high-A to HOF) and how much to smaller seasons.
   78. JoeD has the Imperial March Stuck in His Head Posted: September 21, 2008 at 03:18 PM (#2949221)
Fixing Chris Cobb's post 47 now to make the page more readable.
   79. JoeD has the Imperial March Stuck in His Head Posted: September 21, 2008 at 03:27 PM (#2949224)
BTW guys if there's an issue with a page like this - my email works! Send me a note and I can fix ASAP. I haven't been checking in much of late, due to work issues. Hopefully it slows down in a bit, but it probably won't this week.
   80. KJOK Posted: September 17, 2011 at 07:40 AM (#3927879)
   81. KJOK Posted: September 17, 2011 at 08:25 PM (#3928190)
CORRECT Torriente Link:

Christobal Torriente's Real Stats
   82. Brent Posted: December 14, 2014 at 11:34 PM (#4861852)
On the Ben Taylor thread, I’ve posted an item demonstrating some simple calculations for converting a player’s career OPS+ against NeLg competition from Seamheads to an MLE OPS+. These calculations use adjustment factors that were suggested originally, I believe, by Chris Cobb—0.9 for hits, and 0.9^2=0.81 for walks and extra bases.

One thing I wanted to look at was the extremes—what would these adjustments imply about the very best NeLg hitters whose statistics are available on Seamheads.

The highest OPS+ shown on Seamheads was, unsurprisingly, for Josh Gibson at 197. But Gibson’s data cover only a small portion of his career (4 seasons). I was more interested in players whose careers were mostly covered by the Seamheads data. Among players with at least 1500 recorded plate appearances, the highest OPS+ was for Torriente (190), followed closely by Heavy Johnson (also 190) and Oscar Charleston (189). Seamheads primarily covers players who were active between 1900 and 1928.

If we apply the standard adjustment factors to Torriente, what do we get as his MLE OPS+? Using the same formulas I described on the Ben Taylor thread, I calculate an MLE OPS+ of 147.

Is this reasonable? Among MLB players who were primarily active from 1900 to 1928, I count 8 who had career OPS+ greater than or equal to 150: Ruth (206), Hornsby (175), Jackson (170), Cobb (168), Speaker (157), Cravath (151), Wagner (151), and Lajoie (150).

Knowing what we know about the distribution of talent among black players during the first generation after integration, it frankly bothers me that no NeLg players from this era are projecting as high as the top 8 MLB players. I think these calculations suggest that the 0.9 adjustment factor may be a little too low.

What if we switch to using 0.92 and 0.92^2=0.85? My recollection is that Chris’s original estimates of the conversion factor were not super precise, and I think a slightly larger value of 0.92 might well be within the reasonable range. If we substitute 0.92, it raises Torriente’s MLE OPS+ to 154. Instead of no NeLg hitters of the 1900-28 period with an MLE OPS+ of 150 or above, we would have 3 players in the low 150s. While I’m ok with the idea that no NeLg player of that era approached Ruth, Hornsby, and Cobb in hitting ability, I think it’s reasonable that the best NeLg hitters of that era (Torriente, Charleston, and Johnson) were at least comparable to Speaker, Cravath, Wagner, and Lajoie. I'm inclined to raise my adjustment factors for NeLg MLE's to 0.92 and 0.92^2.
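
To show what the two factor sets do to raw components, here is a small Python sketch. The stat line is invented, and the step from adjusted components to an MLE OPS+ is the formula on the Ben Taylor thread, which is not reproduced here:

```python
def apply_factors(hits, walks, extra_bases, factor):
    """Scale hits by factor; scale walks and extra bases by factor squared."""
    return hits * factor, walks * factor**2, extra_bases * factor**2

line = dict(hits=100, walks=40, extra_bases=60)   # made-up totals

for factor in (0.90, 0.92):
    h, w, xb = apply_factors(factor=factor, **line)
    print(factor, round(factor**2, 4), round(h, 1), round(w, 1), round(xb, 1))
```

Note that 0.92 squared is 0.8464, so the 0.85 quoted above is a rounded figure.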
   83. theorioleway Posted: January 15, 2015 at 11:56 AM (#4881553)
Thanks for doing this, Brent! Just out of curiosity, what do you think about Heavy Johnson? The only players with a 150 OPS+ we haven't elected are Cravath and Dave Orr (although Browning, Keller, and Jones are not obvious choices). Do you see him as a slightly lesser Cravath?
   84. Chris Cobb Posted: January 15, 2015 at 09:46 PM (#4881926)
Re the conversion factors: given the completeness of the Seamheads data, I wish someone with the statistical know-how would do some standard deviation studies of the competitive context and see what happens when we use standard deviations as a framework for making conversions. That could give us a better handle on the conversion factors for a period in which no (or almost no) data exist from players who played in both Cuban/NeL and the major leagues.

I am dubious about using the OPS+ totals of an era's top players as a yardstick for what the conversion factors should show. If you look at 150 OPS+ players in major-league history, the distribution is quite uneven. Between 1929 and 1957, for example, the AL has 7, including 4 over 160 (Williams, Gehrig, Mantle, Foxx), while the NL has only 3 (Musial, Mize, Ott), none of them over 160. Why didn't the NL have anyone as good as Gehrig? Between 1957 and 1986, only four players topped 150 at all--Allen, Mays, Aaron, and Robinson, none of whom were full-career American League players. All were black. Why weren't there any white players at that level during that era?

I don't make that argument to defend the validity of the .90/.81 conversion factors: they may be shown to be too low, or the conversion method might be shown to overadjust for top players. I think a much better test than looking at the top players, though, would be to look at league-wide NeL data, or to run conversions for many non-elite players and see what the overall distribution suggests. Running .92/.85 conversions for a large number of players and seeing what that suggests about talent distribution would also be helpful.
   85. Alex King Posted: January 16, 2015 at 04:32 AM (#4882010)
Chris, I would be very interested in trying to apply MLEs to a large sample of Negro Leagues players. While I've been very busy over the past couple of years, which has kept me from participating in the HOM, I should have much more time this summer and was hoping to devote some of it to Negro Leagues players' MLEs. I'm pretty proficient in Python, so if your methods are sufficiently automatable I'm pretty confident I can write a script that will convert Seamheads (or BR) statistics into MLEs using your methods. Also, it would be interesting to use the fairly large sample of players who played in both Cuba and the US to get a better handle on the conversion factors.
   86. Chris Cobb Posted: January 16, 2015 at 07:54 AM (#4882025)
Alex, that would be a great project. Let's talk further between now and the summer to make plans. My first thought about automation is that Brent's shorthand method would be a better choice--especially to get a sense of the broad implications of a given competition-level adjustment--than the full MLE workup that I do. In that process, there are a couple of steps that can't be done by formula and so can't be automated. It might be possible to devise alternative methods that could be automated, of course--full automation would allow for more data to be pulled and processed than I can do by hand, which might make feasible translation methods that are impossible for me. In general, the rate conversions are easily automatable; the projection of rates into major-league-equivalent counting stats is not. Figuring out what it would take to compute standard deviation measures of the performance variance in the NeL would also be valuable. I think there is expertise in the HoM community to work that out, and if we can automate getting seasonal data from Seamheads, then SDs could be calculated.
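As a rough illustration of the standard deviation step, here is a minimal sketch of computing a per-season mean and SD of OPS+ and expressing one player's season in SD units. Everything in it is an assumption for illustration: the rows are made-up placeholders rather than Seamheads data, the function names are hypothetical, and the spread is unweighted, whereas a real study would probably weight by plate appearances and set a playing-time minimum.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (season, player, OPS+) rows; placeholders, not real data.
rows = [
    (1921, "A", 150), (1921, "B", 100), (1921, "C", 80), (1921, "D", 70),
    (1922, "A", 130), (1922, "B", 110), (1922, "C", 95), (1922, "D", 65),
]


def season_spread(rows):
    """Return {season: (mean OPS+, population SD of OPS+)}, unweighted."""
    by_season = defaultdict(list)
    for season, _player, ops_plus in rows:
        by_season[season].append(ops_plus)
    return {s: (mean(v), pstdev(v)) for s, v in by_season.items()}


def z_score(ops_plus, season, spread):
    """How many SDs above the season mean a given OPS+ sits."""
    avg, sd = spread[season]
    return (ops_plus - avg) / sd


spread = season_spread(rows)
for season, (avg, sd) in sorted(spread.items()):
    print(season, "mean", round(avg, 1), "SD", round(sd, 1))
print("1921 player A z-score:", round(z_score(150, 1921, spread), 2))
```

The same summary run on major-league seasons would give a basis for comparing how spread out the two contexts were, which is one way a standard-deviation framework could inform the conversion factors.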
   87. Dr. Chaleeko Posted: January 03, 2018 at 01:34 PM (#5600176)
Hey, gang,

Please find my latest MLEs for Cristobal Torriente. They will be updated as new info becomes available.
