Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Monday, December 10, 2012

What do you do with Deacon White?

Last week, the attention of the baseball world was focused keenly on the Winter Meetings.  They’re reliably one of the high points of the offseason, featuring free agent speculation, wild trade rumors, and Hall of Fame selections from the Veterans’ Committee.

Unfortunately, “keen” is a very generous description for the level of emphasis given to the last item on that list.  The Veterans’ vote didn’t even get top headline status on ESPN’s MLB page when it was announced, even though it was an easier group to publicize than you’d expect from the fact that it’s composed of people whose careers were finished at least a decade before Jamie Moyer was born.  You have a longtime umpire who made one of the most famous and controversial calls in baseball history.  You have the Yankee owner who acquired Babe Ruth and built Yankee Stadium.  And you have Deacon White, who hit 24 home runs in his career, never scored or drove in 100 runs in a season, and registered 500 plate appearances in a year for the first time at age 38.  At a glance, he seems like a rather odd selection.

Rob Neyer is one of the few writers who bothered to address the Veterans’ ballot this year.  He penned the following in reference to White:

“Deacon White - 19th-century catcher, and it’s always hard to know what to do with 19th-century catchers, because the demands of the position at that time—no real mitt, no shin guards, no mask—meant that catchers didn’t play many games, or last many seasons. In fact, White shifted to other positions (mostly third base) in the latter half of his career. At 42, he was still playing every day, which probably says as much about baseball at that time than about his talents.”

Indeed, White’s games played totals during his career as a catcher (1871-79) don’t look terribly impressive:

29, 22, 60, 70, 80, 66, 59, 61, 78

But games played totals have to be put in context.  The numbers of games played by White’s teams in those seasons are:

29, 22, 60, 71, 82, 66, 61, 61, 81

White wasn’t missing time due to the strains of primitive catching.  In fact, he was barely missing any time at all – a total of 8 games skipped in 9 years.  His game totals were low because his teams weren’t playing anything approaching a modern schedule – professional baseball was in its infancy, long-distance travel was a highly challenging endeavor, teams frequently played large numbers of non-league games, and franchises would sometimes fold midseason.

The question then becomes: How do you adjust for this schedule discrepancy?

The easiest answer to this question is not to adjust at all.  White played the games he played, amassed the hits and doubles and RBI that he amassed, and should be compared to other players on that basis.  By this logic, White’s career totals of 2067 hits, 1140 runs, 988 RBI, and 44 Wins above Replacement (as estimated by Baseball Reference) are nice enough, but not terribly impressive in a historic context, especially considering the fact that he doesn’t have a single season exceeding 160 hits, 210 total bases, or 5 WAR.

There are a couple of easily identifiable problems with this method.  First, it doesn’t account for opportunity.  White played in nearly all the games he possibly could have, while a modern player who participates in 80 games in a season is not only missing half of the year, but making his team find someone else to put in his spot for the other 82.  The other issue is that of impact on team results.  In an 82-game schedule, a literally-interpreted 5-WAR player (White’s 1875 total was 4.9) should turn a .500 team into a .561 team (46-36); if you double the length of the schedule, you correspondingly reduce the impact of the wins (86-76, or .531).
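The schedule-length arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `pct_with_added_wins` is hypothetical, and it assumes WAR is read literally as wins added to an otherwise .500 team.

```python
def pct_with_added_wins(games, war):
    """Winning percentage of an otherwise .500 team after adding `war` wins."""
    wins = games / 2 + war
    return wins / games

# The same 5 wins move the needle much further in the shorter schedule.
print(round(pct_with_added_wins(82, 5), 3))   # 82-game schedule (46-36)
print(round(pct_with_added_wins(162, 5), 3))  # 162-game schedule (86-76)
```

The two printed values correspond to the .561 and .531 figures quoted above.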

The real issue at hand is not White’s raw contributions on the baseball field, but rather how those contributions affected his teams’ position in the standings.  In an attempt to take the most direct approach possible to that question, this analysis is going to take WAR at face value as an estimate of wins added, with all the applicable caveats.

The most intuitive method is to simply pro-rate each season to 162 games, multiplying the player’s statistics by (162/team games played).  This makes a 2-WAR season through 20 games equivalent to a 16-WAR season through 160.  Applied to our sample player, we see that this adjustment turns White’s 1875 from a 4.9-win season into a 9.7-win season, and he also picks up 8-win campaigns in ’72, ’76, and ’77.  His career WAR more than doubles, to 93.0, and he moves firmly into Hall of Fame territory – at least, if this is a fair adjustment.
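As a quick check on the pro-rating arithmetic, here is a minimal sketch. The function name `prorated_war` is hypothetical; the only inputs are the WAR and team-games figures quoted in the text.

```python
def prorated_war(war, team_games, full_season=162):
    """Scale a season's WAR linearly to a full-length schedule."""
    return war * full_season / team_games

print(round(prorated_war(4.9, 82), 1))  # White's 1875
print(round(prorated_war(1.1, 22), 1))  # White's 1872
print(round(prorated_war(2.0, 20), 1))  # the 2-WAR-in-20-games example
```

Note that the last case comes out slightly above 16 because it scales to 162 games rather than 160.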

The trouble is, it’s not a fair adjustment.  White’s 1872 Cleveland Forest Citys played in 22 games, so let’s look at the 2012 standings after 22 games for comparison.  You find the Dodgers and Rangers at 16-6, the Twins and Royals at 6-16, and the Padres and Angels at 7-15.  All of those teams have winning percentages further from .500 than the 55-107 Astros did at the end of the year – and this edition of the Astros had baseball’s most extreme end-of-season record since 2004.  It’s no surprise that smaller samples of games are inherently prone to wider variation in team performance.  Because of this phenomenon, a player who contributes 2 wins in a 20-game schedule (which would be expected to give an otherwise-average team a 12-8 record) would be far less likely to propel his team to a pennant than one who adds 16 wins over a 162-game schedule (97-65).

We can account for this by simply comparing the variance in team performance through different points in the season.  Of course, we’ll want to use a sample larger than one season of baseball to do so.
I took every 162-game season that has been played to completion.  That’s 1962-2012, plus the 1961 AL, and leaving out 1972, 1981, and 1994-95 due to labor disputes – a total of 1242 team seasons.  I split the seasons into not-quite-but-almost equal increments of around 10 games (there are two 11-game samples, taken to be games 41-51 and 122-132, and I also used game 154 as a breakpoint instead of 152 in a vague attempt at a tribute to baseball’s old schedule length).  The results, where S% is standard deviation of winning percentage and S is standard deviation of wins (calculated simply as S% * N), are as follows:

Games     S%       S    S(162)/S
 10     .1652    1.65     6.95
 20     .1234    2.47     4.65
 30     .1052    3.16     3.64
 40     .0952    3.81     3.01
 51     .0878    4.48     2.57
 61     .0830    5.06     2.27
 71     .0795    5.65     2.03
 81     .0771    6.24     1.84
 91     .0749    6.81     1.69
101     .0733    7.40     1.55
111     .0725    8.04     1.43
121     .0720    8.71     1.32
132     .0710    9.37     1.23
142     .0711   10.10     1.14
154     .0711   10.95     1.05
162     .0709   11.48     1.00 

The spread in team winning percentage becomes smaller throughout the year, as expected.  Adjusting for the variance in team performance, it’s evident that a 2-win increase in a 20-game season is not, in fact, equivalent to a 16-win improvement over 162; it’s closer in impact to a 9-win enhancement over a full season.  That looks about right; you’d expect a 12-8 team to be in contention to either win a weak division or take the second wild card slot, and you’d expect the same from a 90-72 team.
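The conversion from a short-season win total to its full-season equivalent can be sketched as follows. The names `OBSERVED_SD_PCT`, `sd_wins`, and `full_season_equivalent` are hypothetical, and only a few rows of the table above are copied in; treat this as an illustration of the arithmetic rather than a complete tool.

```python
# Observed standard deviation of winning percentage at selected checkpoints,
# copied from the table above (games played -> S%).
OBSERVED_SD_PCT = {10: .1652, 20: .1234, 61: .0830, 81: .0771, 162: .0709}

def sd_wins(games):
    """Standard deviation of team wins after `games` games (S = S% * N)."""
    return OBSERVED_SD_PCT[games] * games

def full_season_equivalent(wins_added, games):
    """Rescale wins added in a short schedule by the ratio S(162)/S(games)."""
    return wins_added * sd_wins(162) / sd_wins(games)

print(round(full_season_equivalent(2, 20), 1))    # 2 wins in a 20-game season
print(round(full_season_equivalent(3.2, 61), 1))  # 3.2 WAR in a 61-game season
```

The first line reproduces the "closer to 9 wins" figure; the second uses a 61-game schedule as a further example.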

Accounting for this takes a bit of the air out of White’s production – his 1877 season, in which he was the best hitter in baseball but played mostly first base rather than catcher, is now equivalent to a 7.3 WAR season rather than 8.5.  That’s still an excellent year, but it doesn’t look as impressive as it did under the simpler pro-rating adjustment.

Note, however, that because I lacked the stamina to enter records after each of 162 games for each of 1242 teams, we’re left with substantial gaps in the table.  For maximum utility, we should try to find a curve that can be applied to any season length, up to and surpassing 162 games.

As it happens, there is just such a curve; its origins will be explained in further detail shortly.  The equation is as follows (with apologies for the awkward formatting):

S%(N) = (.25/N + .0554^2)^(1/2),

where N is the number of games played.  As before, multiplying S%(N) by N gives the standard deviation of team wins.  For comparison, here are the results when this curve is applied to the same season lengths listed in the table above:

Games   S%(N)    S(N)   S(162)/S(N)
 10     .1675    1.68     6.57
 20     .1248    2.50     4.41
 30     .1068    3.20     3.43
 40     .0965    3.86     2.85
 51     .0893    4.55     2.42
 61     .0847    5.16     2.13
 71     .0812    5.76     1.91
 81     .0785    6.36     1.73
 91     .0763    6.94     1.59
101     .0745    7.52     1.46
111     .0729    8.10     1.36
121     .0717    8.67     1.27
132     .0704    9.30     1.18
142     .0695    9.87     1.11
154     .0685   10.55     1.04
162     .0679   11.00     1.00
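For anyone who wants to evaluate the curve at season lengths not shown in the table, here is a minimal Python sketch. The function names (`sd_pct`, `sd_wins`, `multiplier`) are hypothetical; the formula and the .0554 constant are the ones given above.

```python
import math

TALENT_SD = 0.0554  # constant term in the curve

def sd_pct(n):
    """Modeled standard deviation of winning percentage after n games."""
    return math.sqrt(0.25 / n + TALENT_SD ** 2)

def sd_wins(n):
    """Modeled standard deviation of team wins after n games."""
    return n * sd_pct(n)

def multiplier(n, full=162):
    """Factor that rescales wins in an n-game season to a full season."""
    return sd_wins(full) / sd_wins(n)

for n in (10, 20, 81, 162):
    print(n, round(sd_pct(n), 4), round(sd_wins(n), 2), round(multiplier(n), 2))
```

Because the curve is a closed-form expression, the same functions accept any season length, including ones longer than 162 games.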


It would not quite be correct to say that there are no noticeable differences.  The reason for those differences comes from the model I selected to build the curve.  (This is the math section.  It may be very slightly more intensive than the most common sabermetric tools, but it contains nothing that can’t be found in the early chapters of a college statistics textbook, or in the Wikipedia articles on standard deviation and binomial distribution, respectively.  Even though I still have my college statistics textbooks, guess where I went to confirm my memory of the formulas…)

The basic assumption I used is that there are two reasons for variance in team performance: talent and luck.  The overall variance can then be expressed as a function of the variance due to talent and the variance due to luck.  Assuming that there is no relationship between talent and luck (which is pretty much true by definition; if your luck is somehow based in talent, it’s not actually luck), this function should be:

S(T+L)^2 = S(T)^2 + S(L)^2

The standard deviation due to luck can be calculated using the binomial probability distribution, which applies to the answering of large numbers of identical yes-or-no questions such as “did the coin come up heads?” or “did you win the baseball game?”.  Over a sample of N games, the standard deviation of the number of wins for a .500 talent team is:

S(L, N) = (.5^2 * N)^(1/2)

This makes the standard deviation of winning percentage due to luck

S%(L, N) = (.25/N)^(1/2)
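As a sanity check on the luck term, the binomial standard deviation can be computed directly from the probability of each possible win total and compared with the closed-form shortcut. This is a sketch; the function name `binomial_sd_wins` is hypothetical.

```python
import math

def binomial_sd_wins(n, p=0.5):
    """Standard deviation of wins over n games, summed from the binomial pmf."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return math.sqrt(var)

for n in (20, 162):
    # The brute-force value and the shortcut (.25 * N)^(1/2) should agree.
    print(n, round(binomial_sd_wins(n), 4), round(math.sqrt(0.25 * n), 4))
```

Dividing either value by n gives the standard deviation of winning percentage due to luck.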

If that looks familiar, that’s because it’s the first half of the equation for the curve proposed earlier.  The second half is the value for the other source of variance, talent.  Using the equation given above for S(T+L), we can actually find an observed value for the standard deviation due to talent through each of the samples used:

Games   S%(T,N)
 10     .0480
 20     .0523
 30     .0523
 40     .0531
 51     .0529
 61     .0528
 71     .0529
 81     .0534
 91     .0535
101     .0538
111     .0547
121     .0558
132     .0560
142     .0574
154     .0586
162     .0590
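The step of backing the talent term out of the observed spread can be sketched like this. The function name `talent_sd` is hypothetical, and the inputs are the rounded S% values from the first table, so results may differ from the column above in the last digit.

```python
import math

def talent_sd(observed_sd_pct, games):
    """Talent SD implied by S(T+L)^2 = S(T)^2 + S(L)^2, with S%(L)^2 = .25/N."""
    luck_var = 0.25 / games
    return math.sqrt(observed_sd_pct ** 2 - luck_var)

print(round(talent_sd(.0733, 101), 4))  # observed S% after 101 games
print(round(talent_sd(.0709, 162), 4))  # observed S% after 162 games
```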

We can quickly observe two things.  First, the value remains rather stable from game 20 through game 101, then increases steadily for the remainder of the season.  This makes sense, because game 101 is roughly the timing of the trade deadline, when good teams get better and bad teams get worse; you’d expect the variation in talent to increase, and to continue to do so as rosters expand in September.
Second, the changes in the measurement for talent variance are not terribly large.  The curve I’m using treats talent variance as a constant (I went through a few methods in selecting it, none of which change the value of the constant or the overall curve in a noteworthy way).  It would be possible to modify the projection to account for the trade deadline and September callups, but I think it would be inadvisable to do so, because if the season is shorter, bad teams will give up earlier, and the aforementioned increase in variance will occur sooner.

All right, the math is done; let’s get back to the baseball side of things.  What does all of this mean for Deacon White?  Here’s his career WAR in three forms: raw, pro-rated, and adjusted using the proposed model.

Year  Team  Lg  Tm G   WAR  PR WAR  Mod WAR
1871  CLE   NA    29   0.8    4.5      2.8
1872  CLE   NA    22   1.1    8.1      4.6
1873  BOS   NA    60   2.9    7.8      6.3
1874  BOS   NA    71   1.9    4.3      3.6
1875  BOS   NA    82   4.9    9.7      8.4
1876  CHC   NL    66   3.5    8.6      7.0
1877  BSN   NL    61   3.2    8.5      6.8
1878  CIN   NL    61   2.3    6.1      4.9
1879  CIN   NL    81   3.6    7.2      6.2
1880  CIN   NL    83   0.5    1.0      0.8
1881  BUF   NL    83   1.0    2.0      1.7
1882  BUF   NL    84   1.0    1.9      1.7
1883  BUF   NL    98   1.0    1.7      1.5
1884  BUF   NL   115   4.8    6.8      6.3
1885  BUF   NL   112   1.9    2.7      2.6
1886  DTN   NL   126   2.5    3.2      3.1
1887  DTN   NL   127   2.1    2.7      2.6
1888  DTN   NL   134   3.2    3.9      3.7
1889  PIT   NL   134   0.3    0.4      0.4
1890  BUF   PL   134   1.7    2.1      2.0

Total                 44.2   93.0     77.0 
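The Mod WAR column can be reproduced from the curve with a short script. This is a sketch under the assumptions already stated: WAR is taken at face value, the talent constant is .0554, and the names `sd_wins` and `model_war` are hypothetical. Only four of White's seasons are included to keep it brief.

```python
import math

TALENT_SD = 0.0554  # constant talent term from the curve

def sd_wins(n):
    """Modeled standard deviation of team wins after n games."""
    return n * math.sqrt(0.25 / n + TALENT_SD ** 2)

def model_war(war, team_games, full=162):
    """Rescale WAR by the ratio of full-season to short-season win spread."""
    return war * sd_wins(full) / sd_wins(team_games)

# (year, team games, raw WAR) taken from the table above
seasons = [(1871, 29, 0.8), (1872, 22, 1.1), (1875, 82, 4.9), (1877, 61, 3.2)]
for year, games, war in seasons:
    print(year, round(model_war(war, games), 1))
```

Summing `model_war` over all twenty seasons in the table should land close to the 77.0 total, with small differences possible from rounding.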


An adjusted career total of 77 WAR, with a peak season of 8.4 and five other seasons exceeding 6.0.  That’s not an inner-circle player, but it is a really strong candidate – roughly Jeff Bagwell level (albeit before applying a length adjustment to Bagwell’s 1994 season) in the context of his own time.  The differences between that context and the modern one are perhaps worth exploring, but that’s a subject that’s been tackled so thoroughly in so many forms and venues that I doubt there’s much to be gained from my taking it on here.

One final, bright red warning light about this adjustment: Since it was derived around team wins and the spread thereof, it’s not really safe to apply to non-win-based measurements.  So while it might be fun (at least if you’re me) to find out that Deacon White had equivalent career totals of 3457 hits and 1957 runs, or that his 1873 season features 263 equivalent hits, exceeding Ichiro’s single-season record, that’s not an exercise in any danger of drowning in rigor.

Even with that caveat, the adjustment proposed here is still quite useful, not only for White and the other stars of the earliest era of baseball, but also for more recent players such as Heinie Groh, Bobby Grich, and Bagwell, who peaked during shortened seasons.  It strikes a balance between no adjustment, which penalizes these players for missing games that never occurred, and pro-rating, which attempts to address that issue but overcompensates.  By accounting for evolution in the standings over the course of the year, it gives us a better chance to answer the question we’re really trying to ask: How much did this player improve his team’s odds of winning the pennant?

Eric J can SABER all he wants to Posted: December 10, 2012 at 08:19 AM | 17 comment(s)

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. Der-K's enjoying the new boygenius album. Posted: December 10, 2012 at 08:59 AM (#4320767)
I've thought about this kind of approach before but never tried to work out the math - this seems quite reasonable. Thanks Eric!
   2. karlmagnus Posted: December 10, 2012 at 10:25 AM (#4320827)
Applying this to Bob Caruthers, he has 56.8 WAR, but this should be multiplied by about 1.3 for seasons averaging around the 116 game range, which gives him 73.8, clearly in the range of a HOFer. Looks about right.
   3. Eric J can SABER all he wants to Posted: December 10, 2012 at 10:38 AM (#4320838)
Many thanks to MWE for his help in getting this posted.

Applying this to Bob Caruthers, he has 56.8 WAR, but this should be multiplied by about 1.3 for seasons averaging around the 116 game range, which gives him 73.8, clearly in the range of a HOFer. Looks about right.

I'm torn on whether to use this method on 19th century pitchers, because their workloads were so much higher than those of their modern counterparts even though the schedules were shorter, and it seems like that may require an additional adjustment.

If anyone's interested, though, I can run a few more early position players and see what happens.
   4. karlmagnus Posted: December 10, 2012 at 11:04 AM (#4320861)
Caruthers' career was ended abnormally early by modern standards; presumably with modern pitching workloads he'd have gone on at least into his mid 30s. Swings and roundabouts.
   5. Qufini Posted: December 10, 2012 at 11:04 AM (#4320862)
Great work, Eric. I, for one, would love to see a few more position players run through this exercise. Maybe some of the HoM honorees. Or maybe someone that we haven't quite gotten to, like Ed Williamson.
   6. DL from MN Posted: December 10, 2012 at 11:21 AM (#4320884)
Fantastic stuff, thanks for sharing this.
   7. Eric J can SABER all he wants to Posted: December 10, 2012 at 12:10 PM (#4320947)
Let's see... the guy who probably benefits the most, even more than White, is Ross Barnes. B-R gives him 29.3 WAR; adjusting turns that into 62.1, with a peak of 12.3, 11.8, 11.2, 9.9.

Cap Anson doesn't really need the help, but he gets it anyway, going from 91.1 to 136.5. His counterparts, Brouthers and Connor, don't leap up as much because their debuts were later, but they both do nicely as well. Brouthers goes from 77 to 98, Connor from 81 to 102 (rounding to the nearest whole number for the sake of brevity).

Other HOMers who debuted before 1885 (let me know if I'm missing anyone):
Charlie Bennett 51 (up from 37)
Buck Ewing 59 (46)
Cal McVey 46 (22)
Joe Start 59 (32)
Bid McPhee 57 (48)
Hardy Richardson 52 (39)
Ezra Sutton 54 (32)
Jack Glasscock 77 (59)
Dickey Pearce 21 (10)
George Wright 51 (25)
Jim O'Rourke 77 (50)
Paul Hines 68 (43)
George Gore 53 (38)
King Kelly 59 (42)
Lip Pike 33 (15)
Charley Jones 41 (25)
Harry Stovey 54 (42)
Pete Browning 49 (38)
Sam Thompson 50 (42)
Monte Ward (non-pitching only) 44 (35)

Other guys from around the same time who I think either get votes sometimes or I'm at least slightly familiar with:
Jimmy Ryan 47 (41)
Ed Williamson 51 (34)
Fred Dunlap 50 (35)
Tip O'Neill 30 (26)

I won't swear by the WAR values themselves, of course. They seem to have gone through some enormous changes since I last entered them - I know B-R updated WAR since then, but I'm not sure which updates had the effect. My guess would be the runs-to-wins conversion.
   8. villageidiom Posted: December 10, 2012 at 01:00 PM (#4321007)
What to do with Deacon White? Well, he played in an era before steroid testing, and was a more "durable" catcher than Craig Biggio.

Just sayin'.
   9. Kiko Sakata Posted: December 10, 2012 at 01:08 PM (#4321020)
This is a very nice piece of work. Thanks for sharing. Personally, I wouldn't do any kind of adjustments for pitchers. I tend to view most pitchers as having a fixed number of innings in their arms, so fewer innings per season tends to be offset by longer careers and vice-versa (I've made a similar argument with respect to the impact of WWII on Bob Feller's career).
   10. Juilin Sandar to Conkling Speedwell (Arjun) Posted: December 10, 2012 at 01:14 PM (#4321029)
This is a really, really interesting process and idea. I'll put my hat back in the ring and thank you for sharing it; I, personally, really like both the theory and the application very much.
   11. snapper (history's 42nd greatest monster) Posted: December 10, 2012 at 02:22 PM (#4321122)
Good article!
   12. Rob_Wood Posted: December 10, 2012 at 04:11 PM (#4321219)
Fantastic work.

As you note, your adjustment methodology is based upon wins. Presumably, an analogous approach could be applied to other counting stats such as hits or home runs. To take a strike year as an example, suppose a player had 30 HR in a strike-shortened season of 81 games. Extrapolation would suggest he'd wind up with 60 HR in a full 162 game season. But as you point out above, extrapolation over-predicts high-performances (Reggie Jackson had 39 home runs at the 1969 All-Star break and wound up with only 47 at season's end). By using full season data, we could find all players who had 30 HR after 81 games and see how many homers they wound up with in the full season. Of course, we'd need to try to account for environment (era, park, etc.) in selecting the players to include as best we could.

Anyway, just a thought.
   13. cardsfanboy Posted: December 10, 2012 at 04:21 PM (#4321231)
Excellent piece of work, and a fairly easy to understand explanation of what you did and why.
   14. Eric J can SABER all he wants to Posted: December 10, 2012 at 09:09 PM (#4321402)
Thanks for all the support and feedback.

Personally, I wouldn't do any kind of adjustments for pitchers. I tend to view most pitchers as having a fixed number of innings in their arms, so fewer innings per season tends to be offset by longer careers and vice-versa

I expect this is true to a point. Still, an inning in 2012 is not necessarily the same as an inning in 1962, let alone an inning in 1912 or 1892 or 1872.

As you note, your adjustment methodology is based upon wins. Presumably, an analogous approach could be applied to other counting stats such as hits or home runs.

This is a really interesting idea, but it would take far more legwork to do something like this for individual players than it did for teams, even if you were only doing one counting stat. To do the entire batting line, you'd want someone who either has more time than I do, or has better data acquisition skills (I entered most of the data for this project manually).

Excellent piece of work, and a fairly easy to understand explanation of what you did and why.

This was especially gratifying to read, because I was at least as concerned about how well I'd communicated the information as I was about the specifics of the method itself.
   15. Der-K's enjoying the new boygenius album. Posted: December 10, 2012 at 09:14 PM (#4321409)
You made it very clear - really a nice job.
   16. bjhanke Posted: December 12, 2012 at 07:48 PM (#4323492)
I also like the method and the explanation. I'm adding my approaches to this problem not to argue with the author, but to offer him extra tools if he can use them.

1. On pitchers up through 1892, the last year of the 50-foot pitching box, I do this: Take 40 games as a reasonable measure of how many games pitchers have started per season across MLB history. As long as the season played doesn't include a whole 40 starts for the given pitcher, just include it as it is unless there's a leftover portion from the previous year. However, if the season involves more than 40 starts, then take the first 40 and call them a season. That leaves you with a remainder of games to take to the next year, where you will have to pick a part of that year to make a combined "season", and have still another leftover portion, and so on. If you do that, the 1800s pitchers come out with much more modern-looking careers. They don't pitch 500 innings in a season, but they have more seasons, and can be compared to modern pitchers in that way.

2. Many years ago, I came up with this for 1800s catchers. You can plot a curve by taking, for each year of MLB play, the third-highest percentage of schedule played by any catcher. Using third-highest gets rid of outlier data, and there are always at least three catchers who have full, healthy seasons in any given year. If you do plot this out, you get a nice, graceful curve, starting on the right, in modern years, with data points close to 100%, but not making it quite that far. The curve serves you well, slowly dropping down as you go back in history, when medicine and equipment were primitive. That lasts until you go back to the 1870s, when the schedules get so small that catchers can play almost every league game, skipping only the occasional exhibition game, which suddenly wrenches your curve way up and unrealistic. You can fix this anomaly by simply continuing the historical curve with a French curve, setting a limit on how much of a 162-game schedule early catchers could catch, rather than paying any attention to the actual percentages in 29-game "schedules." When you want to look at a catcher in a given year, you attribute to him the higher of his actual games played and the games that the curve would give the catcher in a full, 162-game season.

So, when Deacon White plays all 22 games of his team's 22-game schedule in 1871 or 1872 or whenever, and your extended curve says 60% (which is about what it does say), then he gets credit for 60% of 162 games, or 97 games. More than 22, but a small enough number to fit in with the decline in playing time as you go back to weaker and weaker equipment. This gives you a reasonable base of playing time for these catchers, which you can then use with your favorite uberstat to estimate his value. You do still have the problem of a small sample size being inflated to a larger one, but you appear to have ways of dealing with that.

Fair Warning #1: People with credentials in formal math will tell you that extending the curve in this way is not allowed in formal math because the method lacks rigor, which it does. I, however, am trained as an applied mathematician, which means I think like an engineer. Engineers do this sort of stuff all the time; they are constantly running into data points that don't fit on any existing curve. That's why they also always want to build a model and test it - they know their math isn't rigorous. Well, sabermetrics is, IMO, a branch of applied math, not theoretical math. Rigor is impossible. Think like an engineer.

Fair Warning #2: If you do this, people will howl about Deacon White, because this method strips from him too many of his games played at catcher. He becomes a career third baseman. Deacon White fans think of him as a catcher. You will catch some heat. I know, I tried this in the Hall of Merit and got plenty of scorching. But it's still the best method I know of for dealing with very early catchers. - Brock Hanke

   17. Alex King Posted: December 23, 2012 at 12:12 PM (#4331457)
Brock--I'm currently reconstructing my HOM ballot, preparing for next year. I like the method you're using for catcher adjustments--can you send me the data you used, so that I don't have to go through every year on BR to get catcher games played data?
