Primate Studies: Where BTF's Members Investigate the Grand Old Game

Monday, December 10, 2012

What do you do with Deacon White?

Last week, the attention of the baseball world was focused keenly on the Winter Meetings. They’re reliably one of the high points of the offseason, featuring free agent speculation, wild trade rumors, and Hall of Fame selections from the Veterans’ Committee. Unfortunately, “keen” is a very generous description for the level of emphasis given to the last item on that list. The Veterans’ vote didn’t even get top headline status on ESPN’s MLB page when it was announced, even though it was an easier group to publicize than you’d expect from the fact that it’s composed of people whose careers were finished at least a decade before Jamie Moyer was born.

You have a longtime umpire who made one of the most famous and controversial calls in baseball history. You have the Yankee owner who acquired Babe Ruth and built Yankee Stadium. And you have Deacon White, who hit 24 home runs in his career, never scored or drove in 100 runs in a season, and registered 500 plate appearances in a year for the first time at age 38. At a glance, he seems like a rather odd selection.

Rob Neyer is one of the few writers who bothered to address the Veterans’ ballot this year. He penned the following in reference to White:

“Deacon White - 19th-century catcher, and it’s always hard to know what to do with 19th-century catchers, because the demands of the position at that time – no real mitt, no shin guards, no mask – meant that catchers didn’t play many games, or last many seasons. In fact, White shifted to other positions (mostly third base) in the latter half of his career.
At 42, he was still playing every day, which probably says as much about baseball at that time than about his talents.”

Indeed, White’s games played totals during his career as a catcher (1871-79) don’t look terribly impressive:

29, 22, 60, 70, 80, 66, 59, 61, 78

But games played totals have to be put in context. The numbers of games played by White’s teams in those seasons are:

29, 22, 60, 71, 82, 66, 61, 61, 81

White wasn’t missing time due to the strains of primitive catching. In fact, he was barely missing any time at all – a total of 8 games skipped in 9 years. His game totals were low because his teams weren’t playing anything approaching a modern schedule – professional baseball was in its infancy, long-distance travel was a highly challenging endeavor, teams frequently played large numbers of non-league games, and franchises would sometimes fold midseason.

The question then becomes: How do you adjust for this schedule discrepancy?

The easiest answer to this question is not to adjust at all. White played the games he played, amassed the hits and doubles and RBI that he amassed, and should be compared to other players on that basis. By this logic, White’s career totals of 2067 hits, 1140 runs, 988 RBI, and 44 Wins Above Replacement (as estimated by Baseball Reference) are nice enough, but not terribly impressive in a historic context, especially considering the fact that he doesn’t have a single season exceeding 160 hits, 210 total bases, or 5 WAR.

There are a couple of easily identifiable problems with this method. First, it doesn’t account for opportunity. White played in nearly all the games he possibly could have, while a modern player who participates in 80 games in a season is not only missing half of the year, but making his team find someone else to put in his spot for the other 82. The other issue is that of impact on team results.
In an 82-game schedule, a literally interpreted 5-WAR player (White’s 1875 total was 4.9) should turn a .500 team into a .561 team (46-36); if you roughly double the length of the schedule to 162 games, you correspondingly reduce the impact of the wins (86-76, or .531).

The real issue at hand is not White’s raw contributions on the baseball field, but rather how those contributions affected his teams’ position in the standings. In an attempt to take the most direct approach possible to that question, this analysis is going to take WAR at face value as an estimate of wins added, with all the applicable caveats.

The most intuitive method is to simply prorate each season to 162 games, multiplying the player’s statistics by (162/team games played). This makes a 2-WAR season through 20 games equivalent to a 16-WAR season through 160. Applied to our sample player, we see that this adjustment turns White’s 1875 from a 4.9-win season into a 9.7-win season, and he also picks up 8-win campaigns in ’72, ’76, and ’77. His career WAR more than doubles, to 93.0, and he moves firmly into Hall of Fame territory – at least, if this is a fair adjustment.

The trouble is, it’s not a fair adjustment. White’s 1872 Cleveland Forest Citys played in 22 games, so let’s look at the 2012 standings after 22 games for comparison. You find the Dodgers and Rangers at 16-6, the Twins and Royals at 6-16, and the Padres and Angels at 7-15. All of those teams have winning percentages further from .500 than the 55-107 Astros did at the end of the year – and this edition of the Astros had baseball’s most extreme end-of-season record since 2004.

It’s no surprise that smaller samples of games are inherently prone to wider variation in team performance. Because of this phenomenon, a player who contributes 2 wins in a 20-game schedule (which would be expected to give an otherwise-average team a 12-8 record) would be far less likely to propel his team to a pennant than one who adds 16 wins over a 162-game schedule (97-65).
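The arithmetic above can be checked with a short sketch. The function names are hypothetical, and the sketch follows the text in taking WAR at face value as wins added to an otherwise .500 team.

```python
def prorate_war(war: float, team_games: int, full_season: int = 162) -> float:
    """Simple prorating: scale WAR by (full_season / team_games)."""
    return war * full_season / team_games


def record_with_added_wins(added_wins: float, games: int) -> tuple[float, float, float]:
    """Wins, losses, and winning percentage of a .500 team plus added_wins."""
    wins = games / 2 + added_wins
    losses = games - wins
    return wins, losses, wins / games


# White's 1875 season: 4.9 WAR in an 82-game team schedule.
print(round(prorate_war(4.9, 82), 1))

# A 5-win player on a .500 team, over 82 games and over 162 games.
for games in (82, 162):
    wins, losses, pct = record_with_added_wins(5, games)
    print(f"{games} games: {wins:.0f}-{losses:.0f} ({pct:.3f})")
```

Prorating treats every win as equally valuable regardless of schedule length, which is exactly the assumption the 22-game standings comparison calls into question.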
We can account for this small-sample effect by simply comparing the variance in team performance through different points in the season. Of course, we’ll want to use a sample larger than one season of baseball to do so.
[Table not preserved. Columns: Games; S% (standard deviation of team winning percentage); S (standard deviation of team wins); S(162)/S.]

The spread in team winning percentage becomes smaller throughout the year, as expected. Adjusting for the variance in team performance, it’s evident that a 2-win increase in a 20-game season is not, in fact, equivalent to a 16-win improvement over 162; it’s closer in impact to a 9-win enhancement over a full season. That looks about right; you’d expect a 12-8 team to be in contention to either win a weak division or take the second wild card slot, and you’d expect the same from a 90-72 team.

Accounting for this takes a bit of the air out of White’s production – his 1877 season, in which he was the best hitter in baseball but played mostly first base rather than catcher, is now equivalent to a 7.3 WAR season rather than 8.5. That’s still an excellent year, but it doesn’t look as impressive as it did under the simpler prorating adjustment.

Note, however, that because I lacked the stamina to enter records after each of 162 games for each of 1242 teams, we’re left with substantial gaps in the table. For maximum utility, we should try to find a curve that can be applied to any season length, up to and surpassing 162 games. As it happens, there is just such a curve; its origins will be explained in further detail shortly. The equation is as follows (with apologies for the awkward formatting):

S%(N) = (.25/N + .0554^2)^(1/2),

where N is the number of games played. As before, multiplying S%(N) by N gives the standard deviation of team wins. For comparison, here are the results when this curve is applied to the same season lengths listed in the table above:
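The curve is easy to evaluate for any schedule length. A minimal sketch follows; the function names are illustrative, the season lengths are picked only because they come up in the text (they are not necessarily the rows of the original table), and 0.0554 is the constant from the equation above.

```python
import math

TALENT_SD = 0.0554  # constant from the article's equation


def sd_win_pct(n_games: int) -> float:
    """S%(N): modeled standard deviation of team winning percentage."""
    return math.sqrt(0.25 / n_games + TALENT_SD ** 2)


def sd_wins(n_games: int) -> float:
    """S(N): modeled standard deviation of team wins, S%(N) * N."""
    return sd_win_pct(n_games) * n_games


def full_season_factor(n_games: int, full_season: int = 162) -> float:
    """S(162)/S(N): multiplier converting wins in N games to 162-game wins."""
    return sd_wins(full_season) / sd_wins(n_games)


print("Games  S%(N)   S(N)  S(162)/S(N)")
for n in (20, 22, 61, 82, 162):
    print(f"{n:5d}  {sd_win_pct(n):.4f}  {sd_wins(n):5.2f}  {full_season_factor(n):.2f}")
```

Under these assumptions, full_season_factor(20) comes out near 4.4, so a 2-win contribution over 20 games maps to roughly 9 wins over 162, in line with the figure quoted from the empirical comparison.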
[Table not preserved. Columns: Games; S%(N); S(N); S(162)/S(N).]

The basic assumption I used is that there are two reasons for variance in team performance: talent and luck. The overall variance can then be expressed as a function of the variance due to talent and the variance due to luck. Assuming that there is no relationship between talent and luck (which is pretty much true by definition; if your luck is somehow based in talent, it’s not actually luck), this function should be:

S(T+L)^2 = S(T)^2 + S(L)^2

The standard deviation due to luck can be calculated using the binomial probability distribution, which applies to the answering of large numbers of identical yes-or-no questions such as “did the coin come up heads?” or “did you win the baseball game?”. Over a sample of N games, the standard deviation of the number of wins for a .500 talent team is:

S(L, N) = (.5^2 * N)^(1/2)

This makes the standard deviation of winning percentage due to luck

S%(L, N) = (.25/N)^(1/2)

If that looks familiar, that’s because it’s the first half of the equation for the curve proposed earlier. The second half is the value for the other source of variance, talent. Using the equation given above for S(T+L), we can actually find an observed value for the standard deviation due to talent through each of the samples used:
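The decomposition can be inverted to back out the talent spread from an observed spread. The sketch below is only an illustration with hypothetical function names; because the observed values are not reproduced here, the input is generated from the model itself (using the 0.0554 constant from the curve quoted earlier) purely as a round-trip check.

```python
import math


def luck_sd_win_pct(n_games: int) -> float:
    """S%(L, N): binomial SD of winning percentage for a .500-talent team."""
    return math.sqrt(0.25 / n_games)


def talent_sd_win_pct(observed_sd: float, n_games: int) -> float:
    """Solve S(T+L)^2 = S(T)^2 + S(L)^2 for S(T), in winning-percentage terms."""
    talent_var = observed_sd ** 2 - luck_sd_win_pct(n_games) ** 2
    if talent_var < 0:
        raise ValueError("observed spread is smaller than luck alone predicts")
    return math.sqrt(talent_var)


# Round trip: build an "observed" spread from an assumed talent SD, then recover it.
assumed_talent_sd = 0.0554  # assumption: constant taken from the article's curve
for n in (20, 101, 162):
    observed = math.sqrt(luck_sd_win_pct(n) ** 2 + assumed_talent_sd ** 2)
    print(n, round(talent_sd_win_pct(observed, n), 4))
```

With real standings data, observed_sd would instead be the measured standard deviation of team winning percentages after n_games.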
[Table not preserved. Columns: Games; S%(T,N).]

We can quickly observe two things. First, the value remains rather stable from game 20 through game 101, then increases steadily for the remainder of the season. This makes sense, because game 101 is roughly the timing of the trade deadline, when good teams get better and bad teams get worse; you’d expect the variation in talent to increase, and to continue to do so as rosters expand in September.

All right, the math is done; let’s get back to the baseball side of things. What does all of this mean for Deacon White? Here’s his career WAR in three forms: raw, prorated, and adjusted using the proposed model.
[Table not preserved. Columns: Year; Team; Lg; Tm G (team games); WAR; PR WAR (prorated); Mod WAR (model-adjusted).]

One final, bright red warning light about this adjustment: Since it was derived around team wins and the spread thereof, it’s not really safe to apply to non-win-based measurements. So while it might be fun (at least if you’re me) to find out that Deacon White had equivalent career totals of 3457 hits and 1957 runs, or that his 1873 season features 263 equivalent hits, exceeding Ichiro’s single-season record, that’s not an exercise in any danger of drowning in rigor.

Even with that caveat, the adjustment proposed here is still quite useful, not only for White and the other stars of the earliest era of baseball, but also for more recent players such as Heinie Groh, Bobby Grich, and Jeff Bagwell, who peaked during shortened seasons. It strikes a balance between no adjustment, which penalizes these players for missing games that never occurred, and prorating, which attempts to address that issue but overcompensates. By accounting for evolution in the standings over the course of the year, it gives us a better chance to answer the question we’re really trying to ask: How much did this player improve his team’s odds of winning the pennant?

Eric J can SABER all he wants to
Posted: December 10, 2012 at 09:19 AM  17 comment(s)