— Where BTF's Members Investigate the Grand Old Game
Monday, November 18, 2002
And the Beat Goes On: Derek Jeter and the State of Fielding Analysis in Sabermetrics - Part 5
Mike tackles the latest from Bill James, Win Shares.
The Return of the Master - Win Shares
Bill James, in the 1984 Baseball Abstract, railed against the search for a “great statistic” that would combine all of the aspects of a player’s performance onto one scale. However, that did not keep James from trying to develop one, even after his retirement from sabermetrics in 1988. In 1996, James began working on a method for tying together all of the aspects of a player’s performance into a single number, by relating those characteristics to the overall performance of the team in terms of wins. James presented the resulting system, Win Shares, at SABR31 in 2001, and a book detailing the system was published by STATS, Inc. in 2002. Win Shares quickly became a lively discussion topic here at Baseball Primer, and everywhere that baseball was discussed on the Internet. The defensive analysis portion of the system drew the most attention, with many people accepting James’s contention that the system was entirely new and different, even though Davenport and Saeger had used many of the same basic concepts in developing their own systems, as I noted earlier.
As did Davenport and Saeger, James also realized that it was important to evaluate fielding first by the performance of the team, then by the performance of the individual fielders on the team. When using individual fielding statistics such as Range Factor as the starting point, James notes that:
...this implies, in turn, that we are assuming that all defensive teams are equal. Making 5.08 plays for one defense is the same as making 5.08 plays for another defense. (Win Shares, page 110; emphasis is in the original text)
James goes on to note that this assumption cannot be correct, because it is clear that all defensive teams are not equal, and that therefore individual fielding must be evaluated within the context of the team.
I’m going to run through the model in detail for shortstops. Primer author Joe Dimino has put together a spreadsheet for calculating Win Shares for a league-season, which I used to perform the calculations.
In the Win Shares system, James assigns each team three Win Shares for each team win. He then divides those Win Shares between offense and defense (pitching+fielding), assigning roughly 48% of the Win Shares on average to offense and 52% to defense. He then divides the defensive totals between pitching and fielding, with roughly 35% of the team total on average going to pitching and 17% to fielding. So a team that wins 114 games, like the 1998 Yankees, will have 342 total Win Shares to divide, and with a normal performance on offense and defense will have about 164 of those Win Shares assigned to the hitters, 120 assigned to the pitchers, and 58 to the fielders. The 1998 Yankees actually had 171.5 Win Shares assigned to the hitters, 118 to the pitchers, and 52.5 to the fielders. I should note that James has placed a maximum and minimum limit on the number of Win Shares assigned to the fielders, and the standard formula would have resulted in a number of Win Shares assigned to the fielders that would have exceeded the maximum allowable number, so that the fielders actually could not get more than the 52.5 they received. The fielders would have had about another 1.5 WS without the limit.
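The top-level split described above can be sketched in a few lines of Python. This is only an illustration using the league-average shares quoted in the text (48% offense, 35% pitching, 17% fielding); the function name `split_win_shares` is hypothetical, and James's actual per-team split depends on how the team really performed, which is why the 1998 Yankees' real figures differ.

```python
# Sketch of the top-level Win Shares split for an "average" team.
# The 48/35/17 shares are the league-average figures quoted in the text,
# not James's actual team-specific formula.

def split_win_shares(team_wins, offense=0.48, pitching=0.35, fielding=0.17):
    """Return total Win Shares and an average-team split of them."""
    total = 3 * team_wins  # three Win Shares per team win
    return {
        "total": total,
        "offense": total * offense,
        "pitching": total * pitching,
        "fielding": total * fielding,
    }

yankees_1998 = split_win_shares(114)
for part, value in yankees_1998.items():
    print(f"{part}: {value:.1f}")
```

For 114 wins this gives 342 total Win Shares, with roughly 164, 120, and 58 going to hitters, pitchers, and fielders.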
Once James determines the Win Shares to be assigned to the fielders, he determines how to divide those Win Shares between the team’s defenders at each position - assigning Win Share values to the team’s catchers, 1Bs, 2Bs, etc. Pitcher fielding for some reason is not included. Once Win Shares are assigned to each position, the Win Shares at each position are divided between all of the fielders who played at that position.
The assignment of fielding Win Shares by position, and the assignment of fielding Win Shares at a position, are based on a Claim Point system. Each positive accomplishment by fielders yields a certain number of Claim Points, and the Win Shares are divvied up based on the percentage of the total claim points accumulated. The principle is the same for both team defensive position and individual defenders at a position.
In dividing fielding Win Shares by position, James evaluates fielders based on four defensive characteristics for their position. While there are some variations from position to position, James generally weights what he considers to be the most important fielding characteristic at that position on a 40-point scale, the next most important on a 30-point scale, then 20, then 10. For shortstops, the 40-point scale is based on Assists, the 30-point scale is based on Double Plays, the 20-point scale is based on Error Percentage, and the 10-point scale is based on Putouts. I’m going to walk through the calculations for the 1998 Yankee shortstops (Jeter, Luis Sojo, and Homer Bush):
Yankee shortstops in 1998 had 440 assists. This total is compared to the number of assists that they should have been expected to get, calculated as:
(Team assist total)*(% of assists by league SS)+(excess batters faced by LHP/100)
where “excess batters faced by LHP” is
(LHP BIP) - (team BIP * lg % of BIP vs LHP)
BIP being innings pitched, multiplied by 3, minus strikeouts.
The 1998 Yankees had 1642 assists as a team. The league percentage of assists by shortstops was .29225, and the Yankees’ LHP faced 368 more batters than expected. Thus, Yankee shortstops would have been expected to have (1642*0.29225)+(368/100), or 483.55 assists.
The number of claim points that SS get for assists is
20+(actual A - expected A)/4
For the Yankees this is 20+(440-483.55)/4, or 9.11 claim points.
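As a check on the arithmetic, the assist calculation can be written out as a short Python sketch. The function and variable names here are made up for illustration; the formulas and inputs are the ones given in the text.

```python
# Sketch of the shortstop assist claim points, using the figures in the text.

def expected_ss_assists(team_assists, lg_ss_assist_share, excess_lhp):
    """Expected SS assists: league share of team assists plus the LHP adjustment."""
    return team_assists * lg_ss_assist_share + excess_lhp / 100

def assist_claim_points(actual_assists, expected_assists):
    """40-point scale: 20 points at expectation, one point per four assists."""
    return 20 + (actual_assists - expected_assists) / 4

expected_a = expected_ss_assists(1642, 0.29225, 368)
assist_points = assist_claim_points(440, expected_a)
print(round(expected_a, 2), round(assist_points, 2))
```

With these inputs the sketch reproduces the 483.55 expected assists and 9.11 claim points.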
The claim points for double plays are awarded based upon the team total of DPs turned, compared to the expected number of double plays for that team. The 1998 Yankees turned 146 double plays. The first estimate of the expected number of double plays is calculated based on the number of runners on 1B in DP situations, estimated as:
((H-HR)*(league % of singles allowed))+BB+HBP-SH-WP-Bk-PB
times the league percentage of such runners removed on double plays (DP divided by the same formula above, calculated for the league, using actual singles allowed as the first factor). For the 1998 Yankees, this value is 134.08.
This first estimate is then adjusted for the number of assists per inning, compared to the league average, on the theory that teams with more assists tend to see more ground balls than the norm, and thus are likely to turn more DPs. The 1998 Yankees had 1642 assists in 1456 2/3 innings; the league as a whole had 23295 assists in 20194 2/3 innings. The ratio of the Yankees’ A/inning to the league is therefore 0.9772, and the expected number of DPs is thus (134.08*0.9772), or 131.
The claim points for double plays are calculated as:
15+(actual DP - expected DP)/4
or for the Yankees, 15+(146-131)/4 = 18.75
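The assist-rate adjustment and the double play claim points can be sketched the same way. The names are hypothetical, and the first estimate of 134.08 is taken as given from the text, since the inputs to the runners-on-first formula are not listed. Rounding the expectation to a whole number of double plays is an assumption made to match the arithmetic shown above.

```python
# Sketch of the double play claim points for the 1998 Yankees.

def assist_rate_ratio(team_a, team_inn, lg_a, lg_inn):
    """Team assists per inning relative to the league's assists per inning."""
    return (team_a / team_inn) / (lg_a / lg_inn)

def dp_claim_points(actual_dp, expected_dp):
    """30-point scale: 15 points at expectation, one point per four DPs."""
    return 15 + (actual_dp - expected_dp) / 4

ratio = assist_rate_ratio(1642, 1456 + 2 / 3, 23295, 20194 + 2 / 3)
first_estimate = 134.08                       # from the text
expected_dp = round(first_estimate * ratio)   # rounded as in the text
dp_points = dp_claim_points(146, expected_dp)
print(round(ratio, 4), expected_dp, dp_points)
```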
Error percentage for shortstops is the complement of fielding average; it is (errors)/(total chances). The 1998 Yankee SS handled 703 chances and made 11 errors, for an error percentage of 0.01564. The league’s SS handled 10884 chances and made 308 errors, an error percentage of 0.02829. The claim points for error percentage are calculated as:
20 - (10 * (team SS error %)/(league SS error %))
For the Yankees, this is 20 - (10 * 0.01564 / 0.02829), or 14.47 claim points.
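A minimal Python sketch of the error percentage step, with illustrative function names and the counts from the text:

```python
# Sketch of the shortstop error-percentage claim points.

def error_pct(errors, total_chances):
    """Errors per total chance (one minus fielding average)."""
    return errors / total_chances

def error_claim_points(team_err_pct, lg_err_pct):
    """20-point scale: a league-average error rate earns 10 points."""
    return 20 - 10 * team_err_pct / lg_err_pct

team_e = error_pct(11, 703)
league_e = error_pct(308, 10884)
error_points = error_claim_points(team_e, league_e)
print(round(team_e, 5), round(league_e, 5), round(error_points, 2))
```

Note that a team with an error rate at or above twice the league rate would fall to zero or below on this formula; the text does not say whether James floors the result, so the sketch does not either.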
The 1998 Yankees had 252 putouts by their shortstops. This is compared to the number of putouts they were expected to make, which is:
(team PO - team K)*(lg % of non-K PO by SS) + (BB above/below lg average)/14 - (excess batters faced by LHP)/64
AL shortstops in 1998 had 0.0815 of their league’s non-strikeout putouts. Yankee pitchers walked 0.0615 men per inning fewer than the league average, or 89.67 men fewer than the league average in their 1456 2/3 innings. As noted above, they had 368 more batters faced by LHP than the league average. Thus, their shortstops would have been expected to have:
(4370-1080)*0.0815-89.67/14-368/64, or 255.982 putouts.
This is translated to claim points by the formula:
5 + (actual putouts - expected putouts)/15
For the Yankee SS, this yields 5 + (252 - 255.982)/15, or 4.74 claim points.
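The putout step can be sketched as follows. The names are illustrative, and the sign convention is an assumption: walks are passed relative to the league average, so a staff that walked fewer men than average passes a negative number. Because the inputs quoted in the text are already rounded, the result lands within about 0.01 of the published 4.74 rather than matching it exactly.

```python
# Sketch of the shortstop putout claim points.

def expected_ss_putouts(team_po, team_k, lg_ss_share, bb_vs_league, excess_lhp):
    """Expected SS putouts from non-strikeout putouts, walks, and LHP adjustment."""
    return (team_po - team_k) * lg_ss_share + bb_vs_league / 14 - excess_lhp / 64

def putout_claim_points(actual_po, expected_po):
    """10-point scale: 5 points at expectation, one point per fifteen putouts."""
    return 5 + (actual_po - expected_po) / 15

# -89.67: Yankee pitchers walked 89.67 fewer men than the league average.
expected_po = expected_ss_putouts(4370, 1080, 0.0815, -89.67, 368)
putout_points = putout_claim_points(252, expected_po)
print(round(expected_po, 2), round(putout_points, 2))
```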
James converts the claim points to a claim percentage. Since there are 100 claim points available per position, the claim percentage is simply the total of the claim points for each aspect of the position, divided by 100. For the Yankee SS, this is (9.11+18.75+14.47+4.74)/100, or .4707.
When the calculations are run through, the following claim percentages result for the other positions on the 1998 Yankees:
James now uses these claim percentages to assign the fielding Win Shares to individual positions. He does so in the following way:
Recall that the 1998 Yankees have 52.5 Win Shares assigned to the fielders. The shortstops get (0.4707-0.200)*36 = (0.2707)*36, or 9.74 weighted claim points. The team total, applying the same formula to the other positions, is 74.75 weighted claim points. The shortstops thus get 52.5*9.74/74.75, or 6.84 fielding Win Shares.
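Putting the four scales together, the conversion from claim points to position Win Shares can be sketched as below. The function names are hypothetical. The 0.200 offset is inferred from the arithmetic in the text (a claim percentage of .4707 becomes 0.2707 before weighting), and the weight of 36 and the team total of 74.75 are simply the figures quoted above.

```python
# Sketch: claim points -> claim percentage -> shortstop fielding Win Shares.

def claim_percentage(points):
    """100 claim points are available per position."""
    return sum(points) / 100

def weighted_claim(claim_pct, position_weight, offset=0.200):
    """Offset inferred from the text's arithmetic (0.4707 -> 0.2707)."""
    return (claim_pct - offset) * position_weight

def position_win_shares(team_fielding_ws, weighted, team_weighted_total):
    """Share of team fielding Win Shares in proportion to weighted claims."""
    return team_fielding_ws * weighted / team_weighted_total

ss_pct = claim_percentage([9.11, 18.75, 14.47, 4.74])
ss_weighted = weighted_claim(ss_pct, 36)
ss_ws = position_win_shares(52.5, ss_weighted, 74.75)
print(round(ss_pct, 4), round(ss_ws, 2))
```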
Once James has the Win Shares assigned to a position, he then assigns those Win Shares to the individual fielders at that position based on a formula that takes into account the percentage of the plays that the fielder handles. Again, this is done based on a claim point system. Each shortstop is assigned claim points via the following formula:
PO + A*2 - E*5 + DP + RBP*2
RBP stands for range bonus plays. A player gets assigned range bonus plays when it is clear by any interpretation of the data that he is making more plays per inning than his teammates. The formula for assigning these is pretty complicated when you don’t have defensive innings, but when you do have defensive innings, as we do for the 1998 Yankees, the range bonus goes to any player who is making more plays per inning than the average for the position for the team.
The Yankees used three shortstops in 1998. Derek Jeter played 1304 2/3 innings with 223 PO, 393 A, 9 E, and 82 DP. Luis Sojo played 141 innings with 29 PO, 44 A, 2 E, and 12 DP. Homer Bush played 11 innings with 3 A and no other defensive markers. The team total was 252 PO, 440 A, 11 E, and 96 DP in 1456 2/3 innings; the team’s SS averaged 0.475 plays per inning. Derek Jeter should have made 620 plays (PO+A) at the team average; he actually made 616, so he gets no range bonus. Luis Sojo made 73 plays, and should have made 66.98, so he gets 6.02 range bonus plays. Homer Bush made three plays, and should have made 5.23, so he gets no range bonus. The claim points for these three players using the formula are 1046 for Jeter, 131 for Sojo, and 6 for Bush, for a total of 1183.
The position Win Shares are assigned based on the proportion of the claim points that the individual earns. Jeter gets 6.84*1046/1183, or 6.04 Win Shares for his fielding. Sojo gets 6.84*131/1183, or 0.76 Win Shares, and Bush gets 6.84*6/1183, or 0.03 Win Shares (rounding accounts for the other 0.01 Win Share). The calculations for 1999 and 2000 give Jeter 5.02 and 2.70 Win Shares for fielding, respectively, for those two seasons.
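The individual split can be sketched in Python using the fielding lines given above. The class and function names are invented for illustration, and the range bonus uses only the simple defensive-innings version described in the text. The unrounded results agree with the published figures to within about 0.01 Win Share.

```python
# Sketch: dividing the shortstop Win Shares among Jeter, Sojo, and Bush.
from dataclasses import dataclass

@dataclass
class Fielder:
    name: str
    innings: float
    po: int
    a: int
    e: int
    dp: int

def range_bonus_plays(f, team_plays_per_inning):
    """Plays made above what the team rate predicts for his innings (never negative)."""
    return max(0.0, (f.po + f.a) - f.innings * team_plays_per_inning)

def claim_points(f, rbp):
    """PO + A*2 - E*5 + DP + RBP*2, as given in the text."""
    return f.po + 2 * f.a - 5 * f.e + f.dp + 2 * rbp

shortstops = [
    Fielder("Jeter", 1304 + 2 / 3, 223, 393, 9, 82),
    Fielder("Sojo", 141, 29, 44, 2, 12),
    Fielder("Bush", 11, 0, 3, 0, 0),
]
team_rate = (252 + 440) / (1456 + 2 / 3)  # about 0.475 plays per inning
claims = {f.name: claim_points(f, range_bonus_plays(f, team_rate)) for f in shortstops}
total_claims = sum(claims.values())
win_shares = {name: 6.84 * c / total_claims for name, c in claims.items()}
for name in claims:
    print(name, round(claims[name]), round(win_shares[name], 2))
```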
Because Win Shares are dependent on playing time, to compare shortstops I determine the number of Win Shares earned per 1000 innings played. Table 8 shows the results of this comparison for the years 1998-2000 among SS with 800 innings or more, with Palmer’s FR, Davenport’s DFTs, Saeger’s CAD Defensive Winning Percentages, and fieldable shortstop opportunities per nine innings included for comparison purposes:
Table 8. AL SS Win Shares /1000 Inn, 1998-2000 (min 800 inn)
The correlation coefficient between Win Shares per 1000 innings and the other data presented for comparison purposes:
Win Shares and FR: r=+0.696 (given James’s take on Fielding Runs, this might come as a surprise to him)
There have been some criticisms of the Win Shares fielding system, many of them having to do with the arbitrary nature of James’s weights for defensive events. While I think those criticisms are reasonable, I’m more concerned here with whether James has accurately captured overall defensive value in his approach - in other words, can we rely upon the conclusion that a player with 6 defensive Win Shares is a better fielder than a player with 4? If James is accurately evaluating the relative importance of defensive events and accounting for team contextual effects, his rankings should give reliable results, regardless of the specific scale that he uses. If he is not, then we should be able to evaluate the specific shortcomings of the method by looking at how well the results compare with inferences drawn from play-by-play data.
It is clear, looking at the ratings, that Jeter’s Win Shares totals are heavily influenced by his low assist totals. Yankee SS (mostly Jeter) putout totals are close to league average, and their double play rates (at least in this method, although as I noted in Part 4 they are very bad in relation to actual DP chances) have been only slightly under the expected rate. Except for 2000, the Yankee SS have very good error rates. But their assist rates have been very low - on the 40-point scale Yankee SS had fewer than 10 claim points in each of the three seasons. Since SS lose one claim point for every four assists that they fall short of their expectation, and since a SS who exactly meets the expectation gets 20 claim points, Yankee SS are getting at least 40 fewer assists than expected in Win Shares. The obvious question to be asked at this point, then, is whether this shortfall is actually due to their fielding skill, or whether it results from an overestimate of their opportunities to make plays in the Win Shares method.
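The "at least 40 fewer assists" figure follows from inverting the assist claim point formula. A small sketch, with a hypothetical function name:

```python
# Sketch: invert 20 + (actual A - expected A)/4 to recover the assist shortfall.

def assist_shortfall(assist_points):
    """Assists below expectation implied by a given number of claim points."""
    return 4 * (20 - assist_points)

print(assist_shortfall(10))                # the 10-point threshold
print(round(assist_shortfall(9.11), 2))    # the 1998 Yankee shortstops
```

Ten claim points corresponds to a shortfall of exactly 40 assists, so any total under 10 implies more than 40; the 1998 figure of 9.11 claim points corresponds to about 43.56 assists below expectation.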
James makes one major adjustment for ball distribution in his method, based on balls in play vs LHP in excess of the league average total. Ordinarily, one might expect this adjustment to increase the number of balls hit into play on the left side - and in most cases that happens. As calculated in the Win Shares method, the Yankees had an excess of LHP in both 1998 and 2000, and were exactly league average in 1999. Thus, one might have expected the Yankees to have more GBIP to the left side in 1998 and 2000, and a league average total in 1999. The reality is quite different, as Table 9 demonstrates:
Table 9. AL Distribution of Ground Balls in Play, 1998-2000
In all three seasons, the Yankees had far fewer GBIP hit to the left side than would be expected given the balls in play against their LHP. In this particular case, the adjustment penalizes Yankee SS for plays that they never had a chance to make. Obviously, this is also true of the other methods that we have discussed, where left/right adjustments based on the orientation of the pitching staff are used - the point here is not to criticize the use of the adjustment but to point out that, in the specific case of the Yankees, the adjustment leads to a bias in the measurement because the Yankees’ ball distribution doesn’t fit within the model on which the method is based. Furthermore, that bias in the model, when applied to the Yankees, would have the effect of reducing the ranking of the team’s shortstops.
I also decided to take a look at James’s assumption that an average shortstop would get assists at the same rate, regardless of whether or not his team had a high number of assists or a low number of assists. I identified 10 extreme fly ball teams in the 1998-2000 AL, and 11 extreme ground ball teams during that period, and took a look at the ratio of shortstop assists to team assists for each of those teams. The shortstops on the flyball teams averaged 28.8% of team assists; the shortstops on the groundball teams averaged 29.0% of team assists. The groundball teams faced 4% more left-handed hitters, which would reduce the number of assists that their SS get. When I adjusted for this at the rate at which those team SS got assists when facing LHB and RHB, and weighted the rate based on each team’s pitchers facing an average mix of LHB and RHB, the SS on the flyball teams would have averaged 28.6% of team assists and the SS on the groundball teams would have averaged 29.3% of team assists. When you make a similar correction for LH/RH batters faced for all of the AL teams between 1998 and 2000, there is virtually no correlation between the team’s groundball rate and the percentage of assists that its shortstops get (r=-0.058 over the 1998-2000 period). Thus, while I thought that shortstops who played behind ground ball staffs might have a higher percentage of assists than SS that played behind fly ball staffs, that is apparently not the case, and thus using the league-average rate as a ba