Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Monday, November 18, 2002

And the Beat Goes On: Derek Jeter and the State of Fielding Analysis in Sabermetrics - Part 5

Mike tackles the latest from Bill James, Win Shares.

The Return of the Master - Win Shares

Bill James, in the 1984 Baseball Abstract, railed against the search for a “great statistic” that would combine all of the aspects of a player’s performance onto one scale. However, that did not keep James from trying to develop one, even after his retirement from sabermetrics in 1988. In 1996, James began working on a method for tying together all of the aspects of a player’s performance into a single number, by relating those characteristics to the overall performance of the team in terms of wins. James presented the resulting system, Win Shares, at SABR31 in 2001, and a book detailing the system was published by STATS, Inc. in 2002. Win Shares quickly became a lively discussion topic here at Baseball Primer, and everywhere that baseball was discussed on the Internet. The defensive analysis portion of the system drew the most attention, with many people accepting James’s contention that the system was entirely new and different, even though Davenport and Saeger had used many of the same basic concepts in developing their own systems, as I noted earlier.

Like Davenport and Saeger, James realized that it was important to evaluate fielding first by the performance of the team, then by the performance of the individual fielders on the team. When individual fielding statistics such as Range Factor are used as the starting point, James notes that:

...this implies, in turn, that we are assuming that all defensive teams are equal. Making 5.08 plays for one defense is the same as making 5.08 plays for another defense. (Win Shares, page 110; emphasis is in the original text)

James goes on to note that this assumption cannot be correct, because it is clear that all defensive teams are not equal, and that therefore individual fielding must be evaluated within the context of the team.

I’m going to run through the model in detail for shortstops. Primer author Joe Dimino has put together a spreadsheet for calculating Win Shares for a league-season, which I used to perform the calculations.

In the Win Shares system, James assigns each team three Win Shares for each team win. He then divides those Win Shares between offense and defense (pitching+fielding), assigning roughly 48% of the Win Shares on average to offense and 52% to defense. He then divides the defensive totals between pitching and fielding, with roughly 35% of the team total on average going to pitching and 17% to fielding. So a team that wins 114 games, like the 1998 Yankees, will have 342 total Win Shares to divide, and with a normal performance on offense and defense will have about 164 of those Win Shares assigned to the hitters, 120 assigned to the pitchers, and 58 to the fielders. The 1998 Yankees actually had 171.5 Win Shares assigned to the hitters, 118 to the pitchers, and 52.5 to the fielders. I should note that James places maximum and minimum limits on the number of Win Shares assigned to the fielders. For the 1998 Yankees the standard formula would have exceeded the maximum, so the fielders were capped at the 52.5 they received; without the limit they would have had about another 1.5 Win Shares.
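The average-case split described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the 48/35/17 shares are the rough averages quoted in the text, not the team-specific values that the full method calculates.

```python
# Sketch of the average-case team split. The default shares are the rough
# averages quoted in the article, not exact constants from the method.

def split_team_win_shares(wins, offense=0.48, pitching=0.35, fielding=0.17):
    """Return the average-case division of a team's Win Shares."""
    total = 3 * wins  # three Win Shares per team win
    return {
        "total": total,
        "offense": total * offense,
        "pitching": total * pitching,
        "fielding": total * fielding,
    }

yankees_1998 = split_team_win_shares(114)
for part, value in yankees_1998.items():
    print(f"{part:>8}: {value:.1f}")
```

For 114 wins this gives the "normal performance" baseline of roughly 164, 120, and 58 Win Shares, against which the actual 171.5, 118, and 52.5 can be compared.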

Once James determines the Win Shares to be assigned to the fielders, he determines how to divide those Win Shares between the team’s defenders at each position - assigning Win Share values to the team’s catchers, 1Bs, 2Bs, etc. Pitcher fielding for some reason is not included. Once Win Shares are assigned to each position, the Win Shares at each position are divided between all of the fielders who played at that position.

The assignment of fielding Win Shares by position, and the assignment of fielding Win Shares at a position, are based on a claim point system. Each positive accomplishment by fielders yields a certain number of claim points, and the Win Shares are divvied up based on the percentage of the total claim points accumulated. The principle is the same for both team defensive positions and individual defenders at a position.

In dividing fielding Win Shares by position, James evaluates fielders based on four defensive characteristics for their position. While there are some variations from position to position, James generally weights what he considers to be the most important fielding characteristic at that position on a 40-point scale, the next most important on a 30-point scale, then 20, then 10. For shortstops, the 40-point scale is based on Assists, the 30-point scale is based on Double Plays, the 20-point scale is based on Error Percentage, and the 10-point scale is based on Putouts. I’m going to walk through the calculations for the 1998 Yankee shortstops (Jeter, Luis Sojo, and Homer Bush):

Yankee shortstops in 1998 had 440 assists. This total is compared to the number of assists that they should have been expected to get, calculated as:

(Team assist total)*(% of assists by league SS)+(excess batters faced by LHP/100)

where “excess batters faced by LHP” is

(LHP BIP) - (team BIP * lg % of BIP vs LHP)

where BIP (balls in play) is estimated as innings pitched multiplied by 3, minus strikeouts.

The 1998 Yankees had 1642 assists as a team. The league percentage of assists by shortstops was .29225, and the Yankees’ LHP faced 368 more batters than expected. Thus, Yankee shortstops would have been expected to have

1642*0.29225+(368/100)=483.55 assists.

The number of claim points that SS get for assists is

20+(actual A - expected A)/4

For the Yankees this is 20+(440-483.55)/4, or 9.11 claim points.
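As a check on the arithmetic, the assist calculation can be written as a small Python sketch. The function and parameter names are my own labels for the quantities in the text, and the 368 excess batters faced by LHP is taken as given rather than derived, since the underlying BIP figures are not listed here.

```python
# Sketch of the shortstop assist claim points, using the figures in the text.

def excess_lhp_bip(lhp_bip, team_bip, league_lhp_share):
    """Balls in play against LHP beyond what the league share would predict."""
    return lhp_bip - team_bip * league_lhp_share

def expected_ss_assists(team_assists, league_ss_share, excess_lhp):
    """Expected SS assists: league share of team assists plus the LHP adjustment."""
    return team_assists * league_ss_share + excess_lhp / 100

def assist_claim_points(actual, expected):
    """Claim points on the 40-point assist scale (20 is exactly average)."""
    return 20 + (actual - expected) / 4

expected = expected_ss_assists(1642, 0.29225, 368)
points = assist_claim_points(440, expected)
print(f"expected assists: {expected:.2f}")
print(f"claim points:     {points:.2f}")
```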

The claim points for double plays are awarded based upon the team total of DPs turned, compared to the expected number of double plays for that team. The 1998 Yankees turned 146 double plays. The first estimate of the expected number of double plays is calculated based on the number of runners on 1B in DP situations, estimated as:

((H-HR)*(league % of singles allowed))+BB+HBP-SH-WP-Bk-PB

times the league percentage of such runners removed on double plays (DP divided by the same formula above, calculated for the league, using actual singles allowed as the first factor). For the 1998 Yankees, this value is:

(((1357-156)*0.75196)+466+68-37-37-5-12)*0.0994=134.08

This first estimate is then adjusted for the number of assists per inning, compared to the league average, on the theory that teams with more assists tend to see more ground balls than the norm, and thus are likely to turn more DPs. The 1998 Yankees had 1642 assists in 1456 2/3 innings; the league as a whole had 23295 assists in 20194 2/3 innings. The ratio of the Yankees’ A/inning to the league is therefore 0.9772, and the expected number of DPs is thus (134.08*0.9772), or 131.

The claim points for double plays are calculated as:

15+(actual DP - expected DP)/4

or for the Yankees, 15+(146-131)/4 = 18.75
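The double play steps can be sketched the same way. The names below are hypothetical labels for the quantities in the text. One caveat: plugging the rounded inputs printed above into the runners-on-first formula gives a first estimate near 133.8 rather than 134.08, presumably because the published rates are rounded, so the example starts from the article's 134.08.

```python
# Sketch of the double play claim points for a team's shortstops.

def runners_on_first(h, hr, league_single_share, bb, hbp, sh, wp, bk, pb):
    """Estimated runners on first base in double play situations."""
    return (h - hr) * league_single_share + bb + hbp - sh - wp - bk - pb

def assist_rate_ratio(team_a, team_inn, league_a, league_inn):
    """Team assists per inning relative to the league."""
    return (team_a / team_inn) / (league_a / league_inn)

def dp_claim_points(actual_dp, expected_dp):
    """Claim points on the 30-point double play scale (15 is exactly average)."""
    return 15 + (actual_dp - expected_dp) / 4

first_estimate = 134.08  # article's value; rounded inputs give about 133.8
ratio = assist_rate_ratio(1642, 1456 + 2 / 3, 23295, 20194 + 2 / 3)
expected_dp = first_estimate * ratio
dp_points = dp_claim_points(146, expected_dp)
print(f"A/inning ratio: {ratio:.4f}")
print(f"expected DP:    {expected_dp:.1f}")
print(f"claim points:   {dp_points:.2f}")
```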

Error percentage for shortstops is the complement of fielding average; it is (errors)/(total chances). The 1998 Yankee SS handled 703 chances and made 11 errors, for an error percentage of 0.01564. The league’s SS handled 10884 chances and made 308 errors, an error percentage of 0.02829. The claim points for error percentage are calculated as:

20 - (10 * (team SS error %)/(league SS error %))

For the Yankees, this is 20 - (10 * 0.01564 / 0.02829), or 14.47 claim points.
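A minimal Python sketch of the error percentage step, with function names of my own choosing:

```python
# Sketch of the error percentage claim points for shortstops.

def error_percentage(errors, total_chances):
    """Errors divided by total chances (one minus fielding average)."""
    return errors / total_chances

def error_claim_points(team_pct, league_pct):
    """Claim points on the 20-point error scale (10 is exactly average)."""
    return 20 - 10 * team_pct / league_pct

team_pct = error_percentage(11, 703)
league_pct = error_percentage(308, 10884)
error_points = error_claim_points(team_pct, league_pct)
print(f"team error %:   {team_pct:.5f}")
print(f"league error %: {league_pct:.5f}")
print(f"claim points:   {error_points:.2f}")
```

Because the formula works from the ratio of team to league error rates, a team that matches the league gets 10 points and a team with no errors at all gets the full 20.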

The 1998 Yankees had 252 putouts by their shortstops. This is compared to the number of putouts they were expected to make, which is:

(team PO - team K)*(lg % of non-K PO by SS) + (BB above/below lg average)/14 - (excess batters faced by LHP)/64

AL shortstops in 1998 had 0.0815 of their league’s non-strikeout putouts. Yankee pitchers walked 0.0615 men per inning fewer than the league average, or 89.67 men fewer than the league average in their 1456 2/3 innings. As noted above, they had 368 more batters faced by LHP than the league average. Thus, their shortstops would have been expected to have:

(4370-1080)*0.0815-89.67/14-368/64, or 255.982 putouts.

This is translated to claim points by the formula:

5 + (actual putouts - expected putouts)/15

For the Yankee SS, this yields 5 + (252 - 255.982)/15, or 4.74 claim points.
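The putout step can be sketched as follows. The names are hypothetical, and the walk term is passed as a signed number (negative when the staff walked fewer men than the league average), which is my reading of "BB above/below lg average". With the rounded inputs shown in the text the result lands within about 0.01 of the article's figure.

```python
# Sketch of the shortstop putout claim points.

def expected_ss_putouts(team_po, team_k, league_ss_share, walks_vs_league, excess_lhp):
    """Expected SS putouts from non-strikeout putouts, walks, and the LHP adjustment."""
    return (team_po - team_k) * league_ss_share + walks_vs_league / 14 - excess_lhp / 64

def putout_claim_points(actual, expected):
    """Claim points on the 10-point putout scale (5 is exactly average)."""
    return 5 + (actual - expected) / 15

# The Yankees walked 89.67 fewer men than league average, hence the negative sign.
expected_po = expected_ss_putouts(4370, 1080, 0.0815, -89.67, 368)
putout_points = putout_claim_points(252, expected_po)
print(f"expected putouts: {expected_po:.2f}")
print(f"claim points:     {putout_points:.2f}")
```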

James converts the claim points to a claim percentage. Since there are 100 claim points available per position, the claim percentage is simply the total of the claim points for each aspect of the position, divided by 100. For the Yankee SS, this is (9.11+18.75+14.47+4.74)/100, or .4707.

When the calculations are run through, the following claim percentages result for the other positions on the 1998 Yankees:
C: .6244
1B: .5652
2B: .5779
3B: .5874
OF: .5984

James now uses these claim percentages to assign the fielding Win Shares to individual positions. He does so in the following way:

  • Calculate the number of weighted claim points for each position, by multiplying (claim percentage -.200) by the intrinsic weight James assigns for the position. The intrinsic weight is intended to represent the relative importance of the defensive position. Catchers are assigned an intrinsic weight of 38 points, 1Bs 12 points, 2Bs 32 points, 3Bs 24 points, SS 36 points, and OFs 58 points.
  • Assign Win Shares to each position based on the formula (team fielding Win Shares) * (position weighted claim points) / (total weighted claim points)

    Recall that the 1998 Yankees have 52.5 Win Shares assigned to the fielders. The shortstops get (0.2707)*36, or 9.74 weighted claim points. The team total, applying the same formula to the other positions, is 74.75 weighted claim points. The shortstops thus get 52.5*9.74/74.75, or 6.84 fielding Win Shares.
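The position-level allocation can be sketched in Python using the claim percentages and intrinsic weights given above. The function names are my own, and the 0.200 subtraction is written as a default parameter purely for readability.

```python
# Sketch of dividing team fielding Win Shares among positions.

INTRINSIC_WEIGHTS = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}

def claim_percentage(claim_points):
    """Total claim points for a position, out of the 100 available."""
    return sum(claim_points) / 100

def position_win_shares(team_fielding_ws, claim_pcts, floor=0.200):
    """Split team fielding Win Shares by weighted claim points."""
    weighted = {
        pos: (pct - floor) * INTRINSIC_WEIGHTS[pos]
        for pos, pct in claim_pcts.items()
    }
    total = sum(weighted.values())
    return {pos: team_fielding_ws * w / total for pos, w in weighted.items()}

claim_pcts = {
    "C": 0.6244, "1B": 0.5652, "2B": 0.5779, "3B": 0.5874,
    "SS": claim_percentage([9.11, 18.75, 14.47, 4.74]),
    "OF": 0.5984,
}
shares = position_win_shares(52.5, claim_pcts)
for pos, ws in shares.items():
    print(f"{pos:>2}: {ws:.2f}")
```

Since every position is scaled by the same team total, the position shares always sum back to the 52.5 fielding Win Shares.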

    Once James has the Win Shares assigned to a position, he then assigns those Win Shares to the individual fielders at that position based on a formula that takes into account the percentage of the plays that the fielder handles. Again, this is done based on a claim point system. Each shortstop is assigned claim points via the following formula:

    PO + A*2 - E*5 + DP + RBP*2

    RBP stands for range bonus plays. A player gets assigned range bonus plays when it is clear by any interpretation of the data that he is making more plays per inning than his teammates. The formula for assigning these is pretty complicated when you don’t have defensive innings, but when you do have defensive innings, as we do for the 1998 Yankees, the range bonus goes to any player who is making more plays per inning than the average for the position for the team.

    The Yankees used three shortstops in 1998. Derek Jeter played 1304 2/3 innings with 223 PO, 393 A, 9 E, and 82 DP. Luis Sojo played 141 innings with 29 PO, 44 A, 2 E, and 12 DP. Homer Bush played 11 innings with 3 A and no other defensive markers. The team total was 252 PO, 440 A, 11 E, and 96 DP in 1456 2/3 innings; the team’s SS averaged 0.475 plays per inning. Derek Jeter should have made 620 plays (PO+A) at the team average; he actually made 616, so he gets no range bonus. Luis Sojo made 73 plays, and should have made 66.98, so he gets 6.02 range bonus plays. Homer Bush made three plays, and should have made 5.23, so he gets no range bonus. The claim points for these three players using the formula are:

    Jeter: 1046
    Sojo: 131
    Bush: 6

    The position Win Shares are assigned based on the proportion of the claim points that the individual earns. Jeter gets 6.84*1046/1183, or 6.04 Win Shares for his fielding. Sojo gets 6.84*131/1183, or 0.76 Win Shares, and Bush gets 6.84*6/1183, or 0.03 Win Shares (rounding accounts for the other 0.01 Win Share). The calculations for 1999 and 2000 give Jeter 5.02 and 2.70 Win Shares for fielding, respectively, for those two seasons.
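The individual allocation, including the range bonus for the case where defensive innings are available, can be sketched as below. The data layout and function name are assumptions made for illustration. The sketch keeps fractional claim points where the article rounds them to whole numbers, so the results can differ from the figures above in the second decimal place.

```python
# Sketch of dividing a position's Win Shares among the players who played it.

def allocate_position_ws(position_ws, players):
    """Return (claim points, Win Shares) per player at one position."""
    total_plays = sum(p["po"] + p["a"] for p in players.values())
    total_inn = sum(p["inn"] for p in players.values())
    team_rate = total_plays / total_inn  # plays per inning at the position

    points = {}
    for name, p in players.items():
        expected_plays = team_rate * p["inn"]
        # Range bonus plays: only plays made above the team rate count.
        rbp = max(0.0, p["po"] + p["a"] - expected_plays)
        points[name] = p["po"] + 2 * p["a"] - 5 * p["e"] + p["dp"] + 2 * rbp

    total_points = sum(points.values())
    ws = {name: position_ws * pts / total_points for name, pts in points.items()}
    return points, ws

shortstops = {
    "Jeter": {"inn": 1304 + 2 / 3, "po": 223, "a": 393, "e": 9, "dp": 82},
    "Sojo": {"inn": 141, "po": 29, "a": 44, "e": 2, "dp": 12},
    "Bush": {"inn": 11, "po": 0, "a": 3, "e": 0, "dp": 0},
}
points, ws = allocate_position_ws(6.84, shortstops)
for name in shortstops:
    print(f"{name:>5}: {points[name]:7.1f} claim points, {ws[name]:.2f} Win Shares")
```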

    Because Win Shares are dependent on playing time, to compare shortstops I determine the number of Win Shares earned per 1000 innings played. Table 8 shows the results of this comparison for the years 1998-2000 among SS with 800 innings or more, with Palmer’s FR, Davenport’s DFTs, Saeger’s CAD Defensive Winning Percentages, and fieldable shortstop opportunities per nine innings included for comparison purposes:

    Table 8. AL SS Win Shares /1000 Inn, 1998-2000 (min 800 inn)

    1998 Team Inn WS WS/1000 FR DFT DWP SSF/9
    Vizquel, O CLE 1316.0 8.63 6.56 7.44 11 0.541 4.08
    Gonzalez, A TOR 1398.3 8.67 6.20 -9.40 5 0.536 3.69
    Bordick, M BAL 1238.3 7.49 6.05 18.26 4 0.501 4.42
    Stocker, K TBA 940.0 5.39 5.74 11.63 14 0.540 4.41
    DiSarcina, G ANA 1370.7 7.20 5.25 -3.63 3 0.507 3.99
    Cruz, D DET 1163.3 5.73 4.93 16.10 17 0.531 5.04
    Meares, P MIN 1270.0 6.21 4.89 -7.68 0 0.492 4.11
    Jeter, D NYA 1304.7 6.04 4.63 -20.02 -3 0.510 3.99
    Rodriguez, A SEA 1389.3 6.22 4.48 2.06 0 0.488 4.35
    Tejada, M OAK 915.0 4.00 4.37 2.16 0 0.477 4.40
    Garciaparra, N BOS 1255.3 4.97 3.96 -15.26 -11 0.478 3.84
    Caruso, M CHA 1121.3 2.84 2.53 -7.33 -16 0.440 4.49
     
    1999 Team Inn WS WS/1000 FR DFT DWP SSF/9
    Bordick, M BAL 1355.0 9.30 6.86 35.38 23 0.560 4.40
    Batista, T TOR 860.7 5.53 6.43 10.18 13 0.523 4.28
    Garciaparra, N BOS 1171.7 7.52 6.42 -7.53 0 0.513 4.04
    Sanchez, R KCA 1128.7 6.80 6.02 31.94 24 0.552 4.82
    Tejada, M OAK 1377.3 8.21 5.96 7.13 4 0.512 4.20
    Cruz, D DET 1300.3 7.42 5.71 4.68 20 0.525 4.31
    Rodriguez, A SEA 1114.7 6.36 5.70 7.23 3 0.510 4.42
    Clayton, R TEX 1149.3 5.34 4.64 3.10 -7 0.487 4.64
    Guzman, C MIN 1069.0 4.35 4.07 -7.20 -3 0.493 4.18
    Caruso, M CHA 1114.7 4.33 3.88 -19.92 -10 0.462 4.25
    Vizquel, O CLE 1214.3 4.46 3.68 1.57 -7 0.487 4.09
    Jeter, D NYA 1395.7 5.02 3.60 -33.55 -13 0.475 3.64
     
    2000 Team Inn WS WS/1000 FR DFT DWP SSF/9
    Rodriguez, A SEA 1285.0 8.73 6.79 8.05 17 0.536 3.98
    Valentin, J CHA 1212.3 8.00 6.60 20.59 0 0.495 4.30
    Sanchez, R KCA 1198.0 7.08 5.91 15.18 28 0.476 4.45
    Martinez, F TBA 887.7 5.20 5.85 31.20 9 0.505 4.73
    Garciaparra, N BOS 1185.0 6.65 5.61 -2.55 -4 0.491 4.37
    Tejada, M OAK 1400.3 7.46 5.33 3.17 -4 0.546 4.46
    Guzman, C MIN 1307.0 6.71 5.13 -14.15 -9 0.456 3.78
    Clayton, R TEX 1237.0 6.16 4.98 -2.59 -4 0.492 4.15
    Cruz, D DET 1355.3 6.34 4.67 3.95 13 0.548 4.34
    Vizquel, O CLE 1328.7 5.96 4.49 -0.54 -4 0.491 3.98
    Gonzalez, A TOR 1225.3 5.26 4.29 -6.82 16 0.487 4.36
    Bordick, M BAL 865.0 3.22 3.72 -14.38 -6 0.481 3.94
    Jeter, D NYA 1278.7 2.70 2.12 -36.47 -27 0.490 3.50
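The WS/1000 column is a simple rate. A short Python sketch, with a hypothetical function name and a few rows taken from Table 8, shows the conversion:

```python
# Sketch of the playing-time adjustment used in Table 8.

def ws_per_1000(win_shares, innings):
    """Fielding Win Shares per 1000 defensive innings."""
    return win_shares / innings * 1000

rows = [
    ("Vizquel 1998", 8.63, 1316.0),
    ("Jeter 1998", 6.04, 1304.7),
    ("Caruso 1998", 2.84, 1121.3),
    ("Jeter 1999", 5.02, 1395.7),
]
rates = {label: ws_per_1000(ws, inn) for label, ws, inn in rows}
for label, rate in rates.items():
    print(f"{label:<13} {rate:.2f}")
```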

    The correlation coefficient between Win Shares per 1000 innings and the other data presented for comparison purposes:

    Win Shares and FR: r=+0.696 (given James’s take on Fielding Runs, this might come as a surprise to him)
    Win Shares and DFTs: r=+0.711
    Win Shares and CAD DWP: r=+0.640
    Win Shares and SS fieldable opportunities: r=+0.271
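For readers who want to reproduce this kind of comparison, here is a minimal Pearson correlation sketch in plain Python. It is applied only to the 1998 rows of Table 8 (WS/1000 against DFT), so its result should not be expected to match the figures above, which presumably pool all three seasons.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# 1998 rows of Table 8, in table order: WS/1000 and DFT.
ws_1000 = [6.56, 6.20, 6.05, 5.74, 5.25, 4.93, 4.89, 4.63, 4.48, 4.37, 3.96, 2.53]
dft = [11, 5, 4, 14, 3, 17, 0, -3, 0, 0, -11, -16]

r_1998 = pearson_r(ws_1000, dft)
print(f"1998 only, WS/1000 vs DFT: r = {r_1998:+.3f}")
```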

    There have been some criticisms of the Win Shares fielding system, many of them having to do with the arbitrary nature of James’s weights for defensive events. While I think those criticisms are reasonable, I’m more concerned here with whether James has accurately captured overall defensive value in his approach - in other words, can we rely upon the conclusion that a player with 6 defensive Win Shares is a better fielder than a player with 4? If James is accurately evaluating the relative importance of defensive events and accounting for team contextual effects, his rankings should give reliable results, regardless of the specific scale that he uses. If he is not, then we should be able to evaluate the specific shortcomings of the method by looking at how well the results compare with inferences drawn from play-by-play data.

    It is clear, looking at the ratings, that Jeter’s Win Shares totals are heavily influenced by his low assist totals. Yankee SS (mostly Jeter) putout totals are close to league average, and their double play rates (at least in this method, although as I noted in Part 4 they are very bad in relation to actual DP chances) have been only slightly under the expected rate. Except for 2000, the Yankee SS have very good error rates. But their assist rates have been very low - on the 40-point scale Yankee SS had fewer than 10 claim points in each of the three seasons. Since SS lose one claim point for every four assists that they fall short of their expectation, and since a SS who exactly meets the expectation gets 20 claim points, Yankee SS are getting at least 40 fewer assists than expected in Win Shares. The obvious question to be asked at this point, then, is whether this shortfall is actually due to their fielding skill, or whether it results from an overestimate of their opportunities to make plays in the Win Shares method.

    James makes one major adjustment for ball distribution in his method, based on balls in play vs LHP in excess of the league average total. Ordinarily, one might expect this adjustment to increase the number of balls hit into play on the left side - and in most cases that happens. As calculated in the Win Shares method, the Yankees had an excess of LHP in both 1998 and 2000, and were exactly league average in 1999. Thus, one might have expected the Yankees to have more GBIP to the left side in 1998 and 2000, and a league average total in 1999. The reality is quite different, as Table 9 demonstrates:

    Table 9. AL Distribution of Ground Balls in Play, 1998-2000

    1998 Excess LHP GBIP LS GBIP RS %LS
    Seattle 721 1146 773 59.7%
    Texas -328 1172 828 58.6%
    Detroit -233 1255 900 58.2%
    Tampa Bay -1 1140 821 58.1%
    Boston -274 1094 828 56.9%
    Kansas City 311 1165 891 56.7%
    Oakland -19 1186 908 56.6%
    Chicago(A) 466 1163 891 56.6%
    Baltimore -115 1128 927 54.9%
    Anaheim 149 1088 897 54.8%
    New York(A) 368 1086 898 54.7%
    Cleveland -554 1141 947 54.6%
    Minnesota 137 1081 918 54.1%
    Toronto -627 996 938 51.5%
             
    AL Totals   15841 12365 56.2%
             
    1999 Excess LHP GBIP LS GBIP RS %LS
    Seattle 758 1171 887 56.9%
    Chicago(A) 370 1155 888 56.5%
    Kansas City 24 1198 954 55.7%
    Minnesota 156 1094 907 54.7%
    Baltimore -296 1154 962 54.5%
    Texas -339 1200 1004 54.4%
    Tampa Bay 42 1162 985 54.1%
    Toronto 158 1141 971 54.0%
    Detroit -136 1096 946 53.7%
    Anaheim 159 1096 954 53.5%
    Boston -376 1047 926 53.1%
    Cleveland -263 1078 997 52.0%
    Oakland -258 1092 1055 50.9%
    New York(A) 0 1003 1001 50.0%
             
    AL Totals   15687 13437 53.9%
             
    2000 Excess LHP GBIP LS GBIP RS %LS
    Texas 640 1132 829 57.7%
    Anaheim 87 1217 906 57.3%
    Tampa Bay -572 1207 940 56.2%
    Chicago(A) 459 1122 889 55.8%
    Minnesota 584 1071 854 55.6%
    Toronto 79 1135 933 54.9%
    Boston -11 1116 918 54.9%
    Seattle 228 1058 893 54.2%
    Baltimore -346 1084 947 53.4%
    New York(A) 190 987 890 52.6%
    Detroit -499 1155 1055 52.3%
    Oakland -53 1145 1061 51.9%
    Cleveland -98 1060 992 51.7%
    Kansas City -688 1094 1025 51.6%
             
    AL Totals   15583 13132 54.3%

    In all three seasons, the Yankees had far fewer GBIP hit to the left side than would be expected given the balls in play against their LHP. In this particular case, the adjustment penalizes Yankee SS for plays that they never had a chance to make. Obviously, this is also true of the other methods that we have discussed, where left/right adjustments based on the orientation of the pitching staff are used - the point here is not to criticize the use of the adjustment but to point out that, in the specific case of the Yankees, the adjustment leads to a bias in the measurement because the Yankees’ ball distribution doesn’t fit within the model on which the method is based. Furthermore, that bias in the model, when applied to the Yankees, would have the effect of reducing the ranking of the team’s shortstops.
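The left-side percentages in Table 9 can be recomputed directly from the ground ball counts. The sketch below, with names of my own choosing, compares the Yankees with the AL totals in each season.

```python
# Sketch: share of ground balls in play hit to the left side (Table 9).

def left_side_share(gbip_ls, gbip_rs):
    """Fraction of ground balls in play hit to the left side."""
    return gbip_ls / (gbip_ls + gbip_rs)

# (GBIP LS, GBIP RS) for the Yankees and for the AL as a whole.
table_9 = {
    1998: {"NYA": (1086, 898), "AL": (15841, 12365)},
    1999: {"NYA": (1003, 1001), "AL": (15687, 13437)},
    2000: {"NYA": (987, 890), "AL": (15583, 13132)},
}

gaps = {}
for year, teams in table_9.items():
    nya = 100 * left_side_share(*teams["NYA"])
    al = 100 * left_side_share(*teams["AL"])
    gaps[year] = nya - al
    print(f"{year}: NYA {nya:.1f}%  AL {al:.1f}%  gap {gaps[year]:+.1f} points")
```

In each season the Yankees' left-side share is below the league figure, even in 1998 and 2000 when the LHP adjustment assumes extra balls to the left side.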

    I also decided to take a look at James’s assumption that an average shortstop would get assists at the same rate, regardless of whether or not his team had a high number of assists or a low number of assists. I identified 10 extreme fly ball teams in the 1998-2000 AL, and 11 extreme ground ball teams during that period, and took a look at the ratio of shortstop assists to team assists for each of those teams. The shortstops on the flyball teams averaged 28.8% of team assists; the shortstops on the groundball teams averaged 29.0% of team assists. The groundball teams faced 4% more left handed hitters, which would reduce the number of assists that their SS get. When I adjusted for this at the rate at which those team SS got assists when facing LHB and RHB, and weighted the rate based on each team’s pitchers facing an average mix of LHB and RHB, the SS on the flyball teams would have averaged 28.6% of team assists and the SS on the groundball teams would have averaged 29.3% of team assists. When you make a similar correction for LH/RH batters faced for all of the AL teams between 1998 and 2000, there is virtually no correlation between the team’s groundball rate and the percentage of assists that its shortstops get (r=-0.058 over the 1998-2000 period). Thus, while I thought that shortstops who played behind ground ball staffs might have a higher percentage of assists than SS that played behind fly ball staffs, that is apparently not the case, and thus using the league-average rate as a ba

    Mike Emeigh Posted: November 18, 2002 at 06:00 AM | 21 comment(s)

  • Reader Comments and Retorts


       1. Marc Stone Posted: November 18, 2002 at 02:04 AM (#607279)
    Don't you have to make a distinction between fielding ability and contribution to winning? If a player for some bizarre reason never has a ball hit near him the entire season, he doesn't deserve any fielding Win Shares but it doesn't make him a bad fielder.

    (James is a little fuzzy on this too; it's not entirely clear whether he is measuring capability or contribution.)
       2. Charles Saeger Posted: November 18, 2002 at 02:04 AM (#607281)
    Mike -- FYI, Joe's spreadsheet has an error for CL4 (home runs allowed) on the pitcher/fielder split. It just shows park-adjusted home runs allowed. You need to subtract that from expected home runs (league home runs/inning times team innings), multiply by 5 and add to 200.
       3. tangotiger Posted: November 18, 2002 at 02:04 AM (#607283)
    Mike said:

    The result of this choice is that fielders are penalized in Win Shares for every play made by a fielder at another position - plays which they themselves had no chance to make.

    Actually, this makes sense in the James scheme. That is, James assigns the Yanks fielders 52.5 Win shares, while an average team would have received 41.3 Win Shares. Therefore, under this scheme, the players on the Yanks are being compared *to each other* as well as to the league. Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.

    I'm not sure how clear all that was, or if I'm contradicting what Mike or James are saying. But, as best as I can figure it, that's what James is trying to do.
       4. Charles Saeger Posted: November 18, 2002 at 02:04 AM (#607285)
    I must agree with Tom and not Mike on this point. What James is trying to do is finding the degree of goodness, then spreading that goodness around. If the second baseman made 60 more assists than mean and the third baseman made 60 fewer assists than mean, the second baseman should have more of the goodness. If the team is above mean anyways, the third baseman may well also show up as having above mean Win Shares.

    FWIW, the nice thing about Win Shares fielding is it is an evaluation system unlike more usual systems, like DFTs or CAD. There are many parts that are not done well or right, but the environment of Wins, not Hits, is unique, and gives a different angle. This is a good thing, as the more tools, the merrier we are.
       5. tangotiger Posted: November 18, 2002 at 02:04 AM (#607286)
    Just to be clear, this is what I think James is trying to do. I don't know if he is successful. One way to test it is to construct various "teams", and start to control for things.

    For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assist from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?

    Now, do the same, but for a top team. Are the changes similar? The changes should be very similar, as putting Vlad on the Expos or Vlad on the Mariners as a hitter should have a very similar effect (though not exactly). This method doesn't apply to pitchers.

    Because of the complexity of the James calculations, I find that it is easier to look at various marginal effects to note what is really happening under the hood.
       6. Charles Saeger Posted: November 18, 2002 at 02:04 AM (#607287)
    Tom -- I have looked at Win Shares from what you suggest, and it is clear to me that James did NOT do this.

    I said the idea of such adjustments makes sense. James just did not do them right.
       7. Mike Emeigh Posted: November 18, 2002 at 02:04 AM (#607288)
    Sorry if I missed something, but... in your Table 9, do the GBIP LS and GBIP RS columns include ground-ball base hits, or outs only?

    Both Table 9 and Table 10 include hits, outs, errors, and fielders' choices.

    Don't you have to make a distinction between fielding ability and contribution to winning?

    Yes, you do - and WS measures only the latter directly. James measures capability indirectly, primarily by "spreading the goodness around", to quote Charlie.

    Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.

    If that were true, then you wouldn't expect much, if any, correlation between fielder opportunities and WS. An average fielder on an average team will have more opportunities and more plays as a result of those opportunities than an average fielder on a good team, but should have no more Win Shares (after prorating for playing time). In fact, there is a mild positive correlation between WS/1000 innings and fielder opportunities.

    For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assist from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?

    Now, do the same, but for a top team. Are the changes similar?


    Suppose you take one assist away from your 3B - a play that he didn't make. The end result could be a play that the SS made in back of him, in which case all you've done is transfer one play directly to the other fielder. But it could also be a hit, which adds the possibility of a play by everyone else. A play missed by the 2B or the 1B or the OF could become another play for the SS. So when doing this type of analysis, you need to construct a model where you evaluate these secondary effects as well. It would probably be better to do this with a simulator.

    -- MWE
       8. tangotiger Posted: November 18, 2002 at 02:04 AM (#607291)
    Mike, I agree with your last statement.

    My objective was simply to determine if James did the claim points correct by position. Removing an assist from the SS and giving it to the 3B should have a net effect of zero. I don't know that it does. And even if it does, what is the degree of change? Is this change correct?

    Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out possible hit.

    There is a lot in Win Shares that has not been "proven" other than people's claims that "it works".

    This is why Mike's analysis on this topic is so valuable.

    Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).
       9. Charles Saeger Posted: November 18, 2002 at 02:04 AM (#607292)
    Tom -- average Win Shares for a shortstop is affected by K/9.
       10. Chris Dial Posted: November 18, 2002 at 02:05 AM (#607294)
    "I hope that throughout this series, the reader has become aware of the limitations of approaches based solely on published statistics."

    This cannot be emphasized enough. The results gathered from available statistics are unreliable. The Derek Jeter situation defines this.

    We have to do a better job of reconciling pbp data to WS, CAD, DFT.

    Mike's done a great job with this.
       11. MGL Posted: November 18, 2002 at 02:05 AM (#607296)
    Although Win Shares does not interest me, and this is a bit of a throwaway comment, I think this is a very good (thorough, intelligent, and well-written) article with a good analysis...
       12. Mike Emeigh Posted: November 18, 2002 at 02:05 AM (#607299)
    My objective was simply to determine if James did the claim points correct by position.

    OK, that wasn't clear from the comment. I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.

    Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out possible hit.

    Suppose you had a team of average pitchers and average fielders. You can tell, from the PBP data, the rate at which each team's fielders make plays, and the percentage of balls that they field in their vicinity (assigning hits in the same proportion as outs, a la UZR). You can assign non-BIP events at the average rate at which they occur. You'd run simulated seasons of 1450 innings, and calculate defensive WS based on the estimated W/L as if the team had average offensive production over those innings.

    Now suppose you took that average SS and placed him on a team with the *best* fielder at each position (as determined by the % of balls that they field in their vicinity). Do the same thing, with simulated seasons of 1450 innings. The team W/L percentage with average offensive production should go up, and then you figure the defensive WS based on that WP. The SS should have the same defensive WS (within a reasonable tolerance).

    Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).

    I don't think you meant to divide by 30 at the end. I think you might have meant to multiply by 10/3 (since 3 WS = 1 win and 10 runs = 1 win, thus 3 WS = 10 runs and 1 WS = 10/3 runs).
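    As a rough sketch of that conversion, assuming only the two rules of thumb stated above (3 WS = 1 win, 10 runs = 1 win), the arithmetic looks something like the following. The function names and the per-inning average argument are hypothetical, chosen for illustration; they are not taken from the Win Shares book.

```python
# Rule-of-thumb constants from the comment above.
WS_PER_WIN = 3      # 3 Win Shares = 1 win
RUNS_PER_WIN = 10   # 10 runs = 1 win


def ws_to_runs(win_shares):
    """Convert Win Shares to runs: multiply by 10/3, not divide by 30."""
    return win_shares * RUNS_PER_WIN / WS_PER_WIN


def runs_vs_average(player_ws, avg_ws_per_inning, innings):
    """Plus/minus runs against an average fielder at the same position.

    avg_ws_per_inning is a hypothetical input: the average fielding WS
    per inning at the position, which would have to be supplied from data.
    """
    ws_above_average = player_ws - avg_ws_per_inning * innings
    return ws_to_runs(ws_above_average)
```

    Under those assumptions, one WS above average works out to about 3.3 runs above average.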

    I have an updated table in which I fixed the spreadsheet error that Charlie noted (I had actually fixed this error earlier but didn't recalculate the individual WS after I fixed it, so the numbers in the article are off) and included the info Tango requested. I'll incorporate that into a revised version of the article and shoot it to Dan tomorrow.

    -- MWE
       13. Doug Posted: November 19, 2002 at 02:05 AM (#607300)
    What does win shares have to say about good players on bad teams?

    Since the available shares to claim is limited by the number of team wins, this really puts guys like Arod at a comparative disadvantage relative to guys on good teams, doesn't it? Looking at it this way, Jeter's low numbers seem even more shocking, considering the enhanced win share potential available to him by dint of playing on a good team.

    Or have I missed something?

       14. Mike Emeigh Posted: November 19, 2002 at 02:05 AM (#607301)
    Since the available shares to claim is limited by the number of team wins, this really puts guys like Arod at a comparative disadvantage relative to guys on good teams, doesn't it? Looking at it this way, Jeter's low numbers seem even more shocking, considering the enhanced win share potential available to him by dint of playing on a good team.

    This is an extension of the earlier discussion. There are two forces going on here that counterbalance each other to some extent:

    1. A good defensive team will have more WS available to its fielders, thus on average the WS available to a defensive position will be higher.
    2. An average defender on a good defensive team will lose chances to his better-fielding teammates, which will reduce the percentage of team defensive WS that he gets.

    Ideally, these should balance each other out entirely, so that the average defender will look the same whether he's on a good team or a bad team. I think that, in practice, the second factor is larger than the first factor, so that in general, the worse you are relative to your teammates, the lower your total WS will be. One way that we might check this is to see what happens to WS when a fielder moves from a good defensive team (or a poor defensive team) to a lesser (or better) defensive team - do his fielding WS change? A subject for another study...

    -- MWE
       15. tangotiger Posted: November 19, 2002 at 02:05 AM (#607303)
    This continues along the same lines as the current discussion -

    I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.

    There are *many* forces at work here! While James uses a 50-point scale for assists at 3B and a 40-point scale for assists at SS, the "available total points" for 3B is 24 and for SS is 38(?). So, from this standpoint, 3B assists get 24 x .5 = 12, and SS assists get 38 x .4 = 15.2.
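    To make that arithmetic explicit, here is a minimal sketch. It assumes that the assist scale is simply a share (out of 100) of the position's available total points, which is how the comment above treats it; the function name is hypothetical, and the SS total of 38 carries the same question mark as in the comment.

```python
def assist_points(position_total, assist_scale):
    """Points available for assists at a position.

    position_total: the position's "available total points"
    assist_scale:   the point scale for assists, treated as a share of 100
    """
    return position_total * assist_scale / 100


# Figures quoted in the comment above; the SS total (38) is uncertain there.
third_base = assist_points(24, 50)
shortstop = assist_points(38, 40)
```

    On these figures the SS assists end up carrying more points than the 3B assists even though the 3B scale is larger, which is the point of the comparison.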

    I agree with Mike's assessment that there are many forces at work:

    - breakdown of assists, po, dp within position,

    - claim points by position,

    - comparison relative to teammates,

    - whole team fielding compared to whole team off+def, and

    - whole team compared to league.

    Whew! Ideally, moving an average guy from a great team to a bad team should have little impact on the WS of the player in question. (It might have some, because of the interaction, like hitters, but generally speaking it would be pretty small.)

    However, given all these forces at work, I don't see the evidence that they have been balanced out to the point where we can say that this is a true statement. In fact, I think it is a daunting task to determine the extent to which this is true or false. The number of forces and variables at work here would make it a small project unto itself to determine the validity of the fielding portion of Win Shares. That's not to say that all is lost. All those arbitrary claim and point assignments that James makes may actually be calculable through a more rigorous analysis.

    Then again, Clay and Charlie's approach is probably just as good and far easier.
       16. Walt Davis Posted: November 19, 2002 at 02:05 AM (#607305)
    James makes one major adjustment for ball distribution in his method, based on balls in play vs LHP in excess of the league average total. Ordinarily, one might expect this adjustment to increase the number of balls hit into play on the left side - and in most cases that happens.

    Or maybe not. For the 3 years of data you presented, the correlation between a team's "LHP excess" and their GBIP to the left side is a meager .04 and not significant.

       17. Mike Emeigh Posted: November 19, 2002 at 02:05 AM (#607308)
    For the 3 years of data you presented, the correlation between a team's "LHP excess" and their GBIP to the left side is a meager .04 and not significant.

    The appropriate correlation isn't with the *number* of GBIP to the left side (because that number is affected by the total number of GBIP allowed as well as the number of LHP), but with the *percentage* of GBIP that are hit to the left side. The correlation between excess LHP and %LS is r=+0.443 for AL 1998-2000.
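    For anyone who wants to repeat the check, a minimal sketch of the calculation follows. It computes an ordinary Pearson correlation and converts left-side GBIP counts to percentages first, so that the total number of GBIP allowed drops out. The function names are hypothetical, and no team data is included here; the inputs would have to come from the play-by-play files.

```python
import math


def pct_left_side(left_gbip, total_gbip):
    """Percentage of each team's ground balls in play hit to the left side."""
    return [100.0 * left / total for left, total in zip(left_gbip, total_gbip)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)
```

    With one entry per team-season, pearson_r(excess_lhp, pct_left_side(left_gbip, total_gbip)) would give the figure to compare against the count-based correlation.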

    -- MWE
       18. Mike Emeigh Posted: November 19, 2002 at 02:05 AM (#607309)
    The rankings are actually very similar, but the actual run amounts are far less than the DFTs. ... So why the big difference?

    The primary reason for this is that Davenport credits a larger percentage of pitching+fielding to the fielders than does James. James splits pitching+fielding as 67.5% to the pitchers, 32.5% to the fielders; Davenport allocates K, HB, BB, and HR 100% to the pitchers, allocates errors 100% to the fielders, and divides other events as 75% fielding and 25% pitching (used to be 70/30). When you make that division for an average team, and then estimate the runs that result from each set of events (using your favorite run estimator), you wind up with something in the vicinity of 60% of the total runs resulting from events credited to the pitchers and 40% to the fielders.
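    The two allocation rules can be sketched side by side. This is only an illustration of the splits as described above, not either author's actual code: the function names and the three run buckets are hypothetical, and the run totals would have to come from whatever run estimator is used.

```python
def james_split(total_runs):
    """Fixed split described above: 67.5% to pitchers, 32.5% to fielders."""
    return {"pitchers": 0.675 * total_runs, "fielders": 0.325 * total_runs}


def davenport_split(pitcher_only_runs, error_runs, other_runs,
                    fielding_share=0.75):
    """Event-based split described above.

    pitcher_only_runs: runs from K, HB, BB, and HR (100% to the pitchers)
    error_runs:        runs from errors (100% to the fielders)
    other_runs:        runs from all other events, divided 75/25
                       fielding/pitching (formerly 70/30)
    """
    pitchers = pitcher_only_runs + (1.0 - fielding_share) * other_runs
    fielders = error_runs + fielding_share * other_runs
    total = pitchers + fielders
    return {
        "pitchers": pitchers,
        "fielders": fielders,
        "pitcher_pct": pitchers / total,
        "fielder_pct": fielders / total,
    }
```

    Passing fielding_share=0.70 reproduces the older 70/30 division for comparison.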

    -- MWE
       19. Charles Saeger Posted: November 19, 2002 at 02:05 AM (#607314)
    FYI -- unless you calculated something differently, CAD does use LHB/RHB adjustments where they are available (not LHP/RHP).
       20. Mike Emeigh Posted: November 20, 2002 at 02:05 AM (#607319)
    FYI -- unless you calculated something differently, CAD does use LHB/RHB adjustments where they are available (not LHP/RHP).

    Yes, and since I had it for 1998-2000 I used it.

    -- MWE
       21. Mike Emeigh Posted: November 21, 2002 at 02:05 AM (#607333)
    Actually, you don't even need the regression model to see that 1998 is fundamentally different from 1999 & 2000. In 1998, 4 teams had %LS over 58%. No team in 1999 or 2000 had over 58%. Only 1 1998 team had LS% under 54%. There were 6 under 54% in 1999 and 6 more in 2000. Was there a big dropoff in LHP league-wide between 1998 and 1999?

    No. I ran a different version of the query (excluding all bunts, not just SH) and got percentages for 1999 and 2000 that were more in line with 1998. I also didn't try to assign BIP for which location information was missing, as I had done in the earlier effort, and which, as it turns out, I apparently didn't do very accurately. The league percentages for GBIP LS, excluding all bunts, were 56.4% in 1998, 55.1% in 1999, and 55.6% in 2000. There were four teams over 58% in 1998, one in 1999, and two in 2000, with one team under 54% in 1998, three in 1999, and four in 2000. The overall correlation coefficient between excess LHP and % of GBIP to the left side rises to r=+0.447. The Yankees are still below average in GBIP LS in all three seasons.

    1998 is still odd, though. Four of the top 5 teams in GBIP LS % have more RHP than expected.

    -- MWE
