And the Beat Goes On: Derek Jeter and the State of Fielding Analysis in Sabermetrics - Part 5
Mike tackles the latest from Bill James, Win Shares.
The Return of the Master - Win Shares
Bill James, in the 1984 Baseball Abstract, railed against
the search for a “great statistic” that would combine all of the aspects
of a player’s performance onto one scale. However, that did not keep James
from trying to develop one, even after his retirement from sabermetrics in
1988. In 1996, James began working on a method for tying together all of the
aspects of a player’s performance into a single number, by relating those
characteristics to the overall performance of the team in terms of wins. James
presented the resulting system, Win Shares, at SABR31 in 2001, and a book
detailing the system was published by STATS, Inc. in 2002. Win Shares quickly
became a lively discussion topic here at Baseball Primer, and everywhere that
baseball was discussed on the Internet. The defensive analysis portion of the
system drew the most attention, with many people accepting James’s contention that
the system was entirely new and different even though Davenport and Saeger
had used many of the same basic concepts in developing their own systems, as I
noted earlier.
As did Davenport and Saeger, James also realized that it was important to
evaluate fielding first by the performance of the team, then by the
performance of the individual fielders on the team. When using individual
fielding statistics such as Range Factor as the starting point, James notes that:
...this implies, in turn, that we are assuming that all defensive
teams are equal. Making 5.08 plays for one defense is the same as making
5.08 plays for another defense. (Win Shares, page 110; emphasis is
in the original text)
James goes on to note that this assumption cannot be correct, because it is
clear that all defensive teams are not equal, and that therefore individual
fielding must be evaluated within the context of the team.
I’m going to run through the model in detail for shortstops. Primer author
Joe Dimino has put together a spreadsheet for calculating Win Shares for a
league-season, which I used to perform the calculations.
In the Win Shares system, James assigns each team three Win Shares for each
team win. He then divides those Win Shares between offense and defense
(pitching+fielding), assigning roughly 48% of the Win Shares on average to
offense and 52% to defense. He then divides the defensive totals between
pitching and fielding, with roughly 35% of the team total on average going to
pitching and 17% to fielding. So a team that wins 114 games, like the 1998
Yankees, will have 342 total Win Shares to divide, and with a normal
performance on offense and defense will have about 164 of those Win Shares
assigned to the hitters, 120 assigned to the pitchers, and 58 to the fielders.
The 1998 Yankees actually had 171.5 Win Shares assigned to the hitters, 118
to the pitchers, and 52.5 to the fielders. I should note that James has placed
a maximum and minimum limit on the number of Win Shares assigned to the
fielders, and the standard formula would have resulted in a number of Win
Shares assigned to the fielders that would have exceeded the maximum allowable
number, so that the fielders actually could not get more than the 52.5 they
received. The fielders would have had about another 1.5 WS without the limit.
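The top-level split described above can be sketched in a few lines. This is only an illustration using the league-average percentages quoted in the text (48% offense, 35% pitching, 17% fielding); the function name and its parameters are hypothetical, and the actual Win Shares formulas adjust these shares for each team's performance and apply the fielding limits mentioned above.

```python
def split_team_win_shares(wins, offense_pct=0.48, pitching_pct=0.35, fielding_pct=0.17):
    """Split a team's Win Shares using the average percentages quoted in the text.

    Assumption: a team with exactly average offense/pitching/fielding balance.
    The real system adjusts the split per team and caps the fielding share.
    """
    total = 3 * wins  # three Win Shares per team win
    return {
        "total": total,
        "offense": total * offense_pct,
        "pitching": total * pitching_pct,
        "fielding": total * fielding_pct,
    }


# 1998 Yankees: 114 wins
shares = split_team_win_shares(114)
print({name: round(value) for name, value in shares.items()})
```

For a 114-win team this gives the "normal" split quoted above, which can then be compared with the 171.5 / 118 / 52.5 actually assigned to the 1998 Yankees.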
Once James determines the Win Shares to be assigned to the fielders, he
determines how to divide those Win Shares between the team’s defenders at
each position - assigning Win Share values to the team’s catchers, 1Bs, 2Bs,
etc. Pitcher fielding for some reason is not included. Once Win Shares are
assigned to each position, the Win Shares at each position are divided between all
of the fielders who played at that position.
The assignment of fielding Win Shares by position, and the assignment of
fielding Win Shares at a position, are based on a Claim Point system. Each
positive accomplishment by fielders yields a certain number of Claim
Points, and the Win Shares are divvied up based on the percentage of the total
claim points accumulated. The principle is the same for both team defensive
positions and individual defenders at a position.
In dividing fielding Win Shares by position, James evaluates fielders based
on four defensive characteristics for their position. While there are some
variations from position to position, James generally weights what he
considers to be the most important fielding characteristic at that position on
a 40-point scale, the next most important on a 30-point scale, then 20, then
10. For shortstops, the 40-point scale is based on Assists, the 30-point
scale is based on Double Plays, the 20-point scale is based on Error
Percentage, and the 10-point scale is based on Putouts. I’m going to walk
through the calculations for the 1998 Yankee shortstops (Jeter, Luis Sojo, and
Homer Bush):
Yankee shortstops in 1998 had 440 assists. This total is compared to the
number of assists that they should have been expected to get, calculated as:
(Team assist total)*(% of assists by league SS)+(excess batters faced by LHP/100)
where “excess batters faced by LHP” is
(LHP BIP) - (team BIP * lg % of BIP vs LHP)
BIP being innings pitched, multiplied by 3, minus strikeouts.
The 1998 Yankees had 1642 assists as a team. The league percentage of
assists by shortstops was .29225, and the Yankees’ LHP faced 368 more batters
than expected. Thus, Yankee shortstops would have been expected to have
1642*0.29225+(368/100)=483.55 assists.
The number of claim points that SS get for assists is
20+(actual A - expected A)/4
For the Yankees this is 20+(440-483.55)/4, or 9.11 claim points.
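The assist calculation above can be written out as a short sketch. The function names are hypothetical, and the inputs are the rounded figures quoted in the text rather than the full-precision values from the spreadsheet.

```python
def balls_in_play(innings_pitched, strikeouts):
    """BIP as defined in the text: innings pitched times 3, minus strikeouts."""
    return innings_pitched * 3 - strikeouts


def excess_bip_vs_lhp(lhp_bip, team_bip, lg_pct_bip_vs_lhp):
    """Balls in play against the team's LHP beyond the league-average share."""
    return lhp_bip - team_bip * lg_pct_bip_vs_lhp


def expected_ss_assists(team_assists, lg_ss_assist_pct, excess_lhp):
    """Expected shortstop assists, with the left-handed pitching adjustment."""
    return team_assists * lg_ss_assist_pct + excess_lhp / 100


def ss_assist_claim_points(actual_assists, expected_assists):
    """Claim points on the 40-point assist scale (20 for meeting expectation)."""
    return 20 + (actual_assists - expected_assists) / 4


# 1998 Yankees, using the figures quoted in the text
expected = expected_ss_assists(1642, 0.29225, 368)
points = ss_assist_claim_points(440, expected)
print(round(expected, 2), round(points, 2))
```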
The claim points for double plays are awarded based upon the team total of
DPs turned, compared to the expected number of double plays for that team. The
1998 Yankees turned 146 double plays. The first estimate of the expected number
of double plays is calculated based on the number of runners on 1B in DP
situations, estimated as:
((H-HR)*(league % of singles allowed))+BB+HBP-SH-WP-Bk-PB
times the league percentage of such runners removed on double plays (DP
divided by the same formula above, calculated for the league, using actual
singles allowed as the first factor). For the 1998 Yankees, this value is:
(((1357-156)*0.75196)+466+68-37-37-5-12)*0.0994=134.08
This first estimate is then adjusted for the number of assists per inning,
compared to the league average, on the theory that teams with more assists
tend to see more ground balls than the norm, and thus are likely to turn more
DPs. The 1998 Yankees had 1642 assists in 1456 2/3 innings; the league as a
whole had 23295 assists in 20194 2/3 innings. The ratio of the Yankees’ A/inning
to the league is therefore 0.9772, and the expected number of DPs is thus
(134.08*0.9772), or 131.
The claim points for double plays are calculated as:
15+(actual DP - expected DP)/4
or for the Yankees, 15+(146-131)/4 = 18.75
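A sketch of the double play steps follows, with hypothetical function names. One caveat: the league rates quoted in the text are rounded, so recomputing the first estimate from them lands a few tenths below the 134.08 shown (roughly 133.8); the usage lines therefore carry the article's 134.08 forward into the adjustment step.

```python
def runners_on_first(h, hr, lg_single_pct, bb, hbp, sh, wp, bk, pb):
    """Estimated runners on first base in double play situations."""
    return (h - hr) * lg_single_pct + bb + hbp - sh - wp - bk - pb


def expected_double_plays(first_estimate, team_a, team_inn, lg_a, lg_inn):
    """Scale the first estimate by the team's assists per inning relative to the league."""
    ratio = (team_a / team_inn) / (lg_a / lg_inn)
    return first_estimate * ratio


def dp_claim_points(actual_dp, expected_dp):
    """Claim points on the 30-point double play scale (15 for meeting expectation)."""
    return 15 + (actual_dp - expected_dp) / 4


# 1998 Yankees, using the rounded rates quoted in the text
runners = runners_on_first(1357, 156, 0.75196, 466, 68, 37, 37, 5, 12)
first_estimate = runners * 0.0994  # league rate of such runners removed on DPs

# Carry the article's first estimate (134.08) into the assists-per-inning adjustment
expected = expected_double_plays(134.08, 1642, 1456 + 2 / 3, 23295, 20194 + 2 / 3)
print(round(first_estimate, 1), round(expected), dp_claim_points(146, round(expected)))
```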
Error percentage for shortstops is the converse of fielding average; it is
(errors)/(total chances). The 1998 Yankee SS handled 703 chances and made 11
errors, for an error percentage of 0.01564. The league’s SS handled 10884
chances and made 308 errors, an error percentage of 0.02829. The claim points
for error percentage are calculated as:
20 - (10 * (team SS error %)/(league SS error %))
For the Yankees, this is 20 - (10 * 0.01564 / 0.02829), or 14.47 claim
points.
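The error percentage step is a simple ratio, sketched below with hypothetical function names and the totals quoted above.

```python
def error_pct(errors, total_chances):
    """Error percentage: errors divided by total chances."""
    return errors / total_chances


def ss_error_claim_points(team_ss_error_pct, lg_ss_error_pct):
    """Claim points on the 20-point error scale (10 for a league-average error rate)."""
    return 20 - 10 * team_ss_error_pct / lg_ss_error_pct


# 1998 Yankee SS: 11 errors in 703 chances; AL SS: 308 errors in 10884 chances
team_pct = error_pct(11, 703)
league_pct = error_pct(308, 10884)
print(round(ss_error_claim_points(team_pct, league_pct), 2))
```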
The 1998 Yankees had 252 putouts by their shortstops. This is compared to
the number of putouts they were expected to make, which is:
(team PO - team K)*(lg % of non-K PO by SS)+(BB above/below lg average)/14-(excess batters faced by LHP)/64
AL shortstops in 1998 had 0.0815 of their league’s non-strikeout putouts.
Yankee pitchers walked 0.0615 men per inning fewer than the league average, or
89.67 men fewer than the league average in their 1456 2/3 innings. As noted
above, they had 368 more batters faced by LHP than the league average. Thus,
their shortstops would have been expected to have:
(4370-1080)*0.0815-89.67/14-368/64, or 255.982 putouts.
This is translated to claim points by the formula:
5 + (actual putouts - expected putouts)/15
For the Yankee SS, this yields 5 + (252 - 255.982)/15, or 4.74 claim points.
James converts the claim points to a claim percentage. Since there are 100
claim points available per position, the claim percentage is simply the total
of the claim points for each aspect of the position, divided by 100. For the
Yankee SS, this is (9.11+18.75+14.47+4.74)/100, or .4707.
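The putout step and the final claim percentage can be sketched the same way. Function names are hypothetical; because the inputs are the rounded values quoted in the text, the putout claim points come out a hair under the 4.74 shown.

```python
def expected_ss_putouts(team_po, team_k, lg_ss_po_pct, walks_vs_lg, excess_lhp):
    """Expected shortstop putouts.

    walks_vs_lg is the team's walks relative to league average
    (negative when the staff walks fewer batters than average).
    """
    return (team_po - team_k) * lg_ss_po_pct + walks_vs_lg / 14 - excess_lhp / 64


def ss_putout_claim_points(actual_po, expected_po):
    """Claim points on the 10-point putout scale (5 for meeting expectation)."""
    return 5 + (actual_po - expected_po) / 15


def claim_percentage(claim_points):
    """Total claim points for the position, out of the 100 available."""
    return sum(claim_points) / 100


# 1998 Yankees, using the figures quoted in the text
expected = expected_ss_putouts(4370, 1080, 0.0815, -89.67, 368)
print(round(expected, 2), round(ss_putout_claim_points(252, expected), 2))

# Assists, double plays, error percentage, putouts
print(round(claim_percentage([9.11, 18.75, 14.47, 4.74]), 4))
```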
When the calculations are run through, the following claim percentages
result for the other positions on the 1998 Yankees:
C: .6244
1B: .5652
2B: .5779
3B: .5874
OF: .5984
James now uses these claim percentages to assign the fielding Win Shares to
individual positions. He does so in the following way:
Calculate the number of weighted claim points for each position, by
multiplying (claim percentage -.200) by the intrinsic weight James assigns for
the position. The intrinsic weight is intended to represent the relative
importance of the defensive position. Catchers are assigned an intrinsic
weight of 38 points, 1Bs 12 points, 2Bs 32 points, 3Bs 24 points, SS 36 points,
and OFs 58 points.
Assign Win Shares to each position based on the formula (team fielding
Win Shares) * (position weighted claim points) / (total weighted claim points)
Recall that the 1998 Yankees have 52.5 Win Shares assigned to the fielders.
The shortstops get (0.2707)*36, or 9.74 weighted claim points. The team total,
applying the same formula to the other positions, is 74.75 weighted claim points.
The shortstops thus get 52.5*9.74/74.75, or 6.84 fielding Win Shares.
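The allocation by position can be reproduced from the claim percentages and intrinsic weights listed above. This is a sketch with hypothetical names; the .200 floor and the weights are taken directly from the text.

```python
# Intrinsic weights James assigns to each position (from the text)
INTRINSIC_WEIGHTS = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}


def position_win_shares(team_fielding_ws, claim_pcts, weights=INTRINSIC_WEIGHTS, floor=0.200):
    """Divide team fielding Win Shares among positions by weighted claim points."""
    weighted = {pos: (claim_pcts[pos] - floor) * weights[pos] for pos in claim_pcts}
    total = sum(weighted.values())
    return {pos: team_fielding_ws * points / total for pos, points in weighted.items()}


# 1998 Yankees claim percentages by position
yankees_1998 = {
    "C": 0.6244, "1B": 0.5652, "2B": 0.5779,
    "3B": 0.5874, "SS": 0.4707, "OF": 0.5984,
}
by_position = position_win_shares(52.5, yankees_1998)
for pos, ws in by_position.items():
    print(pos, round(ws, 2))
```

Because every position's share is proportional to its weighted claim points, the shares always sum back to the team's fielding total, which is why one position's gain is necessarily another's loss.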
Once James has the Win Shares assigned to a position, he then assigns those
Win Shares to the individual fielders at that position based on a formula that
takes into account the percentage of the plays that the fielder handles.
Again, this is done based on a claim point system. Each shortstop is assigned
claim points via the following formula:
PO + A*2 - E*5 + DP + RBP*2
RBP stands for range bonus plays. A player gets assigned range bonus plays
when it is clear by any interpretation of the data that he is making more
plays per inning than his teammates. The formula for assigning these is pretty
complicated when you don’t have defensive innings, but when you do have
defensive innings, as we do for the 1998 Yankees, the range bonus goes to any
player who is making more plays per inning than the average for the position
for the team.
The Yankees used three shortstops in 1998. Derek Jeter played 1304 2/3
innings with 223 PO, 393 A, 9 E, and 82 DP. Luis Sojo played 141 innings with
29 PO, 44 A, 2 E, and 12 DP. Homer Bush played 11 innings with 3 A and no
other defensive markers. The team total was 252 PO, 440 A, 11 E, and 96 DP in
1456 2/3 innings; the team’s SS averaged 0.475 plays per inning. Derek Jeter
should have made 620 plays (PO+A) at the team average; he actually made 616,
so he gets no range bonus. Luis Sojo made 73 plays, and should have made
66.98, so he gets 6.02 range bonus plays. Homer Bush made three plays, and
should have made 5.23, so he gets no range bonus. The claim points for these
three players using the formula are:
Jeter: 1046
Sojo: 131
Bush: 6
The position Win Shares are assigned based on the proportion of the claim
points that the individual earns. Jeter gets 6.84*1046/1183, or 6.04 Win
Shares for his fielding. Sojo gets 6.84*131/1183, or 0.76 Win Shares, and
Bush gets 6.84*6/1183, or 0.03 Win Shares (rounding accounts for the other 0.01
Win Share). The calculations for 1999 and 2000 give Jeter 5.02 and 2.70 Win
Shares for fielding, respectively, for those two seasons.
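The division among individual shortstops can be sketched as follows, for the case where defensive innings are available. The names are hypothetical, and the range bonus here is the simple version described above (plays above the team's per-inning average at the position), not the more complicated formula James uses when innings are missing.

```python
def range_bonus_plays(po, a, innings, team_plays_per_inning):
    """Plays (PO + A) above what the team-average rate would give in these innings."""
    return max(0.0, (po + a) - innings * team_plays_per_inning)


def ss_claim_points(po, a, e, dp, rbp):
    """Individual shortstop claim points: PO + A*2 - E*5 + DP + RBP*2."""
    return po + a * 2 - e * 5 + dp + rbp * 2


def individual_win_shares(position_ws, players):
    """Split a position's Win Shares among its players by claim points."""
    total_plays = sum(p["po"] + p["a"] for p in players.values())
    total_innings = sum(p["inn"] for p in players.values())
    team_rate = total_plays / total_innings

    points = {}
    for name, p in players.items():
        rbp = range_bonus_plays(p["po"], p["a"], p["inn"], team_rate)
        points[name] = ss_claim_points(p["po"], p["a"], p["e"], p["dp"], rbp)

    total_points = sum(points.values())
    return {name: position_ws * pts / total_points for name, pts in points.items()}


# 1998 Yankee shortstops, from the text
yankee_ss = {
    "Jeter": dict(inn=1304 + 2 / 3, po=223, a=393, e=9, dp=82),
    "Sojo": dict(inn=141, po=29, a=44, e=2, dp=12),
    "Bush": dict(inn=11, po=0, a=3, e=0, dp=0),
}
for name, ws in individual_win_shares(6.84, yankee_ss).items():
    print(name, round(ws, 2))
```

Working from the rounded 6.84 rather than the spreadsheet's unrounded position total, the individual figures can differ from those quoted above by about 0.01.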
Because Win Shares are dependent on playing time, to compare shortstops I
determine the number of Win Shares earned per 1000 innings played. Table 8
shows the results of this comparison for the years 1998-2000 among SS with 800
innings or more, with Palmer’s FR, Davenport’s DFTs, Saeger’s CAD Defensive
Winning Percentages, and fieldable shortstop opportunities per nine innings
included for comparison purposes:
Table 8. AL SS Win Shares /1000 Inn, 1998-2000 (min 800 inn)
| 1998 | Team | Inn | WS | WS/1000 | FR | DFT | DWP | SSF/9 |
| Vizquel, O | CLE | 1316.0 | 8.63 | 6.56 | 7.44 | 11 | 0.541 | 4.08 |
| Gonzalez, A | TOR | 1398.3 | 8.67 | 6.20 | -9.40 | 5 | 0.536 | 3.69 |
| Bordick, M | BAL | 1238.3 | 7.49 | 6.05 | 18.26 | 4 | 0.501 | 4.42 |
| Stocker, K | TBA | 940.0 | 5.39 | 5.74 | 11.63 | 14 | 0.540 | 4.41 |
| DiSarcina, G | ANA | 1370.7 | 7.20 | 5.25 | -3.63 | 3 | 0.507 | 3.99 |
| Cruz, D | DET | 1163.3 | 5.73 | 4.93 | 16.10 | 17 | 0.531 | 5.04 |
| Meares, P | MIN | 1270.0 | 6.21 | 4.89 | -7.68 | 0 | 0.492 | 4.11 |
| Jeter, D | NYA | 1304.7 | 6.04 | 4.63 | -20.02 | -3 | 0.510 | 3.99 |
| Rodriguez, A | SEA | 1389.3 | 6.22 | 4.48 | 2.06 | 0 | 0.488 | 4.35 |
| Tejada, M | OAK | 915.0 | 4.00 | 4.37 | 2.16 | 0 | 0.477 | 4.40 |
| Garciaparra, N | BOS | 1255.3 | 4.97 | 3.96 | -15.26 | -11 | 0.478 | 3.84 |
| Caruso, M | CHA | 1121.3 | 2.84 | 2.53 | -7.33 | -16 | 0.440 | 4.49 |

| 1999 | Team | Inn | WS | WS/1000 | FR | DFT | DWP | SSF/9 |
| Bordick, M | BAL | 1355.0 | 9.30 | 6.86 | 35.38 | 23 | 0.560 | 4.40 |
| Batista, T | TOR | 860.7 | 5.53 | 6.43 | 10.18 | 13 | 0.523 | 4.28 |
| Garciaparra, N | BOS | 1171.7 | 7.52 | 6.42 | -7.53 | 0 | 0.513 | 4.04 |
| Sanchez, R | KCA | 1128.7 | 6.80 | 6.02 | 31.94 | 24 | 0.552 | 4.82 |
| Tejada, M | OAK | 1377.3 | 8.21 | 5.96 | 7.13 | 4 | 0.512 | 4.20 |
| Cruz, D | DET | 1300.3 | 7.42 | 5.71 | 4.68 | 20 | 0.525 | 4.31 |
| Rodriguez, A | SEA | 1114.7 | 6.36 | 5.70 | 7.23 | 3 | 0.510 | 4.42 |
| Clayton, R | TEX | 1149.3 | 5.34 | 4.64 | 3.10 | -7 | 0.487 | 4.64 |
| Guzman, C | MIN | 1069.0 | 4.35 | 4.07 | -7.20 | -3 | 0.493 | 4.18 |
| Caruso, M | CHA | 1114.7 | 4.33 | 3.88 | -19.92 | -10 | 0.462 | 4.25 |
| Vizquel, O | CLE | 1214.3 | 4.46 | 3.68 | 1.57 | -7 | 0.487 | 4.09 |
| Jeter, D | NYA | 1395.7 | 5.02 | 3.60 | -33.55 | -13 | 0.475 | 3.64 |

| 2000 | Team | Inn | WS | WS/1000 | FR | DFT | DWP | SSF/9 |
| Rodriguez, A | SEA | 1285.0 | 8.73 | 6.79 | 8.05 | 17 | 0.536 | 3.98 |
| Valentin, J | CHA | 1212.3 | 8.00 | 6.60 | 20.59 | 0 | 0.495 | 4.30 |
| Sanchez, R | KCA | 1198.0 | 7.08 | 5.91 | 15.18 | 28 | 0.476 | 4.45 |
| Martinez, F | TBA | 887.7 | 5.20 | 5.85 | 31.20 | 9 | 0.505 | 4.73 |
| Garciaparra, N | BOS | 1185.0 | 6.65 | 5.61 | -2.55 | -4 | 0.491 | 4.37 |
| Tejada, M | OAK | 1400.3 | 7.46 | 5.33 | 3.17 | -4 | 0.546 | 4.46 |
| Guzman, C | MIN | 1307.0 | 6.71 | 5.13 | -14.15 | -9 | 0.456 | 3.78 |
| Clayton, R | TEX | 1237.0 | 6.16 | 4.98 | -2.59 | -4 | 0.492 | 4.15 |
| Cruz, D | DET | 1355.3 | 6.34 | 4.67 | 3.95 | 13 | 0.548 | 4.34 |
| Vizquel, O | CLE | 1328.7 | 5.96 | 4.49 | -0.54 | -4 | 0.491 | 3.98 |
| Gonzalez, A | TOR | 1225.3 | 5.26 | 4.29 | -6.82 | 16 | 0.487 | 4.36 |
| Bordick, M | BAL | 865.0 | 3.22 | 3.72 | -14.38 | -6 | 0.481 | 3.94 |
| Jeter, D | NYA | 1278.7 | 2.70 | 2.12 | -36.47 | -27 | 0.490 | 3.50 |
The correlation coefficients between Win Shares per 1000 innings and
the other data presented for comparison purposes are:
Win Shares and FR: r=+0.696 (given James’s take on Fielding Runs, this
might come as a surprise to him)
Win Shares and DFTs: r=+0.711
Win Shares and CAD DWP: r=+0.640
Win Shares and SS fieldable opportunities: r=+0.271
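For readers who want to check comparisons like these, here is a minimal sketch of the rate stat and a plain Pearson correlation. The function names are hypothetical. The sample uses only the 1998 WS/1000 and DFT columns from Table 8, so it is not expected to reproduce the r values above, which I take to cover all three seasons.

```python
from math import sqrt


def ws_per_1000(win_shares, innings):
    """Fielding Win Shares per 1000 defensive innings."""
    return win_shares / innings * 1000


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Jeter, 1998: 6.04 fielding Win Shares in 1304.7 innings
print(round(ws_per_1000(6.04, 1304.7), 2))

# 1998 only: WS/1000 and DFT columns from Table 8, in table order
ws_1998 = [6.56, 6.20, 6.05, 5.74, 5.25, 4.93, 4.89, 4.63, 4.48, 4.37, 3.96, 2.53]
dft_1998 = [11, 5, 4, 14, 3, 17, 0, -3, 0, 0, -11, -16]
print(round(pearson_r(ws_1998, dft_1998), 3))
```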
There have been some criticisms of the Win Shares fielding system, many of
them having to do with the arbitrary nature of James’s weights for defensive
events. While I think those criticisms are reasonable, I’m more concerned here
with whether James has accurately captured overall defensive value in his
approach - in other words, can we rely upon the conclusion that a player with
6 defensive Win Shares is a better fielder than a player with 4? If James is
accurately evaluating the relative importance of defensive events and accounting
for team contextual effects, his rankings should give reliable results,
regardless of the specific scale that he uses. If he is not, then we should be
able to evaluate the specific shortcomings of the method by looking at how well
the results compare with inferences drawn from play-by-play data.
It is clear, looking at the ratings, that Jeter’s Win Shares totals are
heavily influenced by his low assist totals. Yankee SS (mostly Jeter) putout
totals are close to league average, and their double play rates (at least in
this method, although as I noted in Part 4 they are very bad in relation to
actual DP chances) have been only slightly under the expected rate. Except for
2000, the Yankee SS have very good error rates. But their assist rates have been
very low - on the 40-point scale Yankee SS had fewer than 10 claim points in
each of the three seasons. Since SS lose one claim point for every four
assists that they fall short of their expectation, and since a SS who exactly meets
the expectation gets 20 claim points, Yankee SS are getting at least 40 fewer
assists than expected in Win Shares. The obvious question to be asked at this
point, then, is whether this shortfall is actually due to their fielding skill,
or whether it results from an overestimate of their opportunities to make
plays in the Win Shares method.
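The "at least 40 fewer assists" inference is just the assist claim point formula run backwards. A small sketch, with a hypothetical function name:

```python
def assist_shortfall(claim_points, baseline=20, assists_per_point=4):
    """Assists below expectation implied by a shortstop assist claim point total."""
    return (baseline - claim_points) * assists_per_point


# Fewer than 10 claim points means more than 40 assists short of expectation
print(assist_shortfall(10))

# 1998 Yankee SS: 9.11 claim points
print(round(assist_shortfall(9.11), 2))
```

The 1998 result is the same gap as the 483.55 expected assists minus the 440 actual assists, apart from rounding in the claim point total.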
James makes one major adjustment for ball distribution in his method, based
on balls in play vs LHP in excess of the league average total. Ordinarily, one
might expect this adjustment to increase the number of balls hit into play on
the left side - and in most cases that happens. As calculated in the Win Shares
method, the Yankees had an excess of LHP in both 1998 and 2000, and were exactly
league average in 1999. Thus, one might have expected the Yankees to have more
GBIP to the left side in 1998 and 2000, and a league average total in 1999. The
reality is quite different, as Table 9 demonstrates:
Table 9. AL Distribution of Ground Balls in Play, 1998-2000
| 1998 | Excess LHP | GBIP LS | GBIP RS | %LS |
| Seattle | 721 | 1146 | 773 | 59.7% |
| Texas | -328 | 1172 | 828 | 58.6% |
| Detroit | -233 | 1255 | 900 | 58.2% |
| Tampa Bay | -1 | 1140 | 821 | 58.1% |
| Boston | -274 | 1094 | 828 | 56.9% |
| Kansas City | 311 | 1165 | 891 | 56.7% |
| Oakland | -19 | 1186 | 908 | 56.6% |
| Chicago(A) | 466 | 1163 | 891 | 56.6% |
| Baltimore | -115 | 1128 | 927 | 54.9% |
| Anaheim | 149 | 1088 | 897 | 54.8% |
| New York(A) | 368 | 1086 | 898 | 54.7% |
| Cleveland | -554 | 1141 | 947 | 54.6% |
| Minnesota | 137 | 1081 | 918 | 54.1% |
| Toronto | -627 | 996 | 938 | 51.5% |
| AL Totals | | 15841 | 12365 | 56.2% |

| 1999 | Excess LHP | GBIP LS | GBIP RS | %LS |
| Seattle | 758 | 1171 | 887 | 56.9% |
| Chicago(A) | 370 | 1155 | 888 | 56.5% |
| Kansas City | 24 | 1198 | 954 | 55.7% |
| Minnesota | 156 | 1094 | 907 | 54.7% |
| Baltimore | -296 | 1154 | 962 | 54.5% |
| Texas | -339 | 1200 | 1004 | 54.4% |
| Tampa Bay | 42 | 1162 | 985 | 54.1% |
| Toronto | 158 | 1141 | 971 | 54.0% |
| Detroit | -136 | 1096 | 946 | 53.7% |
| Anaheim | 159 | 1096 | 954 | 53.5% |
| Boston | -376 | 1047 | 926 | 53.1% |
| Cleveland | -263 | 1078 | 997 | 52.0% |
| Oakland | -258 | 1092 | 1055 | 50.9% |
| New York(A) | 0 | 1003 | 1001 | 50.0% |
| AL Totals | | 15687 | 13437 | 53.9% |

| 2000 | Excess LHP | GBIP LS | GBIP RS | %LS |
| Texas | 640 | 1132 | 829 | 57.7% |
| Anaheim | 87 | 1217 | 906 | 57.3% |
| Tampa Bay | -572 | 1207 | 940 | 56.2% |
| Chicago(A) | 459 | 1122 | 889 | 55.8% |
| Minnesota | 584 | 1071 | 854 | 55.6% |
| Toronto | 79 | 1135 | 933 | 54.9% |
| Boston | -11 | 1116 | 918 | 54.9% |
| Seattle | 228 | 1058 | 893 | 54.2% |
| Baltimore | -346 | 1084 | 947 | 53.4% |
| New York(A) | 190 | 987 | 890 | 52.6% |
| Detroit | -499 | 1155 | 1055 | 52.3% |
| Oakland | -53 | 1145 | 1061 | 51.9% |
| Cleveland | -98 | 1060 | 992 | 51.7% |
| Kansas City | -688 | 1094 | 1025 | 51.6% |
| AL Totals | | 15583 | 13132 | 54.3% |
In all three seasons, the Yankees had far fewer GBIP hit to
the left side than would be expected given the balls in play against their LHP.
In this particular case, the adjustment penalizes Yankee SS for plays that
they never had a chance to make. Obviously, this is also true of the
other methods that we have discussed, where left/right adjustments based on
the orientation of the pitching staff are used - the point here is not to
criticize the use of the adjustment but to point out that, in the specific
case of the Yankees, the adjustment leads to a bias in the measurement because
the Yankees’ ball distribution doesn’t fit within the model on which the
method is based. Furthermore, that bias in the model, when applied to the
Yankees, would have the effect of reducing the ranking of the team’s
shortstops.
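The %LS column in Table 9 is simply left-side ground balls as a share of all located ground balls in play. The sketch below, with hypothetical names, recomputes it for the Yankees and the league so the gap in each season is easy to see.

```python
def pct_left_side(gbip_ls, gbip_rs):
    """Share of ground balls in play hit to the left side."""
    return gbip_ls / (gbip_ls + gbip_rs)


# (excess LHP, Yankees GBIP LS, Yankees GBIP RS, AL GBIP LS, AL GBIP RS) from Table 9
table_9 = {
    1998: (368, 1086, 898, 15841, 12365),
    1999: (0, 1003, 1001, 15687, 13437),
    2000: (190, 987, 890, 15583, 13132),
}

for year, (excess_lhp, ny_ls, ny_rs, al_ls, al_rs) in table_9.items():
    ny = pct_left_side(ny_ls, ny_rs) * 100
    al = pct_left_side(al_ls, al_rs) * 100
    # A positive excess of LHP would normally suggest an above-average %LS
    print(year, excess_lhp, round(ny, 1), round(al, 1), round(ny - al, 1))
```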
I also decided to take a look at James’s assumption that an average
shortstop would get assists at the same rate, regardless of whether or not his
team had a high number of assists or a low number of assists. I identified 10
extreme fly ball teams in the 1998-2000 AL, and 11 extreme ground ball teams
during that period, and took a look at the ratio of shortstop assists to team
assists for each of those teams. The shortstops on the flyball teams averaged
28.8% of team assists; the shortstops on the groundball teams averaged 29.0%
of team assists. The groundball teams faced 4% more left handed hitters, which
would reduce the number of assists that their SS get. When I adjusted for
this at the rate at which those team SS got assists when facing LHB and RHB,
and weighted the rate based on each team’s pitchers facing an average mix of
LHB and RHB, the SS on the flyball teams would have averaged 28.6% of team
assists and the SS on the groundball teams would have averaged 29.3% of team
assists. When you make a similar correction for LH/RH batters faced for all of
the AL teams between 1998 and 2000, there is virtually no correlation between
the team’s groundball rate and the percentage of assists that its shortstops
get (r=-0.058 over the 1998-2000 period). Thus, while I thought that shortstops
who played behind ground ball staffs might have a higher percentage of assists
than SS that played behind fly ball staffs, that is apparently not the case,
and thus using the league-average rate as a ba
Mike Emeigh
Posted: November 18, 2002 at 06:00 AM |
Reader Comments and Retorts
1. Marc Stone Posted: November 18, 2002 at 02:04 AM (#607279)
(James is a little fuzzy on this too; it's not entirely clear whether he is measuring capability or contribution.)
The result of this choice is that fielders are penalized in Win Shares for every play made by a fielder at another position - plays which they themselves had no chance to make.
Actually, this makes sense in the James scheme. That is, James assigns the Yanks fielders 52.5 Win shares, while an average team would have received 41.3 Win Shares. Therefore, under this scheme, the players on the Yanks are being compared *to each other* as well as to the league. Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.
I'm not sure how clear all that was, or if I'm contradicting what Mike or James are saying. But, as best as I can figure it, that's what James is trying to do.
FWIW, the nice thing about Win Shares fielding is it is an evaluation system unlike more usual systems, like DFTs or CAD. There are many parts that are not done well or right, but the environment of Wins, not Hits, is unique, and gives a different angle. This is a good thing, as the more tools, the merrier we are.
For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assist from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?
Now, do the same, but for a top team. Are the changes similar? The changes should be very similar, as putting Vlad on the Expos or Vlad on the Mariners as a hitter should have a very similar effect (though not exactly). This method doesn't apply to pitchers.
Because of the complexity of the James calculations, I find that it is easier to look at various marginal effects to note what is really happening under the hood.
I said the idea of such adjustments makes sense. James just did not do them right.
Both Table 9 and Table 10 include hits, outs, errors, and fielders' choices.
Don't you have to make a distinction between fielding ability and contribution to winning?
Yes, you do - and WS measures only the latter directly. James measures capability indirectly, primarily by "spreading the goodness around", to quote Charlie.
Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.
If that were true, then you wouldn't expect much, if any, correlation between fielder opportunities and WS. An average fielder on an average team will have more opportunities and more plays as a result of those opportunities than an average fielder on a good team, but should have no more Win Shares (after prorating for playing time). In fact, there is a mild positive correlation between WS/1000 innings and fielder opportunities.
For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assist from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?
Now, do the same, but for a top team. Are the changes similar?
Suppose you take one assist away from your 3B - a play that he didn't make. The end result could be a play that the SS made in back of him, in which case all you've done is transfer one play directly to the other fielder. But it could also be a hit, which adds the possibility of a play by everyone else. A play missed by the 2B or the 1B or the OF could become another play for the SS. So when doing this type of analysis, you need to construct a model where you evaluate these secondary effects as well. It would probably be better to do this with a simulator.
-- MWE
My objective was simply to determine if James did the claim points correctly by position. Removing an assist from the SS and giving it to the 3B should have a net effect of zero. I don't know that it does. And even if it does, what is the degree of change? Is this change correct?
Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out possible hit.
There is a lot in Win Shares that has not been "proven" other than people's claims that "it works".
This is why Mike's analysis on this topic is so valuable.
Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).
This cannot be emphasized enough. The results gathered from available statistics are unreliable. The Derek Jeter situation defines this.
We have to do a better job of reconciling pbp data to WS, CAD, DFT.
Mike's done a great job with this.
OK, that wasn't clear from the comment. I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.
Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out possible hit.
Suppose you had a team of average pitchers and average fielders. You can tell, from the PBP data, the rate at which each team's fielders make plays, and the percentage of balls that they field in their vicinity (assigning hits in the same proportion as outs, a la UZR). You can assign non-BIP events at the average rate at which they occur. You'd run simulated seasons of 1450 innings, and calculate defensive WS based on the estimated W/L as if the team had average offensive production over those innings.
Now suppose you took that average SS and placed him on a team with the *best* fielder at each position (as determined by the % of balls that they field in their vicinity). Do the same thing, with simulated seasons of 1450 innings. The team W/L percentage with average offensive production should go up, and then you figure the defensive WS based on that WP. The SS should have the same defensive WS (within a reasonable tolerance).
Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).
I don't think you meant to divide by 30 at the end. I think you might have meant to multiply by 10/3 (since 3 WS = 1 win and 10 runs = 1 win, thus 3 WS = 10 runs and 1 WS = 10/3 runs).
I have an updated table in which I fixed the spreadsheet error that Charlie noted (I had actually fixed this error earlier but didn't recalculate the individual WS after I fixed it, so the numbers in the article are off) and included the info Tango requested. I'll incorporate that into a revised version of the article and shoot it to Dan tomorrow.
-- MWE
Since the available shares to claim is limited by the number of team wins, this really puts guys like Arod at a comparative disadvantage relative to guys on good teams, doesn't it? Looking at it this way, Jeter's low numbers seem even more shocking, considering the enhanced win share potential available to him by dint of playing on a good team.
Or have I missed something?
This is an extension of the earlier discussion. There are two forces going on here that counterbalance each other to some extent:
1. A good defensive team will have more WS available to its fielders, thus on average the WS available to a defensive position will be higher.
2. An average defender on a good defensive team will lose chances to his better-fielding teammates, which will reduce the percentage of team defensive WS that he gets.
Ideally, these should balance each other out entirely, so that the average defender will look the same whether he's on a good team or a bad team. I think that, in practice, the second factor is larger than the first factor, so that in general, the worse you are relative to your teammates, the lower your total WS will be. One way that we might check this is to see what happens to WS when a fielder moves from a good defensive team (or a poor defensive team) to a lesser (or better) defensive team - do his fielding WS change? A subject for another study...
-- MWE
I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.
There are *many* forces at work here! While James uses a 50pt scale for assists for 3b, and 40 for assists, the "available total points" for 3B is 24 and for SS is 38(?). So, from this standpoint, the 3B assists get 24 x .5 = 12, and 38 * .4 = 15.2 for SS assists.
I agree with Mike's assessment that there are many forces at work
- breakdown of assists, po, dp within position,
- claim points by position,
- comparison relative to teammates,
- whole team fielding compared to whole team off+def, and
- whole team compared to league.
Whew! Ideally, moving an average guy from a great team to a bad team should have little impact on the WS of the player in question. (It might have some, because of the interaction, like hitters, but generally speaking it would be pretty small.)
However, given all these forces at work, I don't see the evidence that all these forces have been balanced out to the point where we can say that this is a true statement. In fact, I think it is a daunting task to determine the extent to which this is true or false. The number of forces and variables at work here would make it a small project unto itself to determine the validity of the fielding portion of Win Shares. That's not to say that all is lost. All those arbitrary claim and point assignments that James makes may actually be calculable through a more rigid analysis.
Then again, Clay and Charlie's approach is probably just as good and far easier.
Or maybe not. For the 3 years of data you presented, the correlation between a team's "LHP excess" and their GBIP to the left side is a meager .04 and not significant.
The appropriate correlation isn't with the *number* of GBIP to the left side (because that number is affected by the total number of GBIP allowed as well as the number of LHP), but with the *percentage* of GBIP that are hit to the left side. The correlation between excess LHP and %LS is r=+0.443 for AL 1998-2000.
-- MWE
The primary reason for this is that Davenport credits a larger percentage of pitching+fielding to the fielders than does James. James splits pitching+fielding as 67.5% to the pitchers, 32.5% to the fielders; Davenport allocates K, HB, BB, and HR 100% to the pitchers, allocates errors 100% to the fielders, and divides other events as 75% fielding and 25% pitching (used to be 70/30). When you make that division for an average team, and then estimate the runs that result from each set of events (using your favorite run estimator), you wind up with something in the vicinity of 60% of the total runs resulting from events credited to the pitchers and 40% to the fielders.
-- MWE
Yes, and since I had it for 1998-2000 I used it.
-- MWE
No. I ran a different version of the query (excluding all bunts, not just SH) and got percentages more in line with 1998 for 1999 and 2000. I also didn't try to assign BIP for which location information was missing, as I had done in the earlier effort - and which I apparently didn't do very accurately, as it turns out. The league percentages for GBIP LS, excluding all bunts, were 56.4% in 1998, 55.1% in 1999, and 55.6% in 2000. There were four teams over 58% in 1998, one in 1999, and 2 in 2000, with 1 team under 54% in 1998, 3 in 1999, and 4 in 2000. The overall correlation factor between excess LHP and % of GBIP to the left side rises to r=+0.447. The Yankees are still below average in GBIP LS in all three seasons.
1998 is still odd, though. Four of the top 5 teams in GBIP LS % have more RHP than expected.
-- MWE