
Primate Studies - Where BTF's Members Investigate the Grand Old Game

## Monday, November 18, 2002

## And the Beat Goes On: Derek Jeter and the State of Fielding Analysis in Sabermetrics - Part 5

Mike tackles the latest from Bill James, Win Shares.

## The Return of the Master - Win Shares

Bill James, in the 1984

As did Davenport and Saeger, James also realized that it was important to evaluate fielding first by the performance of the team, then by the performance of the individual fielders on the team. When using individual fielding statistics such as Range Factor as the starting point, James notes that:
(Win Shares, page 110; emphasis is in the original text)
James goes on to note that this assumption cannot be correct, because it is clear that all defensive teams are not equal, and that therefore individual fielding must be evaluated within the context of the team.

I’m going to run through the model in detail for shortstops. Primer author Joe Dimino has put together a spreadsheet for calculating Win Shares for a league-season, which I used to perform the calculations.

In the Win Shares system, James assigns each team three Win Shares for each team win. He then divides those Win Shares between offense and defense (pitching+fielding), assigning roughly 48% of the Win Shares on average to offense and 52% to defense. He then divides the defensive totals between pitching and fielding, with roughly 35% of the team total on average going to pitching and 17% to fielding. So a team that wins 114 games, like the 1998 Yankees, will have 342 total Win Shares to divide, and with a normal performance on offense and defense will have about 164 of those Win Shares assigned to the hitters, 120 assigned to the pitchers, and 58 to the fielders.

The 1998 Yankees actually had 171.5 Win Shares assigned to the hitters, 118 to the pitchers, and 52.5 to the fielders. I should note that James has placed a maximum and minimum limit on the number of Win Shares assigned to the fielders, and the standard formula would have resulted in a number of Win Shares assigned to the fielders that would have exceeded the maximum allowable number, so that the fielders actually could not get more than the 52.5 they received. The fielders would have had about another 1.5 WS without the limit.

Once James determines the Win Shares to be assigned to the fielders, he determines how to divide those Win Shares between the team’s defenders at each position - assigning Win Share values to the team’s catchers, 1Bs, 2Bs, etc. Pitcher fielding for some reason is not included.
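The baseline split described above is simple arithmetic, and a minimal sketch may help. It assumes the rough league-average shares quoted in the text (48% offense, 35% pitching, 17% fielding); the function name and dictionary keys are my own, and the real method works out team-specific shares rather than applying fixed ones.

```python
def split_team_win_shares(wins, offense=0.48, pitching=0.35, fielding=0.17):
    """Split a team's Win Shares using rough league-average shares (assumed fixed here)."""
    total = 3 * wins  # three Win Shares per team win
    return {
        "total": total,
        "offense": total * offense,
        "pitching": total * pitching,
        "fielding": total * fielding,
    }


# A 114-win team with a "normal" offense/defense balance.
baseline = split_team_win_shares(114)
for part, value in baseline.items():
    print(f"{part}: {value:.1f}")
```

The actual 1998 Yankees figures (171.5 / 118 / 52.5) differ from this baseline because the team-specific split and the fielding cap are not modeled here.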
Once Win Shares are assigned to each position, the Win Shares at each position are divided between all of the fielders who played at that position. The assignment of fielding Win Shares by position, and the assignment of fielding Win Shares at a position, are based on a Claim Point system. Each positive accomplishment by fielders yields a certain number of Claim Points, and the Win Shares are divvied up based on the percentage of the total claim points accumulated. The principle is the same for both team defensive positions and individual defenders at a position.

In dividing fielding Win Shares by position, James evaluates fielders based on four defensive characteristics for their position. While there are some variations from position to position, James generally weights what he considers to be the most important fielding characteristic at that position on a 40-point scale, the next most important on a 30-point scale, then 20, then 10. For shortstops, the 40-point scale is based on Assists, the 30-point scale is based on Double Plays, the 20-point scale is based on Error Percentage, and the 10-point scale is based on Putouts.

I’m going to walk through the calculations for the 1998 Yankee shortstops (Jeter, Luis Sojo, and Homer Bush):

Yankee shortstops in 1998 had 440 assists. This total is compared to the number of assists that they should have been expected to get, calculated as:

(Team assist total)*(% of assists by league SS)+(excess batters faced by LHP/100)

where “excess batters faced by LHP” is

(LHP BIP) - (team BIP * lg % of BIP vs LHP)

BIP being innings pitched, multiplied by 3, minus strikeouts.

The 1998 Yankees had 1642 assists as a team. The league percentage of assists by shortstops was .29225, and the Yankees’ LHP faced 368 more batters than expected. Thus, Yankee shortstops would have been expected to have 1642*0.29225+(368/100)=483.55 assists.
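The expected-assist step can be written out as a short sketch. The function and parameter names are illustrative only (they are not from James or from the spreadsheet), and the inputs are the 1998 Yankees figures quoted above.

```python
def balls_in_play(innings_pitched, strikeouts):
    """BIP as defined in the text: innings pitched times 3, minus strikeouts."""
    return innings_pitched * 3 - strikeouts


def excess_lhp_bip(lhp_bip, team_bip, league_lhp_share):
    """Balls in play against LHP beyond what the league-average share would give."""
    return lhp_bip - team_bip * league_lhp_share


def expected_ss_assists(team_assists, league_ss_assist_share, excess_lhp):
    """Expected shortstop assists, with the LHP adjustment of one assist per 100 excess BIP."""
    return team_assists * league_ss_assist_share + excess_lhp / 100


# 1998 Yankees: 1642 team assists, league SS share .29225, 368 excess batters vs LHP.
expected_assists = expected_ss_assists(1642, 0.29225, 368)
print(f"expected SS assists: {expected_assists:.2f}")
```

The excess of 368 is taken from the text as given; the article does not list the underlying LHP balls-in-play totals, so `excess_lhp_bip` is shown only to make the definition concrete.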
The number of claim points that SS get for assists is

20+(actual A - expected A)/4

For the Yankees this is 20+(440-483.55)/4, or 9.11 claim points.

The claim points for double plays are awarded based upon the team total of DPs turned, compared to the expected number of double plays for that team. The 1998 Yankees turned 146 double plays. The first estimate of the expected number of double plays is calculated based on the number of runners on 1B in DP situations, estimated as:

((H-HR)*(league % of singles allowed))+BB+HBP-SH-WP-Bk-PB

times the league percentage of such runners removed on double plays (DP divided by the same formula above, calculated for the league, using actual singles allowed as the first factor). For the 1998 Yankees, this value is:

(((1357-156)*0.75196)+466+68-37-37-5-12)*0.0994=134.08

This first estimate is then adjusted for the number of assists per inning, compared to the league average, on the theory that teams with more assists tend to see more ground balls than the norm, and thus are likely to turn more DPs. The 1998 Yankees had 1642 assists in 1456 2/3 innings; the league as a whole had 23295 assists in 20194 2/3 innings. The ratio of the Yankees’ A/inning to the league is therefore 0.9772, and the expected number of DPs is thus (134.08*0.9772), or 131.

The claim points for double plays are calculated as:

15+(actual DP - expected DP)/4

or for the Yankees, 15+(146-131)/4 = 18.75

Error percentage for shortstops is the complement of fielding average; it is (errors)/(total chances). The 1998 Yankee SS handled 703 chances and made 11 errors, for an error percentage of 0.01564. The league’s SS handled 10884 chances and made 308 errors, an error percentage of 0.02829. The claim points for error percentage are calculated as:

20 - (10 * (team SS error %)/(league SS error %))

For the Yankees, this is 20 - (10 * 0.01564 / 0.02829), or 14.47 claim points.

The 1998 Yankees had 252 putouts by their shortstops.
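Before the putout step, the assist, double play, and error components worked through above can be checked with a small sketch. The function names are mine, not James's. Because the league rates are printed to only a few digits, recomputing the first DP estimate from them lands within a fraction of a double play of the 134.08 quoted in the text, so the sketch carries the article's 134.08 forward for the adjustment.

```python
def assist_claim_points(actual, expected):
    """40-point scale: 20 points at expectation, one point per four assists above or below."""
    return 20 + (actual - expected) / 4


def runners_on_first(h, hr, single_share, bb, hbp, sh, wp, bk, pb):
    """Estimated runners on 1B in DP situations."""
    return (h - hr) * single_share + bb + hbp - sh - wp - bk - pb


def assist_rate_ratio(team_a, team_inn, lg_a, lg_inn):
    """Team assists per inning relative to the league."""
    return (team_a / team_inn) / (lg_a / lg_inn)


def dp_claim_points(actual, expected):
    """30-point scale: 15 points at expectation, one point per four DPs above or below."""
    return 15 + (actual - expected) / 4


def error_claim_points(team_e, team_tc, lg_e, lg_tc):
    """20-point scale based on team error percentage relative to the league."""
    return 20 - 10 * (team_e / team_tc) / (lg_e / lg_tc)


assist_pts = assist_claim_points(440, 483.55)
# First DP estimate recomputed from the printed (rounded) league rates.
first_estimate = runners_on_first(1357, 156, 0.75196, 466, 68, 37, 37, 5, 12) * 0.0994
ratio = assist_rate_ratio(1642, 1456 + 2 / 3, 23295, 20194 + 2 / 3)
expected_dp = round(134.08 * ratio)  # uses the article's first estimate
dp_pts = dp_claim_points(146, expected_dp)
error_pts = error_claim_points(11, 703, 308, 10884)

print(f"assists: {assist_pts:.2f}, DP: {dp_pts:.2f}, errors: {error_pts:.2f}")
```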
That total of 252 is compared to the number of putouts they were expected to make, which is:

(team PO - team K)*(lg % of non-K PO by SS) + (BB above/below lg average)/14 - (excess batters faced by LHP)/64

AL shortstops in 1998 had 0.0815 of their league’s non-strikeout putouts. Yankee pitchers walked 0.0615 men per inning fewer than the league average, or 89.67 men fewer than the league average in their 1456 2/3 innings. As noted above, they had 368 more batters faced by LHP than the league average. Thus, their shortstops would have been expected to have:

(4370-1080)*0.0815-89.67/14-368/64, or 255.982 putouts.

This is translated to claim points by the formula:

5 + (actual putouts - expected putouts)/15

For the Yankee SS, this yields 5 + (252 - 255.982)/15, or 4.74 claim points.

James converts the claim points to a claim percentage. Since there are 100 claim points available per position, the claim percentage is simply the total of the claim points for each aspect of the position, divided by 100. For the Yankee SS, this is (9.11+18.75+14.47+4.74)/100, or .4707.

When the calculations are run through, the following claim percentages result for the other positions on the 1998 Yankees:
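For the shortstops, the putout component and the resulting claim percentage can be sketched as follows. Names are illustrative; the walk differential is passed as a signed number (negative when the staff walks fewer men than the league), which is my reading of the "above/below" term in the formula. The four component values in the last step are the ones printed in the text.

```python
def expected_ss_putouts(team_po, team_k, lg_ss_share, walks_vs_league, excess_lhp):
    """Expected SS putouts; walks_vs_league is negative for a staff that walks fewer men."""
    return (team_po - team_k) * lg_ss_share + walks_vs_league / 14 - excess_lhp / 64


def putout_claim_points(actual, expected):
    """10-point scale: 5 points at expectation, one point per 15 putouts above or below."""
    return 5 + (actual - expected) / 15


def claim_percentage(component_points):
    """Total claim points over the 100 available per position."""
    return sum(component_points) / 100


expected_po = expected_ss_putouts(4370, 1080, 0.0815, -89.67, 368)
po_pts = putout_claim_points(252, expected_po)  # close to the article's 4.74
ss_claim_pct = claim_percentage([9.11, 18.75, 14.47, 4.74])

print(f"expected putouts: {expected_po:.2f}")
print(f"putout claim points: {po_pts:.2f}")
print(f"SS claim percentage: {ss_claim_pct:.4f}")
```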
James now uses these claim percentages to assign the fielding Win Shares to individual positions. He does so in the following way:

Recall that the 1998 Yankees have 52.5 Win Shares assigned to the fielders. The shortstops’ claim percentage of .4707, less .200, is weighted by 36 for the position: the shortstops get (0.2707)*36, or 9.74 weighted claim points. The team total, applying the same formula to the other positions, is 74.75 weighted claim points. The shortstops thus get 52.5*9.74/74.75, or 6.84 fielding Win Shares.

Once James has the Win Shares assigned to a position, he then assigns those Win Shares to the individual fielders at that position based on a formula that takes into account the percentage of the plays that the fielder handles. Again, this is done based on a claim point system. Each shortstop is assigned claim points via the following formula:

PO + A*2 - E*5 + DP + RBP*2

RBP stands for range bonus plays. A player gets assigned range bonus plays when it is clear by any interpretation of the data that he is making more plays per inning than his teammates. The formula for assigning these is pretty complicated when you don’t have defensive innings, but when you do have defensive innings, as we do for the 1998 Yankees, the range bonus goes to any player who is making more plays per inning than the average for the position for the team.

The Yankees used three shortstops in 1998. Derek Jeter played 1304 2/3 innings with 223 PO, 393 A, 9 E, and 82 DP. Luis Sojo played 141 innings with 29 PO, 44 A, 2 E, and 12 DP. Homer Bush played 11 innings with 3 A and no other defensive markers. The team total was 252 PO, 440 A, 11 E, and 96 DP in 1456 2/3 innings; the team’s SS averaged 0.475 plays per inning. Derek Jeter should have made 620 plays (PO+A) at the team average; he actually made 616, so he gets no range bonus. Luis Sojo made 73 plays, and should have made 66.98, so he gets 6.02 range bonus plays. Homer Bush made three plays, and should have made 5.23, so he gets no range bonus.
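Returning to the position-level split at the start of this step, here is a minimal sketch of how the shortstops' share of the team's fielding Win Shares falls out. The .200 offset is an assumption read off the numbers in the text (a claim percentage of .4707 becomes 0.2707 before weighting), and the 74.75 team total is taken from the text rather than recomputed, since the other positions' claim percentages are not listed here.

```python
def weighted_claim(claim_pct, position_weight, offset=0.200):
    """Weighted claim points for a position; the .200 offset is inferred, not quoted."""
    return (claim_pct - offset) * position_weight


def position_win_shares(team_fielding_ws, position_claim, team_claim_total):
    """A position's share of the team's fielding Win Shares."""
    return team_fielding_ws * position_claim / team_claim_total


ss_weighted = weighted_claim(0.4707, 36)
ss_ws = position_win_shares(52.5, ss_weighted, 74.75)

print(f"SS weighted claim points: {ss_weighted:.2f}")
print(f"SS fielding Win Shares: {ss_ws:.2f}")
```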
The claim points for these three players using the formula are: Jeter: 1046, Sojo: 131, Bush: 6, for a position total of 1183.
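The range bonus and individual claim point formula can be reproduced from the fielding lines given above. This is a sketch with made-up class and function names; it uses the simple defensive-innings version of the range bonus (plays above the team rate at the position) and then divides the 6.84 position Win Shares in proportion to claim points.

```python
from dataclasses import dataclass


@dataclass
class ShortstopLine:
    name: str
    innings: float
    po: int
    a: int
    e: int
    dp: int


def range_bonus_plays(player, plays_per_inning):
    """Plays made above the team rate at the position; never negative."""
    expected = player.innings * plays_per_inning
    return max(0.0, player.po + player.a - expected)


def claim_points(player, rbp):
    """PO + A*2 - E*5 + DP + RBP*2."""
    return player.po + 2 * player.a - 5 * player.e + player.dp + 2 * rbp


shortstops = [
    ShortstopLine("Jeter", 1304 + 2 / 3, 223, 393, 9, 82),
    ShortstopLine("Sojo", 141, 29, 44, 2, 12),
    ShortstopLine("Bush", 11, 0, 3, 0, 0),
]

team_innings = sum(p.innings for p in shortstops)
team_plays = sum(p.po + p.a for p in shortstops)
plays_per_inning = team_plays / team_innings

points = {
    p.name: claim_points(p, range_bonus_plays(p, plays_per_inning))
    for p in shortstops
}
total_points = sum(points.values())

position_ws = 6.84  # shortstop fielding Win Shares from the previous step
shares = {name: position_ws * pts / total_points for name, pts in points.items()}

for name in points:
    print(f"{name}: {points[name]:.0f} claim points, {shares[name]:.2f} WS")
```

Small differences in the second decimal place against the printed figures come from rounding at intermediate steps.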
The position Win Shares are assigned based on the proportion of the claim points that the individual earns. Jeter gets 6.84*1046/1183, or 6.04 Win Shares for his fielding. Sojo gets 6.84*131/1183, or 0.76 Win Shares, and Bush gets 6.84*6/1183, or 0.03 Win Shares (rounding accounts for the other 0.01 Win Share).

The calculations for 1999 and 2000 give Jeter 5.02 and 2.70 Win Shares for fielding, respectively, for those two seasons.

Because Win Shares are dependent on playing time, to compare shortstops I determine the number of Win Shares earned per 1000 innings played. Table 8 shows the results of this comparison for the years 1998-2000 among SS with 800 innings or more, with Palmer’s FR, Davenport’s DFTs, Saeger’s CAD Defensive Winning Percentages, and fieldable shortstop opportunities per nine innings included for comparison purposes:

## Table 8. AL SS Win Shares /1000 Inn, 1998-2000 (min 800 inn)
The correlation coefficient between Win Shares per 1000 innings and the other data presented for comparison purposes:

Win Shares and FR: r=+0.696 (given James’s take on Fielding Runs, this might come as a surprise to him)
There have been some criticisms of the Win Shares fielding system, many of them having to do with the arbitrary nature of James’s weights for defensive events. While I think those criticisms are reasonable, I’m more concerned here with whether James has accurately captured overall defensive value in his approach - in other words, can we rely upon the conclusion that a player with 6 defensive Win Shares is a better fielder than a player with 4? If James is accurately evaluating the relative importance of defensive events and accounting for team contextual effects, his rankings should give reliable results, regardless of the specific scale that he uses. If he is not, then we should be able to evaluate the specific shortcomings of the method by looking at how well the results compare with inferences drawn from play-by-play data.

It is clear, looking at the ratings, that Jeter’s Win Shares totals are heavily influenced by his low assist totals. Yankee SS (mostly Jeter) putout totals are close to league average, and their double play rates (at least in this method, although as I noted in Part 4 they are very bad in relation to actual DP chances) have been only slightly under the expected rate. Except for 2000, the Yankee SS have very good error rates. But their assist rates have been very low - on the 40-point scale Yankee SS had fewer than 10 claim points in each of the three seasons. Since SS lose one claim point for every four assists that they fall short of their expectation, and since a SS who exactly meets the expectation gets 20 claim points, Yankee SS are getting

James makes one major adjustment for ball distribution in his method, based on balls in play vs LHP in excess of the league average total. Ordinarily, one might expect this adjustment to increase the number of balls hit into play on the left side - and in most cases that happens. As calculated in the Win Shares method, the Yankees had an excess of LHP in both 1998 and 2000, and were exactly league average in 1999. Thus, one might have expected the Yankees to have more GBIP to the left side in 1998 and 2000, and a league average total in 1999. The reality is quite different, as Table 9 demonstrates:

## Table 9. AL Distribution of Ground Balls in Play, 1998-2000
In all three seasons, the Yankees had far fewer GBIP hit to the left side than would be expected given the balls in play against their LHP. In this particular case, the adjustment penalizes Yankee SS for plays that they never had a chance to make. Obviously, this is also true of the other methods that we have discussed, where left/right adjustments based on the orientation of the pitching staff are used - the point here is not to criticize the use of the adjustment but to point out that, in the specific case of the Yankees, the adjustment leads to a bias in the measurement because the Yankees’ ball distribution doesn’t fit within the model on which the method is based. Furthermore, that bias in the model, when applied to the Yankees, would have the effect of reducing the ranking of the team’s shortstops.

I also decided to take a look at James’s assumption that an average shortstop would get assists at the same rate, regardless of whether his team had a high number of assists or a low number of assists. I identified 10 extreme fly ball teams in the 1998-2000 AL, and 11 extreme ground ball teams during that period, and took a look at the ratio of shortstop assists to team assists for each of those teams. The shortstops on the flyball teams averaged 28.8% of team assists; the shortstops on the groundball teams averaged 29.0% of team assists. The groundball teams faced 4% more left handed hitters, which would reduce the number of assists that their SS get. When I adjusted for this at the rate at which those team SS got assists when facing LHB and RHB, and weighted the rate based on each team’s pitchers facing an average mix of LHB and RHB, the SS on the flyball teams would have averaged 28.6% of team assists and the SS on the groundball teams would have averaged 29.3% of team assists.
When you make a similar correction for LH/RH batters faced for all of the AL teams between 1998 and 2000, there is virtually no correlation between the team’s groundball rate and the percentage of assists that its shortstops get (r=-0.058 over the 1998-2000 period). Thus, while I thought that shortstops who played behind ground ball staffs might have a higher percentage of assists than SS that played behind fly ball staffs, that is apparently not the case, and thus using the league-average rate as a ba

## Reader Comments and Retorts


(James is a little fuzzy on this too; it's not entirely clear whether he is measuring capability or contribution.)

The result of this choice is that fielders are penalized in Win Shares for every play made by a fielder at another position - plays which they themselves had no chance to make.

Actually, this makes sense in the James scheme. That is, James assigns the Yanks fielders 52.5 Win Shares, while an average team would have received 41.3 Win Shares. Therefore, under this scheme, the players on the Yanks are being compared *to each other* as well as to the league. Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.

I'm not sure how clear all that was, or if I'm contradicting what Mike or James are saying. But, as best as I can figure it, that's what James is trying to do.

FWIW, the nice thing about Win Shares fielding is that it is an evaluation system unlike more usual systems, like DFTs or CAD. There are many parts that are not done well or right, but the environment of Wins, not Hits, is unique, and gives a different angle. This is a good thing, as the more tools, the merrier we are.

For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assists from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?

Now, do the same, but for a top team. Are the changes similar? The changes should be very similar, as putting Vlad on the Expos or Vlad on the Mariners as a hitter should have a very similar effect (though not exactly). This method doesn't apply to pitchers.

Because of the complexity of the James calculations, I find that it is easier to look at various marginal effects to note what is really happening under the hood.

I said the idea of such adjustments makes sense. James just did not do them right.

Sorry if I missed something, but... in your Table 9, do the GBIP LS and GBIP RS columns include ground-ball base hits, or outs only?

Both Table 9 and Table 10 include hits, outs, errors, and fielders' choices.

Don't you have to make a distinction between fielding ability and contribution to winning?

Yes, you do - and WS measures only the latter directly. James measures capability indirectly, primarily by "spreading the goodness around", to quote Charlie.

Therefore, an average fielder playing in an above average fielding team (as noted by the James system of 52.5 WS versus 41.3) should come out looking worse than his teammates, but still come out looking exactly like his average counterpart on an average team.

If that were true, then you wouldn't expect much, if any, correlation between fielder opportunities and WS. An average fielder on an average team will have more opportunities and more plays as a result of those opportunities than an average fielder on a good team, but should have no more Win Shares (after prorating for playing time). In fact, there is a mild positive correlation between WS/1000 innings and fielder opportunities.

For example, start off with an average team. Add 2 assists to your SS, and subtract 2 assists from your 2B. You'll probably have to subtract 1 putout from your SS, and add 1 to your 2B. Now, what is the effect? How much WS did your SS gain, and how much did your 2B lose? Do the same, but with your SS and 3B (this time, you don't need to worry about PO). What happened?

Now, do the same, but for a top team. Are the changes similar?

Suppose you take one assist away from your 3B - a play that he didn't make. The end result could be a play that the SS made in back of him, in which case all you've done is transfer one play directly to the other fielder. But it could also be a hit, which adds the possibility of a play by everyone else. A play missed by the 2B or the 1B or the OF could become another play for the SS. So when doing this type of analysis, you need to construct a model where you evaluate these secondary effects as well. It would probably be better to do this with a simulator.

-- MWE

My objective was simply to determine if James did the claim points correctly by position. Removing an assist from the SS and giving it to the 3B should have a net effect of zero. I don't know that it does. And even if it does, what is the degree of change? Is this change correct?

Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out or possible hit.

There is a lot in Win Shares that has not been "proven" other than people's claims that "it works".

This is why Mike's analysis on this topic is so valuable.

Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).

This cannot be emphasized enough. The results gathered from available statistics are unreliable. The Derek Jeter situation defines this.

We have to do a better job of reconciling pbp data to WS, CAD, DFT.

Mike's done a great job with this.

My objective was simply to determine if James did the claim points correctly by position.

OK, that wasn't clear from the comment. I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.

Your objective is also a good one, and should also be done. When doing it your way, you also have an additional complicated wrinkle that the total number of wins will go down, ever so slightly, if you remove a sure out, and replace it with a possible out or possible hit.

Suppose you had a team of average pitchers and average fielders. You can tell, from the PBP data, the rate at which each team's fielders make plays, and the percentage of balls that they field in their vicinity (assigning hits in the same proportion as outs, a la UZR). You can assign non-BIP events at the average rate at which they occur. You'd run simulated seasons of 1450 innings, and calculate defensive WS based on the estimated W/L as if the team had average offensive production over those innings.

Now suppose you took that average SS and placed him on a team with the *best* fielder at each position (as determined by the % of balls that they field in their vicinity). Do the same thing, with simulated seasons of 1450 innings. The team W/L percentage with average offensive production should go up, and then you figure the defensive WS based on that WP. The SS should have the same defensive WS (within a reasonable tolerance).

Mike, another comment: can you also put Jeter's ( WS - (avg WS for a SS per inning) x (Jeter's innings) ) / 30. The avg WS for a SS per inning is essentially a given, and the same every year. This would simply put things in the same scale of Charlie and Clay and Pete (as plus/minus runs against average).

I don't think you meant to divide by 30 at the end. I think you might have meant to multiply by 10/3 (since 3 WS = 1 win and 10 runs = 1 win, thus 3 WS = 10 runs and 1 WS = 10/3 runs).
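As a rough sketch of the conversion being discussed (3 WS = 1 win, 10 runs = 1 win), with hypothetical function names and a placeholder league-average rate, since the thread gives no figure for the average SS Win Shares per inning:

```python
WS_PER_WIN = 3
RUNS_PER_WIN = 10


def win_shares_to_runs(win_shares):
    """Convert Win Shares to runs: 1 WS = 10/3 runs."""
    return win_shares * RUNS_PER_WIN / WS_PER_WIN


def runs_vs_average(win_shares, innings, avg_ws_per_inning):
    """Runs above or below an average SS over the same innings."""
    return win_shares_to_runs(win_shares - avg_ws_per_inning * innings)


# avg_ws_per_inning=0.005 is a placeholder for illustration, not a measured value.
example_runs = runs_vs_average(6.0, 1000, 0.005)
print(f"{example_runs:+.1f} runs vs. average")
```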

I have an updated table in which I fixed the spreadsheet error that Charlie noted (I had actually fixed this error earlier but didn't recalculate the individual WS after I fixed it, so the numbers in the article are off) and included the info Tango requested. I'll incorporate that into a revised version of the article and shoot it to Dan tomorrow.

-- MWE

Since the available shares to claim are limited by the number of team wins, this really puts guys like Arod at a comparative disadvantage relative to guys on good teams, doesn't it? Looking at it this way, Jeter's low numbers seem even more shocking, considering the enhanced win share potential available to him by dint of playing on a good team.

Or have I missed something?

Since the available shares to claim are limited by the number of team wins, this really puts guys like Arod at a comparative disadvantage relative to guys on good teams, doesn't it? Looking at it this way, Jeter's low numbers seem even more shocking, considering the enhanced win share potential available to him by dint of playing on a good team.

This is an extension of the earlier discussion. There are two forces going on here that counterbalance each other to some extent:

1. A good defensive team will have more WS available to its fielders, thus on average the WS available to a defensive position will be higher.

2. An average defender on a good defensive team will lose chances to his better-fielding teammates, which will reduce the percentage of team defensive WS that he gets.

Ideally, these should balance each other out entirely, so that the average defender will look the same whether he's on a good team or a bad team. I think that, in practice, the second factor is larger than the first factor, so that in general, the worse you are relative to your teammates, the lower your total WS will be. One way that we might check this is to see what happens to WS when a fielder moves from a good defensive team (or a poor defensive team) to a lesser (or better) defensive team - do his fielding WS change? A subject for another study...

-- MWE

I don't think the net effect would be zero, because James uses a 50-point scale for assists at 3B and a 40-point scale for SS assists, and because the LHP adjustment differs for 3B and SS.

There are *many* forces at work here! While James uses a 50pt scale for assists for 3B, and 40 for SS assists, the "available total points" for 3B is 24 and for SS is 38(?). So, from this standpoint, the 3B assists get 24 x .5 = 12, and the SS assists get 38 x .4 = 15.2.

I agree with Mike's assessment that there are many forces at work:

- breakdown of assists, po, dp within position,
- claim points by position,
- comparison relative to teammates,
- whole team fielding compared to whole team off+def, and
- whole team compared to league.

Whew! Ideally, moving an average guy from a great team to a bad team should have little impact on the WS of the player in question. (It might have some, because of the interaction, like hitters, but generally speaking it would be pretty small.)

However, given all these forces at work, I don't see the evidence that all these forces have been balanced out to the point where we can say that this is a true statement. In fact, I think it is a daunting task to determine the extent to which this is true or false. The number of forces and variables at work here would make it a small project unto itself to determine the validity of the fielding portion of Win Shares. That's not to say that all is lost. All those arbitrary claim and point assignments that James makes may actually be calculable through a more rigid analysis.

Then again, Clay and Charlie's approach is probably just as good and far easier.

James makes one major adjustment for ball distribution in his method, based on balls in play vs LHP in excess of the league average total. Ordinarily, one might expect this adjustment to increase the number of balls hit into play on the left side - and in most cases that happens.

Or maybe not. For the 3 years of data you presented, the correlation between a team's "LHP excess" and their GBIP to the left side is a meager .04 and not significant.

For the 3 years of data you presented, the correlation between a team's "LHP excess" and their GBIP to the left side is a meager .04 and not significant.

The appropriate correlation isn't with the *number* of GBIP to the left side (because that number is affected by the total number of GBIP allowed as well as the number of LHP), but with the *percentage* of GBIP that are hit to the left side. The correlation between excess LHP and %LS is r=+0.443 for AL 1998-2000.

-- MWE

The rankings are actually very similar, but the actual run amounts are far less than the DFTs. ... So why the big difference?

The primary reason for this is that Davenport credits a larger percentage of pitching+fielding to the fielders than does James. James splits pitching+fielding as 67.5% to the pitchers, 32.5% to the fielders; Davenport allocates K, HB, BB, and HR 100% to the pitchers, allocates errors 100% to the fielders, and divides other events as 75% fielding and 25% pitching (used to be 70/30). When you make that division for an average team, and then estimate the runs that result from each set of events (using your favorite run estimator), you wind up with something in the vicinity of 60% of the total runs resulting from events credited to the pitchers and 40% to the fielders.

-- MWE

FYI -- unless you calculated something differently, CAD does use LHB/RHB adjustments where they are available (not LHP/RHP).

Yes, and since I had it for 1998-2000 I used it.

-- MWE

Actually, you don't even need the regression model to see that 1998 is fundamentally different from 1999 & 2000. In 1998, 4 teams had %LS over 58%. No team in 1999 or 2000 had over 58%. Only 1 1998 team had LS% under 54%. There were 6 under 54% in 1999 and 6 more in 2000. Was there a big dropoff in LHP league-wide between 1998 and 1999?

No. I ran a different version of the query (excluding all bunts, not just SH) and got percentages more in line with 1998 for 1999 and 2000. I also didn't try to assign BIP for which location information was missing, as I had done in the earlier effort - and which I apparently didn't do very accurately, as it turns out. The league percentages for GBIP LS, excluding all bunts, were 56.4% in 1998, 55.1% in 1999, and 55.6% in 2000. There were four teams over 58% in 1998, one in 1999, and 2 in 2000, with 1 team under 54% in 1998, 3 in 1999, and 4 in 2000. The overall correlation factor between excess LHP and % of GBIP to the left side rises to r=+0.447. The Yankees are still below average in GBIP LS in all three seasons.

1998 is still odd, though. Four of the top 5 teams in GBIP LS % have more RHP than expected.

-- MWE
