— Where BTF's Members Investigate the Grand Old Game
Sunday, July 15, 2007
Fun With Leverage: Is Perception Reality?
—from Tango’s article at The Hardball Times.
Baseball fans - and sportswriters - who are not oriented toward statistical analysis tend to have a fixation with the concept of “clutch hitting”. In spite of numerous studies over the years that show that clutch ability - if it exists at all - tends to be relatively small, fans still argue that so-and-so is truly a “clutch god” or a “choker”.
One issue that we’ve had in trying to evaluate clutch performance from an analytical standpoint is that it’s been difficult to come up with a consistent definition of “clutch situation” that doesn’t do one of two things:
1. aggregate too many “unlike things” together (e.g. performance with runners in scoring position, which equates runner on second/two outs with bases loaded/no outs even though there is a very different potential impact on the game situation);
What Leverage Index does is to place every plate appearance on a sliding scale based on potential game impact. As Tango notes in the quote I highlighted above, most people know clutch when they see it, even if they can’t necessarily define it. LI does an excellent job of accurately capturing the relative importance of game situations from the viewpoint of a typical fan.
Suppose we look at some randomly selected game situations (per Tango’s chart):
Leverage Index 1.0
Leverage Index 2.0
Leverage Index 3.0
While there may be some quibbles about the value assigned to specific situations (like the last one in the 3.0 group), I think that most people would agree that, in general, the relative game importance of the situations as a group mirrors the LI assigned to the group. Most fans would recognize the last group of situations as being more “clutch” than the next-to-last group, in my opinion, and the first group as containing the fewest clutch situations.
LI can be used as a weighting factor, to weight a player’s plate appearances by their relative game importance. Consider a .250 hitter who bats in the following game situations over 24 plate appearances:
12 PA with LI = 0.5
8 PA with LI = 1.0
4 PA with LI = 2.0
If we weight his PA by LI, the 12 PA in the lowest LI situation would be the equivalent of 6 “normal” PA, and the 4 PA in the highest LI situation would be the equivalent of 8 “normal” PA, giving him a weighted equivalent of 22 PA. Suppose that player goes 4-12 in the 0.5 LI situations, 2-8 in the 1.0 LI situations, and 0-4 in the 2.0 LI situations. If we weight that performance by LI, we get:
2-6 weighted by 0.5 LI
2-8 weighted by 1.0 LI
0-8 weighted by 2.0 LI
or 4-22, a weighted performance of .182. If, on the other hand, the player went 2-12 in the low leverage situations, 2-8 in the middle, and 2-4 in the high leverage situations, we’d now have
1-6 weighted by 0.5 LI
2-8 weighted by 1.0 LI
4-8 weighted by 2.0 LI
or 7-22, a weighted performance of .318.
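The arithmetic in the two examples above can be checked with a minimal Python sketch. The function name and the (hits, PA, LI) tuple layout are made up for illustration, and, like the example itself, the sketch treats every plate appearance as an at-bat.

```python
def li_weighted_average(buckets):
    """Batting average with each (hits, pa, li) bucket counted li times."""
    weighted_hits = sum(li * hits for hits, pa, li in buckets)
    weighted_pa = sum(li * pa for hits, pa, li in buckets)
    return weighted_hits / weighted_pa


# The same .250 hitter (6-24 overall) with two different splits by leverage.
poor_in_high_leverage = [(4, 12, 0.5), (2, 8, 1.0), (0, 4, 2.0)]
good_in_high_leverage = [(2, 12, 0.5), (2, 8, 1.0), (2, 4, 2.0)]

for label, buckets in [("poor", poor_in_high_leverage),
                       ("good", good_in_high_leverage)]:
    # Prints the weighted average to three decimals for each split.
    print(label, f"{li_weighted_average(buckets):.3f}")
```

Both splits add up to the same 6-24 unweighted line; only the distribution of hits across leverage changes the weighted result.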
One could, in this manner, develop weighted performance for each player, weighting his PA by the LI of each situation in which he appeared. If the player’s weighted performance was better than his actual performance, one could conclude that he produced more value in game-important situations (i.e. was more “clutch”); if the player’s weighted performance was worse than his actual performance, one could conclude that he produced less value in game-important situations (i.e. was more of a “choker”). The advantage of doing something like this is that every plate appearance for every player can be included in the study, and plate appearances are weighted in a more-or-less appropriate manner based on a consistent definition of the value of the PA.
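As a rough sketch of what that per-player calculation might look like, the Python below builds a slash line from individual plate appearances, once unweighted and once with every PA counted LI times. The exact weighting formula is an assumption (each counting stat is simply multiplied by LI), the class and function names are hypothetical, and the event model is simplified to outs, hits and walks, ignoring HBP and sacrifices.

```python
from dataclasses import dataclass

# Total bases per at-bat outcome; "BB" (walk) is handled separately.
TOTAL_BASES = {"OUT": 0, "1B": 1, "2B": 2, "3B": 3, "HR": 4}


@dataclass
class PlateAppearance:
    outcome: str  # one of "OUT", "1B", "2B", "3B", "HR", "BB"
    li: float     # Leverage Index of the situation


def slash_line(pas, weighted=False):
    """Return (AVG, OBP, SLG); with weighted=True each PA counts LI times."""
    at_bats = hits = total_bases = plate_apps = times_on_base = 0.0
    for pa in pas:
        w = pa.li if weighted else 1.0
        plate_apps += w
        if pa.outcome == "BB":
            times_on_base += w  # a walk is a PA but not an at-bat
            continue
        bases = TOTAL_BASES[pa.outcome]
        at_bats += w
        total_bases += w * bases
        if bases > 0:
            hits += w
            times_on_base += w
    # Assumes at least one at-bat; there is no guard against division by zero.
    return hits / at_bats, times_on_base / plate_apps, total_bases / at_bats


def ops_difference(pas):
    """Weighted OPS minus actual OPS; positive means more value in big spots."""
    _, obp, slg = slash_line(pas)
    _, w_obp, w_slg = slash_line(pas, weighted=True)
    return (w_obp + w_slg) - (obp + slg)


# Made-up sample: the one hit comes in the highest-leverage PA.
sample = [
    PlateAppearance("HR", 2.0),
    PlateAppearance("OUT", 0.5),
    PlateAppearance("OUT", 0.5),
    PlateAppearance("BB", 1.0),
]
print(ops_difference(sample) > 0)
```

Because the sample hitter's home run comes in the 2.0 LI plate appearance and his outs in the 0.5 LI ones, his weighted line is better than his actual line and the difference is positive.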
Now, having said all of that, I don’t think that doing this the simple way actually has a lot of analytical value. There appears to be a small inverse relationship between weighted performance and average leverage - IOW, the higher the average leverage a player sees, the worse his weighted performance is likely to be. Since leverage opportunities are not evenly distributed (they depend on lineup position and team quality, at a minimum), it’s not entirely clear that the weighted performance is fair. That’s why this article is called “Fun with Leverage” - this shouldn’t be taken as a serious attempt to answer the clutch question but as more of a throwaway. But I decided to write this article anyway, because it’s eerie how well some of the results match the perception that many fans have of certain players - and may at least give some insight into why people have picked those labels up.
The play-by-play data from 2003-2006 that I used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet. The LIs were derived from Dave Studenmund’s Win Expectancy worksheet, which is available from the Baseball Graphs site. I didn’t make any sort of year-to-year park or run environment adjustments, in an effort to keep it (relatively) simple.
There were, from 2003-2006, 1029 players who were not pitching at the time and who batted in at least one game. Collectively, these 1029 players hit .270/.338/.433 overall. When their plate appearances were weighted by LI, the collective performance of those players was .271/.345/.431, a net gain of 5 points in OPS. This reflects a fairly typical tradeoff that occurs in high-leverage situations - pitchers are more willing to allow a walk, less inclined to allow an extra-base hit. It may also reflect the “protecting the lines” mentality that permeates baseball teams late in close games.
From that set of 1029 players, I identified a smaller group of 153 players who had at least 250 plate appearances in each of the four seasons 2003-2006. These players I cast as “regulars” - players who got consistent playing time - and the smallest number of PAs any one of these players had was 1220 (Juan Castro). These players, collectively, hit .281/.351/.455 - they were a better group of players across the board. Their collective performance weighted by LI was .282/.358/.454 for a net gain of 6 points in OPS - basically the same pattern as shown by all players.
Finally, within the set of 152 regulars, I took the top 36 hitters, all of whom had OPS of at least .850. These good hitters, collectively, hit .293/.383/.529 unweighted, and .296/.395/.531 weighted by LI - a gain of 14 points in OPS. I found it interesting that, even though they had a larger OBP increase than the other groups, the good hitters maintained their isolated power where the other groups lost some of theirs (although the numbers are small and not especially significant). There was virtually no difference in average LI among the three groups.
These group totals set expectations for weighted performance, in my opinion. We would expect modest, OBP-heavy gains in OPS from the typical hitter when his performance is weighted by LI. A really good high-leverage performer would see larger gains; a poor one would see smaller gains, or a decline.
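For reference, the OPS gains quoted for the three groups follow directly from their slash lines. The short Python sketch below recomputes them; the helper names are illustrative, and the inputs are the rounded three-decimal figures given above rather than the underlying play-by-play data.

```python
def ops_points(obp, slg):
    """OPS in points (thousandths), rounding each component first."""
    return round(obp * 1000) + round(slg * 1000)


def ops_gain(unweighted, weighted):
    """Weighted minus unweighted OPS in points; inputs are (AVG, OBP, SLG)."""
    return ops_points(weighted[1], weighted[2]) - ops_points(unweighted[1], unweighted[2])


# (unweighted, LI-weighted) slash lines for each group, as quoted in the text.
groups = {
    "all hitters": ((0.270, 0.338, 0.433), (0.271, 0.345, 0.431)),
    "regulars": ((0.281, 0.351, 0.455), (0.282, 0.358, 0.454)),
    "good hitters": ((0.293, 0.383, 0.529), (0.296, 0.395, 0.531)),
}

for name, (unweighted, weighted) in groups.items():
    print(f"{name}: {ops_gain(unweighted, weighted):+d} points of OPS")
```

This gives gains of 5, 6 and 14 points respectively, matching the figures in the text.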
Looking at the group of good hitters, we have:
Top 5, weighted OPS - actual OPS:
Carlos Delgado, .285/.391/.566 unweighted, .310/.416/.618 weighted, 77 point gain
Bottom 5, weighted OPS - actual OPS:
Travis Hafner, .299/.404/.590 unweighted, .289/.399/.563 weighted, 32 point loss
The top five have been well-publicized for their “clutchiness”. The bottom 5 aren’t particularly well-known as “chokers” - with the possible exception of Tejada - but Alfonso Soriano, who was sixth from the bottom, does have something of an “unclutch” reputation.
ARod, FWIW, hit .299/.396/.562 overall, but had a weighted performance of .297/.403/.557, for a 2-point OPS gain. This placed him 24th among the 36 good hitters, which, especially in comparison to Jeter, probably explains a lot of the perception of ARod as a player who doesn’t produce when it counts. Manny Ramirez, who also has a bit of an “unclutch” reputation, hit .311/.412/.602 overall and .312/.429/.594 weighted, a 9-point OPS gain but with a larger loss of power than the typical good hitter showed.
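One way to put these individual lines in context is to compare each player's OPS gain with the 14-point gain of the good-hitter group as a whole. The Python sketch below does that for the four players whose slash lines are quoted above; the "gain minus group baseline" measure and all names in the code are illustrative assumptions, not part of the original study.

```python
GOOD_HITTER_BASELINE = 14  # OPS points gained by the good-hitter group overall


def ops_gain(unweighted, weighted):
    """Weighted minus unweighted OPS in points; inputs are (AVG, OBP, SLG)."""
    before = round(unweighted[1] * 1000) + round(unweighted[2] * 1000)
    after = round(weighted[1] * 1000) + round(weighted[2] * 1000)
    return after - before


# (unweighted, LI-weighted) slash lines as quoted in the text.
players = {
    "Carlos Delgado": ((0.285, 0.391, 0.566), (0.310, 0.416, 0.618)),
    "Travis Hafner": ((0.299, 0.404, 0.590), (0.289, 0.399, 0.563)),
    "ARod": ((0.299, 0.396, 0.562), (0.297, 0.403, 0.557)),
    "Manny Ramirez": ((0.311, 0.412, 0.602), (0.312, 0.429, 0.594)),
}

# Gain relative to what a typical good hitter picks up from LI weighting.
relative = {
    name: ops_gain(unweighted, weighted) - GOOD_HITTER_BASELINE
    for name, (unweighted, weighted) in players.items()
}

for name, diff in relative.items():
    print(f"{name}: {diff:+d} points vs. the typical good hitter")
```

On these numbers Delgado comes out 63 points above the group baseline and Hafner 46 below, while ARod (12 below) and Ramirez (5 below) both trail the typical good hitter despite their small positive gains.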
While there are some mismatches between weighted performance and perception - Bobby Abreu was just behind Jeter, JD Drew and Adam Dunn were also pretty high, and Andruw Jones and Miguel Cabrera are fairly low on the list - as a general rule I think that performance weighted by LI matches perception of clutch value quite well. Whether this has any analytical significance remains to be seen, but I think it offers a starting point.