— Where BTF's Members Investigate the Grand Old Game
Tuesday, July 01, 2003
Protection: Fact or Fiction?
Dylan takes a crack at one of the most hotly contested stathead/non-stathead debates.
One of the most traditionally accepted baseball assumptions is that of "hitting protection": specifically, that a big bat hitting behind the batter helps him offensively. We most recently heard this confirmed in 2002 with Jeff Kent moving in front of Barry Bonds in the SF Giants batting order. Batting order "protection" is so often spouted by broadcasters and fans that it's hard to label it a "theory"; it's more like a given "truth." On the other hand, some assume it doesn't exist at all. I decided to see what the data itself showed. This, of course, is more of a study for fun than a real impact on the game itself.
The common sense theory of "protection" is that the pitcher wants to avoid walking a hitter in front of a big masher who can drive him in. Thus a hitter with "protection" should see more strikes. Since everyone knows this (pitcher, batter, manager, fans, announcers, GMs who trade for "protectors," etc.), the batter has an advantage. This advantage should logically help the hitter produce better numbers at the plate.
A few studies have tried to verify whether this protection advantage exists, and if so, how large this advantage is. [After all, baseball ultimately is a game of numbers: hits, walks, runs, wins, championships, dollars, etc.] The most notable study I'm aware of was done by David Grabiner for the 1991 AL (http://www.baseball1.com/bb-data/grabiner/protstudy.html). He found that there was no conclusive evidence that protection existed on a league-wide scale in the 1991 AL.
I followed Grabiner's approach for my study of 2002 National League hitters. As I describe the method, I think the following perspective might help. This study focuses on the individual "protectees," not the individual "protectors." Think of a "protector" in the on-deck circle as a "split factor" just like other "split factors" (day-night, home-road, LHP-RHP, "clutch"-"non-clutch," etc.). If Rich Aurilia qualified (he doesn't), it wouldn't matter whether Barry Bonds or Jeff Kent was the "protector"; we'd only be looking at his "protected" vs. "unprotected" split.
Below are my classifications in order to help structure the study:
The statistics collected were at bats, hits, walks, total bases, and strikeouts. Batting average (AVG), on-base percentage (OBP), slugging percentage (SLG), and on-base plus slugging (OPS) were then generated from these components. (Please note that this "OBP" is not the official version, because it lacks the hit-by-pitch and sacrifice fly adjustments.)
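For readers who want to reproduce the rate stats, here is a minimal Python sketch of how they could be generated from the five collected components. The class name `BattingLine` and its field names are my own inventions for illustration, and the simplified OBP formula, (H + BB) / (AB + BB), is an assumption about what "lacking hit by pitch and sacrifice fly adjustments" works out to.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for the five collected components."""
    ab: int  # at bats
    h: int   # hits
    bb: int  # walks
    tb: int  # total bases
    k: int   # strikeouts

    def avg(self) -> float:
        # Batting average: hits per at bat.
        return self.h / self.ab if self.ab else 0.0

    def obp(self) -> float:
        # Simplified OBP (assumption): no hit-by-pitch or sacrifice fly terms.
        pa = self.ab + self.bb
        return (self.h + self.bb) / pa if pa else 0.0

    def slg(self) -> float:
        # Slugging percentage: total bases per at bat.
        return self.tb / self.ab if self.ab else 0.0

    def ops(self) -> float:
        # On-base plus slugging.
        return self.obp() + self.slg()


# Made-up numbers, purely to show the calls.
line = BattingLine(ab=100, h=30, bb=10, tb=50, k=20)
print(f"AVG {line.avg():.3f}  OBP {line.obp():.3f}  "
      f"SLG {line.slg():.3f}  OPS {line.ops():.3f}")
```

Because the hit-by-pitch and sacrifice fly terms are missing, these OBP and OPS figures will not match the official ones for any real player.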
ESPN.com (they cite STATS as their source) was used initially. From their "sortable stats" page, the "qualified" filter was used to identify the "protectors" by sorting the SLG column. For each "protectee," under splits, their batting order distribution was noted. Most classic sluggers don't move around in the batting order much; when Barry Bonds and Jeff Kent did, it was news. Mark Bellhorn was an exception, but since many of his at bats were in the leadoff spot, we weren't going to study his protection of Kerry Wood's at bats (this year). Then each team was studied and "protectee" candidates were identified, again using ESPN.com splits. To weed out pinch-hitting effects (and to make the data gathering more efficient), any batting order split with 15 or fewer ABs was excluded. Therefore the totals for a batter will not necessarily equal his totals for 2002. (This applied to both protectors and protectees.)
If a protectee had more than 15 ABs in a batting order spot that was isolated from any possible teammate protector, those were counted as "unprotected." Otherwise, for potential "protectees," Retrosheet was used to review individual box scores and match up games between protectees and protectors.
For the sake of simplicity, the starting lineup was used. If a protectee was listed in front of a protector, he was credited with all of that game's totals as "protected." If a protectee was not in front of a protector, his totals for that game were "unprotected." The assumption was that the pitcher didn't know the protector in the on-deck circle was going to be removed the next inning for defense, or forgot, etc.
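The starting-lineup rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact procedure used for the study: the function names, the data layout (a list of `(lineup, stat_line)` pairs per game), and the player labels are all hypothetical. I also assume "in front of" means the very next lineup slot, that the 9th spot does not wrap around to the leadoff hitter unless `wrap=True`, and that games the batter did not start are skipped.

```python
STAT_KEYS = ("ab", "h", "bb", "tb", "k")


def is_protected(lineup, batter, protectors, wrap=False):
    """True if the next starting-lineup slot after `batter` holds a protector.

    Assumption: "in front of" means the immediately following slot.
    """
    nxt = lineup.index(batter) + 1
    if nxt == len(lineup):
        if not wrap:
            return False
        nxt = 0  # optional wrap-around from the last slot to the first
    return lineup[nxt] in protectors


def split_by_protection(games, batter, protectors, wrap=False):
    """Credit each game's full stat line to 'protected' or 'unprotected'."""
    totals = {
        "protected": dict.fromkeys(STAT_KEYS, 0),
        "unprotected": dict.fromkeys(STAT_KEYS, 0),
    }
    for lineup, line in games:
        if batter not in lineup:
            continue  # assumption: skip games the batter did not start
        protected = is_protected(lineup, batter, protectors, wrap)
        bucket = "protected" if protected else "unprotected"
        for key in STAT_KEYS:
            totals[bucket][key] += line[key]
    return totals


# Made-up games with placeholder names, purely to show the calls.
games = [
    (["Leadoff", "Hitter", "Slugger"], {"ab": 4, "h": 2, "bb": 1, "tb": 3, "k": 1}),
    (["Hitter", "Leadoff", "Slugger"], {"ab": 5, "h": 1, "bb": 0, "tb": 1, "k": 2}),
]
print(split_by_protection(games, "Hitter", {"Slugger"}))
```

Crediting the whole game to one bucket mirrors the simplification above: mid-game substitutions are ignored.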
Using the classifications stated above, there were 23 protectors and 27 protectees. Of those 27 protectees, 11 saw improved OPS when hitting in front of a protector, while 16 saw a decrease. Using (BB+K)/PA as an estimate of whether players were seeing more strikes, 15 batters had a higher (BB+K)/PA with a protector behind them, while 11 had a lower one (Sosa's percentage was almost exactly the same either way). TB/H gives us a good estimate of how hard the batters were hitting the ball. 13 of the batters had more bases per hit with a protector behind them, while 14 had fewer.
Looking at overall performance by totaling the 27 protectees' stats together, it looks like OPS increased by about 3.5%, (BB+K)/PA increased by about 6.4%, and TB/H increased by about 2%. The data can be seen here.
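Both comparisons, the player-by-player tally and the pooled percentage change, can be sketched as follows. The function names and dictionary layout are hypothetical, PA is assumed to be AB + BB (consistent with the simplified OBP), every stat line is assumed to have at least one at bat and one hit, and the sample numbers are made up rather than taken from the study.

```python
STAT_KEYS = ("ab", "h", "bb", "tb", "k")


def rates(t):
    """OPS, (BB+K)/PA and TB/H from raw totals; assumes PA = AB + BB."""
    pa = t["ab"] + t["bb"]
    obp = (t["h"] + t["bb"]) / pa
    slg = t["tb"] / t["ab"]
    return {
        "ops": obp + slg,
        "bbk_pa": (t["bb"] + t["k"]) / pa,
        "tb_h": t["tb"] / t["h"],
    }


def tally(splits, metric):
    """Count players whose protected rate is higher, lower, or the same."""
    higher = lower = same = 0
    for protected, unprotected in splits:
        p, u = rates(protected)[metric], rates(unprotected)[metric]
        if p > u:
            higher += 1
        elif p < u:
            lower += 1
        else:
            same += 1
    return higher, lower, same


def pool(lines):
    """Sum the raw components across players before computing any rates."""
    return {key: sum(line[key] for line in lines) for key in STAT_KEYS}


def pooled_pct_change(splits):
    """Percent change, protected vs. unprotected, on the pooled totals."""
    p = rates(pool([protected for protected, _ in splits]))
    u = rates(pool([unprotected for _, unprotected in splits]))
    return {m: 100.0 * (p[m] - u[m]) / u[m] for m in p}


# Made-up splits for two hypothetical players: (protected, unprotected).
splits = [
    ({"ab": 100, "h": 30, "bb": 10, "tb": 50, "k": 20},
     {"ab": 100, "h": 25, "bb": 10, "tb": 40, "k": 20}),
    ({"ab": 50, "h": 10, "bb": 5, "tb": 15, "k": 10},
     {"ab": 50, "h": 15, "bb": 5, "tb": 25, "k": 5}),
]
print(tally(splits, "ops"))
print(pooled_pct_change(splits))
```

Pooling raw components weights players by playing time, which is one reason the pooled change and the player-by-player tally can point in different directions, as they do for OPS above.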
Some of the potential limitations of this study include the arbitrary selection of definitions, which could leave out potential protectors or protectees. This includes players who might have a reputation without the performance (pitchers don't look at SLG), or players who have the production but not the required ABs. Another method could have been to look at career SLG instead of just the 2002 season. There is also the potential for noise in the data from certain batters hitting in certain lineup positions. If a batter were more comfortable leading off than hitting 3rd but happened not to be protected when leading off, how much would this affect his production? A third issue is a protectee's performance relative to his protectors. In other words, can anyone really protect Bonds v2002?