Defensive Analysis - Continuous Improvement
Retrosheet has become even more valuable. Who would have thought that was possible? Its data is also arriving faster, and Baseball Reference is reaching new heights in both capabilities and update speed. Coupled with Chone Smith’s skill set, that has let me take a closer look at last year’s defensive ratings.
At the beginning of this past baseball season, Chone took a look at OCab’s defensive plays, and pondered the results. Why didn’t OCab’s chances match up with what his Zone Rating (ZR), my personal favorite defensive system, claimed he made? I have studied ZR for a long time, and I was certain that it had been defined as “ground balls into a player’s zone converted into outs, as a fraction”. Chone’s analysis demonstrated that this is not true: Line Drives (LD) caught appeared to be included. That didn’t make sense to me, because I had asked STATS this question specifically before and was told that LDs were not included. Really, though, the math was not making sense either way.
So I asked the inventor, John Dewan. John truthfully answered, “I’m not sure.” Fair enough: the designer doesn’t always have full control over the actual inputs. John now does (and designed) the work at BIS, and brilliant work it is. His new system uses slightly different zones than ZR, and properly separates “Balls In Zone” (BIZ) from “Out Of Zone” plays made (OOZ). ZR’s failure to make that separation has long been recognized as a flaw, causing players who live at the fringes of their assigned zones to be overrated or underrated.
At any rate, a long discussion ensued regarding the number of plays an infielder made, and whether or not LDs were included. Steve Treder interjected, “Including line drives won’t likely have a big effect, but it would seem to have a net useful effect. Why not include them, even if it only helps a little?” His point was largely ignored, as we continued to argue semantics and minor points. This discussion was the first and only known occurrence here at BTF.
Long story short (I know, too late), Retrosheet has the 2007 data, and I kindly asked Chone (and Mike Emeigh) to pull that data (because I suck at that). Then I plugged the results into my standard DPI spreadsheets to calculate Defensive Runs Saved (DRS) by ZR, and then by ZR with LDs removed.
Chone is working on additional defensive analyses that are going to be awesome. What I am going to work on is making sure there is a good and practical way for you, the Consumer, to accurately calculate defensive value for yourself. I think that’s important. Chone is going to do much fancier work, and he’s doing work to simplify the explanations, and I think it is going to shine the light on how defense can work. But enough about how awesome he is.
So back to the discoveries from this play-by-play (pbp) data. I calculated DRS+ (adjusted to a standard baseline) for each player as I always have, in the same manner anyone with access to ZR can, whether from ESPN or MLB or whathaveyou. Then I took every player’s individual plays on the infield and separated the ground balls (GB), the line drives (LD) and the pop ups (PU). It was pretty easy to then use the existing ZR to back-calculate how many ZR chances each player had excluding line drives, and calculate an LD-less ZR. From this new ZR, I calculated a second DRS+. It really isn’t too fancy. After doing that, I could simply subtract my LD-adjusted ZR DRS+ from the STATS ZR DRS+ to get the difference that including line drives makes.
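For anyone who wants to try the back-calculation at home, here is a minimal sketch in Python. It is not my spreadsheet. It assumes that every caught line drive was counted by ZR as both one chance and one play made, which is my reading of the data rather than anything STATS has confirmed. The function name and its parameters are made up for illustration.

```python
def ld_less_zr(zr, plays_made, ld_caught):
    """Back-calculate a Zone Rating with caught line drives removed.

    ASSUMPTION: each caught line drive was counted by ZR as one chance
    and one play made, so it is subtracted from both totals.
    """
    if not 0 < zr <= 1:
        raise ValueError("zr must be a fraction in (0, 1]")
    if not 0 <= ld_caught <= plays_made:
        raise ValueError("ld_caught must be between 0 and plays_made")

    # ZR = plays made / chances, so the chances follow from the published ZR.
    chances = plays_made / zr

    # Remove the caught line drives from both sides of the fraction.
    plays_no_ld = plays_made - ld_caught
    chances_no_ld = chances - ld_caught
    if chances_no_ld <= 0:
        raise ValueError("no chances left after removing line drives")

    return plays_no_ld / chances_no_ld


# Made-up example, not a real player: a .800 ZR on 400 plays made,
# 20 of which were caught line drives.
example = ld_less_zr(0.800, 400, 20)
```

The LD-less ZR then goes through whatever DRS+ calculation you already use for regular ZR, and the difference between the two DRS+ figures is the effect of the line drives.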
The effect of including LDs is negligible, in my opinion, effectively validating the original work above. It is essentially like choosing between Marcel and a far more elaborate projection system that improves the result by a single run or less. The maximum DRS+ change listed below is an absolute value: a player could have had his DRS+ affected negatively or positively by his LDs caught as compared to average. The average DRS+ change listed below is for the starters, meaning those playing 700 innings or so. If I include all defenders at a position, the average drops to nearly nothing.
Lge  Pos  Max  Avg
AL   1B   1.2  0.3
AL   2B   1.5  0.4
AL   3B   2.1  0.7
AL   SS   1.1  0.5
NL   1B   1.3  0.6
NL   2B   1.9  0.9
NL   3B   2.1  0.7
NL   SS   1.5  0.8
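If you want to build the same kind of summary from your own numbers, a rough sketch follows. It makes several assumptions: the 700-inning cutoff is treated as a hard line (I only said "or so"), both the Max and the Avg are taken over starters only, and the Avg is an average of absolute changes. The field names and the sample rows are invented for illustration and are not from the 2007 data.

```python
from collections import defaultdict

# ASSUMPTION: "700 innings or so" is treated as a hard cutoff here.
STARTER_INNINGS = 700


def summarize_changes(players, min_innings=STARTER_INNINGS):
    """Return {(league, position): (max_abs_change, avg_abs_change)}.

    Each player is a dict with hypothetical keys: lge, pos, innings,
    drs_with_ld (DRS+ from regular ZR) and drs_without_ld (DRS+ from
    the LD-less ZR).
    """
    groups = defaultdict(list)
    for p in players:
        if p["innings"] >= min_innings:
            change = abs(p["drs_with_ld"] - p["drs_without_ld"])
            groups[(p["lge"], p["pos"])].append(change)
    return {
        key: (max(changes), sum(changes) / len(changes))
        for key, changes in groups.items()
    }


# Made-up rows, not real players.
sample = [
    {"lge": "AL", "pos": "SS", "innings": 1200, "drs_with_ld": 5.0, "drs_without_ld": 4.0},
    {"lge": "AL", "pos": "SS", "innings": 900, "drs_with_ld": -3.0, "drs_without_ld": -2.5},
    {"lge": "AL", "pos": "SS", "innings": 200, "drs_with_ld": 2.0, "drs_without_ld": 0.0},
]
summary = summarize_changes(sample)
```

In the made-up sample, the 200-inning player is left out of both columns, which is why his two-run swing does not show up in the result.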
In general, players with poor DRS+ have their numbers improved by removing the LDs, and the better fielders have their DRS+ decreased by removing the LDs. That makes sense: good defenders are slightly better at snaring LDs, and removing the LDs also lowers the league average, which leaves the worse players closer to average. They “regress to the mean”, as it were.
The conclusion is that STATS ZR, while including LDs, does not unfairly malign players, or wrongly represent their defensive contributions. As Treder said well: why not include them? They are plays made, and the small spread in run value indicates that fielders are very close to one another in LD opportunity and conversion. Yes, I personally think LDs shouldn’t be included, but let’s not throw the baby out with the bath water. You have a good working method for estimating a player’s defensive value, one you can calculate all season, at any time, by yourself. Hopefully this extra information allows you to be more confident in that data.
Posted: December 31, 2007 at 03:11 AM