Reconciliation - Getting Defensive Stats and Statheads Back Together
There is a lot of skepticism around defensive statistics. Wrongly so, in my opinion, but I am probably a little biased. Yes, they aren’t perfect, but there is no reason to throw the baby out with the bathwater. The idea behind the stats is solid. There are some differing opinions around the treatment of the data, but otherwise the defensive stats community does believe we are in the right general area.
Except for a few people, and one of them is Colin Wyers. Colin, in case you didn’t know, is a sharp cookie, and he does tremendous statistical work testing other people’s stats and theories, in addition to his own developments. He is a terrific critical thinker. He is presently employed at Baseball Prospectus as some sort of stats guru, and has posted several articles questioning the foundation of defensive stats, from the height of pressboxes to the inconsistencies between UZR and +/-, even though both come from the same data source. Let me be perfectly clear: I think Colin is smart, inquisitive and open-minded. I also think he is overly skeptical about the quality of defensive stats.
There is another Primate, LAW or BWV 1129, who has some serious questions about reconciling defensive statistics against DER (comment #14 in the Tango link below), or at least Runs Allowed. He suggests that there should be a pitching stat that complements the fielding stats and those should, at the team level, match runs allowed (in a manner similar to Runs Created matching runs scored), possibly FIP or some other fielding independent stat.
We’ve had an exciting discussion over at Tango’s The Book Blog, with Brian Cartwright, terpsfan, AROM, Tango and MGL, and others. Here at BTF, we had the same types of comments in the latest Colin Wyers BPro thread. Primate Harold posits some interesting thoughts about why the stats won’t reconcile (UZR + FIP =/= RA) but I claim the Dial DRS plus something like FIP might.
SIDEBAR: I just said Dial DRS. Dial DRS? Isn’t that DRS? Well, apparently not. BIS co-opted it, and John Dewan and Bill James outrank me, so once DRS gets posted at ESPN or Fangraphs (who I thought would have known better), and it isn’t the DRS I have been creating and referencing, and being referenced around the web for the last dozen years, then well, I lost the acronym DRS. Up mine. So, Dial DRS or DDRS is where we are today. FWIW, BIS says it was accidental, and I believe them. It doesn’t really excuse Fangraphs, who completely knew about my work, but they are just posting what they are given. Bitter? No, because that’s just the tip of the iceberg. /SIDEBAR
I have been trying to reconcile DDRS and DIPS since DIPS came out and Ron Johnson suggested that it should be done to verify that the numbers have a solid foundation. I struggled and couldn’t do it, but I was using ERA of HBIP and a few other “Chris isn’t that smart” stats, and I kept being very far off in outs.
This offseason, Tango made a post about prediction contests. I wanted to generate a set of projections for pitchers that was more accurate than others, since that is an area where all projection systems struggle. So I thought, what if I take pitchers and use *their* HBIP, but their HBIP adjusted for their hits allowed in defensive zones. I thought that I needed to look at each pitcher’s Hits on Balls In Play Not In Zone: HBIPNIZ. Because balls in zone are based on their defensive players, and if one guy has Nate McLouth and another has Carlos Beltran, then one guy looks better than the other, even on the same set of BIP allowed. So I worked backwards from team ZR.
Please bear in mind that I am an open source researcher. I do not think what I am about to post is completely right and I may have made some incorrect assumptions or calc errors. I will argue that I did not, but I am well aware I may have. I EXPECT this community to take what I put forth here and expand and improve on it, either with better data or improved thought processes. I am here to provide the first building block. Who knows – maybe we only need one.
Here are the steps as I have begun:
1. Take the team’s ZR chances and Plays Made. This represents the responsibility of the fielders, NOT THE PITCHER. That is neither a perfect nor a completely correct assumption, but we’ll talk about that later.
2. That gives me the number of Hits on Balls In Zone (HBIZ). Interestingly, for 2009, the NL average was 477, with a range of 398 (SFG) to 583 (HOU).
3. With HBIZ, I take HBIP and subtract HBIZ, yielding HBIPNIZ. Now we are talking *pitching*. The pitching staff that allowed the fewest HBIPNIZ was LAD (710), and the one that allowed the most was PHI (861). That’s interesting right there. For the curious, SFG was third with 730 HBIPNIZ allowed, but incredibly, HOU had the fifth-lowest HBIPNIZ. Talk about victimized by poor defense.
4. Now I have HBIP and HBIPNIZ, and thus I create a ratio for each team. The team with the lowest ratio? Houston. Perhaps there is some park factor looming here. The highest is the Phillies. The average here is 0.623. So 62% of hits are on balls not in zones. That makes sense, since line drives make up most hits and most line drives aren’t in anyone’s zone.
5. So now on to the individual pitchers. So I take each pitcher’s line and I calculate their PAR, or “Pitcher Allowed Runs”. Dammit, I *have* to be the first with that!!! PAR is calculated by:
For the league, using linear weights the average value of a HBIP is 0.56 runs (weighted average of singles, doubles and triples).
Notice how my defensive stats mesh completely with DIPS.
6. Next is FAR, or Fielding Allowed Runs, for this specific pitcher. That is calculated (although advanced versions of this statistic would take actual BIP behind this specific pitcher, which I do not have) by: =(1-HBIPNIZ)*0.56*(H-HR). Again, 0.56 represents the average value of an HBIP across the league.
7. The first test is PAR + FAR = TAR (Total Allowed Runs) correlation to actual RA. For the league, summed by each individual pitcher, r^2 is 0.972. That’s pretty strong, but perhaps I haven’t done anything that would make it not be strong. But it is strong.
8. The differences. Ricky Nolasco seems to have allowed about 20 more runs than this metric would predict. The largest values of RA - TAR:
Ricky Nolasco 20.51
Juan Rincon 11.45
Chad Gaudin 11.29
Craig Stammen 10.52
Hiroki Kuroda 9.94
That got low in a hurry. Nolasco is a clear “What?” miss here.
Going the other direction:
J.A. Happ -24.01
Adam Wainwright -22.20
Doug Davis -19.37
Joe Blanton -18.94
Matt Cain -18.43
JA Happ? Seems like he was due for a collapse, which agrees with ZiPS anyway. Likewise, Adam Wainwright is crashing and burning. Wait, what? Well, I don’t know how predictive it is – it’s meant to be descriptive.
9. To calculate PAR+ (park and playing time adjustments), I calculated each pitcher’s PAR/IP and the league PAR/IP, and then put in the Baseball-Reference Pitching Park Factors, and divided by 100. The league leaders were:
Name             Age  Tm   IP     PAR+
Tim Lincecum      25  SFG  225.3  42.75
Chris Carpenter   34  STL  192.6  35.55
Javier Vazquez    32  ATL  219.3  35.06
Danny Haren       28  ARI  229.3  32.84
Josh Johnson      25  FLA  209.0  29.81
Crazy set of leaders, huh? After Johnson there is a 20% drop to the next tier.
10. Now I have team PAR, and Team DRS and Team Runs Allowed. If I sum the defensive prevention numbers, do I match the difference between a team’s runs allowed and league average? Let’s see:
Tm   Sum of R   Sum of PAR by IP   DRS   Run vs Avg    TRS
BAL       876                104   -24          105    127
BOS       736                -24   -54          -35     30
CHW       732                -52    -8          -39    -44
CLE       865                 49   -31           94     79
DET       745                 52    71          -26    -19
KCR       842                 -4   -46           71     42
LAA       761                 41    25          -10     16
MIN       765                 11     6           -6      5
NYY       753                -42   -11          -18    -31
OAK       761                -27    16          -10    -43
SEA       692                -97     6          -79   -103
TBR       754                -26    10          -17    -37
TEX       740                -18    11          -31    -29
TOR       771                 34    28            0      6
Avg (Sum of R): 771
Correl: 0.891
This correlation is 0.89. Goodness, that worked out well. Of course, the NL wasn’t quite as good, coming in at r = 0.74. I will have to expand this to see if 2009 was fluky good or bad, and check a few more seasons to see how it works out. I think I have 2007-2008 lined up, but as you can see this isn’t a small amount of work.
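As a rough check on the AL table above, here is a minimal Python sketch that uses only the rounded figures from the table. It recomputes each team's runs versus the league average of 771 and correlates that column with TRS. One thing to treat as my inference and not something stated in the text: the TRS column appears to equal PAR minus DRS, to within a run of rounding. The `pearson_r` helper and all variable names are hypothetical, chosen for illustration.

```python
from math import sqrt

# (team, runs allowed, PAR, DRS, TRS), copied from the rounded AL table
AL_TEAMS = [
    ("BAL", 876, 104, -24, 127), ("BOS", 736, -24, -54, 30),
    ("CHW", 732, -52, -8, -44), ("CLE", 865, 49, -31, 79),
    ("DET", 745, 52, 71, -19), ("KCR", 842, -4, -46, 42),
    ("LAA", 761, 41, 25, 16), ("MIN", 765, 11, 6, 5),
    ("NYY", 753, -42, -11, -31), ("OAK", 761, -27, 16, -43),
    ("SEA", 692, -97, 6, -103), ("TBR", 754, -26, 10, -37),
    ("TEX", 740, -18, 11, -29), ("TOR", 771, 34, 28, 6),
]
LEAGUE_AVG_R = 771  # league average runs allowed, from the table


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Runs allowed relative to league average (the "Run vs Avg" column)
runs_vs_avg = [runs - LEAGUE_AVG_R for _, runs, _, _, _ in AL_TEAMS]
trs = [t for _, _, _, _, t in AL_TEAMS]

# Correlation between actual runs vs average and the TRS column
r = pearson_r(runs_vs_avg, trs)

# Assumption being checked: TRS is PAR minus DRS, up to rounding
max_gap = max(abs((par - drs) - t) for _, _, par, drs, t in AL_TEAMS)

print(f"r = {r:.3f}, largest |PAR - DRS - TRS| = {max_gap}")
```

Swapping in the NL rows, or other seasons, only requires replacing the `AL_TEAMS` list and the league average.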
This is where you come in: what are the next steps? Lots of these match up very well, but there is some factor I am missing. I think it *could be* infield fly balls (IF FBs). Not every team will see enough to reconcile all of these differences, but I believe it will tighten up the numbers. Oddly, I couldn’t find just a straight count of popups by team.
Back to the pitcher being responsible part. A key theory for DIPS and FIP is that the pitcher isn’t responsible for HBIP (much). Therefore, all plays fielded are the responsibility of the fielders, and the pitcher gets zero credit. Or, as I noted last year, the pitcher is responsible for about 70% of plays made. That’s the bare minimum anyone playing gets to make. I referenced it the other day - scroll to post #30 for the dirty details.
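To make the split of responsibility concrete, here is a minimal Python sketch of steps 1 through 4 and step 6. It rests on several assumptions of mine: that HBIZ is simply ZR chances minus plays made, that the "HBIPNIZ" term in the FAR formula is the team ratio from step 4, and that the function names are hypothetical. The last function is only the complement of FAR for hits on balls in play; it is NOT the full PAR, whose formula is not given in the text. The pitcher line (200 H, 20 HR) is made up for illustration; the SFG figures are the ones quoted above.

```python
HBIP_RUN_VALUE = 0.56  # league-average runs per hit on a ball in play (from the text)


def hits_in_zone(zr_chances, plays_made):
    """Steps 1-2: in-zone chances the fielders did not convert (assumed to be hits)."""
    return zr_chances - plays_made


def hbipniz_ratio(hbip, hbiz):
    """Steps 3-4: share of a team's hits on balls in play that fell outside any zone."""
    return (hbip - hbiz) / hbip


def fielding_allowed_runs(hits, home_runs, team_ratio):
    """Step 6: FAR = (1 - ratio) * 0.56 * (H - HR), with ratio assumed to be the team ratio."""
    return (1.0 - team_ratio) * HBIP_RUN_VALUE * (hits - home_runs)


def out_of_zone_hit_runs(hits, home_runs, team_ratio):
    """Complement of FAR for hits on balls in play. Illustrative only, not the full PAR."""
    return team_ratio * HBIP_RUN_VALUE * (hits - home_runs)


# SFG figures quoted in the text: 398 HBIZ and 730 HBIPNIZ, so HBIP = 1128.
sfg_hbiz = 398
sfg_hbip = sfg_hbiz + 730
sfg_ratio = hbipniz_ratio(sfg_hbip, sfg_hbiz)

# Hypothetical pitcher line, made up for illustration: 200 H, 20 HR.
far = fielding_allowed_runs(200, 20, sfg_ratio)
pitcher_side = out_of_zone_hit_runs(200, 20, sfg_ratio)

print(round(sfg_ratio, 3), round(far, 1), round(pitcher_side, 1))
```

By construction the two shares add back up to 0.56 runs for every hit on a ball in play, so nothing is double counted between the pitcher and his fielders.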
Posted: July 22, 2010 at 02:26 PM | 25 comment(s)