User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Dialed In
Thursday, July 22, 2010
Reconciliation - Getting Defensive Stats and Statheads Back Together
There is lots of skepticism around defensive statistics. Wrongly so, in my opinion, but I am probably a little biased. Yes, they aren’t perfect, but there is no reason to throw the baby out with the bathwater. The idea behind the stats is solid. There are some differing opinions around the treatment of the data, but otherwise, the defensive stats community does believe we are in the right general area. Except for a few people, and one of them is Colin Wyers.
Colin Wyers, in case you didn’t know, is a sharp cookie and he does tremendous statistical work around testing other people’s stats and theories, in addition to his own developments. He is a terrific critical thinker. He is presently employed at Baseball Prospectus as some sort of stats guru, and has posted several articles questioning the foundation of defensive stats, from the height of pressboxes to the inconsistencies between UZR and +/-, despite their coming from the same data source. Let me be perfectly clear, I think Colin is smart, inquisitive and open-minded. I also think he is overly skeptical about the quality of defensive stats.
There is another Primate, LAW or BWV 1129, who has some serious questions about reconciling defensive statistics against DER (comment #14 in the Tango link below), or at least Runs Allowed. He suggests that there should be a pitching stat that complements the fielding stats and that those should, at the team level, match runs allowed (in a manner similar to Runs Created matching runs scored), possibly FIP or some other fielding independent stat.
We’ve had an exciting discussion over at Tango’s The Book Blog, with Brian Cartwright, terpsfan, AROM, Tango and MGL, and others. Here at BTF, we had the same types of comments in the latest Colin Wyers BPro thread.
Primate Harold posits some interesting thoughts about why the stats won’t reconcile (UZR + FIP =/= RA) but I claim the Dial DRS plus something like FIP might.
SIDEBAR: I just said Dial DRS. Dial DRS? Isn’t that DRS? Well, apparently not. BIS co-opted it, and John Dewan and Bill James outrank me, so once DRS gets posted at ESPN or Fangraphs (who I thought would have known better), and it isn’t the DRS I have been creating and referencing, and that has been referenced around the web for the last dozen years, then well, I lost the acronym DRS. Up mine. So, Dial DRS or DDRS is where we are today. FWIW, BIS says it was accidental, and I believe them. It doesn’t really excuse Fangraphs, who completely knew about my work, but they are just posting what they are given. Bitter? No, because that’s just the tip of the iceberg. /SIDEBAR
I have been trying to reconcile DDRS and DIPS since DIPS came out and Ron Johnson suggested that should be done to verify that the numbers have a solid foundation. I struggled and couldn’t do it, but I was using ERA of HBIP. And a few other “Chris isn’t that smart” stats, and I kept being very far off in outs.
This offseason, Tango made a post about prediction contests. I wanted to generate a set of projections for pitchers that was more accurate than others, since that is an area where all projection systems struggle. So I thought, what if I take pitchers and use *their* HBIP, but their HBIP adjusted for their hits allowed in defensive zones? I thought that I needed to look at each pitcher’s Hits on Balls In Play Not In Zone. HBIPNIZ. Because balls in zone is based on their defensive players, and if one guy has Nate McLouth and another has Carlos Beltran, then one guy looks better than the other, even on the same set of BIP allowed. So I worked backwards from team ZR.
Please bear in mind that I am an open source researcher. I do not think what I am about to post is completely right and I may have made some incorrect assumptions or calc errors.
I will argue that I did not, but I am well aware I may have. I EXPECT this community to take what I put forth here and expand and improve on it, either with better data or improved thought processes. I am here to provide the first building block. Who knows – maybe we only need one.
Here are the steps as I have begun:
Notice how my defensive stats mesh completely with DIPS.
6. Next is FAR, or Fielding Allowed Runs, for this specific pitcher. That is calculated (although advanced versions of this statistic would take actual BIP behind this specific pitcher, which I do not have) by:
=(1-HBIPNIZ)*0.56*(H-HR)
Again, 0.56 represents the average value of an HBIP across the league.
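The FAR formula in step 6 translates directly into a small function. This is only a sketch: it assumes HBIPNIZ is expressed as a fraction between 0 and 1 (the share of a pitcher's non-HR hits that landed outside any fielder's zone), and the function and parameter names are made up for illustration.

```python
HBIP_RUN_VALUE = 0.56  # league-average run value of a hit on a ball in play (from the article)


def fielding_allowed_runs(hits, home_runs, hbipniz, hbip_value=HBIP_RUN_VALUE):
    """FAR = (1 - HBIPNIZ) * 0.56 * (H - HR), as given in step 6.

    hbipniz is ASSUMED to be a fraction in [0, 1]; the article does not
    spell out its units.
    """
    if not 0.0 <= hbipniz <= 1.0:
        raise ValueError("hbipniz is assumed to be a fraction between 0 and 1")
    hits_in_play = hits - home_runs
    return (1.0 - hbipniz) * hbip_value * hits_in_play


# Hypothetical pitcher line, not real data: 200 H, 20 HR, HBIPNIZ of 0.60.
far = fielding_allowed_runs(hits=200, home_runs=20, hbipniz=0.60)
print(round(far, 2))
```

With these made-up inputs, 40% of the 180 hits in play are charged to the fielders at 0.56 runs apiece.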
[Table: Pitcher, DIFF]
JA Happ? Seems like he was due for a collapse, which agrees with ZiPS anyway. Likewise, Adam Wainwright is crashing and burning. Wait, what? Well, I don’t know how predictive it is – it’s meant to be descriptive.
9. To calculate PAR+ (park and playing time adjustments), I calculated each pitcher’s PAR/IP and the league PAR/IP, and then put in the Baseball-Reference Pitching Park Factors, and divided by 100. The league leaders were:
[Table: Name, Age, Tm, IP, PAR+]
Crazy set of leaders, huh? After Johnson there is a 20% drop to the next tier.
[Table: Tm, Sum of R, Sum of PAR by ip, DRS, Run vs Avg, TRS]
This correlation is 0.89. Goodness, that worked out well. Of course, the NL wasn’t quite as good, coming in at r = 0.74. I will have to expand this to see if 2009 was fluky good or bad, and check a few more seasons to see how it works out. I think I have 2007-2008 lined up, but as you can see this isn’t a small amount of work.
This is where you come in – what are the next Next Steps? Lots of these match up very well, but there is some factor I am missing. I think it *could be* IF FBs. Not every team will see enough to reconcile all of these differences, but I believe it will tighten up the numbers. Oddly, I couldn’t find just a straight count of popups by team.
Back to the pitcher being responsible part. A key theory for DIPS and FIP is that the pitcher isn’t responsible for HBIP (much). Therefore, all plays fielded are the responsibility of the fielders, and the pitcher gets zero credit. Or, as I noted last year, the pitcher is responsible for about 70% of plays made. That’s the bare minimum anyone playing gets to make. I referenced it the other day - scroll to post #30 for the dirty details.
And…..go!
About Baseball Think Factory | Write for Us | Copyright © 1996-2021 Baseball Think Factory
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. Chris Dial Posted: July 22, 2010 at 05:29 PM (#3596358)
The front page is completely blank for me. Let's bump this again.
Oh and you really ought to sue. This is the umpteenth time you've had to change the name of your work. In the meantime, can I suggest DDRS?
EDIT: And now that I check the linky I see you're already using DDRS.
The other obvious thing to try is HBIPNIZ/groundball and HBIPNIZ/flyball with their distinct weights for 1B/2B/3B.
yes, but I think HBIPNIZ are going to be mostly line drives. At BBRef, you can see that LDs are the most hits by a huge margin.
We'll see if those other options move the needle.
The first test is PAR + FAR = TAR (Total Allowed Runs) correlation to actual RA. For the league, summed by each individual pitcher, r^2 is 0.972. That’s pretty strong, but perhaps I haven’t done anything that would make it not be strong. But it is strong.
If you're looking at raw numbers rather than rates, then this correlation is meaningless. Pitchers with lots of innings will have high figures for both TAR and RA figures; that's what you're measuring with correlation. Anyway, one more reason to stick to teams for the moment.
For the teams, how about RMSE rather than correlation? Or a scatter-plot? The picture is much more illuminating than a summarizing number or two.
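A rough sketch of the two checks suggested here: converting raw totals to per-nine-inning rates so that playing time does not drive the comparison, and scoring team estimates with RMSE. The function names are hypothetical, innings are assumed to be decimal (66.333, not the 66.1 box-score notation), and the numbers in the example are invented for illustration.

```python
from math import sqrt


def per_nine(total_runs, innings):
    """Convert a raw run total to a per-nine-innings rate.

    innings is ASSUMED to be a decimal value, not box-score notation.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * total_runs / innings


def rmse(estimated, actual):
    """Root mean squared error between paired estimates and actual values."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Invented team totals, for illustration only: estimated vs. actual runs allowed.
estimated_ra = [700, 650]
actual_ra = [710, 640]
print(rmse(estimated_ra, actual_ra))
```

RMSE is in the same units as the inputs, so on team totals it reads directly as "runs off per team", which makes it easy to set beside the errors of offensive run estimators.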
Is your spreadsheet available somewhere? I'm not following your steps well by reading; I think seeing the formulas and numbers will help (I can re-create it myself a bit later).
Are you suggesting that all HBIZ are the defense's responsibility? Or maybe you don't think that's literally true; is it your intent to treat them that way in this metric?
I agree that this isn't a perfect assumption; in fact, for each ball in a zone, we could say (for descriptive, not predictive purposes, which is what I believe we're doing here) that the pitcher is responsible for the average rate at which that ball is turned into an out, and the fielder is responsible for the balance between that and what actually occurred (positive or negative).
Here's a zone where the out rate is 50%. The ball goes there. The pitcher gets debited for .5 of a hit. The ball is a hit: the fielder gets debited .5, and the sum of the pitcher+fielder is 1 hit allowed. The fielder makes the out: the fielder gets credited for -.5 of a hit, and the sum of the pitcher+fielder is 0 hits allowed.
Every ball is in a zone somewhere, right? Even a ball that is an out .000001 of the time (not that I would imagine there are any zones of that nature). So you can do this for every single batted ball in play.
Now, if you do this for every batted ball in play, you should have a perfect record of each non-HR hit -- there shouldn't be any hit not included.
Now, the trick is to convert these to runs. Each zone would likely have a different run expectation. That 50% zone I mentioned above, let's say that every hit in that zone is a single, and a single is .47 runs. So the run value for a ball in that zone is .235. The ball goes there. The pitcher gets debited for .235 runs. The ball is a hit: the fielder gets debited .235 runs, and the sum of the pitcher+fielder is .47 runs allowed. The fielder makes the out: the fielder gets credited for -.235 runs, and the sum of the pitcher+fielder is 0 runs allowed.
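The bookkeeping described in the last few paragraphs can be sketched as follows. The function name and signature are made up for illustration, and the generalization to zones other than the 50% case (charging the pitcher the zone's hit probability times the hit's run value) is an assumption that keeps pitcher plus fielder equal to what actually happened.

```python
def split_credit(out_rate, hit_run_value, was_hit):
    """Split one ball in play between pitcher and fielder, in runs.

    out_rate      -- share of balls to this zone that are turned into outs
    hit_run_value -- run value of a hit in this zone (e.g. 0.47 for a single)
    was_hit       -- True if the ball fell for a hit, False if an out was made

    Returns (pitcher_runs, fielder_runs); their sum is the runs actually
    charged on the play (hit_run_value on a hit, 0 on an out).
    """
    if not 0.0 <= out_rate <= 1.0:
        raise ValueError("out_rate must be between 0 and 1")
    # Pitcher is charged the expected cost of putting a ball in this zone.
    pitcher_runs = (1.0 - out_rate) * hit_run_value
    # Fielder gets the difference between what happened and what was expected.
    actual_runs = hit_run_value if was_hit else 0.0
    fielder_runs = actual_runs - pitcher_runs
    return pitcher_runs, fielder_runs


# The 50% zone with a .47-run single, as in the comment above.
print(split_credit(0.5, 0.47, was_hit=True))   # hit: pitcher and fielder each debited
print(split_credit(0.5, 0.47, was_hit=False))  # out: fielder credit cancels the pitcher debit
```

Summing the pitcher column over every ball in play gives a pitcher-side estimate, and summing the fielder column gives the fielding runs above or below average.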
Do that for every batted ball. Add in run values for non-BIP events (1.4 HR, .33 BB/HBP, etc.). You're not going to have a perfect record of each run, because this won't take into account timing (i.e., clutch). It should be somewhat close, though. How close? What's the RMSE? 15 runs? 20 runs? 50 runs? Are we close to the run estimators we use for offense? If we're way off, how can we get closer?
Now, here I'm going to get a bit controversial. MGL believes that UZR is a predictive, not descriptive stat. I.e., it's describing "true talent" more than "what really happened" (though obviously the latter is part of the former).
I'm not so sure.
I suspect that the adjustments he makes to the raw data, if properly made, actually go toward describing what really happened.
I mean, here are some of the adjustments he makes:
- "A bunt ground ball is treated as a separate kind of a batted ball than a non-bunt ground ball, but only for the first, second, and third baseman."
- "The base runner and outs adjustments are a proxy for infield defensive alignment."
- "Left-handed and right-handed batters are treated separately since infielders and outfielders are positioned differently for each."
- "For outfield air balls, two separate categories of batters are used as a proxy for outfielder depth: Batters with less than average power and batters with greater than average power."
There's a bunch more. Park adjustments, baselines. In my mind, what MGL is assuming is that positioning affects the out conversion rate for each zone. This is obviously true, so in lieu of positioning data (which we lack), he comes up with all these proxies that he thinks adjusts for them.
Let's remove, for the moment, the consideration of whether or not MGL's adjustments make the proper corrections. Let's stipulate that they do. If they do, it seems to me that UZR is describing actual events -- the expected out conversion rate for ball X to zone Y was lower than normal because of circumstance Q. That's something real and, theoretically, verifiable (though we don't have the data; we may have data that says "this ball is an out 50% of the time with the bases empty, but with a man on 1st and less than 2 outs, it's an out 40% of the time," and maybe that's how MGL derived his adjustments, we don't know).
I think that if MGL ran this -- and I'll have to check the PZR links Tango put up at The Book to see how much he may have already -- we should see a total of PZR+UZR that should come somewhat close to the correct number of runs allowed on the team level. (For these purposes, I'm including catcher defense and SB/CS in "UZR".)
=((BB-IBB)*0.34+HBP*0.25+IBB*0.31+(H-HR)*HBIPNIZ*0.56+HR*1.44)-((BIP Outs)*0.09+K*0.098)
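For readers who find the spreadsheet notation hard to parse, here is the same PAR formula as a Python function. The coefficients are copied from the formula above; the function name, parameter names, and the sample stat line are hypothetical, and HBIPNIZ is assumed to be a fraction between 0 and 1.

```python
def pitcher_allowed_runs(bb, ibb, hbp, hits, hr, hbipniz, bip_outs, k):
    """PAR as quoted above: runs charged to the pitcher minus credit for outs."""
    charged = (
        (bb - ibb) * 0.34                # unintentional walks
        + hbp * 0.25                     # hit batters
        + ibb * 0.31                     # intentional walks
        + (hits - hr) * hbipniz * 0.56   # non-HR hits that fell outside fielders' zones
        + hr * 1.44                      # home runs
    )
    credited = bip_outs * 0.09 + k * 0.098  # outs on balls in play and strikeouts
    return charged - credited


# Hypothetical stat line, not a real pitcher.
par = pitcher_allowed_runs(bb=60, ibb=5, hbp=5, hits=200, hr=20,
                           hbipniz=0.60, bip_outs=400, k=150)
print(round(par, 2))
```

Note that the hit term here and the FAR formula in the article, (1-HBIPNIZ)*0.56*(H-HR), add up to 0.56*(H-HR): every non-HR hit is charged once, split between pitcher and fielders by HBIPNIZ.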
I can't really contribute to the meat of this discussion, but why do BB, IBB, and HBP all have different coefficients? Isn't the on-field result of any of the three of them exactly the same?
Are you saying it's an effect that if a pitcher walks a guy he's not trying to that he is then more likely to walk someone else?
IBB almost never occur with a runner on first. They're also much more likely to occur with 2 outs.
HBP vs. UIBB, I'm not sure - I've usually treated those as identical in my own sabermetric experiments.
I'm somewhat surprised by the values used in PAR though. I remember I came up with different correlation coefficients for all 3 events but I was pretty sure it was non-IBB > HBP > IBB.
Could be possible, I'm not too far away, but it depends on how work is going.
it may *not*. If, as zenbitz says, there's really a constant, like FIP, and I use a team specific one, it may not be needed. I mean, that may be a finer adjustment, but it may not be a big enough difference to warrant the extra effort.
Rusch is what happens when the effect is real rather than chance. I don't know how possible it is to suss that out.
Just curious, do you get your BIP data from the same source as baseball-reference?
If so, how are you dealing with the change in classifications that took place starting last year that greatly impacts line drive/FB percentages?
That's a question for you to answer.
I had completely forgotten about this. It was published at a point in my life where I was doing a lot of travelling and family things, so it was the worst possible moment for it to pique my interest.
A week or two ago I made a suggestion in another thread that WAR double-counts the effects of BIP. After a private discussion with A Well-Known Sabermetrician, I came to the conclusion that any WAR system that uses DRS or UZR is giving a false reading of the value of BIP, because both those systems are designed to rate fielders amongst one another exclusively, and not integrate the value of those fielders (and the value of pitchers' BIP) in terms of the actual team or league context. Without a team or league context, such as is provided by Win Shares Above Bench (a sadly neglected system), all your WAR is useless.
More research on something like what Dial presents here is what is really needed. Has anything been done?
no, I get it from STATS. Changing definitions for LD/FB is problematic, and Statcast will put most of that to bed. Alan Nathan rightly says "I need Time of Flight"
Regarding the issue of the reliability of fielding data you reference here, coincidentally I was looking today at the measured fielding opportunities at SS for the NYY. With Jeter retired, I thought it would be interesting to see if there was any change. I used Fangraphs' BIZ, which I think is based on the BIS data used in UZR/DRS. Here are BIZ per 9 innings at SS for the NYY in recent years, along with Jeter’s innings in the field.
Year   BIZ9   Jeter Inn
2011   2.33   1047
2012   2.39   1186
2013   2.65    110
2014   2.09   1138
2015   2.74      0
Of course there are other variables to be looked at (like total BIP), and the 2015 sample is very small. But boy, it sure looks to me like the range of the SS has a huge impact on measured fielding opportunities. With Jeter out with an injury in 2013, fielding opportunities surged. When Jeter returned and was barely mobile in 2014, fielding opportunities plunged. And now, with Jeter retired, lots of balls are once again being hit into the SS zones. (Cross-posted at Tango's blog.)
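To put a rough number on the pattern in that table, one could compute the correlation between Jeter's innings and BIZ per 9. The sketch below uses only the five rows posted above; with five seasons and a partial 2015 this is illustrative at best, and the helper name is made up.

```python
from math import sqrt

# (year, BIZ per 9 innings at SS, Jeter innings), copied from the table above.
seasons = [
    (2011, 2.33, 1047),
    (2012, 2.39, 1186),
    (2013, 2.65, 110),
    (2014, 2.09, 1138),
    (2015, 2.74, 0),
]


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


biz9 = [row[1] for row in seasons]
jeter_innings = [row[2] for row in seasons]
r = pearson_r(jeter_innings, biz9)
print(round(r, 2))  # strongly negative for these five rows
```

A negative value is what the comment's reading predicts (more Jeter innings, fewer measured opportunities), but it does not control for total BIP or any of the other variables mentioned.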