## Friday, November 11, 2005

#### Dr. StrangeGlove or: How I Learned to Stop Worrying and Love Zone Rating

There is a ton of mistrust of defensive methodologies, and of how well they describe defensive play.  MGL’s UZR is recognized at many baseball sites as good enough to cite with considerable confidence in its accuracy.  MGL does a good deal of work to perfect it, and he’s no dummy, so it’s fair for others to want to quote his data.

I’m here to tell you, friends and neighbors, that you, yes, you, can be a defensive runs saved calculator without praying that MGL is able to let you in on how your favorite player performed.  MGL always obliges, but he shouldn’t have to carry the burden.

You will have to do some of the work, so stop watching that baseball game and get your head into a spreadsheet.

```
Here’s what you are going to need:

1.	The player’s name and defensive innings
played at position.

2.	The average number of innings a team
played in a season.  Usually, it is about 1440,
depending on extra inning games, etc.  In 2005,
the average AL team had 1441.0.

3.	Obtain the league average ZR at position.
This can be done by “back calculating” from every
other player’s ZR.  It’s a bit tricky, but you can do it.

4.	All you need now is how many chances each
position has if they play every inning and the average
run value per play at that position.

And those are right..here:

Position	AvgZROps	Runs/play
1B	281	0.798
2B	507	0.754
3B	430	0.800
SS	532	0.753
LF	348	0.831
CF	462	0.842
RF	365	0.843

5.	Mix thoroughly.  Sprinkle lightly with arm
and/or double play ratings.  You read that correctly. You
don’t need assists or putouts or anything.  Yes, that’s
mildly annoying because it makes you think,
“that can’t be right”.  But trust me.  Just trust me.

6.	Example:

Position  AvgIP  AvgOpps  AvgZR  Runs/play
1B        1440   281      0.871  0.798

last    team  INN     ZR
Helton  Col   1230.7  .916

Is that all the data I said you needed to gather, plus what I gave you?

Let’s start calculating!

Average Plays Made (PM) = Avg Chances (281) * Avg ZR (0.8710)
Average Plays Made (PM)  = 244.75

So far, so good.  That is the average plays made you
are going to compare every first baseman to.

Now the run value of that:

Runs Saved at Position = PM (244.75) times Runs/play (0.798)
Runs Saved at Position = 195.31

It is very important to remember that this represents the
Runs Saved (RS) for playing every inning.  I refer to
this as a “cal”, after Cal Ripken, who played every inning.

At first base then, the Avg RScal = 195.3.

For Helton we get:
RScal = Helton’s ZR * 281 AvgZRChances * 0.798 runs/play
RScal = 205.4

RScal+ = RScal above average (simple subtraction)

Helton’s RScal+ = 205.4 – 195.3 = 10.1 RScal+

Converting to Helton’s playing time, RSpt:

RScal+ * Helton’s Innings / League Avg Innings

RSpt = 10.1 * 1230.7 / 1440 = 8.6

Yes, that can be a one-line formula, but I personally
like having “if he played every inning”.

To summarize Helton’s line:

last    team  INN     RScal  RScal+  RSpt  RS/150
Helton  Col   1230.7  205.4  10.1    8.6   9.4
```
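For anyone who would rather script this than build the spreadsheet, here is a minimal Python sketch of the arithmetic above. The function and variable names are my own, not anything from the original spreadsheet, and the league-average ZR still has to be supplied by you.

```python
# Full-season chances and run value per play, copied from the table in step 4.
AVG_ZR_OPPS = {"1B": 281, "2B": 507, "3B": 430, "SS": 532,
               "LF": 348, "CF": 462, "RF": 365}
RUNS_PER_PLAY = {"1B": 0.798, "2B": 0.754, "3B": 0.800, "SS": 0.753,
                 "LF": 0.831, "CF": 0.842, "RF": 0.843}


def runs_saved(position, player_zr, player_innings, league_zr,
               league_innings=1440.0):
    """Return (RScal, RScal+, RSpt) for one player at one position."""
    opps = AVG_ZR_OPPS[position]
    run_value = RUNS_PER_PLAY[position]
    rs_cal = player_zr * opps * run_value        # runs saved if he played every inning
    avg_rs_cal = league_zr * opps * run_value    # same for a league-average fielder
    rs_cal_plus = rs_cal - avg_rs_cal            # runs above average per full season
    rs_pt = rs_cal_plus * player_innings / league_innings  # scaled to actual innings
    return rs_cal, rs_cal_plus, rs_pt


# Helton, 2005: ZR .916 in 1230.7 innings; league-average 1B ZR .871.
helton = runs_saved("1B", 0.916, 1230.7, 0.871)
print([round(value, 1) for value in helton])  # [205.4, 10.1, 8.6]
```

Keeping RScal+ as a separate return value preserves the “if he played every inning” number alongside the playing-time version.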

As described previously, it would be better to also multiply Helton’s actual ZR chances by the league average ZR, convert that to runs, and then subtract it from Helton’s RS.

I don’t have Helton’s actual ZR chances.  I think trying to re-estimate them is folly, as my “281” number is generated from some 56,000 innings and 11,000 chances.

It is very important to understand that ZR chances are like plate appearances.  If a fielder has only played enough innings to make 100 fielding plays, then it is too early to judge how good a fielder he is.  It’s like 100 PAs.  So don’t put too much weight, good or bad, on guys who haven’t played much.  It’s very important for catchers too, because they always play partial seasons.

Okay, for outfielders’ arms, I calculate the average assists per inning from every outfielder and then multiply by the league average innings.  Then I subtract that from the player’s assists with the proper playing time conversion.  Those are straight runs added to a player’s RSpt (or RScal+).
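One way to read that arm adjustment, as a rough Python sketch: compare the player's assists with what a league-average rate would produce in his innings. The function name and the sample numbers are hypothetical, and treating each assist above average as one run follows the "straight runs" wording above.

```python
def arm_runs(player_assists, player_innings, league_assists, league_innings):
    """Assists above what a league-average arm would record in the same innings."""
    league_rate = league_assists / league_innings   # average assists per inning
    expected = league_rate * player_innings         # average arm, this player's time
    return player_assists - expected                # counted as straight runs


# Hypothetical inputs: 12 assists in 1300 innings, against a league total
# of 280 assists in 40000 innings at the position.
print(round(arm_runs(12, 1300, 280, 40000), 1))  # 2.9
```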

I do not park adjust, pitching staff adjust, or groundball/flyball adjust.  I am really skeptical of that still – because I haven’t worked it out for myself.  I’m stubborn that way, but I also have never seen the data broken out in that manner.

Catchers are done as an amalgam of caught stealing per inning above average, stolen bases per inning above average, errors per inning above average and passed balls per inning above average, at an average base advancement of 0.31 runs per.  I have vacillated between a couple of catcher run value calculations.  Someone needs to make an argument on which way I should go.
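Here is one possible reading of that catcher amalgam as a Python sketch. The sign convention (extra caught stealing helps; extra steals, errors and passed balls hurt) is my assumption, the sample catcher and league rates are made up, and this is only one of the candidate calculations mentioned above.

```python
RUNS_PER_EVENT = 0.31  # average base-advancement value quoted in the text


def catcher_runs(counts, league_rates, innings):
    """counts and league_rates use keys 'cs', 'sb', 'e', 'pb' (rates are per inning)."""
    above_avg = {key: counts[key] - league_rates[key] * innings for key in counts}
    # Assumed signs: extra caught stealing helps; extra SB, errors, PB hurt.
    net_events = (above_avg["cs"] - above_avg["sb"]
                  - above_avg["e"] - above_avg["pb"])
    return RUNS_PER_EVENT * net_events


# Hypothetical catcher and league rates, for illustration only.
counts = {"cs": 30, "sb": 60, "e": 5, "pb": 6}
league_rates = {"cs": 0.025, "sb": 0.070, "e": 0.006, "pb": 0.007}
print(round(catcher_runs(counts, league_rates, 1000), 2))  # 5.27
```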

Is this method any good?

I like my method.  I think evaluating defense from a zone perspective is the best way.  I think this method is robust due to the sheer volume of data input.  How do I compare it to anything?  I know UZR is pretty good, so I took MGL’s posted numbers in the Gold Glove articles, and David Gassko’s (DSG) from his article at The Hardball Times.  Mr. Smith (Rally) provided all of us with all his numbers for players with 250 innings.

I have 56 data points from MGL.  I have 100 from DSG.  I used 122 from Rally, limiting myself to the players whose posted numbers you have already seen.  Besides, you’ll see there isn’t much point in adding more between Rally and me.  We agree very tightly.

I made comparisons to each other, all based on 150 defensive games (1350 innings played) excluding outfield arms and double plays.

Correlations:

```
        vs MGL  vs DSG  vs Rally
Dial    0.82    0.60    0.97
Rally   0.80    0.61
DSG     0.61
```

Feel free to square those if you can’t do that in your head.  The complete list of data used can be found here.
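If you would rather not square them in your head, a few lines of Python will do it. The values are copied from the correlation table above; only the dictionary layout is mine.

```python
# Correlations (r) copied from the table above.
correlations = {
    ("Dial", "MGL"): 0.82, ("Dial", "DSG"): 0.60, ("Dial", "Rally"): 0.97,
    ("Rally", "MGL"): 0.80, ("Rally", "DSG"): 0.61,
    ("DSG", "MGL"): 0.61,
}

# r squared is the share of variance two systems have in common.
r_squared = {pair: round(r * r, 2) for pair, r in correlations.items()}
for (first, second), value in r_squared.items():
    print(f"{first} vs {second}: r^2 = {value}")
```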

What about absolute differences?  You may have seen various discussions (see the AL Gold Glove article) regarding the Nick Swisher number.  MGL has Swisher at +37, while Rally and I had him at average and DSG had him very negative (-23).  Somebody, somewhere has something off.

Average Absolute Difference:

```
        vs MGL  vs DSG  vs Rally
Dial     9.9    12.5    3.0
Rally   10.0    12.2
DSG     12.2
```

I’m certain there is a better statistical way to compare these results, but I don’t know what it is.
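For anyone who wants to reproduce both comparisons from their own spreadsheet, here is a minimal Python sketch of the two statistics used above: the correlation and the average absolute difference. The ratings in the example are made up for illustration; they are not the posted numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def mean_abs_diff(xs, ys):
    """Average absolute difference between two systems' ratings."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical runs-per-150 ratings for the same five players under two systems.
system_a = [10.0, -5.0, 0.0, 15.0, -12.0]
system_b = [8.0, -2.0, 3.0, 11.0, -15.0]
print(round(pearson(system_a, system_b), 2))
print(mean_abs_diff(system_a, system_b))  # 3.0
```

The two numbers answer different questions: the correlation says whether the systems rank players alike, while the absolute difference says how far apart they are in runs.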

By position:

```
Correlations with MGL
Pos  Dial  Rally  DSG
2    1.00  0.98   0.98
3    0.88  0.88   -0.28
4    0.86  0.85   0.91
5    0.84  0.83   0.66
6    0.69  0.59   0.83
7    0.99  0.99   0.95
8    0.88  0.84   0.83
9    0.74  0.78   0.14
```

As DSG noted in his article at The Hardball Times, he has issues to resolve at first base and right field.  However, at the middle infield positions, shortstop and second base, DSG’s correlations are noticeably better than those of either ZR method.  It appears “Range” is capturing something that agrees better with UZR.  I have no comment on whether or not that makes it more “correct”.

Rally’s method correlates well everywhere with some question at shortstop.  I don’t know if his double play addition would increase that, but MGL’s doesn’t include double plays.

My rating does very well.  The worst correlation is 0.69, with 0.86 or greater at five of the eight positions.

I am using UZR as a baseline because it is well respected.

If I remove the four worst matches – the players each method disagreed with UZR the most - the correlations increase (duh).  Removing those four moved my correlation to 0.87, Rally’s to 0.84, but most amazingly, DSG’s to 0.83.  That’s a huge jump in DSG’s numbers.  That’s basically saying 7% of these players are problematic when comparing a non-pbp method to a pbp method.  I think that’s really good.  I don’t know if that 7% can be eliminated.
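The trimming step can be sketched in Python as follows. The helper name and the sample ratings are hypothetical (the last player just mimics a Swisher-sized disagreement), and recomputing the correlation on the trimmed lists is left to whatever correlation routine you already use.

```python
def drop_worst_matches(reference, estimate, k):
    """Drop the k players with the largest absolute disagreement."""
    by_gap = sorted(range(len(reference)),
                    key=lambda i: abs(reference[i] - estimate[i]))
    keep = sorted(by_gap[:len(reference) - k])   # restore original player order
    return [reference[i] for i in keep], [estimate[i] for i in keep]


# Hypothetical ratings for five players under UZR and a ZR-based method.
uzr = [5.0, -8.0, 12.0, 0.0, 37.0]
zr_runs = [4.0, -6.0, 10.0, 1.0, 0.0]
trimmed_uzr, trimmed_zr = drop_worst_matches(uzr, zr_runs, 1)
print(trimmed_uzr, trimmed_zr)

# Four players out of MGL's 56 data points is where the 7% comes from.
print(round(4 / 56 * 100))  # 7
```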

I haven’t worked up all of the 2000-2003 data for which there is a bunch of UZR data available.  One of you more industrious fellows can take my methodology, make your own spreadsheet and compare to UZR.  Or you can wait until I get around to it.  Which will happen.  No, really.

Each of you is now armed with the ability to accurately estimate how many runs a defender saves at any moment.  No more “but how good is his glove”?  No more “will MGL post UZR”?  Well, we still want that, but you can feel pretty confident using this methodology that you are right on it.

Plus, when you vote for MVP, or ROY, you can do it with ~~50%~~ 52% more knowledge.

HTH.

In the spirit of open research, the data used to calculate these defensive ratings can be found here.

Chris Dial Posted: November 11, 2005 at 03:19 AM
101. Harold Posted: March 15, 2006 at 12:21 AM (#1898882)
Yeah, it actually looks pretty easy. Let me see if I can take a stab at it tonight.

Argh, it looks like the HTML formatting generated is different for some players than for others, which makes this a big pain in the ass. I'll see if I can keep working on this.
102. Punky Brusstar (orw) Posted: March 15, 2006 at 12:53 AM (#1898901)
Dan Turkenkopf, I found the passage that I was looking for in the 1983 Baseball Abstract (page 124, bottom of first column):

A team in a season has 324 "games" in this process, 162 offensive and 162 defensive. The even balance of offense and defense is, again, not an assumption but a conclusion that you can verify in any number of ways. Of the 162 "defensive games," the pitcher is assumed to be responsible for 109 or two thirds of them. That (emphasis mine) is an assumption or an estimate, and it can't be verified. Of the 53 remaining defensive games, the shortstop is assigned the most, 11, and the firstbaseman is assigned the fewest, 3.

At the end of the book he lists the defensive games assigned to each position:

1B 3
LF 4
RF 5
3B 6
CF 6
2B 8
C 10
SS 11
103. Dan Turkenkopf Posted: March 15, 2006 at 01:06 AM (#1898909)
Dan Turkenkopf, I found the passage that I was looking for in the 1983 Baseball Abstract (page 124, bottom of first column)

Thanks. I'll try running the rest of the numbers with these weights tomorrow.
104.  Posted: March 15, 2006 at 02:43 AM (#1898996)
James has been very consistent in the 2/3 pitching to 1/3 fielding allocation, from his very early days through WS (which is 67.5/32.5). As I demonstrated some time ago, that approach essentially allocates a 50/50 split on BIP results to pitchers and fielders.

Does anyone else own Baseball Hacks, by Joseph Adler? Under Hack 29, Adler reproduces the field chart, with coordinates, that MLB.com uses to identify the locations of balls in play, and it might be useful to overlay the zone grid on top of that chart.

-- MWE
105. Dan Turkenkopf Posted: March 15, 2006 at 02:59 PM (#1899318)
Does anyone else own Baseball Hacks, by Joseph Adler? Under Hack 29, Adler reproduces the field chart, with coordinates, that MLB.com uses to identify the locations of balls in play, and it might be useful to overlay the zone grid on top of that chart.

I've got the book, I can probably do that sometime before the end of the weekend - I'm not sure my scanner is set up.

James has been very consistent in the 2/3 pitching to 1/3 fielding allocation, from his very early days through WS (which is 67.5/32.5). As I demonstrated some time ago, that approach essentially allocates a 50/50 split on BIP results to pitchers and fielders.

Based on my understanding of DIPS and other BIP type analysis, that sounds like too much credit to the fielders. Has anyone come up with a more accurate split?
106. Foghorn Leghorn Posted: March 15, 2006 at 03:27 PM (#1899331)
I would suggest that the split for BIP=>outs is calculable with a couple of factors-

ZR has a floor - below which a player would not play there. The floor represents 100% of BIP=> outs (or it approaches that).

A three year average of floor results for ZR at SS should tell us what % of outs are "automatic". Those are "pitcher-credited fielding outs" .

From floor to median may be a 50/50 split wrt "credit" for the outs. Above that are fielder outs.

All non-zone hits are pitcher "fault".

I haven't done those calculations, but that's close to the theory, and should approach accuracy (as opposed to some calculus of dx/dc as the percent converted moves).

Say the SS floor is ~.775 (anyone lower gets moved off the position), and a SS makes, on average, 448 plays. The pitcher creates 100% outs on 412 of those. So the average SS really makes 36 plays above "not allowed to play the position", and half of that credit is "pitcher driven", so 430 of 448 outs are pitcher credited, due to ease of play, etc.

That means *at the major league level*, fielding at SS (and presumably most positions) is still going to be 92-96% *the pitcher*.

For the VERY BEST fielders, converting at a ZR of .900, we are still looking at 88% pitcher-driven defense success.

Yes, these values aren't exact (93.7 or anything), but since we approach 1, the math here, even crude, is more than sufficient - arguing over one or two percent for "credit" is pointless (if my thought process assumptions are solid).

This *completely* flies in the face of nearly everything I have written on the internet over the last decade. I wish I had never figured this out. Stupid scientific process...
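A rough Python sketch of the floor arithmetic in this comment, under its stated rule: plays up to the floor are all pitcher, plays between floor and median are split 50/50, and anything above the median is all fielder. The function name is hypothetical, and the median ZR of .842 is back-calculated from 448 plays in 532 chances rather than stated in the comment.

```python
def pitcher_share(zr_opps, floor_zr, median_zr, player_zr):
    """Share of a fielder's plays credited to the pitcher under the floor idea."""
    plays_made = zr_opps * player_zr
    floor_plays = zr_opps * floor_zr
    median_plays = zr_opps * median_zr
    # Plays between the floor and the median are shared 50/50.
    shared = max(0.0, min(plays_made, median_plays) - floor_plays)
    pitcher_plays = min(plays_made, floor_plays) + 0.5 * shared
    return pitcher_plays / plays_made


# SS: 532 chances, floor .775; median ZR .842 is an assumption (448 / 532).
print(round(pitcher_share(532, 0.775, 0.842, 0.842), 2))  # 0.96
print(round(pitcher_share(532, 0.775, 0.842, 0.900), 2))  # 0.9
```

The average shortstop lands at the 96% quoted above. The .900 fielder comes out near 90% under this reading, in the same neighborhood as the 88% in the comment, which is explicitly a crude figure.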
107. Punky Brusstar (orw) Posted: March 15, 2006 at 04:07 PM (#1899382)
Damn, Chris. I need to sit back and digest that.
108. Dan Turkenkopf Posted: March 15, 2006 at 05:42 PM (#1899512)
So if I understand this correctly - we should be able to determine an absolute number of runs saved for a given player. Take Jack Wilson, he saved 16.5 runs above average. So those are all credited to him. The average SS makes 36 plays above the floor, which equals roughly 27 runs. Half that should be credited to the SS, so 13.5 runs. That means Wilson had an absolute defensive value of 30 runs. After that, we can combine with his offensive RC to get a total value number.
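The Jack Wilson arithmetic, spelled out as a short Python sketch. The variable names are mine; the inputs are the 36 plays above the floor from comment 106, the shortstop run value from the article's table, and the 16.5 runs quoted here.

```python
SS_RUNS_PER_PLAY = 0.753    # from the article's table
plays_above_floor = 36      # average SS, from comment 106
runs_above_average = 16.5   # Jack Wilson, as quoted above

floor_to_average_runs = plays_above_floor * SS_RUNS_PER_PLAY  # about 27 runs
fielder_share = 0.5 * floor_to_average_runs                   # half goes to the SS
total = runs_above_average + fielder_share
print(round(floor_to_average_runs, 1), round(fielder_share, 1), round(total, 1))
# 27.1 13.6 30.1
```

The comment rounds the same figures to 27, 13.5 and 30.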
109.  Posted: March 15, 2006 at 05:52 PM (#1899526)
Based on my understanding of DIPS and other BIP type analysis, that sounds like too much credit to the fielders. Has anyone come up with a more accurate split?

Davenport splits BIP credit 70% to the fielders and 30% to the pitchers.

Take Jack Wilson, he saved 16.5 runs above average. So those are all credited to him. The average SS makes 36 plays above the floor, which equals roughly 27 runs. Half that should be credited to the SS, so 13.5 runs. That means Wilson had an absolute defensive value of 30 runs.

That should be "runs above replacement", not "absolute defensive value". The 13.5 runs with which Wilson is credited are those based on the difference between an average SS and a replacement-level SS.

-- MWE
110. Dan Turkenkopf Posted: March 15, 2006 at 06:14 PM (#1899561)
That should be "runs above replacement", not "absolute defensive value". The 13.5 runs with which Wilson is credited are those based on the difference between an average SS and a replacement-level SS.

But if you follow Chris' argument - anything up to replacement is value to the pitcher - for getting the batter to hit to the place where a replacement-level shortstop can get to the ball. Under that description, it would be an absolute value - since it rebaselines replacement level fielding as 100% pitcher, 0% fielder.

Based on my understanding of DIPS and other BIP type analysis, that sounds like too much credit to the fielders. Has anyone come up with a more accurate split?

Oops.. I meant that to say "too much credit to the pitchers" which I didn't realize until now. But then Chris showed that I might have been right after all.
111. Foghorn Leghorn Posted: March 15, 2006 at 06:16 PM (#1899564)
That should be "runs above replacement", not "absolute defensive value". The 13.5 runs with which Wilson is credited are those based on the difference between an average SS and a replacement-level SS.

I'm not so sure. Yes, if we define replacement meaning a defensive player that has to be replaced, but not a defensive player for whom there is FAT (freely available talent) that would be better.

I think you are misusing the standard definition of "replacement player".

I think that is his absolute value. FAT for defense is slightly above average (for DEFENSE only).

Now we are going to have some debate over setting the right "floor" because sometimes Michael Barrett plays third base.

The more I think about it, the more I think this works - but the caveat of "someone who would be allowed to play the position".

It may also go a long way to defining relativity amongst the positions, based on different floors- like 2B floor is higher than SS- closer to 0.8, and 3B is nearly 0.7.

What this also says is that the defensive weightings could matter - that is a 3B defense has more opportunity for gain (I've known this since the Dean Palmer/Joe Randa deal in 1998 or so).
112. Foghorn Leghorn Posted: March 15, 2006 at 06:21 PM (#1899569)
But then Chris showed that I might have been right after all.

Yup, and for a decade now, I would have argued long and hard that you were crazy. heck, I probably did so in many threads here.
113. Punky Brusstar (orw) Posted: March 15, 2006 at 06:40 PM (#1899580)
This *completely* flies in the face of nearly everything I have written on the internet over the last decade. I wish I had never figured this out. Stupid scientific process...

So basically, just so I understand what you're saying, you and others in the baseball blogosphere (Voros, et al?) have been overrating fielding and underrating pitching? I think that maybe pitchers shouldn't get 100% credit for automatic outs. It looks like you're swinging the pendulum too far in the other direction.
114. Punky Brusstar (orw) Posted: March 15, 2006 at 06:51 PM (#1899597)
IOW, you're saying "If Dick Stuart can catch it, credit the pitcher."?
115. Foghorn Leghorn Posted: March 15, 2006 at 06:54 PM (#1899605)
No, I'm pretty much the only one who pressed hard for the defenders.

Sure, the pitcher may not get 100% of the credit, but what they get approaches 100%, so for practical purposes, 100% is correct - obviously having a trash can out there isn't going to get 75% of plays - so we are talking about starting with a baseline, rather than a zero. And the baseline has 20 years of established work on who will even get the option to play.

Yes, someone could misinterpret that to mean that you could thus put anyone there and get 75% outs, and that wouldn't be true, but teams don't do that. Or we have to find the correct "floor" - because there are some plays *I* can make, and that means it can be all pitcher-credit.
116.  Posted: March 15, 2006 at 06:59 PM (#1899619)
I'm not so sure. Yes, if we define replacement meaning a defensive player that has to be replaced, but not a defensive player for whom there is FAT (freely available talent) that would be better.

I don't think you can look at it that way for setting "component" replacement levels for offense and defense; when you do, you set the bar too high. The "freely available talent" standard is based on the combination of offense and defense that the player provides; some players at that level provide more offense, some more defense, but the "combination" is the same. Teams will accept a sub-FAT bat to get a glove at a position, and teams will accept a sub-FAT glove to get a bat at a position - it's the "combination" that matters, not the components. For the components, what matters on offense is the point at which teams will not accept the bat regardless of the glove, and what matters on defense is the point at which teams will not accept the glove regardless of the bat. Those points - for the components - are typically lower than the points you'll calculate when looking strictly at FAT.

-- MWE
117. Foghorn Leghorn Posted: March 15, 2006 at 07:23 PM (#1899669)
Those points - for the components - are typically lower than the points you'll calculate when looking strictly at FAT

I don't disagree with that, but I don't think we've appropriately studied FAT for defense - I mean I have a study on who is FAT, and I suppose I can subsequently evaluate for defense, but FAT players rarely have enough defensive data to evaluate.

Overall, I would hedge toward average defense anyway. Or I would regress to the mean....
118. Punky Brusstar (orw) Posted: March 15, 2006 at 07:44 PM (#1899707)
No, I'm pretty much the only one who pressed hard for the defenders.

I know you were a staunch proponent of the fielders. But, and correct me if I'm misunderstanding this, but didn't McCracken's theories essentially give the pitcher no credit for balls in play; thus giving fielders 100 % credit for balls converted into outs? I know that that is the strong form of DIPS theory (and I suspect no one really believed that); not the weaker form.
119. Mefisto Posted: March 15, 2006 at 09:00 PM (#1899805)
This is a great discussion.
120. Los Angeles Waterloo of Black Hawk Posted: March 15, 2006 at 10:13 PM (#1899897)
I think Dial's conception of giving the pitcher credit for anything below a position's "floor" is sound.

In the AL last season, each team averaged 4,430 BIP (I just added up the league's AB-HR-SO and divided by 14). How many are in zones, and how many are not?

Per Dial's info on the spreadsheet accompanying this article, 2,925 BIP are in someone's zone. That means that 1,505, or 34%, definitely belong to the pitchers.

Now, as for the floor at each position ... Dial posits that .775 is the SS floor, which over a full season would be -20 runs. Does -20 sound like a fair replacement level? MGL posits that replacement level is around -17 per 150 games, and that's overall, inclusive of fielding, hitting, baserunning, and all of it, so -20 for 162 sounds reasonable to me, as it would require the player to be an above-replacement-level hitter to stay at the position.

Going through each position, and (roughly) identifying what ZR corresponds to -20 runs:

```
Pos   ZROpps  FloorZR  Floor Plays
1B      281    .780      219.18
2B      507    .780      395.46
3B      430    .715      307.45
SS      532    .775      412.30
LF      348    .785      273.18
CF      462    .845      390.39
RF      365    .810      295.65
TOT:   2925    .784     2293.61
```

Let's just round that off to 2,294 ... plus the 1,505 out-of-zone plays we identified above, that's 3,799 plays that, theoretically, belong to the pitcher. There were 4,430 BIP total, remember, so pitchers may take credit for 86% of the BIP.

Of course, that's just BIP, we haven't addressed HR, BB or Balks in looking at an overall contribution.

Expanding my chart of above, we can identify how many "Plays of Dispute" each position will have, on average, meaning plays that are up to the defender (ZROpps minus Floor Plays), and therefore the number of runs a perfect defender could prevent (by multiplying Plays of Dispute by the Runs Per Play value for the position):

```
Pos   ZROpps  FloorZR  Floor Plays  Plays of Dispute   Runs
1B      281    .780      219.18          61.82         49.3
2B      507    .780      395.46         111.54         84.1
3B      430    .715      307.45         122.55         98.0
SS      532    .775      412.30         119.70         90.1
LF      348    .785      273.18          74.82         62.2
CF      462    .845      390.39          71.61         60.3
RF      365    .810      295.65          69.35         58.5
TOT:   2925    .784     2293.61         631.39        502.5
```

631 plays per season comes out to 3.9 plays per game -- meaning an average AL game in 2005 would see around 4 non-routine opportunities for fielders to make an out (not including balls impossible to reach, as they are out of anyone's zone). Does that sound reasonable? I don't know.
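The whole chain in this comment can be rebuilt in a few lines of Python. The chances and run values come from the article, the floor ZRs and the 4,430 BIP figure from this comment; the variable names are mine, and the sketch takes the floors as given rather than validating them.

```python
ZR_OPPS = {"1B": 281, "2B": 507, "3B": 430, "SS": 532,
           "LF": 348, "CF": 462, "RF": 365}
RUNS_PER_PLAY = {"1B": 0.798, "2B": 0.754, "3B": 0.800, "SS": 0.753,
                 "LF": 0.831, "CF": 0.842, "RF": 0.843}
FLOOR_ZR = {"1B": 0.780, "2B": 0.780, "3B": 0.715, "SS": 0.775,
            "LF": 0.785, "CF": 0.845, "RF": 0.810}
TOTAL_BIP = 4430  # average AL team, 2005, per this comment

floor_plays = {pos: ZR_OPPS[pos] * FLOOR_ZR[pos] for pos in ZR_OPPS}
dispute = {pos: ZR_OPPS[pos] - floor_plays[pos] for pos in ZR_OPPS}
runs = {pos: dispute[pos] * RUNS_PER_PLAY[pos] for pos in ZR_OPPS}

in_zone = sum(ZR_OPPS.values())        # balls in someone's zone
out_of_zone = TOTAL_BIP - in_zone      # balls in nobody's zone
total_floor = sum(floor_plays.values())
total_dispute = sum(dispute.values())
pitcher_share = (out_of_zone + total_floor) / TOTAL_BIP

print(in_zone, out_of_zone)                                  # 2925 1505
print(round(total_floor, 2), round(total_dispute, 2),
      round(sum(runs.values()), 1))                          # 2293.61 631.39 502.5
print(round(pitcher_share, 2), round(total_dispute / 162, 1))  # 0.86 3.9
```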
121. GuyM Posted: March 15, 2006 at 10:22 PM (#1899917)
The "freely available talent" standard is based on the combination of offense and defense that the player provides; some players at that level provide more offense, some more defense, but the "combination" is the same.... Those points - for the components - are typically lower than the points you'll calculate when looking strictly at FAT.

If baseball had separate offensive and defensive squads, like football, this would be easy. The "floor" would be the ability of the theoretical 31st-best SS, 31st-best CF, etc. We'd still have to figure out who that was, and how many runs better the avg. player was, but conceptually it's pretty straightforward. (In this scenario, the overall quality of defense would be vastly better than today's game -- at some positions the new replacements would be better than today's average).

Unfortunately, it's not that simple. We have to deal with players selected for their combined Off/fldg talents. While MWE is theoretically correct that a FAT can be good-bat/bad-glove or bad-bat/good-glove (or anywhere in between), the reality is extremely asymmetrical. There are lots of lg-avg gloves (or better) out there in FAT land, but essentially no lg-avg bats. The point is that not only do some FATs give you a lg-avg glove, but on average they do that. In other words, the only reason anybody plays in the majors with a below-avg glove is because they can hit; the same is not true on the hitting side.

Given that, I would say fielders as a group have zero value above replacement. I would say that catchers and middle infielders have a little positive defensive value (because you have to accept below-FAT hitting to obtain their gloves), while 1B and corner OFs have negative defensive value. And of course individual players may be far above or below average. But net it all out, and you come up with nada.
122.  Posted: March 15, 2006 at 10:24 PM (#1899926)
That means that 1,505, or 34%, definitely belong to the pitchers.

Minus OOZ plays made. You have to give credit to a fielder when an OOZ play is made.

-- MWE
123. Los Angeles Waterloo of Black Hawk Posted: March 15, 2006 at 10:28 PM (#1899937)
Of course, something else to consider is that fielders will record outs on a certain number of out-of-zone plays. Not sure how many ... I'd have to check The Fielding Bible when I get home. But that's something to take into account.
124. Los Angeles Waterloo of Black Hawk Posted: March 15, 2006 at 10:30 PM (#1899942)
Actually, don't the ZROpps Dial posted include OOZ plays made?

I really wish The Fielding Bible had included a way to go to some site and download spreadsheets like The Hardball Times Annual did.
125.  Posted: March 15, 2006 at 10:33 PM (#1899944)
Actually, don't the ZROpps Dial posted include OOZ plays made?

They probably do, now that I think about it - Dial always complains about the fact that STATS doesn't split OOZ plays out of their data.

-- MWE
126. Los Angeles Waterloo of Black Hawk Posted: March 15, 2006 at 10:35 PM (#1899947)
The Fielding Bible presents ZR with the OOZ separated out, which is very nice. I haven't really had a chance to play with those yet.
127. Los Angeles Waterloo of Black Hawk Posted: March 15, 2006 at 10:45 PM (#1899968)
For the 0.001893% of people that might care, I posted a <shameless self-promotion> review of The Fielding Bible today </shameless>, though there's nothing in it that anyone reading this thread doesn't already know.
128. Foghorn Leghorn Posted: March 15, 2006 at 11:56 PM (#1900111)
LAHBW,
how many years of ZR OOZ is in there? I have scads of old data I can compare against.

I have got to get that book!
129. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 12:50 AM (#1900229)
Chris, I'm pretty sure it just goes back to 2003 (probably not what you're dreaming of, but better than nothing). I can double-check when I get home ... sitting at my desk with that book open every day probably wouldn't be smiled on by my employers ...
130. Punky Brusstar (orw) Posted: March 16, 2006 at 02:05 AM (#1900471)
This is a great discussion.

I'm glad I revived it. I have nothing else to add at the moment, as I'm a mere dilettante compared to CTD, MWE, and LAW, but I'll continue to monitor it as it continues.
131. Chris Dial Posted: March 16, 2006 at 02:20 AM (#1900508)
Going through each position, and (roughly) identifying what ZR corresponds to -20 runs:

That is interesting, but I think the spread is more than that. the floor for other positions isn't -20, but what the worst guy allowed to play is. It's a bit lower at 3B and 1B and LF, and higher at 2B.
132. Spivey Posted: March 16, 2006 at 03:13 AM (#1900569)
Does ZR (and UZR, if anyone knows) include both missed plays due to errors and missed plays due to range in the same section?
133. Chris Dial Posted: March 16, 2006 at 03:16 AM (#1900577)
yes, spivey, they do.
134. Spivey Posted: March 16, 2006 at 03:16 AM (#1900578)
Also, I went through 2000-2003 UZR, and 2005 ZR, and there is a considerably larger STDEV of defensive levels at 1B and corner outfield slots than 'defensive' positions.
135. Spivey Posted: March 16, 2006 at 03:20 AM (#1900585)
It seems like that could be a problem if you wanted to use past ZR and UZR to project future performance (mainly for young players). Many well regarded SS prospects always seem to throw away a lot of balls when they're young. It seems like they generally improve (and hopefully, keep the good range) - but maybe that's just selective memory, only remembering the guys that didn't move off the position.
136. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 03:31 AM (#1900607)
That is interesting, but I think the spread is more than that. the floor for other positions isn't -20, but what the worst guy allowed to play is. It's a bit lower at 3B and 1B and LF, and higher at 2B.

Oh, I hadn't thought of that.

I did this whole thing where I went through all the positions ... well, I'll keep it below. But I went through this whole thing, but then it occurred to me that the floor for each position would have to be the same as the offensive replacement level for that position, wouldn't it? Not that we (I) know how to figure that out with 100% certainty, of course.

Here's the part I wrote originally that was probably a huge waste of time:

Could the floor for a position be identified by looking at the ZRs for players that move down the defensive spectrum? Or at least as a starting point ...

Just looking at the 2005 AL, in a totally cursory fashion (and I'm sure you'd have to look at several years to get a real feel for it, and you'd be best-served to use multi-year ZRs with some form of regression to evaluate the true talent level) ...

1B->DH
Jason Giambi had a .750 ZR, David Ortiz a .735; that's in a range of -27 to -30 runs.

2B->3B? or LF?
Jorge Cantu (also played a lot at 3B) was at .764 and utility man Joe McSuck was at .769, so we might be looking at -24 to -26 runs ... Chris speculated it might actually be higher than -20 ... Damion Easley was .770/-16.3 in the NL.

3B->1B? or LF?
Hmm ... Maicer Izturis' (utility) RSCal was -17.5 at .725 ZR, and Mark Teahen (suck) was -24.8 at .704. Pinch-hitter Dave Hansen was at .700 in an extremely small sample, but that might be the right neighborhood (A-Rod's at .735, so we can't go much higher). Cantu was beyond bad.

SS->2B or 3B
The -20 range seems about right; you have Maicer Izturis there again (-22.4 at .770), Mike Morse right behind him, and Russ Adams just ahead. Morse (.768/-23.2) was the worst of anyone with 400+ innings.

LF->1B or DH
David Dellucci was at .757/-28.6, which was the worst of anyone with 300+ innings except for Manny, who has extenuating circumstances. Dellucci is surrounded by guys like Mike Morse and Jose Hernandez, who you wouldn't have expected to ever play in the outfield.

CF->LF or RF
Terrence Long, .833/-26.1.

RF->LF or 1B or DH
Matt Stairs and Reed Johnson were at .810/-20.3, Ruben Sierra .765/-34.2, so that's a pretty big stretch. I'd be inclined to go with the .810.
137. Dan Turkenkopf Posted: March 16, 2006 at 03:34 AM (#1900611)

631 plays per season comes out to 3.9 plays per game -- meaning an average AL game in 2005 would see around 4 non-routine opportunities for fielders to make an out (not including balls impossible to reach, as they are out of anyone's zone). Does that sound reasonable? I don't know.

Wouldn't some of these 3.9 plays per game be OOZ too because of the way ZR is calculated? So it's actually probably a smaller number of non-routine in-zone opportunities. I wonder what the breakdown is.

As an aside - I don't think I've seen this question answered in any of the threads, but I may have missed it. I know that the zones are chosen based on >50% out conversion. What are the actual out conversion rates in those zones? To figure it from ZR requires knowing the OOZ opportunities, but it would be calculated like this, I think:

(AvgZR*AvgZROpps - AvgOOZ) / (AvgZROpps - AvgOOZ) = IZR (in zone rating)

We should be able to calculate that from the Fielding Bible, right? And using the floor concept might give us (well me, since I'm sure other people understand it) a better picture of how few in-zone plays per year it is that differentiates good fielders from those who are so bad they can't stay at the position. If that makes any sense whatsoever.
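To make the formula above concrete, here is a minimal sketch in Python. It assumes that a published ZR counts each out-of-zone play as both a play made and a chance, which is what the formula implies; the function names are made up for illustration. The inputs are the Orlando Hudson figures quoted from the Fielding Bible elsewhere in this thread (278 plays on 326 in-zone chances, plus 52 OOZ plays).

```python
def zone_rating(in_zone_plays, in_zone_opps, ooz_plays):
    """ZR under the assumption that OOZ plays count as a play and a chance."""
    return (in_zone_plays + ooz_plays) / (in_zone_opps + ooz_plays)


def in_zone_rating(zr, zr_opps, ooz_plays):
    """Back out the in-zone rate: (ZR * Opps - OOZ) / (Opps - OOZ)."""
    return (zr * zr_opps - ooz_plays) / (zr_opps - ooz_plays)


# Orlando Hudson, as quoted from the Fielding Bible in this thread.
plays, opps, ooz = 278, 326, 52

zr = zone_rating(plays, opps, ooz)          # OOZ-inclusive ZR
izr = in_zone_rating(zr, opps + ooz, ooz)   # recovers 278 / 326

print(f"ZR  = {zr:.3f}")
print(f"IZR = {izr:.3f}")
```

If a data source builds ZR differently (for example, leaving OOZ plays out of the chances), the back-calculation would not apply as written.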
138.  Posted: March 16, 2006 at 03:46 AM (#1900624)
We should be able to calculate that from the Fielding Bible, right?

Chone (Sean) Smith talks about this here. He actually did exactly what you're suggesting, although he didn't post all his numbers.

-- MWE
139. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 04:12 AM (#1900667)
We should be able to calculate that from the Fielding Bible, right?

You don't even need to go through that. The book tells you, for example, that last year Orlando Hudson had 326 opportunities in his zone, and made 278 plays for a ZR of .853. In addition, he made 52 OOZ plays.
140. Spivey Posted: March 16, 2006 at 06:42 AM (#1901065)
I know that Dial and Rallymonkey have converted ZR to runs above or below average. Have they, or anyone else taken this data or similar data, and seen if team ZR runs approximately equals the difference between runs allowed and DIPS RA?
141. Harold Posted: March 16, 2006 at 07:09 AM (#1901218)
That means *at the major league level*, fielders at SS (and assumably most positions), fielding is still going to be 92-96% *the pitcher*.

I don't agree with the conclusion. Basically you're saying that the pitcher's "credit" or "value" is based on the number of outs that the fielder doesn't influence. I'm not sure that's a useful definition of "credit" or "value".

To use your logic in reverse, there's a certain floor for OBP allowed (well, I guess it's a ceiling, since higher numbers are worse), that a pitcher isn't allowed to pitch if he can't achieve. So a pitcher shouldn't get credit for the first 60% of outs, right? You can just say those are "baseball-driven" or "physics-driven" -- that even the worst pitcher will record some outs due to the nature of the game.

Now, I'm not saying that logic makes sense. I'm just saying that you're looking at the variation among major league shortstops, and saying that that is the importance of the fielder. OK, I buy that. But then you say that the pitcher is worth all of the remaining outs, as opposed to looking at the variation among pitchers.
142. Harold Posted: March 16, 2006 at 07:14 AM (#1901236)
Argh, it looks like the HTML formatting generated is different for some players than for others, which makes this a big pain in the ass. I'll see if I can keep working on this.

I figured out a way around this, by the way. I should be able to get this working pretty soon. By next week we should have a historical ZR database available (though it will be incomplete and there are copyright concerns).
143. Spivey Posted: March 16, 2006 at 08:25 AM (#1901344)
To use your logic in reverse, there's a certain floor for OBP allowed (well, I guess it's a ceiling, since higher numbers are worse), that a pitcher isn't allowed to pitch if he can't achieve. So a pitcher shouldn't get credit for the first 60% of outs, right? You can just say those are "baseball-driven" or "physics-driven" -- that even the worst pitcher will record some outs due to the nature of the game.

I was thinking the same thing, but couldn't think how to phrase it. This sums it up pretty well. Although, if Dial's point is that defense only matters on say ~10% of outs, it might be true.

But, it very well might matter on ~33% of outs that could be hits. This has to be thought of in an abstract form, because a fastball on the inside corner in one batter/pitcher matchup can have a guy tied up and another can be waiting for it and drive it - the exact same pitch with the same pitcher and hitter. Whereas for fielding, a rope to the edge of a zone will generally be a play that separates the good from the bad.
144. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 10:51 AM (#1901436)
I know that Dial and Rallymonkey have converted ZR to runs above or below average. Have they, or anyone else taken this data or similar data, and seen if team ZR runs approximately equals the difference between runs allowed and DIPS RA?

Anyone know the best DIPS RA formula to use? Looking at the Angels last year, John Dewan's Plus/Minus system, when converted to runs which they easily could have done in the book instead of just making me waste eight minutes of my life, says the Angel position players were a total of +35.04 runs. Now, that doesn't include throwing out basestealers, wild pitches, or passed balls ... or double plays. Or outfield arms. But there you go; if you have a DIPS or something, see how that works out.
145. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 11:26 AM (#1901443)
You know, when you start looking at in- and out-of-zone balls, there's a lot of wacky.

A-Rod last year, per Dewan's book, was 187-255 in his zone, and made an outrageous 99 OOZ, for a total of 286 plays made. His expected number of plays made: 284.

That means there were 29 plays out of his zone that he was supposed to make. Well, I know that's not literally true, because they're fractions of plays added up ... but that seems kinda goofy, doesn't it?
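The 29 comes from simple subtraction, sketched here in Python with the A-Rod figures quoted above. The variable names are illustrative only, and the bound assumes the most generous case where every one of his in-zone chances counts as a full expected play.

```python
# A-Rod, per the figures quoted from Dewan's book in this comment.
in_zone_plays, in_zone_opps = 187, 255
ooz_plays = 99
expected_plays = 284

total_plays = in_zone_plays + ooz_plays

# Even if all 255 in-zone chances were "expected" outs, the remaining
# expected plays would have to come from outside the zone.
min_expected_ooz = expected_plays - in_zone_opps

print(total_plays, min_expected_ooz)
```

Since in-zone balls are not all expected outs, the real out-of-zone share of expected plays would be larger than this lower bound.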

Other things add up, though ... A-Rod was +6 plays, but only +2 in "enhanced plays," which by the book means he prevented 6 hits but allowed 4 more extra bases than average. Sure enough, you look over, and he was -7 plays going to his right, which is where a 3B would be giving extra-base hits.

So, the conspiracy-minded will think, "Yes, of course, A-Rod is bad to his right because he has to cover for The Jeter to his left." But, oddly, A-Rod is -3 plays to his left. I don't know what's going on there ... he's +13 plays on balls hit straight at him. Of course, that's not really "straight at him," that's "straight at where a 3B typically plays," and the going right and left is based on that. So I don't know ... I'm not really sure where those 99 OOZ plays are.

Other guys seem to make more sense ... Orlando Hudson was 278-326 in his zone with 52 OOZ. His expected outs were 313, which is slightly lower than his ZROpps, which makes sense, as some number of balls hit into a defender's "zone" will be balls he only has a .501 chance of making an out on, so expected outs would tend to be lower than ZROpps.

One nifty thing about the book is that its data is presented pretty similarly to the old Dale Stephenson stuff with Defensive Average, except it doesn't do the run conversion ...
146. Foghorn Leghorn Posted: March 16, 2006 at 02:36 PM (#1901480)
Could the floor for a position be identified by looking at the ZRs for players that move down the defensive spectrum? Or at least as a starting point ...

LAWBH,
that's what I was basically thinking of - or slightly above that. Those guys were "below the floor".

I am not sure about the OOZ in the book until I get the data and cross-reference.

the easiest way to verify the process, to make sure we're talking apples and apples, is to use the OF. Their putouts match zone plays well. You need to see if all the numbers jibe.

Take Hudson:
278-326, plus 52.

That means his ZR, as I have it, should be 0.873 - what do I have for Hudson - 0.839.

As is noted at Chone's page, they don't match up. So comparing the systems without knowing what the differences are means a lot.

This also explains a great deal of the differences (IMM) between Pinto and MGL.

We already know there is a LD/FB issue at BIS, and so I am a little concerned with how things are being defined.

Dewan was instrumental in the invention of these things, so I'm not sure what to believe. I'll have to get the book.
147. AROM Posted: March 16, 2006 at 03:03 PM (#1901492)
Have they, or anyone else taken this data or similar data, and seen if team ZR runs approximately equals the difference between runs allowed and DIPS RA?

I have, I tried not using runs when I did it, I compared plays made above/below average to hits/errors above/below from DER. The correlation was positive, I think there was an r of +.50 to .65 or something.

By next week we should have a historical ZR database available

Are the zone ratings for early years recalculated or do infielders still have DP added in? I've already got a ZR database from the old Scoreboards, going back to 1992.
148. Foghorn Leghorn Posted: March 16, 2006 at 03:41 PM (#1901539)
Rally,
that's the goal. I *know* yours are wrong (pre 1999).

I have to make some comparisons based on other data I have gleaned. More on that later.
149. Danny Posted: March 16, 2006 at 04:29 PM (#1901603)
Anyone know the best DIPS RA formula to use? Looking at the Angels last year, John Dewan's Plus/Minus system, when converted to runs which they easily could have done in the book instead of just making me waste eight minutes of my life, says the Angel position players were a total of +35.04 runs. Now, that doesn't include throwing out basestealers, wild pitches, or passed balls ... or double plays. Or outfield arms. But there you go; if you have a DIPS or something, see how that works out.

I don't know if it's the best one, but FIP is available at THT. The Angels had a 3.68 ERA, a 3.91 FIP, and a 3.95 RA. Over 1464 IP, that's a 38 run difference between ERA and FIP.
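As a rough check on that arithmetic, here is a sketch in Python that converts the gap between two per-nine-inning rates into runs over the quoted innings. It uses only the rounded figures from this comment, so it lands a little under the 38 quoted; the difference is presumably rounding in the published ERA and FIP.

```python
# Angels 2005, rounded figures quoted in this comment.
era, fip, innings = 3.68, 3.91, 1464

# Both rates are runs per nine innings, so scale the gap by innings / 9.
runs_gap = (fip - era) * innings / 9

print(round(runs_gap, 1))
```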
150. Los Angeles Waterloo of Black Hawk Posted: March 16, 2006 at 07:55 PM (#1902069)
That 38-run difference is pretty close to the +35 that Plus/Minus intimates. I just went through the Dial-converted ZR's, and that comes out to +13.2 for the team (I didn't include the arm ratings). If I have time later I might check out some other teams, but as it has to be done by hand, I don't know if that'll happen ...

... actually, I can add up the Dial ZR conversions by team pretty easily. Well, except players who play for multiple teams have their data lumped together. I'll just divide those in half for an estimate (i.e., if a player played for the Yankees and the Red Sox and was +5 total, each team gets 2.5), as I don't have time to look up every damn multi-team player and figure out the proportions of how much they played for each team.

Again, this is ignoring arm ratings for outfielders and catchers altogether:

```
Team   On Team   Shared   Total
BAL    - 0.3      +1.9    + 1.6
BOS    -61.5      -3.3    -64.8
CLE    +35.8      -1.6    +34.2
CWS    +34.0              +34.0
DET    +12.7      +0.7    +13.4
KC     -35.1      -4.9    -40.0
LAA    +13.2              +13.2
MIN    + 1.9      -9.6    - 7.7
NYY    -64.9      -1.0    -65.9
OAK    +16.4      +6.1    +22.5
SEA    -10.4      -9.6    -20.0
TB     -39.8              -39.8
TEX    -27.2              -27.2
TOR    + 5.0      +0.7    + 5.7
-------------------------------
ARI    + 9.4      -4.4    + 5.0
ATL    +22.4              +22.4
CHC    +16.3      -3.3    +13.0
CIN    -30.1      +0.2    -29.9
COL    - 6.0      -0.6    - 6.6
FLA    -20.6              -20.6
HOU    -19.0              -19.0
LAD    +10.0      -4.7    + 5.3
MIL    -11.8      -0.1    -11.9
NYM    + 3.4      -0.4    + 3.0
PHI    +19.7      -2.1    +17.6
PIT    - 0.3      -2.8    - 3.1
SD     +25.7      +1.6    +27.3
SF     - 4.4      -5.6    -10.0
STL    + 5.3              + 5.3
WAS    + 0.5       0.0    + 0.5
```
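The even-split bookkeeping described above could be sketched like this in Python. The function name and the three sample players are hypothetical (only the +5 Yankees/Red Sox split mirrors the example in the comment), and this is not the actual spreadsheet used for the table.

```python
from collections import defaultdict


def team_totals(players):
    """Sum runs by team; multi-team players are split evenly across teams.

    players: iterable of (runs, [team, ...]) pairs.
    Returns {team: (on_team, shared, total)}.
    """
    on_team = defaultdict(float)
    shared = defaultdict(float)
    for runs, teams in players:
        if len(teams) == 1:
            on_team[teams[0]] += runs
        else:
            share = runs / len(teams)  # even split, ignoring playing time
            for team in teams:
                shared[team] += share
    all_teams = sorted(set(on_team) | set(shared))
    return {t: (on_team[t], shared[t], on_team[t] + shared[t]) for t in all_teams}


# Hypothetical inputs: one +5 player shared by NYY and BOS, two single-team players.
players = [(5.0, ["NYY", "BOS"]), (-3.0, ["NYY"]), (1.5, ["BOS"])]
totals = team_totals(players)
for team, (own, split, total) in totals.items():
    print(f"{team}  {own:+6.1f}  {split:+6.1f}  {total:+6.1f}")
```

Weighting each share by innings with each team would be more accurate, but that is exactly the lookup the even split is meant to avoid.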
151.  Posted: March 16, 2006 at 10:32 PM (#1902475)
I should note that comparing BIS data to other source data next year is going to be even harder, because BIS will be adding a "fliner" category to their BIP data. (These are those in-between balls hit to the outfield that aren't really fly balls but aren't really line drives, either.)

-- MWE
152. Dan Turkenkopf Posted: March 16, 2006 at 10:51 PM (#1902523)
I should note that comparing BIS data to other source data next year is going to be even harder, because BIS will be adding a "fliner" category to their BIP data. (These are those in-between balls hit to the outfield that aren't really fly balls but aren't really line drives, either.)

Great. So now we're going to have to take into account a pitcher's fliner rate too - and whether there's any skill in that. Not to mention the more real issue of consistency in differentiating between a fly ball, a line drive and a fliner. I wouldn't be surprised to see different scorers call the same ball differently.
153. Foghorn Leghorn Posted: March 17, 2006 at 05:04 PM (#1904060)
BTW, I really don't see the plus there. At this point, why aren't scorers being given stopwatches for FBs?
154. Harold Posted: March 18, 2006 at 03:11 AM (#1904970)
Rally,
that's the goal. I *know* yours are wrong (pre 1999).

Well, can you tell if the old data on espn.com handles the DPs correctly? Either Dial or Rallymonkey? You can check a couple active players and see if they match up.

I should note that comparing BIS data to other source data next year is going to be even harder, because BIS will be adding a "fliner" category to their BIP data. (These are those in-between balls hit to the outfield that aren't really fly balls but aren't really line drives, either.)

From what I heard, there will actually be *four* categories (two "fliner" categories between "liners" and "flies"). The idea is that more granularity is better, and that analysts can decide how to partition the data (i.e., you can combine those four categories into two categories for LD and FB to better match up with other services, or use 3 or 4 categories).

If the problem is that BIS data "draws the line" between flies and liners in a different place than the other service, then drawing more lines helps rectify that problem.
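A small Python sketch of that regrouping idea, collapsing four trajectory categories into two (line drives and fly balls) or three. The category labels and the counts are invented placeholders, not BIS's actual codes or data.

```python
from collections import Counter

# Hypothetical labels for the four air-ball categories.
TWO_WAY = {"liner": "LD", "fliner_liner": "LD", "fliner_fly": "FB", "fly": "FB"}
THREE_WAY = {"liner": "LD", "fliner_liner": "FL", "fliner_fly": "FL", "fly": "FB"}


def regroup(counts, mapping):
    """Collapse fine-grained category counts using the chosen mapping."""
    out = Counter()
    for label, n in counts.items():
        out[mapping[label]] += n
    return dict(out)


# Placeholder counts, for illustration only.
counts = {"liner": 10, "fliner_liner": 4, "fliner_fly": 6, "fly": 20}
print(regroup(counts, TWO_WAY))
print(regroup(counts, THREE_WAY))
```

Moving a fliner category from one side of a mapping to the other is how an analyst would shift where the line between flies and liners is drawn.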
155. Chris Dial Posted: March 18, 2006 at 03:24 AM (#1904992)
Well, can you tell if the old data on espn.com handles the DPs correctly? Either Dial or Rallymonkey? You can check a couple active players and see if they match up.

Yes, I guess I coul...didn't you already pull the data?
Oh, and State dumps Cal.
156.  Posted: March 18, 2006 at 04:23 AM (#1905018)
From what I heard, there will actually be *four* categories (two "fliner" categories between "liners" and "flies").

That's correct - fliners like fly balls and fliners like line drives. As I understand it, they are used solely for outfield plays.

-- MWE
157.  Posted: March 18, 2006 at 04:32 AM (#1905022)
Oh, and State dumps Cal.

Go Penn.

-- MWE
158. Harold Posted: March 18, 2006 at 04:38 AM (#1905028)
Oh, and State dumps Cal.

Great, thanks. I just got home and was about to start the game on DVR. I was staying out of the lounge, but didn't think I had to worry about spoilers here.

Yes, I guess I coul...didn't you already pull the data?

No, not yet. I figured out how to make it easy, but haven't gotten a chance to do it. I've been out of town the last couple days.

If the data is from the "bad" ZR construction, I'm not sure it's worth going through the effort to pull it.
159. Dan Turkenkopf Posted: March 18, 2006 at 04:41 AM (#1905031)
Go Penn.

Up one at halftime! Go Quakers!
160. Chris Dial Posted: March 18, 2006 at 05:40 AM (#1905134)
I'm not sure it's worth going through the effort to pull it.

Since I'm not doing it, I'm certain it is worth the effort.
161. dbain21 Posted: July 12, 2006 at 09:07 PM (#2097080)
Chris,

What exactly are AvgZROps and how did you determine them? Are they calculated from other readily available defensive statistics (PO-A-E)??

Sincerely,
Derek Bain
162. Chris Dial Posted: July 12, 2006 at 11:16 PM (#2097151)
Derek,
no, they are not. AvgZROps were developed from about 5 years' worth of actual ZR Ops. They represent tens of thousands of innings.

Sorry.
163. mtagliaf Posted: July 21, 2006 at 05:58 PM (#2106640)
I'm confused - the article says:

For Helton we get:
RScal = Helton’s ZR * 281 AvgZRChances
RScal = 205.4

Helton's ZR is .916, but .916 * 281 is not equal to 205.4. What am I missing?
164. Chris Dial Posted: July 26, 2006 at 03:24 AM (#2112590)
Sorry, Mtag. I corrected that:
For Helton we get:
RScal = Helton’s ZR * 281 AvgZRChances * 0.798 runs/play
RScal = 205.4
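The corrected line can be checked with a few lines of Python, using only the numbers quoted in this thread (the variable names are illustrative). It also shows where the earlier confusion came from: ZR times chances alone gives plays, not runs.

```python
zr = 0.916             # Helton's ZR
avg_zr_chances = 281   # AvgZROps for 1B
runs_per_play = 0.798  # run value per play at 1B

plays = zr * avg_zr_chances    # plays made over a full season of chances
rscal = plays * runs_per_play  # converted to runs

print(round(plays, 1), round(rscal, 1))
```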
