Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Monday, November 12, 2007

Baseball Musings: Pinto: Probabilistic Model of Range, 2007, CF

PMR data for 07 CFers is up.

Did Corey Patterson play hurt this year?

Shock Posted: November 12, 2007 at 12:47 AM | 44 comment(s)
  Related News: Sabermetrics

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

Page 1 of 1 pages
   1. Miss Remember Posted: November 12, 2007 at 02:59 AM (#2611984)
Didn't UZR hate Ichiro?
   2. 1k5v3L Posted: November 12, 2007 at 03:21 AM (#2611993)
Erstad?!

One thing that shoewizard brought up is that AZ's outfielders all play very, very deep, in a "prevent extra base hits" defense. As a result, the number of doubles and triples that the Dbacks have given up this year has dropped quite significantly. Also as a result, the outfielders, especially Young, are letting a lot of balls drop for singles that arguably would be handled by someone playing a lot shallower. I don't know if PMR accounts for that (i.e., positioning and sacrificing singles for extra base hits). Also, Young's defense improved quite a bit as the year went on.
   3. Shock Posted: November 12, 2007 at 03:26 AM (#2611996)
Also interesting that it doesn't think much about Melky Cabrera. One has to wonder where all the defense comes from if PMR has the Yankees #1 with SS rated horribly and CF rated around average. I'm thinking Cano is going to be #1 among 2B's...
   4. Le Samourai Posted: November 12, 2007 at 03:32 AM (#2611997)
I'm surprised Taguchi is above-average. I guess his speed is saving him, because despite his reputation, he's looked horrible whenever I've watched him the past two years. Bad reactions, awkward routes, etc.
   5. fear and loathing in birdlives Posted: November 12, 2007 at 03:47 AM (#2612002)
Did Corey Patterson play hurt this year?

Nope, and his poor rating really surprised me.
   6. PreservedFish Posted: November 12, 2007 at 04:01 AM (#2612003)
"Didn't UZR hate Ichiro?"

Yes. I think it had him as one of the worst fielders in baseball in the second half. But good in previous years. And IIRC, a Mariner fan or two agreed that his defense went sour during the season.
   7. Biff, Red Sox Jinx Posted: November 12, 2007 at 04:09 AM (#2612006)
More evidence that Coco got robbed of a Gold Glove, et cetera.
   8. mgl Posted: November 12, 2007 at 08:50 AM (#2612035)
Yup I have Ichiro as awful in CF this year. I do have a very good year for CF though (as opposed to the last 5 years combined), for whatever that is worth. Most of the other positions are awful for some reason (again, as opposed to the last 5 years). There are just awful defenders all over the field - the entire Devil Rays team, most of the Marlins, Braun, etc.
   9. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 12, 2007 at 09:58 AM (#2612067)
MGL, how is that possible? Isn't UZR always denominated relative to average? If there are more awful defenders, don't there also have to be more excellent defenders so that everything sums to zero (albeit with a higher standard deviation)? Or are you saying that the overall league defensive efficiency declined this year--did it?
   10. NJ in DC Posted: November 12, 2007 at 10:10 AM (#2612077)
MGL, how is that possible? Isn't UZR always denominated relative to average? If there are more awful defenders, don't there also have to be more excellent defenders so that everything sums to zero (albeit with a higher standard deviation)? Or are you saying that the overall league defensive efficiency declined this year--did it?

I would imagine that the average is calculated year to year so for example if one year every team stuck their DH at SS, the average would be low compared to years past, but you would still have your standout defenders like Gary Sheffield because a guy like Frank Thomas or Jason Giambi is just so terrible there.
   11. Mike Emeigh Posted: November 12, 2007 at 10:52 AM (#2612104)
Interesting that Red Sox CFs other than Coco were below expectations, although the sample size is small (Coco was out there for 84% of BIP). I guess it makes sense, though: about half of the non-Coco innings went to Ellsbury, the other half went to Pena and Drew. Pena probably dragged the performance down.

One thing that I'd like to see is how well UZR correlates with Pinto's predicted DER. Ichiro has a relatively low predicted DER - which implies that for one reason or another the balls that were being hit against the Mariners were distributed in a manner that would reduce the probability that Ichiro could even make a play on a ball - and he also had a low UZR. If that's consistent for other fielders, it leaves open the possibility that UZR is not independent of opportunity - which is not a knock on UZR, because IMO accounting for opportunity differences is the hardest thing to do when evaluating fielding.

-- MWE
   12. Gaelan Posted: November 12, 2007 at 11:05 AM (#2612115)
MGL, how is that possible? Isn't UZR always denominated relative to average? If there are more awful defenders, don't there also have to be more excellent defenders so that everything sums to zero (albeit with a higher standard deviation)? Or are you saying that the overall league defensive efficiency declined this year--did it?


MGL is now using larger sample sizes (I think he said five years but I'm not sure) to calculate the baseline average. Using this method in any one year everything won't add up to zero.
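The baseline point above can be checked with a little arithmetic. What follows is only a sketch with made-up rates and fielders (none of the numbers come from UZR, and the unweighted five-year average is an assumption): ratings measured against the same season's average sum to zero by construction, while ratings measured against a multi-year baseline do not have to.

```python
# Hypothetical plays-made rates at one position; none of these numbers come from UZR.
league_rate_by_year = {2003: 0.700, 2004: 0.702, 2005: 0.698, 2006: 0.701, 2007: 0.690}

# Hypothetical 2007 fielders: (name, chances, plays made).
fielders_2007 = [("A", 400, 290), ("B", 400, 276), ("C", 400, 262)]


def plays_above_baseline(fielders, baseline_rate):
    """Plays made minus the plays a baseline fielder would make on the same chances."""
    return {name: made - baseline_rate * chances for name, chances, made in fielders}


# Baseline 1: the single-season rate implied by these fielders alone.
total_chances = sum(chances for _, chances, _ in fielders_2007)
total_made = sum(made for _, _, made in fielders_2007)
same_year_rate = total_made / total_chances

# Baseline 2: a simple unweighted five-year average (an assumed weighting).
multi_year_rate = sum(league_rate_by_year.values()) / len(league_rate_by_year)

same_year = plays_above_baseline(fielders_2007, same_year_rate)
multi_year = plays_above_baseline(fielders_2007, multi_year_rate)

print("vs. same-year average: ", round(sum(same_year.values()), 2))
print("vs. five-year baseline:", round(sum(multi_year.values()), 2))
```

With a down season like the hypothetical 2007 here, the total against the five-year baseline comes out negative, which is consistent with seeing more awful defenders than excellent ones in a single year.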
   13. Miss Remember Posted: November 12, 2007 at 11:10 AM (#2612123)
I'm also really, really excited to see Aaron Rowand's contract.
   14. smilinmike Posted: November 12, 2007 at 11:35 AM (#2612148)
Anyone else surprised that Damon comes out ahead of Melky? I suppose Melky's arm more than makes up the difference, but still...

Also, this provides further evidence that Andruw Jones is the right free agent choice among the Jones, Rowand, Cameron, Torii options. If I'm a GM I'm thrilled that Andruw hit .222 this year to drastically reduce his cost. A free agent bargain?
   15. AROM Posted: November 12, 2007 at 11:47 AM (#2612158)
Anyone else surprised that Damon comes out ahead of Melky? I suppose Melky's arm more than makes up the difference, but still...


Melky was pretty bad by John Dewan's plus/minus, and PMR comes from the same data source. I don't think any pbp stat shows Melky as a standout defender. I still like him out there (I need another year of evidence before I disbelieve the eyes that say he's at least decent), and his arm is a refreshing change.

UZR had Torii as the better centerfielder. The zone ratings have them about the same. Offensively Torii has passed him, partly due to the off year from Andruw, partly due to playing in the tougher league. The difference is not huge but I put Torii about a half a win better on a projection. I don't think either one is going to be a bargain.
   16. smilinmike Posted: November 12, 2007 at 12:40 PM (#2612212)
UZR had Torii as the better centerfielder. The zone ratings have them about the same. Offensively Torii has passed him, partly due to the off year from Andruw, partly due to playing in the tougher league. The difference is not huge but I put Torii about a half a win better on a projection. I don't think either one is going to be a bargain.


You're probably right about neither being a bargain. I guess I'm seeing Andruw's down year in 07 as coming at just the right time for the signing GM, whereas Torii's up year comes at the perfect time for the player. I tend to think Andruw will be better than Torii from 2008 forward, largely due to his longer track record as an impact player and the fact that he's nearly 2 years younger.
   17. AROM Posted: November 12, 2007 at 01:52 PM (#2612330)
Tough call. You make some good points, but a lot of people think Andruw's conditioning will make his decline swift and ugly.
   18. Cowboy Popup Posted: November 12, 2007 at 01:53 PM (#2612332)
I don't think any pbp stat shows Melky as a standout defender.

ZR Does. And RZR has him as a standout too.
   19. Russ Posted: November 12, 2007 at 02:09 PM (#2612355)


One thing that I'd like to see is how well UZR correlates with Pinto's predicted DER. Ichiro has a relatively low predicted DER - which implies that for one reason or another the balls that were being hit against the Mariners were distributed in a manner that would reduce the probability that Ichiro could even make a play on a ball - and he also had a low UZR. If that's consistent for other fielders, it leaves open the possibility that UZR is not independent of opportunity - which is not a knock on UZR, because IMO accounting for opportunity differences is the hardest thing to do when evaluating fielding.


I had the exact same feeling, Mike. I know that UZR is supposed to be doing the "same thing" as PMR, but the fact that they're getting SUCH different answers is unsettling (even more unsettling is that the errors are tending to be in exactly the way that you describe). I think there is definitely some sort of weirdness going on there. I think it's not (just) that UZR is not independent of opportunity, but I have a feeling that it's not independent of *difficulty*. Does anyone have a link to a description of MGL's new "updated" version of UZR? I found the old version (2003), but Tango suggests that the new version is now more "Pinto-like" (but it's obviously not judging by these discrepancies).

I'm starting to get the feeling that UZR is much more an opportunity/difficulty stat than an actual ability stat (not as egregious as, but similar to RS/RBI vs. OPS/OBP/SLG). UZR definitely is starting to feel more like a description of what a player contributed and PMR is starting to feel more like what a player's actual ability was (something like a context-neutral stat).
   20. GuyM Posted: November 12, 2007 at 02:47 PM (#2612404)
Mike/Russ: It would be interesting to see how UZR and PMR Predicted Outs correlate. But looking at players rated differently by the two systems, like Ichiro, doesn't tell us anything. Both MGL and Pinto know how many outs each player recorded. So if PMR rates a player higher, then by definition it defined him as having fewer/more difficult opportunities. And vice-versa. So players who PMR "likes" more than UZR will naturally tend to have low predicted DER.

* *

I think predicted DER at the position or player level is not very helpful. It's not very intuitive to me whether a CF should be expected to make 9.1% or 10.5% of outs on all BIP. It would help if Pinto added an average line at the top of each table. But beyond that, the expected DER is heavily influenced by GB/FB tendency of pitching staff. It would be great if Pinto developed a "difficulty metric" that told us how hard a fielder's opportunities were each season. For example, for CFs you might define all LD and OFs in certain vectors as the "CF territory" (perhaps defined as vectors in which the CF records majority of outs). Then, tell us what the expected DER was on all airballs hit into that territory. Do the same thing using GBs for infielders. I'm sure someone can improve on this definition, but I think it would be an interesting metric.
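A rough sketch of the "CF territory" idea above, using invented out probabilities rather than anything from PMR. The two staffs, the 0.60 and 0.02 probabilities, and the function names are all hypothetical. It shows how predicted DER on all BIP moves with the staff's GB/FB mix even when the balls hit into CF territory are equally hard.

```python
# Hypothetical per-ball out probabilities for a typical CF; not actual PMR output.
# Each ball in play is (type, in_cf_territory, probability a typical CF records the out).
def make_sample(ground_balls, cf_air_balls, other_air_balls, p_cf=0.60):
    balls = [("GB", False, 0.0)] * ground_balls
    balls += [("AIR", True, p_cf)] * cf_air_balls
    balls += [("AIR", False, 0.02)] * other_air_balls
    return balls


def predicted_der_all_bip(balls):
    """Expected CF outs divided by every ball in play (the current denominator)."""
    return sum(p for _, _, p in balls) / len(balls)


def predicted_der_in_territory(balls):
    """Expected CF outs divided by airballs hit into CF territory only."""
    in_zone = [p for _, in_cf, p in balls if in_cf]
    return sum(in_zone) / len(in_zone)


groundball_staff = make_sample(ground_balls=550, cf_air_balls=150, other_air_balls=300)
flyball_staff = make_sample(ground_balls=400, cf_air_balls=200, other_air_balls=400)

for label, balls in (("GB staff", groundball_staff), ("FB staff", flyball_staff)):
    print(label,
          round(predicted_der_all_bip(balls), 3),
          round(predicted_der_in_territory(balls), 3))
```

In this toy setup the all-BIP figure differs between the two staffs while the in-territory figure is identical, which is the separation a difficulty metric would need to make.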
   21. GGC won't apologize for liking the Red Sox Posted: November 12, 2007 at 02:52 PM (#2612409)
Russ, are you saying the PMR is more of a forecasting stat and UZR is more of a stat that described what happened that year?
   22. Russ Posted: November 12, 2007 at 03:09 PM (#2612430)
So if PMR rates a player higher, then by definition it defined him as having fewer/more difficult opportunities. And vice-versa. So players who PMR "likes" more than UZR will naturally tend to have low predicted DER.


That's true according to the old version of UZR (MGL's series at BTF circa 2003). I have seen other information that suggests that the new version of UZR is adjusting now for difficulty. That means the two metrics should be much closer together. Then the only difference should really be in the adjustment. And, if UZR is ostensibly adjusting for both difficulty and opportunity, then we should certainly NOT see predictable differences between the two methods that are due to either difficulty or opportunity.

It would be great if Pinto developed a "difficulty metric" that told us how hard a fielder's opportunities were each season.


Unless I'm completely misunderstanding it, this is what the predicted DER is. The percentage of total BIP for a particular fielder that the expected fielder would have gotten to for a particular profile of opportunities (taking into account park, type of hit, etc). That's a difficulty rating, because the denominator is total balls in play for the fielder. So that's what we would have expected for a fielder given his profile of BIP.

Russ, are you saying the PMR is more of a forecasting stat and UZR is more of a stat that described what happened that year?


I'm saying that is my impression of it, going off a 4 year old description of UZR and my own naivete from never having analyzed any individual piece of data. But, yes, if the core UZR is still not adjusting for difficulty, then that is how I would view it. What adjusting for difficulty allows us to do (in theory) is to isolate the player's own contribution to what balls were turned into outs, which should allow for better forecasting.

If anything, David's PMR modelling (and associated articles) really made clear to me that the variation in BIP difficulty is quite large relative to the number of events, much larger than I would have ever expected. And I think his work, more than anything else, shows how there can be such a disconnect between what people see with their eyes and what the various metrics are telling us about defense. It doesn't seem smart to penalize a player for not making a play on a ball that most other players would not have made a play on, especially from a forecasting perspective. In the previous version of UZR, it seemed that BIP difficulty was being collapsed within each zone. This is obviously not great, because not all BIP are created equal. Now, it probably still worked pretty well for players without an extreme mix of chances, because most players are probably seeing roughly the same difficulty in each zone. However, for players who saw a lot of difficult chances or a lot of easy chances, assuming they had the same difficulty of chances will distort their effect on turning the BIP into outs.
   23. GuyM Posted: November 12, 2007 at 03:34 PM (#2612462)
we should certainly NOT see predictable differences between the two methods that are due to either difficulty or opportunity.

What I said is that you'll see predictable differences if you look at players who are rated very differently by the two metrics. The players rated higher by PMR will tend to have low predicted DER.

Unless I'm completely misunderstanding it, this is what the predicted DER is.

Only kinda. It tells you how many outs a typical CF would record given those BIP. It doesn't tell you if a player had low expected DER because he faced a lot of tough, sinking line-drives to CF, or simply because he played behind Brandon Webb and most of the BIP were GBs.

But, yes, if the core UZR is still not adjusting for difficulty, then that is how I would view it.

Both adjust for difficulty. Maybe PMR does it better, but I wouldn't assume that. PMR's estimate that a BIP will become an out relies on pretty low sample sizes. For example, it looks at how often a medium-hard FB to vector "X," hit by RH hitter against a LH pitcher, becomes an out. The probability is based on the out% for that exact configuration, in that ballpark only, handled (mainly) by visiting fielders, in ONE year. In many cases, the N must be less than 20.
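To put a rough number on the sample-size worry: if a bucket's out rate were estimated from N balls alone, a simple binomial model gives the standard error below. The 0.50 rate and the bucket sizes are hypothetical, and the sketch takes no position on how PMR actually pools its data.

```python
import math


def out_rate_standard_error(p, n):
    """Standard error of an observed out rate under a simple binomial model."""
    return math.sqrt(p * (1.0 - p) / n)


true_rate = 0.50  # hypothetical out probability for one narrow bucket
for n in (20, 100, 1000):
    se = out_rate_standard_error(true_rate, n)
    print(f"N={n:>4}: out% = {true_rate:.2f} +/- {se:.3f}")
```

At N = 20 the standard error is around eleven percentage points, so a bucket that small says very little about the true out probability unless it borrows strength from similar buckets.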
   24. kevin Posted: November 12, 2007 at 03:54 PM (#2612490)
Whoa, I knew that Coco had improved but I didn't expect him to be first.
   25. GGC won't apologize for liking the Red Sox Posted: November 12, 2007 at 04:21 PM (#2612525)
Now I'm more confused than I was earlier today.

I'll have to reread this at home instead of work.
   26. Shock Posted: November 12, 2007 at 04:26 PM (#2612530)
2B is now up as well. Not sure if we want a new thread for every update, so I'll just leave it here for now.
   27. GuyM Posted: November 12, 2007 at 04:31 PM (#2612544)
It occurs to me that Pinto could create a more meaningful "predicted DER" for each fielder. What we want -- or at least, what I want -- is something parallel to team DER: outs divided by opportunities. Right now, opportunities is defined as all BIP. But somehow, saying that "Coco Crisp was a .114 fielder last year" just doesn't sound impressive. But there's an implicit definition of a fielder's weighted fielding opportunities in PMR, which is the proportion of the predicted DER assigned to that player.

Example: probabilities for a BIP are .25 hit, .10 3B, .15 SS, .30 LF, .20 CF. Predicted DER is .75. So, we divide this opportunity this way: .133 3B (.10/.75), .20 SS, .40 LF, .267 CF (and each player's predicted DER for the ball is identical: .75). Each player ends up with a weighted opportunity total reflecting their level of responsibility for each BIP. Many plays would of course be overwhelmingly or entirely the responsibility of one fielder, but some would be divided as in this example. (If there are any BIP with 100% hit probability, assign to nearest fielder given standard positioning.)

Then each fielder gets his own values for outs, predicted outs, estimated opportunities, predicted DER (pred outs/oppor's), and actual DER (outs/oppor's). A predicted DER below that of the average for your position would indicate that a fielder had a large number of tough plays within his area of responsibility.
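The split in the example above works out as follows. This is just the arithmetic from the post in runnable form; the function name and the dictionary layout are illustrative, not anything Pinto publishes.

```python
def split_opportunity(out_probs):
    """Split one BIP into fractional opportunities, proportional to each
    fielder's share of the predicted outs on that ball."""
    predicted_der = sum(out_probs.values())
    if predicted_der == 0:
        # A 100% hit: the post suggests assigning it to the nearest fielder instead.
        raise ValueError("no out probability on this ball")
    return {pos: p / predicted_der for pos, p in out_probs.items()}


# The ball from the example: .25 hit, .10 3B, .15 SS, .30 LF, .20 CF.
ball = {"3B": 0.10, "SS": 0.15, "LF": 0.30, "CF": 0.20}
shares = split_opportunity(ball)

for pos, share in shares.items():
    # Predicted outs divided by weighted opportunity is the same for every fielder.
    per_ball_der = ball[pos] / share
    print(pos, round(share, 3), round(per_ball_der, 2))
```

Summing these fractional opportunities and predicted outs over a season, fielder by fielder, would give the per-player predicted DER and actual DER described in the last paragraph.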
   28. Los Angeles Waterloo of Black Hawk Posted: November 12, 2007 at 04:47 PM (#2612570)
   29. Los Angeles Waterloo of Black Hawk Posted: November 12, 2007 at 04:57 PM (#2612588)
   30. Russ Posted: November 12, 2007 at 05:01 PM (#2612597)
The probability is based on the out% for that exact configuration, in that ballpark only, handled (mainly) by visiting fielders, in ONE year. In many cases, the N must be less than 20.


This is not true (as far as I can tell). It seems that David builds a logistic regression model that would allow one to make simplifying assumptions that would increase the sample size brought to bear on the batted balls. Now there could be a problem still with year-to-year variation, but it would not be THAT serious because he's got a lot of data and not a lot of free parameters in the model.

A predicted DER below that of the average for your position would indicate that a fielder had a large number of tough plays within his area of responsibility.


It's still not clear to me that this isn't what David's doing already, with the exception of partial credit for BIP (balls that more than one fielder could get to). Maybe he'll check in and let us know.
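The pooling argument for a logistic regression can be illustrated with a toy additive model. Everything here is assumed for illustration: the factor levels, the coefficients, and the purely additive form are hypothetical and are not David's fitted model. The point is only that a few parameters can cover many configurations, so each estimate draws on far more than one cell's worth of balls.

```python
import math
from itertools import product

# Hypothetical factor levels and log-odds effects; not a fitted model.
effects = {
    "velocity": {"soft": 0.8, "medium": 0.0, "hard": -0.9},
    "vector": {"left-center": -0.3, "center": 0.5, "right-center": -0.3},
    "batter": {"RH": 0.0, "LH": 0.1},
    "pitcher": {"RH": 0.0, "LH": -0.1},
}
intercept = 0.4


def out_probability(**levels):
    """Additive logistic model: log-odds are the intercept plus one effect per factor."""
    log_odds = intercept + sum(effects[factor][level] for factor, level in levels.items())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Every exact configuration is its own cell if out% is tabulated separately.
cells = list(product(*(effects[factor] for factor in effects)))
n_cells = len(cells)

# The additive model needs one parameter per non-reference level, plus the intercept.
n_params = 1 + sum(len(levels) - 1 for levels in effects.values())

print(n_cells, "cells covered by", n_params, "parameters")
print(round(out_probability(velocity="medium", vector="center", batter="RH", pitcher="LH"), 3))
```

Adding interactions, such as a separate vector effect for each park or position, pushes the parameter count back up toward the cell count, which is where the model complexity versus sample size tradeoff shows up.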
   31. villageidiom Posted: November 12, 2007 at 05:25 PM (#2612631)
Interesting that Red Sox CFs other than Coco were below expectations, although the sample size is small (Coco was out there for 84% of BIP). I guess it makes sense, though: about half of the non-Coco innings went to Ellsbury, the other half went to Pena and Drew. Pena probably dragged the performance down.

I'm probably in the minority who think Ellsbury brought the performance down. To date he strikes me as an otherwise average defender with ridiculous speed. Maybe he's just not used to Fenway yet, like Coco in 2006. Or maybe Coco was just so good in 2007 that anything else looked choppy. But I generally wasn't impressed with the routes Ellsbury took, nor his judgment of what was catchable.
   32. GGC won't apologize for liking the Red Sox Posted: November 12, 2007 at 05:34 PM (#2612641)
vi, MWE talks about a learning curve wrt defense for newly promoted guys. So there's hope for the future. I'll see if I can dig up some of his comments. IIRC, the audiovisual background at a major league park takes getting used to.
   33. Russ Posted: November 12, 2007 at 05:39 PM (#2612655)
Then each fielder gets his own values for outs, predicted outs, estimated opportunities, predicted DER (pred outs/oppor's), and actual DER (outs/oppor's).


OK Guy, I went to David's site and read some of your old posts. You're definitely right that penalizing players for not getting to balls that others get to is suboptimal. I see what you're suggesting there and it's probably the correct thing to do. How much it affects the estimated difficulty, I do not know.
   34. Russ Posted: November 12, 2007 at 05:44 PM (#2612669)
BTW, I think that David's current models would not require much tweaking if he included an interaction between position and vector/zone in the model for probability of being caught there. That would give you a separate estimate of the probability of catching a ball for each zone/vector BY POSITION... I'll have to think about this a bit more though. There should be enough BIP that it should not be a big problem, although we're definitely starting to run into some model complexity vs. sample size issues.
   35. Mike Emeigh Posted: November 12, 2007 at 05:50 PM (#2612678)
I'm probably in the minority who think Ellsbury brought the performance down.


Ellsbury had a higher RF than did Coco, while Pena was far below, so I was guessing there. The predicted DER for other BoSox CFs was higher than Coco's predicted DER (.114 vs .106).

-- MWE
   36. Mike Emeigh Posted: November 12, 2007 at 05:51 PM (#2612680)
There should be enough BIP that it should not be a big problem, although we're definitely starting to run into some model complexity vs. sample size issues.


I think so. We're really starting to run into the limits of what we can do fairly simply with the data that we have.

-- MWE
   37. Dan Posted: November 12, 2007 at 05:53 PM (#2612685)
Maybe he's just not used to Fenway yet, like Coco in 2006. Or maybe Coco was just so good in 2007 that anything else looked choppy.

These both seem accurate to me, from watching Ellsbury play this year in his limited time.
   38. GGC won't apologize for liking the Red Sox Posted: November 13, 2007 at 08:28 PM (#2613941)
David Pinto gets to third base.
   39. Shock Posted: November 13, 2007 at 09:28 PM (#2613979)
3B list is probably the most uncontroversial so far of any defensive list. I don't know what Zimmerman's reputation is, but it looks pretty unsurprising.
   40. Scoriano Flitcraft Posted: November 13, 2007 at 09:51 PM (#2613998)
IIRC, Tino Martinez said earlier this year that Coco Crisp is the best he has ever seen at fielding in the OF--he said his arm was mediocre but his ability to go get the ball and make the play was the best.
   41. Los Angeles Waterloo of Black Hawk Posted: November 13, 2007 at 10:12 PM (#2614019)
David Pinto gets to third base.

And so do I.
   42. Der Komminsk-sar Posted: November 13, 2007 at 10:20 PM (#2614024)
Zim has an excellent rep. Offhand, Aramis looks like the biggest surprise there.
   43. BeanoCook Posted: November 13, 2007 at 10:45 PM (#2614045)
3B list is probably the most uncontroversial so far of any defensive list. I don't know what Zimmerman's reputation is, but it looks pretty unsurprising.


One thing that was for certain was Zimmerman's defense was expected to be this good. So why is it that the scouts nailed this, could identify his great defense when he was a college player, but scouts still screw up from time to time on established big leaguers?

Does anyone know if the scouts tend to overrate or underrate a player's defensive abilities when compared to modern day defensive metrics?
   44. Pops Freshenmeyer Posted: November 13, 2007 at 10:52 PM (#2614056)
One thing that was for certain was Zimmerman's defense was expected to be this good. So why is it that the scouts nailed this, could identify his great defense when he was a college player, but scouts still screw up from time to time on established big leaguers?

I think scouts tend to look at defensive skills as opposed to results (it's the nature of their job). When projecting young players, those skills most likely outweigh the raw results in importance because you're looking at potential instead of actual value. The scouts can see the players more likely to turn into good defenders but that doesn't mean their methods of perceiving are best suited for picking the best fielder at that moment in time.
