User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Baseball Primer Newsblog: The Best News Links from the Baseball Newsstand

Tuesday, November 21, 2006

PMR!

David Pinto is back with his Probabilistic Model of Range, so I’m once again back with my attempt to convert his figures (which are represented in outs) to runs. The link takes you to the left fielders, so my run conversions are here. David’s also put up centerfielders and first basemen, so I’ve done those as well (CF, 1B). I’ve also compared to zone rating where we have the data, and there are some pretty stark differences. Melky Cabrera fans are gonna love PMR, while Grady Sizemore fans should be prepared for battle. I’ve got some more after the jump.

The first base figures include all batted balls, not just groundballs; we ran into a huge problem with all batted balls last year for pop-up hogs (poster boy: Orlando Hudson), and it looks like this year our outlier is Albert Pujols. David’s figures have him at around +40 plays, which comes out to +31 runs.

Also, in previous years, David would determine the “predicted” number of outs using a multi-year average. This year, he’s using 2006 only (he describes his reasons here). This simplifies how I do the run conversions.

Anyway, there are some stark differences on some players (especially Fenway outfielders) between PMR and ZR, so if anyone has any ideas on what might cause those differences, well, that won’t hurt the discussion.

Los Angeles Waterloo of Black Hawk
Posted: November 21, 2006 at 04:26 PM | 55 comment(s)
Related News: General, Sabermetrics, Arizona, Cleveland, NY Yankees, St Louis |
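For anyone wanting to sanity-check the outs-to-runs step, here is a minimal sketch that backs out the runs-per-play ratio implied by the Pujols figures quoted in the post (around +40 plays, +31 runs). The ratio is only what those two rounded numbers imply; the actual conversion factors used for each position are not given here, and the helper name `plays_to_runs` is made up for illustration.

```python
# Figures quoted in the post for Albert Pujols at first base.
plays_above_expected = 40   # "around +40 plays" (approximate)
runs_above_expected = 31    # "+31 runs"

# Implied value of one extra play, in runs. This is an inference from the
# two rounded numbers above, not a published conversion factor.
runs_per_play = runs_above_expected / plays_above_expected


def plays_to_runs(plays, factor=runs_per_play):
    """Convert plays above/below expected into runs using a flat factor.

    Assumption: every play at the position is worth the same number of runs.
    """
    return plays * factor


if __name__ == "__main__":
    print(f"implied runs per play: {runs_per_play:.3f}")
    print(f"-19 plays at that rate: {plays_to_runs(-19):.1f} runs")
```

A flat factor like this would only make sense within one position, since a missed play in the outfield tends to cost more bases than one in the infield.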
About Baseball Think Factory | Write for Us | Copyright © 1996-2008 Baseball Think Factory
Reader Comments and Retorts
I assume the Hudson effect is what causes Conor Jackson to look so bad according to PMR (-19).
Does BIS treat balls off the wall the same way Stats does? I would bet that PMR assumes balls off the wall are uncatchable, which should (and does) make Manny look a little bit better than ZR does.
I'm also curious if the strange dimensions in CF hurt Crisp more in ZR than they would in PMR or UZR.
I believe so.
-- MWE
***
Yeah, but Pinto doesn't, since the park is a separate parameter.
Does anyone know if David's using the "smoothed visitor" model from last year or the original model?
According to some of the comments on the first set of corrected numbers for 2006, he's using the "smoothed visitor" model.
Also, is park one of the independent variables, or does he run separate regressions for each park? If it's the latter, then I'd think the sample size could really become problematic.
Conor is probably not nearly that bad; remember, he's playing next to last year's popup hog. I wouldn't be surprised, though, if his GB-only rate is worse than his ZR. He didn't come to the majors with a rep for glovework.
Well, it would appear that Ryan Howard was roughly 60 outs worse than Albert Pujols defensively, so that's what.
***
It's not really an independent variable or a regression. It's a parameter. In a regression, we would say, p(out) = a*par1 + b*par2 + c*par3..., whereas in Pinto's model, we say, given par1, par2, par3...what is the probability of an out? But yes, the sample size is going to be pretty small for some zones. This is why MGL has talked about building a regression model for estimating probabilities in UZR for years. It just isn't very easy, if at all possible.
I do not think Pinto uses a regression. I believe that he calculates the expected outcome for a given set of parameters, e.g., for a medium-hit popup to vector K by a left-handed batter off a right-handed pitcher in Fenway Park: if out of 5 balls matching those conditions there were 2 outs by the SS, 1 by the LF, 1 by the CF, and 1 falls in, then per ball the expectation is 0.4 outs by the SS and 0.2 each for the LF and CF. The smoothed visitor model of PMR doesn't exclude the home team's performance but does severely discount it, so that visitors are more heavily weighted. Sample size issues are severe with one year of data. There are 60 possible parameter combinations for each vector. Consider that (just using the 22 fair STATS vectors, not the finer distinctions of BIS) there are an average of only 200 balls in play along a particular vector in one season in one park, which have to be further subdivided among the 60 combinations of batter handedness/pitcher handedness/hit type/whether the ball was hit soft, medium, or hard. The smoothed visitor model exacerbates the sample size problem.
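The bucket arithmetic described above can be sketched in a few lines. This is only an illustration of the lookup-table idea as described in this comment, not Pinto's actual code; the function name and the use of `None` for a ball that falls in are assumptions made for the example.

```python
from collections import Counter


def expected_outs_per_ball(outcomes):
    """Empirical out rate per fielder for one parameter bucket.

    outcomes: one entry per batted ball in the bucket, either the position
    label of the fielder who made the out or None if the ball fell in.
    """
    total = len(outcomes)
    if total == 0:
        return {}  # an empty bucket gives no estimate at all
    counts = Counter(o for o in outcomes if o is not None)
    return {fielder: n / total for fielder, n in counts.items()}


# The example from the comment: 5 matching balls, 2 outs by SS,
# 1 by LF, 1 by CF, 1 falls in.
bucket = ["SS", "SS", "LF", "CF", None]
rates = expected_outs_per_ball(bucket)

# Sample-size point: about 200 balls per vector per park per season,
# split across 60 parameter combinations.
avg_balls_per_bucket = 200 / 60

if __name__ == "__main__":
    print(rates)
    print(f"average balls per bucket: {avg_balls_per_bucket:.1f}")
```

With only three or so balls in an average bucket, a single catch or miss moves the estimated rate by a third, which is the sample-size problem in a nutshell.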
It sounds like he'd be much better off combining the data for all 30 parks, increasing sample size by 30x, and then making park adjustments where necessary. For starters, you should probably treat GBs the same in all parks. It may not be quite true, but any difference you see in one year of data could as easily reflect the home hitters as a true park effect. Same for popups, at least those in fair territory. Dealing with LDs and FBs is harder, but maybe a park adjustment made at the vector level -- rather than each unique set of parameters -- could do the job.
No, one system shows them to be almost identical (Chris Dial's) and the other shows Pujols to be 50 runs better. I find it hard to believe any two ML 1B can be 50 runs apart in defense.
Stuff like this makes defensive statistics hard to accept.
Biggest problem with defensive stats for me. These wild variations always throw methodology into question. Since this area is not my forte, have there been any studies that validate the results of one system against another? If I'm seeing Dial's numbers and PMR numbers that far apart, I'm hard pressed to buy into one system or the other till the gap is bridged.
This, to me, is a handicap, as it runs the risk of overvaluing defenders who play positions that are difficult in their home parks. For instance, last season, each of the primary Minnesota outfielders did better under the smoothed visitor model than the original model. The ceiling of the Metrodome can be somewhat hazardous to fielders who aren't used to it, I would imagine, which matches (but certainly doesn't by itself prove) my hypothesis.
This is a risk. But on the other hand, ALL above-avg fielders should do better under the visitor model (since they are being compared to themselves much more in the original model), so maybe the MN OF's are just above-average. I think it's more important to avoid the compared-to-themselves problem.
Interestingly, this probably means that Manny is even WORSE than his PMR rating!
However, JoeArthur makes the point that the smooth visitor model makes all players look better than they are, since it predicts fewer outs than are actually made. I do think Pinto should adjust the expected outs to equal actual outs for each position and/or each vector.
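One simple way to do the adjustment suggested here would be to scale every player's expected outs at a position by a single factor so the position total matches actual outs. The sketch below assumes that proportional approach; the player labels and numbers are invented purely to show the mechanics, and nothing here is confirmed as how PMR would actually be corrected.

```python
def rescale_expected(expected, actual):
    """Scale expected outs so their total equals total actual outs.

    expected, actual: dicts keyed by player for ONE position (or one vector).
    Assumption: the shortfall is spread proportionally across all players.
    """
    total_expected = sum(expected.values())
    total_actual = sum(actual.values())
    factor = total_actual / total_expected
    return {player: outs * factor for player, outs in expected.items()}


# Invented numbers: the model predicts 480 outs, fielders actually made 500.
expected = {"Player A": 290.0, "Player B": 190.0}
actual = {"Player A": 300, "Player B": 200}

adjusted = rescale_expected(expected, actual)

# Plays above expected, before and after the adjustment.
raw_diff = {p: actual[p] - expected[p] for p in expected}
adj_diff = {p: actual[p] - adjusted[p] for p in expected}

if __name__ == "__main__":
    print(raw_diff)
    print(adj_diff)
```

After rescaling, the plus/minus figures for the position sum to zero, so the group as a whole no longer looks above average just because the model under-predicts outs.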
***
Maury, you might be interested in this: http://stats.mostvaluablenetwork.com/general/a-detailed-comparison-of-defensive-metrics/
The issue with this version of PMR is that it uses pop-ups (and liners) for infielders, which zone rating does not include, and no defensive metric should include (at least not the way PMR does it). So it's not the same statistic. It would be like saying, "I don't trust these batting stats when some players have a very different number for batting average and on-base percentage." The two statistics both purport to measure offense, but they actually are measuring different things. PMR and ZR both purport to measure defense, but they too are using different statistics.
With regard to the smoothed visitor model specifically, this is probably true. Of course, to reach this conclusion one must assume that Manny is indeed below average! And/or that the average visitor is average or below average, rather than above average. Playing in the AL East, the average visiting left fielder at Fenway in 2006 included a healthy dose of Carl Crawford, Reed Johnson and Nick Markakis, who seem to be pretty good.
However, apart from sample size issues, there are a couple of peculiarities about Manny and PMR.
1) The conventional wisdom around Boston is that Manny should do better at Fenway with its smaller effective area to defend, and his presumably limited range would be more exposed in road games. But in 2005 David Pinto published PMR home/road splits for Manny which were hard to understand. [I'm going by memory since his site is down at the moment, but I think he had Manny +3 on the road and -15 at home in 2005.] Manny is very deferential when another fielder can make a play, so he is eligible for a severe ball hog penalty under PMR [I think this explains a lot of Coco Crisp's plus rating in center field as well]. With essentially no outfield foul territory to protect, Fenway encourages the left fielder to shift toward left center a little, increasing opportunities for overlap with the centerfielder. My guess is that this, rather than luck alone, explains much of the wide gap between his home and road performance under PMR.
2) To avoid having separate models for infield and outfield in PMR, Pinto does not use batted ball distance as a parameter. Without having seen the BIS data directly to know how they actually assign the "soft" and "hard" hit attributes, I imagine that this can mean that a medium fly ball to vector "I" could travel 260 or 310 or 360 feet, and PMR would treat all the balls as having the same chance of being caught. So because of this there's a great possibility of noise in 1-year PMR outfield ratings. And in fact data I have assembled for 2006 suggests that much of the difference between Red Sox LF and their opponents occurred along vector "I", but that the average distance of balls hit to the outfield in that direction by the Red Sox was about 302 feet, and only 285 feet by opponents. Other things being equal (which I don't know yet), the visiting team's opportunities along that vector should have been easier to catch, and I don't think PMR would recognize that.
Actually, last year he did better in the original model than the smoothed visitor model. I think Fenway LF is unique, but not necessarily difficult in the sense positions in other parks might be.
If he puts up any over the weekend, I won't be doing any conversions, as I'll be away from my spreadsheets.
The saddest day for any saberist.
:)
I think he had top 10s in the Bill James Handbook, but I haven't heard of any full list like in the Fielding Bible.
Thanks, Dan. ZR, UZR, PMR, Fielding Bible, Davenport, Range, Win Shares. Are there any systems that I'm missing?
It doesn't look like the Orlando Hudson popup stuff has as big an effect this year, though he did not rank well by ZR, so I wonder if the difference between his ratings by those two methods may be attributable to balls in the air.
those are your runs, right? Not mine? I thought I have Griffey as worse than -11.
I'd rather include information than exclude it. Liners should be included, because catching line drives goes to positioning, which is an important aspect of defense. As for popups, I think that the number of truly discretionary popups is overestimated and that the primary fielder can be identified for most of them. On a popup, the fielder with the best angle will typically be the one who makes the play. On most short infield popups near the pitcher's mound, the 3B makes most of them on the left side, the 1B makes most of them on the right side. On most infield popups near 2B, the fielder who has to backpedal the least typically makes the play - if the 2B is coming at an angle while the SS is mostly going back, the 2B will normally catch it.
-- MWE
He uses a cut-off of 1000 BIP.
The players who don't reach that cutoff appear to be collectively well below average - I think LAWoBH had said the shortstops are apparently 97 runs below average as a group.
OK.
those are your runs, right? Not mine? I thought I have Griffey as worse than -11.
You did have him at -11 for actual runs; it's -18 over 150 games. I'm comparing the actual numbers, as the per-150 figure I do for the PMR conversions is an estimate.
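To show how the two Griffey numbers relate, here is a rough sketch that assumes the per-150 figure is a straight proportional scaling by games played. That assumption may not match how the rate was actually computed; the function names are made up for the example.

```python
def per_150(actual_runs, games):
    """Scale actual runs to a 150-game rate (assumes simple proportionality)."""
    return actual_runs * 150 / games


def implied_games(actual_runs, rate_per_150):
    """Games played implied by an actual total and a per-150 rate."""
    return 150 * actual_runs / rate_per_150


# The two figures from the comment: -11 actual runs, -18 per 150 games.
games = implied_games(-11, -18)

if __name__ == "__main__":
    print(f"implied playing time: about {games:.0f} games")
    print(f"round trip: {per_150(-11, games):.1f} runs per 150")
```

The gap between the two figures is just playing time: the fewer games behind the actual total, the more the per-150 rate stretches it.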
Well, per the link in post 30, it doesn't like Jeter much. Clearly there's a Red Sox bias in these numbers.
Though it does put Alex Gonzalez lower than many Sox fans would have had him for last season. (My own sense is that his defense suffered due to injury in the second half.)
I don't think so at all. Mostly because all you track are ones that are caught, and the distribution is completely random. Including them is wrong - it completely ignores chances.
As for popups, I think that the number of truly discretionary popups is overestimated and that the primary fielder can be identified for most of them.
Not really, because the *pitcher* is the primary fielder. There are ALWAYS two fielders who can make the play on IF popups.
on a popup, the fielder with the best angle will typically be the one who makes the play. On most short infield popups near the pitchers' mound, the 3B makes most of them on the left side, the 1B makes most of them on the right side. On most infield popups near 2B, the fielder who has to backpedal the least typically makes the play - if the 2B is coming at an angle while the SS is mostly going back, the 2B will normally catch it.
While it is true that is who *usually* makes the play, the problem is you cannot distinguish that very well in the data. Popups can almost always be caught by 2 or more fielders, and thus should be excluded.
Pitchers, catchers, and first basemen are up.
3. LA Waterloo of BH, Registered Hollywood Contrarian Posted: January 26, 2006 at 07:45 PM (#1839788)
There were 27 CF’s that appeared on last year’s and this year’s PMR list. Using the Original PMR Method (as opposed to the Smoothed Visitor Method Pinto presents as an alternative), and using the conversion to Runs Per 4000 BIP, Excel gives a correlation of .42509 for the group between the years ... I don’t really know how that compares to anything else in the world.
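For reference, the year-to-year check described above is an ordinary Pearson correlation between each player's rating in one season and the next. A minimal sketch follows, written out by hand so it does not depend on any library; the four sample ratings are invented just to exercise the function and are not the 27 centerfielders' numbers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented runs-per-4000-BIP ratings for the same players in two seasons.
year1 = [12.0, -5.0, 3.0, -10.0]
year2 = [6.0, 2.0, -1.0, -8.0]

# Share of variance in one year "explained" by the other, for the
# correlation quoted in the comment.
r_quoted = 0.42509
r_squared = r_quoted ** 2

if __name__ == "__main__":
    print(f"sample r: {pearson(year1, year2):.3f}")
    print(f"r^2 for the quoted correlation: {r_squared:.2f}")
```

Squaring the quoted figure gives roughly 0.18, so one season's rating would account for under a fifth of the variation in the next season's, at least for this group and this method.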