Baseball Primer Newsblog— The Best News Links from the Baseball Newsstand
Monday, August 17, 2009
Look, I like statistics. I’m friendly with stats-minded folks at Baseball Prospectus and ESPN. I have a bookmark for the Fangraphs website. I grew up reading Bill James and I’ve written some nice things about him over the years.
But I also spend my workdays talking to people who know baseball a whole lot better than I do, because they play it or coach it at the highest level. And I don’t think I’ve met anyone in uniform who gives credibility to defensive statistics. When one stat (UZR) says Teixeira has negatively impacted the Yankees in the field, well, it’s hard to take that stat seriously.
And much like Lester…Pinto leaps in
Tyler Kepner just went down a notch in my opinion. In this post he’s talking about the defense of Jeter and Teixeira:
It’s pretty simple. Jeter knows what he’s doing in the field. He’s not great, but he’s good enough. He doesn’t hurt his team on defense and never really has. Teixeira is an excellent fielder who helps the team a lot in the field. Maybe all that stuff just can’t be measured. The people who play the game don’t think it can be, and they know a whole lot better than those of us who don’t.
Right, so all the batted ball data is meaningless. All the plays Jeter hasn’t made over the years didn’t hurt the Yankees, because some coach can’t see it. Tyler, go back and watch the 2002 ALDS and count the number of ground balls Jeter didn’t field. Then tell me what your eyes say.
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. birdlives is one crazy ninja Posted: August 17, 2009 at 05:55 PM (#3295108)
Well, if you bookmark it, you must understand and appreciate baseball metrics. And speaking of Fangraphs, they respond to Kepner, although not specifically to this blog post.
http://www.fangraphs.com/blogs/index.php/seeing-and-uzr-and-teixeira
Remember that UZR doesn't include a lot of 1B defense. It says nothing about handling throws, and, IIRC, doesn't deal with pop-ups well.
A 1B can have a negative UZR and still be a plus fielder.
Sample size, for one. You need a larger sample size for defensive numbers. 4 months of defensive data is like a month or two of offense data and one can easily see the fluctuations there. Players are so close (thanks to being pre-sorted to positions) to each other that a few extra hard-hit balls or bad bounces can change things quite a bit in 2/3 of a season of defensive numbers.
While people will think of a "30 homer guy" as someone who will probably hit 20-40 home runs or so in a year, people also need to get used to a "+5 defense guy" sometimes being 0 and sometimes being +10. But when a 30-homer guy "only" hits 20 home runs, those missing 10 homers do hurt the bottom line, even if the player's better than that.
Alex Rodriguez sometimes hits 35 instead of 45 home runs, and Mark Teixeira is sometimes a run below average instead of 5 above. I'm not sure what the big deal is. It's silly to think a +5 fielder is going to be +5 every season.
A quick rule of thumb is about a 3:1 ratio - consider 3 years of defensive data worth a single year of offensive data. A +5 fielder can easily be -1 over the course of a single season.
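To put a rough number on that last claim, here is a back-of-the-envelope sketch. It assumes single-season defensive ratings scatter normally around true talent with a standard deviation of 7 runs. Both the 7-run figure and the normal shape are assumptions chosen for illustration, not properties of UZR itself, and the function name is made up.

```python
import math

def prob_at_or_below(observed, true_talent, noise_sd):
    """P(rating <= observed) under an assumed normal noise model."""
    z = (observed - true_talent) / noise_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed inputs: a true +5 fielder, 7 runs of single-season noise.
p_one_year = prob_at_or_below(-1.0, 5.0, 7.0)

# Averaging three seasons shrinks the noise by sqrt(3),
# assuming the seasons are independent and talent is constant.
p_three_years = prob_at_or_below(-1.0, 5.0, 7.0 / math.sqrt(3.0))

print(f"one season: {p_one_year:.2f}, three-season average: {p_three_years:.2f}")
```

Under those assumptions a true +5 fielder posts a -1 or worse in roughly one season out of five, while a three-year average that low would be far less common. Change the assumed noise and the answer moves with it.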
I thought you needed at least two years of data before you can draw a confident conclusion about a player's defensive ability.
Some people find even that insufficient.
MGL, who is something of a standard-setter in creating defensive metrics, has also discussed the problems with defensive metrics. One is that, even for regular players, the sample sizes are a lot smaller than for offensive metrics. The other is observational error -- what gets recorded as a catchable chance may not, in fact, be one. In less than a season, given already small sample sizes and fluctuations, it would not take much in the way of observational error to skew the numbers.
Then you have the problem that defensive ability is no more static than offensive ability. As you gain knowledge, your reflexes and speed may go down. An injury that slows you or affects your reflexes can have a big impact.
At the end of the day, the only thing the defensive metrics can tell you is whether, in that year, within a larger margin of error than for offensive statistics, you appeared to make more or fewer plays than others at your position. A consistent pattern over multiple years, confirmed by multiple metrics, is almost certainly significant. When the metrics disagree, and/or tend to fluctuate, then you have to take them with an even larger dose of salt.
As to Teixeira, by my eyes his play this year has been above average, but not exceptional. His range doesn't wow you, but he certainly looks much better when put against Giambi.
Sure you have. Unfortunately, there are tons of people in the game that give credibility to fielding percentage.
As for UZR, as MGL will tell you, it says little in a small sample. Tex is usually pretty good in UZR. This year he rates at -0.8, but last year was +10.6. He's likely a good to very good fielder and has likely been a good fielder this year as well, but for whatever reason hasn't had a great UZR so far. Could be measurement error, or a small sample skewed by plays the system thinks were easy but were actually hard, blah blah. We've been over this a million billion times.
I'm certain that (a) what the defensive stats tell us about Teixeira in 2009 isn't necessarily going to line up with what they've said in prior years, and (b) UZR doesn't capture everything a 1B does on defense. What that means regarding his 2009 performance, I don't know. It certainly means you shouldn't extrapolate from his 2009 numbers to judge his ability. But that's not what people are trying to do.
Right, but the UZR for 2009 doesn't necessarily tell you about his defensive performance, for the reasons outlined above. Best we can do is infer his 2009 performance, based on his recent UZR (including 2009 UZR) and his reputation. Given that:
- his 2008 UZR was so high.
- his reputation is very good.
- his 2009 UZR is basically average.
I am comfortable guessing that his actual defensive value this year is pretty good. Probably in the +5 zipcode(*) if I had to give a number to it.
As has been written before, when it says his UZR is -1 or whatever, it's really saying: "His defensive value this year is probably between -6 and +4" or some such.
* hehehe.
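For what it's worth, the "+5 zipcode" guess can be reproduced with simple arithmetic. The sketch below just blends the two UZR figures quoted earlier in the thread (+10.6 for 2008, -0.8 for 2009 to date). The weights are assumptions, the helper name is hypothetical, and reputation is not modeled at all, so treat this as an illustration of the reasoning rather than anyone's actual method.

```python
def blended_estimate(ratings, weights):
    """Weighted average of seasonal ratings; the weights are the analyst's choice."""
    total_weight = sum(weights)
    return sum(r * w for r, w in zip(ratings, weights)) / total_weight

# UZR figures quoted in the thread: +10.6 (2008), -0.8 (2009 to date).
uzr = [10.6, -0.8]

# Assumption 1: treat both seasons as equally informative.
equal = blended_estimate(uzr, [1.0, 1.0])

# Assumption 2: count the partial 2009 season as two-thirds of a full year.
partial = blended_estimate(uzr, [1.0, 2.0 / 3.0])

print(round(equal, 1), round(partial, 1))
```

Equal weights land at about +4.9 and down-weighting the partial season lands near +6, so a guess in the neighborhood of +5 is consistent with either choice.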
And that's probably an additional reason Yankee watchers are having trouble reconciling their opinions of Teixeira with the numbers. They're comparing him to the awkward-looking Giambi, which may skew their view of Teix's contributions or how he truly rates against others at the position.
It reminds me of a column I read on ESPN some years ago. The author was talking about goaltending statistics in the NHL. His points were:
1) GAA (goals against average) is flawed because it doesn't take into account the number of shots the goalie faced or the quality of his defense. He is 100% correct and I nodded as I read along.
2) Save% is flawed because it doesn't take into account the quality of shots the goaltender faces (e.g., the quality of his defense). Again he is 100% correct and I nodded as I read along, anticipating his conclusion.
His conclusion? Throw both out the window and judge the goalie based on wins and losses. Now there's a does-not-compute...
But I digress. Obviously UZR is flawed. Every metric is flawed. But just because the best tool we have isn't perfect, doesn't mean we should use some other tools that are even less perfect, simply because they've been around longer. A lot of people are of the opinion that we should not switch from one tool to the other unless the new tool is flawless. I don't understand that logic.
And people seem to have a problem recognizing that good defensive players can underperform their talent in a limited number of chances, even though we know that players go through similar things on offense. It is very possible that both the observers and UZR are right - Teixeira may well be a talented defensive player who simply didn't get to as many balls as his peers did over a short period of time.
Those decimal points are part of the problem.
You mean in the sense that it presents a level of precision that isn't there?
[20] It doesn't for me, but sure.
And I shouldn't know that.
Is Coco Crisp the reason many Sox fans don't accept Ellsbury's -11 UZR this year (or -6 per 150 for his career)? A lot of these numbers are screwy compared to what one sees on the field on lots of teams and I think it has little to do with any team's previous players.
When the standard error of your metric is ~7 runs it annoys me to see it presented in the #.# format.
Of course the same objection applies to any offensive stat.
I doubt it. Ellsbury's been playing CF in Boston for parts of three seasons. But you'd probably find people in Boston who would insist that Bay is better than his numbers for this reason.
More important, I didn't say it was the only reason, just a possible factor. Teixeira is following a guy that by most accounts is a legitimately bad defender. Even if Teix struggled by his standards, he'd likely still look considerably better than the guy he replaced, possibly skewing the perception of his performance.
Is that really a controversial position CP?
No. Bay is terrible. He can run, and he tries hard, but he's always getting very late jumps and taking bad angles to balls, and he doesn't throw all that well. He's really only better than Manny because he gives a better effort.
I think at the very least UZR should cause people to consider the possibility that their subjectively formed opinion might be wrong. Given the nature of defensive metrics, that's about as far as you can take it. It's a part of the picture, and nothing more.
The main problem with evaluating a player's defense relative to another's has always been opportunity. How many times do you get to see the 30 starting first basemen for each team play? If you're a scout, that's one thing.
But schlubs like us just don't get to see 30 first basemen play 100 times a year. So really, we don't even have a subjective basis for opinions we form with our eyes. That's why we need the 'objective' (objective/subjective is a matter of degree and not kind) data in the first place to form any opinions backed up by evidence.
You're speaking for all of Beantown now, huh tjm? ;-)
Unfortunately.
1. Not everything a 1B does is captured in UZR.
2. Sample size produces large fluctuations. ("4 months of defensive data is like a month or two of offense data and one can easily see the fluctuations there.")
3. It's not that Teixeira got worse; rather the rest of the league might have gotten better.
4. Observational error.
I concede 1, but Teixeira's prior years are subjected to the same issue. His drop in UZR this year is a reflection of differences between 2008 and 2009, and UZR didn't account for this stuff in either year. I suppose one could argue that he's doing even better than usual in all the ways UZR doesn't measure, but I haven't seen anyone make that argument.
On 2, as Szym notes there's a lot of fluctuation between hitting stats over a month and hitting stats in the long haul. But if a hitter has a .120 average in a month, nobody cites "flaws with batting average" as the reason. If you're measuring that hitter's performance, .120 is an accurate representation of his results. Maybe he hit a lot of "right at 'em" shots, or maybe he was up against abnormally good pitching/defense that month. And maybe it'll be the only month under .280 in his career. But a .120 average is a .120 average; he didn't really hit .280 that month. In short, Teixeira's 2009 UZR could be authentically low.
Regarding 3, if the entire league except Teixeira got better at 1B defense in 2009, that works against him, no?
One could chalk up the entirety of Teixeira's low UZR this year to the effect of 4, I suppose. As you point out, based on past performance it seems reasonable that there must be some of this. It seems to me that some folks are very quick to judge the effect of #4 as being the whole story. It's likely not.
I don't think it's controversial but I don't think it's right. The Yanks have had more than their share of bad defenders over the years, but I still think that anyone who watches enough games will get a good sense of the approximate standard of play at each position by watching the other guys on the field. Because you get to see so many other guys play the field while watching your team, I don't think that being subjected to a horrible defensive player like Giambi or Soriano or Manny has much of an effect on a fan's subjective view of what passes for good and bad defensively.
Sorry, I'm cranky, between the defensive metrics conversations going on and the MVP conversation, I'm reading too many people certain of things that aren't clear and it's bugging me. I shouldn't have responded to you like that. I should probably take a day or two off from the boards.
Slacker. Pujols leads all of baseball in OBP (0), SLG (1) and OPS (1).
It could be, yes. But it could be flaws in the system.
With batting average, we have two very rigid components over which there is little quibble: hits and at-bats. Simple. No real way that can get screwed up unless someone has a bizarre run of luck where he keeps ROE'ing and the official scorer keeps giving him hits for some reason.
In UZR, the components are "actual outs" and "estimated outs that he should have had" or some such wording. The first of these components is pretty rigid. The second is most definitely not.
Suppose Tex gets a sharp liner shot at him and he has to make an extraordinary dive to catch it. Maybe UZR doesn't realize how difficult a play it was (because it doesn't capture batted ball speed, or positioning.) UZR "thinks" it's a play that is made 99% of the time and gives Tex little credit. In reality, it's a play that is made only 10% of the time and Tex deserved more credit.
Over the long haul, we expect these sorts of things to "even out." But over the small samples we have, a few plays like this can easily skew the numbers. That's the problem.
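The diving-catch example can be written out as a toy calculation. This is a deliberately stripped-down "outs made minus estimated outs" tally, not the real UZR algorithm (which works in runs and by zone). The function name and data layout are made up for illustration, and the 99% and 10% figures come straight from the example above.

```python
def plays_above_expected(chances):
    """Sum of (out made - estimated out probability) over a fielder's chances.

    Each chance is a (made, p_out) pair: whether the out was recorded and
    the system's estimate of how often that ball is turned into an out.
    """
    return sum((1.0 if made else 0.0) - p_out for made, p_out in chances)

# The same diving catch, scored two ways.
as_recorded = plays_above_expected([(True, 0.99)])  # system thinks it was routine
as_it_was = plays_above_expected([(True, 0.10)])    # actually a 10% play

print(round(as_recorded, 2), round(as_it_was, 2))
```

One misjudged play is worth 0.01 plays of credit instead of 0.90. Over a full career that washes out if the misjudgments are random, but in a few months of first-base chances a handful of them is enough to move the total.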
The problem isn't simply measurement error. It's that, unlike offensive stats, in which the denominator is a known quantity, with defensive stats the denominator is an estimate. Thus, and MGL would say this too, UZR unequivocally does not say what a player's defensive value for a given year is.
I maintain that all of this is a false dilemma. Over the long haul, UZR and your eyes should be consistent. The problem isn't with our ability to make observations; the problem is with the recording of those observations. If you want to know whether Teixeira has been good this year, go back and watch the condensed versions of every single Yankee game this year. If you do this, I am quite certain you'll be able to come up with a good assessment of how good he is. Then, since defensive numbers should be banded anyway, you could give a very educated evaluation of whether he is zero, +5, or +10. This number will be representative of Teixeira's actual value this year. The problem is that nobody is doing this.
[Edit] I'll add that if you think Teixeira is +10 defensively, it should be possible to point out with your eyes exactly where this value came from. If you can't do this, then you don't know that Teixeira is +10 defensively.
Too true. Of course what I was talking about was expressing the results of that batting average in runs. Since the standard error would be something close to 13 runs, I'd probably be annoyed if the results weren't rounded to the nearest 5 runs. (not of course that anybody would bother)
BA itself? I ... accept the 3 digit presentation even if it does give a false sense of large differences. (I know that's written poorly but I can't figure out a simple way to explain. But try listing the BA leaders to 2 decimal places. My sense is that they look more tightly bunched when displayed to 2 decimal places even though there's no actual difference.)
It could be (not saying that it IS, mind you). He's in front of different scorers now, and when you are talking about a relatively small number of balls in play in the 1B vicinity, moving a handful of boundary chances from a zone further away to a zone closer, or having a more restrictive definition of what constitutes a hard-hit ball, could very easily make a big difference.
-- MWE
I disagree. People's standards are based upon what they see the most, and what they see with their "heart" when they are dealing with the home team. Cardinal fans bad-mouth Duncan's defense, and yet Holliday isn't any better, and his arm is vastly inferior (and that is saying something, as Duncan had an average arm), yet getting any Cardinal fan to admit this is almost impossible. Anyone outside of New York who has watched Jeter play defense for any amount of time can flat out admit his defense is poor; how long did it take New Yorkers to even begrudgingly admit he may be only average? (Heck, you still can't get a lot of them to admit that. Just watch him play for one series and you will find at least one play that he missed that an average guy gets to; rarely, you'll also find a play where his arm makes up for it and allows him to get an out where most shortstops wouldn't.) Tex is getting mad props because the fans are arguing with their hearts, and he is genuinely a very good defender, and in comparison to previous first basemen, he has a lot more mobility.
That, and it's also not accurate either, so you are just adding estimates on top of estimates. Fans know that if a fielder, with two outs and a man on third, makes a diving stop to get the force out at first, that is one run saved by the fielder (of course that misses how the guy got into position to score in the first place, or it relies on the out situation; in a one-out situation it wouldn't have saved the run). But this is what we as fans feel we know while watching the game, and when you see Albert Pujols on 5 different occasions play in and throw the guy out at the plate with no outs, telling me that it's only 1.4 runs saved (or whatever) is not intuitive and just seems wrong. It's why people make the comment that Andruw Jones probably saves 100 runs a year with his glove: the situation doesn't matter, it's the result that they remember and credit to the guy (unless he made a bad play to set it up).
and anyone that trumpets their system has to acknowledge the emotional impact of what a particular play feels like and explain why that isn't accurate when describing their system, or describe how their system takes it into account.
MGL's response to this would be amusing. I'm sure he'd say doing so would be like trying to teach nuclear physics to dogs. Though maybe he could come up with something more polarizingly offensive. The man has a gift.
Either way, given the nature of defensive metrics, I think convincing those people is pretty far down the line in terms of progress. If you're trumpeting the value of your own defensive metric, you have a hard enough time convincing the people who are speaking your language that it has merit, much less the kinds of folks you reference.
Great discussion. It seems to me that every player is subject to those flaws. I mean, the system wasn't designed to uniquely job Tex. So if the 2009 numbers say he is below average or this or that, then I agree that's what the numbers say. You just have to acknowledge that they say that within some kind of margin of error. But to say, well, there are flaws in the system and to uniquely adjust Tex without adjusting anybody else would seem to me to be adding error, not subtracting it.
MGL:
This is all true, of course. The question, though, is what the typical errors are. If 5% of the time, UZR will be off by 5 runs, then we'd expect about one starter in the majors to be off by that much. Or does it have even larger errors? These sorts of things are extremely difficult to estimate for this kind of metric.
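The arithmetic behind "about one starter" can be spelled out under one reading of it: 30 starters at a position, each with an independent 5% chance of a miss that large. Both the count of 30 and the independence are assumptions for the sketch, and the function names are made up.

```python
def expected_big_misses(n_players, p_big_miss):
    """Expected number of players whose rating is off by the stated amount."""
    return n_players * p_big_miss

def prob_at_least_one(n_players, p_big_miss):
    """Chance that at least one player is off, assuming independent errors."""
    return 1.0 - (1.0 - p_big_miss) ** n_players

# Assumed: 30 starters at one position, 5% chance each of a 5-run miss.
n_starters, p_miss = 30, 0.05

print(expected_big_misses(n_starters, p_miss))
print(round(prob_at_least_one(n_starters, p_miss), 2))
```

On those assumptions the expected count is 1.5 per position, and it is more likely than not that at least one starter at any given position is mis-rated by that much in a season. Which player it is, the numbers alone can't say.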
If you watch a bit of cricket, as I do, you'll realize that DL is making an important point here. The mere presence of a player actually adds defensive value in that zone. In cricket, you can add or subtract defensive value by moving your fielders around. Since a batsman can hit into the 360 degree arc, there's always a gap somewhere to exploit. In baseball, most players hit into a much more narrow arc, and over a hundred years of experience has created the optimum fielding positions, which players occupy almost all the time, except in the cases of extraordinary shifts like that against David Ortiz.
Thinking about the contrasts and similarities in defensive actions in baseball and cricket, I increasingly find myself wondering whether UZR and similar complex defensive statistics actually add a whole lot of understanding we couldn't actually achieve with simpler methods. Is something along the lines of the Win Shares defensive system actually close to the optimum trade-off between complexity and accuracy?
On another tack, I would really love to see someone carry out this experiment. I don't have the skills to do it myself. What if we start evaluating fielders in pairs? 1b-2b, 2b-ss, ss-1b, ss-3b, 3b-1b, lf-cf, cf-rf. Has anyone ever tried that?
Because Chris was a bear swatting bees with a honeypot on his hand.