I have spent my professional life in the print world, where intelligenceities don’t see the light of day!
But now comes a fellow who is not only going to make O’Connell’s job easier, but he will also eliminate it altogether.
His name is Sean Forman, and he is the creator of the Web site, Baseball-Reference.com. He also has a method of selecting the most valuable players and Cy Young award winners that he believes is far more reliable than the writers’ voting.
...Without getting into details, Forman presents a formula by which m.v.p. and Cy Young candidates can be ranked. The result is a number that is assigned to each player – Robinson Cano 6.3, Evan Longoria 6.2, Miguel Cabrera 6.0, using A.L. m.v.p. as an example.
Integral to those numbers is something called WAR, which stands for wins above replacement. What replacement? A replacement player, of course, but he’s mythical.
Statistics zealots apparently love to deal with mythical or hypothetical players. The problem for those of us who prefer dealing with reality and actual human beings is we can’t buy into the idea of using mathematical formulas instead of real players.
However, given that many BBWAA voters seemed to buy into the formula stuff in the Cy Young voting last year, it may not be long before they vote for all of the awards on the basis of WAR (no intangibles needed). But if they are going to vote on the basis of WAR, who needs voters?
O’Connell can simply ask Forman to e-mail him the final WAR numbers, and then he can stand on the dais at the New York baseball dinner each January and present the m.v.p. and Cy Young awards to a computer.
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
"Yes, WAR is a great stat, but there are so many assumptions and calculations that go into it, we might not be sure of the margin for error. So if one guy has a WAR of 7 and another 6, the guy with 6 might have been better. And furthermore, given any uncertainty in how accurate WAR is, it might make sense to look at team wins to see who is the most valuable. Team wins is something we can actually observe and may provide some additional, useful information."
So in looking at the leaders in WAR from the 2007 NL none of the top 3 guys played for playoff bound teams. So if you require that for your MVP (I am not a big fan of that), then Holliday is the highest. Not terribly far ahead of Rollins but any bigger and I start to wonder how Holliday could not win it. My first place vote would have gone to Pujols
1. Pujols (STL) 8.3
2. Jones (ATL) 7.9
3. Wright (NYM) 7.8
4. Holliday (COL) 7.3
5. Utley (PHI) 6.6
6. Rollins (PHI) 6.1
7. Ramirez (FLA) 5.6
Tulowitzki (COL) 5.6
9. Reyes (NYM) 5.4
10. Beltran (NYM) 5.3
Batting average does a great job of what it's measuring. That's not the point. The point is, the person who leads the league in batting average is called the "Batting champion", not the "Batting Average Champion" or the "Champion of most hits per times at bat excluding walks, HBP, and sacrifices." When Bill Buckner is the "batting champion" of a league which has a guy creating 3.5 times as many runs with the bat as him*, there's something amiss.
*Buckner 14.5 batting runs, Mike Schmidt 53.6, per BBREF. And that's probably far from the most egregious example.
And here's one more cheer for the Edit function, eh, Banta?
EDIT: I see you caught it pretty quickly. We've all been there many times.
It was all in the inflection. I'm not saying, I'm just saying.
Then why even have individual awards if you want to reward team performance? Or put another way, why the big charade that these awards are honoring individuals when they are rewards for the performance of a group of people?
Because that's what they are honoring. A player's team and the "narrative" provide the context for his accomplishments, and may serve to enhance or diminish his purely statistical accomplishments in the eyes of the voters, but it's the player himself who's being honored.
But let's get down to cases: If they voted the CYA to Sabathia over Hernandez, or vice versa, would you consider that whatever value you may ordinarily assign to that award would be significantly diminished? Do you really think that a relatively small statistical difference should overrule any other consideration, such as the pressure circumstances of a race for the postseason?
I emphasize "relatively small" to make it clear that I'm not suggesting that a 1985 Gooden should have been bypassed for John Tudor, just because the Cardinals edged out the Mets for the NLE. I'm talking about differences on the order of Sabathia and Hernandez.
And I addressed this point. The problem is people's misuse of the stat, not the stat itself. The solution isn't to get rid of the stat, which was the point that I was addressing and one that crops up on the site with surprising frequency (here it was suggested it needed to be relegated "to the dustbin of history"), but to put it in its proper perspective and give it its proper weight. Which, of course, is happening, holdover nomenclature notwithstanding.
Not everybody.
In 2001, I remember a lot of people here and/or at usenet were excited about Barry going for a .500 OBP, because it had not been done in over 40 years. The Home Run extravaganza overwhelmed that little achievement, as did the numbers that came in the next 3 years, but it definitely was a point of interest that year.
Seconded. One of the dumbest mainstream statistical rules,
So he, at least, was making a difference of less than 1 WAR significant. Further, he doesn't mention that a different measure of WAR had Halladay as the only pitcher as valuable as any of the 4 hitters, and he was only barely so.
Because batting average is style points. A player's batting value is almost entirely a function of their OBP and SLG.
But then I know you know that Tom. So I assume your point has to do with attaching value for passing through any arbitrary threshold. No disagreement there
But as long as you have OBP and SLG BA provides no meaningful information.
Now you can break things down so that you have BA, IWR (OBP-BA) and ISO (SLG-BA). If BA actually mattered, breaking things down this way would make for a more effective model of team runs scored. It doesn't. Neither the standard error nor the correlation is substantially affected.
Tom Tango has demonstrated to my satisfaction that this is not true in extremely high run scoring contexts -- BA does appear to matter in (say) Coors in the late 90s, while in the Astrodome in 1968 if two players have the same OBP and SLG the one with the lower BA is very likely to be the more effective player.
But in the vast majority of cases there's no detectable difference.
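The BA / IWR / ISO breakdown mentioned above is simple arithmetic. A minimal Python sketch, assuming the definitions given in the comment (IWR = OBP - BA, ISO = SLG - BA); the function name and the sample slash line are hypothetical and not taken from any real player:

```python
def decompose(ba, obp, slg):
    """Split a BA/OBP/SLG slash line into BA, IWR and ISO.

    Uses the definitions from the comment above:
    IWR = OBP - BA and ISO = SLG - BA.
    """
    return {"BA": ba, "IWR": obp - ba, "ISO": slg - ba}


# Hypothetical slash line, for illustration only.
parts = decompose(ba=0.300, obp=0.380, slg=0.450)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

OBP and SLG can be rebuilt from the parts (OBP = BA + IWR, SLG = BA + ISO), so the breakdown keeps everything OBP and SLG already say and exposes BA as a separate term. That is what lets a run-scoring model test whether BA adds anything on its own.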
He started out ranking hitters by runs and then by runs per game. Runs per game hung on for decades after Chadwick abandoned them.
And Chadwick's influence had a lot to do with keeping rbi from becoming an official stat.
What if I want to know how likely a guy is to get a base hit when he steps into the batter's box and don't have any interest in doing math at the ol' ballyard? Where should I turn?
Why is it relevant to follow a player's OBP when his value is entirely a function of his WAR? Why would anyone follow the chase for a .500 OBP when it's so much less relevant than the chase for a 10.0 WAR?
Or to look at it another way, is it relevant to follow a player on the path to 60, 70, 74 HRs, when all that value is encapsulated by his far-more-relevant slugging percentage? Does it make any sense to root for a pitcher closing in on 400 Ks, when that value is already captured by his xFIP? The strikeouts are mere "style points," right?
Not only is there a risk with every player that he won't produce at that level, but the risk varies greatly among players.
So at the very least, WAR doesn't measure value above an actual player, but instead value above the numbers a hypothetical player might produce.
My man, I can condescend with the best of them, but have you never seen the sort of reaction that I describe?
Indeed you can, but frankly, C for effort. This is pretty formulaic, your standard m.o.: "Some crazy people take extreme position X. Other crazy people take extreme position not X. I am sensibly in the middle, unlike all those people."
Well, since you're usually Exhibit A for about 10 crazy positions a week, I was trying to be kind by not naming names in this particular instance.
And of course I didn't refer to anyone here as "extreme" or "crazy." All I've suggested is that when it comes to awards based on "value," there's more than one legitimate way of framing the discussion.
---------------------------------
Of course nobody's really suggesting (I hope) that we go to handing out awards solely on the basis of WAR rankings, but when I read people here arguing fervently that there's such an enormous distinction between a 5.1 and a 3.9 WAR that it means that we can't possibly consider voting for the 3.9, I sometimes have to wonder whether or not this is what some of those people really want.
Here's the thing: In Forman's article, he does argue this:
(emphasis added)
So he, at least, was making a difference of less than 1 WAR significant. Further, he doesn't mention that a different measure of WAR had Halladay as the only pitcher as valuable as any of the 4 hitters, and he was only barely so.
Ironically, I might agree with Forman about Halladay, though not solely because of that 6.3 WAR. But I also don't like the idea of pitchers muscling in on the MVP award, since they've got the Cy Young for themselves.
And to your more general point, while I respect Sean's POV as someone who's devoted infinitely more time to thinking about fine measurements of value than I ever have, if his sole basis for value is WAR then I'd also respectfully disagree, for reasons that have already been well argued by many people above.
Except that the act of identifying an arbitrary number of non-strikes (**) has little to nothing to do with the act of hitting a baseball. It's the ability to do the latter that BA, God Bless It, measures.
(**) If it took ten balls to make a walk, there would be so few walks in relation to hits that no one would care about the walks.
Not only is there a risk with every player that he won't produce at that level, but the risk varies greatly among players.
So at the very least, WAR doesn't measure value above an actual player, but instead value above the numbers a hypothetical player might produce.
I would also love to know what "replacement level" pitcher the Yankees could come up with whose efforts would have produced only 4 fewer New York wins than Sabathia's.
Javier Vazquez.
The math's not that tough, but you can skip it -- the assumptions are documented. Sean appears to be using RA+ of 85 as replacement level. And a pitcher with an RA+ of 85 this year who is given 6.21 runs a game to work with (in a mild pitcher's park no less) would be expected to go ~ 14-9 if you actually allowed him to pitch as much as Sabathia has. (Actual W-L record would depend on how well he matched his pitching to the support he had to work with, but you'd expect it to center on 14 or so wins, maybe a tad better because it looks like Sabathia doesn't get diminishing returns support very often -- a concern with very good run support)
Is there a pitcher whose true talent level is an RA+ of 85 and is healthy enough to give you 200 and change innings (through September)? Almost certainly. But we'd have no way of identifying him because he'd be constantly replaced (a guy whose true talent level is 85 will throw in some real stinkers. And will pitch well every now and then). In practice he'd pitch less for the Yankees but would put additional strain on the bullpen. Precisely what that's worth is unclear.
Right now Bruce Chen wouldn't be a bad first cut for a replacement level starter.
The other way to do this would be to start with any (or all) projections systems. Eliminate everybody on the major league roster plus everybody who can be considered a prospect of some sort. What's left are your candidate Phelpsers.
I seem to recall Chris Dial doing something like this a while back. Can't find a link though.
Now if your point in this is that there are simplifying assumptions being made, well sure. That's why the creator of WAR says the debate starts with WAR, it doesn't end there.
Has Sabathia made more effective use of his excellent run support than you'd expect? (a bit -- nothing huge) Does he have more decisions than you'd expect? (not really)
For this season, Javier Vazquez (-.2 WAR) is a close fit for an example. In his 27 starts, the Yankees have won 13 times, six less than in Sabathia's 29 starts. However, there is a pretty large discrepancy in run support -- the Yanks have scored 5.6/game when Vazquez has pitched as compared to 7.0 when Sabathia has pitched. Give Vazquez Sabathia's run support, and the Yanks probably have those two extra wins.
Not to batting average, the denominator of which is not plate appearances.
It'd be better to come up with a formula like
1. Is there a player having the greatest single season of all time?
2. If not, is the player leading the league in wins/RBIs also on the winningest team in the league?
3. If not, is the player leading the league in wins/RBIs on your hometown team?
4. If not, is the best defensive player on the winningest team having a decent offensive year?
I could make up criteria forever, but once any one of the questions is answered yes, the rest of the algorithm doesn't matter. But for WAR and other awards-dowsing metrics created by sabermetricians, they do.
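The "first yes wins" idea in the four questions above can be sketched as a short-circuiting rule list. This is only an illustration in Python: the dictionary keys are hypothetical names invented here for each question, and nothing in the thread says any voter applies the questions this literally.

```python
def first_yes(candidate):
    """Return the number of the first question answered yes, or None.

    `candidate` is a dict of boolean flags; missing flags count as False.
    All key names are hypothetical labels for the four questions above.
    """
    questions = [
        lambda c: c.get("historic_season", False),
        lambda c: c.get("leads_wins_or_rbi", False)
        and c.get("on_winningest_team", False),
        lambda c: c.get("leads_wins_or_rbi", False)
        and c.get("on_hometown_team", False),
        lambda c: c.get("best_defender_on_winningest_team", False)
        and c.get("decent_offense", False),
    ]
    for number, question in enumerate(questions, start=1):
        if question(candidate):
            return number  # later questions are never consulted
    return None


# Hypothetical candidate: league leader in RBIs who plays for the hometown team.
print(first_yes({"leads_wins_or_rbi": True, "on_hometown_team": True}))
```

The contrast with a metric like WAR is that a summed metric lets every component move the final number, while here everything after the first yes is ignored.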
Hence why Murray Chass is considered "uncurious" and Sean Forman is considered a numbers fetishist. Frankly, Chass's system doesn't sound as elegant, but it also feels like how real voters (who only get to make one choice) go about making their decisions.
Give one point each for the following:
100 RBI
.300 BA
Lead league in BA
Lead league in RBI
Lead league in HR
Play for a division winner
Play up-the-middle position for a division winner
Total the points; those with the highest number of points are identified by the system as "contenders". There's a tie-breaker between contenders based on HR, RBI, and BA.
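The point system listed above can be sketched in a few lines of Python. Some assumptions to flag: "100 RBI" and ".300 BA" are read here as "at least" thresholds, the field names are hypothetical, the two sample players are invented, and the HR/RBI/BA tie-breaker is left out because the comment does not say how it works.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One player's season; all field names are hypothetical."""
    name: str
    rbi: int
    ba: float
    led_ba: bool = False
    led_rbi: bool = False
    led_hr: bool = False
    division_winner: bool = False
    up_the_middle: bool = False


def points(s):
    """One point for each criterion in the list above."""
    checks = [
        s.rbi >= 100,            # 100 RBI (assumed to mean "at least")
        s.ba >= 0.300,           # .300 BA (assumed to mean "at least")
        s.led_ba,                # lead league in BA
        s.led_rbi,               # lead league in RBI
        s.led_hr,                # lead league in HR
        s.division_winner,       # play for a division winner
        s.division_winner and s.up_the_middle,  # up-the-middle on a winner
    ]
    return sum(checks)


def contenders(seasons):
    """Names of the players tied for the most points (no tie-breaker)."""
    best = max(points(s) for s in seasons)
    return [s.name for s in seasons if points(s) == best]


# Invented players, for illustration only.
sample = [
    Season("Player A", rbi=110, ba=0.310, division_winner=True, up_the_middle=True),
    Season("Player B", rbi=120, ba=0.290, led_rbi=True, led_hr=True),
]
print(contenders(sample))
```

Note that a division-winning shortstop or center fielder collects two points from team context alone, which is the feature of the system the thread is arguing about.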
Exactly. The player doesn't exist and can't be identified prospectively.
Once you put a number on it like RA+ 85, you can determine the percentile that performance represents within the performance of actual major leaguers. I don't know why that percentile would vary from year to year, but even if it doesn't, what's the reason one would concern himself with that percentile rather than, say, the 60th or 75th percentile? There's nothing magic about any particular level of play that suggests itself as the "correct" baseline performance. A team can't, in real life, obtain "replacement" performance.(**)
(**) The superficial analog is, of course, finance's "excess returns" -- returns over a truly risk-free asset that an investor can actually buy in the marketplace, like a US Treasury bill. WAR is more like credit spreads between risky bonds which tells you important things, none of which, unfortunately, is the ultimate merit of the less risky of the two. Ken Phelps was a junk bond with significant volatility in "return," not a risk-free asset.
This is just SoSH's schtick, defending batting average because it's used as a club to beat idiot sportswriters over the head with. So he ends up making the kind of nonsense comment you see above, when we all know he knows better.
It's true that more and more sportswriters and fans have finally become clued in to the relative worthlessness of batting average in evaluating a player's past performance. But many of them still think that batting average is important in this as well; that's why people are overrating Ichiro and such. It's why people think Sabathia should win the Cy Young in the AL.
As to batting titles and such, I couldn't tell you who the league leaders are, and haven't cared in a long while. I did care about Boggs reaching 200 hits; of course, I was 15 at the time. Now, that doesn't mean there's something wrong with caring about these things, but people still use these things as a proxy for value, when they should really understand that there's a far better way.
You can't actually go into the market place and obtain a player who's "certain" to produce at any level- but you need a baseline
7 runs a game
team is 21-8 in 29 starts 6.21 runs per 27 outs (per BBREF)
a team with league average (100 ERA+) pitching should go 19-10 in 29 games with that run support
a team, with 84 ERA+ pitching should expect to go 17-12 in 29 games where they scored 6.21 per.
84 ERA+ is getting close to replacement level I would think.
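The expected records quoted above can be roughly reproduced with a Pythagorean expectation. The thread does not say which method the commenter actually used, so treat this Python sketch as one plausible reading. Its assumptions: an exponent of 2, ERA+ treated as a straight scaling of league-average runs allowed (a simplification, since ERA+ is built on earned runs and park adjustments), and a league average of 4.45 runs per game, which is a value assumed here and not given in the thread.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(run_support, era_plus, games, league_runs_per_game):
    """Expected team wins in `games` starts for a pitcher of a given ERA+.

    Simplification: ERA+ of 100 means league-average runs allowed, and
    ERA+ of 84 means league average divided by 0.84.
    """
    runs_allowed = league_runs_per_game * 100.0 / era_plus
    return games * pythagorean_win_pct(run_support, runs_allowed)


LEAGUE_RUNS_PER_GAME = 4.45  # assumed value, not taken from the thread

for era_plus in (100, 84):
    wins = expected_wins(6.21, era_plus, 29, LEAGUE_RUNS_PER_GAME)
    print(f"ERA+ {era_plus}: about {wins:.1f} wins in 29 starts")
```

Under those assumptions the two cases round to 19 and 17 wins, in line with the 19-10 and 17-12 figures above. A different league average or exponent would shift the totals slightly.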
1990-2010, pitchers with an ERA+ of 85 or lower and a .550+ W-L%:
Rk  Player            ERA+  W-L%   W   L  IP     Year  Age  Tm
1   Ramon Ortiz         85  .552  16  13  180.0  2003   30  ANA
2   Shawn Estes         84  .652  15   8  202.0  2004   31  COL
3   Jose Contreras      84  .591  13   9  170.1  2004   32  TOT
4   Ricky Nolasco       84  .591  13   9  185.0  2009   26  FLA
5   Mike Moore          82  .591  13   9  213.2  1993   33  DET
6   Esteban Loaiza      82  .588  10   7  183.0  2004   32  TOT
7   Cliff Lee           80  .636  14   8  179.0  2004   25  CLE
8   Kirk Rueter         80  .600  15  10  184.2  1999   28  SFG
9   Ben Rivera          79  .591  13   9  163.0  1993   25  PHI
10  Shane Reynolds      79  .550  11   9  167.1  2003   35  ATL
11  Brett Tomko         78  .591  13   9  202.2  2003   30  STL
12  Braden Looper       77  .667  14   7  194.2  2009   34  MIL
13  Ismael Valdez       77  .609  14   9  170.0  2004   30  TOT
14  Mark Hendrickson    74  .579  11   8  178.1  2005   31  TBD
wow, the Brewers scored 6 runs a game for Looper? and 4.5 for Gallardo...
So yeah, given CC's run support I can see his being swapped out with a RP level pitcher having only a 4 game difference... of course the effect (if any) on the bullpen would be hard to quantify
so I'll add this:
"Is there a pitcher whose true talent level is an RA+ of 85 and is healthy enough to give you 200 and change innings (through September)? Almost certainly. But we'd have no way of identifying him because he'd be constantly replaced (a guy whose true talent level is 85 will throw in some real stinkers. And will pitch well every now and then). In practice he'd pitch less for the Yankees but would put additional strain on the bullpen. Precisely what that's worth is unclear."
Or perhaps you should ask, are there pitcherS (plural) who can combine to reliably hurl 200 ip at 85 RA+?
sure, take http://www.baseball-reference.com/players/f/figuene01.shtml
and a couple other AAAA lifers maybe a few reclamation projects etc... maybe 85 is a bit high, but if you can't get 80 your team needs a new GM and manager.
Then why would anyone speak in terms of "replacement player" and value above that? And proceed to construct an entire edifice of analysis atop the concept?
As to baseline, how about 65th percentile?
Why do people assume that every contemplation of data or statistics by a fan has -- or should have -- the aim of determining a player's "value"?
Hitting .400 is cool and the next guy that does it will enter an extremely selective club.(**) There need be nothing more to it than that. If some obsessive moralizer wants to be all schoolmarmish and say "Buh buh buh buh but he only walked 55 times," well, so be it.
(**) Both George Brett and Tony Gwynn would have fit very well.
In the minds of most sportswriters, Ichiro is a HOFer because of the batting titles and hits.
If he had the same value but hit .280, he would not be nearly so strongly supported.
Actually, real MVP voters get to make ten choices. In fact, they have to make ten choices.
Except the metrics -- WARP, VORP -- and the like are framed in terms of replacement player, not replacement level. And I shudder to calculate the number of potent brain hours that have been spent in arguments whose fundamental thesis is "Why did [Team X] spend $10 million on [Player X], who's barely above replacement?" (**)
(**) To which the generally correct answer -- that the performance of Player X is subject to far less uncertainty -- is practically never even mentioned. Ken Phelps was Ken Phelps in large part because of that uncertainty.
OK. And?
If he had the same value but hit .280, he would not be nearly so strongly supported.
The conclusion regarding his value would be premised on the highly dubious proposition that a walk is as "valuable" as a hit. As well as the even more dubious proposition that observers of the game should rate the ability to identify an arbitrary number of non-strikes as highly as the ability to hit a baseball.
According to BB-ref, Ichiro has finished in the Top Ten in MVP voting four times, and in the Top Ten in WAR four times. He's led the MVP voting once, and he's led in WAR once. If you believe in WAR, the sportswriters have him pegged pretty accurately.
if those seasons matched up...
Am I reading WAR's fielding runs wrong, or does Ichiro have seasons where he was saving more runs than Ozzie Smith with the glove?
Unless I'm completely reading this wrong WAR seems to rate Ichiro's runs from fielding on par with Smith's.
No. If a team spends $10 million on [Player X], who's barely above replacement, there is no "generally correct answer"; there could be any of a number of different answers:
[Player X] is not barely above replacement, he's better than the question gives him credit for
[Player X] was paid $10 million because the GM just made a big mistake because he valued "certainty" far too highly
[Player X] was paid $10 million because the GM mis-evaluated him and thinks he's better than he is
[Player X] was... etc etc etc
Ken Phelps was not Ken Phelps because of uncertainty, Ken Phelps was Ken Phelps because he was defensively limited and was a victim of poor talent evaluation by the Royals (Pete LaCock?) and really bad luck (winning one job but losing it due to an injury in ST, etc. etc.)...
what's really amazing is that Phelps is regarded as being slow and lumbering- but he actually lost PT to 1Bs who were even slower- Aikens and Al Davis
Phelps obviously didn't age well, and probably spent his peak in AAA (.333/.469/.706 in the American Association?)... but he did have a career.
The use of BA as a proxy for value is idiotic, which I've acknowledged more than enough times for you Ray. And, of course, virtually no one uses it as such (how often is the MVP the batting champion, for example?).
My point, which is my goddamned schtick, is that baseball stats don't simply exist to help us divine value. They provide information. They can help tell a story. They provide a link to the past. All of this stuff is important to baseball fans who don't give a rat's ass about how much better than replacement level Player A is compared to Player B. You know, most of them.
My schtick on this issue is not to defend idiot sportswriters (or anyone else) who misuse BA (which I made abundantly clear on Page 1 of this thread). It's to knock it into the head of idiot statheads or stathead wannabes who don't seem to realize that baseball stats are more than just figures to plug into more sophisticated metrics in a never-ending quest for a more precise measurement of value, for whatever the hell that's worth. And each time some ####### says we need to get rid of wins or get rid of errors or get rid of BA (which is done frequently and earnestly around here), you can bet your ass I'll be back with my actual schtick almost as fast as you'll dive headfirst into a steroids thread.
As for the comment above that sparked your reply, there's nothing dishonest about it (though I apologize for my shorthand definition that sparked battlekow's snark). I like to know what a guy's BA is. I like to know if that .380 OBP is the result of a TTO guy or a line drive machine. Is the .450 SLG the product of a lot of hits and little power, or a low BA slugger? I like baseball the pastime, not simply the academic pursuit.
So yes, batting average does indeed tell me something useful Ray. That's where you and BA differ.
This. My Lord, this.
They do it "poorly" only with respect to unproven perceptions of how "value" should be measured.(**) Standing alone, millions of people understand precisely what they measure and capture, and enjoy them on their own terms. They're interesting factoids generated by a compelling sport and are an integral part of the game's texture and beauty.
(**) And with respect to the faulty premise that every piece of data must have as its only aim, measuring "value."
SBB, please take this constructively, but you abuse asterisk'd footnotes.
Really? So is that why for decades the player with the highest batting average was called the batting champion and widely viewed as the best hitter or among the best in the game? Is that why starting pitchers that get a lot of wins were considered the best in the game or why players with high fielding percentages were considered very good fielders? The thing is, for decades almost no one enjoyed batting average based on what batting average actually measured and captured, and for decades people did not understand what BA measured.
I don't think it's a good reason to still believe that BA is the greatest stat ever, but something like 3/4 of the time, I'd wager, the player with the higher batting average will be the better player. I hope people understand what I'm saying. I'm not saying that we should use BA, or that it tells us valuable information. I'm just pointing out that it's not all that bad of a stat. It's not like it's fielding % or productive out % or something stupid like that.
This seems right. In Little League (and beyond), we're taught to hit the ball. A walk is a neutral outcome. The most basic idea of the game is to hit the ball and make them field it. It's just what we learn, until we reach the advanced stage where walks are a positive, and not some "neutral" outcome. Most fans of MLB stopped playing before they reached this stage, and therefore don't view a walk as an accomplishment by the hitter.
To relate it to other sports -- in the same way, a very casual basketball fan might see a charging call and think it's something that "just happens," or a mistake by the offensive player, whereas it usually requires a very nuanced skill on the part of the defensive player. Or in football, the ability of a defensive lineman to draw holding penalties.
Piggybacking that, it's not just that we're taught to hit the ball, there are few things in life better than hitting the damn ball hard. Walking is nowhere near as satisfying, even if the end result is the same as a bases-empty single.
bingo... agree right there, occasionally there is such an obvious candidate for the award that the sports writers will still screw up a vote here or there.... but the point is to find the best method and apply your "learned" experience onto the vote.... I will never argue for an award to be based entirely upon a stat unless that was the award's purpose.. but as mentioned many times, better stats help tell the story... Nobody with a rational brain thinks any war method that is out there is perfect enough to be the whole story. You have the argument of value per plate appearance, you have the health argument, you have the domino effect of starting pitchers pitching deep into games argument etc..... but using war (mind you I'm not sold on any stat that claims to be the entire ball of wax) is still better than using wins/rbi/saves.
anymore than i don't get why on-base is NOT calculated as ANY way a batter gets on first base - why on earth would you leave out a ROE or a dropped 3rd strike
there are a LOT of problems with judging a player by WAR
1 - the imaginary player thingy
2 - you can't compare pitchers AND hitters because you are using different things to measure.
3 - fielding stats vary a LOT
4 - the formulas are, uh, let's say, not exactly obvious and to us people who are not no math geeeeyusses, appear to be some guy's opinion based on how much he thinks some kind of stat is important and how much "weight" it should have
and so it doesn't seem less (looking for word) capricious (ooooooooh that's a BIG one) than using RBI + BA + HR + teh Intanja-bullz and it does seem more (looking for words) obfuscatingly obscure (ahhhhhhhhhhh)