Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Tuesday, October 03, 2006

Inside the Book: Tangotiger: Baseball Prospectus’ WARP1 is wrong

Thanks to FezPez for pointing this out…for I’m as confused as John Yoko.

The replacement levels that I use are .380 for a position player or a starting pitcher, and .470 for a reliever.  A team of such players will have a .300 winning percentage.

So, why does BP calculate WARP-1 the way they do?  The likelihood is that it treats a “replacement-level” position player as a replacement-level fielder and replacement-level batter.  But, such a player is not the 420th best position-player in the world.  He’s probably not even in the top 1000 players in the world.  Why is this the benchmark?  What does it tell us?

I know all about the 1899 Spiders, and the recent Tigers.  It doesn’t matter.  Even if an MLB team posts a .140 or .250 record, our best estimate of the true talent level of these teams is nowhere close to those records.  They probably need to be regressed 25-50% towards the mean.
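
To put rough numbers on that regression, here is a minimal sketch. It assumes the league mean is .500 and that "regressed X% towards the mean" means moving X% of the way from the observed record to .500; the function name is made up for illustration.

```python
def regress_to_mean(observed_pct, fraction, mean=0.500):
    """Move `fraction` of the way from the observed record toward the mean."""
    return observed_pct + fraction * (mean - observed_pct)

# The two records mentioned above, regressed 25% and 50% toward .500 (assumed mean).
for observed in (0.140, 0.250):
    low = regress_to_mean(observed, 0.25)
    high = regress_to_mean(observed, 0.50)
    print(f"observed {observed:.3f} -> estimated true talent {low:.3f} to {high:.3f}")
```

Under those assumptions, a .140 record regresses to somewhere between .230 and .320, and a .250 record to between roughly .313 and .375.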

Repoz Posted: October 03, 2006 at 11:26 PM | 23 comment(s)
  Tags: sabermetrics

Reader Comments and Retorts

   1. Kyle S Posted: October 03, 2006 at 11:47 PM (#2196490)
great post. good clear explanation for why the numbers have to be wrong.
   2. Robert in Manhattan Beach Posted: October 03, 2006 at 11:52 PM (#2196494)
Very well done. Agree 100%
   3. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 04, 2006 at 12:34 AM (#2196557)
BP's WARP1 is a joke, but so are Win Shares. I find both publicly available uberstat systems tell you far less than you would know just by combining OPS+, PA, position, and say Zone Rating. As for where to stick replacement level, isn't that an empirical question that Nate Silver (of the very same BP) answered with his research on Freely Available Talent (FAT)? I don't know what win% that translates to, as I don't think he looked at pitchers, but it wouldn't be hard to calculate.

I'd be interested to hear thoughts on how I've calculated replacement level historically. I took Silver's FAT levels at each position, adjusted them for each league-season, and saw how many games an otherwise average team would win with 700 PA from a player with that production. Then I took an average of those theoretical team wins over the 42 league-seasons he considered (1985-2005). I then looked at the production of the worst three regulars at each position in each league-season and averaged those over the 42, and thus derived relationships between replacement level and worst-three-regulars average (eg, replacement 1B were 0.3 wins worse than the average of the worst 3 regular 1B per 700 PA). Then, to work backwards, I simply take a 7-year moving average of the worst three regulars at each league-position, and add on this gap between replacement and worst-regular average.

How does that sound to all you statheads wiser than I am?
   4. Danny Posted: October 04, 2006 at 12:41 AM (#2196582)
FWIW, this reminded me of an exchange from a 2004 Prospectus Chat:

deadmonkeypaw (Chicago): I've read some criticisms of WARP3 that say that judging a player against both a defensive replacement level and an offensive replacement level is double counting. According to this claim, a player who is both a replacement level hitter and fielder would never get even close to the field. Do you have any sort of response to that?

Clay Davenport: It certainly is not double-counting; it just means that my overall replacement level is lower, on the order of a AA player rather than AAA.

And since there are numerous players who finish below even this extra-low replacement level, picking up a negative WARP-3 (the Orioles have 13 such players right now, although admittedly most have hardly played), the claim that they would never get close to the field is demonstrably false.
   5. bibigon Posted: October 04, 2006 at 12:50 AM (#2196612)
I asked Clay about this a while back, and he was kind enough to reply. I don't think he'll mind if I reproduce it here:

There are at least two reasons, Kostya.

One, the replacement level I'm using is lower than what you are citing, on the order of 25 wins for a team/season. Think Cleveland Spiders, 1899.

Two, and more importantly, WARP numbers for individuals don't add up to the same values that I would get if I evaluated them from their team numbers. Consider the 2004 Orioles. This team actually went 78-84. Since the replacement level win% is .153, their actual wins above replacement is 78-162*.153=53.2. Call it 53.

The sum of their individuals is 81, a big difference to be sure from the desired 53. But consider the team totals of batting rar (217), fielding rar (164) and pitching rar (381). These numbers have been adjusted to the standardized league of 9.0 runs per game, pythagorean exponent of 2. In the standard, a replacement level team scores 537 runs per 162 (replacement level EQA of .230, divided by league average of .260, raised to 2.5 power to convert it to runs, times 162 times 4.5 runs per team per game = 537) and allows 1262 (replacement pitching and fielding together yield a standard 7.79 replacement ERA, times 162 games = 1262).

A team that scores 537 and allows 1262 has a Pythagorean win pct of .153. The Orioles adjust to 537+217 brar=754 runs scored and 1262-164 frar-381 prar=717 runs allowed. That gives them Pyth% of .525 (85.1 wins), which makes them (.525-.153)*162=60 wins above replacement.

So we estimate the team as being 60 warp, and in real life they had 53 warp. That's not so bad, considering that they underperformed their real-world Pythagorean estimate by about 4 wins, and they allowed a few more runs than expected based on their pitching statistics.

The main point of that was to show that while the individual WARPs added together to 81, the team WARP only came to 60. There is a diminishing returns principle in the system, which follows from the mathematics of the Pythagorean model. A given number of extra runs will increase the win total of a bad team more than it will a good team. When Tejada or Mora or Lopez have their WARP calculated, they are adding their statistics to a replacement level team (simplifying; I actually calculate a team as average but for one replacement level player, compared to average but for this player). After you add Tejada, the team isn't replacement level anymore, and you don't get as many wins from Mora as before. Now that you have Tejada and Mora, the team is even farther from replacement level, and Lopez' contributions don't count as much. That, in a nutshell, is the problem. The team environment does make a difference on how many wins a player's performance creates, but I've adjusted that context out.


Perhaps not a complete explanation, but it does shed some light as to the thinking.
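
For anyone who wants to check the arithmetic in the quoted reply, here is a rough sketch that only re-runs the numbers given there (EQA of .230 vs .260, the 2.5 power, 4.5 runs per team per game, a 7.79 replacement ERA, and the Orioles' 217/164/381 RAR totals). The function and variable names are mine, not anything from BP's actual system.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage with the exponent of 2 quoted above."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

GAMES = 162
RUNS_PER_TEAM_GAME = 4.5  # half of the 9.0 runs per game "standard league"

# Replacement-level team in the standardized league, per the quoted formula.
repl_runs_scored = (0.230 / 0.260) ** 2.5 * GAMES * RUNS_PER_TEAM_GAME
repl_runs_allowed = 7.79 * GAMES
repl_pct = pythag_pct(repl_runs_scored, repl_runs_allowed)

# 2004 Orioles: add batting RAR, subtract fielding and pitching RAR
# (using the rounded 537 and 1262 baselines, as the quote does).
orioles_rs = 537 + 217
orioles_ra = 1262 - 164 - 381
orioles_pct = pythag_pct(orioles_rs, orioles_ra)

estimated_team_warp = (orioles_pct - repl_pct) * GAMES
actual_team_warp = 78 - GAMES * 0.153  # actual wins minus replacement wins

print(round(repl_runs_scored), round(repl_runs_allowed), round(repl_pct, 3))
print(round(orioles_pct, 3), round(estimated_team_warp), round(actual_team_warp, 1))
```

This reproduces the 537 and 1262 run totals, the .153 and .525 percentages, and the 60 vs. 53 team WARP figures. It does not try to reproduce the individual WARP sum of 81, since the quote doesn't give enough detail for that.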
   6. "Andruw for HoF" sure died down Posted: October 04, 2006 at 02:21 AM (#2196815)
I don't understand people who call stats "wrong". So are saves or wins or RBI "wrong" because they don't correlate perfectly with value? They might be "misleading" or "not perfectly useful for evaluating players", but calling a stat "wrong" is kind of going overboard. Whenever we look at WARP1, we should just keep in mind that it rewards playing time a lot more than other metrics would.
   7. Sparkles Peterson Posted: October 04, 2006 at 02:28 AM (#2196839)
And since there are numerous players who finish below even this extra-low replacement level, picking up a negative WARP-3 (the Orioles have 13 such players right now, although admittedly most have hardly played), the claim that they would never get close to the field is demonstrably false.


If Albert Pujols was injured 14 plate appearances into a season, there is a significant risk that he would rate as below replacement level during those 14 plate appearances. Take fringe players and the risk of them putting up below replacement level numbers in extremely limited playing time is obviously much higher. Clay Davenport cannot possibly be too dense to have realized this.
   8. Kyle S Posted: October 04, 2006 at 02:47 AM (#2196887)
So in other words (and feel free to jump in if I'm wrong), when Tejada is listed at 8 WARP1 or whatever, that means that he, if placed onto a team of otherwise replacement-level players, would increase that team's win total by 8 wins. Is that right?

---

And since there are numerous players who finish below even this extra-low replacement level, picking up a negative WARP-3 (the Orioles have 13 such players right now, although admittedly most have hardly played), the claim that they would never get close to the field is demonstrably false.

that's ridiculous - players who put up negative warp3 are probably not worse than AA players (if that's what he thinks a true talent 0 WARP3 player is). he knows this. why bring up that example?
   9. Harold Posted: October 04, 2006 at 03:44 AM (#2196980)
So in other words (and feel free to jump in if I'm wrong), when Tejada is listed at 8 WARP1 or whatever, that means that he, if placed onto a team of otherwise replacement-level players, would increase that team's win total by 8 wins. Is that right?

No, according to this parenthetical quoted in post 5:

(simplifying; I actually calculate a team as average but for one replacement level player, compared to average but for this player)
   10. Zach Posted: October 04, 2006 at 04:25 AM (#2196989)
One thought that went through my mind reading this is that the Royals pitching staff this year was a pretty good definition of replaceable talent.

Working through the math, the Royals' ERA was 5.65 against a league ERA of 4.5. If the team's batters had scored 4.5*(4.5/5.65)=3.58 runs per 9 innings, the pythagorean winning percentage would be .286, or just about Tango's replacement level.

So if you ever need a mental picture of replacement level, think about the 2006 Royals pitching staff.

A team that scored 3.58 runs per game would score about 580 runs over the year. The worst team in the league (Devil Rays) beat that by 109 runs.
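
A quick sketch of the math in this post, using only the figures given (5.65 Royals ERA, 4.5 league ERA, 162 games). Like the post, it treats ERA as a stand-in for runs allowed per game and uses a Pythagorean exponent of 2; both are simplifications, and the variable names are just for illustration.

```python
LEAGUE_ERA = 4.5
ROYALS_ERA = 5.65
GAMES = 162

# Offense that is as far below league average as the Royals' pitching was.
matching_runs_per_game = LEAGUE_ERA * (LEAGUE_ERA / ROYALS_ERA)

# Pythagorean winning percentage (exponent 2) for that hypothetical team.
rs_sq = matching_runs_per_game ** 2
ra_sq = ROYALS_ERA ** 2
pythag_pct = rs_sq / (rs_sq + ra_sq)

season_runs = matching_runs_per_game * GAMES

print(f"{matching_runs_per_game:.2f} runs/game, {pythag_pct:.3f} win pct, {season_runs:.0f} runs")
```

Carrying full precision instead of rounding to 3.58 moves the results by a fraction, so expect the winning percentage and season run total to land within a hair of the .286 and 580 quoted above rather than matching them to the last digit.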
   11. pkb33 Posted: October 04, 2006 at 04:43 AM (#2196994)
The most interesting thing to me there is the reminder that there's a diminishing returns issue with WARP. Amongst other things, that would mean that using a straight WARP1 in an MVP discussion (for example) divorces the player from their actual team context. Off the top of my head, this would seem to benefit players from good teams, or at least deeper teams, the most.
   12. Walt Davis Posted: October 04, 2006 at 06:41 AM (#2197010)
I thought this had been widely known. Roughly speaking, take about two wins off of WARP 3 to get something like what MGL would get "above replacement" ... but MGL usually reports above average, so take off about 4 wins.

Somewhere in the archives, there's an interesting discussion on these and related issues. I don't remember who, but someone made the pretty good argument that it's best to measure against average (since this is known) and if you want a "true zero" type measure then use true zero and just look at raw runs created (where the "true zero" is known). The latter is easier for hitters I suppose. True zero for pitchers would be???

that's ridiculous - players who put up negative warp3 are probably not worse than AA players (if that's what he thinks a true talent 0 WARP3 player is). he knows this. why bring up that example?

Well, they don't have to be worse (in true talent) than a AA player to make his point. Some people say that players as bad as BPro's "replacement level" would never make the majors. He is (or might be) pointing out that the fact that some players produce negative WARP suggests that players at or just above their "replacement level" do indeed make the majors.
   13. Halofan Posted: October 04, 2006 at 08:08 AM (#2197022)
Will Baseball Prospectus get their rabid lawyers going on this one?
   14. RMc is the loyal supporter of the MLB event Posted: October 04, 2006 at 09:10 AM (#2197027)
I know all about the 1899 Spiders, and the recent Tigers.

Geez...lose one lousy playoff game, and suddenly you're the worst team of all time!
   15. DSG Posted: October 04, 2006 at 11:16 AM (#2197033)
True zero for pitchers would be???

Pitching Runs Created

The Hardball Times tracks them in our stats section:

AL PRC leaders
   16. sunnyday2 Posted: October 04, 2006 at 11:24 AM (#2197036)
Anybody want to compare VORP? How is it different than WARP? Is it better?
   17. Bob Dernier Cri Posted: October 04, 2006 at 11:31 AM (#2197038)
players at or just above their "replacement level" do indeed make the majors

All you have to do is look at the Rangers pitching staff to notice that.
   18. Dan The Mediocre Posted: October 04, 2006 at 11:34 AM (#2197040)
players at or just above their "replacement level" do indeed make the majors

All you have to do is look at the Rangers pitching staff to notice that.



Or the Cubs Collage of AAA Pitchers.
   19. DSG Posted: October 04, 2006 at 11:35 AM (#2197041)
Also, I have to add that the problem with WARP is not just a preference of baseline. Any study done with WARP is invalidated by the system, especially an economic one (i.e., Nate Silver's otherwise spectacular chapter on what A-Rod is worth in Baseball Between the Numbers) because the replacement baseline is set too low, which lowers the value of a marginal win.

Think about it this way, for example. Let's say we have two different baselines; one, in which a replacement level hitter contributes 3.3 "wins" (i.e. Win Shares) and another in which a replacement level player contributes 1.3 "wins." Now let's take an average team that spends $48 million on their hitters. If we value their hitters based on the higher (and essentially correct) replacement level, we'll find that a win above replacement is worth around $2.25 million (because the team is paying about $45 million marginal dollars and getting around 20 marginal wins). If we use the lower baseline, one win above replacement turns out to be worth $1.2 million. So how do you know what the correct amount to pay for Alex Rodriguez is? Well, if he's worth say 11 "wins," under the first model he ends up being worth (11 - 3.3)*2.25 + .3 = $17.6 million. Under the second, (11 - 1.3)*1.2 + .3 = $11.9 million.

If the first replacement level is correct, any model that uses the second will be wrong, and by quite a bit if it sets its replacement level as low as WARP does.

That's what happens in Nate Silver's chapter. And that's why it is imperative to use the correct replacement level.
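
The two valuations above can be sketched as below. This only re-runs the post's own figures; I'm assuming the .3 added at the end is a league-minimum salary in millions (the post doesn't say), and the function names are hypothetical.

```python
def dollars_per_marginal_win(marginal_payroll_m, marginal_wins):
    """Price of one win above replacement, in millions of dollars."""
    return marginal_payroll_m / marginal_wins

def player_value_m(player_wins, replacement_wins, dollars_per_win_m, base_salary_m=0.3):
    """Value in $ millions: marginal wins times price, plus the base (assumed minimum) salary."""
    return (player_wins - replacement_wins) * dollars_per_win_m + base_salary_m

# Higher baseline: about $45M marginal dollars buying about 20 marginal wins.
high_baseline_price = dollars_per_marginal_win(45, 20)   # 2.25
# Lower baseline: the post gives $1.2M per win directly, so it is taken as given here.
low_baseline_price = 1.2

value_high = player_value_m(11, 3.3, high_baseline_price)
value_low = player_value_m(11, 1.3, low_baseline_price)

print(f"higher baseline: ${value_high:.1f}M, lower baseline: ${value_low:.1f}M")
```

The same 11-"win" player comes out at about $17.6 million under the higher baseline and about $11.9 million under the lower one, which is the gap the post is pointing at.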
   20. Kyle S Posted: October 04, 2006 at 01:33 PM (#2197093)
Well, they don't have to be worse (in true talent) than a AA player to make his point. Some people say that players as bad as BPro's "replacement level" would never make the majors. He is (or might be) pointing out that the fact that some players produce negative WARP suggests that players at or just above their "replacement level" do indeed make the majors.

Walt, I still don't think that we can conclude that players as bad as BPro's replacement level make the majors just based on the fact that some players put up negative WARP seasons. Again, just because a player puts up negative WARP doesn't mean his true talent is negative WARP, any more than Adrian Beltre putting up 11.7 WARP in 2004 means he's an 11-WARP true talent player.

Try finding players who have more than a few hundred at bats or BFP and have negative WARP. I haven't found any, and I've plumbed the depths of the worst players in history (Ray Oyler, Mario Mendoza, Andres Thomas, Mike Ryan, Rich Morales, Jim Mason).

Furthermore, we know that players of 0-warp level do make the majors, so this is a silly argument anyway- teams frequently dig into AA pitching staffs for emergency starters or bullpen help, for example.
   21. Tom Cervo, backup catcher Posted: October 05, 2006 at 04:55 AM (#2198628)
The Astros won 82 games, which is pretty much what their RS/RA numbers would have expected. 82 minus 58.5 is 23.5 wins. 23.5 / 162 = .145. Another perennial .500 team I like is the Seattle Mariners. Their team WARP-1 is 52.1, and they won 78. Their RS/RA would have expected around that as well. 78 minus 52.1 = 25.9 wins. 25.9/162=.160. The Yanks won 97, which is also around their pythag record. Team WARP-1 is 71.9. 97 minus 71.9 = 25.1 wins, or a .155 record.


I think there's a problem with methodology here.

By adding up a team's WARP, you're double-counting the defensive runs above replacement level since run prevention is likely going to be credited to both pitchers and fielders.
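
For reference, the quoted calculation backs out an implied replacement level as (actual wins minus summed team WARP-1) divided by 162. A small sketch using just the three teams and figures in the quote; the dictionary and function name are only for illustration, and it says nothing about whether the double-counting objection holds.

```python
GAMES = 162

# (actual wins, summed team WARP-1) as given in the quoted passage.
teams = {
    "Astros": (82, 58.5),
    "Mariners": (78, 52.1),
    "Yankees": (97, 71.9),
}

def implied_replacement_pct(wins, team_warp, games=GAMES):
    """Winning percentage left over once the team's summed WARP is removed."""
    return (wins - team_warp) / games

for name, (wins, team_warp) in teams.items():
    pct = implied_replacement_pct(wins, team_warp)
    print(f"{name}: {wins - team_warp:.1f} replacement wins, {pct:.3f}")
```

All three come out between .145 and .160, close to the .153 figure Davenport cites, so if credit is being double-counted it isn't obvious from these totals alone.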
   22. Harold Posted: October 09, 2006 at 03:31 AM (#2204415)
By adding up a team's WARP, you're double-counting the defensive runs above replacement level since run prevention is likely going to be credited to both pitchers and fielders.

If that's the case (and I don't know whether it is), it is a failing of WARP. They have to add up. If the reason they don't add up that way is due to marginal utility, that's reasonable. But double-counting defensive credit isn't.
