Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Thursday, October 04, 2007

2007 Hitter Projection Roundup

Nate Silver looks at how several projection systems fared in 2007 (PECOTA, ZiPS, CHONE, Marcel, THT, ESPN, Rotowire, Rototimes).

Joe Crede Clearwater Revival Posted: October 04, 2007 at 08:39 AM | 48 comment(s)
  Related News: Projections

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

Page 1 of 1 pages
   1. Dan Szymborski Posted: October 04, 2007 at 09:27 AM (#2559608)
I need to email Nate and thank him for doing this work. I'm always really nervous before somebody does this - I was especially worried about the hitters since I was middle-of-the-pack last year. Now I just hope that I stayed with the best with the pitchers and I can breathe more easily this winter.
   2. Cowboy Popup Posted: October 04, 2007 at 09:30 AM (#2559610)
Congrats Dan.
   3. Mister High Standards Posted: October 04, 2007 at 09:43 AM (#2559625)
Great job Dan.
   4. Shooty misses Bill King Posted: October 04, 2007 at 09:44 AM (#2559626)
Oh great. Now you gonna start charging us, aren't ya?

Excellent work. Zips is my favorite part of BTF. Except for my own posts, of course.
   5. Kyle S Posted: October 04, 2007 at 09:48 AM (#2559630)
Congratulations Dan! So not only is ZIPS free, it's available much earlier in the year than is PECOTA. The underwear gnomes suggest this model to you:

1. Perfect ZiPS projections
2. Increase snarkiness
3. ?????
4. Profit
   6. NJ in DC (Now with Law School!!!) Posted: October 04, 2007 at 09:59 AM (#2559643)
Dan...if ya ain't first, you're last.
   7. Wally Moses, Isolated Power Broker (GGC) Posted: October 04, 2007 at 10:00 AM (#2559645)
1. Perfect ZiPS projections
2. Increase snarkiness
3. ?????
4. Profit


Leave that crap to the Will Carrolls of the world.

What's ESPN's method?
   8. Fargo Posted: October 04, 2007 at 10:01 AM (#2559648)
One thing I'd like to see is a comparison of the projections for different groups of players separately. It's got to be easier to get these right for players with more ML plate appearances and seasons behind them. I would guess that it's the projections of rookies and near rookies that differentiate the systems most, but also the players who had a relatively small number of plate appearances. And it's for such players that the systems are likely to differ most in coverage -- whether any projections were provided at all.

Another diagnostic would be to calculate the residuals by team, to see whether there was systematic error associated with which team a player hit for. This would likely reflect differences (error) due to using different park factors, rather than differences in information about the individual player.
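A rough sketch of that by-team diagnostic, assuming you have each player's team, projected OPS, and actual OPS in hand. The rows below are made-up placeholders, not real 2007 data, and `mean_residual_by_team` is just an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical rows: (player, team, projected OPS, actual OPS). Made-up numbers.
rows = [
    ("Player A", "COL", 0.800, 0.850),
    ("Player B", "COL", 0.750, 0.790),
    ("Player C", "SD", 0.780, 0.740),
    ("Player D", "SD", 0.820, 0.800),
]

def mean_residual_by_team(rows):
    """Average (actual - projected) OPS for each team."""
    totals = defaultdict(lambda: [0.0, 0])  # team -> [sum of residuals, count]
    for _player, team, projected, actual in rows:
        totals[team][0] += actual - projected
        totals[team][1] += 1
    return {team: total / n for team, (total, n) in totals.items()}

residuals = mean_residual_by_team(rows)
for team, resid in sorted(residuals.items()):
    print(f"{team}: {resid:+.3f}")
```

A team whose hitters are consistently over or under their projections would be a hint that the park factor is off, though a handful of players per team is a small sample.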
   9. BaseballDIY Posted: October 04, 2007 at 10:03 AM (#2559651)
Oh yeah? Well, my projection system nailed Carlos Pena's .282-46-121.

That, of course, is a total lie. Thanks, Dan, for ZiPS. It helped me win some money in my fantasy leagues. (Please sign these papers indicating that you did not help me win some money in my fantasy leagues.)
   10. Dan Szymborski Posted: October 04, 2007 at 10:04 AM (#2559653)
What's ESPN's method?

I was wondering myself. Since it did well, hopefully it has nothing to do with the cockamamie ESPN Player Ratings. Maybe on the fantasy part somewhere?
   11. 4seamer Posted: October 04, 2007 at 10:05 AM (#2559656)
are we really clapping for r .60 values?
   12. JPWF13 Posted: October 04, 2007 at 10:12 AM (#2559661)
are we really clapping for r .60 values?


wait till you see the results for pitching projections
an r of .25 will be good
   13. Mike Emeigh Posted: October 04, 2007 at 10:12 AM (#2559662)
are we really clapping for r .60 values?


That's pretty darned good, I think.

-- MWE
   14. Shooty misses Bill King Posted: October 04, 2007 at 10:14 AM (#2559665)
are we really clapping for r .60 values?

Until there's a better system...yeah.
   15. Fargo Posted: October 04, 2007 at 10:17 AM (#2559668)
They're not going to get much better than r=.60. But if you're just trying to do better than your roto competition, the differences between the projections or the components of performance may matter. Getting it more right, not absolutely right, is what the game is about, just like winning one more game than anybody else is how you win your league in real baseball, even if you end up 84-78.
   16. Kyle S Posted: October 04, 2007 at 10:18 AM (#2559673)
as mgl likes to say, for guys who marcel can project (e.g. those who have been in the league long enough to have several years of data), the best systems are almost as good as "if God himself told us their true talent OPS". no system will get perfect correlation because of the inherent variance in hitting a pitched ball.
   17. Dan Szymborski Posted: October 04, 2007 at 10:37 AM (#2559702)
as mgl likes to say, for guys who marcel can project (e.g. those who have been in the league long enough to have several years of data), the best systems are almost as good as "if God himself told us their true talent OPS". no system will get perfect correlation because of the inherent variance in hitting a pitched ball.

That's kind of the frustrating thing. If I or Nate or Chone predicted every OPS exactly right, as good as that would seem on the surface, we'd still have to face up to the fact that we weren't perfect because we should have a certain amount of error since all of us would freely admit we can't model chance.
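To put a number on that ceiling, here is a small simulation sketch. It assumes observed OPS is true talent plus independent random luck, and the spreads (`TALENT_SD`, `LUCK_SD`) are made-up values chosen for illustration, not estimates from real data. Under that assumption even a projection that knows true talent exactly tops out at sqrt(talent variance / (talent variance + luck variance)).

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(2007)  # fixed seed so the run is repeatable
TALENT_SD = 0.060  # assumed spread of true-talent OPS (made up)
LUCK_SD = 0.050    # assumed spread of one-season luck (made up)

true_ops = [rng.gauss(0.770, TALENT_SD) for _ in range(2000)]
observed_ops = [t + rng.gauss(0.0, LUCK_SD) for t in true_ops]

r_perfect = pearson(true_ops, observed_ops)  # "God told us the true talent"
r_ceiling = TALENT_SD / sqrt(TALENT_SD ** 2 + LUCK_SD ** 2)
print(f"perfect-knowledge r = {r_perfect:.3f}, theoretical ceiling = {r_ceiling:.3f}")
```

The more luck there is relative to the real differences between hitters, the lower the ceiling gets, which is one reason a higher playing-time cutoff changes the correlations.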
   18. cardsfanboy Posted: October 04, 2007 at 10:39 AM (#2559704)
Wouldn't it be a tad more accurate to adjust for league environment before doing the correlations? I mean Nate even said that scoring league wide was down, and I don't think any of the projection systems really take the 'future' scoring environment into account, so if it's a down year offensively shouldn't you adjust first then grade?
   19. AROM Posted: October 04, 2007 at 10:39 AM (#2559705)
I did the same test I did last year - it's a smaller sample as I only included players with at least 500 at bats, but at least I didn't have to make up a projection for anyone to run the data. The systems I had were Chone, Pecota, Zips, Marcel, and THT. I simply deleted any player if any projection system did not have them. With a higher PT threshold, I only had to remove two - Kelly Johnson, and Adrian Beltre.

Results by correlation and average weighted error:

Chone .657, 35.5
Pecota .652, 35.8
Zips .644, 38.3
THT .639, 38.8
Marcel .633, 36.4

weighted error is (ops - proj ops)*PA

Outside of Jimmy Rollins, nobody is going to have too many more AB than a guy who meets the minimum, but if you cut the playing time minimum down correlation isn't a good measure, and error should be weighted by PA. Which is a bigger deal? Being off by 80 points on Jimmy Rollins and his 800 at bat season or off by 80 points on Greg Dobbs? I think the answer is obvious.
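For anyone who wants to run the same two measures, here is a minimal sketch. The player rows are invented, and taking the absolute value of the miss before multiplying by PA is an assumption on my part (without it, over- and under-projections would cancel out).

```python
from math import sqrt

# Hypothetical rows: (player, projected OPS, actual OPS, PA). Made-up numbers.
players = [
    ("Regular A", 0.800, 0.880, 700),
    ("Regular B", 0.760, 0.740, 650),
    ("Part-timer C", 0.720, 0.800, 300),
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def avg_weighted_error(players):
    """Mean of |actual - projected| * PA (the absolute value is assumed)."""
    total = sum(abs(actual - proj) * pa for _name, proj, actual, pa in players)
    return total / len(players)

projected = [p[1] for p in players]
actual = [p[2] for p in players]
r = pearson(projected, actual)
err = avg_weighted_error(players)
print(f"correlation = {r:.3f}, average weighted error = {err:.1f}")
```

In this toy data the same 80-point miss costs 56 on the 700 PA regular and only 24 on the 300 PA part-timer, which is the Rollins versus Dobbs point.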
   20. AROM Posted: October 04, 2007 at 10:44 AM (#2559715)
are we really clapping for r .60 values?

Until there's a better system...yeah.


I think .60 is pretty good using a smaller playing time minimum. The ones I got for the 500 AB+ guys are not as good as last year across the board, PECOTA was at .73 last year.
   21. Craig in MN Posted: October 04, 2007 at 10:45 AM (#2559717)
Wouldn't it be a tad more accurate to adjust for league environment before doing the correlations? I mean Nate even said that scoring league wide was down, and I don't think any of the projection systems really take the 'future' scoring environment into account, so if it's a down year offensively shouldn't you adjust first then grade?

I kind of thought the same thing. A few fewer or more injuries than expected and that throws the average off, or a cold snap in the weather here or there could throw overall league performance off a bit. It probably doesn't have much effect on the other comparisons of the projection systems, but I'm no expert. Marcel plus a Farmers Almanac might just be as good as anything, though.
   22. AROM Posted: October 04, 2007 at 10:47 AM (#2559718)
I don't think any of the projection systems really take the 'future' scoring environment into account, so if it's a down year offensively shouldn't you adjust first then grade?


It wasn't by design, when I ran everything and got lower scoring the reason seemed to be an outstanding crop of young pitchers while much of the great offense in 2006 came from older hitters. That's what the system spit out, but several of those young pitchers regressed or got hurt. The real reason offense was down was an extremely cold April, after that things were much like 2006.
   23. Kyle S Posted: October 04, 2007 at 10:47 AM (#2559719)
All hail our CHONE overlords.
   24. pkb33 Posted: October 04, 2007 at 10:48 AM (#2559721)
For most purposes of the projections (fantasy leagues and real-life comparison of competing teams) changes in leaguewide run context wash themselves out, don't they? In other words, I am not aware of too many situations where the absolute runs as opposed to the relative runs really matter.

Which is to say, I'm not seeing how the leaguewide drop in runs affects the correlations meaningfully, unless the argument is it biases the projection comparison towards a system which skews lower?
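A quick worked check of that intuition, under the simplifying assumption that a down year moves every hitter's actual OPS by the same constant c (real seasons won't be that tidy). With p_i the projected OPS and a_i the actual OPS:

```latex
% Assumption: league context shifts every actual OPS by the same constant c.
a_i' = a_i + c, \qquad \bar{a}' = \bar{a} + c

% Correlation is unchanged, because c cancels in every deviation from the mean:
r(p, a') = \frac{\sum_i (p_i - \bar{p})\,(a_i + c - \bar{a} - c)}
                {\sqrt{\sum_i (p_i - \bar{p})^2}\,\sqrt{\sum_i (a_i + c - \bar{a} - c)^2}}
         = r(p, a)

% Average absolute error is not unchanged:
\frac{1}{n}\sum_i \lvert a_i' - p_i \rvert = \frac{1}{n}\sum_i \lvert (a_i - p_i) + c \rvert
```

So a uniform league-wide drop leaves the correlations alone, but it does move the error-based rankings in favor of whichever system happened to project lower across the board.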
   25. Wally Moses, Isolated Power Broker (GGC) Posted: October 04, 2007 at 10:51 AM (#2559724)
I'm not a fantasy player, but if I were playing fantasy against Primates, I'd use ESPN. Were I to play against non-Primates I'd use ZiPS. I figure that would be the best way to find undervalued guys depending on the situation.
   26. Gromit Posted: October 04, 2007 at 10:55 AM (#2559731)
PECOTA!

We're #1
We're so full of ourselves.

signed,
BPro.
   27. Shooty misses Bill King Posted: October 04, 2007 at 10:56 AM (#2559732)
I'm not a fantasy player, but if I were playing fantasy against Primates, I'd use ESPN. Were I to play against non-Primates I'd use ZiPS. I figure that would be the best way to find undervalued guys depending on the situation.

I'd be careful--when I played fantasy I'd use at least 3 projection systems to find players. I think most primates know better than to put all their eggs in one basket.
   28. AROM Posted: October 04, 2007 at 11:02 AM (#2559744)
If you have enough time to prepare for the draft, gather all the projection systems and find the players they disagree on the most, then focus some research on those players, try to figure out why, and make your best decision on them.

The Chone projection was way low on Casey Kotchman, which I'm very happy about. It doesn't know about his case of mono, just sees a young player who played like crap in 2006. He probably deserved a mulligan for 2006, and combined with scouting reports a human could have beaten his computer projection.
   29. pkb33 Posted: October 04, 2007 at 11:03 AM (#2559746)
The other critique of these for fantasy purposes is that the great majority of fantasy leagues are reasonably shallow relative to MLB. If you are in a 14 team mixed league you are starting 14*14 or so hitters with perhaps an additional 14*3-5 on the bench/DL. So, a huge percentage of guys in the majors are irrelevant to you.

What you want to know is of the guys with the best production, how accurate were the systems. I suspect this answer is pretty much the same as the comparison actually done here, but it's not exactly the same question.

Also, for fantasy purposes SB is a huge issue, and these analyses avoid it. Similarly, R and RBI are how most fantasy matchups are scored and using OPS does not tell you anything about these categories. While it is true that those stats are more erratic and less skill-oriented than OPS, this does not matter to the fantasy player---one of the things you are paying for is someone to help you best assess the context the player will be in lineup-wise and thus, what R and RBI numbers will look like.
   30. Dan Szymborski Posted: October 04, 2007 at 11:17 AM (#2559767)
PECOTA!

We're #1
We're so full of ourselves.

signed,
BPro.


That's not fair - that previous entry was Will's doing, not Nate's.
   31. Mister High Standards Posted: October 04, 2007 at 11:29 AM (#2559780)
Since it did well, hopefully it has nothing to do with the cockamamie ESPN Player Ratings.


Maybe this is quibbling, but I don't think this is accurate. The ESPN projections didn't really do well. They were middle of the pack at best. What they did do, was give some different information, so that if you combined it with other systems you were getting additional data. That doesn't mean that it did well, since these systems aren't really designed to interact with each other.

With all of that said, the fact that it captured an aspect of projection that a comparison model didn't, and that a linear model didn't DOES make it interesting... just not accurate. More research should be done, IMHO.
   32. Barry`s_Lazy_Boy Posted: October 04, 2007 at 11:35 AM (#2559793)
ALL YOUR PROJECTIONS ARE BELONG TO US
   33. Bring Me the Head of Alfredo Griffin (Vlad) Posted: October 04, 2007 at 11:47 AM (#2559813)
Congrats, Dan. You do excellent work, so it's nice to see you get some accolades.
   34. BaseballDIY Posted: October 04, 2007 at 01:03 PM (#2559926)
And Mike Crudale says hi.
   35. JPWF13 Posted: October 04, 2007 at 01:17 PM (#2559943)
The Chone projection was way low on Casey Kotchman, which I'm very happy about. It doesn't know about his case of mono, just sees a young player who played like crap in 2006. He probably deserved a mulligan for 2006, and combined with scouting reports a human could have beaten his computer projection.


But when you get into doing that the result is you start making excuses for the players you like who played poorly (and probably not the ones you didn't like)
Kotchman- sure it's easy now to say 2006 doesn't count, but some others who always thought Kotchman was "another" PCL-inflated Angels prospect would regard mention of his illness as excuse making- and wouldn't make any adjustments.
   36. AROM Posted: October 04, 2007 at 02:06 PM (#2559989)
But when you get into doing that the result is you start making excuses for the players you like who played poorly (and probably not the ones you didn't like)


It's a good point. I had a good feeling about Kotchman, but I remember some Mariner fans talking about how great Jose Vidro would be not having to play the field. Vidro's one of the guys I hit exactly with OPS, or was off by .001. I'm not even sure if Kotchman is one of the players who the projection systems disagreed with.

When you do find the players with the most disagreement, spend some time trying to figure out why. Gather some information that projection systems don't take into account. They've got all the stats, and I'm sure most use height/weight data, but I doubt any use scouting reports. You may or may not wind up with better info going into a fantasy draft, but the process will be fun.
   37. user Posted: October 04, 2007 at 02:22 PM (#2560012)
I seem to remember Kotchman getting a fairly poor PECOTA forecast before 2006
   38. John Brill's #1 Fan (JMN) Posted: October 04, 2007 at 04:32 PM (#2560224)
Since I play in a sim league, and the draft starts in about three weeks, ZiPS has extra value for getting out quickly.
   39. JPWF13 Posted: October 04, 2007 at 04:35 PM (#2560233)
When you do find the players with the most disagreement, spend some time trying to figure out why.


well very quickly you realize that most major "disagreements" result from injuries and playing time-

It's the others that are more fascinating- i.e., the player does far better or worse than his prior track record- with no apparent cause.
   40. AROM Posted: October 04, 2007 at 05:03 PM (#2560304)
The two players who the 5 systems I looked at (CH, Pec, Zip, Marcel, THT) agreed the most on were Magglio Ordonez and Vernon Wells. All had Mags between .798 and .806, so of course he hits 1.032. All had Wells from .839 to .846, so he barely breaks .700.

Among the 500+ AB guys, the ones with the most disagreement were a mix of young players and aging vets.

The top 10, and the system that came closest:
Delgado- Chone
Gordon - Chone (Marcel, but based on 0 PA and league average)
S Drew - Chone
C Hart - Pecota
C Young - THT
H Ramirez - Marcel (only one "smart" enough to ignore his minor league stats)
Delmon Young - THT
F Thomas - Chone
T Pena - THT
Theriot - Pecota
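A sketch of how a list like this can be put together, for anyone who wants to try it with their own spreadsheet. The hitters and numbers below are invented, and measuring disagreement as the gap between the highest and lowest projection is just one reasonable choice (a standard deviation would work too).

```python
# Hypothetical projections and actual OPS; every number here is made up.
projections = {
    "Hitter X": {"CHONE": 0.800, "PECOTA": 0.805, "ZiPS": 0.798, "Marcel": 0.806, "THT": 0.802},
    "Hitter Y": {"CHONE": 0.760, "PECOTA": 0.820, "ZiPS": 0.700, "Marcel": 0.780, "THT": 0.845},
}
actual_ops = {"Hitter X": 0.900, "Hitter Y": 0.765}

def spread(systems):
    """Disagreement measured as highest projection minus lowest."""
    return max(systems.values()) - min(systems.values())

def closest_system(systems, actual):
    """Name of the system whose projection landed nearest the actual OPS."""
    return min(systems, key=lambda name: abs(systems[name] - actual))

# Most-disputed hitters first.
ranked = sorted(projections, key=lambda p: spread(projections[p]), reverse=True)
for player in ranked:
    systems = projections[player]
    winner = closest_system(systems, actual_ops[player])
    print(f"{player}: spread {spread(systems):.3f}, closest {winner}")
```

Note that tight agreement says nothing about accuracy: the toy "Hitter X" has every system within 8 points of each other and all of them miss by more than 90.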
   41. AROM Posted: October 04, 2007 at 05:14 PM (#2560319)
I feel compelled to add that Nate's favorite statistic at the end of that article, the one that comes from throwing them all in a regression formula, is useless and possibly dangerous. It's a best fit that works out once you already know 2007 actual OPS. There is no reason to think that his formula (5 pecota + 4 zips + 3 espn...) will predict anything for 2008.

If he was looking at previous years' projections, and came up with that formula, and validated it against 2007, that would be one thing, but just because you can "best fit" a sample of data DOES NOT mean it will be predictive.
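Here is a toy version of that fit-then-validate check. It is not Nate's regression: it blends just two systems with a single weight, and both "seasons" are synthetic data in which the two systems are equally good by construction, so any fitted weight away from 50/50 is noise.

```python
import random

def sse(weight, sys_a, sys_b, actual):
    """Sum of squared errors for the blend weight*A + (1 - weight)*B."""
    return sum((weight * a + (1 - weight) * b - y) ** 2
               for a, b, y in zip(sys_a, sys_b, actual))

def best_fit_weight(sys_a, sys_b, actual):
    """Closed-form least-squares weight on system A for a two-system blend."""
    num = sum((a - b) * (y - b) for a, b, y in zip(sys_a, sys_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(sys_a, sys_b))
    return num / den

def fake_season(rng, n=150):
    """Synthetic season: two equally noisy projections of the same talent."""
    talent = [rng.gauss(0.770, 0.060) for _ in range(n)]
    sys_a = [t + rng.gauss(0, 0.030) for t in talent]
    sys_b = [t + rng.gauss(0, 0.030) for t in talent]
    actual = [t + rng.gauss(0, 0.050) for t in talent]
    return sys_a, sys_b, actual

rng = random.Random(41)      # fixed seed so the run is repeatable
fit_year = fake_season(rng)  # the season the weight is fitted on
next_year = fake_season(rng) # a fresh season the weight never saw

w = best_fit_weight(*fit_year)
print(f"fitted weight on A: {w:.2f}")
print(f"in-sample SSE: fitted {sse(w, *fit_year):.3f} vs 50/50 {sse(0.5, *fit_year):.3f}")
print(f"next-year SSE: fitted {sse(w, *next_year):.3f} vs 50/50 {sse(0.5, *next_year):.3f}")
```

The fitted weight can never lose to 50/50 on the season it was fitted to, by construction, so the in-sample line proves nothing. Only the next-year line tells you whether the weights are worth carrying forward.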
   42. Fargo Posted: October 04, 2007 at 05:31 PM (#2560363)
Good point. Not to mention, the results discussed on this thread refer only to hitters. The story is likely to be very different for pitchers.
   43. Mister High Standards Posted: October 04, 2007 at 05:35 PM (#2560369)

If he was looking at previous years' projections, and came up with that formula, and validated it against 2007, that would be one thing, but just because you can "best fit" a sample of data DOES NOT mean it will be predictive.


A very wise man said that in 31. Of course he isn't as wise as rally, or dan or nate silver of mike e. or philly but pretty damn wise. though he has spelling issues.
   44. AROM Posted: October 05, 2007 at 12:51 AM (#2561107)
Here's the story on pitchers.
   45. Shibal Posted: October 05, 2007 at 02:33 AM (#2561167)
Why isn't Ron Shandler's stuff in the ratings? I like them far better than PECOTA (unfamiliar with the other systems.)
   46. AROM Posted: October 05, 2007 at 08:56 AM (#2561243)
I don't have a copy of Shandler's projections. If you send me a spreadsheet with Shandler's predicted ERA for every pitcher with 50 innings last year, I'll be happy to compare his to the rest.
   47. Bad Doctor Posted: October 05, 2007 at 01:20 PM (#2561727)
If I or Nate or Chone predicted every OPS exactly right, as good as that would seem on the surface, we'd still have to face up to the fact that we weren't perfect because we should have a certain amount of error since all of us would freely admit we can't model chance.

I somehow doubt that it would be worded that way on the back of the next year's Baseball Prospectus.
   48. JPWF13 Posted: October 05, 2007 at 01:29 PM (#2561749)
Why isn't Ron Shandler's stuff in the ratings? I like them far better than PECOTA (unfamiliar with the other systems.)


A couple years ago Shandler posted a long article contesting Pecota's claim to be the best.
IIRC he said that their comp method was unfair because all the competing systems did not project all the same players and and and...

It was really a bizarre article, it seemed like he was going to attack or at least discuss BPro's comparison methodology- but never really did. The overall tone was petulant [he'd fit in with BPro's writers if he wrote like that all the time].

I'd gotten Shandler's book earlier that year along with his projections, he did not have a good year let me tell you. I suspect he was pissed off.