Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Tuesday, January 10, 2017

Grading the projections: 2016

ZiPS didn’t have a good year. ZiPS was the least accurate of the four systems in each of the five categories, and never by a particularly small margin. You don’t want to conclude too much based on a single season of results, but ZiPS didn’t perform very well in last year’s review, either. (I should also note that this is Steamer’s second straight year of leading the pack in convincing fashion.)

Marcel does its job. Marcel wasn’t great, but it was almost always in the neighborhood of accurate. It beat ZiPS in four of the five categories, and even led OBP. Marcel remains very hard to convincingly beat (or even beat at all), despite its simplicity.

Averaging the projections might be a great idea. The “Average” row in the above table is exactly what you would expect: the accuracy of the average of all four systems. It beats all four systems in four of the five categories, and fell short of only Steamer in the fifth. One would expect that an average would rarely be egregiously wrong; it’s surprising to see that the average also tended to be closer to right than each individual projection. This could be a quirk of a single season of projections, but at the very least, it seems to say that the brute-force method of resolving differences between the projection systems is credible.
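A minimal sketch of the averaging idea described above, using made-up OBP numbers and hypothetical system names (none of these values come from the article). It averages each player's projection across systems and scores every system, plus the average, by RMSE. RMSE is an assumption here; the excerpt does not say which accuracy measure the article used.

```python
from math import sqrt

def rmse(projected, actual):
    """Root mean squared error between two equal-length lists."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def average_projection(systems):
    """Player-by-player mean of every system's projection."""
    lists = list(systems.values())
    return [sum(values) / len(values) for values in zip(*lists)]

# Hypothetical data: four players, three invented systems.
actual = [0.340, 0.310, 0.355, 0.298]
systems = {
    "system_a": [0.350, 0.300, 0.345, 0.310],
    "system_b": [0.330, 0.325, 0.360, 0.290],
    "system_c": [0.345, 0.305, 0.370, 0.305],
}

scores = {name: rmse(proj, actual) for name, proj in systems.items()}
scores["average"] = rmse(average_projection(systems), actual)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.4f}")
```

One general property worth keeping in mind: the squared error of an average is never larger than the average of the individual squared errors, so the averaged projection cannot be worse than the typical system. It is not guaranteed to beat the best one, which fits the article's observation that the average fell short of Steamer in one category.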

RoyalsRetro (AG#1F) Posted: January 10, 2017 at 11:34 AM | 7 comment(s)
  Tags: marcel, pecota, steamer, zips

Reader Comments and Retorts


   1. DJS, the Digital Dandy Posted: January 10, 2017 at 02:30 PM (#5381079)
I had a weaker than average year on hitters, better than average on pitchers (I center systems on league offensive levels first), very good year on teams, weirdly enough.
   2. Shibal Posted: January 10, 2017 at 04:31 PM (#5381181)
Is he looking at pitching, hitting, or both? I can't tell from the article.
   3. PreservedFish Posted: January 10, 2017 at 07:11 PM (#5381283)
How do you grade Zips on teams? Do you use someone else's playing time estimates?
   4. Walt Davis Posted: January 10, 2017 at 07:35 PM (#5381298)
I'm not clear what correction he made for different league contexts assumed by the systems. It's clear he tried something but it would be good to be more precise. One thing to realize is that, for these sorts of rate stats, the mean and the variance are almost certainly related. A league with a 330 OBP will have a slightly higher variance of OBP than one with a 320 (and therefore slightly higher standard deviation). This is trivial in magnitude but all baseball differences are trivial in magnitude. This is part of the point that DanR regularly makes.

Point being that a 15 point error in OBP in the 330 league might be the equivalent of a 14 point error in a 320 league. Trivial, but then the differences among the systems are usually on the order of the third decimal place.
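One way to see the mean/variance link is a simple binomial model, sketched here as an assumption rather than as the correction anyone in the article or thread actually applied. If every plate appearance reaches base with probability p, the standard deviation of an observed OBP over n plate appearances is sqrt(p(1-p)/n), which grows with p for any p below .500. The 600 plate appearances are an assumed sample size.

```python
from math import sqrt

def binomial_sd(p, n):
    """SD of an observed rate when each of n trials succeeds with probability p."""
    return sqrt(p * (1 - p) / n)

PLATE_APPEARANCES = 600  # assumed full-season sample, not taken from the article

sd_330 = binomial_sd(0.330, PLATE_APPEARANCES)
sd_320 = binomial_sd(0.320, PLATE_APPEARANCES)
ratio = sd_330 / sd_320  # independent of n, since n cancels

print(f"SD at .330: {sd_330:.4f}")
print(f"SD at .320: {sd_320:.4f}")
print(f"ratio: {ratio:.4f}")
```

Under this model the two standard deviations differ by less than one percent, so the adjustment is real but small. The binomial model only covers sampling noise; spread in true talent across a league could move the ratio either way.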

But my key gripe is that I don't have a clue what that number is that he's presenting. It can't possibly be raw RMSE -- the systems weren't "typically" off on OBP by 80-90 points of OBP. Is it RMSE/average context? RMSE/actual OBP? 2*RMSE/actual? It can't be average absolute differences either for the same reason. (Tango's other preferred measure)
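For reference, here is a rough sketch of three of the candidate measures named above: raw RMSE, average absolute difference, and RMSE divided by the mean actual value. The data is invented, and nothing here identifies which measure the article reported; the point is only that the three land on visibly different scales for the same projections.

```python
from math import sqrt

def rmse(projected, actual):
    """Root mean squared error."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def mean_abs_error(projected, actual):
    """Average absolute difference."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)

def relative_rmse(projected, actual):
    """RMSE divided by the mean actual value (one possible normalization)."""
    return rmse(projected, actual) / (sum(actual) / len(actual))

# Hypothetical OBP values for five players.
actual_obp = [0.340, 0.310, 0.355, 0.298, 0.322]
projected_obp = [0.325, 0.330, 0.340, 0.315, 0.318]

print(f"RMSE:          {rmse(projected_obp, actual_obp):.4f}")
print(f"MAE:           {mean_abs_error(projected_obp, actual_obp):.4f}")
print(f"RMSE / mean:   {relative_rmse(projected_obp, actual_obp):.4f}")
```

Average absolute error can never exceed RMSE on the same data, and dividing by a mean OBP near .325 roughly triples the number, so a reader cannot interpret a bare figure without knowing which of these was computed.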

There's no point just putting a number out there and saying "see this one is lowest, that's the best." What do these numbers mean? Zips gets a .088 on OBP while 3 others are around .076 -- is that .009 difference at all meaningful? Is it likely to be anything but season-to-season prediction error variance?

By the way, I did click through to the Tango article linked. All that article does is review the Nate article and tell you which two of the measures that Nate cites are likely to be most useful/meaningful. It notes they all need to be adjusted to the particular context each method assumed, but doesn't tell you how to do that.

Sorry if I missed the sentence or two that actually clarifies this but I have looked for them.
   5. DJS, the Digital Dandy Posted: January 10, 2017 at 09:37 PM (#5381386)
Walt, I had some of these questions as well. But there's a bit of a self-serving nature when I make them, so I try not to.
   6. Guy Heckler's Veto Posted: January 10, 2017 at 10:06 PM (#5381394)
Admit no weakness, Szym. Give Murray Chass no quarter.
   7. Russ Posted: January 11, 2017 at 08:07 AM (#5381485)
It would be interesting to look at the distributions of the absolute errors as well (at least providing something like the 25th%ile and 75th%ile of the absolute errors, rather than just the average), because it would be possible for some systems to have more "small" misses but fewer "big" misses, and vice versa. I don't really love marginal summary statistics for these sorts of comparisons because they bury a lot of what is going on. That is unnecessary here: every player gets projected by each system, so you can really look at how much they agree on each player, rather than averaging across the error.
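A small sketch of that point, with invented absolute errors (in points of OBP) for two hypothetical systems. Both have the same average absolute error, but the quartiles show very different shapes: one misses by a steady amount, the other is usually close and occasionally far off.

```python
from statistics import mean, quantiles

# Hypothetical absolute errors, in points of OBP, for eight players.
abs_errors = {
    "system_x": [10, 10, 10, 10, 10, 10, 10, 10],  # steady misses
    "system_y": [2, 2, 2, 2, 2, 2, 34, 34],        # mostly small, two big
}

summary = {}
for name, errors in abs_errors.items():
    q25, q50, q75 = quantiles(errors, n=4)  # the three quartile cut points
    summary[name] = {"mean": mean(errors), "q25": q25, "median": q50, "q75": q75}

for name, stats in summary.items():
    print(name, stats)
```

The average alone would call these two systems a tie; the 25th and 75th percentiles make the difference visible.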

For example, what could be nice is to see a plot of the error by statistic with players on the x-axis (sorted by decreasing average absolute error in the statistic across the different projection methods) with the raw error on the player for each of the five methods on the y-axis. To make the graph easier to look at (i.e. to have fewer points), you could stratify players into different blocks by either plate appearances, experience, age, whatever. You would be able to see if systems are missing both above AND below (or everyone is missing the same direction), you could see clearly which players are the "easiest" to predict, which are the "hardest", etc. It also would dramatically show where Marcel does very well and where it does very poorly.
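The ordering behind such a plot could be prepared along these lines. This is only a sketch under assumptions: the player names, system names, and signed errors (projected minus actual, in points) are all made up, and the plotting itself is left out. It sorts players by decreasing average absolute error across systems and flags players on whom every system missed in the same direction.

```python
def rank_players(errors_by_player):
    """Return (player, mean absolute error, all_same_direction) rows, hardest first.

    errors_by_player maps player -> {system: signed error}.
    """
    rows = []
    for player, errors in errors_by_player.items():
        values = list(errors.values())
        mean_abs = sum(abs(e) for e in values) / len(values)
        same_direction = all(e > 0 for e in values) or all(e < 0 for e in values)
        rows.append((player, mean_abs, same_direction))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows

# Hypothetical signed errors for three players and three systems.
errors_by_player = {
    "player_1": {"system_a": 25, "system_b": 30, "system_c": 18},
    "player_2": {"system_a": -4, "system_b": 6, "system_c": 2},
    "player_3": {"system_a": -15, "system_b": -12, "system_c": 9},
}

for player, mean_abs, same_direction in rank_players(errors_by_player):
    print(f"{player}: mean |error| = {mean_abs:.1f}, same direction = {same_direction}")
```

The rows come out in x-axis order for the plot described above; stratifying by plate appearances, experience, or age would just mean running the same function on each subset of players.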
