2001 Projections - A Look Back; Projections Part I
Whose crystal ball is best?
It's time once again to take a look at how the various projections did for the 2001 season. In 2000, I looked at a number of projection systems and how well they projected the actual results of various players. I looked at the top 170 players in terms of plate appearances (with some exceptions) and how each system did in projecting OPS. The results of that exercise can be found here.
Anyway, I've decided to do the same for the 2001 season, only I'll limit the analysis to myself, STATS, Inc. and Baseball Prospectus. The first thing I'll do is simply give you the results for the three systems for the top 170 players in plate appearances:
| Source | Corr. | r^2 | RMSE | MAE | MAPE |
|---|---|---|---|---|---|
| Voros | .748 | .559 | .081 | .061 | 7.5% |
| STATS, Inc. | .739 | .547 | .082 | .061 | 7.4% |
| Baseball Prospectus | .703 | .487 | .091 | .069 | 8.5% |

Corr. = Correlation Coefficient
r^2 = Square of Corr.
RMSE = Root Mean Squared Error
MAE = Mean Absolute Error
MAPE = Mean Absolute Percentage Error
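
For readers who want to run the same kind of comparison on their own projections, here is a minimal sketch of how these five measures could be computed. It is not the code used for the table above. The function name `projection_metrics` is made up for illustration, and it assumes the percentage error is taken relative to the actual OPS, which the article does not specify.

```python
import math

def projection_metrics(projected, actual):
    """Return Corr., r^2, RMSE, MAE and MAPE for paired projected/actual values."""
    n = len(projected)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    mean_p = sum(projected) / n
    mean_a = sum(actual) / n

    # Pearson correlation coefficient between projections and results.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    corr = cov / math.sqrt(var_p * var_a)

    # Error measures are built from the per-player miss (projected minus actual).
    errors = [p - a for p, a in zip(projected, actual)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Assumption: percentage error is measured against the actual value.
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n

    return {"corr": corr, "r2": corr ** 2, "rmse": rmse, "mae": mae, "mape": mape}

# Made-up OPS values for three hypothetical players, not real projections.
example = projection_metrics([0.800, 0.900, 0.700], [0.750, 0.950, 0.700])
```

Correlation only measures how well the projections track the results, while the error measures also penalize a system that is consistently too high or too low, which is why it helps to report both.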
Anyhow, before anybody gets out ahead of me, I'd like to talk about one of my favorite subjects: multiple endpoints. Multiple endpoints arguments generally occur when someone looks at a data set and then draws the restrictions to achieve a particular result, but sometimes multiple endpoints become an issue even when there is no intent to rig the numbers.
This has happened here. If I extend the restrictions to the top 176 players, STATS winds up with a higher correlation coefficient than I do. In other words, the two systems were about equal in their ability to project the OPS of the players. This was essentially the case the year before as well.
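
One way to guard against an endpoint problem like this is to recompute the statistic over a range of cutoffs rather than a single one. The sketch below is only an illustration under assumptions: the function names and the `(plate_appearances, projected_ops, actual_ops)` tuple layout are hypothetical, and the four players are made-up numbers.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_by_cutoff(players, cutoffs):
    """Correlation of projected vs. actual OPS for the top-N players by PA.

    players: list of (plate_appearances, projected_ops, actual_ops) tuples.
    cutoffs: iterable of sample sizes N to try.
    """
    ranked = sorted(players, key=lambda p: p[0], reverse=True)
    results = {}
    for n in cutoffs:
        top = ranked[:n]
        results[n] = pearson([p[1] for p in top], [p[2] for p in top])
    return results

# Hypothetical players: adding the fourth one changes the picture noticeably.
sample = [(700, 0.900, 0.950), (650, 0.800, 0.780),
          (600, 0.700, 0.750), (550, 0.850, 0.700)]
by_cutoff = correlation_by_cutoff(sample, [3, 4])
```

If the ranking of two systems flips as the cutoff moves by a handful of players, that is a sign the gap between them is within the noise.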
I also wanted to take a look at Baseball Prospectus' numbers. Last year BP's projections did every bit as well as mine and STATS', and it was surprising that they not only did worse than mine and STATS' this year, but also worse than they had done the year before. The first thing I checked was the Colorado factor. In 2000, none of the Colorado players were counted because I didn't release projections for them (for reasons I won't get into). I also noted that there were some weird projections for a few Colorado players (a .978 OPS for Jeff Cirillo and a .654 OPS for Vinny Castilla). So I guessed that maybe their system had trouble dealing with the unique nature of Coors Field and gave unreliable results.
Anyway, I eliminated the four Rockies players from the 170 (Larry Walker, Todd Helton, Cirillo and Juan Pierre) and three more for whom Coors could have messed with the projections (Eric Young, Vinny Castilla and Ellis Burks) and re-did the correlations with the remaining 163 players. The results were .728, .721 and .682 for me, STATS and BP respectively. So I'm guessing that Colorado was not the main difference between this year and last. That being the case, I'm not sure I can explain the discrepancy other than BP having an off year.
Another thing worth looking at is the players who were missed by the greatest amount. The five players with the largest percentage miss by my system were Brady Anderson, Cal Ripken, Jason Kendall, Edgardo Alfonzo and Barry Bonds. For STATS it was Bonds, Anderson, Kendall, Tim Salmon and Mark Quinn. For BP it was Kendall, Alfonzo, Bret Boone, Bonds and Johnny Damon. For these nine players, none of the systems posted an above-average projection, meaning that the differences were largely due to performances that were abnormal compared to their previous stats. The closest anybody came to an above-average projection for the nine was my projection for Mark Quinn (.819 projected, .757 actual).
What about players projected using minor league numbers? Well, I really don't know how and when STATS or BP used minor league stats, but there were 13 players out of the 170 who had fewer than 502 Major League appearances going into 2001. The correlation coefficients for these 13 were .60, .53 and .41 for me, STATS and BP respectively. While that looks much lower than overall, remember this is a very small sample and, more importantly, the range in performance for these 13 was much smaller than the overall range, and that drives those numbers down. The MAE (.037, .053, .051) and the RMSE (.045, .063, .068) are smaller than the overall numbers, but this again is due to the smaller range. The MAPE (5.0%, 7.3%, 6.8%) is probably the most appropriate metric here, and for it, the numbers are roughly the same.
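
The effect of a narrow range on the correlation coefficient can be seen with a small constructed example. This is a sketch with made-up numbers, not real player data: every hypothetical player misses his projection by exactly .040 of OPS, and the only thing that changes between the two groups is how widely the projections are spread around .800. The function names and parameters are assumptions for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def range_restriction_demo(spread, miss=0.040, n=10):
    """Build n hypothetical players and return (correlation, MAE).

    Projections are spaced evenly from .800 - spread to .800 + spread.
    Actual OPS alternates between projection + miss and projection - miss,
    so the size of every miss is identical regardless of the spread.
    """
    half = (n - 1) / 2
    projected = [0.800 + spread * (i - half) / half for i in range(n)]
    actual = [p + (miss if i % 2 == 0 else -miss) for i, p in enumerate(projected)]
    mae = sum(abs(a - p) for p, a in zip(projected, actual)) / n
    return pearson(projected, actual), mae

wide_corr, wide_mae = range_restriction_demo(spread=0.200)
narrow_corr, narrow_mae = range_restriction_demo(spread=0.050)
```

Both groups have the same average miss, yet the narrow group produces a much lower correlation, so a low correlation in a tightly bunched subgroup does not by itself mean the projections were worse.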
It should be noted, however, that the argument can be made that this represents a selective sample of only the minor leaguers who racked up a bunch of at bats, and that could affect the results. This is undoubtedly true, and combined with the small sample it makes the above numbers regarding minor leaguers not very meaningful. I'm going to try to do a more exhaustive look with next year's stats.
One final note is that the problem from last year of underprojecting the group as a whole did not happen this year. The average OPS of the 170 was .814, and my system, STATS and BP had average OPS numbers for the group of .817, .814 and .820 respectively.
Projecting player performance is a critical aspect of player analysis, possibly the most critical. In the coming weeks I'll discuss why that is, what the difficulties in projecting are, and ways future projections might be improved.
Voros McCracken
Posted: April 09, 2002 at 06:00 AM
Reader Comments and Retorts
1. Voros McCracken Posted: April 10, 2002 at 12:28 AM (#605102)
b) In 2000, I did them in a rush and just posted the park-neutral projections. Therefore I knew the Colorado projections would be off, but guessed that the rest would be close enough since, other than Colorado, parks don't have too big an effect on things.
c) I did tell people why (in fact there's the answer to the question on my web page), but since this article wasn't even about the 2000 projections, I didn't figure it was worth wasting space over it.
d) What's with the attitude?
Between mine and STATS? No way. I believe I mentioned that they were "about equal."
Between STATS and mine and BP's? Hard to say. Probably not, but if the same difference happens once again this year, maybe this year's difference should be given more consideration.
The idea was simply to present generally how each system did since this is something many people have asked for in the past.
Ichiro!, Albert Pujols, David Eckstein, Neifi Perez, Todd Walker, Alex Ochoa, Jose Macias, Benito Santiago, Paul LoDuca, Shea Hillenbrand and Rickey Henderson.
Here's the results for the 144 players who had 300 or more ABs in 2000 and were among the 170 player sample:
Voros: .747 Corr., .063 MAE, .084 RMSE
STATS: .740 Corr., .062 MAE, .085 RMSE
BP: .692 Corr., .074 MAE, .095 RMSE
2000 OPS: .690 Corr., .079 MAE, .100 RMSE
There were no players in the sample who played in Coors in 2001 who didn't play there in 2000, nor was the opposite true for any of the players. The Correlation Coefficient for the 2000 stats is probably not useful since it only measures how well the numbers track with the other set, not necessarily how good of a predictor _unregressed_ it would be. The Mean-Absolute and the Root-Mean-Squared errors are far more important for this comparison.
It's also critical to note that the correlation between my projections and the 2000 stats was .903 with an r-squared of .816. In other words, for any players who had 300 or more at bats in 2000, what they did in 2000 is obviously going to have a substantial impact on anyone's projections. So clearly since the 2000 numbers for such players might be considered the "baseline" for future expectations, the discrepancies between it and the projections are the critical thing to look at, and in those cases the projections tend to do quite well.
As an extra control, if you just look at those players aged 25-29 in 2000, how do the Voros and STATS projections do against 2001? And how do the 2000 stats by themselves do? At this point, I'm removing age from the equation to see where the "value-added" comes in.
Finally, if you take 1998-2000 stats of players aged 25-29 in year 1999, how does all this work out as well?
Are the predictions really so good because you guys have found something interesting, or is it simply because of the age adjustment? Answering the above question should shed some light on this.
Thanks, Tom
And if we go back to use two and three years, the same problem occurs in that fewer and fewer players fit that profile.
James makes another good point in that the similarities between a projection and the players results last year are not the important issue, it's the differences that are key. When looking at the 7 biggest differences between the two on both ends, the projections were closer to the actual results than the 2000 numbers in 10 of the 14 cases, and in all but one case the actual results were in the direction of the projection away from the 2000 stats.
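
A check along these lines could be sketched as follows. This is an illustration under assumptions, not the procedure actually used for the 10-of-14 figure: the function name, the `(prior_ops, projected_ops, actual_ops)` tuple layout and the tie handling are all hypothetical.

```python
def value_added_on_biggest_gaps(players, k=7):
    """Compare projection and prior-year OPS where they disagree the most.

    players: list of (prior_ops, projected_ops, actual_ops) tuples.
    Takes the k largest downward and k largest upward gaps between the
    projection and the prior-year number, then returns a tuple of
    (cases where the projection was closer to actual than the prior year,
     cases where actual moved away from the prior year in the direction
     the projection pointed,
     number of cases examined).
    """
    if len(players) < 2 * k:
        raise ValueError("need at least 2 * k players")

    # Sort by signed gap so both ends of the list hold the extreme cases.
    ranked = sorted(players, key=lambda p: p[1] - p[0])
    extremes = ranked[:k] + ranked[-k:]

    closer = sum(
        abs(proj - act) < abs(prior - act) for prior, proj, act in extremes
    )
    same_direction = sum(
        (proj - prior) * (act - prior) > 0 for prior, proj, act in extremes
    )
    return closer, same_direction, len(extremes)

# Made-up numbers for three hypothetical players, using k=1 for brevity.
result = value_added_on_biggest_gaps(
    [(0.900, 0.850, 0.840), (0.700, 0.760, 0.690), (0.800, 0.805, 0.800)], k=1
)
```

Restricting the comparison to the biggest gaps matters because, for most regulars, the projection and the prior-year line are too close together to tell apart.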
We should also note that this is only the results of a single season. I'll try and piece together 1999 to 2000 as the weeks go on. I have a lot to say on this subject, so I'll save the rest for future installments.
You may be surprised to know, but the more PAs a player gets, the better his performance. So, if you want to project a player's stats for 500 PA or for 200 PA, his RATE stats will be different. The probable reason is that managers, for those average players, are swayed by in-season stats, so much so that if someone has a bad month, his playing time is reduced, thereby not giving that player the benefit of regressing to his normal performance level.
So, if we arbitrarily choose 500 PA as the cutoff, then the projectors who do the best are those who regress the least. The lower the PA threshold, the more regression has to be considered for those players.
I'm not sure what you mean by "statistically test." Besides which, despite the tagline to the article, I'm really not all that concerned with whose system is "best," I personally am only concerned with my system, and I compare it to others as a means to measure where my numbers might be lacking (and I have several ideas on this). The idea is to work on the system to continue to provide insights into how various variables relate to future performance.
Big Ed,
It's really quite hard to figure out systematic differences and I suspect if they exist they'd be different for different systems.
For my system, I'm thinking that I might not handle increases in performance as well as they can be handled. I also am not sure whether weight is a usable variable, and whether height might be better to use. The data set I have shows stronger relationships between weight and future performance than height, but weights can be so error-prone and different sources provide different weights, and weights changing over time...
...I'm thinking it might be a minefield that can't be navigated, whereas height is a little easier to deal with.
Again, I'll get to a lot of this stuff as I release more articles on projections, I just need a certain amount of time to put each together. Sorry.