Arguing over who’s the better player is as much a pastime as baseball itself.
Pedro Martinez or Sandy Koufax? Barry Bonds or Mickey Mantle? Of course it’s impossible to say. You can’t compare players from different eras. Heck, it’s hard enough to compare them between teams, in the same season.
But that doesn’t stop stat junkies from trying. They use equations only slightly less complex than credit derivatives formulas, and no more comprehensible to outsiders than the nose-tapping, ear-tugging, cap-pulling signals of a third base coach.
The latest entry to this field of Monte Carlo simulations and regression analyses and optimization algorithms was posted last Thursday in arXiv, an informal online repository of papers devoted to high-energy physics and self-organizing systems and other such knuckle-balling disciplines.
The study’s authors used network science to crunch the results of every single at-bat between 1954 and 2008 — and thanks to a baseball version of “Six Degrees of Kevin Bacon,” it’s possible to compare players who never faced each other.
There’s actually not much new here, from the description. They start with “Runs Until End”, which is just probability added. Then they take the values and do multiple recursions to adjust each matchup for the values of other matchups involving the principals. Colin Wyers, Vinay Kumar and I were just last week talking in Primer IRC about this, and I mentioned Thorn and Palmer’s Hidden Game of Football included a wagering system proposal based on multiple recursion analysis, in 1988.
On top of not treading much new ground other than computational cycles used (of course, look at the source), the system doesn’t yet value “stolen bases, injuries…differences between ballparks” or defense. If you’re using PA and have this kind of tech at hand, I don’t quite see the big problem with including SB. I don’t see what ‘injuries’ have to do with anything at all.
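For anyone who wants to see what the recursive adjustment amounts to, here is a rough Python sketch of the general idea: rate each batter by the average value of his plate appearances, then repeatedly correct for the quality of the pitchers he faced (and vice versa). This is not the paper's actual algorithm. The additive model (value = batter skill minus pitcher skill), the function names, and the toy data are all assumptions made up for illustration.

```python
from collections import defaultdict


def raw_averages(plate_appearances):
    """Unadjusted average value per batter, ignoring who was pitching."""
    totals, counts = defaultdict(float), defaultdict(int)
    for batter, _pitcher, value in plate_appearances:
        totals[batter] += value
        counts[batter] += 1
    return {b: totals[b] / counts[b] for b in totals}


def adjusted_ratings(plate_appearances, iterations=200):
    """Alternately re-estimate batter and pitcher skill.

    Assumed model (hypothetical): value of a PA = batter_skill - pitcher_skill.
    Each pass credits a batter for the toughness of the pitchers he faced,
    then re-rates pitchers against the updated batter ratings.
    """
    batter_skill = defaultdict(float)
    pitcher_skill = defaultdict(float)
    for _ in range(iterations):
        # Batter pass: add back the skill of each opposing pitcher.
        sums, counts = defaultdict(float), defaultdict(int)
        for batter, pitcher, value in plate_appearances:
            sums[batter] += value + pitcher_skill[pitcher]
            counts[batter] += 1
        for batter in sums:
            batter_skill[batter] = sums[batter] / counts[batter]
        # Pitcher pass: how much value each pitcher took away from batters.
        sums, counts = defaultdict(float), defaultdict(int)
        for batter, pitcher, value in plate_appearances:
            sums[pitcher] += batter_skill[batter] - value
            counts[pitcher] += 1
        for pitcher in sums:
            pitcher_skill[pitcher] = sums[pitcher] / counts[pitcher]
    return dict(batter_skill), dict(pitcher_skill)


# Toy data (invented): batter A mostly faces tough pitcher X,
# batter B only ever faces easy pitcher Y. A and B share pitcher Y,
# which is the link that lets the method compare them.
PAS = [
    ("A", "X", 0.1), ("A", "X", 0.1), ("A", "X", 0.1),
    ("A", "Y", 0.3),
    ("B", "Y", 0.2),
]

raw = raw_averages(PAS)
bat, pit = adjusted_ratings(PAS)
print("raw:", raw)
print("adjusted:", bat)
```

On this toy data the raw averages put B ahead of A, while the adjusted ratings put A ahead of B, because A's numbers came mostly against the tougher pitcher. Only the gaps between ratings mean anything here; the overall level is arbitrary in this model. The shared-opponent chain is the same thing that makes the "Six Degrees" comparison across eras possible.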
Reader Comments and Retorts
1. Cyril Morong Posted: August 05, 2009 at 08:55 PM (#3281630)
1 Barry Bonds 1.051
2 Albert Pujols 1.049
3 Manny Ramirez 1.004
4 Todd Helton 1.002
5 Mickey Mantle .996
6 Mark McGwire .982
7 Frank Thomas .974
8 Lance Berkman .973
9 Alex Rodriguez .967
10 Jim Thome .966
11 Larry Walker .965
12 Vladimir Guerrero .963
13 Chipper Jones .955
14 Willie Mays .949
All of the top 10 hitters the study identified are in the top 14 of OPS. I would like to see how many more wins a guy generates using their method than, say, based on OPS. I took a very quick look at the paper and I did not see this addressed. How much better or worse is Mantle, for example, based on their findings? Should we add 5 wins to what we already thought his WARP was? Take away 3?
I know that when I have looked at things like WPA vs. OPS, the correlations are very high. It seems like they are doing WPA but then adjusting everyone’s values based on strength of opposition using some kind of iterative method, like college football computer rankings.
I tried to answer that at my blog in February. Here is the link:
http://cybermetric.blogspot.com/2009/02/how-good-has-albert-pujols-been-and-how.html
Cy
Pujols.
Bonds Aging vs. Aaron Aging
Bonds Greatest Feat Might Be Improvement
"Has Anyone Aged as Well As Barry Bonds?"