Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Monday, February 05, 2007

Dan Rosenheck’s WARP Data

WARP Methodology and Results

Thanks, Dan!

John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 02:59 PM | 666 comment(s)

Reader Comments and Retorts

   1. John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 04:04 PM (#2292320)
That's Dan's WARP data, not WARPED data. :-D

Seriously, we appreciate you taking the time to create this and allowing us here at the HoM to examine it.
   2. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 04:07 PM (#2292323)
Thanks for the thread.

Here is the text of the methodology explanation included in the .zip file.

With some trepidation, I’ve decided to leap headfirst into the überstat wars. The problems with both BP WARP and particularly Win Shares have been well-documented, above all regarding their replacement level definitions and opaque or nonexistent timelines. I’m here to offer my own system, which I immodestly believe to be *far* superior to either available system (although I do leech off of them for defense). I’ll post a detailed account of the methodology here, and I have the data available in spreadsheet form. I hope many of you will find this useful for the 1994 and subsequent elections. I look forward to discussions on this thread either about my approach, or about conclusions one might draw from the data and how they might alter voters’ ballots.

I only have data for non-catcher NL starters from 1893-2005, but I will hopefully be able to post estimates for pitchers, NL catchers, and all AL players who are receiving HoM consideration in time for the 1995 ballot.

Methodology

Step 1: Wins above league average

Offense

1. Using Extrapolated Runs (1947-2005) or BaseRuns (1893-1946), find out how many runs a player created.
2. Subtract the player’s batting outs from the average batting outs per team for that league-season to determine the outs left over for teammates on a theoretical average team.
3. Multiply the remaining outs by the league runs scored per out, and add the player’s runs created, to get theoretical team runs scored.
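
The three offensive steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described: the function name, the argument names, and the sample numbers are assumptions of this sketch, not values from the spreadsheet.

```python
def theoretical_team_runs_scored(player_runs_created, player_outs,
                                 avg_team_outs, league_runs_per_out):
    """Runs scored by an otherwise league-average team with this player on it."""
    # Step 2: outs left over for the player's (league-average) teammates.
    remaining_outs = avg_team_outs - player_outs
    # Step 3: teammates score at the league rate; the player adds his own runs.
    return remaining_outs * league_runs_per_out + player_runs_created


# Illustrative inputs only (hypothetical, not taken from the data set).
runs_scored = theoretical_team_runs_scored(
    player_runs_created=110.0,
    player_outs=400,
    avg_team_outs=4100,
    league_runs_per_out=0.17,
)
print(round(runs_scored, 1))
```

Step 1 (Extrapolated Runs or BaseRuns) is taken as an input here, since the post does not spell out those formulas.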

Defense

1. Input raw Fielding Win Shares (FWS) and BP FRAA (1893-1999), Chris Dial’s Zone Rating RSpt (1987-2005), UZR (2000-2003), and Fielding Bible +/- (2003-2005) for every starter (led team in PA at the position) in the league.
2. Calculate the league average FWS per full season at each position for each season. Multiply this by each player-season’s percentage of the season played, and subtract the product from each player-season’s FWS, to get FWS above/below average. Divide by three, and multiply by the league-season’s runs per marginal win (equal to 3.32*((runs per game)^.7103)), to get Win Shares Fielding Runs Above Average (WS-FRAA).
3. Calculate the standard deviation (stdev) of RSpt per full season for each position for the following time periods: 1987-1999, 2000-2003, and 2003-2005. Calculate the stdev of BP FRAA and WS-FRAA per full season for each position from 1987-1999, the stdev of UZR per full season from 2000-2003, and the stdev of +/- from 2003-2005.
4. Multiply each player-season’s BP FRAA and WS-FRAA by the ratio of the RSpt stdev from 1987-1999 to the respective BP FRAA and WS-FRAA stdev for the same time period. Multiply each player-season’s UZR by the ratio of the RSpt stdev from 2000-2003 to the UZR stdev for the same time period. Multiply each player-season’s +/- by the ratio of the RSpt stdev from 2003-2005 to the +/- stdev for the same time period. This standardizes the stdevs for all the different defensive metrics (to a level less than BP FRAA but higher than WS).
5. Average the modified BP FRAA and WS-FRAA scores for each player-season from 1893-1986. Take a weighted average (40% RSpt, 30% modified BP FRAA, 30% modified WS-FRAA) for each player-season from 1987-1999. Take a weighted average (70% modified UZR, 30% RSpt) for each player-season from 2000-2002. Take a weighted average (55% modified UZR, 30% modified +/-, 15% RSpt) for each player-season from 2003. Take a weighted average (65% modified +/-, 35% RSpt) for each player-season from 2004-05. This is the player’s Fielding Runs Above Average (Rosenheck FRAA).
6. For the 1893-1918 period, when LF was more important and difficult than RF, add 2 runs per season to LF and subtract 2 per season from RF.
7. Subtract the player’s Rosenheck FRAA from the league average runs scored per team. This is theoretical team runs allowed.
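
A rough Python sketch of the defensive steps above may help. It assumes the per-position standard deviations have already been computed, that every metric passed to the blend except RSpt has already been rescaled, and that the LF/RF correction applies to a full season. All function names and dictionary keys are made up for this illustration.

```python
def runs_per_marginal_win(runs_per_game):
    # Formula quoted in step 2: 3.32 * (runs per game)^.7103
    return 3.32 * runs_per_game ** 0.7103


def ws_fraa(fws, league_avg_fws, fraction_played, runs_per_game):
    # Step 2: Fielding Win Shares above average, converted to runs.
    fws_above_avg = fws - league_avg_fws * fraction_played
    return fws_above_avg / 3.0 * runs_per_marginal_win(runs_per_game)


def rescale(value, rspt_stdev, metric_stdev):
    # Step 4: put a metric on the RSpt scale for the same period.
    return value * rspt_stdev / metric_stdev


def era_weights(year):
    # Step 5: blend weights by era ("pm" stands for Fielding Bible +/-).
    if year <= 1986:
        return {"bp": 0.5, "ws": 0.5}
    if year <= 1999:
        return {"rspt": 0.40, "bp": 0.30, "ws": 0.30}
    if year <= 2002:
        return {"uzr": 0.70, "rspt": 0.30}
    if year == 2003:
        return {"uzr": 0.55, "pm": 0.30, "rspt": 0.15}
    return {"pm": 0.65, "rspt": 0.35}


def rosenheck_fraa(year, position, metrics):
    # metrics maps each key to a value already rescaled per step 4.
    fraa = sum(w * metrics[name] for name, w in era_weights(year).items())
    # Step 6: LF/RF correction for 1893-1918 (assumes a full season played).
    if 1893 <= year <= 1918:
        fraa += {"LF": 2.0, "RF": -2.0}.get(position, 0.0)
    return fraa


def theoretical_runs_allowed(league_avg_runs_per_team, fraa):
    # Step 7: a good fielder lowers the theoretical team's runs allowed.
    return league_avg_runs_per_team - fraa


# Hypothetical player-seasons, for illustration only.
print(round(rosenheck_fraa(1995, "SS", {"rspt": 10.0, "bp": 8.0, "ws": 6.0}), 1))
print(round(rosenheck_fraa(1910, "LF", {"bp": 4.0, "ws": 2.0}), 1))
```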

Record

1. Input the theoretical team runs scored and runs allowed into the Pythagorean win formula (exponent = ((RS+RA)/G)^.285) to get a winning percentage. Multiply this by 162 to get theoretical team wins (W1). This is a straight-line season-length adjustment.
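
As a sketch of the Record step, here is the variable-exponent Pythagorean calculation in Python. The function and argument names are invented for this example, and the sample run totals are illustrative only.

```python
def theoretical_wins(runs_scored, runs_allowed, games, season_length=162):
    """W1: Pythagorean wins with exponent ((RS+RA)/G)^.285, scaled to 162 games."""
    exponent = ((runs_scored + runs_allowed) / games) ** 0.285
    rs_pow = runs_scored ** exponent
    ra_pow = runs_allowed ** exponent
    win_pct = rs_pow / (rs_pow + ra_pow)
    # Straight-line season-length adjustment.
    return win_pct * season_length


# A team that scores exactly as many runs as it allows is a .500 team.
print(theoretical_wins(700.0, 700.0, 162))
# Hypothetical totals: the team scores 40 more runs than it allows.
print(round(theoretical_wins(740.0, 700.0, 162), 1))
```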


Step 2: League adjustment

1. Conduct a multiple regression analysis on the stdev of W1 above average per season from 1893-2005. (For those who are interested: the formula is .00366*Year + .1254*Runs per game - .028777 * NL Teams - .00567 * MLB Teams - .00256 * Season Length - .932 * Win% of worst team in league - .0278 * Years since expansion (Max 12) + .15 * World War (1 or 0) + .00158 * Estimate of player population (14.8M in 1893, 60.1M in 2005) - .2466* Integration (1 or 0) + 2.789, and the r^2 is .5766. The graph is the second tab of the Excel spreadsheet in the file.).
2. Use the regression equation to get a projected stdev for each league-season.
3. Regress each player-season’s W1 to 81 using the following equation, where Reg = 2005 stdev / the projected stdev of the league-season in question: W2 = (81*(1-Reg)) + (W1*Reg).
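
The W1-to-W2 regression can be sketched as below. The projected standard deviations are taken as inputs rather than recomputed from the regression equation, because the units of its inputs are not stated. The sample values (7.7 wins above average, a 3.25 projected stdev for the 1895 NL, 2.30 for the 2005 NL) are the figures quoted later in this thread; treating the 7.7 as a W1 figure is this sketch's reading, and the function names are hypothetical.

```python
def regression_factor(stdev_2005, projected_stdev):
    # Reg = 2005 stdev / projected stdev of the league-season in question.
    return stdev_2005 / projected_stdev


def w2_from_w1(w1, stdev_2005, projected_stdev):
    # W2 = 81*(1-Reg) + W1*Reg, i.e. W1 is pulled toward an 81-win team.
    reg = regression_factor(stdev_2005, projected_stdev)
    return 81.0 * (1.0 - reg) + w1 * reg


# A player 7.7 wins above average in a league with a 3.25 projected stdev.
w1 = 81.0 + 7.7
w2 = w2_from_w1(w1, stdev_2005=2.30, projected_stdev=3.25)
print(round(w2 - 81.0, 2))  # wins above average after the adjustment
```

Note that a league with a projected stdev below the 2005 figure gets Reg above 1, so its seasons are stretched rather than shrunk.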

Step 3: Replacement level

1. Calculate W2 above/below average per full season played for every starter.
2. Average the W2 below average for the worst three starters at each position in the league for every league-season.
3. Average these worst-three-regulars averages at each position from 1985-2005.
4. Subtract the worst-three-regulars average from 1985-2005 from Nate Silver’s empirically determined Freely Available Talent (FAT) replacement levels for each position for the same time period.
5. Add the difference to the worst-three-regulars average at each position for each year. This is the FAT level at each position for each season.
6. Take a nine-year moving average of the FAT level for each position over time. This is the replacement level (measured in wins below average per full season). A graph of positional replacement levels over time is the third tab of the Excel spreadsheet.
7. For each player-season, multiply the replacement level by his fraction of the season played, and subtract the product from his W2. This is WARP2.
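
A minimal sketch of the replacement-level steps follows. Several details are assumptions of the sketch, not statements from the post: replacement levels are stored as negative wins relative to average, the nine-year moving average is centered and simply shortens at the ends of the series, and the W2 fed into the last step is expressed as wins above average. All names and sample numbers are hypothetical.

```python
from statistics import mean


def worst_three_average(w2_above_avg_per_full_season):
    # Step 2: average of the three worst starters at one position in one league-season.
    return mean(sorted(w2_above_avg_per_full_season)[:3])


def fat_offset(worst3_by_year, fat_level, start=1985, end=2005):
    # Steps 3-4: gap between the FAT level and the 1985-2005 worst-three average.
    baseline = mean(v for y, v in worst3_by_year.items() if start <= y <= end)
    return fat_level - baseline


def fat_levels(worst3_by_year, offset):
    # Step 5: shift every season's worst-three average by the same offset.
    return {year: value + offset for year, value in worst3_by_year.items()}


def moving_average(series_by_year, window=9):
    # Step 6: centered moving average (assumed), truncated at the edges.
    half = window // 2
    smoothed = {}
    for year in series_by_year:
        values = [series_by_year[y]
                  for y in range(year - half, year + half + 1)
                  if y in series_by_year]
        smoothed[year] = mean(values)
    return smoothed


def warp2(w2_above_avg, replacement_level, fraction_played):
    # Step 7: subtract the prorated (negative) replacement level.
    return w2_above_avg - replacement_level * fraction_played


# Hypothetical inputs, for illustration only.
print(round(worst_three_average([3.0, -3.4, -2.8, -3.0, 0.5, 1.2]), 2))
print(warp2(2.0, -3.0, 0.5))
```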

Step 4: Salary estimator

1. Divide a player’s WARP2 by his fraction of the season played (measured by % of the average PA per lineup slot for that year) to get WARP2 per full season.
2. Use Nate Silver’s 2005 salary estimator ($212,730*WARP^2 + $402,530*WARP) to find out how much the player would have earned on the 2005 market had he played a full season. Convert all negative numbers to $0.
3. Multiply this by the player’s fraction of the season played to find out how much he would have earned for that season.
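
The salary estimator is easy to sketch in Python. One assumption is flagged in the code: any negative WARP is floored at $0 before the formula is applied, because the squared term would otherwise turn a large negative WARP into a positive salary. The function names are invented for this example.

```python
def full_season_salary(warp_per_full_season):
    # Assumption: negative WARP is worth $0 (the post says only
    # "convert all negative numbers to $0").
    if warp_per_full_season <= 0:
        return 0.0
    # Estimator quoted in step 2: $212,730*WARP^2 + $402,530*WARP
    return 212_730 * warp_per_full_season ** 2 + 402_530 * warp_per_full_season


def season_salary(warp2, fraction_played):
    # Step 1: scale up to a full season; step 3: prorate the salary back down.
    warp_full = warp2 / fraction_played
    return full_season_salary(warp_full) * fraction_played


# 3.0 WARP2 in half a season is priced as half of a 6.0-WARP full season.
print(season_salary(3.0, 0.5))
```

Because the estimator is convex, 3.0 WARP2 packed into half a season is worth more than 3.0 WARP2 spread over a full one, which is how the system rewards rate as well as bulk.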

Notes

Step 1: The offensive methodology is fairly straightforward, and quite similar to most other approaches. Note that BP’s BRAA are standardized to a 4.5 runs per game league. The FRAA number is simply a weighted average of the best defensive statistics available to us, but with the standard deviations equalized so that one doesn’t count for more than another.

Step 2: Using a regression-predicted stdev rather than an actual stdev should account for all factors that determine stdev EXCEPT for changes in *concentration* of talent (i.e., very many or very few great players in the league at a given time) and random, meaningless fluctuation (no NL player happened to have a big year in 1995, lots did in 2001). If you did real standard deviations, Zack Wheat would probably look like Tris Speaker, since they were both the premier hitters of their era in their respective leagues. By using projected stdev, we can factor out things like integration and expansion that affect the spread of performance in the league while still giving actual talent its due.
And NOTA BENE: DO NOT CONFUSE THIS WITH EITHER A TIMELINE ADJUSTMENT OR A LEAGUE DIFFICULTY ADJUSTMENT. IT IS NEITHER. To be clear, the 1993 NL is regressed MORE than the 1914 NL. Anyone who thinks that the level of play was higher in 1914 than 1993 is crazy. What this corrects for—and NOTHING ELSE—is what is often colloquially called on this site the “ease of domination,” when people talk about it being “easier” to “accumulate win shares or WARP1” in certain years than in others. If you want to timeline or adjust for league difficulty, you still need to make those adjustments—this does NOT account for them.

Step 3: As far as I’m concerned, Nate Silver’s FAT research is the last word on replacement level. Using a percentage of positional or league average is silly: the former suggests that the presence of, say, A-Rod, Jeter, Nomar, and Tejada makes a replacement player (Neifi Pérez, Pat Meares) better than they otherwise would be; the latter is incapable of capturing changes in the relative depth of positions over time. By contrast, the worst-three-regulars average (adjusted for the gap between it and the FAT level) should always track the real empirical replacement level, since they are the most likely to be close to actual replacement players.

Step 4: The salary estimator concept, which I should give credit to David Zylberberg (‘zop on BTF) for coming up with originally, seems to me to be the ideal way to combine the value of career and peak, durability and rate. “What would the market pay for this player-season?,” a University of Chicago economist would ask. Here’s the answer.
   3. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 04:13 PM (#2292325)
Actually let me be even clearer about the W1-W2 adjustment. I agree that a pennant is a pennant, and I oppose timelining. All I am doing is correcting for the spread of performance in a league, so that a two-standard deviation contribution to a pennant is valued the same in 1893 and in 2005.
   4. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 04:29 PM (#2292338)
Here are the first two things that leap out to me about my results.

1. Is it any coincidence that two of the last three times the single season home run record was set (1961 and 1998) were expansion years? The stdev-regression equation finds that years since expansion (combined with the typically low worst-team win% in expansion years) has one of the strongest correlations to league stdev, so expansion seasons (Willie Mays' and Frank Robinson's 1962, Willie McCovey's 1969-70, Barry Bonds' 1993 and Bagwell's 1994, and McGwire's 1998) tend to get docked pretty harshly in the W1-W2 adjustment. I think this is fair--we tend to not discount big seasons by saying "it was an expansion year" the same way we do by saying "it was 1930 or 1894." I think we should.

2. Most replacement levels seem pretty stable, but the one that bounces all over the place is 2B. Rather than simply "switching places with 3B before 1920," it seems to be around its historical average in the 1890s, SOARS up to around corner outfielder level around 1910 (ouch, Larry Doyle--if it's the same in the NL then Lajoie and Collins are somewhat overrated), then falls back to a historical norm for the 20s through the 70s, then ZOOMS up again in the 1980s--sorry, Ryno--before returning again to its long term average by the late 1990s. I know (I think) what accounts for the deadball spike (although not for why 2B wasn't so strong in the 1890s as well)--but what happened in the 1980's? The major increase in NL 2B production is reflected in the positional averages as well, I believe.

I haven't dug much into the individual player-season data, but I'll start poking around now. (It took forever just to derive it). I'll get cracking on estimates for AL players, pitchers, and NL catchers, but they will be much less accurate than these numbers, unfortunately--it takes a *long* time to crunch this (I enter in all the FRAA, FWS etc. by hand! If you've got a faster way to do that, let me know!)
   5. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 04:35 PM (#2292342)
3. Interesting how all four infield replacement levels dip around 1980. I do have data (unrefined) for AL SS from 1960-2005, and it shows the exact same drop at the exact same time, only even more pronounced. And then NL 2B takes off from there...explanations, anyone?

It's worth reiterating that these replacement levels are *not* measures of the overall strength of a position--they have nothing to do with the performance of the best players at a position. They are only a measure of the *depth* of the position, what the freely available talent level is.
   6. Juan V, posting on behalf of Juan V. Posted: February 05, 2007 at 05:10 PM (#2292365)
Pretty good stuff, I'll see how I can incorporate that into my system. Any rating system that likes David Concepcion is good for me :)
   7. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 05:44 PM (#2292380)
I used to think Concepción was borderline. But once you take into account the low standard deviation of the era in which he played and that it was a historic nadir of depth at the SS position, he becomes an obvious selection to me.
   8. sunnyday2 Posted: February 05, 2007 at 06:09 PM (#2292402)
The expansion year observation is big, theoretically speaking. I have sort of mentally downgraded Norm Cash's 1961, but that's easy to do because it stands out from his own line. The idea of discounting everybody for those years has never really occurred to me (and to be honest, I probably never will, now that we are approaching the end of the project; lazy bastard). In a perfect world, it would be a good idea. Of course, in a perfect world, I would do something like what Dan is doing.

I opened up your spreadsheet, BTW...do you have player totals somewhere?
   9. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 06:25 PM (#2292405)
Also because Cash admitted he was corking his bat that year!

I don't, but it's as easy as sorting by player name and hitting AutoSum...just be careful for missing partial seasons, since I only did starters (I'm trying to fill in partial seasons for HoM candidates, but there's lots of stuff to do!).
   10. Steve Treder Posted: February 05, 2007 at 06:33 PM (#2292408)
Also because Cash admitted he was corking his bat that year!

Sigh ... as well as every other year of his entire career ...
   11. zoperino,if youre not into the whole brevity thing Posted: February 05, 2007 at 06:34 PM (#2292409)
I have sort of mentally downgraded Norm Cash's 1961, but that's easy to do because it stands out from his own line. The idea of discounting everybody for those years has never really occurred to me

Gosh, to me the 1961 AL always stuck out as a crazy stdev outlier.....

Cash, Mantle, Maris, Howard...
   12. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 06:39 PM (#2292413)
and Jim Gentile!!!
   13. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 07:24 PM (#2292437)
I forgot to mention that yes, these are park-adjusted. Five year weighted moving average of baseball-reference park factors.
   14. Dandy Little Glove Man Posted: February 05, 2007 at 09:12 PM (#2292479)
Expansion seasons (Willie Mays' and Frank Robinson's 1962, Willie McCovey's 1969-70, Barry Bonds' 1993 and Bagwell's 1994, and McGwire's 1998) tend to get docked pretty harshly in the W1-W2 adjustment. I think this is fair--we tend to not discount big seasons by saying "it was an expansion year" the same way we do by saying "it was 1930 or 1894." I think we should.

I couldn't agree more. One of my biggest problems with previously existing player evaluation systems is that they don't appropriately account for expansion. It dilutes the league and brings down the average and replacement levels against which players are judged. Expansion is largely responsible for the fact that most systems have disproportionate numbers of players who peaked in the 60s/early 70s and 90s/early 00s toward the top of the rankings, with far lower representation from the mid 70s through the 80s. I've only worked with OPS+ and ERA+ data on this issue, but my research has shown that expansion causes a bump in both of these statistics in terms of the league top 10s for about 5 years before returning to their previous levels. What some may find counterintuitive is that OPS+ and ERA+ always move in tandem at the highest levels, rather than the league-leading hitters improving at the expense of the pitchers or vice versa. The top values appear to be much more influenced by changes in the average level, chiefly through expansion, than by changes in the players occupying the top spots on the leaderboard. I'm interested to see how the player rankings differ from other evaluation systems by incorporating this factor into the model.
   15. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 10:33 PM (#2292514)
Of course they move in tandem at the highest levels. You add in 50 guys to the league who were in the minors the year before. The best hitters beat up on the minor league pitchers, while the minor league hitters drag down the average allowing the best hitters to produce even better league-relative numbers. The same is true for pitchers (they strike out all the minor league hitters, while the minor league pitchers pull up the league ERA). It's a pretty substantial effect, and as I said, it has one of the strongest coefficients in the regression on stdev of wins above average.
   16. Kyle S Posted: February 05, 2007 at 11:56 PM (#2292566)
The spreadsheet is only NL guys, right?
   17. Bob "Jugement" Dernier Posted: February 06, 2007 at 12:02 AM (#2292569)
But why wouldn't a WARP system be neutral toward expansion seasons? Those new clubs absorb the freely available talent, so that the best player you can get for free is considerably worse than he was a year ago. Hence Willie McCovey gets a lot more valuable overnight. I don't see how that's any different than Concepcion happening to peak in a bad epoch for shortstops.

I am always stupid in these conversations, so feel free to point out why :)
   18. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 12:11 AM (#2292571)
Only NL, and no catchers. And it's taken me six months to get just that much! Like I said, I'm going to try to have estimates for pitchers and AL players and NL catchers under the group's consideration "by 1995."

Bob Dernier Cri, you are right that expansion lowers the absolute empirical replacement level relative to average (and in early versions of this system, guys from the 8-team leagues came out poorly since replacement was so close to average). But my approach here is to first standardize all the wins-above/below-average scores for every player to the projected standard deviation of the league, putting all seasons (regardless of expansion, league size, and every other factor under the sun) on an equal footing. THEN I look at the worst three regulars and calculate replacement level. So replacement level is really tied to a z-score--say, 2 standard deviations below the mean (I have no idea what the real number is)--rather than an absolute figure.

My point about expansion was that since those seasons tend to have much higher standard deviations, they get regressed more than other seasons.
   19. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 09:20 AM (#2292633)
The best way to think about these numbers is, by how many standard deviations did a player exceed the replacement players at his position for his time? That's why Concepción does so well--he comes out similarly in overall value (career + peak, as measured by the salary estimator) to Jesse Burkett in this system. He was a league average hitter for his career (with some well-above-average seasons), a plus baserunner, and an extraordinary fielder, in an era where a) talent was closely bunched together and b) shortstops were historically awful. Let's take, say, his 1979, not even his best year. He was 1.7 wins above average offensively and 1.5 wins above average defensively, for 3.2 wins above average total (he had an above-average number of plate appearances that season, so for the purpose of comparison he was 3.0 wins above average per season played). By contrast, Atlanta started Pepe Frias, who was just about a replacement level shortstop that year. Frias was -2.8 wins per season with the bat and -.6 wins per season with the glove, for -3.4 wins per season total. Now, the projected standard deviation for the 1979 NL is 2.34 wins per season (the real one was 2.27), so Concepción was (3.0-(-3.4))/2.34 = 2.74 stdevs better than Frias per season. Readjusting for Concepción's high PA total, he was 2.87 stdevs above replacement.

Now let's compare him to, say, 1895 Jesse Burkett. After straight-line-adjusting for season length, Burkett was 6.8 wins above average with the bat and 0.9 wins above average with the glove, for 7.7 wins above average total. He too had an above-average PA total, so he was 7.1 wins above average per season. A replacement LF for 1895 would be someone like Tommy McCarthy, who was -0.6 wins per season with the bat and -0.9 wins per season with the glove, for -1.5 wins per season total. The projected standard deviation for the 1895 NL is 3.25 (the real one was 3.27), so Burkett was (7.1-(-1.5))/3.25 = 2.65 standard deviations per season better than McCarthy. Readjust for Burkett's high PA total, and he was 2.89 stdevs above replacement.

Thus, Burkett's raw .409/.486/.524 line obviously dwarfs Concepción's modest .281/.348/.415, his 154 OPS+ is still far superior to Concepción's 107, and his 9.4 wins above replacement exceed Concepción's 6.7. But once you adjust for the "ease of domination" of the 1895 NL, the two seasons were equally valuable: both made a 2.9-standard-deviation-above-replacement contribution to a pennant. In the 2005 NL, with a 2.30 projected standard deviation, a 2.9 stdev-above-replacement contribution is worth 2.3*2.9 = 6.7 WARP.
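
For anyone who wants to check the arithmetic in this comparison, here is a short Python sketch that reproduces the per-season figures quoted above. The function name is made up for the example, and the final adjustment for each player's above-average PA total is left out because its exact form isn't given.

```python
def stdevs_above_replacement(player_waa, replacement_waa, league_stdev):
    # Gap between the player and a replacement at his position,
    # in units of the league's projected standard deviation.
    return (player_waa - replacement_waa) / league_stdev


# 1979 Concepción vs. Frias, projected NL stdev 2.34
concepcion = stdevs_above_replacement(3.0, -3.4, 2.34)
# 1895 Burkett vs. McCarthy, projected NL stdev 3.25
burkett = stdevs_above_replacement(7.1, -1.5, 3.25)
# A 2.9-stdev season expressed in 2005 NL wins (projected stdev 2.30)
warp_2005 = 2.9 * 2.30

print(round(concepcion, 2), round(burkett, 2), round(warp_2005, 1))
# prints: 2.74 2.65 6.7
```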

I hope this example makes clear what this statistic measures, and why I feel it can add to the discussion and perhaps change people's opinions as we move into the final ballots.
   20. Bob "Jugement" Dernier Posted: February 06, 2007 at 09:31 AM (#2292639)
Well, I still don't see why a comparison (for instance) of Dave Concepcion and Pee Wee Reese should involve a comparison of Pepe Frias and Lennie Merullo, but having read your detailed explanation of the math, Dan, this is certainly a thorough way of looking at the problem of what a player contributed toward winning a pennant in a given year.
   21. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 09:43 AM (#2292643)
It's the exact same reason why a comparison of Reese and Gil Hodges should involve a comparison of Merullo and Whitey Lockman! The reason why Reese's .272 EqA is worth more than Hodges' .289 EqA is that the guy you would have to replace Reese with, Merullo, only puts up a .231 EqA, while Lockman, the guy you replace Hodges with, puts up a .263. Similarly, the guy you have to replace Concepción with, Frias, only produces a .204 EqA. It seems to me that you either have to disregard replacement level altogether, in which case your Hall of Merit will be 90% outfielders and first basemen, or you take it into account, in which case you have to recognize that replacement levels change over time and that the gap between 1B and SS in the 50s NL isn't much bigger than the gap between SS in the 50s NL and SS in the 70s NL.
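
The replacement-gap argument above comes down to two subtractions, sketched here in Python with the EqA figures quoted in this comment (the variable names are just for illustration):

```python
# EqA figures as quoted in the comment above.
reese, merullo = 0.272, 0.231      # SS and his replacement
hodges, lockman = 0.289, 0.263     # 1B and his replacement

reese_gap = round(reese - merullo, 3)
hodges_gap = round(hodges - lockman, 3)

# Reese has the lower EqA but the larger margin over his replacement.
print(reese_gap, hodges_gap)
# prints: 0.041 0.026
```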
   22. Eric Chalek (Dr. Chaleeko) Posted: February 06, 2007 at 09:58 AM (#2292652)
Dan,

Although I recognize that your system is limited at this time, one thing I'd really like to see is a DRpHOM-not-HOM list. In other words, I'd like to know what guys your system would put in and leave out, among the players we've considered that you currently have data for. You could just assume the NgLs and ALs and NL catchers are the same for now, but show us what guys would be different via your system than by HOM consensus. It would just help a bit in understanding the results your system offers.
   23. Dizzypaco Posted: February 06, 2007 at 09:58 AM (#2292653)
Dan,

I appreciate your efforts to adjust for league strength. I suspect the formula is just throwing various quality-of-play factors together somewhat arbitrarily, but it's an honest attempt.

The bigger problem for me is that I believe we have no idea how league strength in, say, 1895 compares to 1995. I personally believe that league strength is vastly higher in recent years than in the early years of baseball, but I have no idea how much, and frankly, I don't think anyone else does either. Taking expansion into account makes sense when comparing the results of 1961 with 1960, but makes no sense to me when comparing 1993 and 1893.

It's not just that the potential population that MLB draws from is much higher in 1993 than 1893, it's that we don't have the foggiest idea what that population really is. Talent aside, not every 25 year old male has the potential to play major league ball no matter how good that player is, and it's not just because of segregation and other similar issues. In 1893, how many young people were playing baseball often enough to get good enough to play in the Majors? How many were playing some type of organized baseball? Of those, how many had the potential to be noticed by scouts? How many played for some type of "minor league" team, despite the fact that they were good enough to play in the National League? I have no idea, but my guess is that the real population that MLB was drawing from is much, much smaller than what is estimated in the formula. Not only does modern Major league baseball draw from a much, much larger population (young males in US + young males in other countries), but there is a much more organized system for funneling the best players at each level to the major leagues.

So what do we do about it? I'm not a believer in using a timeline - not because I don't think the level of competition is higher today, but because I don't think there is any realistic way for measuring the difference. However, if we are not going to adjust for the quality of play over time, I also don't think we should be adjusting for minor differences in year to year play either, such as are caused by expansion.

I am aware that people have made attempts to measure differences in quality of play over time, but I have never agreed with the methodologies. I know many, if not most people on BTF disagree with me on some of these points, but it's my 2 cents.
   24. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 09:59 AM (#2292654)
I'll try to do that today.
   25. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 10:18 AM (#2292664)
Dizzypaco,

Thanks for your comments. I'm afraid I can't be clear enough about this (which is why I put it in bold in the methodology description)--I am not attempting to adjust for league strength or quality of play!!!

People conflate standard deviations and quality of play, I imagine due to Stephen Jay Gould's article, but they are by no means the same thing. To repeat, I regress the 1993 NL more than the 1914 NL--this is not a league "strength" adjustment. It's a league standard deviation adjustment, nothing more and nothing less. All I am doing is measuring the spread of performance in a league and standardizing it across eras, so that a two-stdev contribution to a pennant in 1893 is worth the same as a two-stdev contribution to a pennant in 2005. As I said, if you want to adjust for quality of play or timeline, you have to make that adjustment yourself to my WARP2 numbers. The projected standard deviation of the 1914 NL was the same as that of the 1993 NL, but clearly the level of play was higher in 1993, and I'm not accounting for that. I personally am against timelining. I am just trying to be fair to all eras, which requires looking at the distribution of performance in each season. 10 WARP in 1893 "bought" fewer pennants than they do in 2005, because the stdev was higher. That's true no matter whether 1893 was a particularly strong or weak league in terms of absolute quality of competition.
   26. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 11:16 AM (#2292703)
(but 10 WARP in 1914 "bought" more pennants than 10 WARP in 1993)
   27. zoperino,if youre not into the whole brevity thing Posted: February 06, 2007 at 11:22 AM (#2292711)
<straightman>

Dan, I see that your system seems to love 70's-80's sluggers like Mike Schmidt, Pedro Guerrero, Jack Clark, and Dale Murphy. Could you explain why?
</straightman>
   28. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 12:04 PM (#2292748)
Why yes, David, I do believe I could--although it now looks to me like Guerrero, Clark, and Murphy are all juuuust on the wrong side of the in/out line ($75M career salary). The simple reason is that the late 70s and all of the 80s were the lowest standard deviation era in the league's history--those years were the "hardest to dominate" (as measured by OPS+ or WARP1) in the entire NL. So (making these numbers up), a 140 OPS+ in 1985 might be equal to a 150 in 1997 might be equal to a 160 in 1935. Adjusting for the "difficulty of domination" of those days makes Guerrero, Clark, and Murphy appear to have contributed about the same number of pennants as, say, Max Carey or Cupid Childs.
   29. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 12:50 PM (#2292797)
OK, here's what I've got, using my best estimates for AL and pre-1893 seasons. It looks to me like the cutoff for inner circle is $150M, and in/out is somewhere around $75M. Of course there is a margin of error on these numbers, so I wouldn't draw too much attention to differences of $2 million--picking among the 70-to-75M batch is really just a question of taste. 2006 stats are not included. War credit is given, and fairly liberally at that (including seasons like Mays '53 and Maranville '18). Reese and Slaughter certainly depend on it. The guy that waaay leaps out at me is Larkin--almost inner circle. Wowee.

Name                Career Salary
Barry Bonds         $355,075,512
Honus Wagner        $301,940,127
Willie Mays         $240,268,011
Rogers Hornsby      $231,526,395
Stan Musial         $210,776,870
Hank Aaron          $210,319,344
Mike Schmidt        $209,650,472
Joe Morgan          $181,519,676
Frank Robinson      $160,375,579
Mel Ott             $154,068,892
Arky Vaughan        $152,338,635
INNER CIRCLE
--------------------
Barry Larkin        $145,872,934
Ozzie Smith         $129,998,781
Jeff Bagwell        $128,092,272
Gary Sheffield      $123,033,450
Bill Dahlen         $119,772,919
Ed Delahanty        $117,382,556
George Davis        $113,291,040
Tim Raines          $111,020,960
Eddie Mathews       $110,901,266
Billy Hamilton      $110,692,353
Tony Gwynn          $110,436,526
Johnny Mize         $104,493,744
Roberto Clemente    $102,761,223
Paul Waner          $101,946,007
Jackie Robinson      $98,237,421 (no Negro League credit)
Fred Clarke          $97,367,980
Larry Walker         $96,967,219
Pee Wee Reese        $96,496,818
Pete Rose            $96,047,274
Dave Concepción      $94,666,913
Frankie Frisch       $92,862,321
Jim Edmonds          $91,590,321
Hughie Jennings      $91,588,914
Dick Allen           $90,960,121
Ron Santo            $90,817,848
Jesse Burkett        $90,105,659
Chipper Jones        $89,464,583
John McGraw          $89,410,662
Sammy Sosa           $87,128,960
Ernie Banks          $86,323,002
Darrell Evans        $84,203,505
Scott Rolen          $83,447,841
Joe Kelley           $82,819,556
Duke Snider          $82,220,482
Billy Williams       $81,748,209
Jimmy Sheckard       $81,421,173
Reggie Smith         $80,330,007
Ron Cey              $79,950,646
Heinie Groh          $79,794,093
Willie Stargell      $79,323,100
Willie McCovey       $78,779,130
Ryne Sandberg        $78,449,070
Albert Pujols        $78,360,169 (no 2006!)
Enos Slaughter       $77,129,289
Vladimir Guerrero    $76,795,704
TOP OF BORDERLINE
---------------
Will Clark           $75,655,086
Cupid Childs         $75,028,557
Sherry Magee         $74,905,275
Brian Giles          $74,785,122
Zack Wheat           $73,823,975
Billy Herman         $73,634,666
Dale Murphy          $73,023,171
Willie Keeler        $72,632,593
Max Carey            $72,202,231
Luis González        $71,945,402
Jim Wynn             $71,473,562
Jeff Kent            $71,029,423
BOTTOM OF BORDERLINE
------------
Pedro Guerrero       $69,945,660
Craig Biggio         $69,804,540
Dave Bancroft        $69,791,255
Fred McGriff         $68,297,428
Jack Clark           $68,281,910
George Foster        $67,141,524
Joe Tinker           $67,016,210
Stan Hack            $66,777,214
Keith Hernández      $66,654,616
Andre Dawson         $66,071,050
Art Fletcher         $66,009,607
Bobby Bonds          $65,228,399
Kiki Cuyler          $65,179,605
Eric Davis           $64,153,929
Rabbit Maranville    $63,869,749
Edd Roush            $63,190,215
José Cruz Sr.        $63,170,898
Joe Medwick          $63,124,835
Ken Boyer            $63,070,276
Ken Caminiti         $62,564,785
Chuck Klein          $60,807,729
Cesar Cedeño         $60,623,540
Ralph Kiner          $60,580,538
Jake Beckley         $60,408,747
Bobby Abreu          $60,231,369
Richie Ashburn       $60,024,942
Tommy Leach          $59,731,485
George Burns         $59,294,919
George Van Haltren   $56,923,002
Hugh Duffy           $56,204,790
Bob Elliott          $54,701,989
Rusty Staub          $54,127,458
Frank Chance         $51,290,002
Tony Pérez           $51,120,246
Pie Traynor          $49,824,287
Bill Terry           $47,695,966
Gavvy Cravath        $44,893,823 (no minor league credit)
Orlando Cepeda       $40,369,597
Bill Mazeroski       $37,847,348
Larry Doyle          $36,011,518
   30. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 12:59 PM (#2292808)
OK, so to summarize:

Guys who I think should clearly be in the HoM and are not:

Dave Concepción
John McGraw
Reggie Smith
Ron Cey

Guys who I think are HoM mistakes:
Stan Hack (particularly since my system doesn't penalize him for wartime competition)
Joe Medwick
Ken Boyer
Ralph Kiner
Richie Ashburn

Of the borderliners, my instinct is to pick the old-time guys (Childs, Magee, Wheat, Herman, and Carey, probably not Keeler) and leave out the more recent ones (W. Clark, Murphy, L. González, Wynn, Kent). B. Giles is clearly in for me given that he hasn't retired yet.
   31. Chris Cobb Posted: February 06, 2007 at 01:15 PM (#2292822)
It looks like Bill Terry ought to belong on the "mistake" list, also.
   32. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 01:24 PM (#2292828)
oops, yep, I forgot he was elected! I didn't even look down that far. I had stopped voting by the time he was elected...what happened there? Short career, only three really outstanding offensive seasons, none of which were extraordinary, only minor defensive value...what gives?
   33. Chris Cobb Posted: February 06, 2007 at 01:54 PM (#2292853)
Well, one theory is that Terry benefited from being a Shiny New Toy in a weak year. The fact that Terry slipped in so easily has caused the electorate to be more cautious with borderline new candidates.

For ease of comparison of Dan's list to the current HoM roster:

By my count, we have actually elected 47 players from among the group from which this list is drawn: National League players who played the bulk of their careers after 1893 and who were neither pitchers nor catchers.

The 48th eligible or elected player on Dan's list is Dave Bancroft. So assuming that our total of 48 elected from this pool is right, then Dan's system finds that we should have elected

Concepcion, McGraw, Smith, Cey, Wynn, and Bancroft

in place of

Hack, Medwick, Boyer, Kiner, Ashburn, and Terry.

Some of the differences between the lists could be artifacts of the election schedule and not "mistakes" -- i.e. we elected the best player available at the time, but vagaries in the supply of talent let one in but kept another out. Others are surely genuine disagreements about value. We've had plenty of chances to elect McGraw and Bancroft but we haven't.

Did you give Kiner and war credit, Dan?
   34. Chris Cobb Posted: February 06, 2007 at 01:56 PM (#2292856)
Ack, errors in the previous post! I tried to stop it before it loaded, but if it went through anyway, here it is with the inconsistencies corrected.

Well, one theory is that Terry benefited from being a Shiny New Toy in a weak year. The fact that Terry slipped in so easily has caused the electorate to be more cautious with borderline new candidates.

For ease of comparison of Dan's list to the current HoM roster:

By my count, we have actually elected 48 players from among the group from which this list is drawn: National League players who played the bulk of their careers after 1893 and who were neither pitchers nor catchers.

The 48th eligible or elected player on Dan's list is Dave Bancroft. So assuming that our total of 48 elected from this pool is right, then Dan's system finds that we should have elected

Concepcion, McGraw, Smith, Cey, Wynn, and Bancroft

in place of

Hack, Medwick, Boyer, Kiner, Ashburn, and Terry.

Some of the differences between the lists could be artifacts of the election schedule and not "mistakes" -- i.e. we elected the best player available at the time, but vagaries in the supply of talent let one in but kept another out. Others are surely genuine disagreements about value. We've had plenty of chances to elect McGraw and Bancroft but we haven't.

Did you give Kiner any war credit, Dan?
   35. Dandy Little Glove Man Posted: February 06, 2007 at 01:58 PM (#2292857)
What's Dave Parker's Career Salary? Also, if these numbers only include NL time, Will Clark should be an easy HOM selection under this system after the AL portion of his career is added.
   36. Mark Shirk (jsch) Posted: February 06, 2007 at 02:02 PM (#2292864)
This also seems to be career numbers. What do the three and five year peaks of Kiner and Medwick look like?
   37. Eric Chalek (Dr. Chaleeko) Posted: February 06, 2007 at 02:04 PM (#2292866)
Dan, thanks for posting the big list above. I did a little informal parsing of it to see what's to see, particularly looking for matters of era and positional balance. I'm not coming to any judgment about the system or anything because we don't have all the players in it yet, but I thought I'd just take a preliminary look anyway.

By position

Using the groupings Dan provided I made a chart of which positions were represented in the chart and where. I assigned the positions myself, which can be arbitrary as you know. Leach is a 3B and Rose is 2B, etc etc etc....

               1B 2B 3B SS LF CF RF  TOTAL
-------------------------------------------
INNER CIRCLE    0  2  1  2  2  1  3   11
HOMERS          5  4  8  8  8  3  9   45
UPPER BORDER    1  3  0  0  3  3  2   12
LOWER BORDER    9  3  6  4  5  7  6   40
===========================================
TOTAL          15 12 15 14 18 14 20  108


I'll be happy to answer any questions about who is at what position, if anyone wants to know.

By decade

Granted decades are arbitrary endpoints and it's often tough to know exactly which decade to put somebody in but I forged ahead. Again, the groupings are Dan's but the decade assignments are mine.

             1890s 1900s 1910s 1920s 1930s 1940s 1950s 1960s 1970s 1980s 1990s 2000s TOTAL
--------------------------------------------------------------------------------------------
INNER CIRCLE     0     1     0     1     2     1     0     3     1     1     0     1     11
HOMERS           8     2     1     1     1     3     4     6     5     4     5     5     45
UPPER BORDER     2     1     2     0     1     0     0     1     0     1     1     3     12
LOWER BORDER     3     3     6     3     4     1     2     3     5     6     3     1     40
============================================================================================
TOTAL           13     7     9     5     8     5     6    13    11    12     9    10    108


In this chart in particular, it's worth noting that the blurring of a player's career between decades may make some gaps look bigger than they may actually be. I'll happily elaborate on who is in what group if anyone would find that information helpful.

Anyway, like I said, I'm not offering any judgment, but I thought this might provide the group with some interesting information.
   38. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 02:14 PM (#2292878)
I did not give Kiner war credit--I didn't know he fought. How old was he, and what were his minor league stats before that?

My McGraw pick has everything to do with the salary estimator and very little to do with the WARP system. McGraw played at such a high rate that even after adjusting for the high standard deviation of the time period, he looks like a dominant player--I have his 1899 as the 5th most valuable season between 1893 and 1946 (after Wagner '07-'08, Hornsby '24, and Jennings '96). Because the salary estimator rewards rate exponentially (versus playing time linearly), extremely high peak rate seasons are counted as being, well, extremely valuable--this is by design. But fully 31% of McGraw's value, in my book, is the 1899 season--a guy who had three of those and nothing else would be a Hall of Meriter in my book. If you look at career, McGraw's 43.6 WARP2 aren't much of a case (Jack Clark has over 47), and his best five seasons (31.4 WARP2) don't really stand out either (Pedro Guerrero has 31.3), and nobody on this board thinks Pedro Guerrero's peak with Jack Clark's career is a HoM'er. It's just because my salary estimator values rate so highly (which is how I like it) that McGraw comes out so well.

I've posted the data precisely so that people can dig into it and use it in their own systems and draw their own conclusions. This is just the raw information; how you value peak vs. career, and rate vs. durability, is up to you.
   39. zoperino,if youre not into the whole brevity thing Posted: February 06, 2007 at 02:21 PM (#2292886)
In this chart in particular, it's worth noting that the blurring of a player's career between decades may make some gaps look bigger than they may actually be. I'll happily elaborate on who is in what group if anyone would find that information helpful.

Anyway, like I said, I'm not offering any judgment, but I thought this might provide the group with some interesting information.


Consider, Doc Chaleeko, that the NL represents a varying proportion of the total HoM eligible players over time. For example, we'd expect a dramatic dropoff from the 1890's to the 1900's because of the rise of the AL and a concurrent dilution of talent. We'd expect a rise from the 50's through the 60's as African-American players enter the MLB game in large numbers.

I think when this is taken into consideration, the decade-breakdown looks nearly perfect.
   40. DavidFoss Posted: February 06, 2007 at 02:31 PM (#2292893)
When do you think you'll get around to doing the AL?
   41. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 02:42 PM (#2292903)
Parker is $57,146,405.

These numbers definitely include my best estimates of AL seasons (and seasons played at catcher, for Frank Chance, and pitcher, for George Van Haltren).

Mark Shirk, that information is in the spreadsheet. But I'll post it here.

RC+: Runs produced per out, relative to league average.
SFrac: Percentage of season played, compared to the league average plate appearances per lineup spot.

W1AA: Wins above average.
WARP1: Wins above a replacement player at the same position.
LeagueAdj: Ratio of the league's projected standard deviation to the 2005 NL standard deviation.
W2AA: Wins above average, adjusted for standard deviation.
RepW/Yr: Wins below average of a replacement player at the same position per full season, adjusted for standard deviation.
WARP2: Wins above a replacement player at the same position, adjusted for standard deviation.
WARP2/Yr: Wins above a replacement player at the same position, adjusted for standard deviation, projected to a full season.
PennAdd: Pennants Added.
Market Salary: How much the 2005 market would have paid for that performance.

Medwick
Year  RC+  FRAA/Yr  SFrac  W1AA  WARP1  LeagueAdj  W2AA  RepW/Yr  WARP2  WARP2/Yr  PennAdd       Salary
1937  205       -1   1.04   7.7    8.3       .847   6.5     -0.5    7.0       6.7     .100  $12,767,893
1936  169        4   1.02   5.8    6.3       .836   4.9     -0.4    5.2       5.1     .071   $7,848,027
1935  167        4   1.02   5.6    6.1       .808   4.5     -0.4    5.0       4.8     .066   $7,105,488
1941  154        9   0.88   4.5    5.1       .833   3.8     -0.6    4.3       4.8     .056   $6,127,671
1938  153        3   1.01   4.5    5.0       .828   3.7     -0.5    4.2       4.1     .054   $5,359,686
THREE YEAR TOTAL           19.1   20.7             15.9            17.2               .237  $27,721,408
FIVE YEAR TOTAL            28.1   30.8             23.3            25.6               .347  $39,208,765

Kiner
Year  RC+  FRAA/Yr  SFrac  W1AA  WARP1  LeagueAdj  W2AA  RepW/Yr  WARP2  WARP2/Yr  PennAdd       Salary
1951  219       -9   1.10   6.9    7.4       .943   6.5     -0.5    7.0       6.4     .100  $12,310,451
1949  211       -9   1.10   6.8    7.1       .937   6.4     -0.2    6.6       6.0     .093  $11,132,974
1947  193       -2   1.10   6.6    6.9       .939   6.2     -0.3    6.5       5.8     .090  $10,637,409
1948  159       11   1.12   5.8    6.2       .950   5.5     -0.3    5.9       5.3     .081   $8,976,516
1950  172      -11   1.12   4.3    4.7       .922   3.9     -0.4    4.4       3.9     .057   $5,343,011
THREE YEAR TOTAL           20.3   21.4             19.1            20.1               .283  $34,080,833
FIVE YEAR TOTAL            30.4   32.3             28.5            30.3               .422  $48,400,361
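
For anyone who wants to check rows like the Medwick and Kiner lines against the glossary, here is a rough Python sketch of how the adjusted columns appear to fit together. The three relationships are inferred from the posted rows, not taken from the spreadsheet's formulas, and the function name is made up for illustration. Because the posted inputs are rounded to one decimal, expect results to land within about a tenth of a win of the posted values.

```python
def adjusted_wins(w1aa, league_adj, rep_w_per_yr, sfrac):
    """Rebuild W2AA, WARP2 and WARP2/Yr from one row (inferred relationships)."""
    # Scale wins above average by the posted LeagueAdj factor.
    w2aa = w1aa * league_adj
    # RepW/Yr is negative (replacement is below average), so subtracting it
    # adds value; scale it by the fraction of the season actually played.
    warp2 = w2aa - rep_w_per_yr * sfrac
    # Project to a full season by dividing out playing time.
    warp2_per_yr = warp2 / sfrac
    return w2aa, warp2, warp2_per_yr


# Kiner 1951 inputs as posted: W1AA 6.9, LeagueAdj .943, RepW/Yr -0.5, SFrac 1.10
w2aa, warp2, per_yr = adjusted_wins(6.9, 0.943, -0.5, 1.10)
print(round(w2aa, 1), round(warp2, 1), round(per_yr, 1))
```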
   42. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 03:00 PM (#2292914)
Indeed. Also I would note that ceteris paribus, we'd expect to have more HoM'ers (by this measure) from recent decades just because of expansion. Let's just hypothetically say that everyone above 3 SD from the mean is a HoM'er. If you double your "sample size"--the number of players in the National League--you will double the number of players who are more than 3 SD from the mean. That seems right and logical to me.
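
To put a number on that hypothetical, here is a small Python sketch. It assumes player value is normally distributed, which is only an approximation, and the player counts (200 and 400) are invented for illustration.

```python
import math


def tail_probability(z):
    """P(X > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def expected_outliers(n_players, z=3.0):
    """Expected number of players more than z SD above the mean."""
    return n_players * tail_probability(z)


# Hypothetical league sizes: doubling the pool doubles the expected count.
small_league = expected_outliers(200)
big_league = expected_outliers(400)
print(small_league, big_league, big_league / small_league)
```

The expected count is linear in the number of players, so the ratio comes out to 2 whatever cutoff you pick.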

DavidFoss, do you mean *complete* AL data (like this), or just estimates for guys on HoM ballots? The latter I hope to have by the 1995 election. The former is probably six months away--like I said, I do this by hand. If anyone's got a faster way (particularly to input WS and BP FRAA data into Excel), I'd love to hear it.
   43. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 03:06 PM (#2292920)
So few from the 50s? I'd say Mays, Musial, Aaron, F. Robinson, Mathews, J. Robinson, Reese, Banks, and Snider were all 50's guys (obviously Mays/Aaron/F. Robinson were also 60s, and Musial/Reese were also 40s, but still).
   44. Eric Chalek (Dr. Chaleeko) Posted: February 06, 2007 at 03:14 PM (#2292927)
So few from the 50s? I'd say Mays, Musial, Aaron, F. Robinson, Mathews, J. Robinson, Reese, Banks, and Snider were all 50's guys (obviously Mays/Aaron/F. Robinson were also 60s, and Musial/Reese were also 40s, but still).

Right, Dan, and that's what I was trying to say above. That the blurring of decades makes it tough to draw particular conclusions about fairness to eras unless there's a really obvious skew somewhere.
   45. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 03:19 PM (#2292932)
Everything on my screen is coming up as preformatted text...am I missing a ?

You could look at # of HoM'ers playing in each year...
   46. DavidFoss Posted: February 06, 2007 at 04:17 PM (#2292969)
DavidFoss, do you mean *complete* AL data (like this), or just estimates for guys on HoM ballots? The latter I hope to have by the 1995 election. The former is probably six months away--like I said, I do this by hand. If anyone's got a faster way (particularly to input WS and BP FRAA data into Excel), I'd love to hear it.

Yeah, complete data like this... OK, I understand these things are not automatic.

I've heard of programs that will scan a website like BP and dump data from pages into tables, specifically people do that in mid-season sometimes and it would also be useful for BP's frequent "updates" of their WARP data. I don't know how to do it though.

I do know that if you store the data in csv format, it's quite a bit smaller and zips up nicer. I got the entire first sheet of your XLS file into a zip file that is only 218 KB.
   47. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 04:21 PM (#2292977)
Well, if you find out how to do it, please let me know!

Didn't think to stick it in .csv, I guess I'll redo that now for easier downloading.
   48. DavidFoss Posted: February 06, 2007 at 04:53 PM (#2293004)
Hmmm... baseball graphs has Win Share data here:

ftp://ftp.baseballgraphs.com/winshares/

It's "WS only" with the common "playerid" labels for identifying players (e.g. "killeha01" for Killebrew) instead of the mechanism that you have.

Wow, that CSV file is a big help for me! I'm glad I helped you look. :-)
   49. DavidFoss Posted: February 06, 2007 at 07:39 PM (#2293090)
Well, if you find out how to do it, please let me know!

I've made some progress!

Turns out there is a program on linux called "wget" which will copy a webpage to your local machine. A free windows version is easy to find with a google search (I got the one here: here).

You can feed wget an input file with a list of urls. Since baseballprospectus urls are simply www.baseballprospectus.com/dt/playerid.php, you can create a list of 16000+ urls containing all the unique playerids in the baseballgraphs csv file. It took about half an hour at work (might take longer at home), but I dumped all the BP webpages onto my local machine. It's about 300 MB. There were a couple of dozen pages that couldn't be found due to name disagreements (e.g. gwynnto02/gwynnan01), but that's a great success rate. With a good HTML parser you can then automatically parse those pages into csv-style data.
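
A minimal Python sketch of the URL-list step, for anyone who wants to try it at home. The function names, the `urls.txt` filename, and the `http://` prefix are assumptions for illustration; only the `/dt/playerid.php` pattern comes from the description above.

```python
def build_urls(player_ids, base="http://www.baseballprospectus.com/dt/{}.php"):
    """Turn playerids into one URL each, dropping blanks and duplicates."""
    seen = set()
    urls = []
    for pid in player_ids:
        pid = pid.strip()
        if pid and pid not in seen:
            seen.add(pid)
            urls.append(base.format(pid))
    return urls


def write_url_list(player_ids, path="urls.txt"):
    """Write one URL per line, the format wget expects for an input file."""
    with open(path, "w") as f:
        f.write("\n".join(build_urls(player_ids)) + "\n")


# Example ids; in practice these would be read from the playerid column of the csv.
print(build_urls(["killeha01", "killeha01", "gwynnto02"]))
```

The resulting file can then be handed to wget with its input-file option, e.g. `wget -i urls.txt`.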

Turns out my work has great in-house software for this. I'll see what I can do. (Though I'm supposed to be working :-)). There may be great freeware out there for that too, for those that want to try at home.
   50. Eric Chalek (Dr. Chaleeko) Posted: February 06, 2007 at 10:38 PM (#2293225)
Here's a question the commish posted on the discussion thread. I thought this might be a good thread to discuss it in since I think that Dan Rosenheck is doing something a little different than what Woolner does (if I get both of their systems).

I think I might use VORP+FRAA as a foundation for running a hitter type spreadsheet similar to my pitching one. I realize VORP only goes back to 1959 on the website (according to Neyer's article), I'll have to come up with something else pre-1959, but I'll let you guys know how it works out . . . any deficiencies I should be made aware of before starting?

Dan, I think you explained to me that your WARPs are based on freely available talent (as originally defined and studied by Nate Silver, IIRC), whereas Woolner's VORP measures against backups, not FAT. What's the advantage of one over the other?
   51. Eric Chalek (Dr. Chaleeko) Posted: February 06, 2007 at 10:42 PM (#2293231)
Also a quick procedural question:

2. Use Nate Silver’s 2005 salary estimator ($212,730*WARP^2 + $402,530*WARP) to find out how much the player would have earned on the 2005 market had he played a full season. Convert all negative numbers to $0.

Why wouldn't zeroes be equivalent to the minimum major league salary? The minimum salary is paid to players regardless of performance. Alternatively, how about the AAA minimum since, presumably, the parent club bears some of the costs associated with a player it has farmed out?
   52. jimd Posted: February 06, 2007 at 11:07 PM (#2293241)
Why wouldn't zeroes be equivalent to the minimum major league salary?

Three reasons that I can think of right away.

1) The regression on which the formula is based does not produce that result, or
2) Silver acknowledges that WARP's replacement level is too low and accounts for it, or
3) A constant term that adds the MLB minimum salary to the formula has gotten lost somewhere

What's the MLB minimum as of 2005 anyway? 300K or has it gone up?

A WARP of 1.0 should get 615K as a full time player according to the estimator.
A WARP of 0.6 should get 318K as a full time player according to the estimator.
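
Those two figures can be checked with a few lines of Python. This is just the quoted estimator typed in as a function, with negative results floored at $0 as the quoted step says; the function name is made up.

```python
def market_salary(warp_per_season):
    """2005 market salary for a full season at the given WARP, per the quoted formula."""
    salary = 212_730 * warp_per_season ** 2 + 402_530 * warp_per_season
    # Convert negative results to $0, as in the quoted step.
    return max(salary, 0.0)


for warp in (1.0, 0.6):
    print(warp, round(market_salary(warp)))
```

Because of the squared term, doubling the rate more than doubles the salary, which is why high-rate seasons count for so much under this estimator.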
   53. DavidFoss Posted: February 06, 2007 at 11:08 PM (#2293243)
OK. I have BP's "Actual Batting Statistics" and "Advanced Batting Statistics" converted to csv player-season format for >99% of the players.

It's 11.8 MB (3.6 MB zipped). Where should I put it? Dan, what is your email address?
   54. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 11:26 PM (#2293255)
Dr. Chaleeko, my reasoning is simple: not all backups are freely available. I don't know the nitty-gritty of Woolner's study, but I imagine it would be tough to distinguish backups from, say, September callups just by looking at games played or plate appearances. Moreover, I don't think it includes defense, as Nate's does. Meanwhile, Silver's methodology is simple, straightforward, and dead-on accurate: over age 27, making less than ha