Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Monday, February 05, 2007

Dan Rosenheck’s WARP Data

WARP Methodology and Results

Thanks, Dan!

EDIT: Link updated 2/23/2009

John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 08:59 PM | 763 comment(s)

Reader Comments and Retorts

   1. John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 09:04 PM (#2292320)
That's Dan's WARP data, not WARPED data. :-D

Seriously, we appreciate you taking the time to create this and allowing us here at the HoM to examine it.
   2. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 09:07 PM (#2292323)
Thanks for the thread.

Here is the text of the methodology explanation included in the .zip file.

With some trepidation, I’ve decided to leap headfirst into the überstat wars. The problems with both BP WARP and particularly Win Shares have been well-documented, above all regarding their replacement level definitions and opaque or nonexistent timelines. I’m here to offer my own system, which I immodestly believe to be *far* superior to either available system (although I do leech off of them for defense). I’ll post a detailed account of the methodology here, and I have the data available in spreadsheet form. I hope many of you will find this useful for the 1994 and subsequent elections. I look forward to discussions on this thread either about my approach, or about conclusions one might draw from the data and how they might alter voters’ ballots.

I only have data for non-catcher NL starters from 1893-2005, but I will hopefully be able to post estimates for pitchers, NL catchers, and all AL players who are receiving HoM consideration in time for the 1995 ballot.

Methodology

Step 1: Wins above league average

Offense

1. Using Extrapolated Runs (1947-2005) or BaseRuns (1893-1946), find out how many runs a player created.
2. Subtract the player’s batting outs from the average batting outs per team for that league-season to determine the outs left over for teammates on a theoretical average team.
3. Multiply the remaining outs by the league runs scored per out, and add the player’s runs created, to get theoretical team runs scored.
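
A rough sketch of offense steps 2 and 3, for anyone who wants to check the arithmetic. This is only an illustration, not the spreadsheet's actual formulas: the function name and the sample inputs are invented, and the runs-created figure from step 1 is taken as given.

```python
def theoretical_team_runs_scored(player_runs_created, player_outs,
                                 avg_team_outs, league_runs_per_out):
    """Runs an otherwise league-average team would score with this player.

    Step 2: outs left over for the player's (average) teammates.
    Step 3: those outs at the league scoring rate, plus the player's own runs.
    """
    remaining_outs = avg_team_outs - player_outs
    return remaining_outs * league_runs_per_out + player_runs_created


# Invented inputs, for illustration only (not real player data).
runs = theoretical_team_runs_scored(
    player_runs_created=100.0,
    player_outs=400,
    avg_team_outs=4100,
    league_runs_per_out=0.17,
)
print(round(runs, 1))
```

A player who creates runs at exactly the league rate per out leaves the team total at the league average, which is the sanity check to run first.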

Defense

1. Input raw Fielding Win Shares (FWS) and BP FRAA (1893-1999), Chris Dial’s Zone Rating RSpt (1987-2005), UZR (2000-2003), and Fielding Bible +/- (2003-2005) for every starter (led team in PA at the position) in the league.
2. Calculate the league average FWS per full season at each position for each season. Multiply this by each player-season’s percentage of the season played, and subtract the product from each player-season’s FWS, to get FWS above/below average. Divide by three, and multiply by the league-season’s runs per marginal win (equal to 3.32*((runs per game)^.7103)), to get Win Shares Fielding Runs Above Average (WS-FRAA).
3. Calculate the standard deviation (stdev) of RSpt per full season for each position for the following time periods: 1987-1999, 2000-2003, and 2003-2005. Calculate the stdev of BP FRAA and WS-FRAA per full season for each position from 1987-1999, the stdev of UZR per full season from 2000-2003, and the stdev of +/- from 2003-2005.
4. Multiply each player-season’s BP FRAA and WS-FRAA by the ratio of the RSpt stdev from 1987-1999 to the respective BP FRAA and WS-FRAA stdev for the same time period. Multiply each player-season’s UZR by the ratio of the RSpt stdev from 2000-2003 to the UZR stdev for the same time period. Multiply each player-season’s +/- by the ratio of the RSpt stdev from 2003-2005 to the +/- stdev for the same time period. This standardizes the stdevs for all the different defensive metrics (to a level less than BP FRAA but higher than WS).
5. Average the modified BP FRAA and WS-FRAA scores for each player-season from 1893-1986. Take a weighted average (40% RSpt, 30% modified BP FRAA, 30% modified WS-FRAA) for each player-season from 1987-1999. Take a weighted average (70% modified UZR, 30% RSpt) for each player-season from 2000-2002. Take a weighted average (55% modified UZR, 30% modified +/-, 15% RSpt) for each player-season from 2003. Take a weighted average (65% modified +/-, 35% RSpt) for each player-season from 2004-05. This is the player’s Fielding Runs Above Average (Rosenheck FRAA).
6. For the 1893-1918 period, when LF was more important and difficult than RF, add 2 runs per season to LF and subtract 2 per season from RF.
7. Subtract the player’s Rosenheck FRAA from the league average runs scored per team. This is theoretical team runs allowed.
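
The defensive steps can be sketched the same way. This is a minimal sketch under stated assumptions: all function names are invented, only the 1987-1999 weighting from step 5 is shown, and the stdev values have to come from your own data since none are listed here.

```python
def runs_per_marginal_win(runs_per_game):
    """Step 2: runs per marginal win, 3.32 * (runs per game)^0.7103."""
    return 3.32 * runs_per_game ** 0.7103


def ws_fraa(fws, league_avg_fws_full_season, season_fraction, runs_per_game):
    """Step 2: Fielding Win Shares converted to runs above average.

    Dividing by three turns Win Shares into wins (three Win Shares per win).
    """
    fws_above_avg = fws - league_avg_fws_full_season * season_fraction
    return fws_above_avg / 3 * runs_per_marginal_win(runs_per_game)


def rescale(value, rspt_stdev, metric_stdev):
    """Step 4: put a metric on the RSpt scale using the ratio of stdevs."""
    return value * rspt_stdev / metric_stdev


def fraa_1987_1999(rspt, bp_fraa_mod, ws_fraa_mod):
    """Step 5, 1987-1999 only: 40% RSpt, 30% modified BP FRAA, 30% modified WS-FRAA."""
    return 0.40 * rspt + 0.30 * bp_fraa_mod + 0.30 * ws_fraa_mod


def theoretical_team_runs_allowed(league_avg_runs_per_team, fraa):
    """Step 7: an average team's runs allowed, reduced by the player's FRAA."""
    return league_avg_runs_per_team - fraa


# Invented inputs, for illustration only.
print(round(runs_per_marginal_win(4.5), 2))
print(theoretical_team_runs_allowed(700.0, 15.0))
```

The other eras in step 5 would use the same pattern with their own weights.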

Record

1. Input the theoretical team runs scored and runs allowed into the Pythagorean formula (exponent = ((RS+RA)/G)^.285) to get a winning percentage. Multiply this by 162 to get theoretical team wins (W1). This is a straight-line season-length adjustment.
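
A short sketch of the Record step, using the variable-exponent Pythagorean formula quoted above. The function name is invented, and `games` is assumed to be the number of games behind the RS and RA totals.

```python
def theoretical_wins(runs_scored, runs_allowed, games, schedule=162):
    """W1: Pythagorean winning percentage scaled to a 162-game season."""
    exponent = ((runs_scored + runs_allowed) / games) ** 0.285
    rs_term = runs_scored ** exponent
    ra_term = runs_allowed ** exponent
    win_pct = rs_term / (rs_term + ra_term)
    return win_pct * schedule


# A team that scores exactly as many runs as it allows is an 81-win team.
print(theoretical_wins(700.0, 700.0, 162))  # 81.0
```

Because the exponent grows with the run environment, the same run differential buys fewer wins in a high-scoring league than in a low-scoring one.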


Step 2: League adjustment

1. Conduct a multiple regression analysis on the stdev of W1 above average per season from 1893-2005. (For those who are interested, the formula is: .00366 * Year + .1254 * Runs per game - .028777 * NL Teams - .00567 * MLB Teams - .00256 * Season Length - .932 * Win% of worst team in league - .0278 * Years since expansion (max 12) + .15 * World War (1 or 0) + .00158 * Estimate of player population (14.8M in 1893, 60.1M in 2005) - .2466 * Integration (1 or 0) + 2.789, and the r^2 is .5766. The graph is the second tab of the Excel spreadsheet in the file.)
2. Use the regression equation to get a projected stdev for each league-season.
3. Regress each player-season’s W1 to 81 using the following equation, where Reg = (2005 stdev) / (projected stdev of the league-season in question): W2 = (81*(1-Reg)) + (W1*Reg).
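
The regression-to-81 in step 3 can be sketched as below. The function name is invented and the projected stdevs are passed in rather than computed, since the units behind the regression inputs are not fully spelled out here. The sample values (2.30 for 2005, 3.25 for the 1895 NL, 7.7 wins above average) are the ones quoted later in this thread.

```python
def regress_to_81(w1, league_projected_stdev, stdev_2005=2.30):
    """W2: shrink W1 toward 81 wins by the ratio of stdevs."""
    reg = stdev_2005 / league_projected_stdev
    return 81 * (1 - reg) + w1 * reg


# A season worth 7.7 wins above average (W1 = 88.7) in a league
# with a projected stdev of 3.25.
w2 = regress_to_81(88.7, 3.25)
print(round(w2 - 81, 2))
```

A league whose projected stdev equals the 2005 figure is left untouched, and a higher-stdev league has its players pulled closer to 81.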

Step 3: Replacement level

1. Calculate W2 above/below average per full season played for every starter.
2. Average the W2 below average for the worst three starters at each position in the league for every league-season.
3. Average these worst-three-regulars averages at each position from 1985-2005.
4. Subtract the worst-three-regulars average from 1985-2005 from Nate Silver’s empirically determined Freely Available Talent (FAT) replacement levels for each position for the same time period.
5. Add the difference to the worst-three-regulars average at each position for each year. This is the FAT level at each position for each season.
6. Take a nine-year moving average of the FAT level for each position over time. This is the replacement level (measured in wins below average per full season). A graph of positional replacement levels over time is the third tab of the Excel spreadsheet.
7. For each player-season, multiply the replacement level by his fraction of the season played, and subtract the product from his W2. This is WARP2.
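
A sketch of the replacement-level steps, with several assumptions that are not stated above: all values are expressed as wins relative to average (so replacement level is a negative number), the nine-year moving average is centered and simply shortened at the ends of the series, and every function name is invented.

```python
def worst_three_average(w2_above_avg_per_season):
    """Step 2: mean of the three worst starters at one position in one year."""
    return sum(sorted(w2_above_avg_per_season)[:3]) / 3


def fat_offset(silver_fat_1985_2005, worst_three_avg_1985_2005):
    """Step 4: gap between the FAT level and the worst-three average."""
    return silver_fat_1985_2005 - worst_three_avg_1985_2005


def moving_average(values, window=9):
    """Step 6: centered moving average (assumed), truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def warp2(w2_above_avg, replacement_level, season_fraction):
    """Step 7: wins above a replacement player over the time actually played."""
    return w2_above_avg - replacement_level * season_fraction


# Invented inputs, for illustration only.
worst3 = worst_three_average([-3.0, -2.0, -1.0, 0.5, 2.0])
print(worst3)  # -2.0
```

Step 5 is then just each year's worst-three average plus the offset from step 4, smoothed with `moving_average`.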

Step 4: Salary estimator

1. Divide a player’s WARP2 by his fraction of the season played (measured by % of the average PA per lineup slot for that year) to get WARP2 per full season.
2. Use Nate Silver’s 2005 salary estimator ($212,730*WARP^2 + $402,530*WARP) to find out how much the player would have earned on the 2005 market had he played a full season. Convert all negative numbers to $0.
3. Multiply this by the player’s fraction of the season played to find out how much he would have earned for that season.
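
The salary steps as a sketch. One assumption to flag: "convert all negative numbers to $0" is read here as zeroing out any season with WARP2 at or below zero, because the raw quadratic turns positive again for strongly negative WARP. The function name is invented.

```python
def season_salary(warp2, season_fraction):
    """Estimated 2005-market salary for one player-season."""
    warp_full = warp2 / season_fraction            # step 1: per full season
    if warp_full <= 0:                             # assumed reading of "negative -> $0"
        return 0.0
    full_salary = 212_730 * warp_full ** 2 + 402_530 * warp_full  # step 2
    return full_salary * season_fraction           # step 3


# A full season worth 6.7 WARP2.
print(round(season_salary(6.7, 1.0)))
```

Because of the squared term, one 6-win season is worth far more than two 3-win seasons, which is how the estimator rewards peak over bulk.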

Notes

Step 1: The offensive methodology is fairly straightforward, and quite similar to most other approaches. Note that BP’s BRAA are standardized to a 4.5 runs per game league. The FRAA number is simply a weighted average of the best defensive statistics available to us, but with the standard deviations equalized so that one doesn’t count for more than another.

Step 2: Using a regression-predicted stdev rather than an actual stdev should account for all factors that determine stdev EXCEPT for changes in *concentration* of talent (i.e., very many or very few great players in the league at a given time) and random, meaningless fluctuation (no NL player happened to have a big year in 1995, lots did in 2001). If you did real standard deviations, Zack Wheat would probably look like Tris Speaker, since they were both the premier hitters of their era in their respective leagues. By using projected stdev, we can factor out things like integration and expansion that affect the spread of performance in the league while still giving actual talent its due.
And NOTA BENE: DO NOT CONFUSE THIS WITH EITHER A TIMELINE ADJUSTMENT OR A LEAGUE DIFFICULTY ADJUSTMENT. IT IS NEITHER. To be clear, the 1993 NL is regressed MORE than the 1914 NL. Anyone who thinks that the level of play was higher in 1914 than 1993 is crazy. What this corrects for—and NOTHING ELSE—is what is often colloquially called on this site the “ease of domination,” when people talk about it being “easier” to “accumulate win shares or WARP1” in certain years than in others. If you want to timeline or adjust for league difficulty, you still need to make those adjustments—this does NOT account for them.

Step 3: As far as I’m concerned, Nate Silver’s FAT research is the last word on replacement level. Using a percentage of positional or league average is silly: the former suggests that the presence of, say, A-Rod, Jeter, Nomar, and Tejada makes a replacement player (Neifi Pérez, Pat Meares) better than they otherwise would be; the latter is incapable of capturing changes in the relative depth of positions over time. By contrast, the worst-three-regulars average (adjusted for the gap between it and the FAT level) should always track the real empirical replacement level, since they are the most likely to be close to actual replacement players.

Step 4: The salary estimator concept, which I should give credit to David Zylberberg (‘zop on BTF) for coming up with originally, seems to me to be the ideal way to combine the value of career and peak, durability and rate. “What would the market pay for this player-season?,” a University of Chicago economist would ask. Here’s the answer.
   3. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 09:13 PM (#2292325)
Actually let me be even clearer about the W1-W2 adjustment. I agree that a pennant is a pennant, and I oppose timelining. All I am doing is correcting for the spread of performance in a league, so that a two-standard deviation contribution to a pennant is valued the same in 1893 and in 2005.
   4. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 09:29 PM (#2292338)
Here are the first two things that leap out to me about my results.

1. Is it any coincidence that two of the last three times the single season home run record was set (1961 and 1998) were expansion years? The stdev-regression equation finds that years since expansion (combined with the typically low worst-team win% in expansion years) has one of the strongest correlations to league stdev, so expansion seasons (Willie Mays' and Frank Robinson's 1962, Willie McCovey's 1969-70, Barry Bonds' 1993 and Bagwell's 1994, and McGwire's 1998) tend to get docked pretty harshly in the W1-W2 adjustment. I think this is fair--we tend to not discount big seasons by saying "it was an expansion year" the same way we do by saying "it was 1930 or 1894." I think we should.

2. Most replacement levels seem pretty stable, but the one that bounces all over the place is 2B. Rather than simply "switching places with 3B before 1920," it seems to be around its historical average in the 1890s, SOARS up to around corner outfielder level around 1910 (ouch, Larry Doyle--if it's the same in the NL then Lajoie and Collins are somewhat overrated), then falls back to a historical norm for the 20s through the 70s, then ZOOMS up again in the 1980s--sorry, Ryno--before returning again to its long term average by the late 1990s. I know (I think) what accounts for the deadball spike (although not for why 2B wasn't so strong in the 1890s as well)--but what happened in the 1980's? The major increase in NL 2B production is reflected in the positional averages as well, I believe.

I haven't dug much into the individual player-season data, but I'll start poking around now. (It took forever just to derive it). I'll get cracking on estimates for AL players, pitchers, and NL catchers, but they will be much less accurate than these numbers, unfortunately--it takes a *long* time to crunch this (I enter in all the FRAA, FWS etc. by hand! If you've got a faster way to do that, let me know!)
   5. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 09:35 PM (#2292342)
3. Interesting how all four infield replacement levels dip around 1980. I do have data (unrefined) for AL SS from 1960-2005, and it shows the exact same drop at the exact same time, only even more pronounced. And then NL 2B takes off from there...explanations, anyone?

It's worth reiterating that these replacement levels are *not* measures of the overall strength of a position--they have nothing to do with the performance of the best players at a position. They are only a measure of the *depth* of the position, what the freely available talent level is.
   6. Juan V Posted: February 05, 2007 at 10:10 PM (#2292365)
Pretty good stuff, I'll see how I can incorporate that into my system. Any rating system that likes David Concepcion is good for me :)
   7. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 10:44 PM (#2292380)
I used to think Concepción was borderline. But once you take into account the low standard deviation of the era in which he played and that it was a historic nadir of depth at the SS position, he becomes an obvious selection to me.
   8. sunnyday2 Posted: February 05, 2007 at 11:09 PM (#2292402)
The expansion year observation is big, theoretically speaking. I have sort of mentally downgraded Norm Cash's 1961, but that's easy to do because it stands out from his own line. The idea of discounting everybody for those years has never really occurred to me (and to be honest, I probably never will, now that we are approaching the end of the project; lazy bastard). In a perfect world, it would be a good idea. Of course, in a perfect world, I would do something like what Dan is doing.

I opened up your spreadsheet, BTW...do you have player totals somewhere?
   9. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 11:25 PM (#2292405)
Also because Cash admitted he was corking his bat that year!

I don't, but it's as easy as sorting by player name and hitting AutoSum...just be careful for missing partial seasons, since I only did starters (I'm trying to fill in partial seasons for HoM candidates, but there's lots of stuff to do!).
   10. Steve Treder Posted: February 05, 2007 at 11:33 PM (#2292408)
Also because Cash admitted he was corking his bat that year!

Sigh ... as well as every other year of his entire career ...
   11. 'zop sympathizes with the wrong ####### people Posted: February 05, 2007 at 11:34 PM (#2292409)
I have sort of mentally downgraded Norm Cash's 1961, but that's easy to do because it stands out from his own line. The idea of discounting everybody for those years has never really occurred to me

Gosh, to me the 1961 AL always stuck out as a crazy stdev outlier.....

Cash, Mantle, Maris, Howard...
   12. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 05, 2007 at 11:39 PM (#2292413)
and Jim Gentile!!!
   13. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 12:24 AM (#2292437)
I forgot to mention that yes, these are park-adjusted. Five year weighted moving average of baseball-reference park factors.
   14. Dandy Little Glove Man Posted: February 06, 2007 at 02:12 AM (#2292479)
Expansion seasons (Willie Mays' and Frank Robinson's 1962, Willie McCovey's 1969-70, Barry Bonds' 1993 and Bagwell's 1994, and McGwire's 1998) tend to get docked pretty harshly in the W1-W2 adjustment. I think this is fair--we tend to not discount big seasons by saying "it was an expansion year" the same way we do by saying "it was 1930 or 1894." I think we should.

I couldn't agree more. One of my biggest problems with previously existing player evaluation systems is that they don't appropriately account for expansion. It dilutes the league and brings down the average and replacement levels against which players are judged. Expansion is largely responsible for the fact that most systems have disproportionate numbers of players who peaked in the 60s/early 70s and 90s/early 00s toward the top of the rankings, with far lower representation from the mid 70s through the 80s. I've only worked with OPS+ and ERA+ data on this issue, but my research has shown that expansion causes a bump in both of these statistics in terms of the league top 10s for about 5 years before returning to their previous levels. What some may find counterintuitive is that OPS+ and ERA+ always move in tandem at the highest levels, rather than the league-leading hitters improving at the expense of the pitchers or vice versa. The top values appear to be much more influenced by changes in the average level, chiefly through expansion, than by changes in the players occupying the top spots on the leaderboard. I'm interested to see how the player rankings differ from other evaluation systems by incorporating this factor into the model.
   15. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 03:33 AM (#2292514)
Of course they move in tandem at the highest levels. You add in 50 guys to the league who were in the minors the year before. The best hitters beat up on the minor league pitchers, while the minor league hitters drag down the average, allowing the best hitters to produce even better league-relative numbers. The same is true for pitchers (they strike out all the minor league hitters, while the minor league pitchers pull up the league ERA). It's a pretty substantial effect, and as I said, it has one of the strongest coefficients in the regression on stdev of wins above average.
   16. Kyle S Posted: February 06, 2007 at 04:56 AM (#2292566)
The spreadsheet is only NL guys, right?
   17. BDC Posted: February 06, 2007 at 05:02 AM (#2292569)
But why wouldn't a WARP system be neutral toward expansion seasons? Those new clubs absorb the freely available talent, so that the best player you can get for free is considerably worse than he was a year ago. Hence Willie McCovey gets a lot more valuable overnight. I don't see how that's any different than Concepcion happening to peak in a bad epoch for shortstops.

I am always stupid in these conversations, so feel free to point out why :)
   18. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 05:11 AM (#2292571)
Only NL, and no catchers. And it's taken me six months to get just that much! Like I said, I'm going to try to have estimates for pitchers and AL players and NL catchers under the group's consideration "by 1995."

Bob Dernier Cri, you are right that expansion lowers the absolute empirical replacement level relative to average (and in early versions of this system, guys from the 8-team leagues came out poorly since replacement was so close to average). But my approach here is to first standardize all the wins-above/below-average scores for every player to the projected standard deviation of the league, putting all seasons (regardless of expansion, league size, and every other factor under the sun) on an equal footing. THEN I look at the worst three regulars and calculate replacement level. So replacement level is really tied to a z-score--say, 2 standard deviations below the mean (I have no idea what the real number is)--rather than an absolute figure.

My point about expansion was that since those seasons tend to have much higher standard deviations, they get regressed more than other seasons.
   19. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 02:20 PM (#2292633)
The best way to think about these numbers is, by how many standard deviations did a player exceed the replacement players at his position for his time? That's why Concepción does so well--he comes out similarly in overall value (career + peak, as measured by the salary estimator) to Jesse Burkett in this system. He was a league average hitter for his career (with some well-above-average seasons), a plus baserunner, and an extraordinary fielder, in an era where a) talent was closely bunched together and b) shortstops were historically awful. Let's take, say, his 1979, not even his best year. He was 1.7 wins above average offensively and 1.5 wins above average defensively, for 3.2 wins above average total (he had an above-average number of plate appearances that season, so for the purpose of comparison he was 3.0 wins above average per season played). By contrast, Atlanta started Pepe Frias, who was just about a replacement level shortstop that year. Frias was -2.8 wins per season with the bat and -.6 wins per season with the glove, for -3.4 wins per season total. Now, the projected standard deviation for the 1979 NL is 2.34 wins per season (the real one was 2.27), so Concepción was (3.0-(-3.4))/2.34 = 2.74 stdevs better than Frias per season. Readjusting for Concepción's high PA total, he was 2.87 stdevs above replacement.

Now let's compare him to, say, 1895 Jesse Burkett. After straight-line-adjusting for season length, Burkett was 6.8 wins above average with the bat and 0.9 wins above average with the glove, for 7.7 wins above average total. He too had an above-average PA total, so he was 7.1 wins above average per season. A replacement LF for 1895 would be someone like Tommy McCarthy, who was -0.6 wins per season with the bat and -0.9 wins per season with the glove, for -1.5 wins per season total. The projected standard deviation for the 1895 NL is 3.25 (the real one was 3.27), so Burkett was (7.1-(-1.5))/3.25 = 2.65 standard deviations per season better than McCarthy. Readjust for Burkett's high PA total, and he was 2.89 stdevs above replacement.

Thus, Burkett's raw .409/.486/.524 line obviously dwarfs Concepción's modest .281/.348/.415, his 154 OPS+ is still far superior to Concepción's 107, and his 9.4 wins above replacement exceed Concepción's 6.7. But once you adjust for the "ease of domination" of the 1895 NL, the two seasons were equally valuable: both made a 2.9-standard-deviation-above-replacement contribution to a pennant. In the 2005 NL, with a 2.30 projected standard deviation, a 2.9 stdev-above-replacement contribution is worth 2.3*2.9 = 6.7 WARP.

I hope this example makes clear what this statistic measures, and why I feel it can add to the discussion and perhaps change people's opinions as we move into the final ballots.
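
For anyone who wants to reproduce the per-season arithmetic in the example above, here is a minimal sketch. The function name is invented; the inputs are the per-season figures quoted in this post, and the later readjustment for PA totals is not included.

```python
def stdevs_above_replacement(player_per_season, replacement_per_season,
                             projected_stdev):
    """Gap between a player and a replacement player, in league stdevs."""
    return (player_per_season - replacement_per_season) / projected_stdev


# Wins above average per season, and projected league stdevs, as quoted above.
concepcion_1979 = stdevs_above_replacement(3.0, -3.4, 2.34)
burkett_1895 = stdevs_above_replacement(7.1, -1.5, 3.25)
print(round(concepcion_1979, 2), round(burkett_1895, 2))  # 2.74 2.65

# Converting a 2.9-stdev season into 2005 NL wins (projected stdev 2.30).
print(round(2.9 * 2.30, 1))  # 6.7
```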
   20. BDC Posted: February 06, 2007 at 02:31 PM (#2292639)
Well, I still don't see why a comparison (for instance) of Dave Concepcion and Pee Wee Reese should involve a comparison of Pepe Frias and Lennie Merullo, but having read your detailed explanation of the math, Dan, this is certainly a thorough way of looking at the problem of what a player contributed toward winning a pennant in a given year.
   21. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 02:43 PM (#2292643)
It's the exact same reason why a comparison of Reese and Gil Hodges should involve a comparison of Merullo and Whitey Lockman! The reason why Reese's .272 EqA is worth more than Hodges' .289 EqA is that the guy you would have to replace Reese with, Merullo, only puts up a .231 EqA, while Lockman, the guy you replace Hodges with, puts up a .263. Similarly, the guy you have to replace Concepción with, Frias, only produces a .204 EqA. It seems to me that you either have to disregard replacement level altogether, in which case your Hall of Merit will be 90% outfielders and first basemen, or you take it into account, in which case you have to recognize that replacement levels change over time and that the gap between 1B and SS in the 50s NL isn't much bigger than the gap between SS in the 50s NL and SS in the 70s NL.
   22. Dr. Chaleeko Posted: February 06, 2007 at 02:58 PM (#2292652)
Dan,

Although I recognize that your system is limited at this time, one thing I'd really like to see is a DRpHOM-not-HOM list. In other words, I'd like to know what guys your system would put in and leave out, among the players we've considered that you currently have data for. You could just assume the NgLs and ALs and NL catchers are the same for now, but show us what guys would be different via your system than by HOM consensus. It would just help a bit in understanding the results your system offers.
   23. Dizzypaco Posted: February 06, 2007 at 02:58 PM (#2292653)
Dan,

I appreciate your efforts to adjust for league strength. I'm skeptical that the formula isn't just randomly throwing various quality-of-play factors together, but it's an honest attempt.

The bigger problem for me is that I believe we have no idea how league strength in, say, 1895 compares to 1995. I am personally a believer that league strength is vastly higher in recent years than in the early years of baseball, but I have no idea how much, and frankly, I don't think anyone else does either. Taking expansion into account makes sense when comparing the results of 1961 with 1960, but makes no sense to me when comparing 1993 and 1893.

It's not just that the potential population that MLB draws from is much higher in 1993 than 1893, it's that we don't have the foggiest idea what that population really is. Talent aside, not every 25-year-old male has the potential to play major league ball no matter how good that player is, and it's not just because of segregation and other similar issues. In 1893, how many young people were playing baseball often enough to get good enough to play in the Majors? How many were playing some type of organized baseball? Of those, how many had the potential to be noticed by scouts? How many played for some type of "minor league" team, despite the fact that they were good enough to play in the National League? I have no idea, but my guess is that the real population that MLB was drawing from is much, much smaller than what is estimated in the formula. Not only does modern Major League Baseball draw from a much, much larger population (young males in US + young males in other countries), but there is a much more organized system for funneling the best players at each level to the major leagues.

So what do we do about it? I'm not a believer in using a timeline - not because I don't think the level of competition is higher today, but because I don't think there is any realistic way for measuring the difference. However, if we are not going to adjust for the quality of play over time, I also don't think we should be adjusting for minor differences in year to year play either, such as are caused by expansion.

I am aware that people have made attempts to measure differences in quality of play over time, but I have never agreed with the methodologies. I know many, if not most, people on BTF disagree with me on some of these points, but it's my 2 cents.
   24. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 02:59 PM (#2292654)
I'll try to do that today.
   25. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 03:18 PM (#2292664)
Dizzypaco,

Thanks for your comments. I'm afraid I can't be clear enough about this (which is why I put it in bold in the methodology description)--I am not attempting to adjust for league strength or quality of play!!!

People conflate standard deviations and quality of play, I imagine due to Stephen Jay Gould's article, but they are by no means the same thing. To repeat, I regress the 1993 NL more than the 1914 NL--this is not a league "strength" adjustment. It's a league standard deviation adjustment, nothing more and nothing less. All I am doing is measuring the spread of performance in a league and standardizing it across eras, so that a two-stdev contribution to a pennant in 1893 is worth the same as a two-stdev contribution to a pennant in 2005. As I said, if you want to adjust for quality of play or timeline, you have to make that adjustment yourself to my WARP2 numbers. The projected standard deviation of the 1914 NL was the same as that of the 1993 NL, but clearly the level of play was higher in 1993, and I'm not accounting for that. I personally am against timelining. I am just trying to be fair to all eras, which requires looking at the distribution of performance in each season. 10 WARP in 1893 "bought" fewer pennants than they do in 2005, because the stdev was higher. That's true no matter whether 1893 was a particularly strong or weak league in terms of absolute quality of competition.
   26. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 04:16 PM (#2292703)
(but 10 WARP in 1914 "bought" more pennants than 10 WARP in 1993)
   27. 'zop sympathizes with the wrong ####### people Posted: February 06, 2007 at 04:22 PM (#2292711)
<straightman>

Dan, I see that your system seems to love 70's-80's sluggers like Mike Schmidt, Pedro Guerrero, Jack Clark, and Dale Murphy. Could you explain why?
</straightman>
   28. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 05:04 PM (#2292748)
Why yes, David, I do believe I could--although it now looks to me like Guerrero, Clark, and Murphy are all juuuust on the wrong side of the in/out line ($75M career salary). The simple reason is that the late 70s and all of the 80s were the lowest standard deviation era in the league's history--those years were the "hardest to dominate" (as measured by OPS+ or WARP1) in the entire NL. So (making these numbers up), a 140 OPS+ in 1985 might be equal to a 150 in 1997 might be equal to a 160 in 1935. Adjusting for the "difficulty of domination" of those days makes Guerrero, Clark, and Murphy appear to have contributed about the same number of pennants as, say, Max Carey or Cupid Childs.
   29. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 05:50 PM (#2292797)
OK, here's what I've got, using my best estimates for AL and pre-1893 seasons. It looks to me like the cutoff for inner circle is $150M, and in/out is somewhere around $75M. Of course there is a margin of error on these numbers, so I wouldn't draw too much attention to differences of $2 million--picking among the 70-to-75M batch is really just a question of taste. 2006 stats are not included. War credit is given, and fairly liberally at that (including seasons like Mays '53 and Maranville '18). Reese and Slaughter certainly depend on it. The guy that waaay leaps out at me is Larkin--almost inner circle. Wowee.

Name                 Career Salary
Barry Bonds          $355,075,512
Honus Wagner         $301,940,127
Willie Mays          $240,268,011
Rogers Hornsby       $231,526,395
Stan Musial          $210,776,870
Hank Aaron           $210,319,344
Mike Schmidt         $209,650,472
Joe Morgan           $181,519,676
Frank Robinson       $160,375,579
Mel Ott              $154,068,892
Arky Vaughan         $152,338,635
INNER CIRCLE
--------------------
Barry Larkin         $145,872,934
Ozzie Smith          $129,998,781
Jeff Bagwell         $128,092,272
Gary Sheffield       $123,033,450
Bill Dahlen          $119,772,919
Ed Delahanty         $117,382,556
George Davis         $113,291,040
Tim Raines           $111,020,960
Eddie Mathews        $110,901,266
Billy Hamilton       $110,692,353
Tony Gwynn           $110,436,526
Johnny Mize          $104,493,744
Roberto Clemente     $102,761,223
Paul Waner           $101,946,007
Jackie Robinson       $98,237,421 (no Negro League credit)
Fred Clarke           $97,367,980
Larry Walker          $96,967,219
Pee Wee Reese         $96,496,818
Pete Rose             $96,047,274
Dave Concepción       $94,666,913
Frankie Frisch        $92,862,321
Jim Edmonds           $91,590,321
Hughie Jennings       $91,588,914
Dick Allen            $90,960,121
Ron Santo             $90,817,848
Jesse Burkett         $90,105,659
Chipper Jones         $89,464,583
John McGraw           $89,410,662
Sammy Sosa            $87,128,960
Ernie Banks           $86,323,002
Darrell Evans         $84,203,505
Scott Rolen           $83,447,841
Joe Kelley            $82,819,556
Duke Snider           $82,220,482
Billy Williams        $81,748,209
Jimmy Sheckard        $81,421,173
Reggie Smith          $80,330,007
Ron Cey               $79,950,646
Heinie Groh           $79,794,093
Willie Stargell       $79,323,100
Willie McCovey        $78,779,130
Ryne Sandberg         $78,449,070
Albert Pujols         $78,360,169 (no 2006!)
Enos Slaughter        $77,129,289
Vladimir Guerrero     $76,795,704
TOP OF BORDERLINE
---------------
Will Clark            $75,655,086
Cupid Childs          $75,028,557
Sherry Magee          $74,905,275
Brian Giles           $74,785,122
Zack Wheat            $73,823,975
Billy Herman          $73,634,666
Dale Murphy           $73,023,171
Willie Keeler         $72,632,593
Max Carey             $72,202,231
Luis González         $71,945,402
Jim Wynn              $71,473,562
Jeff Kent             $71,029,423
BOTTOM OF BORDERLINE
------------
Pedro Guerrero        $69,945,660
Craig Biggio          $69,804,540
Dave Bancroft         $69,791,255
Fred McGriff          $68,297,428
Jack Clark            $68,281,910
George Foster         $67,141,524
Joe Tinker            $67,016,210
Stan Hack             $66,777,214
Keith Hernández       $66,654,616
Andre Dawson          $66,071,050
Art Fletcher         
$66,009,607
Bobby Bonds          
$65,228,399
Kiki Cuyler          
$65,179,605
Eric Davis           
$64,153,929
Rabbit Maranville    
$63,869,749
Edd Roush            
$63,190,215
José Cruz Sr
.        $63,170,898
Joe Medwick          
$63,124,835
Ken Boyer            
$63,070,276
Ken Caminiti         
$62,564,785
Chuck Klein          
$60,807,729
Cesar Cedeño         
$60,623,540
Ralph Kiner          
$60,580,538
Jake Beckley         
$60,408,747
Bobby Abreu          
$60,231,369
Richie Ashburn       
$60,024,942
Tommy Leach          
$59,731,485
George Burns         
$59,294,919
George Van Haltren   
$56,923,002
Hugh Duffy           
$56,204,790
Bob Elliott          
$54,701,989
Rusty Staub          
$54,127,458
Frank Chance         
$51,290,002
Tony Pérez           
$51,120,246
Pie Traynor          
$49,824,287
Bill Terry           
$47,695,966
Gavvy Cravath        
$44,893,823 (no minor league credit)
Orlando Cepeda       $40,369,597
Bill Mazeroski       
$37,847,348
Larry Doyle          
$36,011,518 
   30. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 05:59 PM (#2292808)
OK, so to summarize:

Guys who I think should clearly be in the HoM and are not:

Dave Concepción
John McGraw
Reggie Smith
Ron Cey

Guys who I think are HoM mistakes:
Stan Hack (particularly since my system doesn't penalize him for wartime competition)
Joe Medwick
Ken Boyer
Ralph Kiner
Richie Ashburn

Of the borderliners, my instinct is to pick the old-time guys (Childs, Magee, Wheat, Herman, and Carey, probably not Keeler) and leave out the more recent ones (W. Clark, Murphy, L. González, Wynn, Kent). B. Giles is clearly in for me given that he hasn't retired yet.
   31. Chris Cobb Posted: February 06, 2007 at 06:15 PM (#2292822)
It looks like Bill Terry ought to belong on the "mistake" list, also.
   32. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 06:24 PM (#2292828)
Oops, yep, I forgot he was elected! I didn't even look down that far. I had stopped voting by the time he was elected...what happened there? Short career, only three really outstanding offensive seasons, none of which were extraordinary, only minor defensive value...what gives?
   33. Chris Cobb Posted: February 06, 2007 at 06:54 PM (#2292853)
Well, one theory is that Terry benefited from being a Shiny New Toy in a weak year. The fact that Terry slipped in so easily has caused the electorate to be more cautious with borderline new candidates.

For ease of comparison of Dan's list to the current HoM roster:

By my count, we have actually elected 47 players from among the group from which this list is drawn: National League players who played the bulk of their careers after 1893 and who were neither pitchers nor catchers.

The 48th eligible or elected player on Dan's list is Dave Bancroft. So assuming that our total of 48 elected from this pool is right, then Dan's system finds that we should have elected

Concepcion, McGraw, Smith, Cey, Wynn, and Bancroft

in place of

Hack, Medwick, Boyer, Kiner, Ashburn, and Terry.

Some of the differences between the lists could be artifacts of the election schedule and not "mistakes" -- i.e. we elected the best player available at the time, but vagaries in the supply of talent let one in but kept another out. Others are surely genuine disagreements about value. We've had plenty of chances to elect McGraw and Bancroft but we haven't.

Did you give Kiner and war credit, Dan?
   34. Chris Cobb Posted: February 06, 2007 at 06:56 PM (#2292856)
Ack, errors in the previous post! I tried to stop it before it loaded, but in case that didn't work, here it is with the inconsistencies corrected.

Well, one theory is that Terry benefited from being a Shiny New Toy in a weak year. The fact that Terry slipped in so easily has caused the electorate to be more cautious with borderline new candidates.

For ease of comparison of Dan's list to the current HoM roster:

By my count, we have actually elected 48 players from among the group from which this list is drawn: National League players who played the bulk of their careers after 1893 and who were neither pitchers nor catchers.

The 48th eligible or elected player on Dan's list is Dave Bancroft. So assuming that our total of 48 elected from this pool is right, then Dan's system finds that we should have elected

Concepcion, McGraw, Smith, Cey, Wynn, and Bancroft

in place of

Hack, Medwick, Boyer, Kiner, Ashburn, and Terry.

Some of the differences between the lists could be artifacts of the election schedule and not "mistakes" -- i.e. we elected the best player available at the time, but vagaries in the supply of talent let one in but kept another out. Others are surely genuine disagreements about value. We've had plenty of chances to elect McGraw and Bancroft but we haven't.

Did you give Kiner any war credit, Dan?
   35. Dandy Little Glove Man Posted: February 06, 2007 at 06:58 PM (#2292857)
What's Dave Parker's Career Salary? Also, if these numbers only include NL time, Will Clark should be an easy HOM selection under this system after the AL portion of his career is added.
   36. Mark Shirk (jsch) Posted: February 06, 2007 at 07:02 PM (#2292864)
This also seems to be career numbers. What do the three and five year peaks of Kiner and Medwick look like?
   37. Dr. Chaleeko Posted: February 06, 2007 at 07:04 PM (#2292866)
Dan, thanks for posting the big list above. I did a little informal parsing of it to see what's to see, particularly looking for matters of era and positional balance. I'm not coming to any judgment about the system or anything because we don't have all the players in it yet, but I thought I'd just take a preliminary look anyway.

By position

Using the groupings Dan provided I made a chart of which positions were represented in the chart and where. I assigned the positions myself, which can be arbitrary as you know. Leach is a 3B and Rose is 2B, etc etc etc....

               1B 2B 3B SS LF CF RF  TOTAL
-------------------------------------------
INNER CIRCLE    0  2  1  2  2  1  3   11
HOMERS          5  4  8  8  8  3  9   45
UPPER BORDER    1  3  0  0  3  3  2   12
LOWER BORDER    9  3  6  4  5  7  6   40
===========================================
TOTAL          15 12 15 14 18 14 20  108 


I'll be happy to answer any questions about who is at what position, if anyone wants to know.

By decade

Granted decades are arbitrary endpoints and it's often tough to know exactly which decade to put somebody in but I forged ahead. Again, the groupings are Dan's but the decade assignments are mine.

             1890s 1900s 1910s 1920s 1930s 1940s 1950s 1960s 1970s 1980s 1990s 2000s TOTAL
--------------------------------------------------------------------------------------------
INNER CIRCLE     0     1     0     1     2     1     0     3     1     1     0     1     11
HOMERS           8     2     1     1     1     3     4     6     5     4     5     5     45
UPPER BORDER     2     1     2     0     1     0     0     1     0     1     1     3     12
LOWER BORDER     3     3     6     3     4     1     2     3     5     6     3     1     40
============================================================================================
TOTAL           13     7     9     5     8     5     6    13    11    12     9    10    108 


In this chart in particular, it's worth noting that the blurring of a player's career between decades may make some gaps look bigger than they may actually be. I'll happily elaborate on who is in what group if anyone would find that information helpful.

Anyway, like I said, I'm not offering any judgment, but I thought this might provide the group with some interesting information.
   38. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 07:14 PM (#2292878)
I did not give Kiner war credit--I didn't know he fought. How old was he, and what were his minor league stats before that?

My McGraw pick has everything to do with the salary estimator and very little to do with the WARP system. McGraw played at such a high rate that even after adjusting for the high standard deviation of the time period, he looks like a dominant player--I have his 1899 as the 5th most valuable season between 1893 and 1946 (after Wagner '07-'08, Hornsby '24, and Jennings '96). Because the salary estimator rewards rate exponentially (versus playing time linearly), extremely high peak rate seasons are counted as being, well, extremely valuable--this is by design. But fully 31% of McGraw's value, in my book, is the 1899 season--a guy who had three of those and nothing else would be a Hall of Meriter in my book. If you look at career, McGraw's 43.6 WARP2 aren't much of a case (Jack Clark has over 47), and his best five seasons (31.4 WARP2) don't really stand out either (Pedro Guerrero has 31.3), and nobody on this board thinks Pedro Guerrero's peak with Jack Clark's career is a HoM'er. It's just because my salary estimator values rate so highly (which is how I like it) that McGraw comes out so well.

I've posted the data precisely so that people can dig into it and use it in their own systems and draw their own conclusions. This is just the raw information; how you value peak vs. career, and rate vs. durability, is up to you.
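
To make the rate-versus-playing-time point concrete, here is a minimal Python sketch. It assumes the full-season figure from Nate Silver's 2005 estimator ($212,730*WARP^2 + $402,530*WARP) is simply multiplied by the fraction of the season played, and it treats any WARP at or below zero as $0. The function names and the sample inputs are made up for illustration; none of them come from the spreadsheet.

```python
def full_season_salary(warp_per_year):
    """Silver's 2005 estimator for a full season.

    Assumption: any rate at or below zero is worth $0, so the squared
    term cannot turn a very negative WARP into a positive salary.
    """
    if warp_per_year <= 0:
        return 0.0
    return 212_730 * warp_per_year ** 2 + 402_530 * warp_per_year


def season_salary(warp_per_year, season_fraction):
    """Assumption: playing time scales the full-season salary linearly."""
    return full_season_salary(warp_per_year) * season_fraction


# Hypothetical inputs: the same 8 total WARP packed into one season or two.
one_big_year = season_salary(8.0, 1.0)
two_good_years = 2 * season_salary(4.0, 1.0)

# Doubling the rate more than doubles the pay; doubling playing time doubles it.
rate_ratio = season_salary(8.0, 1.0) / season_salary(4.0, 1.0)
time_ratio = season_salary(4.0, 1.0) / season_salary(4.0, 0.5)

print(f"one 8-WARP season:  ${one_big_year:,.0f}")
print(f"two 4-WARP seasons: ${two_good_years:,.0f}")
print(f"rate x2 -> salary x{rate_ratio:.2f}, playing time x2 -> salary x{time_ratio:.2f}")
```

Strictly speaking the growth in rate is quadratic rather than exponential, but the effect is the same: value concentrated in one monster season is paid a premium over the same value spread across several seasons.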
   39. 'zop sympathizes with the wrong ####### people Posted: February 06, 2007 at 07:21 PM (#2292886)
In this chart in particular, it's worth noting that the blurring of a player's career between decades may make some gaps look bigger than they may actually be. I'll happily elaborate on who is in what group if anyone would find that information helpful.

Anyway, like I said, I'm not offering any judgment, but I thought this might provide the group with some interesting information.


Consider, Doc Chaleeko, that the NL represents a varying proportion of the total HoM eligible players over time. For example, we'd expect a dramatic dropoff from the 1890's to the 1900's because of the rise of the AL and a concurrent dilution of talent. We'd expect a rise from the 50's through the 60's as African-American players enter the MLB game in large numbers.

I think when this is taken into consideration, the decade-breakdown looks nearly perfect.
   40. DavidFoss Posted: February 06, 2007 at 07:31 PM (#2292893)
When do you think you'll get around to doing the AL?
   41. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 07:42 PM (#2292903)
Parker is $57,146,405.

These numbers definitely include my best estimates of AL seasons (and seasons played at catcher, for Frank Chance, and pitcher, for George Van Haltren).

Mark Shirk, that information is in the spreadsheet. But I'll post it here.

RC+: Runs produced per out, relative to league average.
SFrac: Percentage of season played, compared to the league average plate appearances per lineup spot.

W1AA: Wins above average.
WARP1: Wins above a replacement player at the same position.
LeagueAdj: Ratio of the 2005 NL standard deviation to the league's projected standard deviation.
W2AA: Wins above average, adjusted for standard deviation.
RepW/Yr: Wins below average of a replacement player at the same position per full season, adjusted for standard deviation.
WARP2: Wins above a replacement player at the same position, adjusted for standard deviation.
WARP2/Yr: Wins above a replacement player at the same position, adjusted for standard deviation, projected to a full season.
PennAdd: Pennants Added.
Market Salary: How much the 2005 market would have paid for that performance.

Medwick
Year RC+ FRAA/Yr SFrac W1AA WARP1 LeagueAdj W2AA RepW/Yr WARP2 WARP2/Yr PennAdd      Salary
1937 205      -1  1.04  7.7   8.3      .847  6.5    -0.5   7.0      6.7    .100 $12,767,893
1936 169       4  1.02  5.8   6.3      .836  4.9    -0.4   5.2      5.1    .071  $7,848,027
1935 167       4  1.02  5.6   6.1      .808  4.5    -0.4   5.0      4.8    .066  $7,105,488
1941 154       9  0.88  4.5   5.1      .833  3.8    -0.6   4.3      4.8    .056  $6,127,671
1938 153       3  1.01  4.5   5.0      .828  3.7    -0.5   4.2      4.1    .054  $5,359,686
THREE YEAR TOTAL       19.1  20.7           15.9          17.2             .237 $27,721,408
FIVE YEAR TOTAL        28.1  30.8           23.3          25.6             .347 $39,208,765

Kiner
Year RC+ FRAA/Yr SFrac W1AA WARP1 LeagueAdj W2AA RepW/Yr WARP2 WARP2/Yr PennAdd      Salary
1951 219      -9  1.10  6.9   7.4      .943  6.5    -0.5   7.0      6.4    .100 $12,310,451
1949 211      -9  1.10  6.8   7.1      .937  6.4    -0.2   6.6      6.0    .093 $11,132,974
1947 193      -2  1.10  6.6   6.9      .939  6.2    -0.3   6.5      5.8    .090 $10,637,409
1948 159      11  1.12  5.8   6.2      .950  5.5    -0.3   5.9      5.3    .081  $8,976,516
1950 172     -11  1.12  4.3   4.7      .922  3.9    -0.4   4.4      3.9    .057  $5,343,011
THREE YEAR TOTAL       20.3  21.4           19.1          20.1             .283 $34,080,833
FIVE YEAR TOTAL        30.4  32.3           28.5          30.3             .422 $48,400,361
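
For anyone who wants to rebuild the adjusted columns from the raw ones, here is a rough Python sketch. The three relationships in the docstring are inferred from the printed rows, not taken from the methodology write-up, so treat them as assumptions; the function name is invented for this example.

```python
def adjust_season(w1aa, league_adj, rep_w_per_year, s_frac):
    """Rebuild W2AA, WARP2 and WARP2/Yr from the other columns.

    Assumed relationships (inferred from the table, not confirmed):
      W2AA     = W1AA * LeagueAdj
      WARP2    = W2AA - RepW/Yr * SFrac   (RepW/Yr is listed as negative)
      WARP2/Yr = WARP2 / SFrac
    """
    w2aa = w1aa * league_adj
    warp2 = w2aa - rep_w_per_year * s_frac
    warp2_per_year = warp2 / s_frac
    return w2aa, warp2, warp2_per_year


# Medwick 1937, using the values printed in the table above.
w2aa, warp2, warp2_per_year = adjust_season(
    w1aa=7.7, league_adj=0.847, rep_w_per_year=-0.5, s_frac=1.04
)
print(f"W2AA {w2aa:.2f}  WARP2 {warp2:.2f}  WARP2/Yr {warp2_per_year:.2f}")
```

Because the table is rounded to one decimal place, the rebuilt figures can differ from the printed ones in the last digit.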
   42. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 08:00 PM (#2292914)
Indeed. Also I would note that ceteris paribus, we'd expect to have more HoM'ers (by this measure) from recent decades just because of expansion. Let's just hypothetically say that everyone above 3 SD from the mean is a HoM'er. If you double your "sample size"--the number of players in the National League--you will double the number of players who are more than 3 SD from the mean. That seems right and logical to me.

DavidFoss, do you mean *complete* AL data (like this), or just estimates for guys on HoM ballots? The latter I hope to have by the 1995 election. The former is probably six months away--like I said, I do this by hand. If anyone's got a faster way (particularly to input WS and BP FRAA data into Excel), I'd love to hear it.
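
The expansion point above can be sketched in a few lines of Python. This assumes player value is normally distributed and counts only the upper tail (more than 3 SD above the mean); the league sizes are hypothetical round numbers chosen to show the scaling, not real player counts.

```python
import math


def upper_tail_probability(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def expected_count_above(n_players, z=3.0):
    """Expected number of players more than z SD above the mean."""
    return n_players * upper_tail_probability(z)


# Hypothetical pool sizes: the second is simply double the first.
small_league = expected_count_above(10_000)
large_league = expected_count_above(20_000)
print(f"{small_league:.1f} expected vs {large_league:.1f} expected")
```

The threshold in SD terms stays fixed, so the expected count is proportional to the size of the pool.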
   43. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 08:06 PM (#2292920)
So few from the 50s? I'd say Mays, Musial, Aaron, F. Robinson, Mathews, J. Robinson, Reese, Banks, and Snider were all 50's guys (obviously Mays/Aaron/F. Robinson were also 60s, and Musial/Reese were also 40s, but still).
   44. Dr. Chaleeko Posted: February 06, 2007 at 08:14 PM (#2292927)
So few from the 50s? I'd say Mays, Musial, Aaron, F. Robinson, Mathews, J. Robinson, Reese, Banks, and Snider were all 50's guys (obviously Mays/Aaron/F. Robinson were also 60s, and Musial/Reese were also 40s, but still).

Right, Dan, and that's what I was trying to say above. That the blurring of decades makes it tough to draw particular conclusions about fairness to eras unless there's a really obvious skew somewhere.
   45. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 08:19 PM (#2292932)
Everything on my screen is coming up as preformatted text...am I missing a ?

You could look at # of HoM'ers playing in each year...
   46. DavidFoss Posted: February 06, 2007 at 09:17 PM (#2292969)
DavidFoss, do you mean *complete* AL data (like this), or just estimates for guys on HoM ballots? The latter I hope to have by the 1995 election. The former is probably six months away--like I said, I do this by hand. If anyone's got a faster way (particularly to input WS and BP FRAA data into Excel), I'd love to hear it.

Yeah, complete data like this... OK, I understand these things are not automatic.

I've heard of programs that will scan a website like BP and dump data from pages into tables, specifically people do that in mid-season sometimes and it would also be useful for BP's frequent "updates" of their WARP data. I don't know how to do it though.

I do know that if you store the data in csv format it's quite a bit smaller and zips up nicer. I got the entire first sheet of your XLS file into a zip file that is only 218 KB.
   47. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 06, 2007 at 09:21 PM (#2292977)
Well, if you find out how to do it, please let me know!

Didn't think to stick it in .csv; I guess I'll redo that now for easier downloading.
   48. DavidFoss Posted: February 06, 2007 at 09:53 PM (#2293004)
Hmmm... baseball graphs has Win Share data here:

ftp://ftp.baseballgraphs.com/winshares/

It's "WS only" with the common "playerid" labels for identifying players (e.g. "killeha01" for Killebrew) instead of the mechanism that you have.

Wow, that CSV file is a big help for me! I'm glad I helped you look. :-)
   49. DavidFoss Posted: February 07, 2007 at 12:39 AM (#2293090)
Well, if you find out how to do it, please let me know!

I've made some progress!

Turns out there is a program on Linux called "wget" which will copy a webpage to your local machine. A free Windows version is easy to find with a Google search (I got the one here).

You can feed wget an input file with a list of URLs. Since baseballprospectus URLs are simply www.baseballprospectus.com/dt/playerid.php, you can create a list of 16000+ URLs containing all the unique playerids in the baseballgraphs csv file. It took about half an hour at work (might take longer at home), but I dumped all the BP webpages onto my local machine. It's about 300 MB. There were a couple of dozen pages that couldn't be found due to name disagreements (e.g. gwynnto02/gwynnan01), but that's a great success rate. With a good HTML parser you can then automatically parse those pages into csv-style data.

Turns out my work has great in-house software for this. I'll see what I can do. (Though I'm supposed to be working :-)). There may be great freeware out there for that too, for those that want to try at home.
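
For anyone who wants to try the URL-list step at home, here is a minimal Python sketch. The "http://" scheme, the "playerid" column name, and the file names are assumptions made for this example; only the www.baseballprospectus.com/dt/playerid.php pattern comes from the description above.

```python
import csv


def build_urls(player_ids):
    """Turn player IDs into BP player-card URLs, skipping blanks and repeats."""
    seen = set()
    urls = []
    for player_id in player_ids:
        player_id = player_id.strip()
        if player_id and player_id not in seen:
            seen.add(player_id)
            # Assumed scheme; the path pattern is the one quoted in the post.
            urls.append(f"http://www.baseballprospectus.com/dt/{player_id}.php")
    return urls


def write_url_list(csv_path, out_path, id_column="playerid"):
    """Read IDs from one CSV column (assumed name) and write one URL per line."""
    with open(csv_path, newline="") as src:
        ids = [row[id_column] for row in csv.DictReader(src)]
    with open(out_path, "w") as dst:
        dst.write("\n".join(build_urls(ids)) + "\n")


# Two IDs mentioned in this thread, one repeated on purpose.
print(build_urls(["killeha01", "gwynnto02", "killeha01"]))
```

After something like write_url_list("winshares.csv", "urls.txt") with your own file names, running "wget -i urls.txt" fetches every page in the list.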
   50. Dr. Chaleeko Posted: February 07, 2007 at 03:38 AM (#2293225)
Here's a question the commish posted on the discussion thread. I thought this might be a good thread to discuss it in since I think that Dan Rosenheck is doing something a little different than what Woolner does (if I get both of their systems).

I think I might use VORP+FRAA as a foundation for running a hitter type spreadsheet similar to my pitching one. I realize VORP only goes back to 1959 on the website (according to Neyer's article), I'll have to come up with something else pre-1959, but I'll let you guys know how it works out . . . any deficiences I should be made aware of before starting?

Dan, I think you explained to me that your WARPs are based on freely available talent (as originally defined and studied by Nate Silver, IIRC), whereas Woolner's VORP measures against backups, not FAT. What's the advantage of one over the other?
   51. Dr. Chaleeko Posted: February 07, 2007 at 03:42 AM (#2293231)
Also a quick procedural question:

2. Use Nate Silver’s 2005 salary estimator ($212,730*WARP^2 + $402,530*WARP) to find out how much the player would have earned on the 2005 market had he played a full season. Convert all negative numbers to $0.

Why wouldn't zeroes be equivalent to the minimum major league salary? The minimum salary is paid to players regardless of performance. Alternatively, how about the AAA minimum since, presumably, the parent club bears some of the costs associated with a player it has farmed out?
   52. jimd Posted: February 07, 2007 at 04:07 AM (#2293241)
Why wouldn't zeroes be equivalent to the minimum major league salary?

Three reasons that I can think of right away.

1) The regression on which the formula is based does not produce that result, or
2) Silver acknowledges that WARP's replacement level is too low and accounts for it, or
3) A constant term that adds the MLB minimum salary to the formula has gotten lost somewhere

What's the MLB minimum as of 2005 anyway? 300K or has it gone up?

A WARP of 1.0 should get 615K as a full time player according to the estimator.
A WARP of 0.6 should get 318K as a full time player according to the estimator.
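
Those two figures are easy to check. Here is a short Python sketch of the quoted estimator ($212,730*WARP^2 + $402,530*WARP); the function name is invented, and treating any WARP at or below zero as $0 is one reading of "convert all negative numbers to $0".

```python
def silver_salary(warp):
    """2005 market salary for a full-season WARP, per the quoted formula.

    Assumption: WARP at or below zero is worth $0.
    """
    if warp <= 0:
        return 0.0
    return 212_730 * warp ** 2 + 402_530 * warp


for warp in (0.6, 1.0):
    print(f"WARP {warp}: ${silver_salary(warp):,.0f}")
```

As written, the formula has no constant term, which is why a WARP near zero prices out below any plausible minimum salary.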
   53. DavidFoss Posted: February 07, 2007 at 04:08 AM (#2293243)
OK. I have BP's "Actual Batting Statistics" and "Advanced Batting Statistics" converted to csv player-season format for >99% of the players.

Its 11.8 MB (3.6 MB zipped). Where should I put it? Dan, what is your email address?
   54. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 07, 2007 at 04:26 AM (#2293255)
Dr. Chaleeko, my reasoning is simple: not all backups are freely available. I don't know the nitty-gritty of Woolner's study, but I imagine it would be tough to distinguish backups from, say, September callups just by looking at games played or plate appearances. Moreover, I don't think it includes defense, as Nate's does. Meanwhile, Silver's methodology is simple, straightforward, and dead-on accurate: over age 27, making less than half the league minimum salary. If it ain't broke, don't fix it.

Good point about the league minimum salary, although I can't imagine it would change much. I'm just using Nate's equation (no, he doesn't pay me to do his P.R.) found at http://www.baseballprospectus.com/article.php?articleid=4535. The reason why there's no minimum salary in the formula is definitely not jimd's reason #2, because Silver is using the WARP found on the PECOTA cards, which has a realistic replacement level, not the crazy-low one on the DT cards. #3 seems to me the most likely. I could just email Nate and ask if people actually think it's important.

DavidFoss, you are a god...please send it to cooberp@gmail.com. Thanks a zillion.
   55. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 07, 2007 at 04:28 AM (#2293256)
ummmmmmm.......I meant less than twice the league minimum salary. Making less than half the league minimum while on a major league roster would be quite a feat.
   56. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 07, 2007 at 08:35 PM (#2293632)
I've just made a few tweaks: added average player height and weight to the regression, which squeezed another 1.5% out of the r^2, added a handful of player-seasons missing from the previous spreadsheet, and saved the file in .csv format for easier downloading. The URL is the same. I'm working on estimates for AL candidates now.
   57. TomH Posted: February 08, 2007 at 04:27 PM (#2294086)
Comments to Dan, along with explaining how I go about things:

Yes, I do account for value at different positions, AND the fact that replacement levels vary over time, AND that different leagues were easier to dominate. Also, I agree with Dan that variation is one reasonable proxy to use for ease of domination. Your league-wide regression, Dan, seems very well thought out, good use of dummy variables, and if I were to strap myself to a purely mathematical model I might choose yours over any other I’ve seen.

But. Let me tackle the topics of Replacement Level and Variation

Let’s say all MLB hitters were represented on a scale for 0 to 10.

In one league of 8 shortstops, 7 starting SS were hitters of value 0, 1, 2, 3, 4, 5, and 6 (ignore defense for now).

Dave Concepcion is a 6. The three worst guys were 0, 1, and 2, so replacement level = 1 (avg of those 3, using Dan’s method), and Davey would be +5.

But if a MLB manager had decided that his rotten-hitting SS (0) and mediocre-hitting (3) 2Bman should swap positions for defensive purposes, the 3 lowest shortstops would be 1, 2, 3, average of 2, so Davey is only +4.

I don’t think in scenario two Concepcion is, in the big picture, much less valuable. There is some fungibility among players. So, I prefer to take a bit broader brush of position-replacement-level, which puts me somewhere in the middle of the “value vs. ability” debate. I was an advocate of Joe Sewell, partly because shortstops hit so poorly when he played that he was valuable, but I also understand that maybe he was fortunate to come along at that time and place in history.
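
The two scenarios above can be worked through in a few lines of Python. This is only a sketch of the toy example: the 0-to-10 scale and the rule of averaging the three worst starters are the ones described in this post, and the function name is made up.

```python
def replacement_level(starters, n_worst=3):
    """Average of the n_worst lowest values among a position's starters."""
    worst = sorted(starters)[:n_worst]
    return sum(worst) / len(worst)


concepcion = 6

# Scenario one: shortstops as listed.
shortstops = [0, 1, 2, 3, 4, 5, 6]
# Scenario two: the 0 hitter moves to 2B and the 3-hitting 2B takes over at SS.
shortstops_after_swap = [3, 1, 2, 3, 4, 5, 6]

for label, group in (("before swap", shortstops), ("after swap", shortstops_after_swap)):
    rep = replacement_level(group)
    print(f"{label}: replacement = {rep:.0f}, Concepcion = +{concepcion - rep:.0f}")
```

Nothing about Concepcion changes between the two runs; only the makeup of the comparison group does.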

Variation:
I am not in favor of using the term “.932 * Win% of worst team in league” to measure league variation (ease of dominance). This has been discussed when the Dynasties book came out, and the authors used the variation in team runs scored and runs allowed to determine how many standard deviations one team was above the rest. While in a general sense this is a good idea, there are many strong reasons why this ain’t such a hot idea to strap yourself to. I’d rather not spend another 1000 words on this now.

So, if the 1970s guys played in a time when the worst team wasn’t TOO bad, I’ll account for that somewhat. I am one who places such guys as Joe Morgan, Mike Schmidt, and Tom Seaver higher on all-time lists than most other baseball nuts who rank such things. But I’m skeptical that this should be what seems to be a large boost; bottom line is, Dave C does not make my top 30 if I use Win Shares or WARP3 or OWP/RCAA/RCAP + defense. However, I am Very open to analyzing the reasons why he comes out ranked much higher by Dan’s methods. A very hearty welcome, Dan, and I look forward to more of your work and full-throttle debates.
   58. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 06:21 PM (#2294171)
Tom,

Thanks very much for your thoughtful comments. Responses:

1. I don't think your hypothetical is realistic. Let's say the SS is 2.5 wins below average with the bat, and the 2B is 1 win below average with the bat, and both are league-average fielders at their positions. Why would the manager switch them?

I know I sound like a walking Nate Silver ad, but he wrote, "I suspect that major league shortstops, as a group, have quite a bit more general athletic ability than major league second basemen. You could put Miguel Tejada at second base if you wanted to, and he’d still outhit pretty much everyone at the position, but there’s no reason to since his defense is more valuable at short. You couldn’t put Jeff Kent at shortstop, however, without your pitching staff chipping in on a bounty against you."

So the SS might pick up 5 runs in the field, but the 2B would lose 15. Thus, you'd have the same offense but give up 10 more runs on defense. Why would a manager do that?

2. If I remove win% of worst team from the regression, I still get .5769 r^2, so it's having an effect of less than 1%. I could easily take it out with no consequences for the results. The late 70s and 80s had low standard deviations not because there were few historically awful teams, but because they were low-scoring integrated leagues many years removed from expansion in an era when the player population per team was quite high. The only reason I put in win % of worst team was because I thought 1899 guys like McGraw just got to beat up on the Spiders all day long. But the effect wound up being almost nonexistent.

3. Of course Win Shares won't like Concepción; it has no replacement level. WARP3 loves him, I'm surprised he wouldn't rank high on that basis. I don't know how RCAP + defense is calculated, but if it compares Concepción to the replacement level shortstops *of his day*, he should do fairly well. I'm sure it doesn't take standard deviations into account, though, which further strengthen his case.
   59. TomH Posted: February 08, 2007 at 07:09 PM (#2294204)
Dan, I should have said "I use a combination of Win Shares or WARP3 or OWP/RCAA/RCAP + defense". WS hates DC, WARP3 does like him a lot.

What did you mean by "Of course Win Shares won't like Concepción; it has no replacement level"? WARP has a pretty low replacement level too.

In my first example, some guys play 2B or 3B who could play MLB shortstop, but already have a SS on their team (see: A Rod). Or, how about this: let’s say in 2007 the Brewers choose between playing Bill Hall in CF or at SS. If he plays SS and cranks out 35 HR while some rookie plays a poor CF, or if he plays CF and JJ Hardy has a poor year at short, does this really impact Jose Reyes’ value?

What is Concepcion's best year by your WARP score, Dan?
   60. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 07:24 PM (#2294220)
Yes, but WARP has different replacement levels for different positions. WS only distinguishes between positions via Fielding WS, and guys at tough positions aren't credited with sufficient FWS to properly account for the gap in replacement level at their position versus, say, 1B.

I determine replacement level by averaging 27 different player-seasons for each league-season-position. I'm assuming that situations like the hypothetical Bill Hall one you're describing cancel themselves out in a 27-player sample.

I have Concepción's best year as 1981 (straight line adjusting for season length). His 1974 is right behind.

Because I initially started posting this research in support of Concepción's candidacy (and now because of my handle), people only seem to be discussing this work in regards to him. There are 8,350 player-seasons in here! There are plenty of guys to pay attention to...Ron Cey and Reggie Smith also look like HoM'ers to me and will be on my ballot. Can we talk about them too? :)
   61. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 08:56 PM (#2294263)
Since for most people this seems to be heresy, why don't I make my argument that Reggie Smith is more deserving than Willie Stargell.

Here are the charts, sorted from best season to worst. (Note that tweaks to the stdev regression have caused most career salaries to increase by about $5 million; changes in ranking are insignificant. And the AL data is necessarily less accurate because I'm using NL replacement levels (adjusted for the DH where necessary, as in Smith's 1973) and average Fielding Win Shares at each position).

Glossary
BWAA/Yr: Batting wins above league average per full season played.
BRWAA/Yr: Baserunning wins above league average per full season played.
FWAA/Yr: Fielding wins above positional average per full season played.
Rep: Wins above league average per full season played of a replacement player at the position.
WARP1/Yr: Wins above a replacement player at the position per full season played. This should theoretically equal BWAA/Yr + BRWAA/Yr + FWAA/Yr - Rep (Rep is a negative number, so subtracting it adds the replacement gap back), but often differs by a few tenths of a win due to Pythagorean effects.
&#xSe;as: Percentage of the season played, as compared to the league average plate appearances per lineup slot.
WARP1: Wins above a replacement player at the position in the given season (WARP1/Yr * &#xSe;as).
LgAdj: Ratio of the 2005 NL projected standard deviation to the projected standard deviation of the season in question.
WARP2: Wins above a replacement player at the position in the given season, adjusted for the standard deviation (“ease of domination”) of the league.
WARP2/Yr: Wins above a replacement player at the position per full season, adjusted for the standard deviation (“ease of domination”) of the league.
PennAdd: Pennants added, adjusted for the standard deviation.
MarketSal: How much the 2005 market would have paid for the player’s performance.
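As a rough check of how the glossary columns fit together, here is a minimal sketch using the rounded figures from Reggie Smith's 1977 line. The function names are illustrative only, and because the inputs are already rounded to one decimal, the results can differ from the published columns by a tenth or so (on top of the Pythagorean effects mentioned above).

```python
def warp1_per_year(bwaa, brwaa, fwaa, rep):
    """Wins above positional replacement per full season.

    Rep is the replacement player's wins above league average (usually
    negative), so it is subtracted from the sum of the three components.
    """
    return bwaa + brwaa + fwaa - rep

def warp1(warp1_yr, season_frac):
    """Scale the full-season rate by the fraction of the season played."""
    return warp1_yr * season_frac

def warp2(warp1_value, lg_adj):
    """Apply the standard-deviation (LgAdj) adjustment to WARP1."""
    return warp1_value * lg_adj

# Reggie Smith, 1977 (rounded chart values).
rate = warp1_per_year(6.2, 0.1, 0.1, -0.7)  # about 7.1; chart shows 7.2
season = warp1(7.2, 0.94)                   # about 6.8; chart shows 6.7
adjusted = warp2(6.7, 0.985)                # about 6.6; chart shows 6.6
print(round(rate, 1), round(season, 1), round(adjusted, 1))
```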

Reggie Smith

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr %Seas WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1977     6.2      0.1     0.1 -0.7      7.2   .94   6.7  .985   6.6      7.1    .094  $12,691,176
1974     4.8      0.1     0.7 -0.8      6.4   .95   6.1  .978   5.9      6.3    .082  $10,331,219
1980     5.5      0.2     1.5 -0.8      7.8   .56   4.4 1.036   4.5      8.1    .060   $9,656,867
1978     5.7      0.3    -0.8 -0.7      5.9   .84   5.0 1.027   5.1      6.1    .068   $8,594,862
1973     3.9     -0.1     0.7 -1.7      6.3   .78   5.0  .954   4.7      6.0    .063   $7,964,437
1972     3.9      0.1    -0.5 -0.9      4.5   .93   4.1  .978   4.1      4.4    .053   $5,411,664
1970     2.8     -0.2     0.4 -1.2      4.3  1.02   4.4  .926   4.1      4.0    .053   $5,082,542
1969     3.8     -0.6    -0.2 -1.3      4.4  0.96   4.3  .929   4.0      4.1    .051   $5,067,259
1975     3.8      0.2    -0.4 -0.7      4.3   .87   3.7 1.000   3.7      4.3    .048   $4,911,259
1971     2.9      0.1    -0.4 -1.2      3.8  1.12   4.3  .959   4.1      3.6    .053   $4,800,811
1976     1.6      0.1     1.8 -0.7      4.3   .69   2.9  .997   2.9      4.2    .036   $3,811,528
1968     2.7     -0.5    -0.3 -1.2      3.1  1.03   3.2 1.010   3.2      3.1    .040   $3,506,523
1982     3.7      0.2    -0.1  0.0      3.8   .63   2.4 1.033   2.5      3.9    .030   $3,079,236
1979     2.7      0.3     0.8 -0.7      4.5   .42   1.9 1.015   1.9      4.6    .023   $2,678,304
1967     0.6      0.0     0.7 -1.1      2.4  1.01   2.5  .976   2.4      2.4    .029   $2,176,637
TOTAL    3.6      0.0     0.5 -1.0      4.8 12.75  60.7  .981  59.6      4.7    .783  $89,664,325


Willie Stargell

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr %Seas WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1973     7.2      0.0    -0.2 -0.9      7.9   .96   7.6  .967   7.3      7.6    .105  $14,770,951
1971     7.4      0.0    -0.8 -0.8      7.5   .95   7.1  .958   6.8      7.2    .097  $13,119,448
1974     5.6      0.0    -0.7 -0.8      5.8   .95   5.5  .978   5.4      5.6    .073   $8,568,340
1966     5.9      0.0    -0.5 -0.6      6.1   .85   5.1  .931   4.8      5.7    .064   $7,668,318
1969     5.4      0.0    -0.7 -0.9      5.7   .94   5.4  .914   4.9      5.2    .066   $7,454,198
1972     5.7      0.0    -0.9  0.3      4.6   .94   4.3  .963   4.1      4.4    .054   $5,524,119
1978     4.9      0.1    -0.3  0.0      4.6   .72   3.3 1.027   3.4      4.7    .043   $4,809,055
1965     2.8      0.0     0.7 -0.6      4.2   .91   3.8  .908   3.4      3.8    .044   $4,137,982
1975     4.1      0.0    -0.4  0.2      3.6   .84   3.0 1.000   3.0      3.6    .038   $3,516,186
1968     2.8      0.1    -0.1 -0.8      3.7   .78   2.9 1.015   2.9      3.4    .036   $3,479,362
1967     3.8      0.0    -1.0 -0.7      3.6   .85   3.0  .959   2.9      3.4    .036   $3,299,312
1979     3.7      0.0    -0.1  0.0      3.6   .75   2.7 1.015   2.8      3.7    .034   $3,256,971
1970     2.2      0.0     0.4 -0.8      3.5   .82   2.9  .921   2.7      3.2    .033   $2,896,053
1977     4.0      0.0    -0.7  0.0      3.4   .34   1.2  .985   1.1      3.3    .013   $1,257,954
1976     2.3      0.1    -0.9  0.0      1.4   .77   1.1  .997   1.1      1.4    .013     $780,313
1964     2.3      0.0    -1.2 -0.6      1.7   .70   1.2  .907   1.0      1.5    .012     $758,583
1980     2.7      0.0    -0.9  0.1      1.7   .36   0.6 1.036   0.6      1.8    .007      $13,648
1963     0.7      0.0    -1.5 -0.4     -0.4   .52  -0.2  .905  -0.2     -0.4   -.003           $0
1981    -0.8      0.0    -1.1  0.0     -1.9   .17  -0.3  .981  -0.3     -1.8   -.004           $0
TOTAL    4.2      0.0    -0.5 -0.4      4.2 14.25  60.2  .958  57.7      4.0    .761  $85,809,433
   62. TomH Posted: February 08, 2007 at 08:57 PM (#2294264)
Oh yeah, bring on stuff about RSmith and Cey (in the discussion threads). Especially if you can explain why your ##s see them more favorably than other metrics.

On Dave C - Win Shares also sees 1974 as his best year. I suppose a peak/prime voter's answer to "why not Concepcion" might be: if in your best season you had a 106 OPS+, while making 30 errors & finishing just above average in range factor, and your team didn't win the division for the only time in a 5-year period, how great were you?
   63. DL from MN Posted: February 08, 2007 at 09:03 PM (#2294268)
Toss a Japan MLE season onto Reggie Smith's numbers and the separation gets wider.
   64. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 09:05 PM (#2294270)
Dammit, that didn't work. It doesn't like the percentage sign, and I'm missing a year for Stargell. Here they are again...

Reggie Smith

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1977     6.2      0.1     0.1 -0.7      7.2   .94   6.7  .985   6.6      7.1    .094  $12,691,176
1974     4.8      0.1     0.7 -0.8      6.4   .95   6.1  .978   5.9      6.3    .082  $10,331,219
1980     5.5      0.2     1.5 -0.8      7.8   .56   4.4 1.036   4.5      8.1    .060   $9,656,867
1978     5.7      0.3    -0.8 -0.7      5.9   .84   5.0 1.027   5.1      6.1    .068   $8,594,862
1973     3.9     -0.1     0.7 -1.7      6.3   .78   5.0  .954   4.7      6.0    .063   $7,964,437
1972     3.9      0.1    -0.5 -0.9      4.5   .93   4.1  .978   4.1      4.4    .053   $5,411,664
1970     2.8     -0.2     0.4 -1.2      4.3  1.02   4.4  .926   4.1      4.0    .053   $5,082,542
1969     3.8     -0.6    -0.2 -1.3      4.4  0.96   4.3  .929   4.0      4.1    .051   $5,067,259
1975     3.8      0.2    -0.4 -0.7      4.3   .87   3.7 1.000   3.7      4.3    .048   $4,911,259
1971     2.9      0.1    -0.4 -1.2      3.8  1.12   4.3  .959   4.1      3.6    .053   $4,800,811
1976     1.6      0.1     1.8 -0.7      4.3   .69   2.9  .997   2.9      4.2    .036   $3,811,528
1968     2.7     -0.5    -0.3 -1.2      3.1  1.03   3.2 1.010   3.2      3.1    .040   $3,506,523
1982     3.7      0.2    -0.1  0.0      3.8   .63   2.4 1.033   2.5      3.9    .030   $3,079,236
1979     2.7      0.3     0.8 -0.7      4.5   .42   1.9 1.015   1.9      4.6    .023   $2,678,304
1967     0.6      0.0     0.7 -1.1      2.4  1.01   2.5  .976   2.4      2.4    .029   $2,176,637
TOTAL    3.6      0.0     0.5 -1.0      4.8 12.75  60.7  .981  59.6      4.7    .783  $89,664,325


Willie Stargell

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1973     7.2      0.0    -0.2 -0.9      7.9   .96   7.6  .967   7.3      7.6    .105  $14,770,951
1971     7.4      0.0    -0.8 -0.8      7.5   .95   7.1  .958   6.8      7.2    .097  $13,119,448
1974     5.6      0.0    -0.7 -0.8      5.8   .95   5.5  .978   5.4      5.6    .073   $8,568,340
1966     5.9      0.0    -0.5 -0.6      6.1   .85   5.1  .931   4.8      5.7    .064   $7,668,318
1969     5.4      0.0    -0.7 -0.9      5.7   .94   5.4  .914   4.9      5.2    .066   $7,454,198
1972     5.7      0.0    -0.9  0.3      4.6   .94   4.3  .963   4.1      4.4    .054   $5,524,119
1978     4.9      0.1    -0.3  0.0      4.6   .72   3.3 1.027   3.4      4.7    .043   $4,809,055
1965     2.8      0.0     0.7 -0.6      4.2   .91   3.8  .908   3.4      3.8    .044   $4,137,982
1975     4.1      0.0    -0.4  0.2      3.6   .84   3.0 1.000   3.0      3.6    .038   $3,516,186
1968     2.8      0.1    -0.1 -0.8      3.7   .78   2.9 1.015   2.9      3.4    .036   $3,479,362
1967     3.8      0.0    -1.0 -0.7      3.6   .85   3.0  .959   2.9      3.4    .036   $3,299,312
1979     3.7      0.0    -0.1  0.0      3.6   .75   2.7 1.015   2.8      3.7    .034   $3,256,971
1970     2.2      0.0     0.4 -0.8      3.5   .82   2.9  .921   2.7      3.2    .033   $2,896,053
1977     4.0      0.0    -0.7  0.0      3.4   .34   1.2  .985   1.1      3.3    .013   $1,257,954
1976     2.3      0.1    -0.9  0.0      1.4   .77   1.1  .997   1.1      1.4    .013     $780,313
1964     2.3      0.0    -1.2 -0.6      1.7   .70   1.2  .907   1.0      1.5    .012     $758,583
1980     2.7      0.0    -0.9  0.1      1.7   .36   0.6 1.036   0.6      1.8    .007     $498,640
1982     0.7      0.0    -0.6  0.0      0.2   .13   0.0 1.033   0.0      0.2    .000      $13,648
1963     0.7      0.0    -1.5 -0.4     -0.4   .52  -0.2  .905  -0.2     -0.4   -.003           $0
1981    -0.8      0.0    -1.1  0.0     -1.9   .17  -0.3  .981  -0.3     -1.8   -.004           $0
TOTAL    4.2      0.0    -0.5 -0.4      4.2 14.25  60.2  .958  57.7      4.0    .761  $85,809,433
   65. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 09:31 PM (#2294294)
I didn't know Smith played in Japan! Is there an available MLE?

Dammit, there's still a typo on Smith's chart--he averaged 0.2 fielding wins above average per full season for his career, not 0.5.

Anyways, to analyze these charts, briefly--Stargell was clearly the superior hitter as everyone knows (60.3 batting wins above average, versus 45.9 for Smith). But the gap in their defensive value is greater. While Smith played half his career in center, half in the corners, Stargell played 40% of his career at first base, in a very strong era for first base depth (Guillermo "Willie" Montañez was consistently among the very worst-hitting first basemen in the NL with a 105 OPS+), and the rest in the corners. Thus, the players that would have replaced Stargell had he not been playing were, on average, just 0.4 wins below overall league average, while Smith's replacements averaged 1.0 wins below league average. Moreover, Smith fielded his positions at a slightly above-average rate (2.3 fielding wins above average), while Stargell was consistently bad and occasionally a butcher (7.3 fielding wins below average).

Smith's advantages on defense exactly negate Stargell's on offense for their careers; both have between 60 and 61 WARP1. But Smith debuted four years later than Stargell, and thus contributed more of his value in the very low standard deviation era of the late 1970s, while Stargell had more value in the higher-standard deviation 1960s, closer to expansion years. Thus, Stargell's 60.2 WARP would only be worth 57.7 in the 2005 NL, while Smith's would be worth 59.6.
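A minimal sketch of how the standard deviation adjustment opens up the gap, using the career WARP1 and LgAdj values from the TOTAL rows of the two charts. Treating the career LgAdj as a single multiplier is a simplification (the spreadsheet adjusts season by season), and the inputs are rounded, so Smith comes out near 59.5 here rather than the published 59.6.

```python
# Career WARP1 and career-average LgAdj, taken from the TOTAL rows.
careers = {
    "Reggie Smith": {"warp1": 60.7, "lg_adj": 0.981},
    "Willie Stargell": {"warp1": 60.2, "lg_adj": 0.958},
}

def to_warp2(warp1, lg_adj):
    """Scale WARP1 by the ratio of 2005 NL stdev to the player's era stdev."""
    return warp1 * lg_adj

warp2 = {name: to_warp2(c["warp1"], c["lg_adj"]) for name, c in careers.items()}

# Gap in Smith's favor before and after the adjustment.
gap_before = careers["Reggie Smith"]["warp1"] - careers["Willie Stargell"]["warp1"]
gap_after = warp2["Reggie Smith"] - warp2["Willie Stargell"]

for name, value in warp2.items():
    print(f"{name}: {value:.1f} WARP2")
print(f"gap before: {gap_before:.1f}, gap after: {gap_after:.1f}")
```

So a half-win edge in WARP1 grows to nearly two wins once each career is rescaled to the 2005 NL.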

On a peak basis, Smith didn't quite have a year to measure up to Pops's '71 and '73 (although his '77 was close), but had more All-Star caliber seasons overall. The salary estimator is extremely peak-oriented, and it prefers Smith's career. Your mileage may vary.

All in all, I think they are extremely close in value, and given the greater uncertainty in defensive numbers, I'd probably still take Stargell. But I think both are deserving HoM'ers.
   66. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 08, 2007 at 09:40 PM (#2294300)
You know my response, but I'll give it anyways: If you trust an average of FRAA and Fielding Win Shares (and for what it's worth, Defensive Regression Analysis and his reputation at the time agree), in spite of the 30 errors and unremarkable range factor, he was an extremely good fielder, stole 41 bases at an excellent rate, and played in an era where replacement shortstops were putrid and talent was tightly bunched together in general. It was an extremely valuable season. And the team won 98 games despite a below-average pitching staff (with a little help from Messrs. Morgan and Bench).
   67. sunnyday2 Posted: February 08, 2007 at 11:02 PM (#2294351)
Reggie #2 is in my PHoM. So is Pops. Davey isn't gonna make it. He's not quite Joe Sewell who never made my PHoM either.
   68. Juan V Posted: February 08, 2007 at 11:08 PM (#2294358)
One doubt I have: The only difference between WARP1 and WARP2 is the standard-deviation league adjustment, right?
   69. Joey Numbaz (Scruff) Posted: February 09, 2007 at 04:11 AM (#2294528)
Still up at the top, but this jumped out at me:

"I forgot to mention that yes, these are park-adjusted. Five year weighted moving average of baseball-reference park factors."


The baseball reference park factors, I believe are already multiple year factors - there isn't enough variance in them for this not to be true.

So you might be over-correcting there. You should be able to use the BR factors right out of the box, so to speak. Also those factors (I think) account for situations where the park changed significantly, etc..

Also, not sure if you accounted for this twice somewhere else, but those factors already account for not facing your teammates. So you don't need to adjust for that anywhere if you are using them.

**********

This appears very promising - I can't wait to jump into it further.
   70. Joey Numbaz (Scruff) Posted: February 09, 2007 at 04:28 AM (#2294533)
One thing Dan - once you have all of the data (meaning AL/AA/NA), I would very strongly recommend using the bottom 3 (or 6 if you go with the same proportions) MLB players, not separate replacement levels at each position for each league/season. The league distinctions are meaningless, plus your sample sizes will be bigger. This will help 'smooth-out' the peaks and valleys I think.
   71. Joey Numbaz (Scruff) Posted: February 09, 2007 at 04:31 AM (#2294537)
One other thing - I don't think I like Nate Silver's salary estimator - it over-exaggerates the value of big seasons.

I would much prefer using something like Pennants Added (another BPro brainchild) for giving weight to a high peak.
   72. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 04:35 AM (#2294540)
Juan V., that's correct. WARP2 is just WARP1 times the ratio of the 2005 NL projected stdev to the projected stdev of the league-season in question.

Joe Dimino, is there any way to find out baseball-reference.com's park factor methodology? And yes, I agree that there is one MLB replacement level, not two league-specific ones--I just have to be careful to account for the DH! (That's why Reggie Smith's 1973 replacement level is so much lower than his earlier ones).
   73. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 04:51 AM (#2294545)
Pennants Added is already in the spreadsheet, my friend. But that's a question of your career vs. peak preferences as a HoM voter, and has nothing to do with what can be learned from this WARP data.
   74. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 06:22 AM (#2294580)
And here are charts for Reggie Smith's contemporaries Bobby Bonds and Jimmy Wynn, as well as Indian Bob Johnson who many people like. (I haven't included any minor league equivalent data for Johnson, but also haven't docked him for wartime competition, so I'm assuming those cancel out.) Smith and Wynn were equally good hitters, but Smith was a better fielder and had a longer career. Smith and Bonds were equally good fielders and had the same career length, but Smith was a much better hitter. Johnson's career was equally valuable to Smith's, but in an era that was *much* easier to dominate as measured by standard deviations. Smith clearly seems to me a meaningful step above all of them, similar in overall value to Stargell and over the in/out line.

Bobby Bonds

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1971     4.1      0.1     0.9 -0.8      5.9  1.09   6.4  .958   6.2      5.6    .086   $9,899,128
1973     4.0     -0.1     0.7 -0.9      5.5  1.17   6.4  .967   6.1      5.3    .085   $9,355,608
1969     3.2      0.6     1.0 -0.9      5.7  1.15   6.6  .914   6.0      5.2    .083   $9,104,927
1975     4.3     -0.2     0.4 -1.1      5.7  1.00   5.7  .976   5.5      5.5    .076   $8,749,696
1970     3.4      0.3    -0.1 -0.8      4.5  1.17   5.3  .921   4.8      4.1    .064   $6,196,262
1972     1.6      1.0     0.5 -0.9      4.0  1.16   4.7  .963   4.5      3.9    .059   $5,494,767
1977     3.1     -0.1     0.0 -1.1      4.2  1.06   4.4  .932   4.1      3.9    .054   $5,103,730
1974     2.0      0.3     0.7 -0.8      3.8  1.06   4.0  .978   3.9      3.7    .051   $4,679,441
1978     2.0      0.1    -0.3 -1.1      3.9  1.04   4.0  .959   3.9      3.7    .050   $4,633,433
1968     2.7     -0.1     0.6 -0.8      4.0   .56   2.2 1.015   2.2      4.0    .027   $2,832,358
1976     1.8     -0.1     0.7 -1.1      3.5   .68   2.4 1.014   2.4      3.6    .030   $2,821,514
1979     2.2     -0.6    -0.4 -1.1      2.5   .99   2.4  .937   2.3      2.3    .028   $2,047,329
1981     0.2     -0.7    -0.3 -0.8      0.0   .46   0.0  .981   0.0      0.0   -.001           $0
1980    -1.4      0.1    -0.7 -0.8     -1.3   .43  -0.5 1.036  -0.6     -1.3   -.007           $0
TOTAL    2.7      0.1     0.3 -0.9      4.2 13.03  54.1  .950  51.4      3.9    .684  $70,918,193


Jimmy Wynn

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1969     6.6      0.1    -0.6 -1.3      7.5  1.03   7.8  .914   7.1      6.9    .101   $13,185,333
1968     5.6     -0.7     0.2 -1.2      6.3  1.02   6.4 1.015   6.5      6.4    .092   $11,521,023
1974     4.9     -0.4     0.8 -1.2      6.5  1.03   6.7  .978   6.6      6.4    .092   $11,520,600
1965     4.3      0.6     0.1 -1.0      6.2  1.03   6.3  .908   5.8      5.6    .079    $9,194,798
1972     4.3      0.0    -0.4 -0.9      4.9  1.10   5.3  .963   5.1      4.7    .069    $7,176,099
1970     4.1      0.2    -0.7 -1.2      5.0  1.05   5.2  .921   4.8      4.6    .064    $6,635,005
1975     3.9     -0.5     0.3 -1.1      4.7   .84   4.0 1.000   4.0      4.7    .051    $5,557,357
1967     3.9      0.1    -1.1 -1.1      4.0  1.07   4.3  .959   4.1      3.9    .054    $5,047,334
1966     2.0     -0.4    -0.2 -1.0      2.5   .73   1.9  .931   1.7      2.4    .020    $1,564,932
1973     1.4     -0.3     0.0 -0.9      2.0   .91   1.8  .967   1.8      1.9    .021    $1,427,096
1963     1.3      0.2    -0.8 -0.4      1.1   .45   0.5  .905   0.4      1.0    .005      $276,413
1964    -0.7      0.3    -1.1 -1.0     -0.4   .39  -0.2  .907  -0.1     -0.4   -.002            $0
1971    -1.6      0.3     0.2 -0.8     -0.3   .74  -0.2  .958  -0.2     -0.3   -.003            $0
TOTAL    3.5     -0.1    -0.2 -1.0      4.4 11.39  49.8  .956  47.6      4.2    .643   $73,105,990


Bob Johnson

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1944     8.2      0.0     0.2 -0.4      8.9   .81   7.2  .845   6.1      7.5    .084  $12,099,978
1939     6.5      0.4    -0.3 -0.5      7.1   .85   6.1  .813   4.9      5.8    .066   $8,089,137
1937     5.8      0.2     1.1 -0.4      7.7   .73   5.6  .818   4.6      6.3    .061   $7,999,285
1934     4.1      0.3     1.2 -0.4      6.1   .84   5.2  .837   4.3      5.1    .057   $6,444,945
1938     4.7      0.2     0.0 -0.9      6.0   .87   5.2  .822   4.3      4.9    .056   $6,177,655
1941     3.9      0.0     0.6 -0.6      5.1   .84   4.3  .861   3.7      4.4    .047   $4,943,059
1942     4.3      0.0     0.0 -0.5      4.9   .86   4.2  .866   3.7      4.3    .047   $4,823,552
1943     3.5      0.0     1.4 -0.5      5.5   .69   3.8  .831   3.2      4.6    .040   $4,358,988
1936     3.1      0.1     0.6 -0.4      4.3   .84   3.7  .813   3.0      3.5    .037   $3,431,159
1940     3.5      0.2    -0.2 -0.6      4.2   .79   3.3  .838   2.8      3.5    .034   $3,203,588
1935     3.6      0.0    -0.3 -0.4      3.9   .88   3.4  .843   2.9      3.3    .036   $3,158,139
1945     2.9      0.0     0.5 -0.5      3.9   .84   3.3  .833   2.7      3.3    .034   $2,977,297
1933     4.0      0.2    -1.3 -0.4      3.4   .83   2.8  .849   2.4      2.9    .029   $2,447,488
TOTAL    4.5      0.1     0.3 -0.5      5.4 10.66  58.0  .835  48.4      4.5    .627  $70,154,269
   75. KJOK Posted: February 09, 2007 at 06:34 AM (#2294584)
The baseball reference park factors, I believe are already multiple year factors - there isn't enough variance in them for this not to be true.

I think there's still an issue with the baseball-reference.com park factors being 3 year factors thru 1997, then switching to 1-year factors for 1998-2006.

The calculation page is here:

Baseball Reference Park Factor Calculation
   76. KJOK Posted: February 09, 2007 at 06:40 AM (#2294588)
I didn't know Smith played in Japan! Is there an available MLE?

Yes. HOM Reggie Smith Thread
   77. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 07:27 AM (#2294601)
Argh, I'm sorry I'm so messy with these, somehow I had improperly inputted Bob Johnson's plate appearances into the spreadsheet. Turns out he's actually well below Bonds and Wynn. I knew those numbers looked funny. The stuff in the actual spreadsheet I've posted on the Web has been triple-checked, but since I'm just putting these AL player-seasons together now I haven't had the chance to make sure I haven't made mistakes. Here he is again.

Bob Johnson, done right this time...

<pre>
Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1944     6.9      0.0    -0.1 -0.4      7.3   .96   7.0  .845   5.9      6.2    .082  $10,197,806
1939     5.6      0.3    -0.5 -0.5      6.1   .98   6.0  .813   4.9      5.0    .065   $7,100,719
1937     4.9      0.2     0.7 -0.4      6.4   .86   5.5  .818   4.5      5.2    .059   $6,776,422
1934     3.8      0.3     1.0 -0.4      5.6   .91   5.1  .837   4.3      4.7    .056   $5,941,892
1938     4.1      0.2    -0.2 -0.9      5.2   .99   5.2  .822   4.3      4.3    .056   $5,612,566
1941     3.3      0.0     0.3 -0.6      4.3   .97   4.2  .861   3.6      3.7    .046   $4,261,108
1942     3.8      0.0    -0.2 -0.5      4.2   .98   4.1  .866   3.6      3.7    .046   $4,228,396
1943     3.1      0.0     1.1 -0.5      4.8   .78   3.8  .831   3.1      4.0    .039   $3,920,868
1936     2.7      0.1     0.3 -0.4      3.7   .97   3.6  .813   2.9      3.0    .036   $2,992,086
1940     3.1      0.2    -0.4 -0.6      3.6   .90   3.2  .838   2.7      3.0    .034   $2,818,820
1935     3.2      0.0    -0.4 -0.4      3.3  1.00   3.3  .843   2.8      2.8    .035   $2,775,427
1945     2.5      0.0     0.3 -0.5      3.4   .93   3.2  .833   2.6      2.8    .032   $2,639,446
1933     3.5      0.2    -1.3 -0.4      2.8   .95   2.7  .849   2.3      2.4    .028   $2,066,036
TOTAL    3.9      0.1     0.0 -0.5      4.7 12.18  56.7  .835  47.4      3.9    .612  $61,331,593
</pre>
   78. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 07:46 AM (#2294605)
um....one. last. try. my apologies.

Bob Johnson

Year BWAA/Yr BRWAA/Yr FWAA/Yr  Rep WARP1/Yr SFrac WARP1 LgAdj WARP2 WARP2/Yr PennAdd       Salary
1944     6.9      0.0    -0.1 -0.4      7.3   .96   7.0  .845   5.9      6.2    .082  $10,197,806
1939     5.6      0.3    -0.5 -0.5      6.1   .98   6.0  .813   4.9      5.0    .065   $7,100,719
1937     4.9      0.2     0.7 -0.4      6.4   .86   5.5  .818   4.5      5.2    .059   $6,776,422
1934     3.8      0.3     1.0 -0.4      5.6   .91   5.1  .837   4.3      4.7    .056   $5,941,892
1938     4.1      0.2    -0.2 -0.9      5.2   .99   5.2  .822   4.3      4.3    .056   $5,612,566
1941     3.3      0.0     0.3 -0.6      4.3   .97   4.2  .861   3.6      3.7    .046   $4,261,108
1942     3.8      0.0    -0.2 -0.5      4.2   .98   4.1  .866   3.6      3.7    .046   $4,228,396
1943     3.1      0.0     1.1 -0.5      4.8   .78   3.8  .831   3.1      4.0    .039   $3,920,868
1936     2.7      0.1     0.3 -0.4      3.7   .97   3.6  .813   2.9      3.0    .036   $2,992,086
1940     3.1      0.2    -0.4 -0.6      3.6   .90   3.2  .838   2.7      3.0    .034   $2,818,820
1935     3.2      0.0    -0.4 -0.4      3.3  1.00   3.3  .843   2.8      2.8    .035   $2,775,427
1945     2.5      0.0     0.3 -0.5      3.4   .93   3.2  .833   2.6      2.8    .032   $2,639,446
1933     3.5      0.2    -1.3 -0.4      2.8   .95   2.7  .849   2.3      2.4    .028   $2,066,036
TOTAL    3.9      0.1     0.0 -0.5      4.7 12.18  56.7  .835  47.4      3.9    .612  $61,331,593
   79. EddieA Posted: February 09, 2007 at 08:53 AM (#2294617)
Look again at Bonds 2005. Pretty sure he didn't generate 10 WARP in 49 PA on one leg.
   80. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 03:08 PM (#2294670)
You're getting your columns confused. He generated 0.8 WARP (WARP1 and WARP2) in 2005. He was on *pace* for a 10-WARP season, largely because his one or two FRAA get projected out to 16 per full season.
   81. Mark Shirk (jsch) Posted: February 09, 2007 at 03:35 PM (#2294693)
Dan,

Did you say above that being in the DH league helped or hurt Reggie Smith? Also, while I understand that you are trying to factor in everything, I am not sure I would factor in the DH. Being in a DH league isn't something a player can control and I am not sure I would penalize/reward a player for that. However, I realize that, much like Joe's recommending that you use Pennants Added instead of the salary estimator, that comes down to personal taste and how I try to balance value vs. ability.
   82. Mark Shirk (jsch) Posted: February 09, 2007 at 03:39 PM (#2294696)
One more thing,

Would using an MLB-wide replacement level help to even out the dips in a league-wide replacement level? I realize that you don't yet have the data for an MLB-wide replacement level, but I was wondering if some players, hate to mention him again but Concepcion, aren't benefitting a little because of this.
   83. Dr. Chaleeko Posted: February 09, 2007 at 03:57 PM (#2294709)
I wonder if Concepcion is benefitting from the one-league effect. The AL wasn't exactly stuffed with great SS either. There's
-a few good Campy years
-the bad part of Yount's career
-a couple years of Toby Harrah's misadventure at the position
-Rick Burleson
-most of Roy Smalley's good years
-Alan Trammell's bad years.

I guess that's probably better than Concepcion's league, which included Larry Bowa, Don Kessinger, and the 9 dwarves.

But this raises the following questions for me:

1) Isn't it better to compare a guy to all MLB players at his position (adjusted for differing run contexts), since all MLB teams have a (theoretical) equality of opportunity in competing for amateur talent?

2) Isn't it better to compare a guy only to his own league since that more directly establishes the value of him to his own team?

3) Isn't it better to compare a guy to all MLB players at his position since all MLB teams compete for the same World Series?

4) Isn't it better to compare a guy only to players in his own league up until around 1975-1990, since the leagues had different identities, and players tended to remain in one league until free agency came around (due to the rule which forced teams to take waivers on any cross-league trade during any part of the season, not just after July 31st as it is today)?

5) Isn't this a maddening vortex of supposition?
   84. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 04:13 PM (#2294718)
Mark Shirk, being in the AL (for one season, 1973) neither hurts nor helps Smith. It just means that I have to correct for the DH to compare those seasons equally to all others--the same batting line is a worse OPS+ in the AL than the NL because the DH's pull up the league average and the pitchers pull it down. I was just noting that the reason why Smith's 1973 rep level is lower than that of his previous seasons is because of the introduction of the DH.

I will definitely use an MLB-wide rep level...once I finish the AL. That said, I really don't think anything will budge by more than 0.2 wins as a result, or 0.3 max. (to move by 0.3, the 27 worst regulars at a position in the AL would have to be on average over 6 runs better or worse per season than the 27 worst regulars at the same position in the NL, which would be pretty extreme). For what it's worth, I do have data for AL shortstops from 1960-2005 and the AL line almost exactly traces the NL one.
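The 0.3-win ceiling can be checked with a back-of-the-envelope sketch. It assumes roughly 10 runs per win (a common rule of thumb, not a figure stated in this thread) and that the MLB-wide level is a simple equal-weight average of the AL and NL levels; the function names are illustrative.

```python
RUNS_PER_WIN = 10.0  # assumption: rule-of-thumb conversion, not from the spreadsheet

def pooled_shift_in_wins(league_gap_runs, runs_per_win=RUNS_PER_WIN):
    """How far an equal-weight MLB average sits from the NL-only level.

    Averaging two leagues moves the level halfway toward the other league,
    so the shift is half the gap between them.
    """
    gap_wins = league_gap_runs / runs_per_win
    return gap_wins / 2.0

def gap_needed_in_runs(shift_wins, runs_per_win=RUNS_PER_WIN):
    """Inverse: league gap (in runs) required to move the pooled level."""
    return shift_wins * 2.0 * runs_per_win

print(pooled_shift_in_wins(6.0))  # a 6-run gap moves the pooled level ~0.3 wins
print(gap_needed_in_runs(0.2))    # ~4 runs needed for a 0.2-win move
```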
   85. DavidFoss Posted: February 09, 2007 at 04:57 PM (#2294763)
I guess that's probably better than Concepcion's league, which included Larry Bowa, Don Kessinger, and the 9 dwarves.

Chris Speier, Garry Templeton and a young no-hit-yet Ozzie were his big competition in the NL.

The AL had better shortstops (by WS anyways). Burleson had a three-year fielding peak that (by FWS) no one else in this era can match. Harrah may have been an embarrassment with the glove, but the guy could rake.
   86. DL from MN Posted: February 09, 2007 at 05:13 PM (#2294775)
> haven't docked him for wartime competition

> Johnson's career was equally valuable to Smith's, but in an era that was *much* easier to dominate as
> measured by standard deviations

I'd say your method automatically determines a war discount by accounting for the larger standard deviation in the weaker leagues. In that case I think it is essential to actually give the minor league credit when figuring out Bob Johnson. From your data I'd estimate another .035 to .070 pennants added, which gets him right back in the picture with Bobby Bonds and Jimmy Wynn. I believe the AL was also considered the "stronger" league in the time of Bob Johnson's career. I'm curious if comparing him to the bottom MLB player would make him look better or worse by this method.
   87. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 05:38 PM (#2294788)
DavidFoss--Burleson was amazing. My system thinks he's god--a high Hall of Merit peak, but just didn't play long enough to be a serious candidate. It loves Harrah's 1975 too. But remember, these guys aren't being compared to the average of their contemporaries, they're being compared to replacement level.

DL from MN, apparently not--take a look at the LgAdj numbers for Johnson, they're no different for the war years than for the 1930s. I do include a wartime dummy variable in my regression, but its effect is countered by the extremely low run scoring of WWII-era play, compared to the offensive bonanza of the 1930's AL. I don't know what the real actual stdev was for the AL during the war, but for the NL only 1944 had an exceedingly high stdev--1943 was actually lower than '41 and '42 (and the 1930's average), and '45 was slightly high but just the same as 1940. I can't stress enough that the stdev adjustment is *not* a competition or quality of play adjustment, whatever the late Stephen Jay Gould might have you believe to the contrary.
   88. Mark Shirk (jsch) Posted: February 09, 2007 at 06:45 PM (#2294840)
Dan beat me to it, but the discussion of DC's contemporaries should focus on the bottom feeders at his position not the average or better players, at least as far as Dan's analysis goes.

I find this very intriguing, but part of me is loath to accept that a weak bottom level of SS's (or a strong bottom level of 1B) should really affect how a player is viewed in a HOM context. As a GM or someone trying to critique salary disbursement, it is extremely valuable. But we are trying to figure out the best players of all time across eras, not necessarily within one. Again, this comes down to how one weighs value vs. ability. Most of us try and find a balance and Dan's work is all about value. Again, not saying it isn't useful, it is very useful and very helpful.

One more question,

You say that McCovey and Stargell are overrated because of the high replacement level of 1B during their careers; when does this replacement level begin to drop (assuming that it does)? Does the advent of the DH and about 12 more jobs for fringe 1B/corner OF affect 1B replacement level? If it did, I wouldn't exactly know what that means, I was just wondering if there is a connection between the two.
   89. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 07:29 PM (#2294873)
Mark Shirk, that couldn't be more true. I imagine the advent of the DH in the AL would only really begin to affect NL replacement levels in 1977, after free agency created much more player movement between leagues. As you can see on the graph of replacement levels, the gap between the 1B rep level and the average of the other three infield position rep levels averaged 2.3 wins from 1960-76 and just 1.9 wins from 1977-2005. That is just what you would expect: adding the DH lowers the replacement level for good-hit, no-field types, which makes immobile sluggers more valuable post-DH (in both leagues, after free agency) than they were beforehand.
   90. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 09, 2007 at 10:52 PM (#2295014)
As for the "ability vs. value" debate, I think there's a possibility that it's a red herring altogether. The question, it seems to me, is where a player is what I will call "anchored" on the distribution of talent. We know that positional replacement levels and averages change over time. After adjusting for quality of play and standard deviations, the question then becomes, what would a player be "anchored" to on the distribution if he were magically transported to another era: the league average, the positional average, or the positional replacement level?

Take a guy who is two wins above league average, three above positional average, and five above positional replacement. Now transfer him to a league where there are some huge superstars at his position and nothing else (like SS in the early 80s or late 90s AL). The positional average will have moved up slightly due to the superstars (let's say, to 0.5 wins below league average), while the replacement level will have dropped (to, say, 4 wins below league average, which is now 3.5 below positional average). Where will our player turn up? If he's "anchored" to the league average, two wins above league average is two wins above league average, and he will be 2.5 wins above positional average ("worse") and 6 wins above positional replacement ("better"). If he's "anchored" to the positional average, he'll remain three wins above positional average, so he "improves" relative to the league average (now 2.5 above) and the positional replacement level (now 6.5 above). And if he's "anchored" to the positional replacement level, he'll decline relative to the league average (just 1 above) and the positional average (1.5 above).
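
For anyone who wants to check the arithmetic in that example, here is a rough Python sketch of the three "anchoring" cases. The names (`League`, `transport`, the anchor labels) are made up for illustration and are not part of the spreadsheet or the WARP method itself; the only inputs are the numbers from the paragraph above, with everything expressed in wins relative to league average. Note that under the positional-average anchor the player sits 3 wins above a positional average that is itself 3.5 above replacement, so the sketch gives 6.5 wins above replacement for that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class League:
    # Both values are in wins relative to league average (hypothetical container).
    pos_avg: float  # positional average
    pos_rep: float  # positional replacement level


def transport(player_vs_lg: float, old: League, new: League, anchor: str):
    """Move a player from `old` to `new`, holding one margin fixed.

    Returns (wins above league average,
             wins above positional average,
             wins above positional replacement) in the new league.
    """
    if anchor == "league_average":
        new_vs_lg = player_vs_lg
    elif anchor == "positional_average":
        # Keep the player's margin over the positional average constant.
        new_vs_lg = new.pos_avg + (player_vs_lg - old.pos_avg)
    elif anchor == "positional_replacement":
        # Keep the player's margin over positional replacement constant.
        new_vs_lg = new.pos_rep + (player_vs_lg - old.pos_rep)
    else:
        raise ValueError(f"unknown anchor: {anchor}")
    return (new_vs_lg, new_vs_lg - new.pos_avg, new_vs_lg - new.pos_rep)


# Original league: player is +2 vs league, +3 vs position, +5 vs replacement.
old = League(pos_avg=-1.0, pos_rep=-3.0)
# New league: positional average at -0.5, replacement level at -4.
new = League(pos_avg=-0.5, pos_rep=-4.0)

results = {
    anchor: transport(2.0, old, new, anchor)
    for anchor in ("league_average", "positional_average", "positional_replacement")
}
```

The three tuples in `results` line up with the three "If he's anchored to..." cases, so you can swap in other positional averages and rep levels to see how far the answer moves under each assumption.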

We don't know what the right answer to this question is, and I don't know if we can. But the implicit hypothesis or assumption behind my work is that the latter is true, that players are "anchored" to their positional replacement level. I'm saying that if Dave Concepción, a six- or seven-WARP player at his peak, had played in the 1950's NL, with a much higher rep level for SS and a higher standard deviation, he would have looked a lot like Ernie Banks. Your mileage may vary.
   91. Alex meets the threshold for granular review Posted: February 09, 2007 at 10:58 PM (#2295017)
Dumb question but does the NL spreadsheet include all player seasons since a certain year or what?
   92. Mark Shirk (jsch) Posted: February 09, 2007 at 11:10 PM (#2295023)
I guess I am not sure why we have to anchor a player to anything
   93. BDC Posted: February 09, 2007 at 11:11 PM (#2295024)
if Dave Concepción, a six- or seven-WARP player at his peak, had played in the 1950's NL, with a much higher rep level for SS and a higher standard deviation, he would have looked a lot like Ernie Banks

My mileage does vary, Dan :) I think that Concepcion might have looked like Ernie Banks if Davey had played in Coors Field in 2000, soaking wet, with the wind blowing out and the ball juiced and the umpires on the take.

Otherwise, it's a real hard sell. The concept you are working with is like saying that the shortstop position in a given few years' span is like a "league"; the best player by far (above the bottom) in that "league" is then equal to the best player by far (above the bottom) in another era's "league" (shortstops of the 1950s, e.g.), so that Concepcion = Banks. Ultimately I can't see how that's much different from saying that the men's hoops MVP of the Little East conference should be as good an NBA draft prospect as the MVP of the Big East. Choosing a Hall of Whatever is different than judging the local market in shortstops in 1974. Essentially it's like "drafting" players onto a higher plane (and the vagaries of immediate supply don't much matter, because we have all eternity to draft them in). I think your work says an immense amount, don't get me wrong, but I am skeptical about its ability to compare players well for Hall purposes.
   94. Mark Shirk (jsch) Posted: February 09, 2007 at 11:17 PM (#2295028)
Or maybe I should say that I am not sure there is one right way to anchor a player to something. For instance, if you are a GM and are trying to figure out how much to pay Free Agent X or how much Player B is worth to your team, knowing the level of 'freely' available talent is very important. When trying to compare the greatest players ever, I don't see much reason to figure out how good the worst two or three at each position were. Or at least I am not sure if that is the best way to measure this.

The same is true for other debates. For instance, take park factors. There are generally two ways to adjust for park: one is to adjust for the overall run scoring environment and the other is to adjust for each component. If you are a GM, then you want to know if a certain player's style of play will be effective in your park, but when doing a HOM-like comparison I see no reason to downgrade someone like, say, Elston Howard or Mel Ott because they were able to take special advantage of their park over their careers.

My point is that different types of player analysis are useful for different functions. While the 'less useful' version (for lack of a better term) should still be a part of the debate, it may not be ideally suited for the task at hand.

And I hope that you do not feel that I am brushing aside your work here, Dan. That is not my aim.
   95. 'zop sympathizes with the wrong ####### people Posted: February 09, 2007 at 11:20 PM (#2295029)
I guess I am not sure why we have to anchor a player to anything

One reason is because "ease-of-excellence" at a position may be contextual. For example, I think that Dan's data strongly indicates that the SS drought of the 70's-80's (really, a SS-2B-3B drought for part of that period) was self-selected by MLB conventional wisdom; during that period larger shortstops who were capable of hitting at the levels seen in the 50's and 90's were moved off the position.

With modern hindsight, I can look at that era and judge, "these teams were screwing themselves with these tiny, good-field-no-hit shortstops." And I grant that's a possibility, in which case perhaps we should dock Concepcion for his inferior competition. But I'm also a big believer in the efficiency of the market, and I suspect that over 10 years if a big shortstop was SUCH a big advantage some wily GM or manager would have picked up on the opportunity and exploited it.

Therefore, we have to consider the possibility that there was something inherently difficult about SS defense during the no-hit replacement era that made it a more difficult defensive position compared to other eras, and therefore limited the offensive value of those who played it. Maybe it's much harder to play SS on turf if you're a big guy, and the widespread turf fields forced larger players to the OF and 3B. If there's a "reason" why SS replacement level dropped during Dave Concepcion's era, then Concepcion should be fully rewarded for playing SS as well as it could be played in the conditions of his time. As the HoM constitution says: All eras should be treated equally.
   96. Joey Numbaz (Scruff) Posted: February 09, 2007 at 11:33 PM (#2295035)
BTW - Dan, I have a spreadsheet that calculates Baseball-Reference-type park factors if you want it.

It could be used to come up with 3-year factors for 1998-2006, for example . . .

Just send me an email if you want it . . .
   97. Joey Numbaz (Scruff) Posted: February 09, 2007 at 11:38 PM (#2295037)
"Therefore, we have to consider the possibility that there was something inherently difficult about SS defense during the no-hit replacement era that made it a more difficult defensive position compared to other eras, and therefore limited the offensive value of those who played it. Maybe it's much harder to play SS on turf if you're a big guy, and the widespread turf fields forced larger players to the OF and 3B."


I think this is a very large part of it.

Turf and huge OFs are also a big reason why CF defense became much tougher with the cookie-cutter ballpark designs starting in the mid-1960s . . .
   98. KJOK Posted: February 09, 2007 at 11:42 PM (#2295042)
I too have such a spreadsheet, with data thru 2003 already populated....
   99. 'zop sympathizes with the wrong ####### people Posted: February 09, 2007 at 11:56 PM (#2295047)
I think this is a very large part of it.

Turf and huge OFs are also a big reason why CF defense became much tougher with the cookie-cutter ballpark designs starting in the mid-1960s . .


I think turf is a very good hypothesis. But I think the key point is that it's almost certainly something: Dan's replacement level time-series does not look stochastic.
   100. Joey Numbaz (Scruff) Posted: February 10, 2007 at 12:09 AM (#2295057)
Agreed Phil . . .