Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Monday, February 05, 2007

Dan Rosenheck’s WARP Data

WARP Methodology and Results

Thanks, Dan!

John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 02:59 PM | 620 comment(s)

Reader Comments and Retorts

   501. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 06:46 PM (#2607207)
This is a great example of where Gould's equation of higher league quality and lower standard deviation is just dead wrong. You are right that minority stars displaced white replacement players (in the NL, at least). A replacement player is usually what, about 2 wins below positional average? And a star is what, 3 or 4 wins above positional average? So the early days of integration actually *increased* standard deviations, by creating a "star glut" in the NL. Conversely, the relative dearth of stars in the AL around 1950 (the only megastar was Williams, maybe Doby) led to quite *low* standard deviations in the league around this time. This is, of course, why you can't use actual standard deviations, because you'd wind up punishing the NL'ers for integrating and rewarding the AL'ers for not integrating! My regression equation doesn't "know" where the stars were, of course, and so it projects the AL to have a significantly higher stdev than the NL throughout the 1950's, even though in fact the opposite was true. And the equation is right--the AL was easier to dominate, even though more players happened to dominate the NL.
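
A minimal sketch of the "star glut" point, assuming a made-up 16-player league. The -2 and +3.5 figures follow the rough replacement and star values quoted above; the roster mix is hypothetical and only meant to show the direction of the effect.

```python
from statistics import pstdev

# Hypothetical league, in wins relative to positional average.
# Four replacement-level regulars, ten average players, two stars.
before = [-2.0] * 4 + [0.0] * 10 + [3.5] * 2

# Integration swaps two of the replacement-level regulars for two stars.
after = [-2.0] * 2 + [0.0] * 10 + [3.5] * 4

sd_before = pstdev(before)
sd_after = pstdev(after)

# The observed spread goes up even though league quality improved.
print(f"stdev before: {sd_before:.2f}, after: {sd_after:.2f}")
```

With a different starting mix the size of the change would differ, which is part of why a projected standard deviation is used here rather than the observed one.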
   502. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 06:52 PM (#2607217)
To clarify, park factors are worth 3.5 wins to Campaneris, SB/CS are worth 4.5 (above and beyond the non-SB baserunning runs I mentioned earlier), and DP avoidance is worth 3.7 wins.
   503. DCW3 * Posted: November 06, 2007 at 06:53 PM (#2607219)
Wow! That's very surprising. Are RCAP park-adjusted? Do they include SB/CS? Do they include double play avoidance? All three are major benefits to Campaneris which I assumed were included. Also, by subtracting 4 years from McCovey and 2 from Campaneris, I'm giving Campaneris more below-average time than McCovey. DCW3, how would it look if you just did above-average seasons, ignoring all below-average ones?

The way I calculate them, RCAP are park-adjusted, but not era-adjusted: for instance, Campaneris had a higher RCAP in 1970 than 1968, but since the offensive context was higher in '70, '68 was a more valuable season. They include SB/CS, but not DP avoidance. If I subtract out all the below-average seasons, it puts McCovey at 431 and Campaneris at 156.
   504. zoperino,if youre not into the whole brevity thing Posted: November 06, 2007 at 08:22 PM (#2607320)
the relative dearth of stars in the AL around 1950 (the only megastar was Williams, maybe Doby)

Yogi Berra and Joe DiMaggio say hi.
   505. Mike Emeigh Posted: November 06, 2007 at 08:31 PM (#2607328)
Yogi Berra and Joe DiMaggio say hi.


DiMag was on his last legs by 1950. Berra hadn't yet established himself as a megastar, although he would within a couple of years.

-- MWE
   506. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 08:49 PM (#2607343)
Catchers as a group hit pretty well in the '50s, and because their rates are more closely bunched together than other positions AND they play fewer games, it's VERY difficult for a C to really move a leaguewide standard deviation (which requires an extremely high number of wins above positional average). When you correct for all the obstacles that catchers face, Berra is among the top 40 MLB position players of the pitchers' mound era, but in terms of dominating a league, he was certainly never putting up the type of 8-wins-above-positional-average seasons that someone like Cal Ripken did.

DiMaggio's last superstar year was '48. He played great in '49 but missed half the year, played at an All-Star but not superstar level in '50, and was done after '51.

Just checking the data, the time period I'm really talking about is 1951 to 1955, when integration really began in earnest and the AL lost its best player to Korea. The top 7 players by WARP1 (so not adjusted for standard deviation) during that stretch were Musial (NL), Snider (NL), Mantle (AL), Jackie Robinson (NL), Doby (AL), Ashburn (NL), and Hodges (NL).
   507. zoperino,if youre not into the whole brevity thing Posted: November 06, 2007 at 08:57 PM (#2607346)
played at an All-Star but not superstar level in '50

2nd best OPS+ in the league, in a highly unfavorable park, playing a mediocre but not terrible CF (-8 buggy FRAA).

I mean, if that's not a superstar, what is?
   508. AROM Posted: November 06, 2007 at 09:06 PM (#2607354)
I just posted this in case anyone wants to check catchers from 1957 on.

Statsite

Excluding 1999 of course, and 2007 until those are available. I compared each catcher to that year's league average, controlling for pitcher handedness.

I hear in this year's Hardball Times TangoTiger will up the ante and publish this data controlling for pitcher.
   509. sunnyday2 Posted: November 06, 2007 at 09:14 PM (#2607358)
1950

Joe DiMaggio 32-122-.301/.585 SA led the league (because Ted Williams was hurt)
Berra 28-124-.322 arguably his best year, if not a star then that only refers to notoriety not value
Doby 25-102-.326 (Doby and Berra both 25 yrs; to say Doby was a star and not Berra cannot be right)
Kell, Wertz, Hoot Evers all 100 RBI for Detroit
Vern Stephens 30-144-.295
Dropo and Doerr also 100 RBI; Dropo, Doerr, Stephens, Pesky and Dom D. all scored 100 R
Billy Goodman led the league at .364 with 91 R as utility man
Dom DiMaggio .328/131 R led the league because somebody had to
Al Rosen 37-116-.287 just getting warmed up, the HR led the league

Raschi, Lopat, Reynolds/Lemon, Wynn, Feller

PS. Berra 1949 20-91-.277, Doby 24-85-.280; Berra finished ahead of Doby in MVP voting in both 1949 and 1950

Not arguing the larger point, but there were some pretty good ballplayers in the AL in 1950.

NL--Ennis led the league with 126 RBI
Jackie, Hodges, Reese, Snider, Furillo, Campy, Newk, Roe all had good years yet the Dodgers couldn't win the pennant
Stanky OB via H or BB > 300 times yet couldn't lead the league in R, Earl Torgeson did
Musial led the league at .348 and .596
Kiner 47-118
Kluszewski, Pafko, Hank Sauer
Roberts, Simmons, Konstanty, Jansen, Maglie; Spahn, Sain and Bickford (not Rain in '50 that was '48); Blackwell

Dunno what this shows. Some journeymen "dominated" in the NL, too, maybe. The Phillies won with a pretty mediocre lineup compared to the Dodgers. But "more NL players dominated than in the AL," except that sounds so much like an oxymoron.... But wait, 11 hitters in each league had 100 RBI. But 3 NL pitchers won 20 games, only 2 did in the AL.
   510. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 09:21 PM (#2607363)
Well, I don't use component park factors. It was a very high-scoring league and DiMaggio's OBP wasn't top 10; adjusting for league-relative SLG-heaviness, his OPS+ "should" have been more like 146 (while Doby's OBP-heavy one "should" have been more like 159). He played only 139 games, and yeah, with below-average fielding. Moreover, it was an extremely easy-to-dominate league (basically because of the run scoring).
   511. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 09:25 PM (#2607364)
1950 was also Rizzuto's great year. As I said, I checked my data and the gap I was talking about was really 1951-55.
   512. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2007 at 11:34 PM (#2607433)
OK, this Campaneris-versus-McCovey thing was interesting enough that I've done an exhaustive comparison. Here is a rather daunting überchart of the two, with a new glossary:

SFrac: Percentage of the season played.
BTWAA: "Raw" batting wins above average.
DPWAA: Double play avoidance wins above average.
BRWAA: Baserunning wins above average.
FWAA: Fielding wins above average.
PAdj: Wins added/subtracted to correct for park effect.
TWAA: Total wins above average (BTWAA+DPWAA+BRWAA+FWAA+PAdj).
PAWAA: Wins above average an average player at the given position in the given league-season would have produced in a full year's worth of play.
WAP1: Wins above positional average (TWAA-(PAWAA*SFrac)).
LgAdj: Ratio of the regression-projected standard deviation of the given league-season to the 2005 standard deviation.
WAP2: Wins above positional average, adjusted for standard deviation. (WAP1*LgAdj)
A-Rp: Gap in standard deviation-adjusted wins between a positional average and a replacement player at the given position per full year's worth of play.
WARP2: Standard deviation-adjusted wins above replacement player (WAP2 + (A-Rp*SFrac)).

TOTL is career totals, TXBR is career totals excluding sub-replacement seasons, and TXBA is career totals excluding sub-positional-average seasons.
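
A minimal sketch of how the glossary formulas chain together, using Campaneris's 1968 line from the chart as input. The function and key names are made up for illustration, and because the published columns are rounded to one decimal, the recomputed values can differ from the chart by a tenth or two.

```python
def warp2_line(sfrac, btwaa, dpwaa, brwaa, fwaa, padj, pawaa, lgadj, a_rp):
    """Chain the glossary definitions for a single player-season."""
    # TWAA: sum of the five above-average components.
    twaa = btwaa + dpwaa + brwaa + fwaa + padj
    # WAP1: subtract what an average player at the position would have
    # produced in the same share of a season.
    wap1 = twaa - pawaa * sfrac
    # WAP2: scale by the league's standard deviation adjustment.
    wap2 = wap1 * lgadj
    # WARP2: add the average-to-replacement gap for the time played.
    warp2 = wap2 + a_rp * sfrac
    return {"TWAA": twaa, "WAP1": wap1, "WAP2": wap2, "WARP2": warp2}


# Campaneris 1968, inputs copied from the rounded chart values.
camp_1968 = warp2_line(
    sfrac=1.06, btwaa=1.8, dpwaa=0.3, brwaa=0.5, fwaa=0.7, padj=0.5,
    pawaa=-1.0, lgadj=1.003, a_rp=2.7,
)
for name, value in camp_1968.items():
    print(f"{name}: {value:+.2f}")
```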

Dagoberto Campaneris

Year  SFrac  BTWAA  DPWAA  BRWAA   FWAA   PAdj   TWAA  PAWAA   WAP1  LgAdj   WAP2   A-Rp  WARP2
1964   0.42   -0.3   +0.1   +0.2   -0.5   -0.1   -0.5   +0.9   -0.9  0.983   -0.9    4.2   +0.9
1965   0.94   +0.7   +0.3   +0.3   -1.4   +0.1   +0.1   -0.2   +0.2  0.977   +0.2    3.1   +3.1
1966   0.91   +0.0   +0.1   +0.8   -0.6   +0.3   +0.5   -0.3   +0.8  0.999   +0.8    3.2   +3.7
1967   0.97   -0.7   +0.3   +0.6   -0.8   +0.2   -0.3   -1.0   +0.7  0.985   +0.7    2.6   +3.1
1968   1.06   +1.8   +0.3   +0.5   +0.7   +0.5   +3.7   -1.0   +4.7  1.003   +4.8    2.7   +7.6
1969   0.86   -1.8   +0.2   +1.0   +0.1   +0.3   -0.2   -0.1   -0.1  0.948   -0.1    3.8   +3.2
1970   0.95   +1.4   +0.3   +0.5   +1.0   +0.4   +3.6   -0.8   +4.3  0.949   +4.1    3.1   +7.0
1971   0.91   -1.6   +0.4   +0.5   +0.3   +0.1   -0.3   -1.7   +1.3  0.962   +1.3    2.2   +3.3
1972   1.05   -1.5   +0.2   +0.9   +1.7   +0.4   +1.8   -1.4   +3.2  0.970   +3.1    2.4   +5.6
1973   0.97   -1.7   +0.1   +0.2   +1.8   +0.4   +0.9   -2.9   +3.7  0.947   +3.5    1.6   +5.0
1974   0.85   +0.6   +0.1   +0.2   +0.7   +0.5   +2.2   -1.7   +3.6  0.963   +3.5    2.8   +5.8
1975   0.84   -0.4   +0.1   +0.0   -0.2   +0.1   -0.3   -2.2   +1.5  0.943   +1.4    2.2   +3.3
1976   0.91   -0.5   +0.3   +0.4   +0.4   +0.3   +0.9   -1.2   +2.0  0.948   +1.9    3.1   +4.7
1977   0.89   -1.4   +0.4   -0.2   +1.7   -0.1   +0.5   -2.3   +2.5  0.907   +2.3    2.1   +4.2
1978   0.44   -2.4   +0.1   +0.3   -0.2   +0.0   -2.3   -1.9   -1.5  0.919   -1.3    2.4   -0.3
1979   0.40   -1.5   -0.1   -0.1   +0.4   +0.1   -1.3   -2.7   -0.2  0.913   -0.2    1.7   +0.5
1980   0.33   -0.8   +0.1   -0.1   -0.4   +0.0   -1.1   -2.0   -0.5  0.929   -0.4    2.3   +0.3
1981   0.20   -0.2   +0.2   +0.0   -0.7   +0.0   -0.7   +0.1   -0.7  0.950   -0.7    2.2   -0.2
1983   0.22   +0.0   +0.0   -0.4   -0.2   +0.1   -0.5   -0.3   -0.4  0.954   -0.4    2.3   +0.1
TOTL  14.12  -10.3   +3.6   +5.6   +4.1   +3.7   +6.6   -1.2   24.3  0.964   23.4    2.7   61.0
TXBR  13.49   -7.7   +3.3   +5.3   +4.9   +3.7   +9.6   -1.2   26.4  0.961   25.4    2.7   61.5
TXBA  11.25   -3.3   +3.0   +4.7   +5.5   +3.4   13.2   -1.4   28.5  0.961   27.4    2.6   56.5



Willie McCovey

Year  SFrac  BTWAA  DPWAA  BRWAA   FWAA   PAdj   TWAA  PAWAA   WAP1  LgAdj   WAP2   A-Rp  WARP2
1959   0.34   +2.8   -0.2   +0.0   +0.0   +0.1   +2.8   +2.3   +2.0  0.965   +1.9    2.5   +2.8
1960   0.47   +1.2   +0.1   -0.1   -0.4   +0.3   +1.2   +1.8   +0.3  0.955   +0.3    2.0   +1.3
1961   0.58   +1.5   +0.0   -0.2   +0.4   +0.2   +2.0   +1.8   +0.9  0.962   +0.9    2.0   +2.0
1962   0.38   +2.1   +0.0   -0.1   +0.1   +0.1   +2.2   +2.3   +1.4  0.900   +1.2    2.8   +2.3
1963   0.94   +5.5   +0.2   -0.1   -0.8   +0.1   +4.9   +2.4   +2.7  0.942   +2.5    3.0   +5.3
1964   0.65   +1.3   +0.0   -0.1   -0.9   -0.1   +0.3   +1.8   -0.9  0.930   -0.8    2.4   +0.7
1965   0.94   +5.4   +0.2   -0.4   -0.2   -0.3   +4.7   +2.0   +2.8  0.937   +2.7    2.2   +4.7
1966   0.87   +5.7   +0.2   -0.1   -0.5   -0.3   +5.0   +2.0   +3.2  0.950   +3.1    2.1   +4.9
1967   0.80   +4.7   +0.1   -0.2   -0.3   +0.1   +4.4   +1.2   +3.4  0.947   +3.2    1.4   +4.3
1968   0.91   +6.8   +0.3   -0.1   -0.7   +0.0   +6.4   +1.8   +4.8  0.973   +4.6    1.9   +6.3
1969   0.92   +8.7   +0.2   -0.2   -0.4   +0.3   +8.6   +2.6   +6.2  0.914   +5.7    2.5   +8.0
1970   0.93   +6.6   +0.3   -0.2   +0.0   +0.1   +6.8   +2.4   +4.6  0.919   +4.3    2.3   +6.4
1971   0.60   +3.0   +0.3   -0.3   -0.8   +0.1   +2.4   +2.5   +0.9  0.940   +0.8    2.4   +2.2
1972   0.47   +0.5   +0.1   -0.5   -0.4   +0.0   -0.4   +2.1   -1.4  0.950   -1.3    2.0   -0.4
1973   0.73   +4.7   +0.2   -0.3   -0.5   -0.4   +3.8   +1.5   +2.7  0.948   +2.6    1.4   +3.6
1974   0.65   +3.6   -0.1   -0.3   -0.6   +0.3   +3.0   +1.6   +1.9  0.932   +1.8    1.5   +2.8
1975   0.70   +1.8   +0.1   -0.3   -0.1   +0.4   +1.9   +1.7   +0.7  0.936   +0.7    1.6   +1.8
1976   0.37   -0.7   +0.1   +0.0   -0.1   +0.2   -0.5   +1.3   -0.9  0.929   -0.9    1.4   -0.4
1977   0.80   +2.7   -0.4   -0.4   -0.7   +0.0   +1.2   +1.8   -0.2  0.972   -0.2    2.0   +1.4
1978   0.58   -0.2   -0.1   -0.3   -0.2   +0.2   -0.7   +1.4   -1.5  0.988   -1.5    1.8   -0.5
1979   0.58   +0.3   -0.1   -0.1   -0.5   +0.4   -0.1   +1.5   -0.9  0.981   -0.9    1.8   +0.2
1980   0.19   -0.5   +0.0   +0.0   -0.1   +0.0   -0.5   +1.5   -0.8  0.985   -0.8    1.9   -0.4
TOTL  14.39   67.6   +1.5   -4.0   -7.6   +1.9   59.5   +1.9   32.0  0.935   29.9    2.0   59.4
TXBR  12.78   68.6   +1.5   -3.3   -6.7   +1.5   61.5   +1.9   36.7  0.938   34.4    2.1   61.0
TXBA  10.75   64.3   +1.9   -2.7   -4.7   +1.3   60.1   +2.0   38.7  0.939   36.4    2.1   58.8


Lots of interesting stuff to see here. McCovey's "raw" hitting is literally 76 wins better than Campaneris's, but Dagoberto chips away at that advantage with better double play avoidance (2 wins' difference), baserunning (8.5 wins), fielding (11.5 wins), and more pitcher-friendly parks (2 wins), reducing McCovey's advantage to 52 wins.
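
The figures in that paragraph appear to line up with the TXBR rows (career totals excluding sub-replacement seasons). A quick sketch of the arithmetic, with the dictionary names made up for illustration and the values copied from the chart:

```python
COMPONENTS = ["BTWAA", "DPWAA", "BRWAA", "FWAA", "PAdj", "TWAA"]

# TXBR rows copied from the chart above.
mccovey_txbr = {"BTWAA": 68.6, "DPWAA": 1.5, "BRWAA": -3.3,
                "FWAA": -6.7, "PAdj": 1.5, "TWAA": 61.5}
campaneris_txbr = {"BTWAA": -7.7, "DPWAA": 3.3, "BRWAA": 5.3,
                   "FWAA": 4.9, "PAdj": 3.7, "TWAA": 9.6}

# Positive numbers favor McCovey, negative numbers favor Campaneris.
gaps = {c: round(mccovey_txbr[c] - campaneris_txbr[c], 1) for c in COMPONENTS}

for component, gap in gaps.items():
    print(f"{component}: {gap:+.1f}")
```

The component gaps sum to about 52.1, versus 51.9 from the TWAA column directly; the difference is rounding in the published tenths.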

Then we have to account for the fact that Campaneris played shortstop and McCovey played first base. We can do this either by comparing to positional average or to replacement. Compared to positional average, if we ignore all below-average seasons, McCovey still exceeded the average 1B of his day by more than Campaneris exceeded the average SS of his: 38.7 wins above positional average for McCovey, 28.5 for Campaneris. Pretty big difference. McCovey played in slightly higher standard deviation leagues than Campaneris did; adjusting for that knocks 2.3 wins off McCovey and 1.1 off of Campaneris, which still leaves McCovey with a 36.4-27.4 advantage. I stand corrected: I thought Campaneris and McCovey would be comparable relative to positional average; they are not.

So why do I have Campaneris slightly higher? Because I see the gap between positional average and replacement as larger at SS (2.7 wins per season) than at 1B (2.1 wins per season) when they played. Remember that my replacement levels are calculated using the worst 3/8 of regulars after adjusting for the leaguewide standard deviation, so the only three things that can account for different-sized gaps between positional average and replacement are:

1. The gap between the worst-regulars average and Nate Silver's FAT levels for the 1985-2005 period. But this actually favors 1B--I subtract 0.3 wins from the worst-regulars average to get replacement level for SS, and 0.5 wins for 1B.

2. The standard deviation of performance *within* the position, which I intentionally do not correct for. Shortstop is, as Nate Silver puts it, a "feast or famine" position--you tend to have a few superstars who are just extraordinary athletes, and then a lot of guys who are really overmatched. By contrast, at 1B, you can more or less play it if you can walk, so while SS are separated by their hitting AND their fielding, 1B are separated basically by their hitting alone. This means that 1B are likely to be bunched much more closely together around positional average, while SS are likely to be spread out much further from positional average.

3. Kurtosis. It could be the case that even keeping standard deviation constant, SS were high-kurtosis in this period ("fat tails" and then a tight cluster in the middle), while 1B were low-kurtosis ("shoulders" near the center of the distribution and very few outliers). This isn't necessarily characteristic of the two positions over time, but it might be true in the 60s and 70s.
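
A rough sketch of how points 1 and 2 interact, under loud assumptions: the eight-player samples are invented, "positional average" is taken as the plain mean of the regulars, and the function name and the rounding rule for "worst 3/8" are illustrative choices, not the actual method. Only the 3/8 fraction and the 0.3 and 0.5 FAT offsets come from the text above.

```python
from statistics import mean


def avg_minus_replacement(regulars, fat_offset, worst_fraction=3 / 8):
    """Gap between positional average and an estimated replacement level."""
    ordered = sorted(regulars)
    n_worst = max(1, round(len(ordered) * worst_fraction))
    worst_avg = mean(ordered[:n_worst])
    # Replacement level sits below the worst-regulars average by the offset.
    replacement = worst_avg - fat_offset
    return mean(regulars) - replacement


# Hypothetical wins above positional average for eight regulars.
bunched_1b = [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5]
spread_ss = [-3.0, -2.0, -1.0, 0.0, 0.0, 1.0, 2.0, 3.0]

gap_1b = avg_minus_replacement(bunched_1b, fat_offset=0.5)
gap_ss = avg_minus_replacement(spread_ss, fat_offset=0.3)
print(f"1B gap: {gap_1b:.1f}, SS gap: {gap_ss:.1f}")
```

In this toy case the wider spread at SS outweighs the smaller FAT offset, which is the direction of the argument in point 2.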

So there you have it. Compared to the average player at their positions, McCovey was indubitably better; compared to the freely available talent level, they were just about equal.

There are other factors I didn't mention, but the two that leap to mind probably even out--better in-season durability for Campaneris, tougher league for McCovey.
   513. Dizzypaco Posted: November 07, 2007 at 09:35 AM (#2607682)
By the way, Dan's original question was based on the premise that win shares has some obvious mistakes, such as rating Staub over Vaughn. This just isn't true. Staub had 358 win shares in 2951 games, while Vaughn had 356 win shares in 1817 games, which is obviously a more impressive accomplishment. Win shares was never designed so that you just make a list of career win shares, and that's the order of quality. Part of the reason for the misunderstanding is that win shares makes no attempt to measure replacement value, but we've discussed this before.
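
For a sense of scale, here is a small sketch that turns those two career lines into a per-162-game rate. The function name is just for illustration; the win share and game totals are the ones quoted above.

```python
def ws_per_162(win_shares, games):
    """Win shares per 162 games played."""
    return win_shares / games * 162


staub = ws_per_162(358, 2951)
vaughn = ws_per_162(356, 1817)
print(f"Staub: {staub:.1f} WS/162, Vaughn: {vaughn:.1f} WS/162")
```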
   514. John (You Can Call Me Grandma) Murphy Posted: November 07, 2007 at 09:40 AM (#2607688)
Diz, you're absolutely correct about Win Shares. I have been stating for years now that you just can't rate players without using WS/162 in conjunction with straight WS numbers. Heck, Bill James knew this in the NBJHA and responded to it in kind.
   515. sunnyday2 Posted: November 07, 2007 at 09:43 AM (#2607691)
>Win shares was never designed so that you just make a list of career win shares,

Which is pretty obvious if you just look at Bill James' rankings. They're not even close to following the career WS totals.

I would agree with whoever said/wherever it was said that the big contribution of this project to world peace is in adjusting WS to 162 games (fairness to 19C players), WWII (and WWI and Korea) credit, and of course NeL MLEs. MiL MLEs are a much lesser matter in part because, as Chris Cobb showed in another thread, we haven't really elected anybody because of them (OK, some might argue Charley Keller, I don't know). Not only that but MiL MLEs were out there though used for a different purpose.

I still hope somebody writes a book out of all of this, though the logistical issues are extreme--e.g. who owns all of the work that's been posted here? Is it public domain? Could Chris' and Doc's MLEs be published w/o their permission, or would they give their permission? What about Dan's work? And who has the time to write it, not to mention the skills?

But all of those questions aside, the adjustments to WS (and to WARP) that have been done here are really cutting edge stuff. Maybe they already have a wider audience on the Web than they would ever get in print anyway. But even so, you'll agree that navigating all of it on this blog is a challenge.
   516. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 07, 2007 at 09:51 AM (#2607698)
Win shares was never designed so that you just make a list of career win shares, and that's the order of quality.


I guess I'm not sure what it *was* designed for, then. It seems like the response is "showing players' proportional contributions to wins," but I don't see why WS accomplishes that any more than WARP or my system does--in fact, I think it doesn't accomplish that goal as well as either WARP approach. If the whole thing is just a gimmick so that player wins add up to team wins, then all you have to do is just take 40.5 wins and allocate them according to PA, another 40.5 and allocate them according to IP and defensive innings played, and add on wins above average and you've accomplished the same thing without creating any of the distortions that WS introduces. Why go to the trouble of making all these "cutting-edge adjustments" to WS when you can just get it right in the first place?
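
A minimal sketch of that allocation, with heavy assumptions: the post doesn't say how IP and defensive innings would be combined, so a single hypothetical "run_prevention_time" number per player stands in for both, and the three-man "team" and its figures are invented just to check that the totals add up.

```python
POOL = 40.5  # half of an 81-win average season


def allocate_wins(players):
    """Split two 40.5-win pools by playing time, then add wins above average."""
    total_pa = sum(p["pa"] for p in players)
    total_rp = sum(p["run_prevention_time"] for p in players)
    wins = {}
    for p in players:
        batting_share = POOL * p["pa"] / total_pa
        prevention_share = POOL * p["run_prevention_time"] / total_rp
        wins[p["name"]] = batting_share + prevention_share + p["waa"]
    return wins


# Hypothetical players; every number here is made up for illustration.
team = [
    {"name": "Regular A", "pa": 600, "run_prevention_time": 1300, "waa": 2.0},
    {"name": "Regular B", "pa": 400, "run_prevention_time": 900, "waa": -1.0},
    {"name": "Pitcher C", "pa": 0, "run_prevention_time": 200, "waa": 1.5},
]

wins = allocate_wins(team)
team_total = sum(wins.values())
print(f"team total: {team_total:.1f}")
```

By construction the player totals sum to 81 plus the team's wins above average, so player wins add up to team wins to the extent that the WAA figures do.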

Maybe this should be moved to the uberstats thread.
   517. John (You Can Call Me Grandma) Murphy Posted: November 07, 2007 at 10:36 AM (#2607752)
(OK, some might argue Charley Keller, I don't know)

I would argue Averill, too.
   518. Chris Cobb Posted: November 07, 2007 at 10:56 AM (#2607802)
Here's a potpourri of responses.

First to John Murphy:

I would argue Averill, too.

I didn't list Averill (in the list on the 2007 ballot discussion thread) because that list was looking at players that we elected that the HoF hasn't elected, and the reasons for the difference. Since the HoF elected Averill, he didn't show up. I would agree that minor-league credit helped his case with us, as it did Charley Keller's.


Next to Sunnyday2:

For the record, if someone wanted to do the work of putting together the book, I would happily give permission for my MLEs to be used, for whatever they are worth, if my permission were needed.

I think members of the electorate have the necessary skill to write the book, though probably not the time.


Finally, to Danr:

The main sabermetric advancement that James was pursuing in win shares, I think, was a satisfactory way of measuring pitching, batting, and fielding values together in terms of wins. His definition of "satisfactory" had three criteria, I think:

(1) it should include a good measure of fielding value [since that is the basic piece that was lacking];

(2) it should not be built on "value above average," since in James's view using average as a baseline creates false impressions about the nature of player value (not that his choosing a zero point rather than seeking replacement level doesn't run into the same problem); and

(3) it should be tied in some meaningful way to actual runs and actual wins (his philosophy here is similar to his philosophy with his runs created formulas, which are always brought back to actual runs scored). The choice of having the sum of a team's win shares equal three times team wins is a gimmick, of course, but some defined way of making win shares correspond to actual wins is not a gimmick but a philosophical commitment about how to best represent value.

You could compare my remarks here to what James says in the introductory chapters of _Win Shares_ to see how well I have represented his own claims in the matter.
   519. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 07, 2007 at 11:22 AM (#2607837)
Pete Palmer had certainly done 1, if I'm not mistaken. As for 2, various concepts of replacement level had been around forever--was it not James himself who first advanced them? If I'm right, then I don't see why the father of replacement level would then abandon it for the unrealistic 52% baseline level. 3 is indeed a novelty, one whose merit is widely disputed.
   520. sunnyday2 Posted: November 07, 2007 at 11:41 AM (#2607872)
James talks about Pete Palmer's version of 1, and Pete's was based on "estimated innings," which run into huge, huge problems on the margins, i.e., if a guy was routinely removed for a defensive replacement or there was some other reason why the estimate was off. I remember there was one year when Heinie Groh was the Reds' 3B and played probably 140-145 games. The guy who replaced him on his off days got more fielding value than Heinie did. There was definite weirdness in Pete's methodology.

As for 3, I am reminded of dark matter. Just because we can't measure it and don't know what it is doesn't mean it's not out there. Likewise, the "dumb luck" that accounts for the divergences between RC formulae and actual runs and between Pythag wins and actual wins. Just because we don't know what caused it doesn't make it meaningless. I don't think it's off the rails to incorporate "it" even if we don't know what "it" is.
   521. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 07, 2007 at 11:50 AM (#2607883)
Sure, but Fielding Win Shares is a mess, too. And of course we just have no idea about FRAA. I personally am a big fan of DRA (.76 correlation to PBP metrics), but as of now it's only publicly available for shortstops.

I never said it was "off the rails." I said it was "widely disputed." I definitely consider the question of whether to include run estimation and Pythagorean errors in your evaluation as normative, not positive--there's no one right answer. Certainly there was something going on with those 1890s Braves...I tend to think Tom Glavine-style "clutch pitching" from the stretch may often be a major contributing factor that deserves to be credited. Or fielders deciding whether to take a risky dive or not based on the leverage of the situation...on the offensive side, it seems much more likely to me that it's all luck.
   522. EricC Posted: November 07, 2007 at 08:46 PM (#2608733)
I have been stating for years now that you just can't rate players without using WS/162 in conjunction with straight WS numbers.

Agreed that performance rates need to be considered, as well as career totals, but note that WS/162 as a rate stat can be misleading. For example, for a player with a significant number of appearances as a pinch hitter, such as Enos Slaughter, WS/162 underestimates their rate of performance. WS per plate appearance is better (but ought to be adjusted relative to the OBP of the era in question).
   523. Eric Chalek (Dr. Chaleeko) Posted: November 07, 2007 at 10:06 PM (#2608773)
I'm not big on WS/PA. I like WS/out better, myself. The short-form WS formula includes outs and it makes sense that way. Then you can say things like he earned x WS for every game's worth of outs he made, which gives you something like the WS/LS relationship.
   524. El Hombre 4 MVP (Le Samourai) Posted: November 07, 2007 at 11:35 PM (#2608850)
Can somebody rehost the spreadsheets?
   525. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 08, 2007 at 03:00 AM (#2608995)
Which spreadsheets? The ones with my data are available in the Hall of Merit Yahoo group.
   526. Joe Dimino Posted: November 28, 2007 at 10:55 AM (#2626976)
Test
   527. Wally Moses, Isolated Power Broker (GGC) Posted: December 05, 2007 at 04:07 PM (#2635851)
I just posted this in case anyone wants to check catchers from 1957 on.

Statsite

Excluding 1999 of course, and 2007 until those are available. I compared each catcher to that year's league average, controlling for pitcher handedness.

I hear in this year's Hardball Times TangoTiger will up the ante and publish this data controlling for pitcher.


Cool, Rallymonkey!
   528. KJOK Posted: December 07, 2007 at 01:51 AM (#2637702)
So where is this catcher data located?
   529. David Concepcion de la Desviacion Estandar (Dan R) Posted: January 19, 2008 at 10:59 AM (#2671742)
Good news for fans of my WARP: BRWAA and FWAA are in line for a major makeover. Dan Fox of Baseball Prospectus has been kind enough to share his complete baserunning data, which is context-adjusted (stealing third with 2 out is not worth as much as stealing third with 1 out) going all the way back to 1956, and Michael Humphreys is in the process of sending me complete DRA since 1893 (I have 2B, 3B, SS, and CF already). I don't plan on doing another full version of my WARP until I am ready for the "Holy Grail" of an integrated system with a variable pitching/fielding split (the stdev of deadball-era DRA is enormous), but I'm happy to provide the +/- numbers on fielding or baserunning for any specific player-seasons of interest to the group. Dr. Chaleeko, since you've been asking me about Dick Allen for eons, he was 5 runs below average for his career.
   530. AROM Posted: January 19, 2008 at 10:45 PM (#2672187)
Dan, when I click on the link at the start of this article, I get a page not available error. Has it been moved? I'd like to check out the big file.
   531. David Concepcion de la Desviacion Estandar (Dan R) Posted: January 19, 2008 at 11:01 PM (#2672200)
It's uploaded to the Hall of Merit Yahoo group.
   532. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 24, 2008 at 10:32 PM (#2872114)
I've been fortunate enough to get my hands on seasonal Win Probability Added data for every player-season since 1974. I've tested it against my batting + baserunning wins above average, and here are the relevant conclusions:

1. I was always under the impression that while clutch ability did not exist, clutch performance did--in other words, that there were substantial discrepancies each year between a player's actual value to his team and the value you would expect from his statistics, but that the cause and distribution of those discrepancies was totally random. It turns out that a player's offensive statistics are a *damn* good predictor of his WPA, far better than I would have thought. Just multiplying batting + baserunning wins above average for any given player-season (as measured by my WARP) by .85 gets you a sweet 90% r-squared on the WPA for that player-season. So the first thing to say is that just using a run estimator and Pythagoras gets you pretty damn close to where you want to go.

2. The one thing that *leaps* out to me about the players whose career WPA most exceeds what we would expect from their offensive statistics is that the leaders are all Rockies. Something is seriously wrong in baseball-reference's park factor calculations, because it is dinging those Colorado hitters (Helton, Walker, Galarraga, Castilla, Bichette) far more than WPA thinks is appropriate. My guess is that the Rockies' hitters learn to take advantage of the park, and that that gives them a bigger-than-average home field advantage, a dynamic which the standard park factor calculation (which assumes that home and away teams benefit equally from park effects) cannot take into account.

3. For HoM purposes, here are the players who showed the greatest and smallest gaps between their BTWAA + BRWAA and their WPA from 1974 to 2005:

Leaders

1. Todd Helton, +10.7 wins (1.38 per season)
2. Larry Walker, +10.4 (.94)
3. Andrés Galarraga, +9.2 (.74)
4. Chipper Jones, +9.0 (.86)
5. Dale Murphy, +8.0 (.60)--this will likely get him on my 2009 ballot
6. Vladimir Guerrero, +7.9 (1.00)--I suspect this is because he hits everything the same, be it a 100mph closer fastball or a junkball
7. Vinny Castilla, +7.8 (.77)
8. Andre Dawson, +6.8 (.43)
9. Shawn Green, +6.7 (.67)
10. Fred McGriff, +6.6 (.45)--helps an otherwise weak case
11. Andruw Jones, +6.3 (.74)
12. Graig Nettles, +6.1 (.57)--very relevant for 3B ranking purposes
13. Ryne Sandberg, +5.9 (.43)--would have been good to know when we were voting on 2B
14. Álex Rodríguez, +5.5 (.57)--and they call him a choker!
15. Jeromy Burnitz, +5.5 (.69)
16. Luis González, +5.3 (.40)--same comment as McGriff
17. Terry Pendleton, +5.2 (.50)
18. Marquis Grissom, +5.1 (.41)
19. Gary Gaetti, +5.1 (.36)
20. Mike Piazza, +4.9 (.49)
21. Darin Erstad, +4.7 (.63)--it's true!
22. Brian Giles, +4.7 (.58)--in my PHoM
23. Jim Rice, +4.6 (.34)--takes a tiny bit of sting off his likely HoF induction
24. George Brett, +4.5 (.26)

Everyone else is below +4.5. Notables by rate include Torii Hunter (+.75), Bip Roberts (+.71), Lance Berkman (+.7), José Vidro (+.65), Bob Horner (+.59), Ron LeFlore (+.56), and Lonnie Smith (+.56)

Trailers

1. Travis Fryman, -6.4 (-.61)
2. Don Baylor, -5.7 (-.46)
3. Mickey Tettleton, -5.6 (-.80)
4. Omar Vizquel, -5.4 (-.41)
5. Jeff Conine, -4.8 (-.48)
6. Larry Bowa, -4.6 (-.47)
7. Ken Griffey, Sr., -4.4 (-.42)
8. Ruben Sierra, -4.3 (-.37)
9. Rob Deer, -4.3 (-.69)
10. Jim Gantner, -4.2 (-.49)
11. Luis Alicea, -4.1 (-.87)
12. Tim Raines, -4.1 (-.28)
13. Don Mattingly, -4.0 (-.36)
14. Bob Boone, -3.9 (-.41)
15. Ken Caminiti, -3.9 (-.41)
16. Keith Hernandez, -3.8 (-.32)--this might be enough to drop him out of my PHoM
17. Mark McLemore, -3.8 (-.41)
18. Alan Trammell, -3.7 (-.28)--this would have dropped him a nudge on my SS rankings
19. Mike Bordick, -3.7 (-.42)
20. Raúl Mondesi, -3.6 (-.40)
21. Bill Mueller, -3.5 (-.57)

Everyone else is above -3.5. Notables by rate include Marty Barrett (-.66), Ichiro Suzuki (-.61; I suspect this is because no one is on base for his singles and because a lot of them are infield singles, so they're not much better than walks), Scott Brosius (-.56), Darren Daulton (-.55), Mike Pagliarulo (-.55), Craig Reynolds (-.54), Gene Richards (-.49), Ron Oester (-.49), Rich Aurilia (-.48), Jim Eisenreich (-.48), Howard Johnson (-.48), Tony Bernazard (-.47), Fernando Viña (-.47), and Jorge Posada (-.46; still in my PHoM).


If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.
   533. user Posted: July 25, 2008 at 08:58 AM (#2872534)
Could I ask as to the source of the data? There are a few discrepancies with what I've previously seen.

Just multiplying batting + baserunning wins above average for any given player-season (as measured by my WARP) by .85 gets you a sweet 90% r-squared on the WPA for that player-season.


Why is the multiplying by .85 necessary?
   534. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 25, 2008 at 10:00 AM (#2872615)
It's FanGraphs data.

The .85 is necessary because that's how regression works--you get a closer fit to WPA if you multiply by .85 than if you don't. Basically, what that means is that 90% of WPA is accounted for by offensive statistics, and the remaining 10% is the timing of the events. So to predict WPA, you reduce the weight of the offensive stats from 100% to 90% (the other 5% is presumably because WPA must have a slightly smaller standard deviation or something than BWAA + BRWAA), and "fill in" the rest with league-average timing (that is to say, 0). Does that make sense?
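For anyone who wants to see the mechanics, here is a minimal sketch of that kind of fit. Everything in it is an assumption for illustration: the player-seasons are synthetic, the noise level is chosen so the fit lands near the 90% figure quoted above, and the regression is forced through the origin (the post does not say whether an intercept was fit). `fit_through_origin` is a hypothetical helper name, not anything from FanGraphs or the WARP spreadsheets.

```python
import numpy as np


def fit_through_origin(x, y):
    """Least-squares slope for y ~ b * x (no intercept), plus r-squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = float(x @ y / (x @ x))
    residual = y - slope * x
    total = y - y.mean()
    r_squared = 1.0 - float(residual @ residual) / float(total @ total)
    return slope, r_squared


# Synthetic player-seasons (NOT real data): offensive wins above average,
# plus a "timing" component that is independent of the offensive stats.
rng = np.random.default_rng(0)
offense_waa = rng.normal(0.0, 2.0, size=5000)
timing = rng.normal(0.0, 0.57, size=5000)
wpa = 0.85 * offense_waa + timing

slope, r2 = fit_through_origin(offense_waa, wpa)
print(f"slope = {slope:.2f}, r-squared = {r2:.2f}")
```

The slope is the multiplier that best maps the offensive stats onto WPA, and r-squared is the share of WPA variance those stats account for; the unexplained remainder is what the post attributes to timing.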
   535. Bleed the Freak Posted: July 25, 2008 at 10:57 AM (#2872738)
532. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 24, 2008 at 10:32 PM (#2872114)
If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.


This may only be by coincidence, but the positive WPA leaders almost all play somewhere other than C, 2B, or SS, and those who do play there, Sandberg, A-Rod, and Piazza, are the best sluggers at those positions ever. The positive list is composed almost entirely of sluggers with multiple 30-HR seasons AND 300+ career HR, except Erstad, Pendleton, and Grissom. Also, 12 of the top 23 are OF.

On the negative side, roughly half are middle infielders, and almost all showed limited power: only Baylor and Sierra have 300 career HR, and a couple who had power at their peak, HoJo and Mattingly, fizzled quickly.

Don't know if this is any help, but it's an observation nonetheless.
   536. Bleed the Freak Posted: July 25, 2008 at 11:16 AM (#2872786)
Dan, with the boost to some of these modern players, Todd Helton, Dale Murphy, Andre Dawson, Shawn Green, Fred McGriff, Andruw Jones, Graig Nettles, Luis Gonzalez, and Brian Giles in particular, how will these men fare in your all-time rankings?

It appears that borderliners Dawson, Nettles, and Giles would be stronger PHOM cases.

Murphy may vault into the Freehan/Leach area, borderline PHOM.

Do any of the others have a shot at reaching your ballot, and/or are they among the top 200 post-1893 non-Negro League position players?
   537. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 25, 2008 at 11:26 AM (#2872797)
Bleed the Freak, that's certainly interesting. Here's a guess: the big hitters were more likely to come up with men on base, and therefore their hits were more likely to drive runners in and have a highly positive WPA, whereas the middle infielders tend to be clustered at the bottom of the lineup where they have very few RBI opportunities. I am actually using WPA/LI, which theoretically controls for context, but still...

I haven't yet decided whether I will adjust my rankings to reflect WPA's findings (besides the Rockies, who will definitely get an upward adjustment). I need to do some tests to see whether there is more variation here than we would expect merely from chance--if there isn't, I'd be disinclined to give credit for it.
   538. DL from MN Posted: July 25, 2008 at 11:27 AM (#2872803)
Could we be measuring an error in WPA when it comes to the Rockies? Isn't it using a generic run probability? Would adjusting the WPA run scoring odds to account for scoring in Colorado make the difference?
   539. user Posted: July 25, 2008 at 11:31 AM (#2872810)
It's FanGraphs data.


In which case:

Jim Rice: BPRO BRAA: 290 (I assume this is at least in the same vicinity as your BWAA)
Fangraphs WPA/LI: +29.05
Fangraphs BRAA (+ve Run expectancy): +247
Fangraphs WPA: +22.65
Clutch:-7.07
Rice gets progressively worse as you go from context-neutral measures towards WPA, whilst you have him on your leaderboard!

Baseball Between the Numbers also has Rice as being -6.67 wins over his career.
   540. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 25, 2008 at 11:49 AM (#2872830)
User, good catch. Cross-posting, I was using WPA/LI for this, rather than straight WPA, which I thought effectively neutralized clutch *opportunities* by giving every player a LI of 1.00 (e.g. if a player had a pLI of 1.1, WPA/LI would just be his WPA divided by 1.1). But that's clearly wrong--upon further investigation, I think it gives each individual *plate appearance* a LI of 1.00, rather than just the whole season average, and the difference between the two is the "Clutch" score they report, which represents the timing of the hits. I guess that explains why WPA/LI hugs BWAA + BRWAA so tightly.

The above list is still interesting--it shows hitters whose production was, on a context-neutral basis, still worth more or less than their stats would predict--but it's not measuring what I thought it was measuring. I'll do the same study using raw WPA rather than WPA/LI and report back soon. Sorry for the mix-up, I'm new to this stuff!!
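To make the distinction concrete, here is a small sketch of the two ways of "dividing by LI" described above. It rests on assumptions: the per-plate-appearance numbers are invented, pLI is taken as a plain average of LI over plate appearances, and "Clutch" is computed as WPA/pLI minus WPA/LI, which is my understanding of the FanGraphs definition rather than something confirmed in this thread. `wpa_summary` is a hypothetical helper name.

```python
def wpa_summary(plate_appearances):
    """Summarise a list of (wpa, li) pairs, one pair per plate appearance."""
    total_wpa = sum(wpa for wpa, _ in plate_appearances)
    # Average leverage faced (assumed here to be a simple mean over PAs).
    p_li = sum(li for _, li in plate_appearances) / len(plate_appearances)
    # What I first thought WPA/LI was: the total divided by the average LI.
    season_scaled = total_wpa / p_li
    # What it appears to be: each plate appearance rescaled to LI = 1.00.
    wpa_li = sum(wpa / li for wpa, li in plate_appearances)
    return {
        "wpa": total_wpa,
        "pLI": p_li,
        "wpa_over_pLI": season_scaled,
        "wpa_li": wpa_li,
        "clutch": season_scaled - wpa_li,  # assumed definition, see above
    }


# Invented example: a big hit in a high-leverage spot, a small
# failure in a low-leverage one.
summary = wpa_summary([(0.10, 2.0), (-0.02, 0.5)])
for name, value in summary.items():
    print(f"{name}: {value:+.3f}")
```

Under that assumed definition, user's Rice figures in #539 (WPA +22.65, WPA/LI +29.05, Clutch -7.07) would imply a career pLI of roughly 1.03.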
   541. user Posted: July 25, 2008 at 12:02 PM (#2872844)
I think it gives each individual *plate appearance* a LI of 1.00


I'm fairly sure this is the case. WPA/LI effectively "rewards" players for shaping their production to meet the preferred context of each plate appearance (e.g. having a disproportionate amount of their walks come with the bases empty etc). Straight WPA rewards them for timing their production into high-leverage occasions.
   542. user Posted: July 25, 2008 at 12:30 PM (#2872879)
Something else to consider is how Fangraphs is incorporating baserunning: I believe your BRWAA has components which are not SB/CS related? I don't think Fangraphs has any such equivalent, so you may need to remove them from your data pre-comparison.
   543. Wally Moses, Isolated Power Broker (GGC) Posted: July 25, 2008 at 12:47 PM (#2872908)
Dan, first off, thanks for the email.

My guess is that the Rockies' hitters learn to take advantage of the park, and that that gives them a bigger-than-average home field advantage, a dynamic which the standard park factor calculation (which assumes that home and away teams benefit equally from park effects) cannot take into account.


Interesting. Another Primate is studying this, but it seems to me that good hitters with extreme splits are overvalued. The extra runs that are produced at home have less value than the runs lost on the road. Looking at it this way is similar to looking at support-neutral pitching stats, only that a pitcher who is more flaky is usually more valuable than a more consistent hurler with a similar ERA. Now, things may be different in an extreme hitters' park.
   544. Mike Emeigh Posted: July 25, 2008 at 12:55 PM (#2872922)
Could we be measuring an error in WPA when it comes to the Rockies? Isn't it using a generic run probability?


I thought Fangraphs park-adjusted WPA, but I could be wrong. (I know Studes is aware that WPA needs to be park-adjusted.)

-- MWE
   545. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 25, 2008 at 01:03 PM (#2872936)
I did remove the non-SB baserunning info before doing the test, user. Thanks for the thought.

I'll ask whether the numbers are park-adjusted. I think they must be, or else it would be clear from the leaders and trailers that they weren't.
   546. user Posted: July 25, 2008 at 01:04 PM (#2872938)
http://www.insidethebook.com/ee/index.php/site/comments/hardball_times_team_stats/

posts #4 and #7 are the relevant ones -

the thread implies that they do now but didn't initially - what age is your data set Dan?
   547. DL from MN Posted: July 25, 2008 at 01:08 PM (#2872943)
There's a difference in park adjusting the results and park adjusting the situations. How exactly do you come up with a park context adjustment for 1st and 3rd with 2 outs in the 7th down by 2 runs?
   548. user Posted: July 25, 2008 at 01:08 PM (#2872944)
I did remove the non-SB baserunning info before doing the test, user. Thanks for the thought.


Excellent. Of course a connected point to consider is that such non-SB baserunning credit will be erroneously credited to hitters. I can't imagine this will be particularly significant, but did anyone of note spend an extremely large part of their career batting behind superlative baserunners?
   549. Joe Dimino Posted: July 25, 2008 at 03:36 PM (#2873123)
I thought Colorado players have a road-field disadvantage (as opposed to a home-field advantage) that makes it look like they have a home field advantage to the naked eye.

Didn't someone do a study that showed they are disproportionately worse when coming down from altitude, like the first and second games of road trips, compared to normal teams; or something like that?
   550. JPWF13 Posted: July 25, 2008 at 05:13 PM (#2873371)
Didn't someone do a study that showed they are disproportionately worse when coming down from altitude, like the first and second games of road trips, compared to normal teams; or something like that?

I have to find the thread, but that idea has been kicking around for awhile, but when someone really looked it wasn't true.
   551. Joe Dimino Posted: July 25, 2008 at 05:34 PM (#2873433)
I know there was an article in an old Elias Analyst (1993? as a Rockies 'preview' article) that looked at this for the Jazz and Nuggets (Broncos too?) and found evidence of it. Not sure what's been done since.

Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is? Meaning Rockies have bigger than normal advantage when Mets come to town, but Mets have even bigger advantage when Rockies come to town, as an example?
   552. JPWF13 Posted: July 25, 2008 at 06:02 PM (#2873460)
here's one look at it
The Hangover Effect

from 2005
   553. Blackadder Posted: July 26, 2008 at 06:16 AM (#2874632)
Do we know where Fangraphs gets their park factors? A difference there seems like the most likely explanation for the Rockies thing.
   554. Paul Wendt Posted: July 26, 2008 at 05:37 PM (#2875349)
Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is?

Why does the USOC, or some US Track and Field organization, train elite athletes at Colorado Springs?
I suppose there is some effect in the other direction, contrary to this quotation. Maybe it was discovered by trial and error but I presume there is some scientific explanation today.
   555. TomH Posted: August 02, 2008 at 06:28 PM (#2887812)
Dan, you really ought to find some spot(s) to publish your WAR system, possibly comparing it to BP's and WS (or WSAB). I'm sure SABR and other places would enjoy poring over your methods.
   556. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 03, 2008 at 12:16 PM (#2888346)
Thanks very much for the encouragement, Tom, but it's still a total work in progress. The gigantic question/challenge ahead is addressing the question of the historical pitching/fielding split. The standard deviation of Michael Humphreys's Defensive Regression Analysis (DRA) declines very sharply over time: +15 is good enough to lead the league at SS these days in many years, whereas you needed a +40 to top the charts in the deadball era. Similarly, the standard deviation of defense-independent ERA's (K, BB, and HR rates) has soared over time: deadball pitchers made their living off of BABIP (just ask Joe McGinnity), while everyone who's heard of Voros McCracken knows how much pitchers can control that today. The Holy Grail here is some reliable way to disentangle pitching from defense 100 years ago, something that tells me that if a team is 100 runs above average on BABIP, how much to credit to each fielder and how much to each pitcher. That could also provide an alternative approach to the ghastly business of innings translation. Without it, I'm forced to rely on BP (as I do with my current numbers), and there are too many cases where it's just blatantly wrong (check out the 1904 Giants, for example).
   557. Paul Wendt Posted: August 03, 2008 at 06:49 PM (#2888807)
555. TomH Posted: August 02, 2008 at 06:28 PM (#2887812)
Dan, you really ought to find some spot(s) to publish your WAR system, possibly comparing it to BP's and WS (or WSAB). I'm sure SABR and other places would enjoy poring over your methods.
556. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 03, 2008 at 12:16 PM (#2888346)
Thanks very much for the encouragement, Tom, but it's still a total work in progress.


For publication in SABR "By the Numbers", piecemeal would be a better way to go even if the whole were equally polished.
I suppose that is true in general for the audience that considers statistical analysis its discipline.

For example, DanR publishes the standard deviation approach to adjusting statistics by league.
Ideally, Clay Davenport decides to publish his version of adjustment based on comparing the records of same players in different leagues.
   558. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 03, 2008 at 08:53 PM (#2888895)
Clay and I are measuring two completely different things. He's studying league strength; I'm just looking at the causes of the spread of performance among players. Contrary to what Stephen Jay Gould would have you think, some very tough leagues have high standard deviations, and some very weak ones have low standard deviations.
   559. Paul Wendt Posted: August 04, 2008 at 01:11 AM (#2889137)
That is why I called it "adjusting statistics by league"
but Replacement or replacement-level play may be a better example.
   560. Paul Wendt Posted: August 25, 2008 at 03:08 PM (#2915639)
In the "Election Results: Williams . . ." for LeftField,
DanR replied to bjhanke with a lecture on standard deviation of player-season ratings, especially its use to adjust those ratings.
Standard Deviation, DanR to bjhanke (today)

Here are two of seven points. The emphasis is mine and it may depart from the lecture. Follow the preceding link to the original.
16. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 25, 2008 at 09:41 AM (#2915272)
bjhanke, as the group's self-anointed standard deviation guru, I very much appreciate your interest in the concept. I think your understanding of a few points could be deepened a bit.

1. The counterargument you propose is a straw man. First, let's not use Win Shares even in this theoretical analysis, because they are poorly thought out and lead to all sorts of problems in this type of discussion (I can explain why if you're interested). Let's use something that measures actual value, some indicator of wins above replacement. Doesn't matter whether you use mine, Baseball Prospectus's, or a home-grown version, it's just the concept. Let's also just assume that player performance is normally distributed about the mean (it isn't, but it's close enough for our purposes). OK, take a league where the standard deviation of player performance is 2 wins per season. If we call the bottom 2-3% of major leaguers replacement players, that means they will be four wins below average per season, while the All-Stars will be four wins above average per season. This makes league average a four-WARP player, and an All-Star an eight-WARP player. Let's also say that the top two teams in the league win 95 and 90 games.
OK, now let's say that something, some real external factor, actually causes this stdev to double (as opposed to simply the addition of a bunch of superstars or super-scrubs to the league). (In practice, this would most likely be an increase in run scoring or an expansion). Now, replacement is eight wins below average, while All-Stars are eight wins above average, meaning that league average players have become eight-WARP players, and All-Stars have become 16-WARP players, overnight, with no change to their underlying ability. What happens?
Well, assuming the distribution of talent between teams doesn't change, you'd see a corresponding increase in the standard deviation of wins between teams. So the 90-win team (nine wins above average) will become a 99-win team (18 wins above average), while the 95-win team will become a 109-win team.
Why is winning games important? Because it leads to pennants. When you increase standard deviation, ceteris paribus, you change not only the stats-wins relationship, but also the wins-pennants relationship by the same amount. 97 wins is enough to eke out a pennant in the low-stdev league, but is only good for third place in the high-stdev league. So, if what we are interested in is "pennants added," then we most definitely DO need to correct for standard deviation when assessing players' value.
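The arithmetic in point 1 can be checked with a few lines. This is only a sketch under the post's own simplification (team wins spread out from average in proportion to the player stdev), with an 81-win average assumed because the post calls a 90-win team "nine wins above average"; `rescale_wins` and `finish` are hypothetical helper names.

```python
def rescale_wins(wins, factor, average=81.0):
    """Stretch a team's distance from the league-average win total."""
    return average + factor * (wins - average)


def finish(my_wins, rival_wins):
    """Place in the standings: 1 plus the number of rivals with more wins."""
    return 1 + sum(1 for rival in rival_wins if rival > my_wins)


low_stdev_league = [95.0, 90.0]  # top two teams before the change
high_stdev_league = [rescale_wins(w, 2.0) for w in low_stdev_league]

print(high_stdev_league)              # the 95- and 90-win teams after doubling
print(finish(97, low_stdev_league))   # where 97 wins lands in the low-stdev league
print(finish(97, high_stdev_league))  # where 97 wins lands in the high-stdev league
```

With the stdev doubled, the same 97 wins goes from first place to third, which is the wins-pennants shift the argument turns on.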

3. That said, we need to distinguish between a "true" increase in standard deviation--one that actually makes a league "easier to dominate"--and the inevitable year-to-year fluctuation that takes place due to random noise and to the actual distribution of talent in a league. The clearest example of this is that the highest observed major league standard deviations since 1893 are clustered in the 1920's AL. Was this because the league was easy to dominate? No, it's because the league had a one-man star glut by the name of George Herman Ruth, who singlehandedly was increasing the overall league stdev by massive proportions.
The way to do this is with a regression analysis, which determines the relationships between league factors like run scoring, expansion, and population per team, and observed stdevs over the course of baseball history. By applying the resulting equation to each league-season, we can then determine how easy it was to dominate based on these factors, without making any reference to the actual performance of the players in the league, thus avoiding the temptation to give extra credit to George Burns for playing in low standard deviation leagues just because all of the stars were in the AL. The result of this is the standard deviation adjustment I use in my WARP.
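A rough sketch of what such a regression could look like follows. It is not the actual equation behind the WARP numbers: the league-seasons are synthetic, the coefficients used to generate them are made up, and ordinary least squares on three factors (runs per game, an expansion flag, population per team) is simply one plausible reading of the description above. `fit_stdev_model` and `project_stdev` are hypothetical names.

```python
import numpy as np


def fit_stdev_model(factors, observed_stdev):
    """Ordinary least squares of observed league stdev on league factors."""
    factors = np.asarray(factors, dtype=float)
    design = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(design, np.asarray(observed_stdev, dtype=float), rcond=None)
    return coef  # intercept first, then one coefficient per factor


def project_stdev(coef, factors):
    """Projected stdev from league factors alone, ignoring who played there."""
    factors = np.atleast_2d(np.asarray(factors, dtype=float))
    return coef[0] + factors @ coef[1:]


# Synthetic league-seasons (NOT real data), with made-up true coefficients.
rng = np.random.default_rng(1)
n = 200
runs_per_game = rng.uniform(3.5, 5.5, n)
expansion = (rng.random(n) < 0.1).astype(float)
pop_per_team = rng.uniform(2.0, 12.0, n)  # millions, illustrative only
factors = np.column_stack([runs_per_game, expansion, pop_per_team])
observed = (1.0 + 0.30 * runs_per_game + 0.25 * expansion
            - 0.03 * pop_per_team + rng.normal(0.0, 0.1, n))

coef = fit_stdev_model(factors, observed)
projected = project_stdev(coef, factors)
print(np.round(coef, 2))
```

The point of using `projected` rather than `observed` in any adjustment is that a single league-season inflated by one Ruth-sized outlier barely moves the fitted equation, so his leaguemates are not penalized for sharing a league with him.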


--
Dan,
Regarding the theme I have highlighted:
We have a within-league-season (raw) measure of player wins and hope to adjust it.
- Why not work on the wins-pennant relationship directly? Is the "pennant" intractable, with variation in division size and number, playoff size, etc?
- If the pennant is intractable, it may still be reasonable to work with the standard deviation of team wins rather than of player ratings.
   561. Paul Wendt Posted: August 25, 2008 at 03:13 PM (#2915647)
Brock and others,

At the Hall of Merit, 'WARP' refers equivocally to the Wins Above Replacement Player measures by Clay Davenport and Dan Rosenheck. This thread on "Dan Rosenheck's WARP data" is half full of theory, the foundations of his WARP data.

There is a thread "Battle of the Uber-Stat Systems (Win Shares vs. WARP)!" about the rating systems Win Shares by Bill James and WARP by Clay Davenport. That is half full of theory of one or the other, separately (not so much "vs.").
   562. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 25, 2008 at 04:01 PM (#2915675)
Well, the key phrase in my argument is that "ceteris paribus"--it's awfully hard to do this empirically, because the changing distribution of talent in the league screws things up (like, you know, the Yankees). But if you look at the Pennants Added studies like Wolverton's, a key variable in their calculations is, of course, the standard deviation of team wins, which, holding the distribution of talent constant, is accounted for 100% by the standard deviation of their players' performance.
   563. Blackadder Posted: September 06, 2008 at 01:29 PM (#2931146)
Dan, I apologize if you have done this elsewhere, but can you outline how you are planning on computing/are currently computing pitcher WARP? In particular, do you adjust for innings pitched by era? It is not clear to me whether that is the right thing to do. Also, if you have it, could you send me a copy of your pitcher WARP as it currently stands, with all the necessary provisos understood? You can use djhanen ( AT ) gmail ( DOT) com.
   564. David Concepcion de la Desviacion Estandar (Dan R)