Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

This is a great example of where Gould's equation of higher league quality and lower standard deviation is just dead wrong. You are right that minority stars displaced white replacement players (in the NL, at least). A replacement player is usually what, about 2 wins below positional average? And a star is what, 3 or 4 wins above positional average? So the early days of integration actually *increased* standard deviations, by creating a "star glut" in the NL. Conversely, the relative dearth of stars in the AL around 1950 (the only megastar was Williams, maybe Doby) led to quite *low* standard deviations in the league around this time. This is, of course, why you can't use actual standard deviations, because you'd wind up punishing the NL'ers for integrating and rewarding the AL'ers for not integrating! My regression equation doesn't "know" where the stars were, of course, and so it projects the AL to have a significantly higher stdev than the NL throughout the 1950's, even though in fact the opposite was true. And the equation is right--the AL was easier to dominate, even though more players happened to dominate the NL.
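The "star glut" effect is easy to check with a toy league. This is a minimal sketch, not the regression equation discussed above: the 64-regular league, the head counts, and the values of -2 wins for a replacement player and +3.5 wins for a star are all illustrative assumptions loosely based on the rough figures quoted in the post.

```python
from statistics import pstdev

def league_stdev(n_average, n_replacement, n_stars,
                 replacement_wins=-2.0, star_wins=3.5):
    """Population standard deviation of wins above positional average
    for a toy league made of three kinds of regulars (hypothetical values)."""
    wins = ([0.0] * n_average
            + [replacement_wins] * n_replacement
            + [star_wins] * n_stars)
    return pstdev(wins)

# Hypothetical 64-regular league (8 teams x 8 positions).
before = league_stdev(n_average=48, n_replacement=12, n_stars=4)
# Six replacement-level regulars are displaced by six new stars.
after = league_stdev(n_average=48, n_replacement=6, n_stars=10)

print(f"stdev before: {before:.2f}, after: {after:.2f}")
```

In this toy setup the spread widens after the swap even though the league has plainly gotten better, which is the point being made against using actual standard deviations.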

To clarify, park factors are worth 3.5 wins to Campaneris, SB/CS are worth 4.5 (above and beyond the non-SB baserunning runs I mentioned earlier), and DP avoidance is worth 3.7 wins.

503. DCW3
Posted: November 06, 2007 at 11:53 PM (#2607219)

Wow! That's very surprising. Are RCAP park-adjusted? Do they include SB/CS? Do they include double play avoidance? All three are major benefits to Campaneris which I assumed were included. Also, by subtracting 4 years from McCovey and 2 from Campaneris, I'm giving Campaneris more below-average time than McCovey. DCW3, how would it look if you just did above-average seasons, ignoring all below-average ones?

The way I calculate them, RCAP are park-adjusted, but not era-adjusted: for instance, Campaneris had a higher RCAP in 1970 than 1968, but since the offensive context was higher in '70, '68 was a more valuable season. They include SB/CS, but not DP avoidance. If I subtract out all the below-average seasons, it puts McCovey at 431 and Campaneris at 156.

Catchers as a group hit pretty well in the '50s, and because their rates are more closely bunched together than other positions AND they play fewer games, it's VERY difficult for a C to really move a leaguewide standard deviation (which requires an extremely high number of wins above positional average). When you correct for all the obstacles that catchers face, Berra is among the top 40 MLB position players of the pitchers' mound era, but in terms of dominating a league, he was certainly never putting up the type of 8-wins-above-positional-average seasons that someone like Cal Ripken did.

DiMaggio's last superstar year was '48. He played great in '49 but missed half the year, played at an All-Star but not superstar level in '50, and was done after '51.

Just checking the data, the time period I'm really talking about is 1951 to 1955, when integration really began in earnest and the AL lost its best player to Korea. The top 7 players by WARP1 (so not adjusted for standard deviation) during that stretch were Musial (NL), Snider (NL), Mantle (AL), Jackie Robinson (NL), Doby (AL), Ashburn (NL), and Hodges (NL).

Excluding 1999 of course, and 2007 until those are available. I compared each catcher to that year's league average, and controlling for pitcher handedness.

I hear in this year's Hardball Times TangoTiger will up the ante and publish this data controlling for pitcher.

509. sunnyday2
Posted: November 07, 2007 at 02:14 AM (#2607358)

1950

Joe DiMaggio 32-122-.301/.585 SA led the league (because Ted Williams was hurt)
Berra 28-124-.322 arguably his best year, if not a star then that only refers to notoriety not value
Doby 25-102-.326 (Doby and Berra both 25 yrs; to say Doby was a star and not Berra cannot be right)
Kell, Wertz, Hoot Evers all 100 RBI for Detroit
Vern Stephens 30-144-.295
Dropo and Doerr also 100 RBI; Dropo, Doerr, Stephens, Pesky and Dom D. all scored 100 R
Billy Goodman led the league at .364 with 91 R as utility man
Dom DiMaggio .328/131 R led the league because somebody had to
Al Rosen 37-116-.287 just getting warmed up, the HR led the league

Raschi, Lopat, Reynolds/Lemon, Wynn, Feller

PS. Berra 1949 20-91-.277, Doby 24-85-.280; Berra finished ahead of Doby in MVP voting in both 1949 and 1950

Not arguing the larger point, but there were some pretty good ballplayers in the AL in 1950.

NL--Ennis led the league with 126 RBI
Jackie, Hodges, Reese, Snider, Furillo, Campy, Newk, Roe all had good years yet the Dodgers couldn't win the pennant
Stanky OB via H or BB > 300 times yet couldn't lead the league in R, Earl Torgeson did
Musial led the league at .348 and .596
Kiner 47-118
Kluszewski, Pafko, Hank Sauer
Roberts, Simmons, Konstanty, Jansen, Maglie; Spahn, Sain and Bickford (not Rain in '50 that was '48); Blackwell

Dunno what this shows. Some journeymen "dominated" in the NL, too, maybe. The Phillies won with a pretty mediocre lineup compared to the Dodgers. But "more NL players dominated than in the AL," except that sounds so much like an oxymoron.... But wait, 11 hitters in each league had 100 RBI. But 3 NL pitchers won 20 games, only 2 did in the AL.

Well, I don't use component park factors. It was a very high-scoring league and DiMaggio's OBP wasn't top 10; adjusting for league-relative SLG-heaviness his OPS+ "should" have been more like 146 (while Doby's OBP-heavy one "should" have been more like 159). Only played 139 games, and yeah, below average fielding. Moreover it was an extremely easy to dominate league (basically because of the run scoring).

OK, this Campaneris-versus-McCovey thing was interesting enough that I've done an exhaustive comparison. Here is a rather daunting überchart of the two, with a new glossary:

SFrac: Percentage of the season played.
BTWAA: "Raw" batting wins above average.
DPWAA: Double play avoidance wins above average.
BRWAA: Baserunning wins above average.
FWAA: Fielding wins above average.
PAdj: Wins added/subtracted to correct for park effect.
TWAA: Total wins above average (BTWAA+DPWAA+BRWAA+FWAA+PAdj).
PAWAA: Wins above average an average player at the given position in the given league-season would have produced in a full year's worth of play.
WAP1: Wins above positional average (TWAA-(PAWAA*SFrac)).
LgAdj: Ratio of the regression-projected standard deviation of the given league-season to the 2005 standard deviation.
WAP2: Wins above positional average, adjusted for standard deviation. (WAP1*LgAdj)
A-Rp: Gap in standard deviation-adjusted wins between a positional average and a replacement player at the given position per full year's worth of play.
WARP2: Standard deviation-adjusted wins above replacement player (WAP2 + (A-Rp*SFrac)).

TOTL is career totals, TXBR is career totals excluding sub-replacement seasons, and TXBA is career totals excluding sub-positional-average seasons.
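For anyone who wants to follow the glossary step by step, here is a minimal sketch that chains the definitions together exactly as written above. The class name, field names, and the sample season are hypothetical; only the formulas come from the glossary.

```python
from dataclasses import dataclass

@dataclass
class SeasonLine:
    sfrac: float  # SFrac: fraction of the season played
    btwaa: float  # "raw" batting wins above average
    dpwaa: float  # double play avoidance wins above average
    brwaa: float  # baserunning wins above average
    fwaa: float   # fielding wins above average
    padj: float   # park adjustment, in wins
    pawaa: float  # full-season wins above average of an average player at the position
    lgadj: float  # LgAdj: standard deviation adjustment ratio
    a_rp: float   # A-Rp: positional-average-to-replacement gap per full season

    def twaa(self):
        # TWAA = BTWAA + DPWAA + BRWAA + FWAA + PAdj
        return self.btwaa + self.dpwaa + self.brwaa + self.fwaa + self.padj

    def wap1(self):
        # WAP1 = TWAA - (PAWAA * SFrac)
        return self.twaa() - self.pawaa * self.sfrac

    def wap2(self):
        # WAP2 = WAP1 * LgAdj
        return self.wap1() * self.lgadj

    def warp2(self):
        # WARP2 = WAP2 + (A-Rp * SFrac)
        return self.wap2() + self.a_rp * self.sfrac

# Made-up shortstop season, for illustration only.
example = SeasonLine(sfrac=0.95, btwaa=1.0, dpwaa=0.2, brwaa=0.8, fwaa=1.0,
                     padj=0.3, pawaa=-1.0, lgadj=0.95, a_rp=2.7)
print(example.twaa(), example.wap1(), example.wap2(), example.warp2())
```

Career lines like TOTL would just be sums of these per-season values, with TXBR and TXBA dropping the seasons that fall below replacement or below positional average before summing.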

Lots of interesting stuff to see here. McCovey's "raw" hitting is literally 76 wins better than Campaneris's, but Dagoberto chips away at that advantage with better double play avoidance (2 wins' difference), baserunning (8.5 wins), fielding (11.5 wins), and more pitcher-friendly parks (2 wins), reducing McCovey's advantage to 52 wins.

Then we have to account for the fact that Campaneris played shortstop and McCovey played first base. We can do this either by comparing to positional average or to replacement. Compared to positional average, if we ignore all below-average seasons, McCovey still exceeded the average 1B of his day by more than Campaneris exceeded the average SS of his: 38.7 wins above positional average for McCovey, 28.5 for Campaneris. Pretty big difference. McCovey played in slightly higher standard deviation leagues than Campaneris did; adjusting for that knocks 2.3 wins off McCovey and 1.1 off of Campaneris, which still leaves McCovey with a 36.4-27.4 advantage. I stand corrected: I thought Campaneris and McCovey would be comparable relative to positional average; they are not.
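The arithmetic in the last two paragraphs can be checked in a few lines. The numbers are the ones quoted in the post; the variable names are just labels chosen for this sketch.

```python
# Raw batting gap and the components that chip away at it (wins).
raw_batting_gap = 76.0
campaneris_offsets = {
    "double play avoidance": 2.0,
    "baserunning": 8.5,
    "fielding": 11.5,
    "parks": 2.0,
}
gap_vs_average = raw_batting_gap - sum(campaneris_offsets.values())

# Wins above positional average, ignoring below-average seasons,
# before and after the standard deviation adjustment.
mccovey_wap = 38.7
campaneris_wap = 28.5
mccovey_adjusted = round(mccovey_wap - 2.3, 1)
campaneris_adjusted = round(campaneris_wap - 1.1, 1)

print(gap_vs_average, mccovey_adjusted, campaneris_adjusted)
```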

So why do I have Campaneris slightly higher? Because I see the gap between positional average and replacement as larger at SS (2.7 wins per season) than at 1B (2.1 wins per season) when they played. Remember that my replacement levels are calculated using the worst 3/8 of regulars after adjusting for the leaguewide standard deviation, so the only three things that can account for different-sized gaps between positional average and replacement are:

1. The gap between the worst-regulars average and Nate Silver's FAT levels for the 1985-2005 period. But this actually favors 1B--I subtract 0.3 wins from the worst-regulars average to get replacement level for SS, and 0.5 wins for 1B.

2. The standard deviation of performance *within* the position, which I intentionally do not correct for. Shortstop is, as Nate Silver puts it, a "feast or famine" position--you tend to have a few superstars who are just extraordinary athletes, and then a lot of guys who are really overmatched. By contrast, at 1B, you can more or less play it if you can walk, so while SS are separated by their hitting AND their fielding, 1B are separated basically by their hitting alone. This means that 1B are likely to be bunched much more closely together around positional average, while SS are likely to be spread out much further from positional average.

3. Kurtosis. It could be the case that even keeping standard deviation constant, SS were high-kurtosis in this period ("fat tails" and then a tight cluster in the middle), while 1B were low-kurtosis ("shoulders" near the center of the distribution and very few outliers). This isn't necessarily characteristic of the two positions over time, but it might be true in the 60s and 70s.

So there you have it. Compared to the average player at their positions, McCovey was indubitably better; compared to the freely available talent level, they were just about equal.

There are other factors I didn't mention, but the two that leap to mind probably even out--better in-season durability for Campaneris, tougher league for McCovey.

513. Dizzypaco
Posted: November 07, 2007 at 02:35 PM (#2607682)

By the way, Dan's original question was based on the premise that win shares has some obvious mistakes, such as rating Staub over Vaughn. This just isn't true. Staub had 358 win shares in 2951 games, while Vaughn had 356 win shares in 1817 games, which is obviously a more impressive accomplishment. Win shares was never designed so that you just make a list of career win shares, and that's the order of quality. Part of the reason for the misunderstanding is that win shares makes no attempt to measure replacement value, but we've discussed this before.

Diz, you're absolutely correct about Win Shares. I have been stating for years now that you just can't rate players without using WS/162 in conjunction with straight WS numbers. Heck, Bill James knew this in the NBJHA and responded to it in kind.

515. sunnyday2
Posted: November 07, 2007 at 02:43 PM (#2607691)

>Win shares was never designed so that you just make a list of career win shares,

Which is pretty obvious if you just look at Bill James' rankings. They're not even close to following the career WS totals.

I would agree with whoever said/wherever it was said that the big contribution of this project to world peace is in adjusting WS to 162 games (fairness to 19C players), WWII (and WWI and Korea) credit, and of course NeL MLEs. MiL MLEs are a much lesser matter in part because, as Chris Cobb showed in another thread, we haven't really elected anybody because of them (OK, some might argue Charley Keller, I don't know). Not only that but MiL MLEs were out there though used for a different purpose.

I still hope somebody writes a book out of all of this, though the logistical issues are extreme--e.g. who owns all of the work that's been posted here? Is it public domain? Could Chris' and Doc's MLEs be published w/o their permission, or would they give their permission? What about Dan's work? And who has the time to write it, not to mention the skills?

But all of those questions aside, the adjustments to WS (and to WARP) that have been done here are really cutting edge stuff. Maybe they already have a wider audience on the Web than they would ever get in print anyway. But even so, you'll agree that navigating all of it on this blog is a challenge.

Win shares was never designed so that you just make a list of career win shares, and that's the order of quality.

I guess I'm not sure what it *was* designed for, then. It seems like the response is "showing players' proportional contributions to wins," but I don't see why WS accomplishes that any more than WARP or my system does--in fact, I think it doesn't accomplish that goal as well as either WARP approach. If the whole thing is just a gimmick so that player wins add up to team wins, then all you have to do is just take 40.5 wins and allocate them according to PA, another 40.5 and allocate them according to IP and defensive innings played, and add on wins above average and you've accomplished the same thing without creating any of the distortions that WS introduces. Why go to the trouble of making all these "cutting-edge adjustments" to WS when you can just get it right in the first place?
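The allocation described in that post can be sketched roughly as follows. This is one reading of it, under assumptions: the function name and the player records are hypothetical, and lumping IP and defensive innings into a single "innings" figure is a simplification the post does not spell out.

```python
def allocate_wins(players, batting_pool=40.5, prevention_pool=40.5):
    """Give each player a playing-time share of two 40.5-win pools,
    then add his wins above average on top."""
    total_pa = sum(p["pa"] for p in players)
    total_innings = sum(p["innings"] for p in players)
    shares = {}
    for p in players:
        baseline = (batting_pool * p["pa"] / total_pa
                    + prevention_pool * p["innings"] / total_innings)
        shares[p["name"]] = baseline + p["waa"]
    return shares

# Hypothetical three-man "team", for illustration only.
team = [
    {"name": "A", "pa": 600, "innings": 1300, "waa": 3.0},
    {"name": "B", "pa": 400, "innings": 900, "waa": -1.0},
    {"name": "C", "pa": 0, "innings": 200, "waa": 2.0},  # a pitcher
]
shares = allocate_wins(team)
print(shares, sum(shares.values()))
```

The shares always sum to 81 plus the team's total wins above average, so they match team wins whenever the wins-above-average figures themselves add up to the team's wins minus 81.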

Maybe this should be moved to the uberstats thread.

(OK, some might argue Charley Keller, I don't know)

I would argue Averill, too.

518. Chris Cobb
Posted: November 07, 2007 at 03:56 PM (#2607802)

Here's a potpourri of responses.

First to John Murphy:

I would argue Averill, too.

I didn't list Averill (in the list on the 2007 ballot discussion thread) because that list was looking at players that we elected that the HoF hasn't elected, and the reasons for the difference. Since the HoF elected Averill, he didn't show up. I would agree that minor-league credit helped his case with us, as it did Charley Keller's.

Next to Sunnyday2:

For the record, if someone wanted to do the work of putting together the book, I would happily give permission for my MLEs to be used, for whatever they are worth, if my permission were needed.

I think members of the electorate have the necessary skill to write the book, though probably not the time.

Finally, to Danr:

The main sabermetric advancement that James was pursuing in win shares, I think, was a satisfactory way of measuring pitching, batting, and fielding values together in terms of wins. His definition of "satisfactory" had three criteria, I think:

(1) it should include a good measure of fielding value [since that is the basic piece that was lacking];

(2) it should not be built on "value above average," since in James's view using average as a baseline creates false impressions about the nature of player value (not that his choosing a zero point rather than seeking replacement level doesn't run into the same problem); and

(3) it should be tied in some meaningful way to actual runs and actual wins (his philosophy here is similar to his philosophy with his runs created formulas, which are always brought back to actual runs scored). The choice of having the sum of a team's win shares equal three times team wins is a gimmick, of course, but some defined way of making win shares correspond to actual wins is not a gimmick but a philosophical commitment about how to best represent value.

You could compare my remarks here to what James says in the introductory chapters of _Win Shares_ to see how well I have represented his own claims in the matter.

Pete Palmer had certainly done 1, if I'm not mistaken. As for 2, various concepts of replacement level had been around forever--was it not James himself who first advanced them? If I'm right, then I don't see why the father of replacement level would then abandon it for the unrealistic 52% baseline level. 3 is indeed a novelty, one whose merit is widely disputed.

520. sunnyday2
Posted: November 07, 2007 at 04:41 PM (#2607872)

James talks about Pete Palmer's version of 1, and Pete's was based on "estimated innings," which runs into huge, huge problems on the margins, i.e. if a guy was routinely removed for a defensive replacement or there was some other reason why the estimate was off. I remember there was one year when Heinie Groh was the Reds' 3B and played probably 140-145 games. The guy who replaced him on his off days got more fielding value than Heinie did. There was definite weirdness in Pete's methodology.

As for 3, I am reminded of dark matter. Just because we can't measure it and don't know what it is doesn't mean it's not out there. Likewise, the "dumb luck" that accounts for the divergences between RC formulae and actual runs and between Pythag wins and actual wins. Just because we don't know what caused it doesn't make it meaningless. I don't think it's off the rails to incorporate "it" even if we don't know what "it" is.

Sure, but Fielding Win Shares is a mess, too. And of course we just have no idea about FRAA. I personally am a big fan of DRA--.76 correlation to PBP metrics--, but as of now it's only publicly available for shortstops.

I never said it was "off the rails." I said it was "widely disputed." I definitely consider the question of whether to include run estimation and Pythagorean errors in your evaluation as normative, not positive--there's no one right answer. Certainly there was something going on with those 1890s Braves...I tend to think Tom Glavine-style "clutch pitching" from the stretch may often be a major contributing factor that deserves to be credited. Or fielders deciding whether to take a risky dive or not based on the leverage of the situation...on the offensive side, it seems much more likely to me that it's all luck.

522. EricC
Posted: November 08, 2007 at 01:46 AM (#2608733)

I have been stating for years now that you just can't rate players without using WS/162 in conjunction with straight WS numbers.

Agreed that performance rates need to be considered, as well as career totals, but note that WS/162 as a rate stat can be misleading. For example, for a player with a significant number of appearances as a pinch hitter, such as Enos Slaughter, WS/162 underestimates their rate of performance. WS per plate appearance is better (but ought to be adjusted relative to the OBP of the era in question).

523. Dr. Chaleeko
Posted: November 08, 2007 at 03:06 AM (#2608773)

I'm not big on WS/PA. I like WS/out better, myself. The short-form WS formula includes outs and it makes sense that way. Then you can say things like he earned x WS for every game's worth of outs he made, which gives you something like the WS/LS relationship.
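To see why the choice of denominator matters, here is a minimal sketch comparing the three rate stats mentioned in the last few posts. The two player lines are invented, and scaling WS/PA to 650 plate appearances is an arbitrary choice made only so the numbers are readable.

```python
def ws_per_162(ws, games):
    """Win shares per 162 games appeared in."""
    return ws * 162 / games

def ws_per_650_pa(ws, pa):
    """Win shares per 650 plate appearances (scale chosen for readability)."""
    return ws * 650 / pa

def ws_per_27_outs(ws, outs):
    """Win shares per game's worth of outs made."""
    return ws * 27 / outs

# Hypothetical seasons: an everyday regular and a frequent pinch hitter.
regular = {"ws": 25, "games": 150, "pa": 650, "outs": 420}
pinch_hitter = {"ws": 10, "games": 120, "pa": 260, "outs": 170}

for label, p in (("regular", regular), ("pinch hitter", pinch_hitter)):
    print(label,
          round(ws_per_162(p["ws"], p["games"]), 1),
          round(ws_per_650_pa(p["ws"], p["pa"]), 1),
          round(ws_per_27_outs(p["ws"], p["outs"]), 2))
```

In this made-up case the two players earn win shares at the same rate per plate appearance, but the pinch hitter looks half as good per 162 games because so many of his games are a single trip to the plate.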

I just posted this in case anyone wants to check catchers from 1957 on.

Statsite

Excluding 1999 of course, and 2007 until those are available. I compared each catcher to that year's league average, and controlling for pitcher handedness.

I hear in this year's Hardball Times TangoTiger will up the ante and publish this data controlling for pitcher.

Cool, Rallymonkey!

528. KJOK
Posted: December 07, 2007 at 06:51 AM (#2637702)

Good news for fans of my WARP: BRWAA and FWAA are in line for a major makeover. Dan Fox of Baseball Prospectus has been kind enough to share his complete baserunning data, which is context-adjusted (stealing third with 2 out is not worth as much as stealing third with 1 out) going all the way back to 1956, and Michael Humphreys is in the process of sending me complete DRA since 1893 (I have 2B, 3B, SS, and CF already). I don't plan on doing another full version of my WARP until I am ready for the "Holy Grail" of an integrated system with a variable pitching/fielding split (the stdev of deadball-era DRA is enormous), but I'm happy to provide the +/- numbers on fielding or baserunning for any specific player-seasons of interest to the group. Dr. Chaleeko, since you've been asking me about Dick Allen for eons, he was 5 runs below average for his career.

530. AROM
Posted: January 20, 2008 at 03:45 AM (#2672187)

Dan, when I click on the link at the start of this article, I get a page not available error. Has it been moved? I'd like to check out the big file.

I've been fortunate enough to get my hands on seasonal Win Probability Added data for every player-season since 1974. I've tested it against my batting + baserunning wins above average, and here are the relevant conclusions:

1. I was always under the impression that while clutch ability did not exist, clutch performance did--in other words, that there were substantial discrepancies each year between a player's actual value to his team and the value you would expect from his statistics, but that the cause and distribution of those discrepancies was totally random. It turns out that a player's offensive statistics are a *damn* good predictor of his WPA, far better than I would have thought. Just multiplying batting + baserunning wins above average for any given player-season (as measured by my WARP) by .85 gets you a sweet 90% r-squared on the WPA for that player-season. So the first thing to say is that just using a run estimator and Pythagoras gets you pretty damn close to where you want to go.

2. The one thing that *leaps* out to me about the players whose career WPA most exceeds what we would expect from their offensive statistics is that the leaders are all Rockies. Something is seriously wrong in baseball-reference's park factor calculations, because it is dinging those Colorado hitters (Helton, Walker, Galarraga, Castilla, Bichette) far more than WPA thinks is appropriate. My guess is that the Rockies' hitters learn to take advantage of the park, and that that gives them a bigger-than-average home field advantage, a dynamic which the standard park factor calculation (which assumes that home and away teams benefit equally from park effects) cannot take into account.

3. For HoM purposes, here are the players who showed the greatest and smallest gaps between their BWAA + BRWAA and their WPA from 1974 to 2005:

Leaders

1. Todd Helton, +10.7 wins (1.38 per season)
2. Larry Walker, +10.4 (.94)
3. Andrés Galarraga, +9.2 (.74)
4. Chipper Jones, +9.0 (.86)
5. Dale Murphy, +8.0 (.60)--this will likely get him on my 2009 ballot
6. Vladimir Guerrero, +7.9 (1.00)--I suspect this is because he hits everything the same, be it a 100mph closer fastball or a junkball
7. Vinny Castilla, +7.8 (.77)
8. Andre Dawson, +6.8 (.43)
9. Shawn Green, +6.7 (.67)
10. Fred McGriff, +6.6 (.45)--helps an otherwise weak case
11. Andruw Jones, +6.3 (.74)
12. Graig Nettles, +6.1 (.57)--very relevant for 3B ranking purposes
13. Ryne Sandberg, +5.9 (.43)--would have been good to know when we were voting on 2B
14. Álex Rodríguez, +5.5 (.57)--and they call him a choker!
15. Jeromy Burnitz, +5.5 (.69)
16. Luis González, +5.3 (.40)--same comment as McGriff
17. Terry Pendleton, +5.2 (.50)
18. Marquis Grissom, +5.1 (.41)
19. Gary Gaetti, +5.1 (.36)
20. Mike Piazza, +4.9 (.49)
21. Darin Erstad, +4.7 (.63)--it's true!
22. Brian Giles, +4.7 (.58)--in my PHoM
23. Jim Rice, +4.6 (.34)--takes a tiny bit of sting off his likely HoF induction
24. George Brett, +4.5 (.26)

Everyone else is below +4.5. Notables by rate include Torii Hunter (+.75), Bip Roberts (+.71), Lance Berkman (+.7), José Vidro (+.65), Bob Horner (+.59), Ron LeFlore (+.56), and Lonnie Smith (+.56)

Trailers

1. Travis Fryman, -6.4 (-.61)
2. Don Baylor, -5.7 (-.46)
3. Mickey Tettleton, -5.6 (-.80)
4. Omar Vizquel, -5.4 (-.41)
5. Jeff Conine, -4.8 (-.48)
6. Larry Bowa, -4.6 (-.47)
7. Ken Griffey, Sr., -4.4 (-.42)
8. Ruben Sierra, -4.3 (-.37)
9. Rob Deer, -4.3 (-.69)
10. Jim Gantner, -4.2 (-.49)
11. Luis Alicea, -4.1 (-.87)
12. Tim Raines, -4.1 (-.28)
13. Don Mattingly, -4.0 (-.36)
14. Bob Boone, -3.9 (-.41)
15. Ken Caminiti, -3.9 (-.41)
16. Keith Hernandez, -3.8 (-.32)--this might be enough to drop him out of my PHoM
17. Mark McLemore, -3.8 (-.41)
18. Alan Trammell, -3.7 (-.28)--this would have dropped him a nudge on my SS rankings
19. Mike Bordick, -3.7 (-.42)
20. Raúl Mondesi, -3.6 (-.40)
21. Bill Mueller, -3.5 (-.57)

Everyone else is above -3.5. Notables by rate include Marty Barrett (-.66), Ichiro Suzuki (-.61; I suspect this is because no one is on base for his singles and because a lot of them are infield singles, so they're not much better than walks), Scott Brosius (-.56), Darren Daulton (-.55), Mike Pagliarulo (-.55), Craig Reynolds (-.54), Gene Richards (-.49), Ron Oester (-.49), Rich Aurilia (-.48), Jim Eisenreich (-.48), Howard Johnson (-.48), Tony Bernazard (-.47), Fernando Viña (-.47), and Jorge Posada (-.46; still in my PHoM).

If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.

533. user
Posted: July 25, 2008 at 12:58 PM (#2872534)

Could I ask as to the source of the data? There are a few discrepancies with what I've previously seen.

Just multiplying batting + baserunning wins above average for any given player-season (as measured by my WARP) by .85 gets you a sweet 90% r-squared on the WPA for that player-season.

The .85 is necessary because that's how regression works--you get a closer fit to WPA if you multiply by .85 than if you don't. Basically, what that means is that 90% of WPA is accounted for by offensive statistics, and the remaining 10% is the timing of the events. So to predict WPA, you reduce the weight of the offensive stats from 100% to 90% (the other 5% is presumably because WPA must have a slightly smaller standard deviation or something than BWAA + BRWAA), and "fill in" the rest with league-average timing (that is to say, 0). Does that make sense?
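For readers who want to see the mechanics, here is a minimal sketch of fitting a single multiplier and computing r-squared. It assumes a least-squares line forced through zero, which may or may not be how the .85 was actually obtained, and the six data points are invented rather than real player-seasons.

```python
def slope_through_origin(x, y):
    """Least-squares multiplier b minimizing sum((y - b*x)^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical player-seasons: batting + baserunning wins above average
# (x) and WPA (y). Not real data.
offensive_waa = [-2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
wpa = [-1.5, -1.1, 0.2, 0.7, 1.9, 3.3]

multiplier = slope_through_origin(offensive_waa, wpa)
fit = r_squared(offensive_waa, wpa)
print(f"multiplier: {multiplier:.3f}, r-squared: {fit:.3f}")
```

A multiplier below 1 with a high r-squared just says the predicted values should be shrunk a little toward zero, with the unexplained remainder filled in as average timing.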

535. Bleed the Freak
Posted: July 25, 2008 at 02:57 PM (#2872738)

532. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 24, 2008 at 10:32 PM (#2872114)
If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.

This may only be a coincidence, but almost none of the positive WPA leaders played C, 2B, or SS, and those who did (Sandberg, A-Rod, and Piazza) are the best sluggers ever at those positions. The positive list is made up almost entirely of sluggers with multiple 30-HR seasons AND 300+ career HR, except Erstad, Pendleton, and Grissom. Also, 12 of the top 23 are OF.

On the negative side, roughly half are middle infielders, and almost all showed limited power: only Baylor and Sierra have 300 career HR, and a couple who had power at their peak, HoJo and Mattingly, fizzled quickly.

Don't know if this is any help, but it's an observation nonetheless.

536. Bleed the Freak
Posted: July 25, 2008 at 03:16 PM (#2872786)

Dan, with the boost to some of these modern players, Todd Helton, Dale Murphy, Andre Dawson, Shawn Green, Fred McGriff, Andruw Jones, Graig Nettles, Luis Gonzalez, and Brian Giles in particular, how will these men fare in your all-time rankings?

It appears that borderliners Dawson, Nettles, and Giles would be stronger PHOM cases.

Murphy may vault into the Freehan/Leach area, borderline PHOM.

Do any of the others have a shot at reaching your ballot, and/or are they among the Top 200 post-1893 non-Negro League position players?

Bleed the Freak, that's certainly interesting. Here's a guess: the big hitters were more likely to come up with men on base, and therefore their hits were more likely to drive runners in and have a highly positive WPA, whereas the middle infielders tend to be clustered at the bottom of the lineup where they have very few RBI opportunities. I am actually using WPA/LI, which theoretically controls for context, but still...

I haven't yet decided whether I will adjust my rankings to reflect WPA's findings (besides the Rockies, who will definitely get an upward adjustment). I need to do some tests to see whether there is more variation here than we would expect merely from chance--if there isn't, I'd be disinclined to give credit for it.

538. DL from MN
Posted: July 25, 2008 at 03:27 PM (#2872803)

Could we be measuring an error in WPA when it comes to the Rockies? Isn't it using a generic run probability? Would adjusting the WPA run scoring odds to account for scoring in Colorado make the difference?

539. user
Posted: July 25, 2008 at 03:31 PM (#2872810)

It's FanGraphs data.

In which case:

Jim Rice: BPRO BRAA: 290 (I assume this is at least in the same vicinity as your BWAA)
Fangraphs WPA/LI: +29.05
Fangraphs BRAA (+ve Run expectancy): +247
Fangraphs WPA: +22.65
Clutch: -7.07
Rice gets progressively worse as you go from context-neutral measures towards WPA, whilst you have him on your leaderboard!

Baseball Between the Numbers also has Rice as being -6.67 wins over his career.

User, good catch. Cross-posting, I was using WPA/LI for this, rather than straight WPA, which I thought effectively neutralized clutch *opportunities* by giving every player a LI of 1.00 (e.g. if a player had a pLI of 1.1, WPA/LI would just be his WPA divided by 1.1). But that's clearly wrong--upon further investigation, I think it gives each individual *plate appearance* a LI of 1.00, rather than just the whole season average, and the difference between the two is the "Clutch" score they report, which represents the timing of the hits. I guess that explains why WPA/LI hugs BWAA + BRWAA so tightly.

The above list is still interesting--it shows hitters whose production was, on a context-neutral basis, still worth more or less than their stats would predict--but it's not measuring what I thought it was measuring. I'll do the same study using raw WPA rather than WPA/LI and report back soon. Sorry for the mix-up, I'm new to this stuff!!

541. user
Posted: July 25, 2008 at 04:02 PM (#2872844)

I think it gives each individual *plate appearance* a LI of 1.00

I'm fairly sure this is the case. WPA/LI effectively "rewards" players for shaping their production to meet the preferred context of each plate appearance (e.g. having a disproportionate amount of their walks come with the bases empty etc). Straight WPA rewards them for timing their production into high leverage occasions.
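The difference between the two readings of WPA/LI is easier to see with numbers. This is a minimal sketch on four invented plate appearances; it assumes, as the posts above suggest, that WPA/LI divides each plate appearance by its own leverage index and that "Clutch" is the gap between season WPA divided by average leverage (pLI) and that per-plate-appearance total. It is not taken from FanGraphs documentation.

```python
def wpa_summary(plate_appearances):
    """plate_appearances: list of (wpa, leverage_index) pairs."""
    wpa = sum(w for w, _ in plate_appearances)
    pli = sum(li for _, li in plate_appearances) / len(plate_appearances)
    # Each plate appearance is rescaled to a leverage index of 1.00.
    wpa_li = sum(w / li for w, li in plate_appearances)
    # The whole season is rescaled by its average leverage.
    wpa_over_pli = wpa / pli
    clutch = wpa_over_pli - wpa_li
    return {"WPA": wpa, "pLI": pli, "WPA/LI": wpa_li,
            "WPA/pLI": wpa_over_pli, "Clutch": clutch}

# Hypothetical hitter whose one big hit came in a high-leverage spot.
sample = [(0.10, 2.0), (-0.02, 0.5), (0.03, 1.0), (-0.04, 0.5)]
summary = wpa_summary(sample)
print({k: round(v, 3) for k, v in summary.items()})
```

In this made-up line the hitter's average leverage is exactly 1.00, so dividing the season total by pLI changes nothing, yet his per-plate-appearance WPA/LI is much lower because the one big hit is discounted by its leverage. The gap between the two is what shows up as a positive Clutch figure.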

542. user
Posted: July 25, 2008 at 04:30 PM (#2872879)

Something else to consider is how Fangraphs is incorporating baserunning: I believe your BRWAA has components which are not SB/CS related? Fangraphs I don't think has any such equivalent, so you may need to remove them from your data pre-comparison.

My guess is that the Rockies' hitters learn to take advantage of the park, and that that gives them a bigger-than-average home field advantage, a dynamic which the standard park factor calculation (which assumes that home and away teams benefit equally from park effects) cannot take into account.

Interesting. Another Primate is studying this, but it seems to me that good hitters with extreme splits are overvalued. The extra runs that are produced at home have less value than the runs lost on the road. Looking at it this way is similar to looking at support-neutral pitching stats, only that a pitcher who is more flaky is usually more valuable than a more consistent hurler with a similar ERA. Now, things may be different in an extreme hitters park.

544. Mike Emeigh
Posted: July 25, 2008 at 04:55 PM (#2872922)

Could we be measuring an error in WPA when it comes to the Rockies? Isn't it using a generic run probability?

I thought Fangraphs park-adjusted WPA, but I could be wrong. (I know Studes is aware that WPA needs to be park-adjusted.)

The thread implies that they do now but didn't initially. How old is your data set, Dan?

547. DL from MN
Posted: July 25, 2008 at 05:08 PM (#2872943)

There's a difference in park adjusting the results and park adjusting the situations. How exactly do you come up with a park context adjustment for 1st and 3rd with 2 outs in the 7th down by 2 runs?

548. user
Posted: July 25, 2008 at 05:08 PM (#2872944)

I did remove the non-SB baserunning info before doing the test, user. Thanks for the thought.

Excellent. Of course a connected point to consider is that such non-SB baserunning credit will be erroneously credited to hitters. I can't imagine this will be particularly significant, but did anyone of note spend an extremely large part of their career batting behind superlative baserunners?

I thought Colorado players have a road-field disadvantage (as opposed to a home-field advantage) that makes it look like they have a home field advantage to the naked eye.

Didn't someone do a study that showed they are disproportionately worse when coming down from altitude, like the first and second games of road trips, compared to normal teams; or something like that?

550. JPWF13
Posted: July 25, 2008 at 09:13 PM (#2873371)

Didn't someone do a study that showed they are disproportionately worse when coming down from altitude, like the first and second games of road trips, compared to normal teams; or something like that?

I have to find the thread, but that idea has been kicking around for a while, and when someone really looked it wasn't true.

I know there was an article in an old Elias Analyst (1993? as a Rockies 'preview' article) that looked at this for the Jazz and Nuggets (Broncos too?) and found evidence of it. Not sure what's been done since.

Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is? Meaning the Rockies have a bigger than normal advantage when the Mets come to town, but the Mets have an even bigger advantage when the Rockies come to town, as an example?

552. JPWF13
Posted: July 25, 2008 at 10:02 PM (#2873460)

553. Blackadder
Posted: July 26, 2008 at 10:16 AM (#2874632)

Do we know where Fangraphs gets their park factors? A difference there seems like the most likely explanation for the Rockies thing.

554. Paul Wendt
Posted: July 26, 2008 at 09:37 PM (#2875349)

Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is?

Why does the USOC, or some US Track and Field organization, train elite athletes at Colorado Springs?
I suppose there is some effect in the other direction, contrary to this quotation. Maybe it was discovered by trial and error but I presume there is some scientific explanation today.

555. TomH
Posted: August 02, 2008 at 10:28 PM (#2887812)

Dan, you really ought to find some spot(s) to publish your WAR system, possibly comparing it to BP's and WS (or WSAB). I'm sure SABR and other places would enjoy poring over your methods.

Thanks very much for the encouragement, Tom, but it's still a total work in progress. The gigantic question/challenge ahead is addressing the question of the historical pitching/fielding split. The standard deviation of Michael Humphreys's Defensive Regression Analysis (DRA) declines very sharply over time: +15 is good enough to lead the league at SS these days in many years, whereas you needed a +40 to top the charts in the deadball era. Similarly, the standard deviation of defense-independent ERA's (K, BB, and HR rates) has soared over time: deadball pitchers made their living off of BABIP (just ask Joe McGinnity), while everyone who's heard of Voros McCracken knows how much pitchers can control that today. The Holy Grail here is some reliable way to disentangle pitching from defense 100 years ago, something that tells me that if a team is 100 runs above average on BABIP, how much to credit to each fielder and how much to each pitcher. That could also provide an alternative approach to the ghastly business of innings translation. Without it, I'm forced to rely on BP (as I do with my current numbers), and there are too many cases where it's just blatantly wrong (check out the 1904 Giants, for example).

557. Paul Wendt
Posted: August 03, 2008 at 10:49 PM (#2888807)

555. TomH Posted: August 02, 2008 at 06:28 PM (#2887812)
Dan, you really ought to find some spot(s) to publish your WAR system, possibly comparing it to BP's and WS (or WSAB). I'm sure SABR and other places would enjoy poring over your methods.
556. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 03, 2008 at 12:16 PM (#2888346)
Thanks very much for the encouragement, Tom, but it's still a total work in progress.

For publication in SABR "By the Numbers", piecemeal would be a better way to go even if the whole were equally polished.
I suppose that is true in general for the audience that considers statistical analysis its discipline.

For example, DanR publishes the standard deviation approach to adjusting statistics by league.
Ideally, Clay Davenport decides to publish his version of adjustment based on comparing the records of same players in different leagues.

Clay and I are measuring two completely different things. He's studying league strength; I'm just looking at the causes of the spread of performance among players. Contrary to what Stephen Jay Gould would have you think, some very tough leagues have high standard deviations, and some very weak ones have low standard deviations.

559. Paul Wendt
Posted: August 04, 2008 at 05:11 AM (#2889137)

That is why I called it "adjusting statistics by league"
but Replacement or replacement-level play may be a better example.

560. Paul Wendt
Posted: August 25, 2008 at 07:08 PM (#2915639)

In the "Election Results: Williams . . ." for LeftField,
DanR replied to bjhanke with a lecture on standard deviation of player-season ratings, especially its use to adjust those ratings. Standard Deviation, DanR to bjhanke (today)

Here are two of seven points. The emphasis is mine and it may depart from the lecture. Follow the preceding link to the original.
16. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 25, 2008 at 09:41 AM (#2915272) bjhanke, as the group's self-anointed standard deviation guru, I very much appreciate your interest in the concept. I think your understanding of a few points could be deepened a bit.

1. The counterargument you propose is a straw man. First, let's not use Win Shares even in this theoretical analysis, because they are poorly thought out and lead to all sorts of problems in this type of discussion (I can explain why if you're interested). Let's use something that measures actual value, some indicator of wins above replacement. Doesn't matter whether you use mine, Baseball Prospectus's, or a home-grown version, it's just the concept. Let's also just assume that player performance is normally distributed about the mean (it isn't, but it's close enough for our purposes). OK, take a league where the standard deviation of player performance is 2 wins per season. If we call the bottom 2-3% of major leaguers replacement players, that means they will be four wins below average per season, while the All-Stars will be four wins above average per season. This makes a league-average player a four-WARP player, and an All-Star an eight-WARP player. Let's also say that the top two teams in the league win 95 and 90 games.
OK, now let's say that something, some real external factor, actually causes this stdev to double (as opposed to simply the addition of a bunch of superstars or super-scrubs to the league). (In practice, this would most likely be an increase in run scoring or an expansion). Now, replacement is eight wins below average, while All-Stars are eight wins above average, meaning that league average players have become eight-WARP players, and All-Stars have become 16-WARP players, overnight, with no change to their underlying ability. What happens?
Well, assuming the distribution of talent between teams doesn't change, you'd see a corresponding increase in the standard deviation of wins between teams. So the 90-win team (nine wins above average) will become a 99-win team (18 wins above average), while the 95-win team will become a 109-win team.
Why is winning games important? Because it leads to pennants. When you increase standard deviation, ceteris paribus, you change not only the stats-wins relationship, but also the wins-pennants relationship by the same amount. 97 wins is enough to eke out a pennant in the low-stdev league, but is only good for third place in the high-stdev league. So, if what we are interested in is "pennants added," then we most definitely DO need to correct for standard deviation when assessing players' value.
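The arithmetic in point 1 can be checked with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, and the 81-win average is inferred from the post's "90-win team (nine wins above average)". It assumes, as the post does, that the distribution of talent between teams is unchanged, so each team's distance from average stretches by the same ratio as the player stdev.

```python
def rescale_wins(wins, stdev_ratio, average=81.0):
    """Stretch a team's distance from the league average by the stdev ratio."""
    return average + (wins - average) * stdev_ratio


low_stdev_league = [95, 90]  # top two teams in the example
high_stdev_league = [rescale_wins(w, 2.0) for w in low_stdev_league]
print(high_stdev_league)  # [109.0, 99.0]

# A 97-win team would beat both clubs in the low-stdev league
# but trail both of them after the stdev doubles.
print(97 > max(low_stdev_league), 97 < min(high_stdev_league))  # True True
```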

3. That said, we need to distinguish between a "true" increase in standard deviation--one that actually makes a league "easier to dominate"--and the inevitable year-to-year fluctuation that takes place due to random noise and to the actual distribution of talent in a league. The clearest example of this is that the highest observed major league standard deviations since 1893 are clustered in the 1920's AL. Was this because the league was easy to dominate? No, it's because the league had a one-man star glut by the name of George Herman Ruth, who was singlehandedly increasing the overall league stdev by massive proportions.
The way to do this is with a regression analysis, which determines the relationships between league factors like run scoring, expansion, and population per team, and observed stdevs over the course of baseball history. By applying the resulting equation to each league-season, we can then determine how easy it was to dominate based on these factors, without making any reference to the actual performance of the players in the league, thus avoiding the temptation to give extra credit to George Burns for playing in low standard deviation leagues just because all of the stars were in the AL. The result of this is the standard deviation adjustment I use in my WARP.

--
Dan,
Regarding the theme I have highlighted:
We have a within-league-season (raw) measure of player wins and hope to adjust it.
- Why not work on the wins-pennant relationship directly? Is the "pennant" intractable, with variation in division size and number, playoff size, etc?
- If the pennant is intractable, it may still be reasonable to work with the standard deviation of team wins rather than of player ratings.

561. Paul Wendt
Posted: August 25, 2008 at 07:13 PM (#2915647)

Brock and others,

At the Hall of Merit, 'WARP' refers equivocally to the Wins Above Replacement Player measures by Clay Davenport and Dan Rosenheck. This thread on "Dan Rosenheck's WARP data" is half full of theory, the foundations of his WARP data.

There is a thread "Battle of the Uber-Stat Systems (Win Shares vs. WARP)!" about the rating systems Win Shares by Bill James and WARP by Clay Davenport. That is half full of theory of one or the other, separately (not so much "vs.").

Well, the key phrase in my argument is that "ceteris paribus"--it's awfully hard to do this empirically, because the changing distribution of talent in the league screws things up (like, you know, the Yankees). But if you look at the Pennants Added studies like Wolverton's, a key variable in their calculations is, of course, the standard deviation of team wins, which, holding the distribution of talent constant, is accounted for 100% by the standard deviation of their players' performance.

563. Blackadder
Posted: September 06, 2008 at 05:29 PM (#2931146)

Dan, I apologize if you have done this elsewhere, but can you outline how you are planning on computing/are currently computing pitcher WARP? In particular, do you adjust for innings pitched by era? It is not clear to me whether that is the right thing to do. Also, if you have it, could you send me a copy of your pitcher WARP as it currently stands, with all the necessary provisos understood? You can use djhanen ( AT ) gmail ( DOT) com.

Sure. I use BP's DERA statistic (which is excellent if you trust their defensive adjustments, which are sometimes questionable) to measure pitcher effectiveness, and convert it into pitching wins above average. I then compare the pitcher's hitting to a league average hitter to get total wins above average. Next, I determine how many wins below average a replacement pitcher would have had in the same number of innings and plate appearances, and subtract this value to get WARP1. I then make two adjustments: one for innings pitched (based on a moving average of IP leaders), and another for standard deviations (using the same methodology I do for hitters), to get WARP2. I do not yet make any adjustment for career length.

The caveats are:

1. I only have data for seasons where the pitcher in question was the starter for at least half his appearances.

2. I am completely reliant on BP’s DERA statistic to calculate pitcher effectiveness. If you don’t trust their defensive adjustments, you shouldn’t trust these numbers.

3. The numbers *do* include a rudimentary adjustment for seasonal innings totals, but *not* for career length, which systematically biases them against old-time pitchers who get their seasonal IP crunched but don’t get their careers extended.

4. The batting numbers are compiled against a baseline of pitcher average hitting for the league-season in question, which can fluctuate quite a bit. I think it would probably be better to use some sort of multi-year moving average as the baseline, but I haven’t gotten around to it yet.

5. I use a fixed replacement level for starting pitching, rather than one that floats over time as the hitter ones do. There is a very good chance that this is inaccurate.

6. I have not yet "integrated" these numbers with my position player ones at all, so not everything "adds up" the way it theoretically should.

I think these are enough warning signs to make the numbers of quite limited usefulness. Nonetheless, I'll send them your way now.
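For anyone following along, the WARP1 step described above reduces to one line of arithmetic. The sketch below is only an illustration: the function name and the sample numbers are hypothetical, and the WARP2 adjustments (innings and standard deviation) are left out because the post does not say what form they take.

```python
def pitcher_warp1(pitching_waa, batting_waa, replacement_waa):
    """Wins above replacement before any era adjustments.

    pitching_waa:    pitching wins above average (from DERA, per the post)
    batting_waa:     the pitcher's hitting versus the batting baseline
    replacement_waa: wins above average a replacement pitcher would post in
                     the same innings and plate appearances (a negative number)
    """
    total_waa = pitching_waa + batting_waa
    return total_waa - replacement_waa


# Hypothetical season: +4.0 pitching, -0.5 batting, replacement at -2.0
print(pitcher_warp1(4.0, -0.5, -2.0))  # 5.5
```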

565. Paul Wendt
Posted: September 06, 2008 at 06:28 PM (#2931192)

Do you find that the same factors (average runs scored, years since expansion) explain the pitcher and "player" standard deviations in about the same way?

4. The batting numbers are compiled against a baseline of pitcher average hitting for the league-season in question, which can fluctuate quite a bit. I think it would probably be better to use some sort of multi-year moving average as the baseline, but I haven’t gotten around to it yet.

I think you must be right about this, in favor of the moving-average.

566. Paul Wendt
Posted: September 06, 2008 at 07:42 PM (#2931243)

DanR #139 I just did a quick league strength study for the 50s and 60s, and my preliminary results are absolutely jawdropping.

I looked at every position player who switched leagues between 1951 and 1968 (Mantle's career), a sample of 249 players. I took their rate in each season of batting wins per year, baserunning wins per year, and fielding wins per year, and added 8.7 to turn them into wins created per year. I then weighted each player by the harmonic mean of his playing time in the two seasons before and after the switch, giving 87 full seasons' worth of sample (where a player who plays every game in both the year before and after the switch is counted as 1.00). Finally, I took the ratio of their weighted performances before and after the switch. The ratio for batting wins was 1.092, for baserunning wins it was 1.001, and for fielding wins it was 1.007. This is, astonishingly, on par with the gap between the major leagues of 1944 and those of 1942/46.

For example, where WinR is win rate and Time is playing time.
WinR Time
1.00 1.00 1958
3.00 0.66 1959
4.00 1.00 1960
2.00 0.50 1961

Now you combine the two years "before" by weighted average --weighted by playing time, uniform by sequence? Same for the two years "after".
WinR Time
1.80 1.66(/2=0.833) before
3.33 1.50(/2=0.750) after

From which
- performance ratio = 3.33/1.80 = 1.85 ; or 1.80/3.33 = 0.54 depending on the direction of the move
- weight = harmonic mean of 0.833, 0.750

Then given a timespan such as 1951-1968 you take the weighted geometric mean of all the performance ratios (expressed in terms of fixed direction of the move) that are in the timespan.
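The bookkeeping in this example can be written out as a short Python sketch. It follows Paul's reading of the method, not necessarily Dan's actual calculation, and all function names are made up for illustration. Note that the unrounded ratio comes out near 1.86; the 1.85 above comes from dividing the rounded figures 3.33 and 1.80.

```python
from math import exp, log
from statistics import harmonic_mean


def weighted_rate(seasons):
    """Playing-time-weighted average of (win_rate, playing_time) pairs."""
    total_time = sum(time for _, time in seasons)
    return sum(rate * time for rate, time in seasons) / total_time


def average_time(seasons):
    """Mean playing time per season in the window."""
    return sum(time for _, time in seasons) / len(seasons)


def weighted_geometric_mean(ratios, weights):
    """Geometric mean of the ratios, each counted in proportion to its weight."""
    return exp(sum(w * log(r) for r, w in zip(ratios, weights)) / sum(weights))


before = [(1.00, 1.00), (3.00, 0.66)]  # 1958, 1959
after = [(4.00, 1.00), (2.00, 0.50)]   # 1960, 1961

rate_before = weighted_rate(before)    # about 1.80
rate_after = weighted_rate(after)      # about 3.33
ratio = rate_after / rate_before       # about 1.86 unrounded
weight = harmonic_mean([average_time(before), average_time(after)])  # about 0.79

# With many league-switchers, collect one (ratio, weight) pair per player
# and combine them; a single player just returns his own ratio.
league_ratio = weighted_geometric_mean([ratio], [weight])
```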

IIUC you should be able to calculate once and store these 6 to 8 values for all interleague moves, subsequently do all sorts of things with them.
- year
- playerID
- performance ratios for batting, baserunning, fielding (3)
- weights (1 to 3 of them)

Oh, that would combine all fielding positions :-(

Or does the following mean that you combine Batting, BR, and F at the player-season level, before my exposition begins?
That doesn't fit the three-part numerical conclusion. Do you add 8.7 three times, once to each? I took their rate in each season of batting wins per year, baserunning wins per year, and fielding wins per year, and added 8.7 to turn them into wins created per year.

There are some overlaps and some differences, Paul Wendt. The single biggest factor in pitcher stdevs is league strikeout rate, which doesn't matter much for hitters, and league home run rate also plays a major role. This is hardly surprising--they both reflect the transition in responsibility from fielders to pitchers. Also, pitcher stdevs show a strong (and intuitive) correlation to season length, whereas hitter stdevs surprisingly do not. Anecdotally, you certainly think of Schmidt/Evans/Dawson/Grich in '81 and Bagwell/Thomas/Belle/Lofton/O'Neill/Gwynn/Mitchell/M. Williams in '94, but over the whole history of the game it doesn't hold up. There was nothing particularly fluky in the war-shortened 1918 season, and semi-reduced years like 1919 and 1995 don't show any anomalies.

On the flip side, expansion does NOT turn up a statistically significant impact on pitchers, although maybe I could tease one out if I tweaked my expansion weights some. Run scoring definitely matters on both sides of the ball, as you would expect--more runs = more plate appearances = more opportunities for players to distinguish themselves from one another.

Pitcher standard deviations have never been higher than during the 1993-present "Steroid Era," and particularly in 1994-95. Their all-time low was in the 1920's, which is why I will certainly be the best friend of Dazzy Vance in our starting pitcher rankings, and why I may consider Adolfo Luque in 2009. The variance in them is extremely large, much more so than for hitters--ignoring the strike years for now, Roger Clemens's 222 RA+ in the 1997 AL translates to just a 174 RA+ in Luque's 1923 NL, while Luque's 188 RA+ in the 1923 NL counts for the same as a 256 RA+ in the 1997 AL. Of course, '97 Clemens would have thrown 322 innings in the 1923 NL, and '23 Luque would have thrown 264 innings in the 1997 AL, so it all sort of evens out. I have both seasons right around 11 WARP, with Luque slightly ahead.

I calculated separate conversion factors for each of BWAA, BRWAA, and FWAA. Example:

Bob had 2 batting wins above average and 0.75 fielding wins above average in 3/4 of a season in the AL in Year 1, and 1 batting win above average and 0.5 fielding wins above average in half a season in the NL in Year 2. Joe had -3 batting wins above average and -1 fielding win above average in a full season in the NL in Year 1, and -1 batting win above average and -.5 fielding wins above average in half a season in the AL in Year 2. These are the only two league-switchers in my sample.

First, the batting:

Bob's batting rate in the AL is 2/.75 = 2.67 BWAA per year, and in the NL it's 1/.5 = 2.00 BWAA per year. Adding 8.7 to this, we get him with 11.37 batting wins per year in the AL, and 10.7 batting wins per year in the NL. Joe's batting rate in the NL is -3/1 = -3 BWAA per year, and in the AL it's -1/.5 = -2 BWAA per year. Adding 8.7, we get him with 5.7 batting wins per year in the NL, and 6.7 batting wins per year in the AL.

Bob is weighted at the harmonic mean of .75 and .5, which is 0.6, and Joe is weighted at the harmonic mean of 1 and .5, which is 0.67. So, Bob's 11.37 batting wins per year in the AL become a weighted 6.82, and his 10.7 batting wins per year in the NL become a weighted 6.42. Joe's 5.7 batting wins per year in the NL become a weighted 3.82, and his 6.7 batting wins per year in the AL become a weighted 4.49.

In total, that makes 6.82 + 4.49 = 11.31 weighted total batting wins in the AL, and 6.42 + 3.82 = 10.24 weighted total batting wins in the NL, producing a batting win conversion factor of 1.10. (I picked the numbers at random, but that's not far off from my actual observed result).

Now, the fielding: Bob's fielding rate in the AL is .75/.75 = 1.00 FWAA per year, and in the NL it's .5/.5 = 1.00 FWAA per year. Adding 8.7 to this, we get him with 9.7 fielding wins per year in the AL, and 9.7 fielding wins per year in the NL. Joe's fielding rate in the NL is -1/1 = -1 FWAA per year, and in the AL it's -.5/.5 = -1 FWAA per year. Adding 8.7, we get him with 7.7 fielding wins per year in the NL, and 7.7 fielding wins per year in the AL.

Using the same weights, Bob's 9.7 fielding wins per year in the AL become a weighted 5.82, as do his 9.7 fielding wins per year in the NL. Joe's 7.7 fielding wins per year in the AL become 5.16, as do his 7.7 fielding wins per year in the NL.

In total, that makes 5.82 + 5.16 = 10.98 weighted fielding wins in the AL, and the same total in the NL, producing a fielding win conversion factor of 1.00.
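The Bob-and-Joe example can be reproduced with a short Python sketch. The function and variable names are made up for illustration; the 8.7 baseline, the harmonic-mean weights, and the player lines are taken from the example above. The weights are not rounded here, so the AL and NL batting totals come out near 11.29 and 10.22 rather than 11.31 and 10.24, but the conversion factor still rounds to 1.10.

```python
from statistics import harmonic_mean

BASELINE = 8.7  # added to each rate to turn wins above average into wins created


def conversion_factor(players):
    """AL-to-NL ratio of weighted wins created across all league-switchers.

    players maps a name to {league: (wins_above_average, playing_time)}.
    """
    totals = {"AL": 0.0, "NL": 0.0}
    for leagues in players.values():
        # One weight per player: harmonic mean of his playing time in each league.
        weight = harmonic_mean([time for _, time in leagues.values()])
        for league, (waa, time) in leagues.items():
            totals[league] += weight * (waa / time + BASELINE)
    return totals["AL"] / totals["NL"]


batting = {
    "Bob": {"AL": (2.0, 0.75), "NL": (1.0, 0.5)},
    "Joe": {"NL": (-3.0, 1.0), "AL": (-1.0, 0.5)},
}
fielding = {
    "Bob": {"AL": (0.75, 0.75), "NL": (0.5, 0.5)},
    "Joe": {"NL": (-1.0, 1.0), "AL": (-0.5, 0.5)},
}

print(round(conversion_factor(batting), 2))   # 1.1
print(round(conversion_factor(fielding), 2))  # 1.0
```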

I only looked at the seasons directly before and after the league switch in this first study.

Dan - I've only been able to skim - and I do like DERA, as an adjustment for team defense (NRA-DERA) to using a runs allowed compared to league average, adjusting for park, etc. in the methodology.

But are you saying you use DERA itself as the main component of pitcher effectiveness (not starting from RA/INN)? I'm not sure I have that much faith in it.

I need to read more thoroughly, but wanted to mention this at least in case I can't get back to it for a bit.

Yes, I use DERA itself as the main component, because every time I've spot-checked NRA (doing my own calculation of a pitcher's winning percentage based on his RA+), it's within .02-.03 of my estimate. Their math on turning RA into WPCT is perfectly sound; it's the accuracy of the defensive (NRA to DERA) adjustment that's very much debatable. (The best example of this is the 1904 Giants, where BP implicitly argues that the fielders made Mathewson and McGinnity look good, but in fact it's much more likely that Big Six and Iron Joe induced lots of easy plays for the defense). Distributing credit between pitchers and fielders for hit prevention on balls in play going back in time is just a daunting challenge--it is precisely what has stopped me from being able to integrate my hitter and preliminary pitcher WARP so far.

571. Paul Wendt
Posted: September 10, 2008 at 10:22 PM (#2935975)

Thanks, Dan.
At this stage, does every player-season have one value for playing time (0.75 for Bob in year 1) that is used in all three batting, fielding, and pitching analyses?

thinking aloud, not entirely unrelated:
Does anyone know whether there is a detectable tendency for pitchers who are good batters to move from AL to NL, and vice versa?

572. Blackadder
Posted: September 11, 2008 at 01:45 AM (#2936603)

Wasn't Palmer particularly good at giving him fieldable balls? I vaguely recall TotalZone making a specific "Palmer adjustment" to deflate the stats of some of those Orioles players...

573. Howie Menckel
Posted: September 11, 2008 at 03:42 AM (#2937237)

Don't know the exact numbers, but there was some anecdotal AND statistical sense that Palmer knew what he had in that amazing defense, and "pitched to the infield" as opposed to pitching to the score as they used to say. Plus Paul Blair in CF, wow.

Palmer is a fascinating case; some stats make him seem lucky, but I don't think so. The zero grand slams allowed in career also plays to that, to an extent.

For batters, playing time is measured by their plate appearances divided by 1/9 of the league average plate appearances per team. For pitchers, it is measured by their percentage of a league average team's innings pitched.
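In code, those two playing-time definitions look roughly like this. The function names and the sample league figures are placeholders for illustration, not real league totals.

```python
def batter_playing_time(player_pa, league_avg_team_pa):
    """Plate appearances as a share of one lineup slot (1/9 of a team's PA)."""
    return player_pa / (league_avg_team_pa / 9)


def pitcher_playing_time(player_ip, league_avg_team_ip):
    """Innings pitched as a share of a league-average team's innings."""
    return player_ip / league_avg_team_ip


# Placeholder league: 6,300 PA and 1,440 IP per team
print(batter_playing_time(525, 6300))             # 0.75
print(round(pitcher_playing_time(240, 1440), 3))  # 0.167
```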

Such a tendency would be optimal, but I am not aware of any empirical study that has demonstrated one.

There is a VERY strong argument that Palmer's pitching style was uniquely well-suited to his fielders, intentionally or not: he gave up 169 fewer hits on balls in play than he would have if opposing batters had posted the same BABIP against him as they did against the rest of the Orioles' staff. Moreover, he allowed 97 fewer runs than he would have if the hits and walks against him had been distributed randomly, suggesting a true Glavine-like ability to "bear down" from the stretch. He was the kind of guy that somebody just looking at strikeouts, walks, and home runs allowed (as Voros McCracken would recommend) would consistently underrate.
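The hits-on-balls-in-play comparison is simple enough to sketch. The function name and the inputs below are hypothetical, not Palmer's actual totals; the idea is just expected hits at the rest-of-staff BABIP minus the hits the pitcher actually allowed on balls in play.

```python
def hits_saved_on_balls_in_play(balls_in_play, hits_allowed, staff_babip):
    """Expected hits at the rest-of-staff BABIP minus hits actually allowed."""
    expected_hits = balls_in_play * staff_babip
    return expected_hits - hits_allowed


# Hypothetical pitcher: 1,000 balls in play, 260 hits, rest of staff at .280
print(round(hits_saved_on_balls_in_play(1000, 260, 0.280), 1))  # 20.0
```

A positive result means the pitcher allowed fewer hits than his teammates' BABIP would predict in front of the same fielders.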

575. Paul Wendt
Posted: September 11, 2008 at 04:39 PM (#2937906)

There is a VERY strong argument that Palmer's pitching style was uniquely well-suited to his fielders, intentionally or not: he gave up 169 fewer hits on balls in play than he would have if opposing batters had posted the same BABIP against him as they did against the rest of the Orioles' staff.

"Uniquely" may mean unique among Orioles who enjoyed the same fielders for a medium or long time. Among 140 or 150 great pitchers, Spalding to Santana, here are the 10 or 11 "leaders" per 9 innings. (It's 150 including 10-15 relief pitchers.)

Larry Corcoran and Fred Goldsmith divided the Chicago workload during most of their careers, and it shows: -0.20 and +0.29.
Maddux, Glavine, and Smoltz: -0.12, -0.11, -0.09.

Here are the "trailers"

+0.44 Pettitte

+0.29 John, Goldsmith, Kaat,
+0.28 Oswalt

+0.23 Lolich
+0.21 Pennock

<0.19 everyone else

576. Paul Wendt
Posted: September 11, 2008 at 04:43 PM (#2937916)

Forget about Rivera. It looks like I have XIP in this table, which is decision-adjusted innings.

Pettitte, John and Kaat at the "bottom" of the list. Is that predictable?

578. Paul Wendt
Posted: September 11, 2008 at 04:57 PM (#2937945)

Moreover, he allowed 97 fewer runs than he would have if the hits and walks against him had been distributed randomly, suggesting a true Glavine-like ability to "bear down" from the stretch.

The "leaders":

-0.80 Spalding (-0.80 runs per 9 innings)
(gap)
-0.57 Cummings
(gap)
-0.41 Maglie, Ford, Oswalt, McCormick, Will White
-0.31 Caruthers, Grove, Plank

That is five pre-1893 pitchers in the first ten, with Welch, Bond, Goldsmith, and Mullane (9) also in the top 20.

Palmer is about #12 of about 125 post-1892 pitchers.

Who leads Palmer in both measures?
No one.
McGinnity and Coveleski are the two others with very strong combined records in these two statistics.

579. Paul Wendt
Posted: September 11, 2008 at 04:59 PM (#2937947)

What about Mr. Hittable, Glendon Rusch?

He is not one of my 150 great pitchers.
Maybe Pettitte fooled you regarding the lofty standard.

580. Paul Wendt
Posted: September 11, 2008 at 05:05 PM (#2937961)

Based on the manual pages for DT cards at baseball prospectus (my source for these data), I think I should use XIP only with RAA and PRAA, but I should use actual innings pitched with these statistics DH, DR, and DW.

I'll put the six new statistics including XIP in my lahman5.4-extended and upload the table with IP and playerID too.

581. Paul Wendt
Posted: September 12, 2008 at 01:44 AM (#2938650)

done
I checked my 150 great pitchers against two reference lists (see the description) and added 11 more.

upload to "Files" for the "HallofMerit" egroup at yahoogroups.com

pitchers161.csv
Career data for 161 "great" pitchers: seven new statistics (DH, DR, DW, RAA, PRAA, DERA, XIP), four old ones (IP, ERA+, PA, OPS+), name and numeric lahmanID. The 161 include the Bill James top 100 (2001) and everyone with career ERA+ >=130 who qualifies as a leader at baseball-reference (2008-09-11). All data current sometime Jul-Sep 2008.

I've just calculated WARP for selected 2008 players. Notes:

1. Standard deviation is calculated as it is for all other league-seasons. Both the AL and NL come out to a .991 LgAdj.
2. Defense is calculated using an average of 70% Dewan's Plus/Minus, 30% Chris Dial's RSpt. The only exception is that Dewan has Chase Utley at an insane 50 plays above average this year, so I used a 50/50 weighting there. I will likely amend this rating once I get a UZR for him. Since these numbers are unregressed, they will probably show a somewhat higher standard deviation than the ones I have published for previous years.
3. I have added 3.4 to the park factor of all NL players and subtracted 3.4 from the park factor of all AL players to adjust for league strength. This posits an 8.0 wins-per-team strength gap between the leagues in favor of the AL, divided 65/35 among position players and pitchers. It comes out to about .3/.4 wins per player. Note that that means these numbers are NOT directly comparable with those posted to my spreadsheet, which always assume equal strength between the AL and NL.
4. Non-SB baserunning is estimated using my standard estimation equation. This means BRWAA will probably show a slightly lower standard deviation than it does for 1972-05.
5. Replacement levels are assumed to be the same as 2005. This may very well not be the case.

This does NOT mean these are all the top-ranked players in MLB--they're just the ones I happened to check. There may be others that should be interspersed on this leaderboard. If there are any other players in particular you're curious about, let me know.

Even after adjusting for league strength, the standard deviation gap between the AL and NL is striking. Is there some real factor my standard deviation projection equation is missing? Is the league strength gap bigger than the (already massive) 8-wins-per-team differential I'm positing? Or was it just that nobody happened to have a big year in the AL, a la 1976?

Before anyone wants to wring my neck, the YouKike thing is a joke. His baseball-reference and BP id are youkike, and since he's Jewish, it makes for quite a coincidence. In response, baseball-reference has turned the "i" into a lowercase "L"--which is, of course, indistinguishable on a computer screen from a capital I--while Baseball Prospectus has just stuck a ton of underscores in the middle, as in youki__________ke. Pretty funny.

Ryan Howard must have just had the worst 48-homer season ever.

That said, the clear AL MVP is Cliff Lee, who was a mere 9.5 WARP after adjusting for league strength (assuming Cleveland's fielders were league-average, which I believe they were). Even better than Pujols.

Dan - I think the AL has an inherent structural advantage in interleague play because of the DH. NL teams don't have to keep one around, and are at a large disadvantage in games in the AL park.

I haven't studied it - but does the AL have a larger than expected HFA in interleague games in AL parks? 8 wins per team seems very high to me - I thought the NL had closed the gap. Also, it doesn't necessarily hold that the advantage would be the same for hitters as pitchers.

Is the gap between an average AL DH and an NL pinch hitter bigger than the gap between pitchers who hit 2-3 times a start and those who never, ever hit? Obviously if it's Thome the answer is yes, but a lot of teams carry plenty of dead weight at DH.

Both Nate Silver and Mitchel Lichtman told me for an NY Times column last year that the league strength gap in 2006 was 10 wins a team, after factoring out DH-related factors. I too heard it had narrowed--a bit, hence my guesstimate of 8 wins.

587. sunnyday2
Posted: October 03, 2008 at 11:53 AM (#2965982)

It has been said many many times that the main virtue of WS is that it adjusts to actual wins earned on the field of play.

I have also said 100 times that my big problem with WARP is that it's always changing, and of course that is BP WARP.

DanR WARP came along at a time when I just wasn't willing to change "systems." I mean, I do have a life.

Having said all of that, I am favorably impressed by Dan's 2008 MVP rankings. Let's just take the NL.

Win Shares

Berkman 38, Pujols 35, Beltran 33, H. Ramirez 32, Utley 30, Reyes and Wright 29, Lincecum and McLouth 27, Ludwick and A. Gone 26

VORP

Pujols 97, H. Ramirez 81, C. Jones 75, J. Santana 73, Berkman and Lincecum 72, Wright 66, Reyes 63, Utley 62, Holliday 60

DanRWARP

Pujols, H. Ramirez, Utley, C. Jones, Teixeira, Berkman, Wright, Beltran, M. Ramirez (I assume this is both leagues), Reyes. I assume you didn't rate the pitchers, rather than that they fell short...?

Some markers:

Berkman: 1 on WS, 5T on VORP, 6 on DanRWARP
Beltran: 3, not top 10, 8
C. Jones: unrated, 3, 4
Teixeira: unrated, unrated, 5

My gut says that WS is closest on Berkman though I don't see him as #1, DanR is closest on Beltran, but Jones would be closer to #11 than to 3 or 4, and Teixeira would be closer to unrated than to #5.

But all things considered, if I'm just looking at the top 3, I like Pujols, Hanley and Utley better than I like the other 2 top 3s. Not that looking at the top 3 is all that meaningful, more of a toy.

Maybe it's just a coincidence but I don't see WS defensive methodology being particularly influential in the differentials here. I might infer from Berkman being #1 that it over-rates 1B defense but then how to explain Pujols...not to mention, again, Teixeira. Maybe this is only really a problem when you roll up the numbers at the career level? Well, no, Dan says it's as much as 2.5-3 WS too low for a really good corner OF. I don't know who the representative case is this year, however.

A. Gone and McLouth may reflect WS propensity to advantage good players on crappy teams (or does it?). Some said it does apropos of Ralph Kiner but nobody said it about Chuck Klein. But that charge has been around forever.

588. AROM
Posted: October 03, 2008 at 12:15 PM (#2965986)

Rosenheck sounds like a Jewish name. So Dan can list Youkilis as he did; the Gentiles among us can't.

589. Tiboreau
Posted: October 03, 2008 at 01:05 PM (#2966010)

Maybe it's just a coincidence but I don't see WS defensive methodology being particularly influential in the differentials here. I might infer from Berkman being #1 that it over-rates 1B defense but then how to explain Pujols...not to mention, again, Teixeira.

If the issue is that WS underrates defense, then wouldn't the problem be that WS overrates 1B (and corner OF) overall when compared to premium defensive positions, not 1B defense? Even if it did overrate 1B defense in comparison to other positions, the argument was that it underrates defense in comparison to offense, cutting too small a piece of pie for the former, so there would be little effect on the WS ratings if it was determined that 1B defense was overrated and was adjusted without taking into account that it underrates defense overall, its real problem.

Win Shares . . . Dan R's WARP
Berkman . . . . Pujols
Pujols . . . . . H. Ramirez
Beltran . . . . Utley
H. Ramirez . . . C. Jones
Utley . . . . . Teixeira
Reyes . . . . . Berkman
Wright . . . . . Wright
McLouth . . . . Beltran
A. Gonzalez . . M. Ramirez

First, VORP is simply an offensive rating; it doesn't take defense into account. Also, while it is likewise created by BP, it isn't related to WARP in any way, I believe.

Looking at the top 9 you listed (excluding pitchers since, like you said, Dan hasn't rated pitchers) we see Hanley Ramirez, Chase Utley & Chipper Jones have jumped up the list in Dan's evaluation, above players from easier positions defensively, while Reyes is the only one that drops down the list among ballplayers from the more difficult end of the defensive spectrum (sorry, I always get the directions, left-right, mixed up). That doesn't take into account how well they play defense at their respective positions. For example, I believe that every defensive measure (play-by-play anyways) rates Pujols as the top defensive 1B of the past few years, let alone '08. That has an effect on the difference in ratings (as well as the fact that he was insanely good offensively this year: looking at bb-ref's sabermetric stats, OPS+, RC & BtRuns, as well as considering their defense--how does Pujols end up behind Berkman? Berkman's WPA advantage doesn't seem that big . . . ). I don't know the defensive merits of Reyes & Beltran in '08 so I don't know if that had an effect on their drop among the top 9 from WS to Dan's WARP, but I believe I've heard that McLouth's defense isn't so hot while Hanley (I almost typed Horacio, Yikes!) Ramirez's has improved from worst in the league to nearly average according to some (and hasn't Chris Dial argued that Chipper's defense at 3B is underrated. I'm not sure if I recall correctly or not . . . ).

Well, no, Dan says it's as much as 2.5-3 WS too low for a really good corner OF.

The issue isn't only corner OF, it is every position--Win Shares gives too little weight to defense, too much to offense, overrating poor defensive players, underrating good ones, overrating ballplayers from easier defensive positions while underrating the up-the-middle positions. That's why Paul Waner will receive 5th place votes, because his defensive value makes up for any weaknesses OPS+ & WS see in comparison to other RF.

Sunnyday--I didn't include pitchers, but I did a few by hand, trusting BP's defensive adjustments for now. After factoring in league strength, Cliff Lee blew away the field at 9.5, Lincecum and Santana were 6.8, Halladay was 6.7, and no one else was above 6.

AROM--definitely Jewish, but I've barely stepped into a synagogue since my Bar Mitzvah.

Tiboreau--I wouldn't say precisely that Win Shares "underrates" defense *on the whole*. It makes two mistakes: first, its intrinsic positional weights are off (an average fielding SS should have about 6 more Fielding WS than an average fielding corner outfielder, and 8 more than an average fielding first baseman); and second, it compresses the impact of fielding quality into too small of a range. The spread of true fielding talent, according to modern PBP metrics, is something like -8 to +8 at first without counting scooping, -15 to +15 at corner outfield, and -20 to +20 at third and up the middle. (These are BALLPARK figures; I didn't bother to measure actual standard deviations on my data so don't kill me). In any given year, scores with an absolute value greater than 30 can be observed at up-the-middle positions, although they are unlikely to repeat themselves (just as a batting champion will often hit over .350 but only the rarest of players hits .350 over multiple seasons). So you should be seeing gaps of about 65 Fielding WS between players at the same position over 5-year stretches, and as much as 20 Fielding WS between players at the same position in a single year.

Needless to say, that's not the case. But that doesn't necessarily mean that good fielders are underrated--it could just as easily mean that bad fielders are overrated. (In fact, both are true). What it does mean is that Win Shares doesn't achieve its goal of accurately allocating a team's wins among its players.

591. AROM
Posted: October 03, 2008 at 03:24 PM (#2966181)

Looking at simple runs saved above average, Lee comes in at +51 and Halladay +43. Ballpark and defense adjustments not included. How does Lee jump to almost 3 wins above Halladay? Is the Blue Jay defense that good? Are you just basing this on runs, or expected runs from components, or some sort of support neutral W-L record? Does this account for the documented difference in quality of teams faced?

592. sunnyday2
Posted: October 03, 2008 at 03:26 PM (#2966184)

Most people seem to think a starting pitcher couldn't possibly win an MVP award, or more to the point, shouldn't, because of playing time issues.

I would say that in terms of effectiveness Cliff Lee surely was the "best" player in the AL this year, in part because nobody had a really kick-ass year (no position players) and partly because Cliff's year was sooo good.

I'm guessing that Lee will be lucky to hit the bottom 3 in the official MVP voting, or will not make the top 10 at all. My feeling is he is probably top 3-5, don't know if he's #1 or not. Halladay should also be top 10 IMO.

What is ironic, of course, is that those who say a starting pitcher can't be MVP then turn around and favor closers. I guess the argument hinges on the number of games, then, rather than the number of innings (outs) accounted for.

I'm not convinced any of the NL pitchers are top 10, but Lincecum is at least close and he is the Cy, though Webb had a pretty good year too.

But anyway, I like that your system gives starting pitchers a shot, unlike WS, whereas VORP seems to rate them too highly.

Here's the nitty-gritty. Dewan has Toronto's defense at 44 plays and 42 bases above average, so that's +33.5 runs. Doing the catcher D by hand, Toronto's catchers had 11 errors, 6 passed balls, 86 stolen bases allowed, and 37 caught stealing, versus league averages of 10.5, 10.9, 93.4, and 34.9, so that's 3 more runs, for +36.5 total. Halladay pitched 246 of the team's 1446.7 innings, so his defensive support was 36.5*246/1446.7 = 6.2 runs, which on top of the 88 he allowed makes 94.2. Then I multiply that by .963 to account for league strength, resulting in 90.7 runs allowed.

Using a 99 5-year park factor for the Blue Jays, a league-average offense would have scored 4.78*.99*246/9 = 129.3 runs in 246 innings. 129.3 runs scored plus 90.7 runs allowed is 220 total runs in 246 innings, which yields a Pythagorean exponent of 1.81 and a winning percentage of .655. A .655 winning percentage in 17% of a team's games translates to 4.27 wins above average. Add on 2.1 wins per 200 innings for replacement level, and Halladay comes out at 6.85 WARP.

Now, Lee. Dewan has a very interesting take on Cleveland's defense--they were exactly average at preventing hits on balls in play, but quite good at keeping those hits to singles (0 plays and 29 bases above average). This is principally attributable to Gutiérrez and secondarily to Sizemore. That comes out to a team defense of +7 runs. Cleveland's catchers were league average (12 errors, 13 passed balls, 67 steals, 27 caught stealing). Lee pitched 223.3 of the team's 1437 innings, so his defensive support was 7*223.3/1437 = 1 run, which on top of the 68 runs he allowed makes 69. Multiplying by .963 for league strength gives a result of 66.4 runs allowed.

Using a 100 5-year park factor for the Indians, a league-average offense would have scored 4.78*223.3/9 = 118.6 runs in 223.3 innings. 118.6 runs scored plus 66.4 runs allowed is 185 total runs in 223.3 innings, which yields a Pythagorean exponent of 1.77 and a winning percentage of .737. A .737 winning percentage in 15.5% of a team's games translates to 5.95 wins above average. Add on 2.1 wins per 200 innings for replacement level, and Lee comes out at 8.3 WARP.
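The two walkthroughs above follow the same steps, so they can be sketched as one small Python function. This is only an illustration, not Dan's actual spreadsheet: the function and parameter names are made up here, and the exponent rule (total runs per nine innings raised to the 0.285 power) is an assumption, picked because it reproduces the 1.81 and 1.77 exponents quoted above. The post does not say which Pythagorean variant was used. Treating the share of team innings as the share of a 162-game schedule is likewise inferred from the arithmetic.

```python
def pitcher_warp(runs_allowed, innings, team_innings, team_defense_runs,
                 park_factor, league_strength=0.963, league_rpg=4.78,
                 repl_wins_per_200=2.1, games=162, pyth_power=0.285):
    """Rough pitcher WARP following the steps described in the post."""
    share = innings / team_innings
    # Add back the pitcher's share of the runs his fielders saved,
    # then scale for league strength.
    adj_ra = (runs_allowed + team_defense_runs * share) * league_strength
    # Runs a league-average offense would score in the same innings and park.
    lg_runs = league_rpg * park_factor * innings / 9
    # ASSUMPTION: exponent = (total runs per 9 innings) ** 0.285
    runs_per_nine = (adj_ra + lg_runs) / innings * 9
    exponent = runs_per_nine ** pyth_power
    win_pct = lg_runs ** exponent / (lg_runs ** exponent + adj_ra ** exponent)
    # Wins above a .500 pitcher over this share of the team's games.
    wins_above_avg = (win_pct - 0.5) * share * games
    # Add the gap between average and replacement level.
    return wins_above_avg + repl_wins_per_200 * innings / 200


halladay = pitcher_warp(88, 246, 1446.7, 36.5, 0.99)
lee = pitcher_warp(68, 223.3, 1437, 7, 1.00)
print(f"Halladay {halladay:.2f}, Lee {lee:.2f}")
```

Carried at full precision, both results land within about a tenth of a win of the 6.85 and 8.3 figures above; the small differences come from the intermediate rounding in the post.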

Your instinct was right, AROM, that the gap was too large, and the reason why is incredibly stupid--I did the same analysis of Lee a few weeks ago to see what he was on track for, and didn't check to see that his last three starts were poor. Apologies. At 8.3 WARP, he is still clearly the best player in the AL, but definitely not as valuable as Pujols.

For now, I do not take quality of opposing hitters into account. I'd be interested to see that data.

Chipper at #11, sunnyday? Yes, he missed a good chunk of time, over 20% of the season. But he had a 177 OPS+, and according to both Dewan and Dial (whose ratings are based on separate and independently compiled data sets) he played a meaningfully above-average third base as well. (Anecdotally, I'd support this--I had Jones on my fantasy team this year and thus watched him play a lot, and I was consistently impressed with his fielding, but I'm no scout.)

Look at it this way--a replacement third baseman will have about a 75 OPS+ (think José Castillo or Wes Helms) and field at an average rate. Jones was about a +10 per season third baseman this year per both Dewan and Dial, so let's move those runs from the defensive side of the ledger to the offensive one, where they're worth 11 points of OPS+. So, you've got a 177 + 11 = 188 OPS+ 3B who fields his position at the league average for 78% of the season, and a 75 OPS+ 3B who fields his position at the league average for 22% of the season. Combine the two, and you have a full season of play at a 163 OPS+ with average fielding at third. That's roughly Eddie Mathews in 1959--a very strong MVP candidate in a typical year.
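The blend in the paragraph above is just a playing-time-weighted average, which a few lines of Python can check. The function name is hypothetical, and the 75 OPS+ replacement level and the 11-point fielding credit are taken straight from the post rather than derived here.

```python
def blended_ops_plus(player_ops_plus, season_share, replacement_ops_plus=75):
    """Playing-time-weighted OPS+ of a player plus his replacement-level fill-in."""
    return (player_ops_plus * season_share
            + replacement_ops_plus * (1 - season_share))


# 177 OPS+ plus 11 points of fielding credit, over 78% of the season
chipper = blended_ops_plus(177 + 11, 0.78)
print(round(chipper))
```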

595. Paul Wendt
Posted: October 03, 2008 at 06:23 PM (#2966434)

#593 - is that a walk through the current draft of WARP for pitchers (who don't bat)?

. . . Add on 2.1 wins per 200 innings for replacement level, and Lee comes out at 8.3 WARP.
estimated from some pitcher analogue to the moving average of the three worst regular shortstops in the league?

596. AROM
Posted: October 03, 2008 at 06:38 PM (#2966460)

Thanks Dan for the full explanation. Lee faced easier opponents than Halladay did, but if the adjustment for this is made it's only about 1 win. Lee still has the edge on Halladay.

Paul Wendt, that's the walkthrough for WARP1 in the modern era, which is what matters for MVP/Cy Young voting. For cross-era comparisons you have to account for standard deviations (which I obviously can handle), innings translation (which still drives me crazy), and potential changes in replacement level over time (which I haven't begun to address). No, the replacement level comes from Tangotiger's placement at .410 (see http://www.insidethebook.com/ee/index.php/site/comments/the_replacement_pitchers/), plus an extra 0.1 win per 200 innings for the gap between the average long reliever (who when promoted to spot starter is a .410 pitcher) and the replacement pitcher who becomes the team's new long reliever.
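For what it's worth, the 2.1 wins per 200 innings used in the walkthrough seems to fall out of those two pieces. A quick sketch, assuming 9 innings per game and a .500 average pitcher (the function name is made up for illustration):

```python
def replacement_wins_per_200(repl_win_pct=0.410, long_relief_gap=0.1, innings=200):
    """Wins between an average (.500) pitcher and replacement over `innings`."""
    games = innings / 9
    return (0.5 - repl_win_pct) * games + long_relief_gap


gap = replacement_wins_per_200()
print(round(gap, 1))
```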

Also, again, Paul, just to clarify on how the position player replacement levels are calculated: The "base" levels do NOT come from the worst-regulars average. They come from Nate Silver's Freely Available Talent (FAT) study. I use the worst-regulars average (the worst 3/8ths, not the worst 3 period) to trace the defensive spectrum back in time, by holding the gap between the FAT level and the worst-regulars average constant.

E.g., from 1985 to 2005 (the period covered by the FAT study), the worst 3/8 of starting MLB first basemen averaged 0.3 batting + baserunning + fielding wins above average per 162 games. Nate found that freely available first basemen over that timespan averaged 0.2 wins below average per 162 games. So this tells us that replacement 1B are 0.5 wins a season worse than the average performance of the worst 3/8 of starting 1B in the league. If my published replacement level at 1B in a given year is -1.0, that means that the worst 3/8 of starting 1B in the surrounding 9-year period averaged 0.5 wins below average per season.
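The first-base example above boils down to one subtraction, sketched here in Python. The names are hypothetical; the only inputs are the numbers quoted in the paragraph.

```python
def replacement_level(worst_regulars_avg, fat_gap):
    """Replacement level = worst-3/8-of-regulars average minus the fixed FAT gap."""
    return worst_regulars_avg - fat_gap


# 1985-2005 calibration for 1B: worst regulars at +0.3, freely available talent at -0.2
gap_1b = 0.3 - (-0.2)

# A 9-year window in which the worst 3/8 of starting 1B averaged -0.5 wins
repl_1b = replacement_level(-0.5, gap_1b)
print(round(gap_1b, 1), round(repl_1b, 1))
```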

599. jimd
Posted: October 04, 2008 at 12:33 AM (#2966633)

Assume that it's .500 in the NL parks and that those were half the games (approx 252-252 .500). Then in the AL parks it would be approx 324-180 .643. That's a lot to explain away by the NL team not having a useful DH on the major league roster. If it's so important they could, you know, send down a pitcher and call up some 1b/of prospect from the minors for the series.

600. Blackadder
Posted: October 07, 2008 at 10:34 PM (#2973860)

Dan, have you computed these for 2006 and 2007? If so, I would be curious to see the leaderboards for those years.

## Reader Comments and Retorts


Wow! That's very surprising. Are RCAP park-adjusted? Do they include SB/CS? Do they include double play avoidance? All three are major benefits to Campaneris which I assumed were included. Also, by subtracting 4 years from McCovey and 2 from Campaneris, I'm giving Campaneris more below-average time than McCovey. DCW3, how would it look if you just did above-average seasons, ignoring all below-average ones?

The way I calculate them, RCAP are park-adjusted, but not era-adjusted: for instance, Campaneris had a higher RCAP in 1970 than 1968, but since the offensive context was higher in '70, '68 was a more valuable season. They include SB/CS, but not DP avoidance. If I subtract out all the below-average seasons, it puts McCovey at 431 and Campaneris at 156.

the relative dearth of stars in the AL around 1950 (the only megastar was Williams, maybe Doby)

Yogi Berra and Joe DiMaggio say hi.

DiMag was on his last legs by 1950. Berra hadn't yet established himself as a megastar, although he would within a couple of years.

-- MWE

DiMaggio's last superstar year was '48. He played great in '49 but missed half the year, played at an All-Star but not superstar level in '50, and was done after '51.

Just checking the data, the time period I'm really talking about is 1951 to 1955, when integration really began in earnest and the AL lost its best player to Korea. The top 7 players by WARP1 (so not adjusted for standard deviation) during that stretch were Musial (NL), Snider (NL), Mantle (AL), Jackie Robinson (NL), Doby (AL), Ashburn (NL), and Hodges (NL).

played at an All-Star but not superstar level in '50

2nd best OPS+ in the league, in a highly unfavorable park, playing a mediocre but not terrible CF (-8 buggy FRAA).

I mean, if that's not a superstar, what is?

Statsite

Excluding 1999 of course, and 2007 until those are available. I compared each catcher to that year's league average, and controlling for pitcher handedness.

I hear in this year's Hardball Times TangoTiger will up the ante and publish this data controlling for pitcher.

Joe DiMaggio 32-122-.301/.585 SA led the league (because Ted Williams was hurt)

Berra 28-124-.322 arguably his best year, if not a star then that only refers to notoriety not value

Doby 25-102-.326 (Doby and Berra both 25 yrs; to say Doby was a star and not Berra cannot be right)

Kell, Wertz, Hoot Evers all 100 RBI for Detroit

Vern Stephens 30-144-.295

Dropo and Doerr also 100 RBI; Dropo, Doerr, Stephens, Pesky and Dom D. all scored 100 R

Billy Goodman led the league at .364 with 91 R as utility man

Dom DiMaggio .328/131 R led the league because somebody had to

Al Rosen 37-116-.287 just getting warmed up, the HR led the league

Raschi, Lopat, Reynolds/Lemon, Wynn, Feller

PS. Berra 1949 20-91-.277, Doby 24-85-.280; Berra finished ahead of Doby in MVP voting in both 1949 and 1950

Not arguing the larger point, but there were some pretty good ballplayers in the AL in 1950.

NL--Ennis led the league with 126 RBI

Jackie, Hodges, Reese, Snider, Furillo, Campy, Newk, Roe all had good years yet the Dodgers couldn't win the pennant

Stanky OB via H or BB > 300 times yet couldn't lead the league in R, Earl Torgeson did

Musial led the league at .348 and .596

Kiner 47-118

Kluszewski, Pafko, Hank Sauer

Roberts, Simmons, Konstanty, Jansen, Maglie; Spahn, Sain and Bickford (not Rain in '50 that was '48); Blackwell

Dunno what this shows. Some journeymen "dominated" in the NL, too, maybe. The Phillies won with a pretty mediocre lineup compared to the Dodgers. But "more NL players dominated than in the AL," except that sounds so much like an oxymoron.... But wait, 11 hitters in each league had 100 RBI. But 3 NL pitchers won 20 games, only 2 did in the AL.

SFrac: Percentage of the season played.

BTWAA: "Raw" batting wins above average.

DPWAA: Double play avoidance wins above average.

BRWAA: Baserunning wins above average.

FWAA: Fielding wins above average.

PAdj: Wins added/subtracted to correct for park effect.

TWAA: Total wins above average (BTWAA+DPWAA+BRWAA+FWAA+PAdj).

PAWAA: Wins above average an average player at the given position in the given league-season would have produced in a full year's worth of play.

WAP1: Wins above positional average (TWAA-(PAWAA*SFrac)).

LgAdj: Ratio of the regression-projected standard deviation of the given league-season to the 2005 standard deviation.

WAP2: Wins above positional average, adjusted for standard deviation. (WAP1*LgAdj)

A-Rp: Gap in standard deviation-adjusted wins between a positional average and a replacement player at the given position per full year's worth of play.

WARP2: Standard deviation-adjusted wins above replacement player (WAP2 + (A-Rp*SFrac)).

TOTL is career totals, TXBR is career totals excluding sub-replacement seasons, and TXBA is career totals excluding sub-positional-average seasons.
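To make the definitions above concrete, here is a small Python sketch that chains them together for one season. The class and field names are just illustrative; the inputs are McCovey's 1969 row from the table further down, and the outputs should agree with that row after rounding to one decimal.

```python
from dataclasses import dataclass


@dataclass
class Season:
    sfrac: float   # share of the season played
    btwaa: float   # "raw" batting wins above average
    dpwaa: float   # double play avoidance
    brwaa: float   # baserunning
    fwaa: float    # fielding
    padj: float    # park adjustment
    pawaa: float   # positional average, per full season
    lgadj: float   # standard deviation adjustment
    a_rp: float    # average-to-replacement gap, per full season

    @property
    def twaa(self):
        return self.btwaa + self.dpwaa + self.brwaa + self.fwaa + self.padj

    @property
    def wap1(self):
        return self.twaa - self.pawaa * self.sfrac

    @property
    def wap2(self):
        return self.wap1 * self.lgadj

    @property
    def warp2(self):
        return self.wap2 + self.a_rp * self.sfrac


mccovey_1969 = Season(0.92, 8.7, 0.2, -0.2, -0.4, 0.3, 2.6, 0.914, 2.5)
print(round(mccovey_1969.twaa, 1), round(mccovey_1969.wap1, 1),
      round(mccovey_1969.wap2, 1), round(mccovey_1969.warp2, 1))
```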

Dagoberto Campaneris

`Year SFrac BTWAA DPWAA BRWAA FWAA Padj TWAA PAWAA WAP1 LgAdj WAP2 A-Rp WARP2`

1964 0.42 -0.3 +0.1 +0.2 -0.5 -0.1 -0.5 +0.9 -0.9 0.983 -0.9 4.2 +0.9

1965 0.94 +0.7 +0.3 +0.3 -1.4 +0.1 +0.1 -0.2 +0.2 0.977 +0.2 3.1 +3.1

1966 0.91 +0.0 +0.1 +0.8 -0.6 +0.3 +0.5 -0.3 +0.8 0.999 +0.8 3.2 +3.7

1967 0.97 -0.7 +0.3 +0.6 -0.8 +0.2 -0.3 -1.0 +0.7 0.985 +0.7 2.6 +3.1

1968 1.06 +1.8 +0.3 +0.5 +0.7 +0.5 +3.7 -1.0 +4.7 1.003 +4.8 2.7 +7.6

1969 0.86 -1.8 +0.2 +1.0 +0.1 +0.3 -0.2 -0.1 -0.1 0.948 -0.1 3.8 +3.2

1970 0.95 +1.4 +0.3 +0.5 +1.0 +0.4 +3.6 -0.8 +4.3 0.949 +4.1 3.1 +7.0

1971 0.91 -1.6 +0.4 +0.5 +0.3 +0.1 -0.3 -1.7 +1.3 0.962 +1.3 2.2 +3.3

1972 1.05 -1.5 +0.2 +0.9 +1.7 +0.4 +1.8 -1.4 +3.2 0.970 +3.1 2.4 +5.6

1973 0.97 -1.7 +0.1 +0.2 +1.8 +0.4 +0.9 -2.9 +3.7 0.947 +3.5 1.6 +5.0

1974 0.85 +0.6 +0.1 +0.2 +0.7 +0.5 +2.2 -1.7 +3.6 0.963 +3.5 2.8 +5.8

1975 0.84 -0.4 +0.1 +0.0 -0.2 +0.1 -0.3 -2.2 +1.5 0.943 +1.4 2.2 +3.3

1976 0.91 -0.5 +0.3 +0.4 +0.4 +0.3 +0.9 -1.2 +2.0 0.948 +1.9 3.1 +4.7

1977 0.89 -1.4 +0.4 -0.2 +1.7 -0.1 +0.5 -2.3 +2.5 0.907 +2.3 2.1 +4.2

1978 0.44 -2.4 +0.1 +0.3 -0.2 +0.0 -2.3 -1.9 -1.5 0.919 -1.3 2.4 -0.3

1979 0.40 -1.5 -0.1 -0.1 +0.4 +0.1 -1.3 -2.7 -0.2 0.913 -0.2 1.7 +0.5

1980 0.33 -0.8 +0.1 -0.1 -0.4 +0.0 -1.1 -2.0 -0.5 0.929 -0.4 2.3 +0.3

1981 0.20 -0.2 +0.2 +0.0 -0.7 +0.0 -0.7 +0.1 -0.7 0.950 -0.7 2.2 -0.2

1983 0.22 +0.0 +0.0 -0.4 -0.2 +0.1 -0.5 -0.3 -0.4 0.954 -0.4 2.3 +0.1

TOTL 14.12 -10.3 +3.6 +5.6 +4.1 +3.7 +6.6 -1.2 24.3 0.964 23.4 2.7 61.0

TXBR 13.49 -7.7 +3.3 +5.3 +4.9 +3.7 +9.6 -1.2 26.4 0.961 25.4 2.7 61.5

TXBA 11.25 -3.3 +3.0 +4.7 +5.5 +3.4 13.2 -1.4 28.5 0.961 27.4 2.6 56.5

Willie McCovey

`Year SFrac BTWAA DPWAA BRWAA FWAA Padj TWAA PAWAA WAP1 LgAdj WAP2 A-Rp WARP2`

1959 0.34 +2.8 -0.2 +0.0 +0.0 +0.1 +2.8 +2.3 +2.0 0.965 +1.9 2.5 +2.8

1960 0.47 +1.2 +0.1 -0.1 -0.4 +0.3 +1.2 +1.8 +0.3 0.955 +0.3 2.0 +1.3

1961 0.58 +1.5 +0.0 -0.2 +0.4 +0.2 +2.0 +1.8 +0.9 0.962 +0.9 2.0 +2.0

1962 0.38 +2.1 +0.0 -0.1 +0.1 +0.1 +2.2 +2.3 +1.4 0.900 +1.2 2.8 +2.3

1963 0.94 +5.5 +0.2 -0.1 -0.8 +0.1 +4.9 +2.4 +2.7 0.942 +2.5 3.0 +5.3

1964 0.65 +1.3 +0.0 -0.1 -0.9 -0.1 +0.3 +1.8 -0.9 0.930 -0.8 2.4 +0.7

1965 0.94 +5.4 +0.2 -0.4 -0.2 -0.3 +4.7 +2.0 +2.8 0.937 +2.7 2.2 +4.7

1966 0.87 +5.7 +0.2 -0.1 -0.5 -0.3 +5.0 +2.0 +3.2 0.950 +3.1 2.1 +4.9

1967 0.80 +4.7 +0.1 -0.2 -0.3 +0.1 +4.4 +1.2 +3.4 0.947 +3.2 1.4 +4.3

1968 0.91 +6.8 +0.3 -0.1 -0.7 +0.0 +6.4 +1.8 +4.8 0.973 +4.6 1.9 +6.3

1969 0.92 +8.7 +0.2 -0.2 -0.4 +0.3 +8.6 +2.6 +6.2 0.914 +5.7 2.5 +8.0

1970 0.93 +6.6 +0.3 -0.2 +0.0 +0.1 +6.8 +2.4 +4.6 0.919 +4.3 2.3 +6.4

1971 0.60 +3.0 +0.3 -0.3 -0.8 +0.1 +2.4 +2.5 +0.9 0.940 +0.8 2.4 +2.2

1972 0.47 +0.5 +0.1 -0.5 -0.4 +0.0 -0.4 +2.1 -1.4 0.950 -1.3 2.0 -0.4

1973 0.73 +4.7 +0.2 -0.3 -0.5 -0.4 +3.8 +1.5 +2.7 0.948 +2.6 1.4 +3.6

1974 0.65 +3.6 -0.1 -0.3 -0.6 +0.3 +3.0 +1.6 +1.9 0.932 +1.8 1.5 +2.8

1975 0.70 +1.8 +0.1 -0.3 -0.1 +0.4 +1.9 +1.7 +0.7 0.936 +0.7 1.6 +1.8

1976 0.37 -0.7 +0.1 +0.0 -0.1 +0.2 -0.5 +1.3 -0.9 0.929 -0.9 1.4 -0.4

1977 0.80 +2.7 -0.4 -0.4 -0.7 +0.0 +1.2 +1.8 -0.2 0.972 -0.2 2.0 +1.4

1978 0.58 -0.2 -0.1 -0.3 -0.2 +0.2 -0.7 +1.4 -1.5 0.988 -1.5 1.8 -0.5

1979 0.58 +0.3 -0.1 -0.1 -0.5 +0.4 -0.1 +1.5 -0.9 0.981 -0.9 1.8 +0.2

1980 0.19 -0.5 +0.0 +0.0 -0.1 +0.0 -0.5 +1.5 -0.8 0.985 -0.8 1.9 -0.4

TOTL 14.39 67.6 +1.5 -4.0 -7.6 +1.9 59.5 +1.9 32.0 0.935 29.9 2.0 59.4

TXBR 12.78 68.6 +1.5 -3.3 -6.7 +1.5 61.5 +1.9 36.7 0.938 34.4 2.1 61.0

TXBA 10.75 64.3 +1.9 -2.7 -4.7 +1.3 60.1 +2.0 38.7 0.939 36.4 2.1 58.8

Lots of interesting stuff to see here. McCovey's "raw" hitting is literally 76 wins better than Campaneris's, but Dagoberto chips away at that advantage with better double play avoidance (2 wins' difference), baserunning (8.5 wins), fielding (11.5 wins), and more pitcher-friendly parks (2 wins), reducing McCovey's advantage to 52 wins.
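The component-by-component comparison above can be reproduced from the career rows. The figures quoted match the TXBR rows (totals excluding sub-replacement seasons) most closely, so this Python sketch uses those; that choice is inferred from the numbers, not stated in the post.

```python
# TXBR career rows from the two tables above
mccovey = {"BTWAA": 68.6, "DPWAA": 1.5, "BRWAA": -3.3, "FWAA": -6.7, "PAdj": 1.5}
campaneris = {"BTWAA": -7.7, "DPWAA": 3.3, "BRWAA": 5.3, "FWAA": 4.9, "PAdj": 3.7}

# Positive = McCovey ahead, negative = Campaneris ahead
edge = {k: mccovey[k] - campaneris[k] for k in mccovey}
net = sum(edge.values())

for component, wins in edge.items():
    print(f"{component}: {wins:+.1f}")
print(f"Net McCovey advantage: {net:.1f} wins")
```

The net differs from the tables' TWAA gap by a tenth or two because the published components are already rounded.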

Then we have to account for the fact that Campaneris played shortstop and McCovey played first base. We can do this either by comparing to positional average or to replacement. Compared to positional average, if we ignore all below-average seasons, McCovey still exceeded the average 1B of his day by more than Campaneris exceeded the average SS of his: 38.7 wins above positional average for McCovey, 28.5 for Campaneris. Pretty big difference. McCovey played in slightly higher standard deviation leagues than Campaneris did; adjusting for that knocks 2.3 wins off McCovey and 1.1 off of Campaneris, which still leaves McCovey with a 36.4-27.4 advantage. I stand corrected: I thought Campaneris and McCovey would be comparable relative to positional average; they are not.

So why do I have Campaneris slightly higher? Because I see the gap between positional average and replacement as larger at SS (2.7 wins per season) than at 1B (2.1 wins per season) when they played. Remember that my replacement levels are calculated using the worst 3/8 of regulars after adjusting for the leaguewide standard deviation, so the only three things that can account for different-sized gaps between positional average and replacement are:

1. The gap between the worst-regulars average and Nate Silver's FAT levels for the 1985-2005 period. But this actually favors 1B--I subtract 0.3 wins from the worst-regulars average to get replacement level for SS, and 0.5 wins for 1B.

2. The standard deviation of performance *within* the position, which I intentionally do not correct for. Shortstop is, as Nate Silver puts it, a "feast or famine" position--you tend to have a few superstars who are just extraordinary athletes, and then a lot of guys who are really overmatched. By contrast, at 1B, you can more or less play it if you can walk, so while SS are separated by their hitting AND their fielding, 1B are separated basically by their hitting alone. This means that 1B are likely to be bunched much more closely together around positional average, while SS are likely to be spread out much further from positional average.

3. Kurtosis. It could be the case that even keeping standard deviation constant, SS were high-kurtosis in this period ("fat tails" and then a tight cluster in the middle), while 1B were low-kurtosis ("shoulders" near the center of the distribution and very few outliers). This isn't necessarily characteristic of the two positions over time, but it might be true in the 60s and 70s.

So there you have it. Compared to the average player at their positions, McCovey was indubitably better; compared to the freely available talent level, they were just about equal.
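A rough check of that conclusion in Python, applying WARP2 = WAP2 + A-Rp*SFrac to the TXBR career rows. The inputs are the rounded table values, so the results differ from the published 61.5 and 61.0 by a few tenths.

```python
def warp2(wap2, a_rp, sfrac):
    """Wins above replacement from wins above positional average."""
    return wap2 + a_rp * sfrac


campaneris = warp2(wap2=25.4, a_rp=2.7, sfrac=13.49)
mccovey = warp2(wap2=34.4, a_rp=2.1, sfrac=12.78)

print(f"Gap vs. positional average: {34.4 - 25.4:.1f} wins for McCovey")
print(f"WARP2: Campaneris {campaneris:.1f}, McCovey {mccovey:.1f}")
```

The larger average-to-replacement gap at shortstop, applied over 13-plus seasons, is what erases McCovey's nine-win lead over positional average.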

There are other factors I didn't mention, but the two that leap to mind probably even out--better in-season durability for Campaneris, tougher league for McCovey.

Which is pretty obvious if you just look at Bill James' rankings. They're not even close to following the career WS totals.

I would agree with whoever said/wherever it was said that the big contribution of this project to world peace is in adjusting WS to 162 games (fairness to 19C players), WWII (and WWI and Korea) credit, and of course NeL MLEs. MiL MLEs are a much lesser matter in part because, as Chris Cobb showed in another thread, we haven't really elected anybody because of them (OK, some might argue Charley Keller, I don't know). Not only that but MiL MLEs were out there though used for a different purpose.

I still hope somebody writes a book out of all of this, though the logistical issues are extreme--e.g. who owns all of the work that's been posted here? Is it public domain? Could Chris' and Doc's MLEs be published w/o their permission, or would they give their permission? What about Dan's work? And who has the time to write it, not to mention the skills?

But all of those questions aside, the adjustments to WS (and to WARP) that have been done here are really cutting edge stuff. Maybe they already have a wider audience on the Web than they would ever get in print anyway. But even so, you'll agree that navigating all of it on this blog is a challenge.

I guess I'm not sure what it *was* designed for, then. It seems like the response is "showing players' proportional contributions to wins," but I don't see why WS accomplishes that any more than WARP or my system does--in fact, I think it doesn't accomplish that goal as well as either WARP approach. If the whole thing is just a gimmick so that player wins add up to team wins, then all you have to do is just take 40.5 wins and allocate them according to PA, another 40.5 and allocate them according to IP and defensive innings played, and add on wins above average and you've accomplished the same thing without creating any of the distortions that WS introduces. Why go to the trouble of making all these "cutting-edge adjustments" to WS when you can just get it right in the first place?

Maybe this should be moved to the uberstats thread.

I would argue Averill, too.

First to John Murphy:

I would argue Averill, too.

I didn't list Averill (in the list on the 2007 ballot discussion thread) because that list was looking at players that we elected that the HoF hasn't elected, and the reasons for the difference. Since the HoF elected Averill, he didn't show up. I would agree that minor-league credit helped his case with us, as it did Charley Keller's.

Next to Sunnyday2:

For the record, if someone wanted to do the work of putting together the book, I would happily give permission for my MLEs to be used, for whatever they are worth, if my permission were needed.

I think members of the electorate have the necessary skill to write the book, though probably not the time.

Finally, to Danr:

The main sabermetric advancement that James was pursuing in win shares, I think, was a satisfactory way of measuring pitching, batting, and fielding values together in terms of wins. His definition of "satisfactory" had three criteria, I think:

(1) it should include a good measure of fielding value [since that is the basic piece that was lacking];

(2) it should not be built on "value above average," since in James's view using average as a baseline creates false impressions about the nature of player value (not that his choosing a zero point rather than seeking replacement level doesn't run into the same problem); and

(3) it should be tied in some meaningful way to actual runs and actual wins (his philosophy here is similar to his philosophy with his runs created formulas, which are always brought back to actual runs scored). The choice of having the sum of a team's win shares equal three times team wins is a gimmick, of course, but some defined way of making win shares correspond to actual wins is not a gimmick but a philosophical commitment about how to best represent value.

You could compare my remarks here to what James says in the introductory chapters of _Win Shares_ to see how well I have represented his own claims in the matter.

As for 3, I am reminded of dark matter. Just because we can't measure it and don't know what it is doesn't mean it's not out there. The same goes for the "dumb luck" that accounts for the divergences between RC formulae and actual runs and between Pythag wins and actual wins. Just because we don't know what caused it doesn't make it meaningless. I don't think it's off the rails to incorporate "it" even if we don't know what "it" is.

I never said it was "off the rails." I said it was "widely disputed." I definitely consider the question of whether to include run estimation and Pythagorean errors in your evaluation as normative, not positive--there's no one right answer. Certainly there was something going on with those 1890s Braves...I tend to think Tom Glavine-style "clutch pitching" from the stretch may often be a major contributing factor that deserves to be credited. Or fielders deciding whether to take a risky dive or not based on the leverage of the situation...on the offensive side, it seems much more likely to me that it's all luck.

I have been stating for years now that you just can't rate players without using WS/162 in conjunction with straight WS numbers.

Agreed that performance rates need to be considered, as well as career totals, but note that WS/162 as a rate stat can be misleading. For example, for a player with a significant number of appearances as a pinch hitter, such as Enos Slaughter, WS/162 underestimates his rate of performance. WS per plate appearance is better (but ought to be adjusted relative to the OBP of the era in question).

Cool, Rallymonkey!

1. I was always under the impression that while clutch *ability* did not exist, clutch *performance* did--in other words, that there were substantial discrepancies each year between a player's actual value to his team and the value you would expect from his statistics, but that the cause and distribution of those discrepancies was totally random. It turns out that a player's offensive statistics are a *damn* good predictor of his WPA, far better than I would have thought. Just multiplying batting + baserunning wins above average for any given player-season (as measured by my WARP) by .85 gets you a sweet 90% r-squared on the WPA for that player-season. So the first thing to say is that just using a run estimator and Pythagoras gets you pretty damn close to where you want to go.

2. The one thing that *leaps* out to me about the players whose career WPA most exceeds what we would expect from their offensive statistics is that the leaders are all Rockies. Something is seriously wrong in baseball-reference's park factor calculations, because it is dinging those Colorado hitters (Helton, Walker, Galarraga, Castilla, Bichette) far more than WPA thinks is appropriate. My guess is that the Rockies' hitters learn to take advantage of the park, and that that gives them a bigger-than-average home field advantage, a dynamic which the standard park factor calculation (which assumes that home and away teams benefit equally from park effects) cannot take into account.

3. For HoM purposes, here are the players who showed the greatest and smallest gaps between their BWAA + BRWAA and their WPA from 1974 to 2005:

Leaders

1. Todd Helton, +10.7 wins (1.38 per season)

2. Larry Walker, +10.4 (.94)

3. Andrés Galarraga, +9.2 (.74)

4. Chipper Jones, +9.0 (.86)

5. Dale Murphy, +8.0 (.60)--this will likely get him on my 2009 ballot

6. Vladimir Guerrero, +7.9 (1.00)--I suspect this is because he hits everything the same, be it a 100mph closer fastball or a junkball

7. Vinny Castilla, +7.8 (.77)

8. Andre Dawson, +6.8 (.43)

9. Shawn Green, +6.7 (.67)

10. Fred McGriff, +6.6 (.45)--helps an otherwise weak case

11. Andruw Jones, +6.3 (.74)

12. Graig Nettles, +6.1 (.57)--very relevant for 3B ranking purposes

13. Ryne Sandberg, +5.9 (.43)--would have been good to know when we were voting on 2B

14. Álex Rodríguez, +5.5 (.57)--and they call him a choker!

15. Jeromy Burnitz, +5.5 (.69)

16. Luis González, +5.3 (.40)--same comment as McGriff

17. Terry Pendleton, +5.2 (.50)

18. Marquis Grissom, +5.1 (.41)

19. Gary Gaetti, +5.1 (.36)

20. Mike Piazza, +4.9 (.49)

21. Darin Erstad, +4.7 (.63)--it's true!

22. Brian Giles, +4.7 (.58)--in my PHoM

23. Jim Rice, +4.6 (.34)--takes a tiny bit of sting off his likely HoF induction

24. George Brett, +4.5 (.26)

Everyone else is below +4.5. Notables by rate include Torii Hunter (+.75), Bip Roberts (+.71), Lance Berkman (+.7), José Vidro (+.65), Bob Horner (+.59), Ron LeFlore (+.56), and Lonnie Smith (+.56)

Trailers

1. Travis Fryman, -6.4 (-.61)

2. Don Baylor, -5.7 (-.46)

3. Mickey Tettleton, -5.6 (-.80)

4. Omar Vizquel, -5.4 (-.41)

5. Jeff Conine, -4.8 (-.48)

6. Larry Bowa, -4.6 (-.47)

7. Ken Griffey, Sr., -4.4 (-.42)

8. Ruben Sierra, -4.3 (-.37)

9. Rob Deer, -4.3 (-.69)

10. Jim Gantner, -4.2 (-.49)

11. Luis Alicea, -4.1 (-.87)

12. Tim Raines, -4.1 (-.28)

13. Don Mattingly, -4.0 (-.36)

14. Bob Boone, -3.9 (-.41)

15. Ken Caminiti, -3.9 (-.41)

16. Keith Hernandez, -3.8 (-.32)--this might be enough to drop him out of my PHoM

17. Mark McLemore, -3.8 (-.41)

18. Alan Trammell, -3.7 (-.28)--this would have dropped him a nudge on my SS rankings

19. Mike Bordick, -3.7 (-.42)

20. Raúl Mondesi, -3.6 (-.40)

21. Bill Mueller, -3.5 (-.57)

Everyone else is above -3.5. Notables by rate include Marty Barrett (-.66), Ichiro Suzuki (-.61; I suspect this is because no one is on base for his singles and because a lot of them are infield singles, so they're not much better than walks), Scott Brosius (-.56), Darren Daulton (-.55), Mike Pagliarulo (-.55), Craig Reynolds (-.54), Gene Richards (-.49), Ron Oester (-.49), Rich Aurilia (-.48), Jim Eisenreich (-.48), Howard Johnson (-.48), Tony Bernazard (-.47), Fernando Viña (-.47), and Jorge Posada (-.46; still in my PHoM).

If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.

Why is the multiplying by .85 necessary?

The .85 is necessary because that's how regression works--you get a closer fit to WPA if you multiply by .85 than if you don't. Basically, what that means is that 90% of WPA is accounted for by offensive statistics, and the remaining 10% is the timing of the events. So to predict WPA, you reduce the weight of the offensive stats from 100% to 90% (the other 5% is presumably because WPA must have a slightly smaller standard deviation or something than BWAA + BRWAA), and "fill in" the rest with league-average timing (that is to say, 0). Does that make sense?
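To make the ".85" and the "90% r-squared" concrete, here is a minimal sketch of a one-variable fit. It assumes a regression through the origin (the post does not say whether an intercept was used), and the five player-seasons are made-up numbers chosen only so the slope lands on .85; they are not Dan R's data.

```python
def fit_through_origin(x, y):
    """Least-squares slope for y ~ slope * x (no intercept), plus r-squared."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))  # unexplained variation
    ss_tot = sum((b - mean_y) ** 2 for b in y)                # total variation
    return slope, 1 - ss_res / ss_tot

# Hypothetical player-seasons: batting + baserunning wins above average vs. WPA.
offense_waa = [-2.0, -1.0, 0.0, 1.0, 2.0]
wpa = [-1.4, -1.25, 0.2, 0.45, 2.0]

slope, r_squared = fit_through_origin(offense_waa, wpa)
print(f"slope = {slope:.2f}, r-squared = {r_squared:.2f}")
```

The slope is the multiplier that best predicts WPA from the offensive stats, and r-squared is the share of the variation in WPA that the prediction accounts for; whatever is left over is the "timing" part.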

532. David Concepcion de la Desviacion Estandar (Dan R) Posted: July 24, 2008 at 10:32 PM (#2872114)

If anyone sees any patterns in these lists that might suggest which types of players tend to be more or less valuable than their offensive stats, I'd love to hear them.

This may only be by coincidence, but almost none of the positive WPA leaders played the up-the-middle infield positions (C, 2B, SS), and those who did (Sandberg, A-Rod, and Piazza) are the best sluggers ever at those positions. The positive list is composed almost entirely of sluggers with multiple 30-HR seasons AND 300+ career HR, except Erstad, Pendleton, and Grissom. Also, 12 of the top 23 are OF.

On the negative side, roughly half are middle infielders, and almost all showed limited power: only Baylor and Sierra have 300 career HR, and a couple who had power at their peak, HoJo and Mattingly, fizzled quickly.

Don't know if this is any help, but it's an observation nonetheless.

It appears that borderliners Dawson, Nettles, and Giles would be stronger PHOM cases.

Murphy may vault into the Freehan/Leach area, borderline PHOM.

Do any of the others have a shot at reaching your ballot, and/or are they amongst the Top 200 post-1893 non-Negro league position players?

I haven't yet decided whether I will adjust my rankings to reflect WPA's findings (besides the Rockies, who will definitely get an upward adjustment). I need to do some tests to see whether there is more variation here than we would expect merely from chance--if there isn't, I'd be disinclined to give credit for it.

In which case:

Jim Rice: BPRO BRAA: 290 (I assume this is at least in the same vicinity as your BWAA)

Fangraphs WPA/LI: +29.05

Fangraphs BRAA (+ve Run expectancy): +247

Fangraphs WPA: +22.65

Clutch:-7.07

Rice gets progressively worse as you go from context-neutral measures towards WPA, whilst you have him on your leaderboard!

Baseball Between the Numbers also has Rice as being -6.67 wins over his career.

The above list is still interesting--it shows hitters whose production was, on a context-neutral basis, still worth more or less than their stats would predict--but it's not measuring what I thought it was measuring. I'll do the same study using raw WPA rather than WPA/LI and report back soon. Sorry for the mix-up, I'm new to this stuff!!

I'm fairly sure this is the case. WPA/LI effectively "rewards" players for shaping their production to meet the preferred context of each plate appearance (e.g., having a disproportionate share of their walks come with the bases empty). Straight WPA rewards them for timing their production into high-leverage occasions.

Interesting. Another Primate is studying this, but it seems to me that good hitters with extreme splits are overvalued. The extra runs that are produced at home have less value than the runs that are lost on the road. Looking at it this way is similar to looking at support-neutral pitching stats, only that a pitcher who is more flaky is usually more valuable than a more consistent hurler with a similar ERA. Now, things may be different in an extreme hitters' park.

I thought Fangraphs park-adjusted WPA, but I could be wrong. (I know Studes is aware that WPA needs to be park-adjusted.)

-- MWE

I'll ask whether the numbers are park-adjusted. I think they must be, or else it would be clear from the leaders and trailers that they weren't.

Posts #4 and #7 are the relevant ones -

the thread implies that they do now but didn't initially - what age is your data set Dan?

Excellent. Of course, a connected point to consider is that such non-SB baserunning credit will be erroneously credited to hitters. I can't imagine this will be particularly significant, but did anyone of note spend an extremely large part of his career batting behind superlative baserunners?

Didn't someone do a study that showed they are disproportionately worse when coming down from altitude, like the first and second games of road trips, compared to normal teams; or something like that?

I have to find the thread, but that idea has been kicking around for awhile, but when someone really looked it wasn't true.

Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is? Meaning the Rockies have a bigger than normal advantage when the Mets come to town, but the Mets have an even bigger advantage when the Rockies come to town, as an example?

The Hangover Effect

from 2005

Isn't it also accepted that coming down from altitude is harder on the body in the short term than going up to it is?

Why does the USOC, or some US Track and Field organization, train elite athletes at Colorado Springs?

I suppose there is some effect in the other direction, contrary to this quotation. Maybe it was discovered by trial and error but I presume there is some scientific explanation today.

555. TomH Posted: August 02, 2008 at 06:28 PM (#2887812)

Dan, you really ought to find some spot(s) to publish your WAR system, possibly comparing it to BP's and WS (or WSAB). I'm sure SABR and other places would enjoy poring over your methods.

556. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 03, 2008 at 12:16 PM (#2888346)

Thanks very much for the encouragement, Tom, but it's still a total work in progress.

For publication in SABR "By the Numbers", piecemeal would be a better way to go even if the whole were equally polished.

I suppose that is true in general for the audience that considers statistical analysis its discipline.

For example, DanR publishes the standard deviation approach to adjusting statistics by league.

Ideally, Clay Davenport decides to publish his version of adjustment based on comparing the records of same players in different leagues.

but Replacement or replacement-level play may be a better example.

DanR replied to bjhanke with a lecture on standard deviation of player-season ratings, especially its use to adjust those ratings.

Standard Deviation, DanR to bjhanke (today)

Here are two of seven points. The emphasis is mine and it may depart from the lecture. Follow the preceding link to the original.

16. David Concepcion de la Desviacion Estandar (Dan R) Posted: August 25, 2008 at 09:41 AM (#2915272)

bjhanke, as the group's self-anointed standard deviation guru, I very much appreciate your interest in the concept. I think your understanding of a few points could be deepened a bit.

1. The counterargument you propose is a straw man. First, let's not use Win Shares even in this theoretical analysis, because they are poorly thought out and lead to all sorts of problems in this type of discussion (I can explain why if you're interested).

Let's use something that measures actual value, some indicator of wins above replacement. Doesn't matter whether you use mine, Baseball Prospectus's, or a home-grown version, it's just the concept. Let's also just assume that player performance is normally distributed about the mean (it isn't, but it's close enough for our purposes). OK, take a league where the standard deviation of player performance is 2 wins per season. If we call the bottom 2-3% of major leaguers replacement players, that means they will be four wins below average per season, while the All-Stars will be four wins above average per season. This makes league average a four-WARP player, and an All-Star an eight-WARP player. Let's also say that the top two teams in the league win 95 and 90 games.

OK, now let's say that something, some real external factor, actually causes this stdev to double (as opposed to simply the addition of a bunch of superstars or super-scrubs to the league). (In practice, this would most likely be an increase in run scoring or an expansion.) Now, replacement is eight wins below average, while All-Stars are eight wins above average, meaning that league average players have become eight-WARP players, and All-Stars have become 16-WARP players, overnight, with no change to their underlying ability. What happens?

Well, assuming the distribution of talent between teams doesn't change, you'd see a corresponding increase in the standard deviation of wins between teams. So the 90-win team (nine wins above average) will become a 99-win team (18 wins above average), while the 95-win team will become a 109-win team.
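The arithmetic of the doubling thought experiment can be sketched as follows. This is only an illustration under the post's own simplifications: replacement sits two standard deviations below average, All-Stars two above, and the season is assumed to be 162 games (so that 90 wins is nine above average). The function names are mine, not part of anyone's WARP system.

```python
def warp_at(z_score, stdev, replacement_z=-2.0):
    """Wins above replacement for a player z_score standard deviations from average."""
    return (z_score - replacement_z) * stdev

def rescale_team_wins(wins, stdev_factor, games=162):
    """Stretch a team's distance from .500 by the same factor as the player stdev."""
    average = games / 2
    return average + stdev_factor * (wins - average)

for stdev in (2.0, 4.0):
    print(f"stdev {stdev}: average = {warp_at(0, stdev)} WARP, "
          f"All-Star = {warp_at(2, stdev)} WARP")

for wins in (90, 95):
    print(f"{wins} wins -> {rescale_team_wins(wins, 2.0):.0f} wins")
```

Nothing about the players changes between the two leagues; only the scale on which their differences are expressed does, which is why the team win totals stretch by the same factor.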

Why is winning games important? Because it leads to pennants.

When you increase standard deviation, ceteris paribus, you change not only the stats-wins relationship, but also the wins-pennants relationship by the same amount. 97 wins is enough to eke out a pennant in the low-stdev league, but is only good for third place in the high-stdev league.

So, if what we are interested in is "pennants added," then we most definitely DO need to correct for standard deviation when assessing players' value.

3. That said, we need to distinguish between a "true" increase in standard deviation--one that actually makes a league "easier to dominate"--and the inevitable year-to-year fluctuation that takes place due to random noise and to the actual distribution of talent in a league. The clearest example of this is that the highest observed major league standard deviations since 1893 are clustered in the 1920's AL. Was this because the league was easy to dominate? No, it's because the league had a one-man star glut by the name of George Herman Ruth, who single-handedly was increasing the overall league stdev by massive proportions.

The way to do this is with a regression analysis, which determines the relationships between league factors like run scoring, expansion, and population per team, and observed stdevs over the course of baseball history. By applying the resulting equation to each league-season, we can then determine how easy it was to dominate based on these factors, without making any reference to the actual performance of the players in the league, thus avoiding the temptation to give extra credit to George Burns for playing in low standard deviation leagues just because all of the stars were in the AL. The result of this is the standard deviation adjustment I use in my WARP.

--

Dan,

Regarding the theme I have highlighted:

We have a within-league-season (raw) measure of player wins and hope to adjust it.

- Why not work on the wins-pennant relationship directly? Is the "pennant" intractable, with variation in division size and number, playoff size, etc?

- If the pennant is intractable, it may still be reasonable to work with the standard deviation of team wins rather than of player ratings.

At the Hall of Merit, 'WARP' refers equivocally to the Wins Above Replacement Player measures by Clay Davenport and Dan Rosenheck. This thread on "Dan Rosenheck's WARP data" is half full of theory, the foundations of his WARP data.

There is a thread "Battle of the Uber-Stat Systems (Win Shares vs. WARP)!" about the rating systems Win Shares by Bill James and WARP by Clay Davenport. That is half full of theory of one or the other, separately (not so much "vs.").

The caveats are:

1. I only have data for seasons where the pitcher in question was the starter for at least half his appearances.

2. I am completely reliant on BP’s DERA statistic to calculate pitcher effectiveness. If you don’t trust their defensive adjustments, you shouldn’t trust these numbers.

3. The numbers *do* include a rudimentary adjustment for seasonal innings totals, but *not* for career length, which systematically biases them against old-time pitchers who get their seasonal IP crunched but don’t get their careers extended.

4. The batting numbers are compiled against a baseline of pitcher average hitting for the league-season in question, which can fluctuate quite a bit. I think it would probably be better to use some sort of multi-year moving average as the baseline, but I haven’t gotten around to it yet.

5. I use a fixed replacement level for starting pitching, rather than one that floats over time as the hitter ones do. There is a very good chance that this is inaccurate.

6. I have not yet "integrated" these numbers with my position player ones at all, so not everything "adds up" the way it theoretically should.

I think these are enough warning signs to make the numbers of quite limited usefulness. Nonetheless, I'll send them your way now.

4. The batting numbers are compiled against a baseline of pitcher average hitting for the league-season in question, which can fluctuate quite a bit. I think it would probably be better to use some sort of multi-year moving average as the baseline, but I haven’t gotten around to it yet.

I think you must be right about this, in favor of the moving average.

DanR #139

I just did a quick league strength study for the 50s and 60s, and my preliminary results are absolutely jawdropping.

I looked at every position player who switched leagues between 1951 and 1968 (Mantle's career), a sample of 249 players. I took their rate in each season of batting wins per year, baserunning wins per year, and fielding wins per year, and added 8.7 to turn them into wins created per year. I then weighted each player by the harmonic mean of his playing time in the two seasons before and after the switch, giving 87 full seasons' worth of sample (where a player who plays every game in both the year before and after the switch is counted as 1.00). Finally, I took the ratio of their weighted performances before and after the switch. The ratio for batting wins was 1.092, for baserunning wins it was 1.001, and for fielding wins it was 1.007. This is, astonishingly, on par with the gap between the major leagues of 1944 and those of 1942/46.

For example, where WinR is win rate and Time is playing time.

WinR Time

1.00 1.00 1958

3.00 0.66 1959

4.00 1.00 1960

2.00 0.50 1961

Now you combine the two years "before" by weighted average --weighted by playing time, uniform by sequence? Same for the two years "after".

WinR Time

1.80 1.66(/2=0.833) before

3.33 1.50(/2=0.750) after

From which

- performance ratio = 3.33/1.80 = 1.85 ; or 1.80/3.33 = 0.54 depending on the direction of the move

- weight = harmonic mean of 0.833, 0.750

Then given a timespan such as 1951-1968 you take the weighted geometric mean of all the performance ratios (expressed in terms of fixed direction of the move) that are in the timespan.
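The steps in this reading of the method can be sketched in a few lines. This follows the exposition above (playing-time-weighted rates, a harmonic-mean weight, a weighted geometric mean across players), which is one poster's interpretation and not necessarily what Dan R actually computed; the function names are made up for illustration.

```python
from math import exp, log

def weighted_rate(rates, times):
    """Playing-time-weighted average of per-season win rates."""
    return sum(r * t for r, t in zip(rates, times)) / sum(times)

def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

def weighted_geometric_mean(ratios, weights):
    """Combine per-player performance ratios, weighting each by its sample weight."""
    return exp(sum(w * log(r) for r, w in zip(ratios, weights)) / sum(weights))

# The 1958-61 example from the table above.
before = weighted_rate([1.00, 3.00], [1.00, 0.66])   # about 1.80
after = weighted_rate([4.00, 2.00], [1.00, 0.50])    # about 3.33
ratio = after / before   # about 1.86 unrounded; 1.85 if the rounded rates are divided
weight = harmonic_mean((1.00 + 0.66) / 2, (1.00 + 0.50) / 2)   # about 0.79

print(round(before, 2), round(after, 2), round(ratio, 2), round(weight, 2))
```

With many league-switchers, each player's `ratio` and `weight` would go into `weighted_geometric_mean` once all ratios are expressed in the same direction (say, AL over NL).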

IIUC you should be able to calculate once and store these 6 to 8 values for all interleague moves, subsequently do all sorts of things with them.

- year

- playerID

- performance ratios for batting, baserunning, fielding (3)

- weights (1 to 3 of them)

Oh, that would combine all fielding positions :-(

Or does the following mean that you combine Batting, BR, and F at the player-season level, before my exposition begins?

That doesn't fit the three-part numerical conclusion. Do you add 8.7 three times, once to each?

I took their rate in each season of batting wins per year, baserunning wins per year, and fielding wins per year, and added 8.7 to turn them into wins created per year.

On the flip side, expansion does NOT turn up a statistically significant impact on pitchers, although maybe I could tease one out if I tweaked my expansion weights some. Run scoring definitely matters on both sides of the ball, as you would expect--more runs = more plate appearances = more opportunities for players to distinguish themselves from one another.

Pitcher standard deviations have never been higher than during the 1993-present "Steroid Era," and particularly in 1994-95. Their all-time low was in the 1920's, which is why I will certainly be the best friend of Dazzy Vance in our starting pitcher rankings, and why I may consider Adolfo Luque in 2009. The variance in them is extremely large, much more so than for hitters--ignoring the strike years for now, Roger Clemens's 222 RA+ in the 1997 AL translates to just a 174 RA+ in Luque's 1923 NL, while Luque's 188 RA+ in the 1923 NL counts for the same as a 256 RA+ in the 1997 AL. Of course, '97 Clemens would have thrown 322 innings in the 1923 NL, and '23 Luque would have thrown 264 innings in the 1997 AL, so it all sort of evens out. I have both seasons right around 11 WARP, with Luque slightly ahead.

Bob had 2 batting wins above average and 0.75 fielding wins above average in 3/4 of a season in the AL in Year 1, and 1 batting win above average and 0.5 fielding wins above average in half a season in the NL in Year 2. Joe had -3 batting wins above average and -1 fielding win above average in a full season in the NL in Year 1, and -1 batting win above average and -.5 fielding wins above average in half a season in the AL in Year 2. These are the only two league-switchers in my sample.

First, the batting:

Bob's batting rate in the AL is 2/.75 = 2.67 BWAA per year, and in the NL it's 1/.5 = 2.00 BWAA per year. Adding 8.7 to this, we get him with 11.37 batting wins per year in the AL, and 10.7 batting wins per year in the NL. Joe's batting rate in the NL is -3/1 = -3 BWAA per year, and in the AL it's -1/.5 = -2 BWAA per year. Adding 8.7, we get him with 5.7 batting wins per year in the NL, and 6.7 batting wins per year in the AL.

Bob is weighted at the harmonic mean of .75 and .5, which is 0.6, and Joe is weighted at the harmonic mean of 1 and .5, which is 0.67. So, Bob's 11.37 batting wins per year in the AL become a weighted 6.82, and his 10.7 batting wins per year in the NL become a weighted 6.42. Joe's 5.7 batting wins per year in the NL become a weighted 3.82, and his 6.7 batting wins per year in the AL become a weighted 4.49.

In total, that makes 6.82 + 4.49 = 11.31 weighted total batting wins in the AL, and 6.42 + 3.82 = 10.24 weighted total batting wins in the NL, producing a batting win conversion factor of 1.10. (I picked the numbers at random, but that's not far off from my actual observed result).

Now, the fielding: Bob's fielding rate in the AL is .75/.75 = 1.00 FWAA per year, and in the NL it's .5/.5 = 1.00 FWAA per year. Adding 8.7 to this, we get him with 9.7 fielding wins per year in the AL, and 9.7 fielding wins per year in the NL. Joe's fielding rate in the NL is -1/1 = -1 FWAA per year, and in the AL it's -.5/.5 = -1 FWAA per year. Adding 8.7, we get him with 7.7 fielding wins per year in the NL, and 7.7 fielding wins per year in the AL.

Using the same weights, Bob's 9.7 fielding wins per year in the AL become a weighted 5.82, as do his 9.7 fielding wins per year in the NL. Joe's 7.7 fielding wins per year in the AL become 5.16, as do his 7.7 fielding wins per year in the NL.

In total, that makes 5.82 + 5.16 = 10.98 weighted fielding wins in the AL, and the same total in the NL, producing a fielding win conversion factor of 1.00.
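The Bob-and-Joe example can be reproduced with a short script. This is a sketch of the arithmetic exactly as described in the post (rate per full season, plus 8.7, weighted by the harmonic mean of playing time in the two leagues); the tuple layout and function names are my own.

```python
BASELINE = 8.7  # wins added to each rate to turn it into "wins created per year"

def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

# (player, AL wins above average, AL playing time, NL wins above average, NL playing time)
batting = [("Bob", 2.0, 0.75, 1.0, 0.5), ("Joe", -1.0, 0.5, -3.0, 1.0)]
fielding = [("Bob", 0.75, 0.75, 0.5, 0.5), ("Joe", -0.5, 0.5, -1.0, 1.0)]

def conversion_factor(rows):
    """Ratio of weighted AL wins per year to weighted NL wins per year."""
    al_total = nl_total = 0.0
    for _name, al_wins, al_time, nl_wins, nl_time in rows:
        weight = harmonic_mean(al_time, nl_time)
        al_total += weight * (al_wins / al_time + BASELINE)
        nl_total += weight * (nl_wins / nl_time + BASELINE)
    return al_total / nl_total

print(f"batting factor:  {conversion_factor(batting):.2f}")
print(f"fielding factor: {conversion_factor(fielding):.2f}")
```

Carrying full precision instead of the rounded 0.67 weight shifts the intermediate totals slightly, but the batting factor still rounds to 1.10 and the fielding factor is exactly 1.00.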

I only looked at the seasons directly before and after the league switch in this first study.

But are you saying you use DERA itself as the main component of pitcher effectiveness (not starting from RA/INN)? I'm not sure I have that much faith in it.

I need to read more thoroughly, but wanted to mention this at least in case I can't get back to it for a bit.

At this stage, does every player-season have one value for playing time (0.75 for Bob in year 1) that is used in all three batting, fielding, and pitching analyses?

thinking aloud, not entirely unrelated:

Does anyone know whether there is a detectable tendency for pitchers who are good batters to move from AL to NL, and vice versa?

Palmer is a fascinating case; some stats make him seem lucky, but I don't think so. The zero grand slams allowed in career also plays to that, to an extent.

Such a tendency would be optimal, but I am not aware of any empirical study that has demonstrated one.

There is a VERY strong argument that Palmer's pitching style was uniquely well-suited to his fielders, intentionally or not: he gave up 169 fewer hits on balls in play than he would have if opposing batters had posted the same BABIP against him as they did against the rest of the Orioles' staff. Moreover, he allowed 97 fewer runs than he would have if the hits and walks against him had been distributed randomly, suggesting a true Glavine-like ability to "bear down" from the stretch. He was the kind of guy that somebody just looking at strikeouts, walks, and home runs allowed (as Voros McCracken would recommend) would consistently underrate.

There is a VERY strong argument that Palmer's pitching style was uniquely well-suited to his fielders, intentionally or not: he gave up 169 fewer hits on balls in play than he would have if opposing batters had posted the same BABIP against him as they did against the rest of the Orioles' staff.

"Uniquely" may mean unique among Orioles who enjoyed the same fielders for a medium or long time. Among 140 or 150 great pitchers, Spalding to Santana, here are the 10 or 11 "leaders" per 9 innings. (It's 150 including 10-15 relief pitchers.)

-0.66 Joss (-0.66 hits per 9 innings)

-0.52 Santana

Hunter, Johnson, Coveleski

Stieb [and Rivera, optional]

McGinnity

-0.39 Tiant, Rusie, Palmer
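As a rough cross-check, the career total quoted earlier (169 fewer hits) can be converted to the per-9-innings scale used in this list. The innings figure below is an assumption brought in from outside the thread (Palmer's career total of 3,948 innings pitched), and the helper name is mine.

```python
def per_nine(career_total, innings_pitched):
    """Convert a career total into a rate per nine innings."""
    return 9 * career_total / innings_pitched

PALMER_IP = 3948          # assumed career innings, not stated in the thread
HITS_VS_STAFF = -169      # "169 fewer hits on balls in play" from the earlier post

palmer_rate = per_nine(HITS_VS_STAFF, PALMER_IP)
print(f"{palmer_rate:+.2f} hits per 9 innings")
```

Under that assumption the two posts agree: the career total and the listed rate describe the same gap.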

Larry Corcoran and Fred Goldsmith divided the Chicago workload during most of their careers, and it shows: -0.20 and +0.29.

Maddux, Glavine, and Smoltz: -0.12, -0.11, -0.09.

Here are the "trailers"

+0.44 Pettitte

+0.29 John, Goldsmith, Kaat,

+0.28 Oswalt

+0.23 Lolich

+0.21 Pennock

<0.19 everyone else

Pettitte, John and Kaat at the "bottom" of the list. Is that predictable?

Moreover, he allowed 97 fewer runs than he would have if the hits and walks against him had been distributed randomly, suggesting a true Glavine-like ability to "bear down" from the stretch.

The "leaders":

-0.80 Spalding (-0.80 runs per 9 innings)

(gap)

-0.57 Cummings

(gap)

-0.41 Maglie, Ford, Oswalt, McCormick, Will White

-0.31 Caruthers, Grove, Plank

That is five pre-1893 pitchers in the first ten, with Welch, Bond, Goldsmith, and Mullane (9) also in the top 20.

Palmer is about #12 of about 125 post-1892 pitchers.

Who leads Palmer in both measures?

No one.

McGinnity and Coveleski are the two others with very strong combined records in these two statistics.

What about Mr. Hittable, Glendon Rusch?

He is not one of my 150 great pitchers.

Maybe Pettitte fooled you regarding the lofty standard.

I'll put the six new statistics including XIP in my lahman5.4-extended and upload the table with IP and playerID too.

I checked my 150 great pitchers against two reference lists (see the description) and added 11 more.

upload to "Files" for the "HallofMerit" egroup at yahoogroups.com

pitchers161.csv

Career data for 161 "great" pitchers: seven new statistics (DH, DR, DW, RAA, PRAA, DERA, XIP), four old ones (IP, ERA+, PA, OPS+), name and numeric lahmanID. The 161 include the Bill James top 100 (2001) and everyone with career ERA+ >=130 who qualifies as a leader at baseball-reference (2008-09-11). All data current sometime Jul-Sep 2008.

1. Standard deviation is calculated as it is for all other league-seasons. Both the AL and NL come out to a .991 LgAdj.

2. Defense is calculated using an average of 70% Dewan's Plus/Minus, 30% Chris Dial's RSpt. The only exception is that Dewan has Chase Utley at an insane 50 plays above average this year, so I used a 50/50 weighting there. I will likely amend this rating once I get a UZR for him. Since these numbers are unregressed, they will probably show a somewhat higher standard deviation than the ones I have published for previous years.

3. I have added 3.4 to the park factor of all NL players and subtracted 3.4 from the park factor of all AL players to adjust for league strength. This posits an 8.0 wins-per-team strength gap between the leagues in favor of the AL, divided 65/35 among position players and pitchers. It comes out to about .3/.4 wins per player. Note that that means these numbers are NOT directly comparable with those posted to my spreadsheet, which always assume equal strength between the AL and NL.

4. Non-SB baserunning is estimated using my standard estimation equation. This means BRWAA will probably show a slightly lower standard deviation than it does for 1972-05.

5. Replacement levels are assumed to be the same as 2005. This may very well not be the case.

This does NOT mean these are all the top-ranked players in MLB--they're just the ones I happened to check. There may be others that should be interspersed on this leaderboard. If there are any other players in particular you're curious about, let me know.

Even after adjusting for league strength, the standard deviation gap between the AL and NL is striking. Is there some real factor my standard deviation projection equation is missing? Is the league strength gap bigger than the (already massive) 8-wins-per-team differential I'm positing? Or was it just that nobody happened to have a big year in the AL, a la 1976?

Before anyone wants to wring my neck, the YouKike thing is a joke. His baseball-reference and BP id are youkike, and since he's Jewish, it makes for quite a coincidence. In response, baseball-reference has turned the "i" into a lowercase "L"--which is, of course, indistinguishable on a computer screen from a capital I--while Baseball Prospectus has just stuck a ton of underscores in the middle, as in youki__________ke. Pretty funny.

Ryan Howard must have just had the worst 48-homer season ever.

Player       SFrac  BWAA BRWAA  FWAA Replc  WARP
Pujols        0.93   7.4  -0.1   1.6  -0.2   9.1
Hanley        1.01   4.8   0.2   0.2  -2.6   7.8
Utley         1.03   3.3   0.2   2.5  -1.7   7.8
Chipper       0.78   5.6   0.0   0.8  -1.2   7.6
Teixeira      1.00   5.1  -0.2   1.8  -0.5   7.2
Berkman       0.97   5.0   0.1   1.3  -0.2   6.7
Wright        1.07   4.5   0.1   0.2  -1.9   6.6
A-Rod         0.86   4.1   0.2   0.2  -1.9   6.4
Mauer         0.91   3.6  -0.2   0.4  -2.6   6.4
Pedroia       1.04   2.4   0.4   1.1  -2.4   6.3
Beltrán       1.03   3.0   0.5   1.2  -1.6   6.3
Sizemore      1.08   3.3   0.6   0.1  -2.2   6.2
Hamilton      1.02   4.2   0.2  -0.5  -1.9   5.9
YouKike       0.90   4.0  -0.3   1.1  -1.1   5.8
Markakis      1.01   3.8  -0.2   0.6  -1.5   5.7
Manny         0.95   5.4   0.0  -0.7  -0.9   5.5
J. Reyes      1.10   2.2   0.6  -0.2  -2.8   5.4
Holliday      0.91   3.5   0.5   0.5  -0.8   5.2
Bradley       0.74   4.6  -0.1   0.2  -0.2   4.8
B. Giles      0.95   3.1  -0.1   0.9  -0.8   4.7
B. Roberts    1.02   2.2   0.4  -0.2  -2.3   4.7
Kinsler       0.83   2.9   0.5  -0.7  -1.9   4.6
Ludwick       0.90   4.3  -0.2  -0.4  -0.8   4.5
Rollins       0.91   0.0   0.9   1.3  -2.2   4.5
Quentin       0.82   3.7   0.0  -0.4  -1.2   4.5
McCann        0.83   2.7   0.0  -0.3  -1.9   4.3
Morneau       1.03   3.6  -0.2  -0.2  -0.8   4.0
Granderson    0.90   2.6   0.1  -0.7  -1.9   4.0
Huff          0.96   3.6   0.0  -0.3  -0.6   3.9
Braun         0.96   2.3   0.1   0.4  -0.8   3.7
Soto          0.82   1.9  -0.2   0.0  -1.9   3.6
Howard        1.02   2.1  -0.1  -0.2  -0.3   2.0
Delgado       1.00   2.7  -0.2  -1.1  -0.3   1.7

I haven't studied it - but does the AL have a larger than expected HFA in interleague games in AL parks? 8 wins per team seems very high to me - I thought the NL had closed the gap. Also, it doesn't necessarily hold that the advantage would be the same for hitters as pitchers.

Both Nate Silver and Mitchel Lichtman told me for an NY Times column last year that the league strength gap in 2006 was 10 wins a team, after factoring out DH-related factors. I too heard it had narrowed--a bit, hence my guesstimate of 8 wins.

I have also said 100 times that my big problem with WARP is that it's always changing, and of course that is BP WARP.

DanR WARP came along at a time when I just wasn't willing to change "systems." I mean, I do have a life.

Having said all of that, I am favorably impressed by Dan's 2008 MVP rankings. Let's just take the NL.

Win Shares

Berkman 38, Pujols 35, Beltran 33, H. Ramirez 32, Utley 30, Reyes and Wright 29, Lincecum and McLouth 27, Ludwick and A. Gone 26

VORP

Pujols 97, H. Ramirez 81, C. Jones 75, J. Santana 73, Berkman and Lincecum 72, Wright 66, Reyes 63, Utley 62, Holliday 60

DanRWARP

Pujols, H. Ramirez, Utley, C. Jones, Teixeira, Berkman, Wright, Beltran, M. Ramirez (I assume this is both leagues), Reyes; I take it you simply haven't rated the pitchers, rather than that they fell short...?

Some markers:

Berkman: 1 on WS, 5T on VORP, 6 on DanRWARP

Beltran: 3, not top 10, 8

C. Jones: unrated, 3, 4

Teixeira: unrated, unrated, 5

My gut says that WS is closest on Berkman though I don't see him as #1, DanR is closest on Beltran, but Jones would be closer to #11 than to 3 or 4, and Teixeira would be closer to unrated than to #5.

But all things considered, if I'm just looking at the top 3, I like Pujols, Hanley and Utley better than I like the other 2 top 3s. Not that looking at the top 3 is all that meaningful, more of a toy.

Maybe it's just a coincidence but I don't see WS defensive methodology being particularly influential in the differentials here. I might infer from Berkman being #1 that it over-rates 1B defense but then how to explain Pujols...not to mention, again, Teixeira. Maybe this is only really a problem when you roll up the numbers at the career level? Well, no, Dan says it's as much as 2.5-3 WS too low for a really good corner OF. I don't know who the representative case is this year, however.

A. Gone and McLouth may reflect WS propensity to advantage good players on crappy teams (or does it?). Some said it does apropos of Ralph Kiner but nobody said it about Chuck Klein. But that charge has been around forever.

"Maybe it's just a coincidence but I don't see WS defensive methodology being particularly influential in the differentials here. I might infer from Berkman being #1 that it over-rates 1B defense but then how to explain Pujols...not to mention, again, Teixeira."

If the issue is that WS underrates defense, then wouldn't the problem be that WS overrates 1B (and corner OF) overall when compared to premium defensive positions, not 1B defense? Even if it did overrate 1B defense in comparison to other positions, the argument was that it underrates defense in comparison to offense, cutting too small a piece of pie for the former, so there would be little effect on the WS ratings if it was determined that 1B defense was overrated and was adjusted without taking into account that it underrates defense overall, its real problem.

Win Shares      Dan R's WARP
Berkman         Pujols
Pujols          H. Ramirez
Beltran         Utley
H. Ramirez      C. Jones
Utley           Teixeira
Reyes           Berkman
Wright          Wright
McLouth         Beltran
A. Gonzalez     M. Ramirez

First, VORP is simply an offensive rating; it doesn't take defense into account. Also, while it is likewise created by BP, it isn't related to WARP in any way, I believe.

Looking at the top 9 you listed (excluding pitchers since, like you said, Dan hasn't rated pitchers) we see Hanley Ramirez, Chase Utley & Chipper Jones have jumped up the list in Dan's evaluation, above players from easier positions defensively, while Reyes is the only one that drops down the list among ballplayers from the more difficult end of the defensive spectrum (sorry, I always get the directions, left-right, mixed up). That doesn't take into account how well they play defense at their respective positions. For example, I believe that every defensive measure (play-by-play anyways) rates Pujols as the top defensive 1B of the past few years, let alone '08. That has an effect on the difference in ratings (as well as the fact that he was insanely good offensively this year: looking at bb-ref's sabermetric stats, OPS+, RC & BtRuns, as well as considering their defense--how does Pujols end up behind Berkman? Berkman's WPA advantage doesn't seem that big . . . ). I don't know the defensive merits of Reyes & Beltran in '08 so I don't know if that had an effect on their drop among the top 9 from WS to Dan's WARP, but I believe I've heard that McLouth's defense isn't so hot while Hanley (I almost typed Horacio, Yikes!) Ramirez's has improved from worst in the league to nearly average according to some (and hasn't Chris Dial argued that Chipper's defense at 3B is underrated. I'm not sure if I recall correctly or not . . . ).

"Well, no, Dan says it's as much as 2.5-3 WS too low for a really good corner OF."

The issue isn't only corner OF, it is every position--Win Shares gives too little weight to defense, too much to offense, overrating poor defensive players, underrating good ones, overrating ballplayers from easier defensive positions while underrating the up-the-middle positions. That's why Paul Waner will receive 5th place votes, because his defensive value makes up for any weaknesses OPS+ & WS sees in comparison to other RF.

AROM--definitely Jewish, but I've barely stepped into a synagogue since my Bar Mitzvah.

Tiboreau--I wouldn't say precisely that Win Shares "underrates" defense *on the whole*. It makes two mistakes: first, its intrinsic positional weights are off (an average fielding SS should have about 6 more Fielding WS than an average fielding corner outfielder, and 8 more than an average fielding first baseman); and second, it compresses the impact of fielding quality into too small of a range. The spread of true fielding talent, according to modern PBP metrics, is something like -8 to +8 at first without counting scooping, -15 to +15 at corner outfield, and -20 to +20 at third and up the middle. (These are BALLPARK figures; I didn't bother to measure actual standard deviations on my data so don't kill me). In any given year, scores with an absolute value greater than 30 can be observed at up-the-middle positions, although they are unlikely to repeat themselves (just as a batting champion will often hit over .350 but only the rarest of players hits .350 over multiple seasons). So you should be seeing gaps of about 65 Fielding WS between players at the same position over 5-year stretches, and as much as 20 Fielding WS between players at the same position in a single year.

Needless to say, that's not the case. But that doesn't necessarily mean that good fielders are underrated--it could just as easily mean that bad fielders are overrated. (In fact, both are true). What it does mean is that Win Shares doesn't achieve its goal of accurately allocating a team's wins among its players.

I would say that in terms of effectiveness Cliff Lee surely was the "best" player in the AL this year, in part because nobody had a really kick-ass year (no position players) and partly because Cliff's year was sooo good.

I'm guessing that Lee will be lucky to hit the bottom 3 in the official MVP voting, or will not make the top 10 at all. My feeling is he is probably top 3-5, don't know if he's #1 or not. Halladay should also be top 10 IMO.

What is ironic, of course, is that those who say a starting pitcher can't be MVP then turn around and favor closers. I guess the argument hinges on the number of games, then, rather than the number of innings (outs) accounted for.

I'm not convinced any of the NL pitchers are top 10, but Lincecum is at least close and he is the Cy, though Webb had a pretty good year too.

But anyway, I like that your system gives starting pitchers a shot, unlike WS, whereas VORP seems to rate them too highly.

Using a 99 5-year park factor for the Blue Jays, a league-average offense would have scored 4.78*.99*246/9 = 129.3 runs in 246 innings. 129.3 runs scored plus 90.7 runs allowed is 220 total runs in 246 innings, which yields a Pythagorean exponent of 1.81 and a winning percentage of .655. A .655 winning percentage in 17% of a team's games translates to 4.27 wins above average. Add on 2.1 wins per 200 innings for replacement level, and Halladay comes out at 6.85 WARP.

Now, Lee. Dewan has a very interesting take on Cleveland's defense--they were exactly average at preventing hits on balls in play, but quite good at keeping those hits to singles (0 plays and 29 bases above average). This is principally attributable to Gutiérrez and secondarily to Sizemore. That comes out to a team defense of +7 runs. Cleveland's catchers were league average (12 errors, 13 passed balls, 67 steals, 27 caught stealing). Lee pitched 223.3 of the team's 1437 innings, so his defensive support was 7*223.3/1437 = 1 run, which on top of the 68 runs he allowed makes 69. Multiplying by .963 for league strength gives a result of 66.4 runs allowed.

Using a 100 5-year park factor for the Indians, a league-average offense would have scored 4.78*223.3/9 = 118.6 runs in 223.3 innings. 118.6 runs scored plus 66.4 runs allowed is 185 total runs in 223.3 innings, which yields a Pythagorean exponent of 1.77 and a winning percentage of .737. A .737 winning percentage in 15.5% of a team's games translates to 5.95 wins above average. Add on 2.1 wins per 200 innings for replacement level, and Lee comes out at 8.3 WARP.
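For anyone who wants to retrace the Halladay and Lee arithmetic above, here is a rough Python sketch. The function name and arguments are made up for illustration, and the exponent is computed as (total runs per nine innings)^0.287, a Pythagenpat-style assumption on my part that happens to land near the 1.81 and 1.77 quoted above; the posts don't say which formula was actually used.

```python
def pitcher_warp(innings, runs_allowed, share_of_team_games,
                 park_factor=1.00, league_runs_per_9=4.78,
                 repl_wins_per_200_ip=2.1, exponent_power=0.287):
    """Return (exponent, win_pct, warp) for one pitcher-season.

    runs_allowed should already include the defense and league-strength
    adjustments described in the posts. exponent_power=0.287 is an
    ASSUMED Pythagenpat constant, not something stated in the thread.
    """
    # Runs a league-average offense would score behind him in his innings.
    support = league_runs_per_9 * park_factor * innings / 9
    # Run environment (both teams) per nine innings drives the exponent.
    runs_per_9 = (support + runs_allowed) / innings * 9
    exponent = runs_per_9 ** exponent_power
    win_pct = support ** exponent / (support ** exponent + runs_allowed ** exponent)
    # Wins above a .500 team over his share of a 162-game schedule.
    wins_above_avg = (win_pct - 0.5) * 162 * share_of_team_games
    replacement = repl_wins_per_200_ip * innings / 200
    return exponent, win_pct, wins_above_avg + replacement


halladay = pitcher_warp(246, 90.7, 0.17, park_factor=0.99)
lee = pitcher_warp(223.3, 66.4, 0.155)
print("Halladay: exp %.2f, W%% %.3f, WARP %.2f" % halladay)
print("Lee:      exp %.2f, W%% %.3f, WARP %.2f" % lee)
```

Because nothing is rounded between steps, the results land within a few hundredths of a win of the 6.85 and 8.3 quoted above rather than matching them to the last digit.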

Your instinct was right, AROM, that the gap was too large, and the reason why is incredibly stupid--I did the same analysis of Lee a few weeks ago to see what he was on track for, and didn't check to see that his last three starts were poor. Apologies. At 8.3 WARP, he is still clearly the best player in the AL, but definitely not as valuable as Pujols.

For now, I do not take quality of opposing hitters into account. I'd be interested to see that data.

Look at it this way--a replacement third baseman will have about a 75 OPS+ (think José Castillo or Wes Helms) and field at an average rate. Jones was about a +10 per season third baseman this year per both Dewan and Dial, so let's move those runs from the defensive side of the ledger to the offensive one, where they're worth 11 points of OPS+. So, you've got a 177 + 11 = 188 OPS+ 3B who fields his position at the league average for 78% of the season, and a 75 OPS+ 3B who fields his position at the league average for 22% of the season. Combine the two, and you have a full season of play at a 163 OPS+ with average fielding at third. That's roughly Eddie Mathews in 1959--a very strong MVP candidate in a typical year.
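The blend in the paragraph above is just a playing-time-weighted average, which can be sketched in Python as follows. The function and parameter names are hypothetical; the 177, 11, 75 and 78% inputs are the ones quoted in the post.

```python
def blended_ops_plus(player_ops_plus, season_fraction,
                     fielding_as_ops_points=0.0,
                     replacement_ops_plus=75.0):
    """Full-season OPS+ equivalent of a part-time player plus his fill-in.

    Fielding value is folded into the bat (as OPS+ points), so both the
    player and the replacement are treated as average fielders.
    """
    adjusted = player_ops_plus + fielding_as_ops_points
    return (season_fraction * adjusted
            + (1 - season_fraction) * replacement_ops_plus)


# Chipper Jones, 2008: 177 OPS+, +11 points for defense, 78% of a season.
chipper = blended_ops_plus(177, 0.78, fielding_as_ops_points=11)
print(round(chipper))  # 163
```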

". . . Add on 2.1 wins per 200 innings for replacement level, and Lee comes out at 8.3 WARP."

Is that estimated from some pitcher analogue to the moving average of the three worst regular shortstops in the league?

E.g., from 1985 to 2005 (the period covered by the FAT study), the worst 3/8 of starting MLB first basemen averaged 0.3 batting + baserunning + fielding wins above average per 162 games. Nate found that freely available first basemen over that timespan averaged 0.2 wins below average per 162 games. So this tells us that replacement 1B are 0.5 wins a season worse than the average performance of the worst 3/8 of starting 1B in the league. If my published replacement level at 1B in a given year is -1.0, that means that the worst 3/8 of starting 1B in the surrounding 9-year period averaged 0.5 wins below average per season.
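The 1B example above boils down to a fixed offset between the worst 3/8 of starters and freely available talent. A minimal Python sketch, with hypothetical names and only the numbers quoted in the post:

```python
# 1985-2005 calibration from the post: worst 3/8 of starting 1B averaged
# +0.3 wins above average per 162 games; freely available 1B averaged -0.2.
WORST_THREE_EIGHTHS_1B = 0.3
FREELY_AVAILABLE_1B = -0.2
FAT_GAP_1B = WORST_THREE_EIGHTHS_1B - FREELY_AVAILABLE_1B  # about 0.5 wins


def replacement_level(worst_three_eighths_waa, gap=FAT_GAP_1B):
    """Replacement level implied by the worst 3/8 of starters (9-year window)."""
    return worst_three_eighths_waa - gap


def implied_worst_three_eighths(published_replacement, gap=FAT_GAP_1B):
    """Invert the relationship: what the worst 3/8 must have averaged."""
    return published_replacement + gap


print(round(implied_worst_three_eighths(-1.0), 2))  # -0.5
```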

2008: 149-103 .591
2007: 137-115 .544
2006: 154-098 .611
2005: 136-116 .540
-----------------------
Total: 576-432 .571

Assume that it's .500 in the NL parks and that those were half the games (approx 252-252 .500). Then in the AL parks it would be approx 324-180 .643. That's a lot to explain away by the NL team not having a useful DH on the major league roster. If it's so important they could, you know, send down a pitcher and call up some 1b/of prospect from the minors for the series.
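The back-of-envelope split above can be checked with a few lines of Python. The variable names are mine, and the even home/road split and the .500 record in NL parks are the same assumptions made in the post, not measured values.

```python
# AL interleague records (wins, losses) by season, as listed above.
records = {2008: (149, 103), 2007: (137, 115), 2006: (154, 98), 2005: (136, 116)}

al_wins = sum(w for w, _ in records.values())
al_losses = sum(l for _, l in records.values())
total_pct = al_wins / (al_wins + al_losses)

# Assumption from the post: half the games were in NL parks, split evenly.
nl_park_games = (al_wins + al_losses) // 2
nl_park_wins = nl_park_games // 2
nl_park_losses = nl_park_games - nl_park_wins

# Whatever is left over must have happened in AL parks.
al_park_wins = al_wins - nl_park_wins
al_park_losses = al_losses - nl_park_losses
al_park_pct = al_park_wins / (al_park_wins + al_park_losses)

print(f"Total: {al_wins}-{al_losses} {total_pct:.3f}")
print(f"AL parks (implied): {al_park_wins}-{al_park_losses} {al_park_pct:.3f}")
```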
