Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Tuesday, November 24, 2009

Chone’s WARP and the Hall of Merit

Please use this thread to discuss Chone’s WARP and how it relates to the Hall of Merit and other ‘uber-systems’, like DanR’s WAR, BPro WARP (1, 2 and 3), Win Shares, my (Joe Dimino’s) pitcher ratings, VORP, etc. If I forgot one, let me know and I’ll add it to the list as well.

Joey Numbaz (Scruff) Posted: November 24, 2009 at 09:12 PM | 160 comment(s)

Reader Comments and Retorts

   101. DL from MN Posted: December 15, 2009 at 04:20 PM (#3413202)
Crossposting from another BTF thread:

"41. GuyM Posted: December 15, 2009 at 10:12 AM (#3413188)
In using Sean's WAR data, you have to keep in mind that his Total Zone method systematically understates the value of great fielders, and also systematically understates the cost of weak fielding. As a result, Jeter gets ranked far, far higher on this list than he should. In WAR, he's only about 5 wins behind Beltran or Ichiro for the decade. But Jeter's fielding weakness almost certainly means he was 5-10 wins worse than WAR estimates, while Beltran and Ichiro were probably a lot more valuable. Overall, he's not even close to those two players, or a number of others who are behind him on this list."

Is this true? I thought CHONE had run several regressions on fielding and that was one of the strengths of his system.
   102. Eric J can SABER all he wants to Posted: December 15, 2009 at 04:33 PM (#3413227)
"In WAR, he's only about 5 wins behind Beltran or Ichiro for the decade. But Jeter's fielding weakness almost certainly means he was 5-10 wins worse than WAR estimates, while Beltran and Ichiro were probably a lot more valuable."

Comparing TotalZone to the UZR listed on Fangraphs, Jeter actually gains about 10 runs from 2002-09 by UZR. Beltran loses about 10 runs over the same span, while Ichiro gains about 6. So no, this doesn't seem to be true for these particular players, at least if you trust the Fangraphs data.
   103. jscmeagol Posted: January 14, 2010 at 06:16 PM (#3437532)
This may seem like a simple question, but where would one place an average player according to WAR? How about an All-star level player? I realize that these are vague concepts, but it would help me understand what I am looking at.

Thanks!
   104. AROM Posted: January 14, 2010 at 06:29 PM (#3437539)
An average player, playing full time, is about 2 wins above replacement. That's an easy question, since that is true by definition of the way the system was designed.

The second part is a bit tougher, but I'll give it a try:

I'll say 4 WAR is all-star level, as there were 47 players at that level or above last season and all-star rosters are pretty big these days.

A six WAR player is one who, depending on additional criteria, you might be able to make a reasonable MVP case for. An 8 WAR season is probably a leading MVP candidate, though of course some leagues might have multiple top candidates like that (Pujols vs. Bonds) and some seasons may not have a player at that level (2008 AL leader was Pedroia at 6.7 - Looks like the writers got that one correct).
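AROM's rough benchmarks can be written down as a small lookup. This is only a sketch of the thresholds quoted above (2, 4, 6 and 8 WAR); the function name and the label for seasons under 2 WAR are assumptions added for illustration, not part of his system.

```python
def war_tier(war: float) -> str:
    """Map a single-season WAR to the rough labels from AROM's reply."""
    if war >= 8:
        return "leading MVP candidate"
    if war >= 6:
        return "possible MVP case"
    if war >= 4:
        return "all-star level"
    if war >= 2:
        return "average or better regular"
    # Label below is an assumption; the post does not name this range.
    return "below average for a full-time player"


# Pedroia's 2008 figure from the post (6.7 WAR).
print(war_tier(6.7))
```

The cutoffs are soft in the original ("about", "probably"), so treat the boundaries as guides rather than hard lines.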
   105. jscmeagol Posted: January 14, 2010 at 07:43 PM (#3437632)
thanks AROM
   106. Paul Wendt Posted: January 18, 2010 at 04:26 AM (#3439967)
Is choneWARP a good foundation for HOM consideration of "1880s pitchers"?
That is the theme of Alex King, "2011 Ballot Discussion" #65.

Alex King's focus is replacement level in the weaker major league(s) of each season 1882-1891. That bears directly on the ratings of all players in those leagues relative to the National League (to the Players League in 1890) rather than the ratings of primary pitchers relative to other players.
>>
Since I rated the 1880's pitchers much higher than the consensus, I was concerned that Sean Smith's WAR overrates 1880's pitchers; in particular, I was concerned that his replacement levels for the American Association and Union Association might be too high.
<<
   107. Paul Wendt Posted: January 18, 2010 at 04:40 AM (#3439975)
I would consider adjusting away the relatively high workloads that were normal for pitchers. Some have observed that many early pitchers paid for high season workloads with short careers.

I would consider tinkering with the balance between pitching and fielding, which would bump pitchers down and their teammates up.
Unfortunately I don't know anything particular about the balance in choneWARP ratings.

Regarding win shares, which early pitchers dominate, someone has suggested dividing their WS ratings by two at the final step. Maybe Bill James made that suggestion himself. James uses 67.5% pitching, 32.5% fielding as a timeless baseline division of credit for run prevention, to be adjusted up or down by calculations from the team-season pitching and fielding data. Some or many in this forum have concluded that the basic division must be a historical variable.
   108. Alex King Posted: January 18, 2010 at 07:12 AM (#3440046)
Paul/107:

"I would consider adjusting away the relatively high workloads that were normal for pitchers. Some have observed that many early pitchers paid for high season workloads with short careers."

I agree that on a seasonal basis, you have to adjust for these pitchers' extraordinarily high workloads. But I don't think it's necessary to adjust career numbers, as all 5 of the star 1880's pitchers under consideration had normal career lengths:
Buffinton 3404 IP
Welch 4802 IP
Mullane 4531 IP
King 3190 IP
McCormick 4275 IP
For the season totals, in my initial rankings, I multiplied everything by 2/3. Then I looked at the league leaders in innings pitched for every season from 1876 to 2009. Between 1876 and 1892, the league leaders averaged 579 IP. It makes no sense to compare pitcher workloads in this era to more recent pitcher workloads, so I decided to compare the 1876-1892 leaders to the 1893-1920 leaders. 1920 is a logical endpoint because top pitchers' innings pitched dropped dramatically due to the increase in home runs.
Between 1893 and 1920, league leaders averaged 373 IP. The adjustment this implies, 0.6446, is nearly the same as the 2/3 that I used earlier.
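The workload ratio can be reproduced from the two averages quoted above. This is a minimal sketch: the variable names are illustrative, and the 600 IP season is a made-up input. With the rounded averages the ratio comes out near 0.644; the 0.6446 in the post was presumably computed from unrounded averages, which is an assumption here.

```python
# Average league-leading innings pitched, as quoted in the post.
EARLY_LEADER_IP = 579   # 1876-1892
LATER_LEADER_IP = 373   # 1893-1920


def workload_factor(later_ip: float, early_ip: float) -> float:
    """Ratio used to scale an early-era season down to a later-era workload."""
    return later_ip / early_ip


factor = workload_factor(LATER_LEADER_IP, EARLY_LEADER_IP)
print(f"workload factor: {factor:.4f}")

# Hypothetical example: scale a 600 IP season from the early era.
hypothetical_ip = 600
print(f"adjusted innings: {hypothetical_ip * factor:.0f}")
```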

"I would consider tinkering with the balance between pitching and fielding, which would bump pitchers down and their teammates up.
Unfortunately I don't know anything particular about the balance in choneWARP ratings."

To determine a pitcher's WAR, Chone subtracts a pro-rated portion of the team's defensive rating from the pitcher's actual runs allowed. Therefore, I don't think it's necessary to "tinker with the balance between fielding and pitching." Of course, you could try to use a FIP-based WAR, similar to Fangraphs' WAR, but, as others have pointed out, we don't know for sure if DIPS is valid for 19th century pitchers.
   109. Paul Wendt Posted: January 18, 2010 at 08:01 PM (#3440330)
That calculation of pitching value as a residual does make its accuracy at the team level strictly dependent on the accuracy of "defensive ratings" at the team level (where defense is either fielding or fielding+pitching). The balance simply becomes equivalent to the direct measure of fielding, so I don't know enough to persist or concede re that tinkering.

A basic question probably of general interest:
Is choneWAR documented in this respect and others or do you know some details from participation in internet discussion with Sean Smith?

P.S.
Officially we associate the second of four pitching eras with the 1893-1923 seasons because 1923/24 seems more clearly or more decisively than 1920/21 to mark a change in pitcher starts or complete game rates or innings. (I may have initiated that or formulated the biggest share of the argument but I don't recall such details.) I say "officially" because the only effect of the periodization so far is the classification of Coveleski, Faber, and Rixey with Rusie & Nichols rather than Pierce & Ford. That decision cannot be separated from the formal association with 1923/24.
   110. Alex King Posted: January 18, 2010 at 09:57 PM (#3440460)
If I change the "deadball era" to 1893-1923, I get an average of 369 IP for league leaders. The new ratio that I will use is 369/579 or 0.6384.

Chone's WAR is based on Tom Tango's description of how to calculate WAR, which can be found here.
In this post, Chone (RallyMonkey5) corroborates my earlier statement that he "subtracts a pro-rated portion of the team's defensive rating from the pitcher's actual runs allowed."
In this post, he describes the latest major updates to WAR.
Here he describes his catcher defense system.
Here is the original description of Chone's WAR, 1955-2008.
In this Fangraphs article, he describes his system of position adjustments over the Retrosheet era.

The best form of documentation for Chone's WAR is the player pages themselves. RAR is simply batting + baserunning + reached on error + Total Zone + Double Plays + Outfield Arm + Catcher defense + Position adjustment + Replacement level. Then RAR is converted to WAR by a runs to wins converter which varies according to the run environment of that season (these converters may be player-specific).
For pitchers, Chone adds defense to runs allowed, calculates the number of runs a replacement level pitcher would allow, finds RAR, and then uses a custom runs-to-wins converter (probably specific to each player) to determine WAR.
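The position-player calculation described above is a sum of run components followed by a runs-to-wins conversion. The sketch below assumes a flat 10 runs per win and invented component values purely for illustration; the real converter varies by season (and possibly by player), and none of these numbers come from a player page.

```python
from dataclasses import dataclass, astuple


@dataclass
class PositionPlayerRuns:
    """Run components listed in the post, all in runs for one season."""
    batting: float
    baserunning: float
    reached_on_error: float
    total_zone: float
    double_plays: float
    outfield_arm: float
    catcher_defense: float
    position_adjustment: float
    replacement_level: float

    def rar(self) -> float:
        # RAR is simply the sum of every component.
        return sum(astuple(self))


def to_war(rar: float, runs_per_win: float) -> float:
    """Convert runs above replacement to wins; runs_per_win is an assumed input."""
    return rar / runs_per_win


# Hypothetical season, not taken from any real player.
season = PositionPlayerRuns(20, 2, 1, 5, 0, 0, 0, -2, 20)
print(season.rar(), to_war(season.rar(), runs_per_win=10.0))
```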
   111. Paul Wendt Posted: January 18, 2010 at 10:27 PM (#3440492)
There is also AROM's reply to our questions at the top of this thread. In part:

11. AROM Posted: November 24, 2009 at 04:38 PM (#3395992)
Thanks Joe, but no rush. I won't be able to do a WAR update for about 2 weeks. There's a ton of things to get together for it.

Next update will be for 2009, but also 1952, replacing the crude methods I used for earlier seasons with the better retrosheet data.

[Paul Wendt:]
> AROM, are you also a'Rally o'Monkey and/or Alex Rodriguez 'o'm?

Stands for Anaheim Rallymonkey of Maryland. A play on the team name change from back when.

> How does Chone handle defense before 1955?

JAARF. Just another adjusted range factor. You've seen the type. I don't think it's any better or worse than the Davenport fielding ratings, or defensive win shares, or anything that is more advanced than Total Baseball fielding runs (in other words, based on balls in play not innings). I won't spend a second defending it, it's just a crude estimation until I get better data.

...
   112. HGM Posted: January 25, 2010 at 09:25 PM (#3446044)
Question for AROM... There's a discrepancy in the plate appearance total for players as compared to B-R. For example, Ozzie's WAR page says 10,501 PA. B-R has him at 10,778. Why is there this discrepancy?
   113. Paul Wendt Posted: January 26, 2010 at 01:01 AM (#3446255)
Evidently Sean Smith uses the calculation
: PA = AB + BB + HB
and Sean Forman uses the calculation
: PA = AB + BB + HB + SH + SB

For Ozzie Smith the discrepancy between those two estimates is
: 277 = 214 SH + 63 SB


One reason people calculate PA in different ways is that there are multiple definitions of On-Base Average and some choose to define PA as the denominator for on-base calculation.
(Sean Smith may be reporting his chosen denominator AB + BB + HB (?).)

Another reason is that people have access to more or less complete counts of basic outcomes and some choose to define PA as the sum of the basic counts covered in their datasets.
(Sean Forman does so. The only modern outcome not covered in his dataset is first base on catcher's interference.)

Finally some people choose to omit from contemporary PA the counts of any basic outcomes that are missing from their datasets for some past seasons. Thus they impose a kind of historical continuity.
   114. Paul Wendt Posted: January 26, 2010 at 01:23 AM (#3446270)
(too late to edit?)

"SB" should be SF, sacrifice flies.

Baseball-Reference provides some explanation of the reported variables. For example, visit any batting table and "mouse-over" the column label "PA" to see this explanation.
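With the SF correction applied, the two plate-appearance definitions reconcile exactly for Ozzie Smith. The sketch below uses only the totals quoted in this thread; the function name is illustrative.

```python
# Totals quoted in the thread for Ozzie Smith.
PA_WAR_PAGE = 10_501   # AB + BB + HB
PA_BREF = 10_778       # AB + BB + HB + SH + SF
SACRIFICE_HITS = 214
SACRIFICE_FLIES = 63


def full_pa(short_pa: int, sh: int, sf: int) -> int:
    """Add sacrifice hits and sacrifice flies to the shorter PA definition."""
    return short_pa + sh + sf


print(full_pa(PA_WAR_PAGE, SACRIFICE_HITS, SACRIFICE_FLIES) == PA_BREF)
```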
   115. HGM Posted: January 27, 2010 at 04:48 PM (#3447643)
Thanks
   116. Paul Wendt Posted: October 02, 2010 at 07:44 PM (#3653440)
Evidently the 2011 election season has opened.

"Chone's WARP" is now incorporated in Baseball-Reference player pages (as "WAR" in Player Value and Pitcher Value tables). So I expect that we may see more frequent use of it here, as elsewhere in cyberspace.

Newcomers should note that participant "AROM" in this thread is the author Sean Smith alias Chone, aka Rally or Rally Monkey.

See #110 for references to some of the author's presentations of the rating system.
   117. Paul Wendt Posted: October 14, 2010 at 08:46 PM (#3664025)
Esteban Rivera has posted several tables in "2011 Ballot Discussion" (#230-240 so far) which give leaders by fielding position with some ratings derived from Sean Smith's WAR, such as WAR per 162 games played.

The first group of tables covers 1869-1883 and 1884-1900 will follow.

Note that author Sean Smith (AROM in this forum) does not advocate the fielding component of WAR for any seasons outside the retrosheet era.
(source of following quotation on page one)

...
> How does Chone handle defense before 1955?

JAARF. Just another adjusted range factor. You've seen the type. I don't think it's any better or worse than the Davenport fielding ratings, or defensive win shares, or anything that is more advanced than Total Baseball fielding runs (in other words, based on balls in play not innings). I won't spend a second defending it, it's just a crude estimation until I get better data.
   118. Esteban Rivera Posted: October 14, 2010 at 08:55 PM (#3664033)
Thanks Paul for the clarification of the fielding component. As you confirmed, it's like every other metric, a best guess based on certain assumptions and the available information. I did include a disclaimer about this in the explanation before the charts but I'm glad you also brought it up here because it bears repeating (and the disclaimer may get passed over since it is buried halfway through the initial post).
   119. Paul Wendt Posted: October 15, 2010 at 02:13 AM (#3664190)
(Back again.) Quotation is from Esteban's introduction #230 in the other thread.

>>For the WAR numbers, please remember that there is no available pitching WAR for 1871 to 1875 when looking at the pitcher numbers.<<

Why not available?

>>Also, please remember that any fielding and baserunning numbers in the metric are based on educated estimates. They are not to be taken as exact gospel, only as the metric’s opinion based on the numbers and its framework. Two components of WAR are not available due to lack of data for this time, ROE and DPs (someone correct me if this is wrong). Due to the defensive norms of the era (lack of gloves, bad terrains, etc.), in my opinion, the ROE is an important aspect of the play during this time, so please keep this also in mind.
<<


We have fielding DP thruout.
We have batting GIDP for 1871-75 only.
We have running SB and CS for 1871-75 only.
We have batting and pitching HB thruout but the award of first base was not introduced until AA 1884, NL 1887.
We have batting and pitching SO thruout the early period ... thru 1896, then no batting for a while.

ROE, first base on error, has been compiled only for the retrosheet era, I think.

I don't know how Sean Smith (CHONE) or anyone else in particular defines all-time editions of ratings to cover times with some component data missing.
   120. Esteban Rivera Posted: October 15, 2010 at 02:33 AM (#3664198)
>>For the WAR numbers, please remember that there is no available pitching WAR for 1871 to 1875 when looking at the pitcher numbers.<<

Why not available?


Baseball Reference does not have pitching WAR listed for these years. I do not know if Sean Smith has these numbers (they are not on Baseball Projection either) or why they are not on Baseball Reference. Thanks for the information on what data is and is not available. I am not completely sure what components of all-time WAR are fully informed, which are educated guesstimates, and which are not included due to lack of information and/or guesstimates. It would be great if it could be clarified by someone in the know.
   121. DL from MN Posted: October 15, 2010 at 08:15 PM (#3664612)
Alright, so I ran Ned Williamson through using the WAR from baseball reference and adjusted the numbers for season length. This leads to Williamson coming in 4th on my list of players which just doesn't seem right. I know that BBREF WAR doesn't adjust for standard deviations. What is the general opinion of the correlation of the run estimator to wins for pre 1893 WAR on BBREF?
   122. Alex King Posted: October 16, 2010 at 12:16 AM (#3664780)
Esteban/120:

Regarding NA pitchers, I've estimated Tommy Bond's WAR by imitating Sean Smith's methods. I based WAR on park-adjusted, opponent-adjusted RA; I didn't include defensive support. Here are the numbers I have for Bond:

1874 0.7
1875 11.4

From the NA pitchers, however, I've only estimated Bond's WAR, as he is the only NA pitcher who I've included in my consideration set.
   123. Alex King Posted: October 16, 2010 at 12:55 AM (#3664906)
DL/121:

I'm not entirely sure how BBREF WAR treats pre-1893 players, but I do remember reading somewhere that the run estimator in bWAR is based on linear weights, but adjusted so that the runs for each team add up to their actual runs scored.

Also, I don't think that Williamson's hitting is the reason for his high rating; rather, it's a result of his fielding. BBREF has him at +87 in just 5082 PA, or +10/600 PA. Williamson's also at +56 for the position adjustment, partially reflecting Sean's high value for the 1880s 3B position adjustment (although Williamson did play 450 G at SS compared to 716 at 3B). For the pre-1900 period, Smith seems to have the 3B position adjustment at ~4 runs/600 PA (I'm only presenting position adjustments "per 600 PA" as a matter of convenience; I'm pretty sure Smith calculates position adjustments using defensive games played). There's no published value for this position adjustment; I estimated it from BBREF player pages.
   124. Paul Wendt Posted: October 26, 2010 at 02:16 AM (#3675766)
Vaguely I recall reading that Bill James grades Ed Williamson in the D range at shortstop, but Williamson is not listed at shortstop in the Win Shares book. Perhaps the only extant very low mark is by Clay Davenport (FRAR or a rate version of FRAR).

Baseball-Reference shows a slightly above replacement-level rating for Williamson in four seasons at shortstop, by Sean Smith (dWAR). That is in sharp contrast to Henry Larkin, whom Bill James grades D- at first base. Smith rates Larkin nearly -1 dWAR per season.

The catcher-shortstop Jack Rowe provides another contrast with Larkin; disagreement between James and Smith. A regular catcher 1879-84, Rowe was moved to shortstop(!) for 1885 and continued there through 1890. Bill James grades him F at short. Smith rates him -1 for his first season and -1 for his last but merely -1 for six seasons --much better than Larkin at first.
   125. Alex King Posted: November 17, 2010 at 06:48 AM (#3691799)
Thoughts on 1970s Infielders:

The position adjustments for the 1970s infielders are possibly the largest source of disagreement between DanR's WAR and ChoneWAR. I converted DanR's replacement levels to position adjustments like those used in ChoneWAR, to compare the two for 1970s IF:

Pos, DanR, Chone
SS, 22, 10
2B, 8, 4
3B, -1, 4

The striking thing about Chone's adjustments, however, is that the position switching data they're based on is very strange for 1970s SS/3B: http://www.fangraphs.com/blogs/index.php?author=12
The 1970’s

The infield/outfield gap keeps getting bigger as we go back in time, for this decade it’s 20.2 on offense and 8.6 on defense. Center fielders had only a 5.6 run advantage on the corners. Second basemen were 7.6 runs worse than shortstops, but 3.4 run better than third basemen. So if we add that up, shortstops must have been 10 or more runs better than third basemen, right?

If only it was that easy. In fact, players who played both third and short in the 1970’s were 1.1 runs worse as third basemen. Sometimes the pieces of this data puzzle do not fit very well together. In every other decade, shortstops were at least 4.7 runs better than third basemen and at least 6.6 runs excluding the 2000’s. For the 1970’s the other pieces – shortstop to 2B, 2B to 3B, show the normal pattern. Chalk this one up to a fluke, and I’ll try to make the adjustments make sense.


What follows is some speculation/thought experiments. All my assertions below could be verified or disproved using the Lahman database, I think; unfortunately, I'm not sure if I'll have the time/database skills to do this research anytime in the near future.

Some people have speculated that the massive offensive gap between SS and other IF positions in the 70s might be caused by SS being smaller than the other positions. If this assertion is true, we might expect that 70s 3B, who were unusually good at defense, consist in large part of players who in other eras might have been SS, but were moved to 3B because of their size. As a result, the SS/3B comparison might break down for the 1970s, since 3B who also played SS might have been unusually well-suited for SS (i.e. they were the players who would have been SS in other eras). If we completely ignore the SS/3B conversions, position adjustments might look something like this:

SS 13
2B 5
3B 2

Still quite different from DanR's numbers. DanR has a 14-run difference between SS and 2B, but the position switching data points to no more than an 8-run difference. SS and 2B are much closer in skillset than SS and 3B, and so the selective sampling issue should be far less. Is there any reason why we should expect a greater gap between 70s SS/2B than, say, 2000s SS/2B, given that the defensive gaps are quite similar?
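The gaps discussed above fall straight out of the three sets of position adjustments in this post. A small sketch, using only those numbers (the dictionary and function names are illustrative):

```python
# Position adjustments for 1970s infielders, in runs, as given in the post.
DANR = {"SS": 22, "2B": 8, "3B": -1}
CHONE = {"SS": 10, "2B": 4, "3B": 4}
# Speculative set from the post that ignores the SS/3B switching data.
NO_SS_3B = {"SS": 13, "2B": 5, "3B": 2}


def gap(adjustments: dict, pos_a: str, pos_b: str) -> int:
    """Difference in position adjustment between two positions."""
    return adjustments[pos_a] - adjustments[pos_b]


for name, table in [("DanR", DANR), ("Chone", CHONE), ("No SS/3B", NO_SS_3B)]:
    print(name, "SS-2B:", gap(table, "SS", "2B"), "SS-3B:", gap(table, "SS", "3B"))
```

The SS-2B gap is 14 runs for DanR against 8 for the set that drops the SS/3B conversions, which is the difference the closing question is asking about.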
   126. Joey Numbaz (Scruff) Posted: November 19, 2010 at 04:29 AM (#3693408)
Artificial turf.

You had to have insane range to play SS on turf in the 70s. Managers adjusted to this, and that's why by and large those guys didn't hit.

When the turf left, we went back to having SS's that could hit, like we did in the 40s and 50s. Even as late as the 60s we still had guys like Rico Petrocelli playing SS.

In the 1970s those guys disappeared. Well, except for Yount I guess. But even he didn't start hitting for real until the 80s. My vote is for turf.
   127. Howie Menckel Posted: November 19, 2010 at 05:15 AM (#3693418)
With no data to back this up...
:)

It seemed to me as if managers - to their credit - recognized how different the game was becoming with turf, and how important it was to have speed there.

But didn't they overreact?

Roger Metzger had a 70 OPS+ in 1971 in 617 PA as the Astros' SS, for instance.
The Astros were so pleased that they gave him 715 PA in 1972 as he racked up a 58 OPS+.
This alarmed them so much that he remained the regular in 1973-76, with OPS+s of 74, 76, 68, 66.
He finally lost his regular gig in 1977, with an OPS+ of 51 in 307 PA that led him to share duty with the 'offense option' Julio Gonzalez and his 69 OPS+ in 413 SS-2B PA.

And that's hardly an isolated instance.

I remember Metzger as a nice fielder.
I'd be curious - if the Astros could have found a 90 OPS+ SS with a mediocre glove, would that have been a worse play? Did many teams do that? Not that I recall offhand...
   128. Alex King Posted: November 19, 2010 at 05:18 AM (#3693421)
But shouldn't turf affect 2B too? And if those SS had insane range, shouldn't they have been better than +8 at 2B when they switched (from Sean's study)?
   129. DL from MN Posted: November 20, 2010 at 12:30 AM (#3694036)
If your team had 0 WAR you would have a .320 winning percentage (52-110). A league-average team has a .500 winning percentage (81-81). Therefore each team should average 81 - 52 = 29 WAR, and 29 WAR per team * 30 teams = 870 WAR to distribute across all the players. Does WAR reconcile to having 870 WAR to distribute across all players? It really should be a zero-sum amount.
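The league-wide total follows from three inputs: the replacement-level winning percentage, the schedule length, and the number of teams. A minimal sketch of that arithmetic (the function name and defaults are illustrative):

```python
def league_war_budget(replacement_wpct: float = 0.320,
                      games: int = 162,
                      teams: int = 30) -> float:
    """Total WAR available if an average team is .500 and 0 WAR is replacement."""
    replacement_wins = round(replacement_wpct * games)   # 51.84 rounds to 52
    average_wins = games / 2                              # 81 in a 162-game season
    return (average_wins - replacement_wins) * teams


print(league_war_budget())
```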
   130. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 20, 2010 at 01:00 AM (#3694049)
For the AL from 1987 to 2005, my low aggregate WARP for position players with over 50 PA is 235 and my high is 253. The average is 243, which is 17.36 per team or 1.93 per lineup slot. That seems about right, no?
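The per-team and per-slot figures follow from the 243 average. The sketch assumes a 14-team American League for the whole 1987-2005 span and nine lineup slots; neither number is stated in the post, though both reproduce the quoted results.

```python
AVG_AL_POSITION_WAR = 243   # average aggregate from the post
AL_TEAMS = 14               # assumption: AL size throughout 1987-2005
LINEUP_SLOTS = 9            # assumption: DH league, nine batting slots

per_team = AVG_AL_POSITION_WAR / AL_TEAMS
per_slot = per_team / LINEUP_SLOTS

print(f"per team: {per_team:.2f}, per lineup slot: {per_slot:.2f}")
```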
   131. DL from MN Posted: November 20, 2010 at 05:52 AM (#3694144)
I'm just wondering if WAR reconciles with wins. Wins are zero sum across the league. If you think you're setting your replacement level at .320 W% but there are enough wins above replacement to make the average team 83-79 then there is something wrong. Either the run environment isn't correct or defense isn't being calculated correctly or there is something missing (which we know there is actually). I suppose if you're within a 1 win margin of error every year you're doing fine.
   132. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 20, 2010 at 06:40 PM (#3694299)
Well, my point is that my aggregate position-player WAR totals for the AL are always within 7% of each other (I imagine the variance is due to the performance of players with fewer than 50 PA). So that's a good sign. I can't see if they really all add up to total league WAR because I don't have complete pitching numbers. But roughly, if I give an average SP 2.4 WAR per 200 innings and an average RP 0.7 WAR per 200 innings, then 17.4 + 13.6 = 31 WAR per team. Since I use a benchmark of 49-50 wins rather than 52 (that comes from Nate Silver for position players and Tango for pitchers), then it seems I'm right on track.
   133. Best Dressed Chicken in Town Posted: November 21, 2010 at 10:54 PM (#3694933)
It looks to me like there were 915 total WAR this season, 851 last season. I didn't check any further. That seems like more than just rounding errors.
   134. Paul Wendt Posted: November 23, 2010 at 04:40 AM (#3695761)
DL,
What is the argument for drawing the line between 83-79 ("there is something wrong") and 82-80 ("you're doing fine")?
   135. DL from MN Posted: November 23, 2010 at 05:10 AM (#3695772)
5% error is probably acceptable but more than that and you're probably missing something.
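As a rough check, the two season totals reported in #133 can be compared against the 870 WAR budget from #129 and the 5% tolerance suggested here. This is a sketch only: pairing those figures this way is an assumption, since the posts do not say the totals were computed on the same basis.

```python
EXPECTED_LEAGUE_WAR = 870           # from #129
OBSERVED = {"this season": 915,     # totals reported in #133
            "last season": 851}
TOLERANCE_PCT = 5.0                 # threshold suggested in #135


def pct_deviation(observed: float, expected: float) -> float:
    """Signed percentage difference of observed from expected."""
    return (observed - expected) / expected * 100


for label, total in OBSERVED.items():
    dev = pct_deviation(total, EXPECTED_LEAGUE_WAR)
    verdict = "within" if abs(dev) <= TOLERANCE_PCT else "outside"
    print(f"{label}: {dev:+.1f}% ({verdict} tolerance)")
```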
   136. Joey Numbaz (Scruff) Posted: December 06, 2010 at 11:25 PM (#3704342)
Does Chone's WAR adjust for season length? I assume it does, but want to make sure.
   137. Alex King Posted: December 07, 2010 at 12:36 AM (#3704414)
Joe D./136:

No, CHONE (bb_ref) WAR does not adjust for season length.
   138. Joey Numbaz (Scruff) Posted: December 07, 2010 at 12:54 AM (#3704432)
Really? Because he's got 1981 as Dawson's best year - so that made me figure he must.

It was his best year, but I don't think it was more than 50% better than any other season he had, after adjusting for season length . . .
   139. Alex King Posted: December 07, 2010 at 01:17 AM (#3704454)
Looking at Dawson's page, the rep level goes from 19 in 1980 to 13 in 1981 back to 19 in 1982, showing that WAR isn't adjusted for season length. Dawson benefits that year from being +18 in the field and +5 on the bases despite the shortened season; plus it was his best offensive year at +29 runs (157 OPS+ was the highest of his career).

I'd never noticed before just how good that year is by choneWAR--11.0 WAR when adjusted.
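The replacement-level figures are consistent with simple scaling by games played. The sketch below assumes the Expos played 108 games in 1981, a figure that is not given in the thread, and uses illustrative names throughout.

```python
FULL_SEASON_GAMES = 162
EXPOS_1981_GAMES = 108   # assumption: not stated in the thread


def scale_to_full_season(value: float, games_played: int,
                         full_games: int = FULL_SEASON_GAMES) -> float:
    """Scale a counting value from a short season up to a full schedule."""
    return value * full_games / games_played


# A 19-run replacement level over 162 games shrinks to about 13 over 108,
# matching the 1980 -> 1981 drop noted above.
short_season_rep = 19 * EXPOS_1981_GAMES / FULL_SEASON_GAMES
print(round(short_season_rep))

# Working backwards from the 11.0 adjusted WAR gives the implied raw figure.
implied_raw_war = 11.0 * EXPOS_1981_GAMES / FULL_SEASON_GAMES
print(round(implied_raw_war, 1))
```

Under that assumption the adjustment factor is 1.5, so the 11.0 adjusted figure corresponds to roughly 7.3 unadjusted WAR.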
   140. user Posted: December 07, 2010 at 02:50 AM (#3704512)
NAME             IP   ERA+  WAR
George Bradley   487   88   -3.0
Harry Salisbury   89  113   -1.1
Harry adj.       487  113   -6.0

Again, how? How can Salisbury be much worse than Bradley, pitching for the same team, with a 25-point ERA+ advantage?



Cross-posted from 2011 ballot thread.

I'm confident this is entirely a function of ERA vs RA. WAR is based on RA.

Bradley has an ERA of 2.85 and an RA of 6.67. Salisbury an ERA of 2.22 and an RA of 7.28.


Looking at the key components for pitching WAR: Bradley gave up 361 runs. WAR estimates a replacement-level pitcher would have given up 313. Bradley therefore has -48 RAR, which is mapped to -3.0 WAR.


Pro-rating Salisbury to 487 IP gives 394 runs given up, compared to 312 for a replacement pitcher (the 1-run discrepancy presumably being because of rounding), for -82 RAR and -6.0 WAR. Given non-linearities in the RAR->WAR conversion (although 48->3 and 82->6 seem somewhat counterintuitive) and some rounding errors, everything seems explained by RA vs ERA.
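The runs and RAR figures in this post can be rebuilt from the RA and innings quoted above. This sketch uses only numbers from the post; the function names are illustrative, and the runs-per-win values are simply what the quoted RAR and WAR pairs imply, not a published converter.

```python
def runs_allowed(ra_per_9: float, innings: float) -> float:
    """Total runs allowed implied by a per-nine-innings rate."""
    return ra_per_9 * innings / 9


def rar(replacement_runs: float, actual_runs: float) -> float:
    """Runs above replacement for a pitcher: fewer runs than replacement is positive."""
    return replacement_runs - actual_runs


bradley_runs = round(runs_allowed(6.67, 487))      # Bradley's RA over 487 IP
salisbury_runs = round(runs_allowed(7.28, 487))    # Salisbury pro-rated to 487 IP

bradley_rar = rar(313, bradley_runs)
salisbury_rar = rar(312, salisbury_runs)
print(bradley_runs, bradley_rar, salisbury_runs, salisbury_rar)

# Runs per win implied by the quoted WAR values (-3.0 and -6.0).
print(bradley_rar / -3.0, round(salisbury_rar / -6.0, 1))
```

The implied converter drops from 16 runs per win to under 14 between the two lines, which is the non-linearity the post flags as odd.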
   141. Alex King Posted: December 07, 2010 at 08:03 AM (#3704636)
There is a problem with the replacement levels for 1876-1891 pitchers.

Year  NL    AA    UA/PL
1876  .466
1877  .470
1878  .460
1879  .463
1880  .444
1881  .449
1882  .447  .443
1883  .447  .449
1884  .423  .440  .414
1885  .441  .438
1886  .421  .425
1887  .441  .429
1888  .436  .423
1889  .429  .418
1890  .448  .442  .425
1891  .434  .416

Replacement level is supposed to be .420 for starting pitchers.

Replacement level is considerably higher for the 1876-1879 period, suggesting that these pitchers may be underrated. Meanwhile the large fluctuations between 1884 and 1891 are quite disturbing and cannot be explained by rounding errors alone. There are no adjustments for the AA, UA, and Players' League, but we knew that already--mainly this affects Mullane, who goes from being a serious candidate to not even close.
I'll probably try to put together some pitcher WAR estimates for Welch, McCormick, etc. based on a .420 replacement level, with an AA discount, and otherwise following Sean Smith's methods.
   142. Best Dressed Chicken in Town Posted: December 07, 2010 at 08:25 AM (#3704644)
Replacement level is considerably higher for the 1876-1879 period

I'd guess it should be, but I don't know if this is a conceptual adjustment or a mistake.
   143. bjhanke Posted: December 07, 2010 at 10:38 AM (#3704655)
User, comment 140 -

First, thanks for the info and the attempt to help out rather than just criticize. Do I understand this right when I think that RA means runs allowed, both earned and unearned? And that WAR uses that RA to compute WAR for pitchers? That's what your computations look like to me, but I'm looking at this with little context. If I do understand this right, does that not amount to assigning all responsibility for errors to the pitcher and none to the fielders? If that's true, then I can certainly see why WAR would make these massive adjustments, given the numbers of errors in the time period. However, I disagree with the premise pretty strongly. I'm not one of the people who think that early pitching just was tossing the ball up there with no stuff on it, like it was batting practice or Home Run Derby or something, and that all pitching success should be attributed to the fielders. But I do think that there is defense in early baseball, and that there are differences in defenses between teams.

If I got this wrong, please let me know. But if I didn't, then I guess, with considerable lack of confidence, that I have found a flaw in WAR. Since I have done no work on WAR, and have not even attempted to deconstruct it, it seems unlikely that I would find a problem just en passant looking at Will White for the Hall of Merit. But if I do understand everything correctly, then it may actually be true. So, do I have this right, and if so, does WAR adjust for the problem, or is it a real one for the system? Thanks.

Hmm. I guess I ought to add that I cannot, offhand, think of any way to try to isolate the pitching contribution from the defense back in the 1880s. Sean Smith has certainly put much more effort into that than I have, but if he couldn't figure out how to separate the two, either, then I can see why he would use RA instead of ERA. RA is what produces the final results. ERA is, by comparison, an artificial construct which attempts to separate pitching from fielding, but is hardly perfect at it. - Brock
   144. David Concepcion de la Desviacion Estandar (Dan R) Posted: December 07, 2010 at 06:03 PM (#3704871)
Not quite right, Brock. CHONE WAR starts with RA rather than ERA but then adjusts for the overall quality of the team defense (estimated as best we can, which is phenomenally difficult). That's the Rdef column on the baseball-reference pages. The problem with ERA is that it attempts to account for surehandedness but not for range (or lack thereof).
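
To make the distinction in #144 concrete, here is a minimal Python sketch: start from runs allowed per nine innings (RA9, earned plus unearned), then credit or debit the pitcher for the estimated quality of the defense behind him. The function name, the proration by innings, and the sample numbers are all assumptions for illustration. This is not Sean Smith's actual formula, only the general shape of the adjustment.

```python
def defense_adjusted_ra9(runs_allowed, innings, team_def_runs, team_innings):
    """Return (raw RA9, RA9 with the defense's estimated contribution removed).

    team_def_runs: estimated runs the team's fielders saved (+) or cost (-)
    over team_innings; the pitcher's share is prorated by his innings.
    """
    raw_ra9 = 9.0 * runs_allowed / innings
    # Runs the defense saved while this pitcher was on the mound (prorated).
    def_share = team_def_runs * innings / team_innings
    # A good defense (positive share) hid runs the pitcher would otherwise
    # have allowed, so add them back; a bad defense does the reverse.
    adjusted_runs = runs_allowed + def_share
    return raw_ra9, 9.0 * adjusted_runs / innings


# Hypothetical pitcher: 300 runs in 600 innings behind a defense
# estimated at -40 runs over 900 team innings.
raw, adj = defense_adjusted_ra9(300, 600.0, -40.0, 900.0)
```

With these made-up inputs the poor defense moves the pitcher from 4.5 to 4.1 runs per nine, which is the direction of the correction described above.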
   145. bjhanke Posted: December 08, 2010 at 12:33 AM (#3705295)
Dan -

Well, I find myself in complete agreement, especially with "which is phenomenally difficult." And if you're starting with RA instead of ERA, I understand completely how you got to wildly different conclusions than I did. What I intend to do is take WAR into account in my attempts at balancing acts. This will certainly drop White off my ballot, because your conclusions are so low that they will drag him out in spite of being just one component of the evaluation. But it's fair to have one method that starts with ERA and another that starts with just RA, because it's maybe impossible to determine how much responsibility to assign to the fielders and pitchers for the unearned runs. Using ERA assigns almost none. Using RA assigns almost all. Reality is probably in between. I've never used a method before that started with RA, so my position needs to move towards "in between."

Thanks! - Brock
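
The "in between" position in #145 can be sketched as a single dial. In the Python below, `pitcher_share` is a hypothetical parameter (not part of any published system) for the fraction of unearned runs charged to the pitcher: 0.0 reproduces ERA and 1.0 reproduces RA per nine innings. It ignores the team-defense adjustment described in #144, and the sample line is invented.

```python
def blended_ra9(earned_runs, unearned_runs, innings, pitcher_share):
    """Runs per nine innings, charging the pitcher a fraction of unearned runs.

    pitcher_share = 0.0 reproduces ERA; 1.0 reproduces RA9.
    """
    if not 0.0 <= pitcher_share <= 1.0:
        raise ValueError("pitcher_share must be between 0 and 1")
    charged = earned_runs + pitcher_share * unearned_runs
    return 9.0 * charged / innings


# Hypothetical 1880s-style line: 150 earned and 150 unearned runs in 600 innings.
era = blended_ra9(150, 150, 600.0, 0.0)     # ERA view
ra9 = blended_ra9(150, 150, 600.0, 1.0)     # RA view
middle = blended_ra9(150, 150, 600.0, 0.5)  # halfway between the two
```

When half of a pitcher's runs are unearned, as in this invented line, the choice of starting point doubles his apparent runs allowed, which is why the two approaches diverge so sharply for this era.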
   146. Alex King Posted: December 10, 2010 at 03:07 AM (#3707510)
I calculated Will White's WAR numbers, but using a .420 replacement level instead of the higher one Sean Smith uses between 1876 and 1879 (I called this result mWAR). Here are the results:

Year mWAR WAR
1877 0.4 0.3
1878 2.5 1.3
1879 3.1 1.0
1880 4.2 2.2
1881 -0.3 -0.5

After 1881, the WAR numbers really depend on the AA adjustment that you use--I use a pretty harsh discount for the early years, which makes White a non-candidate in my eyes. The more important result, however, is that White's low WAR totals in the NL are not an artifact of ChoneWAR's strange replacement levels for these years; instead, they indicate that he wasn't very good in this time period (likely due to the high number of unearned runs).
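
For readers who want to try a similar recalculation, here is a rough Python sketch of re-expressing a WAR figure against a different replacement level. It assumes innings convert to games at nine apiece and that each game carries (0.5 - replacement winning percentage) wins of replacement credit. That is a simplification, the inputs are hypothetical, and it will not reproduce the table in #146, which relies on details of the system not given in this thread.

```python
def rebase_war(war, innings, old_repl_wpct, new_repl_wpct):
    """Re-express WAR against a different replacement-level winning percentage.

    Assumes a pitcher's innings convert to games at 9 innings each and that
    each game is worth (0.5 - repl_wpct) wins of replacement credit.
    """
    games = innings / 9.0
    return war + (old_repl_wpct - new_repl_wpct) * games


# Hypothetical season: 1.0 WAR in 450 innings, moving from a .465 baseline
# to a .420 baseline.
mwar = rebase_war(1.0, 450.0, 0.465, 0.420)
```

Because 1870s and 1880s pitchers threw so many innings, even a few points of replacement level translate into whole wins per season under this kind of arithmetic.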
   147. Alex King Posted: December 10, 2010 at 03:12 AM (#3707516)
duplicate
   148. Alex King Posted: December 10, 2010 at 03:12 AM (#3707517)
Replacement level is considerably higher for the 1876-1879 period

I'd guess it should be, but I don't know if this is a conceptual adjustment or a mistake.


For HOM purposes, I think it's wrong to use the lower replacement levels, since it's unfair to these earlier pitchers.
   149. Paul Wendt Posted: December 10, 2010 at 04:45 AM (#3707569)
Alex King #141
>>Replacement level is considerably higher for the 1876-1879 period, suggesting that these pitchers may be underrated. Meanwhile the large fluctuations between 1884 and 1891 are quite disturbing and cannot be explained by rounding errors alone.

Best #142
>>I'd guess it should be, but I don't know if this is a conceptual adjustment or a mistake.

Best, Do you guess replacement level should be higher for 1876-1879 because you interpret the *1882* entry of the AA as expansion?

Alex #148
>>For HOM purposes, I think it's wrong to use the lower replacement levels, since it's unfair to these earlier pitchers.

Alex, Do you mean wrong to use the *higher* replacement levels, because those are unfair to 1876-1879 pitchers?

About the replacement levels tabulated in #148:
Have you inferred those from replacement runs reported for particular players at baseball-reference or baseballprojections? If so, have you used games scheduled or games played as an associated measure of playing time (one or the other, I suppose)? Differences between scheduled and played may be one cause of year-to-year variation in replacement levels if Sean Smith has not handled them appropriately, or one cause of year-to-year mismeasurement in table #148 if you have not handled them appropriately.
   150. Alex King Posted: December 10, 2010 at 04:53 AM (#3707580)
Alex #148
>>For HOM purposes, I think it's wrong to use the lower replacement levels, since it's unfair to these earlier pitchers.

Alex, Do you mean wrong to use the *higher* replacement levels, because those are unfair to 1876-1879 pitchers?


Yes, I meant that the higher replacement levels are unfair.

About the replacement levels tabulated in #148:
Have you inferred those from replacement runs reported for particular players at baseball-reference or baseballprojections? If so, have you used games scheduled or games played as an associated measure of playing time (one or the other, I suppose)? Differences between scheduled and played may be one cause of year-to-year variation in replacement levels if Sean Smith has not handled them appropriately, or one cause of year-to-year mismeasurement in table #148 if you have not handled them appropriately.


I assume you're referring to the replacement levels in #141. I calculated those replacement levels based on leaguewide totals, so there shouldn't be any problems created by poor estimates of playing time. I do think that some of the variation can be explained by rounding errors.
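
One way to back a replacement level out of leaguewide totals, and to see how much rounding alone could move it, is sketched below in Python. The exact method used for #141 is not given in this thread, so this is an assumption: it treats league pitchers as summing to zero wins above average, so that total pitcher WAR equals (0.5 - replacement winning percentage) times league innings divided by nine. The league in the example is invented.

```python
def implied_replacement_wpct(league_pitcher_war, league_innings):
    """Infer the replacement-level winning percentage from league totals.

    Assumes league pitchers sum to zero wins above average, so total WAR
    equals (0.5 - repl_wpct) * (league_innings / 9).
    """
    games = league_innings / 9.0
    return 0.5 - league_pitcher_war / games


def rounding_bounds(league_pitcher_war, league_innings, n_pitchers, step=0.1):
    """Range of implied levels if each listed WAR was rounded to `step`."""
    max_err = n_pitchers * step / 2.0  # worst case: every value off by half a step
    low = implied_replacement_wpct(league_pitcher_war + max_err, league_innings)
    high = implied_replacement_wpct(league_pitcher_war - max_err, league_innings)
    return low, high


# Hypothetical league: 20 pitchers, 4,500 innings, 20.0 total pitcher WAR.
level = implied_replacement_wpct(20.0, 4500.0)
low, high = rounding_bounds(20.0, 4500.0, n_pitchers=20)
```

In this invented league the implied level is .460 and worst-case rounding moves it only between .458 and .462, so rounding can account for small wobbles but not for swings of twenty points.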
   151. AROM Posted: December 10, 2010 at 05:22 AM (#3707607)
Replacement levels are higher for pitchers in some of those years because the pitcher has less control of the outcome of games. When league levels of walks, strikeouts, and homeruns are very low, then there is less cost to a team in having a pitcher out there who is not as good at those things. More balls in play means less differentiation between pitchers, since even those who reject DIPS must admit the fielder has some impact on the results of those balls.
   152. Best Dressed Chicken in Town Posted: December 10, 2010 at 05:54 AM (#3707635)
Best, Do you guess replacement level should be higher for 1876-1879 because you interpret the *1882* entry of the AA as expansion?

My thinking was just that in an immature game, the players are easier to replace. I didn't think AROM had taken this into account, but a serious study of the quality of play in that era should. (How much one should account for that in cross-era comparison is another question.)
   153. Alex King Posted: December 10, 2010 at 07:51 AM (#3707784)
Replacement levels are higher for pitchers in some of those years because the pitcher has less control of the outcome of games. When league levels of walks, strikeouts, and homeruns are very low, then there is less cost to a team in having a pitcher out there who is not as good at those things. More balls in play means less differentiation between pitchers, since even those who reject DIPS must admit the fielder has some impact on the results of those balls.


I agree; this sounds like a very reasonable adjustment. Is there any particular justification for using a replacement level of .460-.470 in the 1870s?
   154. bjhanke Posted: December 10, 2010 at 03:47 PM (#3707901)
Alex -

Thanks for comment 146. It gives me an even better idea of how large the adjustments for this era are in modern WAR systems. I've never adjusted the replacement rate by era, nor started looking at pitchers by starting with total Runs Allowed instead of ERA. On the other hand, the people who do make these adjustments have put a lot more time into analyzing the numbers than I have. If I give them ANY respect for that - and surely I should - then White just drops out of contention because of the unearned runs. The important thing to me, for the last couple of years, was that I find out what adjustments were being made, not that I necessarily agree with all of them. The most advanced modern analysts don't agree with each other. But if I start with trying to eyeball some balance between the various methods and their conclusions, then I need to accumulate all the reasoned opinions that I can and try to balance them all. I've been rating White as if all analysis should start with ERA and then make adjustments, none of which included assigning the pitcher any responsibility for unearned runs. I've also been using a level replacement rate, well under .400. If I factor the opposing positions in at all, then White just collapses. It also means that I should start looking at WAR results to see if there are (and there should be) any pitchers from that era who are advantaged by this approach. If White's unearned runs are WAY over the league average, then some one or group of pitchers has to have their unearned runs be less than average. WAR will rate such pitchers higher than I would, after adjustments for replacement rate, probably much higher.

So thanks for the help. I now have to try to figure out if I even have a position on variable replacement rates, much less what that position ought to be. That will take a while. Right now, I can see logic on both sides. But then, what else is new?

Anyway, thanks to all who helped, particularly Joe, Dan and Alex! - Brock
   155. Paul Wendt Posted: December 14, 2010 at 04:42 AM (#3709987)
AROM #151,
Best #152,

Alex King #141 observes that the replacement level incorporated in WAR is high for 1876-1879 specifically. That is in contrast to 1880, not only to 1900 and 1960.
For example,
1876-79: small var around .465
1880-83: small var around .447

The "immature game" (Best) isn't a candidate to explain that difference. By its nature maturity happens slowly. If the game was much more mature in the early 1880s than late 1870s, that is News.

"league levels of walks, strikeouts, and homeruns" (AROM) isn't a plausible explanation either, for a similar reason. Those rates were still very low in the early 1880s. One crucial rules change governing walks and strikeouts would probably cause one change in pitcher "control of the outcome of games", but the crucial numbers of balls and strikes were changed several times.
   156. David Concepcion de la Desviacion Estandar (Dan R) Posted: December 14, 2010 at 08:32 PM (#3710601)
Since CHONE WAR sets its positional weights by decades, I imagine it sets its pitcher replacement levels by decades as well. Needless to say, I think it's preposterous.
   157. Alex King Posted: December 15, 2010 at 01:28 AM (#3710917)
Dan/156:

In theory, yes, it's preposterous. In practice, however, the decade replacement levels only have a big effect on the 1870-1880 pitchers and the AA position players (Sean Smith uses a constant replacement level for the AA, treating the 1882 AA the same as the 1888 AA). Otherwise, the practical effect is quite small; when I smoothed WAR's replacement levels, I only saw a few minor changes to my ballot.
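
The thread does not say how the replacement levels were smoothed, so the following Python is only one plausible approach: a centered moving average over consecutive seasons, with the window shrinking at the ends of the series. The input uses the approximate figures quoted in #155 (.465 for 1876-79, .447 for 1880-83); the function name and window size are assumptions.

```python
def smooth_levels(levels_by_year, window=5):
    """Centered moving average over yearly replacement levels.

    levels_by_year: dict mapping year -> replacement winning percentage,
    assumed to cover consecutive seasons. The window shrinks at the ends
    of the series instead of padding.
    """
    years = sorted(levels_by_year)
    half = window // 2
    smoothed = {}
    for i, year in enumerate(years):
        nearby = years[max(0, i - half): i + half + 1]
        smoothed[year] = sum(levels_by_year[y] for y in nearby) / len(nearby)
    return smoothed


# Step function using the approximate figures quoted in #155.
raw = {year: 0.465 for year in range(1876, 1880)}
raw.update({year: 0.447 for year in range(1880, 1884)})
smoothed = smooth_levels(raw)
```

With a five-year window the abrupt drop between 1879 and 1880 becomes a gradual slide (about .458 in 1879 and .454 in 1880), which is the kind of change that matters most for pitchers whose careers straddle the boundary.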
   158. Bleed the Freak Posted: November 29, 2012 at 02:47 PM (#4312561)
On the Buzz Arlett thread, James Newburg discussed his reliance on DRA as his main defensive metric.

The basis for my position player rankings is ~75% Dan R (25% Old Dan, with 75% weight to salary and 25% to PA, plus 50% Dan R modified for DRA defensive metrics) and 25% WAR (12.5% Rally and 12.5% Baseball-Reference).
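
As a rough illustration of how such a blend works, here is a short Python sketch. The weights are one reading of the description above (25% original Dan R, 50% Dan R with DRA defense, 12.5% each for Rally and Baseball-Reference WAR) and have not been confirmed by the poster. The dictionary keys and the sample player are hypothetical, and the salary/PA weighting inside the original Dan R figure is left out.

```python
# Weights as read from the description above (an interpretation, not confirmed).
WEIGHTS = {
    "dan_r_original": 0.25,
    "dan_r_with_dra": 0.50,
    "rally_war": 0.125,
    "bbref_war": 0.125,
}


def blended_value(values):
    """Weighted average of one player's value under each system."""
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(WEIGHTS[name] * values[name] for name in WEIGHTS)


def dra_swing(values):
    """Change in the Dan R figure when DRA replaces his own defensive numbers."""
    return values["dan_r_with_dra"] - values["dan_r_original"]


# Hypothetical player, with every system expressed on a common wins scale.
player = {"dan_r_original": 60.0, "dan_r_with_dra": 68.0,
          "rally_war": 64.0, "bbref_war": 62.0}
total = blended_value(player)
swing = dra_swing(player)
```

Sorting players by `dra_swing` would produce lists of the kind that follow, while `blended_value` gives the overall ranking figure.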

Below is a list of players with the largest fluctuations in my valuation system between player value using Dan's defense and player value using DRA ratings:

Huge positive swing with use of DRA:
Cupid Childs
Joe Gordon
Keith Hernandez
Richie Ashburn
Fred Clarke
Tommy Leach - straddles PHOM, though Rally/BR aren't at all fans
Art Fletcher - moves to consideration set
Ed Delahanty
Frankie Frisch
Ivan Rodriguez
Bobby Veach - moves to consideration set
Jimmy Sheckard
Mike Griffin
Bill Dahlen
Harry Hooper
Luke Appling
Joe Tinker
Dave Bancroft - moves to PHOM
Andruw Jones - potential PHOM
Roberto Clemente
Sam Rice
Roy White - moves to consideration set
Buddy Bell - everyone should review his case - solid to excellent by the metrics available
Rickey Henderson
Todd Helton - potential PHOM
Joe Cronin
Bobby Grich
George Sisler - cements PHOM status
Bill Terry - straddles PHOM line
Bobby Wallace
Tony Phillips
Jose Cruz - moves to consideration set

Huge negative swing with DRA:
Chipper Jones
Derek Jeter
Duke Snider
Willie Stargell
Wade Boggs
Dave Winfield
Craig Biggio
Roy Campanella
Bill Freehan
Stan Hack - straddles PHOM
Vada Pinson
Ozzie Smith
Harmon Killebrew - straddles PHOM
Scott Rolen
Harry Heilmann
Sam Crawford
Joe Kelley
Roger Bresnahan
Amos Otis
Earl Averill - straddles consideration set
Chuck Klein
Mickey Cochrane
Pete Rose
Bill Dickey
Jason Giambi
Nellie Fox - makes HOM selection look even worse
Sal Bando
Gary Sheffield
Joe Medwick - moves to thick of consideration set
Edd Roush - makes HOM selection look even worse
Kirby Puckett

The reliance on DRA to compute Dan R WAR has resulted in the following large variances when compared against baseball-reference WAR:

Dan R/DRA huge positive ranking difference:
Barry Larkin
Gabby Hartnett
Alan Trammell
Gary Sheffield
Joe Cronin
Jimmy Sheckard
Tommy Leach
Arky Vaughan
David Concepcion
Yogi Berra
Lou Boudreau
Pie Traynor
Paul Waner
Heinie Groh
Fred Clarke
Hughie Jennings
Jim Edmonds
Elmer Flick
Bert Campaneris
Tim Raines
Tim Salmon
Darrell Evans
Eric Davis
Luke Appling
Mike Piazza
Robin Yount
Rabbit Maranville
Brian Giles
Dick Bartell
Bill Dahlen
Bill Dickey
Mark McGwire
Max Carey

Baseball Reference WAR ranking significantly higher:
Sal Bando
Brooks Robinson
Kenny Lofton
Carl Yastrzemski
Andre Dawson
Pete Rose
Edgar Martinez
Duke Snider
Roberto Clemente
Larry Walker
Ron Santo
Vada Pinson
Ken Griffey Jr.
Paul Molitor
Wade Boggs
Mike Tiernan
Andruw Jones
Ken Boyer
Kirby Puckett
Willie Davis
Craig Biggio
Harry Heilmann
Earl Averill
Buddy Bell
Jake Beckley
George Davis
Bobby Abreu
Billy Hamilton
Tony Perez
Nellie Fox
Cesar Cedeno
   159. Chris Cobb Posted: December 19, 2012 at 01:11 PM (#4328581)
There isn't, so far as I was able to determine, a thread for discussing DRA, so I'll post this here as the most proximate "advanced metrics thread"--

I am looking at refining the fielding component in my system, so I've begun examining the DRA numbers, which some voters rely on heavily. Obviously, I need to get the book, but until that happens, it would be helpful for me to have some basic information about DRA.

What are the baseline and the win-value of DRA runs saved? Does DRA begin from the increasingly-accepted premise, that fielding value is calculated against an average fielder at the position? Does DRA use the increasingly standard framework of 10 runs = 1 win? Are there adjustments for scoring environments?

Answers to these questions, as well as any comments on the comparative accuracy and reliability of DRA as a fielding measure, would be very helpful.

Thanks!
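
On the "10 runs = 1 win" question, here is a generic Python sketch of where that rule of thumb comes from and how it shifts with the scoring environment. It uses the Pythagorean expectation with an assumed exponent of 2 and says nothing about how DRA itself handles the conversion; the function names and sample run levels are illustrative.

```python
def pythag_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)


def runs_per_win(runs_per_game, exponent=2.0):
    """Runs needed to add one win for an average team.

    runs_per_game is runs per team per game. Differentiating the
    Pythagorean formula at runs_scored == runs_allowed gives a slope of
    exponent / (4 * runs_per_game) wins per run.
    """
    return 4.0 * runs_per_game / exponent


modern = runs_per_win(4.5)        # 9.0 runs per win
high_scoring = runs_per_win(6.0)  # 12.0 runs per win
```

Under these assumptions a fielder who saves 12 runs is worth about one win in a six-runs-per-game league but closer to a win and a third in a 4.5-run league, which is why a fixed 10-run conversion can mislead across eras.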
   160. theorioleway Posted: December 20, 2012 at 10:38 PM (#4329790)
Chris Cobb: this is probably a good place to start for you: http://www.oup.com/us/companion.websites/9780195397765/appendices/?view=usa
   161. Chris Cobb Posted: December 22, 2012 at 11:28 AM (#4330867)
Thanks for the link! Lots of interesting information here.