Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Monday, October 11, 2004

Battle of the Uber-Stat Systems (Win Shares vs. WARP)!

Don’t ever say that I never gave you anything! :-)

John (You Can Call Me Grandma) Murphy Posted: October 11, 2004 at 02:46 PM | 366 comment(s)

Reader Comments and Retorts

   301. jimd Posted: February 12, 2007 at 09:44 PM (#2296237)
ba-ba-bump
   302. KJOK Posted: March 25, 2007 at 06:13 AM (#2317375)
From Clay Davenport's latest chat on Friday (March 23): (Bolded emphasis mine)

Clay Davenport: "......... Keep in mind that the replacement level in the WARPs is very low indeed, what a AA player might do. It is geared towards what the worst teams in history actually accomplished."
   303. Paul Wendt Posted: March 26, 2007 at 03:55 PM (#2317983)
Just FYI, Brandon (aka Patriot) has posted 5 year park factors for each team since 1901

They named "Pythagenpat" for him.

Joe Dimino is credited with an assist concerning these pythagen definitions with visual basic code(?).
http://www.super70s.com/Baseball/Background/Glossary/S/Sabermetrics/Pythagorean.asp
   304. Paul Wendt Posted: April 01, 2007 at 04:03 AM (#2321522)
In the Bancroft thread last Fall, someone mentioned dividing Win Shares by three . . .
That makes sense because, first, many rating systems are denominated in wins and, second, Bill James assigns three Win Shares to every team for every win.

Has anyone compared WinShares/3 and WARP1 for large numbers of player-seasons or -careers?
The idea is so simple, the answer must be yes!
Such comparison would provide one, relativistic point of entry to understanding, criticizing, etc. Which players does WinShares rate highest, relative to WARP? George Van Haltren is rated three wins greater by Win Shares, about 114 to 111. Whose approach rates pre-expansion CFs higher and, if possible, why? And so on.

Similarly, systematic comparison with WinShares/3 or with WARP1 may be part of an effective explanation for a new rating denominated in wins, such as Warp-Rosenheck --or Win & Loss Shares in Bill James' future. Here are some examples (or here is statistical analysis) of some of the biggest differences in the numbers of wins attributed: [list with commentary].

As far as I know, no such comparison may be equally illuminating for Pennants Added. Because it is denominated in pennants, the equally "natural" comparison is merely ordinal. Who ranks the greatest number of places higher? And so on.
   305. TomH Posted: April 01, 2007 at 07:34 PM (#2321743)
altho WS has a lower replacement level than WARP, so a comparative formula might be something like WS/3 - PA/750 = WARP, altho my '750' is merely a guess. (or IP/170)
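A minimal sketch of that conversion, for anyone who wants to try it on their own lists. The function name and the sample season are made up for illustration, and the 750 PA-per-win divisor is only the guess stated above, not a fitted value.

```python
def approx_warp(win_shares, plate_appearances, pa_per_win=750.0):
    """Rough WARP estimate from Win Shares: WS/3 minus a playing-time charge.

    pa_per_win is the guessed divisor from the post (750), not a calibrated one.
    """
    return win_shares / 3.0 - plate_appearances / pa_per_win


# Hypothetical season: 30 Win Shares in 600 plate appearances.
print(round(approx_warp(30, 600), 1))
```

Fitting pa_per_win against real WS and WARP1 seasons would be the obvious next step; nothing here does that.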
   306. Chris Cobb Posted: April 01, 2007 at 09:13 PM (#2321794)
WS has a lower replacement level than WARP

Well, the batting replacement level for WS is lower than for WARP, but the fielding replacement level for win shares is higher than for WARP, and the two basically balance out. I suspect the pitching replacement level for win shares is higher than for WARP for the post-World War 2 game, although I haven't tracked this data, so I don't know for sure.

I have found that for most HoM-calibre post-WW2 position players, WS/3 is similar to WARP1. There is not a consistent difference in magnitude of the sort one would expect if replacement level in one were consistently lower than replacement level in the other. Prior to WW2, WARP1's fielding replacement level is so much lower that WARP1 totals begin consistently to outstrip win share/3 totals, and this difference continues to increase as one moves back in time.

I don't have parallel data gathered for a representative cross-section of players, but I have gathered it for all position players who would be considered "serious candidates" for the HoM.
   307. TomH Posted: April 02, 2007 at 12:12 PM (#2322408)
Yes, that is pretty much what it looks like once I add up the ##s of my top backloggers, Chris, although I still see WS having a slightly lower repl level, which is what I would expect given that WS add up to ALL team wins, while WARP should approx add to team wins above 20 per yr or so.

Of my top 9 who played mostly after 1940 (nonpitchers), their WS/3 avgs about 4 more than their WARP1.
Of the top 9 pre-1940 guys, WARP1 is avg of 2 career pts higher than WS/3.
   308. KJOK Posted: April 04, 2007 at 05:24 AM (#2324250)
A great article on the whole "peak" argument, very relevant to our HOM discussions:

What is Peak?

over at Patriot's blog.
   309. Bleed the Freak Posted: July 04, 2007 at 03:34 PM (#2428724)
David Foss posted this link in February in another discussion thread, it belongs here:

ftp://ftp.baseballgraphs.com/winshares/

This link contains historical win shares from 1876-2006. Thanks again David for providing this link.

Does anyone have a link somewhere where I can download WARP data from 1876-2006.

Keep up the good work Hall of Merit voters and posters. I've learned so much from reading the threads over the past five years.
   310. ronw Posted: September 07, 2007 at 08:30 PM (#2515604)
Those much smarter than I have probably already realized this, but I am not sure Win Shares is a good method for the HOM. I was thinking about Hugh Duffy's dominance of Win Shares with a relatively low OPS+. Other than 1894 and 1891, he never had an OPS+ greater than 130.

I thought that was solely because of the way Win Shares allocates fielding WS disproportionately towards outfielders. However, I was surprised to find that Duffy was among the league leaders among outfielders in batting WS, even though he was nowhere near the leaders in OPS+.

For example, in 1892, Duffy led all outfielders with 23.5 batting WS, although he had a 125 OPS+. In 1893, he was second among outfielders at 23.1 bWS to Ed Delahanty (23.3), despite another 125 OPS+. Of course he led the league in 1894 (28.0) (177 OPS+).

I was seeing him as historically dominant from 1891-1894, primarily because of Win Shares, especially batting Win Shares. Just recently, I wondered how someone with a relatively low OPS+ could consistently be among the league leaders in batting WS?

First I thought that Duffy had more plate appearances than anyone else. It is true that he had over 600PA each of these years, and so had more opportunity to accumulate batting WS.

But others had similarly high PA but didn't have nearly the amount of batting WS.

For example, in 1892, when Duffy led all outfielders in batting WS, he had 673 PA. Sam Thompson had a similar number of PA, with 679. Let's compare their statistics.

Player      BA   OBP   SLG   AB    R    H  2B  3B  HR  RBI  BB  SO
Duffy     .301  .364  .410  612  125  184  28  12   5   81  60  37
Thompson  .305  .377  .432  609  109  186  28  11   9  104  59  19


Very similar raw stats. How were their adjusted numbers? (BRAA adjusted for season from Prospectus, BatRuns and OPS+ from bbref)

Player    OPS+  BRAA  BatRns
Duffy      125    25    17.1
Thompson   144    40    32.9


OK, so compared to Thompson, Duffy played in a bandbox. Indeed the Philadelphia Baseball Grounds had an 1892 batting park factor of 100, while Boston's South End Grounds was 109.

But what were their raw batting WS? (Remember, similar raw stats, similar PA, but Duffy played in an easier park.)

Player    bWS (raw)
Duffy          23.5
Thompson       18.1


How can this be? Then my relatively simple brain remembered that WS can be affected by team wins, especially team wins above their pythagorean projection.

  
Team         Record  Pythag Record   RS   RA
1892 Boston  102-48          94-56  862  649
1892 Phil.    87-66          92-61  860  690


Based on this, Boston had 306 WS to dole out, while Philadelphia had 261.
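For anyone who wants to check the table, here is a rough sketch of both calculations. The three-Win-Shares-per-win rule is from the thread; the 1.83 exponent is an assumption, picked because it reproduces the 94-56 and 92-61 Pythagorean records shown above (the classic exponent of 2 would put Boston at about 96 wins). Function names are illustrative.

```python
def pythag_wins(runs_scored, runs_allowed, games, exponent=1.83):
    """Expected wins from runs scored/allowed (exponent 1.83 is an assumption)."""
    ratio = (runs_scored / runs_allowed) ** exponent
    return games * ratio / (1.0 + ratio)


def team_win_shares(wins):
    """Win Shares hands out three shares per actual team win."""
    return 3 * wins


teams = {
    "1892 Boston": {"wins": 102, "losses": 48, "rs": 862, "ra": 649},
    "1892 Phil.": {"wins": 87, "losses": 66, "rs": 860, "ra": 690},
}

for name, t in teams.items():
    games = t["wins"] + t["losses"]
    expected = pythag_wins(t["rs"], t["ra"], games)
    print(
        name,
        "WS pool:", team_win_shares(t["wins"]),
        "Pythag wins:", round(expected),
        "wins above Pythag:", round(t["wins"] - expected, 1),
    )
```

The point of interest is the last column: Boston's pool is inflated by roughly eight wins of over-performance, Philadelphia's is shrunk by about five.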

Here are the players who weren't traded from Boston during 1892. Only Harry Stovey (3 WS total for Boston) had significant hitting time before he was traded.

Player      bWS  OPS+   AB
Duffy      23.5   125  612
Long       19.5   107  646
McCarthy   16.4    92  603
Tucker     15.9   106  542
Nash       14.2   101  526
Lowe        9.4    84  475
Stivetts    8.6   126  240
Ganzel      4.5    97  198
Quinn       2.9    54  532
Bennett     2.5    81  114
Kelly       2.5    53  281
Nichols     0.7    57  197
Total     120.8


Here are the players who weren't traded from Philadelphia during 1892. No traded player had significant hitting time.

Player      bWS  OPS+   AB
Connor     22.0   167  564
Hamilton   20.8   152  554
Thompson   18.1   144  609
Delahanty  15.6   158  477
Hallman    11.5   117  586
Cross       9.5   108  541
Clements    9.0   128  402
Allen       6.5    89  563
Reilly      0.0    47  331
Total     113.0


So according to this, Thompson seems to be penalized because he has three great-hitting teammates, and the Phillies did not have a great record. Duffy, although clearly a worse hitter than any of the Phillies top 4, has more batting win shares than all of them because: (a) he had worse-hitting teammates; and (b) his team performed well.
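The "worse-hitting teammates" half of that can be put as a share of the team's batting Win Shares. This is only a sketch using the totals from the two tables above; the helper name is made up.

```python
# Team batting Win Share totals as listed in the tables above.
team_bws = {"Boston": 120.8, "Philadelphia": 113.0}

# (player, team, batting Win Shares) for the two players being compared.
players = [("Duffy", "Boston", 23.5), ("Thompson", "Philadelphia", 18.1)]


def share_of_team(player_bws, team_total):
    """Fraction of the team's batting Win Shares credited to one player."""
    return player_bws / team_total


for name, team, bws in players:
    pct = 100 * share_of_team(bws, team_bws[team])
    print(f"{name}: {pct:.1f}% of {team} batting WS")
```

Duffy comes out with a noticeably larger slice of a larger pie, which is the whole effect in two numbers.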

I don't think I am going to use Win Shares anymore for this project.
   313. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 07, 2007 at 09:11 PM (#2515656)
Win Shares' crediting of teams' over/underperformance of their component stats to their players is well known by most voters and has been debated virtually ad infinitum. I'm glad to see you don't find it compelling, as I don't either. Welcome to the forces of light. :)
   314. JPWF13 Posted: September 07, 2007 at 09:32 PM (#2515673)
So according to this, Thompson seems to be penalized because he has three great-hitting teammates, and the Phillies did not have a great record. Duffy, although clearly a worse hitter than any of the Phillies top 4, has more batting win shares than all of them because: (a) he had worse-hitting teammates; and (b) his team performed well.


Thompson's team had 87 wins (pythag was 92, but WS uses actual) so 261 WS are divided among the team.
Duffy's team had 102 wins (pythag was 94) so 306 WS were divided among the team.

If you use pythag wins rather than real wins Duffy gets 21.5 batting WS and Thompson 19.1
Thompson WAS a lesser % of his team's offense than Duffy was to his. That could be an allocation problem- too many of DUFFY's team's WS are going to offense instead of pitching and/or defense.
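A rough check on those two figures, assuming a straight proportional rescaling of each player's batting WS by Pythagorean wins over actual wins. That is an assumption about the method, not a full Win Shares recalculation, and it lands near the numbers above rather than exactly on them.

```python
def rescale_to_pythag(batting_ws, actual_wins, pythag_wins):
    """Scale a player's batting WS as if the team had won its Pythagorean total."""
    return batting_ws * pythag_wins / actual_wins


duffy = rescale_to_pythag(23.5, actual_wins=102, pythag_wins=94)
thompson = rescale_to_pythag(18.1, actual_wins=87, pythag_wins=92)

print(f"Duffy:    {duffy:.1f}")
print(f"Thompson: {thompson:.1f}")
print(f"Gap before: {23.5 - 18.1:.1f}, gap after: {duffy - thompson:.1f}")
```

On this rough basis a bit more than half of the original gap goes away; the rest has to come from how the offense is split among the hitters.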

ALSO James found that run estimators tend to get a bit wonky before 1920 and especially before 1900. Duffy has a great SB and SH advantage over Thompson- James found that while SB and SH do not correlate [positively] with scoring in our era- they did pre 1920 and especially pre 1900. So James' pre 1900 run estimator (a reworked version of runs created) may see Duffy as Thompson's offensive equal despite Thompson's 144 to 125 OPS+ advantage.
   315. Paul Wendt Posted: September 07, 2007 at 11:10 PM (#2515740)
Thanks for the calculation, J
So only half of Duffy's 1892 bws margin over Thompson is generated by Bill James' full allocation of wins.

by the way,

Wins above Pythagorean projection, Boston NL 1891-99
2 <u>8 8 5 0 1 2 4 0</u>

bold - excellent team, .625 or better (7 of 9 seasons)
<u>underline</u> - Hugh Duffy a regular outfielder (8 of 9 seasons, four cf then four rf)
italic - 154 game schedule (3 seasons)

1892 was the split season with a playoff (not included in season statistics) between the first and second-half leaders, Boston and Cleveland.
   316. TomH Posted: September 08, 2007 at 01:34 AM (#2516056)
What are the OWPs for Duffy and Thompson in 1892? Betcha it's a lot different than OPS+.
   317. OCF Posted: September 08, 2007 at 01:52 AM (#2516124)
For what it's worth, the system I've been using all along, which comes from RC and outs as given in a Stats, Inc. Handbook, does basically use James's adjustment for pre-1900 run estimators (SB, etc.) but it's not WS and it doesn't care what the team's W/L record was.

In that system, I have Duffy 1892 as 34, and I have Thompson 1892 as 34. So there you are - very similar. If I switch to RC above 75% of average, I get Duffy 54, Thompson 53. OK, their playing time was pretty similar. On the same scale (back to the first version, RCAA), I have Duffy's "wow" year of 1894 as a 69. OK, that's a very good year - but it wasn't Frank Chance 1906 (scored as 78) or even George Stone 1906 (scored as 92).

These are all offense only. Of course, Duffy was a better defender than Thompson. Not that I was any fan of electing Thompson at the time. The differences between OPS+ and this version of RC do erase Thompson's advantage, but only to equalize them - not to put Duffy ahead.
   318. Chris Cobb Posted: September 08, 2007 at 02:34 AM (#2516245)
What are the OWPs for Duffy and Thompson in 1892? Betcha it's a lot different than OPS+.

Bbref's new OWP numbers for the two are

Duffy .541
Thompson .570

Not so different from what OPS+ shows.

The formula BBref is using for OWP isn't listed, however, so it's not clear where these numbers come from.

FWIW, EQA sees a big difference between the pair.

Duffy .291 EQA, 25 BRAA
Thompson .309 EQA, 40 BRAA

That gap seems a bit large to me in favor of Thompson.
   319. KJOK Posted: September 21, 2007 at 09:43 PM (#2536096)
Just wanted to get Tango's latest WARP/BP comments into the archive:


I’ve also railed on BP for replacement level and using Runs Created. Those however are more philosophical disagreements. I like to consider replacement level at around .300, and could live with it being as low as .250. Clay (and by extension BP) uses .150. Clay is in the clear minority on this one and has a tougher job to explain himself. But, he could possibly muster enough evidence to support himself. However, that has never happened. I’d also be willing to debate that with them.

Strangely, rather than using EqR as their basis for VORP (and MLVr), they use Bill James old RC equation (one that even James himself doesn’t use). It’s one of those things that is so buried under the machinations of the process, that no one bothers to look, and deride BP for using. This one, while blatantly a very poor choice, is not “wrong”, because anything short of an all-encompassing sim would be “wrong”. However, it’s an extremely poor choice, one that BP should not be making. BaseRuns is the obvious choice here.

***
One thing that BP has straightened out is they have gotten rid of Pythagenport in favor of Pythagenpat. What would be nice however is if they call it Pythagenpat or whatever name David/Patriot want, rather than continuing to use Pythagenport as the name. And another is that Woolner did use the Tango Distribution over his, even though that was also a philosophical choice.
In both these instances, they went with the cleaner method that works a bit better in the normal range, and much better at the extremes. {clap clap clap}

***
While I’m here… will OPS+ go away please? No one will bother to calculate OPS+ on their own. So, why not calculate some form of Runs Created as a “+” metric.
   320. Joey Numbaz (Scruff) Posted: September 25, 2007 at 01:42 PM (#2541337)
Here's an interesting thing:

There is a serious multiplication bug in Excel 2007, which has been reported. The first example that came to light is =850*77.1, which gives a result of 100,000 instead of the correct 65,535. It seems that any formula that should evaluate to 65,535 will act strangely.

Just thought you stat geeks would want to know.
   321. KJOK Posted: September 26, 2007 at 07:09 PM (#2543712)
Baseball Prospectus put together a spreadsheet of the "Top 5 Best Players in Baseball" by year based on WARP3 in a rolling, six-year period. The weights are assigned as follows:


Year N-3 7%
Year N-2 13%
Year N-1 22%
Year N 31%
Year N+1 18%
Year N+2 9%
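
A small sketch of how such a rolling score could be computed from the weights listed above. The function name and the sample career are hypothetical, and treating a missing season as zero is an assumption; the actual spreadsheet may handle career endpoints differently.

```python
# Weights from the list above, keyed by offset from year N; they sum to 100%.
WEIGHTS = {-3: 0.07, -2: 0.13, -1: 0.22, 0: 0.31, 1: 0.18, 2: 0.09}


def rolling_score(warp_by_year, year):
    """Weighted six-year WARP3 score centred on `year`.

    Seasons missing from warp_by_year count as 0.0 (an assumption).
    """
    return sum(
        weight * warp_by_year.get(year + offset, 0.0)
        for offset, weight in WEIGHTS.items()
    )


# Hypothetical player with made-up WARP3 totals.
career = {1921: 6.0, 1922: 8.5, 1923: 9.0, 1924: 11.0, 1925: 7.5, 1926: 5.0}
print(round(rolling_score(career, 1924), 2))
```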

I posted a copy of the spreadsheet in the HOM Yahoo egroup FILES section.
   322. DL from MN Posted: September 26, 2007 at 08:56 PM (#2543867)
Dazzy Vance was a surprise...
   323. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 26, 2007 at 09:24 PM (#2543912)
That may have something to do with how WARP weights strikeouts. More credit is given for two pitchers with the same ERA+ to the one with higher K (although the methodology is, of course, a mystery), so I would imagine that Vance's insanely high league-relative K rate would serve him well--perhaps too well--in their formulas.
   324. jimd Posted: September 27, 2007 at 05:43 PM (#2545568)
More credit is given for two pitchers with the same ERA+ to the one with higher K

Getting each out has a certain value; the pitcher retains all of the value of his K's while sharing the value of the BIP-out with his fielders.

Dazzy pitched in front of the fielding-challenged for almost all of his prime. His ERA+ is therefore misleadingly low.
   325. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 27, 2007 at 05:50 PM (#2545577)
There are two separate issues here. The first is adjusting for defensive support, which everyone agrees should be done. If Vance's fielders were below average, he shouldn't be penalized for that, just as Palmer should be penalized for having above-average fielders. The second issue is intrinsically rewarding K's regardless of defensive support, which BP does. Given two *teammates* with the same ERA+ and the same fielders, the one with the higher K rate will still get more BP WARP, on the grounds that he got a greater share of his outs by himself. The latter I don't agree with.
   326. jimd Posted: September 27, 2007 at 08:26 PM (#2545878)
BP builds their statistical ratings from the bottom-up, the component stats, K's, Hits allowed, HR's allowed, etc. ERA+ is not one of those but a run-level stat and the fact that the two pitchers happened to produce the same ERA+ from those component stats is deemed irrelevant by BP.

Let's look at this from the batting perspective.

Suppose you had two hitters that played for the same team, each playing half the season in the 3 slot, the other half in the 4 slot. Suppose they each had the same number of RBI's, but one had significantly more Runs Created than the other. Does that fact invalidate Runs Created?

Which is more important? The lower-level components as predictors of what "should have" happened at the run-level? Or the run-level result stats as documentors of what did happen? Are you arguing for components for hitters and run-level for pitchers? And if so, why?

(This also has some relation to Thompson/Duffy, WARP vs WinShares, component-level vs win-level.)
   327. jimd Posted: September 27, 2007 at 08:31 PM (#2545904)
And if so, why?

This reads as much more "combative" than I intended.

There may well be very good reasons for different perspectives here.

I just haven't thought enough about these issues myself, either.
   328. Paul Wendt Posted: September 27, 2007 at 08:46 PM (#2545968)
325. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 27, 2007 at 01:50 PM (#2545577)
There are two separate issues here. The first is adjusting for defensive support, which everyone agrees should be done. If Vance's fielders were below average, he shouldn't be penalized for that, just as Palmer should be penalized for having above-average fielders. The second issue is intrinsically rewarding K's regardless of defensive support, which BP does. Given two *teammates* with the same ERA+ and the same fielders, the one with the higher K rate will still get more BP WARP, on the grounds that he got a greater share of his outs by himself. The latter I don't agree with.

The rationale may be that first base on error is missing data whose incidence is greater for pitchers with fewer strikeouts. Crediting pitchers for strikeouts is a proxy for debiting bases on errors.
   329. Dr. Chaleeko Posted: September 27, 2007 at 09:10 PM (#2546099)
Simple advancement on outs for runners already on base during balls in play also argues for Ks since it increases the run-expectancy above what it would be with the K.
   330. David Concepcion de la Desviacion Estandar (Dan R) Posted: September 27, 2007 at 09:49 PM (#2546197)
jimd, *Yes* I most *definitely* believe component stats should be used as the basis for calculating hitters' performance and runs the basis for calculating pitchers' performance, for the critical reason that pitchers control their own context and hitters do not. Some pitchers can and do change their approach from the stretch, distorting the component stats/runs relationship (Glavine is the obvious example on one hand, and Nolan Ryan appears to be one on the other) to a significant degree. Batters cannot do this, since they only come up one out of every nine times. This is the same reason why you can use straight BaseRuns for pitchers but not for hitters, and also why you have to apply the Pythagorean theorem differently to hitters than pitchers (for hitters, you apply it to team RS/RA totals, whereas for pitchers you should calculate the team's winning percentage in their innings and then in the remaining innings). This can be a large effect: take a pitcher who throws 30 complete games and allows 240 runs in a 4.0 R/G league on an otherwise average team. If you apply Pythagoras to the team RS/RA, you'll get 648 RS and 768 RA, which comes out to an exponent of 1.85 and a projected record of 68.3-93.7. But in fact, there are two separate run-scoring contexts: one in the pitchers' starts, one in all other starts. In games started by the pitcher, the team scores 120 runs and allows 240, for an exponent of 2.03 and a projected record of 5.9-24.1 in his starts, plus a .500 record in all other games gives an overall projected record of 71.9-90.1. So treating the pitcher like a hitter causes you to misstate his value by 3.6 wins!
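
The arithmetic in that example can be sketched as follows. The exponent rule here, (runs per game) raised to the 0.285 power, is an assumption: it is a Pythagenpat-style formula chosen because it reproduces the 1.85 and 2.03 exponents quoted above. Names are illustrative.

```python
def pythagenpat_pct(runs_scored, runs_allowed, games, x=0.285):
    """Winning percentage with a run-environment-dependent exponent.

    exponent = (runs per game, both teams) ** x; x = 0.285 is an assumption.
    """
    exponent = ((runs_scored + runs_allowed) / games) ** x
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)


LEAGUE_RPG = 4.0
GAMES, STARTS, RUNS_ALLOWED_IN_STARTS = 162, 30, 240

# Method 1: treat the pitcher like a hitter and use whole-team totals.
team_rs = GAMES * LEAGUE_RPG                                        # 648
team_ra = (GAMES - STARTS) * LEAGUE_RPG + RUNS_ALLOWED_IN_STARTS    # 768
whole_team_wins = GAMES * pythagenpat_pct(team_rs, team_ra, GAMES)

# Method 2: separate contexts, his starts plus .500 ball in all other games.
start_rs = STARTS * LEAGUE_RPG                                      # 120
wins_in_starts = STARTS * pythagenpat_pct(start_rs, RUNS_ALLOWED_IN_STARTS, STARTS)
split_wins = wins_in_starts + 0.5 * (GAMES - STARTS)

print(f"whole-team method: {whole_team_wins:.1f} wins")
print(f"split method:      {split_wins:.1f} wins")
print(f"difference:        {split_wins - whole_team_wins:.1f} wins")
```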

Paul Wendt and Dr. Chaleeko, there is an empirically measurable marginal value of a K relative to a non-GIDP, non-SF, non-H fielded out, and it is .008 runs. Never adds up to more than a handful of runs even in the most extreme cases (although I do factor it in in my run estimation nonetheless).
   331. TomH Posted: September 27, 2007 at 10:32 PM (#2546312)
empirically measurable marginal value of a K relative to a non-GIDP, non-SF, non-H fielded out, and it is .008 runs.
--

I assume analysis of pre-1911 pitching would put this at a different value.
   332. Paul Wendt Posted: September 28, 2007 at 01:02 AM (#2546963)
I don't know Davenport's rate of credit for strikeouts even approximately, and except by reference to others, such as "DanR finds .008", I wouldn't be able to judge the magnitude if I knew it, although I would recognize .08 runs as too high.

TomH is right that the true strikeout premium must vary historically. I guess that where Davenport does use fixed all-time parameters he estimates them for only some recent portion of mlb history, so his estimate would be little or none influenced by 100- or 150-year-old conditions, but I am guessing.
   333. KJOK Posted: September 28, 2007 at 08:33 PM (#2548268)
I'm almost certain strikeouts consistently are around .04 runs more damaging than the average 'other' out.
   334. Paul Wendt Posted: September 29, 2007 at 03:00 AM (#2549353)
333. KJOK Posted: September 28, 2007 at 04:33 PM (#2548268)
I'm almost certain strikeouts consistently are around .04 runs more damaging than the average 'other' out.

I'm almost certain consistency at that precision is possible from year to year but not from model to model. There are too many variations in the methods of measuring that cost.

--
New England Symposium for Statistics in Sports
Months ago I mentioned this one-day conference on Statistics in Sports, tomorrow at Harvard University. See "Program" for the author-title list and the abstracts. The organizers hope that it will be annual.
Statistics in Sports (one-day conference Saturday)
   335. Paul Wendt Posted: September 29, 2007 at 03:19 AM (#2549438)
Where is the Dave Johnson thread?
heh - if i may anticipate jtm

There is a long display cabinet along one wall of the sciences library at Harvard University. The current exhibit is "From BA to BABIP - The History of Baseball Statistics". Regarding the abstruse work of Earnshaw Cook, Percentage Baseball (imagine a copy of the early 1960s book open to a telling page), the exhibit notes his splash of publicity thanks to a Sports Illustrated feature by young Frank Deford and his generally poor reception in academia, scornful reception in baseball. But Cook did make one convert [?? maybe not even a good paraphrase]. Davey Johnson, a math major in college at the time, accepted many of Cook's ideas and later became manager of the Mets.
   336. Paul Wendt Posted: October 13, 2007 at 05:05 AM (#2574168)
"heh"

[copied from "2005 Results"]

141. TomH Posted: October 06, 2007 at 10:28 PM (#2564778)
well, I can't FIND the uber WS vs WARP thread, so let's continue here...

DanR: The only way you can calculate that a player who got 3 WS on the 1899 Cleveland Spiders contributed exactly one win to the team is if you use a replacement level of precisely 52% of league average offense. Let's take a hypothetical guy who was 3 WS = 1 win in 1899. League average was .214 runs per batting out, so James's 52% baseline is .111 runs per out. Say the guy made 300 outs. If he was 1 win = 10.77 runs in 1899 above that baseline, then he created .111*300 + 10.77 = 44.2 runs. OK, so if you use James's 52% figure that's 3 batting WS. But if you used a baseline of 25% of league average offense, then he'd magically get 44.2 - (300*.214*.25 = 16.05) = 28.15 runs/10.77 runs/win * 3 WS/win = 7.8 batting WS.

No, that is not right. You can't merely move the baseline and then neglect to reaccount for the ratio of runs to wins, or there would be far more win shares than wins. WS starts with team wins, figures how many runs it takes to make those wins in the team as a whole, and then distributes them (offensively) by RC/out. The above calculations are not correct; changing the baseline would not give the example batter 4.8 more BWS. Same with the Rosen example.

DanR: See, there's nothing inherently "true" or "right" about the Win Shares allocation system.

Oh, I completely agree. But it makes much sense for the application for which it was designed; because using absolute zero has its own problems, and using 80% of average (or precisely average, the only other measure that has inherent good properties to it) causes wins then that need to be re-allocated by playing time, since there would be many fewer WS earned than wins the team achieved. Using 'average', you could create a system that then adds so many "wins" for each player by playing time (1 win per 150PA or 40 IP or some such thing). But James created a system that did not require that. 52% on offense was a low enough replacement level to make it work. I am not really trying to defend 52% as "right"; is it arbitrary? Sure. Would other numbers work? Sure; IF you wished to go back and subtract or add in fudge factors to make the individual totals match their team total. WS is a top-down approach, and maybe I should leave it at that; it is DIFFERENT from a bottom-up approach that almost every other system uses. In some applications, it will be a better tool. For others, it is worse. If I had invented it, I might have tried to come up with a player's Win Shares AND Loss Shares.

DanR: And regarding the 1975-76 Reds, who would they have been forced to play if one of their stars had gone down? A replacement player, of course--one whose production would probably be approximately 80% of positional average.

My point about the extreme teams was that great teams tend to have higher freely available talent on the bench (Dan Driessen!), whereas the Spiders do not (duh). Is this important when considering long-term replacement level for the HoM? Maybe not. Again, the question WS was designed to answer was "how do I distribute the 108 wins in Cincinnati's 1975 team among their players?". Babe Ruth, on the 99 Spiders, could have been just as great a player but would have likely won fewer additional games for that team. This is pretty obvious, no? WS captures this. Whether you think this is important or relevant can be argued. But it does capture it.

DanH #145
. . .
Since there may not be more than 15-20 human beings playing baseball on this earth capable of doing that at any given time, FAT shortstops are below-average fielders as well as hitters. Given that, it seems entirely reasonable that the mega-expansion wouldn't affect all positions equally, and that the gap between the #24 and the #20 shortstop would be bigger than the gap between the #24 and the #20 1B. Secondly, you had the move to turf fields, which has been discussed at great length. The combination of those factors is more than enough to convince me that you really did need a super glove at SS to compete in the '70s (and many of the winning teams like the Orioles, Reds, A's, and Yankees had them), and that Concepción and Campaneris "deserve" all the credit for the pennants they added. Your mileage is free to vary.

TomH:

Mea culpa on the calculations, but does that change the substance of my point? That 52% is an arbitrary number, and that using a different baseline level would lead to a different allocation of Win Shares among batters?

What are the problems of using absolute zero, besides the fact that it's just blatantly not representative of how baseball works? (Neither is 52%, of course).

Using wins above/below average plus credit by playing time would be an INFINITELY better system than the current Win Shares model, in my opinion. Incomparably better.

Moreover, 52% is not a low enough level to make it work, because there *are* still players who hit at worse than 52% of league average. Look at Bill Bergen, who had not a single Batting Win Share in his nearly 1,000 game-long career. The presence of Bergen causes the Batting Win Shares of all of his teammates to be inappropriately reduced.

146. TomH Posted: October 07, 2007 at 08:12 PM (#2566273)
If you used absolute zero, every batter would have so many runs above baseline, you would have to go back and take away wins per playing time to get the team wins to match win shares. Or make believe it takes oodles of runs to create a 'win'. E.g., if a team only wins 40 games and so gets 120 WS, and if 60 were batting, and they scored a mere 600 runs, that's 10 runs per win share, or 30 runs per win. Wouldn't work.
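
The arithmetic of that 40-win example can be written out as a small sketch. The function name and the 50% batting share are illustrative assumptions taken from the numbers in the example (60 of 120 win shares going to batting); this is not James's actual procedure.

```python
def runs_per_win_at_zero_baseline(team_wins, runs_scored, batting_fraction=0.5):
    """Runs needed per 'win' if every run scored counts above an absolute-zero baseline.

    batting_fraction is the assumed share of team win shares given to batting
    (0.5 here, matching the 60-of-120 split in the example above).
    """
    total_win_shares = 3 * team_wins              # three win shares per team win
    batting_win_shares = total_win_shares * batting_fraction
    runs_per_win_share = runs_scored / batting_win_shares
    return 3 * runs_per_win_share                 # convert back to runs per win


# The hypothetical 40-win team that scored 600 runs
print(runs_per_win_at_zero_baseline(team_wins=40, runs_scored=600))
```

With these inputs the function gives 30 runs per win, roughly three times the usual rule of thumb of about 10 runs per win, which is the mismatch the example is pointing at.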

Your take on WHY the FAT talent level of shortstops dropped in 1970 is a fascinating theory. If SS FAT level drops often with expansions, that would be a large finding (not only for this discussion, but for MLB GMs!!). That is actually a question that really piques my interest at the moment.


DanR later
TomH:
Is that why James picked 52%? Because it's the only number that gives you 10 runs a win in the modern game with no tweaking? That would be sort of cute. Still empirically wrong, but a tiny bit less arbitrary--a number selected for convenience rather than for accuracy.
   337. Paul Wendt Posted: December 04, 2007 at 04:26 AM (#2633184)
"A Quick and Dirty Fix for Linear Weights" --Phil Birnbaum, editor, By the Numbers

Charlie Pavitt, In Response to Win Shares: A Partial Defense of Linear Weights

Charlie Pavitt maintains a bibliography of mainly-academic mainly-statistical research on baseball --recently moved from the University of Delaware website, where is it? He reviews published academic work in a front page column of By the Numbers, the newsletter of the Statistical Analysis Committee, SABR.
   338. Paul Wendt Posted: January 02, 2008 at 06:32 PM (#2658356)
[from the 2008 mock HOF thread. emphasis mine]

150. DL from MN Posted: January 02, 2008 at 09:59 AM (#2658223)
. . .
I would caution voters that using raw WS totals overrates outfielders over infielders over pitchers in the modern era due to replacement value issues.


151. kwarren Posted: January 02, 2008 at 10:57 AM (#2658270)
Pitchers are worth much less to their teams on an individual basis in the modern era, because they play considerably less than in previous eras.

I agree that Win Shares rates outfielders slightly higher than infielders, but I assumed that was because they tended to be, on average, much better hitters. And even though infielders (2b, SS, 3b) tend to contribute more than corner outfielders defensively it does not compensate for the better hitting that outfielders usually provide.

This trend has been changing recently with the advent of power hitting infielders. <u>Consider the top 18 players in 07 using Win Shares.

6 OF, 4 1B, 3 3B, 2 SS, 1 2B, 1 DH</u>
. . .


152. DL from MN Posted: January 02, 2008 at 11:28 AM (#2658292)
Look at that list - 11 bats, 6 gloves (no C) and no pitchers; this demonstrates what I was saying. I don't think I want to see a HoF that is all OF and no pitchers post 1975. <u>The $$ paid out for pitchers is higher per win share than the $$ paid out for outfielders. The market is in significant disagreement with Win Shares.</u>

It seems questionable to me that people would use a strict system for HoF voting (most Win Shares) that wouldn't put a pitcher on its 25-man roster for the best players of the most recent season.

There's a (long) discussion thread on WARP, win shares and replacement value on the HoM site. There's no need to repeat it all here.


This is that long discussion thread.

Many of the same themes have been discussed in the "Dan Rosenheck" thread since DanR introduced another uber-system.
   339. Paul Wendt Posted: January 02, 2008 at 06:48 PM (#2658372)
The two matters I have emphasized should be studied and summarized systematically. Has anyone done that?

1. The historical distributions of player-season win shares by fielding position and year.

2. The relation between salaries and win shares in one season, and more complicated relations among player-season salaries and win shares.

Either study should use a complete table of win shares by player-team-season, integrated with the baseball data that is more widely available. For example the "Lahman database" includes a complete table of fielding games by position and player-team-season, and a nearly-complete table of compen$ation for some recent period.

Is a complete win shares table available?
Does the digital edition of Win Shares provide the table through 2002 or so?
   340. Paul Wendt Posted: January 03, 2008 at 12:38 AM (#2658680)
In print, player-team-season Win Shares is always an integer. Two popular peak measures are sums of three and five of those integers. How frequently does a player gain or lose by rounding at the season level, before the sum, rather than rounding the sum?

(3)
<u>Sum of three Win Shares</u> season ratings
The distribution of "gain" (which may be positive or negative here) from rounding at the season level is very close to Normal with standard deviation 0.5

Consider a player with reported win shares 34 + 29 + 28 = 91.
The probability is about 2/3 that the true sum lies between 90.5 and 91.5, so that taking the sum before rounding would also yield 91.
The probability is about 19/20 or 95% that the true sum lies between 90 and 92
The true sum is between 89.5 and 92.5, so that taking the sum before rounding would surely yield 90, 91, or 92

That is, 1 out of 3 more accurately round to 90 or 92 than to 91
1 out of 20 are truly outside the range 90 to 92

(5)
The distribution of gain is roughly Normal with s.d. 0.8

Consider reported 5-year sum 21 + 22 + 23 + 24 + 25 = 115
The probability is more than 50% that the true sum lies between 114.5 and 115.5, so that taking the sum before rounding would also yield 115
The probability is almost 90% that the true sum lies between 114 and 116, or 115+/-1
The probability is almost 99% that the true sum lies between 113.5 and 116.5, so that taking the sum before rounding would yield 114 or 115 or 116
   341. Paul Wendt Posted: January 03, 2008 at 12:45 AM (#2658681)
Sorry, that should be:

> (5)
> <u>Sum of five Win Shares</u> season ratings
> The distribution of gain is roughly Normal with s.d. 0.64

(0.8 is the rough number of standard deviations for gain +0.5)
The approximate probabilities for (5) are correct, I think.
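
The probabilities in (3) and (5) can be checked exactly with a short sketch. It assumes that each season's rounding error is uniform on [-0.5, 0.5] and independent from season to season, which is my assumption for illustration rather than anything known about published Win Shares. Under that assumption the sum of n errors, shifted by n/2, follows the Irwin-Hall distribution; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, sqrt


def irwin_hall_cdf(x, n):
    """CDF at x of the sum of n independent Uniform(0, 1) variables."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= n:
        return Fraction(1)
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n for k in range(int(x) + 1))
    return total / factorial(n)


def prob_within(n, half_width):
    """P(|sum of n rounding errors| < half_width), each error Uniform(-0.5, 0.5)."""
    center = Fraction(n, 2)  # shift so the sum of errors becomes an Irwin-Hall variable
    h = Fraction(half_width)
    return irwin_hall_cdf(center + h, n) - irwin_hall_cdf(center - h, n)


for n in (3, 5):
    sd = sqrt(n / 12)  # each uniform error has variance 1/12
    print(n, round(sd, 3),
          [float(prob_within(n, Fraction(k, 2))) for k in (1, 2, 3)])
```

Under these assumptions the three-season case gives exactly 2/3 within half a win share and 23/24 within one, and the five-season case gives 11/20 within half a win share with standard deviation near 0.645.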
   342. Paul Wendt Posted: May 01, 2008 at 12:53 PM (#2764939)
Win Shares are denominated in wins (thirds of wins). The same is true of player-team-season ratings by Davenport (WARP) and Palmer (TPR, TPI; BFW, PFW).
Win Shares for all players in a team-season are normalized so that their sum is equal to team-season wins (times three). That is not true of Davenport's or Palmer's ratings, nor of many others.

Therefore the following data on team-season Games and Decisions is specifically relevant to using and understanding and improving the Win Shares system.

For every team-season Games Played = Decisions + NoDecisions. It is only a little misleading to call the NoDecisions "Ties" and it is convenient, so let me say it. For short, G = D + T.

Even today it is common that two teams in one league-season (whatever that is under interleague play) finish with different numbers of Games or Decisions. There were many differences between teams in 1994 when the season ended abruptly in August. In 24 league-seasons since then (1995-2006, two leagues) there have been only 5 with equal numbers of games played for all teams (almost inevitably, the number is 162). There have been 7 with equal number of decisions for all teams (again almost inevitably 162). And there have been 14 seasons with equal numbers of ties for all teams (zero). In those 24 league-seasons, the biggest differences between teams have been two games played (four times), two decisions (four times), and one tie (10 times).

Without further ado,

Latest league-season with given difference in games played for some pair of teams
1 game, 2006 NL or AL
2 game, 2000 NL
3 game, see 4
4 game, 1994 NL
5-6-7g, see 8
8 game, 1981 NL
9 game, see 10
10 game, 1945 AL
11 game, 1892 NL
For now coverage ends or begins in 1892 because 1891 is the latest season a major league team did not complete the season.

Now pass over some abnormal seasons: 1994, 1981, 1942-1945, and 1918. (In 1918, 1942, 1945, and 1981 --but neither 1943 nor 1944-- the biggest difference among teams within league was at least seven games.)

Adjusted by passing over 1918, 1942-45, 1981, and 1994
1 game, 2006 AL or NL
2 game, 2000 NL
3 game, see 4
4 game, 1989 NL
5 or 6, see 7
7 game, 1953 AL
8 game, 1938 AL
10 game, 1893 NL
11 game, 1892 NL

1953. Is that ancient history?

Latest league-season with given difference in decisions for some pair of teams
1 deci, 2006 AL or NL
2 deci, 2002 NL
3 deci, see 4
4 deci, 1994 NL
5-6-7-8, see 9
9 deci, 1981 NL
That 9-decision difference in 1981 is the greatest in the period 1892-2006, tied in 1945 and 1906 but never exceeded.

Adjusted by passing over 1918, 1942-45, 1981, and 1994
1 deci, 2006 AL or NL
2 deci, 2002 NL
3 deci, 1979 AL
4 deci, 1978 AL
5 deci, 1962 NL
6 deci, 1938 AL
7 deci, 1907 AL or NL
8 deci, see 9
9 deci, 1906 AL

Latest league-season with given difference in "ties" (all no-decision games) for some pair of teams
1 tie, 2005 NL
2 ties, 1989 NL
3 ties, 1981 NL
4 ties, 1953 AL
5 ties, 1937 AL
6 ties, 1916 NL
7 ties, see 8
8 ties, 1911 NL

Commonly the maximum difference of T ties between teams in one league-season is a difference between one team with T ties and one or more with no ties. Detroit holds the ties record with 10 in 1904 but the maximum difference between teams was only two because every team tied at least two games. In a few league-seasons every team tied at least three games: 1907 AL, 1914 AL, 1914 FL. In the 46 major league seasons 1892-1914 there were 13 with no ties.
   343. Paul Wendt Posted: May 01, 2008 at 01:00 PM (#2764946)
Concerning the Win Shares rating system, variation in number of decisions is probably more important than that in numbers of games or ties. If all "ties" are replayed, that means uniformity in number of decisions and in a sense replays restore equal opportunities to earn win shares.
Here are some examples of big differences in numbers of decisions

1907 NL
_G_ _D_ _W_ _L_
155 152 107 _45 Chicago
157 154 _91 _63 Pittsburgh
149 147 _83 _64 Philadelphia
155 153 _82 _71 New York
The second division played 153 to 148 decisions.

The difference in decisions between PIT and PHI teams represents about 12 win shares for the players on each team in expectation.

1907 AL
_G_ _D_ _W_ _L_
153 150 _92 _58 Detroit
150 145 _88 _57 Philadelphia
157 151 _87 _64 Chicago
158 152 _85 _67 Cleveland
The second division played 152 (St Louis) to 148 decisions.

For Cleveland that is almost 12 win shares gained relative to Philadelphia; for Philadelphia about 14 win shares lost relative to Cleveland.

1981 NL (overall records in split season)
_G_ _D_ _W_ _L_
103 102 _59 _43 St. Louis
...
103 102 _46 _56 Pittsburgh
...
111 111 _56 _55 San Francisco

The 9-decision difference represents a gain of about 14 win shares for San Francisco relative to St. Louis or a loss of about 16 win shares for St. Louis relative to San Francisco.
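
One way to read the arithmetic behind these examples, offered as a sketch: a team earns three win shares per win, so each extra decision is worth about 3 x winning percentage in expectation. The function name is hypothetical, and the rule is my assumption about how such figures could be estimated, not a quoted method.

```python
def expected_win_share_gap(extra_decisions, wins, losses):
    """Expected win shares from extra decisions, at the team's own winning percentage.

    Assumes 3 win shares per win and that the extra games would be won
    at the same rate as the games actually played.
    """
    win_pct = wins / (wins + losses)
    return 3 * win_pct * extra_decisions


# 1981 NL: San Francisco (56-55) had 9 more decisions than St. Louis (59-43)
print(round(expected_win_share_gap(9, 56, 55), 1))  # San Francisco's gain
print(round(expected_win_share_gap(9, 59, 43), 1))  # St. Louis's loss

# 1907 NL: Pittsburgh (91-63) had 7 more decisions than Philadelphia (83-64)
print(round(expected_win_share_gap(7, 91, 63), 1))
print(round(expected_win_share_gap(7, 83, 64), 1))
```

On this reading the 1981 pair comes out near 14 and 16 win shares, and the 1907 NL pair near 12 for each team.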
   344. Paul Wendt Posted: May 18, 2008 at 03:27 PM (#2784983)
Maybe this is a useful relocation. At least Brock will meet this thread.

Brock Hanke wrote this today in "Ranking the Hall of Merit Firstbasemen" #93. This is only part of what he wrote and what he wrote is only preliminary.

A post on this subject that I thought would take a day or two has taken over a week to work up. The essence of the post is that I think you should amortize 1870s catcher playing time out to maybe 90 games instead of 162 when you're doing Season Equivalents, and that this is the only decade and the only position to which that applies. It only applies to catchers, and it only applies to 1870s catchers, except for one season of Charlie Bennett (1882) when he actually played his team's entire schedule, albeit not all at catcher.

Brock,
I'm sure there are some strengths and some weaknesses to your study. I hope that I can make time to give it a close look. I'm not very good at making time but the baseball subject commonly grabs me.

From what you say I infer that the player ratings part of your thesis --in contrast to the history and the curve fitting-- puts you in the Pete Palmer school, or addresses the Palmer school. Palmer, Clay Davenport, Bill James, and their followers. Their marquee ratings are career sums of season ratings denominated in games. Among them Palmer makes no adjustment for season length. He once gave a talk or wrote a paper on the primacy of the season and I suppose he would say that the career-sum is secondary although it helps sell books. Bill James, too, makes no adjustment for season length, although he does give some space to win shares per 162 games beside the career, 5-year, and 3-year totals. Davenport prorates every season at the rate (162/G)^(2/3) rather than the linear 162/G. (By "amortize" I think you mean the linear 162/G where G is team games played or scheduled.)
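
To see how far apart the two proration rules are, here is a minimal sketch comparing the linear factor 162/G with the damped factor (162/G)^(2/3). The function names are illustrative, and the sample schedule lengths are mine, apart from the 90 games mentioned in Brock's post.

```python
def linear_factor(team_games, full_season=162):
    """Straight-line proration: scale a season rating by 162/G."""
    return full_season / team_games


def damped_factor(team_games, full_season=162, exponent=2 / 3):
    """Damped proration: scale a season rating by (162/G)^(2/3)."""
    return (full_season / team_games) ** exponent


for games in (60, 90, 126, 154, 162):
    print(games, round(linear_factor(games), 3), round(damped_factor(games), 3))
```

For any schedule shorter than 162 games the damped factor is smaller than the linear one, so short seasons are scaled up less aggressively; at 90 games the linear factor is 1.8 and the damped factor is a little under 1.5.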

Chris Cobb is in the Palmer school.
Those who rely on raw win shares or season-prorated win shares are taking this approach.

Joe Dimino is not. He asks "Joe Torre, 1960-1977: how much did he contribute toward winning 18 pennants?" "Deacon White, 1871-1890 (or 1869-1890): how much did he contribute toward winning 20 (or 22) pennants?"
People who say "a pennant is a pennant" may be, and surely some are, professing this approach. --even if they don't follow it all the way to a numerical rating.
   345. John (You Can Call Me Grandma) Murphy Posted: May 18, 2008 at 04:21 PM (#2785018)
I guess I'm also in the latter school, Paul, despite using Win Shares heavily in my analysis.
   346. Chris Cobb Posted: May 18, 2008 at 05:01 PM (#2785029)
Chris Cobb is in the Palmer school.

I am not sure this is true, though I am also not sure exactly what you mean by putting me in "the Palmer school."
   347. Paul Wendt Posted: May 19, 2008 at 12:48 AM (#2785487)
You may be right, Chris.

What I mean by the Palmer school is that the focus is coming up with the right season rating denominated in wins.
(Palmer might say that they shouldn't be added except to sell books to people who are interested in adding them, and no one should be interested in adding them. If I ever have a conversation with him that gets beyond one beer --and we are presently at zero beers-- I will ask about that. But that isn't a tenet of the school.)

Dan Rosenheck is in the Palmer school during the regular term, and he is interested in adding wins. But he has a summer job in another school, putting all of the players into one modern free agent labor market.
   348. Paul Wendt Posted: January 19, 2009 at 06:58 PM (#3055239)
[two copies from the thread on Dan Rosenheck's WARP]

678. Blackadder Posted: December 27, 2008 at 09:54 AM (#3038841)
Apparently Clay Davenport is reworking BP's WARP, to include PBP fielding when it is available and a more realistic replacement level. Jay Jaffe quoted some of the preliminary results, and eye-balling it the replacement level still looked a little too low, but I'll withhold judgment until his system is public. Still, it is a very welcome development that there seems to be convergence in opinion about the correct methodology for player valuation.

680. Devin McCullen cries "Enraha!" Posted: January 09, 2009 at 04:23 PM (#3047931)
This seems like as good a place as any to mention this: BP now has searchable stats for Batting Translations, Pitcher Translations, and WARP Leaderboards here. They only go back to 1901, and it's year-by-year, but I assume folks will find these helpful.
   349. Paul Wendt Posted: April 03, 2009 at 08:18 PM (#3123658)
Last month we reported and discussed a few measures that have changed with the new edition of WARP, the original by Clay Davenport which is incorporated in player "DT cards" at baseballprospectus.com. (Where? probably among "Pitchers for the Hall of Merit" and "Ranking Pitchers for the Hall of Merit" for 1871-1892 or 1893-1923)

Today I compiled some career Advanced Pitching Statistics for all fifteen major league pitchers on the 1893-1923 ballot. I noticed that the measures {XIP, RAA, PRAA, PRAR} have not changed from last year for Cy Young whereas they have changed for the other pitchers (fourteen). I reported this apparent problem to Clay Davenport.
   350. KJOK Posted: April 07, 2009 at 05:26 AM (#3127718)
Cy Young's WARP1 has changed from 193 to 129, so BP has certainly at least made a major revision in the WARP calculation.
   351. Paul Wendt Posted: April 07, 2009 at 12:20 PM (#3127770)
revised,
XIP RAA PRAA PRAR DERA
7399 901 932 2013 3.37
4839 570 693 1231 3.21

By DERA he now ranks third in this group and he is second to Johnson by XIP, RAA, PRAA, or PRAR.
newDERA name
2.94 Rusie A
3.04 Johnson W
3.21 Young C
3.22 Alexander P
3.31 Walsh E
   352. Paul Wendt Posted: April 07, 2009 at 07:27 PM (#3128373)
Here are the 2008 and the revised values of DERA for 31 pitchers with debuts in the 1890s and at least 2000 career innings. They are ordered by the revised value (column two).

DERA newDERA
3.36 3.17 Hahn N
3.37 3.21 Young C
3.89 3.36 Breitenstein T
3.76 3.41 Nichols K
3.54 3.51 Waddell R
3.95 3.64 McGinnity J
3.92 3.68 Griffith C
4.03 3.83 Leever S
4.04 3.89 Willis V
3.96 3.95 Cuppy N
4.22 3.96 Donovan B
4.38 3.97 Taylor J
4.01 4.07 Tannehill J
4.17 4.08 Orth A
4.17 4.08 Mercer W
4.29 4.09 Hawley Pink
4.18 4.09 Phillippe D
4.16 4.09 Chesbro J
4.09 4.13 Dinneen B
4.43 4.15 Meekin J
4.19 4.18 Powell J
4.27 4.22 Killen F
4.24 4.23 Taylor J
4.34 4.24 Sparks T
4.37 4.29 Howell H
4.47 4.31 Kennedy B
4.37 4.42 Donahue R
4.61 4.57 Sudhoff W
4.73 4.71 Fraser C
4.73 4.75 Kitson F
4.77 5.05 Carsey K

Sam Leever now leads his 1900-1902 teammates comfortably.

The revision benefits Ted Breitenstein more than anyone else in this group (row three) but many with 1880s debuts gained more than he did and Charlie Getzien from the 1880s lost more than anyone in this group.

Among the leaders by career innings, at least, the size of the revision on the DERA scale is generally greater for the 1870s and 1880s debutantes. Cherokee Fisher from the early 1870s now gets credit for DERA 2.04, down from 4.19!
   353. DL from MN Posted: April 08, 2009 at 12:01 AM (#3128961)
Dutch Leonard seems to have benefitted from this revision of WARP
   354. Paul Wendt Posted: April 13, 2009 at 03:42 PM (#3136031)
This weekend I posted a couple of items regarding park factors at Mule Suttles #94-95. Initially the point was to learn what we may now be able to do for the Negro Leagues and Mule Suttles #20-32 was the occasion for an important part of that work four years ago.

I included remarks on the use of park factors by Bill James in Win Shares. Pages 86ff he explains his park adjustment calculations (which include some broad and some narrow mistakes).

James also states (p87 col2) where he uses the overall park adjustment, which is a "factor" for run scoring. Those applications are in "Dividing Win Shares between Offense and Defense" and "Dividing Offensive Win Shares among a Team's Hitters" (p17-25). More on that later.

However, regarding the specialized park factors for Home Run and Non-Home Run adjustments, he says only that they "will also be needed later in the process" (p87).

Does anyone here know where?
   355. Paul Wendt Posted: April 14, 2009 at 05:21 PM (#3137709)
Dutch Leonard seems to have benefitted from this revision of WARP.

I don't have any WARP data, only Advanced Pitching Statistics.
By DERA the Dutch Leonards gain 0.07 and 0.08, or about two points on the ERA+ scale.


The Hall of Merit relief pitchers Fingers, Gossage, and Eckersley all lose about 0.40.

Is anyone able to check any of these recent or active relief pitchers, because you too have the 2008 edition data?
(columns one, three, five, any one of which is redundant)

XIP   newXIP  PRAA  newPRAA  DERA  newDERA  name
 856   1020    161    213    2.81   2.62    NATHAN J
1163   1358    137    192    3.44   3.23    PERCIVAL T
1673   1968    417    311    2.26   3.07    RIVERA M
1270   1463    246    130    2.75   3.70    WAGNER B
1789   2045    265      4    3.17   4.48    HOFFMAN T
1573   1796    155    -21    3.61   4.60    HERNANDEZ R
1491   1743     87   -155    3.98   5.30    JONES T
   356. Paul Wendt Posted: April 16, 2009 at 02:44 AM (#3140312)
Some big revisions have been posted during the last few days. --the last 36 hours if I looked up those recent or active relief pitchers at noon yesterday but I don't recall whether there was some lag here at my laptop.

For brevity here are two pitchers only, Francisco Rodriguez and poor Todd Jones. For simplicity I will call the three sets of estimates (2008), recent, and today; 2008 and recent are the two that I posted yesterday. For readability the layout is three successive rows.

Francisco Rodriguez
XIP   RAA  PRAA  PRAR  DERA
 679    ?   143   361  2.60  (2008)
 816  176    81   171  3.61  recent
 816  176   174   287  2.57  today


Todd Jones
XIP   RAA  PRAA  PRAR  DERA
1491    ?    87   493  3.98  (2008)
1743   86  -155    39  5.30  recent
1743   86    71   313  4.13  today
   357. DL from MN Posted: April 16, 2009 at 02:17 PM (#3140615)
The relievers are very much in flux. Not sure what's happening.
   358. DL from MN Posted: April 16, 2009 at 04:56 PM (#3140874)
It looks like replacement value went back down again. All the pitchers gained in PRAR and most in PRAA.
   359. Nineto Lezcano needs to get his shit together (CW) Posted: April 16, 2009 at 05:31 PM (#3140937)
Lot of weird stuff going on here. I have full WARP from last year and from a week ago; tonight I might compile the new new WARP and take another look at it. I don't know if replacement level went down or up - I've seen position players moving in both directions.
   360. Paul Wendt Posted: April 16, 2009 at 06:04 PM (#3140987)
It looks like replacement value went back down again. All the pitchers gained in PRAR and most in PRAA.

Maybe a decrease in replacement level but it is not enough to float the PRAR of all pitchers --not those who must give back lots of runs to their fielders. For example, poster boy Al Spalding is down from 210 to 120 (PRAR). By DERA he is now a small gainer from last year: 4.13, 2.50, 4.06.

Last month I speculated that there had been some revision regarding the cooperation by pitchers and fielders, as if Spalding had been credited with doing so remarkably well given all those jokers running around without gloves --some transhistorical average fielding as a benchmark. That isn't plausible but the estimates for some early pitchers on famous teams do look like they have enjoyed that mistake and now suffered its correction.
   361. Joey Numbaz (Scruff) Posted: April 17, 2009 at 05:31 AM (#3141995)
I'm lurking here and very interested in what you guys figure out about the revisions. How could they possibly decide to lower the replacement level further?
   362. Paul Wendt Posted: April 17, 2009 at 04:24 PM (#3142272)
DERA only, here are the big winners and big losers by 2009 revisions, among 407 pitchers with 2000 career innings.

16 losers (DERA up at least 0.20 runs)

DERA newnew
4.39 4.88 Getzien C
4.97 5.34 Billingham J
4.70 5.06 Ellis D
4.58 4.93 Briles N
4.61 4.95 Sele A
4.65 4.98 Splittorff P
4.44 4.77 Leonard Den
4.36 4.66 Kaat J
4.35 4.61 Goltz D
3.95 4.20 Radke B
4.63 4.87 Reuss J
4.57 4.81 Burkett J
3.73 3.96 Keefe T
4.55 4.76 Donohue P
4.53 4.74 Gura L
4.20 4.40 McBride D

By debut they are three from the 1870s and 80s, including the one at the head of the list; one from the 1920s; twelve from Jim Kaat to the present.

I count 27 with DERA down at least 0.2 runs. The threshold for listing here is a little higher in order to make the group size similar.

15 winners (DERA down at least 0.24 runs)
DERA newnew
3.88 3.53 Rusie A
4.10 3.76 Zettlein G
4.02 3.68 Niekro P
3.89 3.57 Breitenstein T
3.84 3.55 Garver N
4.84 4.56 Cunningham B
4.44 4.16 Ward JM
3.89 3.62 McMahon S
4.46 4.19 Ramos P
4.07 3.81 McCormick J
4.47 4.21 Honeycutt R
4.45 4.19 Hough C
4.20 3.96 Galvin J
4.08 3.84 Morris E
4.55 4.31 Patten C

By debut date they are eight from the 1870s and 80s, Breitenstein 1891, Patten 1901, and five from the 1940s to 70s.
If this holds up then at least McCormick from olden days and Garver from modern times should get another look.
   363. Paul Wendt Posted: April 17, 2009 at 04:25 PM (#3142274)
Now I will let this rest a few days, both hoping to hear from Clay Davenport and planning to revisit the pages for some of the biggest winners and losers by revision, also some of the extreme revised values.
   364. Paul Wendt Posted: November 20, 2009 at 09:06 PM (#3392813)
This year I didn't get any reply to email inquiry about revisions to WARP.
   365. Paul Wendt Posted: November 20, 2009 at 09:11 PM (#3392818)
Regarding "sample size" and the numbers of games scheduled.
Quoting from Bleed and Brent, "2010 Ballot Discussion" #307-309, where I have also posted the first part of my comments as #319.

308. Brent Posted: November 19, 2009 at 09:30 PM (#3391965)
>> [Bleed #307] Does anyone else have thoughts on how to use Rally Monkey's WAR to reflect the value of 19th century ballplayers?
<<

I guess the first step is to articulate the reasons you'd like to make adjustments to reduce the results from the simple extrapolations. Is it because you think the short-season data aren't representative and you want to regress them? Or is it an adjustment for perceived league quality (in which case you'd also want to adjust data from longer seasons)?


309. Bleed the Freak Posted: November 19, 2009 at 10:39 PM (#3392000)
My reasoning would be that short-season data is a smaller sample size, and might not be fully representative, so regression may be necessary.


That may be reasonable regarding "peak" credit for seasons not supported by neighbors of about the same quality; that is, reasonable in "non-consecutive peak" analysis. I think it is generally unreasonable, along two lines below (1,2).

Before getting there, let me simply state what is a generally reasonable concern about the number of games in the championship schedule, or the "length of the season", for anyone who cares about "pennants" as well as games and runs. For the same league, same teams, winning percentage .625 may generate the same probability of winning a 126-game pennant race as does .600 in a 162-game pennant race. I presume that handling this point is a big part of "pennants added" analysis by Joe Dimino, following Michael Wolverton, if I understand correctly.

1.
The matter of so-called sample size may be all about talent rather than achievement. It does seem to be all about talent rather than achievement for Tom Tango and "Bleed" in the ThinkFactory discussion cited here last week (or at one remove from that citation). Tom Tango argued for Edgar Martinez, among other things. Bleed interpreted Tango's rating system and Dan Rosenheck's WARP in terms of root-n, the square root of the number of observations, which is ubiquitous in mathematical statistics. Insofar as we care about talent rather than achievement, the issue of so-called sample size is statistically significant (an abuse of technical language) and may sometimes be significant on the scale of this project. "N" is all about how certain we can be that Barry Bonds is truly a talented player. Just how unlikely is it that a league-average talent could have posted his playing career? Please quantify!
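One minimal way to "quantify", offered as a sketch under stated assumptions rather than anyone's actual rating method: take a single rate statistic, suppose a league-average talent succeeds at the league rate independently in each of n opportunities, and use the normal approximation to the binomial to get the chance of an observed rate at least that high. The function name and the numbers in the example are hypothetical placeholders, not any player's record. The root-n shows up in the standard error.

```python
import math

def upper_tail(observed_rate, league_rate, n):
    """Approximate probability that a league-average talent posts a rate of at
    least observed_rate over n independent opportunities (normal approximation)."""
    se = math.sqrt(league_rate * (1 - league_rate) / n)  # shrinks like 1/sqrt(n)
    z = (observed_rate - league_rate) / se
    return 0.5 * math.erfc(z / math.sqrt(2))  # standard normal upper tail

if __name__ == "__main__":
    # Hypothetical rates: the same gap over the league is far less likely
    # to be luck over many opportunities than over few.
    for n in (100, 400, 1600):
        print(n, upper_tail(0.400, 0.330, n))
```

The same observed margin becomes much harder to attribute to luck as n grows, which is the sense in which "N" bears on talent rather than on achievement.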

2.
Concerning achievement rather than talent, in major league baseball from 1871, I doubt that anyone really means a sample size issue. Essentially we have the complete record for Ross Barnes in championship play 1871-1876, same as we do for Dave Cash 1971-1976. Who needs more? Well, the entertainment of paying customers in lots of other games was an important part of Barnes' job, but a trivial part of Cash's job. We don't have a record of those other games Barnes played, not even how many of them he played; his playing time and all the details are unobserved, practically (existing records have not been compiled). So "who needs more?" is who cares what Barnes achieved in those other games. Maybe he and Harry Schafer played every day in 1871, and performed equally as batsmen in those other games. That isn't likely if they were both trying, but it isn't impossible either. More important, there is no reason to suppose Barnes surpassed Schafer in those games by the same margin he surpassed Schafer in their NAPBBP games. Statistics as a discipline shows, tells, teaches how to make some auxiliary postulates, interpret the historical record as a sample, and express what we know about those other games (in terms of estimates and probabilities that jointly quantify what we know and don't know).

We do suppose that Barnes and Schafer were "trying", or they were obliged to "try" in those other games of the 1870s. That's most of the distinction from Dave Cash's and Richie Hebner's play in exhibition games of the 1970s. Nevertheless, no one cares much about those other ballgames even in the 1870s.

"A pennant is a pennant" expresses an important principle here. Perhaps some participants treat it as a constitutional obligation. For some it is a personal guideline. Almost everyone takes it seriously; no one simply dismisses it. Routinely it means uniform weight for all major league pennant races, and debate tinkers with the details (what's a major league? how, if at all, do we pay attention to minor league seasons?). At the same time, however, it means uniform weight zero for everything else: assaulting a nurse on the street, driving a car while intoxicated, muffing a fly in March.
   366. Bleed the Freak Posted: January 03, 2011 at 12:38 AM (#3722035)
Joe Dimino - to answer your question in the 2011 ballot discussion post 327

And if you can find me a good new proxy for team defense or a way to get at the old BPro cards (since the BPro cards no longer have what I need), I'm all ears!

A hat tip to Chris Jaffe for mentioning the old-school DT Cards in an article he wrote about Omar Vizquel:

http://www.hardballtimes.com/main/article/when-do-we-start-taking-omar-vizquels-cooperstown-case-seriously/

Under the references and resources section, he lists the direct link to Vizquel, and mentions that, to query for other players, using the baseball-reference (Lahman database) abbreviation will net the correct result.

http://www.baseballprospectus.com/dt/vizquom01.shtml
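For anyone looking up several players, the pattern can be sketched as a small helper. This assumes only what the link above shows, namely that the player ID is the last path segment before ".shtml"; the function name is made up, and I have not checked that every ID resolves.

```python
def dt_card_url(player_id):
    """Build an old-style DT card URL from a baseball-reference (Lahman) player ID.
    Assumes the URL pattern of the Vizquel link quoted in the post."""
    return "http://www.baseballprospectus.com/dt/" + player_id + ".shtml"

if __name__ == "__main__":
    print(dt_card_url("vizquom01"))
```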

I hope this is helpful.