202. DL from MN
Posted: June 16, 2006 at 10:32 PM (#2066331)
If Keith Hernandez had thrown righthanded, he'd be a HoM 3rd baseman. A lot of the reason there weren't as many great 3rd basemen is the conflicting demands. The best power hitters are strong and usually lefthanded; all 3rd basemen need a strong right throwing arm. If you have a strong lefthander, he ends up at RF (or pitching). If you have a LH who can handle grounders he ends up at 1B. I think this is part of the reason there aren't as many good hitting catchers also, since they need a strong right arm.
I'll put it another way:
Lefthanded hitters are more valuable. Lefthanded throwers must play OF or 1B. Lefthanded hitters are usually lefthanded throwers. OF and 1B will have the most valuable hitters.
3B, SS and 2B can't compete for value because the players are predominantly righthanded hitters. 3B can't compete for defensive value with C, SS and 2B because they don't get as many chances to show value.
There are a couple strong 3B candidates available on the current ballot. Ken Boyer won an MVP and was an all-star 7 times. Bob Elliott led the majors in RBIs during the 40's, won an MVP and was a 7 time all-star. Elliott is easily one of the top 10 RH hitters of his era (Foxx, DiMaggio, Medwick, Bob Johnson, Gordon, Doerr, Frank McCormick, Billy Herman, Ken Keltner... I need help with the Negro Leaguers).
Look at the list of HoM 3B. Mel Ott, Eddie Mathews, Frank Baker, Stan Hack, Jud Wilson all bat lefty - that's HALF. Brett and Boggs also hit lefty. If you want a reason for the lack of 3B, there it is. The requirement of wearing a glove on your left hand makes 3B a "glove" position.
203. jimd
Posted: June 16, 2006 at 10:37 PM (#2066335)
How does one calculate fielding value? There are two parts of the problem.
One part calculates how fielders measure relative to their peers, normalizing, adjusting, comparing, etc. to the measured averages at the position. All the uberstats do this in one way or another.
The other part is to establish the value of an average fielder at each position
compared to the value of the average fielder at other positions. The only way that I've ever seen this described in print is using the theory that I label "absence-of-offense".
How much more is average CF defense worth compared to average LF defense?
According to "absence-of-offense", the answer is simple: How much offensive value are managers willing to give up to get it? If it's worth a lot to them, then CF'ers will hit a lot less than LF'ers, as they give up more and more offensive value to get more fielding value. If it's not worth much to them, then CF'ers will hit about the same as LF'ers. (Over a sufficiently large sample of personnel decisions, of course.)
The reason "absence-of-offense" works is because there are no positional distinctions on offense; everybody plays the same position, batter, so all fielding positions have the same opportunity to contribute.
This principle is easily extended to all other fielding positions. Except pitcher, because it's clear that managers are willing to sacrifice all offensive value, and all fielding value too, to get average pitching value. Pitcher is off the chart. So all we can conclude is: an average pitching staff is worth considerably more than 1/9th of the total value of an average team (at least nowadays). James estimates about 1/3rd. WARP ranges post-1893 from about 25% in the 1890's to 45% in the 1970's. (Haven't done later.)
I always thought that "absence-of-offense" was the guiding principle in the choice of the values for the fielding "intrinsic weights" (Win Shares, p. 67). But James never actually says this. He gives no explanation of the "whys" for the "intrinsic weights". Maybe he has a fundamental insight into calculating fielding value that he hasn't published to his book audience, but I'm unaware of it.
Until somebody explains to me a good alternative theory, I'm sticking with "absence-of-offense", and based on that, the "intrinsic weights" in Win Shares appear to be flawed, biased in favor of the outfield when compared to the infield.
204. DL from MN
Posted: June 16, 2006 at 10:39 PM (#2066336)
Oh crap, I had a whole lot of words eaten that can be summed up as this: 3B are being left out because they're usually right handed. RH players put up less value at the plate. Half of the current HoM 3B batted LH. You should throw them out of the list, they're all freaks. Failure to recognize great righthanded hitting 3B is the reason they're underrepresented. Wearing a glove on your lefthand makes 3B a "glove" position. Vote for Boyer and Bob Elliott.
205. jimd
Posted: June 17, 2006 at 12:05 AM (#2066450)
The distribution of "bats right" vs "bats left" + "bats both" for regulars, 1901-1960 (each season counted separately).
Assume that if all were BR, then IF would be just as likely to be All-Stars as OF.
To explain the final WS results (a 60:40 mix of OF/IF) due to handedness alone, I have to hypothesize that (BB+BL) is 3 times more likely to be an "All-Star" than BR.
If I were to examine my "All-Star" data for batting side, this hypothesis makes the following predictions: Infielder All-Stars would split about 56:44 between BR and BL+BB, and Outfielder All-Stars would split about 18:82 between BR and BL+BB.
If I were to examine random years from my All-Star lists, any bets on whether I'd find that 20% or less of the Outfield All-Star seasons batted right?
(Note: I'm not a statistician. I haven't taken a stats course in 33 years. Anybody who knows how to do this stuff is welcome to suggest a better approach. I'm all ears.)
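One way to check the arithmetic without the underlying table: a small sketch (my own, not jimd's method) that takes a position's share of right-handed-batting regulars and a hypothesized All-Star multiplier for lefty/switch hitters, and returns the predicted split. The 79% input is an illustrative placeholder chosen to reproduce the 56:44 figure, not the actual 1901-1960 count.

# Predicted BR : (BL+BB) split among All-Stars at a position, if lefty/switch
# hitters are k times as likely as righty hitters to be All-Stars.

def predicted_allstar_split(frac_br_regulars, lefty_multiplier):
    br = frac_br_regulars
    lh = (1.0 - frac_br_regulars) * lefty_multiplier
    total = br + lh
    return br / total, lh / total

# Hypothetical: a position where 79% of regulars bat right, with k = 3
print(predicted_allstar_split(0.79, 3.0))   # about (0.56, 0.44)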
206. jimd
Posted: June 17, 2006 at 12:06 AM (#2066453)
(each season counted separately).
that should say:
(each player season counted separately).
207. Brent
Posted: June 17, 2006 at 03:53 AM (#2066725)
At least 20% of all infielder star candidates would have to be moved to cause this large imbalance. It implies that a significant number of outfield stars would have evidence of great infield play in the middle/high minors. Does this sound reasonable?
a) When you make this calculation, I think you are assuming that the distribution of talent at each position starts out equal. I don't make that assumption; I strongly doubt that the distribution of talent by position is equal. (In particular, I think historically talent has been overrepresented at shortstop and at center field because that is where players with the best speed and athletic talent tend to start. In our HoM, those are the best represented positions.)
b) Many star players have been moved from the infield to the outfield early in their careers. Twenty percent doesn't sound unreasonable to me.
The other part is to establish the value of an average fielder at each position compared to the value of the average fielder at other positions. The only way that I've ever seen this described in print is using the theory that I label "absence-of-offense".
In general terms, I agree -- the defensive spectrum describes the trade-offs between offense and fielding ability. My disagreement is that you are measuring the trade-off at the top of the talent distribution, by comparing star players. In principle, however, I think the trade-off should be calibrated at the bottom of the distribution, comparing replacement level players. If a replacement level shortstop (with average fielding ability) hits with an OPS+ of 75, and a replacement level first baseman hits with an OPS+ of 105, then the "value" of the shortstop relative to the first baseman is 30 points of OPS+ (or whatever batting metric you want to use).
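A tiny sketch of that calibration, using only the two OPS+ figures from the example above (anything beyond those two values would be a made-up replacement level):

# Positional value measured at the bottom of the talent distribution:
# the gap between replacement-level hitting at two positions.

replacement_opsplus = {
    "SS": 75,    # from the example above
    "1B": 105,   # from the example above
}

def positional_offset(pos_a, pos_b):
    """OPS+ points of positional value pos_a carries relative to pos_b."""
    return replacement_opsplus[pos_b] - replacement_opsplus[pos_a]

print(positional_offset("SS", "1B"))   # 30 points of OPS+ in the shortstop's favor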
Unfortunately, measuring replacement levels is very difficult. With Win Shares, James essentially gave up on trying to measure replacement level. WARP has made a complete mess of it by incorrectly assuming that there are separate replacement levels for offense and defense and by overadjusting the WARP1 replacement levels for ball-in-play context. For win shares, I think Studeman's "win shares above bench" and Joe Dimino's replacement estimates are getting in the right ballpark, but there are too many problems with measuring replacement value to claim we can measure it with high accuracy.
The distribution of stars (i.e., the upper tail of the talent distribution) would be useful in determining the value of positions if we knew that the distribution of talent was "supposed to be" identically distributed by position, but I don't accept that premise. I don't know of any theoretical reason to think it should be. So I don't think your counts of all stars by position provide evidence one way or the other on the validity of win shares fielding values.
On the other hand, as John said on the other thread, if a voter uses win shares and also cares about positional balance they may want to adjust the raw WS numbers in their voting systems. Your tabulation is certainly interesting and useful for that purpose.
208. Chris Cobb
Posted: June 17, 2006 at 04:13 AM (#2066738)
Many star players have been moved from the infield to the outfield early in their careers. Twenty percent doesn't sound unreasonable to me.
But they move from the infield to the outfield, don't they, because they are shifting down the defensive spectrum, not because teams prefer to have their stars in the outfield rather than the infield? In the 1890s, yes, there is evidence that teams may have protected their stars from the violence on the infield base paths, and, yes, it seems likely that baseball men got the idea into their heads in the 1970s that it was axiomatic that anyone who could hit for power certainly couldn't play shortstop, but are there other instances?
If they are shifting down the defensive spectrum, wouldn't it stand to reason that teams are making tradeoffs between offensive and defensive value?
209. Brent
Posted: June 17, 2006 at 04:58 AM (#2066774)
Bill James (Juan Samuel comment in 1985 Abstract) cited Pete Rose as an example: "Doing this study [of injuries to second basemen] I came to an appreciation of the wisdom of [the] decision...to get Pete Rose out of the middle of the infield before he got hurt. There were just so many young second basemen who came up and hit well and fielded well, but who didn't become the players that they might have been because their progress was continually set back a pace or two by injuries." James also had a long discussion of injury risks for second basemen and for catchers in his study of rookies in the 1987 Abstract.
AFAIK, Rose's defense at second base wasn't criticized. For other players (e.g., Aaron) the move happened in the minors so it's harder to tell whether the player was experiencing defensive difficulties. My understanding is that Aaron was considered capable of playing second base at a major league level, but they thought his bat would be available every day if he were in the outfield. But I read and hear enough discussion of injury risks that I have to believe it's a consideration for at least some baseball front offices.
210. Brent
Posted: June 17, 2006 at 05:10 AM (#2066777)
I'll add (quoting James again): "Pete Rose...was moved to the outfield in 1967, with the reason given that playing the outfield would involve less risk of an injury and allow him to play longer."
211. Chris Cobb
Posted: June 17, 2006 at 02:06 PM (#2066877)
Brent,
Thanks for the examples. If this was part of the conventional wisdom from the 1950s until the 1980s, that would help explain why infielder hitting numbers drop so dramatically during this era. Do you think this was c. w. any earlier than that? There are so many strong-hitting shortstops and second basemen in the 1920s and 1930s that I have a hard time believing that hitters were being systematically steered away from the positions: Cronin, Gehringer, Vaughan, Frisch, Hornsby et al. But maybe there is evidence that strong hitters were being steered away from the infield to protect them from injury?
212. Dr. Chaleeko
Posted: June 17, 2006 at 03:37 PM (#2066900)
Regarding Jim and Brent's argument...
On the other hand, as John said on the other thread, if a voter uses win shares and also cares about positional balance they may want to adjust the raw WS numbers in their voting systems. Your tabulation is certainly interesting and useful for that purpose.
Or they begin their ranking process by comparing to positional peers first, then (generally) basing their final, integrated ballot on the relative dominance each guy has over his position. And if that means Clemente is your #13 guy, well, that's the crumbling cookie for you, but at least it's a consistent and somewhat sensible way to look at the ranking and ballot-construction process. And you can always "manually override" the process and move a guy up if you feel like you're slotting him too low.
213. Dr. Chaleeko
Posted: June 17, 2006 at 07:58 PM (#2067064)
It would be interesting to have a little list of 2B, SS, 3B who were severely injured during a defensive play due to contact or due to an unusual task associated with the position (namely pivoting). Here are a few that spring to mind, as well as some guesses:
2B
Joe Morgan (I think??? Didn't he have a fractured ankle from a bad pivot?)
Craig Biggio's torn ACL?
Didn't Tony Phillips rupture an ACL on the pivot? Or did he do it to someone else, maybe Vina?
SS
Tony Fernandez: hobbled by Madlock
3B
Scott Rolen: dislocated shoulder in 200? post-season during collision with passing runner.
214. Brent
Posted: June 17, 2006 at 09:19 PM (#2067140)
Thanks for the examples. If this was part of the conventional wisdom from the 1950s until the 1980s, that would help explain why infielder hitting numbers drop so dramatically during this era. Do you think this was c. w. any earlier than that?
It's hard to say. Many infield to outfield conversions took place in the minors, so there isn't much written on them, and if there were, it would be hard to tell if the reasons given to the press reflected the actual management decision making.
I'll tell you, though, that if I were a baseball exec and had read the Bill James study of rookies (from his 1987 Abstract), I would certainly move a great-hitting minor league second baseman (like Aaron) to the outfield to avoid injuries. James found, for example, that when he compared rookie second basemen who hit well to a matched set of rookie outfielders, the outfielders went on to play 43% more games, scored 74% more runs, drove in 92% more runs, and hit 3.2 times as many home runs. Even poor hitting rookie second basemen had shorter careers than poor hitting outfielders--a very surprising result, since the second basemen were contributing more on defense. Injuries, small and large, hurt the subsequent development of rookie second basemen. The same was also true of catchers, and to a lesser extent, of third basemen. Shortstops, on the other hand, didn't seem to suffer as many injuries and had longer careers than matched outfielders.
215. Brent
Posted: June 17, 2006 at 10:02 PM (#2067158)
The same was also true of catchers
I should amend that to say that good-hitting rookie catchers had shorter careers than matched good-hitting outfielders. The poor-hitting rookie catchers, however, had longer careers than poor-hitting outfielders (as expected).
216. Mefisto
Posted: June 17, 2006 at 10:52 PM (#2067177)
The other part is to establish the value of an average fielder at each position compared to the value of the average fielder at other positions. The only way that I've ever seen this described in print is using the theory that I label "absence-of-offense".
I've never cared for this approach. What ends up happening is that certain positions go through phases, fads even, when great hitters suddenly become common at a position: CF in the 50s, 3B in the 70s, SS recently. Now, if the only purpose of the evaluation is to compare a player to his peers, then it doesn't matter. But if you're trying to make cross-era comparisons, then these players get undervalued.
The true measure of defense would distribute defensive value proportionally to the run value of the plays potentially to be made at a position compared to the run value of those plays potentially made at other positions. This would establish the relative value of the different positions. Each defender would then be evaluated against the average or the replacement level, whichever is preferred, at that position.
I'm with Brent on this. I agree with jimd's approach regarding absence of offense as the approach - but I think you have to do it through the replacement level players (IMO that is the bottom 15-20% of regulars), not averages, because as mefisto says, star-gluts will wreak havoc with the numbers.
I have no problem with evaluating fielders against average (which is replacement level for fielding) - but I think absence of offense is the way to approach the 'position constant' portion of it. How much offense managers are willing to give up basically determines the defensive value of playing the position at an average level.
And the great part about this is that it adjusts as the game adjusts. Errors are up and DPs are down? Then managers sacrifice more offense to get a good fielding 3B and less offense at 2B. Even better, catchers and 1B have their hands take a beating because of bad gloves? Offense goes down at those positions, and your system adjusts. Etc., etc. This approach would be very adaptable to changes in the game.
218. jimd
Posted: June 19, 2006 at 07:02 PM (#2068566)
The true measure of defense would distribute defensive value proportionally to the run value of the plays potentially to be made at a position compared to the run value of those plays potentially made at other positions. This would establish the relative value of the different positions.
I'm not sure what you're suggesting here. If you're suggesting measuring the absolute value of the plays made at the position in comparison to having nobody playing there, well, it doesn't work. 1B participates in more plays than any other position. The potential for havoc by having a total klutz there is enormous. But with the modern mitt, acceptable performance is not difficult, hence the general conclusion that 1b is the easiest position overall.
219. jimd
Posted: June 19, 2006 at 07:26 PM (#2068608)
but I think you have to do it through the replacement level players
Now we're talking the difference between the theory and the implementation of a calculation method. The theory is always couched "given a long enough time" so that star-gluts and manager fads get averaged away.
220. Mefisto
Posted: June 19, 2006 at 07:30 PM (#2068618)
I'm not sure what you're suggesting here. If you're suggesting measuring the absolute value of the plays made at the position in comparison to having nobody playing there, well, it doesn't work. 1B participates in more plays than any other position. The potential for havoc by having a total klutz there is enormous. But with the modern mitt, acceptable performance is not difficult, hence the general conclusion that 1b is the easiest position overall.
My suggestion is more theoretically pure than pragmatic. What I mean is that the defensive value of a position essentially depends on how many plays someone needs to make at a position (adjusted for the difficulty of said plays and the run value of them). I want to measure defensive value by what happens in the field rather than at the plate.* I agree that there is no absolute value for that -- to paraphrase Bill James, the number of plays I could mishandle is essentially infinite -- but IMO we can and should use positional averages to establish that. IOW, we're judging by major league minimum standards.
*I agree with Joe that if we're going to use an indirect measure like batting, the correct one is replacement level.
221. jimd
Posted: June 19, 2006 at 07:35 PM (#2068623)
At least 20% of all infielder star candidates would have to be moved to cause this large imbalance. It implies that a significant number of outfield stars would have evidence of great infield play in the middle/high minors. Does this sound reasonable?
Here, we're not talking about the majority of IF-to-OF shifts in the minors, the ones where the guy isn't capable of playing an acceptable major-league 3B. We're talking about a conscious decision to move a major-league all-star 2b (or whatever IF position) to the OF, 20% of those guys being shifted.
Now maybe it does happen, 20% of the potential IF All-Stars are shifted to OF, in a conscious decision by their clubs to sacrifice peak value for career value (they would have more value per season as an IFer, but are more likely to accumulate a higher total as an OFer). However, to make this argument, you also have to convince me that James had this information and built it into Win Shares.
222. jimd
Posted: June 19, 2006 at 07:38 PM (#2068628)
What I mean is that the defensive value of a position essentially depends on how many plays someone needs to make at a position (adjusted for the difficulty of said plays and the run value of them). I want to measure defensive value by what happens in the field rather than at the plate.
Oh, I can agree with that statement. Just like apple pie and motherhood ;-)
223. jimd
Posted: June 19, 2006 at 07:45 PM (#2068635)
Fielding value in Win Shares revolves around the "intrinsic weights" assigned to each position. I see three possibilities for how those were chosen:
a) "absence-of-offense" was attempted, but didn't achieve the desired results
b) a new unpublished insight into the search for a "unified fielding theory"
c) trial-and-error until the results matched James' expectations
I think your choice from this menu may determine how acceptable the values are.
224. JPWF13
Posted: June 19, 2006 at 08:07 PM (#2068661)
AFAIK, Rose's defense at second base wasn't criticized.
Let me just say this, I hate Pete Rose, I hated him when he was playing, and I was positively gleeful when he was banned. (just as I'm sure Bonds haters feel whenever more info leaks about him).
I always claimed that Pete Rose was a bad fielder- because why else would he have been moved around so much.
Truth be told, I didn't see him play 2b- I saw him play 3b-
and he was fine at 3B; he certainly had the reflexes to play 3B, his arm was good enough, and when he was younger he certainly was quick enough to play 2B
so, suppressing my deep hatred of Rose, I assume he was at least a decent defensive 2B
baseball should also hold its collective nose and put his plaque in the hall- what he did wasn't 1/4th as bad as Joe Jackson imho (he took $ to throw a World Series- I don't care what his average was in that series HE TOOK MONEY- hell most of the other sox players got stiffed- but not Joe, he actually got $ in hand) rant over...
225. Dr. Chaleeko
Posted: June 19, 2006 at 09:25 PM (#2068764)
Pete Rose gets a place on the all horticultural team, of course.
1B Mike Ivie
2B Pete Rose
3B Larry Gardner
SS Donie Bush
RF Jim Greengrass
CF Darryl Strawberry
LF Zack Wheat
C Johnny Roseboro
DH Phil Plantier
P Ted Lilly
P Ross Baumgarten
226. DL from MN
Posted: June 19, 2006 at 09:32 PM (#2068777)
Shouldn't the SS be Jeter, Tim McCarver would swear that his sh*t smells like roses...
228. Mike Emeigh
Posted: June 20, 2006 at 03:41 PM (#2069642)
Fielding value in Win Shares revolves around the "intrinsic weights" assigned to each position. I see three possibilities for how those were chosen:
a) "absence-of-offense" was attempted, but didn't achieve the desired results
b) a new unpublished insight into the search for a "unified fielding theory"
c) trial-and-error until the results matched James' expectations
Based on James's description in WS of the way that the catcher weight was chosen, I would have to conclude it was (c). James gave catcher the highest intrinsic weight because when he didn't, top catchers couldn't get enough WS.
Win Shares is not an entirely objective value system (all ranking systems, to be honest, contain subjective elements, so there's nothing intrinsically wrong there, but it needs to be noted for the record). James chose 52/48 for the defense/offense divide and 67.5/32.5 for the pitching/fielding split because it gave him what he thought were appropriate results for pitchers vs position players. Clay Davenport did something similar, I should note, when deciding to divide BIP credit 70/30 fielding/pitching in his defensive ratings because it balanced out the rankings of top pitchers over time.
-- MWE
229. Dr. Chaleeko
Posted: June 20, 2006 at 03:43 PM (#2069644)
Well, Bush is already on the All-Presidents Team, and that All 1990s Bands of Questionable Talent Team so why not!
230. DavidFoss
Posted: June 20, 2006 at 04:54 PM (#2069687)
Charley Root needs a spot on the pitching staff
Billy Beane should be the GM.
231. Dr. Chaleeko
Posted: June 20, 2006 at 05:06 PM (#2069695)
Billy Beane or Branch Rickey?
232. Paul Wendt
Posted: June 20, 2006 at 05:22 PM (#2069707)
c) trial-and-error until the results matched James' expectations
I agree with Mike Emeigh on the methodology and the comment: "(all ranking systems, to be honest, contain subjective elements, so there's nothing intrinsically wrong there, but it needs to be noted for the record)."
And it's clear that James used method (c) in combination with informal (b) insight, for the finer weights that are not intrinsic (fixed for all time). The claim formula for catchers:
pre-1900  : 2PO + 4A - 3E -  PB + 4DP
1900-1930 : 2PO + 4A - 8E - 4PB + 8DP  [page 71, 1900-1930 coefficients doubled]
1931-1986 :  PO + 4A - 8E - 4PB + 6DP
"In essence, we have increased the value of the 'skill elements' relative to the value of putouts, which are essentially a meter on the catcher's playing time." (By the way, I'm sure that isn't approximately true until the late 19th century.)
Not "in essence" but "mainly" I would say. The relative weights on A, E, PB, and DP are not fixed but radically different before and after 1900; with a modest revaluation of DP i 1930. He must have experimented.
233. Dr. Chaleeko
Posted: June 20, 2006 at 05:36 PM (#2069714)
And we all know who the arch rivals of the All Horticultural Team are, right?
The All Garden Annoyance Team
--------------------------------
1B Harry "Slug" Heilmann
2B Miller "Mighty Mite" Huggins
3B Frank Crosetti
SS Rabbit Maranville
RF Rob Deer
CF Bug Holliday
LF Johnny Grub
C Don "Sluggo" Slaught
DH Red Badgro
P Dave Frost
P Chris Spurgeon
P Hooks Wiltse
MGR John Boles
234. jimd
Posted: June 20, 2006 at 07:40 PM (#2069818)
The All Garden Annoyance Team
P Grasshopper Jim Whitney
P Toad Ramsey
P Bugs Raymond
P Mickey Mouse Melton
235. Brent
Posted: June 21, 2006 at 03:03 AM (#2070420)
but I think you have to do it through the replacement level players
Now we're talking the difference between the theory and the implementation of a calculation method. The theory is always couched "given a long enough time" so that star-gluts and manager fads get averaged away.
No. I don't accept the premise that the distribution of talent is the same at each position. If the distribution of talent differs by position, then theoretically the calculation of fielding value based on replacement-level players will give a different answer than calculation based on all stars. And the differences don't get averaged away "given a long enough time."
236. sunnyday2
Posted: June 21, 2006 at 11:43 AM (#2070554)
Catching up on this thread since traveling last weekend:
>If Keith Hernandez had thrown righthanded, he'd be a HoM 3rd baseman.
If Keith Hernandez had thrown righthanded, he'd still be a ML prick.
Very interesting hypothesis re. our shortage of HoM 3Bs--i.e. of course we're short 3Bs because only the righthanded population is eligible. (Of course other positions are all RH but not short of HoMers, though catcher also supports the hypothesis.) The corollary--or really, the more important part of the hypothesis--to this is that if RH only can (or, rather, do) play the position, then lots of innings are going to guys that wouldn't make the MLs in an ambidextrous world.
Re. players moving down the spectrum and the reasons for that--Rod Carew moved to 1B for the same reason as Ernie Banks, so that he could stay in the lineup more. He was not regarded as a great 2B but, still, he moved so he could stay in the lineup and get more PAs. His true position was "hitter." When he becomes eligible I'll be interested to compare him to Banks. As a Twins fan it pains me to say that Carew is probably pretty overrated for the usual reason that he did one thing (hit for average) so very very well.
The "absence of offense" provides an interesting perspective but I don't think you can assume a high correlation for the reason that the connection beetween offense and defense is filtered through the knowledge and opinions and biases of the management, which is an imperfect reflection of reality.
Being married to a Master Gardener, I can tell you that no Toad never did no harm to no garden.
237. Dr. Chaleeko
Posted: June 21, 2006 at 01:01 PM (#2070570)
If Keith Hernandez had thrown righthanded, he'd still be a ML prick.
Do you say this due to the coke problems? It always seemed like he was a smiling, happy guy on the field, but I never knew much about him as a person other than the snorting.
238. DL from MN
Posted: June 21, 2006 at 02:02 PM (#2070599)
Of the RH positions (C, 2B, 3B, SS) there are 36 righthanded batters, 3 switch hitters and 14 lefthanded batters. The majority of lefthanded batters are catchers (5) and 3rd basemen (5), and there is 1 switch hitting catcher (Mackey) and 1 switch hitting 3B (George Davis). The better hitting players at both positions are lefty (Berra, Cochrane, Dickey, Santop, Mathews, Baker, Ott); Josh Gibson and John Beckwith batted RH. Catcher and 3B in the HoM are very much dominated by freak athletes who could throw RH and bat LH - 6 of 13 C and 6 of 12 3B. There are only 5/28 among SS and 2B.
I figured a group of systemizers like this would be interested in this article:
http://www.eetimes.com/news/latest/showArticle.jhtml;jsessionid=N2B3QOHS4IJROQSNDLSCKHA?articleID=189401732
239. DavidFoss
Posted: June 21, 2006 at 03:19 PM (#2070658)
The "absence of offense" provides an interesting perspective
The trouble with 3B is that it's in the *middle* of the spectrum and not at one of the ends. Guys like George Davis and A-Rod prove this point. If a star 3B can play a tougher position like SS, he will. I think there tend to be more multi-positional players who have played 3B than other positions (though I could be wrong on that point).
240. sunnyday2
Posted: June 21, 2006 at 03:30 PM (#2070674)
>I think there tend to be more multi-positional players who have played 3B than other positions (though I could be wrong on that point).
That sure seems to be the case. And like any player who does a variety of things well, instead of one thing great, a guy who plays multi positions (think Tommy Leach) will also tend to be underrated. Rose and Killebrew are the exceptions that prove the rule. (Dick Allen is the rule, though of course he has other issues. But he gets no credit whatsoever for playing 3B, he's just thought of as a mediocre fielding 1B.)
241. DL from MN
Posted: June 21, 2006 at 04:13 PM (#2070716)
Martin Dihigo fits in this category.
242. Steve Treder
Posted: June 21, 2006 at 04:30 PM (#2070736)
I think there tend to be more multi-positional players who have played 3B than other positions (though I could be wrong on that point).
243. DL from MN
Posted: June 21, 2006 at 05:16 PM (#2070786)
Could this be screwing up the replacement level? People who can hit get a tryout at 3rd, if they can't throw they move to 1B and if they can't field grounders they move to RF. There are gloveless sluggers bringing up the offensive levels and converted SS bringing up the defensive levels. Managers will bat Troy Glaus four times but replace him during high leverage innings with a utility infielder.
244. Mike Emeigh
Posted: June 21, 2006 at 09:24 PM (#2071078)
If a star 3B can play a tougher position like SS, he will.
True in today's game, not true before about 1930 when 3B was still considered (defensively) on a virtual par with middle infield positions.
-- MWE
245. jimd
Posted: June 22, 2006 at 01:11 AM (#2071579)
not true before about 1930 when 3B was still considered (defensively) on a virtual par with middle infield positions
Dead-ball 2B hit somewhat more than 3B from around 1900-1930.
19thC 3B hit a little more than 2B.
To consider the positions equal in weight before 1940 or so would be reasonable.
James' refinement of the "intrinsic weights" for these two positions is far more fussy than he is with the other positions.
246. Steve Treder
Posted: June 22, 2006 at 01:18 AM (#2071597)
James' refinement of the "intrinsic weights" for these two positions is far more fussy than he is with the other positions.
As it should be. The relationship of 3B to all other positions is the most difficult of all such relationships to establish.
247. jimd
Posted: June 22, 2006 at 01:34 AM (#2071666)
Based on top-32. Additional players added each season when necessary to match WS count due to ties.
248. jimd
Posted: June 22, 2006 at 01:43 AM (#2071704)
As it should be. The relationship of 3B to all other positions is the most difficult of all such relationships to establish.
That may be. But the "intrinsic weights" appear to be wrong for the two positions during the 19th Century, when 3b outhit 2b. This will overrate e.g. Ned Williamson and Jimmy Collins at the expense of Fred Dunlap and Cupid Childs.
James also makes no adjustment for 1B circa 1890-1920 when it appears to be a more difficult position to field than any of the OF positions, at least when measured by "absence-of-offense" (though still not close to 2B/3B in fielding difficulty). The change in offensive value of 1B relative to historical norms is of a similar magnitude as the 2b/3b shifts.
249. Chris Cobb
Posted: June 22, 2006 at 01:59 AM (#2071764)
Thanks, jimd, for the WARP1 all-star tables.
WARP1 seems to have a more plausible OF/IF split, but it sure looks like pitchers are overrated from 1941-60!
That may be. But the "intrinsic weights" appear to be wrong for the two positions during the 19th Century, when 3b outhit 2b. This will overrate e.g. Ned Williamson and Jimmy Collins at the expense of Fred Dunlap and Cupid Childs.
I agree, though it's worth mentioning that WARP1 errs in the other direction, giving pre-1900 second basemen significantly more defensive credit than third basemen, overrating Dunlap and Childs at the expense of Williamson and Collins. I agree with jimd's assessment that simply treating the two positions as equal defensively to 1940 would be reasonable.
250. jimd
Posted: June 22, 2006 at 02:44 AM (#2071875)
WARP1 seems to have a more plausible OF/IF split, but it sure looks like pitchers are overrated from 1941-60!
Maybe.
Pitchers are 28% of the All-Stars from 1901-1940. This would be consistent with a model that pitching staffs consist of 3 full-time starters and a number of role pitchers that fill up the slack as spot-starters (due to the omnipresent doubleheaders) and relievers, long and medium (who are oriented more towards comeback mode, i.e. 2-3 inning stints between the pinch-hitters in the 9 slot, than they are towards modern lead-protection mode). 8 starting players and 3 full-time pitchers; 3/11 = 27%.
Pitchers are 40% of the All-Stars from 1941-1960. This would be consistent with a model that pitching staffs consist of 4 full-time starters and one full-time relief specialist, in addition to the role pitchers. Double-headers are in decline allowing 4 man all-star rotations to develop (see Cleveland), and a number of relievers reached short-term star status (though they lacked longevity, except for Wilhelm). 8 starting players and 5 full-time pitchers; 5/13 = 38%.
Or maybe it's just a strong era for pitchers, like CF in the 1910's or 1B in the 1930's. Most likely, it's both together.
251. Brent
Posted: June 22, 2006 at 03:02 AM (#2071898)
Considering that pitcher workloads were much lighter in 1941-60 than in 1901-20, I think this is clear evidence that WARP1 overdid their adjustments to pitcher replacement values.
252. KJOK
Posted: June 22, 2006 at 05:24 AM (#2072022)
TangoTiger has a two part series on how to calculate Fielding Positional Adjustments. Part I is here:
253. Paul Wendt
Posted: June 22, 2006 at 05:05 PM (#2072300)
the more important part of the hypothesis--to this is that if RH only can (or, rather, do) play the position, then lots of innings are going to guys that wouldn't make the MLs in an ambidextrous world.
or if the game were played clockwise in odd innings and counterclockwise in even ones, with no position changes permitted (infielders change locations when first and third bases do).
On jimd #247
In both the Win Shares and WARP1 tables, the three smallest numbers are those for catchers in the three 20-year periods. But the third basemen 1921-1940 are also "outstanding" in both.
At SS, 2B, and P, the two rating systems show remarkable opposite time patterns. For Win Shares, the five men around the basepaths all gain all-stars at the expense of pitchers. For WARP, the SS and 2B specifically lose all-stars to the pitchers. 3B, 1B, and C all-star numbers go upanddown or downandup with little net effect from early to mid-century.
254. Mike Emeigh
Posted: June 23, 2006 at 02:32 AM (#2072926)
I agree with jimd's assessment that simply treating the two positions as equal defensively to 1940 would be reasonable.
I'm inclined to agree with this, and I also think that until about 1930 (and maybe to 1940) there's not a lot of difference between 2B/3B and SS.
-- MWE
255. Jeff M
Posted: June 23, 2006 at 03:49 AM (#2072994)
I think you have to do it through the replacement level players...
How much offense managers are willing to give up basically determines the defensive value of playing the position at an average level.
I don't understand how to make these two statements work together. The replacement approach makes sense to me, so I don't need an explanation there.
However, I've always had trouble with the "absence of offense" approach to defense because it seems to assume managers and scouts are making the correct decisions about which players are most valuable offensively, and penciling them into the appropriate defensive spot as a result.
Much of the sabermetric literature over the last few years has called that into question in various ways. Some players traditionally considered good hitters have been determined to be out machines and drains on run scoring by sabermetric methods. So if teams are misinterpreting how good (or bad) a hitter is, and the absence of offense determines the value of defense at a position, won't a position's defensive value be misrepresented?
Some managers thought Tony Womack was a leadoff hitter. :)
256. Chris Cobb
Posted: June 23, 2006 at 04:59 AM (#2073033)
However, I've always had trouble with the "absence of offense" approach to defense because it seems to assume managers and scouts are making the correct decisions about which players are most valuable offensively, and penciling them into the appropriate defensive spot as a result.
Much of the sabermetric literature over the last few years has called that into question in various ways.
The "absence of offense" approach relies on an assumption similar to the Hall of Merit project itself, actually.
It is surely correct that individual managers, scouts, etc. will make mistakes in both their evaluation of players and in their assumptions about how value is created in baseball, but the acceptable offensive level for a position is established not by the choice of any one person but by the combined effect of all the choices of all the people with decision-making power about who plays and who doesn't. The "absence of offense" approach takes the position that this aggregate assessment is likely, on the whole, to be a reasonably accurate judgment.
The Hall of Merit as a group project is premised on the idea that the aggregate of all the choices of all the people with decision-making power about who gets into the HoM and who does not will constitute a more reliable judgment about who the best players are than a single individual creating a list. Individual voters make mistakes, and the electorate as a whole may be mistaken in its judgments about value, but the results of the process cannot be lightly dismissed as evidence about who the best players have been.
Returning to assessment based on the combined effect of management decisions across the major leagues: there may be times when assessments of value should not rely on the evidence of such combined effect, especially when sabermetric or historical study identifies deeply flawed conventional wisdom about strategy or value that was pervasive in some period of baseball history. However, in the absence of such evidence about general errors, dismissing this evidence based on aggregated management decisions because individual managers make serious mistakes seems unwarranted.
257. Paul Wendt
Posted: June 23, 2006 at 05:18 AM (#2073047)
Chris Cobb: The "absence of offense" approach relies on an assumption similar to the Hall of Merit project itself, actually.
. . .
The Hall of Merit as a group project is premised on the idea that the aggregate of all the choices of all the people with decision-making power about who gets into the HoM and who does not will constitute a more reliable judgment
Whew. I thought I might read that the Hall of Merit project, generally judging merit largely by value, supposes that players are generally used where and how they have greatest value.
JeffM: Some players traditionally considered good hitters have been determined to be out machines and drains on run scoring by sabermetric methods. So if teams are misinterpreting how good (or bad) a hitter is, and the absence of offense determines the value of defense at a position, won't a position's defensive value be misrepresented?
Unless those misjudgments about batter-runner value differ systematically by fielding position, they are only a source of noise, not bias. (Noise is a serious problem, I admit.)
Misjudgments of the fielding positions, what player types can or should play them, are probably a source of bias.
258. KJOK
Posted: September 18, 2006 at 05:01 PM (#2180657)
Good article at Baseball Prospectus on Replacement Level.
One statement I would take exception to is that for position players, the replacement level doesn't matter so much, because it impacts all players equally - it doesn't quite work that way, and over a career where you set the replacement level has a rather large impact.
259. Dr. Chaleeko
Posted: September 19, 2006 at 02:56 AM (#2181528)
KJ,
A very interesting article indeed. And somewhat pertinent to at least one HOM candidate discussion. Hasn't there been some discussion of Frank Chance in chaining versus replacement terms?
260. KJOK
Posted: September 19, 2006 at 05:19 AM (#2181624)
A very interesting article indeed. And somewhat pertinent to at least one HOM candidate discussion. Hasn't there been some discussion of Frank Chance in chaining versus replacement terms?
Probably, although it was more along the lines of 'actual' replacement instead of theoretical. I think 'raising' the theoretical replacement level helps Frank. I basically use a '.500' baseline, and of course he's near the top of my ballot...
261. sunnyday2
Posted: October 04, 2006 at 03:10 AM (#2196929)
Did everybody see this?
-----------
Baseball Prospectus' WARP1 is wrong
Let’s start off with the definition of WARP-1:
WARP-1
Wins Above Replacement Player, level 1. The number of wins this player contributed, above what a replacement level hitter, fielder, and pitcher would have done, with adjustments only for within the season.
Then, let’s look at a .500 team. When I need a .500 team, pretty much without fail, I look for the Houston Astros, and they satisfy my needs. They scored about as much as they allowed, and won about as many as they lost. Let’s take a look at their team:
BP does a great job in presenting their stats, making my job very easy:
http://www.baseballprospectus.com/dt//2006HOU-N.shtml
If you go down to the “Advanced Batting Statistics” section, which is a misnomer, since the data there is batting, fielding, and pitching, the team WARP-1 total is 58.5 wins. The Astros won 82 games, which is pretty much what their RS/RA numbers would have expected. 82 minus 58.5 is 23.5 wins. 23.5 / 162 = .145. Another perennial .500 team I like is the Seattle Mariners. Their team WARP-1 is 52.1, and they won 78. Their RS/RA would have expected around that as well. 78 minus 52.1 = 25.9 wins. 25.9/162=.160. The Yanks won 97, which is also around their pythag record. Team WARP-1 is 71.9. 97 minus 71.9 = 25.1 wins, or a .155 record.
What do we learn here? That BP’s WARP-1 treats the replacement level as a team with a .150 record.
I’ve shown elsewhere on this site (click on the “Talent Distribution” category at the bottom of this entry) how the most likely team replacement level is a .300 record. This can be shown in many many ways. And, that pretty much is the number most analysts would use. To use anything else is, frankly, just plain wrong. Or needs a ton of explaining.
The replacement levels that I use are: .380 for a position player or a starting pitcher, and .470 for a reliever. A team of such players will win .300 games.
So, why does BP calculate WARP-1 the way they do? The likelihood is that it treats a “replacement-level” position player as a replacement-level fielder and replacement-level batter. But, such a player is not the 420th best position-player in the world. He’s probably not even in the top 1000 players in the world. Why is this the benchmark? What does it tell us?
I know all about the 1899 Spiders, and the recent Tigers. It doesn’t matter. Even if an MLB team posts a .140 or .250 record, our best estimate of the true talent level of these teams is nowhere close to those records. They probably need to be regressed 25-50% towards the mean.
Posted by Tangotiger on 10/03 at 02:50 PM
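Tango's arithmetic is easy to reproduce; a quick sketch using the team totals quoted above:

# Back out the replacement-level winning percentage implied by team WARP-1.

def implied_replacement_pct(team_wins, team_warp1, games=162):
    return (team_wins - team_warp1) / games

for team, wins, warp1 in [("HOU", 82, 58.5), ("SEA", 78, 52.1), ("NYY", 97, 71.9)]:
    print(team, round(implied_replacement_pct(wins, warp1), 3))
# roughly .145, .160, .155 -- i.e. a ~.150 "replacement team"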
262. KJOK
Posted: October 04, 2006 at 04:34 AM (#2196992)
Yes, excellent, excellent article. Now he needs to write one about Wins Shares too....
263. Dr. Chaleeko
Posted: October 04, 2006 at 01:02 PM (#2197072)
Yes, yes, yes. It's all (or in large measure) about that wacky AA fielding thing.... If a rational and thoughtful voice like Tango's isn't strong enough to tell WARPers that something smells rotten in the Fielding Stats of Denmark, what will?
KJ,
He did do up a big Win Shares dialogue with Rob Wood that's pretty good. I don't have an URL, but if you google it, I'll bet you'll find it. Their main objections to WS are pretty much the same as we've all discussed: the zeroing out thing, questions about team v. individual fielding, stuff like that. It's a good discussion.
264. DL from MN
Posted: October 04, 2006 at 01:16 PM (#2197083)
What kinds of players is that going to boost? Players who were average for a very long time, Nellie Fox for example. This is the reason I look much more at the BRAA and FRAA than I do at the actual WARP. I have more confidence in their calculation of the 50th percentile than I do of their calculation of the replacement level.
265. Dr. Chaleeko
Posted: October 05, 2006 at 09:41 PM (#2199712)
OK, so I had this really interesting blog-to-email discussion with a guy from a reputable online baseball analysis group about replacement and about the problems with Clay D's way of looking at replacement. I'd been harping on my favorite examples of WARP's wacky fielding...you remember the example of Barry Bonds, 17% fielding, right? Or Gil Hodges, 33%. So this guy says to me something to the effect of:
There are no replacement fielders or hitters, only replacement players.
This kind of knocked me over, and I couldn't wrap my head around it. So I queried him about it to get a better understanding of his point. At one point I summarized his position in my own words to make sure I understood it, and he said that the summation was just about on target. So I'll share that with you because I think it's potentially very interesting/important.
One quick note about this, the fellow told me that replacement can be roughly defined as any player whose combined hitting/fielding/pitching contributions are 17 runs below average per 150 games played.
OK, here goes:
So if you were Clay Davenport, you would do away with BRAR/FRAR and compare all players to average levels of batting and fielding (putting aside questions of position for the moment). And you would note instead how many runs above or below the threshold of -17 runs /150 games versus average the player is in order to gauge his replacement value. You still have distinctions between batting and fielding but the level of replacement in each is not important to you because it will ultimately be self-defining over a large population of players and because total value is more important. So you needn't force fit any kind of threshold for each of batting and fielding, except as is relative to average (the -17/150). So your theoretical objection to Clay's work is, therefore, the force fitting of a hard-and-fast line that defines replacement in each facet of a batter-fielder's game, and which then causes the illusion Tango describes where Clay ends up with a .150 replacement level due to the interaction between his fielding and batting measures.
Just because it takes a spell to digest this, let me break it out piece by piece.
1) Replacement level is -17 RAA net of hitting/fielding/pitching.
2) Therefore a replacement player is someone whose total net RAA don't exceed this, no matter how they do it, whether by fielding ineptitude or offensive awfulness, or pitching "prowess"
3) Distinctions between batting and fielding are based on BRAA/FRAA not on BRAR/FRAR because we're not trying to peg the player's performance in each facet to a single threshold, but rather trying to figure how many runs above or below average he is.
4) Therefore BRAR/FRAR are artificial and redundant. They are an attempt to force-fit a threshold of replacement where it needn't exist.
To explain why, the gentleman used the example that a "typical" replacement team might have the following breakdown of winning percentages among its players:
hitters: .395
fielders: .485
pitchers: .380
BUT, that's only the mean replacement value. (Making up numbers here...) if the team has .350 hitters, it might have .500 fielders and .400 pitchers. Or it might have .200 pitchers, .500 hitters and .600 fielders. The point is that the typical replacement level can be misleading if it doesn't offer a true sense of the net RAA for a player...that is, it doesn't matter how a guy goes about racking up -17 RAA/150 G, only that he does it (well, doesn't do it).
5) The artificial BRAR/FRAR line leads to potentially large distortions due to the interaction of fielding and hitting replacement values (and pitching too, I guess, I didn't ask); it's why WARP's real replacement is .150ish (24-138) instead of the more realistic .300ish (49-113). I haven't done so, but I suspect that if you use the Pythagorean formula or PythagenPat to figure out what the BRAR/FRAR distinctions do to the system's real replacement level, you'll see why there's bigtime distortion.
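A minimal sketch of the replacement test as the correspondent defines it (the player line in the example is hypothetical):

# Replacement test: combined hitting/fielding/pitching runs above average
# must beat -17 per 150 games played.

def above_replacement(net_raa, games, threshold_per_150=-17.0):
    return net_raa > threshold_per_150 * (games / 150.0)

print(above_replacement(net_raa=-10.0, games=120))   # True: -10 beats -13.6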
Something the fellow added was that it is important to adjust for position, though I'd guess we all know that.
Anyway, I learned a lot from this guy, and I figured that maybe it would be pretty important stuff for those who live/die by WARP---and for all of us too since we talk about replacement a lot.
266. KJOK
Posted: October 05, 2006 at 11:45 PM (#2199802)
This was a post from Clay D. himself, repeated in one of the BTF comment threads (bolding is mine) :
...WARP numbers for individuals don't add up to the same values that I would get if I evaluated them from their team numbers. Consider the 2004 Orioles. This team actually went 78-84. Since the replacement level win% is .153, their actual wins above replacement is 78-162*.153=53.2. Call it 53.
The sum of their individuals is 81, a big difference to be sure from the desired 53. But consider the team totals of batting rar (217), fielding rar (164) and pitching rar (381). These numbers have been adjusted to the standardized league of 9.0 runs per game, pythagorean exponent of 2. In the standard, a replacement level team scores 537 runs per 162 (replacement level EQA of .230, divided by league average of .260, raised to 2.5 power to convert it to runs, times 162 times 4.5 runs per team per game = 537) and allows 1262 (replacement pitching and fielding together yield a standard 7.79 replacement ERA, times 162 games = 1262).
A team that scores 537 and allows 1262 has a Pythagorean win pct of .153. The Orioles adjust to 537+217 brar=754 runs scored and 1262-164 frar-381 prar=717 runs allowed. That gives them Pyth% of .525 (85.1 wins), which makes them (.525-.153)*162=60 wins above replacement.
So we estimate the team as being 60 warp, and in real life they had 53 warp. That's not so bad, considering that they underperformed their real-world Pythagorean estimate by about 4 wins, and they allowed a few more runs than expected based on their pitching statistics.
The main point of that was to show that while the individual WARPs added together to 81, the team WARP only came to 60. There is a diminishing returns principle in the system, which follows from the mathematics of the Pythagorean model. A given number of extra runs will increase the win total of a bad team more than it will a good team. When Tejada or Mora or Lopez have their WARP calculated, they are adding their statistics to a replacement level team (simplifying; I actually calculate a team as average but for one replacement level player, compared to average but for this player). After you add Tejada, the team isn't replacement level anymore, and you don't get as many wins from Mora as before. Now that you have Tejada and Mora, the team is even farther from replacement level, and Lopez' contributions don't count as much. That, in a nutshell, is the problem. The team environment does make a difference on how many wins a player's performance creates, but I've adjusted that context out.
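Clay's numbers check out; a short sketch reproducing the arithmetic quoted above (standardized league of 9.0 runs per game, Pythagorean exponent of 2):

def pyth_pct(rs, ra, exp=2.0):
    return rs**exp / (rs**exp + ra**exp)

REPL_RS, REPL_RA, GAMES = 537.0, 1262.0, 162

repl_pct = pyth_pct(REPL_RS, REPL_RA)                          # ~.153
brar, frar, prar = 217.0, 164.0, 381.0                         # 2004 Orioles totals quoted above
team_pct = pyth_pct(REPL_RS + brar, REPL_RA - frar - prar)     # ~.525
print(round(repl_pct, 3), round(team_pct, 3))
print(round((team_pct - repl_pct) * GAMES, 1))                 # ~60 team wins above replacement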
267. KJOK
Posted: October 05, 2006 at 11:50 PM (#2199806)
And my comment would be, using a .300 baseline, which I think is even too low, you would get:
78 - 162*.300 = 29.4 Team Wins Above Replacement Team.
Actual Orioles individual WARP = 81
"OVER" WARP for just this one almost average team = 81 - 29 = 52! (52/29 = 170% "over"WARP'd!)
268. Brent
Posted: October 06, 2006 at 02:53 AM (#2200221)
I agree with Dr. C's friend that there are no replacement fielders or hitters, only replacement players. A player's hitting can't be separated from his fielding (except for a DH or pinch hitter, for whom the value of fielding is zero), so you have to evaluate the whole package to see if he's above replacement level. However, WARP also uses replacement levels to introduce the defensive spectrum, so if you drop them you need to come up with another way to introduce the defensive spectrum into the evaluation.
269. Dr. Chaleeko
Posted: October 06, 2006 at 01:14 PM (#2200401)
Joe Dimino,
Based on the WARP is wrong article that KJOK pasted in above, and some of Clay D's responses, can you describe the replacement levels in PA and your pitcher WAR? Do they avoid any of these same issues of too-low replacement? Thanks!
270. jimd
Posted: October 06, 2006 at 05:08 PM (#2200613)
1) Replacement level is -17 RAA net of hitting/fielding/pitching.
Not quite. RAAP, not RAA. Batting runs must be calculated relative to positional average for this to work.
3) Distinctions between batting and fielding are based on BRAA/FRAA not on BRAR/FRAR because we're not trying to peg the player's performance in each facet to a single threshold, but rather trying to figure how many runs above or below average he is.
Again, not quite. BRAAP, not BRAA. Batting runs must be calculated relative to positional average for this to work.
*****
BRAA is a useless stat unless the batting difference between positions is considered. It can be added to BRAA (to create BRAAP) or added to FRAA (to create FRAR). But it MUST be accounted for.
If you're not quite clear about what I mean, think about it this way. An average fielding 1b-man who hits like an average SS doesn't have a job. An average fielding SS who hits like an average 1b-man is an All-Star.
Davenport's system needs one last step. After values are calculated relative to the various replacement thresholds (batting, pitching, SS fielding, 1B fielding, etc.), an amount proportional to playing time must be subtracted to account for the difference between his "theoretical" replacement level, and the "true" replacement level of -17 RAA (or whatever it really is).
(Win Shares needs similar adjustments to BWS, PWS, SS-FWS, FB-FWS, etc., for similar reasons, and unfortunately each adjustment is separate. Of course, they then wouldn't add up to Wins, but as it stands, a large proportion of the Win Shares are awarded for sub-replacement play.)
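A hypothetical sketch of the last step described here, assuming the -17 RAA per 150 games figure quoted above for true replacement: value each facet against positional average (BRAAP, FRAA), then add back the average-to-replacement gap prorated by playing time. The function name and sample values are illustrative only, not anyone's actual implementation:

def runs_above_true_replacement(braap, fraa, games, true_rep_raa_per_150=-17.0):
    value_vs_average = braap + fraa
    # a true replacement player would be ~17 runs below average over 150 games,
    # so a player gets +17 * (games/150) of replacement credit on top of his value vs. average
    replacement_credit = -true_rep_raa_per_150 * (games / 150.0)
    return value_vs_average + replacement_credit

# e.g. an average-hitting, average-fielding regular (0 BRAAP, 0 FRAA) over 150 games
# comes out 17 runs above true replacement, rather than the much larger BRAR+FRAR total.
print(runs_above_true_replacement(0, 0, 150))   # -> 17.0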
271. Rally
Posted: October 06, 2006 at 05:26 PM (#2200640)
Why do people even talk about WARP and win shares anymore?
We have an uberstat if anyone cared to put it together: linear weights. Too bad MGL hasn't posted that in a few years.
You don't need his expensive dataset to do it though:
Park adjusted LW runs + ZR defense + baserunning + OF throwing + double plays. Throw in clutchiness if you so desire. What else is there?
And it's all available for not too much $:
Baserunning from either the Bill James handbook or Hardball Times
DP data from Hardball Times.
I can't remember where I saw OF throwing but there are sources. If you're a baseball hack you can pull it together from MLB.com gamelogs.
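For what it's worth, the recipe above amounts to nothing more than adding components that are each already expressed in runs above average; a toy sketch (every name and number here is made up) might look like:

def total_runs_above_average(batting_lwts, zr_fielding, baserunning,
                             of_throwing=0.0, double_plays=0.0, clutch=0.0):
    # park-adjusted linear-weights batting + ZR fielding + the smaller components
    return (batting_lwts + zr_fielding + baserunning
            + of_throwing + double_plays + clutch)

# e.g. a good-hitting, roughly average-fielding corner outfielder (made-up values)
print(total_runs_above_average(batting_lwts=30.0, zr_fielding=-2.0,
                               baserunning=3.0, of_throwing=1.5, double_plays=0.5))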
272. jimd
Posted: October 06, 2006 at 11:59 PM (#2201263)
That BP’s WARP-1 treats the replacement level as a team with a .150 record.
Compare to Win Shares which treats the replacement level as a team with a .000 record. WARP-1 is closer to reality (it gets you about halfway there).
The main point of that was to show that while the individual WARPs added together to 81, the team WARP only came to 60.
Non-linear models are non-intuitive and take some getting used to.
Those who have taken courses in relativistic physics know what I'm talking about.
273. jimd
Posted: October 07, 2006 at 12:02 AM (#2201269)
Clarification:
BRAA is a useless stat unless the batting difference between positions is considered. It [the batting differences between positions] can be added to BRAA (to create BRAAP) or added to FRAA (to create FRAR). But it MUST be accounted for.
Why do people even talk about WARP and win shares anymore?
Well, we still have to deal with players in the pre-ZR era. And, regardless of how the defensive figures are derived (PBP or using traditional categories), there are interesting philosophical questions to consider, such as the replacement level discussion happening here and elsewhere.
Compare to Win Shares which treats the replacement level as a team with a .000 record. WARP-1 is closer to reality (it gets you about halfway there).
Well, I accept Bill James' argument that he's not using replacement level at all as a baseline for Win Shares. The most natural complement to Win Shares isn't necessarily Win Shares Above Replacement (or Bench, as the Hardball Times does), I think, but Loss Shares.
275. Patriot
Posted: October 07, 2006 at 01:27 AM (#2201490)
Well I think WS has a baseline, and it's as clear as a bell. Just talking offense, if you produce at less than 52%, you can't get any WS. If that's not a baseline, I don't know what is. The WS offensive baseline is ~.200. That Bill pretends that contributions above .200 = contributions above 0 doesn't make it so.
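One way to see where that ~.200 figure comes from, assuming a simple exponent-2 Pythagorean model: a team scoring 52% of the league-average runs while allowing a league-average total projects to roughly a .213 winning percentage (the league-average run figure below is made up and doesn't affect the result):

def pyth_pct(rs, ra, exponent=2.0):
    return rs ** exponent / (rs ** exponent + ra ** exponent)

league_avg = 750.0   # hypothetical league-average runs per team
print(round(pyth_pct(0.52 * league_avg, league_avg), 3))   # -> 0.213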
I posted this in the other thread on WARP, but I don't know if anyone saw it. But I think there's a problem with Tango's methodology with WARP1.
If a pitcher gives up say 60 runs less than what a replacement pitcher is expected to give up, he's going to get credit for 60 PRAR and probably around 6 WARP. Well, quite a few of those runs prevented are going to be given out to fielders on the team, too, so those runs are getting double counted. To get what WARP1 thinks is a true replacement level team, it'd be better to just calculate WARP1 for position players with just their offense.
Park adjusted LW runs + ZR defense + baserunning + OF throwing + double plays.
And I've done that, although my OF arms component is based just on assists (A), so it can be improved a lot. Baserunning, save SB and CS, isn't included either. I would think someone here could write up code that could grab baserunning data from game logs and we could calculate baserunning runs very simply from there.
278. KJOK
Posted: October 16, 2006 at 11:12 PM (#2214439)
Maybe someone else can confirm this - it looks to me as if Baseball Prospectus has recently completed their yearly "re-calc" of WARP, causing the WARP numbers for almost all players to change by at least a little bit.
279. Juan V
Posted: October 16, 2006 at 11:17 PM (#2214442)
Well, their recalibrations almost threw my 85 and 86 ballots outta whack, but I haven't seen any changes since.
280. KJOK
Posted: October 17, 2006 at 12:42 AM (#2214520)
Well, we just did '87 ballots, so a change before '86 would be 'recent', I would think. They have 2006 stats posted, so the change likely occurred whenever the 2006 stats were completed.
281. KJOK
Posted: October 24, 2006 at 10:58 PM (#2223499)
Went to update Stargell - in two years he's gone from 89.8 WARP1 to 110.2. Seems like a big jump to me - defense went from terrible to around average....
282. TomH
Posted: October 28, 2006 at 10:04 PM (#2227414)
Many of us measure Value in terms of regular season wins.
How do I establish “value” for being a World Series winner?
How does it change from 1903-68 to the modern day wild card format?
How much is a hit in a playoff game worth compared to a regular season one?
I’ll begin by estimating the typical franchise/fan value of winning it all.
A good season, 90 wins out of 162, is 25 wins better than a bad one.
Pre-1969: I’ll say a W.S. is worth an extra 50 wins. That way, two teams, one which won 65 games in year 1 and 90 games and a Ring in year 2, has the same “value” as a team that won 105 and 100 games but came up short of a pennant both years (probably need a few more years in here to make it more realistic..)
Winning the regular season pennant I’ll say is worth half of that; 25 extra wins. Continuing the previous example, a team that won 80 games in year 1 and 100 and the pennant but lost the Big One in year 2 would be valued the same as the other instances given.
Team A
65 wins = 0
90 wins + WS = 25 + 50 = 75 ..total of 75
Team B
105 wins = 40
100 wins = 35 ..total of 75
Team C
80 wins = 15
100 wins + pennant = 35 + 25 = 60 ..total of 75
Obviously this is not the same for all franchises; after a long period of good teams but finishing frustratingly short, a trophy is probably worth more.
If a Ring = 50 wins, i.e., 25 wins more than the pennant the team already clinched to get INTO the W.S., then one W.S. game victory = a max of 6 wins, since it takes 4 games to get a Ring. You can win 3 and still get nothing (well, it's better than getting swept I guess), so I'll call each W.S. game win = 5 regular season wins.
In actuality, each regular season win has some extra value in pushing you TOWARD the goal of winning the pennant.
Now, fast forward to 2000. 30 teams instead of 16, 8 playoff teams instead of 2.
It is harder to win a Ring.
But it’s easier to get some good feeling = bonus credit = $$ for at least making the playoffs.
I'll say a Ring in 1995ff = 40 wins. Pennant = 22, making it to LCS = 13, getting to playoffs = 8.
So W.S. victory = 18 wins beyond already winning the pennant.
LCS victory = 9 wins + ½ of WS (potential to win WS, which you get only if you win the LCS) = 18
LDS win = 5 + ½ of above 18 = 14
The LDS is only a best-of-5 set, so in each case a single game win in the LDS, LCS, or WS in the modern day is worth about 3 or 4 regular season wins (dividing the end result value by 5 like I did above in the pre-1969 example).
But getting to the playoffs is worth less in the modern day, since who knows whether you'll win the WS, or get 2006 St Louis Cardinal-ed by some mediocre team that plays superbly in October.
So, there’s my rough-cut for now. Giving approx credit to post-season exploits at 3-4 times the regular season rate in modern day, 5 X for pre-1969, in between for 1969-1993.
Pre-1969, great close-pennant-race heroics worth a little more, since it was all or nothing. If Yaz had needed bonus points, he would have gotten them from me :)
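A rough encoding of the bonus schedule laid out above (the bonus numbers are TomH's; the 65-win baseline and the era handling are simplifications of mine):

PRE_1969 = {"ring": 50, "pennant": 25}                               # a W.S. game win ~5 regular wins
MODERN   = {"ring": 40, "pennant": 22, "lcs": 13, "playoffs": 8}     # 1995 and later, ~3-4 per game

def season_value(wins, october=None, era=MODERN, baseline=65):
    # regular-season wins above a 65-win baseline, plus the bonus for how far the team got
    bonus = era.get(october, 0) if october else 0
    return max(wins - baseline, 0) + bonus

# Team A from the pre-1969 example: 65 wins, then 90 wins plus a Ring = 0 + (25 + 50) = 75
print(season_value(65, era=PRE_1969) + season_value(90, "ring", era=PRE_1969))      # -> 75
# Team C: 80 wins, then 100 wins plus the pennant but no Ring = 15 + (35 + 25) = 75
print(season_value(80, era=PRE_1969) + season_value(100, "pennant", era=PRE_1969))  # -> 75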
283. Dr. Chaleeko
Posted: October 29, 2006 at 04:45 AM (#2227496)
Tom, I think you and I will just go to our graves disagreeing about this one, but I wanted to bring up three potential issues I see here in the interest of giving you some feedback (since that seems like what you're asking for). Feel free to do as you like with it! ; )
1) Mantle, Snider, Derek Jeter, or Chipper Jones won't really need the extra credit anyway, so you're in essence building up cases for Bernie Williams, Charlie Keller, Graig Nettles, or Dandy Andy Pettitte---borderliners from dynastic teams who get a ton of October PAs and INNs (especially since the three-tier format). They will benefit disproportionately (as will lesser players than they, like maybe Ralph Terry or Jim Gilliam or Paul Blair or Willie Davis) thanks to their having the good fortune to have Joe D, Lou Gehrig, Mickey Mantle, or Greg Maddux as teammates. If I understand your concept, then in some cases, they will benefit to the tune of one to three seasons (maybe more for the Braves and Yanks). That's a ton of credit, and there are a lot of candidates on both sides of the ball who could benefit from a season of credit or fall behind someone else who was to receive such a disbursement of October credit.
2) Meanwhile guys before free-agency who were reserved to their clubs permanently get in essence penalized for signing as 18-20 year-olds with teams that had a chronic inability to leverage frontline talent into perennial winners. Or teams that simply can't compete with various Yankee juggernauts. That's your classic Ernie Banks, Luke Appling HOMer, but it's also George Sisler or Goose Goslin.
3) And you may ultimately be forcing yourself to view the free agency choices of post-Messersmith players as value judgments on the wisdom of taking the money (in a potentially limited market dictated mostly by fielding position) versus latching on with a winning/dynastic team (where hindsight is 20/20). This system, for instance, may give a better-than-average corner like David Justice (398 Oct AB!---he averaged 401 per season over his career), Paul O'Neill (299 October AB) or Hideki Matsui (151 Oct. AB in three years and very likely to grow) a big advantage over other better-than-average corners like, say, Bobby Bonilla (149 Oct AB...passed by Matsui in just three years) or Tim Salmon (59 Oct AB) or Luis Gonzalez (87 Oct AB) or Moises Alou (134 Oct AB) or even that Sosa guy (53 Oct AB). When their free agency came along, they didn't seek out NY or ATL or else weren't sought by them. And so they don't reap the post-season reward. But were they really players with an extra season or more value in them? Or did they just happen into good teammates? Or happen into not-good teammates?
Anyway, these seemed like issues worth talking about. I know we'll probably always disagree, but in the interest of offering some feedback, I thought I'd see if they were helpful to you in any way.
[Just for interest's sake, Jeter has 478 Oct AB, Bernie has 465, Tino 356, Posada 307, Chipper 333, Manny has 307, Andruw 238. And here's an interesting one: Mr. October actually has 281. Wow, that's a lot for an era with a two-tiered system. Between 1971-1982 he only missed October in 1976 and 1979. Wow.]
284. DL from MN
Posted: October 29, 2006 at 03:34 PM (#2227574)
I'm willing to give credit to postseason play but my credit requires the player to have performed above average and I'll only credit at the normal rate, no multiplier. For most players this is a drop in the bucket compared to the entirety of their career. For someone like Mariano Rivera it is equivalent to an extra season of pitching.
285. TomH
Posted: October 29, 2006 at 07:23 PM (#2227677)
Dr C, thanks for the perspective.
I didn't mean to imply I was about to give another 4 seasons worth of credit to Bernie Williams because he happened to have played for the Yankees. But like DL said, if someone performs significantly above or below par for lots of playoff games, the methods I wrote down attempt to gauge how MUCH extra credit I would give. Rollie Fingers was lucky to have pitched for the A's. He also pitched well in clutch situations in 1972-74, and I'll try to quantify how much bonus I'll give him for that; because frankly, he'll need it to sniff my ballot.
286. DL from MN
Posted: October 30, 2006 at 02:40 PM (#2227973)
Can someone name the players in the top 25 with the best cases for postseason bonuses? I haven't been considering this in a systematic manner.
287. Dr. Chaleeko
Posted: October 30, 2006 at 04:21 PM (#2228050)
Ask, and you shall receive. What I've done here is to take all the top 85 or so vote-getters, plus the top candidates in the next five elections (anyone with 200+ WS, basically), and tabulated how many post-season series they were in and how many ABs or INNs they collected during October. First pitchers, then hitters.
name (Oct series / Oct AB):
r jackson 17/281, rose 14/268, garvey 11/222, rizzuto 9/183, morgan 11/181, lopes 11/181,
perez 11/172, e howard 10/171, bench 10/169, cey 9/161, bando 9/159, baker 8/149,
campaneris 9/144, mcrae 13/143, stargell 8/133, munson 6/129, porter 8/120, tenace 9/114,
chambliss 7/114, r smith 8/107, maddox 8/107, hebner 9/100, c cooper 5/96, schang 6/94,
bancroft 4/93, oliver 7/92, decinces 4/89, grich 5/88, cepeda 4/87, brock 3/87,
luzinski 6/82, moore ?/81, Monday 9/81, otis 6/78, lundy ?/76, doyle 3/76,
foster 7/76, burns 3/74, keller 4/72, chance 4/70, bartell 3/68, yaz 3/65,
matthews 4/65, madlock 4/65, cuyler 3/64, rice 3/63, leach 2/58, singleton 4/57,
cendeno 4/52, oliva 3/51, carew 4/50, h wilson 2/47, cash 3/46, aparicio 2/42,
williamson 2/41, traynor 2/41, staub 2/41, dunlap 1/40, roush 1/28, boyer 1/27,
long 1/27, dimaggio 1/27, wynn 2/26, duffy 1/26, maranville 2/26, fox 1/24,
childs 1/22, stephens 1/22, pinson 1/22, elliott 1/21, f jones 1/21, j ryan 1/20,
berger 2/18, manush 1/18, lombardi 2/17, cravath 1/16, bresnahan 1/16, rosen 2/13,
oms ?/12, klein 1/12, kueen 1/12, murcer 4/11, f howard 1/10, bo bonds 1/8,
veach 1/1, trouppe ?/?, monroe ?/?, a wilson ?/?, s white ?/?, h smith ?/?,
taylor 0/0, beckley 0/0, browning 0/0, ch jones 0/0, gvh 0/0, b johnson 0/0,
j mcgraw 0/0, fregosi 0/0, vernon 0/0, kell 0/0, travis 0/0, jenkins 0/0,
hargrove 0/0, harrah 0/0
Who stands to gain the most in this group? The biggest beneficiaries would be borderliners or near borderliners. Who fits the model?
Tony Perez
Fingers
Mays
Grimes
Gomez
Munson
Bando
E Howard
Rizzuto
maybe Dean
Tenace
Campaneris
T McGraw
Garvey
Lopes
Cey
We'll start seeing lots of Royals and Yankees and Cardinals soon as well, with Randolph, Quis, Nettles, Guidry, Goose, Ozzie, and also Keith Hernandez getting a lot of October time among the soon-to-be-retired crowd. Don Baylor will get a decent chunk too, I'd think.
288. DL from MN
Posted: October 30, 2006 at 05:43 PM (#2228140)
If you can translate that into BRAA, FRAA and PRAA I don't need to do anything but add the numbers. The next questions are: What is "average" for the postseason? Was the performance above average? By how much?
Replacement value for the postseason doesn't make any sense. Actually, this may be one of those places where using WPA is justified. If someone could figure out the win probability added then I could just translate it into runs above average. I assume there is decent play-by-play information for the postseason going back much farther than there is for the regular season. I know little about calculating WPA so I don't know the scope of calculating it for the list above. Obviously the context matters and nobody has calculated postseason-specific win expectancies, but I think using the generic table is enough for a back-of-the-envelope calculation.
It could be a really fun side project to calculate the WPA for all postseason play to find out just who the real "Mr. October" is.
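The translation described here is essentially just WPA times a runs-per-win figure; a back-of-the-envelope sketch, assuming the common ~10 runs-per-win rule of thumb (the right conversion varies with the run environment):

def wpa_to_raa(wpa, runs_per_win=10.0):
    # convert win probability added into an approximate runs-above-average figure
    return wpa * runs_per_win

print(wpa_to_raa(1.5))   # a player worth +1.5 postseason wins by WPA -> roughly 15 runs above average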
289. Dr. Chaleeko
Posted: October 30, 2006 at 08:01 PM (#2228220)
Worth noting that relievers in the post season should probably jump up in value. Look at Fingers and McGraw with 55 innings or so each. As the fireman for their teams, they are very likely pitching mondo important innings.
In fact, there's a way of thinking about the WPA/WXRL question that could make sense. The goal is not so much to win games in the post-season but to win series. When thought of that way, the magnitude of each AB or IP grows. Blown saves and converted saves take on a much different character when your season comes down to three games or so.
And no one would be likely to benefit more from such a recalibration than an ace reliever (I'd think, though I have no evidence to prove that). Looking at it this way, you also would find yourself really bonusing guys like Maz or Carter or Luis Gonzalez, since their hits increased the likelihood of winning the World Series so much. But you'd also have to give a big nod to Jack Clark or Ozzie Smith for hitting key NLCS homers as well. I don't know if you'd want to count Bobby Thomson's hit or Bucky Friggin' Dent's, since they technically occurred during the regular season.
I think I read an article about this lately, one that ranked the biggest hits of all time based on how much they moved teams toward a World Series. Don't remember the author or the publication. One of its findings was that Tony Womack's hit just before Gonzo's single was a really huge hit that's oft overlooked.
290. TomH
Posted: October 30, 2006 at 08:30 PM (#2228238)
I'll save a lengthy post for when the handlebar mustache man becomes eligible, but I ran a rough cut today of a Win Probability analysis for Fingers in his October performances. Near as I can figure, replacing Rollie with an average MLB reliever would likely have cost the A's one world series trophy, probably either 1972 or 1973. A lot of that is the fortuitous circumstances of winning 4 straight post-season series by one game each. But that'll be a fun one to argue about after Thanksgiving.
291. DL from MN
Posted: October 30, 2006 at 09:22 PM (#2228268)
Another borderliner who will be helped out: Kirby Puckett
I'm on board with using WPA, now I just need it for LOTS of players which is a job I can't really tackle. I don't want to give credit just to the players somebody makes a case for without considering everyone in my top 100.
292. KJOK
Posted: October 30, 2006 at 11:01 PM (#2228345)
In fact, there's a way of thinking about the WPA/WXRL question that could make sense. The goal is not so much to win games in the post-season but to win series.
I believe defining what you think the "goal" of playing is will be a big determinant of how you view this discussion. Is the goal:
1. Scoring/preventing runs?
2. Winning games?
3. Making the playoffs?
4. Winning Playoff Series?
5. Winning the Championship?
The answer to this question should determine what type of "measurement system" of merit you might use.
293. Dr. Chaleeko
Posted: October 31, 2006 at 02:27 PM (#2228702)
KJ,
You said it better than I did. And to fold in other comments from above, your choice of a baseline measurement is going to have an effect as well, as is how you view success/failure in vanishingly small samples, as is the October run environment itself, as is the distribution of runs across games (take the 1960 WS as an example of this last point).
Hornet's nest.
294. DavidFoss
Posted: November 03, 2006 at 12:14 AM (#2230568)
Speaking of Win Shares...
Does anyone have Acrobat issues with their Digital Update PDF file? It has to do with the speed of the searches. If I search for "Mantle" I get the following behavior:
Adobe Acrobat 6 -> Almost instantaneously returns a couple of dozen hits.
Adobe Acrobat 7 -> Takes over a minute to scan the document to get the hits.
The thing is that I don't know how to go back to Acrobat 6 once a computer is running Acrobat 7.
Anyone seeing similar behavior? Does the file need to be indexed or something?
Any word on another digital update (through 2005 or 2006?).
Thanks.
295. KJOK
Posted: November 03, 2006 at 01:40 AM (#2230624)
David:
I get a first hit in about 5 seconds, and 33 more hits in the next 25 seconds with Acrobat 7.
I don't believe there are going to be any more digital updates.
296. DavidFoss
Posted: November 03, 2006 at 03:17 PM (#2230891)
Thanks for checking. With Acrobat 6, the file appears to be 'indexed' so that the hits are returned almost instantaneously. I was just hoping there was a trick to getting that back. Anyhow, not a showstopper but a pet peeve. :-)
"And no one would be likely to benefit more from such a recalibration than an ace reliever (I'd think, though no evidence to prove that). Looking this way at it, you also would find yourself really bonusing guys like Maz or Carter or Luis Gonzalez, since their hits increased the likelihood of winning the World Series so much. But you'd also have to give a big nod to Jack Clark or Ozzie Smith for hitting key NLCS homers as well. I don't know if you'd want to count Bobby Thomson's his or Bucky Friggin' Dent's since they technically occured during the regular season."
I couldn't disagree with this more . . . sorry Dr. C, probably why I'm such a detractor of WPA . . . I really don't like it at all as anything but a fun junk stat.
The problem with giving so much credit to Mazeroski, Carter and Gonzalez is that to do so you deduct credit from the guys that made those hits possible - the players that won the other 3 games and the ones that provided the other runs and recorded the outs leading up to the point where they had their big hits. It all counts, and WPA misses that completely.
298. Paul Wendt
Posted: December 22, 2006 at 02:54 AM (#2266912)
During e-housekeeping I found a note that I drafted 2006-02-02 (among many other HOM drafts that the world shall never see)
Marc: Seriously, WARP1 is OK, it's just that I can't keep my spreadsheets up to date with the frequent changes. [That is an important practical point. -pgw] The changes also bring into question the validity of the whole system. If its owners are pretty sure that their previous iterations suck, how do they know the current one doesn't?
Problem: Bill James has developed a measure of Loss Shares, and in the limited privacy of the SABR Statistical Analysis egroup he has essentially said that he cannot recommend Win Shares.
The main difference may be where Davenport and James live in the publishing world. Davenport on the web cheaply publishes every revised rating. James is the one man who can get a big check from a paper publisher for a tome on this subject, so years rather than months pass between revisions that see the light of day.
299. DanG
Posted: December 22, 2006 at 02:36 PM (#2267084)
Bill James has … said that he cannot recommend Win Shares.
Really, that’s no newsflash. We’ve known from Day 1 that win shares was not the perfect ending to the quest for the ultimate uber-stat. No doubt, James knew this as well.
That he would improve upon the system was inevitable. I’m sure that for every modification that people have made to WS, James has considered something similar.
Paul points out the useful fact that we get to see every incremental change in WARP; we will only see the win shares system change at its next plateau.
300. KJOK
Posted: February 12, 2007 at 07:56 PM (#2296189)
Just FYI, Brandon (aka Patriot) has posted 5 year park factors for each team since 1901:
I always thought that "absence-of-offense" was the guiding principle in the choice of the values for the fielding "intrinsic weights" (Win Shares, p. 67). But James never actually says this. He gives no explanation of the "whys" for the "intrinsic weights". Maybe he has a fundamental insight into calculating fielding value that he hasn't published to his book audience, but I'm unaware of it.
Until somebody explains to me a good alternative theory, I'm sticking with "absence-of-offense", and based on that, the "intrinsic weights" in Win Shares appear to be flawed, biased in favor of the outfield when compared to the infield.
Infielders: 79% BR, 7% BB, 14% BL
Outfielders: 40% BR, 4% BB, 56% BL
Assume that if all were BR, then IF would be just as likely to be All-Stars as OF.
To explain the final WS results (a 60:40 mix of OF/IF) due to handedness alone, I have to hypothesize that (BB+BL) is 3 times more likely to be an "All-Star" than BR.
If I were to examine my "All-Star" data for batting side, this makes the following predictions: that Infielder All-Stars would split about 56:44 between BR and BL+BB, and that Outfielder All-Stars would split about 18:82 between BR and BL+BB.
If I were to examine random years from my All-Star lists, any bets on whether I'd find that 20% or less of the Outfield AllStar seasons Batted Right?
(Note: I'm not a statistician. I haven't taken a stats course in 33 years. Anybody who knows how to do this stuff is welcome to suggest a better approach. I'm all ears.)
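For what it's worth, the arithmetic above checks out; a quick sketch using the handedness splits quoted above and the hypothesized 3-to-1 All-Star likelihood for lefty/switch hitters:

def allstar_split(pct_br, lefty_multiplier=3.0):
    # share of All-Stars expected to bat right vs. left/switch, given the population split
    right = pct_br * 1.0
    lefty = (1.0 - pct_br) * lefty_multiplier
    total = right + lefty
    return right / total, lefty / total

print([round(x, 2) for x in allstar_split(0.79)])   # infield  -> about [0.56, 0.44]
print([round(x, 2) for x in allstar_split(0.40)])   # outfield -> about [0.18, 0.82]

# implied OF share of all All-Stars, if the underlying talent populations were equal:
if_weight, of_weight = 0.79 + 0.21 * 3, 0.40 + 0.60 * 3
print(round(of_weight / (if_weight + of_weight), 2))  # -> ~0.61, close to the 60:40 WS mix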
that should say:
(each player season counted separately).
a) When you make this calculation, I think you are assuming that the distribution of talent at each position starts out equal. I don't make that assumption; I strongly doubt that the distribution of talent by position is equal. (In particular, I think historically talent has been overrepresented at shortstop and at center field because that is where players with the best speed and athletic talent tend to start. In our HoM, those are the best represented positions.)
b) Many star players have been moved from the infield to the outfield early in their careers. Twenty percent doesn't sound unreasonable to me.
The other part is to establish the value of an average fielder at each position compared to the value of the average fielder at other positions. The only way that I've ever seen this described in print is using the theory that I label "absence-of-offense".
In general terms, I agree -- the defensive spectrum describes the trade-offs between offense and fielding ability. My disagreement is that you are measuring the trade-off at the top of the talent distribution, by comparing star players. In principle, however, I think the trade-off should be calibrated at the bottom of the distribution, comparing replacement level players. If a replacement level shortstop (with average fielding ability) hits with OPS+ of 75, and a replacement level first baseman hits with OPS+ of 105, then the "value" of the shortstop relative to the first baseman is 30 points of OPS+ (or whatever batting metric you want to use).
Unfortunately, measuring replacement levels is very difficult. With Win Shares, James essentially gave up on trying to measure replacement level. WARP has made a complete mess of it by incorrectly assuming that there are separate replacement levels for offense and defense and by overadjusting the WARP1 replacement levels for ball-in-play context. For win shares, I think Studeman's "win shares above bench" and Joe Dimino's replacement estimates are getting in the right ballpark, but there are too many problems with measuring replacement value to claim we can measure it with high accuracy.
The distribution of stars (i.e., the upper tail of the talent distribution) would be useful in determining the value of positions if we knew that the distribution of talent was "supposed to be" identically distributed by position, but I don't accept that premise. I don't know of any theoretical reason to think it should be. So I don't think your counts of all stars by position provide evidence one way or the other on the validity of win shares fielding values.
On the other hand, as John said on the other thread, if a voter uses win shares and also cares about positional balance they may want to adjust the raw WS numbers in their voting systems. Your tabulation is certainly interesting and useful for that purpose.
But they move from the infield to the outfield, don't they, because they are shifting down the defensive spectrum, not because teams prefer to have their stars in the outfield rather than the infield? In the 1890s, yes, there is evidence that teams may have protected their stars from the violence on the infield base paths, and, yes, it seems likely that baseball men got the idea into their heads in the 1970s that it was axiomatic that anyone who could hit for power certainly couldn't play shortstop, but are there other instances?
If they are shifting down the defensive spectrum, wouldn't it stand to reason that teams are making tradeoffs between offensive and defensive value?
AFAIK, Rose's defense at second base wasn't criticized. For other players (e.g., Aaron) the move happened in the minors so it's harder to tell whether the player was experiencing defensive difficulties. My understanding is that Aaron was considered capable of playing second base at a major league level, but they thought his bat would be available every day if he were in the outfield. But I read and hear enough discussion of injury risks that I have to believe it's a consideration for at least some baseball front offices.
Thanks for the examples. If this was part of the conventional wisdom from the 1950s until the 1980s, that would help explain why infielder hitting numbers drop so dramatically during this era. Do you think this was c. w. any earlier than that? There are so many strong-hitting shortstops and second basemen in the 1920s and 1930s that I have a hard time believing that hitters were being systematically steered away from the positions: Cronin, Gehringer, Vaughan, Frisch, Hornsby et al. But maybe there is evidence that strong hitters were being steered away from the infield to protect them from injury?
On the other hand, as John said on the other thread, if a voter uses win shares and also cares about positional balance they may want to adjust the raw WS numbers in their voting systems. Your tabulation is certainly interesting and useful for that purpose.
Or they begin their ranking process by comparing to positional peers first, then (generally) basing their final, integrated ballot on the relative dominance each guy has over his position. And if that means Clemente is your #13 guy, well, that's the crumbling cookie for you, but at least it's a consistent and somewhat sensible way to look at the ranking and ballot-construction process. And you can always "manually override" the process and move a guy up if you feel like you're slotting him too low.
2B
Joe Morgan (I think??? Didn't he have a fractured ankle from a bad pivot?)
Craig Biggio's torn ACL?
Didn't Tony Phillips rupture an ACL on the pivot? Or did he do it to someone else, maybe Vina?
SS
Tony Fernandez: hobbled by Madlock
3B
Scott Rolen: dislocated shoulder in 200? post-season during collision with passing runner.
It's hard to say. Many infield to outfield conversions took place in the minors, so there isn't much written on them, and if there were, it would be hard to tell if the reasons given to the press reflected the actual management decision making.
I'll tell you, though, that if I were a baseball exec and had read the Bill James study of rookies (from his 1987 Abstract), I would certainly move a great-hitting minor league second baseman (like Aaron) to the outfield to avoid injuries. James found, for example, comparing rookie second basemen who hit well to a matched set of rookie outfielders, the outfielders went on to play 43% more games, scored 74% more runs, drove in 92% more runs, and hit 3.2 times as many home runs. Even poor hitting rookie second basemen had shorter careers than poor hitting outfielders--a very surprising result, since the second basemen were contributing more on defense. Injuries, small and large, hurt the subsequent development of rookie second basemen. The same was also true of catchers, and to a lesser extent, of third basemen. Shortstops, on the other hand, didn't seem to suffer as many injuries and had longer careers than matched outfielders.
I should amend that to say that good-hitting rookie catchers had shorter careers than matched good-hitting outfielders. The poor-hitting rookie catchers, however, had longer careers than poor-hitting outfielders (as expected).
I've never cared for this approach. What ends up happening is that certain positions go through phases, fads even, when great hitters suddenly become common at a position: CF in the 50s, 3B in the 70s, SS recently. Now, if the only purpose of the evaluation is compare a player to his peers, then it doesn't matter. But if you're trying to make cross-era comparisons, then these players get undervalued.
The true measure of defense would distribute defensive value proportionally to the run value of the plays potentially to be made at a position compared to the run value of those plays potentially made at other positions. This would establish the relative value of the different positions. Each defender would then be evaluated against the average or the replacement level, whichever is preferred, at that position.
I have no problem with evaluating fielders against average (which is replacement level for fielding) - but I think absence of offense is the way to approach the 'position constant' portion of it. How much offense managers are willing to give up basically determines the defensive value of playing the position at an average level.
And the great part about this is that it adjusts as the game adjusts. Errors are up and DPs are down? Then managers sacrifice more offense to get a good fielding 3B and less offense at 2B. Even better, catchers and 1b have their hands take a beating because of bad gloves? Offense goes down at those positions, and your system adjusts. Etc., etc.. This approach would be very adaptable to changes in the game.
I'm not sure what you're suggesting here. If you're suggesting measuring the absolute value of the plays made at the position in comparison to having nobody playing there, well, it doesn't work. 1B participates in more plays than any other position. The potential for havoc by having a total klutz there is enormous. But with the modern mitt, acceptable performance is not difficult, hence the general conclusion that 1b is the easiest position overall.
Now we're talking the difference between the theory and the implementation of a calculation method. The theory is always couched "given a long enough time" so that star-gluts and manager fads get averaged away.
My suggestion is more theoretically pure than pragmatic. What I mean is that the defensive value of a position essentially depends on how many plays someone needs to make at a position (adjusted for the difficulty of said plays and the run value of them). I want to measure defensive value by what happens in the field rather than at the plate.* I agree that there is no absolute value for that -- to paraphrase Bill James, the number of plays I could mishandle is essentially infinite -- but IMO we can and should use positional averages to establish that. IOW, we're judging by major league minimum standards.
*I agree with Joe that if we're going to use an indirect measure like batting, the correct one is replacement level.
Here, we're not talking about the majority of IF-to-OF shifts in the minors, the ones where the guy isn't capable of playing an acceptable major-league 3B. We're talking about a conscious decision to move a major-league all-star 2b (or whatever IF position) to the OF, 20% of those guys being shifted.
Now maybe it does happen, 20% of the potential IF All-Stars are shifted to OF, in a conscious decision by their clubs to sacrifice peak value for career value (they would have more value per season as an IFer, but are more likely to accumulate a higher total as an OFer). However, to make this argument, you also have to convince me that James had this information and built it into Win Shares.
Oh, I can agree with that statement. Just like apple pie and motherhood ;-)
So where could the "intrinsic weights" have come from? The possibilities:
a) "absence-of-offense" was attempted, but didn't achieve the desired results
b) a new unpublished insight into the search for a "unified fielding theory"
c) trial-and-error until the results matched James' expectations
I think your choice from this menu may determine how acceptable the values are.
Let me just say this, I hate Pete Rose, I hated him when he was playing, and I was positively gleeful when he was banned. (just as I'm sure Bonds haters feel whenever more info leaks about him).
I always claimed that Pete Rose was a bad fielder- because why else would he have been moved around so much.
Truth be told, I didn't see him play 2b- I saw him play 3b-
and he was fine at 3b he certainly had the reflexes to play 3b, and his arm was good enough, and when he was younger he certainly was quick enough to play 2b
so suppressing my deep hatred of Rose, I assume he was at least a decent defensive 2B
baseball should also hold its collective nose and put his plaque in the hall- what he did wasn't 1/4th as bad as Joe Jackson imho (he took $ to throw a World Series- I don't care what his average was in that series HE TOOK MONEY- hell most of the other sox players got stiffed- but not Joe, he actually got $ in hand) rant over...
The All Garden Team
--------------------------------
1B Mike Ivie
2B Pete Rose
3B Larry Gardner
SS Donie Bush
RF Jim Greengrass
CF Darryl Strawberry
LF Zack Wheat
C Johnny Roseboro
DH Phil Plantier
P Ted Lilly
P Ross Baumgarten
Based on James's description in WS of the way that the catcher weight was chosen, I would have to conclude it was (c). James gave catcher the highest intrinsic weight because when he didn't, top catchers couldn't get enough WS.
Win Shares is not an entirely objective value system (all ranking systems, to be honest, contain subjective elements, so there's nothing intrinsically wrong there, but it needs to be noted for the record). James chose 52/48 for the defense/offense divide and 67.5/32.5 for the pitching/fielding split because it gave him what he thought were appropriate results for pitchers vs position players. Clay Davenport did something similar, I should note, when deciding to divide BIP credit 70/30 fielding/pitching in his defensive ratings because it balanced out the rankings of top pitchers over time.
-- MWE
Billy Beane should be the GM.
I agree with Mike Emeigh on the methodology and the comment: "(all ranking systems, to be honest, contain subjective elements, so there's nothing intrinsically wrong there, but it needs to be noted for the record)."
And it's clear that James used method (c) in combination with informal (b) insight, for the finer weights that are not intrinsic (fixed for all time). The claim formula for catchers:
-pre-1900 : 2PO + 4A - 3E - PB + 4DP
1900-1930 : 2PO + 4A - 8E - 4PB + 8DP [page 71, 1900-1930 coefficients doubled]
1931-1986 : PO + 4A - 8E - 4PB + 6DP
"In essence, we have increased the value of the 'skill elements' relative to the value of putouts, which are essentially a meter on the catcher's playing time." (By the way, I'm sure that isn't approximately true until the late 19th century.)
Not "in essence" but "mainly" I would say. The relative weights on A, E, PB, and DP are not fixed but radically different before and after 1900, with a modest revaluation of DP in 1930. He must have experimented.
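The era-specific claim formulas quoted above, written out as a single function (coefficients exactly as listed in the post; the assumption that post-1986 seasons keep the 1931-1986 weights is mine):

def catcher_claim_points(year, po, a, e, pb, dp):
    # fielding claim points for a catcher, by era
    if year < 1900:
        return 2*po + 4*a - 3*e - 1*pb + 4*dp
    elif year <= 1930:
        return 2*po + 4*a - 8*e - 4*pb + 8*dp
    else:
        # listed as 1931-1986 above; assuming later seasons keep the same weights
        return 1*po + 4*a - 8*e - 4*pb + 6*dp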
The All Garden Annoyance Team
--------------------------------
1B Harry "Slug" Heilmann
2B Miller "Mighty Mite" Huggins
3B Frank Crosetti
SS Rabbit Maranville
RF Rob Deer
CF Bug Holliday
LF Johnny Grub
C Don "Sluggo" Slaught
DH Red Badgro
P Dave Frost
P Chris Spurgeon
P Hooks Wiltse
MGR John Boles
P Grasshopper Jim Whitney
P Toad Ramsey
P Bugs Raymond
P Mickey Mouse Melton
Now we're talking the difference between the theory and the implementation of a calculation method. The theory is always couched "given a long enough time" so that star-gluts and manager fads get averaged away.
No. I don't accept the premise that the distribution of talent is the same at each position. If the distribution of talent differs by position, then theoretically the calculation of fielding value based on replacement-level players will give a different answer than calculation based on all stars. And the differences don't get averaged away "given a long enough time."
>If Keith Hernandez had thrown righthanded, he'd be a HoM 3rd baseman.
If Keith Hernandez had thrown righthanded, he'd still be a ML prick.
Very interesting hypothesis re. our shortage of HoM 3Bs--i.e. of course we're short 3Bs because only the righthanded population is eligible. (Of course other positions are all RH but not short of HoMers, though catcher also supports the hypothesis.) The corollary--or really, the more important part of the hypothesis--to this is that if RH only can (or, rather, do) play the position, then lots of innings are going to guys that wouldn't make the MLs in an ambidextrous world.
Re. players moving down the spectrum and the reasons for that--Rod Carew moved to 1B for the same reason as Ernie Banks, so that he could stay in the lineup more. He was not regarded as a great 2B but, still, he moved so he could stay in the lineup and get more PAs. His true position was "hitter." When he becomes eligible I'll be interested to compare him to Banks. As a Twins fan it pains me to say that Carew is probably pretty overrated for the usual reason that he did one thing (hit for average) so very very well.
The "absence of offense" provides an interesting perspective but I don't think you can assume a high correlation, for the reason that the connection between offense and defense is filtered through the knowledge and opinions and biases of the management, which is an imperfect reflection of reality.
Being married to a Master Gardener, I can tell you that no Toad never did no harm to no garden.
Do you say this due to the coke problems? It always seemed like he was a smiling, happy guy on the field, but I never knew much about him as a person other than the snorting.
I figured a group of systemizers like this would be interested in this article:
http://www.eetimes.com/news/latest/showArticle.jhtml;jsessionid=N2B3QOHS4IJROQSNDLSCKHA?articleID=189401732
The trouble with 3B is that it's in the *middle* of the spectrum and not at one of the ends. Guys like George Davis and A-Rod prove this point. If a star 3B can play a tougher position like SS, he will. I think there tend to be more multi-positional players who have played 3B than other positions (though I could be wrong on that point).
That sure seems to be the case. And like any player who does a variety of things well, instead of one thing great, a guy who plays multi positions (think Tommy Leach) will also tend to be underrated. Rose and Killebrew are the exceptions that prove the rule. (Dick Allen is the rule, though of course he has other issues. But he gets no credit whatsoever for playing 3B, he's just thought of as a mediocre fielding 1B.)
That's certainly been my observation.
True in today's game, not true before about 1930 when 3B was still considered (defensively) on a virtual par with middle infield positions.
-- MWE
Dead-ball 2B hit somewhat more than 3B from around 1900-1930.
19thC 3B hit a little more than 2B.
To consider the positions equal in weight before 1940 or so would be reasonable.
James' refinement of the "intrinsic weights" for these two positions is far more fussy than he is with the other positions.
As it should be. The relationship of 3B to all other positions is the most difficult of all such relationships to establish.
---------RF--CF--LF--SS--3B--2B--1B--CA--PI TOT OF/IF
1901-20 -66-101--80--52--54--43--42--06-250 694 247/149
1921-40 -79--85--77--53--39--54--90--28-183 688 241/146
1941-60 -76-101--87--71--69--60--60--28-147 699 264/200
--------221-287-244-176-162-157-192--62-580 2081 752/495
Top 32 players each season (including ties),
broken down by position, aggregated by 20 year intervals.
WARP-1 (earlier edition)
---------RF--CF--LF--SS--3B--2B--1B--CA--PI TOT OF/IF
1901-20 -43-102--61-110--46--99--28--13-190 694 206/255
1921-40 -69--78--63--71--29--84--72--24-197 688 210/184
1941-60 -56--79--56--57--48--65--45--14-279 699 191/170
--------168-259-180-238-123-248-145--51-666 2081 607/609
Based on top-32. Additional players added each season
when necessary to match WS count due to ties.
That may be. But the "intrinsic weights" appear to be wrong for the two positions during the 19th Century, when 3b outhit 2b. This will overrate e.g. Ned Williamson and Jimmy Collins at the expense of Fred Dunlap and Cupid Childs.
James also makes no adjustment for 1B circa 1890-1920 when it appears to be a more difficult position to field than any of the OF positions, at least when measured by "absence-of-offense" (though still not close to 2B/3B in fielding difficulty). The change in offensive value of 1B relative to historical norms is of a similar magnitude as the 2b/3b shifts.
WARP1 seems to have a more plausible OF/IF split, but it sure looks like pitchers are overrated from 1941-60!
That may be. But the "intrinsic weights" appear to be wrong for the two positions during the 19th Century, when 3b outhit 2b. This will overrate e.g. Ned Williamson and Jimmy Collins at the expense of Fred Dunlap and Cupid Childs.
I agree, though it's worth mentioning that WARP1 errs in the other direction, giving pre-1900 second basemen significantly more defensive credit than third basemen, overrating Dunlap and Childs at the expense of Williamson and Collins. I agree with jimd's assessment that simply treating the two positions as equal defensively to 1940 would be reasonable.
Maybe.
Pitchers are 28% of the All-Stars from 1901-1940. This would be consistent with a model that pitching staffs consist of 3 full-time starters and a number of role pitchers that fill up the slack as spot-starters (due to the omnipresent doubleheaders) and relievers, long and medium (who are oriented more towards comeback mode, i.e. 2-3 inning stints between the pinch-hitters in the 9 slot, than they are towards modern lead-protection mode). 8 starting players and 3 full-time pitchers; 3/11 = 27%.
Pitchers are 40% of the All-Stars from 1941-1960. This would be consistent with a model that pitching staffs consist of 4 full-time starters and one full-time relief specialist, in addition to the role pitchers. Double-headers are in decline allowing 4 man all-star rotations to develop (see Cleveland), and a number of relievers reached short-term star status (though they lacked longevity, except for Wilhelm). 8 starting players and 5 full-time pitchers; 5/13 = 38%.
Or maybe it's just a strong era for pitchers, like CF in the 1910's or 1B in the 1930's. Most likely, it's both together.
TangoTiger Fielding Positional Adjustment Calculations
or if the game were played clockwise in odd innings and counterclockwise in even ones, with no position changes permitted (infielders change locations when first and third bases do).
On jimd #247
In both the Win Shares and WARP1 tables, the three smallest numbers are those for catchers in the three 20-year periods. But the third basemen of 1921-1940 are also "outstanding" in both.
At SS, 2B, and P, the two rating systems show remarkably opposite time patterns. For Win Shares, the five men around the basepaths all gain all-stars at the expense of pitchers. For WARP, the SS and 2B specifically lose all-stars to the pitchers. 3B, 1B, and C all-star numbers go up-and-down or down-and-up with little net effect from early to mid-century.
I'm inclined to agree with this, and I also think that until about 1930 (and maybe to 1940) there's not a lot of difference between 2B/3B and SS.
-- MWE
How much offense managers are willing to give up basically determines the defensive value of playing the position at an average level.
I don't understand how to make these two statements work together. The replacement approach makes sense to me, so I don't need an explanation there.
However, I've always had trouble with the "absence of offense" approach to defense because it seems to assume managers and scouts are making the correct decisions about which players are most valuable offensively, and penciling them into the appropriate defensive spot as a result.
Much of the sabermetric literature over the last few years has called that into question in various ways. Some players traditionally considered good hitters have been determined to be out machines and drains on run scoring by sabermetric methods. So if teams are misinterpreting how good (or bad) a hitter is, and the absence of offense determines the value of defense at a position, won't a position's defensive value be misrepresented?
Some managers thought Tony Womack was a leadoff hitter. :)
Much of the sabermetric literature over the last few years has called that into question in various ways.
The "absence of offense" approach relies on an assumption similar to the Hall of Merit project itself, actually.
It is surely correct that individual managers, scouts, etc. will make mistakes in both their evaluation of players and in their assumptions about how value is created in baseball, but the acceptable offensive level for a position is established not by the choice on any one person but by the combined effect of all the choices of all the people with decision-making power about who plays and who doesn't. The "absence of offense" approach takes the position that this aggregate assessment is likely, on the whole, to be a reasonably accurate judgment.
The Hall of Merit as a group project is premised on the idea that the aggregate of all the choices of all the people with decision-making power about who gets into the HoM and who doesn't will constitute a more reliable judgment about who the best players are than a single individual creating a list. Individual voters make mistakes, and the electorate as a whole may be mistaken in its judgments about value, but the results of the process cannot be lightly dismissed as evidence about who the best players have been.
Returning to assessment based on the combined effect of management decisions across the major leagues: there may be times when assessments of value should not rely on the evidence of such combined effect, especially when sabermetric or historical study identifies deeply flawed conventional wisdom about strategy or value that was pervasive in some period of baseball history. However, in the absence of such evidence about general errors, dismissing this evidence based on aggregated management decisions because individual managers make serious mistakes seems unwarranted.
The "absence of offense" approach relies on an assumption similar to the Hall of Merit project itself, actually.
. . .
The Hall of Merit as a group project is premised on the idea that the aggregate of all the choices of all the people with decision-making power about who gets into the HoM and who doesn't will constitute a more reliable judgment
Whew. I thought I might read that the Hall of Merit project, generally judging merit largely by value, supposes that players are generally used where and how they have greatest value.
JeffM:
Some players traditionally considered good hitters have been determined to be out machines and drains on run scoring by sabermetric methods. So if teams are misinterpreting how good (or bad) a hitter is, and the absence of offense determines the value of defense at a position, won't a position's defensive value be misrepresented?
Unless those misjudgments about batter-runner value differ systematically by fielding position, they are only a source of noise, not bias. (Noise is a serious problem, I admit.)
Misjudgments of the fielding positions, what player types can or should play them, are probably a source of bias.
One statement I would take exception to is that for position players, the replacement level doesn't matter so much, because it impacts all players equally - it doesn't quite work that way, and over a career where you set the replacement level has a rather large impact.
Replacement Level - Chaining
A very interesting article indeed. And somewhat pertinent to at least one HOM candidate discussion. Hasn't there been some discussion of Frank Chance in chaining versus replacement terms?
Probably, although it was more along the lines of 'actual' replacement instead of theoretical. I think 'raising' the theoretical replacement level helps Frank. I basically use a '.500' baseline, and of course he's near the top of my ballot...
-----------
Baseball Prospectus' WARP1 is wrong
Let's start off with the definition of WARP-1:
WARP-1
Wins Above Replacement Player, level 1. The number of wins this player contributed, above what a replacement level hitter, fielder, and pitcher would have done, with adjustments only for within the season.
Then, let's look at a .500 team. When I need a .500 team, pretty much without fail, I look for the Houston Astros, and they satisfy my needs. They scored about as much as they allowed, and won about as much as they lost. Let's take a look at their team:
BP does a great job in presenting their stats, making my job very easy:
http://www.baseballprospectus.com/dt//2006HOU-N.shtml
If you go down to the "Advanced Batting Statistics" section, which is a misnomer, since the data there is batting, fielding, and pitching, the team WARP-1 total is 58.5 wins. The Astros won 82 games, which is pretty much what their RS/RA numbers would have expected. 82 minus 58.5 is 23.5 wins. 23.5 / 162 = .145. Another perennial .500 team I like is the Seattle Mariners. Their team WARP-1 is 52.1, and they won 78. Their RS/RA would have expected around that as well. 78 minus 52.1 = 25.9 wins. 25.9/162=.160. The Yanks won 97, which is also around their pythag record. Team WARP-1 is 71.9. 97 minus 71.9 = 25.1 wins, or a .155 record.
What do we learn here? That BP’s WARP-1 treats the replacement level as a team with a .150 record.
I’ve shown elsewhere on this site (click on the “Talent Distribution” category at the bottom of this entry) how the most likely team replacement level is a .300 record. This can be shown in many many ways. And, that pretty much is the number most analysts would use. To use anything else is, frankly, just plain wrong. Or needs a ton of explaining.
The replacement levels that I use are .380 for a position player or a starting pitcher, and .470 for a reliever. A team of such players will win .300 of its games.
So, why does BP calculate WARP-1 the way they do? The likelihood is that it treats a “replacement-level” position player as a replacement-level fielder and replacement-level batter. But, such a player is not the 420th best position-player in the world. He’s probably not even in the top 1000 players in the world. Why is this the benchmark? What does it tell us?
I know all about the 1899 Spiders, and the recent Tigers. It doesn’t matter. Even if an MLB team posts a .140 or .250 record, our best estimate of the true talent level of these teams is nowhere close to those records. They probably need to be regressed 25-50% towards the mean.
Posted by Tangotiger on 10/03 at 02:50 PM
KJ,
He did do up a big Win Shares dialogue with Rob Wood that's pretty good. I don't have a URL, but if you google it, I'll bet you'll find it. Their main objections to WS are pretty much the same as we've all discussed: the zeroing-out thing, questions about team vs. individual fielding, stuff like that. It's a good discussion.
There are no replacement fielders or hitters, only replacement players.
This kind of knocked me over, and I couldn't wrap my head around it. So I queried him about it to get a better understanding of his point. At one point I summarized his position in my own words to make sure I understood it, and he said that the summation was just about on target. So I'll share that with you because I think it's potentially very interesting/important.
One quick note about this: the fellow told me that replacement can be roughly defined as any player whose combined hitting/fielding/pitching contributions are 17 runs below average per 150 games played.
OK, here goes:
Just because it takes a spell to digest this, let me break it out piece by piece.
1) Replacement level is -17 RAA net of hitting/fielding/pitching.
2) Therefore a replacement player is someone whose total net RAA doesn't exceed this, no matter how he does it, whether by fielding ineptitude, offensive awfulness, or pitching "prowess".
3) Distinctions between batting and fielding are based on BRAA/FRAA not on BRAR/FRAR because we're not trying to peg the player's performance in each facet to a single threshold, but rather trying to figure how many runs above or below average he is.
4) Therefore BRAR/FRAR are artificial and redundant. They are an attempt to force-fit a threshold of replacement where it needn't exist.
To explain why, the gentleman used the example that a "typical" replacement team might have the following breakdown of winning percentages among its players:
hitters: .395
fielders: .485
pitchers: .380
BUT, that's only the mean replacement value. (Making up numbers here...) if the team has .350 hitters, it might have .500 fielders and .400 pitchers. Or it might have .200 pitchers, .500 hitters and .600 fielders. The point is that the typical replacement level can be misleading if it doesn't offer a true sense of the net RAA for a player...that is, it doesn't matter how a guy goes about racking up -17 RAA/150 G, only that he does it (well, doesn't do it).
5) The artificial BRAR/FRAR line leads to potentially large distortions due to the interaction of fielding and hitting replacement values (and pitching too, I guess, I didn't ask); it's why WARP's real replacement is .150ish (24-138) instead of the more realistic .300ish (49-113). I haven't done so, but I suspect that if you use Pythagoras or PythagenPat to figure out what the BRAR/FRAR distinctions do to the system's real replacement level, you'll see why there's big-time distortion (a rough sketch of this follows below).
Something the fellow added was that it is important to adjust for position, though I'd guess we all know that.
Anyway, I learned a lot from this guy, and I figured that maybe it would be pretty important stuff for those who live/die by WARP---and for all of us too since we talk about replacement a lot.
...WARP numbers for individuals don't add up to the same values that I would get if I evaluated them from their team numbers. Consider the 2004 Orioles. This team actually went 78-84. Since the replacement level win% is .153, their actual wins above replacement is 78-162*.153=53.2. Call it 53.
The sum of their individuals is 81, a big difference to be sure from the desired 53. But consider the team totals of batting rar (217), fielding rar (164) and pitching rar (381). These numbers have been adjusted to the standardized league of 9.0 runs per game, pythagorean exponent of 2. In the standard, a replacement level team scores 537 runs per 162 (replacement level EQA of .230, divided by league average of .260, raised to 2.5 power to convert it to runs, times 162 times 4.5 runs per team per game = 537) and allows 1262 (replacement pitching and fielding together yield a standard 7.79 replacement ERA, times 162 games = 1262).
A team that scores 537 and allows 1262 has a Pythagorean win pct of .153. The Orioles adjust to 537+217 brar=754 runs scored and 1262-164 frar-381 prar=717 runs allowed. That gives them Pyth% of .525 (85.1 wins), which makes them (.525-.153)*162=60 wins above replacement.
So we estimate the team as being 60 warp, and in real life they had 53 warp. That's not so bad, considering that they underperformed their real-world Pythagorean estimate by about 4 wins, and they allowed a few more runs than expected based on their pitching statistics.
The main point of that was to show that while the individual WARPs added together to 81, the team WARP only came to 60. There is a diminishing returns principle in the system, which follows from the mathematics of the Pythagorean model. A given number of extra runs will increase the win total of a bad team more than it will a good team. When Tejada or Mora or Lopez have their WARP calculated, they are adding their statistics to a replacement level team (simplifying; I actually calculate a team as average but for one replacement level player, compared to average but for this player). After you add Tejada, the team isn't replacement level anymore, and you don't get as many wins from Mora as before. Now that you have Tejada and Mora, the team is even farther from replacement level, and Lopez' contributions don't count as much. That, in a nutshell, is the problem. The team environment does make a difference on how many wins a player's performance creates, but I've adjusted that context out.
78 - 162*.300 = 29.4 Team Wins Above Replacement Team.
Actual Orioles individual WARP = 81
"OVER" WARP for just this one almost average team = 81 - 29 = 52! (52/29 = 170% "over"WARP'd!)
Based on the WARP is wrong article that KJOK pasted in above, and some of Clay D's responses, can you describe the replacement levels in PA and your pitcher WAR? Do they avoid any of these same issues of too-low replacement? Thanks!
Not quite. RAAP, not RAA. Batting runs must be calculated relative to positional average for this to work.
3) Distinctions between batting and fielding are based on BRAA/FRAA not on BRAR/FRAR because we're not trying to peg the player's performance in each facet to a single threshold, but rather trying to figure how many runs above or below average he is.
Again, not quite. BRAAP, not BRAA. Batting runs must be calculated relative to positional average for this to work.
*****
BRAA is a useless stat unless the batting difference between positions is considered. It can be added to BRAA (to create BRAAP) or added to FRAA (to create FRAR). But it MUST be accounted for.
If you're not quite clear about what I mean, think about it this way. An average fielding 1b-man who hits like an average SS doesn't have a job. An average fielding SS who hits like an average 1b-man is an All-Star.
Davenport's system needs one last step. After values are calculated relative to the various replacement thresholds (batting, pitching, SS fielding, 1B fielding, etc.), an amount proportional to playing time must be subtracted to account for the difference between his "theoretical" replacement level and the "true" replacement level of -17 RAA (or whatever it really is).
(Win Shares needs similar adjustments to BWS, PWS, SS-FWS, FB-FWS, etc., for similar reasons, and unfortunately each adjustment is separate. Of course, they then wouldn't add up to Wins, but as it stands, a large proportion of the Win Shares are awarded for sub-replacement play.)
We have an uberstat if anyone cared to put it together: linear weights. Too bad MGL hasn't posted that in a few years.
You don't need his expensive dataset to do it though:
Park-adjusted LW runs + ZR defense + baserunning + OF throwing + double plays. Throw in clutchiness if you so desire. What else is there?
And it's all available for not too much $:
Baserunning from either the Bill James handbook or Hardball Times
DP data from Hardball Times.
I can't remember where I saw OF throwing but there are sources. If you're a baseball hack you can pull it together from MLB.com gamelogs.
Compare to Win Shares which treats the replacement level as a team with a .000 record. WARP-1 is closer to reality (it gets you about halfway there).
The main point of that was to show that while the individual WARPs added together to 81, the team WARP only came to 60.
Non-linear models are non-intuitive and take some getting used to.
Those who have taken courses in relativistic physics know what I'm talking about.
BRAA is a useless stat unless the batting difference between positions is considered. It [the batting differences between positions] can be added to BRAA (to create BRAAP) or added to FRAA (to create FRAR). But it MUST be accounted for.
Well, we still have to deal with players in the pre-ZR era. And, regardless of how the defensive figures are derived (PBP or using traditional categories), there are interesting philosophical questions to consider, such as the replacement level discussion happening here and elsewhere.
Compare to Win Shares which treats the replacement level as a team with a .000 record. WARP-1 is closer to reality (it gets you about halfway there).
Well, I accept Bill James' argument that he's not using replacement level at all as a baseline for Win Shares. The most natural complement to Win Shares isn't necessarily Win Shares Above Replacement (or Bench, as the Hardball Times does), I think, but Loss Shares.
If a pitcher gives up, say, 60 runs fewer than what a replacement pitcher is expected to give up, he's going to get credit for 60 PRAR and probably around 6 WARP. Well, quite a few of those runs prevented are going to be credited to fielders on the team, too, so those runs are getting double-counted. To get what WARP1 thinks is a true replacement-level team, it'd be better to just calculate WARP1 for position players with just their offense.
And I've done that, although my OF arm rating is based just off of assists, so it can be improved a lot. Baserunning, save SB and CS, isn't included either. I would think someone here could write up code that could grab baserunning data from game logs, and we could calculate baserunning runs very simply from there.
How do I establish “value” for being a World Series winner?
How does it change from 1903-68 to the modern day wild card format?
How much is a hit in a playoff game worth compared to a regular season one?
I’ll begin by estimating the typical franchise/fan value of winning it all.
A good season, 90 wins out of 162, is 25 wins better than a bad one.
Pre-1969: I’ll say a W.S. is worth an extra 50 wins. That way, a team that won 65 games in year 1 and then 90 games plus a Ring in year 2 has the same “value” as a team that won 105 and 100 games but came up short of a pennant both years (probably need a few more years in here to make it more realistic...)
Winning the regular season pennant I’ll say is worth half of that; 25 extra wins. Continuing the previous example, a team that won 80 games in year 1 and 100 and the pennant but lost the Big One in year 2 would be valued the same as the other instances given.
Team A
65 wins = 0
90 wins + WS = 25 + 50 = 75 ..total of 75
Team B
105 wins = 40
100 wins = 35 ..total of 75
Team C
80 wins = 15
100 wins + pennant = 35 + 25 = 60 ..total of 75
Obviously this is not the same for all franchises; after a long period of good teams but finishing frustratingly short, a trophy is probably worth more.
If a Ring = 50 wins, = 25 wins more than the pennant the team already clinched to get INTO the W.S., then one W.S. game victory = a max of 6 wins, since it takes 4 games to get a Ring. You can win 3 and still get nothing (well, it’s better than getting swept I guess), so I’ll call each W.S. game win = 5 regular season wins.
In actuality, each regular season win has some extra value of pushing you TOWARD the goal of winning the pennant.
Now, fast forward to 2000. 30 teams instead of 16, 8 playoff teams instead of 2.
It is harder to win a Ring.
But it’s easier to get some good feeling = bonus credit = $$ for at least making the playoffs.
I'll say a Ring in 1995ff = 40 wins. Pennant = 22, making it to LCS = 13, getting to playoffs = 8.
So W.S. victory = 18 wins beyond already winning the pennant.
LCS victory = 9 wins + ½ of WS (potential to win WS, which you get only if you win the LCS) = 18
LDS win = 5 + ½ of above 18 = 14
The LDS is only a best-of-5, so in each case a single game win in the LDS, LCS, or WS in the modern day is worth about 3 or 4 regular season wins (dividing the end-result value by 5 like I did above in the pre-1969 example).
But getting to the playoffs is worth less in the modern day, since who knows whether you’ll win the WS, or get 2006 St. Louis Cardinal-ed by some mediocre team that plays superbly in October.
So, there’s my rough-cut for now. Giving approx credit to post-season exploits at 3-4 times the regular season rate in modern day, 5 X for pre-1969, in between for 1969-1993.
Pre-1969, great close-pennant-race heroics worth a little more, since it was all or nothing. If Yaz had needed bonus points, he would have gotten them from me :)
1) Mantle, Snider, Derek Jeter, or Chipper Jones won't really need the extra credit anyway, so you're in essence building up cases for Bernie Williams, Charlie Keller, Graig Nettles, or Dandy Andy Pettitte---borderliners from dynastic teams who get a ton of October PAs and INNs (especially since the three-tier format). They will benefit disproportionately (as will lesser players, like maybe Ralph Terry or Jim Gilliam or Paul Blair or Willie Davis) thanks to their having the good fortune to have Joe D, Lou Gehrig, Mickey Mantle, or Greg Maddux as teammates. If I understand your concept, then in some cases, they will benefit to the tune of one to three seasons (maybe more for the Braves and Yanks). That's a ton of credit, and there are a lot of candidates on both sides of the ball who could benefit from a season of credit or fall behind someone else who was to receive such a disbursement of October credit.
2) Meanwhile, guys before free agency who were reserved to their clubs permanently get in essence penalized for signing as 18-to-20-year-olds with teams that had a chronic inability to leverage frontline talent into perennial winners. Or teams that simply couldn't compete with various Yankee juggernauts. That's your classic Ernie Banks, Luke Appling HOMer, but it's also George Sisler or Goose Goslin.
3) And you may ultimately be forcing yourself to view the free agency choices of post-Messersmith players as value judgments on the wisdom of taking the money (in a potentially limited market dictated mostly by fielding position) versus latching on with a winning/dynastic team (where hindsight is 20/20). This system, for instance, may give a better-than-average corner like David Justice (398 Oct AB!---he averaged 401 per season over his career), Paul O'Neill (299 October AB) or Hideki Matsui (151 Oct. AB in three years and very likely to grow) a big advantage over other better-than-average corners like, say, Bobby Bonilla (149 Oct AB...passed by Matsui in just three years) or Tim Salmon (59 Oct AB) or Luis Gonzalez (87 Oct AB) or Moises Alou (134 Oct AB) or even that Sosa guy (53 Oct AB). When their free agency came along, they didn't seek out NY or ATL or else weren't sought by them. And so they don't reap the post-season reward. But were they really players with an extra season or more of value in them? Or did they just happen into good teammates? Or happen into not-good teammates?
Anyway, these seemed like issues worth talking about. I know we'll probably always disagree, but in the interest of offering some feedback, I thought I'd see if they were helpful to you in any way.
[Just for interest's sake, Jeter has 478 Oct AB, Bernie has 465, Tino 356, Posada 307, Chipper 333, Manny has 307, Andruw 238. And here's an interesting one: Mr. October actually has 281. Wow, that's a lot for an era with a two-tiered system. Between 1971-1982 he only missed October in 1976 and 1979. Wow.]
I didn't mean to imply I was about to give another 4 seasons' worth of credit to Bernie Williams because he happened to have played for the Yankees. But like DL said, if someone performs significantly above or below par for lots of playoff games, the methods I wrote down attempt to gauge how MUCH extra credit I would give. Rollie Fingers was lucky to have pitched for the A's. He also pitched well in clutch situations in 1972-74, and I'll try to quantify how much bonus I'll give him for that; because frankly, he'll need it to sniff my ballot.
name        Oct series   Oct AB   Oct IP
----------------------------------------
palmer 12 35 124
carlton 10 27 99.3
blue 8 13 64.7
seaver 5 20 61.7
fingers 9 7 57.3
mays 4 17 57.3
grimes 4 19 56.7
t mcgraw 9 5 52.3
gomez 5 20 50.3
bridges 4 18 46
cicotte 2 15 44.7
koosman 5 17 40.3
tiant 3 8 34.7
dean 2 15 34.3
a cooper 32
walters 2 10 29
rogers 2 7 27.7
kaat 4 9 24.7
welch 2 10 22
newcombe 3 8 22
lyle 6 2 21.3
trout 2 7 15.7
perry 1 4 14.7
niekro 2 3 14
trucks 1 4 13.3
marshall 2 0 12
willis 1 4 11.3
quinn 3 4 10.7
leever 1 4 10
shocker 1 2 7.67
hiller 2 0 5.3
redding 0 0 0
joss 0 0 0
leonard 0 0 0
mullane 0 0 0
w cooper 0 0 0
name          Oct series   Oct AB
---------------------------------
r jackson 17 281
rose 14 268
garvey 11 222
rizzuto 9 183
morgan 11 181
lopes 11 181
perez 11 172
e howard 10 171
bench 10 169
cey 9 161
bando 9 159
baker 8 149
campaneris 9 144
mcrae 13 143
stargell 8 133
munson 6 129
porter 8 120
tenace 9 114
chambliss 7 114
r smith 8 107
maddox 8 107
hebner 9 100
c cooper 5 96
schang 6 94
bancroft 4 93
oliver 7 92
decinces 4 89
grich 5 88
cepeda 4 87
brock 3 87
luzinski 6 82
moore ? 81
Monday 9 81
otis 6 78
lundy ? 76
doyle 3 76
foster 7 76
burns 3 74
keller 4 72
chance 4 70
bartell 3 68
yaz 3 65
matthews 4 65
madlock 4 65
cuyler 3 64
rice 3 63
leach 2 58
singleton 4 57
cedeno 4 52
oliva 3 51
carew 4 50
h wilson 2 47
cash 3 46
aparicio 2 42
williamson 2 41 2
traynor 2 41
staub 2 41
dunlap 1 40
roush 1 28
boyer 1 27
long 1 27
dimaggio 1 27
wynn 2 26
duffy 1 26
maranville 2 26
fox 1 24
childs 1 22
stephens 1 22
pinson 1 22
elliott 1 21
f jones 1 21
j ryan 1 20 5
berger 2 18
manush 1 18
lombardi 2 17
cravath 1 16
bresnahan 1 16
rosen 2 13
oms ? 12
klein 1 12
kuenn 1 12
murcer 4 11
f howard 1 10
bo bonds 1 8
veach 1 1
trouppe ? ?
monroe ? ?
a wilson ? ?
s white ? ?
h smith ? ?
taylor 0 0
beckley 0 0
browning 0 0
ch jones 0 0
gvh 0 0
b johnson 0 0
j mcgraw 0 0
fregosi 0 0
vernon 0 0
kell 0 0
travis 0 0
jenkins 0 0
hargrove 0 0
harrah 0 0
Who stands to gain the most in this group? The biggest beneficiaries would be borderliners or near borderliners. Who fits the model?
Tony Perez
Fingers
Mays
Grimes
Gomez
Munson
Bando
E Howard
Rizzuto
maybe Dean
Tenace
Campaneris
T McGraw
Garvey
Lopes
Cey
We'll start seeing lots of Royals and Yankees and Cardinals soon as well, with Randolph, Quis, Nettles, Guidry, Goose, Ozzie, and also Keith Hernandez getting a lot of October time among the soon-to-be-retired crowd. Don Baylor will get a decent chunk too, I'd think.
Replacement value for the postseason doesn't make any sense. Actually, this may be one of those places where using WPA is justified. If someone could figure out the win probability added, then I could just translate it into runs above average. I assume there is decent play-by-play information for the postseason going back much farther than there is for the regular season. I know little about calculating WPA, so I don't know the scope of calculating it for the list above. Obviously the context matters, and nobody has calculated postseason-specific win expectancies, but I think using the generic table is enough for a back-of-the-envelope calculation.
It could be a really fun side project to calculate the WPA for all postseason play to find out just who the real "Mr. October" is.
In fact, there's a way of thinking about the WPA/WXRL question that could make sense. The goal is not so much to win games in the post-season but to win series. When thought of that way, the magnitude of each AB or IP grows. Blown saves and converted saves take on a much different character when your season comes down to three games or so.
And no one would be likely to benefit more from such a recalibration than an ace reliever (I'd think, though I have no evidence to prove it). Looking at it this way, you also would find yourself really bonusing guys like Maz or Carter or Luis Gonzalez, since their hits increased the likelihood of winning the World Series so much. But you'd also have to give a big nod to Jack Clark or Ozzie Smith for hitting key NLCS homers as well. I don't know if you'd want to count Bobby Thomson's hit or Bucky Friggin' Dent's, since they technically occurred during the regular season.
I think I read an article about this lately, one that ranked the biggest hits of all time based on how much they moved teams toward a World Series. Don't remember the author or the publication. One of its findings was that Tony Womack's hit just before Gonzo's single was a really huge hit that's oft overlooked.
I'm on board with using WPA, now I just need it for LOTS of players which is a job I can't really tackle. I don't want to give credit just to the players somebody makes a case for without considering everyone in my top 100.
I believe defining what you believe the "goal" of playing is will be a big determinate of how you view this discussion. Is the goal:
1. Scoring/preventing runs?
2. Winning games?
3. Making the playoffs?
4. Winning Playoff Series?
5. Winning the Championship?
The answer to this question should determine what type of "measurement system" of merit you might use.
You said it better than I did. And to fold in other comments from above, your choice of a baseline measurement is going to have an effect as well, as is how you view success/failure in vanishingly small samples, as is the October run environment itself, as is the distribution of runs across games (take the 1960 WS as an example of this last point).
Hornet's nest.
Does anyone have Acrobat issues with their Digital Update PDF file? It has to do with the speed of the searches. If I search for "Mantle" I get the following behavior:
Adobe Acrobat 6 -> Almost instantaneously returns a couple of dozen hits.
Adobe Acrobat 7 -> Takes over a minute to scan the document to get the hits.
The thing is that I don't know how to go back to Acrobat 6 once a computer is running Acrobat 7.
Anyone seeing similar behavior? Does the file need to be indexed or something?
Any word on another digital update (through 2005 or 2006?).
Thanks.
I get a first hit in about 5 seconds, and 33 more hits in the next 25 seconds with Acrobat 7.
I don't believe there are going to be any more digital updates.
I couldn't disagree with this more . . . sorry Dr. C, probably why I'm such a detractor of WPA . . . I really don't like it at all as anything but a fun junk stat.
The problem with giving so much credit to Mazeroski, Carter and Gonzalez is that to do so you deduct credit from the guys that made those hits possible - the players that won the other 3 games and the ones that provided the other runs and prevented the outs in the game leading up to the point where they had their big hits. It all counts, and WPA misses that completely.
Marc
Seriously, WARP1 is OK it's just that I can't keep my spreadsheets up to date with the frequent changes. [That is an important practical point. -pgw] The changes also bring into question the validity of the whole system. If its owners are pretty sure that their previous iterations suck, how do they know the current one doesn't?
Problem: Bill James has developed a measure of Loss Shares, and in the limited privacy of the SABR Statistical Analysis egroup he has essentially said that he cannot recommend Win Shares.
The main difference may be where Davenport and James live in the publishing world. Davenport on the web cheaply publishes every revised rating. James is the one man who can get a big check from a paper publisher for a tome on this subject, so years rather than months pass between revisions that see the light of day.
Really, that’s no newsflash. We’ve known from Day 1 that win shares was not the perfect ending to the quest for the ultimate uber-stat. No doubt, James knew this as well.
That he would improve upon the system was inevitable. I’m sure that for every modification that people have made to WS, James has considered something similar.
Paul points out the useful fact that we get to see every incremental change in WARP; we will only see the win shares system change at its next plateau.
Historical Park Factors