1. SM in DC
Posted: January 17, 2002 at 08:40 PM (#84407)
Hi, my name's Bret Boone. My dad's famous and I just got a huge contract on the strength of a career season. Pretty soon I'll be using my money to move into this very exclusive, gated community -- I think Brady Anderson and Kevin Maas live there now.
2. Steve Treder
Posted: January 17, 2002 at 09:04 PM (#84408)
List that Bret Boone hopes his name won't someday be on:
Weirdest One-Year HR Flukes of All-Time:
1. Wally Moses, 1937
3. SM in DC
Posted: January 17, 2002 at 09:26 PM (#84412)
The question now becomes who will fall back to earth faster -- Boone or Eeeeechiro?
4. RP
Posted: January 17, 2002 at 09:39 PM (#84413)
"only Chico Fernandez and Brady Anderson went on to look worthless in the three years following their HR-fluke years"
Huh? In 1997 Brady had an .860 OPS. He had a mediocre season in '98 (.775 OPS), but rebounded in 1999 with an .880 OPS and 36 stolen bases in 43 attempts. How is that worthless?
5. Darren
Posted: January 17, 2002 at 09:40 PM (#84414)
Wouldn't Ichiro have to have been 'on earth' originally in order to fall back to it?
He's only played one year and was great.
6. SM in DC
Posted: January 17, 2002 at 09:53 PM (#84415)
It's just my theory, but after watching some of the latest imports there is a double adjustment. First, the Japanese player/pitcher has the edge b/c the league is unsure of how to pitch to or what pitches to key on.
We'll use Nomo as an example: for his first three seasons his K totals were over 230, then dropped in 1998 to the 160 level. The past two seasons he has posted 180 then 220.... a nice parabolic pattern. His ERA, however, skyrocketed after year two and has held in the mid- to high-4.00s...
I wouldn't be so quick to pencil Ichiro in as a repeat batting champion.... .300 would be a good 2nd year.
7. Zeke
Posted: January 17, 2002 at 10:18 PM (#84417)
Can't wait to read the headlines next July: M'S TRYING FRANTICALLY TO DUMP .240-HITTING BOONE AND HIS SALARY.
8. SM in DC
Posted: January 17, 2002 at 10:49 PM (#84419)
Of course, Truth, but I hold several classes of players in the "double-correction" genre, probably most obvious, to me anyway, are knuckleball pitchers and "old" rookies -- see Shane Spencer's eruption in 1998 then settling thereafter, and Tim Wakefield's domination of the NL in the early 90s, vanishing act, then reemergence in Boston.
9. WaltDavis
Posted: January 17, 2002 at 10:53 PM (#84420)
Well, Velarde is still a decent hitter (780 OPS, which would be 5th among AL 2B in 2001 ... I guess 4th now that Alomar is gone). And he smacks lefties (OPS about 840 over the last 4 years). So he's useful. Also, did Beane say anything about how important his clubhouse presence was? It's not a Proven Veteran signing unless (1) the guy stinks; (2) the team talks about his presence; and preferably (3) he blocks a young player's advancement. :-)
The M's could probably use a lefty version of Velarde to occasionally mix in for Cirillo and Boone. Oh wait, they've got McLemore ... for 2 years and a whole bunch of money. :-)
To the extent that I'd criticize Beane, it's for ending up with Justice, Hatteberg, Venafro, and signing one of those lefties (Holtz?) all making pretty good money. I'm still dumbfounded by the Hatteberg signing ... but it wouldn't be the first time Beane knew more than me.
10. bob mong
Posted: January 18, 2002 at 12:09 AM (#84422)
can anyone point me to some literature/study showing how OBP is a better indicator of scoring runs than batting average? I am extremely interested in seeing someone's work on this subject.
11. bob mong
Posted: January 18, 2002 at 12:21 AM (#84424)
and regarding boonie:
what do you want the Ms to do? Sign one of the many cheap, better free agent 2b out there? I haven't heard any actual names of who the mariners would be better off having. or, perhaps, should they have signed boone for cheaper? oh yeah, i guess they can't just unilaterally impose whatever contract they want on him; he has to agree.
sure it isn't the best signing of all time, but the mariners are looking to WIN next year and can't afford to have a prospect learn how to hit/field at 2B; they needed someone who can hit NOW.
sure he won't hit as well in 02 as he did in 01. but he had one of the best hitting years for a 2B ever. if he comes down from that to merely very good, i think he will be worth $8 million.
12. Steve Treder
Posted: January 18, 2002 at 12:28 AM (#84425)
"I think that Bret Boone will be able to repeat his 2001 performance ... why can't he pop 25+ a year in the Safe?"
His 2001 performance included 37 homers, which is roughly 50% more than 25. And Safeco is not a good hitters' park.
A very good defensive 2nd baseman who hits 25 HR a year is a terrific player, and that isn't unreasonable to expect from Boone at this point. But his OBP probably won't be very good. It's almost a certainty that he'll never match his 2001 numbers again.
13. . . . . . .
Posted: January 18, 2002 at 01:28 AM (#84428)
Interesting how Boone got stronger all of a sudden, eh? Put on about 20 lbs of muscle in one off-season in his 30's? (cough, cough).
If I were the M's, I'd take out insurance on this deal in case a new drug testing program is put into place in the next few years...
14. SM in DC
Posted: January 18, 2002 at 01:14 PM (#84433)
Yeah I put on 40 pounds in college too, but it certainly didn't put me in any shape to hit 30+ bombs... There could definitely be a little pharmaceutical use, maybe?
15. Steve Treder
Posted: January 18, 2002 at 04:34 PM (#84438)
" "Blithely accusing" would be one thing, but I've read anonymous comments from ballplayers (not the most reliable source) about considerable steroid use in MLB. I don't like to think about it, because it takes away from accomplishments I'd rather just enjoy-- which is probably how most everyone else feels. But if someday there's some medical complication, then it will all come out. Barry Bonds is a lot bigger than he used to be, too, and there were comments made before the season about how much bigger he'd gotten."
Well said.
I think it would be naive to presume that steroid use ISN'T widespread in baseball. Performance-enhancing drugs are obviously widely used in many other sports -- why would the Olympics go to the efforts they do to test if there wasn't good reason to suspect that many athletes were using them?
While there may very well be long-term health problems caused by steroids, the extent of these is at this point highly variable and largely little known. What is disputed by practically no one is that, used judiciously and as an integrated part of a rigorous training and conditioning program, performance-enhancing drugs do in fact enhance performance.
So put yourself in the athlete's position. You have probable cause to suspect that many of your competitors/opponents are using these drugs. You have a very short window of time to establish yourself and to be successful. The rewards of success include stupendous wealth, fame, and adulation. The consequences of failure include humiliation and poverty. Faustian bargain it may be, but wouldn't you be tempted?
Baseball players are people, as prone to greed, temptation, and unwise judgment as any of the rest of us. I think it is the least likely of circumstances that steroids aren't part of the MLB environment as we enter the 21st century. Welcome to reality.
16. RichRifkin
Posted: January 18, 2002 at 05:25 PM (#84439)
"... dzop's hint of steroids is not unlikely... "
And
"I've read anonymous comments from ballplayers (not the most reliable source) about considerable steroid use in MLB."
And
"I think it would be naive to presume that steroid use ISN'T widespread in baseball."
Then call me naive. I don't presume that steroid use is widespread in baseball.
In the specific case of Bret Boone, I don't think it's fair to specifically accuse him of steroid use without any specific evidence. I don't deny that Boone had a fluky great season in 2001. Everything went incredibly well for him. But fluky great seasons are not unique to our era. Nor are giant leaps in performance that become sustained. Well before "juice" was introduced to sports a little more than 30 years ago, these anomalies occurred. I haven't seen any evidence that there has been an increase in anomalous seasons in the last 10 years. If "juice" use is now ubiquitous - the charge that has been leveled - then wouldn't giant leaps in performance be much more common today than they are?
And what happens next year if Boone's performance returns to his prior levels? Does that mean he went off the "juice"? Or does that mean that he was never on it, and that his 2001 season was entirely a fluke? And are many of his accusers making these two seemingly incompatible arguments at the same time?
My understanding - also based on comments made by players - is that "performance enhancing nutrition" is now widespread. Players work out with weights almost every day, almost all year long. In order to help them increase muscle mass, they eat the protein crap that is sold at stores like GNC Nutrition Centers. Perhaps some of the ingredients in that crap have similar harmful effects as steroids can have. I don't know. But, until I see some convincing evidence, I will continue to doubt that more than a handful of players are taking illegal anabolic steroids.
Finally, my interpretation of the comments made by players that steroid abuse is "widespread" - none of which I've ever heard; but perhaps I just missed them - is that these kinds of accusations can be made out of jealousy and frustration. For example, it's possible that Player A is unmuscled and struggling while Player B, his highly muscled teammate, begins to figure things out and is raking the ball. Player A then, in frustration, drops a hint to the local beat-reporter that Player B is on the "juice." This goes unreported as a specific allegation. But that beat-reporter tells others in the press and the rumors spread that Player B is on the "juice."
17. RichRifkin
Posted: January 18, 2002 at 05:53 PM (#84441)
"So put yourself in the athlete's position. You have probable cause to suspect that many of your competitors/opponents are using these drugs. You have a very short window of time to establish yourself and to be successful. The rewards of success include stupendous wealth, fame, and adulation. The consequences of failure include humiliation and poverty. Faustian bargain it may be, but wouldn't you be tempted?"
Steve,
If what you say is true, why wouldn't the MLBPA be pushing for steroids testing?
Your argument is that individual players are in a prisoner's dilemma - if you've studied game theory, you'll know what that is. The Union could easily solve this dilemma, which presumably would benefit the players collectively, by demanding steroid testing. Yet they don't. Doesn't that alone logically suggest that steroid use is far less common than you contend?
"I've read anonymous comments from ballplayers (not the most reliable source) about considerable steroid use in MLB."
They're even less reliable than you portray. Most of the anonymous comments are about rumors, not about firsthand knowledge.
19. Steve Treder
Posted: January 18, 2002 at 06:35 PM (#84443)
"Your argument is that individual players are in a prisoner's dilemma - if you've studied game theory, you'll know what that is. The Union could easily solve this dilemma, which presumably would benefit the players collectively, by demanding steroid testing. Yet they don't. Doesn't that alone logically suggest that steroid use is far less common than you contend?"
It is the prisoner's dilemma precisely.
But I don't think the MLBPA's inaction on the issue proves anything at all. There are obvious reasons for everyone involved to keep the whole question underground. The prevailing logic could very well be "don't ask, don't tell," just as it was for many years regarding players' widespread use of amphetamines ("greenies").
Look, I'm not saying that I "know" Bret Boone or anyone else uses steroids. I'm saying just the opposite: I DON'T know. My assumption is neither that players do, nor that they don't. What I'm saying is that given what we do know -- that these substances exist, they are available, and they are and have been used by many athletes in many sports around the world -- I just don't think it's logical to conclude that baseball players are necessarily different from these other athletes, and therefore to assume that steroids are NOT used by baseball players.
20. Lest we forget
Posted: January 18, 2002 at 07:06 PM (#84444)
Love the one year home run fluke list.
What about Andre Dawson? Whatever you want to make of it, for whatever reason, hitting 49 in 1987 was by far his greatest output.
And what of a former home run king? Wouldn't Maris be up for consideration?
Kevin Mitchell in 1989?
George Bell in 1987 (there's that year again)?
And depending on what the future holds.. Barry Bonds, 2001...stay tuned.
22. Cris E
Posted: January 18, 2002 at 07:27 PM (#84446)
'87 was a good HR year for all of baseball (except pitchers).
23. scruff
Posted: January 18, 2002 at 10:05 PM (#84448)
"Runs Created is fundamentally runners on base times total bases divided by the number of outs made (with more refined versions having modifiers for stolen bases etc. but still basically the same)."
It's actually divided by plate appearances, not outs made.
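For anyone who wants to see the arithmetic, here is a minimal sketch of the basic version of Runs Created, (H + BB) * TB / (AB + BB), with the denominator being a simplified plate-appearance count as the correction above says. The function name and the stat line are made up for illustration; this is not any particular player's season, and the refined versions with stolen bases and so on are left out.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    on_base = hits + walks               # the "runners on base" factor
    plate_appearances = at_bats + walks  # simplified PA: AB + BB only
    if plate_appearances == 0:
        return 0.0
    return on_base * total_bases / plate_appearances


# Hypothetical stat line, chosen only to show the calculation.
rc = runs_created_basic(hits=180, walks=60, total_bases=300, at_bats=600)
print(round(rc, 1))  # (240 * 300) / 660, about 109.1
```

Dividing by outs instead of plate appearances would inflate the figure considerably, which is why the distinction matters.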
24. McCoy
Posted: January 19, 2002 at 06:51 AM (#84451)
Steve T,
25. . . . . . .
Posted: January 19, 2002 at 08:00 AM (#84454)
Shirley, I've played baseball for so long I feel funny without a bat in my hands. And I'm not exaggerating.
In regards to Boone and steroids, I obviously don't KNOW that he used them, but good god, he sure looks like he did. I'm a pretty religious lifter, and I can tell you that I, nor any of my friends, ever had results like he got in such a short time. Maybe it's cause he uses a fancy-schmancy strength coach, but I think a sip of the juice is more likely. Remember, given the risks of steroids, it'd make the most sense to use them in a contract year, then drop off them once you got the guaranteed millions in your pocket.
26. Robert Dudek
Posted: January 19, 2002 at 02:55 PM (#84455)
I don't even want to speculate about steroid use - what would be the point?
There is probably a 95% chance that Boone will be significantly worse in 2002. So what. The odds are very good that Boone's 2002 will fall somewhere in between his 2001 and 2000. He will still be among the best players in baseball at his position and you generally want to collect as many of those guys as you can because that's how you win pennants.
The money is a secondary issue for a club like the Mariners because they are extremely likely to have massive revenue over the next few years. When you've got the money, all you should care about is whether you are paying for quality goods or not. I'm pretty confident that they are.
27. Mike
Posted: January 19, 2002 at 03:23 PM (#84456)
To answer one of your questions Shirley-- part of the reason people have batting coaches is to avoid backsliding. If you had ever played the game (TM), you would know that hitters get into horrible habits all of the time. If all a batting coach is able to do is keep a player from backsliding, especially once he reaches a solid level of performance, that is a pretty good job. Also, pitchers make adjustments and hitters need to make adjustments back. As players age, and their physical skills deteriorate, they need to make mechanical adjustments to compensate for decreased bat speed, coordination etc.
There is also a limit to how good a particular player can get. It is unusual for a player Bret Boone's age to markedly improve his game because he has been coached by professionals for more than a decade and if a breakthrough was going to happen *most likely* it would have happened already. There is only so much you can learn. That said, as Luis Gonzalez, Jamie Moyer and some others have shown (including just about every knuckleballer), life can begin after 30. A small number of players do improve dramatically in their mid-30s (just as I imagine a small number of people are in better shape when they are 50 than when they are 25 or a small number of lawyers suddenly turn from Marcia Clark into Clarence Darrow at age 40), but not many do. And, as I am sure you know, players do have fluke seasons.
29. McCoy
Posted: January 19, 2002 at 05:10 PM (#84458)
So then, since Ned's HR output had nothing to do with him, and they changed the rules (I believe) the next year, wouldn't that qualify as a fluke season? I know the comparison to Biggio isn't exactly appropriate; that is why I made it. Biggio is or was a doubles hitter, which is why I used him. And even if they had changed the rules today he still wouldn't hit 140 homers. I was trying to show how extreme a difference it was for someone to hit 27 homers compared to his other years' output.
30. Steve Treder
Posted: January 19, 2002 at 06:09 PM (#84460)
Maris' 1961 season was obviously his best, by a wide margin. I don't think I'd classify it as a "fluke," however. He had obviously established himself in 1960 as one of the top power hitters in baseball, and as MikeEmeigh points out, the AL in 1961 was a great environment for power hitters (others who had career highs in HRs in that league: Mantle, Gentile, Colavito, Cash). Maris suffered significantly from the emotional stress of the '61 pressure, which certainly impacted his performance in '62, though even in that year he was pretty damn good, with 68 EBH and 100 RBI. Following 1962, he was never physically 100% again.
BTW, the Kansas City ballpark he played in was actually a good hitter's park. Maris' performance in 1959 was impacted by an early-season illness. The Cleveland park wasn't a good one for batting averages, but was pretty much neutral in its overall HR impact, though in his time there Maris hit 8 HR at home and 15 on the road. And while Yankee Stadium's short porch in RF would seem to be a big explanation for his HR surge, in fact in 1960-61 Maris hit 43 HR at home and 57 on the road. He was, for a poignantly short time, a legitimately great player.
Of course there might be some kind of explanation regarding some of the fluke HR seasons I listed in the earlier post ... I hadn't been aware of Moses' shoulder injury following his great 1937 performance, for instance. But in the Dave Johnson case, obviously he was helped by moving to the Atlanta "launching pad," but the fact still remains that he hit 17 HR on the road in 1973, after only once hitting more than 7 road HR in his 7 years with the Orioles.
And I defy anyone, anyone, to explain the Campaneris or Fernandez cases as anything other than sheer random weirdness.
In the case of Williamson in 1884, certainly that was a park effect of the most extreme kind. Williamson hit 27 HR that year (25 at home), while several teammates did quite well too: Fred Pfeffer hit 25 (all at home), Abner Dalrymple hit 22 (18 at home), and Cap Anson hit 21 (20 at home). Overall the White Stockings in 1884 hit 142 HR, 131 in their home park, in a league in which the next-best team HR total was 39. Perhaps someone can verify this, but it's my recollection that after 1884, the team either had the ballpark significantly reconfigured, or moved parks altogether.
31. RichRifkin
Posted: January 19, 2002 at 09:30 PM (#84464)
dzop writes: "In regards to Boone and steroids, I obviously don't KNOW that he used them, but good god, he sure looks like he did. I'm a pretty religious lifter, and I can tell you that I, nor any of my friends, ever had results like he got in such a short time. Maybe it's cause he uses a fancy-schmancy strength coach, but I think a sip of the juice is more likely."
dzop,
It sounds like you know a LOT more about pumping iron than I do. After finishing my high school football career - 20 years ago, YIKES! - I have never been a serious weight lifter.
But I wonder if it's still not possible, or even likely, that this is the explanation for Boone: 1) prior to 2001, he was never a serious or dedicated weight lifter; 2) prior to 2001, Boone never lifted for power or muscle mass gain, but rather for building endurance; and 3) prior to 2001, Boone never modified his diet to match a strength gain program, incorporating things like the protein mixes that are sold by EAS and other companies?
Actually, statistics handle improvement over time perfectly well. Otherwise, we wouldn't know that, say, Luis Gonzalez had even improved. Statistical interpretation is given its level of primacy here because it is the only meaningful way to talk about how well a player performs.
People attributing Bret Boone's 2001 season to steroids instead of conditioning, coaching, and skill has nothing to do with statistics. They are two alternative ways of explaining a different level of performance than we're accustomed to seeing from Boone. The only way we know Boone played better than usual in 2001 is by looking at the stats.
To put it simply: the numbers can grasp Boone's improvement. They did. Nowhere in Boone's prior numbers is there any indication of how he would perform in 2001, and there's not supposed to be. They just show how he did in 1992-2000. That's a reasonably good predictor of how a player will perform in 2001, but it is obviously not close to perfect. To blame statistics for not doing what they're not supposed to is ridiculous. And I don't understand how anyone can suggest that mathematical analysis has an "inability to quantify the qualitative shifts in performance boosts." That's exactly what statistics do. When Boone got better, statistics quantified it, regardless of how he did it.
In short, I think Boone will continue to play at an improved level, though probably not as improved as he was in 2001. That some people don't is not a reflection of their preference for statistics over whatever kind of analysis Shirley is suggesting, it is a reflection of their preference for analysis based on the 9 years from 1992 through 2000 over analysis based on 2001.
33. RichRifkin
Posted: January 20, 2002 at 05:45 AM (#84467)
"as Stargell said, the body gives up after a while (and for most people, well before 37)"
I don't argue with that at all, especially in regard to baseball. But if you've seen the amazing Jack LaLanne in recent years - I think he's now over 90 years old - it is remarkable how well the body can be maintained if you work at it, as LaLanne has.
34. Robert Dudek
Posted: January 20, 2002 at 01:52 PM (#84470)
Pete...
Whatever else is going on, the numbers capture very well that Barry Bonds' 2001 was the greatest offensive season in major league history.
No one can predict how many home runs Barry Bonds will hit in his career - that's not sabermetrics anyway, which focuses on trying to understand what has already happened rather than trying to predict what will happen.
I couldn't agree more with Robert Dudek's above post.
My problem with Shirley's post was not that I think statistics handle everything perfectly; I'm well aware that they have plenty of limitations. I just don't see how saying Boone was taking steroids is the result of mathematical analysis and saying he had better coaching and conditioning isn't.
36. Steve Treder
Posted: January 20, 2002 at 07:24 PM (#84473)
"You guys have been too nice to Shirley: she dropped vitriol all over this place and should be called on it. How can we take seriously the analysis of someone whose only bit of analysis that we have seen, that is, her analysis of the people on this list (all "nerds" who have never played a sport) is so obviously in error?"
I agree. I've been doing my best to ignore Shirley's posts; IMO silly sniping between posters is just childish noise that obscures the subject matter that interests me.
But: Shirley's presumptuousness that she knows anything, let alone everything, about "us" is appalling. Shirley's rude and abrasive tone is revolting. And the one snippet she provides about her baseball expertise (the Bonds patience and uppercut business) betrays the most trivial understanding of how the game is played, and her bloated diatribe on the subject of quantitative analysis can be summed up in one brief phrase: numbers don't tell the whole story.
Thanks for that original insight, Shirley. Our lives are now changed.
?? This is not a statement that makes any sense. The past matters *only* in the way it helps us predict the future. "Those that don't remember history are condemned to repeat its mistakes" implicitly means "repeat its mistakes *in the future*". Of course sabremetrics is about trying to predict the future, as are all belief systems, including scientific/numerical or less hard belief systems.
Andy, there's a difference between a prediction like "If A, then B" and a prediction like "B." Sabermetrics is reasonably good at the former. At the latter? Not so good. And I think sabermetrics gets a bad rap from much of the public because people think it is, or should be, doing the latter.
The past matters *only* in the way it helps us predict the future. "Those that don't remember history are condemned to repeat its mistakes" implicitly means "repeat its mistakes *in the future*". Of course sabremetrics is about trying to predict the future, as are all belief systems, including scientific/numerical or less hard belief systems.
Perhaps I'm in the minority, but I have a little more interest in the past than that. In any case, I like the analogy to the history adage, because it illustrates the point that sabermetrics provide about as good a predictor of the future as history books do, which is to say it gives you a good frame of reference to make informed guesses, but not nearly enough to be dependably correct.
Andy James:
If a clearly superior horse runs just well enough to defeat lesser horses, thus earning a win but an unimpressive time, what can we expect that horse to do against a better class of horses? Run faster?
I think that very aptly describes the behavior of the Yankees in the regular season vs. the post season lately.
It may be that some managers are very good at evaluating individual players' talents, while other managers are better at envisioning how those talents fit together.
Very good point. I think that's how the Seattle Mariners won 116 games in 2001.
39. WaltDavis
Posted: January 21, 2002 at 12:58 AM (#84480)
One thing you learn quickly is that, even though statistical consideration tends to treat the horses in a race as individuals, as if they ran independently of one another, in practice they run very much against each other. That is, the horses very much affect one another's outcomes as they run together. This is an area where the numbers lose their effectiveness, and you have to (gulp, sorry) begin to imagine the psychology of the horses (which is alien terrain), and envision their interaction. This is where developed intuition supersedes number crunching.
Perhaps better would be to develop statistical models that don't assume independence. The assumption of independence is becoming rarer in statistical modeling now that (1) there are plenty of models developed that don't require that assumption and (2) fast computers and good software make estimating such models pretty quick and straightforward. I know squat about horse racing, but a given horse's race times can be treated similarly to repeated measures (without assuming equal distance between measures nor the same # of measures per subject) with another nesting factor for race, preferably sufficiently controlled for by using race-specific factors such as weather, "track speed", avg. time of that race, avg. time of horses prior to that race, etc. Gimme the data (including unique horse and race id's) and I can probably do this in about 6 lines of code. :-) Just don't wager on the results.
[though it's been a while since I read it, I recall the piece on pitcher abuse points in last season's BP is an example of where a repeated measures approach should have been taken]
As to predicting Barry Bonds and Luis Gonzalez, and sort of agreeing with one of Mike's comments, sabermetricians (at least "mainstream" ones) don't seem real fond of filling us in on the confidence intervals around their predictions. Hopefully that's not out of their own ignorance but rather a desire to not confuse stat-o-phobic readers. Though I seriously doubt that Bonds and Gonzalez fell within a 95% confidence interval, I suspect the 95% confidence interval around things like EQA or XRuns or RC or whatever your favorite measure is can get pretty broad, especially in a sample of 162 games. That variance may not be constant across position/age either.
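To give a rough feel for how broad such an interval can be, here is a minimal sketch using the simplest possible case: a rate stat like OBP treated as a binomial proportion with a normal approximation. The function name and the 210-for-600 line are hypothetical. It also assumes every plate appearance is an independent trial, which is exactly the assumption questioned below, so the real interval is probably wider, not narrower.

```python
import math


def rate_interval(successes, trials, z=1.96):
    """Approximate 95% interval for a rate, assuming independent trials."""
    p = successes / trials
    standard_error = math.sqrt(p * (1 - p) / trials)
    return p - z * standard_error, p + z * standard_error


# Hypothetical season: on base 210 times in 600 plate appearances (.350).
low, high = rate_interval(210, 600)
print(f"{low:.3f} to {high:.3f}")  # roughly .312 to .388
```

Even under these generous assumptions, a full season of a .350 OBP is consistent with anything from a mediocre hitter to a very good one.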
I haven't read the gnarly sabermetric stuff so my following pronouncement is grossly premature and I deserve the negative comments that may be coming my way, but the four primary statistical problems I see with sabermetrics are:
1. The assumption of independence. This is the area where there have been some steps taken (e.g. park effects, era effects) though it's not clear to me that the authors understand they're helping control for the lack of independence. Note, controlling for a lack of independence will generally increase your standard errors which means you're likely to find less significance.
2. The ecological fallacy. This one may not be as bad as I think it is. But near as I can tell, most/all of the formulas (i.e. the weights) used to come up with these player-level comprehensive measures like RC, XRuns, EQA are based on team-level regressions and other team-level analyses. Alas, there need be little to no connection between team-level functions and player-level functions (though in this case, the connection is probably higher than in most cases). Note, this is related to point #1 above. If I'm way off-base here, I apologize...and I'd be curious as to what the dependent variable is in the models underlying the weights.
2a. Near as I can tell, the quality of these player-level comprehensive measures, especially the ones measured in runs, is that when you sum up the predicted runs for the players on a team in a given season, you get a number close to the actual number of runs for the team. If I correctly understand how these weights were derived, a close match to the aggregate is to be expected and doesn't tell you much about the quality of your player-level model (though it's certainly better than being off by a lot). At the very least, even if the weights are correct, the amount of error at the player-level is likely much higher than the error at the team level. Also, under- and over-estimates of the team total based on player-level estimates are not necessarily evidence of a team over- or under-performing their expectations, they may be evidence of the error in the player-level model.
3. Confidence intervals.
4. Lack of training in measurement theory and modeling. Sabermetricians are hardly the only ones seriously lacking here.
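Point 2a can be illustrated with a small simulation. This is only a sketch under stated assumptions: every player's "true" run value is the same made-up number, the model's errors are independent, zero-mean, and normal, and all the names and parameters are hypothetical. Under those assumptions the errors partly cancel when summed, so the team total looks far more accurate than any individual estimate.

```python
import random
import statistics


def simulate(n_teams=30, players_per_team=9, true_runs=80.0,
             player_sd=12.0, seed=0):
    """Return (mean player-level, mean team-level) absolute relative error."""
    rng = random.Random(seed)
    player_errors = []
    team_errors = []
    for _ in range(n_teams):
        # Independent, zero-mean estimation error for each player (assumed).
        errors = [rng.gauss(0.0, player_sd) for _ in range(players_per_team)]
        player_errors.extend(abs(e) / true_runs for e in errors)
        # Team error is the summed player error relative to the team total.
        team_total = true_runs * players_per_team
        team_errors.append(abs(sum(errors)) / team_total)
    return statistics.mean(player_errors), statistics.mean(team_errors)


player_err, team_err = simulate(n_teams=500)
print(f"player-level: {player_err:.1%}, team-level: {team_err:.1%}")
```

With nine independent players the team-level relative error should come out near one third of the player-level error (a factor of 1/sqrt(9)). If the errors are correlated within a team, as point 1 suggests they may be, the cancellation is weaker, but a close team-level fit still says little about the player-level fit.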
40. Robert Dudek
Posted: January 21, 2002 at 01:15 AM (#84481)
Andy Cleary....
I wasn't making myself at all clear in my last post. I apologize.
Sabermetrics is about understanding what has happened. We study past baseball games and their records to try to deepen our understanding of the game.
That is the essential work of sabermetrics. We can then use this knowledge to help predict the future IF WE WANT TO. Your "condemned to repeat history" analogy doesn't apply to baseball because we are not talking about some sort of moral imperative to avoid catastrophe (I always thought that that phrase was 70% hokum anyway). Sabermetrics is absolutely meaningless outside of the confines of its narrowly defined universe (I speak here of baseball and its rules of play).
Many years ago there were studies done that strongly suggested that offenses based on power and walks generally score more runs than those based on batting average and speed. By the standards of sabermetrics this is extremely useful knowledge, but it rests on the assumption that the conditions which produced this phenomenon in the past are maintained in the future. There is no guarantee that they will be maintained.
The primary focus of sabermetrics is studying baseball in order to understand it - what anyone does with this knowledge is their own business and there certainly is no mandate that it has to have practical consequences.
41. Robert Dudek
Posted: January 21, 2002 at 02:10 AM (#84482)
WaltDavis wrote: "2a. Near as I can tell, the quality of these player-level comprehensive measures, especially the ones measured in runs, is that when you sum up the predicted runs for the players on a team in a given season, you get a number close to the actual number of runs for the team. If I correctly understand how these weights were derived, a close match to the aggregate is to be expected and doesn't tell you much about the quality of your player-level model (though it's certainly better than being off by a lot). At the very least, even if the weights are correct, the amount of error at the player-level is likely much higher than the error at the team level. Also, under- and over-estimates of the team total based on player-level estimates are not necessarily evidence of a team over- or under-performing their expectations, they may be evidence of the error in the player-level model."
A formula like XRUNS is developed at the team level. The true weights of the offensive events are not static but change over time as general offensive conditions change. The main reasons why team XRUN predictions miss their mark are (a) differences in clutch performance (usually assumed to be due to chance, but lineup design can have an effect here) and (b) elements like baserunning and moving runners via outs which are not captured in the formula.
On a player level, there is no "standard" by which to judge accuracy and that is because runs are scored on a team level. A player's contribution to team runs is also affected by lineup position and how many teammates are on base (batting a lot with 2 out and no one on lowers the value of your offensive performance).
Nevertheless, in a linear formula, the sum of the contribution of each player (as captured by the formula elements) does a good job of estimating the number of runs a team will score.
In my opinion, we are very near the limit of what we can do with the traditional stats. To get more accurate, we will have to use play-by-play data in conjunction with computer modelling of offenses.
42. RichRifkin
Posted: January 21, 2002 at 07:44 AM (#84483)
"... btw, the jockey plays about as much of a role in the race as a manager in a ballgame. Maybe less, in my opinion."
Andy James,
I've never been much of a horseracing fan. But my grandfather - who was indeed a serious horse gambler - told me something to keep in mind about jockeys: they are often a signal as to what kind of horse you are looking at, particularly in minor races. My grandfather was born in Poland in 1886 and died in San Francisco in 1980. So, in his day, if Eddie Arcaro or Willie Shoemaker or later Laffit Pincay, Jr. were on the mount, that would be a good signal, barring other information, that you're looking at a very good horse. The few times I've been to Golden Gate Fields or Bay Meadows, I've done the same when Russell Baze is on a horse. Chances are, it's a damn good horse. Baze has been too successful for him to get on a bad horse. And the owners and trainers of the best horses tend to seek out a quality and known jockey like Russell Baze.
43. Steve Treder
Posted: January 21, 2002 at 07:08 PM (#84487)
"The grace and thoughtfulness reflected by the people who post on this site is truly remarkable -- even when their posts are in response to zealoted and argumentative boors. That is why I keep coming back to a site which I sincerely believe is 'Baseball for the Thinking Fan'. Gentlemen and scholars..."
And, one hopes, ladies as well.
44. WaltDavis
Posted: January 21, 2002 at 07:33 PM (#84488)
Nevertheless, in a linear formula, the sum of the contribution of each player (as captured by the formula elements) does a good job of estimating the number of runs a team will score.
Yes, but I believe this is the expected case whether the given formula does a good job of estimating individual "contribution" or not. One point I should have made before is that it depends a lot on what you use these for. If all you want is a prediction of how many runs a team will score, then this does a good job. But if you want to assess a given player's contribution or compare individual players, which happens all the time, then this may not do a good job.
That may just sum up my general criticism. From what I've seen, team- and league-level understanding and precision of prediction is very good. At the player level, the quality of the measures is quite unclear. Many mainstream sabermetric writers (Neyer, BP) and many of the posters here (including myself) have a tendency to make pronouncements about players that sound like they're based on measures as accurate as at the team level. And that's something we just don't know. But we do know that in most cases where you have people nested within groups (players on teams), the group-level variation is tiny compared to the individual-level variation. So even if the weights are roughly correct, the amount of error at the player-level is likely 3-9 times greater than at the team level.
On a player level, there is no "standard" by which to judge accuracy and that is because runs are scored on a team level. A player's contribution to team runs is also affected by lineup position and how many teammates are on base (batting a lot with 2 out and no one on lowers the value of your offensive performance).
The lack of a "standard" is a big problem (see measurement theory). But I'm not sure that we can't use runs (or some combo of runs and rbi). I know these are highly context dependent, but it may be possible to control for some/most of that context.
At the very least, a model along the lines of a "multi-level" linear regression (aka fixed & random effects model, hierarchical linear model, variance components, mixed models, etc.) should be possible. Such models simultaneously estimate group- and individual-level regression models, where the dependent variable of the group-level model is essentially the mean of the dependent variable of the individual-level model. At the group-level we'd have team-level independent variables like OBP and SLG (or walk rates, single rates, double rates, etc.); at the individual-level you have player-level independent variables, measured as deviations from the team mean where appropriate. Such things as average # of runners on base when a player batted or the average OPS of the three batters behind him could be included in the model.
Now I say "should be possible" cuz I've been tossing this idea around in the back of my head for a couple years now and still haven't decided on the right operationalization of the dependent variable.
In my opinion, we are very near the limit of what we can do with the traditional stats. To get more accurate, we will have to use play-by-play data in conjunction with computer modelling of offenses.
I'm not so sure we're at that limit, seems to me there's lots that could be done with fancier econometric models and such (though whether worth the effort is a good question). But I'd agree that with the play-by-play data available, there's no reason not to move on. Without question, there's a data revolution.
By the way, having mildly dissed sabermetricians, I should certainly have pointed out that I am very impressed with the care and thought that folks put into these studies. I've sometimes got questions as to whether context and other statistical issues are addressed properly, but folks are certainly keenly aware of context and go to considerable lengths to try to correct for it. Moreover, the statistical independence of many events has been studied -- not sure authors always realized this, but studies debunking "hot streaks", "clutch hitting", some of Voros' DIPS work can also be viewed as studies documenting the independence of plate appearances, games, seasons, etc. for at least some phenomena. Overall I'm quite impressed with sabermetricians' creativity, attention to detail, and willingness to address alternative hypotheses. With more stat training, a lot of you folks would be much better at my job than I am (some of you I assume already are :-).
45. bob mong
Posted: January 22, 2002 at 12:40 AM (#84490)
voros,
what walt is trying to say (i think) is that you can't start with a *team* stat (i.e., Runs or Runs Allowed) find a statistical model that uses other *team* stats (i.e., OBP, SLG, BB, etc.) to accurately predict the original stat, then extrapolate from the *essentially TEAM-based* statistical model to the individual level without a loss of accuracy.
In other words, someone creates a model, say XR (JimFurtado, correct?), that takes TEAM H, TEAM BB, etc, and relates all those stats to TEAM runs scored. If you take that model and apply it to INDIVIDUALS you are, essentially, misapplying it. That doesn't mean it isn't useful, it just means you are losing accuracy and statistical power when you do that.
When you say that, based on XR values, a player "created" 100 runs in a season, what you really are doing is dividing up a TEAM's runs among the team members based on a statistical model that works very well on a TEAM level. This would not pass statistical muster with anyone who knows their statistical stuff.
But Voros, offense isn't linear. On a team, a player's offensive performance interacts with *other* players' offensive performances, not his own.
Runs created when applied to individuals, for instance, treats Barry Bonds' home runs as more valuable than Dante Bichette's, because Bonds is treated as driving in Bonds (with his high OBP), while Bichette is treated as driving in Bichette (with his low OBP). That's obviously wrong. Now, if a team has a high OBP, its HRs *are* more valuable. But that's not true for an individual player.
47. Robert Dudek
Posted: January 22, 2002 at 04:12 AM (#84495)
Since we're sort of on this topic, did anyone notice how large the XR error was for the 2001 San Francisco Giants? Their projection was for 878.4 runs and they actually scored 799. That 79.4 was the largest absolute error in the history of the formula (1955-2001) - the next largest being the 1959 Indians who were projected to score 682.2 and actually scored 745, a difference of 62.8 going the other way.
What happened?
I think part of it might be that Bonds hit a lot of solo homers and the guys hitting behind him didn't produce. But could it possibly explain everything?
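The size of those two misses is plain subtraction. A minimal Python sketch, using only the figures quoted above (the function name `xr_error` is made up for illustration and is not part of any XR tool):

```python
def xr_error(projected, actual):
    """Signed error in runs: positive means the formula over-projected."""
    return round(projected - actual, 1)

# Figures as quoted in the post above.
giants_2001 = xr_error(878.4, 799)    # formula too high
indians_1959 = xr_error(682.2, 745)   # formula too low, so negative

print("2001 Giants:", giants_2001)
print("1959 Indians:", indians_1959)
# The Giants' miss expressed as a share of the runs they actually scored.
print(f"Giants miss relative to actual runs: {giants_2001 / 799:.1%}")
```

Keeping the sign makes it obvious that the two largest misses go in opposite directions.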
You miss probably the most important aspect of OBP, avoiding outs. But setting that aside, Bichette's home runs are not more or less valuable than Bonds'; his whole package is less valuable than Bonds' because a low OBP destroys the continuity of the offense. The OBP of every batter in the offense is by far the most important measure of what they contribute to the team's offense.
You misunderstand me. You're preaching to the choir on the importance of OBP. My comments are directed to a specific formula, runs created. (As MikeEmeigh points out, my comments are directed at the original runs created methodology. I never bothered to look at Bill James' "new" formula introduced a couple o' years ago.) Of course Bonds is more valuable than Bichette, and of course Bichette's HRs aren't more valuable than Bonds' HRs. But the runs created methodology *treats* Bonds' HRs as more valuable.
49. Steve Treder
Posted: January 22, 2002 at 04:15 PM (#84502)
"... did anyone notice how large the XR error was for the 2001 San Francisco Giants? Their projection was for 878.4 runs and they actually scored 799. That 79.4 was the largest absolute error in the history of the formula (1955-2001) - the next largest being the 1959 Indians who were projected to score 682.2 and actually scored 745, a difference of 62.8 going the other way.
What happened?"
Believe me, watching the team every day, I noticed, I noticed. It was like toothache pain: you want to ignore it, but somehow you just can't.
The underefficiency of the Giants' offense was a perfect storm: combine a leadoff spot with an abysmal OBP (and a fairly high number of HRs), with a #2 hitter who hits lots of singles, doubles, and homers, but rarely walks (meaning he leaves the bases empty a lot, and when he doubles, guess what happens to the #3 hitter!), the most extraordinary season in our lifetimes batting third, and #4 and #5 hitters who perform dismally with runners on base (check out what Kent, Russ Davis, and Snow did in such situations; the only guy who did all right with men on base was Galarraga).
I suspect if you played the season over with exactly the same cast of characters, the #4 and #5 hitters would perform more normally and drive some runs in once in a while, and it wouldn't be as extreme. But it does illustrate what's been talked about here quite a bit: simply computing a team's expected runs on the basis of its team totals doesn't work well if the team accumulated those totals in a very irregular pattern. Since the bulk of the Giants' HRs and BBs were coming from the same spot in the order, their offense was bound to be as inefficient as possible.
50. WaltDavis
Posted: January 22, 2002 at 08:18 PM (#84503)
Voros,
Bob Mong hits the nail pretty much right on the head.
It's called the "ecological fallacy" (among other names too no doubt). Relationships (correlations, regressions, etc.) among variables at the aggregate level need be in no way related to relationships among those same variables at the individual level. You can find a pretty clear treatment of it in the first few pages of this link before it gets into more statistical modeling questions.
One of the examples in the piece is pretty powerful. He looks at the %age of native- and foreign-born residents (of states) with an income greater than $50,000. Using individual-level census data, the real %ages are estimated as 35% for native-born and 28% for foreign-born. The "ecological regression" (i.e. regressing state %'s with $50,000 income on %foreign-born) yields estimates of 26% with high-income for native-born and a whopping 85% with high-income for foreign born.
To take an example not found in that piece: if you correlated a state's percentage of black voters with its percentage of votes for Republican presidential candidates (this was true at least as of 1992), you'd find a positive correlation. Yet if you look at individual polling data, you'd find that something like 85-90% of black voters support Democratic presidential candidates. In this case, part of the explanation is obvious -- Southern states are still the ones with higher percentages of black voters but they also have lots of conservative whites.
And we came up with a possible case in the Yankees payroll thread. Robert found what appeared to be substantial correlation between payroll disparity and performance disparity measured at the level of MLB. You found little correlation between payroll and performance at the team level. And I bet if we looked at the correlation between salary and performance at the individual level, we'd find yet another relationship (or at least a whole lot more variability and error).
Anyway, at a minimum, making inferences from team to player results in aggregation bias, and that's if the relationships among the variables are relatively constant between the levels. That is, at the very least, you're ignoring individual-level error.
However, there's no statistical reason that the relationships at the individual level even have to be in the same direction, much less of equal impact, though in this case I think it's safe to say that a double is positively related to productivity at both the team and player level.
But it is certainly possible that an HR may have a different weight at the two levels -- i.e. it may have a greater impact on individual productivity than on team productivity. Conversely, walking may not increase the productivity of a player by much, but it advances runners and creates more plate appearances and therefore at the team-level the walk may be more important.
All of that is speculation but so is assuming that the weights are the same. My point is that it's well-known among statisticians that applying aggregate-level models to individual-level data can lead to serious problems in terms of accuracy and bias. From what I've seen, this is not well-known among sabermetricians.
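To make the ecological fallacy concrete, here is a small Python sketch with invented numbers. This is not baseball data: the three "teams" and their (x, y) player pairs are purely hypothetical, chosen so that y falls as x rises within every team while the team averages rise together.

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# HYPOTHETICAL data: each team holds three players as (x, y) pairs.
teams = {
    "A": [(0, 10), (5, 5), (10, 0)],
    "B": [(1, 11), (6, 6), (11, 1)],
    "C": [(2, 12), (7, 7), (12, 2)],
}

# Team level: correlate the team means of x and y.
team_x = [sum(x for x, _ in p) / len(p) for p in teams.values()]
team_y = [sum(y for _, y in p) / len(p) for p in teams.values()]
team_r = pearson(team_x, team_y)

# Player level: pool every player and correlate x with y directly.
players = [pair for p in teams.values() for pair in p]
player_r = pearson([x for x, _ in players], [y for _, y in players])

print(f"team-level r = {team_r:+.2f}, player-level r = {player_r:+.2f}")
```

With these made-up values the team-level correlation is perfectly positive while the pooled player-level correlation is strongly negative, so a weight fitted on team totals would have the wrong sign for individuals. Real baseball data is unlikely to be this extreme; the sketch only shows that the two levels are not forced to agree.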
51. bob mong
Posted: January 22, 2002 at 10:34 PM (#84504)
good link, walt.
52. bob mong
Posted: January 22, 2002 at 10:46 PM (#84505)
offensively speaking, the job/goal of the team and the players is to score runs.
53. bob mong
Posted: January 22, 2002 at 10:48 PM (#84506)
probably better phrasing:
it has an undefined error.
54. Robert Dudek
Posted: January 23, 2002 at 04:17 AM (#84508)
Everything depends on what we are trying to measure. If we want to know what the player ACTUALLY contributed to team runs scoring, then obviously the assumptions of run estimation formulas that use the traditional stats are going to have a substantial degree of inaccuracy. This is because a formula like XR assumes that the events occur in a way they would on an average team over many years, and this is plainly not the case on an individual player basis.
A lot depends on the batting order. A walk has a greater value (relative to a homerun, for example) for a leadoff hitter than for a #3 hitter because the #1 hitter will be batting more often leading off an inning, where the value of a walk versus a homerun is maximized. Conversely, a walk is least valuable versus a homerun when there is a man on 2nd or men on 2nd and 3rd with 2 outs. In such situations, a walk adds relatively little to run expectation but a homerun adds a great deal. This can all be garnered from a base-out run expectation table.
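The walk-versus-homerun comparison can be sketched in Python. The run-expectancy numbers below are HYPOTHETICAL placeholders picked only to have a plausible shape; they are not taken from any published base-out table, and the `event_value` helper is likewise just an illustration.

```python
# HYPOTHETICAL run expectancies keyed by (bases occupied, outs).
# "1--" = man on 1st, "-2-" = man on 2nd, "12-" = men on 1st and 2nd.
RUN_EXP = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("---", 2): 0.10,
    ("-2-", 2): 0.33,
    ("12-", 2): 0.45,
}

def event_value(before, after, runs_scored):
    """Run value of a play: runs scored plus the change in run expectancy."""
    return runs_scored + RUN_EXP[after] - RUN_EXP[before]

# Leading off an inning: bases empty, nobody out.
walk_leadoff = event_value(("---", 0), ("1--", 0), 0)
hr_leadoff = event_value(("---", 0), ("---", 0), 1)

# Man on 2nd with two outs.
walk_2nd_2out = event_value(("-2-", 2), ("12-", 2), 0)
hr_2nd_2out = event_value(("-2-", 2), ("---", 2), 2)

print("walk/HR value leading off:   ", round(walk_leadoff / hr_leadoff, 2))
print("walk/HR value, 2nd and 2 out:", round(walk_2nd_2out / hr_2nd_2out, 2))
```

Under these placeholder values the walk is worth a much larger fraction of a homerun when leading off than with a man on 2nd and two outs, which is the pattern described above. Swapping in a real base-out table would change the numbers but not the method.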
XR and formulas like it are not designed to deal with actual value but rather theoretical value. They ask what a player's value would be in a lineup neutral context, and even then, the more extreme the player the less certain we can be about the run estimate.
The example of the 2001 SF Giants is apt. Barry Bonds is estimated by the formula to have about 185 XR in 2001, but it is highly unlikely that that is how many runs he actually created for his actual team (since the Giants scored almost 80 runs fewer than expected by the formula).
I believe that the three main reasons for this are:
1) lack of an effective leadoff hitter (.315 OBP) meant that there were not a lot of men on base when Bonds was batting (thus limiting the value of his homeruns) and he was more likely to be batting with 1st base open (thus limiting the value of Bonds' walks);
2) while the number 4 hitter for SF hit well (.372 OBP, .530 SLG - Bonds batted in this spot for several games), the #5 hitter was abysmal (.321 OBP, .393 SLG). This meant that Bonds was less likely to score after reaching base (except on homeruns) than a typical National League #3 hitter. The Giants' #3 hitter reached base via walk and non-homerun base hit 284 times and scored 71 runs (when not hitting a homer) - scoring exactly 25% of the time. The average National League #3 hitter reached 238 times via hit or walk not counting a homerun and scored 79 times (when not hitting a homerun) - a 33.2% rate.
3) Concentrating a lot of your production in one spot probably leads to inherent offensive inefficiencies as alluded to by Steve Treder.
To summarize, through no fault of his own (he hit well with men on base) Bonds' actual contribution to team runs was vastly overstated by XR and would be by any linear weights formula.
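The scoring rates in point 2 can be checked with quick arithmetic. The Python sketch below uses only the counts quoted above. The last step, applying the league rate to the Giants' times on base, is a back-of-envelope assumption added here for scale and is not a claim made in the post.

```python
# Counts quoted above: times on base via walk or non-homerun hit,
# and runs scored when not hitting a homerun.
giants_on_base, giants_runs = 284, 71
league_on_base, league_runs = 238, 79

giants_rate = giants_runs / giants_on_base   # share of times on base that scored
league_rate = league_runs / league_on_base

# ASSUMPTION for illustration: the Giants' #3 hitter scoring at the league rate.
runs_at_league_rate = giants_on_base * league_rate
shortfall = runs_at_league_rate - giants_runs

print(f"Giants #3 hitter scored {giants_rate:.1%} of the time")
print(f"Average NL #3 hitter scored {league_rate:.1%} of the time")
print(f"Rough shortfall at the league rate: {shortfall:.0f} runs")
```

On that rough assumption the gap covers only part of the team's overall miss against the formula, which fits the idea that several factors were at work.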
The M's could probably use a lefty version of Velarde to occasionally mix in for Cirillo and Boone. Oh wait, they've got McLemore ... for 2 years and a whole bunch of money. :-)
To the extent that I'd criticize Beane, it's for ending up with Justice, Hatteberg, Venafro, and signing one of those lefties (Holtz?) all making pretty good money. I'm still dumbfounded by the Hatteberg signing ... but it wouldn't be the first time Beane knew more than me.
what do you want the Ms to do? Sign one of the many cheap, better free agent 2b out there? I haven't heard any actual names of who the mariners would be better off having. or, perhaps, should they have signed boone for cheaper? oh yeah, i guess they can't just unilaterally impose whatever contract they want on him; he has to agree.
sure it isn't the best signing of all time, but the mariners are looking to WIN next year and can't afford to have a prospect learn how to hit/field at 2B; they needed someone who can hit NOW.
sure he won't hit as well in 02 as he did in 01. but he had one of the best hitting years for a 2B ever. if he comes down from that to merely very good, i think he will be worth $8 million.
His 2001 performance included 37 homers, which is roughly 50% more than 25. And Safeco is not a good hitters' park.
A very good defensive 2nd baseman who hits 25 HR a year is a terrific player, and that isn't unreasonable to expect from Boone at this point. But his OBP probably won't be very good. It's almost a certainty that he'll never match his 2001 numbers again.
If I were the M's, I'd take out insurance on this deal in case a new drug testing program is put into place in the next few years...
Well said.
I think it would be naive to presume that steroid use ISN'T widespread in baseball. Performance-enhancing drugs are obviously widely used in many other sports -- why would the Olympics go to the efforts they do to test if there wasn't good reason to suspect that many athletes were using them?
While there may very well be long-term health problems caused by steroids, the extent of these is at this point highly variable and largely little known. What is disputed by practically no one is that, used judiciously and as an integrated part of a rigorous training and conditioning program, performance-enhancing drugs do in fact enhance performance.
So put yourself in the athlete's position. You have probable cause to suspect that many of your competitors/opponents are using these drugs. You have a very short window of time to establish yourself and to be successful. The rewards of success include stupendous wealth, fame, and adulation. The consequences of failure include humiliation and poverty. Faustian bargain it may be, but wouldn't you be tempted?
Baseball players are people, as prone to greed, temptation, and unwise judgment as any of the rest of us. I think it is the least likely of circumstances that steroids aren't part of the MLB environment as we enter the 21st century. Welcome to reality.
And
"I've read anonymous comments from ballplayers (not the most reliable source) about considerable steroid use in MLB."
And
"I think it would be naive to presume that steroid use ISN'T widespread in baseball."
Then call me naive. I don't presume that steroid use is widespread in baseball.
In the specific case of Bret Boone, I don't think it's fair to specifically accuse him of steroid use without any specific evidence. I don't deny that Boone had a fluky great season in 2001. Everything went incredibly well for him. But fluky great seasons are not unique to our era. Nor are giant leaps in performance that become sustained. Well before "juice" was introduced to sports a little more than 30 years ago, these anomalies occurred. I haven't seen any evidence that there has been an increase in anomalous seasons in the last 10 years. If "juice" use is now ubiquitous - the charge that has been leveled - then wouldn't giant leaps in performance be much more common today than they are?
And what happens next year if Boone's performance returns to his prior levels? Does that mean he went off the "juice"? Or does that mean that he was never on it, and that his 2001 season was entirely a fluke? And are many of his accusers making these two seemingly incompatible arguments at the same time?
My understanding - also based on comments made by players - is that "performance enhancing nutrition" is now widespread. Players work out with weights almost every day, almost all year long. In order to help them increase muscle mass, they eat the protein crap that is sold at stores like GNC Nutrition Centers. Perhaps some of the ingredients in that crap have harmful effects similar to those steroids can have. I don't know. But, until I see some convincing evidence, I will continue to doubt that more than a handful of players are taking illegal anabolic steroids.
Finally, my interpretation of the comments made by players that steroid abuse is "widespread" - none of which I've ever heard; but perhaps I just missed them - is that these kinds of accusations can be made out of jealousy and frustration. For example, it's possible that Player A is unmuscled and struggling while Player B, his highly muscled teammate, begins to figure things out and is raking the ball. Player A then, in frustration, drops a hint to the local beat-reporter that Player B is on the "juice." This goes unreported as a specific allegation. But that beat-reporter tells others in the press and the rumors spread that Player B is on the "juice."
Steve,
If what you say is true, why wouldn't the MLBPA be pushing for steroids testing?
Your argument is that individual players are in a prisoner's dilemma - if you've studied game theory, you'll know what that is. The Union could easily solve this dilemma, which presumably would benefit the players collectively, by demanding steroid testing. Yet they don't. Doesn't that alone logically suggest that steroid use is far less common than you contend?
I've read anonymous comments from ballplayers (not the most reliable source) about considerable steroid use in MLB.
They're even less reliable than you portray. Most of the anonymous comments are about rumors, not about firsthand knowledge.
It is the prisoner's dilemma precisely.
But I don't think the MLBPA's inaction on the issue proves anything at all. There are obvious reasons for everyone involved to keep the whole question underground. The prevailing logic could very well be "don't ask, don't tell," just as it was for many years regarding players' widespread use of amphetamines ("greenies").
Look, I'm not saying that I "know" Bret Boone or anyone else uses steroids. I'm saying just the opposite: I DON'T know. My assumption is neither that players do, nor that they don't. What I'm saying is that given what we do know -- that these substances exist, they are available, and they are and have been used by many athletes in many sports around the world -- I just don't think it's a logical conclusion that baseball players are necessarily different from these other athletes, and therefore to assume that steroids are NOT used by baseball players.
What about Andre Dawson? Whatever you want to make of it, for whatever reason, hitting 49 in 1987 was by far his greatest output.
And what of a former home run king? Wouldn't Maris be up for consideration?
Kevin Mitchell in 1989?
George Bell in 1987 (there's that year again)?
And depending on what the future holds.. Barry Bonds, 2001...stay tuned.
It's actually divided by plate appearances, not outs made.
In regards to Boone and steroids, I obviously don't KNOW that he used them, but good god, he sure looks like he did. I'm a pretty religious lifter, and I can tell you that neither I nor any of my friends ever had results like he got in such a short time. Maybe it's because he uses a fancy-schmancy strength coach, but I think a sip of the juice is more likely. Remember, given the risks of steroids, it'd make the most sense to use them in a contract year, then drop off them once you've got the guaranteed millions in your pocket.
There is probably a 95% chance that Boone will be significantly worse in 2002. So what. The odds are very good that Boone's 2002 will fall somewhere in between his 2001 and 2000. He will still be among the best players in baseball at his position and you generally want to collect as many of those guys as you can because that's how you win pennants.
The money is a secondary issue for a club like the Mariners because they are extremely likely to have massive revenue over the next few years. When you've got the money, all you should care about is whether you are paying for quality goods or not. I'm pretty confident that they are.
There is also a limit to how good a particular player can get. It is unusual for a player Bret Boone's age to markedly improve his game because he has been coached by professionals for more than a decade and if a breakthrough was going to happen *most likely* it would have happened already. There is only so much you can learn. That said, as Luis Gonzalez, Jamie Moyer and some others have shown (including just about every knuckleballer), life can begin after 30. A small number of players do improve dramatically in their mid-30s (just as I imagine a small number of people are in better shape when they are 50 than when they are 25 or a small number of lawyers suddenly turn from Marcia Clark into Clarence Darrow at age 40), but not many do. And, as I am sure you know, players do have fluke seasons.
BTW, the Kansas City ballpark he played in was actually a good hitter's park. Maris' performance in 1959 was impacted by an early-season illness. The Cleveland park wasn't a good one for batting averages, but was pretty much neutral in its overall HR impact, though in his time there Maris hit 8 HR at home and 15 on the road. And while Yankee Stadium's short porch in RF would seem to be a big explanation for his HR surge, in fact in 1960-61 Maris hit 43 HR at home and 57 on the road. He was, for a poignantly short time, a legitimately great player.
Of course there might be some kind of explanation regarding some of the fluke HR seasons I listed in the earlier post ... I hadn't been aware of Moses' shoulder injury following his great 1937 performance, for instance. But in the Dave Johnson case, obviously he was helped by moving to the Atlanta "launching pad," but the fact still remains that he hit 17 HR on the road in 1973, after only once hitting more than 7 road HR in his 7 years with the Orioles.
And I defy anyone, anyone, to explain the Campaneris or Fernandez cases as anything other than sheer random weirdness.
In the case of Williamson in 1884, certainly that was a park effect of the most extreme kind. Williamson hit 27 HR that year (25 at home), while several teammates did quite well too: Fred Pfeffer hit 25 (all at home), Abner Dalrymple hit 22 (18 at home), and Cap Anson hit 21 (20 at home). Overall the White Stockings in 1884 hit 142 HR, 131 in their home park, in a league in which the next-best team HR total was 39. Perhaps someone can verify this, but it's my recollection that after 1884, the team either had the ballpark significantly reconfigured, or moved parks altogether.
dzop,
It sounds like you know a LOT more about pumping iron than I do. After finishing my high school football career - 20 years ago, YIKES! - I have never been a serious weight lifter.
But I wonder if it's still not possible, or even likely, that this is the explanation for Boone: 1) prior to 2001, he was never a serious or dedicated weight lifter; 2) prior to 2001, Boone never lifted for power or muscle mass gain, but rather for building endurance; and 3) prior to 2001, Boone never modified his diet to match a strength gain program, incorporating things like the protein mixes that are sold by EAS and other companies?
People attributing Bret Boone's 2001 season to steroids instead of conditioning, coaching, and skill has nothing to do with statistics. They are two alternative ways of explaining a different level of performance than we're accustomed to seeing from Boone. The only way we know Boone played better than usual in 2001 is by looking at the stats.
To put it simply: the numbers can grasp Boone's improvement. They did. Nowhere in Boone's prior numbers is there any indication of how he would perform in 2001, and there's not supposed to be. They just show how he did in 1992-2000. That's a reasonably good predictor of how a player will perform in 2001, but it is obviously not close to perfect. To blame statistics for not doing what they're not supposed to is ridiculous. And I don't understand how anyone can suggest that mathematical analysis has an "inability to quantify the qualitative shifts in performance boosts." That's exactly what statistics do. When Boone got better, statistics quantified it, regardless of how he did it.
In short, I think Boone will continue to play at an improved level, though probably not as improved as he was in 2001. That some people don't is not a reflection of their preference for statistics over whatever kind of analysis Shirley is suggesting, it is a reflection of their preference for analysis based on the 9 years from 1992 through 2000 over analysis based on 2001.
I don't argue with that at all, especially in regard to baseball. But if you've seen the amazing Jack LaLanne in recent years - I think he's now over 90 years old - it is remarkable how well the body can be maintained if you work at it, as LaLanne has.
Whatever else is going on, the numbers capture very well that Barry Bonds' 2001 was the greatest offensive season in major league history.
No one can predict how many homeruns Barry Bonds will hit in his career - that's not sabermetrics anyway, which focuses on trying to understand what has already happened rather than trying to predict what will happen.
My problem with Shirley's post was not that I think statistics handle everything perfectly; I'm well aware that they have plenty of limitations. I just don't see how saying Boone was taking steroids is the result of mathematical analysis and saying he had better coaching and conditioning isn't.
I agree. I've been doing my best to ignore Shirley's posts; IMO silly sniping between posters is just childish noise that obscures the subject matter that interests me.
But: Shirley's presumptuousness that she knows anything, let alone everything, about "us" is appalling. Shirley's rude and abrasive tone is revolting. And the one snippet she provides about her baseball expertise (the Bonds patience and uppercut business) betrays the most trivial understanding of how the game is played, and her bloated diatribe on the subject of quantitative analysis can be summed up in one brief phrase: numbers don't tell the whole story.
Thanks for that original insight, Shirley. Our lives are now changed.
?? This is not a statement that makes any sense. The past matters *only* in the way it helps us predict the future. "Those that don't remember history are condemned to repeat its mistakes" implicitly means "repeat its mistakes *in the future*". Of course sabermetrics is about trying to predict the future, as are all belief systems, including scientific/numerical or less hard belief systems.
Andy, there's a difference between a prediction like "If A, then B" and a prediction like "B." Sabermetrics is reasonably good at the former. At the latter? Not so good. And I think sabermetrics gets a bad rap from much of the public because people think it is, or should be, doing the latter.
The past matters *only* in the way it helps us predict the future. "Those that don't remember history are condemned to repeat its mistakes" implicitly means "repeat its mistakes *in the future*". Of course sabermetrics is about trying to predict the future, as are all belief systems, including scientific/numerical or less hard belief systems.
Perhaps I'm in the minority, but I have a little more interest in the past than that. In any case, I like the analogy to the history adage, because it illustrates the point that sabermetrics provide about as good a predictor of the future as history books do, which is to say it gives you a good frame of reference to make informed guesses, but not nearly enough to be dependably correct.
Andy James:
If a clearly superior horse runs just well enough to defeat lesser horses, thus earning a win but an unimpressive time, what can we expect that horse to do against a better class of horses? Run faster?
I think that very aptly describes the behavior of the Yankees in the regular season vs. the post season lately.
It may be that some managers are very good at evaluating individual players' talents, while other managers are better at envisioning how those talents fit together.
Very good point. I think that's how the Seattle Mariners won 116 games in 2001.
Perhaps better would be to develop statistical models that don't assume independence. The assumption of independence is becoming rarer in statistical modeling now that (1) there are plenty of models developed that don't require that assumption and (2) fast computers and good software make estimating such models pretty quick and straightforward. I know squat about horse racing, but a given horse's race times can be treated similarly to repeated measures (without assuming equal distance between measures nor the same # of measures per subject) with another nesting factor for race, preferably sufficiently controlled for by using race-specific factors such as weather, "track speed", avg. time of that race, avg. time of horses prior to that race, etc. Gimme the data (including unique horse and race id's) and I can probably do this in about 6 lines of code. :-) Just don't wager on the results.
[though it's been a while since I read it, I recall the piece on pitcher abuse points in last season's BP is an example of where a repeated measures approach should have been taken]
As to predicting Barry Bonds and Luis Gonzalez, and sort of agreeing with one of Mike's comments, sabermetricians (at least "mainstream" ones) don't seem real fond of filling us in on the confidence intervals around their predictions. Hopefully that's not out of their own ignorance but rather a desire to not confuse stat-o-phobic readers. Though I seriously doubt that Bonds and Gonzalez fell within a 95% confidence interval, I suspect the 95% confidence interval around things like EQA or XRuns or RC or whatever your favorite measure is can get pretty broad, especially in a sample of 162 games. That variance may not be constant across position/age either.
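To give a rough feel for how broad such an interval can be, here is a minimal sketch for the simplest case, a rate stat like OBP over one season. The .350 OBP and 650 plate appearances are made-up inputs, and the normal approximation treats every plate appearance as an independent trial, which is exactly the assumption questioned elsewhere in this thread, so read the result as a floor on the uncertainty rather than a real estimate.

```python
import math

def binomial_ci(rate, trials, z=1.96):
    """Normal-approximation confidence interval for a rate (z=1.96 gives ~95%)."""
    se = math.sqrt(rate * (1 - rate) / trials)
    return rate - z * se, rate + z * se

# Hypothetical hitter: .350 OBP over 650 plate appearances (assumed inputs)
low, high = binomial_ci(0.350, 650)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Even under these generous assumptions the interval spans roughly 70 points of OBP for a full season.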
I haven't read the gnarly sabermetric stuff so my following pronouncement is grossly premature and I deserve the negative comments that may be coming my way, but the four primary statistical problems I see with sabermetrics are:
1. The assumption of independence. This is the area where there have been some steps taken (e.g. park effects, era effects) though it's not clear to me that the authors understand they're helping control for the lack of independence. Note, controlling for a lack of independence will generally increase your standard errors which means you're likely to find less significance.
2. The ecological fallacy. This one may not be as bad as I think it is. But near as I can tell, most/all of the formulas (i.e. the weights) used to come up with these player-level comprehensive measures like RC, XRuns, EQA are based on team-level regressions and other team-level analyses. Alas, there need be little to no connection between team-level functions and player-level functions (though in this case, the connection is probably higher than in most cases). Note, this is related to point #1 above. If I'm way off-base here, I apologize...and I'd be curious as to what the dependent variable is in the models underlying the weights.
2a. Near as I can tell, the quality of these player-level comprehensive measures, especially the ones measured in runs, is that when you sum up the predicted runs for the players on a team in a given season, you get a number close to the actual number of runs for the team. If I correctly understand how these weights were derived, a close match to the aggregate is to be expected and doesn't tell you much about the quality of your player-level model (though it's certainly better than being off by a lot). At the very least, even if the weights are correct, the amount of error at the player-level is likely much higher than the error at the team level. Also, under- and over-estimates of the team total based on player-level estimates are not necessarily evidence of a team over- or under-performing their expectations, they may be evidence of the error in the player-level model.
3. Confidence intervals.
4. Lack of training in measurement theory and modeling. Sabermetricians are hardly the only ones seriously lacking here.
I wasn't making myself at all clear in my last post. I apologize.
Sabermetrics is about understanding what has happened. We study past baseball games and their records to try to deepen our understanding of the game.
That is the essential work of sabermetrics. We can then use this knowledge to help predict the future IF WE WANT TO. Your "condemned to repeat history" analogy doesn't apply to baseball because we are not talking about some sort of moral imperative to avoid catastrophe (I always thought that that phrase was 70% hokum anyway). Sabermetrics is absolutely meaningless outside of the confines of its narrowly defined universe (I speak here of baseball and its rules of play).
Many years ago there were studies done that strongly suggested that offenses based on power and walks generally score more runs than those based on batting average and speed. By the standards of sabermetrics this is extremely useful knowledge, but it rests on the assumption that the conditions which produced this phenomenon in the past are maintained in the future. There is no guarantee that they will be maintained.
The primary focus of sabermetrics is studying baseball in order to understand it - what anyone does with this knowledge is their own business and there certainly is no mandate that it has to have practical consequences.
A formula like XRUNS is developed at the team level. The true weights of the offensive events are not static but change over time as general offensive conditions change. The main reasons why team XRUN predictions miss their mark are (a) differences in clutch performance (usually assumed to be due to chance, but lineup design can have an effect here) and (b) elements like baserunning and moving runners via outs which are not captured in the formula.
On a player level, there is no "standard" by which to judge accuracy and that is because runs are scored on a team level. A player's contribution to team runs is also affected by lineup position and how many teammates are on base (batting a lot with 2 out and no one on lowers the value of your offensive performance).
Nevertheless, in a linear formula, the sum of the contribution of each player (as captured by the formula elements) does a good job of estimating the number of runs a team will score.
In my opinion, we are very near the limit of what we can do with the traditional stats. To get more accurate, we will have to use play-by-play data in conjunction with computer modelling of offenses.
Andy James,
I've never been much of a horseracing fan. But my grandfather - who was indeed a serious horse gambler - told me something to keep in mind about jockeys: they are often a signal as to what kind of horse you are looking at, particularly in minor races. My grandfather was born in Poland in 1886 and died in San Francisco in 1980. So, in his day, if Eddie Arcaro or Willie Shoemaker or later Laffit Pincay, Jr. were on the mount, that would be a good signal, barring other information, that you're looking at a very good horse. The few times I've been to Golden Gate Fields or Bay Meadows, I've done the same when Russell Baze is on a horse. Chances are, it's a damn good horse. Baze has been too successful for him to get on a bad horse. And the owners and trainers of the best horses tend to seek out a quality and known jockey like Russell Baze.
And, one hopes, ladies as well.
Yes, but I believe this is the expected case whether the given formula does a good job of estimating individual "contribution" or not. One point I should have made before is it depends a lot on what you use these for. If all you want is a prediction of how many runs a team will score, then this does a good job. But if you want to assess a given player's contribution or compare individual players, which happens all the time, then this may not do a good job.
That may just sum up my general criticism. From what I've seen, team- and league-level understanding and precision of prediction is very good. At the player level, the quality of the measures is quite unclear. Many mainstream sabermetric writers (Neyer, BP) and many of the posters here (including myself) have a tendency to make pronouncements about players that sound like they're based on measures as accurate as at the team level. And that's something we just don't know. But we do know that in most cases where you have people nested within groups (players on teams), the group-level variation is tiny compared to the individual-level variation. So even if the weights are roughly correct, the amount of error at the player-level is likely 3-9 times greater than at the team level.
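One way to see where a factor of about 3 could come from is a toy simulation. Everything in it is an assumption made for illustration: nine regulars who each truly produce 80 runs, and an estimate for each player that misses by independent noise with a standard deviation of 10 runs. Under those assumptions the errors partly cancel when summed, so the relative error for one player comes out about sqrt(9) = 3 times the relative error for the team total.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

PLAYERS = 9        # assumed number of regulars
TRUE_RUNS = 80.0   # assumed true runs produced per player
NOISE_SD = 10.0    # assumed player-level estimation error (runs)
SEASONS = 20000    # number of simulated team-seasons

player_rel_err = 0.0
team_rel_err = 0.0
for _ in range(SEASONS):
    errors = [random.gauss(0, NOISE_SD) for _ in range(PLAYERS)]
    # relative error for one player vs. relative error for the team sum
    player_rel_err += abs(errors[0]) / TRUE_RUNS
    team_rel_err += abs(sum(errors)) / (TRUE_RUNS * PLAYERS)

player_rel_err /= SEASONS
team_rel_err /= SEASONS
ratio = player_rel_err / team_rel_err
print(f"player: {player_rel_err:.3f}  team: {team_rel_err:.3f}  ratio: {ratio:.2f}")
```

If the player-level errors are correlated, or the weights themselves are off at the player level, the gap can be larger than this independent-noise case suggests.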
On a player level, there is no "standard" by which to judge accuracy and that is because runs are scored on a team level. A player's contribution to team runs is also affected by lineup position and how many teammates are on base (batting a lot with 2 out and no one on lowers the value of your offensive performance).
The lack of a "standard" is a big problem (see measurement theory). But I'm not sure that we can't use runs (or some combo of runs and rbi). I know these are highly context dependent, but it may be possible to control for some/most of that context.
At the very least, a model along the lines of a "multi-level" linear regression (aka fixed & random effects model, hierarchical linear model, variance components, mixed models, etc.) should be possible. Such models simultaneously estimate group- and individual-level regression models, where the dependent variable of the group-level model is essentially the mean of the dependent variable of the individual-level model. At the group-level we'd have team-level independent variables like OBP and SLG (or walk rates, single rates, double rates, etc.); at the individual-level you have player-level independent variables, measured as deviations from the team mean where appropriate. Such things as average # of runners on base when a player batted or the average OPS of the three batters behind him could be included in the model.
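As a sketch of just the data-preparation step described here (not the model fit itself), the team-level predictor is the team mean and the player-level predictor is each player's deviation from that mean. The teams, player names, and OBP values below are hypothetical, and `center_by_team` is a helper invented for this illustration.

```python
from collections import defaultdict

# Hypothetical player rows; values are made up for illustration
players = [
    {"team": "A", "name": "a1", "obp": 0.400},
    {"team": "A", "name": "a2", "obp": 0.340},
    {"team": "A", "name": "a3", "obp": 0.310},
    {"team": "B", "name": "b1", "obp": 0.360},
    {"team": "B", "name": "b2", "obp": 0.320},
]

def center_by_team(rows, key):
    """Return (team means, rows with an added '<key>_dev' deviation column)."""
    by_team = defaultdict(list)
    for row in rows:
        by_team[row["team"]].append(row[key])
    means = {team: sum(vals) / len(vals) for team, vals in by_team.items()}
    centered = [dict(row, **{key + "_dev": row[key] - means[row["team"]]})
                for row in rows]
    return means, centered

team_means, centered = center_by_team(players, "obp")
for row in centered:
    print(row["team"], row["name"], f"{row['obp_dev']:+.3f}")
```

The team means would feed the group-level equation and the deviations the individual-level one; the open question raised in the next paragraph, what to use as the dependent variable, is untouched by this step.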
Now I say "should be possible" cuz I've been tossing this idea around in the back of my head for a couple years now and still haven't decided on the right operationalization of the dependent variable.
In my opinion, we are very near the limit of what we can do with the traditional stats. To get more accurate, we will have to use play-by-play data in conjunction with computer modelling of offenses.
I'm not so sure we're at that limit, seems to me there's lots that could be done with fancier econometric models and such (though whether worth the effort is a good question). But I'd agree that with the play-by-play data available, there's no reason not to move on. Without question, there's a data revolution.
By the way, having mildly dissed sabermetricians, I should certainly have pointed out that I am very impressed with the care and thought that folks put into these studies. I've sometimes got questions as to whether context and other statistical issues are addressed properly, but folks are certainly keenly aware of context and go to considerable lengths to try to correct for it. Moreover, the statistical independence of many events has been studied -- not sure authors always realized this, but studies debunking "hot streaks", "clutch hitting", some of Voros' DIPS work can also be viewed as studies documenting the independence of plate appearances, games, seasons, etc. for at least some phenomena. Overall I'm quite impressed with sabermetricians' creativity, attention to detail, and willingness to address alternative hypotheses. With more stat training, a lot of you folks would be much better at my job than I am (some of you I assume already are :-).
what walt is trying to say (i think) is that you can't start with a *team* stat (i.e., Runs or Runs Allowed) find a statistical model that uses other *team* stats (i.e., OBP, SLG, BB, etc.) to accurately predict the original stat, then extrapolate from the *essentially TEAM-based* statistical model to the individual level without a loss of accuracy.
In other words, someone creates a model, say XR (JimFurtado, correct?), that takes TEAM H, TEAM BB, etc, and relates all those stats to TEAM runs scored. If you take that model and apply it to INDIVIDUALS you are, essentially, misapplying it. That doesn't mean it isn't useful, it just means you are losing accuracy and statistical power when you do that.
When you say that, based on XR values, a player "created" 100 runs in a season, what you really are doing is dividing up a TEAM's runs among the team members based on a statistical model that works very well on a TEAM level. This would not pass statistical muster with anyone who knows their statistical stuff.
Runs created when applied to individuals, for instance, treats Barry Bonds' home runs as more valuable than Dante Bichette's, because Bonds is treated as driving in Bonds (with his high OBP), while Bichette is treated as driving in Bichette (with his low OBP). That's obviously wrong. Now, if a team has a high OBP, its HRs *are* more valuable. But that's not true for an individual player.
What happened?
I think part of it might be that Bonds hit a lot of solo homers and the guys hitting behind him didn't produce. But could it possibly explain everything?
You miss probably the most important aspect of OBP, avoiding outs. But setting that aside, Bichette's home runs are not more or less valuable than Bonds'; his whole package is less valuable than Bonds' because a low OBP destroys the continuity of the offense. The OBP of every batter in the offense is by far the most important measure of what they contribute to the team's offense.
You misunderstand me. You're preaching to the choir on the importance of OBP. My comments are directed to a specific formula, runs created. (As MikeEmeigh points out, my comments are directed at the original runs created methodology. I never bothered to look at Bill James "new" formula introduced a couple o' years ago.) Of course Bonds is more valuable than Bichette, and of course Bichette's HRs aren't more valuable than Bonds' HRs. But the runs created methodology *treats* Bonds' HRs as more valuable.
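The effect is easy to check with the basic version of the original formula, RC = (H + BB) x TB / (AB + BB). The two stat lines below are hypothetical (a high-OBP hitter and a low-OBP hitter, not the actual Bonds or Bichette numbers); the sketch turns one out into a home run for each and compares how much RC goes up.

```python
def runs_created(ab, h, bb, tb):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)

def marginal_hr(ab, h, bb, tb):
    """RC gained by turning one out into a home run (AB unchanged)."""
    return runs_created(ab, h + 1, bb, tb + 4) - runs_created(ab, h, bb, tb)

# Hypothetical stat lines, chosen only to contrast high and low OBP
high_obp = dict(ab=500, h=150, bb=150, tb=350)
low_obp = dict(ab=600, h=170, bb=25, tb=300)

gain_high = marginal_hr(**high_obp)
gain_low = marginal_hr(**low_obp)
print(f"extra HR for high-OBP hitter: {gain_high:.2f} RC")
print(f"extra HR for low-OBP hitter:  {gain_low:.2f} RC")
```

The same extra home run is credited with more runs for the high-OBP hitter, purely because the formula multiplies his own on-base total by his own total bases. A linear formula would credit both hitters with the same fixed weight per home run.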
"What happened?"
Believe me, watching the team every day, I noticed, I noticed. It was like toothache pain: you want to ignore it, but somehow you just can't.
The underefficiency of the Giants' offense was a perfect storm: combine a leadoff spot with an abysmal OBP (and a fairly high number of HRs), with a #2 hitter who hits lots of singles, doubles, and homers, but rarely walks (meaning he leaves the bases empty a lot, and when he doubles, guess what happens to the #3 hitter!), the most extraordinary season in our lifetimes batting third, and #4 and #5 hitters who perform dismally with runners on base (check out what Kent, Russ Davis, and Snow did in such situations; the only guy who did all right with men on base was Galarraga).
I suspect if you played the season over with exactly the same cast of characters, the #4 and #5 hitters would perform more normally and drive some runs in once in a while, and it wouldn't be as extreme. But it does illustrate what's been talked about here quite a bit: simply computing a team's expected runs on the basis of its team totals doesn't work well if the team accumulated those totals in a very irregular pattern. Since the bulk of the Giants' HRs and BBs were coming from the same spot in the order, their offense was bound to be as inefficient as possible.
Bob Mong hits the nail pretty much right on the head.
It's called the "ecological fallacy" (among other names too no doubt). Relationships (correlations, regressions, etc.) among variables at the aggregate level need be in no way related to relationships among those same variables at the individual level. You can find a pretty clear treatment of it in the first few pages of this link before it gets into more statistical modeling questions.
One of the examples in the piece is pretty powerful. He looks at the %age of native- and foreign-born residents (of states) with an income greater than $50,000. Using individual-level census data, the real %ages are estimated as 35% for native-born and 28% for foreign-born. The "ecological regression" (i.e. regressing state %'s with $50,000 income on %foreign-born) yields estimates of 26% with high-income for native-born and a whopping 85% with high-income for foreign born.
To take an example not found in that piece: if you correlated a state's percentage of black voters with its percentage of votes for Republican presidential candidates (this was true at least as of 1992), you'd find a positive correlation. Yet if you look at individual polling data, you'd find that something like 85-90% of black voters support Democratic presidential candidates. In this case, part of the explanation is obvious -- Southern states are still the ones with higher percentages of black voters but they also have lots of conservative whites.
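The mechanism can be reproduced with a tiny constructed dataset. The numbers below are invented, not census or polling data: four equal-sized states, a group G that backs party R at 10% everywhere, and a rest-of-electorate R rate that happens to be higher in the states where G is larger.

```python
STATE_SIZE = 1000  # assumed equal-sized states

# Hypothetical states: share of voters in group G, and R support by subgroup
states = [
    {"group_share": 0.10, "rate_group": 0.10, "rate_other": 0.45},
    {"group_share": 0.20, "rate_group": 0.10, "rate_other": 0.55},
    {"group_share": 0.30, "rate_group": 0.10, "rate_other": 0.65},
    {"group_share": 0.40, "rate_group": 0.10, "rate_other": 0.75},
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Aggregate view: state's G share vs. state's overall R share
shares = [s["group_share"] for s in states]
overall_r = [s["group_share"] * s["rate_group"]
             + (1 - s["group_share"]) * s["rate_other"] for s in states]
state_level_r = pearson(shares, overall_r)

# Individual view: pooled R support inside and outside group G
g_voters = sum(STATE_SIZE * s["group_share"] for s in states)
g_for_r = sum(STATE_SIZE * s["group_share"] * s["rate_group"] for s in states)
o_voters = sum(STATE_SIZE * (1 - s["group_share"]) for s in states)
o_for_r = sum(STATE_SIZE * (1 - s["group_share"]) * s["rate_other"] for s in states)
individual_group_rate = g_for_r / g_voters
individual_other_rate = o_for_r / o_voters

print(f"state-level correlation: {state_level_r:+.2f}")
print(f"R support in G: {individual_group_rate:.0%}, outside G: {individual_other_rate:.0%}")
```

At the state level, a bigger G share goes with a bigger R vote, even though individual members of G support R far less than everyone else does.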
And we came up with a possible case in the Yankees payroll thread. Robert found what appeared to be substantial correlation between payroll disparity and performance disparity measured at the level of MLB. You found little correlation between payroll and performance at the team level. And I bet if we looked at the correlation between salary and performance at the individual level, we'd find yet another relationship (or at least a whole lot more variability and error).
Anyway, at a minimum, making inferences from team to player results in aggregation bias, and that's if the relationships among the variables are relatively constant between the levels. That is, at the very least, you're ignoring individual-level error.
However, there's no statistical reason that the relationships at the individual level even have to be in the same direction much less of equal impact, though in this case I think it's safe to say that a double is positively related to productivity at both the team and player level.
But it is certainly possible that an HR may have a different weight at the two levels -- i.e. it may have a greater impact on individual productivity than on team productivity. Conversely, walking may not increase the productivity of a player by much, but it advances runners and creates more plate appearances and therefore at the team-level the walk may be more important.
All of that is speculation but so is assuming that the weights are the same. My point is that it's well-known among statisticians that applying aggregate-level models to individual-level data can lead to serious problems in terms of accuracy and bias. From what I've seen, this is not well-known among sabermetricians.
it has an undefined error.
A lot depends on the batting order. A walk has a greater value (relative to a homerun, for example) for a leadoff hitter than for a #3 hitter because the #1 hitter will be batting more often leading off an inning where the value of a walk versus homerun is maximized. Conversely a walk is least valuable versus a homerun when there is a man on 2nd or men on 2nd and 3rd with 2 outs. In such situations, a walk adds relatively little to run expectation but a homerun adds a great deal. This can all be garnered from a base-out run expectation table.
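Here is a sketch of that calculation. The run expectation numbers are rounded placeholders assumed for illustration, not a published table, so only the direction of the comparison should be trusted. The value of an event is the runs that score on the play plus the change in run expectation.

```python
# (runners, outs) -> expected runs for the rest of the inning.
# ILLUSTRATIVE placeholder values, not taken from any real table.
RUN_EXP = {
    ("---", 0): 0.50, ("1--", 0): 0.90,
    ("---", 2): 0.10, ("-2-", 2): 0.33, ("12-", 2): 0.45,
    ("-23", 2): 0.60, ("123", 2): 0.78,
}

def event_value(before, after, runs_scored):
    """Runs scored on the play plus the change in run expectation."""
    return runs_scored + RUN_EXP[after] - RUN_EXP[before]

situations = {
    # name: (state before, state after walk, runners who score on a HR)
    "leading off":        (("---", 0), ("1--", 0), 0),
    "man on 2nd, 2 out":  (("-2-", 2), ("12-", 2), 1),
    "2nd and 3rd, 2 out": (("-23", 2), ("123", 2), 2),
}

ratios = {}
for name, (before, after_walk, on_base) in situations.items():
    walk = event_value(before, after_walk, 0)
    # a homerun clears the bases: batter plus all runners score
    homer = event_value(before, (("---"), before[1]), 1 + on_base)
    ratios[name] = walk / homer
    print(f"{name:20s} walk {walk:.2f}  HR {homer:.2f}  walk/HR {ratios[name]:.2f}")
```

With these placeholder values the walk is worth a much larger fraction of a homerun leading off an inning than in either two-out situation.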
XR and formulas like it are not designed to deal with actual value but rather theoretical value. They ask what a player's value would be in a lineup neutral context, and even then, the more extreme the player the less certain we can be about the run estimate.
The example of the 2001 SF Giants is apt. Barry Bonds is estimated by the formula to have about 185 XR in 2001, but it is highly unlikely that that is how many runs he actually created for his actual team (since the Giants scored almost 80 runs less than expected by the formula).
I believe that the three main reasons for this are:
1) lack of an effective leadoff hitter (.315 OBP) meant that there were not a lot of men on base when Bonds was batting (thus limiting the value of his homeruns) and he was more likely to be batting with 1st base open (thus limiting the value of Bonds' walks);
2) while the number 4 hitter for SF hit well (.372 OBP, .530 SLG - Bonds batted in this spot for several games), the #5 hitter was abysmal (.321 OBP, .393 SLG). This meant that Bonds was not as likely to score after reaching base (except on homeruns) as a typical National League #3 hitter. The Giants' #3 hitter reached base via walk and non-homerun base hit 284 times and scored 71 runs (when not hitting a homer) - scoring exactly 25% of the time. The average National League #3 hitter reached 238 times via hit or walk not counting a homerun and scored 79 times (when not hitting a homerun) - a 33.2% rate.
3) Concentrating a lot of your production in one spot probably leads to inherent offensive inefficiencies as alluded to by Steve Treder.
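The arithmetic in point 2 can be pushed one step further as a back-of-the-envelope check. The extra step is an assumption not made in the post: that the league-average scoring rate would have applied to all 284 times the Giants' #3 hitter reached base.

```python
# Figures quoted in point 2 above
giants_on_base, giants_scored = 284, 71
league_on_base, league_scored = 238, 79

giants_rate = giants_scored / giants_on_base   # share of times on base that became runs
league_rate = league_scored / league_on_base

# Assumption: apply the league rate to the Giants' times on base
runs_at_league_rate = giants_on_base * league_rate
shortfall = runs_at_league_rate - giants_scored

print(f"Giants #3 rate: {giants_rate:.1%}, NL #3 rate: {league_rate:.1%}")
print(f"runs at league rate: {runs_at_league_rate:.0f}, shortfall: {shortfall:.0f}")
```

Under that assumption the gap in scoring rate works out to roughly 23 runs, which would account for a sizable part, though not all, of the team's shortfall against the formula.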
To summarize, through no fault of his own (he hit well with men on base) Bonds' actual contribution to team runs was vastly overstated by XR and would be by any linear weights formula.