User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Dialed In - Thursday, March 18, 2004
Is the talent level in baseball really higher than it was 50 years ago? DOMINATE!
There have been some outstanding discussions in Clutch Hits lately regarding the greatest player ever and the differences in eras.
There's little to debate with respect to whether or not humans are in better health or athletes are bigger and faster and stronger. What is debated is how much this evolution matters in discussing the quality of the players in Major League Baseball.
Is Barry Bonds the greatest player ever? Can Babe Ruth's dominance be washed away amid "timeline adjustments"? Did Ted Williams only appear to be so great because the other players in his leagues were so "not great"? Are today's average players, like Sean Casey, better than yesteryear's stars, like Hank Greenberg?
With all the steroid talk, does that dampen the enthusiasm with which we proclaim today's athletes better due to evolution, rather than pharmaceuticals?
Stephen Jay Gould wrote a piece some time ago on why we don't see .400 hitters any longer. It was first published as "Entropic Homogeneity Isn't Why No One Hits .400 Any More" in Discover, August 1986. The version Mark Field, who is a clutch hitter, was kind enough to share with me is: Gould, S. J., Full House: The Spread of Excellence from Plato to Darwin. New York: Harmony Books, 1996.
Gould reasons: As quality of baseball play has improved, it has also become far less variable. "Declining variation arises as a general property of systems that stabilize and improve while maintaining constant rules of performance through time. The extinction of .400 hitting is, paradoxically, a mark of increasingly better play." "Standard deviations have been dropping steadily and irreversibly, reaching a stable plateau by about 1940."
That is a good assessment of what it looks like has happened.
However, Gould's findings are often exaggerated. Gould says the variation in MLB talent has effectively "stabilized" since 1940. That means, to me, that the talent level of the 1940s is roughly similar to the talent of today.
I'm a skeptic, so I want to see for myself.
Obviously, one of the finest tools in the world, the Lahman database, allowed me to look at different eras and league wide values.
Here's the argument, as I hear it, regardless of what has been said:
It is harder to dominate today because the overall quality of play is better.
From Gould, "this decline [in variation] produces a decrease in the difference between the average and stellar performance. Therefore, modern leaders don't stand so far above their contemporaries."
So players like Bonds or ARod, who dominate as much as they do, are better than comparably dominant players from older times, be it Williams, Ruth or Honus Wagner. This is not to say those aren't great players, but that they aren't as good as Bonds.
So how do we test this? Clearly, we can't test it directly, as Ruth is 108 years old, and well, dead. We haven't perfected time travel, Jack the Ripper notwithstanding, so the 2003 Devil Rays cannot go back in time to play the 1927 Yankees. So what we look at can only be used as evidence; some of you will find it more compelling than others will. Most of you have an opinion, and you will say, "It's flawed" because it doesn't agree with your preconceptions. That and I'm sure I'm not doing everything correctly, so the work is flawed.
As Gould did, we're going to look at standard deviations of the leagues.
Let's look at the inferences we can start with, both from Gould and arguments made in Clutch Hits:
Here are the data treatments: Plate appearance cut-offs were assigned in an effort to remove pitchers from the hitting statistics. It was initially suggested by Mark Field to use a very low number of PAs, like 10, but it is apparent that pitchers should be removed from the section.
To do that, I set the cut-off at "where pitchers start showing up". While looking over the data, I could find Bob Feller with 140 plate appearances, so I had to raise the cut-off to 150 PAs. This affects the selectivity slightly because it means a player had to be good enough to warrant 150 plate appearances. Pitchers have gotten fewer and fewer PAs as time has passed, going from 163 in the 1900-1920s to merely 105 in 1983-2002.
I summarized the data for every five seasons. There are a few exceptions to five seasons (4 seasons or 6 seasons) where I knew there would be specific changes in the game (like expansion or the DH).
I also split the leagues.
The Eras and player counts:
This can be altered if the peer review thinks it is necessary.
I looked at each league's batting average (BA), on-base percentage (OBP), slugging percentage (SLG), and isolated power (ISO). I calculated the league average for the sample I selected and calculated the standard deviation on the same sample.
The NL chart shows just what Gould observed: since the mid-to-late thirties, the standard deviations of batting average and on-base percentage have been flat, really flat.
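For anyone who wants to repeat that step, here is a minimal sketch of it in Python. It is not the spreadsheet work behind the charts. The player rows are invented for illustration (real input would come from the Lahman database), and two choices are my assumptions rather than anything stated above: the league average is taken as the unweighted mean of the per-player rates, and the SD is the sample standard deviation.

```python
from statistics import mean, stdev

PA_CUTOFF = 150  # the cut-off the article settles on to screen out pitchers

# Hypothetical player-seasons for one league and era (invented numbers, not Lahman data).
players = [
    {"name": "Hitter A", "pa": 600, "ab": 540, "h": 162, "bb": 55, "hbp": 3, "sf": 2, "tb": 290},
    {"name": "Hitter B", "pa": 560, "ab": 500, "h": 135, "bb": 50, "hbp": 5, "sf": 5, "tb": 210},
    {"name": "Hitter C", "pa": 550, "ab": 500, "h": 125, "bb": 45, "hbp": 2, "sf": 3, "tb": 180},
    {"name": "Pitcher D", "pa": 90, "ab": 80, "h": 8, "bb": 4, "hbp": 0, "sf": 0, "tb": 9},
]


def rates(p):
    """Return BA, OBP, SLG and ISO for one player row."""
    ba = p["h"] / p["ab"]
    obp = (p["h"] + p["bb"] + p["hbp"]) / (p["ab"] + p["bb"] + p["hbp"] + p["sf"])
    slg = p["tb"] / p["ab"]
    return {"BA": ba, "OBP": obp, "SLG": slg, "ISO": slg - ba}


def league_summary(rows, pa_cutoff=PA_CUTOFF):
    """Map each stat to (mean, sample SD) over players at or above the PA cut-off."""
    sample = [rates(p) for p in rows if p["pa"] >= pa_cutoff]
    return {
        stat: (mean(r[stat] for r in sample), stdev(r[stat] for r in sample))
        for stat in ("BA", "OBP", "SLG", "ISO")
    }


for stat, (avg, sd) in league_summary(players).items():
    print(f"{stat}: mean={avg:.3f} sd={sd:.3f}")
```

Running `league_summary(players, pa_cutoff=0)` lets the pitcher back into the sample and inflates the SDs, which is the reason for the cut-off in the first place.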
But look at slugging (and thus ISO). Slugging SDs have been on the rise, overall. Today is the highest ever. And by a lot.
So what do we have? The highest variance in slugging percentage ever. Much higher than when Ruth was in the game. So what can we take from this data?
Looking back at our four (essentially two and their converses) assumptions, it appears that a player in today's game has the greatest potential to "outpace" the league in slugging percentage in baseball history.
Has the overall talent in the league worsened with respect to slugging?
That doesn't seem very likely if athletes are bigger and stronger. So what are other possible explanations?
Regardless of the explanations (and excluding steroid use), it does not appear that MLB hitters have decreased the variance of batting average, on-base percentage, slugging percentage or isolated power since the late 1930s. Does variation, or lack of it, explain what Gould and others claim?
I know the gloves have made a big difference in what becomes a hit and not. I also know that better quality fields have made a difference. I am pretty sure the changes in parks have made it easier to get hits.
One thing I generally hear is that Camden Yards is a "pitchers' park". That's clearly nonsense. It can be called a "pitchers' park" compared to the other parks in the AL today, but it is a massive hitters' park compared to the park it replaced, Memorial Stadium. What does that also mean? That means the other AL parks have gotten significantly easier for hitters. Enough to drive OPACY from a 104 park factor (1993) to a 95 park factor (2003) (The Baseball Encyclopedia, 2004 ed. Palmer, Gillette, et al).
I also am not touching on things like sliders - mostly because I believe that sliders and forkballs (which I threw as a 12-year old) and nickel curves and hard curves and slip pitches and heavy fastballs were just as common, in some form, in the 1930s and 1940s as they are today.
While I believe humans are bigger and stronger, the skill set required to excel at major league baseball is eye-hand coordination and throwing a baseball in an extremely unnatural manner. I know of no evidence that evolution would provide for these things, in any significant manner, in the societies we have developed. And the pool of MLB players has always been at the far right of the spectrum, so change there is very, very slow.
The best pitchers of the last decade haven't been "physical specimens" like the hitters have. Randy Johnson, while tall, is lanky. Greg Maddux is about as "average build" as they come. Pedro Martinez is average size for a 1920s player. In my opinion, that certainly means that the pitchers of Williams' era could have been just as good.
In the end, Gould's work supports those saying the difference between Ted Williams' era and Barry Bonds' is not significant. The data I accumulated indicate as much. Players today simply do not appear to outstrip the players of a generation ago, and when your dad says Mickey Mantle was the best player he ever saw, he might be right.
AL chart:
About Baseball Think Factory | Write for Us | Copyright © 1996-2021 Baseball Think Factory
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. Nick S Posted: March 18, 2004 at 03:16 AM (#615138)First let me give a little more background on this. Chris linked to the "greatest ever" thread (#10747). Two other recent threads raised this issue, #9600 and 9608. Both are linked in 10747, so I'm lazily not going to provide them here.
Gould's article can be found in a few of his books. If anyone does not have access to them, I have a copy and can fax it to you. I also strongly recommend an earlier analysis of Gould's study by Ken Adams, which can be found at http://www.bigbadbaseball.com/plus192837/inside.html
I think Chris summarized Gould's argument well, but I'll do it again briefly. If you visualize a normal curve, the right tail represents the extreme of human performance. When we say Ruth contributed X runs above average, we are in effect saying that he was X far out on that right tail. Another way to say this is that he exceeded the mean by X standard deviations.
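To put a number on "exceeded the mean by X standard deviations", here is a small sketch. The league means, SDs and batting averages are hypothetical, chosen only to show that the same standardized distance can correspond to very different raw margins.

```python
def z_score(value, league_mean, league_sd):
    """How many standard deviations a value sits above (or below) the league mean."""
    return (value - league_mean) / league_sd


# Hypothetical leagues with the same mean BA but different spread.
wide_era = {"mean": 0.270, "sd": 0.040}
narrow_era = {"mean": 0.270, "sd": 0.030}

# A .390 hitter in the wide era and a .360 hitter in the narrow era
# both sit about three SDs out on the right tail.
print(z_score(0.390, wide_era["mean"], wide_era["sd"]))
print(z_score(0.360, narrow_era["mean"], narrow_era["sd"]))
```

If the spread really has compressed, the raw gap between a star and the average will understate how far out on the tail the later player is.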
The problem in comparing eras is that there is some reason to believe the mean may have shifted to the right. We see this in track, for example. The winning times for the 100 in 1910 wouldn't even get to the Olympics by 1960. The same is true for other events.
No problem, you might say -- the modern athletes will still exceed the new mean by the same number of standard deviations as the older ones. Except they can't. There are physiological limits to human performance. Those on the extreme right tail are bumping up against those limits. If the mean moves to the right, the existence of physiological limits means that the distance between the mean and the right tail is compressed. It's not possible to exceed the mean by as much as it used to be. Thus, Ted Williams won't exceed the mean by quite as much as Ruth did, even if he was, in fact, a greater hitter.
Gould concluded that this was actually what was happening. He argued that we could test the movement of the mean to the right by the amount of variance. If variance declines, then we're seeing the effect he predicted. Chris's data seem inconsistent with Gould's conclusion, at least in the sense that competition levels seem to have peaked by the 50s and stayed constant since.
I think the most important point in opposition to Chris's conclusion -- a point Adams made in analyzing Gould and that exposeur has already mentioned above -- is that variation can increase for reasons other than level of competition. Specifically, variation can increase when the run environment is higher and decrease when it's lower. There is another, more technical issue which I'll raise below, but I'll talk about runs first.
The way I see it is this. Offenses are constructed differently now than they were in baseball's early years. Early offenses relied much less on HRs and much more on other aspects of offense. This is obvious to us all and I won't bother to document it. In addition, however, offense has steadily come to rely much more on walks. In 1901 in the NL there were .069 BB/AB. In 1931 there were .081, in 1961 there were .095, in 1998 there were .197 (!).
This tandem increase in HR and BB means that both OBA and ISO/SLG are going to increase as well as long as BA remains within a fairly narrow range (it has). That increase will also increase variation in those measures. In addition, rising and falling run environments will also increase or decrease variation. These are what I think Chris has actually measured, not the level of competition.
Now let me raise the technical issue. When MLB integrated, it brought together 2 previously separate populations. I know just enough statistics to be dangerous, but I do remember that combining 2 populations should increase variance unless the 2 populations are both normally distributed around the same mean (I assume both Gould's and Chris's data were normally distributed). Assuming I'm correct, I wonder if this is true. Integration in the NL brought in a disproportionate number of black superstars in the 50s and 60s. It wasn't so much that the two populations were combined at all levels, it was that the right tail blacks were integrated into the league. This, it seems to me, should itself have increased variation. Can someone more knowledgeable about statistics weigh in here?
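On the general statistical point, the variance of two pooled groups can be written down directly, and the identity does not depend on either group being normally distributed. A sketch, with invented batting averages standing in for the two populations: the pooled variance is the weighted average of the two variances plus a term that grows with the gap between the two means, so pooling adds variance only when the means differ. This covers only the simple pooling case, not the "right tail only" situation described above.

```python
from statistics import mean, pvariance


def mixture_variance(w1, mu1, var1, mu2, var2):
    """Population variance of a pool made of group 1 (weight w1) and group 2 (weight 1 - w1)."""
    w2 = 1.0 - w1
    return w1 * var1 + w2 * var2 + w1 * w2 * (mu1 - mu2) ** 2


# Hypothetical batting averages for two equally sized groups.
group_a = [0.250, 0.270, 0.290]
group_b = [0.280, 0.300, 0.320]

w_a = len(group_a) / (len(group_a) + len(group_b))
by_formula = mixture_variance(
    w_a, mean(group_a), pvariance(group_a), mean(group_b), pvariance(group_b)
)
by_pooling = pvariance(group_a + group_b)  # computed straight from the pooled data

print(by_formula, by_pooling)
```

With equal means the last term vanishes and the pool is no more variable than its parts.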
Lastly, I do believe that we can test Gould's theory. I think that using OPS+ data will eliminate most of the noise -- i.e., the change in variation due to changes in the run environment -- from the data. I don't know where to get this information in easily usable format (Excel). If anyone does, feel free to let us know.
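A rough sketch of what that check might look like. It uses the commonly cited form of OPS+, 100 * (OBP/lgOBP + SLG/lgSLG - 1), and deliberately leaves out the park adjustment, so it is a simplification and not the published figure. The hitters and league rates are hypothetical.

```python
from statistics import stdev


def ops_plus(obp, slg, lg_obp, lg_slg):
    """Simplified OPS+ with no park adjustment; 100 means league average."""
    return 100.0 * (obp / lg_obp + slg / lg_slg - 1.0)


# Hypothetical (OBP, SLG) pairs for one league and era.
hitters = [(0.400, 0.550), (0.340, 0.430), (0.310, 0.380)]
lg_obp, lg_slg = 0.335, 0.420

values = [ops_plus(obp, slg, lg_obp, lg_slg) for obp, slg in hitters]
print(values)
print(stdev(values))  # the spread to compare from era to era
```

Because every era is rescaled to 100, the SD of these values should be less sensitive to the run environment than the SD of raw SLG or ISO.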
Is this right? If so, why would it only affect Camden Yards? We are certainly not seeing huge shifts in other parks that have remained unchanged over the past decade in the AL East (Yankee Stadium, Fenway, Skydome). Meanwhile, some stadia are playing more like a hitters park than they used to.
How can you extrapolate from one park, when the other parks aren't showing the same changes?
PS -- I apologize if my link tags don't work. I am new at this.
(continued)
Also, I have another set of posts that look at the dispersion of pitching talent over the past dozen seasons. I think this might interest you.
Sorry for the goof.
So wouldn't one also want to measure the variance in some sort of comprehensive metric that is normalized for its offensive context, e.g., modifications to OPS as posed by Tango's "OPS begone" and the "How full of S is OPS" article linked in Primate Studies? (Or perhaps EQA, but I'm not as familiar with that metric.)
2) Bob Twining makes a lot of good points. It's hard to tell, without a lot of research, whether the pool of possible players has increased or decreased from a given time period. The US population has risen quite a bit, but then, participation in baseball has shrunk. However, blacks can now play MLB, and Latin American countries now contribute large numbers of players. But which period ends up with the largest pool? Now? Or the 60's or 70's, when you still had high levels of US participation, plus blacks, plus the first influx of Latin American players? It's too hard to tell without a lot of research.
And, not that anyone in this article or thread was arguing this, but there's often a very generalized evolutionary assumption re. sports, with ever-improving track and field times as evidence; baseball is then included under this umbrella.
However, the problem with such arguments is that they are a little like the average political argument, or dinner party philosophical argument--terms aren't really defined very well, and the supporting research usually is missing, resulting in smart people arguing in frustrating circles.
So thanks, Chris, for the quantitative research, which attempts to sidestep a lot of this by not asking why there's more or less variance, but whether there is.
Thanks for the commentary.
Let's mold this - several of you (and Mark included) said we should divide the SD by the mean of a given league? Mark suggested this previously, but I'm not sure what we are generating from that.
After taking the averages, I checked the correlations, and the correlation (r, not r^2) was 0.94 for ISO. SLG was lower, but still over .8. BA and OBP were in the .3 range (slightly lower, IIRC).
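For reference, a sketch of the r calculation. I am assuming the correlation in question is between each era's league mean and that era's SD for a given stat; the post does not spell that out. The era values below are hypothetical, not the figures behind the 0.94.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (r, not r^2) between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical league ISO means and SDs, one pair per era.
iso_means = [0.080, 0.095, 0.120, 0.135, 0.150]
iso_sds = [0.040, 0.046, 0.055, 0.058, 0.066]

print(pearson_r(iso_means, iso_sds))
```

A high r here would say the SD mostly tracks the league mean, which is the argument for dividing one by the other.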
Additional work we want to see then is:
1. divide the SD by the mean and plot that.
2. Generate OPS+ for my players and check that variance.
Manual input of Park Factors will take some time (which is why I resisted that in the first place, heh). I can work on the SD/mean and plot this weekend.
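What SD divided by mean generates is usually called the coefficient of variation: spread expressed as a fraction of the league average, so that eras with different offensive levels can be put on one scale. A sketch with hypothetical era values:

```python
def coefficient_of_variation(sd, league_mean):
    """Spread relative to the league mean (SD / mean)."""
    return sd / league_mean


# Hypothetical ISO figures for two eras: (league mean, SD).
eras = {
    "low-offense era": (0.100, 0.030),
    "high-offense era": (0.150, 0.045),
}

for label, (league_mean, sd) in eras.items():
    print(label, round(coefficient_of_variation(sd, league_mean), 3))
```

In this made-up case the raw SD is 50 percent higher in the second era, yet the relative spread is the same, so the jump in SD would reflect the higher mean and not a change in how far players stand apart.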
I really appreciate the support and feedback, and I'll try to take the critique and improve what we have so far. As this is a massive portion of data in a different manipulation, I want the peer review process to work, so keep the suggestions and critiques coming - have a little patience. I'll try to keep the columnar format going, so we can discuss it well, too.
Oh and the y-axis is the SD. It's SD vs. time period.
Thanks again.
-If Babe Ruth in his baseball prime was transported into our modern league, would he be worth more wins than Barry Bonds over a career? Over one season?
-If Babe Ruth was born in the same year as Barry Bonds, and grew up in the same environment, would he be worth more wins than Barry Bonds over a career? Over one season?
-Who was worth more wins in his contextual era?
Lou Gehrig (6'0", 200) and Babe Ruth (6'2", 215) were giants in their day. Pedro is 5'11" and said last year that he was 195. I think Pedro would've towered over most players back then.
That doesn't happen much here...
So how does that affect the SD? Is Gould's theory even appropriate to this problem? Or is it appropriate to the swimmer or track man? Does Gould say what sort of characteristics a human activity would have (and not have) that his theory would apply to? (Of course, now that I think of it, Gould at least thinks it applies to baseball, but I wonder.)
And as an aside, an article about the Twins' outfield (Stewart-Hunter-Jones are now the only all-African-American OF in MLB) included some numbers. The percent of black Americans in MLB has declined by 50 percent in the past decade. It is half of what it was a mere 10 years ago. That may be football or basketball's gain BTW, but baseball has compensated nicely because the percent of Hispanics has doubled. So it is apropos of nothing relative to this article. Just an aside.
Marc, Gould was dealing specifically with relative skills (hitting, for example)--skills performed against someone else who can improve. Absolute skills (swimming, track, fielding pct) are used to demonstrate the assumption that athletic performance does tend to improve in general (because of modern training, equipment, etc).
Gould's theory as it pertained to batting average was that because hitting is a relative skill, the mean stayed relatively consistent, yet variation would still have to shrink. Gould's work was done specifically to deal with the problem presented by relative skills.
These are the explanations I can come up with to explain this (lack of) phenomenon:
I wish your 3 suggestions were the only issues. Here are some others:
1. Chris's graphs show 5 year blocks. This means that the years 40-42 are included in that period and would mitigate the effect you suggest.
2. Variation depends on the number and quality of replacement players. If most stars left, the extreme right tail players would be gone, thereby reducing variation during the war years.
3. Remember that smaller run environments also reduce variation. In WWII rubber was in short supply and MLB used a substance called balata instead of rubber in the balls. This was not as resilient and the effect was that it was harder to get distance on the hits.
I remain convinced that it's possible to tease out all these factors, but it's a lot of work. And I'm glad Chris is the one doing it.
The answer is actually closer to 3. Yes, the stars that stayed behind did dominate, but most were gone. Mark answers your question pretty well.
There is definitely a drop for the war years. I broke the data out by "Eras" first, rather than by 5 year segments, but that simply wasn't granular enough - I guess I know something I'll add in the next piece ;-) .