Dialed In - Sunday, March 19, 2006
Greatness - How do we calculate it?
What is the greatest seasonal pitching performance ever? Bob Gibson in 1968? Pedro in 1999? Gooden in 1985? Koufax in 1966? How do we determine it? I have been thinking about this for nigh a decade and haven’t really gotten around to solving the question. Many of you reading this will say “Who cares?” or “Which pitcher had the most runs above replacement?” Or something. I’m not looking for “most value”, I am looking for “Greatness”. Not very clear, I know. That’s why I have been stuck on this question for a decade.
I took a diversity training class at work last week. I know, big groan. However, this training was about “Inclusion.” It’s great to have a diverse population, but if you really aren’t using the varied people, does it really matter? That’s what Inclusion is all about. So, here at BTF we have the best sabermetric minds around: math, stats, theory, creativity. I want to take what I think and see if we can expand it into something.
What is Greatness?
I know they aren’t the “most valuable” seasons, but they are very valuable seasons, and these great events stand out in time - for you and me and every person that follows baseball. What do we call the unreachable boundaries of data? Asymptotes. So I want to shove us off the sand in a search for the proper definition and quantification of Greatness. I’ll be Jason; you guys be the Argonauts as we sail in search of the Asymptotes of Greatness (AOG). Note: I know I am using the word loosely - go with me here.
Where to begin?
What’s Next?
Now I will offend the statheads with a blatant slap: I am not willing to adjust for the leagues. I might consider a park adjustment, but league adjustments are right out. I am not here to debate relativity with you nor Einstein. I am not interested in “relative to one’s peers” for the AOG. I want pushing the bounds of what can happen on a baseball field. If that sours it too much for you, I am sorry, but this is not about value directly, it is about Greatness.
Doing the Work
So, I have the equation that I can use. Yes, I could push it a little lower, but round numbers are fun (4.00 ERA, 300 datapoints, 1 ER in 300 IP - you get the idea). Maybe this isn’t the best approach. I think the basic idea makes sense to me, and I am pretty sure this will work for all stats, but generating the Magnitude of Greatness (MOG) is tricky. Where do You come in? Right here. Now what do I do? Annnnnnnnnnnnnnnnd, go!
Reader Comments and Retorts
DUH
Can somebody just tell me the answer, please?
Yes, I could live w/a definition like this. However, the definition presupposes (it seems to me) some determinate, though not necessarily fixed, sense of "human capabilities." I must say I find that odd coming from you, Chris. Maybe I've misunderstood your prior positions, but the notion of a determinate, even "natural" sense of human capabilities doesn't fit my impression of your views.
A year ago, I would have set the hits AOG at 255. Now it has to be set at 263 (ooh, one more than demonstrated).
Thanks JC - good tip.
Gibson threw 304.2 IP. He allowed 38 ER. So the demonstrated "unreached" measure is 37 ER in those IP - 1.093 ERA.
And Gibson is 1.093/1.12 = 0.976 MOG.
Simplistic, but entertaining.
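For anyone who wants to redo the Gibson arithmetic at home, here is a minimal sketch. The helper names are made up for illustration, and it assumes "304.2 IP" is box-score notation for 304 and two-thirds innings. It gives an "unreached" ERA of about 1.09. The ratio comes out near 0.976 when the denominator is the familiar rounded 1.12, and near 0.974 (simply 37/38) when nothing is rounded.

```python
def innings(ip_notation: str) -> float:
    """Convert box-score innings such as '304.2' (304 and 2/3) to a decimal."""
    whole, _, thirds = ip_notation.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def era(earned_runs: int, ip: float) -> float:
    """Earned run average: earned runs per nine innings."""
    return 9 * earned_runs / ip


ip = innings("304.2")
actual_era = era(38, ip)            # Gibson's real 1968 line
unreached_era = era(37, ip)         # one earned run better than the demonstrated mark
mog_rounded = unreached_era / round(actual_era, 2)   # divides by the familiar 1.12
mog_exact = unreached_era / actual_era               # same thing as 37 / 38

print(f"ERA {actual_era:.3f}, unreached {unreached_era:.3f}")
print(f"MOG {mog_rounded:.3f} (rounded ERA) or {mog_exact:.3f} (unrounded)")
```

The same two helpers should work for any pitcher line, as long as the innings are entered in the ".1 / .2" thirds notation.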
Top 20 Seasons
year playerID nameLast nameFirst MOG
1968 gibsobo01 Gibson Bob 296.6
1966 koufasa01 Koufax Sandy 204.4
1985 goodedw01 Gooden Dwight 197.8
1972 perryga01 Perry Gaylord 195.3
1972 carltst01 Carlton Steve 191.7
1971 woodwi01 Wood Wilbur 190.8
1968 mclaide01 McLain Denny 187.8
1971 bluevi01 Blue Vida 187.6
1964 chancde01 Chance Dean 184.5
1963 koufasa01 Koufax Sandy 180.7
1965 koufasa01 Koufax Sandy 180.0
1971 seaveto01 Seaver Tom 177.8
1968 tiantlu01 Tiant Luis 176.2
1978 guidrro01 Guidry Ron 171.6
1975 palmeji01 Palmer Jim 168.9
1972 woodwi01 Wood Wilbur 164.1
1968 mcdowsa01 McDowell Sam 162.7
1964 drysddo01 Drysdale Don 160.8
1972 hunteca01 Hunter Catfish 158.1
1969 gibsobo01 Gibson Bob 157.6
Gibson really dusts the competition. And lower-IP guys just vanish.
The top 30:
year playerID nameLast nameFirst MOG
1968 gibsobo01 Gibson Bob 296.6
1920 alexape01 Alexander Pete 207.9
1966 koufasa01 Koufax Sandy 204.4
1933 hubbeca01 Hubbell Carl 203.2
1985 goodedw01 Gooden Dwight 197.8
1972 perryga01 Perry Gaylord 195.3
1972 carltst01 Carlton Steve 191.7
1971 woodwi01 Wood Wilbur 190.8
1945 newhoha01 Newhouser Hal 189.2
1968 mclaide01 McLain Denny 187.8
1971 bluevi01 Blue Vida 187.6
1946 fellebo01 Feller Bob 186.2
1964 chancde01 Chance Dean 184.5
1923 luquedo01 Luque Dolf 182.4
1944 troutdi01 Trout Dizzy 181.7
1963 koufasa01 Koufax Sandy 180.7
1965 koufasa01 Koufax Sandy 180.0
1971 seaveto01 Seaver Tom 177.8
1968 tiantlu01 Tiant Luis 176.2
1978 guidrro01 Guidry Ron 171.6
1942 coopemo01 Cooper Mort 171.1
1975 palmeji01 Palmer Jim 168.9
1943 chandsp01 Chandler Spud 168.6
1946 newhoha01 Newhouser Hal 164.9
1972 woodwi01 Wood Wilbur 164.1
1968 mcdowsa01 McDowell Sam 162.7
1964 drysddo01 Drysdale Don 160.8
1972 hunteca01 Hunter Catfish 158.1
1969 gibsobo01 Gibson Bob 157.6
1933 warnelo01 Warneke Lon 157.0
1.74 (Koufax, 1964)
1.73 (Koufax, 1966)
1.69 (Ryan, 1981)
1.66 (Hubbell, 1933)
1.65 (Chance, 1964)
1.64 (Chandler, 1943)
1.63 (Maddux, 1995)
1.60 (Tiant, 1968)
1.56 (Maddux, 1994)
1.53 (Gooden, 1985)
... and then you get 1.12, Gibson 1968.
As bunched up as those earlier numbers are, it's almost like there's a practical limit to how low an ERA can go, and it's about one and a half. That's the boundary of human capability, as you put it. Except for Bob Gibson, who blew past it with room to spare.
Now that's greatness.
I like the asymptotes idea you've got going here, but I think you need to better define your "greatness" or else it will be one side of your brain yelling "Eckersley!" while the other shouts "Gibson!".
Regrettably, I don't think I've got any suggestions other than to maybe look at RA instead of just ERA and give some weight to Ks.
It seems like if you want to reduce the impact of IP you could, instead of using strict IP, use some normalized version of IP where you compare the pitcher's IP for the season against some baseline - maybe the average IP for a starting pitcher in the league that year (or the average IP of an ERA qualifier that year). That would be a way of saying "here's a guy who was great, but also provided his team with X% more innings than they might otherwise have gotten (giving him extra credit for that, or taking away credit if he was great but didn't log the innings)."
That would allow guys with fewer absolute IP (but still a healthy number adjusted for era) to be competitive with the 300+ IP hurlers of the past.
Agreed. And Gibson gave up a relatively huge number of unearned runs. His RA was 1.45, which fits in with the other top seasons as pushing right up against the practical limits.
Dial, you need to explain what the second variable is. What is the trendline measuring?
Anyway, although I think I know what you're going for, I really don't think you're that close. You just have to adjust. It seems odd that 4 of the 20 greatest pitcher seasons since 1947 would be in 1968. It seems odd that all but 3 of them would be in the era of 1964-1972.
Now maybe that's not so odd -- the 60s and 70s were the era when starting pitching dominated the game. And Gibson's season was the most dominant of that era (though Vinay makes a good point about RA instead of ERA). Today's starters may be as talented, but they certainly have less "greatness thrust upon them."
Still, your top 30 since 1920 has 17 from 64-72 and 6 from 42-46 (should these count?).
So maybe what you're measuring is closer to "here's what a great pitcher can do under near-optimal conditions for pitching." If that's what you're going for, that's fine. But I think that puts me pretty close to the "who cares" camp.
It does seem a real statement of Gibson's greatness that he makes the list in both 68 and 69 even though scoring was up .6 runs. Batters were probably so traumatized by 1968 that they got queasy just thinking about him in 69.
I would set the AOG at some point less great than the all-time greatest performance. (I know that's contradictory with asymptotes, but you're playing fast and loose, so I'll play faster and looser.) If you always set the AOG at [greatest + 1], there's no way to differentiate between the greatness in Gibson's ERA and the greatness in Bonds's HR total. However, if you set the AOG at--just blindly stabbing here--3 standard deviations above the mean, maybe that's a 1.55 ERA, or 66 HRs, or whatever. Obviously once you're that far out, you're excluding nearly everybody, by definition. But it gives you perspective on how overwhelming the best performances were.
Another way to think of this: which is greater, Ichiro's 262 hits, or Neifi Perez somehow managing 282 that same year? My gut says that 282, which blows away the record, is astonishing, while 262 is very impressive but not decisively more so than the previous record. 262/263 is more or less equal to 282/283, but if mean+3 SDs = 250, you have a very different story.
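A quick sketch of that comparison, for what it's worth. The function name is made up, and the 250 is only the "if mean+3 SDs = 250" figure floated in the comment, not something computed from real data.

```python
def share_of_bar(value: float, bar: float) -> float:
    """Performance expressed as a fraction of the bar it is measured against."""
    return value / bar


for hits in (262, 282):
    moving_bar = share_of_bar(hits, hits + 1)   # bar set at [greatest + 1]
    fixed_bar = share_of_bar(hits, 250)         # hypothetical mean + 3 SD bar
    print(hits, round(moving_bar, 3), round(fixed_bar, 3))
```

Against the moving bar the two seasons are indistinguishable to three decimals; against the fixed bar the 282 season stands well clear of the 262 one.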
Or, you could go to spring training, attach sensors to the joints of ML players and try to quantify SWAGGER.
Most Innings Pitched
Fewest Runs Allowed
Fewest Hits Allowed
Fewest Walks
Most Strikeouts
Using, of course, not all baseball history, but Chris's selected 1947-2004 era.
Eg, Gibson's 268 strikeouts are worth 268/383=70 points?
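If I'm reading the suggestion right, a "most of" category would be scored as a percentage of the best mark in the period. A rough sketch, with a made-up function name and an assumed 100-point scale; how the "fewest" categories would be scored isn't spelled out, so that part is left alone here.

```python
def category_points(value: float, best_mark: float) -> float:
    """Score a 'most of' category as a percentage of the best mark in the period."""
    return 100 * value / best_mark


gibson_k_points = category_points(268, 383)   # 268 strikeouts against a best of 383
print(round(gibson_k_points))
```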
But you might want to adjust for fielding.
At which point, aren't we into Win Shares territory, or using FIP or DIPS instead of ERA? Maybe I've missed something, but I can't see what the difference is apart from adding AUA (another ugly acronym). AOG? MOG? UGH!
So I mean:
lgERA*IP-ER
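A hedged sketch of that one-liner. Since ERA is a per-nine-innings rate, this version divides by 9 to turn the league ERA into runs; that step is my assumption about what was intended. The example plugs in Luis Tiant's 1968 figures as they are quoted elsewhere in this thread (258.333 IP, 1.60 ERA, 2.98 AL average).

```python
def runs_below_league(lg_era: float, ip: float, earned_runs: float) -> float:
    """Earned runs a league-average pitcher would allow in `ip`, minus actual ER."""
    return lg_era * ip / 9 - earned_runs


ip = 258.333
er = 1.60 * ip / 9                       # earned runs implied by a 1.60 ERA
saved = runs_below_league(2.98, ip, er)
print(round(saved, 1))
```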
Records are always increasing in magnitude (that's a rule). You should be able to build a model for what the *next* record setting performance should be based on the sequence of records that have been set. For me, greatness would be breaking a record by more than what would have been expected. In this case, I think you could still adjust for league/era but still get at what you're looking for.
For example, probably even adjusting for altitude, Bob Beamon's jump in 68 has to be considered "great". It was so far beyond what could have been expected to be the maximum accomplishment, I think that it falls into your category of "greatness".
You'll need (or someone who has more time than me will need) to do some math-y type work on modelling the maxima/minima of distributions, but I think that it is still doable.
I agree, but without the adjustments, presumably.
And Gibson gave up a relatively huge number of unearned runs. His RA was 1.45, which fits in with the other top seasons as pushing right up against the practical limits.
Maybe that is the solution - RA.
Chris appears to be creating a metric to quantify something that is already quantified by eyeballing commonplace stat lines.
Here's where I disagree: post 28 - Dan, presently likes Maddux, Gibson then Pedro - he's nearly throwing IP out the window, and I happen to think they are more important than that. We call everything a big tie for greatest seasons.
Not as important as is showing up in the first pass, but somewhere in there. Yes, this is really going to have to pass the sniff test mostly because somewhere in here is arbitrariness of weighting.
I know that moves people into the "who cares" camp. But there has to be an attachment between what we see and know and what (for lack of a better reference) the media and public see.
However, if you set the AOG at--just blindly stabbing here--3 standard deviations above the mean, maybe that's a 1.55 ERA, or 66 HRs, or whatever. Obviously once you're that far out, you're excluding nearly everybody, by definition. But it gives you perspective on how overwhelming the best performances were.
Then this stumbles into relativity. I can live with relativity to history, but not a seasonal one. It really doesn't matter how good the environment gets (thus far), the uncontrollable variables push back to an ERA of (say) 1. Or you can only get 700 ABs, so 296 hits is the complete maximum. And so forth.
Maybe I've missed something, but I can't see what the difference is apart from adding AUA (another ugly acronym). AOG? MOG? UGH!
You haven't missed anything. Those are all value measures - and are all based in relativity.
at least control for that whole DH thing.
I want to avoid that. But maybe - the DH adds - what - 6-8%? Not sure if that's a big factor - or a big enough one to matter at the top.
So maybe what you're measuring is closer to "here's what a great pitcher can do under near-optimal conditions for pitching."
I'm not so much going for *that*, but obviously near-optimal conditions for pitching will produce the "Greatest" seasons. Same with any stat.
I think this is an interesting concept, but not one I can do. I'd like to keep the methodology simple so anyone can calculate any given season in case I don't.
I plan to look at the leaders based on RA this afternoon (stupid job).
Thanks, Vaux.
See, to me, they simply aren't as great. As you increase innings, the likelihood you will give up runs increases, both from a fatigue factor and some luck regressing to the mean, and that line almost has to be exponential.
Otherwise, we simply go with Eck 90.
Okay, so instead of arguing over totals in an encyclopaedia column, you're now going to argue about weightings in a 'one big number' formula. I don't see where the gain is being made. Except in the realm of doing something just for the fun of it.
To paraphrase the most original moralist in England, "I'm wanting you to tell me, I'm willing you to tell me, and I'm waiting for you to tell me."
That seems to be the trendy thing to say on the internet, but would you rather have 303 innings of Bob Gibson '68 or 213 innings of Pedro '99? FWIW, BPro translates Gibson to 25-5 and Martinez to 26-5. I have to go to work now, so I don't have time at the moment to look up how they came up with those translations, but, from where I sit, it looks like it's too close to call.
Mostly for fun - I don't know how "Greatness" can be for anything else, as there'll be considerable subjectivity. But minimizing that subjectivity is a goal.
And there are too many columns in the encyclopedia to argue over - we'll have fewer weightings to argue over.
The gain will also be in the collaborative effort, rather than a dictatorial one.
Regardless of what you measure, you need to consider context. If you wanted to measure the greatest runners of all time, as soon as you factor in distance you drop all the sprinters; and when you factor in speed you lose all the marathoners. The only ways you can reconcile the two are to (a) develop separate measures of greatness for each, or (b) properly adjust for the differences between marathons and sprints. And either way, you're accounting for context.
Let's go back to the example of ERA greatness. You've already defined a context by throwing out 1946 and prior. You've defined it further by segmenting into seasons of 150+ IP. And then you say that you don't want to adjust for league because, as you state, "I want pushing the bounds of what can happen on a baseball field." Well, can someone have pushed the bounds before 1946? Can someone have pushed the bounds with fewer than 150 IP? Heck, could someone have pushed the bounds of achievement in the minors? From June of one season to May of the next season?
Really, all you're doing here is defining greatness, given certain constraints. But I think you're striving for an unconstrained Greatness, and allowing yourself to believe that you can get there with a subset of Lahman. Not gonna happen.
OK, now having taken a few jabs at your bubble, let me see if I can offer something more constructive.
Ignoring context, perfection is achievable. A pitcher could throw X pitches and get X outs. But when you let context move in, that level of perfection is not achievable: all it takes is for one batter to take a pitch, and now you're getting X outs in X+1 pitches. Is it any less a measure of greatness? No. It might make it harder to quantify, harder to discern greatness amidst the data. But it's still as perfect as the pitcher could perform in the context.
I believe to measure Greatness you need to consider three things:
(a) define the context,
(b) determine how difficult perfection would be in that context, and
(c) measure how close to perfection the performance is.
We can nail (a) and (c) pretty easily, but (b) is where things get difficult. You need to adjust for park; for quality of competition faced ("league-year" might not be enough); for defensive quality; for umpires; for managerial decisions; for, well, everything. You need to assign a degree of difficulty to each performance. Is {X+1 pitches, X outs} more or less difficult than {Y pitches, Y outs}? At what values of X and Y are they equivalent?
I don't see how you solve for Greatness without this.
Thanks!
And then you say that you don't want to adjust for league because, as you state, "I want pushing the bounds of what can happen on a baseball field." Well, can someone have pushed the bounds before 1946?
Fair enough. As you can see, I have added the 1920-1946 players as well.
I do think modern equipment is a factor - that's a time when people are playing the same sport. Throwing underhanded and just a soggy ball simply aren't the same game. Games without gloves aren't either.
I don't think "pitches" are a measure of Greatness at all.
You need to adjust for park; for quality of competition faced ("league-year" might not be enough); for defensive quality; for umpires; for managerial decisions; for, well, everything. You need to assign a degree of difficulty to each performance. Is {X+1 pitches, X outs} more or less difficult than {Y pitches, Y outs}? At what values of X and Y are they equivalent?
I don't see how you solve for Greatness without this.
Because some of those things don't exist in practical terms. They aren't measurable.
It doesn't matter which park. The most HRs in a season wasn't done in Coors. The highest BA wasn't. The most K's weren't in Dodger Stadium. Nor was the lowest ERA. And plenty of pitchers have pitched under those conditions (how many pitchers pitched half their games in Dodger Stadium? 5 per season?)
1968, 10 team league, use N = 10/2 = 5, 5th lowest ERA was Steve Blass with 2.12. Gibson was 1.00 runs lower, times 304.2 innings, is 33.9 Runs Above All-Star.
1999, 14 team league, N = 7, 7th lowest ERA is Colon with 3.95, Pedro was 1.88 lower, times 213.3 innings, is 44.6 RAAS.
2000, Heredia is 7th lowest with 4.12, Pedro was 2.38 lower, 217 innings, is 57.4 RAAS.
1884, 8 team NL, 4th best ERA was Galvin at 1.99, Charlie Radbourne led the league in ERA 1.38 and innings 678.7, but this only led to a RAAS of 46.0.
etc.
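The Runs Above All-Star examples can be checked with a short sketch. The function name is made up, and the quoted results only work out if the ERA gap is multiplied by innings and then divided by 9, so that is what this assumes. Pedro's ERAs (2.07 and 1.74) are backed out from the "1.88 lower" and "2.38 lower" figures above.

```python
def raas(all_star_era: float, era: float, ip: float) -> float:
    """Runs Above All-Star: ERA gap to the Nth-best ERA, scaled to innings pitched."""
    return (all_star_era - era) * ip / 9


# (Nth-best ERA in the league, pitcher's ERA, innings pitched)
seasons = {
    "Gibson 1968": (2.12, 1.12, 304 + 2 / 3),
    "Pedro 1999": (3.95, 2.07, 213 + 1 / 3),
    "Pedro 2000": (4.12, 1.74, 217.0),
    "Radbourne 1884": (1.99, 1.38, 678 + 2 / 3),
}

for name, (nth_era, era, ip) in seasons.items():
    print(f"{name}: {raas(nth_era, era, ip):.1f}")
```

Picking N as half the number of teams is the commenter's choice; any other N just changes the first number in each tuple.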
That's not bad at all, but you need an IP component. It's obviously easier to have an ERA lower with fewer innings.
Pedro's 1999 was 213.1 IP. Koufax' 1965 was 335.2 IP. Eck's 1990 was 73.1 IP.
So Koufax is 122.1 IP more than Pedro, and Pedro is 140 IP more than Eck.
IP is a very important component here.
You mention in the intro that you're looking at guys with 150+ IP, so I'm not sure why you keep bringing Eck up as a problem.
I personally don't like adjusting for era standards like "everybody threw X IP". They didn't throw them well, and they are not, generally showing up on my chart that much. You have to throw a lot and be good.
Because people want to discount 335 IP compared to 213 IP based on ERA+.
Sometimes people have no trouble going "well it was only 75 IP", while the difference in IP between Koufax and Pedro was very comparable to the difference between Pedro and Eck.
People recognize that Pedro => Eck is a lot of IP, but not so much for Koufax => Pedro.
---
I'm not sure you can say that Pedro's performance was much better than Gibson's because the league scored more. The absolute best outcome is a 0.00 RA in Gibson's 304 innings, but that isn't really realistic; I think that a 1.5 RA is near the boundary level of what is possible. Pedro's 285 ERA+ translates to a 1.02 ERA in the NL in 1968; the difference between that and what Gibson did, over Gibson's innings, is 3 fewer earned runs. Of course, if Pedro started off as he did (285 ERA+ in 217 innings) he would have given up 25 runs in those innings; to equal Gibson, he'd need to throw 87 more innings and give up 13 or fewer ER (a 216 ERA+).
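Here is a rough sketch of that what-if, using only the numbers in the comment. The function names are made up, park effects are ignored, and the baseline ERA is simply whatever makes a 1.02 ERA equal a 285 ERA+, which is an assumption about how the translation was done.

```python
def earned_runs(era: float, ip: float) -> float:
    """Earned runs implied by an ERA over a number of innings."""
    return era * ip / 9


def era_plus(baseline_era: float, era: float) -> float:
    """ERA+ against a baseline ERA (no park adjustment in this sketch)."""
    return 100 * baseline_era / era


translated_era = 1.02                        # Pedro's 285 ERA+ in 1968 NL terms, per the comment
baseline_era = translated_era * 285 / 100    # baseline implied by those two figures

pedro_er = round(earned_runs(translated_era, 217))   # earned runs through 217 innings
extra_ip = 304 - 217                                  # innings still needed to match Gibson
extra_er = 38 - pedro_er                              # earned runs left before passing Gibson's 38
needed_era_plus = era_plus(baseline_era, 9 * extra_er / extra_ip)

print(pedro_er, extra_ip, extra_er, round(needed_era_plus))
```

Under those assumptions the sketch lands on the same 25 runs, 87 innings, 13 runs and 216 ERA+ as the comment.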
I did notice that there was a really nice bell curve shape to the number of times an ERA was produced.
Using ERA+ (both era and park adjusted) times IP leads one to:
Gibson 68 78604
Pedro 00 61845
Eck 90 44420
Gooden 85 62527
Pete 20 61034
Koufax 66 61370
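The numbers in that list are just the product of the two stats. A tiny sketch: Pedro's 285 ERA+ and 217 IP come from earlier in the thread, while the 258 for Gibson is back-solved from the 78604 above and should be treated as an assumption.

```python
def era_plus_innings(era_plus: float, ip: float) -> float:
    """Rate stat times workload: ERA+ multiplied by innings pitched."""
    return era_plus * ip


pedro_2000 = era_plus_innings(285, 217)
gibson_1968 = era_plus_innings(258, 304 + 2 / 3)   # 258 is an assumed ERA+

print(round(pedro_2000), round(gibson_1968))
```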
Hmmm, I wonder if this actually works. I like the way that smells so far.
Shirley somebody did this. Boy, that Gibson 68 is something else, huh?
But this isn't using the asymptotes, and wouldn't be applicable to all stats, which is a nice goal.
Dazzy Vance is a good example. His K/IP won't blow you away -- though they don't look out of place in a modern pitching line. But compare them to the other guys on the leaderboard -- utter dominance.
Now I know that this is directly contrary to what you stated in the article intro, but I do think there's a relative component of greatness. 29 HR was well beyond what anybody expected for HR totals in (say) 1917.
Second (and I know this is old ground for both of us) workload has to enter into the equation. But simply counting the innings doesn't cut it for me. Again, I like to use the leaderboards to get a baseline. Basically I take the IP totals for #2 through 5 (to avoid counting a guy's heavy workload against himself)
So taking Bob Feller's 1946. He moves forward a tad using RA+ rather than ERA+. I get his RA+ at 171 (but I'm just doing a quick and dirty and may have an error in my calculations) and he worked roughly a third more innings than the numbers two through five.
Multiply the two and I get a quick and dirty wow factor of 228 -- and that looks good in terms of what I think you were looking for.
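The quick and dirty wow factor might be sketched like this. Both function names are made up, and the peer innings in the second example are obviously hypothetical placeholders, since the actual #2 through #5 totals aren't listed in the comment.

```python
from typing import Sequence


def workload_ratio(ip: float, peer_ips: Sequence[float]) -> float:
    """Pitcher's innings relative to the mean innings of the #2 through #5 leaders."""
    return ip / (sum(peer_ips) / len(peer_ips))


def wow_factor(ra_plus: float, ratio: float) -> float:
    """RA+ scaled by relative workload."""
    return ra_plus * ratio


# Feller 1946 as described: RA+ of 171 and roughly a third more innings
feller_1946 = wow_factor(171, 4 / 3)
print(round(feller_1946))

# Hypothetical peer totals, only to show how the ratio would be computed
print(workload_ratio(400, [300, 300, 300, 300]))
```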
I agree - but there the game was so different.
What are the "Great Eras"?
Pre-1900 (30 yrs), Deadball (20 yrs), Pre War (22 yrs), Integration start (15 yrs), Expansion (which continues today - 45 yrs).
Expansion also has the development of equipment (gloves, helmets, bats, fields).
Have we pushed the equipment assistance to the limit? Probably. Which is why I went for post-integration. Are these the wrong eras? (Yes, I know about 1911-1912).
I think each era may have its own Greatness requirements, but I think the Pre-War and Integration Start periods aren't dramatically different (except for the gloves and some portion of quality of play). Deadball and Pre-1900 are different beasts altogether.
I understand the desire for W% - many many people like it - it's just not my bag, baby.
I think it depends on how long it lasts, and is it really that different from the 1950s?
Just taking the argument to the extreme, is all. We could talk about it at the level of Batters Faced, but as soon as I mention that X outs in X+1 BF isn't as good as X outs in X BF, someone will surely come along and say if the "X in X+1" are all one-pitch at-bats while the "X in X" are ten-pitch struggles, then the former is greater. Again, context.
Coming up with a "measure" of Greatness, really, is going to involve either much more data than you're using or a large amount of subjectivity. And if you're going with the latter you're not really developing a measure as much as an opinion. And you can get that opinion just fine without all the calculations.
If the problem is that everyone has a strong and differing opinion, then, yes, about the only way to resolve it is some relatively objective measure; but again that takes a lot more data than Lahman provides.
this is closer to the idea. Which is why this is really open to ideas - what do *you* think would be a good measure? You have to limit some of the input, but what really makes you say "Man, that was a GREAT season."
Is it just ERA? ERA+? IP*ERA+? What does it for you?
Plus, the level of hitting and scoring that's occurred in 1993-2005 is dramatically higher than that which occurred in the 1950s.
Rather than looking at ERA or ERA+, look at where the category leader was significantly ahead of his peers. If a 185 ERA+ leads the league, but there were several other pitchers with ERA+ in the 170 range with comparable IPs, then it isn't as impressive as if the leader had a 160 ERA+ and no one else was above 140.
Of course, you've still got the IP/ERA breakdown to deal with....
Har har. And as I noted to start, I included Integration start with the Expansion years simply because it's a much shorter time period.
Not definitive or anything, Steve, but I looked at 2005 NL = 4.45 R/G; 1955 NL = 4.53 R/G
But I do see that 1999-2000 was high. I don't think the decades are that dissimilar, with this era being slightly higher.
I agree - I think that it is a strong idea, but would take a separate skill than I have. And we want it to be readily updated when Pedro goes off this season and the Mets win the WS.
(These are really just variations on the idea of using SDs -- post 23 -- but probably more accessible for the average fan to interpret.)
R/G, NL, AL, 1950-59:
1950 4.66 5.04
1951 4.46 4.63
1952 4.17 4.18
1953 4.75 4.46
1954 4.56 4.19
1955 4.53 4.44
1956 4.25 4.66
1957 4.38 4.23
1958 4.40 4.17
1959 4.40 4.36
R/G, NL, AL, 1993-2005:
1993 4.49 4.71
1994 4.63 5.23
1995 4.63 5.06
1996 4.68 5.39
1997 4.60 4.94
1998 4.60 5.01
1999 4.95 5.23
2000 5.02 5.28
2001 4.70 4.86
2002 4.46 4.80
2003 4.61 4.86
2004 4.64 5.01
2005 4.45 4.76
The '93-'05 period is distinctly higher-scoring, and it isn't just because of the DH (although the DH is of course another significant differentiator between the eras). Moreover, it isn't just the scoring rate itself that's different: the mode of scoring is dramatically different as well, with vastly more home runs and strikeouts occurring in the later period.
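Anyone who wants to check the size of the gap can run the two tables through a few lines. This sketch takes simple unweighted means of the season figures, which is an assumption; percentages quoted later in the thread may have been weighted differently and can differ by a few tenths.

```python
from statistics import mean

# R/G by season, copied from the two tables above
nl_50s = [4.66, 4.46, 4.17, 4.75, 4.56, 4.53, 4.25, 4.38, 4.40, 4.40]
al_50s = [5.04, 4.63, 4.18, 4.46, 4.19, 4.44, 4.66, 4.23, 4.17, 4.36]
nl_recent = [4.49, 4.63, 4.63, 4.68, 4.60, 4.60, 4.95, 5.02, 4.70, 4.46, 4.61, 4.64, 4.45]
al_recent = [4.71, 5.23, 5.06, 5.39, 4.94, 5.01, 5.23, 5.28, 4.86, 4.80, 4.86, 5.01, 4.76]


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100 * (new - old) / old


nl_change = pct_change(mean(nl_50s), mean(nl_recent))
al_change = pct_change(mean(al_50s), mean(al_recent))
mlb_change = pct_change(mean(nl_50s + al_50s), mean(nl_recent + al_recent))

print(f"NL {nl_change:.1f}%  AL {al_change:.1f}%  MLB {mlb_change:.1f}%")
```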
Not quite what you're looking for. The guys who do real well in K+ have always attracted attention. Sometimes beyond their actual effectiveness.
An awesome ERA provided it's backed up by enough innings. Or just a microscopic ERA (like Eck).
Wins? Dunno. 1968's an interesting test. I think more people are just blown away by Gibson's 1968 than McLain's. And I think that was true even in 1968 -- when wins carried far more weight than they do today.
On the other hand, what blows so many people away about Carlton's 1972 is the 27 wins on a team that went 32-87 when he didn't get the decision.
You know what might be interesting is to approach this backwards. Make a subjective list of great seasons and work backwards. Kind of like what James did in building his HOF monitor.
Jonathan Bernstein has a real quick and dirty MVP predictor which works pretty well in the NL, not so well in the AL. It does provide a first cut notion as to what impresses people.
I definitely recall it being that way back in '68. McLain's 31 wins was hugely (and rightly) celebrated, but Koufax had won 27 and 26 games just 2 and 3 years earlier, and Marichal won 26 in '68, and had won 25 twice before. McLain's win total was higher, of course, but not that much higher.
But even in the pitching-rich '60s, Gibson's ERA just blew everyone away. It had a feeling of unreality about it. Nobody, not even Koufax, had come within miles of a 1.12 ERA.
And then in their showdown in the World Series, McLain wasn't especially effective, while Gibson was at his nastiest, especially in the opening game, with his phenomenal 17-strikeout shutout. (I saved the next morning's sports page from that game, thinking that I had been witness to something historically important ... I think I was correct.) Gibson was the bigger story than McLain, even at the time.
They do exist, and they are approximately measurable. The problem most people have is:
A: Adjusting for park/league/era is not "precisely" measurable; and
B: They want to think the numbers they eyeball on the back of a baseball card are a precise representation of what the player is worth.
in 1968 Luis Tiant pitched 258.333 innings with an ERA of 1.60
Great season, but was it really the 19th best season since 1920?
Without adjusting for the league (the 1968 AL average ERA was 2.98), let alone park, etc., you can't say. The AVERAGE (mean) pitcher in the 1968 AL had an ERA of 2.98. Was "1968 Joe Average" (Gary Bell, Boston, 11-11, 3.12; George Brunet, California, 13-17, 2.85) a better pitcher than Mr. Average 2000 (Aaron Sele, 17-10, 4.51), when the league ERA was 4.91?
2.98 versus 4.91: you have to adjust for that. Even if it's not perfect, some allowance has to be made.
Since 1963, the average in the NL waggled around 4 R/G, so the increase from this era compared to the 50s is roughly half what it is from the 50s compared to much of history. The AL throws a lot of that off.
And it is a completely different conversation than our goal here.
Huh?
In the first place, it isn't a 4-6% increase. It's actually an 8.3% increase (4.45 MLB average for 1950-59, versus 4.82 MLB average for 1993-2005). And in historical comparisons of 10-to-13-year periods, that is indeed dramatic: consider that the overall average for all seasons 1901-2005 is 4.38; the 1993-2005 period is the second-highest sustained period of scoring since 1901, exceeded only by the 1920s-30s. The 1950s was only a slightly above-average period of scoring.
Even if you remove the AL, and just look at the NL only, the difference is 4.2% (4.46 to 4.65), which is again, quite significant for periods of this length in historical context. And removing half of baseball from the analysis doesn't seem to make any sense anyway: the DH is a real change, that's really happened.
And the fact that it may well be "changes in equipment" (by this I assume you mean a livelier ball) that has been a major contributor to the change in scoring rates is, I would say, irrelevant: regardless of the cause(s), scoring rates changed significantly. The periods aren't especially comparable.
And it is a completely different conversation than our goal here.
I don't think it is. Especially when attempting to discover and describe the most impressive "wow!" achievements in history, a careful comprehension of environmental context is crucial.
...
the difference is 4.2%
Thanks.
And removing half of baseball from the analysis doesn't seem to make any sense anyway: the DH is a real change, that's really happened.
I'm not. The AL has seen about 12% difference. I'm pretty sure the DH is about 6%. That makes the AL increase about 6%. So (as you AGREE above) 4-6%.
Look at it this way - it's 8.3%, less the DH (for context as you say), which in this situation would be about 3% or an overall 5% change. Or 4-6%.
And if you read the opening part - I'm not interested in very much "context", so NO, the R/G don't really matter.
Well, okay, but:
(a) A 4-6% change is significant in historical terms. Saying it isn't doesn't make it so.
and
(b) While we can remove the DH for purposes of identifying its impact, its impact remains in real life. Not looking at it doesn't make it go away.
And if you read the opening part - I'm not interested in very much "context", so NO, the R/G don't really matter.
I get that you aren't interested in it, for your immediate purposes. But I suspect that ultimately it can't be ignored. Everything, even outlier performances, perhaps even especially outlier performances, require context for meaningful assessment.
The converse of this is also true.
But I suspect that ultimately it can't be ignored. Everything, even outlier performances, perhaps even especially outlier performances, require context for meaningful assessment.
Maybe. But we haven't seen the highest BA these days - not even close. We have seen the highest SLG and HRs and hits, in pitchers parks.
Oddly, we have also seen some of the lowest ERAs these days, and in hitters parks.
So while I agree the run environment contributes to the event, it is just one variable, and not too critical for the primary function.
And I'm fairly certain for Greatness, 4-6% isn't going to be significant.
This is interesting. It weights the IP pretty big.
Obviously if Gibson threw 186.7 "scoreless" IP, that'll smoke most other seasons, and nearly wipe out all Pedro seasons.
Right now, it seems the "scale" I like the best is the IP*ERA+. It seems to be ranking the seasons pretty well in the few I did.
Can GPA*PA do the same? Yes, I just used GPA.
I'm obviously less confident about that. A 4-6% change in the league-wide average will yield a swing factor quite a bit larger than that in individual outlier performances. And, for AL pitchers, the league-wide change isn't 4-6%, but is instead the full 12%, because for them, the DH simply cannot be factored out of the equation.
You may be right; it might not be worth worrying about. But I'm leery about dismissing it at the outset.
You may well be right too. I figure we can work it without and see how the smell test comes out.
As Walt notes above, I'm coming up with 50% 1964-72 pitchers as "Great".
Something needs tweaked.
It's a handy metric I set up while in fantasy leagues to figure out how much value each pitcher has. When evaluating a starter with 200 IP and a 3.00 ERA vs. a closer with 70 IP and a 1.80 ERA, the QIP immediately told me how much each pitcher was influencing my team ERA.
Its use is in converting averages into an absolute value that can be more easily compared. Another example: one hitter has a .450 OBP in 400 PA and another a .400 OBP in 600 PA in a league with a .340 OBP. Which has been more valuable in getting on base? The first batter is credited with 129 quality PA (because to be at league average, he would have to not reach base in another 129 PA) and the second 105 QPA.
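For anyone who wants to check that arithmetic, here is a minimal sketch of the quality-PA calculation as described above. The function name quality_pa and the choice to truncate to a whole number are assumptions, made only to match the 129 and 105 quoted; the original spreadsheet may well do it differently.

```python
import math

def quality_pa(obp, pa, league_obp):
    """Hypothetical helper: extra PA without reaching base that would
    pull this hitter's OBP down to the league average."""
    times_on_base = obp * pa
    # PA in which a league-average hitter would reach base this many times
    pa_at_league_rate = times_on_base / league_obp
    # Truncate to a whole PA (assumption, matches the figures in the post)
    return math.floor(pa_at_league_rate - pa)

print(quality_pa(0.450, 400, 0.340))  # .450 OBP in 400 PA
print(quality_pa(0.400, 600, 0.340))  # .400 OBP in 600 PA
```

The same idea should carry over to innings and ERA, but the exact QIP formula isn't spelled out here, so I've left it out.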
A drawback for your purposes is that QIP may not have the same value across eras, since the value of a scoreless IP changes. You would need to decide whether Gibson's 186 QIP in 1968 are better than Pedro's 141 in 2000.
Well, ya know, nothing has the same value across eras. Everything is context-dependent. And while I understand and sympathize with Chris's wish to just forget about all that for the time being, and focus on just the raw numbers themselves, I just think that sooner or later, one way or another, context is going to rear its obnoxious head.
I don't think so.
since the value of a scoreless IP changes.
This isn't a "value" measure. This is a "under no circumstances can you do that" measure.
No one can throw 300 IP and allow just 1 run per 9 IP. It cannot be done.
Can people outpace a given league? Sure - but that's a relativity score, and isn't as "Ooooooo" as posting a 1.12 ERA under any circumstances in MLB.
Can someone hit 37 triples in an MLB season? I don't think so.
There have been 6 seasons with 60 or more doubles. All within a 10 yr period. And not this one.
Was 1936 a particularly good year for doubles? BS. There's no way context plays a part in that. Sure there were more doubles hit that year than any of the surrounding seasons, but explain that. Don't just say "It was a good year for doubles". That's nonsense (well, maybe, there could have been a park fence change that made more GR2Bs).
4 of the top 20 2B seasons happened in 1936. Now that's weird (as is the spelling of that word).
I just think that sooner or later, one way or another, context is going to rear its obnoxious head.
So far, so good.
Oh, come on, Chris. The period from the mid-1920s through the mid-to-late 1930s featured a very high rate of doubles. Context unquestionably matters. It isn't just coincidence that the record for doubles wasn't achieved in the 1960s.
Sure, Steve, but why 1936?
The precise individual mark in the precise individual season is very likely just coincidental. But let's consider:
Top 25 seasons for MLB 2B/G:
1930
1932
2004
2000
1931
1929
2005
2003
2001
1936
1999
1998
2002
1994
1925
1997
1934
1996
1935
1995
1928
1937
1939
1926
1927
Every last one of them is either between 1925-39 or between 1994-2005.
Now let's look at the seasons in which the top 25 performances in individual doubles were set:
1931
1926
1936
1934
1932
1936
2000
1930
1923
2000
1935
1936
2002
1999
2002
1950
1937
2001
1899
1936
1997
2001
1977
1993
1996
The correlation isn't perfect, but it's damn close. One simply can't look at this and conclude that environmental context isn't an important consideration in assessing how "wowed" we ought to be when considering, say, Earl Webb's 67-double performance of 1931, or Joe Medwick's 64 in 1936. It might be the case, for example, that Frank Robinson's 51 doubles in 1962 are more worthy of a "wow". Without doing some contextual analysis, we can't be sure.
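One rough way to put a number on "damn close", using only the two lists above: count how many of the 25 individual seasons fall in a year that also appears on the league-wide 2B/G list. This is a simple overlap count, not a true correlation coefficient, and the variable names are just for illustration.

```python
# Years on the league-wide 2B/G list above
top_rate_years = {
    1930, 1932, 2004, 2000, 1931, 1929, 2005, 2003, 2001, 1936,
    1999, 1998, 2002, 1994, 1925, 1997, 1934, 1996, 1935, 1995,
    1928, 1937, 1939, 1926, 1927,
}

# Years of the top 25 individual doubles seasons above (repeats kept)
individual_years = [
    1931, 1926, 1936, 1934, 1932, 1936, 2000, 1930, 1923, 2000,
    1935, 1936, 2002, 1999, 2002, 1950, 1937, 2001, 1899, 1936,
    1997, 2001, 1977, 1993, 1996,
]

inside = [y for y in individual_years if y in top_rate_years]
outside = sorted(y for y in individual_years if y not in top_rate_years)

print(len(inside), "of", len(individual_years))
print(outside)
```

By this count 20 of the 25 come from years on the league list; the five that don't are 1899, 1923, 1950, 1977, and 1993.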
No, Steve. That's a different exercise. I am 100% sure that 51 doubles is pedestrian wrt what could have happened.
I am not interested in relativity, neither from you, nor Einstein for this work.
And I can't see how you could set scarcity standards in a context-free way. There may be some single-game stats -- brief moments of incredible brilliance -- where we could all agree context doesn't matter that much: 20 Ks, 27 BFP (perfect game), 12 RBIs. But at the season level, context will almost always matter. How could it not, if what we're trying to determine is how rare or unusual a performance is? A context-free scarcity standard is an oxymoron.
* * *
BTW, ERA+ is problematic in the way it handles context. It's a lot easier to post a great ERA+ in high-RS times. # of 200+ ERA+ seasons by decade:
1950s 1
1960s 2
1970s 1
1980s 1
1990s 7
2000s 5 (pro-rated)
Ironically, ERA+ probably overcompensates for high-RS periods.
1926
1936
1934
1932
1936
2000
1930
1923
2000
1935
1936
2002
1999
2002
1950
1937
2001
1899
1936
That's really odd. And quite the coincidence.
Precisely.
With no consideration of context, it is impossible to be anything close to 100% sure about that. Impossible.
This is what I am saying.
But at the season level, context will almost always matter. How could it not, if what we're trying to determine is how rare or unusual a performance is? A context-free scarcity standard is an oxymoron.
I don't agree. When you think about run environments, they are caused. They don't (generally) just exist. They are caused by external forces.
Because there *IS* an asymptote for ERA, you can only get so close. Gibson got to 1.12. Did anyone else get as close as 1.53? No. What about 1.56? No.
If you don't believe me, wrt the bounds, just go to the leaderboard for ERA at bb-ref.
The leaders today *are the same as* the leaders of the 80s, 70s, 60s, 50s, 40s, 30s, 20s.
The leader *has the same ERA*. You did a "ERA+" by decade. Check this:
ERA leader is less than 2:
20s 2
30s 1
40s 4 (2 due to WW2)
50s 1
60s 5
70s 3
80s 2
90s 5
00s 2 (already)
Sorry, you guys aren't selling me on the idea that the greatest ERA is significantly less achievable now than before. Particularly since really, just Gibson has broke on through to the other side.
I think there is just as good a chance now for someone to post a 1.5 ERA *as ever*.
And don't bother with probability calculations (unless they support my position), because the probability is already very small. Once it gets so small, "magic" happens - that is we have a Season of Greatness that cannot be contained by your spreadsheets.
The math says Gibson's season can't happen - but it did.
It appears I've done the impossible.
Unsinkable ships sink.
No, it doesn't. The math says Gibson's season is very unlikely to happen -- and given that it's happened just once so far in history, the math has been right about that. The math also says that the very lowest individual ERA is most likely going to occur in a very low-scoring season -- the math has also been right about that.
You can close your eyes, stick your fingers in your ears, and go "la la la la" all you want, but scarcity without any consideration of context remains an oxymoron.
That's nonsense Steve. Is the probability of Gibson's season (one in X) greater than the number of pitcher seasons? Do you know, or are you making something up?
You can close your eyes, stick your fingers in your ears, and go "la la la la" all you want, but scarcity without any consideration of context remains an oxymoron.
You can close your eyes, stick your fingers in your ears, and go "la la la la" all you want, but you are wrong; you just don't seem to be able to process it.
I'm fine with you not understanding.