3 pitching workhorses died during the making of this article.
Lucky would be winning the lottery. Meeting an engaging super-model in the supermarket checkout line. Having an off day on the road line up with an offer to play Pebble Beach.
But winning 13 games, as a rookie, for a team playing in the American League East?
“Yea, I just got lucky on the mound,” Jeremy Hellickson says dryly. “A lot of lucky outs.”
...The premise is based on a sabermetric calculation, called BABIP, which stands for batting average on balls in play, something essentially out of the pitcher’s control. The theory is that since Hellickson had such a low number — a major-league best .223, nearly 70 points below the league norm — he was more lucky than good.
“I hear it; it’s funny,” Hellickson said, not quite sure of the acronym. “I thought that’s what we’re supposed to do, let them put it in play and get outs. So I don’t really understand that. When you have a great defense, why not let them do their job? I’m not really a strikeout pitcher; I just get weak contact and let our defense play.”
...And that translates to the statistic that Hellickson feels matters the most.
“Wins are by far the most important stat,” he said. “You have a terrible day out there but as long as you win, you’re fine.”
Reader Comments and Retorts
1. Jacob Posted: March 14, 2012 at 11:26 PM (#4081124)
Most important TEAM stat... Sure, I guess.
Hellickson is not going to consistently post a .223 BABIP. No one ever does, not even extreme flyball guys. However, there's also a pretty good chance his peripherals improve (his walks and strikeouts anyway), thereby canceling out some of the expected regression in his BABIP.
I think that's something that's getting lost in a lot of the Hellickson talk. His minor-league peripherals were insane. He's not going to turn into J.A. Happ this season.
But if the theory was sound, it was inevitable. As the more advanced metrics and other discoveries make their way outside the basements and into the wider world of baseball, which always seemed like the goal, they were eventually going to make their way to the participants.
What do you know about this stuff? (please take this tongue in cheek as intended)
I would imagine the hope is that the participants would be educated on what those numbers really mean, where they come from, how they work, etc. The problem is that the small soundbite of the press and a quick question-and-answer session mean the question is going to be worded poorly (and possibly intentionally, to provoke a negative response), and the respondent will be on the defensive, because saying someone's good season wasn't that good (which is what they'll hear in the question) isn't the best way to get an honest response.
Good luck with that. These are athletes. You're welcome to try to educate them on SSS and the ins and outs of regression and what not, but most of them pretty much stopped listening when they heard "lucky."
And I'm not criticizing Voros in any way. I'm merely noting that as the work done by him and others in the field continues to take on greater significance in the baseball conversation, it is inevitable (and outside his control) that players would ultimately get in on the discussion. And honestly, the reason Jeremy Hellickson would come to the conclusion that his good season wasn't that good is because that's precisely what a lot of people are saying.
Statistics in player evaluation are simply tools to help us better understand how the abilities of the participants affect the outcome. All of them do so imperfectly, even the best ones. I think you evaluate Hellickson based on how he pitched and to the extent the statistics help you ascertain that, they are invaluable. But too often I think they are seen as stand ins for the player himself. Jeremy Hellickson is not a major league pitcher who got "lucky" by posting a 2.95 ERA, he's a pitcher that, when you look at the sum total of all the evidence available to you, is at least a serviceable MLB starter with some growth potential for more. The 2.95 is just a data point.
I agree, but education on most matters doesn't require intimate knowledge of the subject. (There is zero reason for a player to know about regression, unless he really wants to know.) BABIP can be summed up in one or two sentences that anyone willing to listen can comprehend. Sure, the summation wouldn't be 100% factually accurate, but it's good enough.
It's not like the basics of DIPS are that unusual from a pitcher's point of view. The pitcher controls opponent home runs, walks and strikeouts; the better your ability to control those three elements in your favor, the better your performance will be. Every time the pitcher argues about his ability to control hits, you just have to remind him of that little dribbler hit just past the fielder at short or wherever in his most recent game. (That works better than crediting the great defensive play that happened in the game: guys are more likely to accept something if you are allowing them to blame the failures on something else.)
Of course, pitchers to an extent also control the type of hits they allow (flyball vs. groundball), and it's easy enough to explain that fly balls are easier to catch but, when they don't get caught, are more likely to go for extra bases, while ground balls are less likely to be fielded but result in fewer extra-base hits. So in that respect the pitcher does have some control over the hits, at least from their point of view.
$SO = SO/AB
$HR = HR/(HR-SO)
$H = (H-HR)/(AB-HR-SO)
$EBH = (2B+3B)/(HR-SO)
$3B = 3B/(2B+3B)
The point of all that mess was to construct a batting line such that every one of those stats could be anything from 0 to 1 regardless of the results of the other stats (IOW, independence). The dollar signs were just my own shorthand at the time (early 1998 I guess) to differentiate column headings from the actual totals. I still do it. Had I known...
Timely!
James Joyce used to kill a horse for every page of Ulysses he completed -- and he double spaced.
Sorry, not totally following that... I think I get the goal of independence, but as written above, $HR and $EBH would generally be negative.
EDIT: Hmm, found this - maybe that denominator is AB-SO?
EDIT: Hmm, found this - maybe that denominator is AB-SO?
I think you're right, Joe. Neither HR/(HR-SO) nor (2B+3B)/(HR-SO) seem like they'd be useful for anything.
I've never quite gotten the "pitchers control HRs" thing. Yes, I know the defense can't control HR's, but the defense can't control hits high off the monster or line drives over the third baseman's head down the line either. If a pitcher throws a hanger to Jose Bautista and it goes 900 feet it's entirely his fault, but if he throws the same pitch to David Eckstein and it's off the wall in right center it's not?
Ideally you'd want to credit the pitchers for the automatic outs and debit them for the plays the defense could never make.
which is what zone-based defensive metrics try to do, after a fashion - by penalizing the fielders less for plays that have a low probability of being made. Of course, the probability of a play being made/not being made is variable; certain plays that would never be made by the defense become more or less routine when Ryan Howard is at the plate and the infield defense is overshifted, at the expense of certain other so-called "routine" outs. I once looked at David Pinto's 10 most valuable plays by a second baseman on video, and IIRC five of them were plays made by the 2B in short right field on grounders by a left-handed hitter into the teeth of an overshift. I venture to say that just about every second baseman, in that specific defensive alignment, makes those plays routinely.
And I'm sure that's why Ron said "ideally". We are still pretty far from the ideal world, IMO.
-- MWE
The Rays play the Mets in a series this year, so I'm sure Jason is happy to know Hellickson has been working on that.
The Rays had the best BABIP in the league though, by far. There was more distance between them and the number 2 team than there was between Nos. 2 and 16. I'll grant that .223 isn't sustainable, but on a team with defense that good, would it be unreasonable to expect Hellickson (who is obviously skilled) to come in around .250-.260 this year? I expect him to bump his K rate up significantly - if he comes in under 6/9 again I can't see him repeating 2011.
As it stands now, it is not unreasonable to suppose that if a pitcher can control his HRs, his fly balls and his ground balls, then perhaps he has some effect on his BABIP. How much, I'd like to know.
And these are generally professionals who take their craft quite seriously. Many keep meticulous records of their opponents' tendencies, their own performance, and even of the umpires. Once they get past the initial resistance to a new stat, I'd think they'd really appreciate the additional information provided.
$HR = HR/(AB-SO)
$EBH = (2B+3B)/(H-HR)
God I must have been very tired last night. :)
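For anyone who wants to check the corrected definitions, here is a minimal Python sketch of the five rates, using the AB-SO and H-HR denominators from the correction just above. The function name and the sample line are made up for illustration (not real data), and it assumes the pitcher has allowed at least one double or triple so nothing divides by zero.

```python
def dips_rates(ab, h, doubles, triples, hr, so):
    """Chain of conditional rates; each one lands between 0 and 1.

    Corrected denominators: $HR = HR/(AB-SO), $EBH = (2B+3B)/(H-HR).
    """
    return {
        "$SO": so / ab,                          # strikeouts per at-bat
        "$HR": hr / (ab - so),                   # homers per non-strikeout at-bat
        "$H": (h - hr) / (ab - hr - so),         # non-HR hits per ball in play
        "$EBH": (doubles + triples) / (h - hr),  # doubles and triples per non-HR hit
        "$3B": triples / (doubles + triples),    # triples per double-or-triple
    }


# Hypothetical line against, invented purely for illustration.
rates = dips_rates(ab=700, h=150, doubles=30, triples=3, hr=20, so=120)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With the original HR-SO denominators, the same inputs would give negative $HR and $EBH, which is the problem pointed out earlier in the thread.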
He does have some effect, it just isn't much and it's not enough to sustain a .223 BABIP. And some of the things he can do to positively affect his BABIP (like giving up more flyballs) tend to have negative consequences that often outweigh the benefits of the slightly lower BABIP.
My point all along is that because the range in BABIP is so small, and the influence of the fielders so large, couldn't we better address pitching ability by staying away from the defense-dependent stats altogether? That way knuckleballers and flyball pitchers and guys with high strikeout totals still get the credit for the tendency to have slightly lower BABIPs without our having to actually use BABIP (and the more general stats it affects, like ERA and W/L), which has a bunch of outside influences acting on it besides pitching ability. And then of course now with pitch f/x, we could go even further down that road of bypassing it.
Closers as a group seem to be a specific exception to the general rules (as a group they have a low BABIP without giving up enough extra bases to compensate).
EDIT: Standard deviation for BABIP within a year among pitchers with 162+ IP since 1962 is .022. (Still .022 in 2010.)
EDIT2: Voros, the Lahman database now has BFP. Not sure if you knew that or care any longer.
Not to take you out of context...but... :)
The Mathamatics of Poker has a chapter or so where they reduce poker hands to symbolic values on a scale of 0-1, with the scale representative of NAND strength. They use this scale to illustrate and prove Game Theory concepts across most types of poker.
What about over 1000 IP or 2000?
Indeed. I just switched to the droid razr phone and the spell check is killing me. I miss my iPhone.
get swiftkey
I don't happen to have career SD handy. Voros might. (I think it came up in the discussion way back when) I do know it's pretty small, and that it's primarily explained by the type of pitcher.
No, honestly asking.
Done, and provisionally very effective. Thanks much.
Someone in another thread suggested that BABIP has been increasing since the 70s. This is not true then? (Sorry, I forget the thread.)
Finally, what does the mean hover at? It's 0.29, yes?
And I've been suggesting for years maybe we should look at SLGIP instead but nobody has to my knowledge.
What about over 1000 IP or 2000?
As sample size goes up, SD goes down.
For any rate stat, the binomial distribution will usually get you close enough. Sometimes it's a little off if there are some extremes in the distribution. The binomial is the number of successes in n trials given a probability of success p (under an assumption of the independence of trials ... which is a little dicey in the case of baseball but probably close enough over a lot of batters ... and a lack of independence will likely decrease the variance in this case).
VAR (in count terms) = p*(1-p)*n; in proportion terms it's p*(1-p)/n
So for BABIP = .3 and 1000 BIP, the variance of the binomial is 210. Take the square root to get the SD, which is about 14.5. Divide that by 1000 to turn it into a proportion if you want and it's .0145. That's what you'd expect assuming constant p across all pitchers. (Note, 162 IP is, what, 500 BIP? That's an SD of about .02, which is roughly what Ron reports. The remaining .002 would be "true talent" variation. I assume it's bigger than that but you get the idea: the SD under an assumption of randomness is not far off the observed SD, which is evidence of little variation in true talent.)
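A rough Python sketch of that arithmetic, for anyone who wants to plug in other BIP totals. The function names are invented for illustration, and the 500 BIP figure for a 162 IP season is the same loose guess as above. The last helper goes one step beyond the comment: it assumes random and true-talent variation are independent, so their variances (not their SDs) add, which gives a somewhat larger talent SD than simply subtracting .020 from .022.

```python
import math


def variance_of_hit_count(p, n):
    """Binomial variance of the number of hits in n balls in play."""
    return n * p * (1 - p)


def sd_of_rate(p, n):
    """Binomial SD of the observed rate (hits / n), i.e. sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def implied_talent_sd(observed_sd, random_sd):
    """True-talent SD left over if the two sources of variance simply add."""
    return math.sqrt(max(observed_sd ** 2 - random_sd ** 2, 0.0))


# SD of BABIP from luck alone at several sample sizes, assuming p = .300.
for bip in (500, 1000, 3000, 6000):
    print(bip, round(sd_of_rate(0.3, bip), 4))

# Observed single-season SD of .022 vs. the luck-only SD at roughly 500 BIP.
talent_sd = implied_talent_sd(0.022, sd_of_rate(0.3, 500))
print("implied true-talent SD:", round(talent_sd, 4))
```

Under those assumptions the implied talent SD comes out at about .008, which is still small next to the luck-only .020.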
The GB/FB thing suggests we have a "mixture" distribution -- i.e. (in the extreme) two populations with slightly different distributions of BABIP. Variation in GB/FB might be sufficient to explain the "true talent" variation that is observed. Then you get variation in defensive quality.
Short reliever/starter is potentially due to the difference between pacing yourself and letting it all hang out for one inning.
Someone in another thread suggested that BABIP has been increasing since the 70s. This is not true then?
Depending on the distribution, a shift in the mean does not necessarily mean a shift in the variance/SD ... although there is usually some positive relationship between the two. In the case of the binomial, there certainly is, but it's quite trivial over a small range of p and is quite trivial over even a large range of p from about .3 to .7.
So for BABIP .. if it's .28 then p(1-p)=.2016; for .29 it's .2059; for .3 it's .21. So the SD won't change substantially for any reasonable values of "true" BABIP.
Didn't the Rays just improve their defense considerably by adding Pena back instead of relative stiffs at 1B last year and having Jennings in the OF for the entire season?
I was also interested in whether the mean has changed over time, although that was not clear from the way I asked it.
If the mean has changed, then this theory probably is not fully formed. Also the GB/FB thing does argue against the notion that pitchers have no control over BABIP.
Why? The theory is that pitchers have no control over BABIP, not hitters, or fielders. An increase in BABIP points to an increase of talent density for hitters, or a decline in fielding. Or possibly external factors that inflate BABIP (parks with less foul ground, lower pitching mound, etcetera).
Also note that "little" is not the same as "no" when it comes to control over BABIP. Voros has always argued for "little."
No, that was Finnegans Wake.
Of course, by that time the young Samuel Beckett was often on hand to assist in the slaughter.
Or, as I'll point out for the 2nd time in this thread and umpty-umpth time since shortly after Voros published his original work -- MAYBE what we want is SLGIP not BABIP. FB pitchers give up slightly fewer hits but presumably more of those hits are doubles and triples.
But also don't make too much of that anyway. There are not 100% FB pitchers. I don't know what the respective BABIPs are but, for demonstration purposes, assume 310 for GB and 270 for FB. Current GB/FB ratios are about 4/5 but 7.4% of FBs are HRs so maybe closer to 8/9. An extreme FB pitcher is probably around 6/10 and that's a 285 BABIP; an extreme GB pitcher might be 7/6 and that's 292 (league average in this scenario is 289). So even there we're talking just 7 extra hits per 1000 BIP. Last year Hellickson had 560 BIP and a .55 GB/FB.
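The blended numbers above can be checked with a few lines of Python. This is only a sketch of the same back-of-envelope model: the .310 and .270 figures are the demonstration values assumed in the comment, not measured BABIPs, the function name is invented, and line drives are ignored entirely.

```python
def blended_babip(gb, fb, gb_babip=0.310, fb_babip=0.270):
    """Weighted BABIP for a pitcher whose balls in play split gb:fb."""
    return (gb * gb_babip + fb * fb_babip) / (gb + fb)


league = blended_babip(8, 9)       # roughly league-average GB/FB mix
extreme_fb = blended_babip(6, 10)  # extreme flyball pitcher
extreme_gb = blended_babip(7, 6)   # extreme groundball pitcher

print(f"league {league:.3f}, FB {extreme_fb:.3f}, GB {extreme_gb:.3f}")
print(f"gap per 1000 BIP: {(extreme_gb - extreme_fb) * 1000:.1f} hits")
```

Even between the two extremes the gap is about 7 hits per 1000 balls in play, so batted-ball mix alone gets nowhere near a .223.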
If you wanted to get this point across to a Hellickson fanboy, point out that Maddux's career BABIP was 286 and his best season was 248, and nobody thinks Hellickson is light years better than Maddux. While Maddux is evidence that a pitcher can beat the odds :-), it's also evidence that Hellickson ain't gonna sustain anything remotely close to 223.
Back to my #37 and "standard deviation goes down with sample size" -- you also have population selection going on. As they weed out (fairly or unfairly) crappy BABIP pitchers, the spread of true talent should decline as well, along with the mean BABIP, as you increase the BIP threshold. So, less random variation in your rate estimate due to larger sample size; less random variation for any given sample size due to lower mean rate; less true talent variation due to dropping the bottom end of the distribution. Those should inevitably add up to lower variation ... or something very odd is going on.