I’m not going to mention a certain Twins catcher by name, but I do want to make a distinction between the two concepts above because it seems in our discussion of said catcher, they have been confused.
...The question is - at what point do we conclude that the coin is *not* evenly weighted? Well, it’s not simply a matter of how many times you flip the coin, i.e., sample size. It’s *also* a matter of the *magnitude of the deviation*. If you get 16 heads and four tails, I wouldn’t be so sure the coin is unevenly weighted. But if you get 20 heads and no tails, it’s almost certainly so. (The odds of that are less than 1 in a million.) In both cases, the sample size is quite small, but in the latter, it’s more than sufficient. When the coin lands heads at an 80-percent clip, you need a larger sample to determine the coin is rigged because the magnitude of the deviation from the baseline (50/50) is smaller. And if the coin lands heads at a 55-percent clip, you need an even larger number of flips to determine whether it’s rigged.
So understand that the sample size is only one of two factors in determining the significance of the outlier. The other is the magnitude.
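To put rough numbers on the coin example, here is a minimal sketch that computes the exact binomial chance of getting at least a given number of heads from a fair coin. The function name `tail_prob` and the extra 55-percent cases are illustrative choices, not anything from the original argument.

```python
from math import comb

def tail_prob(heads, flips, p=0.5):
    """Chance of at least `heads` heads in `flips` flips when P(heads) = p."""
    return sum(
        comb(flips, k) * p**k * (1 - p) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# (heads, flips) pairs: the two 20-flip cases from the text, plus two
# hypothetical 55-percent cases at different sample sizes.
cases = [(20, 20), (16, 20), (55, 100), (550, 1000)]

for heads, flips in cases:
    share = heads / flips
    print(f"{heads}/{flips} heads ({share:.0%}): P = {tail_prob(heads, flips):.3g}")
```

The same 55-percent clip is far less likely from a fair coin over 1,000 flips than over 100, which is the sample-size half of the argument; the 20-for-20 case shows the magnitude half.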
That’s why when you see Verlander strike out 60 batters in 44 IP or Joe Mauer (f*** it, I’ll mention him) hit 11 home runs, you cannot simply say, “it’s only one month, I’m not a believer” without also considering the magnitude of the deviation.
Is 60 K in 44 IP a big enough magnitude to make one month significant? Is 11 home runs? In my opinion, yes. But whatever your opinion, you must address both factors if you’re going to get a good gauge of whether it’s dumb luck or a new baseline.
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. bpasinko Posted: June 02, 2009 at 09:05 PM (#3203748)
You know, there is the whole discipline known as statistics and, you know, it might have occurred to one or two statisticians that this is a question worth answering.
He's stumbled into "the difference between two proportions." For quick and dirty, we'll assume they're independent -- you could certainly argue they're not since it's Mauer in both samples but we're only assuming that the old PAs are independent of the new PAs conditional on it being Mauer. You could also argue neither of these is a sample since we're looking at the entire population of Mauer PAs but then I'd have to do some hand-waving about super-populations and nobody wants that.
The difference is p1-p2 and its standard error is:
sqrt [ p1(1-p1)/n1 + p2(1-p2)/n2]
and here we get a z over 5 so yes it's "significant" except ....
The issue here is that this is a post-hoc test. If you'd hypothesized beforehand that Mauer would come back with a higher HR rate and now wanted to test your conclusion, this would be a legit test. But the chances that at least one player in MLB would see a huge spike in HR rate in May is a lot higher.
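For anyone who wants to try the difference-of-two-proportions check, here is a rough sketch. The inputs are not given in this comment, so the example borrows the HR and PA totals quoted further down the thread (56 HR in 2514 PA career, 12 HR in 126 PA this year) and treats them as two independent samples, which is an assumption. The `pooled` option is also an addition: it swaps in the common pooled-proportion standard error, which is a different formula from the one written above.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2, pooled=False):
    """z statistic for the difference p2 - p1 between two proportions.

    pooled=False uses sqrt(p1(1-p1)/n1 + p2(1-p2)/n2), as in the comment.
    pooled=True uses one combined proportion for both samples (an
    alternative form, not the one stated in the thread).
    """
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        p = (x1 + x2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1) / se

# Totals borrowed from later in the thread; treating them as separate,
# independent samples is an assumption.
career_hr, career_pa = 56, 2514
season_hr, season_pa = 12, 126

print("unpooled z:", round(two_proportion_z(career_hr, career_pa, season_hr, season_pa), 2))
print("pooled z:  ", round(two_proportion_z(career_hr, career_pa, season_hr, season_pa, pooled=True), 2))
```

With these borrowed totals the unpooled form comes out a little under 3 and the pooled form a little over 5, so the answer is sensitive to which standard error and which PA counts you use. This is not a reproduction of the exact figure quoted above.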
Still, it's over a 5-sigma variation which does suggest that this is a "process out of control" but all that tells us is that the "failure" rate (where each HR is a "failure") is higher than it used to be. The question then becomes what's the new rate. It will be several hundred PA before we'll have a good estimate of that but I'll just point out that Adrian Gonzalez had 11 HR in May in 5 fewer PA so there's no reason to think Mauer's new level of talent is any higher than Gonzalez's (who, to this point, has been a 25-35 HR guy though obviously he's on a torrid pace right now).
And as always, I point folks towards Steve Dillard, Aug 2-17 1979 -- this sort of thing can happen "randomly."
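The post-hoc point can be sketched the same way. If any single hitter has a small chance of a fluke month, the chance that at least one hitter in the league has one is much larger. The per-player probabilities and the count of 300 hitters below are made-up illustrative values, not estimates from real data, and the calculation assumes hitters are independent of each other.

```python
def prob_at_least_one(p_single, n_players):
    """Chance that at least one of n independent players has the fluke month."""
    return 1 - (1 - p_single) ** n_players

# Hypothetical values for illustration only.
n_players = 300
for p_single in (0.0001, 0.001, 0.01):
    league_wide = prob_at_least_one(p_single, n_players)
    print(f"per-player chance {p_single}: league-wide chance {league_wide:.3f}")
```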
This made me laugh pretty hard.
Really Walt, who the Christ is Steve Dillard? You're showing your age here, you should have used Shane Spencer :-P
**whoever he is, or was
I absolutely love that I can go to BBRef, use the "Random Page" function, pull up a name like "Ed Sixsmith," post it here, and within a few hours Steve Treder or someone will give me his entire life story. (And it will be fascinating.)
I was going to tell a story about Rich Schu moving Mike Schmidt off third base, but I suspect you already know, ...um, "Kevin".
I was asking this question wrt pitcher-batter matchups. What's the tipping point?
5 sigma. In 20 PAs what is that? 15 hits?
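A quick way to check that guess is to take the binomial mean plus five standard deviations. The .270 hit rate per PA below is a hypothetical baseline picked for illustration, and with only 20 trials the normal-style "sigma" reading is a loose approximation at best.

```python
from math import sqrt

def sigma_threshold(n, p, sigmas=5.0):
    """Hits needed to sit `sigmas` standard deviations above the binomial mean."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean + sigmas * sd

# Hypothetical baseline: a hitter who gets a hit in 27% of his PAs.
threshold = sigma_threshold(n=20, p=0.27)
print(f"5 sigma in 20 PA is about {threshold:.1f} hits")
```

Under that assumed baseline the threshold lands a bit above 15 hits, so the guess is in the right neighborhood.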
Perhaps not a monster, but you do have the aroma of peanuts about you.
I'm a Cubs fan. When a guy, a complete nobody, hits like .600 for 2 weeks with half-a-dozen HRs or whatever, in an otherwise completely pointless season, it sticks sometimes. And it comes in handy when somebody says "[actual good player] has been on fire, this can't be random."
Granted, I don't have the year and dates committed to memory, I used to go to retrosheet and now b-r's game log to check.
Now do you wanna hear about the month or so when I was convinced that Scot Thompson was gonna be the next George Brett?
Mauer in his career has 56 HR in 2514 PA, or 13 per 600 PA.
This year, he has 12 HR in 126 PA, or 57 per 600 PA.
Absent other information, from today to the end of the season, his HR rate will be far closer to 13 per 600 than 57 per 600.
How much though? This study from five years ago (post 2) suggests adding 131 PA of league average HR rate to your known information. So, if all we knew was the Mauer info of this year, his "true" HR rate would be about halfway between what he has hit (57 per 600) and what the league average is (16? per 600), or 36 or so per 600 PA.
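The arithmetic behind that "36 or so" can be sketched as follows. The 131 PA of ballast and the league rate of 16 per 600 are the figures from this comment (the league rate was itself flagged as uncertain); the function name is just for illustration.

```python
def regressed_rate_per_600(hr, pa, league_per_600=16.0, ballast_pa=131):
    """Blend observed HR with `ballast_pa` plate appearances of league-average HR rate."""
    league_rate = league_per_600 / 600
    blended_rate = (hr + ballast_pa * league_rate) / (pa + ballast_pa)
    return 600 * blended_rate

# This year's line only: 12 HR in 126 PA, ignoring the career record.
estimate = regressed_rate_per_600(hr=12, pa=126)
print(f"regressed estimate: {estimate:.1f} HR per 600 PA")
```

Because 126 PA and 131 PA are nearly equal, the estimate lands about halfway between the observed rate and the league rate.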
However, for Mauer, we also have his past career, which shows him to not be a HR hitter. We can't simply discard that information (unless of course we have some reason to believe that the Mauer of old approaches batting differently than the Mauer of new).
Just taking a WAG, if we consider the Mauer of old providing a good prior, that the Mauer of new is showing a strong indicator, and we always have our trusty friend regression toward the mean, my WAG is that Mauer will hit 25 per 600 PA from now to the end of the season.
Of course, being a binomial means that I'm 95% sure that it's 25 per 600 PA, plus/minus 12 HR. Which basically means that whatever I say, or anyone says, insofar as Mauer is concerned, will be practically untestable.
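For a sense of where a margin like that comes from, here is a rough sketch of the binomial spread around a 25-per-600 rate. The comment does not say how many PA are left in the season, so the 400 used below is a hypothetical figure, and the interval uses the usual normal approximation.

```python
from math import sqrt

def hr_half_width_per_600(rate_per_600, remaining_pa, z=1.96):
    """Approximate 95% half-width, in HR per 600 PA, for a binomial HR rate."""
    p = rate_per_600 / 600
    sd_hr = sqrt(remaining_pa * p * (1 - p))  # spread in raw HR count
    return z * sd_hr * 600 / remaining_pa     # rescaled to a per-600 basis

# Hypothetical: about 400 PA left in the season.
half_width = hr_half_width_per_600(rate_per_600=25, remaining_pa=400)
print(f"25 per 600 PA, plus/minus about {half_width:.1f}")
```

With 400 PA assumed, the half-width comes out near 12 per 600, and it gets wider as the remaining PA count shrinks.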