Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Monday, July 27, 2009

The Book Blog: mgl: Most people still don’t understand the concept of regression toward the mean…

Dig, it!

There is a thread on BTF about Sabathia’s “numbers,” particularly his BB and K rates, being down this year, as compared to last year, although he is still pitching very well of course.

While the quality of the posts on BTF is nowhere near that of this blog (although they beat us easily in quantity), there are some reasonably intelligent regulars on that site (if anyone interprets that as a dig, it is).

...The thing that people don’t understand (actually one of the things) about regression toward the mean in baseball is that the reason any above or below average player will always regress, on the average, towards average, is that they were not really as good or bad as we thought in the first place, based on any of their stats.  That goes for Sabathia, Halladay, Bonds, Chipper Jones, etc., etc.  Chipper Jones is not as good as his career stats tell us, even after you do all the appropriate adjustments.  Same for Halladay.  And Sabathia.  And everyone else who has been above average and we think has true talent X.  When I say “as we think” I mean as their stats suggest, not as we think based on a credible projection which already does the regression.  And of course, there is some chance that any given player is better than his prior stats - it is just that the chance of him being worse is greater than the chance of him being better.  That is ALWAYS the case, as long as we properly define the mean for that player.

That is the KEY to understanding regression toward the mean and is what most people don’t understand, even if they think they understand the concept.
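
The claim above can be checked with a small simulation. This is only a sketch under made-up assumptions: the league mean OBP (.330), the spread of true talent (.030), the 500 PA per player, and the function name `simulate` are all illustrative choices, not figures from the post. Each simulated player gets a fixed "true" on-base probability, we observe one season, and then we look only at the players whose observed stats put them in the top tenth.

```python
import random

def simulate(n_players=2000, pa=500, mean_talent=0.330, sd_talent=0.030, seed=1):
    """Return (mean observed OBP, mean true OBP, share whose true OBP is below observed)
    for the top 10% of players ranked by observed OBP. All parameters are illustrative."""
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        # Hypothetical true talent: a fixed on-base probability for this player.
        talent = min(max(rng.gauss(mean_talent, sd_talent), 0.0), 1.0)
        # One observed season: each PA is an independent trial at that probability.
        times_on_base = sum(rng.random() < talent for _ in range(pa))
        players.append((times_on_base / pa, talent))
    # Select players the way fans do: by what their stats say.
    players.sort(reverse=True)
    top = players[: n_players // 10]
    observed = sum(o for o, _ in top) / len(top)
    true = sum(t for _, t in top) / len(top)
    worse = sum(t < o for o, t in top) / len(top)
    return observed, true, worse

observed, true, worse = simulate()
print(f"top decile observed OBP: {observed:.3f}")
print(f"top decile true OBP:     {true:.3f}")
print(f"share truly worse than their stats: {worse:.0%}")
```

Under these assumptions the selected group's true talent sits between the league mean and their observed numbers, and most (not all) of them are worse than their stats suggest, which is the "more likely worse than better" point in the excerpt.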

Repoz Posted: July 27, 2009 at 12:19 PM | 244 comment(s)
  Tags: baseball geeks, projections, sabermetrics, site news

Reader Comments and Retorts

Page 2 of 3 pages
   101. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 04:59 PM (#3268571)
I'm defining true talent as "the innate physical, emotional and mental skill that surfaces in some combination in the face of competition at a particular moment in time and space".


Correct me if I'm wrong, but you seem to treat each of a player's at-bats (say) as the roll of a die in which each of the faces (single, walk, K) is weighted in certain proportions. The weights change over time, but slowly enough that you can aggregate at-bats to determine the weights of the different outcomes. This is a player's 'true talent'.

Which is fine as a simplifying assumption as an aid in statistical modeling. But it's exactly that, a simplifying assumption. That doesn't make it a concrete thing (which is why I dislike the adjective 'true' which implies something real and authentic, as opposed to what a player actually does on the field, which is not true). I don't think you can go around talking about a player's 'true talent' as if it really exists.

Consider how absurd this sounds when applied to other areas of human endeavor. Would you say that there is only 99.99(...)% chance that Mozart had more 'true musical talent' than, say, William Hung?

Let me channel Linda Richmond:

"True talent" is neither true nor a talent. Discuss.
   102. Bring Me the Head of Alfredo Griffin (Vlad) Posted: July 27, 2009 at 05:01 PM (#3268572)
"Dicks shouldn't point their dicks at other dicks lest their own dick be turned against them."

That's from Leviticus, right?
   103. CFiJ Posted: July 27, 2009 at 05:02 PM (#3268573)
The Book is available for free reading from Amazon's Look Inside. I suggest reading the last few pages of Chapter 1. After you do that, let me know if you have more questions.
I did the above and, well, you've just made another sale. "The Book" was long a book I thought it would be nice to have, but I just wasn't motivated to actually go out and get it. After reading various pages on Amazon, I placed an order with overnight shipping. How much do you guys get in royalties? Be thankful for the weak dollar/strong yen.

Tango and mgl are d-i-c-k-s.
That's just weird. MGL has on occasion made points in a dickish way, but I can't think of a time when Tango has been anything but smart, helpful, and even-handed. He may be an archdick in real life, but the Tom Tango/Tangotiger internet avatar is the perfect gentleman scholar.
   104. Backlasher Posted: July 27, 2009 at 05:06 PM (#3268580)
I'm defining true talent as "the innate physical, emotional and mental skill that surfaces in some combination in the face of competition at a particular moment in time and space".

So, true talent "exists" because the definition forces its existence.


I think that is the crux of the problem you have with Sur Peaches. If that is your definition, then true talent is an OUTPUT that has occurred previously in time. It is not an ontological composition of the person. Consequently, not only CAN the true talent change from second to second, it DOES change from event to event.

As the phrase is used, "True Talent" is being defined as "the potentiality of a certain output based on a person's then-innate physical, emotional and mental skill that surfaces in some combination in the face of competition at a particular moment in time and space." Even with that definition, the still remaining issue is usually whether it's an ontological characteristic or just a phenomenological result. If it's the latter, then it's only meaningful over a specific interval of time, and not including that in the statement can lead to incorrect premises, conclusions and analysis.

There is a big difference in saying, "I expect Sabathia to perform at X level over the next Y months because the Z output in A time period does not give me enough information to think he will not approach Q, and Q is correct because R." and saying "Sabathia will always do X because it is innately part of who he is."

If the second statement is understood as the first statement, there isn't a problem. There is a problem when the crux of the argument is whether Z output over A time period gives you enough information to revise your projection. Then the use of True Talent Level and "regression to the mean" is going to hamper discussion.

In fact, it usually leads to someone teaching sophomore level statistics trying to "explain" things.

Unless I am incorrect, the point of contention is precisely over whether Z output (Sabathia's BB rate, K rate, etc.) gives enough information to change a forecast. This is precisely a case where I think the terms of art obscure the real debate.
   105. Rickey! trades in sheep and threats Posted: July 27, 2009 at 05:10 PM (#3268585)
This is what we deal all the time: we simply do not know the true of anything

Unless you're limiting yourself to future projections this is just silly. The thing-in-itself is just sitting there staring you in the face and you're wonking out over the impossibility of certainty in probabilistic models? Dude, Barry Bonds was better than Neifi Perez. Was. Is. Will forever be. If you can't know that for true there's little point in asking you for knowledge.
   106. rr Posted: July 27, 2009 at 05:11 PM (#3268589)
While the quality of the posts on BTF is nowhere near that of this blog (although they beat us easily in quantity),


Yeah, well, our music, movie, NBA, and political threads kick their asses. Plus we have Cookie Monster, and they don't.
   107. Shooty Survived the Shutdown of '14! Posted: July 27, 2009 at 05:15 PM (#3268594)
That's from Leviticus, right?

The Book of Schlongs, actually.
   108. Backlasher Posted: July 27, 2009 at 05:25 PM (#3268610)
The Book of Schlongs, actually.

Is that part of the Apocrypha?
   109. villageidiom Posted: July 27, 2009 at 05:31 PM (#3268617)
How certain are you that the coin that came out with 600 heads is in fact the weighted coin? Are you 100% certain, or are you Bonds-is-truly-better-than-Neifi certain?
I'm ALWAYS certain it's the weighted coin. But I mean "ALWAYS" in the MGL sense, which is "not always".
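
For readers who want to put a number on "ALWAYS but not always": here is a rough Bayes' rule sketch. The coin setup was laid out on an earlier page and is not reproduced here, so everything in it is an assumption for illustration: two candidate coins (a fair one and one weighted to land heads 60% of the time), 1000 flips, 600 heads, and even prior odds. The function name `posterior_weighted` is likewise hypothetical.

```python
import math

def posterior_weighted(heads, flips, p_weighted=0.6, p_fair=0.5, prior_weighted=0.5):
    """Posterior probability that the flipped coin is the weighted one.
    Assumed setup (not from the thread): exactly two candidate coins."""
    tails = flips - heads
    # Log-likelihood of the observed flips under each coin.
    # The binomial coefficient is the same for both, so it cancels.
    log_w = heads * math.log(p_weighted) + tails * math.log(1 - p_weighted)
    log_f = heads * math.log(p_fair) + tails * math.log(1 - p_fair)
    # Posterior log-odds = prior log-odds + log-likelihood ratio.
    log_odds = math.log(prior_weighted / (1 - prior_weighted)) + log_w - log_f
    return 1.0 / (1.0 + math.exp(-log_odds))

p = posterior_weighted(heads=600, flips=1000)
print(f"P(weighted coin | 600 heads in 1000 flips) = {p:.10f}")
print(f"remaining doubt = {1 - p:.2e}")
```

With these assumed inputs the answer is a long string of nines but still short of 1, which is the distinction being argued about in the thread.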
   110. Baseballs Most Beloved Figure Posted: July 27, 2009 at 05:40 PM (#3268644)
About MGL supposedly being a d-i-c-k: doesn't he also spend $10,000 of his own money every year to purchase the data from STATS he uses to construct UZR, and then make the UZR calculations freely available? I think that says a lot more about the man than a few blog posts that are perhaps more curt than they need to be.
It may just say that he has a lot of money to spend as he wishes.
   111. BFFB Posted: July 27, 2009 at 05:47 PM (#3268663)
Unless I am incorrect, the point of contention is precisely over whether Z output (Sabathia's BB rate, K rate, etc.) gives enough information to change a forecast. This is precisely a case where I think the terms of art obscure the real debate.


it would make me look for factors that could be causing it. basic engineering; when your validated model doesn't match plant data start looking at factors which are known to cause a deviation from the ideal system. assuming reality will suddenly start matching the model is dangerous. complex systems do funky things for reasons that aren't always apparent.
   112. Tango Posted: July 27, 2009 at 05:50 PM (#3268666)
How much do you guys get in royalties?


We get 15% of whatever the merchants pay the publisher, which basically means about $1.50. Amazon probably takes home the biggest share. When we self-published, we probably took in $5 or $6 per copy. Thanks for the support!

Dude, Barry Bonds was better than Neifi Perez. Was. Is. Will forever be. If you can't know that for true there's little point in asking you for knowledge.


At what point did you go from being 99.99% sure that Bonds was better than Neifi to 100% sure?

If the assertion is that I must be able to answer with 100% certainty (rather than 99.9999999%) that Bonds' true talent level was greater than Neifi Perez's in order for someone to get any benefit out of talking with me, I'll reject that out of hand.

If on the other hand I have to meet that threshold for you in particular, then that's your choice.
   113. Rickey! trades in sheep and threats Posted: July 27, 2009 at 05:57 PM (#3268681)
If the assertion is that I must be able to answer with 100% certainty (rather than 99.9999999%) that Bonds' true talent level was greater than Neifi Perez's in order for someone to get any benefit out of talking with me, I'll reject that out of hand.

(edited for clarity)

The assertion is that the difference between 99.9999999% and 100% is meaningless, that bringing it up amounts to making a semantic point simply to protect your statistical bona fides, and that doing so is wonkish ass-hattery taken to an absurd extreme. What benefit could you possibly bring to a human conversation by making that distinction? Barry Bonds' true talent level is one of the greatest hitters in the game. Neifi Perez' true talent level is Neifi Freakin' Perez. There is no doubt or question about this, not even 0.0000001% uncertainty. Both players have played their last. The facts are on the table. There is no future state probability to account for. Barry Bonds was a gazillion times better than Neifi Perez. To refuse to state the obvious is ridiculous.
   114. Backlasher Posted: July 27, 2009 at 05:59 PM (#3268685)
it would make me look for factors that could be causing it. basic engineering; when your validated model doesn't match plant data start looking at factors which are known to cause a deviation from the ideal system. assuming reality will suddenly start matching the model is dangerous. complex systems do funky things for reasons that aren't always apparent.

I agree.

That has been my contention in most of these systems.

The real immediate issue is whether the BB rate gives you enough information to change the projection. I gather mgl's answer is no. Based on current knowledge, I would likely agree with mgl. As that bb rate approaches infinity, the answer more assuredly becomes "yes, something is furcked up with Sabathia."

The next and better question isn't the nebulous line where we go from No to Maybe to Yes. Those things require too many events to determine on the present numbers alone. The question is what you allude to:

When does the change in BB rate give me signal to check other criteria to see if something is ######## with Sabathia?

As long as we keep the "your an idiot" "No, your an idiot" back and forth, you miss taking these next steps.

Now it's entirely possible that mgl doesn't care or doesn't want my or anyone else's participation in that next step. That is his right. But out here, where Tango is engaging us, it would be nice to get to the good questions. If not, we'll just have to get Shooty to settle the debate from the Book of Schlongs.
   115. Tango Posted: July 27, 2009 at 06:06 PM (#3268704)
What benefit could you possibly bring to a human conversation by making that distinction?


That there is uncertainty about everything, and our job is to state our analysis with an uncertainty level.

There is a time and place for everything. In this thread, the discussion is about regression and uncertainty. So, it's important to note that there are no "when do you know for sure" claims.

If it was a more general "Is Rickey a better ballplayer than Rice", then I'd say, "yes, for sure". But, if someone says "how sure are you", I'm not going to say 100%. How precise you want the answer is how precise I can give the answer.

I see it all the time: how many PA do you need until you know that Mauer/2009 or Chipper/2008 is giving us a real change in performance? This is the point. There is no magic point. It's a continuum that never reaches the endpoint.
   116. Shooty Survived the Shutdown of '14! Posted: July 27, 2009 at 06:08 PM (#3268710)
If not, we'll just have to get Shooty to settle the debate from the Book of Schlongs.

The answers in the Book of Schlongs usually end messily. Hopefully we can come to a cleaner solution.
   117. Tango Posted: July 27, 2009 at 06:10 PM (#3268717)
Sam: I am 100% certain that Barry Bonds performed(*) better than Neifi Perez.

(*) The distinction is that we are interpreting actual data, and it is in the past-tense.
   118. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 06:14 PM (#3268726)
The distinction is that we are interpreting actual data


What else is there to interpret?
   119. Rickey! trades in sheep and threats Posted: July 27, 2009 at 06:16 PM (#3268727)
If it was a more general "Is Rickey a better ballplayer than Rice", then I'd say, "yes, for sure". But, if someone says "how sure are you", I'm not going to say 100%. How precise you want the answer is how precise I can give the answer.

And that's fine. Wonkish ass-hattery has its place in the world. Even Dial is useful on occasion. For every season, turn, tern, turn. Etc. All I'm calling out here is the example of Bonds vs Perez. There is no uncertainty there. Just none. Uncertainty and precision only enter the conversation for extremely similar players or future state projections. If I'm trying to decide whether or not to keep Ryan Church or Kelly Johnson as Matt Diaz's more useful half next year, I want uncertainty and precision. If someone's pulling a millionth percentage point out over Perez/Bonds I'm calling ass-hattery.

I love the word ass-hattery. It's awesome.
   120. Tango Posted: July 27, 2009 at 06:18 PM (#3268731)
The real immediate issue is whether the BB rate gives you enough information to change the projection. I gather mgl's answer is no. Based on current knowledge, I would likely agree with mgl.


At the risk of being too precise, every time CC throws a pitch there is "enough" information to change his forecast.

If the question is when is it enough to change his forecast from 2.1 to 2.2 BB / 9IP, then that is more quantifiable. Let's take something easier, like OBP. Say you have someone who had 2000 PA with an OBP of .360, and now after 250 PA, he has been performing with an OBP of .560. What do you do?

The 2000 PA will probably be weighted at say 65%. So, we've got 1350 weighted PA at an OBP of .360. We have 250 PA at an OBP of .560. And we need to add another 200 PA at an OBP of .335 for regression toward the mean reasons (that is, we are not 100% sure that his .360 is his true talent, or true quatlus if the other word bothers anyone). Add it up, and we get .385. That is our best estimate of how he will perform in the foreseeable future. We are of course not 100% certain of that estimate.
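
The arithmetic in that example is a PA-weighted average of three pieces: the discounted history, the recent sample, and a block of league-average "ballast" PA. A minimal sketch follows; the helper name `blended_rate` is made up for illustration, and the inputs are the ones quoted in the comment. One wrinkle: 65% of 2000 PA is 1300, while the comment works with 1350 weighted PA (a weight of 67.5%), so both versions are shown.

```python
def blended_rate(components):
    """PA-weighted average of (weighted_pa, rate) pairs."""
    total_pa = sum(pa for pa, _ in components)
    return sum(pa * rate for pa, rate in components) / total_pa

# Numbers as quoted in the comment: 1350 weighted PA of history at .360,
# 250 recent PA at .560, and 200 PA of regression ballast at .335.
estimate_quoted = blended_rate([(1350, 0.360), (250, 0.560), (200, 0.335)])

# Same calculation taking the stated 65% weight on 2000 PA literally.
estimate_65pct = blended_rate([(0.65 * 2000, 0.360), (250, 0.560), (200, 0.335)])

print(f"using 1350 weighted PA: {estimate_quoted:.3f}")
print(f"using 65% of 2000 PA:   {estimate_65pct:.3f}")
```

Either way the estimate lands at roughly .385 to .386: the 250 PA at .560 move the forecast up from .360, but nowhere near .560.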
   121. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 06:26 PM (#3268741)
I guess I reject the notion that my every activity is simply a probabilistic representation of the 'real me' inside, determined by statistical processes. But then I'm a behaviorist in other fields.

If all we had of Beethoven's oeuvre was the first twelve bars of the 5th symphony, would we regress that to the mean and conclude that Beethoven was probably a slightly above average composer?
   122. Tango Posted: July 27, 2009 at 06:28 PM (#3268746)
What else is there to interpret?


You missed the "and" in my condition. If you are going to quote me, quote my whole sentence:

The distinction is that we are interpreting actual data, and it is in the past-tense.


To answer your question, you can interpret actual data for forecasting. You can interpret actual data to determine likelihood of chance playing a role in the results.

If the question is: who actually performed better, then Bonds performed better than Neifi. And we are 100% sure of that.

If the question is: who had the better true talent (as I have defined the word on page 1 somewhere), then we are nine 9s sure that Bonds had the better true talent.

If the question is: who will perform better (take Pujols and, I dunno, Willie Bloomquist) the rest of this year, then we are, I dunno 80% sure it's Pujols.

If the question is: who will have the most true talent between Pujols and Willie for the rest of this year, then we are, I dunno 90% sure it's Pujols.

(Don't forget injuries to the last two questions.)
   123. Tango Posted: July 27, 2009 at 06:29 PM (#3268749)
I guess I reject the notion that my every activity is simply a probabilistic representation of the 'real me' inside, determined by statistical processes. But then I'm a behaviorist in other fields.


Yes, this is exactly the notion.

If all we had of Beethoven's oeuvre was the first twelve bars of the 5th symphony, would we regress that to the mean and conclude that Beethoven was probably a slightly above average composer?


Yes, if that's all we had, then exactly correct. If all we had was the one game of Bob Horner hitting 4 HRs, we'd conclude he was probably slightly above average.

Same deal when a pitcher throws a perfecto.
   124. Rickey! trades in sheep and threats Posted: July 27, 2009 at 06:33 PM (#3268754)
I guess I reject the notion that my every activity is simply a probabilistic representation of the 'real me' inside, determined by statistical processes. But then I'm a behaviorist in other fields.

That model is fine as long as we're forecasting future potential yous. But if we're talking about past and present yous, you are what you did. If you stabbed your previous six girlfriends to death, you're a serial killer. There's no probability matrix involved.
   125. Backlasher Posted: July 27, 2009 at 06:39 PM (#3268765)
At the risk of being too precise, every time CC throws a pitch there is "enough" information to change his forecast.

If the question is when is it enough to change his forecast from 2.1 to 2.2 BB / 9IP, then that is more quantifiable.


While that question is perhaps a better articulation, I'm not sure if it's correct. In general conversations, the question is going to be more fuzzy. "Has CC lost it?" "Does CC suck?" or "What can we expect of him?"

If you attempt to answer that question then you might use the BB projection, and you may use the level of certainty as your answer.

I've seen mgl do this. I usually think it's gracious when he does. Nevertheless, when he responds within his frame of reference, somebody is going to piss all over the answer. That is probably one of the reasons why you get the comments you do about him here. They don't like the precision of the answer to the general question.

Now there is another situation also. There is someone jumping in who will say,

"I think CC is heading toward sucksville because his BB rate is 100% higher and he has a little schlong."

I understand the calculation you are doing. I understand your degrees of certainty. I understand your model. I even think it represents one of the best currently known systems for projection.

Nevertheless, I don't think the answer stops at whether it provides enough to meaningfully change that model.

A retort might be, "You didn't factor in the schlong." At some point:

(1) The change in rate becomes an indicator to seek other data, and
(2) Schlong size is going to matter.

I think both of those conversations get lost. Moreover, different people react to it so differently. I don't like the patronizing stuff that you saw earlier in this thread. Hutch doesn't like the overly precise ass-hattery answers on general terms. Someone else may not like the geekery of the conversation. Those problems will never be completely solved, but losing the real conversation because of it is going to be a major loss.

I don't object to your use of True Talent. I can understand it; its use is consistent. Nevertheless, there are some definitions in this very thread that are not only inartful, they are going to exacerbate misunderstanding and result in erroneous conclusions.
   126. Nineto Lezcano needs to get his shit together (CW) Posted: July 27, 2009 at 06:40 PM (#3268767)
That model is fine as long as we're forecasting future potential yous. But if we're talking about past and present yous, you are what you did. If you stabbed your previous six girlfriends to death, you're a serial killer. There's no probability matrix involved.


It all depends on the questions being asked. For questions involving the actual historical record, yes, Bonds outhit Neifi.

But for predicting the results of events where the outcome is unknown, there is an uncertainty level. Yes, yes, I know there are few calls for projections of Bonds' 1993 season, say, but that doesn't mean they're nonexistent. If we want to answer a counterfactual of some sort - like how well the 1993 Pirates would have done if they had re-signed Bonds before he went to free agency - we cannot simply assume that what did happen is what would have happened, or even the most likely occurrence. And that's when our level of uncertainty about Bonds' "true talent" comes into play.
   127. jmurph Posted: July 27, 2009 at 06:40 PM (#3268768)
But if we're talking about past and present yous, you are what you did. If you stabbed your previous six girlfriends to death, you're a serial killer.


I'd say there's an, oh, 12% chance you were simply a mass-murderer.
   128. SoSH U at work Posted: July 27, 2009 at 06:44 PM (#3268773)
And a 50 percent chance you're Drew Peterson.
   129. Backlasher Posted: July 27, 2009 at 06:47 PM (#3268776)
That model is fine as long as we're forecasting future potential yous. But if we're talking about past and present yous, you are what you did. If you stabbed your previous six girlfriends to death, you're a serial killer. There's no probability matrix involved.


Yes. I agree.

It's a much different question as to whether you are likely to stabs someone tomorrow. It is information you can use to determine the likelihood of the stabsing.

In a small enough period of time, I'm sure you are just as likely to stab as Jack the Ripper, particularly if it involves John Smoltz after he has gone off on Frank Wren.

Presuming we are loci of potentialities is fine FOR MEDIUM RANGE PROJECTIONS. It doesn't produce useful output on very short and very long range projections. We can measure the validity as we approach those small time intervals and large time intervals. I presume those numbers aren't asshattery, duchebaggery, or dongwranglin'.
   130. Dr Love Posted: July 27, 2009 at 06:48 PM (#3268780)
The only people Sam stabs are umps.
   131. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 06:50 PM (#3268787)
If all we had was the one game of Bob Horner hitting 4 HRs, we'd conclude he was probably slightly above average.


Or we could conclude that we don't have enough evidence to say one way or the other.

(Note: I am an archaeologist, so I deal with limited data all the time. Your way of treating it is bizarre to me).

If we want to answer a counterfactual of some sort - like how well the 1993 Pirates would have done if they had re-signed Bonds before he went to free agency - we cannot simply assume that what did happen is what would have happened, or even the most likely occurrence. And that's when our level of uncertainty about Bonds' "true talent" comes into play.


Dealing with counterfactuals is natural for humans, but runs up against problems when one treats them as part of reality. If a player's 'true talent' is the average of all his performances across all possible universes, does that include a universe where Bonds is white? Where he is a second baseman? Where baseball is played with five bases instead of four?
   132. Shooty Survived the Shutdown of '14! Posted: July 27, 2009 at 07:00 PM (#3268810)
You're proposing that someone stab six ex-girlfriends all at the same time? That seems tricky. Were they having some kind of ex-girlfriend party?

This is why I'm not on Facebook.
   133. Rickey! trades in sheep and threats Posted: July 27, 2009 at 07:02 PM (#3268818)
I'm calling this thread. Shooty wins. At least, I'm 99.999999999999999999999999% sure he does.
   134. Tango Posted: July 27, 2009 at 07:02 PM (#3268819)
Or we could conclude that we don't have enough evidence to say one way or the other.

(Note: I am an archaeologist, so I deal with limited data all the time. Your way of treating it is bizarre to me).


Well, the bet would be that if all you had was the 4HR game from Horner, then your mean for him would be a .501 player (where .500 is average), +/- 1SD = .100.

If a player's 'true talent' is the average of all his performances across all possible universes


No, only in this universe but across all possible environments he would face in the proportion he is expecting to face them. We are starting with the "real me inside" you so perfectly noted.
   135. BFFB Posted: July 27, 2009 at 07:06 PM (#3268822)
I also have to be honest and say that I don't like overly precise answers to that kind of question. When your answer has greater precision than the source data you get issues.

When does the change in BB rate give me signal to check other criteria to see if something is ######## with Sabathia?


That's what's missing a lot of the time in discussions about players and why I loved CBW's stuff.

Or we could conclude that we don't have enough evidence to say one way or the other.


It depends what other information you have. If you know what other players accomplished the same thing and what their ability was you could make an inference as to what Horner's ability was. You then assign a confidence level to that inference, which doesn't have to be numeric. A written proviso, for example.
   136. Tango Posted: July 27, 2009 at 07:14 PM (#3268833)
I also have to be honest and say that I don't like overly precise answers to that kind of question. When your answer has greater precision than the source data you get issues.


My answer, while more precise, has greater uncertainty. It is exactly that uncertainty that people seem to have a problem with, since they want me to be more certain than the data allows.

I can say that CC, right now, has a mean talent level of a .600 pitcher, with 1 SD = .025. (Just for illustrative purposes.)

How should I answer this otherwise? I'm giving you my best bet on his mean, and how certain I am of that mean.

Would you rather I say that he's not as good as I thought he was on April 1? Well, then I might say I'm 70% sure he's not as good as I thought he was, +/- 10%.

Are you saying this is too certain? Or not certain enough?
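
One way to see how a "mean plus SD" statement turns into a "70% sure" statement, as a sketch only: treat the estimate of CC's talent as a normal distribution with the illustrative mean .600 and SD .025 from the comment, and ask how much of it falls below some earlier estimate. The April 1 figure is not given in the thread, so the `.613` below is a purely hypothetical value picked for illustration, and `prob_below` is a made-up helper name.

```python
from statistics import NormalDist

def prob_below(threshold, mean, sd):
    """Probability that a normally distributed talent estimate falls below threshold."""
    return NormalDist(mu=mean, sigma=sd).cdf(threshold)

# Illustrative figures from the comment: mean .600, 1 SD = .025.
current_mean, current_sd = 0.600, 0.025

# Hypothetical April 1 estimate (an assumption, not stated in the thread).
april_estimate = 0.613

p_worse = prob_below(april_estimate, current_mean, current_sd)
print(f"P(true talent is below the April estimate) = {p_worse:.2f}")
```

With that assumed April figure the probability comes out close to 70%. The further the old estimate sits above the current mean, relative to the SD, the more confident the "not as good as I thought" statement becomes.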
   137. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 07:17 PM (#3268842)
Well, the bet would be that if all you had was the 4HR game from Horner, then your mean for him would be a .501 player (where .500 is average), +/- 1SD = .100.


That would be the average projection for Horner going forward, sure. I disagree that the number so generated has anything to do with "true talent". I have no problem with the concept of 'average projected performance going forward.' Applying that to, say, retired players I do have a problem with. Treating it as something more real than the actual numbers I have a problem with.

If you project a player to hit .250 .330 .420 and he hits .280 .360 .500 then it is your projection, not his performance, which erred.
   138. Backlasher Posted: July 27, 2009 at 07:37 PM (#3268880)
Dealing with counterfactuals is natural for humans, but runs up against problems when one treats them as part of reality.

YES!!!!!

This is the luck fallacy that you see out here so often. They presume that because a simulated Bonds with four arms that throws knuckleballs would do XYZ, any other outcome was CAUSED by luck.
   139. AROM Posted: July 27, 2009 at 07:41 PM (#3268888)
Peaches: on Beethoven, I would assume that if you were to quantify greatness you would regress much less to the mean than in baseball performance. I honestly don't know much about classical composers, are there any "one hit wonders"? Or is a great piece something that only a great composer can accomplish?

As opposed to a great sequence of 3 notes, which Marcel the monkey could accomplish if you gave him enough tries, or a great month of slugging, which can be accomplished by a lucky Kevin Maas AAAA slugger every now and then.
   140. BFFB Posted: July 27, 2009 at 07:50 PM (#3268899)
thought experiments and counterfactuals should be treated as tools for illuminating different aspects of reality, not as a substitute.

luck:sabremetrics::fugacity:ideal gas law

I can say that CC, right now, has a mean talent level of a .600 pitcher, with 1 SD = .025. (Just for illustrative purposes.)


nothing wrong with that format, from my point of view.
   141. Swoboda is freedom Posted: July 27, 2009 at 07:53 PM (#3268905)
from the Book of Schlongs

That is the Song of Schlongs
   142. Shooty Survived the Shutdown of '14! Posted: July 27, 2009 at 07:54 PM (#3268910)
from the Book of Schlongs

That is the Song of Schlongs


I acquiesce to your superior knowledge of schlongs.
   143. Backlasher Posted: July 27, 2009 at 07:59 PM (#3268916)
My answer, while more precise, has greater uncertainty. It is exactly that uncertainty that people seem to have a problem with, since they want me to be more certain than the data allows.

I can say that CC, right now, has a mean talent level of a .600 pitcher, with 1 SD = .025. (Just for illustrative purposes.)

How should I answer this otherwise? I'm giving you my best bet on his mean, and how certain I am of that mean.

Would you rather I say that he's not as good as I thought he was on April 1? Well, then I might say I'm 70% sure he's not as good as I thought he was, +/- 10%.

Are you saying this is too certain? Or not certain enough?


IMHO, the issue is to recognize, that while this may somewhat rebut:

"CC is the suxxors."

It doesn't rebut:

"I'm worried that CC is sliding. His walk rate is up."

It also may not be a complete answer to the question or topic of discussion.

First, it would rebut.. "CC is sliding, I can tell just from his BB/9 mark." but absent that precise a sentence, IMHO, many people are saying the BB/9 alert them to a problem that they corroborated from other information, some of which they may not precisely be able to state.

I think MGL does take issue when people would just throw out BB/9 rate of a half season in a discussion about CC sliding.

Second, if you set yourself up as a prognosticator, they do want the next statement. A 30% chance of rain +/-10% degree of confidence doesn't tell someone whether to bring the umbrella.

I think the answer to the question is "I think CC is fine. The walk rate is not different enough to cause a concern, and I haven't seen anything else that would cause a concern."

The problem is when people definitively say the walk rate is "because of luck"; don't examine other causes; and dismiss anyone else that does examine those causes. That does occur in the sabersphere from time to time.
   144. The District Attorney Posted: July 27, 2009 at 08:01 PM (#3268918)
As the phrase is used, "True Talent" is being defined as "the potentiality of a certain output based on a person's then innate physical, emotional and mental skill that surfaces in some combination in the face of competition at a particular moment in time and space." Even with that definition, the still remaining issue is usually whether it's an ontological characteristic or it's just a phenomenological result.
My answer, while more precise, has greater uncertainty. It is exactly that uncertainty that people seem to have a problem with, since they want me to be more certain than the data allows.
Personally, I think you're overstating the precision by which you can measure your uncertainty.
luck:sabremetrics::fugacity:ideal gas law

Well, this whole thing is so obvious, I have no idea how we didn't understand it originally.
   145.   Posted: July 27, 2009 at 08:15 PM (#3268942)
Our tools to measure performance are not 100% precise either. If someone hits .300 over a given time period, we take that to mean he hit well. That is based on our mental model that anything above a certain threshold is a good performance.

However, it is possible, no matter how unlikely, that an average player, given the exact same circumstances might have hit .310 over that period, and therefore our player that hit .300 was actually below average. That is why our uncertainty levels are just as relevant to the past as they are the future. How certain we are depends on the tool, of course.

Sure you can say ".300 is .300 and .300 is what he contributed," but what does that mean? It's relevant to us only when we place it within its proper context against all other performances.
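
One way to put a rough number on the point above is to treat each at-bat as an independent trial and ask how often a hitter with a fixed true average would post .300 or .310 anyway. This is only a sketch: the .270 true average and the 500 at-bats are made-up inputs, and the binomial model ignores park, opponents and everything else that goes into "the exact same circumstances".

```python
from math import comb

def prob_at_least(hits_needed: int, at_bats: int, true_avg: float) -> float:
    """Exact binomial probability of getting at least `hits_needed` hits."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits_needed, at_bats + 1)
    )

# Hypothetical numbers: a true .270 hitter over 500 at-bats.
AT_BATS = 500
TRUE_AVG = 0.270
p_300 = prob_at_least(150, AT_BATS, TRUE_AVG)  # 150 hits = .300 or better
p_310 = prob_at_least(155, AT_BATS, TRUE_AVG)  # 155 hits = .310 or better
print(f"P(>= .300) = {p_300:.3f}, P(>= .310) = {p_310:.3f}")
```

Both probabilities are small but not zero, which is the sense in which a season line carries some uncertainty about what it says regarding the player.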
   146. Avoid running at all times.-S. Paige Posted: July 27, 2009 at 08:16 PM (#3268945)
from the Book of Schlongs

That is the Song of Schlongs

I acquiesce to your superior knowledge of schlongs.


Sadly I'm having a much easier time following this mini-discussion than the greater one about epistemology or whatever the word is. I think it's because I can relate to it better or something (because I have a schlong).
   147. Tango Posted: July 27, 2009 at 08:19 PM (#3268949)
A 30% chance of rain +/-10% degree of confidence doesn't tell someone whether to bring the umbrella.


Perfect analogy.

Yes, that's as far as the numbers will take you, and that's as far as I would go. To go further than that is, basically, the definition of b.s. This answer, for example:

I think the answer to the question is "I think CC is fine. The walk rate is not different enough to cause a concern, and I haven't seen anything else that would cause a concern."


... if this is what people expect, I would never say. This is purely looking at the numbers and then looking for reasons to support the numbers. That is the wrong way to look at it. The scout looks at the performance without looking at the resultant PA. I do not want to ask a scout "he's walking a lot of batters... why is that?". A scout can tell me if he's hitting his spots, if he's sequencing his pitches. I do not want my scout to know how many walks a pitcher has thrown. This will bias his evaluation.
   148. Nineto Lezcano needs to get his shit together (CW) Posted: July 27, 2009 at 08:22 PM (#3268955)
First, it would rebut.. "CC is sliding, I can tell just from his BB/9 mark." but absent that precise a sentence, IMHO, many people are saying the BB/9 alert them to a problem that they corroborated from other information, some of which they may not precisely be able to state.


Are they, though? I think the process more generally goes like:

* Pitcher is doing poorly (giving up runs, losing games, leaving starts earlier).
* Person goes to look at stats to figure out why.
* Sees figure like BB/9, decides that's the explanation.

And... I think it's obvious to everyone here that a pitcher that walks more batters will generally give up more runs, win fewer games and pitch fewer innings, regardless of the predictive value of that change in walk rate.

A great example of this recently is when people declared that Chien Ming Wang's release point had changed based upon Pitch F/X data, and that explained his performance this year. In reality, the Pitch F/X data was simply wrong - the cameras were incorrectly calibrated for some reason.
   149. BFFB Posted: July 27, 2009 at 08:45 PM (#3269005)
This is purely looking at the numbers and then looking for reasons to support the numbers


I'd disagree there. The numbers aren't an end but a means. One data point of many that can be used to infer a conclusion about, in this case, the likely future performance of pitcher y.

A great example of this recently is when people declared that Chien Ming Wang's release point had changed based upon Pitch F/X data, and that explained his performance this year. In reality, the Pitch F/X data was simply wrong - the cameras were incorrectly calibrated for some reason.


that's not faulty reasoning but bad data.
   150. Backlasher Posted: July 27, 2009 at 08:53 PM (#3269023)
That is why our uncertainty levels are just as relevant to the past as they are the future.

And that is where you cross the line into being 100% wrong. There isn't uncertainty on that past event beyond some small amount of uncertainty as to whether our recollection and recordation is correct. Counting that uncertainty is what Hutch would call ass-hattery.

The use of the current vocabulary has made you believe that there is uncertainty about the past event. That is just wrong.

That is the wrong way to look at it. The scout looks at the performance without looking at the resultant PA. I do not want to ask a scout "he's walking a lot of batters... why is that?". A scout can tell me if he's hitting his spots, if he's sequencing his pitches. I do not want my scout to know how many walks a pitcher has thrown. This will bias his evaluation.

And the GM and the fan takes the analysis and data you have given them and reaches a conclusion. I could be wrong, but that is what I perceive as the level of discourse for this site.

You are right, sometimes the people are talking out of their ass, but there might be some level of useful sh1t that results. That is the problem, do you take away the meaningful points just because the conclusion overreached or was overstated.

IMHO, it's different on the "Book Blog" and even on Prospectus. There you are trying to improve the analytical process NOT HAVE A PISSING CONTEST ON THE RESULT OR METHOD USED TO ACHIEVE THE RESULT.

You can get past the Schlong measuring and see where the problem is at.

Are they, though? I think the process more generally goes like:

* Pitcher is doing poorly (giving up runs, losing games, leaving starts earlier).
* Person goes to look at stats to figure out why.
* Sees figure like BB/9, decides that's the explanation.

And... I think it's obvious to everyone here that a pitcher that walks more batters will generally give up more runs, win fewer games and pitch fewer innings, regardless of the predictive value of that change in walk rate.


Perhaps you're right, but two things:

(1) For the conclusion you posited, it is correct for the reason you stated it's correct. Walking more people will affect run rate, which will affect win rate. A higher BB/9 is likely some of the cause of a different win or run rate.

(2) But the real implied question is, "will it continue?" That is where you could be correct. People may be erroneously assuming that the change in BB/9 is a permanent change. That there is some reason to believe this is the new BB/9 rate going forward. I AGREE THAT JUST LOOKING AT THE NUMBER DOESN'T TELL YOU MUCH. I AGREE THAT LOOKING AT THAT NUMBER DOES NOT TELL YOU AS MUCH AS THE PREVIOUS PERFORMANCE. However, where I find fault is when people dismiss a conclusion that it will continue entirely without examining what else the person is considering WHETHER STATED OR UNSTATED.

A great example of this recently is when people declared that Chien Ming Wang's release point had changed based upon Pitch F/X data, and that explained his performance this year. In reality, the Pitch F/X data was simply wrong - the cameras were incorrectly calibrated for some reason.

I am not aware of the example or the error. I will take your word for it. I agree that if this occurs, those people exacerbated an error that was made by the camera placement people. (Just like I think plenty of analysts exacerbated the errors inherent in DIPS conclusions). If they expressed their conclusion cocksure and demeaning to other analysis, then I think that is asshattery. However, if they just pointed out something reasonable then all that happened was they made a mistake.
   151.   Posted: July 27, 2009 at 08:58 PM (#3269042)
And that is where you cross the line into being 100% wrong. There isn't uncertainty on that past event beyond some small amount of uncertainty as to whether our recollection and recordation is correct. Counting that uncertainty is what Hutch would call ass-hattery.

The use of the current vocabulary has made you believe that there is uncertainty about the past event. That is just wrong.


There is not uncertainty over what happened. There is uncertainty about the value provided by the events that occurred as they relate to the player. You cannot say with 100% certainty that this player contributed this value or that value above or below the average player.

I fail to see where I am wrong.
   152. Srul Itza Posted: July 27, 2009 at 09:03 PM (#3269058)
There is uncertainty about the value provided by the events that occurred as they relate to the player. You cannot say with 100% certainty that this player contributed this value or that value above or below the average player.

To the extent you are talking about the difficulty of, for example, pinning down a park factor and run environment exactly, and then using that to determine value as compared to all other players playing at the same time, I can see this.
   153. Walt Davis Posted: July 27, 2009 at 09:03 PM (#3269059)
the reason any above or below average player will always regress, on the average, towards average

I'll put this up as one of the worst sentences in the history of statistics. It is inherently self-contradictory.

MGL really needs to take some stat classes.

And if that's interpreted as a dig, it is.
   154. Srul Itza Posted: July 27, 2009 at 09:05 PM (#3269062)
the reason any above or below average player will always regress, on the average, towards average

I'll put this up as one of the worst sentences in the history of statistics.


I don't know that I would limit it to statistics.
   155. Phenomenal Smith Posted: July 27, 2009 at 09:17 PM (#3269079)
That's just weird. MGL has on occasion made points in a dickish way, but I can't think of a time when Tango has been anything but smart, helpful, and even-handed. He may be an archdick in real life, but the Tom Tango/Tangotiger internet avatar is the perfect gentleman scholar.


Not true. TT is a jerk on the internets. And I say this as someone who bought THE BOOK and thought it was brilliant. It was, however, written by two jerks and a Dolphin.
   156. Backlasher Posted: July 27, 2009 at 09:17 PM (#3269081)
To the extent you are talking about the difficulty of, for example, pinning down a park factor and run environment exactly, and then using that to determine value as compared to all other players playing at the same time, I can see this.

I also will relent and agree to the same extent as Srul.
   157. Backlasher Posted: July 27, 2009 at 09:21 PM (#3269087)
It was, however, written by two jerks and a Dolphin.

I thought that was the new Disney movie about Moneyball.

Renegade GM and his asshat assistant fail to make good in MLB, until a cuddly Dolphin teaches them the power of love. Starring Brad Pitt, Dmitri Martin, and Gilbert Gottfried as the Dolphin.
   158. Walt Davis Posted: July 27, 2009 at 09:23 PM (#3269092)
Oh, I should have kept reading for more hilarity.

And of course, there is some chance that any given player is better than his prior stats - it is just that the chances of him being worse is greater than the chances of him being better.

Which contradicts what he just finished saying but at least has the advantage of being reasonably close to correct.

That is ALWAYS the case, as long as we properly define the mean for that player.

Which of course you can't. Not with any reasonable degree of accuracy.

MGL doesn't even realize he's an actuary not a statistician.

Here's regression to the mean in words when it comes to baseball projection:

Of all the guys who, say, averaged between a 115-125 OPS+ over the last 3 years, most of them have (i.e. had) "true talent" of about 105-135, skewed towards the lower end, a reasonably large number of them had a true talent less than 105 and a small number had a true talent better than 135. Therefore, as a group, their future performance will tend to be lower than the 115-125 range. Why? Because we know from history that batters who can maintain an OPS+ better than 115-125 are uncommon and those who maintain over 125 rare, so a number of these guys must be in the 115-125 group by chance. However, WE ARE COMPLETELY CLUELESS AS TO WHICH OF THOSE THREE GROUPS ANY INDIVIDUAL BATTER BELONGS TO. We treat them all the same relative to their playing time.

Regression to the mean works on average. It is horrible for those guys who are really as good or better than they've performed over the last 3-4 years and it works horribly for those guys who are much worse than they've performed over the last 3-4 years. It works really well for the guys who are just above (or below) their true talent. And we have no idea which group any individual player falls into.
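
The group-level claim above can be checked with a toy simulation. This is a sketch under made-up assumptions, not anyone's actual projection system: true talent is drawn from a normal distribution around 100 OPS+, and both the observed three-year mark and the future mark are talent plus independent noise. The spread and noise sizes are illustrative guesses.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N_PLAYERS = 20_000
TALENT_MEAN, TALENT_SD = 100.0, 12.0  # assumed spread of true OPS+ talent
NOISE_SD = 10.0                       # assumed sampling noise in a 3-year mark

players = []
for _ in range(N_PLAYERS):
    talent = random.gauss(TALENT_MEAN, TALENT_SD)
    observed = talent + random.gauss(0.0, NOISE_SD)
    future = talent + random.gauss(0.0, NOISE_SD)
    players.append((talent, observed, future))

# Select everyone whose observed OPS+ landed in the 115-125 band.
band = [p for p in players if 115 <= p[1] <= 125]

def mean(xs):
    return sum(xs) / len(xs)

avg_talent = mean([t for t, _, _ in band])
avg_observed = mean([o for _, o, _ in band])
avg_future = mean([f for _, _, f in band])
# Fraction of the band whose true talent is below their observed mark.
share_below = mean([1.0 if t < o else 0.0 for t, o, _ in band])

print(f"players in band: {len(band)}")
print(f"mean observed {avg_observed:.1f}, mean true talent {avg_talent:.1f}, "
      f"mean future {avg_future:.1f}")
print(f"share with true talent below observed mark: {share_below:.2f}")
```

In this setup the band's average talent and average future mark both sit below its average observed mark, yet a minority of the individual players in the band are better than their observed number, and nothing in the observed number alone says which ones.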

In short, we don't have a clue how good Chipper Jones, the individual, is relative to his actual performance. We have a guess, a guess based on the average of guys who've performed kinda like him regressed toward the mean of all players. Judging by Dan's fine work on ZiPS projections, such guesses about individual players tend to have a 95% confidence interval of about +/- 35 points of OPS+. Wow, an amazing level of accuracy.

MGL will of course be happy to bet $10,000 on any of his individual projections.
   159. Slivers of Maranville descends into chaos (SdeB) Posted: July 27, 2009 at 09:26 PM (#3269098)
Judging by Dan's fine work on ZiPS projections, such guesses about individual players tend to have a 95% confidence interval of about +/- 35 points of OPS+. Wow, an amazing level of accuracy.


I think you are confusing accuracy with precision.
   160. BFFB Posted: July 27, 2009 at 09:26 PM (#3269099)
Did the cuddly dolphin replace the animated Bill James in the re-write?
   161. Tom Nawrocki Posted: July 27, 2009 at 09:28 PM (#3269104)
MGL really needs to take some stat classes.


I think the key here is that you have to regress MGL's opinion of his own intelligence toward the mean.
   162. Ozzie's gay friend Posted: July 27, 2009 at 09:32 PM (#3269111)
Hmm, I find this explanation to be poorly constructed and confusing (it's kind of a silly hypothetical argument anyways) and the digs at this site come across as immature, and defensive.
   163. Tango Posted: July 27, 2009 at 10:03 PM (#3269170)
Judging by Dan's fine work on ZiPS projections, such guesses about individual players tend to have a 95% confidence interval of about +/- 35 points of OPS+.


Walt, are you saying that we can estimate Chipper's true talent level as being 95% within 35 OPS+ of whatever mean, or are you saying that we can estimate his future performance (over say 600 PA) at that confidence level?

***

This entire paragraph could have been written by MGL, as I've seen him make the same argument:

Of all the guys who, say, averaged between a 115-125 OPS+ over the last 3 years, most of them have (i.e. had) "true talent" of about 105-135, skewed towards the lower end, a reasonably large number of them had a true talent less than 105 and a small number had a true talent better than 135. Therefore, as a group, their future performance will tend to be lower than the 115-125 range. Why? Because we know from history that batters who can maintain an OPS+ better than 115-125 are uncommon and those who maintain over 125 rare, so a number of these guys must be in the 115-125 group by chance. However, WE ARE COMPLETELY CLUELESS AS TO WHICH OF THOSE THREE GROUPS ANY INDIVIDUAL BATTER BELONGS TO. We treat them all the same relative to their playing time.


And with the caps too! Very MGLian of you!

***

Regression to the mean works on average. It is horrible for those guys who are really as good or better than they've performed over the last 3-4 years and it works horribly for those guys who are much worse than they've performed over the last 3-4 years. It works really well for the guys who are just above (or below) their true talent. And we have no idea which group any individual player falls into.


Right, which is why the population you draw your player from should be as close to your player as possible. Ideally, your player will be so indistinguishable from the population you have created that you regress him 100% toward the mean.

In terms of the worst-case scenario, look at the amateur draft. The average career WAR for a player in the first round is something like 10 wins above average. That's career. The range is between 0 and 150. That's quite a range. If all you know is that Barry Bonds was drafted in the first round (you don't know about his family history, how high in the first round, or his athletic prowess, etc), you have no choice but to regress him to a 10 WAR career player.

This is why (I presume) Walt would say that it's horrible to regress such players who are so on the extreme of the population.
   164. Dr. I likes his panda steak medium rare Posted: July 27, 2009 at 10:45 PM (#3269259)
luck:sabremetrics::fugacity:ideal gas law


Now it all makes sense!

I always thought teaching the concept of fugacity was a waste of time.
   165. Srul Itza Posted: July 27, 2009 at 10:53 PM (#3269272)
Gilbert Gottfried as the Dolphin.

Good choice. He actually sounds like a dolphin -- at least the portion of his speech which is within the range of human hearing.
   166. BDC Posted: July 27, 2009 at 10:55 PM (#3269277)
The "mean" is not the single league-average mean, but the mean of the population you are drawing the player from. Therefore, his health, experience and aging are all part of the regression equation

I'm still in primer class, which is why I'm on Primer, but :)

Tango, this is a fine point: I am highly certain that Omar Vizquel is not going to improve markedly as a hitter in 2010. But to say that the factors that make me highly certain are part of the population involves a kind of paradox, doesn't it? The more I know about Vizquel (a highly unusual player), the less he's part of a population, the more sui generis he is.

In any case, the only thing I really wondered about was the hasty phrasing in mgl's post, which made it seem like below-average players are fixing to regress toward average. That doesn't mean that Vizquel has a good chance at an OPS of .760 next year; it just means that he's headed (most likely down) to whatever OPS we can expect a 43-year-old shortstop who was never a great hitter to, on the average, achieve.
   167. phredbird Posted: July 27, 2009 at 11:29 PM (#3269361)
If all we had of Beethoven's oeuvre was the first twelve bars of the 5th symphony, would we regress that to the mean and conclude that Beethoven was probably a slightly above average composer?


this has been a great thread. i hope mgl read it through, and has a slightly different take on btf.

that said, i don't think you can use statistical methods to make any kind of statement like that above. great art resists all attempts at quantification, imho. its a different debate, even with something as mathematical as music.
   168. phredbird Posted: July 27, 2009 at 11:58 PM (#3269429)
er, i hadn't seen walt davis' comments before posting 169.

anyway, a stimulating debate nevertheless.
   169. bumpis hound Posted: July 28, 2009 at 12:01 AM (#3269439)
I'm loving this discussion, it's the kind of thing that made me a lurker here. Great stuff. But, now I'm gonna demonstrate my ignorance, forgive me in advance.

I'm wondering about the usefulness of arguing in the abstract. Correct me if I'm wrong, but WRT projections, isn't proper statistical forecasting only truly useful when triangulated with factors of time and money? IE, the true value of analyzing a player is how long he'll be worth the money? Arguing about Perez v Bonds is like arguing about angels on pinheads. (Obviously, it's different analyzing post-performance, as in MVP/Cy Young/HoF analysis.)

A case that comes to mind, at least in terms of this discussion, would be Rich Aurilia, after his 2001 season. Say he was to become a free agent right after that campaign, and was for some reason determined to move on from the Giants, no chance of re-signing. He had a fantastic year, where everything broke right and he performed at the upper limit of his talent set. If you were the GM, would you say he'd achieved a new plateau of performance and give him a Jeter-type contract? or would you say it was an outlier, that everything happened right (but was not luck), and was the kind of year that he would be very unlikely to repeat?

Of course, it's the latter. Given his chronic groin injury and myriad eye problems, he never came close to another year like 2001, but he was still useful. And I could see how a desperate GM or owner would throw stoopid money after a SS coming off that kind of year, which in hindsight would of course be a major mistake. I guess to sum it up, I assume Aurilia's post-2001 performance would be a regression to his own personal mean. Or am I completely missing some bigger point of the conversation?
   170. Nineto Lezcano needs to get his shit together (CW) Posted: July 28, 2009 at 12:12 AM (#3269457)
Correct me if I'm wrong, but WRT projections, isn't proper statistical forecasting only truly useful when triangulated with factors of time and money?


Depends on what you want projections for. I don't own a baseball team, and nobody's beating down my door to ask me what players they should and shouldn't sign. I don't necessarily care about the money at all. A fantasy owner wants the best projections he can get, but he only cares about his money at the auction, not the player's actual salary.
   171. Backlasher Posted: July 28, 2009 at 12:22 AM (#3269471)
Of course, it's the latter. Given his chronic groin injury and myriad eye problems, he never came close to another year like 2001, but he was still useful.

That is an element that is missing. Most of the models don't directly account for his sore nut muscles. They only account for them to the extent they have previously impacted the performance, i.e. a regression to his mean output.
Of course that new output changes the model's mean. It's important to weigh sore nuts, and what I object to is when people discount them entirely because they: (1) presume people can't make a meaningful analysis of sore nuts based on their experiences and (2) presume that it is too random a recurrence to account for in a model.

The converse is taking Sosa's first big slugging season. How reasonable was it to project even more power going forward? Can you account for whatever it may have been that caused that spike in performance? (FWIW, I think some GMs have been able to do this and had some successes as a result). Of course, the holy grail is being able to predict this output to any degree of certainty even before the breakout. Knowing when a person has moved into another band of performance whether short term or long term is very important. It's even important if the SEASON PREDICTION for the player ends up being directly on target.

Some analysts claim you can't know. Some scoff when you try to know. Those are the dangerous things. It's not bad to say that you are wrong on a real time or anytime prediction. That is helpful but you've got to expand your bag of tricks beyond mere classical statistics to do it.
   172. phredbird Posted: July 28, 2009 at 12:28 AM (#3269477)
bumpis hound, i'll put myself in the 'bear of little brain' group, but as i see it there has to be some attempt at using past events to posit some sort of future projection. otherwise there's no way to attempt to move forward, if moving forward is defined as trying to improve your team. what we're seeing -- what i'm seeing -- is an attempt to take it out of the 'so and so is in the best shape of his life and the boys look like they'll take the pennant again, all the scouts like how they're playing' and put it in a context that has more real information buttressing its conclusions. unfortunately, there's no guarantee the more nuanced, sophisticated system will yield better results per instance, giving the sabr haters all the ammo they need.

in the case of rich aurilia, sure, its easy to see now, but put yourself in 2001 and try to figure it out. there would have been somebody, somewhere, who would have had a convincing argument that 2001 was closer to his mean than other previous years. not necessarily representative, but closer; and there's the problem. with enough data, you can refine the expectations. when is enough? at some point, somebody has to decide whether or not to offer a contract. i'm glad i don't have to make that decision.
   173. Tango Posted: July 28, 2009 at 01:12 AM (#3269560)
In any case, the only thing I really wondered about was the hasty phrasing in mgl's post, which made it seem like below-average players are fixing to regress toward average. That doesn't mean that Vizquel has a good chance at an OPS of .760 next year; it just means that he's headed (most likely down) to whatever OPS we can expect a 43-year-old shortstop who was never a great hitter to, on the average, achieve.


Remember that when we talk about the "population mean", we are NOT necessarily talking about all of MLB as the population. That is one possible population. If all you knew was that Omar Vizquel played MLB last year, you have no choice, none, but to regress his performance toward the MLB average.

But, we know more about Omar Vizquel. We know that he's old. We know that he keeps getting signed because of his outstanding hands. We know that he's small. We know that he's been a terrible hitter for a while. When you put all these things together, the population mean he's coming from is decidedly below the overall league mean. And it is that population mean to which you regress his performance.

Now, you may not find a bucket of players substantial enough to give you a fair population mean, which is why you create a regression equation with enough parameters that will give you the population mean you need for Vizquel. That is, you estimate as best you can.

As another example, where do you regress Micah Owings? Well, if you believe that he's a good enough hitter that if he were not a pitcher he could play the field, then it makes no sense whatsoever to regress him to the population of pitchers with .170 OBP. But, how sure are you of that? What population of players does he belong to? MOST IMPORTANTLY, you cannot look at how he's performed in deciding what population he belongs to! You can't say that because he's got a .350 OBP in 100 PA (or whatever he is) means that he must be a natural hitter. You CAN say that he was a standout hitter in college, you can say that he was considered being drafted as a non-pitcher (if any of those things are true). All those can be part of the regression equation.
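
To make the "which population" point concrete, here is a minimal sketch of one common way to regress: weight the observed rate by n / (n + k) and the population mean by the remainder. The function name `regress`, the constant `K = 300`, and the .330 position-player OBP are all hypothetical choices for illustration, not figures from this thread or from any published system; only the .350 in 100 PA and the .170 pitcher OBP echo the example above.

```python
def regress(observed: float, n: int, pop_mean: float, k: float) -> float:
    """Shrink an observed rate toward a population mean.

    The weight on the observation is n / (n + k); k is the (assumed) number
    of trials at which observation and population mean count equally.
    """
    weight = n / (n + k)
    return weight * observed + (1 - weight) * pop_mean

# Hypothetical inputs, loosely following the Owings example.
OBSERVED_OBP = 0.350
PA = 100
K = 300  # assumed regression constant, not an empirical estimate

as_pitcher = regress(OBSERVED_OBP, PA, pop_mean=0.170, k=K)
as_position_player = regress(OBSERVED_OBP, PA, pop_mean=0.330, k=K)
print(f"regressed toward pitchers:         {as_pitcher:.3f}")
print(f"regressed toward position players: {as_position_player:.3f}")
```

The same 100 PA produce very different estimates depending on the mean chosen, which is why the choice of population has to come from information other than the performance being regressed.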
   174. bumpis hound Posted: July 28, 2009 at 01:25 AM (#3269595)
in the case of rich aurilia, sure, its easy to see now, but put yourself in 2001 and try to figure it out. there would have been somebody, somewhere, who would have had a convincing argument that 2001 was closer to his mean than other previous years.

So I guess the trick is to determine if 2001 was an outlier, meaning he'd regress to the mean of his previous output, or if it represented a new output plateau (via experience, new technique borne of coaching, latent skill being newly tapped, what have you) that in essence elevates him to a new population, against which a new mean would be established and expected.
   175. greenback calls it soccer Posted: July 28, 2009 at 01:32 AM (#3269610)
If all you knew was that Omar Vizquel played MLB last year, you have no choice, none, but to regress his performance toward the MLB average.

Yeah, but aren't you supposed to be regressing towards the MLB true talent average, which itself is only inferred? I wonder if that explains some of the phenomenon of major league talent improving over time.

Has anybody read Feyerabend?
   176. Baldrick Posted: July 28, 2009 at 02:43 AM (#3269876)
"Does CC suck?"

Yes.

Thread over.
   177. Chris Dial Posted: July 28, 2009 at 03:12 AM (#3269936)
All of this talk has to revolve around what a player is going to do in the future doesn't it? Hutch is correct, isn't he, when he says wrt Bonds/Perez (or even Rickey/Rice), that there is no "future state". Bonds was better. That is, his performance was. Same with Rickey over Rice. It's indisputable that they were better (wrt performance). All this regression is solely, AFAICT, about futures. And when it's past, there is none of that needed.
   178. Slivers of Maranville descends into chaos (SdeB) Posted: July 28, 2009 at 03:38 AM (#3269975)

All of this talk has to revolve around what a player is going to do in the future doesn't it? Hutch is correct, isn't he, when he says wrt Bonds/Perez (or even Rickey/Rice), that there is no "future state". Bonds was better. That is, his performance was. Same with Rickey over Rice. It's indisputable that they were better (wrt performance). All this regression is solely, AFAICT, about futures. And when it's past, there is none of that needed.


That's my assertion, but Tango and (presumably) mgl disagree.

That is, and I'm open to persuasion on this, I think your analysis is either retrospective, in which case a player's talent is what he's done, or prospective, in which case you can define talent as how you expect him to perform. "That prospect has a lot of talent" means colloquially that you think he is likely to succeed in MLB. But that's not 'real' in the sense that he hasn't done it yet. Only the past has reality. That leaves the present. What is a player's talent level "right now"? Well what do you mean by "right now"? If he's at the plate, you mean this AB? Well, the AB hasn't happened yet. It's prospective.

You could say that there is some ineffable 'talent' that the player has at any given moment, that is neither retrospective nor prospective. But I don't really know what that means, nor do I know if it has any utility in sabermetrics. It seems awfully metaphysical to me. And to say that that quality, which is unmeasurable, is 'real' or 'true', whereas what the player actually did do in his career is not, seems wrong.
   179. Tango Posted: July 28, 2009 at 04:17 AM (#3270010)
All this regression is solely, AFAICT, about futures. And when it's past, there is none of that needed.


Correct.

Bonds was better. That is, his performance was.


Correct on the bold.

Bonds was better. That is, his performance was.


If you didn't qualify the bold part with what you said after, and had only said the bold part, then that's where you have to say that Bonds was better at nine 9s certainty. And you are right, this has nothing at all to do with regression.
   180.   Posted: July 28, 2009 at 04:29 AM (#3270014)
I'm really not sure I get that, Tango.

If you concede that his performance was better, then saying he was better is about the same to me.

If you are saying there is a chance (1x10^-50) that he actually was NOT better, then are you really saying that his performance was better, or that there is a chance his performance was NOT better, but our measurements of his performance were inaccurate?

After all, if we could theoretically have tools that measured performance flawlessly (taking absolutely everything into account,) then why would his performance not measure up with his ability?
   181. Tango Posted: July 28, 2009 at 04:33 AM (#3270016)
It may seem nuanced about "Was better", but let me ask this:

The Rays were better than the Phillies.
The Phillies performed better than the Rays.

Both these statements are true. At least the first statement is considered to be true by more than 50% of the people, seeing that most people had the Rays as the better chance to win the WS.

The first statement also requires a further qualification, like we're 80% sure that the Rays were the better team.

But in any confrontation, be it one pitch, one at bat, one series, or one thousand games, EITHER participant can perform better than the other.

Who IS the best hitter in MLB right now? That's Pujols. It doesn't matter if, say, Joe Mauer or Utley or Zobrist or anyone else might be performing better this year. Our best estimate of the most talented hitter in baseball is Pujols. That's true talent. Now, we are not 100% certain that it is Pujols. We might be 70% certain.

It's important to have this mindset in order to appreciate the difference between true talent and observed performance. True talent equals observed performance when n equals infinity. Anything less than that, and random variation has a role.
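
A minimal sketch of that last point, assuming each plate appearance is an independent trial with a fixed success probability. The .450 figure, the sample sizes, and the seed are illustrative choices, not anything measured. The observed rate wanders at small n and settles near the true rate only as n gets large.

```python
import random


def observed_rate(true_talent, n, rng):
    """Fraction of successes in n independent trials, each succeeding with probability true_talent."""
    successes = sum(1 for _ in range(n) if rng.random() < true_talent)
    return successes / n


def convergence_table(true_talent=0.450, sample_sizes=(10, 100, 1_000, 100_000), seed=2009):
    """Return (n, observed rate) pairs for a hypothetical hitter with a fixed true talent."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return [(n, observed_rate(true_talent, n, rng)) for n in sample_sizes]


if __name__ == "__main__":
    for n, rate in convergence_table():
        print(f"n={n:>7}: observed {rate:.3f} vs true 0.450")
```

Nothing here models environment or changing talent; it isolates the random-variation piece only.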
   182. Los Angeles Waterloo of Black Hawk Posted: July 28, 2009 at 04:40 AM (#3270022)
- If all we had of Beethoven's oeuvre was the first twelve bars of the 5th symphony, would we regress that to the mean and conclude that Beethoven was probably a slightly above average composer?

Yes, if that's all we had, then exactly correct.


Honestly, throwing art into this confuses matters. Let's stipulate that Night of the Hunter is a great movie. Is Charles Laughton a great director? Did we know Terrence Malick was a great director after Badlands? Should we have been regressing Francis Coppola all along, so therefore One from the Heart was predictable? What does it even mean?

***

As for the guy above who claims Tango is a jerk: whaaaaa???????? I don't remember Tango even ever remotely approaching the realm of jerkhood, much less taking its oath.
   183. Tango Posted: July 28, 2009 at 04:42 AM (#3270024)
Who IS better: Federer or Roddick?

Had Roddick beaten Federer in Wimbledon, would your answer change? No! Only if the two were really close, then your answer could change. Otherwise, Federer IS better. He simply IS. Sometimes, someone performs better than Federer. That's random variation.

It is possible that when Federer loses that someone else actually WAS better. For example, Federer was injured. At that point in time, someone else IS better than Federer.

Absent that kind of information, then Federer IS better than anyone not named Nadal, and the performance on the day in question is irrelevant to answering that.
   184.   Posted: July 28, 2009 at 04:49 AM (#3270028)
My question is with regards to the "random variation." If Ben Zobrist is performing better than Albert Pujols at a point in time, and we chalk this up to "random variation," then either Zobrist has had an easier environment and our tools for measuring performance did not fully take it into account, or Zobrist actually was better for that specific instance in time. What other possibilities are there?

Even with the coin flip, the reason you might get 7/10 heads is going to have to do with the force you exerted on the coin, the distance from the ground, etc. What underlying factors would lead a player to having a performance that is above his true ability, if we accept the reality that his true ability changes every millisecond?

Had Roddick beaten Federer in Wimbledon, would your answer change? No! Only if the two were really close, then your answer could change. Otherwise, Federer IS better. He simply IS. Sometimes, someone performs better than Federer. That's random variation.


If we accept that Federer is better (I know nothing of tennis [Federer plays tennis, right?],) the question then is why did Roddick perform better? Was Federer not mentally prepared? Did Roddick get "lucky" with regards to game conditions that were outside his control? If it's the former, why is that not included in our model of instantaneous true ability? If it's the latter, then how can we say Roddick performed better if our measurements didn't take those game conditions into account?
   185. Tango Posted: July 28, 2009 at 04:53 AM (#3270030)
Let's try another one. You believe Pujols IS the best hitter in baseball. He goes in an 0-fer slump. Do you still say that Pujols IS the best hitter in baseball?

If you answer yes, then you accept "true talent" as I'm trying to use it.

However, your certainty as to Pujols being the best will diminish with every 0-fer game he goes through.
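
One way to picture that shrinking certainty is a two-hypothesis Bayes update. This is only a sketch under loud assumptions: a 70% prior that Pujols is the best (echoing the "70% certain" figure earlier in the thread), a true rate of .450 if he is and .350 if he is not, four plate appearances per game, and every plate appearance independent. None of those numbers are estimates of anything real.

```python
def posterior_best(prior, p_if_best, p_if_not, hitless_pas):
    """Posterior probability of the 'best hitter' hypothesis after a run of hitless PAs."""
    like_best = (1 - p_if_best) ** hitless_pas  # chance of the run if he really is the best
    like_not = (1 - p_if_not) ** hitless_pas    # chance of the run if he is not
    return prior * like_best / (prior * like_best + (1 - prior) * like_not)


def certainty_by_game(prior=0.70, p_if_best=0.450, p_if_not=0.350, pas_per_game=4, games=5):
    """Certainty after 0, 1, ..., games consecutive 0-fer games (all inputs are assumptions)."""
    return [
        posterior_best(prior, p_if_best, p_if_not, pas_per_game * g)
        for g in range(games + 1)
    ]


if __name__ == "__main__":
    for g, c in enumerate(certainty_by_game()):
        print(f"after {g} hitless games: {c:.3f}")
```

The certainty drops with each hitless game but never jumps to zero, which is the sense in which the answer can stay "yes" while the confidence erodes.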
   186. AROM Posted: July 28, 2009 at 04:57 AM (#3270033)
Can you account for whatever it may have been that caused that spike in performance (FWIW, I think some GMs have been able to do this and had some successes as a result).


I would wager that more GMs have been burned by this than benefited. When a player has a breakout year I don't know if he's going to follow the Sosa path or the Adrian Beltre path. When it happens in conjunction with a free agent year, all it takes is one GM to believe it's real improvement for the guy to sign a huge contract. We know that not all breakout years are sustainable, in fact most aren't. But combine a breakout year with free agency, when's the last time a player like that didn't get paid as if his future great production was guaranteed?
   187. Tango Posted: July 28, 2009 at 04:57 AM (#3270034)
Shock: say that Pujols has one at bat, and he gets an out. And Zobrist (and Carlos Zambrano!) each get a hit. Is Zobrist the better hitter?

No, this is purely random variation. Confrontations have a binary outcome, 1 or 0, safe or out. Therefore, every observation we see is subject to the binomial distribution.

Pujols has a higher mean (say a true .450 compared to a true .350 for Zobrist and a true .250 for Zambrano). But his performance is a product of three things: his talent level, his environment, and random variation.
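
As a worked version of the one-at-bat scenario, here is a short sketch that takes the three quoted true rates at face value and assumes the three plate appearances are independent, with environment ignored. Under those assumptions the exact scenario (Pujols out, both others safe) comes to 0.55 x 0.35 x 0.25, a bit under 5%, and Zobrist alone "outperforming" Pujols in a single plate appearance each comes to 0.55 x 0.35, about 19%.

```python
TRUE_RATES = {"Pujols": 0.450, "Zobrist": 0.350, "Zambrano": 0.250}  # figures quoted above


def prob_single_pa_upset(rates=TRUE_RATES):
    """P(Pujols makes an out while Zobrist and Zambrano both reach), one PA each."""
    return (1 - rates["Pujols"]) * rates["Zobrist"] * rates["Zambrano"]


def prob_outperformed(p_better, p_worse):
    """P(the lower-mean hitter succeeds and the higher-mean hitter fails), one PA each."""
    return (1 - p_better) * p_worse


if __name__ == "__main__":
    print(f"Pujols out, both others safe: {prob_single_pa_upset():.4f}")
    print(f"Zobrist safe, Pujols out:     {prob_outperformed(0.450, 0.350):.4f}")
```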
   188.   Posted: July 28, 2009 at 05:01 AM (#3270036)

Shock: say that Pujols has one at bat, and he gets an out. And Zobrist (and Carlos Zambrano!) each get a hit. Is Zobrist the better hitter?


You miss which side I am questioning.

I'm not questioning the point that Pujols is still a better hitter, I am questioning the certainty in the thought that, in the above scenario, Zobrist performed better. It's just as likely that Zobrist is much worse, but he got a fat pitch to hit, while Pujols did not.

If you revise the question to indicate that they had the exact same at-bat in every single conceivable way, and Zobrist had a good swing while Pujols did not, then I would say that for that specific millisecond in time, Zobrist's true talent was higher, but that on the average, Pujols's is (much) higher.

I am sorry if I'm being dense or this is overly semantical, but random variation is a tricky thing. Try writing a program that does things at "random...."
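
To make the programming aside concrete: a sketch, using Python's standard `random` module, of how a program's "random" outcomes are fully determined by a hidden input (the seed). The function name and the .450 rate are illustrative assumptions. Replaying the same seed replays the same string of hits and outs, which is one picture of "randomness" as variables we did not account for.

```python
import random


def simulate_pas(true_rate, n, seed):
    """Simulate n plate appearances as 1 (safe) or 0 (out) from a seeded pseudo-random generator."""
    rng = random.Random(seed)  # the seed plays the role of the unobserved conditions
    return [1 if rng.random() < true_rate else 0 for _ in range(n)]


if __name__ == "__main__":
    first = simulate_pas(0.450, 10, seed=42)
    replay = simulate_pas(0.450, 10, seed=42)
    print(first)
    print(replay)
    print("identical replay:", first == replay)
```

Whether real at-bats work like the seeded generator or carry some irreducible randomness is exactly the open question; the sketch only shows what the first option looks like.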
   189. SoSHially Unacceptable Posted: July 28, 2009 at 05:30 AM (#3270040)
As for the guy above who claims Tango is a jerk: whaaaaa???????? I don't remember Tango even ever remotely approaching the realm of jerkhood, much less taking its oath.


He probably just said something non-derogatory about Keith Olbermann at some point in the past. That's enough to get on Phenomenal's jerk list for life.

And I want to thank Tango for not just responding to every question, but for actually answering them in a way that a dolt like me could understand. If mgl wants to maintain his self-imposed exile from Primer, I'm glad Tom is picking up slack from The Book.
   190. Nineto Lezcano needs to get his shit together (CW) Posted: July 28, 2009 at 06:04 AM (#3270050)
I would wager that more GMs have been burned by this than benefited. When a player has a breakout year I don't know if he's going to follow the Sosa path or the Adrian Beltre path. When it happens in conjunction with a free agent year, all it takes is one GM to believe it's real improvement for the guy to sign a huge contract. We know that not all breakout years are sustainable, in fact most aren't. But combine a breakout year with free agency, when's the last time a player like that didn't get paid as if his future great production was guaranteed?


Ooh, ooh, I know the answer here - winner's curse! Winner's curse!
   191. BFFB Posted: July 28, 2009 at 06:37 AM (#3270057)
Pujols has a higher mean (say a true .450 compared to a true .350 for Zobrist and a true .250 for Zambrano). But his performance is a product of three things: his talent level, his environment, and random variation.


Minor quibble. What's really happening is that in every instance the environment changes, which impacts the likelihood of a particular binary outcome independent of what his "true talent" for that pitch would be.

Random variation is a third factor that is only introduced when you attempt to model the outcome of the event due to the (lack of) precision at which you can account for "true talent" and "environment".
   192. Srul Itza At Home Posted: July 28, 2009 at 06:40 AM (#3270058)
I'm not questioning the point that Pujols is still a better hitter, I am questioning the certainty in the thought that, in the above scenario, Zobrist performed better. It's just as likely that Zobrist is much worse, but he got a fat pitch to hit, while Pujols did not.


This type of qualification is unnecessary. It is baseball. Pujols makes an out 60% of the time. He sometimes misses a fat pitch. For batters who were not Barry Bonds 2001-2004, failure at the plate is the default option. The fact that on the exact same pitch, Pujols might miss while Zobrist might get a hit, is just a reminder of the nature of the game.

random variation is a tricky thing. Try writing a program that does things at "random...."

Random variation is the true nature of the universe. The fact that you have trouble simulating it in a computer program does not change that. You can't fight entropy -- in the grand scheme of things.

Never forget the 3 laws of thermodynamics:

You can't win
You can't break even
You can't get out of the game
   193.   Posted: July 28, 2009 at 06:45 AM (#3270061)
[193]: That is, I think, exactly what I've been trying to say (poorly.)

[194]: Do you think that, when Pujols misses, there is a reason for it? It could be anything. Any reason from getting fooled by the pitch to having his mind temporarily distracted, to atmospheric conditions, etc. Do you think that there is some reason, somewhere, for him missing the pitch, or is it just "random?" I guess this is where the argument lies. If there is a reason for it, then in my opinion that reason falls into one of two buckets, either it's a momentary change in his ability, or it's an environmental factor that we did not consider in our measurement. In this sense, random variation only exists in the sense that our measurements are subject to random variation due to variables that we did not and/or could not account for. In another sense, randomness exists as an absolute entity such as karma or luck that will impact that at-bat in completely unpredictable ways, even if we could account for everything. I do not know which "sense" of randomness is correct.
   194. BFFB Posted: July 28, 2009 at 07:05 AM (#3270068)
Never forget the 3 laws of thermodynamics:


But entropy is not "random variation". In statistical mechanics entropy is the level of uncertainty in a system after its observable properties have been accounted for. It's a measure of uncertainty. A fancy fudge factor to make a theoretical model for predicting the behavior of ideal systems be a usable approximation of reality.

It doesn't actually exist like a rock does, it's a human invention.
   195. Srul Itza At Home Posted: July 28, 2009 at 07:17 AM (#3270072)
Do you think that there is some reason, somewhere, for him missing the pitch

Yes -- it is the expected outcome. Using a round bat of limited size to hit a small ball pitched by a professional pitcher who lets the ball go less than 60 feet away from you, at an unknown velocity and on an unknown trajectory, is VERY VERY VERY HARD. I sometimes think people here have no clue as to just how hard it is, even for the best of the best. Even for a guy as good as Pujols, he only needs to start the swing a fraction of a second slower, or misjudge the plane of the ball by an inch, to just plain miss it. Remember -- he misses a lot more than he hits. He makes an out more often than he gets on base. If he doesn't walk, the odds are around 2 to 1 that he will make an out.

FAILURE is the default option. FAILURE in baseball does not need to be explained.
   196. Srul Itza At Home Posted: July 28, 2009 at 07:19 AM (#3270073)
In statistical mechanics entropy is the level of uncertainty in a system after its observable properties have been accounted for.

Statistical mechanics is a representation of the universe. It is not the universe.
   197. Jeff K. Posted: July 28, 2009 at 07:47 AM (#3270077)
My question is with regards to the "random variation." If Ben Zobrist is performing better than Albert Pujols at a point in time, and we chalk this up to "random variation," then either Zobrist has had an easier environment and our tools for measuring performance did not fully take it into account, or Zobrist actually was better for that specific instance in time. What other possibilities are there?

Zobrist performed better during that period. Zobrist "was better" means something different. I don't quite get the disconnect here. I disagree with the end point tango is making here (I stand by #10), but this part seems unlikely to engender the confusion it has. Take out the value judgment for a moment.

To borrow tango's coin analogy, you have two coins. One is weighted, such that 80% of the time it will come up heads. The other is normal. The issue is that each flip doesn't land 80% heads and 20% tails, discrete results occur. So in the middle of flipping each coin 10,000 times, which ends with the one having 8,000 heads and the other 5,000, both on the nose, we have a stretch of 100 flips.

During these 100 flips, normal coin comes up heads 75 times, weighted coin 50. If we apply a positive value to heads (like a hit), normal coin performed better. However, weighted coin *was better*, had a higher true talent, whatever you want to say. Talent or innate betterness defines performance, of course, but it doesn't work the other way, not for coins. For baseball players, the only measure of talent we really have is the results. But not just these 100 results, we have all the ones prior. We know Albert Pujols is better than Ben Zobrist because we have thousands of PAs as proof. Pujols performed better in those, and unless our new sample of Zobrist outperforming him is both to the same degree and for as long (give or take) as Pujols outperformed him, we still know this.
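
The two-coin picture can be put in numbers with an exact binomial calculation. This sketch assumes independent flips, an 80% weighted coin and a 50% normal coin as described above, and asks how likely it is that the normal coin shows strictly more heads over a stretch of n flips each. The function names are made up for the illustration. With a single flip each, the answer is 0.5 x 0.2 = 0.1, and it shrinks quickly as the stretch gets longer.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_fair_beats_weighted(n, p_weighted=0.8, p_fair=0.5):
    """P(the fair coin shows strictly more heads than the weighted coin in n flips each)."""
    weighted_below = 0.0  # running P(weighted heads < k)
    total = 0.0
    for k in range(n + 1):
        total += binom_pmf(k, n, p_fair) * weighted_below
        weighted_below += binom_pmf(k, n, p_weighted)
    return total


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"n={n:>3}: P(fair coin outperforms) = {prob_fair_beats_weighted(n):.6g}")
```

The same logic is why a long track record carries so much more weight than a short hot stretch: the worse coin can win a short run, but rarely a long one.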

Where I disagree with mgl is that if I somehow have a guy who's 27-28 (so no aging issues) and has a ten year track record of contextually neutral exact .300 averages, he will and does say that that guy is not as good as that. And to me, at that point, or the case of Chipper, that's ridiculous. I'd project his 2010 performance to be slightly lower for reasons that don't apply, but there is nowhere near enough justification to regress my judgment of his talent level back to the mean.
   198. Tango Posted: July 28, 2009 at 12:27 PM (#3270110)
The issue is that each flip doesn't land 80% heads and 20% tails, discrete results occur.


Right, exactly. Pujols may have a mean of .450 and Zobrist might be .350, but if you have ONE PA, then the result can only be 0 or 1. And what causes that? Random variation around a true mean + environment.

The environment is whatever it is that they faced: the pitch, the park, the pitcher, the fielders, etc.

So the observation, which is DISCRETE (that's important), can only be 1 or 0. It won't be ".450". Just by random variation, Pujols will get an out.

Some people like to think that everything is preordained so that if you can replicate everything (including the environment) and if Pujols did everything identical, then he was destined to get that out and would do so every single time out. If that is true, then at that moment in time, Pujols had no talent whatsoever for getting a hit.

I don't subscribe to that reasoning.
   199. BDC Posted: July 28, 2009 at 12:38 PM (#3270116)
Tango, thanks for #175 and indeed for all your explanations in this thread. Very interesting conversation.
   200. AROM Posted: July 28, 2009 at 01:16 PM (#3270143)
I am very disappointed in the metaphor of a weighted coin for random variation. I thought people here had more geek cred than that.

The typical player rolls a 6 sided die to determine how well he hits a ball, higher is better. Pujols is rolling a d20. And of course, sometimes on a d20 you can still roll a 1.
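
Taking the dice metaphor literally, and assuming fair dice, one roll each, and "higher is better", a short enumeration gives the chance that the d6 player out-rolls the d20 player: 15 of the 120 equally likely pairs, or 1 in 8, with another 6 pairs (1 in 20) ending in a tie. The function names are hypothetical.

```python
from fractions import Fraction


def prob_low_die_beats_high(low_sides=6, high_sides=20):
    """Exact P(a fair low_sides-sided die rolls strictly higher than a fair high_sides-sided die)."""
    wins = sum(
        1
        for a in range(1, low_sides + 1)
        for b in range(1, high_sides + 1)
        if a > b
    )
    return Fraction(wins, low_sides * high_sides)


def prob_tie(low_sides=6, high_sides=20):
    """Exact P(both dice show the same number)."""
    return Fraction(min(low_sides, high_sides), low_sides * high_sides)


if __name__ == "__main__":
    print("d6 beats d20:", prob_low_die_beats_high())
    print("tie:", prob_tie())
```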