User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Baseball Primer Newsblog — The Best News Links from the Baseball Newsstand
Friday, May 24, 2013
Tangotiger Blog: Ensberg and Tango speak on being locked-in
“locked-in” with Tom Tango and Morgan Ensberg!...(also check out Kevin Goldstein’s FB page which has been having a terrific back/forth)
Repoz
Posted: May 24, 2013 at 05:04 AM | 78 comment(s)
Tags: history, sabermetrics |
About Baseball Think Factory | Write for Us | Copyright © 1996-2014 Baseball Think Factory
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. Robert in Manhattan Beach Posted: May 24, 2013 at 05:30 AM (#4451160)
That's precisely what it means to be "in the zone" or "locked in." If you're saying to yourself when it's going on, "I'm in the zone," you're not in the zone.(*)
Tango's completely out of his element and embarrassing himself. His knowledge base isn't strong or wide enough to comment on these things, and it's embarrassing that he thinks things he doesn't comprehend must be able to be inferred from things he does comprehend, or they don't exist.
(*) You probably have to have experienced this to fully understand it. There might be substitutes for experience -- I suppose listening to people in the arena might be something of a substitute, but Tango and his ilk aren't willing to do that, either.
@2) I don't really understand your problem here. Tango is conceding guys go through stretches where they see the ball better and can perform at a higher than usual level. He's even conceding that these things are the result of something beyond randomness. His point is that you can't use these streaks predictively, because they end without warning. Seems pretty reasonable to me. If getting locked in for 8 ABs meant you were more likely than average to have even 1-2 subsequent weeks of above average performance it would show up in the numbers, and if anyone was going to find it I'd expect it would be Tango. That's the argument with all this intangible stuff, if it works it should show up somewhere in the results.
I was wondering if this was the single stupidest thing posted on this site. But then I read this:
So, no. Saved by SBB.....
So now people "in the zone" don't even know they're "in the zone"? Even they can only tell by looking back and seeing the streaks? That's asinine. It is rolling a die 100 times, then looking back and seeing 6 sixes out of 7 from rolls 34-40 and saying that the roller was really "in the zone" during that time.
So what? You can't use the streaks to predict the Dow or the phases of the moon either.
That's the argument with all this intangible stuff, if it works it should show up somewhere in the results.
I'm not sure what "it works" means, there, but the "intangible stuff" does show up in the results. It's a big part of them.
So, no. Saved by SBB.....
Good to see that the cult is still intact.
It's nothing like that, because hitting a baseball is nothing like rolling a die. That's an embarrassingly bad analogy.
It is for me, but I have very unusual mechanics.
Even a recreational athlete should know this. Everyone who has ever played pick-up basketball or golf should be able to tell you some games/rounds you're on your game, and some you're not. I'd guess it has to do with mechanics.
Was Dr. Tim Leary "in the zone" when he turned on and tuned in, or when he dropped out?
How many fingers am I holding up while my one hand is clapping??
Why does it so often seem that the posts of SBB and GuyM are being generated by a random meta-linguistic algorithm geared to simulate bipolar behavior?
Dammit, Tango, I really wish you would have edited that transcript. All that scrupulousness is just too much to ask of anyone, even you.
You'd need a lot more granular data. Someone can be striking the ball great, but have a mediocre score b/c of putting, or putting great and chipping poorly. There are 3 or 4 different sets of mechanics that aren't necessarily in sync.
The concept of random chance isn't about studying random number generators. Random chance is useful to analyze situations in which we are uncertain about the outcome. The key is not that there's something specific about the type of outcome that qualifies it, it's _our_ lack of knowledge that brings the concept of 'random chance' into play.
So when it comes to being "locked in," what you have are a myriad of competing factors swirling about a situation where knowing all of the relevant information necessary to predict an outcome with complete accuracy is impossible. We use the concept of random chance as a stand in for all of those things we don't or can't know, and use that to help us model the chances of various outcomes. I don't doubt there are various psychological and physical effects that come with being "locked in" that affect the outcome of a plate appearance. But unless and until we have the necessary information from these effects to help adjust our predictions, treating them as "random" is completely valid.
After all, we're not saying they _are_ random. I'm not completely convinced that _anything_ is 'random' in that sense. What we're saying is, absent further usable information, 'random' is our best means of dealing with it and models it as well as anything else we come up with.
Also, Gene Tenace and Brooks Robinson.
Both guys described their big WS performances in similar terms. Brooks in particular (being Brooks) said something like "I hope we end the Series soon, this can't last forever!"
Exactly. My introduction to the online stathead community was on rec.sport.baseball about 13 or 14 years ago when I argued pressure affects athletic performance sometimes (I used golfers as my example) and got pooh-poohed for it. "But they themselves have attested to it," I said. The response, as Primates might well imagine, was "You can't prove it makes any difference."
Same with "the zone" - I've been there. I once hit seven 3-pointers in a row during a basketball game and, believe me, it wasn't random on that day. I've also had stretches where I missed five or 10 in a row, and it wasn't random chance either. Mechanics is probably a big part of it - as well as confidence.
I'm a Strat-O-Matic player, and I fully realize streaks can materialize out of pure chance. Bob Thurman has eight homers in his past five games in my 1957 NL replay. That's hot dice, obviously. Many hot and cold streaks are just random variation. But not all of them. Life is not a table top game. Well, actually it is. But not about baseball. :)
But this is my point. It's not "random" on any day. What it is, however, is something that can be modeled by probability theory. And for those purposes, one question that often gets asked is "if a guy hits six straight three pointers, do his odds improve of making the next one compared to the same guy who has missed six straight?"
And to answer such questions, the _concept_ of random chance is critical to our understanding all of the dynamics involved. The number of significant variables that go into whether you make a three pointer is immense. Absolutely staggeringly huge. And so the best way forward is to account for as much as we can, and use probability for the rest.
There's nothing that even contradicts a statement like "God wanted you to make seven straight three pointers that day." Such a statement is completely compatible with probability theory. Since by definition we can't know God's plans or intentions, probability theory still applies.
*I'm going from memory here, but this is the general idea.
This exact same thought is presented in a research paper done at Cornell
(I was describing this board to my brother, and that is nearly exactly what I used to describe Guym to him. Cult of Tango... I guess it's better than the cult of MGL)
I wouldn't be surprised at all to find that people who are struggling would continue to struggle (post #1), but there is the possibility that there is something physically wrong with them that they haven't acknowledged. (That is why players sometimes outperform their projections by a lot: they might have been injured during the period used as the basis for their projection, and now that they are fully healthy, they beat it.)
It's very rare I find myself agreeing with GuyM, but this (post #5) is one of those cases.
I've had the same experience. My college roommate and I got into a pickup game of 2-on-2 with two guys from the basketball team (not a great, or even good team, but still Div I college, and we were each giving up 3-4 inches of height), and beat them b/c I just couldn't miss.
I must have been 15 for 17 from 18-22 feet. They just could not believe this slightly pudgy, slow white guy refused to miss a shot.
Wouldn't surprise me, but that wouldn't apply in games like baseball and golf where you can't take "extra shots".
It does in golf. If things are going well, you try a shot you shouldn't try -- e.g., going after a tucked pin on a hard green. Then you say to yourself, "####, that was stupid." Bye, bye zone.
Funny, I'm the opposite. If I have a good score to protect, I'll play more conservatively. If I'm at 55 after 9, I'm trying every crazy ass shot I can think of hoping for a few pars and a birdie.
Long ago I auditioned for an orchestra, and made it. Best audition of my life by far, given (a) I made it, and (b) where they seated me. (Seating = ranking.) As it turns out, I don't remember most of the audition. Once I got maybe 5 or 6 measures in, I was on autopilot. I was performing and, for lack of a better term, daydreaming, at the same time. It wasn't until close to the second page of music that I became aware again of where I was, and that I was performing, and that I hadn't even been paying attention. Because of that sudden shift of attention I messed up*, and the audition was ended. I assumed I had no chance, because all I knew was that I'd started playing, I stopped paying attention, and I messed up. I was told later that the part when I was daydreaming was performed flawlessly.
I have no doubt someone can be "locked in" and that it can result in better performance. I also wouldn't be surprised if it goes below the level of objective detection, which means it can't be found as predictive. Alas.
* I distinctly remember it happening in that order: realizing I wasn't paying attention, then messing up as a consequence. It was not the other way around. When my attention focused, my eyes started darting across the music to figure out where I was based on what I was playing, and that distracted me from what I was actually playing.
That happens in the bedroom too.
Morgan Ensberg had a career ratio of HR to PA of 4.3%. Just doing a binary sim of 1,000,036 PAs, giving me 1,000,000 spans of 37 PA, I get 96 spans of 37 PA with 8+ HR. So, 96 out of a million, or 0.0096% probability assuming he's hitting at his career average.
Same thing, but at his highest season HR/PA rate of 5.8%: 0.0927% probability.
Ensberg had 2580 PAs in his MLB career. That's 2544 periods of 37 PAs. The expected number of 8+ HR periods we should see, if he were hitting HR at his career rate: 0.24.
Ensberg had 624 PAs in his best HR-rate season. That's 588 periods of 37 PAs. The expected number of 8+ HR periods we should see, if he were hitting HR at his best-season rate: 0.55.
The number of such periods he had in 2006 alone: 5.
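The per-window probabilities above came from a simulation; the same quantity can also be computed directly from the binomial distribution. This is a minimal sketch, assuming independent PAs at a fixed HR rate (the very assumption being debated) and using the rounded rates quoted in the post. The function names are illustrative and are not from the original analysis. A single simulated run over overlapping windows is noisy, so the exact figure need not match the simulated one closely.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def expected_hot_windows(total_pa, width, k, p):
    """Expected number of overlapping `width`-PA windows with >= k HR.

    Overlapping windows are correlated, but expectations still add,
    so this is simply (number of windows) * (tail probability).
    """
    n_windows = total_pa - width + 1
    return n_windows * binom_tail(width, k, p)

if __name__ == "__main__":
    # PA totals and rounded HR rates are the ones quoted in the post.
    for label, total_pa, rate in [("career", 2580, 0.043),
                                  ("best season", 624, 0.058)]:
        tail = binom_tail(37, 8, rate)
        expected = expected_hot_windows(total_pa, 37, 8, rate)
        print(f"{label}: P(8+ HR in 37 PA) = {tail:.6%}, "
              f"expected windows = {expected:.2f}")
```

The expected count is only an average under the fixed-rate model; it says nothing about what causes any excess over it.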
- - - - -
Just using a binomial formula with his career rate of 4.3%, the mean HR hit in a 37-PA sample is 1.577. Standard deviation is 1.229. That means an 8-HR result in a 37-PA sample is more than 5 standard deviations from the mean. Using any reasonable standard of statistical significance, we would reject the hypothesis that that sample was simply random variation around his career norms.
The same exercise with his best-season rate of 5.8%: an 8-HR stretch is more than 4 deviations from the mean. Same conclusion: this is not simply random variation.
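The mean and standard deviation used here follow from the binomial formulas, mean = n*p and sd = sqrt(n*p*(1-p)). A short sketch, again assuming a fixed rate per PA; `binomial_z` is a hypothetical helper name. With the rounded 4.3% the mean works out to 37 * 0.043 = 1.591 rather than the 1.577 quoted, which suggests (an assumption on my part) that the post used an unrounded career rate. The conclusion about the number of standard deviations is unaffected.

```python
from math import sqrt

def binomial_z(n, p, observed):
    """Return (mean, sd, z) for `observed` successes under Binomial(n, p)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean, sd, (observed - mean) / sd

if __name__ == "__main__":
    # Rounded career and best-season HR/PA rates from the post.
    for rate in (0.043, 0.058):
        mean, sd, z = binomial_z(37, rate, 8)
        print(f"rate={rate:.3f} mean={mean:.3f} sd={sd:.3f} z={z:.2f}")
```

A binomial with a mean this small is skewed, so translating "z standard deviations" into a probability with the normal curve is only rough; the exact binomial tail is the safer number.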
Now, maybe it's not being "locked in". Maybe he faced a string of pitchers who happened to be bad. The games were in Arizona and Houston, but a park factor likely isn't enough to get the actual performance within believable variation from the norm. Could be anything. The person in question says now that he felt locked in at the time, but we don't know how much of that is confirmation bias.
But it's not normal random variation. It's decidedly non-random.
I wanted to quote the whole comment and put a "like" on it. But that would be annoying/space waster. Excellent post.
This is the whole key, isn't it? We don't have an "in the zone meter mood ring" that can measure our "in the zone-ness". You could self-report it. Maybe you have a little clicker before every PA where you click "in", "out", or "not sure" before you step into the box.
Even then, I think people would be too optimistic on the 'mood'. I don't doubt for a second that people feel "on" before the fact. I know that non-professional athletes have days like that. (And yes, I know that professional athletes are professional partially because they have been able to maximize the number of "on" days they have, and that it's possible that at the professional level being "on" isn't as big a factor. But still, it happens frequently enough that I think it's a real phenomenon.)
You're not seeing my point. Probability theory doesn't mean that Morgan Ensberg has the same chance of hitting a HR in every plate appearance, nor does it mean there's no such thing as being "locked in." What it is is a method with which to evaluate the chances of an event occurring given all of the information we can reasonably account for. If it can be shown that having hit 8 homers in 37 PAs makes it more likely than normal he'll hit a homer in his next PA, then that can be accounted for.
Morgan Ensberg's chances of hitting a home run in his next PA is a random variable. By definition it is subject to random chance because by definition we don't know whether he will or will not hit one. There's nothing "decidedly non-random" about it. That this outcome is influenced by a large number of variables and that these influences are under no obligation to be equal in every PA is a given. But because there is uncertainty to the outcome, probability theory applies. If we can isolate how his chances change based on certain factors, that's fine. But the question with being "locked in" is, can we indeed do that with this?
This is completely wrong. 4-sigma and 5-sigma events will happen even if variation is random. They just won't happen very often. How many times has this kind of performance been observed over the past 20 years? Over the past 50? If you can show they happen much more often than they should, then there is something to talk about. Picking one extreme performance, after the fact, and saying "this was really unlikely" doesn't prove a thing.
This lady rolled the dice 154 times without rolling a 7. That's way less likely than Ensberg's performance. Are you going to tell us she was "in the zone?" Sheesh.
Probably a lot more than random odds say they would have happened. I'm thinking you could go out and find at least one streak per player in their career akin to that in some respects.
That answers a different question than the question being asked. The question at hand is "Is Morgan Ensberg's pattern of performance consistent (or not inconsistent) with being 'locked in'?" The answer is clearly yes, as the odds that the bunching of his home runs would have appeared by random chance given everything else we know about him are infinitesimally small.
You're asking the question, "Do players generally exhibit patterns consistent with being 'locked in'?" The answer to that obviously can't be found in Morgan Ensberg's performance -- but no one ever suggested it could.
(The dice roller example is inapposite. Hitting major league pitching is primarily a function of skill. The numbers that result from a dice roll have nothing to do with skill.)
But this is precisely the point: many, many people have gone looking for evidence of a "hot hand" in baseball (and other sports, for that matter). And they have basically found....squat. Streaks like Ensberg's happen as often as we'd expect -- no more and no less. Are you really not aware of this?
My complaint with this is that the focus is on the wrong thing when it comes to probability theory. Probability theory isn't about which things are skill and which is luck, probability theory is about the observer's knowledge of future outcomes.
Take a deck of cards. What are the chances the top card is the eight of spades? 1 in 52? If you turn the card over and replace it on top of the deck, your answer to the question changes radically. The probability has changed significantly, but the card itself has not. What has changed is our knowledge about the card.
So when comparing dice to hitting a baseball, that one involves lots of skill and one involves very little is somewhat of an irrelevancy when using probability theory. The bigger issue is, what do we know about the dice or the players involved that helps us set a probability of an outcome. That one is based on human skill and really complicated physics, and the other based almost solely on really complicated physics, only really matters in so much as how much effort we're willing to spend trying to identify meaningful variables.
So for hitting, what we try and do is to gather more information to help us better understand how the probability of a hit or home run or whatever might change from at bat to at bat. If it's possible to use streak information to make those adjustments, by all means we should.
BTW, I would expect more clustering than chance on these sorts of numbers regardless of being locked in, due to the clustering of home games, good and bad pitching, good and bad hitters parks, weather and probably a few other things I can't think of right now. Building a model where the chance per PA changes based on the results of the four previous PAs would be interesting. See how that model compares to one where it's the same, and then see how large a sample you'd need to be able to actually see the difference between the two.
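The comparison proposed above can be sketched as a small simulation. This is only an illustration under stated assumptions: the "streaky" model adds a fixed, made-up boost to the HR chance for each HR in the previous four PAs, and its base rate is chosen so the long-run rate is close to the flat model's 4.3%. None of the parameter values or function names come from the thread.

```python
import random

def simulate_pas(n_pa, base_rate, boost, rng, memory=4):
    """Simulate n_pa plate appearances as 0/1 outcomes (1 = home run).

    The HR chance in each PA is base_rate + boost * (HRs in the previous
    `memory` PAs). With boost = 0 every PA has the same chance.
    """
    outcomes = []
    for _ in range(n_pa):
        recent = sum(outcomes[-memory:])
        p = min(1.0, base_rate + boost * recent)
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes

def count_hot_windows(outcomes, width, threshold):
    """Count overlapping windows of `width` PAs with >= threshold HRs."""
    if len(outcomes) < width:
        return 0
    running = sum(outcomes[:width])
    count = 1 if running >= threshold else 0
    for i in range(width, len(outcomes)):
        running += outcomes[i] - outcomes[i - width]
        if running >= threshold:
            count += 1
    return count

def compare(n_pa=200_000, seed=1):
    """Return (flat_count, streaky_count) of 37-PA windows with 8+ HR."""
    flat = simulate_pas(n_pa, 0.043, 0.0, random.Random(seed))
    # Hypothetical streaky model: base 0.037 plus 0.035 per recent HR gives
    # a long-run rate near 0.037 / (1 - 4 * 0.035), i.e. about 0.043.
    streaky = simulate_pas(n_pa, 0.037, 0.035, random.Random(seed))
    return count_hot_windows(flat, 37, 8), count_hot_windows(streaky, 37, 8)

if __name__ == "__main__":
    print(compare())
```

Shrinking `n_pa` toward a single career's worth of PAs and rerunning with different seeds gives a feel for how large a sample is needed before the two models can be told apart.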
There's a world of difference between saying, as Voros basically did, "These are the assumptions I made in order to create this model of baseball. The results produced by the model are very useful" and saying, "Because the results of my model built on these assumptions are useful, it is impossible that the batter-pitcher confrontation in baseball does not operate as a weighted random number generator. Because my framework is internally consistent, I am not interested in exploring the question outside of my assumptions. Because many people have drawn wrong conclusions from other assumptions and models, all other assumptions and models are dangerously flawed and must be rejected."
I'd go with stochastic process, not random variable. And by dint of being a stochastic process we can more conveniently model this as an arrival process with a given arrival rate. To the point that is being made here, the arrival rate is extremely unlikely to be a fixed, stationary rate ... we certainly know that this rate varies with age (or else we would not have aging curves) and it also almost-certainly (i.e., p approaching 1) varies over shorter time spans. If we accept that injuries, arousal and attention/focus (whether or not Ray wants to believe in it) can affect this rate, we certainly can envision a player having a peak/trough rate for HRs that is much higher/lower than his "average" rate for HRs, and that this peak/trough rate could be sustained for periods of time without our ability effectively either to model it or predict it.
This is where Voros's initial remarks about the true meaning of "random" are quite apt --- we know that we can't realistically identify and quantify, at every single discrete point in time (i.e., PA), the many parameters that feed into the arrival rate for HRs. In the case of a baseball player, that underlying arrival rate also has to account for the other players on the field (in particular the pitcher), and meaningful prediction of the rate associated with any given PA becomes almost impossible (i.e., p approaching 0). Let's be honest, non-stationary stochastic processes are a beast to work with. Therefore, we typically treat the rate as a constant (as in villiageidiom's analysis) and claim that "hot streaks" exist in retrospect, but (as Voros notes) have no predictive value.
Post #4 correctly notes the resolution, and it WOULD be nice to see the big names rely more on "we can't predict it yet" rather than claiming something is "random". As most of us here know, these two statements are not going to be interpreted by the public as equivalent. Whereas the writers know they are using "random" to mean the former, the public's general understanding of "random" is much more along the lines of "fundamentally unpredictable".
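The effect of a non-stationary rate on streak frequency can be illustrated with a two-state mixture. This is a deliberately crude sketch: it assumes a whole 37-PA window is played at either a "hot" or a "cold" rate, and the 10% hot share and 12% hot rate are invented for illustration. The cold rate is solved so the overall average stays at the 4.3% quoted earlier in the thread.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def mixture_tail(n, k, states):
    """P(X >= k) when the rate is drawn from `states`, a list of
    (weight, rate) pairs whose weights sum to 1."""
    return sum(w * binom_tail(n, k, p) for w, p in states)

AVG_RATE = 0.043                      # career HR/PA rate quoted in the thread
HOT_SHARE, HOT_RATE = 0.10, 0.12      # hypothetical values, for illustration
# Solve for the cold rate that keeps the overall average at AVG_RATE.
COLD_RATE = (AVG_RATE - HOT_SHARE * HOT_RATE) / (1 - HOT_SHARE)
STATES = [(HOT_SHARE, HOT_RATE), (1 - HOT_SHARE, COLD_RATE)]

if __name__ == "__main__":
    print(f"constant rate: {binom_tail(37, 8, AVG_RATE):.6%}")
    print(f"two-state mix: {mixture_tail(37, 8, STATES):.6%}")
```

Both models produce the same long-run HR rate, yet the varying-rate model makes an 8-HR window far more likely. That is consistent with the point that a constant-rate treatment is a modeling convenience, and equally with the point that a streak seen in retrospect does not by itself identify when the rate was high.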
I'm aware of the studies. It doesn't mean that they have accounted for everything. It doesn't mean that they are actually right, as it's a psychological/physiological event. This is something that people who sit at home and never go out and do anything would believe. The idea that people don't get locked in goes so far against every reasonable experience of living that to say it doesn't happen is just insane.
Most of the studies go looking for future performance from a hot hand. Which is great. I 100% agree with the argument that a hot performance has no indication of a future hot performance. But it's ridiculous to say "that hot performance is strictly a statistical blip, that is expected"...
If you are looking at a player whose true level of performance is .300/.400/.500 over the season, that doesn't mean his true level of performance for a particular day equals those numbers. One day he might be hung over and his true talent is .250/.350/.450... one day he could have sharper vision and hearing, his body isn't aching, etc., and his talent for that day is .400/.500/.600. Factor in his will to do well (on that hungover day, it might be a blowout, and he just goes through the motions) or whatever other variables you can come up with.
Edit: or what Tom T said in 50.
I disagree. Ensberg could choose to swing away or just try to make contact. If that is within his decision tree, and if that decision could increase his chances of hitting a HR, wouldn't that be non-random? I am really not sure how you come out on this question, so I will ask you this:
If some of these rate variables (e.g. pitch velocity, pitch movement, humidity in the park) are not known precisely, but some of the other variables (such as pitch selection, batter's approach) are BOTH open to choice AND can raise or lower the odds, then does that make the event RANDOM or something not quite totally random?
I think that is what the poster is saying. That there really is a class of outcomes where the rate at which they occur is not perfectly random, but not perfectly deterministic either. I prefer to call them "chaotic" but that's just a name.
Say for instance a hand of poker? That's not a completely random outcome, is it?
I wouldn't go so far as to say a hand of poker is a completely deterministic process, but I wouldn't say it's perfectly random either. If I use the term "not quite perfectly random," would you agree? What if I said "definitely not totally random"? Or "decidedly non random"? The poster is not going so far as to say that the process is deterministic, but he's saying it's not perfectly random either.
Isn't that a more accurate description?
This is more elaboration on my last post, so perhaps redundant. And forgive me if I am missing your point. But just to clarify:
Are you saying that BECAUSE probability theory applies, then the process must be random? Perhaps this is simply a linguistic issue, and we already agree. But I dunno.
Then why did you say that streaks like Ensberg's take place more than random chance would allow for? This is a false statement, and now you claim that you already knew this, yet you made it anyway. Very odd. And none of the rest of your post is relevant to what we were debating...
To him it still would be; it would just change the underlying probabilities depending on his decision. But for us it really wouldn't even do that unless he shared that information with us. For us, the underlying probabilities are the same because we don't know what approach he's going to take. It's a playing card he can see that we can't, and so the probability of the identity is different for us than it is for him. Is Billy Hamilton going to steal on the next pitch? Our estimation of the chances of this will be a lot different (and less accurate) than Billy Hamilton's, but that does not mean we can't still estimate it.
I'm harping on this because it's critical: it doesn't have to be cards or dice or billiard balls in a bag in order for probability theory to apply. It just has to be a situation where _we_ are uncertain (even if there are others who are not).
It is going to be difficult to identify hot and cold streaks for some other reasons that have not been mentioned. In terms of baseball: if someone has hit 3 HRs in a row, they are very likely not going to see good pitches next time up and/or they may be intentionally walked. Any study that attempts to measure this would have to take that into account.
In terms of basketball, which I frequently see referenced along with studies that show that hot streaks don't exist: one needs to understand that the dynamics of matchups and zones allow teams to double-team or otherwise make adjustments to players who are "hot." Again, studies need to take this into account, and I am not sure that they do.
Just to make an easy counterexample: they say that the outcome is independent of the previous shot, but when you study free throw percentage the second shot is always made at a higher rate. It is like getting a practice shot (but it still counts), so these outcomes are not totally independent.
I don't think you've answered the question. Let's ask the question from Hamilton's standpoint.
Is his chance of getting the SB random or not quite random?
We know it's not deterministic, so we reject that at the outset.
If you were Hamilton, with his talent and your brain, how do you answer that?
Cfb can defend himself but ... he didn't say that.
And none of the rest of your post is relevant to what we were debating...
It's entirely relevant to it.
You're still confused about what's being debated. This:
streaks like Ensberg's take place more than random chance would allow for?
doesn't test or bear upon whether Ensberg's bunchings are caused in whole or in part by being locked in. Nor is it relevant whether Ensberg being "locked in" is confirmed by the population generally exhibiting the tendency to be "locked in" -- though that's certainly an interesting question.
I didn't say that. I was talking about studies of hot hands(as were you on the part I commented on)
This is what you said.
I am not aware of studies that look through the whole of history to find streaks like Ensberg's and put them into mathematical constructs. I generally see articles, done by people with a Raylike grasp of real life, in which they are trying to discover a hot hand. Villigeidiom pointed to a particular hot streak by Ensberg, with no argument on its predictive power, just on the randomness of its ever happening, and is claiming that this particular type of streak is statistically unlikely, and I was mentioning that it probably happens more than statistical probability says it should.
Right, I agree. They have all these studies and they state this sort of conclusion ("strictly a blip") but they usually don't account for things such as double teaming the hot player in basketball, or simply covering him tighter.
I mean how could you measure that, it would be difficult, no? (as fanboy says, these studies are done by people and they are not all that convincing) Say we find Michael Jordan makes 3-3 FG. then they do this study and prove that his shooting percentage is only 57% on his 4th attempt...
OK, did they really account for the data, when in fact he passed the ball off on say 75% of his fourth possessions and the guy was wide open for an easy basket? I daresay I doubt they did.
Same in baseball, you'd have to account for intentional walks, as well as unintentional/intentional walks.
The problem of course is that these processes are not taking place in a vacuum where most variables can be set and left in place. Because players are involved, decisions can be made, and so the players themselves are manipulating the decisions and hence the outcome probability. It's very difficult to study all this as if it is a truly random process such as sun spots or deer populations. You have human intervention, in the midst of all this.
"Random chance" is a human construct. Physicists (and philosophers) argue about this, but there is a (minority) school of thought that it simply does not exist in real life. That if you could know everything, you'd know the outcome of everything. Heisenberg muddies the waters here.
Throwing dice can be accurately predicted as long as you can know every single relevant variable to predict the outcome. There's nothing random about it; throw the same exact two dice exactly the same way under exactly the same conditions and you'll get exactly the same result. What makes it random is we can't possibly know every single relevant variable to the accuracy necessary to predict the outcome. It's not the dice, it's _us_ that change the underlying probabilities.
"Random chance" is not a state of nature, it is a human tool we use to help us deal with the concept of uncertainty. And since uncertainty is often based on perspective, whether random chance is a useful tool often depends on that perspective as well.
How many performances like Ensberg's would we expect to see in the course of ML history through random chance? I'm fairly certain that if you look hard enough, you will find streaks similar to his for probably 50% of the major league players who have played the game. It might not be home run rate outside the norm, instead it might be Pete Kozma going .500/.481/.875 over 27 at bats. Or maybe Dale Long with 8 home runs in 8 games. Or maybe Bo Hart putting up .460/.481/.660 over 50 at bats. Etc... There is no way that this is purely random noise. They were locked in, and then they weren't.
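The "how many would we expect through random chance" question can be put to a simulation. The sketch below is only an illustration and is not taken from anyone's study in this thread: it assumes every at-bat is an independent draw with a single fixed hit probability, and the career length, the .260 hit rate, and the 13-hits-in-27 threshold are all made-up inputs.

```python
import random


def has_hot_stretch(n_ab, p_hit, window, min_hits, rng):
    """True if any `window` consecutive at-bats contain at least `min_hits` hits."""
    hits = [1 if rng.random() < p_hit else 0 for _ in range(n_ab)]
    running = sum(hits[:window])          # hits in the first window
    if running >= min_hits:
        return True
    for i in range(window, n_ab):
        running += hits[i] - hits[i - window]   # slide the window by one at-bat
        if running >= min_hits:
            return True
    return False


def share_with_hot_stretch(n_players, n_ab, p_hit, window, min_hits, seed=0):
    """Fraction of simulated constant-talent hitters showing at least one hot stretch."""
    rng = random.Random(seed)
    count = sum(
        has_hot_stretch(n_ab, p_hit, window, min_hits, rng)
        for _ in range(n_players)
    )
    return count / n_players


if __name__ == "__main__":
    # Hypothetical inputs: 500 hitters, 3000 AB each, .260 hitters,
    # looking for 13+ hits in some 27-AB stretch.
    print(share_with_hot_stretch(500, 3000, 0.260, 27, 13))
```

Whatever share comes out is the baseline that a purely random model produces on its own. The argument that these players were "locked in" needs real streaks to show up more often than that baseline.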
I don't disagree with any of that, I was trying to find out why you objected to the term: "decidedly non random."
Both, I guess? I don't think his perspective is particularly useful to a coach. How does it help Morgan help a player improve to know that he can model the baseball world as a weighted random number generator and get some useful predictions from that for the future? Whereas figuring out what a player does to get "locked in" and helping him repeat those mechanics, that mental approach, etc., could be very useful to a coach.
This is a great point. With all the basic statistics we are making the fundamental assumption that neighboring PAs are independent. And given the correlations with true runs scored, this is certainly true to a first or even second approximation.
But it's possible that adding in correlations between PAs you would get a better model.
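A first check on the independence assumption is the lag-1 autocorrelation of a hitter's PA outcomes, coded 1 for reaching base and 0 otherwise. This is a minimal sketch under that coding assumption; the function name and the simulated sequence are illustrative, and no real play-by-play data is used.

```python
import random


def lag1_autocorr(outcomes):
    """Lag-1 autocorrelation of a 0/1 outcome sequence (0.0 if there is no variance)."""
    n = len(outcomes)
    mean = sum(outcomes) / n
    var = sum((x - mean) ** 2 for x in outcomes)
    if var == 0:
        return 0.0
    cov = sum(
        (outcomes[i] - mean) * (outcomes[i + 1] - mean)
        for i in range(n - 1)
    )
    return cov / var


if __name__ == "__main__":
    rng = random.Random(0)
    # Independent PAs with a hypothetical .340 chance of reaching base.
    independent = [1 if rng.random() < 0.340 else 0 for _ in range(20000)]
    print(lag1_autocorr(independent))
```

For independent PAs the value should sit close to zero. A clearly positive value in real data would be the kind of correlation between neighboring PAs that a better model could try to use.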
And @61 is why free will is an illusion.
A coin flip. Whether a coin flip comes up heads is a function of many things -- which side was up to begin with, the weight of the coin, the force applied, the point at which the force is applied, the air current, where is it caught, etc.
So, looking at past coin flips, it would be very hard for us to know every single variable involved and even harder to measure every single variable involved. But, if we could, we could completely explain the past results of coin flips.
But, even if we knew the correct model that perfectly determines a coin flip, any prediction of future coin flips requires us to be able to predict every single one of those variables perfectly. And the underlying model can't change (which it probably doesn't in physics but it almost certainly does for human/social phenomena) ... which means we need to be able to completely understand what determines the parameters of the model and perfectly predict all of those variables.
That's where Voros goes a bit too abstract. It's not (just) about our imperfect knowledge about what causes hot streaks -- even if we had perfect knowledge of what caused past hot streaks, we need to perfectly predict everything that determines the model and everything that feeds into that model to be able to predict future hot streaks with certainty. That's not "imperfect knowledge" that's an inability to predict the future of absolutely everything with absolute certainty.
Which means it falls into the category of why the #### would anybody waste time talking about it?
We know it's not deterministic, so we reject that at the outset.
Then it is "random" or "stochastic." You are misunderstanding what a "random process" means. A process is treated as random when it is not (known to be) deterministic. Yes, this means it is a coin flip but that does not mean that every coin flip is 50/50. The goal in a statistical model is to estimate the parameters underlying a random distribution.
So a Bernoulli (0/1) random variable is purely a function of the parameter p -- the probability. Joey Votto has a higher probability of reaching base than does Starlin Castro but it remains random (or "random") whether Votto reaches base in his next PA. How did we determine Votto has a higher probability? By past performance.
OK, but the quality of the pitcher counts too. Fair enough, we bring the pitcher's past performance into the model as well and maybe some term to try to capture the interaction. In his next PA, Votto has the misfortune of facing Clayton Kershaw while for some bizarre reason Castro's next PA is against 77-year-old fiddle player Doug Kershaw. Our model now predicts the probability of Votto reaching base has plummeted to .38 while the model suspects that Doug Kershaw can't even get the ball to the plate but it's still Starlin Castro so we now estimate he has a 10% walk rate for this PA, raising his expected OBP to .38. But the actual result is still a coin flip.
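To make the "weighted coin flip" concrete, here is a rough sketch of a per-PA Bernoulli model that folds in pitcher quality. The comment above does not say how batter and pitcher are combined; the odds-ratio method used here is one common choice and is an assumption, as are the function names and every number in the example.

```python
import random


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def matchup_probability(batter_obp, pitcher_obp_allowed, league_obp):
    """Odds-ratio estimate of P(reach base) for one batter/pitcher matchup."""
    combined = odds(batter_obp) * odds(pitcher_obp_allowed) / odds(league_obp)
    return combined / (1.0 + combined)


def simulate_pa(p_on_base, rng):
    """One Bernoulli trial: True if the batter reaches base."""
    return rng.random() < p_on_base


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical rates, not real player statistics.
    p = matchup_probability(batter_obp=0.420,
                            pitcher_obp_allowed=0.270,
                            league_obp=0.320)
    print(round(p, 3), simulate_pa(p, rng))
```

The model only moves p up or down. Whether this particular PA ends with the batter on base is still a single random draw against that p.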
Now, part of what you're asking also seems to be about conditional probability. Yes, our models tend to center on predicting things like whether Hamilton will be safe on a stolen base attempt. But that is obviously conditional on an attempt being made. In theory though there is nothing stopping us from modelling the decision to attempt a steal and the conditional probability of it being successful.
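The two-stage idea (model the decision to go, then the outcome given that he goes) can be sketched in a few lines. The probabilities and names below are hypothetical placeholders, not estimates for any real runner.

```python
import random


def p_stolen_base(p_attempt, p_safe_given_attempt):
    """Unconditional P(steal) = P(attempt) * P(safe | attempt)."""
    return p_attempt * p_safe_given_attempt


def simulate_steal(p_attempt, p_safe_given_attempt, rng):
    """Simulate one opportunity: 'no attempt', 'safe', or 'caught'."""
    if rng.random() >= p_attempt:
        return "no attempt"
    return "safe" if rng.random() < p_safe_given_attempt else "caught"


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical: runner goes on 30% of opportunities, safe 85% of the time.
    print(p_stolen_base(0.30, 0.85))
    print([simulate_steal(0.30, 0.85, rng) for _ in range(5)])
```

Most published success rates only describe the second stage. The first stage is where the runner's own information about whether he is going enters.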
Now, as Voros points out, Hamilton has a lot more knowledge about whether he's going to go on this pitch than we do. But even here the knowledge isn't exactly perfect -- we've all seen guys break for 2nd, get a bad break, and scramble back to 1B. So even Hamilton can't perfectly predict whether there will be a stolen base attempt on this pitch. At best he can predict whether he will intend to steal on this pitch. Yes, in this sort of scenario, we could consider the steal attempt to be "deterministic" in that it's strictly Hamilton's decision. Similarly, at some split-second in there, Ensberg "decided" whether to swing or not.
In terms of hot streaks, there is a difference between the counter-arguments with regard to health vs. mechanics. Healthy players should perform better, players are often banged up in minor ways during the season so a player's mean performance is presumably a measure of their <100% performance. So, when 100%, they'll perform better than we expect -- i.e. p has gone up. We don't adjust for this in our model but in theory we could. The main thing is though that it's reasonably easy to understand the cause of health issues and predict whether the guy is going to be 100% in his next PA -- i.e. if he was only 80% in his first PA of the game, it's pretty likely he's gonna be about 80% in his next PA. Something like pitcher quality or men on base are similar.
But mechanics doesn't exactly help. In theory we could include mechanics in the model and we'd find that in any PA when Ensberg used his "best" or "most optimal" or whatever mechanics, he did much better on average than usual. Hot streaks then would (often) be the result of him employing these excellent mechanics over 30-50 consecutive PA. So we could "explain" the hot streak ... but we've explained the hot streak of hitting by finding a hot streak of excellent mechanics. The question immediately becomes why did he have a hot streak of excellent mechanics and we haven't really gotten closer to answering the question -- basically we've just come up with a new definition of "hot streaks" or "locked in" as "unexpectedly long runs of excellent mechanics" rather than basing it on performance.
Which would get us back to "luck". Presumably there are times when players have had unexpectedly long runs of excellent mechanics that did not result in excellent performance and vice versa. If "observed hot streaks" are actually a mix of genuine hot streaks and "random" hot streaks and they are missing a set of genuine hot streaks -- i.e. if "hot streak" contains measurement error -- then our ability to detect anything declines.
Still in all it really comes down to this -- do you believe that God, the universe or Bud Selig has pre-determined what is going to happen to you in the next billion nano-seconds?
I don't know, but I'm pretty sure that it's my dick that decides everything I do.
Mike/64: Should Tango's work be evaluated in terms of whether it would be "helpful to a coach?" It doesn't seem to me that is usually the context in which it's presented. In any case, I don't see how it applies specifically to this post, in which Tango finds common ground with a retired ballplayer.
But I think you are identifying a gulf in perspectives that explains why this debate will probably never end. From the point of view of the analyst, the random-number-generator model may be perfectly serviceable. That is, it may answer the questions the analyst is tasked with answering. But a player (or coach) can't take the RNG view: that kind of fatalism would lead to complacency, and reduced performance. The RNG model may work analytically, but paradoxically, that's probably only true in a world in which players are constantly seeking to maximize their own performance. IOW, the RNG model may only work so long as players don't believe it!
A rough parallel is investing. The best investment strategy is to buy and hold a broad index fund. However, if everyone pursued that strategy, the index strategy would stop working. The efficiency of the broad market is dependent on millions of people seeking to gain an edge, to beat the index. They will almost all fail, but their effort is necessary to make the market efficient overall.
It works for many of the questions an analyst is tasked with answering. I don't think it works for all of them, and that is a big part of my gripe with narrowing in on and holding fast to that model as the only workable analytical perspective.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.115.6700
Well said, Walt. This is where we (as signal modelers) have to simply say, "crap, we'll just treat everything as stationary within some time frame and run with it...." With a ludicrous amount of data, we can probably use various regularization techniques to estimate SOME of the parameters and associated coefficients, but a season's worth of PAs is almost certainly not enough to do so for a single player. Even with our football data (where kids take up to 2000 hits in 12 weeks), it isn't clear if we can all that effectively predict the changes we see in the brains based solely on the incoming mechanics --- assuming the kids are drinking or doing some sort of, ahem, medication, or partaking in risky behaviors that involve head contact outside of sponsored practice and game activities...well...we won't be able to explain all the variance and may have to conclude it is "random".
And GuyM here hits on an issue one certainly experiences at lower levels of baseball --- with my son's Mustang team, I have to be paying attention during pre-game activities to conclude which boys appear to be on their ADD medicines (and which aren't), which boys look like they are rested, which boys appear to have had a bad day, and, critically, which boys with idiot moms/dads who think they know how to coach hitting have said moms/dads sitting right behind home plate to encourage their boy to "Use the swing (mommy/daddy) taught you!" that has led to 12 Ks in 14 ABs....
Managers who put in a fixed lineup each day are basically taking the RNG approach, though the guys who are fiddling every single day are probably just making themselves feel good. If I were to just take the RNG approach, we'd run into issues as the few kids who are hitting on any given day wouldn't be grouped. Still, I'm probably lucky if successful grouping results in an extra 1-2 runs in a 6-inning game.
I certainly agree there can be short-term changes in true talent, and there will be analytical tasks associated with trying to understand whether and why they happen. For example, has Mujica's change in pitch selection really made him a better reliever? And if so, is it likely to be sustainable? And maybe I'm wrong, but I doubt that Tango would deny such a possibility.
I suppose it's true that saberists can develop a reflexive skepticism about the meaning of short-term performance changes, and perhaps will miss a real change as a result. But that's because 95% of the claims made about the importance of short-term performance changes turn out to be nonsense. And in any case, it seems to me that about half the saber pieces I see are attempts to divine whether some player's recent improvement or decline is "real," so I honestly don't see this as a huge danger.
*
I do think that there are limits to how much this cultural/perceptual gap can be closed. Athletes have been taught to dismiss "luck" -- AKA randomness -- since they were children. As an explanation for your own failure, it's an abdication of responsibility to try to learn from failure and do better in the future. As an explanation of opponent success, it's poor sportsmanship. We don't want our athletes to give much credit to randomness, and really they can't afford to think that way. But analytically, it is often an essential factor to understand.
Something like this occurred to me when I had a similar, earlier conversation with Ensberg on Twitter. He had said something about the Astros having the best overall minor league W-L record, and that while the credit for that should go to good players, the job of him and the other coaches was to teach them how to win. He said a good performance in a loss was considered a failure.
I questioned that. In the minors? Aren't the minors about player development over winning? I can understand a good performance in a loss being a disappointment, but a "failure"? He stuck with that word. I asked, If you had a good starting pitching prospect at Double-A, would you make him a closer because the AA team needed a closer to win games? He said yes.
This made me think I wouldn't want Ensberg as my team's director of player development, but that doesn't mean I don't want him coaching. That is, even if he is philosophically or strategically "wrong" in how he values wins vs. player development in the minors, he might be a great coach. He says he's teaching players how to win, which can easily be lampooned as #twtw nonsense, but maybe that's just his description of teaching, say, a mental approach that results in better physical performance. What he describes as "how to win" may be how to focus, how to deal with late-game pressure and adverse conditions and so on, in ways that allow a player to maintain good mechanics even in situations where it's tough to do so, like when he feels bad, or the game is on the line and he fears letting his team down. In other words, how to feel like he's "locked in," or at least feel like he's not "not locked in," which may improve his chances of success.
What Tom T/72 said really resonates with me because I've helped coach my son's 9-10 year old team this year. At that age, the kids have the basic baseball skills. Most of them (this is just basic Pony League, not a travel team or anything) can reliably catch, throw accurately, hit a strike, field grounders and fly balls and so on. How well they play depends almost entirely on their mental approach. As Tom said, I can pretty much watch them play catch when they first arrive at the field and tell you if they're going to play well or poorly. But we can also coach them into focusing and concentrating and being confident, and get them to play better. It's really been eye-opening. An expert hitting coach can fiddle with a kid's swing mechanics for 100 hours, and I believe it wouldn't have nearly the effect on performance of teaching that kid a good mental approach to the game.
You're not paying attention to which boys have hot moms? And you call yourself a coach.
I agree with the above comment except I'd say, "There are too many variables to make reliable predictions. Yet. :D"
Let me rephrase the question in a different way.
Are there examples we can find where a more recent set of data pts is more predictive of future events, than the set of all data pts?
Say for example Clemente in 1962 when he really began to emerge as a hitter. His production in 1961 and 1960 might be a better indicator of his future ability rather than his entire career numbers up to that pt.
Now, how hard of a leap is it to go from a season's worth of data to say a week or month? So someone's last month or last week might be a better predictor than their career average or their seasonal average. I don't think it's a stretch.
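Whether a recent window beats the full record is something you can score directly: estimate the rate both ways, then compare each estimate against what actually happened next. The sketch below assumes outcomes are coded 0/1; the helper names and the toy history (a hitter whose true rate jumps partway through) are invented for illustration.

```python
import random


def rate(outcomes):
    """Observed success rate of a 0/1 sequence."""
    return sum(outcomes) / len(outcomes)


def recent_rate(outcomes, window):
    """Observed success rate over only the last `window` outcomes."""
    return rate(outcomes[-window:])


def squared_errors(history, future, window):
    """(full-history error, recent-window error) against the future rate."""
    target = rate(future)
    full_err = (rate(history) - target) ** 2
    recent_err = (recent_rate(history, window) - target) ** 2
    return full_err, recent_err


if __name__ == "__main__":
    rng = random.Random(0)

    def draw(n, p):
        return [1 if rng.random() < p else 0 for _ in range(n)]

    # Hypothetical hitter: .300 OBP talent for 1500 PA, then .380 afterwards.
    history = draw(1500, 0.300) + draw(600, 0.380)
    future = draw(600, 0.380)
    print(squared_errors(history, future, window=600))
```

When talent really has changed, the recent window should usually win. When it hasn't, the smaller sample just adds noise, and shrinking the window from a season to a week makes that noise much worse.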
Okay now for the better question:
Q2. Assuming you agree with Q1, can we then identify, with reasonable certainty, those players whose short-term data set is a better indicator of their ability in the immediate future?
This might be the more difficult question: separating true hot streaks from random fluctuation. Say for example when Larry Sheets hit 31 HRs, did he really make a breakthrough or was this some sort of fluke?
Try approaching the issue that way..