Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Friday, August 26, 2011

Bill James Explains New ‘Temperature Gauge’ Statistic to Determine How Hot or Cold a Hitter Is


Bill James has always been ahead of the curve when it comes to baseball statistics. In short, he’s made a life out of breaking down numbers and how they affect the game of baseball. Now, he’s back at it.

James recently explained to NESN’s Tom Caron his newest statistic, temperature gauge. The statistic, which starts at room temperature—72 degrees—goes up or down depending on what a hitter does at the plate over a period of time. For example, if a hitter hits a home run, his temperature goes up, and he gets hotter. On the flip side, a double play will cool you off considerably.
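
The excerpt does not give the formula, so the following is only a minimal sketch of how such a gauge could work. The event names and degree values in `EVENT_DEGREES` are invented for illustration and are not James's actual weights; the only detail taken from the description above is the 72-degree starting point and the direction of the home run and double play adjustments.

```python
# Room temperature is the starting point named in the article.
ROOM_TEMPERATURE = 72.0

# Hypothetical degree changes per plate-appearance event (assumed values,
# not taken from Bill James's actual formula).
EVENT_DEGREES = {
    "home_run": 8.0,
    "single": 2.0,
    "walk": 1.0,
    "out": -1.0,
    "double_play": -5.0,
}


def temperature(events, start=ROOM_TEMPERATURE, weights=EVENT_DEGREES):
    """Return the gauge reading after applying each event in order."""
    temp = start
    for event in events:
        temp += weights[event]
    return temp
```

With these assumed weights, a hitter with no recent plate appearances sits at 72, one home run lifts him to 80, and one double play drops him to 67. A real version would presumably also limit how far back it looks, which this sketch does not model.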

Repoz Posted: August 26, 2011 at 09:27 PM
Tags: history, projections, red sox, sabermetrics


1. Jefferson Manship (Dan Lee) Posted: August 26, 2011 at 10:02 PM (#3909880)
I love Bill James, really I do, but I'm beginning to feel like he's just making stuff up because he feels obligated to invent new statistics.

Some guy named Bill James wrote the following about 25 years ago:
The conventional wisdom is completely wrong...a player is just as likely to hit well in a game that follows a slump as he is following a week of hot hitting...I have studied the issue many different ways, trying to isolate something which can be called momentum, and, being unsuccessful, have concluded that that which is called momentum in baseball is not a characteristic of play but a characteristic of the perception of play.
If momentum doesn't exist, what exactly is the point of this new stat?
2. Obo Posted: August 26, 2011 at 10:13 PM (#3909883)
I haven't WTFV but back in the Abstract days James would often come up with what he once called "freakshow stats" that were designed to highlight some particular aspect of the game without necessarily having any serious analytical value. It sounds like "temperature gauge" might fit into that tradition.
3. Robert in Manhattan Beach Posted: August 26, 2011 at 10:20 PM (#3909887)
Obviously a junk stat but it could be fun. It should be normalized to a player's normal performance though. Like if Drew Butera went 3 for 3 (yes obviously this is purely hypothetical, could never really happen) his temp should be like 8000 degrees.
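
One way to sketch that kind of normalization is to measure a stretch in standard deviations relative to what the hitter's usual average predicts, treating each at-bat as an independent trial. The function name and the sample averages in the usage note are assumptions for illustration, not anything from James's stat or any real player's line.

```python
import math


def normalized_heat(hits, at_bats, usual_average):
    """Standard deviations above (or below) what the usual average predicts.

    Assumes each at-bat is an independent trial with hit probability
    `usual_average`, where 0 < usual_average < 1 and at_bats > 0.
    """
    expected = at_bats * usual_average
    spread = math.sqrt(at_bats * usual_average * (1.0 - usual_average))
    return (hits - expected) / spread
```

Under this sketch, a 3-for-3 day scores higher for a hypothetical .200 hitter than for a hypothetical .300 hitter, which is the effect the comment asks for.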
4. Best Regards, President of Comfort, Esq., LLC Posted: August 26, 2011 at 10:22 PM (#3909890)
If momentum doesn't exist, what exactly is the point of this new stat?
Fun. You remember fun, right?
5. LionoftheSenate Posted: August 26, 2011 at 10:33 PM (#3909894)
Players do get hot/cold or locked in from time to time, this is a fact. At the same time, I don't expect this measure to help us understand sooner or predict if or when a player is actually locked in for a brief period. That said, you are a slave to data if you think players don't actually get locked in from time to time. It happens.
6. Bring Me the Head of Alfredo Griffin (Vlad) Posted: August 26, 2011 at 10:45 PM (#3909900)
Fun. You remember fun, right?

This looks about as fun as Productive Outs.
7. JJ1986 Posted: August 26, 2011 at 10:51 PM (#3909902)
That said, you are a slave to data if you think players don't actually get locked in from time to time. It happens.

It happens, but you can't predict when it will end.
8. Joel W Posted: August 26, 2011 at 11:07 PM (#3909912)
If it's just ELO normalized to 72 degrees rather than 1500 points it's pretty fun.
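
A rough sketch of what an Elo-style version might look like, with 72 degrees standing in for the 1500-point baseline. Real Elo derives the expected score from the rating gap between two opponents; here a fixed `expected_rate` for the hitter stands in for that, and the step size `K` is an arbitrary assumed value. None of this is confirmed to be how the actual stat works.

```python
BASELINE = 72.0  # degrees, in place of Elo's 1500 points
K = 4.0          # hypothetical step size (assumed, not from the source)


def elo_style_update(temp, reached_base, expected_rate, k=K):
    """Move the temperature by k times (actual result minus expected result)."""
    actual = 1.0 if reached_base else 0.0
    return temp + k * (actual - expected_rate)
```

For example, a hitter at the 72-degree baseline who is expected to reach base a quarter of the time moves to 75 after reaching base and to 71 after making an out.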
9. The District Attorney Posted: August 26, 2011 at 11:07 PM (#3909913)
Bill recently did a major revamp of his pay site. I'm guessing he wants to drive people there, and saw this stat as a "hook" that could encourage people (even non-stat drunk computer nerds) to check out the rest of it.
10. Best Regards, President of Comfort, Esq., LLC Posted: August 26, 2011 at 11:14 PM (#3909921)
This looks about as fun as Productive Outs.
Productive Outs would have been fun if it was put forward as a junk stat, rather than the secret to the Angels and Marlins success.
11. PreservedFish Posted: August 26, 2011 at 11:23 PM (#3909925)
I think that Productive Outs are more fun than temperature gauge.
12. The Interdimensional Council of Rickey!'s Posted: August 27, 2011 at 12:50 AM (#3909983)
It happens, but you can't predict when it will end.

People who don't believe in "hot hands" are the sorts of people who steal semi trucks in order to smash and grab synthetic vaginas.
13.  Posted: August 27, 2011 at 01:38 AM (#3910021)
If momentum doesn't exist, what exactly is the point of this new stat?

As far as I can tell, nobody really believes that hot streaks don't exist; the quote is more about the ability to predict when the hot streak ends. Players get hot, that is a statistical fact; the ability to predict when that hot streak is going to end is a statistical impossibility.

Some stats-first guys are of the opinion that, since a streak can end at any time, basing future predictions upon the guy's true talent level or established skill level gives a better chance of success than assuming a hot streak is going to continue. I'm in the camp that says trust your eyes: if the hot streak is a product of sharply hit balls, whether they are fielded or not, then take the chance; if the hot streak is a couple of bleeders getting lucky, go by history.
14. McCoy Posted: August 27, 2011 at 02:10 AM (#3910046)
It would be nice if this was actually based on pitch F/X instead of simply being a silly junk stat. Something like the story about the Red Sox and choosing between Giambi or Ortiz but using pitch F/X instead of scouting.
15. Tom Nawrocki Posted: August 27, 2011 at 02:51 AM (#3910083)
James has been running this stat at least since last season. It's not really new.
16. Sam M. Posted: August 27, 2011 at 03:07 AM (#3910089)
People who don't believe in "hot hands" are the sorts of people who steal semi trucks in order to smash and grab synthetic vaginas.

I am so in the clear on this one, it's not even funny.
17. PreservedFish Posted: August 27, 2011 at 04:11 AM (#3910110)
As far as I can tell, nobody really believes that hot streaks don't exist; the quote is more about the ability to predict when the hot streak ends.

This is kind of the friendly way of putting it, the way that tries to avoid shouting matches. (The construction is reminiscent of "clutch hits exist, but not clutch hitters.") But, to be clear, the stathead literature on this states that streaks, cold and hot, are nothing more than statistical aberrations.

It's kind of impossible to believe.

I haven't read any new material on this since the 90s, so, I may be unaware of something.
18. Robert in Manhattan Beach Posted: August 27, 2011 at 04:18 AM (#3910113)
It's not that streaks are statistical aberrations, it's that they are indistinguishable from statistical aberrations.
19. Pasta-diving Jeter (jmac66) Posted: August 27, 2011 at 04:20 AM (#3910117)
the sorts of people who steal semi trucks in order to smash and grab synthetic vaginas

you make it sound like that's a BAD thing
20. PreservedFish Posted: August 27, 2011 at 04:35 AM (#3910124)
It's not that streaks are statistical aberrations, it's that they are indistinguishable from statistical aberrations.

Explain the practical difference.
21. Sit down, Sleepy has lots of stats Posted: August 27, 2011 at 04:42 AM (#3910130)
Bill recently did a major revamp of his pay site. I'm guessing he wants to drive people there, and saw this stat as a "hook" that could encourage people (even non-stat drunk computer nerds) to check out the rest of it.

This stat has been there for a while, though. As long as I've been a member, and that's close to "as long as there was a pay site".

One of those that I always glanced at, thought "interesting", then my inner SSS monitor kicked in, and said "no more fun, get on with looking up how bad ryan theriot has been on the basepaths". Then I'd spend an inning or two feeling guilty for enjoying the thermometer thingy, then forget all about it after TLR did something stupid to lose a game.
22. Bob Evans Posted: August 27, 2011 at 01:22 PM (#3910208)
Explain the practical difference.

Do you really want the poor shooter with the "hot hand" to take the potential game-winning shot?
23. Best Regards, President of Comfort, Esq., LLC Posted: August 27, 2011 at 01:32 PM (#3910210)
Explain the practical difference.

If all streaks are statistical aberrations, then players can never be playing well or poorly, which is ridiculous.

If all streaks are indistinguishable from statistical aberrations, then a player may be playing well or poorly, but it's pointless to use their results to determine that.
24.  Posted: August 27, 2011 at 02:04 PM (#3910220)
A hitter's true talent is gauged by looking at his entire record, which is made up of times when he's feeling well or playing hurt, adjusting to pitchers' patterns or being out-adjusted by pitchers, having a few games at home against pitchers he matches up well against and having a week on the road when he's facing guys whose motions he just can't pick up. Not to mention runs of dumber luck where balls hit bases or stay just fair or bound past a diving Jeter.

The results of all that hot-and-cold play can be beautifully mimicked by a Strat card, but the players themselves are far from Strat cards. It seems to me there's lots of practical significance to that if you are managing, coaching, or just following ballgames intently – maybe less so if you're trying to bet on games, or to model what a player is likely to do in 2012 or over the length of a contract.
25. Best Regards, President of Comfort, Esq., LLC Posted: August 27, 2011 at 02:38 PM (#3910234)
Luck can also be hitting the ball solidly instead of a fraction of an inch off, ripping a line drive instead of popping up or hitting a ground ball.

The talent is not in being able to hit the ball solidly at will, it's to be able keep your swing in the range that you'll make solid contact more often. When exactly you do make that solid contact is luck.
26. PreservedFish Posted: August 27, 2011 at 04:14 PM (#3910269)
If all streaks are statistical aberrations, then players can never be playing well or poorly, which is ridiculous.

If all streaks are indistinguishable from statistical aberrations, then a player may be playing well or poorly, but it's pointless to use their results to determine that.

This seems to be absolutely no difference at all. It's just a semantics game designed to make us feel more comfortable with the idea that streaks are statistical aberrations. You're just talking about labels, and not practical consequences.
27. PreservedFish Posted: August 27, 2011 at 04:28 PM (#3910275)
Do you really want the poor shooter with the "hot hand" to take the potential game-winning shot?

No, I don't. I think this was the side that I am arguing.

The results of all that hot-and-cold play can be beautifully mimicked by a Strat card, but the players themselves are far from Strat cards. It seems to me there's lots of practical significance to that if you are managing, coaching, or just following ballgames intently

If streaks are absolutely indistinguishable from aberrations, then I still don't see any "practical significance."

I'd love to be wrong on this issue, by the way.
28.  Posted: August 27, 2011 at 04:35 PM (#3910279)
This seems to be absolutely no difference at all. It's just a semantics game designed to make us feel more comfortable with the idea that streaks are statistical aberrations. You're just talking about labels, and not practical consequences

From an analysis point of view there is no difference, but from a managerial/coaching point of view there is a huge difference.
29. Steve Treder Posted: August 27, 2011 at 04:40 PM (#3910280)
The results of all that hot-and-cold play can be beautifully mimicked by a Strat card, but the players themselves are far from Strat cards.

This.
30. PreservedFish Posted: August 27, 2011 at 04:47 PM (#3910288)
28 and 29 - you still have not explained anything. That's about four of you that have said "there is a difference" without being able to articulate what the difference is.
31. Steve Treder Posted: August 27, 2011 at 04:52 PM (#3910292)
That's about four of you that have said "there is a difference" without being able to articulate what the difference is.

Have you ever played a sport yourself? If so, then you know that the idea that anyone is always capable of performing athletically at the exact same level of competence, every day, is ridiculous.
32.  Posted: August 27, 2011 at 04:55 PM (#3910295)
28 and 29 - you still have not explained anything. That's about four of you that have said "there is a difference" without being able to articulate what the difference is.

The difference is from a coaching point of view: if you have a hot player who, based upon the eyes, is playing real well, then you keep him in the lineup if it's appropriate. If the guy is getting a lot of lucky bounces, then you go by historical norms. What part aren't you getting?

A coach/scout should be able to tell when a player is hot/playing well, just like they can tell when they are scuffling.

Streaks are distinguishable from aberrations with a good honest scouting report. And I wouldn't be at all surprised that a good bit of data mining of Pitchfx data could also be used for gauging when a player is on a hot streak.

Of course every streak ends, so no matter what there is going to be a time when the prediction is going to fail, but to deny hot streaks exist is ludicrous.
33. Steve Treder Posted: August 27, 2011 at 04:59 PM (#3910297)
to deny hot streaks exist is ludicrous.

Entirely.
34.  Posted: August 27, 2011 at 05:06 PM (#3910304)
The difference is one of perspective: inside the game vs. out. Take betting on horse races, for instance. You can do it from home without ever seeing a horse; analysis of the data will suggest your best strategies, and that data could be generated by a computer without any horse ever existing; the simulations would mimic actual racing pretty well.

But every actual race being run involves equine health, track conditions, jockeys' decisions, training patterns, freak occurrences in the paddock or the gate, all kinds of stuff that happen to real people and horses in real time and have to be contended with. No trainer says "well, the success of this mare is in the long run going to obey statistical models anyway, so I'm just going to ignore the immediate conditions and do whatever I always do before her next race."
35. PreservedFish Posted: August 27, 2011 at 05:19 PM (#3910311)
Have you ever played a sport yourself? If so, then you know that the idea that anyone is always capable of performing athletically at the exact same level of competence, every day, is ridiculous.

I agree. I don't like the conclusion that I'm arguing for.

Streaks are distinguishable from aberrations with a good honest scouting report. And I wouldn't be at all surprised that a good bit of data mining of Pitchfx data could also be used for gauging when a player is on a hot streak.

Both of these may be absolutely true, but they're pure conjecture.

But every actual race being run involves equine health, track conditions, jockeys' decisions, training patterns, freak occurrences in the paddock or the gate, all kinds of stuff that happen to real people and horses in real time and have to be contended with.

You're not really describing streaks. Juggling my rotation to start my lefty in Yankee Stadium doesn't mean I'm following the "hot hand."
36.  Posted: August 27, 2011 at 05:22 PM (#3910313)
Both of these may be absolutely true, but they're pure conjecture.

I'm trying to figure out your point?

Streaks exist, statistical anomalies exist, from a past looking viewpoint there is no noticeable difference between them. I don't think anyone on here has said any different.
37. Steve Treder Posted: August 27, 2011 at 05:34 PM (#3910324)
I'm trying to figure out your point?

Me too.
38. PreservedFish Posted: August 27, 2011 at 06:29 PM (#3910363)
The point is that if one cannot ever discern whether a good run of hitting has been a legitimately earned hot streak or a mere statistical aberration, then there is no practical difference between them. They tell us nothing* about who should be in the lineup tomorrow. So saying that "streaks exist" is making a meaningless distinction. It's just applying a familiar label to a totally inscrutable phenomenon so that we might feel more comfortable about it.

(I understand cardsfanboy's distinction between lucky hot streaks, and legitimately earned hot streaks, and it makes sense, but it's still conjecture.)

Again, I haven't seen any updated research on the "do streaks exist" issue since probably the 90s. I know there's a chapter in The Book about it. Maybe I'm dead wrong on current saberthinking about this issue.

*Aside from the extent to which it moves the career or big-sample numbers.
39. PreservedFish Posted: August 27, 2011 at 06:37 PM (#3910372)
Take a look at post #5 to understand what I'm arguing against. LionoftheSenate basically says:

1. Streaks exist, it's a fact.
2. We cannot recognize them in any way, and they are totally valueless for decision making.

In my opinion points 1 and 3 have no meaningful content. Point 2 is one of those scary non-intuitive conclusions that flies in the face of everything we think we know. Points 1 and 3 are set up as if they will refute 2 in some way, but they don't at all.
40. Bob Evans Posted: August 27, 2011 at 08:38 PM (#3910442)
PF, I think I get you now and I can't help you gainsay the position; I agree with it. However, I think LotS was saying something a little different than your #2 and more radical in #1 and #3. (At the risk of putting words in his mouth, here goes.) For #2, he was saying this temperature gauge doesn't help determine streaks; that's all. For #1 and #3, he was saying that players definitely get "locked in". I'll be damned if it can be proven, which is another way of saying you *can't* be definite about that, and I think that's your point. Correct me if you want. 8-)
41.  Posted: August 27, 2011 at 09:57 PM (#3910490)
You're not really describing streaks

Well, I was being a bit oblique by moving to the racing analogy. Horses are among the streakiest of sport performers. A trainer (for that matter, a good handicapper) needs to know what a streak portends for the future. The need is magnified by the small sample size of the past performances.

I think that baseball managers have to know these things too. They deal constantly with SSS among untried performers. They're not going to get unlimited simulation runs to make the right decision. You and I can sit here and say that some prospect's MLE inevitably foretells a normally-curved career hitting .270 over 4,000 PAs, and no two weeks of it will drag it far from its true center, but a manager truly does not know that: his prospect might be getting better or worse than the normal projection for reasons that coaches can enhance or correct, and the current streak might be partial evidence of that.
42. Voros McCracken of Pinkus Posted: August 27, 2011 at 10:43 PM (#3910524)
Probability theory is a means by which we model uncertainty.

That's it, the whole thing in a nutshell. Whether there is or isn't any such thing as pure "random chance" is irrelevant to the study. What is relevant is our own lack of knowledge and how to best deal with this knowledge to not act in complete ignorance. To say "a hot streak is no different than what we would find from random chance" presupposes that they are two different things when they very well might not be. If it's about predicting the outcome of a future at bat (or shot or whatever) based on how hot a player has been, then that's a topic that can and has been evaluated statistically. But that doesn't change whether hot streaks exist, it just changes how we interpret them.

More to the point, there's nothing different between a dice roll and a major league player stepping up to the plate for an at bat. Both contain a level of uncertainty as to the outcome, and so probability theory is applicable to both. That one involves a human swinging a bat isn't any more relevant than that one involves a human rolling the dice. Where the differences come is how we assess the effects of the things we do know in terms of modeling that uncertainty. A die roll may just involve whether we've checked the die to make sure it's not loaded and to make sure the table is level. There are plenty of other things that affect the outcome of that die that aren't at all random, but our lack of ability to process that information makes it essentially random when we model it. A hitter's success or failure has many more factors that can be accounted for (park, weather, pitcher, fielders, umpires, injury and fatigue, etc.). But there's also a whole host of things we couldn't hope to accurately process and in the same way, just because these things aren't specifically random, doesn't mean using random chance isn't an effective way to model it. Saying human beings aren't strat-o-matic cards is obvious, but it doesn't say anything about the possibility that a certain subset of their abilities might be accurately modeled by them.

There's nothing noble about saying "I don't know" and then punting on the work of trying to put a number on the chances of a particular outcome. If you're a manager or general manager you don't have a choice, you have to model your own uncertainty about things and make decisions based on that. You can't say "human beings aren't strat-o-matic cards" and be done with it; you have to be able to come up with a way to model those outcomes better than strat-o-matic cards or else your criticism has little meaning.
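
The kind of statistical check mentioned above (predicting the next at bat from how hot a player has been) can be sketched very simply: compare the overall success rate with the success rate in at bats that immediately follow a run of successes. This is only an illustrative sketch, with an invented function name and a plain 0/1 encoding of outcomes; it is not the method of any particular published study.

```python
def rate_after_streak(outcomes, streak_len):
    """Compare the overall rate with the rate right after a streak.

    `outcomes` is a chronological sequence of 0/1 results (1 = success).
    Returns (overall_rate, rate_after_streak); the second value is None
    when no run of `streak_len` straight successes is followed by a trial.
    """
    overall = sum(outcomes) / len(outcomes)
    followups = [
        outcomes[i]
        for i in range(streak_len, len(outcomes))
        if all(outcomes[i - streak_len:i])
    ]
    after = sum(followups) / len(followups) if followups else None
    return overall, after
```

If being hot carried predictive value, the second number would tend to sit above the first over a large enough sample. With short sequences the comparison is very noisy, so a sketch like this is a starting point rather than a verdict.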
43. Voros McCracken of Pinkus Posted: August 27, 2011 at 10:51 PM (#3910529)
From a past looking viewpoint there is no noticeable difference between them.

My philosophical view is that there might not actually _be_ any difference. A statistical anomaly is an abstraction and to the extent that abstraction is being used to model the performance of a human in a sporting event, I think the two not only just look equivalent, they _are_ equivalent.
44.  Posted: August 27, 2011 at 11:44 PM (#3910565)
It seems to me that one way we could test for the existence of "heat" in a certain caliber of player would be them striking out less, walking more or something else defense independent.

I think BABIP variation causes a lot of what people talk about as "hot streaks", but there is a real amount of "seeing the ball well" that isn't accounted for in strict results.

Maybe a lack of popping up? Or more line drives?
45. David Nieporent (now, with children) Posted: August 28, 2011 at 12:20 AM (#3910585)
That said, you are a slave to data if you think players don't actually get locked in from time to time. It happens.
Facts are meaningless, you can use facts to prove anything that's remotely true! Facts, schmacts.
46. Ron J Posted: August 28, 2011 at 01:02 AM (#3910609)
#18 Exactly. You get almost precisely the same range of results (hot/cold streaks) playing Stratomatic.

There are cases when somebody's got screwed up mechanics, but it's far more likely that we're seeing significance in randomness.
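
A quick way to see this without a Strat-O-Matic set is to simulate a hitter whose hit probability never changes and look at the streaks that fall out anyway. The sketch below is an assumption-laden toy (independent at-bats, one fixed average, invented function names), not a model of any real player.

```python
import random


def longest_hitless_run(outcomes):
    """Length of the longest run of consecutive 0s (outs) in a 0/1 sequence."""
    longest = current = 0
    for hit in outcomes:
        current = 0 if hit else current + 1
        longest = max(longest, current)
    return longest


def simulate_season(average, at_bats, rng):
    """Simulate at-bats for a hitter whose hit probability never changes."""
    return [1 if rng.random() < average else 0 for _ in range(at_bats)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    season = simulate_season(0.270, 500, rng)
    print("longest hitless run:", longest_hitless_run(season))
```

Running it with different seeds gives seasons with visible slumps and hot stretches even though the simulated hitter is, by construction, exactly the same on every at-bat.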

And Sam, the synthetic vaginas are nothing more than a bonus.
47. Ron J Posted: August 28, 2011 at 01:09 AM (#3910611)
#27 It seems plausible that you could use hot streaks for baseline mechanics. While there's no guarantee that you're doing everything right, it's likely that you're not doing things wrong in a way that's readily exploitable.

Equally plausible, you may be doing something exploitably wrong when you're cold.

Dunno. With the size of today's coaching staffs you'd think that this would be of minimal importance -- that they'd always be looking for this.
48. PreservedFish Posted: August 28, 2011 at 01:19 AM (#3910617)
You can't say "human beings aren't strat-o-matic cards" and be done with it; you have to be able to come up with a way to model those outcomes better than strat-o-matic cards or else your criticism has little meaning.

Thank you Voros. This is what I've been trying to express.
49. Steve Treder Posted: August 28, 2011 at 02:37 AM (#3910639)
You can't say "human beings aren't strat-o-matic cards" and be done with it; you have to be able to come up with a way to model those outcomes better than strat-o-matic cards or else your criticism has little meaning.

Why is it necessarily a "criticism"? Why isn't it just a statement of obvious and constant fact?

Athletes are human beings, vastly more complicated than the statistical models of their performance that we construct. Whether or not we can "do" anything with this truth doesn't render it any less true.
50. David Nieporent (now, with children) Posted: August 28, 2011 at 02:41 AM (#3910641)
The results of all that hot-and-cold play can be beautifully mimicked by a Strat card, but the players themselves are far from Strat cards. It seems to me there's lots of practical significance to that if you are managing, coaching, or just following ballgames intently
The problem with this argument is that if the second part is true -- if streaks do have significance that can be seen by scouts/firsthand observation, then you shouldn't be able to "mimic them with Strat cards." The position "Streaks are real, even if they don't show up in the statistics" is an Andyesque attempt to have it both ways. If they're real, then they should show up in the statistics.
51. Voros McCracken of Pinkus Posted: August 28, 2011 at 03:14 AM (#3910645)
Athletes are human beings, vastly more complicated than the statistical models of their performance that we construct. Whether or not we can "do" anything with this truth doesn't render it any less true.

But it's an obvious truth that everyone understands (except maybe the seriously mentally ill), and the only argument you're going to get is when you try and take it further than that. That they're human beings shouldn't have any effect on how we perceive the accuracy of the model. How well the model performs should be judged on its own merits.
52. Rufio_Magillicutty Posted: August 28, 2011 at 04:00 AM (#3910652)
Are we to the point where the media and fan-interaction is so invasive that the very creation of a stat changes the entire nature of player performance to the point where the stat then becomes a valid analytical mechanism to assess player performance?

tl;dr:

stat change player
53. GGIAS (aka Poster Nutbag) Posted: August 28, 2011 at 05:06 AM (#3910671)
My 2 cents (for the zero people that care), is that #42 appears to be as close to dead on about this as can be. Again, just my opinion. The first sentence and the paragraph that follows, spot on.
54. Something Other Posted: August 28, 2011 at 11:20 PM (#3911091)
the sorts of people who steal semi trucks in order to smash and grab synthetic vaginas

you make it sound like that's a BAD thing
I don't say this cynically, but a good part of love IS friction. Even so, I have no idea at all to what Sam H. is referring.
55. there isn't anything to do in buffalo but 57i66135 Posted: August 28, 2011 at 11:50 PM (#3911111)

The point is that if one cannot ever discern whether a good run of hitting has been a legitimately earned hot streak or a mere statistical aberration, then there is no practical difference between them.

i'm coming into this late, but i don't think this statistic is attempting to predict whether a hot/cold streak is likely to continue to the next day. i think its purpose is just to say that the streak has happened. any further inference into the number's meaning would seem to be folly, either on james' part or, more likely, on the part of everyone who's participated in this thread.
56.  Posted: August 28, 2011 at 11:59 PM (#3911118)
if streaks do have significance that can be seen by scouts/firsthand observation, then you shouldn't be able to "mimic them with Strat cards."

I don't see how that follows. One can model economic behavior on computers, but people still live out the real-life decisions and drives that propel that economy. You can model animal population dynamics on computers, but predators and prey still show individual intersubjective behaviors that to them are literally life and death.

Voros's point in #42 is fine from the outside; if you start from the scientific assumption that your goal is to build a statistical model of behavior, you should leave the subjectivity of subjects out of it. But the subjectivity of the subjects can still drive that behavior. I suspect a lot of hot streaks are partly, even mostly, due to a player making an adjustment – getting a little better break on a pitch, or timing his swing at breaking pitches a little better – and then they end when the advance scouts or pitch-charters notice what's going on and counter-adjustments are made. From the inside, that's very important stuff. That constant adjustment and readjustment keeps everyone on their true talent level. And players, coaches, and managers talk about games in exactly that way: are they just deluded about the nature of sport? Are they constructing a priesthood of mumbo-jumbo to hide their fear of Lucretian randomness? :)

The opposed positions in this thread represent different perspectives, and are quite compatible, it seems to me, if you account for the different assumptions based in the perspective.
57. Tom Nawrocki Posted: August 29, 2011 at 01:35 AM (#3911166)
i'm coming into this late, but i don't think this statistic is attempting to predict whether a hot/cold streak is likely to continue to the next day. i think its purpose is just to say that the streak has happened. any further inference into the number's meaning would seem to be folly, either on james' part or, more likely, on the part of everyone who's participated in this thread.

I like to look at this stat for two things: One, some schmo I've barely heard of will creep into the top five, and that's pretty cool. Rockies scrub Charlie Blackmon, who really shouldn't be in the major leagues, was briefly fourth in the majors with a 102-degree temperature.

The other thing is, someone will get REALLY hot, and this gauge will point that out for you. A week or so ago, Aramis Ramirez was at 120 degrees, which I think is the highest I've ever seen. Looking at bb-ref, I see that Aramis had a 12-game streak where he hit .563/.582/.875.

It never occurred to me that I should assume either type of these hot streaks would continue.
58. CrosbyBird Posted: August 29, 2011 at 02:36 AM (#3911196)
I don't say this cynically, but a good part of love IS friction.

Don't use so much lube next time.
59.  Posted: August 29, 2011 at 02:43 AM (#3911202)
Voros's point in #42 is fine from the outside

or as I like to say - just because something could have happened by chance doesn't mean that it did.

-- MWE
60. Tom T Posted: August 29, 2011 at 02:59 AM (#3911208)
Saying human beings aren't strat-o-matic cards is obvious, but it doesn't say anything about the possibility that a certain subset of their abilities might be accurately modeled by them.

This is a problem with the argument --- it expressly acknowledges that there are factors that we may not have modeled accurately, but fails to clarify that we have thus opted for consistency in our models, rather than the power to fully represent the world (i.e., by being consistent, the representation cannot be complete).

Now, it is true that in MOST cases we genuinely can't distinguish between a statistical aberration and a player being "hot" or "cold," but it would require some serious stretches (IMO) to say that this is ALWAYS true.

Let us consider this season for Derek Jeter. The "there is no difference" crowd would argue that prior to going on the DL, Jeter simply was having a long sequence of poor outcomes, in spite of the low probability of such a sequence occurring, and that after coming off the DL he has had a (not-quite-as-long) sequence of positive outcomes --- but it should be noted that this latter sequence is (based on his career norms) not as unlikely to have occurred. Therefore, we can either say that "he had bad luck, now he's having good luck" or we can recognize that we have additional information that is NOT WELL-MODELED and conjecture that "he was injured" and further conjecture that this injury impaired his ability to perform at his true talent level. So, I would argue that because we have additional information, we CAN consider his pre-DL performance to be something other than a mere statistical aberration. But if we did NOT have this information, "statistical aberration" (however unlikely) WOULD be the "most likely" (if not exactly "most satisfying") answer.

However, given that his post-DL performance falls approximately within the range of normal variation (based on his career performance), neither the "the fact that he's human doesn't matter" nor the "we just don't model it well" camp can claim any sort of victory --- this "good" chunk could be (and likely is) random variation.

From the above and other posts, it is apparent that many people are (internally, at least) adopting the "best" (perhaps "most accepted" is the better term, here) statistical solution of defining a confidence interval that is used as the basis to say "yeah, that was extremely unlikely." Having such a flag available can serve as a jumping-off point to examine other factors that are not included in the model, and subsequently to decide whether additional terms might be appropriate to incorporate.
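To make the "flag" idea concrete, here is a minimal sketch in Python. It assumes every at-bat is an independent trial with a fixed hit probability equal to the player's career average, which is exactly the strat-o-matic simplification under discussion. The function names and the 5% threshold are my own choices for illustration, not anything from James's gauge.

```python
from math import comb


def binom_tail_low(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def flag_stretch(hits, at_bats, career_avg, alpha=0.05):
    """Flag a stretch as 'cold', 'hot', or 'within normal variation'.

    Two-sided check: a stretch is flagged when the chance of a result at
    least this extreme, under the career-average model, is below alpha / 2.
    """
    # Probability of this few hits or fewer.
    p_low = binom_tail_low(hits, at_bats, career_avg)
    # Probability of this many hits or more.
    p_high = 1.0 if hits == 0 else 1.0 - binom_tail_low(hits - 1, at_bats, career_avg)
    if p_low < alpha / 2:
        return "cold"
    if p_high < alpha / 2:
        return "hot"
    return "within normal variation"


# Hypothetical numbers for a career .300 hitter over 100 at-bats.
print(flag_stretch(30, 100, 0.300))
print(flag_stretch(15, 100, 0.300))
print(flag_stretch(45, 100, 0.300))
```

A flag of "cold" or "hot" says only that the stretch was unlikely under the simple model. It does not say why, which is where the extra information (an injury, say) has to come in.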

As an example let me take something from our football concussion study. We observed appreciable variation in fMRI brain activation of players during the football season, relative to measurements obtained prior to the season. However, this could simply have represented test-retest random variation. Because (as an engineer with a background in operations research) I am almost always in the "we just don't model it well" camp, we looked at two other factors: (a) obviously the players were in a different "environment" during the season than prior to the commencement of practice, and (b) additional, non-imaging-related neurocognitive tests given to the players before and during the season *also* indicated that the players were no longer at "baseline" performance levels. Subsequent analysis of individuals NOT undergoing repeated head trauma indicates that, as expected based on (a) and (b), the 95% confidence interval for our test-retest activation difference is notably smaller than the range of variation observed in our players. This effectively confirms that the prior model of "no additional factors" was inadequate to explain our participants, and that we needed to include additional data to more accurately measure their "performance." (The fact that incorporating the hit count...and location...into the model ALSO improves the ability to predict the retest fMRI activation is then a further confirmation of the insufficiency of the original model.)
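The comparison described above can be sketched roughly as follows, again in Python. All of the numbers are invented for illustration and are not data from the study. The interval is a plain normal approximation (mean plus or minus 1.96 standard deviations of the control group's test-retest differences), which is an assumption on my part and not necessarily the method we actually used.

```python
from statistics import mean, stdev


def retest_interval(control_diffs, z=1.96):
    """Approximate 95% interval for test-retest differences in controls."""
    m = mean(control_diffs)
    s = stdev(control_diffs)  # sample standard deviation
    return (m - z * s, m + z * s)


def outside_interval(diffs, interval):
    """Return the differences that fall outside the given interval."""
    lo, hi = interval
    return [d for d in diffs if d < lo or d > hi]


# Hypothetical test-retest differences (arbitrary units).
control_diffs = [-0.2, 0.1, 0.0, 0.3, -0.1, 0.2, -0.3, 0.0]
player_diffs = [0.1, 0.9, -0.7, 0.2]

interval = retest_interval(control_diffs)
print(interval)
print(outside_interval(player_diffs, interval))
```

If a sizable share of the player differences land outside the control interval, "test-retest noise" stops being a sufficient explanation, and that is the cue to add terms to the model.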

Unfortunately, as external observers in baseball analysis, we don't have access to reliable secondary factors/assessments. And those who DO have additional information (managers and coaches) are almost certain to find that even with this information, any model they internally construct is woefully underdetermined. As a consequence, their decisions --- while potentially made in a rational manner, based on "knowns" --- are not, in the end, that much better than those that would be made looking solely at a strat-o-matic card. Thus the (unsatisfying) validity of the statement that we can't really distinguish between statistical aberrations and "hot" or "cold" streaks, even though we have strong reason to believe the latter DO exist (but are not predictable).

