Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Sunday, November 19, 2017

More on WAR – Joe Blogs – Medium

Bill is off base here. Bill seems to want to use WAR to answer questions that WAR is not really suited to answer. He’s not alone, of course. Other people use WAR to answer the MVP question all the time. The reason it’s done is, there really isn’t a tool out there designed to specifically answer the MVP question. Bill tried to do it with Win Shares. Unfortunately the adjustment methods he chose were too broad in nature.

In any event we don’t have to throw out WAR. It’s really useful for answering a lot of questions. I heartily agree with a Tangotiger suggestion:

I have always thought the best way to design a WAR would be to break it down into separate elements, which we later combine in the most appropriate way to best answer specific questions. The breakdown, IMO, should be: 1) offense, 2) defense (further broken down into components), 3) baserunning, 4) positional adjustment, and 5) context adjustment. (They should all also be presented with the related rate stat to help people answer other specific questions.)

Anyway, by introducing a timing/context adjustment as Tangotiger suggested, the value of the current WAR systems would increase. Our current data sets are much better than they were twenty years ago. We can now provide individual contexts, and need not rely on team ratios as Win Shares did. We should do it.

Unfortunately, though, the additions will generate more confusion as many people will still want to use one number to answer all questions.

“But because that is true, I ASSUMED that these were complex, nuanced, sophisticated systems. I never really looked; I just assumed that the details were out of my depth. But sometime in the last year I was doing some research that relied on these WAR systems, so I took a look at them, and … they’re not very impressive. They’re not well thought through; they haven’t made a convincing effort to address many of the inherent difficulties that the undertaking presents. They tend to get so far into the data, throw up their arms and make a wild guess. I don’t know if I’m going to get the time to do better of it, or if it will be left to others, but … we’re not at anything like an end point here. I assumed that these systems were a lot better than they actually are.”

Jim Furtado Posted: November 19, 2017 at 07:28 AM | 208 comment(s)
  Tags: bill james, sabermetrics, war

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   101. BDC Posted: November 20, 2017 at 04:37 PM (#5578962)
Golfers routinely admit it (i.e., being nervous under pressure, with actual performance impact)

Except that a PGA Tour golfer caving under pressure means shooting 81 from the championship tees, just as a pitcher caving under pressure means getting a 90-MPH fastball up in the zone to an All-Star.

I'm just restating what snapper said about repeatable clutch ability being selected for early on. In high-level pro sports, sometimes a clutch player beats you and sometimes you beat them, because you're all already clutch players. Next night/week/season, the situation often reverses itself.
   102. PreservedFish Posted: November 20, 2017 at 04:41 PM (#5578964)
But the data has consistently shown that there aren't. There's never been any finding of repeatability in year to year clutch and un-clutch performance.

The one exception would be the Steve Blass-disease sufferers. That could be called choking.


That's a significant exception, and I'm not aware of any human attribute that doesn't exist on a spectrum. Choking can't be just off/on, either extreme or totally non-existent.

I'm not familiar enough with the statistics to give you a better argument than this. It's been decades since I read the studies on clutch, and I'm not familiar with newer ones than what existed pre-Primer.
   103. Blanks for Nothing, Larvell Posted: November 20, 2017 at 04:43 PM (#5578966)
But the data has consistently shown that there aren't.


Assuming that's true, and Bill James doesn't -- it doesn't matter. We only turn to second-order inferences from data when we don't have first-order direct evidence. Here, we do.

The one exception would be the Steve Blass-disease sufferers. That could be called choking.


Yes, and once we concede that -- as it has to be conceded -- the answer to the question is clear. What possible reason would there be to conclude the psychological impact on performance is binary, with the two settings being "no impact at all" and "catastrophic impact"? That can't be, and isn't, correct. Instead, as with virtually every similar phenomenon, there's a continuum of impact between catastrophic and zero.
   104. fra paolo Posted: November 20, 2017 at 04:45 PM (#5578967)
62: Players cannot control the situations they are placed into. They can control their results in those situations.

Players can influence the outcomes, but I don't feel comfortable asserting that they control them.

But little (I really want to say 'nothing', but I wish to leave myself some weasel room) in the nature of baseball occurs outside of a context, and in that sense I think Cameron (using him as the representative for the General Trends in Modern Sabermetrics) is chasing a chimæra. And that chimæra is a product of his own ideological and psychological context.

When I was writing post 62 I had in one draft a concluding sentence that ended with the question 'Why should we want to strip out context?'. It wasn't that I wanted an answer to that question, but to point out that the existence of the question suggests something about the mentality of what has been going on in sabermetrics since the early 1990s.

Cameron obliquely answers the question in that article. In the end he shrugs his shoulders and says 'You have to put all the context in and it's too hard and won't answer any question I'm interested in answering.'

And we see the same ideological emphasis on tidying-up loose ends that is displayed in pursuing the context-neutral, but here in reverse: 'Some of the context isn't good enough, we need to have all of it.'
   105. PreservedFish Posted: November 20, 2017 at 04:46 PM (#5578969)
Me and SBB, two brains united with a common purpose.
   106. Blanks for Nothing, Larvell Posted: November 20, 2017 at 04:46 PM (#5578970)
Except that a PGA Tour golfer caving under pressure means shooting 81 from the championship tees,


??

You're describing a complete meltdown. Those certainly have happened, but typically it's more like things like your tempo deserting you a couple times in a round, or missing a couple easy five-foot putts you'd usually make.

The culture is different in golf, and so elite players routinely confess to these things happening because of psychology and its impact on performance. There's no reason to try to infer it from putting statistics or the aggregate seasonal data on 175 yard 7-irons.

Same thing assuredly happens in baseball, a very similar physiological function to golf. There's literally no question about it.
   107. Sunday silence Posted: November 20, 2017 at 04:48 PM (#5578971)
This has been a really useful discussion. I've read James's follow-up, Kiko's article, and the one on FanGraphs that is sort of a rebuttal/defense of WAR. They are all very clear in responding to most of my questions. I want to throw one more player name into the discussion.

JD Martinez. He went on a major hot streak after he arrived in AZ. How much did his bat help to win them more games? Especially interested in win expectancy and that pennant impact tool that someone had. Tx
   108. Zach Posted: November 20, 2017 at 04:53 PM (#5578974)
I'm seeing some people complain about James in this thread, but this has always been his method: he thinks very hard about the imponderables, then comes up with a system that straightforwardly implements his answers.

He's gotten an awful lot of mileage out of his approach. Think about the Keltner list, the black ink test, the gray ink test, Pythagorean win-loss, Fibonacci win-loss, Hall of Fame standards... All of those are mathematically very simple, and philosophically very deep. Thinking about these things is what James does.
   109. BDC Posted: November 20, 2017 at 05:03 PM (#5578981)
That's a complete meltdown. Those certainly have happened, but typically it's more like things like your tempo deserting you a couple times in a round, or missing a couple easy five-foot putts you'd usually make

No, that's a fine round of golf which is a big failure in context. It also happens every week to several Tour golfers.

Smaller things like you're talking about are perhaps psychologically perceptible to introspection, but they don't matter in the big picture. So today you finish three strokes behind Golfer A and tomorrow his tempo deserts him a couple of times and he finishes three strokes ahead of you. That's life in high-level competition. It does not show up as "choking" in the results.

Today my tempo deserts me and I make a bad pitch which you fan on. Tomorrow I make a brilliant pitch and you are waiting for it and hammer it into the seats. And every other permutation over a season, little failures and successes balancing other little failures and successes, but also being intertwined with sheer physical failures and successes. Unless you can't handle it at all anymore à la Blass or Ankiel, you are probably psychologically overmatched some days and not others, just as you're physically overmatched some days and not others, not even counting lucky breaks.

It may be that more guys wash out of the majors (or off the Tour) for psychological reasons than we know about; fair enough. But of those who succeed and are interesting to compare as ML regulars or in MVP arguments, they're all succeeding sometimes and failing sometimes, in ways rarely perceptible as clutch or choke play.

Clayton Kershaw just sucked in a World Series that his team lost because he sucked, and many people were opining knowingly that he is a sucking choker. While sucking greatly in one appearance, though, he also threw 11 innings of 5-hit, 1-run ball against the World Champions in the other two. That's life at the top.


   110. PreservedFish Posted: November 20, 2017 at 05:05 PM (#5578986)
BDC, that just doesn't make sense. Why can't you have one guy that chokes once in 100 PAs, and another guy that chokes 5 times in 100 PAs? There's absolutely no reason to assume that it all evens out, with the exception of the Blasses. We just can't separate a choking strikeout from any other, and there's no way of designing a study to do so because we don't know when people do or don't feel pressure. Bryce Harper might choke whenever his dad is in the stands. We don't know.

What I agree is that it's essentially impossible to determine who are the chokers, and who aren't. It's almost a subject not worth talking about. But that doesn't mean we should abandon our reason and understanding of humanity when we do discuss it.
   111. Blanks for Nothing, Larvell Posted: November 20, 2017 at 05:08 PM (#5578991)
Unless you can't handle it at all anymore à la Blass or Ankiel, you are probably psychologically overmatched some days and not others, just as you're physically overmatched some days and not others, not even counting lucky breaks.


Then I'm not sure what the "argument" is even about, unless you're saying that the overmatched days balance out the in-the-zone days for every single player in major league baseball and always have. They obviously don't.
   112. Zach Posted: November 20, 2017 at 05:09 PM (#5578993)
#69 -- The problem with WAR, which I think James hints at, but doesn't come right out and say (Posnanski is more direct in the article linked here) is that it's derived from runs, not derived from wins.

#70 -- This is a complaint about branding.

Call it 'RAR' and don't translate runs into wins and the complaint evaporates with zero information being lost. 'Wins above replacement' is easier to grasp, and produces a number that's easier to get your hands around. It's not integral to any part of the analytic work.


This is the heart of the debate, and I disagree with #70. The difference between a run denominated stat and a win denominated stat is the model relating runs to wins. WAR uses a dead simple model -- 10 marginal runs equals 1 win. Which is historically about right. But James sees a win as something independently measured -- in his view, a team can be bad at converting runs to wins, which is something that has no meaning in the WAR view of the world.

James thinks that wins should be apportioned such that the sum total of wins for a team is the number of games they won. WAR thinks the sum total should be the number of games they would be *expected* to win -- a run is a run is a run, and a win is just a block of 10 runs. It's a philosophical difference, and calling it "branding" is just choosing one side rather than the other.
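
A minimal sketch of the distinction drawn in this post, using made-up team totals. The 10-runs-per-win constant is the rule of thumb quoted above; the function names and the sample team are assumptions for illustration, not part of any published WAR implementation.

```python
RUNS_PER_WIN = 10.0  # rule of thumb quoted in the post; an approximation

def expected_wins(runs_scored, runs_allowed, games):
    """Wins implied by run differential: a .500 baseline plus 1 win per 10 marginal runs."""
    return games / 2 + (runs_scored - runs_allowed) / RUNS_PER_WIN

def conversion_gap(actual_wins, runs_scored, runs_allowed, games):
    """Actual wins minus run-implied wins; negative means the team under-converted its runs."""
    return actual_wins - expected_wins(runs_scored, runs_allowed, games)

# Hypothetical team: 800 runs scored, 700 allowed, 162 games, 86 actual wins.
implied = expected_wins(800, 700, 162)       # 81 + 100/10 = 91.0
gap = conversion_gap(86, 800, 700, 162)      # 86 - 91 = -5.0
print(implied, gap)
```

In this toy case a run-based system would hand out credit adding up to roughly 91 wins, while the view attributed here to James would want individual credit to add up to the 86 games actually won. The five-win gap is exactly what the two sides treat differently.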
   113. Blanks for Nothing, Larvell Posted: November 20, 2017 at 05:12 PM (#5578999)
It's not really even "choking." It's just getting a little bit tighter, which has a disproportionate impact on how your body functions which is especially important given how hard it is to hit major league pitching.

When people throw around terms like "choking" it distorts debate. If we're getting to the point where people don't want to study things because they're afraid the results will make other people use terms like "choker," at that point we're getting into the exogenous politics and ideology that I've asserted are underlying a lot of this.
   114. Rob_Wood Posted: November 20, 2017 at 05:14 PM (#5579002)
What do you think Bill is saying? To me, his main contention is that wins have to "add up" at the team level. Do you read it differently?


My understanding is that Bill James is saying that he doesn't like WAR (and certainly does not think WAR should be used in MVP discussions) since it is context-neutral and does not "add up" to team wins.

People on the other side of this debate are defending WAR as being a top-of-the-line seasonal-based context-neutral stat. And saying that being context-neutral was a feature of WAR (not a bug).

People point out that you cannot blame WAR for not adding up to team wins. No seasonal stat should be expected to do that, and forcing that is not necessarily a good thing at all. These people further point out that if you think "adding up" is vital, then you should bake that into the framework at the outset and use game-level data where "adding up" is natural and appropriate.

From what I understand, it seems to me that Bill James does not agree with either of these arguments.
   115. 6 - 4 - 3 Posted: November 20, 2017 at 05:16 PM (#5579005)
But the data has consistently shown that there aren't.

The clutch studies are more about a lack of statistical power in single year samples than anything else. As they say, absence of proof is not proof of absence.

In other words, we don't have evidence that clutch hitting exists. That's not to say that we have evidence that clutch hitting doesn't exist. It's a subtle--but very important--epistemological distinction. I don't know how we can address the question in a 162-game season absent pooling of consecutive seasons or something, which creates at least as many problems as it solves. The other thing is that batting situations are not independently and identically distributed (e.g., batters generally face better pitchers in close-and-late situations), which further complicates the statistical power issues.
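
To put a rough number on the statistical-power point, here is a back-of-the-envelope sketch. Every input is an assumption chosen for illustration: a .270 hitter, 100 clutch at-bats in a season, and a hypothetical true clutch boost of 20 points. It also treats at-bats as independent and identically distributed, which the post itself notes is not really true, so it ignores that complication.

```python
import math

def se_proportion(p, n):
    """Standard error of an observed rate over n independent trials with true rate p."""
    return math.sqrt(p * (1 - p) / n)

def effect_in_standard_errors(effect, p, n):
    """Size of a true effect expressed in standard errors of the observed rate."""
    return effect / se_proportion(p, n)

# Assumed inputs, for illustration only.
BASE_AVG = 0.270       # hypothetical hitter's true average
CLUTCH_AB = 100        # hypothetical clutch at-bats in one season
TRUE_BOOST = 0.020     # hypothetical real clutch effect (20 points)

se = se_proportion(BASE_AVG, CLUTCH_AB)
z = effect_in_standard_errors(TRUE_BOOST, BASE_AVG, CLUTCH_AB)
print(round(se, 3), round(z, 2))
```

Under these assumptions the sampling noise in one season of clutch at-bats is more than twice the size of the effect being looked for, which is why a single-season null result says little either way.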
   116. Blanks for Nothing, Larvell Posted: November 20, 2017 at 05:21 PM (#5579009)
Golfers routinely consult and even employ sports psychologists, and of course those people have written books and articles and the like to try to mainstream and monetize their ideas. The elite performers neither shy away from admitting psychology has impacted their physiology during competition, nor hesitate to openly seek professional help in improving psychology so as to improve physiology.

Do any MLB teams even employ a single psychologist? There obviously isn't remotely the analogue to golf psychology within baseball. The difference in the two cultures is actually quite interesting. Among the golf equivalent to BBTF, the idea that there isn't, amongst even the elite of the elite, definable poor and good performance under varying degrees of pressure and contextual importance would strike them as high-order cuckoo, bordering on rubber room material. There's literally no one who would argue that a ten-foot putt to win the Masters wouldn't impact someone more than a ten-footer on the 8th hole of the first round of the John Deere Classic -- much less an insistent and influential faction of the community, as there is in baseball.

Fascinating. Golf is, I guess, a little more ... precise ... than hitting major league pitching, but that can't begin to explain the difference.
   117. BDC Posted: November 20, 2017 at 05:22 PM (#5579010)
What I agree is that it's essentially impossible to determine who are the chokers, and who aren't. It's almost a subject not worth talking about. But that doesn't mean we should abandon our reason and understanding of humanity when we do discuss it

I don't think we disagree. It's just the difference between writing a baseball novel and trying to determine who the better players are in terms of winning ballgames :)

   118. Zach Posted: November 20, 2017 at 05:34 PM (#5579022)
From Cameron's article:

My primary guiding principle on the usefulness of a metric is twofold:

1. What question is it answering, and is the answer to that question interesting? If yes, proceed. If no, ignore.

2. Does the metric answer that question accurately?


I agree with this, and I think WAR does a good job of measuring runs contributed. I also think that runs contributed is a very interesting thing to measure.

But WAR became popular as a metric largely because it attempts to isolate just a player’s individual contribution to a team’s wins and losses, and once you start adding in some context, there’s no real reason to stop until you’re at WPA, which tells you that Melky Cabrera’s 98 wRC+ resulted in a better offensive season than Jose Ramirez’s 148 wRC+. And even then, WPA ignores the after-he-hit events, so even that isn’t really telling you the full story about the value of Melky’s offensive inputs as they relate to wins.

I disagree with this. I think WAR turned into people's go to stat because they want to talk about wins, and WAR is close enough to the real thing that you can gloss over the imperfections.
   119. TDF didn't lie, he just didn't remember Posted: November 20, 2017 at 05:39 PM (#5579024)
Sometimes in these discussions, I think people get so fixated on the trees that they miss the forest.

Just to take one part of WAR, look at defensive numbers. We know that single year defensive metrics are not very descriptive - that there isn't enough data from a single year to say that Joey Votto was really 11 runs better than the average 1b this season*. Once we remember that fact, it becomes evident that this argument (translating "individual runs" into "team wins") is going to be problematic**. But to make a point ("I don't think WAR does a good job of describing what I think it is/what I want it to"), we forget this very basic point.

*And one year stats say nothing at all about defensive ability.
**The second argument this affects is MVP voting. I really hate single year defensive stats in WAR because I think it overly muddles MVP discussions - again, Votto's +11 (which is way out of whack from both last season and his entire career) does a lot of lifting towards his 7.5 bWAR.
   120. Jay Z Posted: November 20, 2017 at 05:43 PM (#5579029)
I'm not very sympathetic to Bill's argument, but I don't think it has to lead to this conclusion. One can reasonably say about a game the Astros lost that Altuve contributed 0.2 wins, while his teammates "contributed" -0.7 wins (in WPA terms). Then the total player wins equals actual team wins, which is what Bill feels is important. After all, the game really was played, and some players did far more than others to create that loss, while ignoring all that data would be the same as assuming the game was never played. So, I think there is room to argue the sum of player values should equal actual team wins, without believing all production in losses should be ignored.

BUT, someone could certainly object that Altuve's one-fifth of a win cannot possibly be "real," or represent true "value," when the Astros actually recorded 0.0 wins that day. Indeed, most of what Bill wrote about WAR could be applied with equal force to a system like Win Shares that assigns value even in the context of losses. NONE of these systems can be said to perfectly measure "real" wins -- all are approximations. So it would be nice if James would stop describing as "errors" what are simply different judgments about how closely to connect value metrics to game outcome


I'd use a wins versus replacement value, which probably would have the winning team contribute 0.67 wins and the losing team contribute -0.33 wins, or something like that.

Anyway, no, it's not correct to zero out everyone who plays on the losing team. That requires seeing into the future. Events should be valued on what is known at the time. At the time the event occurs, the winner is not known. So any positive event may lead to a win, or negative to a loss, and they are rated accordingly.

Context-free component value has the opposite problem of pretending that we don't know the context of the event. And that components are commodities that can be traded, instead of events only of use in the game at hand.
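
The "adding up" property discussed in this post can be checked with the quoted Altuve example. This is only a sketch: the 0.2 and -0.7 figures come from the quoted text, while the dictionary keys and the helper name are placeholders invented for illustration.

```python
def team_wpa_total(player_wpa):
    """Sum of per-player win probability added for one team in one game."""
    return sum(player_wpa.values())

# A team starts a game at 0.5 win probability and finishes a loss at 0.0,
# so its players' WPA for that game must total -0.5.
losing_team = {
    "star_player": 0.2,      # figure from the quoted example
    "rest_of_roster": -0.7,  # figure from the quoted example
}

total = team_wpa_total(losing_team)
print(round(total, 2))
```

Because each event is valued by the win probability known at the time, a player can post a positive number in a loss and the team ledger still balances to the actual result.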
   121. snapper (history's 42nd greatest monster) Posted: November 20, 2017 at 05:44 PM (#5579030)
That's a significant exception, and I'm not aware of any human attribute that doesn't exist on a spectrum. Choking can't be just off/on, either extreme or totally non-existent.

I'm not familiar enough with the statistics to give you a better argument than this. It's been decades since I read the studies on clutch, and I'm not familiar with newer ones than what existed pre-Primer.


But they all involve throwing. There's never been a hitting example that I know of. I could believe that clutch pitching exists, though I've never seen it proven. But all the studies I've seen have shown that clutch hitting isn't a repeatable skill.
   122. snapper (history's 42nd greatest monster) Posted: November 20, 2017 at 05:49 PM (#5579035)
Assuming that's true, and Bill James doesn't -- it doesn't matter. We only turn to second-order inferences from data when we don't have first-order direct evidence. Here, we do.

It ain't what you don't know that gets you in trouble. It's what you know for sure that ain't so that does.

The clutch studies are more about a lack of statistical power in single year samples than anything else. As they say, absence of proof is not proof of absence.

In other words, we don't have evidence that clutch hitting exists. That's not to say that we have evidence that clutch hitting doesn't exist. It's a subtle--but very important--epistemological distinction. I don't know how we can address the question in a 162-game season absent pooling of consecutive seasons or something, which creates at least as many problems as it solves. The other thing is that batting situations are not independently and identically distributed (e.g., batters generally face better pitchers in close-and-late situations), which further complicates the statistical power issues.


If the impact of clutch isn't large enough to be seen in a season of results, it's not large enough to care about.

If clutch was a real & material skill, we would see the same guys outperforming and underperforming in "late and close", or other similar high leverage situations every year.
   123. Blanks for Nothing, Larvell Posted: November 20, 2017 at 05:55 PM (#5579042)
It ain't what you don't know that gets you in trouble. It's what you know for sure that ain't so that does.


There is overwhelming first-order evidence.

If the impact of clutch isn't large enough to be seen in a season of results, it's not large enough to care about.


I'm not sure what you mean by this. We can easily see the performance difference between, say, Judge and Altuve just from 2017 data -- and it's plenty large.

If clutch was a real & material skill, we would see the same guys outperforming and underperforming in "late and close", or other similar high leverage situations every year.


Not at all. Everyone everywhere would agree that hitting generally is a "real and material" skill -- yet performance in that area jumps all over the place.

Alan Trammell, OPS+, 1985 (age 27) to 1991 (age 33):

90
120
155
138
85
130
90

No one with a straight face could say they can infer Trammell's "true hitting talent" from that seasonal data.
   124. PreservedFish Posted: November 20, 2017 at 05:58 PM (#5579046)
But they all involve throwing. There's never been a hitting example that I know of.


Terrible throwing is more noticeable and more embarrassing than terrible hitting.
   125. shoewizard Posted: November 20, 2017 at 06:04 PM (#5579049)
No performance, no matter how good, produces value in a loss. That's silly.



No one's making that argument in the least, and it isn't shocking that James would block someone saying anyone was.


Can you please help me square your above comment when you said in the other thread:

34. Blanks for Nothing, Larvell Posted: November 20, 2017 at 09:33 AM (#5578687)
On an individual level, it's much more debatable than you or BJ wish to recognize.


Not really. The only point of Aaron Judge hitting a double for the New York Yankees is that it would help the New York Yankees win baseball games. Standing alone, it's a meaningless accomplishment.


   126. Blanks for Nothing, Larvell Posted: November 20, 2017 at 06:08 PM (#5579052)
Can you please help me square your above comment when you said in the other thread:


Sure -- the home runs Aaron Judge hit in the Home Run Derby last year are meaningless. The home runs he hit for the Yankees in Yankee games were not. The end goal of Judge's home runs in Yankee games was to help the NY Yankees win baseball games. He helped them win a bunch.

Nowhere did I ever say that a home run Judge hit in a Yankee loss was meaningless.
   127. shoewizard Posted: November 20, 2017 at 06:10 PM (#5579053)

Just to take one part of WAR, look at defensive numbers. We know that single year defensive metrics are not very descriptive - that there isn't enough data from a single year to say that Joey Votto was really 11 runs better than the average 1b this season*. Once we remember that fact, it becomes evident that this argument (translating "individual runs" into "team wins") is going to be problematic**. But to make a point ("I don't think WAR does a good job of describing what I think it is/what I want it to"), we forget this very basic point.

Doesn't this apply to Win Shares as well ? Whatever Bill is using for the defensive component of Win Shares has to apply equally, and therefore is equally problematic, no ?
   128. TDF didn't lie, he just didn't remember Posted: November 20, 2017 at 06:16 PM (#5579056)
Doesn't this apply to Win Shares as well ? Whatever Bill is using for the defensive component of Win Shares has to apply equally, and therefore is equally problematic, no ?
Absolutely. I didn't write that as a defense of any system, but as a condemnation of how all are used.
   129. snapper (history's 42nd greatest monster) Posted: November 20, 2017 at 06:30 PM (#5579065)
I'm not sure what you mean by this. We can easily see the performance difference between, say, Judge and Altuve just from 2017 data -- and it's plenty large.

But it's not consistent over time. It's just as likely that Judge will be more clutch than Altuve next year as the other way around.

Not at all. Everyone everywhere would agree that hitting generally is a "real and material" skill -- yet performance in that area jumps all over the place.

Variability isn't all that matters. It's predictability. Hitting is variable, but predictable. Guys who are good hitters this year will tend to be good hitters next year.

That's not true with clutch stats. A strong performance in the clutch this year doesn't predict one next.

Clutch is likely just random noise. i.e. not a skill, but just a residue of chance.
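
The variability-versus-predictability distinction can be illustrated with a small simulation. This is a toy model under stated assumptions, not a study of real players: each simulated player has a fixed true talent, each season's result is talent plus random noise, and all the parameters (2,000 players, the standard deviations, the seed) are arbitrary.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def year_to_year_corr(n_players, talent_sd, noise_sd, seed=0):
    """Correlation between two simulated seasons of (talent + noise)."""
    rng = random.Random(seed)
    talents = [rng.gauss(0, talent_sd) for _ in range(n_players)]
    year1 = [t + rng.gauss(0, noise_sd) for t in talents]
    year2 = [t + rng.gauss(0, noise_sd) for t in talents]
    return pearson(year1, year2)

# Real skill with equal-sized noise: results bounce around but still predict.
skill = year_to_year_corr(2000, talent_sd=1.0, noise_sd=1.0)
# No skill at all: one season tells you nothing about the next.
noise_only = year_to_year_corr(2000, talent_sd=0.0, noise_sd=1.0)
print(round(skill, 2), round(noise_only, 2))
```

In this model the expected correlation is talent variance divided by total variance, so a noisy but real skill still shows up year to year. The same formula also says that a small real skill buried in large noise produces a correlation close to zero, which is the statistical-power objection raised in #115.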
   130. Sunday silence Posted: November 20, 2017 at 06:48 PM (#5579073)
What I agree is that it's essentially impossible to determine who are the chokers, and who aren't. It's almost a subject not worth talking about


what about what happened with Ken Giles this year in the playoffs?

a) Isn't that fundamentally different than what we are talking about with Blass, Ankiel, Knoblauch, Charlie O'Brien (he would double clutch incessantly when returning the ball to the pitcher)?

b) you don't think that's measurable in Giles's case?


In contrast to some of the other comments, I think it's different because what was happening to Blass didn't seem to have any correlation to the context of the game. It was happening at random times and quite often; he was just throwing crazy wild pitches. Much like Ankiel.

For Blass it started early in the season, I think '73. I don't recall when Ankiel's thing started; was it in the playoffs? Same with Knoblauch: was it in the playoffs or just at any odd time? I know the thing with O'Brien seemed to be happening all year.

Those I see as probably neurological in origin. Whatever got them to the big leagues in the way of keeping composure, something on an unconscious level stopped keeping those hiccups in check. Much like blood pressure or heart rate is controlled automatically. What seemed to happen to Giles seems psychological; the actual knowledge of the stage he was on seemed to be impacting him.
   131. Kiko Sakata Posted: November 20, 2017 at 07:29 PM (#5579090)
when Ankiel's thing started; was it in the playoffs?


Yes. I can't swear it wasn't the end of the season, but in 2000, he had a perfectly cromulent regular season (ERA+ of 134, K/W of 2.16) and then he just fell apart in the playoffs (ERA of 15.75; K/W of 0.45).
   132. shoewizard Posted: November 20, 2017 at 07:36 PM (#5579094)
There were hints. He had 90 walks in 174 IP, and had a 7 walk game, and 4 games with 5 walks. He was also 5th in the league in WP. Not crazy, but enough indicators that control was a problem. Of course not like what happened in the postseason. But the warning signs were there, in retrospect of course.
   133. snapper (history's 42nd greatest monster) Posted: November 20, 2017 at 07:39 PM (#5579096)
what about what happened with Ken Giles this year in the playoffs?


Do you think he's likely to choke again next season? Is he done as an effective closer?
   134. Blanks for Nothing, Larvell Posted: November 20, 2017 at 07:41 PM (#5579099)
Hitting is variable, but predictable. Guys who are good hitters this year will tend to be good hitters next year.


Except it's not. I posted Alan Trammell's prime numbers and I'm sure there are a bunch of other examples -- particularly if we understand the confirmation bias inherent to only going by full seasons. Plenty of guys have big years and then suck so hard the next year that they aren't given a full season of at bats.

Clutch hitting is inherently more volatile year to year because of the smaller sample sizes. The best hitters in 2017 will not tend to be the best hitters in April 2018.
   135. BDC Posted: November 20, 2017 at 07:46 PM (#5579102)
what about what happened with Ken Giles this year in the playoffs?

It looks like choking, for sure. But it could have been some sort of injury he'd tried to play through too stoically. The thing is, we don't know, and we particularly don't know it as an attribute of his permanent or continuing make-up.

I could also ask what happened with Sam Dyson in relatively low-pressure regular-season games early this season, after he'd saved 38 games the year before and had a 1.93 ERA in five playoff appearances over the past two seasons. Did the momentousness of facing the Seattle Mariners in April suddenly dawn on him? :)
   136. Blanks for Nothing, Larvell Posted: November 20, 2017 at 07:47 PM (#5579104)
Do you think he's likely to choke again next season? Is he done as an effective closer?


Happens to a lot of guys. Calvin Schiraldi and Donnie Moore -- the 1986 versions of Ken Giles -- were never the same.
   137. Slivers of Maranville descends into chaos (SdeB) Posted: November 20, 2017 at 09:01 PM (#5579137)

If the impact of clutch isn't large enough to be seen in a season of results, it's not large enough to care about.


Not necessarily. From Bill James' Underestimating the Fog:

There are, in sabermetrics, a very wide range of things which have been labeled as "not real" or "not of any significance" because they cannot be measured as having any persistence. The first of these conclusions — and probably the most important — was Dick Cramer's conclusion in the 1977 Baseball Research Journal (SABR) that clutch hitting was not a reliable skill. Using the data from the "Player Win Averages" study by E.G. Mills and H.D. Mills of the 1969 and 1970 seasons, Cramer compared two things — the effectiveness of all hitters in general, and the impact of hitters on their team's won-lost record, as calculated by the Mills brothers. Those hitters who had more impact on their team's won-lost record than would be expected from their overall hitting ability were clutch hitters. Those who had less impact than expected were ... well, non-clutch hitters, or whatever we call those. There are a number of uncomplimentary terms in use.

"If clutch hitters really exist," wrote Cramer, "one would certainly expect that a batter who was a clutch hitter in 1969 would tend also to be a clutch hitter in 1970. But if no such tendency exists, then 'clutch hitting' must surely be a matter of luck." Cramer found that there was no persistence in the clutch-hitting data — therefore, that clutch performance was a matter of luck. "I have established clearly," wrote Cramer, "that clutch hitting cannot be an important or a general phenomenon."

...

Cramer was using random data as proof of nothingness — and I did the same, many times, and many other people also have done the same. But I'm saying now that's not right; random data proves nothing — and it cannot be used as proof of nothingness.

Why? Because whenever you do a study, if your study completely fails, you will get random data. Therefore, when you get random data, all you may conclude is that your study has failed. Cramer's study may have failed to identify clutch hitters because clutch hitters don't exist — as he concluded — or it may have failed to identify clutch hitters because the method doesn't work — as I now believe. We don't know. All we can say is that the study has failed.


(Much more in the original paper)
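
A minimal sketch of the point in the passage above, under invented assumptions: give every simulated hitter a small but real "clutch" skill, let each clutch plate appearance be an independent coin flip, and check how much of that skill shows up as year-to-year persistence. The player count, skill spread, plate appearances and base rate are all hypothetical; none of this reproduces Cramer's data or method.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def expected_correlation(skill_sd, pa, base):
    """Share of observed variance that is true skill (binomial noise assumed)."""
    noise_var = base * (1 - base) / pa
    return skill_sd ** 2 / (skill_sd ** 2 + noise_var)


def simulate_clutch_persistence(n_players=300, skill_sd=0.005, pa=100,
                                base=0.260, seed=0):
    """Year-to-year correlation of clutch average when a real skill exists."""
    rng = random.Random(seed)
    # Hypothetical true clutch skill: a small per-player shift in hit probability.
    skills = [rng.gauss(0, skill_sd) for _ in range(n_players)]

    def season(skill):
        p = base + skill
        hits = sum(1 for _ in range(pa) if rng.random() < p)
        return hits / pa

    year1 = [season(s) for s in skills]
    year2 = [season(s) for s in skills]
    return pearson(year1, year2)


if __name__ == "__main__":
    print("expected:", expected_correlation(0.005, 100, 0.260))
    print("simulated:", simulate_clutch_persistence())
```

With these made-up inputs the expected correlation is close to zero even though the skill is real by construction, which is the sense in which a failed persistence test cannot, on its own, separate "no skill" from "method too noisy to see it."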
   138. kthejoker Posted: November 20, 2017 at 09:38 PM (#5579149)
Three DHs go 4 for 4 with 4 solo home runs one week apart against the same team. The same pitchers, even. The same pictures, even.

The first one's team wins 9-0.

The second one wins 5-4.

The third one loses 5-4.

How much value did each player contribute to their team?
   139. Jay Z Posted: November 20, 2017 at 09:48 PM (#5579156)
Three DHs go 4 for 4 with 4 solo home runs one week apart against the same team. The same pitchers, even. The same pictures, even.

The first one's team wins 9-0.

The second one wins 5-4.

The third one loses 5-4.

How much value did each player contribute to their team?


The first one, 'cuz the wind was blowing in.
   140. snapper (history's 42nd greatest monster) Posted: November 20, 2017 at 09:52 PM (#5579162)

Except it's not. I posted Alan Trammell's prime numbers and I'm sure there are a bunch of other examples -- particularly if we understand the confirmation bias inherent to only going by full seasons. Plenty of guys have big years and then suck so hard the next year that they aren't given a full season of at bats.

Clutch hitting is inherently more volatile year to year because of the smaller sample sizes. The best hitters in 2017 will not tend to be the best hitters in April 2018.


Now you're being intentionally obtuse, or are just not very bright.

You absolutely know hitting is predictable. There are models (Steamer, ZiPS, Cairo, Marcel, etc.) that can predict future hitting based on past hitting, and do a very good job in aggregate. Of course there are outliers, but on the whole, the predictive models work.

No such thing is possible for "clutch" hitting. It's not a stable ability. That means either 1) the effect is too small to be noticeable, or 2) it doesn't exist. In either case, teams can safely ignore it.
   141. Greg K Posted: November 20, 2017 at 09:57 PM (#5579167)
I could also ask what happened with Sam Dyson in relatively low-pressure regular-season games early this season, after he'd saved 38 games the year before and had a 1.93 ERA in five playoff appearances over the past two seasons. Did the momentousness of facing the Seattle Mariners in April suddenly dawn on him? :)

I assume he had a Jose Bautista flashback and it crushed his spirit. I've heard those can pop up at random times.
   142. Lance Reddick! Lance him! Posted: November 20, 2017 at 09:59 PM (#5579170)
WAR uses a dead simple model -- 10 marginal runs equals 1 win.

If you don't know what you're talking about, it's best to just shut up.
   143. cardsfanboy Posted: November 20, 2017 at 10:17 PM (#5579175)
I'm just now reading this thread and I immediately passed the first stupid ass comment, and then ended up with this one..

The real problem with WAR and all the sabr stats is a refusal of those that display them to use the same definition. New age analysts hate batting average and RBIs and Wins but no matter where you look those numbers are the same. When you get to WAR fangraphs, bbref, baseball prospectus etc all have different numbers and the differences are large. If WAR was 2.4, 2.5, and 2.6 it would be ok but some players can be 2.5 in one call and 3.5 in another. How is anyone supposed to take that seriously? How can Tommy John have such different WAR totals?


You really can't be this ####### stupid can you? War from different websites are different stats.... fwar, bwar etc. are different stats... they are not the same stat even if they have the same name... .do you know two different people named John? if so, are they the same people, if not, don't complain because one time you see John he's tall, and the other time he's short.

Edit: I had planned to read this entire thread and comment and see what is going on, but after about 8 posts, it was already annoying the crap out of me, might return later when I actually want to enjoy a debate.
   144. Blanks for Nothing, Larvell Posted: November 21, 2017 at 07:25 AM (#5579239)
You absolutely know hitting is predictable. There are models (Steamer, ZiPs, Cairo, Marcel, etc.) that can predict future hitting based on past hitting, and do a very good job in aggregate. Of course there are outliers, but on the whole, the predictive models work.


Citation very much needed. "Work," how? I'm of course very much aware of the models and am curious to know your definition of working. They get some guys kind of close to right and miss on a bunch of guys.

No such thing is possible for "clutch" hitting. It's not a stable ability. That means either 1) the effect is too small to be noticeable, or 2) it doesn't exist. In either case, teams can safely ignore it.


I guess you're just going to ignore all the things informed people know about sports psychology and the impact of psychology on physical performance, and first-order effects. Hard to have a serious discussion when this massive amount of very persuasive evidence is just hand-waved away.

Nor can teams "safely ignore" it. They can live with it if it comes with good enough hitting in non-clutch situations. The Yankees didn't "safely ignore" A-Rod's perpetual gakking in the playoff clutch. They moved him way down in the order and if memory serves even benched him.
   145. PreservedFish Posted: November 21, 2017 at 07:48 AM (#5579243)
BDC:
I don't think we disagree. It's just the difference between writing a baseball novel and trying to determine who the better players are in terms of winning ballgames :)


I guess that's fair, but I want a guarantee that I can say something like "Ken Giles choked big time" without one of you pencil neck geeks objecting.
   146. BDC Posted: November 21, 2017 at 08:45 AM (#5579259)
The Yankees didn't "safely ignore" A-Rod's perpetual gakking in the playoff clutch. They moved him way down in the order and if memory serves even benched him

Exactly, until the year when they batted him cleanup the whole time, he hit .365, and they won the World Series.
   147. Rally Posted: November 21, 2017 at 08:58 AM (#5579262)
Do you think he's likely to choke again next season? Is he done as an effective closer?


Giles' struggles reminded me of the closer the Astros had last time they made the World Series, Brad Lidge. There might have been some carry-over as Lidge had a 5.28 ERA in 2006, but he was lights out in 2008 and converted every save attempt through the end of the World Series that year.
   148. Blanks for Nothing, Larvell Posted: November 21, 2017 at 09:10 AM (#5579269)
Exactly, until the year when they batted him cleanup the whole time, he hit .365, and they won the World Series.


Right -- the year he got off to a decent start and therefore didn't press, either at all or as much.

Again, anyone familiar with the actual first-order psychology at work understands the "uh-ohs" that go through the mind when things start to not go so well. And that they might not kick in if things in fact start out ok.

It's always been quite comical that people somehow think 2009 impacts what happened in all the previous years. Rory McIlroy had a nice US Open a few months after he obviously tensed up in the final round of the Masters.

These psychological effects are obvious; the far more interesting question is why the baseball culture is so different than the golf culture with respect to their explicit recognition. Maybe it's demographics; maybe it's because the golf community isn't as hung up on thinking there are any kind of "value judgments" or "moralizing" going on with the explicit recognition. One gets the sense as early as Bill James writing about it -- "racists will say blacks can't hit in the clutch" -- a grave discomfort with the idea for reasons having little to do with baseball.

   149. Sunday silence Posted: November 21, 2017 at 09:20 AM (#5579271)

Clutch hitting is inherently more volatile year to year because of the smaller sample sizes. The best hitters in 2017 will not tend to be the best hitters in April 2018.


but if I understand you; you believe clutch hitting exists (in some manner you define) but can you measure it? I.e. is there any hitter you can point to who is a better clutch hitter than the average hitter? Are there any below average clutch hitters that you can identify?
   150. Blanks for Nothing, Larvell Posted: November 21, 2017 at 09:23 AM (#5579272)
but if I understand you; you believe clutch hitting exists (in some manner you define) but can you measure it?


Measure how? How do you measure or quantify psychological impact on body function?

I distinctly remember James writing a bunch of stuff on how when guys overperform what they think is their norm they will often kind of realize the disconnect between their performance and their deep down opinion of themselves. (*) As a result, they will often revert. This is also an obviously true observation. Maybe someone can pull up his exact words.

(*) This happened with roiders, to be sure -- but they had the psychological benefit of being able to tell themselves the change was because of the roids. Indeed, the roids themselves likely impacted the psychology in a positive way.



   151. Sunday silence Posted: November 21, 2017 at 09:36 AM (#5579275)
You really can't be this ####### stupid can you? War from different websites are different stats.... fwar, bwar etc. are different stats... they are not the same stat even if they have the same name...


why does it make someone stupid to think that there is an agreed upon def'n of WAR among sabermetricians??

That doesnt strike me as stupid and in fact I had assumed the same thing until reading something earlier this year.

If I look up the mass of the moon and NASA says it's 0.5 earths and the encyclopedia says 1.3 and Wikipedia says it's 0.8, I would think something's wrong. The only issue then is: Is it reasonable to assume WAR has an acceptable widespread def'n, or is it some sort of subjective term like "prime" or "valuable" or "high leverage"?

Knowing what I know, I would assume WAR is based on weighted values of certain discrete actions that happen on the ball field: hitting a single, a HR, getting a BB etc. We should be able to agree on that stuff cause we know the ba. and slug. avg and obp of each league on the whole so we know what the expectancy of a run would be in those cases on the average. Then there's stolen bases, which is kind of problematic cause people can choose to steal only in certain situations, but it's got to be around 0.2 or 0.25 of a run...

Then theres fielding which has at least two competing systems so I guess you have a point there, but presumably one would imagine, one would ASSUME, that if you're publishing someones WAR for the season you might say rField based on UZR Or TZ or something. Right?

There's park effects, which I think WAR takes into account, and park effects are measurable. You can use different time periods to regress such effects so maybe there's a discrepancy there..

And yet there's a whole nother reason. Replacement Value has an agreed upon def'n (apparently Fangraphs and the other places had to reach an agreement). So its not reasonable to assume that saber people everywhere realize the necessity of having agreed upon def'n for certain basic things? I would think WAR is basic.

Offhand I can't think of any really compelling reason (other than maybe park effects and fielding metrics) why WAR should be different according to different reference places. Although I know it does vary. I would bet less than 20% of the people here can tell us off the top of their head why WAR varies according to the source. I know I can't.

How does that make someone STUPID??

   152. Rally Posted: November 21, 2017 at 09:37 AM (#5579276)
My recollection on clutch hitting studies is that you could identify players who would be expected to hit slightly better or worse in given situations. But to reliably identify them, you need a sample size that consists of a long career. So you might be able to say Paul Molitor or David Ortiz was a good clutch hitter, but that knowledge won't do you much good in 2018.
   153. Sunday silence Posted: November 21, 2017 at 09:38 AM (#5579277)
...As a result, they will often revert. This is also an obviously true observation


OK then give us an example smart guy.
   154. Sunday silence Posted: November 21, 2017 at 09:45 AM (#5579279)
My recollection on clutch hitting studies is that you could identify players who would be expected to hit slightly better or worse in given situations. But to reliably identify them, you need a sample size that consists of a long career.


I dont think thats correct or its rather shaky. We have lists of certain baseball stats that stabilize after X number of data pts. For instance batting average as I recall stabilizes after about 150 AB. Obviously it might still vary a lot, but it has a certain permanence at that pt. I guess you get into standard deviations and confidence numbers and mathy stuff. But you get the general idea?
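
For a rough sense of how much a batting average "might still vary" at 150 AB, here is a small sketch that treats each at-bat as an independent trial with a fixed hit probability. That binomial assumption is a simplification, and the .270 figure is just an example.

```python
import math


def batting_average_se(avg, at_bats):
    """Standard error of an observed average under a simple binomial model."""
    return math.sqrt(avg * (1 - avg) / at_bats)


def approx_interval(avg, at_bats, z=1.96):
    """Approximate 95% range around the observed average (normal approximation)."""
    se = batting_average_se(avg, at_bats)
    return avg - z * se, avg + z * se


if __name__ == "__main__":
    print(batting_average_se(0.270, 150))
    print(approx_interval(0.270, 150))
```

Under these assumptions the standard error at 150 AB is roughly 36 points of average, so a split that looks large in one season of RISP at-bats can still sit comfortably inside the noise.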

It doesnt seem hard to find 150 AB with RISP in a given season for a player?

Now maybe you are thinking of a more specific situation like runner on third less than 2 outs? That's why I said "shaky" above; I dont think you're correct but maybe....?
   155. fra paolo Posted: November 21, 2017 at 09:54 AM (#5579286)
Doesn't this apply to Win Shares as well ? Whatever Bill is using for the defensive component of Win Shares has to apply equally, and therefore is equally problematic, no ?

Is it equally problematic? Win Shares takes marginal runs at the team level and then calculates how many marginal runs are needed by that team to 'purchase' a win and then distributes that credit amongst the players. If the Posnanski article is correct, WAR divides a player's runs by 10, or may calculate a league value to replace 10, which immediately takes the player out of the context of his team.
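
The contrast drawn here can be sketched in a few lines. This is a deliberately simplified illustration, not the actual Win Shares or WAR arithmetic: the function names, the flat 10 runs per win, and the team totals are all hypothetical. The first function converts a player's runs at a fixed league-wide rate; the second asks how many marginal runs that particular team needed per win and converts at the team's own rate.

```python
def wins_fixed_rate(player_runs, runs_per_win=10.0):
    """Context-free conversion: the same exchange rate for every team."""
    return player_runs / runs_per_win


def wins_team_context(player_runs, team_marginal_runs, team_wins):
    """Team-level conversion: runs 'purchase' wins at the team's own price."""
    team_runs_per_win = team_marginal_runs / team_wins
    return player_runs / team_runs_per_win


if __name__ == "__main__":
    # Two invented teams with identical marginal runs but different win totals.
    print(wins_fixed_rate(36))
    print(wins_team_context(36, team_marginal_runs=720, team_wins=90))
    print(wins_team_context(36, team_marginal_runs=720, team_wins=80))
```

The same 36 runs are worth the same number of wins everywhere under the fixed rate, but more on the team that turned its runs into wins more efficiently, which is the team-level context the post says WAR leaves out.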

Although the discussion here has covered the translation of runs to wins and the role of the individual player, my scepticism towards WAR is rooted in the removal of the team from the process. Teams just don't seem to exist under WAR-based systems.

That's a level of context that I don't think we can do without,* whether one is going to assign the credit in terms of Wins or Runs.
_______________
* Especially when it comes to fielding, when the pitcher/fielder division of responsibility has to be accounted for, plus the value of the player's fielding within that context, as well as to his counterparts at his position on other teams. If a player is a crap fielder, but fields very few BIP because of the way the team's pitchers pitch, then he's not doing as much damage as it might seem.
   156. Sunday silence Posted: November 21, 2017 at 09:57 AM (#5579288)
I have two more questions having read the James follow up article and the Dave Cameron piece on FG both of which have links somewhere on this site if not in this thread.

1. Does it matter if say Judge sat for half his teams games (I know he was benched but not sure if the details fit my hypo) and his team lost most of those? Ie does it matter which games a given player played in when doing what James wants to do with value/WAR and adjust it for team wins?

2. The logical next step would be to instead of looking at the season wins, just to look at the actual games won and see who added Win Expectancy or not. That to me would be intellectually honest. I mean given that an individuals value is only how it relates to team wins, then its logical to look at team wins themselves rather than a seasonal total of team wins.

James objection is that there's no way to determine what an average player value is...

Why wouldn't it be zero? The league win pct is zero on the whole. Every event is going to add/subtract WE but in the end it will all add to zero.
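
A minimal sketch of that bookkeeping, assuming every game starts at a 50/50 win expectancy and that each play's change is credited to one side and debited from the other. The sequence of win-expectancy values is invented.

```python
def win_expectancy_added(home_we_after_each_play, start=0.5):
    """Return (home_total, away_total) of win expectancy added over one game."""
    home_total = 0.0
    away_total = 0.0
    previous = start
    for we in home_we_after_each_play:
        delta = we - previous
        home_total += delta   # credit the home side with the swing
        away_total -= delta   # debit the away side by the same amount
        previous = we
    return home_total, away_total


if __name__ == "__main__":
    # Hypothetical game: home win expectancy after each play, ending in a home win.
    plays = [0.45, 0.60, 0.30, 0.80, 1.00]
    home, away = win_expectancy_added(plays)
    print(home, away, home + away)
```

In this setup the winning side always nets +0.5 and the losing side -0.5, so the totals cancel game by game and therefore across the league, with the average player sitting at zero by construction.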

I dont get James big objection here, can anyone elaborate?
   157. Sunday silence Posted: November 21, 2017 at 10:00 AM (#5579290)
If a player is a crap fielder, but fields very few BIP because of the way the team's pitchers pitch, then he's not doing as much damage as it might seem.


OK sounds logical, but when has that ever happened in history? I mean on a season long basis, not just one game w/ a game score of 93..
   158. fra paolo Posted: November 21, 2017 at 10:08 AM (#5579295)
Is it reasonable to assume WAR has an acceptable widespread def'n or its some sort of subjective term like "prime" or "valuable" or "high leverage.?"


WAR as a formula isn't subjective, but its components can be. Something like Ultimate Zone Rating can vary depending on the source of the data for BIP, IIRC. One could use FIP or something like BPro's DRA in calculating a pitcher's value. One could even add Fielding Win Shares into a WAR calculation using a Linear Weights' style calculation of batting runs. Just divide the Fielding Win Shares by three, I believe.

And the value of a Replacement Player is subjective, so the actual worth of Replacement Level can vary, although I think there was an attempt to standardise this.
   159. fra paolo Posted: November 21, 2017 at 10:21 AM (#5579301)
OK sounds logical, but when has that ever happened in history?

One can see this effect possibly at work by comparing rankings of players under different fielding systems. Fielding Win Shares has a team element, and will at times show a poor fielder under a zone-based WAR not doing quite so badly. Now, it could be that the zones are simply wrong, but it could also be that the pitchers are successful in controlling where the batter hits the ball.

Pitchers (and catchers) often talk about the kind of contact they want from the hitter. (And hitters speak in the same way with effects we can see in StatCast nowadays.) 'Weak' or 'soft' contact. 'Induce' a grounder. 'Try to get a pop-up.' I should think they can influence direction as well.
   160. Sunday silence Posted: November 21, 2017 at 10:35 AM (#5579314)

WAR as a formula isn't subjective, but its components can be. Something like Ultimate Zone Rating can vary depending on the source of the data for BIP, IIRC. One could use FIP or something like BPro's DRA in calculating a pitcher's value.


I know BTF had this discussion a few weeks ago, and I think it had to do with Tommy John if I recall. I can see using/not using FIP could be a big factor for a relatively soft tosser (I think TJohn was). Is that the main source of the discrepancy for pitchers?

you mention fielding, can I assume WAR for batting is standardized?

the actual worth of Replacement Level can vary, although I think there was an attempt to standardise this.


my understanding is that FG and BaseballRef (I think) have agreed to 49 wins/season (which is then converted to a win pct e.g. .30 and each replacement is a .30 player or whatever) as the standard. How that works on the components of fielding, batting etc is lost on me.
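
Taking the recalled 49 wins per season at face value (the poster is not sure of the number, and it is not checked here), the conversion to a winning percentage is simple arithmetic over a 162-game schedule. The function name is hypothetical.

```python
def replacement_win_pct(replacement_wins, games=162):
    """Winning percentage implied by a replacement-level team's win total."""
    return replacement_wins / games


if __name__ == "__main__":
    # 49 is the figure recalled in the post, used here only as an example input.
    print(round(replacement_win_pct(49), 3))
```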
   161. Sunday silence Posted: November 21, 2017 at 10:38 AM (#5579316)
Now, it could be that the zones are simply wrong, but it could also be that the pitchers are successful in controlling where the batter hits the ball.


this seems dubious, no? Pitchers cannot control BABIP how can they control where the ball is going?
   162. fra paolo Posted: November 21, 2017 at 10:45 AM (#5579320)
this seems dubious, no? Pitchers cannot control BABIP how can they control where the ball is going?

The plate appearance is a battle between the pitcher and the hitter over how hard the ball will be hit and where it will go. So with two strikes the batter has to be ready to 'go the opposite way with the pitch'. So something is going on.
   163. dlf Posted: November 21, 2017 at 10:50 AM (#5579325)
I can see using/not using FIP could be a big factor for a relatively soft tosser (I think TJohn was).


Pre-surgery, John had an average or above average fastball, but not uniquely so. Post surgery, his "fastball" was in the low 80s and his most frequently thrown pitch (a sinker) was even slower. Tommy John brings to mind the Frank Tanana line: "in the 70s, I threw in the 90s; in the 90s, I threw in the 70s."

this seems dubious, no? Pitchers cannot control BABIP how can they control where the ball is going?


Wouldn't an outside pitch be harder to pull and a low pitch more likely to be hit on the ground? I do think that prior studies have shown there is a pretty strong Y-t-Y correlation, for example, in the GB:FB ratios as well as the ratio of assists by 3B per GB.
   164. fra paolo Posted: November 21, 2017 at 10:51 AM (#5579329)
I know BTF had this discussion a few weeks ago, and I think it had to do with Tommy John if I recall. I can see using/not using FIP could be a big factor for a relatively soft tosser (I think TJohn was). Is that the main source of the discrepancy for pitchers?

I don't really know if it's the main source. We don't have much information about how BPro calculates its Pitcher WAR, AFAIK. It's well known that BB-ref and FanGraphs use different methods to calculate Pitcher WAR.

you mention fielding, can I assume WAR for batting is standardized?

All batting WAR that I'm familiar with are based on some version of a Linear Weights formula for estimating runs like Pete Palmer et al described in The Hidden Game of Baseball back in the 1980s. But the coefficients might vary, I suppose. Win Shares uses a Runs Created formula, which is different.

I stopped making close studies of this stuff in about 2013 or so, and there have been changes introduced since, so I'm not the best person to answer these questions.
   165. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 10:53 AM (#5579335)

Right -- the year he got off to a decent start and therefore didn't press, either at all or as much.

Again, anyone familiar with the actual first-order psychology at work understands the "uh-ohs" that go through the mind when things start to not go so well. And that they might not kick in if things in fact start out ok.

It's always been quite comical that people somehow think 2009 impacts what happened in all the previous years. Rory McIlroy had a nice US Open a few months after he obviously tensed up in the final round of the Masters.


Measure how? How do you measure or quantify psychological impact on body function?

I distinctly remember James writing a bunch of stuff on how when guys overperform what they think is their norm they will often kind of realize the disconnect between their performance and their deep down opinion of themselves. (*) As a result, they will often revert. This is also an obviously true observation. Maybe someone can pull up his exact words.

(*) This happened with roiders, to be sure -- but they had the psychological benefit of being able to tell themselves the change was because of the roids. Indeed, the roids themselves likely impacted the psychology in a positive way.


What seems to be an obviously true observation to me is that you -- and, in fairness, the majority of non-analytical sports commentators -- are very good at making up just-so stories to explain variance in players' performance based on your assumptions about what must be going on in the minds of people you've never met.
   166. Kiko Sakata Posted: November 21, 2017 at 10:55 AM (#5579336)
1. Does it matter if say Judge sat for half his teams games (I know he was benched but not sure if the details fit my hypo) and his team lost most of those? Ie does it matter which games a given player played in when doing what James wants to do with value/WAR and adjust it for team wins?


Does it matter, in Win Shares, what the team's record is when a particular player is playing? Not in the version of Win Shares that Bill James wrote about in his original book on the subject. Should it matter? Of course.

2. The logical next step would be to instead of looking at the season wins, just to look at the actual games won and see who added Win Expectancy or not. That to me would be intellectually honest. I mean given that an individuals value is only how it relates to team wins, then its logical to look at team wins themselves rather than a seasonal total of team wins.


Yes, to do what Bill James wants to do - and as I think I've argued in this thread, I think he's right to want to do this - you have to calculate your statistic at the game level, not the season level.

James objection is that there's no way to determine what an average player value is...


I don't get this either. Replacement level is a theoretical concept which is difficult to define and has no right answer. Determining what an average player is, is dead-bang easy (well, maybe not "easy" but it's fairly straightforward and has a knowable "right" answer).
   167. Kiko Sakata Posted: November 21, 2017 at 11:02 AM (#5579344)
I know BTF had this discussion a few weeks ago, and I think it had to do with Tommy John if I recall. I can see using/not using FIP could be a big factor for a relatively soft tosser (I think TJohn was). Is that the main source of the discrepancy for pitchers?


The discussion was with respect to Tommy John and the difference wasn't really FIP vs. RA-9 - which, incidentally, Fangraphs shows WAR numbers calculated both ways; BB-Ref's measure theoretically falls between the two Fangraphs measures, as BB-Ref uses RA-9 but adjusts for team fielding.

But, in the case of Tommy John, his FIP and ERA are actually pretty close (3.38 vs. 3.34 - I forget which was lower, they're that close - although John allowed an above-average number of unearned runs, as is typical of a ground-ball pitcher). He wasn't a big K guy but excelled at the other two "true outcomes". And, as best I can figure, BB-Ref says that he pitched in front of relatively average defenses over the course of his career - as you'd expect, when you pitch forever, you're much more likely to have the extremes in your context average out over time.

Anyway, for reasons buried within the calculations, BB-Ref gives John 10 - 17 fewer WAR than Fangraphs does (the 10 is RA-9; the 17 is FIP). [That parenthetical suggests that his FIP was probably lower than his ERA - although, as I said, he also allowed a lot of unearned runs that show up in the RA-9 figure.]
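
For readers unfamiliar with the two starting points, here is a small sketch of FIP and RA-9 as they are commonly defined. The FIP constant is set each season so that league FIP matches league ERA; the 3.10 used below is only an illustrative assumption, and the stat line is invented rather than Tommy John's.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: only HR, walks, HBP and strikeouts count."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def ra9(runs, ip):
    """Runs allowed per nine innings, earned and unearned together."""
    return 9 * runs / ip


if __name__ == "__main__":
    # Hypothetical season: 200 IP, 20 HR, 50 BB, 5 HBP, 150 K, 80 total runs.
    print(fip(hr=20, bb=50, hbp=5, k=150, ip=200))
    print(ra9(runs=80, ip=200))
```

A pitcher with few strikeouts but very few walks and home runs can post a good FIP, while a heavy dose of unearned runs pushes RA-9 above ERA, so the choice of starting point can move a pitcher's value before any defense or park adjustment is applied.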
   168. Blanks for Nothing, Larvell Posted: November 21, 2017 at 11:36 AM (#5579367)
What seems to be an obviously true observation to me is that you -- and, in fairness, the majority of non-analytical sports commentators -- are very good at making up just-so stories to explain variance in players' performance based on your assumptions about what must be going on in the minds of people you've never met.


No, you still aren't reading what I've said then. There are no "assumptions," there is instead direct testimony from the people involved as to what is going on in their minds and how it impacts their performance. There are also the recollections of former players about what they think is going through the minds of current players, based on their long experience. There are the articles and other writings by psychologists about their work with players and what the players have said to them.

And there's also my direct observation as a former competitive athlete -- although this isn't remotely necessary to the point. When you aren't adequately controlling your nerves in the "hit ball well" sports, your body literally isn't functioning as it typically does. Again -- nothing remotely controversial or even debatable here. Everyone who's been there knows what I mean, and it's guaranteed that major league baseball players have been there. For cultural reasons, they don't talk about it much, unlike golfers -- but so what?

Anyone who believes psychology and psychological state has no impact on performance isn't worthy of being taken seriously as an "analyst." It's utterly and patently absurd. You don't even need golf; all you need is Steve Blass/Kevin Saucier/Rick Ankiel, and common sense. With those three people, the evidence is staring every objective person square in the face. You can either blink and pretend, bury yourself in the trees while cluelessly missing the forest -- or you can be smart and wise. Smart and wise is typically the best path.
   169. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 11:46 AM (#5579380)
Anyone who believes psychology and psychological state has no impact on performance isn't worthy of being taken seriously as an "analyst."

This is a straw man. There is a world of difference between acknowledging that yes, psychology most likely has some impact on athletic performance in general and saying that you know exactly how it affected particular athletes in particular situations. For example, you ascribe A-Rod's bad performance in some postseasons to inherent chokerness, and then you explain away his good performance in one World Series as caused by the fact that he "got off to a decent start and therefore didn't press, either at all or as much."

You have no idea about that. None. You've cited no "direct testimony from the people involved as to what is going on in their minds and how it impacts their performance." And even if A-Rod offhandedly said "yeah, I was pressing" or something like that, we all know that that's just a throwaway line that players and others use to explain poor performance. If he had provided a more detailed explanation of how his mental state had affected him, then sure, that would maybe count as data. But a manager saying "he looks like he's pressing" is just crap they say to the media.
   170. Blanks for Nothing, Larvell Posted: November 21, 2017 at 11:53 AM (#5579383)
There is a world of difference between acknowledging that yes, psychology most likely has some impact on athletic performance in general and saying that you know exactly how it affected particular athletes in particular situations.


Concession accepted on the first part. Took awhile, but we finally got there. Progress!

Of course no one knows "exactly how" the impact worked. No one in sports knows exactly how anything affects anything. Not exactly an earth-shattering observation. As to A-Rod, Joe Torre (and indirectly Derek Jeter) spoke volumes by their actions as to how they thought he was being impacted. That's good enough for me.

You have no idea about that. None.


Except I do. And again the idea that an athlete would get into a less-optimal psychological state when things start not going so well -- yet again -- in big situations isn't remotely controversial.

   171. BDC Posted: November 21, 2017 at 11:54 AM (#5579385)
he had a Jose Bautista flashback

Yeah, maybe Dyson wasn't the greatest example :-D Or maybe he was. He pitched fine against Toronto till he gave up a famous home run, and then he pitched very well the whole next year (including one perfect playoff inning against Toronto), and only then did he cease to be able to get anybody out in routine April save situations …

   172. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 12:01 PM (#5579389)
Concession accepted on the first part. Took awhile, but we finally got there.

Not sure "my second post" = "took awhile," especially since I never denied that there was a connection, but ok.

As to A-Rod, Joe Torre (and indirectly Derek Jeter) spoke volumes by their actions as to how they thought he was being impacted. That's good enough for me.

OK, so you're just building your psychological narrative about A-Rod from your assumptions about the thoughts behind the actions of people who weren't even A-Rod. Got it. I guess I'll get a "concession accepted" in here too.



   173. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:05 PM (#5579394)
Not sure "my second post" = "took awhile," especially since I never denied that there was a connection, but ok.


Everyone who says there's no such thing as clutch hitting or clutch non-hitting is denying it.

OK, so you're just building your psychological narrative about A-Rod from your assumptions about the thoughts behind the actions of people who weren't even A-Rod. Got it.


No, from A-Rod and his production as well as from other people. We all know A-Rod is half-horse, but I still think we can consider him human psychologically. You do realize that psychologists use the experience of other humans when treating human patients, right?

   174. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:08 PM (#5579399)
I'd suggest reading up on 1991 British Open champion Ian Baker-Finch, including his personal words, for a representative description of how things can spiral out of control when self-doubt starts creeping into the mind of even mega-elite athletes.

He's another example of Blassian catastrophic failure. A-Rod self doubting and pressing in the postseason was just an example of that, writ significantly smaller.
   175. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 12:15 PM (#5579410)

Everyone who says there's no such thing as clutch hitting or clutch non-hitting is denying it.


No one is saying it's not a thing. It's clearly a thing in retrospect.

What we are saying is that there is no inherent "clutch" or "un-clutch" trait that a player possesses. No history of clutch or un-clutch hitting helps predict future clutch performance.

You demonstrated it yourself. The un-clutchiest guy that ever choked had a clutch post-season for the ages in 2009.

I'd suggest reading up on 1991 British Open champion Ian Baker-Finch, including his personal words, for a representative description of how things can spiral out of control when self-doubt starts creeping into the mind of even mega-elite athletes.

The fact that golfers are easy marks for pop pseudo-psychology BS doesn't tell us much except they have too much money and too little scepticism.
   176. David Nieporent (now, with children) Posted: November 21, 2017 at 12:18 PM (#5579412)
The clutch studies are more about a lack of statistical power in single year samples than anything else. As they say, absence of proof is not proof of absence.

In other words, we don't have evidence that clutch hitting exists. That's not to say that we have evidence that clutch hitting doesn't exist. It's a subtle--but very important--epistemological distinction. I don't know how we can address the question in a 162-game season absent pooling of consecutive seasons or something, which creates at least as many problems as it solves. The other thing is that batting situations are not independently and identically distributed (e.g., batters generally face better pitchers in close-and-late situations), which further complicates the statistical power issues.
But -- and this is a point I made when James used his "fog" metaphor -- it ultimately doesn't matter for real world purposes. There's no practical (as opposed to academic) distinction between "clutch ability doesn't exist" and "clutch ability exists but has such a small effect that it is undetectable amongst the noise." If the latter is true, you still can't reliably identify clutch performers, and therefore can't make decisions or evaluations of them based on their alleged clutch ability.
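To put rough numbers on the statistical power point, here is a minimal sketch. The .270 baseline, the 10-point clutch effect, and the 100 clutch at-bats are all made-up illustrative values, and the calculation treats at-bats as independent and the baseline as known exactly, which the post above notes is not really true. Both simplifications make detection look easier than it is.

```python
import math


def binomial_se(p: float, n: int) -> float:
    """Standard error of an observed rate over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def trials_needed(p: float, effect: float, z: float = 2.0) -> int:
    """Trials needed before `effect` is as large as z standard errors."""
    return math.ceil(z * z * p * (1.0 - p) / (effect * effect))


# Hypothetical values, chosen only for illustration.
TRUE_AVG = 0.270       # assumed baseline batting average
CLUTCH_EFFECT = 0.010  # assumed true clutch boost of 10 points
CLUTCH_AB = 100        # assumed clutch at-bats in one season

se = binomial_se(TRUE_AVG, CLUTCH_AB)
print(f"Standard error over {CLUTCH_AB} AB: {se:.3f}")
print(f"Assumed clutch effect: {CLUTCH_EFFECT:.3f}")
print(f"AB needed for the effect to reach 2 SE: {trials_needed(TRUE_AVG, CLUTCH_EFFECT)}")
```

Under these assumptions the one-season noise is about four times the size of the effect, and it would take several thousand clutch at-bats for a 10-point effect to stand out. That is the practical sense in which a small real effect and no effect look the same.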
   177. Slivers of Maranville descends into chaos (SdeB) Posted: November 21, 2017 at 12:24 PM (#5579418)
David,

While I agree with you, what James argues is we don't know if clutch ability is "undetectable amongst the noise," or if we're merely using the wrong tools to look for it. Radio waves are undetectable to the ordinary person, unless you use the appropriate equipment.
   178. David Nieporent (now, with children) Posted: November 21, 2017 at 12:30 PM (#5579429)
Anyway, no, it's not correct to zero out everyone who plays on the losing team. That requires seeing into the future. Events should be valued on what is known at the time. At the time the event occurs, the winner is not known. So any positive event may lead to a win, or negative to a loss, and they are rated accordingly.
Why should events be valued on what is known at the time? I can understand a preference for choosing that approach, but I can't see why that approach is mandated.
   179. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:30 PM (#5579432)
What we are saying is that there is no inherent "clutch" or "un-clutch" trait that a player possesses.


Except yes, there is -- psychological makeup and response to pressure. Even elite athletes vary in their makeup and responses.

You demonstrated it yourself. The un-clutchiest guy that ever choked had a clutch post-season for the ages in 2009.


That doesn't demonstrate anything beyond what's already been stated, which is that the trait isn't inherent and immutable.

The fact that golfers are easy marks for pop pseudo-psychology BS


Please tell us you aren't attributing his catastrophic failure, as well as the Blass/Saucier/Ankiel catastrophic failures, entirely to things other than psychology.
   180. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 12:33 PM (#5579435)
We all know A-Rod is half-horse, but I still think we can consider him human psychologically.

He has a human brain. I'm sure the centaur part has some effect that differentiates him from ordinary humans, either inherently or in terms of learned behavior (or both), but I'm willing to stipulate to his having normal human psychology for the sake of argument.

Everyone who says there's no such thing as clutch hitting or clutch non-hitting is denying it.

Again, straw man. It's perfectly reasonable to say that yes, psychology probably affects things in some ways, but it pretty clearly doesn't manifest itself in any way that shows up in the data when it comes to clutch hitting, for a whole host of potential reasons as discussed above. The only observable manifestations seem to be in catastrophic breakdowns a la Blass/Knoblauch.

I'd suggest reading up on 1991 British Open champion Ian Baker-Finch, including his personal words, for a representative description of how things can spiral out of control when self-doubt starts creeping into the mind of even mega-elite athletes.

He's another example of Blassian catastrophic failure. A-Rod self doubting and pressing in the postseason was just an example of that, writ significantly smaller.

Look, if you're going to accept "What SBB thinks Derek Jeter or Joe Torre doing something says about what they think about why A-Rod isn't playing well" as perfectly valid data to show "what is actually going on in A-Rod's head," there's no use in continuing to argue the point. If A-Rod ever discusses his postseason failures in depth, as Baker-Finch apparently has, then I'm potentially on board. But your own assumptions based on third-party actions are not valid data.
   181. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:37 PM (#5579440)
But your own assumptions based on third-party actions are not valid data.


Then you in fact don't believe in the professional practice of psychology (and for that matter psychiatry). Case studies of humans other than the patient are inherent to the practice of those disciplines.

And in those disciplines, there is of course no single "right" answer that will satisfy the types of people who like to sift through a bunch of baseball data looking for valid statistical inferences. People vary in their comfort level with ambiguity and reasoned speculations versus Cartesian certainties. That's much of what's going on here with clutch hitting.
   182. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 12:39 PM (#5579442)
Please tell us you aren't attributing his catastrophic failure, as well as the Blass/Saucier/Ankiel catastrophic failures, entirely to things other than psychology.

No, I'm just saying no "sports psychologist" is going to help fix them.
   183. BDC Posted: November 21, 2017 at 12:39 PM (#5579444)
I think there's something of a "final lap" fallacy in Bear's position (insofar as I understand it). Maybe the fallacy already has a better name.

Rory McIlroy is a good example. Here is a guy who's had as much success at a young age as almost any non-Woods/Nicklaus golfer. He has not won a Masters, despite being in a position to win after 2 or 3 rounds a couple of times. He has won the other majors and a bunch of other tournaments.

McIlroy will tell you he chokes at Augusta, and for all I know this is true to some extent. But if McIlroy were somehow inherently an "Augusta choker," he would never have done well in the early rounds of the Masters to begin with.

It's like how we sometimes hear that a World Series pitcher can't handle pressure, or a World Series manager is acephalous, based on a few events in a sixth or seventh game. But how did they get that far to start with?

"Augusta choker," sounds like a variety of kudzu …
   184. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:41 PM (#5579446)
No, I'm just saying no "sports psychologist" is going to help fix them.


Because they can't be fixed. Psychologically, they were shot. At lower levels of failure, fixing and improving is absolutely possible.
   185. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 12:42 PM (#5579447)
Then you in fact don't believe in the professional practice of psychology (and for that matter psychiatry).

Except for pharmacological psychiatry (altering brain chemistry with drugs), and to a limited extent cognitive therapy, you would be correct. Most psychology and psychiatry is BS, totally un-rooted in science. That's why none of their experimental results are repeatable.

   186. David Nieporent (now, with children) Posted: November 21, 2017 at 12:44 PM (#5579452)

And there's also my direct observation as a former competitive athlete
Who knew there was a professional trolling league?
   187. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:45 PM (#5579453)
McIlroy will tell you he chokes at Augusta, and for all I know this is true to some extent. But if McIlroy were somehow inherently an "Augusta choker," he would never have done well in the early rounds of the Masters to begin with.


I'm still not sure why you say this. The pressure doesn't really kick in full force until Sunday.
   188. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 12:47 PM (#5579455)
I'm still not sure why you say this. The pressure doesn't really kick in full force until Sunday.

Another nonsensical "just so" story.

But there's no pressure on Sunday in the other majors?
   189. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 12:50 PM (#5579461)
Then you in fact don't believe in the professional practice of psychology (and for that matter psychiatry).

No, all it means is that I don't believe that people (especially sports commentators/fans who have no training in psychology/psychiatry whatsoever) can diagnose the mental processes of particular players in particular situations with no access to any data whatsoever from the players themselves.
   190. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:51 PM (#5579462)
But there's no pressure on Sunday in the other majors?


Yeah, there is. Not sure where you got that there wasn't.

There's more pressure on Sundays in non-majors.

This stuff is fundamental.
   191. Blanks for Nothing, Larvell Posted: November 21, 2017 at 12:54 PM (#5579468)
No, all it means is that I don't believe that people (especially sports commentators/fans who have no training in psychology/psychiatry whatsoever) can diagnose the mental processes of particular players in particular situations with no access to any data whatsoever from the players themselves.


This is self-contradictory in that it depends on data from the players themselves. The question was whether the use of data from other humans could be used to assess "the patient" A-Rod. The answer is that, of course it can.

A-Rod won't come clean on the psychiatrist's chair for cultural reasons. That doesn't mean diagnosis is impossible.
   192. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 12:56 PM (#5579475)
This is self-contradictory in that it depends on data from the players themselves.

Of course there is data with respect to the player's performance. What I'm saying is that there is no data with respect to the mental state underlying that performance.


A-Rod won't come clean on the psychiatrist's chair for cultural reasons.

Also, no psychiatrist would have a big enough chair.
   193. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 12:56 PM (#5579476)

Yeah, there is. Not sure where you got that there wasn't.


Well, then it makes no sense that he only chokes on Sunday at Augusta, and not anywhere else.

He just happened to have a couple of bad rounds (probably due to bad mechanics or bad decision making) and ascribes it to "choking" as a post-hoc justification.

Leaving us once again where we began. Clutch/choke is a description of past events in pressure situations. It is not a characteristic of individual performers, and is not predictive of future events. Therefore, we can safely ignore it.
   194. Blanks for Nothing, Larvell Posted: November 21, 2017 at 01:27 PM (#5579502)
What I'm saying is that there is no data with respect to the mental state underlying that performance.


This is absolutely true, and it's unlikely ever to change. I highly doubt we'll see those colorized brain measurements during games that we see in some other clinical contexts, but I guess who knows where things will be in the future. It's very much in teams' interest to want to know how their players' brains are functioning during games.

So again, we're kind of back to comfort level with reasoned speculation versus strict statistical inference. (Also a matter of psychology, but let's not get too meta.)

The other variable is that it's possible to be both nervous under pressure and function at close to peak anyway. Paraphrasing, but Warren Buffett has said something like, "I've never been able to stop my knees from knocking when I give a public speech, but I have learned to be able to speak ok while my knees are knocking."
   195. Rally Posted: November 21, 2017 at 01:48 PM (#5579527)
What we are saying is that there is no inherent "clutch" or "un-clutch" trait that a player possesses.

Except yes, there is -- psychological makeup and response to pressure. Even elite athletes vary in their makeup and responses.


I think that should be right - players vary in their responses to pressure. What we don't know is who these players are; we are left making a guess from their performance. There is no reason to think that humans respond to pressure the same way in all situations. Maybe he's in the right frame of mind to deliver in the clutch today but a mess tomorrow. We certainly do not know, just because a player failed in a big game or a playoff series, whether he was responding optimally to pressure, or if he just got beat.

What seems obvious to me is that whatever changes in a player's response to pressure are happening, these changes are swamped by the inherent randomness to the sport. If this were not true it would be a lot easier to identify clutch performers and predict them going forward.

I've played simulation games for a long time and I know that chronic unclutchness, the kind that A-Rod's playoff career would exhibit if you took away the good parts, can happen even when the "players" are only bits of computer code. One of these days I'll have to compile the full playoff stats for David Lefevre. He pitched for me for 17 years, won 4 Cy Young awards, and probably made about 20 playoff starts with maybe one win to show for it.

If you saw the exact record happen in real life you would think that there had to be something wrong in his head, some reason why such a good pitcher always chokes when it matters. But he's not real, in this case it truly was 100% random.
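The simulation-game point can be sketched in a few lines. This is not the game described above, just a toy model under loud assumptions: every pitcher has the same fixed 40% chance of a win in each playoff start (a made-up figure), starts are independent, and there is no psychology anywhere in the code. The function names are hypothetical.

```python
import math
import random


def prob_at_most(starts: int, win_prob: float, max_wins: int) -> float:
    """Exact binomial probability of max_wins or fewer wins in `starts` tries."""
    return sum(
        math.comb(starts, k) * win_prob**k * (1.0 - win_prob) ** (starts - k)
        for k in range(max_wins + 1)
    )


def count_unlucky(num_pitchers: int, starts: int, win_prob: float,
                  max_wins: int, seed: int = 0) -> int:
    """Simulate identical pitchers; count those with max_wins or fewer wins."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    unlucky = 0
    for _ in range(num_pitchers):
        wins = sum(1 for _ in range(starts) if rng.random() < win_prob)
        if wins <= max_wins:
            unlucky += 1
    return unlucky


# Hypothetical inputs: 20 playoff starts, 40% win chance per start.
STARTS, WIN_PROB, MAX_WINS = 20, 0.40, 1

p = prob_at_most(STARTS, WIN_PROB, MAX_WINS)
print(f"Chance of {MAX_WINS} or fewer wins in {STARTS} starts: {p:.5f}")
print("Unlucky pitchers out of 20000 simulated:",
      count_unlucky(20000, STARTS, WIN_PROB, MAX_WINS))
```

With these inputs the exact probability is small, on the order of 1 in 2,000, but across enough simulated careers some purely random pitcher will still end up with a record like that.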
   196. PreservedFish Posted: November 21, 2017 at 01:58 PM (#5579535)
I want to make a distinction that is both irrelevant and true. Clutch is unknowable and in all but the most extreme cases can be entirely ignored. But at the same time, clutch absolutely IS a real characteristic of professional athletes. It cannot be otherwise.

We can and should act as if clutch did not exist. But that's just a useful lie, because we know that it must exist.
   197. jmurph Posted: November 21, 2017 at 02:08 PM (#5579545)
What seems obvious to me is that whatever changes in a player's response to pressure are happening, these changes are swamped by the inherent randomness to the sport.

Great point and well said. It's easy to imagine, to make up an example, a guy pinch-hitting in a late situation in the World Series who is not up for the moment, but who is bailed out by a reliever pitching for the 4th time in 5 games being gassed and unable to find the plate, or getting bailed out by the ump on a couple of missed borderline pitches. Or any number of other countless similar situations.
   198. What did Billy Ripken have against ElRoy Face? Posted: November 21, 2017 at 02:09 PM (#5579549)
What seems obvious to me is that whatever changes in a player's response to pressure are happening, these changes are swamped by the inherent randomness to the sport.

Great point and well said. It's easy to imagine, to make up an example, a guy pinch-hitting in a late situation in the World Series who is not up for the moment, but who is bailed out by a reliever pitching for the 4th time in 5 games being gassed and unable to find the plate, or getting bailed out by the ump on a couple of missed borderline pitches. Or any number of other countless similar situations.

Bingo. This is what I was alluding to in the middle part of 180, but better said.
   199. Blanks for Nothing, Larvell Posted: November 21, 2017 at 02:13 PM (#5579555)
We can and should act as if clutch did not exist.


Absolutely. It isn't worth the time and effort to ferret out, or certainly to try to value.

Let's say we could see A-Rod's brain waves and we knew he was the most choketastic centaur on earth and that he'd hit a known .150/.200/.275 in every postseason. How much less would you pay him? Any?

But that's just a useful lie, because we know that it must exist.


Yep.
   200. snapper (history's 42nd greatest monster) Posted: November 21, 2017 at 02:28 PM (#5579568)
I want to make a distinction that is both irrelevant and true. Clutch is unknowable and in all but the most extreme cases can be entirely ignored. But at the same time, clutch absolutely IS a real characteristic of professional athletes. It cannot be otherwise.

We can and should act as if clutch did not exist. But that's just a useful lie, because we know that it must exist.


If it is so small an effect that we can act like it doesn't exist, then for all intents and purposes, it doesn't exist.