Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

1. Russ
Posted: April 25, 2017 at 12:02 PM (#5442097)

I get it, but I don't get it. The Appling article is necessary reading to even begin to grasp what Bill is doing here. My basic understanding of what he is doing is the following.

1. First choose a counting stat (counting is easier for my explanation). Let's use triples (since that's where he starts).

2. Build a big table that has a row for each player in his study population and a column for each age in the study.

3. Now build a model that computes what you would *expect* to see in each row, if age had no effect on the distribution of triples over columns, *conditional* on the total number of triples for a player and the distribution of playing time for a player over their career (by counting plate appearances per season).

4. Now sum *down* the columns to see if you can find the average deviation from expectation across all players. James then (seemingly) uses this to find a pattern of average aging across levels.
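A minimal sketch of steps 1-4 as described above, in Python. The player names, ages, and counts are made-up assumptions for illustration, not data from the study, and the function name `age_index` is hypothetical. The expectation for each cell is the player's career triples rate times his plate appearances at that age (that is, no aging), and the columns are then summed before taking observed over expected.

```python
# Hypothetical toy data: {player: {age: (plate_appearances, triples)}}.
# None of these numbers come from the study being discussed.
careers = {
    "player_a": {21: (400, 8), 22: (600, 9), 23: (600, 7)},
    "player_b": {22: (300, 2), 23: (500, 2)},
}

def age_index(careers):
    """Observed triples at each age relative to a no-aging expectation (100 = as expected)."""
    observed = {}
    expected = {}
    for seasons in careers.values():
        # Career rate is the "no aging" baseline for this player.
        career_pa = sum(pa for pa, _ in seasons.values())
        career_triples = sum(t for _, t in seasons.values())
        career_rate = career_triples / career_pa
        for age, (pa, triples) in seasons.items():
            # Sum down the age column across players.
            observed[age] = observed.get(age, 0) + triples
            expected[age] = expected.get(age, 0.0) + career_rate * pa
    return {age: 100 * observed[age] / expected[age] for age in sorted(observed)}

index = age_index(careers)
for age, value in index.items():
    print(age, round(value, 1))
```

Because the column sums are in triples and plate appearances, a player with more PA at a given age moves that age's index more, which is the weighting issue raised in (i) below.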

My read on this approach is as follows. This is a super clever way to adjust for the superstar effect that James saw in the raw data. This is the kind of stuff that he just intuits and that takes me literally years to teach my graduate students. However, there are a couple of drawbacks I see in this approach.

(i) Better players will receive more plate appearances. By summing down the columns (rather than averaging down the columns), James is giving more weight to players with more plate appearances. Of course, there are more data for those players, so maybe you would want to do that for precision reasons, but the danger is that you end up with a curve that better represents the aging curve for better players (more PA) compared to worse players (fewer PA).

(ii) The events across plate appearances are not independent. If you get a triple, you can't get a double or a home run (or an out). Although James uses a "+" style coding of the expectation (i.e. taking amount per age relative to expected value for that age), I think that hides the dependence amongst the counting stats. I think it would be more interpretable to look at how things change as a percentage of plate appearances. The excess number of triples per total plate appearances could be compared to the possible deficit in the number of home runs per total plate appearances for 21-year-olds, for example. One of my very good friends from grad school (from New Zealand no less!) made the very astute observation that the aging of baseball players can often be seen in how doubles and triples get shifted over to home runs as the player ages. Fewer doubles isn't bad if you're hitting more home runs. But if you have fewer doubles and NOT more home runs, that's a bad sign.

2. villageidiom
Posted: April 25, 2017 at 12:11 PM (#5442103)

Young hitters are asked to bunt more often. You would expect that to be true, but the increase in Sacrifice Flies as players age is very interesting. One might expect veteran players to have more sacrifice fly opportunities because they are used in more RBI situations, but the increase in Sac Flies as players age is far too large to result from RBI opportunities.

Let me try to explain. Getting a fly ball when a fly ball is needed is a subset of a clutch skill, is it not? If you can do that, you’re a clutch player. A traditional insider, presented with this data, would say "Of course". Of course a veteran player knows how to get the ball in the air in a situation where a fly ball means a run. Anyone would know that. But 99% of the old baseball accepted wisdom of this type turns out on examination not to be true or not to be meaningfully true, such as clutch hitting in general. I’m not surprised that veteran players exceed expected sacrifice flies by 1% or 2%; that could be explained by player usage patterns and some small skill differential. But I am VERY surprised that they exceed expectations (relative to career norms) by 5% to 7%. That MAY indicate a clutch ability. It may be that, in another generation, when we have the data organized to do this study, we may find that veteran players increase their batting average in clutch situations by some small amount, which we have been previously unable to document. This may be the largest surprise of the study.

Prior clutch studies would have overlooked this because (a) when we think of "clutch" we're thinking of performance that doesn't result in an out, and (b) we've considered "clutch" to be a skill one has, not a skill one develops.

I went through this years ago here, that in the span of a great career a player might have only 600 "clutch" opportunities, and to evaluate whether we've properly identified clutch skills we might need to set aside 200 of them for validation, leaving 400. And if we are evaluating a player whose career is in progress maybe we have 120 such opportunities. Now taking what might be implied in this study, we can see that those 120 opportunities might represent a period in which the player hasn't yet fully developed those skills. And even if we did, would we have counted a sac fly as "clutch"? Probably not; we'd look at their OPS or maybe OBP in those situations, compared to their stats in non-clutch situations. To OBP and mostly to OPS, a sac fly is the same as any other out.

I'm not sold yet that this is proving clutch ability. But prior studies were designed to miss something like this. So what else have we overlooked?

I think any study of this type has to take a year-by-year survival function approach.

E.g., for a given population of 25 y.o. players, how many drop out of the sample, then of the remainder, how does their performance evolve from 25 to 26. You do this for every age, and chain the results.

If you don't account for the drop-outs, then you're really modeling only really good players, as Russ suggests above.
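A rough sketch of the year-by-year chaining idea, in Python. The toy careers, the function name `chained_curve`, and the choice to pool survivors' triples and plate appearances when computing each year-over-year change are all assumptions made for illustration; the comment does not specify how the changes should be weighted.

```python
# Hypothetical toy data: {player: {age: (plate_appearances, triples)}}.
careers = {
    "player_a": {25: (500, 10), 26: (500, 8), 27: (500, 6)},
    "player_b": {25: (400, 4), 26: (400, 4)},   # out of the sample after 26
    "player_c": {25: (300, 3)},                 # out of the sample after 25
}

def chained_curve(careers, start_age, end_age):
    """Chain year-over-year rate changes among players present at both ages."""
    curve = {start_age: 1.0}   # level relative to the starting age
    dropouts = {}              # how many players left the sample after each age
    for age in range(start_age, end_age):
        at_age = [s for s in careers.values() if age in s]
        survivors = [s for s in at_age if age + 1 in s]
        dropouts[age] = len(at_age) - len(survivors)
        if not survivors:
            break
        # Compare the same group of players at consecutive ages.
        rate_now = sum(s[age][1] for s in survivors) / sum(s[age][0] for s in survivors)
        rate_next = sum(s[age + 1][1] for s in survivors) / sum(s[age + 1][0] for s in survivors)
        curve[age + 1] = curve[age] * rate_next / rate_now
    return curve, dropouts

curve, dropouts = chained_curve(careers, 25, 27)
print(curve)
print(dropouts)
```

Each link in the chain uses only players observed at both ages, and the drop-out counts are kept alongside so the attrition at each step is visible rather than hidden.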

4. villageidiom
Posted: April 25, 2017 at 12:39 PM (#5442126)

3. Now build a model that computes what you would *expect* to see in each row, if age had no effect on the distribution of triples over columns, *conditional* on the total number of triples for a player and the distribution of playing time for a player over their career (by counting plate appearances per season).

I think it's simpler than that. I think he's just calculating a "triples per PA" rate for each player based on their career stats. No model, just a base expectation that their skills don't age. Then he compares that rate to the same player's actual rate by year. And it's that ratio (actual rate / career rate) that he aggregates.

He does appear to trap for the first drawback you mention, that the averages vary by age because worse players drop out of the sample. The second drawback is, I think, not a primary concern for what he's doing here but an important one to remember should this work be advanced for a different purpose.

If you don't account for the drop-outs, then you're really modeling only really good players, as Russ suggests above.

You guys really should read TFA. Everything you're talking about is what James says right before saying "That's what this article is about."

5. Walt Davis
Posted: April 25, 2017 at 06:23 PM (#5442465)

#2 ... Haven't RTFA so maybe it's covered but ...

(a) a situation where a sac fly is a good outcome is also a case where a GB out will usually score the run too. Sometimes there's a DP possibility and sometimes the IF is in or it's just hit hard right at somebody ... but then many FBs end up short.

(b) older players tend to hit more FB to compensate for their declining speed. Was their regular FB rate controlled for in those seasons? I do recall studies linked here suggesting that there was little/no evidence that SF rates differed from FB rates.

(c) a 5% higher SF rate? The 2016 leader in SFs had 15 in what appears to be 53 opportunities. (This was Lindor BTW, not a vet.) He also had 11 hits, 5 BB and 8 K. His hits produced 17 RBI but it looks like just 1 RBI GB. Anyway, a 5% higher SF rate would be 2.5 more SFs which would be better than Ks or non-scoring GB and much better than a DP. But it's just 2.5 extra RBI over a season that has to be balanced against the chance of the next guy driving in the run. And none of this is necessarily in "clutch" situations.

For the year, Lindor's BABIP was .342 so 11 hits in 40 contacts is not particularly good -- about 2.5 fewer hits than expected. If he is trying to hit the ball in the air and sacrificing 2.5 hits for 2.5 extra SFs, that's of course a terrible trade-off. Lindor for the year hit just over half his contacts into the air; I have no idea how many he hit in the air in these situations. His 11 hits include 2 doubles, a triple and a HR so that would seem to be at least 4 more to go with the 15 SFs. But I see no way to pull out flyball outs. He did hit into two bases-loaded DPs.
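The back-of-envelope numbers above can be checked with a few lines of Python. The inputs are the figures quoted in the comment; reading "5% higher" as five percentage points of opportunities is an assumption about how the roughly 2.5 figure was reached, and treating all 40 contacts as balls in play is a simplification.

```python
# Figures as quoted in the comment (Lindor, 2016, sac-fly opportunities).
opportunities = 53
walks = 5
strikeouts = 8
hits = 11
season_babip = 0.342

# Assumption: "5% higher SF rate" means five percentage points of opportunities.
extra_sf = 0.05 * opportunities

# Plate appearances that ended with the ball being hit.
contacts = opportunities - walks - strikeouts

# Hits expected if he had hit at his season BABIP on those contacts.
expected_hits = season_babip * contacts
shortfall = expected_hits - hits

print(contacts, round(extra_sf, 2), round(expected_hits, 2), round(shortfall, 2))
```

Under those assumptions both the extra sac flies and the hit shortfall come out a little above 2.5, in line with the "about 2.5" used in the comment.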

But sure, it's perfectly reasonable to think that players, especially ones with more experience, would be looking for pitches they can hit in the air in those situations. But it's hardly rocket science and anybody who makes the majors has been playing baseball for 10-15 years before they got there and really should have noticed by then that runners can score on fly balls.

6. Russ
Posted: April 25, 2017 at 07:58 PM (#5442498)

VI, as for your first point, you are part correct, part incorrect. It is true that he pretends that there is no aging. But that yields the same expected rates as what I suggested. I only put it in the context of a 2-way table to illustrate the potential problem with the differential in the plate appearances. As for your other assertion, I don't think you are right. He first calculates the expected number of triples for all players at each age given their individual plate appearances, then he sums those down and compares to the observed number of triples. That is how he gets the 112 in his example.

As for your second point, you are correct. By conditioning on the career rate he does eliminate the superstar survival bias. It doesn't matter if the superstars are the only ones to make it to 42, all that matters is how their age 42 year compared to their base rate over their career. It is a really simple and smart way to address the problem.

If he were averaging the difference between observed and expected rates over players, that would eliminate my perceived weighting problem. But then you introduce a new problem in that averaging many ratios can sometimes be worse in terms of error than taking the ratio of averages. What I think James likes about his approach is that you get such big effective sizes in the numerator and denominator.
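A small Python illustration of the trade-off between the two aggregations. The (observed, expected) pairs are invented for the example, and the function names are hypothetical. It shows how a single extra triple for a low-playing-time player moves the average of per-player ratios far more than it moves the ratio of column sums.

```python
def ratio_of_sums(rows):
    """Sum observed and expected down the column, then divide."""
    return sum(o for o, _ in rows) / sum(e for _, e in rows)

def mean_of_ratios(rows):
    """Divide observed by expected per player, then average the ratios."""
    return sum(o / e for o, e in rows) / len(rows)

# Hypothetical (observed, expected) triples at one age for three players.
base = [(12, 10.0), (1, 0.5), (0, 0.4)]
# Same column, but the second (part-time) player hits one more triple.
bumped = [(12, 10.0), (2, 0.5), (0, 0.4)]

swing_sums = ratio_of_sums(bumped) - ratio_of_sums(base)
swing_means = mean_of_ratios(bumped) - mean_of_ratios(base)

print(round(swing_sums, 3), round(swing_means, 3))
```

The ratio of sums is steadier because its numerator and denominator are large totals, at the cost of weighting players by playing time.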


