
Primate Studies - Where BTF's Members Investigate the Grand Old Game

## Sunday, February 29, 2004

## DIPS Revisited

What does play-by-play data tell us about DIPS?

Several years ago, Voros McCracken published a 2-part thesis, entitled Defense Independent Pitching Stats (DIPS). In it, he introduced two important concepts in evaluating baseball performance. One, that a player's (in this case, a pitcher's) sample performance does not necessarily tell you a lot about his true talent, and thus his likely future performance. And two, that different components of a player's performance do not necessarily convey the same quality of information about that player's true talent with regard to that component.
Voros found that a pitcher's batting average on balls in play ($H or BABIP), defined as non-HR hits per non-HR balls in play, conveys quite a bit less information about a pitcher's talent than does, say, his HR, BB, or SO rate. He concluded, and correctly so, that a pitcher's sample BABIP is not a particularly good predictor of his future BABIP. He found that the correlation between a pitcher's one year BABIP and his next year BABIP was low, as compared to that of his HR, BB, and SO rates. In fact, when Voros ran a year-to-year (y-t-y) linear regression of all pitchers who had at least 162 IP in 1998 and 1999, he got the following correlation coefficients.
BB = .681
$SO = .792
$HR = .505
$H = .153
Voros' conclusion, based upon the .153 correlation coefficient above, was "that the ability for pitchers to prevent balls in play doesn't exist, or if it does, it doesn't really amount to much for almost all pitchers." Unfortunately, Voros did not adequately explain exactly what these correlations mean, the relationship between sample size and correlation, and how team defense and a pitcher's home park factor into the equation.
In baseball, when you correlate a measure of a player's performance from one time period to another (it could be from one year to another but it doesn't have to be), the resultant correlation coefficient simply indicates how well the measurement in the first time period is useful in predicting the same measurement in the second time period. While the relatively low correlation for $H in Voros' regression suggests that a pitcher's BABIP in one year is not very useful in predicting his BABIP in the next year, it
For one thing, if all pitchers in the major leagues, particularly those with most of the IP, had around the same ability to prevent hits on balls in play, then the y-t-y correlations for BABIP would be near zero, even though preventing hits on balls in play may
Let's review the three things that can affect the y-t-y correlations, thus the
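In fact, the size of the correlation coefficient is a direct function of the spread of talent in the population and the size of the samples in each data element. If there is any skill whatsoever related to the performance measure we are sampling and there is any spread of true talent in the population with regard to that skill, then as the sample size gets larger, the correlation always approaches 1. The converse of that is also true. Regardless of how much skill and what the spread of talent is, as the sample size gets smaller, the correlation always approaches zero.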
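The relationship between sample size, spread of talent, and the y-t-y correlation can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method used for this article: the league mean of .290, the talent standard deviation of .010, and all function names are hypothetical, and each season's sampling noise is approximated with a normal distribution rather than drawn ball by ball.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def observed_rate(true_rate, n_bip, rng):
    """One season's observed BABIP: true rate plus binomial sampling noise
    (normal approximation to the binomial)."""
    noise_sd = math.sqrt(true_rate * (1 - true_rate) / n_bip)
    return rng.gauss(true_rate, noise_sd)


def simulate_yty_r(n_bip, talent_sd, n_pitchers=2000, mean_rate=0.290, seed=1):
    """Year-to-year correlation for pitchers whose true talent is fixed
    across both seasons. mean_rate and talent_sd are assumed values."""
    rng = random.Random(seed)
    talents = [rng.gauss(mean_rate, talent_sd) for _ in range(n_pitchers)]
    year1 = [observed_rate(t, n_bip, rng) for t in talents]
    year2 = [observed_rate(t, n_bip, rng) for t in talents]
    return pearson_r(year1, year2)


if __name__ == "__main__":
    # Same spread of talent every time; only the BIP per season changes.
    for n_bip in (50, 500, 5000, 500000):
        print(n_bip, round(simulate_yty_r(n_bip, talent_sd=0.010), 3))
```

With the spread of talent held fixed, the simulated correlation is small at 50 BIP per season and climbs toward 1 as the number of BIP per season grows, which is the behavior described above.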
In Voros' regression, the sample size of each data element was fairly large (a minimum of 162 IP, or around 500 BIP), such that a correlation of .153 does indeed suggest that there is little skill or spread of talent in the population with respect to BABIP. Keep in mind though, that if Voros were able to sample pitchers who had millions of IP in each of two different time periods, the correlation would indeed be close to 1, if there were
Finally, the third influence on the y-t-y correlations is the amount by which a player's true talent may change from one year to the next. The more that a player's true talent changes from year to year, in a random or semi-random fashion, the smaller the y-t-y correlations will be. That effect can be mitigated by doing in-season regressions, like
Now, because regressions and correlations do not imply
In revisiting Voros' thesis, the first thing I did was to essentially duplicate his regression analysis. Rather than using an estimate of a pitcher's BABIP from his traditional stat line, I used play-by-play (PBP) data to record his exact hits per BIP (BABIP). I ignored all balls in play that were bunted and all foul balls. I also removed all games in Colorado from the database.
I regressed all pitchers who had at least 300 BIP in back-to-back years from 1992 to 2003. There were 312 data pairs. Each data pair was independent (I regressed 1992 on 1993, 1994 on 1995, etc.) and the average number of BIP in each data pair was around 520. This corresponds to around 168 IP, a slightly smaller sample size than in Voros' regression - thus we would expect to get smaller correlations, everything else being equal.
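One way to make "smaller samples give smaller correlations, everything else being equal" concrete is the standard signal-to-noise formula for a rate statistic. The sketch below assumes that true talent does not change between the two years, that the only noise is binomial, and that team and park effects are absent; the league mean of .290 and the function names are hypothetical and not taken from the article.

```python
import math


def expected_yty_r(talent_sd, n_bip, mean_rate=0.290):
    """Expected year-to-year correlation when both seasons have n_bip balls
    in play: talent variance / (talent variance + sampling variance)."""
    talent_var = talent_sd ** 2
    noise_var = mean_rate * (1 - mean_rate) / n_bip
    return talent_var / (talent_var + noise_var)


def implied_talent_sd(r, n_bip, mean_rate=0.290):
    """Invert the formula above: the spread of true talent that would
    produce correlation r at n_bip balls in play per season."""
    noise_var = mean_rate * (1 - mean_rate) / n_bip
    return math.sqrt(r / (1 - r) * noise_var)


if __name__ == "__main__":
    # Same assumed talent spread, different sample sizes.
    for n_bip in (300, 520, 700):
        print(n_bip, round(expected_yty_r(0.010, n_bip), 3))
    # Talent spread implied by r = .153 at roughly 500 BIP per season.
    print(round(implied_talent_sd(0.153, 500), 4))
```

Under these simplifying assumptions, a correlation of .153 at around 500 BIP corresponds to a standard deviation of true BABIP talent of a little under .009. Team defense, park, and year-to-year changes in talent are all ignored here, so the figure is only a rough guide.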
Here is the result:
That is almost exactly what Voros got, considering that my sample size (for BIP) was smaller than Voros'.
Remember, however, that I said that defense and park factors could create a
Here is the result for only those pitchers who switched teams:
That's right - once we take defense and home park out of the equation, there appears to be almost no skill in a pitcher's ability to prevent hits on balls in play! Voros was right!
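The logic of comparing pitchers who stayed with those who switched teams can be sketched with a simulation. Everything here is an assumption made for illustration: the team effect (defense plus park) is modeled as a single random offset with a standard deviation of .010, pitchers are given no BABIP skill at all by default, and the function and parameter names are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_team_effect(switch_teams, talent_sd=0.0, team_sd=0.010,
                         n_bip=500, n_pitchers=5000, mean_rate=0.290, seed=7):
    """Year-to-year BABIP correlation when each pitcher's observed rate is
    league mean + pitcher talent + team effect + sampling noise."""
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(n_pitchers):
        talent = rng.gauss(0.0, talent_sd)
        team1 = rng.gauss(0.0, team_sd)
        # A pitcher who switches gets an unrelated team effect in year two.
        team2 = rng.gauss(0.0, team_sd) if switch_teams else team1
        for team, season in ((team1, year1), (team2, year2)):
            true_rate = mean_rate + talent + team
            noise_sd = math.sqrt(true_rate * (1 - true_rate) / n_bip)
            season.append(rng.gauss(true_rate, noise_sd))
    return pearson_r(year1, year2)


if __name__ == "__main__":
    print("same team:", round(simulate_team_effect(switch_teams=False), 3))
    print("switched: ", round(simulate_team_effect(switch_teams=True), 3))
```

With no pitcher skill in the model, the pitchers who stay put still show a positive correlation because they carry the same team effect into both seasons, while the correlation for those who switch sits near zero. Setting `talent_sd` above zero gives the switchers a positive correlation too, though the unrelated team effects add variance to both seasons and pull it somewhat below what it would be with no team effects at all.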
As several people have pointed out subsequent to Voros' original articles, that doesn't necessarily mean that a
As I explained before, no matter how small the sample correlation is for
Getting back to the possibility of certain classes of pitchers having unique hit preventing abilities, it should be clear that fly ball pitchers, on the average, will have a different $H than will ground ball pitchers, since a fly ball has a higher out percentage than a ground ball. In fact, extreme ground ball pitchers have a BABIP of .297 (1992-2003), whereas extreme fly ball pitchers have a BABIP of .281 (
In order to get a better idea as to why a pitcher appears to have little if any control over the outcome of his BIP, I separated BIP into six categories:
1) Line drive through the infield
2) Line drive in the outfield
3) Pop fly on the infield
4) Pop fly in the outfield
5) Fly ball in the outfield
6) Ground ball (no bunts)
These categories are based on the judgment of the play-by-play scorers. To some extent we can expect some of the subtle distinctions to be at least partially based on the outcome (e.g., if a ball in the outfield could reasonably be scored as either a fly ball or a line drive and it is not caught, it is probably more likely to be scored the latter). Also, as you will be able to infer from the following charts, it is possible that there are some severe biases among scorers (assuming that the same scorer tends to score the same team from year to year).
First, here are the percentages of BIP and the hit percentages for the above categories in 2003. The second column is the number of balls in that category divided by the total number of BIP. The third column is the hits per BIP for that category.
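The two columns described above can be computed from play-by-play events with a simple tally. This is a sketch only: the event format (a category label paired with a hit flag) and the sample events are made up for illustration and are not the 2003 data.

```python
from collections import defaultdict


def summarize_bip(events):
    """events: iterable of (category, is_hit) pairs, one per ball in play.

    Returns, per category, its share of all BIP and its hits per BIP.
    """
    counts = defaultdict(int)
    hits = defaultdict(int)
    for category, is_hit in events:
        counts[category] += 1
        if is_hit:
            hits[category] += 1
    total = sum(counts.values())
    return {
        category: {
            "share_of_bip": counts[category] / total,
            "hits_per_bip": hits[category] / counts[category],
        }
        for category in counts
    }


# Made-up events for illustration only.
sample_events = [
    ("GB", False), ("GB", True), ("GB", False),
    ("OF FB", False),
    ("OF LD", True),
]

if __name__ == "__main__":
    for category, row in summarize_bip(sample_events).items():
        print(category, round(row["share_of_bip"], 3), round(row["hits_per_bip"], 3))
```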
Now, here are the y-t-y correlations for these same categories. Again, pitchers had to have at least 300 BIP in back-to-back years to qualify for the regressions. As in the BABIP regressions, there were 107 pitchers who qualified in the 1992-2003 database and who switched teams from year x to year x+1. There were another 389 pitchers who qualified and who did
Again, since we only looked at pitchers who switched teams from year x to year x+1, we have essentially removed the home park and defensive influences from the correlations.
The results are quite interesting. Even though the overall correlation on BABIP (the last entry in the third column) is near zero, there appear to be several components that have some predictive value, and are therefore somewhat within a pitcher's
Not surprisingly, a pitcher's FB and GB as a percentage of his total BIP (essentially his G/F ratio) are very much within a pitcher's control and appear to be relatively stable from year to year. The number of IF pop flies and to some extent OF pop flies, as a percentage of all non-GB BIP, are somewhat a unique function of the pitcher as well. In other words, good pitchers may tend to get more pop flies than bad pitchers, as a percentage of their total non-ground ball balls in play.
Even though IF and OF line drives individually do not correlate well from year to year, the percentage of all line drives a pitcher allows appears to be somewhat within his control as well. So good pitchers may also give up fewer line drives per BIP.
The last column, or the hit percentage correlations, is even more interesting. The only category that a pitcher appears to have any significant control over is his hits per outfield line drive. Essentially what this means is that the better pitchers allow outfield line drives that are easier to catch. In other words, a ground ball is a ground ball, a fly ball is a fly ball, and a line drive to the infield is a line drive to the infield, regardless of the pitcher. On the other hand, all
To summarize the implications of the above chart, even though the y-t-y $H correlation of a 500 BIP pitcher is very small, a pitcher may have a fair amount of control over certain components of those BIP. The regression results suggest that good pitchers give up slightly fewer line drives and slightly more pop flies, as a percentage of their total BIP, and that their line drives hit to the outfield (and perhaps their ground balls) may be
Finally, to get an idea as to how home parks and defense (and perhaps the PBP scorers) affect the above regressions, here is the same chart of correlations, using only those pitchers who played on the same team in year x and year x+1.
As you can see, many of the correlations increase substantially, suggesting that defense, home park, or the PBP scorer, plays a significant role in creating these correlations in the first place. Whereas the increased correlations in the last column are not surprising, considering that defense and home park can have a substantial effect on the hit percentages of the various BIP components, the correlations in the second column
In conclusion, while it appears that Voros was essentially correct in that a pitcher has little control over his BABIP, he was not able to investigate this phenomenon on a more granular level, which requires an analysis of PBP data. Such an analysis suggests that pitchers may have more or less control over various components of their BIP than their overall $H y-t-y correlation would imply. In fact, good pitchers probably tend to give up fewer and softer line drives and easier pop flies than do poorer pitchers.
Mitchel Lichtman
Posted: February 29, 2004 at 05:00 AM | 43 comment(s)

## Reader Comments and Retorts



1. Astro Logical Sign Stealer Posted: February 29, 2004 at 03:10 AM (#614737)

Ted, I think this is a big next step over Tippett's work.

I may come back with questions after I study it a bit more.

"Ted, I think this is a big next step over Tippett's work."

You still have the problem of selective sampling and correlation across a narrow range of performance variation. Pitchers who pitch enough to get 300 BIP in successive seasons may very well be the ones who have the most control over the results from balls in play, and pitchers who have the least control over the results probably won't hang around enough to get to 300 BIP in successive seasons. (300 BIP is usually around 70-80 IP).

The safest conclusion to be drawn is not that pitchers have *no* effect over the results from BIP - but that among the set of major league pitchers the differences are small enough so that we can treat the pitcher's impact as constant.

-- MWE

Mike, I'm not sure what you mean. I think MGL is uncovering the correlations that do exist between pitchers. He's finding the real trends under the noise, despite the sample size issues.

MGL, the more I think about it, the more I'm disturbed by the correlations in your last table. Don't large correlations like that undermine the validity of the data?

Anyway, the conclusions in this article certainly make intuitive sense. As many pitchers have pointed out, pitching is all about disrupting the batter's timing. If you're successful in doing that, it would seem that you'd be more likely to get weakly hit balls, and fewer homeruns as well.

"In fact, good pitchers probably tend to give up fewer and softer line drives and easier pop flies than do poorer pitchers."

That sounds really familiar...pretty much my adjustment for pitching staff quality on ZR about 6 years ago - and one of the big reasons I don't subscribe to "Andruw Jones is a god."

It's some nice work, MGL.

Let's say I'm a pitcher with a true skill for $H of .333. Let's further say that that is a true skill and it is very repeatable. Why would it matter if other pitchers also had a true, repeatable skill for $H of .333? Wouldn't the correlation then be very strong for everyone?

For instance, did you see that UZR was quoted in the New York Times today in a Mike Cameron article? No reference to primer, alas.

http://www.nytimes.com/2004/03/01/sports/baseball/01METS.html

"I guess I'm missing something big. If the pitchers have more control over various components of BABIP, where does it go? I.e., what causes this control to drop out of the overall y-t-y correlation?"

Ben, I'm not sure what you mean. The pitchers have better correlations on some components than overall, but they also have worse correlations on some components than overall. For the pitchers who switched teams, there are even two slightly negative correlations. In both sets, ground balls, the second most common type of BIP, have lower correlations than the overall correlations.

I assume the correlations balance out at around the overall BIP correlations. Right?

MGL, great work.

"Let's say I'm a pitcher with a true skill for $H of .333. Let's further say that that is a true skill and it is very repeatable. Why would it matter if other pitchers also had a true, repeatable skill for $H of .333? Wouldn't the correlation then be very strong for everyone?"

Bunyon,

Over a sample as short as a year, even if every pitcher has a true skill of .333, there's going to be some variation around that, and it will occur more or less randomly for everybody. So you might see results like this:

The overall correlation from year n to year n+1 in that example is -.115. With a real sample and with more pitchers and years, you'd get a better result.
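To see why a small example like this can land on almost any value, here is a sketch that repeats the experiment many times. It assumes every pitcher has exactly the same true $H of .333; the league size of 10 pitchers, the 300 BIP per season, and the function names are hypothetical choices, not the figures behind the example above.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 if either list has no variation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def season_rate(true_rate, n_bip, rng):
    """Observed hits per BIP for one season, drawn ball by ball."""
    hits = sum(1 for _ in range(n_bip) if rng.random() < true_rate)
    return hits / n_bip


def yty_r_samples(n_leagues=200, n_pitchers=10, n_bip=300,
                  true_rate=0.333, seed=3):
    """Year-to-year correlations from many small leagues in which every
    pitcher shares the same true rate."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_leagues):
        year1 = [season_rate(true_rate, n_bip, rng) for _ in range(n_pitchers)]
        year2 = [season_rate(true_rate, n_bip, rng) for _ in range(n_pitchers)]
        results.append(pearson_r(year1, year2))
    return results


if __name__ == "__main__":
    rs = yty_r_samples()
    print("mean r:", round(sum(rs) / len(rs), 3))
    print("lowest and highest r:", round(min(rs), 3), round(max(rs), 3))
```

Individual runs scatter widely in both directions, but they average out near zero: when everyone has the same true skill, one season's luck says nothing about the next season's.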

Pop flies are defined as "fly balls that don't travel 220 feet" in STATS scoring speak.

Duck snorts are problematic, as they will be interchangeably defined as pop flies and line drives. There generally aren't very many, as a percentage of any one pitcher's BIP.

Studes, do you think the article you cite *should* mention Primer?

Chris, I don't think the Times should reference primer. After all, MGL has basically made his work public, with no copyright protection that I've seen. But it would be nice, particularly for readers who might want to learn more.

*everything* written down is copyrighted. It's implied (or actually, it's law). Everything that you write is copyrighted by you nowadays.

The answer is, yes, the writer should reference where he saw it. He doesn't have to say "MGL", he can link or say "at Baseball Primer" or whatever - it helps his reader for starters.

"Mike, I'm not sure what you mean. I think MGL is uncovering the correlations that do exist between pitchers. He's finding the real trends under the noise, despite the sample size issues."

Suppose that pitchers really do have an ability to control the percentage of hits on balls in play - IOW, there is a level of "true talent" $H. Assume for the moment that, in order to pitch at all in the majors, you have to have a "true talent" $H of no more than .340. Over 300 BIP (which as I indicated is usually between 70-80 IP), this would mean that you'd allow 102 hits in 70-80 IP, exclusive of HR - somewhere around 11-13 non-HR hits per nine innings, which is about as many as a team can likely tolerate. To pitch well enough to make it through two consecutive seasons of 300+ BIP, you'd have to do better than that, in most if not all cases - a .300 level would be 90 non-HR hits over those 70-80 innings, which your team might be able to live with. The best pitchers usually average somewhere around .270 $H.
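The arithmetic in this comment can be checked with a few lines. The function names are made up for illustration; the inputs ($H of .340 and .300, 300 BIP, 70-80 IP) are the ones quoted above.

```python
def non_hr_hits_allowed(babip, bip):
    """Non-HR hits implied by a hit rate on balls in play."""
    return babip * bip


def hits_per_nine(hits, innings):
    """Scale a hit total to a nine-inning rate."""
    return hits / innings * 9.0


if __name__ == "__main__":
    for babip in (0.340, 0.300):
        hits = non_hr_hits_allowed(babip, 300)
        # 80 IP gives the low end of the range, 70 IP the high end.
        low = hits_per_nine(hits, 80)
        high = hits_per_nine(hits, 70)
        print(babip, round(hits), round(low, 1), round(high, 1))
```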

Now, if you select pitchers with 300+ BIP in back-to-back seasons, what you are effectively doing is selecting pitchers from the range .270-.300, rather than the .270-.340 range that is typical of all major league pitchers - and whether you realize it or not, you're removing a large group of pitchers from the study who operate at the margins of major league performance. Those pitchers might very well exhibit different effects from the ones that you are studying, because they aren't *as good* as the ones that you are studying, and it's conceivable that the reason they aren't as good is that they *don't* control H/BIP as well, for reasons which won't show up in a study limited to just good pitchers.

I know MGL understands all of this, so this comment wasn't really directed at him. The point I wanted to make is that correlation analysis needs to be taken with a grain of salt when applied across a restricted range of performance, especially when attempting to measure an aspect of skill that could - if it existed - directly impact that performance.

-- MWE

But what about MGL's conclusions? Specifically:

"To summarize the implications of the above chart, even though the y-t-y $H correlation of a 500 BIP pitcher is very small, a pitcher may have a fair amount of control over certain components of those BIP."

Seems to me that MGL has uncovered some interesting findings even within his "restricted range of performance". Do you see it differently?

Chris, you were testing me, right? Thanks for the info. I just might send the guy an e-mail.

The question is: how can we detect this, and how much of this can we detect.

A lot of this we just won't be able to detect.

It's apparent that it's a lot easier to detect this based on the rate of line drives given up.

"Seems to me that MGL has uncovered some interesting findings even within his 'restricted range of performance'. Do you see it differently?"

Is the correlation effect on hits on OF LDs for good pitchers applicable across the range of *all* pitchers? A correlation in a subgroup is not necessarily indicative of a correlation across the entire group, especially if the distribution of the performance being measured is heteroskedastic; there may easily be a subgroup in which a significant correlation is detected that can't be detected in the population at large, because the variance of the statistic is conditioned on the performance level.

That doesn't mean that the detected correlation is not still valuable; as Chris Dial suggests, the ability to generate *catchable* line drives might be a discriminant between good pitchers and bad pitchers. But there's a lot more investigation to do across the entire range of pitching performance before one can be assured that's the correct conclusion to draw.

-- MWE

For example: say Greg Maddux allowed 100 ground balls into zone 5 last year, with 30 going for hits and 70 being turned into outs. Suppose the average GB-out rate for this zone was 60%. Maddux would have been expected to give up 40 hits on ground balls to zone 5. Then from ZR, we can see that between Castilla and Furcal, they saved Maddux 8 hits on ground balls to zone 5. Subtracting out the effects of the defense, we might see that Maddux allowed 2 fewer hits than expected on ground balls to zone 5. Do this over a number of years, and maybe we can observe a trend. Maybe Maddux is able to prevent an average of 5 hits per year on ground balls to zone 5, independent of defense. If he can, then maybe we have quantified his skill at getting right-handed batters to pull the ball weakly to the third baseman.
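The bookkeeping in this example can be written out as a short function. This is a sketch of the commenter's proposal only: the function name and parameters are hypothetical, and the numbers are the illustrative ones from the comment, not actual zone data.

```python
def pitcher_hits_prevented(balls_in_zone, hits_allowed,
                           league_out_rate, hits_saved_by_fielders):
    """Hits prevented in one zone that are left over for the pitcher
    after the fielders' contribution is subtracted."""
    expected_hits = balls_in_zone * (1.0 - league_out_rate)
    total_hits_prevented = expected_hits - hits_allowed
    return total_hits_prevented - hits_saved_by_fielders


if __name__ == "__main__":
    # 100 ground balls to zone 5, 30 hits, 60% league out rate,
    # 8 hits saved by the fielders.
    print(round(pitcher_hits_prevented(100, 30, 0.60, 8), 6))
```

The result matches the 2 hits in the example: 40 expected hits, minus 30 allowed, minus the 8 credited to the defense.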

I'm just throwing this out there off the top of my head. Does this seem justifiable?

Thanks for the great article. Although I'm undoubtedly biased, it seems to support DRA:

"The number of IF pop flies and to some extent OF pop flies, as a percentage of all non-GB BIP, are somewhat a unique function of the pitcher as well. In other words, good pitchers may tend to get more pop files than bad pitchers, as a percentage of their total non-ground ball balls in play."

As you may recall, DRA allocates estimated infield pop-outs to fielders, and I speculated in the article that pop-ups might be the key variable (when zone data is unavailable) for determining a pitcher's effect on BABIP.

If I'm understanding your article correctly:

1) Infield pop-ups have the highest out-conversion rate of any category of BIP: 96%. As mentioned in the DRA article, if a pitcher generates a BIP that is *never* fielded (a HR), we charge the pitcher. If a pitcher generates a BIP that is almost *always* fielded (an infield pop-up), shouldn't the pitcher get credit?

2) With the exception of the overall tendency to give up either ground balls or outfield fly balls, the tendency to give up infield pop-ups is more persistent than the tendency to give up other forms of BIP. So generating infield pop-ups seems to be, to some degree at least, a skill. Query whether, given the relative rarity of infield pop-ups (at least compared with the category of all ground balls and all outfield fly balls), using larger samples of data (two-year v. two-year?) would show a higher "r" for infield pop-up generation.

The most surprising result seems to be that the rate of giving up line drives is not that persistent (.14 "r", at least for pitchers who change teams, which I believe is the best data set to use), but *given* the allowance of a line drive, differing out-conversion rates *are* persistent. Perhaps AED is right:

"If there were systematic differences between what is classified as a line drive, you would expect to see significantly lower r's for the players who changed teams. So it seems more likely that you're getting hammered by sample size or selective sampling . . . ."

On the other hand, it might be interesting to determine whether there is a *cross*-correlation between generating infield pop-ups and allowing line drives that are less likely to be hits. Throughout the development of DRA, I kept finding that the run-weight for infield fly outs was a little higher than it "should" be (compared to other BIP outs), but that was probably because infield fly outs correlate with *outfield* fly outs (which have higher out conversion than ground balls) and may correlate with "easier to field" line drives.

Thanks again for a very helpful study.

"As you may recall, DRA allocates estimated infield pop-outs to *pitchers*."

If that's right then the two intervals don't look that different to me, there's lots of overlap. So I would argue that taking park and defense out of the equation just gets us fewer data points.

But I'm a complete novice when it comes to using these types of statistics, so sorry if this point is completely wrong.

No such thing as standard deviation or confidence intervals for correlations.

In thinking about this:

"Even if something is indeed a skill, if there is little or no spread in true talent with respect to that skill, in the population from which we are sampling (major leagues pitchers), then the y-t-y correlation for any sample size will be close to zero."This is a pretty important point. My first reaction was that you'd have to argue that the variance of this skill is very close to 0 for this to matter. Then, I realized that this is actualy very close to the current thinking about major league pitching (that most have no influence on batting average on balls in play). So, if you believe this, then the expected y-t-y correlation is zero. If you don't believe it, the expected correlation would be some value other than zero.

There's a few points in here I didn't understand. One is this:

"In fact, the size of the correlation coefficient is a direct function of the spread of talent in the population and the size of the samples in each data element. If there is any skill whatsoever related to the performance measure we are sampling and there is any spread of true talent in the population with regard to that skill, then as the sample size get larger, the correlation always approaches 1. The converse of that is also true. Regardless of how much skill and what the spread of talent is, as the sample size gets smaller, the correlation always approaches zero."Why should the correlations be expected to approach 0 or 1.00 as the sample sizes get larger? Shouldn't the approach the "true" population value? This statement seems to assume that there are either perfectly linear or oblique relationships and nothing else.

But correlation and (simple) regression are the same thing. In simple linear regression, the beta weight for standardized scores is also the correlation.

See the home page for a site that calculates confidence intervals.

"Even if the pitcher has a specific skill at generating various types of near certain outs that doesn't mean we necessarily would credit the pitcher. Imagine if we had even more granular data to work with that included not just the zone, but the precise speed of all BIP. Imagine further that Carlos Zambrano has a skill at inducing slow hit groundballs to the left side of the infield. His sinker is fast enough so that those pitchs usually aren't pulled directly down the line. So if 99% balls hit in the SS's zone hit between 55-80 MPH are converted into outs should the pitcher get credit?"

That is precisely what would happen under a UZR system that also provided a P[itcher] Zone Rating. David Pinto's Probabilistic Model for Range does something similar.

"If inducing pop flies of either variety is a skill, we should take that ability into account when evaluating pitchers, much the same way we take GB/FB ratio into account. That doesn't mean pitchers are contributing to pop flies being converted into outs. The H% r itself seems to suggest that it's the infielders doing the work. Just because OF fly balls are almost always converted into outs (86.5%) doesn't mean that outfielders shouldn't get credit for their defensive contribution."

Simplifying somewhat, under a PZR/UZR system, the pitcher would get almost all of the credit for the out; the fielder only a tiny amount. DRA is a non-zone system that allocates responsibility for BIP to fielders, except that it allocates responsibility for infield fly outs to pitchers. Doing so results in an allocation of pitcher and fielder credit for BIP that comes close to matching the ratio found under the Allen/Hsu "Solving DIPS" paper, as well as fielder ratings that match up well with UZR.

(.050, .222) for the full data set, and (-.154, .224) for the pitchers that moved.

So I think taking park and defense out of the equation doesn't give you anything new. It just gives you wider confidence intervals.

Not to dwell on this point, but I was under the impression that the standard deviation reported was for the pitcher's $H, not for the correlation coefficient. If this is true, then this SD can't be used to create confidence intervals for the correlation coefficient, can it?

The sampling distribution of r is not normal, so you could not use the standard deviation to build a confidence interval.

To build the confidence intervals, you convert r to Z' and add/subtract a weight based on alpha times a multiplier (normally a standard error). I THINK it must be the case that Z' has a distribution that is determined by the sample size, hence the need to only know N to calculate the confidence intervals. But, I could be wrong.
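The procedure described here is the Fisher z-transformation, and a sketch of it is short. The function name is made up for illustration, the 1.96 multiplier assumes a 95% interval, and the r of 0.0 in the usage line is a placeholder rather than a result from the article. The n of 107 is the number of team-switching pitchers the article reports.

```python
import math


def fisher_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation r
    computed from n data pairs, using the Fisher z-transformation."""
    z = math.atanh(r)                     # transform r to z'
    standard_error = 1.0 / math.sqrt(n - 3)
    lower = math.tanh(z - z_crit * standard_error)
    upper = math.tanh(z + z_crit * standard_error)
    return lower, upper


if __name__ == "__main__":
    low, high = fisher_confidence_interval(r=0.0, n=107)
    print(round(low, 3), round(high, 3))
```

The standard error of z' depends only on the number of pairs, which is why knowing N is enough to build the interval. With about a hundred pairs the interval is wide in either direction.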

Can anyone explain why you wouldn't use overlapping data pairs (i.e. 1992 on 1993, 1993 on 1994, etc.)? I think this is what Tippett did. You would nearly double your sample size, and perhaps pick up on some weird every-other-year patterns (e.g. Saberhagen) that you would otherwise miss. Would this introduce some other selection bias?

I would be happy to send you the data when I get back home. Anything I can do to help a fellow researcher...

"So cell C3 says that a pitcher has some control over whether a line drive becomes a hit or not, but then B3 says that he has little control over whether his BIPs become OF line drives in the first place. How can you control the second without first controlling the first?"

Cell C3 indicates that there is a good y-t-y correlation for these pitchers in the rate at which outfield line drives become hits; it says very little about reasons why that might be true. It can be true, for example, if the rate at which outfield line drives are caught for outs varies little from team to team or pitcher to pitcher (IOW, if a high percentage of outfield line drives are inherently uncatchable). We need to keep open the possibility that there may be effects other than that of the pitcher which could lead to the correlations that we see. That's why range of performance - or, alternately, performance variance, which essentially tells you the same thing - across the data set is so important to know. If the entire range of performance within the data set varies from, say 26% of OF LD caught for outs to 28% of OF LD caught for outs, I'd have no problem concluding that pitchers have little influence over whether or not OF LD became outs, even if there were a strong y-t-y correlation within that group of selected pitchers.

-- MWE

If we only look at pitchers who play for the same team in the regression "pairs," even if pitchers had no control over any of their BIP components, clearly we might (probably will) see some y-t-y correlations, due to good or bad defense, park effects, etc. Those are the correlations I am trying to "factor out."

Now, if we only look at pitchers who changed teams, we have no "systematic bias" because the assumption is that for a large sample, the average park effects and defense of the new team is league average. Therefore if pitchers had no control over any of the BIP components, the correlations for pitchers who changed teams would have to be zero, as there would be nothing "connecting" their BIP rates from one year to the next, which is essentially the definition of correlation (a "connection," loosely speaking).

If pitchers do have some control over some of their BIP components, then that control, as measured by the correlation coefficient, should survive the "changing of teams," since the variation in y-t-y rates as a result of the new defense and park is random (no bias).

As far as how introducing another variable (even if it isn't biased) affects the magnitude of the correlation coefficients, I don't know. I would have to defer to the statisticians for that answer. I could simulate the effect and come up with an answer I suppose. IOW, if the true y-t-y correlation for, say OF line drive hit percentage, were .200, and we then introduced some noise (changing teams and therefore changing defense and parks, which presumably affects the true OF line drive hit percentage independent of the pitcher), would the correlation change? I don't think so, but I am not sure.

In any case, changing of teams cannot create a correlation where none existed or even increase a correlation where one did exist, since where any given pitcher goes is presumably random, at least as far as his first year BIP rates. If there is no "connection" between his BIP rates in his first year, and where he goes, then there is by definition no bias and there can be no increase in the correlations.

As far as the ROE's on ground balls, there is probably not an extra bias associated with ROE's, so adding it into the mix shouldn't change anything, although since many more ROE's are on GB's to the left side of the infield, we might see some "extra" correlation (higher coefficient) based on a pitcher's handedness. Since we aren't really interested in that kind of "control," it is probably a good thing NOT to include ROE's in the ground balls...

With all respect, I think that you are completely wrong, but not being a statistician, I would have to defer to someone like AED if he is still lurking.

If I get a chance, I'll do some "correlation sims" which should clear up the issue...

"Each data pair was independent (I regressed 1992 on 1993, 1994 on 1995, etc.)"

These data pairs are not independent, since it's still the same person pitching in 1992 and 1994. Let's say I'm taking your blood pressure once a year -- you would not say that your blood pressure in 1994 is independent of 1992. There are ways to adjust for this data structure, where you have repeated measurements on the same person, in linear regression. I'd have to pull out my school notes to give you the details, but basically you come up with some assumed correlation matrix that describes where correlations are in the data, then the regression procedure uses that information in its calculations. Probably only affects the standard errors of your calculations, though.

"Now, how do you know that, say, I'm not a professional statistician myself, with a post-graduate degree in that field?"

Come on, that's a silly question.

"Could you point out what exactly was wrong in my arguments (other than a couple of typos)? I'll give one more analogy, and I would appreciate it if you would think about it instead of blindly saying it's wrong."

Honestly, I haven't had the time to go through your premise. On the other hand, I did take lots of time to think through my thesis when I wrote the article (several months ago).

As I said, I *think* you are wrong, but I'm not sure. I did say that I *think* you are wrong, not that I was *sure* you are wrong, didn't I? There is a big difference (from my perspective).

I am no statistician either but I do know a fair amount about regressions and correlations, and have done a fair amount of work in baseball analysis using those "tools."

However, I'm not sure I am capable of addressing your specific concerns, so I apologize for just saying that "I think you are wrong," as that is not particularly helpful. As I said, I can probably write a simple sim or two that would tell us how changing parks (park effects and defense) affects a correlation, whether or not there is one in the first place.

I'll try and do that tonight....
