Sunday, June 26, 2005
Discussing the Fog
Bill James and Phil Birnbaum discuss research - I stay out of the way with my mouth shut.
Bill James wrote an article in the SABR publication The Baseball Research Journal (Number 33) called “Underestimating the Fog”.
As we discussed here, this is an important piece of work for what James says: “What I am saying in this article is that the fog may be many times more dense than we have been allowing for. Let’s look again; let’s give the fog a little more credit. Let’s not be too sure that we haven’t been missing something important.”
That is a very important reminder that often gets lost in baseball statistical analysis and other fields as well.
However, there is the issue of clutch hitting that James used in “Underestimating the Fog”.
In the latest issue of By the Numbers, the newsletter of SABR’s Statistical Analysis Committee, there were rebuttals to James’ piece: one by Jim Albert, a college prof who wrote a book called Curve Ball that covers many aspects of baseball mathematically, and one by Phil Birnbaum, the chair of the Committee.
As a member of the SABR Statistical Analysis Committee, I subscribe to an email list where researchers post questions regarding research ideas and ask for help, or just discuss another point.
Bill James sent along his response to Jim Albert and Phil Birnbaum entitled “Mapping the Fog”. Birnbaum wrote a response. With permission from both, I am re-printing them here, in an effort to broaden the discussion. If Jim Albert wants to chime in, I’ll provide him the floor as well.
Without further rambling from me, here’s Bill James’ “Mapping the Fog”:
Mapping the Fog
This article has not been copyrighted, and is not intended to benefit from copyright protections. Please feel free to share it with anyone who might be interested.
1. My model
In issue number 33 of the Baseball Research Journal, I published an article entitled “Underestimating the Fog”. The thesis of this article is that we in sabermetrics have been relying on a method which doesn’t actually work, under closer scrutiny, and we should stop relying on this method. “This method” is the practice of attempting to determine whether some characteristic within the game is “real” or a statistical artifact by comparing whether the players who do well in this area in one year also do well in the same performance category the next year, as one would expect them to if the skill under study was “real”. I hope that made sense. . . .I’m a little confused myself, and, speaking of myself, I certainly was not suggesting that other researchers were guilty of this but I wasn’t. I was more guilty than anyone. I had misled the public on a series of issues due to my own failure to think clearly about this one matter, and I felt it was important for me to stand up and take responsibility for that.
Let us take the issue of clutch hitting, which is the most controversial of the many peripheral subjects entangled in the debate. Dick Cramer argued the following in 1977:
1) If clutch hitting really exists, one would expect that the players who were clutch hitters in 1969 would be clutch hitters again in 1970.
I accepted this argument for about a quarter of a century, but eventually it began to trouble me. When it began to trouble me enough, I posed a counter question to myself: is it possible to create a model in which clutch hitting clearly exists, but goes undetected by this type of analysis?
It is, in fact, possible. Let us create a “model league” based on the following assumptions:
In clutch situations, the batting average of the other twenty percent was re-calculated as
Their regular batting average,
Thus, a .280 hitter in non-clutch situations can be a .230 hitter in clutch situations, or a .330 hitter in clutch situations, or anywhere in between, and any one figure is as likely as any other—for those players who did have a “clutch element” in their makeup. The average clutch effect, for those players who have one, is 25 points positive or negative.
You may or may not agree that this model represents a fair test of the clutch thesis. If you agree that it does, end of subject. If you would argue that it does not. …Dick Cramer, in his 1977 article, stated that “I have established clearly that clutch-hitting cannot be an important or general phenomenon.” I would argue that if 20% of the hitters have clutch effects averaging 25 points, that is quite certainly an important and general phenomenon. Further, in several respects, this model exaggerates the impact of clutch hitting, which should make it easier to detect whether or not a clutch hitting ability is an element of the mix. In this league there were 60,000 at bats, which were neatly divided into 600 at bats each for 100 players. In the real American League in 1969—one of the leagues included in Cramer’s study—there were 65,536 at bats, but there were only 25 players who had 550 or more at bats, the rest of the at bats being messily distributed among players who had 350, 170, 80 and 4 at bats. This would make it much easier to detect the presence of clutch hitters in the model than in real life.
In the real leagues studied by Cramer, there were many players who had 520 at bats one year but 25 the next, making those players—and those at bats—essentially useless as a basis for year-to-year comparison. In my model, all 100 players had 600 at bats each year, with no one dropping out or coming in. This, again, would make it vastly easier to have meaningful year-to-year comparisons, in my model, than it would be in real life.
In my model, one-fourth of all at bats are designated as “clutch” at bats. In real life, it seems unlikely that the number of true “clutch” at bats would be that large. In real life, a player probably has 50 or 75 high-pressure at bats in a season. In my model, he had 150. This would make it vastly easier to detect clutch performers in the model than it would be in real life.
In my model, all at bats are cleanly delineated as “clutch” or “non clutch”. In real life, it is extremely difficult to say to what extent any at bat is “clutch” or “non clutch”. Again, this would make it much, much easier to detect the presence of clutch hitters in this model than it would be in real life.
Having constructed this model, I then simulated on a spreadsheet 600 at bats for each player—450 in non-clutch situations and 150 under clutch conditions—and figured for each player his batting average in “clutch” situations and his batting average in non-clutch situations. I did this for two seasons for each of the 100 players, creating a “clutch differential” for each player in each season. Each player’s intended batting average changed from season to season, but his “clutch differential” remained the same. The spreadsheet on which this experiment was conducted is named “Clutch Consistency.XLS”, and I will e-mail a copy of this spreadsheet to anyone who asks. At first glance it just looks like a vast collection of random numbers, but I think you can figure it out with a little effort.
This method does not exactly mirror Cramer’s method, in his 1977 article which I was using as a kind of whipping boy in Underestimating the Fog. What I have described as “Cramer’s method” is in fact two methods—an (a) method which was used to determine whether a player was a clutch hitter in any given season, and a (b) method which was used to determine whether those players identified as clutch players were consistent from season to season. I was interested entirely in the questions raised by the (b) method. The subject of my article could be stated as “Will Cramer’s (b) method work reliably under real-life conditions, if we assume that his (a) method works?” The (a) method I never discussed at all, for three reasons—
Anyway, in my model, we know that clutch hitting does exist, and that it does exist at what seems to me a very significant level. Yet when I compared the “clutch differentials” of the 100 players in the two seasons, the year-to-year consistency was far, far below the level at which any conclusion could be drawn from the data. Despite all of the steps I took to make clutch ability easier to spot in the model than it would be in real life, it remains essentially invisible.
In the study, a player’s clutch contribution was labeled as “consistent” if he hit better in clutch situations than he did overall in both simulated seasons, or if he hit worse in both seasons. His clutch contribution was labeled as “inconsistent” if he was better one year and worse the other.
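For readers who want to poke at the model without the spreadsheet, here is a rough Python sketch of the setup described above. It is not James’ spreadsheet. The 100 players, the 450/150 split of at bats, the 20% share of players with a clutch element and the uniform effect of up to 50 points either way come from the text; the way base batting averages are drawn (a normal distribution around .270, re-drawn each season), the function names and the handling of ties are assumptions made purely for illustration.

```python
import random


def simulate_hits(at_bats, avg, rng):
    """Count hits in `at_bats` independent trials with hit probability `avg`."""
    return sum(rng.random() < avg for _ in range(at_bats))


def simulate_league(n_players=100, clutch_share=0.20, max_effect=0.050,
                    clutch_ab=150, other_ab=450, seed=0):
    """Return the share of players whose clutch sign agrees across two seasons."""
    rng = random.Random(seed)
    n_with_effect = round(n_players * clutch_share)
    consistent = 0
    for i in range(n_players):
        # Only the first 20% of players carry a real clutch effect,
        # uniform between -50 and +50 points and fixed across seasons.
        effect = rng.uniform(-max_effect, max_effect) if i < n_with_effect else 0.0
        signs = []
        for _season in range(2):
            # ASSUMPTION: base average drawn around .270; the source does not
            # say how the intended averages were generated.
            base = min(max(rng.gauss(0.270, 0.025), 0.150), 0.400)
            clutch_hits = simulate_hits(clutch_ab, base + effect, rng)
            other_hits = simulate_hits(other_ab, base, rng)
            clutch_avg = clutch_hits / clutch_ab
            overall_avg = (clutch_hits + other_hits) / (clutch_ab + other_ab)
            # ASSUMPTION: a tie counts as "not better" in the clutch.
            signs.append(clutch_avg > overall_avg)
        consistent += signs[0] == signs[1]
    return consistent / n_players


rate = simulate_league(seed=2005)
print(f"share of players labeled consistent: {rate:.2f}")
```

Running it with a handful of different seeds gives a feel for how much the consistency rate bounces around from one simulated pair of seasons to the next.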
Overall, then, 52.4% of the players in the study showed consistency in their clutch contribution. If 52.44% of the players in a group are consistent from year to year and there are 100 players in the group, what is the random chance that 50 of them or fewer will show up as consistent in one test?
It’s 35%. Thus, no conclusion whatsoever can be drawn from the apparent lack of consistency in the data. Even when we know that the clutch effect does exist within the data, even when we give that effect an unreasonably clear chance to manifest itself, there is still a 35% chance that it will entirely disappear under this type of scrutiny.
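The 35% figure can be checked with an exact binomial sum. The sketch below assumes each of the 100 players is independently “consistent” with probability .5244, which is how the question above is framed; the helper name is hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Chance that 50 or fewer of 100 players show up as consistent
# when the true consistency rate is 52.44%.
p_at_most_50 = binom_cdf(50, 100, 0.5244)
print(f"{p_at_most_50:.3f}")
```

The printed value should land close to the 35% quoted above.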
What if 40% of the players have an “actual clutch effect”, rather than 20%?
Part of the problem with measuring “agreement” is that “agreement” narrows the odds, and thus profoundly changes the percentages. Suppose that half of the players in a group are good clutch hitters, and half are poor clutch hitters. Suppose that you have a test of clutch ability which is 80% accurate. Under those conditions, how many players will measure as consistent, meaning that they measure the same both years?
68%. 64% will measure as “consistent” accurately—.80 times .80—and 4% will measure as “consistent” due to a repeated inaccuracy. If the measurement is 80% accurate, in a two-year period 64% of the players will have two accurate measurements, and 4% will have two inaccurate measurements.
Thus, in order to achieve 62% agreement, as we did in the model above, you have to have a test which is 75% accurate. This is actually more of a problem in the catcher-ERA studies than it is in the clutch hitting studies.
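The arithmetic in the last two paragraphs is easy to reproduce. A minimal sketch, assuming the two yearly measurements err independently with the same accuracy (the function names are made up for this example):

```python
import math


def agreement_rate(accuracy):
    """Share of players measured the same way twice: both right or both wrong."""
    return accuracy**2 + (1 - accuracy) ** 2


def accuracy_for_agreement(agreement):
    """Invert agreement = a^2 + (1 - a)^2 for the root with a >= 0.5."""
    return 0.5 + math.sqrt(2 * agreement - 1) / 2


print(round(agreement_rate(0.80), 4))            # 80% accurate test
print(round(accuracy_for_agreement(0.625), 4))   # accuracy needed for 62.5% agreement
```

Because agreement rises so slowly with accuracy, a test has to be far better than a coin flip before the agreement rate moves visibly away from 50%.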
In the first few weeks after “Underestimating the Fog” was published, I got reactions which were all over the map. However, the one thing that nobody said, in the first few weeks—at least, nobody said it where I happened to see it—was that what I was saying was not correct. Thus, I felt no pressure, in those opening weeks, to demonstrate that what I was saying was correct.
However, in the February, 2005 edition of By the Numbers—which I think came out in June, 2005, go figure—there were two articles which touched on the veracity of my central claim, and thus prompted me to put my supporting work on record.
These two articles tend to broaden the debate, and raise a number of points that I wanted to comment on. In the first of those two articles (Comments on “Underestimating the Fog”), Jim Albert writes:
I was interested in a statement that James made in this article regarding the existence of individual platoon tendencies. This was counter to the general conclusions Jay Bennett and I made in Chapter 4 of Curve Ball.
With this exception, the rest of Dr. Albert’s comments, including those critical of the article, seem to me to be fair and well-considered, and I have no response to them.
The following article, however, the Phil Birnbaum article entitled “Clutch Hitting and the Cramer Test”, contains a number of statements that I wanted to comment on.
2) I don’t think that Birnbaum himself is confused about this (point 1), but he appends to his article a head-note which seems to suggest that he is responding directly to my article, and follows this by quoting two or three things I had said and responding to them. This creates the impression, to the reader, that we are writing about the same central issue. The longer his article goes, the more it drifts away from being a response to Underestimating the Fog.
In response to this, Birnbaum says that “This is certainly false. It is true that when you get random data, it is possible that ‘your study has failed.’ But it is surely possible, by examining your method, to show that the study was indeed well-designed, and that the random data does indeed reasonably suggest a finding of no effect.”
Reasonably suggests? We’re not talking about reasonable suggestions here; we’re talking about valid inferences from the data. Cramer didn’t say that his data “reasonably suggests” the absence of clutch hitters; he said—incorrectly—that his data “established clearly that clutch hitting cannot be an important or general phenomenon.” Joe Morgan, Tim McCarver, and generations of sportscasters before them have reasonably suggested that some players may have a special ability to rise to the occasion. The task in front of us is not to reasonably suggest the opposite, it is to find clear and convincing evidence one way or the other.
In the process of doing this, studies resulting in random data show only that the study has failed to identify clutch hitting ability. I stand by my statement without any reservation.
But he never actually addresses this question. His subsequent research has to do with whether Cramer is correct, and has nothing at all to do with whether his method works. He drops Cramer’s (a) method, and performs a test of statistical significance on the (b) method, the results of which, in my opinion, he misinterprets.
The results: a correlation coefficient (r) of .0155, for an r-squared of .0002. These are very low numbers; the probability of an f-statistic (that is, the significance level) was .86. Put another way, that’s a 14% significance level, far from the 95% we usually want in order to conclude that there’s an effect.
But this data—and all of Birnbaum’s data—actually doesn’t indicate that there is no effect. In fact, it shows that there is some evidence that there may be such an effect, but that this evidence merely is far too weak to say for sure one way or the other. This is a very, very different thing—and one absolutely may not segue from one into the other in the way that Birnbaum is attempting.
Why? For this reason. Suppose that you took a ten-at-bat sample of Stan Musial’s career, and asked “does this ten at bat sample provide clear and convincing evidence that Musial was an above-average hitter?”
Of course the answer would be “no, it doesn’t.” In the ten at bats Musial might go 4-for-10 with 2 homers, but in a ten-at-bat sample, A. J. Hinch might go 4-for-10 with 2 homers. You would conclude, by Birnbaum’s method, that this provided very, very little evidence that Musial was in fact an above-average hitter.
Suppose that you broke Musial’s 1948 season down into a series of 61 ten-at-bat sequences, and tested each one for evidence that Musial was an above-average hitter.
By Birnbaum’s logic, this would provide overwhelming evidence that Stan Musial in 1948 was not really an above-average hitter, since he had failed 61 straight significance tests.
But wait a minute. . .the real-life problem is worse than that. Suppose that you took each ten-at-bat sample of Musial’s season, and you buried it in a pile of one thousand at bats by ordinary hitters, and you then tested the significance of the 1010-at-bat composite. This would make the f-statistic (significance level) much higher, while making the correlation coefficient even lower. You quite certainly would find no evidence whatsoever that Musial was pushing the group to be above average.
But the scale proposed here is massive. The standard deviation of batting average itself isn’t thirty points. The standard deviation of batting average, for all players qualifying for the batting title in the years 2000 to 2004, is 28 points (.0277).
Birnbaum’s argument is “if a clutch hitting ability existed on this scale, this analysis would find it.” But if a clutch hitting ability existed on anything remotely approaching that scale, Stevie Wonder could find it. If a clutch hitting ability existed on anything like that scale, we wouldn’t be having this discussion.
Cramer’s (a) method—his method of determining whether a player was or was not a clutch hitter—was to contrast two measurements. One was an estimate of the player’s presumptive win contribution, based on his total batting statistics. A home run is a home run. If a player hit a home run in the ninth inning of a 12-1 ballgame, that was the same as if he hit a walk-off homer in the bottom of the ninth. The other was an event-by-event assessment of what the player had contributed to his team’s wins. If a player hit a home run in the ninth inning of a 12-1 ballgame, that would essentially be a non-event, whereas if a player hit a David Ortiz shot, that might be worth 100 times as much.
I don’t know. I’m skeptical. I doubt that it would work. The problem, it seems to me, is that the method might be heavily liable to random influences.
Why? Too much weight on too few outcomes. I am guessing, but I don’t really know, that in Cramer’s (a) method, 50% of the variance between the player’s situation-neutral win contribution and his situational win contribution will be determined by 30 at bats or fewer (if the player plays regularly). Thus, the player’s ranking in this system would seem to be heavily influenced by random deviations in performance in a small number of at bats, and thus the players who were “truly” clutch hitters, in the model, might very often not be identified as clutch players.
10) Again for the sake of clarity, I am not suggesting that my “clutch indicator” system works, either. My system worked, in my model, only because I set up the model to enable it to work within the model. It wouldn’t work worth a crap in real life.
It is my opinion that there is an immense amount of work to be done before we really begin to understand this issue.
And Birnbaum’s response:
Response to “Mapping the Fog”
In a famous 1977 clutch-hitting study, Dick Cramer took 122 players who had substantial playing time in both 1969 and 1970. He ran a regression on their 1969 clutch performance versus their 1970 performance. Finding a low correlation, he concluded that clutch performance did not repeat, and that, therefore, this constituted strong evidence that clutch ability did not exist.
Bill James, in his recent essay “Underestimating the Fog,” disputes that the Cramer study did indeed disprove clutch hitting.
“… even if clutch-hitting skill did exist and was extremely important, [Cramer’s] analysis would still reach the conclusion that it did not, because it is not possible to detect consistency by the use of this method [regression on this year’s clutch performance against next year’s].”
“… random data proves nothing – and it cannot be used as proof of nothingness. Why? Because whenever you do a study, if your study completely fails, you will get random data. Therefore, when you get random data, all you may conclude is that your study has failed.”
To which I respond:
1. Yes, random data on its own proves nothing. But combined with evidence that your test would have found an effect if it existed, the random data is evidence that the effect doesn’t exist.
2. It is possible to detect clutch-hitting consistency (at reasonable, non-trivial levels) by the use of the Cramer test.
3. It is possible to show what effects the Cramer test is capable of finding, and, therefore, to what extent a “finding of no effect” disproves clutch hitting.
On number 1, Bill charges me with a fallacy – the fallacy of believing that, if a test finds no evidence of clutch hitting, this means that clutch hitting does not exist. I agree with Bill that this logic would be seriously incorrect – but I neither stated it nor implied it. My point was that if a test finds no evidence of clutch hitting, and you can show that the test would have found clutch hitting if it existed, well, then, and only then, are you entitled to draw a conclusion about the non-existence of clutch hitting.
Either Bill misread what I said, or I didn’t say it clearly enough.
The reason for the difference is that we’re using different tests.
Bill’s test, in essence, consists of looking at players in consecutive years, and assigning each player one of four symbols. He gets a “+ +” if he was a clutch hitter both years; “- -” if he was a choke hitter both years; and “- +” or “+ -” if he was split. Bill then counts the number of consistent players (+ + or - -), and compares it to the number of inconsistent players (+ - or - +). If clutch hitting existed, there would be significantly more consistent players than inconsistent.
My test, which is the same test that Cramer used (but with Bill’s measure of clutch rather than Cramer’s “(a)” measure, as Bill calls it), uses the actual numbers and runs a regression. So if player A was 50 points higher in the clutch one year and 10 points higher the next, I add the pair (+50, +10) to my sample. I then run a regression (standard STAT101) on all the pairs, and look for a significance level.
The point is that Bill’s test is much, much weaker than mine. I think Bill is correct that with his test, “even if clutch-hitting skill did exist and was extremely important,” the test would be incapable of finding it.
(As an aside, I’d bet that if Bill threw out all datapoints except those where the absolute value of clutch hitting was over 25 points both seasons, the test would be much more likely to find significance. But that’s not important right now.)
By analogy, suppose that team A wins three games against the Brewers all by scores of 5-4, while team B wins three games against the same Brewers all by scores of 10-1. Bill’s test treats the teams the same, scoring them both as “+ + +”, and is incapable of noticing that team B is actually much better than team A.
But to my test (and Cramer’s), the amount of clutch hitting is considered. And so the Cramer test is capable of finding significant clutch effects.
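To make the contrast between the two tests concrete, here is a small Python sketch that computes both a signs-style agreement rate and a Pearson correlation on the same year-to-year pairs. The pairs are invented for illustration and are not data from either study; the function names are hypothetical. Turning the t statistic into a significance level needs a t-distribution table or a statistics library, which is left out here.

```python
import math


def sign_agreement(pairs):
    """Share of pairs whose two values fall on the same side of zero."""
    same = sum((x > 0) == (y > 0) for x, y in pairs)
    return same / len(pairs)


def pearson_r(pairs):
    """Pearson correlation between first-year and second-year values."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing a zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up clutch differentials in points: (year 1, year 2).
pairs = [(50, 10), (-30, -5), (5, -8), (-12, 20), (40, 35), (-45, -25)]
r = pearson_r(pairs)
print(f"sign agreement: {sign_agreement(pairs):.2f}")
print(f"r = {r:.3f}, r squared = {r * r:.3f}, t = {t_statistic(r, len(pairs)):.2f}")
```

The signs count sees only which side of zero each number falls on, while the correlation also uses how far from zero it is.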
It would and it did. The second row of my table (at the top of page 10 of “Clutch Hitting and the Cramer Test”), contains the results of 14 simulations of a season where clutch hitting was normally distributed with an SD of 30 points. Of those 14 simulations, the Cramer test found the effect, with statistical significance, in 11 of those 14 seasons. Seven of those 14 were extremely significant, rounding to .00.
Now, you could argue that 11 out of 14 isn’t enough – the test is only powerful enough 79% of the time. 21% of the time, the test will fail.
And that’s true if you only run the test on one season’s worth of data. But I ran it on 14 seasons. If clutch hitting at the .030 level should be caught 11 out of 14 times, and the real-life data (top row of the same table) showed significance 0 out of 14 times, does that not “reasonably suggest” (Bill doesn’t like this expression) that clutch hitting at .030 does not exist?
In my essay, I stopped there, but I could have done a more formal calculation. It looks like there’s about a 21% chance of failing to find significance for a single season. Let’s up that to 30% just to be conservative. We found 14 of those in a row. What’s the chance of a 30% shot happening 14 times in a row? 1 in 21 million.
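That back-of-the-envelope figure is a single power calculation, which treats the 14 seasons as independent tries, as the paragraph above does. A quick check in Python (variable names are just for this example):

```python
miss_rate = 0.30   # conservative chance of missing the effect in one season
seasons = 14

# Probability of missing in every one of the 14 seasons.
p_all_miss = miss_rate**seasons
print(f"1 in {1 / p_all_miss:,.0f}")
```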
That’s highly significant.
What’s Bill’s response to this test in “Mapping the Fog”? He doesn’t dispute the method or conclusion. Rather, he argues that .030 is a massive SD for clutch hitting (I implied that it was moderate; Bill is correct – it is massive). Of course this method can find an SD of 30 points, Bill says. “Stevie Wonder could find it.”
Bill writes, “maybe [the SD is] … 12, or 14, or 6, or 2. It sure as hell isn’t 30.”
Which is fair enough. But my original essay actually does go on to repeat the same test for 20 points, then 15 points, then 10 points, then 7.5 points – using exactly the same method, which Bill doesn’t dispute (and uses himself, as we will see shortly).
Bill does not mention these subsequent tests at all – nor does he mention my conclusion that the Cramer test (with 14 seasons of data) is “doubtful” with a standard deviation of 10 points, and that I agree with him that it “fails” if the SD of clutch hitting is actually only 7.5 points.
But Bill used his “signs” test rather than the Cramer regression, and that’s why he failed to find any effect.
My results: out of my 56 simulated seasons, 11 showed statistical significance at the .05 level in a positive direction. If the data were random, it should have been 2.5% of 56, or 1.4.
Again I didn’t do this in the essay, but what is the probability of getting exactly 11 positives out of 56, where the chance of each positive is 2.5%? If I’ve done the calculation right, it’s about 1 in 8.6 million. We really want “11 or more”, rather than exactly 11, but I’m too lazy to run the normal approximation to binomial right now. It’s definitely less than 1 in a million, in any case. (By the way, I think the 11 successes might have been a random fluke. But even if we got only 6 successes, I (lazily) believe that would still be significant at the 1% level.)
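The “11 or more” tail does not need a normal approximation; the exact binomial sum is short enough to compute directly. A sketch, assuming the 56 simulated seasons are independent and each has a 2.5% chance of a false positive (helper names are hypothetical):

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


exactly_11 = binom_pmf(11, 56, 0.025)
eleven_or_more = binom_tail(11, 56, 0.025)
six_or_more = binom_tail(6, 56, 0.025)

print(f"exactly 11:  1 in {1 / exactly_11:,.0f}")
print(f"11 or more:  1 in {1 / eleven_or_more:,.0f}")
print(f"6 or more:   {six_or_more:.4f}")
```

The first line should come out in the neighborhood of the 1-in-8.6-million figure quoted above, and the last one gives the check on the six-successes aside.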
In point form, then:
—Under Bill’s distribution, the simulated Cramer Test succeeded in finding positive significance about 19% of the time in 56 tries.
—Random data would, by definition, find positive significance 2.5% of the time.
—The chance of the 19% happening by chance in 56 tries, where the real probability is 2.5%, is less than 1 in a million.
But I guess there are really two conclusions:
—With 14 separate seasons worth of data, the Cramer test “works” in that it identifies the existence of clutch hitting at the Bill James distribution;
—As an aside, the real-life data do provide reasonable basis to conclude that if clutch hitting does indeed exist, it does so at a lower level than the Bill James distribution.
1. “… even if clutch-hitting skill did exist and was extremely important, [Cramer’s] analysis would still reach the conclusion that it did not, because it is not possible to detect consistency by the use of this method [regression on this year’s clutch performance against next year’s].”
It seems to me that Bill believes this because he used a much weaker signs test, rather than a full regression. (Although, to be fair, I don’t know whether the Cramer test succeeds using Cramer’s own measure of clutch hitting. It might, or it might not.) I believe that the data and logic fully support the conclusion that for a large enough effect (such as Bill’s distribution) and enough seasons of data (say, the 14 that I used), the Cramer test quite easily detects consistency.
2. “… random data proves nothing – and it cannot be used as proof of nothingness. Why? Because whenever you do a study, if your study completely fails, you will get random data. Therefore, when you get random data, all you may conclude is that your study has failed.”
And, judging by Bill’s response, I don’t think he believes this second quote himself. His own test of whether the signs test would pick up an effect proves that. If he really believed that random data proved nothing, what would be the point of checking if the test could produce non-random data? Bill’s test only makes sense if he really means that random data proves nothing only if random data would come out in any case.
And so I wonder if by this quote, Bill actually agrees with me, but originally just overstated his case.
James writes that “I take no position whatsoever about whether clutch hitting exists or does not exist.” But he does acknowledge that if clutch hitting exists, it must have a standard deviation that doesn’t even approach 30 points. My position is similar – I don’t know whether clutch hitting exists or not either—but I believe that if it does exist, the Cramer test simulations prove that the SD must be 10 points or less.
Our only large disagreement, I think, is that Bill argues very strongly, in absolute terms, that the Cramer method can’t work. I argue that the absolutist formulation is wrong. The Cramer method is as legitimate as any other statistical method. With enough data (exactly how much data depends on the size of the effect you’re looking for), the test is powerful enough to provide good evidence for the lack of the effect.
This discussion took place in SABR. You almost missed it. Even some SABR-ites will miss it – and that’s no fun.
SABR is a fantastic organization. For the membership you get assorted journals, newsletters, mailing lists, use of ProQuest (which is H.G. Wells-ian time travel), statistical research, historical research and the opportunity to learn from nearly 7000 individuals who love baseball as much as you do.
Your spouse doesn’t understand your passion for Debs Garms? Well, I guarantee you can find someone in SABR that will.
SABR is NOT about numbers. For me that is a fantastic part, but it’s a small part.
It’s about the history of uniforms. It’s about the odd plays you find at Retrosheet (the home of SABR luminaries David W. Smith and Tom Ruane, and in the back of the store you’ll find many others sitting around the pot-bellied stove, whittling and discussing a great many things). It’s about reminiscing about the 1983 White Sox, or the 1959 White Sox, the Go-Go Sox, and less reminiscing about, and more wondering about, the Hitless Wonders, the 1906 White Sox. Every time I look at the 1983 season, I can only figure that Sox team was the “Go Wonder Sox”. Plus 12 wins from 1982 and minus 25 wins in 1984. But I digress.
SABR is about listening and learning. The range of experts on baseball things – umpires, women in baseball, the third baseman on the $100,000 Infield, baseball poetry and prose – is covered because SABR is a collective. Everybody shares because the ultimate goal is to make baseball knowledge available and documented.
Don’t get me wrong, membership has its privileges, but many things SABR are available to non-members (browse the site!) and it grows every day.
Have a grandfather that played? You can contribute to the BioProject – an effort to get a short biographical entry on every player. Don’t have a relative that played but know about a player that went to your high school? You can contribute to the BioProject. Just like reading about players and want to help? You can contribute to the BioProject.
In the end, SABR is about loving baseball, and enhancing the quality of our knowledge of it.
Then there is the SABR Convention. You get to hang out with people you always wanted to meet: me, Furtado, Forman, Mike Emeigh, Aaron Gleeman, Jon Daly, Dan Szymborski, Eric Enders, Hall of Merit’s Joe Dimino, Chris Jaffe, Anthony Giacalone, Mike Webber, Vinay, Rauseo, Burley, MGL, Bob T, Cyril Morong, Mark Stallard (just off the top of my head).
Then there are others, mostly individuals who write about baseball in some form, that would love to stand around and listen to your ideas on who the Blue Jays should trade for and why:
Rob Neyer, Alan Schwarz, David W. Smith, Tom Ruane, Tom Tippett, Scott Fischthal, Clay Davenport, Chris Kahrl, Maury Brown, Bill James, Phil Birnbaum, Jim Albert, Will Carroll, Clem Comly, Cliff Blau, Bill Nowlin, Dan Levitt.
And for me, this year in Toronto, I will get to meet Ron Johnson, a writer/analyst I greatly admire. I can’t tell you how much that means to me.