And the Beat Goes On: Derek Jeter and the State of Fielding Analysis in Sabermetrics - Part 8
In the final installment, Mike wraps up our early Christmas present.
Variations on a Theme: Where Can We Go from Here?
Back when I was a member of my high school’s debate team,
I learned that when listening to an argument, it is important to test the
evidence presented in support of that argument. There were many
times when our worthy opponents would offer arguments based on evidence that,
when carefully considered and understood in context, said something quite
different from what they intended it to say. Of course, my teams were
never guilty of doing the same thing.
Anyway, that experience taught me to be a skeptic
- or rather, reinforced the natural tendency that I have always had. Once
you leave the realm of statements of fact, and enter the realm of interpretation
- especially the interpretation of baseball performance - everything
is open to question. An interpreter of the evidence is always making assumptions
about how much weight to give an observation, what a piece of data
really means, how to account for differences in performance based
on the environment in which the performance was accomplished, and how to balance
the relative value of accomplishments. Some of these assumptions are explicit;
most are never clearly stated. But all of them should be tested against the
weight of all of the known facts.
So when I heard the variations on this theme:
“Every defensive method says Jeter is a terrible fielder.
You simply can’t ignore that much evidence.”
I had to test the methods against the evidence - because
the methods themselves were not evidence, but interpretations of the evidence.
I had to discover for myself the extent to which the developers of those
methods had interpreted the evidence appropriately, in context.
I first started looking at this at about the time that
Baseball Prospectus 2001 came out, and Clay Davenport had totally revamped
his DFTs. I was surprised that there had been very little empirical validation
of defensive methods against play-by-play data, especially for more recent
seasons when we’ve had such data available. I made efforts to understand
the methods that were out there at the time, I looked for underlying unstated
assumptions about defensive performance implicit in those methods, and I
tested those assumptions against the available play-by-play data.
As I indicated at the start of this series, the statistical record gives us very
limited information about the number of opportunities that any defender gets
to make a play. Most analysts realize that it
is folly to attempt to evaluate a defender based on his successes and a
partial record of his failures, and most analysts also realize that hits
in the field of play result (to some degree) from a defensive failure. For
that reason, every defensive method that is not based on play-by-play data
makes assumptions about the distribution of opportunities among fielders.
Yet even with those assumptions, methods not based on play-by-play data reward
players who get more opportunities, and penalize players who get fewer opportunities
- even those methods like the DFTs and CAD that use PBP data to help estimate
the distribution of opportunities. Every ball fielded by someone else will
drive a fielder’s defensive rating down in most systems. In the case of
the Yankees from 1998-2000, the ball-in-play distribution and the probable
alignment of the fielders in the face of that distribution combine to limit
the number of opportunities that Derek Jeter has to make plays - and make
it nearly impossible for Jeter to rank highly in those methods that are based
on traditional defensive statistics.
The one (more or less) open-source method based on PBP data that we do have
- MGL’s version of UZR - concludes that Jeter isn’t all that bad a fielder,
at least for the 1998-2000 period covered by my analysis. While UZR is not
problem-free, it does eliminate most of the guesswork about fielder opportunities,
being based on calculated percentages of balls fielded by a defender in an
area of the field. The UZR model can be tweaked to allow for more dynamic
allocations of fielder responsibility - perhaps by modeling various ball distributions
and positioning fielders in that distribution in such a way as to maximize
the probability that a ball in play will be fielded by someone, and
then evaluating fielder coverage responsibilities within those areas based
on a version of Don Malcolm’s force-and-distance model. Such an approach should
allow us to define the extent to which fielder percentages in different zones
should be expected to vary based on the BIP distribution.
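To make the zone idea concrete, here is a minimal sketch of a zone-based comparison in general. It is not MGL's actual UZR computation: the `ZoneLine` record, the zone labels, the league rates and every count below are hypothetical, chosen only to show that a fielder judged zone by zone is not penalized simply for seeing fewer balls.

```python
from dataclasses import dataclass


@dataclass
class ZoneLine:
    balls: int               # balls in play hit into this zone (hypothetical counts)
    outs_by_fielder: int     # of those, outs recorded by the fielder being rated
    league_out_rate: float   # assumed share of such balls an average fielder converts


def plays_above_average(zones):
    """Actual outs minus the outs an average fielder would be expected
    to make on the same balls, summed over all zones."""
    expected = sum(z.balls * z.league_out_rate for z in zones.values())
    actual = sum(z.outs_by_fielder for z in zones.values())
    return actual - expected


# Two made-up shortstops who convert balls at exactly the assumed league
# rate in every zone; the second simply sees half as many balls.
busy_ss = {
    "hole": ZoneLine(100, 30, 0.30),
    "straightaway": ZoneLine(200, 170, 0.85),
    "up the middle": ZoneLine(100, 40, 0.40),
}
quiet_ss = {
    "hole": ZoneLine(50, 15, 0.30),
    "straightaway": ZoneLine(100, 85, 0.85),
    "up the middle": ZoneLine(50, 20, 0.40),
}

print(plays_above_average(busy_ss), plays_above_average(quiet_ss))
```

Both made-up shortstops come out at roughly zero plays above average, even though one recorded 240 outs and the other 120. A method that counts only the plays made would rank the first far ahead of the second.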
That still leaves the problem of developing better estimators
for fielder opportunities in methods not based on PBP data, since we do not
have detailed PBP data for most of MLB history. To
do this, we need to take advantage of the PBP data that we do have
(including the 20 years’ worth of data that has been released through Retrosheet)
to figure out better ways to estimate opportunities from the traditional
statistics when PBP data is not available.
Most of the adjustments for BIP distribution have been made
on either the total number of balls in play or (in Win Shares) on the successful
plays. But perhaps we can approach this from a different angle. We already
know the direction of the outs in play - we get those from the assist
totals for the infielders and the putout totals for the outfielders. Thus
we really only have to try to estimate the distribution of hits in
play. We can estimate the number of singles, doubles, and triples per team
given the rest of the team stats - one way to do this is to take Bill James’s
component ERA formula and work backwards to get an estimated number of total
bases allowed based on the team ERA, then divide into singles, doubles, and
triples based on average percentages allowed. Once we have singles, we should
be able to estimate the percentage of those that are ground balls (since
we can estimate the GB/FB distribution for a team). The assist totals for
the infielders should help us develop estimators for the direction of ground
ball singles (as will the L/R pitching distribution, since most GBs are pulled).
The idea is to take advantage of everything that we know from the basic data,
and combine that with inferences that we draw from the PBP data. If we knew
from the PBP data that singles fell in about the same distribution as the
outs, and that they are distributed among fielders in about the same
proportion as some combination of the left side/right side assists and the
LH/RH pitching distribution, we could devise a range rating similar to that
in CAD. We’ve tried to do that based on total BIP, but it might make more
sense to do that based on (assists+estimated hits in play). That’s one way
to approach the problem - and given the number of comments that have already
come forward on this series of articles, I have no doubt that those of you
who have worked all the way through this analysis with me have your own ideas
as well. In any event, we should be trying to work back and forth from the
PBP data to the method and validate whatever conclusions we draw back against
the PBP data. No one should undertake defensive analysis without starting
from players for whom there is PBP data, because there is no other way to
validate whatever assertions you make about opportunities.
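As a rough illustration of the (assists+estimated hits in play) idea, the sketch below treats each infielder's opportunities as his assists plus his estimated share of the team's ground-ball singles. Everything in it is an assumption made for illustration: the function names, the singles estimate (taken as an input here rather than derived from component ERA), the ground-ball share, and the per-position hit shares, which in practice would have to come from some blend of assist totals and LH/RH pitching and be validated against PBP data.

```python
def estimated_opportunities(assists, est_singles, gb_single_share, hit_share):
    """Estimate each infielder's opportunities as assists plus allocated hits.

    assists          -- dict: position -> assists (known from the basic data)
    est_singles      -- estimated singles allowed by the team (assumed input)
    gb_single_share  -- assumed fraction of singles that are ground balls
    hit_share        -- dict: position -> assumed fraction of ground-ball
                        singles hit toward that fielder (must sum to 1)
    """
    if abs(sum(hit_share.values()) - 1.0) > 1e-9:
        raise ValueError("hit_share must sum to 1")
    gb_singles = est_singles * gb_single_share
    return {pos: assists[pos] + gb_singles * hit_share[pos] for pos in assists}


def conversion_rates(assists, opportunities):
    """Share of estimated opportunities that each fielder turned into outs."""
    return {pos: assists[pos] / opportunities[pos] for pos in assists}


# Entirely hypothetical team totals.
assists = {"1B": 100, "2B": 450, "3B": 300, "SS": 400}
hit_share = {"1B": 0.10, "2B": 0.30, "3B": 0.25, "SS": 0.35}

opps = estimated_opportunities(assists, est_singles=1000,
                               gb_single_share=0.5, hit_share=hit_share)
rates = conversion_rates(assists, opps)
for pos in assists:
    print(pos, round(opps[pos], 1), round(rates[pos], 3))
```

The point of the design is that the denominator is built from the fielder's own outs plus the hits estimated to have gone his way, not from the team's total balls in play. A shortstop who sees few balls is then compared against the chances he probably had.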
I selected Derek Jeter for this analysis because I found
it hard to accept the extent to which the analytical conclusions were at
odds with the perception among baseball men. I didn’t doubt that Jeter was
overrated as a defender, mind you, but having looked at STATS zone data for
1999 and 2000, and knowing that Jeter was seeing very few balls in his zone,
I began to wonder if it weren’t the methods themselves that were at fault.
After investigation, I’m convinced that the non-PBP methods are systematically
downgrading Jeter because they fail to account for his lack of opportunities,
and until we do a better job of leveraging what we know or can estimate from
the PBP data into the non-PBP based methods, we will always have to remember
that a rating could be as much the result of opportunity as it is of fielding ability.
In writing this series of articles, I hoped that, more than
anything else, the Primer readers would provide thoughtful feedback and commentary.
That hope has been realized in spades. I appreciate the amount of time and
effort that you have taken to read and react to these articles, and I hope
that you will continue to do that in response to this article and to anything
else the Primer authors write. We’ve got a great group of writers and an
even better group of readers. Thanks!!
Posted: December 12, 2002 at 05:00 AM | 9 comment(s)