The Benedictine monks of baseball statistics devote countless hours preserving the history of the game. I greatly appreciate the work they do, and so should you.
The Future of Baseball Statistics Is Coming To Us Out of the Past?
OK, a semi-obligatory non-baseball
reference to start us off. (Fear not, friends, no politics this time; enough
blood and bile has been spilled here to see us through the holidays.) Instead,
a film reference, which those of you with quick eyes have already seen coming.
It was my pleasure earlier this year to meet Jane Greer, the memorable
femme fatale of Out Of The Past, who was as bad as they come. Jane was
the Roberto Petagine of RKO, an actress of great talents whose career
was essentially wasted by the caprices of Mr. Capricious himself, Howard
Hughes. Sadly, Jane Greer passed away late in August, but she left behind
one immortal performance and an exemplary life.
Now, back to baseball, but
let's stay with the theme of exemplary lives. Earlier this month there was a
small mention of the new features at the Retrosheet
web site. The organization founded by Dave Smith tantalized us with their
incredible efforts over the past years to accumulate a detailed backstory of
baseball in the form of play-by-play records for games in the deepest recesses
of time. Now, with a major upgrade of their web site, they've given us a glimpse
of the ultimate baseball encyclopedia.
Many of you have been to
the Retrosheet site already, so you know what I'm talking about. Situational
data for the years 1978-1990 is in place for both players and teams, and it's
a wealth of information. In addition, game logs for all of twentieth-century
major league baseball can be accessed and downloaded.
The team game logs, in conjunction
with the statistics, provide us with intriguing insights into performance that
aren't apparent otherwise. For example, let's look at the 1985 Chicago Cubs, a
team coming off a division championship the previous year and one expected to
be in the thick of the pennant race.
Retrosheet's "box scores,
narratives, and other goodies" section tells us how that Cub squad got off to
a roaring start. On June 11, they were 35-19 and leading the NL East by four
games over the Mets, and by six over the Cardinals.
In a stretch that will remind
Red Sox fans of their team's late-season swoon in 2001, the Cubs proceeded to
lose thirteen straight games, eleven of them head-to-head matches with the Mets
and Cards. The Cubs lost eight games in the standings to the Mets, and ten to
the Cards in this stretch, and by June 25 they were in fourth place.
By early August, the Cubs were still in fourth place, but
only 7 1/2 games out of the lead. Unfortunately, they dropped twelve out of their
next fourteen games (including seven in a row to the Mets and Cards) and were
never heard from again. From August 2 on, the Cubs had a 23-40 record, and finished
77-84. In a word, thud.
Retrosheet's log shows us
that the Cubs were a combined 8-28 against the Mets and Cards (who,
as you remember, staged a heck of a divisional race that year). That means that
the Cubs were 69-56 against the rest of the league. Counting up the carnage
the Cubs suffered at the hands of the Mets and Cards a bit further, we see that
the Cubs were 6-13 against these teams in close games (ones decided by two or
fewer runs), and a staggering 2-15 in non-close games.
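For anyone who wants to check that arithmetic, here is a minimal Python sketch of the split. The `Record` class and its method names are my own illustration, not anything Retrosheet provides; the win-loss figures are simply the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A won-lost record (hypothetical helper, for illustration only)."""
    wins: int
    losses: int

    def plus(self, other: "Record") -> "Record":
        # Combine two records, e.g. close games plus non-close games.
        return Record(self.wins + other.wins, self.losses + other.losses)

    def minus(self, other: "Record") -> "Record":
        # Remove a subset of games from a larger record.
        return Record(self.wins - other.wins, self.losses - other.losses)

    def pct(self) -> float:
        # Winning percentage: wins divided by decisions.
        return self.wins / (self.wins + self.losses)


# Figures quoted in the text for the 1985 Cubs.
season = Record(77, 84)
close_vs_mets_cards = Record(6, 13)      # decided by two or fewer runs
non_close_vs_mets_cards = Record(2, 15)

vs_mets_cards = close_vs_mets_cards.plus(non_close_vs_mets_cards)
vs_rest = season.minus(vs_mets_cards)

print(f"vs. Mets/Cards: {vs_mets_cards.wins}-{vs_mets_cards.losses} "
      f"({vs_mets_cards.pct():.3f})")
print(f"vs. everyone else: {vs_rest.wins}-{vs_rest.losses} "
      f"({vs_rest.pct():.3f})")
```

The close and non-close splits add back up to 8-28, and taking that out of 77-84 leaves 69-56, a .552 clip against everyone else versus .222 against the two teams at the top.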
Thanks to Retrosheet, we
now can see exactly how the 1985 Cubbies got their clocks cleaned. Now that
is a valuable public service.
And with Retrosheet's play-by-play
data for these years (also available for download at their site), it's possible
for the enterprising analyst to create the almost never-seen in-game data that we've
been teased with by STATS, Inc. over the past twelve years. One such analyst
is John Jarvis, whose Collection
of Team Season Statistics takes the Retrosheet play-by-play
logs and starts to give us the kind of detail many have been clamoring for.
Jarvis is a mathematics
professor at the University of South Carolina, and some of you may find his
other work interesting (though some may find it a bit recondite). However, almost
any baseball fan worth his salted peanuts is going to find segments of John's
play-by-play stats irresistible. Check out John's site for a complete look at
his team season stat compilations, which cover most of the last quarter-century.
The ones that many find
most interesting are the base runner performance stats. For each team in each
year (from 1978 through 2000 for both leagues, and for the AL in 1967), John
breaks out the base runner advancement going from first to third on a single,
and from first to home on a double.
These are interesting stats,
though they're probably not pivotal to our understanding of team performance.
When we look at the 1985 Cubs in the base runner categories, we see something
of an anomaly: despite the fact that the Cubs wound up second in the league
in stolen bases, their base runner advancement percentage for first->third
on a single and first->home on a double was below league average. This indicates
that the '85 Cubs had all of their speed concentrated in a couple of guys (Ryne
Sandberg and Shawon Dunston), while the Cards had speed throughout
most of their lineup (they had the second highest first->third advancement
percentage in the NL that year; only Houston was better at it).
John doesn't show the defensive base runner advancement data, that
is, how well teams went from first->third and first->home against
the Cubs. But he does provide an offense vs. defense breakout for the number
of runners advanced and scored on outs, which gives us a glimpse as to what
percentage of overall runs occur in this way. (Like many of us, John is inventing
some new acronyms, though he's chosen not to be as colorful as have others:
under "Team Sacrifice Summary" you'll see RAO (Runners Advancing on Outs),
RSO (Runners Scoring on Outs), and ROH (Runners Out at Home).) About
12.5% of all runs scored via outs in the 1985 NL. One of these days, we might
find out what that percentage was for teams in 1916, the height of the Deadball
Era. Was it the same, or was it vastly different? (Just for sake of comparison,
11.3% of runs in the 2000 AL scored via outs.)
While "productive outs" are, again, only a minor contributor
to overall team success, it's interesting to note that the 1985 Cubs had the
worst ratio in the league in terms of runners advancing on outs. They advanced
183 of their own runners via outs, while 234 opposition base runners moved up
on outs. (The Cardinals, who won the NL East that year, had a 244-162 RAO ratio.)
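To put those two RAO lines side by side, here is a small Python sketch. Exactly how the "ratio" should be computed isn't spelled out above, so dividing a team's own advances by its opponents' advances is my assumption, and the function name `rao_summary` is made up for illustration.

```python
def rao_summary(advanced: int, allowed: int) -> dict:
    """Compare a team's own runners advanced on outs with opponents'.

    Assumption: 'ratio' means own advances divided by opposition advances.
    """
    return {
        "net": advanced - allowed,
        "ratio": advanced / allowed,
    }


# (own RAO, opposition RAO) as quoted in the text for 1985.
teams = {
    "Cubs": (183, 234),
    "Cardinals": (244, 162),
}

summaries = {name: rao_summary(own, opp) for name, (own, opp) in teams.items()}

for name, s in summaries.items():
    print(f"{name}: net {s['net']:+d}, ratio {s['ratio']:.2f}")
```

On those numbers the Cubs come out 51 advances in the red (a ratio of about 0.78) while the Cardinals are 82 to the good (about 1.51), which is a big gap for something that is only a minor contributor to winning.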
Of course, none of this data really replaces the most important
explanation for the Cubs' collapse in 1985: their pitchers melted down, especially
at home, when the warm weather hit. Chicago pitchers allowed 6.3 runs per game
at home from July 1 to the end of the season: their 23-10 early-season home
record withered away to a 41-39 overall home mark, which works out to an 18-29 home record
after July 1. (One final digression: is it possible that the swing of Wrigley
Field from hitters' park to pitchers' park in recent years can be explained,
in part at least, by the installation of lights at the venerable old park?)
John's work is mining the treasure chest of data that Retrosheet has made available
to us, and while it may not be groundbreaking, it's a valuable example of what
can be done. Some enterprising person will generate ideas for stat splits that
can isolate some of the truly random elements that make seasonal results deviate
from expectation; the 1985 Cubs pitchers may have been burned on two strikes the
way Greg Maddux was in 1999, for example. This might be the actual cause
of the deviation in batting average on balls in play; then again, it might not,
but sooner or later, we're going to know.
If you haven't been to the Retrosheet site, or haven't been
back recently, you owe it to yourself to take a look ASAP. Don't be surprised
if you wind up spending the better part of a day (and half the night) there.
And give John's site a look as well; there's food for thought there. These "retro-pioneers"
are mining the past to bring us the future in baseball stats, and by doing so
are defying Ecclesiastes' famous pronouncement that "there is nothing
new under the sun."
(Of course, they didn't have night games back then . . .)
Posted: December 17, 2001 at 06:00 AM