— Where BTF's Members Investigate the Grand Old Game
Monday, December 17, 2001
The Benedictine monks of baseball statistics devote countless hours to preserving the history of the game. I greatly appreciate the work they do, and so should you.
The Future of Baseball Statistics Is Coming To Us Out of the Past?
OK, a semi-obligatory non-baseball reference to start us off. (Fear not, friends, no politics this time; enough blood and bile has been spilled here to see us through the holidays.) Instead, a film reference, which those of you with quick eyes have already seen coming. It was my pleasure earlier this year to meet Jane Greer, the memorable femme fatale of Out Of The Past, who was as bad as they come. Jane was the Roberto Petagine of RKO, an actress of great talents whose career was essentially wasted by the caprices of Mr. Capricious himself, Howard Hughes. Sadly, Jane Greer passed away late in August, but she left behind one immortal performance and an exemplary life.
Now, back to baseball, but let's stay with the theme of exemplary lives. Earlier this month there was a small mention of the new features at the Retrosheet web site. The organization founded by Dave Smith has tantalized us with their incredible efforts over the past years to accumulate a detailed backstory of baseball in the form of play-by-play records for games in the deepest recesses of time. Now, with a major upgrade of their web site, they've given us a glimpse of the ultimate baseball encyclopedia.
Many of you have been to the Retrosheet site already, so you know what I'm talking about. Situational data for the years 1978-1990 is in place for both players and teams, and it's a major wealth of information. In addition, game logs for all of twentieth-century major league baseball can be accessed and downloaded.
The team game logs, in conjunction with the statistics, provide us with intriguing insights into performance that aren't apparent otherwise. For example, let's look at the 1985 Chicago Cubs, a team coming off a division championship the previous year and one expected to be in the thick of the pennant race.
Retrosheet's "box scores, narratives, and other goodies" section tells us how that Cub squad got off to a roaring start. On June 11, they were 35-19 and leading the NL East by four games over the Mets, and by six over the Cardinals.
In a stretch that will remind Red Sox fans of their team's late-season swoon in 2001, the Cubs proceeded to lose thirteen straight games, eleven of them head-to-head matches with the Mets and Cards. The Cubs lost eight games in the standings to the Mets, and ten to the Cards in this stretch, and by June 25 they were in fourth place.
By early August, the Cubs were still in fourth place, but only 7 1/2 games out of the lead. Unfortunately, they dropped twelve out of their next fourteen games (including seven in a row to the Mets and Cards) and were never heard from again. From August 2 on, the Cubs had a 23-40 record, and finished 77-84. In a word, thud.
Retrosheet's log shows us that the Cubs were a combined 8-28 against the Mets and Cards (who, as you remember, staged a heck of a divisional race that year). That means that the Cubs were 69-56 against the rest of the league. Counting up the carnage the Cubs suffered at the hands of the Mets and Cards a bit further, we see that the Cubs were 6-13 against these teams in close games (ones decided by two or fewer runs), and a staggering 2-15 in non-close games.
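For anyone who wants to check that bookkeeping, here is a minimal sketch of the subtraction. The won-lost figures are the ones quoted above; the `Record` class and its names are my own illustration, not anything Retrosheet provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A won-lost record (hypothetical helper for this illustration)."""
    wins: int
    losses: int

    def __add__(self, other: "Record") -> "Record":
        return Record(self.wins + other.wins, self.losses + other.losses)

    def __sub__(self, other: "Record") -> "Record":
        return Record(self.wins - other.wins, self.losses - other.losses)

    @property
    def pct(self) -> float:
        """Winning percentage, ignoring ties."""
        return self.wins / (self.wins + self.losses)

    def __str__(self) -> str:
        return f"{self.wins}-{self.losses}"


# Figures quoted in the article for the 1985 Cubs.
season = Record(77, 84)
vs_mets_cards = Record(8, 28)
close = Record(6, 13)       # games decided by two or fewer runs
not_close = Record(2, 15)

# Remove the Mets and Cards games to get the record against everyone else.
vs_rest = season - vs_mets_cards

print(f"vs. Mets and Cards: {vs_mets_cards} ({vs_mets_cards.pct:.3f})")
print(f"vs. rest of league: {vs_rest} ({vs_rest.pct:.3f})")
print(f"close + non-close:  {close + not_close}")
```

The subtraction gives back the 69-56 mark, and the close and non-close splits add up to the 8-28 total, so the numbers are at least internally consistent.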
Thanks to Retrosheet, we now can see exactly how the 1985 Cubbies got their clocks cleaned. Now that is a valuable public service.
And with Retrosheet's play-by-play data for these years (also available for download at their site), it's possible for the enterprising analyst to create the almost never-seen in-game data that we've been teased with by STATS, Inc. over the past twelve years. One such analyst is John Jarvis, whose Collection of Team Season Statistics takes the Retrosheet play-by-play logs and starts to give us the kind of detail many have been clamoring for.
Jarvis is a mathematics professor at the University of South Carolina, and some of you may find his other work interesting (though some may find it a bit recondite). However, almost any baseball fan worth his salted peanuts is going to find segments of John's play-by-play stats irresistible. Check out John's site for a complete look at his team season stat compilations, which cover most of the last quarter-century.
The ones that many find most interesting are the base runner performance stats. For each team in each year (from 1978 through 2000 for both leagues, and for the AL in 1967), John breaks out the base runner advancement going from first to third on a single, and from first to home on a double.
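To make the idea concrete, here is a rough sketch of how an advancement percentage of this kind could be tallied from play-by-play records. Everything in it is assumed for illustration: the team codes and events are made up, the tuple layout is a simplification and not the Retrosheet event format, and the definition of an "opportunity" (any single with a runner on first) may not match the one John uses.

```python
from collections import Counter

# Made-up toy data: one tuple per single hit with a runner on first.
# Each tuple is (batting team, where the runner from first ended up).
# "out" means the runner was thrown out on the bases.
singles_with_runner_on_first = [
    ("AAA", "2B"), ("AAA", "3B"), ("AAA", "2B"), ("AAA", "3B"),
    ("BBB", "2B"), ("BBB", "2B"), ("BBB", "3B"), ("BBB", "out"),
]


def first_to_third_rate(events, team):
    """Share of a team's opportunities on which the runner reached third.

    Returns None when the team had no opportunities at all.
    """
    outcomes = Counter(dest for batting_team, dest in events if batting_team == team)
    opportunities = sum(outcomes.values())
    if opportunities == 0:
        return None
    return outcomes["3B"] / opportunities


for team in ("AAA", "BBB"):
    rate = first_to_third_rate(singles_with_runner_on_first, team)
    print(f"{team}: first->third on a single {rate:.1%}")
```

The same tally works for first to home on a double by swapping in the relevant events and counting the runners who scored.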
These are interesting stats, though they're probably not pivotal to our understanding of team performance. When we look at the 1985 Cubs in the base runner categories, we see something of an anomaly: despite the fact that the Cubs wound up second in the league in stolen bases, their base runner advancement percentage for first->third on a single and first->home on a double was below league average. This indicates that the '85 Cubs had all of their speed concentrated in a couple of guys (Ryne Sandberg and Shawon Dunston), while the Cards had speed throughout most of their lineup (they had the second highest first->third advancement percentage in the NL that year; only Houston was better at it).
John doesn't show the defensive base runner advancement data, that is, how well teams went from first->third and first->home against the Cubs. But he does provide an offense vs. defense breakout for the number of runners advanced and scored on outs, which gives us a glimpse as to what percentage of overall runs occur in this way. (Like many of us, John is inventing some new acronyms, though he's chosen not to be as colorful as have others: under "Team Sacrifice Summary" you'll see RAO for Runners Advancing on Outs; RSO for Runners Scoring on Outs; and ROH for Runners Out at Home.) About 12.5% of all runs scored via outs in the 1985 NL. One of these days, we might find out what that percentage was for teams in 1916, the height of the Deadball Era. Was it the same, or was it vastly different? (Just for sake of comparison, 11.3% of runs in the 2000 AL scored via outs.)
While "productive outs" are, again, only a minor contributor to overall team success, it's interesting to note that the 1985 Cubs had the worst ratio in the league in terms of runners advancing on outs. They advanced 183 of their own runners via outs, while 234 opposition base runners moved up on outs. (The Cardinals, who won the NL East that year, had a 244-162 RAO ratio.)
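A quick sketch of that comparison, using only the four RAO counts quoted above. The `rao_summary` helper is a name I made up for the illustration, and expressing the gap as both a net difference and a ratio is my choice, not necessarily how John presents it.

```python
def rao_summary(own_advanced: int, opp_advanced: int) -> dict:
    """Net difference and ratio of runners advanced on outs, offense vs. defense."""
    return {
        "net": own_advanced - opp_advanced,
        "ratio": own_advanced / opp_advanced,
    }


# (own runners advanced on outs, opposition runners advanced on outs), 1985 NL.
teams = {
    "Cubs": (183, 234),
    "Cardinals": (244, 162),
}

for name, (own, opp) in teams.items():
    summary = rao_summary(own, opp)
    print(f"{name}: net {summary['net']:+d}, ratio {summary['ratio']:.2f}")
```

By this measure the Cubs came out 51 runner advancements behind their opponents, while the Cardinals finished 82 ahead of theirs.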
Of course, none of this data really replaces the most important explanation for the Cubs' collapse in 1985: their pitchers melted down, especially at home, when the warm weather hit. Chicago pitchers allowed 6.3 runs per game at home from July 1 to the end of the season: their 23-10 early-season home field advantage withered away to a 41-39 overall mark, or an 18-29 home record after July 1. (One final digression: is it possible that the swing of Wrigley Field from hitters' park to pitchers' park in recent years can be explained, in part at least, by the installation of lights at the venerable old park?)
John's work is mining the treasure chest of data that Retrosheet has made available to us, and while it may not be groundbreaking, it's a valuable example of what can be done. Some enterprising person will generate ideas for stat splits that can isolate some of the truly random elements that make seasonal results deviate from expectation. The 1985 Cubs pitchers may have been burned on two strikes the way Greg Maddux was in 1999, for example. This might be the actual cause of the deviation in batting average on balls in play; then again, it might not, but sooner or later, we're going to know.
If you haven't been to the Retrosheet site, or haven't been back recently, you owe it to yourself to take a look ASAP. Don't be surprised if you wind up spending the better part of a day (and half the night) there. And give John's site a look as well; there's food for thought there. These "retro-pioneers" are mining the past to bring us the future in baseball stats, and by doing so are defying Ecclesiastes' famous pronouncement that "there is nothing new under the sun."
(Of course, they didn't have night games back then . . .)