Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Monday, December 17, 2001

The Retro-Frontier

The Benedictine monks of baseball statistics devote countless hours preserving the history of the game. I greatly appreciate the work they do, and so should you.

The Future of Baseball Statistics Is Coming To Us Out of the Past?

OK, a semi-obligatory non-baseball reference to start us off. (Fear not, friends, no politics this time; enough blood and bile has been spilled here to see us through the holidays.) Instead, a film reference, which those of you with quick eyes have already seen coming. It was my pleasure earlier this year to meet Jane Greer, the memorable femme fatale of Out Of The Past, who was as bad as they come. Jane was the Roberto Petagine of RKO, an actress of great talents whose career was essentially wasted by the caprices of Mr. Capricious himself, Howard Hughes. Sadly, Jane Greer passed away late in August, but she left behind one immortal performance and an exemplary life.

Now, back to baseball, but let's stay with the theme of exemplary lives. Earlier this month there was a small mention of the new features at the Retrosheet web site. The organization founded by Dave Smith has tantalized us with its incredible efforts over the past years to accumulate a detailed backstory of baseball in the form of play-by-play records for games in the deepest recesses of time. Now, with a major upgrade of its web site, it has given us a glimpse of the ultimate baseball encyclopedia.

Many of you have been to the Retrosheet site already, so you know what I'm talking about. Situational data for the years 1978-1990 is in place for both players and teams, and it's a major wealth of information. In addition, game logs for all of twentieth-century major league baseball can be accessed and downloaded.

The team game logs, in conjunction with the statistics, provide us with intriguing insights into performance that aren't apparent otherwise. For example, let's look at the 1985 Chicago Cubs, a team coming off a division championship the previous year and one expected to be in the thick of the pennant race.

Retrosheet's "box scores, narratives, and other goodies" section tells us how that Cub squad got off to a roaring start. On June 11, they were 35-19 and leading the NL East by four games over the Mets, and by six over the Cardinals.

In a stretch that will remind Red Sox fans of their team's late-season swoon in 2001, the Cubs proceeded to lose thirteen straight games, eleven of them head-to-head matches with the Mets and Cards. The Cubs lost eight games in the standings to the Mets, and ten to the Cards in this stretch, and by June 25 they were in fourth place.

By early August, the Cubs were still in fourth place, but only 7 1/2 games out of the lead. Unfortunately, they dropped twelve out of their next fourteen games (including seven in a row to the Mets and Cards) and were never heard from again. From August 2 on, the Cubs had a 23-40 record, and finished 77-84. In a word, thud.

Retrosheet's log shows us that the Cubs were a combined 8-28 against the Mets and Cards (who, as you remember, staged a heck of a divisional race that year). That means that the Cubs were 69-56 against the rest of the league. Counting up the carnage the Cubs suffered at the hands of the Mets and Cards a bit further, we see that the Cubs were 6-13 against these teams in close games (ones decided by two or fewer runs), and a staggering 2-15 in non-close games.
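For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The Record class and its method names are my own invention for illustration, not anything Retrosheet supplies; the won-lost figures are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A won-lost record (hypothetical helper, not a Retrosheet format)."""
    wins: int
    losses: int

    def minus(self, other: "Record") -> "Record":
        # Remove a subset of games from a larger record.
        return Record(self.wins - other.wins, self.losses - other.losses)

    def plus(self, other: "Record") -> "Record":
        # Combine two non-overlapping sets of games.
        return Record(self.wins + other.wins, self.losses + other.losses)

    def __str__(self) -> str:
        return f"{self.wins}-{self.losses}"


# Figures quoted in the article for the 1985 Cubs.
overall = Record(77, 84)
vs_mets_cards = Record(8, 28)
close = Record(6, 13)       # decided by two or fewer runs
non_close = Record(2, 15)

rest_of_league = overall.minus(vs_mets_cards)
print(f"Against everyone else: {rest_of_league}")

# The close and non-close splits should add back up to the 8-28 total.
print(f"Splits reconcile: {close.plus(non_close) == vs_mets_cards}")
```

The same subtraction works for any split you can pull out of a game log, as long as the subset really is contained in the larger record.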

Thanks to Retrosheet, we now can see exactly how the 1985 Cubbies got their clocks cleaned. Now that is a valuable public service.

And with Retrosheet's play-by-play data for these years (also available for download at their site), it's possible for the enterprising analyst to create the almost never-seen in-game data that STATS, Inc. has teased us with over the past twelve years. One such analyst is John Jarvis, whose Collection of Team Season Statistics takes the Retrosheet play-by-play logs and starts to give us the kind of detail many have been clamoring for.

Jarvis is a mathematics professor at the University of South Carolina, and some of you may find his other work interesting (though some may find it a bit recondite). However, almost any baseball fan worth his salted peanuts is going to find segments of John's play-by-play stats irresistible. Check out John's site for a complete look at his team season stat compilations, which cover most of the last quarter-century.

The ones that many find most interesting are the base runner performance stats. For each team in each year (from 1978 through 2000 for both leagues, and for the AL in 1967), John breaks out the base runner advancement going from first to third on a single, and from first to home on a double.

These are interesting stats, though they're probably not pivotal to our understanding of team performance. When we look at the 1985 Cubs in the base runner categories, we see something of an anomaly: despite the fact that the Cubs wound up second in the league in stolen bases, their base runner advancement percentage for first->third on a single and first->home on a double was below league average. This suggests that the '85 Cubs had all of their speed concentrated in a couple of guys (Ryne Sandberg and Shawon Dunston), while the Cards had speed throughout most of their lineup (they had the second highest first->third advancement percentage in the NL that year; only Houston was better at it).

John doesn't show the defensive base runner advancement data, that is, how well teams went from first->third and first->home against the Cubs. But he does provide an offense vs. defense breakout for the number of runners advanced and scored on outs, which gives us a glimpse as to what percentage of overall runs occur in this way. (Like many of us, John is inventing some new acronyms, though he's chosen not to be as colorful as have others: under "Team Sacrifice Summary" you'll see RAO (Runners Advancing on Outs), RSO (Runners Scoring on Outs), and ROH (Runners Out at Home).) About 12.5% of all runs scored via outs in the 1985 NL. One of these days, we might find out what that percentage was for teams in 1916, the height of the Deadball Era. Was it the same, or was it vastly different? (Just for sake of comparison, 11.3% of runs in the 2000 AL scored via outs.)

While "productive outs" are, again, only a minor contributor to overall team success, it's interesting to note that the 1985 Cubs had the worst ratio in the league in terms of runners advancing on outs. They advanced 183 of their own runners via outs, while 234 opposition base runners moved up on outs. (The Cardinals, who won the NL East that year, had a 244-162 RAO ratio.)
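To put those two RAO lines on the same scale, here is a rough sketch that turns each into a ratio and a net differential. The function names are mine, and dividing runners advanced by runners allowed to advance is just one reasonable reading of "ratio"; John's tables may present it differently.

```python
def rao_ratio(advanced: int, allowed: int) -> float:
    """Own runners advanced on outs per opposition runner advanced on outs."""
    return advanced / allowed


def rao_differential(advanced: int, allowed: int) -> int:
    """Net runners advanced on outs (positive favors the team)."""
    return advanced - allowed


# 1985 figures quoted in the article: (own runners, opposition runners).
teams = {
    "Cubs": (183, 234),
    "Cardinals": (244, 162),
}

for name, (advanced, allowed) in teams.items():
    ratio = rao_ratio(advanced, allowed)
    diff = rao_differential(advanced, allowed)
    print(f"{name}: ratio {ratio:.2f}, differential {diff:+d}")
```

By this measure the Cubs come out at roughly 0.78 and the Cardinals at roughly 1.51, which is about as wide a gap as two teams in the same division are likely to show.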

Of course, none of this data really replaces the most important explanation for the Cubs' collapse in 1985: their pitchers melted down, especially at home, when the warm weather hit. Chicago pitchers allowed 6.3 runs per game at home from July 1 to the end of the season: their 23-10 early-season home field advantage withered away to a 41-39 overall mark, or an 18-29 home record after July 1. (One final digression: is it possible that the swing of Wrigley Field from hitters' park to pitchers' park in recent years can be explained, in part at least, by the installation of lights at the venerable old park?)
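The size of that home-field reversal is easier to see as winning percentages. A quick sketch, using only the home records quoted above; the function and variable names are assumptions of mine.

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning percentage as a fraction of decisions."""
    return wins / (wins + losses)


# Home records quoted in the article for the 1985 Cubs.
early_home = (23, 10)   # through June
full_home = (41, 39)    # full season

# Whatever is left after removing the early games is the record from July 1 on.
late_home = (full_home[0] - early_home[0], full_home[1] - early_home[1])

print(f"Home through June:  {early_home[0]}-{early_home[1]}, {win_pct(*early_home):.3f}")
print(f"Home from July 1:   {late_home[0]}-{late_home[1]}, {win_pct(*late_home):.3f}")
```

That works out to about .697 at home before July and about .383 after, a drop of more than 300 points.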

John's work is mining the treasure chest of data that Retrosheet has made available to us, and while it may not be groundbreaking, it's a valuable example of what can be done. Some enterprising person will generate ideas for stat splits that can isolate some of the truly random elements that make seasonal results deviate from expectation: the 1985 Cubs pitchers may have been burned on two strikes the way Greg Maddux was in 1999, for example. This might be the actual cause of the deviation in batting average on balls in play; then again, it might not, but sooner or later, we're going to know.

If you haven't been to the Retrosheet site, or haven't been back recently, you owe it to yourself to take a look ASAP. Don't be surprised if you wind up spending the better part of a day (and half the night) there. And give John's site a look as well; there's food for thought there. These "retro-pioneers" are mining the past to bring us the future in baseball stats, and by doing so are defying Ecclesiastes' famous pronouncement that "there is nothing new under the sun."

(Of course, they didn't have night games back then . . .)


Don Malcolm Posted: December 17, 2001 at 05:00 AM | 9 comment(s)

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. Mike Emeigh Posted: December 17, 2001 at 12:18 AM (#604518)
Re: recent transformation of Wrigley from hitters' park to pitchers' park:

The Cubs are, as far as I know, still limited by agreement with the city to 18 night games at home per season, so that's at best only part of the reason.

About 10 or so years ago, I looked at about 15 years worth of data from games played in Chicago. Generally speaking, Wrigley has throughout most of its history played as a pitchers' park when the weather has been cool (generally early in the season) and as an extreme hitters' park when the weather warms up. This has more to do with wind than ambient air temp; cooler weather in Chicago is generally a result of wind off Lake Michigan, which we all know means that it's blowing in at Wrigley. The 2000 season was one of the coolest summers in recent Chicago history.

There's also some history behind the Cub performances when the park is playing to its extreme. Cub pitchers were generally hurt much more when the park was conducive to hitting than Cub hitters were helped by it - thus the games were usually closer, and the Cubs lost more of them than they might otherwise have been expected to lose given a normal home-field advantage, when the weather heated up. I suspect this phenomenon was at the heart of the Cubs' more-or-less annual "June swoon" when I was growing up in the '60s and '70s.

-- MWE

   2. Dan Szymborski Posted: December 17, 2001 at 12:18 AM (#604519)
Perhaps we should use Wrigley one-year park factors as a measure of global warming (or lack thereof)?
   3. Charles Saeger Posted: December 17, 2001 at 12:18 AM (#604522)
On Wrigley:

Funny you should mention that. I wrote an article, as yet unpublished, addressing that odd swing.

As best I can tell, the difference has been the league. Until expansion in 1962, Wrigley was a more-or-less neutral park, contrary to popular opinion. In 1962, the NL expanded, one of those teams being in Houston's Colt Stadium, a pitchers' park. The Dodgers moved to Dodger Stadium, another extreme pitchers' park, thus moving Wrigley well up the list. That advantage grew a bit, probably because most of the other moves of that period (the Phillies left Shibe Park, the Pirates left Forbes Field, the Reds left Crosley Field, the Cardinals left Sportsman's Park) benefitted the pitcher.

That advantage peaked in the early 1970s, and remained just a bit below that peak until the expansion in 1993. That year, the NL expanded to include the various hitters' havens in Colorado, pushing Wrigley back towards the middle. The park factors are a bit lower under the lights, but only a bit, and I doubt the difference would be significant.
   4. Mike Emeigh Posted: December 17, 2001 at 12:18 AM (#604525)
Dave Poole meant to link to this site.
   5. Jim Furtado Posted: December 17, 2001 at 12:18 AM (#604528)
Yes, this isn't your normal baseball site. I don't see anyone posting this info over on

Thanks guys.
   6. Chris Dial Posted: December 17, 2001 at 12:18 AM (#604531)
Does this mean that Wrigley is no longer a bandbox?
   7. Mike Emeigh Posted: December 18, 2001 at 12:18 AM (#604532)
The cozy feeling of the Friendly Confines is due mostly to the limited foul territory; the listed dimensions in the outfield aren't especially short, although they aren't especially deep, either (355 to LF, 353 to RF, 368 in the alleys, 400 CF).

The 1985 Cubs allowed more than the normal percentage of runners to go first to third on a single (when you define a first-to-third opportunity as a runner on first and no runner on second when a single is hit, which seems to me to be the appropriate way to do it), 34.9% vs. a league norm of 32.0%. They allowed fewer than the normal percentage of runners to go first to home on a double (37.8% vs. 42.0%), or second to home on a single (61.3% vs. 62.1%). They were about league average in allowing runners to advance on outs (28.3% vs. 28.1%). Add it all up and baserunning against the Cubs was just about a wash for the opposition.

-- MWE
   8. David Geiser Posted: December 19, 2001 at 12:19 AM (#604577)
Chris -

My pencil and paper calculations show that Wrigley had a park factor of 86 before the All Star Break in 2001, and a 101 afterward.

I think that Wrigley is more susceptible to fluctuation than most parks.
   9. I am Ted F'ing Williams Posted: December 19, 2001 at 12:19 AM (#604588)
A stathead could do a much better analysis of the lights and their effect at Wrigley than me. I can remember nearly every year since my youth at least one 30+ run game at Wrigley, yet I can't think of one off the top of my head that happened at night. (Maybe I should go look at Retrosheet, hmm?)

One thing I will say - the lighting for Wrigley night games is really bad. It actually seems dark on the field there in comparison to other parks. Maybe one overall "park factor" isn't enough.
