— Where BTF's Members Investigate the Grand Old Game
Wednesday, February 12, 2003
How Important are In-Season Winning Percentages?
Rob takes a look at the predictive value of seasonal records.
Retrosheet has recently made available the Game Logs for every NL and AL game played in the 20th century (1901-2002). The logs provide summary information on each game, including the number of runs scored by each team. A great deal of research can be conducted using this complete collection of game logs. In this article, I want to scratch the surface of an issue that has been around for a very long time: how accurate are within-season win pcts as predictors of a team's end-of-season win pct?
For example, suppose a team starts out 7-3 in its first 10 games. Do we expect it to go on and win 100 games that season? What if the team starts out 35-15 (again a .700 win pct)? Are we now more inclined to predict that it will win 100 games? From a purely empirical perspective, the game logs allow us to look at the evolution of each team's win pct through every season in the 20th century.
You may remember that Bill James wrote about this issue in his 1985 Baseball Abstract, in the Detroit Tigers commentary following the Tigers' blistering 35-5 start in 1984. In my article here, I will present some data and some ideas following in James' footsteps. However, I merely hope to rekindle a dialog on methods and findings, so feel free to chime in with any comments or ideas you may have.
.700 Teams after 40 Games
Let's start our journey by looking at the best teams ever after 40 games into the season. As you probably know, the 1984 Tigers at 35-5 were the best ever (win pct of .875). I have arbitrarily set a cutoff of a .700 win pct to qualify for the elite teams in my study (28-12 or better after 40 games). By my count, there have been 75 teams to get off to such a hot start. The first was the 1902 Pirates (33-7) and the most recent were the 2002 Mariners (28-12) and the 2002 Red Sox (29-11).
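As a rough illustration of the cutoff just described, here is a minimal sketch of how a hot start might be flagged. The input format (a list of "W"/"L" strings in game order) and the function names are assumptions made for this example; they are not the layout of the Retrosheet logs, and ties are ignored.

```python
from fractions import Fraction
from typing import Sequence, Tuple


def record_after(results: Sequence[str], n: int) -> Tuple[int, int]:
    """Return (wins, losses) over the first n games of a season.

    `results` is a hypothetical list of "W"/"L" strings in game order.
    """
    first_n = results[:n]
    wins = sum(1 for r in first_n if r == "W")
    losses = sum(1 for r in first_n if r == "L")
    return wins, losses


def is_hot_start(results: Sequence[str], n: int = 40,
                 cutoff: Fraction = Fraction(7, 10)) -> bool:
    """True if the team played n decided games and won at least `cutoff` of them."""
    wins, losses = record_after(results, n)
    played = wins + losses
    # Exact fractions avoid floating-point trouble right at the .700 boundary.
    return played == n and Fraction(wins, played) >= cutoff


# A 35-5 start like the 1984 Tigers (the order of the results is made up).
tigers_like = ["W"] * 35 + ["L"] * 5
print(record_after(tigers_like, 40), is_hot_start(tigers_like))
```

Using an exact fraction for the cutoff means a 28-12 team qualifies and a 27-13 team does not, with no rounding ambiguity.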
For comparability purposes, in what follows I will exclude the 1981 Dodgers (29-11) and the 1994 Yankees (28-12) because of the shortened seasons due to the strikes. So that leaves us with the 73 teams with the hottest 40-game starts in history. How do they wind up? Are all these great teams? How many clunkers are in the bunch? Are 40 games enough to allow us to get a good read on these teams?
Yes, indeed, many of these were great teams and went on to post fabulous end-of-season records: the 1902 Pirates (102-36, .739), the 1905 Giants (105-47, .691), the 1907 Cubs (107-44, .709), the 1909 Pirates (110-42, .724), the 1931 Athletics (107-45, .704), the 1932 Yankees (107-47, .695), the 1939 Yankees (106-45, .702), the 1970 Orioles (108-54, .667), the 1986 Mets (108-54, .667), the 1995 Indians (100-44, .694), the 1998 Yankees (114-48, .704), and the 2001 Mariners (116-46, .716). So a lot of the game's all-time best teams got off to a hot start after 40 games.
But there were also some clunkers who got off to hot starts. At least, several of these teams are far less than world beaters. Consider the 1907 Giants (started 28-12, then went 54-59, for an 82-71 record), the 1912 White Sox (28-12, 50-64; 78-76), the 1951 White Sox (29-11, 52-62; 81-73), the 1972 Mets (29-11, 54-62; 83-73), and the 2001 Twins (28-12, 57-65; 85-77).
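The rest-of-season records quoted above are simply the final record minus the 40-game start. A small sketch of that arithmetic follows (the function name is made up for illustration):

```python
from typing import Tuple


def rest_of_season(start: Tuple[int, int],
                   final: Tuple[int, int]) -> Tuple[int, int, float]:
    """Subtract a (wins, losses) start from the final record.

    Returns (wins, losses, win pct) for the games played after the start.
    """
    rest_w = final[0] - start[0]
    rest_l = final[1] - start[1]
    return rest_w, rest_l, rest_w / (rest_w + rest_l)


# Two of the teams listed above: a 28-12 start, then a losing record afterwards.
for name, start, final in [("1907 Giants", (28, 12), (82, 71)),
                           ("2001 Twins", (28, 12), (85, 77))]:
    w, l, pct = rest_of_season(start, final)
    print(f"{name}: {w}-{l} after game 40 ({pct:.3f})")
```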
The average end-of-season win pct of these 73 hot starters is .624. Let's break that out a little bit:
Not surprisingly, each of these collections performed worse after their hot starts. This is a well-known phenomenon in stochastic processes (statisticians call it regression toward the mean), and it has taken many names in the baseball analysis lexicon. I simply call it the Up-Down theory. Teams that are up are apt to go down, and vice versa. The same is true for hitters, pitchers, and everyday people. You're never as good as you look when you win and you're never as bad as you look when you lose. The truth is almost always somewhere in the middle.
In fact, only one of these 75 teams (including the two strike-year teams) started off hot and then got hotter. This was the Honus Wagner-led 1909 Pirates, who went 110-42. They started off 28-12 (.700) and then proceeded to go 82-30 (.732) the rest of the season.
There are many ways to rationalize why the vast majority of teams that get off to a hot start cool off. You can think of the .500 level as being a magnet of sorts. The way I like to think about it is the following. Each team has an underlying "true ability". When we observe a very hot team, we can ask ourselves which is more likely: that the team is playing to its actual ability, or that the team's true ability is less than its current record but it has been the recipient of good luck? Since we intuitively believe that the distribution of true abilities of all the teams in the league forms something like a bell curve, this second possibility is always more likely.
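The "true ability plus luck" argument can be checked with a small simulation. This is only a sketch under stated assumptions: true abilities are drawn from a bell curve centered on .500 with a standard deviation of .060 (a number picked for illustration, not estimated from the game logs), and each game is an independent coin flip weighted by that ability.

```python
import random
from typing import List


def simulate_hot_starters(n_teams: int = 50_000, games: int = 40,
                          min_wins: int = 28, mean: float = 0.500,
                          sd: float = 0.060, seed: int = 1) -> List[float]:
    """Return the true abilities of simulated teams that start hot.

    Each team's true ability is drawn from a normal distribution (assumed
    mean and sd), then `games` independent games are played at that ability.
    Only teams with at least `min_wins` wins are kept.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_teams):
        ability = min(max(rng.gauss(mean, sd), 0.0), 1.0)  # clamp to a valid probability
        wins = sum(rng.random() < ability for _ in range(games))
        if wins >= min_wins:
            kept.append(ability)
    return kept


hot = simulate_hot_starters()
print(f"{len(hot)} hot starters, average true ability {sum(hot) / len(hot):.3f}")
```

Under these assumed settings, the average true ability of the simulated 28-12-or-better teams should come out well below .700: most of them are good teams that also got lucky, which is the pattern the real hot starters show.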
.600 Teams after 40 Games
Let's dial down our definition of hot start from a win pct of .700 (28-12) to .600 (24-16). Of course there is more "room" for a team that starts 24-16 to improve over the rest of the season compared to a 28-12 start.
There have been 101 teams that began a season 24-16 after 40 games, starting with the 1903 Pirates and most recently the 2001 Phillies and Cardinals. A quick review of these 101 .600 starting teams shows that there is more of a mixture of quality than the .700 starters. We do have very good teams such as the 1909 Cubs (104-49), the 1919 Reds (96-44), the 1954 Yankees (103-51), the 1968 Tigers (103-59), and the 1976 Reds (102-60). But we also have such mediocrities as the 1924 Red Sox (68-87), the 1927 White Sox (70-83), the 1956 Pirates (66-88), the 1973 White Sox (77-84), the 1976 Rangers (76-86), the 1978 Athletics (69-93), and the 1986 Orioles (73-89).
On average, these 101 teams had a .550 win pct during the rest of the season, and an average end-of-season win pct of .562. The worst end-of-season win pct was .426 and the best was .686. So you can see that there is quite a large spread here. For one thing, a .600 win pct is not sufficiently far away from .500; for another, 40 games do not seem to be enough to allow us to make any definitive predictions. In the next section, I'll keep to a .600 start, but I'll increase the number of games into the season over which the start extends.
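The two averages quoted here are tied together by simple weighting: the end-of-season win pct is the start and the rest of the season combined, each weighted by its number of games. A quick sketch (the season lengths of 154 and 162 games are the standard schedules; individual teams played slightly different totals, so this is only approximate):

```python
def end_of_season_pct(start_pct: float, start_games: int,
                      rest_pct: float, season_games: int) -> float:
    """Combine a start and a rest-of-season win pct, weighted by games played."""
    rest_games = season_games - start_games
    wins = start_pct * start_games + rest_pct * rest_games
    return wins / season_games


# A 24-16 start followed by .550 ball, for the two standard schedule lengths.
for season_games in (154, 162):
    eos = end_of_season_pct(0.600, 40, 0.550, season_games)
    print(season_games, round(eos, 3))
```

Both schedule lengths give a value of about .562 or .563, in line with the average reported above.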
.600 Teams after N Games
Here are the results I found at 20-game intervals. Hopefully, this table will be formatted properly.
You can see that there are fewer .600 teams as you go deeper into the season. I believe this confirms the "luck" explanation described above. Also, the deeper you go into the season, the more confident we can be that the team actually is a .600-ish team.
The next table presents the quartiles of the end-of-season (EOS) distribution of the teams that started out with a .600 win pct.
As you can see, there is a "funnel effect" at play here. The deeper into a season we observe a .600 win pct, the more confident we can be that the team really is (close to) a .600 team. Plus, the spread in the end-of-season win pcts of these teams decreases as the start is extended. Part of this is simply because the "weight" on the start win pct automatically increases as the start covers more of the season, but another part is because the rest-of-season win pcts are closer to the start win pct as well.
Fiddling with Formulas
Okay, where does this leave us? I have presented data showing that there is a systematic relationship between a team's starting win pct and its finishing win pct. The .500 level can be considered a magnet that attempts to pull every team's win pct towards it. The strength of the .500 attraction depends upon two factors: how far away from .500 the starting win pct is (e.g., we saw that the attraction is weaker for a .700 team than for a .600 team), and how deep into the season the start extends.
I have attempted to develop a formula that predicts a team's end-of-season win pct from its starting win pct and the number of games the start consists of. However, my preliminary attempts have not been entirely satisfactory.
My formulas have been of the form EOS = (A * START) + (1-A)*.500, where A depends upon the number of games in the start. I have tried both linear and non-linear functions for the A relationship.
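To make the form of this formula concrete, here is a minimal sketch that applies it and also solves it backwards for A. The only data used are the averages reported earlier for the 24-16 teams (a .600 start and a .562 average finish); the function names are made up, and reusing the resulting A for a different start is purely illustrative, not a fitted model.

```python
def predict_eos(start_pct: float, a: float) -> float:
    """The form given above: EOS = A * START + (1 - A) * .500."""
    return a * start_pct + (1 - a) * 0.500


def implied_a(start_pct: float, eos_pct: float) -> float:
    """Solve the same formula for A, given an observed start and finish."""
    return (eos_pct - 0.500) / (start_pct - 0.500)


# The 24-16 teams averaged .562 at season's end, which pins down A at 40 games.
a_40 = implied_a(0.600, 0.562)
print(round(a_40, 2))

# Illustration only: applying that same A to a hypothetical .650 start.
print(round(predict_eos(0.650, a_40), 3))
```

An A near 1 means the start is taken at face value, and an A near 0 means the prediction is pulled almost all the way back to .500.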
I have also attempted to develop a formula that tells us how confident we should be in our EOS prediction (i.e., the spread in the second table above), without total success. I am thinking that there is probably some systematic formula that links the two ideas based purely on statistical theory, but I have not yet worked this out.
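One possible starting point from statistical theory, offered only as a sketch: if the remaining games are treated as independent, with a fixed win probability, the number of remaining wins is binomial, and the standard deviation of the end-of-season win pct follows directly. The independence assumption, the fixed .550 rest-of-season pct, and the 162-game schedule are all assumptions of this example. It ignores the uncertainty about the team's true ability, so it understates the real spread.

```python
import math


def eos_sd(start_games: int, rest_pct: float, season_games: int = 162) -> float:
    """Binomial standard deviation of the end-of-season win pct.

    Only the remaining games are random; the start is already in the books.
    """
    rest_games = season_games - start_games
    return math.sqrt(rest_pct * (1 - rest_pct) * rest_games) / season_games


# The spread shrinks as the start covers more of the season.
for n in range(20, 160, 20):
    print(n, round(eos_sd(n, 0.550), 4))
```

The shrinking values are the purely mechanical part of the funnel: fewer games remain to move the final record.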
I have thus far done a piecemeal investigation into the relationship between a team's starting win pct and its finishing win pct. I picked a .600 win pct arbitrarily. Hopefully others can do a more systematic investigation, perhaps over all 20th century teams.
And besides just looking at more data, more thinking can be done on the derivation of useful formulas. These formulas can have one of two sources, one purely empirical and the other based upon statistical derivations. Ideally, the best formulas will combine the statistical underpinnings with the empirical observations.
Comments are encouraged.