— Where BTF's Members Investigate the Grand Old Game
Tuesday, February 04, 2003
More on the Modern Bullpen
Rob looks at some simulations based on recent reliever research.
I'd like to follow up on some recent work, most notably by TangoTiger, Bill James, Walt Davis, and Rany Jazayerli, into the way modern managers use their bullpens. The recent work has argued that managers under-utilize their closers by largely reserving them for ninth-inning save situations. The prototypical example is bringing in your best relief pitcher to save a game when leading by 3 runs in the ninth inning. This is a waste of resources, so the argument goes. Any relief pitcher could be expected to get three outs before the other team scores 3 runs.
In fact, some have argued that how managers utilize their bullpens represents the biggest impact of statistics on baseball: a negative impact! When the save statistic gained widespread popularity, managers began to use their bullpens in a way that maximized the total number of saves their closer got in a season, while the real objective should be to minimize the number of runs allowed given some constraints related to not over-using relief pitchers.
I ran a few simulations to look into this issue further. I used the 2001 AL runs scored distribution (by half inning) as the baseline of my simulations. In this league, teams averaged 4.86 runs per game, or about 0.54 runs per half inning.
To start simply, I suppose that all pitchers on each team are of league-average ability and give up runs according to the same underlying distribution exhibited by the 2001 AL. For those interested in the details, in each half inning the pitcher gives up 0 runs with probability 70.53%, 1 run (15.89%), 2 runs (7.34%), 3 runs (3.41%), 4 runs (1.64%), 5 runs (0.70%), 6 runs (0.29%), 7 runs (0.12%), 8 runs (0.05%), 9 runs (0.02%), 10 runs (0.01%).
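As a quick sanity check on the distribution quoted above, the following sketch confirms that the probabilities sum to one and computes the implied scoring rate. The names `LEAGUE_DIST` and `expected_runs` are my own, chosen for illustration. Because the percentages are rounded to two decimals, the implied rate comes out very slightly different from the 4.86 runs per game quoted earlier.

```python
# Half-inning run distribution quoted in the text (2001 AL).
# Index = runs allowed in the half inning, value = probability.
LEAGUE_DIST = [0.7053, 0.1589, 0.0734, 0.0341, 0.0164, 0.0070,
               0.0029, 0.0012, 0.0005, 0.0002, 0.0001]


def expected_runs(dist):
    """Mean runs per half inning for a discrete run distribution."""
    return sum(runs * p for runs, p in enumerate(dist))


total_probability = sum(LEAGUE_DIST)
runs_per_half_inning = expected_runs(LEAGUE_DIST)
runs_per_game = 9 * runs_per_half_inning  # nine half innings per team

print(f"probabilities sum to {total_probability:.4f}")
print(f"runs per half inning: {runs_per_half_inning:.4f}")
print(f"runs per nine innings: {runs_per_game:.2f}")
```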
All pitchers, that is, save one, whom we will call the Stopper of the team's bullpen. The Stopper gives up exactly one half the league-average number of runs. I accomplished this by dividing by two all the probabilities of giving up (positive) runs, and adding the leftover probability to the likelihood of giving up zero runs. So in each half inning the Stopper gives up 0 runs with probability 85.265%, 1 run (7.945%), 2 runs (3.670%), 3 runs (1.705%), etc. Rather than giving up 4.86 runs per game, the Stopper gives up 2.43 runs per game, so has an ERA+ of 200.
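The construction of the Stopper's distribution can be sketched as follows: halve every probability of allowing one or more runs, then give the leftover probability to the zero-run outcome. The function name `halve_scoring` is hypothetical, and the league figures are the rounded percentages from the text.

```python
# League half-inning run distribution from the text (index = runs allowed).
LEAGUE_DIST = [0.7053, 0.1589, 0.0734, 0.0341, 0.0164, 0.0070,
               0.0029, 0.0012, 0.0005, 0.0002, 0.0001]


def halve_scoring(dist):
    """Halve each positive-run probability; the remainder goes to zero runs."""
    scaled = [p / 2 for p in dist[1:]]
    return [1 - sum(scaled)] + scaled


def expected_runs(dist):
    """Mean runs per half inning for a discrete run distribution."""
    return sum(runs * p for runs, p in enumerate(dist))


STOPPER_DIST = halve_scoring(LEAGUE_DIST)

for runs, p in enumerate(STOPPER_DIST[:4]):
    print(f"{runs} runs: {100 * p:.3f}%")
print(f"Stopper runs per nine innings: {9 * expected_runs(STOPPER_DIST):.2f}")
```

Halving every positive-run probability halves the mean exactly, which is why the Stopper's scoring rate is half the league rate no matter what shape the league distribution has.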
My simulations were performed at the granularity of a half-inning, not specific base-out-inning situations. So managers in my simulated baseball must decide to bring in the Stopper at the beginning of an inning, not after a rally is under way. I think this is a reasonable assumption for these preliminary simulations. In fact, this restriction could be considered to be a reflection of the time it takes a closer to warm up in the bullpen (and not wanting a stopper to warm up and then not come into the game). All simulations were performed over more than 1,000,000 games so the standard errors of the reported win percentages are less than .001.
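The standard error claim follows from the usual binomial formula, assuming the simulated games are independent. A minimal sketch (the function name is mine):

```python
import math


def win_pct_standard_error(p, games):
    """Standard error of an observed win percentage over independent games."""
    return math.sqrt(p * (1 - p) / games)


# The worst case is p = 0.5; with 1,000,000 games the error is well under .001.
se = win_pct_standard_error(0.5, 1_000_000)
print(f"standard error of win pct: {se:.4f}")
print(f"equivalent in games per 162-game season: {162 * se:.3f}")
```

In season terms that is under a tenth of a game, so differences of a game or more between strategies are well outside simulation noise.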
Let's get to the results. I have evaluated five different bullpen usage strategies in this toy world. Only one team in the league has a Stopper in these preliminary simulations, and, of course, I will report the winning percentage of that team. In all cases, the stopper never pitches the tenth or subsequent innings.
Case 1: Stopper pitches 9th inning if save situation (lead of 1, 2, or 3 runs). Win Pct = .513 or +2.17 games (relative to .500 in a 162-game season); stopper pitches 42 innings per season.
Case 2: Stopper pitches 9th inning if save situation or game is tied (lead of 0, 1, 2, or 3 runs). Win Pct = .521 or +3.39 games; stopper pitches 58 innings per season.
Case 3: Stopper pitches 8th and 9th innings if a save situation (lead of 1, 2, or 3 runs). Win Pct = .525 or +4.09 games; stopper pitches 88 innings per season.
Case 4: Stopper pitches 7th, 8th, and 9th innings (two innings max) if game is within one run (lead of -1, 0, or 1 run). Win Pct = .539 or +6.33 games; stopper pitches 142 innings per season. I will comment on this workload below.
Case 5: Stopper pitches 7th inning if a 1-run lead, and the 8th and 9th innings if tied or a 1-run lead (two innings max). Win Pct = .533 or +5.35 games; stopper pitches 92 innings per season. This last scenario was an attempt to pare some of the innings of Case 4 in order to get the stopper's workload down to a level similar to that of Case 3. Most people believe that a stopper can pitch effectively for 90 or so innings per season, but 140 innings would be beyond the point where his effectiveness would suffer.
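For readers who want to experiment, here is a rough sketch of how the five cases above could be simulated at half-inning granularity. It is not the code behind the reported numbers, and several details are my own assumptions: the Stopper's team is always the home team (so it pitches the top of each inning), each usage rule is checked inning by inning at the start of the half inning, extra innings are played with league-average pitchers until the tie is broken, and the demo runs far fewer than 1,000,000 games. All function and variable names are hypothetical, and the output should not be expected to match the figures above exactly.

```python
import random
from itertools import accumulate

# Half-inning run distributions from the text (index = runs allowed).
LEAGUE_DIST = [0.7053, 0.1589, 0.0734, 0.0341, 0.0164, 0.0070,
               0.0029, 0.0012, 0.0005, 0.0002, 0.0001]
_halved = [p / 2 for p in LEAGUE_DIST[1:]]
STOPPER_DIST = [1 - sum(_halved)] + _halved

RUNS = list(range(len(LEAGUE_DIST)))
LEAGUE_CUM = list(accumulate(LEAGUE_DIST))
STOPPER_CUM = list(accumulate(STOPPER_DIST))

# Each rule answers: does the Stopper pitch this inning?
# lead = Stopper's team minus opponent; used = Stopper innings so far this game.
CASES = {
    "Case 1": lambda inning, lead, used: inning == 9 and 1 <= lead <= 3,
    "Case 2": lambda inning, lead, used: inning == 9 and 0 <= lead <= 3,
    "Case 3": lambda inning, lead, used: inning in (8, 9) and 1 <= lead <= 3,
    "Case 4": lambda inning, lead, used: (
        used < 2 and inning in (7, 8, 9) and -1 <= lead <= 1),
    "Case 5": lambda inning, lead, used: used < 2 and (
        (inning == 7 and lead == 1) or (inning in (8, 9) and 0 <= lead <= 1)),
}


def half_inning(rng, cum_weights):
    """Draw the runs scored in one half inning."""
    return rng.choices(RUNS, cum_weights=cum_weights)[0]


def play_game(rng, use_stopper):
    """Play one game; return (stopper_team_won, stopper_innings_pitched)."""
    home = away = used = 0
    inning = 1
    while True:
        # Top half: the Stopper's (home) team pitches.
        if inning <= 9 and use_stopper(inning, home - away, used):
            away += half_inning(rng, STOPPER_CUM)
            used += 1
        else:
            away += half_inning(rng, LEAGUE_CUM)
        # Bottom half: skipped from the 9th on if the home team already leads.
        if not (inning >= 9 and home > away):
            home += half_inning(rng, LEAGUE_CUM)
        if inning >= 9 and home != away:
            return home > away, used
        inning += 1


def simulate(use_stopper, games, seed=0):
    """Return (win percentage, Stopper innings per 162 games)."""
    rng = random.Random(seed)
    wins = innings = 0
    for _ in range(games):
        won, used = play_game(rng, use_stopper)
        wins += won
        innings += used
    return wins / games, 162 * innings / games


if __name__ == "__main__":
    for name, rule in CASES.items():
        pct, innings = simulate(rule, games=10_000)
        print(f"{name}: win pct {pct:.3f}, {innings:.0f} Stopper innings per 162 games")
```

With only 10,000 games per case the standard error is about .005, so a much larger run is needed before the differences between cases can be read with any confidence.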
Comparing Case 1 to the other cases strongly suggests that stoppers should be used beyond 1-inning save situations. Comparing Case 1 to Case 5 indicates that 3 or more victories per season may be available to teams that recharacterize the way they use their best relief pitcher.
Comparing Case 3 to Case 5 suggests that pitching your stopper in close games, including tie games and games when the team may even be trailing, can increase a team's win total by around 1-2 games per season.
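The comparisons above are simple differences of the reported figures. A small sketch that tabulates them, using only the numbers given for each case (the dictionary layout and function names are mine):

```python
# Reported results per case: (win pct, games above .500, Stopper innings).
REPORTED = {
    1: (0.513, 2.17, 42),
    2: (0.521, 3.39, 58),
    3: (0.525, 4.09, 88),
    4: (0.539, 6.33, 142),
    5: (0.533, 5.35, 92),
}


def extra_wins(base, other):
    """Additional wins per season from switching base -> other."""
    return REPORTED[other][1] - REPORTED[base][1]


def extra_innings(base, other):
    """Additional Stopper innings per season from switching base -> other."""
    return REPORTED[other][2] - REPORTED[base][2]


for base, other in [(1, 5), (3, 5), (1, 4)]:
    print(f"Case {base} -> Case {other}: "
          f"{extra_wins(base, other):+.2f} wins, "
          f"{extra_innings(base, other):+d} Stopper innings")
```

Note that Case 5 gains its edge over Case 3 with only four more Stopper innings, which is the sense in which innings are being shifted from non-leveraged to leveraged situations rather than simply added.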
To be honest, I have been a skeptic of Bill James' and TangoTiger's arguments along these lines. First, I did not believe that the potential benefit to expanding (recharacterizing) the stopper's role was significant. However, these simulation results seem to indicate that expanding the role and shifting innings from non-leveraged to leveraged can boost a team's win total by 2-3 games. And, remember, these simulations brought in the stopper at the beginning of each inning, whereas we know that the most leveraged situations are when runners are already on base, and one or two key outs are needed.
Second, I believe that the "costs" of making changes to the modern bullpen, though hard to pin down, are much more significant than others have suggested. I have always put a great deal of importance on having well-defined and fixed roles in bullpens. Some have argued that the modern closer's role has become too restrictive, and that other roles could be defined that are more effective (my cases give examples). Nevertheless I remain skeptical that a bullpen could flourish with such "fluid" roles in the real world.
There are other hidden costs of changing bullpen usage, most notably the issue of needing to warm up. One advantage of the ninth inning save is that the stopper can mentally and physically prepare for coming into the game. Having the stopper available to come into the game as early as the 7th inning may play havoc with his preparation.
In addition, he may have to get warmed up repeatedly in anticipation of coming into the game as early as the seventh inning. For example, suppose his team is down by two in the bottom of the sixth with two out and runners on second and third. He'd presumably have to start warming up since if the next batter were to get a hit, the stopper would be called on to enter the game in the top of the seventh. The point is that the stopper now has to start and stop, so to speak, warming up in the bullpen throughout the late innings of many games. It is unclear what effect this would have on his availability and his effectiveness.
We can observe that managers have been managing as if they are fearful of over-using their closers. They prefer the sure thing of having their ace pitch effectively for 60 or so innings rather than risk lowering his effectiveness over 100 or so innings.
In summary, my simulations seem to confirm what the advocates of an expanded role for the modern closer have been arguing. The potential in additional team victories is significant. However, there may be hidden costs, of unknown size, accompanying these changes.
Areas for further research include: investigating the team effects of bullpen construction (e.g., how you allocate a given number of innings during a season among your varying quality relief pitchers); investigating these issues at the granularity of base-out-inning situations (since most of the highest leverage situations have runners on base), maybe even the specific batters due up; developing methods to get at the importance of fixed roles and the need to warm up (the "hidden" costs); the effect of pitching your stopper for more than one inning per appearance (both good and bad); etc.