Primate Studies: Where BTF's Members Investigate the Grand Old Game
Thursday, April 10, 2003
Win Values 2002
Did Barry Zito really deserve the Cy Young Award?

Introduction
Last year I introduced a new stat to measure the value that a starting pitcher brought to his team on a game-by-game basis. In short, win values assigns a number between -1 and +1 to each starting pitcher for each game to reflect how well the pitcher performed given the run support he was provided. The win value figure is the difference between two probabilities: the probability that the starting pitcher's team would have won the game, given its offensive run support, with the pitcher's actual performance, and the probability that it would have won had a league-average pitcher started in his place.
As an example, a pitcher who wins a game 1-0 earns a very high number of win value points, since his team would very likely have lost the game (scoring only 1 run) with league-average pitching. On the other hand, a pitcher who wins a game 10-0 earns only a few win value points, since his team would very likely have won the game (scoring ten runs) with league-average pitching. A complete description of the win value methodology and results from previous seasons is available here.
Win Values: A New Method to Evaluate Starting Pitchers

Top AL Starting Pitchers in 2002
Retrosheet just released the play-by-play data for the 2002 season. Thus, I am now able to calculate the starting pitcher win values for last season. The following table presents the top ten starting pitcher performances according to win values in the American League. I wish to thank David Smith, Tom Ruane, Ray Kerby, Tangotiger, and all the Retrosheet volunteers for making available the 2002 season data.

Table 1: 2002 AL Win Value Leaders
The entries in the table are the pitcher's won-loss record, his park-adjusted ERA relative to league, his wins above average (which is based solely upon his ERA+ and number of innings pitched), his win values relative to a league-average pitcher, and his win values relative to a replacement-level pitcher. Bartolo Colon's stats reflect his pitching in both leagues last season; Colon accumulated 2.78 win values relative to league average in the AL and 1.21 win values in the NL.
As you know, Barry Zito won the 2002 AL Cy Young award, largely due to his 23-5 record. Pedro Martinez finished second in the voting, with Derek Lowe third; the only other pitcher to appear on a ballot was Jarrod Washburn, who got a lone third-place vote.
Win values suggests that the AL pitcher who contributed the most to his team last season was Tim Hudson, a pitcher who did not even appear on a single Cy Young ballot. The reason, of course, is that Huddy only went 15-9 and Cy Young voters are enamored with a pitcher's won-loss record.
A game-by-game look at Hudson's season reveals why win values tabs it as the best in the AL.
Table 2: Tim Hudson's 2002 Game-by-Game Win Values
* Last inning Tim Hudson appeared in (partial innings, including facing one or more batters without recording an out, count as a full inning). ** The score at the conclusion of both halves of the last inning Hudson appeared in; Oakland's score is given first.
Let me describe Hudson's first game to make sure the reader understands what is being reported in the table. The first row indicates that Hudson's first start of the season was at home against Texas on April 2. Hudson pitched into the 7th inning, and the score at the conclusion of the 7th inning was 2-1 in favor of Oakland. The A's won the game 3-2 (details below). Hudson got a no-decision in the game. Hudson's performance was worth .370 win value points. That is, Hudson put the A's into a position (leading 2-1) in which his team had a .370 higher probability of winning the game after 7 innings than if a league-average pitcher had started that day for Oakland.
You will see that Hudson's hard luck stayed with him for the entire season. Games in which Hudson pitched very well but came up without a victory include the following.
By my count, if neither team had scored any more runs after Hudson left each of his starts, he would have gone 21-9 (with four ties). Since he actually went 15-9, it is clear that Hudson's bullpen mates cost him several victories. In short, Hudson's W-L record does not do justice to how great a season he actually had.
Barry Zito, Pedro Martinez, and Derek Lowe each had a better ERA+ and a better W-L record than did Hudson. So how can their win values be lower?
Seasonal averages can mask a myriad of issues that a detailed game-by-game analysis can take into account. Zito pitched very well last year, but there were several games in which his team did not need him to pitch so well. For example, here are the scores in six of Zito's starts when he left the games: 10-3, 11-1, 9-3, 7-0, 7-1, and 8-1. Even with a league-average pitcher, the A's would likely have won these games, and accordingly Zito is given little credit for winning them.
Similarly, in five of Derek Lowe's starts the Red Sox outscored the opposition 49-0 when Lowe was the pitcher of record (10-0, 9-0, 9-0, 10-0, and 11-0). Obviously Lowe pitched great in these games, but he receives little win value credit since his run support was so high in each of them. This is an illustration that win values is a "value" stat (backward looking) rather than an "ability" stat (forward looking).
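To make the run-support effect concrete, here is a minimal sketch of the core idea only, not the full win value calculation. It assumes a league-average pitcher's runs allowed follow a Poisson distribution with a hypothetical mean of 4.6 runs, treats the runs allowed as final, counts a tie as half a win, and ignores park factors and innings pitched. The function names are made up for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k runs under a Poisson model (an assumption)."""
    return exp(-mean) * mean ** k / factorial(k)


def win_prob(runs_scored, runs_allowed_dist):
    """P(win) for fixed run support; a tie is counted as half a win."""
    total = 0.0
    for runs_allowed, prob in runs_allowed_dist.items():
        if runs_allowed < runs_scored:
            total += prob
        elif runs_allowed == runs_scored:
            total += 0.5 * prob
    return total


def simple_win_value(runs_scored, runs_allowed, league_mean=4.6, max_runs=30):
    """Win probability with the actual pitcher minus that with an average one."""
    # Hypothetical league-average runs-allowed distribution.
    league_dist = {k: poisson_pmf(k, league_mean) for k in range(max_runs + 1)}
    with_pitcher = win_prob(runs_scored, {runs_allowed: 1.0})
    with_average = win_prob(runs_scored, league_dist)
    return with_pitcher - with_average


# A shutout with 1 run of support versus a shutout with 10 runs of support.
for score in [(1, 0), (10, 0)]:
    print(score, round(simple_win_value(*score), 3))
```

Under these assumptions the 1-0 shutout earns nearly a full win value point while the 10-0 shutout earns only a small fraction of one, which is the pattern described for Zito's and Lowe's lopsided wins.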
Pedro Martinez had another outstanding season in 2002. The main reason why he doesn't finish higher on the win value ladder is that Pedro only started 30 games last season. Contrast that to Zito's 35 and Hudson's 34. I am confident that had Pedro started 34 or 35 games, he would have had the league's highest win value total. The Red Sox rightfully protect their franchise pitcher's arm, so I am not suggesting that they pitch Pedro more; I am merely pointing out why he finished fourth in win values.
Martinez and Lowe illustrate another issue related to how win values are calculated. In the win value system, as in WAA, Pete Palmer's TPI, and Michael Wolverton's SNW, the pitcher is given total credit for run prevention, be it via the strikeout or the groundout. Thus, Pedro receives no additional credit for being such a dominating strikeout pitcher (nor do Clemens, Johnson, Seaver, et al.).
Barry Zito, Pedro Martinez, and Derek Lowe all had outstanding seasons in 2002. However, according to win values, none contributed more to his team winning (in a probabilistic sense) than did Tim Hudson.

Top NL Starting Pitchers in 2002
Table 3 presents the top ten National League starting pitchers according to win values.
Table 3: 2002 NL Win Value Leaders
Randy Johnson dominated the NL once again in 2002 (especially in light of Curt Schilling's late-season slide) and unanimously won his 4th consecutive and 5th overall Cy Young award. Schilling was a near-unanimous second-place finisher. John Smoltz and his 55 saves came in third for the Cy Young. Roy Oswalt tied Eric Gagne for fourth.
We don't need to spend a lot of time discussing the NL. Any evaluation tool that claims someone other than Randy Johnson was the best starting pitcher in the NL last season is highly suspect. However, it is worth noting that Curt Schilling and Randy Johnson were virtually tied in win values after Schilling's September 15 start. That night Byung-Hyun Kim blew a save in the ninth, or else Schilling would have gone to 24-5 (Johnson was 22-5 at the time). Johnson closed out the season with two great victories, whereas Schilling's last two starts were both struggling losses.
Greg Maddux had another great season last year and, due to Schilling's late-season fade, actually finished second in win values in 2002. In the next section of the article, I will update Johnson, Maddux and the other top active pitchers' career win value totals to include the 2002 figures.
Updating Top Active Pitchers' Career Win Values
In my previous article, I presented a table of the greatest starting pitchers of the last thirty years, the seasons for which Retrosheet has made available the play-by-play data, according to win values.
Win Values: Updated for 1969, 1974-77
Below, I update the active pitchers included in that table to incorporate their 2002 seasons. Note that the table is sorted by career win values relative to replacement level, though win values relative to league average are also presented.
Table 4: Win Values for Top Starters of Last 30 Years
Roger Clemens is deemed to be the greatest starting pitcher in the last 30 years, both in win values relative to replacement level and in win values relative to league average. At this point in his career, Clemens is treading water, so he'll likely not add much to his totals.
Greg Maddux is still one of the best pitchers in the majors, and has a good shot of surpassing Clemens in the next two seasons.
Although Randy Johnson has not yet cracked the top ten in win values relative to replacement, he is currently fourth in win values relative to league average, and may well eclipse Tom Seaver to move into third place in this measure before he retires.
Pedro Martinez is the other member of this modern quartet (surefire Hall of Famers). Pedro is still in mid-career, so he hasn't yet accumulated a ton of win values relative to replacement, at least compared to pitchers who pitched for 20 years. On the other hand, the meteoric first half of Pedro's career (highest ERA+ in history) yields the sixth highest win values relative to league average, behind only Clemens, Maddux, Seaver, Johnson, and Palmer.
Other active pitchers who are making inroads on the list of the best modern starters include Tom Glavine, Kevin Brown, Chuck Finley, Mike Mussina, the recently revived David Cone, Kevin Appier, Curt Schilling, John Smoltz (now a closer), and David Wells. Conventional wisdom puts Glavine into the Hall of Fame, and none of the other guys as of now, though Mussina will likely have a solid case before he ultimately retires.

Concluding Remarks
In this article I have presented updated results of my new win value system that evaluates starting pitchers. Thanks to the wonderful volunteers at Retrosheet, the required data for the 2002 season recently became available. Thus, I have calculated win values for each league for last season.
We saw that Randy Johnson contributed the most among NL starting pitchers to his team in 2002, and deservedly won his 4th consecutive and 5th overall Cy Young award. In the AL, on the other hand, win values suggests that Tim Hudson, a pitcher who did not appear on a single Cy Young ballot, contributed the most among AL starting pitchers. As always, Cy Young voters and the general public place too much emphasis on a pitcher's won-loss record.
Many sophisticated analysts disregard a pitcher's W-L record altogether and instead focus entirely upon ERA (ERA+). I have found that there may be valuable information in a pitcher's W-L record, especially looking backward. Accordingly, I developed the win values system to be the best integration of W-L and ERA+ data when attempting to determine how much the pitcher contributed to his team. In my system, run prevention is critical, but its value in helping a team win must be measured in the context of the number of runs the team's offense scores.
Comments on the method or the 2002 results are encouraged.

Reader Comments and Retorts
1. Dolf Lucky Posted: April 10, 2003 at 01:56 AM (#610311)
"Greg Maddux is still one of the best pitchers in the majors, and has a good shot of surpassing Clemens in the next two seasons."
Really?
Even forgetting the small sample so far this season (Clemens good, Maddux dreadful), Clemens' xERA compared to AL avg. was better than Maddux's compared to the NL average. The only reason their actual ERAs were reversed (drastically so) was that Clemens was about the most unlucky pitcher in the AL (over 30% BIP for hits) while Maddux managed to give up bunches of hits while avoiding giving up earned runs (many unearned runs plus luck of timing). Given Clemens' K rate of over 1 per inning last season and Maddux's declining K rate, I'd guess that Maddux is much closer to losing his effectiveness than is the Rocket.
I think this eliminates a lot of the context problems. The ones that remain are (a) defense and (b) assuming that a pitcher's team offense has no effect on how he pitches (quality/runs AND innings).
I assume that after Hudson left, his team scored 3 more runs, making it 6-1...then the pen let up 4 runs making it 6-5. The lead was never let up, so it was always his decision.
(Or something along those lines)
Does this factor in the hitting quality of the opponent?
There is. It's called Player Win Averages, from the Mills Brothers, developed over 30 years ago. It did not consider fielding. It is based on play-by-play analysis.
I do something similar for relievers, with Leveraged Index.
Tom Ruane ran something similar based on base/out states (and not inning/score), for 1980-1997. You can look for that on baseballstuff.com
The 2002 update was written before the 2003 season got under way, so I stand by my Clemens-Maddux comments. There is nobody who is a bigger Roger Clemens fan than I am. I have written extensively on Clemens. I consider him the 3rd greatest pitcher of all time, and gladly sponsor his page over at Baseball-Reference.com. I am no fan of Greg Maddux, though Maddux is also one of the all-time greats. The win value system considers them that way too.
The reason why Tim Hudson's 6/20 start had a slightly higher win value than his 8/14 start is due to the park factors. I cannot emphasize enough that the win value system does not consider what happens after the pitcher leaves the game (actually after the full inning).
Depot's proposed a different system that does not consider the pitcher's specific run support in the game but instead uses the league distribution of runs scored. In fact, this is what Michael Wolverton does in his Support Neutral Win system. See my original article for a discussion of why I chose to go in another direction than SNW.
The win value system is another way of looking at pitchers. It was not intended to replace SNW or ERA or anything like that. As someone posted, it was designed in a similar vein to Bill James' win shares, in which he attempts to allocate credit for team winning. James does this on an aggregate seasonal basis whereas I am able to do this for pitchers on a game-by-game basis. Again, it is purely a backward-looking evaluation system (value) rather than forward-looking (ability).
Yes, this same approach I take for pitchers can be applied to hitters. It is akin to Win Probability Added and other "value-added" measures. I cannot fathom the backlash my system seems to have caused, since it is in a long line of methods which sophisticated analysts have embraced for many years.
Finally, the system does not currently take into account the strength of the opposition (strength of schedule). However, I recently developed a system to estimate the strength of opposition a starting pitcher faces and wrote this up in a separate article. My goal is to incorporate this aspect into the next version of the win value system.
Awesome! Thanks, Rob.
...Bill James' win shares in which he attempts to allocate credit for team winning. James does this on an aggregate seasonal basis whereas I am able to do this for pitchers on a game-by-game basis.
Do you find there is a significant difference between the win shares rankings and the win value rankings? Significant enough to be worth the cost of looking at game-by-game instead of total season, assuming there even is a cost (in time and effort) to compute Win Value over Win Shares?
I appreciate David's point, though I think it may overlook two of the key aspects of the win value system. First, seasonal data consists of a hodgepodge of game-by-game data, and game-by-game data is the most useful, especially for pitchers. For example, a pitcher's average run support can be misleading since run support is notoriously lumpy (non-normally distributed). The same would be true, probably even more so, for any measure of bullpen support.
Second, I really think there is room in our analysis toolkit for a value stat for starting pitchers. That is, a stat that attempts to ascribe credit or blame for winning or losing each game, especially with regard to the starting pitchers. We have many useful pitching stats that are great for measuring ability (forward looking). But, as far as I knew when I developed win values, there were no value stats (backward looking) for starting pitchers.
It's been several months since I read Rob's Win Values, but from what I remember, this is not a true statement. It doesn't matter what the score is when he leaves the game, but just what the final score of the game was.
For Cy Young, unlike the MVP, pitchers are recognized for their performance in isolation from their teammates' contributions (or so the theory goes). To award a Cy Young, you shouldn't care what the team run support is.
==========
Consider a team that scores 7 runs in a 9-inning game. Suppose I find a team that scores 7 runs in a game will have scored 5 runs at the conclusion of the 6th inning 10% of the time. I also know the distribution of final scores of every team that scored 5 runs at the conclusion of the 6th inning. Say they wind up with 5 runs 12% of the time, 6 runs 20% of the time, 7 runs 25% of the time, ..., and 15 runs 1% of the time.
I can calculate this bootstrapped distribution of final scores for every possible number of runs scored at the conclusion of the 6th inning. To find the ultimate "could have been" distribution of final scores, I would then weight these probability distributions of each possible runs scored outcome by the respective probability of having that many runs scored at the conclusion of the 6th inning (10% in the case for starting with 5 runs in the example above).
The result is a "smearing" of the run support provided in a game. For example, this method may find that a team that actually scored 7 runs "could have" scored runs with the following probabilities: 0 runs (1%), 1 run (2%), 2 runs (4%), 3 runs (6%), 4 runs (7%), 5 runs (9%), 6 runs (12%), 7 runs (15%), 8 runs (10%), 9 runs (8%), 10 runs (7%), 11 runs (6%), 12 runs (5%), 13 runs (4%), 14 runs (3%), and 15 runs (1%). I would then use this "could have been" smeared probability distribution for the pitcher's possible run support in evaluating his outing.
Now that I have answered some questions that you may have had, let me try to summarize the conceptual approach I take. I am introducing a method that evaluates a starting pitcher's contribution to his team's chance of winning the game if the score is RS to RA when he leaves the game at the conclusion of the Zth inning. I will first "smear" the run support based upon RS and Z using a backwards Bayesian bootstrapping method. That will give me a probability distribution that the team could have scored X runs at the conclusion of the Zth inning, where X ranges from 0 to 25, say.
Next, using the smeared run support distribution, I will estimate the probability that the team would win a game when giving up RA runs at the conclusion of the Zth inning. Then, using the smeared run support distribution, I will estimate the probability that the team would win this game with league-average pitching. I then will subtract these two probabilities to derive the pitcher's win contribution for that game. For those readers interested in a mathematical representation, all the formulas are presented below.
===========
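The weighting step in the excerpt above can be sketched as a mixture of distributions. This is only an illustration under stated assumptions: every probability table below is made up, the names `smear_run_support`, `win_prob`, and `win_value` are hypothetical, a tie is counted as half a win, and nothing here models what the bullpen allows after the starter leaves.

```python
def smear_run_support(p_mid_given_final, p_final_given_mid):
    """Weight each 'final score given mid-game score' distribution by the
    probability of having had that mid-game score."""
    smeared = {}
    for mid_runs, weight in p_mid_given_final.items():
        for final_runs, prob in p_final_given_mid[mid_runs].items():
            smeared[final_runs] = smeared.get(final_runs, 0.0) + weight * prob
    return smeared


def win_prob(run_support_dist, runs_allowed_dist):
    """P(win) over both distributions; a tie is counted as half a win."""
    total = 0.0
    for scored, p_scored in run_support_dist.items():
        for allowed, p_allowed in runs_allowed_dist.items():
            if scored > allowed:
                total += p_scored * p_allowed
            elif scored == allowed:
                total += 0.5 * p_scored * p_allowed
    return total


def win_value(smeared, runs_allowed, league_runs_allowed_dist):
    """Difference between the two win probabilities described in the excerpt."""
    actual = win_prob(smeared, {runs_allowed: 1.0})
    average = win_prob(smeared, league_runs_allowed_dist)
    return actual - average


# Toy tables (invented numbers): runs after six innings for a team that
# finished with 7, and final-score distributions for each of those cases.
p_mid_given_final = {4: 0.3, 5: 0.4, 6: 0.3}
p_final_given_mid = {
    4: {4: 0.5, 5: 0.3, 6: 0.2},
    5: {5: 0.5, 6: 0.3, 7: 0.2},
    6: {6: 0.5, 7: 0.3, 8: 0.2},
}
league_runs_allowed = {2: 0.2, 3: 0.3, 4: 0.3, 5: 0.2}  # also invented

smeared = smear_run_support(p_mid_given_final, p_final_given_mid)
print(smeared)
print(win_value(smeared, 3, league_runs_allowed))
```

The smeared distribution still sums to one, because each conditional table does and the weights do as well; real tables would come from play-by-play data rather than from invented values like these.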
So, if I understand this correctly, Rob doesn't care what the score was when the pitcher is taken out (as almost all win probability measures do), but he does this reverse process in order to account for the "run in the first is worth as much as the run in the 9th" situation.
I think the bullpen issue might be valid (as would fielding), but I'm not sure.
The concept of a win probability is not new. It dates back at least as far as the Mills Brothers in the 1960s. Doug Drinen used it in his relief pitcher stat, Michael Wolverton uses it in his support neutral win stat for starting pitchers, and all of the value-added approaches rely upon win probabilities in some way.
Here is my characterization of a win probability. For each game, of course, one team wins and one team loses. Roughly speaking, one "win probability" unit is allocated among the winning team's players, just as negative one unit is allocated among the losing team's players.
[This is not strictly correct because players on losing teams can do good things and players on winning teams can do bad things. But let's not worry about that right now.]
The basic idea is that a batter is allocated win probabilities based upon how he changes the game situation in each plate appearance relative to what a league-average batter would likely have done. For example, if a batter hits a 2-out, bottom-of-the-ninth grand slam home run to win a game 4-3, he will get a large allocation of that one win probability unit since he took his team from a very likely losing situation to a win.
Suppose the next game the same guy hits a 2-out, bottom-of-the-ninth grand slam home run that only makes the score 15-4 (they were previously trailing 15-0). In this instance, the same event is worth far less due to the game score situation. The grand slam in this case barely changes the probability that his team will win from, say, .00001 to .00002, whereas the game-winning grand slam changed the probability of winning from, say, .20 to 1.00. Surely, everybody intuitively understands this.
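As a small illustration of the two grand slams, the credit for an event can be written as the win probability after it minus the win probability before it. The probabilities are the "say" figures from the example above; the function name is hypothetical and this is a sketch, not any published system's exact formula.

```python
def win_probability_added(before, after):
    """Credit for one event: change in the team's chance of winning."""
    return after - before


# Walk-off grand slam: from a .20 chance of winning to a certain win.
walk_off = win_probability_added(0.20, 1.00)

# Grand slam while trailing 15-0: from .00001 to .00002.
blowout = win_probability_added(0.00001, 0.00002)

print(walk_off, blowout)
```

The same swing of the bat is worth about .80 of a win in the first case and almost nothing in the second.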
What my new win value system does is apply this principle, with a few wrinkles, to starting pitchers. Consider two games. In the first game the team wins 3-2. The winning pitcher will likely get a lot of credit (allocation of that fixed one unit of win probability) in this situation. The next game the team wins 14-2. Here the winning pitcher will get far less of the credit, because, as David points out, the offense was really the reason the team won the game.
But that is exactly why the pitcher will get less of the fixed one unit. You cannot fix the pitcher's credit and allow the credit to the offense to vary, since by definition their sum must always be a constant (ignoring defense for the time being).
Neither win values nor win probabilities imply any causality. The pitcher has no control over the score (just as the batter who hit the grand slam had no control over the score of the game before his at bat, and is not presumed to have the ability to hit better when the game is on the line). But so what? The allocation of credit does depend upon the score, and thus the allocation of credit going to the pitcher will depend upon the score.
If this makes you uncomfortable, then you don't fully appreciate the concept of win probabilities. Go back and reread the Mills Brothers' "Player Win Averages" or Drinen's "Win Probability Added" or Wolverton's "Support Neutral Wins" or Gary Skoog's "Runs Created: A Value-Added Approach".
And please stop saying that the win value system is bogus. Let's reserve that degree of criticism for stats or approaches that have no redeeming value whatsoever, such as the much-discussed HEQ stat.
1 - Associating the Cy Young Award with a metric. Unlike the MVP, the Cy Young "should" go to the pitcher who performed the best, without consideration of his teammates' performance. From that standpoint, the consideration should either be that you assume league-average run, bullpen and fielding support (or some other neutral baseline), or you also include a "pitching to the score" component. Pitching to the score occurs in real time.
2 - The ingenious bootstrapping approach. While all other win probability measures are based in real time (win probability before and after the occurrence of the event, with the change in win probability allocated in some way to the pitcher, hitter, fielders, and runners), Rob takes a different approach, as noted in my cut and paste. If I am understanding it correctly, it doesn't matter what the score was when the pitcher was removed, but rather what the final score of the game was. A pitcher taken out of a 3-2 game that ends at 3-2 or 10-2 will have a different win value. A pitcher taken out of a 3-0 or 8-0 game, with the final score being 9-0, will have the same win value. The process speaks to the readers who think a run is a run is a run, within the context of the same game.
The question to ask is the merits of figuring win values on a game-by-game basis, as opposed to play-by-play. This method is the only one that allows you to satisfy the condition "a run in the 1st is as valuable as a run in the 9th". From that standpoint, this method deserves to be looked at in greater detail.
3 - Bullpen and fielder support. These are valid points that I've not really looked at. Fielder support is very legitimate for pitchers on teams with good/bad fielders. The impact of this can be considered by looking at the 2002 Angels, who may have contributed over 0.5 runs / game (or .05 wins) over league average. For a 234-inning pitcher, that's 1.3 wins, which is HUGE. However, Win Values is not alone in treating a team's fielders as league average. For bullpen support, you probably have similar issues.
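The 1.3-win figure in point 3 can be checked with a few lines of arithmetic. The conversion of 10 runs per win is an assumption here, inferred from the pairing of 0.5 runs per game with .05 wins; the variable names are hypothetical.

```python
innings_pitched = 234
fielding_runs_per_game = 0.5   # figure quoted for the 2002 Angels
runs_per_win = 10              # assumed: 0.5 runs per game ~ .05 wins per game

# 234 innings is the equivalent of 26 nine-inning games.
games_equivalent = innings_pitched / 9
fielding_wins = games_equivalent * fielding_runs_per_game / runs_per_win

print(games_equivalent, fielding_wins)
```

That works out to 26 game-equivalents at .05 wins each, or about 1.3 wins of fielding support over a full season.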
Perhaps we can center the discussion on the mechanics of the measure, rather than making conclusions that it's no good (or is the best thing in the world).
The term "value" can really mean anything, and therefore, it's best that it is defined before discussing it.
This measure attempts to quantify the win impact a pitcher has on a game, after the outcome of the game has been decided. It does this by using the final run support of the game, and "reversing it" through a Bayesian process to establish probability and frequency rates, back to when the pitcher was removed from the game. You then move forward, using league average rates to establish a win probability for when the pitcher left the game, with this new run support. The difference between the win/loss of the game, and this new win probability is his "value".
Did I get that right Rob?
It's a fascinating process, though I'm still not sure of what it's really telling us.
Win values are an estimate of how much the pitcher's performances contributed to his team's winning over the course of the season, taking his game-by-game run support as a given. This is the best approach to estimate a pitcher's value-added wins in the sense described by the Mills Brothers, et al.
win values aren't really that similar to win shares, either, because a premise of win shares is the size of the pie. i'm guessing that win values penalizes good pitchers on dominant teams, since their victories would generally be more lopsided.
Let's say Koufax would pitch as well for the 1927 Yankees. Why is this worth less? Because the 1927 Yankees would have won with merely good pitching; it didn't require pitchers of Koufax's ability. A game won 8-2 is as valuable in the standings as a game won 8-5.
Or, to say it another way. Everybody would agree that a pitcher who hits a home run and pitches a shutout to lead his team to a 1-0 victory deserves a great deal of the credit for the win. But once you agree to that premise, the relative value argument kicks in. A pitcher who pitches a shutout in a 10-0 win must, perforce, have less relative value since the hitters deserve at least some of the credit. The second statement follows directly from the first.
or to respond in another way to your example: what if both teams could bid on gehrig? the '65 dodgers would become a high-offense team, and thus dominant; the '27 yankees were a better team anyway, but would not have been nearly so dominant without gehrig.
as for the pitcher who homers and pitches a 1-0 shutout, he deserves credit for his home run because he hit it, not because the game was so close.
the only way it makes sense to me to value the 1-0 shutout pitcher above the 10-0 shutout pitcher is to say that the 1-0 shutout pitcher PITCHES BETTER. that doesn't make sense to me, since both pitchers allowed zero runs.
maybe it's true that sometimes a pitcher can boost his game when the pressure's on, and the ability to do so is a value unaccounted for by traditional metrics. but i don't think you've shown that your metric can establish or evaluate this ability.