Primate Studies— Where BTF's Members Investigate the Grand Old Game
Monday, August 19, 2002
Win Values: A New Method to Evaluate Starting Pitchers - Part 5
Empirical Data for AL 2000
Part 1: Introduction
Part 2: Conceptual Framework
Part 3: High-Level Results
Part 4: Formulas
Part 5: Empirical Data for AL 2000
Part 6: Example: David Wells in AL 2000
Part 7: Yearly Results for 1978-2001
Part 8: Top Stars
Part 9: Concluding Remarks

In this section I will present examples of the empirical data corresponding to all the variables entering the Win Values formulas presented above. For convenience, the AL 2000 season will serve as the representative season. The following tables present the empirical distributions using the entire AL 2000 season as the database. In the Win Values framework, these distributions are derived separately for each league and for each season under study.

Table 4: Win Probs with Average Pitching, AWin(RS,Z), for AL 2000
Columns: at the conclusion of the Zth inning (Z = 1 to 9)

Runs Scored (RS) |   1   |   2   |   3   |   4   |   5   |   6   |   7   |   8   |   9
0                |  .436 |  .383 |  .346 |  .288 |  .235 |  .189 |  .126 |  .051 |  .000
1                |  .580 |  .496 |  .403 |  .347 |  .332 |  .277 |  .197 |  .146 |  .068
2                |  .597 |  .574 |  .535 |  .463 |  .371 |  .318 |  .296 |  .260 |  .204
3                |  .742 |  .715 |  .622 |  .599 |  .531 |  .444 |  .390 |  .321 |  .281
4                |  .800 |  .786 |  .725 |  .680 |  .633 |  .583 |  .516 |  .440 |  .431
5                |  .875 |  .840 |  .810 |  .788 |  .692 |  .646 |  .641 |  .615 |  .551
6                |  .950 |  .870 |  .839 |  .820 |  .810 |  .772 |  .702 |  .651 |  .610
7                |  .970 |  .913 |  .878 |  .857 |  .840 |  .820 |  .800 |  .759 |  .728
8                | 1.000 | 1.000 |  .958 |  .900 |  .891 |  .874 |  .850 |  .845 |  .840
9                | 1.000 | 1.000 |  .975 |  .965 |  .955 |  .945 |  .923 |  .880 |  .860
10               | 1.000 | 1.000 | 1.000 |  .975 |  .960 |  .950 |  .940 |  .932 |  .887
11               | 1.000 | 1.000 | 1.000 | 1.000 |  .975 |  .960 |  .950 |  .940 |  .900
12               | 1.000 | 1.000 | 1.000 | 1.000 |  .980 |  .965 |  .960 |  .950 |  .940
13+              | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Table 4 reports the empirical win probabilities using the entire AL 2000 season. The first row of the table presents the probabilities that a team will go on to win the game when it has scored exactly 0 runs at the conclusion of each inning 1-9, given average pitching. Of course, based solely upon runs scored information, before the game starts each team has a .500 expected win prob. If a team is scoreless after 1 inning, its win prob falls to .436, after 2 innings to .383, and so on until it has a .000 chance of winning with 0 runs after 9 innings.

Perhaps the better way to look at the table is by column, that is, by what inning the game is in. After 5 innings, a team that has yet to score has a .235 win prob with average pitching, a team with 1 run has a .332 win prob, a team with 2 runs has a .371 win prob, and so on all the way up to a 1.000 win prob if the team has scored 13 or more runs.

As described above, these win probabilities are based upon every game in the entire AL 2000 season. Accordingly, I treat these probabilities as reflecting the win probabilities a team would have with "average" pitching.
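
To make the row and column readings of Table 4 concrete, here is a minimal lookup sketch in Python. The table values are transcribed from Table 4; the names `AWIN` and `awin` are illustrative assumptions and not part of the Win Values formulas themselves.

```python
# Table 4 (AL 2000), transcribed from the article.
# Keys are runs scored (RS); key 13 stands for the "13+" row.
# Each list holds the win probability after innings Z = 1..9.
AWIN = {
    0:  [.436, .383, .346, .288, .235, .189, .126, .051, .000],
    1:  [.580, .496, .403, .347, .332, .277, .197, .146, .068],
    2:  [.597, .574, .535, .463, .371, .318, .296, .260, .204],
    3:  [.742, .715, .622, .599, .531, .444, .390, .321, .281],
    4:  [.800, .786, .725, .680, .633, .583, .516, .440, .431],
    5:  [.875, .840, .810, .788, .692, .646, .641, .615, .551],
    6:  [.950, .870, .839, .820, .810, .772, .702, .651, .610],
    7:  [.970, .913, .878, .857, .840, .820, .800, .759, .728],
    8:  [1.000, 1.000, .958, .900, .891, .874, .850, .845, .840],
    9:  [1.000, 1.000, .975, .965, .955, .945, .923, .880, .860],
    10: [1.000, 1.000, 1.000, .975, .960, .950, .940, .932, .887],
    11: [1.000, 1.000, 1.000, 1.000, .975, .960, .950, .940, .900],
    12: [1.000, 1.000, 1.000, 1.000, .980, .965, .960, .950, .940],
    13: [1.000] * 9,
}


def awin(rs: int, z: int) -> float:
    """Look up AWin(RS, Z): win prob with average pitching, given rs runs
    scored at the conclusion of inning z (1-9)."""
    if rs < 0:
        raise ValueError("runs scored cannot be negative")
    if not 1 <= z <= 9:
        raise ValueError("inning must be between 1 and 9")
    # Scores of 13 or more share the "13+" row.
    return AWIN[min(rs, 13)][z - 1]


# Reading down the 5th-inning column, as in the text.
for runs in (0, 1, 2):
    print(runs, awin(runs, 5))
```

Reading across a row (fixed `rs`, varying `z`) shows how a given run total loses value as the game goes on; reading down a column (fixed `z`) shows how each additional run raises the win prob.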
Table 5: Win Probs using Pitcher's Performance, DWin(RA-RS, Z), for AL 2000

Columns: at the conclusion of the Zth inning (Z = 1 to 9)

Deficit (RA-RS) |   1  |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9
0               | .500 | .500 | .500 | .500 | .500 | .500 | .500 | .500 | .500
1               | .410 | .390 | .380 | .365 | .344 | .274 | .226 | .179 | .000
2               | .358 | .299 | .275 | .264 | .234 | .200 | .114 | .062 | .000
3               | .215 | .205 | .195 | .176 | .164 | .089 | .059 | .023 | .000
4               | .146 | .130 | .115 | .107 | .075 | .051 | .030 | .015 | .000
5               | .105 | .085 | .077 | .068 | .060 | .045 | .025 | .010 | .000
6               | .050 | .040 | .035 | .030 | .027 | .025 | .014 | .005 | .000
7               | .025 | .020 | .015 | .010 | .005 | .000 | .000 | .000 | .000
8+              | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000

Table 5 reports the expected win probabilities using the performance of the starting pitcher under study. That is, we use both the team's offensive run support (RS) and the runs allowed (RA) by the starting pitcher. As above, the win probabilities vary by what inning the game is in.

By extensive analysis I have found that the win probabilities are based largely on the deficit (or lead) the team faces, and that the specific RA and RS information is not needed. This allows a more robust estimation of these probabilities, since there are not enough games in any season that have the exact same score at the conclusion of any specific inning. But there are lots of games that have the same deficit.

The first row of the table indicates that the expected win probability of a team in a game that is tied at the conclusion of any inning is .500. The second row indicates how the win probability of a team that is 1 run behind varies depending upon what inning the game is in. After 1 inning, the probability that a team overcomes a 1-run deficit (going on to win the game) is .410; after 2 innings .390, etc.

The columns of the table indicate how the win probability changes with the deficit[1] to be overcome at the conclusion of a specific inning. For example, the 5th column indicates that after five innings a 1-run deficit can be expected to be overcome with probability .344, a 2-run deficit with probability .234, a 3-run deficit with probability .164, etc.
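
The deficit-based lookup can be sketched in Python as follows. The values are transcribed from Table 5; the names `DWIN` and `dwin` are illustrative. Table 5 lists deficits only, so the handling of leads below (one minus the probability for the equal-sized deficit, since the trailing team's chance and the leading team's chance must sum to 1) is an assumption of this sketch rather than something stated in this part.

```python
# Table 5 (AL 2000), transcribed from the article.
# Keys are the deficit RA - RS; key 8 stands for the "8+" row.
# Each list holds the win probability after innings Z = 1..9.
DWIN = {
    0: [.500] * 9,
    1: [.410, .390, .380, .365, .344, .274, .226, .179, .000],
    2: [.358, .299, .275, .264, .234, .200, .114, .062, .000],
    3: [.215, .205, .195, .176, .164, .089, .059, .023, .000],
    4: [.146, .130, .115, .107, .075, .051, .030, .015, .000],
    5: [.105, .085, .077, .068, .060, .045, .025, .010, .000],
    6: [.050, .040, .035, .030, .027, .025, .014, .005, .000],
    7: [.025, .020, .015, .010, .005, .000, .000, .000, .000],
    8: [.000] * 9,
}


def dwin(ra: int, rs: int, z: int) -> float:
    """Look up DWin(RA - RS, Z) for a team that has allowed ra runs and
    scored rs runs at the conclusion of inning z (1-9)."""
    if not 1 <= z <= 9:
        raise ValueError("inning must be between 1 and 9")
    deficit = ra - rs
    # Only the size of the gap matters, not the exact score.
    p_trailing = DWIN[min(abs(deficit), 8)][z - 1]
    if deficit >= 0:
        return p_trailing
    # ASSUMPTION: a lead is treated as the complement of the same-sized deficit.
    return 1.0 - p_trailing


print(dwin(ra=1, rs=0, z=5))  # 1-run deficit after five innings
print(dwin(ra=3, rs=1, z=5))  # 2-run deficit after five innings
```

Note that `dwin(3, 1, 5)` and `dwin(2, 0, 5)` return the same value, which is exactly the pooling across scores that makes the deficit-based estimate more robust.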
Table 6: "Could Have Been" Run Scored Probabilities, Smear(m;RS,Z), for AL 2000

Columns: actual runs scored (RS), at Z=9

Could Have Scored (m) |   0   |   1   |   2   |   3   |   4   |   5   |   6   |   7   |   8
0                     |  .338 |  .130 |  .059 |  .033 |  .015 |  .010 |  .008 |  .005 |  .002
1                     |  .267 |  .306 |  .148 |  .099 |  .052 |  .047 |  .022 |  .015 |  .012
2                     |  .155 |  .192 |  .279 |  .166 |  .110 |  .074 |  .049 |  .030 |  .023
3                     |  .105 |  .153 |  .198 |  .264 |  .179 |  .116 |  .082 |  .062 |  .045
4                     |  .060 |  .090 |  .130 |  .176 |  .234 |  .154 |  .133 |  .100 |  .071
5                     |  .038 |  .062 |  .079 |  .104 |  .141 |  .231 |  .175 |  .124 |  .081
6                     |  .020 |  .029 |  .053 |  .068 |  .114 |  .162 |  .209 |  .157 |  .136
7                     |  .014 |  .015 |  .023 |  .041 |  .067 |  .089 |  .122 |  .202 |  .157
8                     |  .003 |  .011 |  .014 |  .022 |  .036 |  .045 |  .081 |  .119 |  .184
9                     |  .000 |  .006 |  .007 |  .010 |  .023 |  .031 |  .045 |  .069 |  .089
10                    |  .000 |  .003 |  .004 |  .007 |  .014 |  .020 |  .036 |  .052 |  .073
11                    |  .000 |  .002 |  .003 |  .004 |  .005 |  .007 |  .018 |  .034 |  .069
12                    |  .000 |  .001 |  .002 |  .003 |  .004 |  .005 |  .008 |  .014 |  .022
13                    |  .000 |  .000 |  .001 |  .002 |  .003 |  .004 |  .005 |  .006 |  .017
14                    |  .000 |  .000 |  .000 |  .001 |  .002 |  .003 |  .004 |  .005 |  .009
15                    |  .000 |  .000 |  .000 |  .000 |  .001 |  .002 |  .002 |  .003 |  .005
16                    |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .001 |  .002 |  .003
17                    |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .001 |  .002
18                    |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .000 |  .000
SUM                   | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Table 6 reports the "could have been" smearing probabilities for a team's possible run support, given that it actually scored a specific number of runs in the game. The table is a partial reporting of these smearing probabilities for the final scores (after 9 innings); there is a different smearing probability array for each inning.

This table should be viewed by column only. Each column represents the probability distribution that the team could have scored any number of runs (indicated by the row labels), given that it actually scored the number of runs indicated by the column header. For example, the first column (with column header 0) indicates that a team that actually scored 0 runs in a game "could have" scored 0 runs with probability .338, 1 run with probability .267, 2 runs with probability .155, 3 runs with probability .105, etc. The diagonal elements, where the row label matches the column header, correspond to the "could have" runs scored being equal to the actual runs scored.
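
Because each column of Table 6 is a probability distribution, it can be checked and summarized with a few lines of Python. The sketch below uses only the first column (actual RS = 0 at Z = 9), transcribed from the table. The helper names are illustrative, and the mean computed here is just a descriptive summary of the column; it is not claimed to be a quantity used in the Win Values formulas.

```python
# First column of Table 6 (AL 2000): Smear(m; RS=0, Z=9) for m = 0..18.
# Entries for m >= 9 are reported as .000 in the table.
SMEAR_RS0_Z9 = [.338, .267, .155, .105, .060, .038, .020, .014, .003] + [.000] * 10


def is_distribution(column, tol=1e-6):
    """True if the entries are non-negative and sum to 1 within tol."""
    return all(p >= 0 for p in column) and abs(sum(column) - 1.0) <= tol


def mean_could_have_scored(column):
    """Probability-weighted average of the 'could have scored' run totals m."""
    return sum(m * p for m, p in enumerate(column))


print(is_distribution(SMEAR_RS0_Z9))
print(round(mean_could_have_scored(SMEAR_RS0_Z9), 3))
```

The check mirrors the SUM row of the table, and the mean gives a rough sense of how far the column spreads probability away from the actual score of 0.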
Table 7: Pct Park Adders, PAddPct(RS), for AL 2000

Runs Scored (RS) | Pct Park Adder
0                | .000
1                | .002
2                | .004
3                | .007
4                | .010
5                | .009
6                | .008
7                | .006
8                | .004
9                | .002
10+              | .000

Table 7 shows how the effect of a home park on a team's win probability varies by how many runs it scores in a game. As described above, the entries were estimated empirically as the park effect per percentage point. For example, the Oakland Coliseum had a 97 park factor, indicating that runs scored were generally 6% less prevalent than in a neutral park. Thus, for games played in Oakland the numbers in the table above would be multiplied by 6. As the table reflects, parks have a negligible effect on winning when a team is shut out or scores 10 or more runs. The park has the most effect when a team scores around 3-7 runs.
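
The scaling described for Table 7 can be sketched in Python as follows. The per-point values are transcribed from the table, and the multiplier of 6 for Oakland is the one given in the text. The names `PADD_PCT` and `park_adder` are illustrative, and the sketch returns only the size of the adjustment: whether it is added to or subtracted from a win probability is set by the formulas in Part 4 and is not assumed here.

```python
# Table 7 (AL 2000): park adder per percentage point of park effect.
# Index is runs scored (RS); index 10 stands for the "10+" row.
PADD_PCT = [.000, .002, .004, .007, .010, .009, .008, .006, .004, .002, .000]


def park_adder(rs: int, pct_points: float) -> float:
    """Size of the park adjustment to win probability for a game in which
    the team scored rs runs, in a park whose effect is pct_points
    percentage points away from neutral."""
    if rs < 0:
        raise ValueError("runs scored cannot be negative")
    # Scores of 10 or more share the "10+" row.
    return PADD_PCT[min(rs, 10)] * pct_points


# Oakland example from the text: multiply the table entries by 6.
for runs in (0, 4, 10):
    print(runs, round(park_adder(runs, 6), 3))
```

With the Oakland multiplier, the adjustment peaks at about .060 for a 4-run game and vanishes for shutouts and for games with 10 or more runs, matching the pattern described above.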
Rob Wood
Posted: August 19, 2002 at 06:00 AM