Predicting the 2010 Playoffs
Over the past few years, I have added to the original work of Vinay Kumar (who posts on Primer under a different name) in trying to find the statistical categories
that have the most value to a team in the postseason. Vinay’s work first appeared in an article at The Hardball Times web site,
“So Billy, What Does Work in the Playoffs?” (http://www.hardballtimes.com/main/article/so-billy-what-does-work-in-the-playoffs),
about how a team's regular season statistics could forecast its chances of success in the postseason. Vinay created a system that works quite well at identifying the
teams that will reach the World Series. The main competitor to Vinay’s work, Baseball Prospectus’ ‘Secret Sauce’, has now fallen by the wayside
(http://wwww.baseballprospectus.com/article.php?articleid=12085). I am undaunted, as I enjoy doing the work anyway, so on with the predictions.
The Categories
Vinay used 30 categories in his original research. But rather than using the data straight up, he set a minimum split between the two teams, which eliminated about
half of the results and ensured that the data only reflected series in which one team had a distinct advantage over its opponent. The columns below show the winning
percentage in each category through 2008, and how the 2009 results changed them. The numbers in parentheses indicate in how many series each category was a factor. In
the case of Won-Lost Record, 50 series featured one team having an advantage of five wins over its opponent.
Category (series)                   Through 2008   Adding 2009
Team totals:
Won-lost record (53)                .580           .604
Runs Scored/Runs Allowed (50)       .563           .580
Batting records:
Runs scored total (49)              .426           .449
Batting average (57)                .440           .456
On-base percentage (56)             .451           .482
Slugging percentage (46)            .476           .522
Doubles (64)                        .448           .469
Triples (67)                        .484           .478
Home runs (55)                      .479           .509
Batter walks (54)                   .551           .578
Batter strikeouts (fewer) (64)      .559           .578
Stolen bases (50)                   .511           .480
Stolen base attempts (more) (53)    .551           .528
Net stolen bases (60)               .442           .483
Stolen base average (66)            .373           .394
Caught stealing (fewer) (59)        .426           .441
Pitching records:
Runs allowed (50)                   .646           .620
ERA (51)                            .592           .569
Pitcher strikeouts (56)             .540           .554
Pitcher walks (fewer) (46)          .523           .490
Hits allowed (fewer) (53)           .729           .736
Home runs allowed (fewer) (40)      .595           .575
Complete games (56)                 .608           .589
Pitchers’ shutouts (60)             .673           .650
Saves (58)                          .482           .500
Saves by team leader (55)           .566           .582
Bullpen ERA (64)                    .574           .578
Fielding records:
Errors committed (fewer) (52)       .680           .673
Defensive efficiency (57)           .654           .667
Fielding double plays (59)          .500           .475
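The tallying behind each row of the table can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not Vinay's actual code: the `Series` record, the `category_record` helper and the sample win totals are all made up for the example, and the five-win minimum split is borrowed from the Won-Lost Record case mentioned above.

```python
from dataclasses import dataclass

@dataclass
class Series:
    winner_value: float  # regular-season figure for the team that won the series
    loser_value: float   # same figure for the team that lost it

def category_record(series_list, min_split, higher_is_better=True):
    """Return (series counted, winning pct of the team holding the advantage)."""
    counted = 0
    wins = 0
    for s in series_list:
        diff = s.winner_value - s.loser_value
        if not higher_is_better:
            diff = -diff  # e.g. errors, where the lower figure is the advantage
        if abs(diff) < min_split:
            continue  # no distinct advantage, so the series is ignored
        counted += 1
        if diff > 0:
            wins += 1  # the team with the advantage won the series
    pct = wins / counted if counted else None
    return counted, pct

# Hypothetical regular-season win totals, for illustration only.
sample = [Series(103, 95), Series(92, 97), Series(90, 91), Series(95, 89)]
counted, pct = category_record(sample, min_split=5)
print(counted, pct)
```

In this made-up sample, the third series is dropped because the two teams finished within a win of each other, which is the effect the minimum split is meant to have.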
Having done this several times now, I can see that the successful teams have an inordinate impact on the fluctuations in percentages. Two years ago, the speed-based
offense of Tampa Bay and Charlie Manuel’s carefully judged running game boosted the stolen-base categories, which had been in decline from their previous high point,
when Vinay did his original work. The Yankees and Phillies have boosted several of the batting categories, particularly home runs and slugging percentage. Overall,
though, run prevention is crucial to getting you to the World Series. I’m aiming to follow this up later in the post-season with a review of what has happened there.
Here are the categories in which playoff series winners have led more than 54 per cent of the time, ranked by their winning percentage:
Hits allowed
Errors committed (fewer)
Defensive efficiency
Pitchers’ shutouts
Runs allowed
Won-lost Record
Complete Games
Saves by closer
Runs Scored/Runs allowed
Bullpen ERA
Batters’ strikeouts (fewer)
Batters’ walks
Home Runs Allowed (fewer)
ERA
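For anyone who wants to play with the cut-off, here is a minimal sketch of the filtering step. It uses only a handful of rows copied from the table above, not the full set of 30, and the `strong_categories` helper is a hypothetical name of my own; the looser fifty per cent cut-off is run alongside for comparison.

```python
# A few rows copied from the table: category -> winning percentage adding 2009.
rates = {
    "Hits allowed (fewer)": 0.736,
    "Errors committed (fewer)": 0.673,
    "Won-lost record": 0.604,
    "Batter walks": 0.578,
    "Slugging percentage": 0.522,
    "Stolen bases": 0.480,
}

def strong_categories(rates, threshold):
    """Categories whose leader won more often than `threshold`, best first."""
    kept = [(name, pct) for name, pct in rates.items() if pct > threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]

print(strong_categories(rates, 0.54))  # the higher threshold
print(strong_categories(rates, 0.50))  # the looser threshold
```

With these rows, Slugging Percentage clears the fifty per cent bar but not the 54 per cent one, and Stolen Bases clears neither.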
In the past, I included all those categories with a better than fifty per cent success rate. Had I done so this year, Stolen Bases and Pitcher Walks (fewer) would have
fallen out of the list, while Slugging Percentage and Batters’ Home Runs would have joined it. The higher threshold, however, means that the World Series teams have
less effect on pushing categories in and out. As you can see, for batters the important things are a good eye and putting the ball in play. Let’s carry this information
forward and profile the 2010 Divisional Series. I’ve put the strong categories in italics.
Minnesota Twins vs New York Yankees
Last season the Twins were quickly bounced out of the playoffs by a powerful Yankees’ side. This year, the Twins turn up with a fighting chance. Having used a metaphor
drawn from British politics to characterize this series last year, I’m going to do the same and say that this Yankees’ team isn’t quite what I expected it to be, just
like the Con-Lib Pact that is running the country.
Twins’ advantages
Doubles
Triples
Batters’ Strikeouts (fewer)
Pitchers’ Walks
Home Runs Allowed
Complete Games
Shutouts
Yankees’ advantages
Runs Scored
Home Runs
Batters’ Walks
Stolen Bases
Net Stolen Bases
Stolen Base Average
Pitchers’ Strikeouts
Hits Allowed
Closer saves
Defensive Efficiency
Double plays
PREDICTOR PICK: NEW YORK YANKEES
Hedging my bet: If there’s a Twins’ team that looks like it could pull off a defeat of the Yankees in the post-season, this one has the pitching to do it.
Texas Rangers vs Tampa Bay Rays
The Rays’ fleet offense is built all wrong to win today’s post-season. They’d have done better at the turn of the century. Nonetheless, they look likely to find the
Rangers little more than a bump in the road on the way to a League Championship showdown with the Evil Empire.
Texas Rangers’ advantages
Batting Average
Batters’ Strikeouts
Tampa Bay Rays’ advantages
Won/Lost Record
Doubles
Triples
Batters’ Walks
Stolen Bases
Stolen Base Attempts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Pitchers’ Walks (fewer)
Shut Outs
Saves
Errors
PREDICTOR PICK: TAMPA BAY RAYS
Hedging my bets: The Rangers’ pitchers stand comparison to the Rays’. If they can stall the Rays’ offense, the Rangers could be celebrating a chance to play against
A-Rod for something important.
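As a rough sketch of how a pick like this one could be tallied, the snippet below counts each side's strong categories. It assumes the pick simply goes to the team leading in more of the strong categories listed earlier, with equal counts giving no pick; the `predictor_pick` helper and the normalised category names are illustrative choices, not part of the original method.

```python
# Strong categories as listed earlier in the article (names normalised for matching).
STRONG = {
    "Hits Allowed", "Errors", "Defensive Efficiency", "Shutouts", "Runs Allowed",
    "Won-Lost Record", "Complete Games", "Saves by Closer",
    "Runs Scored/Runs Allowed", "Bullpen ERA", "Batters' Strikeouts",
    "Batters' Walks", "Home Runs Allowed", "ERA",
}

def predictor_pick(team_a, adv_a, team_b, adv_b):
    """Pick the team leading in more strong categories; equal counts give no pick."""
    strong_a = len(STRONG & set(adv_a))
    strong_b = len(STRONG & set(adv_b))
    if strong_a == strong_b:
        return "Pick 'em", strong_a, strong_b
    return (team_a if strong_a > strong_b else team_b), strong_a, strong_b

# Advantages for this series, taken from the two lists above.
rangers = ["Batting Average", "Batters' Strikeouts"]
rays = [
    "Won-Lost Record", "Doubles", "Triples", "Batters' Walks", "Stolen Bases",
    "Stolen Base Attempts", "Net Stolen Bases", "Stolen Base Average",
    "Caught Stealing", "Pitchers' Walks", "Shutouts", "Saves", "Errors",
]
result = predictor_pick("Texas Rangers", rangers, "Tampa Bay Rays", rays)
print(result)
```

Most of the Rays' long list is running-game categories that sit outside the strong set, so the count that matters is much smaller than the raw length of the list suggests.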
Cincinnati Reds vs. Philadelphia Phillies
More Red than a Communist Party of the Soviet Union Congress in this match-up. These two teams are more or less similarly constructed, relying on slugging and
pitching, except that Charlie Manuel beats Dusty Baker hands-down in the running game. But the Reds are rather in the position of being the imitation brand and needing
to try harder.
Reds’ advantages
Batting Average
Saves by Closer
Phillies’ advantages
Won/Lost Record
Batters’ Strikeouts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Shut Outs
Complete Games
Errors
PREDICTOR PICK: PHILADELPHIA PHILLIES
Hedging my bet: The Reds, like the Rangers, surprise me by being close enough in the pitching categories to find grounds for hope.
Atlanta Braves vs San Francisco Giants
Two teams that came close to a three-way tussle with the ‘fras’ of San Diego to get into the post-season find themselves safely in the playoffs after all. If we can
avoid a tidal wave of toma-mawkish sentiment over Bobby Cox’ last post-season, this could be the most evenly matched series of this round of the playoffs, except that
the Giants are way ahead in the strong categories. So you might want to look for your excitement in the games between the Twins and the Yankees.
Braves’ advantages
On-base Percentage
Doubles
Batters’ Walks
Stolen Base Average
Pitchers’ Walks
Double Plays
Giants’ advantages
Triples
Pitchers’ Strikeouts
Complete Games
Shut Outs
Saves
Saves by closer
Errors
Defensive Efficiency
PREDICTOR PICK: SAN FRANCISCO GIANTS
Hedging my bets: The Braves ‘win one for the skipper’.
Peering Further Ahead
Well, actually I’m not going to. You’ll have to come back in about ten days’ time to see what I post in the comments about the League Championship Series. Nonetheless,
apart from Tampa Bay, once again we are looking at top media-market match-ups after the Divisional Series, with New York, Philadelphia and San Francisco. All we’re
really missing is Boston or Los Angeles for some kind of network executive’s wet dream. Ah, remember those halcyon days of competitive balance when the likes of
Detroit and Colorado made it to the World Series?
fra paolo
Posted: October 05, 2010 at 02:12 PM
Reader Comments and Retorts
1. plim Posted: October 05, 2010 at 06:46 PM (#3655467)
It's interesting to see that overall winning percentage and RS/RA ratio are now highly correlated with winning series; I found the lack of correlation there the most shocking of anything.
Here are the categories that feature among those playoff series' winners have led in more than 54 per cent of the time, ranked by their winning percentage:
If you're trying to look for categories that are "significant" (using a colloquial definition of "significant", not a statistical one), then it makes just as much sense to include those where the leader wins less than 46% of the time -- it's just as significant, but in the opposite direction.
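A quick sketch of that symmetric cut, using a few rows from the article's table. The `notable` helper and the 0.04 margin (the distance of 54 per cent from .500) are just one way to express the idea, not anything the article itself does.

```python
# A few rows from the article's table (percentages adding 2009).
rates = {
    "Hits allowed (fewer)": 0.736,
    "Won-lost record": 0.604,
    "Home runs": 0.509,
    "Batting average": 0.456,
    "Caught stealing (fewer)": 0.441,
    "Stolen base average": 0.394,
}

def notable(rates, margin=0.04):
    """Split categories by whether the leader wins or loses by more than `margin` from .500."""
    helps = sorted((n for n, p in rates.items() if p - 0.5 > margin),
                   key=rates.get, reverse=True)
    hurts = sorted((n for n, p in rates.items() if 0.5 - p > margin),
                   key=rates.get)
    return helps, hurts

helps, hurts = notable(rates)
print("leader usually wins: ", helps)
print("leader usually loses:", hurts)
```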
http://www.insidethebook.com/ee/index.php/site/comments/secret_sauce_no_more
It sure does. The difference is that Prospectus published their results and said that they were significant, going so far as to call it "Secret Sauce". Whereas I tried to go out of my way to just say, "This is what we've seen. Who knows what it means."
SO: Rangers over Braves is my guess.
Again this thing works. For the first round of the playoffs at least.
However, when it comes to the League Championship Series, the Playoff Predictor has more trouble. It has three clear victories, it twice declined to make a forecast because the teams were tied in total strong categories in their favour, and there has been one defeat (last season, Phillies vs Dodgers). I guess you could argue that is still a 75 per cent success rate, but it doesn't feel like it.
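The arithmetic, for what it is worth, depends on what you do with the two series where no forecast was made. The alternative of counting them against the Predictor is only one possible way of scoring it, shown here for comparison:

```python
wins, losses, no_picks = 3, 1, 2  # LCS results quoted above

decided = wins + losses
rate_decided = wins / decided           # ignores the series with no forecast
rate_all = wins / (decided + no_picks)  # treats a declined forecast as a non-success

print(f"{rate_decided:.0%} of decided picks, {rate_all:.0%} of all series")
```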
New York Yankees vs Texas Rangers
Yankees' Advantages
Won-Lost Record
Runs Scored
Triples
Home Runs
Batters' Walks (more)
Net Stolen Bases
Stolen Base Average
Caught Stealing
Errors
Double Plays
Rangers' Advantages
Batters' Strikeouts (fewer)
Stolen Base Attempts
Complete Games
Saves
Predictor Pick: New York Yankees
Hedging My Bet: Batters' Walks (more) is one of the weaker categories among the strong ones. The Rangers might be able to parlay home-field advantage to a playoff win, although the course of the Rays' series, where no-one won at home, does not fill one with confidence.
Philadelphia Phillies vs San Francisco Giants
Phillies' Advantages
Won-Lost Record
Ratio of Runs Scored to Runs Allowed
Runs Scored
Batters' Walks (more)
Stolen Bases
Stolen Base Attempts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Pitchers' Walks (fewer)
Complete Games
Shutouts
Double Plays
Giants' Advantages
Pitchers' Strikeouts (more)
Hits Allowed (fewer)
Home Runs Allowed
Saves
Closer Saves
Bullpen ERA
Predictor Pick: Pick 'em
Hedging my bets: San Francisco has the single most successful category of all, Hits Allowed (fewer), which has been on the side of the winner over seventy per cent of the time it has been a factor.
Work permitting, I may be back with a whole new article on this theme in time for the World Series; otherwise, you'll find predictions posted in this thread.
One other thought--have you ever tried to group together the categories to see if any combination produces unusual results of any kind?
Some background on the 'Secret Sauce' formula is given in Baseball Between the Numbers, pp. 352-68. To me, the big problem occurs right at the start, when the authors deploy a formula for 'playoff success' by awarding points. They then use correlations between this and statistical categories to help define what works in the playoffs. By contrast, Vinay aggregated the individual series, regardless of whether a team won a squeaker or completely dominated. This avoids unduly favouring a team like the 2005 White Sox, who swept, but in four close-fought games. (Eight points to the 2009 Yankees' six for winning a World Series. I'd say the 2009 Yankees were more dominant over the 2009 Phillies than the 2005 White Sox were over the 2005 Astros.)
I have thought about doing what the 'Secret Sauce' did, throwing out some categories that don't appear to have much bearing on the outcome of a series (eg, Batting Average, which BBtN suggests is more valuable than this data indicates, though still not significantly so), as well as adding some of my own ideas (eg, trade and injury effects). However, I'm not altogether convinced that this will add more understanding of what is going on with a category such as Hits Allowed.
However, grouping was something I had planned to do in between last year and this. Unfortunately, too much has been happening to me this year for me to come to grips with that. I'd be particularly interested in looking at groups of statistical categories in the LCS and World Series' rounds, where the predictions are much less reliable.
But I do agree heartily with fra's larger point :)