Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Tuesday, October 05, 2010

Predicting the 2010 Playoffs

Over the past few years, I have added to the original work of Vinay Kumar (who posts on Primer under a different name) in trying to find the statistical categories that have the most value to a team in the postseason. Vinay’s work first appeared in an article at The Hardball Times web site, “So Billy, What Does Work in the Playoffs?” (http://www.hardballtimes.com/main/article/so-billy-what-does-work-in-the-playoffs), about how regular season statistics for a team could forecast its chances of success in the postseason. Vinay created a system that does work quite well at identifying the teams that will reach the World Series. The main competitor to Vinay’s work, Baseball Prospectus’ ‘Secret Sauce’, has now fallen by the wayside (http://wwww.baseballprospectus.com/article.php?articleid=12085). I am undaunted, as I enjoy doing the work anyway, so on with the predictions.

The Categories


Vinay used 30 categories in his original research. But rather than using the data straight up, he used minimum splits between two teams in order to eliminate about half of the results, to ensure that the data only reflected cases where a team had a distinct advantage over its opponent. The columns below show the winning percentage in each category through 2008, and how the 2009 results changed them. The numbers in parentheses indicate in how many series each category was a factor. In the case of Won-Lost Record, 50 series featured one team having an advantage of five wins over its opponent.


Category (series counted)          Through 2008   Adding 2009

Team totals:
Won-lost record (53)                   .580           .604
Runs Scored/Runs Allowed (50)          .563           .580

Batting records:
Runs scored total (49)                 .426           .449
Batting average (57)                   .440           .456
On-base percentage (56)                .451           .482
Slugging percentage (46)               .476           .522
Doubles (64)                           .448           .469
Triples (67)                           .484           .478
Home runs (55)                         .479           .509
Batter walks (54)                      .551           .578
Batter strikeouts (fewer) (64)         .559           .578
Stolen bases (50)                      .511           .480
Stolen base attempts (more) (53)       .551           .528
Net stolen bases (60)                  .442           .483
Stolen base Average (66)               .373           .394
Caught stealing (fewer) (59)           .426           .441

Pitching records:
Runs allowed (50)                      .646           .620
ERA (51)                               .592           .569
Pitcher strikeouts (56)                .540           .554
Pitchers walks (fewer) (46)            .523           .490
Hits allowed (fewer) (53)              .729           .736
Home runs allowed (fewer) (40)         .595           .575
Complete games (56)                    .608           .589
Pitchers’ shutouts (60)                .673           .650
Saves (58)                             .482           .500
Saves by team leader (55)              .566           .582
Bullpen ERA (64)                       .574           .578

Fielding records:
Errors committed (fewer) (52)          .680           .673
Defensive efficiency (57)              .654           .667
Fielding double plays (59)             .500           .475
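
For readers who want to see how a figure in the table could be produced, here is a minimal Python sketch of the minimum-split idea as described above. It is not Vinay's actual code: the `Series` class, the `category_record` function and the sample numbers are all hypothetical, and the only detail taken from the article is the five-win split for Won-Lost Record.

```python
from dataclasses import dataclass

@dataclass
class Series:
    winner: dict  # regular-season totals of the series winner, keyed by category
    loser: dict   # regular-season totals of the series loser

def category_record(series_list, category, min_split, higher_is_better=True):
    """Return (winning_pct, n_series) for one category.

    A series counts only when the gap between the two teams is at least
    min_split; the category "wins" when the team with the edge won the series.
    """
    wins = 0
    counted = 0
    for s in series_list:
        diff = s.winner[category] - s.loser[category]
        if not higher_is_better:
            diff = -diff  # e.g. errors: the lower total is the advantage
        if abs(diff) < min_split:
            continue  # no distinct advantage, so the series is ignored
        counted += 1
        if diff > 0:
            wins += 1
    pct = wins / counted if counted else None
    return pct, counted

# Made-up illustrative data, not real series results.
sample = [
    Series(winner={"wins": 100}, loser={"wins": 92}),
    Series(winner={"wins": 90}, loser={"wins": 97}),
    Series(winner={"wins": 95}, loser={"wins": 93}),  # gap below 5: skipped
    Series(winner={"wins": 103}, loser={"wins": 95}),
]
pct, n = category_record(sample, "wins", min_split=5)
print(f"{pct:.3f} over {n} series")
```

With the made-up sample, one series is dropped for being too close, which is the effect the minimum split is meant to have on the real data.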

Having done this several times now, I can see that the successful teams have an inordinate impact on the fluctuations in percentages. Two years ago, the speed-based offense of Tampa Bay and Charlie Manuel’s carefully judged running game boosted the stolen-base categories, which had been in decline from their previous high point, when Vinay did his original work. The Yankees and Phillies have boosted several of the batting categories, particularly home runs and slugging percentage. Overall, though, run prevention is crucial to getting you to the World Series. I’m aiming to follow this up later in the post-season with a review of what has happened there.

 
Here are the categories that playoff series winners have led in more than 54 per cent of the time, ranked by their winning percentage:


Hits allowed
Errors committed (fewer)
Defensive efficiency
Pitchers’ shutouts
Runs allowed
Won-lost Record
Complete Games
Saves by closer
Runs Scored/Runs allowed
Bullpen ERA
Batters’ strikeouts (fewer)
Batters’ walks
Home Runs Allowed (fewer)
ERA
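
The selection rule can be sketched in a few lines of Python. The values are copied from the "adding 2009" column of the table; the `WIN_PCT` and `strong_categories` names are hypothetical, and this is only a mechanical reading of "more than 54 per cent", not necessarily the exact procedure used to build the list.

```python
# Winning percentages from the "adding 2009" column of the table.
WIN_PCT = {
    "Won-lost record": 0.604,
    "Runs Scored/Runs Allowed": 0.580,
    "Runs scored total": 0.449,
    "Batting average": 0.456,
    "On-base percentage": 0.482,
    "Slugging percentage": 0.522,
    "Doubles": 0.469,
    "Triples": 0.478,
    "Home runs": 0.509,
    "Batter walks": 0.578,
    "Batter strikeouts (fewer)": 0.578,
    "Stolen bases": 0.480,
    "Stolen base attempts (more)": 0.528,
    "Net stolen bases": 0.483,
    "Stolen base average": 0.394,
    "Caught stealing (fewer)": 0.441,
    "Runs allowed": 0.620,
    "ERA": 0.569,
    "Pitcher strikeouts": 0.554,
    "Pitchers walks (fewer)": 0.490,
    "Hits allowed (fewer)": 0.736,
    "Home runs allowed (fewer)": 0.575,
    "Complete games": 0.589,
    "Pitchers' shutouts": 0.650,
    "Saves": 0.500,
    "Saves by team leader": 0.582,
    "Bullpen ERA": 0.578,
    "Errors committed (fewer)": 0.673,
    "Defensive efficiency": 0.667,
    "Fielding double plays": 0.475,
}

def strong_categories(win_pct, threshold=0.54):
    """Return (category, pct) pairs above the threshold, best first."""
    ranked = sorted(win_pct.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, pct) for name, pct in ranked if pct > threshold]

strong = strong_categories(WIN_PCT)
for name, pct in strong:
    print(f"{pct:.3f}  {name}")
```

Applied mechanically to the table's figures, this filter also returns Pitcher strikeouts (.554), which does not appear in the list above; the sketch makes no attempt to explain that difference. Categories tied at .578 come out in table order rather than in the order of the list.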

In the past, I included all those categories with a better than fifty per cent success rate. Had I done so this year, Stolen Bases and Pitcher Walks (fewer) would have fallen out of the list, while Slugging Percentage and Batters’ Home Runs would have joined it. The higher threshold, however, means that the World Series teams have less effect on pushing categories in and out. As you can see, for batters the important thing is to have a good eye and to put the ball in play. Let’s carry this information forward and profile the 2010 Divisional Series. I’ve put the strong categories in italics.

Minnesota Twins vs New York Yankees

Last season the Twins were quickly bounced out of the playoffs by a powerful Yankees’ side. This year, the Twins turn up with a fighting chance. Having used a metaphor drawn from British politics to characterize this series last year, I’m going to do the same and say that this Yankees’ team isn’t quite what I expected it to be, just like the Con-Lib Pact that is running the country.


Twins’ advantages
Doubles
Triples
Batters’ Strikeouts (fewer)
Pitchers’ Walks
Home Runs Allowed
Complete Games
Shutouts

Yankees’ advantages
Runs Scored
Home Runs
Batters’ Walks
Stolen Bases
Net Stolen Bases
Stolen Base Average
Pitchers’ Strikeouts
Hits Allowed
Closer saves
Defensive Efficiency
Double plays

PREDICTOR PICK: NEW YORK YANKEES

Hedging my bet: If any Twins team looks like it could pull off a defeat of the Yankees in the post-season, it is this one, which has the pitching to do it.

Texas Rangers vs Tampa Bay Rays

The Rays’ fleet offense is built all wrong to win today’s post-season. They’d have done better at the turn of the century. Nonetheless, they look likely to find the Rangers little more than a bump in the road on the way to a League Championship showdown with the Evil Empire.


Texas Rangers’ advantages
Batting Average
Batters’ Strikeouts

Tampa Bay Rays’ advantages
Won/Lost Record
Doubles
Triples
Batters’ Walks
Stolen Bases
Stolen Base Attempts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Pitchers’ Walks (fewer)
Shut Outs
Saves
Errors

PREDICTOR PICK: TAMPA BAY RAYS

Hedging my bets: The Rangers’ pitchers stand comparison to the Rays’. If they can stall the Rays’ offense, the Rangers could be celebrating a chance to play against A-Rod for something important.

Cincinnati Reds vs. Philadelphia Phillies

More Red than a Communist Party of the Soviet Union Congress in this match-up. These two teams are more or less similarly constructed, relying on slugging and pitching, except that Charlie Manuel beats Dusty Baker hands-down in the running game. But the Reds are rather in the position of being the imitation brand and needing to try harder.


Reds’ advantages
Batting Average
Saves by Closer

Phillies’ advantages
Won/Lost Record
Batters’ Strikeouts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Shut Outs
Complete Games
Errors

PREDICTOR PICK: PHILADELPHIA PHILLIES

Hedging my bet: The Reds, like the Rangers, surprise me by being close enough in the pitching categories to find grounds for hope. 

Atlanta Braves vs San Francisco Giants

Two teams that came close to a three-way tussle with the ‘fras’ of San Diego to get into the post-season find themselves safely in the playoffs after all. If we can avoid a tidal wave of toma-mawkish sentiment over Bobby Cox’ last post-season, this could be the most evenly matched series of this round of the playoffs, except that the Giants are way ahead in the strong categories. So you might want to look for your excitement in the games between the Twins and the Yankees.


Braves’ advantages
On-base Percentage
Doubles
Batters’ Walks
Stolen Base Average
Pitchers’ Walks
Double Plays

Giants’ advantages
Triples
Pitchers’ Strikeouts
Complete Games
Shut Outs
Saves
Saves by closer
Errors
Defensive Efficiency

PREDICTOR PICK: SAN FRANCISCO GIANTS

Hedging my bets: The Braves ‘win one for the skipper’. 

Peering Further Ahead

Well, actually I’m not going to. You’ll have to come back in about ten days’ time to see what I post in the comments about the League Championship Series. Nonetheless, apart from Tampa Bay, once again we are looking at top media-market match-ups after the Divisional Series, with New York, Philadelphia and San Francisco. All we’re really missing is Boston or Los Angeles for some kind of network executive’s wet dream. Ah, remember those halcyon days of competitive balance when the likes of Detroit and Colorado made it to the World Series?

fra paolo Posted: October 05, 2010 at 02:12 PM | 11 comment(s)

Reader Comments and Retorts


   1. plim Posted: October 05, 2010 at 06:46 PM (#3655467)
wow, those match my picks. i'd be curious to know if the rest of my picks would also match up: rays over yankees, phillies over giants, and phillies over rays.
   2. Harold can be a fun sponge Posted: October 06, 2010 at 03:04 AM (#3655874)
Thanks, fra paolo, for resurrecting my work. I was always a bit skeptical of the sample sizes; there were only so many playoff series, and like you point out, one hot team can be three data points. It's good to add more data, but your work also points out how much the W%s can change by adding just a little bit of data. IMO, that tells us that we should take all of this with a big handful of salt.

It's interesting to see that overall winning percentage and RS/RA ratio are now highly correlated with winning series; I found the lack of correlation there the most shocking of anything.

"Here are the categories that playoff series winners have led in more than 54 per cent of the time, ranked by their winning percentage:"

If you're trying to look for categories that are "significant" (using a colloquial definition of "significant", not a statistical one), then it makes just as much sense to include those where the leader wins less than 46% of the time -- it's just as significant, but in the opposite direction.
   3. dcsmyth1 Posted: October 06, 2010 at 03:11 PM (#3656035)
Here's a discussion from The Book Blog about the demise of the secret sauce. I think it applies just as well here.

http://www.insidethebook.com/ee/index.php/site/comments/secret_sauce_no_more
   4. Harold can be a fun sponge Posted: October 07, 2010 at 02:19 AM (#3656936)
"Here's a discussion from The Book Blog about the demise of the secret sauce. I think it applies just as well here."

It sure does. The difference is that Prospectus published their results and said that they were significant, going so far as to call it "Secret Sauce". Whereas I tried to go out of my way to just say, "This is what we've seen. Who knows what it means."
   5. Accent Shallow Posted: October 07, 2010 at 02:42 AM (#3656974)
I'm really surprised that Pitcher Hits Allowed is over .700 in terms of WP. Anyone care to expound upon why?
   6. Dale Sams Posted: October 07, 2010 at 02:46 AM (#3656978)
Well...The Braves have Hinske, and since 2007 I've seen the AL representative in person. This year I saw Rangers V. Red Sox.

SO: Rangers over Braves is my guess.
   7. philistine Posted: October 12, 2010 at 07:40 AM (#3661890)
So far, three for three.

Again this thing works. For the first round of the playoffs at least.
   8. fra paolo Posted: October 15, 2010 at 08:25 PM (#3664616)
Since 2007, when I first attempted this prediction malarkey in earnest, the Playoff Predictor has forecast sixteen divisional series. If you define the forecast winner as the team with the advantage in 'strong' categories, it has been correct THIRTEEN times, while in three cases it called the series a toss-up. That equals a success rate of 81 per cent.

However, when it comes to the League Championship Series, the Playoff Predictor has more trouble. It has three clear victories, two times it declined to make a forecast as the teams were tied in total strong categories in their favour, and there has been one defeat. (Last season, Phillies vs Dodgers.) I guess you could argue that is still a 75 per cent success rate, but it doesn't feel like it.
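
The arithmetic behind those two percentages can be laid out in a short Python sketch. The `success_rates` function is hypothetical, and the zero defeats for the divisional round is inferred from 13 correct plus 3 toss-ups equalling 16. The sketch shows that the 81 per cent figure counts toss-ups in the denominator, while the 75 per cent figure leaves the two undecided series out.

```python
def success_rates(correct, wrong, no_call):
    """Return (rate over all series, rate over series with a clear pick)."""
    total = correct + wrong + no_call
    decided = correct + wrong
    return correct / total, correct / decided

# Divisional series since 2007: 13 correct, 3 toss-ups, 16 in all
# (wrong=0 is inferred from those totals, not stated directly).
div_all, div_decided = success_rates(correct=13, wrong=0, no_call=3)

# League Championship Series: 3 correct, 1 defeat, 2 with no forecast.
lcs_all, lcs_decided = success_rates(correct=3, wrong=1, no_call=2)

print(f"Divisional: {div_all:.2%} of all series, {div_decided:.2%} of clear picks")
print(f"LCS:        {lcs_all:.2%} of all series, {lcs_decided:.2%} of clear picks")
```

Measured the same way as the divisional figure, with every series in the denominator, the LCS record is three correct out of six.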

New York Yankees vs Texas Rangers

Yankees' Advantages
Won-Lost Record
Runs Scored
Triples
Home Runs
Batters' Walks (more)
Net Stolen Bases
Stolen Base Average
Caught Stealing
Errors
Double Plays

Rangers' Advantages
Batters' Strikeouts (fewer)
Stolen Base Attempts
Complete Games
Saves

Predictor Pick: New York Yankees

Hedging My Bet: Batters' Walks (more) is one of the weaker categories among the strong ones. The Rangers might be able to parlay home-field advantage to a playoff win, although the course of the Rays' series, where no-one won at home, does not fill one with confidence.

Philadelphia Phillies vs San Francisco Giants

Phillies' Advantages
Won-Lost Record
Ratio of Runs Scored to Runs Allowed
Runs Scored
Batters' Walks (more)
Stolen Bases
Stolen Base Attempts
Net Stolen Bases
Stolen Base Average
Caught Stealing
Pitchers' Walks (fewer)
Complete Games
Shutouts
Double Plays

Giants' Advantages
Pitchers' Strikeouts (more)
Hits Allowed (fewer)
Home Runs Allowed
Saves
Closer Saves
Bullpen ERA


Predictor Pick: Pick 'em

Hedging my bets: San Francisco has the single most successful category of all, Hits Allowed (fewer), which has been on the side of the winner over seventy per cent of the time it has been a factor.

Work permitting, I may be back with a whole new article on this theme in time for the World Series; otherwise, you'll find predictions posted in this thread.
   9. Don Malcolm Posted: October 15, 2010 at 09:18 PM (#3664638)
Paul, just a random thought here...could you "numberize" these predictions by using the win values associated with the categories in some fashion? Arriving at some kind of probability--something like .550 or .625 or .700? I know the tendency here has been to remain cautious, but it would be interesting to see something like that applied to this system. By doing so, something else might pop out of it that would be interesting.

One other thought--have you ever tried to group together the categories to see if any combination produces unusual results of any kind?
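
A naive way to turn the category win values into a single number is sketched below in Python. This is only an illustration of the idea, not part of the Playoff Predictor: the function names are hypothetical, and the method (adding log-odds) assumes the categories are independent of one another, which they plainly are not, since hits allowed, runs allowed and ERA overlap heavily. With many categories in play it will push the result towards 0 or 1 far too confidently.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def naive_series_probability(team_a_edges, team_b_edges):
    """Combine category win values into one probability for team A.

    Each argument is a list of historical winning percentages for the
    categories that team leads. Assumes independence between categories,
    which is a strong and unrealistic simplification.
    """
    score = sum(logit(p) for p in team_a_edges) - sum(logit(p) for p in team_b_edges)
    return 1 / (1 + math.exp(-score))

# Illustration only: team A leads Hits Allowed (.736),
# team B leads Errors Committed (.673).
p_a = naive_series_probability([0.736], [0.673])
print(f"Team A: {p_a:.3f}")
```

A team holding one category against nothing gets back that category's own win value, and two teams holding equally strong categories come out at .500, which is at least a sanity check on the approach.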
   10. fra paolo Posted: October 20, 2010 at 04:56 PM (#3670032)
To pick up on some points:

Some background on the 'Secret Sauce' formula is given in Baseball Between the Numbers, pp. 352-68. To me, the big problem occurs right at the start, when the authors deploy a formula for 'playoff success' by awarding points. They then use correlations between this and statistical categories to help define what works in the playoffs. By contrast, Vinay aggregated the individual series, regardless of whether a team won a squeaker or completely dominated. This avoids unduly favouring a team like the 2005 White Sox, who swept, but in four close-fought games. (Eight points to the 2009 Yankees' six for winning a World Series. I'd say the 2009 Yankees were more dominant over the 2009 Phillies than the 2005 White Sox were over the 2005 Astros.)

I have thought about doing what the 'Secret Sauce' did, and throwing out some categories that don't appear to have much bearing on the outcome of a series (eg, Batting Average, which BBtN suggests is more valuable, but still not significantly so, than this data suggests), as well as adding some of my own ideas (eg, trade and injury effects). However, I'm not altogether convinced that this will add more understanding of what is going on with a category such as Hits Allowed.

However, grouping was something I had planned to do in between last year and this. Unfortunately, too much has been happening to me this year for me to come to grips with that. I'd be particularly interested in looking at groups of statistical categories in the LCS and World Series' rounds, where the predictions are much less reliable.
   11. TomH Posted: October 20, 2010 at 05:21 PM (#3670063)
How was the Yankees win in 09 more dominant than ANYTHING? Their pitchers dominated the Phils' bats by letting them hit 11 (!) home runs in 6 games, but mostly when there were no runners on, thereby artificially holding down their runs scored?

But I do agree heartily with fra's larger point :)
