Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Tuesday, October 01, 2002

Bill James Prediction System

I ran the Bill James prediction system for the 4 division series:

Yankees 68, Angels 68
Athletics 124, Twins 20
Diamondbacks 78, Cardinals 51
Braves 102, Giants 38

This reinforces the very nervous feelings I (as a Yankee fan) have about the Yankees/Angels series.

The Cardinals/Diamondbacks score is very deceptive. Arizona picks up 41 points based on 2 doubles, .001 of batting average and 1 shutout. Split those points 21-20 and you end up with the Cards having the advantage 71-57.

The system shows the Braves' pitching and experience as being much more important than the Giants' superior offense.

The only points the Twins score are for hitting more 3B and making fewer errors. In my gut I felt this was the biggest mismatch, and the system reinforces that gut feeling.

The system was 70% accurate according to James in 1983. Since it picks the Yanks and Angels as a tie, I’m going to say the Cards will be the 1 mistake out of the other 3.

I know I picked the Giants on the other thread, but that’s wishful thinking for Bonds. I’ll really be surprised if they beat Atlanta. How’s that for sitting on the fence?

Posted: October 01, 2002 at 08:43 PM

Reader Comments and Retorts

   Posted: October 01, 2002 at 08:50 PM
Can you give the ignorant among us a rundown of the system in less than a page (or two)?
   Posted: October 01, 2002 at 08:53 PM
I'm assuming that the system doesn't take into account late season injuries such as Gonzalez's, so even if the Cardinals do win, I'm not sure I'd call it a "mistake."
   Posted: October 01, 2002 at 09:04 PM
Here you go:

1 point for every half game in the standings
   Posted: October 01, 2002 at 09:10 PM
As for Gonzalez, the system wouldn't predict any differently. His absence wouldn't swing any category the other way.

Check that. If you think the Dbacks would be 5 games worse over the course of the season, you could dock Arizona 10 points for that, but the score would still be 68-51.
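
For anyone following the arithmetic, here is a minimal sketch of that standings adjustment. It assumes only the one rule quoted above (1 point per half game); the function name is made up for illustration and does not come from James's article.

```python
def standings_points(games):
    """Points for a lead in the standings at 1 point per half game."""
    return round(games * 2)

# Hypothetical adjustment from the comment: Arizona 5 games worse.
arizona_before = 78
arizona_after = arizona_before - standings_points(5)
print(arizona_after)  # 68
```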
   Posted: October 01, 2002 at 09:21 PM
What about runs? Arizona was only ahead by 32. I would think Gonzalez is worth at least 32 runs over the course of a season, so if you give that one to the Cards it's 65-54. And if you assume that the D-Backs wouldn't have gone as far in the postseason last year without him, the Cards are up 66-53. Maybe my assumptions aren't sound but they don't seem ridiculous either.
   Posted: October 01, 2002 at 09:23 PM
I'm confused. Why do you lose points for hitting a lot of doubles and hitting for a high average, and why do you get points for walking guys? Coincidentally, those are all three areas the Angels lose points in. I guess average could be explained by it being more unpredictable, and likely to regress to the mean, but I can't figure out the other two.
   Posted: October 01, 2002 at 09:38 PM
That's sort of what I figured, but I just thought I'd ask. Either way, does anyone think it's just a statistical anomaly, or is there any explanation for it?
   Posted: October 01, 2002 at 09:40 PM
Given the unbalanced schedule, and the dreck the Braves pitching staff got to rack up stats against because of it, you'd have to think the system overstates the Braves' chances.

That series is going to be a classic irresistible force (Braves pitching) vs immovable object (Bonds/Giants offense). Or is it fairly resistible force (Giants pitching) vs nearly frictionless object (Lockhart/Castilla/Lopez/Blanco/pitcher bottom of Braves offense)?
   Posted: October 01, 2002 at 09:50 PM
Danil, you are on the right track.

This is comparing two really good teams. Teams that hit 2B's and for high averages tend to be hustling sequential type offenses.

When the pitching improves, it's harder for a high average offense to put together the sequence needed to score runs.

Teams that hustle on the basepaths run it up on bad teams. Good teams throw them out at 3rd. Doubles don't come as easy when the team has good defensive outfielders, etc.

Sure HR go down too, but if everything goes down 10%, the team that hits HR has a big advantage. The odds of a HR are 90%. The odds of 3 singles go to .9*.9*.9 or 73%. Go back and look: teams with sequential offenses have a terrible postseason record, relative to what you think it would be.
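
A quick sketch of that compounding, assuming each event in the sequence is independent and becomes exactly 10% less likely (both are simplifications, and the function name is just for illustration):

```python
def sequence_odds(events_needed, retained=0.9):
    """Chance that every event in a scoring sequence still happens when
    each one is `retained` times as likely as before (independence assumed)."""
    return retained ** events_needed

# One swing (a HR), three singles, and a five-hit rally for comparison.
for events in (1, 3, 5):
    print(events, round(sequence_odds(events), 3))
# 1 event -> 0.9, 3 events -> 0.729, 5 events -> 0.59
```

The longer the sequence an offense needs, the more the same 10% haircut costs it.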

As for the teams that walk more batters, they get the points because a disproportionate share of walks come from pitchers that are bad, and they don't pitch in the postseason.

Again, it isn't perfect. If you realize the biases (like hard cutoffs for example) you can have a better idea of when it might be wrong.

I don't think the schedule is that big of a deal. 5 games a year either way, max. The difference between the ballparks understates the Braves pitching more than the difference in the schedule overstates it, IMHO.
   Posted: October 01, 2002 at 09:57 PM
Sure HR go down too, but if everything goes down 10%, the team that hits HR has a big advantage. The odds of a HR are 90%. The odds of 3 singles go to .9*.9*.9 or 73%.

I mostly agree with you, but just for the sake of discussion I'll ask about the Curse of Steve Balboni. (I realize Luis G has put this to rest, but it's still freakish that it lasted as long as it did.) Wouldn't having some monster HR guy around be helpful? By the logic you're espousing here wouldn't we have expected HR-heavy teams to do much better than they have over the past 15+ years? Wouldn't *one* team with the big stick have carried your theory out to the end?
   Posted: October 01, 2002 at 10:02 PM
It's possible, Cris. I wonder if the one big stick isn't always enough though. I'd like to see the post-season record since 1995 (playoff expansion) of the team that had more regular season HR (adjusting for park and league) against the team that had a better AVG. I'd bet the HR teams have a better record.

It's also possible the HR teams didn't have the pitching. I'll bet that when the HR team lost, it was to a team with better pitching most of the time.

What I'm trying to say is that the HR teams might not have fared well, but the HR might not be the reason.

Another way to study it would be to look at the individual series and see if the teams that outhomered their opponents have done better than the teams that 'out-averaged' theirs. This would do better at answering the HR/AVG question, while controlling for the pitching.
   Posted: October 01, 2002 at 10:03 PM
That's sort of what I figured, but I just thought I'd ask. Either way, does anyone think it's just a statistical anomaly, or is there any explanation for it?

It may depend on whether he looked at things bivariately or multivariately. That is, did he just look and find "teams that hit fewer doubles won more series" or did he find "teams that hit fewer doubles, after controlling for these other factors, won more series."

I'm assuming the former, which means that he's getting some spurious correlations. For example, teams that hit fewer doubles are probably teams that hit more HR's (i.e. HR power rather than doubles power). Teams that walk more batters may also be teams that K more batters (though surprisingly K's don't come in here). Scruff makes a good point about sequential offenses perhaps explaining the average quirk, but it may also be an aspect of the sample selection bias and spurious correlation -- any team that makes the playoffs with a low average probably walks a lot and hits for power, otherwise they'd never have scored enough runs.

James strikes me as usually smart enough to avoid such problems, but maybe not with these various toys he tosses out there.
   Posted: October 01, 2002 at 10:52 PM
I'll let James do the talking here, I'll just regurgitate the article as best I can.

He didn't expect it to work. It picked 70% retroactively, he never thought it would keep up going forward. It seemed to him, "like a compilation of past coincidences, and not truly likely to predict the Series winner with any consistency."

Two things happened since he invented it. One, it continued to work, at least through 1983 (although it was 0-for-2 in 1982 LCS when he went public). After the 1982 LCSs, it went 3-for-4 on the 1982 WS and 1983 postseason. 23-11 from invention through 1983.

Secondly, he's come a long way towards understanding why it works (emphasis his).

"Why do teams with high batting averages do poorly in WS play? A simple reason: it takes them too many hits to score. If they are legitimately a better offense than their opponents, then that's another story. Most of the time they're not; everybody who gets into a WS has some bats. The higher AVG doesn't indicate the team which has a better offense, but the team which has more of a high-average offense, as opposed to a power offense.

High average offenses score by stringing sequences together. To get a 3-run inning, it might take them five or six hits. Every one of those hits gets harder to come by. If each one becomes 10% more difficult to get, how much more difficult to get are all five of them? You've got five chances to stop that inning.

With the 3-run HR, on the other hand, you've got only 3 chances to stop it. As each element of the offense becomes 10% less common, the 5-element offense is damaged by much more than the 3-element offense is. That's why good pitchers, as a group, allow a higher percentage of their runs on home runs than do poor pitchers. And in a WS, you're always facing a good pitcher. So the % of runs that score on HR is consistently higher in a WS than it is during the season."

He then shows the 1983 postseason as an example . . .

27 runs in the WS, 10 HR. Postseason 73 runs, 20 HR. 27% of runs were HR, during the season it was 18%. HR/G only went from 1.57 to 1.54. "But the rate of other runs decreased by much more than that, so the HR attained an added importance. That is exactly what happens."

"So with that knowledge, it makes perfect sense for the high-average teams to do poorly in WS play. Other interpretations of odd rules might be more speculative, but I am absolutely convinced that teams which hit a lot of doubles during the season are never going to do well in the aftermath because they are aggressive baserunning teams, teams which are exploiting weaknesses that will not be there when the Series starts. Shutouts are important because most shutouts are thrown by front-line starting pitchers, and front-line starters do a much larger share of the pitching in a WS than they do during the season. Walks are not that important because they are disproportionately influenced by 4th and 5th starters who will spend the WS in hibernation. The other rules are all positive rules; all the system does is weight them according to their historic importance."

Note, he uses WS everywhere. That's because he used WS to develop the system. It's probably not quite as reliable in the early playoffs because the teams aren't as good.

He talks about the bad year in 1982.

First he blames luck; he never claimed 100%.

"The system is supposed to measure what happens when good pitching staffs and good defenses collide. Remember: shortens the offensive sequence, eliminates the value of aggressive baserunning by making it too difficult to exploit the defense? Ordinarily, that's what we're talking about in a WS or playoff -- good offenses against good defenses.

But they ran a ringer on me. For the first time since God-has-forgotten-when, there was not one team in the postseason that had that kind of pitching or that kind of defense. They were just a little above average pitching and defense teams, teams which got in by scoring a jillion runs.

So the advantages that should have been negated, weren't. It was possible to take advantage of the Angel defense, with Reggie in RF and an ailing DeCinces at 3B. It was possible to put together a long offensive sequence against Atlanta . . .

So it will take forever to convince people, but the system still works. It has been in the public domain for two years now; when it has been there for 10 or 15, there will be enough data to make a reasonable evaluation. It may not continue to work 70% of the time, but it isn't going to start working 40%, either. I am not afraid to see what happens to it."

This year's Angel squad might have a case against the Yanks. The Yankee OF defense can be taken advantage of. The system predicts a tie, despite docking the Angels 14 points for doubles and 12 for losing the season series 4-3. I'm even more nervous (as a Yankee fan) now than I was a few hours ago.
   Posted: October 01, 2002 at 11:01 PM
Arthur, just using record would get you 20-14 for the same years the system was 23-11.

Not a huge difference, but 58.8% vs. 67.6% is a pretty big difference to a gambler. I don't think that the edge is statistically significant, but it's still a decent edge for the system.
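
Here is a rough sketch of how one could check both the percentages and the significance hunch. The test is my own choice, not anything from James: it treats the record-only success rate as a fixed baseline and asks how often a picker that good would get 23 or more of 34 right by chance. It ignores the fact that both methods were scored on the same series, so take it as a ballpark only.

```python
from math import comb

def win_pct(wins, losses):
    return wins / (wins + losses)

def binom_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

system = win_pct(23, 11)   # James system, invention through 1983
record = win_pct(20, 14)   # picking the better regular-season record
print(f"{system:.1%} vs {record:.1%}")  # 67.6% vs 58.8%

# Chance of 23+ correct picks in 34 series at the record-only rate.
tail = binom_tail(23, 34, record)
print(f"one-sided tail probability: {tail:.3f}")
```

If the tail probability comes out well above the usual 0.05 cutoff, that fits the feeling that the edge is real money to a gambler but not statistically significant on 34 series.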

I know the team with the better record is only 13-13 in Division Series, not sure how this system has done. The team with the best record has only won the WS once since 1989, I'm not sure how the system's pre-playoff champ would do.
   Posted: October 01, 2002 at 11:07 PM
Contrarian, I think the point is that if it is a solid system, you shouldn't have to update the values (James did not do this either, he stopped after 1973). The theory would be that 1903-73 was representative enough of a sample that the results should work going forward without updating.

One other thing, Arthur: I'd imagine that in many of those series the team with the better record was also the system's pick. Let's say that covers 22 of the 34 series and both methods were 15-7 for those series. That would mean that in the marginal series where the two differ, record was 5-7, while the system was 8-4. That would be a much bigger edge for the system than just the 8.8% earlier. I'm just guessing, but I know the system picks the team with the better record a lot; when it doesn't, that might be the time to take a closer look (like Angels/Yankees this year).
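
The marginal-series arithmetic can be sketched like this. The 22 shared picks at 15-7 are the guess from the comment above, not a measured figure, and the helper name is made up for illustration.

```python
def marginal_record(overall, shared):
    """Record in the series where the two methods disagreed:
    the overall (wins, losses) minus the shared-pick (wins, losses)."""
    return overall[0] - shared[0], overall[1] - shared[1]

shared = (15, 7)  # assumed: 22 series where both methods made the same pick

system_marginal = marginal_record((23, 11), shared)
record_marginal = marginal_record((20, 14), shared)
print(system_marginal, record_marginal)  # (8, 4) (5, 7)

for name, (w, l) in (("system", system_marginal), ("record", record_marginal)):
    print(f"{name}: {w / (w + l):.1%} in the {w + l} disputed series")
```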
   Posted: October 01, 2002 at 11:12 PM
I was going to say....

Given that James' system was based on matchups from the 1973 to 1982 seasons and that there have been more than twice as many playoff series since then, don't you guys think it's worth rerunning the numbers before trying to use this? If it was a crude tool then, it's crude and obsolete now. If teams who hit fewer doubles won 14 more series (out of 34) circa 1983, it's entirely possible that things have evened out in the 61 playoff series since then.

...and then I realized that contrarian (#20) had already said essentially the same thing.
   Posted: October 01, 2002 at 11:21 PM
Jay, the system was based on 1903-73, then used from 1974-83 (the time the book was written).

So the edge for teams with fewer doubles was 42-28-1 or something (he used 71 World Series to develop the system).

I don't think this would make the system outdated at all. Many eras were represented from 1903-73, many different types of baseball. All of the comparisons are relative comparisons. There is also a component built into the system that gives the better team an advantage (1 point for every .5 games in the standings). I don't think it would need to be updated to avoid being outdated.

It might be interesting to make a different system for each tier of the playoffs, as the competition should get tougher each round. It worked incredibly well in the 1970s LCSs, going 17-3. In the 80s it was not nearly as good, 3-5. In the 1970s WS it was 7-3; in the early 80s it was 2-2.

Maybe the 1970s were a different animal, who knows, but I think he makes some very valid points about aggressive baserunning, front-line pitching, and power vs. average based offenses.
   Posted: October 01, 2002 at 11:30 PM
I'm not saying it wouldn't help, but I don't think it's necessary.

I just checked our playoff previews from last year, the system was 5-2. Just using record, you would have been 3-4.

James doesn't say anywhere in the article that the system would need updating. It's like linear weights, you don't need a slope corrector. It might be more accurate, but if the system is inherently accurate, 71 series is plenty big enough for a sample size, since all of the points are relatively based.

I realize that in the deadball era, awarding 10 points for the team w/more HR doesn't make sense. But none of the changes in baseball since 1973 have been remotely that drastic. I think the system is fine as is.
   Posted: October 01, 2002 at 11:34 PM
What I'm trying to say is that if you update the model, you can't test it. You'll be building in an advantage into the system, because you'll be testing it based on the numbers used to develop it. Of course it will be accurate then.

The key to testing a model is to test it on outside data. That's why I don't think updating it is the right way to go. If you do, you'll have to wait another 30 years to see if you are 'compiling past coincidences' or finding something that does have predictive value.
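
A minimal sketch of that kind of outside-data check, assuming you have each series as a (year, predicted winner, actual winner) tuple. The function name and the sample rows are placeholders I made up to show the shape of the test; they are not real series results.

```python
def out_of_sample_accuracy(series, last_fit_year):
    """Score the picks only on series played after the years
    that were used to build the system."""
    held_out = [s for s in series if s[0] > last_fit_year]
    if not held_out:
        raise ValueError("no series after the fitting window")
    correct = sum(1 for _, predicted, actual in held_out if predicted == actual)
    return correct / len(held_out)

# Placeholder rows only: (year, predicted winner, actual winner).
toy_series = [
    (1970, "A", "A"),  # inside the fitting window, ignored by the test
    (1973, "B", "A"),  # inside the fitting window, ignored by the test
    (1974, "A", "A"),
    (1975, "B", "A"),
    (1976, "B", "B"),
    (1977, "A", "A"),
]
print(out_of_sample_accuracy(toy_series, last_fit_year=1973))  # 0.75
```

Refit the point values on newer series and every one of those series moves inside the fitting window, which is why the clock on testing would start over.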
   Posted: October 02, 2002 at 12:03 AM
I guess we'll have to agree to disagree, Contrarian. I think you are dead wrong about what James felt. He never mentions what you say once in the article. If I'm missing something please let me know.

You are missing the point that nothing all that much has changed since 1973. The system doesn't take SB into account, so that change doesn't matter. I think this system will apply pretty well to any time from 1920 on. That's the last time there was a fundamental change in the game. Sure offense has gone up and down since then, but the basic building blocks are the same and have been since the HR became a prominent force.

I guess using the old system just doesn't seem pointless to me, since it appears to work. For this exercise I'm not trying to learn anything, just pointing out that it's an interesting system and it seems to work. Improving on it is for another day.
   Posted: October 02, 2002 at 02:35 AM
Hey, I did that!

The system was 4-3 in 1999, when SeanForman ran it. It only seems to make sense that expanding the playoffs would make the system less useful, but it did just fine when the league went to 4 playoff teams, so that may be overstating the case. It's fun either way.
   Posted: October 02, 2002 at 04:07 PM
So it does sound like he did everything univariately. In essence, he's double-rewarding for things like HR's (since such teams also often have fewer doubles).

I'd be curious whether a simpler scheme would work. Something like the team with the higher SLG (or maybe better isolated power), the team with the higher walk rate (or maybe OBP), the team with the better top 3 starters (measured how?). Seems that all of the points awarded except for experience are captured in those three measures.

I'm also curious as to why this works in predicting world series in the post-DH era. Don't AL teams have an inherent advantage in almost all the hitting categories, except average (the NL's will be lower) and maybe doubles? Don't NL teams have an advantage in ERA and maybe shutouts?

For this season (2B and AVG from lowest to highest):

RS: Yankees, Anaheim, Arizona, Oakland, St L, SF, Minn, Atl
   Posted: October 02, 2002 at 04:38 PM
Why in the world would NL pitchers have more walks than AL teams?

Walking #8 guys to get to the pitcher springs to mind.
   Posted: October 03, 2002 at 12:00 PM
John K. - Laughing out loud.

Walt, you did hit on the one correction I'd make. For World Series predicting, I'd adjust all counting categories to the league average, to eliminate the DH effect.
