User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Baseball Primer Newsblog - The Best News Links from the Baseball Newsstand
Tuesday, October 01, 2002
Bill James Prediction System
I ran the Bill James prediction system for the 4 division series: Yankees 68, Angels 68
This reinforces the very nervous feelings I (as a Yankee fan) have about the Yankees/Angels series.
The Cardinals/Diamondbacks score is very deceptive. Arizona picks up 41 points based on 2 doubles, .001 of batting average and 1 shutout. Split those points 21-20 and you end up with the Cards having the advantage 71-57.
The system shows the Braves' pitching and experience as being much more important than the Giants' superior offense.
The only points the Twins score are for hitting more 3B and making fewer errors. In my gut I felt this was the biggest mismatch; the system reinforces that gut feeling.
The system was 70% accurate according to James in 1983. Since it picks the Yanks and Angels as a tie, I’m going to say the Cards will be the 1 mistake out of the other 3. I know I picked the Giants on the other thread, but that’s wishful thinking for Bonds. I’ll really be surprised if they beat Atlanta. How’s that for sitting on the fence?
JoeD has the Imperial March Stuck in His Head
Posted: October 01, 2002 at 08:43 PM | 24 comment(s)
About Baseball Think Factory | Write for Us | Copyright © 1996-2021 Baseball Think Factory
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. fracas' hope springs eternal Posted: October 01, 2002 at 08:50 PM (#163307)
1 point for every half game in the standings
Check that. If you think the Dbacks would be 5 games worse over the course of the season, you could dock Arizona 10 points for that, but the score would still be 68-51.
That series is going to be a classic irresistible force (Braves pitching) vs immovable object (Bonds/Giants offense). Or is it fairly resistible force (Giants pitching) vs nearly frictionless object (Lockhart/Castilla/Lopez/Blanco/pitcher bottom of Braves offense)?
This is comparing two really good teams. Teams that hit a lot of 2B's and hit for high averages tend to be hustling, sequential-type offenses.
When the pitching improves, it's harder for a high average offense to put together the sequence needed to score runs.
Teams that hustle on the basepaths run it up on bad teams. Good teams throw them out at 3rd. Doubles don't come as easy when the team has good defensive outfielders, etc.
Sure HR go down too, but if everything goes down 10%, the team that hits HR has a big advantage. The odds of a HR go to 90% of what they were. The odds of 3 singles in a row go to .9*.9*.9, or 73%. Go back and look: teams with sequential offenses have a terrible postseason record, relative to what you'd think it would be.
As for the teams that walk more batters, they get the points because a disproportionate share of walks come from pitchers that are bad, and they don't pitch in the postseason.
Again, it isn't perfect. If you realize the biases (like hard cutoffs for example) you can have a better idea of when it might be wrong.
I don't think the schedule is that big of a deal. 5 games a year either way, max. The difference between the ballparks understates the Braves pitching more than the difference in the schedule overstates it, IMHO.
I mostly agree with you, but just for the sake of discussion I'll ask about the Curse of Steve Balboni. (I realize Luis G has put this to rest, but it's still freakish that it lasted as long as it did.) Wouldn't having some monster HR guy around be helpful? By the logic you're espousing here wouldn't we have expected HR-heavy teams to do much better than they have over the past 15+ years? Wouldn't *one* team with the big stick have carried your theory out to the end?
It's also possible the HR teams didn't have the pitching. I'll bet that when the HR team lost, it was to a team with better pitching most of the time.
What I'm trying to say is that the HR teams might not have fared well, but the HR might not be the reason.
Another way to study it would be to look at the individual series and see if the teams that outhomered their opponents have done better than the teams that out-'averaged' theirs. This would do better at answering the HR/AVG question, while controlling for the pitching.
It may depend on whether he looked at things bivariately or multivariately. That is, did he just look and find "teams that hit fewer doubles won more series" or did he find "teams that hit fewer doubles, after controlling for these other factors, won more series."
I'm assuming the former, which means that he's getting some spurious correlations. For example, teams that hit fewer doubles are probably teams that hit more HR's (i.e. HR power rather than doubles power). Teams that walk more batters may also be teams that K more batters (though surprisingly K's don't come in here). Scruff makes a good point about sequential offenses perhaps explaining the average quirk, but it may also be an aspect of the sample selection bias and spurious correlation -- any team that makes the playoffs with a low average probably walks a lot and hits for power, otherwise they'd never have scored enough runs.
James strikes me as usually smart enough to avoid such problems, but maybe not with these various toys he tosses out there.
He didn't expect it to work. It picked 70% retroactively, but he never thought it would keep that up going forward. It seemed to him, "like a compilation of past coincidences, and not truly likely to predict the Series winner with any consistency."
Two things have happened since he invented it. One, it continued to work, at least through 1983 (although it was 0-for-2 in the 1982 LCS when he went public). After the 1982 LCSs, it went 3-for-4 on the 1982 WS and 1983 postseason. It was 23-11 from invention through 1983.
Two, he's come a long way towards understanding why it works (emphasis his).
"Why do teams with high batting averages do poorly in WS play? A simple reason: it takes them too many hits to score. If they are legitimately a better offense than their opponents, then that's another story. Most of the time they're not; everybody who gets into a WS has some bats. The higher AVG doesn't indicate the team which has a better offense, but the team which has more of a high-average offense, as opposed to a power offense.
High average offenses score by stringing sequences together. To get a 3-run inning, it might take them five or six hits. Every one of those hits gets harder to come by. If each one becomes 10% more difficult to get, how much more difficult to get are all five of them? You've got five chances to stop that inning.
With the 3-run HR, on the other hand, you've got only 3 chances to stop it. As each element of the offense becomes 10% less common, the 5-element offense is damaged by much more than the 3-element offense is. That's why good pitchers, as a group, allow a higher percentage of their runs on home runs than do poor pitchers. And in a WS, you're always facing a good pitcher. So the % of runs that score on HR is consistently higher in a WS than it is during the season."
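To put rough numbers on the argument quoted above, here is a minimal sketch. It assumes, as a simplification the quote does not spell out, that each element of a scoring sequence succeeds independently and that every element keeps exactly 90% of its regular-season chance against better pitching. The function name `sequence_retention` is made up for illustration.

```python
def sequence_retention(elements: int, per_element_factor: float = 0.9) -> float:
    """Fraction of a scoring sequence's original probability that survives
    when each independent element keeps only `per_element_factor` of its
    regular-season chance (assumed: independence, uniform 10% drop)."""
    return per_element_factor ** elements


# A 3-element sequence (e.g. two baserunners plus a HR) vs. a 5-element one.
three = sequence_retention(3)
five = sequence_retention(5)

print(f"3-element offense keeps {three:.1%} of its scoring chance")
print(f"5-element offense keeps {five:.1%} of its scoring chance")
```

Under those assumptions the 3-element sequence keeps about 73% of its original chance and the 5-element sequence about 59%, which is the gap the quote is describing.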
He then shows the 1983 postseason as an example . . .
27 runs in the WS, 10 HR. Postseason 73 runs, 20 HR. 27% of runs were HR, during the season it was 18%. HR/G only went from 1.57 to 1.54. "But the rate of other runs decreased by much more than that, so the HR attained an added importance. That is exactly what happens."
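The arithmetic behind those figures can be checked with a short sketch. It assumes the 27% was computed as home runs divided by runs scored (20/73), which is a guess based on the numbers quoted; since one HR can score more than one run, this is not quite the same thing as "percent of runs that score on HR". The variable names are illustrative only.

```python
# Figures as quoted in the comment above (1983).
ws_runs, ws_hr = 27, 10          # World Series only
post_runs, post_hr = 73, 20      # whole postseason
season_hr_per_game = 1.57        # regular season, as quoted
post_hr_per_game = 1.54          # postseason, as quoted


def hr_per_run(hr: int, runs: int) -> float:
    """Home runs per run scored (assumed to be how the 27% was derived)."""
    return hr / runs


post_share = hr_per_run(post_hr, post_runs)
ws_share = hr_per_run(ws_hr, ws_runs)

# Relative change in HR per game from regular season to postseason.
hr_rate_change = (post_hr_per_game - season_hr_per_game) / season_hr_per_game

print(f"Postseason HR per run: {post_share:.1%}")
print(f"World Series HR per run: {ws_share:.1%}")
print(f"Change in HR per game: {hr_rate_change:.1%}")
```

The postseason ratio comes out to roughly 27%, matching the quoted figure, while HR per game drops by only about 2%.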
"So with that knowledge, it makes perfect sense for the high-average teams to do poorly in WS play. Other interpretations of odd rules might be more speculative, but I am absolutely convinced that teams which hit a lot of doubles during the season are never going to do well in the aftermath because they are aggressive baserunning teams, teams which are exploiting weaknesses that will not be there when the Series starts. Shutouts are important because most shutouts are thrown by front-line starting pitchers, and front-line starters do a much larger share of the pitching in a WS than they do during the season. Walks are not that important because they are disproportionately influenced by 4th and 5th starters who will spend the WS in hibernation. The other rules are all positive rules; all the system does is weight them according to their historic importance."
Note, he uses WS everywhere. That's because he used WS to develop the system. It's probably not quite as reliable in the early playoffs because the teams aren't as good.
He talks about the bad year in 1982.
First he blames luck; he never said the system would be right 100% of the time.
"The system is supposed to measure what happens when good pitching staffs and good defenses collide. Remember: shortens the offensive sequence, eliminates the value of aggressive baserunning by making it too difficult to exploit the defense? Ordinarily, that's what we're talking about in a WS or playoff -- good offenses against good defenses.
But they ran a ringer on me. For the first time since God-has-forgotten-when, there was not one team in the postseason that had that kind of pitching or that kind of defense. They were just a little above average pitching and defense teams, teams which got in by scoring a jillion runs.
So the advantages that should have been negated, weren't. It was possible to take advantage of the Angel defense, with Reggie in RF and an ailing DeCinces at 3B. It was possible to put together a long offensive sequence against Atlanta . . .
So it will take forever to convince people, but the system still works. It has been in the public domain for two years now; when it has been there for 10 or 15, there will be enough data to make a reasonable evaluation. It may not continue to work 70% of the time, but it isn't going to start working 40%, either. I am not afraid to see what happens to it."
This year's Angel squad might have a case against the Yanks. The Yankee OF D can be taken advantage of. The system predicts a tie, despite docking the Angels 14 points for doubles and 12 for losing the season series 4-3. I'm even more nervous (as a Yankee fan) now than I was a few hours ago.
Not a huge difference, but 58.8% vs. 67.6% is a pretty big difference to a gambler. I don't think that the edge is statistically significant, but it's still a decent edge for the system.
I know the team with the better record is only 13-13 in Division Series, not sure how this system has done. The team with the best record has only won the WS once since 1989, I'm not sure how the system's pre-playoff champ would do.
One other thing, Arthur: I'd imagine that in many of those series the team with the better record was also the team the system predicted to win. Let's say that covers 22 of the 34 series and both methods were 15-7 in those series. That would mean that in the marginal series, where the two methods differ, record was 5-7 while the system was 8-4. That would be a much bigger edge for the system than just the 8.8% earlier. I'm just guessing, but I know the system picks the team with the better record a lot; when it doesn't, that might be the time to take a closer look (like Angels/Yankees this year).
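Here is a quick sketch of that arithmetic. The 22-series overlap and the 15-7 record within it are the commenter's guesses, not known data; only the 34-series totals come from earlier in the thread. Names such as `win_pct` are illustrative.

```python
TOTAL_SERIES = 34

# Guessed (hypothetical) split: series where both methods pick the same team.
agree_wins, agree_losses = 15, 7

# Implied records in the series where the two methods disagree.
record_differ = (5, 7)   # picking the team with the better record
system_differ = (8, 4)   # picking the team the James system favors


def win_pct(wins: int, losses: int) -> float:
    """Winning percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)


record_total = (agree_wins + record_differ[0], agree_losses + record_differ[1])
system_total = (agree_wins + system_differ[0], agree_losses + system_differ[1])

overall_edge = win_pct(*system_total) - win_pct(*record_total)
marginal_edge = win_pct(*system_differ) - win_pct(*record_differ)

print(f"Better record overall: {record_total}, {win_pct(*record_total):.1%}")
print(f"James system overall:  {system_total}, {win_pct(*system_total):.1%}")
print(f"Overall edge: {overall_edge:.1%}, edge where they differ: {marginal_edge:.1%}")
```

With these guessed inputs the totals come back to 20-14 (58.8%) and 23-11 (67.6%), while the edge in the 12 disagreeing series is 25 percentage points rather than 8.8.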
Given that James' system was based on matchups from the 1973 to 1982 seasons and that there have been more than twice as many playoff series since then, don't you guys think it's worth rerunning the numbers before trying to use this? If it was a crude tool then, it's crude and obsolete now. If teams who hit fewer doubles won 14 more series (out of 34) circa 1983, it's entirely possible that things have evened out in the 61 playoff series since then.
...and then I realized that contrarian (#20) had already said essentially the same thing.
So the edge for teams with fewer doubles was 42-28-1 or something (he used 71 World Series to develop the system).
I don't think this would make the system outdated at all. Many eras were represented from 1903-73, many different types of baseball. All of the comparisons are relative comparisons. There is also a component built into the system that gives the better team an advantage (1 point for every .5 games in the standings). I don't think it would need to be updated to avoid being outdated.
It might be interesting to make a different system for each tier of the playoffs, as the competition should get tougher each round. It worked incredibly well in the 1970s LCSs, going 17-3. In the 80s it was not nearly as good, 3-5. In the 1970s WS it was 7-3; in the early 80s it was 2-2.
Maybe the 1970s were a different animal, who knows, but I think he makes some very valid points about aggressive baserunning, front-line pitching, and power vs. average based offenses.
I just checked our playoff previews from last year: the system was 5-2. Just using record, you would have been 3-4.
James doesn't say anywhere in the article that the system would need updating. It's like linear weights, you don't need a slope corrector. It might be more accurate, but if the system is inherently accurate, 71 series is plenty big enough for a sample size, since all of the points are relatively based.
I realize that in the deadball era, awarding 10 points for the team w/more HR doesn't make sense. But none of the changes in baseball since 1973 have been remotely that drastic. I think the system is fine as is.
The key to testing a model is to test it on outside data. That's why I don't think updating it is the right way to go. If you do, you'll have to wait another 30 years to see if you are 'compiling past coincidences' or finding something that does have predictive value.
You are missing the point that nothing all that much has changed since 1973. The system doesn't take SB into account, so that change doesn't matter. I think this system will apply pretty well to any time from 1920 on. That's the last time there was a fundamental change in the game. Sure offense has gone up and down since then, but the basic building blocks are the same and have been since the HR became a prominent force.
I guess using the old system just doesn't seem pointless to me, since it appears to work. For this exercise I'm not trying to learn anything, just pointing out that it's an interesting system and it seems to work. Improving on it is for another day.
The system was 4-3 in 1999, when SeanForman ran it. It only seems to make sense that expanding the playoffs would make the system less useful, but it did just fine when the league went to 4 playoff teams, so that may be overstating the case. It's fun either way.
I'd be curious whether a simpler scheme would work. Something like the team with the higher SLG (or maybe better isolated power), the team with the higher walk rate (or maybe OBP), the team with the better top 3 starters (measured how?). Seems that all of the points awarded except for experience are captured in those three measures.
I'm also curious as to why this works in predicting world series in the post-DH era. Don't AL teams have an inherent advantage in almost all the hitting categories, except average (the NL's will be lower) and maybe doubles? Don't NL teams have an advantage in ERA and maybe shutouts?
For this season (2B and AVG from lowest to highest):
RS: Yankees, Anaheim, Arizona, Oakland, St L, SF, Minn, Atl
Walking #8 guys to get to the pitcher springs to mind.
Walt, you did hit on the one correction I'd make. For World Series predicting, I'd adjust all counting categories to the league average, to eliminate the DH effect.