 
Baseball Primer Newsblog
The Best News Links from the Baseball Newsstand
Monday, December 14, 2009
Whisnant: Beyond Pythagorean Expectation: How Run Distributions Affect Win Percentage
Direct from the 2010 MIT Sloan Sports Analytics Conference (AKA Dorkapalooza) comes…

Reader Comments and Retorts
1. Jack Keefe Posted: December 14, 2009 at 04:33 PM (#3411750)
The Weibull distribution has been found to model both run distribution and the Pythagorean formula very well. Here's the theoretical paper by Steven Miller:
http://arxiv.org/PS_cache/math/pdf/0509/0509698v4.pdf
THT did some work with the Weibull distribution and its relationship to the Pythagorean record.
http://www.hardballtimes.com/main/article/feastorfaminefirstdraft/
http://www.hardballtimes.com/main/article/avoidingthefamine/
http://www.hardballtimes.com/main/article/consistencyiskey/
http://www.hardballtimes.com/main/article/consistencyiskeyparttwo/
http://www.hardballtimes.com/main/article/consistencyisinconsistent/
Keith Woolner looked at this a while ago, too:
http://www.baseballprospectus.com/article.php?articleid=472
Lucky that this guy wasn't in high school during World War One.
I've noted this in some analyses I've done in the past, which is why I've argued that statistical analysts tend to overvalue OBP and undervalue SLG. The 1992 Milwaukee Brewers, a fairly classic small-ball team (and one of my favorite study topics), had a very inefficient offense, and although they won 92 games they underperformed their Pyth by 4 games.
Poisson distributions don't model actual scoring distributions quite as well. Ted Turocy did some research on this several years back, which if I can find it I'll post.
-- MWE
It's much more flexible than the Poisson (or even an overdispersed Poisson) and it is often used in competing risks failure time modeling. Winning vs. losing is really competing risks... you win if you score more runs than the other guy, you lose if you don't.
The reason why Weibull is used is that it is a bit more flexible AND because it is more analytically tractable. There is a close relationship between the exponential/Weibull which is similar to the relationship between the geometric/negative binomial distributions. See, for example, http://www.objectivedoe.com/student/ReliabilityResources/weibull1.html .
The negative binomial can be a bit more annoying to work with and there is not much gain to switching to it for RS/RA (given the historical data about how close it fits), so I would guess that is why the Weibull is often preferred.
Poisson distributions don't allow for heterogeneity of mean/variance (i.e., the mean number of runs differs from game to game because you're playing a different team). I wouldn't be surprised if negative binomials would do the trick, as mentioned above.
I would guess that blowouts are difficult to handle in a model because one manager often gives up and leaves a bad pitcher in the game. Blowouts aren't what they used to be, though; a 6-run lead used to be a blowout.
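To make the mean/variance point concrete, here is a minimal sketch comparing a Poisson with a negative binomial at the same mean. The values of RPG and R are illustrative assumptions, not fitted to any real run data, and the mean/dispersion parameterization of the negative binomial is just one common choice.

```python
import math

def poisson_pmf(k, mean):
    """P(K = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def negbin_pmf(k, mean, r):
    """P(K = k) for a negative binomial with the given mean and dispersion r.

    Its variance is mean + mean**2 / r, so a smaller r means more overdispersion.
    """
    p = r / (r + mean)
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def mean_and_variance(pmf, max_k=200):
    """Mean and variance of a pmf summed over 0..max_k (the far tail is ignored)."""
    probs = [pmf(k) for k in range(max_k + 1)]
    mean = sum(k * q for k, q in enumerate(probs))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    return mean, var

RPG = 4.5  # illustrative runs per game (assumption)
R = 4.0    # illustrative dispersion (assumption)

pois = mean_and_variance(lambda k: poisson_pmf(k, RPG))
nb = mean_and_variance(lambda k: negbin_pmf(k, RPG, R))
print("Poisson           mean, variance:", pois)
print("Negative binomial mean, variance:", nb)
```

The Poisson variance is pinned to its mean, while the negative binomial's extra parameter lets the variance move independently of the mean.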
The 3-parameter Weibull is nice because of its flexibility. Beta allows you to recenter for binning, and so it's really a two-parameter model. This minimizes the dof when doing fitting. The neat thing is that the gamma parameter *is* the Pythagorean exponent, which makes me wonder if there is an underlying reason that the Weibull is a good fit or whether it is just phenomenological. I like Russ' pseudo-explanation about competing risks. I wonder if there's something there to be more fully fleshed out.
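A quick way to see why gamma plays the role of the Pythagorean exponent: for two independent Weibulls with the same shape gamma and the same shift beta, P(X > Y) works out to alpha_rs^gamma / (alpha_rs^gamma + alpha_ra^gamma), and since each mean minus beta is proportional to its alpha, that is the Pythagorean form in (RS - beta) and (RA - beta). The sketch below checks the closed form by simulation. The scales and shape are made-up illustrative values, and the shift is left out because it cancels in the comparison.

```python
import random

def pythagorean_from_scales(alpha_rs, alpha_ra, gamma):
    """Closed-form P(X > Y) for independent Weibulls sharing the shape gamma."""
    return alpha_rs ** gamma / (alpha_rs ** gamma + alpha_ra ** gamma)

def simulated_win_fraction(alpha_rs, alpha_ra, gamma, games, seed=0):
    """Fraction of simulated games in which runs scored exceed runs allowed."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        # weibullvariate takes (scale, shape); a shared shift would cancel here
        scored = rng.weibullvariate(alpha_rs, gamma)
        allowed = rng.weibullvariate(alpha_ra, gamma)
        if scored > allowed:
            wins += 1
    return wins / games

# Illustrative parameters only (assumptions, not fitted values)
ALPHA_RS, ALPHA_RA, GAMMA = 5.5, 4.8, 1.8

closed = pythagorean_from_scales(ALPHA_RS, ALPHA_RA, GAMMA)
simulated = simulated_win_fraction(ALPHA_RS, ALPHA_RA, GAMMA, games=200_000)
print("closed form:", round(closed, 4), "simulated:", round(simulated, 4))
```

Because the draws are continuous, ties never occur in this sketch.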
The author here; Sal Baxamusa alerted me to this discussion.
I did know about the Miller proof (in fact, I mentioned it in the article!). I knew about one of the THT articles; thanks for the links to the others, plus the Woolner article.
Certainly Weibull is just a phenomenological fit (it's continuous after all).
The problem with the Weibull, as used by Miller (other than the fact that it's continuous and not discrete), is that it becomes a one-parameter distribution after fitting: it loses one parameter when choosing the binning (as you mentioned), and loses another when fitting to the data. The remaining parameter is the Pythagorean exponent, of course, but since it's a universal exponent, it doesn't allow for variation of distribution shapes.
I suppose you could use Weibull distributions with different gamma parameters for each team; it would be interesting to see if that gave results similar to mine.
As Mike said, if you want to go to a discrete distribution, Poisson won't work. As Russ said, with Poisson the shape is not independent of the mean. Another way of looking at it is that Poisson assumes events are independent, but runs are not independent events in baseball, since your probability of scoring another run depends on how you scored the last one (i.e., whether you still had men on base). Interestingly, I've found hockey scoring to be well-described by Poisson, but that's because goals ARE a good approximation to independent events.
I doubt if any standard discrete distribution accurately reproduces run scoring distributions, which is why I went the modeling route. As depletion mentioned, you need a way to resolve ties, too. The basis for RPG distribution is the RPI (runs per inning) distribution, which also allows you to determine who wins in extra innings.
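As a rough sketch of how an RPI distribution can yield both an RPG distribution and a tie-breaker: convolve the per-inning distribution nine times, then settle ties one extra inning at a time. The per-inning probabilities below are invented for illustration, innings are treated as independent and identically distributed, and the home team's skipped or shortened final inning is ignored. This is not necessarily how the article does it.

```python
def convolve(p, q):
    """Distribution of the sum of two independent scores with pmfs p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def game_distribution(inning_pmf, innings=9):
    """Runs-per-game pmf from a runs-per-inning pmf (independent innings assumed)."""
    dist = [1.0]
    for _ in range(innings):
        dist = convolve(dist, inning_pmf)
    return dist

def head_to_head(p, q):
    """Return (P(X > Y), P(X == Y)) for independent scores X ~ p and Y ~ q."""
    win = sum(a * b for i, a in enumerate(p) for j, b in enumerate(q) if i > j)
    tie = sum(a * b for a, b in zip(p, q))
    return win, tie

def win_probability(inning_a, inning_b):
    """P(team A beats team B), with ties after nine settled in extra innings."""
    win9, tie9 = head_to_head(game_distribution(inning_a), game_distribution(inning_b))
    win1, tie1 = head_to_head(inning_a, inning_b)
    # Extra innings repeat until one side outscores the other in an inning
    extra = win1 / (1.0 - tie1)
    return win9 + tie9 * extra

# Hypothetical P(0, 1, 2, 3, 4 runs in an inning) for two teams
INNING_A = [0.73, 0.15, 0.07, 0.03, 0.02]
INNING_B = [0.75, 0.14, 0.06, 0.03, 0.02]

print("P(A beats B):", round(win_probability(INNING_A, INNING_B), 4))
```

Keeping everything discrete is what lets the tie probability show up explicitly, which a continuous fit cannot do.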
I've had this discussion re: Markov chain analyses of baseball, where future and past states are also not independent of the present state but where the Markov chain appears to be a reasonable model nonetheless. Does this caveat make sense in that context as well?
-- MWE
I agree, future and past states are also not completely independent (violating the Markov chain hypothesis), but it probably doesn't make that big a difference.
The fact that the W/L parameters derived from the Markov chain analysis also fit the real data fairly well was encouraging, and suggests that the lack of independence is minimal, or at least not a problem.
As it turns out the difference between the best and worst AL teams in SLG was .081 last year. So we're talking about a 1 win difference here, at the very extremes.
While preparing my article I looked at using Weibulls with different gammas to see if a modified Pythagorean expectation followed from it, but the math appeared to become intractable. Of course, if you didn't want a closed form result that wouldn't matter.
I think the alpha is the main driver of the run environment (i.e., RPG), so for the Weibull to model the run distribution effectively, I would expect the gamma to mainly affect the shape (i.e., standard deviation), although of course both mean and variance are functions of both alpha and gamma.
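For reference, the standard Weibull moments are mean = alpha * Gamma(1 + 1/gamma) + beta and variance = alpha^2 * [Gamma(1 + 2/gamma) - Gamma(1 + 1/gamma)^2]. A small sketch (parameter values are arbitrary illustrations) shows that alpha scales mean and spread together, while the spread relative to the shifted mean depends on gamma alone:

```python
import math

def weibull_mean_sd(alpha, gamma, beta=0.0):
    """Mean and standard deviation of a Weibull with scale alpha, shape gamma, shift beta."""
    g1 = math.gamma(1.0 + 1.0 / gamma)
    g2 = math.gamma(1.0 + 2.0 / gamma)
    mean = alpha * g1 + beta
    sd = alpha * math.sqrt(g2 - g1 * g1)
    return mean, sd

def relative_spread(alpha, gamma, beta=0.0):
    """sd / (mean - beta); alpha cancels, so this depends only on gamma."""
    mean, sd = weibull_mean_sd(alpha, gamma, beta)
    return sd / (mean - beta)

# Arbitrary illustrative values (assumptions), with the -0.5 binning shift
for alpha in (4.0, 6.0):
    for gamma in (1.5, 2.0):
        mean, sd = weibull_mean_sd(alpha, gamma, beta=-0.5)
        ratio = relative_spread(alpha, gamma, beta=-0.5)
        print(f"alpha={alpha} gamma={gamma} mean={mean:.3f} sd={sd:.3f} ratio={ratio:.3f}")
```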
The problem with any continuous distribution is that it doesn't handle ties, which is why I like starting with a RPI distribution, which can handle ties and which also allows you to calculate a RPG distribution.
Yes, but you also have the pitching side, which can give another win. And if you construct your team using these principles, you might be able to accentuate the differences some.
But, agreed, these are not huge effects. The problem is you have to do it for every player, and the gain per player is small. OTOH, if you could do something to add two wins a year, why not do it?
I like the basic result, which is that a SLG .080 higher for a player is worth one additional run. So if you have one player that rates out to a WARP or WAR of 4.3 and another to 4.8, if the first player had a SLG .080 higher, you would actually rate him at 5.3 compared to the other player. Or with a SLG .160 higher, it would be 6.3. That's not insignificant.
Fielding is a large part of a WARP rating, so lately I've been wondering if there is a similar effect in fielding, i.e., do some fielding plays that have the same runs allowed actually lead to different shapes of the runs allowed distribution, which would affect wins? That will be a much harder nut to crack.
No, one extra run would change the 4.3 player to 4.4, a trivial change. (It takes 10 runs to generate one win.)
Oops, I meant RAR, not WARP. Duh. Anyway, doing it for the whole team (both offense and defense) might be worth 2 wins.
Although the differences are small for one player, it can be used as a tiebreaker, i.e., for two players with the same conventional rating, choose the one with the higher SLG.
Sorry if this was covered already, but is consistency always good? Wouldn't a bad team (one that averages fewer RS than RA) win more games with an inconsistent offense?
I agree they are not large effects, and noise makes it less useful. The baserunning aspect is definitely too small to be useful!
And the effect varies with the run environment. In 1968, it would have taken only a .050 higher team SLG to add a win; of course, SLG separation would probably be harder to achieve in a lower run environment.
That's an interesting question. There are some subtleties I didn't get into in the article (there was a length limit on the original, and I didn't rewrite it for the web):
Although I used standard deviation as a measure of the shape (how wide it is), you can also have a skew as well (more probability above or below the average), in which case the distribution is not symmetric about the mean.
If you had a skew with more probability above the mean, then a wider distribution (more inconsistent) WOULD be helpful, no matter what your RPG. However, realistic baseball distributions always have more probability below the mean, and in that case a wider distribution is bad.
A neat (non-realistic) example showing these effects is to consider three teams, A who scores 4 runs two-thirds of the time and 7 runs the other third (call it a 4-4-7 distribution), B with a 6-6-3 distribution, and C who always scores 5 runs (call it 5-5-5). All of these have an average RPG of 5.0.
A is sort of shaped like a baseball team (more probability below the mean), C is perfectly consistent, and B is unrealistic (more probability above the mean).
It turns out that C beats A (6 games out of 9), A beats B (5 games out of 9) and B beats C (6 games out of 9)! For perfectly general distributions, which team is better is non-transitive.
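The three-team example can be checked by brute-force enumeration. This is a minimal sketch that treats each team's three listed scores as equally likely and the two teams' scores as independent:

```python
from fractions import Fraction
from itertools import product

# Each tuple lists three equally likely scores, as in the example above
TEAMS = {"A": (4, 4, 7), "B": (6, 6, 3), "C": (5, 5, 5)}

def win_fraction(x, y):
    """Exact fraction of the 9 score pairings in which team x outscores team y."""
    outcomes = list(product(TEAMS[x], TEAMS[y]))
    wins = sum(1 for a, b in outcomes if a > b)
    return Fraction(wins, len(outcomes))

for x, y in (("C", "A"), ("A", "B"), ("B", "C")):
    frac = win_fraction(x, y)
    print(f"{x} beats {y} in {frac * 9} of 9 pairings")
```

None of these three matchups can end in a tie, so no tie-breaking rule is needed here.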
In real life, the skew correlates very closely with the standard deviation, always has more probability below the mean, and doesn't really add anything to the analysis. The team closest to perfect consistency (like team C) will do better for the same RPG.