Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Monday, June 27, 2005

The Hardball Times: Fox: Pythagoras and the White Sox

Repoz Posted: June 27, 2005 at 07:17 AM | 61 comment(s)
  Related News: Sabermetrics

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. Walt Davis Posted: June 27, 2005 at 11:30 AM (#1434144)

Methinks Dan missed the question.

The question is how well does mid-season pythag vs. mid-season record predict how well the team will do from that point forward.

Of course there comes a point where actual record does a better job of predicting final record than pythag. The White Sox don’t have to give back all those “extra” wins just because pythag says they do.

To make it clear:

ESPN has the Sox at 4 games above their pythag (not a particularly huge margin) with a pythag win % of .621.  If Pythag is a better indicator of “true ability” than record, we expect them to win 62% of their remaining games...but we don’t expect their final percentage to magically balance out to .621.  If they win 62% of their remaining games they’ll end up with 104-105 wins (or 4 more wins than expected based on pythag).

I don’t know if he has the data to test it, but he probably should have at least mentioned BPro’s “adjusted standings”.  RS and RA also involve a certain amount of “luck”.  BPro uses EQR and EQRA and then some further adjustment for quality of competition faced to get their adjusted standings, essentially an adjusted Pythagorean.  In those standings, the Sox are a whopping 8.5 wins better than expected, or a “true” .561 team.  But even if they are, they still get another 49 wins and finish with 99.

In short, no matter how you slice it, the White Sox are expected to make the playoffs.  Given the Twins need to play .662 ball from here to finish with 99 wins, they’re unlikely to win the division.

Note, even if the Sox are a true .500 team, that’s still 94 wins and most likely no worse than the wild card.
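The projections in this post can be checked with a short sketch. The 50-24 record is not stated in the post; it is inferred here from the quoted figures (49 more wins at .561 to finish with 99 implies 50 wins and roughly 88 games left), so treat it as an assumption, along with the function name.

```python
def projected_final_wins(wins_so_far, games_played, rest_of_season_pct, season_games=162):
    """Wins already banked plus expected wins over the remaining schedule."""
    remaining = season_games - games_played
    return wins_so_far + rest_of_season_pct * remaining

# Assumed record: 50-24 (inferred from the post's numbers, not stated in it).
WINS, GAMES = 50, 74

scenarios = [("pythag", 0.621), ("adjusted standings", 0.561), ("true .500 team", 0.500)]
for label, pct in scenarios:
    print(f"{label}: {projected_final_wins(WINS, GAMES, pct):.1f} wins")
```

The point of the sketch is that wins already in the books are kept as-is; only the remaining games are projected at the "true ability" rate.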

Dan also makes a mistake when he says that 1-run win %age is not related to team ability.  Of course it is: good teams win more 1-run games than bad teams.  I assume what he meant is that the gap between good and bad is much smaller in 1-run games.  Or just that even if the Sox are a true .600+ team, going 20-8 in 1-run games takes a good dose of luck.

   2. Fridas Boss Posted: June 27, 2005 at 12:01 PM (#1434211)

Good point Walt, I think the question you pose is the more interesting one.

Quick question about Pythag: it assumes RS/RA will be constant for the rest of the year, correct?  If so, why would we ever assume this to be the case?

I also have a hard time treating runs allowed as evenly distributed and thus extrapolatable over a longer course of the season.  Pitching is very punctuated, and can cause distributions that would quite explicably put you over or under Pythag.  If a team had 4 lights-out starters that finish their games and a disastrous bullpen and 5th starter, I imagine their overall RA would not map to the expected W%.

I think RS is much more amenable to pythag projections because it is more evenly distributed.  But strategies (e.g., small ball) leveraged at select times, as well as selected pitching/defense against your offence, can also skew the distribution.

   3. Steve Treder Posted: June 27, 2005 at 12:22 PM (#1434263)

I think asking Pythag to perform as a regular predictor of future performance within a season is pushing it.

On a full-season basis, most teams’ actual W-L is within a few games of their Pythag.  For the minority of teams that significantly over- or under-perform against their Pythag on a full-year basis, it can be seen as a piece of evidence (not the entire evidence, but one clue) that the team wasn’t actually quite that good/bad, and thus as an indicator of what to expect next year.

The full season sample size is an important issue, as is the fact that for most teams in most years, Pythag is a moot point.

   4. A Surfeit of Peaches Graham (SdeB) Posted: June 27, 2005 at 12:33 PM (#1434278)

Of course it is—good teams win more 1-run games than bad teams.

Here are two of the best teams in one-run games in baseball history:

1974 Padres. 31-16 in one-run games, 29-86 in others.

1955 Athletics. 30-15 in one-run games, 33-76 in others.

   5. I am Ted F'ing Williams Posted: June 27, 2005 at 12:44 PM (#1434301)

As Treder says, pythag is moot; the only reason to look at it is to see if a team is unusually lucky or unlucky.  The White Sox have the best pythag winning percentage in the AL, so the Sox aren’t all that lucky.  Considering that the Twins’ actual was better than their pythag in 2002-2004 while the White Sox’ actual was worse than their pythag in those same years, you could argue that the Twins were luckier the last few years than the White Sox are now.

   6. Tango Tiger Posted: June 27, 2005 at 12:44 PM (#1434302)

For those who missed it, I recommend Clay Davenport’s article on the subject, as well as a thread I started at Batter’s Box.

   7. villageidiom Posted: June 27, 2005 at 12:45 PM (#1434306)

Didn’t BPro do a similar study a few years back? I seem to recall it was in conjunction with their adjusted standings and/or playoff odds reports, which at the time used Pythag to project W-L of future games. I think they’d found that, at a certain point late in the season, actual W-L does a better job than Pythag W-L in predicting the remainder of the season.

Actually, upon a search, I seem to have confirmed what I’d recalled. The point late in the season where actual record and Pythag record switch places is about 135 games, or with one month to go. That’s a far cry from “end of May”.

   8. villageidiom Posted: June 27, 2005 at 12:46 PM (#1434308)

Damn you, Tangotiger! Beat me by one minute.

   9. PhillyBooster Posted: June 27, 2005 at 12:55 PM (#1434332)

It seems to me that I read somewhere on this site in the last week (although I don’t know where, exactly, and I don’t remember how much of it is my interpretation of what I was reading) that the White Sox were “overperforming Pythagoras” to a large degree because they are being very consistent—very few 0 and 1 run offensive performances, but very few 10 run blowouts either.

If Team A scores 0 runs one game and 10 runs the next, their projected Win% will be somewhere very close to .500.  With a league average (4.50 ERA) bullpen, they will almost certainly lose game 1 and win game 2.

If Team B scores 5 runs one game and 5 runs the next, their projected Win% will be somewhat higher, as there is a decent chance that they’ll win both games.

Put another way, without knowing how many runs your opponent will score, the marginal value of each additional run decreases.  So a “small ball” style of play, even in the early innings, can make sense.  Sure, it is true that you’re increasing your chances for one run, in return for decreasing your chances of multiple runs, but on the other hand, those first few runs have a much better chance of being decisive, and the extra “multiple runs” have a better chance of just being superfluous Pythagorean padding.

Put a third way: it’s the top of the first, and the leadoff hitter just doubled.  You project that playing “normally,” you will score 4 runs in innings 2-9.  With that assumption, you now consider playing for one run (giving you five for the game) or for a lower chance of multiple runs.  Even if the “multiple runs” approach gives you the best overall return in terms of runs, the value of “run 4 to run 5” increases your chances of winning the game by more than “run 5 to run 6 or 7 or 8.” Playing for one run might still make sense.

   10. PhillyBooster Posted: June 27, 2005 at 01:00 PM (#1434347)

Okay.  It was this article, and the win percentages were:

RS Win %
0 0.000
1 0.078
2 0.243
3 0.322
4 0.494
5 0.606
6 0.700
7 0.858
8 0.847
9 0.880
10 0.936

So, if you score 5 runs per game, you’ll win 60.6%, but if you alternate 1 and 9 runs, you’ll only win 47.9%.

In fact, you’ll be better off scoring exactly 4 runs per game than you will alternating 1 and 9 runs per game.

On this theory, teams that are most consistent are most likely to outperform Pythagoras.
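A quick way to check the comparison is to average the table’s win percentages over a repeating scoring pattern. This is only a sketch: it assumes a team’s chance of winning depends solely on how many runs it scores, using the table as given, and the function and variable names are made up for illustration.

```python
# Win % by runs scored, copied from the table in this post.
WIN_PCT_BY_RUNS = {
    0: 0.000, 1: 0.078, 2: 0.243, 3: 0.322, 4: 0.494, 5: 0.606,
    6: 0.700, 7: 0.858, 8: 0.847, 9: 0.880, 10: 0.936,
}

def expected_win_pct(run_pattern):
    """Average the table's win % over a repeating pattern of runs scored."""
    return sum(WIN_PCT_BY_RUNS[runs] for runs in run_pattern) / len(run_pattern)

steady_five = expected_win_pct([5])          # 5 runs every game
steady_four = expected_win_pct([4])          # 4 runs every game
feast_or_famine = expected_win_pct([1, 9])   # alternating 1 and 9 (also 5 per game)

print(f"5 every game:      {steady_five:.3f}")
print(f"4 every game:      {steady_four:.3f}")
print(f"alternating 1 & 9: {feast_or_famine:.3f}")
```

Both the steady-five and the 1-and-9 team average 5 runs per game, so any Pythagorean-style estimate built on run totals would rate their offenses identically.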

   11. villageidiom Posted: June 27, 2005 at 01:04 PM (#1434354)

We need to keep in mind that we’re talking about two estimates here. Both Pythag and Actual are being used as estimates of a team’s “true” level of ability.

Being decent estimates, they’re correct in many cases and wrong in others. It does seem clear, based on the body of work out there, that Pythag is a better evaluator of a team’s “true” level of ability than actual W-L record - at least to the 135-game level, as seen in the article linked above.

But I think the interesting question is this: is there anything systematic about the cases where Pythag fails? I say this is the interesting question because I’m always interested in improvements to current methods. Pythag is very simple, and it would seem to me that there are more gains to be made than merely to play around with the exponent.

Let me put it this way. We know that Actual and Pythag will vary significantly when a team’s RS distribution is positively correlated with their RA distribution. Let’s say you have the following results in ten games:

Game   1   2   3   4   5   6   7   8   9  10
RS    10   9   8   7   6   5   4   3   2   1
RA     9   8   7   6   5   4   3   2   1   0

That’s RS = 55, and RA = 45, for a Pythag of .599. Actual win %, however, is 1.000. The difference comes because high-RS games happen to correlate with high-RA games.
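The ten-game example can be reproduced in a few lines. The sketch below assumes the classic Pythagorean form with an exponent of 2, which the post does not state but which matches its .599 figure; the function name is illustrative.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win % from run totals (exponent of 2 is an assumption)."""
    scored = runs_scored ** exponent
    return scored / (scored + runs_allowed ** exponent)

# Game-by-game runs scored and allowed from the ten-game example.
rs_by_game = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
ra_by_game = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

wins = sum(rs > ra for rs, ra in zip(rs_by_game, ra_by_game))
actual = wins / len(rs_by_game)
pythag = pythag_pct(sum(rs_by_game), sum(ra_by_game))

print(f"RS = {sum(rs_by_game)}, RA = {sum(ra_by_game)}")
print(f"Pythag: {pythag:.3f}, actual: {actual:.3f}")
```

Pythag only sees the totals (55 and 45), so it cannot tell this perfectly paired sequence from any other arrangement of the same runs.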

In general Pythag works because we don’t expect much of a correlation between the two on a game-by-game basis. And generally that works. But are there cases where we should expect it?

   12. PhillyBooster Posted: June 27, 2005 at 01:05 PM (#1434357)

And, as the chart at the linked article shows, the White Sox have scored exactly 4, 5, and 6 runs a disproportionate number of times.

A team that does that, and has average to a-little-above-average pitching, will likely win lots of one-run games because, like lots of average pitching staffs, it will give up 3 or 4 or 5 runs a lot.

   13. Martin Hemner Posted: June 27, 2005 at 01:08 PM (#1434363)

I think Clay’s article hits the main reason for Pythag being superior on the head.  Early in the season, you’re more likely to see extreme outliers in records, on the high and low side. The vast majority of teams will regress towards their mean as the season goes on. Given that the Pythagorean formula is more of a conservative estimate than actual record, it almost has to be a better predictor based on early season record, until we get to a point where most teams have a sufficiently large actual sample.

In other words, just my quick novice glance at these numbers tells me that Pythag is more likely to predict that a team will finish closer to .500 than actual record, and most teams will finish closer to .500 than their actual record until a certain point in the season.

   14. Danny Posted: June 27, 2005 at 01:13 PM (#1434370)

On this theory, teams that are most consistent are most likely to outperform Pythagoras.

For most teams I think that’s right.  But if you have a team with a very poor offense or pitching staff, a more varied offense would probably result in more wins.  A team that scored 400 runs in a season, for example, would probably win more games with more varied scoring.

Here’s a relevant article from Prospectus: “Hitting Approach and Run-Scoring Consistency”

   15. PhillyBooster Posted: June 27, 2005 at 01:28 PM (#1434394)

Sure, on the extremes a team that consistently scores 5 and gives up 4 runs will win 162 more games in a year than a team that consistently scores 4 and gives up 5.

I guess my point is, then, that Step One is to become Good.  Once you are Good, however, it is not clear whether the next goal should be to become Better or to become More Consistent.

As just one obvious example, the Yankees have averaged 5.25 runs per game in their 24 games in June.  They have scored more than 5.25 runs 8 times, and fewer than 5.25 runs 16 times, including 10 games of 3 or fewer runs.

They could increase their Pythagorean record by increasing their average runs per game, but they’d win more if they just scored five or six every game, instead of alternating between 2-runs and 9-runs.

   16. Dag Nabbit Posted: June 27, 2005 at 01:36 PM (#1434409)

And, as the chart at the linked article shows, the White Sox have scored exactly 4, 5, and 6 runs a disproportionate number of times.

Bob Welch Syndrome.  The trick to Welch’s big 27 win season wasn’t fantastic run support, it was fantastically consistent run support.

As Treder says, pythag is moot, the only reason to look at it is to see if a team is unusually lucky or unlucky.

And is it all luck?  Didn’t Neyer have a column a few years back with the best teams of all-time in 1-run games?  IIRC, two were Weaver teams and two others were early century Pirate teams.  Neyer brought the info out to show that Tony Muser’s dogshit record in 1-run games wasn’t necessarily just flukey luck.

Does the White Sox bullpen have anything to do with their record in 1-run games?

As mentioned over the weekend, every team in the wildcard era that had an 8-game lead or better and/or a .640 winning percentage or better at 70+ games in made the playoffs.

   17. Spivey Posted: June 27, 2005 at 01:39 PM (#1434416)

I have seen talk here and at Neyer’s message board that between 2 teams equal in ‘true’ RS/RA abilities, if one has a strong bullpen they’re more likely to outperform their Pythag%. This makes intuitive sense as they’re more likely to win their close games. By the same token, teams with really poor mop-up relievers would probably also outperform their Pythag so long as they tend to pretty much only pitch in low leverage situations.

   18. John DiFool2 Posted: June 27, 2005 at 01:45 PM (#1434427)

If someone has time, I’d like to take the data from the table to predict the winning % FROM THAT POINT FORWARD to the end of the season: that would probably be more useful, as the correlations get dirtied because wins and losses already in the books bias the results if you look at overall (162 game) seasonal W%’s…

   19. Greg Pope Posted: June 27, 2005 at 01:51 PM (#1434432)

After reading the two articles in post 6, it looks to me like pythag does a good job with most teams, but not with really good or really bad teams.  Now that’s probably mostly due to the fact that really good and bad teams are not really as good as their numbers indicate.  But in the previous thread it was shown that the Braves and Yankees have outperformed pythag in (almost?) every year of their recent runs.

There were a few reasons thrown about in that thread, but could it be that pythag just doesn’t work as well for really good teams, and needs adjustment?  So it may not be that bullpen, or starters, or managers, or whatever are important to outperforming your pythag, but just that the key to outperforming your pythag is to just be a good team?

I have no idea how you would test that.

   20. Tango Tiger Posted: June 27, 2005 at 02:08 PM (#1434466)

John, please read links in #6.

***

Greg, the better number to use is the PythagoPat.  This works fine with all types of teams.

   21. retro-shiite Posted: June 27, 2005 at 02:14 PM (#1434480)

Here are two of the best teams in one-run games in baseball history:

1974 Padres. 31-16 in one-run games, 29-86 in others.

1955 Athletics. 30-15 in one-run games, 33-76 in others.

IIRC, the ‘03 Tigers (they of the 43-119 record) had a winning record in 1-run games.

   22. Walt Davis Posted: June 27, 2005 at 02:24 PM (#1434511)

I’m wondering about Dan’s method.  The average error is always reported as positive.  Is this the mean absolute error (i.e. over-predicting by 2 and under-predicting by 2 are both treated as an error of 2) or is he saying that pythag is biased and tends to underpredict the number of wins by 3 per year?

Quick question about Pythag: it assumes RS/RA will be constant for the rest of the year, correct? If so, why would we ever assume this to be the case?

Not exactly.  Pythag doesn’t “assume” anything, it’s just a measure of what a record can be expected to be given particular values of RS and RA.  (yes, there are assumptions in that too, but that’s not what Frida’s Boss is talking about I don’t think)

Now, to use mid-season pythag to project performance from here on out does assume that RS/RA will be constant (within random fluctuation) ... which is to say that it assumes that pythag is a good measure of true talent.

If you don’t like that latter assumption, blame the user, not the statistic.

Here are two of the best teams in one-run games in baseball history:

1974 Padres. 31-16 in one-run games, 29-86 in others.

1955 Athletics. 30-15 in one-run games, 33-76 in others.

and so what???

I’m sorry.  Would it have been clearer if I’d said ON AVERAGE good teams win more one-run games than bad teams?

From 2000-2004, the top 4 teams in one-run wins are Oakland, Dodgers, Yankees, and Twins, all winning over 55% of their one-run games.  The overall win % of those teams ranges from .532 to .604.  The bottom 4 are Milwaukee, Detroit, Colorado, and KC, all losing about 55%-60% of their one-run games.  The overall win % of those teams ranges from .389 to .457.

Now in between, things are a jumble, but that’s not so unexpected.  Still, for 2000-2004, the correlation between overall win% and 1-run win% is .7.  The correlation between 1-run win% and >1-run win% is .57.

Or were you just trying to point out freaky counter-examples while conceding the general point?

I think asking Pythag to perform as a regular predictor of future performance within a season is pushing it.

Poppycock.  The central question is whether pythag is a good measure of “true talent.” If we accept that it is, then there’s no reason not to use it in-season as long as we recognize that the error bars around that estimate are larger than at end of season.

Now at times those error bars may be so wide as to be essentially useless (after 10 games say), but if you’re willing to use pythag after 162 games, there’s no particularly good reason not to use it after 81.
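To put rough numbers on how the error bars shrink with sample size, here is a sketch using a plain binomial model. It assumes independent games with a fixed win probability, which is a simplification, and it describes the noise in a won-lost record rather than in Pythag itself; the way uncertainty falls with the square root of games played is the part that carries over. The function name is hypothetical.

```python
import math

def win_pct_standard_error(true_pct, games):
    """Standard error of an observed win % over `games`, under a binomial model."""
    return math.sqrt(true_pct * (1 - true_pct) / games)

# Compare a 10-game sample, a half season, and a full season for a .500 team.
for games in (10, 81, 162):
    se = win_pct_standard_error(0.500, games)
    print(f"{games:>3} games: +/- {se:.3f} in win %, or about {se * games:.1f} wins")
```

Under these assumptions the half-season figure is only about 1.4 times (the square root of 2) as wide as the full-season one, while the 10-game figure is about four times as wide.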

   23. Dag Nabbit Posted: June 27, 2005 at 02:28 PM (#1434522)

nor is the Sox pitching likely to perform at the current level all season.

Outta curiosity, what do people here think the White Sox pitching will do the rest of the year?

I’ve seen varieties of this statement all year long, and they keep pitching great.  Buehrle’s entering his prime, Garcia might be regaining the stuff he showed when he first broke into the Mariners’ rotation, Garland’s finally figured out the strike zone, and with Contreras, well, his stuff’s been better than his numbers.  I look at him and the first thing I think of is Jay Jaffe’s tremendous article demolishing Mel Stottlemyre’s ability as pitching coach from the beginning of the year.  Wouldn’t surprise me a bit if each and every one of these guys ends up far surpassing their projections and keeps on a-chooglin’.  Maybe one or two will slide, but I really doubt the entire staff will find this big bag o’ runs that they were supposed to give up and start giving them up.

   24. Buddha Posted: June 27, 2005 at 02:34 PM (#1434541)

I think the Sox pitchers’ numbers will start to increase as the weather gets warmer and the balls start flying out of the Cell.  Buehrle should still be stellar.  I think Garcia will get busted a bit.  Contreras is living on borrowed time as is El Duque.  The real wild card is Garland.  He went through a stretch of bad games, but he looked good against the Cubs.  I don’t know if he can keep it up or not.  But if he can, the Sox are looking at 100+ wins, IMO.

   25. retro-shiite Posted: June 27, 2005 at 02:38 PM (#1434557)

Or were you just trying to point out freaky counter-examples while conceding the general point?

I think he was pointing out that, compared to records in more lopsided games, record in one-run games is a poor indicator of a team’s quality (success or failure in one-run games is more a result of luck than in more decisive games).  Poor (even downright awful) teams sometimes have good records in one-run games, while such teams virtually never have anything but terrible records in lopsided games.

   26. PhillyBooster Posted: June 27, 2005 at 02:40 PM (#1434565)

But if he can, the Sox are looking at 100+ wins, IMO.

Well, seeing as how they only have to play .500 the rest of the way to be a 94 win team, the pitchers could all drop back to practically league average and they’d still win 100 games if Frank Thomas continues to re-emerge.

   27. The Jerry Royster Experience Posted: June 27, 2005 at 02:43 PM (#1434574)

Outta curiosity, what do people here think the White Sox pitching will do the rest of the year?

I’m extremely biased, but here goes -

Buehrle’s the real deal.  He’s the ace of most MLB staffs.

Garcia’s settled into the Matt Clement/Matt Morris “Good but not great” category.  He’ll never be the stud that people were predicting he’d become, but he’s a solid #2.

Garland has finally taken the next step that scouts have been predicting.  He’s always had the stuff, but he’s finally developed the confidence to know that he can get away with throwing strikes.  He’ll regress some as hitters adjust to him, but he’ll be above league-average.

You never know what you’re getting with Contreras.  He still has the propensity for the big inning, but the difference is that, unlike last year, he’s settling down and getting the job done after the big inning instead of just coming unglued.

El Duque is injury-prone.

I have to eat my words on Brandon McCarthy.  I thought he was ready when I saw him in Spring Training, but he’s clearly not.  He’ll develop in the long run (his makeup and his stuff are great), but right now he’s just too homer-prone.

   28. Steve Treder Posted: June 27, 2005 at 02:53 PM (#1434603)

The central question is whether pythag is a good measure of “true talent.” If we accept that it is, then there’s no reason not to use it in-season as long as we recognize that the error bars around that estimate are larger than at end of season.

Well, that’s all I’m sayin’.  The smaller the sample, the greater the error bars.  It’s “pushing” the utility of Pythag, but not so far as to necessarily invalidate it.

   29. villageidiom Posted: June 27, 2005 at 02:55 PM (#1434610)

The central question is whether pythag is a good measure of “true talent.”

It certainly is a good measure of “true talent”. I think the general consensus is that Actual W-L is good, and Pythag is better, at measuring “true talent”.

What I’m looking for is a great measure. And I think diddling around with the exponent in Pythag isn’t getting us far enough. What kinds of teams can be expected to outperform Pythag somewhat consistently? Underperform consistently? How much of the difference between Pythag and Actual at the end of the season was random variation, and how much was non-random?

   30. cardsfanboy Posted: June 27, 2005 at 02:57 PM (#1434619)

I think it’s funny that the White Sox pitchers don’t get the credit they deserve. A while back I was looking into which teams have had the most success developing pitchers over the last 10-15 years (mostly just to bag on the Braves, and to be used as a counterargument against the bagging on TLR as a butcher of young arms).

By my method (which is simplistic: multiple seasons over 100 ERA+ with 162 innings pitched), I believe that the White Sox, Blue Jays, Twins? and Dodgers had the most pitchers they developed. I’m at work and don’t have my notes with me, so I’m going from memory.

For years I have thought that the White Sox were a good quality team, but I’m now to the point that I’m almost willing to concede to Guillen that he was right about the small ball comments (not all of them, of course).

   31. The Polish Sausage Racer Posted: June 27, 2005 at 02:59 PM (#1434624)

There’s a severe difference between saying that good teams win more 1-run games and teams that win more 1-run games are good.  The counter-examples are trying to disprove the latter, which wasn’t the hypothesis under test; it’s either a joke or a bad example of a logical fallacy.

   32. Danny Posted: June 27, 2005 at 03:04 PM (#1434636)

What I’m looking for is a great measure.

Then you shouldn’t be looking at single season team stats.  The best predictor would be updated individual projections for the players with guesses at playing time.

   33. shoewizard Posted: June 27, 2005 at 03:28 PM (#1434685)

I have seen talk here and at Neyer’s message board that between 2 teams equal in ‘true’ RS/RA abilities, if one has a strong bullpen they’re more likely to outperform their Pythag%. This makes intuitive sense as they’re more likely to win their close games. By the same token, teams with really poor mop-up relievers would probably also outperform their Pythag so long as they tend to pretty much only pitch in low leverage situations.

This helps explain why the D Backs have outperformed their pythag by so much. (Still about 6 wins over despite their recent regression)

Due to 4 relievers getting hurt, they were forced to try the waiver wire route, and got the following results:

Javier Lopez 14.1 IP 13 Runs allowed
Kerry Ligtenberg 9.2 IP 15 Runs allowed
Matt Herges 8 IP 12 Runs allowed

Total 32 IP 40 Runs allowed

The Backs have pitched 691.2 Innings as a team and allowed 410 runs.

These 3 guys accounted for 9.8% of the runs allowed by the team in just 4.6% of the innings.

Then you have these 3 guys from within the organization that have allowed a ton of runs in small innings pitched

Randy Choate 7 IP 7 Runs allowed
Mike Gosling 7 IP 7 Runs allowed
Edgar Gonzalez 1/3 IP 4 Runs allowed

14 1/3 IP 18 Runs allowed.  Added together with the first group of players you get

46 1/3 IP (6.7 % of total innings)

58 Runs allowed (14.1% of total runs allowed)

If these relievers had allowed say 5.5 runs per 9 IP instead of a whopping 11.26, the D backs’ Pythag report would not look nearly as bad. About 30 fewer runs allowed.

356-380 is probably a much better snapshot of the D backs than 356-410.
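The arithmetic in this post can be re-run with a small sketch. The innings and runs are taken from the post; the helper that converts baseball innings notation (14.1 meaning 14 1/3) and the use of an exponent of 2 for Pythag are assumptions added for illustration.

```python
def innings(notation):
    """Convert baseball innings notation ("14.1" = 14 1/3) to a decimal number."""
    whole, _, thirds = notation.partition(".")
    return int(whole) + (int(thirds) / 3 if thirds else 0.0)

def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win % (exponent of 2 is an assumption, not from the post)."""
    scored = runs_scored ** exponent
    return scored / (scored + runs_allowed ** exponent)

# (innings pitched, runs allowed) for the six relievers listed in the post.
relievers = [("14.1", 13), ("9.2", 15), ("8", 12), ("7", 7), ("7", 7), ("0.1", 4)]
TEAM_IP, TEAM_RS, TEAM_RA = innings("691.2"), 356, 410

ip = sum(innings(i) for i, _ in relievers)
runs = sum(r for _, r in relievers)
saved = runs - 5.5 * ip / 9   # runs saved at the post's hypothetical 5.5 per 9 IP

print(f"{ip:.1f} IP ({100 * ip / TEAM_IP:.1f}% of innings)")
print(f"{runs} runs ({100 * runs / TEAM_RA:.1f}% of runs allowed)")
print(f"runs saved at 5.5 per 9 IP: {saved:.1f}")
print(f"Pythag as is:    {pythag_pct(TEAM_RS, TEAM_RA):.3f}")
print(f"Pythag adjusted: {pythag_pct(TEAM_RS, TEAM_RA - saved):.3f}")
```

Swapping in a different replacement rate than 5.5 shows how sensitive the adjusted figure is to that one guess.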

   34. PhillyBooster Posted: June 27, 2005 at 03:54 PM (#1434749)

I guess, then, that what goes for “the team” goes for the individual members of the team.

A team of exactly 4.50 ERA players will hew closer to their Pythagorean record than a team with a few aces and a few total losers, but who nonetheless have a 4.50 team ERA.

   35. Fridas Boss Posted: June 27, 2005 at 04:48 PM (#1434895)

Walt,

Thanks for addressing my question.  My point in asking it was, if you are at point X in the season, and want to predict the team’s record from there on out, why would we assume that the RS/RA up to that point have arrived at their true talent level and will remain that way?  Aren’t there sample size issues?  In this way, Pythag will have the same problem as using actual W% going forward: what if forward isn’t likely to mimic the recent past?

   36. Danlby Posted: June 27, 2005 at 04:50 PM (#1434900)

Obvious statement #1: Given the choice between five runs and more than five runs, you’ll take more than five runs every time.

I think we can credit the White Sox for avoiding low scoring games (they’re 2nd in the AL in this “statistic” with just 22 games of 3 or fewer runs scored), but I also think the Sox are bad at generating blowouts: they’re 11th in the AL with just 14 games of more than 6 runs. Yes, I drew arbitrary lines in the sand, but I chose those limits because the consensus sweet spot is 4-6 runs.

On the larger scale, are “small ball” teams better at avoiding low-scoring results? The top five AL teams at avoiding 0-3 run games this season are: BOS (19), CHW (22), TEX (23), MIN (25), and BAL/NYY (26).

What do BOS, TEX, MIN, BAL, and NYY have in common?…
Top 7 in homers (5,1,7,2,3)
Top 7 in OBP (1,6,7,3,2)
Scattered in SB (14,12,5,7,6)
Scattered in SH (13,14,5,7,9)

So you might say that the top teams at avoiding low scoring games are a pair of “small ball teams,” with MIN and CHI trading outs for runs, and a group of slugging teams, with BOS, TEX, BAL, and NYY avoiding outs and hitting home runs. I don’t see a particular advantage for either strategy.

As to blowouts, the four slugging teams have 26, 25, 25, and 18 games with 7 or more runs, which ranks them 1st, 2nd, 2nd, and 5th in the AL. Chicago is 11th with 14, as mentioned, and Minnesota is 9th with 16. I make no claim about causation, just correlation.

An interesting comparison can be made between Chicago and Boston, since both teams are very good at scoring at least 4 runs. If you take the Chicago offense and convert 8 of the games in which they hit the sweet spot, scoring 4-6 runs, into blowouts, with 7 or more runs scored, you get the Boston Red Sox offense.

Obvious statement #2: regardless of strategy, you have to love the Boston Red Sox results, and not just because they’ve outscored everybody.

They’ve been shut out the least, scored one or less the least, scored two or less the least, scored three or less the least, scored four or less the least, scored five or less the least, scored six or less...*pause*...the second least, scored seven or less the least...Now that’s consistency.

   37. Slinger Francisco Barrios (Dr. Memory) Posted: June 27, 2005 at 05:28 PM (#1434990)

LA-LA-LA-LA-LA-LA-LA-LA, I CAN’T HEAR YOU, LA-LA-LA-LA-LA-LA-LA-LA-LA…

   38. PhillyBooster Posted: June 27, 2005 at 06:15 PM (#1435086)

On the larger scale, are “small ball” teams better at avoiding low-scoring results?

While that is an interesting question, it’s not exactly what I was thinking.  My question was more:

“Assuming a team is good, should it use more ‘small ball’ strategies?”

My working assumption now is that good teams should use more small ball, due to the likelihood that they’ll end in the “sweet spot”—at least—anyway, where one run can make a big difference. 

BUT, bad teams should avoid ‘small ball’ like the plague, because when you suck and you play for one run, you may never get another one.

   39. Steve Treder Posted: June 27, 2005 at 06:22 PM (#1435098)

My working assumption now is that good teams should use more small ball, due to the likelihood that they’ll end in the “sweet spot”—at least—anyway, where one run can make a big difference.

I’m not sure I buy it.  The assumption is that the small ball tactics will make them less likely to blow the game open, but more likely to get that one more run that will come in handy when the score ends up close.

But in reality, events aren’t isolated; they impact one another.  If I succeed in scoring 3 runs instead of 1 early in the game, that may well lead to getting the opponent’s starter out of the game, and allow me to get into the soft underbelly of his middle relievers, and I can turn what would have been a close game into a comfortable win.

My working assumption remains that small ball tactics make sense only in the late innings of ties or 1-run games.  In all other circumstances, especially early in games, angling for the biggest possible inning will yield the best aggregate results.

   40. Srul Itza Posted: June 27, 2005 at 06:25 PM (#1435103)

they’d still win 100 games if Frank Thomas continues to re-emerge.

And that raises the issue, that the Team you see on May 1 and start predicting, may not be the team you see on July 1. 

Has anyone ever done a comparison of how well Pythag estimated team performance vs. the amount of team turnover?  I.e., has anyone ever determined that, the less the turnover, the better the fit?

   41. The Politics of Torre: How the HOF Really Works Posted: June 27, 2005 at 06:29 PM (#1435106)

It certainly is a good measure of “true talent”. I think the general consensus is that Actual W-L is good, and Pythag is better, at measuring “true talent”.

What I’m looking for is a great measure. And I think diddling around with the exponent in Pythag isn’t getting us far enough. What kinds of teams can be expected to outperform Pythag somewhat consistently? Underperform consistently? How much of the difference between Pythag and Actual at the end of the season was random variation, and how much was non-random?

Don Malcolm’s attempt to refine Pythagorean Wins and Losses. I haven’t seen it discussed before and I have no idea how good it is.

   42. johnny_mostil Posted: June 27, 2005 at 08:08 PM (#1435323)

The real wild card is Garland. He went through a stretch of bad games, but he looked good against the Cubs. I don’t know if he can keep it up or not.

Garland is a very different pitcher this year, visibly as well as in the statistics.  He’s 25.  It’s quite possible he learned something.

   43. johnny_mostil Posted: June 27, 2005 at 08:19 PM (#1435355)

Can we all step back and remember why Pythag matters at all?  It matters because it (like several other useful methods) shows a correlation between winning percentage and raw run totals.  Without this correlation, most sabermetric analysis would be useless (like analyzing football stats) because, if winning didn’t follow predictably from runs scored and allowed, evaluating players by their contributions to runs scored/allowed would be meaningless.

But, this correlation is a generalization, not an immutable Law.  There will be teams and seasons for which it isn’t useful.  There are managerial decisions (like leaving Harper in to give up 9 runs the other day) that can warp it in some instances.  These warps even out over leagues and years. 

That a back-of-the-envelope estimation method might err by a handful of games for one team is trivial, not something to be treated as if the White Sox have mysteriously created a perpetual motion machine.  ONE TEAM violating Pythag means nothing.  I bet it still works fine as a generalization for about 27 of the 30 teams.

   44. johnny_mostil Posted: June 27, 2005 at 08:26 PM (#1435389)

It certainly is a good measure of “true talent”. I think the general consensus is that Actual W-L is good, and Pythag is better, at measuring “true talent”.

“True talent” is the Philosopher’s Stone of Sabermetrics.  It doesn’t exist.  There is no magic number that describes the value of a player, as in “Frank Thomas is an 8.7, but Scott Podsednik is only a 5.1”.  We can estimate productivity, probably very accurately for most players on most teams.  We can’t ever know his true value, his “Saber-Nature” if you will, because there is no such thing.  Ballplayers are not Strat-O-Matic cards.  That doesn’t mean it’s not fun or educational to try to assign these numbers to players; what it means is we shouldn’t take them too seriously because they are only estimates, no matter how good they are.  The best indicator of baseball success is winning and losing.  When your method can’t explain winning and losing, you should at least consider the possibility that it is the method that is limited, not the team you are analyzing.

   45. Harold Posted: June 27, 2005 at 08:35 PM (#1435423)

My point in asking it was, if you are at point X in the season, and want to predict the team’s record from there on out, why would we assume that the RS/RA up to that point have arrived at their true talent level and will remain that way? Aren’t there sample size issues?

Yes, of course there are.  This isn’t a weakness in the Pythagorean formula; this is a weakness in how some people use it.

In this way, Pythag will have the same problems as using actual W% to go forward, what if forward isn’t likely to mimic the recent past?

Yes.  But Pyth is better than actual W%, because the sample is larger (i.e., there are more discrete events when you look at runs rather than at games).

   46. Harold Posted: June 27, 2005 at 08:40 PM (#1435438)

Put another way, without knowing how many runs your opponent will score, the marginal value of each additional run decreases.

This is simply not true.  It’s true once you get over five runs, but it’s not true at the lower run levels.  According to Studes’s numbers in post 10, run 2 is more valuable than any other run.  (It’s worth pointing out that we shouldn’t use Studes’s numbers for any heavy lifting, as they’re gleaned from a small sample; it’d be better to look at the last few years’ data).

Put a third way—It’s the Top of the first, and the leadoff hitter just doubles. You project that playing “normally,” you will score 4 runs in innings 2-9.

I don’t think you can do that.  Sure, on average you’ll score 4 runs over those innings, but the distribution is wide enough that you can’t bank on it.

You’ve got to score the first four runs before you play small-ball for the fifth run.

   47. Danlby Posted: June 27, 2005 at 08:55 PM (#1435468)

My working assumption now is that good teams should use more small ball, due to the likelihood that they’ll end in the “sweet spot”—at least—anyway, where one run can make a big difference.

Like Steve, I don’t see it. The hyperbole given for “walk and slug” teams continues to be that they’ll score 8 or 9 runs one game and 1 or 2 the next, but the numbers belie this caricature. The Red Sox offense, which employs a system that is far from an “outs for runs” model, is clearly better than the White Sox at hitting the sweet spot or better with 4 or more runs in each game.

   48. studes Posted: June 27, 2005 at 11:13 PM (#1435775)

(It’s worth pointing out that we shouldn’t use Studes’s numbers for any heavy lifting, as they’re gleaned from a small sample; it’d be better to look at the last few years’ data).

FWIW, I’ll post five years’ worth of data tomorrow.

   49. SuperGrover Posted: June 27, 2005 at 11:21 PM (#1435779)

The Red Sox offense, which employs a system that is far from an “outs for runs” model, is clearly better than the White Sox at hitting the sweet spot or better with 4 or more runs in each game

There is no way to determine if the “system” is better or the players are better.  It’s quite possible the Red Sox would score MORE runs if they bunted and ran more.  I certainly don’t believe it, but saying the Red Sox “system” is superior is hogwash.

   50. Tango Tiger Posted: June 27, 2005 at 11:33 PM (#1435790)

I did this a few years ago, with the first version of my run distribution model.  While it’s changed slightly, the results should be fairly stable.

Anyway, given that the opposition scores an average of 5 runs per game, distributed based on my distribution, if you allow…
0 runs, you’ll win 100% of the time
1 run, 91.3% (1st run worth .087 wins)
2 runs, 81.2% (2nd run worth .101 wins)
3 runs, 68.8% (3rd run worth .124 wins)
4 runs, 55.9% (4th run worth .129 wins)
5 runs, 43.7% (5th run worth .122 wins)
6 runs, 33.1% (6th run worth .106 wins)
7th run worth .088 wins
8th run worth .069 wins
9th run worth .053 wins
10th run worth .038 wins

Clearly, the leverage is at the 3-5 run level.  If a disproportionate number of your games land at this level, your pythag will not reflect this.  Likewise, if you are involved in a disproportionate number of blowouts, pythag will not reflect that either.

(Again, the numbers above will change if you use the better distribution model.  The general point should stand.)
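For anyone who wants to check the arithmetic, here is a minimal Python sketch that recovers the per-run values from the win percentages quoted above. It only differences the seven percentages given in the post; it does not reproduce the underlying run distribution model, and the names `win_pct` and `marginal_values` are mine, not anything from the original program.

```python
# Win probability by runs allowed, copied from the list above
# (opponent averaging 5 runs per game). Index = runs allowed.
win_pct = [1.000, 0.913, 0.812, 0.688, 0.559, 0.437, 0.331]

def marginal_values(win_pct):
    """Wins lost by allowing the n-th run = drop in win probability from n-1 to n."""
    return [round(win_pct[n - 1] - win_pct[n], 3) for n in range(1, len(win_pct))]

values = marginal_values(win_pct)
for n, value in enumerate(values, start=1):
    print(f"run {n}: worth {value:.3f} wins")

# Which run carries the most leverage among those listed?
peak_run = max(range(len(values)), key=lambda i: values[i]) + 1
print("most valuable run:", peak_run)
```

With these inputs the peak falls on the 4th run, which matches the "leverage is at the 3-5 run level" reading.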

   51. Kyle S Posted: June 28, 2005 at 12:32 AM (#1435916)

*hijack* Tango, are you still looking for a new job? I saw you mention that to Treder and you cited lots of programming/database experience. My firm (Navigant Consulting, NCI on the NYSE) is always looking for more people who know SQL, STATA, and other data tools. http://www.navigantconsulting.com - check us out. If you’re interested, I can forward your resume to our hiring people. We have offices in about 50 US cities, with HQ in Chicago and other big offices in DC, NY, Frisco, and LA.

   52. studes Posted: June 28, 2005 at 06:49 AM (#1436086)

Here’s the article with data posted for the past five years.  Just FYI.

   53. Tango Tiger Posted: June 28, 2005 at 08:20 AM (#1436101)

Kyle, thanks, I’ll send you an email.

Studes, you can download my exe program (Tango Distribution), and play with the “control value” to try to match to your distribution.  This will give you a best-fit equation.

   54. PhillyBooster Posted: June 28, 2005 at 08:56 AM (#1436126)

Vinay:

I don’t think you can do that. Sure, on average you’ll score 4 runs over those innings, but the distribution is wide enough that you can’t bank on it.

But the distribution is narrow enough that you know you’ll score at least 2 more runs over 8 innings in a huge percentage of the games, and in half of them you’ll score more than four runs.  In that half, playing for one run is unnecessary not because you don’t score more, but because you will probably win anyway.  The major downside of playing for one run is that this may be a game where you only have one scoring chance, and the one run you get early will be your ONLY run.

If you look at the full distribution of projected runs, however, that first run is most likely to be run 3, 4, 5, or 6 overall, and therefore more important.

As Studes’ new numbers show, the most important runs are 2-4, followed by 5-7.

Why is the winning strategy not “maximize the times you score 2-7 runs”?  You would do that by projecting expected runs (in a range), and then only moving to “slugger mode” when you fall below that projected range.

The Red Sox, of course, may simply be a better team so will score 4+ runs no matter what—that is not a vindication of their strategy.

   55. villageidiom Posted: June 28, 2005 at 08:57 AM (#1436128)

The best indicator of baseball success is winning and losing. When your method can’t explain winning and losing, you should at least consider the possibility that it is the method that is limited, not the team you are analyzing.

That was the point of my posts. We should learn how to make our tools better, by examining cases where they don’t work. Fox (in the article), Tango (in #50), and I (in #11) suggest cases in which Pythag doesn’t work.

The key behind each of these cases is this: when teams do something that Pythag isn’t capable of handling, how can we tell if what the teams are doing is random variance vs. a non-random, repeatable skill? If it’s non-random, can we identify it as such, and use it to enhance the predictive nature of Pythag?

Can a team score runs at a suitably high mean, but with significantly lower variance, as a sustainable skill?

Can a team’s RS distribution have significant positive correlation with their RA distribution, as a sustainable skill?

(I have no answers. I’m just good at asking questions.)

   56. Chris Dial Posted: June 28, 2005 at 09:17 AM (#1436148)

Wasn’t Sean Forman’s presentation at SABR35 about this?

He ran Monte Carlo sims to determine what was the best predictor of a team’s true talent - I think it was their last 75 games.

Is that on the net somewhere?  Let me check....

   57. Need a job in baseball Posted: June 28, 2005 at 09:53 AM (#1436184)

The Pythagorean approach relies on a standard distribution.  I think that, as far as predicting a team’s true ability, the Pythagorean method is good for a general overview; however, there are a few things that should be known.  The most important is that blowouts have a negative effect on the predictive nature of the method.

For example:

Take 10 games.

Team A over those 10 games: RS = 31, RA = 22
Team B over those 10 games: RS = 31, RA = 22

Team A scores 11 runs in one game and 7 in another.  Its record over those 10 games is 3-7.

Team B scores no more than 6 runs in any one game and goes 7-3. 

However, both teams’ Pythagorean record is between 6 and 7 wins.

Obviously, this is a ridiculously small sample, but if you run this test over a full season, you’ll find the same weakness, although admittedly not quite as dramatic.  Either way, if you want to use the Pythagorean method for predicting wins and losses, then you really need to look at some of the background information.
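As a quick check on the "between 6 and 7 wins" figure, here is a small Python sketch of the Pythagorean estimate for the 31 RS / 22 RA example. It assumes the classic exponent of 2, which the post does not specify; other exponents will shift the number slightly. The function name `pythag_wins` is just illustrative.

```python
def pythag_wins(rs, ra, games, exponent=2.0):
    """Expected wins from runs scored/allowed (exponent=2 is an assumption)."""
    win_pct = rs ** exponent / (rs ** exponent + ra ** exponent)
    return win_pct * games

# Both Team A and Team B: 31 runs scored, 22 allowed, over 10 games.
expected = pythag_wins(31, 22, 10)
print(f"Pythagorean wins: {expected:.2f}")

# Gap between the estimate and each team's actual win total.
for team, actual_wins in (("Team A", 3), ("Team B", 7)):
    print(f"{team}: actual {actual_wins}, off by {actual_wins - expected:+.2f}")
```

The same estimate sits above Team A's 3 wins and slightly below Team B's 7, since the formula only sees the run totals and not how they were spread across games.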

   58. Ivan Grushenko of HK in Tokyo Posted: June 28, 2005 at 09:59 AM (#1436189)

the top teams at avoiding low scoring games are a pair of “small ball teams,” with MIN and CHI trading outs for runs, and a group of slugging teams, with BOS, TEX, BAL, and NYY avoiding outs and hitting home runs.  I don’t see a particular advantage for either strategy.

Aren’t the Twins and White Sox lineups much cheaper than those of the Yankees, Red Sox and Orioles?  Mightn’t a team on a lower budget than these three be better off saving money on their lineups and applying it to the pitching staff and increasing their run consistency via “smart ball” if that’s actually possible?  Wouldn’t the lower run environment presumably created via the better pitching staff also lower the “sweet spot”?  The Rangers don’t fit this theory but their runs may be inflated by their park.  The Orioles’ salary budget looks higher if you include Sosa’s whole salary.

   59. Dan Agonistes Posted: June 28, 2005 at 12:53 PM (#1436462)

Walt, me agrees with your comments.

I added the Pyth+Actual predictions and corrected an error in one of the tables at this post.

And you are correct in saying that the ability to win one-run games is related to team ability. I knew that and referenced BJ’s article in my original blog posting but my sentence in the THT article didn’t reflect it.

I also forgot about BPro’s adjusted standings and should have mentioned them. While the Twins may not win the Central, my point was that they’re not hopelessly out of it, especially if the Sox pitching cools off as might be expected.

   60. flakmeister Posted: June 28, 2005 at 02:56 PM (#1436726)

A standard technique used in particle physics is the truncated mean. The runs scored distribution is quite similar to a (discrete) Landau distribution, i.e. the most probable is less than the average. Typically, one discards the high end outliers. I would imagine that if one discarded the 8 highest run-differential games (5% of total) that the Pythagorean projection would improve.
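To make the idea concrete, here is a rough Python sketch of a truncated Pythagorean estimate. Several things are assumptions on my part: "highest run-differential" is read as largest absolute margin, the exponent is the classic 2, and the five-game log is invented toy data, not real results. Whether truncation actually improves the projection is exactly what would need testing on real game logs.

```python
def truncated_pythag(games, drop=8, exponent=2.0):
    """Pythagorean win% after discarding the `drop` most lopsided games.

    games: list of (runs_scored, runs_allowed) tuples, one per game.
    drop:  number of games to discard (8 is about 5% of a 162-game season).
    """
    if drop < 0 or drop >= len(games):
        raise ValueError("drop must be between 0 and len(games) - 1")
    # Sort by absolute margin and keep everything except the largest margins.
    by_margin = sorted(games, key=lambda g: abs(g[0] - g[1]))
    kept = by_margin[:len(by_margin) - drop]
    rs = sum(g[0] for g in kept)
    ra = sum(g[1] for g in kept)
    return rs ** exponent / (rs ** exponent + ra ** exponent)

# Hypothetical toy log: one 11-2 blowout win plus four close games.
toy_games = [(11, 2), (3, 4), (2, 3), (4, 5), (5, 3)]
full = truncated_pythag(toy_games, drop=0)
trimmed = truncated_pythag(toy_games, drop=1)
print(f"all games: {full:.3f}, blowout removed: {trimmed:.3f}")
```

In the toy log the team went 2-3, and dropping the single blowout moves the estimate from well above .500 to just below it.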

While I am on the subject, I have followed the emergence of the new baseball paradigm on the net from its earliest days (whatever happened to Maynard? Remember him?). One thing that is missing in many of the discussions and comparisons is what I call the “systematic” error: the error introduced by using correction factors that carry an inherent statistical error, e.g. park factors. In some toy calculations I have done, some of these errors are at the 5% level. I think that the techniques and measures that I have seen come forth over the years are very useful, but I am also seeing strong statements being made that are not warranted by the facts. For example, when projecting players, there is variance arising from the simple fluctuations in batting average (20 pts or so IIRC, a.k.a. the luck factor); corrections from park factors, league adjustments, etc. will inflate the real error on the projection to 15% or so. Here’s the problem: the difference between a crappy player and an above average player is not large compared to the true standard deviation of the projection.

Anyway, I have rambled on enough…

   61. Jim Furtado Posted: June 28, 2005 at 03:35 PM (#1436798)

On my hard drive is a quick study I did a few years ago. I looked at about six years of team run scoring/allowing standard deviations. I then tried to incorporate the results into a modified Pythagorean formula.

I never finished up the study but I do recall I was able to increase the accuracy of the pyth formula by about 10% by incorporating a linear regression of the run deviations into the formula. It wasn’t the most elegant technique and my interests led me off into other areas but the study did indicate this area of research is worth exploring, especially if the investigation includes a look at how the shape of a team’s offense/defense relates to the distributions.
