— Where BTF's Members Investigate the Grand Old Game
Tuesday, March 20, 2001
13 For His Last 24: Tomfoolery with Multiple Endpoints
Voros looks past the numbers.
In a recent column, Dayton Daily News columnist Hal McCoy argued that if Juan Castro were on any team other than the Reds, he would be a starter. Most of his argument centered on Castro's glove-work, but he also contributed the following:
"He (Castro) hit .292 in his last 14 games."
This is a simple, easy-to-understand example of the multiple endpoint fallacy. The fallacy gets its name from its tendency to make one of several possible valid reference points look like the only reference point. In the above example, 14 games was not the only time frame that could have been chosen. Why not 15 games? 16? 30? Half a season? Two games?
In the first game of that 14-game stretch, Castro went 2 for 4. In the game the day before, Castro went 0 for 4. If McCoy had chosen the last 15 games, Castro's average would have been .269. In fact, tacking any number of earlier games onto those 14 lowers his batting average for the stretch. In other words, 14 games was chosen for the specific purpose of making Castro's hitting seem as good as possible.
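The endpoint shopping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the game log is invented (it is not Castro's actual line), and `trailing_averages` and `best_window` are hypothetical helper names. The sketch computes the batting average over the last N games for every N and then picks the most flattering window.

```python
def trailing_averages(game_log):
    """For each N, return (N, hits, at_bats, average) over the last N games.

    game_log is a list of (hits, at_bats) tuples, oldest game first.
    """
    rows = []
    hits = at_bats = 0
    # Walk backward from the most recent game, growing the window one game at a time.
    for n, (h, ab) in enumerate(reversed(game_log), start=1):
        hits += h
        at_bats += ab
        if at_bats:  # skip windows with no at-bats to avoid dividing by zero
            rows.append((n, hits, at_bats, hits / at_bats))
    return rows


def best_window(game_log, min_games=1):
    """Return the trailing window of at least min_games with the highest average."""
    candidates = [row for row in trailing_averages(game_log) if row[0] >= min_games]
    return max(candidates, key=lambda row: row[3])


# Hypothetical (hits, at_bats) log, oldest game first. Invented for illustration.
GAME_LOG = [(0, 4), (1, 4), (0, 3), (0, 4), (2, 4),
            (1, 3), (2, 4), (0, 3), (1, 4), (2, 3)]

if __name__ == "__main__":
    for n, h, ab, avg in trailing_averages(GAME_LOG):
        print(f"last {n:2d} games: {h} for {ab} = {avg:.3f}")
    n, h, ab, avg = best_window(GAME_LOG, min_games=5)
    print(f"most flattering window of 5+ games: last {n} ({h} for {ab}, {avg:.3f})")
```

With this made-up log, the most flattering window of at least five games is the last six (8 for 21, or .381), which happens to begin with a 2-for-4 game that follows an 0-for-4. Over all ten games the same hitter is 9 for 36, or .250. Nothing about the player changed between the two numbers; only the endpoint did.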
Now, frankly, this is a minor transgression. One of the reasons you see multiple endpoint arguments so often is that, as human beings, we all find them hard to avoid. Whenever we hold a belief, we tend to see all information in a manner consistent with that belief. If we believe the Red Sox are going to win the World Series this year, we will naturally tend to give more weight to information that supports that belief than to information that contradicts it.
There's a famous example of this (its fame is probably due to the ease with which it can be done in a classroom). You gather a group of 30 or so people. Then you ask each person to say his or her birthday aloud. When a birthday gets repeated, say for our purposes April 17th, the person conducting the "experiment" says, "What are the chances that two of the 30 people in this classroom would both be born on April 17th?" He has made the prototypical multiple endpoint argument. It is, in fact, fairly rare to survey a random group of 30 people and find that two have a birthday of April 17th. However, it is not at all rare (in fact, with 30 people the chances are well over 50%) to survey those people and find two with the same birthday. The idea is that it didn't necessarily have to be April 17th. It could have been September 9th, August 17th or January 23rd. The April 17th date was not a necessary condition of the result, so it shouldn't be treated like one.
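The gap between the two birthday questions can be checked directly. The sketch below assumes 365 equally likely birthdays and independent people (a simplification that ignores leap years and seasonal birth patterns); the function names are illustrative only.

```python
def p_any_shared_birthday(n, days=365):
    """Probability that at least two of n people share some birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def p_shared_on_named_day(n, days=365):
    """Probability that at least two of n people were born on one date named in advance."""
    p = 1.0 / days
    p_nobody = (1 - p) ** n                      # no one born on that date
    p_exactly_one = n * p * (1 - p) ** (n - 1)   # exactly one person born on that date
    return 1.0 - p_nobody - p_exactly_one


if __name__ == "__main__":
    print(f"any shared birthday among 30:   {p_any_shared_birthday(30):.3f}")
    print(f"two or more on a named date:    {p_shared_on_named_day(30):.4f}")
```

Under these assumptions the first probability comes out at roughly 0.71 and the second at roughly 0.003, a difference of more than two orders of magnitude. The classroom trick works by asking the second question after observing the first event.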
Here's an absurd and humorous example from a reader of the Skeptic's Dictionary, who is relating a "psychic experience" he had about his parents' new cat: http://www.skepdic.com/comments/psi4com.html
"...I got a mental image of a furry white cat, perhaps a Persian, with piercing blue eyes and the kind of a 'wide' face that many long-haired cats seem to have...
"This morning, I got those cat pictures from my brother. It turns out that the cat is grayish brown, very short-haired and delicately built with a slim, angular face (a bit like a Siamese), and yellow eyes.
"Now, my question is this: how do you account for the fact that even though many, if not most, cats are multi-colored, I knew straight away that this particular cat was all one color?"
So far I've shown only examples that are either not very important or completely silly. But while multiple endpoint arguments are often simple to construct, they are often hard to identify, and they are often used in much more serious and important matters. Here's a quote from an AP report last November concerning Bud Selig's testimony before a Senate committee: "Selig cited statistics that only three of 189 postseason games since the 1994-95 strike were won by teams that didn't have payrolls among the top half."
That seems like hard evidence that there is a competitive balance problem. But think about all of the possible ways you could measure the relationship between competitiveness and payroll. Selig cites postseason games. Why postseason games? Why not regular season games? Why not only games where a low payroll team faced a high payroll team? Selig cites the period from 1995 until now. The strike seems like a logical marking point, but what exactly changed from 1993 to 1995 to suddenly give the high payroll teams a big advantage? Couldn't you make the argument that it's used simply because the relatively low payroll Phillies racked up a large number of wins in the 1993 postseason? And why exactly is it the top half? The Seattle Mariners won five games in the 2000 postseason despite being 14th out of 30 teams in opening day payroll (just a hair above the "half" line). And how is payroll calculated, anyway? Do they use opening day payroll, the total amount of money paid out, or just the annual salaries of the players on the team as the season ends? And is payroll the right measurement to use? Isn't the issue really maximum possible payroll? After all, it would speak very poorly of GMs if there weren't a real connection between payroll and winning, even under a completely balanced system. Why not use a "market size" measurement? How about using the city's population?
The point is not that Selig is necessarily wrong, just that the measurement he used is loaded in his favor. If he gets to choose the variables, time period, method of evaluation and relevant data, naturally the side he favors is going to have the most favorable "endpoints" used in the presentation. As Rod Fort testified, "...let's remember that focusing on a particular point in time without any historical reference can make a problem appear larger than it really is. This is doubly true if the period of time under scrutiny is the end of a cycle."
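To get a rough sense of how much room those choices leave, the alternatives raised in the payroll questions can simply be counted. The sketch below is only a tally of the options named in this column, not an analysis of any actual payroll or game data, and the dictionary keys and function name are made up for illustration.

```python
from itertools import product

# Alternatives taken from the questions raised about Selig's statistic.
CHOICES = {
    "games": [
        "postseason games",
        "regular season games",
        "low-payroll vs. high-payroll matchups only",
    ],
    "start_year": ["1993", "1995"],
    "yardstick": [
        "opening day payroll",
        "total money paid out",
        "end-of-season salaries",
        "maximum possible payroll",
        "market size",
        "city population",
    ],
}


def all_specifications(choices):
    """Return every combination of one option per question, as a list of dicts."""
    keys = list(choices)
    return [dict(zip(keys, combo))
            for combo in product(*(choices[k] for k in keys))]


SPECS = all_specifications(CHOICES)

if __name__ == "__main__":
    print(f"{len(SPECS)} distinct ways to frame the same question")
    print(SPECS[0])
```

Even this short list yields 36 different framings, before anyone asks where to draw the "top half" line. A presenter who is free to pick among them can report whichever one looks most dramatic.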
And that is the problem with multiple endpoints. Such an argument uses a period of time, or a section of data, that doesn't necessarily represent the whole of the relevant periods of time and data. It isn't that the people who use them are being dishonest or are necessarily "wrong," only that the statement seems to mean much more than it actually does.
So for every graph you see or every stat you hear, imagine what they would look like if you extended the time period or the qualifications just a little bit more. You might be surprised.