13 For His Last 24: Tomfoolery with Multiple Endpoints
Voros looks past the numbers.
In a recent column, Dayton Daily News columnist Hal McCoy
argued that if Juan Castro were on any team other than the Reds, he would be
a starter. Most of his argument centered on Castro's glove work, but he also
contributed the following:
"He (Castro) hit .292 in his last 14 games."
This is a simple, easy-to-understand example
of the multiple endpoint fallacy. The fallacy gets its name
from its tendency to make one of several possible valid reference points look
like the only reference point. In the above example, 14 games
was not the only time frame that could have been chosen. Why not 15
games? 16? 30? Half a season? Two games?
In the first game of that 14-game stretch,
Castro went 2 for 4. In the game the day before, Castro went 0 for 4. If McCoy
had chosen the last 15 games, the average would have been .269. In fact, tacking
any number of games onto those 14 lowers Castro's batting average for the
stretch. In other words, 14 games was chosen for the specific purpose of making
Castro's hitting seem as good as possible.
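The arithmetic behind that paragraph can be checked with a short sketch. The column gives only the averages, so the totals used here (14 hits in 48 at-bats over the 14 games) are an inference: they are a set of whole numbers consistent with both the quoted .292 and the .269 that results from adding the 0-for-4 game. Treat them as an assumption, not as Castro's official line.

```python
# Assumed totals for the 14-game stretch: NOT stated in the column,
# inferred because 14/48 rounds to .292 and 14/52 rounds to .269.
HITS_14, AT_BATS_14 = 14, 48


def batting_average(hits, at_bats):
    """Hits divided by at-bats, rounded to three places."""
    return round(hits / at_bats, 3)


# The endpoint McCoy used: the last 14 games.
avg_14 = batting_average(HITS_14, AT_BATS_14)

# Move the endpoint back one game: add the 0-for-4 day.
avg_15 = batting_average(HITS_14 + 0, AT_BATS_14 + 4)

# Move the endpoint forward one game: drop the 2-for-4 opener.
avg_13 = batting_average(HITS_14 - 2, AT_BATS_14 - 4)

for games, avg in [(13, avg_13), (14, avg_14), (15, avg_15)]:
    print(f"last {games} games: {avg:.3f}")
```

Under those assumed totals, the 14-game window comes out at .292, with .273 and .269 on either side of it. The window that starts on the 2-for-4 game and stops just short of the 0-for-4 game is the most flattering of the three.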
Now, frankly, this is a minor transgression.
One of the reasons you see multiple endpoint arguments so often is that,
as human beings, we all find them hard to avoid. Whenever we hold a
belief, we tend to see all information in a manner consistent with that belief.
If we believe the Red Sox are going to win the World Series this year, we will
naturally tend to give more weight to information that supports that belief
than to information that contradicts it.
There's a famous example of this (its fame
is probably due to the ease with which it can be done in a classroom). You gather
a group of 30 or so people. Then you ask each person to say his or her birthday
aloud. When a birthday gets repeated, say for our purposes April 17th,
the person conducting the "experiment" says, "What are the chances that two
of the 30 people in this classroom would both be born on April 17th?"
He has made the prototypical multiple endpoint argument. It is, in fact, fairly
rare to survey a random group of 30 people and find that two have a birthday
of April 17th. However, it is not at all rare (in fact, with 30 people
the chances are about 70%) to survey those people and find two people with
the same birthday. The point is that it didn't have to be April 17th.
It could have been September 9th, August 17th or January
23rd. The April 17th date was not a necessary condition
of the result, so it shouldn't be treated like one.
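The gap between the two questions can be put into numbers. This is a minimal sketch under the usual textbook assumptions, which are not from the column: 365 equally likely birthdays, no leap years, and independent birthdays. The function names are illustrative only.

```python
from math import prod

DAYS = 365   # assumption: ignores leap years, all dates equally likely
PEOPLE = 30


def p_any_shared(n, days=DAYS):
    """Chance that at least two of n people share some birthday."""
    p_all_different = prod((days - i) / days for i in range(n))
    return 1 - p_all_different


def p_specific_shared(n, days=DAYS):
    """Chance that at least two of n people were born on one named date."""
    p_day = 1 / days
    p_nobody = (1 - p_day) ** n
    p_exactly_one = n * p_day * (1 - p_day) ** (n - 1)
    return 1 - p_nobody - p_exactly_one


p_any = p_any_shared(PEOPLE)
p_april_17 = p_specific_shared(PEOPLE)
print(f"some shared birthday: {p_any:.1%}")
print(f"two or more born on April 17th: {p_april_17:.1%}")
```

Under these assumptions, a match on some date comes out at roughly 70%, while a match on one date named in advance is roughly 0.3%. The question asked in the classroom quotes the second kind of event after observing the first.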
Here's an absurd and humorous example from a reader of the Skeptic's Dictionary,
who is relating a "psychic experience" he had about his parents' new cat: http://www.skepdic.com/comments/psi4com.html
"...I got a mental image of a furry white cat, perhaps a Persian, with
piercing blue eyes and the kind of a 'wide' face that many long-haired cats
seem to have...
"This morning, I got those cat pictures from my brother. It turns out that
the cat is grayish brown, very short-haired and delicately built with a slim,
angular face (a bit like a Siamese), and yellow eyes.
"Now, my question is this: how do you account for the fact that even though
many, if not most, cats are multi-colored, I knew straight away that this particular
cat was all one color?"
So far I've only shown examples that are either
not very important or completely silly. But while multiple endpoint arguments
are often simple to construct, they are often hard to identify and are often
used in much more serious and important matters. Here's a quote from an AP report
last November concerning Bud Selig's testimony before a Senate committee: "Selig
cited statistics that only three of 189 postseason games since the 1994-95 strike
were won by teams that didn't have payrolls among the top half."
That seems like hard evidence that there is a competitive balance problem.
But think about all of the possible ways you could measure the relationship
between competitiveness and payroll. Selig cites postseason games. Why postseason
games? Why not regular season games? Why not only games where a low payroll
team faced a high payroll team? Selig cites the period from 1995 until now.
The strike seems like a logical marking point, but what exactly changed from
1993 to 1995 to suddenly give the high payroll teams a big advantage? Couldn't
you make the argument that 1995 is used simply because the relatively low payroll
Phillies racked up a large number of wins in the 1993 postseason? And why exactly
is it the top half? The Seattle Mariners won five games in the 2000 postseason
despite being 14th out of 30 teams in opening day payroll (just a
hair above the "half" line). And how is payroll calculated, anyway? Do they
use opening day payroll, the total amount of money paid out, or just the annual
salaries of the players on the team as the season ends? And is payroll the right
measurement to use? Isn't the issue really maximum possible payroll? After all,
it would speak very poorly of GMs if there weren't a real connection between
payroll and winning, even under a completely balanced system. Why not use a
"market size" measurement? How about using the city's population?
The point is not that Selig is necessarily wrong, just that the measurement
he used is loaded in his favor. If he gets to choose the variables, time period,
method of evaluation and relevant data, naturally the side he favors is going
to have the most favorable "endpoints" used in the presentation. As Rod Fort
testified, "...let's remember that focusing on a particular point in time without
any historical reference can make a problem appear larger than it really is.
This is doubly true if the period of time under scrutiny is the end of a cycle."
And that is the problem with multiple endpoints.
Such an argument uses a period of time, or a section of data, that doesn't necessarily represent
the whole of the relevant periods of time and data. It isn't that the people
who use them are being dishonest or are necessarily "wrong," only that the statement
seems to mean much more than it actually does.
So for every graph you see or every stat you
hear, imagine what they would look like if you extended the time period or the
qualifications just a little bit more. You might be surprised.
Posted: March 20, 2001 at 05:00 AM