Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Tuesday, March 20, 2001

13 For His Last 24: Tomfoolery with Multiple Endpoints

Voros looks past the numbers.

In a recent column, Dayton Daily News columnist Hal McCoy argued that if Juan Castro were on any team other than the Reds, he would be a starter. Most of his argument centered on Castro's glove work, but he also contributed the following:

"He (Castro) hit .292 in his last 14 games."

This is a simple, easy-to-understand example of the multiple endpoint fallacy. The fallacy gets its name from its tendency to make one of several possible valid reference points look like the only reference point. In the above example, 14 games was not the only time frame that could have been chosen. Why not 15 games? 16? 30? Half a season? Two games?

In the first game of that 14-game stretch, Castro went 2 for 4. In the game the day before, he went 0 for 4. If McCoy had chosen the last 15 games, Castro's average would have been .269. In fact, tacking any number of additional games onto those 14 lowers his batting average for the stretch. In other words, 14 games was chosen for the specific purpose of making Castro's hitting seem as good as possible.
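The arithmetic behind that endpoint choice can be sketched in a few lines. The column gives only the averages, so the hit and at-bat totals below (14 for 48 over the 14 games) are an assumption, picked because they reproduce both the quoted .292 and the .269 figure once the earlier 0-for-4 game is added.

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats, rounded to three places as in a box score."""
    return round(hits / at_bats, 3)

# Assumed totals for the 14-game window (not given in the column);
# 14 for 48 is consistent with the quoted .292.
hits_14, at_bats_14 = 14, 48

# The game just before the window, per the article: 0 for 4.
hits_15, at_bats_15 = hits_14 + 0, at_bats_14 + 4

avg_14 = batting_average(hits_14, at_bats_14)
avg_15 = batting_average(hits_15, at_bats_15)

print(f"last 14 games: {avg_14:.3f}")
print(f"last 15 games: {avg_15:.3f}")
```

Moving the starting point back by a single game is enough to drop the average by more than 20 points, which is why the choice of window matters so much over a sample this small.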

Now, frankly, this is a minor transgression. One of the reasons you see multiple endpoint arguments so often is that, as human beings, we find them hard to avoid. Whenever we hold a belief, we tend to see all information in a manner consistent with that belief. If we believe the Red Sox are going to win the World Series this year, we will naturally tend to give more weight to information that supports that belief than to information that contradicts it.

There's a famous example of this (its fame is probably due to the ease with which it can be done in a classroom). You gather a group of 30 or so people. Then you ask each person to say his or her birthday aloud. When a birthday gets repeated, say for our purposes April 17th, the person conducting the "experiment" says, "What are the chances that two of the 30 people in this classroom would both be born on April 17th?" He has made the prototypical multiple endpoint argument. It is, in fact, fairly rare to survey a random group of 30 people and find that two have a birthday of April 17th. However, it is not at all rare (in fact, with 30 people the chances are well over 50%) to survey those people and find two with the same birthday. The point is that it didn't have to be April 17th. It could have been September 9th, August 17th or January 23rd. The April 17th date was not a necessary condition of the result, so it shouldn't be treated like one.
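The gap between the two questions can be checked directly. This is a minimal sketch, assuming 365 equally likely birthdays and no leap day; the function names are just for illustration.

```python
from math import prod

DAYS = 365  # assumption: no leap day, birthdays uniformly distributed

def p_any_shared_birthday(n):
    """Chance that at least two of n people share some birthday."""
    p_all_distinct = prod((DAYS - i) / DAYS for i in range(n))
    return 1 - p_all_distinct

def p_two_on_named_date(n):
    """Chance that at least two of n people were born on one date named in advance."""
    p_day = 1 / DAYS
    p_nobody = (1 - p_day) ** n
    p_exactly_one = n * p_day * (1 - p_day) ** (n - 1)
    return 1 - p_nobody - p_exactly_one

p_any = p_any_shared_birthday(30)
p_named = p_two_on_named_date(30)

print(f"two people share any date:    {p_any:.1%}")
print(f"two people share a named date: {p_named:.2%}")
```

With 30 people the first probability is roughly 70%, while the second is well under 1%. The classroom trick works by asking the second question after observing an outcome that only answers the first.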

Here's an absurd and humorous example from a reader of the Skeptic's Dictionary, who is relating a "psychic experience" he had about his parents' new cat: http://www.skepdic.com/comments/psi4com.html

"...I got a mental image of a furry white cat, perhaps a Persian, with piercing blue eyes and the kind of a 'wide' face that many long-haired cats seem to have...

"This morning, I got those cat pictures from my brother. It turns out that the cat is grayish brown, very short-haired and delicately built with a slim, angular face (a bit like a Siamese), and yellow eyes.

"Now, my question is this: how do you account for the fact that even though many, if not most, cats are multi-colored, I knew straight away that this particular cat was all one color?"

So far I've only shown examples that are either not very important or completely silly. But while multiple endpoint arguments are often simple to construct, they are often hard to identify, and they are used in much more serious and important matters. Here's a quote from an AP report last November concerning Bud Selig's testimony before a Senate committee: "Selig cited statistics that only three of 189 postseason games since the 1994-95 strike were won by teams that didn't have payrolls among the top half."

That seems like hard evidence that there is a competitive balance problem. But think about all of the possible ways you could measure the relationship between competitiveness and payroll. Selig cites postseason games. Why postseason games? Why not regular season games? Why not only games where a low payroll team faced a high payroll team? Selig cites the period from 1995 until now. The strike seems like a logical marking point, but what exactly changed from 1993 to 1995 to suddenly give the high payroll teams a big advantage? Couldn't you make the argument that it's used simply because the relatively low payroll Phillies racked up a large number of wins in the 1993 postseason? And why exactly is it the top half? The Seattle Mariners won five games in the 2000 postseason despite being 14th out of 30 teams in opening day payroll (just a hair above the "half" line). And how is payroll calculated, anyway? Do they use opening day payroll, the total amount of money paid out, or just the annual salaries of the players on the team as the season ends? And is payroll the right measurement to use? Isn't the issue really maximum possible payroll? After all, it would speak very poorly of GMs if there wasn't a real connection between payroll and winning, even under a completely balanced system. Why not use a "market size" measurement? How about using the city's population?

The point is not that Selig is necessarily wrong, just that the measurement he used is loaded in his favor. If he gets to choose the variables, time period, method of evaluation and relevant data, naturally the side he favors is going to have the most favorable "endpoints" used in the presentation. As Rod Fort testified, "...let's remember that focusing on a particular point in time without any historical reference can make a problem appear larger than it really is. This is doubly true if the period of time under scrutiny is the end of a cycle."

And that is the problem with multiple endpoints. Such an argument uses a period of time, or a section of data, that doesn't necessarily represent the whole of the relevant time and data. It isn't that the people who use them are being dishonest or are necessarily "wrong," only that the statement seems to mean much more than it actually does.

So for every graph you see or every stat you hear, imagine what it would look like if you extended the time period or the qualifications just a little bit more. You might be surprised.

Voros McCracken Posted: March 20, 2001 at 05:00 AM | 2 comment(s)

Reader Comments and Retorts


   1. Voros McCracken Posted: March 21, 2001 at 12:00 AM (#603461)
I actually posted a few graphs of Win% plotted against Payroll for 2000.

There are two:

http://www.baseballstuff.com/mccracken/2000payr.gif

Is winning percentage as a function of Opening Day Payroll. The Payrolls
used for this discussion must be opening day ones, or else you can
immediately throw out a certain amount of causation, because you'll
have effects preceding their causes (a no-no). And I have:

http://www.baseballstuff.com/mccracken/2000rank.gif

Which is a graph of Win% Rank as a function of payroll rank.

Now these are just the 2000 numbers and I've looked and the correlation is much
worse than it was in 1999. So you have to be careful about looking at only 2000
here.

Still, it is a decent argument against the claim that the problem is
getting worse.
   2. Voros McCracken Posted: March 29, 2001 at 12:01 AM (#603529)
A similar experiment in multiple endpoints makes an excellent bar bet
for those so inclined. :)

Each person gets a deck of cards. You bet the victim that if you both
flip over cards one at a time, that at some point before you finish
the deck you'll both flip over the exact same card at the same time.

Intuitively this sounds unlikely, but this is why Multiple Endpoints is
such an often used and deadly argument.

The chances of that happening (turning over the same card at some point
in the deck) are actually about 63%, so after doing this ten times or so
you should wind up well ahead of the game.

Of course gambling is illegal and gambling against someone who has been
drinking is unethical. This is just a hypothetical example. :)
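
The odds on the card bet can be worked out exactly. This is a sketch assuming two standard 52-card decks, each shuffled thoroughly and independently; it uses the inclusion-exclusion count of arrangements with no matching position, and the function name is only illustrative.

```python
from fractions import Fraction
from math import factorial

def p_at_least_one_match(deck_size=52):
    """Exact chance that two shuffled decks show the same card
    at the same position at least once."""
    # Inclusion-exclusion: P(no match) = sum over k of (-1)^k / k!
    p_no_match = sum(
        Fraction((-1) ** k, factorial(k)) for k in range(deck_size + 1)
    )
    return 1 - p_no_match

p_match = float(p_at_least_one_match())
print(f"chance of at least one match: {p_match:.1%}")
```

The result is very close to 1 - 1/e, a bit over 63%, and it barely changes with deck size once the deck has more than a handful of cards. As with the birthdays, the bet pays off because any of the 52 positions will do, not one named in advance.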
