Primate Studies— Where BTF's Members Investigate the Grand Old Game
Monday, August 19, 2002
Makin' Money
Estimating teams' revenue using a few simple numbers.
These days everybody seems to be talking about the money end of baseball:
how much the players make, how much the owners make, how they should split things
up, etc. So I thought it would be interesting to see if there was a relatively
simple way to estimate how much an MLB team could expect to make given a few
variables (like the size of its market or how much it wins).
Now I'd like to claim that
this was an idea I produced from whole cloth, but it isn't. Ron Johnson was
doing this sort of thing a few years back, and when last year's numbers came out
from MLB, I decided it might be neat to try to redo from scratch what he had
done. I back-burnered it for a while, and after some math work, I came up with
something decent enough to put out there for public consumption.
WARNING: The following piece contains numbers, mathematics and other
things some consider hazardous to their health. Proceed with caution.
One slight difference between my system and Ron's (if I
remember his correctly) is that I'm not trying to estimate a particular team's
revenue to the most exact dollar possible, but rather to have a system that
uses overriding general principles that apply to all teams.
After doing a lot of linear
regression and other math-related work, the following are the variables the
system uses: the population of the team's metropolitan area (as defined by the 2000
U.S. and Canadian census bureaus), the number of teams in that metropolitan area,
the per capita income of that metropolitan area (a must-have figure that Ron also
included in his system), the team's winning percentage the previous year and in the
four years before that, the number of home playoff games played
this year, and a yes-or-no variable for whether the team's home park opened
less than two years ago. The system is essentially a simple linear regression.
It needs to stay fairly simple so we can get a clear view of how each
of these variables works, since we're interested in more than just the final
number.
The first thing I did to make
the system more usable was to combine all of those winning percentages into a
single number. This doesn't change the results at all, and it matters because
a single number is much easier to work with when determining various figures (e.g.
how much more revenue a single win produces). Anyway, with the 2001 revenues as
our example year, the single-number winning percentage is calculated as:
W% = (2000 W% * .25) + (1999 W% * .225) + (1998 W% * .2) + (1997 W% * .175) + (1996 W% * .15)
You will notice that the 2001
winning percentage is not listed. According to the numbers I have, it seems that
for the current year, only the number of home playoff games the team plays makes
a significant impact on revenue. However, the 2001 winning percentage would
affect the revenue estimates for the years following, so the effect isn't gone,
just delayed. This both makes sense and follows what most people
have said on the subject: that when a team sees sudden improvement, the biggest
gains in attendance come the following season.
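To make the weighting concrete, here is a minimal Python sketch of the blended winning percentage. The function name and the sample team are made up for illustration; only the five weights come from the formula above.

```python
# Weights from the article's formula, most recent prior season first.
# For 2001 revenue these apply to 2000, 1999, 1998, 1997 and 1996.
WEIGHTS = (0.25, 0.225, 0.2, 0.175, 0.15)


def blended_win_pct(prior_win_pcts):
    """Blend five prior seasons' winning percentages into one number.

    prior_win_pcts must be ordered most recent season first.
    """
    if len(prior_win_pcts) != len(WEIGHTS):
        raise ValueError("expected exactly five prior seasons")
    return sum(w * p for w, p in zip(WEIGHTS, prior_win_pcts))


# Hypothetical team (not real data): .600 last year, .500 the four years before.
example = blended_win_pct([0.600, 0.500, 0.500, 0.500, 0.500])
print(round(example, 4))  # 0.525
```

Since the weights sum to 1, a team that plays .500 ball every year blends to exactly .500, and the most recent season moves the number the most.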
Once we have that, we can now
use this formula for the revenue estimate:
Team Revenue = (W% * $430,169,580) + (Metro Population * $3.46) - (Teams in Metro Area * $27,962,685)
+ (Home Playoff Games * $2,446,043) + (Per Capita Income of Area * $2,655.60)
- $160,287,379 + ($22,906,159 if the team's stadium is less than two years old).
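For anyone who wants to plug in numbers, the formula translates directly into a short Python function. This is only a sketch: the coefficients are the ones quoted above, while the function name, parameter names and the sample inputs are invented for illustration and are not real team data.

```python
def estimate_revenue(win_pct, metro_population, teams_in_metro,
                     home_playoff_games, per_capita_income,
                     new_stadium=False):
    """Estimate a team's annual revenue in dollars from the article's formula.

    win_pct is the blended five-year winning percentage (e.g. 0.525).
    new_stadium is True if the home park is less than two years old.
    """
    revenue = (
        win_pct * 430_169_580
        + metro_population * 3.46
        - teams_in_metro * 27_962_685
        + home_playoff_games * 2_446_043
        + per_capita_income * 2_655.60
        - 160_287_379
    )
    if new_stadium:
        revenue += 22_906_159
    return revenue


# Hypothetical one-team market: 3 million people, $30,000 per capita income,
# a .500 blended record, no playoff games, older ballpark.
print(f"${estimate_revenue(0.500, 3_000_000, 1, 0, 30_000):,.0f}")
```

Keeping every term as a separate line makes it easy to see what each variable contributes on its own, which is the point of using a plain linear model.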
The first interesting thing
to note is that if I were to father a child tomorrow, this would mean that the Diamondbacks
could expect to earn an extra $3.46 next year.
The correlation coefficient
between these revenues and those estimated by Forbes the last three years is
.89 with an adjusted r-squared (for all the math geeks out there) of .78. The
standard error of the estimate is about $18.3 million.
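As a quick sanity check on how the two figures relate, the sketch below applies the standard adjusted r-squared formula. Two inputs are assumptions rather than numbers stated in this paragraph: 90 observations (30 teams over three seasons) and six predictors (blended W%, metro population, teams in the metro area, home playoff games, per capita income and the new-stadium flag).

```python
def adjusted_r_squared(r, n_obs, n_predictors):
    """Adjusted r-squared from a correlation coefficient r.

    Standard formula: 1 - (1 - r^2) * (n - 1) / (n - p - 1).
    """
    r_squared = r * r
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Assumed sample: 90 team-seasons and 6 predictors (see lead-in).
adj = adjusted_r_squared(0.89, n_obs=90, n_predictors=6)
print(round(adj, 2))  # 0.78
```

Under those assumptions the quoted .89 and .78 are consistent with each other.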
I know you're saying, "So
what? What does this accomplish?" Well, a couple of things. First off, we can
use this to estimate how much money an extra win generates for a baseball
franchise. In a simple fashion, we can take 1 divided by 162 and multiply that
by $430,169,580 and get about $2.655 million per additional win. Now before we
settle on this number, there are a bunch of questions to ask:
- Does the amount gained per win hold constant across teams regardless of
market size?
- Since the win percentage is an aggregate of several seasons, how do we
figure what a single win counts for?
- What would this number be without revenue sharing?
The first question is the big
one. I tried a series of different functions to see if the amount of
revenue per win varied across markets (i.e., is a win more valuable in New York than in Kansas City?).
I tried as hard as I could, but I kept coming back to the same conclusion: if
there is any change in the value of a win across markets, it's that a win is
worth more in the smaller market, but by so little that it is irrelevant
and also not statistically significant. In layman's terms, using the same
revenue-per-win figure as a constant across all teams will give you as good an
estimate as any other method I could come up with.
The second question is easier
to handle. Though a win this year only counts toward a quarter of the W% figure
for next year, it also counts toward the following year's figure, and so on. Still, it's money
that, if compared to players' salaries, is spent now and paid back later,
meaning its value is diminished a little. This reduction really is minimal, and it is
likely at least offset by the fact that generating a win now might have some
effect on whether you host playoff games.
Without the current revenue
sharing system, it looks as if the revenue-per-win number would be up around
$3.5 million. It's a good number to know because without revenue sharing, a
player worth 5 extra wins would be worth $17.5 million, and with the
current revenue sharing he's only worth around $13.3 million.
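The per-win arithmetic is simple enough to check in a few lines of Python. The W% coefficient and the rough $3.5 million no-sharing figure are the article's numbers; the function and constant names are just labels chosen for this sketch.

```python
WIN_PCT_COEFFICIENT = 430_169_580   # dollars per 1.000 of blended W%
GAMES_PER_SEASON = 162

# One extra win moves winning percentage by 1/162.
REVENUE_PER_WIN = WIN_PCT_COEFFICIENT / GAMES_PER_SEASON

# The article's rough estimate of the per-win figure without revenue sharing.
REVENUE_PER_WIN_NO_SHARING = 3_500_000


def player_value(extra_wins, revenue_per_win):
    """Revenue attributable to a player who adds extra_wins wins."""
    return extra_wins * revenue_per_win


print(f"{REVENUE_PER_WIN / 1e6:.3f}")                                # 2.655
print(f"{player_value(5, REVENUE_PER_WIN) / 1e6:.1f}")               # 13.3
print(f"{player_value(5, REVENUE_PER_WIN_NO_SHARING) / 1e6:.1f}")    # 17.5
```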
Another interesting use for
the formula would be to figure out a "base" revenue rate. What I mean by that
is the estimated revenue for each team based only on those
factors the team has absolutely no control over. For this system that would be
the size of the team's metropolitan area, the per capita income in the team's
area and how many teams are currently in the area. By doing this, we can
estimate what each team's revenues would be if they were all exactly equal
in terms of producing on the field, marketing, new stadiums, and other areas in
which teams can directly affect how much they rake in. A standard complaint
from the stat-head community is that a lot of teams don't do nearly enough to
generate their own revenue and instead rely on handouts from everybody else.
This method would tell us how much each team would make if they all made
exactly the same efforts in this regard.
Below is a chart of all the Major League teams and some revenue numbers and
estimates for the 2001 season. The actual revenue numbers are from
Forbes' yearly estimates of Major League Baseball revenues. Those numbers are
in the first column. The next column is the revenue that the above formula estimated
the team would bring in based on all of the factors listed. The third column
lists the team's "base revenue," which I explained above as the estimated
revenue if each team performed equally on the field. The final column lists
the team's actual 2001 revenue minus its "base revenue." This
works as a measure of how well the team is currently drawing from its market
(a concept Derek Zumsteg wrote about in an interesting
Baseball Prospectus piece the other day). The table:
| Team | 2001 Actual Revenue | 2001 Estimated Revenue | Base Revenue | Difference: Actual - Base |
|---|---:|---:|---:|---:|
| Anaheim | $102,610,000 | $113,379,000 | $126,521,000 | ($23,911,000) |
| Arizona | $127,240,000 | $125,886,000 | $106,092,000 | $21,148,000 |
| Atlanta | $160,020,000 | $174,156,000 | $119,326,000 | $40,694,000 |
| Baltimore | $132,720,000 | $139,713,000 | $139,277,000 | ($6,557,000) |
| Boston | $152,140,000 | $142,988,000 | $133,525,000 | $18,615,000 |
| Chicago (AL) | $101,330,000 | $113,875,000 | $113,991,000 | ($12,661,000) |
| Chicago (NL) | $130,550,000 | $87,048,000 | $113,991,000 | $16,559,000 |
| Cincinnati | $86,630,000 | $104,800,000 | $105,740,000 | ($19,110,000) |
| Cleveland | $151,150,000 | $141,371,000 | $112,066,000 | $39,084,000 |
| Colorado | $129,420,000 | $107,591,000 | $117,778,000 | $11,642,000 |
| Detroit | $114,720,000 | $108,964,000 | $123,632,000 | ($8,912,000) |
| Florida | $80,660,000 | $86,014,000 | $110,810,000 | ($30,150,000) |
| Houston | $125,910,000 | $158,289,000 | $119,434,000 | $6,476,000 |
| Kansas City | $84,600,000 | $76,247,000 | $107,821,000 | ($23,221,000) |
| Los Angeles | $142,630,000 | $130,674,000 | $126,521,000 | $16,109,000 |
| Milwaukee | $108,260,000 | $113,771,000 | $110,109,000 | ($1,849,000) |
| Minnesota | $74,640,000 | $85,049,000 | $120,840,000 | ($46,200,000) |
| New York (AL) | $214,800,000 | $226,432,000 | $166,521,000 | $48,279,000 |
| New York (NL) | $169,030,000 | $177,792,000 | $166,521,000 | $2,509,000 |
| Oakland | $92,740,000 | $110,540,000 | $116,596,000 | ($23,856,000) |
| Philadelphia | $94,270,000 | $95,530,000 | $129,615,000 | ($35,345,000) |
| Pittsburgh | $107,630,000 | $105,844,000 | $108,268,000 | ($638,000) |
| San Diego | $93,090,000 | $106,354,000 | $106,981,000 | ($13,891,000) |
| San Francisco | $142,020,000 | $146,552,000 | $116,596,000 | $25,424,000 |
| Seattle | $162,040,000 | $158,357,000 | $120,130,000 | $41,910,000 |
| St. Louis | $123,300,000 | $115,693,000 | $111,712,000 | $11,588,000 |
| Tampa Bay | $91,590,000 | $76,730,000 | $104,746,000 | ($13,156,000) |
| Texas | $134,300,000 | $124,831,000 | $122,272,000 | $12,028,000 |
| Montreal | $63,260,000 | $70,324,000 | $96,974,000 | ($33,714,000) |
| Toronto | $91,120,000 | $104,175,000 | $110,113,000 | ($18,993,000) |
| Average | $119,480,667 | $120,965,633 | $119,483,967 | ($3,300) |
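The last column is just actual revenue minus base revenue, so it is easy to recompute and rank. The Python sketch below does that for four teams whose figures are copied from the table; the function name and the choice of teams are arbitrary.

```python
# (actual 2001 revenue, base revenue) in dollars, copied from the table
# for a handful of teams.
ROWS = {
    "New York (AL)": (214_800_000, 166_521_000),
    "Seattle": (162_040_000, 120_130_000),
    "Pittsburgh": (107_630_000, 108_268_000),
    "Minnesota": (74_640_000, 120_840_000),
}


def market_draw(actual, base):
    """Actual revenue minus base revenue: how much a team makes beyond
    (or short of) what its market alone would predict."""
    return actual - base


# Rank teams from best to worst at drawing from their market.
ranked = sorted(ROWS, key=lambda team: market_draw(*ROWS[team]), reverse=True)
for team in ranked:
    print(f"{team:15s} {market_draw(*ROWS[team]):>+14,d}")
```

A positive number means the team is pulling more out of its market than the market factors alone would suggest; a negative number means it is leaving money on the table.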
As you can see, the highest "base
revenue" figures belong to the two New York teams, and the lowest belongs to the
Expos (due to the fact that the per capita income in their area is easily the
lowest of the 30 teams). Immediately you'll notice that teams from the same
market all have identical "base revenue" figures. This is because the system
does not currently distinguish between the Yankees and Mets or Cubs and White
Sox as to how large their market is. Why?
There is some question as to
whether the seeming "dominance" of one team in every two-team market is due to
inherent advantages or due to one team simply doing a better job promoting
itself than the other. Let's use attendance as a proxy for a team's popularity
within a market and see how the two-team markets shake out. Since the A's moved
to Oakland in 1968, we'll use that as the back end of this discussion. The
question is, of the 34 years from 1968-2001, how often has the "dominant" team
in the market outdrawn the other team?
Yankees 16 - Mets 18
Cubs 20 - White Sox 14
Giants 17 - A's 17
Dodgers 30 - Angels 4
Now, I'll elaborate further,
but obviously the team that immediately sticks out is the Angels, who have only
outdrawn the Dodgers four times, with the last time being all the way back in
1974. It seems to me the Angels are the only team that might have a
complaint that they inherently have significantly less of the Los Angeles
market than the system credits them with. You could still make arguments for
the A's and White Sox being inherently disadvantaged, and there are reasons to
believe this might be so, but I think the extent to which this might be true is
usually way overblown and any "dominance" appears to be mostly due to one team
providing a better product in some way.
As such, I think I might
consider putting an adjustment in for the Angels (after all, Orange County isn't
quite the same thing as Los Angeles), since there's hard data to support it,
but the others look to me like they have almost as good a deal as
their counterparts.
Speaking of two-team markets,
Baltimore and Washington D.C. are considered the same market by the U.S. Census and
so they are considered the same by this system. As you can see from the formula above, the
system estimates that a second team moving into the area would cost the Orioles
around $28 million a year in revenues. This would likely mean that the value of
the franchise would take a bit of a hit if those numbers did come to pass. Now
you might then conclude that Angelos has a right to squawk (a right he
exercises with alarming frequency), but in reality you'll see that the "base
revenue" estimates think that the Orioles have the best non-New York situation
in the league. Of course you could say that's only because D.C. is included in
the market, but Angelos can't have it both ways. Either D.C. is his market or
it isn't, but either way you slice it the Orioles would still have a decent
situation if a team moved into the District. It would be roughly equivalent to
that of the Chicago franchises.
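The Chicago comparison can be checked with the table's numbers. This Python sketch assumes that adding a D.C. team changes only the "teams in metro area" term and nothing else about the Orioles' market; the variable names are invented for the example.

```python
# Figures from the article: the per-team coefficient in the revenue formula
# and the "base revenue" column of the table.
SECOND_TEAM_PENALTY = 27_962_685
ORIOLES_BASE = 139_277_000
CHICAGO_BASE = 113_991_000   # same figure for both Chicago teams

# Assumption: a D.C. team changes only the team-count term for Baltimore.
orioles_with_dc_team = ORIOLES_BASE - SECOND_TEAM_PENALTY
gap_to_chicago = CHICAGO_BASE - orioles_with_dc_team

print(f"{orioles_with_dc_team:,}")   # 111,314,315
print(f"{gap_to_chicago:,}")         # 2,676,685
```

So under that assumption the Orioles' base revenue would land within about $3 million of the Chicago teams' figure.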
One final note is to talk
about how all of this stuff ties in with revenue sharing proposals. Under the
current system, the "inherent" advantage for the two New York teams
over the "average" team according to the "base revenue" estimates is around $37
million a year. Now one could argue that figure should be the additional
amount of revenue the Yankees should have to share, but there are problems with
this line of thought. First of all, a New York team incurs more costs than the
average MLB team (rent, non-player employees, taxes and so forth). Making them
equal on the revenue side would make them disadvantaged on the profit side. So
that has to be taken into account. Also to be taken into account is the fact
that buying a percentage of the Yankees is a hell of a lot more expensive than
buying that same percentage of, say, the Pirates. Doesn't the person who made
that investment have a right to expect an advantage in revenues based on the
fact that he paid more to get a share of them? I think in those terms, that $37
million advantage of the Yankees doesn't seem that overwhelming. Furthermore,
if you're going to institute more revenue sharing, it would be wise to add a
mechanism to strengthen the relationship between winning and revenue. It's that
relationship that does the most by itself to level the playing field, and it
also encourages teams to put out better products. Again, I don't see where
these numbers indicate that increased revenue sharing is all that great of an
idea. To me at least, the Yankees' big revenues look to be half inherent
advantage and half excellent exploitation of their market.
In short, I hope this little formula gives some insight into exactly what
factors cause teams to make money. I haven't seen Ron Johnson around lately,
but if he pops by, I'd love to hear how my formula compares with his. I'd also
like to hear any suggestions on additional factors I could include in the formula.
Fire away!
Voros McCracken
Posted: August 19, 2002 at 06:00 AM | 14 comment(s)
Reader Comments and Retorts
1. Ken Arneson Posted: August 19, 2002 at 12:41 AM (#605886)
I wonder if this is due to other entertainment avenues that baseball competes with. There's a lot more to do in NYC than in KC. If one market is 5 times as large, but there are 5 times as many things to do, the "marginal market" that each city is after is the same total number.
Voros: maybe getting data from the various Tourism boards might allow us to account for this. Just a thought...
Interesting that the two teams that replaced classic old parks with shiny new ones, Detroit and the White Sox, are on the negative side.
Perhaps the Red Sox should think more than twice about replacing Fenway.
Most of the other negative teams play on fake grass or in football stadiums. The anomalies are Anaheim and Baltimore, which Voros addresses above.
As to the national revenue, I would rather just see it subtracted out of the left-hand side. A team's share of the national revenue isn't a function of its population size, per capita income, etc. The left-hand side should really just be locally-generated revenue.
Remember that by last year's numbers, the Montreal Expos brought in over $52 million in revenues from the league alone, so that one could assume that if a team were allowed to exist while generating _no_ local revenue, they would probably take in somewhere around the $55 million mark.
The second comment is that there are a few problems with the "revenue per win is not linear" argument. One is that representing it as linear makes analysis before the season or at the very start of the season possible. You might figure a team is an 84 win team before the year starts, but that really means that the team might have a chance to win anywhere from 69 to 94 games, with the chances of any one win total occurring increasing as it converges to 84. Therefore getting at that "curve" precisely can't be done because you don't know whether two extra wins gets you to 90 wins or to 78 wins. So the curve is "flattened" out considerably when you take this into account. Another point is that the win% numbers only affect revenue in subsequent years in the model. The only aspect of current-year success that affects the model is the number of home playoff games played. Finally, and most importantly, curves that were bent, logarithmic, hyperbolic, exponential, and all sorts of other weirdness were tried and none of them improved the accuracy of the formula. I'm assuming that because there are all sorts of different kinds of 84 win seasons, the effects of these 84 win seasons differ greatly as well. Meaning that to get at the "average" effect from an 84 win season, the various different effects get averaged out to where we once again get a flat and average looking rate.
Finally, in order to use more complex functions, we'd probably need a much higher sample size than I have available (90 teams) in order to get at them. With only the 90 teams, all we're going to get are the simple trends and the important variables. If we try and get complex, the system doesn't become more accurate and becomes harder to work with.
A few things that should push up the accuracy of your model.
There's a strong upward trend in revenue. You can take this into consideration by including the year in the regression (well I actually used year-1994)
I can't overstate the importance of either having made the playoffs the previous season or in particular having won the World Series in the previous season.
Once you include these in the regression, the marginal value of a random win in previous seasons goes down a fair bit.
Likewise, making the playoffs and winning the World Series in the season under consideration is very important. And including this in the regression lowers the value of a random win by quite a bit. (I know why you didn't include this in your model. In looking at what to bid on a player, his impact on your potential playoff chances is very difficult to assess, particularly years down the line. And that's before getting into how long he rates to sustain that rate of play. Way easier to work with in a marginal revenue produced versus value over replacement study than what I have.)
I'm surprised that you found the marginal value of a win to be more or less constant across markets. I found winning percentage* market size to be both positive and significant. Makes for a right messy equation though. If you opted for a simplifying assumption, I can see why.
I don't know much about Canada, partly due to living in Florida. I'm curious whether Montreal is impoverished or something. However, any extra social services provided would increase the portion of people's money that they could spend on entertainment. If I make less money, but don't have to buy health insurance...
I did do this, in a more simplified way. I simply scaled up the revenue numbers to 2001 levels for each year.
As far as making the playoffs and world series, it is more or less in the model under the "home playoff games" heading. If you make the playoffs you get at least one home playoff game. If you make the World Series you could get as many as 10 or 11 playoff games.
I decided to use games instead of artificial levels, since that's more or less how the revenue is generated for the individual teams.
As far as the revenue per win/market size relationship, I was equally surprised but I'm just not finding it.
It _could_ be that I'm estimating total revenue instead of local revenue and the revenue sharing combined with the error rate of the regression combined to reduce the difference to insignificant levels. I'm going to work on trying to compartmentalize the numbers a little more (breaking down how each revenue source is generated individually and then putting it back together), and maybe try and do a little bit more work on the right way to split two team markets (there could be significantly different revenue generating functions for two team markets than a one-team market).
Well obviously with last year's winning percentage being a key variable, making the playoffs is going to be contained within that. The problem I had with trying to add more playoff and WS data was simply that it didn't make the model more accurate, only more complex. If I do a few things differently with the model than Ron, then whether a certain variable shows up as significant could be affected.
As for revenues, I simply scaled up the revenues in 1999 and 2000 to where the total league revenues were the same for all three seasons. This is a bit of a kludge for some questions such a model could work with (say overall revenue growth or something) but for my main purposes I figured it was the best option. YMMV.