Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Monday, August 19, 2002

Makin' Money

Estimating Teams' Revenue Using a Few Simple Numbers

These days everybody seems to be talking about the money end of baseball: how much the players make, how much the owners make, how they should split things up, and so on. So I thought it would be interesting to see if there was a relatively simple way to estimate how much an MLB team could expect to make given a few variables (like the size of its market or how much it wins).

 

Now I'd like to claim that this was an idea I produced from whole cloth, but it isn't. Ron Johnson was doing this sort of thing a few years back, and when last year's numbers came out from MLB, I decided it might be neat to try to redo from scratch what he had done. I back-burnered it for a while, and after some math work, I came up with something decent enough to put out there for public consumption.

 

WARNING: The following piece contains numbers, mathematics and other things some consider hazardous to their health. Proceed with caution.

 

One slight difference between my system and Ron's (if I remember his correctly) is that I'm not trying to estimate a particular team's revenue to the most exact dollar possible, but rather to have a system built on general principles that apply to all teams.

 

After doing a lot of linear regression and other math-related work, I settled on the following variables: the population of the team's metropolitan area (as defined by the 2000 U.S. and Canadian censuses); the number of teams in that metropolitan area; the per capita income of that metropolitan area (a must-have figure that Ron also included in his system); the team's winning percentage in the previous year and in each of the four years before that; the number of home playoff games played this year; and a yes-or-no variable for whether the team's home park opened less than two years ago. The system is essentially a straightforward linear regression. It needs to stay fairly simple so we can get a clear view of how each of these variables works, since we're interested in more than just the final number.

 

The first thing I did to make the system more usable was to combine all of those winning percentages into a single number. This doesn't change the results at all, and it matters because a single number is much easier to work with when determining various figures (e.g., how much more revenue a single win produces). With the 2001 revenues as our example year, the single-number winning percentage is calculated as:

 

W% = (2000 W% * .25) + (1999 W% * .225) + (1998 W% * .2) + (1997 W% * .175) + (1996 W% * .15)
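For readers who want to play along at home, here is a minimal Python sketch of that blend. The function name and the sample winning percentages are made up for illustration; only the five weights come from the formula above.

```python
# Weights from the formula above, most recent prior season first.
WEIGHTS = (0.25, 0.225, 0.2, 0.175, 0.15)


def blended_win_pct(prior_win_pcts):
    """Combine five prior-season winning percentages into one number.

    prior_win_pcts lists the most recent season first, e.g. the 2000,
    1999, 1998, 1997 and 1996 figures when estimating 2001 revenue.
    """
    if len(prior_win_pcts) != len(WEIGHTS):
        raise ValueError("expected exactly five prior seasons")
    return sum(w * p for w, p in zip(WEIGHTS, prior_win_pcts))


# Hypothetical team (not real data) that improved from .450 to .550.
example = blended_win_pct([0.550, 0.525, 0.500, 0.475, 0.450])
print(f"Blended W%: {example:.4f}")
```

Because the weights sum to 1, a team that played .500 ball in all five seasons comes out at exactly .500.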

 

You will notice that the 2001 winning percentage is not listed. According to the numbers I have, the only current-year factor with a significant impact on revenue is how many home playoff games the team plays. However, the 2001 winning percentage would affect the revenue estimates for the years following, so the effect isn't gone, just delayed. This makes sense, and it follows what most people have said on the subject: when a team sees sudden improvement, the biggest gains in attendance come the following season.

 

Once we have that, we can use this formula for the revenue estimate:

 

Team Revenue = (W% * $430,169,580) + (Metro Population * $3.46) - (Teams in Metro Area * $27,962,685) + (Home Playoff Games * $2,446,043) + (Per Capita Income of Area * $2,655.60) - $160,287,379 + ($22,906,159 if the team's stadium is less than two years old)
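The formula translates directly into a small Python function. This is only a sketch: the function and parameter names are invented here, the sample inputs describe a made-up one-team market rather than any real club, and it assumes per capita income is entered in dollars, which the formula does not spell out.

```python
def estimate_revenue(win_pct, metro_population, teams_in_metro,
                     home_playoff_games, per_capita_income,
                     new_stadium=False):
    """Estimate team revenue using the coefficients from the formula above.

    win_pct is the blended five-year W%, not the current-year figure.
    per_capita_income is assumed to be in dollars (an assumption).
    """
    revenue = (
        win_pct * 430_169_580
        + metro_population * 3.46
        - teams_in_metro * 27_962_685
        + home_playoff_games * 2_446_043
        + per_capita_income * 2_655.60
        - 160_287_379
    )
    if new_stadium:
        # Bonus for a park that opened less than two years ago.
        revenue += 22_906_159
    return revenue


# Hypothetical inputs: a .500 team alone in a market of 3 million people
# with a $30,000 per capita income, no playoff games, older ballpark.
sample = estimate_revenue(0.500, 3_000_000, 1, 0, 30_000)
print(f"Estimated revenue: ${sample:,.0f}")
```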

 

The first interesting thing to note is that if I were to father a child tomorrow, this would mean that the Diamondbacks could expect to earn an extra $3.46 next year.

 

The correlation coefficient between these revenues and those estimated by Forbes the last three years is .89 with an adjusted r-squared (for all the math geeks out there) of .78. The standard error of the estimate is about $18.3 million.

 

I know you're saying, "So what? What does this accomplish?" Well, a couple of things. First off, we can use this to estimate how much money an extra win generates for a baseball franchise. Put simply, we can take 1 divided by 162, multiply that by $430,169,580, and get about $2.655 million per additional win. Before we settle on this number, though, there are a bunch of questions to ask:

 

     
  • Does the amount gained per win hold constant across teams regardless of market size?
  • Since the win percentage is an aggregate of several seasons, how do we figure what a single win counts as?
  • What would this number be without revenue sharing?

The first question is the big one. I tried a series of functions to see if the amount of revenue per win varied across markets (i.e., is a win more valuable in New York than in Kansas City?). I tried as hard as I could, but I kept coming back to the same conclusion: if there is any change in the value of a win across markets, it's that a win is worth more in the smaller market, but by so little that it is irrelevant and not statistically significant. In layman's terms, using the same revenue-per-win figure for all teams will give you as good an estimate as any other method I could come up with.

 

The second question is easier to handle. Though a win this year carries a weight of only a quarter in the W% figure for next year, it also counts toward the following year's figure, and so on for five years. Still, compared with players' salaries, it's money that is spent now and paid back later, meaning its value is diminished a little. The reduction really is minimal, and it is likely at least offset by the fact that generating a win now might have some effect on whether you host playoff games.
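To put a rough number on how much the delay diminishes a win's value, the sketch below discounts the revenue from one extra win as it feeds into the W% figure over the following five seasons. The 5% discount rate is purely an assumed value for illustration, as are the names; the weights and the W% coefficient are the ones from the formulas above.

```python
WIN_PCT_COEFFICIENT = 430_169_580
GAMES_PER_SEASON = 162
# Weight of a season's W% one, two, three, four and five years later.
WEIGHTS = (0.25, 0.225, 0.2, 0.175, 0.15)


def present_value_of_win(discount_rate):
    """Discounted revenue from one extra win, spread over five seasons."""
    per_win = WIN_PCT_COEFFICIENT / GAMES_PER_SEASON
    return sum(
        weight * per_win / (1 + discount_rate) ** years_out
        for years_out, weight in enumerate(WEIGHTS, start=1)
    )


undiscounted = present_value_of_win(0.0)
discounted = present_value_of_win(0.05)  # 5% is an assumed rate
print(f"Undiscounted value of a win: ${undiscounted:,.0f}")
print(f"Value discounted at 5%:      ${discounted:,.0f}")
print(f"Share retained:              {discounted / undiscounted:.1%}")
```

With no discounting the five pieces add back up to the full per-win figure, since the weights sum to 1. Any playoff effect is left out of this sketch entirely.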

 

Without the current revenue sharing system, it looks as if the revenue-per-win number would be up around $3.5 million. It's a good number to know because without revenue sharing, a player worth 5 extra wins would be worth $17.5 million, and with the current revenue sharing he's only worth around $13.3 million.
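The arithmetic behind those per-win and per-player figures is easy to check. In this short Python sketch the variable and function names are made up; the $3.5 million figure is simply the estimate quoted above, not something the code derives.

```python
WIN_PCT_COEFFICIENT = 430_169_580
GAMES_PER_SEASON = 162

# One extra win raises W% by 1/162.
value_per_win = WIN_PCT_COEFFICIENT / GAMES_PER_SEASON
# Estimate quoted in the text for a world without revenue sharing.
value_per_win_no_sharing = 3_500_000


def player_value(extra_wins, per_win):
    """Revenue attributable to a player who adds extra_wins wins."""
    return extra_wins * per_win


with_sharing = player_value(5, value_per_win)
without_sharing = player_value(5, value_per_win_no_sharing)
print(f"Value per win:               ${value_per_win:,.0f}")
print(f"5-win player, with sharing:  ${with_sharing:,.0f}")
print(f"5-win player, no sharing:    ${without_sharing:,.0f}")
```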

 

Another interesting use for the formula would be to figure out a "base" revenue rate. What I mean by that is to figure out the estimated revenue for each team based only on those factors the team has absolutely no control over. For this system that would be the size of the team's metropolitan area, the per capita income in the team's area, and how many teams are currently in the area. By doing this, we can estimate what each team's revenues would be if they were all exactly equal in terms of producing on the field, marketing, new stadiums, and other areas in which teams can directly affect how much they rake in. A standard complaint from the stat-head community is that a lot of teams don't do nearly enough to generate their own revenue and instead rely on handouts from everybody else. This method would tell us how much each team would make if they all made exactly the same efforts in this regard.
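One way to sketch a "base" revenue calculation in Python is to hold the on-field terms constant and let only the market factors vary. The text does not say exactly which values were plugged in for the controllable terms, so the .500 winning percentage, zero playoff games and no new-stadium bonus used here are assumptions, as are the function name and the sample market.

```python
def base_revenue(metro_population, teams_in_metro, per_capita_income,
                 assumed_win_pct=0.500):
    """Revenue estimate using only factors outside the team's control.

    Assumes every team plays at assumed_win_pct, hosts no playoff games
    and gets no new-stadium bonus (assumptions, not from the source).
    """
    return (
        assumed_win_pct * 430_169_580
        + metro_population * 3.46
        - teams_in_metro * 27_962_685
        + per_capita_income * 2_655.60
        - 160_287_379
    )


# Hypothetical market (not real census figures): 3 million people,
# $30,000 per capita income, with one team versus two teams.
one_team_market = base_revenue(3_000_000, 1, 30_000)
shared_market = base_revenue(3_000_000, 2, 30_000)
print(f"Base revenue, one team:  ${one_team_market:,.0f}")
print(f"Base revenue, two teams: ${shared_market:,.0f}")
```

Two teams in the same metropolitan area feed identical inputs into a function like this, so they necessarily come out with identical base figures.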

 

Below is a chart of all the Major League teams with some revenue numbers and estimates for the 2001 season. The first column holds the actual revenue numbers, which come from Forbes' yearly estimates of Major League Baseball revenues. The next column shows the revenues that the above formula estimated the team would bring in based on all of the factors listed. The third column lists the team's "base revenues," which I explained above as the estimated revenue if each team performed equally on the field. The final column lists the team's actual 2001 revenues minus its "base revenues." This works as a measure of how well the team is currently drawing from its market (a concept Derek Zumsteg wrote about in an interesting Baseball Prospectus piece the other day). The table:

 

 

                                                                                                                                                                                                                                                                                                                                                                                                                                                               
Team            2001 Act Revenue  2001 Est Revenue  Base Revenue  Difference: Actual - Base
Anaheim             $102,610,000      $113,379,000  $126,521,000              ($23,911,000)
Arizona             $127,240,000      $125,886,000  $106,092,000                $21,148,000
Atlanta             $160,020,000      $174,156,000  $119,326,000                $40,694,000
Baltimore           $132,720,000      $139,713,000  $139,277,000               ($6,557,000)
Boston              $152,140,000      $142,988,000  $133,525,000                $18,615,000
Chicago (AL)        $101,330,000      $113,875,000  $113,991,000              ($12,661,000)
Chicago (NL)        $130,550,000       $87,048,000  $113,991,000                $16,559,000
Cincinnati           $86,630,000      $104,800,000  $105,740,000              ($19,110,000)
Cleveland           $151,150,000      $141,371,000  $112,066,000                $39,084,000
Colorado            $129,420,000      $107,591,000  $117,778,000                $11,642,000
Detroit             $114,720,000      $108,964,000  $123,632,000               ($8,912,000)
Florida              $80,660,000       $86,014,000  $110,810,000              ($30,150,000)
Houston             $125,910,000      $158,289,000  $119,434,000                 $6,476,000
Kansas City          $84,600,000       $76,247,000  $107,821,000              ($23,221,000)
Los Angeles         $142,630,000      $130,674,000  $126,521,000                $16,109,000
Milwaukee           $108,260,000      $113,771,000  $110,109,000               ($1,849,000)
Minnesota            $74,640,000       $85,049,000  $120,840,000              ($46,200,000)
New York (AL)       $214,800,000      $226,432,000  $166,521,000                $48,279,000
New York (NL)       $169,030,000      $177,792,000  $166,521,000                 $2,509,000
Oakland              $92,740,000      $110,540,000  $116,596,000              ($23,856,000)
Philadelphia         $94,270,000       $95,530,000  $129,615,000              ($35,345,000)
Pittsburgh          $107,630,000      $105,844,000  $108,268,000                 ($638,000)
San Diego            $93,090,000      $106,354,000  $106,981,000              ($13,891,000)
San Francisco       $142,020,000      $146,552,000  $116,596,000                $25,424,000
Seattle             $162,040,000      $158,357,000  $120,130,000                $41,910,000
St. Louis           $123,300,000      $115,693,000  $111,712,000                $11,588,000
Tampa Bay            $91,590,000       $76,730,000  $104,746,000              ($13,156,000)
Texas               $134,300,000      $124,831,000  $122,272,000                $12,028,000
Montreal             $63,260,000       $70,324,000   $96,974,000              ($33,714,000)
Toronto              $91,120,000      $104,175,000  $110,113,000              ($18,993,000)
Average             $119,480,667      $120,965,633  $119,483,967                   ($3,300)

As you can see, the highest "base revenue" figures belong to the two New York teams, and the lowest belongs to the Expos (due to the fact that the Per Capita Income in their Area is easily the lowest of the 30 teams). You'll immediately notice that teams from the same market all have identical "base revenue" figures. This is because the system does not currently distinguish between the Yankees and Mets, or the Cubs and White Sox, as to how large their market is. Why?

 

There is some question as to whether the seeming "dominance" of one team in every two-team market is due to inherent advantages or to one team simply doing a better job of promoting itself than the other. Let's use attendance as a proxy for a team's popularity within a market and see how the two-team markets shake out. Since the A's moved to Oakland in 1968, we'll use that as the starting point of this discussion. The question is, of the 34 years from 1968 to 2001, how often has the "dominant" team in the market outdrawn the other team?

 

Yankees 16 - Mets 18
Cubs 20 - White Sox 14
Giants 17 - A's 17
Dodgers 30 - Angels 4

Now, I'll elaborate further, but obviously the team that immediately sticks out is the Angels, who have only outdrawn the Dodgers four times, with the last time being all the way back in 1974. It seems to me the Angels are the only team that might have a complaint that they inherently have significantly less of the Los Angeles market than the system credits them with. You could still make arguments for the A's and White Sox being inherently disadvantaged, and there are reasons to believe this might be so, but I think the extent to which this might be true is usually way overblown, and any "dominance" appears to be mostly due to one team providing a better product in some way.

 

As such, I might consider putting in an adjustment for the Angels (after all, Orange County isn't quite the same thing as Los Angeles), since there's hard data to support it, but the others look to me like they have almost as good a deal as their counterparts.

 

Speaking of two-team markets, Baltimore and Washington D.C. are considered the same market by the U.S. Census, and so they are considered the same by this system. As you can see from the formula above, the system estimates that a second team moving into the area would cost the Orioles around $28 million a year in revenues. This would likely mean that the value of the franchise would take a bit of a hit if those numbers did come to pass. Now you might then conclude that Angelos has a right to squawk (a right he exercises with alarming frequency), but in reality you'll see that the "base revenue" estimates think that the Orioles have the best non-New York situation in the league. Of course you could say that's only because D.C. is included in the market, but Angelos can't have it both ways. Either D.C. is his market or it isn't, but either way you slice it the Orioles would still have a decent situation if a team moved into the District. It would be roughly equivalent to that of the Chicago franchises.
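The comparison with Chicago can be checked with a couple of subtractions. This Python sketch uses the Orioles and Chicago "base revenue" figures from the table and the per-team coefficient from the formula; the variable names are made up for the example.

```python
# Coefficient on "Teams in Metro Area" from the revenue formula.
SECOND_TEAM_COST = 27_962_685
# "Base revenue" figures taken from the table above.
ORIOLES_BASE = 139_277_000
CHICAGO_BASE = 113_991_000

# Orioles' base revenue if a second team shared the market.
orioles_with_dc_team = ORIOLES_BASE - SECOND_TEAM_COST
gap_to_chicago = CHICAGO_BASE - orioles_with_dc_team

print(f"Orioles base with a D.C. team: ${orioles_with_dc_team:,}")
print(f"Shortfall versus Chicago:      ${gap_to_chicago:,}")
```

On these numbers the Orioles would land within about $3 million of either Chicago club's base figure.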

 

One final note concerns how all of this stuff ties in with revenue sharing proposals. Under the current system, the "inherent" advantage for the two New York teams over the "average" team according to the "base revenue" estimates is around $37 million a year. Now one could argue that figure should be the additional amount of revenue the Yankees should have to share, but there are problems with this line of thought. First of all, a New York team incurs more costs than the average MLB team (rent, non-player employees, taxes and so forth). Making them equal on the revenue side would make them disadvantaged on the profit side, so that has to be taken into account. Also to be taken into account is the fact that buying a percentage of the Yankees is a hell of a lot more expensive than buying that same percentage of, say, the Pirates. Doesn't the person who made that investment have a right to expect an advantage in revenues based on the fact that he paid more to get a share of them? In those terms, I think that $37 million advantage of the Yankees doesn't seem that overwhelming. Furthermore, if you're going to institute more revenue sharing, it would be wise to add a mechanism to strengthen the relationship between winning and revenue. It's that relationship that does the most by itself to level the playing field, and it also encourages teams to put out better products. Again, I don't see where these numbers indicate that increased revenue sharing is all that great of an idea. To me at least, the Yankees' big revenues look to be half inherent advantage and half excellent exploitation of their market.

 

In short, I hope this little formula gives some insight into exactly what factors cause teams to make money. I haven't seen Ron Johnson around lately, but if he pops by, I'd love to hear how my formula compares with his. I'd also like to hear any suggestions on additional factors I could include in the formula. Fire away!

 

Voros McCracken Posted: August 19, 2002 at 06:00 AM | 14 comment(s)

Reader Comments and Retorts


   1. Ken Arneson Posted: August 19, 2002 at 12:41 AM (#605886)
Perhaps you can add a "shiny new/classic old ballpark" factor. As a whole, it looks like the teams with nice ballparks are on the positive side, and the teams without one are on the negative side.
   2. Jason Posted: August 19, 2002 at 12:41 AM (#605887)
Rather disappointing analysis in my opinion. I believe Derek Zumsteg's most recent look at actual market advantages did a much better job of looking at this issue. I'm incredibly skeptical that the value of marginal wins should be considered linear. That flies in the face of common sense. Using your formula, one concludes that almost none of the trades that happened this year made any economic sense from the seller's perspective. Sure, the Brewers ditching Alex Ochoa for 700K in savings probably made sense since it likely didn't cost the team a win, but dealing Ray Durham cost the WS somewhere between one and two wins, and they saved a pittance without getting much talent back. At those marginal rates they'd have needed to get talent equal to 2 first-round picks to even come close to making sense. Similarly, the Brewers were clearly better off not dealing Jose Hernandez and getting a couple of extra Ws, rather than saving his salary and only getting a decent prospect or two. No, I imagine that there is little value in revenue for marginal wins between 70 and 80, and a certain value in getting to 70 (and making it to the mediocre level). After 80 wins there should be a steady increase in the value of a win, peaking somewhere around 90 where the playoffs become likely, and then declining slowly as winning the division handily becomes no big deal.
   3. Greg Pope thinks the Cubs are reeking havoc Posted: August 19, 2002 at 12:41 AM (#605888)
Which ballparks that are old fall under your definition of classic? Wrigley is the only one that jumps to mind.
   4. tangotiger Posted: August 19, 2002 at 12:41 AM (#605890)
An interesting point that Voros brought up is that the marginal value of a win has the same total dollar impact, regardless of market size. This sounds pretty strange to me. This means that if you have a 2 million person market and an 8 million person market, then the benefit of an extra win is the same in either market.

I wonder if this is due to other entertainment avenues that baseball competes with. There's a lot more to do in NYC than in KC. If one market is 5 times as large, but there are 5 times as many things to do, the "marginal market" that each city is after is the same total number.

Voros: maybe getting data from the various Tourism boards might allow us to account for this. Just a thought...
   5. Ken Arneson Posted: August 19, 2002 at 12:41 AM (#605892)
Um, I think Fenway is a classic old park.

Interesting that the two teams that replaced classic old parks with shiny new ones, Detroit and the White Sox, are on the negative side.
Perhaps the Red Sox should think more than twice about replacing Fenway.

Most of the other negative teams play on fake grass or in football stadiums. The anomalies are Anaheim and Baltimore, which Voros addresses above.
   6. Michael Humphreys Posted: August 19, 2002 at 12:41 AM (#605896)
Excellent article. One very, very quick-and-dirty stat/implication. If each NY market team has an inherent $37 million revenue advantage, each win costs $3.5 million, and we assume (unrealistically, as you note) that all costs other than salaries are fixed, such team could spend the extra money and win 91 games a season. Furthermore, there is a *dynamic* aspect to this. The additional wins result in more revenue, which then permits more money to be spent on salaries, which creates more wins, etc., etc. In other words, the effect over time of inherent metropolitan market advantages might be greater than one would see in a static regression model. I don't know what the point of equilibrium/diminishing returns over time might be, but I would suppose the Yankees may have reached it. I also agree, however, that any revenue sharing approach must not penalize teams for being better at exploiting their market (i.e., intelligently building a better team that brings in more revenue). The Mets have certainly proven the point that the success of the Yankees is not inevitable.
   7. Jason Posted: August 20, 2002 at 12:41 AM (#605903)
Having thought about it more there's actually a second implicit constant term in the equation. By using raw winning percentage and the historical precedent for even the worst teams winning at least 33% of their games there's a built in cushion in that term. Since that's positive and the explicit constant term is negative they reduce each other in magnitude, but it certainly throws off any simple calculation of marginal win values.
   8. Walt Davis Posted: August 20, 2002 at 12:41 AM (#605905)
I'm not really worried about the constant term. The theoretical Springfield Isotopes example is just a warning not to extrapolate beyond the bounds of the observed data. There's no model that will tell you what the expected revenues are for a team that doesn't win located in a town with no people in it -- or at least, since such a thing has never been observed, we have no means of assessing how accurate that prediction would be.

As to the national revenue, I would rather just see it subtracted out of the left-hand side. A team's share of the national revenue isn't a function of its population size, per capita income, etc. The left-hand side should really just be locally-generated revenue.
   9. Voros McCracken Posted: August 20, 2002 at 12:41 AM (#605906)
Two quick comments,

Remember that by last year's numbers, the Montreal Expos brought in over $52 million in revenues from the league alone, so that one could assume that if a team were allowed to exist while generating _no_ local revenue, they would probably take in somewhere around the $55 million mark.

The second comment is that there are a few problems with the "revenue per win is not linear" argument. One is that representing it as linear makes analysis before the season or at the very start of the season possible. You might figure a team is an 84-win team before the year starts, but that really means the team might have a chance to win anywhere from 69 to 94 games, with the chance of any one win total increasing the closer it is to 84. Therefore getting at that "curve" precisely can't be done, because you don't know whether two extra wins get you to 90 wins or to 78 wins. So the curve is "flattened" out considerably when you take this into account. Another point is that the win% numbers only affect revenue in subsequent years in the model. The only aspect of current-year success that affects the model is the number of home playoff games played. Finally, and most importantly, curves that were bent, logarithmic, hyperbolic, exponential, and all sorts of other weirdness were tried, and none of them improved the accuracy of the formula. I'm assuming that because there are all sorts of different kinds of 84-win seasons, the effects of these 84-win seasons differ greatly as well. That means that to get at the "average" effect of an 84-win season, the various different effects get averaged out to where we once again get a flat and average-looking rate.

Finally, in order to use more complex functions, we'd probably need a much higher sample size than I have available (90 teams) in order to get at them. With only the 90 teams, all we're going to get are the simple trends and the important variables. If we try and get complex, the system doesn't become more accurate and becomes harder to work with.
   10. Cris E Posted: August 21, 2002 at 12:41 AM (#605918)
I'm still having trouble getting past "Another point is that the win% numbers only affect revenue in subsequent years in the model. The only aspect of current year success that affects the model are the number of home playoff games played." The fact is that the Twins' attendance went from 12K to 22K last year, and you're saying that was due to the 69-93 fifth-place finish in 2000. It may be an isolated event that doesn't model well, but as with the marginal value of wins not being linear, big improvement does have a larger impact on attendance for bad teams.
   11. Ron Johnson Posted: August 21, 2002 at 12:42 AM (#605932)
Nice piece of work. Some of what follows is quibbles and I know you're more interested in broad factors than specific details.

A few things that should push up the accuracy of your model.

There's a strong upward trend in revenue. You can take this into consideration by including the year in the regression (well I actually used year-1994)

I can't overstate the importance of either having made the playoffs the previous season or, in particular, having won the World Series in the previous season.

Once you include these in the regression, the marginal value of a random win in previous seasons goes down a fair bit.

Likewise, making the playoffs and winning the World Series in the season under consideration is very important. And including this in the regression lowers the value of a random win by quite a bit. (I know why you didn't include this in your model. In looking at what to bid on a player, his impact on your potential playoff chances is very difficult to assess, particularly years down the line. And that's before getting into how long he rates to sustain that rate of play. Your version is way easier to work with in a marginal revenue produced versus value over replacement study than what I have.)

I'm surprised that you found the marginal value of a win to be more or less constant across markets. I found winning percentage * market size to be both positive and significant. Makes for a right messy equation though. If you opted for a simplifying assumption, I can see why.

   12. Silver King Posted: August 25, 2002 at 12:43 AM (#605987)
"the lowest belongs to the Expos (due to the fact that the Per Capita Income in their Area is easily the lowest of the 30 teams)"

I don't know much about Canada, partly due to living in Florida. I'm curious whether Montreal is impoverished or something. However, any extra social services provided would increase the portion of people's money that they could spend on entertainment. If I make less money, but don't have to buy health insurance...
   13. Voros McCracken Posted: August 26, 2002 at 12:43 AM (#605999)
"There's a strong upward trend in revenue. You can take this into consideration by including the year in the regression (well I actually used year-1994)"

I did do this, in a more simplified way. I simply scaled up the revenue numbers to 2001 levels for each year.

As far as making the playoffs and World Series, it is more or less in the model under the "home playoff games" heading. If you make the playoffs you get at least one home playoff game. If you make the World Series you could get as many as 10 or 11 playoff games.

I decided to use games instead of artificial levels, since that's more or less how the revenue is generated for the individual teams.

As far as the revenue per win/market size relationship goes, I was equally surprised, but I'm just not finding it.

It _could_ be that I'm estimating total revenue instead of local revenue and the revenue sharing combined with the error rate of the regression combined to reduce the difference to insignificant levels. I'm going to work on trying to compartmentalize the numbers a little more (breaking down how each revenue source is generated individually and then putting it back together), and maybe try and do a little bit more work on the right way to split two team markets (there could be significantly different revenue generating functions for two team markets than a one-team market).
   14. Voros McCracken Posted: August 30, 2002 at 12:44 AM (#606061)
Art,

Well, obviously with last year's winning percentage being a key variable, making the playoffs is going to be contained within that. The problem I had with trying to add more playoff and WS data was simply that it didn't make the model more accurate, only more complex. If I do a few things differently with the model than Ron does, then whether a certain variable shows up as significant could be affected.

As for revenues, I simply scaled up the revenues in 1999 and 2000 to where the total league revenues were the same for all three seasons. This is a bit of a kludge for some questions such a model could work with (say overall revenue growth or something) but for my main purposes I figured it was the best option. YMMV.
