Explanation of Methods
The current park effect systems either group all park effects into one offensive measure (like runs or runs created) or try to isolate a park's effect on particular events (like 1B, 2B, BB).
I dismiss the all-encompassing systems because parks don't affect all events equally. A park which is favorable to home run hitters may be harmful to singles hitters. So when an all-encompassing system of park effects is applied to an individual player's statistics, some players are penalized (or rewarded) more than they should be.
I prefer park effect systems which modify events. If a park helps home run hitters, home run hitters playing in that park have the value of their home runs reduced. Players who don't hit homers are not penalized.
I calculate all my effects on an event basis. However, the way I figure park effects is different from any other source I've come across. I'll explain how and why I use my methods below.
The most widely available park effect data is published annually by STATS, Inc. In their annual Major League Handbooks, they provide current season and three-year park effects. They calculate both all-encompassing factors and event factors.
STATS creates an index for each event. Using walks (BB) to illustrate, they take BB per Plate Appearance (PA) at home and divide by BB/PA on the road. This is supposed to create an index that equals 1.00 when the home park has an average effect on the event. One big problem is that this method presumes every team has a road factor of 1.00. In reality, teams play a varying number of games at each road park, so some teams play a road schedule weighted toward either pitcher's parks (road factor less than 1.00) or hitter's parks (road factor more than 1.00). The STATS index therefore needs more work. A more iterative process is required.
You may be wondering why STATS divides BB by PA to start with. Doing so gives the rate at which walks take place, either at home or on the road. This is more precise than using just the aggregate number of walks. Using a rate-based stat eliminates the bias present if a team has more opportunities either at home or on the road.
Using PA to determine the event rate is OK for BB or SO. We'd want to know how much a park influences walks and strikeouts because when these events are affected by the park, so is the number of times a ball is put in play. If more batters are striking out, fewer batters are hitting the ball. With fewer players making contact, every positive offensive event (1B, 2B, 3B, HR) occurs less frequently. With no other factors involved, a park which only increases strikeouts will also decrease hits. That effect is rather indirect, which is the reason I use a different denominator than STATS uses to determine factors for hits.
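As a rough sketch of the index described above, the home rate is divided by the road rate. The second function illustrates why a road factor of 1.00 can't be taken for granted: it weights each road park's factor by the games played there. The function names, the two-park schedule, and every number are hypothetical, chosen only to make the arithmetic easy to follow; this is not STATS's actual code.

```python
def rate_index(home_events, home_pa, road_events, road_pa):
    """Home rate divided by road rate; 1.00 suggests a neutral home park."""
    home_rate = home_events / home_pa
    road_rate = road_events / road_pa
    return home_rate / road_rate


def schedule_weighted_road_factor(road_games, park_factor):
    """Average of the road parks' factors, weighted by games played at each.

    ASSUMPTION: a simple games-weighted average is used here only to show
    that an uneven road schedule moves the road factor away from 1.00.
    """
    total_games = sum(road_games.values())
    weighted = sum(games * park_factor[park] for park, games in road_games.items())
    return weighted / total_games


# Hypothetical walk totals (not real data).
bb_index = rate_index(home_events=620, home_pa=6200, road_events=558, road_pa=6200)

# Hypothetical road schedule: more games in a pitcher's park than a hitter's park.
road_games = {"Park A": 9, "Park B": 3}
park_factor = {"Park A": 0.90, "Park B": 1.10}
road_factor = schedule_weighted_road_factor(road_games, park_factor)

print(f"BB index (home rate / road rate): {bb_index:.3f}")
print(f"Schedule-weighted road factor:    {road_factor:.3f}")
```

With this made-up schedule the road factor comes out below 1.00, so the raw index would overstate how much the home park helps walks.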
STATS uses At Bats (AB) to determine the rate for hits. AB include strikeouts. This shouldn't be done. The amount that strikeouts influence hits should be figured separately. By keeping the non-hitting events (BB and SO) separate, we can better isolate the way a park affects a batted ball. It is for this reason that I determine the event rate for hits using Balls in Play (BIP) in the denominator.
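To see why the denominator matters, here is a small worked example. It assumes BIP is simply AB minus SO (the text does not spell out a formula, so that definition is an assumption), and the totals are hypothetical. Both parks turn batted balls into hits at the same rate; the home park only adds strikeouts.

```python
def hit_rates(ab, so, hits):
    """Return (hits per AB, hits per BIP).

    ASSUMPTION: BIP = AB - SO. The article does not give an exact formula.
    """
    bip = ab - so
    return hits / ab, hits / bip


# Hypothetical totals: hits per ball in play is 0.330 in both settings,
# but the home park produces 200 more strikeouts in the same number of AB.
road_per_ab, road_per_bip = hit_rates(ab=5500, so=1000, hits=1485)
home_per_ab, home_per_bip = hit_rates(ab=5500, so=1200, hits=1419)

ab_index = home_per_ab / road_per_ab      # mixes the strikeout effect into hits
bip_index = home_per_bip / road_per_bip   # isolates what happens to batted balls

print(f"Hit index using AB:  {ab_index:.3f}")
print(f"Hit index using BIP: {bip_index:.3f}")
```

The AB-based index drops below 1.00 even though nothing about batted balls changed, while the BIP-based index stays at about 1.00 and leaves the strikeout effect to be measured on its own.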
For walks and strikeouts, I calculate the rates per Plate Appearance (PA). For Singles (1B), Doubles (2B), Triples (3B), and Home Runs (HR), I calculate the rates per Ball in Play (BIP).
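The two denominators can be laid out in a few lines of code. This is only a sketch: the `Totals` container, the `balls_in_play` helper, and the sample numbers are hypothetical, and BIP is again assumed to be AB minus SO.

```python
from dataclasses import dataclass


@dataclass
class Totals:
    """Hypothetical container for one park's (or one road schedule's) totals."""
    pa: int
    ab: int
    bb: int
    so: int
    singles: int
    doubles: int
    triples: int
    hr: int


def balls_in_play(t: Totals) -> int:
    # ASSUMPTION: BIP = AB - SO; the article does not define it precisely.
    return t.ab - t.so


def event_rates(t: Totals) -> dict:
    """BB and SO per plate appearance; hit types per ball in play."""
    bip = balls_in_play(t)
    return {
        "BB": t.bb / t.pa,
        "SO": t.so / t.pa,
        "1B": t.singles / bip,
        "2B": t.doubles / bip,
        "3B": t.triples / bip,
        "HR": t.hr / bip,
    }


# Made-up totals, for illustration only.
home = Totals(pa=6200, ab=5500, bb=600, so=1000,
              singles=990, doubles=270, triples=45, hr=180)
rates = event_rates(home)
for event, rate in rates.items():
    print(f"{event}: {rate:.3f}")
```

The same function would be run on the road totals, giving a home rate and a road rate for each event.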
Other considerations to keep in mind.
One quagmire people who develop park factors must muddle through is whether to use one year's park data or three (or five, or whatever) years' park data.
Those who use three years' data do so to eliminate random fluctuation. Sometimes more home runs are hit at home one year due to nothing other than chance. This is a valid argument.
However, combining the data for more than one season also averages away weather effects.
Let's use Wrigley Field as an example. The wind has a great effect on play at Wrigley. Pitching on a day when there is a strong wind blowing in toward home plate is a pitcher's dream. However, put that same pitcher on the hill on a day when a strong wind is blowing out and the bleacher bums will be taking home a lot of souvenirs. Over a three-year period, Wrigley could be a strong pitcher's park one year, average the next and a big home run park in the third year.
Grouping the data over the three-year period could give the illusion that Wrigley was a neutral park. This would be incorrect. Using park effects developed in this manner, we would underestimate (or overestimate) a player's production in two of the three years.
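A quick bit of arithmetic shows the illusion. The yearly home run factors below are invented for illustration (they are not measured Wrigley values), and the pooled figure is a plain average, which assumes roughly equal opportunities in each year.

```python
# Hypothetical one-year home run factors: wind mostly in, neutral, mostly out.
yearly_factor = {"year 1": 0.85, "year 2": 1.00, "year 3": 1.15}

# ASSUMPTION: equal opportunities each year, so pooling is a simple mean.
pooled_factor = sum(yearly_factor.values()) / len(yearly_factor)

# How far the pooled factor misses each season's own factor.
miss_by_year = {year: pooled_factor - factor for year, factor in yearly_factor.items()}
years_misjudged = [year for year, miss in miss_by_year.items() if abs(miss) > 0.05]

print(f"Pooled three-year factor: {pooled_factor:.2f}")
for year, miss in miss_by_year.items():
    print(f"{year}: pooled minus one-year factor = {miss:+.2f}")
```

The pooled factor lands at about 1.00, which fits only the middle season; the other two seasons are each off by about 0.15 in opposite directions.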
I take the lesser of two evils; I create park effects using one year's data.
Now that you (hopefully) understand my methods, check out the 1997 Ball Park Factors.
If I wasn't clear enough or you'd like to make a comment, please e-mail me.