Run Production Average (RPA)
[ Webmaster's Note: The following article appeared in Mike Gimbel's Baseball Player & Team Ratings, 1994 edition and appears here courtesy of the author.]
Before we can rate any individual player we need to be able to find out the value, in terms of runs produced, of the individual offensive acts in which the player is involved. This means we need to know what the run production value of a single is as compared to the value of a double or the value of a stolen base or the value of a walk, etc.
HOW DO WE DETERMINE THESE VALUES?
Each year, in each league, and for each team, there are accurate records of the number of occurrences of each offensive and defensive event as well as the number of runs which resulted. The Stats, Inc. organization has personnel in every major league ball park's press section. These individuals record every pitch thrown and every ball put in play. They also record each ball's direction, distance and outcome, and they record every player on the field and that player's position on the field at the time the ball was hit.
Utilizing several years of the above data for all the Major League teams, it became a matter of experimenting with different linear weight values for the various events (singles, doubles, etc.) until I arrived at values which were accurate enough to use in estimating the approximate number of runs that should have been scored during a full season by any major league team in any particular year. A linear weight value, for those unfamiliar with the term as used in this program, simply means that a specific run value has been determined for a particular event that takes place on the playing field. As an example from this program, a single has a linear weight value of 0.29 runs.
The resulting run values, which have been arrived at through statistical testing for each of the individual events, are not static values, however. These values change with the opportunities available. The greater the number of opportunities, the greater the chance of scoring runs from each offensive act. If a team gets only one single or one walk per game and nothing more, then this single or walk will probably have no value in terms of runs, since no one would likely score. If, however, every hitter who comes to the plate either singles or gets a walk or hits a home run, then these singles and walks would be exactly equal to home runs in terms of their run value since every hitter would score. Run scoring and the values of individual offensive events, therefore, vary with team and individual on-base percentage.
The Run Production Average (RPA) is the result of this statistical experimentation. The RPA is the product of testing years of data. This resulted in a formula which enables us to rate each player or pitcher's performance on a per plate appearance basis.
Here then are the linear weight values determined through the above testing:
The value of the intentional walk is not listed since the act of receiving an intentional walk has almost nothing to do with assessing the individual player's ability. Since the intentional walk has only a minor effect on the team run totals (and this minor effect is different in the National League from the American League due to the lack of a DH) we use it in the formula to predict run scoring for the team only, but we won't use it in our individual player RPA ratings.
THE SET-UP RPA
As indicated earlier, the linear weights listed above are not static values. The values of these singles, doubles, triples, etc., are dependent upon how often people are on base. In addition, each batter is responsible for driving-in or moving up the runners already on base as well as placing himself on the base paths in order to set-up the hitters coming after him in the line-up.
The RPA formula, therefore, is split up into two main parts: a set-up and a drive-in RPA rating. It is the set-up portion that reflects a batter's ability to get on base. This is the portion of the formula which is not linear since the percentage of times a batter gets on base increases or decreases the linear values of each individual offensive act for the hitters that follow him.
When the result of the on-base production adjustment is applied to the individual player it will be listed separately as the set-up RPA. This will show how many runs a hitter has set up, as compared to how many he has driven-in, on average, for each computed plate appearance.
For example, if a player, prior to the on-base adjustment, has a production figure of 100 runs, then one-half (50 runs) is his figure for driving in runs and will not be adjusted by the on-base percentage. The other 50 runs will be adjusted to reflect the on-base ability of the hitter. We can then divide each of these run totals by the total number of computed plate appearances to get an RPA rating for setting-up runs and an RPA rating for driving-in runs.
If, in this example, the player had a computed plate appearance total of 500, then his drive-in RPA would be .100 (50 runs divided by 500). If, after adjusting for this player's on-base ability, his computed set-up runs were 45, then his Set-up RPA would be .090. His total offensive RPA would be .190 (.100 plus .090).
To this will be added (or subtracted) his RPA based on his defensive ability. In other words, if the above player had a -13 (shorthand for -.013) defensive rating, compared to the median rated starter at his primary position, then his total RPA would now be .177 (.190 less .013). This is still a very good overall rating and such a player would be a valuable addition to any team despite his poor fielding. If his team can move him to an easier defensive position or to DH (if he's in the American League) then they may be able to maximize his offensive assets without incurring his defensive liabilities.
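The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the figures worked out above; the function name and its arguments are assumptions made for the sketch and are not taken from the published program.

```python
def rpa_breakdown(drive_in_runs, set_up_runs, computed_pa, defensive_rpa=0.0):
    """Turn run totals into per-plate-appearance RPA components.

    Hypothetical helper: the names are invented for this sketch, the
    numbers used below are the ones from the example in the text.
    """
    drive_in_rpa = drive_in_runs / computed_pa   # 50 / 500
    set_up_rpa = set_up_runs / computed_pa       # 45 / 500 (already on-base adjusted)
    offensive_rpa = drive_in_rpa + set_up_rpa
    total_rpa = offensive_rpa + defensive_rpa    # defensive rating is added (or subtracted)
    return drive_in_rpa, set_up_rpa, offensive_rpa, total_rpa


# A -13 defensive rating is shorthand for -.013.
drive_in, set_up, offense, total = rpa_breakdown(50, 45, 500, defensive_rpa=-0.013)
print(f"drive-in {drive_in:.3f}, set-up {set_up:.3f}, "
      f"offense {offense:.3f}, total {total:.3f}")
```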
In many cases, however, the defensive liabilities of a player can entirely negate his offensive contributions and, vice versa, a very ordinary hitter may turn out to be an extraordinary asset based solely on his defensive abilities.
The Set-up RPA and the Drive-in RPA figures can be used to see how effective each batter is. They also enable us to indicate the spot the hitter should occupy in a particular line-up. If he is particularly adept at setting up runs, we will want to place him in front of those hitters that are particularly good at driving him in. As indicated in last year's edition there should be two or more "bumps" in each properly designed line-up, i.e., two sets of set-up and drive-in hitters: a stronger set at the top of the line-up and a lesser set after that. Each "bump" refers to a set of set-up type hitters leading up to a set of drive-in type hitters. Without the two "bumps" you would end up wasting the set-up ability of the hitters in the latter part of a line-up.
AGE: THE MOVING TARGET
So many older free agents, either just past their prime or even far past their prime, have received whopping, multi-year contracts as if they were the same players they were a few years ago or as if they will remain the same players they are today for years down the road. Some of these players may retain their skills over a longer period of time in a few lucky cases, but the abilities of most individual players change relatively rapidly over time. Player abilities are truly moving targets, and it is vitally important to understand this when one is estimating how a player will perform in a coming season. The greatest single variable is age.
Based upon the study by Brom Keifetz, which was first published in the 1993 edition, we have age change figures which can be used to adjust the player's overall RPA rating for the coming season. These age changes are in terms of RPA adjustments to the basic RPA rating listed in the position-by-position rankings and in the Pitcher rankings. These age-related RPA adjustments are not included in the RPA ratings and player data listings in each individual team's section since these player ratings show the individual season performance ratings for years which are in the past. The age change RPA rating is only to be used for prediction purposes. Predicting player performance for the coming season is exactly the purpose of the position-by-position player and pitcher ranking section and that is why the age adjustment is listed there.
A LITTLE MORE MATHEMATICS
The on-base ability of a hitter is a little complicated to determine but, like most everything else in this book, it does not require a higher math education to understand. What we need to know is how many runners are left on base after a particular hitter bats for the rest of the line-up to knock in.
In order to determine the number of runners on base at the completion of a plate appearance it is necessary to make subtractions for some of the offensive acts previously listed in the linear weights listing. A home run, for instance, has a negative effect on the number of players on base. In fact, there never is anyone on base after a home run. Yet the on-base averages listed in all the newspapers and books include homers. These on-base averages listed in the newspapers are not wrong, they just aren't useful as tools. They are flat statistics just sort of lying there doing nothing very useful. What we want and need is a tool for understanding the internal dynamics of a line-up. Yes, the homer should be awarded a plus in the standard on-base percentages for historical purposes, but for our purposes of attempting to understand how to structure a line-up we must have a tool which as accurately as possible can tell us how many runners will be on base, on average, after a particular hitter finishes his turn at bat.
Remember, the purpose of this part of the formula is to determine how many runs a player sets up for those batters that follow him in the line-up. Let's look at the home run as an example of such a subtraction. It is true that the homer puts the batter on the bases for his trip around them, for which we must certainly account. Our on-base part of the formula will add the batter as a base runner as is traditionally done. We need, however, to account for the fact that the home run hitter also removed himself from these same base paths -- and even more important -- that he removed whoever was already on base. This results in a plus 1 runner minus 1.44 runners (see above values) for a net on-base production of minus 0.44 for a home run.
Here are the on-base adjustment values needed to determine a hitter's Set-up RPA:
Add: total of walks + hits; to this add .10 (to account for the extra base) times the number of doubles, stolen bases, and extra bases taken (or wild pitches and balks for pitchers) plus .20 times the number of triples.
Subtract: the number of times caught stealing and picked-off and thrown out attempting to take an extra base and the number of double plays grounded into and .29 times the number of singles, .41 times the number of doubles, .70 times the number of triples and 1.44 times the number of home runs.
After getting the total for the above on-base production, we need to divide this figure by the number of computed plate appearances to get a per plate appearance average. A hitter with an on-base total of 110 in 500 computed plate appearances would have a .220 on-base appearance average.
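The add and subtract rules above can be written out as a short Python sketch. The function and argument names are hypothetical, chosen only for this illustration; the coefficients are the ones listed in the text, and `outs_on_bases` is assumed to lump together caught stealing, pick-offs and runners thrown out taking an extra base.

```python
def on_base_production(singles, doubles, triples, home_runs, walks,
                       stolen_bases=0, extra_bases_taken=0,
                       outs_on_bases=0, gidp=0):
    """Net runners left on base, following the add/subtract rules in the text."""
    hits = singles + doubles + triples + home_runs
    added = walks + hits
    added += 0.10 * (doubles + stolen_bases + extra_bases_taken)  # extra base
    added += 0.20 * triples
    removed = outs_on_bases + gidp
    removed += (0.29 * singles + 0.41 * doubles
                + 0.70 * triples + 1.44 * home_runs)
    return added - removed


def on_base_rating(production, computed_pa):
    """Per-plate-appearance average of the on-base production total."""
    return production / computed_pa


# A lone home run nets out to about -0.44 runners, as described earlier.
home_run_net = on_base_production(0, 0, 0, 1, 0)

# The text's example: an on-base total of 110 in 500 computed plate appearances.
rating = on_base_rating(110, 500)
print(f"home run net {home_run_net:.2f}, rating {rating:.3f}")
```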
The recent year-to-year major league computed per plate appearance average is about a .208 rating per batter. The individual player, pitcher or team per plate appearance average can then be compared to this major league average to get the individual player or team on-base adjustment figure.
Say that a batter's computed on-base production rating is .200 per plate appearance. This is poorer than the .208 average. We divide the player's .200 rating by the league average .208 rating to get an adjustment figure of .9615. This figure is then multiplied against one-half of the total run production determined by the linear weights listed above. Remember, the reason that we multiply it against only one-half is because we are adjusting for one-half of a batter's responsibility, i.e., his ability to set up runs as opposed to driving in runs.
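As a rough sketch of this step in Python, the adjustment figure and its application to one-half of the run production might look like the following. The function names are assumptions made for the illustration, and the 100-run total is borrowed from the earlier set-up example rather than from any real player.

```python
LEAGUE_ON_BASE_RATING = 0.208  # recent major league average quoted in the text


def on_base_adjustment(player_rating, league_rating=LEAGUE_ON_BASE_RATING):
    """Ratio of a hitter's on-base rating to the league average."""
    return player_rating / league_rating


def split_runs(linear_weight_runs, adjustment):
    """Leave the drive-in half alone; scale the set-up half by the adjustment."""
    half = linear_weight_runs / 2
    return half, half * adjustment


factor = on_base_adjustment(0.200)
drive_in_runs, set_up_runs = split_runs(100, factor)
print(f"adjustment {factor:.4f}, drive-in {drive_in_runs:.1f}, "
      f"set-up {set_up_runs:.2f}")
```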
A TEAM RPA FORMULA WORK-UP
As indicated earlier, the RPA is the result of testing statistical data to find values which have been shown to be accurate in predicting a team's run scoring.
Here is an example of a team's actual run scoring as compared to the estimated result created by use of the RPA formula:
Atlanta Braves (1990 season)
Actual runs scored = 682
Singles = 925: 925 x .29 = 268.3 runs
Doubles = 263: 263 x .41 = 107.8 runs
Triples = 26: 26 x .70 = 18.2 runs
Home Runs = 162: 162 x 1.44 = 233.3 runs
Hit subtotal = 627.6 runs
To the above we add:
Walks + HBP = 500: 500 x .165 = 82.5 runs
Stolen Bases + Wild Pitches + Balks = 162: 162 x .10 = 16.2 runs
Adjusted subtotal = 726.3 runs
From the above we subtract the following:
Caught Stealing + Grounded into Double Play (GIDP) = 157:
157 x -.165 = -25.9 runs
new subtotal = 700.4 runs
Before we adjust for on-base production we need to add or subtract the value of the intentional walk. This is used for team computations only. The American League value for the intentional walk is plus .165 runs, which is the same value as a regular walk. There is little or no advantage to the intentional walk in the American League, due to the existence of the DH rule, since a regular hitter almost always follows.
The National League value is considerably different due to the fact that the intentional walk usually precedes the pitcher, which results in the waste of many scoring opportunities. The value of the intentional walk, therefore, is minus .04 runs (plus .3 plate appearances):
Adding the value of the intentional walk:
Intentional Walks = 37: 37 x -.04 = -1.5 Runs
TOTAL OF COMPUTED RUNS FOR THE ATLANTA BRAVES DURING THE 1990 SEASON
(PRIOR TO THE ON-BASE ADJUSTMENT) = 698.9
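The work-up so far is a weighted sum, which can be checked with a short Python sketch. The dictionary keys and the function name are invented for this illustration; the weights and the Atlanta counts are the ones given above. Because the sketch does not round each line to one decimal, its total differs from 698.9 by a few hundredths of a run.

```python
# Linear weights as used in the team work-up (National League intentional walk value).
WEIGHTS = {
    "singles": 0.29,
    "doubles": 0.41,
    "triples": 0.70,
    "home_runs": 1.44,
    "walks_hbp": 0.165,
    "sb_wp_balks": 0.10,
    "cs_gidp": -0.165,
    "intentional_walks": -0.04,
}

# 1990 Atlanta Braves totals from the text.
BRAVES_1990 = {
    "singles": 925,
    "doubles": 263,
    "triples": 26,
    "home_runs": 162,
    "walks_hbp": 500,
    "sb_wp_balks": 162,
    "cs_gidp": 157,
    "intentional_walks": 37,
}


def linear_weight_runs(counts, weights=WEIGHTS):
    """Sum of (event count x run value) over every event supplied."""
    return sum(weights[event] * n for event, n in counts.items())


hit_events = ("singles", "doubles", "triples", "home_runs")
hit_subtotal = linear_weight_runs({e: BRAVES_1990[e] for e in hit_events})
computed_runs = linear_weight_runs(BRAVES_1990)
print(f"hit subtotal {hit_subtotal:.1f}, computed runs {computed_runs:.1f}")
```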
THE ON-BASE ADJUSTMENT
Atlanta's on-base Production:
Total = 1876 hitters who reached base
To the 1876 that reached base we add the following adjustments:
Stolen Bases + Wild Pitches + Balks + Doubles = 425: 425 x .10 = 42.5
Triples = 26: 26 x .20 = 5.2
Adjusted on-base subtotal = 1923.7
From this subtotal we subtract:
Caught Stealing + GIDP = 157
Hit subtotal (from above) = 627.6
On-base Production Total = 1139.1
The on-base production total will be divided by the computed plate appearance total which is worked out as follows:
At-bats (less home runs) = 5342
Walks + HBP = 500
Intentional Walks (37 x .3) = 11.1
GIDP = 102
Plate Appearance Total: 5955.1
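The plate appearance total is a simple sum, sketched here in Python. The function name and arguments are hypothetical; the .3 weight for the intentional walk is the National League figure quoted above, and no American League figure is assumed.

```python
def computed_plate_appearances(at_bats_less_hr, walks_hbp,
                               intentional_walks, gidp, ibb_weight=0.3):
    """Computed plate appearances for the set-up half of the formula.

    Home runs are excluded from at-bats before calling this function,
    and each intentional walk counts as only 0.3 of a plate appearance.
    """
    return at_bats_less_hr + walks_hbp + ibb_weight * intentional_walks + gidp


braves_cpa = computed_plate_appearances(5342, 500, 37, 102)
print(f"{braves_cpa:.1f}")
```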
No plate appearance is charged for home runs since we are only dealing with that half of the run production formula dealing with setting up runs. The home run hitter is almost like the ghost who was never there but managed to leave his mark by causing runners who may have been on base at the time to "disappear". As explained in previous years, this non-inclusion of the home run in plate appearances is mostly a subjective judgment on my part. In any case, the hitter is still charged with a plate appearance for the rest of the formula.
On-base production results:
Total on-base production = 1139.1
Divided by plate appearances = 5955.1
Atlanta's On-base Rating for 1990 = .191
This rating is lower than the .208 rating which makes it below normal. We'll have to make a subtraction for this poorer rating:
Atlanta's On-Base Rating = .191
Divided by Normal Rating = .208
Atlanta's On-Base Production Variant = .918
Computed Runs (see above) = 698.9
Multiplied by On-Base Production Variant (698.9 x 0.918) = 641.6
Unadjusted Difference = -57.3 runs
Remember: this is an example using 1990 Atlanta Braves' data to determine that portion of the formula dealing with on-base production. We must cut the above subtotal in half, since we are only modifying that half of the run production for each team which is accounted for by the on-base setting up of runs:
-57.3 divided by 2 = -28.6 runs
We then subtract the 28.6 runs from the previously listed total of Atlanta's computed runs for 1990 which equaled 698.9 to get a final total of 670.3 runs as estimated by the formula. Since the actual runs scored by Atlanta were 682, we are only off by a little less than 12 runs.
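The whole on-base adjustment for a team can be sketched as one small Python function. The function name is an assumption made for this illustration. To stay close to the printed work-up, the sketch rounds the on-base rating and the variant to three decimals, as the text does; it lands within a tenth of a run of the 670.3 figure because the text also rounds the halved difference.

```python
LEAGUE_ON_BASE_RATING = 0.208  # the "normal" rating used in the text


def estimate_team_runs(computed_runs, on_base_production, computed_pa,
                       league_rating=LEAGUE_ON_BASE_RATING):
    """Apply the on-base adjustment to one-half of the computed runs."""
    # Mirror the three-decimal rounding used in the printed example.
    rating = round(on_base_production / computed_pa, 3)
    variant = round(rating / league_rating, 3)
    difference = computed_runs * variant - computed_runs
    # Only the set-up half of run production is adjusted.
    return computed_runs + difference / 2


estimate = estimate_team_runs(698.9, 1139.1, 5955.1)
miss = 682 - estimate  # actual runs scored minus the estimate
print(f"estimate {estimate:.1f}, off by {miss:.1f} runs")
```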
INDIVIDUAL PLAYER ADJUSTMENTS
It is important to know how each hitter fares against the different types of pitchers he may face. There are 3 types of lefty and 3 types of righty pitchers: groundball, flyball and neutral. Groundball and flyball match-ups are very important, although not as important as lefty-righty match-ups. One reason may be that we are dealing with much smaller data samples for groundball and flyball match-ups than for lefty-righty match-ups. Another reason is that a pitcher may develop a new pitch, lose some speed on his pitches or make some other adjustment which alters the type of pitcher he is, but he can't alter which arm he throws with.
In this book there are proposed line-ups for each major league team for all of these six types of pitchers these teams may face. There is one serious proviso, however. There really are twelve types of pitchers since there is a very substantial group of pitchers who throw as if they were pitching from the opposite side. These pitchers I refer to as "reverse-type" pitchers. Therefore, there are six types each of standard and of reverse-type pitchers for anyone contemplating the correct setting-up of any particular lineup. In this program, instead of simply saying "lefty" or "righty" after each pitcher's name following the word "Throws:", I put in a more complete description such as "Flyball type, moderate reverse righty" for Cal Eldred. The meaning of this cryptic remark is that Cal Eldred throws from the right side, but actually pitches like a moderate left-hander, and he's a primarily flyball producing pitcher.
We will include the Set-up and the Drive-in RPA ratings for each hitter (to which we will add the adjustments for age) against each of the 6 types of pitchers faced and include these ratings in the above-mentioned proposed line-ups. These Set-up and Drive-in ratings for each hitter against each type of pitcher to be faced are in the team section of this book. Be prepared for many surprises! These proposed line-ups are anything but traditional. As stated in previous editions, for example, I don't go along with the belief that you have to have a speedy runner at the top of the line-up. Speed is only a bonus. You can't steal first base. An even worse practice, in my view, is placing a slow contact hitter in the number two position. This is a prescription for the worst of all offensive disasters: the double play! Better the slow hitter hits in front of the speedy one, but only if he makes up for his slow speed by getting on base frequently. Otherwise he should be farther back in the lineup.
THE DEFENSIVE RATING
The defensive rating, as used in this book, was first introduced in the 1991 edition. It is, unquestionably, my proudest contribution to the rating of player talent. While there are various methods that exist, at least in part, for the rating of the offensive abilities of baseball players, this cannot be said for rating players defensively. There is no other method of statistical analysis of defensive data that I am aware of that in any way approaches the method published in this book as a realistic evaluation of defensive ability.
An example of how ridiculous you can get when you don't understand how to analyze statistical data is the so-called "Tag Ratings" published in Baseball Weekly. These ratings, which are put out by Pete Taggert, claim to take into account the full range of player performance, both offensive and defensive. I usually have ignored these "Tag Ratings" since they were obviously incorrect and only seemed to deal with superficial offensive data. But when I saw the "Tag Ratings" which took up an entire page in the October 5-11, 1994 issue (pg. 35) of Baseball Weekly I almost fell over laughing. These "Tag Ratings" pathetically attempted to rate defensive performance and ended up with the incomprehensible decision that Jody Reed (!) and Cecil Fielder (!!) were the two best defensive players in the American League. Give me a break! Unbelievably, Devon White and Kenny Lofton, two of the greatest defensive players in the game today, didn't even appear anywhere on this silly list of the top defensive players in the American League! I was laughing so hard, and unfortunately for Scott Kaufman, he was the only one at Baseball Weekly that I could get ahold of at that very moment. I apologize to him if he was offended by my laughter over the telephone. I know I probably would have been offended since Scott wasn't the perpetrator of this stupidity. It was just that I couldn't resist. I just had to talk to someone at Baseball Weekly to share my guffaws at the terrible waste of space that page 35 had been put to. A few years ago I wouldn't have been so hard on Mr. Taggart or Baseball Weekly since rating defensive ability was entirely an art rather than a science until I developed the method in this book. But my method has been publicly available since 1991, so no excuse remains for the existence of such silly defensive rating systems as exhibited by these "Tag Ratings".
Let me say this loud and clear: the defensive ratings in this book are at least as good as the offensive ratings, if not better.
How can I be so certain of this? Isn't defensive ability more difficult to quantify than offensive ability? I certainly used to think so myself.
As it turns out, defense may actually be easier to quantify and rate. For instance, there are no walks, no hit-by-pitch on defense and the stolen base only affects the catcher's rating. It's somewhat simpler to rate defense since all we need to really know is how to determine the areas of responsibility for each player on defense, and the direction, distance and outcome for every batted ball into a particular player's defensive area of responsibility.
As with all my data, I use all available defensive statistics for the previous two years. If a player, say Ozzie Smith, is playing shortstop for a particular team in a particular stadium, in this case the St. Louis Cardinals in Busch Stadium, I take the total defensive computed RPA's of all the visiting shortstops playing against the Cardinals at shortstop at Busch Stadium and directly use this total average RPA of these shortstops as the measuring stick with which to judge Ozzie Smith's performance. I then do the same comparison for Ozzie Smith against all these other shortstops for all the away parks for when the Cardinals are on the road.
As an example, let's take fictional player A. If the average defensive RPA for all visiting players who played his position at his stadium was .080 per batted ball through his position, and this player's defensive RPA worked out to .095 per batted ball, then this player would be rated negatively on defense at his home park. Let's say that, on average he would be expected to have 275 balls hit through his position in any one season for the 81 games at his home park. He would be giving away .015 runs (.095 minus .080) on each and every ball hit through his area. Multiply .015 by the total number of expected balls through his area (275) and you get a total of 4.125 runs given up by this fielder over the 81 games at his home park. Let's say we did the same thing for the 81 games where he played at the visitor's park, and using the RPA's garnered at those parks it turned out that his deficit was not as bad, say 2.200 runs given away. The total gift to the opposing team from this porous defender would come to 6.325 runs over a 162 game season if he were to play as a regular with such a defensive RPA. If we divide the 6.325 runs by the standard used here of 600 computed plate appearances, we end up with a defensive RPA of minus .01054 which we can safely round off to -.011 or in its shorthand form: -11 basis points. In other words this player gives up .011 runs on defense for each time he comes to the plate.
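The fictional player A example reduces to a few lines of arithmetic, sketched here in Python. The function names are hypothetical and exist only for this illustration; the inputs are the figures from the example, including the assumed 2.200 runs given away on the road.

```python
def runs_given_away(player_rpa, visitors_rpa, balls_through_position):
    """Extra runs allowed relative to visiting players at the same position."""
    return (player_rpa - visitors_rpa) * balls_through_position


def defensive_rating(home_runs_given, road_runs_given, standard_pa=600):
    """Express the season's runs given away per computed plate appearance.

    A fielder who gives runs away gets a negative rating.
    """
    return -(home_runs_given + road_runs_given) / standard_pa


home_deficit = runs_given_away(0.095, 0.080, 275)   # about 4.125 runs at home
rating = defensive_rating(home_deficit, 2.200)      # about -.01054
basis_points = round(rating * 1000)                 # shorthand form
print(f"home deficit {home_deficit:.3f}, rating {rating:.5f}, "
      f"shorthand {basis_points}")
```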
ADJUSTING FOR THE STADIUM
Since the defensive RPA's, as demonstrated above, already include the stadium and the position played at that stadium in computing them, no further stadium adjustment need be made. There will be some adjustments in terms of finding the median rating for players at that position, but nothing more than that is required.
The offensive RPA formula, as it is applied to individual players and pitchers, is further adjusted by the use of stadium variants. The purpose of these variants is to enable us to equalize the offensive ratings for all the players regardless of the stadium in which they play. Some stadiums greatly enhance offensive production and others reduce this production. A single or homer at Wrigley Field cannot have the same value as a single or homer at Shea Stadium, since Wrigley Field is a tremendous hitters' ballpark and Shea Stadium is a pitchers' ballpark.
Stadium variants are not constant from year-to-year due to many factors -- primarily the weather. This is why combining multiple-year stadium variants is incorrect. In this year's introduction I again referred to my article in the 1990 edition on Howard Johnson's illusion of having a bad year in 1988 when, in fact, he had actually improved as a hitter. Had I combined the stadium variants for the high-scoring 1987 season and the low-scoring 1988 season, I would never have discovered this fact. I would have been just like the Mets management and would have thought he had a poor year (although I would never have tried to trade him after that season as they tried to do!). The weather can have an enormous effect on offensive production from one year to the next. It is usually quite cold in April and sometimes even well into May in many parts of this country. Those of you who have had the decidedly unenjoyable experience of hitting a baseball on very cold days know how the ball clangs off the bat and the bat rattles in your hand, and you should easily appreciate how cold weather reduces the number of home runs.
How is our stadium adjustment accomplished? Each major league team plays the exact same number of games at home, against the exact same teams as it plays on the road. Therefore, you can add up all the data for both the home and visiting clubs at a particular stadium. You can then apply the RPA formula to determine the average value of a plate appearance at a particular stadium in a particular year. You can then do exactly the same thing for all the offensive acts performed when these same teams are on the road and then compare the results.
Let's look at an example from a hitter's park:
Camden Yards 1994:
Baltimore at Camden Yards:
2115 Computed Plate Appearances should have produced 276.98 runs via the RPA formula.
276.98 divided by 2115 CPA's results in a .1310 RPA for all Baltimore hitters.
Baltimore at away parks:
2307 Computed Plate Appearances results in 290.08 runs via the RPA formula.
290.08 divided by 2307 CPA's results in an RPA of .1257 for all Baltimore hitters.
Baltimore Opponents at Camden Yards:
2148 Computed Plate Appearances should have produced 269.72 runs via the RPA formula.
269.72 divided by 2148 CPA's results in a .1256 RPA for all opponent hitters.
Baltimore opponent at opponent's park:
2128 Computed Plate Appearances should have produced 255.70 runs via the RPA formula.
255.70 divided by 2128 CPA's results in a .1202 RPA for all opponent hitters.
As you can clearly see, both the visiting players and the home team's players had their RPA's rise by more than 5 RPA points when playing at Camden Yards as opposed to the opponent's park.
In order to take the above data and produce a stadium variant we combine the data for both Baltimore and its opponent. In other words we add the Computed Plate Appearances at Camden Yards (2115 + 2148) and add the theoretical runs produced (276.98 + 269.72). We then divide this new total runs produced (546.70) by the new total of Computed Plate Appearances (4263). This results in a total RPA at Camden Yards (unfortunately, I need to carry this result to 6 digits in order for the overall result to be accurate to 4 digits) of .128243 runs per plate appearance. Doing the same for Baltimore and its opponents at the opponent's park results in 545.78 runs divided by 4435 Computed Plate Appearances. This results in a total RPA at the opponent's park of .123062 runs per plate appearance.
The final step is easy. Since we now realize that offense comes relatively easy at Camden Yards, we need to reduce offensive production data that is produced there so as to compare a player's performance to a player at a less offensive park. We simply divide the RPA result for the opponent parks (.123062) by the RPA result for Camden Yards (.128243) which results in a stadium variant for Camden Yards of 0.9596.
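The Camden Yards calculation can be reproduced with a short Python sketch. The function and variable names are assumptions made for this illustration; the run and plate appearance totals are the 1994 figures listed above.

```python
def combined_rpa(runs, plate_appearances):
    """Pool both clubs' theoretical runs and computed plate appearances."""
    return sum(runs) / sum(plate_appearances)


# Baltimore and its opponents at Camden Yards.
camden_rpa = combined_rpa([276.98, 269.72], [2115, 2148])

# The same clubs when playing at the opponents' parks.
other_parks_rpa = combined_rpa([290.08, 255.70], [2307, 2128])

# A variant below 1.0 scales down offense produced in a hitters' park.
stadium_variant = other_parks_rpa / camden_rpa
print(f"Camden {camden_rpa:.6f}, other parks {other_parks_rpa:.6f}, "
      f"variant {stadium_variant:.4f}")
```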
The difference for play at Camden Yards was over 5 RPA points higher per plate appearance. When evaluating two players, these 5 points can be quite significant, especially if the player is being compared to other hitters whose performance took place at a pitcher's park where the hitter RPA's were shifted several more points in the opposite direction. Without adjusting for the stadium, a person could get a wildly inaccurate idea of the ability of a particular player.
Stadium variants can be applied only to one-half of a player's rating since he plays only half his games at his home park.
MORE ON THE DEFENSIVE RATING
Baseball is a sport filled with illusions, and nowhere else are these illusions more evident than on defense. In baseball, only one team takes the field at a time. This is the main source of the illusions created in our wonderful game. In other team sports both teams take the field at the same time and any player unable to perform in some fundamental way will be undressed publicly by his opponent. It's the most direct measure of skill, and easy to spot, even for the untrained eye. In baseball, however, the player guards territory, not another player. The player, therefore, who can guard the most territory will be the most valuable on defense -- regardless of skills. Skills help make the fast and quick player even better but usually cannot rescue a lead-footed player from being rated a defensive liability in these pages. In most cases, skills only create the illusion of competence in the slow-footed player.
The linear weights for singles, doubles, etc., as listed previously, are used to determine the number of runs given up at a particular fielding position at each stadium. For instance, the Dodger Stadium infield was the most difficult infield to play in the major leagues in 1990. More runs were produced by the fielding conditions in the infield at Dodger Stadium than at any other stadium in the major leagues! This means that when the very same players who played a particular infield position, both for the Dodgers and the visitors, went to the visitor's home stadium, these same fielders were turning batted balls into outs that were hits or errors at Dodger Stadium.
This data can be further broken down to show how each infield position contributed to this effect in that particular year or over a period of years. In the team section I will list the approximate difficulty for each position at each stadium, based on the last three years of data.
For the purposes of this study, the individual player defensive statistics I have utilized are limited to the overall area that is supposed to be covered by a defensive player and the outcome in terms of RPA. This has proven sufficient for our needs. I only utilize ground ball data for the infielders and fly ball data for the outfielders in making these ratings.
Why not break the defensive data down further by showing the RPA in terms of direction within an area of defensive responsibility or distance for the flyballs to the outfielder? Because, in statistics, as you break the data down into smaller units you fall prey to subjective judgments based on too few occurrences.
What causes stadium differences on defense? The causes are many, and at some parks they are more obvious than at others. Left fielders at Fenway obviously cannot be adequately rated against left fielders at other ballparks without a huge stadium variant. A third baseman playing on a grass field will often encounter playing conditions that differ from those at other grass fields, and even bigger differences when moving onto a plastic surface.
Some infields are harder and/or more liable to bad bounces than others. Some have their grass cut lower than others. Some have their foul lines more sloped than others. Even the plastic infields can play differently from each other. Some are faster and others slower. Some are new and others old and worn. The lighting conditions, background and noise can have a noticeable effect. Heat, moisture, wind, glare, altitude and weather conditions all vary from stadium to stadium. All these elements can come into play to create peculiar playing conditions at every position at every ballpark.
These varied conditions are one of the most beautiful aspects of baseball. They are part of the real charm of this game. It would be a true shame, in my eyes, if the lords of baseball ever got it into their minds to attempt to homogenize our beautiful national pastime by eliminating these stadium differences. Judging from the construction of the newest ball parks, like Camden Yards, we don't have to worry, since it is clear that this uniqueness is consciously being preserved.
THE DEFENSIVE CHART
Stats, Inc.'s directional chart for all batted balls is very logical. This is the actual chart used by all the Stats, Inc. scorers to record direction and distance for all batted balls. The directional areas run in a fan-shaped pattern from the foul territory adjacent to the left field stands (letter "A") over to the foul territory adjacent to the right field stands (letter "Z").
Note: In the position-by-position ratings you will find the defensive rating for each player in the column headed "DEFENSE".
Here then are the assigned defensive areas for each defensive position, utilizing the lettering system on the previously referred to Stats, Inc. chart:
The catcher's defensive ratings are much simpler. We merely add up all the stolen bases plus passed balls and multiply this figure by 0.10 runs and then subtract 0.165 runs times the total number of runners caught stealing plus runners picked off. We take the difference and average it out over the number of outs produced while he was catching and add 8 points. We add 8 points since the median rating on defense for the catcher position gave away 8 points and we need to compare individual catcher values to the median values of players at other positions.
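The catcher calculation described above can be sketched in a few lines of Python. The run values (0.10 and 0.165) and the 8-point adjustment come from the text. Everything else here is an assumption made for illustration: the function and parameter names are hypothetical, a "point" is taken to be one thousandth of a run per out, and the rating is taken to be positive for a catcher who gives away fewer runs than the median. The text does not spell out these conventions, so treat this as one plausible reading rather than the book's exact procedure.

```python
# Run values taken from the text.
SB_PB_RUN_VALUE = 0.10     # runs charged per stolen base or passed ball
CS_PO_RUN_VALUE = 0.165    # runs credited per runner caught stealing or picked off
MEDIAN_OFFSET_POINTS = 8   # the median catcher "gave away 8 points"


def catcher_defense_points(sb, pb, cs, po, outs, points_per_run=1000):
    """Hypothetical sketch of the catcher defensive rating.

    ASSUMPTIONS (not stated in the source): one point equals 1/1000 of a
    run per out, and a higher rating means better defense.
    """
    # Net runs given away: charges for SB and PB, minus credits for CS and pickoffs.
    net_runs_given = SB_PB_RUN_VALUE * (sb + pb) - CS_PO_RUN_VALUE * (cs + po)
    # Average the difference over the outs produced while he was catching.
    runs_per_out = net_runs_given / outs
    # Convert to points, flip the sign so that giving runs away is negative,
    # then add the 8 points that move the median catcher to zero.
    return -runs_per_out * points_per_run + MEDIAN_OFFSET_POINTS
```

Under these assumptions, a catcher charged with 80 stolen bases and passed balls, with no runners caught or picked off, over 1,000 outs gives away 8 points and so comes out at essentially zero, which is where the text places the median catcher.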
USING THE MEDIAN
The defensive ratings at each position are adjusted so that half the players at that position fall above the median rating and half fall below it. Adding 8 points establishes the proper relationship of each catcher to the median value for all catchers and to the median values of players at all other positions. In fact, when I look at the raw defensive ratings at any position, I find that the median rated player sits somewhat away from a zero (neutral) rating, so I adjust each player's rating at each position to place the median rated defensive player at the zero point on my scale. All the defensive ratings are adjusted in the same manner as the catcher's so as to produce this balance. I have found that the use of the median is the best method for comparing groups of players. All the ratings in this book, including the pitcher RPA ratings and the overall player RPA ratings, are judged against the median of all players at their position. A player's (+) or (-) rating in the team-by-team analysis section is a comparison to this median for all players at a particular position.
The minor league ratings have been adjusted to correspond to their Major League equivalents, after taking all the necessary factors into account, and are then added to the Major League rating. The important thing to remember is that these minor league ratings do work, as has been shown in previous editions. In other words, the average difference between a player's RPA rating in his last minor league season and in his first Major League season is approximately the same as the average season-to-season difference in a Major League player's annual RPA rating.
In the individual player data in the team section, take Kelly Stinnett of the Mets as an example. On the top line (above the line which breaks down the hitter's RPA vs. the six types of pitchers and the set-up/drive-in and overall two-year RPA), after "Bats: Right", there's a series of numbers which indicate that over the past three years Kelly has had 90 plate appearances against lefties (90 L), 78 against righties (78 R), 32 against groundball-type pitchers (32 G) and 27 against flyball-type pitchers (27 F). Three years of data are used for the lefty-righty and groundball-flyball RPA ratings because two years would yield too small a sample in most cases, since we are analyzing pieces of the whole rather than the whole. As you can see, Stinnett's sample sizes for the groundball and flyball ratings are just too small to make a real RPA judgment. To the right of this line of data for Stinnett is the figure 310 ML, which means that his 310 computed minor league plate appearances over the past two years were integrated into his overall two-year RPA rating. The final item on this line reads: MLrpa: .094, which indicates that those 310 computed minor league plate appearances yielded a Major League equivalent RPA of .094.
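The median adjustment itself is simple arithmetic: find the median raw rating at a position and subtract it from every player at that position. Here is a minimal Python sketch; the function name and the sample ratings are hypothetical, invented only to show the mechanics, and are not taken from the book.

```python
from statistics import median


def center_on_median(raw_ratings):
    """Shift one position's raw ratings so the median player sits at zero.

    raw_ratings maps a player name to his raw defensive rating in points.
    """
    mid = median(raw_ratings.values())
    return {player: rating - mid for player, rating in raw_ratings.items()}


# Made-up raw ratings for three players at one position.
adjusted = center_on_median({"Player A": -12, "Player B": -8, "Player C": -3})
# Player B is the median player, so his adjusted rating is 0;
# Player A lands below the median (-4) and Player C above it (+5).
```

If the median raw catcher rating is -8 points, this reduces to the "add 8 points" step described for catchers.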
THE AGE ADJUSTMENT
In the 1993 edition the RPA age adjustment was introduced and successfully utilized in order to understand the dynamic process of aging and make educated estimates of how individual player RPAs change as a player gets older. This RPA age adjustment was based on a study by Brom Keifetz, published in the 1993 edition, which drew on the year-to-year RPA changes for individual players published in previous editions of my book.
If you look carefully at the chart of age changes which is printed on this page you'll notice something contrary to "common knowledge" amongst baseball fans and media "experts". The pitchers, in the age change adjustment chart, reach their peak performance level at the age of 27 while the hitters reach their performance peak at the age of 30. This is no mistake! The hitter, on average, does reach his peak later than does the pitcher!
It should be noted that the pitchers, however, tend to drop off from their peak at a slower rate than do the hitters and this slower dropoff of performance may account for some of the unfounded belief that pitchers peak later than hitters.
Remember, however, that this is a list which shows expected, not actual, age-related RPA changes. Each individual player's or pitcher's age-related performance change may vary greatly from this pattern. In other words, the age-related changes in this listing are a tool for understanding the probable direction in which a particular player's performance is headed. This is particularly useful when we are assembling a 25-man team. While individual players, due to chance-related events, may have age-related changes which vary greatly from the list on this page, the 25 players that we put together as a team are more likely to conform to the pattern of age-related changes that was revealed in Brom Keifetz's beautiful study. In other words, just as in other aspects of this game, we are attempting to play the percentages. Our judgment on a particular player may not pan out, but on the basis of 25 players it is more likely that our judgments will pan out and produce a successful team.
Here's the chart for RPA age changes for hitters and pitchers which will be applied to the two-year averaged RPA ratings:
There you have it!
The RPA is a comprehensive rating that takes into account offensive ability, defensive ability, pitching, the position played, the stadium where the performance took place, the age of the player and even his minor league performance.
The most complete and accurate picture of player ability ever produced!