The following appeared in The 1985 Bill James Baseball Abstract.
— Bill James
Last fall I received a letter from an Illinois man named Paul Johnson who claimed that he had developed a very simple method of evaluating run production that was more accurate than runs created. I receive a couple of those letters a year, and it rarely takes me five minutes to get them off my desk. Peter Palmer in The Hidden Game makes a similar claim for the linear weights method, and Pete is a good friend and an outstanding analyst of the game, but in fact linear weights do not meet any acceptable standard of accuracy in assessing an offense . . . well, I went through all that in the historical book.
I spent a long time developing the runs created formula, and I've spent many hours looking for ways to improve it. My assumption has been that runs created could be improved, but that when improvements came, they would have to come from dredging through the minor offensive stats and extending the decimal points. But Johnson seemed reasonably intelligent, and I thought I owed him the courtesy of at least checking out his system.
And damned if it doesn't. I was astonished. I'm not certain that Paul's "Estimated Runs Produced" method is more accurate than runs created—we'll get into that at the conclusion of the piece—but I am convinced that it is an extraordinarily good method. It's accurate, it's simple, and it measures what we need to measure: runs. It is more accurate than runs created for certain types of players. On that basis, I felt I should ask Mr. Johnson to introduce his method to you.
— Paul Johnson
I've come up with a method for estimating run production which is more accurate than even Bill James' runs created formula. Hard to believe? Well, it's even harder for me to believe that Bill invited me to say so in print and in his book! That speaks volumes about his dedication to furthering the baseball knowledge of his readers. When he finds a new, more accurate system he lets you know about it. Even if it isn't his. With that kind of openness he's going to make baseball statistical analysis a science yet. Nominate that man for the Nobel Prize in Baseball. He'd get my vote. Thanks for the chance to introduce my system, Bill.
The estimated runs produced formula I've developed has the same goal as the runs created formula. Both are designed to calculate the number of runs that individual players produce for their teams. For most average and less-than-average players and teams the two formulas yield very similar results. But when you get into high home run production or outstanding slugging percentages, especially when combined with high on-base percentages, then the two formulas go their own and definitely separate ways.
Checking against the actual statistical records of Major League Baseball shows my estimated runs produced formula to be the more accurate:
1970-1984 MAJOR LEAGUE BASEBALL

                                           Runs/Team   Estimated Runs Produced   Runs Created
Top 10 Teams in Home Runs (avg. 200 HR)       820               821                  832
Top 10 in Slugging Pct. (avg. .448)           836               838                  851
Estimated runs produced is consistently more accurate. The '84 Detroit Tigers led the Majors with 187 home runs. They scored 829 runs, had 830 estimated runs produced, but 843 runs created. Boston led the Majors in slugging percentage in '84 at .441. The Red Sox scored 810 runs, had 816 estimated runs produced, but 832 runs created.
The differences may not seem that great, and they really aren't at these levels of performance. But even though these represent the top team performances since 1970, these team statistics are not very far removed from average levels of play. Individual players rise far beyond these levels. An average team might hit 120 home runs. A team of Mike Schmidts would smack 350. An average team might have a slugging percentage of .390 or less. A team of Jim Rices would slug over .530. At the statistical levels produced by Schmidt or Rice the differences between the two formulas become magnified, and they become very significant. Some illustrations:
In all the World Series and League Championship Series games ever played in which a team had a .650 or better slugging percentage, the teams combined for a .381 batting average and slugged .709. They scored 276 runs, had 279 estimated runs produced, but 331 runs created. A 19% difference in the formulas.
The highest postseason slugging percentage in a single game was recorded by the Cubs in Game One of the '84 National League Playoff. They crashed five home runs out of the friendly confines of Wrigley Field and rolled up an .895 slugging percentage. They scored 13 times, had 13 estimated runs produced, but 17 runs created. This time the formulas differ by 30%.
At the highest levels of statistical performance runs created overestimates run scoring; it seems to be biased. Would that carry over from team statistics to an individual player's figures? It can't be directly proven, but here's a stab at demonstrating that it would.
A team of Babe Ruths would hit three home runs per game and could be expected to hit at least .300. I sifted through every World Series game looking for teams which had a game with those characteristics. I found 14. And their statistics tallied up to look remarkably like a Ruthian season: 1929, to be exact.
                    AB     H    TB    2B    3B    HR    BB    SB    Avg.   Slg.
14 Series Games    509   180   347    27     1    46    65     3    .354   .682
1929 Babe Ruth     499   172   348    26     6    46    72     5    .345   .697
125 runs were scored in the Series games. There were 129 estimated runs produced and 148 runs created. Babe's 1929 estimated runs produced total was 131, his runs created, 149. Both formula figures were nearly identical for the Series games and for Ruth. The estimated runs produced formula gave more realistic results for the World Series games, games which were a virtual mirror image of Ruth's 1929 stats. That doesn't prove that estimated runs produced gives more realistic results for Ruth. But it is highly suggestive.
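The figures quoted for the 14 Series games and for Ruth's 1929 season can be roughly reproduced from the table alone. The sketch below is an illustration, not necessarily the exact calculation Johnson ran: it assumes the simplified estimated runs produced formula given later in this piece (the table has no HP, CS, or GIDP columns), and it assumes the basic runs created formula, (H + BB) x TB / (AB + BB), which is not spelled out in this article. The function names are made up for the example.

```python
def erp_simple(ab, h, tb, bb, sb):
    """Simplified estimated runs produced (no HP, CS, or GIDP needed)."""
    positives = 2 * (tb + bb) + h + sb   # bases gained by batter and runners
    outs = ab - h                        # outs made at the plate
    return (positives - 0.615 * outs) * 0.16


def basic_runs_created(ab, h, tb, bb):
    """Basic runs created (assumed version): (H + BB) x TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# Stat lines copied from the table above.
lines = {
    "14 Series Games": dict(ab=509, h=180, tb=347, bb=65, sb=3),
    "1929 Babe Ruth": dict(ab=499, h=172, tb=348, bb=72, sb=5),
}

results = {}
for name, s in lines.items():
    erp = erp_simple(**s)
    rc = basic_runs_created(s["ab"], s["h"], s["tb"], s["bb"])
    results[name] = (erp, rc)
    print(f"{name}: ERP {erp:.1f}, RC {rc:.1f}")
```

Under those assumptions the results round to 129 and 148 for the Series games and 131 and 149 for Ruth, the same totals quoted in the text.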
So how do the formulas compare in rating some of today's top performances? Well, they both agree that Dwight Evans had the best total production in the American League in '84, 125 estimated runs produced, and 132 runs created. The formulas disagreed in the National League. The estimated runs produced formula gave the nod to Dale Murphy, 120 to 117 over Ryne Sandberg. Sandberg edged Murphy, 126 to 123 in runs created. Why the difference in the NL? Two major factors. The first is that the runs created formula puts more emphasis on being caught stealing and grounding into double plays than does the estimated runs produced formula. Murphy had 6 more of those plays than did Sandberg. The second factor is that Murphy had 20 intentional walks compared with only 3 for Sandberg, and the runs created formula puts a lesser value on intentional walks than on a normal base on balls. The estimated runs produced formula puts equal value on each.
Some will undoubtedly point to that last fact as a flaw in the estimated runs produced equation. Perhaps it is. But I'm not convinced of that. Though it's true that the intentional walk won't advance any base runners it does improve the chances of baserunners advancing on a subsequent walk. I saw the following happen at least three times this year: Runner on second. Batter intentionally passed. Next batter walks, advancing the baserunners one base each, or a total of two bases. So because of the intentional walk, the next walk that occurred advanced men two bases instead of advancing them no bases as would've happened had there been no intentional pass. It seems to tend to even out.
But back to how the formulas rate the players. Again, the major differences in the ratings occur when a team or player combines a high on-base percentage with a high slugging average. And I should say a high net on-base percentage. What I mean by that is the chance of being on base minus the chance of being thrown off the bases by being caught stealing or of wiping someone else off the base paths by grounding into a double play. The net on-base percentage is, of course, essentially the first part of the runs created formula.
When rating a player such as Tony Armas, low net on base percentage, but high slugging percentage, the estimated runs produced formula gives a higher rating. 100 vs. 97 runs created. A player such as Eddie Murray, highest on base percentage in the AL and a high slugging percentage, fares much better using the runs created method, 130 vs. only 120 estimated runs produced. The estimated runs produced method rates Armas 20 runs behind Murray, 120 to 100. The runs created formula rates Armas 33 runs behind Murray, 130 to 97. That's a big difference. Just about a game and a half in the standings.
So which method do you believe? Well, I've certainly got my preference. Although I might be slightly biased I'll take estimated runs produced every time. If you're a runs created fan you probably won't go too far wrong most of the time. But be a little wary of the runs created totals for players who hit over .300, those whose batting average added to slugging percentage exceeds about .700, those with many or very few walks, and for those who steal a mean base. When any of those conditions are in evidence, break out the estimated runs produced formula.
Speaking of which, this might be a good time to get a look at this as yet unseen creation. Here is the estimated runs produced formula:
(2 x (TB + BB + HP) + H + SB - (.605 x (AB + CS + GIDP - H))) x .16 = Runs
To get some understanding of the formula it's easiest to at first ignore the numbers in it. Then the formula can be viewed as two simple sides. The left-hand side consists of Total Bases, Bases on Balls, Hit by Pitcher, Hits, and Stolen Bases. It can be thought of as a collection of the positive contributions a batter or baserunner makes to his team. On the right-hand side are the negatives, the outs made by the batter or baserunner. The outs are figured by adding At Bats, Caught Stealing, and Grounded Into Double Plays, then subtracting the times the batter reached base safely on Hits.
So the formula is really quite simple. The left-hand side tracks the movement of batters and baserunners, the right-hand side keeps tabs on the number of outs made. The numbers exist only to put proper emphasis on the various events. They are essential to making the equation work, but there's no need for me to go into how they came to be what they are. I'll just tell you that it took a hell of a lot of experimenting to settle on the darned things.
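The two-sided structure described above translates directly into a few lines of code. This is a minimal sketch of the formula as printed; the function name and the sample stat line are hypothetical and not taken from any real player.

```python
def estimated_runs_produced(ab, h, tb, bb, hp, sb, cs, gidp):
    """Estimated runs produced, following the two 'sides' of the formula."""
    # Left-hand side: positive contributions (movement around the bases).
    positives = 2 * (tb + bb + hp) + h + sb
    # Right-hand side: outs made by the batter or baserunner.
    outs = ab + cs + gidp - h
    return (positives - 0.605 * outs) * 0.16


# Hypothetical season line, for illustration only.
runs = estimated_runs_produced(
    ab=600, h=180, tb=300, bb=70, hp=5, sb=10, cs=5, gidp=15
)
print(f"Estimated runs produced: {runs:.1f}")
```

Keeping the positives and the outs as separate variables mirrors the way the formula is explained, and it makes it easy to see how much each side contributes to a player's total.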
I will explain just a bit about the origins of the left side of the formula. It's based on charts I made of the number of bases advanced by batters and base runners on various offensive plays. Among other things, two bits of information I got out of those charts were that home runs moved batters and baserunners three times as many bases as did the typical single, and that a base on balls advanced the batter and baserunners only two-thirds as many bases as did a single. Those two facts led to the design of the left-hand side. I just needed a simple way of saying that a home run was three times as good as a single and that a walk was only two-thirds as good as a single. Well, 2 x (TB + BB + HP) + H + SB does it. A single equals 3; 2 x (1 + 0 + 0) + 1 + 0. A home run equals three times that, 9; 2 x (4 + 0 + 0) + 1 + 0. And a walk equals two-thirds of the single, 2; 2 x (0 + 1 + 0) + 0 + 0. The relative values of doubles, triples, and stolen bases are similarly determined and reflect the actual values I found in my charting of movement around the bases.
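The relative values worked out above can be checked mechanically by feeding one occurrence of each event into the left-hand side. The sketch below is only an illustration; the `EVENTS` table and the function name are invented for the example, and hit batsmen are left out because they are counted exactly like walks.

```python
from fractions import Fraction

# Counting stats credited to one occurrence of each event:
# (total bases, walks, hits, stolen bases)
EVENTS = {
    "single": (1, 0, 1, 0),
    "double": (2, 0, 1, 0),
    "triple": (3, 0, 1, 0),
    "home run": (4, 0, 1, 0),
    "walk": (0, 1, 0, 0),
    "stolen base": (0, 0, 0, 1),
}


def left_side_value(tb, bb, h, sb):
    """Left-hand side of the formula: 2 x (TB + BB) + H + SB."""
    return 2 * (tb + bb) + h + sb


weights = {name: left_side_value(*stats) for name, stats in EVENTS.items()}
for name, value in weights.items():
    ratio = Fraction(value, weights["single"])
    print(f"{name:12s} value {value}  ({ratio} of a single)")
```

The single comes out at 3, the home run at 9, and the walk at 2, matching the text; the same arithmetic puts the double at 5, the triple at 7, and the stolen base at 1.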
That's enough on the inner workings of my estimated runs produced formula. But before I go on I just want to mention why the runs created formula seems to overvalue high slugging percentage performances. Remember that the charts I did showed that a typical home run is three times as valuable to scoring as is a typical single. But the runs created formula credits the home run with being four times as valuable as the single:
Home Run 1 hit x 4 total bases / 1 at bat = 4
Single 1 hit x 1 total base / 1 at bat = 1
Too much emphasis is put on the home run. Doubles and triples are similarly overstated. This doesn't create a big problem in fairly accurately assessing run production for most players, however, as they have a pretty normal distribution of extra base hits versus singles. So the two formulas come up with similar results for a majority of players. But not so similar for players with high on-base and slugging percentages. Some more examples from the 1984 season: Don Mattingly had a .381 on-base percentage, slugged .537, had 111 estimated runs produced vs. 121 runs created. Dave Winfield (.393, .515) had 104 estimated runs produced vs. 113 runs created. Keith Hernandez (.409, .449), 100 vs. 108. Mike Easler (.376, .516), 108 vs. 118. Tim Raines (.393, .437), 115 vs. 124.
At times the two systems produce even greater disagreement. A classic example is George Brett's 1980 season. Brett finished the year at .390 and he hit with power. His .664 slugging mark had been bettered only twice in the previous twenty years, by Hank Aaron's .669 in 1971, and by Mickey Mantle in 1961 when he cracked 54 home runs and slugged .687. Anyway, here's how the two methods measured Brett's year. George had 116 estimated runs produced, 135 runs created. 19 runs; that's some difference. About two wins in the standings. That is a rare case. But one win differences are very common. Add a couple or three of them together and you might wind up thinking that your third-place talent could win the pennant this year. Or that your pennant contenders will end up in third place. It just depends on which formula you use.
The two formulas also differ somewhat on the value of the running game. The estimated runs produced formula finds stolen bases to be more valuable and caught stealing to be less critical than does the runs created method. For the Top 10 teams in steals from 1970 to 1984 the estimated runs produced formula comes closer to the actual average of 684 runs scored per team, beating out runs created 681 to 677. And it was more accurate for the most prolific base-stealing team in recent history, the '76 Oakland A's (341 stolen bases). They scored 686 runs, had 676 estimated runs produced, but only 663 runs created.
The two methods, not surprisingly, rate Rickey Henderson's 130-theft 1982 season a bit differently; 104 estimated runs produced, only 99 runs created. Not a profound difference, but significant. Especially when juxtaposed with their ratings of Robin Yount (.331, 29 HR) in that same year; 128 vs. 136 runs created. Again, there is a 13 run difference in the two formulas' comparisons of players. If you're a bit confused about which equation to believe, don't worry too much. Using either one will prevent you from looking as inept as Major League executives sometimes have.
Here's one of the all-time classics as viewed by the estimated runs produced formula. After the '79 season two veteran National League second basemen declared themselves free agents. Over the prior two seasons one had hit .241, the other .243. Over the next two seasons they batted .242 each. They sound like mirror images, don't they? But they sure weren't. The first player was Rennie Stennett. In '78-79 he produced at a rate of 46 estimated runs produced per year. In '80-81 he continued at that rate. The other player did somewhat better. In '78-79 he produced at a rate of 95 estimated runs produced per year. He continued that production in '80-81. I wonder if the San Francisco Giants would've liked the estimated runs produced figures for '78-79 before they passed up Joe Morgan after the '79 season in favor of signing Stennett to a $2.5 million deal? It's a rare case where two players will produce so consistently from year to year, have such similar batting averages, and yet be light years apart in run-producing capabilities. But it did happen in real life. And somebody really blew it.
Of course every tool has its limits, and the estimated runs produced formula is no exception. I was trying to come up with the break-even stolen base percentages for various teams, but I ran into a big stumbling block: Timing. It makes a big difference whether a steal is being attempted with no outs or with two outs. It's a lot easier to justify the attempt with two outs, as doing it then isn't nearly as likely to break up a big inning if you don't succeed as if you tried with no outs. But the estimated runs produced formula can't really be used to calculate just how much more risk there is in trying the no-out steal. At least I haven't figured out how to do it.
I can make some very broad generalizations about break-even percentages for stolen bases. A team that hits for a normal batting average, hits with average power, is average in every other statistic, and steals an average number of times with none, one, and two outs (who knows what that average distribution is?) needs to succeed on about 61% of its steal attempts to make it worthwhile. A slugging team such as the '84 Tigers would have a break-even point of about 65%. A low-power bunch like the '84 Cardinals could justify attempting to steal with a 60% success rate. That's talking about teams, but it would be more appropriate to consider the individual batter at the plate.
With a weak-hitting shortstop at bat, a 55% success rate would be about the break-even point. With a Winfield or Murray at bat, a 70% success rate would be needed. I hesitate to try to get any more specific because as I said, timing is an important factor. I just don't know how important.
Everybody who has a formula for evaluating offenses has their lists of all-time greats. And I'm no exception. But before I inflict my lists on you I need to explain briefly what they're based on. First, I calculate the player's estimated runs produced. Then I project that figure to a full season's play. This puts each performance into a context where they can be looked at and compared in much the same way as are batting averages. And without further ado, here's my first list.
ESTIMATED RUNS PRODUCED: TOP 10 LIFETIME PERFORMERS IN ESTIMATED RUNS

                          Est. Runs Produced    Avg.
 1.  Babe Ruth                   202            .342
 2.  Ted Williams                194            .344
 3.  Lou Gehrig                  176            .340
 4.  Jimmie Foxx                 162            .325
 5.  Rogers Hornsby              158            .358
 6.  Hank Greenberg              153            .313
 7.  Mickey Mantle               148            .298
 8.  Stan Musial                 146            .331
 9.  Ty Cobb                     145            .367
10.  Joe DiMaggio                141            .325
TOP TEN SEASONAL PERFORMANCES IN ESTIMATED RUNS PRODUCED PER 162 GAMES

                      AB    HR    Avg.   Est. Runs Produced/Season
 1.  '20 Ruth        458    54    .376             274
 2.  '23 Ruth        522    41    .393             266
 3.  '41 Williams    456    37    .406             265
 4.  '21 Ruth        540    59    .378             262
 5.  '57 Williams    420    38    .388             245
 6.  '26 Ruth        495    47    .372             239
 7.  '24 Ruth        529    46    .378             237
 8.  '25 Hornsby     504    39    .403             231
 9.  '27 Ruth        540    60    .356             230
10.  '24 Hornsby     536    25    .424             228
I can't let this list pass without a comment to put it in perspective. Babe Ruth's 1920 performance put into a one-game context would look like this:
             AB     H    2B    3B    HR    BB
'20 Ruth     40    15     2     1     5    13
That is truly astonishing. 15 runs per game. 5 home runs per game. 13 walks per game. That is what 274 estimated runs produced means. (Runs created per 162 games would be 324, or 18 runs created per game.)
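The one-game line above can be run back through the formulas as a sanity check. This sketch makes several assumptions that are not stated in the text: total bases are derived from the hit breakdown, stolen bases are left at zero because the line does not show them, the simplified estimated runs produced formula is used, and runs created is taken in its basic form, (H + BB) x TB / (AB + BB).

```python
# One-game line for a lineup of 1920 Ruths, copied from the table above.
ab, h, doubles, triples, hr, bb = 40, 15, 2, 1, 5, 13

# Total bases: one per hit, plus the extra bases on doubles, triples, homers.
tb = h + doubles + 2 * triples + 3 * hr

# Simplified estimated runs produced; stolen bases assumed to be zero here.
sb = 0
erp_per_game = (2 * (tb + bb) + h + sb - 0.615 * (ab - h)) * 0.16

# Basic runs created (assumed version of the formula).
rc_per_game = (h + bb) * tb / (ab + bb)

print(f"Total bases: {tb}")
print(f"Estimated runs produced per game: {erp_per_game:.2f}")
print(f"Runs created per game: {rc_per_game:.2f}")
```

Under those assumptions the line works out to just under 15 estimated runs produced and just under 18 runs created per game, in line with the figures quoted above.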
TOP TEN SEASONAL PERFORMANCES IN ESTIMATED RUNS PRODUCED PER 162 GAMES (1970-1983)

                         AB    HR    Avg.   Est. Runs Produced/Season
 1.  '80 Brett          449    24    .390             190
 2.  '76 Morgan         472    27    .320             179
 3.  '75 Morgan         498    17    .327             173
 4.  '81 Schmidt        354    31    .316             172
 5.  '70 McCovey        495    39    .289             168
 6.  '71 Aaron          495    47    .327             168
 7.  '79 Lynn           532    39    .333             168
 8.  '70 Yastrzemski    565    40    .329             167
 9.  '77 Carew          616    14    .388             166
10.  '70 Carty          478    25    .366             163
Besides the estimated runs produced formula already introduced there are a couple of other versions that I find useful. The first of these is the simplest. It works very well for groups of statistics for which you don't have all the minor statistics like caught stealing, hit by pitch, etc.
(2 x (TB + BB) + H + SB - (.615 x (AB - H))) x .16 = Runs
The second version works better, especially for players with high stolen base totals, for figuring estimated runs produced projected out to a full season's basis.
(2 x (TB + BB) + H + SB - (.610 x (AB + (SB/4) - H))) x .16 = Runs
Then:
Runs/(AB + (SB/4) - H) x 458 = Runs per 162 Games
(AB + (SB/4) - H) is the number of projected outs made by the player, and 458 would be the number of total outs in a season.
To project estimated runs produced from the original equation to a full season this is the conversion:
Runs/(AB + CS + GIDP - H) x 474 = Runs per 162 Games
(AB + CS + GIDP - H) is the number of outs made by the player, and 474 would be the number of total outs in a season.
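The full-season projection is a simple rate calculation: runs per out, multiplied by the outs in a season. Here is a rough sketch using the second version of the formula with its 458-out season; the function and constant names are illustrative, and the stat line is Ruth's 1929 season as given in the earlier table. The same `per_162_games` helper would take the original formula's runs and outs together with 474.

```python
SEASON_OUTS_SB_VERSION = 458   # used with outs = AB + SB/4 - H
SEASON_OUTS_ORIGINAL = 474     # used with outs = AB + CS + GIDP - H


def erp_projection_version(ab, h, tb, bb, sb):
    """Second version of the formula; returns (runs, projected outs)."""
    outs = ab + sb / 4 - h
    runs = (2 * (tb + bb) + h + sb - 0.610 * outs) * 0.16
    return runs, outs


def per_162_games(runs, outs, season_outs):
    """Scale a player's runs to a full season's worth of outs."""
    return runs / outs * season_outs


# 1929 Babe Ruth, from the earlier table.
runs, outs = erp_projection_version(ab=499, h=172, tb=348, bb=72, sb=5)
projected = per_162_games(runs, outs, SEASON_OUTS_SB_VERSION)
print(f"Runs: {runs:.1f}, outs: {outs:.2f}, per 162 games: {projected:.1f}")
```

Expressing everything per out is what lets a part-time player and an everyday player be compared on the same scale, much as batting averages are.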
One thing that I should do here is to explain the details of the runs created formula in use this year, which I don't think I have done anywhere else in this book. The runs created estimates in this book are derived by what is called the technical version of the runs created formula, which is the same this year as was introduced in the 1984 Abstract on pages 12-15. The formula has three elements, an A element, a B element, and a C element. The three are put together in this way:
(A x B) / C
The A factor, which measures runners on base, is hits plus walks plus hit batsmen minus caught stealing and grounded into double plays (H + W + HBP - CS - GIDP).
The B factor, which measures advancement of baserunners, is total bases plus .26 times hit batsmen and non-intentional walks, plus .52 times stolen bases, sacrifice hits and flies (TB + .26(TBB - IBB + HBP) + .52(SB + SH + SF)).
The C factor, which measures the context in which these things occur, includes at bats, walks, sacrifice hits and flies, and hit batsmen (AB + TBB + SH + SF + HBP).
That's called the technical runs created formula: same as last year's. I don't know why it took four pages to explain last year.
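For readers who want to compute it, the three factors can be sketched in code as follows. The C factor here follows the prose description (at bats, walks, sacrifice hits and flies, and hit batsmen), so it includes SH. The function name and the sample stat line are hypothetical and do not describe a real player.

```python
def technical_runs_created(ab, h, tb, tbb, ibb, hbp, sb, cs, gidp, sh, sf):
    """Technical runs created, (A x B) / C, as described in the text."""
    # A: runners on base.
    a = h + tbb + hbp - cs - gidp
    # B: advancement of baserunners (walks counted net of intentional walks).
    b = tb + 0.26 * (tbb - ibb + hbp) + 0.52 * (sb + sh + sf)
    # C: context, taken from the prose description (includes SH).
    c = ab + tbb + hbp + sh + sf
    return a * b / c


# Hypothetical season line, for illustration only.
rc = technical_runs_created(
    ab=600, h=180, tb=300, tbb=70, ibb=10, hbp=5,
    sb=10, cs=5, gidp=15, sh=2, sf=6,
)
print(f"Technical runs created: {rc:.1f}")
```

Because A and B are multiplied together, a hitter who is strong in both comes out with a total that grows faster than either factor alone, which is the behavior discussed later in this piece.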
The first thing that I did to try to verify Paul's method was to run his formula on the 1984 season, the statistics of which were not even available at the time that he wrote the letter. His method, for the 1984 season, was a little more accurate than runs created. I checked 1983. Again Johnson's method finished a nose in front of runs created. At that point I thought I had better get serious about checking it out, so I designed a ten-league, 100-team test. The ten leagues were both leagues for the seasons 1955, 1960, 1965, 1970, and 1975.
Runs created beat estimated runs produced in seven of the ten leagues and 56 of the 100 teams; still, because the errors of runs created, when it missed, were significantly larger than those of Johnson's method, his system came out well ahead in the gross error. The gross error for the 100 teams was 1,840.2 runs for Johnson's method (18.4 runs per team); that for runs created was 1,934.8 (19.3 per team).
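The gross error used in this test appears to be the sum, over all teams, of the absolute difference between estimated and actual runs; that reading is an assumption. The sketch below applies it to the four team seasons whose actual, estimated runs produced, and runs created totals are quoted in this article. It is not the 100-team sample, and the variable names are illustrative.

```python
# (team, actual runs, estimated runs produced, runs created),
# all taken from figures quoted elsewhere in this article.
teams = [
    ("'84 Tigers", 829, 830, 843),
    ("'84 Red Sox", 810, 816, 832),
    ("'76 A's", 686, 676, 663),
    ("'75 Red Sox", 796, 768, 769),
]

# Gross error: total absolute miss across all teams.
erp_gross = sum(abs(erp - actual) for _, actual, erp, _ in teams)
rc_gross = sum(abs(rc - actual) for _, actual, _, rc in teams)

# Head-to-head count: how many teams each method came closer on.
erp_wins = sum(abs(erp - actual) < abs(rc - actual) for _, actual, erp, rc in teams)
rc_wins = sum(abs(rc - actual) < abs(erp - actual) for _, actual, erp, rc in teams)

print(f"ERP gross error: {erp_gross} ({erp_gross / len(teams):.1f} per team)")
print(f"RC gross error:  {rc_gross} ({rc_gross / len(teams):.1f} per team)")
print(f"Teams won: ERP {erp_wins}, RC {rc_wins}")
```

Counting wins and summing errors are different yardsticks, which is how one method can take more head-to-head comparisons while the other finishes with the smaller gross error.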
At that point, I decided I should ask him to present the formula in this book. I'm not convinced that his method would be more accurate than runs created in a larger study; I'm certainly not convinced that it wouldn't. Runs created seemed to be more accurate in the period before the stolen base revolution began; it "won" both leagues in 1955 and 1960 and the American League in 1965.
I don't know that the degree of accuracy involved makes a lot of difference. The real appeal of his method, to me, is its simplicity; it involves just seven categories of information and no calculations except addition, subtraction and multiplication. I was originally suspicious of the system when I saw that ".16" at the end of it. Wouldn't it seem more likely that the most accurate possible system would require multiplication by .15974 or something? My assumption, as I said, was that if better methods were to be developed, they would have to be more complex, more difficult to figure, and that they would grow out of the existing methods. The excitement of finding Johnson's method is that 1) it is so simple, and 2) it was developed entirely independently. These two things suggest that there probably are compromises between the two methods that will prove to be yet more accurate than either method.
But not that much more accurate. Another thing that I noticed in comparing the two methods is that the correlation between the two of them was even closer—much closer—than the correlation of either with actual runs. The two methods tend to make the same mistakes—that is, the 1975 Red Sox actually scored 796 runs. Johnson's system says they should have scored 768 runs, and mine 769. That happens often, and, since we are talking about two completely independent methods, that suggests that we are nearing the limits of the information that exists within traditional batting stats. The errors in Palmer's method, on the other hand, seem to be completely unrelated to the errors of runs created.
I feel certain that Paul's method will find many uses in sabermetrics. I've known for a little over a year that the runs created formula had a problem with players who combined high on-base percentages and high slugging percentages (he is certainly correct about that), and at the time that I heard from him I was toying with options to correct these problems. The reason that this happens is that the players' individual totals do not occur in an individual context. How do I explain this . . . visualize a player's runs created as a rectangle, of which the two dimensions are the ability to get on base and the ability to advance runners. The rectangle representing Eddie Murray is much larger both ways than that representing an ordinary player.
The increase in runs created that results from the extension of the vertical axis is real. The increase in runs created that results from the extension of the horizontal axis is real. The increase in runs created that results from the extension of the one acting upon the extension of the other is not real; it is a flaw in the runs created method, resulting from the player's offense being placed in an individual context. Does that make sense?
What I was thinking of doing was figuring "context" runs created—that is, Eddie Murray's runs created would be figured as the difference between the Orioles' team runs created with Murray, and their team runs created without him, with his statistics taken out. That method in effect prevents the two extensions from acting upon each other, and thus results in runs created estimates which are more accurate for the Eddie Murray, Babe Ruth type of hitter. The runs created estimates for Murray, Sandberg, Murphy, etc., that would be derived by the use of the "context" method would be almost identical to Paul Johnson's estimates for them.
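The "context" idea can be sketched with very little code: compute the team's runs created with the player's statistics included, then again with them removed, and take the difference. The example below is only an illustration under loud assumptions: it uses the basic runs created formula, (H + BB) x TB / (AB + BB), rather than the technical version, and the team and player totals are invented rather than taken from the Orioles or Eddie Murray.

```python
def basic_rc(ab, h, tb, bb):
    """Basic runs created (assumed version): (H + BB) x TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def context_runs_created(team, player):
    """Team runs created with the player minus team runs created without him."""
    without = {stat: team[stat] - player[stat] for stat in team}
    return basic_rc(**team) - basic_rc(**without)


# Hypothetical totals, for illustration only.
team = dict(ab=5500, h=1450, tb=2200, bb=550)
player = dict(ab=590, h=180, tb=300, bb=105)

individual = basic_rc(**player)
context = context_runs_created(team, player)
print(f"Individual runs created: {individual:.1f}")
print(f"Context runs created:    {context:.1f}")
```

For this invented high on-base, high slugging hitter the context figure comes out a few runs below the individual figure, which is the direction of adjustment described above.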
So I know he's right about that. I suspect he's also right about the stolen base adjustments, though I'm less convinced. I'll make some adjustments to the runs created formula within the next year or so. Right now, I don't know what they will be.
Excerpted from 1985 Bill James Baseball Abstract. Ballantine Books. 1985. James, Bill. "Paul Johnson's Estimated Runs Produced".