Primate Studies: Where BTF's Members Investigate the Grand Old Game

Monday, August 27, 2001

SUPERLWTS - A Player Evaluation Formula for the New Millennium - PART 2

If you've been waiting for more of Mitchel Lichtman's Super Linear Weights, your wait is over.

Before I begin discussing the superlwts formulas, there are a few corrections and qualifications that must be made to the first part of this article. First, a colleague, David Smyth, pointed out that in the latest edition of "Total Baseball," the SB and CS run values have been updated to reflect an essentially random distribution of stolen base attempts during a game. According to David, "TB" uses .22 and .35, respectively, for the SB and CS coefficients. Apparently these represent the long-term historical values. Since my superlwts player rankings will focus only on the last few years, I use .17 and .45 (I inadvertently used values of .19 and .46 in Part I), which represent the "current" (1998 through 2000) values. (According to "TB," the reason that SB/CS runs are not included in Palmer's traditional offensive lwts formula, and are a separate component in TPR, is that caught-stealing figures were not readily available prior to around 1920 or so.) Keep in mind that however the linear weight coefficients are generated (regression analysis, computer simulation, or empirically, from play-by-play data), there is a standard error associated with each one of them; therefore, do not take any exact linear weight value as the gospel!

The second area that needs some qualification is converting linear weight runs into linear wins. Thanks to Mr. Smyth, I now see the importance of this ultimate step (penultimate in TPR) in the superlwts rankings. In Part I of this article, I stated that a player's offensive linear weights "represent [his] theoretical run contribution to an average team within his league and year(s)." As David and several others have pointed out, this is not an accurate statement.
Although a player's lwts approximates his run contribution to an average team, the actual mathematical relationship between a player's lwts and his theoretical run contribution to an average team is not linear. However, for mathematical reasons I won't go into, a player's lwt runs divided by his league's runs per game very closely approximates (for all practical purposes, equals) his "win" contribution to any team. For example, if a player has a lwts of 20 runs per X number of "player games," and the average team in his league scores 10 runs per game, we can say with reasonable "certainty" that he will (theoretically) add 2 wins to a team, per X number of games in which he plays. Once we convert linear runs (lwts) into linear wins, we can (reasonably) compare players from different eras and different leagues. (In TPR, "runs per win" for a particular player is defined as 10 times the square root of the "league average runs per inning plus that batter's rating." [See "Total Baseball" for details.] Not only is this number going to be very close to "league average runs per game," but the latter, although much simpler, is probably more accurate than the TPR method.)

Now, the superlwts formulas and methodologies.

There are eight separate components of superlwts: 1) batting runs; 2) fielding runs; 3) GDP defense (infielders only); 4) OF arms (outfielders only); 5) baserunning; 6) GDP for batters; 7) moving runners over (on outs); and 8) catching (catchers only). All of these components are expressed as runs above or below average; the sum total represents a player's superlwts, and the total divided by the league-average runs per game is a player's all-around theoretical win contribution to a team. So who are the best and worst all-around players in baseball? You will be surprised at some of the results!

Let's start with the most basic superlwts component: Batting Runs.
As I stated in Part I, I use essentially the same formula that Palmer introduced in "The Hidden Game of Baseball." The only difference is that I use the current (1998-2000) values for the various offensive events and I include the SB and CS data rather than adding it in later. Here are the linear weight values for each of the offensive events. For simplicity's sake, I use average values for the NL and AL combined. In order for the total linear weights to sum to exactly zero, it is necessary to use a unique out value for each league and year.
Out (including "reached on error"): -.29 (approximately, depending upon league and year)

The formula, therefore, for the Batting Runs component of superlwts is:

This is almost identical to Palmer's classic formula. Remember that BB's do not include IBB's, and the out value varies from year to year and from league to league. Recently, it has been around -.29. Also, the out value for the NL does not include pitcher hitting, therefore the average position player in the NL for any given year has an offensive lwts of exactly zero. Although superlwts (or TPR) does not distinguish between K and non-K outs, a K is actually around .016 runs worse than a non-K out (including DP's). Most of the difference is due to the value of a "reached on error." Some of it is due to the value of a sac fly and "moving runners over." A fly ball out and a ground ball out (including a GDP) are worth almost exactly the same.

The above values are computed as follows: Using a play-by-play database and a "number crunching" computer program, the 27 different bases/outs run expectancies (RE matrix) are generated. These run expectancies are the average runs scored from the time each of the 27 bases/outs situations occurs until the end of the inning. For example, as the computer program "goes through" the database for a given league and year, each time it encounters a "runner on second/one out" situation, it "records" the number of runs scored from that point forward until the end of the inning. The average number of runs scored in all of those situations (in the above example, runner on second/one out) is the average run expectancy (RE) for that particular bases/outs combination.

Once all 27 bases/outs RE's are computed, the program "goes through" the database once again in order to calculate the offensive linear weight values. Each time a particular event occurs, such as a single or double, the program simply "records" the change in RE (Delta RE) from before the occurrence of the event to after the occurrence of the event, plus any runs that are scored.
The average value of the Delta RE plus runs scored becomes the linear weight value for each event. For example, let's say a double occurs in the database. If there were a runner on first and no outs prior to the double, the RE for this bases/outs combination (the "before" RE) would be around .93. If the runner scores on the hit and the batter ends up on second, the "after" bases/outs combination would be "runner on second/no outs," an RE of around 1.18. The change in RE (Delta RE) plus any runs scored would be 1.18 minus .93 plus 1 run scored, or 1.25 runs. This is the "value" of a double in that exact situation. Once the program averages all the double values in all situations, the result is the (average) linear weight value of a double (for that league and year). That is where the 1.06 number comes from. (1.06 is the average of the double value for the NL and AL combined in 1998 through 2000.)

The offensive lwt values you will see are park- and opponent-adjusted. The park factors I use are component park factors, with separate values for: singles, doubles to left, doubles to right, triples, home runs to left, home runs to right, walks, and strikeouts. They are generated from three years' worth of data, and each factor is regressed according to its own unique linear regression formula (for example, the K park factor is regressed more than the HR park factors, since most of the variation in park-to-park and year-to-year K values is due to random fluctuation). If you don't understand the park factor regression, don't worry about it. Opponent adjustment is similar to park adjustment. I adjust each player's stats for the overall quality of the opponents they face (for batters it is the pool of pitchers they face, and for pitchers it is the pool of batters they face), using "opponent adjustment factors." Both park and opponent adjustments are done on a PA-by-PA basis.
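As a rough illustration of the run-expectancy bookkeeping described in this section, here is a minimal Python sketch. It is not the program used for the article; the state labels, the function names (`run_expectancy`, `event_value`, `linear_weight`), and the treatment of a missing "after" state as worth 0 runs are all assumptions made for the example. Only the two RE values (.93 and 1.18) come from the text.

```python
from collections import defaultdict


def run_expectancy(observations):
    """Average runs scored from each bases/outs state to the end of the inning.

    `observations` is an iterable of (state, runs_rest_of_inning) pairs,
    one pair for each time the state occurs in the play-by-play data.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in observations:
        totals[state] += runs
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


def event_value(re_table, before, after, runs_scored):
    """Delta RE (after minus before) plus the runs scored on the play.

    An "after" state missing from the table (for example, the end of
    the inning) is treated as worth 0 runs; that is an assumption here.
    """
    return re_table.get(after, 0.0) - re_table[before] + runs_scored


def linear_weight(re_table, plays):
    """Average event value over every occurrence of one event type.

    `plays` is a list of (before_state, after_state, runs_scored) tuples.
    """
    values = [event_value(re_table, b, a, r) for b, a, r in plays]
    return sum(values) / len(values)


# Hypothetical state labels: (runners, outs).
RE = {("1--", 0): 0.93, ("-2-", 0): 1.18}

# The double from the text: runner on first, no outs; the runner scores
# and the batter ends up on second.
double_value = event_value(RE, ("1--", 0), ("-2-", 0), runs_scored=1)
print(round(double_value, 2))  # 1.25
```

Averaging `event_value` over every double in a league-season, as `linear_weight` does, is what turns the situation-specific 1.25 into a single coefficient for the event.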
There are several offensive events that are conspicuously absent from the offensive linear weights formula (both in TPR and in superlwts). First are IBB's. Although a manager presumably issues an IBB with the thought that it is lesser in value than the batter's average PA (or at least that it reduces the batting team's chances of winning), in actuality an IBB is worth about the same as the batter's average PA (i.e. it is a "neutral" event). Therefore it can be properly ignored in the offensive lwts formula. Similarly, a manager generally thinks that a sacrifice bunt increases his team's run expectancy and/or his team's chances of winning. While it is commonly accepted in the sabermetric community, and more and more among traditional baseball pundits, that a sac bunt by a non-pitcher decreases a team's run expectancy and/or their chances of winning, it is close enough so that it too may be properly ignored. In addition, a sac bunt is not particularly related to a player's "talent"; the manager and the game situation dictate it. Finally, sacrifice flies are treated as ordinary outs in the offensive lwts formula. Besides the fact that they too are situational, several studies have shown that a batter is not "able" to hit a fly ball more often with a runner on third and less than two outs than at any other time.

Keep in mind that while a player's offensive linear weights is in many respects an indication of his talent, to some extent even a player's raw stats, and therefore his lwts, are "situational"; they are not necessarily exactly indicative of his "context-neutral" performance, or talent. First of all, every slot in the batting order has slightly different linear weight values associated with it. For example, a home run in the leadoff position is worth less than a home run in the cleanup spot. Similarly, in the NL, a walk by the leadoff batter is worth more than a walk by the number-eight hitter.
While a player's offensive lwts represents his offensive value in a random or average lineup slot, we know that certain players and certain types of players are destined to bat in certain spots in the batting order. Technically, we could use the above formula to first calculate a player's average offensive lwts, then go back and assign different coefficients for each offensive event, depending upon where we might expect that player to bat in the lineup. For example, players like Rickey Henderson or Kenny Lofton, who almost always bat leadoff, would have a different set of lwt values than a player like Rey Ordonez (who almost always bats eighth) or Mark McGwire (who almost always bats third or fourth). The linear weight values associated with a certain player can also depend upon the overall "quality" of that player and the quality of the players around him in the lineup. For example, it is likely that many of the walks that "dangerous" hitters like Bonds, McGwire, and Sosa receive are "unintentional intentional BB's." Because they are often issued with a base or bases open, their overall value may be less than that of an average walk (an IBB is worth around half that of a non-IBB). Ditto for the number-eight hitter in the NL. In other words, do not take a player's offensive linear weights as a precise indication of his talent or even his performance.

The second component of superlwts is GDP Runs. GDP's are not included in TPR or in most offensive linear weight formulas (they are included in the value of the out, however) because they tend to be more situational than indicative of a batter's speed or propensity to hit the ball on the ground. If we have the requisite data, however, we can "factor out" the situational portion of the GDP by looking at each player's GDP per GDP opportunity rather than their GDP's per PA. A GDP opportunity is defined, of course, as a runner on first with less than 2 outs. The GDP Runs formula is simple.
It is:

GDP Runs = (LGDP - GDP) * .55

LGDP is the league-average number of GDP's per that player's number of GDP opportunities (i.e. how many double plays an average player "would have" hit into, given the same number of opportunities), GDP is the actual number of GDP's the player hit into, and .55 is the average difference in value between a GDP and a single out with a runner on first. In other words, a player is penalized .55 runs for every "extra" GDP he hits into (per GDP opportunity), and rewarded the same for every "extra" GDP he "avoids." Most players are in the -5 to +5 range per season. A few can save or cost their team as much as 6 or 8 runs.

The next superlwts component is also part of a player's offensive production. Who hasn't heard a baseball announcer extolling the virtues of the unselfish batter who "gives himself up" in order to move a runner over from second to third, usually with no outs? Well, superlwts gives credit where credit is due (I suppose that with a runner on second and no outs, a batter should modify his swing in order to put the ball in play more often and hit more balls to the right side, as long as his overall production is not diminished too severely). The Moving Runners Over formula is similar to the GDP Runs formula. A batter is awarded .25 extra runs for every runner on second (with no outs) that he moves over with an out, above the league average (again, per opportunity), and penalized the same for every runner he strands at second while making an out. The difference between the best and worst players, in terms of "moving runners over," is only around 5 or 6 runs per season. Most players are in the plus or minus 0 to 1 run range.

The last offensive superlwts component is Baserunning Runs.
Contrary to what I wrote in Part I, Baserunning Runs includes a player's outs-on-base (OOB) trying to stretch a hit (as a batter-runner), his OOB attempting to advance on a hit (as a baserunner), and whether or not he advances an extra base on a double or single (also as a baserunner). A player is penalized .5 runs (similar to a CS) every time he is out trying to stretch a single into a double, a double into a triple, or a triple into an inside-the-park home run. Of course, there is no equivalent reward for a successful "stretch," since this is already accounted for in Batting Runs. Figuring the other portions of Baserunning Runs is a tad more complicated. Basically, for every extra base a player advances on a hit, over and above the league average, given the number of outs and location of the hit, he is given an extra .2 runs. For every base he doesn't advance, compared to the league average, he is docked .2 runs. And, of course, for every out a player makes while trying to advance an extra base on a hit, compared to the league average, he is penalized .5 runs. Exceptionally good or bad baserunning can add or subtract 5 or 6 runs from a player's total lwts for the season. As you will see, many excellent but lumbering power hitters "give back" 4 or 5 runs of overall production per season because of their slowness on the basepaths.

Now we get to the most problematic and controversial components of superlwts: those involving defense. We'll start with the simplest (and least problematic) of the defensive components: OF Arm Runs, IF GDP Runs and Catcher Defensive Runs.

OF Arm Runs are calculated exactly like the "baserunner" (as opposed to "batter-runner") portion of Baserunning Runs. An outfielder is credited with .5 runs for every "assist" (runner thrown out) over the league average at that position, .2 runs for every "hold"
(runner does not advance the extra base) above league average, and docked .2 runs for every extra base a runner advances, above league average. An exceptionally good or bad arm can add or subtract 9 or 10 runs per season from an outfielder's superlwts total.

IF GDP Runs are not as straightforward as OF Arm Runs and require a bit of fudging to make them work. Basically, for every extra GDP above or below league average, per GDP opportunity, both the pivot man and the fielder who fields the ground ball are credited with or docked .25 runs each (for a total of .5 runs, the approximate value of a GDP versus a single out). Surprisingly, IF GDP Runs are not worth a whole lot to any individual infielder. The difference between an outstanding and an exceptionally poor SS or 2B is only around 8 to 10 runs per season. Most infielders are in the -3 to +3 range.

The last of the simple defensive components is Catcher Defensive Runs. Some elements of catcher defense are difficult to quantify. A catcher's ability to block pitches in the dirt is probably reflected in his pitching staff's WP totals. However, separating out the influence of the pitchers themselves is a difficult, if not impossible, task. Another nebulous aspect of catcher defense is "calling pitches." In my opinion, attempts at quantifying this ability, through metrics such as catcher ERA, have been disappointing and ineffective. In fact, on most teams these days, the pitching coach or manager calls a majority of the pitches, and of course, the pitcher is the ultimate arbiter when it comes to pitch selection.

Consequently, there are only three things which are included in Catcher Defensive Runs: a catcher's SB/CS numbers, his errors, and his passed balls.
The SB and CS totals are treated exactly the same as in the Batting Runs formula (in reverse, of course, and normalized to the league averages), each error above or below average is assigned a value of around .75 runs, and each passed ball above or below average is worth .2 runs. The best and worst catchers in the league are typically worth plus or minus 10 to 15 runs on defense (CS percentage, errors, and passed balls).

Finally, we get to the most complex (and controversial) of the defensive components: Ultimate Zone Rating Defensive Runs (the term Ultimate Zone Rating, or UZR, is from STATS Inc.). UZR is basically a Zone Rating or Defensive Average measure (number of outs divided by the number of "fielding opportunities") converted into a "number of runs saved or allowed" above or below average at each defensive position. The basic methodology for calculating UZR defensive runs is as follows:

First, a play-by-play database that includes hit-location and hit-type is required. The hit-location data that I use, from STATS Inc. and The Baseball Workshop (BW), superimposes a "grid" over a generic baseball diamond, such that every hit (fly ball, line drive, pop fly, or ground ball) is assigned a location on the field indicating where the ball is caught or fielded, where it drops in the outfield or infield (fly balls), where it goes through the infield (ground balls), or where it leaves the playing field, in the case of a home run. Using the BW grid, the infield is divided into 45 sections and the outfield, 44. A computer program first goes through the database and records how often each fielder, on average, turns each type of batted ball (line drive, ground ball, pop fly, and fly ball) hit into each section of the field into an out. For example, a ground ball hit in a particular location of the infield may result in a hit 30% of the time, an out by the SS 50% of the time, and an out by the 3B 20% of the time.
This is done for every appropriate type of hit (for example, ground balls only apply to the infield sections) in each of the 89 segments of the playing field. The program also records the average "value" of a hit (using the linear weight values for each type of non-HR hit) in each location and for each type of batted ball.

The program then goes through the database again and for every batted ball that is turned into an out by a particular fielder, it rewards that fielder with the difference between the value of an out and the average value of a batted ball hit in that area. For example, let's say that a fly ball hit to a particular section of the outfield, between the center and left fielders, is caught by the center fielder. If an average fly ball hit to that particular area of the outfield is caught (by either the left or center fielder, in equal proportions) 30% of the time and falls for a hit 70% of the time, and the average value of the hit is .8 runs (some combination of singles, doubles and triples), then the average value of a batted ball in that section would be 30% times -.3 plus 70% times .8, or -.09 plus .56, or .47 runs. An out is worth around .3 runs to the defensive team (the linear weight value of an out is around -.3 runs), so the center fielder, by catching the ball, has "saved" his team the difference between .47 runs and -.3 runs, or .77 runs. If the same batted ball were to drop for a hit, the left and center fielders would each be penalized half of the difference between the value of the hit (.8 runs) and the overall value of a batted ball in that location (.47 runs), or .33 divided by 2, or .165 runs. If the center fielder normally caught 60% (rather than 50%) of the outs hit to that location, then he would be penalized 60% of .33 runs and the left fielder would be penalized 40% of .33 runs. Finally, all errors are assigned a fixed value of .75 runs.
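The fly-ball example above can be checked with a short Python sketch. The function names and the way each fielder's share of outs is passed in are assumptions for illustration, not the actual UZR program; the numbers (30% caught, .8 runs per hit, .3 runs per out) are the ones used in the text, with the out value entered as a negative number, which is what the arithmetic requires.

```python
OUT_VALUE = -0.3  # approximate linear weight value of an out


def average_ball_value(catch_rate, hit_value, out_value=OUT_VALUE):
    """Average run value of one type of batted ball hit to one zone."""
    return catch_rate * out_value + (1 - catch_rate) * hit_value


def credit_for_out(catch_rate, hit_value, out_value=OUT_VALUE):
    """Runs saved by the fielder who turns the ball into an out."""
    return average_ball_value(catch_rate, hit_value, out_value) - out_value


def penalties_for_hit(catch_rate, hit_value, out_shares, out_value=OUT_VALUE):
    """Runs charged to each fielder when the ball drops for a hit.

    `out_shares` maps each fielder to his share of the outs normally
    made in that zone; the shares are expected to sum to 1.
    """
    cost = hit_value - average_ball_value(catch_rate, hit_value, out_value)
    return {fielder: share * cost for fielder, share in out_shares.items()}


avg = average_ball_value(0.30, 0.8)   # -.09 + .56
saved = credit_for_out(0.30, 0.8)     # .47 - (-.3)
print(round(avg, 2), round(saved, 2))  # 0.47 0.77

# Equal shares: each fielder is charged half of .33 runs.
even = penalties_for_hit(0.30, 0.8, {"LF": 0.5, "CF": 0.5})
# Center fielder normally makes 60% of the outs in this zone.
skewed = penalties_for_hit(0.30, 0.8, {"LF": 0.4, "CF": 0.6})
```

Summing these credits and penalties over every batted ball in a fielder's zones, and adding the fixed error value, gives his raw runs before park adjustment.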
This is the essence of UZR transformed into fielding runs, or Ultimate Zone Rating Defensive Runs. In superlwts, all UZR defensive runs are park adjusted for each fielding position. Defensive park adjustments, like their offensive counterparts, use three years' worth of data and the adjustment factors are regressed.

Although UZR carries a pretty good year-to-year correlation coefficient (i.e. it is pretty reliable), there are still some problems associated with it. First, defensive park adjustments, while important (especially for the outfield), are not extremely reliable. Second, adjacent fielders can influence one another's UZR. This is difficult to account for in the computations (in fact, I do not account for this at all). Third, it is difficult, if not impossible, to factor out the influence of a team's pitchers in each fielder's UZR (I do not address this either). Finally, because most play-by-play databases do not include the "speed" or "difficulty" of a batted ball, it is possible that some fielders may have, in the course of a season, significantly greater or fewer "difficult" opportunities than others, given the same hit-type and location.

The last step in calculating a player's superlwts is totaling the individual components (and then dividing by the league's runs per game to get linear wins). There are two ways to present the data. The sum of all the individual components in superlwts represents a player's total run (or win, if you do the last step) contribution, given the actual number of PA's, GDP opportunities, defensive opportunities, etc. that he had over the course of a season or seasons. However, we may also want to know (particularly for comparing the overall quality of one player to another) a player's prorated superlwts or linear wins "per unit game." In order to do this, we need to take the value of each separate component and "normalize" it, based on a given number of games.
In the following charts, the column that represents each player's prorated superlwts contains runs per 162 games, based on league average PA's per game, defensive and GDP "opportunities" per game, etc.

So, the next time someone asks you, "Who is better: Abreu or Sosa, Piazza or Pudge, or Bagwell or Big Mac?" you will have the definitive answer! Enjoy!

2000 SuperLWTS
2000 CSV file
1998-2000 Total SuperLWTS
1998-2000 CSV file

The following charts are average values for each position (per 162 games) for the NL and AL combined:

2000 Average SuperLWTS Values

(None of the columns above, including total lwts, necessarily sums to zero, since many players have defensive chances and PA's at more than one position. While players' superlwts represent all of their performance in a given season, each player is assigned only one primary defensive position.)
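For readers who want to reproduce the final tally, here is a minimal Python sketch of two steps described in the article: the GDP Runs formula and the conversion of total runs into linear wins. The component numbers for the sample player are invented for illustration, and the dictionary keys and function names are assumptions, not part of the superlwts files.

```python
def gdp_runs(league_gdp_rate, opportunities, gdp, run_value=0.55):
    """GDP Runs = (LGDP - GDP) * .55.

    LGDP is the number of double plays an average player would have hit
    into, given the same number of GDP opportunities.
    """
    lgdp = league_gdp_rate * opportunities
    return (lgdp - gdp) * run_value


def superlwts(components):
    """Total runs above or below average: the sum of all components."""
    return sum(components.values())


def linear_wins(total_runs, league_runs_per_game):
    """Convert runs to wins by dividing by league-average runs per game."""
    return total_runs / league_runs_per_game


# Hypothetical player: every number below is invented for the example.
components = {
    "batting": 15.0,
    "fielding": 4.0,
    "baserunning": -1.0,
    "gdp": gdp_runs(league_gdp_rate=0.12, opportunities=100, gdp=8),
    "moving_runners": 0.0,
}
total = superlwts(components)
print(round(total, 1))

# The example from the text: 20 runs in a league scoring 10 runs per game.
print(linear_wins(20, 10))  # 2.0
```

Prorating to 162 games would need each component's opportunities per game, which the sketch leaves out because the article does not spell out that calculation.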
Mitchel Lichtman
Posted: August 27, 2001 at 06:00 AM  24 comment(s)
Reader Comments and Retorts
1. Old Matt Posted: August 27, 2001 at 12:10 AM (#604067)
1. The double advances any other baserunners farther.
2. The hit, even though the baserunner is later erased, has advancement potential that the out lacks.
Cheers,
Alan Shank
Mike Cameron cost the Mariners 9 runs last year in center field but Darin Erstad saved 52? I'm not sure I believe that.
A bigger problem than that, though, is the idea that defensive failures can be allocated to "particular" fielders based on where the ball happens to fall. The vast majority of defense is positioning of the fielders to reduce the risk resulting from a defensive failure, and defensive positioning interacts in ways that aren't necessarily obvious. If a team has a weakness in RF, for example, the CF and 2B, at a minimum, have to adjust to cover that weakness to some extent, and because they have to adjust, the other fielders do, too. If a ball falls in LCF, who's to say that the CF might not have been able to make the catch if it hadn't been for that sluggard in RF? Should the RF get off entirely without responsibility, even though everyone else on the team was covering for him?
I'm of the opinion (heresy though it may be) that responsibility for defensive failures is rarely due to "just" the players in the vicinity of the ball. Defensive failures are almost always "team" failures, and most should be credited more or less equally to the entire team.
-- MWE
Someone else asked how those LWTS values work out, etc. Mickey explained that he used real-life data. However, let me just explain it in a mathematical sense. The chance of scoring from 1b is 27% (more or less). This means that if you can get to 1B, you will have added .27 runs to your team. At 2b, it is .43, and at 3b it is .60. The "baserunner-advancement" value works out to .20/.30/.40 runs for the 1b/2b/3b-hr (the triple and the home run both get .40). So, if you simply add all this, you end up with LWTS values of .47/.73/1.00/1.40, which is very close to what happens in real life. The reason that they don't add up exactly is that these events are not distributed evenly across situations. If you want to get into the technical details, you can also see another explanation here: http://baseball.fanhome.com/forums/showthread.php?threadid=76367
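The arithmetic in the preceding comment is easy to check with a few lines of Python. This is only a sketch of the addition described there; the variable names are assumptions, the home run is given a scoring value of 1.00 because the batter scores outright, and the .40 advancement value is applied to both the triple and the home run, which is how the four quoted totals work out.

```python
# Chance of eventually scoring from each base, as quoted in the comment.
# A home run scores the batter outright, so its value here is 1.00.
score_from = {"1b": 0.27, "2b": 0.43, "3b": 0.60, "hr": 1.00}

# Approximate value of advancing any runners already on base.
advance = {"1b": 0.20, "2b": 0.30, "3b": 0.40, "hr": 0.40}

# Add the two pieces for each hit type.
lwts = {event: round(score_from[event] + advance[event], 2) for event in score_from}
print(lwts)  # {'1b': 0.47, '2b': 0.73, '3b': 1.0, 'hr': 1.4}
```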
As far as Jermaine Dye and Darin Erstad and Bernie Williams and Ken Griffey Junior, etc., I have contended for a long time that it is difficult if not impossible to evaluate a fielder by observing him, even day in and day out. Of course, you can probably get some idea as to whether he is poor, average, or above average, but I guarantee that if G-d came down from the sky and told you exactly how good each fielder was, you would be very surprised at some of His evaluations and you would have made some egregious errors. It would be somewhat like trying to figure out a batter's exact offensive production (OPS, RC, lwts, etc.) simply by watching him and not knowing any of his actual stats. You might think that Piazza or Sosa, for example, were not as good as they are, etc. Because human nature is flawed, observational defensive evaluation is often based on things like the number of great plays you observe, the perceived speed of a fielder, the "smoothness" of his catches (e.g. Andruw Jones, Bernie Williams, Jim Edmonds, Devon White), and his reputation. Some or all of these things can belie his true defensive ability. I much prefer objective measures, although I suppose that if you knew that these objective methods were imperfect (and they are), you could add some subjectivity to your evaluation. I would be ever so careful in doing so, however.
As far as the poster who suggests using team defense (team defensive average, I assume, which is trivial to calculate) prorated among each player, while your arguments (one player influencing another) have some merit, that is like throwing out the entire family with the bathwater! Last year, the Cub infield had a team UZR lwts of -4 runs. Would you want Mark Grace to share in that number (-1 runs)? His own UZR lwts was +9. The Rockies had an infield UZR lwts of -8 runs in 2000, yet Neifi Perez' lwts was +12. Would you want Neifi to get his share of the -8 rather than the +12?
I have a few corrections for the article (so far). My thanks to several readers for pointing them out. First an innocuous typo: "That is where the 1.06 number comes from. (1.06 is the average of the double value for the NL and AL combined in 1998 through 2000.)" Of course, this should read .77 is the average of the double value...
More importantly, my description of the "moving runners over" superlwts component was totally wrong. Without going into the exact methodology here, a batter gets around .1 run credit for every runner on second he moves to third on an out, above league average, and gets docked around .05 runs for every runner he "strands" at second, again, above league average. For some strange reason (Gremlins in my computer), I said .25 runs credited or docked...
Updated player ratings files (98, 99, 00, and 98-00) should be up on Tuesday. These are the same as before, but they include 2 more columns for each player: one for total lwts adjusted for position (I simply subtract the league average total lwts for each defensive position, including DH), and one for prorated (per 162 games) total lwts, adjusted for position. I also sort the players by total adjusted lwts in descending order.
I am also in the process of tweaking the UZR calculations. So keep an eye out for updated UZR's.
Thanks for all the comments!
The format for the updated files is a little screwed up. Would you check that please?
Here is some data that might be useful:
These are the normalized (1.00 is league average at that position) parkadjusted Ultimate Zone Ratings for every pair of positions for 1998 to 2000 (3 years). I looked at every player who played more than 1 position in any given year. The UZR's are the weighted (by number of chances) averages at each position for players who played (at least) those 2 pairs in any given year. The numbers in parentheses are the total number of chances at each position.
The first UZR and # of chances (the number in parentheses) refer to the first fielding position in the position pairs and the second UZR and # of chances refer to the second position.
Position pair   UZR1 (chances1)   UZR2 (chances2)
3/4             1.03 (628)        .99 (3270)
3/5             .99 (620)         .94 (4562)
3/7             1.01 (2453)       .97 (3490)
3/8             1.01 (37)         1.01 (554)
3/9             1.03 (810)        .99 (1124)
4/5             1.01 (7159)       1.00 (5488)
4/6             1.02 (4262)       1.00 (4816)
4/7             .99 (1557)        .99 (518)
4/8             .94 (107)         .86 (60)
4/9             .97 (727)         .93 (57)
5/6             1.00 (4654)       .99 (5530)
5/7             .95 (1604)        .98 (845)
5/8             1.23 (52)         1.06 (17)
5/9             1.03 (474)        1.07 (226)
6/7             .99 (1723)        .98 (524)
6/8             1.00 (411)        .88 (99)
6/9             .78 (101)         1.12 (14)
7/8             1.01 (12016)      .99 (16697)
7/9             .98 (8603)        .98 (7224)
8/9             .99 (15592)       1.01 (12054)
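For anyone who wants to reproduce this kind of number, the chance-weighted average described above can be sketched like this. The per-player samples are made up for illustration; the real inputs would be each qualifying player's normalized UZR and chances at the position.

```python
def weighted_uzr(samples):
    """Chance-weighted average of normalized UZR.

    samples: list of (uzr, chances) pairs, one per player-season
    at a given position (hypothetical input format).
    """
    total_chances = sum(chances for _, chances in samples)
    if total_chances == 0:
        raise ValueError("no chances in sample")
    weighted_sum = sum(uzr * chances for uzr, chances in samples)
    return weighted_sum / total_chances

# Two made-up players at the same position: the one with more
# chances pulls the average toward his rating.
print(round(weighted_uzr([(1.10, 100), (0.90, 300)]), 2))
```

Note that pairs with very few chances in the table (5/8 or 6/9, for example) would give noisy averages under any weighting.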
"I'm a bit confused by the positional adjustment being based on total lwts (Off+Def) at each position. Isn't the purpose of the adjustment to deal with the fact that the Def. superlwts are (already) relative to position, while the Off. are not? Therefore, shouldn't the adjustment be in regards to the ave. Off. lwts at the position (only). Then both components would be relative to positional averages. Unless I'm mistaken (always a distinct possibility), that's what Palmer does."
David, you are not mistaken. The positional adjustments should be based on the average OFFENSIVE lwts (including baserunning, etc.) at each position. What made you think that that was not the case with superlwts and the charts that are included in the article? Even if you included defensive lwts, they would average to zero for each position anyway.
"I'm also a bit confused by Mickey's philosophy with respect to baserunner movement during batting outs. Baserunners advance on outs in several different ways, yet he chooses to include only two of these instances: GDPs and "moving a runner from 2nd to 3rd with no outs". His reason for not including SFs (in their correct weight) is that a study showed that batters fly out in SF situations at their typical rate. But another study (by the same group) showed that batters also ground out in GDP situations at their typical rate. So, I ask, why the seeming inconsistency of approach? And what about all of the other baserunner advances on outs, such as a runner scoring on a groundout? By including only the "moving a runner from 2nd to 3rd..." category, Mickey seems to be implying that this event uniquely reflects an actual ability (to ground out to the right side), while the others don't. But where is the evidence that this is true? If batters make their normal kind of outs in SF and (especially) GDP situations, why would they not also do so in this, relatively less impactful, situation? Overall, in terms of non-value-added ability, would it not make more sense to use several different out values (for GO to the left side, GO to the right side, the same for FO, and then strikeouts), and simply forget about baserunner movement?"
Long comment, long answer.
This is important, so everyone sit down and get comfortable! There are two kinds of evaluations. One is what some people call "performance" or simply "value." You can call it whatever you want, I suppose. It is the actual run (and win) contribution that a player made (past tense) to his team. We can even break down this type of evaluation into two different sub-evaluations: one would be their actual run contribution, and one would be their actual "win" contribution. (They are not the same; some runs are "meaningless," like all the runs scored after the go-ahead run is scored in a game. I put the word "meaningless" in quotations because I don't want to hear "The last run has the same value as the first," etc. You all know what I mean by "meaningless.") The latter ("win" contribution) is not so easy to evaluate and interpret as the former ("run" contribution), so let's forget about it for now. I only bring it up to illustrate that there are many ways to conceptualize an "evaluation" measure or metric. Anyway, back to the 2 basic ways to "evaluate" a player.
What is the reason for doing a "performance" type (the first method) evaluation? I can think of two (there may be more). One is to decide an MVP-type award; two is to simply apportion a team's runs among its individual players, for whatever that is worth.
So what are some of the ways to do a "performance" type evaluation? One is with regular lwts (or superlwts). This is not very good for some of the reasons that David brings up in his "criticism" of superlwts (don't worry; I can take it). Mostly it is because "all events are not created equal in an actual team contest." If a player only hits solo home runs, then clearly they were (again, notice the past tense) worth only 1.0 runs to his team, and not 1.40. Ditto for sac fly outs and some ground outs, singles, doubles, walks, etc. Using league-average values for all events, and ignoring sac flies and IBB's, etc., does not give us a real good idea of exactly how much a player contributed to his team's runs scored. OK, how about RBI and Runs Scored? While much-maligned stats, these are not too bad at measuring a player's actual run contribution to his team. In fact, the only things that are missing are events which "help" a run to score, like a single that advances a baserunner who actually scores. And of course, for a HR, RBI and Runs Scored create a double-counting problem (why DOES a player get an RBI for "driving himself in" on a HR? I guess someone has to!). Believe it or not, if I wanted to evaluate a player's actual performance, in terms of actual (not theoretical) run contribution, I would prefer RBI plus Runs Scored over lwts!
Another method for evaluating "performance" is the once-ballyhooed "value-added" lwts. Frankly, I hate this metric. It is apparently an attempt to combine performance and ability, and in doing so, it combines the "worst of both worlds." If you are not familiar with "value-added" lwts, for every individual event a player produces, rather than using a league average lwts value (like .77 for every double), you use the actual change in Run Expectancy (RE) before and after that particular event. Obviously, you need PBP data to compute value-added lwts. For example, if a player hits a HR with no one on, he gets credited with 1.0 runs. With 2 players on, 3.0 runs. For HR's, that makes a lot of sense, and is basically the same as RBI plus Runs! But what about a single? If a player hits a single with no one on and 2 outs, he gets credit for around .13 runs, while if he hits a single with a runner on first and 1 out, and the runner advances to third, he gets credit for around .68 runs. (I am using the RE tables to figure these numbers.) Well, this is silly! If, in the first example (single, no one on and 2 outs), the run never scores, why should the batter be given any credit at all? The value-added proponents will want you to believe it is because that was (notice the tense again) the theoretical value of the single. Yes it was, but... if the run didn't score, then who cares? We certainly don't if we want to measure actual performance (run contribution to a real-live team). On the other hand, if we want to measure "hypothetical" value, or "context-neutral" performance, then we would want to use the average run value of a single, which is, lo and behold, lwts! Value-added lwts is a very bad "hybrid" metric in my not-so-humble opinion!
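For readers unfamiliar with the mechanics, here is a minimal sketch of the value-added calculation for the two singles above. The RE entries are placeholders I chose so that the differences match the roughly .13 and .68 quoted in the text; they are not the actual RE table used for the article.

```python
# Run expectancy by (bases occupied, outs). Placeholder values only,
# picked to reproduce the differences quoted in the discussion.
RE = {
    ("---", 0): 0.50,
    ("---", 2): 0.10,
    ("1--", 2): 0.23,
    ("1--", 1): 0.52,
    ("1-3", 1): 1.20,
}

def value_added(before, after, runs_scored=0):
    """Change in run expectancy caused by one event, plus runs that scored."""
    return RE[after] - RE[before] + runs_scored

# Single with no one on and 2 outs.
print(round(value_added(("---", 2), ("1--", 2)), 2))
# Single with a runner on first and 1 out; the runner goes to third.
print(round(value_added(("1--", 1), ("1-3", 1)), 2))
# Solo HR: the base/out state is unchanged, so the value is the run itself.
print(value_added(("---", 0), ("---", 0), runs_scored=1))
```

Regular lwts, by contrast, would credit every single with the same league-average value regardless of the base/out state.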
What is the "correct" way, then, to apportion a team's runs scored? I'll give you the general idea, and leave the rest to you. First of all, you give no credit for anything that doesn't in some way cause a run to score (that's why a "performance" measure doesn't measure ability; if you hit a leadoff triple and don't score, tough luck!). Second (and last) of all, if a run or runs score, you figure out a nice, neat way to divvy up the responsibility! That's it! If a player hits a home run with 2 runners on, he gets most of the credit for the 3 runs. The players on base get some of the credit. How much? I have no idea!
This leads us to what I think is the most interesting and useful evaluation method. That is "ability" or "context-neutral" performance, or the "Bob Dylan" metric, or whatever you want to call it. It is the theoretical run contribution of each of a player's offensive events to a random, hypothetical team (in a random batting slot, a random inning, against a random team, with a random number of outs, and a random number of baserunners, etc., etc.). This kind of measure closely resembles what we think of as "ability" (perhaps it defines ability) AND it is generally the best way to project performance (again, on a random team, BO slot, inning, etc.). That is why I find it much more interesting and useful than a "performance" or "run contribution" measure. In fact, if I hear that Bret Boone (or whoever) contributed so-and-so runs to the Mariners in 2001, and it is a huge number, I am like, "So what? Is he really that good? How much was due to circumstance? Did he have lots of players on base to pump up his RBI?" When I get over saying "Wow! I didn't think he was that good, going into the season," all I really want to know is "How 'good' is he (compared to other players), and what will he likely do next year on the Mariners or some other team?"
In any case, given the definition (MY definition) of ability above, it just so happens that lwts, and specifically superlwts, is perfectly designed to give you the answer that you want (per the definition). If you want to evaluate something else, use a different metric!
So, for example, the above definition PRECLUDES treating sac flies as anything more or less than a regular fly out (and yes, a fly out has a different lwt value than a K or a ground ball; actually the value of the GB and FB are almost exactly the same). This is because we can only assume that on a random team, every player will have the same number of runners on third with less than 2 outs, and we believe that no player can hit more fly balls (or ALL players can hit more fly balls) in a sac fly situation. If some players can hit significantly more fly balls in a sac fly situation, then they SHOULD be given some credit for this ability, as it will "repeat" itself on a hypo, random team in the future, and it is part of those players' hypothetical run contribution to a hypo, random team. Ditto for ground balls with a runner on third and less than 2 outs.
What about GDP's? If the number of GDP's a player hit into was dictated solely by the number of GDP opportunities, as has been suggested by Palmer and others, then I would ignore the GDP as I do the sac fly. But we KNOW this is not true. First of all, the more GB's per PA a batter hits, the more DP's he will hit into (per DP opp, of course). So why not just use a separate lwt value for GB's, FB's, and K's, as David suggests? We could, and that would certainly account for this aspect of GDP propensity. But there is another aspect! That is the batter's speed to first (footspeed plus whether he is a RHB or LHB), and the speed of his ground balls! To be true to our definition of "ability" we must account for this. The only practical way, of course, is to use a player's GDP per GDP opp, as part of our measure of his "ability." That is exactly what superlwts does!
What about "moving runners over," which is the crux of David's criticism? First of all, "moving a runner over from first" is already counted in our GDP component. (I suppose I could include how often a batter is out at first and the runner advances to second, although if there is no "ability" involved there [i.e. if it is just plain old "luck"], then it violates our definition of a metric that measures ability.) Second of all, moving a runner over (in) from third is assumed to be random (luck; no "control" by the batter, other than the fact that he isn't striking out). So what is left? Moving a runner over from second! David, that is why it is the only "runner movement" I include in the superlwts components (BTW, I DO track it with 0 or 1 out, not just 0). Can't we say that moving a runner over from second, like from first or from third, is random (again, other than the fact that it is a non-K), and therefore should not be given any attention in an "ability" formula? No, for 2 reasons! One is that batters who naturally hit the ball to the right side (mostly lefties) tend to move more runners over, so even though it is not "intentional," they need to get credit for it, in order to be true to our definition of ability! Two is that some batters can and do "try" to hit the ball to the right side with a runner on second and 1 out. Since they reduce their overall production in doing so (and their traditional offensive lwts suffer), they need to get credit for "giving themselves up," as certainly they will continue to do so with a hypo, random team in the future, and it is part of their theoretical value to any team at any time! Again, as David suggests, we could simply assign different lwt values to "K's," "ground balls hit to the right side," "fly balls to the left," etc. (I thought about it); however, one problem would be that we would be "shortchanging" those players who went out of their way to make contact and hit the ball to the right side in those "runner on second/no out" situations.
Probably the only thing I should have done was to separate outs into K's and non-K's and then do the GDP adjustments. I didn't because I didn't want to make the superlwts formulas any more complicated than they already were. In any case, know that, after you separately adjust for GDP's, the K's are worth about .016 more than a non-K out, whereas, after you adjust for moving runners over, a FB and GB out are worth about the same (so there is no need to distinguish the two).
That's all folks!
David, as far as not realizing that the def. lwts total to 0 for each position, don't feel bad. I forgot the same thing when I read your post. At first, I thought, "Oops, I made a mistake" (I thought I DID add in the defensive lwts to each position's average total lwts and that this was a problem). Then I looked at the average total lwts charts and noticed that I didn't do this, and when I was writing the response, I went "Duh, it wouldn't matter anyway!"
As far as my explanation of what "matters" in superlwts, it is not that "it is my metric and I get to do anything I want." It is that I am only doing what is "commensurate" with MY DEFINITION of my metric, although there is certainly a LITTLE wiggle room in deciding WHAT is commensurate. However, MY definition is not unique or unusual. It is the same definition that is attributed to regular lwts, i.e. "ability" or "context-neutral" performance. The trick, in deciding what to include and what to ignore, is REPEATABILITY. Whatever events or situations were not average for a particular player, if they would be expected to "show up" in the future ON A RANDOM TEAM, then they MUST be included. If they would be expected to regress to a league average in the future (other than due to a small sample size in the past), then they MUST be ignored. This basically boils down to any tangible quality or skill that the batter possesses (whether "intentional" or not; it could be something innate, like whether he bats RH or LH).
That being said, if I did the formulas again, I think I might have separate values for GB and FB outs (and perhaps to the right and left sides), and K's, as you suggest, although this would present the problem I mentioned concerning "moving runners over." Of course, if I used separate values for GB and FB outs, I would only use the "speed" element of GDP's to make that adjustment...
Let's forget about runs. Let's look at wins, and the Win Expectancy Charts. Every event in baseball changes the EXPECTED win% of the game. At the start of the game, the Braves might have a 46.2% chance of beating the Yankees in NY. The leadoff hitter for the Braves hits a triple, and brings the EXPECTED chance of his team winning to let's say 51.0%. The next 3 batters get out, and each out brings the chance of the Braves winning down somewhat, to let's say 45.1%. All of this is EXPECTED chances and percentages. But it is based on ACTUAL PERFORMANCE. In terms of value/performance, this method is totally perfect.
On the other hand, if the inning had gone 1-2-3, the win% would have gone directly from 46.2% to 45.1%. These 3 outs are LESS costly to the PLAYERS, because of their specific CONTEXT. By the end of the inning, however, the sum over the 3 or 4 players will add up to the same 1.1 win% drop.
In terms of ability, this method does not work. Because ability implies "all things being equal, how would this player do". Therefore, you assume that the frequency of the "context states" is the same for all batters, and then you apply their "success rates". In the previous example, the "expected win%" is based on an "average success rate", but given the player's "frequency of states".
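Here is a small sketch of that bookkeeping. The 46.2, 51.0 and 45.1 figures are from the example above; the intermediate win% values after each out are made up, since the example only gives the endpoints.

```python
def win_credits(start_we, we_after_each_pa):
    """Credit each batter with the change in win expectancy (in win%)
    between the state before and the state after his plate appearance."""
    credits = []
    previous = start_we
    for current in we_after_each_pa:
        credits.append(current - previous)
        previous = current
    return credits

# Triple, then three outs. Values after the first two outs are hypothetical.
with_triple = win_credits(46.2, [51.0, 49.0, 47.0, 45.1])
# A 1-2-3 inning. Values after the first two outs are hypothetical.
one_two_three = win_credits(46.2, [45.8, 45.4, 45.1])

print([round(c, 1) for c in with_triple], round(sum(with_triple), 1))
print([round(c, 1) for c in one_two_three], round(sum(one_two_three), 1))
```

Both innings sum to the same net change for the team, but the individual outs after the triple each carry a bigger debit than the outs in the 1-2-3 inning.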
Mitchell is right though in that we should clearly separate the 2. And they both have their purposes in life. And as Mitchell pointed out, one is to look backward to what happened (value), and the other looks forward to what may happen (ability).
Is it not possible that certain batters have a tendency to hit more fly balls to CF than others?
If this is true, then it should be much easier for a runner to score on a sac fly to CF than to the corners because the throw is likely to be longer from CF.
Maybe the data sample is not large enough for this ability to show up, but that doesn't mean that no ability exists. Even if this ability is relatively insignificant, shouldn't you want to account for it to make your measurement of ability as accurate as possible?
My other concern is lineup position. Perhaps a hitter who bats leadoff will try to develop his on-base skills rather than his slugging skills. Since the relative value of a walk versus a home run is highest when leading off an inning, this should be accounted for. After all, whether to bat a player 3rd or 1st in the batting order is the manager's decision and does not reflect a player's ability (though of course the manager looks at the player's abilities when making this decision).
Generally speaking, how does a player's hitting style change with the particular base/out/score situation? And is this affected if he is the #1 or #7 hitter? That is, does the leadoff hitter and the #7 hitter approach the same situation (say bases empty, no outs) the same way?
Mitchell has already shown in the past how hitters do change their approach when at Coors.
We know that certain events happen more with man on 1B or with 1B being empty, but you have a man on 2B. It doesn't change the LWTS component values much.
The other one though, about moving a batter from #1 to #3 is much harder to figure out. Was he moved because of a "perceived" problem, thereby affecting the "random sample selection" process? Or was he moved simply because it made sense. Tim Raines batted #3 for a while not because he blew it as a leadoff hitter, but simply because they needed a bat (not a good reason, but whatever). This would be one of the tougher studies to research.
Just like we don't have to expect that the managers will have an offensive spectrum that is a complete opposition of the defensive spectrum, nor should we assume that the best fielders are playing at the toughest positions.
1. Your "disagreement with my assessment of sac flies" is a bit unfounded. If you read some of my comments concerning the article and my methodologies, you would see that we are on the same page here. Clearly a player's GB/FB ratio and his SO rate will affect his overall out value. Of course, a player who has more FB's relative to K's will have greater value, due to the greater frequency of sac flies. Worrying about a player's sac flies is not the way to handle this though. The best way is to simply assign an out value for K's and one for non-K's. Obviously a K has a unique out value that is less (more negative) than that of a batted ball out. There is no need, however, to distinguish between FB's and GB's. This is due to the coincidental fact that GB's and FB's have almost exactly the same out value. How can this be, you ask, since GB's include some double plays, and FB's generally do not, and FB outs generally score more runners from third than ground ball outs? Well, the primary reason that GB outs are worth almost exactly the same as FB outs, even including the GDP, is that errors occur much more often on GB's than FB's; and of course the difference between an error and an out is huge. So to repeat myself, we do not need to record how many actual sac flies a player has, since anything above or below average is simply a matter of "luck" (usually because of fewer or greater opportunities than average) anyway. You could say that certain players will have more sac flies on the average because of the average distance of their fly balls or whether they tend to hit fly balls to right or left (left field arms are generally weaker). BTW, I don't know why a previous poster said that fly balls to center are generally further from home plate. Of course they are not. Batters hit the furthest fly balls to the pull field and the shortest to the opposite field, with center field being somewhere in between.
Anyway, a batter's fly ball "tendencies" (location and distance) have a negligible effect on his sac fly frequency, and I don't think that superlwts (or any other evaluative measure) needs to take every niggling little thing into consideration. There has to be a point of diminishing returns. So, in retrospect, and with the help of some of the comments and criticisms herein and on the BaseballBoards (and I still may do this, when I update my ratings), I wish I had used one out value for all FB's and GB's and another one for K's, and then adjusted for the footspeed (and handedness) aspect of GDP's. But either way, it is definitely not necessary or correct to use, in any way, a player's actual number of sac flies or even his sac flies per opportunity, unless you think that individual players can "control" their fly ball frequency to a significant degree in a sac fly situation.
2. Adjusting for the quality of pitchers faced: I use a pitcher's component stats (s,d,t,hr,bb,so allowed) for the given season, expressed as a function of the league average in each category. For example, an average K pitcher will be 1.0 for K's, etc. On a PA by PA basis, I simply keep track of the average rating for all the stat categories of a batter's "pitchers faced." For example, if a batter happened to face a pool of pitchers during the season who had a K rating of 1.05 (high K pitchers), then his total K's would be divided by 1.05. Tango would kill me for treating each offensive category separately rather than as ratios, but I've been killed before! I don't know for sure, but my guess is that for full-time players, pitcher adjusting makes little difference. Of course, most of whatever difference it does make is due to a batter not having to face his own pitchers. (Palmer incorporates this into his park adjustment formulas; he has separate park adjustments, one for batters and one for pitchers (BPF's and PPF's). I do not, so my pitcher adjustments take care of this, and then some.)
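A minimal sketch of the pitcher adjustment for a single category (K's), assuming one rating per plate appearance. The function names and the sample ratings are hypothetical; only the "divide by the average rating faced" step comes from the description above.

```python
def average_rating_faced(ratings_by_pa):
    """Mean of the opposing pitchers' ratings, one entry per plate appearance.
    A rating of 1.0 is league average for the category."""
    if not ratings_by_pa:
        raise ValueError("no plate appearances")
    return sum(ratings_by_pa) / len(ratings_by_pa)

def pitcher_adjusted(raw_total, ratings_by_pa):
    """Divide a batter's raw total in one category by the average
    rating, in that category, of the pitchers he faced."""
    return raw_total / average_rating_faced(ratings_by_pa)

# Half the PA's against 1.10 K pitchers, half against 1.00: average 1.05.
k_ratings = [1.10, 1.00] * 250
print(round(pitcher_adjusted(105, k_ratings), 1))
```

Because the average is taken PA by PA, a pitcher the batter faced often counts more than one he saw once. The same routine would be run separately for each of s, d, t, hr, bb and so.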
3. No, I do not adjust the stolen base and caught stealings for the opposing catchers (and pitchers). I could and probably should, to be consistent with my "pitcher adjustments." How much work do you want me to do? I work AND go to law school, and do this crap for fun!
4. I do not take into consideration the handedness of players. It doesn't really matter, except for part-time or platooned players who face more opposite-handed pitchers. For them, although their total lwts (total theoretical contribution) would still be accurate and relevant to describe their ability (and value) and to project their performance, to use their prorated, per 162 games, lwts would probably overestimate their value or ability. Now that I think of it, lefties are probably overvalued in the prorated lwts category since they are often rested versus the tough lefty pitchers. Players may also have faced an inordinately high percentage of RHP's or LHP's by chance alone, so I probably should do some kind of "adjustment" for the average handedness of pitchers faced, like I do for the overall quality of pitchers faced. Thanks for the suggestion. I hadn't thought of that. Now I'll have to quit school and maybe work part-time!
5. I can't remember, but I think that each pitcher's overall normalized stats are park adjusted, then I park adjust and pitcher adjust each batter at the same time. My park adjustments are also on a PA by PA and park by park basis. I don't just park adjust a player's home stats for his home park, or half of his total stats for his home park. I park adjust according to the average (component) park adjustments of all the stadiums he has played in, according to how many PA's he has had in each park. It's probably not worth all that work, although the computer of course does most of it.
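To illustrate the park-by-park weighting, here is a rough sketch for one stat category. The park labels, factors and PA counts are made up, and dividing the raw total by the weighted factor is my assumption about how the adjustment is applied.

```python
def pa_weighted_park_factor(pa_by_park, factor_by_park):
    """Average park factor for one stat category, weighted by the number
    of plate appearances the batter had in each park (1.0 = neutral)."""
    total_pa = sum(pa_by_park.values())
    if total_pa == 0:
        raise ValueError("no plate appearances")
    weighted = sum(pa * factor_by_park[park] for park, pa in pa_by_park.items())
    return weighted / total_pa

def park_adjusted(raw_total, pa_by_park, factor_by_park):
    """Assumed application: divide the raw total by the weighted factor."""
    return raw_total / pa_weighted_park_factor(pa_by_park, factor_by_park)

# Hypothetical batter: 300 PA in a hitter's park, 100 PA in a neutral park.
pa = {"park_a": 300, "park_b": 100}
factors = {"park_a": 1.20, "park_b": 1.00}
print(round(pa_weighted_park_factor(pa, factors), 2))
print(round(park_adjusted(23, pa, factors), 1))
```

The contrast is with adjusting half of a player's stats by his home park factor, which ignores where his road PA's actually came.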
6. Like I said, errors are automatically included in the value of the out, and they should mostly be included in the value of a GB out (a little in a FB out), and not at all in the K. As far as players inducing errors because of their speed and/or aggressiveness (I assume these are the only reasons), if I see evidence or come up with evidence that this is true to a significant degree (it would not be hard to find out), then yes, it should somehow be accounted for.
7. I like that idea. It is very possible, as you suggest, that a SB may be "worth" considerably more than .17 runs, if there were a strong correlation between a high number of SB's or a high SB success rate (or a high SB-CS, as you state), and other baserunning. If there is, I suspect that the relationship would not be linear and the y-intercept of the regression curve would not be 0. For example, a 5/2 SB/CS guy might have negative value for other baserunning, while a 30/20 or 15/5 guy might have much positive value in other baserunning. So we would have to be careful how we would adjust for this. SB-CS might be just the way to go!
Any more questions or comments, save them for Part III!
:)
This is true of ALL batted balls. However, the reader was probably talking about the balls that STAY IN THE STADIUM. Of that subset, it's equally obvious that the farthest hit balls are to CF. If they moved the LF and CF and RF walls to 500 feet, then MGL's statement would be accurate.
You are right, I would have killed you for not using ratios, which I've already shown to be the accurate mathematical way to go. However, I would guess that it balances out mostly, and wouldn't expect to see much if any difference.
I second the request for the "unadjusted" stats. The only way to really move superLWTS ahead is to have the data somewhat verifiable, and that would be the first step. The second step would be to produce them along team-by-team lines.