Baseball Primer Newsblog— The Best News Links from the Baseball Newsstand
Monday, July 15, 2013
Let the games begin: Patriot’s new gig at Sports Data Research.
To many in the general population of baseball fans, the singular metric associated with sabermetrics might be On Base Plus Slugging (OPS). OPS has gained a measure of acceptance in the mainstream as an offensive metric, both in its raw form and in the park- and league-adjusted variant OPS+. Given the long history of their use in sabermetrics (both were developed by pioneering sabermetrician Pete Palmer), it is no surprise that the metrics are associated with the field and are still in common use. However, OPS has shortcomings that can be problematic for serious application:
 OPS is not expressed in meaningful units: Ideally, a metric should be expressed in units that are fundamental to the game itself (such as runs or wins) or that can be easily explained. On their own, the components of OPS do just fine by this standard. On Base Average can easily be understood as the proportion of plate appearances in which the batter reaches safely, and Slugging Average is the average number of total bases per at bat (although the use of the name “Slugging Percentage” is misleading at best). But when they are combined into OPS, it becomes impossible to articulate what the unit of measurement is or what the result is meant to represent. While a batter with a .400 OBA reaches safely 40% of the time and a batter with a .500 SLG averages one total base every two at bats, the meaning of his corresponding .900 OPS cannot be similarly stated. The best one can do is to state what its user intends it to represent–a measure of overall hitting productivity.
 OPS is not as accurate as competing overall measures of offensive productivity: Generally, OPS does a decent job of predicting runs scored on the team level, but it tends to be slightly less accurate than more refined metrics. OPS still performs credibly, but metrics based on linear weights or Base Runs perform better.
OPS could be improved by weighting OBA more heavily; studies have suggested a multiplier in the neighborhood of 1.8 to maximize correlation with team runs scored (i.e., OBA*1.8 + SLG). Doing so, though, would take away one of the strongest selling points of the metric, which is ease of calculation.
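As a rough illustration of the trade-off described above, the sketch below computes plain OPS and an OBA-weighted variant for the .400 OBA / .500 SLG batter used earlier in the article. The function names and the default weight of 1.8 are assumptions for illustration; the article only says the multiplier is "in the neighborhood of 1.8".

```python
def ops(oba: float, slg: float) -> float:
    """Plain OPS: On Base Average plus Slugging Average."""
    return oba + slg


def weighted_ops(oba: float, slg: float, oba_weight: float = 1.8) -> float:
    """OBA-weighted OPS. The 1.8 default is the approximate multiplier
    mentioned in the article, not a fitted value."""
    return oba_weight * oba + slg


# The article's example batter: .400 OBA, .500 SLG.
print(round(ops(0.400, 0.500), 3))
print(round(weighted_ops(0.400, 0.500), 3))
```

The weighted version is no harder to compute by machine, but it can no longer be done in one's head from a stat line, which is the ease-of-calculation point made above.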


Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. RMc's Unenviable Situation Posted: July 15, 2013 at 07:55 AM (#4494454)
I think OPS+ is the 'quick and dirty' tool, while other metrics are good if you are trying to get to more fine-print stuff. The extremes are easily noticed (guys who have low OBP and high SLG, or vice versa, tend to stand out). Also of note is that most players will fluctuate in any stat far more than the difference in quality between two stats; i.e., a guy with a 100 OPS+ could easily perform at a 90 or 110 level, which is a far bigger spread (I think) than OPS+ has vs. other metrics.
Raw OPS, I actually rarely use, and I really don't even have much of a frame of reference for it. .800 is doing alright, .700 is bad...in between is anyone's best guess.
I can see OPS as being an unsatisfying stat for someone getting into baseball stats for the first time. Generally stats are more acceptable when it's clear what they are trying to measure. Even something like defensive metrics where there's a degree of subjective measurements and assumptions being made, you can have a conversation about the process and be clear about where you got to and how you ended up there. But adding OBP and SLG together seems a tad arbitrary, even if it does spit out something that's decent enough for a quick and dirty number.
109: Eddie Yost (184), Chris Chambliss (70)
108: Miller Huggins (100), Duffy Lewis (39)
107: Johnny Pesky (80), Sam Chapman (32)
106: Chuck Knoblauch (105), Lance Parrish (29)
105: Ron Hunt (84), Ruben Sierra (3)
104: Willie Randolph (120), Steve Finley (31)
103: Max Bishop (89), Tony Armas (4)
102: Jeff Blauser (51), Willie Montanez (20)
101: Charlie Jamieson (46), Garry Maddox (8)
100: Willie McGee (34), Bill Buckner (41)
How much of those discrepancies are just playing time? WAR is a counting stat, OPS+ a rate stat. Just looking at the first one, Yost had 734 PA and Chambliss 455 PA in the years you cited.
I'm not sure I believe that Bill Buckner stat. According to bbref.com, Buckner didn't have an OPS+ of 100 for any single year, although he did for his entire career. And his career oWAR was 12.1. He's got a 41 in Rbat, but that's a different stat.
Those are career numbers. Yost never had a season with 184 batting Runs. Ruth never had a season with 184 batting runs.
rBat is WAR batting runs, just like he stated.
That's exactly backwards. All multiplicative methods suffer from the not insignificant problem that they assert that a home run hit by (say) Frank Thomas is far more valuable than one hit by Joe Carter. (The names used should give some idea how long I've been making this point.) (Basic runs created is AB*OBP*SLG.)
Simple way to look at it is to just take a typical year by both players and add a single 11 with a home run. To pick one year at not random, in 1995 that extra HR is worth around 1.43 for Carter and 2.02 runs for Thomas. Doesn't take long for that kind of thing to become serious.
Bill James was acknowledging that runs created had a serious issue with extreme players as far back as the mid-80s.
Now there is a way that you can mostly mitigate the problem (first advanced by Dave Tate and called marginal lineup value; it appears that James independently discovered it more than a decade later).
The idea is simple enough. Calculate runs created for a team. Calculate team runs created with a player's stats removed. Credit the player with the difference. This is a pretty fair amount of work (of course it can be automated) and it's still not as good as the better linear weight based methods.
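The three steps above can be sketched in a few lines. This is a minimal sketch, not anyone's published implementation: it uses basic runs created in the form (H + BB) * TB / (AB + BB), which equals AB*OBP*SLG when OBP is taken as (H + BB) / (AB + BB). The class name, function names, and the team and player totals are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    ab: int  # at bats
    h: int   # hits
    bb: int  # walks
    tb: int  # total bases


def runs_created(line: BattingLine) -> float:
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    denom = line.ab + line.bb
    return (line.h + line.bb) * line.tb / denom if denom else 0.0


def without(team: BattingLine, player: BattingLine) -> BattingLine:
    """Team totals with one player's stats subtracted out."""
    return BattingLine(
        ab=team.ab - player.ab,
        h=team.h - player.h,
        bb=team.bb - player.bb,
        tb=team.tb - player.tb,
    )


def marginal_runs(team: BattingLine, player: BattingLine) -> float:
    """Credit the player with team RC minus team RC without him."""
    return runs_created(team) - runs_created(without(team, player))


# Hypothetical totals, for illustration only.
team = BattingLine(ab=5500, h=1450, bb=550, tb=2250)
player = BattingLine(ab=550, h=165, bb=70, tb=300)

print(round(runs_created(player), 1))         # player's stats run through RC alone
print(round(marginal_runs(team, player), 1))  # credit in a team context
```

For an above-average hitter like the hypothetical one here, the marginal figure comes out lower than running his own line through the formula, which is the direction of the correction being described.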
In '81, Dave Parker was 8 RAA (8 RBA?); in '84 he was 11. But then playing time: in '81 he had 254 PA and 655 PA in '84. Simply because of playing time, he went from being "replacement level" (0 WAR) to a 1 WAR player, even though he wasn't actually a better player.
EDIT: And Parker was the same hitter both years: .742 OPS/105 OPS+ in '81, .738/104 in '84, which is reflected in Rbat (0 (? must be a rounding thing) both years).
Does the WAR runs batting incorporate a position adjustment? I notice that nearly all the better players are middle infielders, while nearly all the lesser ones are outfielders.
An rBat of zero is a league average hitter. Being a league avg. hitter in 655 PA is, in fact, more valuable than doing so in 254 PA.
The difference in WAR is driven by Parker fielding much better in '84: 6 in a full season, vs. '81: 7 in a half season.
In other words, there are lots of things that might cause the differences in post 6.
The entire difference, given that Parker was a worse player in '84, was playing time: 23 Rrep vs. 9 in '81.
Both statements are true. If Parker was as bad a fielder, he would have gotten many more negative fielding runs in the greater playing time in '84, which would have negated the extra replacement runs.
To make my point, imagine 2 truly average players  0 Rbat, Rdp, Rbaser, Rpos. However, player A gets 254 PA, B gets 655 (EDIT: In other words, B isn't a better player, his manager just trots him out there more often). Player A would be a 0.9 WAR player, B would be 2.2 WAR simply because of playing time. That's a bug, not a feature.
Further, from BBRef:
They are giving 20.5 runs/600 PA (give or take, depending on league/year) simply for playing time.
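The arithmetic behind the two-average-players example can be checked with a short sketch. The 20.5 runs per 600 PA figure comes from the thread; the conversion of roughly 10 runs per win is an assumption added here, and the function names are hypothetical.

```python
RUNS_PER_600_PA = 20.5  # replacement credit quoted in the thread
RUNS_PER_WIN = 10.0     # assumption: a common rule of thumb, varies by year


def replacement_runs(pa: int, per_600: float = RUNS_PER_600_PA) -> float:
    """Replacement-level runs credited purely for playing time."""
    return pa / 600 * per_600


def war_of_average_player(pa: int) -> float:
    """WAR for a player who is exactly average in every other component,
    so that replacement runs are the only nonzero term."""
    return replacement_runs(pa) / RUNS_PER_WIN


for pa in (254, 655):
    print(pa, round(replacement_runs(pa), 1), round(war_of_average_player(pa), 1))
```

Under these assumptions the 254 PA and 655 PA players land at about 0.9 and 2.2 WAR, matching the figures quoted in the example.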
No, it's a feature. If you are measuring value above replacement, then more time playing above replacement will have more value. You are free to ignore the stat and go with an average based rather than replacement based system, but to criticize a replacement based system because it values average level playing time is silly.
Yes they are. And if your performance is below replacement value, you will rack up more negative components than the playing time bonus can compensate for, and you will end up negative overall. I don't see why that is so difficult.
Not even close. It correctly measures league average playing time as having value. By definition, half of the players are below average. It's far from a given that a manager will have league average or above players at every position. Every year teams lose playoff spots because of a lack of league average production somewhere. If the 2011 Red Sox could have replaced just one of Drew, Lackey, or Wakefield with a league average replacement, it indeed would have been "over".
I don't see the problem with that. WAR is a counting stat; of two players with equal rates, the one with more playing time should have the higher numbers.
I chose pairs w/ similar number of career plate appearances, with the exception of Ruben Sierra, who had quite a bit more than Ron Hunt. So basically none of those discrepancies are just playing time.
This underestimates the value of OBP. (The precise value of the two is a function of offensive context: the higher the offensive context, the greater the relative importance of OBP.) OBP*1.7 + SLG is a good general rule.
Bishop for instance is much better at avoiding outs than Armas. Etc.
Randolph not only has a higher OBP, he reached base on error about 1/3 more often. And Finley has quite a few more IBBs. Simple metrics like OPS+ treat all walks as having equal value, but in fact intentional walks are not as valuable as the ordinary variety.
It's (OBP/lgOBP) + (SLG/lgSLG) - 1, then multiplied by 100. The ratio of league SLG to league OBP is usually around 1.2:1. (This year it's .401 to .317, which is about 1.26:1.)
How is a manager being an idiot for playing an average player more? Average is pretty damn good.
They divide it by a number that's smaller than lgSLG by a factor of roughly 1.2, which effectively multiplies it by 1.2. But the value of that ratio depends on league context: in the 1908 NL (lgOBP .299, lgSLG .306), the 1.2 turns into 1.02. In the 2000 AL (.349/.443), it's closer to 1.3.
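The context dependence is easy to see by computing lgSLG/lgOBP directly for the league lines quoted in the thread. The function name and dictionary labels are hypothetical; the numbers are the ones given above, with "2013" standing for "this year" at the time of posting.

```python
def implied_obp_weight(lg_obp: float, lg_slg: float) -> float:
    """Effective weight OPS+ puts on OBP relative to SLG: lgSLG / lgOBP."""
    return lg_slg / lg_obp


# (lgOBP, lgSLG) pairs as quoted in the thread.
contexts = {
    "1908 NL": (0.299, 0.306),
    "2000 AL": (0.349, 0.443),
    "2013": (0.317, 0.401),
}

for label, (lg_obp, lg_slg) in contexts.items():
    print(label, round(implied_obp_weight(lg_obp, lg_slg), 2))
```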
EDIT: Cokes of course.
So I did a regression where each guy's OPS+ was the dependent variable and OBP and SLG were the independent variables, using 20 players.
OPS+ = 311.01*OBP + 271.75*SLG - 105.64
The r-squared was .994 and the standard error was 1.44. This makes OBP about 14.5% more important than SLG.
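The "14.5% more important" figure is just the ratio of the two fitted coefficients, as the sketch below shows. The coefficients are the ones posted above; treating the constant as subtracted is an assumption (the sign appears to have been lost in the post), and the function name is hypothetical.

```python
OBP_COEF = 311.01
SLG_COEF = 271.75
INTERCEPT = -105.64  # assumption: the constant is subtracted


def fitted_ops_plus(obp: float, slg: float) -> float:
    """OPS+ predicted by the posted regression line."""
    return OBP_COEF * obp + SLG_COEF * slg + INTERCEPT


# Relative weight of a point of OBP versus a point of SLG.
ratio = OBP_COEF / SLG_COEF
print(round((ratio - 1) * 100, 1))  # percent by which OBP outweighs SLG

# Sanity check on a roughly league-average line (hypothetical .320/.400).
print(round(fitted_ops_plus(0.320, 0.400), 1))
```

With a negative constant, a hitter near league average comes out close to 100, which is what one would expect from a fit to OPS+.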
If OBP is 30% above average and SLG 10%, you get 140
If OBP is 10% above average and SLG 30%, you get 140
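Both cases can be checked against the unadjusted OPS+ formula, 100 * (OBP/lgOBP + SLG/lgSLG - 1). This sketch leaves out park adjustment, and the league averages of .320 and .400 are hypothetical placeholders; any league averages give the same answer because only the ratios matter.

```python
def ops_plus(obp: float, slg: float, lg_obp: float, lg_slg: float) -> float:
    """Unadjusted OPS+: 100 * (OBP/lgOBP + SLG/lgSLG - 1). No park factor."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


LG_OBP, LG_SLG = 0.320, 0.400  # hypothetical league averages

# OBP 30% above average, SLG 10% above.
print(round(ops_plus(1.30 * LG_OBP, 1.10 * LG_SLG, LG_OBP, LG_SLG), 1))
# OBP 10% above average, SLG 30% above.
print(round(ops_plus(1.10 * LG_OBP, 1.30 * LG_SLG, LG_OBP, LG_SLG), 1))
```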
But we got here when discussing big differences between the rbat of two players with roughly equal playing time and the same OPS+.
When you use more sophisticated metrics the first player in your example will almost certainly have more rbat than the second one.
The reason for that is that OPS+ undervalues OBP. The form of linear weights used by WAR doesn't.
There is also the issue that the form of linear weights used by WAR is almost certainly overfitted. The weights for each event are calculated for each league/season. And some of the difference could be what amounts to strings of noise (but it's not likely a huge factor in Matt's listed players)
Well, the No. 1 Shortcoming is that nobody involved in sabermetrics thinks of OPS as an "Advanced Metric".
Express the difference between linear weight based methods and runs created, in terms of accuracy. Remember to show your work for partial credit.
Correct, but because the league average of OBP is lower, 10% above lgOBP is a smaller increase in raw numbers than 10% above lgSLG.
Let's use your original assumption of lgOBP and lgSLG. The first guy, who is 20% above in both, has OPS components of .384 and .480, an OPS of .864. The other two guys have OPSes of .856 and .872. So the three guys, all with an OPS+ of 140, have raw OPS of .856, .864, and .872.
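The three raw OPS figures follow directly from the league averages. In this sketch the league averages of .320 OBP and .400 SLG are inferred from the .384 and .480 components being 20% above league; the function name is hypothetical.

```python
LG_OBP, LG_SLG = 0.320, 0.400  # inferred: .384 / 1.2 and .480 / 1.2


def raw_ops(obp_rel: float, slg_rel: float) -> float:
    """Raw OPS for a hitter whose OBP and SLG are the given multiples
    of league average (e.g. 1.3 means 30% above league)."""
    return obp_rel * LG_OBP + slg_rel * LG_SLG


# Three hitters who all come out to a 140 OPS+.
for obp_rel, slg_rel in ((1.3, 1.1), (1.2, 1.2), (1.1, 1.3)):
    print(obp_rel, slg_rel, round(raw_ops(obp_rel, slg_rel), 3))
```

The hitter whose edge is concentrated in SLG ends up with the highest raw OPS, because a 10% bump on the larger league number is worth more raw points.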
That makes sense, thanks.
If the league OBP is .330 and the league SLG is .400, we have pretty close to a 1.2 ratio (1.21). If a league average hitter raises his OBP to .363, his OPS+ will be 109 (keeping SLG at .400). But if we keep OBP at .330 and we raise his SLG to .433, his OPS+ is about 107.3. So the increase in OPS+ is about 22% higher in the first case. So I think I understand what you guys are saying.
Last time I checked, the standard error (at the team level) was around 15 runs for linear weights, about 27 runs for a properly weighted OBP+SLG. And runs created didn't score a heck of a lot better. Low to mid 20s IIRC.
I know Tango has done some work on this much more recently than I have. There's also something (by Nate Silver IIRC) over at BP.
As for showing the work, I'd encourage you to do it yourself. You can still get a copy of the Lahman database at baseball1.com last time I checked. In the past I've chosen to start at 1955 because that's the point at which you get the full dataset for the advanced versions of runs created. (I don't recall if it has ROE now. Don't think so. At least that's where I got derailed last time.)
I'm pretty much at this level, and don't find the "higher math" all that interesting/understandable and am unsure of the accuracy and usefulness, but perhaps this thread can address one of my longtime concerns. My understanding is OPS+ and some other stats use Park Factors to adjust for players from different years and different settings. But isn't a single park factor somewhat misleading for those parks that play notably differently depending on whether you're batting right-handed or left-handed, such as Old Yankee Stadium, perhaps most famously, and many others to some degree? Wouldn't separate park factors be more accurate and make OPS+ more accurate, too?
As I've said on many occasions, park adjustments work well for value, and that's what all park-adjusted stats are measuring.
In terms of his value to the Yankees it doesn't really matter that Joe DiMaggio (to name one prominent example) was not particularly well suited to playing in Yankee Stadium. The runs he created (and saved on defense) and the overall offensive context is what matters in measuring value.
Feel free to argue that DiMaggio would probably have been more valuable in other contexts. It's a hypothesis that has a fair amount of support. Not every change in offensive context affects all players equally but from the point of value it simply doesn't matter.
There's obviously an extremely high correlation between value and ability but the best we can do is to use the former as a proxy for the latter and then try to adjust. Dead easy to justify because the standard error for any lengthy career will not be small. Probably in the range of 4 wins for a 16 year career.
Out of, what, an average of 700 runs scored?
If batting average is your frame of reference, Don Malcolm used to divide OPS by 3 for a quick and dirty EQA equivalent.
This is one problem with all of these "let's put it on the 'scale' of X" (whether that's BA, OBP, whatever).
Current average AL BA is 256 ... in 2003 it was 267; average OPS is now about 730, then it was about 770. An 800 OPS now is about a 120 OPS+ while it was about a 108 in the good old days. At the height of sillyball (2000), AL average OPS was 792 (on a 276 BA) so an 800 was really not special. Back in 77 when I still had dreams the average BA was 266 but the average OPS just 735 ... so 2003 BA with 2013 OPS.
So I've been around long enough to know that the answer to "what is a good batting average" is "it depends on the context." Since it's hard enough to keep track of that context just to stay on top of "what is a good batting average", I don't want to have to remember what a "good batting average" was according to BP back whenever they cooked up EQA ... or wOBA which I think has an "average" OBP scale of 330 when average these days is around 320.
If you're gonna monkey around that way, just stick with the "100 is average" idea or, better yet, z-scores!
The other main problem with these cute scales is that they don't scale in a statistically proper way, which is to center around the mean and divide by the standard deviation. That is, not only did the average OBP used to be 330, not 320, but the standard deviation on that 330 OBP was somewhat higher. (Maybe not enough to make a difference, but do things right if you wanna get "fancy".)
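Centering on the mean and dividing by the standard deviation looks like this in practice. This is a minimal sketch using the population standard deviation; the function name and the list of OBP values are hypothetical, not real league data.

```python
from statistics import mean, pstdev


def z_scores(values: list[float]) -> list[float]:
    """Center each value on the mean and divide by the population
    standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - m) / s for v in values]


# Hypothetical OBP figures for a handful of hitters in one league-season.
obps = [0.290, 0.310, 0.320, 0.335, 0.360, 0.400]
for obp, z in zip(obps, z_scores(obps)):
    print(obp, round(z, 2))
```

Because the result is expressed in standard deviations from that season's mean, the same z-score means the same thing in a high-offense year as in a low-offense one, which is the property the rescaled "looks like a batting average" stats lack.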
Actually I think I'm from some sort of transitional generation. Batting average was never really my frame of reference either, (or perhaps it was and I've just forgotten it). I'm almost entirely sucked into the "100 is average" dynamic at this point.
Well no, but I can recall getting a copy of the first Mac, noticing how many HOFers seemed to do very well in both OBP and SLG and spending hours writing in OPS on the team pages.