Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Thursday, November 10, 2022

MLB Officially Starts a War on WAR

In The Athletic’s report it said, “Agreeing to a system that keeps the best players under team control, and at a set scale of pay, for potentially a longer period of time than six years — the current time it takes to get free agency — could lessen those players’ earnings in the long run. And, if the top-earning players in the sport don’t have a way to grow their salaries, then other players’ salaries also might not grow over time.”

While that didn’t ultimately come to pass in the newly agreed to CBA, Major League Baseball has now introduced a new statistic.

Enter aWAR.

Currently there is bWAR, which refers to Baseball Reference’s calculation, and fWAR, which refers to Fangraphs’. aWAR, as described by MLB, is a straightforward average of the two numbers. It is literally defined as “average of fWAR and bWAR.”
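As a minimal sketch of that definition, the average can be written in a couple of lines. The function name `awar` and the rounding to one decimal are assumptions for illustration; the memo's exact rounding rule is not given here. The example inputs are the Aaron Judge figures quoted in the comments below.

```python
def awar(fwar: float, bwar: float) -> float:
    """Plain average of the Fangraphs and Baseball Reference WAR figures."""
    return (fwar + bwar) / 2


# Aaron Judge's 2022 figures as quoted in the reader comments.
judge_awar = awar(fwar=11.4, bwar=10.6)
print(round(judge_awar, 1))
```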

The immediate problem here is the nuance. The two calculations are not the same because each company weighs certain aspects of performance differently. A player could be rated better by one or the other, and could therefore use the higher number as a negotiating tactic. With this being sent out in a memo as an official statistic, MLB has effectively sought to implement its WAR proposal within the constraints of arbitration.

As players look to file at a higher number than their team may view them worthy of, the argument on the team’s side can be made officially around the concept of an accepted aWAR statistic. Of course teams could’ve done this on their own previously, but it would’ve been a hypothetical suggestion with no one having to adhere to the aWAR principle.

 

RoyalsRetro (AG#1F) Posted: November 10, 2022 at 10:37 PM | 48 comment(s)
  Tags: war

Reader Comments and Retorts


Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. SoSH U at work Posted: November 10, 2022 at 11:25 PM (#6105093)
A player could be seen better by one or the other, and therefore have that as a negotiating tactic to their advantage.


Couldn't the team have countered with the lower WAR figure in its case?

It may be inelegant, but this proposal doesn't strike me as inherently unfair to the players.
   2. cardsfanboy Posted: November 10, 2022 at 11:44 PM (#6105094)
aWar is a pretty decent compromise and easy for MLB. They are basically saying that the community has accepted two versions of war as legitimate, they don't always agree, so instead of choosing a side at this point in time, let's use both.

For hitters it's probably not going to matter much; for pitchers it could be noticeable. I personally prefer bWar for evaluating players after the fact, but fWar is probably a bit more accurate for pitchers going forward, so it really matters whether you want to gauge a player's worth by the value he actually brought to the table or by his future value and his value in a neutral environment. Ultimately, combining them is a nice, safe, easy way to acknowledge the work of outside parties and accept the standards they produced without choosing sides.
   3. John Northey Posted: November 11, 2022 at 12:33 AM (#6105095)
Sure beats what they used to use for ranking free agents as class A/B/C whatever. Here is how it was done - a very convoluted system that you can go read up on and see if you can make sense of it. Stats used varied by position but could include PA, AVG, OBP, HR, RBI, Fielding percentage, Total chances at designated position, Assists (catchers only). Meanwhile pitchers were based on Starters: Total games (total starts + 0.5 * total relief appearances), IP, Wins, W-L Percentage, ERA, Strikeouts, Relievers: Total games (total relief appearances + 2 * total starts), IP (weighted slightly less than other categories), Wins + Saves, IP/H ratio, K/BB, ERA.

They tried, I'll give them that, but geez. All to determine if a guy cost you 1 pick (type A) while the team losing got 2 (that lost pick plus a pick between rounds 1 and 2), 1 pick (type B) that goes to the team losing the player, or just got the team losing him a sandwich pick between later rounds (2 and 3 iirc) (type C)
   4. It's Spelled With a CFBF, But Not Where You Think Posted: November 11, 2022 at 12:38 AM (#6105096)
Can't Fangraphs and B-Ref meet at a Church Council or synod or something and hash out a compromise version of WAR so we only have one?
   5. Gold Star - just Gold Star Posted: November 11, 2022 at 01:28 AM (#6105097)
Can't Fangraphs and B-Ref meet at a Church Council or synod or something and hash out a compromise version of WAR so we only have one?
Summon the elders!
   6. Cooper Nielson Posted: November 11, 2022 at 04:00 AM (#6105101)
aWar is a pretty decent compromise and easy for MLB. They are basically saying that the community has accepted two versions of war as legitimate, they don't always agree, so instead of choosing a side at this point in time, let's use both.

Yeah, that's what makes this headline so weird. It's more like MLB officially ended a war on WAR.
   7. Hombre Brotani Posted: November 11, 2022 at 04:21 AM (#6105102)
I wonder how FG and BR feel about this.
   8. Dolf Lucky Posted: November 11, 2022 at 06:40 AM (#6105103)
They’re quants. They don’t have feelings.
   9. Barry`s_Lazy_Boy Posted: November 11, 2022 at 10:25 AM (#6105111)
I like the compromise. Free agency has had Elias Sports Bureau underpinnings forever, so using an outside entity isn't new.

The headline really stinks. If anything, it should be "MLB embraces the WAR".


   10. Rally Posted: November 11, 2022 at 11:23 AM (#6105116)
Article says:

The problem with using Fangraphs’ WAR valuation to determine paychecks is that baseball owners are then placing importance on an outside entity to control the livelihood of their workforce.


So MLB got around this problem by using two outside entities.

From an analytical standpoint I agree with using the average of the two. No problem at all in referring to one or both measures in negotiations or arbitration hearings over pay. But I'd be a lot more hesitant to use it in a formula that determines pay.

I thought that at this point the differences for position players really came down to fielding, one uses UZR, the other DRS. But looking at Aaron Judge, who comes out to 11.4 or 10.6, or an average of 11.0, fielding is just one part of it.

BBREF
Bat +80
Run +2
GIDP -2
Field +3
Pos -2
Rep +23

Runs above replacement +104
WAR 10.6

Fangraphs
Bat +84
Run +2
Field +5
Pos -4
Rep +21

Runs above replacement +109
WAR 11.4

It's been a long time since I've looked under the hood on these. Baseball-reference WAR is based on my work. They still credit me on the site for the framework. But that was more than 10 years ago, a lot of changes have been made since.

The batting runs are mostly similar, based on the same concepts, but calculated a bit differently. It looks like a 4-run difference, but might be a 6-run difference, because of the grounded into double plays. While BB-ref shows those separately, I'm not sure if Fangraphs rolls those into batting runs, or just ignores them. The difference could come down to the method of calculation, or maybe a difference in adjusting for park factors.

Baserunning is the same, I don't know how close that matches for all players. Might be a coincidence. My expectation is that the formulas are mostly centered on the same concepts but might be constructed differently and you'll see some differences for players.

Fielding is very close, this is where I expected to see much of the difference but it's not much. Statcast runs saved has Judge at +3. I'm a little surprised MLB didn't come up with a WAR version that uses this. In Judge's case it doesn't make much of a difference, but Statcast is the best fielding system out there. The others make assumptions and depend on larger sample size to be meaningful, whereas Statcast knows how difficult each play is and can reasonably tell you things at the micro level like the game-saving catches Nick Castellanos made in the playoffs were indeed the 4 highest rated catches he made all season.

Position adjustment: I thought that several years ago Fangraphs, BBref, and Prospectus had come to some agreement to make their WARs more consistent. Maybe position adjustment wasn't part of this. Last year Judge played 632 innings in center, 491 in right, and was the DH for 25 games. I thought we'd get more consistent position ratings from that input.

Rep: Now on this one, I really think we should be seeing consistency. Maybe position adjustment wasn't a part of the grand agreement, but I do remember that the decision to set replacement level at .294 (a 47-115 team) was part of it. I suppose there could be differences on how to allocate that, maybe one site is calculating the strength of the AL and NL differently, or allocating more to pitchers and less to hitters.

And finally we are seeing different runs to wins calculations. Fangraphs is showing Judge as 5 runs better than the BBref calculation, but sees fewer runs to make a win, so Judge ends up 0.8 wins better.

Once we get to pitchers, well, the whole concept is drastically different. So for that one, averaging the 2 methods makes sense. For hitters, using the most accurate fielding metric available, which MLB already owns, and making a decision on the position adjustments, replacement level, and runs to wins would go a long way towards giving you the best WAR number.
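A rough sketch of the arithmetic in the comment above: sum the listed components, then back out the runs-per-win each site is implicitly using. All numbers are copied from the comment. Because the components are already rounded, the Fangraphs list adds to 108 rather than the reported 109, which is presumably a rounding effect. The .294 replacement level is converted to wins over a 162-game schedule for comparison with the "47-115" shorthand.

```python
# Component run values copied from the comment above (already rounded).
bbref = {"Bat": 80, "Run": 2, "GIDP": -2, "Field": 3, "Pos": -2, "Rep": 23}
fangraphs = {"Bat": 84, "Run": 2, "Field": 5, "Pos": -4, "Rep": 21}

# Reported runs above replacement (RAR) and WAR for each site.
reported = {
    "BBREF": {"rar": 104, "war": 10.6},
    "Fangraphs": {"rar": 109, "war": 11.4},
}


def implied_runs_per_win(rar: float, war: float) -> float:
    """Runs per win implied by a reported RAR and WAR pair."""
    return rar / war


for site, components in (("BBREF", bbref), ("Fangraphs", fangraphs)):
    component_sum = sum(components.values())
    rpw = implied_runs_per_win(**reported[site])
    print(f"{site}: components sum to {component_sum}, "
          f"reported RAR {reported[site]['rar']}, "
          f"implied runs per win {rpw:.2f}")

# Replacement level of .294 over a 162-game schedule.
replacement_wins = 0.294 * 162
print(f"Replacement-level team: about {replacement_wins:.1f} wins")
```

The implied runs-per-win comes out lower for Fangraphs than for BBREF, which matches the point that the run gap and the win gap are not proportional.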
   11. Rally Posted: November 11, 2022 at 11:41 AM (#6105117)
Looking at the Fangraphs definitions, position adjustment is -7.5 for RF, +2.5 for CF, and -17.5 for DH.

Assuming that is per 162 games, I verified the calculation for Judge at -4.1

For BBref I found the current values are -7 for RF, +2.5 for CF, and -15 at DH, and these are per 150 games. Using that I come out with -3.9 runs. That is based on defensive innings for outfield, and games at DH. I see in the documentation though that BBref uses plate appearances to do this calculation.

Judge had 335 in CF, 246 in RF, 111 as DH (plus 4 as a pinch hitter, which I'm ignoring). That works out to a very similar -3.8 runs.

So unless I'm missing something, the position adjustment BBref shows for Judge is wrong, and not matching up with the documentation.
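The innings-based version of this check can be sketched as follows. The run values and Judge's playing time are the ones quoted in comments 10 and 11; the function name and the 9-innings-per-game conversion are assumptions made for the sketch. The plate-appearance version is not reproduced because the per-150-game PA basis is not stated in the thread.

```python
def position_adjustment(of_innings, dh_games, run_values, games_basis):
    """Prorate per-season positional run values by playing time.

    of_innings:  defensive innings by fielding position
    dh_games:    games at DH
    run_values:  runs per `games_basis` games at each position
    games_basis: 162 for the Fangraphs values, 150 for the BBref values
    """
    innings_basis = games_basis * 9  # assumption: 9 defensive innings per game
    runs = sum(run_values[pos] * inn / innings_basis
               for pos, inn in of_innings.items())
    runs += run_values["DH"] * dh_games / games_basis
    return runs


judge_innings = {"CF": 632, "RF": 491}
judge_dh_games = 25

fg_adj = position_adjustment(
    judge_innings, judge_dh_games,
    {"CF": 2.5, "RF": -7.5, "DH": -17.5}, games_basis=162)
br_adj = position_adjustment(
    judge_innings, judge_dh_games,
    {"CF": 2.5, "RF": -7.0, "DH": -15.0}, games_basis=150)

print(round(fg_adj, 1), round(br_adj, 1))
```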
   12. Walt Davis Posted: November 11, 2022 at 12:20 PM (#6105123)
Thanks Rally.

Isn't some "official" version of WAR cooked into the CBA for the pre-arb bonus pool? Is this different?
   13. The Duke Posted: November 11, 2022 at 12:35 PM (#6105128)
Is there a reason why MLB doesn't come up with its own "official" advanced statistics ? Why wouldn't it want that ? Why allow articles to be written using varying maths? Like the comment on fielding above, why wouldn't it want the best raw data driving WAR?

Or maybe Elias has one that the teams use ? Is there a "dark web" for stats which exist for the insiders that no one else can see ?
   14. Rally Posted: November 11, 2022 at 01:23 PM (#6105139)
There is a source of stats (more like 30) that only insiders can see. But they aren't shared between the 30 teams.

Isn't some "official" version of WAR cooked into the CBA for the pre-arb bonus pool?


I kind of got the impression from the article that this was it, for the pre-arb bonus. I'm not aware of anything else being used.
   15. Rally Posted: November 11, 2022 at 01:31 PM (#6105141)
I think I figured out the discrepancy in the position adjustment values. BBref mentions they have a step to adjust the league total in that to zero.

Using their position adjustments:
c +9
ss +7
2b +3
cf +2.5
3b +2
rf -7
lf -7
1b -9.5
dh -15

The sum total for a team, per 150 games, is -15. So now they have to give back 15/9, or 1.67 runs, to each position. That explains why Judge rounds to -2 instead of being shown at -3.9 or -3.8. In the past, only the AL came out substantially negative; the position adjustment for the pitcher batting made the NL total positive. The universal DH changes everything.
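The zeroing step can be sketched like this. The nine values are copied from the list above; as typed, they sum to -15. Spreading the shortfall evenly over the nine lineup spots is the simplification described in the comment, not necessarily BBref's exact procedure, which may weight by playing time.

```python
# Per-150-game position adjustments as listed in the comment.
values = {
    "C": 9.0, "SS": 7.0, "2B": 3.0, "CF": 2.5, "3B": 2.0,
    "RF": -7.0, "LF": -7.0, "1B": -9.5, "DH": -15.0,
}

team_total = sum(values.values())
give_back = -team_total / len(values)  # even split across nine spots (assumption)

# Shift every position by the same amount so the team total is zero.
adjusted = {pos: runs + give_back for pos, runs in values.items()}

print(f"team total: {team_total}, give back per position: {give_back:.2f}")
print(f"adjusted total: {sum(adjusted.values()):.2f}")
```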

   16. cardsfanboy Posted: November 11, 2022 at 01:33 PM (#6105142)
Can't Fangraphs and B-Ref meet at a Church Council or synod or something and hash out a compromise version of WAR so we only have one?


I don't see how, you would have to get two distinct viewpoints to agree on what they want war to represent. At least on the pitcher side it's clear the two different opinions and what they represent, and no matter how much someone argues for it, I can't see anyone who supports bWar accepting a theoretical like Fip over an actual event as grading for a pitcher. War is a value stat, it needs to be based upon actual results (same with catcher framing for defense)

As long as people citing the two stats understand where the largest gaps are coming from then it's fine to have both.
   17. Jaack Posted: November 11, 2022 at 01:42 PM (#6105149)
The multiple models is more of a positive than a negative I think - no WAR value is going to be 100% accurate, and no formula will be as straightforward as something like the formula for batting average.

But I think for things like park factors/pitcher-hitter allocation/positional adjustments it would probably be better to simply agree on a number like they did with replacement level.
   18. Rally Posted: November 11, 2022 at 01:43 PM (#6105150)
For the 130 players who qualified for the batting title, I looked at the difference in run value for the WAR components, and then the standard deviations.

The StDev by component:

Bat 2.6
baserunning 2.1
fielding 6.3
position 0.9
replacement 0.8

The list of biggest WAR differences is dominated by guys whose fielding ratings are drastically different.

Like Brendan Rodgers (4.3 bWAR, +22 def) compared to (1.7 fWAR, +2def). On the other side is Sean Murphy (3.5 bWAR, -4 def) and (5.1 fWAR, +15 def).
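The calculation itself is just a standard deviation of per-player differences. A minimal sketch follows, using only the three fielding pairs quoted in this thread (Judge, Rodgers, Murphy), since the full 130-player data is not included here. With three hand-picked extreme cases the result is not comparable to the 6.3 reported above; it only shows the mechanics.

```python
from statistics import stdev

# (BBref fielding runs, Fangraphs fielding runs) as quoted in the thread.
fielding = {
    "Aaron Judge": (3, 5),
    "Brendan Rodgers": (22, 2),
    "Sean Murphy": (-4, 15),
}

# Per-player difference between the two sites' fielding values.
diffs = [bref - fg for bref, fg in fielding.values()]

# Sample standard deviation of those differences.
spread = stdev(diffs)
print(diffs, round(spread, 1))
```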
   19. Rally Posted: November 11, 2022 at 02:03 PM (#6105152)
On the pitching side it's a philosophical difference, nothing that can be found in the details of how the calculations are done.

Looking at the biggest differences, what stands out is that two of them were teammates. Kevin Gausman and Alek Manoah pitched in the same parks, against the same opponents, and with the same defense behind them.

Manoah faced 61 more batters, recorded 66 more outs, and allowed 17 fewer runs. His BBref WAR of 5.9 is about double that of his teammate, 3.0.

But Gausman allowed one fewer homer, walked 23 fewer batters, hit 14 fewer with pitches for an advantage of 37 in non-hit baserunners, and struck out 25 more hitters. So Gausman leads in fWAR by 5.7 to 4.1.
   20. Karl from NY Posted: November 11, 2022 at 02:21 PM (#6105155)
I can't see anyone who supports bWar accepting a theoretical like Fip over an actual event as grading for a pitcher. War is a value stat, it needs to be based upon actual results


FIP is real events, it's HR, BB, HBP, K.
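For reference, the commonly published FIP formula is (13*HR + 3*(BB+HBP) - 2*K) / IP plus a league constant. The sketch below applies the numerator to the Gausman-minus-Manoah differences quoted in comment 19. The league constant and both pitchers' innings totals are not given in the thread, so the full `fip` function takes them as parameters and is not evaluated here; the function names are illustrative.

```python
def fip_numerator(hr: int, bb: int, hbp: int, k: int) -> int:
    """Weighted sum of the events FIP looks at."""
    return 13 * hr + 3 * (bb + hbp) - 2 * k


def fip(hr: int, bb: int, hbp: int, k: int, ip: float, constant: float) -> float:
    """FIP on an ERA-like scale; `constant` is league- and year-specific."""
    return fip_numerator(hr, bb, hbp, k) / ip + constant


# Gausman relative to Manoah: 1 fewer HR, 23 fewer BB, 14 fewer HBP, 25 more K.
delta = fip_numerator(hr=-1, bb=-23, hbp=-14, k=25)
print(delta)
```

A negative difference in the numerator favors Gausman, which is the direction of the fWAR gap described above.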
   21. Hank Gillette Posted: November 11, 2022 at 04:58 PM (#6105164)
They should have called it “dWAR” since d is the average of b and f.
   22. Darren Posted: November 11, 2022 at 05:09 PM (#6105167)
Are they going to use whatever formulas are used for fWAR and bWAR right now or are they going to keep using whatever BBref and Fangraphs publish? The former seems less problematic.
   23. The Duke Posted: November 11, 2022 at 05:51 PM (#6105175)
Rally,

I did not know you were there when WAR was birthed. I'm not a detail guy but I understand the general premise, struggle with the math.

A few questions:

1. If you were starting over today is the methodology used still mostly "the best" or is something like winshares ( just an example) a better way to go?
2. I still struggle with 1B getting such a negative adjustment. 1B touch so many balls in play and when I watch guys like Paul Goldschmidt make a couple plays a game that are really good, I struggle. I get DH and the OF but not 1B.
3. How can WAR be more easily explained to the novice? I look at bbref and think why can't I just add dWAR and oWAR and get a simple answer? I come from an FP&A world in finance, and having to have a cheat sheet to understand what is being presented to you is always frowned upon. WAR feels like it is intentionally made hard to understand.


   24. cardsfanboy Posted: November 11, 2022 at 05:52 PM (#6105176)
FIP is real events, it's HR, BB, HBP, K.


And estimating what should have happened in every other result (which is generally around 65% of the remaining events)
   25. cardsfanboy Posted: November 11, 2022 at 06:10 PM (#6105177)
Rally of course will do a better job of answering, but on one of your questions.


I still struggle with 1B getting such a negative adjustment.

I'm in the same boat as you, but it's not about the negative adjustment; the math works for that. The issue is that rField (or whatever other defensive numbers they are using) probably underrates the defensive value that first basemen have. Mind you, if they widen the rField spread for first basemen, which would effectively subtract rField from the rest of the infielders, the negative adjustment will probably actually go up, since the defensive numbers are based on average and a wider range of potential rField values for first basemen means the positional adjustments will be modified.

So you are really asking for a modification to rField to better quantify and split defensive plays from giving full credit from the guy fielding/throwing the ball, to giving some credit to the guy receiving the ball. I don't know enough about current versions of rField to even know if that is a factor in their equation and it might be or it could be in the future.

   26. The Duke Posted: November 11, 2022 at 06:54 PM (#6105185)
Yes, I think. :)
   27. Rally Posted: November 11, 2022 at 07:52 PM (#6105187)
1. If you were starting over today is the methodology used still mostly "the best" or is something like winshares ( just an example) a better way to go?
2. I still struggle with 1B getting such a negative adjustment. 1B touch so many balls in play and when I watch guys like Paul Goldschmidt make a couple plays a game that are really good, I struggle. I get DH and the OF but not 1B.
3. How can WAR be more easily explained to the novice? I look at bbref and think why can't I just add dWAR and oWAR and get a simple answer? I come from an FP&A world in finance, and having to have a cheat sheet to understand what is being presented to you is always frowned upon. WAR feels like it is intentionally made hard to understand.


1. Win shares was out before I did anything with WAR. I think there are a lot more problems with win shares, for example how much credit is given to pitchers. In 1984, if Dwight Gooden’s win shares were about equivalent to his right fielder, Darryl Strawberry, I could buy it. But they weren’t. Gooden has the same win share total as George Foster for 1984. I think the framework holds up well, measure everything in runs, convert to wins at the end. Something like catcher framing wasn’t there at the beginning, but could easily be added once you can measure it. Maybe add in 1B scooping, or infielder tag skills if we can figure out how to measure it.
2. 1B are involved in more plays, but most of the time they just have to catch the ball. They field fewer balls than the other infielders, and rarely have to make a strong throw. Some 1B are very good at their jobs. For the most part, players move to first base because they can’t handle tougher positions. It’s the place where you put the big bat who doesn’t have the defensive skills to play at other spots.
3. Well, I could finish my book on WAR, but I’ve made little progress on that for a while and it's not close to being done. I agree with you on dWAR and oWAR. I would prefer not to present it like that, but if forced to I’d at least make it so they add up.
   28. Rally Posted: November 11, 2022 at 07:55 PM (#6105188)
Rfield should, at the league level, sum to zero for all positions. If we added in a receiving metric for first baseman, this wouldn’t change. It would just increase the range of good and bad performances. Maybe the best 1B are saving 15 runs instead of 10, but the bad ones are also costing their teams more runs.
   29. cardsfanboy Posted: November 11, 2022 at 10:08 PM (#6105201)
Rfield should, at the league level, sum to zero for all positions. If we added in a receiving metric for first baseman, this wouldn’t change. It would just increase the range of good and bad performances. Maybe the best 1B are saving 15 runs instead of 10, but the bad ones are also costing their teams more runs.


agree about it equaling, my thought process is that for every play that involves a putout at first base, the first baseman should get a very small amount of credit, which comes out of the credit given to the player fielding the ball. And at the same time when there is a throwing error or a play not made that should have been the first baseman gets a small amount of the penalty. I think the data is now there to do that somewhat. Whether it makes a difference might be an issue.


I think catchers' ability to affect strike and ball calls is a real, tangible thing. I don't think we have figured out how to properly quantify it yet (with apologies to fWar's and VORP/WARP's attempts to do it), but it's a real thing that should be included in defensive metrics. I also think that a first baseman's ability to catch the errant throw is a real defensive skill that needs to be measured and that we haven't figured out how to measure. Even when we do, we'll probably spend a few years incorrectly gauging it, and then we'll come up with a way to backdate the data and it will probably still understate the skill/value, but it will move those guys that we think of as great defensive first basemen up the ladder in value. Think of a play in which a guy fields a ball and throws it high, and the first baseman jumps in the air and twirls, tagging the runner as he goes by... by rField and every other metric, the guy fielding and throwing the ball gets full credit for the defensive play. Or, more broadly, a guy fields the ball and makes the throw, and the first baseman does a full stretch to get the runner by 1/16th of a second. I've seen first basemen who don't do the stretch, and the runner is safe.

How many plays does this happen on in a year for a team? Hard to say, but it's absolutely enough that it should be measured.
   30. cardsfanboy Posted: November 11, 2022 at 10:20 PM (#6105202)
I look at bbref and think why can't I just add dWAR and oWAR and get a simple answer?


oWar is offensive war, it's a positional number to explain how much value above replacement a player provides simply by his offense relative to position, it includes a positional adjustment. It gives the story of how much offense a particular player provides over what you expect from that position. You are not fooled into thinking that a 110 ops+ by a first baseman is the same as a 110 ops+ by a shortstop. It's an attempt to put offensive value for all positions into a context that shows what is expected from replacement level for that position. It's a story telling tool.

Same with dWar, it's to say how much defensive value a particular player provides to the team after factoring in their position. A left fielder's defense is less important than a shortstop defense. It points out that defensively speaking even a poor defensive shortstop is worth more than a good defensive left fielder (note: I'm avoiding first base in this discussion for reasons we are talking about in other comments) Whether or not that story is worth telling is up to you, but it is what it's doing.

Of course since you are including positional adjustments in both stats, you can't add them together or you are double counting the positional adjustment. They are story telling tools more than they are stats.
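The double counting can be seen with the BBREF run components for Judge quoted in comment 10. This sketch assumes, following the description above, that the offensive figure is built from batting, baserunning, double plays, position and replacement runs, and the defensive figure from fielding plus position runs. It works in runs rather than wins to keep the arithmetic exact.

```python
# BBREF run components for Judge as listed in comment 10.
judge = {"Bat": 80, "Run": 2, "GIDP": -2, "Field": 3, "Pos": -2, "Rep": 23}

# Assumed groupings: both the offensive and defensive figures include Pos.
offense_runs = (judge["Bat"] + judge["Run"] + judge["GIDP"]
                + judge["Pos"] + judge["Rep"])
defense_runs = judge["Field"] + judge["Pos"]
total_runs = sum(judge.values())

# Adding the two counts the positional adjustment twice.
naive_sum = offense_runs + defense_runs
print(offense_runs, defense_runs, naive_sum, total_runs)
print("double-counted:", naive_sum - total_runs)
```

The gap between the naive sum and the true total is exactly the positional adjustment, so subtracting it once reconciles the two.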
   31. Ron J Posted: November 11, 2022 at 10:35 PM (#6105204)
Rally's methods really have their origin in Pete Palmer's work. Of course, Palmer's defensive numbers were a bad joke. (As an example, he somehow came up with Johnny Bench as the worst defensive catcher in history)

Palmer got good offensive numbers (small standard error at the team level) by cheating -- without his "slope correctors" his linear weights are simply not very good.

Baseball-reference does something conceptually similar but in a more sophisticated manner by essentially calculating a new offensive formula every year (at least they did the last time I looked under the hood). You will not convince me this is not overfitting. Never looked under the hood with Fangraphs. FWar wasn't really a thing when I was still doing deep dives.

But for all of this, there are some pretty major issues with Win Shares. If you're really interested in this, you need to find rec.sport.baseball and rec.sport.baseball.analysis from the mid 90s to early 2000s.

First of all, it's based around Runs Created (More precisely what Dave Tate called Marginal lineup value -- which partially deals with the biggest issues) and that's a multiplicative method. And these have problems with extreme players. The example I used to use was Joe Carter and Frank Thomas. Take their year-end stat and add a 1-1 with a HR. Because Thomas had a much higher OBP, Thomas' HR will show up as a fair bit more valuable. This only makes sense if Thomas himself was likely to be batting with more runners on base (in fact, generally speaking Carter had excellent taste in teammates and despite personally having an unremarkable OBP he got to bat way more than most with runners on). Multiplicative methods work well at the team level and less well for individual players.

Also, James over-values playing time. (There are plenty of other issues. We spent a lot of time on this.)

What Rally did was take all of the info we'd never had before and give good weighting to them.

The defensive system started life as Range factor -- because that's what we had. A lot of people worked hard to come up with ways to deal with all of the issues inherent in range factor and I think what Rally came up with is about as good as you can do.

To me what's missing from all of the methods in wide public use are sanity checks using different methods. Wowy (defensive results with and without a player) for instance has plenty of issues of its own, but it's a good way of double-checking some players who put up extreme defensive numbers (If anybody's interested, Jeter didn't do well here either)

Another possible check is whether things square up at the team level. With Range Factor by definition, they will. (The issues come from assigning the credit. Range factor in particular has issues with discretionary plays. Doesn't really matter who catches certain pop ups for instance). I've never seen anybody do this with any of the newer methods.
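The multiplicative-method problem described above (the Carter/Thomas example) can be illustrated with the basic Runs Created formula, RC = (H + BB) * TB / (AB + BB). The two stat lines below are hypothetical round numbers chosen to contrast a low-OBP and a high-OBP hitter; they are not the actual seasons of either player, and Win Shares uses more elaborate versions of the formula than this one.

```python
def runs_created(ab: int, h: int, bb: int, tb: int) -> float:
    """Basic Runs Created: on-base events times total bases over opportunities."""
    return (h + bb) * tb / (ab + bb)


def value_of_extra_hr(ab: int, h: int, bb: int, tb: int) -> float:
    """Change in basic RC from adding one at-bat that ends in a home run."""
    return runs_created(ab + 1, h + 1, bb, tb + 4) - runs_created(ab, h, bb, tb)


# Hypothetical stat lines (illustrative only).
low_obp = {"ab": 600, "h": 150, "bb": 30, "tb": 280}
high_obp = {"ab": 500, "h": 160, "bb": 120, "tb": 300}

delta_low = value_of_extra_hr(**low_obp)
delta_high = value_of_extra_hr(**high_obp)
print(round(delta_low, 2), round(delta_high, 2))
```

The same home run adds more runs to the high-OBP line, because the formula multiplies his total bases by his own on-base rate rather than by the on-base rate of the teammates batting ahead of him.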
   32. Rally Posted: November 12, 2022 at 02:07 PM (#6105242)
For me the key in defensive metrics is at the team level they should be consistent with DER, another Bill James stat. At first I was hesitant to do that, thinking it would be impossible to separate the pitcher and fielder contributions. But thanks to the work of Voros McCracken, I realized that DER should indeed be a great check on the fielders.
   33. Rally Posted: November 12, 2022 at 02:09 PM (#6105244)
The Frank Thomas/Joe Carter problem mentioned above is certainly an issue with basic runs created. I know James somewhat mitigated that with his later, more technical RC formulas.
   34. Walt Davis Posted: November 12, 2022 at 03:11 PM (#6105252)
Musings ...

1) oWAR is the old WARP. Maybe we just got so used to having nothing but WARP that we couldn't let it go when we got more detail.

2) The oWAR + dWAR issue could be easily solved by simply dividing a player into "offensive value" or "lineup value" (which has nothing to do with position played) and "defensive value" (i.e. dWAR). But then I'm the guy who wants dWAR relabeled as dWAA so I'm probably not the target audience.

3) CFB is onto something with their narrative value. Fans/media have always debated questions like "who's the best-hitting SS" and "would you rather have an average-hitting SS or a good-hitting 1B?" These are answered by oWAR (sorta). I'm not sure anybody ever asked "in terms of defensive value, how much do you get from a SS relative to a 1B" or "how good of a SS would Bobby Grich or Manny Ramirez have been?" But if you present dWAR as just Rfield, then you're pretending that a good-fielding 1B is as good defensively as a good-fielding SS. Still, people are more likely to compare the relative offensive value of players at different positions while keeping defensive comparison position-specific.

4) So dWAR on its own is of limited use. It's handy the way BDC always used it in player comparisons -- find guys at the same position with similar offensive value then rank them by dWAR. I also use it in a similar way in player comparison as an indicator of "athleticism" (or whatever). I do find it "convenient" that the dWAR of Banks and Yount post-SS is pretty similar to the dWAR of Jeter in his 30s (or that Jeter's dWAR is similar to Raines'). I also suspect that offensive and defensive aging are mostly independent of one another -- that you get a sense of how a player's bat will age by looking at similar hitters then a sense of how the glove will age mainly based on whether they're any good now. But unless a player is actually transitioning from one position to another, I never ask the specific question "let's compare the defensive value of Goldschmidt and Arenado."

5) As it's turned out, dWAR has probably caused more confusion than anything else. If I hear "how can they say Keith Hernandez had no defensive value, he's the greatest defensive 1B of all time" one more time I might scream. Plus there's the confusion that it is a measure relative to average, not to replacement.

6) But I'm also the nerd who wishes the league-difference component was split out separate from Rrep.
   35. Walt Davis Posted: November 12, 2022 at 03:23 PM (#6105253)
For the most part, players move to first base because they can’t handle tougher positions.

I'm not sure that's true anymore (if you meant "move" literally as opposed to "end up there at 24"). One reason Gehrig was obviously the best 1B for so long was because there were almost no career 1B to compare him with (also he was insanely good). Generally if you started your career at 1B that was because you were already a lumbering oaf and, generally, if you were a lumbering oaf at 24 you were either oft-injured or a fat slob (or both) at 32; but if you were mobile enough to at least handle LF/RF at 24 then you became a lumbering oaf at 32 and would move to 1B. The introduction of the DH changed some of that, at least it gave those 24-yo 1B some hope of becoming a fat slob DH.

I think that changed sometime over the last 20-30 years. Maybe Bagwell was the prototype. At the moment: Freeman, Olson, Goldschmidt, Rizzo, Votto, Alonso, Pujols, maybe Vlad and even fairly ordinary guys like Hosmer, Belt and Christian Walker. Who is the last LF/RF who moved to 1B in his 30s?** My impression is we're more likely to see a non-oaf like LeMahieu start to pick up 1B PAs than we are to see (to pick a name) Yelich. Carlos Santana was something of an old-school transition from C to 1B but even that was at 28. With the universal DH, I assume we'll see even fewer old man transitions to 1B.

But maybe we're just wrong about the history of 1B. Pujols is only 19th in career 1B games, Votto 26th. Maybe there's some effect lower down -- Freeman is already top 50, Hosmer, Goldschmidt and Rizzo will be there soon. There are only 21 players with 2000+ games at 1B and Ernie Banks has more than Miggy or Thome and nearly as many as Giambi. But 21 is far more than there have been at any other position than SS (which is still just 2nd with 19). I'd put 29 of the top 50 in the post-expansion era ... but that's not a lot given expansion. I'd put 10 of those in the (mostly) 21st century bucket but that's not a huge number. At best there's a small, recent trend as Goldschmidt, Hosmer and Rizzo will cross that threshold soon and Freeman has a real shot at Murray's record. Maybe (maybe) there is a trend away from oafs -- while today's 1Bs aren't generally fleet of foot, they mostly aren't close to being fat slobs either.

I really thought both Miggy and Thome had a lot more time at 1B than they did. (I know both spent time at 3B and Miggy in LF but they got a lot more DH time than I realized.) Thome was in over 2500 games but just over 1100 at 1B. The back injury came earlier than I remembered.

** Rhys Hoskins maybe but I recall everybody saying he was really a 1B from day one.
   36. Rally Posted: November 12, 2022 at 04:25 PM (#6105265)
Vlad Jr was, by Statcast, the worst defensive 3B in his rookie season. He moved to first and now he’s a gold glover.

Andrew Vaughn will probably be the starting 1B for the White Sox, now that Abreu is a free agent. Because of Abreu he played outfield, but didn’t belong there. Rhys Hoskins played some left field when he first came up because the Phillies work extra hard to cram an extra bat into the lineup. In that case it was signing Carlos Santana.
   37. Walt Davis Posted: November 12, 2022 at 05:30 PM (#6105280)
It may be counter-intuitive but I think one of the problems we have with defensive stats is that we are trying to get too precise.

When it comes to batters, we essentially assume that (other than park effects) one PA is the same as another PA. There's no attempt to quantify the difficulty of the PA based on the quality of the pitcher (much less the quality of the pitch) or even whether the batter has the platoon advantage. It's pretty strictly a value measure -- so many PAs (Rrep) with so many singles, doubles, triples, HRs, BBs, HBPs and GDPs.

Range factor was along those lines -- this OF had 2.1 putouts/9 vs a league average of 1.9. But for some reason on defense, folks refused to take that at face value. Anybody coulda caught that fly ball! Team A pitchers give up lots of fly balls while Team B pitchers strike out a lot of guys. And most annoyingly, what about the plays not made? Those are completely legit points and are reasons why RF is not a particularly good measure of defensive quality (and therefore maybe not very good for value either).

So instead, for each defensive opportunity, we now are trying to measure the speed, launch angle, direction of the batted ball and the starting position of the fielder. Do we have second-by-second readings on wind velocity yet? That's a lot of detail to get right. It's also a relatively extreme focus on the quality aspect over the value (counting stat) aspect. It seems to me we've gone so far in the quality direction on defensive measurements that the casual WAR fan doesn't even consider that one reason this fielder got +10 DRS while that one got only +5 is at least partly due to opportunities -- DRS is a counting stat, not a rate stat.

This came up in a thread many years ago now -- back in the days when Kevin Pillar was a very good CF -- and somebody questioned why Kiermaier had a much higher DRS. It was probably 2015 -- Pillar had a maybe believable +16 DRS in 1200 CF innings while Kiermaier had a definitely not believable +38 DRS in slightly fewer innings. Who knows but, just by raw counts, Kiermaier had 6 more putouts in 60 fewer innings. That doesn't seem like a lot but, by RF, that works out to an extra 0.25 plays per 9 which, in Kiermaier's playing time, would be about 30 more plays.
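For anyone who wants to check that back-of-envelope conversion, here is a minimal Python sketch. It assumes Kiermaier's innings were roughly 1140 (Pillar's approximate 1200 minus 60) and takes the 0.25 plays per 9 gap as given; the function name is made up for illustration.

```python
def extra_plays(rate_gap_per_9: float, innings: float) -> float:
    """Convert a range-factor gap (plays per 9 innings) into total plays."""
    return rate_gap_per_9 * innings / 9.0

# Assumed inputs: a 0.25 plays/9 gap and about 1140 CF innings (1200 - 60).
kiermaier_innings = 1200 - 60
plays = extra_plays(0.25, kiermaier_innings)
print(round(plays, 1))  # 31.7
```

That lands at a bit under 32 plays, which is where the "about 30 more plays" comes from.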

Now 30 more plays in the OF is a lot, theoretically on the order of 30 more runs saved. That conclusion would assume that KK and KP faced the same chances across those innings so we might want to start asking all those questions -- was TB just a FB-heavy staff, how difficult were those extra chances, etc. Those move us closer to the question of "which of these guys is the better CF?" and potentially away from the question of "which of these guys produced more defensive value in CF?" Moreover, the difficulty question is one that DRS was already trying to answer. And even if it was just more opportunities, we have no issue understanding that for two equal hitters with one getting 600 PA and the other just 500 that the first guy will produce more value. (Probably not 14 runs more value but that's a different issue.)

Imagine applying similar criteria to batters. This LHB faced above-average LHP for 17% of his PAs while this other guy only 12%. (Difficulty of opportunity) Anybody coulda creamed that hanging curveball. (Difficulty of opportunity) Then anybody coulda creamed that hanging curveball but you swung and missed (a play not made). An 89-MPH slider with 2.5-3.9 inches of horizontal break, when contacted, results in just a .175 BA so a hit here is worth .825 hits. These points do get brought up occasionally -- this guy has good overall numbers but that's because he cleans up on bad pitchers -- but I don't think anybody has ever seriously suggested they be incorporated into WAR, etc. We have started to move in this direction with some of the Statcast "x" measures based on "balls hit this hard with this sort of launch angle produce a line of 650/1400." In theory, we could look at batting WAR on a PA-by-PA or even pitch-by-pitch basis.
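To make the ".825 hits" arithmetic concrete, here is a rough Python sketch of that kind of difficulty-adjusted credit: the actual result minus the expected hit rate for the pitch. The .175 figure is the made-up slider example above; the function name and the idea of charging -.175 for an out are assumptions for illustration, not anything WAR actually does.

```python
def hit_credit(got_hit: bool, expected_ba: float) -> float:
    """Credit for one contacted pitch: actual result minus the expected hit rate."""
    return (1.0 if got_hit else 0.0) - expected_ba

expected = 0.175  # illustrative BA on contact for the slider example
print(round(hit_credit(True, expected), 3))   # 0.825
print(round(hit_credit(False, expected), 3))  # -0.175
```

Summing that credit over every pitch is the pitch-by-pitch version of the adjustment, which is exactly the level of detail we don't ask of batting stats.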

So we could re-define Rbat to be "# of runs created relative to an average batter facing the same pitches from the same pitchers in the same parks against the same defenders with the same weather conditions" but, thank god, we don't. We could do the same for pitcher WAR -- and we partly do, using game-specific park factors and RA9opp.** For batters, we should at least think about some adjustments for the platoon advantage/disadvantage but maybe that comes out in the wash. (Basically should it be "vs average batter" or "vs average RHB" or "vs average RHB v RHP and vs average RHB v LHP." I guess it depends on what role we think handedness might play in choosing a replacement.)

There is always some tension between "value" and "quality." Generally they're highly correlated enough that, once we adjust for playing time (i.e. value = quality x PT), WAR can be used for either. But take Matt Stairs (83% of PAs with the platoon advantage) or John Lowenstein (90%). Part of the reason they out-performed the "average batter" is because of those extreme playing time splits. In both cases, their job was "hit against RHP" so presumably, if needing to replace them, their teams would try to find LHB. To think of it in WAR terms, Stairs is credited with 119 Rbat in just over 6000 PAs. If he'd been given another 1000 PAs vs LHP, those probably would have produced below-average results, let's say bringing his career Rbat down to 100.
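A quick Python sketch of what that hypothetical does to his rate, under stated assumptions: 6000 PAs stands in for "just over 6000," the 100 Rbat in 7000 PAs is the guess above rather than a real figure, and the function names are invented for illustration.

```python
def rbat_per_650(rbat: float, pa: float) -> float:
    """Scale career batting runs to a 650-PA season."""
    return rbat / pa * 650.0

def pa_vs_same_hand(platoon_adv_share: float, season_pa: float = 650.0) -> float:
    """PAs without the platoon advantage in a season of the given length."""
    return (1.0 - platoon_adv_share) * season_pa

actual = rbat_per_650(119, 6000)        # career as played (PAs approximated)
hypothetical = rbat_per_650(100, 7000)  # with 1000 extra PAs vs LHP (a guess)

print(round(actual, 1))                 # 12.9
print(round(hypothetical, 1))           # 9.3
print(round(pa_vs_same_hand(0.83), 1))  # 110.5
```

So the same hitter goes from roughly 13 runs per 650 PAs to roughly 9 purely on usage, and at 83% platoon advantage he sees only about 110 PAs vs LHP in a full season.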

Just as hitters, how should we compare him to, say, Benintendi (74%) or Schwarber (75%)? In Rbat/650 terms, it's Schwarber 15, Stairs 13, Benintendi 7 but KS and AB will have about 150-160 PAs v LHP in their 650 while Stairs just 110. Swap out 50 vRHP for 50 vLHP and Stairs's bat will look a lot more like Benintendi's. To my knowledge WAR (and projection systems) do nothing to adjust for this. How do we comp him to his rough contemporary Reggie Sanders who had just 28% PAs with the platoon advantage? (Ooh, cool comp. They had nearly identical # of PAs vs RHP with Stairs hitting 268/361/490 and Reggie 259/332/469 -- Stairs wins easily on OBP. Reggie had about twice as many vs LHP with an OPS about 160 points higher.)

This probably shows up most strongly in WAA. Give Stairs an extra 1000 PA v LHP to bring him even with Sanders and he'll drop about 4 WAA (adding 0 WAR). To be clear, Sanders laps Stairs easily via picking up about 20 wins of defensive value and 3 wins on running.

And maybe that's right -- value is value. Teams decided Stairs couldn't hit LHP (he did fine when given the chance) so they maximized his value. They've probably also decided that Benintendi can't hit LHP but his glove (and a roster spot) are worth it. (Schwarber still can't hit them but did club 10 HRs off them this year.) Should we consider that extra value that Stairs produced -- or maybe more accurately the negative value he didn't produce -- as "belonging" to him or to his managers? Would there be any point in splitting Rbat and Rrep into vRHP and vLHP?

I do think it's safe to say that in projecting a player like Stairs, the systems should make it clear that the projection assumes a similar platoon split going forward. Stairs was a fine player but he wasn't a "true" 117 OPS+ hitter. In those terms, he was probably much closer to Benintendi (109) than Sanders (115) and Schwarber (121) ... of course AB and KS haven't hit their decline phase yet.

** The simplifying assumption is more justifiable for batters, at least those who play almost every day, as they will face pretty much the same set of pitchers in the same set of parks over a large sample of PAs.
   38. Walt Davis Posted: November 12, 2022 at 05:43 PM (#6105283)
#36 ... sure, we will always see moves of young players off defensive positions, often to 1B. I don't consider the move of Vlad to be particularly different than, say, the move of Carlos Delgado from C to 1B (he even played some LF), it's just that the decision to move Delgado came before he reached the majors. Vladito played 3B at 20, everybody decided that was a bad idea, he got moved at 21 -- that happens in the minors all the time. I don't consider Vlad to be a guy who was "moved" to 1B, he's a guy who "started" at 1B ... or if you prefer, similar to Hoskins, he's a 1B who played 3B for a season.

It's tougher with Thome (500 g at 3B, credited as OK) and Perez (750 g, credited as average) and of course Dick Allen (terrible). As I've pointed out before, in the 60s thru the 80s at least, it was pretty common to see if your future (sometimes current) lumbering oaf could play 3B. It still comes up occasionally (Miggy).
   39. Eric J can SABER all he wants to Posted: November 12, 2022 at 08:14 PM (#6105313)
It's tougher with Thome (500 g at 3B, credited as OK) and Perez (750 g, credited as average) and of course Dick Allen (terrible). As I've pointed out before, in the 60s thru the 80s at least, it was pretty common to see if your future (sometimes current) lumbering oaf could play 3B. It still comes up occasionally (Miggy).

Pujols also debuted at 3B (kinda) and moved around the corners a lot for his first 3 seasons before settling at 1B in '04.
   40. Howie Menckel Posted: November 12, 2022 at 10:14 PM (#6105333)
I do think it's safe to say that in projecting a player like Stairs, the systems should make it clear that the projection assumes a similar platoon split going forward. Stairs was a fine player but he wasn't a "true" 117 OPS+ hitter.

Lou Whitaker enters the room
   41. cardsfanboy Posted: November 13, 2022 at 11:30 AM (#6105348)
Lou Whitaker enters the room


Whitaker's platoon usage is exaggerated though. It was really just his last two or three years of playing that he was heavily platooned; up until that time his PA split was pretty much right in line with league average.

Edit: that is going from memory, I researched it a couple of times, and Lou effectively had around 170 pa more platooned than what the league was doing as a whole.
   42. Infinite Yost (Voxter) Posted: November 13, 2022 at 03:16 PM (#6105383)
I wonder how FG and BR feel about this.


What I wonder is if they're getting paid, or MLB is just scraping the data off their websites.
   43. cardsfanboy Posted: November 13, 2022 at 04:16 PM (#6105392)
What I wonder is if they're getting paid, or MLB is just scraping the data off their websites.


I seriously doubt that they are getting paid. It would be nice if MLB decided to give 25,000 to the site for using their data.
   44. The Duke Posted: November 13, 2022 at 04:25 PM (#6105394)
I've always wondered about the "quality" issue on offensive stats. If player A faces a lot more good pitching than player B, does that get factored in anywhere in WAR? If you are facing Koufax/Drysdale 20x a year vs some average pitchers, does your offensive performance get adjusted? A similar thing happens in the World Series where if you face the 1960s Cardinals you are going to get Gibson for three games, not Nelson Briles.

And back to defense - is there any system that allows you to compare defensive performance across positions or is that not relevant - who needs to know if Goldschmidt is a better fielder than Arenado whereas we constantly look at their relative offensive differences.
   45. Lassus Posted: November 13, 2022 at 05:49 PM (#6105403)
Did I miss someone in this thread actually trying this with a couple of players to see what happens?
   46. What did Billy Ripken have against ElRoy Face? Posted: November 13, 2022 at 07:32 PM (#6105418)
I wonder how FG and BR feel about this.
FG editor Meg Rowley has said on the Effectively Wild podcast that she’s not at all comfortable with compensation being tied to their WAR, and I believe she said her BR counterpart feels the same.
   47. Rally Posted: November 14, 2022 at 08:22 PM (#6105534)
I can understand that. From time to time those sites make revisions, recalculate the WAR. Next time that happens is MLB going to change a player’s pay? I certainly wouldn’t want to deal with some agent complaining about how it affects his guy if I’m just trying to stay current and keep the most accurate value metric I can out there.
   48. Ron J Posted: November 14, 2022 at 08:52 PM (#6105542)
#37 One other issue with defense is that it flips from negative to positive and for an awful lot of people, negative is bad and positive is ... well positive.

So while going from +2 to -3 is no big deal (I doubt the standard error for any of the metrics is smaller than 6 runs) a fair number of people find that incomprehensible.
