Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Thursday, May 05, 2022

The Case Against a Case Against FIP

At FanGraphs, our headline WAR number for pitchers is based on FIP. Because of that, and because people enjoy debating and arguing, there’s a yearly refrain that you’ve probably heard. “FanGraphs pitching WAR only considers (X)% of what a pitcher does, how can that be used for value?” No one would dispute that year-one FIP does a better job of estimating year-two ERA than ERA does – or at least, not many people would – but the discussion around whether FIP does a good job of assigning year-one value is alive and well.

One reason for this view is pretty obvious. FIP considers home runs, strikeouts, walks, and hit batters to estimate pitcher production on an ERA scale. Our WAR does some fancy stuff in the background – it treats infield fly balls, which virtually never fall for hits, as strikeouts, and it adjusts for park and league. In the end, though, it’s estimating pitcher value using just three (well, actually four — HBPs always draw the short straw) outcomes. There are a lot of other outcomes in baseball!

In 2021, roughly 39% of plate appearances ended in a homer, strikeout, walk, hit batter, or infield pop up. One thing you could think, in recognition of that fact, is that FIP-based WAR doesn’t consider enough of a pitcher’s production. You wouldn’t use 40% of a hitter’s plate appearances to calculate their WAR, so why do it for pitchers? But that doesn’t actually make sense, as David Appelman pointed out to me recently. Assuming “average results on balls in play” is actually going to be pretty close for every pitcher, by definition.

 

RoyalsRetro (AG#1F) Posted: May 05, 2022 at 06:38 PM | 16 comment(s)
  Tags: fip

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. John Northey Posted: May 05, 2022 at 11:23 PM (#6075464)
I suspect it could be tweaked to be better - mixing in how hard balls are hit, how often the ball is hit dead on vs weakly (barrel), etc. With all the new data now becoming available, there should be enough to help fine-tune FIP, I'd think.

Of course, I haven't put in an ounce of effort to see if barrel rate or average velocity off the bat has any real effect year over year for pitchers, but logically one would think it would. Might explain guys who have over or under performed FIP over time.
   2. Walt Davis Posted: May 06, 2022 at 01:20 AM (#6075483)
EDIT: #1 is a pretty good summary of my first 35 or so paragraphs, feel free to skip to >>>> ... or straight to post #3!

Statistically it's imputation and crude mean imputation is almost always a bad idea in that you virtually guarantee you are understating the true variance. There are sometimes situations where imputing (assuming you have a good estimate of the distribution of whatever you're imputing) can do better than "real data" ... basically situations closer to the year 1 FIP to year 2 ERA situation, i.e. where you are relying on the model for prediction (which is another form of imputation essentially). But in general of course, it's better to stick with real data and imputation is primarily only used when there's missing data.
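
To make the variance point concrete, here is a minimal Python sketch. The BABIP values are simulated rather than real, and every name in it (`true_babip`, `observed`, `imputed`) is made up for the illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical "true" BABIP-against values for 200 pitchers (simulated, not real data).
true_babip = [random.gauss(0.290, 0.015) for _ in range(200)]

# Pretend only the first half is observed; the rest is treated as missing.
observed = true_babip[:100]
missing_count = len(true_babip) - len(observed)

# Crude mean imputation: every missing value gets the observed mean.
fill_value = statistics.mean(observed)
imputed = observed + [fill_value] * missing_count

sd_observed = statistics.pstdev(observed)
sd_imputed = statistics.pstdev(imputed)

print(f"SD of observed values:       {sd_observed:.4f}")
print(f"SD after mean imputation:    {sd_imputed:.4f}")
```

The filled-in values add nothing to the spread, so the standard deviation of the completed data is smaller than that of the observed half. Plugging the mean in for every pitcher, as a flat BABIP assumption does, is the extreme case: the imputed variable has no spread at all.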

To me it's more a question of "what are you trying to measure" along with "how big is the measurement error in the data?" I don't care for the argument that it "is actually going to be pretty close" -- that's just "it's a good approximation" and you'd always rather have the "truth." The better argument is "we are trying to measure the pitcher's contribution and the pitcher's job is to get the K and the BIP, it's not their fault when BIPs go for hits." If pitcher A gives up a GB that an average SS would get to, it's not the pitcher's fault Jeter didn't get to it. Of course bWAR tries to adjust for this by plugging in a mean team defensive value of some sort.

Now of course it's not always true that a BIP tells us nothing more about a pitcher. When the BIP is a 105-MPH LD, we can be pretty sure the pitcher didn't do his job even if it's just a single. If I was a WAR modeller, I'd be looking into whether I could develop a better BABIP against model using EVs, barrels, whatever. Anti-"BABIP is random for pitchers" poster boy Glendon Rusch's BABIP against (331; 367/576 fair terr split) was probably not a random fluke.

As a simple example, we can imagine just plugging in mean BABIP for every pitcher. But we know that BABIP is at least a function of whether the BIP is a GB, LD or FB. There doesn't seem to be a lot of variation in LD% for batters (not sure about pitchers) but there's certainly variation in GB vs FB. And we know that pitchers have some control over G/F ratios. So we know that the population mean BABIP differs for FB-heavy pitchers vs GB-heavy ones. So why plug in an overall mean BABIP when you could plug in FB-heavy, neutral and GB-heavy mean BABIPs depending on the pitcher type? Or calculate each pitcher's mean expected BABIP based on their personal G/F mix. (They might be doing one of those, I'm just providing an example.) Treating all BIPs the same is almost certainly a source of bias.
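
A rough Python sketch of the "personal G/F mix" idea follows. The per-type BABIP figures in `LEAGUE_BABIP` are placeholders chosen only to show the mechanics, not measured league values, and `expected_babip` is a hypothetical helper, not anything FanGraphs or Baseball-Reference is known to use.

```python
# Hypothetical league BABIP by batted-ball type (illustrative placeholders only).
LEAGUE_BABIP = {"GB": 0.240, "FB": 0.130, "LD": 0.680}


def expected_babip(mix):
    """Weight the per-type league BABIP by a pitcher's own batted-ball counts."""
    total = sum(mix.values())
    if total == 0:
        raise ValueError("pitcher has no balls in play")
    return sum(LEAGUE_BABIP[kind] * count for kind, count in mix.items()) / total


# Two made-up pitchers with the same line-drive count but opposite GB/FB tendencies.
gb_heavy = {"GB": 220, "FB": 100, "LD": 80}
fb_heavy = {"GB": 100, "FB": 220, "LD": 80}

print(round(expected_babip(gb_heavy), 3))
print(round(expected_babip(fb_heavy), 3))
```

Under these assumed rates the two pitchers get different plug-in BABIPs even though neither one's actual hits allowed is used, which is the bias a single overall mean would hide.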

Now, I'm still open to the idea that if you used a more detailed BIP estimate, the outputs are very close to what we're already getting so why bother? Or plugging in a noisy individual estimate for a clean aggregate estimate isn't guaranteed to help in terms of overall error (i.e. it should help in terms of bias but that might be a small improvement while you're introducing substantial measurement error). So maybe we'll be better off using the simple mean but I seriously doubt it.

That doesn't mean bWAR does a better job. First, I think we're all agreed that all of the defensive measurements are pretty noisy so what they're plugging in might be doing a terrible job of measuring "what actually happened." It might be giving too much credit/blame to the defense and it has the same issue of whether different types of pitchers should get different defensive plug-ins (which I think they might be doing now).

Ideally statcast could get us closer to "these two pitchers gave up the same GB to SS, that's the same value." But we'll still have to rely on some sort of "expected BIP outcomes" based on EVs, LAs and maybe defensive positioning (i.e. how much is it the pitcher's fault that the batter was able to hit it where they ain't?).

>>>>

Still there's the unintuitive aspect of the FIP claim to fame. We use FIP in place of ERA (or RA) in this year's value calculation because ... FIP has been shown to be better at predicting next season's ERA which we have just said is not an accurate measure of value. (And if nothing else, should we look into FIP as a predictor of RA/9, not ERA?) I think there's also a small issue that FIP is artificially scaled so mean FIP will be mean ERA. The FIP constant is essentially the expected ERA of a mythical all-BIP pitcher -- not clear why this should vary from year to year just to make sure our numbers come out right. (Again, a reasonable approximation that won't make any real difference but ...)

By the way, the FIP constant is generally around 3.1 to 3.2 apparently, pretty stable. BIPs are a terrible way to score runs. For some near real-life examples of the mythical all-BIP pitcher, check the Greinke thread. It actually seems pretty accurate -- Greinke at 2.57 ERA so far this year; the Tewksbury example I gave was a 3.13 ERA over almost 800 IP; Silva at 3.44 in his famous year despite a high HR rate.

Finally the FIP model is just unintuitive as all get out ... FIP = (13*HR + 3*BB - 2*K)/IP + FIP constant. I mean all of those values in the parentheses scale into the hundreds for a typical starter so what in the world are we dividing by IP? I think it looks at least vaguely linear weights-ish if expressed as (1.44*HR + .33*BB - .22*K)*(9/IP) and then a couple of other results are a bit more obvious too ... but still an awkward unintuitive formula.
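
For anyone who wants to check that the two ways of writing the formula agree, here is a small Python sketch. It follows the formula as written in the post, which leaves out the hit batters the article says FIP also counts. The season line and the 3.15 constant are assumptions (the constant is just a value inside the 3.1 to 3.2 range mentioned above).

```python
def fip_standard(hr, bb, k, ip, constant):
    """FIP as written in the post: (13*HR + 3*BB - 2*K)/IP + constant."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


def fip_per_nine(hr, bb, k, ip, constant):
    """Same quantity with per-event weights, scaled to nine innings."""
    weighted_runs = (13 / 9) * hr + (3 / 9) * bb - (2 / 9) * k
    return weighted_runs * (9 / ip) + constant


# Hypothetical starter's season line; the constant is an assumed value.
line = dict(hr=20, bb=50, k=200, ip=180.0, constant=3.15)

print(round(fip_standard(**line), 2))
print(round(fip_per_nine(**line), 2))

# A pitcher with no HR, BB or K at all gets exactly the constant.
print(fip_standard(hr=0, bb=0, k=0, ip=100.0, constant=3.15))
```

The last call is the "mythical all-BIP pitcher": with nothing in the parentheses, his FIP is the constant itself.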
   3. Jack Sommers Posted: May 06, 2022 at 09:52 AM (#6075499)
Walt, what are your thoughts on xERA, which is xwOBA scaled to ERA?

Below paste is their glossary definition, and here is Tango's blog discussion

Here is link to xERA LEADERS Min 50 BIP


Expected ERA, or xERA, is a simple 1:1 translation of Expected Weighted On-Base Average (xwOBA), converted to the ERA scale. xwOBA takes into account the amount of contact (strikeouts, walks, hit by pitch) and the quality of that contact (exit velocity and launch angle), in an attempt to credit the pitcher or hitter for the moment of contact, not for what might happen to that contact thanks to other factors like ballpark, weather, or defense.

By converting this to the ERA scale, it puts xwOBA in numbers that are more familiar, and allows it to be compared directly to the pitcher's actual ERA. (If you're familiar with FIP, or Fielding Independent Pitching, the idea is similar, just that now Statcast quality of contact can be included.)

xERA is not necessarily predictive, but if a pitcher has an xERA that is significantly higher than his actual ERA, it should make you want to take a closer look into how he suppressed those runs.


The biggest issue with xSTATS at Baseball Savant I've had over the years is they start out with big gaps between expected and actual every year, and then at some point in the year they "correct" the formula to bring xSTATS in line with actual. I noticed this issue over the past couple of years and contacted Tom on twitter and shortly after they corrected it and brought things in line. (So if you look at past years they now show more or less aligned.) But they are way out of whack again right now this year. Again. In a recent brief twitter discussion I had with Tango he said they would update every 4-6 weeks, but he acknowledged that things are especially out of whack right now (Humidor...balls.....I'm not sure why)

But they still haven't made any adjustment.


Actual vs "Expected"

Stat  Actual  Expected
BA    .232    .252
Slug  .373    .435
wOBA  .306    .328

   4. Rally Posted: May 06, 2022 at 10:06 AM (#6075501)
I would think it’s the change in offense that hasn’t been corrected, but last year the actual slugging was only .411. (.418 for non-pitchers)

Those expected numbers look like 2019.
   5. Rally Posted: May 06, 2022 at 10:09 AM (#6075502)
So Luis Robert is slugging .453, but he’d be at .748 if we still had the 2019 baseballs!
   6. Jack Sommers Posted: May 06, 2022 at 02:51 PM (#6075558)
I would think it’s the change in offense that hasn’t been corrected, but last year the actual slugging was only .411. (.418 for non-pitchers)

Those expected numbers look like 2019.




As I said, they have this issue EVERY YEAR. I could try to find the old tweet exchanges with Tango from past years, but too much trouble. :). If you go and look at past seasons NOW, you will not see it, because they corrected mid season in the past. However going by memory the slugging difference is more out of whack in the early going this year compared to previous years.

The current lower offensive environment is most likely caused by the double whammy of a ball with more drag and humidors added in 20 stadiums that didn't previously have them, as has been identified by smarter people than me. The xSTATS are based on launch angle and exit velocity.

Avg Exit Velo, Launch Angle, Distance.

The Exit Velo and LA trends haven't changed, but ball isn't going as far. Simple as that.

Yr    ExV   LA    Dist
2015  88.8  10.8  169
2016  89.1  11.6  171
2017  87.7  12.0  172
2018  88.9  11.6  171
2019  89.2  11.8  174
2020  88.5  12.9  168
2021  88.9  12.8  166
2022  88.8  12.6  163


And here is the same info, but for Fly Balls only. (Above is all batted ball types grouped together)

Yr    ExV   LA    Dist
2015  90.4  36.8  315
2016  91.1  36.7  318
2017  91.2  36.6  320
2018  91.6  36.4  319
2019  92.0  36.5  324
2020  92.3  37.7  322
2021  92.2  37.1  318
2022  92.0  37.0  312


So what they need to do at Savant, I believe, is incorporate distance perhaps, not just exit velo and LA? I don't know how else they can keep up with MLB monkeying around (pun intended) with the ball year to year and month to month.


   7. Jack Sommers Posted: May 06, 2022 at 03:08 PM (#6075560)
Here is a good (and pretty thorough) article from yesterday that I just saw. They pulled similar info to what I pulled, but focused on "barrels", and it shows the same thing.

MLB home runs are way down so far in 2022; here's how the league's recent tinkering deadened the baseball

Getting back to my original question to Walt, while xSTATS have their issues right now, once they tweak or correct these things, wouldn't the xERA be exactly what we are all looking for to "upgrade" FIP? Or are the xSTATS just not good enough to do the job?

   8. Walt Davis Posted: May 08, 2022 at 07:40 PM (#6075817)
Sorry, lost track of this thread. I have no insight into these xSTATS as in I've never looked at the models, etc. They generally sound like they're doing fairly sensible things (or trying to) but the proof is in the fit to the data which I haven't looked into.

What I'm not a big fan of is all the "scaling" that gets done. I don't object necessarily but if you want to put some model's outputs on an "ERA scale" then turn it into an ERA model and see how well it does. That is, we can put pancake flops on an "ERA scale" but that doesn't make it useful for evaluating pitchers. There's then the interpretability issue -- an ERA scale for which season? And what you think of as a good ERA may not be what I think of as a good ERA. You might as well just rank them or give us a percentile or divide them up by quintiles or deciles or "excellent, good, average, poor, stinko." But fine, if folks find xwOBA easier to deal with than xwBA or xwOPS or xwRbat/650, then great.

On the centering issue -- it would be nice if we had absolute values to work with ... and if MLB would quit messing with the ball. The value of a single in 2021 isn't quite the same as the value of a single in 2001 isn't quite the same as the value of a single in 1981 ... but they should be close. But as we see, the value of an EV of 103 with a LA of 25 (or whatever the magic values were) varies between can of corn and HR depending on MLB's whim or manufacturing consistency. You can't build a good imputation/prediction model if there are substantial structural shifts underway.

Folks might not be that familiar with statistical imputation. Let's say I'm in your data but my income is missing. If you've got other info on me -- age, gender, education, maybe some reasonably precise geo-coding of where I live -- then one thing you could do is plug in a value for my income based on the income of other people like me in your dataset (or potentially using aggregated Census data). This seems like "making up data" but this will usually lead to a better model than just throwing me out. But it still assumes that my missingness isn't related to my actual income once you've adjusted for age, gender, etc.
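
A toy Python version of the "people like me" imputation, as a sketch only. The records, the two grouping variables and the `impute_income` helper are all invented for the example; a real application would use more predictors and usually something fancier than a cell mean.

```python
from collections import defaultdict
from statistics import mean

# Toy records (entirely made up); None marks a missing income.
people = [
    {"age_band": "30-39", "education": "college", "income": 72000},
    {"age_band": "30-39", "education": "college", "income": 68000},
    {"age_band": "30-39", "education": "high school", "income": 41000},
    {"age_band": "30-39", "education": "high school", "income": 45000},
    {"age_band": "30-39", "education": "college", "income": None},
]


def impute_income(records):
    """Fill missing incomes with the mean income of records in the same cell."""
    cells = defaultdict(list)
    for r in records:
        if r["income"] is not None:
            cells[(r["age_band"], r["education"])].append(r["income"])
    filled = []
    for r in records:
        if r["income"] is None:
            key = (r["age_band"], r["education"])
            # Assumes at least one observed income exists in the same cell.
            r = {**r, "income": mean(cells[key])}
        filled.append(r)
    return filled


completed = impute_income(people)
print(completed[-1]["income"])
```

The missing person gets the average of the two observed college-educated incomes rather than the average of everyone, which is the same move as giving a GB-heavy pitcher a GB-heavy plug-in instead of the overall mean.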

Or, you've got data on 1000 US counties, you build a model for Y (where Y is something not regularly collected but you went to the bother) based on various county characteristics that are readily available everywhere ... then you engage in "small area estimation" of Y in all the other counties using those readily available variables. That can work as long as there's no major difference in the relationship between the Xs and Y between counties in/out of your sample which, as long as you randomly sampled, is a reasonable assumption.

Then there's what's happening at the moment where we're trying to estimate this year's output using coefficients estimated for years T1 through T2 ... again a reasonable thing to do as long as conditions today are similar to years T1 through T2 ... and falls apart if they aren't. Hopefully it only falls apart in the mean -- i.e. the effect of X1 and X2 are about the same today as in T1-T2 but mean scoring is down. But if EV1 and LA1 is now going 8 fewer feet, that could be a big problem. (They might be better plugging in 2014 values.)

I'd imagine 2015 was a real problem. The first half of the year saw sluggish offense similar to 2014. The second half, as if by magic, practically looked like sillyball. They probably started out with context values that were much too high so all hitters looked terrible, all pitchers looked great. When they updated mid-season to low context, the context flipped and now all the hitters looked awesome and pitchers terrible. But until they give the ball decision to Tango, I don't think there's much we can do about that.

So if all that's changed is that the mean (intercept) has shifted then we could just go with percentiles/rankings. But if the underlying model has changed and many HRs are now harmless flyouts then even rankings based on EV and LA will be wrong. As I discovered in another thread, LeMahieu is hitting at about his career averages this year -- a career OPS+ of 102 but this year 133. Some of that is being out of Coors. He's the type of hitter we wouldn't necessarily expect to be heavily affected by a flight-restricted ball while hitters who relied on a lot of "short" HRs might be very badly affected -- the model probably underrates LeMahieu while overrating those others. That's a problem.
   9. Jack Sommers Posted: May 09, 2022 at 01:48 AM (#6075846)
Thanks for explanation on imputation.

I think I've been doing mentally what you suggest....I simply look at the rankings and percentiles at this point, and ignore the "over or under perform" aspect except at the extremes. It just seems like an opportunity is being missed here with all the data available from statcast and yet we still don't have an upgraded version of FIP (or SIERA, for example). We have xERA, which is broken because, as you said, there are "structural shifts underway."
   10. villageidiom Posted: May 09, 2022 at 10:50 AM (#6075859)
In 2021, roughly 39% of plate appearances ended in a homer, strikeout, walk, hit batter, or infield pop up. One thing you could think, in recognition of that fact, is that FIP-based WAR doesn’t consider enough of a pitcher’s production. You wouldn’t use 40% of a hitter’s plate appearances to calculate their WAR, so why do it for pitchers? But that doesn’t actually make sense, as David Appelman pointed out to me recently. Assuming “average results on balls in play” is actually going to be pretty close for every pitcher, by definition.
Average assumptions work well in the aggregate, but generally do not work for individuals. The average child is 9 years old, but very few children are 9 years old.

They make the case that, by including the average for the population rather than the actual for the player, they are capturing most of the actual anyway. If a pitcher has 100 non-TTO plate appearances against, with a .350 BABIP, let's say that's 35 non-TTO hits and 65 non-TTO outs. If average BABIP were .280, then they're assuming 28 non-TTO hits and 72 non-TTO outs instead of the actual results. They're only off by 7, out of 100. Isn't that good? No, because (a) they are trying to assess how much value this player has achieved, which itself is a measure of deviation from replacement, and (b) by definition WHAT THEY'RE THROWING OUT IS DEVIATION. It's not a random 7 PA they're throwing out - it's all in one direction.
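
The arithmetic in that example, spelled out as a short Python sketch. The numbers come straight from the paragraph above; the variable names are made up.

```python
bip = 100              # non-TTO plate appearances against
league_babip = 0.280
actual_hits = 35       # a .350 BABIP on 100 balls in play

league_hits = round(bip * league_babip)

# The flat-BABIP accounting credits every pitcher with the league rate.
assumed_hits = league_hits

# Measured as a share of all balls in play, the miss looks small.
miss = actual_hits - assumed_hits
share_of_bip_missed = miss / bip

# Measured as deviation from average, nothing of the pitcher's result survives.
actual_deviation = actual_hits - league_hits
assumed_deviation = assumed_hits - league_hits

print(miss, share_of_bip_missed)
print(actual_deviation, assumed_deviation)
```

The same 7 hits are a 7% miss on outcomes and the entire deviation from average, and value above replacement is built from deviations.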

If you're trying to project a player by assessing the value he is likely to achieve, that's fine. BABIP is generally hard to sustain or predict; focus on the outcomes that are reliable predictors. However, if you are trying to assess the value delivered by a player you need to assess what the player has done, not what the player might do in the future. FIP is one way to adjust the context of past performance, but it ignores far more than it should for that purpose IMO. In the example above, if the batted ball profile of those 35 hits and 65 outs were disproportionately line drives I'd say the additional hits are likely attributable to the pitcher, and should be considered in assessing the value they have achieved (or failed to achieve). They might not be as likely to reproduce that batted ball profile in the future, but that's not what's being assessed.
   11. Walt Davis Posted: May 09, 2022 at 10:20 PM (#6075994)
It will depend on how well you understand the process ... and how much effort/trouble/detail you are willing to put in.

So yes, plugging in the average for every pitcher is clearly the least you can do. But can you do better? That will depend on how well you understand the process underlying BABIP. If you are not aware of any readily available variables that correlate well with BABIP, then you can't really do better than plugging in the overall average.

Now I think we do know a few things that correlate. FBs are less likely to be hits than GBs. So you'd think a pitcher who gives up 2/1 FB/GB should have a lower BABIP than one that's 1/1 and even lower than one that's 1/2. But maybe it comes out in the wash. Hits are largely the product of LDs, I'm not sure LD rates vary much by pitcher. Then for FBs (not incl LDs) and GBs, while GBs have a much higher BA, FBs that do fall in are much more likely to be doubles and triples so maybe the OPSs don't differ much. GBs are certainly a terrible way to produce runs (and a good way to produce DPs) so maybe the average GB has the same value as the average FB and everybody gets killed by LDs and it all comes out in the wash.

(There's also the historical stat issue. If you're b-r, you prefer fancy stats that you can apply over all/most of baseball history. A lot easier with overall BABIP than with detailed BABIP. Sometimes the gain from historical comparability is worth the loss of efficiency in current stats.)

But this can be said about any model that includes "regression to the mean" in its structure. What I would like to do in terms of projecting Bryce Harper for the rest of 2022 is to regress Harper towards his individual "true talent" (i.e. his "mean"). But I don't know what that is, I just have past performance which, if it's been good, probably includes some unsustainable good luck. So I regress his individual performance-based projection towards the overall mean, based on how many PAs his individual projection is based on. All statistically legit and usually a good idea. But maybe I could do better by regressing him towards a "RF mean" or a "LH slugger mean" or a "xwOBA 320-340" mean. Those could be done.
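
One common way to write that kind of regression is a weighted average where the weight on the player's own numbers grows with PAs. The Python sketch below assumes that form; the wOBA figures, the two target means and the `stabilizer_pa` value of 300 are all hypothetical, and in practice the stabilizer would be estimated from data.

```python
def regress(observed, pa, target_mean, stabilizer_pa):
    """Shrink an observed rate toward a target mean.

    The weight on the observed rate is pa / (pa + stabilizer_pa).
    """
    weight = pa / (pa + stabilizer_pa)
    return weight * observed + (1 - weight) * target_mean


# All numbers below are hypothetical.
observed_woba = 0.420
pa = 120
stabilizer = 300  # assumed value, not an estimated one

to_league = regress(observed_woba, pa, target_mean=0.315, stabilizer_pa=stabilizer)
to_sluggers = regress(observed_woba, pa, target_mean=0.350, stabilizer_pa=stabilizer)

print(round(to_league, 3))
print(round(to_sluggers, 3))
```

Swapping the overall mean for a group mean changes only `target_mean`; whether that helps depends on whether the group really has a different mean and on how noisy the group estimate is.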

Anyway, I do like the idea of Statcast (or maybe slightly less detailed) data being used. So maybe add barrels-in-play or LDs in-play to the FIP model.

And I agree that there is always a tension between building a "good" model and getting good individual predictions. It's a very counter-intuitive result but, statistically, the goal of imputation is not to provide the least difference between the imputed and actual values, it is to reproduce as well as possible the population distribution of the variable being imputed. Now, using my earlier example, if you could impute my income very close to my real income (and the real incomes of everybody else with income missing) you'd achieve a good approximation of the population distribution of income. But that would require a very, very good model for income and there simply isn't one. You can often do a better job of reproducing the distribution by doing something fairly crude.

As I often say, think actuarially. Insurance companies would love to be able to distinguish between the responsible 16-25 year-old male drivers and the moronic risk-takers but they can't -- or at least not until it's too late. So they "impute" level of risk based on age, gender, location, past claims, tickets and maybe a few other things in a one size fits all methodology. They obviously over-estimate the level of risk for the responsible folks while under-estimating it for the irresponsible individuals but what else can you do? What would we guess, for every 4 players who perform similar to Harper through age 28, one goes to the HoF? No matter how hard you try, you are projecting "on average, players similar to Bryce Harper are expected to ..." not "Bryce Harper is expected to ...."

Now using FIP (or ignoring in-play results) in the estimation of the actual value produced is a very different thing than predicting future value. I suspect bWAR is closer to the truth for those purposes in that it places some but not all of the blame on the defense. In a modelling sense, the comparison of those two is not "does this year's FIP do a better job of predicting next year's ERA than this year's ERA" it's whether FIP does a better job than "defense-adjusted ERA" ... and maybe we should be predicting next year's "defense-adjusted ERA." I don't know the answer to that question nor the answer to whether adjusting ERA for DRS would be the right way to do it.

Sorry, rambling but I ain't gonna tidy up. :-)
   12. sunday silence (again) Posted: May 09, 2022 at 10:46 PM (#6076003)
I think it looks at least vaguely linear weights-ish if expressed as (1.44*HR + .33*BB - .22*K)*(9/IP) and then a couple of other results are a bit more obvious too ... but still an awkward unintuitive formula.


Walt, can I ask you what exactly is the criticism here? Is it just that the units are too large? I think you have to use linear weights or something like that if, as FIP does, you give credit for Ks, HRs, BBs. It seems like the formula is weighted runs x 9, then divided by innings pitched to give the proper value to it.

Just wondering. I would be remiss in not saying how fascinating those last several posts you made are. I haven't digested it all but very fun to read.
   13. villageidiom Posted: May 10, 2022 at 12:26 AM (#6076020)
As I often say, think actuarially.
Both actuaries and sabermetricians, if not doing a good job, will forget what they understood about the process and just apply the process because "it works". (I've been a credentialed actuary for 25 years now, and have seen this firsthand countless times.) The FIP process works well for projection because it focuses on the basic elements that are predictive. But there's a difference between "predictive" and "not random". There are all kinds of things that are not random, and also not predictive. At the MLB level, an atypically high level of line drives is, for a pitcher, non-random. You've seen it, and so have I - a pitcher who is tipping his pitches, or just doesn't have his good stuff that day, or whatever. That's not predictive outside a game or two at the MLB level, because it's self-correcting. Either a pitcher will bounce back, or he will no longer be allowed to pitch. The only pitchers who continue to pitch will be the ones who can improve, which means the past struggles will not be predictive. But those struggles are not random. They are real, and a process that throws them out because they are not predictive will fail when that process is being used for explanation or attribution, not prediction.

Are all pitching failures non-random? No. Randomness happens. Can we tell, from the statistical record, the difference between random and non-random? Maybe not today, but maybe tomorrow. Or... maybe some of it, but not all of it. FIP would suggest we can discern none of it.
   14. sunday silence (again) Posted: May 10, 2022 at 10:36 PM (#6076163)
I'm having trouble understanding your thought process here, Village. If I understand you: line drives are "not random" but not very predictive because they are self-correcting.

Well OK. But what makes them inherently different than HRs, BBs, and Ks? You are suggesting that, yes? That somehow BBs are predictive but not line drives. Why doesn't the same logic apply? A guy who walks too many batters will either adapt or not pitch. It's self-correcting.

So I get the general concept of something being not random and not predictive. But you seem to be making a distinction about certain baseball events that I don't see.

Also, somewhat tangentially: isn't one of the basic issues with FIP that we are assuming any ball in play should be a hit at a rate of .290 and the pitcher has no further ability to impact that rate? But that can't really be true, no?
   15. villageidiom Posted: May 11, 2022 at 10:49 AM (#6076245)
Well OK. But what makes them inherently different than HRs, BB, and KOs? You are suggesting that yes?
No. Well, mostly no.

Line drive rate tells us, in a predictive sense, the same thing as HR. It has generally not been helpful in making predictions because (a) there was some variability in how "line drive" vs. "fly ball" were defined, though with Statcast that can be codified easily; and (b) it tells us much the same thing that HR tells us, and HR is defined clearly and readily available; and (c) a consistently high LD rate will get a pitcher taken out of their role, or their playing time minimized, faster than a high HR rate will.

If we're talking less about predicting the future and more about assessing what a pitcher has actually done, LD rate is meaningful. Why? First, there are more of them. Second, they directly lead to hits on balls in play (and HR) at a much faster rate than any other kind of contact. Treating a low-LD pitcher and a high-LD pitcher as though they were just lucky/unlucky and wiping their hits away with a .290 BABIP assumption is assuming too much. And maybe in evaluating their performance we can use HR rate in lieu of LD rate - but then you're dependent on a lower-frequency event to tell you how someone has performed, and depending on the sample it might not be meaningful enough or reliable enough.

The example of when this matters is when a pitcher simply doesn't have his good stuff on a given night and is hit pretty hard throughout the game. It will potentially affect his K rate, his BB rate, and his HR rate... but maybe it *won't* affect the HR rate? If he didn't give up more HR but gave up 6 extra doubles, that's bad, right? And it's real - it's not random that everyone is squaring him up. If it's not a HR FIP doesn't care. He was just unlucky, let's ignore the doubles. In assessing his performance FIP gets it wrong, even if in projecting his performance FIP will be fine.
   16. . Posted: May 11, 2022 at 11:20 AM (#6076251)
As it always has, this whole thing fundamentally boils down to an aversion or inability to accept things that happened, in favor of things that "would have been expected to happen," or "things that should have happened." In virtually all circumstances, that aversion or inability stems from sources exogenous to baseball.

but the discussion around whether FIP does a good job of assigning year-one value is alive and well.


Whatever "discussion" around it that suggests that it does do a good job is either wrong or misguided. It doesn't. FIP doesn't even measure anything real. Runs allowed or earned runs allowed, OTOH, does measure something real. Just like RBIs measure something real.
