User Comments, Suggestions, or Complaints | Privacy Policy | Terms of Service | Advertising
Baseball Primer Newsblog: The Best News Links from the Baseball Newsstand
Sunday, July 24, 2011
MacAree: The Problem With Sabermetrics
Sorry if I didn’t post it the right way. I thought this was kind of interesting and was wondering what people here might think.
About Baseball Think Factory | Write for Us | Copyright © 1996-2021 Baseball Think Factory
Reader Comments and Retorts
1. OCD SS Posted: July 24, 2011 at 12:38 PM (#3884203)
Everything was ruined when all those nerds came out of their mothers' basements and started looking around at the rest of the world. Everything would've been better if the sports writers had just managed to keep them locked down there.
IANAM, but how can an analysis "blame" triples for anything? If teams that hit more triples tend to score fewer runs than teams that hit fewer triples, that's just an observation, not a blame. All else being equal, if such teams hit even fewer triples they would assuredly score even fewer runs.
This peculiar definition of "science" also seems to rule out other observation-based fields like astrophysics, since we can't put stars in the lab and perform controlled experiments on them.
Echoing Bob's point, I highly doubt that whoever first did this study actually claimed that "triples were bad" and that hitting them caused a team to score fewer runs. If he or she did, I'm sure their error was quickly pointed out. More importantly, this example seems to me to illustrate the power of sabermetrics -- an analytic study generates an initially counterintuitive result, we think about it, advance theories to explain it, and learn something in the process. What's wrong with that?
What does this even mean? It seems to me that a statistic can only "call up the smell of fresh mown grass in midsummer" because of our familiarity with it and the associations it triggers. Batting average would seem bizarre and incomprehensible but for its long history in the game. Conversely, "new-fangled" measures like OPS have now become commonplace enough that they "invoke baseball" pretty readily, at least for me. (Maybe they don't "call up ... the blur of seams as an outfielder whips a throw in towards his cutoff man", but that is asking an awful lot of a hitting stat.) While many new metrics are more opaque and may be inaccessible to people who don't want to take the time to learn about them, nobody is forcing MacAree or anyone else to use them.
1B 0.495
2B 0.578
3B 1.172
HR 1.473
BB-IBB 0.320
IBB 0.233
HBP 0.332
SB 0.161
CS -0.133
SF 0.663
SH 0.036
SO -0.103
AB-H-SO -0.097
So triples are not only positive but worth more than doubles and singles. Even doing it just one to one between triples and runs gives us:
Intercept 690.642
Triples 0.326
So I'm unclear as to how triples came up with a negative value in a regression. Maybe he used a small sample.
The most interesting number from the above is the relatively low negative value of the CS. This is likely a base running proxy; i.e., the actual caught stealing is more damaging than that, but the relationship between good baserunning and higher caught stealing totals lessens it in the regression.
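A minimal sketch of how the coefficients quoted above might be applied. The weights and the triples-only fit are copied from the comment; the names `estimate_run_value` and `runs_from_triples_only` and the sample inputs are hypothetical, made up for illustration. The multi-variable regression's intercept is not given in the comment, so the first function returns only a weighted sum of events, not a predicted run total.

```python
# Coefficients copied from the multi-variable regression quoted above.
WEIGHTS = {
    "1B": 0.495, "2B": 0.578, "3B": 1.172, "HR": 1.473,
    "BB-IBB": 0.320, "IBB": 0.233, "HBP": 0.332,
    "SB": 0.161, "CS": -0.133, "SF": 0.663, "SH": 0.036,
    "SO": -0.103, "AB-H-SO": -0.097,
}

# Single-variable fit quoted above: runs = intercept + slope * triples.
TRIPLES_ONLY_INTERCEPT = 690.642
TRIPLES_ONLY_SLOPE = 0.326


def estimate_run_value(stat_line):
    """Weighted sum of the events in stat_line (hypothetical helper).

    Events missing from stat_line count as zero. No intercept is added,
    because the comment does not report one for this regression.
    """
    return sum(WEIGHTS[event] * count for event, count in stat_line.items())


def runs_from_triples_only(triples):
    """Team runs predicted by the one-variable fit on triples."""
    return TRIPLES_ONLY_INTERCEPT + TRIPLES_ONLY_SLOPE * triples


# Hypothetical comparison: ten triples versus ten doubles.
gain = estimate_run_value({"3B": 10}) - estimate_run_value({"2B": 10})
print(round(gain, 2))                        # 5.94
print(round(runs_from_triples_only(30), 3))  # 700.422
```

Either way the sign on triples is positive, which is the point of the comment: with these coefficients, more triples never lowers the estimate.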
In the spirit of IANAL, might I suggest IANUS* instead?
I Am Not "Uh" Sabremetrician.
This peculiar definition of "science" also seems to rule out other observation-based fields like astrophysics, since we can't put stars in the lab and perform controlled experiments on them.
Yes, it would seem to rule out a whole swath of pseudosciences, like "theoretical" physics, and "mathematics".
Snark aside, maybe it's not controlled, but MLB is one heck of an experiment that is available to sabermetricians. And we can build models that are far more "realistic" than many scientists in other fields could dream of (especially compared to actual pseudosciences).
Sabermetrics shouldn't be so incomprehensible so as not to call up the smell of fresh mown grass in midsummer, or the crack of the ball off the bat, the blur of seams as an outfielder whips a throw in towards his cutoff man. Statistics shouldn't be sterile and clean and shiny and soulless. They shouldn't just be about baseball; they should invoke it. Otherwise, they run the risk of losing the language which makes them so special.
What does this even mean?
I don't know, but I want some of whatever it is he's taking.
So what in the #### are we waiting for? Get with the program, you dumbassed seamheads.
Is it that they're semi-random, or isn't it just that they're dramatically rarer? (Which, I suppose, might be a different way of saying the same thing.)
Triples were "the" big extra-base hit, the big RBI blow, pretty much until the 1920s. But they've steadily receded in frequency, and have never been rarer than today. Big, strong, power hitters of many decades ago (the Gehrigs, the Musials, hell, even the Rices) often sought to stretch their doubles into triples. That rarely happens today because teams have simply gotten wise to the risk/reward, and also because defenses are just far better on relay throwing plays than they used to be.
Which means that triples have progressively become the province of the speedy low-power slap hitters. Good for them, I love to watch these kinds of players, they're fun and interesting, and there is simply no better play to observe in baseball than the triple. But that's an aesthetic appreciation, not a run-value assessment. Much as we may love our slappy speedsters, we well know that they're far from the most productive run generators on just about any team. And if you are a team that has a high proportion of these guys, then almost by definition you aren't going to be much of a run-scoring team.
How in the world does making observations such as these make watching and thinking about the game any less fun?
But what a day that will be!
I say we start with Ben Affleck and Will Ferrell.
I think it's more or less a whole lot of variables: smaller parks, better equipment, less worth it in the short run, and the argument that players don't come out of the box thinking triple anymore (Stan Musial isn't a speed burner, yet he has the most triples among all players whose careers started after 1930), etc.
It's all of that, and also it's just the case that fielders throw better than they used to, and that in combination with better gloves makes the relay throw play more effective than it once was. Truer bounces on better-groomed fields are also part of it.
I think the author has a good point but don't really see that this is a "problem" that encompasses the "field" of sabermetrics. I think it's a small subset.
Further:
I see it was already highlighted above, but this certainly caught my eye. Is it not possible to have good science through careful observations and falsifiable tests? Isn't this what cosmology does?
What's your basis for that? Not saying you're wrong, but what's the evidence?
My guess would be park size/shape (almost every park in the 1910-40's had a 420'+ CF b/c they fit in city blocks), no more turf, and the decline of busting it out of the box, in that order.
A lot of it is simply and directly personal observation of baseball at many levels over the past (oh my god) nearly half-century. Admittedly that's entirely subjective, but it's there nonetheless and I can't ignore it: I can clearly see that players just throw better than they used to. They're taught better and more consistent technique from the time of their boyhood than they used to be, and they're bigger, stronger, vastly better-conditioned, and just better athletes.
This is consistent with all the quantifiable demographic data we have on the pool of talent. And it's obvious that with vastly more money in the game, players are highly motivated to perform at the role as the serious profession it is. I don't think there is any good reason to doubt that the quality of play today is better than it's ever been, and throwing and catching fundamentals aren't far from the heart of that.
Carl Crawford is on a similar pace, but he's dropped off in recent years and is now stuck in an awful park for triples.
Don't know if I buy it. Regressions are easier than the game state work and from what I can see, the difference in the results is in the noise.
I selectively endpointed my list of triples to get Musial to the top of the list. If you go by 1920 (which many people argue is the real start of 'modern' baseball), Musial is second behind Paul Waner. (Clemente is fourth, though. The top four triples hitters since the 1920s, Waner, Musial, Goslin and Clemente, have more triples (707) than stolen bases (441). You probably won't see that again.)
I'm a lifelong Cub fan ... these are not my summer memories. Give me a stat that reminds me of a throw sailing over the cutoff man, 15 feet off target, hitting a brick wall on two bounces and rolling into the LF bullpen!
The Problem With Sabermetrics
Woo-hoo!! We've got it down to one!
Anyway, the article is OK enough but doesn't provide any sort of solution to THE problem. And, like Voros, I want to see this regression with a negative coefficient for triples. I'd bet dollars for doughnuts (how much do doughnuts cost these days anyway?) that he's misinterpreting the coefficient. He already shows signs of being as ignorant as he claims others are -- regressions haven't been "hard" for at least 30 years (with free publicly available software since at least EPI-INFO); his explanation for the negative coefficient on triples doesn't seem to make sense if singles, doubles and HRs are in the equation ... a team's (or player's) HR production is controlled in the regression so that coefficient (as described which is surely incorrect) is saying that _given the same number of HR, doubles, etc_, the team with more triples scores fewer runs. That's got nothing to do with low-power speedsters not hitting HR because you've controlled for HR. Unless he's saying there should be an interaction between triples and HRs or some such.
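The distinction drawn above, between triples looking bad on their own and triples looking bad with home runs held constant, can be sketched with a small example. Everything here is an assumption for illustration: the six teams are made up (not real seasons), the baseline of 300 runs is arbitrary, and runs are constructed to be an exact linear function of triples and home runs using the 1.172 and 1.473 weights quoted earlier in the thread. The helper names are hypothetical.

```python
def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def simple_slope(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    dx, dy = centered(x), centered(y)
    return dot(dx, dy) / dot(dx, dx)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 together, via the 2x2 normal equations.

    Centering the data is equivalent to fitting an intercept.
    """
    d1, d2, dy = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y = dot(d1, dy), dot(d2, dy)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Made-up teams: the ones with more triples tend to hit fewer home runs.
triples = [20, 25, 30, 35, 40, 45]
home_runs = [200, 170, 175, 140, 150, 110]
# Runs are built so that each triple adds 1.172 and each HR adds 1.473.
runs = [300 + 1.172 * t + 1.473 * h for t, h in zip(triples, home_runs)]

triples_alone = simple_slope(triples, runs)
triples_given_hr, hr_given_triples = two_predictor_slopes(triples, home_runs, runs)

print(triples_alone < 0)           # True: triples alone look harmful
print(round(triples_given_hr, 3))  # 1.172 once home runs are controlled
```

In this toy data the negative sign on triples comes entirely from leaving home runs out. Once HR is in the equation the positive weight is recovered, which is why a negative coefficient in a regression that already includes HR would need a different explanation.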
Ahh, I think I found it. It appears to be a regression of FA salary, not runs, on various characteristics by David Gassko (www.insidethebook.com/ee/images/uploads/dsg_freeagent_Thesis.pdf). Gassko surmises:
The negative coefficient on triples is a mystery, but not a particularly important one. As shown in Figure (5), the average hitter in the sample averaged fewer than two triples per season, with a relatively tight spread at that. Moreover, since triples likely decline strongly with aging due to their great dependence on a player’s speed, it is likely that both the mean and spread decline further past the first season of a player’s contract.
I'm not bothering to read the whole article nor pay much attention to his regression to see if I agree with Gassko's approach but I'm not appalled by the idea that triples have a negative impact on FA salaries ... although Carl Crawford would probably now throw that equation off. :-) (Actually I'm not even sure if this is triples before or after the contract was signed.)
One of the interesting things I've noticed when I ran regressions is that caught stealing appears to function in some way as a proxy for speed (caught stealing consistently comes back as less negative than you would expect). But if you don't include caught stealing I'd expect triples to kind of pick up the slack a tad and show up as more positive. Don't know, though. I was getting sensible results from regressions as soon as I got a copy of the Lahman database. I'm not sure how to get silly ones.
Willie Stargell had 55 career triples, 33 of them at Forbes Field - and Forbes was the Pirates' home stadium for less than half of his career.
Voros - it could just be a different sample. Your regression covers 1955-2010; if you ran a single-variable regression on triples and runs that goes back to the Deadball Era, you might get a negative correlation just because there were far more triples back then. (You'd probably have to exclude the high-scoring, triple-happy 1890s, though.)
And 13 of the remaining 22 at 3 Rivers. 46 of 55 career triples at home. The only other stadium in which he hit more than 1 was Jarry Park. He had 2522 ABs in 10 different parks and hit 0 triples.
I was thinking this one would actually be a long timeframe - 1901 to present or so. The Deadball Era being the lowest-scoring and highest-triple period out of that sample would have a chance of producing a negative correlation.
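A rough sketch of the pooling effect described above, using made-up numbers rather than real team seasons. The two "eras" are hypothetical: within each one, an extra triple goes with 1.2 extra runs, but the high-triple era is also the low-scoring one. The function names and the data are assumptions for illustration only.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y on x (single predictor)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def slope_of(seasons):
    """Slope of runs on triples for a list of (triples, runs) pairs."""
    triples, runs = zip(*seasons)
    return ols_slope(triples, runs)


# Made-up (triples, runs) team seasons, not historical data.
deadball = [(70, 560), (80, 572), (90, 584)]  # many triples, few runs
modern = [(25, 700), (30, 706), (35, 712)]    # few triples, many runs

print(round(slope_of(deadball), 4))           # 1.2
print(round(slope_of(modern), 4))             # 1.2
print(round(slope_of(deadball + modern), 4))  # -2.4375
```

Each era on its own shows triples helping, while the pooled single-variable fit shows them hurting, because the era difference swamps the within-era relationship.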
But as a general rule, yes, this is very much correct.
Show your work was RSB's mantra and that general spirit is carried forward in places like this -- or the Book blog or ... well lots of places.
Certainly that's consistent with what Walt quotes from Dave's article.
newsbreak - the astros uck-say
now if only i could find one of those great writers like BITGOD to explain to me what is the problem. it must be that carlos lee is hitting triples and not stealing many bases because you sure can see THAT being the problem. and the guys are playing great small ball - even sac-bunting with your #7 hitter (who was hitting like .280 at the time) in the second inning to move the runners along. or was it walking the .105 hitter hitting 8th to pitch to the pitcher?
anyways, i am not getting how a bunch of guys and a few grrrls arguing over regression and coeficcion of whatever has anything to do with what the media says. seeing as how the media doesn't go around quoting THT, Bpro, the book blog, fangraphs stuff like FIP+ or SIERA etc.
what does that have to do with either enjoying going to the ballpark, watching it on TV, listening to the modern radio announcers who talk about all kinds of shtt don't have a THING to do with the ball game and never shut up for a SECOND or give you an idea of what is happening on the field. or watching your kid play?
this is like if i was grousing about all the total jerk chefs on tv arguing about 1/4 tsp of some herb grown somewhers i never heard of or how terrible it is if you cook without el-expensivo pan - and all this is done for money/prizes, we all know - and then i say - oh i can't enjoy cooking no mo because some buttmunch on the TV
Well that's your problem right there, a television makes for a terrible cooking implement. I always saute my buttmunch in a pan. Or better yet stuff peppers with buttmunch, rice and tomatoes, and bake them.
What would make an interesting article would be a look at the as yet underexplored countries of sabermetrics. A survey of the field, if you will. What's known, what remains to be discovered, and what tools or data might best help us do that. Things such as:
1. Starting pitcher use
2. Bullpen use
3. Injury prediction
4. Falling off the cliff prediction
5. (perhaps a subset of 3 and 4) Finding a player's most similar players
6. Precise fielding numbers
7. Chemistry, its effect
8. Catcher defense
And so on. What did I leave out?
One of the comments was interesting. The writer of it probably has a lot of company:
This is true of any field. It's probably simply a fact of life for any field, rather than an indictment of researchers in the field. Generalists get left behind. The things that are still unknown often aren't as interesting as they used to be in the sense that they become relatively arcane, or become impossible for a bright layman to discover or even conjecture usefully about.
If Bill James had been born in, say, 1990, he wouldn't currently be studying baseball statistics in a useful way, assuming he'd had the same education as Bill James v1 had through the age of 21.
But, yeah, the possibility raised in #29 could be it as well. Triples might well peak in low-scoring eras (no HRs) and you might get a negative bivariate correlation (or even partial correlation depending on what you controlled for).
Anyway, without an actual cite, we probably should wait for him to clarify.
#39 the big difference I found between pre and post 1920 baseball is that before 1920 batter strikeouts were in fact both negative and significant, while after that you really need not concern yourself with batter strikeouts in run scoring models. (That is in distinguishing among various types of outs. If you break out DPs separately then Ks matter very slightly. Not clear how it works if you adjust DPs for opportunities, but I doubt it matters.)
It's tricky to model deadball offense. So many errors. So many baserunning outs. But including Ks in any model helps.
As to the main point, I'm not sure because the author is less than clear. But he does sound like he sees some of what I see. What I see in sabermetrics is 1) everyone can do the math (which is why the uberstats are almost all done by teams, instead of one person), and 2) the differences between methods almost always come down to either a) lack of a robust enough data set to make any real starting assumptions possible (see 19th century ball), or b) a disagreement between the math teams about standards. This may be what the author thinks of as "applied philosophy." His term is silly, because the correct field is, very obviously, applied mathematics.
The only arena in baseball where double-blind placebo-controlled studies are needed and VERY lacking is in trying to actually figure out what steroids did and did not do. Good luck getting 30 matched pairs of MLB players and telling them that one guy will get the steroids and another will get the placebo. They've got games to play, stats to pile up, and pennants to win. They're not going to take the risk that steroids do work and they got the placebo. Other than that, baseball is just like analyzing any other historical collection of data, which you cannot change or arrange into controlled study groups. That's what applied math is for, not applied philosophy. - Brock Hanke
One might think Fenway would be a good triples park for a speedy LHB with good line drive power. Lots of room in right center and tricky angles near Pesky's Pole to bounce would-be doubles past the RF for an extra base. (Not that this proves anything, but Ted Williams' 71 career triples [35 at home] would place 9th among active players.)
My baseball interest began when Musial was in his mid 30s, so the speedy version (however speedy that was) had passed by. However, he didn't get much triples help from his home field, getting 90 there and 87 on the road. Oddly, considering his precisely even H/A split for hits, he was better at home for all kinds of EBH:
2B: 394/333
HR: 252/223
I think The Man's ability to hit loads of screaming LD to all fields has as much to do with his triples as his speed.