Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. VeteranPresence.com Posted: April 12, 2005 at 02:44 AM (#1249707)
WEAK!
   2. Kyle Lobner Posted: April 12, 2005 at 02:53 AM (#1249727)
Well, y'know, thanks for being constructive.
   3. StopBooingAbreu Posted: April 12, 2005 at 03:07 AM (#1249747)
Wait, Wait, Wait...

So what you're telling me is...put the lime in the coconut?
   4. VeteranPresence.com Posted: April 12, 2005 at 03:23 AM (#1249765)
I was sent by a messenger named JD. Unlike the JD you know, though, his knees were a lot stronger. As is his radio voice.
   5. Duffy Duff Posted: April 12, 2005 at 10:13 AM (#1250262)
Why are you focusing on 2b+3b? Singles and homers will help score baserunners, too.
   6. xdog Posted: April 12, 2005 at 12:11 PM (#1250311)
"Using the Pythagorean formula and adjusting teams’ runs scored for this new system"

Could you explain that adjustment?
   7. Kyle Lobner Posted: April 12, 2005 at 01:08 PM (#1250383)
In response to post 5:

Singles and home runs will help drive runners in, yes, but that wasn't why I separated out doubles and triples. I separated them out because a runner who gets on base by a double or a triple is more likely to score, and therefore a team's RPSR would rise in that situation. So if you double/triple a lot, your RPSR would rise. Home runs were taken out there because they've been removed from the entire equation in the beginning.

In response to post 6:

The concept I used for this one is pretty simple. The 2004 league RPSR was .324, and the league XBH Ratio was .252. .324/.252 = 1.2871. So while a team's actual runs scored are shown by this formula:

RPSR*(times on base-HR)+HR

A team's runs scored, if extra base hitting was the only factor involved, would in fact be shown by this one:

(XBH*1.2871)*(times on base-HR)+HR

This obviously isn't the best method to predict a team's success, and just because a team doesn't match this result doesn't mean they're lucky/unlucky, but it does remove the factor of extra base hitting, so you can look at the result and try to determine why the actual result happened. For example, by this system, the Yankees scored 58 runs more than they should have in 2004, which I'm attributing to a lineup with fewer dead spots than most teams have.
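A minimal sketch of the two formulas above, in Python. The function names and the sample inputs are hypothetical (they are not a real team's totals); the 1.2871 multiplier is the figure quoted in the post. Dividing the rounded .324 by .252 gives about 1.2857, so the quoted figure presumably came from unrounded league values, which is an assumption here.

```python
# League RPSR divided by league XBH Ratio, as quoted in the post.
LEAGUE_SCALE = 1.2871


def actual_runs(rpsr, times_on_base, hr):
    """Runs scored: RPSR * (times on base - HR) + HR."""
    return rpsr * (times_on_base - hr) + hr


def xbh_adjusted_runs(xbh_ratio, times_on_base, hr, scale=LEAGUE_SCALE):
    """Runs scored if XBH Ratio alone determined the scoring rate."""
    return (xbh_ratio * scale) * (times_on_base - hr) + hr


# Hypothetical inputs for illustration only (league-average rates,
# made-up counting totals).
times_on_base, home_runs = 2200, 200
actual = actual_runs(0.324, times_on_base, home_runs)
adjusted = xbh_adjusted_runs(0.252, times_on_base, home_runs)
print(round(actual, 1), round(adjusted, 1))
```

For a team, the gap between the two values is the part of its run total that its extra base hitting does not account for under this scheme.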

To take the final step, I used the pythagorean formula to determine what each team's win percentage would be if their runs allowed had remained the same, but their runs scored had been adjusted. It's worth noting that while my formula says the Yankees only should have won 84 games, their Pythagorean record from last season (before this change) was 89-73. So the change, from my perspective, was only 5 games.
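The final step can be sketched the same way. The post does not say which exponent was used, so this assumes the classic exponent of 2 and a 162-game season; the function names and run totals are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical totals: hold runs allowed fixed and compare the record
# implied by actual runs scored with the one implied by adjusted runs.
runs_allowed = 800
wins_actual = expected_wins(900, runs_allowed)
wins_adjusted = expected_wins(850, runs_allowed)
print(round(wins_actual, 1), round(wins_adjusted, 1))
```

Holding runs allowed constant means any change in the expected record comes only from the adjustment to runs scored.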
   8. Nick S Posted: April 12, 2005 at 02:05 PM (#1250488)
Your first equation (RPSR) is an important factor to look at in terms of run production (along with OBP and HR, that's pretty much all there is.) In trying to explain why teams vary in the stat, though, you should look at the correlations between RPSR and a number of factors (e.g. slugging percentage.) After that you should conclude what RPSR is primarily a function of, as opposed to what you've done here, which is concluding ahead of time that percent of doubles and triples is the primary factor. What you've done is akin to saying "Teams that hit a lot of doubles score a lot of runs, therefore I can factor out luck by normalizing each team's runs scored to their doubles rate." It does not make sense.
   9. Kyle Lobner Posted: April 12, 2005 at 02:33 PM (#1250576)
Let me refer you back to this paragraph:

If a team has a high XBH ratio, it makes logical sense that they should also have a high RPSR. In the cases where there are large differences between the two, one has to take a look at the other factors listed above. If the difference can’t be explained by one of those factors, it’s possible you have a luck situation on your hands.

I didn't say doubles were the only factor involved; in fact, I listed most, if not all, of the factors involved. I eliminated extra base hitting from the equation because it's the easiest to quantify. Then, from there, you can look at the numbers again, with one factor removed.
   10. Nick S Posted: April 12, 2005 at 03:11 PM (#1250687)
You haven't "eliminated extra base hitting from the equation". What you have done is guess that XBH ratio correlates well to RPSR. In fact, going by the numbers in your chart the two stats show almost no correlation (i.e. XBH ratio appears to have little predictive value for RPSR.)

Primates in your last thread offered quite a bit of constructive commentary on the work you had done. They were fairly negative (i.e. we generally said "You aren't doing this well, here are things to look at in order to learn how to do this better") and a bit patronizing. You completely ignored the substance of those comments. I don't really understand why you are posting this work if you are not looking for feedback on it (even if that feedback is negative.)

I'm being a bit of an ####### here, but this sort of stuff (bad, yet pretentious) hits my buttons.
   11. Kyle Lobner Posted: April 12, 2005 at 04:26 PM (#1250869)
Admittedly parts 1 and 2 of the project are somewhat unrelated, but part 3 ties them together. I read every comment, took notes on some things, and you'll see a notable difference in my theory on linear weights in Part 3. This article had nothing to do with linear weights, so the past comments didn't change my angle on it.

If you find my work pretentious, I'm sorry, but if you're going to hit me for work I haven't even written yet, I'd encourage you to wait.
   12. Andy Aymeloglu Posted: April 12, 2005 at 06:03 PM (#1251073)
Looking at the chart, it appears AL teams rank near the top, NL teams rank near the bottom. You don't address why this would be, and it looks like what you should probably be doing is splitting the chart in two, and ranking within the league.

My take on the difference is pretty simply that pitchers kill rallies (lowering RPSR), but they don't have much effect on XBH ratio. If that's the case, it backs Nick S's observation that XBH and RPSR aren't well correlated, and indicates that the difference is largely due to lineup composition and doesn't measure luck.

Now, it's possible, then, that the difference in rankings that you're observing is interesting and does tell us something useful, but I don't think what it's telling us has been well-defined, and I certainly don't think you've eliminated much of anything nor made strides in isolating luck.
   13. BTL: Lesser Primate, 4th Class Trainee Posted: April 12, 2005 at 09:42 PM (#1251522)
My least primate brain's take: there is probably (almost certainly) an inverse correlation between no. of HRs hit and speed (easy enough to measure roughly by comparing SB, CS, triples). Teams with more HRs hit will likely have a lower RPSR.
Your stats do show chances of driving in runs with singles and walks, but because HRs are so common and such an important part of run producing up and down the lineup, I'm not sure what the use of this stat is. Still, even though these stats aren't perfect, I found your article interesting, so thanks.
There are a lot more factors to be considered, though, before you are left with "luck" or random fluctuations. Eg -- sacrifice fly ability, bunting ability, baserunning ability esp stealing, etc. True Primates could probably help you out here. I hope they do.
   14. Kyle Lobner Posted: April 12, 2005 at 09:54 PM (#1251551)
Your first point (determining powerful teams and speedy teams) is interesting, and I'll look into it, but here's what I'm scared of:

Willie Harris and Aaron Rowand stole 19 and 17 bases, respectively, and were caught just 12 times combined, for a gain of 14 bases. This high of a success rate would imply a) good speed, and b) only going when you're sure you'll make it.

My bet is, though, that if they hadn't had 5 20+ HR hitters behind them (and Frank Thomas), they'd have been running more. A fast player on a home run hitting team will run less to prevent outs, because his chances of being driven in are better than average. Their odds of being able to get from first to third on a single, though, are also better than average. So I'm reluctant to measure speed as the opposite of power. I'm also reluctant to measure speed purely by stolen bases and success rate for much the same reason.
   15. BTL: Lesser Primate, 4th Class Trainee Posted: April 12, 2005 at 10:14 PM (#1251589)
Willie Harris and Aaron Rowand stole 19 and 17 bases, respectively, and were caught just 12 times combined, for a gain of 14 bases. This high of a success rate would imply a) good speed, and b) only going when you're sure you'll make it.

That's a 75% success rate, which is actually about break even. Someone has posted break even stealing percentages, broken down by no. of outs and whether stealing 2nd or 3rd. I think the range is from about 69% to 89% if stealing 3rd with 2 outs. Therefore, Harris and Rowand probably would have scored about as often if they hadn't tried to steal. They scored more times the 36 times they advanced successfully, but they lost some scoring opportunities by being caught 12 times.
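The arithmetic behind that figure, as a quick sketch. The helper name is made up for illustration, and the 69% to 89% break-even range is only the range recalled above, not a verified table.

```python
def steal_success_rate(stolen_bases, caught_stealing):
    """Fraction of steal attempts that succeeded."""
    attempts = stolen_bases + caught_stealing
    if attempts == 0:
        raise ValueError("no steal attempts")
    return stolen_bases / attempts


# Harris and Rowand combined, per the quoted post: 19 + 17 steals, 12 caught.
stolen = 19 + 17
caught = 12
rate = steal_success_rate(stolen, caught)
print(f"{rate:.0%}")  # 75%

# Break-even range as recalled in this comment (an assumption).
break_even_low, break_even_high = 0.69, 0.89
print(break_even_low <= rate <= break_even_high)  # True
```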

So I'm reluctant to measure speed as the opposite of power. I'm also reluctant to measure speed purely by stolen bases and success rate for much the same reason.

You can measure speed in any way you want, for example, number of times advancing first to third on a single, as you said, and call it baserunning ability (I would include #of SB and success rate also), but you need to add these types of factors into the equation, because you need to control for these factors. You've found a difference between teams, and based on the comments you made you seem to understand some of the reasons for the differences. But because your comments are observation based and not statistics based, the stat in its present form doesn't allow us to draw any meaningful conclusions or allow us to utilize it in any meaningful way.

A fast player on a home run hitting team will run less to prevent outs, because his chances of being driven in are better than average

Not sure about that. Be careful about such statements until you research them. A home run hitting team may not have a higher chance of driving someone in, depending on many other factors such as batting average, OBP, number of strikeouts, etc. And a manager's preference for hit and run, sacrifices (rare now, I know) versus stealing also play a role.

Got to go. Good luck.
   16. bibigon Posted: April 13, 2005 at 05:26 AM (#1252392)
I swear, this whole Snow Index Project thing is part of some psychological study to see how Primates react to things that bother them from the statistical side.

It's well documented how we react to traditionalists who try to create bogus stats, like Productive Out Percentage. Someone out there is watching us now to see what we'll do with an equally bogus and pointless stat being created, except from the other direction.

And if this is for real, which I'm seriously skeptical about, then what exactly is the point? The Snow Index Part I attempted to reinvent Linear Weights, and screwed up unbelievably badly while doing so.

Now we have the Snow Index Part 2, theorizing about run scoring efficiency, and not really providing any insight.

Mr. Snow, please don't take this personally. The issues that I have with this research are fairly simple:

1. You are covering pretty well charted ground. It's not that this sort of thing isn't worth trying in a vacuum, but what does this teach us that we don't already know?

Bill James has complained about the number of new stats being created to measure things which we've already figured out, for little purpose. I never gave his comments much weight in this regard, but I'm rapidly reconsidering things.

2. In covering this well charted territory, you're making some pretty serious errors based on assumptions that you take as fact. Furthermore, you seem to be spending your time running correlations on the wrong things.

3. Don't name a stat after yourself.
   17. Snoopy (#3 All-time in home runs) Posted: April 13, 2005 at 08:48 AM (#1252485)
I thought the point of the Snow Index was to show us the process by which statistics evolve. If Mr. Snow was trying to do something really groundbreaking, I doubt he would be showing us his notes while he was attempting it. After all, someone could steal his ideas. By attempting to reinvent the wheel, so to speak, he can at least pinpoint which direction he wants to go. I think this whole thing is just a heuristic process to show us how most ideas start out as crap and then evolve through refinements into something much better. That's why he's asking for suggestions, right? Having said that, here's a suggestion:

Mr. Snow, have you considered controlling for the number of outs at the time the runner gets on base? That would seem like a huge factor that affects whether the runner scores or not. Not controlling for it would mean that it's somehow included in the RPSR, introducing biases into the "luck" component you are trying to capture.
   18. Too Much Coffee Man Posted: April 13, 2005 at 11:46 AM (#1252518)
Part of this discussion reminds me of my wife's family. They will argue for an hour whether or not a light is on in the other room without any of them getting up to check.

The correlation between RPSR and XBH Ratio is .21. It's not clear to me why the latter term becomes the gold standard for a new statistic to measure up to. That said, there's very little meaningful variance shared by the two.
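For anyone who wants to get up and check the light, a minimal sketch of the calculation. The `pearson_r` helper is hand-rolled for illustration, and the two short lists are made-up placeholder values, not the figures from the article's chart.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


# Placeholder values only (NOT the real team data from the chart).
xbh_ratio = [0.22, 0.24, 0.25, 0.27]
rpsr = [0.31, 0.33, 0.30, 0.34]
print(pearson_r(xbh_ratio, rpsr))

# Shared variance implied by the r = .21 reported above: r squared.
shared_variance = 0.21 ** 2
print(round(shared_variance, 4))  # 0.0441
```

An r of .21 squares to roughly .044, so under a simple linear model XBH Ratio would account for only about 4% of the variance in RPSR.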
   19. Kyle Lobner Posted: April 13, 2005 at 03:24 PM (#1252821)
Ok, I've got 20 minutes before my lunch is done, but I want to respond to each of the last three posts before I'm busy for the rest of the afternoon, so here goes:

First and foremost, I could say "Mr. Snow is my father," but that's not even true. Please, call me KL.

Sometimes I waste time running correlations on something where there's no correlation at all. But you usually don't see that in my work. And, I guess, in this process, a "serious error," by my definition, would be defined as something that causes the ceiling in my apartment to cave in, and while it is doing that, I'm pretty sure it's not because of this project. Everything else I do I define as exploring a possibility. I don't expect most of these possibilities to be accepted, and in fact neither of the published ones has been. But largely, when I put a prospect out there, I'm not putting it out there as "This is the way it is." I'm putting it out there as "Here's what I'm thinking, would anyone care to offer an alternative suggestion?" My first article was buried under alternative suggestions, and I appreciated that. I spent most of a week digging through them and deciding what I could use. And you'll see the difference when I get back to linear weights.

In response to post 18: I guess, in the article, I spent a bit too much time playing up the "what would happen if RPSR was replaced by XBH Ratio" argument, such that people now think I'm putting too much weight on it. I'm not arguing that this is the only factor, I'm simply arguing that it's an easy factor to eliminate. And the concept I'm working on now to eliminate it will probably look considerably better than the one you just read. Unless, of course, someone provides me a better suggestion along the way.
   20. Nick S Posted: April 13, 2005 at 05:47 PM (#1253262)
If you haven't previously, read "How Runs are Really Created" on tangotiger.net, particularly part 2. What you seem to be doing here is looking for a theoretical justification for Tango's 'experimentally' determined linear formula for the "b" component of BaseRuns. You present a hypothesis (which base runners start at is an important factor in percent of baserunners who score.) You create two stats in an attempt to quantify the two variables in the hypothesis. And then stop. There should be some attempt to evaluate the relationship between your two variables, and from that some conclusion drawn. For instance, you could show the correlation between the two stats, preferably over a large sample size of teams. You'd show that there is surprisingly little correlation, particularly considering that XBH ratio probably has a positive correlation to slugging, which almost certainly has a positive correlation to RPSR. You might then conclude that XBH has little to do with RPSR and that is evidence that the hypothesis was wrong. Better yet, you could look at play-by-play data (retrosheet.org) and calculate for each team the percent of runners who start at 1b, 2b, and 3b, and the respective scoring percentages. How large are the differences? How much do 2bs and 3bs affect the weighted average (RPSR)? Does this vary much from team-to-team? It is great that you want to put time and thought into questions like this.
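A rough sketch of that play-by-play breakdown. It assumes the data has already been boiled down to one `(starting_base, scored)` pair per baserunner; that record shape and the function names are hypothetical, not Retrosheet's actual event format, and the sample records are invented.

```python
from collections import defaultdict


def scoring_by_start_base(runners):
    """Share of runners starting at each base, and how often they scored.

    `runners` is a list of (starting_base, scored) pairs, e.g. (2, True).
    """
    totals = defaultdict(int)
    scored = defaultdict(int)
    for base, did_score in runners:
        totals[base] += 1
        if did_score:
            scored[base] += 1
    n = len(runners)
    return {
        base: {"share": totals[base] / n,
               "score_pct": scored[base] / totals[base]}
        for base in sorted(totals)
    }


def weighted_scoring_rate(breakdown):
    """Overall scoring rate as the share-weighted average across bases."""
    return sum(v["share"] * v["score_pct"] for v in breakdown.values())


# Invented records: 8 runners start at first (2 score), 2 at second (1 scores).
runners = [(1, True)] * 2 + [(1, False)] * 6 + [(2, True), (2, False)]
breakdown = scoring_by_start_base(runners)
for base, stats in breakdown.items():
    print(base, stats)
print(weighted_scoring_rate(breakdown))
```

Comparing the per-base scoring percentages across teams would show how much of the spread in RPSR comes from where runners start rather than from what happens after they reach.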
   21. Michael Posted: April 13, 2005 at 10:16 PM (#1254123)
Does anyone else think of JT Snow when they see the snow index?

More work on any topic is good (marketplace of ideas) but the value of this work to the greater community seems pretty low. K.L. Snow may have value as he may be learning something from this (both from the process of thinking about it and doing it and from the comments he's receiving). I agree that more references to existing work would be helpful.
   22. Cabbage Posted: April 14, 2005 at 01:59 AM (#1255264)
I've said it before, and I'll say it again. This should really be called "The Informer Index"
   23. Mike Piazza Posted: April 14, 2005 at 04:05 PM (#1256121)
I've said it before, and I'll say it again:

I licky boom boom down!
   24. Duffy Duff Posted: April 14, 2005 at 10:22 PM (#1257128)
As far as the idea of looking at the "process", I think that is bogus, or certainly not worth the time if the end result is not enlightening.

I've been playing with baseball stats for 20 years, and have invented many dozens of stats, all of which seemed logical at the time. Really, only one of those stats has stood the test of time and is still considered the state of the art by the best analysts.

IOW, the end result is all that counts....
   25. Jimenez > Soriano Posted: April 15, 2005 at 10:44 PM (#1260321)
Gotta agree with #16. I'm betting this is a hoax. The fact that the author called it by his name after the whole Gleeman thing is just too much. Moreover, would the lords of PTF really let anything of this quality on the site if it were really a serious piece of work?
   26. Jimenez > Soriano Posted: April 15, 2005 at 10:51 PM (#1260342)
Oh, I know what it reminds me of now - the Sokal Affair.
   27. Mike Maddux Mike Posted: April 27, 2005 at 08:16 PM (#1293705)
And it reminds me of HEQ.