Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
There is a second article on a topic about which I am writing in the next version: putouts by first basemen, second basemen and shortstops. I realized late in writing this that my ways of handling these were poor and that I could do better, as both Bill James and Clay Davenport had (well, Bill James on first basemen's putouts; he handles putouts by middle infielders no better than I do).
As I commented on Fanhome, it is a great overall system (I think) - one that is able to use traditional team and player info and rigorously compute each player's fielding skill (with the help of some extraneous, yet critical, material from a "one-time" PBP database), rather than having to rely on such garbage metrics as "range factor".
Again, to be honest, I can't comment on the specifics of your methodology because it is difficult to wade through your article. I have a "feeling", however, that the methodology is sound.
I do wish that you (and some others to whom I have made similar comments) would write more in "English" than in a style more befitting the "American Journal of Applied Mathematics". As well, more "description" (again, in "English") and less formula-type prose would be helpful for feedback and understanding. For example, if I described (in nauseating detail) the exact formula I use for UZR, I think that I would lose lots of folks. Instead I describe, in easy-to-understand English, the basic idea. I might even get into some of the boring details, and if I do, I still try to present them in "English". If someone wants to know some of the exact "formulas" that go into my methodologies (and trust me, no one ever does - no one gets paid to "peer review" our articles), they can request them or I can supply appendices or something like that.
If I am in the minority in this regard, ignore these comments...
Here is an edited copy of my recent post on Fanhome, describing what I think is your "system" (I hope that either Charles or Mike is dgb100 or that dgb100 is a colleague of theirs, otherwise it looks like someone is stealing ideas from someone else) in 500 words or less:
Let me summarize what dgb is doing, which, as Tango states, is using all the traditional information available to come up with basically a ZR/RF (that's "slash", not "divided by") for each fielder.
He is first taking each team's BIP's. Then he is (or should be) separating that into GB's and FB's, using a team's GB/FB ratio (if available, of course; if not, we don't do that step - we simply assume a league-average GB/FB ratio, or we can estimate GB/FB ratio from a team's total IF assists and BIP's).
Now here's the nice part:
He determines how many balls per 100 GB's that each infielder (or how many balls per 100 FB's for outfielders) "should" catch (turn into outs - GB assists for IF'ers and FB putouts for OF'ers). He does this by using PBP data from a bunch of historical games to determine, on the average, what percentage of all ground balls (GB BIP's) are "caught" (turned into outs) by the SS, 3B'man, etc. For example, in the database, if the SS catches 10 ground balls per 100 GB's hit, then it is assumed that for every 100 GB's that team A allows (remember, we calculate how many GB's team A allows by taking their BIP's and applying their pitchers' GB/FB ratio), their SS should field 10 of them. If he only fields 8 (he has 8 GB "assists" per 100 GB's), then he is a below average fielder with an AFR of 0.8.
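The arithmetic in that example can be sketched in a few lines. This is only an illustration of the reading given in this post, not Charles's actual code; the function and parameter names are made up here, and the 10-per-100 baseline is the example figure from the post, not a measured value.

```python
def expected_plays(team_gb: float, baseline_per_100: float) -> float:
    """Ground-ball outs an average fielder would record on team_gb grounders."""
    return team_gb * baseline_per_100 / 100.0


def fielding_ratio(actual_plays: float, team_gb: float, baseline_per_100: float) -> float:
    """Actual plays divided by expected plays (the 'AFR' in the post)."""
    expected = expected_plays(team_gb, baseline_per_100)
    if expected <= 0:
        raise ValueError("need a positive number of expected plays")
    return actual_plays / expected


# Example from the post: baseline of 10 per 100 GB, shortstop fields 8 per 100.
print(fielding_ratio(actual_plays=8, team_gb=100, baseline_per_100=10))  # prints 0.8
```

A ratio below 1.0 means fewer plays than the historical baseline would predict for that many ground balls.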
He can further refine his system to take into consideration the handedness of the opposing batters and/or the handedness of a team's pitchers, depending upon what information is available. Obviously if a team has lots of LHP's they will face more RHB's than the average team; consequently they will allow more ground balls to the left side of the infield.
So, he can take the PBP database and figure out "how many ground balls does a SS catch per 100 GB's allowed when a RHB is at bat," and do the same for LHB's, and for FB's versus LHB's and RHB's as well.
If we don't have that kind of information available - handedness of opposing batters (which we probably don't unless we have some kind of a PBP database), then we can do the same thing using the handedness of the pitchers. For example again, we would use the historical PBP database to see how many GB's were caught by a SS on average when a LHP was on the mound and how many were caught with a RHP on the mound. Now we can look at team A and even if we don't know how many balls were put in play when their RHP's were on the mound and how many were put into play when their LHP's were on the mound, we can figure out the percentage of time a RHP was on the mound, the percentage of time a LHP was on the mound, and divide up the team's total BIP's accordingly (I guess if we want, we can compute from the individual player stats how many BIP's and GB's and FB's were allowed by each pitcher, and therefore by all their LHP's combined and all RHP's combined.)
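A rough sketch of that pitcher-handedness split, assuming the only team-level inputs are total ground balls and the share of time a LHP was on the mound. All names are hypothetical, and the per-100 baselines in the example are placeholder numbers, not values from any PBP database.

```python
def split_gb_by_pitcher_hand(team_gb: float, lhp_share: float) -> dict:
    """Divide a team's ground balls between LHP and RHP by time on the mound."""
    if not 0.0 <= lhp_share <= 1.0:
        raise ValueError("lhp_share must be between 0 and 1")
    return {"LHP": team_gb * lhp_share, "RHP": team_gb * (1.0 - lhp_share)}


def expected_plays_by_hand(team_gb: float, lhp_share: float, baseline_per_100: dict) -> float:
    """Expected plays for one position, using a separate baseline per pitcher hand."""
    gb_by_hand = split_gb_by_pitcher_hand(team_gb, lhp_share)
    return sum(gb * baseline_per_100[hand] / 100.0 for hand, gb in gb_by_hand.items())


# Placeholder baselines: SS plays per 100 GB with a LHP vs. a RHP on the mound.
ss_baseline = {"LHP": 11.0, "RHP": 9.5}
print(expected_plays_by_hand(team_gb=1000, lhp_share=0.30, baseline_per_100=ss_baseline))
```

The same shape would work for batter handedness if that information were available; only the keys and the source of the shares change.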
The next logical step is to put all of this nice methodology into an "easy to read and understand" formula, DGB!
Like EZR for an IF (estimated zone rating, or whatever you want to call it) = (player "A" GB assists) / ((team BIP) * (team GB/FB)) / (average player for position "A" GB assists per 100 GB's).
The above reads "A's GB assists divided by his team's total GB's" divided by "the average player at his position's GB assists per 100 GB's". This last term is a constant for each position in the field, based upon the historical database.
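Written out as an equation, the formula above might look like the following. The symbols are introduced here for illustration and are not from the article. In particular, "(team BIP)*(team GB/FB)" is read as an estimate of the team's ground balls, which means turning the GB/FB ratio r into a ground-ball share g of balls in play; the conversion g = r/(1+r) is an assumption that treats every ball in play as either a ground ball or a fly ball.

```latex
\mathrm{EZR}_A
  = \frac{A_{\mathrm{GB}} / \mathrm{GB}_{\mathrm{team}}}{\bar{a}_{\mathrm{pos}} / 100},
\qquad
\mathrm{GB}_{\mathrm{team}} \approx \mathrm{BIP}_{\mathrm{team}} \cdot g,
\qquad
g = \frac{r}{1 + r}
```

Here A_GB is the player's ground-ball assists and a-bar_pos is the positional constant (average GB assists per 100 GB's). With 8 GB assists per 100 team ground balls and a positional constant of 10, this gives 0.08 / 0.10 = 0.8, the same figure as in the shortstop example.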
You can refine the formula to account for a team's LHP and RHP's as discussed above, by putting in the appropriate conversion algorithm, and you can also include the formula for determining a middle IF'er's GB assists only (factoring out his CS assists I guess).
Charles, is that basically what you are doing? You are calculating a "normalized ZR" for each fielder by estimating how many balls should have been caught by an average fielder at that position given the estimated number of ground balls hit to the infield and the actual or estimated percentage of RHB and LHB at the plate!
Of course, in order to do this, you also have to estimate how many of a defensive player's PO and A are actually "balls caught" and in fact relate to defensive skill. For outfielders, it should be PO only (I don't know whether you incorporate OF assists in your "formula" for OF defense; if you do, you shouldn't - OF assists bear little relationship to OF defensive skill vis-a-vis the arm - you would need holds/extra bases and opportunities; using OF A only would be like using catcher or baserunner CS only - it tells you very little about overall value), and for IF, it should be mostly assists on GB only - as you properly explain, assists on steals (do they give an IF an assist on a CS?) should be ignored (subtracted), PO by all IF'ers other than the 1B should be ignored (other than by middle IF'ers if you want to incorporate DP's), and only PO's by the 1B'man should count when he fields the ground ball and makes the play himself (doesn't he also get an assist when this happens? If he does, then we can ignore PO's by 1B'men as well). Whatever you do, as you also state, pop-fly PO's by IF'ers should and must be ignored, as there is almost no relationship between a defender's number of pop-ups caught and his defensive skill - for obvious reasons.
It just doesn't seem as complicated as a glance at your article suggests. Am I missing something? (Please re-read my last 2 paragraphs.)
A putout results when (these are either or):
1) a fielder catches a fly ball or line drive - nope, that's not it.
2) catches a thrown ball which puts out a batter - that's not it either.
3) tags a runner - that's not it!
Well, I can see no definition that gives a fielder a putout (or an assist) when he makes a GB out unassisted...
BTW, a fielder who tags a runner on a CS or pickoff gets credited with a PO, I guess, according to definition 3 of a PO, and certainly not an assist. Does the catcher get an assist based on "throwing a [batted or] thrown ball that results in a putout," the "thrown ball" being the pitch?
MGL -- lotsa stuff, obviously. You're right, I should have an article structure that shows me going through the math step by step. I have been working on that for the next version. Problem is, it is slow going. In spite of my malaprop above (and it is probably not the only one), my training is as a writer, not as a statistician. Writing page after page about numbers just is not very interesting, but it is necessary in this case.
IF putouts -- again, I have been spending tons and tons of time on these. I do have better formulas to estimate unassisted putouts not only by first basemen, but by second basemen and shortstops as well. I discovered a few things about these. I, too, am a little skeptical of the value of middle infielders' putouts, though the unassisted numbers do have some year-to-year consistency. I think I have spent more time on this topic over the last year than any other defensive topic, and have written about eight formulae about this.
OF assists -- as in, I have no other way of estimating the impact of an outfielder's arm. There is some correlation between a high assist rate and a low advance rate, but only some; as I wrote above, it looked like the assist was keeping the runner it pegged from advancing, which is why I gave it the value I gave it. As a note, an outfielder with a large number of Baserunner Kills almost certainly did have a positive impact with his arm despite the number of advances against him. Even a fluke year like Gary Ward 1982 or Joe Orsulak 1992 probably has defensive value in spite of the extra advances.
LHB/RHB -- well, yeah, I am trying to measure opportunity, or more to the point, failed opportunity (since we already know successful opportunity). We went to the PBP data for this one. You figure the adjustment, and multiply it by failed opportunities. It is not as bad as it looks, but I would be open to a simpler way of calculating this.
Errors -- I am doing this; the error values show how likely the error put a man on base. For example, I figure an outfielder's error as 25% of the value of putting another man on base (0.50 + 0.09 + LgR/LgPA) plus 75% of the value of allowing a man to advance (0.18). As each position has a different "put the batter on first" rate, each position has a different value.
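The outfielder example works out as follows. This is a sketch of the one calculation quoted above, with hypothetical names; the 0.50, 0.09 and 0.18 constants and the 25%/75% split come from the paragraph, while the league runs-per-PA figure in the example is a placeholder.

```python
def error_run_value(
    on_base_share: float,
    lg_runs_per_pa: float,
    base_term: float = 0.50,      # from the post
    extra_term: float = 0.09,     # from the post
    advance_value: float = 0.18,  # from the post
) -> float:
    """Blend the cost of putting a man on base with the cost of an extra advance."""
    if not 0.0 <= on_base_share <= 1.0:
        raise ValueError("on_base_share must be between 0 and 1")
    on_base_value = base_term + extra_term + lg_runs_per_pa
    return on_base_share * on_base_value + (1.0 - on_base_share) * advance_value


# Outfielder: 25% "puts a man on base", 75% "allows an advance".
# lg_runs_per_pa=0.12 is a placeholder for LgR/LgPA, not a real league figure.
print(error_run_value(on_base_share=0.25, lg_runs_per_pa=0.12))
```

Other positions would pass a different on_base_share, which is what gives each position its own error value.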
DP opps -- that is what I am doing.
Run values -- what I am doing is figuring the value of each event and multiplying each plus/minus number by that value. If each infielder's assist has a weight of 0.234, I multiply the positive/negative number by 0.234 to determine how many runs that fielder saved/blew versus league average. I add them together to find the total plus/minus runs.
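In code, that weighting step might look like the sketch below. The 0.234 assist weight is the example figure from the paragraph; the event names and the double-play weight are hypothetical placeholders.

```python
def runs_vs_average(plus_minus: dict, weights: dict) -> float:
    """Sum of (plays above or below league average) times (run value per play)."""
    return sum(delta * weights[event] for event, delta in plus_minus.items())


weights = {
    "assists": 0.234,      # example weight from the post
    "double_plays": 0.40,  # placeholder, not from the post
}
# A fielder 12 assists above average and 2 double plays below average.
shortstop = {"assists": 12.0, "double_plays": -2.0}
print(runs_vs_average(shortstop, weights))
```

An event with no entry in weights raises a KeyError, which is deliberate here: silently skipping an event would understate or overstate the total.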
And this is where he makes the mistake - because whether or not a fielder should make a play is dependent upon the specific context in which the ball is hit - both the game context (runners on base, number of outs, game score, batter at the plate, pitcher on the mound) and the fielder context (fielder position relative to his teammates). Lumping all of these results together, and making value assignments based on the aggregate results from all fielders, makes the outcome highly susceptible to aggregation bias, where the group characteristics not only don't apply across the board to the individuals in the group but are highly likely to be significantly different for individuals in the group.
It is far less likely to introduce bias into the results to consider whether the fielder *could* make a play on the ball, and to penalize him to the full extent whenever a play is not made in an area where he could have made a play, even if when all teams are lumped together another fielder was more likely to have made the play. IOW, if a single goes through the SS hole, both the 3B and the SS should be penalized the full value of one single, because either could have made the play depending on the circumstances, and because you don't know the circumstances you can't make a valid a priori judgment as to which fielder *should* have made the play.
-- MWE
Now, if you tell me that with man on 2b, 0 outs, and RH at bat the SS only makes 60% of those plays, then fine, let's adjust based on this new data.
But to categorically make it 100% for both players, is, in my view, not a valid representation.
My position is that you identify every possible variable, situation, and context that you can think of, and base your best estimate on that.
ZR
- by zone
- by base-out state
- by score differential & inning
- by LH/RH batter
- by LH/RH pitcher
- by park
- by actual batter
- by actual pitcher
- by batter showing bunt / no-bunt
- by batter executing bunt / no-bunt
- by speed of runners on base
Anything else I missed?
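A minimal sketch of splitting an out rate by a few of those variables, assuming PBP records are available as plain dictionaries. The field names ("zone", "bat_hand", "out") and the sample plays are invented for illustration.

```python
from collections import defaultdict


def out_rate_by_context(plays: list, keys: tuple) -> dict:
    """Out rate per context bucket, where a bucket is the tuple of the chosen keys."""
    totals = defaultdict(lambda: [0, 0])  # context -> [outs, chances]
    for play in plays:
        context = tuple(play[k] for k in keys)
        totals[context][0] += 1 if play["out"] else 0
        totals[context][1] += 1
    return {context: outs / chances for context, (outs, chances) in totals.items()}


plays = [
    {"zone": "56", "bat_hand": "R", "out": True},
    {"zone": "56", "bat_hand": "R", "out": False},
    {"zone": "56", "bat_hand": "L", "out": True},
]
print(out_rate_by_context(plays, keys=("zone", "bat_hand")))
```

Every key added to the split leaves fewer plays in each bucket, so the finer contexts in the list will rest on small samples unless the database is large.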
There's supposed to be some tables for the 2001 data, but they aren't up yet.
Range outs / (Range outs + Hits Allowed - Home Runs Allowed)
Arm outs / Runners on base
And then I adjust.
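The two rates above are simple enough to sketch directly. Function names and the sample totals are made up for illustration, and the "and then I adjust" step is not modeled because it is not described here.

```python
def range_rate(range_outs: float, hits_allowed: float, home_runs_allowed: float) -> float:
    """Range outs / (Range outs + Hits Allowed - Home Runs Allowed)."""
    chances = range_outs + hits_allowed - home_runs_allowed
    if chances <= 0:
        raise ValueError("need a positive number of chances")
    return range_outs / chances


def arm_rate(arm_outs: float, runners_on_base: float) -> float:
    """Arm outs / Runners on base."""
    if runners_on_base <= 0:
        raise ValueError("need a positive number of runners on base")
    return arm_outs / runners_on_base


# Invented totals, for illustration only.
print(range_rate(range_outs=400, hits_allowed=1400, home_runs_allowed=150))
print(arm_rate(arm_outs=12, runners_on_base=300))
```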
I discovered the reduced formulae this spring. I don't use them because it is harder to make adjustments with them, but they do work on a basic level.
I was not being facetious. Those two formulae really do work.
Catcher assists do track opponents caught stealing. I am adding an adjustment based on passed balls allowed, which do loosely (r=0.50) track K23 assists, which improves the accuracy there. The problem is, both catcher assists and opponents caught stealing also correlate well with opponents stolen bases allowed, so a good assist total could well mean the opposite of what we assume it means, a good throwing catcher.
M.D.'s enthusiasm for spreadsheeting reminds me that Messrs. Saeger and/or Emeigh have previously been seen talking with Sean Forman about eventually adding CADish results to Baseball Reference. I.e., for every player season ever. Which of course would rule. I remember subsequent intimations that this might be impossibly hard.
So where does that project idea stand?
Enlist M.D. and others as aides!
If you want constructive criticism, you have to present the details of your method to your audience so that they understand your thought process and so that they can replicate enough of the work to feel comfortable about the path that you are taking (and to suggest improvements as warranted). If you hold back the details, on the other hand, you take the risk of undermining your own credibility, especially if your audience sees what appears to be an obvious flaw in your approach but can't confirm whether or not you've addressed it because you haven't provided the details. There's nothing to lose, and a great deal to be gained, from submitting an analysis method in all of its gory detail for independent analysis and assessment by your readers, many of whom probably know as much about the subject as you do, much as I hate to say it :)
The "holy grail" nature of defensive analysis comes about in large part because we have almost no information about fielder performance in relation to opportunity to perform. We have "opportunity contexts" for batters and pitchers, with a fairly complete record of their successes and failures. For fielders, we have a record of their successes (polluted by successes of other fielders that show up in their record) and only a partial record of their failures (even in zone-based systems), so we don't have a complete record of their "opportunity context". What Charles has attempted to do here, to the best of his ability, is to strip out the areas of pollution in the existing records and to derive an "opportunity context" for fielders based on information that we know, without trying to divvy up responsibilities based on what we think is true but which we can't support with empirical evidence. It's not a simple task, because the process of converting a ball in play into an out is *heavily* driven by contextual factors, to a far greater degree than either batting or pitching, and it's a strain just to make sure that those factors have been identified, let alone to ensure that they have been properly accounted for.
-- MWE