Dialed In

Wednesday, January 04, 2006

Who were the REAL MVPs?

The Major League Baseball writers voted for Alex Rodriguez and Albert Pujols as the AL and NL MVPs recently, and statheads around the country lauded them for “getting it right”.  But did they?

Every discussion I have read on the American League award focused largely on whether David Ortiz’s game-state performances should outweigh A-Rod’s overall performance compared to players at his position.  Big Papi’s lack of defense played a large role in the number of votes he received, and a very large one in the discussions here and around the baseball world.

Even in the National League, there was a good deal of clamor over whether a good defensive centerfielder hitting 51 home runs, leading the league in RBIs and lifting his team to its zillionth straight division crown was more deserving than a good-fielding, great-hitting first baseman – after all, he was still “just” a first baseman.

What I haven’t seen to date is a nice list of what every player contributed on both sides of the ball.  Defensive runs saved and offensive runs generated. 

There is a big problem – designated hitters, that scourge of baseball everywhere, don’t play defense.  So how do you quantify what they contribute to the defense?  Some want to say they are the worst fielder on their team, because that is the player a team chooses to put in the field instead of the DH.  One of the problems with that is that the DH may not have the ability to play that position – which I’m not sure mitigates the original question.

Perhaps they should only be “penalized” as the worst player in the league at the position they *would* play, were they to play. 

But that doesn’t really cover it either, because all that is really required is that they be a worse fielder than the player who is playing the position for the team.  I mean, being a DH when John Olerud is your defensive first baseman isn’t really damning.

However, if you aren’t very capable of playing defense, you are sucking up a roster spot and hurting your team overall defensively.  In addition, your team is stuck in interleague play on the road.  Okay, that’s just 8 games, but it is 5% of the time.

I’m not sure what the answer is, but I am certain a designated hitter that cannot play in the field adequately damages his team more than a player that plays defense poorly.  This we can be sure of because teams do choose to play a “Manny Ramirez” over a “David Ortiz”.

That is effectively saying that Ortiz at first base with Manny DHing and Jay Payton in LF is a *worse* lineup than Ortiz at DH, Manny in LF and Kevin Millar/Olerud at first base.

We know the Red Sox will make decisions on defensive value, and they stick with this lineup.  People argue that the Red Sox don’t do that for defensive reasons, but for health reasons.  Ortiz probably couldn’t take the grind.  That’s another reason to de-value Ortiz’ abilities, but does it de-value his performance?

For me, I’ll provide you with the two categories, and let you add your own weighting for the DH.  I personally discount the DH performance to a bad fielding first baseman – around -15 runs – think Mike Piazza or Frank Thomas.  In all honesty, that overstates the DH value, because if Frank Thomas can manage only a -15 and I can get another bat like Ortiz in the lineup, that’s very important.  Okay, that was a bit rant-like.

My ratings for defense can be found here.

My offensive ratings are Jim Furtado’s Extrapolated Runs above average at position, park-adjusted.  Why?  Because I already have all the spreadsheets set up, and it does a very good job, even compared to BaseRuns.

For these purposes, there is nothing wrong with using “average” as the baseline – it doesn’t undervalue an average performance for this usage.  Plus, you don’t put your eye out trying to guess at a defensive “replacement level”.

American League


Player	Team	pos	Offense	Defense	Total
rodriguez,alex	NYY	3B	81.0	-13.5	67.5
roberts,brian	BAL	2B	47.8	3.8	51.6
ortiz,david	BOS	DH	51.6	-1.6	50.0
hafner,travis	CLE	DH	49.0	Dnp	49.0
guerrero,vladim	LAA	RF	41.8	1.3	43.1
peralta,jhonny	CLE	SS	28.3	8.9	37.2
martinez,victor	CLE	C	39.4	-6.0	33.4
ellis,mark	OAK	2B	21.2	11.3	32.5
mora,melvin	BAL	3B	25.8	6.1	31.9
mauer,joe	MIN	C	23.3	8.0	31.3
chavez,eric	OAK	3B	17.8	13.2	31.0
crisp,coco	CLE	LF	19.9	10.2	30.1
teixeira,mark	TEX	1B	23.2	6.3	29.5
giambi,jason	NYY	1B	38.6	-10.3	28.3
crawford,carl	TB	LF	17.2	10.5	27.7
jeter,derek	NYY	SS	26.1	1.5	27.6
varitek,jason	BOS	C	29.1	-4.3	24.8
sizemore,grady	CLE	CF	22.6	1.4	24.0
gomes,jonny	TB	DH	18.7	3.5	22.2
polanco,placido	DET	2B	18.6	2.3	20.9
lugo,julio	TB	SS	15.3	5.6	20.9
young,michael	TEX	SS	28.0	-7.1	20.9
posada,jorge	NYY	C	21.1	-0.3	20.8
matsui,hideki	NYY	LF	26.8	-6.7	20.1

Offense is XR runs above average, park-adjusted for a player’s playing time.
Defense is runs prevented above average for a player’s playing time.
Thanks to Doug’s Stats for the offensive stats.
The decimal places are not meant to indicate a level of accuracy; they are there so you can see where the math comes out.

Well, Ortiz’ clutch-hitting notwithstanding, A-Rod was definitely the correct MVP.  He had the best bat by a wide margin.  As you may note, despite my rant, I did not dock the DHs for defense.  That’s wrong in the overall analysis, but I’ll let you make your own adjustment.
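
For anyone who wants to make that adjustment, here is a minimal sketch. It assumes a flat full-season penalty for anyone listed at DH (defaulting to the -15 runs suggested earlier) added on top of whatever fielding runs the player already has. The function name and that stacking choice are illustrative assumptions, not part of the ratings themselves.

```python
# Hypothetical helper: total value with a reader-chosen DH fielding penalty.
# Offense/defense figures are copied from the AL table above.

def total_runs(offense, defense, pos, dh_penalty=-15.0):
    """Offense + defense, with a flat penalty added for players listed at DH.

    defense is None for players who did not play the field (Dnp in the table).
    """
    fielding = defense if defense is not None else 0.0
    adjustment = dh_penalty if pos == "DH" else 0.0
    return round(offense + fielding + adjustment, 1)

players = [
    ("rodriguez,alex", "3B", 81.0, -13.5),
    ("roberts,brian", "2B", 47.8, 3.8),
    ("ortiz,david", "DH", 51.6, -1.6),
    ("hafner,travis", "DH", 49.0, None),
]

for name, pos, off, dfn in players:
    print(f"{name:16s} {pos:3s} {total_runs(off, dfn, pos):6.1f}")
```

Pass `dh_penalty=0.0` to reproduce the totals as listed in the table; any other value is your own weighting.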

Look at that – Brian Roberts was the second most valuable player in the American League.  What a great season for him.  He has to be the bargain of the year.  Not a great bet to repeat, but a great season for him.

Ortiz played first base for 78 innings.  In that time, he cost the Sox two runs.  You don’t want that out there for 780 innings, much less 1400.  Playing Manny is probably the right move (Manny isn’t on the list, but ended up at +14 runs).
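
The scaling behind that remark can be sketched as below. It assumes fielding runs grow in direct proportion to innings, which a 78-inning sample cannot really establish, so treat the projections as rough; the function name is made up for illustration.

```python
# Linear extrapolation of Ortiz's first-base rating to larger workloads.
# Assumes runs scale in proportion to innings played (a strong assumption
# for a 78-inning sample).

def scale_runs(runs, innings, target_innings):
    """Project a fielding runs figure to a different number of innings."""
    return runs * target_innings / innings

ortiz_runs, ortiz_innings = -1.6, 78  # from the AL table and the text above

for target in (780, 1400):
    projected = scale_runs(ortiz_runs, ortiz_innings, target)
    print(f"{target:5d} innings: {projected:6.1f} runs")
```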

Travis Hafner, one of the top five AL players last year, didn’t play in the field.  He is a great secret.  Sure the Indians are becoming popular, but Grady Sizemore and Jhonny Peralta are getting the press.  Hafner is going to be a top candidate for the MVP for a few more years.

Above are the players that were twenty runs above average at their position.  It’s a nice list, with a good variety of teams and positions.

There are five Yankees on the list, and Sheffield was just off of it.  That’s a good team.

There are also five Indians on the list, and they are all twenty nine or younger.  That’s a good team.

National League

Player	Team	pos	Offense	Defense	Total
lee,derrek	CHN	1B	59.7	0.0	59.7
utley,chase	PHI	2B	34.3	19.4	53.7
giles,brian	SDP	RF	48.5	4.3	52.8
pujols,albert	STL	1B	52.6	-1.3	51.3
ensberg,morgan	HOU	3B	38.4	4.3	42.7
bay,jason	PIT	LF	40.9	-2.2	38.7
kent,jeff	LAD	2B	35.4	1.4	36.8
jones,chipper	ATL	3B	31.8	4.9	36.7
edmonds,jim	STL	CF	32.5	3.7	36.2
wright,david	NYM	3B	36.4	-5.0	31.4
lopez,felipe	CIN	SS	31.7	-2.2	29.5
winn,randy	SFG	CF	23.7	5.7	29.4
cabrera,miguel	FLA	LF	36.2	-7.2	29.0
furcal,rafael	ATL	SS	23.2	5.5	28.7
drew,j.d.	LAD	RF	21.8	6.5	28.3
helton,todd	COL	1B	19.1	8.6	27.7
hall,bill	MIL	SS	23.9	2.0	25.9
floyd,cliff	NYM	LF	16.1	9.7	25.8
abreu,bobby	PHI	RF	31.5	-5.8	25.7
jones,andruw	ATL	CF	25.3	-0.2	25.1
jenkins,geoff	MIL	RF	17.2	7.1	24.3
dunn,adam	CIN	LF	26.4	-2.5	23.9
delgado,carlos	FLA	1B	31.2	-8.2	23.0
burrell,pat	PHI	LF	17.9	4.7	22.6
rollins,jimmy	PHI	SS	18.4	1.6	20.1
valentin,javier	CIN	C	17.8	0.3	18.1

Chart key as above.

I added Javier Valentin because he was the highest rated NL catcher.  He’ll be the sleeper in next year’s fantasy leagues.

So it looks like the voters got this one wrong – sort of.  Sure it’s close enough to not really be a travesty, but it looks like Lee was the better performer.  In addition, we can see Chase Utley and Brian Giles being top performers as well.  Utley, like Roberts in the AL, was a great bargain for maximum production.  The problem will be, in Philly, that Utley doesn’t “look like” a second baseman.  He’ll be an all-star there if he’s allowed to play it. 

Brian Giles wasn’t much of a secret before and now he has re-signed with San Diego.  That’s a great deal for the Padres.  Teams would have really benefited from Giles signing with them.  I would bet he has four more top-notch seasons in him.

It is interesting to note that JD Drew is in the top twenty considering he missed most of the season.  The combination of his injuries, his holdout and being platooned has probably sidetracked what could have been a stellar career.

All in all, the MVP awards were given to very deserving candidates.  What we did not see was deserving candidates being considered, like Giles and Utley and Roberts.  It isn’t likely that Utley and Roberts will be in this lofty position very often, so finishing high in the MVP voting is a good reward when you do deserve it.

No, I didn’t list any pitchers here.  We can discuss them, but that’s a different ranking system.

I may have missed someone else that performed at 20 runs above average, but I don’t think so.

Complete player rankings will be available (all players) when I combine the defense and offense.  That’s a bit of work.

Chris Dial Posted: January 04, 2006 at 05:45 AM | 197 comment(s)

Reader Comments and Retorts

   101. Damon Rutherford Posted: January 05, 2006 at 12:31 AM (#1806491)
The first time you ever heard someone say "A single is worth 0.47 runs" you got it immediately? You didn't go, "Huh?"

I didn't first hear or read about a single being worth 0.47 runs out of context from the rest of the methods. When I first read Bill James's Runs Created method and whatever I first read regarding linear weights, it immediately made a whole lot of sense. Keep in mind I was already an electrical (and biomedical) engineering graduate student at the time. RC and linear weights were easy to comprehend compared to what I was studying. But the defensive metrics ... not as easy, probably because I couldn't actually see (or do myself) the actual data and equations, or if I could, it wasn't as familiar to me as hits, BB, HR, AB, PA, probability of scoring based on base and out conditions, etc.
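
As a rough illustration of why linear weights are easy to follow: each event gets a fixed run value and the values are summed. Only the 0.47 for a single comes from this thread; the other weights below are commonly cited approximations and should be read as assumptions, as should the sample season line.

```python
# Sketch of a linear-weights batting runs estimate.
# Only the single's weight (0.47) appears in the discussion; the rest are
# commonly cited approximate values, used here as assumptions.
WEIGHTS = {
    "1B": 0.47,
    "2B": 0.78,
    "3B": 1.09,
    "HR": 1.40,
    "BB": 0.33,
    "OUT": -0.25,
}

def linear_weights_runs(events):
    """Sum run values over a dict of event counts (runs relative to average)."""
    return sum(WEIGHTS[event] * count for event, count in events.items())

# Hypothetical season line, for illustration only.
season = {"1B": 100, "2B": 30, "3B": 3, "HR": 25, "BB": 60, "OUT": 400}
print(round(linear_weights_runs(season), 1))
```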

That makes you a lot smarter than everyone else.

Not necessarily. Best not to compare oneself (at least publicly!) with others.
   102. greenback Posted: January 05, 2006 at 12:54 AM (#1806525)
One of the problems with the defensive metrics is that we have no idea what the "correct" answer is, at the personal level, team level, or league level.

Have you read any of the Blyleven threads? OK, "no idea" would be an overstatement on pitching metrics, but it's probably an overstatement for defense metrics as well.

Dial, since you've taken to wearing your Professor Hat in this thread, do you know of any reason to use defensive stats based on traditional data (PO, A, E) when ZR is available?
   103. Chris Dial Posted: January 05, 2006 at 01:00 AM (#1806533)
Greg,
that is possible.

I didn't read James until I was 27, and had little issue with it for similar reasons (in that - they gave reasons immediately, and with my age and math experience, the concept wasn't foreign like it would be to a 12-yr old).

However, I think in all these metrics, the explanation *has* been fully offered - BUT, you may have heard, as so many do, that ZR is flawed or "defense cannot be quantitated" enough times, so you don't approach the methodologies openly.

Steve has already decided that the sample size is too small, despite not examining it at that level (and I don't want to pick on Steve - that's just an example).

Andere,
read the last three defense articles, and I have leveled those criticisms at UZR many times on this site - usually without specific response, but MGL responds in the discussion in the D ### article.
   104. Chris Dial Posted: January 05, 2006 at 01:02 AM (#1806539)
greenback,
no.

However, because we don't have ZR before about 1988, using ZR to develop a good metric *solely* from traditional data is a very good idea.
   105. Chris Dial Posted: January 05, 2006 at 01:05 AM (#1806546)
One of the problems with the defensive metrics is that we have no idea what the "correct" answer is, at the personal level, team level, or league level. You can't look at the Padres statistics and figure out how many runs their defense prevented, the way you can look and see how many runs they scored.

I don't know that this is true. It may be true to state today, but it may soon be that you *can* just look at the stats and get the defense.

There are those that say you *can* just look at DER and tell how many runs the defense prevented. They're wrong.

I get a rough correlation of team ZR and DER around 0.32 (r^2)
   106. Steve Treder Posted: January 05, 2006 at 01:08 AM (#1806551)
Steve has already decided that the sample size is too small, despite not examining it at that level (and I don't want to pick on Steve - that's just an example).

I realize you're just using me as an example of prejudging, and fair enough. But for the sake of clarity: I haven't "decided" that the sample size is too small; I am suspicious about sample size issues here, because that's always one of the likely culprits when you get readings that appear to be quite unstable.
   107. Chris Dial Posted: January 05, 2006 at 01:10 AM (#1806554)
But my feeling, and I could be completely wrong about this, is that the only people who know these nuances are you, MGL, DSG, Rally, etc., and very few people who are equipped to think about it critically. You know those nuances because you've been getting down and dirty with the data, but that's asking a lot from people

Anyone who has scored ZR or Project Scoresheet may have noticed. I admit, Mike Emeigh and I have probably been more down with BIP distribution than anyone else here, but...

Ah hell, I suppose I can just demonstrate it better with some charts - that's going to be difficult.

If everyone prints out my What is ZR article and a copy of the grid, and then colors in the areas described, it helps to visualize what BIP distribution can mean. Raw counts would certainly help. I may be able to do something there too.
   108. Spivey Posted: January 05, 2006 at 01:14 AM (#1806561)
Dr Memory - that's why I didn't vote for AP for first. I voted for Roger for 1st.

AP was second for exactly the reason you mentioned.


I don't understand. If you are agreeing with the hypothetical argument that Dr. Memory is putting out there - that STL didn't need much or any of Pujols' production, how can you even vote for him? Seems like his contribution would be considered worthless in your eyes.
   109. Chris Dial Posted: January 05, 2006 at 01:17 AM (#1806567)
True, Steve, and I stated as much in this discussion that I was fearful for years about the sample size issue. It does exist - I used a 1000 inning cutoff to make sure there was some sample.

It's a valid concern, but I am pretty sure that sample size isn't the issue with these fluctuations - well, sort of. Obviously as long as two players can get different BIP distributions, there is some sample size effect.

The theory is that *eventually* the distribution will even out (over a career).

For some reason, that *hasn't* happened for Jeter (read: the Yankees at SS) and the Braves at 3B. Even through pitching staff changes and tendency changes. Why? It's probably related to the pitching styles. If you have the same pitching coach, the pitchers all have a general philosophy (and I'm not going to debate that here), so even though the pitchers change, the BIP distribution doesn't change (much) relative to the league.

I know this happens for different fielders on the same team and with different pitchers. Heck, maybe it is a park effect. Not sure.
   110. greenback Posted: January 05, 2006 at 01:19 AM (#1806575)
However, because we don't have ZR before about 1988, using ZR to develop a good metric *solely* from traditional data is a very good idea.

Yes, that makes sense. The world still would be a happier place if a moratorium was placed on the development of trad-data fielding metrics.
   111. AROM Posted: January 05, 2006 at 01:26 AM (#1806591)
I've got a correlation of .58 for the 2004 season, using ZR plays made above/below average (not runs!) compared to team DER plays vs average.

Colorado is the biggest outlier, with -6 ZR and -106 DER. I think STATS treats their zones a little differently because of the park. For other teams, a hitters park will affect both DER and ZR. Removing Colorado only improves the correlation to .61 anyway.

I'll try and put my team numbers somewhere where people can look at them.
   112. Chris Dial Posted: January 05, 2006 at 01:34 AM (#1806609)
Rally,
we can attach them to your article and get a fronter link, I think.
   113. AROM Posted: January 05, 2006 at 02:07 AM (#1806657)
No need to, I'll just put them right here.

Here is team ZR+ compared to DER+ (Defense efficiency rating):

Team, DER, ZR
ari, -49, 22
atl, -4, 32
bal, -9, 2
bos, -54, -72
chc, 24, 31
cin, -77, -37
cle, 69, 39
col, -106, -6
cws, 79, 49
det, 4, 38
fla, -72, -20
hou, 45, -19
kc, -127, -45
laa, 22, 20
lad, 21, 2
mil, 12, -16
min, 36, 11
nym, 30, 14
nyy, -18, -89
oak, 93, 37
phi, 41, 25
pit, 0, 8
sd, -22, 31
sea, 31, -26
sf, 22, -7
stl, 39, 23
tb, -58, -37
tex, -60, -24
tor, 27, 16
was, 13, -2



The correlation between the 2 columns is 0.58. Zone rating doesn't explain all of DER, because of balls hit out of zones. Balls hit outside of zones are harder to explain, predict, and grok. There are many factors involved, including pitching, ballparks, and probably some luck thrown in.

We can measure pitching through the pitching independent stats (BB, HR, SO, BB) and defense through ZR, but these alone can't tell us as much about how many runs the team will allow as runs created or linear weights does about offense.

Balls out of zone is the missing ingredient. It's part defense (sometimes fielders can make plays on these balls), part pitching, part ballpark (think Green Monster or anywhere with larger than normal foul territory), part who knows what.
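
For anyone who wants to check the correlation from the table above, here is a small sketch using a plain Pearson coefficient (an assumption about how the 0.58 was computed; the variable and function names are made up). It should land close to the quoted 0.58, and squaring it gives an r^2 in the low-to-mid 0.3s, in the same neighborhood as the 0.32 mentioned earlier in the thread.

```python
# Pearson correlation between team DER+ and ZR+ using the table above
# (plays made above/below average, 2004 season).
from math import sqrt

TEAMS = {
    "ari": (-49, 22), "atl": (-4, 32), "bal": (-9, 2), "bos": (-54, -72),
    "chc": (24, 31), "cin": (-77, -37), "cle": (69, 39), "col": (-106, -6),
    "cws": (79, 49), "det": (4, 38), "fla": (-72, -20), "hou": (45, -19),
    "kc": (-127, -45), "laa": (22, 20), "lad": (21, 2), "mil": (12, -16),
    "min": (36, 11), "nym": (30, 14), "nyy": (-18, -89), "oak": (93, 37),
    "phi": (41, 25), "pit": (0, 8), "sd": (-22, 31), "sea": (31, -26),
    "sf": (22, -7), "stl": (39, 23), "tb": (-58, -37), "tex": (-60, -24),
    "tor": (27, 16), "was": (13, -2),
}

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

r_all = pearson(list(TEAMS.values()))
r_no_col = pearson([v for k, v in TEAMS.items() if k != "col"])
print(f"all teams:   r = {r_all:.2f}, r^2 = {r_all ** 2:.2f}")
print(f"without col: r = {r_no_col:.2f}")
```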
   114. Chris Dial Posted: January 05, 2006 at 02:14 AM (#1806667)
We can measure pitching through the pitching independent stats (BB, HR, SO, BB) and defense through ZR, but these alone can't tell us as much about how many runs the team will allow as runs created or linear weights does about offense.

Balls out of zone is the missing ingredient.


Very good!

That's where chances come in. I'll do a nice little trick with this after a while.
   115. GuyM Posted: January 05, 2006 at 05:38 AM (#1806996)
It's a valid concern, but I am pretty sure that sample size isn't the issue with these fluctuations - well, sort of. Obviously as long as two players can get different BIP distributions, there is some sample size effect.

Chris, it's true that the # of BIP isn't that much smaller than the # of PAs. But you noted yourself that many balls are totally unplayable. An even larger proportion (at least in OF) are playable by any fielder. So the # of marginal BIP -- plays for which there is any doubt at all about the outcome -- must be far smaller than PA. And it is really only the outcomes of these plays that distinguish good from bad fielders. (To find out if some players were better than others at handling IF popups, you'd need an N of 10,000, since 98% of them are caught.) So it seems possible to me that the measurement error for fielding metrics will be larger than for hitting and pitching metrics.
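
To put rough numbers on the popup point, here is a sketch of the standard error of an observed success rate under a simple binomial model. The 98% figure is from the comment; the independence assumption, the .270 average and the 550 at-bats are illustrative assumptions.

```python
# Standard error of an observed success rate under a simple binomial model.
from math import sqrt

def standard_error(p, n):
    """Standard error of a proportion p observed over n independent trials."""
    return sqrt(p * (1 - p) / n)

# Infield popups: about 98% caught (figure from the comment above).
for n in (100, 1000, 10000):
    print(f"popups,  n={n:5d}: +/- {standard_error(0.98, n):.4f}")

# A batting-average-like rate over a season of at-bats
# (0.270 and 550 are illustrative assumptions).
print(f"batting, n=  550: +/- {standard_error(0.270, 550):.4f}")
```

The noise on a 98% rate only matters relative to how much fielders truly differ on those plays; if that spread is a fraction of a percent, it takes on the order of thousands of chances to see it.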
   116. Steve Treder Posted: January 05, 2006 at 06:40 AM (#1807193)
So the # of marginal BIP -- plays for which there is any doubt at all about the outcome -- must be far smaller than PA. And it is really only the outcomes of these plays that distinguish good from bad fielders.

Yes. This is a second way in which # of BIPs and # of PAs might not be truly comparable standards.
   117. DSG Posted: January 05, 2006 at 10:44 AM (#1807410)
Heck, you can all do yourself a favor and score the same game on a UZR grid and a ZR grid, and then you'll see why I have some issues with bigger zones.

########! ZR does NOT have smaller zones than UZR; it has ONE big zone for each position. You can take issue with Mitchel not using smaller zones in UZR (I happen to agree that he should), but his zones are much smaller than those used for ZR, since ZR does not distinguish between a ball hit into zone G from a ball hit into zone F (names for illustration only -- I'm not actually looking at a chart of all the zones).

Also, ZR does not look at batted ball speed. ZR does a crappy job of treating balls out of zone. If you have UZR, you want to use UZR, if you have ZR, you want to use some combination of ZR and Range or DRA or whatever. To say that ZR is the best defensive metric out there, and that its limitations are imagined is pure, well, I already said it once.
   118. Rusty Priske Posted: January 05, 2006 at 01:09 PM (#1807431)
You aren't going to get a resolution here because no one has the same definition of what an MVP actually is.

I used to agree with Mr. H.S. I remember saying there was no way that A-Rod should get the MVP back when he was with Texas, because how valuable could he be to a last-place team?

I have come around on that thinking. I now believe that the best all-around performance deserves the MVP, no matter what the outcome. This is because it is an individual award, not a team one.

Derrek Lee had better all-around performance this year than Albert Pujols did.

Period.
   119. Chris Dial Posted: January 05, 2006 at 01:32 PM (#1807435)
it's true that the # of BIP isn't that much smaller than the # of PAs. But you noted yourself that many balls are totally unplayable. An even larger proportion (at least in OF) are playable by any fielder.

Er, we've missed each other. The # of BIP for a SS is approx 530 - that approximates PAs.

The lowest fieldable by any position is something like 350 - which is certainly a good batting sample if smaller, and that's at first base where the skill range is expected to be smaller.
   120. studes Posted: January 05, 2006 at 02:06 PM (#1807442)
Another great fielding discussion. I learn a little more from each one.

I've got to say that Runs Created is a totally different situation. I bought into RC the minute I first read about it. It's simple and accessible. These fielding systems aren't. I'm more accepting of the output than Steve and Greg are, but I can't say I deeply understand them.

To me, the huge lesson that Chris, Rally, DSG and others have shown us is that fielding counts (see Chris's MVP list for examples) and that we all should really invest ourselves in understanding fielding systems and spreading the word.

And I think using Game States or Win Probability as your primary criterion for MVP is just silly and an inappropriate use of the stat. I've focused a lot on WPA at THT, but I've never suggested it should be used that way.
   121. Chris Dial Posted: January 05, 2006 at 02:15 PM (#1807444)
########! ZR does NOT have smaller zones than UZR; it has ONE big zone for each position. You can take issue with Mitchel not using smaller zones in UZR (I happen to agree that he should), but his zones are much smaller than those used for ZR, since ZR does not distinguish between a ball hit into zone G from a ball hit into zone F (names for illustration only -- I'm not actually looking at a chart of all the zones).

David,
you'll do better at this argument if you operate from "Chris knows what he's talking about" rather than "Chris is wrong."

ZR, the score, is made up of balls hit to *a data summary of small zones*.

The group of zones used to make up a fielder's area of responsibility are zones where players at that position regularly (>50%) convert the plays into outs.

The crux of the problem with #BIP is that, as GuyM and Rally state above, many balls are unfieldable.

ZR doesn't count those balls against the fielders. UZR does count ground balls like that.

You should re-read my articles.

Even MGL says his data is in smaller zones and he converts them to larger zones. Maybe *you* should take it up with MGL about that.

Also, ZR does not look at batted ball speed.

Does that make a significant impact? Demonstrate that. I don't believe it does *from the view that it unfairly impacts some fielders over others*. That is, a fielder gets a wildly disproportionate number of difficult-to-field balls such that it distorts his skill as a fielder.

Please - demonstrate that.

ZR does a crappy job of treating balls out of zone.

Their treatment is suboptimal. However, this is only a factor if the number of OOZ plays is significant. Can you demonstrate this is a "fatal flaw"?

If you have UZR, you want to use UZR,

No, I don't. It's using the wrong data. Or rather the data wrong. You don't have to agree with that, but I am correct. Sooner or later, MGL will recalculate UZR using STATS zones, and you'll see I'm right.

if you have ZR, you want to use some combination of ZR and Range or DRA or whatever.

No, I don't. Those add nothing to ZR. Unless you have some brilliant DP tweak. And when I say nothing I mean that Range adds nothing. Well, it adds confusion and murkiness, but it's not remotely as sophisticated as ZR. Range is like using BA instead of OPS+. Sure OPS+ doesn't weight OBP properly and has other issues, but that doesn't make it a bad stat. No, OPS+ isn't as good as using LWts or whathaveyou, but it does a good job for comparative purposes.

DRA has some interesting characteristics - MAH sent me some work he had done with it right before the holidays (maybe before the Thanksgiving ones), but I haven't gotten around to diving in - but he had really good comparative scores.

To say that ZR is the best defensive metric out there,

Did I say that? I did say my interpretation of ZR is the best out there. And I agree that it is. That doesn't mean it doesn't have limitations.

and that its limitations are imagined is pure, well, I already said it once.

I didn't say that either. But I agree, the things you complain about are, well, you already said it once.

I'm sorry you like UZR better, but your understanding of the systems is clearly limited, so you are better off not shouting about it.
   122. Chris Dial Posted: January 05, 2006 at 02:17 PM (#1807445)
I've got to say that Runs Created is a totally different situation. I bought into RC the minute I first read about it.

Evidently I was wrong about that one. I was thinking of LWts apparently.

Or you are all smarter than me - which is no great leap.
   123. GuyM Posted: January 05, 2006 at 02:25 PM (#1807447)
I assume you mean BIP in the 50% zones? In that case, you've excluded uncatchable balls (sort of). Still, suppose an OF gets 450 chances, but 300 of those are balls that will be caught 99% of the time. Our sample size for BIP that effectively test his fielding ability is now just 150. I'd guess this is less true for IF, but still an issue.
   124. Chris Dial Posted: January 05, 2006 at 02:42 PM (#1807456)
Still, suppose an OF gets 450 chances, but 300 of those are balls that will be caught 99% of the time.

Fair enough, but PAs will be outs - what - 67% of the time? What's the variance on BA? 40 points?

It's not identical, but it is a reasonable sample size. It's analogous.
   125. Mister High Standards Posted: January 05, 2006 at 02:51 PM (#1807458)
"And I think using Game States or Win Probability as your primary criterion for MVP is just silly"

Not nearly as silly as saying a typical single, in a typical game, at a typical time, playing for a typical team, in a typical lineup slot is worth .47, so we will treat every single the same in every inning, for every team, for every lineup slot, for every base-out situation, etc. When you treat them the same you're not distinguishing between value and ability.

A single with 2 outs and a runner on second leads to more runs than a single with 2 outs. A run in a 3-3 game in the 9th leads to more wins than a single in 7-2 game in 8th inning. The idea is to measure how many wins a player actually contributed. Not how many wins a player might have hypothetically contributed if he was in a vacuum.
   126. Andere Richtingen Posted: January 05, 2006 at 02:55 PM (#1807459)
Did I say that? I did say my interpretation of ZR is the best out there. And I agree that it is.

So Dial agrees with Dial that his interpretation of ZR is best. I think that settles it.

Kidding of course...this is actually a great thread.
   127. Slinger Francisco Barrios (Dr. Memory) Posted: January 05, 2006 at 03:28 PM (#1807484)
I don't understand. If you are agreeing with the hypothetical argument that Dr. Memory is putting out there - that STL didn't need much or any of Pujols' production, how can you even vote for him? Seems like his contribution would be considered worthless in your eyes.

Spivey--that is indeed an inescapable conclusion of his argument as stated, so perhaps he isn't explaining himself as well as he might. What he also isn't seeing is that it applies even better to Clemens, who probably has *less* overall impact on his team's success than does Pujols or Lee (or Berkman).

In Clemens' particular case, even more damning is that the team did better in games he didn't pitch.
   128. Chris Dial Posted: January 05, 2006 at 03:47 PM (#1807499)
So Dial agrees with Dial that his interpretation of ZR is best. I think that settles it.

True enough, but ZR does require interpretation to runs and then compared to average to be clearly applicable. What a 0.865 ZR means isn't very apparent.
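
As a sketch of what that interpretation step could look like: compare the player's ZR to the league rate at his position, multiply by his chances to get plays above average, then multiply by a run value per play. The league rate, chance count and runs-per-play figure below are placeholders chosen for illustration, not numbers from the article or from STATS.

```python
# Sketch: convert a zone rating into runs above/below average.
# League ZR, chances, and runs-per-play values below are placeholders,
# not figures from the article or from STATS.

def zr_runs_above_average(player_zr, league_zr, chances, runs_per_play):
    """Plays made above an average fielder, converted to runs."""
    plays_above_average = (player_zr - league_zr) * chances
    return plays_above_average * runs_per_play

# Hypothetical shortstop: 0.865 ZR against a 0.845 league mark, 500 chances,
# and an assumed 0.75 runs per additional play made.
runs = zr_runs_above_average(0.865, 0.845, 500, 0.75)
print(f"{runs:+.1f} runs")
```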
   129. Mister High Standards Posted: January 05, 2006 at 04:03 PM (#1807520)
I didn't respond to your point because 1) you brought my position to a ridiculous extreme, and 2) you have no interest in changing your opinion on this specific matter, so I have no reason to discuss it with you. We have discussed it in the past and you've made a point of ridiculing it.
   130. Chris Dial Posted: January 05, 2006 at 04:24 PM (#1807550)
A run in a 3-3 game in the 9th leads to more wins than a single in 7-2 game in 8th inning.

The problem I have with this is that it clearly favors #3 hitters on teams that play close games (and win some of them).

Because a grand slam that takes a game from 4-2 to 8-2 in the 4th changes the other team's playing. They remove the starter and it just goes downhill from there.

If a batter strikes out here, it doesn't do anything. So when the pitcher gives up a 2-run HR in the 5th to make it 4-4, now you suddenly think a single in the 9th to score a run is more valuable than that GS.

That's just wrong. It is NOT value.
   131. GuyM Posted: January 05, 2006 at 04:47 PM (#1807588)
Fair enough, but PAs will be outs - what - 67% of the time? What's the variance on BA? 40 points? It's not identical, but it is a reasonable sample size. It's analogous.

It's only analogous if you believe that in some large percentage of ABs the hitter effectively has no chance at all of reaching base, and/or that in many ABs ANY major league hitter would reach base. I don't see any reason to think that's true.

Do you really not understand this point, or are you just being difficult?
   132. studes Posted: January 05, 2006 at 04:59 PM (#1807609)
Not nearly as silly as saying a typical single, in a typical game, at a typical time, playing for a typical team, in a typical lineup slot is worth .47, so we will treat every single the same in every inning, for every team, for every lineup slot, for every base-out situation, etc.

This reminds me of something a college prof once said (that I haven't forgotten in the 30 years since): "Every idea carried to its logical extreme becomes a caricature of itself." IMO, that's what you're doing using Game States to solely determine your MVP.

I don't mean to sound like Chris Dial (God forbid!), but have you scored many games using WPA? I have, and I've found that the results sometimes just don't make sense when interpreted as "value". Larry's example of a home run in the first vs. a sacrifice fly in the ninth is a great example.

WPA is awesome for following a game, it's great for analyzing relievers and their usage, and it's useful for MVP discussions. But it's not sine qua non.
   133. Chris Dial Posted: January 05, 2006 at 05:01 PM (#1807612)
in many ABs ANY major league hitter would reach base. I don't see any reason to think that's true.

If there weren't some portion of ABs where ANY ML hitter would reach base, why is the BA spread so small? The baseline to be a MLB player is very high.

I understand the point - thanks.

I just think that the sample sizes are reasonably analogous.
   134. Mister High Standards Posted: January 05, 2006 at 05:08 PM (#1807632)

The problem I have with this is that it clearly favors #3 hitters on teams that play close games (and win some of them).


That's the position they have been put in. There is no rule that life is fair or equitable. Some people get paid much larger bonuses for doing less quality work, because firms are more profitable.

Chris, thankfully you understand defense a little more than the concept of value. Of course you're wrong about the accuracy of the specific numbers involved, but you're still in the know. Maybe one of these days I'll get through, but I won't hold my breath.
   135. Mister High Standards Posted: January 05, 2006 at 05:14 PM (#1807644)
solely determine your MVP.


It doesn't solely determine who should be the MVP, in my opinion. This statement is doing exactly what you characterized me as doing previously: "Every idea carried to its logical extreme becomes a caricature of itself." (btw, that's a very nice quote)
   136. Mister High Standards Posted: January 05, 2006 at 05:16 PM (#1807649)
btw: I think using any ONE thing as the sole determination for the MVP is silly. Including Dial player ratings.
   137. Chris Dial Posted: January 05, 2006 at 05:17 PM (#1807653)
MHS has said he uses Game State and then sprinkles with defense, clubhouse and position.
   138. Chris Dial Posted: January 05, 2006 at 05:21 PM (#1807660)
My ratings aren't using *one thing*. They use a combination of things.
   139. studes Posted: January 05, 2006 at 05:25 PM (#1807669)
btw: I think using any ONE thing as the sole determination for the MVP is silly. Including Dial player ratings.

MHS has said he uses Game State and then sprinkles with defense, clubhouse and position.


That's helpful. Thanks. I'm obviously coming in late to a discussion, and I interpreted vehemence as intractability. Personally, I would put Dial-type ratings first, then sprinkle with WPA. Win Shares does that in a weak-form kind of way.
   140. DSG Posted: January 05, 2006 at 05:27 PM (#1807670)
The crux of the problem with #BIP is that, as GuyM and Rally state above, many balls are unfieldable.

ZR doesn't count those balls against the fielders. UZR does count ground balls like that.


If a ball is "unfieldable", it will NEVER be fielded, and it won't count against a player in UZR. If a ball is fielded 5% of the time, it will count as -.05 plays against someone in UZR. And it's mostly those plays that distinguish the great players. By discluding those plays, ZR converted to runs ends up with tiny variance, and misstates the value of a player.

Does that make a significant impact? Demonstrate that. I don't believe it does *from the view that it unfairly impacts some fielders over others*. That is, a fielder gets a wildly disproportionate number of difficult to field balls such that it distorts his skill as a fielder.

Yes, it does. Re-read MGL's original UZR articles. Or, look at how much the results change when batted ball speed is included. Or, re-read the Ichiro! thread on an article I wrote about his decline in performance in '05 where Mitchel demonstrated that a large part of Ichiro!'s lower BA on GB was slower hit speed.

No, I don't. Those add nothing to ZR. Unless you have some brilliant DP tweak. And when I say nothing I mean that Range adds nothing. Well, it adds confusion and murkiness, but it's not remotely as sophisticated as ZR. Range is like using BA instead of OPS+.

No. Incorrect. Range is like using OBP instead of SLG (or vice-versa, I don't really care). Neither ZR nor range capture the whole truth, but together (OPS) they can get 90% of the way there (or whatever % you choose to use).
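
The bookkeeping being argued over above (a ball fielded 5% of the time costs .05 plays when it gets through, an unfieldable ball costs nothing) can be sketched in a few lines. This is only a simplified plus/minus-style illustration, not MGL's actual UZR code; the function name `play_credit` and the sample catch probabilities are made up for the example.

```python
def play_credit(catch_prob: float, fielded: bool) -> float:
    """Plays above/below average on one batted ball.

    catch_prob is the (hypothetical) share of similar balls that
    fielders at the position turn into outs.
    """
    if not 0.0 <= catch_prob <= 1.0:
        raise ValueError("catch_prob must be between 0 and 1")
    # A made play earns the part an average fielder would not have made;
    # a missed play costs the part an average fielder would have made.
    return (1.0 - catch_prob) if fielded else -catch_prob


# (catch probability, was it fielded?) -- illustrative values only
balls = [
    (0.05, False),  # very tough ball, missed: costs .05 plays
    (0.05, True),   # same ball, converted: earns .95 plays
    (0.00, False),  # truly unfieldable: costs nothing
    (0.90, True),   # routine play, made: earns .10 plays
]

total = sum(play_credit(p, made) for p, made in balls)
print(f"net plays above average: {total:+.2f}")
```

Under this kind of crediting the hard-but-possible balls are where a great fielder piles up his margin, which is the point DSG is making about what a 50% zone threshold leaves out.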
   141. Mister High Standards Posted: January 05, 2006 at 05:39 PM (#1807686)
My ratings aren't using *one thing*. They use a combination of things.


Chris what exactly isn't a combination of things?

MHS has said he uses Game State and then sprinkles with defense, clubhouse and position.


In other threads I've also said I use other things. Importance to team. Leverage of the team's need for wins (for example, contributing about 10 wins to a 95 win team as opposed to contributing 10 wins to an 80 win team). Traditional stats. Adjusted stats. In the A-rod/Ortiz debate many of these factors aren't important, which is why they weren't mentioned.
   142. mommy Posted: January 05, 2006 at 06:48 PM (#1807806)
"The only leap of faith required is to believe that it's similarly accurate at the individual level."

"But even there, it isn't all that hard to add up the RC's for every member of a team, and see how close they come to the team's actual runs scored."

that doesn't prove RC is accurate for each individual. Bonds's RC are overestimated because all his BB and HR cannot interact w/ each other.
   143. Mike Emeigh Posted: January 05, 2006 at 07:03 PM (#1807832)
The problem with using a pure value-added method (by itself) is that it treats game state events as independent when they really are not independent. If Johnny Damon leads off a game with a single, David Ortiz gets to bat in the bottom of the ninth with two outs and a runner on first base, giving him the chance to hit a two-run homer to win the game. GS methods give Ortiz a ton of credit for that HR, but without Damon's 1st-inning leadoff hit (and everything else that went before), he'd get no credit at all.

That said - I think it is appropriate, when given a close decision, to factor in the extent to which player performance was leveraged into actual wins and losses, although the amount of weight to give it is a different and IMO difficult question. Chris Jaffe noted in one of the various Blyleven threads that when he factored in Blyleven's run support Blyleven still comes up about 9 wins short of expectations (IIRC), and to the extent that it's true that Blyleven's teams won less often than expected when he was pitching - which was the perception when he was active - even in the face of his actual run support (and bullpen support, which I don't know if Chris considered), that would be a legitimate argument IMO against Blyleven's HOF candidacy.

-- MWE
   144. Chris Dial Posted: January 05, 2006 at 07:11 PM (#1807852)
That said - I think it is appropriate, when given a close decision, to factor in the extent to which player performance was leveraged into actual wins and losses

this is what I think.
   145. Mike Emeigh Posted: January 05, 2006 at 07:16 PM (#1807861)
As far as the zone issue goes:

The group of zones used to make up a fielder's area of responsibility are zones where players at that position regularly (>50%) convert the plays into outs.


50% is too high a threshold, IMO, and as David Gassko suggests in #140, I think it winds up discriminating against the truly great fielders. However, if ZR would give players credit for a play made OOZ without charging the player with a zone opportunity, that would go a long way toward fixing the problem.

-- MWE
   146. Mister High Standards Posted: January 05, 2006 at 07:16 PM (#1807862)
The problem with using a pure "vacuum" approach is that if a player hits a single and the next 3 guys fly out, then the only value the single provided was turning the lineup over.
   147. Chris Dial Posted: January 05, 2006 at 07:18 PM (#1807868)
If a ball is "unfieldable", it will NEVER be fielded, and it won't count against a player in UZR. If a ball is fielded 5% of the time, it will count as -.05 plays against someone in UZR. And it's mostly those plays that distinguish the great players. By discluding those plays, ZR converted to runs ends up with tiny variance, and misstates the value of a player.

Derf.

Every groundball in UZR counts against someone, even if it is unfieldable. ZR doesn't "disclude" them. It just counts them as plusses.

Yes, it does. Re-read MGL's original UZR articles. Or, look at how much the results change when batted ball speed is included.

BUT HE'S USING THE WRONG ZONES! His baseline is off - if the hard hit balls are in the 56 zones, it won't impact it. Or up the middle.

Or, re-read the Ichiro! thread on an article I wrote about his decline in performance in '05 where Mitchel demonstrated that a large part of Ichiro!'s lower BA on GB was slower hit speed

Did you read the tail end of those comments? That data was analyzed all wrong. He was hitting the ball *further to the left* - that is - not up the middle, but to the shortstop.

And that doesn't address what I said anyway.

Range is like using OBP instead of SLG (or vice-versa, I don't really care). Neither ZR nor range capture the whole truth, but together (OPS) they can get 90% of the way there (or whatever % you choose to use).

I'll refrain from explaining the value of your method to you.
   148. Mike Emeigh Posted: January 05, 2006 at 07:22 PM (#1807877)
The problem with using a pure "vacuum" approach is that if a player hits a single and the next 3 guys fly out, then the only value the single provided was turning the lineup over.


Sure. But there's value in turning the lineup over, and part of the linear weight reflects that value. The game-state approach undervalues that performance when compared to a linear weights approach; the linear weights approach understates performance in high-leverage situations when compared to a game-state approach. In the end state, the difference between the two isn't usually very significant; it happened to be significant this year because a good percentage of ARod's value was concentrated in low-leverage situations and a good percentage of Ortiz's value was concentrated in high-leverage situations.

-- MWE
   149. Steve Treder Posted: January 05, 2006 at 07:25 PM (#1807884)
that doesn't prove RC is accurate for each individual. Bonds's RC are overestimated because all his BB and HR cannot interact w/ each other.

Sure, but Bonds is an extreme outlier in event distribution. Given that RC does closely approximate runs scored at the team level, it is very appropriate to conclude that the RC scores for individual players with event distributions in the normal range do accurately reflect their total of runs "created."

The reasonable test of any metric isn't how accurately it captures an extreme outlier, or the hypothetical "suppose a guy went 1-for-1,000" cases. It's how accurately it captures the great majority of normally witnessed cases.
   150. Mister High Standards Posted: January 05, 2006 at 07:37 PM (#1807915)
Mike - that's exactly correct... which is why you need to factor in both.
   151. Kyle S Posted: January 05, 2006 at 07:39 PM (#1807920)
Do WPA type stats take opportunity into account? I know Ortiz had obscene rate stats in close and late situations, etc etc, but he probably had lots of opportunities in those situations too. Is it A-Rod's (or whoever's) fault that he didn't get as many chances to prove himself as Papi did? Matt, maybe you can answer since you use GSW/WPA stats so much in your MVP consideration.

---

50% is too high a threshold, IMO, and as David Gassko suggests in #140, I think it winds up discriminating against the truly great fielders. However, if ZR would give players credit for a play made OOZ without charging the player with a zone opportunity, that would go a long way toward fixing the problem.

Mike - let me make sure I'm at the right point in the discussion. Right now,

ZR = ( BIZ_fielded + BOZ_fielded) / (BIZ_opps + BOZ_fielded)

Is that correct? Leaving a theoretical maximum of 1.00 ZR. I think your formula would be (if I understand correctly):

ZR_emeigh = (BIZ_fielded + BOZ_fielded) / BIZ_opps

As I see it, this makes sense. Take two players: one converts 3 out of 4 BIZ plays and 1 out of 1 BOZ play; the second converts 4 out of 4 BIZ plays and 0 out of 1 BOZ play. The first will have a 4/5 = .80 ZR, the second will have a 4/4 = 1.00 ZR, even though (assuming the balls hit to them have equivalent run value) they both provide equal fielding value.

Right?
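
For anyone who wants to check the arithmetic, the two formulas above can be run side by side. This is just a sketch of the formulas as written in this post; the function and argument names are mine, and the two players are the hypothetical ones from the example.

```python
def zone_rating(biz_fielded: int, biz_opps: int, boz_fielded: int) -> float:
    """Current ZR: an out-of-zone play counts as both an out and a chance."""
    return (biz_fielded + boz_fielded) / (biz_opps + boz_fielded)


def zone_rating_emeigh(biz_fielded: int, biz_opps: int, boz_fielded: int) -> float:
    """Proposed ZR: an out-of-zone play counts as an out only, so it can top 1.00."""
    return (biz_fielded + boz_fielded) / biz_opps


# Player 1: 3 of 4 in zone, 1 of 1 out of zone.
# Player 2: 4 of 4 in zone, 0 of 1 out of zone.
players = {"player 1": (3, 4, 1), "player 2": (4, 4, 0)}

for name, (biz_f, biz_o, boz_f) in players.items():
    current = zone_rating(biz_f, biz_o, boz_f)
    proposed = zone_rating_emeigh(biz_f, biz_o, boz_f)
    print(f"{name}: ZR = {current:.2f}, ZR_emeigh = {proposed:.2f}")
```

Both players made four plays on five balls, and only the second formula rates them the same.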
   152. GuyM Posted: January 05, 2006 at 07:44 PM (#1807928)
I think it is appropriate, when given a close decision, to factor in the extent to which player performance was leveraged into actual wins and losses,

But the value-added method doesn't deal in real wins and losses, only probable wins and losses. A player gets credit for nearly a run when he hits a triple, because he "should" score, even if he ends up stranded. A player gets credit for a go-ahead HR in the 8th, even if his team loses in the 9th. Consequently, the value-added approach is neither fish nor fowl -- it's less useful than traditional sabermetric methods for assessing true talent (because context is so powerful), but probably no more accurate than Runs or RBI in assessing "true game value." Does value-added tell us more about Ortiz' clutch performance than RBI or BARISP? It's not at all clear that it does.

I think it's great for understanding leverage and reliever impact (as Studes suggests), but I'm skeptical about other applications.
   153. Chris Dial Posted: January 05, 2006 at 07:44 PM (#1807930)
However, if ZR would give players credit for a play made OOZ without charging the player with a zone opportunity, that would go a long way toward fixing the problem.

Correct.

this is what I say in my "What is Zone Rating?" article (on your side bar):
When a player fields a ball outside his zone and turns it into an out, it is counted as both an out and a ‘ball in zone’ for the purposes of calculating his zone rating. This is a flaw in ZR, as balls outside the zone should be counted only as an out. Otherwise it takes away a “range” aspect of the rating.
   154. Mister High Standards Posted: January 05, 2006 at 07:54 PM (#1807953)
Matt, maybe you can answer since you use GSW/WPA stats so much in your MVP consideration.


It's not really all that relevant. You know the players were full time players, so they had the same basic opportunity to contribute. It's what the bottom line of their contribution was. The only time I'm concerned with it is if their playing time was equal or close. If one player had more high leverage opportunities I don't really care. Like I said, it's not about being fair, it's about who contributed more to his team's wins and losses. Some players will get more opportunity than others. Just like some people in life have more opportunities than others, it only matters what you make of your opportunities, not how many you have.
   155. Mister High Standards Posted: January 05, 2006 at 07:57 PM (#1807957)
You can't have it both ways. You can't say a pitcher's leverage is important while a hitter's leverage isn't. Either leverage matters or it doesn't.
   156. Mike Emeigh Posted: January 05, 2006 at 07:57 PM (#1807958)
Take two players: one converts 3 out of 4 BIZ plays and 1 out of 1 BOZ play; the second converts 4 out of 4 BIZ plays and 0 out of 1 BOZ play. The first will have a 4/5 = .80 ZR, the second will have a 4/4 = 1.00 ZR, even though (assuming the balls hit to them have equivalent run value) they both provide equal fielding value.

Right?


Right.

The OOZ plays may not have equivalent run value to the in-zone plays, though. If teams are positioning their players appropriately, the OOZ plays should be worth less in run value than the in-zone plays (on a net basis). In the infield, it's almost certainly true that the OOZ plays are worth less than the in-zone play; most of the OOZ plays are singles, while a good percentage of in-zone plays on the corners would go for extra bases if they weren't made.

-- MWE
   157. Chris Dial Posted: January 05, 2006 at 08:22 PM (#1808011)
Yes, and we need to determine what % of plays are OOZ plays to see how big of an effect this is (it *could* be negligible).
   158. Chris Dial Posted: January 05, 2006 at 08:35 PM (#1808043)
I don't think it'll be negligible in the sense of near "0", but the effect may be nearly universal, so the impact isn't very much.
   159. GuyM Posted: January 05, 2006 at 08:41 PM (#1808058)
A single with 2 outs and a runner on second leads to more runs than a single with 2 outs. A run in a 3-3 game in the 9th leads to more wins than a single in a 7-2 game in the 8th inning.

Distinguishing value from ability is fine. But a run in the 9th of a 3-3 game does NOT lead to more wins than a run in the 2nd inning of that same game. Yet value-added metrics say it does.

Moreover, the value-added metric doesn't even care if your 2-out single actually plated the runner on second -- it just assumes it did (most of the time). If you really want to measure how many wins a player actually contributed, shouldn't you care about actual outcomes, rather than probabilities?
   160. Mister High Standards Posted: January 05, 2006 at 09:05 PM (#1808091)
We more or less already have those in RBI, GWRBI, runs scored, etc.
   161. GuyM Posted: January 05, 2006 at 09:08 PM (#1808094)
We more or less already have those in RBI, GWRBI, runs scored, etc.

Right. So what does value-added give us?
   162. Mister High Standards Posted: January 05, 2006 at 09:10 PM (#1808099)

Distinguishing value from ability is fine. But a run in the 9th of a 3-3 game does NOT lead to more wins than a run in the 2nd inning of that same game. Yet value-added metrics say it does.


So in a tie game a manager should just bring in his best reliever no matter what the inning?

The problem is you don't know in the 2nd inning how important that run will be. You know how important it is in the 9th. Because it ends the game.

The idea is to capture the changes in game state a player is contributing. It measures something completely different from other metrics. I am not advocating that it's the only metric that should be considered, but it is far more important than just a tie-breaker type element. It's just as important as runs created or OPS, somewhat more important in my opinion, not hugely so but more important. The reason it seems like I value it as much is because in the A-rod/Ortiz debate it's one of the few major differences between the players.
   163. studes Posted: January 05, 2006 at 10:19 PM (#1808233)
You can't have it both ways. You can't say a pitcher's leverage is important while a hitter's leverage isn't. Either leverage matters or it doesn't.

That's actually a good example. Relievers are brought in and out of games all the time, so their leverage changes as a result of managerial decisions. That's not true with batters. There's a difference. WPA does, IMO, distort the value of the reliever vs. the starter. So I would only use it to contrast and compare the contribution of relievers and not relievers vs. starters. To me, that's a great example of the limits of its usefulness.

It would be a great tool for assessing pinch hitters, BTW.
   164. GuyM Posted: January 05, 2006 at 10:24 PM (#1808242)
The problem is you don't know in the 2nd inning how important that run will be.

No, but you DO know by the time you're casting MVP awards. So why give more credit now to the guy who contributed in the 9th? Value-added mixes up two very different things: 1) hitting well w/ men on base, and 2) hitting well close and late. I can see why you may want to do #1 (though RBI may do it as well and far more simply), but why should #2 matter once games are over?

Also, doesn't this approach give credit only to "clutch" RBI guys, but not to "clutch" OB guys? Under this philosophy, you should give greater credit to hitters who have the "ability" to get on base before a hit than those who don't. But a lead-off single is worth the same, whether or not the hitter eventually scores. VA gives bonus credit to the hitter who drives him in for hitting at the right time, but not to the first hitter who got on base at the right time. Why credit one but not the other?
   165. Mister High Standards Posted: January 05, 2006 at 10:30 PM (#1808258)
Relievers are brought in and out of games all the time, so their leverage changes as a result of managerial decisions. That's not true with batters.

Yes - but managerial control is out of the hands of that pitcher, just like what the baseout situation is for the hitter. They just have to perform within the bounds of the situation they are put in, and they get credit for it based on how important those situations are.
   166. WalkOffIBB Posted: January 05, 2006 at 10:34 PM (#1808263)
The problem is you don't know in the 2nd inning how important that run will be. You know how important it is in the 9th. Because it ends the game.
Seems to me the value of the run has not changed, but rather the perception of the value of the run.
   167. Mister High Standards Posted: January 05, 2006 at 10:41 PM (#1808278)
GuyM - I'm not really understanding your point. RBI's don't measure other important factors which is why it isn't as valuable as game state wins. Going first to third, hitting a single which moves a guy into scoring position allowing for another man to potentially have a game winning hit. All of those things have value.

You don't know why you would want to know who hit well when the game was close and late? Because those hits are high leverage. And as I explained to Studes, leverage matters for hitters and pitchers.

This approach also gives credit to "clutch OB" guys. I strongly suspect you're missing something in how the system works if you think otherwise.
   168. studes Posted: January 05, 2006 at 10:50 PM (#1808295)
Relievers are brought in and out of games all the time, so their leverage changes as a result of managerial decisions. That's not true with batters.

Right. But that wasn't really my point (though I'll admit I didn't express it well). WPA is useful to assess the value of relievers AS WELL AS the usage patterns of managers. Because both are critical to the optimal use of the bullpen, it's a great tool to look at both. "Value" in that sense is explicitly a function of both.

But the more germane point is that I believe it's not useful to compare relievers and starters. Starters often provide a lot of great innings that keep ballclubs in games but, because they don't pitch say, the final inning of a close game, a reliever will often have more WPA points for pitching one shutout inning vs. a starter pitching, say, 8 innings of 3-run ball. In that case, I don't believe the reliever was truly more "valuable" than the starter. But, due to the leverage of late innings, WPA says he was.

And I think this same argument is why WPA shouldn't be a primary consideration for MVP. Batters are, for all intents and purposes, "starters" whose contributions, if they happen to fall in late innings of close games, will be overvalued vs. those who contributed earlier in games.
   169. Kyle S Posted: January 05, 2006 at 10:57 PM (#1808308)
Guy's point is that after all the games are over, you can look back at a second inning GWRBI and say, there, that was the crucial RBI. That did it. It provided as much value as an RBI in the 9th would have. Given that the team won, the RBIs were equally "valuable" - the only difference is that one had more certainty of value at the time.

---

Here's the "Clutch" OBP example:

Tie game, 0 outs, top of the 9th: P(Home win) = .500
Batter 1 doubles: P(Home win) = .328
Batter 2 singles (runner scores): P(Home win) = .133

(using tango's WE matrix from a while back I have on my computer)

If you assign credit based on delta P(home win), the double was worth .172 wins, while the single was worth .195 wins. Doesn't that strike you as silly?

Moreover, if the same thing happens in the 2nd inning of a game that ended 1-0, the numbers are much smaller.
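
Here is the same credit assignment as a quick sketch, in case anyone wants to plug in other sequences. The three win probabilities are the ones quoted above from the WE matrix; the variable names are only for illustration, and credit goes to the batting (road) team as the drop in the home team's win probability.

```python
# Home-team win probability before the inning and after each event,
# taken from the example above (tie game, top of the 9th, 0 outs).
home_win_prob = [0.500, 0.328, 0.133]
events = ["Batter 1 doubles", "Batter 2 singles (runner scores)"]

# Each road batter is credited with how much he lowered P(home win).
credits = [
    round(before - after, 3)
    for before, after in zip(home_win_prob, home_win_prob[1:])
]

for event, credit in zip(events, credits):
    print(f"{event}: {credit:.3f} wins")
```

The credits always sum to the total swing over the sequence (.367 here), so what the method really decides is how that swing gets split between the two hitters.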
   170. mommy Posted: January 05, 2006 at 11:10 PM (#1808333)
"Bonds is an extreme outlier in event distribution. Given that RC does closely approximate runs scored at the team level, it is very appropriate to conclude that the RC scores for individual players with event distributions in the normal range do accurately reflect their total of runs "created." "

I'm not sure it is appropriate to immediately conclude that. As it turns out, it is generally correct. But i can see where one would find it a "leap of faith" to believe that just because RC works on the aggregate team level, it must therefore be as accurate for each individual player. It estimates runs based on getting on base and moving runners around. Well, no player gets on base and then bats himself around. I've never studied it, but this may not affect only Bondsian outliers. Maybe a Roy Thomas/Richie Ashburn type is consistently misvalued by basic RC. Maybe a Tony Batista/Joe Carter type is as well. these players may not be "normal," but they are not so rare that we shouldn't expect I'm sure their RC estimates are good enough for government work, but perhaps some of the more advanced runs-created estimates handle those types of players better. (or perhaps they even handle all individual players better, even if they are no more accurate at the team level).
   171. Mister High Standards Posted: January 05, 2006 at 11:15 PM (#1808342)
Guy's point is that after all the games are over, you can look back at a second inning GWRBI and say, there, that was the crucial RBI. That did it. It provided as much value as an RBI in the 9th would have. Given that the team won, the RBIs were equally "valuable" - the only difference is that one had more certainty of value at the time.


Certainty has value. However, I agree that would be a useful stat to have as well, not in RBI form but wins added form... as soon as someone provides it, I will use it. I don't have the acumen or data to do it myself. Then I could even lower the dependency on standard metrics.

As for your clutch OBP example, that seems reasonable to me. They were both very important to winning the game. I guess you object to the double being more valuable than a single, but I see no problem with that, it is what it is and one way or the other it is reasonable. Much more reasonable than saying that double was worth .74 wins and the single was worth .47 wins.



And I think this same argument is why WPA shouldn't be a primary consideration for MVP. Batters are, for all intents and purposes, "starters" whose contributions, if they happen to fall in late innings of close games, will be overvalued vs. those who contributed earlier in games.


While overvalued/undervalued is relative, it does a closer job of approximating than using static weights does, at least in terms of change in win contribution. Though I certainly agree you need to use both or sometimes you'll get wrong answers.
   172. Mister High Standards Posted: January 05, 2006 at 11:18 PM (#1808350)
mommy - baseruns does a better job of it at the extremes.
   173. GuyM Posted: January 05, 2006 at 11:18 PM (#1808351)
This approach also gives credit to "clutch OB" guys. I strongly suspect you're missing something in how the system works if you think otherwise.

It gives credit in one sense: if you get on base close and late. But it's indifferent to the performance of later hitters -- i.e. whether you actually score! But the system does reward hitters for doing well with men on base.

We don't think about "clutch" OB performance for the obvious reason that a hitter can't know what the guys following him in the lineup are going to do. But in value terms, there's no logical difference. (And there's good reason to doubt clutch RBI performance is a skill for most players, but that's another debate.)
   174. Spivey Posted: January 05, 2006 at 11:33 PM (#1808374)
I didn't respond to your point because you 1) brought my position to a ridiculous extreme.

How so? Seems like you use some combination of season stats (like LWTS), game situation adjustments, and how well the team finishes without explaining why you use these 3 things and why you weight them like you do. You're not required to explain it, but given how you've acted in the AL MVP threads, I assumed you would explain it.
   175. GuyM Posted: January 05, 2006 at 11:33 PM (#1808379)
Going back to the fielding sample issue: Even if Chris is right that BIP sample size is comparable to that for hitters/pitchers (I'm not sure), I think we all agree that the measurement error is much greater relative to the variance in true talent. At a 90% confidence level and 450 BIP, the MOE for estimating true talent based on one season is about +/-11 runs (and that assumes perfect accuracy in measuring performance, which we don't have). A 22-run swing is pretty big, given the range of ratings. That means a player with a zero (average) rating could easily be very good or quite bad. So I wonder:

1) Can we only use defensive metrics to ID really exceptional and awful fielders, or can they also make finer distinctions?

2) How many years of data do we need to judge a player's true talent?

3) What's the y-t-y correlation of Dial ratings and other metrics? And is this one way to judge the competing value of different metrics?
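
For a rough sense of where a figure like +/-11 runs can come from, here is a back-of-envelope binomial sketch. The inputs behind the number aren't stated in the post, so everything here beyond the 450 BIP and the 90% level is an assumption for illustration: an 80% conversion rate on those balls and 0.8 runs per play. With those guesses the result happens to land near 11.

```python
from math import sqrt
from statistics import NormalDist

bip = 450              # balls in play in one season (from the post)
confidence = 0.90      # two-sided confidence level (from the post)
conversion_rate = 0.80 # ASSUMED share of those balls turned into outs
runs_per_play = 0.8    # ASSUMED run value of one play made or missed

# z-score for a two-sided interval: about 1.645 at 90%
z = NormalDist().inv_cdf(0.5 + confidence / 2)

# Binomial standard deviation of plays made over the season
sd_plays = sqrt(bip * conversion_rate * (1 - conversion_rate))

moe_plays = z * sd_plays
moe_runs = moe_plays * runs_per_play

print(f"margin of error: +/-{moe_plays:.1f} plays, +/-{moe_runs:.1f} runs")
```

This only covers sampling noise in one season of chances; any scoring or measurement error in the underlying data would widen the band further.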
   176. studes Posted: January 05, 2006 at 11:37 PM (#1808386)
While overvalued/undervalued is relative, it does a closer job of approximating than using static weights does, at least in terms of change in win contribution.

Heh. Overvalued and undervalued are relative. That's the whole point. And that, of course, is the seduction of WPA. It isn't subjective; it's the result of real math. That doesn't mean it isn't distorted, however, particularly in terms of how it handles events in the ninth inning vs. earlier innings.

I've stated what my objections are. As much as I use and enjoy WPA, I don't endorse it as a primary tool for MVP determination, at least in its present state.
   177. DCA Posted: January 05, 2006 at 11:50 PM (#1808402)
I'm fully onboard with GuyM, and this is where I disagree with MHS on using game state wins or the like. I would say that every run scored by a team in a given game is of equal value. They are 100% interchangeable, like currency. When you buy a $1.50 cup of coffee with six quarters, you don't ask which piece of change bought you the most drink, even if you had to dig through your pockets furiously to find that last "clutch" quarter to complete the purchase.

If the same cup of coffee costs $2.50 across the street, that's different, these quarters are clearly buying less -- but it's a different game. And when coffee is better or worse across the street, that makes it interesting. But within a game, all runs are of equal value.

I've proposed, but never done since I don't have any PBP data or the time/inclination to exactify the procedure, that the best way to compute value* -- at least the offensive part -- is to replace all a batter's PA with a random event generator that is park and pitcher adjusted, restart all innings at the same place, don't introduce substitutions, sim 100 times, and define value as actual team wins - mean simmed team wins. Similar fielding and pitching metrics can be defined, but might be a bit harder.

*this is value defined as "worth to specific team". I haven't decided if this is better than "worth to any team" -- wherein the player could be "added" to every other team in the league, moving around other players on that team and getting appropriate # of PA (min 0, max actual PA) and then computing delta team runs using some RC/XR type formula and some Defensive Runs formula. Add all these up, including the "worth to specific team" value that you got above, or just a RCAA + FRAA value, and take the average. That's the other metric I'm mulling over -- it is context independent, but it has to be since you can't fairly get the context of a hypothetical team.
   178. Chris Dial Posted: January 06, 2006 at 01:45 AM (#1808526)
GuyM
I like your obs. I suspect the accuracy is a little better than that wrt performance (not true talent).

I'll see what I can generate for y-t-y data, as well as comparing to a team value (although I don't think that'll work).

And I think about 3 years to judge a player's ability - but not his value.
   179. Steve Treder Posted: January 06, 2006 at 01:59 AM (#1808540)
But i can see where one would find it a "leap of faith" to believe that just because RC works on the aggregate team level, it must therefore be as accurate for each individual player.

I would consider it far more of a little hop than a leap. Maybe a nudge. Your basic commonsense deduction, actually.

Team offense is nothing more nor less than the aggregation of its players, occurring in sequence. Unusual players (and they get no more unusual than Bonds), and teams with unusually good or bad RISP performance, will elude RC to some extent, of course. But for the practical purposes generally required, if it works well at the team level, it's a logical deduction that it's working well at the individual level.
   180. mommy Posted: January 06, 2006 at 02:40 AM (#1808589)
"I would consider it far more of a little hop than a leap."

well i was just quoting Boots Day's terminology. for the record i never really questioned using RC for individuals and i have no problem w/ doing so. just playing devil's advocate a bit, as i can see how one trying to think about it analytically might pause at assuming the calculation works for the individual as well as it does for the team.
   181. GuyM Posted: January 06, 2006 at 03:06 AM (#1808610)
As for your clutch OBP example, that seems reasonable to me. They were both very important to winning the game. I guess you object to the double being more valuable than a single, but I see no problem with that, it is what it is and one way or the other it is reasonable. Much more reasonable than saying that double was worth .74 wins and the single was worth .47 wins.

In Kyle S's example, the double was worth .172 wins and the single was worth .195 wins. Batter #1's double will still be worth .172 "wins" even if the next 3 hitters strike out. The double that preceded a single is far more valuable, but both are assigned the same VA value.

Now let's say batter 1 strikes out, but batter #2 again singles. Now his single is worth just .063 wins, 1/3 as much. The "value" of batter #2's single is hugely impacted by the success/failure of his teammate, but the value of #1's double is not affected by his teammate's performance. Does that make sense?
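
A quick check of the arithmetic in this example, using only the three win values quoted above; the variable names are illustrative.

```python
# Win values quoted in the example above.
double_value = 0.172            # batter #1's double, whatever happens next
single_after_double = 0.195     # batter #2's single following the double
single_after_strikeout = 0.063  # the same single following a strikeout

ratio = single_after_strikeout / single_after_double
print(f"single after a strikeout is worth {ratio:.2f} of the single after a double")
```

The ratio comes out a little under one third, which matches the "1/3 as much" figure, while the double's .172 is the same in both scenarios.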
   182. Mister High Standards Posted: January 06, 2006 at 01:41 PM (#1808957)
perfect.
   183. GuyM Posted: January 06, 2006 at 02:42 PM (#1808992)
So context matters for RBI guys, but not OB guys? I get the impression that you don't exactly have an open mind on this issue.....
   184. Smitty* Posted: January 06, 2006 at 03:15 PM (#1809013)
My main issue with the use of WPA stats for hitters (and this same argument would apply for starting pitchers, but not relief pitchers) is that the hitters help to create the context, and the ideal situation is to avoid high leverage situations altogether.

A team's goal, going into the game, is not "keep it close and then get a big hit late". The goal is "put runs on the board early, knock out their starting pitcher, and then not let them get back into it". The team wants to create low leverage situations where they are ahead. Obviously, they won't always accomplish that goal, and when they do find themselves in high leverage situations they want to perform well. But from what I can tell, the WPA stats do not give much credit for creating the low leverage (with your team winning) context.

A hitter doing well in the high leverage situations does have an impact on the team's won/loss record, and should be considered. But the stats seem to almost penalize hitters who do their damage early, thus avoiding the high leverage situations altogether.

The same sort of argument can be made for starting pitchers, though they're mostly limited to avoiding low leverage situations where your team is losing. They're still playing a big part in creating the context.

Relief pitchers, on the other hand, are thrown into a context they did not help create. The WPA stats are better tools for them because they are never (or at least very rarely) in a position to create a low leverage situation with their team winning, and therefore won't be penalized by the stats.

As mentioned earlier, pinch hitters could be evaluated similarly.
   185. Mister High Standards Posted: January 06, 2006 at 03:15 PM (#1809014)
Context matters for both. His single is still worth more than an average single.

My mind is open, very few others appear to be however.
   186. Mike Emeigh Posted: January 06, 2006 at 03:18 PM (#1809018)
I've proposed, but never done since I don't have any PBP data or the time/inclination to exactify the procedure, that the best way to compute value* -- at least the offensive part -- is to replace all of a batter's PA with a random event generator that is park and pitcher adjusted, restart all innings at the same place, don't introduce substitutions, sim 100 times, and define value as actual team wins - mean simmed team wins.


You would also have to do some level of situational adjustments, to take into account known variations in batter performance (as a group). Batters (as a group) have a different level of performance with a runner on first than they do with the bases empty, for example - this was something that James noticed when he was doing his rookie study back in the '80s.

-- MWE
   187. studes Posted: January 06, 2006 at 03:46 PM (#1809040)
My main issue with the use of WPA stats for hitters (and this same argument would apply for starting pitchers, but not relief pitchers) is that the hitters help to create the context, and the ideal situation is to avoid high leverage situations altogether.

That's an awesome point. Well said.
   188. greenback Posted: January 06, 2006 at 06:35 PM (#1809262)
With the thread dying out, I've got a question about Base Runs v. Runs Created. RC has a problem on the game-to-game level. Does BR have similar issues?
   189. Damon Rutherford Posted: January 07, 2006 at 12:12 AM (#1809807)
RC has a problem on the game-to-game level. Does BR have similar issues?

Most likely, as the smaller the sample size, the more inaccurate the model is going to be.

My main issue with the use of WPA stats for hitters (and this same argument would apply for starting pitchers, but not relief pitchers) is that the hitters help to create the context, and the ideal situation is to avoid high leverage situations altogether.

I second Studes on this. Excellent comment.

My mind is open, very few others appear to be however.

Most predictable response award goes to MHS.
   190. Spivey Posted: January 07, 2006 at 02:38 AM (#1809937)
Rauseo: Context matters for both. His single is still worth more than an average single.

In a situation where someone hits a triple in a tie game in the bottom of the 9th with 0 outs, and then there are 3 straight strikeouts afterwards...would you value that triple the same as a triple in the same situation with a single that followed? I honestly am curious.

Steve Treder:
I would consider it far more of a little hop than a leap. Maybe a nudge. Your basic commonsense deduction, actually.

Team offense is nothing more nor less than the aggregation of its players, occurring in sequence.


I could be way off base here, but...

I don't know if this is necessarily true. Actual runs correlate perfectly on the team level to team offense productivity (nobody is surprised, I know). RBIs correlate very well. The reason they do not correlate well on the individual level is because what happens before and after the hitter is important. I think you could say that's possibly true with RC. The order in which events occur affects the offensive value of certain contributions. Plus, a walk isn't a walk. A walk is more valuable in certain positions in the lineup than it is in others. I don't think these make huge differences, but I think there is at least some reason to think that just because RC correlates well on a team level doesn't mean it's going to correlate just as well on an individual level.
   191. Los Angeles Waterloo of Black Hawk Posted: January 07, 2006 at 02:45 AM (#1809942)
In a situation where someone hits a triple in a tie game in the bottom of the 9th with 0 outs, and then there are 3 straight strikeouts afterwards...would you value that triple the same as a triple in the same situation with a single that followed? I honestly am curious.

I don't know about Rauseo, but Win Probability methods would give equal rank to those triples.
   192. DSG Posted: January 07, 2006 at 06:18 PM (#1810412)
With the thread dying out, I've got a question about Base Runs v. Runs Created. RC has a problem on the game-to-game level. Does BR have similar issues?

Small sample size obviously means BsR will have SOME issues, but because of the strength of the build (that is, because BsR works at extremes much better), BsR will be MUCH closer than any other run estimator. For example, look at the following table from Tangotiger's Part 3 of his BsR series:

     Runs Scored, breakdown by HR hit

     HRclass     n      R     BsR    LWTS    RC 
         0    33,068   3.08   3.06   3.79   3.03 
         1    23,117   4.62   4.62   4.44   4.66 
         2     9,218   6.12   6.12   5.00   6.41 
         3     2,838   7.65   7.65   5.62   8.37 
         4       687   9.03   9.00   6.07  10.29 
         5       146  10.55  10.49   6.73  12.45 
         6        40  12.33  12.32   7.52  15.35 
         7         9  16.22  14.32   8.34  18.27 
         8         2  14.00  15.87   8.58  22.52 
        10         1  18.00  18.30   9.51  27.03
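
One rough way to summarize the table is a games-weighted mean absolute error of each estimator against actual runs. The sketch below copies the rows above as-is and assumes that n is the number of team-games in each HR class and that the other columns are average runs per game; the function name `weighted_mae` is illustrative.

```python
# Rows copied from the table above:
# (HR class, n, actual R, BsR, LWTS, RC)
rows = [
    (0, 33068, 3.08, 3.06, 3.79, 3.03),
    (1, 23117, 4.62, 4.62, 4.44, 4.66),
    (2, 9218, 6.12, 6.12, 5.00, 6.41),
    (3, 2838, 7.65, 7.65, 5.62, 8.37),
    (4, 687, 9.03, 9.00, 6.07, 10.29),
    (5, 146, 10.55, 10.49, 6.73, 12.45),
    (6, 40, 12.33, 12.32, 7.52, 15.35),
    (7, 9, 16.22, 14.32, 8.34, 18.27),
    (8, 2, 14.00, 15.87, 8.58, 22.52),
    (10, 1, 18.00, 18.30, 9.51, 27.03),
]

def weighted_mae(col):
    """Mean absolute gap between estimator column `col` and actual R,
    weighting each HR class by its number of games (n)."""
    total_games = sum(r[1] for r in rows)
    return sum(r[1] * abs(r[col] - r[2]) for r in rows) / total_games

mae = {"BsR": weighted_mae(3), "LWTS": weighted_mae(4), "RC": weighted_mae(5)}
for name, err in mae.items():
    print(f"{name}: {err:.3f} runs per game")
```

On these rows BsR has the smallest weighted error, RC is next, and LWTS is furthest off, with the RC and LWTS gaps growing as the HR class rises.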
   193. Chris Dial Posted: January 07, 2006 at 06:26 PM (#1810421)
XR falls somewhere in there, closer to BsR too, IIRC.
   194. DSG Posted: January 08, 2006 at 12:27 AM (#1810764)
No, XR will be where LW is. There is very little difference between the two systems, except that LW is theoretically correct, while XR is simply an approximation of LW with some tweaks to improve RMSE over the period Jim Furtado tested.
   195. BoSox Rule Posted: January 08, 2006 at 04:38 AM (#1810944)
I don't believe in giving the MVP to the best player on a contender. The most valuable player contributes the most to his team, whether his team was 100-62 or 62-100. Alex Rodriguez was the right choice for the 2005 American League MVP Award and Pujols wasn't necessarily a bad choice for the NL MVP. He is a great player, and was just behind Lee as the 2nd best player in the NL. It's just that Derrek Lee was undoubtedly better than Pujols on both sides of the ball; even if it was by a tiny margin, you can tell he was.
   196. Buzzards Bay Posted: January 08, 2006 at 05:40 AM (#1811008)
There is no defense...we score outs and we score runs...we call it defense but it's not...two parallel contests...the race to 27 outs and the race to more runs in 9 innings...
   197. Rob Base Posted: January 09, 2006 at 04:40 PM (#1812419)