Monday, July 02, 2001

Situational in Seattle

Are the Mariners scoring more runs than they should be?

The Seattle Mariners have jumped out of the gate, and are solidly in control of the AL West, if not all of baseball.  Despite reduced scoring this year, their offense is clicking; an average of more than 6 Mariners cross the plate each game, 25% more than the league rate!  Is it for real, or is it simply a façade of timely hitting and lucky bounces?

Using sabermetric tools, we can closely examine the workings of the Seattle offense.  Early research by Bill James showed that teams tend to regress towards their predicted runs total, rather than maintain large discrepancies from it.  A prediction of runs scored can be made using eXtrapolated Runs (xRuns).

TEAM           RUNS       BA      OBP      SLG     xRuns
Seattle         460    0.284    0.360    0.447    438

We can see that the Mariners are over-performing by about 22 runs.  That’s not a huge figure, but it is significant: about 2 wins.  How did the Mariners score those 22 runs?  A few years ago, in a drastic departure from normal sabermetric standards, Bill James started using situational statistics in his runs created estimations. 

Although the adjustments for batting with runners in scoring position and home runs with runners on base increased the accuracy of his estimations, they reduced the independent nature that aids analysis and increases predictability.  Just as teams regress towards their predicted runs scored over the course of a season or group of seasons, situational splits tend to regress towards overall performance rather than persist.  A team that is batting .250 overall, but .300 with the bases loaded in June, is more likely to finish the year batting .250 in both situations than to maintain its statistical split.

However, we can use his adjustments to help us see why the Mariners are scoring those extra runs.  The first adjustment is for batting average with runners in scoring position (RISP).  James reasoned that each extra hit in that situation was worth an extra run, thus by comparing regular batting average and batting average with RISP, the extra value of those hits can be counted. 

The Mariners have batted .284 overall this year, and have come to the plate 760 times with RISP.  We would expect them to have 216 hits in that collection of at bats; the Mariners actually have 234 hits with RISP.  James' adjustment explains a large part of the discrepancy by adding 18 runs to Seattle’s total.
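The RISP arithmetic can be sketched in a few lines of Python.  This is only an illustration of the calculation as described in this article, not James' published formula; the function name `risp_bonus_runs` is hypothetical, and the weight of one run per extra hit is the assumption stated above.

```python
def risp_bonus_runs(overall_ba, risp_at_bats, risp_hits, runs_per_extra_hit=1.0):
    """Extra runs credited for hitting better with RISP than overall.

    Assumes, as described in the article, that each hit beyond the
    number expected from the overall batting average is worth one run.
    """
    # Hits we would expect with RISP if the team hit at its overall average.
    expected_hits = round(overall_ba * risp_at_bats)
    return (risp_hits - expected_hits) * runs_per_extra_hit


# Seattle's figures from the article: .284 overall, 234 hits in 760 RISP at bats.
seattle_risp = risp_bonus_runs(0.284, 760, 234)
print(seattle_risp)
```

Rounding the expected hits to a whole number mirrors the 216 used in the text; leaving it unrounded would change the result only by a fraction of a run.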

The second adjustment is for home runs with runners on base.  James reasoned that an extra homer in that situation was worth an extra run, so by comparing home run rates overall with rates with runners on, we can see how many runs that type of situational hitting added.  Seattle has hit 0.03 homers per at bat and batted 1241 times with runners on.  We would expect that they would hit 40 homers in those 1241 at bats.  They have hit 42.  We can add 2 more runs to their situational tally.
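The home run adjustment follows the same pattern, sketched below under the same caveats (the function name `homer_bonus_runs` is hypothetical).  Note that the 0.03 rate quoted above is rounded: the expected 40 homers in 1241 at bats implies a rate closer to 0.032, so the sketch starts from the expected total stated in the text rather than from the rounded rate.

```python
def homer_bonus_runs(expected_homers, actual_homers, runs_per_extra_homer=1.0):
    """Extra runs credited for homering more often with runners on base.

    Assumes, as described in the article, one extra run per extra homer.
    """
    return (actual_homers - expected_homers) * runs_per_extra_homer


at_bats_with_runners_on = 1241
expected_homers = 40  # figure stated in the article
actual_homers = 42

# Rate implied by the article's expected total (the text rounds it to 0.03).
implied_rate = expected_homers / at_bats_with_runners_on

seattle_hr = homer_bonus_runs(expected_homers, actual_homers)
print(round(implied_rate, 3), seattle_hr)
```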

Simply by examining two situational statistics and applying crude adjustments, we can see the mechanism that the Mariners’ offense has used as it overachieves: they have had timely hitting.

Team          Runs    xRuns    Difference    Situational Runs
Seattle        460      438            22                  20

Some would argue that the Mariners possess a talent for situational hitting.  It’s possible, but so is spontaneous human combustion.  As Voros has written before, a lot can happen in a small sample size - and on a team scale, the number of at bats with RISP is a small sample.  I would expect them to regress a bit and drop closer to their predicted xRuns.  It’s important to note that even with the 22-run penalty (or rather, without their 22-run situational bonus) the Mariners would still be 15 runs ahead of second-place Cleveland’s total of 423 runs.
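To put a rough size on that sampling noise, here is a minimal sketch that treats every at bat with RISP as an independent trial with a fixed .284 chance of a hit.  That binomial model is my simplifying assumption, not something taken from James or Voros, and the function name `hits_std_dev` is hypothetical.

```python
import math


def hits_std_dev(batting_average, at_bats):
    """Standard deviation of hits under a simple binomial model.

    Assumes each at bat is an independent trial with the same
    probability of a hit, which is only an approximation.
    """
    return math.sqrt(at_bats * batting_average * (1 - batting_average))


sd = hits_std_dev(0.284, 760)   # Seattle's overall BA and RISP at bats
extra_hits = 234 - 216          # actual minus expected hits with RISP
z = extra_hits / sd             # surplus measured in standard deviations

print(f"sd = {sd:.1f} hits, surplus = {z:.2f} standard deviations")
```

Under this model the standard deviation is roughly 12 hits, so an 18-hit surplus is about one and a half standard deviations: noticeable, but well within what chance alone produces over half a season.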

The Mariners’ offense is for real - but that doesn’t mean they haven’t been lucky!

How long do you expect this to last?

Batting Average       Overall    RISP
Ichiro                   .383    .500    (31 for 62)
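As a rough check on how unusual that split is, the sketch below computes the chance that a hitter whose true average is .383 collects at least 31 hits in 62 at bats.  It assumes independent at bats with a constant hit probability, which is a simplification, and the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(hits, at_bats, true_average):
    """P(X >= hits) for X ~ Binomial(at_bats, true_average)."""
    return sum(
        comb(at_bats, k) * true_average**k * (1 - true_average) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )


# Ichiro's line from the table: 31 for 62 with RISP, .383 overall.
p = prob_at_least(31, 62, 0.383)
print(f"{p:.3f}")
```

The probability is small but far from negligible, and with a few hundred regulars across the league it would be surprising if nobody showed a split this large by chance.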

Situational statistics are valuable as a tool to explain events that have already happened; however, they do not track ability and have little predictive value.  In this case, we can see why the Mariners have performed as well as they have; however, I do not agree with this type of adjustment for estimating runs.

For a method as well publicized as James’ Runs Created, the introduction of situational statistics is silly.  Runs Created is meant to track what should have happened if all the bounces evened out, and thus has more value in predicting what will happen.  If James wants to analyze what did happen and look at the differences from the expected, he should use a more comprehensive analysis of situational statistics such as Value Added Runs.  He has the data, we don’t.  Luckily, we can still analyze the situations using the James methods.

James Fraser Posted: July 02, 2001
Reader Comments and Retorts

   1. Robert Dudek Posted: July 03, 2001 at 12:09 AM (#603966)
Joe is correct,

David, it is relevant, because the 20 runs that James Fraser calculated seems too high because batters as a group have a higher batting average with men on base, precisely for the reasons Joe noted (Joe, I believe you left out a "not" before "pitching well").

Of course the Mariners have done very well with men on base. Perhaps this has accounted for 16 runs and not 20; if so, then there is a shortfall of 6 runs, not 2, to be explained. Then we have to go to baserunning and advancing guys with outs.

There could be another reason the Mariners are batting well with RISP: the lineup is very concentrated, with high OBP men batting in the one slot and two hole, which should lead to more RISP PA for the 3, 4 and 5 hitters, who happen to be Edgar, Olerud and Boone. Additionally, the #3 and #4 guys have very high OBP, which sets up more RBI opps for the number 5 and 6 guys. Then the lineup sort of goes over a cliff.

How many teams have very high OBP men in the first 4 slots? Not very many right now.

I'd be surprised if the Mariners #3,4,5 were not getting a large pct of Team RISP PA.

James, if you could do a recalculation based on the expected BA and homerun rate in RISP situations, we might be able to nail that number down.

   2. Carl Goetz Posted: July 03, 2001 at 12:09 AM (#603968)
I agree with Joe that teams should be expected to hit better with the bases loaded than not. This does not change Fraser's calculations, however. Extrapolated runs are based on overall numbers. Therefore, any adjustments for situational hitting need to be made as adjustments beyond the norm. .250 is the norm. .300 is the situational stat. XRuns are calculated on the .250, so even if we would expect the Mariners to hit .300, the adjustment still should be made against the .250.
Just as an aside; Since we expect that teams hit better with runners in scoring position, shouldn't we also expect them to hit worse with the bases empty? Wouldn't we then need a negative adjustment to compensate for that? Maybe the situational adjustment should be made based on how many more PAs a team had with the bases loaded than an average team had in that situation. Or, maybe based on how many PAs a team has with the bases loaded vs how many they have with the bases empty.
Just some food for thought.
   3. Robert Dudek Posted: July 03, 2001 at 12:09 AM (#603969)

At the risk of offending, I offer the humble opinion that you are dead wrong.

XRUNS is a run estimation tool developed by calculating the average value of each offensive event, based on team and league data from 1955 to 1997 (see Jim Furtado's "Baseball Stuff" for a more complete explanation). If the players in a league, over a long period of time, hit .265 in RISP situations versus .252 in ALL situations, then the XR calculation takes this into account. I.E. the value of a single will reflect what happens on average - the proportion of singles in RISP situations will be whatever the proportion was from 1955 to 1997.

Let's look at this issue another way. If the 2001 AL has a batting average in RISP situations of .265 and a .252 overall BA, then using .252 as a baseline for your situational runs will mean that the sum for the league will be significantly above zero. The situational stats over the long-term should sum to zero or very near zero, because it doesn't make sense to say the league has performed well in the clutch relative to itself (clutch is always relative to a "normal" performance).

Differences between XR projections and actual runs can only arise from one of the following:

1) The team does things that are not measured by XR, well or badly. Examples: advancing runners with outs, advancing extra bases on hits, wild pitches, etc., avoiding or committing excess baserunning errors, reaching base on error disproportionately, etc.

2) Bunching their hits/walks etc., aka clutch performance or lack thereof. If the average team hits 12 points better than its overall BA in RISP situations, then a team will have to exceed that or have a shortfall if it is to show up as a discrepancy between XRUNS and actual runs.

3) An efficient or inefficient lineup - but this will show up in categories 1 and/or 2.

For the record, I believe that all three points apply to the 2001 Mariners' performance so far.
   4. Carl Goetz Posted: July 04, 2001 at 12:09 AM (#603972)
There still needs to be a backside adjustment. If the average team hits 12 points higher than their overall during clutch situations, you need to find out where they are losing the 12 points. If they are batting 12 lower with the bases empty and 2 outs, then we can close the book and say they are just a clutch team. What about a team that hits a lot of extra base hits? I'd say, for that team, a runner on first with nobody out is a pretty clutch situation. My point is that, if you're going to make adjustments to XRuns, you can't just make them for 1 or 2 of the base-out scenarios. There are 24 of them and they all need adjustments.

ps I'm not offended. My first post was a gut reaction and I now realize that you are correct, assuming it's true that the average team hits significantly better with the bases loaded.
   5. Robert Dudek Posted: July 04, 2001 at 12:09 AM (#603976)
I read somewhere that two hitters, Paul Molitor and Eddie Murray did significantly better than expected with men on base. That they did this does NOT constitute proof that they were clutch hitters, only that (based on statistical evidence) they are the most likely players to have been clutch hitters.

Ichiro may or may not be another one to add to the list when it's all said and done, but again, those numbers can only suggest the odds that they were able to raise their game.

As far as batting with the bases loaded, those sample sizes are so small that it's impossible to prove anything (remember Pat Tabler?)

I was looking at the league splits this year and BA for both leagues has been the same in RISP situations as overall. The rate of homeruns per AB in MOB situations is actually lower than the overall rate in both leagues.

I don't know why Bill James only considered those two situations: homers with MOB and BA in RISP situations. Bases loaded performance is likely to have a decent impact on how many runs a team scores.

   6. Carl Goetz Posted: July 05, 2001 at 12:09 AM (#603978)
It takes more than 1 hitter to prove the existence of the elusive "clutch hitter". Statistically speaking, there is a high probability that a handful of hitters will hit better in the clutch over the course of a season and even a career. Assuming a normal distribution, of course.
   7. James Fraser Posted: July 05, 2001 at 12:09 AM (#603979)
Robert, David and Carl... The main reason I wrote this article was to play around with Bill James' RC adjustments. I agree that it's silly to only adjust those two Base/Out situations. If the data were available to me, I could look at all of them and possibly look at the bunching of runs too.

