Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Tuesday, January 21, 2003

Scoring Position Average

Dave looks at 2002’s best table-setters.

I’ve developed an annual habit this time of year when complete statistics for the previous year become available. I create a spreadsheet that contains key data from the previous season, and I play with it to see what I can find. Most recently, while using Ray Kerby’s Astros Statistical Software, I tried to look at offensive production in a different light.

Most of us understand that OPS is a tremendous tool to evaluate offense. But offensive production is a complicated process that often requires several different approaches to understand it fully. So I changed some of the parameters a little bit.

That is, instead of thinking primarily along the two dimensions of offense (getting on base and slugging), I thought it might be helpful to think of three fundamental run production elements:

     
  1. Hit home runs. Any time, any day. The power and seduction of the home run as an offensive weapon is obvious.
  2. Get runners into scoring position. Not just on base, but on second and third, where a single, error or sacrifice fly can bring them home.
  3. Hit with runners in scoring position.

I calculate that these three components account for 90% to 95% of all runs scored. Plus, thinking of these components yields additional insight into teams’ and individuals’ hitting strengths and weaknesses.

As an example, let’s pick on two relatively equal teams from the NL East: the Mets and Phillies. The Phillies scored 710 runs in 2002, the Mets 690. When you correct for ballpark effects, they virtually come out even. But there were different factors accounting for their offensive production:

     
  1. Phillies hit 165 home runs, and the Mets hit 160. Correcting for ballpark effect again, they were about even. But the next two factors are more telling.
  2. Phillies had 1,474 at bats with runners in scoring position, while the Mets only had 1,267. Phils had the fifth best total in the major leagues; the Mets were third from last. This would normally bode very well for the Phillies.
  3. However, the Phillies simply did not hit well with runners in scoring position. Their batting average was .237 in those situations. Conversely, the Mets hit very well (.274) with runners in scoring position.
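To see how the second and third factors offset each other, here is a minimal sketch that multiplies each team's at bats with runners in scoring position by its batting average in those situations. The figures are the ones quoted above; because the averages are rounded to three places, the hit totals are approximations, and the function name is only illustrative.

```python
# (at bats with RISP, batting average with RISP), as quoted in the article.
TEAMS = {
    "Phillies": (1474, 0.237),
    "Mets": (1267, 0.274),
}


def approx_hits(at_bats, average):
    """Approximate hits implied by an at-bat total and a rounded average."""
    return at_bats * average


for team, (at_bats, average) in TEAMS.items():
    print(f"{team:<9} {at_bats:>5} AB  {average:.3f} BA  "
          f"about {approx_hits(at_bats, average):.0f} hits")
```

Under these figures the two teams come out within a few hits of each other (roughly 349 and 347), even though the Phillies had about 200 more at bats with runners in scoring position.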

The following graph shows the number of at bats and batting average with runners in scoring position for every major league team.

This chart tells many stories. For instance, Anaheim only hit 152 home runs in 2002, but the offensive strength that carried it to World Series victory is clear on this chart. Tampa Bay, one of the weakest offensive teams in the American League, was one of the best at getting runners into scoring position, but their poor batting average in those situations undermined that advantage.

Obviously, there is a strong correlation between how well teams hit, how they hit in the clutch, and the number of runners they get into scoring position. But large variances from that correlation, such as the Mets’ and Phillies’, have a strong impact on offensive performance.

These trends are also telling when applied to individual players. Why was Edgardo Alfonzo’s RBI total so low (56)? Well, he only had 103 at bats with runners in scoring position, though he batted well (.330) in those situations.

How about his apparent batting order replacement, Cliff Floyd? Floyd only had 79 RBIs last year, a low total for such an outstanding hitter. Floyd did have 150 at bats with runners in scoring position, but his batting average in those situations was .265, twenty points below his overall average. Most notably, he was 1 for 14 with the bases loaded.

Thanks to the Internet, individual player information regarding home runs and hitting with runners in scoring position is readily available. One other question intrigued me, however. How do we assess the ability of individual players to get into scoring position? Can this be a meaningful analysis?

To begin, here is a list of top ten baserunners ranked by how often they were in scoring position. The ranking is based on the number of total plate appearances in which that player was a runner on second or third:

National League

             
Player              ScPos
Luis Castillo         286
Junior Spivey         285
Bob Abreu             273
Fernando Vina         269
Vladimir Guerrero     264
Derrek Lee            263
Todd Walker           259
Aaron Boone           259
Todd Helton           256
Corey Patterson       255

This list includes some classic leadoff hitters, such as Castillo and Vina, as well as some other very good hitters such as Walker, Lee and Boone. Corey Patterson, at number ten, is a surprise. He had a very poor OBP last year, but his ability to hit doubles, steal bases and move around the base paths contributed to his standing.

Here’s the American League list:

American League

             
Player              ScPos
Ichiro Suzuki         317
Randy Winn            300
Derek Jeter           288
Shannon Stewart       284
Bernie Williams       273
Johnny Damon          266
Shea Hillenbrand      262
David Eckstein        262
Ray Durham            260
Darin Erstad          257

Again, a mix of leadoff and very good hitters who don’t hit a ton of home runs. Randy Winn also was the primary driver of Tampa Bay’s large number of at bats with runners in scoring position.

Obviously, this information is skewed by a number of factors, such as the total number of at bats each player had. To correct for this, I calculated the following list, in which each player’s total scoring position opportunities is divided by their total plate appearances (minimum of 400 plate appearances):

National League

                                         
Player              ScPos     PA    Ratio
Junior Spivey         285    626    0.455
Dave Roberts          216    479    0.451
Larry Walker          243    553    0.439
Craig Counsell        211    491    0.430
Luis Castillo         286    668    0.428
Placido Polanco       246    595    0.413
Edgar Renteria        251    609    0.412
Eric Young            226    553    0.409
Corey Patterson       255    628    0.406
Barry Bonds           246    612    0.402

Dave Roberts had a great season, didn’t he? Craig Counsell is certainly a surprise; this is probably due to his position in a strong lineup and a hitter’s park. Note that Corey Patterson stays on the list.

American League

       

Player              ScPos     PA      SPA
Adam Kennedy          232    509    0.456
Randy Winn            300    674    0.445
Shannon Stewart       284    641    0.443
Ichiro Suzuki         317    728    0.435
Kenny Lofton          249    611    0.408
Ray Durham            260    659    0.395
Derek Jeter           288    730    0.395
Bernie Williams       273    699    0.391
Mark Ellis            157    404    0.389
Manny Ramirez         201    518    0.388

Adam Kennedy jumps to the top of the list and Winn stays at number two. Kenny Lofton’s entire season stats are listed here, even though he split time between leagues. Mark Ellis?

As noted in the Counsell listing, these rankings are a reflection of both the individual player and his team. Perversely, some players may be on this list because they reach scoring position once in a while and stay there, while their teammates are unable to bat them in for two or three plate appearances.
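The ratio in the two tables above is a single division with a playing-time cutoff. As a rough sketch, using a handful of rows copied from those tables (the function and variable names are illustrative, not part of the original spreadsheet):

```python
# Rows copied from the tables above: (player, ScPos, PA).
ROWS = [
    ("Junior Spivey", 285, 626),
    ("Dave Roberts", 216, 479),
    ("Adam Kennedy", 232, 509),
    ("Mark Ellis", 157, 404),
]

MIN_PA = 400  # minimum plate appearances stated in the article


def scoring_position_ratio(sc_pos, pa):
    """Teammates' plate appearances spent on second or third, per own PA."""
    return sc_pos / pa


def ranked(rows, min_pa=MIN_PA):
    """Return (player, ratio) pairs for qualifying players, best first."""
    qualified = [(name, scoring_position_ratio(sc_pos, pa))
                 for name, sc_pos, pa in rows if pa >= min_pa]
    return sorted(qualified, key=lambda item: item[1], reverse=True)


for name, ratio in ranked(ROWS):
    print(f"{name:<15} {ratio:.3f}")
```

Because the numerator counts teammates' plate appearances, a runner stranded on second for three batters adds three to it, which is the distortion described in the paragraph above.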

To refine the analysis a bit more, I analyzed individual players based on their ability to get into scoring position on their own. I’ll call the number "Scoring Position Average," or SPA.

To compute SPA, I first calculated the number of times batters reached scoring position as the result of their at bats (in other words, doubles and triples). I then added each event in which they reached second or third base from first base without the assistance of a base hit or walk by a teammate. Examples of these events include stolen bases, balks, wild pitches and advances on outs.

I divided this sum by total plate appearances to calculate SPA. SPA, in other words, represents the percent of plate appearances in which a player advanced into scoring position under his own power.
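One way to picture the calculation is as a filter over a log of how a player reached second or third, followed by a division. This is only a sketch: the event labels and the sample log are hypothetical (the article does not describe a data format), and the article's list of qualifying events is given as examples rather than a complete set.

```python
# Hypothetical event labels. Events in SELF_MADE count toward SPA; advances
# that needed a teammate's hit or walk do not.
SELF_MADE = {
    "double", "triple",                      # reached as a batter
    "stolen_base", "balk", "wild_pitch",     # advanced from first as a runner
    "advance_on_out",
}

# Hypothetical log for illustration only, not real play-by-play data.
SAMPLE_EVENTS = [
    "double",
    "advance_on_teammate_single",
    "stolen_base",
    "advance_on_teammate_walk",
    "advance_on_out",
]


def count_self_made(events):
    """Count the times a player reached scoring position under his own power."""
    return sum(1 for event in events if event in SELF_MADE)


def scoring_position_average(reached_on_own, plate_appearances):
    """SPA: self-made trips into scoring position per plate appearance."""
    return reached_on_own / plate_appearances


print(count_self_made(SAMPLE_EVENTS))
print(f"{scoring_position_average(82, 479):.3f}")  # 82 and 479 are Dave Roberts's totals
```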

The results (minimum of 400 plate appearances):

National League

                   
Player              Team     PA    Sum      SPA
Dave Roberts        LAN     479     82    0.171
Luis Castillo       FLO     668    102    0.153
Juan Pierre         COL     640     93    0.145
Eric Young          MIL     553     79    0.143
Alex Sanchez        MIL     435     61    0.140
Tony Womack         ARI     652     90    0.138
Bob Abreu           PHI     685     94    0.137
Vladimir Guerrero   MON     709     97    0.137
Eric Owens          FLO     426     57    0.134
Corey Patterson     CHN     628     83    0.132

American League

                   
Player              Team     PA    Sum      SPA
Adam Kennedy        ANA     509     85    0.167
Chris Singleton     BAL     502     73    0.145
Randy Winn          TBA     674     98    0.145
Jerry Hairston      BAL     479     69    0.144
Ray Durham          OAK     659     94    0.143
Brad Fullmer        ANA     479     68    0.142
Kenny Lofton        CHA     611     86    0.141
Johnny Damon        BOS     702     97    0.138
Michael Tucker      KCA     543     74    0.136
Alfonso Soriano     NYA     741     98    0.132

These two lists include a number of hitters we typically regard as very good hitters, such as Abreu and Guerrero. But they also include a number of hitters who are typically not highly thought of, such as Juan Pierre, Chris Singleton, Jerry Hairston and Corey Patterson. It is Patterson’s persistence on each of these lists that makes me think we should perhaps reevaluate our perception of some of these low-OBP, speedy players.

Dave Studenmund Posted: January 21, 2003 at 06:00 AM | 16 comment(s)

Reader Comments and Retorts


   1. John Posted: January 21, 2003 at 02:24 AM (#608464)
Really interesting. One of my initial reactions was to think, "but doesn't a guy who's always batting with men on base get penalized," since it's harder (relatively) to get to 2B or 3B with all of that "traffic" out there? But, of course, if he's hitting a double or a triple, he'll clear the bases. So, this SPA does seem like a useful way of summarizing a leadoff-type guy's "secondary" contributions to team scoring, or at least to team scoring opportunities.

Lots of guys on lousy (craptastic is kind of played) teams on the leaderboards, especially in the NL. Helps explain why those teams were lousy, but the good teams don't have guys with good SPA's (Anaheim-2, other 7 playoff teams-3). Wonder why?

   2. Dave Studenmund Posted: January 21, 2003 at 02:24 AM (#608467)
backstabber, this stat is really just a descriptive stat, useful for looking at certain types of players. It's not meant to be a "be all/end all" kind of statistic. Just a slightly different way of looking at things.

John, good point about the high SPA players being on some craptastic teams. As I've thought about it further, I've realized that these types of players have more value to lousy teams.

Take Eric Young, for instance. On the surface, his Milwaukee contract looked like a bad idea. But Young had a .340 OBP this past year, with a good set of skills for moving himself into scoring position. That's not a bad thing on a team without a lot of good hitters in a row moving runners along. In fact, it's arguably exactly the sort of thing a team like Milwaukee should be doing.
   3. All you Need is Glove Posted: January 21, 2003 at 02:24 AM (#608468)
John -- "Lots of guys on lousy (craptastic is kind of played) teams on the leaderboards, especially in the NL. Helps explain why those teams were lousy, but the good teams don't have guys with good SPA's (Anaheim-2, other 7 playoff teams-3). Wonder why?"

This is a wild guess, but maybe players on teams with really poor offenses take more risks on the bases. One limitation of SPA is it doesn't capture failed attempts to take the extra base (e.g., caught stealing and being thrown out at second when trying to stretch a single into a double, etc.).
   4. Damon Rutherford Posted: January 22, 2003 at 02:24 AM (#608470)
The ranking is based on the number of total plate appearances in which that player was a runner on second or third:

Dave, so what if Corey Patterson hits a double to lead off an inning and then sits there while the following three batters all strike out? Does this count as three opportunities?

Perhaps I am misreading the above statement of yours. I see it as: (1) how many plate appearances by his teammates occurred while that player was on second or third. Maybe it really is: (2) how many plate appearances by a particular player resulted in that particular player becoming a runner in scoring position.

I think the "was" in your statement above is confusing me. Thanks in advance if you help clarify this for me.





   5. Dave Studenmund Posted: January 22, 2003 at 02:24 AM (#608472)
Greg, you're exactly right. The first two calculations are the number of plate appearances in which that runner was at second or third. Therefore, if the runner stays there for three outs, that is three plate appearances. It's an interesting stat, but I'm not claiming it has value on a stand-alone basis.

SPA, however, is a measure of how often the player got to scoring position on their own, divided by their own plate appearances. So it eliminates that problem. Seems to me this does have some value beyond just being interesting.
   6. Marc Stone Posted: January 22, 2003 at 02:24 AM (#608474)
There's a simple way around backstabber's problem with Barry Bonds. Subtract HRs from plate appearances, then divide.
   7. Mike Posted: January 22, 2003 at 02:24 AM (#608475)
Dave,

Did you calculate SPAs from previous seasons as well? I'd be curious to see how some players trend.
   8. Walt Davis Posted: January 22, 2003 at 02:24 AM (#608485)
So Joe Morgan was right all along and Brad Fullmer really is a good baserunner!

I have some qualms with the measure, at least as a measure of any skill. I'm not sure we should credit these guys with advancing on outs. At the very least, the number of times they advance on sacrifice bunts should probably be removed. It's also easier to move from 1 to 2 if the batter after you hits more groundballs than flyballs. Moving from 2 to 3 is almost automatic for all baserunners on groundballs to the right side or deep fly balls to anywhere but left.

And as noted, either HR should be removed from the denominator or, better yet, added to the numerator -- home plate seems like the best of all scoring positions to me. This measure necessarily rewards guys who hit doubles instead of HRs (which is one factor that contributes to all the Angels on the list). Tony Muser would have loved this stat. :-)

I also don't see why these guys should necessarily be rewarded for wild pitches. Balks I can see an argument for, though my impression is that most balks these days are the result of mental errors, not attempts to deceive the runner.

Finally, seems leadoff hitters have an automatic advantage since they're guaranteed at least one PA with no outs. It's easier to advance to scoring position at some point in that inning if you get on base with no outs.

How about this measure: # of teammates' PAs spent in scoring position divided by # of teammates' PAs spent on base. This part gives us the frequency with which they move up when they do reach base. Then multiply by OBP and I think that gives us something like the number of teammates' PAs spent in scoring position per player's PA. OK, that's not quite right, but we've somehow got to get the number of opportunities in there.

For example ... well, ESPN surprisingly doesn't give splits by # outs, but they do give the none on/no out split. Patterson had 177 such PAs, Sosa had 122. And he didn't even lead off the whole year. Dave Roberts, also a part-time leadoff guy, had 210 none on/no out PAs. Eckstein, a full-time leadoff guy, had 259 such PAs, about 37% of all his PAs.

Of course Sammy also had an embarrassing .279 OBP in those situations (still, better than Patterson's!). So maybe Barry and his 121 PAs would be a better example.
   9. Dave Studenmund Posted: January 22, 2003 at 02:24 AM (#608492)
Lots of good comments. I'm actually going to be out of action for a couple of days, so let me try to respond to a few of the points.

One, I didn't worry about home runs in this stat, because I considered home runs to be another part of run production (as laid out in the beginning of the article). I would lean toward taking them out of the denominator (good suggestion).

Secondly, I've got to say that the Phillies' (and Mets') performance with runners in scoring position was simply a matter of luck. Look for the Phillies to increase significantly next year in runs scored, for no other reason than this.

Really, the ability to get runners in scoring position (the "x" axis on the chart) is a very good indicator of the team's overall batting ability. Hence, teams that are below the imputed linear relationship will tend to float back up, while those above the line will tend to float down.

Third, I'd love to calculate this stat over time, but I just don't have the time. I'm more interested in refining the stat itself first (if I even have time for that!).

Fourth, I think, Walt, that you hit the nail on the head with your comments. I went back and forth a lot regarding advancing on outs. I think there is a good rationale for crediting players who have speed and advance on outs that others don't. But you're right, this strongly biases the average toward leadoff hitters, who get on base with no outs more often. This is big, because advancing on outs accounts for half of all runner-only advances (stolen bases are second).

So I recalculated these stats by taking home runs out of the denominator and removing advances on outs altogether. As you can imagine, the list changes dramatically to emphasize non-leadoff guys who are strong doubles and triples hitters.

Garciaparra and Garrett Anderson lead the list in the AL at .091. Winn drops to 14th.

In the NL, the great Japanese hitter Kevin Millar leads at .087 and Abreu is second at .084. Corey Patterson drops to about 45th.

I'd put the tables in this post, but I don't know how to format them.

Not nearly so interesting, and probably closer to the "truth." Still, I can't stop feeling that some credit should be given for advancing on outs, but I'm not sure how to do that. I didn't follow your line of logic, Walt.

Any thoughts?
   10. bob mong Posted: January 22, 2003 at 02:24 AM (#608493)
As an example, let's pick on two relatively equal teams from the NL East: the Mets and Phillies. The Phillies scored 710 runs in 2002, the Mets 690. When you correct for ballpark effects, they virtually come out even. But there were different factors accounting for their offensive production:

Phillies hit 165 home runs, and the Mets hit 160. Correcting for ballpark effect again, they were about even. But the next two factors are more telling.


Actually, according to baseball-reference, the Phillies played in a tougher home park, for hitters, than the Mets did.

Veterans Stadium 2002 batting park factor: 91
Shea Stadium 2002 batting park factor: 94

So, according to this, the difference, offensively, between the two clubs is real, and actually wider than it appears at first.
   11. Walt Davis Posted: January 23, 2003 at 02:25 AM (#608504)
Is the underperformance just "bad luck"? If so, what does that say for single season player stats? "He's not a bad player, he just had an unlucky string of 1400 at bats these past three years . . ." No one says that, but there is clearly a difference between a .237 hitter and a .266 hitter over 1,000 at bats.

Can someone explain that?


Sure. It's luck... barely. Let's start with a generic example. What is the likely range for a .250 hitter over 1,000 ABs?

We use the handy binomial distribution. The mean of the binomial distribution is p*N and the variance is calculated as p*(1-p)*N, where p is the likelihood of success and N is the number of "trials." In this example, p is .250 and N=1000.

This gives us a mean of 250 (we knew that) and a variance of 187.5. We can get the standard deviation by taking the square root of the variance, giving us 13.7. Now, for a large number of trials, the binomial distribution is well approximated by the normal distribution. This means we can calculate a 95% confidence interval as the mean +/- 1.96*SD. In this case, that tells us that we can expect a .250 hitter in 1,000 ABs to get 250 hits +/- 27 hits. So in 1,000 ABs, a .223 hitter is not statistically significantly different from a .250 hitter.

People may not be willing to say that a .223 hitter is as good as a .250 hitter over 1,000 ABs, but the stats say you can't reliably tell them apart.

Now, back to the example at hand. The Phils hit .237 in 1,474 AB with RISP vs .259 overall. Are these significantly different? Well, let's assume their "true" value is .259. In 1,474 ABs, the expected number of hits is 382 and the variance is 282.9. The square root of that is 16.8. The resulting 95% confidence interval is 382 hits +/- 33 hits. As luck would have it, 349/1474 is .237. So the Phils were right on the borderline there.
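The arithmetic in the two examples above can be reproduced with a few lines. This is a minimal sketch of the normal approximation to the binomial described in this comment; the function name and return layout are illustrative choices, and 1.96 is the usual multiplier for a 95% interval.

```python
import math


def hit_interval(p, n, z=1.96):
    """Normal approximation to a binomial count of hits.

    p is the assumed true batting average and n the number of at bats.
    Returns (mean, variance, standard deviation, interval half-width).
    """
    mean = p * n
    variance = p * (1 - p) * n
    sd = math.sqrt(variance)
    return mean, variance, sd, z * sd


for label, p, n in [("generic .250 hitter", 0.250, 1000),
                    ("Phillies with RISP", 0.259, 1474)]:
    mean, variance, sd, half_width = hit_interval(p, n)
    print(f"{label}: {mean:.0f} hits +/- {half_width:.0f} "
          f"(variance {variance:.1f}, SD {sd:.1f})")
```

With the Phillies' inputs the lower edge of the interval lands very close to 349 hits, which is why the observed .237 sits right at the borderline.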

Now one thing to keep in mind is that, technically, this test is only valid if we had chosen the Phils randomly. But they weren't; they were chosen precisely because they'd done so poorly. The point here is that statistical significance is determined (generally) by whether the value lies outside the 95% confidence interval. But, by chance alone, even if the true BA is .259, 5% of all sets of 1,474 ABs will fall outside that interval. In other words, on average, every season we'd expect 1.5 ML teams (i.e. 5% of all teams) to have a BA/RISP "significantly different" from their overall BA.
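The selection effect can be put in numbers with a short sketch. It assumes 30 teams (the count implied by "1.5 teams is 5%") and, as a simplification not stated in the comment, that teams' results are independent of one another.

```python
def expected_outliers(teams, alpha=0.05):
    """Expected number of teams outside a (1 - alpha) interval by chance alone."""
    return teams * alpha


def prob_at_least_one(teams, alpha=0.05):
    """Chance that at least one team falls outside, assuming independence."""
    return 1 - (1 - alpha) ** teams


print(f"expected outliers: {expected_outliers(30):.1f}")
print(f"chance of at least one: {prob_at_least_one(30):.2f}")
```

Under those assumptions, finding at least one "significant" team in any given season is the likely outcome rather than a surprise, which is the caution being raised about picking the Phillies after the fact.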

Or ignore everything I just said and rely on the fact that, to my knowledge, no one has yet been able to demonstrate that RISP differences maintain from season-to-season. If those differences were "real", they should.

Now, the league-wide difference on 2 outs vs. less than 2 outs. That's definitely significantly different, though I'd be more interested in OBPs and SLGs. You touched on one part of the puzzle --sac flies. Unfortunately b-r doesn't give team/league SF totals, so I have no idea how big that effect might be. Intentional walks to good hitters with 2 outs is probably another piece. And maybe particular positions (like the 8th/9th spots) in the batting order are systemically more likely to have BA/RISP with 2 outs than other spots.

And this does give us another possible explanation for the Phils' poor performance -- maybe they had a disproportionate number of ABs/RISP when there were 2 outs. Bound to happen when you've got Doug Glanville and Jimmy Rollins at the top of your lineup. :-)

Finally, as to my "logic", I didn't offer any way of better dealing with advancements on outs. About the only way you might do that is with PBP data. I do agree that fast players deserve some credit here, I'm just not sure how much.

My proposal was to measure the denominator differently in hopes of correcting for the "lead off" bias. Instead of number of PAs resulting in the batter eventually making 2nd/3rd divided by their number of PAs, I say you go back to the first measure in your article (number of teammates' PAs that a player spends in scoring position), then divide it by the total number of teammates' PAs that a player spends on base.

For example, Patterson singles to lead off the inning. He remains at 1st after Bellhorn makes an out. He advances to second on Sammy's grounder. McGriff strands him. So Patterson was on 2nd base for 1 of his teammates' PAs, but he was on base for 3 PA's. So he'd get a 1/3. He had 3 opportunities to be in scoring position for his teammates and he was there for 1 of them.

Compare this to Patterson out, Bellhorn out, Sammy doubles, McGriff out. Sammy had only 1 opportunity to be in scoring position for a teammate, and he was, so he gets a 1/1.

So that might be the rate stat, but we need to somehow "correct" it for guys who don't get on base. That is, Patterson may indeed be really good at advancing (have a high rate). But that doesn't necessarily make him good at getting into scoring position because he rarely gets on base. A guy who advances at a lower rate but gets on base a lot should still come out higher on this measure. That's where I was going with the idea of multiplying it by OBP, but the resulting formula looks like complete nonsense:

(teammates' PAs spent in scoring position)*(times on base)

divided by:

(teammates' PAs spent on base)*(own PAs)
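For what it's worth, the proposed rate and the combined formula above can be written out as a small sketch. The function names are made up for illustration, exact fractions are used so the 1/3 and 1/1 from the two inning examples come out as stated, and "times on base divided by own PAs" stands in for OBP as in the formula.

```python
from fractions import Fraction


def advance_rate(pa_in_scoring_position, pa_on_base):
    """Share of teammates' PAs on base that the runner spent on second or third."""
    return Fraction(pa_in_scoring_position, pa_on_base)


def combined_measure(pa_in_scoring_position, pa_on_base, times_on_base, own_pa):
    """advance_rate multiplied by (times on base / own PAs), per the formula above."""
    return Fraction(pa_in_scoring_position * times_on_base, pa_on_base * own_pa)


# Inning 1: Patterson is on base for three teammates' PAs, on second for one.
print(advance_rate(1, 3))
# Inning 2: Sosa doubles and is on second for the one PA that follows.
print(advance_rate(1, 1))
```

Multiplying by times on base per PA is what pulls down a player who advances well but rarely reaches base, which is the correction the comment is reaching for.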
   12. Dave Studenmund Posted: January 26, 2003 at 02:25 AM (#608523)
Thanks for the comments, everyone. I followed up on a lot of them, and took another stab at the numbers. Here's what I found:

In all, players reached scoring position 34,000 times in 2002. Of those, 30% were the result of a batter hitting a double or triple. Another 50% were the result of a batter reaching first, and then moving into scoring position as a result of a positive contribution from another hitter (hit, walk, hbp, or sacrifice). The remaining 20% were the result of the runner on first moving to second or third "on his own" (stolen base), thanks to the defense (balks, wild pitches, etc.) or on a less-than-positive contribution by the hitter (non-sacrifice outs).

That was kind of interesting. Then I looked at the situation based on number of outs. Players are more likely to move into scoring position when there are no outs than when there are two outs. This is obvious, for a couple of reasons:

1. Many more sacrifice bunts with none out. The batter moving the runner along increases approximately 15% with none out.

2. Similarly, runners can't move along on outs when there are two outs.

I also found that the percent of times runners moved on stolen bases or defensive lapses was constant throughout the out situations.

Walt, you're exactly right about the Phillies: they led the league in at bats with runners in scoring position with two outs. This was a factor in their low BA w/RISP. One reason is that their highest SPA batter, Abreu, was not their leadoff batter.

In the end, the "out factor" is huge. Guys who get on base with no outs (that is, leadoff hitters) obviously SHOULD get into scoring position sooner or later. So here's what I did:

I recalculated SPA to credit sacrifices to the batter. I still credited the runner for moving up on outs and defensive mistakes. Debatable, but what the heck. I subtracted home runs from plate appearances and recalculated SPA.

Next step: I then added this new SPA to a modified OBP (OBP without the home runs in denominator or numerator) to get another new stat: OBSPA. Given that the most important job of a leadoff hitter is to get on base (so other guys can bat him around), some measure of pure OBP should be included. After playing with a lot of formulas, I basically just added the two.

Then I only looked at batters who got to first base with no outs at least 40% of all times they got to first base. That basically gave me a list of leadoff hitters.
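As a rough sketch of the steps described above: home runs come out of both the SPA denominator and the OBP numerator and denominator, and the two rates are simply added. The function names and the sample inputs are hypothetical, and plate appearances are used as the OBP denominator for simplicity, which is an assumption rather than something the spreadsheet is known to do.

```python
def revised_spa(reached_on_own, pa, hr):
    """SPA with home runs removed from the plate-appearance denominator."""
    return reached_on_own / (pa - hr)


def modified_obp(times_on_base, pa, hr):
    """OBP with home runs removed from both numerator and denominator."""
    return (times_on_base - hr) / (pa - hr)


def obspa(spa, mod_obp):
    """OBSPA: the two rates added together."""
    return spa + mod_obp


# Hypothetical season line for illustration only.
pa, hr, reached_on_own, times_on_base = 600, 10, 80, 210
spa = revised_spa(reached_on_own, pa, hr)
obp = modified_obp(times_on_base, pa, hr)
print(f"SPA {spa:.3f} + modified OBP {obp:.3f} = OBSPA {obspa(spa, obp):.3f}")
```

Read the other way, a listed SPA of 0.150 and OBSPA of 0.511 imply a modified OBP of about 0.361.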

Anyway, here are the OBSPA leaders for the National League (pray for formatting):

Name              Team  Lg    SPA    OBSPA
Luis Castillo     FLO   NL    0.150  0.511
Dave Roberts      LAN   NL    0.162  0.506
Eric Young        MIL   NL    0.140  0.469
Alex Sanchez      MIL   NL    0.131  0.468
Juan Pierre       COL   NL    0.139  0.466
Mark Kotsay       SDN   NL    0.110  0.450
Todd Walker       CIN   NL    0.108  0.447
Craig Counsell    ARI   NL    0.104  0.446
Fernando Vina     SLN   NL    0.106  0.437
Tony Womack       ARI   NL    0.114  0.431
Rafael Furcal     ATL   NL    0.115  0.426
Reggie Sanders    SFN   NL    0.120  0.416
Jimmy Rollins     PHI   NL    0.118  0.411
Todd Zeile        COL   NL    0.069  0.402
Aaron Boone       CIN   NL    0.118  0.401
Kevin Young       PIT   NL    0.098  0.399
Corey Patterson   CHN   NL    0.132  0.397





I hope that looks okay. Final thing: if anyone wants this, send me an e-mail. I'll try to make the Excel sheets I used understandable and send them to you. I worked in Windows XP Excel 2002, but I can probably save it to an earlier version if needed.

By the way, there probably is an issue in which runners reach a high SPA if they play for poor hitting teams, as Adam says. I didn't take the time to try and correct for that. Maybe someday I will.

And once again, I'm not claiming this is the "be all and end all" stat. Just interesting. My thought is that combining SPA and OBP might be a good value stat for a leadoff hitter.
   13. Dave Studenmund Posted: January 26, 2003 at 02:25 AM (#608542)
Ah, I knew I'd get in trouble. Was almost done, too. Ignore that last sentence; I went back and recalculated the stats.

Anyway (if you're still here) here are the American League leaders (I'll retry the formatting):

Name              Team  Lg    SPA    OBSPA
Ray Durham        OAK   AL    0.148  0.502
Ichiro Suzuki     SEA   AL    0.110  0.489
Shannon Stewart   TOR   AL    0.124  0.485
Randy Winn        TBA   AL    0.138  0.483
Johnny Damon      BOS   AL    0.141  0.483
Kenny Lofton      CHA   AL    0.138  0.473
David Eckstein    ANA   AL    0.112  0.461
Alfonso Soriano   NYA   AL    0.135  0.430
D'Angelo Jimenez  CHA   AL    0.092  0.416
Jacque Jones      MIN   AL    0.107  0.416
Melvin Mora       BAL   AL    0.098  0.415
Matt Lawton       CLE   AL    0.083  0.403
Ruben Sierra      SEA   AL    0.089  0.387
Cristian Guzman   MIN   AL    0.107  0.385
Mike Young        TEX   AL    0.090  0.381
Brent Abernathy   TBA   AL    0.094  0.375
   14. Dave Studenmund Posted: January 26, 2003 at 02:26 AM (#608543)
That was better. Here's a re-run of the National League stats:

Name Team Lg SPA OBSPA
Luis Castillo FLO NL 0.150 0.511
Dave Roberts LAN NL 0.162 0.506
Eric Young MIL NL 0.140 0.469
Alex Sanchez MIL NL 0.131 0.468
Juan Pierre COL NL 0.139 0.466
Mark Kotsay SDN NL 0.110 0.450
Todd Walker CIN NL 0.108 0.447
Craig Counsell ARI NL 0.104 0.446
Fernando Vina SLN NL 0.106 0.437
Tony Womack ARI NL 0.114 0.431
Rafael Furcal ATL NL 0.115 0.426
Reggie Sanders SFN NL 0.120 0.416
Jimmy Rollins PHI NL 0.118 0.411
Todd Zeile COL NL 0.069 0.402
Aaron Boone CIN NL 0.118 0.401
Kevin Young PIT NL 0.098 0.399
Corey Patterson CHN NL 0.132 0.397

Looks like Patterson is in his rightful place.
   15. Dave Studenmund Posted: January 26, 2003 at 02:26 AM (#608544)
Doh! One more time:

Name Team Lg SPA OBSPA

Luis Castillo FLO NL 0.150 0.511

Dave Roberts LAN NL 0.162 0.506

Eric Young MIL NL 0.140 0.469

Alex Sanchez MIL NL 0.131 0.468

Juan Pierre COL NL 0.139 0.466

Mark Kotsay SDN NL 0.110 0.450

Todd Walker CIN NL 0.108 0.447

Craig Counsell ARI NL 0.104 0.446

Fernando Vina SLN NL 0.106 0.437

Tony Womack ARI NL 0.114 0.431

Rafael Furcal ATL NL 0.115 0.426

Reggie Sanders SFN NL 0.120 0.416

Jimmy Rollins PHI NL 0.118 0.411

Todd Zeile COL NL 0.069 0.402

Aaron Boone CIN NL 0.118 0.401

Kevin Young PIT NL 0.098 0.399

Corey Patterson CHN NL 0.132 0.397
   16. Dave Studenmund Posted: January 27, 2003 at 02:26 AM (#608563)
FJM, nice study. Similar to ones I've done in the past. I also found that SLG had a higher correlation with runs scored than OBP. What was the t stat for each variable?

Your finding regarding batting average with RISP is consistent with my approach. It's the one situation in which BA is actually a valuable stat. I'm not sure I'd draw the same conclusion as you regarding RISP with two outs, but it could be. Multivariate regression analysis is a bear.

FYI, I ran a regression of my three key offensive variables on runs scored and got an R squared of .94. As a reminder, my three variables are hit a home run anytime/anywhere, get runners in scoring position, and hit (BA) with runners in scoring position. Of those three, HR hitting has the highest t stat, BARISP is second and PAs with RISP is third in importance.
