Baseball for the Thinking Fan

Primate Studies
— Where BTF's Members Investigate the Grand Old Game

Thursday, July 17, 2003

Stacking the Deck

Bill and David take an in-depth look at the success rate of first-round draft picks.

“A manager has his cards dealt him and he must play them”—Miller Huggins

Summary:

In holding the purse strings of Major League Baseball franchises, today’s owners and general managers face tough choices every day.  Decisions about player personnel alone require massive research, judgment, and, as always, some good luck.  One age-old question centers on “nature versus nurture”: should teams focus their human-resources capital on scouting top prospects from the high school and college ranks, or on developing bargain players through coaching?  Are the Mickey Mantles of the game born or bred?

 

We’ll look at today’s major-league leaders in a variety of categories, tracing their origins to examine the correlation between draft position and current success.  Other issues may surface from the data too, such as All-Star potential and which team is likely to win the World Series.  We won’t answer all these questions once and for all, but we intend to examine how statistical research on leading prospects can help teams allocate their resources more wisely, field better players, and, they hope, win more pennants.

Clark Griffith, owner of the hapless Washington Senators, once said, “Fans like home runs and we have assembled the pitching staff to please our fans.”

Issue:

Our overall hypothesis states that:

 

     

  1. MLB scouting does identify promising major-league players, as evidenced by the success of players who were selected during the first round of the major-league baseball entry draft.  Moreover, predictability varies:
    1. by “picking order” of top selections;
    2. by position; and lastly,
    3. by experience level of selections, such as high school versus college.


“Baseball statistics are like bikinis—they both reveal a lot, but not everything”—Toby Harrah

Data and Limitations

Our population is all major-league baseball first-round draft picks, as published on the major-league baseball web site.  From this list we took every first-round selection from 1982-1986, so our data set is the complete population of the 130 first-round picks of those five drafts rather than a subset of them.  Inferences are made on the career results of these players.  We chose this period because it is the most recent one for which the full term of its players’ careers can be accurately assessed.  Indeed, the careers of most of these players are complete, and we can make fairly accurate assessments of the success of those that aren’t (Roger Clemens, Barry Bonds, Matt Williams, etc.).

 

“Scouting biases” are sure to exist to some degree.  They relate both to the imperfections in the process of observing each new crop of ballplayers and to the passing trends or idiosyncratic priorities that shape a scout’s scoring system.  We believe the limitations and biases are inherent in the process but, importantly, we assume that they are normally distributed across all players, inside or outside our sample.

“Baseball is 90% mental.  The other half is physical.” —Yogi Berra

Analysis

The first part of the analysis determines the correlation of success with the selection order, position and experience of our sample.  Before we pulled the data, we created a rating system of buckets, detailed later in this study.  First, the sample is grouped by defensive position at the time of selection: all pitchers, and all hitters classified as catchers, infielders and outfielders.  The performances of the players are then compared against their peers, the population of MLB players of the same position and career span.  The Sabermetric Baseball Encyclopedia software enables us to compare each sample player’s output to that of his peers at the same position.

Using “Plate Appearances” and “Innings Pitched” as our basis for hitters and pitchers respectively, we determined expected values for each statistic and compared them to the player’s actual values.  For example, if in 1998 first basemen averaged one home run every 19 plate appearances, we would apply that ratio to Mark McGwire’s plate appearances for the same year and see that his expected value was 35 home runs, versus his actual output of 70.  This effort was then replicated for each year of each player’s career to determine what his career stats would have been had he performed equal to the average of his peers, versus what the player actually did.

The criteria and performance measures were chosen by us; they are what we consider the most representative measures of success among the most widely used quantitative stats for hitters and pitchers.  Performance is then scored for each criterion against two thresholds:

 

  • At least 10% better than peers, called a Ten.
  • At least 20% better than peers, called a Twenty.
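The expected-value comparison and the two thresholds above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names are ours, the 665 plate appearances are a hypothetical figure chosen so that the expected total comes out to the 35 home runs quoted in the example, and returning only the higher of the two labels is our reading rather than a documented rule of the study.

```python
def expected_count(plate_appearances, pa_per_event):
    """Peer-average total for a player, given his plate appearances and the
    number of plate appearances his positional peers needed per event."""
    return plate_appearances / pa_per_event


def mention(actual, expected, higher_is_better=True):
    """Return "Twenty", "Ten", or None for one statistic.

    Assumption: only the higher label is reported when both thresholds
    are cleared.
    """
    if expected <= 0:
        return None
    # Relative edge over peers; flipped for stats where fewer is better
    # (for example, a hitter's strikeouts).
    edge = (actual - expected) / expected
    if not higher_is_better:
        edge = -edge
    if edge >= 0.20:
        return "Twenty"
    if edge >= 0.10:
        return "Ten"
    return None


# Hypothetical season: 665 plate appearances, peers homer once every 19 PA.
expected_hr = expected_count(665, 19)
label = mention(70, expected_hr)
```

Passing `higher_is_better=False` covers statistics such as strikeouts for hitters, where beating the peer average means posting a lower number.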

BATTERS

Primary Mentions (definitions / calculations given where needed)

  • Runs
  • Home Runs
  • RBIs: Runs Batted In
  • Stolen Bases
  • Batting Average: Hits divided by At Bats

Secondary Mentions

  • Slugging Percentage: TB divided by AB
  • Strike Outs
  • On-base percentage: (H + BB + HBP) divided by (AB + BB + HBP + SF).  The divisor is At Bats plus Walks plus Hit By Pitcher plus Sacrifice Flies; or Plate Appearances minus Sacrifice Hits and Times Reached Base on Defensive Interference.
  • On-base percentage plus slugging percentage
  • Runs created: A way to combine a batter’s total offensive contributions into one number.  The formula: [(H + BB + HBP - CS - GIDP) times (Total Bases + .26[BB - IBB + HBP] + .52[SH + SF + SB])] divided by (AB + BB + HBP + SH + SF)
  • Runs Created per Average appearance
  • Runs Created AP
  • Total Bases
  • Grounded Into Double Play

PITCHERS

Primary Mentions (definitions / calculations given where needed)

  • Strike Outs
  • Wins
  • Saves
  • E.R.A.: (Earned Runs times 9) divided by Innings Pitched.
  • RSAA (Run Support Per 9 IP): The number of runs scored by a pitcher’s team while he was still in the game, times nine, divided by his Innings Pitched.

Secondary Mentions

  • Hits, walks and hit batsmen allowed per nine innings
  • Winning Percentage: Wins divided by (Wins plus Losses).
  • Home Runs Allowed


(Note:  Fielding performance, while important, is not considered because of its qualitative nature.)
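For readers who want to check the arithmetic, the on-base percentage and runs-created formulas listed in the batters’ definitions translate directly into code.  The sketch below is ours, not the Encyclopedia software’s; the function and argument names are assumptions, and the stat line is a made-up season used only to exercise the formulas.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """(H + BB + HBP) divided by (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def runs_created(h, bb, hbp, cs, gidp, tb, ibb, sh, sf, sb, ab):
    """Runs created, term by term as written in the definitions above."""
    on_base = h + bb + hbp - cs - gidp
    advancement = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)
    opportunities = ab + bb + hbp + sh + sf
    return on_base * advancement / opportunities


# Hypothetical stat line, not taken from the study's data.
season = dict(h=150, bb=60, hbp=5, cs=3, gidp=10, tb=250,
              ibb=5, sh=2, sf=5, sb=10, ab=500)

rc = runs_created(**season)
obp = on_base_percentage(season["h"], season["bb"], season["hbp"],
                         season["ab"], season["sf"])
print(f"RC = {rc:.1f}, OBP = {obp:.3f}")
```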

Next, we assigned seven final achievement “buckets” to the output, based on longevity minimums and a matrix of the total “mentions” from above.  Ultimately, it is these categories that determine a player’s value.

 

Buckets and Requirements

  • Struck Out Looking: Never made it to the majors.
  • Struck Out Swinging: Had a “cup of coffee”, that is, less than one aggregate season of play.
  • Single: 1-3 aggregate seasons played, and not enough mentions for a Double.
  • Double: 1-3 aggregate seasons played, and at least 3 primary Tens or at least 6 secondary Tens in total; OR 3-10 aggregate seasons played, and not enough mentions for a Triple.
  • Triple: 1-3 aggregate seasons played, and at least 3 primary Twenties or at least 6 secondary Twenties; OR 3-10 aggregate seasons played, and at least 3 primary Tens or at least 6 secondary Tens; OR 10-15 aggregate seasons played, and not enough mentions for a Home Run.
  • Home Run: 3-10 aggregate seasons played, and at least 3 primary Twenties or at least 6 secondary Twenties; OR 10-15 aggregate seasons played, and at least 3 primary Tens or at least 6 secondary Tens; OR 15+ aggregate seasons played, and not enough mentions for a Grand Slam.
  • Grand Slam: 10+ aggregate seasons played, and at least 3 primary Twenties or at least 6 secondary Twenties.
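The bucket matrix above can be written as a small decision function.  This is a sketch under several assumptions of ours: the ranges are treated as half-open (a player with exactly 3 aggregate seasons falls in the 3-10 band, and so on), the caller supplies the Ten and Twenty counts with every Twenty already counted as a Ten, and an aggregate season is taken as 502 plate appearances or 162 innings pitched, the definition the authors give in their follow-up comment.  All function and parameter names are hypothetical.

```python
def aggregate_seasons(workload, pitcher=False):
    """Plate appearances / 502 for hitters, innings pitched / 162 for pitchers."""
    return workload / (162 if pitcher else 502)


def enough(primary, secondary):
    """At least 3 primary mentions or at least 6 secondary mentions."""
    return primary >= 3 or secondary >= 6


def assign_bucket(seasons, primary_tens, secondary_tens,
                  primary_twenties, secondary_twenties, reached_majors=True):
    """Map longevity and mention counts to one of the seven buckets."""
    tens = enough(primary_tens, secondary_tens)
    twenties = enough(primary_twenties, secondary_twenties)
    if not reached_majors:
        return "Struck Out Looking"
    if seasons < 1:
        return "Struck Out Swinging"
    if seasons < 3:
        return "Triple" if twenties else "Double" if tens else "Single"
    if seasons < 10:
        return "Home Run" if twenties else "Triple" if tens else "Double"
    if seasons < 15:
        return "Grand Slam" if twenties else "Home Run" if tens else "Triple"
    return "Grand Slam" if twenties else "Home Run"


# A hitter with 315 career plate appearances: under one aggregate season.
cup_of_coffee = assign_bucket(aggregate_seasons(315), 0, 0, 0, 0)
```

Note that longevity alone lifts a player’s floor: under these rules a 15-season career rates no lower than a Home Run even with no mentions at all.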

 

More Disclaimers

We picked two increments of 10% above the mean as our significance levels.  These particular levels needn’t be used, and a different choice could materially change the outcome of the hypothesis test; however, we believe these increments are fair and persuasive.  Baseball fan and political type George Will argued in his 1990 book, Men at Work: The Craft of Baseball, that the big leagues have evolved uniformly to be more competitive and more precise than ever before.  If true, margins of greatness should diminish, and we would allow the significance level of the test to shrink over time.

 

The buckets are also our creations.  The names are descriptors only, for hitters and pitchers; the choice of seven buckets was our bet for the most efficient career breakdowns; the aggregate seasons are taken from batting title minimums; and the requirements are our attempt to extrapolate the qualitative judgment of a player’s career from the quantitative summation of his performance.

“Say you were standing with one foot in the oven and one foot in an ice bucket.  According to the percentage people, you should be perfectly comfortable.” —Manager Bobby Bragan

The Numbers:

The sample mean for all first-round players is 3.02.

Results by picking order:

All Players

Pick #    Mean Outcome    Std Dev
1-6       4.13            1.96
7-13      2.94            1.43
14-20     2.69            1.66
21-26     2.40            1.73
Total     3.02            1.79

Pitchers

Pick #    Mean Outcome    Std Dev
1-6       3.85            1.95
7-13      2.58            1.24
14-20     2.94            1.83
21-26     2.20            1.82
Total     2.91            1.81

Hitters

Pick #    Mean Outcome    Std Dev
1-6       4.36            2.00
7-13      3.13            1.52
14-20     2.41            1.46
21-26     2.60            1.68
Total     3.14            1.79

 

Overall, the higher the pick, the stronger the performance.  We bracketed picks into groups of six or seven draft slots in order to yield large samples of 30 or 35 players per group over the five-year sample period.  We compared the means above using large-sample confidence intervals (z-scores), in each case tested at the 95% level.  According to our analysis, only the top grouping of first rounders (picks 1-6) shows significantly greater success than the lower groupings.  Both subsets, hitters and pitchers, also show the highest picks generating the strongest careers.  In addition, pitchers picked 14-20 were better than pitchers picked 21-26, as were hitters picked 7-13 versus hitters picked 14-20.
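A large-sample comparison of two group means of the kind described above might look like the following sketch.  The means and standard deviations are the “All Players” figures from the tables; the group sizes of 30 and 35 follow from six or seven draft slots over five drafts.  The exact procedure the study used is not spelled out, so the unpooled standard error and the function name here are assumptions.

```python
from math import sqrt


def compare_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z_crit=1.96):
    """Return the z-score and the 95% confidence interval for mean_a - mean_b."""
    diff = mean_a - mean_b
    # Unpooled standard error of the difference between two sample means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff / se, (diff - z_crit * se, diff + z_crit * se)


# Picks 1-6 versus picks 7-13 (all players).
z_top, ci_top = compare_means(4.13, 1.96, 30, 2.94, 1.43, 35)

# Picks 7-13 versus picks 14-20 (all players).
z_mid, ci_mid = compare_means(2.94, 1.43, 35, 2.69, 1.66, 35)

print(f"1-6 vs 7-13:   z = {z_top:.2f}, CI = ({ci_top[0]:.2f}, {ci_top[1]:.2f})")
print(f"7-13 vs 14-20: z = {z_mid:.2f}, CI = ({ci_mid[0]:.2f}, {ci_mid[1]:.2f})")
```

An interval that excludes zero corresponds to a difference that is significant at the 95% level; under these assumptions the top bracket clears that bar against picks 7-13, while the gap between picks 7-13 and 14-20 does not.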

Results by position:

All Players

Position    Mean Outcome    Std Dev
C           2.46            1.33
IF          3.87            1.84
OF          2.69            1.69
Pitcher     2.87            1.81
Other*      2.00            1.41
Total       3.02            1.79


The data are less conclusive when sorted by position, but first, these position classifications deserve some background.  Catchers are grouped together easily enough.  All pitchers are lumped together at the draft level, as are all outfielders.  Second and third basemen from either high school or college rarely enter the draft; instead, each draft contains a glut of shortstops (20 of 31 infielders, including first basemen, are listed as shortstops), many of whom switch positions as managers and GMs adjust to the oversupply of able shortstops.  Examples of first rounders drafted as shortstops who are currently stars at other positions include Gary Sheffield, Chipper Jones, and Matt Williams.  Finally, a small percentage (less than 3%) of players are drafted across multiple categories (e.g. pitcher/outfielder, infielder/outfielder).  These are classified as “others”.  Using t-based confidence intervals, the data from the table above show that only infielders have a significantly better chance of future success than prospects at other positions.

Results by experience:

All Players

Experience     Mean Outcome    Std Dev
High School    2.67            1.70
College        3.31            1.83
Total          3.02            1.79

 

For all players, college-bred first rounders significantly outperform players coming out of high school.

 

Hitters

Experience     Mean Outcome    Std Dev
High School    2.66            1.73
College        3.59            1.74
Total          3.14            1.79

 

For the subset of all hitters, college-bred players also outperform their high school counterparts.

 

Pitchers

Experience     Mean Outcome    Std Dev
High School    2.78            1.69
College        3.00            1.89
Total          2.91            1.81

 

However, for pitchers, the results are “too close to call”.

 

Hitters

Experience     Mean Outcome    Std Dev
College C      2.67            1.37
HS C           2.29            1.38
College IF     4.13            1.86
HS IF          3.60            1.84
College OF     3.62            1.61
HS OF          1.77            1.24
Total          3.14            1.79

 

Looking at combinations of position and experience, we see that high school outfielders as a group significantly underperformed the other hitting categories, except catchers.  One limitation of our data was the need to group all catchers together given their small total representation.  Splitting catchers by experience category, seven from high school and six from college, produces too unstable a confidence interval for meaningful results.  Judging all catchers together, we see that they underperformed college infielders, college outfielders, and high school infielders.

 

Up to this point, the data shows that:

 

     

  1. Picks from the top bracket of the first six players chosen outperform lower picks.
  2. Catchers and high school outfielders have significantly weak track records in the majors.
  3. College players consistently achieve greater success than high school players.

 

How many recent top players were first-round draft picks?

Top 25 of Career Leaderboards, 1981-2000, in Selected Categories

Likelihood of x or more players on leaderboard given 2%, 10%, and 20% probabilities of success

Category            First Round    Percentage    2%       10%      20%

Primary Hitting Categories
RBI                 11             44.0%         0.000    0.000    0.004
HR                  10             40.0%         0.000    0.000    0.006
Runs                10             40.0%         0.000    0.000    0.006
SB                  8              32.0%         0.000    0.002    0.047
Average             7              28.0%         0.000    0.007    0.109

Secondary Hitting Categories
Total Bases         10             40.0%         0.000    0.000    0.006
OPS                 9              36.0%         0.000    0.000    0.017
RCAA                9              36.0%         0.000    0.000    0.017
RCAP                9              36.0%         0.000    0.000    0.017
RC                  9              36.0%         0.000    0.000    0.017
OBP                 8              32.0%         0.000    0.002    0.047
Slugging %          8              32.0%         0.000    0.002    0.047
SO                  7              28.0%         0.000    0.007    0.109

Other Hitting Categories
Extra Base Hits     11             44.0%         0.000    0.000    0.004
Isolated Power      9              36.0%         0.000    0.000    0.017

Primary Pitching Categories
SO                  5              20.0%         0.000    0.033    0.383
Wins                5              20.0%         0.000    0.033    0.383
ERA                 4              16.0%         0.000    0.098    0.579
WHIP                3              12.0%         0.000    0.236    0.766
Saves               3              12.0%         0.000    0.236    0.766

Secondary Pitching Categories
Win Percentage      5              20.0%         0.000    0.033    0.383
RSAA                3              12.0%         0.000    0.236    0.766

 

The second part of our analysis determines the representation of first rounders among baseball’s brightest luminaries over the decades.  To further evaluate the success of first-round draft picks, we took a look at the leaders in key statistical categories from 1981-2000 as shown above.

 

First rounders comprise less than 2% of the pool of players entering baseball.  If draft position had no bearing on success, first rounders would therefore make up roughly 2% of each leaderboard.  Instead, the primary hitting categories all show levels at or higher than 28%.  Secondary and “other” hitting categories feature similarly disproportionate shares of first rounders.  The corresponding pitching categories are less heavily weighted toward first rounders, yet all still show shares of 12% or better.

If each player were equal across all draft picks, then the expected number of first rounders in each top-25 list would be .5 (25*.02).  In fact, in every table we see a significantly higher number.  We then applied binomial probabilities in each category to determine the likelihood of the actual value.  In every case, the likelihood was .000, or less than one in 1,000.  To home in on the data, we replaced 2% with 10% and 20%, assuming then that all players were equal over the first 5 and 10 rounds, respectively.  At 10%, every hitting category still showed a likelihood under random selection of less than 1%; the pitching categories, and several categories at 20%, were less extreme.  As a result, we conclude that first rounders are, as a class, more talented than their counterparts, and that a strong correlation exists between top performance and early selection.
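The binomial reasoning above can be reproduced with the standard library alone.  The sketch below computes the chance of seeing at least k first rounders on a 25-man leaderboard if each spot were filled independently with probability p; the function name is ours, and treating the leaderboard spots as independent draws is a simplifying assumption rather than something the study establishes.

```python
from math import comb


def prob_at_least(k, n, p):
    """Upper-tail binomial probability P(X >= k) for n trials at rate p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


leaderboard_size = 25
expected_first_rounders = leaderboard_size * 0.02  # 0.5 players

# Eleven first rounders on the RBI leaderboard under each assumed share.
for share in (0.02, 0.10, 0.20):
    tail = prob_at_least(11, leaderboard_size, share)
    print(f"share {share:.0%}: P(X >= 11) = {tail:.4f}")
```

Swapping in the count for any other category shows how quickly the tail probability grows as the observed count falls toward the assumed share, which is why the pitching categories look far less extreme than the hitting ones.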

 

The 20-year results from these time-honored baseball categories are consistent with the five-year player data previously discussed: first-round hitters and pitchers are more likely to distinguish themselves at the game’s highest level, and of the two player types, hitters provide the bigger impact.

 

Awards Presented 1981-2001

Award                 First Rounders    Total    Percentage

Annual Awards
MVP                   15                42       35.7%
Rookie of the Year    11                42       26.2%
Cy Young              11                42       26.2%

Series Awards
World Series MVP      3                 23       13.0%
LCS MVP               5                 41       12.2%

 

The three highest major-league baseball player annual awards are listed above.  The MVP, Rookie of the Year, and Cy Young Awards honor the top player, rookie, and pitcher, respectively, in each of the National and American Leagues.  Again, first-round players make up a significant percentage of each of the awards of the period.

 

The World Series and League Championship Series awards recipients are simply the series MVPs in the playoffs.  Since many managers will argue that “anything can happen in a short series”, and since the MVP is normally chosen from the winning team, one might expect that first rounders would not make an impact here.  The impact is in fact less, but it is still felt.

 

All-Stars

First Year After    First Rounders    Total    Percentage
1970                116               616      18.8%
1975                104               515      20.2%
1980                90                416      21.6%
1985                67                314      21.3%
1990                44                206      21.4%
1995                22                108      20.4%

First Round All Stars by Pick Number

Pick #    # of All Stars    Percentage of Total    Cumulative # of All Stars    Cumulative Percentage
1         14                11.4%                  14                           11.4%
2         10                8.1%                   24                           19.5%
3         8                 6.5%                   32                           26.0%
4         10                8.1%                   42                           34.1%
5         3                 2.4%                   45                           36.6%
6         7                 5.7%                   52                           42.3%
7         3                 2.4%                   55                           44.7%
8         3                 2.4%                   58                           47.2%
9         4                 3.3%                   62                           50.4%
10        8                 6.5%                   70                           56.9%
11        3                 2.4%                   73                           59.3%
12        2                 1.6%                   75                           61.0%
13        3                 2.4%                   78                           63.4%
14        4                 3.3%                   82                           66.7%
15        4                 3.3%                   86                           69.9%
16        5                 4.1%                   91                           74.0%
17        3                 2.4%                   94                           76.4%
18        4                 3.3%                   98                           79.7%
19        3                 2.4%                   101                          82.1%
20        3                 2.4%                   104                          84.6%
21        6                 4.9%                   110                          89.4%
22        4                 3.3%                   114                          92.7%
23        3                 2.4%                   117                          95.1%
24        1                 0.8%                   118                          95.9%
25        3                 2.4%                   121                          98.4%
26        2                 1.6%                   123                          100.0%
Total     123               100%

   

 

Players are selected to the MLB All-Star Game based on a combination of three factors: their popularity among the fans (8 starting players per league as voted on by the fans, or 16 players per season); their relative success during that specific season and, to some extent, previous seasons (approximately 30 to 35 players per season); and their success relative to the players on their own teams (representing teams that have only their one mandated All-Star, who may or may not have made the team if the mandate did not exist; approximately 10 to 15 players per year).  The choices made by fans and managers represent crowning success on the field and, for many players, off the field too, as popularity can play as big a part in the fans’ selections as achievement does.  To underscore the perceived value (in terms of wins and tickets sold) that teams receive from their All-Stars, it is common for teams and corporate sponsors to reward players with bonus incentives for All-Star selection.  The results of the last 20 years of data indicate a strong correlation between top picks and All-Star selection.  Moreover, picking order within the first round strongly affects a player’s statistical chances of being named to an All-Star team: 42% of the first-round All-Stars were among the first six selections of their year.

 

World Series Wins by teams with more First Round Picks

From X to 2002    Wins    Losses    Percentage
1970              15      9         62.5%
1981              12      6         66.7%
1988              10      3         76.9%

       

 

We fittingly end our look at today’s leaders with the winners of the Fall Classic.  From 1981 through 2002, of the 18 World Series in which one team listed more first-round picks on its season roster than its opponent (there are many ties), the team with more first rounders won 12, or 66.7%.  Since 1988, the numbers are even more remarkable: the team with more first rounders on its roster won 10 of 13 such championships (76.9%), including the Diamondbacks over the Yankees and the Angels over the Giants the last two years.  If only through empirical observation, the high number of individual awards and All-Star selections, along with the consistent characteristics of championship winners, all support our hypothesis.

“Baseball is like church.  Many attend, but few understand.” —George Will

Conclusion:

The scouting process does identify big-league talent at young ages.  Therefore, it is a wise focus for the front offices of baseball.  As a group, first-round selections in the major-league baseball draft have made a tremendous impact on the sport in the last 20-plus years.  Also, within the first round, GMs can look even closer to maximize their benefit and find the true nuggets, as defined by these basic principles…

 

     

  1. When choosing between an infielder and any other position player, take the infielder.
  2. Avoid catchers and high school outfielders.
  3. When choosing between college and high school hitters, take the college player.


“Baseball fans are junkies, and their heroin is statistics” (Robert S. Weider, “In Praise of the Second Season,” 1981)

Evaluation:

The study has been informative for us in several ways.  First, the tests of means and confidence intervals we created supported our hypothesis more persuasively than we had anticipated.

 

When comparing today’s leaders to our sample, we had to use vague, “conservative” percentages of first rounders.  Had the margin in the results been narrower, our estimates would have required greater accuracy.  Other lessons include the growing list of biases, errors, assumptions and limitations that affected our test.  At first, we believed the endless statistical research widely available on professional baseball would help us resolve our issue relatively easily, but we uncovered quite a lot of details to address.  The absence of accepted standards of excellence and the lack of consistency in the raw percentage of first rounders in the majors are examples of details that required time and threatened to alter the results.  Choosing a sample that would remain large even as we narrowed our focus would have provided more conclusive results.  Our initial sample of 130 turned into sample sizes in the teens as we looked at combinations of our factors: experience, position, and pick number.  The insights and lessons together underscore for us the value of statistical research: “there’s always more to be done, and the results may prove surprising.”

 

From here, one might want to reconstruct the analysis in further ways:

 

Naturally, the process can be duplicated for lower rounds, or for groups of rounds.  We resolved only that the scouting process is a wise allocation of a team’s operating budget; further analysis of the return on investment of players at different draft levels could be useful in determining optimal spending levels.

 

In addition to tests of means, we recommend regression analysis to:

 

     

  1. Determine significance levels of the buckets for the independent variables we chose: picking order, position, and experience.
  2. Judge the significance of qualitative variables (injury, popularity, leadership, etc.) as a whole, on the draft selections.
  3. Re-create the analysis without one of the variables to determine the highest correlation values of the variables used, and adjust model accordingly.

 

It is now up to the GMs to determine how best to use this data.  However, they should choose wisely, because if they don’t, they must remember…

“There’s no crying in baseball”—Tom Hanks in “A League of Their Own”

Parting shot:

This analysis showed us how many truly great players came from the club of first-round picks.  The thought of an All-star team of first rounders is irresistible to us, and therefore, we conducted a small survey of our own to determine just such a dream team.  The survey taught us a lot about the effect of a biased sample as our New York-based survey put Derek Jeter onto the team over Robin Yount, and Nomar Garciaparra only received one vote.  We suspect that had we taken the same poll in Boston, Nomar would make the team and Jeter would be left in the cold.

 

The tainted, biased results are as follows:

 

First Round All-Star Team

  • Starting Pitchers: Roger Clemens, Dwight Gooden, Kevin Brown, Frank Tanana, Rick Sutcliffe, Bob Welch, J.R. Richard
  • Relief Pitchers: Billy Wagner, Todd Worrell
  • Catcher: Thurman Munson, Lance Parrish
  • First Base: Mark McGwire, Frank Thomas
  • Second Base: Craig Biggio, Chuck Knoblauch
  • Third Base: Paul Molitor, Chipper Jones
  • Shortstop: Alex Rodriguez, Derek Jeter
  • Left Field: Barry Bonds, Jim Rice
  • Center Field: Ken Griffey, Jr., Dale Murphy
  • Right Field: Dave Winfield, Reggie Jackson
  • Honorable Mention: Bobby Grich, Robin Yount, Will Clark, Jack McDowell, Rafael Palmeiro, Scott McGregor, Steve Howe


BILL DESIMONE is a lifelong Yankees fan, but in all other aspects of his
sporting interests, is a lover of the underdog.  Despite much recent evidence
to the contrary, Bill still believes all small market teams can compete
consistently.  Bill is also in favor of expansion and realignment to fix
baseball’s woes.

DAVID PRINCE offers living proof that not every mediocre southpaw can find
a job in the majors, even if he is a college graduate.  He enjoys writing,
debating and facts of all kinds, although he has long suffered from
degenerative audience syndrome.  He cherishes wasting otherwise productive hot
summer nights listening to Yankee games on the radio and when he’s down on his
luck, he need only recall the Stump Merrill and Rich Kotite years to make
everything else look a lot brighter.

 

Bill DeSimone Posted: July 17, 2003 at 06:00 AM | 10 comment(s)

Reader Comments and Retorts


Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. Chris Reed Posted: July 17, 2003 at 02:27 AM (#612188)
Very Cool stuff. Excellent job. It takes me a while to digest numbers like STDV and all that other jibber jabber ;-) , but very interesting. (Thanks for breaking it down with the key observations as well, those helped me through it.)

Good job.
   2. Chris Reed Posted: July 17, 2003 at 02:27 AM (#612189)
Whoops, i forgot to ask. Could you send me the draft data you used for this piece? I'd appreciate it greatly. :-)
   3. Ben Posted: July 17, 2003 at 02:27 AM (#612197)
No mention of Nomar Garciaparra at SS?
   4. Walt Davis Posted: July 17, 2003 at 02:28 AM (#612208)
I haven't read this closely enough (yet) to know if I have any methodological problems with it. However, I would put 1B (and maybe 3B) in a separate category from 2B/SS. They're really very different kinds of hitters.

More importantly, I'm not sure we can make some of the conclusions offered. If the sample is limited to players drafted 82-86, it's questionable whether we can use these results to say something like "draft the college hitter." That would have been the right decision 20 years ago, but maybe not today. Analyses of more recent drafts (the one on Baseball America springs to mind) suggest there's not much of a gap these days. If true, that lack of gap may well be the result of the reaction to what you found here -- teams used to undervalue college players, recognized they were wrong for doing so (after the success of other teams), and now draft a higher percentage of college players which equalizes the two pools of draftees.
   5. tangotiger Posted: July 17, 2003 at 02:28 AM (#612210)
I second walt's post.

The fielding spectrum can be: p,c,ss,2b,cf,3b,rf,lf,1b (or something close to it). If you wanted to group, i would make it
p.....c....ss/2b/3b...cf/rf/lf/1b

I would prefer to see some more data to calculate the statistical significance. those standard deviations just seem too high to support the conclusions being made, but i might be missing something.

Finally, using runs or wins as a basis would have been a far better choice. Even using Win Shares would have been good.
   6. Bill DeSimone Posted: July 18, 2003 at 02:28 AM (#612212)
Hey guys, great questions! Let me take a shot at answering them as best I can...

1. Eric and Ben - Don't worry too much about who's on and who's snubbed from the all-star team :) This list is truly just for fun - we polled a bunch of NY'ers, so naturally we'd get some NY-skewing answers. The ballot box was most certainly stuffed by those rabid Yankee fans.

2. Eric and FJM - Lefty vs. Righty would seem to make for a great follow-on study. We'll take a look at that in a future version.

3. BK - In calculating the % of potential players entering baseball, we assumed 50 rounds of the draft, plus all of the international players signed. We didn't take into consideration that not all players are in fact signed, nor did we take into consideration whether or not all picks were used. We really were just using it as a directional estimate. The thing we found so interesting was that such a high percentage of hitting categories were led by first rounders. 11 of the top 25 RBI guys (44%) from 1981-2000 were first round picks. Those results are consistent with other hitting categories. Even if we assumed 20% of players entering baseball were first round picks, I think this figure is pretty impressive.

4. Jason - We didn't do any analysis on combinations such as by pick AND position, primarily because there wasn't enough of a sample to give us what we felt would be an accurate read. Upon this closer look, it's certainly possible that the preponderance of IF in the first few picks would skew the data a bit. There are some factors that make me think the conclusion still holds. a. The best pitchers AND the best hitters were taken 1-6. b. Infielders represent a lesser percentage of hitters 14-20 than 7-13, but 7-13 were better than 14-20. For more information though, here were the breakdowns:

Picks 1-6, 3 C, 12 IF, 2 OF, 13 P
Picks 7-13, 5.5 C, 6 IF, 11.5 OF, 12 P (John Russell, 1982, #12, by the Phillies was a C/OF)
Picks 14-20, 5 C, 7 IF, 5 OF, 18 P
Picks 21-26, 0 C, 6.5 IF, 8.5 OF, 15 P (Jeff Ledbetter, 1982, #26, by the Red Sox was a 1B/OF)

5. Jason - Taking Barry Bonds out would have a minimal effect on the related scores, and no effect on our conclusions. OF would drop to 2.52, College Hitters would drop to 3.50, College OF would drop to 3.34, picks 1-6 would drop to 4.03 and hitters picked 1-6 would drop to 4.20. Similarly, taking Roger Clemens out has even less of an impact, as our sample of pitchers is much larger than our sample of college outfielders drafted.

Where Bonds and Clemens do have an impact is in the award areas. Bonds' 4 MVP awards (through 2001) and Clemens' 6 Cy Young awards account for almost 40% of the winners in those 2 categories.

6. ED - While I would agree that we haven't completely solved the "nature vs. nurture" question, I think we've given it a kick in the right direction. First round picks are certainly given more attention than 40th round picks, and I agree that, given the same performance on the minor league levels, they will likely get to the big leagues faster than their later round brethren. That said, they still have to outperform their peers fairly significantly once they get to the bigs to rate higher than a "single" or "double" in this analysis, or to make it to the top of the various leaderboards.

7. Xenophobe - The purpose of this analysis was two-fold: first, to compare first rounders against other first rounders, and second, to compare them with the rest of the league. In this second part of the analysis, international players are most certainly factored in. It would be interesting to see how internationally signed players would rate using the same criteria as this analysis. Anyone know where to get a list of internationally signed players by year?

8. FJM - We absolutely agree that longevity is a key factor. "Aggregate seasons" are actually calculated for the purpose of this study as 1 season = TPAs/502 or IP/162. Even if a hitter's 315 plate appearances were spread over 6 seasons, he would still rate as "Struck Out Swinging" - less than one aggregate season of play.
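
The aggregate-season conversion is simple enough to sketch in a few lines. The 502 and 162 divisors come from the description above; the function names are hypothetical and only for illustration.

```python
HITTER_SEASON_TPA = 502  # total plate appearances per aggregate season
PITCHER_SEASON_IP = 162  # innings pitched per aggregate season


def hitter_aggregate_seasons(career_tpa):
    """Career plate appearances expressed in aggregate seasons."""
    return career_tpa / HITTER_SEASON_TPA


def pitcher_aggregate_seasons(career_ip):
    """Career innings pitched expressed in aggregate seasons."""
    return career_ip / PITCHER_SEASON_IP


if __name__ == "__main__":
    # The 315-plate-appearance hitter from the example above,
    # no matter how many calendar seasons those appearances span.
    print(round(hitter_aggregate_seasons(315), 2))
```

A hitter with 315 career plate appearances works out to about 0.63 aggregate seasons, which is why he stays below the one-season line however many years he appeared in.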

For strikeouts, we used strikeouts 10% or 20% less than the average at their position.

Although there certainly is some level of correlation in the Secondary Mentions area, we were surprised that it did in fact provide just enough differentiation to give us confidence that our rankings and bucketing were pretty accurate. Other researchers could have used fewer categories and reduced the number of mentions needed to qualify for a bucket accordingly.

The run support stat is actually "runs saved against average", and we had included that along with WHIP in the study (that's what was on the blank line in the pitchers box).

This is a good time to thank two folks who helped us with this. First, we relied heavily on Lee Sinins' Baseball Encyclopedia as the source of our data. Lee is mentioned in the first version of the paper, but I think I inadvertently deleted that as I updated and (believe it or not) shortened this to submit to Baseball Primer. Second, big kudos to Dan Symborski for taking our Word doc (and all of its tables) and translating it into HTML so we could share with our fellow Primates!


9. Walt and tango - Our comparisons were done on a position-by-position basis, but we bucketed the results to get numbers we could work with. For example, Shawon Dunston rates a "Grand Slam" because his numbers are far better than other SS of his era, while Will Clark rated a "Home Run" even though many (most?) of his numbers were better than Dunston's. Their individual results are combined to score the infielders. I'll break out the infield as per your suggestions and post back in a little while.

While it would seem that the "Moneyball" brigade would have tipped the scales heavily towards the college players, I'm not sure that's true. Interestingly, from 1982-1986, 55% of players drafted in the first round were college players (64% in 1983-1985) and from 1995-1999, 48% were college players. This year 60% of first rounders were college players. The needle may have moved a bit, but it certainly hasn't made a seismic shift - yet. With 3 of the top 6 picks high school players, I would even argue that the shift may not even be on the radar screen as much as we would think. Taking out Oakland, Boston, and Toronto, only 54% of the remaining picks were college players - right in line with the sample.
   7. Bill DeSimone Posted: July 18, 2003 at 02:28 AM (#612213)
Oops - make that Szymborski - Sorry about that Dan!
   8. Tinkers2Evers Posted: July 20, 2003 at 02:28 AM (#612248)
It's interesting that your findings contradict those of Baseball America. BA did a study of the '90-'97 drafts and they came out with much different results than yours. BA even broke it down per round, and they came out with High Schoolers (27.6) actually having a better chance at becoming a regular or better than College players (26.8) and a much better chance at becoming "good" than College players. They even break it down by position (though not position by round) and find that it is not true that High School catchers and outfielders have less of a chance to make it. In fact they have a better chance. HS catchers are "regular or better" 9.9% of the time while College catchers are at 8.5%, and the HS catchers have a much better chance of being better than a regular. HS outfielders and College outfielders have about the same chances, with HS again having a better chance of becoming stars.

Overall, the study of the first ten rounds shows that college players have a slightly better chance of becoming "regulars or better" (8.8 to 8.4) while HS players have a better chance of becoming "good" or "stars". Certainly you cannot say something like "When choosing between college and high school hitters, take the college player."

Drafting college players does have its benefits, though: not only are they cheaper, but they also develop trade value and make it to the majors quicker. Importantly, they flop less than HS players (61.1 to 71.8), meaning that they are more likely to develop trade value.

IMO, it is not who you draft, but how you develop the players you draft. It is clear that to be drafted all players have to have at least some talent, and the rest depends on how you develop the prospects. Organizations have had success drafting HS or College players (see Oakland & Toronto and Twins and Braves) while others have had success mixing in both (see Cubs and Indians). The thing these organizations have in common is solid player development. Drafting good players is important; developing them is more so.

Here's the address for the page (pretty sure it's subscribers only) - http://www.baseballamerica.com/today/draft/90-97draftbreakdown.html

While it is a nice-looking study, I have to say that I'm skeptical of it.

NOTE: Position player breakdown is not done by round, but done for the first 10 rounds.
   9. Bill DeSimone Posted: July 22, 2003 at 02:28 AM (#612258)
Tinker - more great questions, but I don't know how close a comparison can be made between our study and Baseball America's. There are definitely similarities: 1) they broke down the players into 6 buckets, we broke them down into 7 - that's effectively the same, we pretty much just broke down their "star" category into "star" and "superstar"; 2) they also did a comparison of 1st round college players vs. first round HS players, however we drilled in at a position level to better understand what was driving those results.

From there, however, there are enough differences to question how different our conclusions truly were. First, at a position level we only looked at the first round, while they compiled their positions into the first ten rounds. Thus, they are weighting Jason Schmidt (1991, 8th Round, Braves) the same as Brian Barber (1991, #22 overall, Cardinals) in terms of eventual success, while we would only record Barber's results against HS pitchers and not credit Jason Schmidt's success. This certainly has some effect, although I'm not sure how strong in either direction - to determine this would be to answer the question of how strong picks in rounds 2-10 are.

Finally, 1992's first round skews their data a bit, as the MLB teams drafted, by far, the highest ever percentage of first round college players - 75%. That year only 7 HS players were drafted in the first round, including Derek Jeter, Preston Wilson, Shannon Stewart, and Jason Kendall. At no other time between 1979 and 1999 did either HS or College players make up more than 70% of the drafted players. There were 8 college players from 1992 who spent somewhat significant time in the majors - Phil Nevin, Paul Shuey, Jeffrey Hammonds, Calvin Murray, Michael Tucker, Ron Villone, Ricky Helling, and Charles Johnson. If 7 or more of those 8 are considered "regulars or better" by Baseball America, then I would agree that they would say that in a "fair" draft, HS players are better than college players.

In addition, I think 1997 might be a bit too recent to appropriately assess the full career success of those players. Players like J.D. Drew, Lance Berkman, Troy Glaus and Vernon Wells still have a long time to go in their careers before we can assess their true value. The same is true for pitchers like Kerry Wood (1995, #4, HS) and Matt Morris (1995, #12, College).

For a historical perspective, if the same analysis were performed on the 1986 draft, but in 1992 instead of 2002, Roberto Hernandez (#16, college) would have had the same rating as Scott Scudder (#17, HS), when in fact Mr. Hernandez had a far better career. This of course works both ways, and possibly even favors greater improvement in HS players than college players; how much is a question that cannot yet be answered.

In addition, we're not saying avoid HS'ers completely. We would say, given the choice between two players who appear similar, use the following ranking as criteria for the final decision:

1. College IF
2. College OF
2a. HS IF (they are statistically tied with College OF)
3. Pitcher (with no clear edge between college and HS)
4. Catcher (again, with no clear edge between college and HS)
5. HS OF
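
For anyone who wants to play with the ranking, here is a minimal sketch of applying it mechanically to a set of picks. Several things here are assumptions rather than part of the study: the tuple layout, the function name, treating College OF and HS IF as one tier, and breaking ties within a tier by actual pick number. The players are placeholders, not real draftees.

```python
# Preference tiers from the ranking above; lower is preferred.
# College OF and HS IF share tier 2 because they are described as tied.
TIER = {
    ("College", "IF"): 1,
    ("College", "OF"): 2,
    ("HS", "IF"): 2,
    ("College", "P"): 3,
    ("HS", "P"): 3,
    ("College", "C"): 4,
    ("HS", "C"): 4,
    ("HS", "OF"): 5,
}


def resort_picks(picks):
    """Reorder (name, actual_pick, level, position) tuples by tier.

    Assumption: ties within a tier keep the actual draft order.
    """
    return sorted(picks, key=lambda p: (TIER[(p[2], p[3])], p[1]))


if __name__ == "__main__":
    # Hypothetical top five, listed in actual draft order.
    top_five = [
        ("Player A", 1, "HS", "P"),
        ("Player B", 2, "HS", "OF"),
        ("Player C", 3, "College", "IF"),
        ("Player D", 4, "College", "OF"),
        ("Player E", 5, "HS", "IF"),
    ]
    for slot, (name, pick, level, pos) in enumerate(resort_picks(top_five), 1):
        print(f"{slot}. {name} ({pick}) - {level} {pos}")
```

Keep in mind the ranking is meant as a tiebreaker between players who otherwise appear similar, so sorting a whole draft class this way overstates how much weight it should carry.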

Looking at Baseball America's analysis, they have essentially the same order amongst hitters, with College OF dropped to the bottom. I'd be interested in why we differ on that point, but some of the things I mentioned above probably factor in:

1. College IF (11.85% reg or better)
2. HS IF (9.48)
3. Catcher (8.93)
4. HS OF (7.4)
5. College OF (7.3) (and this is probably a statistical tie with HS OF)

Re-sorting the top 5 draft picks from each of Baseball America's years (1990-1997), we would have drafted as follows (actual pick number in parentheses):

1990 -
1. Chipper Jones (1)
2. Alex Fernandez (4)
3. Kurt Miller (5)
4. Mike Lieberthal (3)
5. Tony Clark (2)

To date, Fernandez definitely is the second best of these five players, and having Lieberthal at catcher is a more valuable improvement over the replacement player than Tony Clark would be over the replacement first baseman. Kurt Miller would have been the mistake here, but 4 out of 5 isn't bad. Near the end of the round, we would have cheered the O's for taking Mike Mussina (#20) while scoffing at the Giants for taking Eric Christopherson (#19, C). A few picks later, we would have cheered the Cubs for taking Lance Dickson (#23, P) while mocking the Expos for taking Rondell White (#24). Hey, nobody's perfect.

1991 -
1-3. Choice between Mike Kelly, David McCarty, and Dmitri Young
4-5. Choice between Brien Taylor and James Henderson

Hard to go right from this group, but all of the hitters made it to the bigs, with Dmitri rating as a "regular", while the pitchers not only didn't get a whiff, but Taylor has become synonymous with draft flops. Probably the only draft pick questioned more in NY is Blair Thomas over Emmitt Smith. Manny Ramirez would have made us look bad later on, when we would have selected Cliff Floyd over him. From the looks of it, we at least would have been with the other 12 teams that passed on him and got virtually no serviceable players in return.

1992 -

1. Phil Nevin (1)
2. Jeffrey Hammonds (4)
3. Chad Mottola (5)
4. Paul Shuey (2)
5. Billy Wallace (3)

An injury-free Nevin is clearly the #1 in this group, with Shuey #2 and growing the gap between him and Hammonds. Mottola and Wallace haven't made an impact in the majors. This one is probably a loss for us, but we would have recommended taking Derek Jeter (#6) over Billy Wallace or Chad Mottola at #5, so perhaps that would have saved our rating.

1993 -

1. Alex Rodriguez (1)
2. Choose between Darren Dreifort, Brian Anderson, Wayne Gomes, and Jeff Granger

Not a lot for us to add here since 4 of the top 5 were college pitchers, but at least we had ARod first. In this draft, we would have agreed with the Astros and taken Billy Wagner over Matt Drews.

1994 -

1. Tony Williamson (4)
2. Josh Booty (5)
3. Paul Wilson (1)
4. Dustin Hermanson (3)
5. Ben Grieve (2)

Another disaster of a top 5 for our MLB counterparts. In this draft, we would have moved Todd Walker and Nomar up from their actual spots.

1995 -

1. Darin Erstad (1)
2. Jose Cruz, Jr. (3)
3. Kerry Wood (4)
4. Ariel Prieto (5)
5. Mark Davis (2)
This one makes us look like geniuses. Had we been even smarter, we would have bumped Kerry Wood up a notch or two as well. Add in that we would have taken Todd Helton ahead of Jamie Jones and Jonathan Johnson, and we really look smart.

1996 -

1. Travis Lee (2)
2. A choice between Kris Benson, Braden Looper, Billy Koch, and John Patterson
Another boring round, with 4 pitchers in the top 5. Deeper in the draft, we would have moved Eric Chavez ahead of Seth Greisinger and Matt White.

1997 -

1. Troy Glaus (3)
2. J.D. Drew (2)
3. Matt Anderson (1)
4. Jason Grilli (4)
5. Vernon Wells (5)

So far, we're pretty close here, but Wells seems to be moving up.

Overall, our analysis would have provided good results in the top 5 of 1990, 1991, 1993, 1995, and 1997. In 1992, we clearly would have done worse. In 1994, with not a lot to choose from in the top 5 actual picks, we would have done worse by moving Ben Grieve down. Finally, in 1996, we would have moved Travis Lee down below the Looper/Koch closers, but in the long run, that may end up okay.

No system is perfect, but we are confident our analysis holds up to fairly deep scrutiny.

I'm in the process of polishing the data for consumption. I've already promised it to a few folks, and will be sending it to them hopefully by the end of this week. If anyone else is interested, drop me an email and I'll send it to you as well.

Thanks,
Bill
   10. Tinkers2Evers Posted: July 23, 2003 at 02:29 AM (#612304)
Thanks for the response; you raise some interesting points. I'd like to take a deeper look into the previous drafts past the first 5 picks:

1990: Yes, the first five do look good with Chipper and Fernandez, but using your criteria you would take Tim Costo over Dan Wilson, Ronnie Walden over Carl Everett, and Tom Nevers over Steve Karsay, though I'll give you Mussina and Burnitz.

1991: You do get some good players, but Eduardo Perez over Shawn Green hurts.

1992: As you mention, your criteria don't come out the best here, with Michael Tucker and Dave Wallace over Preston Wilson and Michael Grigsby over Shannon Stewart; John Burke over Charles Johnson doesn't look good either. It's hard to say how you come out on Eddie Pearson vs. Helling and Kendall because you don't include JC players in your work.

1993: Overall, as you say, not much here, though Jay Powell over Torii Hunter doesn't look too good. I don't count Wagner because Drews was also a pitcher and you don't make a distinction between the two.

1994: It's hard for me to argue this one, though Jayson Peterson over Konerko and Varitek isn't great.

1995: Yes, this one is a good one for you, though again Ryan Jaroncyk over Halladay isn't good. Still, this one is exceptional for you.

1996: Not a whole lot happened that could change for this draft, but Damian Rolls over Gil Meche hurts.

1997: Not bad, though Geoff Goetz and Jason Grilli over Vernon Wells is bad.

The way I see it: 1990 - bad, 1991 - so-so, 1992 - bad, 1993 - so-so, 1994 - pretty good, 1995 - very good, 1996 - so-so, 1997 - pretty good. So I see it like this:
2-bad
3-so-so
2-pretty good
1-very good

It doesn't look that great to me. I think it is very hard, if not impossible, to create a system of who to draft by looking at their positions and HS vs. college; what you need is a good balance of each position, with a little stat crunching and a little scouting. Your analysis does work in some places while it doesn't in others, and I think most systems like that will. Maybe your analysis is a better way of looking at it, but I do not believe that you can make judgments about whether to draft one player or another just by this research. Organizational needs, whether GMs want to admit it or not, do come into play. If there is a decision between an equal IF and OF, it could come down to what the organization is lacking. The way I would use it would be if it came down to a catcher and another position and the team is strong in both areas; they could see that catchers don't work out as much. Besides that, I don't see a whole lot of use for it, as the other positions are too close to make a decision, IMO.

As I said in my last post, I do favor college players a little, though not because of success rates in the majors, but because it's more likely for them to gain trade value and they are cheaper. As I also said in my last post, I do think a lot of it comes down to developing players and not drafting them, and often that is overlooked (though not by organizations).

Yes, the 1992 draft does change things a bit for BA's research, but I don't think it changes the overall numbers that much, especially since the high schoolers' advantage is in the Good and Star sections and this doesn't do too much to change that. Also, who's to say that if more HS players were added to the first round they wouldn't be good also?

While some players may still be gaining value it will not change the results that much as I'm guessing Morris and Wood are already in the good range if not "stars" already.

I would be interested in seeing the data. Thanks.

Still, though, I did enjoy your research as a whole, especially looking at the success rates of first rounders overall and comparing 1-6, 7-13, etc.
