Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Sunday, February 11, 2007

Tommy John

Eligible in 1995.

John (You Can Call Me Grandma) Murphy Posted: February 11, 2007 at 08:27 PM | 28 comment(s)

Reader Comments and Retorts


   1. John (You Can Call Me Grandma) Murphy Posted: February 11, 2007 at 08:31 PM (#2295659)
Since you know I'm not crazy about Sutton, you can probably guess how I feel about Tommy. :-)

With that said, he was a quality player and was a hard guy to dislike personally.

trevise should be posting in about 10...9...8... :-D
   2. BDC Posted: February 12, 2007 at 01:10 AM (#2295787)
True fact: Tommy John was the Opening Day starter for both the 1966 Chicago White Sox and the 1989 New York Yankees. I think he passes that Keltner question about playing regularly past one's prime ...
   3. OCF Posted: February 12, 2007 at 03:06 AM (#2295828)
RA+ equivalent record of 281-244. Compare Sutton at 320-267 and Kaat at 262-241. My "big years score" for pitchers is simply the sum of the equivalent FWP on a season-by-season basis that they had above 15. On that, Sutton scores 21 (which is fairly low as these things go), Kaat scores 13, and John scores 3 (all of it for his equivalent 19-11 in 1979). If Sutton is peakless, what does that make John? I like Sutton, and I gave him a high vote. I see a rather large gap between Sutton and John.

What does John's defensive support look like? He's an archetype of a pitcher who depends on defensive support.
   4. DavidFoss Posted: February 12, 2007 at 03:07 AM (#2295829)
When I was a kid we used to call him "Thomas Jonathan" :-)
   5. Chris Cobb Posted: February 12, 2007 at 03:37 AM (#2295836)
What does John's defensive support look like? He's an archetype of a pitcher who depends on defensive support.

Almost exactly neutral for his career.

Based on RSI with no adjustment for fielding, I have him as 19.5 wins above average for his career. With an adjustment for fielding, I have him at 19.5 wins above average for his career. No change at all.

That'd be 282-243 by OCF's # of decisions, so our assessments of John's career are darn near identical.
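The conversion between wins above average and an equivalent record can be sketched as below. This is only a minimal illustration: the function name is hypothetical, and it assumes an average pitcher would split the decisions evenly, which is not necessarily how OCF or Chris Cobb actually build their records.

```python
def equivalent_record(decisions, wins_above_average):
    """Turn wins above average into a W-L record over a fixed number of decisions.

    Assumption (not stated in the thread): an average pitcher splits the
    decisions evenly, so the record is centered on .500.
    """
    wins = decisions / 2 + wins_above_average
    losses = decisions - wins
    return round(wins), round(losses)


# OCF's 281-244 implies 525 decisions; Chris Cobb has John at +19.5 wins.
print(equivalent_record(281 + 244, 19.5))  # (282, 243)
```

With 525 decisions, +19.5 wins lands one win away from OCF's 281-244, which is why the two assessments read as nearly identical.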

WARP1 shows John's NRA (with defense included) as 4.24, his DERA (defense-neutral) as 4.25, so their defensive measures are pretty much in agreement with mine in this case.

I have John between Kaat and Sutton. He has less peak even than Kaat, but his career is significantly better. He trails Sutton on both career and peak.

Sutton I have as a solid HoMer, Kaat as slightly but clearly below the in-out line, with John as right on the borderline. I am very interested to see what more analysis turns up, as John could now end up anywhere between 10 and 25 in my rankings.
   6. Howie Menckel Posted: February 12, 2007 at 03:48 AM (#2295839)
Sutton vs John vs sample HOM group:

RWaddell 179 79 65 53 26 25 23 21 07 02
Marichal 169 66 65 44 32 22 19 16 13 00
JBunning 150 49 43 42 34 32 29 14 14 04
BiPierce 201 48 41 36 33 24 15 13 08 07 07 05 04 03
Drysdale 154 49 40 29 28 22 18 17 15 13
EarlWynn 154 42 36 35 26 18 15 10 09 03
EppRixey 144 43 43 39 36 29 24 15 15 13 10 09 09 06
DoSutton 161 59 42 27 26 21 19 12 11 10 10 07 06 02 01
TommJohn 154 38 38 37 25 20 19 19 16 14 11 10 09 09 06 03 00

John's 154 comes in 1968 in 177 IP, which is quite low for that season. He has only one top-10 IP season above 120 ERA+.

RWaddell top 10 in IP: 3 4 4 10
Marichal top 10 in IP: 1 1 3 5 5 6 8 8
JBunning top 10 in IP: 1 1 2 2 3 3 4 5 6 8
BiPierce top 10 in IP: 3 3 3 5 5 7
Drysdale top 10 in IP: 1 1 2 2 4 5 5 5 9 9 10
EarlWynn top 10 in IP: 1 1 1 2 2 3 3 6 6 6 6 7
EppRixey top 10 in IP: 1 3 3 3 4 7 8 8 9 9
DoSutton top 10 in IP: 5 5 5 7 8 9 9 9 10
TommJohn top 10 in IP: 2 5 8 10

Throwing in a couple of contenders
BWalters 168 52 46 40 27 23 07
LuiTiant 184 69 32 28 25 20 19 05 02 02 00
BuGrimes 153 44 38 36 31 23 08 08 08 03
DoSutton 161 59 42 27 26 21 19 12 11 10 10 07 06 02 01
TommJohn 154 38 38 37 25 20 19 19 16 14 11 10 09 09 06 03 00

BWalters top 10 in IP: 1 1 1 4 6 6 8 8
LuiTiant top 10 in IP: 6 7 8
BuGrimes top 10 in IP: 1 1 1 3 3 4 7 9 9 9
DoSutton top 10 in IP: 5 5 5 7 8 9 9 9 10
TommJohn top 10 in IP: 2 5 8 10

Kaat didn't even make these lists; maybe that's a good battle, but not to get onto my ballot.
You need peak and/or workhorse to get into the consideration set. This is a nice pitcher who often pitched well, but did not pitch that often.
   7. Dag Nabbit at Posted: February 12, 2007 at 05:09 AM (#2295863)
Does he get any credit for putting himself through an experimental surgical technique that now bears his name? That's one Keltner list question.

In June-July 1967, in the early phases of one of the tightest pennant races ever, Tommy John started 11 games, completing six - five for shutouts. In 71.3 IP he walked 16 and allowed 43 hits, only one of which was a homer. Meanwhile, he K'd 45. Only 11 of the 14 runs he allowed were earned, for a 1.39 ERA. Over two months. Incredibly, he only went 5-4 in those games. The Sox scored 22 runs in those 11 games; 6 in one day. (looks closer). From June 4 to July 8 he allowed 7 ER in 64 IP for a 0.98 ERA. He then lasted 0.7 IP on July 12 (only 2 runs allowed). Out there two days later he allowed 3 runs (1 earned) in 6.3 IP. He faced only two batters in his next start before leaving with what I can only assume was an injury. He didn't pitch for four weeks. When his great stretch ended on July 8, the Sox were in first by 3 games. By the time he came back, they were two games out of first.
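The ERA figures above can be checked with a short sketch. It assumes the innings are written in the notation this post uses, where .3 means one third of an inning and .7 means two thirds; the helper names are hypothetical.

```python
def innings(ip_notation):
    """Convert the thread's IP notation (71.3 = 71 1/3, 0.7 = 2/3) to true innings."""
    whole, _, frac = str(ip_notation).partition(".")
    # Only the fractional digits used in this thread are supported.
    thirds = {"": 0, "0": 0, "3": 1, "7": 2}[frac]
    return int(whole) + thirds / 3


def era(earned_runs, ip_notation):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings(ip_notation)


print(round(era(11, "71.3"), 2))  # 1.39, the full June-July stretch
print(round(era(7, "64"), 2))     # 0.98, June 4 to July 8
```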

And in maybe the game's most overlooked facet in the HoM, in 9 postseason series he went 6-3 with a 2.65 ERA in 88.3 IP.
   8. John (You Can Call Me Grandma) Murphy Posted: February 12, 2007 at 11:23 AM (#2295896)
Does he get any credit for putting himself through an experimental surgical technique that now bears his name?

Yes, Chris. I give him credit for everything he did after the surgery.

Beyond that? No. ;-)
   9. DL from MN Posted: February 12, 2007 at 02:51 PM (#2295939)
I thought it was interesting that Tommy John and Jack Quinn ended up side by side in my rankings.

I agree that postseason performance is generally overlooked here which is a shame. The problem is there is no systematic database of valuations of postseason performance to pull from. I'd hate to forget to give credit to someone. I'd love to add PRAA, FRAA and BRAA credits for everyone but I've never bothered with calculating them myself.
   10. The Honorable Ardo Posted: December 06, 2012 at 05:59 AM (#4318400)
Bump for Tommy John (from the 2012 ballot discussion thread):

Badly underestimated by the electorate. Low K/9 rates, but inducing double plays is a repeatable skill and John did it as well as anybody. Above the HoM threshold for starting pitchers.

John through his age-39 season (1982): 3709 IP, 118 ERA+
Rick Reuschel, career: 3548 IP, 114 ERA+

John has better defensive support than Reuschel, and his career is centered earlier in the '70s (easier to accumulate IP). Account for those factors and the two are of equivalent merit.

Age 40 and up, John has exactly 1000 IP of 92 ERA+, which "zeros out" in terms of HoM value; it shouldn't help or hurt his case.
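How the late-career innings pull down the career rate can be sketched as follows. This is a rough illustration, not Ardo's method: the function name is hypothetical, and it assumes league and park context were the same in both segments, which is only approximately true. Under that assumption ERA+ is proportional to 1/ERA, so segments combine as an innings-weighted harmonic mean.

```python
def combined_era_plus(segments):
    """Combine (innings, ERA+) segments into a single ERA+.

    Assumption: identical league/park context in every segment, so that
    ERA+ behaves like 1/ERA and an innings-weighted harmonic mean applies.
    """
    total_ip = sum(ip for ip, _ in segments)
    return total_ip / sum(ip / era_plus for ip, era_plus in segments)


# Through age 39 (3709 IP, 118 ERA+) plus age 40 and up (1000 IP, 92 ERA+).
john = [(3709, 118), (1000, 92)]
print(round(combined_era_plus(john), 1))  # 111.3
```

The result sits close to the career ERA+ figures of 110 and 111 quoted elsewhere in this thread, so the drop from 118 is almost entirely the effect of the extra 1000 innings.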
   11. AndrewJ Posted: December 06, 2012 at 07:53 AM (#4318413)
He pitched during seven different U.S. Presidencies. Someone in MLB doing that today would've had to start playing in 1976. :)
   12. Bleed the Freak Posted: January 17, 2015 at 09:11 AM (#4882559)

Link to an article penned by Kiko reiterating Ardo's point.

An overview of pitchers in general:

Kiko, can you share with us some insights from your system? This shows that John is criminally underrated and should be elected well before Luis Tiant, someone who has been on the cusp of election previously.

Looks like situationally he was more effective while also excelling in post-season appearances.
   13. Bleed the Freak Posted: January 17, 2015 at 09:50 AM (#4882578)
John by other systems:
Worthy by Fangraphs, borderline or in by Joe D's P, WPA, and Chone's old WAR, a little short with baseball-reference, and well short by seamheads and baseball prospectus.
   14. Kiko Sakata Posted: January 18, 2015 at 11:07 PM (#4883692)
Kiko, can you share with us some insights from your system? This shows that John is criminally underrated and should be elected well before Luis Tiant, someone who has been on the cusp of election previously.

Looks like situationally he was more effective while also excelling in post-season appearances.

My system seems to like starting pitchers more than a lot of other systems, especially starting pitchers who are a little bit above average (my system likes pitchers who are a lot above average too, but every system likes the Roger Clemenses and Greg Madduxes of the world). I think the key is that the translation from player performance to team performance isn't exactly linear: teams that are a little bit above average everywhere will win a lot of games. I think these two articles on my site probably explain this best.

Basically, being slightly above average has a somewhat multiplicative effect on team wins, then, which is picked up in my numbers. This effect is most pronounced for starting pitchers, because they concentrate their performance more heavily within individual games.

But, that said, I'm not entirely sure how much all of that matters with respect to Tommy John. Even setting aside my numbers and my system (not that I disagree with my system or think it should be set aside, mind you), look, for example, at the 10 players most similar to Tommy John at Baseball-Reference. Eight of his 10 sims are Hall-of-Famers. Now, granted that includes Burleigh Grimes. But compare John to Early Wynn: John has more career IP (4,710 to 4,564) and a higher career ERA+ (110 to 106). It looks like y'all elected Early Wynn, but not Tommy John (although, Wynn was elected long before John was eligible, so admittedly nobody ever had to compare the two directly on the same ballot).

Or, just look at what John did from age 23-39: he pitched 3,411.1 IP, ERA+ of 119. Compare that to Jim Bunning from age 24-38 - 3,599.1 IP, ERA+ of 119. And that's excluding a handful of seasons (at both ends) where Tommy John was still a useful pitcher. For Bunning, that's excluding the first and last seasons of his career, when he put up ERA+'s under 70.

Or, since you mentioned him, Luis Tiant: 3,486.1 IP, ERA+ of 114. I don't know: to me, looking at raw numbers, I guess I don't understand why you'd prefer Tiant to John. So, I'm not sure how to explain why my system likes John better beyond wondering why other systems like Tiant better.

I will say, I was a little surprised when I first put my numbers out there how good John looked and I thought that maybe it was a DIPS thing. But John actually wasn't all that great at preventing hits on balls in play (my system does think he's among the best pitchers of the past 70 years at preventing extra-base hits) and on his BB-Ref page his ERA (3.34) doesn't out-perform his FIP (3.38) by a remarkable amount or anything. He wasn't a huge K guy, but he didn't walk people and he kept the ball in the ballpark.

Going back to my system, John does look better in context (tying his wins to team wins) (28.1 career pWins over positional average - pWOPA - and 53.3 pWins over replacement level - pWORL) than out of context (25.2 eWOPA, 50.7 eWORL), but that's a difference of about 3 wins over a 25+ year career, and even out of context, my system thinks he's an excellent pitcher and among the top 40 players in both eWOPA and eWORL among players for whom I have calculated Player won-lost records.
   15. Chris Cobb Posted: January 19, 2015 at 09:27 AM (#4883784)
A couple of comments about John:

The biggest knock against him isn't his career value but his lack of peak performance and big workloads. He placed in the top 10 in his league in IP only 4 times (2nd in 1979, 5th in 1980, 8th in 1970, and 10th in 1966) and in the top 10 in ERA+ only 6 times, only two of which overlapped with his top 10 IP finishes (7th in 1966 and 3rd in 1979). For peak voters who are looking for players who were among the best at their position in a given season, John isn't going to do well.

Early Wynn thus has three big edges over John: (1) he was an in-season workhorse, which John never was, (2) he was a good hitter, which John never was (Wynn picks up 10 WAR over John with his hitting); (3) he pitched in a tougher era for pitchers than John did.

What I don't understand about John is why his BB-Ref pitching WAR doesn't match his Fangraphs pitching WAR more closely. Fangraphs shows us why his FIP-based WAR (75.2) and his RA-WAR (73.2) are very close by showing that although he gave up a lot of hits, he was good at keeping runners on base from scoring (lots of DP, low extra-base hits), so that his actual runs allowed is very close to his FIP-RA. When we flip over to BB-Ref, we find that, for his career, John's defenses were pretty close to average--they improved his RA average by .03 runs/9 IP over his career. This would suggest, to me, that his WAR in BB-Ref would be very close to his RA-WAR at Fangraphs, but it isn't: it's 11 wins lower (62.3). I don't know where that difference comes from, and it's very significant to my assessment of John. A player with basically no peak (like John), needs to have around 70 career WAR to be a serious candidate in my system. Fangraphs sees John as having that, BBRef doesn't, and I don't know the reason for the difference--it seems like it must be a difference in how they are assessing his context--where they set replacement level or how they calculate his run-environment or the quality of his opponents, or something like that. If anybody knows what's going on here, I'd love to know.

   16. DL from MN Posted: January 20, 2015 at 11:03 AM (#4884479)
Tommy John is a pure compiler case. MMP has covered his career and his best finish (and only votes) was a 17th place finish in 1979. Tiant was listed on 3 MMP ballots and was top 10 in 1974. Early Wynn has won pitching MMP awards.
   17. The Honorable Ardo Posted: January 28, 2015 at 02:20 AM (#4888934)
Chris Cobb and DL are both right: Tommy John's low peak makes him a difficult case.

I looked at his BB-Ref page for top-10 seasonal Pitching WAR finishes (not that it's the perfect stat, but just as a rough-and-ready way to compare pitchers to their contemporaries), and John has a 3rd, two 6ths, and a 7th. By Hall of Merit standards, this is poor.

As a chart, starring John and several comparables:
Tommy John 3 6 6 7
Jim Kaat 3 4 5 7 8 (John has more "slightly above average" seasons than Kaat)
Luis Tiant 1 3 4 6 6 7
Don Sutton 2 3 5 9
Andy Pettitte 2 4 8 (future candidate, on the bubble)

Close to John in career pitching WAR, but not comparable due to much stronger peaks:
Juan Marichal 1 2 3 3 4 6
Rick Reuschel 1 2 3 3 4 5 (more deserving than John; we did well here)

I still support John's induction, but not as fervently as I once did.
   18. Bleed the Freak Posted: January 08, 2021 at 09:41 AM (#5998232)
Tons of discussion on John, part 1 of 3:

11. Jaack Posted: January 13, 2020 at 09:43 PM (#5915434)
14. Buehrle (new) - superior to John in era context.

How are you determining this? Buehrle's career ERA+ is 116. Tommy John through 1981 had an ERA+ of 119. That's longer than Buehrle's entire career at a (marginally) better rate.

But John then added another 1200 innings of league average pitching.

I can see liking Buehrle as a candidate, but I really can't see him above Tommy John.

14. Kiko Sakata Posted: January 14, 2020 at 01:54 PM (#5915676)
Tim Hudson looks quite good in my Player won-lost records. My WORL are on a similar scale to the various WAR's and I get Hudson with 64.1 eWORL and 64.5 pWORL (so note that context isn't doing it there; my system just legitimately likes him). My system is less impressed with Mark Buehrle: 46.3 eWORL and 49.0 pWORL. The former will almost certainly make my ballot, probably in an elect-me slot. The latter will almost certainly not.

For WAR users, I also want to make a suggestion. First, you should look at converting to my Player won-lost records, which are much more robust and, I think, much more informative (here's the build-your-own 2021 HOM ballot link).

But second, be sure to compare Baseball-Reference and Fangraphs. For pitchers, in particular, Fangraphs calculates WAR two ways: using FIP (which is their default) and also using actual runs allowed (RA9). On the player page, click "Value" and they'll both show up - e.g., here's Tommy John. I use Tommy John here not coincidentally.

I've mentioned repeatedly things to the effect that "my system loves Tommy John". This is true, but I think it makes it sound like my system is more of an outlier than it really is (aside: I think my system is more of an outlier on Tim Hudson, for example - although his RA9-WAR at Fangraphs is 63.0). I have Tommy John at 62.3 eWORL and 71.0 pWORL - so he does look a fair bit better in context. But while Baseball-Reference only shows John with 61.5 WAR (with, I believe, a slightly lower replacement level than I have), Fangraphs shows him with 79.4 fWAR based on FIP and 72.3 fWAR based on RA9 (with the same replacement level as BB-Ref).

Now, I understand that the shape of Tommy John's career is less appealing to a lot of people than, say, Johan Santana's career, where the latter was the best pitcher in baseball for a few years, and that's a personal opinion that I'm not inclined to argue too strongly against (largely because I share it to some extent). But bottom line: I think people should take another look at Tommy John.

19. Rob_Wood Posted: January 14, 2020 at 05:47 PM (#5915764)
I think as a whole the electorate leans a little too much on BB-Ref WAR. It's basically the only major metric that doesn't think Tommy John is a very strong candidate.

I'd like to respond to this. By way of background this project has been going on for 17 years. We have had many different voters over the years but virtually all of them have taken the project very seriously and filled out their ballots in a serious and thoughtful manner. "High level" stats (such as WAA and WAR) have become popular since the project began. One of the things that HOM founder Joe Dimino did each year was calculate/present the "Pennant Added" figure for each new candidate on that year's HOM ballot. Of course, Pennant Added reflects the cumulative "pennant impact" of a player's seasonal WAR and WAA values.

The reason I mention Pennant Added is that I know of no voter, now or in the past, who has simply followed WAR (any version) when constructing his ballot. The reason is obvious. Cumulative career WAR is wonderful but it never ever tells a complete story of a player's worth. Every voter acknowledges, either implicitly or explicitly, that a player's "Peak Value" is important as well as his "Career Value". Of course, these terms have multiple interpretations but I am comfortable saying that a player's peak value is reflected in his WAA whereas his career value is reflected in his WAR, even though this is only a shorthand interpretation.

Tommy John has one main thing going for him. Career Value. The man pitched forever. When measured relative to replacement, pitching forever is going to accumulate a ton of "career value". However, Tommy John's career WAA figure is not very high (especially relative to his career WAR figure). A hypothetical pitcher who racks up 3 WAR in each season of a 25-year career will total 75 career WAR. But he won't accumulate a high career WAA figure and would be unlikely to be voted into the HOM. And that makes perfect sense to me.

John retired in 1989 and came onto the HOM ballot in 1995. He was named on 5 ballots and came in 44th place. There were 54 voters that year and John was named on 5 ballots. Read that sentence again. That does not give the impression that John is a "very strong" candidate. Since I have it in front of me, here is how Tommy John has fared on each of his HOM ballots.
Year Place OnBallots
1995 44 5
1996 44 6
1997 44 6
1998 41 6
1999 35 7
2000 41 6
2001 44 5
2002 39 6
2003 41 6
2004 37 7
2005 34 8
2006 29 9
2007 30 8
2008 35 7
2009 39 5
2010 40 5
2011 35 5
2012 32 5
2013 30 5
2014 33 4
2015 30 4
2016 25 5
2017 21 7
2018 22 6
2019 23 5
2020 19 8

I had Tommy John 15th on my 2020 ballot (and I think on one other earlier ballot as well). He bounces around between 15 and 25 for me. Since 2021 looks to be another "weak" ballot, I will likely have him near the bottom of my 2021 ballot as well. And I heartily join in encouraging other voters to give another look at Tommy John.

23. kcgard2 Posted: January 14, 2020 at 05:57 PM (#5915773)
I want to echo what Jaack and Kiko have said about Tommy John. If you look at metrics other than bWAR, he appears to be an all-time pitcher. #21 all-time by fWAR for example. That's *seriously* up there in all time ranks. I also understand the stance of favoring peak over longevity, but John did have a peak that, depending on how you create a metric to measure it, may get diluted by the fact that John pitched *a ton of innings and seasons.* If a player is getting dinged simply for playing really long, that's not ideal. Longevity is a plus, not a minus, all else being equal.

As an aside, I've seen a number of people refer to peak/prime in consecutive seasons. I have always wondered why we do/should care if top seasons are consecutive or not. It seems very arbitrary to me and I'd like to hear views on why this is deemed important, by those who think it is. It is, incidentally, something that would affect Tommy John pretty heavily, as well.

25. kcgard2 Posted: January 14, 2020 at 06:25 PM (#5915778)
More on Tommy John. If you calculate WAA based on his fWAR, he has 38.9 WAA. If you calculate based on his RA9 WAR he has 31.7 WAA. Also, longer careers in general will suffer on WAA unless you zero out negative seasons. Pretty much any look at WAA apart from specifically bWAR is quite complimentary of John as a HOM candidate.

I would also quibble with the idea that WAA equals peak value while WAR equals career value as a shorthand. The reason being that career length will deteriorate WAA even while the player provides positive value at the beginning or end of a career. Imagine two players, one of them plays ages 24-32 at 60 WAR and 30 WAA. That's his whole career and then he gets injured and is done. The second player has an identical career to the first from ages 24-32. Perfectly identical. But he also came up at age 20, and played until age 38, and in those extra 10 years he had -5 WAA (and 16 WAR to make up a number to show he was still adding value as a player those years). Now this player by the shorthand metric appears to have a worse peak than the first player, which is emphatically not the case. He simply had a longer career and so the early and decline phases hurt him on career WAA. Apart from the greatest of all time players, this pattern holds almost universally. Basically, the shorthand punishes players for having long careers even if their peaks are the same as short-career players.
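The two-player example above can be written out as a small sketch. The numbers are the made-up ones from the post; the variable names are hypothetical, and flooring the whole non-prime block at zero is a simplification of the season-by-season zeroing-out mentioned earlier.

```python
# Both players share the same prime; only the second has the extra years.
prime = {"war": 60, "waa": 30}   # ages 24-32
extra = {"war": 16, "waa": -5}   # ages 20-23 and 33-38

short_career = dict(prime)
long_career = {key: prime[key] + extra[key] for key in prime}

print(short_career)  # {'war': 60, 'waa': 30}
print(long_career)   # {'war': 76, 'waa': 25}

# Simplified version of "zeroing out" below-average seasons: floor the
# non-prime block at zero instead of letting it subtract from the prime.
floored_waa = prime["waa"] + max(extra["waa"], 0)
print(floored_waa)  # 30, the same as the short-career player
```

On raw career WAA the longer career looks worse (25 against 30) even though it adds 16 WAR, which is the distortion the post is objecting to.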

27. Rob_Wood Posted: January 14, 2020 at 07:47 PM (#5915800)
Well, I guess this is now the Tommy John thread. Here are my career Win Values Pennant Added figures for the top pitchers of the Retrosheet era not currently in the HOM.
Tim Hudson 1.20
Johan Santana 1.19
Roy Oswalt 1.15
Kevin Appier 1.13
Tommy Bridges 1.12 (w/o WWII credit)
Ron Guidry 1.02
Bucky Walters 1.01
Jimmy Key 0.99
Andy Pettitte 0.95
Tommy John 0.92
Chuck Finley 0.90
Dwight Gooden 0.86
Vida Blue 0.85
Orel Hershiser 0.85

The WVPA stat has nothing to do with BB-Ref's WAA or WAR figures.

When a "Pennant Added" metric is applied to Tommy John's career, his HOM case seems to be greatly diminished which, of course, is perfectly understandable since he had a very long low-peak career.

29. Chris Cobb Posted: January 14, 2020 at 08:10 PM (#5915808)
Good discussion here, as always!

Re the reasons to value "consecutive peak":

kcgard2 wrote: As an aside, I've seen a number of people refer to peak/prime in consecutive seasons. I have always wondered why we do/should care if top seasons are consecutive or not. It seems very arbitrary to me and I'd like to hear views on why this is deemed important, by those who think it is. It is, incidentally, something that would affect Tommy John pretty heavily, as well.

I use two measures of peak value--one that doesn't measure peak consecutively, and one that does. Consecutive peak isn't everything. But there are two reasons that I find it important to include a consecutive peak measure. One is that the ability to repeat a skill is both highly important and hugely difficult in baseball. Consecutive peak is a meaningful indicator of repeatability, so I am more confident that a player's achievements aren't heavily influenced by fluke circumstances if he has performed at a similar high level for several years running. (To put it in statistical terms, it establishes a higher mean performance for the player.) The other is that a player being able to reliably perform at a very high level has value for successful team building.

In a less value-based way, I think it's one of the characteristics that makes a great player a great player from the standpoint of watching the game: baseball fans develop high expectations for a player, and he goes out and meets or exceeds those expectations, several years running.

Overall, I weight peak, with a consecutive component, more highly than a pure "pennants added" approach would do, because I think it adds both value and merit beyond the wins we can account for directly from the player's actions.

Tommy John is not helped in my system by his best seasons being scattered around his career.
   19. Bleed the Freak Posted: January 08, 2021 at 09:42 AM (#5998233)
Tommy John part 2

40. Jaack Posted: January 15, 2020 at 01:26 AM (#5915863)
Some telling comparisons:

Career ERA+:
Santana 139
Tim Hudson 120
Mark Buehrle 117
Tommy John 111

WAR7:
Vic Willis 49.8
Santana 45.0
Roy Oswalt 40.3
Tommy John 34.6
Jack Morris 32.5

ERA+ over the course of a career is inherently biased against John for pitching longer than everyone else on the list. Tommy John had an ERA+ of 119 through 1981, at which point he had pitched more than Buehrle or Hudson. Should we penalize him for pitching eight more years at an average rate? That's absurd.

The WAR7 you are using is, I assume BBRef WAR, which is the metric John does the worst in. Here are those five by FIP-WAR7

Oswalt - 38.4
Santana - 37.2
John - 33.8
Willis - 31.9
Morris - 31.1

Oswalt and Santana still lead, but not substantially. Of course, WAR7 is a blunt and arbitrary tool to define peak. Dwight Gooden had one of the highest peaks in baseball history, and his bbref-WAR7 is 36.0. That's only 0.2 per season better than Tommy John! Who supposedly didn't have a peak!
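The per-season figure is just the WAR7 gap spread over the seven seasons. A quick sketch, using Gooden's 36.0 and the 34.6 listed for John in the quoted comparison (the function name is hypothetical):

```python
def per_season_gap(war7_a, war7_b, seasons=7):
    """Average per-season difference between two WAR7 totals."""
    return (war7_a - war7_b) / seasons


# Gooden's bbref WAR7 against John's bbref WAR7.
print(round(per_season_gap(36.0, 34.6), 1))  # 0.2
```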

43. Michael J. Binkley's anxiety closet Posted: January 15, 2020 at 10:03 AM (#5915914)
I guess I am a relative EoTJ.

I have Tommy John around 40th of current eligible, not-PHoM players. bWAR is not the only metric that doesn't like him. gWAR likes him even less, only giving him a total of 55.7 WAR. Now I incorporate both of those, as well as some FIP and Kiko's pWins to arrive at my mWAR numbers. The latter two metrics help him in my system. And he is also boosted by the fact that other than his last 3 years, all but 5 seasons in his career, he played in lower than average standard deviation leagues for pitchers.

Despite all this, he only ends up with a career salary estimation of $93,582,235 where $100M (a 100 PEACE+ score, counting postseason bonuses, if applicable) is what I would consider as the bottom of my ideal PHoM. His peak, especially when considered versus other HoM candidates, is non-existent. He has ten seasons above 3 mWAR, only seven above 4 mWAR, and, most damning in my system, only two above 5 mWAR (5.4 in 1970 and 6.0 in 1979).

Longevity is admirable, but it doesn't make you a great player - it makes you a durable player. For me, Tommy John just wasn't great enough for long enough to be a HoMer in my eyes.

255. Chris Cobb Posted: April 20, 2020 at 12:30 AM (#5942109)
So I guess the question I have to ask is - what do Tommy John and Dwight Gooden have in common? I'd think nothing, but there seems to be some quality about both that the metrics not based on RA seem to like.

Well, they don't have much in common, but one thing that they do have in common, at least as far as Gooden's 1988 and 1990 seasons are concerned, which are the two seasons where RA/9-based measures and FIP-based measures disagree most significantly, is that they gave up a lot of hits on balls in play.

By Fangraphs' calculations, Tommy John is 10.7 wins below average on balls in play for his career, which accounts for the entirety of the difference between his RA/9-WAR and his FIP WAR. He is below average on BIP-wins for 17 out of 26 seasons of his long career, sometimes by quite a bit, although he has only one season where he is more than 2 wins below average, -2.5 in 1988. This is a pretty well known feature of John's pitching profile, and it is not an uncommon one for groundball pitchers.

Power pitchers like Gooden are frequently above average on balls in play. Gooden himself is not, although he is much less below average than Tommy John. Although Gooden, like John, has a large gap between his FIP-WAR and his RA/9-WAR, BIP wins below average account for a minority of the gap, -2.6 wins. He is also 3.8 wins below average on "LOB-Wins" which is the grab-bag of unanalyzed factors that Fangraphs places under the heading of "sequencing" -- the factors that lead to runners scoring at higher rates than average once they reach base. It is not uncommon for power pitchers to be below average here, while finesse pitchers are more often above average. Nolan Ryan is, of course, the poster child for the phenomenon. Gooden is a member of what is probably the smallest subset of elite pitchers: pitchers who are highly successful overall despite being below average on balls in play and on preventing runners from scoring once they reach base. He is not far below average in either one.

The question for evaluating Gooden's case is, of course, are these below average results his responsibility? Or, more precisely, how much of these results are his responsibility? The fact that, as Baseball-Reference sees it, he pitched in front of below average defenses for his career is probably a contributing factor, but is it the whole story?

I don't have an answer to that question, but one very intriguing feature of Gooden's profile in this respect is that his BIP-wins are highly inconsistent. For most of his career, he is around average, and he is in fact well above average in 1985 and 1986. From 1988-91, however, he is far below average, especially in 1988 and 1990, the two seasons of disputed quality. In these seasons, he registers -2.1 and -2.7 BIP WAR respectively. Where a finesse pitcher like John typically recoups some of his negative BIP WAR with positive LOB WAR (from double plays, outs on the base paths, pickoffs, etc.), Gooden is below average on LOB-wins, just as he is, in modest ways, for most of his career. Overall, in 1990, he loses 4.1 wins between his FIP-WAR (he led the league in FIP era) and RA/9-WAR. That's a very big swing; I think shifts of that magnitude are probably found in less than 1% of pitcher seasons (at least among elite pitchers). So what happened?
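The 1990 swing can be split into its two pieces with a short sketch. It assumes the gap between RA/9-WAR and FIP-WAR is exactly the sum of BIP wins and LOB wins, which is how the two columns are described in this post; the function name is hypothetical, and the LOB figure it produces is derived here rather than quoted from Fangraphs.

```python
def lob_wins(ra9_minus_fip_war, bip_wins):
    """Back out LOB (sequencing) wins from the total gap and the BIP wins.

    Assumption: RA/9-WAR = FIP-WAR + BIP wins + LOB wins, with nothing else
    contributing to the gap.
    """
    return ra9_minus_fip_war - bip_wins


# Gooden, 1990: 4.1 wins lost overall, 2.7 of them on balls in play.
print(round(lob_wins(-4.1, -2.7), 1))  # -1.4
```

Under that assumption roughly two thirds of the 1990 gap comes from balls in play and the rest from sequencing.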
   20. Bleed the Freak Posted: January 08, 2021 at 09:43 AM (#5998234)
Tommy John part 3:

263. Jaack Posted: April 21, 2020 at 03:19 PM (#5942756)
I won’t dispute that FIP-WAR is not perfect. But it’s approaching the question from the opposite direction as RA9-WAR. While FIP-WAR makes an error of exclusion, RA9-WAR makes an error of inclusion – it’s including a ton of information that is not in a pitcher’s control – defense and variance, in exchange for incorporating the pitcher’s influence on batted balls (although FIP WAR does account for IFFB). RA9 is essentially a total defense metric, but since pitchers make up the majority of defensive value, it is valuable as a pitching metric.

The Fangraphs framing of WAR is

RA9=FIP+BIP+Sequencing

But for our purposes, this is still a lot of noise. Both BIP and Sequencing are things a pitcher has some control over, but are also dependent on defense and random variance. FIP is almost completely under a pitcher's control. A more helpful model for us would be something like

RA9=FIP+Defense+Luck+Other Pitcher skills

We are most concerned with a pitcher’s value, which we could arrive at either by constructing it (FIP+Other Pitcher Skills) or by deriving it from the total defensive value (RA9-(Defense+Luck)).

But both are going to arrive at the same place. I think it makes more sense to start from FIP and try to incorporate a pitcher’s other skills than to start from RA9 and remove the aspects a pitcher has no control over. The primary issue is the random variance, which takes up a large portion of the difference, at least as far as BIP goes. Tangotiger goes into this a bit here in the section titled Accountable. I can’t find the article at the moment, but he also rates defense and pitcher influence over BIP to be pretty close in value.

For batters, there is a similar dichotomy between constructive models and derivative models. Linear weights is built from the various components of offense, while contextual models like RE24 start from runs and then divide the credit among the actors. Linear weights is the model more similar to FIP – both take components and produce an expected run value. As I weigh linear weights more strongly for batters, I also weigh FIP more strongly for pitchers. Philosophically, I prefer the constructive approach to the derivative one, as I think it’s more effective to add information than it is to remove the random variance.

That being said, for pitchers, the unknown value is greater, which is to say, there is more missing information from the constructive model for pitchers than for hitters. To account for this, FIP takes up less of a share of pitching in my system. My initial weighting for pitchers is something like 55% FIP, 25% RA9, 15% Kiko’s W-L records, and 5% BPro, while for batters, 70% of the weighting is on linear weights, with 15% for RE, and 15% for Kiko’s W-L records.
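
As a rough sketch of the weighting described above: the weights below are the starting weights quoted in the post, while the function name, the dictionary keys, and the example inputs are hypothetical, and it assumes every component has already been put on a common scale (wins, say).

```python
# Starting weights as quoted in the post; everything else here is illustrative.
PITCHER_WEIGHTS = {"fip_war": 0.55, "ra9_war": 0.25, "kiko_wl": 0.15, "bpro": 0.05}
BATTER_WEIGHTS = {"linear_weights": 0.70, "re": 0.15, "kiko_wl": 0.15}


def blended_value(components, weights):
    """Weighted average of value estimates that share a common scale."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(components)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights)


# Made-up career totals for an imaginary pitcher, NOT real data for anyone.
example_pitcher = {"fip_war": 70.0, "ra9_war": 60.0, "kiko_wl": 65.0, "bpro": 50.0}
print(round(blended_value(example_pitcher, PITCHER_WEIGHTS), 2))
```

With FIP carrying 55% of the weight, a pitcher whose FIP-WAR runs well ahead of his RA9-WAR keeps most of that gap in the blended figure.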

I do not include BBRef pitching WAR because it doesn’t really add much beyond RA9-WAR. Their defensive adjustment is too haphazard to help all that much. Furthermore, their model does not give enough credit to pitchers overall – this is particularly evident when looking at pitchers with long careers like Jim Kaat, Tommy John, Eppa Rixey, and Don Sutton, who are systematically underrated by BBRef WAR, but it affects all pitchers to some degree. Since there seems to be a fairly solid consensus that we should be inducting more pitchers, relying on the metric that likes them the least feels counterproductive to me.

264. Chris Cobb Posted: April 21, 2020 at 04:07 PM (#5942790)
Jaack, your argument against BBRef pitching WAR seems largely to depend on two claims whose basis you haven't yet explained:

(1) the defensive adjustment is too haphazard to be helpful

(2) it doesn't give enough credit to pitchers, because certain long-career pitchers are systematically underrated.

I'd like to know more about what you see as the basis of these claims.

I'll say that my understanding in both cases leads to the opposite conclusion. My understanding is that assessments of fielding quality are more reliable at the team level than at the level of the individual fielder, so that BBRef WAR's way of adjusting pitchers' RA/9 to what it would be if they were pitching in front of an average defense is, if not perfectly reliable, more reliable than other methods of making this adjustment. (The greater reliability of team-level fielding assessments was one of Bill James's basic premises for the Win Shares system, and I suspect it moved from there to WAR as a design principle.)

My view is that all of the difference between the RA/9 WAR and BBRef WAR for the pitchers you mention can be empirically accounted for by BBRef's adjustment for the quality of pitchers' defensive support and (less obviously) by its adjustments for league quality. Of the pitchers you've listed, Kaat, John, and Rixey, at least, all spent long stretches of their careers in a league that BBRef's methods indicate was substantially weaker than the other league. Similar effects can be observed more directly by comparing the BBRef WAR and Fangraphs WAR for NL position players from 1900-1930. Fangraphs is consistently substantially higher for NL players, while AL players from the same period show much less difference between BWAR and FWAR. These patterns are also observable in AL position players from 1950-70. I'll acknowledge that I don't think this case accounts fully for BWAR's evaluation of Don Sutton, but before a claim of systematic bias is justified, the biased element of the system should be identified. If Don Sutton is underrated but not Nolan Ryan or Phil Niekro, then it can't be a systematic bias against pitching; if it exists, it must have another source. If there is a systemic bias, then I want to know what it is!

627. Bleed the Freak Posted: January 07, 2021 at 11:03 PM (#5998170)
Every single year I have to fight the urge to go find some kind of extra credit for Tommy John. Because damn the numbers, I just know he's more worthy than this, somehow. Maybe I shouldn't say that, because it spins bad for Tommy John, I really do have him where I think his merit is! XD

John's an extreme candidate, he looks bad by Baseball-Reference, even worse by Baseball Gauge, but excellent at Fangraphs and Kiko's W-L records. Maybe FIP and Kiko are picking up on a skill John possessed that doesn't show in RA-9 type of WARs. He was quite successful in the post-season and was clutchy compared to normal situations too. A divisive candidate that I support, you can too : )

630. Chris Cobb Posted: January 08, 2021 at 12:24 AM (#5998186)
Bleed the Freak wrote, re Tommy John:

Maybe FIP and Kiko are picking up on a skill John possessed that doesn't show in RA-9 type of WARs.

Since FIP looks at less information than an RA/9 WAR does (intentionally ignoring what happens on balls in play), I think the more likely explanation is that they are not picking up on information about John's performance that a measure that starts with RA/9 and then adjusts for contributing factors does pick up. Most of the extra WAR that FIP gives to John that RA/9 WAR doesn't comes from balls in play. FIP assumes that pitchers don't control what happens on balls in play, so FWAR fills in the gap in scoring information created when balls in play are omitted by assuming average outcomes on BIP. Any worse outcomes than average are assumed to be either fielders' responsibility or luck. John's RA/9 is about 10 wins worse on BIP than his FIP with average results would be, so FWAR gives back all those wins to John: it erases any responsibility he might have had for what happens on balls in play.

Given that we know that John was a sinkerball pitcher, that sinkerball pitchers give up more groundballs than average pitchers, and that groundballs become hits at greater rates than flyballs do (when they don't leave the ballpark), it stands to reason that a sinkerball pitcher is likely to give back, on the outcome of balls in play, some of the runs he saves by keeping the ball in the park (which FIP credits him for).

Now, maybe John got bad outcomes because he pitched in front of bad defenses. In that case, the extra BIP runs wouldn't be on him. But BWAR finds that, over the course of his career, he pitched in front of defenses that were average, overall, so it gives responsibility for the deviation from average fielding outcomes back to John. Given the type of pitcher that John was, that seems an eminently reasonable step to me.

If I recall correctly, Kiko's system agrees with FIP that a lot of what happens on balls in play is luck -- out of the control of both the pitcher and the fielders, so he splits responsibility evenly between the pitchers and the fielders and leaves it at that. It's not an unreasonable generalization, but if there is reason, as there is in John's case, to expect that the pitcher's approach is going to result in an increase in base hits on balls in play, then splitting the difference between fielders and pitchers is again going to shift responsibility away from John that is rightfully his. Kiko's system also places a lot of emphasis on the value of power, so I would hypothesize that pitcher suppression of home runs is also given heavy emphasis, although I don't recall any statement to that effect that he has made. Therefore, it again stands to reason that a system that moves responsibility for what happens on balls in play away from the pitcher while giving the pitcher full credit for suppressing home runs will systematically favor (I might say overrate) a pitcher of John's type.

Of course, giving up more ground balls to avoid giving up more home runs is a sensible strategy, as long as the hits don't get out of hand. Tommy John was a very successful pitcher. However, analytical systems that zero out or reduce the impact of balls in play on such a pitcher's runs allowed are going to overrate that pitcher's accomplishments. I am certain that's what's happening with FWAR, and I suspect it may be a factor in Kiko's system, although I also acknowledge that Kiko's system could be picking up on situational leverage factors where John excelled that a context-neutral WAR system will not consider. John's WPA is 6.5 wins higher than his pitching wins above average, so if you credit those wins to John and not to situational luck, then that would be evidence that he is better than he appears by BWAR. That's where one might look for further evidence of a skill. I'd want to know more than I do about WPA for pitchers before I put weight on that number myself.
   21. TJ Posted: January 08, 2021 at 01:29 PM (#5998344)
Does he get any credit for putting himself through an experimental surgical technique that now bears his name?

Yes, Chris. I give him credit for everything he did after the surgery.

Beyond that? No. ;-)

Agreed- it's not like Tommy John designed the surgery and then performed it on himself...
   22. progrockfan Posted: January 08, 2021 at 01:38 PM (#5998346)
In other news, Lou Gehrig finally gets credit for something other than all those damn games in a row. What a lucky guy.
   23. Chris Cobb Posted: January 08, 2021 at 03:01 PM (#5998387)
John's an extreme candidate, he looks bad by Baseball-Reference, even worse by Baseball Gauge,

If Baseball Gauge is basing the fielding component of its WAR on DRA, then I can readily imagine that it is pulling more credit away from John to his fielders, as DRA allows much higher variance in fielding than does BWAR's fielding component. If that's the source of the difference, I don't know how justified that would be. As far as I know, DRA was designed to reveal the fielding value of individual players, not to model the total value of a team's fielding. If that's the case, even if its results are accurate to a significant degree at the individual level, they might not represent the impact of team defense on pitchers accurately.

   24. Doug Jones threw harder than me Posted: January 11, 2021 at 12:47 AM (#5998923)
Does he get any credit for putting himself through an experimental surgical technique that now bears his name?

Yes, Chris. I give him credit for everything he did after the surgery.

Beyond that? No. ;-)

Dr. Frank Jobe himself gave Tommy John a lot of credit, both for going through the surgery and for working so hard and helping to develop the rehab program along with the trainer Bill Buhler, which was an important part of the success story. The rehab program itself was important because if the entire package (surgery plus rehab) hadn't worked, then it might have been a number of years before the same thing was tried again.

Frank Jobe and Tommy John Speech at the Hall of Fame

The first surgery actually was not entirely successful, and had to be followed up with another surgery to repair nerve damage, which took until the middle of the following year to finally heal.

John went through months of rehabilitation, fighting through the frustration of more than a half year of numbness of two paralyzed fingers. Feeling in his fingers eventually returned in July 1975 and by the start of 1976, he was back in the Dodgers’ starting rotation.

More about Tommy John

Tommy John also had an excellent postseason record, much of it pitching in the pressure-cooker of the Yankees of the late 1970's. Someone else here mentioned that postseason success (or failure) gets short shrift here, but success in that venue is an important part of one's merit if given the chance to perform. Much of the "peak" argument is made on the basis that peak performance=pennants, but performance in the postseason also=pennants. If Tommy John hadn't been taken out prematurely in Game 6 of the 1981 World Series, the Yankees, not the Dodgers, might have been World Champs.
   25. Eric J can SABER all he wants to Posted: January 11, 2021 at 01:07 AM (#5998926)
Tommy John also had an excellent postseason record, much of it pitching in the pressure-cooker of the Yankees of the late 1970's

To be slightly picky, Tommy John was pitching for the Dodgers in the postseason in the '70s; his first Yankee postseason was 1980. He was actually on the losing end of all three of the Yankees-Dodgers World Series during the '77-'81 stretch, albeit through not much fault on his part (he did get blown up in his one start in the '77 WS, but pitched fine in both '78 and '81).

In the small sample size that is the postseason, it is probably worth pointing out that John's 2.65 ERA is not entirely representative because he gave up a comparatively large number of unearned runs (26 earned, 10 unearned). That is a deadball-era level fraction of unearned runs.
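
To put a rough number on that point, here is a quick back-of-the-envelope sketch using only the figures quoted above. The innings total is not given in the post, so it is backed out of the rounded ERA and should be read as approximate; the variable names are just for illustration.

```python
# Figures quoted in the post: 2.65 ERA, 26 earned runs, 10 unearned runs.
ERA = 2.65
EARNED_RUNS = 26
UNEARNED_RUNS = 10

# ERA = 9 * ER / IP, so IP = 9 * ER / ERA (approximate: the ERA is rounded).
implied_innings = 9 * EARNED_RUNS / ERA

total_runs = EARNED_RUNS + UNEARNED_RUNS
ra9 = 9 * total_runs / implied_innings      # all runs allowed per nine innings
unearned_share = UNEARNED_RUNS / total_runs  # fraction of runs that were unearned

print(f"implied IP: about {implied_innings:.1f}")
print(f"RA/9: about {ra9:.2f}")
print(f"unearned share of runs: about {unearned_share:.1%}")
```

Counting all runs, the postseason line works out to roughly a run per nine innings worse than the ERA alone suggests.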
   26. The Honorable Ardo Posted: August 22, 2021 at 11:38 PM (#6035677)
Another way to assess John:

Tommy John, 1965-1982: 235-160, 118 ERA+, 3595 IP, 8.6 H/9, 2.4 BB/9, 4.6 K/9
Mark Buehrle, career: 214-160, 117 ERA+, 3283 IP, 9.5 H/9, 2.0 BB/9, 5.1 K/9

They're basically the same pitcher. B-Ref WAR gives John 55.3 for this stretch and Buehrle 59 flat, which seems right: John had better teammates (especially on defense) and pitched in an easier era. Both gave up more unearned runs than their contemporaries, as you'd expect. Both were pickoff artists.

John's output after 1982 (1000 IP, 92 ERA+, positive WAR but negative WAA) reminded me of Early Wynn with the Nats or Red Ruffing with the Sox. Of course, those were dreadful teams, while John's 1980s Angels and Yankees were usually contenders. I admire John for pitching so long, but he didn't accumulate much, if any, value in his 40s.
   27. Chris Cobb Posted: November 05, 2021 at 07:25 PM (#6051490)
Another angle on Ardo's John/Buehrle comparison is to look at their IP relative to their contemporaries. I pulled the top 10 in IP for 1965-82 and the top 10 in IP for 2000-2015 (Buehrle's career), and here are the results:

1965-82 IP Leaders
1. Gaylord Perry, 4838.7
2. Phil Niekro, 4403
3. Fergie Jenkins, 4333.3
4. Steve Carlton, 4274.7
5. Don Sutton, 4137 (first season 1966)
6. Tom Seaver, 3900 (first season 1967)
7. Jim Palmer, 3853.7
8. Tommy John, 3595
9. Jim Kaat, 3549.7
10. Catfish Hunter, 3449.3

2000-2015 IP Leaders
1. Mark Buehrle, 3283.3
2. Tim Hudson, 2990.3
3. CC Sabathia, 2988.7
4. A.J. Burnett, 2690
5. Livan Hernandez, 2665.7
6. Roy Halladay, 2586
7. Barry Zito, 2576.7
8. Kyle Lohse, 2522.3
9. Javier Vazquez, 2503
10. John Lackey, 2481.3
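
One hedged way to read the two lists side by side is to express each pitcher's innings as a ratio to his era's leader and to the 10th-place total. The sketch below uses only numbers from the lists above; the function name and the choice of these two ratios are assumptions made for illustration, not part of the original comparison.

```python
# Innings totals copied from the two top-10 lists above.
LEADER_1965_82 = 4838.7   # Gaylord Perry
TENTH_1965_82 = 3449.3    # Catfish Hunter
JOHN_IP = 3595.0

LEADER_2000_15 = 3283.3   # Mark Buehrle himself
TENTH_2000_15 = 2481.3    # John Lackey
BUEHRLE_IP = 3283.3


def relative_workload(pitcher_ip, leader_ip, tenth_ip):
    """Return (share of the leader's innings, ratio to the 10th-place total)."""
    return pitcher_ip / leader_ip, pitcher_ip / tenth_ip


john = relative_workload(JOHN_IP, LEADER_1965_82, TENTH_1965_82)
buehrle = relative_workload(BUEHRLE_IP, LEADER_2000_15, TENTH_2000_15)

print(f"John:    {john[0]:.3f} of leader, {john[1]:.3f} x 10th place")
print(f"Buehrle: {buehrle[0]:.3f} of leader, {buehrle[1]:.3f} x 10th place")
```

On these ratios Buehrle stands further above the pack in his era than John does in his, though John's total includes the time lost to surgery.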

   28. Doug Jones threw harder than me Posted: November 05, 2021 at 08:24 PM (#6051496)
There is a little bit of me that says:

"It's the Hall of Fame, not the Hall of Merit", and every time you turned the TV on in the late 70's and early 1980's and it was a big game, there was a good chance Tommy John was pitching.

There is also a little bit of me that says:

"Well, maybe the above arguments say that Mark Buehrle should go into the Hall too".

It's hard to say that Tommy John contributed no value in his 40's, since he was in the discussion for the Yankees' most valuable pitcher in 1987 and 1988.

The innings pitched comparisons to Buehrle are interesting. One point that might be made is that all of those guys being compared to John are Hall of Famers except Kaat, who is close, while Buehrle is being compared to Kyle Lohse and John Lackey. Also, this comparison includes John's lost year+ for the surgery.

Tommy John pitched in front of Bill Russell and Ron Cey, who, though steady, were by no means Ozzie Smith and Brooks Robinson. Cey in fact had noticeably little range, and Russell had consistent problems with throwing errors, especially after his finger injury. I don't know the statistics on Davey Lopes. John may have been gifted with better outfield defense, but that was quite a bit less important for his pitching style.

It's frustrating in some sense that the same group of people who bemoan a game that has evolved into TTO-ball, devalue pitchers like John and Buehrle simply because they didn't strike people out, even though they created outs via other means. Maybe that's what WAR says is correct, but it seems wrong somehow. Sometimes math is a harsh mistress, I guess.
