Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
> who played on the Cuban national team, then defected?
I'll say yes, but I'm not the final authority on matters such as this.
Right. I suppose only the author can be an authority here.
I have had offense MLEs finished for Oscar Charleston using my system as I had left it a couple of years back, but I have been hoping to improve my system by handling walks and stolen bases more systematically and appropriately, now that we have fuller data on both, at least for the players for whom the HoF has released their data.
This post is going to focus on walks.
When Brent released his MLEs for John Henry Lloyd, this was his description of his system (my emphases added):
The approach I’m taking in deriving MLEs is a simplification of Chris Cobb’s method (and is the same method that I previously used for Carlos Morán’s MLEs). Here’s a summary of the method: (1) I restrict my analysis to data for which league averages are available or can be calculated. (2) I build my estimates from three rates: the walk rate BB/(BB+AB), the batting average, and isolated power. (3) I adjust these rates for quality of play, multiplying BA by 0.9 and ISO and walk rates by .81. (No quality-of-play adjustment is made for games played against major league teams.) (4) I adjust the BA and ISO statistics to a National League context by multiplying by the ratio of the two league averages for the period. (However, because the HoF study doesn’t report league averages for walk rates or OBP, I decided not to make any contextual adjustments to walk rates. Gary’s data for 1921 and ’22 show NeLg walk rates similar to those in the NL.) (5) I report Lloyd’s MLEs as average rates for four periods: 1907–18 (ages 23 to 34), 1919–23 (ages 35 to 39), 1924–28 (ages 40 to 44), and 1929–30 (ages 45 to 46), and also as career rates.
This system is indeed what I use, except (1) I go season by season and use regression and (2) I haven't been adjusting walk rates by .81. I never developed a consistent BB conversion factor because I never did conversions for players who had a full career of walk data available.
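For readers who think better in code than in prose, here is a minimal sketch of the rate conversion in the quoted summary, written in Python purely for illustration. The names `Rates` and `convert_rates` are hypothetical, the league averages passed in are whatever the caller supplies, and walk rates get no context adjustment because the summary says none was made.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    walk_rate: float  # BB / (BB + AB)
    ba: float         # batting average
    iso: float        # isolated power


def convert_rates(raw, nel_avg, nl_avg, m=0.9, walk_factor=None):
    """Apply quality-of-play and league-context adjustments to raw rates.

    m scales batting average and m**2 scales isolated power.
    walk_factor defaults to m**2 (.81 when m = 0.9), as in the quoted
    summary. Walk rates are not adjusted for league context.
    """
    if walk_factor is None:
        walk_factor = m ** 2
    return Rates(
        walk_rate=raw.walk_rate * walk_factor,
        ba=raw.ba * m * (nl_avg.ba / nel_avg.ba),
        iso=raw.iso * m ** 2 * (nl_avg.iso / nel_avg.iso),
    )
```

Calling `convert_rates(raw, nel_avg, nl_avg, walk_factor=1.0)` leaves the walk rate unconverted, which corresponds to not adjusting walk rates by .81.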
So the question I have is: is .81 an appropriate conversion factor, from a theoretical perspective? That is, should walk rates vary as the square of hit rates, as isolated power does? Should we expect any sort of relationship between walk rates and other offensive statistics?
Brent doesn't give his rationale for choosing that factor (or I failed to notice it), so I am not sure his choice is justified. If anyone has any insights into this question, I'd be glad to read them!
I don't have a theoretical answer to this question, but I thought I might be able to get an empirical view by looking at walks in the same way I looked at batting average in the study I did when I first began my MLE project: I looked at the walk rates of players who went from the NeL to the majors. Walk data is only available for HoFers or short-listed HoF candidates, so I was only able to look at Roy Campanella, Larry Doby, Monte Irvin, and Minnie Minoso. (Jackie Robinson's NeL stint is too short in the HoF data to be meaningful.) When I compared their NeL walk rates to their ML walk rates, with appropriate adjustments for aging patterns (using Tangotiger's figures for this), I found that these players' NeL walk rates were most predictive of future ML walk rates when a conversion factor of about 1.0 was applied. (Without showing all the gritty details, Campy's and Doby's results suggested conversion factors of about .9, while Irvin's and Minoso's suggested conversion factors of about 1.1. I can post the gritty details if there is interest.) The walk rates of the top NeL power hitters with reputations for plate discipline--Buck Leonard and Josh Gibson--are comparable to the walk rates of the top hitters with plate discipline in the majors, so there's no obvious need for a conversion factor far removed from 1.0 there, either.
We lack, of course, the league walk rate data that would enable us to interpret systematically what this means. It could be that low-walk conditions in the NeL offset the weaker competition levels, so that, overall, earning walks was about as difficult in the NeL as in the majors. It could be that walk rates are not affected by competition levels in the same way that hits and hitting for power are.
So, the two points of data on walk rates that I have are (1) irrespective of competition levels, the conversion factor for individual NeL players in the 1940s looks to be about 1 and (2) in the early 1920s, NeL walk rates matched major league walk rates closely. The big question, then, is what happened to NeL league walk rates between the early 1920s and the early 1940s relative to the majors?
It might be that there is suggestive data on the effect of competition levels on walk rates hidden somewhere in CWL data. If there is CWL seasonal data that includes walks for years in which players participated who also played in the major leagues (e.g. Armando Marsans), comparing the walk rates of those players relative to their leagues might reveal the effects of competition levels on walk rates. I am not aware that any seasonal data of this kind is available for seasons with major-league cross-overs. If anyone knows where such data could be found, please let me know! (Whoops, I just realized that the 1908-09 CWL data Gary A. has compiled has Marsans himself! I'll have a look at him and see what I find. So, if _more_ data of this kind can be found, let me know.)
In the absence of theoretical analysis or empirical data to resolve my uncertainties about walk rates, I plan to handle walk rates as follows: (1) in seasons before 1940 for which no NeL league data is available, I will use a conversion factor of .95 for walk rates, just to hold down outliers a bit in seasons when walk rates might have spiked, (2) in seasons for which I have NeL league data, I will adjust for context but not apply a competition conversion factor, and (3) I will regress walk rates in the same way I regress BA and slugging--to a five-year average, centered on the season in question. (Dan R's study of war credit suggests that more sophisticated regression studies using major-league data could produce a more precise regression formula, but since his findings there indicate that taking an average of the four surrounding seasons leads to similar results, I have some confidence that my simple regression formula is basically sound.)
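A rough Python sketch of steps (1) and (3) of that plan, for illustration only. The function names are hypothetical, the `weight` parameter is an assumed regression weight (the post does not give one), and truncating the window at the ends of a career is also an assumption. The contextual adjustment in step (2) is left out because it depends on league data not shown here.

```python
def centered_average(rates, i, window=5):
    """Mean of the seasons in a window centered on index i.

    Near the start or end of a career the window is truncated to the
    seasons that exist (an assumption, not something the post specifies).
    """
    half = window // 2
    lo = max(0, i - half)
    hi = min(len(rates), i + half + 1)
    span = rates[lo:hi]
    return sum(span) / len(span)


def adjusted_walk_rates(raw_rates, factor=0.95, weight=0.5):
    """Apply a conversion factor, then pull each season toward its
    five-year centered average.

    weight is a hypothetical regression weight: 0 means no regression,
    1 means the season is replaced by the centered average.
    """
    converted = [r * factor for r in raw_rates]
    return [
        (1 - weight) * r + weight * centered_average(converted, i)
        for i, r in enumerate(converted)
    ]
```

A season with a spiked walk rate is pulled toward its neighbors, while a run of steady seasons passes through with only the .95 factor applied.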
This approach is different from Brent's, so our results will differ with respect to walks, though our systems are otherwise the same. Without a clear basis for applying a 20% discount to walk rates for competition level, I can't see my way to doing it. My limited review of CWL data suggests that it was a high walk rate league, so if one is working with that data, a contextual discount certainly appears to be appropriate in many cases, but, based on what I know at this moment, I would want to treat that purely as a contextual discount, not a competitive one.
Your thoughts?
I didn't do any kind of empirical study to select the .81 factor for walks, so I agree that the choice may not be justified. There are two reasons that I used it:
(1) Some of you may recall that several years ago I calculated MLEs for minor league seasons of HoM candidates like Buzz Arlett, Earl Averill, and Gavy Cravath, using the MLE methods published by Bill James for Class AAA translations in his 1985 Baseball Abstract. I eventually realized that the Bill James method was essentially equivalent to multiplying batting average by M and isolated power and walk rates by M^2, where M^2 = .82. Since Chris was doing about the same thing for Negro league players with M=.9 and M^2 = .81, it seemed like a natural extension of the Bill James methodology.
(2) I started doing translations of Cuban League players with Carlos Morán, and without reducing his walk rate by a factor of .81, none of you would have believed the resulting MLEs. (None of you believed them anyway! :) At any rate, it seemed reasonable in his case to lower the walk rates to account for differences in quality of pitching.
Chris's suggestion is a good one--Gary has posted enough data on Marsans and Almeida that we could take a look at how their Cuban walk rates compare with their major league rates. I'll see what I can put together.
> Your thoughts?
Do you generate MLEs in such a way that you can handle this discount as a parameter? That is, after some set-up stage, "easily" generate low and high estimates with d=0.9 and d=1.0.
I do the MLEs with spreadsheets, so, yes, I could do this sort of thing quite easily.
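The same idea outside a spreadsheet might look like the Python sketch below. It is only an illustration of treating the discount d as a parameter; the function names are hypothetical and no real player data is used.

```python
def walk_rate_range(raw_walk_rate, discounts=(0.9, 1.0)):
    """Return the converted walk rate for each discount d, keyed by d."""
    return {d: raw_walk_rate * d for d in discounts}


def implied_walks(walk_rate, ab_plus_bb):
    """Walks implied by a rate defined as BB / (AB + BB)."""
    return walk_rate * ab_plus_bb


def walks_range(raw_walk_rate, ab_plus_bb, discounts=(0.9, 1.0)):
    """Low and high walk totals for one season, one entry per discount."""
    rates = walk_rate_range(raw_walk_rate, discounts)
    return {d: implied_walks(rate, ab_plus_bb) for d, rate in rates.items()}
```

Running `walks_range` once per season gives a low and a high walk total that can then be carried through the rest of the MLE.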
After several seasons in minor league baseball, Marsans and Almeida both debuted with the Cincinnati Reds on July 4, 1911. Marsans soon became a regular and a minor star for the Reds, while Almeida hit fairly well but didn't get much playing time. While with the Reds, 1911-14, Marsans appears to have been an impatient hitter, drawing 66 walks in 1,179 AB+BB, a rate of .056 compared to the NL rate (excluding pitchers) of .087. Almeida, also with the Reds from 1911-13, was closer to average, drawing 25 walks in 310 AB+BB, a rate of .081 compared with the league average rate of .089. (Their league average rates differ slightly because Almeida didn't play in 1914 and I weighted the annual league average rates by each player's AB+BB.)
For the Cuban data, in order to get a large enough sample I had to include data from as early as the fall of 1907. (Gary's website, agatetype.typepad.com, includes compilations of walks for regular Cuban League seasons through 1908-09, but after 1909 the data with walks are limited to a few series between visiting Negro league teams and Cuban teams. Figueredo, the other main source of Cuban data, doesn't include information on batter walks.) I analyzed walk rates for the following leagues and series: 1908 Cuban League season (winter of 1907-08), 1908-09 Cuban League season, and series against the Philadelphia Giants (fall 1907), Brooklyn Royal Giants (fall 1908), Leland Giants (fall 1910), Lincoln Giants (fall 1912), Lincoln Stars (fall 1914), and Indianapolis ABCs (fall 1915).
The strange thing is that the roles of Almeida and Marsans flip in the Cuban data. Almeida, who had a near-average walk rate in the majors, was impatient in the Cuban data. He drew 16 walks in 420 AB+BB, a rate of .038 compared to a league rate of .096. Marsans, on the other hand, was close to the league average in Cuba, drawing 35 walks in 474 AB+BB, a rate of .085 compared to the league rate of .096. So while the example of Marsans might provide support for the idea that walk rates dropped when facing major league pitching, Almeida is an obvious counterexample.
How about restricting it to the time period after they debuted in the majors (i.e., Cuban series held after 1911)? The sample sizes get uncomfortably small, but the same tendencies appear: Almeida drew 4 walks in 67 AB+BB, a rate of .060 compared to the series average of .113. Marsans drew 7 walks in 71 AB+BB, a rate of .099 compared to his series average of .118. Again, Marsans walked quite a bit more than his major league rate, while Almeida walked less.
The only conclusions I can draw from this little study are: (a) individual players make different adjustments when they move from one league to another, and (b) we would need a much larger sample than 2 players to accurately calculate the translations for walk rates from one league to another.
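One way to put the two players on the same scale is to divide each walk rate by its league rate and compare the majors to Cuba. The Python sketch below does that with the full-sample rates exactly as reported above; the names are hypothetical and this is an illustration, not a translation method.

```python
# (player rate, league rate) pairs, copied as reported in this post
REPORTED = {
    "Marsans": {"majors": (0.056, 0.087), "cuba": (0.085, 0.096)},
    "Almeida": {"majors": (0.081, 0.089), "cuba": (0.038, 0.096)},
}


def relative_rate(player_rate, league_rate):
    """Player walk rate as a fraction of the league walk rate."""
    return player_rate / league_rate


def majors_vs_cuba(name):
    """Relative walk rate in the majors divided by relative rate in Cuba.

    A value below 1 means the player walked less, relative to his
    league, in the majors than in Cuba; above 1 means he walked more.
    """
    majors = relative_rate(*REPORTED[name]["majors"])
    cuba = relative_rate(*REPORTED[name]["cuba"])
    return majors / cuba
```

On these figures `majors_vs_cuba("Marsans")` comes out below 1 and `majors_vs_cuba("Almeida")` comes out well above 1, which is the flip described above.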
Given the lack of information, I'll be happy to go along with Chris's recommendation to use a factor of .95 for walks. I will revise my MLE calculations for Lloyd and Hill and re-post them.
It's interesting that Almeida and Marsans show such diametrically opposed trends, but it's too bad their careers don't give us any insight into the problem of the effects of league strength on walks.
When the HoF releases the full data from their study and gives us NeL league walk rates for the 1940s, we'll be able to get some answers, I guess, but perhaps not until then. (Unless independent researchers like Gary A., who release their findings publicly, complete some studies first, which is beginning to seem the more likely outcome.)