Baseball for the Thinking Fan

Hall of Merit
— A Look at Baseball's All-Time Best

Monday, February 05, 2007

Dan Rosenheck’s WARP Data

WARP Methodology and Results

Thanks, Dan!

EDIT: Link updated 2/23/2009

John (You Can Call Me Grandma) Murphy Posted: February 05, 2007 at 08:59 PM | 763 comment(s)

Reader Comments and Retorts

   601. Joey Numbaz (Scruff) Posted: October 08, 2008 at 04:17 AM (#2974031)
Calling up a 1B prospect from the minors is not the same as having a guy like David Ortiz, Jason Giambi, Aubrey Huff, Jason Kubel, etc.

Not to mention that whatever pitcher you sent down would have to stay in the minors for 10 days unless someone on the active roster gets hurt.

I'm not saying it's the entire difference, but it definitely helps overstate it.
   602. Blackadder Posted: October 08, 2008 at 06:06 AM (#2974075)
By the way, you can get baserunning +/- numbers from Bill James Online. My understanding is that they aren't as good as Dan Fox's numbers, but they are probably more accurate than using regression-equation estimated baserunning.
   603. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 08, 2008 at 02:54 PM (#2974270)
Blackadder, 2007 data I would have to do by hand (as I did these selected 2008's). 2006 is in my sheet but missing some tidbits; I could try to fill them in and send you a sheet if you'd like. And yes, I could use James's numbers for 2008, but I'd rather just wait for Fox. My main efforts at the moment are these tests of various defensive stats against the PBP metrics, to try to improve FWAA for the Retrosheet era (and particularly the post-1987 era, for which we have fantastically good data).
   604. Blackadder Posted: October 08, 2008 at 07:27 PM (#2974579)
Sure, Dan. I mean, if it is not too much work, I suspect I am not the only person here who would be curious to see the 2006 data. This is nothing more than a curiosity, though; I agree that getting FWAA right should be a bigger priority, not to mention that I suspect people in your profession have not been lacking for real things to work on lately either! So just send it over when you have time to do so.
   605. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 08, 2008 at 07:37 PM (#2974590)
Surprisingly, it's the converse. As we devote more space to the credit crisis, the Americas section gets squeezed, so I'm actually writing a bit *less* often now. That's given me time to work more on my WARP and on a book proposal about baseball in Latin America.
   606. AROM Posted: October 08, 2008 at 07:53 PM (#2974608)
Is Dan Fox going to provide baserunning numbers? He's working for the Pirates now; they may want to keep that data for themselves.

I've got a system for baserunning that I can do once the retrosheet data is up, but I don't want to promise too much. I know I still need to get my OF throwing ratings together to send to you, Dan.
   607. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 08, 2008 at 08:05 PM (#2974623)
I have no idea; I haven't asked. Frankly, I'm more concerned about getting the system as good as possible for 1893-2005 than I am about tacking on the most recent years for now.

Yeah, I'm looking forward to them. Thanks.
   608. Blackadder Posted: October 08, 2008 at 10:56 PM (#2974790)
Actually, it looks like you can get Dan Fox's baserunning numbers from Baseball Prospectus (or at least, I think they are his, based on the description in the BP glossary). You can find them here:

http://www.baseballprospectus.com/statistics/sortable/index.php?cid=420215
   609. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 08, 2008 at 11:22 PM (#2974808)
Hey, nifty! I hadn't seen that. Nice catch! Let's see how my estimator did for players I had at least 0.2 away from average...

Rollins: Me 0.9, Fox 0.9
Sizemore: Me 0.6, Fox 0.5
Reyes: Me 0.6, Fox 0.8
Beltrán: Me 0.5, Fox 0.6
Holliday: Me 0.5, Fox 0.8
Kinsler: Me 0.5, Fox 0.9
Roberts: Me 0.4, Fox 0.1
Pedroia: Me 0.4, Fox 0.4
Hanley: Me 0.2, Fox 0.3
Utley: Me 0.2, Fox 0.1
Hamilton: Me 0.2, Fox 0.2
A-Rod: Me 0.2, Fox 0.3
Youkilis: Me -0.3, Fox 0
Markakis: Me -0.2, Fox -0.3
Morneau: Me -0.2, Fox -0.3
Soto: Me -0.2, Fox -0.4
Teixeira: Me -0.2, Fox 0
Delgado: Me -0.2, Fox 0
Mauer: Me -0.2, Fox +0.4
Ludwick: Me -0.2, Fox +0.2

Pretty good! The only guy that's really off is Mauer...and he is such a unique player, I think my model can be forgiven for suspecting that a catcher with one stolen base and four triples is slow.
   610. Joey Numbaz (Scruff) Posted: October 09, 2008 at 05:01 AM (#2975027)
Actually, I'd think a catcher with 4 triples probably isn't all that slow - could the model be slightly underrating them? At least for catchers?
   611. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 09, 2008 at 02:51 PM (#2975140)
That might be fast *for a catcher* but it's not fast overall, by any stretch of the imagination. Now, my WARP certainly don't penalize catchers for this--they will, as a whole, have negative BRWAA, but that means replacement catchers have negative BRWAA as well, so if some of them are being incorrectly docked in the BRWAA column, they'll be incorrectly credited an equal amount in the Replc column, which washes out. That said, I think by definition it's neither under- nor over-rating them, since the equation was derived precisely by analyzing the aggregate baserunning of players at each position (C included) from 1956 to the present.
   612. Joey Numbaz (Scruff) Posted: October 09, 2008 at 10:40 PM (#2975527)
4 triples a year has to be at least average for all players right? The AL only had 408 triples this whole season.

That means the average hitter, in 162 games of full time play would only hit 3.2.
   613. Joey Numbaz (Scruff) Posted: October 09, 2008 at 10:42 PM (#2975528)
I would think for things like triples and SB you'd need to adjust for changes in norms over time in a system that uses it to project baserunning values compared to average for a season, right?
   614. Joey Numbaz (Scruff) Posted: October 09, 2008 at 10:43 PM (#2975529)
No idea why he stopped running this year, but for his career he's an 83% base-stealer too. He was 13-1, 8-3, 7-1 the last 3 years.
   615. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 09, 2008 at 10:56 PM (#2975533)
I handle changes in norms over time simply by zeroing out the average non-SB baserunning runs for each league-season. The estimator, which is based on the modern game, obviously "thinks" that virtually every deadball-era player was a speed demon--but I just take the average estimated EqBR score for each season (which is 0 today and probably about +2 for deadballers) and subtract it out, so that a deadballer who the estimator "thinks" is +5 will register as +3, and one who the estimator "thinks" is +1 will register as -1.
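To illustrate the zeroing-out step described in #615, here is a minimal sketch. It assumes a plain unweighted mean across the players of one league-season; the post does not say whether the average is weighted by playing time. The function name and the three sample players are hypothetical.

```python
def zero_out_league_season(estimated_runs):
    """Re-center estimated non-SB baserunning runs on the league-season average.

    estimated_runs: dict of player name -> raw estimator output (runs).
    Assumption: a simple unweighted mean is used as the league average.
    """
    league_avg = sum(estimated_runs.values()) / len(estimated_runs)
    return {name: runs - league_avg for name, runs in estimated_runs.items()}


# Hypothetical deadball-era league-season whose raw estimates average +2.
raw = {"Player A": 5.0, "Player B": 1.0, "Player C": 0.0}
adjusted = zero_out_league_season(raw)
```

With these made-up inputs the +5 player lands at +3 and the +1 player at -1, matching the example in the post.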
   616. Brent Posted: October 10, 2008 at 01:42 AM (#2975655)
No idea why he stopped running this year, but for his career he's an 83% base-stealer too. He was 13-1, 8-3, 7-1 the last 3 years.

This observation suggests that perhaps Dan's formula for projecting non-SB baserunning might be improved by including statistics for the previous year (or years).
   617. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 10, 2008 at 10:29 AM (#2975829)
That is not a bad idea at all, Brent. I will definitely explore it when I revisit BRWAA (which will be after I have FWAA to my liking).
   618. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 10, 2008 at 08:26 PM (#2976219)
Interestingly, those extra 0.6 baserunning wins move Mauer up from a crowded field to a clear #1 on my nonexistent 2008 AL MVP ballot.
   619. stax Posted: October 10, 2008 at 08:39 PM (#2976240)
Where does Pedroia stand on your ballot?
   620. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 10, 2008 at 08:46 PM (#2976247)
Right where he was before...see post #583.
   621. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 28, 2008 at 04:16 AM (#2997376)
I am pleased to announce that I have completed a revision of my WARP for the 1987-2005 period. Since I last updated my system exactly one year ago, a number of new metrics have become available to measure baserunning and defense. For baserunning, Dan Fox's EqBRR take account of substantially more data than James Click's EqBR, which I was previously using. For defense, to measure assist and putout runs, we now have Sean Smith's TotalZone, Dan Fox's SFR (for the infield), and an updated version of Chris Dial's Zone Rating-based data, as well as updates to Mitchel Lichtman's UZR and David Pinto's PMR. Sean Smith has also done extensive work on the other aspects of defense, and has produced figures for catcher fielding, infielders turning the double play, and outfielder throwing arms.

Given all this outstanding new data, I felt it was a shame to keep relying on BP's FRAA and Bill James's Fielding Win Shares, so I set about incorporating it into my WARP. For everything but assists/putouts (that is to say, Fox's work on baserunning and Smith's on catchers, DP's, and arms), I simply added the data into the fielding and baserunning wins metrics. For assists/putouts, I did multiple regression analyses of the various stats against a weighted average of contemporary play-by-play metrics, and used the resulting equations to determine A/PO runs (e.g., a shortstop's A/PO score from 1987-98 is equal to .48 times his SFR plus .38 times his Dial rating plus .26 times his TotalZone plus a constant). These regressions show that the data we now have available are extremely high-quality: the r-squared's against the PBP metrics range from a low of .61 in center field to a high of .84 at third base (so multiple r of .78 to .92). In other words, you should feel quite confident in the accuracy of these figures--assuming you trust PBP numbers to begin with. The actual PBP data itself is included in the calculation of fielding wins as it becomes available (so UZR in 2000, Plus/Minus in 2003, and PMR in 2004).
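As a rough sketch of the shortstop example above: the three weights come from the post, but the regression constant is not given, so it is left as a parameter defaulting to 0.0 (a placeholder, not the real value). The second helper just shows how the quoted r-squared range maps to the quoted multiple r range. Function names are hypothetical.

```python
import math


def ss_apo_runs_1987_98(sfr, dial, total_zone, constant=0.0):
    """Shortstop assist/putout runs, 1987-98, using the weights quoted in the post.

    Assumption: `constant` is a stand-in; the post does not state its value.
    """
    return 0.48 * sfr + 0.38 * dial + 0.26 * total_zone + constant


def multiple_r(r_squared):
    """Convert an r-squared into the corresponding multiple r."""
    return math.sqrt(r_squared)


low_r = round(multiple_r(0.61), 2)   # center field
high_r = round(multiple_r(0.84), 2)  # third base
```

Note that the three weights add up to more than 1, so a player rated +10 by all three inputs comes out a bit above +10 before the constant is applied.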

To keep these numbers on the same scale as those available for 1893-1986, I have reduced the LgAdj figures (increasing the regression to the mean due to the standard deviation adjustment) to account for the extra variance introduced by this new data. This means that you *can* compare these WARP2 scores to those from earlier years on an apples-to-apples basis, but you *can't* compare these LgAdj scores as a measure of ease of domination to those from earlier years. For this reason, I have not modified the StDevs and Rep Levels spreadsheet (although I have slightly changed the replacement levels for the 1987-2005 period used to calculate these WARP as a result of the new data).

Finally, park factors in the Lahman database were updated at some point, and those new numbers are factored into this spreadsheet. The salaries listed here are based on the 2007 free agent market, and are calculated by looking at seasonal totals, not rates.

I hope the group finds this new information useful, and I look forward to feedback. Once Michael Humphreys finishes his work on DRA, I will see if the resulting correlations to PBP metrics are good enough to do something similar for years before 1987.

The new data are available in the Yahoo group.
   622. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 28, 2008 at 12:03 PM (#2997538)
By the way, I could do 2006 and 2007, but the only access I can get to Dewan's Plus/Minus for those years is by typing in every player's name individually on billjamesonline.com and manually entering the results at every position. While I devote a lot of time to this, I don't have *that* much time. Anyone have any ideas to help?

Also, I should have mentioned that the problems my old WARP have with players who handled multiple positions in the same season are solved here. Every player's scores are calculated based on his exact number of innings played at each position in each season.

Check out my 1994 AL MVP! It's a stunner.
   623. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 28, 2008 at 01:16 PM (#2997578)
Speaking of that '94 AL MVP, the finding that most surprises me is that Kenny Lofton has a very strong HoM case: he's right on the in/out line before counting his '06 and '07, which both had some value. In the Max Carey/Richie Ashburn mold, he's got 6.7 baserunning wins above average and 9.8 fielding wins above average through 2005, an OBP-heavy OPS+, and a long career. Surprisingly for a player of this profile, his arm is a meaningful part of his case, with 30 throwing runs from 1992-2003.
   624. RedSoxBaller Posted: October 28, 2008 at 01:48 PM (#2997611)
Dan, what is this Yahoo group you speak of? I like WARP, and I would like to see your new updates. Thanks a lot; I have enjoyed looking at your other statistics as well. Great work.
   625. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 28, 2008 at 03:08 PM (#2997681)
Get a Yahoo account if you don't already have one. Then go to groups.yahoo.com, search for the Hall of Merit group, and sign up. Then go to the Files section and download the Rosenheck WARP.zip archive.
   626. Mike Emeigh Posted: October 28, 2008 at 03:17 PM (#2997684)
Once Michael Humphreys finishes his work on DRA, I will see if the resulting correlations to PBP metrics are good enough to do something similar for years before 1987


Michael will NEVER be done :)

-- MWE
   627. Joey Numbaz (Scruff) Posted: October 30, 2008 at 04:40 PM (#2999691)
Great news Dan! I'll update the DB I built as soon as possible. I still have to set up a download for it as well, as Paul suggested.
   628. stax Posted: October 30, 2008 at 09:17 PM (#2999978)
Dan R.... You are a very smart man who just used a whole heck of a lot of words I'm going to go ahead and trust. Thanks! I can understand the basic levels of SABR-style comparison and gameplay, but the higher level stat-creation wizards such as yourself still amaze me.
   629. Paul Wendt Posted: October 30, 2008 at 11:28 PM (#3000086)
DanR #621
To keep these numbers on the same scale as those available for 1893-1986, I have reduced the LgAdj figures (increasing the regression to the mean due to the standard deviation adjustment) to account for the extra variance introduced by this new data. This means that you *can* compare these WARP2 scores to those from earlier years on an apples-to-apples basis, but you *can't* compare these LgAdj scores as a measure of ease of domination to those from earlier years. For this reason, I have not modified the StDevs and Rep Levels spreadsheet (although I have slightly changed the replacement levels for the 1987-2005 period used to calculate these WARP as a result of the new data).

I think this means that StDevs and Rep Levels may be valuable for general sabermetric reference --as I suppose many have used them. Maybe I will take a close look at Thanksgiving break.

Finally, park factors in the Lahman database were updated at some point, and those new numbers are factored into this spreadsheet.

>>http://www.baseball-reference.com/about/parkadjust.shtml
Calculation of Park Factors

[Sean Forman:] I largely follow the method spelled out below. Historically, B-R has been using single-year park factors for recent years and 3-year park factors historically. I have changed that to now use 3-year factors by default for all years. Of course, the current season is only really a 2-year factor. The current year and last year. This can lead to some big changes in the numbers, from what had been on the site.

[and following TotalBaseball.com or Pete Palmer:] We use a three-year average Park Factor for players and teams unless they change home parks. Then a two-year average is used, unless the park existed for only one year. Then a one-year mark is used.
<<


622. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 28, 2008 at 08:03 AM (#2997538)
By the way, I could do 2006 and 2007, but the only access I can get to Dewan's Plus/Minus for those years is by typing in every player's name individually on billjamesonline.com and manually entering the results at every position. While I devote a lot of time to this, I don't have *that* much time. Anyone have any ideas to help?
   630. Paul Wendt Posted: October 30, 2008 at 11:31 PM (#3000090)
please excuse the repeat but it is too late to highlight the heading or web address and this is lost in #629

Park Adjustments at Baseball-Reference
>>
Calculation of Park Factors

[Sean Forman:] I largely follow the method spelled out below. Historically, B-R has been using single-year park factors for recent years and 3-year park factors historically. I have changed that to now use 3-year factors by default for all years. Of course, the current season is only really a 2-year factor. The current year and last year. This can lead to some big changes in the numbers, from what had been on the site.

[following TotalBaseball.com, ie Pete Palmer:] We use a three-year average Park Factor for players and teams unless they change home parks. Then a two-year average is used, unless the park existed for only one year. Then a one-year mark is used.
<<
   631. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 12:19 AM (#3000121)
stax, I'm talking in shorthand here, because I think it's safe to say that we can divide the HoM electorate into two camps: those who find my system useful, in which case they are probably already familiar with its basic concepts, and those who don't, who wouldn't bother to read this thread anyway. If you feel like taking the time to read all 630 posts of this thread, you should be able to get a pretty good hold on what I'm doing. I am hardly a "higher-level stat creation wizard"--my methods are incredibly rudimentary and would clearly not pass even elementary academic scrutiny. But they're good enough for HoM purposes. :) Anyway, a brief summary of what I do would be the following:

1. I calculate wins above average (league average for hitting and baserunning, positional average for fielding), factoring in everything up to and including the kitchen sink: double plays net of opportunities, non-SB baserunning, sac flies, you name it. This is no different than adding BRAA1 and FRAA1 together from a BP card, except that it's far more comprehensive.

2. I adjust this mark for how easy the league was to dominate. If it seems like some years there are a bunch of guys with OPS+ over 170 and even up around 200, while in other years the league leader will barely crack 160, there's a reason for it: certain factors make it easier or harder for players to distance themselves from average (either positively or negatively). In particular, the higher a league's run scoring is, and the closer it is to an expansion, the more likely you are to see extremely high scores. So, for example, if any season were likely to generate two guys breaking Roger Maris's record in the same season, it was the 1998 NL: the league scored 4.6 runs a game (high by historical standards), and added two new teams on top of the two that joined in 1993. Conversely, when Mike Schmidt led the NL 4 times in 5 years with OPS+'s of 161, 156, 155, and 152, he was able to do so because he played in low-scoring leagues that hadn't expanded in 15 years. I correct for this factor (standard deviations in stat-speak).

3. I come up with a final wins-above-replacement score by subtracting the standard deviation-adjusted wins above average of a replacement player at the same position from the standard deviation-adjusted wins above average of the player in question. I determine this replacement level by starting with the performance of real live replacement players from 1985-2005 (based on Nate Silver's study of "Freely Available Talent," or players over age 27 earning less than twice the minimum salary), and then trace its evolution back over the years by using the average performance of the worst 3/8 of MLB starters at the position over 9-year periods. E.g., let's say that shortstops over age 27 earning less than twice the minimum salary averaged 3.0 standard deviation-adjusted wins below average per full season played from 1985-2005, and the worst 3/8 of MLB starting shortstops averaged 2.7 standard deviation-adjusted wins below average per full season played during the same time period. And for the 1976-84 period, let's say that the worst 3/8 of MLB starting shortstops averaged 3.8 standard deviation-adjusted wins below average per full season played. In that case, I would set a replacement level of 4.1 standard deviation-adjusted wins below average per full season played for shortstops in the year 1980 (3.8 for the worst regulars, and an extra 0.3 for the gap (of 3.0-2.7) between the worst-regulars average and the "freely available" average for 1985-2005). Thus, if we have a SS who was 1.5 batting wins below league average, 0.3 baserunning wins above league average, and 0.8 fielding wins above an average shortstop (after adjusting for standard deviations) in precisely half a season in 1980, he would be -1.5+.3+.8 + (4.1/2) = 1.65 wins above replacement.
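The arithmetic in step 3 can be sketched as follows. All figures are the illustrative ones from the post ("let's say..."), not actual replacement levels, and the function names are hypothetical.

```python
def replacement_level(worst_regulars_era, fat_baseline, worst_regulars_baseline):
    """Replacement level (wins BELOW average per full season) for a given era.

    worst_regulars_era: worst-3/8-of-starters average for the era in question.
    fat_baseline / worst_regulars_baseline: the 1985-2005 "freely available
    talent" and worst-regulars averages, whose gap is carried back in time.
    """
    return worst_regulars_era + (fat_baseline - worst_regulars_baseline)


def wins_above_replacement(bwaa, brwaa, fwaa, rep_level, season_fraction):
    """Stdev-adjusted wins above average plus the prorated replacement gap."""
    return bwaa + brwaa + fwaa + rep_level * season_fraction


# Shortstops in 1980, using the made-up numbers from the example.
rep_1980_ss = replacement_level(3.8, fat_baseline=3.0, worst_regulars_baseline=2.7)
warp = wins_above_replacement(-1.5, 0.3, 0.8, rep_1980_ss, season_fraction=0.5)
```

This reproduces the 4.1 replacement level and the 1.65 wins above replacement from the worked example, up to floating-point rounding.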

Paul Wendt, that spreadsheet is definitely intended to be a quick reference of ease-of-domination and the defensive spectrum for the time period covered by my system. Those two factors are the main added value that my system offers above what BP WARP and WS provide.
   632. Heinie Mantush (Krusty) Posted: October 31, 2008 at 01:02 AM (#3000157)
Thanks again for all your work, Dan. It really is a fantastic resource. A quick question, though: do you factor in "clutch" vis-a-vis WPA or some other metric? I've skimmed through the thread (I intend to read it all later- promise!), and I didn't see anything.
   633. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 01:11 AM (#3000163)
No, definitely not. I do have that data and am playing around with it--I think it is perfectly relevant to HoM debates--but it's not the direction I've chosen to go in with this system.
   634. Heinie Mantush (Krusty) Posted: October 31, 2008 at 01:58 AM (#3000175)
Ah, very well then. I'm curious as to where I could find that sort of information? It must be fascinating to read through.
   635. stax Posted: October 31, 2008 at 06:06 AM (#3000249)
Heh, I understand what you're doing, it's just the how that I couldn't do so I'm thankful for. :)
   636. stax Posted: October 31, 2008 at 06:08 AM (#3000251)
Actually, looking at what you posted really fast, do you do step #2 (adjusting for how easy a league was to dominate) for defense? For example, does your method in any way alter Tris Speaker's defensive value because he played in an era and park that allowed for massive defensive value (I guess not really a 'dominating' question, but still). Even beyond that Speaker example, is there such a thing as a league that's easier to dominate defensively?
   637. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 11:46 AM (#3000274)
Herschel--fangraphs.com has WPA for all player-seasons since 1974.

stax--You hit on a very important methodological question. At the moment, my standard deviation adjustment is applied to total wins above average, hitting + fielding + baserunning. I am still quite unsure of whether I should attempt to break this up or not. I certainly couldn't try to do so given the current data I have, where I am the first to admit that for pre-1956 seasons I am completely reliant on junk stats (Fielding WS and FRAA). But I hope to have DRA soon. If the correlations to the PBP metrics come out strongly enough, the next version of my WARP will be DRA-based for that period.

Unlike FRAA and Fielding WS, which have some sort of built-in caps (of course we know that Fielding WS does), DRA does not resort to such half-baked solutions. As a result, its standard deviation shows a pronounced decline over time, with prewar shortstops frequently posting +30 marks, occasionally +40, and even reaching as high as +50 in one case. Part of this is obviously real and legit, since SS simply had more ground balls to field back then than they do now. But another part, I think, is that the DIPS issue becomes harder to tackle as we go backwards in time. DRA credits pitchers for all grounders to the pitcher and infield popups, and fielders for all other balls--it assumes that pitchers have no ability to induce easy-to-field balls other than popups for their defenses and grounders to themselves. This assumption works reasonably well in the modern game, as shown by the metric's strong correlations to the PBP's. But I do NOT think it is a safe assumption to make before WWII and particularly in the deadball era, when far more pitchers could and did make their living off inducing easy-to-field balls than today's small contingent of knuckleballers. Just look at a guy like McGinnity; he's all BABIP, all the time. Unless that's all coming in the form of popups and groundouts to the pitcher, DRA is going to see him as a merely average pitcher, and the fielders behind him as wizards.

Disentangling pitching and defense is challenging in any era, and obviously more so the further back we go in time. If I can ever figure out a way to do it, then I would be able to achieve what I consider to be the Holy Grail: an integrated WARP system that traces the pitching/fielding split (with its quite substantial consequences for standard deviations) back over time. But until then, I'm afraid I'm just going to have to keep muddling along for the pre-Retrosheet era. They do seem to push the Retrosheet frontier back a bit further every year, so perhaps eventually there will be enough data available that this issue will resolve itself automatically.
   638. stax Posted: October 31, 2008 at 02:56 PM (#3000401)
My other question is this: Would adjusting for an easy-to-dominate-defensively league be timelining? I heard a lot about Speaker (this is why I used him as an example) and how should his defense be treated (since he would not be nearly as valuable in a modern park with modern pitchers, hitters, and balls). Would adjusting for std deviations in defense be proper, timelining, or some combination of the two?
   639. Mike Emeigh Posted: October 31, 2008 at 03:15 PM (#3000422)
But another part, I think, is that the DIPS issue becomes harder to tackle as we go backwards in time. DRA credits pitchers for all grounders to the pitcher and infield popups, and fielders for all other balls--it assumes that pitchers have no ability to induce easy-to-field balls other than popups for their defenses and grounders to themselves. This assumption works reasonably well in the modern game, as shown by the metric's strong correlations to the PBP's.


The metric correlates well to the PBPs in the modern game because Michael is using the PBPs to validate his metric. That doesn't validate the assumption.

PBP metrics still correlate fairly well with opportunity, although not to the same extent that non-PBP metrics do. Infielders playing behind groundball pitchers will, on balance, fare better in the PBP metrics than do infielders playing behind flyball pitchers, and the reverse tends to hold true for outfielders. Teams that have few, or many, left-handed pitchers will tend to show skews based on the platoon differential (few LHP=more RHB against=more balls to the left side of the infield=better SS and 3B ratings and lower 2B and 1B ratings), although since platooning has declined this is becoming less of a factor.

-- MWE
   640. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 03:24 PM (#3000431)
Huh? Certainly we'd expect infielders playing behind GB pitchers to have a higher *standard deviation* of PBP scores than infielders playing behind FB pitchers, since they get more chances to distinguish themselves from each other. But that should hurt bad fielders as much as it helps good ones--if Derek Jeter is -15 now, he'd probably be -20 if the Yankees added Webb, Lowe, and Carmona to their staff. The average should still be zero, except to the extent that GM's are intelligent and seek to match their staffs and defenses by pairing sinkerballers with good infielders and bad outfielders, and flyball pitchers with the reverse.
   641. Mike Emeigh Posted: October 31, 2008 at 03:34 PM (#3000438)
But that should hurt bad fielders as much as it helps good ones--if Derek Jeter is -15 now, he'd probably be -20 if the Yankees added Webb, Lowe, and Carmona to their staff.


It doesn't. Adding groundball pitchers to the staff tends to lift the rankings of ALL infielders, bad and good.

-- MWE
   642. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 03:45 PM (#3000454)
Can you show me data on this? There's certainly no reason that should be the case, and if it were truly a strong trend, it would cast serious doubt on the reliability of the underlying metrics.
   643. AROM Posted: October 31, 2008 at 03:48 PM (#3000458)
It seems that guys like Lowe, Webb, and Wang generate a lot of easily fielded groundballs, which helps the infielder PBP ratings. I would check with SG at Replacement Level Yankees; he's been tracking zone rating on a day-by-day basis and could probably tell you what Jeter's ZR is playing behind Wang compared to other pitchers.
   644. David Concepcion de la Desviacion Estandar (Dan R) Posted: October 31, 2008 at 03:56 PM (#3000467)
That still means the system's not working. If they are truly easily fielded, they should have a probability of being turned into outs of about .98, and therefore the infielders should only be credited with .015 runs or so for fielding them.
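A minimal sketch of the crediting logic behind #644, assuming the usual play-by-play convention that a fielder who makes a play is credited with the share of an out he was not already expected to get. The 0.75-run swing between a hit and an out is an assumption chosen here because it reproduces the ".015 runs or so" figure; the post does not state it. Function names are hypothetical.

```python
RUN_VALUE_HIT_VS_OUT = 0.75  # assumed run swing between a hit and an out


def credit_for_play_made(p_out, run_value=RUN_VALUE_HIT_VS_OUT):
    """Runs credited when a ball with out probability p_out is turned into an out."""
    return (1.0 - p_out) * run_value


def debit_for_play_missed(p_out, run_value=RUN_VALUE_HIT_VS_OUT):
    """Runs charged when the same ball is not turned into an out."""
    return -p_out * run_value


easy_grounder_credit = credit_for_play_made(0.98)

# An exactly average fielder converts the ball with probability p_out,
# so his expected credit on it is zero.
expected_for_average = (0.98 * credit_for_play_made(0.98)
                        + 0.02 * debit_for_play_missed(0.98))
```

Under these assumptions, extra easy grounders add almost nothing to a fielder's total, which is the point being argued in the post.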
   645. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 06, 2008 at 11:45 PM (#3004661)
Some quick housekeeping: I had miscalculated the BRWAA in the updated 1987-2005 numbers, so they were too low and the BWAA were too high. That has now been corrected in the file posted to the Yahoo group. The final WARP numbers (both before and after the stdev adjustment) are the same, but the BWAA/BRWAA breakdown is now different.
   646. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 12, 2008 at 10:09 PM (#3007781)
Why is it that elsewhere on the Internets, the replacement level "question" is treated as settled law (http://www.hardballtimes.com/main/article/replacement-level-article/, and every single BTF thread about player valuation on the free agent market), while at the HoM we continue to bicker over whether the sun revolves around the earth?

Also, no feedback/comments from anyone on 19 years' worth of new WARP numbers? Gary Sheffield as one of the worst fielders who wasn't moved to 1B or DH ever? Kenny Lofton as the 1994 AL MVP and a HoM candidate? Robin Ventura as the 1999 NL MVP? A-Rod already just one MVP award behind Bonds? (Bonds 1990-96 except '94 and 2000-04 except '03 when Pujols beats him; A-Rod 1996, 98, 2000-05, 07) Discuss, people!
   647. JPWF13 Posted: November 12, 2008 at 10:44 PM (#3007812)
Why is it that elsewhere on the Internets, the replacement level "question" is treated as settled law (http://www.hardballtimes.com/main/article/replacement-level-article/, and every single BTF thread about player valuation on the free agent market


You are not reading enough threads: the concept of replacement level is in dispute in many places, and where the concept itself is not in dispute, the actual location of that level is.
   648. Juan V Posted: November 12, 2008 at 10:58 PM (#3007818)
Ventura as 1999 NL MVP? Would you mind giving a breakdown of him vs Chipper? And how far is he from your PHOM?
   649. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 13, 2008 at 01:07 PM (#3008019)
Yay! Someone asked a question! Here we go:

Chipper Jones: 157 G, 701 PA, 567 AB, 181 H, 41 2B, 1 3B, 45 HR, 110 unintentional BB+HBP, 18 IBB, 25 SB, 3 CS, 6 SF, 94 K. eXtrapolated Runs puts that line at 149.3 runs. Subtract 1 run for non-SB baserunning and you're at 148.3, and tack on 9.27 net double plays on offense, and you're down to 144.9. PF is 100, so that stays. That's a total of 404 outs, leaving 3,765 for his theoretical teammates, who averaged .1944 runs per out in the 1999 NL, so the Average Team Plus Chipper would score 144.9+(.1944*3765) = 877 runs.
On defense, both Dial and TotalZone (the only two metrics available for that season) put him at -10 runs. My weighted average of .43*TZ + .62*Dial + .00065*Innings comes out to, unsurprisingly, -10 runs. Furthermore, he turned only 5 double plays in 56 opportunities, costing him 3 more runs. And finally, I have to subtract 2 more runs from his defense (as I do from every 1999 NL 3B) to get the league total to add up to 0. So he's 15 runs below average on D. The average team in the 1999 NL had 810 runs, so the average team with Chipper at 3B would allow 825 runs.
Pythagoras says a team with 877 RS and 825 RA will win 85.8 games. That's 4.8 wins above average, which if you multiply by the standard deviation adjustment of .901 for the 1999 NL (one year after an expansion and 6 years after another one, scoring 5 runs a game), gets reduced to 4.3 wins above average. My replacement level for 3B in the 1999 NL is 1.5 standard deviation-adjusted wins below average per 696 PA, and Chipper had 701 PA, so 4.3+(1.5*701/696) = 5.8 WARP2.

Robin Ventura: 161 G, 670 PA, 588 AB, 177 H, 38 2B, 0 3B, 32 HR, 67 unintentional BB+HBP, 10 IBB, 1 SB, 1 CS, 5 SF, 109 K. eXtrapolated Runs gives that 110.9 runs. He was a perfectly league average non-SB baserunner and hit into a league average number of double plays given his opportunities, and his park factor was 98, so he winds up with 113 XR. He had 417 outs, leaving 3,752 for his teammates, so the Average Team Plus Ventura would score 113+(3752*.1944) = 842 runs.
Dial has Ventura at 23 runs above average on defense, and TotalZone at 26. So .43*26 + .62*23 + .00065*1356 makes 26 runs above average. He also turned 27 double plays in 57 opportunities, which is 3 more runs above average, bringing him to 29. Subtracting the same 2 for the league constant makes him 27 runs above average. Taking the league average 810 runs a team and subtracting 27 translates to 783 runs allowed.
842 RS and 783 RA are a Pythagorean record of 86.7 wins, 5.7 above average. Multiplying that by .901 reduces it to 5.1 wins above average. Add on 1.5 stdev-adjusted wins per 696 PA (Ventura had 670), and you get 6.6 WARP2, fully 0.8 higher than Jones.
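For anyone who wants to replay the arithmetic in the two examples above, here is a minimal Python sketch. The function names are made up for illustration, and the win estimator is an assumption: the post only says "Pythagoras", so the sketch uses a Pythagenpat-style exponent of ((RS+RA)/G)^0.287, which happens to land close to the posted win totals. The 4,169 team outs are simply 404 + 3,765 from the Chipper line. Because the post rounds at each step, the results can differ from the posted WARP2 figures by about a tenth of a win.

```python
def team_runs_scored(player_runs, player_outs, lg_runs_per_out, team_outs=4169):
    """Runs scored by an otherwise league-average team with this player in it."""
    return player_runs + lg_runs_per_out * (team_outs - player_outs)


def pythag_wins(rs, ra, games=162, pat_exp=0.287):
    """Expected wins. The Pythagenpat exponent is an assumption, not from the post."""
    x = ((rs + ra) / games) ** pat_exp
    return games * rs**x / (rs**x + ra**x)


def warp2(rs, ra, pa, stdev_adj, repl_per_696, games=162):
    """Stdev-adjusted wins above average plus the playing-time-scaled replacement gap."""
    waa = pythag_wins(rs, ra, games) - games / 2
    return waa * stdev_adj + repl_per_696 * pa / 696


# Inputs below are the figures quoted in the post for the 1999 NL.
chipper_rs = team_runs_scored(144.9, 404, 0.1944)
ventura_rs = team_runs_scored(113.0, 417, 0.1944)

chipper = warp2(round(chipper_rs), 810 + 15, 701, 0.901, 1.5)
ventura = warp2(round(ventura_rs), 810 - 27, 670, 0.901, 1.5)

print(f"Chipper: {round(chipper_rs)} RS, WARP2 about {chipper:.1f}")
print(f"Ventura: {round(ventura_rs)} RS, WARP2 about {ventura:.1f}")
```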

That's the long version. But to boil it down, metrics drawing on two different data sets (STATS and Retrosheet) both agree that Ventura had an ungodly-great fielding season in 1999, one of the best in an excellent defensive career, while Jones had quite a poor one, the worst by far in a fine defensive career. The gap between the two in the field was a massive 42 runs, significantly larger than their differences on offense.
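The fielding side of the two examples can be sketched the same way. The weights (.43, .62, .00065) and the double-play and league-constant adjustments are the ones quoted above; the function names are hypothetical, and rounding the blend to a whole run is done only to mirror the post. Chipper's innings are not given in the post, so his blended figure of -10 is taken as stated rather than recomputed.

```python
def blended_range_runs(tz, dial, innings):
    """Weighted average of TotalZone and Dial, plus the innings term from the post."""
    return 0.43 * tz + 0.62 * dial + 0.00065 * innings


def fielding_runs(range_runs, dp_runs, league_constant):
    """Range runs plus double-play runs plus the league-balancing constant."""
    return range_runs + dp_runs + league_constant


# Ventura 1999: TZ +26, Dial +23, 1356 innings, +3 on double plays, -2 league constant.
ventura_range = round(blended_range_runs(26, 23, 1356))
ventura_field = fielding_runs(ventura_range, 3, -2)

# Chipper 1999: blend given as -10 in the post, -3 on double plays, -2 league constant.
chipper_field = fielding_runs(-10, -3, -2)

gap = ventura_field - chipper_field
print(ventura_field, chipper_field, gap)
```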

Ventura comes out to $135M in my salary estimator. The HoM in/out line is about $150M. He'd need another strong All-Star year like his 1995 to contend for my PHoM, and more than that to make my ballot. I have Buddy Bell ahead of him in the glove-first 3B queue.
   650. Al Peterson Posted: November 13, 2008 at 08:25 PM (#3008373)
Couple observations from DanR's new WARP numbers

2nd best player in the National League for 1993: Could it be Jay Bell?

Check out Derek Jeter in 1999. That deserves a monster *fist pump*

Darin Erstad really was a wonderful fielder. He has some of the best fielding seasons for the time period at CF, corner OF, and 1B. Plus he could punt the opponents inside the 20 yard line if necessary.
   651. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 14, 2008 at 02:47 AM (#3008810)
A 124 OPS+ shortstop who wins a very deserving Gold Glove over 154 games? That sure sounds like an MVP candidate in the non-Bonds division to me.

What's surprising about Jeter's '99? He hit .349 with 24 homers at shortstop. I mean, 'nuff said.

A very old version of UZR had Erstad's '02 at +57 runs. Anyone with even a passing familiarity with PBP metrics knows that Erstad was a historically great, possibly Mays/Speaker-caliber defender when he was on the field. But perhaps precisely the all-out style that enabled him to get to so many balls was also what prevented him from playing more games. And obviously, he never hit a lick after that 240-hit season. Strange career.
   652. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 21, 2008 at 04:13 AM (#3013609)
In case anyone's curious (as I was), I ran A-Rod, Pujols, and Chipper for 2006-08 to see where they were alltime. Here are the 3 (using updated numbers for D and baserunning, regressing everyone an extra 2.5% to the mean to account for the bigger stdev of the new D/baserunning numbers, and using a .975 stdev adjustment for '06-'08).

Alex Rodríguez

Year  SFrac  BWAA  BRWAA  FWAA  Replc  WARP
1994   0.12  -0.8   -0.1  -0.5   -0.4  -0.9
1995   0.24  -0.6    0.0  -0.7   -0.8  -0.5
1996   0.95   4.7    0.3   0.0   -3.5   8.6
1997   0.91   1.4    0.6  -0.5   -3.2   4.7
1998   1.07   3.6    0.1   0.8   -3.7   8.1
1999   0.82   2.3    0.3   0.0   -2.9   5.5
2000   0.96   5.7    0.4   1.0   -3.3  10.3
2001   1.06   5.5    0.5   0.5   -3.4   9.9
2002   1.05   5.1    0.4   0.8   -3.4   9.7
2003   1.03   3.8    0.1   0.4   -3.2   7.6
2004   1.00   2.8    0.3   1.6   -2.3   6.9
2005   1.04   6.9    0.0   0.1   -2.3   9.3
2006   0.97   2.7    0.0  -1.0   -2.2   3.9
2007   1.02   6.6    0.5   0.3   -2.2   9.6
2008   0.86   3.3    0.3   0.2   -1.8   5.7
TOTL  13.11  53.1    3.7   3.0  -38.6  98.4
TXBR  12.75  54.5    3.8   4.2  -37.4  99.8
AVRG   1.00   4.1    0.3   0.2   -2.9   7.5


3-year peak: 29.9
7-year prime: 65.5
Career: 99.8
Salary: $346,409,212--above Robinson, Ripken, and Henderson, tied with Ott, below Schmidt and Morgan and Gehrig, #2 among primary-SS


Albert Pujols

Year  SFrac  BWAA  BRWAA  FWAA  Replc  WARP
2001   0.99   4.9   -0.1   0.6   -1.0   6.4
2002   0.99   4.5    0.0   0.4   -1.0   6.0
2003   1.00   8.0    0.5   0.8   -0.7  10.0
2004   1.01   6.4    0.1   1.1   -0.2   7.8
2005   1.03   6.4    0.4   0.4   -0.2   7.4
2006   0.92   6.5    0.3   1.5   -0.2   8.6
2007   0.98   5.1   -0.1   2.2   -0.2   7.4
2008   0.93   7.8    0.0   1.9   -0.2   9.9
TOTL   7.85  49.6    1.0   8.9   -3.8  63.4
AVRG   1.00   6.3    0.1   1.1   -0.5   8.1


3-year peak: 28.4
7-year prime: 57.4
Career: 63.4
Salary: $219,466,929--already ahead of Frank Thomas; obviously Thomas has more career bulk than Pujols to this point, but Pujols bests him significantly on peak due to the defensive value. Close to Griffey. Above guys like Reggie Jackson and Jesse Burkett. Among 1B, will pass Bagwell next year, and then Greenberg and Mize await.


Chipper Jones

Year  SFrac  BWAA  BRWAA  FWAA  Replc  WARP
1995   0.99   1.3    0.4   0.7   -1.2   3.6
1996   1.01   3.6    0.4   0.3   -1.8   6.1
1997   0.99   1.9    0.1   0.6   -1.4   4.0
1998   1.03   4.6    0.1   0.6   -1.6   6.9
1999   1.01   5.4    0.2  -1.3   -1.5   5.8
2000   0.99   3.7    0.3  -0.1   -1.6   5.5
2001   0.99   5.4    0.0  -0.4   -1.6   6.5
2002   0.97   4.9    0.1   0.6   -0.9   6.5
2003   0.96   4.1    0.0  -0.1   -0.9   4.9
2004   0.82   1.8   -0.1   1.0   -1.2   3.9
2005   0.63   3.2    0.0   0.1   -1.0   4.3
2006   0.69   3.8   -0.2  -0.7   -1.1   4.1
2007   0.87   5.4    0.1   0.4   -1.4   7.3
2008   0.78   5.7    0.0   0.8   -1.3   7.8
TOTL  12.73  54.8    1.5   2.5  -18.5  77.2
AVRG   1.00   4.3    0.1   0.2   -1.5   6.1


3-year peak: 22.0
7-year prime: 46.9
Career: 77.2
Salary: $228,317,409--same ballpark as Pujols overall at this point. Among 3B, he's one good year away from Brett and Boggs; pretty damn impressive. Will need two more MVP-type seasons, or 5 more All-Star seasons, to reach Mathews. Crazy late-career peak, no? 1999 is a perfect storm: tons of net double plays, very poor fielding, and a super high standard deviation (2nd highest-scoring NL season since 1930, one year removed from one expansion and six years removed from another).
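A note on how the peak/prime/career lines relate to the tables: the post does not define them, but the posted figures for A-Rod and Chipper are consistent with peak being the sum of the three best seasons (not necessarily consecutive), prime the sum of the seven best, and career the sum of all seasons above replacement (the TXBR row for A-Rod). The sketch below checks that reading against the WARP column; the function name is hypothetical, and because the table values are already rounded, a player's total can be off from the posted number by 0.1.

```python
def summarize(warp_by_year):
    """Peak (best 3), prime (best 7), and career (positive seasons only) WARP."""
    seasons = sorted(warp_by_year.values(), reverse=True)
    return {
        "peak3": round(sum(seasons[:3]), 1),
        "prime7": round(sum(seasons[:7]), 1),
        "career": round(sum(w for w in seasons if w > 0), 1),
    }


# WARP column copied from the A-Rod and Chipper tables in this post.
arod = {1994: -0.9, 1995: -0.5, 1996: 8.6, 1997: 4.7, 1998: 8.1, 1999: 5.5,
        2000: 10.3, 2001: 9.9, 2002: 9.7, 2003: 7.6, 2004: 6.9, 2005: 9.3,
        2006: 3.9, 2007: 9.6, 2008: 5.7}
chipper = {1995: 3.6, 1996: 6.1, 1997: 4.0, 1998: 6.9, 1999: 5.8, 2000: 5.5,
           2001: 6.5, 2002: 6.5, 2003: 4.9, 2004: 3.9, 2005: 4.3, 2006: 4.1,
           2007: 7.3, 2008: 7.8}

arod_summary = summarize(arod)
chipper_summary = summarize(chipper)
print(arod_summary)
print(chipper_summary)
```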
   653. Juan V Posted: November 21, 2008 at 04:18 AM (#3013610)
That Pujols guy, he's kinda good.
   654. DL from MN Posted: November 21, 2008 at 04:33 PM (#3013754)
Assuming a normal career shape, could Pujols challenge Gehrig?
   655. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 21, 2008 at 05:14 PM (#3013789)
Easily. He's nearly 60% of the way there, and he's only 28. The key thing to remember about Pujols is that he's not really a first baseman in the traditional sense--according to the numbers, which I have no reason to doubt, his fielding there is so good (Sean Smith has him projected for +12 in 2009, and that's with a very healthy dose of regression) that his overall defensive value is more like that of a 2B/3B/CF. (Indeed, he was originally a perfectly fine 3B, and was only moved to first to keep him healthy). He's a significantly above-average baserunner for 1B as well, probably a good 2-3 runs better than the positional average.

Gehrig was certainly a better hitter than Albert--through his first 8 seasons, he averaged 7.0 BWAA2 per season, vs. 6.3 for Pujols--but the rest of his game was merely average. As a result, Pujols is 15% ahead of where Gehrig was at the same age (he did debut a year earlier, which helps). Needless to say, Gehrig's career would have been greater had he not come down with, you know, Lou Gehrig's disease.
   656. DL from MN Posted: November 21, 2008 at 05:46 PM (#3013823)
I'll have to make a point of it to pay more attention to watching Pujols considering he may be the best 1B ever. Challenging Musial for "best Redbird" would require a pretty shallow decline.
   657. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 22, 2008 at 03:01 AM (#3014142)
Make sure you pay attention to the fielding--that's what makes him a contender. He's Jimmie Foxx and Keith Hernandez rolled up into one.
   658. Blackadder Posted: November 22, 2008 at 04:03 AM (#3014147)
Dan, thanks for the new numbers, but this reminds me of a potential concern I had with your updated 1987-2005 numbers, at least for the purposes of historical comparisons. As you seem to acknowledge (and if this is an incorrect assumption on my part then these concerns are largely mitigated), the standard deviations of your defensive and baserunning numbers are higher in the 1987-2005 data (which I'll call V2) than they are in the 1893-2005 data (V1). In the past, what you have done is standard deviation adjusted all fielding numbers so that they had the same standard deviation each year, thus assuring that the share of value allocated to fielding at least is constant throughout baseball history. Now, however, it seems that instead of adjusting the standard deviation of the new defensive and baserunning numbers, you are adding them as is and THEN applying a standard deviation adjustment to the overall resulting WARP number. As such, V2 should have the same standard deviation, roughly, as V1, but the shape of the variance among players could differ somewhat. More specifically, V2 should have a larger standard deviation among the population for fielding and baserunning than V1, and thus also, by necessity, a smaller standard deviation in hitting and positional value. In effect, this means that V2 thinks that fielding and baserunning are a larger share of value than V1 does.

Now, maybe doing so is really what you intended to do, but it does seem that certain biases are introduced for the purposes of historical comparisons. For instance, a player in V2 who was almost neutral in fielding and baserunning, but was a very good hitter, would suffer in comparison to a similar player in V1, since the V2 player would be hit with a larger overall standard deviation adjustment. This is true even if hitting is a larger relative component of value today than it was in the past, which seems if anything more likely to be true than the opposite. I am not sure how large this effect is; I tried looking briefly at some guys with extreme fielding value and guys with neutral fielding value relative to how they were valued in the older version of WARP, but unfortunately there are too many other confounding factors to tell (it appears, for instance, that some of the baseline offensive numbers have also changed). Still, this is something to bear in mind. It may even have some relevance to some of the points mentioned above, e.g. Gary Sheffield's particularly bad showing in this version.
   659. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 22, 2008 at 05:18 AM (#3014161)
Blackadder, thanks for your attention to my work. A correction: in the past, I have NOT standard deviation adjusted all fielding numbers so that they had the same standard deviation each year. I standard deviation adjusted the fielding numbers so that BP FRAA and Fielding WS both had the same stdev at each position over the 1987-2005 time period as Dial's Zone Rating-based numbers. But I kept those multipliers constant over time, so that if the standard deviation of BP FRAA or Fielding WS was higher or lower in a given year or time period than it was for 1987-05, that greater/lesser variation made it into my final WARP.

Nonetheless, your point that "V2 thinks that fielding and baserunning are a larger share of value than V1 does" is correct. That is because V2 is more accurate than V1--V1 has less sophisticated tools to perceive fielding and baserunning value, so it regresses it to the mean. (In fact, this is true within V2 as well--the observed stdev for '87-'99 should in theory be lower than '00-'02, which should be lower than '03, which should be lower than '04-'05, as the actual PBP metrics come on board to replace their proxy stats). But comparing V1 to V2 will thus introduce distortions. There are two possibilities.

The first option is to accept that V2 has a (slightly) higher standard deviation than V1, which should lead V2 players as a group to have slightly higher WARP2 scores than V1 players as a group (assuming that the extra variance is distributed randomly, which is in fact probably false). The second is to reduce its stdev to account for that, which as you correctly note means that a 100 OPS+, average fielder at a position will have a lower score in V2 than in V1.

I chose to go for the second approach, but you could easily take the first: just multiply all the "2" numbers (BWAA2/BRWAA2/FWAA2/Replc2/WARP2) by 1.03. I can certainly see a very strong case that that is the better way: we know more about these players, so we can say with greater confidence that they were/weren't HoM'ers than their predecessors. Of course, none of this would affect the relative value of Sheffield (or anyone else) within his cohort; it would only affect the valuation of 1987-05 players as a group to 1893-1986 players as a group.
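To make the trade-off between the two options concrete, here is a toy Python sketch. Every number in it is made up: the component standard deviations (in wins) were picked only so that the rescaling factor lands near the 1.03 mentioned above, and the components are assumed to be independent so that their variances add. It is not drawn from the actual WARP files.

```python
import math


def total_sd(component_sds):
    """SD of a sum of independent components (variances add)."""
    return math.sqrt(sum(sd * sd for sd in component_sds))


# Hypothetical per-season SDs in wins; only the fielding/baserunning term differs.
v1 = {"batting": 2.0, "fielding_baserunning": 0.6}
v2 = {"batting": 2.0, "fielding_baserunning": 0.8}

# Second option: shrink V2 so its total SD matches V1.
shrink = total_sd(v1.values()) / total_sd(v2.values())
# First option: leave V2 alone, i.e. multiply the shrunk numbers back up.
undo = 1 / shrink

# A hitter worth +5.0 wins with exactly average fielding and baserunning.
pure_hitter_v1 = 5.0
pure_hitter_v2_shrunk = 5.0 * shrink

# Share of total variance attributed to fielding and baserunning in each version.
share_v1 = v1["fielding_baserunning"] ** 2 / total_sd(v1.values()) ** 2
share_v2 = v2["fielding_baserunning"] ** 2 / total_sd(v2.values()) ** 2

print(f"shrink factor {shrink:.3f}, undo factor {undo:.3f}")
print(f"pure hitter: V1 {pure_hitter_v1:.2f}, V2 after shrinking {pure_hitter_v2_shrunk:.2f}")
print(f"fielding/baserunning variance share: V1 {share_v1:.2f}, V2 {share_v2:.2f}")
```

Under these toy inputs the glove-neutral hitter loses a bit of value in the shrunk V2, which is the bias Blackadder describes, while multiplying by the undo factor restores him at the cost of a wider overall spread for the V2 group.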

The baseline offensive numbers have changed only in the cases where the park factors in the Lahman database changed. Some of those adjustments were quite substantial, by up to 7 points of PF (I think this mainly has to do with their previously using a 1-year PF for post-1999 seasons, but there are some other random cases as well).
   660. Blackadder Posted: November 22, 2008 at 08:02 AM (#3014185)
Dan, thanks for the response. I don't disagree with anything you say there; personally, I would probably be more inclined to take the other approach, and just accept the higher SD of the 1987-2005 numbers, but that seems to be to a large extent a matter of taste.

On more concrete terms, how many people have the kind of late career surge Chipper Jones has enjoyed? Or to be more concrete: how many great players have had their two best seasons over the age of 35? I mean, you obviously have Bonds, but I can't think of anyone else. Aaron had his career high OPS+ of 194 at 37, and Williams' second highest was 233 at 38, but because of defense and playing time neither of those years ranked among their best. No other great player comes to mind.

Oh yeah, and Yale sucks.
   661. Paul Wendt Posted: November 22, 2008 at 03:25 PM (#3014218)
#658
V2 should have the same standard deviation, roughly, as V1, but the shape of the variance among players could differ somewhat. More specifically, V2 should have a larger standard deviation among the population for fielding and baserunning than V1, and thus also, by necessity, a smaller standard deviation in hitting and positional value. In effect, this means that V2 thinks that fielding and baserunning are a larger share of value than V1 does.

yes, except in case of some big player recruiting and development changes at the same time.
(If everyone capable of top-quintile batting and fielding and baserunning now plays soccer football, so that baseball's very best prospects are top batters or top fielders but not both . . .)

--
P.S.
C.C. Sabathia is from California.
Carlos Zambrano is from Venezuela.
They are extraordinary batters and pitchers both.
- By OPS+ they are absolutely stronger batters than any of the long-career pitchers who debuted in
the 1960s, Jim Hunter and Rick Wise and Jim Maloney at OPS+ = <52, 51, 44>; or in
the 1970s, Rick Rhoden and Dave Stewart and Rick Sutcliffe at OPS+ = <59, 40, 30>; not to mention
the 1980s, Doc Gooden and Orel Hershiser and Fernando Valenzuela at OPS+ = <32, 31, 30>.

Sabathia and Zambrano are not yet long-career pitchers at all. If they survive to pitch another 1000 innings, the chance is good that their relative batting skills OPS+ = <73, 59> will decline.
Has anyone looked at pitchers batting more systematically?
Is it true there has been some increase in the variation in pitcher batting skill (OPS+)
-- as the examples of Sabathia and Zambrano unsystematically suggest to me?
   662. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 22, 2008 at 03:56 PM (#3014227)
José Cruz Sr. had his at age 35 and 36, and Hank Sauer had his at 35 and 37. But they weren't great, just good. Yes, just Barry and Chipper. Everyone seems to agree Chipper's clean, though. Jim Eisenreich?

Sadly I'm still in South America, although enjoying a weekend at the beach in Uruguay. I get to the States on Thanksgiving day.
   663. Paul Wendt Posted: November 22, 2008 at 04:04 PM (#3014231)
#660
On more concrete terms, how many people have the kind of late career surge Chipper Jones has enjoyed? Or to be more concrete: how many great players have had their two best seasons over the age of 35? I mean, you obviously have Bonds, but I can't think of anyone else. Aaron had his career high OPS+ of 194 at 37, and Williams' second highest was 233 at 38, but because of defense and playing time neither of those years ranked among their best. No other great player comes to mind.

1.
I would have boldly declared that because of playing time Chipper Jones does not qualify --now playing about 80% rather than about 100% of team games. If indeed 2007 and 2008 have been his greatest seasons by some measure, be warned that your own intuitions may not match that measure very closely.

2.
Jones was a leftfielder during his last seasons as a full-time player. His return to thirdbase precisely matches his decline in playing time.

games played at primary fieldpos, Chipper Jones 1997-2008
152 158 156 152 149 : five seasons at 3B, 1997-2001
152 149 : two seasons at LF, 2002-2003
96 101 105 126 115 : five seasons at 3B, 2004-2008

Again I can only wonder aloud whether 2007 and 2008 have been his greatest seasons as a ballplayer!
and only warn that if that is true, it is quite unlikely that your own intuitions match that measure of greatness.

3.
At the beginning of the 2007 season Jones was only 24.11 yrs.mon old. If the answer to the original question is "Yes" then it is likely to be Yes only on a technicality.

4.
About ten years ago in Baseball's All-Time Best Hitters, author-statistician Michael Schell argued for truncation of careers at 8000 ABs or PAs. That is, for purposes of "all-time best" and "career" assessment, he counts the full career if shorter than 8000; otherwise only the first 8000.
During the argument for this move, he provided some illustration. Among all players in major league history (since 1871? 1876? 1893?), only Roberto Clemente improved his career average record notably after passing 8000. Luis Aparicio improved his career record slightly and maybe ranked second in mlb history.

I don't recall the two criteria that define the class of player-career-ends or the one measure that defines improvement. Only the bottom line that Clemente alone improved notably during his career's end. That was possible only because Pittsburgh used him as a full-time player for five seasons, ages 20-24, before he was a good batter. For the same reason only, his career's end defined by Schell began early. (If the criterion is plate appearances, Clemente was already there after the 1968 season, age 34.2; Jones only at the end of the 2007 season, age 35.6.)
   664. Paul Wendt Posted: November 22, 2008 at 04:08 PM (#3014235)
At the beginning of the 2007 season Jones was only 34.11 yrs.mon old. If the answer to the original question is "Yes" then it is likely to be Yes only on a technicality.

correction, 34.11 years old (b. 1972-04-24)

"only on a technicality"
How to Lie with Statistics, appendix N
Pretend that everyone's age is an integer. In baseball context, use so-called baseball age, the July 1 hanky-panky.
   665. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 22, 2008 at 06:33 PM (#3014268)
Paul Wendt, as far as '07 and '08 being Chipper's best two seasons, see my #652. His 1999 MVP year looks a lot less shiny after docking him for his crap fielding, double play propensity, and the recent expansions. He seems to have had a late-career defensive renaissance as well.
   666. HGM Posted: November 23, 2008 at 03:06 AM (#3014411)
The link in the main post doesn't work. I was just curious, is there anywhere to get the full methodology and results for all players?
   667. Blackadder Posted: November 23, 2008 at 11:52 PM (#3014617)
You can download it from the Hall of Merit group on Yahoo. This question comes up enough that I think it is worth adding a link at the top of this thread.
   668. Wes Parkers Mood (Mike Green) Posted: November 24, 2008 at 01:43 AM (#3014629)
I like the methodology, and concur with the comments about Pujols, Lofton and Ventura.
   669. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 24, 2008 at 09:39 PM (#3015063)
Thought Riot, the methodology is discussed over the length of the thread, but the most complete work-through example is at #491. And yes, just go to groups.yahoo.com, sign up for the Hall of Merit, and download the file.
   670. KJOK Posted: December 10, 2008 at 06:35 AM (#3024621)
Chone posted some hints about his WAR system, similar to Dan's in some ways I believe:

Hall of Very Good

Creating a list of the best players retired for 5+ years, Willie Mays (153 wins) and Hank Aaron (148) come out on top, shouldn't surprise too much. Going down the list, every player with at least 73 wins (Gary Carter) above replacement (WAR) is in the Hall. Every eligible player that is, sorry Charlie Hustle. So 73 is the magic number for HOF inclusion.

There are 7 players between 70 and 73, but only two are in the HOF, Carlton Fisk and Eddie Murray. The other five are the greatest members of the Hall of Very Good.

They are:
Tim Raines (72), best of the group
Lou Whitaker
Reggie Smith
Willie Davis
Ron Santo

Before I did this systematically, I thought 60 WAR was a strong qualification for HOF inclusion, but there are 23 players over 60 but under 70, and only 7 have made it to Cooperstown. They are: McCovey, Winfield, Killebrew, Stargell, Sandberg, Billy Williams, and Banks. Lessons learned: Get your counting stats, your 3000 hits or 500 homers, or be a Cub with a working pancreas.

The second tier of the HOVG:
Bobby Grich
Graig Nettles
Alan Trammell
Buddy Bell
Keith Hernandez
Mark McGwire
Jimmy Wynn
Andre Dawson
Willie Randolph
Dwight Evans
Dick Allen
Darrell Evans
Ken Boyer
Will Clark
Sal Bando
Bobby Bonds

There are 17 players in the 50-60 range. Only two, Tony Perez and Luis Aparicio, are in the hall.

Then there are 45 players in the 40-50 range. This is the lower tier of the HOVG, but three players of this group have made it to the HOF: Kirby Puckett, Orlando Cepeda, and Lou Brock. These are very questionable choices for the Hall, as there are many better players who have not received the honor. If Jim Rice (43 WAR) makes it this year, this is his group.

For now I'll settle on 40 WAR as the minimum standard for the Hall of Very Good. The dividing line is Bobby Murcer, in, Rick Monday (39.9) out. Maybe I'll have to go a bit lower to 35 or so - some players just below 40 who would commonly be described as very good: Harold Baines, Dave Concepcion, Doug DeCinces, Tim Salmon.

This is all subject to revision, there is a good chance that I'll need to adjust the position adjustments - they work for the most recent era, 2000+, but may need to be changed a bit for the past.
   671. AROM Posted: December 10, 2008 at 06:40 AM (#3024624)
Here's a link.
   672. Joey Numbaz (Scruff) Posted: December 12, 2008 at 07:44 PM (#3027647)
Does anyone know where DanR's military service credit formulas are located? I've been looking everywhere, but I can't find them. Thanks!
   673. Bleed the Freak Posted: December 12, 2008 at 08:01 PM (#3027670)
The war adjusted results are located in a csv file, under the revised Rosenheck WARP file, located at the Hall of Merit Yahoo! Group page. The file is the sixth from the bottom, updated Nov 6th

Hope this is what you were looking for.
   674. Joey Numbaz (Scruff) Posted: December 12, 2008 at 08:15 PM (#3027689)
Thanks Bleed, but those results just adjust players who played during the war to what conditions would have been like without a war.

I was looking for the formulas that fill in missing seasons for players who didn't play during the war. I am going to figure the numbers for everyone that missed time between 1942-45 . . .

I also have a question for Dan - let's say a player missed 1944-45, but played in 1943. When figuring out his credit for 1944-45, should I use his adjusted 1943, or his actual 1943?
   675. David Concepcion de la Desviacion Estandar (Dan R) Posted: December 13, 2008 at 12:33 AM (#3028061)
The war credit estimators are at http://www.baseballthinkfactory.org/files/hall_of_merit/discussion/credits_and_deductions_for_ww_ii_players/P100.

Definitely use his adjusted 1943.
   676. Joey Numbaz (Scruff) Posted: December 13, 2008 at 03:18 AM (#3028174)
Thanks Dan! Planning to run all of the numbers tomorrow.
   677. David Concepcion de la Desviacion Estandar (Dan R) Posted: December 27, 2008 at 03:22 PM (#3038837)
I would just like to note that the "Value Wins" statistic now being published at Fangraphs.com is extremely similar to my WARP. It's missing a few tidbits--the play-by-play baserunning, the net double plays for both hitters and fielders, OF throwing arms, the STATS database for fielding, etc.--and its replacement level and defensive spectrum are slightly different (although certainly within the realm of reasonable debate; I wouldn't venture to say that mine are necessarily closer to the truth than theirs). But the correlation between the two is probably about .98 or something. You should basically feel free to use them interchangeably until I get around to calculating my own 2006-08 and onwards numbers. Finally, there is a constantly updated metric posted on the Web for everyone that actually reflects the sabermetric consensus on valuation. God knows why it took so long.
   678. Blackadder Posted: December 27, 2008 at 03:54 PM (#3038841)
Apparently Clay Davenport is reworking BP's WARP, to include PBP fielding when it is available and a more realistic replacement level. Jay Jaffe quoted some of the preliminary results, and eye-balling it the replacement level still looked a little too low, but I'll withhold judgment until his system is public. Still, it is a very welcome development that there seems to be convergence in opinion about the correct methodology for player valuation.
   679. David Concepcion de la Desviacion Estandar (Dan R) Posted: December 27, 2008 at 04:02 PM (#3038844)
Yes, I saw that. Indeed, it seems like everyone is finally "coming around"...maybe now that BP, Fangraphs, Tango etc. have all made public roughly the same valuation scheme that I have, more HoM voters will take notice and we'll stop seeing so many of these worrying Perez and Puckett votes.
   680. Devin has a deep burning passion for fuzzy socks Posted: January 09, 2009 at 10:23 PM (#3047931)
This seems like as good a place as any to mention this: BP now has searchable stats for Batting Translations, Pitcher Translations, and WARP Leaderboards here. They only go back to 1901, and it's year-by-year, but I assume folks will find these helpful.
   681. Joey Numbaz (Scruff) Posted: January 09, 2009 at 11:49 PM (#3048043)
That is a serious big deal, thanks for pointing it out Devin.

I've been entering NRA and DERA by hand for every one of the 500 or so pitchers I've worked through my system.

Now I can just query what I need for each year, combine them all into one spreadsheet or database and I should be good to go.

That's where I get my adjustments for inherited runners and bullpen support, etc. - but I never noticed the translations were there - are they new?

I still need to know if Access has something equivalent to Excel's "solver" plug in. If it doesn't, I think I'm still going to have to do it by hand. Does anyone have any insight to this?
   682. HGM Posted: January 14, 2009 at 09:21 AM (#3051408)
Just want to get a little perspective here, which WARP numbers are "better", WARP1 or WARP2?
   683. Ivan Grushenko of Hong Kong Posted: January 14, 2009 at 10:16 AM (#3051410)
Yes, I saw that. Indeed, it seems like everyone is finally "coming around"...maybe now that BP, Fangraphs, Tango etc. have all made public roughly the same valuation scheme that I have, more HoM voters will take notice and we'll stop seeing so many of these worrying Perez and Puckett votes.

Congratulations! This is really exciting!
   684. HGM Posted: January 15, 2009 at 02:07 AM (#3052281)
One other question, reading through HoM threads, I've seen reference to WARP estimates for Negro Leaguers. Are these compiled anywhere? Any info on those (and really, any other compilations of Negro League MLE's/estimates) would be appreciated. Thanks.
   685. David Concepcion de la Desviacion Estandar (Dan R) Posted: January 19, 2009 at 04:36 AM (#3054989)
I've just been posting them on the individual NgL'ers player threads.
   686. Paul Wendt Posted: January 19, 2009 at 06:55 PM (#3055233)
681. Joe Dimino Posted: January 09, 2009 at 05:49 PM (#3048043)
That is a serious big deal, thanks for pointing it out Devin.

. . .
I still need to know if Access has something equivalent to Excel's "solver" plug in. If it doesn't, I think I'm still going to have to do it by hand. Does anyone have any insight to this?


It seems entirely contrary to the purpose of a database as opposed to a spreadsheet, so I would be surprised. Broadly the database is designed to select and rearrange data rather than derive from data. "Commonly" one finally exports a dataset and manipulates it in more sophisticated ways elsewhere.

If I'm right that Access does not directly "solve" what you wish, that doesn't mean you need to do it by hand. Maybe "Export" an appropriate query --that is, save what the query generates as a file outside the database (eg, a table in csv format). Manipulate it in some way that is beyond the database and save something (eg, a table in csv format). Import to the database.

I don't know how effectively such operations can be automated.
I don't know what you need to "solve", eg approximate a fixed point of some iteration.
So this may not help much except as it stimulates someone who knows and knows.
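
The export / manipulate / import round trip described above could be sketched roughly as follows. This is only an illustration: the column names (`player`, `warp1`, `lgadj`, `warp2`) and the derived-column step are hypothetical stand-ins for whatever actually needs to be "solved", and the CSV text is held in memory rather than read from a real Access export.

```python
import csv
import io

def add_derived_column(csv_text, new_name, func):
    """Read exported CSV text, append one computed column, return new CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + [new_name],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # func does the work the database itself cannot do
        row[new_name] = func(row)
        writer.writerow(row)
    return out.getvalue()

# Hypothetical export from a query; values are made up.
exported = "player,warp1,lgadj\nA,5.0,0.95\n"
result = add_derived_column(
    exported, "warp2",
    lambda r: round(float(r["warp1"]) * float(r["lgadj"]), 2),
)
print(result)
```

The returned text would then be saved to a file and imported back into the database as a new table.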


682. HGM Posted: January 14, 2009 at 03:21 AM (#3051408)
Just want to get a little perspective here, which WARP numbers are "better", WARP1 or WARP2?

That scare quotation is frightening.

Perhaps this concerns WARP by Clay Davenport published at BaseballProspectus.com (player 'DT cards'). Perhaps our Uberstats thread is useful or interesting.
   687. HGM Posted: January 19, 2009 at 11:35 PM (#3055452)
I was talking about Dan R's WARP, not BP's WARP.
   688. HGM Posted: January 19, 2009 at 11:37 PM (#3055454)
Basically, my question was which version, WARP1 or the standard deviation adjusted WARP2, is better for the purpose of comparing players across eras.
   689. David Concepcion de la Desviacion Estandar (Dan R) Posted: January 20, 2009 at 05:17 AM (#3055724)
*Definitely* WARP2--cross-era comparisons are what it's designed for.
   690. HGM Posted: January 20, 2009 at 01:25 PM (#3055834)
Thanks
   691. jimd Posted: January 20, 2009 at 11:09 PM (#3056426)
and WARP Leaderboards

Unfortunately, it does not combine service with multiple teams.
EG, Sabathia does not appear in the 2008 leaderboard despite his huge season last year.
   692. jimd Posted: January 20, 2009 at 11:17 PM (#3056435)
(I mean he's there, but nowhere near the top, given that his value is chopped in two pieces.)
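
For anyone building their own leaderboard from stint-level rows, a minimal sketch of combining a player's service with multiple teams in one season might look like this. The row layout and the numbers are made up for illustration; they are not real WARP figures or the leaderboard's actual format.

```python
from collections import defaultdict

def combine_stints(rows):
    """Sum WARP over all team stints for each (player, year) pair."""
    totals = defaultdict(float)
    for player, year, team, warp in rows:
        totals[(player, year)] += warp
    return dict(totals)

# Hypothetical stint rows: (player, year, team, warp). Values are invented.
rows = [
    ("Player A", 2008, "CLE", 3.0),
    ("Player A", 2008, "MIL", 4.5),
    ("Player B", 2008, "TOR", 6.0),
]

leaderboard = sorted(combine_stints(rows).items(),
                     key=lambda kv: kv[1], reverse=True)
for (player, year), warp in leaderboard:
    print(player, year, warp)
```

With the stints summed, the traded player ranks on his full-season total instead of appearing as two smaller entries.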
   693. David Concepcion de la Desviacion Estandar (Dan R) Posted: February 19, 2009 at 07:36 PM (#3081295)
My WARP archive is now available as well at tangotiger.net/rosenheck. Could we update the link at the top of the page to reflect this?
   694. Joey Numbaz (Scruff) Posted: February 23, 2009 at 10:28 PM (#3084052)
Done . . . in the future, don't hesitate to send an email on something like this to John or me . . .
   695. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 25, 2009 at 02:08 AM (#3396193)
Cross-posting from the CHONE WAR thread:


I'm going to take this opportunity to compare and contrast three players that my system and CHONE's strongly disagree on: Sal Bando, Willie Davis, and Dagoberto Campaneris. I hope a close analysis will help illuminate the differences between CHONE's methodology and mine (although it is worth emphasizing that a cursory comparison of our rankings shows that the systems are extremely similar). I have converted his numbers to a comparable format to mine--raw hitting, GIDP, and ROE are subsumed under batting; TZ, ifDP, OFarm, and Catch are subsumed under fielding, the subcategories are converted from runs to wins, and seasons are straight-line adjusted to 162 games. I have also adjusted my BWAA and Replc to use the pitchers-removed average as league average for non-DH years, to mirror CHONE's accounting.

Glossary

SFrac: Percentage of the league average plate appearances per lineup slot for that season, a measure of playing time (greater than 1 for leadoff men, etc.)
RBWAA: Rosenheck batting wins above average.
RBRWAA: Rosenheck baserunning wins above average.
RFWAA: Rosenheck fielding wins above average.
RRepl: Wins above average a replacement player would have accumulated in the same playing time, according to Rosenheck.
RWARP1: Rosenheck total wins above replacement, unadjusted for standard deviations.
LgAdj: Ratio of the 2005 standard deviation to the regression-projected standard deviation of the league-season in question.
RWARP2: Rosenheck total wins above replacement, adjusted for standard deviations.
SBWAA: Smith batting wins above average.
SBRWAA: Smith baserunning wins above average.
SFWAA: Smith fielding wins above average.
SRepl: Wins above average a replacement player would have accumulated in the same playing time, according to Smith.
SWARP1: Smith total wins above replacement, unadjusted for standard deviations.
TXBR: Totals, excluding sub-replacement seasons.
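
As a rough sketch of how the glossary columns appear to fit together: wins above replacement is the sum of the three above-average components minus the replacement column, and the adjusted figure is that total scaled by LgAdj. This relationship is inferred from the definitions above and checked against one row of the tables that follow (Bando, 1973); it is not a confirmed statement of the exact computation, and other rows can differ by 0.1, presumably because the displayed components are rounded.

```python
def warp1(bwaa, brwaa, fwaa, repl):
    """Wins above replacement: wins above average minus what a
    replacement player would have been worth relative to average."""
    return bwaa + brwaa + fwaa - repl

def warp2(warp1_value, lg_adj):
    """Assumed standard-deviation adjustment: scale WARP1 by LgAdj."""
    return warp1_value * lg_adj

# Bando 1973: RBWAA 4.5, RBRWAA 0.0, RFWAA -1.7, RRepl -2.1, LgAdj 0.947
w1 = warp1(4.5, 0.0, -1.7, -2.1)
w2 = warp2(w1, 0.947)
print(round(w1, 1), round(w2, 1))
```

The same `warp1` function applies to the Smith columns (SBWAA, SBRWAA, SFWAA, SRepl), which carry no LgAdj step.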

Sal Bando

YEAR SFrac RBWAA RBRWAA RFWAA RRepl RWARP1 LgAdj RWARP2  |  SBWAA SBRWAA SFWAA SRepl SWARP1
1967  0.22  -0.5    0.1   0.1  -0.4    0.0 0.985    0.0  |   -0.2    0.0   0.7  -0.5    1.0
1968  1.01   0.7    0.2   0.6  -1.8    3.4 1.003    3.4  |    0.7    0.1  -0.1  -2.5    3.2
1969  1.08   5.2   -0.2  -0.6  -1.8    6.3 0.948    6.0  |    6.1    0.1   0.2  -2.5    8.9
1970  0.93   3.3   -0.4  -0.4  -1.7    4.3 0.949    4.1  |    3.0    0.1   0.4  -2.6    6.2
1971  0.95   3.2   -0.3  -0.5  -1.8    4.2 0.962    4.0  |    3.0   -0.1   0.7  -2.7    6.3
1972  0.99   1.6    0.2   0.9  -2.0    4.8 0.970    4.7  |    1.1    0.2   1.1  -2.7    5.1
1973  1.00   4.5    0.0  -1.7  -2.1    4.9 0.947    4.6  |    5.2    0.1  -0.7  -2.8    7.3
1974  0.89   2.7    0.4  -1.4  -1.8    3.5 0.963    3.4  |    3.2    0.1  -0.4  -2.4    5.3
1975  0.97   0.3    0.2   0.4  -2.0    2.9 0.943    2.7  |    0.3    0.2   0.5  -2.6    3.6
1976  0.94   2.4    0.2   0.3  -2.1    5.0 0.948    4.7  |    2.7    0.1   0.2  -2.7    5.7
1977  0.97   0.3    0.1  -0.1  -2.2    2.6 0.907    2.4  |   -0.3    0.4   0.7  -2.2    3.0
1978  0.93   2.7    0.1   0.7  -2.2    5.7 0.919    5.2  |    2.4    0.1   0.9  -2.3    5.8
1979  0.79  -1.3    0.1  -0.5  -1.8    0.1 0.913    0.1  |   -1.4    0.0  -0.3  -1.8    0.1
1980  0.42  -1.4   -0.1  -0.3  -1.0   -0.8 0.929   -0.7  |   -1.2   -0.1  -0.3  -0.8   -0.8
1981  0.16  -0.3    0.0  -0.3  -0.3   -0.3 0.950   -0.3  |    0.0   -0.2   0.0  -0.2    0.0
TOTL 12.25  23.4    0.6  -2.8 -25.0   46.6 0.951   44.3  |   24.7    1.2   3.7 -31.1   60.7
TXBR 11.67  25.1    0.7  -2.2 -23.7   47.7 0.950   45.3  |   25.9    1.3   4.0 -30.3   61.5
AVRG  1.00   1.9    0.0  -0.2  -2.0    3.8 0.951    3.6  |    2.0    0.1   0.3  -2.5    5.0


Sean and I clearly see Bando's offense the same way. We have a significant but not enormous disagreement on his fielding---Sean sees him as a slightly above-average fielder, while I see him as slightly below average. (I'm really not sure what to make of this, since SFR thinks Bando was a brilliant 3B and DRA a meaningfully below-average one; the truth probably lies somewhere in the middle.) And we have a yawning gap on the intrinsic value of playing third base. I see 3B during Bando's prime as a dead mid-spectrum position, no different than it is today, while Sean has it as a glove position, not far from where he puts modern SS. (Note that my view and Sean's converge starting around 1977.)


Willie Davis

YEAR SFrac RBWAA RBRWAA RFWAA RRepl RWARP1 LgAdj RWARP2  |  SBWAA SBRWAA SFWAA SRepl SWARP1
1960  0.14   0.3   -0.2   0.3  -0.3    0.7 0.955    0.7  |    0.5   -0.2   0.3  -0.3    0.9
1961  0.58  -0.2    0.1  -0.3  -1.1    0.7 0.962    0.7  |   -0.4    0.1   1.3  -1.3    2.3
1962  0.96   1.7    0.5  -0.1  -1.9    4.1 0.900    3.7  |    1.8    0.8   1.4  -2.2    6.3
1963  0.82  -0.5    0.2   0.7  -1.5    2.0 0.942    1.9  |   -0.5    0.1   0.9  -1.8    2.3
1964  0.96   1.8    0.5   1.5  -1.6    5.5 0.930    5.1  |    1.8    0.5   3.0  -2.3    7.6
1965  0.87  -1.7    0.3   1.1  -1.5    1.2 0.937    1.1  |   -1.3    0.2   1.5  -1.9    2.3
1966  0.96   0.3    0.2   0.1  -1.8    2.4 0.950    2.3  |    0.0    0.2  -0.3  -2.2    2.1
1967  0.90  -0.1    0.4  -1.0  -1.6    0.9 0.947    0.9  |   -0.6    0.6  -0.6  -2.0    1.4
1968  1.02   0.2    0.7  -1.1  -1.8    1.5 0.973    1.5  |   -0.3    0.6  -0.7  -2.3    1.9
1969  0.80   2.5    0.3   0.4  -1.4    4.6 0.914    4.2  |    2.6    0.4   0.2  -2.0    5.2
1970  0.91   1.3    0.3   1.1  -1.5    4.3 0.919    4.0  |    1.0    0.7   1.2  -1.6    4.5
1971  0.99   1.9    0.3   0.3  -1.8    4.3 0.940    4.0  |    2.2    0.3   0.7  -1.8    5.0
1972  1.00   1.8    0.5   1.2  -1.9    5.5 0.950    5.2  |    1.8    0.3   1.4  -1.8    5.3
1973  0.94   1.4    0.7   1.0  -1.9    5.0 0.948    4.7  |    0.9    0.4   0.9  -1.7    4.0
1974  0.95   0.3    0.3   0.5  -1.9    2.9 0.932    2.7  |    0.8    0.7   0.1  -1.8    3.4
1975  0.81  -0.1    0.4   0.4  -1.1    1.7 0.936    1.6  |    0.3    0.5  -0.1  -1.3    2.0
1976  0.77   0.0    0.4   0.2  -1.5    2.0 0.929    1.9  |   -0.3    0.4  -0.2  -1.4    1.3
1979  0.09  -0.3    0.0   0.0   0.0   -0.3 0.913   -0.3  |   -0.2   -0.1  -0.2  -0.1   -0.4
TOTL 14.47  10.7    5.9   6.3 -26.0   49.0 0.934   45.8  |   10.1    6.7  10.9 -29.8   57.5
TXBR 14.38  11.0    5.9   6.3 -26.0   49.3 0.934   46.1  |   10.3    6.8  11.1 -29.7   57.9
AVRG  1.00   0.7    0.4   0.4  -1.8    3.4 0.934    3.2  |    0.7    0.5   0.8  -2.1    4.0


Same issues as Bando. The offensive evaluation is identical. I see Davis's fielding as good, while Sean sees it as very good. Sean's decade-based cutoff for changing positional weights is very apparent here--he agrees precisely with me about the value of CF from 1970 onwards, but sees it as significantly more valuable in the 1960's (note the increase in SRepl/SFrac from -2.5 in 1969 to -1.75 in 1970--you'd never see that in my system). I also further ding Davis for the extreme ease of domination of the 1969-70 NL, a factor Sean does not take into account.
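
The SRepl/SFrac figures quoted above are just the replacement column divided by playing time, which gives replacement level per full season. A quick check using Davis's 1969 and 1970 rows from the table (the function name is mine, for illustration only):

```python
def repl_per_full_season(repl, sfrac):
    """Replacement-level wins above average, prorated to SFrac = 1.00."""
    return repl / sfrac

# Willie Davis, Smith columns: 1969 SRepl -2.0 at SFrac 0.80,
# 1970 SRepl -1.6 at SFrac 0.91.
r1969 = repl_per_full_season(-2.0, 0.80)
r1970 = repl_per_full_season(-1.6, 0.91)
print(round(r1969, 2), round(r1970, 2))
```

The one-year jump of about three quarters of a win is the decade boundary showing through.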


Dagoberto Campaneris

YEAR SFrac RBWAA RBRWAA RFWAA RRepl RWARP1 LgAdj RWARP2  |  SBWAA SBRWAA SFWAA SRepl SWARP1
1964  0.42  -0.6    0.2  -0.5  -1.7    0.9 0.983    0.9  |   -0.4    0.0  -0.7  -0.9   -0.2
1965  0.94   0.5    0.3  -1.4  -3.7    3.2 0.977    3.1  |   -0.3    0.2  -1.2  -1.8    0.5
1966  0.91  -0.3    0.9  -0.6  -3.7    3.7 0.999    3.7  |    0.4    1.0  -0.5  -2.7    3.5
1967  0.97  -0.9    0.7  -0.8  -4.1    3.2 0.985    3.2  |   -0.5    0.3  -0.8  -2.6    1.6
1968  1.06   1.9    0.6   0.7  -4.5    7.6 1.003    7.6  |    1.5    0.3   1.0  -3.1    5.9
1969  0.86  -1.9    1.2   0.1  -4.0    3.3 0.948    3.1  |   -1.3    1.0   1.1  -2.3    3.1
1970  0.95   1.4    0.6   1.0  -4.5    7.4 0.949    7.0  |    1.3    0.3   1.0  -3.0    5.6
1971  0.91  -1.7    0.7   0.3  -4.2    3.4 0.962    3.3  |   -1.5    0.8   1.0  -2.6    2.9
1972  1.05  -1.7    1.1   1.7  -4.6    5.8 0.970    5.6  |   -1.1    0.7   1.5  -3.1    4.2
1973  0.97  -1.4    0.4   1.8  -4.4    5.3 0.947    5.0  |   -1.0    0.3   2.0  -2.9    4.2
1974  0.85   1.1    0.4   0.7  -3.9    6.1 0.963    5.9  |    1.6    0.4   0.7  -2.7    5.4
1975  0.84  -0.3    0.2  -0.2  -3.8    3.5 0.943    3.3  |   -0.1    0.0   0.1  -2.6    2.6
1976  0.91   0.0    0.5   0.4  -4.0    5.0 0.948    4.7  |   -0.1    0.3   0.9  -3.0    4.1
1977  0.89  -1.2    0.0   1.7  -4.1    4.6 0.907    4.2  |   -1.1   -0.3   1.7  -3.0    3.3
1978  0.44  -2.4    0.3  -0.2  -2.0   -0.3 0.919   -0.3  |   -2.5    0.2   0.0  -1.5   -0.8
1979  0.40  -1.6   -0.1   0.4  -1.8    0.5 0.913    0.5  |   -1.1    0.1  -0.3  -1.3    0.0
1980  0.33  -0.6   -0.1  -0.4  -1.5    0.3 0.929    0.3  |   -0.7    0.0  -0.5  -1.0   -0.2
1981  0.20   0.0    0.0  -0.7  -0.4   -0.2 0.950   -0.2  |   -0.3    0.0  -0.3  -0.5   -0.2
1983  0.22   0.1   -0.4  -0.2  -0.6    0.1 0.954    0.1  |    0.3   -0.4  -0.3  -0.6    0.2
TOTL 14.12  -9.6    7.5   3.8 -61.5   63.4 0.962   61.0  |   -6.8    5.1   6.4 -41.0   45.7
TXBR 13.48  -7.2    7.2   4.7 -59.1   63.9 0.962   61.5  |   -2.9    4.9   7.9 -37.2   47.1
AVRG  1.00  -0.7    0.5   0.3  -4.4    4.5 0.962    4.3  |   -0.5    0.4   0.5  -2.9    3.2


Once more, we have similar takes on overall value above average, with Sean slightly more kind to Campaneris. (It's worth noting here that SFR has Campaneris as an otherworldly +144 shortstop, and DRA also gives him a superb +109. With that kind of defense and baserunning value, Campaneris's case is far stronger than I make it here, where old BP FRAA and Fielding Win Shares thought he was merely good). It's that replacement level column that is night and day: Sean has SS a mere 2.5 runs per year more valuable in the 70's than it is today (which I think he himself recognizes is a fudge, due to the arbitrary +10 cap he places on positional adjustments...right, Sean? Kind of like Bill James's cap on Fielding WS). By contrast, as everyone in the HoM knows, I see 70's SS as the toughest position to fill at any point in MLB history. If you could even provide league-average offense at the position back then, you were an All-Star in my book. I have offered two potential explanations for this phenomenon: the mega-expansion of the 1960's, which hurt SS more than (say) LF/RF due to the intrinsic scarcity of the position, and the advent of turf ballparks.

As a reminder, I derive my positional weights by starting with Nate Silver's findings on the aggregate performance of Freely Available Talent (MLB players over age 27 earning less than twice the minimum salary), and adjusting them over time based on the performance of the worst 3/8 of everyday players at each position. Sean gets his by studying the fielding of position-switchers over ten-year periods (e.g. 1970-1980).
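
To make the "worst 3/8 of everyday players" step concrete, here is a small sketch of just that selection, not of the full positional-weight method. The input values are invented, and rounding the cutoff down is my assumption, since the post does not say how a fractional count is handled.

```python
def worst_three_eighths_mean(values):
    """Average the worst 3/8 of per-season values (lower is worse)."""
    if not values:
        raise ValueError("need at least one player")
    ranked = sorted(values)
    # Assumption: round the cutoff down, but always keep at least one player.
    count = max(1, (len(ranked) * 3) // 8)
    worst = ranked[:count]
    return sum(worst) / len(worst)

# Made-up wins-above-average figures for eight regulars at one position.
regulars = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
print(worst_three_eighths_mean(regulars))
```

Tracking how that average moves from season to season at each position is what would drive the adjustment over time described above.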
   696. Yoenis Cespedes, Baseball Savant Posted: November 25, 2009 at 04:03 AM (#3396270)
Dan,

Where can I find your complete WARP file? I can only find the 1987-2005 file.
   697. David Concepcion de la Desviacion Estandar (Dan R) Posted: November 25, 2009 at 05:30 AM (#3396300)
At the link at the top of the page? It's the Rosenheck WARP Results.csv file.
   698. Yoenis Cespedes, Baseball Savant Posted: November 25, 2009 at 05:35 AM (#3396304)
Thanks, Dan. I must have overlooked that file the first time I downloaded it. As my AP US History teacher used to say, "How embarrassing."
   699. Joey Numbaz (Scruff) Posted: December 04, 2009 at 08:15 AM (#3403112)
The Rosenheck DB will be up shortly (FTP client says 17 minutes). It's the older version, though; I'm not sure when I'll get to update it for his latest baserunning and fielding adjustments (some time this weekend or possibly not until early next week), and I didn't want to wait.

I'll post it in the main entry once it's up to date.
   700. Paul Wendt Posted: December 04, 2009 at 09:08 PM (#3403603)
Thanks, Joe.

Thanks even if it turns out that I can't use it myself. The format is MS Access, iirc. Does the edition correspond to Office 2007, or is it more recent? Office 2007 is what I can use at the library. Sometimes there is no substance unique to that edition, so I can effectively convert a file and use it elsewhere too.

Other readers may have similar options.