Baseball for the Thinking Fan

Transaction Oracle
— A Timely Look at Transactions as They Happen

Thursday, November 13, 2008

The Oddsinator aka the Doubly Stochastic Matrix of Doom

No, I haven’t come up with an actual name yet, but I wanted to say a few words on the new formatting for my individual player projections to head off any confusion that might result from my general assumption that everybody knows exactly what I’m talking about without my having to explain.  And no, there are no Markov chains involved; I just like the sound of it.

Why did I change it?  I didn’t feel that the optimistic and pessimistic lines were communicating as much information about a player’s projection as they could. They simply said that players sometimes have really good seasons and sometimes really bad ones, which you don’t need a line to figure out.  I feel this format conveys the probabilities of players accomplishing certain things without having to print a five-page series of lines.

First, let’s look at the pitchers, using Damaso Marte as an example (I just changed the formatting slightly from what was posted due to some confusion).


ERA+    %
Top 1/3 64
Mid 1/3 28
Bot 1/3   8

This compares Marte to the usual population of pitchers with 30 or more innings pitched (for starting pitchers, I use 100 innings as the cutoff).  When I do hitters, I compare players to the starters at their positions, but the starter/bench relationship isn’t the same for pitchers.  So, when looking at relievers used in the majors with any regularity, I expect Marte to finish in the top third 64% of the time, the middle third 28% of the time, and the bottom third 8% of the time.
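To make the thirds concrete, here is a minimal sketch of one way such percentages could be computed from a set of simulated seasons. It is not the method behind the projections, which the post does not describe; the league sample, the simulated ERA+ values, and the function names `tercile_cutoffs` and `tercile_odds` are all invented for illustration.

```python
import random

def tercile_cutoffs(population):
    """Return the (lower, upper) values that split a population into thirds."""
    ordered = sorted(population)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]

def tercile_odds(outcomes, lower, upper):
    """Share of simulated outcomes landing in the top, middle, and bottom third."""
    n = len(outcomes)
    top = sum(1 for x in outcomes if x >= upper) / n
    bottom = sum(1 for x in outcomes if x < lower) / n
    return {"top": top, "mid": 1 - top - bottom, "bot": bottom}

rng = random.Random(2008)
# Hypothetical league of relievers with 30+ IP, ERA+ centered near 100.
league = [rng.gauss(100, 25) for _ in range(300)]
# Hypothetical simulated seasons for one above-average reliever.
pitcher = [rng.gauss(125, 30) for _ in range(10_000)]

low, high = tercile_cutoffs(league)
odds = tercile_odds(pitcher, low, high)
print({k: round(100 * v) for k, v in odds.items()})
```

The cutoffs come from the comparison population, not from the pitcher himself, which is why a good reliever can land in the top third well over a third of the time.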

 


ERA+   %      BB/9   %
>150   51     <1.5   0
>140   58     <2.0   3
>130   68     <2.5   8
>120   78     <3.0   22
>110   87     <3.5   30
>100   92     <4.0   49
>90    97
>80    99     HR/9   %
>70    100    <0.5   25
              <1.0   72
K/9    %      <1.5   97
>9     69     <2.0   98
>8     84
>7     95
>6     99

I think that this is pretty self-explanatory.  I don’t do wins and saves (and probably should not have done W-L in my 85th and 15th projections) because not only do you have to take into consideration the errors of the individual part of the projection, you have more noise-causing factors like team offense and team scoring distribution.
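For readers who want to see the mechanics, the threshold tables above can be read as cumulative odds: each row is the share of possible seasons that beat that line, so the percentages can only rise as the bar gets easier. A minimal sketch, assuming a list of simulated outcomes is available; the sample values and the `exceedance_table` helper are hypothetical and not taken from the projections.

```python
def exceedance_table(outcomes, thresholds, higher_is_better=True):
    """Percent of outcomes that beat each threshold.

    "Beat" means above the threshold for stats like K/9 and
    below it for stats like BB/9 or HR/9.
    """
    n = len(outcomes)
    table = {}
    for t in thresholds:
        if higher_is_better:
            hits = sum(1 for x in outcomes if x > t)
        else:
            hits = sum(1 for x in outcomes if x < t)
        table[t] = round(100 * hits / n)
    return table

# Hypothetical simulated K/9 values, not real projection output.
k9 = [6.5, 7.2, 8.1, 8.4, 8.8, 9.1, 9.3, 9.6, 10.2, 11.0]
print(exceedance_table(k9, [9, 8, 7, 6]))
# {9: 50, 8: 80, 7: 90, 6: 100}

# Hypothetical simulated BB/9 values, where lower is better.
bb9 = [2.4, 2.9, 3.1, 3.3, 3.6, 3.8, 4.1, 4.4, 4.9, 5.3]
print(exceedance_table(bb9, [2.5, 3.0, 3.5, 4.0], higher_is_better=False))
# {2.5: 10, 3.0: 20, 3.5: 40, 4.0: 60}
```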

Let’s go on to Josh Phelps.


(600 PA)

Offense %
STAR   16
AVG   42
REP LV 77

“Star” simply means the average level of the top 20% of starters.  This projects that Josh Phelps has a 1-in-6 chance of being in the top 20% of 1B offensively, not that he has a 1-in-6 chance of being an MVP candidate (which he doesn’t).  “AVG” is his odds of being average or better offensively among starters at his position, and “REP LV” is his chance of being better than replacement level.  My replacement level is similar to the one used by Tango and Smith.
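As a rough illustration of those three benchmarks, the sketch below computes a "star" level as the mean of the top 20% of starters, an average level as the mean of all starters, and then the share of simulated seasons at or above each. Every number here is made up, including the replacement level of 90, which is only a placeholder since the post gives no numeric value; the helper names are hypothetical too.

```python
def star_level(starters, top_share=0.20):
    """Mean of the best `top_share` of starters at a position."""
    ordered = sorted(starters, reverse=True)
    k = max(1, round(len(ordered) * top_share))
    top = ordered[:k]
    return sum(top) / len(top)

def average_level(starters):
    """Mean of all starters at a position."""
    return sum(starters) / len(starters)

def odds_at_or_above(outcomes, level):
    """Percent of simulated outcomes at or above a benchmark level."""
    return round(100 * sum(1 for x in outcomes if x >= level) / len(outcomes))

# Hypothetical OPS+ values for ten starting first basemen.
starters = [150, 140, 130, 125, 120, 115, 110, 105, 100, 95]
# Hypothetical simulated OPS+ seasons for one hitter.
seasons = [80, 88, 95, 100, 105, 112, 118, 121, 135, 148]
REPLACEMENT = 90  # placeholder assumption, not a published figure

print("STAR  ", odds_at_or_above(seasons, star_level(starters)))     # 10
print("AVG   ", odds_at_or_above(seasons, average_level(starters)))  # 30
print("REP LV", odds_at_or_above(seasons, REPLACEMENT))              # 80
```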


OPS+  %    OBP   %    3B     %    Hits   %
>160   0     >.400   2     >10   0     >200   0
>140   4     >.375   11     >5     12     >150   40
>120   24     >.350   34
>100   58     >.325   66     2B     %
>80   85     >.300   89     >45   5
>60   97               >30   57

BA     %    SLG   %    HR     %    SB     %
>.350   0     >.550   5     >50   0     >70   0
>.325   1     >.500   24     >40   0     >50   0
>.300   12     >.450   50     >30   19     >30   0
>.275   41     >.400   76     >20   57     >10   1
>.250   76     >.350   91     >10   85

I use 600 PA as a base for counting stat projections.  This does not mean that I expect the player to get 600 PA (and Phelps won’t); it is just a way to see what he would do over the course of a full season.  I might change this to use the predicted PA total instead; I haven’t decided which is better.  ZiPS doesn’t really make any projection about playing time, and simply assuming that everyone’s a full-time player for this purpose might be more useful than relying on a playing-time component that isn’t really real.  We don’t really know how many at-bats Phelps will get in 2009, but we do know about how many at-bats full-time players get.
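To show why the PA basis matters for counting stats, here is a small sketch that fixes playing time at 600 PA and asks how often a hitter clears a few home run lines. It assumes a simple binomial model with a constant per-PA rate, which is a stand-in and not the model behind the tables; a fixed rate ignores uncertainty about the player's true talent, so real spreads would be wider. The rate of 0.040 HR per PA is hypothetical.

```python
from math import comb

def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p)."""
    at_most_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - at_most_k

PA = 600          # fixed full-season basis, as described above
hr_rate = 0.040   # hypothetical home runs per PA, not a real projection

expected_hr = PA * hr_rate
print("expected HR over", PA, "PA:", round(expected_hr, 1))
for threshold in (10, 20, 30):
    pct = round(100 * prob_more_than(threshold, PA, hr_rate))
    print(f">{threshold} HR: {pct}%")
```

Changing `PA` to a smaller number lowers every counting-stat probability without changing the hitter's rate at all, which is the trade-off between a fixed 600 PA basis and a predicted playing-time total.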


Milestone       %
100th Career 2B   74
100th Career HR   3
500th Career Hit 96

Just here for fun. 


As a final note, the mean projection at the very top is not necessarily the 50th percentile projection: expected player performances are not symmetric distributions, and I take skewness risk into account.  Sorry about the dryness; I usually try to keep everything as accessible to the layman as possible.  I’m fairly certain that the majority of even the people reading this site are unlikely to be turned on by an endless discussion of nonparametric modeling or kernel estimation.
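A quick way to see the mean-versus-median point is to draw from any skewed distribution and compare the two. The sketch below uses lognormal draws purely as a convenient skewed shape; the distribution, its parameters, and the direction of the skew are arbitrary choices for illustration and say nothing about how real player outcomes are distributed.

```python
import random
import statistics

rng = random.Random(13)
# Hypothetical right-skewed outcomes (lognormal chosen only for convenience).
sample = [rng.lognormvariate(0, 0.5) for _ in range(20_000)]

mean = statistics.fmean(sample)
median = statistics.median(sample)
# Share of outcomes that fall short of the mean.
share_below_mean = sum(1 for x in sample if x < mean) / len(sample)

print("mean:  ", round(mean, 3))
print("median:", round(median, 3))
print("share of outcomes below the mean:", round(share_below_mean, 3))
```

In a symmetric distribution the last figure would sit near one half; with skew, the mean drifts toward the long tail and no longer marks the 50th percentile.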

Dan Szymborski Posted: November 13, 2008 at 11:18 PM | 5 comment(s)

Reader Comments and Retorts

   1. alskor Posted: November 14, 2008 at 01:13 AM (#3008770)
I wanted to say I really like the new system - both the oddsinator(c) and the way you're dividing teams by position and into excellent/good/etc. columns. Looks great and it presents a lot more info.

I'm sure it's a lot more work and I appreciate it. Thanks.
   2. alskor Posted: November 14, 2008 at 01:20 AM (#3008776)
I use 600 PA as a base for counting stat projections. This does not mean that I expect the player to get 600 PA (and Phelps won't), but just to see what he would do over the course of a season. I might change this to use the predicted PA total instead, I haven't decided what's better. ZiPS doesn't really make any projection about playing time and simply assuming that everyone's a full-time player for this purpose might be more useful than looking at playing time component that isn't really real. We don't really know how many at-bats Phelps will get in 2009, but we do know about how many full-time players get.


If I could make a suggestion, perhaps the easiest way to do this without driving yourself nuts would be to just use nice big round benchmarks - that is, for each guy have a 600 AB projection (to show what he's capable of as a full-time player), a 400 AB projection (for guys who split time or are likely to miss time), and something like a 250 AB projection (for part-time/bench guys). From those benchmarks, or something near them, it would be easy enough to eyeball it if you projected a guy for 350 or something. Maybe even just 600 and 300. Something along those lines. It seems like that would be easier and simpler than coming up with an AB projection for every player... especially for teams done in November. I figure the round numbers are easier to work with, too...
   3. villageidiom Posted: November 19, 2008 at 07:52 PM (#3012507)
The Oddsinator gives the ODDs of Interesting Baseball Events. Call it ODDIBE.
   4. Dan Szymborski Posted: November 19, 2008 at 08:02 PM (#3012523)
VI, that's ridiculously awesome and I'm going to do that.
   5. villageidiom Posted: November 20, 2008 at 01:39 AM (#3012740)
Hey! I've finally made a contribution around here!
