Baseball for the Thinking Fan

Baseball Primer Newsblog
— The Best News Links from the Baseball Newsstand

Saturday, December 22, 2007

Statistically Speaking: Fast: Can we classify every pitch?

Since Dalkowski is no longer pitching...I guess there would be enough time. Mike Fast takes a look at the PITCHf/x system revolution.

What if we knew what type of pitches every major league pitcher threw? What if we had detailed pitch-by-pitch data about how he used those pitches in every game situation? What if this information was accurate and freely accessible to baseball researchers?

...Clearly, one of my priorities should be improving my understanding of and facility with clustering techniques. In my opinion, an effective clustering algorithm should include speed, spin direction, spin rate, vertical spin deflection, horizontal spin deflection, and handedness of the pitcher and hitter. It’s possible that additional variables such as strike zone location, release point, and ball-strike count might be helpful, but I have a feeling their addition to a clustering algorithm might cause more problems than they solve.

I would like to develop a classification system that would be available to any researcher and accurate enough to distinguish splitters from changeups, sinkers from four-seam fastballs, and cutters from sliders and fastballs. I confess to being a bit daunted by the task and not very experienced in some of the statistical math that may be required, but I’m going to keep chipping away at this problem until we get there.
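As a rough illustration of the clustering idea in the excerpt, here is a minimal sketch in Python. It is not Mike Fast's algorithm (none is given here) and not Josh Kalk's. It swaps in plain k-means on three of the variables named above (speed, horizontal deflection, vertical deflection) and works on one pitcher at a time, which sidesteps handedness. The pitch values are invented for a hypothetical right-hander, the number of clusters is assumed to be known in advance, and spin direction is left out because an angle would need special handling (for example, a sine/cosine encoding).

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance so mph and inches are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point seeding (no randomness)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from every center chosen so far.
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each pitch to its nearest center.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return labels

# Invented pitches: (speed in mph, horizontal deflection in inches, vertical deflection in inches)
pitches = [
    (93.1, -5.0, 10.2), (92.4, -4.6, 9.8), (93.8, -5.4, 10.6),   # fastball-like
    (83.0, -7.5, 5.1), (82.2, -8.0, 4.6), (83.6, -7.1, 5.5),     # changeup-like
    (77.5, 5.2, -6.0), (76.8, 5.8, -6.5), (78.1, 4.9, -5.4),     # curveball-like
]

labels = kmeans(standardize(pitches), k=3)
for pitch, label in zip(pitches, labels):
    print(label, pitch)
```

The cluster numbers carry no meaning by themselves; someone still has to look at each group and decide which one is the changeup. Real data will not separate this cleanly, which is the hard part the article is describing.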

Repoz Posted: December 22, 2007 at 07:55 AM | 8 comment(s)
  Related News: General, Sabermetrics, Projections

Reader Comments and Retorts


Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

   1. villageidiom Posted: December 22, 2007 at 10:37 AM (#2652125)
I think you* do need to go through the league, pitcher by pitcher. I'm thinking that a changeup is largely denoted by a drop in velocity... relative to another pitch. What's a changeup for one pitcher (Pedro Martinez) could be a fastball for another (Keith Foulke). While general rules will generally work, there will be significant exceptions; and we shouldn't be giving people excuses to ignore the limitations of the method simply because it generally works. Let's get it right the first time.

* By "you" I mean "not I".
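The point about relative velocity can be sketched in a few lines of Python. This is only an illustration under assumptions: the two pitchers and their speeds are invented, and the "fastball baseline" (the median of a pitcher's fastest quarter of pitches) is a hypothetical choice, not something taken from the article.

```python
from statistics import median

def speed_gap_from_fastball(speeds, top_fraction=0.25):
    """Return each pitch's speed minus the pitcher's own fastball baseline.

    The baseline is the median of the fastest `top_fraction` of pitches,
    which is an assumed stand-in for that pitcher's typical fastball speed.
    """
    n_top = max(1, round(len(speeds) * top_fraction))
    baseline = median(sorted(speeds, reverse=True)[:n_top])
    return [round(s - baseline, 1) for s in speeds]

# Invented speeds in mph for two hypothetical pitchers.
pitchers = {
    "hard_thrower": [95.0, 94.0, 96.0, 95.0, 86.0, 85.0, 87.0, 86.0],
    "soft_tosser": [87.0, 86.0, 88.0, 87.0, 78.0, 77.0, 79.0, 78.0],
}

gaps = {name: speed_gap_from_fastball(speeds) for name, speeds in pitchers.items()}
for name, values in gaps.items():
    print(name, values)
```

An 86 mph pitch sits about 9.5 mph below the first pitcher's baseline but only 1.5 mph below the second's, so the same raw reading looks like an offspeed pitch for one and a fastball for the other. Working pitcher by pitcher, as suggested above, is what makes that difference visible.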
   2. steagles Posted: December 22, 2007 at 11:51 AM (#2652146)
done and done


http://baseball.bornbybits.com/plots/players.html
   3. Mike Fast Posted: December 22, 2007 at 11:57 AM (#2652148)
Steagles, in the article you will see that I am quite aware of Josh Kalk's work, and one of my main points is that we are not done. Josh Kalk's algorithm misclassifies ~30% of pitches. That's good enough for some things but not good enough for a lot of other more serious research. Josh's algorithm is also not public, so no one can replicate his work. I want to develop an algorithm that other people can use to do their own research and that will be accurate enough to get most pitches classified correctly.
   4. philly Posted: December 22, 2007 at 12:19 PM (#2652155)
"Josh Kalk's algorithm misclassifies ~30% of pitches."


I don't doubt that, but I am curious where that number comes from.

As good as all of the PITCHf/x research has been, the underlying data and these algorithms are not nearly as precise as most articles imply them to be, imo.
   5. IronChef Chris Wok Posted: December 22, 2007 at 01:36 PM (#2652184)
Why do we even need "algorithms" to do this? If you want to classify every pitch, just simply throw money and manpower at the problem, and have every pitch reviewed on tape.
   6. Mike Fast Posted: December 22, 2007 at 02:19 PM (#2652223)
Philly, that number is an approximation that comes from my head, based on comparing my detailed analysis of 8-10 pitchers with the results of Josh's algorithm. The topic of my article squarely addresses your concern.

IronChef, if you've got money and manpower to throw at the problem, I'll gladly talk business arrangements with you.
   7. Der Komminsk-sar Posted: December 22, 2007 at 02:50 PM (#2652248)
From looking at 20 or so people (too small of a sample to go tossing assumptions out there, but - hey), I'd guess it's off 20-30% of the time. Not a bad start, really.
   8. Phineus Fog Posted: December 23, 2007 at 12:28 AM (#2652520)
i think this piece should be posted on every dug-out wall!