According to a recent report in Sports Illustrated, Carmine is “the virtual brains of the Boston operation,” and Epstein “never makes a move” without him.
Who is this statistical genius, and why can’t the Cubs pry him loose?
It turns out Carmine is the name of a computer program created by Epstein five years ago to analyze players’ stats and tendencies, and is valued by the Red Sox organization more than some humans.
Of course, flash drives are cheaper than ever, and nothing is stopping Epstein from creating a similar program to do the job in Chicago that Carmine does in Boston.
...The Cubs were slow in recognizing the trend toward sabermetrics. Even when he promoted Chuck Wasserstrom to the job as the team’s first baseball information manager in 2003, former general manager Jim Hendry admitted, “I’m more of an old-school scout’s guy.”
Wasserstrom was the Cubs’ only full-time numbers cruncher until 2010, when Ricketts made Ari Kaplan his first official hire in the baseball operations department, giving him the newly created position of manager of statistical analysis.
...Ready or not, a Carmine-like computer is the wave of the Cubs’ future.
Reader Comments and Retorts
1. KyleJRM Posted: October 17, 2011 at 03:52 AM (#3965879)
But as mentioned in the other thread, the Cubs have such a ridiculous financial advantage over their division, that just not shooting themselves in the foot should be all it takes in the long run.
True, but after years of Jim Hendry, I still think we can expect to see a marked improvement.
I'll bet money that Epstein didn't write a single line of code for this program.
I think it's less of an advantage thing than a not falling behind thing.
A few of us went over this in the thread dedicated to his firing in August. Hendry was something of a fence-sitter when it came to falling into the old-school or saber-friendly GM group. He made some moves in his tenure that indicated he had some semblance of saber understanding (trading for Derrek Lee, Aramis Ramirez, signing Kosuke Fukudome), but he also made plenty of moves that indicated he bought into some of the old-school GM mentality. Signing Jacque Jones, extending Neifi f'n Perez, and probably most damning to the long-term well-being of the franchise, wasting draft pick after draft pick on "toolsy" players.
Baseball has enough computer-toting general managers now that understanding the importance of stats such as OBP doesn't automatically put someone way ahead of the curve. Having such a general manager follow in the footsteps of a GM who was NOT like that, however, can likely pave the way for some good things in the coming years.
The Cubs have to keep on developing players and do it at a better scale. The Cubs don't have enough money to assemble a team via FA and they don't have the chits to augment it via trade which means they need to start getting those chits and turning them into tradeable assets or major league players for their team. Then they can go out and use their payroll lead to take chances in the FA market. Better known as the 2007-2008 plan.
Really? Outside of Vitters, it seems to me the Cubs' drafts focused on safe college players, as advocated by Moneyball.
Pie was not a draft pick. He was an international free agent signee, and not an especially expensive one IIRC. So, while the player development system might deserve some blame for not turning him from a top prospect to a good MLBer, it isn't like bringing him into the system was a bad move. On the contrary, it was a great move.
As for Wood, he was a very good draft pick. Getting a guy like him is a reason FOR drafting toolsy, raw guys.
From 2002 to 2005, Hendry's first years as GM, I'm not sure how hands-on he was with the draft. In that era the Cubs made six picks in the first round (including supplemental), and none have made it to the big leagues.
Then in the offseason after 2005, Tim Wilken was hired to be the S&D guy for his draft guru status.
The early returns have been mixed to disappointing. So far, three first-round guys drafted since then have made the majors, for a combined bWAR of -0.7. Some weird picks like Hayden Simpson (a huge reach who then got a wicked case of mono and has yet to fully recover to show if he was worth it), but also a lot of flawed guys who never overcame their flaw. Tyler Colvin still can't make consistent contact and Josh Vitters still can't do anything but make contact.
That said, there is some immediate hope for the Wilken drafts. Brett Jackson, Andrew Cashner and maybe Ryan Flaherty are all recent first-round picks who look poised to start piling up value at the MLB level in 2012.
This was a topic of discussion on Chicago sports talk radio, believe it or not.
As a professional IT guy, I find this just a little bit insulting. I know that Theo Epstein is the second coming and all of that, but the guy has a freaking American Studies degree, people. If Epstein wrote one line of code in this program – which may in fact be an enhanced Access database or something - I’d be astounded. I’m not saying that someone without a CS background can’t dabble in programming languages a little, but come on.
I just hope Theo has a few minutes, when he’s not leading the Cubs to their eleventh consecutive World Series title, to address world hunger and develop three or four new alternative energy sources.
If you have a chance to draft a guy like Kerry Wood, why wouldn't you? Toolsy is fine with pitchers. Unless you are Greg Maddux, I would rather take my chances with a hard thrower who needs to learn command of the strike zone than someone whose fastball is only 88 mph and paints the corner. Because he isn't going to learn to throw harder.
Putting down old-school baseball guys like Hendry and Dusty Baker as failing to win due to blinders about stats implies that baseball is a simple game to play and win. When you are dealing with human beings and competing with 29 other teams for limited resources, you don't always have the players you want even though you may know an ideal player to draft and trade for.
Did you guys read the S.I. "from the vault" piece about this that was posted here last week? It really does sound like just a database of scouting reports that cross references with stat splits. It also keeps records of things like the Red Sox's proprietary internal metrics.
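For what it's worth, the "database of scouting reports that cross references with stat splits" idea can be sketched in a few lines. This is only an illustration, not a description of how Carmine actually works: the table names, columns, players, and numbers below are all invented, and SQLite is used simply because it ships with Python.

```python
import sqlite3

# In-memory database with two made-up tables (all names and values are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scouting_reports (player TEXT, scout TEXT, note TEXT);
CREATE TABLE stat_splits (player TEXT, split TEXT, obp REAL);
INSERT INTO scouting_reports VALUES ('Player A', 'Scout 1', 'struggles with breaking balls');
INSERT INTO scouting_reports VALUES ('Player B', 'Scout 2', 'patient approach');
INSERT INTO stat_splits VALUES ('Player A', 'vs LHP', 0.310);
INSERT INTO stat_splits VALUES ('Player A', 'vs RHP', 0.355);
INSERT INTO stat_splits VALUES ('Player B', 'vs LHP', 0.390);
""")

def cross_reference(conn, player):
    """Return each scouting note for a player paired with that player's stat splits."""
    return conn.execute(
        """SELECT r.scout, r.note, s.split, s.obp
           FROM scouting_reports AS r
           JOIN stat_splits AS s ON s.player = r.player
           WHERE r.player = ?
           ORDER BY s.split""",
        (player,),
    ).fetchall()

result = cross_reference(conn, "Player A")
for scout, note, split, obp in result:
    print(scout, note, split, obp)
```

A real system would presumably sit on a server database and pull in far more data, but the cross-referencing itself is just a join on a shared player key.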
Some actually do. Some HS pitchers are already throwing 97 and this is as hard as they'll ever throw. Some are in the upper 80's there and gain velocity as they continue to grow. Projecting which young pitchers will succeed in the majors is probably the toughest job in the sport. I'd rather take my chances hitting a Neftali Feliz 100 MPH fastball than stake anything of value on how a HS pitcher will turn out.
I work with a couple guys like that. Let's just say, I go way out of my way to avoid working on any code that they have written. There is a reason they teach software design patterns when studying CS.
I doubt it's Access. I'm not sure Access could handle video, full pbp and pitch-by-pitch data.
As for Wood, he was a very good draft pick. Getting a guy like him is a reason FOR drafting toolsy, raw guys.
I didn't say they were bad moves. Person A said that Hendry liked to get toolsy guys and Person B disputed that by naming a recent draft pick. I'm simply bringing up Hendry's past to show that he did like toolsy guys.
A BBREF sponsorship? Er, wait...
This part kinda freaked me out because I thought the infamous base running play that defined the 2001 A's postseason was created by the Brothers Giambi, but Moneyball just deleted that whole portion.
A lot of it is remarkably clever. It's almost always badly documented and buggy. Though the best documentation I've seen in ~30 years of support was by a physicist.
Billy Beane should have never written that software.
I think projecting amateur hitting, especially high school hitting, has to be much tougher. Pitching talent is easier to see. At least you know a good curve or a big fastball when you see it. Granted, it's hard to guess right on projection and injuries (as you mention) but with hitting you really have very little idea of what you are seeing unless you're seeing it against good pitching. And you don't always have that luxury when scouting amateur hitters.
A lot of it is remarkably clever. It's almost always badly documented and buggy. Though the best documentation I've seen in ~30 years of support was by a physicist.
Guilty as charged. My programs work just fine for me, but documentation is not very good.
Same. I have an IT background but not a CS/programming one, per se. I'm entirely self-taught, and I'm sure my code is inelegant and inefficient.
The programs themselves work wonderfully. Documentation? Fie.
It will be interesting to see whether Tippett is one of the guys who leaves with Theo. Since Tom is so talented, I hope not. Even if he doesn't go, it would only be a matter of time before the Cubs could replicate Carmine.
Our problem is that all "official" development is handled by a specific org, and any substantive development effort is a 9-12 month/7 figure cost investment (and turns out like absolute ####, by the way). Anything smaller just gets ignored.
So there are a number of us that do under-the-radar coding to create simple tools to automate the drudge work. I'd hate for a guy like you to get stuck with debugging my code, so I tend to not share it.
An inefficient system, alas.
I would think the video piggybacks off MLBAM's system, integrates with an updated Diamond Mind simulation, uses something like MSSQL for play-by-play, scouting reports, and the like, and also integrates with whatever they are using for their scouts.
This is a not-uncommon predicament in larger companies. Management tends to get hung up on "software ROI" and won't fund projects that don't promise a significant return. Your solution is, sadly, what most companies in this situation turn towards - and then when IT changes the larger infrastructure and these unsupported applications break, all hell breaks loose.
-- MWE
Our problem is that all "official" development is handled by a specific org, and any substantive development effort is a 9-12 month/7 figure cost investment (and turns out like absolute ####, by the way). Anything smaller just gets ignored.
So there are a number of us that do under-the-radar coding to create simple tools to automate the drudge work. I'd hate for a guy like you to get stuck with debugging my code, so I tend to not share it.
Sounds like you work where I do. I write tons of "under-the-radar" code to analyze the performance of mobile phone networks from various system logs we have (Ed background: BSEE, MS in Applied Math nearly complete, but essentially self-taught as a programmer). So many times I have managers running to me with their hair on fire that they need some new view of network performance for the last 6 months and they absolutely need it "yesterday" that there's no time to write the code "the right way" and fully document everything even if I really knew what "the right way" was. Most of the time "just working" and "doesn't take a crazy long time to run" are the only standards I have time to meet.
Indeed -- the golden rule around here is that a project has to reduce actual headcount to get funding.
To prevent hell from breaking loose, they simply don't allow under-the-radar tools to become critical. This stunts efficiency, of course, but there's not really an alternative.
My background is in the network side (layer 2/3 especially). I mean, I've been fooling about with programming since the Apple II, and taken a few courses (a long time ago), but little or no real education in the field. What experience I do have was, until recently, in obsolete languages like Turbo Pascal, so nowadays programming for me tends to go like this:
* Identify what I want the tool to do
* Fire up Visual Studio
* Sketch out a basic dialog and lob some controls at it
* Start writing code
* Frequently google "How do I..." (wherein "..." might be "read a text file" or "change a text control" or whatever)
* See if it actually runs
And so on. Yeah, "just working" and "doesn't take a crazy long time to run" are usually my design goals as well.
However, I think there's a non-insignificant PR portion of this move that should not be ignored.
Great question. But it is Epstein the fans and media will not question; he has pillow factor.
Changed a few (read sequentially through huge file) to (series of keyed reads and join the results) and got the execution time down to 12 seconds.
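The kind of change described above, swapping repeated sequential reads for keyed reads whose results are joined, might look roughly like this. It is a minimal sketch under stated assumptions: the tiny CSV, its column names, and the function names are all hypothetical, and an in-memory dict stands in for whatever keyed access the real system offered.

```python
import csv
import io

# Hypothetical play-by-play extract; the columns are invented for illustration.
RAW = """player_id,game_id,result
p1,g1,single
p2,g1,strikeout
p1,g2,walk
p3,g2,home_run
p2,g3,double
"""

def scan_lookup(raw, player_id):
    """Slow path: read the whole file sequentially for every lookup."""
    reader = csv.DictReader(io.StringIO(raw))
    return [row for row in reader if row["player_id"] == player_id]

def build_index(raw):
    """Read the file once and group its rows by key."""
    index = {}
    for row in csv.DictReader(io.StringIO(raw)):
        index.setdefault(row["player_id"], []).append(row)
    return index

def keyed_lookup(index, player_ids):
    """Do a series of keyed reads and join the results into one list."""
    joined = []
    for pid in player_ids:
        joined.extend(index.get(pid, []))
    return joined

wanted = ["p1", "p3"]
index = build_index(RAW)
fast = keyed_lookup(index, wanted)
slow = [row for pid in wanted for row in scan_lookup(RAW, pid)]
print(fast == slow)
```

Both paths return the same rows. The difference is that the scan version re-reads everything once per key, while the indexed version touches the full file only once.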
A lot of it is remarkably clever. It's almost always badly documented and buggy.
That's me too - though I don't claim to be a programmer, just somebody who has to do a lot of it as part of his job. (Similarly, my wife is a DBA with an English degree ... she lucked into a low-level job circa Y2K and worked her way up.)
As I said in the other thread, it is seen as a courtesy that teams allow their employees to interview for a position that is a promotion, but it is not a requirement. The Red Sox asked for permission to interview Gene "Stick" Michael for the GM opening that Theo eventually filled, and the Yankees declined.
In this case one of the reasons that ownership allowed Theo to interview with the Cubs is that he told them that he would not return at the end of his contract. But this also means that they know they need a new GM, and Cherington is that guy. So if the Cubs were to try to make an end run around the Sox's compensation demands and go to Cherington, the Red Sox would simply decline them permission to interview him on the grounds that he is under contract and is their GM of the future (and is now in every facet but having had a press conference).
In my experience, there is no way to prevent that from happening - even when under-the-radar tools aren't mission-critical to the company, they become part of the fabric of operations at some level and it becomes nearly impossible to break them out.
-- MWE
Oh, puh-#######-lease. I have a degree in Post-Structural Continental Ethics and have made a rather decent living over the last two decades banging out datawarehousing code. Because, dude, it's just a special-syntax form of logical modeling, right? Some guy named Jimmy Furtado was a ####### fireman or some such nonsense, yet he banged out a relatively stable website that pulls from various databases and ####.
CS people who think CS is a real discipline are like the autistic kid who keeps shouting "I LIKE PIZZA!" You're math geeks who couldn't do the heavy #### required for theoretical physics and didn't have the contacts to get into investment banking.
Has there been any sort of definitive answer to whether "toolsy types" fail more often than, um, whatever the current vogue term is to describe "those guys that people assume exist aside from 'toolsy types' that give Billy Beane a stiffy?"
I'm a theoretical physicist, and I wish I were a better programmer than I am. Pretty much every project I'm involved with involves programming at some point, and making sure the program works is a major constraint on the speed with which I can do new projects.
For some projects, programming is nearly 100% of the work. A friend of mine has been working for 2+ years adapting a molecular scattering code to give new information. The original code is a mishmash of fortran subroutines originally written in the '70s and updated by a series of grad students since then. Good programming style would have improved his life immensely.
I'm not sure what design patterns are, but I'm pretty sure they're as worthless as most CS classes, which don't teach anything you can't learn better and faster by cracking a book and just doing it.
I spent a couple years studying for a CS degree, and quit when I got a job. My boss had dropped out of Carnegie Mellon and raised a few million to start a software business at age 23. What I learned on the job was how to architect for reuse and quality in every module, and how to test your code thoroughly before the QA department even saw it. I took what I learned to other jobs, and teams I managed shipped dozens of commercial apps, all high quality, and almost all on time.
The biggest problem with software development is most engineers don't think (or read) about process and how to improve it. They just want to get things done fast, and end up writing code quickly that's hard to test and verify, poorly architected and documented, but meets some arbitrary deadline. Then they try to dump it on someone else and move on to the next sexy project, but can't because of an endless series of bug fixes. Most amateur and many professional developers are oblivious to the fact that most development time is spent trying to fix what you wrote, and building the right way can cut maintenance work by a factor of ten.
I'm porting some Java code to the iPhone that I wrote in a rush months ago as a sort of prototype. If anyone who worked for me wrote it as badly as I'm finding it now, I'd fire them. I made all the classic mistakes of getting it to work first, then piling on additional features without a strong internal design or any thought of testing/maintenance. So now I'm doing it over the right way, and it will take twice as long, but I'll spend almost no time fixing anything when it is done.
I didn't think Beane was specifically looking for players to acquire in that scene, just players that could replace what they lost. So it was "How can we replace Giambi and Damon?" and the answer was "Well, we have Jeremy Giambi in house, he can get on base, and we can get Scott Hatteberg for nothing, etc.".