Over the course of the baseball season, WFNY will be providing you with a bit of a different look into the Cleveland Indians. We can debate wins, losses and payroll economics ad nauseam, but what it all boils down to is the numbers. And that’s where Jon Steiner comes in. Formerly of Bugs and Cranks, Mr. Steiner will use advanced statistics to give us the occasional look into what members of the Cleveland Indians are truly providing the team. Today, we have the first installment of what will be a great addition to our upcoming Tribe coverage. Do enjoy.
How should we evaluate a player’s offensive contribution? Why do people tell us that Shin-Soo Choo is the best hitter the Indians have, despite the fact that Asdrubal Cabrera had a higher average, and Choo struck out 17 times more than Jhonny Peralta in 2009? Why, after all these years, do we not have a reliable offensive metric?
Turns out, we do. But first, let’s examine some of the current ways of evaluating offensive performance and why they don’t quite do what we’d prefer.
Remember, the most important thing that a batter does—really, the only thing he should do—is produce runs for his team. After all, more runs means more wins, and that’s what it’s all about.
So here’s what we currently have:
Runs Batted In (RBI): The Ribbie, the Ribeye, and the first line on any cleanup hitter’s resume. Runs batted in count the number of runs that result from a player’s plate appearances (PAs) accrued over the season (less some runs produced by double plays, errors, etc.). But the problem here is obvious: RBI are completely context-driven. A player who bats with runners in scoring position on a regular basis (see: Teixeira, Mark) will necessarily have more chances for RBI than your average #3 hitter, but does that make him a superior hitter to someone like Albert Pujols? No. A thousand times, no. Blasphemer.
Runs (R): Jim Caple recently wrote an interesting piece in which he argued that the run is undervalued (Insider required). I beg to differ; in fact, we pay way too much attention to runs scored. True, runs win games, but unfortunately, players rarely control whether they’re being driven in consistently (unless they drive themselves in via the HR). Think of runs scored as the similarly deformed cousin of the RBI: they are dependent on too many variables that a player simply can’t control. If Grady Sizemore hits a triple in every at bat (AB), but is then stranded at third each time, would he be a bad hitter? No. He’d be a great player who’s being let down by his team. Not good enough.
Batting Average (Avg): Batting average measures the rate at which a player hits his way safely onto base per official AB. Seems straightforward, simple, and useful. In fact, it is. But it certainly misses some important components of a hitter’s job. First, it does not account for the value of the walk, since a walk is not counted as an official AB. And, believe it or not, a walk can be quite valuable. Second, it doesn’t account for the difference between a double and a single, or the difference between a single and a HR for that matter. It’s a good stat that can tell us a good deal about a player, but it obviously isn’t a solid measuring stick if it tells us that Paul O’Neill’s 1997 season (.324 avg) was more valuable than Jim Thome’s (.286 avg). Keep going.
On-Base Percentage (OBP): Measures a player’s ability to not make an out. Very valuable, more so even than batting average. Back to our 1997 example, Thome’s OBP (.423) blew away O’Neill’s (.399) because Thome walked in nearly 20% of his PAs compared to less than 12% for O’Neill. Those extra times on base eventually become runs, which is what we’re after. But OBP still misses the difference between getting a walk and hitting a HR. That’s a big difference, no?
Slugging Percentage (SLG): Slugging percentage measures how many bases a player generates per AB. So if a batter hits four doubles in four ABs, his slugging percentage will be 2.000, exactly 2 bases per AB. Similarly, if he hits a single, walks (not an official AB), flies out to center, and hits a HR, his slugging will be (1.000+0.000+4.000)/3, or 1.667. Basically, this stat adds the power component to batting average. Nevertheless, it still misses walks, because its denominator is AB, rather than plate appearances. So….
On-Base Plus Slugging (OPS): Why not just add the two? If we add OBP to slugging, we get a metric that combines (albeit haphazardly) the two skill sets that seem most attractive to us thus far: the ability not to make outs and the ability to hit for power. The problems here are less obvious to the casual reader, but for starters, we’re adding metrics that are measured over different samples: slugging is measured in ABs and OBP is measured in PAs. Therefore, the result is a bit difficult to interpret. But the underlying flaw in OPS is even more interesting: a player’s ability not to make an out (OBP) is actually more valuable than a player’s ability to hit for power (slugging) by a factor of about 2 to 1 (ask for details if you’re interested). So if Player A has an OBP of .385 and a slugging percentage of .400, he is more valuable than Player B, who has an OBP of .360 but a slugging of .425, even though they both posted an OPS of .785. Why? Because he makes fewer outs, of course. And if your team doesn’t make outs, you score an infinite number of runs. It’s hard to lose when you score infinity runs!
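For anyone who likes to see the arithmetic, here is a minimal Python sketch of the rate stats above. The function names are mine, the OBP formula is the standard one (sacrifice flies go in the denominator), and the two stat lines are hypothetical ones chosen so that both add up to a .785 OPS. The 2-to-1 weighting is just the rough rule of thumb mentioned above, not an official stat.

```python
def batting_average(hits, at_bats):
    """Hits per official at bat."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies=0):
    """Times on base per (roughly) plate appearance."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per official at bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp, slg):
    """Plain OPS: OBP and slugging weighted equally."""
    return obp + slg


def weighted_ops(obp, slg, obp_weight=2.0):
    """Illustrative only: count OBP about twice as heavily as slugging."""
    return obp_weight * obp + slg


# One game: single, walk, fly out, home run -> 4 PA but only 3 official AB.
game_avg = batting_average(hits=2, at_bats=3)
game_obp = on_base_percentage(hits=2, walks=1, hit_by_pitch=0, at_bats=3)
game_slg = slugging(singles=1, doubles=0, triples=0, home_runs=1, at_bats=3)
print(f"AVG {game_avg:.3f}  OBP {game_obp:.3f}  SLG {game_slg:.3f}")

# Two hypothetical hitters as (OBP, SLG) with identical OPS.
player_a = (0.385, 0.400)
player_b = (0.360, 0.425)
for name, (obp, slg) in (("A", player_a), ("B", player_b)):
    print(f"Player {name}: OPS {ops(obp, slg):.3f}, "
          f"2:1 weighted {weighted_ops(obp, slg):.3f}")
```

The two players tie on plain OPS, but once OBP counts double, Player A comes out ahead, which is the whole complaint about OPS in a nutshell.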
So the metrics currently at our disposal are not measuring the important stuff, or at least not measuring it as well as we’d like. How to solve this problem? The answer lies in “linear weights,” and below we’ll discuss what this means, how we can do better, and how the current Indians measure up.
First, we need to identify the major things a batter can do in any PA. He can: (1) make an out; (2) hit a single; (3) hit a double; (4) hit a triple; (5) hit a HR; (6) walk; (7) reach base on error; and (8) get hit by a pitch.
If we can determine how many runs each of these events typically produces (or doesn’t, in the case of an out) in a generic context, then we can multiply that factor (called a coefficient, by nerds) by the number of times each event occurred, add up the results, divide by total PA, and voilà! A new stat. It’s called weighted on-base average (wOBA), and was created by Tom Tango to estimate a player’s ability to produce runs for his team.
Here’s the final formula, but Fangraphs provides all the data, absolutely free:
wOBA = (0.00×Outs + 0.72×BB + 0.75×HBP + 0.90×1B + 0.92×RBOE + 1.24×2B + 1.56×3B + 1.95×HR) / PA
A few things to note from the formula, before we leave the math behind. First, we can see that while a walk is valuable (coefficient of .72), it’s not as valuable as a single (coefficient of .90). Makes sense, since singles are more likely to drive in runs than walks. Similarly, a home run is obviously the most valuable thing a batter can do, and the formula reflects that. Finally, since outs don’t contribute positively to a player’s production in this formula, they are typically left out of the numerator, but obviously counted in the total PA.
The wOBA formula (I pronounce it like it rhymes with “Joba” with a long “o”, if that helps) includes a minor adjustment which scales the stat to look more like OBP, so that we understand that an average hitter is around .330, a great hitter is .390+, and Josh Barfield is around .300. So simple!
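If you’d rather let a computer do the multiplying, here is a small Python sketch of the formula using the coefficients quoted above. The function name, the dictionary layout, and the sample season are all made up for illustration; FanGraphs adjusts its published weights from year to year, so treat this as the version in this post rather than the official calculation.

```python
# Coefficients exactly as quoted in the formula above. Outs are worth 0.00,
# so they only show up in the plate-appearance total.
WOBA_WEIGHTS = {
    "BB": 0.72,    # walk
    "HBP": 0.75,   # hit by pitch
    "1B": 0.90,    # single
    "RBOE": 0.92,  # reached base on error
    "2B": 1.24,    # double
    "3B": 1.56,    # triple
    "HR": 1.95,    # home run
}


def woba(event_counts, plate_appearances):
    """Weighted on-base average: sum of (weight x count) divided by PA."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    weighted_sum = sum(
        WOBA_WEIGHTS[event] * count for event, count in event_counts.items()
    )
    return weighted_sum / plate_appearances


# A hypothetical 600-PA season (not a real player).
sample_season = {
    "BB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 3, "HR": 20,
}
print(f"wOBA: {woba(sample_season, 600):.3f}")
```

This made-up season works out to roughly .371, a very good hitter on the scale described above.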
So how does this help us think about the Indians? Who has the highest wOBA on the current team? In short, who were the Indians’ best offensive players last year, and how have they performed over their respective careers?
First, a look at the 2009 team (beware—it’s not pretty):
Name | Pos | 2009 wOBA |
Shin-Soo Choo | OF | 0.389 |
Victor Martinez | C/1B | 0.361 |
Ryan Garko | 1B | 0.361 |
Travis Hafner | DH | 0.355 |
Asdrubal Cabrera | SS | 0.354 |
Josh Barfield | 2B | 0.350 |
Mark DeRosa | 3B/OF | 0.346 |
Grady Sizemore | OF | 0.343 |
Ben Francisco | OF | 0.339 |
Kelly Shoppach | C | 0.329 |
Matt LaPorta | OF | 0.327 |
Lou Marson | C | 0.326 |
Jamey Carroll | 2B/3B | 0.317 |
Michael Brantley | OF | 0.311 |
Luis Valbuena | 2B/SS | 0.308 |
Jhonny Peralta | 3B/SS | 0.304 |
Andy Marte | 1B | 0.299 |
David Dellucci | DH | 0.297 |
Trevor Crowe | OF | 0.277 |
Chris Gimenez | 1B/OF | 0.233 |
Wyatt Toregas | C | 0.223 |
Wow. Shin-Soo Choo. And it’s not even close. I guess another depressing thing about this list is how many of our top performers from last year are gone: V-Mart, Garko, DeRosa, and Benny. Finally, look at the corpse of David Dellucci. Keep in mind that a bad player who can be grabbed from the AAA squad and paid the league minimum posts a wOBA of about .300. That means we paid Dellucci millions of dollars to perform at a level beneath your average AAA outfielder. No wonder the Dolans are tired of spending money on veterans.
And talk about Grady having a down year? Here are Grady’s wOBAs and batting averages from the last four seasons:
Year | BA | wOBA |
2006 | 0.290 | 0.386 |
2007 | 0.277 | 0.376 |
2008 | 0.268 | 0.384 |
2009 | 0.248 | 0.343 |
A couple of things, here. First, Grady’s never had a season as strong as Choo’s 2009 (remember, we’re talking only about his performance in the batter’s box, and ignoring his defense and baserunning for the time being). Second, look at that dip from ’07 to ’08 in his batting average. It turns out he more than made up for that dip by boosting his slugging by 40 points and increasing his ratio of walks to strikeouts. And this is why people talk about his potential for greatness despite his unimpressive batting averages: he’s only 27 and he’s already good at the two things that are most important, power and not making outs. The fact that the outs he does make are largely concentrated as strikeouts is not such a big deal, as long as his other skills remain sharp.
Now let’s turn to another Tribe whipping boy. The Pronkian Disaster. Here are Hafner’s lines from the past 5 seasons:
Year | BA | wOBA |
2005 | 0.305 | 0.422 |
2006 | 0.308 | 0.449 |
2007 | 0.266 | 0.360 |
2008 | 0.197 | 0.275 |
2009 | 0.272 | 0.355 |
Okay. So the drop from a .308 hitter to a .266 hitter from ’06 to ’07 tells us something, but Grady dropped from .290 to .268 and remained a strong offensive player. So what about Hafner’s wOBAs? Jeez! He went from Barry Bonds to Barry Larkin to Barry Zito in the course of three seasons. Remember when he got his albatross contract? That’s right: early in the 2007 season, before his production fell off a cliff. Say what you will about the wisdom of handing the largest contract in Indians history to a 30-year-old slugger with no defensive value, but hey, at least he really did hit the snot out of the ball for the two years prior.
To explore just how good Hafner was, and how pedestrian he has become, here is the wOBA leader board for the 2006 season:
Name | OPS | wOBA |
Travis Hafner | 1.097 | 0.449 |
Albert Pujols | 1.102 | 0.448 |
Ryan Howard | 1.084 | 0.436 |
Manny Ramirez | 1.058 | 0.434 |
David Ortiz | 1.049 | 0.427 |
Lance Berkman | 1.041 | 0.425 |
Jim Thome | 1.014 | 0.420 |
Jermaine Dye | 1.006 | 0.417 |
Miguel Cabrera | 0.998 | 0.413 |
Carlos Beltran | 0.982 | 0.412 |
That’s right. In 2006, Travis was the best offensive player in baseball. Ahead of El Hombre, even, if you believe that wOBA beats out OPS as an accurate measuring stick. (By the way, here’s one instance of over-valuing power: Pujols out-slugged Hafner by 12 points in 2006, but Hafner walked in nearly 5% more PAs, so his OBP was considerably higher. wOBA appreciates this difference in value, whereas OPS ignores it.)
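To see exactly where the two metrics disagree, here is a quick Python sketch that ranks the ten hitters from the 2006 table above by each stat. The numbers are copied straight from that table; the variable names are just mine.

```python
# (name, OPS, wOBA) rows from the 2006 leaderboard above.
leaders_2006 = [
    ("Travis Hafner", 1.097, 0.449),
    ("Albert Pujols", 1.102, 0.448),
    ("Ryan Howard", 1.084, 0.436),
    ("Manny Ramirez", 1.058, 0.434),
    ("David Ortiz", 1.049, 0.427),
    ("Lance Berkman", 1.041, 0.425),
    ("Jim Thome", 1.014, 0.420),
    ("Jermaine Dye", 1.006, 0.417),
    ("Miguel Cabrera", 0.998, 0.413),
    ("Carlos Beltran", 0.982, 0.412),
]

# Sort the same rows two ways: best OPS first, then best wOBA first.
by_ops = sorted(leaders_2006, key=lambda row: row[1], reverse=True)
by_woba = sorted(leaders_2006, key=lambda row: row[2], reverse=True)

# Map each player to his 1-based rank under each metric.
ops_rank = {name: rank for rank, (name, _, _) in enumerate(by_ops, start=1)}
woba_rank = {name: rank for rank, (name, _, _) in enumerate(by_woba, start=1)}

# Players whose position changes depending on the metric.
movers = [name for name in woba_rank if woba_rank[name] != ops_rank[name]]

for name, ops_value, woba_value in by_woba:
    flag = "  <- moves" if name in movers else ""
    print(f"{woba_rank[name]:>2}. {name:<15} wOBA {woba_value:.3f} "
          f"(OPS rank {ops_rank[name]}){flag}")
```

Within this top ten, the only disagreement is at the very top: OPS puts Pujols first, wOBA puts Hafner first, and everyone else holds the same spot either way.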
Anyway, here’s Hafner and his closest companions on the 2009 wOBA leaderboard:
Name | wOBA |
Brian Roberts | 0.356 |
Magglio Ordonez | 0.356 |
Michael Aubrey | 0.356 |
Travis Hafner | 0.355 |
Luke Scott | 0.355 |
Rajai Davis | 0.354 |
Marco Scutaro | 0.354 |
Asdrubal Cabrera | 0.354 |
Jacoby Ellsbury | 0.354 |
And that’s the thing. Travis went from the best hitter in all of baseball, to Luke Scott, but without any defensive value. Whether it’s injuries, or, ahem, something else, we just can’t count on the guy these days. Oh yeah, and we still owe him over $40 million. Fun!
I could spit wOBA charts out for you all day, and we wouldn’t even come close to scratching the surface of the data that’s out there on the interwebs these days. There are literally hundreds of offensive statistics, and this is an introduction to just one that I’ve found helpful, not too difficult to understand (linear weights), and thoughtfully constructed. If you have any questions about what I’ve written here, leave them in the comments, and I’ll do my best to give you an answer or at least a good place to look for one.
See you next time!
Thanks to the guys at WFNY for picking me up as an occasional contributor. Much of the research in this series is built on ideas from The Book: Playing the Percentages in Baseball, the ongoing work at FanGraphs and Tom Tango’s blog, and the countless other blogs and books that refuse to stop thinking and arguing about baseball.