Regressed WAR 2015

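Wins Above Replacement (WAR) estimates how many wins a player adds to his team compared with a readily available replacement-level player. You can find a more in-depth definition of it here.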

A player’s value is some combination of performance and randomness, or luck. Hitting a hard line drive is performance; hitting it into the glove of a diving Jason Heyward is luck. Hitting a soft fly ball is performance; hitting it toward right field on a windy day in Yankee Stadium is luck. For position players, WAR in its current form does not distinguish between value from performance and value from luck. While this is appropriate for the purposes of WAR, we as fans are often more concerned with the performance side of things. At the end of a season, most of that luck will have evened out, and the performance component will outweigh the luck component. But even if outweighed, that luck component is still present, especially near the beginning of the season when the sample sizes are still small. Because of that, I thought it appropriate to find a way to calculate a player’s value with neutral luck. What I came up with is what I’m calling Regressed WAR. Some of you may remember that I already tried this once, but my original methods were overly complicated, so I thought it best to give it another go.

A position player’s value comes from his base running, batting, and fielding, so these are the things we need to correct. I don’t know how to correct a player’s base running for luck, so Regressed WAR only includes the batting and fielding adjustments.

A player’s offensive value is determined by using weighted On-Base Average (wOBA), which is a function of a batter’s walks, hits, and outs that weights each event according to how many runs it’s worth. For example, a home run is worth roughly 2.5 times as much as a single, so wOBA weights each accordingly. It would be exceedingly complex to look at each component of wOBA and try to separate the performance from the luck, but luckily there’s a much easier way. Using a batter’s walk rate, strikeout rate, Batting Average on Balls In Play (BABIP), fly ball rate, and Home Run to Fly Ball ratio (HR/FB), you can predict his wOBA with extraordinary accuracy. For the visual readers, here’s a plot of wOBA versus Predicted wOBA:

A player’s walk, strikeout, and fly ball rates are mostly under his control, so they don’t require any sort of adjustment. That leaves us with just BABIP and HR/FB to correct, which can be done easily with a couple of publicly available tools. Correcting a player’s BABIP can be done by looking at his line drive, ground ball, fly ball, and home run rates, and correcting his HR/FB can be done by looking at the distance and direction of his fly balls. The Expected BABIP (xBABIP) formula I’m using can be found here, though I should note that I use Jeff Zimmerman’s xBABIP values when available. Jeff Zimmerman also tracks Expected Home Runs (xHR) going back to 2007, and those values can be found here.

Once we have an Expected BABIP and Expected HR/FB, we can plug them into our Predicted wOBA formula to give us an Expected wOBA (xwOBA), which can then be converted to an offensive runs adjustment (ORAd) using the formulas found in the appendix. That takes care of our offensive adjustment, which means we’re halfway to our Regressed WAR!
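For readers who prefer code, here is a minimal Python sketch of that plug-in step. It evaluates the Predicted wOBA formula from the appendix twice: once with a batter's actual BABIP and HR/FB, and once with the expected values swapped in. The names (BatterLine, predicted_woba, expected_woba) are my own for illustration, and the coefficient and input values are made-up placeholders, not the fitted C1 through C6 or any real player's line.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BatterLine:
    """The five inputs to the Predicted wOBA formula, all as decimals."""
    bb_pct: float      # walk rate
    k_pct: float       # strikeout rate
    babip: float       # batting average on balls in play
    fb_pct: float      # fly ball rate
    hr_per_fb: float   # home runs per fly ball


def predicted_woba(coeffs, line):
    """Predicted wOBA = C1 + C2*BB% + C3*K% + C4*BABIP + C5*FB% + C6*(HR/FB)."""
    c1, c2, c3, c4, c5, c6 = coeffs
    return (c1
            + c2 * line.bb_pct
            + c3 * line.k_pct
            + c4 * line.babip
            + c5 * line.fb_pct
            + c6 * line.hr_per_fb)


def expected_woba(coeffs, line, x_babip, x_hr_per_fb):
    """xwOBA: the same formula, with luck-neutral BABIP and HR/FB swapped in."""
    neutral_line = replace(line, babip=x_babip, hr_per_fb=x_hr_per_fb)
    return predicted_woba(coeffs, neutral_line)


# PLACEHOLDER values only: these are NOT the article's fitted coefficients.
placeholder_coeffs = (0.05, 0.6, -0.2, 0.7, 0.02, 0.9)
hitter = BatterLine(bb_pct=0.08, k_pct=0.20, babip=0.330,
                    fb_pct=0.35, hr_per_fb=0.15)

print(predicted_woba(placeholder_coeffs, hitter))
print(expected_woba(placeholder_coeffs, hitter, x_babip=0.300, x_hr_per_fb=0.10))
```

Walk rate, strikeout rate, and fly ball rate pass through untouched, which matches the idea that they are mostly under the batter's control.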

The luck in fielding mainly comes from the noisiness of the statistics used to measure it. To deal with this, I regress a player’s defensive value toward league average (zero) and toward his defensive value in each of his previous three seasons, with more recent data given extra weight. The amount of regression I used was partly based on comments in FanGraphs’ UZR primer, and partly based on intuition. Therefore, while it may provide values I’m comfortable with, I wouldn’t call them “correct.” If any of the more statistically savvy readers would like to look over my formula and provide input, I’m all ears. For now, we’ll just go with what we’ve got.
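To make the idea concrete, here is one way such a regression could be sketched in Python. This is not my actual formula. It is a generic weighted average that assumes defensive value is regressed as a per-plate-appearance rate, that league average is represented by a block of phantom zero-run plate appearances, and that older seasons get smaller weights. The function names, the season weights, and the prior size are all hypothetical.

```python
def regressed_def_rate(seasons, weights, prior_pa):
    """Return a regressed defensive-runs-per-PA rate.

    seasons  -- list of (def_runs, pa) tuples, most recent season first
    weights  -- one weight per season (hypothetical; larger = trusted more)
    prior_pa -- phantom league-average PA worth zero defensive runs;
                a larger value pulls the rate harder toward zero
    """
    if len(seasons) != len(weights):
        raise ValueError("need exactly one weight per season")
    weighted_runs = sum(w * runs for (runs, _), w in zip(seasons, weights))
    weighted_pa = sum(w * pa for (_, pa), w in zip(seasons, weights))
    total_pa = prior_pa + weighted_pa
    if total_pa <= 0:
        raise ValueError("need a positive prior or some weighted PA")
    # The prior adds PA but no runs, which is what shrinks the rate toward 0.
    return weighted_runs / total_pa


def fielding_adjustment(r_def_rate, pa, def_runs):
    """FAd = (rDef x PA) - Def, reading rDef as a per-PA rate."""
    return r_def_rate * pa - def_runs


# Hypothetical player: current season first, then the three before it.
history = [(10.0, 500), (4.0, 600), (-2.0, 550), (6.0, 580)]
hypothetical_weights = [1.0, 0.7, 0.5, 0.3]

rate = regressed_def_rate(history, hypothetical_weights, prior_pa=400)
print(fielding_adjustment(rate, pa=500, def_runs=10.0))
```

Raising prior_pa or lowering the weights on past seasons changes how aggressively the current season is pulled toward zero, which is exactly the part I chose by intuition.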

Anyway, at this point you have both an offensive and a defensive adjustment, which can be used to give you a final Regressed WAR (rWAR). In my provided spreadsheet, you will also see rWAR(o) and rWAR(d). These are simply rWAR with either just the offensive or just the defensive adjustment included. Remember that these numbers are simply a quick, dirty, and fun way to see through the noise of randomness and get an idea of how well a player has been performing. Use WAR (or Regressed Defense WAR) for determining his actual value, because you can’t act like those balls didn’t fall for hits or fly over the fence. Use a player projection, like those provided by ZiPS and Steamer, for determining his true talent level, because anybody can get hot for a couple of weeks or months. Dig through the numbers, look over the formulas, play with your own weights if you’d like. But most of all, have fun arguing over pointless stuff with your friends. Enjoy!


Appendix: Formulas

Predicted wOBA = C1 + [C2 × BB%] + [C3 × K%] + [C4 × BABIP] + [C5 × FB%] + [C6 × (HR/FB)]

ORAd = [((xwOBA – league wOBA) / wOBA scale) × PA] – [((wOBA – league wOBA) / wOBA scale) × PA]
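The ORAd formula above can be sketched in a few lines of Python. The function names are my own, and the league wOBA, wOBA scale, and player numbers in the usage lines are illustrative inputs rather than official season constants.

```python
def batting_runs_above_average(woba, league_woba, woba_scale, pa):
    """Runs above average implied by a wOBA over a number of plate appearances."""
    return (woba - league_woba) / woba_scale * pa


def offensive_runs_adjustment(xwoba, woba, league_woba, woba_scale, pa):
    """ORAd: runs implied by xwOBA minus runs implied by actual wOBA."""
    expected_runs = batting_runs_above_average(xwoba, league_woba, woba_scale, pa)
    actual_runs = batting_runs_above_average(woba, league_woba, woba_scale, pa)
    return expected_runs - actual_runs


# Illustrative inputs: a hitter whose results have outrun his expected wOBA.
print(offensive_runs_adjustment(xwoba=0.340, woba=0.360,
                                league_woba=0.313, woba_scale=1.25, pa=500))
```

Because the league wOBA term appears in both halves, it cancels, so the adjustment is equivalent to ((xwOBA - wOBA) / wOBA scale) × PA. A negative ORAd means the batter has been luckier than his underlying numbers suggest.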

rWAR(o) = WAR + ORAd/(R/W)

Regressed Defense Equation

FAd = (rDef × PA) – Def

rWAR(d) = WAR + FAd/(R/W)

rWAR = WAR + (FAd + ORAd)/(R/W)
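Putting the three rWAR formulas together, a minimal Python sketch might look like the following. The function and argument names are my own, and the inputs in the usage lines (including the runs-per-win value) are illustrative rather than taken from any real player or season.

```python
def regressed_war(war, offensive_adj, fielding_adj, runs_per_win):
    """Return rWAR(o), rWAR(d), and rWAR from WAR and the two run adjustments.

    offensive_adj -- offensive runs adjustment, in runs
    fielding_adj  -- fielding runs adjustment (FAd), in runs
    runs_per_win  -- the R/W conversion factor; must be non-zero
    """
    if runs_per_win == 0:
        raise ValueError("runs_per_win must be non-zero")
    return {
        "rWAR(o)": war + offensive_adj / runs_per_win,
        "rWAR(d)": war + fielding_adj / runs_per_win,
        "rWAR": war + (offensive_adj + fielding_adj) / runs_per_win,
    }


# Illustrative inputs only.
for name, value in regressed_war(war=4.0, offensive_adj=-8.0,
                                 fielding_adj=-5.0, runs_per_win=10.0).items():
    print(name, round(value, 2))
```

Both adjustments are in runs, so dividing by runs per win puts them on the same scale as WAR before they are added.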

Stephen came up with the idea for this blog shortly after graduating from Tech. Realizing that life is ephemeral, he decided to put (metaphorical) pen to paper and catalogue his thoughts. His thoughts are a series of numbers and spreadsheets, casually categorized as “research,” and said research is usually conducted on the margins of what is both relevant and socially acceptable.
