Over the past few weeks, I’ve written a weekly Sabermetric Mining piece that tries to provide utility for fantasy owners using more advanced statistics. But is it possible that, when it comes to predicting in-season pitching performance, one of the simplest “advanced” metrics is the one you should be using?
For my first Sabermetric Mining piece, I looked at FIP, xFIP, and SIERA as ERA predictors, highlighting SIERA as the favourite but identifying the benefits of each. Earlier this week, though, Glenn DuPaul of The Hardball Times put the estimators to the test in terms of their ability to predict in-season performance, and wrote:
At the same time, I think these results should be taken as both a lesson and a cautionary tale. The ERA estimators that were tested (xFIP, FIP, SIERA and tERA) all did a better job of predicting future ERA than actual ERA; which was to be expected and is the normal assumption in the sabermetric community. But although they did better than ERA, simply subtracting walks from strikeouts did a better job of predicting ERAs for the second half than any of the advanced statistics.
In other words, for all the advances ERA estimators have made in predicting future ERA, it is still a simple formula, (K-BB)/IP, that does best when it comes to in-season ERA prediction.
Tom Tango of Inside The Book reflected on Glenn’s work, suggesting:
I also seem to remember that in terms of forecasting 2, 3, 4 years down the line that kwERA did better than anything else out there. Basically, for all our sabermetric advances, simply relying on K and BB (differential, not ratio) is just about the best we’ve been able to come up with.
He also noted that (K-BB)/PA (plate appearances) is preferable to using innings pitched as a denominator, but that the results would be more or less the same. Further to that notion, he indicated he uses FIP and kwERA but not really xFIP. He goes into detail on why, but basically it’s because we know for certain what these two are measuring.
For the record, kwERA is an ERA estimator with K and BB as its sole inputs. I didn’t identify it in my original piece, but it is another tool you can utilize when it comes to predicting pitcher performance, and it seems it may be both the simplest and the easiest. Again, though, using (K-BB)/IP or (K-BB)/PA would tell you the same story, just not on an ERA scale (rather, it would be a ratio).
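To make the relationship between these three numbers concrete, here is a minimal sketch in Python. The kwERA form used, 5.40 - 12 × (K-BB)/PA, is the version I have most often seen attributed to Tango; treat the 5.40 constant and the multiplier of 12 as assumptions, since the constant in particular is tied to the league run environment. The function names and the stat line are hypothetical, made up purely for illustration.

```python
def k_minus_bb_per_pa(strikeouts, walks, plate_appearances):
    """Strikeout-minus-walk differential per plate appearance (batters faced)."""
    return (strikeouts - walks) / plate_appearances


def k_minus_bb_per_ip(strikeouts, walks, innings_pitched):
    """Same differential per inning pitched.

    innings_pitched must be a true decimal (one third of an inning is 1/3),
    not box-score notation where .1 and .2 stand for thirds.
    """
    return (strikeouts - walks) / innings_pitched


def kwera(strikeouts, walks, plate_appearances, constant=5.40, slope=12.0):
    """kwERA on an ERA scale.

    Assumption: constant=5.40 and slope=12.0 are the commonly cited values,
    not figures taken from the research discussed in this piece.
    """
    return constant - slope * k_minus_bb_per_pa(strikeouts, walks, plate_appearances)


# Hypothetical stat line, not a real pitcher.
strikeouts, walks, batters_faced, innings = 180, 50, 800, 195.0

print(round(k_minus_bb_per_pa(strikeouts, walks, batters_faced), 4))
print(round(k_minus_bb_per_ip(strikeouts, walks, innings), 4))
print(round(kwera(strikeouts, walks, batters_faced), 2))
```

Because kwERA is just a linear rescaling of (K-BB)/PA, ranking pitchers by either number gives the same order; the ERA scale only makes the figure easier to read next to ERA and FIP.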
Pursuant to that, I found a 2011 post from Tango that summed up some research as follows:
Overall, we see that while the ratio may have some additional information for us, a simple and straight strikeout minus walk differential per PA is a great indicator of performance.
Not to over-link, but I thought Eno Sarris’ piece at Getting Blanked did a nice job summing up this week’s saber-community discussion on this topic:
If you make a simple sauce, it’s easy to evaluate the ingredients. The more complicated the sauce, the more likely you’re left wondering which input was the spoiled one. Everything we needed to know about pitching we learned in the kitchen, it seems.
Here, of course, K and BB are the simple ingredients he is referring to.
None of this is to say that FIP, xFIP, SIERA and others don’t have a place or value, because they definitely do, especially for offseason analysis. Anything that improves your understanding of the components of pitcher success has value; this new research simply reinforces that scanning the xFIP leaderboard is not sufficient.
In addition, further research could be done on how the components of strikeouts and walks, for example swinging strike percentage or first pitch strike percentage, do in predicting future ERA, perhaps letting us improve on K-BB metrics.
Beyond just this K-BB analysis, you can expand your research to include components of strikeouts, as I outlined in August, and perhaps look for pitchers due to improve or decline in the strikeout category, and thus, K-BB metrics.
On the odd chance you’re still streaming pitchers to try and win a fantasy title at this point, the chart below shows pitchers available in more than 60% of Yahoo leagues and their ERA, FIP, (K-BB)/IP and (K-BB)/PA.
The higher the (K-BB)/PA, the better, obviously, as it indicates a greater ability to generate outs and a decreased propensity to allow free baserunners and thus scoring opportunities. Since those two things are the core components of ERA, it makes sense that a ratio that indicates increased outs and decreased runners (and therefore scoring opportunities) is a strong predictor of ERA. What’s even more appealing is that strikeouts and walks are generally considered the elements most within a pitcher’s control, so there are fewer situational mitigating factors at play than with some other metrics.
It will certainly be an interesting offseason in the statistical community, as I’m sure Glenn’s findings will encourage further research on ERA estimators, their efficacy, and how the components of K and BB work to predict ERA as well.