# Predicting pitchers’ walks using xBB%

The other day, I discussed predicting pitchers’ strikeout rates using xK%. I will conduct the same exercise today in regard to predicting walks. Using my best intuition, I want to see how well a pitcher’s walk rate (BB%) actually correlates with what his walk rate should be (expected BB%, henceforth “xBB%”). Similarly to xK%, I used my intuition to best identify reliable indicators of a pitcher’s true walk rate using readily available data.

An xBB% metric, like xK%, would indicate not only whether a pitcher perennially over-performs (or under-performs) his walk rate but also whether he happened to do so in a given year. This article will conclude by looking at how the difference between actual and expected walk rates (BB% – xBB%) varied between 2014 and career numbers, lending some insight into the (un)luckiness of each pitcher.

Courtesy of FanGraphs, I constructed another set of pitching data spanning 2010 through 2014. This time, I focused primarily on what I thought would correlate with walk rate: inability to pitch in the zone and inability to incur swings on pitches out of the zone. I also throw in first-pitch strike rate: I predict that counts that start with a ball are more likely to end in a walk than those that start with a strike. Because FanGraphs’ data measures ability rather than inability — “Zone%” measures how often a pitcher hits the zone; “O-Swing%” measures how often batters swing at pitches out of the zone; “F-Strike%” measures the rate of first-pitch strikes — each variable should have a negative coefficient attached to it.

I specify a handful of variations before deciding on a final version. Instead of using split-season data (that is, each pitcher’s individual seasons from 2010 to 2014) for qualified pitchers, I use aggregated statistics because the results better fit the data by a sizable margin. This surprised me because there were about half as many observations, but it’s also not surprising because each observation is, itself, a larger sample size than before.

At one point, I tried creating my own variable: looks (non-swings) at pitches out of the zone. I created it by finding the percentage of pitches out of the zone (1 – Zone%) and multiplying it by how often a batter refused to swing at them (1 – O-Swing%). This version of the model produced a nice fit, but it was slightly worse than leaving the variables separated. Also, I ran separate-but-equal regressions for PITCHf/x data and FanGraphs’ own data. The PITCHf/x data appeared to be slightly more accurate, so I proceeded using them.
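For anyone who wants to rebuild that combined "looks" variable, here is a minimal Python sketch. The function name `look_rate` and the sample inputs are made up for illustration, and it assumes Zone% and O-Swing% are entered as proportions (0.45 rather than 45).

```python
def look_rate(zone_pct: float, o_swing_pct: float) -> float:
    """Share of all pitches that are both out of the zone and not swung at.

    Both inputs are assumed to be proportions between 0 and 1.
    """
    out_of_zone = 1.0 - zone_pct   # 1 - Zone%
    taken = 1.0 - o_swing_pct      # 1 - O-Swing%
    return out_of_zone * taken


# Hypothetical inputs, not taken from any real pitcher
example = look_rate(zone_pct=0.50, o_swing_pct=0.30)
print(f"{example:.3f}")
```

Folding the two rates into one number throws away some information, which may be part of why the separated version fit slightly better.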

The graph plots actual walk rates versus expected walk rates. The regression yielded the following equation:

xBB% = .3766176 – .2103522*O-Swing%(pfx) – .1105723*Zone%(pfx) – .3062822*F-Strike%
R-squared = .6433
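To make the equation easier to reuse, here is a small Python sketch of it. The coefficients are copied from the regression above; the function name `xbb_pct` and the sample inputs are hypothetical, and it assumes every rate is entered as a proportion (0.30, not 30), which is what the size of the intercept suggests.

```python
# Coefficients copied from the regression equation in the post
INTERCEPT = 0.3766176
B_O_SWING = -0.2103522   # O-Swing% (PITCHf/x)
B_ZONE = -0.1105723      # Zone% (PITCHf/x)
B_F_STRIKE = -0.3062822  # F-Strike%


def xbb_pct(o_swing: float, zone: float, f_strike: float) -> float:
    """Expected walk rate as a proportion, given rates as proportions."""
    return (INTERCEPT
            + B_O_SWING * o_swing
            + B_ZONE * zone
            + B_F_STRIKE * f_strike)


# Hypothetical pitcher: 30% O-Swing, 50% Zone, 60% first-pitch strikes
expected = xbb_pct(o_swing=0.30, zone=0.50, f_strike=0.60)
print(f"xBB% = {expected:.1%}")
```

Because all three coefficients are negative, raising any of the inputs lowers the expected walk rate.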

Again, R-squared indicates how well the model fits the data. An R-squared of .64 is not as exciting as the R-squared I got for xK%; it means the model explains about 64 percent of the variation in walk rate, and the other 36 percent is explained by things I haven’t included in the model. Certainly, more variables could help explain walk rate. I am already considering combining FanGraphs’ PITCHf/x data with some of Baseball Reference’s data, which does a great job of keeping track of the number of 3-0 counts, four-pitch walks and so on.

And again, for the reader to use the equation above to his or her benefit, one would plug in the appropriate values for a player in a given season or time frame and determine his xBB%. Then one could compare the xBB% to single-season or career BB% to derive some kind of meaningful results. And (one more) again, I have already taken the liberty of doing this for you.

Instead of including every pitcher from the sample, I narrowed it down to only pitchers with at least three years’ worth of data in order to yield some kind of statistically significant results. (Note: a three-year sample is a small sample, but three individual samples of 160+ innings are large enough to produce some arguably robust results.) “Avg BB% – xBB%” (or “diff%”) takes the average of a pitcher’s difference between actual and expected walk rates from 2010 to 2014. It indicates how well (or poorly) he performs compared to his xBB%: the lower the number, the better. This time, I included “t-score”, which measures how reliable diff% is. The key value here is 1.96; anything greater than that (in absolute value) means his diff% is reliable. (1.00 to 1.96 is somewhat reliable; anything less than 1.00 is very unreliable.) Again, this is slightly problematic because there are five observations (years) at most, but it’s the best and simplest usable indicator of reliability.
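The post does not spell out how the t-score is calculated, so the sketch below assumes the usual one-sample t-statistic: the mean of a pitcher's yearly diff% values divided by its standard error. The function names and the three sample seasons are hypothetical.

```python
import math
import statistics


def diff_t_score(diffs):
    """One-sample t-statistic for the mean of yearly (BB% - xBB%) values."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two seasons")
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("no year-to-year variation; t-score is undefined")
    return statistics.mean(diffs) / std_err


def reliability(t):
    """Label a t-score using the thresholds quoted in the post."""
    size = abs(t)  # the sign only says over- or under-performance
    if size > 1.96:
        return "reliable"
    if size >= 1.00:
        return "somewhat reliable"
    return "very unreliable"


# Hypothetical pitcher: three seasons of diff%, in percentage points
seasons = [-1.0, -1.5, -0.5]
t = diff_t_score(seasons)
print(round(t, 2), reliability(t))
```

With only three to five seasons per pitcher, the critical value from a t-distribution is larger than 1.96, so the labels here are best read as rough guides.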

Thus, Mark Buehrle, Mike Leake, Hiroki Kuroda, Doug Fister, Tim Hudson, Zack Greinke, Dan Haren and Bartolo Colon can all reasonably be expected to consistently out-perform their xBB% in any given year. Likewise, Aaron Harang, Colby Lewis, Ervin Santana and Mat Latos can all reasonably be expected to under-perform their xBB%. For everyone else, their diff% values don’t mean a whole lot. For example, R.A. Dickey’s diff% of +0.03% doesn’t mean he’s more likely than someone else to pitch exactly as well as his xBB% predicts him to; in fact, his standard deviation (StdDev) of 0.93% indicates he’s less likely than just about anyone to do so. (What it really means is there is only a two-thirds chance his diff% will be between -0.90% and +0.96%.)
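The Dickey range is just his average diff% plus or minus one standard deviation, and the "two-thirds" figure is what a normal distribution puts inside that band. A quick sketch, assuming his diff% values are roughly normally distributed (the variable names are made up):

```python
from statistics import NormalDist

mean_diff = 0.03  # Dickey's average diff%, from the post
std_dev = 0.93    # his StdDev, from the post

# One standard deviation on either side of the mean
low = mean_diff - std_dev
high = mean_diff + std_dev
print(f"{low:+.2f}% to {high:+.2f}%")

# Share of a normal distribution that falls within one standard deviation
within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)
print(f"{within_one_sd:.1%}")
```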

As with xK%, I compiled a list of fantasy-relevant starters with only two years’ worth of data who saw sizable fluctuations between 2013 and 2014. Their data, at this point, is impossible (nay, ill-advised) to interpret, but it is worth monitoring.

Name: [2013 diff%, 2014 diff%]

Miller is an interesting case: he was atrociously bad about gifting free passes in 2014, but his diff% was only marginally worse than it was in 2013. It’s possible that he was a smart buy-low for the Braves, but it’s also possible that Miller not only perennially under-performs his xBB% but is also trending in the wrong direction.

Here are fantasy-relevant players with a) only 2014 data, and b) outlier diff% values:

I’m not gonna lie, I have no idea why Cobb, Corey Kluber and others show up as only having one year of data when they have two in the xK% dataset. This is something I noticed now. Their exclusion doesn’t fundamentally change the model’s fit whatsoever because it did not rely on split-season data; I’m just curious why it didn’t show up in FanGraphs’ leaderboards. Oh well.

Implications: Richards and Roark perhaps over-performed. Meanwhile, it’s possible that Odorizzi, Ross and Ventura will improve (or regress) compared to last year. I’m excited about all of that. Richards will probably be pretty over-valued on draft day.

# Predicting pitchers’ strikeouts using xK%

Expected strikeout rate, or what I will henceforth refer to as “xK%,” is exactly what it sounds like. I want to see if a pitcher’s strikeout rate actually reflects how he has pitched in terms of how often he’s in the zone, how often he causes batters to swing and miss, and so on. Ideally, it will help explain random fluctuations in a pitcher’s strikeout rate, because even strikeouts have some luck built into them, too.

An xK% metric is not a revolutionary idea. Mike Podhorzer over at FanGraphs created one last year, but he catered it to hitters. Still, it’s nothing too wild and crazy like WAR or SIERA or any other wacky acronym. (A wackronym, if you will.)

Courtesy of Baseball Reference, I constructed a set of pitching data spanning 2010 through 2014. I focused primarily on what I thought would correlate highly with strikeout rates: looking strikes, swinging strikes and foul-ball strikes, all as a percentage of total strikes thrown. I didn’t want the model specification to be too close to a definition, so it’s beneficial that these rates are on a per-strike, rather than per-pitch, basis.

The graph plots actual strikeout rates versus expected strikeout rates with the line of best fit running through it. I ran my regression using the specification above and produced the following equation:

xK% = -.6284293 + 1.195018*lookstr + 1.517088*swingstr + .9505775*foulstr
R-squared = .9026
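Here is the same equation as a small Python sketch, along with the K% minus xK% difference used later in the post. The coefficients come from the regression above; the function names and the sample pitcher are hypothetical, and it assumes each strike rate is a proportion of total strikes thrown (0.28, not 28).

```python
# Coefficients copied from the regression equation in the post
INTERCEPT = -0.6284293
B_LOOK = 1.195018    # looking strikes per strike
B_SWING = 1.517088   # swinging strikes per strike
B_FOUL = 0.9505775   # foul-ball strikes per strike


def xk_pct(lookstr: float, swingstr: float, foulstr: float) -> float:
    """Expected strikeout rate as a proportion."""
    return (INTERCEPT
            + B_LOOK * lookstr
            + B_SWING * swingstr
            + B_FOUL * foulstr)


def k_diff(actual_k: float, lookstr: float, swingstr: float, foulstr: float) -> float:
    """Actual K% minus expected K%; positive means out-performing xK%."""
    return actual_k - xk_pct(lookstr, swingstr, foulstr)


# Hypothetical pitcher, not a real stat line
expected = xk_pct(lookstr=0.28, swingstr=0.15, foulstr=0.27)
print(f"xK% = {expected:.1%}")
print(f"diff = {k_diff(0.21, 0.28, 0.15, 0.27):+.1%}")
```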

The R-squared term can, for ease of understanding, be interpreted as how well the model fits the data, from 0 to 1. An R-squared, then, of .9026 represents approximately a 90-percent fit. In other words, these three variables are able to explain 90 percent of the variation in strikeout rate. (The remaining 10 percent is, for now, a mystery!)
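For the curious, R-squared can be computed directly from actual and predicted values as one minus the ratio of unexplained variation to total variation. This is a generic sketch with invented numbers, not the data behind the .9026 figure; `r_squared` is just an illustrative name.

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented strikeout rates for three pitchers
actual_k = [0.15, 0.20, 0.25]
expected_k = [0.16, 0.20, 0.24]
print(round(r_squared(actual_k, expected_k), 2))
```

A perfect set of predictions gives 1, and predicting the league average for everyone gives 0.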

In order for the reader to use this equation to his or her own benefit, one would insert a pitcher’s looking strike, swinging strike and foul-ball strike percentages into the appropriate variables. Fortunately, I already took the initiative. I applied the results to the same data I used: all individual qualified seasons by starting pitchers from 2010 through 2014.

The results have interesting implications. Firstly, one can see how lucky or unlucky a pitcher was in a particular season. Secondly, and perhaps most importantly, one can easily identify which pitchers habitually over- and under-perform relative to their xK%. Lastly, you can see how each pitcher is trending over time. Every pitcher is different; although the formula will fit most ordinary pitchers, it goes without saying that the aces of your fantasy squad are far from ordinary, and they should be treated on an individual basis.

(Keep in mind that a lot of these players only have one or two years’ worth of data (as indicated by “# Years”), so the average difference between their xK% and K% as a representation of a pitcher’s true skill will be largely unreliable.)

It is immediately evident: the game’s best pitchers outperform their xK% by the largest margins. Cliff Lee, Stephen Strasburg, Clayton Kershaw, Felix Hernandez and Adam Wainwright are all top-10 (or at least top-15) fantasy starters. But let’s look at their numbers over the years, along with a few others at the top of the list.

Kershaw and King Felix have not only been consistent but also look like they’re getting better with age. Wainwright’s difference between 2013 and 2014 is a bit of a concern; he’s getting older, and this could be a concrete indicator that perhaps the decline has officially begun. Darvish’s line is interesting, too: you may or may not remember that he had a massive spike in strikeouts in 2013 compared to his already-elite strikeout rate the prior year. As you can see, it was totally legit, at least according to xK%. But for some reason, even xK% can fluctuate wildly from year to year. I see it in the data, anecdotally: Anibal Sanchez’s huge 6.7-percent spike in xK% from 2012 to 2013 was followed by a 5.5-percent drop from 2013 to 2014. Conversely, David Price’s 5-percent decrease in xK% from 2012 to 2013 was followed by an almost perfectly equal 5-percent increase from 2013 to 2014. So the phenomenon seems to work both ways. Thus, perhaps it shouldn’t have come as a surprise when Darvish couldn’t repeat his 2013 success. To the baseball world’s collective dismay, we simply didn’t have enough data yet to determine which Yu was the true Yu. I plan to do some research to see how often these severe spikes in xK% are mere aberrations versus how often they are sustained over time, indicating a legitimate skills improvement.

I have also done my best to compile a list of players with only one or two years’ worth of data who saw sizable spikes and drops in their K% minus xK% (“diff%”). The idea is to find players for whom we can’t really tell how much better (or worse) their actual K% is compared to their xK% because of conflicting data points. For example, will Corey Kluber be a guy who massively outperforms his xK% as he did in 2014, or does he only slightly outperform as he did in 2013? I present the list not to provide an answer but to posit: Which version of each of these players is more truthful? I guess we will know sometime in October.

Name: [2013 diff%, 2014 diff%]

And here are some fantasy-relevant guys with only data from 2014:

# Bold prediction #3: Corey Kluber is this year’s Hisashi Iwakuma

Bold Prediction #2: Brad Miller will be a top-5 shortstop
Bold Prediction #1: Tyson Ross will be a top-45 starter (until he reaches his innings cap)

The Corey Kluber Society, fronted by Carson Cistulli of FanGraphs, is, frankly, hilarious. The format of the post is great, and if you haven’t read it before, you should.

But there’s a more important reason to read about (and “join”) the Society. Kluber is not only a legitimate fantasy starting pitcher but also a very good one. His breakout last year was muted by a couple of bad starts, but he is a perfect comp to a 2012 Hisashi Iwakuma on the verge.

I will list a variety of statistics in which Kluber excelled. Then I will let you know whom he outperformed in each category for all pitchers with at least 140 innings pitched (107 total).

K/9: 8.31 (26th overall)
Better than: Cole Hamels, Julio Teheran, Adam Wainwright, Mat Latos, Mike Minor

K/BB: 4.12 (11th overall)
Better than: Hamels, Jordan Zimmermann, Teheran, Anibal Sanchez, Homer Bailey

BAbip: .329 (6th worst)

Swinging strike rate: 10.4% (22nd overall)
Better than: Zack Greinke, Latos, Iwakuma, Scott Kazmir, Jose Fernandez

Contact rate: 76.8% (16th overall)
Better than: Kris Medlen, Jeff Samardzija, Bailey, Greinke, Fernandez

xFIP-: 78 (11th overall)
Better than: Max Scherzer, Fernandez, David Price, Iwakuma, Stephen Strasburg

Yowza. Those are some seriously stellar numbers. What’s the deal? Unfortunately for Kluber, he suffered a brutal outing or two, causing his WHIP and ERA to be inflated for most of the year and allowing him to fly under the radar. Chalk it up to bad luck, considering Kluber’s 6th-worst BAbip, better than only Joe Saunders, Dallas Keuchel and other names one wishes not to be associated with.

This sounds vaguely familiar. A high-control guy with a solid strikeout rate out of the bullpen? Does the name Hisashi Iwakuma ring a bell? It should, because he has already been mentioned several times in the last 300 words. Anyway, I rode the Iwakuma (and Bailey) wave through the end of 2012. Instead of going with my gut and drafting Iwakuma in the last round of my shallow draft in 2013, I opted for Marco Estrada: not a terrible pick, but clearly not the right gamble to take. It’s actually the moment upon which I reflected and realized that I should really just take my own advice. Because given Dan Haren’s peripherals, why would anyone have trusted him over Bailey last year? Ridiculous. (FYI, I will rip on Haren in a forthcoming bold prediction, just to be clear that I’m not ripping on him because he gave up a million home runs last year.)

But I digress. Iwakuma was good in 2012, but his 7.25 K/9, 2.35 K/BB and 1.28 WHIP were all rather pedestrian. But sometimes you need to rely on your eyes more than the numbers, and anyone who watched Iwakuma saw flashes of brilliance. 2013 may have been more than we anticipated, which brings me to my point:

Kluber already has the makings of a great pitcher, and his peripherals indicate that none of it was a fluke. My official prediction: Corey Kluber will be a top-40 starting pitcher.

# Pitchers due for strikeout regression using PITCHf/x data

If FanGraphs were a home, or a hotel, or even a tent, I’d live there. I would swim in its oceans of data, lounge in its pools of metrics.

It houses a slew of PITCHf/x data: the numbers collected by the systems installed in all MLB ballparks that measure the frequency, velocity and movement of every pitch by every pitcher. It’s pretty astounding, but it’s also difficult for the untrained eye to make something of the numbers aside from tracking the declining velocities of CC Sabathia’s and Yovani Gallardo’s fastballs.

I used linear regression to see how a pitcher’s contact, swinging strike and other measurable rates affect his strikeout percentage, and how that translates to strikeouts per nine innings (K/9). Ultimately, the model spits out a formula to generate an expected K/9 for a pitcher. I pulled data from FanGraphs comprising all qualified pitchers from the last four years (2010 through 2013).

The idea is this: A pitcher who can miss more bats will strike out more batters. FanGraphs’ “Contact %” statistic illustrates this, where a lower contact rate is better. Similarly, a pitcher who can generate more swinging strikes (“SwStr %”) is more likely to strike out batters.
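The fitted formula itself is not given in the post, so the sketch below only illustrates the general technique: ordinary least squares of K/9 on Contact % and SwStr %, solved through the normal equations in plain Python. Everything in it is an assumption for illustration, including the function name `fit_ols`, the six made-up pitcher rows, and the coefficients (20, -20, 30) used to generate the fake K/9 values.

```python
def fit_ols(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, k + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up (Contact %, SwStr %) pairs as proportions, not real pitchers
rows = [(0.80, 0.08), (0.78, 0.10), (0.75, 0.10),
        (0.82, 0.07), (0.76, 0.12), (0.79, 0.09)]
# Fake K/9 values generated from invented coefficients, with no noise
targets = [20.0 - 20.0 * contact + 30.0 * swstr for contact, swstr in rows]

intercept, b_contact, b_swstr = fit_ols(rows, targets)
print(round(intercept, 3), round(b_contact, 3), round(b_swstr, 3))
```

In real data, contact rate and swinging-strike rate move together closely, so the individual coefficients can be unstable even when the overall fit is good.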

Using this theory coupled with the aforementioned data, I “corrected” the K/9 rates of all 2013 pitchers who notched at least 100 innings. Instead of detailing the full results, here are the largest differentials between expected and actual K/9 rates. (I will list only pitchers I deem fantasy relevant.)

Largest positive differential: Name — expected K/9 – actual K/9 = +/- change

1. Martin Perez — 7.77 – 6.08 = +1.69
2. Jarrod Parker — 7.74 – 6.12 = +1.62
3. Dan Straily — 8.63 – 7.33 = +1.30
4. Jered Weaver — 8.09 – 6.82 = +1.27
5. Hiroki Kuroda — 7.93 – 6.71 = +1.22
6. Kris Medlen — 8.38 – 7.17 = +1.21
7. Francisco Liriano — 10.31 – 9.11 = +1.20
8. Ervin Santana — 8.06 – 6.87 = +1.19
9. Ricky Nolasco — 8.47 – 7.45 = +1.02
10. Tim Hudson — 7.42 – 6.51 = +0.91

Largest negative differential:

1. Tony Cingrani — 8.15 – 10.32 = -2.17
2. Ubaldo Jimenez — 7.68 – 9.56 = -1.88
3. Cliff Lee — 7.11 – 8.97 = -1.86
4. Jose Fernandez — 8.15 – 9.75 = -1.60
5. Shelby Miller — 7.20 – 8.78 = -1.58
6. Scott Kazmir — 7.71 – 9.23 = -1.52
7. Yu Darvish — 10.41 – 11.89 = -1.48
8. Lance Lynn — 7.58 – 8.84 = -1.26
9. Justin Masterson — 7.84 – 9.09 = -1.25
10. Chris Tillman — 6.60 – 7.81 = -1.21
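The differentials in both lists are simply expected K/9 minus actual K/9. A short sketch that reproduces a handful of them, using numbers copied from the lists above (the variable names are made up):

```python
# (expected K/9, actual K/9) for a few pitchers from the two lists
k9 = {
    "Martin Perez": (7.77, 6.08),
    "Jered Weaver": (8.09, 6.82),
    "Tim Hudson": (7.42, 6.51),
    "Cliff Lee": (7.11, 8.97),
    "Yu Darvish": (10.41, 11.89),
    "Tony Cingrani": (8.15, 10.32),
}

# Positive: fewer strikeouts than expected; negative: more than expected
differentials = {
    name: round(expected - actual, 2)
    for name, (expected, actual) in k9.items()
}

# Sort from largest positive to largest negative differential
ranked = sorted(differentials.items(), key=lambda item: item[1], reverse=True)
for name, diff in ranked:
    print(f"{name}: {diff:+.2f}")
```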

There’s a lot to digest here, so I’ll break it down. It appears Perez was the unluckiest pitcher last year, of the ones who qualified for the study, notching almost 1.7 fewer strikeouts per nine innings than he would be expected to, given the rate of whiffs he induced. Conversely, rookie sensation Cingrani notched almost 2.2 more strikeouts per nine innings than expected.

There is a caveat. I was not able to account for facets of pitching such as a pitcher’s ability to hide the ball well, or his tendency to draw strikes-looking. With that said, a majority of the so-called lucky ones are pitchers who, in 2013, experienced a breakout (Cingrani, Fernandez, Miller, Darvish, Masterson, Tillman) or a renaissance (Jimenez, Kazmir, Masterson — woah, all Cleveland pitchers). Is it possible these pitchers can all repeat their performances — especially the ones who have disappointed us for years? Perhaps not.

(Update, Jan. 24: Cliff Lee’s mark of -1.86 is, amazingly, not unusual for him. Over the last four years, the average difference between his expected and actual K/9 rates is … drum roll … -1.88. Insane!)

Darvish and Liriano were in a league of their own in terms of inducing swings and misses, notching almost 30 percent each. (Anibal Sanchez was third-best with 27 percent. The average is about 21 percent.) However, Darvish recorded 2.78 more K/9 than Liriano. Is there any rhyme or reason to that? Darvish is, without much argument, the better pitcher — but is he that much better? I don’t think so. Darvish was expected to notch 10.41 K/9 given his contact rate. Any idea what his 2012 K/9 rate was? Incredibly: 10.40 K/9.

More big names produced equally interesting results. King Felix Hernandez recorded a career-best 9.51 K/9, but he was expected to produce something closer to 8.57 K/9. His rate the previous three years? 8.52 K/9.

Dan Haren didn’t produce much in the way of ERA in 2013, but he did see a much-needed spike in his strikeout rate, jumping above 8 K/9 for the first time since 2010. His expected 7.07 K/9 says otherwise, though, and it fits perfectly with how his K/9 rate was trending: 7.25 K/9 in 2011, 7.23 K/9 in 2012.

I think my models tend to exaggerate the more extreme results (most of which are noted in the lists above) because they could not account for intangibles in a player’s natural talent. However, they could prove to be excellent indicators of who’s due for regression.

Only time will tell. Maybe Jose Fernandez isn’t the elite pitcher we already think he is — not yet, at least.

————

Notes: The data almost replicates a normal distribution, with 98 of the 145 observations (67.6 percent) falling within one standard deviation (1.09 K/9) of the mean value (7.19 K/9), and 140 of 145 (96.6 percent) falling within two standard deviations. The median value is 7.27 K/9, indicating the distribution is very slightly skewed left.
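As a rough check on the "almost normal" claim, the observed shares can be set beside what a true normal distribution would put within one and two standard deviations. The counts come from the note above; the variable names are made up for illustration.

```python
from statistics import NormalDist

n = 145           # observations, from the note
within_one = 98   # within one standard deviation of the mean
within_two = 140  # within two standard deviations of the mean

observed_one = within_one / n
observed_two = within_two / n

# Shares a normal distribution would place in the same bands
std_normal = NormalDist()
normal_one = std_normal.cdf(1) - std_normal.cdf(-1)
normal_two = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"within 1 SD: observed {observed_one:.1%}, normal {normal_one:.1%}")
print(f"within 2 SD: observed {observed_two:.1%}, normal {normal_two:.1%}")
```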

# The role of luck in fantasy baseball

I apologize for being that guy who ruins that ooey gooey feeling you get when you think about the fantasy league you won last year. As much as you want to think you are a fantasy master, perhaps even a fantasy god, you should acknowledge that you probably benefited from a good deal of luck. Sure, for your sake, I will admit you made a great pick with Max Scherzer in the fifth round. But did you, in all your mastery, predict he would win 21 games?

Don’t say yes. You didn’t. And frankly, you would be crazy to say he’ll do it again.

I focus primarily on pitching in this blog, and let it be known that pitchers are not exempt from luck in the realm of fantasy baseball. If you’re playing in a standard rotisserie league, you probably have a wins category. In a points league, you likely award points for wins.

Wins. Arguably the most arbitrary statistic in baseball. Let’s not have that discussion, though, and instead simply accept the win as it is. The win has the most drastic uncontrollable effect on a fantasy pitcher’s value. (ERA and WHIP experience similar statistical fluctuations, but at least they aren’t arbitrary.)

I had an idea, but before I proceed, let me interject: if you’re drafting for wins, you’re doing it wrong. But, as I said, you can’t ignore wins.

But let’s say you did, and drafted strictly on talent, or “stuff” (which, here, factors in a pitcher’s durability). How would the top 30 pitchers change? Here’s my “stuff” list, which you can compare with the base projections:

Here are the five players with the biggest positive change and a breakdown of each:

1. Brandon Beachy, up 23 spots
His injury history has weakened his wins column projection. Consequently, the number of innings Beachy is expected to throw is significantly less than a full season. But if he managed to stay healthy for the full year (say, 200 innings)? He’s a top-10 pick based on pure stuff. If you draft with the philosophy that you can always find a viable replacement on waivers, Beachy could be your big sleeper.
2. Marco Estrada, up 22 spots
Estrada’s diminished expected wins is more a function of his terrible team than ability. Estrada has underperformed the past two years, Ricky Nolasco style, but if he can pull it together, he’s a top-30 pitcher based on “stuff.” And hey, maybe he can luck into some extra wins. However, if he can’t pull it together — Ricky Nolasco style — he’ll be relegated to fringe starter.
3. Danny Salazar, up 9 spots
Salazar has immense potential. His injury history led the Indians to cap his per-game pitch count last year, and that has been factored into his projection. But if he’s a full-time, 200-inning starter? He’s a top-25 starter with top-15 upside. Again, this is in terms of “stuff”. But is Ivan Nova better than Felix Hernandez because he can magically win more games? Of course not. Among a slew of young studs, including Jose Fernandez, Shelby Miller, Michael Wacha and so on, Salazar is a diamond in the rough.
4. A.J. Burnett, up 8 spots
His projection is already plenty good. But you saw how many games he won in 2013. Anything can happen.
5. Corey Kluber, up 8 spots
Most people were probably scratching their heads when they saw Kluber’s name listed above. Frankly, I’m in love with him, and it’s because he’s a stud with a great K/BB ratio. I understand why someone may be inclined to dismiss it as an aberration, but his swinging strike and contact rates are truly excellent. Even if they regress, he should be a draft-day target.

Here are the three starting pitchers with the biggest negative change.

1. Anibal Sanchez, down 10 spots
He’s great, but he also plays for a great team. Call it Max Scherzer syndrome. He carries as big a risk as any other player to pitch great but only win five or six games, as do the next two players.
2. Hisashi Iwakuma, down 6 spots
3. Zack Greinke, down 4 spots

Let me be clear that although I created a hypothetical scenario where wins didn’t exist, I don’t advocate for blindly drafting based on “stuff.” It’s important to acknowledge that certain players have a much better chance to win than others. Chris Sale of the Chicago White Sox could win 17 games just as easily as he could win seven. It’s about playing the odds, and unless a pitcher truly pitches terribly, don’t blame the so-called experts for your bad luck. They probably put their money where their mouths are, too, and are suffering along with you.

Here is a more comprehensive list of pitchers ranked by “stuff,” if that’s the way you sculpt your strategy:

1. Clayton Kershaw
3. Felix Hernandez
4. Max Scherzer
5. Cliff Lee
6. Yu Darvish
7. Chris Sale
8. Cole Hamels
9. Jose Fernandez
11. Stephen Strasburg
12. David Price
13. Justin Verlander
14. Alex Cobb
15. Homer Bailey
16. Mat Latos
17. Gerrit Cole
18. Michael Wacha
19. Anibal Sanchez
20. James Shields
21. Danny Salazar
23. A.J. Burnett
24. Corey Kluber
25. Brandon Beachy
26. Zack Greinke
27. Matt Cain
28. Sonny Gray
29. Hisashi Iwakuma
30. Gio Gonzalez
31. Doug Fister
32. Jordan Zimmermann
33. Alex Wood
34. Kris Medlen
35. Jeff Samardzija
36. Mike Minor
37. Jake Peavy
38. Kevin Gausman
39. Tyson Ross
40. Patrick Corbin
41. Lance Lynn
42. Francisco Liriano
43. Andrew Cashner
44. Ricky Nolasco
45. CC Sabathia
46. Hiroki Kuroda
47. Tim Lincecum
48. Tim Hudson
49. Jered Weaver
50. Shelby Miller
51. Clay Buchholz
52. Tony Cingrani
53. Matt Garza
54. John Lackey
55. Ubaldo Jimenez
56. Justin Masterson
57. Julio Teheran
58. R.A. Dickey
59. A.J. Griffin
60. Hyun-Jin Ryu
61. Dan Haren
62. Johnny Cueto
63. C.J. Wilson
64. Ian Kennedy
65. Chris Archer
66. Kyle Lohse
67. Scott Kazmir
68. Carlos Martinez
69. Jon Lester
70. Ervin Santana
71. Jose Quintana
72. Derek Holland
73. Garrett Richards
74. Dan Straily
75. Tyler Skaggs

# Early SP rankings for 2014

I wouldn’t say pitching is deep, but I’m surprised by the pitchers who didn’t make my top 60.

Note: I have deemed players highlighted in pink undervalued and worthy of re-rank. Do not be alarmed just yet by what you may perceive to be a low ranking.