
Predicting pitchers’ walks using xBB%

The other day, I discussed predicting pitchers’ strikeout rates using xK%. Today I will conduct the same exercise for walks: I want to see how well a pitcher’s walk rate (BB%) actually correlates with what his walk rate should be (expected BB%, henceforth “xBB%”). As with xK%, I relied on intuition to identify reliable indicators of a pitcher’s true walk rate using readily available data.

An xBB% metric, like xK%, would show not only whether a pitcher perennially over-performs (or under-performs) his expected walk rate but also whether he happened to do so in a given year. This article will conclude by looking at how the difference between actual and expected walk rates (BB% – xBB%) varied between 2014 and career numbers, lending some insight into the (un)luckiness of each pitcher.

Courtesy of FanGraphs, I constructed another set of pitching data spanning 2010 through 2014. This time, I focused primarily on what I thought would correlate with walk rate: inability to pitch in the zone and inability to induce swings on pitches out of the zone. I also threw in first-pitch strike rate, on the prediction that counts that start with a ball are more likely to end in a walk than those that start with a strike. FanGraphs’ data measures ability rather than inability: “Zone%” measures how often a pitcher hits the zone, “O-Swing%” measures how often batters swing at pitches out of the zone, and “F-Strike%” measures the rate of first-pitch strikes. Each variable should therefore have a negative coefficient attached to it.

I specified a handful of variations before deciding on a final version. Instead of using split-season data (that is, each pitcher’s individual seasons from 2010 to 2014) for qualified pitchers, I used aggregated statistics because the results fit the data better by a sizable margin. This surprised me because there were about half as many observations, but it also makes sense because each observation is, itself, a larger sample than before.

At one point, I tried creating my own variable: looks (non-swings) at pitches out of the zone. I built it by taking the percentage of pitches out of the zone (1 – Zone%) and multiplying it by how often batters refused to swing at them (1 – O-Swing%). This version of the model produced a nice fit, but it was slightly worse than leaving the variables separated. Also, I ran separate-but-equal regressions for PITCHf/x data and FanGraphs’ own data. The PITCHf/x data appeared to be slightly more accurate, so I proceeded using them.
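For anyone who wants to rebuild that combined variable, here is a minimal sketch of the arithmetic described above. The function name and the sample inputs are made up for illustration; only the formula, (1 – Zone%) times (1 – O-Swing%), comes from the text, and both rates are assumed to be expressed as fractions between 0 and 1.

```python
def looks_out_of_zone(zone_pct: float, o_swing_pct: float) -> float:
    """Share of all pitches that miss the zone AND are taken by the batter.

    Both inputs are fractions (0.45 means 45 percent), not whole percentages.
    """
    for name, value in (("zone_pct", zone_pct), ("o_swing_pct", o_swing_pct)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    out_of_zone = 1.0 - zone_pct     # how often the pitcher misses the zone
    taken = 1.0 - o_swing_pct        # how often batters lay off those pitches
    return out_of_zone * taken


# Hypothetical pitcher: 50 percent Zone%, 30 percent O-Swing%.
example = looks_out_of_zone(0.50, 0.30)
print(f"Looks out of the zone: {example:.1%}")
```

Folding two rates into one product forces them to share a single coefficient, which may be part of why the separated version fit slightly better.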

The graph plots actual walk rates versus expected walk rates. The regression yielded the following equation:

xBB% = .3766176 – .2103522*O-Swing%(pfx) – .1105723*Zone%(pfx) – .3062822*F-Strike%
R-squared = .6433
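
To make the equation easier to use, here is a small sketch that plugs values into it. The coefficients are copied from the regression above; the function names and the sample pitcher are hypothetical, and all rates are assumed to be fractions (0.30 for 30 percent) rather than whole percentages.

```python
# Coefficients from the regression equation above.
INTERCEPT = 0.3766176
COEF_O_SWING = -0.2103522   # O-Swing% (PITCHf/x)
COEF_ZONE = -0.1105723      # Zone% (PITCHf/x)
COEF_F_STRIKE = -0.3062822  # F-Strike%


def expected_walk_rate(o_swing: float, zone: float, f_strike: float) -> float:
    """Return xBB% as a fraction, given plate-discipline rates as fractions."""
    return (INTERCEPT
            + COEF_O_SWING * o_swing
            + COEF_ZONE * zone
            + COEF_F_STRIKE * f_strike)


def walk_rate_diff(actual_bb: float, o_swing: float, zone: float,
                   f_strike: float) -> float:
    """Return BB% minus xBB%; negative means fewer walks than expected."""
    return actual_bb - expected_walk_rate(o_swing, zone, f_strike)


# Hypothetical pitcher, not taken from the dataset.
xbb = expected_walk_rate(o_swing=0.30, zone=0.50, f_strike=0.60)
diff = walk_rate_diff(actual_bb=0.08, o_swing=0.30, zone=0.50, f_strike=0.60)
print(f"xBB% = {xbb:.2%}, diff% = {diff:+.2%}")
```

Because every coefficient is negative, raising any of the three inputs lowers the expected walk rate, which matches the intuition behind the variables.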

Again, R-squared indicates how well the model fits the data. An R-squared of .64 is not as exciting as the R-squared I got for xK%; it means the model explains about 64 percent of the variation in walk rate, and the other 36 percent is driven by things I haven’t included in the model. Certainly, more variables could help explain xBB%. I am already considering combining FanGraphs’ PITCHf/x data with some of Baseball Reference’s data, which does a great job of keeping track of the number of 3-0 counts, four-pitch walks and so on.

And again, to put the equation above to use, plug in the appropriate values for a player in a given season or time frame to determine his xBB%. Then compare the xBB% to his single-season or career BB% to derive some kind of meaningful result. And (one more) again, I have already taken the liberty of doing this for you.

Instead of including every pitcher from the sample, I narrowed it down to only pitchers with at least three years’ worth of data in order to yield some kind of statistically significant results. (Note: a three-year sample is a small sample, but three individual samples of 160+ innings are large enough to produce some arguably robust results.) “Avg BB% – xBB%” (or “diff%”) takes the average of a pitcher’s difference between actual and expected walk rates from 2010 to 2014. It indicates how well (or poorly) he performs compared to his xBB%: the lower the number, the better. This time, I included “t-score”, which measures how reliable diff% is. The key value here is 1.96; anything greater than that means his diff% is reliable. (1.00 to 1.96 is somewhat reliable; anything less than 1.00 is very unreliable.) Again, this is slightly problematic because there are five observations (years) at most, but it’s the best and simplest usable indicator of reliability.
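
The post doesn’t spell out how the t-score is calculated, so treat the following as a sketch under one assumption: that it is an ordinary one-sample t-statistic testing whether a pitcher’s average yearly diff% differs from zero. The function names and the sample values are hypothetical, and taking the absolute value before applying the 1.96 and 1.00 cutoffs is also my assumption, so that pitchers with reliably negative diff% values qualify too.

```python
import math
import statistics


def diff_t_score(yearly_diffs: list[float]) -> float:
    """One-sample t-statistic of yearly (BB% - xBB%) values against zero."""
    n = len(yearly_diffs)
    if n < 2:
        raise ValueError("need at least two seasons to estimate a spread")
    mean = statistics.mean(yearly_diffs)
    std_dev = statistics.stdev(yearly_diffs)  # sample standard deviation
    if std_dev == 0:
        raise ValueError("identical values every year; t-score is undefined")
    return mean / (std_dev / math.sqrt(n))


def reliability_label(t_score: float) -> str:
    """Apply the cutoffs from the text to the size of the t-score."""
    size = abs(t_score)
    if size > 1.96:
        return "reliable"
    if size >= 1.00:
        return "somewhat reliable"
    return "very unreliable"


# Hypothetical pitcher who beat his xBB% in each of three seasons.
t = diff_t_score([-0.010, -0.012, -0.008])
print(f"t-score = {t:.2f} ({reliability_label(t)})")
```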

Thus, Mark Buehrle, Mike Leake, Hiroki Kuroda, Doug Fister, Tim Hudson, Zack Greinke, Dan Haren and Bartolo Colon can all reasonably be expected to consistently out-perform their xBB% in any given year. Likewise, Aaron Harang, Colby Lewis, Ervin Santana and Mat Latos can all reasonably be expected to under-perform their xBB%. For everyone else, their diff% values don’t mean a whole lot. For example, R.A. Dickey’s diff% of +0.03% doesn’t mean he’s more likely than someone else to pitch exactly as well as his xBB% predicts; in fact, his standard deviation (StdDev) of 0.93% indicates he’s less likely than just about anyone to do so. (What it really means is there is only a two-thirds chance his diff% will be between -0.90% and +0.96%.)
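
The Dickey range above is just his average diff% plus or minus one standard deviation. Here is that arithmetic as a quick sketch; the function name is made up, the two inputs are the figures quoted in the text (in percentage points), and the "two-thirds" reading assumes yearly diff% values are roughly normally distributed.

```python
def one_sigma_range(mean_diff: float, std_dev: float) -> tuple[float, float]:
    """Return (low, high) bounds one standard deviation around the mean."""
    if std_dev < 0:
        raise ValueError("standard deviation cannot be negative")
    return mean_diff - std_dev, mean_diff + std_dev


# Dickey's figures from the text: diff% of +0.03, StdDev of 0.93.
low, high = one_sigma_range(0.03, 0.93)
print(f"About two-thirds of seasons should land between {low:+.2f}% and {high:+.2f}%")
```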

As with xK%, I compiled a list of fantasy-relevant starters with only two years’ worth of data who saw sizable fluctuations between 2013 and 2014. Their data is impossible (nay, ill-advised) to interpret at this point, but it is worth monitoring.

Name: [2013 diff%, 2014 diff%]

Miller is an interesting case: he was atrociously bad about gifting free passes in 2014, but his diff% was only marginally worse than it was in 2013. It’s possible that he was a smart buy-low for the Braves, but it’s also possible that Miller not only perennially under-performs his xBB% but is also trending in the wrong direction.

Here are fantasy-relevant players with a) only 2014 data, and b) outlier diff% values:

I’m not gonna lie, I have no idea why Cobb, Corey Kluber and others show up as only having one year of data when they have two in the xK% dataset. I only just noticed this. Their exclusion doesn’t fundamentally change the model’s fit because it did not rely on split-season data; I’m just curious why their other seasons didn’t show up in FanGraphs’ leaderboards. Oh well.

Implications: Richards and Roark perhaps over-performed. Meanwhile, it’s possible that Odorizzi, Ross and Ventura will improve (or regress) compared to last year. I’m excited about all of that. Richards will probably be pretty over-valued on draft day.

Ten bargain starters outside my top 60

The idea is simple: In a standard 10-team mixed league, an owner is allotted six spots to fill with starting pitchers. That relegates everyone else drafted No. 61 and higher to fantasy benches or free agency.

That doesn’t mean pitchers drafted outside the top 60 are worse than pitchers in the top 60. You can find good pitchers right up until the 60th pick (heck, the 60th is the Brewers’ Marco Estrada, who has excellent control and solid strikeout numbers), but as many as a third of those 60 are risky or overvalued. Value bleeds into the late rounds, and it’s worth figuring out who’s worth reaching for, despite pitchers with better ADPs (average draft positions) still on the board, and who’s worth waiting for.

I’ll discuss a handful of pitchers I like outside my top 60, in order of ESPN ADP.

John Lackey | ADP: 63rd
Lackey had a renaissance 2013, coming back from a lost 2012 and miserable 2011. The strikeout and walk rates were second-best and best of his career, respectively, and there’s little reason to think he’ll crumble overnight. He’s less risky than Dan Haren (whom I’ve vocally distrusted), who is being drafted 49th among starting pitchers, or Dan Straily, going 56th, who is honestly mediocre. He’s good enough to fill the back of your rotation, let alone a bench spot.

Alex Wood | ADP: 66th
Wood is a control artist, and the Braves simply know how to develop pitchers. Scouts and experts are excited about him; I don’t know why he’s not getting more draft love. He’s guaranteed a rotation spot, due to the rash of injuries to Atlanta starters, and should be more than serviceable.

Corey Kluber | ADP: 79th
I love Kluber.

Josh Beckett | ADP: 91st
Sources say he’s recovering well from his surgery. If he makes the Dodgers’ rotation and remotely resembles the Beckett of old, he’s a value.

Tyson Ross | ADP: 103rd
He absolutely dealt for the Padres last year. A reader mentioned he could be on an innings limit, but I would still ride him until he’s shuffled out of the rotation, and then simply find a replacement for him.

James Paxton | ADP: 105th
If the Royals’ Yordano Ventura is going 62nd on average, there’s no reason Paxton should be going outside the top 100 pitchers. Paxton doesn’t gas a 100-mph heater like Ventura does, but his strikeout and walk rates are very similar to Ventura’s.

Tyler Skaggs | ADP: 110th
Skaggs was a three-time top-100 prospect for Baseball America, peaking at No. 12 in 2013 (and No. 17 for Baseball Prospectus). It would be a mistake to write him off so soon after one bad season, especially with minor-league numbers better than those of Ventura or Paxton. His 2013 and current spring training numbers are an eyesore, though, so the repulsion is understandable. But, as I always say, he’s a name worth remembering.

Other notables: Drew Hutchison (114th), Erik Johnson (133rd), Jake Odorizzi (151st)

Early SP rankings for 2014

I wouldn’t say pitching is deep, but I’m surprised by the pitchers who didn’t make my top 60.

Note: I have deemed players highlighted in pink undervalued and worthy of re-rank. Do not be alarmed just yet by what you may perceive to be a low ranking.

2014 STARTING PITCHERS

  1. Clayton Kershaw
  2. Adam Wainwright
  3. Max Scherzer
  4. Yu Darvish
  5. Felix Hernandez
  6. Cliff Lee
  7. Stephen Strasburg
  8. Jose Fernandez
  9. Cole Hamels
  10. Justin Verlander
  11. Anibal Sanchez
  12. Chris Sale
  13. Mat Latos
  14. Madison Bumgarner
  15. Alex Cobb
  16. Homer Bailey
  17. Gerrit Cole
  18. Zack Greinke
  19. David Price
  20. James Shields
  21. Jordan Zimmermann
  22. Michael Wacha
  23. Danny Salazar
  24. Jered Weaver
  25. A.J. Burnett *contingent on whether he retires
  26. Kris Medlen
  27. Mike Minor
  28. Jake Peavy
  29. Corey Kluber
  30. Lance Lynn
  31. Matt Cain
  32. Hisashi Iwakuma
  33. CC Sabathia
  34. Gio Gonzalez
  35. Doug Fister
  36. Patrick Corbin
  37. Francisco Liriano
  38. Sonny Gray
  39. Ricky Nolasco
  40. Hiroki Kuroda
  41. Tim Hudson
  42. Marco Estrada
  43. Shelby Miller
  44. Trevor Rosenthal
  45. Tony Cingrani
  46. A.J. Griffin
  47. Brandon Beachy
  48. Tim Lincecum
  49. Clay Buchholz
  50. Ubaldo Jimenez
  51. Alex Wood
  52. Julio Teheran
  53. Tyson Ross
  54. Hyun-jin Ryu
  55. Matt Garza
  56. Andrew Cashner
  57. Johnny Cueto
  58. C.J. Wilson
  59. John Lackey
  60. Justin Masterson
  61. R.A. Dickey
  62. Kevin Gausman
  63. Jon Lester
  64. Dan Haren
  65. Ervin Santana
  66. Derek Holland
  67. Chris Archer
  68. Jeff Samardzija
  69. Bartolo Colon
  70. Ivan Nova
  71. Matt Moore
  72. Ian Kennedy
  73. Dan Straily
  74. Rick Porcello
  75. Jarrod Parker
  76. Carlos Martinez
  77. Jeremy Hellickson
  78. Kyle Lohse
  79. Scott Kazmir
  80. Jason Vargas
  81. Tommy Milone
  82. Wade Miley
  83. Dillon Gee
  84. Brandon Workman
  85. Chris Tillman
  86. Zack Wheeler
  87. Yovani Gallardo
  88. Miguel Gonzalez
  89. Jose Quintana
  90. Garrett Richards
  91. Robbie Erlin
  92. Felix Doubront
  93. Jhoulys Chacin
  94. Jonathon Niese
  95. Chris Capuano
  96. Nick Tepesch
  97. Alexi Ogando
  98. Bronson Arroyo
  99. Travis Wood
  100. Trevor Cahill
  101. Tyler Skaggs
  102. Randall Delgado
  103. Martin Perez
  104. Mike Leake
  105. Carlos Villanueva
  106. Todd Redmond
  107. Brandon Maurer
  108. Tyler Lyons
  109. Ryan Vogelsong
  110. Zach McAllister
  111. Wily Peralta
  112. Brett Oberholtzer
  113. Erik Johnson
  114. Jorge De La Rosa
  115. Paul Maholm
  116. Hector Santiago
  117. Burch Smith
  118. Jeff Locke
  119. Joe Kelly
  120. Jason Hammel
  121. Jake Odorizzi
  122. Danny Hultzen
  123. Anthony Ranaudo
  124. Archie Bradley
  125. Rafael Montero
  126. James Paxton
  127. Taijuan Walker
  128. Yordano Ventura