http://detectovision.com/?p=803

I am hoping to make my recent research product, DNRA, a part of the dialogue among students of the game of baseball. To that end, I've been asked to present, in a quasi-readable format, the exact methodology I used to produce the results recently disclosed in a couple of different places online, results which have shown promise. The few who have thus far reviewed some of my findings seem interested, at the very least, in potential refinements to the system and are overall pleased with the quality of the data I've shown to date. I am certain there is more work to be done, and I'll discuss some of my plans for future research and improvements at the end of this document, but for the moment, I give you the nuts and bolts of DNRA.

Overview of Underlying Philosophy

In recent days I've been told by many a researcher that predictive analysis (attempting to divine from available data what a player is likely to do in the future) and the evaluation of past value (reading the past data to explain what has already happened) are two separate tasks requiring two different approaches and two different sets of statistics. I have long believed this is not the case. For years I've worked hard to accurately describe the vast history of numerical baseball data in terms of how valuable players are and have been (think along the lines of Win Shares and WARP) for two reasons. I am fascinated by the history of the game, obviously, but I also strongly believe that an accurate depiction of all that has come before will serve us well in answering the question of what will happen in the next year or three. Predictive sabermetrics that do not make use of the mounds of data in the history of the game generally operate by applying some assumptions and the laws of probability, and this, to me, has always seemed like a somewhat incomplete approach.
Meanwhile, the study of baseball history has recently been dismissed as a means to the desired end, an accurate predictive model, because current sabermetric tools don't adequately capture all of the information needed to fully understand how and why events occur. This is especially true of pitching and fielding analysis, where the majority of the available statistics that would quantify a pitcher's or fielder's value are far behind the advanced approaches used to diagnose hitting, and far less reliable and accurate. I have therefore devoted most of my energy to breaking down defense (including pitching) and getting an accurate picture where one is most needed.

Defense is a Context

When Voros McCracken stumbled on the idea that pitchers have little control over the batting average on balls in play, he launched a fierce debate that continues to this day regarding how best to use this observation. Since 2001, a litany of pitching metrics has been invented, illustrating a range of beliefs on how to handle defense. Most pitching metrics begin with a pitcher's actual statistical record and make adjustments to account for the defensive skill of the players behind him. RSAA and DERA are examples of this. The original DIPS (defense independent pitching statistics) report was unfortunately somewhat simplistic, and it has led many of those studying pitching to dismiss DIPS as a starting point because there are pitchers who consistently defy DIPS principles to a certain degree. The result of that dismissal is a generation of statistics which begin with runs allowed and make adjustments based on different perceptions of how important the fielders are, including the aforementioned RSAA and more complete statistics like PRC (a Hardball Times creation), which make use of correlations suggesting that pitchers are roughly 20% responsible for the results of in-play events.
I believe it is inherently incorrect to begin with a statistic (RA) that is poisoned by biased starting data and small sample sizes and then try to make adjustments to account for the biases. I believe the most accurate pitching measure should begin with DIPS principles and adjust for the observable effects pitchers do have on balls in play. Defense is a context just like parks and leagues are, capable of creating confusing biases in the data. DNRA is the logical conclusion of that belief.

Events and their Place in the Defensive Spectrum

For every event that can possibly occur in a baseball game, there are three possible modes of pitcher responsibility:

- DIPS event (purely the domain of the pitcher)
- Fielder/Baserunner mistakes while the ball is live (purely the domain of the fielders and 100% luck as to how those fall in a pitching line)
- Balls In Play or Pitcher/Fielder Interaction (controlled more by the fielders, but influenced by pitching patterns and ball trajectories, which we know the pitcher can help shape)

Now there are 26 unique event types (25 that are commonly tracked and one that I've added because I feel it makes an important observation about pitchers). Here they are, grouped by type of play (DIPS, Random Variables, and Balls In Play).
DIPS events
- Pitcher-Assisted Ordinary Outs
- Strikeouts
- Balks
- Unintentional Walks
- Intentional Walks
- Hit Batsmen
- Park-Adjusted Home Runs

Random Variables
- Interference
- Reached-On-Error
- Live-ball Baserunner Advance
- Fielders Choice
- Sac Bunt (situation specific and therefore not random in any given game, but as far as the pitcher's concerned, there's no pitching method to apply to prevent a sac bunt once runners are on base)
- Fielder Indifference (it's not Rivera's fault if the runner at first base means nothing in a three run game)

Balls In Play
- Stolen Bases (pitchers show some skill at helping to prevent them, but they can't control steals without a catcher)
- Caught Stealing
- Pickoff (this might surprise you, but pick-offs don't happen as frequently on teams with poor defenses; the fielders need to be contributing to holding the running game down)
- Wild Pitch and Passed Ball (obviously some pitchers are more wild than others, but WP and PB are often confused in the scorer's notebook and either one can be the fault of a bad catcher)
- Singles, Doubles and Triples
- Sac Flies (certain pitchers are harder to get the ball to the outfield off of, and some outfielders are harder to run off of)
- Double Plays
- In-Play Outs (non-pitcher-assisted)

A word of explanation about the pitcher-assisted outs. I now consider this a DIPS event because, after years of observing the data, I have realized that pitchers who tend to have a positive impact on ball in play hit rates also tend to be guys who rate as "great fielders." There is some skill involved in being able to field the pitching position, but I am more and more convinced that pitchers' fielding is more an expression of the pitcher's ability to induce very poor contact than of his glove skills. If the pitcher has a chance to field a ball first and get an assist for his efforts, more likely than not it was poorly hit and on the ground, which must be considered a pitching success.
Now you see lots of replays of pitchers finding a line drive embedded in their glove by pure chance, but in general, the pitchers who get poor contact get more first-touch assists. (I limited the assists that counted to plays which were ordinary in-play outs, meaning non-pick-offs and non-sac bunts, and on which the pitcher had the first assist.)

Now for a Little Math

Bear with me while I go through the entire process that produces a Defense-Neutral Run Average, using an example season that will illustrate the differences between traditional analysis and the defense-neutral approach.

A) Gathering the Necessary Raw Statistics

We begin as always with a pitching line produced by querying the play by play database provided by Retrosheet (these guys do incredible work to provide data of unparalleled quality, and I hope you students of the game who have been reading this article will consider making a contribution of time or money to their most important objective of producing play by play level data for as much of baseball history as possible). I have chosen Tom Glavine's 2001 season for this demonstration as it represents some of the dangers of using other approaches. Here's the initial data.

 #   Type           Count
 2   Out              454
 3   K                116
 4   SB                 9
 5   Indifference       0
 6   CS                 7
 8   PkO                3
 9   WP                 2
10   PB                 0
11   BK                 0
12   Advances           0
14   UBB               87
15   IBB               10
16   HBP                2
17   Interference       0
18   ROE                5
19   FC                 1
20   1B               156
21   2B                32
22   3B                 1
23   HR                24
25   SH                 5
26   SF                 8
27   DP                29
28   TP                 0
29   Pitcher A         33

The event numbers are the same as the ones invented by Retrosheet to identify every play in the play by play (PBP) database (DB). Also note that the pitcher-assisted plays (event 29) are included in the standard in-play outs (event 2), not counted separately. The same exact statistics are totaled up for the pitcher's team and his league for later use, as you'll see. Now we need some information that can be derived from the above table and that is used in the calculations that follow.
BIP = Outs - Pitcher A + Interference + ROE + FC + 1B + 2B + 3B + HR + SH + SF + DP + TP
RFB = 1B + UBB + IBB + HBP + ROE
BR = RFB + 2B + 3B

BIP represents the number of balls in play (non-pitcher-assisted), RFB represents the approximate number of runners at first base allowed by any pitcher, and BR represents the total number of baserunners the pitcher allowed. All three of these statistics are calculated for the pitcher's team and home league as well. In Tom Glavine's case, he allowed 682 BIP, 260 RFB, and 293 total baserunners.

B) Park-Adjusting HR Allowed

All of the statistics used in the analysis of pitching are by their very nature adjusted for the environments in which they occurred. Hit rates, double play turning rates and baserunning statistics, by being placed in both team and league contexts, screen out the potential poisoning bias of the parks in which they were garnered. All statistics get this benefit except the DIPS statistics. Parks don't truly impact walks or hit batsmen or grounders back to the mound (I expect any differences in walk rates in parks are purely incidental and have nothing to do with the park itself), but they most certainly impact HR rates. A method must be devised to get a park-neutral HR figure. Here again the PBP DB comes in handy. I queried the DB in order to learn:

- How many HRs were hit in each game by either side
- How many balls were put into play plus home runs hit in each game by either side
- Where each game was played (which park)
- How many balls in play plus home runs each pitcher allowed in each park in each season in which they pitched
- The league average rate at which (BIP + HR) become HRs
- The rate at which BIP become HR at each park

Once you have all of that, you can determine the difference between the park's HR rate on balls in play and the league's rate of the same, apply that difference to each pitcher/season/park record, and you get a park HR adjustment that accounts for every park in which a pitcher allowed at least one ball in play.
The formula is as follows:

HR Added = SUM ((BIP + HR at each park) * (Park HR rate - League HR rate))

In English: for every park in which a pitcher pitched, multiply the number of balls in play (plus home runs) he allowed there by the difference between that park's HR rate and the league average HR rate, then sum all of the individual park HR Added figures together to find the number of HRs added by all of the parks in which he pitched. In the future, I plan on adjusting this figure to account for the overall strength of batters and pitchers who played in each park, because there is potential for some small error if a park happens to play host to unusually strong offenses or if a cluster of unusually good or poor pitchers happens to get their appearances in a particular park. Still, this park adjustment is accurate enough to get within 1 HR Added 99% of the time, and because I round all events to the nearest integer (you can't have half a HR), that does a pretty good job.

In 2001, the parks in which Glavine pitched took 2 HRs off his park-neutral total. We estimate that in the same playing time at a neutral site, Glavine would have allowed 26 HRs. Once you have a new HR total for each pitcher/season, you need to adjust his BIP to account for the change. Since we expect that two extra balls in play should have become HRs in Glavine's case, we remove two BIP from his line, leaving 680 defense independent balls in play (DIBIP).

C) Calculating a Defense Independent Pitching Summary

Normally a DIPS analysis begins by taking a player's BIP data and converting it into a line that represents the league averages on ball in play events. This method begins there too, but adds an important component. Before we place a player in the context of the league, we need to know something about how he did relative to his team.
For every event that falls into the category of in-play or baserunning plays (the third grouping in the above summary), we need to know how the pitcher actually impacted the defensive results on the field. We'll use singles data to show how I do this. Tom Glavine allowed 156 singles in 680 defense independent balls in play (a rate of .229). That is a rate significantly higher than the league's 14888 singles in 69395 BIP (a rate of .215). Did Glavine do that much worse than average at preventing singles, or were the Braves infielders a little worse than average? The Braves as a team allowed 960 singles in 4261 non-pitcher-assisted balls in play, for a rate of .225. Glavine's single rate was pretty close to the Braves' single rate, which suggests that most of the difference between Glavine and the league was the infield behind him. Calculating Glavine's singles over team is fairly straightforward:

1BOT = 1B - DIBIP * (1B (team) / DIBIP (team))

The same approach can be used to calculate doubles over team and triples over team. In Glavine's case we find that he allowed 2.3 singles more than his team rate, but 4.5 doubles fewer than his team rate and 2.5 triples fewer than expected.

The hit events are the only events which get processed using BIP data by itself. All of the other in-play events require more specific information because they can only occur when certain preconditions exist. A double play can only occur when there is at least one runner on base and fewer than two outs, for example. There's a whole subset of data for which it makes the most sense to use RFB figures instead of BIP. DP, SB, CS and PkO over team are estimated in the same way that singles are, except that RFB data is used instead of DIBIP data. I could technically get very specific information for double plays and count only baserunners existing with fewer than two outs, but I'm assuming that the distribution of baserunners in the various out counts is relatively random.
DP are also possible without a runner at first base, but they are very rare (less than 3% of all DP do not involve a runner at first base), and the same is true of stolen bases and CS. RFB is close enough that it gives you a good idea how many chances each pitcher has to turn DP, allow SB or get CS.

Still other data requires information on all of the baserunners. WP, PB and SF occur more randomly in the various base-out states but require runners to be on base. SF technically require a runner at third, so I considered using triple counts, but it is far more common for runners to reach third by a combination of other hits and walks and the like than by triples, so I stuck with all baserunners.

For all statistics in the second grouping (random variables), the team data is considered irrelevant and no team adjustment is needed. Variables in the DIPS event grouping are left as they are.

Once the team context has been considered, we can place all of the non-DIPS statistics in their proper league average contexts. For each non-DIPS stat the formula is the same:

DI Event = Opportunities * (Event (league) / Opportunities (league)) + events over team

In the case of singles, opportunities would be DIBIP and events over team would be 1BOT. In the case of WP, opportunities would be BR and events over team would be WPOT. In the case of SH, opportunities would be RFB and events over team would be zero.

Care must be taken at this juncture because as you convert a pitcher's line into defense independent equivalents, the RFB and BR values change. If you expect him to have allowed more singles, the RFB increases. The defense independent line must therefore be calculated in the order specified here: DIBIP and HR first; singles, doubles and triples second; RFB next; then all of the statistics which depend on RFB; then BR; then all of the statistics which depend on BR.
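The first steps of that ordering can be checked with a few lines of Python. This is only a sketch of the bookkeeping, not the author's actual code: the function and variable names are invented for illustration, the park effect of -2 HR and the 2.3 singles over team are taken as reported rather than recomputed (the rounded rates quoted in the text do not pin the 2.3 down exactly), and the DI Event formula is read as league rate times opportunities, plus the events-over-team count, which is the reading that reproduces the 148 singles reported for Glavine.

```python
# Tom Glavine's 2001 raw line, copied from the article's raw-statistics table.
RAW = {
    "Out": 454, "PitcherA": 33, "Interference": 0, "ROE": 5, "FC": 1,
    "1B": 156, "2B": 32, "3B": 1, "HR": 24, "SH": 5, "SF": 8,
    "DP": 29, "TP": 0, "UBB": 87, "IBB": 10, "HBP": 2,
}


def derived_counts(line):
    """Return (BIP, RFB, BR) using the article's three definitions."""
    bip = (line["Out"] - line["PitcherA"] + line["Interference"]
           + line["ROE"] + line["FC"] + line["1B"] + line["2B"] + line["3B"]
           + line["HR"] + line["SH"] + line["SF"] + line["DP"] + line["TP"])
    rfb = line["1B"] + line["UBB"] + line["IBB"] + line["HBP"] + line["ROE"]
    br = rfb + line["2B"] + line["3B"]
    return bip, rfb, br


def di_event(opportunities, lg_events, lg_opportunities, over_team):
    """League-rate events for these chances, shifted by the over-team count."""
    return opportunities * lg_events / lg_opportunities + over_team


bip, rfb, br = derived_counts(RAW)

# Step 1: DIBIP and HR. The -2 is the article's reported park effect.
HR_ADDED = -2
neutral_hr = RAW["HR"] - HR_ADDED   # HRs expected at a neutral site
dibip = bip + HR_ADDED              # two BIP are reclassified as HRs

# Step 2: singles, using the league totals quoted in the text and the
# reported 2.3 singles over team.
di_singles = di_event(dibip, 14888, 69395, over_team=2.3)

print(bip, rfb, br)                         # 682 260 293
print(neutral_hr, dibip, round(di_singles)) # 26 680 148
```

Doubles and triples would go through `di_event` the same way before RFB is recomputed from the new singles and ROE counts, which is why the order of operations matters.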
Defense independent ordinary outs (event 2) are calculated by finding the difference between the original "batter is safe" events and their defense independent equivalents and subtracting that from the original out count (if we expect batters to reach safely 6 extra times, that means the pitcher is getting 6 fewer outs). If you do all of this in the correct order, this is the defense independent line you get for Tom Glavine.

 #   Type           Count
 2   Out              446
 3   K                116
 4   SB                 7
 5   Indifference       1
 6   CS                 5
 8   PkO                2
 9   WP                 2
10   PB                 0
11   BK                 0
12   Advances           0
14   UBB               87
15   IBB               10
16   HBP                2
17   Interference       0
18   ROE                9
19   FC                 2
20   1B               148
21   2B                41
22   3B                 2
23   HR                26
25   SH                11
26   SF                 6
27   DP                30
28   TP                 0

D) Turning the Defense Independent Summary Line into DNRA

Once you have all of that complicated work done, it becomes surprisingly easy to generate DNRA. All you need are accurate linear weights (where average values yield zero LWRC). Now I generated my linear weights through the use of some matrix algebra (a method called Markov chains), but anyone wanting to duplicate my findings just needs the results of that research, which are available in any number of places online. Let's add another couple of columns to the table above. The first new column will contain linear weight values for each of the events and the second will multiply the counts by the linear weights. When you sum those multiplied values up, you get the pitcher's defense independent runs allowed above average (DIRAAA).
 #   Type           Count       LW       RC
 2   Out              446   -0.256   -114.2
 3   K                116   -0.281    -32.6
 4   SB                 7    0.185      1.3
 5   Indifference       1    0.111      0.1
 6   CS                 5   -0.447     -2.2
 8   PkO                2   -0.278     -0.6
 9   WP                 2    0.279      0.6
10   PB                 0    0.287      0.0
11   BK                 0    0.292      0.0
12   Advances           0   -0.444      0.0
14   UBB               87    0.305     26.5
15   IBB               10    0.172      1.7
16   HBP                2    0.328      0.7
17   Interference       0    0.614      0.0
18   ROE                9    0.504      4.5
19   FC                 2   -0.379     -0.8
20   1B               148    0.460     68.1
21   2B                41    0.771     31.6
22   3B                 2    1.060      2.1
23   HR                26    1.370     35.6
25   SH                11   -0.133     -1.5
26   SF                 6   -0.065     -0.4
27   DP                30   -0.841    -25.2
28   TP                 0       NA      0.0
     TOTAL                             -4.6

In 2001, Glavine posted an ERA+ of 125, but we observe that he was nowhere near that effective in reality. He was slightly above average (a negative value means he gave up fewer runs than average) but not 25% above average.

To turn DIRAAA into DNRA, we need to know two additional facts. First, we need to know how many defense independent total outs Glavine is getting credit for.

DIOuts = Outs + (2 * DP) + (3 * TP) + K + CS + PkO + FC + SH + SF

Second, we need to know what the league average RS/27 Outs is. I have this information for every league in major league history, and in this example, the 2001 NL scored 4.68 R/27, and Glavine is credited with producing 647 DIOuts. From here, the formula for DNRA is simple.

DNRA = ((LgRS27 * (DIOuts / 27) + DIRAAA) / DIOuts) * 27

Note that DNRA is on an RA scale and not an ERA scale. In the modern era, a 5 ERA is bad, but a 5 RA is about average to maybe a little worse than average (especially in the AL). Also note that by definition the DNRA and the league RS/G are related, so in order to put pitchers on the same scale, you need to adjust for the league RS rate. This is where DNRA+ comes in. Just like ERA+, DNRA+ is calculated by finding the ratio between the league RS rate and the pitcher's DNRA, like so:

DNRA+ = (LgRS27 / DNRA) * 100

100 is average, 120 is an all star, 150 is outstanding, and the replacement level is commonly around 75. Tom Glavine's DNRA in 2001 is 4.50 for a DNRA+ of 104.
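As a rough check on step D, the linear-weights table and the three formulas can be run through a short Python sketch. The function names are invented for illustration and the counts and weights are simply the rounded table entries, so small differences from the published figures are to be expected: summing DIOuts from the rounded line gives 648 rather than the reported 647, and DNRA comes out near 4.49 rather than 4.50, presumably because the article works from unrounded values. The triple play weight is listed as NA in the table; it is entered as 0.0 here, which is harmless because the count is zero.

```python
# event: (defense independent count, linear weight), from the table above.
DI_LINE = {
    "Out": (446, -0.256), "K": (116, -0.281), "SB": (7, 0.185),
    "Indifference": (1, 0.111), "CS": (5, -0.447), "PkO": (2, -0.278),
    "WP": (2, 0.279), "PB": (0, 0.287), "BK": (0, 0.292),
    "Advances": (0, -0.444), "UBB": (87, 0.305), "IBB": (10, 0.172),
    "HBP": (2, 0.328), "Interference": (0, 0.614), "ROE": (9, 0.504),
    "FC": (2, -0.379), "1B": (148, 0.460), "2B": (41, 0.771),
    "3B": (2, 1.060), "HR": (26, 1.370), "SH": (11, -0.133),
    "SF": (6, -0.065), "DP": (30, -0.841),
    "TP": (0, 0.0),  # weight is NA in the article; 0.0 is a placeholder
}


def diraaa(line):
    """Defense independent runs allowed above average: sum of count * weight."""
    return sum(count * weight for count, weight in line.values())


def di_outs(line):
    """DIOuts = Outs + 2*DP + 3*TP + K + CS + PkO + FC + SH + SF."""
    n = {event: count for event, (count, _) in line.items()}
    return (n["Out"] + 2 * n["DP"] + 3 * n["TP"] + n["K"] + n["CS"]
            + n["PkO"] + n["FC"] + n["SH"] + n["SF"])


def dnra(lg_rs27, outs, runs_above_avg):
    """League runs over the same outs, plus DIRAAA, rescaled to 27 outs."""
    return (lg_rs27 * (outs / 27) + runs_above_avg) / outs * 27


def dnra_plus(lg_rs27, dnra_value):
    """League scoring rate relative to the pitcher's DNRA; 100 is average."""
    return lg_rs27 / dnra_value * 100


LG_RS27 = 4.68        # 2001 NL runs per 27 outs, as stated in the article
REPORTED_OUTS = 647   # DIOuts the article credits to Glavine

runs = diraaa(DI_LINE)
glavine_dnra = dnra(LG_RS27, REPORTED_OUTS, runs)
glavine_plus = dnra_plus(LG_RS27, glavine_dnra)

print(round(runs, 1))          # -4.6
print(di_outs(DI_LINE))        # 648
print(round(glavine_dnra, 2))  # 4.49
print(round(glavine_plus))     # 104
```

Because DNRA simplifies to the league rate plus DIRAAA scaled to 27 outs, a one-out difference in DIOuts barely moves the result, which is why the DNRA+ of 104 is reproduced either way.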
Future Improvements

As I stated earlier, there is room for improvement in the park factoring process, both by accounting for the strength of the batters and pitchers who happen to play in each park and by accounting for LH/RH splits. In order to get a truer measure of a pitcher's skill, I'm currently pursuing strength of schedule analysis as well. I have also considered utilizing batted ball trajectories to improve the DIPS analysis aspect of this metric, but at the moment I am skeptical about the quality of batted ball data and about its usefulness. I remain open to the possibility, however. Research is ongoing and I am certainly open to suggestions from anyone interested. Thank you for reading, and I hope I have made a positive impression and avoided boring you to tears with this introduction.

Summary of Findings

Top 50 Pitchers in Career DNRA+ (since 1957, excluding 1999, with 5000 DIOuts)

First     Last          Outs   DNRA+
Pedro     Martinez      6893     164
Roger     Clemens      13666     150
Randy     Johnson      10075     145
Greg      Maddux       12525     138
Sandy     Koufax        6684     135
Mike      Mussina       8394     134
Curt      Schilling     8183     132
Kevin     Brown         9036     131
John      Smoltz        8228     128
Rich      Gossage       5390     126
Andy      Pettitte      5737     126
Tom       Seaver       14283     124
Bret      Saberhagen    7314     123
Nolan     Ryan         16256     123
Sid       Fernandez     5613     120
Ron       Guidry        7124     120
Dave      Stieb         8650     120
Juan      Marichal     10576     120
Tom       Gordon        5914     120
Andy      Messersmith   6643     120
Rollie    Fingers       5029     119
Sam       McDowell      7560     119
Dean      Chance        6425     118
Bob       Stanley       5151     118
Gaylord   Perry        16021     118
Bob       Gibson       11601     118
Jimmy     Key           7714     118
Kevin     Appier        7122     118
Camilo    Pascual       7545     118
Dwight    Gooden        8118     118
Whitey    Ford          6420     117
Bert      Blyleven     14916     117
David     Cone          8148     117
Jose      Rijo          5655     116
Don       Drysdale      9994     116
Tom       Glavine      11132     116
Dennis    Eckersley     9867     116
Fergie    Jenkins      13506     115
Steve     Rogers        8449     115
Orel      Hershiser     8888     115
Jon       Matlack       7111     114
Jim       Maloney       5525     114
Don       Wilson        5186     114
Larry     Jackson       8983     114
David     Wells         8934     113
Steve     Carlton      15702     113
Rick      Reuschel     10726     113
Chuck     Finley        8934     113
Luis      Tiant        10478     112
Mark      Langston      8732     112

Top 50 Team Pitching Staffs

Yr     Tm    DNRA+
1994   ATL     126
1996   ATL     126
1998   ATL     126
1995   ATL     124
2002   NYA     124
2003   NYA     123
1997   ATL     121
2003   LAN     121
1988   NYN     120
2001   NYA     120
1966   LAN     119
1990   NYN     119
2004   CHN     117
1993   ATL     116
2001   CLE     116
2003   CHN     116
1980   NYA     115
1981   NYA     115
1984   LAN     115
1995   CLE     115
2001   BOS     115
2001   OAK     115
1971   CHA     114
1971   NYN     114
1994   SDN     114
1996   NYA     114
2001   CHN     114
2002   OAK     114
2003   ARI     114
1976   NYN     113
1981   HOU     113
1981   LAN     113
1982   NYA     113
1989   KCA     113
1994   MON     113
1997   NYA     113
2002   BOS     113
2003   BOS     113
2004   BOS     113
1977   LAN     112
1982   LAN     112
1982   PHI     112
1983   LAN     112
1983   PHI     112
1985   DET     112
1985   KCA     112
1985   LAN     112
1986   NYN     112
1990   BOS     112
1994   CLE     112
1995   SEA     112
2000   CLE     112
2003   OAK     112
2005   HOU     112

50 Worst Team Pitching Staffs

Yr     Tm    DNRA+
1957   PIT      89
1962   KC1      89
1965   WS2      89
1965   NYN      89
1967   KC1      89
1968   CAL      89
1971   CLE      89
1971   WS2      89
1977   SEA      89
1978   SEA      89
1982   MIN      89
1983   MIN      89
1984   OAK      89
1993   COL      89
1994   SLN      89
1996   DET      89
1996   ML4      89
1997   COL      89
2001   TBA      89
2003   SDN      89
2004   COL      89
2005   KCA      89
2005   TBA      89
1964   KC1      88
1966   KC1      88
1979   TOR      88
1982   OAK      88
1994   FLO      88
1995   ML4      88
1995   SFN      88
1996   SFN      88
2002   KCA      88
2004   KCA      88
1963   WS2      87
1964   NYN      87
1965   KC1      87
1966   NYN      87
1985   CLE      87
1989   CHA      87
1995   MIN      87
2000   KCA      87
2001   KCA      87
2002   MIL      87
1963   NYN      86
2002   COL      86
2003   DET      86
1998   FLO      85
2002   TBA      85
2003   TBA      84