http://detectovision.com/?p=803
I am hoping to make my recent research product, DNRA, a part of the dialogue
among students of the game of baseball. To that end, I've been asked to
present, in a quasi-readable format, the exact methodology I used to produce
the promising results recently disclosed in a couple of different places
online. The few who have thus far reviewed my findings seem interested, at
the very least, in potential refinements to the system, and pleased overall
with the quality of the data I've shown to date. I am certain there is more
work to be done, and I'll discuss some of my plans for future research and
improvements at the end of this document. For the moment, I give you the nuts
and bolts of DNRA.
Overview of Underlying Philosophy
In recent days I've been told by many a researcher that the question of
predictive analysis (attempting to divine from available data what a player
is likely to do in the future) and the evaluation of past value (reading the
past data to explain what has already happened) are two separate tasks
requiring two different approaches and two different sets of statistics. I
have long believed this is not the case. For years I've worked hard to
accurately describe the vast history of numerical baseball data in terms of
how valuable players are and have been (think along the lines of Win Shares
and WARP) for two reasons. I am fascinated by the history of the game,
obviously, but I also strongly believe that an accurate depiction of all that
has come before will serve us well in answering the question of what will
happen in the next year or three.
Predictive sabermetrics that do not make use of the mounds of data in the
history of the game generally operate by applying some assumptions and the
laws of probability and this, to me, has always seemed like a somewhat
incomplete approach.
Meanwhile, the study of baseball history has recently been dismissed as a
means to the desired end - an accurate predictive model - because current
sabermetric tools don't quite adequately capture all of the necessary
information to fully understand how and why events occur. This is especially
true of pitching and fielding analysis where the majority of the available
statistics that would quantify a pitcher or fielder's value are far behind
the advanced approaches used to diagnose hitting and far less reliable and
accurate. I have therefore devoted most of my energy to the effort of
breaking down defense (including pitching) and getting an accurate picture
where one is most needed.
Defense is a Context
When Voros McCracken stumbled on the idea that pitchers have little control
over the batting average on balls in play, he launched a fierce debate that
continues to this day regarding how best to use this observation. Since 2001
a litany of pitching metrics have been invented illustrating a range of
beliefs on the subject of how to handle defense. Most pitching metrics begin
with a pitcher's actual statistical record and make adjustments to account
for the defensive skill of the players behind him. RSAA and DERA are
examples of this.
The original DIPS (defense independent pitching statistics) report was
unfortunately somewhat overly simplistic and it has led many of those
studying pitching to dismiss DIPS as a starting point because there are
pitchers who consistently defy DIPS principles to a certain degree. The
result of that dismissal is a generation of statistics which begin with runs
allowed and make adjustments based on different perceptions of how important
the fielders are, including the aforementioned RSAA and more complete
statistics like PRC (a Hardball Times creation) which make use of
correlations that suggest pitchers are roughly 20% responsible for the
results of in-play events.
I believe it is inherently incorrect to begin with a statistic (RA) that is
poisoned by biased starting data and small sample sizes and try to make
adjustments to account for the biases. I believe the most accurate pitching
measure should begin with DIPS principles and adjust for the observable
effects pitchers do have on balls in play. Defense is a context just like
parks and leagues are, capable of creating confusing biases in the data.
DNRA is the logical conclusion of that belief.
Events and their Place in the Defensive Spectrum
For every event that can possibly occur in a baseball game, there are three
possible modes of pitcher responsibility:
1. DIPS event (purely the domain of the pitcher)
2. Fielder/Baserunner mistakes while the ball is live (purely the domain of
the fielders and 100% luck as to how those fall in a pitching line)
3. Balls In Play or Pitcher/Fielder Interaction (controlled more by the
fielders, but influenced by pitching patterns and ball trajectories which we
know the pitcher can help shape)
Now there are 26 unique event types (25 that are commonly tracked and one
that I've added because I feel it makes an important observation about
pitchers). Here they are, grouped by type of play (DIPS, Random Variables,
and Balls In Play).
DIPS events
Pitcher-Assisted Ordinary Outs
Strikeouts
Balks
Unintentional Walks
Intentional Walks
Hit Batsmen
Park-Adjusted Home Runs
Random Variables
Interference
Reached-On-Error
Live-ball Baserunner Advance
Fielders Choice
Sac Bunt (situation specific and therefore not random in any given game, but
as far as the pitcher's concerned, there's no pitching method to apply to
prevent a sac bunt once runners are on base)
Fielder Indifference (it's not Rivera's fault if the runner at first base
means nothing in a three run game)
Balls In Play
Stolen Bases (pitchers show some skill at helping to prevent but they can't
control steals without a catcher)
Caught Stealing
Pickoff (this might surprise you, but pick-offs don't happen as frequently on
teams with poor defenses - the fielders need to be contributing to holding
the running game down)
Wild Pitch and Passed Ball (obviously some pitchers are more wild than
others, but WP and PB are often confused in the scorer's notebook and either
one can be the fault of a bad catcher)
Singles, Doubles and Triples
Sac Flies (certain pitchers are harder to get the ball to the outfield off
of, and some outfielders are harder to run off of)
Double Plays
In-Play Outs (non-pitcher-assisted)
A word of explanation about the pitcher-assisted outs. I now consider this a
DIPS event because after years of observing the data I have realized that
pitchers who tend to have a positive impact on ball in play hit rates also
tend to be guys who rate as "great fielders." There is some skill involved
in being able to field the pitching position, but I am more and more
convinced that pitchers' fielding is more of an expression of the pitcher's
ability to induce very poor contact than of his glove skills. If the pitcher
has a chance to field a ball first and get an assist for his efforts, more
likely than not it was poorly hit and on the ground, which must be considered
a pitching success. Now you see lots of replays of pitchers finding a line
drive embedded in their glove by pure chance, but in general, the pitchers
who get poor contact get more first-touch assists. (I limited the assists
that counted to ordinary in-play outs, excluding pick-offs and sac bunts, on
which the pitcher had the first assist.)
Now for a Little Math
Bear with me while I go through the entire process that produces a
Defense-Neutral Run Average through an example season that will illustrate
the differences between traditional analysis and the defense-neutral approach.
A) Gathering the Necessary Raw Statistics
We begin as always with a pitching line produced by querying the play by play
database provided by Retrosheet (these guys do incredible work to provide
data of unparalleled quality and I hope you students of the game who have
been reading this article will consider making a contribution of time or
money to their most important objective of producing play by play level data
for as much of baseball history as possible). I have chosen Tom Glavine's
2001 season for this demonstration as it represents some of the dangers of
using other approaches. Here's the initial data.
# Type Count
2 Out 454
3 K 116
4 SB 9
5 Indifference 0
6 CS 7
8 PkO 3
9 WP 2
10 PB 0
11 BK 0
12 Advances 0
14 UBB 87
15 IBB 10
16 HBP 2
17 Interference 0
18 ROE 5
19 FC 1
20 1B 156
21 2B 32
22 3B 1
23 HR 24
25 SH 5
26 SF 8
27 DP 29
28 TP 0
29 Pitcher A 33
The event numbers are the same as the ones invented by Retrosheet to identify
every play in the play by play (PBP) database (DB). Also note, the pitcher
assisted plays (event 29) are included in the standard in play outs (event
2), not separate. The same exact statistics are totaled up for the pitcher's
team and his league for later use as you'll see.
Now we need some information that can be derived from the above table and
that is used in the calculations that follow.
BIP = Outs-Pitcher A+Interference+ROE+FC+1B+2B+3B+HR+SH+SF+DP+TP
RFB = 1B + UBB + IBB + HBP + ROE
BR = RFB + 2B + 3B
BIP represents the number of balls in play (non-pitcher-assisted), RFB
represents the approximate number of runners at first base allowed by any
pitcher and BR represents the total number of baserunners the pitcher
allowed. All three of these statistics are calculated for the pitcher's team
and home league as well.
In Tom Glavine's case, he allowed 682 BIP, 260 RFB, and 293 total baserunners.
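As a quick check, the three derived totals can be reproduced from the raw
line above. The Python sketch below is only an illustration; the dictionary
keys and function names are labels chosen for this example, not Retrosheet
field names.

```python
# Tom Glavine's 2001 raw event counts, copied from the table above.
# Keys are illustrative labels, not Retrosheet field names.
glavine_2001 = {
    "Out": 454, "K": 116, "SB": 9, "Indifference": 0, "CS": 7, "PkO": 3,
    "WP": 2, "PB": 0, "BK": 0, "Advances": 0, "UBB": 87, "IBB": 10,
    "HBP": 2, "Interference": 0, "ROE": 5, "FC": 1, "1B": 156, "2B": 32,
    "3B": 1, "HR": 24, "SH": 5, "SF": 8, "DP": 29, "TP": 0, "PitcherA": 33,
}


def balls_in_play(line):
    """BIP: non-pitcher-assisted balls in play, per the formula above."""
    added = ["Interference", "ROE", "FC", "1B", "2B", "3B",
             "HR", "SH", "SF", "DP", "TP"]
    return line["Out"] - line["PitcherA"] + sum(line[k] for k in added)


def runners_at_first(line):
    """RFB: approximate number of runners who reached first base."""
    return sum(line[k] for k in ["1B", "UBB", "IBB", "HBP", "ROE"])


def baserunners(line):
    """BR: RFB plus doubles and triples."""
    return runners_at_first(line) + line["2B"] + line["3B"]


bip = balls_in_play(glavine_2001)
rfb = runners_at_first(glavine_2001)
br = baserunners(glavine_2001)
print(bip, rfb, br)  # 682 260 293
```

The same three functions would be applied unchanged to the team and league
totals.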
B) Park-Adjusting HR Allowed
All of the statistics used in the analysis of pitching are by their very
nature adjusted for the environments in which they occurred. Hit rates,
double play turning rates and baserunning statistics, by being placed in both
team and league contexts screen out the potential poisoning bias of the parks
in which they were garnered. All statistics get this benefit except the DIPS
statistics. Parks don't truly impact walks or hit batsmen or grounders back
to the mound (I expect any differences in walk rates in parks are purely
incidental and have nothing to do with the park itself), but they most
certainly impact HR rates. A method must be devised to get a park-neutral HR
figure.
Here again the PBP DB comes in handy. I queried the DB in order to learn:
How many HRs were hit in each game by either side
How many balls were put into play plus home runs hit in each game by either
side
Where each game was played (which park)
How many balls in play plus home runs each pitcher allowed in each park in
each season in which they pitched
The league average rate at which (BIP + HR) become HRs.
The rate at which (BIP + HR) become HRs at each park
Once you have all of that you can determine the difference between the park's
HR rate on balls in play and the league's rate of the same, apply that
difference to each pitcher/season/park record and you get a park HR
adjustment that accounts for every park in which a pitcher allowed at least
one ball in play. The formula is as follows:
HR Added = SUM ((BIP + HR at each park) * (Park HR rate - League HR rate))
In English, for every park in which a pitcher pitched, multiply the number of
balls in play plus home runs he allowed there by the difference between that
park's HR rate and the league average HR rate, then sum the individual park
HR Added figures to find the number of HRs added by all of the parks in which
he pitched.
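The summation can be sketched in a few lines of Python. This is a minimal
illustration, not the actual query: the park names, counts and rates below
are invented purely to show the arithmetic, and the function name is
hypothetical.

```python
def hr_added(park_records, league_hr_rate):
    """Sum (BIP + HR at park) * (park HR rate - league HR rate) over parks.

    park_records: iterable of (park, bip_plus_hr, park_hr_rate) tuples for
    one pitcher-season.
    """
    return sum(
        bip_plus_hr * (park_rate - league_hr_rate)
        for _park, bip_plus_hr, park_rate in park_records
    )


# Hypothetical pitcher-season, NOT real data: two parks, one homer-friendly
# and one homer-suppressing, against an invented league rate of 0.034.
example_parks = [
    ("Park A", 300, 0.040),
    ("Park B", 200, 0.030),
]
example_added = hr_added(example_parks, league_hr_rate=0.034)

# Events are rounded to the nearest integer, as described in the text.
example_added_rounded = round(example_added)
print(example_added_rounded)  # 1
```

A positive result means the parks inflated the pitcher's HR total; a negative
result means they suppressed it.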
In the future, I plan on adjusting this figure to account for the overall
strength of batters and pitchers who played in each park, because there is
potential for some small error if a park happens to play host to unusually
strong offenses or if a clustering of unusually good or poor pitchers happen
to get their appearances in a particular park, but this park adjustment is as
accurate as necessary to get within 1 HR Added 99% of the time and because I
round all events to the nearest integer (you can't have half a HR), that does
a pretty good job.
In 2001, the parks in which Glavine pitched suppressed his HR total by 2
relative to a neutral site (an HR Added of -2). We estimate that in the
same playing time at a neutral site, Glavine would have allowed 26 HRs.
Once you have a new HR total for each pitcher/season, you need to adjust his
BIP to account for the change. Since we expect that two extra balls in play
should have become HRs in Glavine's case, we remove two BIP from his line,
leaving 680 defense independent balls in play (DIBIP).
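The sign convention is easy to get backwards, so here is the Glavine
adjustment as a small Python sketch. It assumes HR Added is negative when
the parks suppress home runs, which is the reading consistent with the
numbers above; the function name is illustrative.

```python
def park_neutral_hr_and_dibip(hr_allowed, bip, hr_added_by_parks):
    """Return (park-neutral HR, defense independent BIP).

    Assumes hr_added_by_parks is negative when the parks took home runs
    away, so the neutral total is HR allowed minus HR Added. Every ball in
    play reclassified as a HR is removed from the BIP count.
    """
    neutral_hr = hr_allowed - hr_added_by_parks
    dibip = bip - (neutral_hr - hr_allowed)
    return neutral_hr, dibip


# Glavine 2001: 24 HR allowed, 682 BIP, HR Added of -2.
neutral_hr, dibip = park_neutral_hr_and_dibip(24, 682, -2)
print(neutral_hr, dibip)  # 26 680
```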
C) Calculating a Defense Independent Pitching Summary
Normally a DIPS analysis begins by taking a player's BIP data and converting
it into a line that represents the league averages on ball in play events.
This method begins there too, but adds an important component. Before we
place a player in the context of the league, we need to know something about
how he did relative to his team.
For every event that falls into the category of in-play or baserunning plays
(the third grouping in the above summary), we need to know how the pitcher
actually impacted the defensive results on the field. We'll use singles data
to exemplify how I do this.
Tom Glavine allowed 156 singles in 680 defense independent balls in play (a
rate of .229). That is a rate significantly higher than the league's 14888
singles in 69395 BIP (a rate of .215). Did Glavine do that much worse than
average at preventing singles or were the Braves infielders a little worse
than average? The Braves as a team allowed 960 singles in 4261
non-pitcher-assisted balls in play, for a rate of .225. Glavine's single
rate was pretty close to the Braves' single rate which suggests that most of
the difference between Glavine and the league was the infield behind him.
Calculating Glavine's singles over team (1BOT) is fairly straightforward:
1BOT = 1B - DIBIP * (1B (team) / DIBIP (team))
The same approach can be used to calculate doubles over team and triples over
team.
In Glavine's case we find that he allowed 2.3 singles more than his team
rate, but 4.5 doubles fewer than his team rate and 2.5 triples fewer than
expected.
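The over-team formula translates directly into code. The sketch below uses
only the singles figures quoted above (156 singles, 680 DIBIP, and the
Braves' 960 singles in 4261 balls in play); the function name is
illustrative. With those inputs it returns roughly 2.8 rather than the 2.3
reported, so the published figure presumably rests on slightly different
inputs; the gap is not resolved here.

```python
def events_over_team(events, opportunities, team_events, team_opportunities):
    """Events allowed beyond what the team rate predicts for these chances."""
    team_rate = team_events / team_opportunities
    return events - opportunities * team_rate


# Glavine's singles against the Braves' team rate, using the quoted totals.
singles_over_team = events_over_team(
    events=156, opportunities=680, team_events=960, team_opportunities=4261
)
print(round(singles_over_team, 1))
```

Doubles over team and triples over team use the same function with the
corresponding event counts.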
The hit events are the only events which get processed using BIP data by
itself. All of the other in play events require more specific information
because they can only occur when certain preconditions exist. A double play
can only occur when there is at least one runner on base and fewer than two
outs, for example. There's a whole subset of data for which it makes the
most sense to use RFB figures instead of BIP. DP, SB, CS and PkO over team
are estimated in the same way that singles are except instead of DIBIP data,
RFB data is used. I could technically get very specific information for
double plays and count only baserunners existing with fewer than two outs, but
I'm assuming that the distribution of baserunners in the various out counts
is relatively random. DP are also possible without a runner at first base,
but they are very rare (fewer than 3% of all DP do not involve a runner at
first base) and the same is true of stolen bases and CS. RFB is close enough
that it gives you a good idea how many chances to turn DP and allow SB or get
CS each pitcher has.
Still other data requires information on all of the baserunners. WP, PB and
SF occur more randomly in the various base-out states but require runners to
be on base. SF technically require a runner at third, so I considered using
triple counts, but it is far more common for runners to reach third by a
combination of other hits and walks and the like than by triples, so I stuck
with all baserunners.
For all statistics in the second grouping (random variables), the team data
is considered irrelevant and no team adjustment is needed. Variables in the
DIPS event grouping are left as they are.
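One way to keep track of which denominator goes with which event is a simple
lookup table. The Python sketch below encodes only the pairings stated in
the text; the names are illustrative, and the team totals in the usage
example are invented to show the arithmetic, not real Braves data.

```python
# Opportunity basis for each in-play/baserunning event, as described above.
OPPORTUNITY_BASIS = {
    "1B": "DIBIP", "2B": "DIBIP", "3B": "DIBIP",   # hits: balls in play
    "DP": "RFB", "SB": "RFB", "CS": "RFB", "PkO": "RFB",  # runners at first
    "WP": "BR", "PB": "BR", "SF": "BR",            # all baserunners
}


def over_team(event, pitcher, team):
    """Events over team for one event type.

    Events with no entry in OPPORTUNITY_BASIS (DIPS events and the random
    variables) get no team adjustment, so 0.0 is returned for them.
    """
    basis = OPPORTUNITY_BASIS.get(event)
    if basis is None:
        return 0.0
    return pitcher[event] - pitcher[basis] * team[event] / team[basis]


# Pitcher counts from Glavine's raw line; team totals are HYPOTHETICAL.
pitcher_line = {"DP": 29, "RFB": 260, "ROE": 5}
hypothetical_team = {"DP": 150, "RFB": 1500, "ROE": 40}

dp_over_team = over_team("DP", pitcher_line, hypothetical_team)
roe_over_team = over_team("ROE", pitcher_line, hypothetical_team)
print(dp_over_team, roe_over_team)  # 3.0 0.0
```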
Once the team context has been considered, we can place all of the non-DIPS
statistics in their proper league average contexts. For each non-DIPS stat
the formula is the same:
DI Event = Opportunities * (Event (league) / Opportunities (league)) + events
over team
In the case of singles, opportunities would be DIBIP and events over team
would be 1BOT. In the case of WP, opportunities would be BR and events over
team would be WPOT. In the case of SH, opportunities would be RFB and events
over team would be zero.
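Here is the league-context step for singles as a short Python sketch. It
assumes the events-over-team count is added after the league rate has been
scaled by the pitcher's opportunities, which is the reading under which the
result is a count of events; the function name is illustrative.

```python
def defense_independent_events(opportunities, league_events,
                               league_opportunities, over_team):
    """League-average events for these chances, plus the over-team count."""
    league_rate = league_events / league_opportunities
    return opportunities * league_rate + over_team


# Glavine's singles: 680 DIBIP, league 14888 singles in 69395 BIP,
# and the 2.3 singles over team quoted earlier.
di_singles = defense_independent_events(680, 14888, 69395, 2.3)

# Events are rounded to the nearest integer.
print(round(di_singles))  # 148
```

For a random variable such as SH, the same function would be called with RFB
as the opportunities and an over-team value of zero.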
Care must be taken at this juncture because as you convert a pitcher's line
into defense independent equivalents, the RFB and BR values change. If you
expect him to have allowed more singles, the RFB increases. The defense
independent line must therefore be calculated in the order specified here,
with DIBIP and HR first, singles, doubles and triples second, RFB next, all
of the statistics which depend on RFB after that, then BR, then all of the
statistics which depend on BR.
Defense independent ordinary outs (event 2) are calculated by finding the
difference between the original "batter is safe" events and their defense
independent equivalents and subtracting that from the original out count (if
we expect batters to reach safely 6 extra times, that means the pitcher is
getting 6 fewer outs).
If you do all of this in the correct order, this is the defense independent
line you get for Tom Glavine.
# Type Count
2 Out 446
3 K 116
4 SB 7
5 Indifference 1
6 CS 5
8 PkO 2
9 WP 2
10 PB 0
11 BK 0
12 Advances 0
14 UBB 87
15 IBB 10
16 HBP 2
17 Interference 0
18 ROE 9
19 FC 2
20 1B 148
21 2B 41
22 3B 2
23 HR 26
25 SH 11
26 SF 6
27 DP 30
28 TP 0
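The ordinary-outs rule described before this table can be checked against
the two Glavine lines. The sketch below assumes the "batter is safe" events
are singles, doubles, triples, home runs and reached-on-error; the text does
not list them, so that set is an assumption, chosen because it reproduces
the 446 outs shown.

```python
# ASSUMPTION: these are the "batter is safe" events that trade off
# against ordinary outs. The text does not spell out the list.
SAFE_EVENTS = ["1B", "2B", "3B", "HR", "ROE"]

# Counts copied from Glavine's raw line and his defense independent line.
original_line = {"Out": 454, "1B": 156, "2B": 32, "3B": 1, "HR": 24, "ROE": 5}
di_line = {"1B": 148, "2B": 41, "3B": 2, "HR": 26, "ROE": 9}


def di_ordinary_outs(original, di, safe_events=SAFE_EVENTS):
    """Original outs minus the net extra times batters reach safely."""
    extra_safe = sum(di[e] - original[e] for e in safe_events)
    return original["Out"] - extra_safe


di_outs = di_ordinary_outs(original_line, di_line)
print(di_outs)  # 446
```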
D) Turning the Defense Independent Summary Line into DNRA
Once you have all of that complicated work done, it becomes surprisingly easy
to generate DNRA. All you need are accurate linear weights (where average
values yield zero LWRC). Now I generated my linear weights through the use
of some matrix algebra (a method called Markov chains), but anyone wanting to
duplicate my findings just needs the results of that research, which are
available in any number of places online.
Let's add another couple of columns to the table above. The first new column
will contain linear weight values for each of the events and the second will
multiply the counts by the linear weights. When you sum those multiplied
values up you get the pitcher's defense independent runs allowed above
average (DIRAAA).
# Type Count LW RC
2 Out 446 -0.256 -114.2
3 K 116 -0.281 -32.6
4 SB 7 0.185 1.3
5 Indifference 1 0.111 0.1
6 CS 5 -0.447 -2.2
8 PkO 2 -0.278 -0.6
9 WP 2 0.279 0.6
10 PB 0 0.287 0.0
11 BK 0 0.292 0.0
12 Advances 0 -0.444 0.0
14 UBB 87 0.305 26.5
15 IBB 10 0.172 1.7
16 HBP 2 0.328 0.7
17 Interference 0 0.614 0.0
18 ROE 9 0.504 4.5
19 FC 2 -0.379 -0.8
20 1B 148 0.460 68.1
21 2B 41 0.771 31.6
22 3B 2 1.060 2.1
23 HR 26 1.370 35.6
25 SH 11 -0.133 -1.5
26 SF 6 -0.065 -0.4
27 DP 30 -0.841 -25.2
28 TP 0 NA 0.0
TOTAL -4.6
In 2001, Glavine posted an ERA+ of 125, but we observe that he was nowhere
near that effective in reality. He was slightly above average (a negative
value means he gave up fewer runs than average) but not 25% above average.
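The total in the table is just a sum of counts times weights, which the
following Python sketch reproduces from the values shown. The triple play
row is left out because its weight is listed as NA and its count is zero;
variable names are illustrative.

```python
# (event, defense independent count, linear weight) from the table above.
# TP is omitted: its weight is NA in the table and its count is 0.
di_table = [
    ("Out", 446, -0.256), ("K", 116, -0.281), ("SB", 7, 0.185),
    ("Indifference", 1, 0.111), ("CS", 5, -0.447), ("PkO", 2, -0.278),
    ("WP", 2, 0.279), ("PB", 0, 0.287), ("BK", 0, 0.292),
    ("Advances", 0, -0.444), ("UBB", 87, 0.305), ("IBB", 10, 0.172),
    ("HBP", 2, 0.328), ("Interference", 0, 0.614), ("ROE", 9, 0.504),
    ("FC", 2, -0.379), ("1B", 148, 0.460), ("2B", 41, 0.771),
    ("3B", 2, 1.060), ("HR", 26, 1.370), ("SH", 11, -0.133),
    ("SF", 6, -0.065), ("DP", 30, -0.841),
]


def runs_above_average(table):
    """Sum of count * linear weight over every event in the table."""
    return sum(count * weight for _event, count, weight in table)


diraaa = runs_above_average(di_table)
print(round(diraaa, 1))  # -4.6
```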
To turn DIRAAA into DNRA, we need to know two additional facts. First we
need to know how many defense independent total outs Glavine is getting
credit for.
DIOuts = Outs + (2 * DP) + (3 * TP) + K + CS + PkO + FC + SH + SF
Second we need to know what the league average RS/27 Outs is. I have this
information for every league in major league history, and in this example,
the 2001 NL scored 4.68 R/27, and Glavine is credited with producing 647
DIOuts. From here, the formula for DNRA is simple.
DNRA = ((LgRS27 * (DIOuts / 27) + DIRAAA) / DIOuts) * 27
Note that DNRA is on a RA scale and not an ERA scale. In the modern era, a 5
ERA is bad, but a 5 RA is about average to maybe a little worse than average
(especially in the AL).
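Both formulas are sketched below in Python using the Glavine figures.
Summing the rounded counts in the defense independent table gives 648 outs
rather than the 647 quoted, presumably a rounding difference, so the DNRA
call uses the quoted 647 together with the rounded DIRAAA of -4.6. The
result is about 4.49, within rounding of the published figure. Function
names are illustrative.

```python
def di_total_outs(line):
    """DIOuts: every out credited in the defense independent line."""
    return (line["Out"] + 2 * line["DP"] + 3 * line["TP"] + line["K"]
            + line["CS"] + line["PkO"] + line["FC"] + line["SH"]
            + line["SF"])


def dnra(lg_rs27, di_outs, diraaa):
    """League-average runs over these outs, plus DIRAAA, per 27 outs."""
    return ((lg_rs27 * (di_outs / 27) + diraaa) / di_outs) * 27


# Out-producing events from Glavine's defense independent line.
glavine_di = {"Out": 446, "DP": 30, "TP": 0, "K": 116, "CS": 5,
              "PkO": 2, "FC": 2, "SH": 11, "SF": 6}

outs_from_table = di_total_outs(glavine_di)  # 648 from the rounded counts
glavine_dnra = dnra(lg_rs27=4.68, di_outs=647, diraaa=-4.6)
print(outs_from_table, round(glavine_dnra, 2))
```

Algebraically the formula reduces to LgRS27 + 27 * DIRAAA / DIOuts, which
makes it clear that a negative DIRAAA pulls DNRA below the league rate.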
Also note that by definition the DNRA and the league RS/G are related, so in
order to put pitchers on the same scale, you need to adjust for the league RS
rate. This is where DNRA+ comes in.
Just like ERA+, DNRA+ is calculated by finding the ratio between the league
RS rate and the pitcher's DNRA like so:
DNRA+ = (LgRS27 / DNRA) * 100
100 is average, 120 is an all star, 150 is outstanding and the replacement
level is commonly around 75.
Tom Glavine's DNRA in 2001 is 4.50 for a DNRA+ of 104.
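For completeness, the DNRA+ step as a one-function Python sketch, using the
league rate and the DNRA quoted for Glavine (the function name is
illustrative):

```python
def dnra_plus(lg_rs27, dnra):
    """Ratio of league RS/27 to DNRA, scaled so that 100 is average."""
    return (lg_rs27 / dnra) * 100


glavine_dnra_plus = round(dnra_plus(4.68, 4.50))
print(glavine_dnra_plus)  # 104
```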
Future Improvements
As I stated earlier there is room for improvement in the park factoring
process, both by accounting for the strength of the batters and pitchers who
happen to play in each park and by accounting for LH/RH splits.
In order to get a truer measure of a pitcher's skill, I'm currently pursuing
strength of schedule analysis as well.
I have also considered utilizing batted ball trajectories to improve the DIPS
analysis aspect of this metric, but at the moment I am skeptical about the
quality of batted ball data and about its usefulness. I remain open to the
possibility, however.
Research is ongoing and I am certainly open to suggestions from anyone
interested.
Thank you for reading and I hope I have made a positive impression and
avoided boring you to tears with this introduction.
Summary of Findings
Top 50 Pitchers by Career DNRA+ (since 1957, excluding 1999; minimum 5000
DIOuts)
First Last Outs DNRA+
Pedro Martinez 6893 164
Roger Clemens 13666 150
Randy Johnson 10075 145
Greg Maddux 12525 138
Sandy Koufax 6684 135
Mike Mussina 8394 134
Curt Schilling 8183 132
Kevin Brown 9036 131
John Smoltz 8228 128
Rich Gossage 5390 126
Andy Pettitte 5737 126
Tom Seaver 14283 124
Bret Saberhagen 7314 123
Nolan Ryan 16256 123
Sid Fernandez 5613 120
Ron Guidry 7124 120
Dave Stieb 8650 120
Juan Marichal 10576 120
Tom Gordon 5914 120
Andy Messersmith 6643 120
Rollie Fingers 5029 119
Sam McDowell 7560 119
Dean Chance 6425 118
Bob Stanley 5151 118
Gaylord Perry 16021 118
Bob Gibson 11601 118
Jimmy Key 7714 118
Kevin Appier 7122 118
Camilo Pascual 7545 118
Dwight Gooden 8118 118
Whitey Ford 6420 117
Bert Blyleven 14916 117
David Cone 8148 117
Jose Rijo 5655 116
Don Drysdale 9994 116
Tom Glavine 11132 116
Dennis Eckersley 9867 116
Fergie Jenkins 13506 115
Steve Rogers 8449 115
Orel Hershiser 8888 115
Jon Matlack 7111 114
Jim Maloney 5525 114
Don Wilson 5186 114
Larry Jackson 8983 114
David Wells 8934 113
Steve Carlton 15702 113
Rick Reuschel 10726 113
Chuck Finley 8934 113
Luis Tiant 10478 112
Mark Langston 8732 112
Top 50 Team Pitching Staffs
Yr Tm DNRA+
1994 ATL 126
1996 ATL 126
1998 ATL 126
1995 ATL 124
2002 NYA 124
2003 NYA 123
1997 ATL 121
2003 LAN 121
1988 NYN 120
2001 NYA 120
1966 LAN 119
1990 NYN 119
2004 CHN 117
1993 ATL 116
2001 CLE 116
2003 CHN 116
1980 NYA 115
1981 NYA 115
1984 LAN 115
1995 CLE 115
2001 BOS 115
2001 OAK 115
1971 CHA 114
1971 NYN 114
1994 SDN 114
1996 NYA 114
2001 CHN 114
2002 OAK 114
2003 ARI 114
1976 NYN 113
1981 HOU 113
1981 LAN 113
1982 NYA 113
1989 KCA 113
1994 MON 113
1997 NYA 113
2002 BOS 113
2003 BOS 113
2004 BOS 113
1977 LAN 112
1982 LAN 112
1982 PHI 112
1983 LAN 112
1983 PHI 112
1985 DET 112
1985 KCA 112
1985 LAN 112
1986 NYN 112
1990 BOS 112
1994 CLE 112
1995 SEA 112
2000 CLE 112
2003 OAK 112
2005 HOU 112
50 Worst Team Pitching Staffs
Yr Tm DNRA+
1957 PIT 89
1962 KC1 89
1965 WS2 89
1965 NYN 89
1967 KC1 89
1968 CAL 89
1971 CLE 89
1971 WS2 89
1977 SEA 89
1978 SEA 89
1982 MIN 89
1983 MIN 89
1984 OAK 89
1993 COL 89
1994 SLN 89
1996 DET 89
1996 ML4 89
1997 COL 89
2001 TBA 89
2003 SDN 89
2004 COL 89
2005 KCA 89
2005 TBA 89
1964 KC1 88
1966 KC1 88
1979 TOR 88
1982 OAK 88
1994 FLO 88
1995 ML4 88
1995 SFN 88
1996 SFN 88
2002 KCA 88
2004 KCA 88
1963 WS2 87
1964 NYN 87
1965 KC1 87
1966 NYN 87
1985 CLE 87
1989 CHA 87
1995 MIN 87
2000 KCA 87
2001 KCA 87
2002 MIL 87
1963 NYN 86
2002 COL 86
2003 DET 86
1998 FLO 85
2002 TBA 85
2003 TBA 84