by Rany Jazayerli
June 3, 2004
http://www.baseballprospectus.com/article.php?articleid=2934
It takes a lot these days to awaken me from my slumber and coerce me into
penning a column for BP. Between taking care of a baby daughter at home and
starting my own medical practice, the truly important things in life--like
baseball analysis--have gotten short shrift of late.
But finally, I have found a topic that arouses my passion. A question so
intriguing as to get my heart racing, my blood pumping, my brain thinking.
Finally, a puzzle worth being solved, a code worth being cracked.
That question, of course, is: "Does Alex Sanchez have the emptiest batting
average in major-league history?"
Consider the evidence. Bolstered by an obscene number of bunt hits, Sanchez
was hitting .359 going into Wednesday night's game, which ranked him third in
the American League. (By the way, who had the trifecta on a Melvin Mora-Ken
Harvey-Alex Sanchez top three at this point in the season?) But Sanchez's
impressive ability to hit singles is neutered by his inability to do anything
else: hit for power (eight extra-base hits), reach base by other means (four
walks, no HBPs), or make effective use of his speed (11 steals, 10 caught
stealings).
For the season, Sanchez is hitting .359/.371/.431. His batting average may
rank third in the league, but his .802 OPS ranks just 43rd--in a tie with Jose
Cruz, who's hitting .237.
Put succinctly, Sanchez's batting average is about as empty as Le Stade
Olympique. But is it the emptiest ever?
There are a number of ways to frame that question. Ever since baseball
analysts became aware that batting average didn't tell the whole story--which
is to say, ever since baseball analysis existed--there have been attempts to
measure all the different aspects of offensive performance not covered by
batting average. The first commonly used statistic that was designed to sweep
the floor of all the things that batting average left behind was Secondary
Average, which was originally designed (or at least propagated) by Bill
James, two decades ago.
James' original formula for secondary average was (BB + (TB – H) + SB)/AB.
The statistic covered the three primary elements of offense outside of
average--power, walks, and offensive speed--and had the added feature that a
league-average figure for secondary average would be almost the same as a
league-average figure for batting average. (Using this formula, the AL's
secondary average in 1985, for instance, was .260; the league batting average
was .261.)
Secondary average has had more of an impact on baseball analysis than you
might think. James, in his essay explaining secondary average to the world,
made an off-hand comment that he thought a player's overall offensive
contribution was approximately two-thirds batting average, one-third
secondary average. One of his readers decided to use this rough template--2/3
BA + 1/3 SA--as the foundation for a new statistic that he hoped would
quantify a player's performance in one number. That is how Clay Davenport
came up with the idea for Equivalent Average, or EqA.
There is a problem with James' original formula, which is that by counting
stolen bases without counting caught stealings, the formula measures only the
positive impact of offensive speed, and not its downside. While it's a useful
way of quantifying a player's diversity of skill, it's not a good way to
quantify the value of a player's talents outside of batting average. So we'll
make an adjustment to the formula, and also add HBPs into the equation:
Secondary Average = (BB + HBP + (TB – H) + SB – (2*CS))/AB
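As a rough illustration of how the adjustment changes things, here is a minimal Python sketch of both versions of the formula. The function name and the stat line are invented for this example and are not taken from any real player.

```python
def secondary_average(ab, h, tb, bb, sb, hbp=0, cs=0, adjusted=True):
    """Return secondary average.

    adjusted=False uses James' original form: (BB + (TB - H) + SB) / AB.
    adjusted=True adds HBP and subtracts two bases per caught stealing.
    """
    extra_bases = tb - h
    if adjusted:
        numerator = bb + hbp + extra_bases + sb - 2 * cs
    else:
        numerator = bb + extra_bases + sb
    return numerator / ab


# Hypothetical stat line, invented purely for illustration.
line = dict(ab=500, h=150, tb=210, bb=40, sb=20, hbp=5, cs=10)

original = secondary_average(**line, adjusted=False)
adjusted = secondary_average(**line)
print(f"original: {original:.3f}, adjusted: {adjusted:.3f}")
```

With this made-up line, 20 steals against 10 times caught net out to zero credit for speed, which is the point of the 2*CS penalty.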
The best secondary averages of all time?
Year  Player         SEC
-------------------------
2002  Barry Bonds    .955
2001  Barry Bonds    .941
2003  Barry Bonds    .831
2000  Mark McGwire   .797
1998  Mark McGwire   .786
Greatness can make for some pretty boring lists. A look at those numbers can
help explain how skills like power and walks can be so important when
secondary average is, in terms of raw numbers, only half as important as
batting average. It's because secondary average has considerably more
variance than batting average; no one's ever had a .955 batting average, to
the best of my knowledge. It's been 63 years since anyone hit .400; generally
about two dozen players a year will have a secondary average that high.
The flip-side of that is that while no one has ever made it through a full
season hitting .150, plenty of players have had a secondary average that low.
The lowest secondary averages of all time (min: 450 PA):
Year  Player         SEC
-------------------------
1968  Hal Lanier     .053
1915  Gus Getz       .055
1982  Doug Flynn     .056
1999  Mike Caruso    .060
1976  Duane Kuiper   .063
As a 21-year-old rookie shortstop in 1998, Mike Caruso hit .306, and had some
people predicting future greatness. The .060 secondary average he posted the
following year is a sign that Caruso was never destined to meet those
expectations.
So, what is Sanchez's secondary average? Try .044. There have been a little
over 12,000 player-seasons of 450+ PA in the modern era. Alex Sanchez is on
pace to have a lower secondary average than any of them.
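The .044 figure can be roughly reproduced from the numbers quoted earlier in this column. The sketch below is a back-of-the-envelope check, not an official calculation: the at-bat and total-base counts are not given in the text, so they are inferred here from the rounded .359 average and .431 slugging average, and that inference is an assumption.

```python
# Figures quoted in the column for Sanchez's season to date.
hits, walks, hbp = 65, 4, 0
steals, caught = 11, 10
avg, slg = 0.359, 0.431

# Assumption: AB and TB are inferred from rounded rate stats,
# so they could be off by one or two from the real totals.
at_bats = round(hits / avg)          # 65 / .359 is about 181
total_bases = round(slg * at_bats)   # .431 * 181 is about 78
extra_bases = total_bases - hits

# Adjusted secondary average: (BB + HBP + (TB - H) + SB - 2*CS) / AB
sec = (walks + hbp + extra_bases + steals - 2 * caught) / at_bats
print(f"AB ~ {at_bats}, TB ~ {total_bases}, SEC ~ {sec:.3f}")
```

The 10 caught stealings wipe out 20 of the 28 bases Sanchez would otherwise be credited with, which is how a .359 hitter ends up at roughly .044.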
Even if he raises his secondary average towards the stratospheric heights of
.100, he'll still have a strong claim to the title of Emptiest Batting
Average ever. That's because the players at the bottom of the secondary
average rankings weren't hitting an empty .330; they weren't hitting, period.
When Hal Lanier had a .053 secondary average, his batting average was just
.206. That's not an overrated hitter; that's a bad hitter.
One way to measure just how overrated Sanchez's batting average is: compare
it with his secondary average. First, though,
let's look at the opposite list: the players whose secondary average dwarfs
their batting average by the greatest amount:
Year  Player         SEC    AVG    Diff
----------------------------------------
2001  Barry Bonds    .941   .328   .613
2002  Barry Bonds    .955   .370   .586
2003  Barry Bonds    .831   .341   .490
1998  Mark McGwire   .786   .299   .487
1920  Babe Ruth      .775   .376   .400
OK, so this is the same list as the first one, more or less. I just couldn't
resist pointing out that Barry Bonds was underrated by his batting average
more than any player in major-league history--over the course of three
straight seasons in which he hit .328, .370, and .341. That's inhuman.
So where's the list of the players whose batting average towers over their
secondary average by the greatest amount?
Year  Player           AVG    SEC    Diff
------------------------------------------
2004  Alex Sanchez     .359   .044   .315
1898  Willie Keeler    .385   .135   .250
1915  Stuffy McInnis   .314   .066   .248
1914  Stuffy McInnis   .314   .071   .243
1971  Glenn Beckert    .342   .108   .234
1920  Stuffy McInnis   .297   .066   .231
It's not even close.
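To put a number on "not even close," here is a minimal Python sketch that recomputes the gap from the rounded AVG and SEC figures in the list above. The variable names are illustrative, and because the inputs are rounded to three places, the last digit of any difference may not match a calculation from raw totals.

```python
# (year, player, AVG, SEC) as listed above, rounded to three places.
rows = [
    (2004, "Alex Sanchez", 0.359, 0.044),
    (1898, "Willie Keeler", 0.385, 0.135),
    (1915, "Stuffy McInnis", 0.314, 0.066),
    (1914, "Stuffy McInnis", 0.314, 0.071),
    (1971, "Glenn Beckert", 0.342, 0.108),
    (1920, "Stuffy McInnis", 0.297, 0.066),
]


def gap(row):
    """Batting average minus secondary average."""
    _, _, avg, sec = row
    return avg - sec


ranked = sorted(rows, key=gap, reverse=True)
for row in ranked:
    print(f"{row[0]}  {row[1]:<15} {gap(row):.3f}")

# Distance between first and second place on the list.
margin = gap(ranked[0]) - gap(ranked[1])
print(f"margin over second place: {margin:.3f}")
```

Sanchez's lead over Keeler works out to about .065, more than three times the .019 that separates Keeler from the bottom of the list.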
Wee Willie might have Hit 'Em Where They Ain't, but he wasn't hitting 'em all
that far. He had 216 hits that year: one homer, two triples, seven doubles,
and 206 singles. Stuffy McInnis may be the King of the Empty Batting Average;
over an 11-year stretch from 1914 to 1924, the long-time first baseman for
the Philadelphia A's and Red Sox hit over .290 10 times, without ever posting
an OBP over .350 or a slugging average over .400.
There's one major problem with dead-ball era stats: we lack caught stealing
data for many of those years, meaning that the secondary averages for many of
the players in that era are artificially inflated. Keeler, for instance,
stole 28 bases in 1898, and gets credit for that in the secondary average
formula; we don't know how many times he was nailed trying to steal, so he
doesn't get penalized for his larceny. In other words, some of the players
from the dead-ball era may be even more overrated than they appear using this
formula.
So let's look at the question a different way, using a different formula. A
player like Sanchez, whose batting average represents most of his value,
should have an OBP and slugging average only slightly higher than his batting
average. We can use the ratio of OPS to batting average as a measure of this;
much like a jock taking the SAT gets 400 points just for writing his name, a
player starts with an OPS-to-average ratio of 2.000, because batting average
is a component of both OBP and slugging average. (For you smart-alecks who
want to point out that owing to sacrifice flies, OBP can be lower than
batting average, I have only two words for you: shut up.)
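For anyone who wants to see the 2.000 floor (and the sacrifice-fly loophole) in action, here is a small Python sketch. It uses the standard definitions AVG = H/AB, OBP = (H + BB + HBP)/(AB + BB + HBP + SF), and SLG = TB/AB; the function names and the three hitters are hypothetical, invented only to illustrate the ratio.

```python
def rates(ab, h, tb, bb=0, hbp=0, sf=0):
    """Return (AVG, OBP, SLG) from counting stats."""
    avg = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = tb / ab
    return avg, obp, slg


def ops_to_avg(ab, h, tb, bb=0, hbp=0, sf=0):
    """Ratio of OPS (OBP + SLG) to batting average."""
    avg, obp, slg = rates(ab, h, tb, bb, hbp, sf)
    return (obp + slg) / avg


# Hypothetical hitters, invented for illustration only.
singles_only = ops_to_avg(ab=100, h=30, tb=30)           # nothing but singles
with_sac_flies = ops_to_avg(ab=100, h=30, tb=30, sf=2)   # the smart-aleck case
slugger = ops_to_avg(ab=100, h=30, tb=60, bb=20)         # power and walks

print(f"{singles_only:.3f} {with_sac_flies:.3f} {slugger:.3f}")
```

A hitter with nothing but singles sits at 2.000, sacrifice flies can nudge the ratio slightly below that, and every walk or extra base pushes it higher.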
For all my comrades in the Rob Deer Fan Club, here's the list of the
highest ratios of OPS to batting average in major-league history:
Year  Player           OPS    AVG   Ratio
-----------------------------------------
2001  Barry Bonds    1.379   .328   4.206
1995  Mark McGwire   1.125   .274   4.100
1998  Mark McGwire   1.222   .299   4.093
1999  Mark McGwire   1.120   .278   4.026
1991  Rob Deer        .700   .179   3.918
And the flip-side of the list:
Year  Player            OPS    AVG   Ratio
-------------------------------------------
1898  Willie Keeler     .830   .385   2.156
1984  Kirby Puckett     .655   .296   2.212
1966  Don Kessinger     .608   .274   2.218
1915  Stuffy McInnis    .699   .314   2.228
1968  Horace Clarke     .512   .230   2.230
By this measure, Sanchez loses his top spot on the list, finishing just out
of the Top Five at 2.239. Which makes this the perfect time to unveil his
secret weapon in the Quest for Batting Average Purity: the bunt single.
Fully 20 out of his 65 hits--over 30%--have come without a full swing of the
bat. Leading the universe in bunt hits might get Sanchez a lot of press, but
it actually diminishes his value, because 99% of bunt hits result in runners
advancing only one base. (This explains why, in situations with a man on
second base, and no one at third, Sanchez has six hits--and only four RBIs.)
Giving Sanchez extra credit because he has so many bunt hits is like giving
Barry Bonds extra credit because so many of his walks are intentional.
How overrated is Sanchez as a leadoff hitter? Let's put it this way: he's on
pace to finish with 206 hits, and only 79 runs scored. Only three players in
major-league history have tallied that many hits, and scored so few runs. And
none of them batted leadoff for a team that ranked fifth in the league in
runs scored, in an era of high offense.
So to answer the riddle that spawned this unholy column: does Sanchez have
the emptiest batting average of all time? By some measures, yes. By some
measures, not quite.
But give him time. It's still early.