# Why doesn't football have a Bill James?

Dec. 19 2003 10:50 AM

# Number Crunching

In 1977, a boiler-room attendant named Bill James revolutionized baseball analysis with a self-published pamphlet called the Baseball Abstract, sweeping conventional wisdom aside with his radical reinterpretation of the statistical record. James' insights, since embraced by the front offices of the Oakland A's, the Boston Red Sox, and the Toronto Blue Jays, should have inspired the serious students of America's other national game, football, to overturn their status quo. But a quarter-century later, nobody has. Why?

For one thing, a football game is several orders of magnitude more complex than a baseball game. For another, nobody has yet determined what the most useful data to collect might be. And finally there's the problem of data points: The 16-game NFL season leaves less stuff to count than Major League Baseball's 162 games.

Josh Levin

Josh Levin is Slate’s editorial director.

None of this stopped Virgil Carter from giving it his best shot. Carter, who wrote one of the earliest technical papers on pro football, was no mere boiler-room attendant. A statistics major at Brigham Young University, he starred at quarterback from 1964 to 1966 and was drafted by the Chicago Bears.

Soon after launching his NFL career, Carter enrolled as a part-time student in Northwestern University's MBA program in the offseason. Business school wasn't just something to fall back on—it was a computerized venue in which to work out some of his football theories. Carter acquired the play-by-play logs for the first half of the 1969 NFL season and started the long slog of entering data: 53 variables per play, 8,373 plays. After five or six months, Carter had produced 8,373 punch cards.

By today's computing standards, Carter's data set was minuscule and his hardware archaic. To run the numbers, he reserved time on Northwestern's IBM 360 mainframe. Processing a half-season query took 15 or 20 minutes, a job today's desktop computers could finish in a fraction of a second. In one research project, Carter started with the subset of 2,852 first-down plays. For each play, he determined which team scored next and how many points it scored. By averaging the results by field position, he was able to estimate the "expected value" of having the ball at different spots on the field.
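The averaging step described above can be sketched in a few lines of Python. This is only an illustration, not Carter's actual program. The function name `expected_points`, the 10-yard bucket size, and the sample plays are all assumptions made for the example. Each play is recorded as its distance in yards from the offense's own goal line plus the points of the next score, counted as positive if the offense scored next and negative if the defense did.

```python
from collections import defaultdict


def expected_points(plays, bin_size=10):
    """Average the signed value of the next score by field position.

    plays: iterable of (yards_from_own_goal, next_score) pairs, where
    next_score is positive if the team with the ball scored next,
    negative if its opponent did, and 0 if nobody scored.
    Returns {bucket_start_yard: average_next_score}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for yard_line, next_score in plays:
        # Group plays into buckets such as 0-9, 10-19, ... yards.
        bucket = (yard_line // bin_size) * bin_size
        totals[bucket] += next_score
        counts[bucket] += 1
    return {b: totals[b] / counts[b] for b in sorted(counts)}


# Made-up sample data, for illustration only.
sample_plays = [(5, -7), (8, 3), (55, 7), (52, 0), (58, -3)]
print(expected_points(sample_plays))
```

With real play-by-play logs, the buckets near a team's own goal line would be the ones to watch for negative averages.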

Carter deduced that the value is negative when the ball is inside a team's own 20-yard line—that is, that the team playing defense will likely score before the one playing offense. Carter and his professor Robert E. Machol co-authored a paper based on his findings, "Operations Research on Football," for a statistical journal. Carter's insight proved prescient: In the 1998 edition of the book The Hidden Game of Football, the statistically savvy authors ran a similar study and came to the same conclusion. The expected value is essentially linear, starting at -2 points at your own goal line, moving to +2 at midfield, and rising to +6 at the opponents' end zone.
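The three reference points quoted above pin down a straight line. As a rough sketch, let $x$ be the distance in yards from a team's own goal line. The symbol $x$ and the exact linear form are assumptions for illustration, since the source says only that the relationship is "essentially linear".

$$
\mathrm{EV}(x) \approx -2 + \frac{6 - (-2)}{100}\,x = -2 + 0.08\,x, \qquad 0 \le x \le 100
$$

This gives $\mathrm{EV}(0) = -2$, $\mathrm{EV}(50) = +2$, and $\mathrm{EV}(100) = +6$, matching the figures in the text. Under this approximation the break-even point falls at $x = 25$, which is consistent with Carter's finding that possession deep in a team's own territory carries negative expected value.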

Carter continued his football studies after graduating from Northwestern in 1970 and leaving the Bears for the Cincinnati Bengals, where he led the team to the division title in his first season. That offseason Carter taught math and statistics at Xavier University and gave dinner seminars for the software division of the A.O. Smith Corporation.

A.O. Smith used the seminars to encourage business executives to buy its software, and Carter used A.O. Smith's computer gear. Carter extended his play-by-play database and used the results in a weekly column, "The Computerized Quarterback," that ran in about 35 newspapers. One result showed that passing teams were more likely to score from their opponent's 15- or 10-yard line than from their 5-yard line. Neither Carter nor his quarterbacks coach—Bill Walsh—had the gumption to suggest to Bengals owner Paul Brown that the team's receivers should run out of bounds at the 10 rather than fighting for extra yardage. (Though Sports Illustrated's Paul Zimmerman does credit the duo with inventing the cerebral West Coast offense.)

The column died a couple of years later when A.O. Smith decided it didn't want to update the database anymore. Given the high cost and slow processing power of computers in that era, it's no surprise that data-intensive football analysis stalled in the '70s. It is surprising that, 30 years later, quantitative analysis of football remains mostly on the fringes, niche-marketed to fantasy gamers and gamblers. Carter himself tried that angle, marketing a "Computerized Quarterback" point-spread-analysis service in 1981. He stopped after a year when only 40 or 50 people ponied up the \$175 subscription fee.

Aaron Schatz, who operates the stats-heavy Web site Football Outsiders, says there's a simple reason why football research hasn't caught up. "Frankly, Bill James. Baseball analysis exists as it does today because Bill James is one of the people who, in American intellectual history, is a force of nature."

James' greatest contribution to baseball analysis was the simple but powerful idea that "a hitter's job is not to compile a high batting average. The job is to create runs." In the years since Carter began stumbling around with his punch cards, nothing so powerful, and certainly nothing so simple, has emerged from the gridiron. And for good reason: It's easy to determine the result of each event in a baseball game (double, out, strike) and to assign the individuals involved numbers based on the outcome, but football's not that cut and dried. For one thing, specialization makes it more difficult to come up with a single, grand, unifying theory of football: Running backs have entirely different tasks from wide receivers. But most significantly, though it's clear that every football player's end goal is to put points on the scoreboard (or, for a defender, to keep them off), it's not clear how to translate a pancake block or an open-field tackle into a point total.

As demonstrated by Carter's 53-variable system, each discrete event in football is far more elaborate than its baseball parallel. While many baseball interactions can be modeled as either one-on-one (batter vs. pitcher) or one-on-zero (fielder vs. ball), each 11-on-11 football play can feature innumerable potential interactions, determined in part by the innumerable formations and play calls available to both offense and defense. With all that built-in complexity, it's difficult to assign credit and blame to an individual: Did a running back gain five yards because of his offensive line or in spite of it? Were the Denver Broncos geniuses for drafting Terrell Davis, Olandis Gary, and Mike Anderson in the late rounds, or were those three just average backs running behind a great line?

Statistically inclined baseball fans complain that sportswriters evaluate players with teammate-dependent stats like RBIs, but football players can be evaluated only with teammate-dependent stats. No football statistic, with the possible exception of a kicker's touchback percentage, represents individual accomplishment. And these are just the biases when numbers get assigned: There are large swaths of the game—most notably offensive-line play—that have yet to be described with useful numbers.

Just as there are difficulties associated with quantifying individual ability, there are similar problems with analyzing teams. In football, the scoreboard changes strategy far more often and to a greater degree than in baseball. Whereas a baseball team tries to get hits and score runs in essentially the same fashion no matter what the score, if a football team is down by two touchdowns in the second half, they'll often abandon half their offense and rely solely on the pass.