Number Crunching

Why doesn’t football have a Bill James?

Dec. 19, 2003, 10:50 AM

In 1977, a boiler-room attendant named Bill James revolutionized baseball analysis with a self-published pamphlet called the Baseball Abstract, sweeping conventional wisdom aside with his radical reinterpretation of the statistical record. James’ insights, since embraced by the front offices of the Oakland A’s, the Boston Red Sox, and the Toronto Blue Jays, should have inspired the serious students of America’s other national game, football, to overturn their status quo. But a quarter-century later, nobody has. Why?

For one thing, a football game is several orders of magnitude more complex than a baseball game. For another, nobody has yet determined what the most useful data to collect might be. And finally there’s the problem of data points: The 16-game NFL season leaves less stuff to count than Major League Baseball’s 162 games.

None of this stopped Virgil Carter from giving it his best shot. Carter, who wrote one of the earliest technical papers on pro football, was no mere boiler-room attendant. As a statistics major at Brigham Young University, he starred as quarterback from 1964-66 and was drafted by the Chicago Bears.

Soon after launching his NFL career, Carter enrolled as a part-time student in Northwestern University’s MBA program in the offseason. Business school wasn’t just something to fall back on—it was a computerized venue in which to work out some of his football theories. Carter acquired the play-by-play logs for the first half of the 1969 NFL season and started the long slog of entering data: 53 variables per play, 8,373 plays. After five or six months, Carter had produced 8,373 punch cards.

By today’s computing standards, Carter’s data set was minuscule and his hardware archaic. To run the numbers, he reserved time on Northwestern’s IBM 360 mainframe. Processing a half-season query would take 15 or 20 minutes—something today’s desktop computers could do in nanoseconds. In one research project, Carter started with the subset of 2,852 first-down plays. For each play, he determined which team scored next and how many points they scored. By averaging the results, he was able to learn the “expected value” of having the ball at different spots on the field.
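
Carter's averaging procedure is simple enough to sketch in a few lines of Python. What follows is an illustration under stated assumptions, not a reconstruction of his program: the function name, the 10-yard zones, and the sample plays are all invented for the example. Each play is recorded as a field position (yards from the offense's own goal line) and the value of the next score, counted as positive if the offense scored next and negative if the defense did.

```python
from collections import defaultdict

def expected_points_by_zone(plays, zone_width=10):
    """Average the signed next-score value of first-down plays per field zone.

    plays: iterable of (yards_from_own_goal, next_score_points), where
    next_score_points is positive if the offense scored next and negative
    if the defense did. Both the layout and the zone width are assumptions.
    """
    totals = defaultdict(lambda: [0.0, 0])  # zone -> [sum of points, play count]
    for yards, points in plays:
        zone = (yards // zone_width) * zone_width  # e.g. 45 falls in zone 40
        totals[zone][0] += points
        totals[zone][1] += 1
    return {zone: s / n for zone, (s, n) in sorted(totals.items())}

# Made-up plays for illustration only; these are not Carter's data.
sample_plays = [(5, -7), (8, 3), (12, -3), (45, 7), (48, -3), (52, 3), (95, 7), (97, 3)]
zones = expected_points_by_zone(sample_plays)
for zone, value in zones.items():
    print(f"own {zone:>2}-{zone + 9}: {value:+.1f}")
```

With only eight invented plays the averages mean nothing. The point is the shape of the calculation: with thousands of real first downs per run, each zone's average becomes an estimate of what possession at that spot is worth.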

Carter deduced that the value is negative when the ball is inside a team’s own 20-yard line—that is, that the team playing defense will likely score before the one playing offense. Carter and his professor Robert E. Machol co-authored a paper based on his findings, “Operations Research on Football,” for a statistical journal. Carter’s insight proved prescient: In the 1998 edition of the book The Hidden Game of Football, the statistically savvy authors ran a similar study and came to the same conclusion. The expected value is essentially linear, starting at -2 points at your own goal line, moving to +2 at midfield, and rising to +6 at the opponents’ end zone.
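
Taken at face value, those three reference points define a straight line. As a worked sketch (the symbols are notation introduced here, not the book's): let x be the distance in yards from a team's own goal line and EV(x) the expected value of a first down at that spot.

```latex
\[
EV(x) \approx -2 + 0.08\,x, \qquad 0 \le x \le 100
\]
\[
EV(0) = -2, \qquad EV(50) = -2 + 0.08 \cdot 50 = +2, \qquad EV(100) = -2 + 0.08 \cdot 100 = +6
\]
```

The slope is the 8-point swing spread over 100 yards, or 0.08 points per yard. On this approximation the break-even point, where EV(x) = 0, falls at x = 25, a team's own 25-yard line. That is in the same neighborhood as Carter's finding that the value is negative inside the 20.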

Carter continued his football studies after graduating from Northwestern in 1970 and leaving the Bears for the Cincinnati Bengals, where he led the team to the division title in his first season. That offseason Carter taught math and statistics at Xavier University and gave dinner seminars for the software division of the A.O. Smith Corporation.

A.O. Smith used the seminars to encourage business executives to buy its software, and Carter used A.O. Smith’s computer gear. Carter extended his play-by-play database and used the results in a weekly column, “The Computerized Quarterback,” that ran in about 35 newspapers. One result showed that passing teams were more likely to score from their opponent’s 15- or 10-yard line than from their 5-yard line. Neither Carter nor his quarterbacks coach—Bill Walsh—had the gumption to suggest to Bengals owner Paul Brown that the team’s receivers should run out of bounds at the 10 rather than fighting for extra yardage. (Though Sports Illustrated’s Paul Zimmerman does credit the duo with inventing the cerebral West Coast offense.)

The column died a couple of years later when A.O. Smith decided it didn’t want to update the database anymore. Given the high cost and slow processing power of computers in that era, it’s no surprise that data-intensive football analysis stalled in the ’70s. It is surprising that, 30 years later, quantitative analysis of football remains mostly on the fringes, niche-marketed to fantasy gamers and gamblers. Carter himself tried that angle, marketing a “Computerized Quarterback” point-spread-analysis service in 1981. He stopped after a year when only 40 or 50 people ponied up the $175 subscription fee.

Aaron Schatz, who operates the stats-heavy Web site Football Outsiders, says there’s a simple reason why football research hasn’t caught up. “Frankly, Bill James. Baseball analysis exists as it does today because Bill James is one of the people who, in American intellectual history, is a force of nature.”

James’ greatest contribution to the baseball base was the simple but powerful idea that “a hitter’s job is not to compile a high batting average. The job is to create runs.” In the years since Carter began stumbling around with his punch cards, nothing so powerful—and certainly nothing so simple—has emerged from the gridiron. And for good reason: It’s easy to determine the result of each event in a baseball game—double, out, strike—and to assign the individuals involved numbers based on the outcome, but football’s not that cut and dried. For one thing, specialization—running backs have entirely different tasks than wide receivers—makes it more difficult to come up with a single, grand, unifying theory of football. But most significantly, though it’s clear that every football player’s end goal is to put points on the scoreboard—or, for a defender, to keep them off—it’s not clear how to extrapolate a pancake block or an open-field tackle into a point total.

As demonstrated by Carter’s 53-variable system, each discrete event in football is ridiculously more elaborate than its baseball parallel. While many baseball interactions can be modeled as either one-on-one (batter vs. pitcher) or one-on-zero (fielder vs. ball), each 11-on-11 football play can feature innumerable potential interactions, determined in part by the innumerable different formations and play calls available to both offense and defense. With all that built-in complexity, it’s difficult to assign credit and blame to an individual—that is, to decide whether a running back gained five yards because of, or in spite of, his offensive line. Were the Denver Broncos geniuses for drafting Terrell Davis, Olandis Gary, and Mike Anderson in the late rounds—or were they just average backs running behind a great line?

Statistically inclined baseball fans complain that sportswriters evaluate players with teammate-dependent stats like RBIs, but football players can be evaluated only with teammate-dependent stats. No football statistic, with the possible exception of a kicker’s touchback percentage, represents individual accomplishment. And these are just the biases when numbers get assigned: There are large swaths of the game—most notably offensive-line play—that have yet to be described with useful numbers.

Just as there are difficulties associated with quantifying individual ability, there are similar problems with analyzing teams. In football, the scoreboard changes strategy far more often and to a greater degree than in baseball. Whereas a baseball team tries to get hits and score runs in essentially the same fashion no matter what the score, if a football team is down by two touchdowns in the second half, they’ll often abandon half their offense and rely solely on the pass.

Jim Schwartz knows from experience why most conventional stats make no sense. When he was a coach with Baltimore in 1996, he says, the Ravens were No. 2 in the NFL in passing because they were always behind and desperately heaving the ball to catch up. Schwartz, now the defensive coordinator with the Tennessee Titans, studied econometrics at Georgetown University, then earned what he calls a “Ph.D. in footballology” as an assistant coach with the Cleveland Browns under Bill Belichick. Schwartz took the Jamesian initiative to suss out whether certain statistics correlated with winning and was surprised to learn, after studying five years of data, that fumbles were evenly distributed between winning and losing teams. Belichick was incredulous when Schwartz reported his finding. “He was like, ‘Good teams don’t fumble.’ ”

When reviewing game film, Schwartz uses a simple grading system: He gives a plus (positive impact), a minus (negative impact), or a zero (no impact) to each player on each play. “You take those and then you can push them into an equation,” he says. “You basically have an 11-variable equation and the result is yards gained. Over the season, over 1,000 plays, you can isolate a variable.” Schwartz hopes to use his data to make personnel decisions: If a minus play by a defensive lineman costs the team on average more yards than a minus play by a linebacker, then perhaps linemen should be more of a priority in the draft or free agency.
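
Schwartz does not spell out his equation, so the sketch below assumes the most familiar candidate: an ordinary least-squares fit in which yards gained on each play is modeled as a baseline plus a weight for each player's grade (+1, 0, or -1). The function names are hypothetical, the data are made up, and the example uses three players rather than 11 to keep it short. The same code would accept 11 grades per play.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("grades are collinear; cannot isolate each player")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_grade_weights(grades, yards):
    """Least-squares fit of yards ~= baseline + sum(weight_i * grade_i).

    Returns [baseline, weight_1, ..., weight_k] via the normal equations.
    """
    rows = [[1.0] + [float(g) for g in play] for play in grades]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, yards)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up grades for three players on eight plays, with yardage constructed
# as 4 + 2*g1 + 1*g2 + 0.5*g3. Illustration only; not Schwartz's data.
grades = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
          (0, 0, 1), (0, 0, -1), (1, 1, 1), (0, 0, 0)]
yards = [6.0, 2.0, 5.0, 3.0, 4.5, 3.5, 7.5, 4.0]
weights = fit_grade_weights(grades, yards)
print([round(w, 3) for w in weights])
```

Because the invented yardage was built from known weights, the fit recovers them: a baseline of 4 yards, and 2, 1, and 0.5 yards per grade point for the three players. On real film grades the weights would be noisy estimates, which is why Schwartz talks about needing a season's worth of plays to isolate a variable.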

At Football Outsiders, Aaron Schatz has his own sets of equations: His team offensive- and defensive-efficiency numbers take into account the score and time of the game and make adjustments for schedule strength. He’s also imported concepts, like the notion of the replacement-level player, from advanced baseball analysis. Another innovative football stat Web site, Two Minute Warning, has done some regression analysis for Schwartz.

The data available for football geniuses to crunch, and the CPUs for crunching them, far surpass what was available to Virgil Carter. Around three-fourths of NFL teams shell out more than $10,000 per year for STATS, Inc.’s databases, which can spit out customized reports on everything from play selection in the red zone to passing yardage in blitz situations. But, as Bill James showed, the difficulty isn’t in extracting and plotting the data, it’s in sifting through the morass with the right question to determine which of the game’s biases are rooted in empirical fact. At this point, all we really know is that good teams establish the run on offense (except the 2002 Tampa Bay Bucs), stop the run on defense (except the 1997 Denver Broncos), and, most definitely, don’t fumble the ball away.