Soccer stats, Prozone, Opta: The trouble with soccer's statistical revolution.

Jan. 27 2011 12:53 PM

Numberless Wonders

The trouble with soccer's statistical revolution.

Soccer isn't baseball, in other words, and even leading researchers in soccer numerology doubt they'll discover a universally applicable set of metrics—there are simply too many differences between teams and cultures, and too much chaotic complexity within the game itself. Sure, soccer has passes, shots, crosses, free kicks, and so on. But there are also long sequences of play when, say, a defender boots the ball forward, and two players jump to contest it in the air, and it sort of slouches off to one side, and the opposing right back (who's stuck covering the midfield because his teammates are scattered out of position) gets to it first, and angles what looks like a long pass to the left winger, only the ball swerves in the air and is picked off by the other team's goalkeeper, who rolls it back to the defender, who boots it forward, and so on. How do you account for that?

Bill Gerrard, a Leeds University professor who has worked with Beane as well as several English clubs, told me that the key issue in soccer-stat analysis is "how to weight the different player actions in the overall calculation." It's not enough to know how many tackles your central defenders make; you also have to know how well the tackles-made stat helps explain wins and losses. (Maybe not very well: Good defenders position themselves to cut off attacking moves without attempting a tackle in the first place.) If metrical analysis is going to do for soccer what it's done for baseball, it needs to produce a set of formulas that are transferable, broadly speaking, across teams and leagues. In baseball, Gerrard says, there's "pretty universal agreement about what the stats of top players look like compared to journeyman players." In soccer, beyond the roaringly obvious (scoring a lot of goals is generally a good thing), there's no such agreement to be found.

And Gerrard told me that he doubts that such agreement will be found. Teams operate under divergent tactical philosophies, player movement is interdependent, and a swarm of unanalyzable factors like "collective motivation" and "unity of vision" contribute something (but how much?) to every outcome. Stats can help soccer clubs, Gerrard says, with metrics designed around the clubs' specific needs. If Arsenal's whirring cogs tell them they need a defensive midfielder with ball-retention skills, an Arsenal-specific formula could help them isolate the right player to bid on. But even there, the corporate-contractor nature of the stat-trackers imposes potential distortions. Match Analysis President Mark Brunkhart told me that over the last decade, "a number of companies have very successfully marketed the idea that certain stats that are costly to measure, and for which they just happened to sell six-figure systems, were critically important." If the stat-tracking companies are engaged in that kind of marketing, how can clubs be sure that the numbers they're getting are the right ones?


The rise of for-profit sabermetrics marks a big difference between the quest for better soccer stats and the baseball story that helped inspire it. Baseball-focused empiricism started as a grassroots phenomenon, centered on Bill James' Baseball Abstracts, in the 1970s and 1980s. As Jamesian discoveries began to infiltrate the sport, teams brought on their own analysts, trying to keep at least some of the game's secrets in-house. Yet today, baseball wonkery remains largely open-source—any lost soul can still browse to Baseball Prospectus and FanGraphs to be enlightened about the difference between VORP and RARP.

Because of the highly fluid, stat-unfriendly nature of soccer, fans have been mostly shut out from its version of the numbers revolution, which has been team- and business-driven from the start. Nobody's sharing their metrics, because nobody wants to lose the competitive advantage they hope their metrics will confer. Services like the Guardian's brilliant chalkboards feature let fans access a limited range of Opta data, but the clubs' formulas for using those data are guarded like Cold War intelligence. In 1999, baseball analyst Voros McCracken changed the game by introducing the concept of defense-independent pitching statistics in a public Usenet newsgroup. Now, he works as an analyst in European soccer—but what he's doing, or for whom, is a secret.

It's hard to imagine a revolution in understanding a popular sport that could entirely circumvent that sport's followers. But that, weirdly, is what the soccer clubs seem to be aiming at: a great, obscurantist leap forward that will enable them to win more matches without anyone outside their own offices knowing precisely why. You can't really blame the clubs for this; it's their job to win more matches, after all. But in the meantime, fans are left looking in on a world of hidden complexity, a world in which experts sift through data we can't see to make decisions we can't understand. This isn't exactly the bright beam of American math the media keep anticipating. Instead, it's as if the more you try to quantify soccer, the more mysterious it gets.
