Hey, Robot Ref! Are You Blind?

Should the sports world replace human umpires with computers?

Sept. 02, 2008, 4:48 PM

Hawk-Eye technology

Disputes over line calls used to be one of the main joys of tennis; this, after all, is John McEnroe’s game. But fans rarely see players explode in rage anymore. In high-profile matches (i.e., those broadcast on TV), human umpires have largely been replaced by a machine called Hawk-Eye. The system is a kind of computerized ump that stitches together video footage from several high-speed cameras to produce a 3-D simulation of the ball as it approaches and bounces off the ground. Hawk-Eye’s decisions are final: When a player challenges an umpire’s call, the system displays its view of what just happened, then displays a judgment on the screen (in or out) that the human umpires are compelled to accept.

Hawk-Eye represents one extreme in the growing adoption of technology to solve disputes in sports. On the other side, you’ve got Major League Baseball, which has long resisted any kind of instant-replay system. Last week, pro baseball played its first games under a new rule that lets umpires review video in the limited scenario of “boundary calls”—essentially determining whether a home run was really a home run. (The replay system is so limited that in the week it’s been active, it hasn’t been deployed once.) Unlike tennis, which has made Hawk-Eye the ultimate authority, MLB, the NFL, the NBA, and the NHL all give human officials the final say in interpreting instant-replay footage.

Even so, instant replay alters how fans and players approach the game. Baseball officials have vowed not to expand the current replay system, but that will be a difficult promise to keep. If video can help an ump determine whether a ball went over the fence, why can’t it help with every other call a baseball umpire has to make? That’s the lure of video: It promises a measure of certainty in an otherwise uncertain endeavor. Place enough high-speed, high-resolution cameras at enough points around the field of play and you’ll eventually get at the absolute truth of any play, the thinking goes. The trouble is, technology can introduce as much uncertainty as it resolves.

For one thing, photography doesn’t give clear-cut answers. From one angle, a ball may look in, while from another it looks out. Sure, umpires can decide which replay is most reflective of what actually happened. But umpires are biased—studies show that officials tend to favor certain players and teams based on race, jersey colors, and the size of the home crowd. Two umpires can look at the same instant replay and see different things. The whole point of going to replay was to move away from umps’ screw-ups. Computers, on the other hand, are free of hate and idiosyncrasies. So why don’t we move to the tennis model, letting a computer be the ultimate decision-maker?

Sure, I’m making a slippery-slope argument, and it may seem far-fetched to think that baseball or any other team sport will let machines analyze, rather than just record, what happens on the field. But you can’t dismiss the slippery slope when some sports have already slid down: Tennis adopted Hawk-Eye after several high-profile matches were marred by bad calls. Hawk-Eye is installed at the U.S. Open’s two stadium courts, as well as at Wimbledon, the Australian Open, and other major tournaments. (It’s not necessary at the French Open because the ball leaves a visible mark on the red clay.) Since 2006, there have been more than 550 Hawk-Eye challenges at the U.S. Open, and 30 percent resulted in reversed calls. Players have occasionally questioned its decisions, but lately many have been agitating for Hawk-Eye to be used more widely. Roger Federer said recently that his biggest complaint about Hawk-Eye is that it isn’t installed everywhere. Fans, too, seem to enjoy the challenge system—the crowds at the U.S. Open watch the replays intently and regularly cheer the results.

The most striking thing about Hawk-Eye is that it’s perceived as such a success despite being demonstrably fallible. According to a fascinating paper by Harry Collins and Robert Evans of Cardiff University, the system’s manufacturer reports its average error as 3.6 millimeters. The International Tennis Federation, which tests the line-calling equipment, allows for Hawk-Eye to be off by as much as 10 millimeters in some situations. This means that if a ball lands nine millimeters out, a 10-millimeter error in the wrong direction could lead Hawk-Eye to call it in by one millimeter.

It isn’t terrible that Hawk-Eye is sometimes wrong—after all, humans often make mistakes. What is odd, though, is that the system’s designer, Hawk-Eye Innovations, has never explained these failures or how the system arrives at its decisions. Hawk-Eye uses up to six cameras placed around the court, but the graphic that it shows to judges and to viewers on TV does not include actual footage from any of those cameras. Instead, the system creates a composite of what it thinks happened to the ball. Collins and Evans argue that these composites subtly trick viewers. The simulation takes on an air of reality, even infallibility, when in fact it is only a statistical estimate. At the very least, the researchers say, Hawk-Eye should report its confidence—that it is X percent sure of its ruling. They also push for a more general “health warning”: When CBS broadcasts Hawk-Eye simulations on TV, it should remind viewers: “This is only a virtual representation of reality. It’s not what actually happened.”
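To see what a reported confidence might look like, here is a minimal Python sketch. It rests on assumptions that are not in the Collins and Evans paper or in anything Hawk-Eye Innovations has published: that the system's error is bell-shaped (normally distributed) and centered on zero, and that the manufacturer's 3.6-millimeter "average error" is a mean absolute error. The function names are hypothetical, made up for this illustration.

```python
import math

# Manufacturer's reported average error, as cited in the article.
MEAN_ABS_ERROR_MM = 3.6


def sigma_from_mean_abs_error(mae_mm):
    """Convert a mean absolute error into a standard deviation.

    ASSUMPTION: errors follow a zero-mean normal distribution, for which
    mean absolute error = sigma * sqrt(2 / pi).
    """
    return mae_mm * math.sqrt(math.pi / 2)


def confidence_ball_was_in(measured_mm, mae_mm=MEAN_ABS_ERROR_MM):
    """Estimate the probability that the ball really touched the line.

    measured_mm is the system's own estimate: positive means "in by this
    many millimeters," negative means "out by this many millimeters."
    ASSUMPTION: no prior knowledge about where the ball landed, so the
    answer is just the normal cumulative distribution at measured / sigma.
    """
    sigma = sigma_from_mean_abs_error(mae_mm)
    return 0.5 * (1 + math.erf(measured_mm / (sigma * math.sqrt(2))))


if __name__ == "__main__":
    for measured in (1.0, 5.0, 10.0):
        pct = confidence_ball_was_in(measured)
        print(f"Called in by {measured:.0f} mm: about {pct:.0%} sure")
```

Under these assumptions, a ball called in by a single millimeter earns a confidence not much better than a coin flip, while a ball called in by 10 millimeters is close to certain. The real system's error behavior may look quite different, which is part of the researchers' complaint: Nobody outside the company knows.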

Those who call for increased use of instant replay in sports often point to major mistakes by umps. Every sport has its signal event, the blown call that proves that refs would do better with video backup. In baseball, it’s Game 1 of the 1996 American League Championship Series between the Yankees and the Orioles. In the bottom of the eighth with the Yankees trailing, Derek Jeter hit a fly ball to right field. Richie Garcia, the right-field ump, watched Baltimore outfielder Tony Tarasco leap to make the catch. When Tarasco missed the ball, Garcia thought it was clear that the ball had gone over the fence—if it wasn’t in Tarasco’s glove, where else could it be? But everyone at home saw something else on that play: A 12-year-old kid named Jeffrey Maier had reached over the fence with his glove and deflected the ball into the stands. Garcia watched the replay after the game—which the Yankees went on to win—and was shocked at the sight of the kid. “Where did he come from?” Garcia said later. “I didn’t see him standing there. I never saw that.”

The replay system that baseball just installed probably would’ve gotten that call right. But why stop there? As instant replay becomes a generally accepted part of the game, players and fans are sure to press for more reviews. A system of sensors and cameras could conceivably be used to decide whether a runner is safe at first, for instance. Given that a blown call at first base once decided the World Series, you can imagine that fans might soon be calling for software to make those calls, too.

Stray too far beyond that simple case, though, and it’s difficult to imagine software taking over. Tennis, with its two players, small field of play, and bright-line demarcation between balls that are in and balls that are out, is a comparatively easy sport for computers to umpire. In other sports, refs have to take into account many more variables before making a call. To be able to tell whether a football runner was down before he fumbled, for example, a computer would have to somehow keep track of every player, which one has the ball, when the player with the ball gets hit, and when and how the ball comes loose. That task would likely require an array of sensors and sophisticated image-processing techniques, which is probably not yet a possibility.

Beyond the technological obstacles, the age of Hawk-Eye presents a larger philosophical problem. Sometimes the computer makes a call that no human—not the fans, not the umps, not the players—can quite understand. Late in the 2007 Wimbledon final between Rafael Nadal and Roger Federer, Nadal hit a deep ball that Federer let go, thinking it was out. The umpire thought so, too, and TV replays showing the ball from Federer’s side seemed to confirm it—the ball looked a good half-inch out. But when Nadal challenged the call, Hawk-Eye called the ball in. On the Hawk-Eye Innovations Web site, the company’s representatives posted an explanation (PDF) that blames the dispute on the limited perceptive capacities of TV cameras and human eyeballs. When a tennis ball smashes into grass at high speeds, it compresses, skids for 10 centimeters or so, and then takes off, the company said. Hawk-Eye’s fast cameras were sensitive enough to see the ball just clip the base line, while TV cameras and viewers caught only a blur while the ball skidded away from the line, making them think the ball was out.

Got that, then? Because it’s so perceptive, Hawk-Eye makes obsolete every assessment tool that humans have ever used to adjudicate sports disputes: our eyes, our TV cameras, even perceptible marks on the ground. In their paper, Collins and Evans argue that this is too precise. By erasing all of tennis’ ties to human perception, Hawk-Eye renders the game interpretable only to computers. That’s fairly ridiculous: After all, computers aren’t paying to see two human beings hit a ball over a net. People are.