After Further Review

The NFL’s instant replay system doesn’t work. Here’s how to fix it.

Dec 22, 2009 12:25 PM

Every serious football fan knows the central tenet of the NFL’s instant replay system: “A decision will be reversed only when the referee has indisputable visual evidence” that the call made on the field was incorrect. While we’re all quick to complain when we don’t agree with a replay reversal, few observers ever ask the obvious question: Why is the league’s replay standard so demanding?

Earlier this month, Duke law professor Joseph Blocher asked that very question at PrawfsBlog, and since then several more bloggers have weighed in. Many start from the common legal wisdom that a fact-finder usually minimizes errors in adjudication by returning the verdict he believes most likely to be right. That’s why, in a tort suit or one for breach of contract, the jury will find for the plaintiff if he proves his case by “a preponderance of the evidence”—a fancy way of saying “more likely than not.” Sometimes the jury will mistakenly find for the plaintiff when the true facts favor the defendant, and sometimes it will err in the opposite direction. But, on reasonable assumptions, the “more likely than not” standard of proof is the best way to maximize correct verdicts and minimize incorrect ones.

In criminal law, our goal is not to minimize total errors. Believing, with the 18th-century English jurist William Blackstone, that it is “better that ten guilty persons escape than that one innocent suffer,” the Anglo-American system deems it especially important to minimize erroneous convictions. That’s why a criminal defendant must be acquitted unless the jury is persuaded of his guilt “beyond a reasonable doubt,” about the most demanding standard of proof known to law. It is designed to ensure that there will be very few erroneous convictions. However, it achieves that end at a high cost: lots of erroneous acquittals.

The NFL’s “indisputable visual evidence” standard goes still further. Pro football’s replay rules would seem to ensure very few erroneous reversals but at the cost of tolerating a great many erroneous initial calls. That tradeoff would make sense if we believed it “better that ten erroneous calls be allowed to stand than that one correct initial call be reversed.” But that seems silly. After all, the very point of introducing instant replay was to reduce errors, “to get the play right.” If we want fewer errors, our best bet is to allow the referee to reverse an initial call if he concludes that a contrary call is more likely to be right. In the legal world, that’s what’s known as de novo review.

Notwithstanding the written rule, isn’t it possible that referees are applying something close to de novo review? After all, coaches’ challenges succeed from 40 percent to 50 percent of the time. While you might think such a high success rate offers little reason to believe refs abide by the indisputable visual evidence standard, consider that the conviction rate in American criminal prosecutions stands just above 90 percent. Still, few if any scholars think that fact alone provides good evidence that juries do not adhere to the “beyond a reasonable doubt” standard of proof; the high conviction rate is consistent with the supposition that, by and large, prosecutors only bring strong cases. Similarly, a high reversal rate for instant replay is consistent with the possibility that coaches use their challenges wisely.

There are reasons to think that’s so. First, because coaches have a limited number of challenges and lose a timeout if a challenge fails, they have every incentive not to make challenges that are likely to lose. Moreover, the facts that reversal rates have increased over time—coaches were successful on 29 percent of challenges in 1999—and that coach-initiated challenges succeed at twice the rate of booth-initiated challenges suggest that coaches are learning to issue challenges more judiciously. So while it appears to me—as it does to many fans—that refs don’t always adhere to the strict IVE standard, it seems hard to deny that they apply a standard significantly more demanding than de novo review.

Even assuming that’s right, many reasons have been proposed to keep the NFL’s replay system the way it is now. None seems very convincing.

First, you could argue that the IVE test is technically a standard of review, not of proof, and that standards of review frequently prescribe deference to the initial decision. That is true. But the rationales behind deferential appellate review standards just don’t apply to football. The law frequently mandates deference when the initial decision was entrusted to the discretion of an initial adjudicator—the referee on the field, in our example. But “judgment calls” like holding and unnecessary roughness are already unreviewable. Reviewable calls are up for scrutiny precisely because they are thought not to be discretionary. The law also mandates deferential review when the initial decision-maker is better able to ascertain the true facts. But nothing like that applies here. Although the on-field official will sometimes have the superior vantage point, surely the replay official—who has the benefit of multiple camera angles in high-definition and super slo-mo—will more often than not have better access to the truth.

What about the argument that the indisputable visual evidence standard is needed to preserve respect for the officials? To start, fans don’t expect perfection from the refs. And insofar as they might, respect is threatened less by reversal of the rare challenged call than by the fact that viewers at home, watching replays in HD, can see when the refs get it wrong. Moreover, a standard as demanding as “indisputable visual evidence” predictably invites only partial compliance. Sometimes replay officials appear to go by the IVE rule, sometimes they don’t. The refs’ oft-remarked failure to apply this extremely demanding standard might provoke more disrespect than the initial error.

There’s also the argument that de novo review would lead to intolerable delays. That case is weak as well. Under current NFL rules, each team gets only two challenges, though they earn a third if the first two are successful. If the standard were changed to de novo review, the league might face lobbying to increase that allowance, for fans and coaches might find it frustrating if exhaustion of a team’s challenges prevented a review of a clearly reversible call. The outer limit here is the college game, where the replay booth reserves the right to review every play. But the NFL has steadfastly insisted that pursuit of greater accuracy must not be allowed to cause substantial delay: two years ago, for example, it cut the allotted time for review by one-third. For better or worse, then, the likelihood that it would permit significantly more challenges is trivially small.

All that I have argued so far has been designed to show that the tools and analysis developed in the law can be of use to sports—that they can help reveal that the replay standard used in the NFL doesn’t make much sense. Happily, football might be able to teach the law something as well.

After an on-field review, the replay official has two basic options: to reverse the initial call or not. When the call is not reversed, the referee sometimes says that the on-field call “stands,” other times that it is “confirmed.” This year’s edition of the NFL Referees’ Manual directs referees that the “only time you should announce ‘the ruling on the field stands’ is when there is not enough visual evidence to make a decision.” In short, the “stands” and “confirmed” verdicts mean different things. The latter tells us that the initial call was correct; the former reports that it wasn’t indisputably wrong.

Perhaps the criminal justice system should be reformed to communicate a similar difference. A criminal jury unable to conclude beyond a reasonable doubt that the defendant is guilty is instructed to acquit. It does so by announcing “not guilty.” But why only one verdict of acquittal, not two? A jury could return a verdict of “innocence” if it believes, more likely than not, that the defendant was, well, innocent. A verdict of “not (proven) guilty” would be appropriate when the jury is not persuaded of guilt beyond a reasonable doubt but cannot say that the defendant is more likely than not innocent.

That’s the way they do it in Scotland. If it’s good enough for the Scots and the NFL, maybe it should be good enough for the American criminal justice system too.