Inside the Internet.
April 24 2009 12:28 PM

I'm Human, Computer, I Swear!

It's time to move beyond those squiggly letter tests that Web sites use to weed out spam.

If only someone had listened to computer scientist Moni Naor in 1996, proving that you're human on the Internet would have been so much more interesting. Naor was among the first to propose that simple tests only humans can solve would prevent malicious bots from infiltrating the Web. In an unpublished manuscript, Naor proposed nine possible tests, including gender recognition in images, fill-in-the-blank sentences, and a "deciding nudity" quiz in which you're asked to identify which person isn't wearing any clothes.

Alas, rather than getting to play "find the naked person" every time we sign up for a webmail account, we're now stuck with those reviled squiggly letter tests known as CAPTCHAs. Let's give credit where credit's due: These tests have been incredibly effective in combating spam. But even CAPTCHA pioneer Luis von Ahn, who received a MacArthur genius grant on account of his squiggly-letter work, admitted to me that they won't be a solution forever. For all their success, these tests are a crude way to weed out the bots among us. And they have proliferated to so many sites that the task of proving your humanity on the Internet is beginning to feel like an imposition.

This guess-the-funny-letters approach has been the dominant strategy in bot warfare for the past decade or so. As spammers have gotten more sophisticated, the CAPTCHAs have gotten harder to solve. Now, it's not at all uncommon for flesh-and-blood people to botch the tests, failing to convince the computer of their Homo sapiens credentials.

There is something uniquely vexing about having your humanity disputed by a machine. Don't blame the computer. While humans are perfectly capable of spotting a machine masquerading as a human on the Internet—the classic definition of a Turing test—it gets much more difficult when you're asking a computer to be the judge, particularly as hackers get better at teaching computers to read. Tech publications regularly report that this or that CAPTCHA has been cracked by spammers, though there's often some dispute over whether the culprits are using optical character recognition or simply paying people in India to solve them en masse. (This site, for example, charges $2 per 1,000 solutions.) An engineer at Google told me that the company has collected evidence of OCR attacks on its CAPTCHAs but believes the majority of illicit solving is being done by humans.

We all despise spam, but using CAPTCHAs as a first line of defense often amounts to killing a mosquito with a squiggly machete. Serving readers these Pictionary exercises might be called for in situations with high-value targets, like free webmail services that get conscripted to send more spam. But is it really necessary for me to fill out a CAPTCHA in order to send an e-mail to an English professor at Auburn University?

These days, most of the advances in human verification involve new and improved tests. Google is experimenting with rotated images, since computers still have trouble telling up from down. Von Ahn currently runs a system called reCAPTCHA that helps digitize books in the process of getting people to identify words. (He says reCAPTCHA, which is used by more than 100,000 sites, is still spammer-proof.) For some really nutty proposals, check out this paper, which proposes a series of word association games and inkblot tests.

I don't doubt that these innovations can extend CAPTCHA's lifespan for at least a few years. But I don't think this should be the goal. Rather, developers should be moving away from a system where humans have to prove they're human, particularly for sites that are low-value targets. (No offense, Auburn English department.) Ideally, software should be able to figure out who's a human on its own.

To that end, there are a few interesting techniques that can at least weed out the dumbest spambots. Developer James Edwards offers a nice overview of noninteractive alternatives. My favorite is the "honeypot" defense: Since bots live inside the Internet and see HTML, not the pretty versions of Web pages our browsers make for us, they can have a difficult time figuring out what's visible to humans and what isn't. So when they see a submission form—say, to submit a comment to a blog—they're inclined to enter something in all the fields and try to post it. The honeypot here is an input field that is invisible to readers. As a human, you will never know this secret input box exists, and even if you did, there would be no way for you to access it. If the site receives a submission in the invisible field, then, it's probably coming from a bot and can be automatically discarded.
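To make the honeypot idea concrete, here is a minimal server-side sketch in Python. The field name `website_url` and the function `is_probably_bot` are hypothetical, chosen only for illustration; the sketch assumes the form also contains an input with that name, hidden from human visitors with CSS, and that submitted fields arrive as a plain dictionary.

```python
# Hypothetical name of the hidden input. Humans never see it, so they leave it empty.
HONEYPOT_FIELD = "website_url"


def is_probably_bot(form: dict[str, str]) -> bool:
    """Return True when the invisible honeypot field was filled in."""
    value = form.get(HONEYPOT_FIELD, "")
    # Any non-blank value means something wrote into a field no reader can reach.
    return bool(value.strip())


# A human leaves the hidden field blank; a naive bot fills in every field it finds.
human_post = {"name": "Ann", "comment": "Nice piece.", "website_url": ""}
bot_post = {"name": "x", "comment": "Cheap watches!", "website_url": "http://spam.example"}

print(is_probably_bot(human_post))
print(is_probably_bot(bot_post))
```

A site using something like this would quietly discard any submission for which the check returns true. As noted above, it only stops bots that fill in fields blindly.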

Spammers, of course, are dedicated, able, and not easily fooled. Anyone trying to target a specific site would not have much trouble bypassing this defense. But for sites whose main threat comes from roving bots that paint with a wide brush, these sorts of solutions are sensible.

For more robust protection, my hope lies in systems like Akismet, which applies a complex algorithm to blog comments to determine whether they are spam. It's in the same vein as e-mail spam filters that examine the content of a message and give it a thumbs up or thumbs down. These filters have gotten a lot better over the years: it's no longer possible to fool the e-mail watchdogs by spelling your product R0lex. Another automatic system called Bad Behavior boasts that it doesn't even bother with the content. Instead, it uses what it calls a "fingerprinting" strategy to identify spammers based on technical characteristics, like the IP address and the details of the HTTP request, exploiting the fact that most spammers are sloppy programmers who leave at least a few digital red flags waving.
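The fingerprinting approach can be sketched in a few lines of Python. This is not Bad Behavior's actual rule set, which I have not reproduced here; the three checks, the function name `red_flags`, and the sample addresses are all assumptions picked to show the general shape of judging a request by its technical characteristics rather than its content.

```python
def red_flags(ip: str, headers: dict[str, str], blocked_ips: set[str]) -> list[str]:
    """List the technical warning signs found in one HTTP request (illustrative rules only)."""
    # Header names are case-insensitive, so normalize them before looking anything up.
    normalized = {name.lower(): value for name, value in headers.items()}
    flags = []
    if ip in blocked_ips:
        flags.append("blocked-ip")
    # Real browsers identify themselves; sloppy scripts often send no User-Agent at all.
    if not normalized.get("user-agent", "").strip():
        flags.append("missing-user-agent")
    # Browsers also say which content types they accept.
    if "accept" not in normalized:
        flags.append("missing-accept")
    return flags


browser = {"User-Agent": "Mozilla/5.0", "Accept": "text/html"}
print(red_flags("198.51.100.7", browser, blocked_ips=set()))
print(red_flags("203.0.113.9", {}, blocked_ips={"203.0.113.9"}))
```

A real system would presumably weigh many more signals and decide on a threshold; the point is only that none of these checks reads a word of the comment itself.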

Herein lies the key to leaving squiggly letters behind. As Alan Turing laid out in the 1950 paper that postulated his test, the goal is to determine whether a computer can behave like a human, not perform tasks that a human can. The reason CAPTCHAs have a term limit is that they measure ability, not behavior. The history of computing shows us that machines will eventually learn how to perform all manner of tasks (identifying words, for instance) that we currently assume only humans can do.

How might it be possible to measure behavior rather than ability? The other day, I was writing a note to a company using the online form they provided for media requests, doing the usual amount of typing, backspacing, and retyping as I tried to phrase my note in a way that would make them respond quickly. It occurred to me that the random, circuitous way that people interact with Web pages (the scrolling and highlighting and typing and retyping) would be very difficult for a bot to mimic. A system that could algorithmically capture the way humans interact with forms could eventually relieve humans of the need to prove anything at all.
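What might such a behavioral check look like? The Python sketch below is a toy under loud assumptions: it supposes the page has already recorded keystrokes with timestamps, and the `KeyEvent` type, the `behavior_score` function, and both thresholds (20 milliseconds of variation, one second of total typing) are invented for illustration, not taken from any existing system.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class KeyEvent:
    time_ms: int  # milliseconds since the form loaded
    key: str      # for example "a" or "Backspace"


def behavior_score(events: list[KeyEvent]) -> int:
    """Count simple signs of human-like typing, from 0 (none) to 3 (all)."""
    if len(events) < 2:
        return 0
    # Time between each keystroke and the next one.
    gaps = [later.time_ms - earlier.time_ms for earlier, later in zip(events, events[1:])]
    score = 0
    if any(event.key == "Backspace" for event in events):
        score += 1  # people correct themselves
    if pstdev(gaps) > 20:
        score += 1  # people type with an uneven rhythm (assumed threshold)
    if events[-1].time_ms - events[0].time_ms > 1000:
        score += 1  # people take more than a second to write something (assumed threshold)
    return score


scripted = [KeyEvent(0, "a"), KeyEvent(10, "a"), KeyEvent(20, "a"), KeyEvent(30, "a")]
person = [KeyEvent(0, "h"), KeyEvent(180, "e"), KeyEvent(950, "Backspace"),
          KeyEvent(1100, "e"), KeyEvent(1900, "y")]

print(behavior_score(scripted))  # 0
print(behavior_score(person))    # 3
```

A determined spammer could fake all three signals, so this shows only the kind of measurement involved, not a workable defense.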

Any solution that could replace CAPTCHAs en masse would have to be free, work across a wide variety of platforms, and be easy for the average blogger or Web admin to install. One of the reasons that CAPTCHAs have spread like kudzu, I suspect, is that they're so easy to implement—in some cases, as simple as checking a box on a site that helps you set up an input form. The more a bot-fighting algorithm can insinuate itself behind the scenes, the better. In the meantime, we'll all have to keep debating the eternal question: Is that a W, or is it a V and an I attached at the hip?
