Yes, Ill Matty You
How your cell phone's autocorrect software works, and why it's getting better.
Autocorrect gets no respect. Every day, you dash off dozens of messages on your mobile phone, and most of the time, you do it wrong—you mistype, misspell, or make some other kind of error that's bound to cause you great embarrassment. In the vast majority of cases, your phone steps in to save the day. Thanks to the genius of autocorrect, you can appear fully literate even when you type "im ar thw store," "thats so fibby," or "yes ill matty you."
But no one ever thanks autocorrect. Instead you focus on the few instances in which your phone, overwhelmed by your errors, makes a mistake of its own. True, some of these are spectacular: The iPhone turns "heard about garys internship at the whitehouse?" to "Heard about farts internship at the whorehouse?" On the Motorola Droid, you might aim for "mmm, I donno about that restaurant" but get, "Mommy, I donno" instead. The Web abounds with such gaffes; David Pogue's readers recently compiled a hilariously comprehensive list. Most errors, though, are relatively prosaic—the most common one I experience on the iPhone is its insistence that hell should be he'll. (That explains my recent preference for the schoolmarmish exclamation "What the heck?")
Perhaps due to the thanklessness of the job, nearly all the mobile phone companies I contacted about autocorrect were reluctant to discuss the software. Apple, Google, Microsoft, Research in Motion, and HTC all either did not respond or declined requests for interviews. Surreptitiousness seems to be the operating philosophy here: "You do your best not to be noticed," says Scott Taylor, the vice president of mobile solutions at Nuance, the one software company that was happy to talk about how your phone turns hapless tapping into something resembling readable English. Nuance makes T9, one of the oldest and most popular mobile text-entry systems. The software, which is often customized by handset makers and sometimes doesn't carry the T9 branding, has been bundled with more than 4 billion phones. In its earliest incarnation, T9 was simply a way to enter text via a nine-digit numeric keypad. More recent versions automatically correct input from full-on QWERTY keyboards, and they can even recognize handwriting from styluses.
The basic algorithm behind autocorrection software like T9 is pretty simple. The system is essentially the same as a word processor's spell checker—as you type, the software checks each word against a built-in dictionary, and it suggests alternatives when it doesn't find a match. Many phones will also try to predict what you're going for and suggest a word before you've finished typing it.
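To make the spell-checker comparison concrete, here is a minimal sketch in Python. It is not T9's actual code, which Nuance did not share. The tiny word list and the function names (`one_edit_away`, `suggest`, `complete`) are assumptions made up for illustration, and "alternatives" is interpreted here as dictionary words a single typo away from what was typed.

```python
# Toy dictionary (an assumption); a real phone ships a far larger word list.
DICTIONARY = {"at", "the", "store", "story", "stone", "you", "yes"}
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def one_edit_away(word):
    """Return every string reachable from `word` by one delete, swap, replace, or insert."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    swaps = [left + right[1] + right[0] + right[2:] for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def suggest(word):
    """Return [word] if it is a known word, else the known words one edit away, sorted."""
    if word in DICTIONARY:
        return [word]
    return sorted(one_edit_away(word) & DICTIONARY)


def complete(prefix):
    """Predict whole words from a partially typed prefix."""
    return sorted(w for w in DICTIONARY if w.startswith(prefix))
```

With this word list, `suggest("thw")` offers `the` and `complete("sto")` offers `stone`, `store`, and `story`. A production system would rank those options rather than sort them alphabetically.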
There are two difficulties in this process, Taylor says. The first is building the correct dictionary. The phone's list of words has to be both comprehensive and well-targeted for its audience, stuffed with colloquialisms that a modern mobile user might employ. The second problem is creating an accurate "language model," the system that determines which words to suggest. If a user types in fecer, did he mean fever or feces? The right answer depends on the context and the user: If you were e-mailing your boss about your absence from work, you'd be going for the former, while if you were a film critic who'd just attended The Last Airbender, you'd probably want the latter. The more sophisticated the autocorrection system, the more of these contextual factors get taken into account when suggesting alternatives.
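One simple way to let context settle the fever-or-feces question is a bigram model, which asks how often each candidate follows the previous word. The sketch below is an assumption about how such a model could look, not a description of what any phone actually runs. The sample sentences, `score`, and `pick` are invented for the example.

```python
from collections import Counter

# Tiny stand-in corpus (an assumption); real systems train on far more text.
CORPUS = (
    "i have a fever today . she has a fever and a cough . "
    "the movie was feces . that film was feces ."
).split()

UNIGRAMS = Counter(CORPUS)
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))


def score(previous_word, candidate):
    """Estimate P(candidate | previous_word) with add-one smoothing."""
    vocabulary_size = len(UNIGRAMS)
    seen_together = BIGRAMS[(previous_word, candidate)]
    return (seen_together + 1) / (UNIGRAMS[previous_word] + vocabulary_size)


def pick(previous_word, candidates):
    """Choose the candidate the bigram model finds most likely after previous_word."""
    return max(candidates, key=lambda c: score(previous_word, c))
```

Here `pick("a", ["fever", "feces"])` chooses `fever`, while `pick("was", ["fever", "feces"])` chooses `feces`. A single preceding word is the crudest kind of context; a more sophisticated system would weigh more of it.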
The most obvious way to build a dictionary suitable for phones would be to collect and analyze a large sample of words that people actually type into their devices. But privacy policies prohibit this sort of analysis, Taylor says. Indeed, many phone-based autocorrect systems aren't tied to the Web: They don't automatically learn new words or find more timely alternatives to old words the way Google's search-engine spell checker does. Instead, autocorrection systems are usually seeded by a large body of text (what linguists call a "corpus") that's made up of articles from the popular media. "We analyze those for things like the structure of the language, frequency of word use, and other factors, and then we create this language model," Taylor says. The word-suggestion algorithm also considers the layout of your keyboard in order to predict which key you meant to hit when you mashed several of them at the same time.
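The two ingredients in that paragraph, word frequency from a corpus and the physical layout of the keys, can be combined in a few lines. What follows is a rough sketch under stated assumptions: the one-sentence corpus, the approximate QWERTY coordinates, and the scoring formula (frequency divided by one plus the slip distance) are all invented here, and it handles only lowercase, same-length substitutions.

```python
import math
from collections import Counter

# Stand-in corpus (an assumption); the article describes using popular-media text.
CORPUS = "the fever broke and the fever returned while the feces sample was tested".split()
FREQUENCY = Counter(CORPUS)

# Approximate key centers on a QWERTY layout; each row is shifted half a key.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POSITION = {}
for row_index, (row, offset) in enumerate(zip(ROWS, (0.0, 0.5, 1.0))):
    for col_index, key in enumerate(row):
        KEY_POSITION[key] = (col_index + offset, row_index)


def key_distance(a, b):
    """Straight-line distance between two keys, measured in key widths."""
    (ax, ay), (bx, by) = KEY_POSITION[a], KEY_POSITION[b]
    return math.hypot(ax - bx, ay - by)


def typo_cost(typed, candidate):
    """Sum the finger-slip distance for each mismatched letter (same-length words only)."""
    if len(typed) != len(candidate):
        return math.inf
    return sum(key_distance(t, c) for t, c in zip(typed, candidate) if t != c)


def rank(typed, candidates):
    """Order candidates by corpus frequency, discounted by how far the finger had to slip."""
    def score(candidate):
        return FREQUENCY[candidate] / (1 + typo_cost(typed, candidate))
    return sorted(candidates, key=score, reverse=True)
```

Under these assumptions, `rank("fecer", ["feces", "fever"])` puts `fever` first: C sits right next to V, while R is farther from S, and `fever` is also the more frequent word in the sample text.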
Most autocorrection systems, including those on the iPhone, Android, and BlackBerry, as well as T9, also incorporate some kind of learning behavior. For instance, they'll pay attention to when you recorrect a corrected word, and learn never to offer that faulty choice again. They'll also note the proper nouns in your address book and avoid suggesting alternatives for those. T9 and Google's Android will also let you add your own words to the phone's dictionary. (The iPhone also allegedly has this option, but I haven't been able to get it to work.)
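Those three learning behaviors (remembering rejected corrections, protecting address-book names, and accepting user-added words) amount to a little bookkeeping around the corrector. The class below is a hypothetical sketch, not any vendor's implementation. `LearningCorrector` and its methods are made-up names, and the fixed correction table stands in for whatever suggestion engine a real phone uses.

```python
class LearningCorrector:
    """Wraps a fixed table of corrections with three simple learning behaviors."""

    def __init__(self, corrections, contacts=()):
        self.corrections = dict(corrections)                  # typed word -> suggested word
        self.protected = {name.lower() for name in contacts}  # address-book names
        self.rejected = set()                                 # (typed, suggestion) pairs undone by the user

    def add_word(self, word):
        """Let the user add a word so it is never corrected."""
        self.protected.add(word.lower())

    def reject(self, typed, suggestion):
        """Record that the user changed a correction back, so it is not offered again."""
        self.rejected.add((typed.lower(), suggestion))

    def correct(self, typed):
        """Return the suggestion for `typed`, or `typed` unchanged if nothing applies."""
        key = typed.lower()
        if key in self.protected:
            return typed
        suggestion = self.corrections.get(key)
        if suggestion is None or (key, suggestion) in self.rejected:
            return typed
        return suggestion
```

For example, a corrector built with `{"hell": "he'll"}` turns `hell` into `he'll` until you call `reject("hell", "he'll")`, after which it leaves `hell` alone.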
Farhad Manjoo is Slate's technology columnist and the author of True Enough: Learning To Live in a Post-Fact Society. You can email him at firstname.lastname@example.org and follow him on Twitter.