Google speech recognition software for your cellphone actually works.

Innovation, the Internet, gadgets, and more.
April 6 2011 4:36 PM

Now You're Talking!

Google has developed speech-recognition technology that actually works.


If you've got an Android phone, try this: Hit the microphone icon on the home screen, then ask, "How many angstroms in a mile?" Use your normal speaking voice—don't speak slowly or strain to over-pronounce "angstrom." So long as you have a good Internet connection, the phone shouldn't take more than a second to recognize your question and shoot back a reply: 1.609344 × 10¹³.
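The arithmetic behind that answer is easy to check by hand. Here is a minimal sketch in Python, assuming the international mile (exactly 1,609.344 meters) and an angstrom of one ten-billionth of a meter. The names are illustrative only and say nothing about how Google actually computes its reply.

```python
# Assumptions: 1 international mile = 1609.344 m exactly; 1 angstrom = 1e-10 m.
# Working in whole millimeters keeps the arithmetic in exact integers.
MILLIMETERS_PER_MILE = 1_609_344      # 1609.344 m expressed in millimeters
ANGSTROMS_PER_MILLIMETER = 10**7      # 1 mm = 1e-3 m = 1e7 angstroms


def angstroms_per_mile() -> int:
    """Return the number of angstroms in one international mile."""
    return MILLIMETERS_PER_MILE * ANGSTROMS_PER_MILLIMETER


# Print the result in scientific notation with six decimal places.
print(f"{angstroms_per_mile():.6e}")
```

The result is 16,093,440,000,000 angstroms, or roughly 1.6 × 10¹³.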

This works with all kinds of queries. Say "what's 10 times 10 divided by 5 billion" and the phone will do math for you. Say "directions to McDonald's" or read out an address—even a vague one like "33rd and Sixth, NYC"—and Android will pull up a map showing where you want to go. It works for other languages, too: Android's Translate app (also available for the iPhone) will not only convert your English into spoken French (among several other languages) but also has a "conversation mode" that will translate the French waiter's response back into English. And if that's not enough, Android lets you dictate your e-mail and text messages, too.

If you've tried speech-recognition software in the past, you may be skeptical of Android's capabilities. Older speech software required you to talk in a stilted manner, and it was so prone to error that it was usually easier just to give up and type. Today's top-of-the-line systems—like software made by Dragon—don't ask you to talk funny, but they tend to be slow and use up a lot of your computer's power when deciphering your words. Google's system, on the other hand, offloads its processing to the Internet cloud. Everything you say to Android goes back to Google's data centers, where powerful servers apply statistical modeling to determine what you're saying. The process is fast, can be done from anywhere, and is uncannily accurate. You can speak normally (though if you want punctuation in your e-mail, you've got to say "period" and "comma"), you can speak for as long as you'd like, and you can use the biggest words you can think of. It even works if you've got an accent.

How does Android's speech system work so well? The magic of data. Speech recognition is one of a handful of Google's artificial intelligence programs—the others are language translation and image search—that get their power by analyzing impossibly huge troves of information. For the speech system, the data are a large number of voice recordings. If you've used Android's speech recognition system, Google Voice's e-mail transcription service, Goog411 (a now-defunct information service), or some other Google speech-related service, there's a good chance that the company has your voice somewhere on its servers. And it's only because Google has your voice—and millions of others—that it can recognize mine.


Unless you've turned on Android's "personalized voice recognition" system, your recordings are stored anonymously—that is, Google can't tie your voice to your name. Still, the privacy implications of building a huge database of millions of people's utterances are fascinating—so fascinating that I'll devote my next column to discussing them. Leaving aside privacy concerns for a moment, it's undeniable that speech recognition is one of a number of programs that could only have come about because of our newfound capacity to store and analyze lots and lots of information. In some ways the future of software—and, thus, of the computer industry—depends on such databases. If The Graduate were filmed today, the job advice to Benjamin Braddock would go like this: "One word: data."

To understand why Google's stash of recorded voice snippets is necessary for speech recognition, it helps to understand the history of creating machines that can decipher speech. Late last year, I met Mike Cohen, the head of Google's speech system, in a nondescript conference room at Google's Mountain View, Calif., headquarters. Cohen is one of the world's experts in voice-recognition systems; he's been in the business for decades, and he's seen it evolve from a field dominated by linguists who were interested in computers to one dominated by engineers who are interested in linguistics.
