Google Translate: It Already Speaks as Well as a 10-Year-Old. How Good Can it Get?

Oct. 31 2011 7:04 AM

Google Translate

It already speaks 57 languages as well as a 10-year-old. How good can it get?

Will Google's computers understand languages better than humans?

Photograph by Karen Bleier/AFP/Getty Images.

A computer that translates "natural language" is the holy grail of artificial intelligence—language being so integral to our intelligence and to our humanness that to crack it would be to achieve artificial consciousness itself. But until relatively recently, attempts at it have mostly sucked. They’ve tended to mix the words of one language with the grammar of the other, getting both wrong in the process. Mostly, this is the fault of literal translation—the kind of process that translates kindergarten as children garden. Newer methods—dominated by Google—turn the problem around: Using data, statistics, and brute force, they succeed in part by their refusal to "deconstruct" language and teach meaning to computers in the traditional way.

Google is grossly outperforming the rule-based methods that have historically been used to teach language to computers. These classic methods work on the principle that language can be decoded, stripped to its purest component parts of "meaning," and built back up again into another language. Linguists feed computers vocabularies, grammars, and endless rules about sentence structure—but language isn’t so easily formalized this way. There are more exceptions, qualifications, and ambiguities than rules and laws to follow. And, when you really think about it, this approach hardly respects the complexity of the problem.

Enter Google Translate, which avoids that reductive concept of language altogether. (Google didn’t invent this method, but it is certainly dominating it now.) Google mines existing translated material, recognizes how words or phrases typically correspond, and uses probability to deliver the best match based on context. Being Google, its digital Rosetta Stone amounts to trillions of words, from a corpus of U.N. documentation (in its six official languages, translated at high quality) to company memos to Harry Potter novels. Although Google builds a "language model" that describes the basic look of a well-formed sentence, it doesn’t have linguists try to decode the languages at all. Wittgenstein’s maxim of "Don’t ask for the meaning, ask for the use" is an effective working mantra for Google's statistical method.
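
To make the statistical idea concrete, here is a toy sketch in Python. It is not Google's system. The French word, the handful of aligned pairs, the four English sentences, and every function name are invented for illustration. It counts how often a source word was translated each way, builds a tiny bigram "language model" from English text, and picks the candidate that scores best given the preceding English word.

```python
from collections import Counter
import math

# Hypothetical stand-in for mined translations: (French word, English word) pairs.
PHRASE_PAIRS = [("avocat", "lawyer")] * 3 + [("avocat", "avocado")] * 2

# Hypothetical English text used to learn what well-formed English looks like.
ENGLISH_TEXT = [
    "a ripe avocado",
    "the ripe avocado",
    "she hired a lawyer",
    "the lawyer spoke",
]

pair_counts = Counter(PHRASE_PAIRS)
source_counts = Counter(src for src, _ in PHRASE_PAIRS)

bigram_counts = Counter()   # how often `word` follows `prev`
context_counts = Counter()  # how often `prev` is followed by anything
vocab = set()
for sentence in ENGLISH_TEXT:
    words = sentence.split()
    vocab.update(words)
    for prev, word in zip(words, words[1:]):
        bigram_counts[(prev, word)] += 1
        context_counts[prev] += 1


def translation_prob(src, tgt):
    """Relative frequency of `tgt` among the observed translations of `src`."""
    return pair_counts[(src, tgt)] / source_counts[src]


def language_model_prob(prev, word):
    """Add-one smoothed bigram probability of `word` following `prev`."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))


def best_translation(src, prev_target_word):
    """Pick the candidate with the best combined translation and fluency score."""
    candidates = [tgt for (s, tgt) in pair_counts if s == src]
    if not candidates:
        return src  # unseen word: pass it through unchanged
    return max(
        candidates,
        key=lambda tgt: math.log(translation_prob(src, tgt))
        + math.log(language_model_prob(prev_target_word, tgt)),
    )


print(best_translation("avocat", "ripe"))
print(best_translation("avocat", "the"))
```

Nothing in the sketch knows what a lawyer or an avocado is. After "ripe" it chooses the fruit and after "the" it chooses the profession, purely because of the counts. A real system works on phrases rather than single words and on vastly more data, but the principle of use over meaning is the same.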

In his wonderful book, Is That a Fish in Your Ear?, the Princeton linguist and translator David Bellos notes the link between early machine translation pioneers and modern philosophers of language: the hopeless pursuit to discover “the purely hypothetical language which all people really speak in the great basement of their souls.” When I spoke to Bellos about Google, he stressed that Google's achievements don’t make Google Translate akin to how human translation actually works. Though a translation is what you get, translation isn’t really what Google Translate does. (That depends on what we understand by "translation," but let’s not get into that.) “It’s like the difference between engineering and knowledge,” says Bellos. “An engineering solution is to make something work, but the way you make it work doesn’t necessarily have anything to do with the underlying things. Airplanes do not work the way birds fly.”

Which is quite true. But even if Google Translate doesn’t translate language like humans do, there are parallels in the effect, especially in the way Google Translate learns language. Children don’t learn with prescriptive rules and by deconstructing sentence structure. Subjects, nouns, verbs—these are drilled later, once we’re all but fluent. When I spoke to Franz Och, who heads up Google Translate, he told me how, in hindsight, it’s almost obvious that rule-based methods aren’t necessarily as fruitful as data-driven ones. When children learn, “You just give examples, you interact with the child—grammar is something which is never explicit, it’s always implicit,” he says. “Just the same, when our system is learning, a lot of the grammar is not explicit—it’s implicit in the model parameters, in what comes out.”

Here Wittgenstein pops up again. Translation was one of the philosopher’s many examples of a "language game," a form of rule-following wherein we partake in the game (of translation) without direct use of the rules that are implicit in it. Translation isn’t reducible to its rules (grammar, syntax, semantics), but they’re still there, in some sense, beneath the surface. Just the same, Google Translate doesn’t grasp the "rules"—they’re implicit, and learned implicitly, as Och says.

A metaphor, perhaps, but this isn't the first time a little applied Wittgenstein has been put to work at Google, intentionally or not. Part of Google’s search power is in its intelligent handling of context: Searches for "hot dogs" yield results for the food rather than puppies, working on the insights of family resemblance. In Steven Levy’s recent book about Google, In the Plex, an interview with search engineer Amit Singhal suggests that the Wittgenstein influence was deliberate, and was a key breakthrough. Another example: “Today, if you type ‘Gandhi bio,’ we know that ‘bio’ means ‘biography,’ ” Levy quotes Singhal. “And if you type ‘bio warfare,’ it means ‘biological.’ ” In other words, Google's search engine learns its semantics from human input and improves with more data, just as Google Translate does.
