In Artificial Intelligence Breakthrough, Google Computers Teach Themselves To Spot Cats on YouTube

The Citizen's Guide to the Future
June 27 2012 9:34 AM

Google's computers quickly concluded that cats' faces were among the more important features to be able to recognize when watching YouTube.

Photo by Timothy A. Clary/AFP/Getty Images

Working at the secretive Google X lab, researchers from Google and Stanford connected 1,000 computers, turned them loose on 10 million YouTube stills for three days, and watched as they learned to identify cat faces.

Will Oremus

Will Oremus is Slate's senior technology writer.

The research, thus summarized, is good for a laugh. “Perhaps this is not precisely what Turing had in mind,” wrote The Atlantic’s Alexis Madrigal. Sure it was, countered the Cato Institute’s Julian Sanchez: Google was training its computers to pass the “Purring Test.”


But what’s most fascinating about the study is that the researchers didn’t actually tell the computers to look for cat faces. The machines started doing that on their own.

The paper’s actual title, you see, has nothing to do with felines, or YouTube for that matter. It’s called “Building High-level Features Using Large Scale Unsupervised Learning,” and what it’s really about is the ability of computer networks to learn what’s meaningful in images—without humans’ help.

When an untutored computer looks at an image, all it sees are thousands of pixels of various colors. With practice and supervision, it can be trained to home in on certain features—say, those that tend to indicate the presence of a human face in a photo—and reliably identify them when they appear. But such training typically requires images that are labeled, so that the computer can tell whether it guessed right or wrong and refine its concept of a human face accordingly. That’s called supervised learning.
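To make that concrete, here is a toy sketch of supervised learning in Python. The “images” are just random pixel vectors and the labels are synthetic; none of this resembles Google's actual system, but it shows the key ingredient: the machine improves only because a label tells it when it guessed wrong.

```python
# A minimal, illustrative sketch of supervised learning (not Google's setup).
import numpy as np

rng = np.random.default_rng(0)

# Fake labeled dataset: 200 tiny 8x8 "images" flattened into 64 pixel values.
X = rng.normal(size=(200, 64))
true_w = rng.normal(size=64)
y = (X @ true_w > 0).astype(float)       # labels a human annotator would supply

# A logistic-regression "face detector" trained by gradient descent on the labels.
w = np.zeros(64)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probability of "face"
    grad = X.T @ (p - y) / len(y)        # the error signal comes from the labels
    w -= 0.5 * grad

pred = 1.0 / (1.0 + np.exp(-(X @ w))) > 0.5
print(f"training accuracy: {(pred == (y == 1)).mean():.2f}")
```

Take away the labels and the loop above has nothing to learn from, which is exactly the problem the next paragraph describes.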

The problem is that most data in the real world doesn’t come in such neat categories. So in this study, the YouTube stills were unlabeled, and the computers weren’t told what they were supposed to be looking for. They had to teach themselves what parts of any given photo might be relevant based solely on patterns in the data. That’s called unsupervised learning.
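A correspondingly tiny sketch of unsupervised learning is below, again with made-up data rather than anything from the study. It uses a miniature autoencoder, one standard way to learn features without labels: the network is scored only on how well it can reconstruct its own input, so whatever “features” show up in its hidden units are ones it invented because they recur in the data.

```python
# A toy autoencoder: unsupervised feature learning on unlabeled data.
import numpy as np

rng = np.random.default_rng(1)

# Unlabeled data: 500 flattened 8x8 "stills" built from a few recurring patterns.
patterns = rng.normal(size=(4, 64))
codes = rng.normal(size=(500, 4))
X = codes @ patterns + 0.1 * rng.normal(size=(500, 64))
X = X / X.std()                              # keep pixel values in a sane range

n_hidden = 16
W1 = 0.1 * rng.normal(size=(64, n_hidden))   # encoder weights
W2 = 0.1 * rng.normal(size=(n_hidden, 64))   # decoder weights

for _ in range(300):
    H = np.tanh(X @ W1)      # hidden "features" discovered from the data itself
    X_hat = H @ W2           # attempted reconstruction of the input
    err = X_hat - X          # no labels anywhere: the input is its own target
    grad_W2 = H.T @ err / len(X)
    grad_W1 = X.T @ (err @ W2.T * (1.0 - H**2)) / len(X)
    W1 -= 0.05 * grad_W1
    W2 -= 0.05 * grad_W2

final_err = np.mean((np.tanh(X @ W1) @ W2 - X) ** 2)
print(f"mean reconstruction error: {final_err:.3f}")
```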

They were to develop these concepts using artificial neural networks—a system of distributed information processing analogous to that of the human brain. The goal was to see if Google’s computers could mimic some of the functionality of humans’ visual cortex, which has evolved to be expert at recognizing the patterns that matter most to us (such as faces and facial expressions).

In fact, Google’s machines did home in on human faces as one of the more relevant features in the data set. They also developed the concepts of cat faces and human bodies—not because they were instructed to, but merely because the arrangement of pixels in image after image suggested that those features might be in some way important.
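How do you check, after the fact, that a network has developed a concept like “cat face” when you never asked for one? The rough idea, sketched below with simulated numbers rather than the study's real activations, is to take a small labeled probe set and ask whether any single learned feature fires much more strongly for one category than for everything else.

```python
# Probing learned features with a small labeled test set (simulated numbers).
import numpy as np

rng = np.random.default_rng(2)

n_units = 32
activations = rng.normal(size=(400, n_units))   # feature responses per image
is_cat = rng.random(400) < 0.25                 # held-out labels, used only to probe
activations[is_cat, 7] += 3.0                   # pretend unit 7 "likes" cat faces

# Score each unit by how well thresholding its activation classifies the probe set.
best_unit, best_acc = None, 0.0
for u in range(n_units):
    threshold = activations[:, u].mean()
    pred = activations[:, u] > threshold
    acc = max((pred == is_cat).mean(), (pred == ~is_cat).mean())
    if acc > best_acc:
        best_unit, best_acc = u, acc

print(f"best 'cat neuron': unit {best_unit}, probe accuracy {best_acc:.2f}")
```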

Google engineering ace Jeff Dean, who helped oversee the project, tells me he was surprised by how well the network accomplished this. In past unsupervised learning tests, machines have managed to attach importance to lower-level features like the edges of an object, but not more abstract features like faces (or cats).

It might seem surprising that this type of pattern recognition should be so difficult. After all, a three-year-old can do it. But for one thing, the neural networks in a three-year-old’s brain contain far more connections than even Google’s massive set-up. (“How Many Computers to Identify a Cat? 16,000” was the New York Times’ headline. When I spoke with Dean, he politely pointed out that it was only 1,000 computers, with a combined 16,000 cores, but either way it’s a lot.)

Secondly, humans by age three are already equipped with specialized tools for recognizing faces. Part of the point of the experiment was to study how such tools might develop in infants’ brains in the absence of feedback or supervision.

For all their successes, it’s worth noting that Google’s computers also fell far short of humans in several respects. After unsupervised learning followed by a period of supervised training, they picked out human faces with 82 percent accuracy. But their accuracy on a broad range of features that humans consider relevant was a far more humble 16.7 percent. 

Meanwhile, Dean notes that the computers “learned” a slew of concepts that have little meaning to humans. For instance, they became intrigued by “tool-like objects oriented at 30 degrees,” including spatulas and needle-nose pliers.

For Dean, the big takeaway was not that computers have achieved human-like visual processing skills, but that they plausibly will in the not-too-distant future. Why does he think so? Because Google's experiment shows that having more processing power and more data makes a difference, and as time passes, we'll only have more of both.

Future Tense is a partnership of Slate, New America, and Arizona State University.
