Blogging the Stanford Machine Learning Class

How Do You Teach a Computer How To Think On Its Own?
What's to come?
Dec. 6 2011 1:45 PM

Teaching a computer how to think on its own.

[Photo: Derek Jeter. Caption: If there is a way to split baseball teams into five groups, the machine will find it. Photograph by Nick Laham/Getty Images.]

A few weeks ago, I mentioned that we were working on algorithms that could identify handwritten numbers with 97 percent accuracy. This is a classic example of a “supervised” learning problem: We’re telling the computer ahead of time what it’s looking for—one of 10 symbols—and helping it train by feeding it data in which the correct answer has already been provided.
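To make "supervised" concrete, here is a minimal sketch in Python. It is not the method used in the class. It swaps in a much simpler technique, a nearest-centroid classifier, and the two-number "features" and the `train` and `predict` helpers are made up for illustration. The point is only that every training example arrives with its correct answer attached.

```python
import math

# Tiny made-up training set: (features, label). Real digit data would have
# hundreds of pixel values per example instead of two numbers.
training = [
    ((1.0, 1.0), "1"), ((1.2, 0.9), "1"),
    ((5.0, 5.0), "7"), ((4.8, 5.3), "7"),
]

def train(examples):
    """Average the feature vectors for each label (nearest-centroid)."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Pick the label whose average training example is closest."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

centroids = train(training)
print(predict(centroids, (1.1, 1.2)))
```

The labels in `training` are what make this supervised: the program never has to work out how many symbols exist, because we told it.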

Now let’s imagine another scenario, one in which we give the computer no training and don’t even tell it that all of these unfamiliar handwritten symbols are numerals. This is “unsupervised” machine learning: the computer has to figure out what it’s looking at on its own. If the computer does its job, it will work out that it’s dealing with 10 distinct symbols, each of which might have a bunch of slight variations. Despite those variations, it should be able to put the symbols into the appropriate categories, so that everything that looks like a “1” goes into one bucket, all of the “2”s go into another, and so forth.
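One common way to do this kind of bucketing is k-means clustering. The sketch below is a minimal version under some loud assumptions: the points are made-up two-number stand-ins for images, the number of buckets `k` is handed to the algorithm rather than discovered, and the starting centers are simply the first `k` points so the result is repeatable (random starts are more usual).

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters without using any labels."""
    # Assumption for reproducibility: start from the first k points.
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers

# Made-up data: two loose clumps, with no labels attached.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centers = kmeans(points, k=2)
print(labels)
```

Notice that the buckets come back numbered 0 and 1, not named. The program can say which points belong together, but it has no idea what the groups mean.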

In this example, humans would do just as well as computers, if not better. We would notice that there were 10 symbols (even if we didn’t know what they meant), and we could sort them ourselves, assuming we had enough time and patience. Unsupervised learning gets more interesting when the machines find patterns we could never identify ourselves. Most of the top contenders for the Netflix prize, for example, didn’t build their recommendation engines using preconceived ideas of genre and taste in film. They just trained a computer to look for whatever patterns showed up, no matter how unexpected or obscure.

Most of the rest of Stanford’s machine-learning course is devoted to learning how to write these sorts of algorithms. I don’t yet have the programming chops to write my own unsupervised code without the direct supervision of the instructors, but I’ve started thinking about different subject areas where such programs would be illuminating.

Let’s say you were asked to take your favorite sport and divide the teams into two categories based on their style of play, ignoring structural groupings like leagues and divisions. As a baseball fan, my instinct would be to divide the major leagues into teams that emphasize pitching and those that focus on hitting. A football fan might divide the NFL into running teams and passing teams, and a basketball watcher could group the NBA into teams that run the floor and those that play at a slow pace.

Once we make it three categories, those obvious dichotomies are no longer as useful. Perhaps you could divide Major League Baseball into young teams, middle-aged teams, and ballclubs full of grizzled veterans. OK, now let’s ratchet up the assignment to five categories, or 10, or 20. Now would be a good time to stock up on graph paper.

To figure out how to sort baseball teams into five categories, I would feed the computer a huge mess of data (say, 100 different statistical categories for each team) and let it group the teams the same way it would group handwritten numbers, by searching for similarities in the data. If there is a way to split teams into five groups, the machine will find it.
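One practical wrinkle with "searching for similarities" is that baseball statistics live on very different scales, so a raw distance between two teams would be dominated by whichever stat has the biggest numbers. A common fix is to rescale every stat to z-scores before comparing teams. The sketch below shows that step only. The team names, the three stats, and all of the numbers are invented for illustration, and `standardize` and `closest` are hypothetical helpers, not anything from the course.

```python
import math
from statistics import mean, pstdev

# Made-up numbers for illustration only: (home runs, ERA, stolen bases).
teams = {
    "Team A": (220, 4.60, 60),
    "Team B": (210, 4.40, 70),
    "Team C": (130, 3.20, 65),
    "Team D": (140, 3.30, 150),
}

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [
        tuple((x - m) / s for x, m, s in zip(row, mus, sigmas))
        for row in rows
    ]

names = list(teams)
scaled = dict(zip(names, standardize(list(teams.values()))))

def closest(name):
    """Return the other team with the smallest Euclidean distance."""
    others = [n for n in names if n != name]
    return min(others, key=lambda n: math.dist(scaled[name], scaled[n]))

print(closest("Team A"))
```

Once the stats are on a common scale, the rescaled rows could be handed to a clustering routine with the number of groups set to five.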

Until this week, this class had dealt primarily with cases where we wanted the computer to help us guess some predetermined piece of information. This was all interesting, but the goal was a little too practical for me. I wanted to take this course to develop a better understanding of how machines learn. This week helped satisfy that curiosity immensely, to the point that I think a lesson on unsupervised learning should come earlier in future semesters. For whatever reason, it’s innately human to want to categorize things. Learning how machines can help us do that, and without any of our biases and blind spots, is tremendously exciting.
