Blogging the Stanford Machine Learning Class

How Do You Teach a Computer How To Think On Its Own?
Dec. 6 2011 1:45 PM


Teaching a computer how to think on its own.

Photo: Derek Jeter. If there is a way to split baseball teams into five groups, the machine will find it. (Photograph by Nick Laham/Getty Images.)

A few weeks ago, I mentioned that we were working on algorithms that could identify handwritten numbers with 97 percent accuracy. This is a classic example of a “supervised” learning problem: We’re telling the computer ahead of time what it’s looking for—one of 10 symbols—and helping it train by feeding it data in which the correct answer has already been provided.
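For readers who want to see the idea in code, here is a minimal Python sketch of supervised learning. It is not the method used in the class. It swaps in a much simpler technique, a nearest-centroid classifier, and the tiny 3-by-3 "images," the two labels, and the function names `train` and `predict` are all made up for illustration. The point is only that every training example arrives with its correct answer attached.

```python
def train(examples):
    """Average the pixel values of the training images for each label.

    examples: list of (pixels, label) pairs, where pixels is a flat list
    of 0/1 values. Returns a dict mapping label -> average image.
    """
    by_label = {}
    for pixels, label in examples:
        by_label.setdefault(label, []).append(pixels)
    return {
        label: [sum(column) / len(group) for column in zip(*group)]
        for label, group in by_label.items()
    }


def predict(model, pixels):
    """Return the label whose average image is closest to the new image."""
    def squared_distance(label):
        return sum((p - m) ** 2 for p, m in zip(pixels, model[label]))
    return min(model, key=squared_distance)


# Made-up 3x3 images, flattened row by row. The labels are the "answers"
# we hand to the computer ahead of time.
examples = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], "1"),  # a plain vertical stroke
    ([1, 1, 0, 0, 1, 0, 0, 1, 0], "1"),  # a stroke with a small flag
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], "0"),  # a closed ring
    ([0, 1, 1, 1, 0, 1, 1, 1, 1], "0"),  # a ring with a missing corner
]

model = train(examples)

# A symbol the model has not seen: a stroke with a small foot.
new_symbol = [0, 1, 0, 0, 1, 0, 0, 1, 1]
guess = predict(model, new_symbol)
print(guess)
```

A real handwriting recognizer works with far larger images and a far more capable model, but the shape of the problem is the same: labeled examples go in, and a rule for labeling new examples comes out.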

Now let’s imagine another scenario, one in which we give the computer no training and don’t even tell it that all of these unfamiliar handwritten symbols are numerals. This is “unsupervised” machine learning—the computer has to figure out what it’s looking at on its own. If the computer does its job, it will be able to figure out that it’s dealing with 10 distinct symbols, each of which might have a bunch of slight variations. Even so, it should hopefully be able to put the symbols into the appropriate categories—everything that looks like a “1” goes into one bucket, all of the “2”s go into another, and so forth.

In this example, humans would do just as well as computers, if not better. We would notice that there were 10 symbols (even if we didn’t know what they meant), and we could sort them ourselves, assuming we had enough time and patience. Unsupervised learning gets more interesting when the machines find patterns we could never identify ourselves. Most of the top contenders for the Netflix prize, for example, didn’t build their recommendation engines using preconceived ideas of genre and taste in film. They just trained a computer to look for whatever patterns showed up, no matter how unexpected or obscure.


Most of the rest of Stanford’s machine-learning course is devoted to learning how to write these sorts of algorithms. I don’t yet have the programming chops to write my own unsupervised code without the direct supervision of the instructors, but I’ve started thinking about different subject areas where such programs would be illuminating.

Let’s say you were asked to take your favorite sport and divide the teams into two categories based on their style of play, ignoring structural groupings like leagues and divisions. As a baseball fan, my instinct would be to divide the major leagues into teams that emphasize pitching and those that focus on hitting. A football fan might divide the NFL into running teams and passing teams, and a basketball watcher could group the NBA into teams that run the floor and those that play at a slow pace.

Once we make it three categories, those obvious dichotomies are no longer as useful. Perhaps you could divide Major League Baseball into young teams, middle-aged teams, and ballclubs full of grizzled veterans. OK, now let’s ratchet up the assignment to five categories, or 10, or 20. Now would be a good time to stock up on graph paper.

To figure out how to divide baseball teams into five categories, I would feed the computer a huge mess of data (say, 100 different statistical categories for each team) and let it group the teams the same way it would group handwritten numbers, by searching for similarities in the data. If there is a way to split teams into five groups, the machine will find it.
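Here is a rough sketch of what that grouping could look like in Python, using k-means, one common clustering algorithm. The post does not say which algorithm the class used, so treat the choice as an assumption. The team names and numbers are invented (two statistics for six teams rather than 100 statistics for 30), and the function names are hypothetical. To keep the result repeatable, the sketch picks its starting points deterministically instead of at random.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    return [sum(column) / len(points) for column in zip(*points)]


def farthest_point_init(rows, k):
    """Start with the first row, then repeatedly add the row that is
    farthest from every starting point chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(
            rows, key=lambda r: min(squared_distance(r, c) for c in centroids)
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(rows, k, max_iterations=100):
    """Group rows into k clusters. No labels are supplied anywhere."""
    centroids = farthest_point_init(rows, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each row joins its nearest group center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # nothing moved, so the grouping has settled
        labels = new_labels
        # Update step: each center moves to the average of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = mean_point(members)
    return labels, centroids


# Invented numbers: [runs scored per game, runs allowed per game].
teams = {
    "Team A": [5.4, 5.0],
    "Team B": [3.9, 3.5],
    "Team C": [5.2, 5.1],
    "Team D": [4.0, 3.4],
    "Team E": [5.3, 4.8],
    "Team F": [3.8, 3.6],
}

rows = list(teams.values())
labels, centroids = kmeans(rows, k=2)

groups = {}
for name, label in zip(teams, labels):
    groups.setdefault(label, []).append(name)
print(groups)
```

Two caveats apply. Real statistics live on very different scales (batting averages versus home run totals), so they would need to be rescaled before distances between teams mean anything. And k-means returns exactly as many groups as it is asked for, whether or not the groups are meaningful, so judging whether a five-way split tells us anything is still up to the person reading the output.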

Until this week, this class had dealt primarily with cases where we wanted the computer to help us guess some predetermined piece of information. This was all interesting, but the goal was a little too practical for me. I wanted to take this course to develop a better understanding of how machines learn. This week helped satisfy that curiosity immensely, to the point that I think a lesson on unsupervised learning should come earlier in future semesters. For whatever reason, it’s innately human to want to categorize things. Learning how machines can help us do that, and without any of our biases and blind spots, is tremendously exciting.


