Blogging the Stanford Machine Learning Class

How Do You Teach a Computer How To Think On Its Own?
Dec. 6 2011 1:45 PM


Teaching a computer how to think on its own.

Derek Jeter.
If there is a way to split baseball teams into five groups, the machine will find it

Photograph by Nick Laham/Getty Images.

A few weeks ago, I mentioned that we were working on algorithms that could identify handwritten numbers with 97 percent accuracy. This is a classic example of a “supervised” learning problem: We’re telling the computer ahead of time what it’s looking for—one of 10 symbols—and helping it train by feeding it data in which the correct answer has already been provided.

Now let’s imagine another scenario, one in which we give the computer no training and don’t even tell it that all of these unfamiliar handwritten symbols are numerals. This is “unsupervised” machine learning—the computer has to figure out what it’s looking at on its own. If the computer does its job, it will be able to figure out that it’s dealing with 10 distinct symbols, each of which might have a bunch of slight variations. Even so, it should hopefully be able to put the symbols into the appropriate categories—everything that looks like a “1” goes into one bucket, all of the “2”s go into another, and so forth.

In this example, humans would do just as well as computers, if not better. We would notice that there were 10 symbols (even if we didn’t know what they meant), and we could sort them ourselves, assuming we had enough time and patience. Unsupervised learning gets more interesting when the machines find patterns we could never identify ourselves. Most of the top contenders for the Netflix Prize, for example, didn’t build their recommendation engines using preconceived ideas of genre and taste in film. They just trained a computer to look for whatever patterns showed up, no matter how unexpected or obscure.

Most of the rest of Stanford’s machine-learning course is devoted to learning how to write these sorts of algorithms. I don’t yet have the programming chops to write my own unsupervised code without the direct supervision of the instructors, but I’ve started thinking about different subject areas where such programs would be illuminating.

Let’s say you were asked to take your favorite sport and divide the teams into two categories based on their style of play, ignoring structural groupings like leagues and divisions. As a baseball fan, my instinct would be to divide the major leagues into teams that emphasize pitching and those that focus on hitting. A football fan might divide the NFL into running teams and passing teams, and a basketball watcher could group the NBA into teams that run the floor and those that play at a slow pace.

Once we make it three categories, those obvious dichotomies are no longer as useful. Perhaps you could divide Major League Baseball into young teams, middle-aged teams, and ballclubs full of grizzled veterans. OK, now let’s ratchet up the assignment to five categories, or 10, or 20. Now would be a good time to stock up on graph paper.

To figure out how to divide baseball teams into five categories, I would feed the computer a huge mess of data (say, 100 different statistical categories for each team) and let it group the teams the same way it would group handwritten numbers, by searching for similarities in the data. If there is a way to split teams into five groups, the machine will find it.
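For the curious, here is a rough sketch of what that grouping could look like in Python. It uses k-means, one common clustering algorithm; the post doesn't commit to a particular method, so treat that choice as an assumption. The 30 teams and their 100 statistics are made-up numbers with five styles deliberately baked in, and the helper names (`sq_dist`, `farthest_first`, `kmeans`) are purely illustrative. Picking starting centers that are far apart is a simplification chosen for this sketch, not a standard part of the algorithm.

```python
import random

def sq_dist(a, b):
    """Squared straight-line distance between two rows of statistics."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Pick k starting centers that are spread out (a choice made for this sketch)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Take the point whose nearest existing center is farthest away.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    return centers

def kmeans(points, k, max_iter=100):
    """Group points into k clusters by alternating two steps until nothing changes."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 1: assign every team to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Step 2: move each center to the average of the teams assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Made-up data: 30 hypothetical teams, 100 statistics each, with five
# hidden "styles" planted so there is something for the machine to find.
rng = random.Random(0)
N_TEAMS, N_STATS, K = 30, 100, 5
hidden_style = [i % K for i in range(N_TEAMS)]
teams = [[10.0 * hidden_style[i] + rng.gauss(0, 1) for _ in range(N_STATS)]
         for i in range(N_TEAMS)]

labels, centers = kmeans(teams, K)
for g in range(K):
    # Cluster numbers are arbitrary; only the membership matters.
    print("group", g, [i for i, lab in enumerate(labels) if lab == g])
```

The computer is never told which style a team belongs to; it only sees the rows of numbers. Two things would need care with real statistics. The columns should be put on a comparable scale first, since otherwise a stat measured in the hundreds would swamp one measured in decimals. And k-means hands back five groups whenever you ask for five, so it is still up to a human to judge whether the groups mean anything.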

Until this week, this class had dealt primarily with cases where we wanted the computer to help us guess some predetermined piece of information. This was all interesting, but the goal was a little too practical for me. I wanted to take this course to develop a better understanding of how machines learn. This week helped satisfy that curiosity immensely, to the point that I think a lesson on unsupervised learning should come earlier in future semesters. For whatever reason, it’s innately human to want to categorize things. Learning how machines can help us do that, and without any of our biases and blind spots, is tremendously exciting.

