Kids Understand the Probabilities of Possibilities Better Than Adults Do

Oct. 11 2012 3:30 AM

Why Your 4-Year-Old Is As Smart as Nate Silver

And if kids are so smart, why are adults so stupid about statistics?

How do children learn so much?

Photograph by Brand X Pictures/Thinkstock.

Everyone who spends time with children knows how incredibly much they learn. But how can babies and young children possibly learn so much so quickly? In a recent article in Science, I describe a promising new theory about how babies and young children learn and the slew of research that supports it. The idea is that kids learn by thinking like Nate Silver, the polling analyst extraordinaire at the New York Times.

I suspect that most people who, like me, obsessively click his FiveThirtyEight blog throughout the day think of Nate as a semi-divine oracle who can tell you whether your electoral prayers will be answered. But there is a very particular kind of science behind Nate’s predictions. He uses what’s called Bayesian modeling, named after the Rev. Thomas Bayes, an 18th-century mathematician. The latest studies show that kids, at least unconsciously, are Bayesians, too.

The Bayesian idea is simple, but it turns out to be very powerful. It’s so powerful, in fact, that computer scientists are using it to design intelligent learning machines, and more and more psychologists think that it might explain human intelligence. Bayesian inference is a way to use statistical data to evaluate hypotheses and make predictions. These might be scientific hypotheses and predictions or everyday ones. If you’re Nate, they could be about whether 50- to 60-year-old voters in suburban Iowa prefer Obama or Romney for president. If you’re a 1-year-old, they could be about whether Mom would prefer to eat Goldfish crackers or raw broccoli for a snack. (In my lab, we showed that children learn about such preferences between the ages of 14 and 18 months.)


Here’s a simple bit of Bayesian election thinking. In early September, the polls suddenly improved for Obama. It could be because the convention inspired and rejuvenated Democrats. Or it could be because Romney’s overly rapid response to the Benghazi attack turned out to be a political gaffe. Or it could be because liberal pollsters deliberately manipulated the results. How could you rationally decide among those hypotheses?

Well, if the pollsters deliberately manipulated the results, that would certainly lead to a change in the numbers—in Bayesian terms, there is a high likelihood that the poll numbers will change given deliberate manipulation. If the convention was inspiring, that also usually leads to a rise in the polls—there’s also a high likelihood that the polls will change given a successful convention. It turns out that gaffes, though, especially foreign-policy gaffes, rarely lead to changes in the polls—not a high likelihood there. So conventions and manipulation are more likely to lead to changes in the polls than gaffes are.

On the other hand, it’s much less likely to begin with that the pollsters deliberately altered the polls than it is that the convention was inspiring or that Romney indeed made a gaffe (at least, unless you’ve been watching Fox News). This is what Bayesians call “the prior”—how likely you think the hypotheses are before you even consider the new data.

Combining your prior beliefs about the hypotheses and the likelihood of the data can help you (or Nate) sort through the possibilities. In this case, the inspiring convention idea is both likely to begin with and likely to have led to the change in the polls, so it wins out over the other two. And once you have that hypothesis, you can make predictions. For example, if the poll increase really was the result of the convention, you might predict that it would fade as the convention recedes.
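To make the arithmetic behind that comparison concrete, here is a minimal sketch of the prior-times-likelihood calculation for the three polling hypotheses. Every number in it is a made-up illustration, not a figure from any poll or from FiveThirtyEight, and the hypothesis labels and the `posterior` function are hypothetical names chosen for this example.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: posterior is proportional to prior * likelihood."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    # Divide by the total so the posterior probabilities sum to 1.
    return {h: value / total for h, value in unnormalized.items()}


# ASSUMED priors: how plausible each hypothesis seems before seeing the polls.
priors = {"convention": 0.50, "gaffe": 0.45, "manipulation": 0.05}

# ASSUMED likelihoods: how probable a poll bump is if each hypothesis is true.
likelihoods = {"convention": 0.80, "gaffe": 0.10, "manipulation": 0.90}

result = posterior(priors, likelihoods)
for hypothesis, probability in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {probability:.2f}")
```

With these assumed inputs the convention hypothesis comes out far ahead: it is the only one that scores well on both the prior and the likelihood. Manipulation has a high likelihood but a tiny prior, and the gaffe has a decent prior but a low likelihood, so each ends up with a small share.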

Bayesian reasoning lets you combine these kinds of information and draw conclusions in a precise mathematical way. Bayesian reasoning is always about probabilities, not certainties. If logical deduction gives you proofs of truths, Bayesian inference tells you the probabilities of possibilities. You won’t jump to an unlikely hypothesis right away. Still, if enough data accumulate, even an initially unlikely hypothesis can turn out to be right. The “47 percent” gaffe, unlike the Benghazi gaffe, really did seem to move the numbers, but it took especially strong and convincing data to get Nate to draw that conclusion.

It turns out that even very young children reason in this way. For example, my student Tamar Kushnir, now at Cornell, and I showed 4-year-olds a toy and told them that blocks made it light up. Then we gave the kids a block and asked them how to make the toy light up. Almost all the children said you should put the block on the toy—they thought, sensibly, that touching the toy with the block was very likely to make it light up. That hypothesis had a high “prior.”

Then we showed the 4-year-olds that putting a block right on the toy did indeed make it light up, but only two out of six times. Waving a block over the top of the toy, by contrast, lit it up two out of three times. Then we simply asked the kids to make the toy light up.
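One way to see how this evidence could pull against the initial hunch is a small Beta-Binomial sketch. This is an illustrative model swapped in for this example, not the analysis used in the study. The prior "pseudo-counts" are assumptions chosen to favor touching over waving, and `posterior_mean` is a hypothetical helper name. Only the observed counts (two of six, two of three) come from the text.

```python
from fractions import Fraction


def posterior_mean(prior_successes, prior_failures, successes, failures):
    """Posterior mean success rate under a Beta prior and binomial data."""
    a = prior_successes + successes
    b = prior_failures + failures
    return Fraction(a, a + b)


# ASSUMED prior pseudo-counts: touching starts out looking more promising
# (prior mean 2/3) than waving (prior mean 1/3).
# Observed data from the experiment: touching worked 2 of 6 times,
# waving worked 2 of 3 times.
touch = posterior_mean(2, 1, successes=2, failures=4)
wave = posterior_mean(1, 2, successes=2, failures=1)

print("touch:", touch, "wave:", wave)
print("better bet:", "wave" if wave > touch else "touch")
```

Under these assumed priors, a handful of trials is enough to move the estimated success rate for waving slightly above the one for touching, even though touching started out as the favored hypothesis. A stronger prior for touching would need more data to overturn.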