Connecting the Dots, Missing the Story

With Big Data, the government doesn’t need to know the “why” behind anything.

June 24, 2013, 7:45 AM

In this image released by the FBI on April 19, two suspects in the Boston Marathon bombing walk near the marathon finish line on April 15.

Could Big Data have prevented 9/11? Perhaps. Dick Cheney, for one, seems to think so. But let’s consider another, far more provocative question: What if 9/11 happened today, in the era of Big Data, making it all but inevitable that all 19 hijackers had extensive digital histories?

It used to be that one’s propensity for terrorism was measured in books or sermons. Today, it’s measured in clicks. It’s not that books or sermons no longer matter (they still do); it’s just that today they are consumed digitally, in a way that leaves a trail. And that trail allows us to establish patterns. Are the books you bought on Amazon today more radical than the books you bought last month? If so, you might be a person of interest.

The Tsarnaev brothers, who allegedly bombed the Boston Marathon earlier this year, belong to this new breed of terrorists. The brothers felt at home in the world of Twitter and YouTube. And some of the videos reportedly favorited by Tamerlan, the older brother, are clearly of an extremist nature. Had someone been analyzing the brothers’ viewing habits in real time, a great tragedy might have been averted.

The good news—at least to Big Data proponents—is that we don’t need to understand what any of these clicks or videos mean. We just need to establish some relationship between the unknown terrorists of tomorrow and the established terrorists of today. If the terrorists we do know have a penchant for, say, hummus, then we might want to apply extra scrutiny to anyone who’s ever bought it—without ever developing a hypothesis as to why the hummus is so beloved. (In fact, for a brief period of time in 2005 and 2006, the FBI, hoping to find some underground Iranian terrorist cells, did just that: They went through customer data collected by grocery stores in the San Francisco area searching for sales records of Middle Eastern food.)

The great temptation of Big Data is that we can stop worrying about comprehension and focus on preventive action instead. Instead of wasting precious public resources on understanding the “why” (that is, exploring why terrorists become terrorists), one can focus on predicting the “when” so that a timely intervention can be made. And once someone has been identified as a suspect, it’s wise to get to know everyone in his social network: Catching just one Tsarnaev brother early on might not have stopped the Boston bombing. Thus, one is simply better off recording everything: You never know when it might be useful.

Gus Hunt, the chief technology officer of the CIA, said as much earlier this year. “The value of any piece of information is only known when you can connect it with something else that arrives at a future point in time,” he said at a Big Data conference. Thus, “since you can’t connect dots you don’t have … we fundamentally try to collect everything and hang on to it forever.” The end of theory, which Chris Anderson predicted in Wired a few years ago, has reached the intelligence community: Just as Google doesn’t need to know why some sites get more links from other sites, and so secure a better place in its search results, the spies do not need to know why some people behave like terrorists. Acting like a terrorist is good enough.

As the media academic Mark Andrejevic points out in Infoglut, his new book on the political implications of information overload, there is an immense but mostly invisible cost to the embrace of Big Data by the intelligence community (and by just about everyone else in both the public and private sectors). That cost is the devaluation of individual and institutional comprehension, epitomized by our reluctance to investigate the causes of actions and our eagerness to jump straight to dealing with their consequences. But, argues Andrejevic, while Google can afford to be ignorant, public institutions cannot.

“If the imperative of data mining is to continue to gather more data about everything,” he writes, “its promise is to put this data to work, not necessarily to make sense of it. Indeed, the goal of both data mining and predictive analytics is to generate useful patterns that are far beyond the ability of the human mind to detect or even explain.” In other words, we don’t need to inquire why things are the way they are as long as we can affect them to be the way we want them to be. This is rather unfortunate. The abandonment of comprehension as a useful public policy goal would make serious political reforms impossible.

Forget terrorism for a moment. Take more mundane crime. Why does crime happen? Well, you might say that it’s because youths don’t have jobs. Or you might say that it’s because the doors of our buildings are not fortified enough. Given some limited funds to spend, you can either create yet another national employment program or you can equip houses with even better cameras, sensors, and locks. What should you do?

If you’re a technocratic manager, the answer is easy: Embrace the cheapest option. But what if you are that rare breed, a responsible politician? Just because some crimes have now become harder to commit doesn’t mean that the previously unemployed youths have finally found employment. Surveillance cameras might reduce crime (though the evidence here is mixed), but no studies show that they result in greater happiness for everyone involved. The unemployed youths are still as stuck as they were before, except that now, perhaps, they displace their anger onto one another. On this reading, fortifying our streets without inquiring into the root causes of crime is a self-defeating strategy, at least in the long run.

Big Data is very much like the surveillance camera in this analogy: Yes, it can help us avoid occasional jolts and disturbances and, perhaps, even stop the bad guys. But it can also blind us to the fact that the problem at hand requires a more radical approach. Big Data buys us time, but it also gives us an illusion of mastery.

We can draw a distinction here between Big Data (the stuff of numbers that thrives on correlations) and Big Narrative (a story-driven, anthropological approach that seeks to explain why things are the way they are). Big Data is cheap where Big Narrative is expensive. Big Data is clean where Big Narrative is messy. Big Data is actionable where Big Narrative is paralyzing.

The promise of Big Data is that it allows us to avoid the pitfalls of Big Narrative. But this is also its greatest cost. With an extremely emotional issue such as terrorism, it’s easy to believe that Big Data can do wonders. But once we move to more pedestrian issues, it becomes obvious that the supertool it’s made out to be is a rather feeble instrument that tackles problems quite unimaginatively and unambitiously. Worse, it prevents us from having many important public debates.

As Band-Aids go, Big Data is excellent. But Band-Aids are useless when the patient needs surgery. In that case, trying to use a Band-Aid may result in amputation. This, at least, is the hunch I drew from Big Data.

This article arises from Future Tense, a collaboration among Arizona State University, the New America Foundation, and Slate. Future Tense explores the ways emerging technologies affect society, policy, and culture.