How the NSA Does "Social Network Analysis"
It's like the Kevin Bacon game.
Last Thursday, USA Today reported that the NSA has been collecting the phone records of millions of Americans. The agency is apparently using "data mining" techniques to scour these records for connections between terrorists. According to an intelligence official interviewed by USA Today, the NSA is analyzing this data using "social network analysis." What's social network analysis?
A technique to map and study the relationships between people or groups. The basic concept of the social network is familiar to anyone who has used Friendster or played Six Degrees of Kevin Bacon. Social network analysis formalizes this parlor game, using details about the network to interpret the role of each person or group.
In a basic analysis, people are seen as "nodes" and the relationships between them are "links." By studying the links—in the case of the NSA program, telephone calls—it's possible to determine the importance (or "centrality") of each node.
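To make the node-and-link picture concrete, here is a minimal Python sketch. It assumes call records arrive as simple (caller, callee) pairs and that a call in either direction counts as one link; the function name build_network and the sample records are invented for illustration and say nothing about how the NSA actually stores or processes its data.

```python
from collections import defaultdict


def build_network(calls):
    """Turn (caller, callee) pairs into a map from each node to its linked nodes.

    Assumption: a call in either direction creates one undirected link,
    and repeated calls between the same two people count once.
    """
    links = defaultdict(set)
    for caller, callee in calls:
        if caller == callee:
            continue  # ignore a number calling itself
        links[caller].add(callee)
        links[callee].add(caller)
    return dict(links)


# Hypothetical call records, not real data.
calls = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("A", "B")]
network = build_network(calls)
```

In this toy network, C is linked to A, B and D, while D is linked only to C.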
There are several ways to determine which members of a network are important. The most straightforward technique is to figure out a member's "degree," or the number of direct connections he has to other members of the network. With groups that are decentralized and complex, like terrorist cells, other measures of centrality are important as well. Network analysts also study the "betweenness" and the "closeness" of members. A member with relatively few direct connections could still be important because he serves as a connector between two large groups (high betweenness). A member might also be important because his links, direct and indirect, put him closest to all other members of the group (high closeness; i.e., he has to go through fewer intermediaries to reach other members than anyone else).
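The three measures can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not a description of any agency's tooling: the network is undirected, unweighted and connected, closeness is taken as the number of other members divided by the total number of steps needed to reach them, and betweenness is counted with the standard shortest-path approach (Brandes' algorithm). The toy network, in which "x" is the only link between two tight groups, is invented.

```python
from collections import deque


def degree(links):
    """Number of direct connections for each member."""
    return {node: len(neighbors) for node, neighbors in links.items()}


def closeness(links):
    """Other reachable members divided by total steps to reach them."""
    scores = {}
    for start in links:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in links[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        scores[start] = (len(dist) - 1) / total if total else 0.0
    return scores


def betweenness(links):
    """Count of shortest paths between other members that pass through each member."""
    scores = dict.fromkeys(links, 0.0)
    for source in links:
        order = []                         # nodes in the order BFS reaches them
        preds = {n: [] for n in links}     # predecessors on shortest paths
        paths = dict.fromkeys(links, 0)    # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in links[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        credit = dict.fromkeys(links, 0.0)
        for w in reversed(order):          # work back from the farthest nodes
            for v in preds[w]:
                credit[v] += paths[v] / paths[w] * (1 + credit[w])
            if w != source:
                scores[w] += credit[w]
    # Each pair was counted once from each end, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical network: two tight groups joined only through "x".
links = {
    "a1": {"a2", "a3", "x"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2"},
    "x": {"a1", "b1"},
    "b1": {"b2", "b3", "x"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
}
deg = degree(links)
close = closeness(links)
between = betweenness(links)
```

Here "x" has only two direct connections, fewer than "a1" or "b1", yet it scores highest on both betweenness and closeness because every path between the two groups runs through it. That is the connector role a degree count alone would miss.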
Network analyst Valdis Krebs set out to prove after 9/11 that network analysis could help uncover terrorist cells. Working from publicly available information, Krebs showed that all 19 hijackers were within two connections of the al-Qaida members the CIA knew about in early 2000. Krebs' network map also showed that Mohamed Atta was a central figure. It's not clear whether such analysis could have been performed in advance, when researchers wouldn't have been certain which links were significant connections to another terrorist and which were casual connections to an acquaintance. (There is also controversy surrounding reports that a U.S. military unit called Able Danger used network analysis to identify Mohamed Atta before 9/11.)
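The "within two connections" idea amounts to a breadth-first search that stops after a fixed number of steps. The Python sketch below is only an illustration of that idea, not a reconstruction of Krebs' method: the function name within_hops, the node labels and the links are all hypothetical. It also shows the limitation noted above, since the search treats every link alike and cannot tell a significant connection from a casual one.

```python
from collections import deque


def within_hops(links, seeds, max_hops):
    """Return every node reachable from any seed in at most max_hops steps,
    mapped to the number of steps needed to reach it."""
    hops = {seed: 0 for seed in seeds if seed in links}
    queue = deque(hops)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not look past the hop limit
        for nb in links[node]:
            if nb not in hops:
                hops[nb] = hops[node] + 1
                queue.append(nb)
    return hops


# Hypothetical network; "known_1" and "known_2" stand in for already-known members.
links = {
    "known_1": {"p1"},
    "known_2": {"p1", "p2"},
    "p1": {"known_1", "known_2", "p3"},
    "p2": {"known_2"},
    "p3": {"p1", "p4"},
    "p4": {"p3"},
}
nearby = within_hops(links, ["known_1", "known_2"], max_hops=2)
```

In this example p1 and p2 are one connection away from a known member and p3 is two away, while p4, at three connections, falls outside the search.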
There are two schools of thought on whether large data sets—like the NSA's database of phone records—help or hinder network analysis. One group argues that adding lots of innocuous data (the phone calls of ordinary Americans) will cloud the picture, and that it's better to construct a network by looking only at calls made to and from known terrorists. Another group maintains that large data sets are useful in establishing a "baseline" of normal behavior.
Social network analysis has been used in many areas besides terrorist surveillance. Google's PageRank system is based on network theory and the concept of "centrality." Doctors use network analysis to track the spread of HIV. And some academics have applied network theory to Enron's e-mail records in an attempt to understand relationships within the company.
Alexander Dryer works for The New Yorker in Washington, D.C.