Using Metadata to Find Paul Revere

The state of the universe.
June 10 2013 4:22 PM

Using Metadata to Find Paul Revere

“Social Networke Analysis,” a small encroachment on freedom, identifies terrorists in the Colonies.

Portrait of Paul Revere by John Singleton Copley, oil on canvas, 1768.
Portrait of Paul Revere by John Singleton Copley, 1768

Courtesy of Museum of Fine Arts Boston/Wikipedia

London, 1772.

I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the newfangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty’s subjects. This is in connection with the discussion of the role of “metadata” in certain recent events and the assurances of various respectable parties that the government was merely “sifting through this so-called metadata” and that the “information acquired does not include the content of any communications”. I will show how we can use this metadata to find key persons involved in terrorist groups operating within the Colonies at the present time. I shall also endeavour to show how these methods work in what might be called a relational manner.

The analysis in this report is based on information gathered by our field agent Mr David Hackett Fischer and published in an Appendix to his lengthy report to the government. As you may be aware, Mr Fischer is an expert and respected field Agent with a broad and deep knowledge of the Colonies. I, on the other hand, have made my way from Ireland with just a little quantitative training—I placed several hundred rungs below the Senior Wrangler during my time at Cambridge—and I am presently employed as a junior analytical scribe at ye olde National Security Agency. Sorry, I mean the Royal Security Administration. And I should emphasize again that I know nothing of current affairs in the Colonies. However, our current 18th century beta of PRISM has been used to collect and analyze information on more than two hundred and sixty persons (of varying degrees of suspicion) belonging variously to seven different organizations in the Boston area.

Advertisement

Rest assured that we only collected metadata on these people, and no actual conversations were recorded or meetings transcribed. All I know is whether someone was a member of an organization or not. Surely this is but a small encroachment on the freedom of the Crown’s subjects. I have been asked, on the basis of this poor information, to present some names for our field agents in the Colonies to work with. It seems an unlikely task.

Here is what the data look like.

130610_SCI_Composite1

The organizations are listed in the columns and the names in the rows. As you can see, membership is represented by a 1. So this Samuel Adams person (whoever he is) belongs to the North Caucus, the Long Room Club, the Boston Committee, and the London Enemies List. I must say, these organizational names sound rather belligerent.

Anyway, what can get from these meagre metadata? This table is large and cumbersome. I am a pretty low-level operative at ye olde RSA, so I have to keep it simple. My superiors, I am quite sure, have far more sophisticated analytical techniques at their disposal. I will simply start at the very beginning and follow a technique laid out in a beautiful paper by my brilliant former colleague, Mr Ron Breiger, called “The Duality of Persons and Groups.” He wrote it as a graduate student at Harvard, some thirty-five years ago. (Harvard, you may recall, is what passes for a university in the Colonies. No matter.) The paper describes what we now think of as a basic way to represent information about links between people and some other kind of thing, like attendance at various events or membership in various groups. The foundational papers in this new science of social networke analysis, in fact, are almost all about what you can tell about people and their social lives based on metadata only, without much reference to the actual content of what they say.

Mr Breiger’s insight was that our table of 254 rows and seven columns is an adjacency matrix, and that a bit of matrix multiplication can bring out information that is in the table but perhaps hard to see. Take this adjacency matrix of people and groups and transpose it—that is, flip it over on its side, so that the rows are now the columns and vice versa. Now we have two tables, or matrices, one showing “People by Groups” and the other “Groups by People”. Call the first one the adjacency matrix A and the second one its transpose, AT. Now, as you will recall there are rules for multiplying matrices together. If you multiply out A(AT), you will get a big matrix with 254 rows and 254 columns. That is, it will be a 254x254 “Person by Person” matrix, where both the rows and columns are people (in the same order) and the cells show the number of organizations any particular pair of people both belonged to. Is that not marvelous? I have always thought this operation is somewhat akin to magick, especially as it involves moving one hand down and the other one across in a manner not wholly removed from an incantation.

I cannot show you the whole Person by Person matrix, because I would have to kill you. I jest, I jest! It is just because it is rather large. But here is a little snippet of it. At this point in the 18th century, a 254x254 matrix is what we call Bigge Data. I have an upcoming EDWARDx talk about it. You should come. Anyway:

130610_SCI_Table2

You can see here that Mr Appleton and Mr John Adams were connected through both being a member of one group, while Mr John Adams and Mr Samuel Adams shared memberships in two of our seven groups. Mr Ash, meanwhile, was not connected through organization membership to any of the first four men on our list. The rest of the table stretches out in both directions.

Notice again, I beg you, what we did there. We did not start with a “social networke” as you might ordinarily think of it, where individuals are connected to other individuals. We started with a list of memberships in various organizations. But now suddenly we do have a social network of individuals, where a tie in the network is defined by co-membership in an organization. This is a powerful trick.

We are just getting started, however. A thing about multiplying matrices is that the order matters. It is not like multiplying two numbers. If instead of multiplying A(AT) we put the transposed matrix first, and do AT(A), then we get a different result. This time, the result is a 7x7 “Organization by Organization” matrix, where the numbers in the cells represent how many people each organization has in common. Here’s what that looks like. Because it is small, we can see the whole table. 

130610_SCI_Composite2

Again, interesting! (I beg to venture.) Instead of seeing how (and which) people are linked by their shared membership in organizations, we see which organizations are linked through the people that belong to them both. People are linked through the groups they belong to. Groups are linked through the people they share. This is the “duality of persons and groups” in the title of Mr Breiger’s article.

TODAY IN SLATE

Politics

The Irritating Confidante

John Dickerson on Ben Bradlee’s fascinating relationship with John F. Kennedy.

My Father Invented Social Networking at a Girls’ Reform School in the 1930s

Renée Zellweger’s New Face Is Too Real

Sleater-Kinney Was Once America’s Best Rock Band

Can it be again?

The All The President’s Men Scene That Captured Ben Bradlee

Medical Examiner

Is It Better to Be a Hero Like Batman?

Or an altruist like Bruce Wayne?

Technology

Driving in Circles

The autonomous Google car may never actually happen.

The World’s Human Rights Violators Are Signatories on the World’s Human Rights Treaties

How Punctual Are Germans?

  News & Politics
Politics
Oct. 22 2014 12:44 AM We Need More Ben Bradlees His relationship with John F. Kennedy shows what’s missing from today’s Washington journalism.
  Business
Moneybox
Oct. 21 2014 5:57 PM Soda and Fries Have Lost Their Charm for Both Consumers and Investors
  Life
The Vault
Oct. 21 2014 2:23 PM A Data-Packed Map of American Immigration in 1903
  Double X
The XX Factor
Oct. 21 2014 3:03 PM Renée Zellweger’s New Face Is Too Real
  Slate Plus
Behind the Scenes
Oct. 21 2014 1:02 PM Where Are Slate Plus Members From? This Weird Cartogram Explains. A weird-looking cartogram of Slate Plus memberships by state.
  Arts
Brow Beat
Oct. 21 2014 9:42 PM The All The President’s Men Scene That Perfectly Captured Ben Bradlee’s Genius
  Technology
Technology
Oct. 21 2014 11:44 PM Driving in Circles The autonomous Google car may never actually happen.
  Health & Science
Climate Desk
Oct. 21 2014 11:53 AM Taking Research for Granted Texas Republican Lamar Smith continues his crusade against independence in science.
  Sports
Sports Nut
Oct. 20 2014 5:09 PM Keepaway, on Three. Ready—Break! On his record-breaking touchdown pass, Peyton Manning couldn’t even leave the celebration to chance.