How future historians will use the Library of Congress' Twitter archives.

April 20 2010 7:03 PM

#Posterity

How future historians will use the Twitter archives.

The Library of Congress

Among the many criticisms of Twitter, the most common by far is that no one cares what you ate for breakfast.

In fact, quite a few people care. "I actually think it's very useful," says Paul Freedman, a professor at Yale University who studies the history of food. For him, a 140-character ode to your KFC Double Down—along with the worshipful photo you took before devouring it—could be a priceless historical document. "Historians are interested in ordinary life," Freedman says. "And Twitter is an incredible resource for ordinary life."

Hence the decision by the Library of Congress last week to store the complete archives of Twitter. Starting six months from now, every last tweet—currently produced at a rate of 50 million a day—will be saved on an LoC hard drive and will presumably be accessible to historians for … well, forever.

Digital archiving isn't anything new. A nonprofit digital library called the Internet Archive started collecting snapshots of the World Wide Web in 1996. University libraries regularly scan their research collections to make them accessible on the Web. Google Books is currently scanning the books of at least 20 major research libraries.

But the decision to archive Twitter takes digital preservation to a new level of detail. In the past, all archives, even digital ones, had to be selective. The Internet Archive doesn't preserve every last byte of the Web—only the seemingly important parts. The Twitter archive, by contrast, will be mind-numbingly complete. Everything from reactions to the uprising in Iran to Robert Gibbs' first tweet to your roommate's two-sentence analysis of Hot Tub Time Machine will be saved for posterity. Which is, from a historian's perspective, historic. Now that we've started logging all the stray thoughts hurled into cyberspace, the prospect of recording every last word ever published—to paraphrase archivist Brewster Kahle, we're "one-upping the Greeks"—doesn't seem especially crazy.

The question is, does the preservation of digital content, from tweets to Facebook updates to blog comments, make the job of historians easier or harder?

The answer is: both. On the one hand, there's more useful information for historians to sift. On the other, there's more useless information. And without the benefit of hindsight, it's impossible to tell which is which. It's like what John Wanamaker supposedly said about advertising: He knew half of it was wasted, he just didn't know which half.

The trick will be organization. Hashtags—the #-prefixed labels people use to group tweets into discussion threads, such as #ashtag for the Iceland volcano cloud and #snowpocalypse for the February snowstorm that swept Washington, D.C.—are a start. But many tweeters don't bother to tag their posts. Historians will probably be able to search by keyword. But that can lead them astray, too. How do you know if someone is complaining about the windows in their house or the Windows on their computer?

Data-mining has become sophisticated enough to make these distinctions based on context. Sometimes that means looking at other keywords surrounding a keyword. (If the word "laptop" appears near "Windows," for example, the author is probably talking about software.) It could also mean looking at metadata—when the tweet was sent, where it was sent from, whom the person is following and vice versa. Twitter has no plans to share public metadata with the LoC, but a spokesman says it would be "open to discussing this with them."
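As a rough illustration of the keyword-in-context idea, here is a minimal Python sketch. It is not how Twitter, the LoC, or any real data-mining tool works; the function name, the cue-word lists, and the five-word radius are all assumptions made up for the example. It only looks at the first mention of "windows" and counts which kind of cue word appears nearby.

```python
import re

# Hypothetical cue words, chosen by hand for illustration only.
# A real system would learn such associations from large amounts of text.
SOFTWARE_CUES = {"laptop", "computer", "pc", "microsoft", "install", "update"}
HOUSE_CUES = {"house", "glass", "curtains", "draft", "pane", "apartment"}


def classify_windows(tweet, radius=5):
    """Guess whether the first 'windows' in a tweet means software or glass.

    Looks at up to `radius` words on either side of the keyword and
    compares how many software cues and house cues appear there.
    """
    tokens = re.findall(r"[a-z0-9']+", tweet.lower())
    if "windows" not in tokens:
        return "no mention"

    position = tokens.index("windows")
    start = max(0, position - radius)
    context = set(tokens[start:position + radius + 1])

    software_hits = len(context & SOFTWARE_CUES)
    house_hits = len(context & HOUSE_CUES)

    if software_hits > house_hits:
        return "software"
    if house_hits > software_hits:
        return "house"
    return "unclear"


if __name__ == "__main__":
    for text in [
        "Windows crashed on my laptop again",
        "The windows in my house let in a draft",
        "I love windows",
    ]:
        print(text, "->", classify_windows(text))
```

A tweet with no cue words at all comes back as "unclear," which is where metadata such as time, place, and follower relationships would have to do the work.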
