Can a Language Time Machine Change How We View the Past?

Sept. 9 2013 9:45 AM

The Language Time Machine

Google’s Ngram Viewer gave us a new way to explore history, but has it led to any real discoveries?


If academics and researchers are actually using it, that is. Mark Davies, a professor of corpus linguistics at Brigham Young University, and the creator of a corpus of American historical English similar to Google’s work, says his colleagues aren’t using results from the Viewer in published research or presentations. Google Books data, he says, “is not even on the radar for most people. They look at these cute charts and say, all you can do is see a chart for one word—that’s pretty limiting.”

Even though Google recently tagged words by part of speech, there’s no way to check and make sure it labeled words correctly. “In academia, it doesn’t fly to say ‘Trust us, we did it right,’ ” Davies says. Another reason Davies thinks the Viewer hasn’t gained traction in his world: It doesn’t allow for searching by collocates, or words that occur near other words but aren’t adjacent. (The Viewer does allow users to search for words that are next to each other.)

Linguists use collocates to understand how word meanings change over time. Gay, for example, used to be surrounded by color names and party; later, it began to appear near bisexual and marriage. (Technically, researchers can search collocates, but only if they download the underlying Ngram raw data set, and even then, Davies says, it’s a very complicated process.) According to Google Research Manager Jon Orwant, the team is working on making it possible to search for words that are not just adjacent, but nearby.
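To make the idea concrete, here is a minimal sketch of a collocate count in Python. It is not how Davies's corpus or the Ngram Viewer works; the function name `collocates`, the `window` and `min_distance` parameters, and the sample sentence are all assumptions made for illustration. Following the description above, it counts words that fall near a target word but skips the ones directly next to it.

```python
import re
from collections import Counter


def collocates(text, target, window=4, min_distance=2):
    """Count words within `window` tokens of `target`.

    Words closer than `min_distance` tokens are skipped, so the default
    (2) leaves out the target itself and its immediate neighbors.
    """
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if abs(j - i) >= min_distance:
                counts[tokens[j]] += 1
    return counts


# Hypothetical sample text, not drawn from any real corpus.
sample = "a gay and lively party in yellow"
print(collocates(sample, "gay", window=3).most_common())
```

Running the same count on texts from different decades and comparing the most common neighbors is, roughly, how a shift like the one in gay would show up.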


Other academics fall on the opposite end of the spectrum from Davies: They put too much stock in the Viewer and can misinterpret its results. A recent New York Times piece, for example, suggests that an uptick in toddler and similar words in postmodern fiction could signal “growing attention paid to children.”

“But in a dataset where novels are mixed with parenting manuals and cookbooks, it’s really hard to say what that increase tells us about the novel,” Underwood says. Researchers can break down their search by a fiction-related genre, but it’s not restricted to only traditional novels, as Aiden and Michel’s original research paper in Science explains.

Though it’s tempting to make dramatic claims using the data, it may be that the most valuable contribution of the Viewer so far isn’t a seismic cultural discovery. It’s the shift in the way we see, and question, our historical record.

“For me, it’s no question that the broader set of changes associated with [the Viewer and Google data] are changing the way research happens,” Underwood says. “We’re looking at an initial simplified outline of a picture that will get much richer and more interesting as we approach to take a closer look.”

The data will get richer, too, as we learn what to search for—and how to parse it. “People need more training in thinking in terms of questions that are good digital questions,” Davies says.

Aiden and Michel envision a more sophisticated Viewer in the future, one that uses more languages (the current one has data from nine, including both American and British English) and more powerful search functionality. Right now, you can search 22 corpora, or large groups of books, under genres like English fiction, Russian, and Hebrew. But there’s a potential barrier to significant progress: copyright laws. “Basically after the mid-’20s, you can’t really share the full text of most books published,” Aiden says. “I think something has to happen in Congress in order to make these sorts of big data approaches to history move forward.”

There is, of course, one set of data you’ll always be free to search—your own. “If you could apply this technology to contemporary text on Twitter, blogs, or on your own corpus, you could search your own past and see trends in your own life,” Michel says. “That’s going to be possible.”

The Ngram Viewer, it seems, may be a little like reading your first Shakespeare play. It may take a while to adjust to a new syntax and rhythm, and at first it can be confusing. But once you do, the meaning behind that foreign diction begins to reveal itself.

This article arises from Future Tense, a collaboration among Arizona State University, the New America Foundation, and Slate. Future Tense explores the ways emerging technologies affect society, policy, and culture. To read more, visit the Future Tense blog and the Future Tense home page. You can also follow us on Twitter.
