Google Public Data Explorer: Is it the first step toward a universal data format?

Inside the Internet.
Feb. 16 2011 6:00 PM

An HTML for Numbers

Is Google's Public Data Explorer the first step toward a universal data format?


The Age of Data is just around the corner, right where it has been for years. As someone who spends a lot of his time creating visualizations, I've been hoping for this day to come for a very long time. "It used to be that you would get stories by chatting to people in bars," Internet godfather (and non-journalist) Tim Berners-Lee declared last year. "But now it's also going to be about poring over data and equipping yourself with the tools to analyze it." Don't buy it? This transfixing eight-part video series from Knight Journalism fellow Geoff McGhee might change your mind. Data isn't just for nerds any more; it's beautiful, alluring, extraordinary.

It's also incredibly hard to work with. The problem with bringing data to journalism isn't convincing writers and editors that it's useful for telling stories; it's the toil required to get the numbers into a usable format. The data is already there, from federal sentencing figures and unemployment rates by county to minute-by-minute Twitter responses to the Black Eyed Peas' smoldering wreckage of a Super Bowl halftime show. The problem is that it all looks different. It is compiled by different people using different programs and represented in different formats. As a result, mashing up data isn't as simple as mashing together two balls of Silly Putty. It's more like trying to plug a bunch of American appliances into outlets in Tbilisi.

In hopes of bridging this data divide, Google is rolling out a tool called Public Data Explorer. While Data Explorer has been around for a while, it's now been extended to allow users to upload and visualize their own data sets. But that's not why Google's effort is important. If you want to make cool visualizations, IBM's Many Eyes offers more than a dozen different ways to display information. (Google currently offers four pretty standard ones.) The exciting news here is that Google is pushing for the adoption of a specific format. Users must upload their data in two files, one for all the numbers and one that describes what those numbers represent. If this feature becomes popular, it will make it a whole lot easier for people and agencies to use one another's data. It's not quite a universal format, but it's a lot closer than anything we have today.
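
To make the two-file idea concrete, here is a minimal Python sketch: one file holds the bare numbers, and a second describes what each column means. It does not reproduce Google's actual upload format. The column names, labels, metadata layout, and values are all invented for illustration.

```python
import csv
import io

# Hypothetical "numbers" file: bare values under terse column names.
# All values here are invented placeholders, not real statistics.
DATA_CSV = """region,year,rate
A,2009,9.1
A,2010,9.6
B,2009,7.4
"""

# Hypothetical "description" file: what each column means and how to read it.
# This layout is an assumption for illustration, not Google's schema.
METADATA = {
    "region": {"label": "Region code", "type": "string"},
    "year": {"label": "Year", "type": "integer"},
    "rate": {"label": "Unemployment rate", "type": "float", "unit": "percent"},
}

# Map the type names used in METADATA to Python conversion functions.
CASTS = {"string": str, "integer": int, "float": float}


def load_described(data_text, metadata):
    """Read the numbers file and apply the description file to every row."""
    rows = []
    for raw in csv.DictReader(io.StringIO(data_text)):
        row = {}
        for column, value in raw.items():
            spec = metadata[column]
            # Replace the terse column name with its human-readable label
            # and convert the text value to the declared type.
            row[spec["label"]] = CASTS[spec["type"]](value)
        rows.append(row)
    return rows


rows = load_described(DATA_CSV, METADATA)
print(rows[0])
```

The point of the split is that anyone holding both files can interpret the numbers without asking the person who compiled them.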


The beauty of the Web—in fact, the reason the Internet can function in the first place—is that it doesn't require intensive training to publish a page in a readable format. Sure, you might have to learn a few HTML tags—or pay an 11-year-old who knows HTML—but it's a simple language that's easy to pick up. There is no equivalent for data. There are plenty of standards for making data readable by a machine, but no single format that everyone can understand and agree on.

While plenty of people have tried to develop a data standard, none of them have been named Google. A promising site called Swivel tried to become a "YouTube for data" a few years ago, but don't go looking for it now. One of Google's greatest powers is its ability to cajole Web developers into playing by the company's rules, in hopes of climbing in the rankings and generally staying in the demigod's good graces. For sure, there are well-developed languages, like XML and JSON, for organizing data in a way computers can understand. While these are great for representing data for a specific purpose, a search engine wouldn't know what to do with my code without extra information from me on what the numbers mean. This is where a standard format becomes essential.

To understand why I'm rooting for Google, consider this brief tale of woe. When I was trying to build a map of job-loss data for Slate, I started with the month-by-month, county-by-county figures from the Bureau of Labor Statistics. This data comes in huge text files with arcane codes—meaningless gibberish unless you have the software and the know-how to match those codes to the names of counties, which live in a different file. At the time, I did this in Excel with a cocktail of Byzantine macros, late nights, and emotional breakdowns.
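
For the curious, the code-matching chore described above amounts to a lookup join between two files. The Python sketch below shows the general technique under stated assumptions: the file layouts, column names, area codes, and figures are all invented placeholders, not the Bureau of Labor Statistics' actual formats or data.

```python
import csv
import io

# Hypothetical figures file: rows keyed by an opaque area code.
# Codes and counts are invented placeholders.
FIGURES_TXT = """area_code\tperiod\tunemployed
X001\t2009-01\t120
X002\t2009-01\t45
X001\t2009-02\t130
"""

# Hypothetical names file: the separate place where codes map to county names.
NAMES_TXT = """area_code\tarea_name
X001\tCounty A
X002\tCounty B
"""


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_names(figures, names):
    """Replace each opaque area code with its county name via a lookup table."""
    lookup = {row["area_code"]: row["area_name"] for row in names}
    return [
        {
            # Codes missing from the names file are flagged rather than dropped.
            "county": lookup.get(row["area_code"], "UNKNOWN"),
            "period": row["period"],
            "unemployed": int(row["unemployed"]),
        }
        for row in figures
    ]


joined = attach_names(read_tsv(FIGURES_TXT), read_tsv(NAMES_TXT))
for row in joined:
    print(row)
```

A shared format that carried the names alongside the numbers would make this step unnecessary, which is the appeal of the approach Google is pushing.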
