Future Tense

The Injustices of Open Data

We’re surrounded by data—more and more of it is created every day about every person and every topic. Accompanying this cascade is an open data movement that calls for datasets to be fully accessible. We can see this attitude in what has become an Internet activist catchphrase: “Information wants to be free.” Even the White House has gotten behind the movement with President Obama’s recent executive order that vows to, by default, make all data and information “open and machine readable.” The idea is that open data will lead to more transparency, which sounds harmless enough, right? Isn’t free and accessible a good thing?

But data isn’t like ore in the ground, waiting to be mined. Nor is it a neutral, objective reflection of reality. Data is constructed by people whenever information is collected, processed, interpreted, stored, retrieved, or communicated, among other things.

In an excellent new paper titled “From Open Data to Information Justice,” which was presented at the Midwest Political Science Association Annual Conference, Jeffrey Johnson, a political scientist at Utah Valley University, writes that directing attention toward openness, and toward the mantra that sunlight is the best disinfectant, can detract from more serious questions about how data is created. “The constructed nature of data makes it quite possible for injustices to be embedded in the data itself,” he says. “Whether by design or as unintended consequences, the process of constructing data builds social values and patterns of privilege into the data.” Johnson explores three specific issues that arise from failing to question the ways in which data is constructed: embedded social privileges, different capabilities of users who have access, and the norms that these systems impose on people.

When data is used to allocate resources or anticipate needs, it can perpetuate injustices by overrepresenting privileged groups of people. Consider the Street Bump app that the city of Boston uses to help record rough patches and potholes in the city’s infrastructure. All you have to do is download the app, keep it on while you drive, and, thanks to your smartphone’s GPS and accelerometer, the app will report back the location of any bad parts in the road. The dataset this app creates can be useful and make road work more efficient and targeted, but it also only represents data collected by those who own smartphones. Poorer neighborhoods will be left out and hence less able to signal that their infrastructure has fallen into disarray. As David Eaves argued in Slate in September 2012, it’s not enough that data be open. You must also determine if that data is reliable and credible—how has it been politicized and who’s unrepresented?

When data is made open, it isn’t necessarily accessible to everyone. People have a wide range of skills and capabilities when it comes to interpreting, understanding, and using all that information, but few individuals can truly use open data to make a difference. As Johnson puts it: “[O]pen data projects remain dominated by state and business users: enterprises have the capacity to take advantage of big, open data, a capacity that citizens lack. … The result is that big data is not, in practice, open to citizens.” This means that governments and corporations gain more power and information from open data, but most people do not.

And with all this new information on display, values like privacy are on the line. For instance, as Evan Selinger wrote in Slate in November 2012, the real estate service Zillow can pilfer open data to determine whether a homeowner has defaulted on the mortgage, whether the house has been seized, and how much it might go for in foreclosure. “After using the service, you can stop by the Johnsons’ to make them a low-ball offer, perhaps sweetening the exploitation with a plate of cookies.” Sometimes, access to data can help the plugged-in take advantage of the unprotected.

Focusing on the mission of open data is a red herring that can mask what’s really important: ensuring that the ways data is created and used align with values of justice.