Future Tense

How to Hold Governments Accountable for the Algorithms They Use

Algorithms determine prison sentences and Social Security benefits. So we need to know how they work.

[Photo caption: Algorithms in use by the government may not produce many documents, making them difficult to learn about through a Freedom of Information Act request.]

In 2015 more than 59 million Americans received some form of benefit from the Social Security Administration, not just for retirement but also for disability or as a survivor of a deceased worker. It’s a behemoth of a government program, and keeping it solvent has preoccupied the Office of the Chief Actuary of the Social Security Administration for years. That office makes yearly forecasts of key demographic (such as mortality rates) or economic (for instance, labor force participation) factors that inform how policy can or should change to keep the program on sound financial footing. But a recent Harvard University study examined several of these forecasts and found that they were systematically biased—underestimating life expectancy and implying that funds were on firmer financial ground than warranted. The procedures and methods that the SSA uses aren’t open for inspection either, posing challenges to replicating and debugging those predictive algorithms.

Whether forecasting the solvency of social programs, waging a war, managing national security, doling out justice and punishment, or educating the populace, government has a lot of decisions to make—and it’s increasingly using algorithms to systematize and scale that bureaucratic work. In the ideal democratic state, the electorate chooses a government that provides social goods and exercises its authority via regulation. The government is legitimate to the extent that it is held accountable to the citizenry. But as the SSA example shows, tightly held algorithms pose accountability problems that grind at the very legitimacy of the government itself.

One of the immensely useful abilities of algorithms is to rank and prioritize huge amounts of data, turning a messy pile of items into a neat and orderly list. In 2013 the Obama administration announced that it would be getting into the business of ranking colleges, helping the citizens of the land identify and evaluate the “best” educational opportunities. But two years later, the idea of ranking colleges had been neutered, traded in for what amounts to a data dump of educational statistics called the College Scorecard. The human influences, subjective factors, and methodological pitfalls involved in quantifying education into rankings would be numerous. Perhaps the government sensed that any ranking would be dubious—that it would be riddled with questions of what data was used and how various statistical factors were weighted. How could the government make such a ranking legitimate in the eyes of the public and of the industry that it seeks to hold accountable?
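To get a feel for how much the choice of weights alone can drive a ranking, here is a minimal sketch with entirely hypothetical schools and statistics (none of these figures come from the College Scorecard): the same three data points, weighted two plausible ways, crown two different “best” colleges.

```python
# Toy illustration (hypothetical data): identical statistics, two plausible
# weighting schemes, two different rankings.
schools = {
    "College A": {"graduation_rate": 0.92, "median_earnings": 48000, "net_price": 21000},
    "College B": {"graduation_rate": 0.81, "median_earnings": 61000, "net_price": 29000},
}

def score(stats, weights):
    # Normalize each factor to a rough 0-1 scale, then take a weighted sum.
    # Net price is inverted so that cheaper schools score higher.
    return (weights["graduation"] * stats["graduation_rate"]
            + weights["earnings"] * stats["median_earnings"] / 100000
            + weights["price"] * (1 - stats["net_price"] / 50000))

outcome_weights = {"graduation": 0.2, "earnings": 0.6, "price": 0.2}
access_weights = {"graduation": 0.5, "earnings": 0.1, "price": 0.4}

for name, w in [("outcome-focused", outcome_weights), ("access-focused", access_weights)]:
    ranked = sorted(schools, key=lambda s: score(schools[s], w), reverse=True)
    print(name, "ranking:", ranked)
```

Under the outcome-focused weights, College B comes out on top; under the access-focused weights, College A does. Every choice in between is a judgment call that a public ranking would have to defend.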

That’s a complicated question that goes far beyond college rankings. But whatever the end goal, government needs to develop protocols for opening up algorithmic black boxes to democratic processes.

Transparency offers one promising path forward. Let’s consider the new risk-assessment algorithm that the state of Pennsylvania is developing to help make criminal sentencing decisions. Unlike some other states that are pursuing algorithmic criminal justice using proprietary systems, the level of transparency around the Pennsylvania Risk Assessment Project is laudable, with several publicly available in-depth reports on the development of the system.

For instance, the validation report shows that a simplified score sheet was developed to turn the results of the statistical model into a 14-point risk score. Being male gets you a plus-1, as does being a property offender (e.g., theft). But such simplifications lead to a huge loss of information. A 29-year-old gets plus-2 points, but a 30-year-old gets only plus-1, and having anywhere in the wide range of five to 12 prior arrests earns plus-3. The underlying statistical models could treat individuals as individuals, but the creators instead elected to use a simplified score sheet that lumps people into coarse groups. The upshot here is that the transparency afforded by Pennsylvania’s lengthy reports allows for this kind of critical evaluation, which might not otherwise be possible.
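To make the information loss concrete, here is a minimal sketch of a score-sheet-style tally in code. The point values for sex, property offenses, age, and prior arrests echo the examples above, but the bin boundaries and the remaining factors are simplified and hypothetical, not the actual Pennsylvania instrument.

```python
# Sketch of a score-sheet-style risk tally, illustrating how binning discards
# information. Point values echo the published examples; bin boundaries and
# the rest of the sheet are simplified/hypothetical for illustration.
def risk_score(age, sex, prior_arrests, property_offender):
    score = 0
    score += 1 if sex == "male" else 0
    score += 1 if property_offender else 0
    # Age bins: every 29-year-old gets 2 points, every 30-year-old gets 1,
    # however similar two people on either side of the cutoff may be.
    score += 2 if age <= 29 else 1
    # Prior arrests: 5 priors and 12 priors land in the same bin and earn the same 3 points.
    if 5 <= prior_arrests <= 12:
        score += 3
    elif prior_arrests > 12:
        score += 4
    elif prior_arrests > 0:
        score += 1
    return score  # the real sheet tops out at 14 points; this sketch covers only a few factors

# A 29-year-old with 5 prior arrests and a 30-year-old with 12 score almost identically,
# even though an individualized statistical model could distinguish them.
print(risk_score(29, "male", 5, True), risk_score(30, "male", 12, True))  # 7 6
```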

When transparency isn’t baked in from the start, the Freedom of Information Act at the federal level, and the various analogues to that statute at the state level, can help induce information disclosure about government use of algorithms. First enacted in 1966, FOIA allows the public to formally request documents and data from the federal government, which it must then furnish. A lot of great journalism has been built on the back of FOIA laws that enable investigations into the inner workings of the government apparatus. FOIA has been successfully used to compel government disclosure of algorithm details in at least a few cases. In 2011, a request to disclose the source code of a controversial Occupational Safety and Health Administration heat safety app was fulfilled after several months. Further back, in a FOIA case decided in 1992, the Federal Highway Administration rejected a request to disclose the algorithm it used to compute safety ratings for carriers. The agency ultimately lost in court to a plaintiff who argued the government must disclose the weighting of factors used in that calculation.

But those examples don’t mean that FOIA requests for algorithm information are always successful. There are FOIA exemptions; perhaps most important, exemption No. 4 allows the government to deny a request for information on the basis of trade secrets or confidential information. Like any other large organization, government contracts with external companies to acquire the tools and technology it needs. And whenever the government integrates a third-party algorithm into its systems, that exemption can be used as an excuse to block public access to more information about the proprietary algorithm.

At the University of Maryland–College Park, I worked with Sandy Banisky’s Media Law Class in the fall of 2015 to have each student in the class send a FOIA request to a different state asking for any documents or source code pertaining to algorithms in use in criminal justice, such as for setting parole, bail, or sentencing. In at least three instances out of 50, the FOIA requests were rebuffed because the code was not public for proprietary reasons. The integration and use of third-party algorithms for important public decision-making is a challenge to government legitimacy that future regulation should work to address.

FOIA regulation also does not require agencies to create documents that do not already exist. In other words, you have a right to request a document that the government already has but not to oblige the government to make one especially for you. But in some cases algorithms in use by the government may not produce many documents—they may simply run in computer memory and only output a document at the end of their processes. Hypothetically, a government algorithm could predict a parole violation risk score that correlates with a protected class such as race. The correlated score is then used as part of some larger decision process but is never saved to a document. Since that internal score is never exposed externally via a document, FOIA would be toothless to compel its disclosure and reveal the use of race within the algorithm. Audit trails could help mitigate this issue, but guidelines need to be developed and implemented in regulation for when government use of an algorithm should trigger an audit trail.
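What an audit trail might look like in practice is an open question, but one minimal sketch, with entirely illustrative field names and scoring logic, is to write every input, intermediate score, and final decision to an append-only log as the algorithm runs, so those internal values become records that disclosure laws can reach.

```python
import datetime
import json

AUDIT_LOG = "parole_model_audit.jsonl"  # hypothetical append-only audit file

def log_step(case_id, step, payload):
    # Append one timestamped record per step so intermediate values exist
    # as documents, not just as transient numbers in memory.
    record = {
        "case_id": case_id,
        "step": step,
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "payload": payload,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(record) + "\n")

def assess_parole_risk(case_id, features):
    log_step(case_id, "inputs", features)
    # Placeholder scoring logic for illustration; a real model would go here.
    score = 0.4 * features.get("prior_violations", 0) + 0.1 * features.get("age_factor", 0)
    log_step(case_id, "internal_risk_score", {"score": score})
    decision = "refer_for_review" if score > 1.0 else "standard_supervision"
    log_step(case_id, "decision", {"decision": decision})
    return decision
```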

The federal FOIA law could stand to have an update that considers the ramifications of algorithmic decision-making more carefully. One idea would be to extend the public’s right to government information to include a right to know how the government processes that information. This might be called the Freedom of Information Processing Act. FOIPA would allow the public to submit benchmark datasets that the government would be required to process through its systems, returning the output results. Interested parties like journalists or policy experts could then assess the government algorithm and benchmark its error rates. The advantage of this model is that it doesn’t require the disclosure of source code, preserving third-party confidentiality interests and sidestepping trade-secret conflicts.
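A FOIPA request might then play out like this sketch: the requester submits a benchmark file of cases, the agency runs it through its system and returns the outputs, and the requester compares those outputs against outcomes it already knows. The file format, field names, and parole-risk framing below are all hypothetical.

```python
import csv

def benchmark_error_rate(returned_outputs_csv, known_outcomes):
    """Compare agency-returned outputs against independently known outcomes.

    returned_outputs_csv: hypothetical file the agency sends back, one row per
        benchmark record with an `id` and the algorithm's `predicted` label.
    known_outcomes: dict mapping record ids to outcomes the requester already knows.
    """
    errors, total = 0, 0
    with open(returned_outputs_csv, newline="") as f:
        for row in csv.DictReader(f):
            total += 1
            if row["predicted"] != known_outcomes[row["id"]]:
                errors += 1
    return errors / total if total else float("nan")

# Example usage with a hypothetical benchmark of parole-risk cases:
# rate = benchmark_error_rate("agency_outputs.csv", {"case-001": "low", "case-002": "high"})
# print(f"Disagreement rate on benchmark: {rate:.1%}")
```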

Computational algorithms can be a tremendous boon to efficiency and optimization; their use will only grow throughout industry and government. But how society moves forward democratically, and the legitimacy of our elected government, hinges on the accountability that we design into these algorithmic systems. Transparency and FOI regulation can enable the accountability that we need, but we must also rethink the implementation of these ideals in light of the unique challenges that algorithms pose.

This article is part of the algorithms installment of Futurography, a series in which Future Tense introduces readers to the technologies that will define tomorrow. Each month from January through June 2016, we’ll choose a new technology and break it down. Read more from Futurography about algorithms.

Future Tense is a collaboration among Arizona State University, New America, and Slate. To get the latest from Futurography in your inbox, sign up for the weekly Future Tense newsletter.