Predictive policing is too dependent on historical data.

Predictive Policing Isn’t About the Future. It’s About the Past.

The citizen’s guide to the future.
Nov. 21, 2016, 12:30 PM

Police officers stand guard outside the apartment building where Chris Harper-Mercer, perpetrator of the Umpqua Community College shooting, lived near Roseburg, Oregon, in 2015.

Josh Edelson/AFP/Getty Images

This article is part of Future Tense, a collaboration among Arizona State University, New America, and Slate. On Wednesday, Nov. 30, Future Tense will host an event in Washington, D.C., on the future of law enforcement technology. For more information and to RSVP, visit the New America website.

When predictive policing systems began rolling out nationwide about five years ago, coverage was often uncritical and overly reliant on references to Minority Report’s precog system. The coverage made predictive policing—the computer systems that attempt to use data to forecast where crime will happen or who will be involved—seem almost magical.

Typically, though, articles glossed over Minority Report’s moral about how such systems can go awry. Even Slate wasn’t immune, running a piece in 2011 called “Time Cops” that said, when it came to these systems, “Civil libertarians can rest easy.”

This soothsaying language extended beyond just media outlets. According to former New York City Police Commissioner William Bratton, predictive policing is the “wave of the future.” Microsoft agrees. One vendor even markets its system as “better than a crystal ball.” More recent coverage has rightfully been more balanced, skeptical, and critical. But many still seem to miss an important point: When it comes to predictive policing, what matters most isn’t the future—it’s the past.

Here’s how “predictive policing” works. Broadly speaking, there are two types of systems: place-based and person-based. “Place-based” systems, which are currently much more common, make predictions about where and when future crime will happen. Calling the outputs of these systems “predictions” is itself a stretch—it’s more accurate to say they make general forecasts.

Person-based systems, on the other hand, focus on “predicting” the people who are particularly likely to commit, or themselves be victims of, certain kinds of crimes.

Some predictive policing systems incorporate information like the weather, a location’s proximity to a liquor store, or even commercial data brokerage information. But at their core, they rely either mostly or entirely on historical crime data held by the police. Typically, these are records of reported crimes—911 calls or “calls for service”—and other crimes the police detect. Software automatically looks for historical patterns in the data, and uses those patterns to make its forecasts—a process known as machine learning.
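To make that mechanism concrete, here is a toy sketch of a place-based forecast. This is an illustration of the general idea only, not any vendor's actual algorithm; the grid-cell IDs and the incident log are invented:

```python
from collections import Counter

def forecast_hotspots(incident_log, top_k=2):
    """Toy place-based 'prediction': rank grid cells by how much
    crime the police have *recorded* there in the past.

    incident_log: list of (cell_id, crime_type) tuples drawn from
    police records -- i.e., only crimes that were reported to or
    detected by officers, not all crime that occurred.
    """
    counts = Counter(cell for cell, _ in incident_log)
    # The "forecast" is simply the cells with the most recorded history.
    return [cell for cell, _ in counts.most_common(top_k)]

# A hypothetical record: it reflects where police looked and what was
# reported, which is not necessarily where crime actually happened.
log = [("A1", "theft"), ("A1", "assault"), ("B2", "theft"),
       ("A1", "theft"), ("C3", "vandalism")]
print(forecast_hotspots(log))  # ['A1', 'B2']
```

Real systems use more elaborate models, but the core dependence is the same: the forecast can only rank places by what the historical records contain.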

Intuitively, it makes sense that predictive policing systems would base their forecasts on historical crime data. But historical crime data has limits. Criminologists have long emphasized that crime reports—and other statistics gathered by the police—do not necessarily offer an accurate picture of crime in a community. The Department of Justice’s National Crime Victimization Survey estimates that from 2006 to 2010, 52 percent of violent crime went unreported to police, as did 60 percent of household property crime. Essentially: Historical crime data is a direct record of how law enforcement responds to particular crimes, rather than the true rate of crime. Rather than predicting actual criminal activity, then, the current systems are probably better at predicting future police enforcement. 

We don’t have access to much data about these systems—which is itself an enormous problem. What information we do have suggests there are significant reasons to be concerned. For instance: In October, Kristian Lum and William Isaac, two researchers at the Human Rights Data Analysis Group, released a landmark study that reconstructed a predictive policing algorithm and applied it to the Oakland, California, police department’s record of drug crimes from 2010. Lum and Isaac found that the system would have dispatched officers “almost exclusively to lower income, minority neighborhoods,” even though “public health-based estimates suggest that drug use is much more widespread” across the city. And if officers actually began patrolling those predicted hot spots, the underlying biases wouldn’t just be reinforced but further amplified.

There is an important caveat to make here: To my knowledge, today’s predictive systems aren’t using historical drug crime data to anticipate drug-related crime, which is a good thing, given the well-documented racial disparities in drug enforcement. Lum and Isaac’s study nevertheless demonstrates how historical crime data—in this case, about drugs—can be problematic.


Still, this research, using actual data and a close approximation of a predictive policing algorithm, establishes a fundamental point: Systems trained on these sorts of statistics may not account for the inaccuracies in historical data, leading to a cycle of self-fulfilling prophecies. If the underlying historical crime data is biased in the statistical sense—meaning the data doesn’t faithfully reflect reality, because certain things are overrepresented or underrepresented relative to the actual population—it’s fair to infer that forecasts built on that data will, in turn, also be statistically biased. And even if departments aren’t yet using historical drug crime data to make predictions, they soon might: The National Institute of Justice is funding, to the tune of $1.2 million, a predictive policing challenge that specifically requires participants to forecast future drug crime.
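The self-fulfilling loop can be illustrated with a minimal expected-value simulation. Every number here is an invented assumption for the sketch—two neighborhoods with identical true crime, an 80/20 patrol split that follows the forecast, and a small initial imbalance in the records—not data from any real system:

```python
def simulate_feedback(rounds=20, hot_share=0.8):
    """Sketch of a predictive feedback loop. Both neighborhoods have
    the same true crime every round, but the forecasted 'hot spot'
    (whichever has more *recorded* crime) gets most of the patrols,
    and crime only enters the records where officers are present."""
    TRUE_CRIMES = 10.0           # identical underlying crime in both places
    recorded = [6.0, 4.0]        # slight initial imbalance in the records
    for _ in range(rounds):
        hot = 0 if recorded[0] >= recorded[1] else 1
        for i in range(2):
            patrol = hot_share if i == hot else 1 - hot_share
            # Recorded crime tracks patrol presence, not true crime.
            recorded[i] += TRUE_CRIMES * patrol
    return recorded

rec = simulate_feedback()
# Neighborhood 0's share of the records climbs from 0.60 toward 0.80,
# even though true crime never differed between the two.
print(round(rec[0] / sum(rec), 2))  # 0.79
```

The initial imbalance, whatever its cause, is not just preserved but grown: the records diverge toward the patrol allocation itself, which is the amplification Lum and Isaac warn about.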

Even if we set aside the significant limitations of historical crime data, there are other problems with predictive policing. The little independent evidence available also suggests that these systems may not actually improve community safety. Two Rand Corp. studies of predictive systems in Chicago and Shreveport, Louisiana, found that the tools drove greater enforcement, with police focused on generating more citations and arrests—but there was little evidence that gun violence or crime was reduced.

Nevertheless, given that police departments will probably continue to operate with tight budgets, adoption of predictive policing systems is likely to continue. Today, at least 20 of the largest 50 police departments have used a predictive policing system, with at least an additional 11 exploring the option. One report from 2013 estimated that more than 150 police departments were using predictive systems. In the next decade or so, it’s very likely that these numbers will be even higher. As more departments across the country look to adopt these systems, and as the systems themselves become more sophisticated, it will become increasingly important that police departments, city councils, and vendors squarely address the concerns raised by 17 major civil rights, privacy, and technology organizations.

At present, vendors hide behind claims that their systems are proprietary and rarely work to have their predictive techniques independently validated. This veil of secrecy makes it harder for city councils and police departments to make informed decisions and prevents the public from participating meaningfully in conversations about whether to buy these systems or how to implement them. For example, HunchLab works with police departments to determine a severity weight for their crime models. Essentially, this process looks for quantitative answers to the question “How important is it to prevent this type of crime?” Letting the public help answer those questions as part of democratic policing could help build better community-police relations.

Furthermore, police departments must develop policies to govern these systems and make those policies publicly available. So far, this isn’t common. To my knowledge, Chicago is the only police department to publish a policy describing how it will use its infamous system, the Strategic Subject List—which, as its name suggests, is a person-based system. But the policy doesn’t describe how long an individual might stay on the list, how often the model is validated, the variables used to build the model, and so on. Instead, it focuses on describing the notification procedure, which sends police to people’s homes (or sometimes mails letters) to offer a tailored warning about future involvement in criminal activity. Ostensibly, the notifications are also supposed to come with an offer of social services, though it’s unclear how often this olive branch is extended.

Overall, when city councils use funds to purchase predictive policing systems, they should encourage their police departments to share predictions with and collaborate with social services. As Yale University sociologist Andrew Papachristos, whose research helped inspire the Strategic Subject List, wrote in the Chicago Tribune, “The real promise of using data analytics to identify those at risk of gunshot victimization lies not with policing, but within a broader public health approach.” More broadly, we should ask ourselves what we see in the predictions made by these systems: Do we only see future culpability and future offenders, or do we see opportunities for social intervention and services?

It’s also important that the algorithmic “predictions” coming out of these systems not be treated as reasons for stopping, searching, or arresting an individual. The Fourth Amendment prohibits law enforcement from stopping an individual without reasonable suspicion: a standard that requires a specific, individualized determination, not just a hunch. Computer-driven hunches should be no exception. Absent protections, predictive policing systems could significantly alter liberty protections for certain areas. For example, reasonable suspicion protections might be short-circuited if areas of forecasted future criminal activity are considered functionally equivalent to the “high crime areas” in the Supreme Court’s 2000 ruling in Illinois v. Wardlow.

Data, and maybe even predictive policing, could improve policing practices in the future. Hopefully, in the coming decade vendors will construct more thoughtful feedback loops in their systems; look to incorporate other, less biased sources of data; conduct racial impact assessments by default; and focus less on enforcement-based metrics of success. But if the systems continue to be applied as they are today, predictive policing isn’t the true future of law enforcement. If anything, it’s just the policing status quo under a new name.