Facebook’s Leaked Censorship Policies Show How Bad the Company Is at Policing Hate Speech

June 28, 2017 3:39 PM

This white man’s company wants to ensure that he’s protected online.
Paul Marotta/Getty Images

On Wednesday morning, ProPublica published a troubling report about Facebook’s approach to censorship. Drawing on a “trove of internal documents,” it laid out some of the rules that the company’s content reviewers use to determine whether they should censor a post. (It’s not clear from the article whether those moderators are employees or subcontractors, though Facebook has relied on the latter group in the past.) Those documents underscore just how clumsy the company can be when it comes to dealing with hate speech, partly because it insists on tackling the issue in algorithmic terms.

As ProPublica’s headline puts it, “Facebook’s Secret Censorship Rules Protect White Men from Hate Speech but Not Black Children.” While that’s just part of the problem, it’s also not hyperbole. To the contrary, a training slide reproduced by ProPublica establishes that very distinction, asking which of three groups (female drivers, black children, and white men) the company aims to protect. Puzzlingly, it uses a photo of the Backstreet Boys to illustrate white men, but more baffling is the answer to the query: Out of that trio, Facebook only “protects” white men.

As ProPublica goes on to explain, this is an effect of the way Facebook defines its “protected categories” and the way those categories relate to one another. The company reportedly includes a broad array of terms under its protected rubric, including race, gender identity, sexual orientation, and national origin. On the other hand, it declines to protect a range of other categories, including social class, age, occupation, and appearance. Say something awful about the members of a protected category (e.g., “Women deserve to be beaten”), and your comments might get censored. Similar assertions about the members of a non-protected category such as an age demographic (e.g., “Millennials should all be set on fire”), by contrast, will inspire no action from the site.

While the logic determining what counts as a protected category is already opaque, things get even more complicated when Facebook’s users start to combine these categories. As ProPublica shows in a series of slides, if a user pairs two protected categories together (“Irish women”), the resulting conglomerate is still considered a protected category. If, on the other hand, someone links a protected category to a nonprotected category (“Irish teens”), the composite is not protected, and users can say whatever they want with impunity. This is why Facebook considers threats against white men hate speech while it ignores those against black children: While the former combines two protected categories (race and gender), the latter includes one protected and one non-protected category (race and age, respectively).

In effect, the company’s approach is algorithmic, even if humans implement the rules of that algorithm. As the slides from ProPublica show, Facebook’s censorship principles are reducible to a simple set of equations—“PC + PC = PC while PC + NPC = NPC,” for example. As Will Oremus has observed, content moderation can be difficult, taxing work. These spare formulas may well be a blessing for the company’s human censors, giving them the tools to quickly determine what’s acceptable and what’s not with a modicum of thought. They’re given a set of simple instructions that work everywhere, freeing them from the burden of granular judgment.
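The combination rule described in the slides can be sketched in a few lines of Python. This is only an illustration of the logic as ProPublica reports it, not Facebook's actual tooling: the function name `is_protected` and the two category sets are hypothetical, and the sets contain only the examples named in this article.

```python
# Hypothetical sketch of the reported rule: a group is protected only if
# every attribute that describes it is a protected category.
# The category lists below are assumptions drawn from the article's examples.
PROTECTED = frozenset(
    {"race", "gender", "gender identity", "sexual orientation", "national origin"}
)
NON_PROTECTED = frozenset({"social class", "age", "occupation", "appearance"})


def is_protected(attributes):
    """Return True if the group described by `attributes` counts as protected."""
    attrs = list(attributes)
    if not attrs:
        raise ValueError("a group needs at least one attribute")
    unknown = [a for a in attrs if a not in PROTECTED and a not in NON_PROTECTED]
    if unknown:
        raise ValueError(f"unclassified attributes: {unknown}")
    # PC + PC = PC, while PC + NPC = NPC: one non-protected attribute
    # removes protection from the whole group.
    return all(a in PROTECTED for a in attrs)


print(is_protected(["race", "gender"]))        # white men -> True
print(is_protected(["race", "age"]))           # black children -> False
print(is_protected(["gender", "occupation"]))  # female drivers -> False
```

Under these assumptions, the three groups on the training slide come out exactly as the slide says they do, which is the point: the result follows mechanically from the rule, with no judgment about who is actually at risk.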

But that same convenience also makes the system easy to exploit: Those looking to denigrate a given group need only apply a well-chosen modifier if they want to avoid oversight. Indeed, as ProPublica notes, Donald Trump’s anti-Muslim posts “may also have benefited from the exception for sub-groups. A Muslim ban could be interpreted as being directed against a sub-group, Muslim immigrants, and thus might not qualify as hate speech against a protected category.” Where the most complex computer algorithms threaten to reaffirm existing sociocultural biases, this relatively simple human algorithm offers the biased an out, so long as they’re willing to get specific about their hate.

While the social network may well recognize that flaw, it’s clear enough why it employs this system: Above all else, Facebook’s censorship policy is defined by the utopian assumption of universal egalitarianism. As ProPublica notes, the social network attempts to apply the same rules everywhere, a few regional exceptions aside. Accordingly, it begins from the presumption that all of those who fall under its protected categories are potentially subject to hate speech. Thus, “ban men” might be understood as hate speech (since it calls for the exclusion of all men), in much the same way that calls for violence against women would be.

The trouble is that hate speech doesn’t play out in sanitized vacuums: Slurs accumulate in real circumstances and in real time, drawing strength from the particularity of each new repetition. Where Facebook apparently aims to treat such language in the abstract, it is this situational specificity that hones the edges of ugly words, giving them the power to cut. The company’s “non-protected categories” don’t just offer users an out when they want to say something vile about one group or another; they also threaten to make vile language that much more violent.

Simply put, Facebook doesn’t understand how hate speech works. Language’s potential to harm is inevitably proportional to the marginality of those it targets. While it’s certainly possible, for example, to say loathsome things about white men, those insults will rarely, if ever, have the same weight as those made against already imperiled groups. Further, marginality is itself always a circumstantial problem, not one that’s everywhere the same, which makes Facebook’s universally inclined approach that much more meaningless. In practice, acting as if all language were the same everywhere and for everyone can only have one effect: reaffirming the security of those who are already in power by shielding them from criticism.

For Facebook, that may well be a winning proposition.
