What Words Do Bing and Google Ban From Autocomplete?

Aug. 2 2013 11:43 AM

Sex, Violence, and Autocomplete Algorithms

What words do Bing and Google censor from their suggestions?


As shown in the next figure, both algorithms do get much stricter when you add “child” before the search term. Bing blocks “child nipple,” for instance. But there are some conspicuous failures as well. While you might find it merely wry that Google and Bing suggest completions for “prostitute,” the fact that Google also offers completions of “child prostitute” for “images” or “movies” is far more alarming. Moreover, searching for “child genital” or “child lover” on Google or Bing, as well as “child lust” on Google, all lead to disturbing suggestions that relate to child pornography. Querying “child lover,” for instance, offers suggestions for “child lover pics,” “child lover guide,” and “child lover chat.” Given the technology and resources available to Google and Microsoft, combined with their ostensible commitment, it’s hard to believe that these kinds of errors simply slipped through the cracks.

A Google representative acknowledged that the company does sometimes miss things but says that it’s an active and iterative process to improve the algorithm and filter out shocking or offensive suggestions. A committee meets periodically to review complaints and suggest changes to the engineering team, which then works to tweak, tune, and bake that into the next version of the algorithm. With hundreds of updates per year, the algorithm is constantly changing—perhaps even by the time you read this article. A Microsoft rep reached for comment indicated that the people behind Bing are likewise continually improving their algorithmic filters and that if suggestions that relate to child pornography are brought to their attention, they’ll remove them.

[Diagram 2: autocomplete results when “child” is prefixed to search terms]

Promoting violence?

Another editorial rule that Google incorporates into its autocomplete algorithm is to exclude suggestions that promote violence. To test its boundaries, I collected and analyzed autocomplete responses for a list of 348 verbs in the Random House “violent actions” word menu, which includes words like “brutalize” and “choke.” In particular I queried using the templates “How to X” and “How can I X” in order to find instances where the algorithm was steering users toward knowledge of how to act violently.
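The audit described above can be sketched in a few lines of Python. The query templates and verb expansion mirror the article’s method; the suggestion endpoint URL and its JSON response shape are assumptions for illustration, not a documented API, and may change or be rate-limited.

```python
# Sketch of the article's audit: expand a list of violent-action verbs
# into "how to X" / "how can I X" queries, then fetch autocomplete
# suggestions for each one.
import json
import urllib.parse
import urllib.request

TEMPLATES = ["how to {}", "how can i {}"]

def build_queries(verbs):
    """Expand each verb into every query template, e.g. 'choke' ->
    ['how to choke', 'how can i choke']."""
    return [t.format(v) for v in verbs for t in TEMPLATES]

def fetch_suggestions(query):
    """Fetch autocomplete suggestions for one query.

    The endpoint and response format ([query, [suggestion, ...], ...])
    are assumptions based on an unofficial suggest interface.
    """
    url = ("https://suggestqueries.google.com/complete/search"
           "?client=firefox&q=" + urllib.parse.quote(query))
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))[1]
```

In practice, an audit like this would loop `fetch_suggestions` over `build_queries(verbs)` with throttling between requests, logging which queries return no suggestions at all (i.e., appear to be blocked) versus those that return objectionable completions.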


As a reflection of what people are searching for, it’s perhaps a commentary on the content of video games that many of the suggestions for violent actions were about things like how to beat a boss in a particular game. Certain queries, like “how to molest” or “how to brutalize,” were blocked as expected, but other searches did evoke suggestions about how to accomplish violence toward people or animals.

Among the more gruesome suggestions that were not blocked: “how to dismember a human body,” “how to rape a man/child/people/woman,” and “how do I scalp a person.” Some suggestions were oriented toward animal cruelty, like “how to poison a cat” and “how to strangle a dog.” However annoyed you might be by the neighbor’s barking dog, that doesn’t make strangling it morally permissible; such suggestions should also be blocked.

Algorithmic governance, meet algorithmic accountability

The queries that are prohibited, like Bing’s bizarre obstruction of completions for “homosexual,” are sometimes as surprising as the things not blocked, such as the various suggestions leading to child pornography or explicit violence. As we look to algorithms to enforce morality, we need to acknowledge that they too are not perfect. And I don’t think we can ever expect them to be—filtering algorithms will always have some error margin where they let through things we might still find objectionable. But with some vigilance, we can hold such algorithms accountable and better understand the underlying human (and corporate) criteria that drive such algorithms’ moralizing.

The editorial criteria that Google and Bing embed in their algorithms tacitly reflect company values and a willingness to self-regulate in order to protect people from socially deviant suggestions. Yet this self-regulation is largely opaque, making it difficult to understand how these mostly automated systems make the decisions they do. In the absence of corporate transparency, and as more aspects of society become algorithmically driven, reverse-engineering such algorithms with data and code offers one way to systematically penetrate that opacity and reconstruct a semblance, albeit a low-resolution one, of how everything works.

This article arises from Future Tense, a collaboration among Arizona State University, the New America Foundation, and Slate. Future Tense explores the ways emerging technologies affect society, policy, and culture. To read more, visit the Future Tense blog and the Future Tense home page. You can also follow us on Twitter.
