The XX Factor

Twitter’s New Anti-Harassment Effort Will Flag Users as Possibly Abusive Before They’re Reported

*Mute.*

Twitter announced new anti-harassment measures on Wednesday, the latest in a series of features the platform has added in recent months in response to heated criticism of how easily rampant abuse festers in its depths. With the help of algorithms, the company has begun finding and taking action against people who harass fellow users, even if those harassers haven’t been the subject of specific abuse reports.

This proactive step could relieve users who’ve been targeted for abuse of some of the need to file individual reports for every threatening tweet they get. In a blog post, Twitter’s vice president of engineering, Ed Ho, wrote that the company’s software will flag likely harassers—users who regularly tweet at accounts that don’t follow them back, for instance—and block those users’ tweets from being seen by anyone but their own followers for a set period of time. “We aim to only act on accounts when we’re confident, based on our algorithms, that their behavior is abusive,” Ho wrote, promising that the company will regularly update and improve the new feature as it learns what works.
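
Twitter hasn’t published the precise criteria its software uses, but the behavior Ho describes boils down to a couple of simple rules. Here is a rough sketch of that kind of heuristic in Python; every threshold, time window, function name, and data structure below is an invented placeholder for illustration, not Twitter’s actual implementation:

```python
# A minimal sketch (not Twitter's code) of the heuristic described above: an
# account that keeps @-mentioning people who don't follow it back gets flagged,
# and its tweets become visible only to its own followers for a set period.
# All names, thresholds, and data structures here are assumptions.
from collections import defaultdict
from datetime import datetime, timedelta

MENTION_THRESHOLD = 20                # unsolicited mentions before flagging (assumed)
WINDOW = timedelta(days=1)            # look-back window for counting them (assumed)
LIMIT_DURATION = timedelta(hours=12)  # length of the followers-only restriction (assumed)

followers = defaultdict(set)     # username -> set of usernames who follow them
unsolicited = defaultdict(list)  # username -> timestamps of mentions of non-followers
limited_until = {}               # username -> when the restriction lifts

def record_mention(sender: str, target: str, now: datetime) -> None:
    """Count the mention as unsolicited if the target doesn't follow the sender back."""
    if target not in followers[sender]:
        unsolicited[sender] = [t for t in unsolicited[sender] if now - t <= WINDOW]
        unsolicited[sender].append(now)
        if len(unsolicited[sender]) >= MENTION_THRESHOLD:
            limited_until[sender] = now + LIMIT_DURATION

def tweet_visible_to(author: str, viewer: str, now: datetime) -> bool:
    """While an author is limited, only their own followers see their tweets."""
    if author in limited_until and now < limited_until[author]:
        return viewer in followers[author]
    return True
```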

Twitter is also rolling out a tool that lets users filter out notifications from the kinds of accounts most likely to belong to trolls, such as ones without a profile photo or verified contact information. Another change will give people using the “mute” feature the ability to keep themselves from seeing certain words, phrases, or conversations for a limited period—a welcome option for someone who’s at the center of an angry tweetstorm about penguins, say, but wants to resume seeing penguin tweets after the storm has passed.
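
In practice, a filter like that amounts to a few account-level checks plus an expiry date on each muted term. A rough illustration follows; the field names and data shapes are assumptions made for the sake of the example, not Twitter’s actual API:

```python
# A rough illustration (assumed field names, not Twitter's API) of the filters
# described above: hide notifications from accounts with no profile photo or no
# verified contact info, and honor muted terms only until their expiry passes.
from datetime import datetime

def should_notify(account: dict, tweet_text: str,
                  muted_terms: dict, now: datetime) -> bool:
    # Skip accounts that look throwaway: default avatar, unverified email/phone.
    if not account.get("has_profile_photo") or not account.get("contact_verified"):
        return False
    # Time-limited mutes: a term stops filtering once its expiry has passed.
    for term, expires in muted_terms.items():
        if now < expires and term.lower() in tweet_text.lower():
            return False
    return True

# Example: mute the word "penguins" until a chosen date.
muted = {"penguins": datetime(2017, 3, 8)}
sender = {"has_profile_photo": True, "contact_verified": True}
print(should_notify(sender, "Penguins are ruining everything", muted,
                    datetime(2017, 3, 2)))  # False while the mute is active
```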

It wasn’t until November 2016 that Twitter expanded its mute function to let users mute certain words and phrases (as they can on Facebook and Instagram), one of the most effective strategies in the fight against targeted harassment. Last month, Twitter unveiled a few other anti-abuse features: automatic collapsing of likely abusive or “low-quality” tweets within conversations; the ability to report harassment from a user who has blocked you; better monitoring of users who jump from a suspended account to a newly created one to continue their abuse; and an end to notifications when someone replies to a conversation started by a person the user has blocked.

The new initiative to wield machine learning against repeat harassers is the most interesting development in the platform’s struggle to address the problem that’s driving many cultural leaders and prolific tweeters, like writer Lindy West, to close their Twitter accounts altogether. The idea that lines of code may someday take over the soul-draining job of internet moderators is a popular one these days. Jigsaw, a tech company owned by Alphabet, released a public API called Perspective last week. The Ringer reports that Perspective claims to use artificial intelligence to flag “toxic” comments online by checking them against a database of old comments, drawn from sources including Wikipedia and the New York Times, that human beings have already rated as toxic.
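
Perspective itself is just a web API: a developer sends it a snippet of text and gets back a score between 0 and 1 estimating how toxic human raters would find it. A minimal sketch of such a call, following the request format Jigsaw documented at the API’s launch (the key and comment text below are placeholders), looks like this:

```python
# A minimal sketch of scoring a comment with the Perspective API. The endpoint
# and request shape follow Jigsaw's public documentation; the API key and the
# example comment are placeholders.
import requests

API_KEY = "YOUR_API_KEY"  # issued through the Google API console
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       "comments:analyze?key=" + API_KEY)

def toxicity_score(comment: str) -> float:
    """Return Perspective's 0-1 estimate of how toxic the comment is."""
    payload = {
        "comment": {"text": comment},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload)
    response.raise_for_status()
    result = response.json()
    return result["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

if __name__ == "__main__":
    print(toxicity_score("You are a wonderful penguin enthusiast."))
```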

If Twitter can calibrate its program properly, preemptive flagging may be the boost its existing moderators need to address harassment faster and more effectively—though it’s likely to raise alarms among people worried about the free-speech implications. Blanket muting and restrictive notification filters are blunt instruments that can block out the good stuff with the bad. Targeting repeat, committed offenders is harder to do, but by monitoring problem users before they can amplify their abuse, Twitter’s new approach is far more likely to punish harassers than to restrict the Twitter experience of their victims.