The XX Factor

Um, No, to Ummo and All Speech “Improvement” Apps

Articulate rating: Not white and male enough.


A few years ago, my family went on vacation with two of my parents’ close friends. One evening, the pair turned the topic of dinner conversation to how much they hate hearing young people say “like.” I was only half-listening, contemplating a second helping, until one of them turned to me. “You don’t even hear it when you do it,” he said. I forced a smile, then excused myself to go for a fuming walk.

I wasn’t just livid; I was shaken. I’d been sure that I had decent control over my voice—or at least that I was capable of suppressing “like” in more formal situations such as family gatherings and, especially, work. Could I really not hear it? I spent the rest of the trip trying to prove the friends wrong. For days, my brain felt like a fist around my speech, clenching to make sure it didn’t misbehave, constraining my mental dexterity in the process. The middle and outer reaches of my vocabulary were suddenly inaccessible. I felt as inarticulate as I ever had. By the end of the trip, I was most at ease when I avoided speaking at all.

This memory comes back to me every time a new app hits the market promising to improve the way users speak—marketed as they usually are, explicitly or implicitly, toward women. My colleague Christina Cauterucci wrote this past winter about a Google Chrome extension that alerts the user to qualifying words such as “just,” “actually,” and “sorry,” and tempering phrases such as “I’m no expert” and “does this make sense,” in emails. In March, The Cut covered an app that tracks the ratio of “likes” and other filler words to non-filler words in speech and delivers the user an overall “Articulate” rating. Now, a team of Harvard and MIT students is introducing “Ummo,” an app that Harvard Business School student Andrea Coravos described in an email as “‘FitBit’ for speech fitness.”

Ummo prompts users to write their own lists of verboten filler words, which may include “umms” and “uhhs,” “totallys” and “literallys,” “likes,” “sorrys,” and so on. The app makes a note whenever you drop one; it can also track your pace, volume, and “clarity,” which, per the website, it calculates by telling you “how close your pronounciations [sic] were to the average American Accent.” But who gets to say what constitutes “speech fitness”? The standard that apps like Ummo help consumers aspire to isn’t an objective “better”; it’s a sound most closely associated with a middle-class or affluent white man.

The policing of women’s speech—from “uptalk” to “vocal fry”—is ubiquitous and well-documented. Often, it takes the form of friendly fire, with feminists urging their fellows to stop “undermining your authority” and “reclaim your strong female voice.” There are a few problems with this narrative. For one, women don’t sound as different from men as we’ve been led to believe—people simply pay more attention to the minutiae of women’s vocal delivery. “With men, we listen for what they’re saying, their point, their assertions. Which is what all of us want others to do when we speak,” the feminist linguist Robin Lakoff told New York magazine writer Ann Friedman last year. “With women, we tend to listen to how they’re talking, the words they use, what they emphasize, whether they smile.”

For another thing, who’s to say that speaking with the unindividuated polish of a news anchor is better than coming off creaky-voiced, self-reflective, and idiosyncratically human? Friedman was inspired to write her essay after she co-founded a podcast called Call Your Girlfriend, which consists of culturally astute chitchat with her friend Aminatou Sow, and would-be vocal coaches started showing up in her inbox. Listeners mansplained that Friedman and Sow needed to tamp down their “likes” and ramp up their enunciation, but Friedman worried that if they obeyed, their podcast “would lose the casual, friendly tone we wanted it to have and its special feeling of intimacy. It wouldn’t be ours anymore.” Lakoff gave her further reason to believe that the trade-off wouldn’t be a worthwhile one. She argued that so-called “filler” words are in fact an unconscious strategy that fosters “understanding and sharing” between speaker and listener. As Lakoff put it: “This is the major job of an articulate social species. If women use these forms more, it is because we are better at being human.”

The debate about how we view speech patterns that deviate from a perceived “neutral” isn’t abstract: These prejudices support real discrimination. A recent study comparing men’s and women’s professional performance reviews, for example, found that women are more likely to be critiqued on their “communication style,” while men are more likely to read about their actual work.

And gender isn’t the only identity axis along which speech policing operates. African American Vernacular English is regularly denigrated as “Standard English with mistakes,” and black Americans who use it are punished in workplaces and schools, even though linguists have argued for decades that AAVE is a distinct and legitimate variety of English with a distinct and legitimate grammar. Recent research has also helped to quantify the extent to which Americans with regional accents, which differ from the middle-class “average American accent” that Ummo users may work to cultivate, still struggle to get a foothold in universities and other affluent settings. “Sociolinguists have shown that in the area of speech evaluation, we are particularly susceptible to the cultural stereotypes we have absorbed,” anti-discrimination law expert Mari Matsuda has written. “Low-status accents will sound foreign and unintelligible. High-status accents will sound clear and competent.” In other words, “there is significant discrimination against regional accents.”

Ummo and the flood of other apps designed to teach us to edit our speech reinforce the stereotypes that justify that unequal treatment. Sometimes, we need to be strategic about how we’re perceived—I try to keep my “likes” at bay when I’m on the phone with an interview subject—but the fact that we live in the status quo doesn’t mean we have to accede to it happily. We should resist the call to scrub our speech of “filler words,” accents, and other markers of where we come from and how we relate to people. And we shouldn’t buy into the myth that, if we could, we would come out the other side improved.