Ooh! Arrgh! How We Hear Emotion in Nonverbal Noises.

Jan 26, 2016 2:19 PM

“The brute tones of our human throat … may once have been all our meaning” —Robert Frost.

On May 10, 1915, renowned poet-cum-cranky-recluse Robert Frost gave a lecture to a group of schoolboys in Cambridge, Massachusetts. “Sounds in the mouths of men,” he told his audience, “I have found to be the basis of all effective expression.” Frost spent his career courting “the imagining ear”—that faculty of the reader that assigns to each sentence a melodic shape, one captured from life and tailored to a specific emotion. In letters and interviews, he’d use the example of “two people who are talking on the other side of a closed door, whose voices can be heard but whose words cannot be distinguished. Even though the words do not carry, the sound of them does, and the listener can catch the meaning of the conversation. This is because every meaning has a particular sound-posture.”

Frost’s preoccupation with the music of speech—with what we might call “tone of voice,” or the rise and fall of vocal pitch, intensity, and duration—has become a scientific field. Frost once wrote his friend John Freeman that this quality “is the unbroken flow on which [the semantic meanings of words] are carried along like sticks and leaves and flowers.” Neuroimaging bears him out, revealing that our brains process speech tempo, intonation, and dynamics more quickly than they do linguistic content. (Which shouldn’t come as a huge surprise: We vocalized at each other for millions of years before inventing symbolic language.)

Psychologists distinguish between the verbal channel—which uses word definitions to deliver meaning—and the vocal channel—which conveys emotion through subtle aural cues. The embedding of feelings in speech is called “emotional prosody,” and it’s no accident that the term prosody (“patterns of rhythm or sound”) originally belonged to poetry, which seeks multiple avenues of communication, direct and indirect. Frost believed that you could reverse-engineer vocal tones into written language, ordering words in ways that stimulated the imagining ear to hear precise slants of pitch. He went so far as to propose that sentences are “a notation for indicating tones of voice,” which “fly round” like “living things.”

So Frost comes to mind immediately upon encountering a new study on how the brain responds to sentence sounds. In Biological Psychology, researchers from McGill University in Canada detail how they played 24 volunteers a series of “affect bursts” (wordless vocalizations like giggles, growls, and sobs). As an EEG recorded their neural activity, the participants were asked to decide whether each clip expressed happiness, rage, or sadness. Then, the men and women were fed recordings of nonsense speech (“The dirms are in the cidabal”) uttered in various tones of voice. Again, they had to identify the emotion the speaker wished to convey—and again, an EEG mapped their brain responses.

The scientists found that, at least when the words are made up, our brains give “raw,” “pure,” or “primitive” forms of vocal expression priority over emotionally inflected talking. We can unravel the emotional payload of an angry grunt in less than half a second. We can decode a happy laugh in less than a tenth of a second. Assigning a feeling to Frost’s “unbroken flow” of spoken melody takes longer.

The researchers also discovered that both irate noises and wrathful speech tones leave lasting traces in the brain. “Our data suggest that listeners engage in sustained monitoring of angry voices, irrespective of the form they take, to grasp the significance of potentially threatening events,” said Marc Pell, one of the study’s lead authors.

Most importantly, the McGill paper confirms what scientists long suspected: that our brains use different areas to process emotional vocal sounds (laughs, weeping, growls) and the pitched, emotive rivers that course below speech. (There’s also a cortical region for neutral vocal sounds, like noncommittal grunts.) “Understanding emotions expressed in spoken language … involves more recent brain systems that have evolved as human language developed,” explained Pell. “The identification of emotional vocalizations depends on systems in the brain that are older in evolutionary terms.” Furthermore, all of the above neural real estate, much of it located in the right hemisphere, is distinct from the portions of the left hemisphere that interpret speech’s semantic content. What remains to be seen (the study didn’t test it) is whether our brains respond preferentially to the verbal phrase “I am sad” or to a miserable-sounding yowl. (My money is on the yowl.)

“The brute tones of our human throat … may once have been all our meaning,” Frost wrote in 1925. We’ve domesticated those tones by adding their timbre to language, but when it comes to rapidity of response, our brains haven’t quite gotten the memo. Doesn’t some part of you wish that I had written this entire post in animal noises? And isn’t it strange how the most basic vectors for meaning worm their way into the most finespun, sophisticated modes of expression, like poetry? And isn’t it fun when science unpacks human communication, using outlandish brain machines to confirm what our ancestors knew from the first time they opened their mouths? Woohoo!