Phonetics and physics at high pitches: Why is it hard to understand opera singers, sopranos, and other high-pitched singing?

Why Are Opera Singers Hard to Understand?

Lexicon Valley
A Blog About Language
July 28 2014 12:49 PM

Soprano Maria Callas hitting a high note.

High, squeaky notes. Screeching soprano solos. Unintelligible opera divas. There is a slew of stereotypes about how soprano voices sound at the top of their range. Even exceptionally talented singers struggle to be understood when singing high notes. Is it just a matter of technique, or is there something else going on? As it turns out, soprano voices are limited more by physics than by skill, and here's why:

When we say (or sing) vowels, the first thing we do is get air flowing from our lungs. This vibrates our vocal folds and produces a weird buzzing sound, similar to the sound you'd make by blowing a raspberry. The two flappy bits of membrane that make up the vocal folds hit against each other, in the same way your lips hit against each other during a raspberry. This repetitive movement is called oscillation.


The reason we don’t hear speech as a buzzing sound, but rather as a vowel or some other speech sound, is that the buzz is shaped by the vocal tract (the throat, mouth, tongue, lips, and nose). This is pretty much the way brass instruments work, too. You blow a raspberry into the mouthpiece, more or less, and the size and shape of the tube determine the instrument's unique sound. Since I assume you already know what a human looks like playing an instrument, let's use this as an excuse to look at these adorable animals playing instruments instead.

How fast the vocal folds are vibrating determines what the pitch of the resulting sound is going to be: Lower-pitched sounds are made by slower oscillation (lower frequency), and higher-pitched sounds are made by faster oscillation (higher frequency). A pure tone only has a single frequency, while more complex sounds—for example, the same note produced by a trumpet or the human voice—have several frequencies all at the same time. The lowest of these simultaneous frequencies is known as the fundamental frequency, while all the higher ones that are associated with it are called harmonics.

The tricky thing is that every harmonic has to be a whole-number multiple of the fundamental frequency. So if we start with a buzz that has a fundamental frequency of 100 oscillations per second (100 Hz), the first harmonic would be at 200 Hz, the second at 300 Hz, etc. 100 Hz is a relatively low note, about the second lowest A flat on a piano. A higher-frequency sound, with a fundamental frequency of 1000 Hz (about a high C), would have its first harmonic at 2000 Hz: It can't have any closer harmonics, like at 1100 Hz or 1200 Hz, because those aren't multiples of 1000.
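If you'd like to check that arithmetic yourself, here is a minimal Python sketch. The function name `frequencies_present` and the 3000 Hz cutoff are my own choices for illustration, not anything standard. The list it returns includes the fundamental itself, followed by the harmonics above it.

```python
def frequencies_present(fundamental_hz, ceiling_hz):
    """Return the fundamental and each whole-number multiple of it up to ceiling_hz."""
    count = int(ceiling_hz // fundamental_hz)
    return [fundamental_hz * n for n in range(1, count + 1)]


# The 3000 Hz ceiling is an arbitrary cutoff, chosen only for this comparison.
low_note = frequencies_present(100, 3000)
high_note = frequencies_present(1000, 3000)

print(low_note[:4])                   # [100, 200, 300, 400]
print(high_note)                      # [1000, 2000, 3000]
print(len(low_note), len(high_note))  # 30 3
```

Under that cutoff, the 100 Hz note has thirty frequencies to work with, while the 1000 Hz note has only three.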

This means that the lower the note is, the more densely packed its harmonics are. Why does this matter?

Well, vowels are formed by the way the mouth and throat shape the harmonics as they're coming out from the vocal folds, by making them essentially echo and reinforce each other in the vocal tract. For example, the rounded lips that you make when saying "ooooooooo" cause different harmonics to be amplified than the spread lips that you make when saying "eeeeeeeee," even when they're the same note.

At the low notes, when you have a lot of densely packed harmonics, it's easy to hear exactly which parts have been amplified and therefore which vowel the singer is intending to sing. But at higher notes, where you have fewer harmonics, it's harder to hear which parts have been amplified. And at the highest notes, you may not have the right harmonics to amplify at all, and you end up with a sort of mushy neutral vowel sound.


It's like looking at an image that's been horribly pixelated: if you don't have a high enough sampling rate, it's hard to tell what's going on.
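To put rough numbers on the pixelation idea, here is a small Python sketch that counts how many harmonics land near the regions a vowel amplifies. Everything in it is an assumption for illustration: the peak values in `RESONANCES` are round ballpark figures rather than measurements of any real singer, and the 150 Hz tolerance is arbitrary.

```python
# Illustrative resonance peaks in Hz for two vowels.
# These are round-number assumptions, not measured data.
RESONANCES = {"ee": (300, 2300), "oo": (300, 900)}


def harmonics_near(fundamental_hz, peak_hz, tolerance_hz=150):
    """List the multiples of fundamental_hz within tolerance_hz of peak_hz."""
    hits = []
    n = 1
    while fundamental_hz * n <= peak_hz + tolerance_hz:
        if fundamental_hz * n >= peak_hz - tolerance_hz:
            hits.append(fundamental_hz * n)
        n += 1
    return hits


def coverage(fundamental_hz):
    """For each vowel, count the harmonics that fall near each of its peaks."""
    return {
        vowel: [len(harmonics_near(fundamental_hz, peak)) for peak in peaks]
        for vowel, peaks in RESONANCES.items()
    }


print(coverage(100))   # {'ee': [3, 3], 'oo': [3, 3]}
print(coverage(1000))  # {'ee': [0, 0], 'oo': [0, 1]}
```

With these made-up numbers, a 100 Hz note puts three harmonics near every peak, so both vowels are clearly outlined. A 1000 Hz note puts nothing near either "ee" peak and a single harmonic near one "oo" peak, which is the mushy-vowel problem in miniature.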

And that's before we even start talking about the muscular control necessary to produce such a high-pitched sound, differences in technique and proficiency, and how well you or the singer understand various languages. Put it all together, and there's just no physical way that a human-shaped singer could pronounce intelligible vowels at really high pitches. So it's not your fault that you can't understand an opera singer, but it's not the singer's fault either.

A version of this post appeared on Wug Life.

Lauren Ackerman is a linguistics PhD candidate at Northwestern University, where she researches real-time sentence comprehension and intonation. She also runs Wug Life, a blog dedicated to young and budding linguists.
