The articulators ventriloquists have to constrain are the lips and jaw. Everything else can't be seen (except the larynx moving up and down, which most audiences wouldn't notice; next time you watch one, especially a man with an Adam's apple, keep an eye on the neck).
With the jaw, they set it at a fixed point that is relatively closed, so people can't see the tongue moving inside, but open enough to produce many of the sounds. And the lips are open enough to make vowels and coronal and velar consonants, plus a w-ish sound.
What's left are the labials, i.e., b, p, m, v, f and w.
With the only constraint being that the lips can't move away from this near-fixed position (though look carefully and many will cheat on the occasional b), the trouble is really only with these sounds.
Obviously, the first trick is to try to avoid words with these sounds in them. When they can't, they'll substitute one sound for another: b with d, p with t, m with n, v with w and f with th.
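To make the substitution pattern concrete, here is a rough sketch in Python. It works on spelling rather than on actual sounds, which is a big simplification, and the names `SUBSTITUTIONS` and `ventriloquize` are made up for illustration.

```python
# Hypothetical illustration: apply the labial substitutions listed above
# to a word's spelling. Real substitutions operate on sounds, not letters,
# so this is only a loose approximation.
SUBSTITUTIONS = {"b": "d", "p": "t", "m": "n", "v": "w", "f": "th"}


def ventriloquize(word: str) -> str:
    """Replace each labial letter with its stand-in, keeping capitalization."""
    out = []
    for ch in word:
        sub = SUBSTITUTIONS.get(ch.lower())
        if sub is None:
            out.append(ch)  # not a labial we substitute (w stays as a w-ish sound)
        elif ch.isupper():
            out.append(sub.capitalize())
        else:
            out.append(sub)
    return "".join(out)


print(ventriloquize("Simon"))
print(ventriloquize("fresh"))
```

A letter-based mapping like this will get plenty of English words wrong (silent letters, ph for f, and so on), but it shows how small the set of swaps really is.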
You can see these substitutions if you pay careful attention.
In this YouTube video, look at the way this guy says Simon (jump to 3:17):
He's really saying Sinon.
Or, in this video, notice how this woman says fresh as thresh (at 0:20):
(Compare, for example, the common substitution of f for th in certain English dialects, e.g., I'm free years old.)
There are two reasons why we don't really notice.
First, acoustically, n and m aren't that distinct.
Same with b and d:
Vowel formants of b and d with certain vowels. Notice the similarities.
Second, and more importantly, our perception of speech is driven as much by top-down as by bottom-up processing. That is, what you're expecting to hear makes you hear things that might not be there. It's why all that stuff with devil's messages worked on back-masked records: There really was nothing there, but if someone tells you what you're going to hear, you somehow make it out.
A famous example is the Ganong effect, where people hear an ambiguous sound, between d and t, in a context where one choice makes a word and the other a non-word (e.g., dash and task, where tash and dask aren't words).
For the same exact sound, people think they hear the sound that makes the word:
It's a pretty phenomenal effect: It really sounds like two different sounds. I did something similar with h and sh here:
The same thing is happening with the ventriloquist: They take advantage of top-down processing to make you hear the sounds they're not making.
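One way to picture this is as a toy Bayesian calculation, sketched below. This is not a model from the speech perception literature, just an illustration: the function name `posterior` and all the numbers are assumptions chosen to show how a strong expectation can override a weak acoustic cue.

```python
def posterior(likelihood: dict, prior: dict) -> dict:
    """Combine bottom-up evidence with top-down expectation via Bayes' rule."""
    unnormalized = {k: likelihood[k] * prior[k] for k in likelihood}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical numbers, for illustration only.
# Bottom-up: the ventriloquist really produced something closer to n.
acoustic = {"n": 0.7, "m": 0.3}
# Top-down: "Simon" is a name you expect; "Sinon" is not.
lexical = {"n": 0.01, "m": 0.99}

result = posterior(acoustic, lexical)
heard = max(result, key=result.get)
print(heard, round(result[heard], 3))
```

With these made-up values the listener ends up "hearing" m, even though the acoustics on their own favor n.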
One last thing on the linguistics associated with ventriloquism:
When a ventriloquist does fix their jaw, another important thing to note is that the tongue is compensating for the lack of jaw movement, resulting in the sound being made in a different way than usual.
Compare, for example, your jaw position for regular ee and ah.
Now, start by making an ee, then change it to an ah without moving your jaw, as a ventriloquist would do. You can compensate by pulling your tongue down and back.
This sounds a little bit off from a normal ah. That's because you're not in the canonical, optimal position. The reason we have the vowels we do (every language has ee, most have u, and many have ah) is that the mouth is an optimally designed resonator for certain vowels. Making the ah differently no longer takes advantage of this quantal vowel space. That is, while tongue position is continuous, clearly articulated vowels are not.
What is linguistically interesting with this tongue compensation for the fixed jaw/lips is that you likely should have been able to make this non-canonical ah even if you haven't ever tried before.
That's because of a feedback loop between your ear and your mouth. This feedback loop has been tested in some cool ways. Researchers including Sazzad Nasir have played around with acoustic and sensorimotor feedback and its effects on speech using contraptions like this:
What they've found is that the human language system is incredibly dynamic in terms of reaching the target speech sounds it wants to make. Adjustments are made with incredible accuracy and speed. That's what ultimately makes ventriloquism possible: our ability to take what's coming into our ear and adapt our articulation accordingly to make the sounds we need to without explicit thought.