Tuvan throat singing, known broadly as Khoomei, is a remarkable vocal technique originating from the Tuva Republic in southern Siberia. It allows a single vocalist to produce two, and sometimes three, distinct pitches simultaneously.
To understand how a single human voice can achieve this polyphonic effect, one must look at the intersection of acoustic physics, human anatomy, and psychoacoustics, and in particular at how shaping the vocal tract manipulates the harmonic series.
Here is a detailed explanation of the role of harmonic overtones in Tuvan throat singing.
1. The Physics of the Voice: The Harmonic Series
Overtone singing rests on the fact that almost no sound in nature is a "pure" single frequency. When a person sings a note, the vocal folds vibrate at a steady rate. This rate sets the fundamental frequency ($F_0$), which our brains perceive as the pitch of the note.
However, the vocal folds do not produce a pure tone. They open and close in a periodic but non-sinusoidal pattern, so the resulting sound also contains energy at whole-number multiples of the fundamental. Each of these multiples is called a harmonic or overtone.
- The 1st harmonic is the fundamental ($F_0$).
- The 2nd harmonic is twice the frequency of $F_0$ (an octave higher).
- The 3rd harmonic is three times the frequency of $F_0$ (a perfect fifth above the octave), and so on.
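The harmonic relationships listed above can be checked with a few lines of arithmetic. The following is a minimal sketch: the 110 Hz fundamental is an assumed example value, and the function names `harmonic_series` and `cents_above` are hypothetical helpers introduced only for illustration.

```python
import math


def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics of f0_hz as (number, frequency) pairs."""
    return [(n, n * f0_hz) for n in range(1, count + 1)]


def cents_above(f_hz, ref_hz):
    """Interval from ref_hz up to f_hz in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f_hz / ref_hz)


# Assumed example fundamental (roughly the note A2); not a value from the text.
f0 = 110.0
for n, f in harmonic_series(f0, 8):
    print(f"harmonic {n}: {f:7.1f} Hz, {cents_above(f, f0):7.1f} cents above F0")
```

The 2nd harmonic lands exactly 1200 cents (one octave) above $F_0$, and the 3rd lands about 702 cents above the 2nd, which is the perfect fifth mentioned in the list.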
In normal speech or singing, these overtones blend together. The human ear does not hear them as separate notes; rather, the specific mix and volume of these overtones give a voice its unique "tone color" or timbre.
2. Source-Filter Theory and Formants
The human voice operates on a "source-filter" system:
- The Source: The vocal folds generate a buzz-like sound containing the fundamental frequency and a rich, densely packed series of harmonic overtones.
- The Filter: The vocal tract (the larynx, pharynx, mouth cavity, tongue, and lips) acts as an acoustic filter.
As sound travels from the vocal folds out into the world, the vocal tract amplifies certain frequencies and attenuates others. The resonant peaks where frequencies are amplified are called formants. For example, changing the shape of your mouth to say "Ah" versus "Ee" shifts the formants, which changes the overtone balance and allows us to distinguish different vowels.
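The source-filter idea can be sketched numerically by multiplying a source spectrum by a filter gain at each harmonic. This is a toy model under stated assumptions, not a vocal-tract simulation: the $1/n^2$ source rolloff, the bell-shaped resonance curve, the 100 Hz bandwidths, and the rough "Ah" and "Ee" formant centers are all illustrative choices, and every function name is hypothetical.

```python
import math

# Rough, assumed (center_hz, bandwidth_hz) pairs for two vowels; illustrative only.
AH_FORMANTS = [(700.0, 100.0), (1100.0, 100.0)]
EE_FORMANTS = [(300.0, 100.0), (2300.0, 100.0)]


def source_amplitude(n):
    """Toy glottal source: amplitude of harmonic n falls off as 1/n**2."""
    return 1.0 / (n * n)


def resonance_gain(f_hz, center_hz, bandwidth_hz):
    """Bell-shaped gain: 1.0 at the center, about 0.707 at center +/- bandwidth/2."""
    x = (f_hz - center_hz) / (bandwidth_hz / 2.0)
    return 1.0 / math.sqrt(1.0 + x * x)


def filter_gain(f_hz, formants):
    """Toy vocal-tract filter: sum of the gains of all formants at f_hz."""
    return sum(resonance_gain(f_hz, c, bw) for c, bw in formants)


def radiated_spectrum(f0_hz, n_harmonics, formants):
    """Return {harmonic number: amplitude} after the filter shapes the source."""
    return {
        n: source_amplitude(n) * filter_gain(n * f0_hz, formants)
        for n in range(1, n_harmonics + 1)
    }


# Same source (assumed 100 Hz fundamental), two different filter shapes.
for name, formants in (("Ah", AH_FORMANTS), ("Ee", EE_FORMANTS)):
    spectrum = radiated_spectrum(100.0, 25, formants)
    strongest = sorted(spectrum, key=spectrum.get, reverse=True)[:3]
    print(name, "strongest harmonics:", strongest)
```

The source is identical in both runs; only the formant list changes, which is the sense in which the vowel is a property of the filter rather than of the vocal folds.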
3. The Mechanism of Tuvan Throat Singing
In Tuvan throat singing, the vocalist manipulates the "filter" (the vocal tract) to extreme degrees, utilizing a technique called formant tuning.
Instead of spreading the resonant energy across several broad formants as we do in normal speech, the throat singer dramatically constricts certain parts of their vocal tract to merge two formants together. This creates a very narrow, highly concentrated band of acoustic resonance.
Here is how the distinct pitches are perceived:
- The Drone (First Pitch): The singer holds a steady fundamental note ($F_0$) using their vocal folds. This serves as the low drone.
- The Melody (Second Pitch): By making minute adjustments to the tongue, lips, and jaw, the singer aligns that sharply concentrated resonance band directly over a single specific harmonic overtone (usually between the 6th and 13th harmonic).
Because this specific overtone is amplified so intensely (while the surrounding overtones are strongly attenuated), the overtone breaks away from the overall "timbre" of the voice. Psychoacoustically, the human brain stops perceiving this overtone as part of the vocal tone color and begins to perceive it as an entirely separate, high-pitched whistling note.
By slightly shifting the shape of the mouth (often moving the tip or root of the tongue), the singer slides this narrow resonance band up and down the harmonic series, playing melodies on the overtones while the fundamental drone remains completely unchanged.
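The effect of narrowing the resonance band, and of sliding it along the harmonic series, can be illustrated with a small calculation. This is a sketch under assumptions: the 150 Hz drone, the bell-shaped resonance curve, and the two bandwidths (400 Hz for "broad", 40 Hz for "narrow") are example values chosen for illustration, and the function names are hypothetical.

```python
import math


def resonance_gain(f_hz, center_hz, bandwidth_hz):
    """Bell-shaped gain: 1.0 at the center, about 0.707 at center +/- bandwidth/2."""
    x = (f_hz - center_hz) / (bandwidth_hz / 2.0)
    return 1.0 / math.sqrt(1.0 + x * x)


def selected_harmonic(f0_hz, center_hz):
    """Harmonic number whose frequency lies closest to the resonance center."""
    return max(1, round(center_hz / f0_hz))


def neighbor_margin_db(f0_hz, center_hz, bandwidth_hz):
    """dB by which the filter favors the selected harmonic over its nearest rival."""
    n = selected_harmonic(f0_hz, center_hz)
    peak = resonance_gain(n * f0_hz, center_hz, bandwidth_hz)
    neighbors = [m for m in (n - 1, n + 1) if m >= 1]
    side = max(resonance_gain(m * f0_hz, center_hz, bandwidth_hz) for m in neighbors)
    return 20.0 * math.log10(peak / side)


drone_hz = 150.0  # assumed drone pitch; it stays fixed throughout

# Broad (speech-like) versus narrow (merged-formant) resonance at 1200 Hz.
print("broad margin (dB): ", round(neighbor_margin_db(drone_hz, 1200.0, 400.0), 2))
print("narrow margin (dB):", round(neighbor_margin_db(drone_hz, 1200.0, 40.0), 2))

# Sliding the resonance center picks out different harmonics: a melody.
melody_centers_hz = [900.0, 1200.0, 1350.0, 1500.0, 1800.0]
print([selected_harmonic(drone_hz, c) for c in melody_centers_hz])
```

With these assumed numbers, the broad band favors the 8th harmonic over its neighbors by under 2 dB, while the narrow band favors it by roughly 17.6 dB. The drone frequency never changes; only the resonance center moves, selecting harmonics 6, 8, 9, 10, and 12 in turn.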
4. Advanced Anatomy: The Role of the Epilarynx and False Folds
Scientific studies using MRI and fiber-optic endoscopy have clarified how Tuvan singers create such extreme resonance.
- The Epilaryngeal Tube: Throat singers strongly narrow the epilaryngeal tube, the short section of airway directly above the vocal folds. This drastic narrowing creates an extreme acoustic mismatch between the lower throat and the mouth cavity, which helps generate the hyper-focused formants required to isolate a single high harmonic.
- Ventricular Folds (Kargyraa Style): In a specific style of Tuvan singing called Kargyraa, singers produce a deep, growling drone that sounds an octave below the pitch of the true vocal folds. They achieve this by engaging the ventricular folds (the false vocal folds), which vibrate at half the frequency of the true vocal folds (a 2:1 ratio). This creates a subharmonic, resulting in three perceived layers of sound: the deep subharmonic drone, the true fundamental, and the isolated high overtones dancing on top.
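The arithmetic behind the three Kargyraa layers is simple enough to sketch. The example below assumes an exact 2:1 ratio between true-fold and ventricular-fold vibration, an example true-fold pitch of 130 Hz, and an arbitrarily chosen 8th overtone; the function names are hypothetical.

```python
def kargyraa_layers(true_fold_hz, overtone_number):
    """Frequencies of the three perceived layers, assuming an exact 2:1 ratio."""
    return {
        "subharmonic_drone_hz": true_fold_hz / 2.0,
        "true_fundamental_hz": true_fold_hz,
        "overtone_hz": overtone_number * true_fold_hz,
    }


def subharmonic_series(true_fold_hz, count):
    """First `count` harmonics of the subharmonic drone (true_fold_hz / 2)."""
    sub_hz = true_fold_hz / 2.0
    return [k * sub_hz for k in range(1, count + 1)]


# Assumed example values, not measurements from the text.
layers = kargyraa_layers(130.0, 8)
print(layers)

# Every even-numbered harmonic of the drone coincides with a harmonic of the
# true fundamental, so the subharmonic adds new partials in between.
series = subharmonic_series(130.0, 8)
print(series)
```

One consequence of the 2:1 ratio is that the drone's harmonic series is twice as dense as that of the true fundamental, which gives the singer more closely spaced overtones to select from.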
Summary
The illusion of multiple voices coming from a single Tuvan throat singer arises from bringing out sounds that are naturally present in every human voice. By generating a harmonically rich drone at the vocal folds and radically constricting the vocal tract to act as an ultra-precise acoustic filter, the singer amplifies a single harmonic overtone so strongly that the human ear perceives it as an entirely separate, simultaneous musical pitch.