Acoustic phonetics is a subfield of phonetics that studies the sound waves produced when we speak, analysing how vocal vibrations travel through the air.
It focuses on the physical properties of these sound waves, such as frequency, amplitude, and duration, and how they can be measured and visualised using tools like spectrograms.
The other two branches of phonetics are:
- Articulatory phonetics, which studies how we physically produce sound.
- Auditory phonetics, which analyses how we hear and process sounds.
The Science Behind Speech Sounds
Speech is a physical event, so it can be broken down into measurable parts such as vibrations, airflow, and sound energy. Analysing these patterns shows, for example, that vowels and consonants vibrate differently, and these differences show up in the shape of their sound waves.
Here are some of the main features acoustic phonetics looks at:
Frequency (Pitch)
Frequency is how fast a sound wave vibrates, and it determines how high or low we perceive a sound to be (its pitch). It's measured in hertz (Hz), the number of complete vibrations per second.
Two kinds of frequency matter in speech. The fundamental frequency (F0) reflects the rate of vocal fold vibration (roughly 80-200 Hz for adult males, 150-300 Hz for adult females, and 200-400 Hz for children). Formant frequencies are resonances of the vocal tract, and they distinguish vowels such as /i/ and /a/.
Beyond identifying sounds, frequency patterns also carry linguistic meaning: pitch changes can alter the meaning of a word in tonal languages or signal a question in English.
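To make the idea of fundamental frequency concrete, here is a minimal sketch of one common way to estimate it: autocorrelation, which looks for the time shift at which a signal best matches itself. The function names, the 8000 Hz sample rate, and the 75-400 Hz search range are assumptions chosen for illustration, and a pure sine wave stands in for a real voiced sound.

```python
import math

def synth_tone(freq_hz, sample_rate=8000, duration_s=0.05):
    """Return samples of a pure sine wave (a simplified stand-in for voiced speech)."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def estimate_f0(samples, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 as sample_rate / lag, using the lag with the highest autocorrelation."""
    min_lag = int(sample_rate / f_max)  # shortest period we search for
    max_lag = int(sample_rate / f_min)  # longest period we search for
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Multiply the signal by a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

f0 = estimate_f0(synth_tone(200.0), sample_rate=8000)
print(f"Estimated F0: {f0:.1f} Hz")
```

Real speech is noisier and richer in harmonics than a sine wave, so practical pitch trackers add normalisation and smoothing on top of this basic idea.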
Amplitude (Loudness)
Amplitude reflects the strength of a sound wave and corresponds to how loud we perceive it. Measured in decibels (dB), it's controlled by subglottal pressure, meaning the air pressure under the vocal folds. Louder speech is the result of more pressure and stronger vibrations.
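As a rough illustration of the decibel scale, the sketch below computes the level of a signal from its root-mean-square (RMS) amplitude. It assumes digital samples measured against a full-scale reference of 1.0, which is not the same as the sound pressure reference used for dB SPL; the function names are made up for this example.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_db(samples, reference=1.0):
    """Level in dB relative to `reference` (assumed here: full scale = 1.0)."""
    return 20 * math.log10(rms(samples) / reference)

# A quiet 200 Hz tone, and the same tone with twice the amplitude.
quiet = [0.1 * math.sin(2 * math.pi * 200 * i / 8000) for i in range(800)]
loud = [2 * s for s in quiet]

difference = level_db(loud) - level_db(quiet)
print(f"Doubling the amplitude adds {difference:.2f} dB")
```

Because the scale is logarithmic, doubling the amplitude adds about 6 dB no matter how loud the starting signal is.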
In speech, amplitude isn't just about volume. Different sounds emphasise different frequency regions: vowels have more energy in lower frequencies, while fricatives like /s/ typically peak above 4000 Hz.
Amplitude is also a marker of stress. For example, 'OBject' (noun) stresses the first syllable, while 'obJECT' (verb) stresses the second. Changes in amplitude help show emphasis and breathing patterns, and can even point to speech disorders such as vocal fold paralysis.
Duration (Length)
Duration refers to how long a sound lasts, measured in milliseconds. It plays an important role in speech. Vowel duration, for instance, varies with context: vowels are longer before voiced consonants ('cab') than before voiceless ones ('cap').
Consonant duration can also distinguish words: in 'unnamed', the double /n/ is longer than the single /n/ in 'unaimed'. Length contrasts of this kind are systematic in languages with geminate (doubled) consonants, such as Italian and Japanese.
Duration also influences rhythm and prosody. English is often described as stress-timed, meaning stressed syllables tend to occur at roughly regular intervals while unstressed ones are shortened. Pause length can signal sentence boundaries or hesitation and, in some languages, can even mark grammatical differences.
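In a digital recording, duration in milliseconds follows directly from the number of samples and the sample rate. The short sketch below shows the conversion; the 16000 Hz sample rate and the segment boundaries are hypothetical values invented for illustration, not measurements.

```python
def duration_ms(start_sample, end_sample, sample_rate):
    """Duration of a segment in milliseconds, given its sample indices."""
    return 1000.0 * (end_sample - start_sample) / sample_rate

SAMPLE_RATE = 16000  # samples per second (assumed)

# Hypothetical vowel boundaries, as sample indices, for two recorded words.
vowel_in_cab = (3200, 6400)
vowel_in_cap = (3200, 5600)

print(duration_ms(*vowel_in_cab, SAMPLE_RATE))  # 200.0
print(duration_ms(*vowel_in_cap, SAMPLE_RATE))  # 150.0
```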
Timbre (Quality)
Timbre is the quality of sound that helps us tell voices and instruments apart, even when pitch and loudness are the same. In speech, it's shaped by how the vocal tract filters sound, creating unique patterns of energy across frequencies.
Vowels differ in timbre based on their formant frequencies: /i/ sounds brighter than /u/ because its second formant is much higher, which puts more of its energy at high frequencies. Consonants like /s/ and /ʃ/ also have distinct energy profiles.
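The idea that formants distinguish vowels can be sketched as a toy classifier that picks the vowel whose first two formants (F1, F2) are closest to a measured pair. The reference values below are approximate averages of the kind often quoted for adult male speakers of American English and are assumed here purely for illustration; real formants vary considerably between speakers, and the function name is hypothetical.

```python
import math

# Approximate (F1, F2) reference values in Hz; illustrative assumptions only.
REFERENCE_FORMANTS = {
    "i": (270, 2290),  # as in "see"
    "a": (730, 1090),  # as in "father"
    "u": (300, 870),   # as in "boot"
}

def closest_vowel(f1, f2):
    """Return the reference vowel nearest to the measured (F1, F2) pair."""
    return min(
        REFERENCE_FORMANTS,
        key=lambda vowel: math.dist((f1, f2), REFERENCE_FORMANTS[vowel]),
    )

print(closest_vowel(310, 2200))  # i
print(closest_vowel(700, 1100))  # a
```

Straight-line distance in hertz is a deliberately crude measure; it is used here only to show that two numbers are already enough to separate these three vowels.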
Voice quality (breathy, creaky, etc.) and emotion influence timbre too; an angry voice often boosts higher frequencies, while a sad one tends to soften them.
Additionally, vocal issues like nodules or paralysis can alter timbre, making it a useful tool in speech diagnosis.
Fun Facts about Acoustic Phonetics
Here are some facts about acoustic phonetics that highlight the incredible science behind human speech sounds:
You're a walking sound lab
Every time you speak, your vocal folds vibrate around 100–300 times per second, depending on age, gender, and pitch.
Vowels have musical patterns
Each vowel has its own unique formant structure, like a fingerprint in sound. It's what makes /i/ ("see") sound totally different from /a/ ("father") even if spoken at the same pitch.
Your voice is full of hidden frequencies
When you talk, you produce not just the main pitch (fundamental frequency), but also harmonics: additional frequencies at whole-number multiples of the fundamental, stacked above it and shaping your unique sound.
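Harmonics sit at whole-number multiples of the fundamental frequency, so they are easy to list once F0 is known. A small sketch, assuming an F0 of 120 Hz as an example value:

```python
def harmonics(f0_hz, count):
    """Return the first `count` harmonics; the first one is F0 itself."""
    return [f0_hz * k for k in range(1, count + 1)]

print(harmonics(120, 5))  # [120, 240, 360, 480, 600]
```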
Whispers are pitchless, but still carry timbre
Even without vocal fold vibration, whispered speech still uses formants and amplitude patterns to carry meaning.
Speech is faster than we think
Fluent speakers can produce 10–15 phonemes per second, faster than most people can type.
Acoustic forensics
Investigators can use spectrograms (visual displays of speech) to help identify speakers or verify audio recordings in legal cases.
Conclusion
Acoustic phonetics is the science behind speech sounds, turning every word into measurable sound waves with different patterns of pitch, loudness, length, and quality. This knowledge is used to design better hearing aids, support voice analysis in criminal investigations, and build the speech recognition systems we use every day.
Learning about acoustic phonetics shows that normal conversation is actually an amazing process, as our brains quickly decode thousands of sound signals every time someone speaks to us.