Something to Talk About: The Anatomy of Speech Sounds

When it comes to communication, humans are pretty unique. Our mouths and throats are specialized to create a wide array of sounds, and the fact that we string those sounds together to transfer thoughts from one brain to another is a pretty impressive feat in the animal kingdom. One thing I found on the path to my master’s degree in linguistics is that the more you learn about humans’ language capabilities, the more you feel awed by just how amazing they are!

How do our vocal tracts produce the range of sounds used in human language? What are the biological underpinnings of speech production? Stay tuned to find out.


A Tour of the Vocal Tract

The pathway air takes from our lungs to the outside world isn’t just a smooth, featureless tube. That would actually be pretty weird. Instead, the vocal tract contains many muscles and structures that can obstruct the stream of air at various points along its journey, creating the sounds that make up the languages we speak.

[Video: A brief introduction to the vocal tract. Footage from Human Anatomy Atlas 2019.]

When we exhale, air travels from the lungs up into the trachea. The first place where we can start messing with the air stream is the larynx, which is perched at the top of the trachea. We can contract muscles in the larynx to manipulate bands of tissue called the vocal cords (or vocal folds). The vibration of the vocal cords is called phonation.

[Image: Larynx muscles and vocal folds. Image from Human Anatomy Atlas.]

By regulating the tension of the vocal cords and changing the amount of space between them (the glottis), we can modulate the pitch, volume, and tonal quality of our voices. There is a continuum of phonation types, from whispering to “creaky voice” (similar to vocal fry).

We can also completely stop the stream of air by fully closing the distance between the vocal folds. This gives us the glottal stop (think of the sound you make between the syllables of “uh-oh”).

Next, let’s talk about the tongue. The tongue is made up of four intrinsic muscles: the superior longitudinal, inferior longitudinal, vertical, and transverse muscles. There are also four extrinsic tongue muscles that help the tongue move: the genioglossus, hyoglossus, palatoglossus, and styloglossus.

Muscle | Function
Genioglossus | Depresses and extends the tongue
Hyoglossus | Depresses the tongue
Palatoglossus | Elevates posterior tongue and constricts the pharynx
Styloglossus | Draws the sides of the tongue upward and draws the tongue back


[Image: Extrinsic tongue muscles and the vocal tract. Image from Human Anatomy Atlas.]

The tongue is one of the most active of the articulators in the vocal tract. It can impede the flow of air by coming in contact with the oropharyngeal wall, soft palate (velum), hard palate, and alveolar ridge (the part of the hard palate just behind the front teeth).

[Images: The velum (soft palate); the alveolar ridge and hard palate. Images from Human Anatomy Atlas.]

It’s no wonder that the tongue has so many muscles helping it out—it needs to be pretty versatile to make the specific movements required for speech! Movements of the mouth, face, tongue, and larynx are so important, in fact, that a large portion of the primary motor cortex is devoted to them.

You might recognize the image below (the motor homunculus) from the neuromuscular interaction article from a few weeks back. The face/tongue/larynx and hands are depicted as the largest parts of the body in the homunculus representation because of the large regions of motor cortex devoted to their intricate motions.

[Image: Illustration of the motor homunculus. Image credit: D. Nguyen, Visible Body]


Speech Sounds: Let's Make Some Noise!

Now we’re going to put all the muscle-y stuff together with some linguistics to give a more complete picture of how the motions of your articulators create particular sounds.

Phoneticians (linguists who study the articulatory and/or acoustic properties of speech sounds) have grouped the speech sounds humans make into several categories. There are vowels and consonants, of course, but there are also lots of smaller distinctions within those categories.

Let’s start with vowels. Vowels don’t involve stopping the stream of air as it travels up from the lungs, but they do involve changing the shape and size of the space through which the air passes. In ordinary (non-whispered) speech, the vocal cords are also vibrating when a vowel is produced. If you’re an English speaker, try going through the vowel sounds “ah,” “ey,” “ee,” “oh,” and “ooh” and pay attention to how the shape of your lips and the amount of space inside your mouth change. Vowel sounds can also combine to form diphthongs.

Linguists typically group vowels based on their tongue height (high, mid, low), tension (tense, lax), and tongue position (front, central, back) as well as whether the lips are rounded.

[Image: Vowel chart. Image credit: UCLA Phonetics Lab Archive]
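For readers who like to see a classification written out, here is a minimal sketch of the four vowel dimensions described above as a small Python data structure. The `Vowel` class, the `VOWELS` table, and the `differing_features` helper are illustrative names invented for this sketch, not part of any phonetics library. The feature values are the usual textbook simplifications for English, “ih” (as in “bit”) is added here as a lax example, and “ey” and “oh” are treated as plain vowels even though many English speakers pronounce them as diphthongs.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Vowel:
    """One vowel described along the four dimensions named in the text."""
    height: str     # "high", "mid", or "low"
    backness: str   # "front", "central", or "back"
    tense: bool     # True for tense, False for lax
    rounded: bool   # True if the lips are rounded


# Hypothetical, simplified feature table keyed by informal English spellings.
VOWELS = {
    "ee":  Vowel("high", "front", tense=True,  rounded=False),
    "ih":  Vowel("high", "front", tense=False, rounded=False),
    "ey":  Vowel("mid",  "front", tense=True,  rounded=False),
    "oh":  Vowel("mid",  "back",  tense=True,  rounded=True),
    "ooh": Vowel("high", "back",  tense=True,  rounded=True),
}


def differing_features(a, b):
    """Return the names of the features on which vowels a and b differ."""
    va, vb = VOWELS[a], VOWELS[b]
    return [f.name for f in fields(Vowel)
            if getattr(va, f.name) != getattr(vb, f.name)]


print(differing_features("ee", "ooh"))  # ['backness', 'rounded']
print(differing_features("ee", "ih"))   # ['tense']
```

Comparing “ee” with “ooh” shows that the two share tongue height and tenseness and differ only in tongue position and lip rounding, which matches what you feel when you switch between them.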

In contrast, a consonant is basically any sound that isn’t a vowel. Consonants involve stopping the flow of air, either fully or partially, and releasing it again. They are categorized by their place and manner of articulation.

The place of articulation refers to the point at which the airflow is impeded. This can occur at the lips, teeth, alveolar ridge, hard palate, soft palate, uvula, oropharyngeal wall, epiglottis, and glottis. Much of the time, the tongue is responsible for blocking the air stream, but glottal, epiglottal, bilabial (lips are pressed together), and labiodental (top teeth press against bottom lip) sounds are notable exceptions to this generalization.

The manner of articulation refers to what happens to the air. Stop consonants (p, b, t, d, k, hard g) completely obstruct the flow of air before releasing it again. Fricatives (like s or f) create a narrow space for air to pass through, giving them a hissing sound. Affricates (ch, j) begin like a stop and release like a fricative. Approximants (r, l, w, y) involve articulators coming close enough together to qualify as a consonant rather than a vowel, but no friction is created.

Nasal sounds (like English n, m, and ng) are not your average consonants. Basically, airflow is blocked in the mouth, as in a stop consonant, but the air is allowed to flow out through the nasal cavity because the velum (soft palate) is lowered.
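The place-and-manner system can be sketched as a simple lookup table. The example below is a minimal illustration in Python, covering only a handful of the English consonants mentioned above. The `CONSONANTS` table and the helper functions are hypothetical names made up for this sketch. “Velar” refers to the soft palate (velum), and voicing is left out on purpose, so pairs like p and b get the same description.

```python
# Hypothetical lookup table: sound -> (place of articulation, manner of articulation).
CONSONANTS = {
    "p":  ("bilabial",    "stop"),
    "b":  ("bilabial",    "stop"),
    "t":  ("alveolar",    "stop"),
    "d":  ("alveolar",    "stop"),
    "k":  ("velar",       "stop"),
    "g":  ("velar",       "stop"),
    "f":  ("labiodental", "fricative"),
    "s":  ("alveolar",    "fricative"),
    "m":  ("bilabial",    "nasal"),
    "n":  ("alveolar",    "nasal"),
    "ng": ("velar",       "nasal"),
}


def describe(sound):
    """Return a 'place manner' label such as 'bilabial nasal'."""
    place, manner = CONSONANTS[sound]
    return f"{place} {manner}"


def with_manner(manner):
    """List every sound in the table that has the given manner."""
    return [s for s, (_, m) in CONSONANTS.items() if m == manner]


def same_place(sound):
    """List the other sounds made at the same place as the given sound."""
    place, _ = CONSONANTS[sound]
    return [s for s, (p, _) in CONSONANTS.items() if p == place and s != sound]


print(describe("m"))         # bilabial nasal
print(with_manner("nasal"))  # ['m', 'n', 'ng']
print(same_place("m"))       # ['p', 'b']
```

The last call shows why nasals resemble stops: m is blocked at the lips exactly like p and b, and the only difference is that the lowered velum lets air escape through the nose.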


Pathologies

When we string sounds and syllables into words and phrases, the primary motor cortex works together with regions of the brain, such as Broca’s area (BA 44–45), that deal with computational aspects of language production. Damage to Broca’s area results in expressive aphasia (Broca’s aphasia), which is characterized by patients having difficulty producing fluent speech, especially when complex grammar is required.

There are also a number of pathologies that can affect the articulatory/neuromuscular component of speech production.

One of these is dysarthria, in which neurological damage from stroke, traumatic brain injury, or degenerative disorders (ALS, MS, dementia) makes it difficult to move the muscles that produce speech sounds. This is due to a disruption in the transmission of motor signals from the brain to the articulators. Direct damage to the speech organs can result in a condition called peripheral dysarthria. Typical symptoms of dysarthria include speech that is too fast or slow, slurred, or mumbled. People with dysarthria may also have trouble moving their jaw, tongue, or lips.

Another condition affecting speech articulation is a developmental disorder called childhood apraxia of speech (CAS). Potential causes for CAS can include (but are not limited to) brain damage or underlying genetic conditions. Unlike dysarthria, CAS does not involve muscle weakness. Children with CAS do still have trouble moving their muscles to make speech sounds, but this problem lies more with motor planning than disruptions in the transmission of signals from brain to muscle.


Whew! That was a lot of sounds. And just think—every time you speak, your brain and muscles coordinate the required movements at lightning speed! What’s more, the sounds of English are only a piece of the full sound inventory of the world’s languages. Check out UCLA’s phonetics archive to learn more (you can listen to just about any type of speech sound on this site—it’s awesome!).

