Laryngeal Function and Speech Production
Learning Objectives Describe the basic role of the larynx in speech and song.
What is the basic role of the larynx in speech and song? Sound source to excite the vocal tract: voice, whisper. Prosody: fundamental frequency (F0) variation, amplitude variation. Realization of phonetic goals: voicing, devoicing, glottal frication (/h/, /ɦ/), glottal stop (/ʔ/), aspiration. Para-linguistic and extra-linguistic roles: transmit affect, speaker identity.
Learning Objectives Possess a knowledge of laryngeal anatomy sufficient to understand the biomechanics, aerodynamics and acoustics of phonation.
The hyo-laryngeal complex SPPA 4030 Speech Science
Extrinsic/Supplementary Muscles
Intrinsic muscles
Muscular Actions
CA joint function
Muscular actions on vocal folds. Alter position: adduction (LCA, IA, TA); abduction (PCA). Alter tension (and length): increase/decrease longitudinal tension; balance between TA and CT.
Extrinsic/supplementary muscles. Hold the larynx in the neck. Allow positional change of the larynx: it elevates when swallowing and during certain speech activities (elevating pitch, high vowel production).
The larynx
“Layered” structure of vocal fold
Basic Structure of the vocal fold. Epithelium. Connective tissue (lamina propria): superficial layer (tissue loosely connected to the other layers); intermediate layer (elastic fibers); deep layer (collagen fibers, not stretchy). The intermediate and deep layers form the vocal ligament. Muscle (TA).
The vocal fold through life… Newborns: no layered structure of the lamina propria (LP); LP loose and pliable. Children: vocal ligament appears at 1-4 yrs; 3-layered LP is not clear until 15 yrs. Old age: superficial layer becomes edematous and thicker; thinning of intermediate layer and thickening of deep layer; changes in LP more pronounced in men; muscle atrophy.
Learning Objectives Describe the control variables of laryngeal function.
Laryngeal Opposing Pressure: pressure that opposes translaryngeal air pressure. Sources: muscular pressure, surface tension, gravity.
Laryngeal Airway Resistance (LAR). Components of LAR: translaryngeal pressure and translaryngeal flow. Resistance = Pressure/Flow. Values can vary widely.
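The Resistance = Pressure/Flow relation can be sketched in a few lines. This is a minimal illustration only: the function name, the units (cm H2O and L/s) and the example values are assumptions chosen for the sketch, not normative data from the slides.

```python
def laryngeal_airway_resistance(translaryngeal_pressure_cmh2o, translaryngeal_flow_lps):
    """Return LAR in cm H2O per (L/s), using resistance = pressure / flow."""
    if translaryngeal_flow_lps <= 0:
        raise ValueError("flow must be positive to compute a resistance")
    return translaryngeal_pressure_cmh2o / translaryngeal_flow_lps


# Hypothetical example values (assumed for illustration only).
example_pressure_cmh2o = 6.0
example_flow_lps = 0.15
example_lar = laryngeal_airway_resistance(example_pressure_cmh2o, example_flow_lps)
print(f"LAR = {example_lar:.1f} cm H2O/(L/s)")
```

Because flow sits in the denominator, a small change in flow at a fixed pressure produces a large change in the computed resistance, which is one reason reported values vary widely.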
Glottal Size
Vocal Fold Stiffness
“Effective” Mass and Length
Learning Objectives Outline and briefly describe the different types of sounds that can be produced by the larynx.
Laryngeal Sound Generation. Transient vs. continuous: glottal stops are transient. Aperiodic vs. periodic: glottal fricatives and whispering are aperiodic; voice production/phonation is periodic.
Laryngeal Sound Generation Glottal stop Glottal fricative
Learning Objectives Describe a single cycle of vocal fold oscillation. Describe why phonation is considered “quasi-periodic”. Describe the relationship between vocal fold motion (kinematics), laryngeal aerodynamics and sound pressure wave formation. Describe and draw idealized waveforms and spectra of the glottal sound source.
Complexity of vocal fold vibration Vertical phase difference Longitudinal phase difference
The Glottal Cycle
Phonation is actually quasi-periodic. Complex periodic: vocal fold oscillation. Aperiodic: broad frequency noise embedded in signal; non-periodic vocal fold oscillation; asymmetry of vocal fold oscillation; air turbulence.
Flow Glottogram
Synchronous plots: sound pressure waveform (microphone at mouth); glottal airflow (inverse filtered mask signal); glottal area (photoglottogram); vocal fold contact (electroglottogram).
Sound pressure wave [Figure: instantaneous sound pressure vs. time]
Learning Objectives Explain vocal fold motion using the 2-mass model version of the myoelastic-aerodynamic theory of phonation
Glottal Aerodynamics: Some Terminology. Subglottal pressure; translaryngeal pressure (driving pressure); translaryngeal airflow (volume velocity); laryngeal airway resistance; phonation threshold pressure (to initiate phonation, to sustain phonation).
Myoelastic Aerodynamic Theory of Phonation. Necessary and sufficient conditions: vocal folds are adducted (adduction); vocal folds are tensed (longitudinal tension); presence of aerodynamic pressures (driving pressure).
2-mass model [Figure labels: upper part of vocal fold; mechanical coupling stiffness; lower part of vocal fold; coupling between mucosa and muscle; TA muscle]
Definitions of terms. Pme: myoelastic pressure (aka laryngeal opposing pressure). Psg: subglottal pressure. Patm: atmospheric pressure. Pwg: pressure within the glottis. Utg: transglottal (translaryngeal) airflow.
VF adducted & tensed → myoelastic pressure (Pme). Glottis is closed: transglottal airflow (Utg) = 0. Subglottal air pressure (Psg) ↑ to ~8-10 cm H2O. When Psg > Pme, L and R M1 separate and the glottis begins to open. As M1 separates, M2 follows due to mechanical coupling stiffness. Psg > Patm, therefore Utg > 0.
Utg ↑↑ since glottal aperture << tracheal circumference. As Utg ↑, Pwg ↓ due to the Bernoulli effect (pressure drop within the glottis). Bernoulli’s Law: $P + \tfrac{1}{2}\rho U^2 = K$, where P = air pressure, ρ = air density, U = air velocity.
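The Bernoulli relation P + ½ρU² = K (ρ = air density, U = air velocity) can be turned into a rough numerical sketch: when the same volume flow passes from a wide section into a narrow one, velocity rises and pressure falls. Everything numeric below is an assumption for illustration (air density of about 1.2 kg/m³, a hypothetical flow of 0.2 L/s, and hypothetical tracheal and glottal areas), and the calculation ignores viscous losses and turbulence.

```python
AIR_DENSITY_KG_M3 = 1.2    # assumed approximate density of air
PA_PER_CM_H2O = 98.0665    # pascals in 1 cm H2O


def air_velocity(flow_m3_per_s, area_m2):
    """Mean air velocity through a cross-section (continuity: U = flow / area)."""
    return flow_m3_per_s / area_m2


def bernoulli_pressure_drop_pa(flow_m3_per_s, wide_area_m2, narrow_area_m2,
                               rho=AIR_DENSITY_KG_M3):
    """Pressure drop from the wide to the narrow section, from P + 0.5*rho*U**2 = K."""
    u_wide = air_velocity(flow_m3_per_s, wide_area_m2)
    u_narrow = air_velocity(flow_m3_per_s, narrow_area_m2)
    return 0.5 * rho * (u_narrow ** 2 - u_wide ** 2)


# Hypothetical values, chosen only to show the direction of the effect.
flow = 0.2e-3          # 0.2 L/s expressed in m^3/s
trachea_area = 2.5e-4  # 2.5 cm^2 expressed in m^2
glottis_area = 0.1e-4  # 0.1 cm^2 expressed in m^2

drop_pa = bernoulli_pressure_drop_pa(flow, trachea_area, glottis_area)
print(f"Pressure drop: {drop_pa:.1f} Pa ({drop_pa / PA_PER_CM_H2O:.2f} cm H2O)")
```

Narrowing the glottal area further in this sketch increases the velocity and therefore the computed pressure drop, which is the qualitative point of the slide.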
Utg ↑, Pwg ↓ due to Bernoulli effect*. When Pwg < Pme, M1 returns to midline; M2 follows M1 due to mechanical coupling stiffness. Utg = 0. Pattern repeats 100-200 times a second.
Role of glottal shape. Current theories argue that the Bernoulli effect plays a relatively small role in vocal fold closure. More important is glottal shape: Pwg is lower for a ‘divergent’ than for a ‘convergent’ shape. As the glottis becomes divergent, Pwg drops, resulting in Pwg < Pme.
Limitations of this simple model
Learning Objectives Describe how speakers control fundamental frequency. Provide expected values for different measures of fundamental frequency. Describe different methods for measuring fundamental frequency. Describe how speakers control sound pressure level. Provide expected values for different measures of sound pressure level.
Quantifying frequency. Hertz (Hz): cycles per second. Non-linear scales: octave scale; 1/3 octave bands; semitones; cents. Other “auditory scales”: e.g. mel scale.
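The octave, semitone and cent scales are all logarithmic ratios of two frequencies: an octave is a doubling, a semitone is 1/12 of an octave, and a cent is 1/100 of a semitone. A small sketch of the conversions follows; the function names are made up for this illustration.

```python
import math


def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves (doublings) from f_low_hz up to f_high_hz."""
    return math.log2(f_high_hz / f_low_hz)


def semitones_between(f_low_hz, f_high_hz):
    """12 equal-tempered semitones per octave."""
    return 12.0 * octaves_between(f_low_hz, f_high_hz)


def cents_between(f_low_hz, f_high_hz):
    """100 cents per semitone, i.e. 1200 cents per octave."""
    return 1200.0 * octaves_between(f_low_hz, f_high_hz)


# Going from 110 Hz to 220 Hz is one doubling of frequency.
print(octaves_between(110.0, 220.0))
print(semitones_between(110.0, 220.0))
print(cents_between(110.0, 220.0))
```

Because these scales depend only on the ratio, a 100 Hz to 200 Hz step and a 400 Hz to 800 Hz step are both one octave even though they differ by 100 Hz and 400 Hz respectively.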
Fundamental Frequency (F0) Control. What factors dictate the vibratory frequency of the vocal folds? Anatomical factors: males, ↑ VF mass and length = ↓ F0; females, ↓ VF mass and length = ↑ F0. Subglottal pressure adjustment: ↑ Psg = ↑ F0. Laryngeal and vocal fold adjustments: ↑ CT activity = ↑ F0; TA activity = ↑ F0 or ↓ F0. Extralaryngeal adjustments: ↑ height of larynx = ↑ F0.
Characterizing Fundamental Frequency (F0). Average F0: speaking fundamental frequency (SFF); correlate of pitch. Infants ~350-500 Hz; boys & girls (3-10) ~270-300 Hz; young adult females ~220 Hz; young adult males ~120 Hz; older females: F0 ↓; older males: F0 ↑. F0 variability: F0 varies due to syllabic & emphatic stress, syntactic and semantic factors, and phonetic factors (in some languages); provides a melody (prosody). Measures: F0 standard deviation, ~2-4 semitones for normal speakers; F0 range, maximum F0 minus minimum F0 within a speaking task.
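The summary measures named on this slide (average F0, F0 standard deviation in semitones, and F0 range) can be sketched from a list of F0 estimates. The F0 track below is hypothetical, and expressing each value in semitones relative to the speaker's own mean F0 is one reasonable choice assumed here, not a convention stated in the slides.

```python
import math
import statistics


def hz_to_semitones(f0_hz, reference_hz):
    """Distance of f0_hz from reference_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_hz / reference_hz)


def f0_summary(f0_track_hz):
    """Mean F0 (Hz), F0 SD (semitones re: the mean), and F0 range (Hz and semitones)."""
    mean_hz = statistics.fmean(f0_track_hz)
    semitone_track = [hz_to_semitones(f, mean_hz) for f in f0_track_hz]
    lowest, highest = min(f0_track_hz), max(f0_track_hz)
    return {
        "mean_f0_hz": mean_hz,
        "sd_semitones": statistics.stdev(semitone_track),
        "range_hz": highest - lowest,
        "range_semitones": hz_to_semitones(highest, lowest),
    }


# Hypothetical F0 estimates (Hz) from a short speaking task.
f0_track = [110.0, 118.0, 125.0, 131.0, 122.0, 115.0, 108.0, 120.0]
summary = f0_summary(f0_track)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```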
F0 in the first 10 years of life
F0 over the lifespan
Estimating the limits of vocal fold vibration. Maximum Phonational Frequency Range: highest possible F0 minus lowest possible F0. Not a speech measure. Measured in Hz, semitones or octaves. Males ~80-700 Hz; females ~135-1000 Hz (Baken, 1987). Around a 3 octave range is often considered “normal”.
Approaches to Measuring Fundamental Frequency (F0). Time domain vs. frequency domain. Manual vs. automated measurement. Specific approaches: peak picking; zero crossing; autocorrelation; the cepstrum & cepstral analysis.
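Of the approaches listed, autocorrelation is easy to sketch: the signal is compared with delayed copies of itself, and the delay (lag) giving the strongest match is taken as the period. The code below is a bare-bones illustration on a synthetic tone, not a production pitch tracker; the search limits of 60-500 Hz, the 8000 Hz sample rate and the test signal are all assumptions.

```python
import math


def synthesize_voice_like_tone(f0_hz, sample_rate_hz, n_samples):
    """Fundamental plus a weaker second harmonic (a stand-in, not real speech)."""
    return [
        math.sin(2.0 * math.pi * f0_hz * i / sample_rate_hz)
        + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0_hz * i / sample_rate_hz)
        for i in range(n_samples)
    ]


def estimate_f0_autocorrelation(samples, sample_rate_hz,
                                f0_min_hz=60.0, f0_max_hz=500.0):
    """Return the F0 whose period (lag) maximizes the raw autocorrelation sum."""
    min_lag = int(sample_rate_hz / f0_max_hz)   # shortest period considered
    max_lag = int(sample_rate_hz / f0_min_hz)   # longest period considered
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized sum: longer lags overlap fewer samples, which
        # favors the shortest matching period over its multiples.
        value = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate_hz / best_lag


SAMPLE_RATE_HZ = 8000
tone = synthesize_voice_like_tone(200.0, SAMPLE_RATE_HZ, 800)
estimated_f0 = estimate_f0_autocorrelation(tone, SAMPLE_RATE_HZ)
print(f"Estimated F0: {estimated_f0:.1f} Hz")
```

Real speech is quasi-periodic and noisy, so practical implementations add windowing, normalization and checks against octave errors that this sketch leaves out.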
Amplitude control during speech
Sound Pressure Level (SPL). Average SPL: correlate of loudness; conversation ~65-80 dB SPL. SPL variability: SPL is used to mark stress; contributes to prosody. Measure: standard deviation for neutral reading material ~10 dB SPL.
Estimating the limits of sound pressure generation. Dynamic range: amplitude analogue to maximum phonational frequency range; ~50-115 dB SPL.
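Decibel values like these are logarithmic. dB SPL expresses an RMS sound pressure relative to the standard reference of 20 µPa as 20·log10(p / p_ref). The sketch below shows the conversion in both directions and what the ~50-115 dB SPL span implies as a pressure ratio; the function names are invented for this illustration.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference for dB SPL (20 micropascals)


def pressure_to_db_spl(rms_pressure_pa):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    return 20.0 * math.log10(rms_pressure_pa / REFERENCE_PRESSURE_PA)


def db_spl_to_pressure(level_db_spl):
    """Convert a level in dB SPL back to RMS sound pressure in pascals."""
    return REFERENCE_PRESSURE_PA * 10.0 ** (level_db_spl / 20.0)


def pressure_ratio_for_db_difference(difference_db):
    """Ratio of sound pressures that corresponds to a level difference in dB."""
    return 10.0 ** (difference_db / 20.0)


# Span between the two ends of the dynamic range quoted on the slide.
dynamic_range_db = 115.0 - 50.0
ratio = pressure_ratio_for_db_difference(dynamic_range_db)
print(f"{dynamic_range_db:.0f} dB corresponds to a pressure ratio of about {ratio:.0f}")
```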
Learning Objectives Differentiate between different types of vocal attack.
Learning Objectives Differentiate between different vocal registers.
Vocal Register. Refers to a distinct mode of vibration. According to Hollien: a range of consecutive F0s produced with a distinct voice quality; the F0 range should have minimal overlap with other registers.
Vocal Register: modal register (a.k.a. chest register); pulse register (a.k.a. vocal fry, glottal fry, creaky voice); falsetto register (a.k.a. loft register).
Voice Registers
Vocal Registers: Modal. VF are relatively short and thick; reduced VF stiffness; large amplitude of vibration; possesses a clear closed phase. The result is a voice that is relatively loud and low in pitch. Average values cited refer to modal register.
Vocal Registers: Pulse (glottal fry). 30-80 Hz, mean ~60 Hz. Closed phase very long (90% of cycle). May see biphasic pattern of vibration (open, close a bit, open and close completely). Low subglottal pressure (2 cm H2O). Energy dies out over the course of a cycle, so parts of the cycle have very little energy. Each individual cycle can be heard.
Vocal Registers: Falsetto. 500-1100 Hz (275-600 Hz males). VF are relatively long and thin; increased VF stiffness; small amplitude of vibration; vibration less complex; incomplete closure (no closed phase). The result is a voice that is high in pitch.
Learning Objectives Describe the physiological and acoustic correlates of pressed, breathy and rough voice qualities. Define terms such as harmonics (or signal) to noise ratio, jitter and shimmer. Explain how physical description and quantification of the phonatory signal can be informative for clinical populations.
Vocal Quality. No clear acoustic correlates like those of pitch and loudness. However, terms have invaded our vocabulary that suggest distinct categories of voice quality. Common terms: breathy; tense/strained; rough; hoarse.
Are there features in the acoustic signal that correlate with these quality descriptors?
Breathiness. Perceptual description: audible air escape in the voice. Physiologic factors: diminished or absent closed phase; increased airflow. Potential acoustic consequences: change in harmonic (periodic) energy, with sharper harmonic roll off; change in aperiodic energy, with increased level of aperiodic energy (i.e. noise), particularly in the high frequencies.
Harmonics (signal)-to-noise ratio (SNR/HNR): harmonic/noise amplitude. ↑ HNR: relatively more signal; indicative of normality. ↓ HNR: relatively more noise; indicative of disorder. Normative values depend on method of calculation; “normal” HNR ~ 15.
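As a rough illustration of the ratio idea, the sketch below computes an HNR from a harmonic component and a noise component that are already separated. That separation is the hard part in practice and is simply assumed here by building the two parts synthetically. Reporting the result in decibels is also an assumption (the slide gives no unit for its ~15 figure), as are the signal parameters.

```python
import math
import random


def mean_square(samples):
    """Average power of a sampled signal."""
    return sum(s * s for s in samples) / len(samples)


def hnr_db(harmonic_samples, noise_samples):
    """Harmonics-to-noise ratio in dB from already-separated components."""
    return 10.0 * math.log10(mean_square(harmonic_samples) / mean_square(noise_samples))


SAMPLE_RATE_HZ = 8000
N_SAMPLES = 800
rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical components: a 200 Hz sinusoid and low-level Gaussian noise.
harmonic = [math.sin(2.0 * math.pi * 200.0 * i / SAMPLE_RATE_HZ) for i in range(N_SAMPLES)]
noise = [rng.gauss(0.0, 0.1) for _ in range(N_SAMPLES)]

example_hnr = hnr_db(harmonic, noise)
print(f"HNR = {example_hnr:.1f} dB")
```

Raising the noise level in this sketch lowers the computed HNR, matching the slide's reading that a lower HNR means relatively more noise.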
[Figure: spectrum with harmonic peaks and noise ‘floor’ marked; axes are amplitude vs. frequency]
First harmonic amplitude From Hillenbrand et al. (1996)
Spectral Tilt: Voice Source
Spectral Tilt: Radiated Sound
Tense/Pressed/Effortful/Strained Voice. Perceptual description: sense of effort in production. Physiologic factors: longer closed phase; reduced airflow. Potential acoustic consequences: change in harmonic (periodic) energy, with flatter harmonic roll off.
Spectral Tilt [Figure: spectral tilt for pressed vs. breathy voice]
Acoustic Basis of Vocal Effort Perception of Effort F0 + RMS + Open Quotient Tasko, Parker & Hillenbrand (2008)
Roughness. Perceptual description: perceived cycle-to-cycle variability in voice. Physiologic factors: vocal folds vibrate, but in an irregular way. Potential acoustic consequences: cycle-to-cycle variations in F0 and amplitude; elevated jitter; elevated shimmer.
Period/frequency & amplitude variability Jitter: variability in the period of each successive cycle of vibration Shimmer: variability in the amplitude of each successive cycle of vibration …
Jitter and Shimmer. Sources of jitter and shimmer: small structural asymmetries of vocal folds; “material” on the vocal folds (e.g. mucus); biomechanical events, such as raising/lowering the larynx in the neck; small variations in tracheal pressures; “bodily” events (system noise). Measuring jitter and shimmer: variability in measurement approaches; variability in how measures are reported. Jitter: typically reported as % or msec; normal ~0.2-1%. Shimmer: can be % or dB; norms not well established.
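Since measurement approaches vary, the sketch below shows just one common way to compute these measures, the mean absolute difference between consecutive cycles, reported in msec and % for jitter and in % and dB for shimmer. The formulas are an assumed choice for illustration, and the period and amplitude sequences are hypothetical.

```python
import math


def mean_abs_consecutive_difference(values):
    """Average absolute change from one cycle to the next."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def jitter_ms(periods_ms):
    """Jitter as mean absolute period-to-period difference, in msec."""
    return mean_abs_consecutive_difference(periods_ms)


def jitter_percent(periods_ms):
    """Jitter relative to the mean period, in percent."""
    mean_period = sum(periods_ms) / len(periods_ms)
    return 100.0 * jitter_ms(periods_ms) / mean_period


def shimmer_percent(peak_amplitudes):
    """Shimmer relative to the mean cycle amplitude, in percent."""
    mean_amplitude = sum(peak_amplitudes) / len(peak_amplitudes)
    return 100.0 * mean_abs_consecutive_difference(peak_amplitudes) / mean_amplitude


def shimmer_db(peak_amplitudes):
    """Shimmer as the mean absolute cycle-to-cycle level change, in dB."""
    changes = [abs(20.0 * math.log10(b / a))
               for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    return sum(changes) / len(changes)


# Hypothetical measurements from five consecutive glottal cycles.
periods = [5.00, 5.02, 4.99, 5.01, 4.98]      # msec
amplitudes = [1.00, 0.98, 1.01, 0.99, 1.00]   # arbitrary linear units

print(f"Jitter:  {jitter_ms(periods):.3f} ms, {jitter_percent(periods):.2f} %")
print(f"Shimmer: {shimmer_percent(amplitudes):.2f} %, {shimmer_db(amplitudes):.3f} dB")
```

Different analysis packages average over different numbers of cycles or normalize differently, so values computed this way are not directly comparable with values from another method.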
Additional features of voice. Regular fluctuations in frequency (and amplitude): vocal tremor; vocal “flutter”. Irregular fluctuations in frequency: diplophonia and/or pitch breaks.
Learning Objectives Briefly describe the range of instruments used to capture phonatory behavior, including stroboscopy, photoglottography, electroglottography, and laryngeal aerodynamics.
Measuring Glottal Behavior Videolaryngoscopy Stroboscopy High speed video
Photoglottography (PGG) [Figure: illumination vs. time]
Electroglottography (EGG). Human tissue = good conductor; air = poor conductor. Electrodes are placed on each side of the thyroid lamina, and a high frequency, low current signal is passed between them. ↑ VF contact = ↓ impedance; ↓ VF contact = ↑ impedance.
Electroglottogram
Glottal Airflow. Instantaneous airflow is measured as it leaves the mouth. It looks similar to a pressure waveform. It can be inverse filtered to remove the effects of the vocal tract. The result is an estimate of the airflow at the glottis.
Flow and Pressure Measurement