
Articulation and Coarticulation March 16, 2010

Update The hard drive on the computer in the booth failed. It will hopefully be fixed soon. The lab assignment will have to be postponed until (at least) Tuesday the 23rd… I will give you more info as it comes to me.

Recap There are lots and lots of muscles involved in the articulation of speech sounds. (Check out Praat’s articulatory synthesis!) Their motions are generally organized in a hierarchical fashion: coordinative structures, speech motor equivalence, acoustic goals “We speak in order to be heard in order to be understood.” Specific gestures can vary according to context, and can adapt quite rapidly to physical perturbations. The role of feedback remains unclear…

Delayed Auditory Feedback In the 1950s, engineers accidentally discovered the delayed auditory feedback (DAF) effect: talkers speak into a microphone and listen to themselves over headphones, but they don’t hear what they say until a short delay after they’ve said it. Q: What do you think happens? A: Fluent speakers sound, well, drunk: they speak with pauses, delays, interruptions, etc. However, stutterers can often become fluent with delayed auditory feedback!

Closed Feedback Loops The DAF effect inspired “closed feedback loop” models of speech production. In this model, articulatory commands are supposed to operate like a servomechanism = an automatic device that uses error-sensing feedback to correct the performance of a mechanism (like your thermostat). Hypothesis: motor commands keep firing until an articulatory goal is met. Only then do the commands for the next sound (or gesture) begin.
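
The thermostat analogy can be sketched in a few lines of illustrative Python (the function and units are made up, not part of any published model): each command keeps correcting the articulator in proportion to the error until feedback says the goal is met, and only then does the next command start.

```python
def closed_loop(goals, start=0.0, gain=0.5, tol=0.05):
    """Closed feedback loop sketch: a command keeps firing (correcting in
    proportion to the sensed error) until its articulatory goal is met;
    only then does the command for the next gesture begin."""
    position = start
    trajectory = []
    for goal in goals:
        while abs(goal - position) > tol:
            position += gain * (goal - position)  # error-sensing correction
            trajectory.append(position)
    return trajectory

# two successive articulatory targets (arbitrary units)
path = closed_loop([1.0, 0.4])
```

Note the key property (and key weakness) of the model: the second target cannot begin until feedback confirms the first has been reached.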

Outstanding Issues The closed feedback loop model had some problems… Motor commands couldn’t refer to other articulators (and therefore couldn’t account for on-the-fly adjustments made with respect to other articulators). Also: feedback doesn’t always seem to be important to speakers. Attempts were made to deprive speakers of auditory and tactile feedback: they mainly resulted in poorer accuracy for sounds like [s] (and other fricatives). Though note the “Lombard effect”: speakers do reflexively adjust their speech in noise, so auditory feedback is not irrelevant.

Open Feedback Loops The alternative model is an “open feedback loop” (Kozhevnikov and Chistovich, 1966). In this model, commands to produce syllables are issued without regard to feedback. Each syllable command is automatically generated with respect to an internal “rhythm generator”. Rhythm and phrase timing remain relatively constant; individual syllable timing can vary.
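
A minimal sketch of the open-loop idea (illustrative only, not Kozhevnikov and Chistovich’s actual formalism): syllable commands are stamped out on the beat of an internal clock, with no feedback check between them.

```python
def open_loop_schedule(syllables, period=0.25):
    """Open feedback loop sketch: each syllable command is issued on the
    beat of an internal rhythm generator, regardless of feedback; phrase
    timing is fixed by the clock alone."""
    return [(syllable, i * period) for i, syllable in enumerate(syllables)]

schedule = open_loop_schedule(["ba", "na", "na"])
```

Contrast with the closed-loop model: here nothing waits for a goal to be confirmed, so a perturbed syllable simply gets less time, rather than delaying everything after it.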

Aufhebung Browman & Goldstein’s Articulatory Phonology model is an attempt to combine what is known about: 1.Higher-level (task) organization in speech production 2.With the dynamic adaptation of speech gestures to changing contexts and conditions. This model assumes a hierarchy of articulatory representations, ranging from high-to-low dimensionality:

Nuts and Bolts Vocal tract variables reflect configurations of articulators in the vocal tract.

Nuts and Bolts Gestures are goal-directed manipulations of the vocal tract variables; they primarily change apertures, and they are implemented dynamically.

Dynamics Note: kinematics is the study of motion without regard to the forces that cause it; dynamics is the study of motions that result from forces. The trajectory of dynamic motions can therefore be shaped by different forces over time. In Articulatory Phonology, the basic gestural model has a sinusoidal pattern of activation = the articulator behaves like a mass on a spring.

Dynamics But the motion of the articulator can be damped in proportion to its proximity to an “equilibrium” point = the closer the articulator gets to its goal, the slower it moves. Critical damping ensures that the mass on the spring doesn’t keep bouncing around the equilibrium point. Simplifying assumption: the articulator reaches its goal at the 240 degree point of the wave cycle.
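
As a sketch (with made-up units and stiffness), the critically damped mass-on-a-spring solution for an articulator starting at rest is x(t) = target + (start − target)(1 + ωt)e^(−ωt): it slows as it nears the goal and never bounces past it.

```python
import math

def gesture_position(t, start, target, omega=30.0):
    """Critically damped mass-spring: the articulator approaches the
    equilibrium (target) ever more slowly, without oscillating past it."""
    return target + (start - target) * (1 + omega * t) * math.exp(-omega * t)

# tongue-tip aperture closing from 10 to 1 (arbitrary units) over 300 ms
trace = [gesture_position(t / 1000, 10.0, 1.0) for t in range(0, 301, 10)]
```

The two hedged assumptions here are the stiffness value (omega) and the units; the qualitative shape, a smooth one-sided approach to the target, is what the model relies on.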

Gestural Scores Different gestures may exhibit a phase relationship with each other. A gestural score organizes a series of gestures in time. Multiple gestures may be happening on different articulatory tiers at the same time. Gestural scores roughly resemble an autosegmental representation of phonological structure.
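
A gestural score can be pictured as tiers of activation intervals. The tier names and timings below are hypothetical, just to show how co-active gestures on different tiers fall straight out of the representation:

```python
# hypothetical gestural score for "ban": tier -> [(gesture, onset, offset)]
score = {
    "lips":        [("bilabial closure", 0.00, 0.08)],
    "tongue body": [("pharyngeal wide", 0.02, 0.20)],   # the vowel
    "tongue tip":  [("alveolar closure", 0.16, 0.26)],  # the /n/ closure
    "velum":       [("velum wide", 0.14, 0.26)],        # nasality
}

def active_at(score, t):
    """Gestures whose activation interval contains time t. Gestures on
    different tiers routinely overlap -- that overlap is coarticulation."""
    return sorted(name for tier in score.values()
                  for (name, on, off) in tier if on <= t < off)
```

During the vowel-to-/n/ transition, for instance, the vowel, nasal, and tongue-tip gestures are all simultaneously active, which is why the vowel comes out nasalized.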

Autosegmental Phonology Example feature tree from Halle et al. (2000) The “Feature Geometry” of autosegmental phonology organized features into sub-groupings of related features. The relationships were primarily articulatory in nature.

Feature Spreading Feature Geometry made useful predictions about what kinds of features were likely to function together in phonological processes. A typical example: the Place features.

Assumptions There are some important differences between the Feature Geometry and Articulatory Phonology models. 1. Feature Geometry generally assumed some sort of unifying “root” node, representing a segment; Articulatory Phonology did not. (There are no phonemes!) 2. Processes in Feature Geometry were discrete; in Articulatory Phonology, they can be gradient. 3. Representations in Feature Geometry are static; in Articulatory Phonology, they are dynamic.

Advantages The “translation” between phonology and phonetics is directly encoded into the model in Articulatory Phonology. I.e., the gestures may be discrete, but they represent instructions for moving articulators over time. Also: Articulatory Phonology can account for gradient phenomena that discrete/symbolic phonology cannot. Two possible levels of phonological action: Discrete ( = elimination/addition/alteration of units in the gestural score) Gradient (= changes in the magnitude/duration/phasing of gestures)

For Instance Farnetani (1986) investigated the coarticulation of /n/ with palatals and post-alveolars in Italian. …using electropalatography (EPG)

EPG: Therapeutic Applications

EPG: Scientific Applications (from Barry 1992) The contact pattern of electrodes has to be interpreted with respect to the phonological categories of interest. (This works better for more anterior places of articulation.) The contact pattern also changes quite rapidly over time.
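
One common way to turn EPG frames into numbers is a contact index: the fraction of contacted electrodes in the anterior rows of the palate, which peaks during an alveolar closure. The sketch below is a toy version (real EPG palates have 62+ electrodes; the grid and function here are invented for illustration):

```python
def anterior_contact(frame, rows=2):
    """Fraction of contacted electrodes (1 = contact) in the front rows
    of a toy EPG frame -- a crude index of alveolar closure."""
    front = [cell for row in frame[:rows] for cell in row]
    return sum(front) / len(front)

# toy 4x4 frame with a full front-row seal, as in an alveolar /n/
closure_frame = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
```

Tracking this index frame by frame (EPG typically samples every 10 ms) is how a study like Farnetani’s can tell a complete coronal closure from a reduced, partial one.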

Gradient vs. Discrete Results of Farnetani (1986): “gestural overlap” between /n/ and post-alveolars is complete → discrete assimilation. Timing overlap between /n/ and palatal articulation is partial → gradient assimilation.

B + G (1990): Predictions “We propose that most of the phonetic units (gestures) that characterize a word in careful pronunciation will turn out to be observable in connected speech, although they may be altered in magnitude and in their temporal relation to other gestures. In faster, casual speech, we expect gestures to show decreased magnitudes (in both space and time) and to show increasing temporal overlap.” Test cases (Brown, 1977): Are these “connected speech processes” best described as discrete or gradient phenomena?

Theoretically [Gestural scores for hyperspeech and hypospeech shown side by side] In this model, the /t/ in “must be” is not deleted; it’s just hidden behind the /b/, due to temporal reorganization.
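
The “hidden /t/” idea reduces to a simple interval check (the timings below are hypothetical): the tongue-tip gesture is still executed, but its whole activation interval falls inside the bilabial closure, so it leaves no acoustic trace.

```python
def is_hidden(gesture, masker):
    """A gesture is acoustically hidden when its activation interval is
    entirely contained within an overlapping closure on another tier."""
    (_, on, off), (_, m_on, m_off) = gesture, masker
    return m_on <= on and off <= m_off

# "must be" in fast speech: the /t/ closure happens during the /b/ closure
t_closure = ("alveolar closure /t/", 0.12, 0.18)
b_closure = ("bilabial closure /b/", 0.10, 0.22)
```

In hyperspeech the /t/ interval would start before the /b/ closure begins, so the same check would come out false and the /t/ would be audible.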

X-Ray Microbeam Data! Browman & Goldstein fine-tuned their model on the basis of data collected from an X-ray microbeam study of connected speech. This data also verified some of their suspicions about the presence of “hidden” gestures in connected speech.

Perfect Memories [Gestural scores: hyperspeech production vs. hypospeech production]

“Place Assimilation” Hyperspeech = not completely assimilated; hypospeech = assimilated production. Important: there is no reassignment of feature values between segments; there is just a reorganization of the timing and magnitude of gestures.

This is the End? Moral of the story: Articulatory Phonology can capture the gradient, highly variable nature of gestures in context. Timing reorganizations may also lead to insertions (e.g., “else” produced as “eltse”). Another gradient phenomenon: magnitude reduction.

Place Assimilation: the EPG view There is evidence from EPG studies that the magnitude of coronal gestures may be reduced in an “assimilatory” context. Data from Kerswill & Wright (1989): note: time scale is the same in all productions (10 ms/frame)

Assimilation Perception Kerswill & Wright (1989) presented these four types of tokens to trained phoneticians in a combined transcription + identification task. Transcription results: Note: heavy bias towards /d/ responses. (Why?)

Assimilation Perception The phoneticians also classified all of the items transcribed as /d/ as one of the three types: Place perception, like place production, seems to be gradient.

Stress = Hyperspeech? DeJong et al. (1993) collected X-ray microbeam data on the effects of stress on coarticulation across syllables. Note: stress involves a complex set of acoustic correlates. Stressed syllables are higher in pitch than unstressed syllables (usually) Stressed syllables are longer in duration than unstressed syllables (usually) Stressed syllables are higher in intensity than unstressed syllables (usually) DeJong et al. found: stressed syllables exhibit less gestural overlap than unstressed syllables.

Prompting Hyperspeech, version 47 DeJong et al. (1993) sat speakers in front of the X-ray microbeam and had them read sentences in the following frames: 1. Prompt: Did you say, “throw the toads on the table?” Target: I said “PUT the toads on the table.” 2. Prompt: Did you say, “put the frogs on the table?” Target: I said “put the TOADS on the table.” Question: how much coarticulation is there between “put” and “the” in the two conditions?

X-ray microbeam data [Panels: focus on PUT, focus on TOADS, focus on PUT]

X-ray microbeam thoughts “Stress locally shifts articulations toward the hyperarticulate end of the continuum. Speakers do whatever is necessary to enhance the realization of segmentally contrasting features. A primary mechanism for enhancing distinctions is to decrease coarticulatory overlap so that gestures for segments in stressed syllables blend less with each other or with segments in neighboring syllables.” Counter-thought: coarticulation between segments can often be a useful perceptual cue. Especially in the case of stop consonants…