1 CS 551/651: Structure of Spoken Language Lecture 7: Syllable Structure, Vowel Neutralization, and Coarticulation John-Paul Hosom Fall 2010.

2 Syllables
Words are composed of phonetic clusters called syllables.
Each syllable has a nucleus; typically the nucleus is a vowel or diphthong, sometimes a syllabic nasal or lateral (“button”, “bottle”) or a rhotacized (r-colored) vowel (“bird”).
The nucleus is a syllabic nasal or lateral only when it follows an alveolar consonant in the previous syllable of the word.
Syllable boundaries are sometimes ambiguous:
  “beefeater”: beef/eater or bee/feater (Hunt, ICSLP04)
  “dolphin”: dol/phin or dolph/in (Wells, 1990)
  “tender”: ten/der or tend/er (Wells, 1990)
A syllable can be broken into components:
  syllable contains {onset, rhyme}
  rhyme contains {nucleus, coda}
  onset and coda are consonants; the nucleus is a vowel, syllabic nasal, or syllabic lateral
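The {onset, rhyme} and {nucleus, coda} decomposition can be written down as a small data structure. This is only an illustrative sketch: the class and field names are assumptions made here, not part of the lecture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rhyme:
    nucleus: str       # vowel, syllabic nasal, or syllabic lateral
    coda: List[str]    # zero or more consonants after the nucleus


@dataclass
class Syllable:
    onset: List[str]   # zero or more consonants before the nucleus
    rhyme: Rhyme

    def phonemes(self) -> List[str]:
        # Flatten the syllable back into its phoneme sequence.
        return self.onset + [self.rhyme.nucleus] + self.rhyme.coda


# "streak" /s t r iy k/: onset /s t r/, nucleus /iy/, coda /k/
streak = Syllable(onset=["s", "t", "r"], rhyme=Rhyme(nucleus="iy", coda=["k"]))
```

Calling `streak.phonemes()` gives back the flat sequence s, t, r, iy, k.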

3 Syllables
Limitations on consonant clusters: not all CCC combinations are possible in syllable-initial position. Of those that are possible, almost half are very rare:
  /s k l/: very few English words or roots, e.g. “sclerosis”
  /s p y/: possibly only one word in English, “spew”
  /s t y/: only a few English words, (optionally) pronounced this way: “Stewart”, “steward”, “stew”
  /s k y/: very few English words: “skew”, “askew”, “obscure”

4 Syllables
Sonority corresponds roughly (and inversely) to the degree of constriction along the vocal and/or nasal tract: the more open the tract, the more sonorous the sound.
Ordering of sonority, from most to least sonorous: vowels, glides (/w/, /y/), liquids (/l/, /r/), nasals, fricatives, affricates, plosives.
If a binary classification (sonorant/non-sonorant) is used, the sonorants consist of all vowels, glides, liquids, and nasals.
Fricatives, affricates, and plosives may be clustered into one category, “obstruents,” for purposes of sonority.
Syllabification can be done according to the “sonority principle”: sonority must rise toward the nucleus and fall after it within a syllable.
Also, there’s the Maximal Onset Principle: “Put a consonant in the onset rather than the coda when possible.”
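A minimal sketch of how the sonority principle and the Maximal Onset Principle could be combined into a syllabifier. The phoneme sets, the numeric sonority ranks, and the function names are assumptions made for illustration; the limit of 3 onset consonants is the English limit stated elsewhere in this lecture. Sonority alone overgenerates (it would accept onsets that English does not use), so this is not a complete model of English phonotactics.

```python
from typing import List

VOWELS = {"iy", "ih", "eh", "ae", "aa", "ah", "ao", "uh", "uw",
          "ow", "aw", "ay", "ey", "oy", "er"}
GLIDES = {"w", "y"}
LIQUIDS = {"l", "r"}
NASALS = {"m", "n", "ng"}
OBSTRUENT = 1  # fricatives, affricates, and plosives share one rank


def sonority(p: str) -> int:
    if p in VOWELS:
        return 5
    if p in GLIDES:
        return 4
    if p in LIQUIDS:
        return 3
    if p in NASALS:
        return 2
    return OBSTRUENT


def is_legal_onset(cluster: List[str]) -> bool:
    # At most 3 consonants; sonority must not fall toward the vowel,
    # and only obstruents may sit next to a sound of equal rank.
    if len(cluster) > 3:
        return False
    ranks = [sonority(p) for p in cluster]
    for a, b in zip(ranks, ranks[1:]):
        if a > b or (a == b and a != OBSTRUENT):
            return False
    return True


def syllabify(phonemes: List[str]) -> List[List[str]]:
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [list(phonemes)]
    boundaries = [0]
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = phonemes[left + 1:right]
        # Maximal Onset: try the longest candidate onset first.
        for k in range(len(cluster) + 1):
            if is_legal_onset(cluster[k:]):
                boundaries.append(left + 1 + k)
                break
    boundaries.append(len(phonemes))
    return [list(phonemes[a:b]) for a, b in zip(boundaries, boundaries[1:])]
```

With these assumptions, `syllabify("t eh n d er".split())` splits “tender” as ten/der, because /n d/ falls in sonority and so cannot be an onset, while /d/ alone can.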

5 Syllables
Because of the rise and fall of sonority in syllables, the following restrictions occur:
  (a) a glide (/w/, /y/) must be immediately adjacent to a vowel,
  (b) /r/ is the next closest consonant to the vowel,
  (c) /l/ is the next closest consonant to the vowel,
  (d) a nasal is the next closest,
  (e) an obstruent is farthest from the vowel (but there may be more than one obstruent in the onset or coda).
Obstruents in a cluster must have the same voicing.
In a series of obstruents between two vowels, voicing can change only once, at the syllable boundary.
English allows up to 3 consonants in syllable-initial position and 4 consonants in syllable-final position.
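The ordering and voicing restrictions can be checked mechanically. The sketch below is illustrative only: the phoneme sets, rank numbers, and function names are assumptions, and passing these checks is necessary but not sufficient for a cluster to be a real English onset or coda.

```python
from typing import List

GLIDES = {"w", "y"}
NASALS = {"m", "n", "ng"}
VOICED_OBS = {"b", "d", "g", "v", "dh", "z", "zh", "jh"}
VOICELESS_OBS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "hh"}
OBSTRUENT_RANK = 5


def distance_rank(p: str) -> int:
    # 1 = must be closest to the vowel, 5 = farthest from the vowel.
    if p in GLIDES:
        return 1
    if p == "r":
        return 2
    if p == "l":
        return 3
    if p in NASALS:
        return 4
    if p in VOICED_OBS or p in VOICELESS_OBS:
        return OBSTRUENT_RANK
    raise ValueError(f"not a consonant known to this sketch: {p}")


def ordering_ok(cluster: List[str], position: str) -> bool:
    # Walk outward from the vowel: ranks must increase, except that
    # several obstruents may follow one another.
    outward = list(reversed(cluster)) if position == "onset" else list(cluster)
    ranks = [distance_rank(p) for p in outward]
    for a, b in zip(ranks, ranks[1:]):
        if b < a or (b == a and a != OBSTRUENT_RANK):
            return False
    return True


def voicing_ok(cluster: List[str]) -> bool:
    # All obstruents in one cluster must agree in voicing.
    obs = [p for p in cluster if distance_rank(p) == OBSTRUENT_RANK]
    return (all(p in VOICED_OBS for p in obs)
            or all(p in VOICELESS_OBS for p in obs))
```

For example, the coda /r l/ of “Carl” passes `ordering_ok(["r", "l"], "coda")`, while the reversed coda /l r/ does not, and `voicing_ok(["k", "z"])` is false because the two obstruents differ in voicing.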

6 Syllables
Examples: sphere /s f iy r/, streak /s t r iy k/, texts /t eh k s t s/, helms /h eh l m z/, but not /s t l iy/, /s p w iy/, /z b r ay/, etc.
The ordering of glides and liquids doesn’t matter for our purposes (syllabification), because glides and liquids cannot occur sequentially within the same syllable in English. (However, two liquids in the same syllable are possible, e.g. “Carl”, as long as /r/ is closer to the vowel than /l/.)
In English, some burst-fricative pairs are represented as single phonemes (/ch/, /jh/), while other burst-fricative pairs remain sequences of two phonemes (e.g. “tsunami,” “bishops”, “six”).
It’s also possible to have two or more adjacent fricatives: “eleven twelfths” (note the 4 consonants after the final vowel).

7 Vowel Neutralization
When speech is uttered very quickly (or is not well enunciated), the formants tend to shift toward those of a neutral vowel.
[Figures: from Daniloff, p. 320; from van Bergem 1993, p. 8]

8 Vowel Neutralization Target undershoot: /m ih pc ph ih eh/

9 Vowel Neutralization Target undershoot: /ih/ extracted and concatenated from “mip”:

10 Vowel Neutralization
However, neutralization is not always so simple; sometimes vowel formants shift away from the neutral position, depending on their context, and vowels tend toward slightly different neutral “targets”.
Neutralization is to some extent an artifact of averaging over speakers and contexts (van Bergem 1993).
[Figure: vowels from one speaker in different phonetic contexts, and in “reduced” and “isolated” speaking conditions]

11 Coarticulation
Coarticulation is the “blending” of adjacent speech sounds, due to gradual movement of the articulators.
Coarticulation makes automatic speech recognition and text-to-speech synthesis difficult, but humans use coarticulation to conserve effort while speaking and to provide robustness during recognition.
There is Right-to-Left (RL) or “anticipatory” coarticulation and Left-to-Right (LR) or “carry-over” coarticulation.
Models of coarticulation and syllabification:
  Locus Theory
  Modified Locus Theory (Klatt)
  Öhman’s Theory
  Kozhevnikov-Chistovich (KC) Theory
  Wickelgren’s Theory, etc.

12 Coarticulation
RL coarticulation occurs due to high-level planning of phonetic sequences:
  “spoon”:                  [s   p   uw  n]
  rounding in isolation:     –   –   +   –
  rounding in context:       +   +   +   +
RL coarticulation is more observable if the neighboring sounds are not specified with respect to the potentially coarticulated feature; e.g. /s/, /p/, /n/ are not specified with respect to lip rounding.
(from Daniloff, pp )

13 Coarticulation: Locus Theory
Locus Theory (Delattre, Liberman, and Cooper, 1955):
“there are, for each consonant, characteristic frequency positions, or loci, at which the formant transitions begin, or to which they may be assumed to point. On this basis, the transitions may be regarded simply as movements of the formants from their respective loci to the frequency levels appropriate for the next phone … The spectrographic patterns …, which produce /d/ before /iy/, /aa/, and /ow/, show how … these transitions seem to be pointing to a [F2] locus in the vicinity of 1800 [Hz].”
  Each consonant has “target frequencies” independent of the neighboring vowels.
  Formants transition from these target frequencies to the vowel target frequencies.

14 Coarticulation: Locus Theory
Locus Theory:
  Consonants and vowels both have “targets” of articulator positions and therefore of formant frequency locations.
  Given sufficient duration of a syllable, all phonemes reach their targets.
  The slope of the formants during a transition from a consonant to a vowel is relatively constant until the target is reached.
  If the syllable duration doesn’t allow enough time for the formants to reach their targets, “target undershoot” occurs and the formants change direction before fully realizing the intended vowel.
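Target undershoot under a constant-slope transition can be illustrated numerically. In this sketch the 1800 Hz F2 locus for /d/ is the value quoted from Delattre, Liberman, and Cooper; the vowel target (2300 Hz), the slope (10 Hz/ms), and the function name are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def formant_after(locus_hz: float, target_hz: float,
                  slope_hz_per_ms: float, time_ms: float) -> float:
    """Formant frequency reached after time_ms of a constant-slope
    transition from the consonant locus toward the vowel target."""
    distance = target_hz - locus_hz
    # The formant cannot overshoot: it stops once the target is reached.
    travelled = min(abs(distance), slope_hz_per_ms * time_ms)
    return locus_hz + math.copysign(travelled, distance)


F2_LOCUS_D = 1800.0        # from the Locus Theory quotation
F2_TARGET_VOWEL = 2300.0   # hypothetical front-vowel target
SLOPE = 10.0               # hypothetical transition slope, Hz per ms

long_syllable = formant_after(F2_LOCUS_D, F2_TARGET_VOWEL, SLOPE, 80.0)
short_syllable = formant_after(F2_LOCUS_D, F2_TARGET_VOWEL, SLOPE, 30.0)
undershoot_hz = F2_TARGET_VOWEL - short_syllable
```

With 80 ms available the formant reaches the 2300 Hz target; with only 30 ms it gets to 2100 Hz, an undershoot of 200 Hz.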

15 Coarticulation: Locus Theory Locus Theory: (From Klatt 1987, p. 753)

16 Coarticulation: Modified Locus Theory
Problems with Locus Theory:
  A transition may have both rapid and slow components: a rapid release of the obstruction via the tongue tip, followed by slow movement of the tongue body.
  The preceding vowel can influence the F2 onset of a CV transition (Öhman, 1966).
  F2 may be insensitive to oral constrictions (obstruents) if the tongue position is toward the front of the mouth (as in /iy/).
(as reported by Fant 1973, Klatt 1987)

17 Coarticulation: Modified Locus Theory
Modified Locus Theory:
  Klatt hypothesized that the main effects of the vowel on the articulation of consonants are front/back position and lip rounding.
  Vowels are divided into three sets: {+front}, {+round}, {–front, –round} (because there are no rounded front vowels in English, sets 1 and 2 are mutually exclusive):
    {+front}            /iy ih eh ae/
    {+round}            /uw ao ow er/
    {–front, –round}    /uh ah aa aw/
  Predicted F_onset from F_target for each of these 3 classes (locus theory).
  Achieved 95% intelligibility for CVC nonsense syllables.
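A sketch of how a per-class onset prediction might look. The three vowel sets are taken from the slide. The linear form F_onset = F_locus + k (F_target − F_locus) is an assumption about how the prediction is parameterized, and every number in `HYPOTHETICAL_PARAMS` is a placeholder invented for illustration, not a value from Klatt.

```python
VOWEL_CLASS = {}
for v in ("iy", "ih", "eh", "ae"):
    VOWEL_CLASS[v] = "+front"
for v in ("uw", "ao", "ow", "er"):
    VOWEL_CLASS[v] = "+round"
for v in ("uh", "ah", "aa", "aw"):
    VOWEL_CLASS[v] = "-front,-round"

# (consonant, vowel class) -> (F2 locus in Hz, k); placeholder values only.
HYPOTHETICAL_PARAMS = {
    ("d", "+front"): (1800.0, 0.5),
    ("d", "+round"): (1700.0, 0.5),
    ("d", "-front,-round"): (1750.0, 0.5),
}


def predict_onset(f_target_hz: float, f_locus_hz: float, k: float) -> float:
    # k = 0 puts the onset at the locus; k = 1 puts it at the vowel target.
    return f_locus_hz + k * (f_target_hz - f_locus_hz)


def predict_f2_onset(consonant: str, vowel: str, f2_target_hz: float) -> float:
    locus, k = HYPOTHETICAL_PARAMS[(consonant, VOWEL_CLASS[vowel])]
    return predict_onset(f2_target_hz, locus, k)
```

The design point is that the consonant no longer has a single locus: the lookup key includes the class of the following vowel, so the same consonant can start its transition from a different place before front, rounded, and other vowels.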

18 Coarticulation: Locus Theory
Modified Locus Theory: (From Klatt 1987, p. 754)
Legend: × = –front, –round; ° = +front; [marker lost] = +round

19 Coarticulation: Öhman’s Theory
Öhman (1966) found that the loci of consonants are NOT independent of neighboring vowels, and that for /g/ more than one locus is required.
Conclusion: consonant “gestures” are superimposed on vowel “gestures” that are present during the consonant; even while the consonant is being uttered in a VCV sequence, both vowels have an effect on the consonant.

20 Coarticulation: Öhman’s Theory
Öhman (1967) proposed a model of coarticulation based on the vocal-tract shape evolving over time. The model assumes that vocal-tract shapes can be mapped to formant frequencies.
For VCV utterances, the model is expressed in terms of the following quantities:
  s(x,t) is the vocal-tract shape at position x and time t,
  v(x,t) is the vocal-tract shape at position x for a given vowel as it varies over time from vowel 1 to vowel 2,
  c(x) is the vocal-tract shape of the consonant,
  k(t) is an interpolation value (from 0 to 1), and
  w_c(x) describes the degree to which c(x) “resists” coarticulation.
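The quantities defined above match the form in which Öhman's model is commonly written, s(x,t) = v(x,t) + k(t) [c(x) − v(x,t)] w_c(x). That formula is supplied here as an assumption, since only the symbol definitions appear on the slide. The sketch below evaluates it at one instant for a vocal tract sampled at a few positions; the function name and the sample values are hypothetical.

```python
from typing import List


def ohman_shape(v: List[float], c: List[float],
                w_c: List[float], k: float) -> List[float]:
    """Vocal-tract shape at one instant, sampled at positions x.

    v   : vowel shape v(x, t) at this instant
    c   : consonant target shape c(x)
    w_c : per-position weight w_c(x), from 0 to 1
    k   : interpolation value k(t), from 0 to 1
    """
    return [vi + k * (ci - vi) * wi for vi, ci, wi in zip(v, c, w_c)]


vowel = [1.0, 2.0]        # hypothetical cross-section values
consonant = [0.0, 0.0]    # hypothetical full closure at both positions
weights = [1.0, 0.0]      # consonant controls position 0 only

at_vowel = ohman_shape(vowel, consonant, weights, k=0.0)
at_closure = ohman_shape(vowel, consonant, weights, k=1.0)
```

At k = 0 the shape is the vowel shape everywhere. At k = 1 the shape equals the consonant shape where w_c is 1 and stays at the vowel shape where w_c is 0, which is how the vowel can remain visible during the consonant.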

21 Coarticulation: Kozhevnikov-Chistovich (KC) Theory
(1) Syllabification using the C^nV pattern: CV, CCV, CCCV, …
  phrase “give true answers”: /g ih v t r uw ae n s er z/
  S1 = /g ih/, S2 = /v t r uw/, S3 = /ae/, S4 = /n s er/, S5 = /z/
(2) Measured relative durations of words, “syllables”, and vowels:
  relative duration of vowel = D_vow / D_syll
  relative duration of syllable = D_syll / D_word
  relative duration of word = D_word / D_phrase
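Both steps can be sketched in a few lines. The C^nV grouping below closes a “syllable” at every vowel, which reproduces S1 to S5 for “give true answers”; the vowel set, the function names, and the durations in the usage note are assumptions for illustration.

```python
from typing import Dict, List

VOWELS = {"iy", "ih", "eh", "ae", "aa", "ah", "ao", "uh", "uw",
          "ow", "aw", "ay", "ey", "oy", "er"}


def cnv_syllables(phonemes: List[str]) -> List[List[str]]:
    # Each unit is zero or more consonants followed by one vowel;
    # trailing consonants with no vowel form a final unit.
    units, current = [], []
    for p in phonemes:
        current.append(p)
        if p in VOWELS:
            units.append(current)
            current = []
    if current:
        units.append(current)
    return units


def relative_durations(d_vow: float, d_syll: float,
                       d_word: float, d_phrase: float) -> Dict[str, float]:
    return {
        "vowel": d_vow / d_syll,
        "syllable": d_syll / d_word,
        "word": d_word / d_phrase,
    }


units = cnv_syllables("g ih v t r uw ae n s er z".split())
```

As a usage example with made-up durations in milliseconds, `relative_durations(80, 200, 400, 1000)` gives a relative vowel duration of 0.4, a relative syllable duration of 0.5, and a relative word duration of 0.4.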

22 Coarticulation: Kozhevnikov-Chistovich (KC) Theory
They measured articulatory effects of the vowel on consonants.
They found coarticulation within a syllable but not across syllables: in a sequence C1 V1 C2 C3 V2, the C^nV syllables are C1V1 and C2C3V2.
  Articulatory gestures for the consonant(s) and vowel begin nearly simultaneously with the onset of the initial consonant in the syllable.
  Example: lip rounding in /uw/ begins with /v/ in “give true answers”, but nasalization of /ae/ does not occur.
  Focused only on RL (anticipatory) coarticulation, the effect of V on the previous C.
  Assumes motor programming of speech is discontinuous at the VC boundary.
  Counter-examples show LR coarticulation (Moll and Daniloff 1971; Kent, Carney, and Severeid 1974; Öhman 1966).

23 Coarticulation: Wickelgren’s Theory
Speech units are mentally coded as context-sensitive units: in the phonetic string /X Y Z/, Y is encoded as a unit that carries its left context X and its right context Z.
“By assuming (context-sensitive) allophones to be the basic unit of articulation, … it is trivial to account for how the ‘same phoneme’ in different phonemic environments can be … different in some respects at all levels of the speech process” (Wickelgren 1969, p. 11)
However, coarticulation can spread over more than one phone (up to seven phones away).
Other criticisms: MacNeilage 1970, Whitaker 1970, Halwes and Jenkins 1971; “Allophonic richness may only beget strategic poverty” (Kent and Minifie 1977)
However, Wickelgren’s is the only model currently used in ASR and concatenative text-to-speech (exceptions: Wouters 2001, Wrede 2001).
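Context-sensitive coding amounts to replacing each phone with a (left context, phone, right context) triple. A minimal sketch follows; the function name and the “#” boundary symbol used to pad the edges of the utterance are assumptions made here.

```python
from typing import List, Tuple


def context_sensitive_units(phonemes: List[str],
                            boundary: str = "#") -> List[Tuple[str, str, str]]:
    # Pad both ends so the first and last phones also get two neighbors.
    padded = [boundary] + list(phonemes) + [boundary]
    return [(padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)]


spoon_units = context_sensitive_units(["s", "p", "uw", "n"])
```

For “spoon” this yields four units, one per phone: (#, s, p), (s, p, uw), (p, uw, n), (uw, n, #). Each unit only sees one neighbor on each side, which is why coarticulation spreading across several phones is a problem for this scheme.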

24 Coarticulation: Gay’s Theory
Gay, 1977: The syllabic unit of motor organization is the CV unit.
Based on X-ray motion pictures of VCV utterances:
  anticipatory tongue movements for V2 in a V1CV2 sequence don’t begin until closure of C has been attained
  movement toward V2 occurs during closure of C, having a large effect on the position and shape of the tongue during release of the closure
  V1 has little effect on the position of the tongue at the moment of closure
Supports KC theory; conflicts with Öhman’s findings.

25 Coarticulation
Other models: MacNeilage, Henke, Benguerel and Cowan, Moll and Daniloff, Liberman, Tatham, etc.
Some are “feature based” in that each phonetic segment is assigned distinctive features which can then be modified in regular ways.
Some are “hierarchical models”, with several levels of organization and complex interaction between levels.
However, “coarticulatory patterns are not explained adequately by any … theories or models” (Kent and Minifie, 1977).
Conflicting evidence (Öhman and Kent & Moll vs. KC and Gay).