Hossein Sameti
Department of Computer Engineering
Sharif University of Technology
Phonetics
◦ The principal goal of phonetics is to provide an exact description of every known speech sound
◦ The domain of phonetics is independent of any particular language
Phonemics
◦ Phonemics is the study of speech sounds as they are perceived by speakers of a particular language
Articulatory phonetics
◦ How any given speech sound is produced, with particular emphasis on anatomical detail
Acoustic phonetics
◦ The emphasis is on observable, measurable characteristics in the waveform of speech sounds
◦ Provides the theoretical and experimental background for speech recognition and synthesis by electronic hardware
The first task of articulatory phonetics is to describe speech sounds in terms of the position of the vocal organs
Phonetic alphabet
◦ Phoneticians have had to devise their own systems of notation
◦ IPA
◦ ARPAbet
Phonation, Whispering, Frication, Compression, Vibration
Consonants are easy to define in anatomical terms
◦ Point of articulation: the location of the principal constriction in the vocal tract
  Bilabial, Labiodental, Apicodental, Apicogingival, Apicoalveolar, Apicodomal, Laminoalveolar, Laminodomal, Centrodomal, Dorsovelar, Pharyngeal, Glottal
◦ Manner of articulation: the degree of constriction at the point of articulation and the manner of release into the following sound
  Plosive, Aspirated, Affricate, Fricative, Lateral, Semivowel, Nasal, Trill
◦ Voicing: indicates the presence or absence of phonation
  Voiced, Unvoiced
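The three-way description of consonants (point of articulation, manner of articulation, voicing) can be sketched as a small feature record. This is only an illustration: the `Consonant` class, the `select` helper, and the tiny inventory are hypothetical, and the symbols are ARPAbet-style labels chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    """One consonant described by place, manner, and voicing."""
    symbol: str   # ARPAbet-style label (illustrative choice)
    place: str    # point of articulation
    manner: str   # manner of articulation
    voiced: bool  # presence or absence of phonation


# Small illustrative inventory (an assumption, not a complete table).
CONSONANTS = [
    Consonant("p", "bilabial", "plosive", False),
    Consonant("b", "bilabial", "plosive", True),
    Consonant("f", "labiodental", "fricative", False),
    Consonant("v", "labiodental", "fricative", True),
    Consonant("m", "bilabial", "nasal", True),
    Consonant("k", "dorsovelar", "plosive", False),
    Consonant("g", "dorsovelar", "plosive", True),
]


def select(inventory, **features):
    """Return the symbols whose features all match the given values."""
    return [
        c.symbol
        for c in inventory
        if all(getattr(c, name) == value for name, value in features.items())
    ]


voiced_plosives = select(CONSONANTS, manner="plosive", voiced=True)
bilabials = select(CONSONANTS, place="bilabial")
print(voiced_plosives, bilabials)
```

Selecting by one or more features gives the usual natural classes, for example all voiced plosives or all bilabials in the inventory.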
◦ Vowels are much less well defined than consonants, because the tongue typically never touches another organ. Vowels are described by:
  Tongue high or low
  Tongue front or back
  Lips rounded or unrounded
  Nasalized or unnasalized
◦ Diphthongs: two vowel sounds combined in a single syllable by moving the tongue from one position to another
◦ Coarticulation: no speech sound is produced in its ideal form when it occurs in the context of other sounds
◦ The overlapping of phonetic features from phone to phone is termed coarticulation
Phonetics is a view of speech sounds independent of the language
Phonemics is the view of speech sounds within a specific language
◦ Phonetics: an individual sound is a phone
◦ Phonemics: the smallest unit that distinguishes meaning in a specific language is the phoneme
A phoneme is the smallest sound unit in a given language that is sufficient to differentiate one word from another
Example:
◦ In English, voicing is a feature that distinguishes two phonemes: ‘bug’ contrasts with ‘buck’
◦ In some contexts voicing is not phonemic: in German, ‘Tag’ can be pronounced either [ta:g] or [ta:k]
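The ‘bug’ / ‘buck’ contrast is a minimal pair: two words whose phoneme strings differ in exactly one position. A minimal sketch of finding such pairs follows. The four-word lexicon and its ARPAbet-style transcriptions are assumptions made for the example, and `minimal_pairs` is a hypothetical helper.

```python
from itertools import combinations

# Toy lexicon with ARPAbet-style transcriptions (illustrative assumption).
LEXICON = {
    "bug": ("b", "ah", "g"),
    "buck": ("b", "ah", "k"),
    "back": ("b", "ae", "k"),
    "pack": ("p", "ae", "k"),
}


def differing_positions(a, b):
    """Indices where two equal-length phoneme strings differ."""
    if len(a) != len(b):
        return []
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def minimal_pairs(lexicon):
    """Word pairs that differ in exactly one phoneme, with the contrast."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        diff = differing_positions(p1, p2)
        if len(diff) == 1:
            i = diff[0]
            pairs.append((w1, w2, p1[i], p2[i]))
    return pairs


PAIRS = minimal_pairs(LEXICON)
for w1, w2, ph1, ph2 in PAIRS:
    print(f"{w1} / {w2}: /{ph1}/ contrasts with /{ph2}/")
```

Each pair found this way is evidence that the two differing sounds are separate phonemes in the language of the lexicon.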
Language families, with approximate numbers of speakers:
◦ Eskimo-Aleut: 60,000
◦ SOUTH-ASIAN (Vietnamese, Khmer, …): 45 million
◦ JAPANESE-KOREAN: 130 million
◦ BANTU and Related (Swahili, Zulu, …): 150 million
◦ INDO-EUROPEAN *: 1,500 million
◦ SINO-TIBETAN (Burmese, Chinese, Thai, Tibetan, …): 800 million
◦ SEMITIC and Related (Arabic, Ethiopic, Hamitic, Hebrew, …): 150 million
◦ MALAY-POLYNESIAN (Hawaiian, Indonesian, Maori, …): 140 million
◦ URAL-ALTAIC (Finnish, Hungarian, Mongolian, Turkish, …): 100 million
◦ DRAVIDIAN (Malayalam, Tamil, Telugu, …): 130 million
◦ LATIN-AMERICAN INDIAN (Quechua, Guarani, Arawak, Carib, …): 10 million
◦ NORTH-AMERICAN INDIAN (Aztecan, Algonquin, Iroquoian, Siouan, …): 10 million
* Indo-European branches:
◦ Baltic: Lithuanian, Lettish
◦ Celtic: Breton, Irish Gaelic, Welsh, …
◦ Hellenic: Greek
◦ Germanic: Dutch, Flemish, English, German, Scandinavian (Danish, Icelandic, Norwegian, Swedish), Yiddish
◦ Slavic: Bulgarian, Czech, Macedonian, Polish, Russian, Serbo-Croatian, Slovak, Slovene, Ukrainian, …
◦ Armenian
◦ Albanian
◦ Romance: Italian, French, Portuguese, Romanian, Spanish, …
◦ Indo-Iranian: Afghan, Bengali, Hindi, Kurdish, Persian, Sanskrit, Singhalese, Urdu, …
The largest number of phonemes known is 45 in Chipewyan; the smallest is 13 in Hawaiian
English has 31 to 64 and Persian has 29 to 45 phonemes, depending on how they are analyzed
Allophones
◦ A phoneme is actually a set of phonetically similar sounds that are accepted by the speakers of the language as being the same sound
◦ Members of the set are called allophones
Example:
◦ The /k/ in “kin” and “cup”
◦ The /k/ in “cope” and “scope”
English Phonemes
◦ Vowels: uw ux uh ah ax ah-h aa ao ae eh ih ix ey iy ay ow aw oy er axr el
◦ Semi-vowels: y r l el w
◦ Fricatives: jh ch s z sh zh f v th dh
◦ Nasals: m n ng em en eng nx
◦ Stops: b d g p t k dx q bcl dcl gcl pcl tcl kcl
◦ Aspiration: hv hh
There are over 40 speech sounds in American English, which can be organized by their basic manner of production

Manner class   Number
Vowels         18
Fricatives     8
Stops          6
Nasals         3
Semivowels     4
Affricates     2
Aspirant       1

Vowels, glides, and consonants differ in degree of constriction
Sonorant consonants have no pressure build-up at the constriction
Nasal consonants lower the velum, allowing airflow in the nasal cavity
Continuant consonants do not block airflow in the oral cavity
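As a quick arithmetic check on the manner-class counts listed on this slide, the sketch below sums them and confirms that the total is indeed "over 40". The dictionary name is a hypothetical choice; the numbers are the ones given on the slide.

```python
# Manner-class counts as listed on the slide.
MANNER_CLASS_COUNTS = {
    "vowels": 18,
    "fricatives": 8,
    "stops": 6,
    "nasals": 3,
    "semivowels": 4,
    "affricates": 2,
    "aspirant": 1,
}

total_sounds = sum(MANNER_CLASS_COUNTS.values())
non_vowel_sounds = total_sounds - MANNER_CLASS_COUNTS["vowels"]

print(f"total speech sounds: {total_sounds}")
print(f"non-vowel sounds:    {non_vowel_sounds}")
```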
Vowels
◦ No significant constriction in the vocal tract
◦ Usually produced with periodic excitation
◦ Acoustic characteristics depend on the position of the jaw, tongue, and lips
There are approximately 18 vowels in American English, made up of monophthongs, diphthongs, and reduced vowels (schwas)
They are often described by the articulatory features High/Low, Front/Back, Retroflexed, Rounded, and Tense/Lax
Vowels are often characterized by the lower three formants
◦ High/Low is correlated with the first formant, F1
◦ Front/Back is correlated with the second formant, F2
◦ Retroflexion is marked by a low third formant, F3
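The formant correlates above can be turned into a rough rule of thumb: a low F1 suggests a high vowel, a high F2 suggests a front vowel, and a low F3 suggests retroflexion. The sketch below is only an illustration. The function `describe_vowel`, the three threshold values, and the example formant frequencies are all assumptions (round numbers for an adult male voice), not measurements from these slides.

```python
def describe_vowel(f1_hz, f2_hz, f3_hz,
                   f1_split=450.0, f2_split=1500.0, f3_low=2000.0):
    """Map formant frequencies to coarse articulatory labels.

    The split points are illustrative assumptions; real systems would
    normalize for the speaker and use more than a hard threshold.
    """
    height = "high" if f1_hz < f1_split else "low"       # low F1 -> high vowel
    backness = "front" if f2_hz > f2_split else "back"   # high F2 -> front vowel
    retroflexed = f3_hz < f3_low                         # low F3 -> retroflexed
    return height, backness, retroflexed


# Rough illustrative (F1, F2, F3) values in Hz; assumptions, not data.
EXAMPLES = {
    "iy": (270.0, 2290.0, 3010.0),
    "aa": (730.0, 1090.0, 2440.0),
    "uw": (300.0, 870.0, 2240.0),
    "er": (490.0, 1350.0, 1690.0),
}

for label, formants in EXAMPLES.items():
    print(label, describe_vowel(*formants))
```

A two-way split on each formant is deliberately coarse: mid vowels and central vowels fall on one side or the other depending on where the thresholds are placed.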
Each vowel has a different intrinsic duration
◦ Schwas have distinctly shorter durations (50 ms)
◦ /ɪ, ɛ, ʌ, ʊ/ are the shortest monophthongs
◦ Context can greatly influence vowel duration
Fricatives
◦ Turbulence produced at a narrow constriction
◦ Constriction position determines acoustic characteristics
◦ Can be produced with periodic excitation
There are 8 fricatives in American English
◦ Four places of articulation: Labio-Dental (Labial), Interdental (Dental), Alveolar, and Palato-Alveolar (Palatal)
◦ They are often described by the features Voiced/Unvoiced, or Strident/Non-Strident (constriction behind alveolar ridge)
Strident fricatives tend to be stronger than non-strident fricatives.
Voiced fricatives tend to be shorter than unvoiced fricatives.
"Somewhat more accurate, yet somewhat less useful." 36
Example: “facetious”
Stops
◦ Complete closure in the vocal tract, pressure build-up
◦ Sudden release of the constriction, turbulence noise
◦ Can have periodic excitation during closure
There are 6 stop consonants in American English
◦ Three places of articulation: Labial, Alveolar, and Velar
◦ Each place of articulation has a voiced and an unvoiced stop
◦ Unvoiced stops are typically aspirated
◦ Voiced stops usually exhibit a “voice-bar” during closure
◦ Information about formant transitions and release is useful for classification
There are many voicing cues for a stop.
Unvoiced stops are unaspirated in /s/ stop sequences.
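The two observations about aspiration (unvoiced stops are typically aspirated, but not after /s/) amount to a context-dependent choice of allophone, as in “cope” versus “scope”. A minimal sketch under simplifying assumptions: the helper `realize_stops` is hypothetical, aspiration is marked with a made-up `_h` suffix, and only pre-vocalic stops are considered, which ignores stress and other conditioning factors.

```python
UNVOICED_STOPS = {"p", "t", "k"}

# ARPAbet-style vowel labels used to detect a following vowel (assumption).
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
    "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw",
}


def realize_stops(phonemes):
    """Pick an allophone for each unvoiced stop from its neighbours.

    Simplified rule: a pre-vocalic unvoiced stop is aspirated unless it
    directly follows /s/. All other phonemes are passed through unchanged.
    """
    phones = []
    for i, ph in enumerate(phonemes):
        before_vowel = i + 1 < len(phonemes) and phonemes[i + 1] in VOWELS
        after_s = i > 0 and phonemes[i - 1] == "s"
        if ph in UNVOICED_STOPS and before_vowel and not after_s:
            phones.append(ph + "_h")  # aspirated allophone
        else:
            phones.append(ph)
    return phones


cope = realize_stops(["k", "ow", "p"])
scope = realize_stops(["s", "k", "ow", "p"])
print(cope, scope)
```

Both words contain the same phoneme /k/; only the surface phone differs, which is why speakers hear them as the same sound.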
Example: “pacific”
Nasals
◦ Velum lowering results in airflow through the nasal cavity
◦ Consonants produced with closure in the oral cavity
◦ Nasal murmurs have similar spectral characteristics
Three places of articulation: Labial, Alveolar, and Velar
◦ Nasal consonants are always attached to a vowel, though they can form an entire syllable in unstressed environments
◦ /ng/ is always post-vocalic in English
◦ Place is identified by neighboring formant transitions
Example: “fisherman”
Semivowels
◦ Constriction in the vocal tract, no turbulence
◦ Slower articulatory motion than other consonants
◦ Laterals form a complete closure with the tongue tip, with airflow via the sides of the constriction
There are 4 semivowels in American English
Sometimes referred to as Liquids or Glides
Glides are a more extreme articulation of a corresponding vowel
◦ Similar, though more extreme, formant positions
◦ Generally weaker due to narrower constriction
Semivowels are always attached to a vowel, though /l/ can form an entire syllable in unstressed environments
/w/ and /l/ are the most confusable semivowels
/w/ is characterized by a very low F1 and F2
◦ Typically a rapid spectral falloff above F2
/l/ is characterized by a low F1 and F2
◦ Often presence of high-frequency energy
◦ Postvocalic /l/ is characterized by minimal spectral discontinuity and gradual motion of formants
/y/ is characterized by a very low F1 and a very high F2
◦ /y/ only occurs in syllable-onset position (i.e., pre-vocalic)
/r/ is characterized by a very low F3
◦ Prevocalic F3 < medial F3 < postvocalic F3
Example: “normalize”
There are two affricates in American English
◦ Alveolar-stop, palatal-fricative pairs
◦ Sudden release of the constriction, turbulence noise
◦ Can have periodic excitation during closure
There is only one aspirant in American English: /h/ (e.g., “hat”)
◦ Produced by generating turbulence excitation at the glottis
◦ No constriction in the vocal tract, normal formant excitation
◦ Sub-glottal coupling results in little energy in the F1 region
◦ Periodic excitation can be present in medial position
Example: “tragic”
Phonotactics is the study of allowable sound sequences
Analyses of word-initial and word-final clusters reveal:
◦ 73 distinct initial clusters (about 10 “foreign” clusters)
◦ 208 distinct final clusters
Can be used to eliminate impossible phoneme sequences:
◦ /tk/ can’t end a word, and
◦ /kt/ can’t begin a word
◦ Therefore, */… t k t …/ is an impossible sequence
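The /t k t/ argument can be sketched as a search over split points: a consonant cluster is possible only if it can be divided into a legal word-final part followed by a legal word-initial part. The two cluster sets below are tiny hypothetical samples, not the 73 initial and 208 final clusters mentioned above, and `legal_splits` is an illustrative helper.

```python
# Tiny illustrative cluster sets (assumptions). The empty tuple means
# "no consonant on this side of the boundary".
LEGAL_FINAL = {(), ("t",), ("k",), ("k", "t"), ("s", "t"), ("k", "s")}
LEGAL_INITIAL = {(), ("t",), ("k",), ("s", "t"), ("t", "r"), ("k", "r")}


def legal_splits(cluster, finals=LEGAL_FINAL, initials=LEGAL_INITIAL):
    """All ways to cut a cluster into a legal final + legal initial part."""
    cluster = tuple(cluster)
    return [
        (cluster[:i], cluster[i:])
        for i in range(len(cluster) + 1)
        if cluster[:i] in finals and cluster[i:] in initials
    ]


def is_possible(cluster):
    """True if at least one legal split exists."""
    return bool(legal_splits(cluster))


print(is_possible(["t", "k", "t"]))   # no legal split under these sets
print(legal_splits(["k", "t", "r"]))  # split as /k/ + /tr/
```

For /t k t/ every cut fails: /t/ + /kt/ needs a word to begin with /kt/, /tk/ + /t/ needs a word to end with /tk/, and the whole cluster is neither a legal final nor a legal initial cluster.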
Syllable structure captures many useful generalizations
◦ Phoneme realization often depends on syllabification
◦ Many phonological rules depend on syllable structure
Syllable structure is predicated on the notion of ranking the speech sounds in terms of their sonority values
Utterances can be divided into syllables
◦ The number of syllables equals the number of sonority peaks
◦ Within any syllable, there is a segment constituting a sonority peak that is preceded and/or followed by a sequence of segments with progressively decreasing sonority values
Branches marked by ° are optional
Nucleus must contain a non-obstruent
Sonority decreases away from the nucleus
Affix contains only coronals
◦ Only the last syllable in a word can have an affix
/sp/, /st/, and /sk/ are treated as single obstruents
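Counting syllables as sonority peaks, with the restriction that a nucleus must be a non-obstruent, can be sketched as follows. The numeric sonority scale, the phone-to-class table, and the function `count_syllables` are assumptions made for illustration; the slides only state that sounds are ranked by sonority. Requiring a non-obstruent peak keeps the /s/ in clusters such as /sp/ or a word-final affix /s/ from being counted as a syllable.

```python
# Illustrative sonority ranking (assumption): higher means more sonorous.
SONORITY = {
    "stop": 1, "fricative": 2, "nasal": 3,
    "liquid": 4, "glide": 5, "vowel": 6,
}

# ARPAbet-style phone labels grouped by class (partial, illustrative).
CLASS_SYMBOLS = {
    "vowel": "aa ae ah ao ax eh er ih iy ow uw",
    "glide": "w y",
    "liquid": "l r",
    "nasal": "m n ng",
    "fricative": "f v s z sh zh th dh",
    "stop": "p b t d k g",
}
CLASS_OF = {
    symbol: cls
    for cls, symbols in CLASS_SYMBOLS.items()
    for symbol in symbols.split()
}

# A nucleus must be a non-obstruent, i.e. at least as sonorous as a nasal.
MIN_NUCLEUS = SONORITY["nasal"]


def count_syllables(phones):
    """Count sonority peaks that qualify as syllable nuclei.

    A peak is more sonorous than its left neighbour and at least as
    sonorous as its right neighbour. Limitation: two adjacent vowels
    form a plateau and are counted as a single peak.
    """
    values = [SONORITY[CLASS_OF[p]] for p in phones]
    count = 0
    for i, value in enumerate(values):
        left = values[i - 1] if i > 0 else 0
        right = values[i + 1] if i + 1 < len(values) else 0
        if value >= MIN_NUCLEUS and value > left and value >= right:
            count += 1
    return count


print(count_syllables(["p", "ax", "s", "ih", "f", "ih", "k"]))  # "pacific"
print(count_syllables(["s", "p", "ih", "n"]))                   # "spin"
print(count_syllables(["l", "ih", "t", "l"]))                   # "little"
```

In “little” the final /l/ is a peak of its own, which matches the earlier remark that /l/ can form an entire syllable in unstressed environments.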