Published by Sabrina Shelton. Modified over 9 years ago.
2
Spectrogram of “Being in a crowd is tiring.” Time Frequency
3
Speech synthesis has been the most important tool used to answer these kinds of questions.
Method:
1. Spend huge amounts of time studying spectrograms and coming up with hypotheses.
2. Test the hypotheses using a synthesizer.
4
Hand-painted spectrograms on the Pattern Playback (Haskins Labs, early 1950s). Panel labels: “glue”, [mɑ ɑmɑ ɑm], “typical”, [bɑ wɑ], “labs”.
5
The state of the art in speech synthesis in the early 1950s – wildly exciting (to eggheads) at the time.
7
Five Hard Problems
1. Context sensitivity (acoustic-phonetic invariance, perceptual constancy)
2. Segmentation problem
3. Talker variability or talker normalization problem
4. Phonological recoding problem
5. Word segmentation problem
10
Epsilon ([ɛ]) in Three Different Phonetic Contexts (seven, less, ten) “Two plus seven is less than ten.”
12
Segmentation problem – the individual speech sounds tend to run into one another, with few clear dividing lines. How can speech sounds be identified if you can’t tell where they begin and end?
13
Talker Variability or Talker Normalization
Speech patterns vary from one individual talker to the next. Sources of talker variability:
1. Vocal-tract length
2. Detailed configuration of the vocal tract
3. Voice pitch and other characteristics associated with the laryngeal tone
4. Dialect
5. Idiosyncratic differences
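To make the first source of variability concrete, here is a minimal sketch that treats the vocal tract as a uniform tube, closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/4L. This textbook idealization is an assumption added for illustration and is not taken from the slides; the function name, the two tract lengths, and the rounded speed of sound are likewise illustrative choices.

```python
# Assumed round figure for the speed of sound in the vocal tract (cm/s).
SPEED_OF_SOUND_CM_PER_S = 35000.0


def tube_resonances(length_cm, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Uses the quarter-wavelength rule: F_n = (2n - 1) * c / (4 * L).
    A real vocal tract is not uniform, so these are only rough,
    neutral-vowel-like values.
    """
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Two hypothetical talkers that differ only in tract length.
    for label, length_cm in [("longer tract, 17.5 cm", 17.5),
                             ("shorter tract, 14.0 cm", 14.0)]:
        print(label, tube_resonances(length_cm))
```

Under these assumptions the 17.5 cm tube resonates at 500, 1500, and 2500 Hz and the 14 cm tube at 625, 1875, and 3125 Hz. Every resonance shifts upward by the same ratio, so the same intended vowel arrives with different formant frequencies depending on the talker.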
16
Phonological Recoding
17
Detection of Word Boundaries 1
[tɛlðəgɑɚdnɚtəplæntsʌmtulɪps]
18
Detection of Word Boundaries 2 (Example from Ron Cole)
Remember, a spoken sentence often contains many words that were not intended to be heard.
Ream ember, us poke in cent tense off in contains men knee words that were knot in tend did tube bee herd.
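The ambiguity in the example above can be sketched in a few lines of code: given an unbroken stream and a lexicon, enumerate every way of carving the stream into known words. This is a toy illustration, not a model of how listeners do it. It uses ordinary spelling as a stand-in for a phoneme string, and the stream, the lexicon, and the function name are all invented for the example.

```python
def segmentations(stream, lexicon):
    """Return every way to split `stream` into a sequence of lexicon words."""
    if not stream:
        return [[]]  # one way to segment nothing: the empty parse
    parses = []
    for end in range(1, len(stream) + 1):
        prefix = stream[:end]
        if prefix in lexicon:
            # Keep this word and segment whatever remains after it.
            for rest in segmentations(stream[end:], lexicon):
                parses.append([prefix] + rest)
    return parses


if __name__ == "__main__":
    # Hypothetical toy lexicon; a real one would hold phonemic forms.
    toy_lexicon = {"no", "now", "here", "where", "nowhere"}
    for parse in segmentations("nowhere", toy_lexicon):
        print(" ".join(parse))
```

Even this seven-letter stream has three complete parses ("no where", "now here", "nowhere"). Nothing in the stream itself marks which one the speaker intended, which is the word segmentation problem in miniature.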