74.419 Artificial Intelligence 2004 Speech & Natural Language Processing Speech Recognition acoustic signal as input conversion into written words Natural.

1 74.419 Artificial Intelligence 2004
  Speech & Natural Language Processing
  Speech Recognition
    acoustic signal as input
    conversion into written words
  Natural Language Processing
    written text as input
    sentences (well-formed or not)
  Spoken Language Understanding
    analysis of spoken language (transcribed speech)

2 Speech & Natural Language Processing
  Areas in Speech Recognition
    Signal Processing
    Phonetics
    Word Recognition
  Areas in Natural Language Processing
    Morphology
    Grammar & Parsing (syntactic analysis)
    Semantics
    Pragmatics
    Discourse / Dialogue
  Spoken Language Understanding

3 Speech Production & Reception
  Sound and Hearing
    change in air pressure → sound wave
    reception through inner ear membrane / microphone
    break-up into frequency components: receptors in cochlea / mathematical frequency analysis (e.g. Fast Fourier Transform, FFT) → Frequency Spectrum
    perception/recognition of phonemes and subsequently words (e.g. Neural Networks, Hidden Markov Models)
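
The mathematical frequency analysis mentioned on this slide can be sketched in a few lines. The block below is a minimal illustration, not part of the original slides: it implements a textbook radix-2 FFT using only the Python standard library and applies it to a synthetic two-tone signal. The sample rate, tone frequencies, and the names `fft`, `magnitude_spectrum`, and `peak_freq` are assumptions chosen for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples, sample_rate):
    """Return (frequency in Hz, magnitude) pairs for the lower half of the bins."""
    n = len(samples)
    bins = fft(samples)
    return [(k * sample_rate / n, abs(bins[k])) for k in range(n // 2)]


# Assumed toy input: a 250 Hz tone plus a weaker 1000 Hz tone at 8000 Hz.
# Both frequencies fall exactly on FFT bins for N = 1024, so there is no leakage.
SAMPLE_RATE = 8000
N = 1024
signal = [
    math.sin(2 * math.pi * 250 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    for t in range(N)
]

spectrum = magnitude_spectrum(signal, SAMPLE_RATE)
# The strongest frequency component is the bin with the largest magnitude.
peak_freq = max(spectrum, key=lambda pair: pair[1])[0]
print(peak_freq)
```

Real speech is not stationary, so in practice the transform is applied to short time frames rather than to the whole recording.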

4

5

6 Speech Recognition
  Acoustic / sound wave
    → Signal Processing / Analysis: Filtering, Sampling; Spectral Analysis; FFT
  Frequency Spectrum; Features (Phonemes; Context)
    → Phoneme Recognition: HMM, Neural Networks
  Phonemes
    → Grammar or Statistics
  Phoneme Sequences / Words
    → Grammar or Statistics for likely word sequences
  Word Sequence / Sentence
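
The slide names Hidden Markov Models as one option for phoneme recognition. As a hedged sketch of what decoding with an HMM involves, the block below runs the Viterbi algorithm on a toy model. Everything in the model is invented for illustration: the three phoneme states, the quantized acoustic labels (`burst_low`, `voiced`, `burst_high`), and all probabilities. A real recognizer would estimate these from data and use continuous acoustic features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in state s, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model for the word "cat" (/k/ /ae/ /t/).
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k": {"k": 0.3, "ae": 0.6, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.4, "t": 0.5},
    "t": {"k": 0.1, "ae": 0.1, "t": 0.8},
}
emit_p = {
    "k": {"burst_low": 0.7, "voiced": 0.1, "burst_high": 0.2},
    "ae": {"burst_low": 0.1, "voiced": 0.8, "burst_high": 0.1},
    "t": {"burst_low": 0.2, "voiced": 0.1, "burst_high": 0.7},
}

# One acoustic label per time frame.
observations = ["burst_low", "voiced", "voiced", "burst_high"]
prob, path = viterbi(observations, states, start_p, trans_p, emit_p)
print(path)  # ['k', 'ae', 'ae', 't']
```

The repeated `ae` shows how a single phoneme can span several time frames; collapsing repeats yields the phoneme sequence that later stages map to words.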

7 Speech Signal
  Analog-Digital Conversion of acoustic signal → Sampling in Time Frames (“windows”)
  Characteristics of a Speech Signal
    • formants - strong frequency components; characterize e.g. vowels, gender of speaker; dark stripes in the spectrogram
    • pitch - fundamental frequency (base of the higher-frequency harmonics; formants are the vocal-tract resonances that emphasize some of these harmonics)
    • place of articulation (recognition model based on model of vocal tract)
    • change in frequency distribution
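
Sampling in time frames can be illustrated as follows. This is a sketch under assumptions that do not come from the slides: 25 ms frames with a 10 ms hop and a Hamming window are common choices in speech processing, and the helper names `frame_signal`, `hamming`, and `windowed_frames` are hypothetical.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]


def hamming(n):
    """Hamming window: tapers the frame edges to reduce spectral leakage."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_frames(samples, sample_rate, frame_ms=25, hop_ms=10):
    """Split samples into frames and multiply each frame by a Hamming window."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    window = hamming(frame_len)
    return [
        [s * w for s, w in zip(frame, window)]
        for frame in frame_signal(samples, frame_len, hop_len)
    ]


# Assumed toy input: one second of a 200 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
samples = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
frames = windowed_frames(samples, SAMPLE_RATE)
print(len(frames), len(frames[0]))  # 98 frames of 200 samples each
```

Each windowed frame would then be passed to a frequency analysis such as the FFT, giving one spectrum per frame.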

8

9 Video of glottis and speech signal in lingWAVES (from http://www.lingcom.de)

10

11 Speech Signal
  Analog-Digital Conversion of Acoustic Signals → Sampling
  Analysis of Signal in Time Frames (“windows”)
  Characteristics of a Speech Signal
    • formants - strong frequency components; characterize e.g. vowels, gender of speaker; dark stripes in the spectrogram
    • pitch - fundamental frequency (base of the higher-frequency harmonics; formants are the vocal-tract resonances that emphasize some of these harmonics)
    • place of articulation (recognition model based on model of vocal tract)
    • change in frequency distribution
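
Pitch, as the fundamental frequency of one frame, can be estimated with a simple autocorrelation search. The slides do not say which method the course used, so the block below is only one possible sketch. The search range of 80 to 400 Hz, the synthetic frame, and the function name `estimate_pitch` are assumptions made for the example.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    The lag at which the frame best matches a shifted copy of itself is
    taken as the pitch period. Lags are restricted to the assumed pitch range.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    # Compare the same number of samples at every lag so scores are comparable.
    span = len(frame) - max_lag

    def autocorr(lag):
        return sum(frame[i] * frame[i + lag] for i in range(span))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


# Assumed toy frame: 125 Hz fundamental plus two weaker harmonics at 8000 Hz.
# 356 samples = a 256-sample comparison span plus the maximum lag of 100.
SAMPLE_RATE = 8000
FRAME_LEN = 356
frame = [
    math.sin(2 * math.pi * 125 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 250 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 375 * t / SAMPLE_RATE)
    for t in range(FRAME_LEN)
]

pitch = estimate_pitch(frame, SAMPLE_RATE)
print(pitch)  # 125.0
```

On real speech the estimate is noisier, and unvoiced frames have no meaningful pitch, so practical systems add a voicing check and smoothing across frames.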

12

13

14

15

16 Speech Recognition Characteristics
  Speech Recognition vs. Speaker Identification
  Speaker-dependent vs. speaker-independent
  Single word vs. continuous speech
  Large vs. small vocabulary

17

18 Additional References
  Huang, X., A. Acero & H.-W. Hon: Spoken Language Processing. A Guide to Theory, Algorithm, and System Development. Prentice-Hall, NJ, 2001.

