Slide 1: Application of the Noisy Channel Model; Channel Entropy
CS 621 Artificial Intelligence, Lecture 15 (06/09/2005)
Prof. Pushpak Bhattacharyya, IIT Bombay
Slide 2: The Noisy Channel
S = {s1, s2, ..., sq}   (transmitted symbols)
R = {t1, t2, ..., tq}   (received symbols)
S --> Noisy Channel --> R
Speech recognition (ASR, Automatic Speech Recognition) involves:
- signal processing (low level)
- cognitive processing (higher-level categories)
Slide 3: The Noisy Channel Metaphor
Due to Jelinek (IBM), 1970s; the main field of study was speech.
Problem definition:
S = {speech signals} = {s1, s2, ..., sp}
R = {words} = {w1, w2, ..., wq}
Slide 4: A Special and Easier Case
Isolated Word Recognition (IWR): the complexity due to word boundaries does not arise.
Example of word-boundary ambiguity in continuous speech: 'I got a plate' vs. 'I got up late'.
Slide 5: Homophones and Homographs
Homophones: words with the same pronunciation. Example: bear, bare.
Homographs: words with the same spelling but different meanings. Example: bank (river bank vs. financial bank).
Slide 6: World of Sounds
World of sounds (speech signals): phonetics, phonology.
World of words (orthography): letters, i.e., consonants and vowels.
Slide 7
Since the alphabet-to-sound mapping is not one-to-one, vowels in particular vary:
'tomato' may be pronounced 'tomaeto' or 'tomaato'.
Slide 8: Sound Variations
Lexical variation: 'because' --> ''cause' (an alternative word form).
Allophonic variation: 'because' --> 'becase' (a variation in the sounds of the same form).
Slide 9: Allophonic Variations
A more remarkable example:
'do' --> [d][u]
'go' --> [g][o]
Slide 10: Socio-cultural and Dialectal Variations
Socio-cultural variation: 'something' (formal) vs. 'somethin' (informal).
Dialectal variation: 'very' --> 'bheri' in Bengal; 'apple' --> 'ieple' in the south, 'eple' in the north, 'aapel' in Bengal.
Slide 11: Orthography to Phonology
Mapping orthography to phonology is a complex problem, very difficult to model with a rule-governed system.
Slide 12: The Probabilistic Approach
W* = the best estimate of the word, given the signal s.
s --> Noisy Channel --> W*
W* = ARGMAX_w P(w | s), where w ranges over the set of words.
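To make the decision rule concrete, here is a minimal Python sketch of the MAP choice over a candidate vocabulary; the vocabulary, the signal handle, and the posterior function are illustrative assumptions, not part of the lecture:

```python
# Minimal sketch of the MAP decision rule W* = argmax_w P(w|s).
# 'vocabulary' and 'posterior' are illustrative placeholders.

def map_decode(signal, vocabulary, posterior):
    """Return the word w maximizing posterior(w, signal) ~ P(w|s)."""
    return max(vocabulary, key=lambda w: posterior(w, signal))

# Toy usage: a made-up posterior table for one fixed signal.
toy_posterior = {"knee": 0.7, "need": 0.3}
best = map_decode("s1", ["knee", "need"], lambda w, s: toy_posterior[w])
print(best)  # knee
```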
Slide 13: Parameter Estimation
P(w | s) is called a 'parameter' of the system. Estimation = training: the probability values are estimated from speech corpora, built by recording the speech of many speakers.
Slide 14: What a Speech Corpus Looks Like
[Figure: a speech signal for the word 'apple', annotated with its unique pronunciation]
Slide 15: Repositories of Standard Sound Symbols
IPA – the International Phonetic Alphabet (of the International Phonetic Association).
ARPABET – the American phonetic standard.
Slide 16
Augment the Roman alphabet with Greek symbols.
Examples (IPA): the letter 'e' maps to [ɛ] in 'ebb' but to [i] in 'need'; the letter 't' maps to [t] in 'top', while the Greek [θ] (theta) writes the 'th' sound, as in 'tooth'.
Slide 17: Annotating Speech Corpora
Speech corpora are annotated with IPA/ARPABET symbols.
Indian scenario:
- Hindi: TIFR
- Marathi: IITB
- Tamil: IITM
Slide 18: Estimating P(w | s)
How is P(w | s) estimated from a speech corpus? The direct relative-frequency estimate, count(w, s) / count(s), is not used: the signal s is a continuous, high-dimensional object, so a given (w, s) pair essentially never recurs and the counts would be hopelessly sparse.
Slide 19: Apply Bayes' Theorem
P(w | s) = P(w) · P(s | w) / P(s)
W* = ARGMAX_w [ P(w) · P(s | w) / P(s) ]
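The step to the next slide (dropping P(s)) deserves one line of justification: P(s) does not depend on the word being hypothesized, so it cannot change which w wins the argmax. Worked out in LaTeX:

```latex
\begin{align*}
W^* &= \arg\max_w P(w \mid s)
     = \arg\max_w \frac{P(w)\,P(s \mid w)}{P(s)} \\
    &= \arg\max_w P(w)\,P(s \mid w)
     \qquad \text{(since $P(s)$ is constant over $w$)}
\end{align*}
```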
Slide 20: The Two Models
W* = ARGMAX_w P(w) · P(s | w), where w ranges over the words.
P(w) = the prior = the language model.
P(s | w) = the likelihood of w being pronounced as s = the acoustic model.
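A minimal sketch of the decoder factored this way; the toy model tables and the log-space scoring are my assumptions for illustration, not the lecture's implementation:

```python
import math

# Toy language model P(w) and acoustic model P(s|w); the numbers are made up.
language_model = {"knee": 0.02, "need": 0.01}
acoustic_model = {("knee", "s1"): 0.5, ("need", "s1"): 0.4}

def noisy_channel_decode(signal, vocabulary):
    """W* = argmax_w P(w) * P(s|w), scored in log space for numerical stability."""
    def score(w):
        return math.log(language_model[w]) + math.log(acoustic_model[(w, signal)])
    return max(vocabulary, key=score)

print(noisy_channel_decode("s1", ["knee", "need"]))  # knee
```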
Slide 21: The Acoustic Model
Built from a pronunciation dictionary, represented as finite-state automata. Manually built, hence a costly resource.
Example: 'tomato', states 0 through 6:
0 --t--> 1 --o--> 2 --m--> 3 --aa|ae--> 4 --t--> 5 --o--> 6
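A minimal sketch of such a pronunciation automaton as a transition table; the phone symbols follow the 'tomato' example above, but the dict representation is my assumption:

```python
# Pronunciation FSA for 'tomato' as a dict: (state, phone) -> next state.
# The middle vowel branches: both 'aa' and 'ae' lead from state 3 to state 4.
TOMATO_FSA = {
    (0, "t"): 1, (1, "o"): 2, (2, "m"): 3,
    (3, "aa"): 4, (3, "ae"): 4,
    (4, "t"): 5, (5, "o"): 6,
}
FINAL_STATE = 6

def accepts(fsa, phones, final_state):
    """Check whether a phone sequence is a valid pronunciation."""
    state = 0
    for p in phones:
        if (state, p) not in fsa:
            return False
        state = fsa[(state, p)]
    return state == final_state

print(accepts(TOMATO_FSA, ["t", "o", "m", "aa", "t", "o"], FINAL_STATE))  # True
print(accepts(TOMATO_FSA, ["t", "o", "m", "ae", "t", "o"], FINAL_STATE))  # True
```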
Slide 22: The Language Model
W* is obtained from P(w) and P(s | w). What is the language model? The relative frequency of w in the corpus; relative frequency ≡ the unigram model.
The unigram model ignores context. Suppose P(knee) > P(need) in the corpus; then for 'I ____', the model gives 'knee' high probability and 'need' low probability, even though 'need' is the natural continuation after 'I'.
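A minimal sketch of the relative-frequency (unigram) estimate; the toy corpus is a made-up placeholder:

```python
from collections import Counter

def unigram_model(tokens):
    """Relative-frequency (unigram) estimates: P(w) = count(w) / N."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

tokens = "i need tea i need rest my knee hurts".split()
P = unigram_model(tokens)
print(P["need"], P["knee"])  # 0.222... 0.111...
```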
Slide 23: Language Modelling by N-grams
An N-gram model conditions a word's probability on the previous N-1 words:
N = 2: bigrams.
N = 3: trigrams (empirically the best for English).
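A minimal sketch of a bigram (N = 2) model; the maximum-likelihood estimate P(w2 | w1) = count(w1 w2) / count(w1) is standard, but the toy corpus is again a placeholder:

```python
from collections import Counter

def bigram_model(tokens):
    """Bigram MLE: P(w2 | w1) = count(w1, w2) / count(w1)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

tokens = "i need tea i need rest my knee hurts".split()
P = bigram_model(tokens)
print(P[("i", "need")])  # 1.0 -- 'need' always follows 'i' in this toy corpus
```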