Published by Kelley Parker. Modified over 9 years ago.
Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc.
Kenneth Church
Kenneth.Church@jhu.edu
Dec 9, 2009
Applications
Recognition: Shannon’s Noisy Channel Model
– Speech, Optical Character Recognition (OCR), Spelling
Transduction
– Part of Speech (POS) Tagging
– Machine Translation (MT)
Parsing: ???
Ranking
– Information Retrieval (IR)
– Lexicography
Discrimination:
– Sentiment, Text Classification, Author Identification, Word Sense Disambiguation (WSD)
Segmentation
– Asian Morphology (Word Breaking), Text Tiling
Alignment: Bilingual Corpora, Dotplots
Compression
Language Modeling: good for everything
Speech, Language
Shannon’s Noisy Channel Model
I → Noisy Channel → O
I΄ ≈ ARGMAX_I Pr(I|O) = ARGMAX_I Pr(I) Pr(O|I)
– Pr(I): Language Model (application independent)
– Pr(O|I): Channel Model
Trigram Language Model
Word        Rank   More likely alternatives
We          9      The This One Two A Three Please In
need        7      are will the would also do
to          1
resolve     85     have know do…
all         9      The This One Two A Three Please In
of          2      The This One Two A Three Please In
the         1
important   657    document question first…
issues      14     thing point to
Channel Model
Application                           Input        Output
Speech Recognition                    writer       rider
OCR (Optical Character Recognition)   all          a1l
Spelling Correction                   government   goverment
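The decoding rule on this slide, ARGMAX over I of Pr(I) Pr(O|I), can be sketched in a few lines of Python. This is only a minimal illustration: the probability tables, the candidate list, and the function name `decode` are hypothetical and invented for the example; they are not taken from the slides.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
LANGUAGE_MODEL = {          # Pr(I): prior probability of the intended word
    "government": 1e-4,
    "governments": 2e-5,
    "governor": 3e-5,
}
CHANNEL_MODEL = {           # Pr(O | I): intended word I is observed as O
    ("goverment", "government"): 1e-3,
    ("goverment", "governments"): 1e-5,
    ("goverment", "governor"): 1e-6,
}

def decode(observed, candidates):
    """Return the candidate I maximizing Pr(I) * Pr(O|I), scored in log space."""
    def score(intended):
        prior = LANGUAGE_MODEL.get(intended, 0.0)
        channel = CHANNEL_MODEL.get((observed, intended), 0.0)
        if prior == 0.0 or channel == 0.0:
            return float("-inf")  # impossible under this toy model
        return math.log(prior) + math.log(channel)
    return max(candidates, key=score)

best = decode("goverment", ["government", "governments", "governor"])
print(best)
```

Only the channel table would change between speech, OCR, and spelling; the language model is the application-independent part.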
Speech, Language
Using (Abusing) Shannon’s Noisy Channel Model: Part of Speech Tagging and Machine Translation
Speech
– Words → Noisy Channel → Acoustics
OCR
– Words → Noisy Channel → Optics
Spelling Correction
– Words → Noisy Channel → Typos
Part of Speech Tagging (POS):
– POS → Noisy Channel → Words
Machine Translation: “Made in America”
– English → Noisy Channel → French
Didn’t have the guts to use this slide at Eurospeech (Geneva)
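Read as a noisy channel, part of speech tagging picks the tag sequence T that maximizes Pr(T) Pr(W|T) for the observed words W. The sketch below assumes a bigram tag model for Pr(T), per-word emissions for Pr(W|T), and Viterbi search; the slide specifies none of these, and every probability and name here is hypothetical.

```python
import math

# Hypothetical toy parameters, invented for illustration only.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}      # Pr(t_1)
TRANS = {                                            # Pr(t_i | t_{i-1})
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {                                             # Pr(word | tag)
    "DET":  {"the": 0.9},
    "NOUN": {"dog": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.5, "dog": 0.05},
}

def safe_log(p):
    """Log probability, with zero mapped to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(words):
    """Return the tag sequence maximizing Pr(tags) * Pr(words | tags)."""
    # Each column maps tag -> (best log score ending in that tag, backpointer).
    columns = [{
        t: (safe_log(START[t]) + safe_log(EMIT[t].get(words[0], 0.0)), None)
        for t in TAGS
    }]
    for word in words[1:]:
        column = {}
        for t in TAGS:
            # Best previous tag for reaching t at this position.
            prev, score = max(
                ((p, columns[-1][p][0] + safe_log(TRANS[p][t])) for p in TAGS),
                key=lambda pair: pair[1],
            )
            column[t] = (score + safe_log(EMIT[t].get(word, 0.0)), prev)
        columns.append(column)
    # Follow backpointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))

tags = viterbi(["the", "dog", "barks"])
print(tags)
```

The tag model plays the role of the language model, and the emission table plays the role of the channel.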
Spelling Correction
Evaluation
Performance
The Task is Hard without Context
Easier with Context
actuall, actual, actually
– … in determining whether the defendant actually will die.
constuming, consuming, costuming
conviced, convicted, convinced
confusin, confusing, confusion
workern, worker, workers
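One way to see why context helps is to score each candidate correction by how well it fits between its neighboring words. The sketch below uses hypothetical bigram counts and a simple product of smoothed left and right counts. This scoring rule and the smoothing constant are assumptions for illustration, not the context model from the slides.

```python
# Hypothetical bigram counts, invented for illustration only.
BIGRAM_COUNTS = {
    ("defendant", "actually"): 3,
    ("actually", "will"): 12,
    ("defendant", "actual"): 0,
    ("actual", "will"): 1,
}

def context_score(left, candidate, right, smoothing=0.5):
    """Score a candidate by its smoothed bigram counts with both neighbors."""
    left_count = BIGRAM_COUNTS.get((left, candidate), 0) + smoothing
    right_count = BIGRAM_COUNTS.get((candidate, right), 0) + smoothing
    return left_count * right_count

def choose(left, candidates, right):
    """Pick the candidate that best fits between `left` and `right`."""
    return max(candidates, key=lambda c: context_score(left, c, right))

# Typo "actuall" in "... the defendant actuall will die."
choice = choose("defendant", ["actual", "actually"], "will")
print(choice)
```

In a fuller system this context score would be combined with the prior and channel probabilities rather than used alone.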
Context Model
Future Improvements
Add More Factors
– Trigrams
– Thesaurus Relations
– Morphology
– Syntactic Agreement
– Parts of Speech
Improve Combination Rules
– Shrink (Meaty Methodology)
Conclusion (Spelling Correction)
There has been a lot of interest in smoothing
– Good-Turing estimation
– Kneser-Ney
Is it worth the trouble?
Ans: Yes (at least for recognition applications)
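For readers unfamiliar with Good-Turing estimation, the basic idea is to replace each observed count r with r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct items seen exactly r times, and to reserve N_1 / N of the probability mass for unseen items. The sketch below is a minimal, unsmoothed version on hypothetical counts. Falling back to r when N_{r+1} is zero is a simplification made for this example; practical variants smooth the N_r values instead.

```python
from collections import Counter

def good_turing_adjusted_counts(counts):
    """Map each observed count r to r* = (r + 1) * N_{r+1} / N_r."""
    freq_of_freq = Counter(counts.values())  # N_r: number of types seen r times
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        # Simplification: keep r unchanged when N_{r+1} is zero.
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    return adjusted

def unseen_mass(counts):
    """Probability mass reserved for unseen events: N_1 / N."""
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total

# Hypothetical word counts, invented for illustration only.
toy_counts = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
adjusted = good_turing_adjusted_counts(toy_counts)
print(adjusted, unseen_mass(toy_counts))
```

Here three singletons out of ten tokens leave 0.3 of the mass for unseen words, and counts of 1 are discounted to 4/3 times N_2 over N_1 scaling, that is (1 + 1) * 2 / 3.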
Aligning Words