Presentation is loading. Please wait.

Presentation is loading. Please wait.

Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc. Kenneth Church Dec 9, 20091.

Similar presentations


Presentation on theme: "Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc. Kenneth Church Dec 9, 20091."— Presentation transcript:

1 Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc. Kenneth Church Kenneth.Church@jhu.edu Dec 9, 20091

2 Solitaire  Multiplayer Games: Auctions (Ads) http://www.scienceoftheweb.org/15-396/lectures/lecture09.pdf http://www.scienceoftheweb.org/15-396/lectures/lecture09.pdf Dec 9, 20092 Mainline Ad Right Rail Right Rail: Avoid distortions from commercial interests

3 A Single Auction  A Stream of Continuous Auctions Standard Example of Second Price Auction – Single Auction for a Single Apple Theoretical Result – Second Price Auction  Truth Telling – http://en.wikipedia.org/wiki/Vickrey_auction http://en.wikipedia.org/wiki/Vickrey_auction – Optimal Strategy: Bid what the apple is worth to you Don’t worry about what it is worth to others – First Price Auction  Truth Telling Does theory generalize to a continuous stream? Dec 9, 20093

4 Pricing: Cost Per Click (CPC) B i = your bid B i+1 = next bid CTR i = your click through rate CTR i+1 = next click through rate CPC i = your price – (if we show your ad and user clicks) Improvement: CTR  Q (Prior) Single Auction: – CPC i = B i+1 Continuous Stream: – CPC i = B i+1 CTR i+1 / CTR i Equilibrium – Advertisers Awareness Sales New Customers ROI – Users Minimize pain Obtain Value – Market Maker Maximize Revenue Truth Telling? Dec 9, 20094

5 Multi-Player Games  Many Technical Opportunities Economics – http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=all http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=all Machine Learning – Learning to Rank – Estimate CTR (Q/Priors) – Sparse Data: What is the CTR for a new ad? – Errors can be expensive If CTR is too low for new ad  Penalize Growth If too high  Reward Bad Guys to do Bad Things Truth Telling for Continuous Auctions? – Probably not, especially if participants can estimate Q better than market maker Machine Learning: Solitaire  Multi-Player Games – Can I estimate Q better than you can? Man-eating tiger Dec 9, 20095

6 Applications Recognition: Shannon’s Noisy Channel Model – Speech, Optical Character Recognition (OCR), Spelling Transduction – Part of Speech (POS) Tagging – Machine Translation (MT) Parsing: ??? Ranking – Information Retrieval (IR) – Lexicography Discrimination: – Sentiment, Text Classification, Author Identification, Word Sense Disambiguation (WSD) Segmentation – Asian Morphology (Word Breaking), Text Tiling Alignment: Bilingual Corpora, Dotplots Compression Language Modeling: good for everything Dec 9, 20096

7 7 Speech  Language Shannon’s: Noisy Channel Model I  Noisy Channel  O I΄ ≈ ARGMAX I Pr(I|O) = ARGMAX I Pr(I) Pr(O|I) Trigram Language Model WordRankMore likely alternatives We9 The This One Two A Three Please In need7are will the would also do to1 resolve85have know do… all9 The This One Two A Three Please In of2 The This One Two A Three Please In the1 important657document question first… issues14thing point to Channel Model ApplicationInputOutput Speech Recognitionwriterrider OCR (Optical Character Recognition) allalla1la1l Spelling Correctiongovernmentgoverment Channel Model Language Model Application Independent Dec 9, 2009

8 8 Speech  Language Using (Abusing) Shannon’s Noisy Channel Model: Part of Speech Tagging and Machine Translation Speech – Words  Noisy Channel  Acoustics OCR – Words  Noisy Channel  Optics Spelling Correction – Words  Noisy Channel  Typos Part of Speech Tagging (POS): – POS  Noisy Channel  Words Machine Translation: “Made in America” – English  Noisy Channel  French Didn’t have the guts to use this slide at Eurospeech (Geneva) Dec 9, 2009

9 9

10 Spelling Correction Dec 9, 200910

11 Dec 9, 200911

12 Dec 9, 200912

13 Dec 9, 200913

14 Dec 9, 200914

15 Evaluation Dec 9, 200915

16 Performance Dec 9, 200916

17 The Task is Hard without Context Dec 9, 200917

18 Easier with Context actuall, actual, actually – … in determining whether the defendant actually will die. constuming, consuming, costuming conviced, convicted, convinced confusin, confusing, confusion workern, worker, workers Dec 9, 200918

19 Dec 9, 200919 Easier with Context

20 Context Model Dec 9, 200920

21 Dec 9, 200921

22 Dec 9, 200922

23 Dec 9, 200923

24 Dec 9, 200924

25 Future Improvements Add More Factors – Trigrams – Thesaurus Relations – Morphology – Syntactic Agreement – Parts of Speech Improve Combination Rules – Shrink (Meaty Methodology) Dec 9, 200925

26 Dec 9, 200926

27 Conclusion (Spelling Correction) There has been a lot of interest in smoothing – Good-Turing estimation – Knesser-Ney Is it worth the trouble? Ans: Yes (at least for recognition applications) Dec 9, 200927

28 Dec 9, 200928

29 Dec 9, 200929

30 Dec 9, 200930

31 Dec 9, 200931

32 Dec 9, 200932

33 Dec 9, 200933

34 Dec 9, 200934

35 Dec 9, 200935

36 Dec 9, 200936

37 Dec 9, 200937

38 Dec 9, 200938

39 Dec 9, 200939

40 Dec 9, 200940

41 Dec 9, 200941

42 Dec 9, 200942

43 Dec 9, 200943

44 Dec 9, 200944

45 Dec 9, 200945

46 Dec 9, 200946

47 Dec 9, 200947

48 Dec 9, 200948

49 Dec 9, 200949

50 Dec 9, 200950

51 Aligning Words Dec 9, 200951

52 Dec 9, 200952

53 Dec 9, 200953

54 Dec 9, 200954

55 Dec 9, 200955

56 Dec 9, 200956

57 Dec 9, 200957

58 Dec 9, 200958

59 Dec 9, 200959

60 Dec 9, 200960

61 Dec 9, 200961

62 Dec 9, 200962

63 Dec 9, 200963

64 Dec 9, 200964

65 Dec 9, 200965

66 Dec 9, 200966

67 Dec 9, 200967

68 Dec 9, 200968


Download ppt "Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc. Kenneth Church Dec 9, 20091."

Similar presentations


Ads by Google