
1 CMSC 723 / LING 645: Intro to Computational Linguistics September 22, 2004: Dorr Supplement: PC-Kimmo Tutorial Prof. Bonnie J. Dorr Dr. Christof Monz TA: Adam Lee

2 Building Automata in Kimmo: English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, z, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
[State diagram. Arc labels: +:0, =; s; +:e; S; S]
Note 1: S = S:S, s = s:s, etc.
Note 2: No way to get out of the last state except on `s'

3 Building Automata in Kimmo: English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, z, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
[State diagram. Arc labels: +:0, =; s; +:e; S; S; +:e; +:0; =, S]
Note 1: S = S:S, s = s:s, etc.
Note 2: No way to get out of the last state except on `s'
Note 3: No way to get out of the intermediate state on `s'
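The spelling rule on this slide can be sketched outside Kimmo as a plain string function. This is only an illustration of what the automaton has to achieve, not how PC-Kimmo implements it; the name `attach_s` and the tuple `SIBILANT_ENDINGS` are hypothetical, and the endings are taken from the set X above (with z, as in S = {x, s, z}).

```python
# Endings that trigger e-insertion before the suffix s (the set X on the slide).
SIBILANT_ENDINGS = ("ch", "sh", "s", "z", "x")


def attach_s(stem: str) -> str:
    """Return stem+s, spelled 'es' after X and 's' otherwise (X+s ==> Xes; else Y+s ==> Ys)."""
    if stem.endswith(SIBILANT_ENDINGS):
        return stem + "es"
    return stem + "s"


print(attach_s("fox"))     # foxes
print(attach_s("cat"))     # cats
print(attach_s("church"))  # churches
```

The two-level automaton expresses the same condition as a constraint on lexical:surface pairs (+:e versus +:0) rather than as a rewrite of the string.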

4 Building Automata in Kimmo: English Epenthesis
[State diagram. Arc labels: +:0, =; s; +:e; S; S; +:0; =, S]
This takes care of x, s, z, but what about ch, sh?

5 Building Automata in Kimmo: English Epenthesis
This takes care of x, s, z, but what about ch, sh, ss?
Add two new states. (And now we need to add numbers!)
[State diagram. Arc labels: +:0, =; +:e; S; S; +:0; s; c; s, h, S; +:e; s; s; =, S; c]

6 Building Automata in Kimmo: English Epenthesis
Add numbers to states!
[State diagram with states numbered 1-6. Arc labels: +:0, =; +:e; S; S; +:0; s; c; s, h, S; +:e; s; =, S; s; c]
Problem: Now that we have introduced c, s, h, we need to worry about these in other states!

7 Building Automata in Kimmo: English Epenthesis
Add numbers to states!
[State diagram with states numbered 1-6. Arc labels: +:0, =, h; +:e; S; S; +:0; s; c; s, h, S; +:e; s; =, S, c, h; s; c; c; c; h; s]
Problem: Now that we have introduced c, s, h, we need to worry about these in other states!

8 Building Automata in Kimmo: English Epenthesis
Add numbers to states!
[State diagram with states numbered 1-6. Arc labels: +:0, =, h; +:e; S; S; +:0; s; c; s, h, S; +:e; s; =, S, c, h; s; c; c; c; h; s; =; =; =]
Problem: In fact, we have to worry about all feasible pairs in every state.
Note: If a feasible pair is missing between two states, that pair is assumed to be impossible there, e.g., we cannot go from 6 to 1 on s:s

9 Building Automata in Kimmo: English Epenthesis
[State diagram with states numbered 1-6. Arc labels: +:0, =, h; +:e; S; S; +:0; s; c; s, h, S; +:e; s; =, S, c, h; s; c; c; c; h; s; =; =; =]
Unfortunately, life can get even more complicated (because of the restriction below): there can be interaction with other automata. For example, Epenthesis interacts with Y-replacement, so we need a feasible pair for y:i and +:0 in Epenthesis!
Note: If a feasible pair is missing between two states, that pair is assumed to be impossible there, e.g., we cannot go from 6 to 1 on s:s

10 Building Automata in Kimmo: English Epenthesis
Here is the actual automaton matrix used in simple-english.aut
[State diagram with states numbered 1-6, as on the previous slide]
RULE "Epenthesis" 6 9
      c  h  s  S  y  +  +  =  _
      c  h  s  S  i  e  0  =  _
  1:  2  1  4  3  3  0  1  1  1
  2:  2  3  3  3  3  0  1  1  1
  3:  2  1  3  3  3  5  6  1  1
  4:  2  3  3  3  3  5  6  1  1
  5.  0  0  1  0  0  0  0  0  0
  6:  1  1  0  1  1  5  6  1  1
Hint: This is the sort of thing you'll need for German find+t
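To see how the matrix is read, here is a minimal Python sketch that walks the table over a sequence of lexical:surface pairs. The table rows, column headers, and S = {x, s, z} are copied from the slides; everything else is an assumption made for illustration: the helper names (`column_for`, `accepts`, `align_pairs`), the column-matching order (exact pair, then subset S, then the `=` wildcard), and the treatment of the `_` column as a literal pair, since the slide does not define it. A real PC-Kimmo run applies all rules in parallel; this checks the Epenthesis rule alone.

```python
# Subset S from the slides.
SUBSET_S = {"x", "s", "z"}

# Column headers as (lexical, surface) pairs, in the order shown in the table.
COLUMNS = [("c", "c"), ("h", "h"), ("s", "s"), ("S", "S"), ("y", "i"),
           ("+", "e"), ("+", "0"), ("=", "="), ("_", "_")]

# Rows of the "Epenthesis" matrix; 0 means the pair is not allowed in that state.
TABLE = {
    1: [2, 1, 4, 3, 3, 0, 1, 1, 1],
    2: [2, 3, 3, 3, 3, 0, 1, 1, 1],
    3: [2, 1, 3, 3, 3, 5, 6, 1, 1],
    4: [2, 3, 3, 3, 3, 5, 6, 1, 1],
    5: [0, 0, 1, 0, 0, 0, 0, 0, 0],
    6: [1, 1, 0, 1, 1, 5, 6, 1, 1],
}
# Rows written "n:" are final; row "5." is non-final.
FINAL_STATES = {1, 2, 3, 4, 6}


def column_for(lex, surf):
    """Assumed matching order: exact pair, then subset S, then the '=' wildcard."""
    if (lex, surf) in COLUMNS:
        return COLUMNS.index((lex, surf))
    if lex == surf and lex in SUBSET_S:
        return COLUMNS.index(("S", "S"))
    return COLUMNS.index(("=", "="))


def accepts(pair_sequence):
    """Run the automaton from state 1; fail on a 0 entry or a non-final end state."""
    state = 1
    for lex, surf in pair_sequence:
        state = TABLE[state][column_for(lex, surf)]
        if state == 0:
            return False
    return state in FINAL_STATES


def align_pairs(lexical, surface):
    """Pair up two equal-length strings; '0' in the surface string is the null symbol."""
    return list(zip(lexical, surface))


print(accepts(align_pairs("fox+s", "foxes")))  # True
print(accepts(align_pairs("fox+s", "fox0s")))  # False: s:s is blocked in state 6
print(accepts(align_pairs("cat+s", "cat0s")))  # True
print(accepts(align_pairs("try+s", "tries")))  # True: y:i behaves like S here
```

Tracing `fox0s` by hand gives 1 → 1 → 1 → 3 (on x) → 6 (on +:0) → 0 (on s), which is the "no way out on s" restriction from the earlier notes.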

11 What about long-distance dependencies?
• How do we handle Buch and Bücher?
• Trick: Use maybe-umlaut, coupled with a special marker in the suffix
• Root form in lexicon: b|ch
• Continuation class (or alternation) is called /NOUN-NEUT-ADD-ER
• In lexicon for NOUN-NEUT-ADD-ER, put ending +&er, where & indicates that the root may have a character that should be umlauted.
• In the Maybe-Umlaut automaton, search for sequence “|.. +&e” and change to “*.. +&e”
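The marker trick can be illustrated with a small string-level sketch. It is not the Maybe-Umlaut automaton itself, and it rests on assumptions the slide does not spell out: that `|` surfaces as ü when a `+&` suffix follows and as u otherwise, and that `+` and `&` have no surface realization. The function name `realize` is hypothetical.

```python
def realize(lexical: str) -> str:
    """Turn a lexical form such as 'b|ch+&er' into a surface string (assumed conventions)."""
    head, marker, tail = lexical.partition("|")
    if marker:
        # The umlaut decision depends on material further to the right:
        # this is the long-distance dependency the & marker encodes.
        vowel = "ü" if "+&" in tail else "u"
        lexical = head + vowel + tail
    # Assumption: morpheme boundary and umlaut marker are deleted on the surface.
    return lexical.replace("+", "").replace("&", "")


print(realize("b|ch"))      # buch
print(realize("b|ch+&er"))  # bücher
```

In the two-level setting the same decision is made by an automaton that remembers having seen `|` until it reaches either `+&` or the end of the word.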

12 Building a Lexicon in Kimmo: Simple English Example
ALTERNATION /Root Root
ALTERNATION /N N
ALTERNATION /End End
LEXICON INITIAL
0 /Root "("
LEXICON N
0 /C1 "(cat n) (person p3) (number sg)"
+s /C2 "(cat n) (person p3) (number pl)"
+y /A "(cat a)"
LEXICON Root
spy /N
spy /V
LEXICON End
0 # ")"
END
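The way sublexicons chain through alternations can be sketched as a small recursive walk. This is a simplified model, not PC-Kimmo's lexicon reader: the dictionary `LEXICONS` and the function `lexical_forms` are hypothetical, glosses are omitted, and because the slide does not define /C1, /C2, /A, or /V, the sketch routes the N entries straight to End and leaves out the spy /V entry.

```python
# Each sublexicon maps to a list of (lexical form, next sublexicon) entries.
# "" stands for the null entry written 0 on the slide; None plays the role of #.
LEXICONS = {
    "INITIAL": [("", "Root")],
    "Root": [("spy", "N")],
    "N": [("", "End"), ("+s", "End"), ("+y", "End")],  # assumption: /C1, /C2, /A lead to End
    "End": [("", None)],
}


def lexical_forms(name="INITIAL", prefix=""):
    """Enumerate every lexical string reachable from the given sublexicon."""
    forms = []
    for form, next_name in LEXICONS[name]:
        if next_name is None:
            forms.append(prefix + form)
        else:
            forms.extend(lexical_forms(next_name, prefix + form))
    return forms


print(lexical_forms())  # ['spy', 'spy+s', 'spy+y']
```

Each string produced this way is a lexical form that the rule automata then map to a surface spelling.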

13 Demo of PC-Kimmo
cd kimmo
pckimmo
load rules simple-english.aut
load lexicon simple-english.dic
generate try+s
recognize tries
set tracing on
recognize tries

