1
CMSC 723 / LING 645: Intro to Computational Linguistics
September 22, 2004: Dorr
Supplement: PC-Kimmo Tutorial
Prof. Bonnie J. Dorr
Dr. Christof Monz
TA: Adam Lee
2
Building Automata in Kimmo: English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
[Diagram: partial automaton with arcs labeled +:0, =; s; +:e; S; S]
Note 1: S = S:S, s = s:s, etc.
Note 2: No way to get out of the last state except on 's'
3
Building Automata in Kimmo: English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
[Diagram: automaton with arcs labeled +:0, =; s; +:e; S; S; +:e; +:0; =, S]
Note 1: S = S:S, s = s:s, etc.
Note 2: No way to get out of the last state except on 's'
Note 3: No way to get out of the intermediate state on 's'
4
Building Automata in Kimmo: English Epenthesis
[Diagram: automaton with arcs labeled +:0, =; s; +:e; S; S; +:0; =, S]
This takes care of x, s, z, but what about ch, sh?
5
Building Automata in Kimmo: English Epenthesis
This takes care of x, s, z, but what about ch, sh, ss?
Add two new states. (And now we need to add numbers!)
[Diagram: automaton with two added states; arcs labeled +:0, =; +:e; S; S; +:0; s; c; s,h,S; +:e; s; s; =, S; c]
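The rule these slides build up can be sketched at the string level before worrying about states. This is only an illustration of X+s ==> Xes; else Y+s ==> Ys, not how PC-Kimmo works internally: the function name `epenthesize` and the ending list are assumptions made for this sketch, combining S = {x, s, z} with ch and sh.

```python
# Endings that trigger e-insertion: S = {x, s, z} from the slides, plus ch and sh.
# (The tuple name and function name are hypothetical, chosen for this sketch.)
SIBILANT_ENDINGS = ("ch", "sh", "s", "x", "z")


def epenthesize(lexical: str) -> str:
    """Map a lexical form such as 'fox+s' to a surface form such as 'foxes'."""
    stem, boundary, suffix = lexical.partition("+")
    if not boundary:
        # No morpheme boundary, so nothing to do.
        return lexical
    if suffix == "s" and stem.endswith(SIBILANT_ENDINGS):
        # X+s ==> Xes
        return stem + "es"
    # Y+s ==> Ys (the boundary surfaces as nothing)
    return stem + suffix


print(epenthesize("fox+s"))
print(epenthesize("church+s"))
print(epenthesize("cat+s"))
```

The sketch looks at the whole stem at once. The automaton has to reach the same decision one character pair at a time, which is why it needs the extra states for c and s.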
6
Building Automata in Kimmo: English Epenthesis
Add numbers to states!
Problem: Now that we have introduced c, s, h, we need to worry about these in other states!
[Diagram: automaton with states numbered 1-6; arcs labeled +:0, =; +:e; S; S; +:0; s; c; s,h,S; +:e; s; =, S; s; c]
7
Building Automata in Kimmo: English Epenthesis
Add numbers to states!
Problem: Now that we have introduced c, s, h, we need to worry about these in other states!
[Diagram: automaton with states numbered 1-6; arcs now also carry c, s, and h, e.g. "+:0, =, h" and "=, S, c, h"]
8
Building Automata in Kimmo: English Epenthesis
Add numbers to states!
Problem: In fact, we have to worry about all feasible pairs in every state.
[Diagram: automaton with states numbered 1-6; arcs labeled with c, h, s, S, +:e, +:0, and = as needed in every state]
Note: If a feasible pair is missing between two states, that pair is assumed to be impossible there; e.g., we cannot go from 6 to 1 on s:s.
9
Building Automata in Kimmo: English Epenthesis
Unfortunately, life can get even more complicated (because of the restriction below): there can be interaction with other automata.
For example, Epenthesis interacts with Y-replacement, so we need a feasible pair for y:i and +:0 in Epenthesis!
[Diagram: automaton with states numbered 1-6; arcs labeled with c, h, s, S, +:e, +:0, and =]
Note: If a feasible pair is missing between two states, that pair is assumed to be impossible there; e.g., we cannot go from 6 to 1 on s:s.
10
Building Automata in Kimmo: English Epenthesis
Here is the actual automaton matrix used in simple-english.aut
[Diagram: automaton with states numbered 1-6, as on the previous slide]

    RULE "Epenthesis" 6 9
         c  h  s  S  y  +  +  =  _
         c  h  s  S  i  e  0  =  _
     1:  2  1  4  3  3  0  1  1  1
     2:  2  3  3  3  3  0  1  1  1
     3:  2  1  3  3  3  5  6  1  1
     4:  2  3  3  3  3  5  6  1  1
     5.  0  0  1  0  0  0  0  0  0
     6:  1  1  0  1  1  5  6  1  1

Hint: This is the sort of thing you'll need for German find+t
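To see how the matrix is read, here is a minimal sketch that steps through it on aligned lexical:surface pairs. It is not PC-Kimmo's implementation. It rests on several assumptions: "=" is treated as the wildcard column for any pair not listed, S is taken to be {x, s, z} as on the earlier slides, a row marked with ":" is a final state and a row marked with "." is non-final, and 0 means failure. The transcript does not say what the last column ("_") stands for, so it is matched only by a literal underscore. The names `column_for`, `align`, and `accepts` are hypothetical.

```python
# Column headers as lexical:surface pairs, in the order given in the matrix.
COLUMNS = ["c:c", "h:h", "s:s", "S:S", "y:i", "+:e", "+:0", "=:=", "_:_"]

# Transition rows copied from the matrix; 0 means "no transition" (failure).
TABLE = {
    1: [2, 1, 4, 3, 3, 0, 1, 1, 1],
    2: [2, 3, 3, 3, 3, 0, 1, 1, 1],
    3: [2, 1, 3, 3, 3, 5, 6, 1, 1],
    4: [2, 3, 3, 3, 3, 5, 6, 1, 1],
    5: [0, 0, 1, 0, 0, 0, 0, 0, 0],
    6: [1, 1, 0, 1, 1, 5, 6, 1, 1],
}
FINAL_STATES = {1, 2, 3, 4, 6}  # state 5 is marked "5." (non-final)
SUBSET_S = {"x", "s", "z"}      # assumption: S = {x, s, z}


def column_for(lex: str, surf: str) -> int:
    """Pick the most specific column: exact pair, then subset S, then '='."""
    pair = f"{lex}:{surf}"
    if pair in COLUMNS:
        return COLUMNS.index(pair)
    if lex == surf and lex in SUBSET_S:
        return COLUMNS.index("S:S")
    # Simplification: every other pair falls into the wildcard column.
    return COLUMNS.index("=:=")


def align(lexical: str, surface: str) -> list:
    """Pair up two equal-length strings; '0' on the surface is the null symbol."""
    if len(lexical) != len(surface):
        raise ValueError("lexical and surface strings must have equal length")
    return list(zip(lexical, surface))


def accepts(pairs) -> bool:
    """Run the automaton from state 1 and report whether it ends in a final state."""
    state = 1
    for lex, surf in pairs:
        state = TABLE[state][column_for(lex, surf)]
        if state == 0:
            return False
    return state in FINAL_STATES


print(accepts(align("fox+s", "foxes")))  # +:e after x is allowed
print(accepts(align("fox+s", "fox0s")))  # +:0 then s after x is blocked
print(accepts(align("cat+s", "cat0s")))  # +:0 is fine after a non-sibilant
```

Under these assumptions the pair try+s / tries is also accepted, because y:i has its own column. Deciding when y may surface as i is left to the separate Y-replacement automaton.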
11
What about long-distance dependencies?
How do we handle Buch and Bücher?
Trick: Use maybe-umlaut, coupled with a special marker in the suffix
Root form in lexicon: b|ch
Continuation class (or alternation) is called /NOUN-NEUT-ADD-ER
In the lexicon for NOUN-NEUT-ADD-ER, put the ending +&er, where & indicates that the root may have a character that should be umlauted.
In the Maybe-Umlaut automaton, search for the sequence “|.. +&e” and change it to “*.. +&e”
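The search-and-change step can be sketched with a regular expression. This shows the idea only and is not the Maybe-Umlaut automaton itself. The sketch assumes that ".." in the pattern stands for whatever characters lie between the | placeholder and the +&e marker, and it keeps the slide's * as the replacement symbol without interpreting it. The function name `mark_umlaut` is hypothetical.

```python
import re

# Match "|" only when "+&e" follows somewhere later in the string.
# Assumption: the ".." on the slide means "any intervening characters".
MAYBE_UMLAUT = re.compile(r"\|(?=.*\+&e)")


def mark_umlaut(lexical: str) -> str:
    """Rewrite the '|' placeholder as '*' when the suffix carries the '&' marker."""
    return MAYBE_UMLAUT.sub("*", lexical)


print(mark_umlaut("b|ch+&er"))  # marker present, placeholder rewritten
print(mark_umlaut("b|ch"))      # no marker, placeholder left alone
```

The lookahead lets a marker in the suffix affect a character several positions earlier in the root. That is the long-distance dependency the slide is pointing at.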
12
Building a Lexicon in Kimmo: Simple English Example

    ALTERNATION /Root Root
    ALTERNATION /N N
    ALTERNATION /End End

    LEXICON INITIAL
    0 /Root "("

    LEXICON N
    0 /C1 "(cat n) (person p3) (number sg)"
    +s /C2 "(cat n) (person p3) (number pl)"
    +y /A "(cat a)"

    LEXICON Root
    spy /N
    spy /V

    LEXICON End
    0 # ")"

    END
13
Demo of PC-Kimmo

    cd kimmo
    pckimmo
    load rules simple-english.aut
    load lexicon simple-english.dic
    generate try+s
    recognize tries
    set tracing on
    recognize tries