Download presentation
Presentation is loading. Please wait.
Published byJemimah Owens Modified over 9 years ago
1
Human Language Technology Finite State Transducers
2
October 2009HLT: finite state transducers2 Acknowledgement Material in this lecture derived/copied in part from –Richard Sproat CL46 Lectures –Lauri Karttunen LSA lectures 2005 –Shuly Wintner 2008 Malta
3
October 2009HLT: finite state transducers3 Three Key Concepts Finite State Transducers Regular Relations Computational Morphology
4
October 2009HLT: finite state transducers4 Three Key Concepts Finite State Transducers Regular Relations Computational Morphology
5
October 2009HLT: finite state transducers5 A Regular Set ab abab ababab abababab ababababab. L1
6
October 2009HLT: finite state transducers6 Two Regular Sets ab abab ababab abababab ababababab. ba baba bababa babababa bababababa. L1 L2
7
October 2009HLT: finite state transducers7 A Regular Relation L1 x L2 ab abab ababab abababab ababababab. ba baba bababa babababa bababababa. L1 L2 or {("ab","ba"), ("abab","baba"),...}
8
October 2009HLT: finite state transducers8 Some closure properties for regular relations Concatenation [R1 R2] Power (R n ) Reversal Inversion (R -1 ) Composition: R 1 ○ R 2
9
October 2009HLT: finite state transducers9 Concatenation and Power Concatenation R1 = {("a","b")} R2 = {("c","d")} [R1 R2] = {("ac","bd")} Power R1+ = {("a","b"),("aa","bb"), ("aaa","bbb"),...}
10
October 2009HLT: finite state transducers10 61 Composition R 1 ○ R 2 denotes the composition of relations R 1 and R 2. Definition If R 1 contains And R 2 contains Then R 1 ○ R 2 contains R 1 and R 2 and B must be relations. If either is just a language, it is assumed to abbreviate the identity relation. R 1 ○ R 2 is written [R 1.o. R 2 ] in xfst
11
October 2009HLT: finite state transducers11 Closure Properties of Regular Languages and Relations Operation Regular Languages Regular Relations Union yes yes Concatenation yes yes Iteration yes yes Intersection yes no Subtraction yes no Complementation yes no Composition n/a yes
12
October 2009HLT: finite state transducers12 Morphology as a Regular Relation cat cats mice lives. cat cat+N+PL mouse+N+PL life+N+PL live+V+3SING. surface language lexical language or {("cat,cat"),("cats","cat+N+PL"),......}
13
October 2009HLT: finite state transducers13 Part-of-Speech Tagging I know some new tricks PRON V DET ADJ N said the Cat in the Hat V DET N P DET N
14
October 2009HLT: finite state transducers14 Singular-to-plural mapping: cat hat ox child mouse sheep cats hats oxen children mice sheep
15
October 2009HLT: finite state transducers15 Three Key Concepts Finite State Transducers Regular Relations Computational Morphology
16
October 2009HLT: finite state transducers16 FSA a Used for Recognition Generation
17
October 2009HLT: finite state transducers17 Finite State Transducers A finite state transducer (FST) is essentially an FSA finite state automaton that works on two (or more) tapes. The most common way to think about transducers is as a kind of translating machine which works by reading from one tape and writing onto the other.
18
October 2009HLT: finite state transducers18 FST Definition A 2 way FST is a quintuple (K,s,F, i x o, ) where i, o are input and output alphabets K is a finite set of states s K is an initial state F K are final states is a transition relation of type K x i x o x K
19
October 2009HLT: finite state transducers19 FST a Used for Recognition Generation Translation b upper tape lower tape
20
October 2009HLT: finite state transducers20 A Very Simple Transducer a b Relation { ("a","b") } Notation a:b encodes the transition
21
October 2009HLT: finite state transducers21 A Very Simple Transducer a b a:ba:b also written as
22
October 2009HLT: finite state transducers22 A Very Simple Transducer a b a:ba:b upper side lower side
23
October 2009HLT: finite state transducers23 Symbol Pairs Symbols vs. symbol pairs –In general, no distinction is made in xfst between a the language {“a”} a:a the identity relation {(“a”, “a”)} a
24
October 2009HLT: finite state transducers24 A (more interesting) Transducer Relation { ("a","b"), ("aa","bb"),...} Notation a:b* N.B. with this notation a and b must be single symbols 1 a:ba:b
25
October 2009HLT: finite state transducers25 Transducer have Several Modes of Operation generation mode: It writes on both tapes. A string of as on one tape and a string of bs on the other tape. Both strings have the same length. recognition mode: It accepts when the word on the first tape consists of exactly as many as as the word on the second tape consists of bs. translation mode (left to right): It reads as from the first tape and writes a b for every a that it reads onto the second tape. translation mode (right to left): It reads bs from the second tape and writes an a for every b that it reads onto the first tape.
26
October 2009HLT: finite state transducers26 The Basic Idea Morphology is regular Morphology is finite state
27
October 2009HLT: finite state transducers27 Morphology is Regular The relation between the surface forms of a language and the corresponding lexical forms can be described as a regular relation, e.g. { ("leaf+N+Pl","leaves"),("hang+V+Past","hung"),...} Regular relations are closed under operations such as concatenation, iteration, union, and composition. Complex regular relations can be derived from simpler relations.
28
October 2009HLT: finite state transducers28 Morphology is finite-state A regular relation can be defined using the metalanguage of regular expressions. [{talk} | {walk} | {work}] [%+Base:0 | %+SgGen3:s | %+Progr:{ing} | %+Past:{ed}]; A regular expression can be compiled into a finite-state transducer that implements the relation computationally.
29
October 2009HLT: finite state transducers29 Compilation [{talk} | {walk} | {work}] [%+Base:0 | %+SgGen3:s | %+Progr:{ing} | %+Past:{ed}]; Regular expression k t a a w o l r +Progr:i :g +3rdSg:s +Past:e :d :n +Base: Finite-state transducer final state initial state
30
October 2009HLT: finite state transducers30 work+3rdSg --> works k:kk:k t:t a:a w:ww:w o:oo:o l:l r:rr:r +Progr:i :g +3rdSg:s +Past:e :d :n +Base: Generation
31
October 2009HLT: finite state transducers31 talked --> talk+Past k:kk:k t:tt:t a:a a:aa:a w:w o:o l:ll:l r:r +Progr:i :g +3rdSg:s +Past:e :d:d :n +Base: Analysis
32
October 2009HLT: finite state transducers32 XFST Demo 2 xfst[0]: regex [{talk} | {walk} | {work}] [% +Base:0 | %+SgGen3:s | %+Progr:{ing} | %+Past:{ed}]; % xfst xfst[0]: start xfst compile a regular expression apply the result xfst[1]: apply up walked walk+Past xfst[1]: apply down talk+SgGen3 talks
33
October 2009HLT: finite state transducers33 Lexical transducer veut vouloir +IndP +SG + P3 Finite-state transducer inflected form citation forminflection codes vouloir+IndP+SG+P3 veut Bidirectional: generation or analysis Compact and fast Comprehensive systems have been built for over 40 languages: –English, German, Dutch, French, Italian, Spanish, Portuguese, Finnish, Russian, Turkish, Japanese, Korean, Basque, Greek, Arabic, Hebrew, Bulgarian, …
34
October 2009HLT: finite state transducers34 How lexical transducers are made Lexicon FST Rule FSTs Compiler fat +Adj r +Comp fat te Lexical Transducer (a single FST) composition Lexicon Regular Expression Rules Regular Expressions Morphotactics Alternations
35
October 2009HLT: finite state transducers35 Sequential Model... Surface form Intermediate form Lexical form fst 1fst 2 fst n Ordered sequence of rewrite rules (Chomsky & Halle ‘68) can be modeled by a cascade of finite-state transducers Johnson ‘72 Kaplan & Kay ‘81
36
October 2009HLT: finite state transducers36 Parallel Model Set of parallel of two-level rules (constraints) compiled into finite-state automata interpreted as transducers Koskenniemi ‘83 fst 1 fst 2 fst n... Surface form Lexical form
37
October 2009HLT: finite state transducers37 Sequential vs. Parallel rules compose intersect FST rule 1 rule 2 rule n... Surface form Lexical form Koskenniemi 1983 Intermediate form... Surface form Lexical form rule 1 rule n rule 1 Chomsky&Halle 1968
38
October 2009HLT: finite state transducers38 Sequential vs. Parallel Rules Sequential rules are combined by means of composition. Advantage: FSTs are closed under composition Disadvantage: order of operations is sensitive Parallel rules are combined by means of intersection In general, FSTs are not closed under intersection. … but FSTs without ε-transitions are closed under intersection.
39
October 2009HLT: finite state transducers39 Crossproduct A.x. BThe relation that maps every string in A to every string in B, and vice versa A:BSame as [A.x. B]. b:y c:0a:x a b c.x. x y[a b c] : [x y]{abc}:{xy}
40
October 2009HLT: finite state transducers40 Composition A.o. BThe relation C such that if A maps x to y and B maps y to z, C maps x to z. b:B c:Ca:A b ca b:B c:C d:D {abc}.o. [a:A | b:B | c:C | d:D]*
41
October 2009HLT: finite state transducers41 Transducers are not closed under intersection ε:b c:b ε:a ε:bc:a T 1 (C n ) = { a n b m | m≥0 } T 2 (C n ) = { a m b n | m≥0 } T 1 ∩T 2 (C n ) = { a n b n }
42
October 2009HLT: finite state transducers42 Xerox RE Operators $ containment => restriction -> replacement –Make it easier to describe complex languages and relations without extending the formal power of finite- state systems.
43
October 2009HLT: finite state transducers43 Containmenta? ? a $a [?* a ?*]
44
October 2009HLT: finite state transducers44 Restriction ? c b b c ? a c a => b _ c “Any a must be preceded by b and followed by c.” ~[~[?* b] a ?*] & ~[?* a ~[c ?*]] Equivalent expression
45
October 2009HLT: finite state transducers45 Replacement a:b b a ? ? b:a a a:b a b -> b a “Replace ‘ab’ by ‘ba’.” [[~$[a b] [[a b].x. [b a]]]* ~$[a b]] Equivalent expression
46
October 2009HLT: finite state transducers46 Replacement + Marking 0:[ [ 0:] ? a e i o u ] a|e|i|o|u -> %[... %] p o t a t o p[o]t[a]t[o]
47
October 2009HLT: finite state transducers47 Conditional Replacement The relation that replaces A by B between L and R leaving everything else unchanged. A -> B Replacement L _ R Context
48
October 2009HLT: finite state transducers48 Sequential application N -> m / _ p p -> m / m _ k a N p a n k a m p a n k a m m a n
49
October 2009HLT: finite state transducers49 Sequential application in detail N:m N ? ? 0 2 1 p m p N m p:m ? ? 0 1 m p m k a N p a n k a m p a n k a m m a n 0 0 0 2 0 0 0 0 0 0 1 0 0 0
50
October 2009HLT: finite state transducers50 Composition N:m N ? ? 0 3 1 m p N ? m 2 p:m N m N:m k a N p a n k a m m a n 0 0 0 3 0 0 0
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.