LING 438/538 Computational Linguistics Sandiway Fong Lecture 14: 10/12
2 Administrivia Reminder –Homework 3 due tonight
3 Last Time
morphology
–words are composed of morphemes
–morpheme: semantic unit, e.g. -ee in employee
–Inflectional: no change in category, e.g. V + -ed → V
–Derivational: category-changing, e.g. V + -able → A
Porter Stemmer
–normalization procedure
–based on (manually determined) ad hoc rules
–“measure” of a stem: the m in C(VC)^m V
–output: “root” (not necessarily a word); words that stem to the same root are considered “variants”
–English orthography: an illustration of the gap that can occur between computation and linguistic theory
4 Walkers. Standees. (sign above a travelator at Pittsburgh International Airport; photo © Sandiway Fong)
5 Today’s Topic Finite State Transducers (FST) for morphological processing –... also Prolog implementation
6 Recall Finite State Automata (FSA)
–from lecture 8
–(Q, s, f, Σ, δ)
1. set of states (Q): {s, x, y}; must be a finite set
2. start state (s): s
3. end state(s) (f): y
4. alphabet (Σ): {a, b}
5. transition function δ: signature: character × state → state
   δ(a,s)=x  δ(a,x)=x  δ(b,x)=y  δ(b,y)=y
[state diagram: s --a--> x, x --a--> x, x --b--> y, y --b--> y]
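The five-part definition above can be written down almost literally as Prolog facts. What follows is a minimal table-driven sketch, not the encoding used in the lecture; the predicate names delta/3, start/1, final/1, accepts/1 and run/3 are assumptions introduced here for illustration.

```prolog
% Transition function as facts: delta(Character, State, NextState).
delta(a, s, x).
delta(a, x, x).
delta(b, x, y).
delta(b, y, y).

start(s).   % start state
final(y).   % end state

% accepts(+String): true if the machine ends in a final state.
accepts(String) :-
    start(S0),
    run(String, S0, S),
    final(S).

% run(+String, +State, -EndState): follow delta one character at a time.
run([], S, S).
run([C|Cs], S0, S) :-
    delta(C, S0, S1),
    run(Cs, S1, S).
```

For example, ?- accepts([a,a,b,b]). succeeds, while ?- accepts([a]). fails because the machine stops in the non-final state x.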
7 Modeling English Adjectives using FSA
–from section 3.2 of textbook
examples
–big, bigger, biggest, *unbig
–cool, cooler, coolest, coolly
–red, redder, reddest, *redly
–clear, clearer, clearest, clearly, unclear, unclearly
–happy, happier, happiest, happily
–unhappy, unhappier, unhappiest, unhappily
–real, *realer, *realest, unreal, really
[figure: FSA (3.4)]
The initial machine is overly simple: more classes are needed to make finer-grained distinctions, e.g. to exclude *unbig
8 Modeling English Adjectives using FSA
divide adjectives into classes
examples
–adj-root₂: big, bigger, biggest, *unbig
–adj-root₂: cool, cooler, coolest, coolly
–adj-root₂: red, redder, reddest, *redly
–adj-root₁: clear, clearer, clearest, clearly, unclear, unclearly
–adj-root₁: happy, happier, happiest, happily
–adj-root₁: unhappy, unhappier, unhappiest, unhappily
–adj-root₁: real, *realer, *realest, unreal, really
[figure: FSA (3.5)]
However... Examples (Google hit counts)
–uncooler, e.g. “Smoking uncool and getting uncooler.”: 22,800 (2006), 10,900 (2005)
–*realer: 3,500,000 (2006), 494,000 (2005)
–*realest: 795,000 (2006), 415,000 (2005)
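The class division can be tried out with a small morpheme-level recognizer. This is a sketch reconstructed from the examples on the slide, not a transcription of the textbook figure: it assumes that only adj-root1 combines with un- and -ly, and that both classes take -er and -est. The predicate names (adjective/1, root1/1, suffix1/1, suffix2/1) are hypothetical.

```prolog
% Class membership, taken from the examples on the slide.
adj_root1(clear). adj_root1(happy). adj_root1(real).
adj_root2(big).   adj_root2(cool).  adj_root2(red).

% adjective(+Morphemes): Morphemes is a list such as [un, clear, ly].
adjective([un|Ms]) :- root1(Ms).                 % un- only before adj-root1
adjective(Ms)      :- root1(Ms).
adjective([R|Ms])  :- adj_root2(R), suffix2(Ms).

% An adj-root1 stem followed by an optional suffix.
root1([R|Ms]) :- adj_root1(R), suffix1(Ms).

suffix1([]). suffix1([er]). suffix1([est]). suffix1([ly]).
suffix2([]). suffix2([er]). suffix2([est]).
```

The sketch accepts [un,clear,ly] and rejects [un,big]. It also accepts [real,er] (starred on the slide) and rejects [cool,ly] (listed as fine on the slide), so it shows concretely where two classes are still too coarse.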
9 Modeling English Adjectives using FSA
–e.g. *unbig: 11,000 Google hits (2006)
–morphology is productive
–morphemes carry (compositional) meaning
–can be used for dramatic effect: unbig vs. small
10 The Mapping Problem
To map between a surface form and the decomposition of a word into its components
–e.g. root + (person/number/gender) and other features, using spelling rules
[figure: example (3.11)]
Notes:
–^ marks a morpheme boundary
–# is the end-of-word marker
11 Stage 1: Lexical ↔ Intermediate Levels
example:
–f o x +N +PL (lexical)
–f o x ^ s # (intermediate)
lexical level:
–uninflected “dictionary” level
intermediate level:
–replace abstract morphemes by concrete ones
key
–+N: noun (fox can also be a verb, but fox +V cannot combine with +PL)
–+PL: (abstract) plural morpheme, realized in English as s (basic case)
–boundary markers ^ and # are for use by the spelling rule machine (later)
12 Stage 1: Lexical ↔ Intermediate Levels
example:
–f o x +N +PL (lexical)
–f o x ^ s # (intermediate)
machine idea
–character-by-character correspondences
–f : f
–o : o
–x : x
–+N : ε (ε = empty string)
–+PL : ^s#
use a Finite State Machine with input/output mapping
–Finite State Transducer (FST)
13 Stage 1: Lexical ↔ Intermediate Levels
Example:
–g o o s e +N +PL (lexical)
–g e e s e # (intermediate)
Example:
–g o o s e +N +SG (lexical)
–g o o s e # (intermediate)
Example:
–m o u s e +N +PL (lexical)
–m i c e # (intermediate)
Example:
–s h e e p +N +PL (lexical)
–s h e e p # (intermediate)
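An irregular noun needs separate singular and plural paths, because whether o maps to o or to e depends on a feature that only appears at the end of the lexical string. The following is a sketch for goose only, using one two-argument predicate per state; the state names (n0, sg1 ... pl6) are invented for illustration, and the '#' atom is quoted for portability across Prolog systems.

```prolog
% Start state: g:g, then commit (by backtracking) to one of two paths.
n0([g|I], [g|O]) :- sg1(I, O).
n0([g|I], [g|O]) :- pl1(I, O).

% Singular path: o:o o:o s:s e:e +N:ε +SG:#
sg1([o|I], [o|O]) :- sg2(I, O).
sg2([o|I], [o|O]) :- sg3(I, O).
sg3([s|I], [s|O]) :- sg4(I, O).
sg4([e|I], [e|O]) :- sg5(I, O).
sg5(['+N'|I], O)  :- sg6(I, O).
sg6(['+SG'], ['#']).

% Plural path: o:e o:e s:s e:e +N:ε +PL:#
pl1([o|I], [e|O]) :- pl2(I, O).
pl2([o|I], [e|O]) :- pl3(I, O).
pl3([s|I], [s|O]) :- pl4(I, O).
pl4([e|I], [e|O]) :- pl5(I, O).
pl5(['+N'|I], O)  :- pl6(I, O).
pl6(['+PL'], ['#']).
```

The query ?- n0([g,o,o,s,e,'+N','+PL'], X). tries the singular path first, fails at '+SG', and then backtracks into the plural path to give the geese form.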
14 Stage 1: Lexical ↔ Intermediate Levels
[figure: transducer (3.11)]
Notation: arcs are labeled input:output; a bare f abbreviates f:f
15 Extension to Finite State Transducers (FST)
[Mealy machine extension to FSA]
–(Q, s, f, Σ, δ)
1. set of states (Q): {s, x, y}; must be a finite set
2. start state (s): s
3. end state(s) (f): y
4. alphabet (Σ): pairs I:O
–I = input alphabet, O = output alphabet
–ε may be included in I and O
5. transition function (or matrix) δ: signature: i/o pair × state → state
   δ(a:b,s)=x  δ(a:b,x)=x  δ(b:a,x)=y  δ(b:ε,y)=y
[state diagram: s --a:b--> x, x --a:b--> x, x --b:a--> y, y --b:ε--> y]
16 Finite State Automata (FSA)
recall: one possible Prolog encoding strategy
–define one predicate for each state, taking one argument (the input string)
  consume input character
  call next state with remaining input string
–query
  ?- s(L).  (call start state s)
17 Finite State Automata (FSA)
–from lecture 9
–define one predicate for each state
  take one argument (the input string), and
  consume input character
  call next state with remaining input string
–query
  ?- s(L).  i.e. call start state s
–state s: (start state)
  s([a|L]) :- x(L).
–state x:
  x([a|L]) :- x(L).
  x([b|L]) :- y(L).
–state y: (end state)
  y([]).
  y([b|L]) :- y(L).
[state diagram: s --a--> x, x --a--> x, x --b--> y, y --b--> y]
simple extension to FST: each predicate takes two arguments: input and output
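Applying the two-argument extension to the three-state transducer with arcs a:b, a:b, b:a and b:ε gives something like the sketch below. It reuses the state names s, x and y from the slides; the layout is otherwise an assumption, not code from the lecture.

```prolog
% state s (start state): a:b
s([a|L1], [b|L2]) :- x(L1, L2).

% state x: a:b loops, b:a moves on to y
x([a|L1], [b|L2]) :- x(L1, L2).
x([b|L1], [a|L2]) :- y(L1, L2).

% state y (end state): b:ε consumes input but emits nothing
y([], []).
y([b|L1], L2) :- y(L1, L2).
```

The query ?- s([a,a,b,b,b], X). gives X = [b,b,a]. Run the other way round, ?- s(X, [b,a]). first gives X = [a,b] and then keeps producing longer inputs on backtracking, since the b:ε arc can be taken any number of times.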
18 Stage 1: Lexical ↔ Intermediate Levels
example
s0([f|L1],[f|L2]) :- s1(L1,L2).
s0([c|L1],[c|L2]) :- s3(L1,L2).
s1([o|L1],[o|L2]) :- s2(L1,L2).
s2([x|L1],[x|L2]) :- s5(L1,L2).
s3([a|L1],[a|L2]) :- s4(L1,L2).
s4([t|L1],[t|L2]) :- s5(L1,L2).
s5(['+N'|L1],L2) :- s6(L1,L2).             % +N:ε
s6(['+PL'|L1],[^,s,#|L2]) :- s7(L1,L2).    % +PL:^s#
s7([],[]).                                 % end state
19 Stage 1: Lexical ↔ Intermediate Levels
FST queries
–lexical → intermediate
  ?- s0([f,o,x,'+N','+PL'],X).
  X = [f, o, x, ^, s, #]
–intermediate → lexical
  ?- s0(X,[f,o,x,^,s,#]).
  X = [f, o, x, '+N', '+PL']
–enumerator
  ?- s0(X,Y).
  X = [f, o, x, '+N', '+PL']
  Y = [f, o, x, ^, s, #] ;
  X = [c, a, t, '+N', '+PL']
  Y = [c, a, t, ^, s, #] ;
  No
inversion of a transducer T: T⁻¹
–switch input and output labels
–in Prolog, simply change the call
20 Stage 1: Lexical ↔ Intermediate Levels
[Figure 3.17 (top half): tape view of input/output pairs]
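Since the same clauses run in either direction, the inverted transducer needs no new states. One way to make the inversion explicit is a wrapper that swaps the arguments, sketched below. The fox/cat machine is repeated so the block stands alone; the name s0_inv/2 is an assumption, and the '^' and '#' atoms are quoted for portability.

```prolog
% Lexical -> intermediate transducer for fox and cat.
s0([f|L1], [f|L2]) :- s1(L1, L2).
s0([c|L1], [c|L2]) :- s3(L1, L2).
s1([o|L1], [o|L2]) :- s2(L1, L2).
s2([x|L1], [x|L2]) :- s5(L1, L2).
s3([a|L1], [a|L2]) :- s4(L1, L2).
s4([t|L1], [t|L2]) :- s5(L1, L2).
s5(['+N'|L1], L2) :- s6(L1, L2).                  % +N:ε
s6(['+PL'|L1], ['^',s,'#'|L2]) :- s7(L1, L2).     % +PL:^s#
s7([], []).                                       % end state

% Inverse transducer: intermediate -> lexical, by swapping the arguments.
s0_inv(Intermediate, Lexical) :- s0(Lexical, Intermediate).
```

For instance, ?- s0_inv([c,a,t,'^',s,'#'], X). gives X = [c,a,t,'+N','+PL'].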
21 The Mapping Problem
[figure: example (3.11)]
(Context-Sensitive) Spelling Rule: (3.5)
–ε → e / { x, s, z } ^ __ s #
–ε rewrites to the letter e in left context x^, s^ or z^ and right context s#
–i.e. insert e after the ^ when you see x^s#, s^s# or z^s#
–in particular, we have x^s# → x^es#
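Before turning the rule into a transducer, its effect can be checked with a direct list rewrite. The sketch below is not a finite state machine and is not the lecture's implementation: it matches the context x^s#, s^s# or z^s# in one step, inserts e, and deletes every morpheme boundary. The name e_insert/2 is hypothetical, and the end-of-word marker is kept in the output, as in the lecture's query results.

```prolog
% e_insert(+Intermediate, -Surface)
e_insert([], []).
% Rule context: C in {x,s,z}, then ^ s #  =>  C e s #
e_insert([C,'^',s,'#'|T], [C,e,s,'#'|S]) :-
    member(C, [x,s,z]), !,
    e_insert(T, S).
% Any other morpheme boundary is simply deleted (^:ε).
e_insert(['^'|T], S) :- !,
    e_insert(T, S).
% Everything else passes through unchanged.
e_insert([C|T], [C|S]) :-
    e_insert(T, S).
```

With this sketch, [f,o,x,'^',s,'#'] maps to [f,o,x,e,s,'#'], while [c,a,t,'^',s,'#'] maps to [c,a,t,s,'#'] because t is not in the left context.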
22 Stage 2: Intermediate ↔ Surface Levels
–can also be implemented using an FST
–important! the machine is designed to pass input not matching the rule through unmodified (rather than fail)
–implements the context-sensitive rule
  q0 to q2: left context
  q3 to q0: right context
23 Stage 2: Intermediate ↔ Surface Levels
[figure: example (3.17)]
24 Stage 2: Intermediate ↔ Surface Levels
[table: transition table for the FST in 3.14]
Note:
–other: (catch-all case) pass any remaining symbol (one not specified explicitly in the state) to the other side unchanged
–#: # is never included in other
25 Stage 2: Intermediate ↔ Surface Levels
in Prolog (simplified)
–with special treatment for “other”
q0([],[]).                          % final state
q0([^|L1],L2) :- !, q0(L1,L2).      % ^:ε
q0([z|L1],[z|L2]) :- !, q1(L1,L2).  % repeat for s,x
q0([#|L1],[#|L2]) :- !, q0(L1,L2).
q0([X|L1],[X|L2]) :- q0(L1,L2).     % other
! is known as the “cut” predicate
–it affects how Prolog searches
–it means “cut” the search off
–Prolog will not try any other compatible rule on backtracking
–problematic for generation, e.g. ^:ε case
26 Stage 2: Intermediate ↔ Surface Levels (continued)
backtrack points: other choices (the remaining compatible q0 clauses, which ! removes)
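The effect of the cut on the catch-all clause is easiest to see by comparing two otherwise identical predicates. This is an illustrative sketch with invented names (del_cut/2, del_nocut/2); both delete '^' from a list, and only the first commits to the deletion clause.

```prolog
% With cut: once '^' is seen, the catch-all clause is never tried.
del_cut([], []).
del_cut(['^'|L1], L2) :- !, del_cut(L1, L2).
del_cut([X|L1], [X|L2]) :- del_cut(L1, L2).

% Without cut: on backtracking, '^' also matches the catch-all clause.
del_nocut([], []).
del_nocut(['^'|L1], L2) :- del_nocut(L1, L2).
del_nocut([X|L1], [X|L2]) :- del_nocut(L1, L2).
```

The query ?- findall(Y, del_cut([a,'^',b], Y), Ys). yields a single answer [a,b], whereas the same query on del_nocut/2 also returns the unwanted [a,'^',b]. Both versions recurse without bound if called with the first argument unbound, which is the generation problem noted on the slide.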
27 Stage 2: Intermediate ↔ Surface Levels
problem for generation
–?- q0(X,[f,o,x,e,s,#]).
  X = [^|L1]
–?- q0(L1,[f,o,x,e,s,#]).
  L1 = [^|L1']
–?- q0(L1',[f,o,x,e,s,#]).
–infinite loop
–Culprit: ^:ε case (morpheme boundary deletion)
–can keep introducing ^^^^^^^... ad infinitum
–requires more than finite state power to correct
q0([],[]).                          % final state
q0([^|L1],L2) :- !, q0(L1,L2).      % ^:ε
q0([z|L1],[z|L2]) :- !, q1(L1,L2).  % repeat for s,x
q0([#|L1],[#|L2]) :- !, q0(L1,L2).
q0([X|L1],[X|L2]) :- q0(L1,L2).     % other
28 Stage 2: Intermediate ↔ Surface Levels
–Other cases of ^:ε do not loop.
–Could eliminate just the loop case.
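One pragmatic way to keep generation from running away is to fix the length of the candidate intermediate string before calling the machine. The sketch below does this for a stripped-down, cut-free deletion transducer. It is a workaround imposed from outside the machine, not the fix proposed in the lecture, and the names d/2 and generate/3 as well as the MaxBoundaries parameter are assumptions.

```prolog
% d(Intermediate, Surface): delete '^', copy every other symbol.
d([], []).
d(['^'|L1], L2) :- d(L1, L2).
d([X|L1], [X|L2]) :- X \= '^', d(L1, L2).

% generate(+Surface, +MaxBoundaries, -Intermediate):
% enumerate intermediate strings with at most MaxBoundaries '^' symbols.
generate(Surface, MaxBoundaries, Intermediate) :-
    length(Surface, N),
    Longest is N + MaxBoundaries,
    between(N, Longest, K),
    length(Intermediate, K),     % fixed-length list of fresh variables
    d(Intermediate, Surface).
```

Because the intermediate list has a fixed length on each attempt, every call to d/2 terminates. For example, ?- findall(I, generate([a], 1, I), Is). gives the three candidates [a], ['^',a] and [a,'^'].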
29 Stage 2: Intermediate ↔ Surface Levels
query (generation)
–?- q0(X,[c,a,t,s,#]).
  X = [c, a, t, s, ^, #] ;   q0+ -> q1 -> q2 -> q0
  X = [c, a, t, s, #] ;      q0+ -> q1 -> q0
  No
30 Looking ahead Read Chapter 5: Probabilistic Models of (Pronunciation and) Spelling