Semantic Representation and Formal Transformation
Kevin Knight, USC/ISI
MURI meeting, CMU, Nov 3-4, 2011
Machine Translation (NIST 2009 c2e)
– Phrase-based MT: source string → target string
– Syntax-based MT: source string → source tree → target tree → target string
– Meaning-based MT: source string → meaning representation → target string
Meaning-based MT
Too big for this MURI:
– What content goes into the meaning representation? (linguistics, annotation)
– How are meaning representations probabilistically generated, transformed, scored, ranked? (automata theory, efficient algorithms)
– How can a full MT system be built and tested? (engineering, language modeling, features, training)
Meaning-based MT: language-independent theory, but driven by practical desires.
Automata Frameworks
How to represent and manipulate linguistic representations?
Linguistics, NLP, and Automata Theory used to be together (1960s, 70s)
– Context-free grammars were invented to model human language
– Tree transducers were invented to model transformational grammar
They drifted apart
Renewed connections around MT (this century)
Role: greatly simplify systems!
Finite-State (String) Transducer (FST)
Original input: k n i g h t
FST rules (state, input symbol → next state, output; *e* is the empty output):
– q, k → q2, *e*
– q2, n → q, N
– q, i → q, AY
– q, g → q3, *e*
– q3, h → q4, *e*
– q4, t → qfinal, T
Transformation (output so far | current state | remaining input):
– (none) | q | k n i g h t
– (none) | q2 | n i g h t
– N | q | i g h t
– N AY | q | g h t
– N AY | q3 | h t
– N AY | q4 | t
– N AY T | qfinal | (none)
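The "knight" walk-through can be replayed in a few lines of Python. This is a minimal sketch, assuming a deterministic, unweighted machine (the transducers used later in the talk are weighted and generally nondeterministic); the names `RULES` and `transduce` are illustrative, not taken from Carmel or any other toolkit.

```python
# (state, input symbol) -> (next state, output symbols); [] stands for *e*
RULES = {
    ("q", "k"): ("q2", []),
    ("q2", "n"): ("q", ["N"]),
    ("q", "i"): ("q", ["AY"]),
    ("q", "g"): ("q3", []),
    ("q3", "h"): ("q4", []),
    ("q4", "t"): ("qfinal", ["T"]),
}


def transduce(symbols, start="q", final="qfinal"):
    """Run the FST over the input; return the output list, or None if rejected."""
    state, output = start, []
    for sym in symbols:
        if (state, sym) not in RULES:
            return None  # no transition for this symbol in this state
        state, emitted = RULES[(state, sym)]
        output.extend(emitted)
    return output if state == final else None


print(transduce("knight"))  # ['N', 'AY', 'T']
```

An input the machine has no path for, such as "night", is rejected (the function returns None) because state q has no transition on n.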
Transliteration
Angela Knight → (transliteration) → a n ji ra na i to
Frequently occurring translation problem for languages with different sound systems and character sets (Japanese, Chinese, Arabic, Russian, English…).
Can’t be solved by dictionary lookup.
Transliteration
Angela Knight ↔ WFST (7 input symbols, 13 output symbols)
Transliteration as a cascade
Angela Knight → WFSA A → WFST B → AE N J EH L UH N AY T → WFST C → a n j i r a n a i t o → WFST D
DECODE (running the cascade WFSA A, WFST B, WFST C, WFST D backward)
Observed: a n j i r a n a i t o
Candidate sound sequences:
– AE N J IH R UH N AY T
– AH N J IH L UH N AY T OH
– + millions more
General-Purpose Algorithms for String Automata
– N-best: … paths through a WFSA (Viterbi, 1967; Eppstein, 1998)
– EM training: forward-backward EM (Baum & Welch, 1971; Eisner 2001)
– Determinization: … of weighted string acceptors (Mohri, 1997)
– Intersection: WFSA intersection
– Application: string → WFST → WFSA
– Transducer composition: WFST composition (Pereira & Riley, 1996)
– General-purpose toolkit: Carmel (Graehl & Knight 97), OpenFST (Google, via AT&T), ...
Top-Down Tree Transducer (W. Rounds 1970; J. Thatcher 1970)
Original input: parse tree of "he enjoys listening to music" (node labels: S, NP, VP, PRO, VBZ, NP, VBG, VP, P, NP, SBAR)
Transformation:
– Start with state q at the root: q S(…)
– Apply rule (weight 0.2): q S(x0:NP, VP(x1:VBZ, x2:NP)) → s x0, wa, r x2, ga, q x1
  Result: s NP(PRO(he)), wa, r NP(… listening to music), ga, q VBZ(enjoys)
– Apply rule (weight 0.7): s NP(PRO(he)) → kare
  Result: kare, wa, r NP(… listening to music), ga, q VBZ(enjoys)
– Continue until no states remain
Final output: kare, wa, ongaku, o, kiku, no, ga, daisuki, desu
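The first two steps of this derivation can be sketched in Python. Only the two rules shown on the slides (weights 0.2 and 0.7) are encoded; the bracketing below the object NP is an assumption, since the slide text does not preserve it, and the names `RULES`, `match`, and `transduce` are illustrative. State/subtree pairs with no applicable rule are left pending in the output, which mirrors the intermediate slide state.

```python
# Trees are (label, child, ...) tuples; leaves are plain strings.
TREE = ("S",
        ("NP", ("PRO", "he")),
        ("VP", ("VBZ", "enjoys"),
               # Bracketing inside this NP is assumed for illustration.
               ("NP", ("SBAR", ("VBG", "listening"),
                               ("VP", ("P", "to"), ("NP", "music"))))))

# (state, input pattern, output items, weight); "x0:NP" binds x0 to an NP subtree.
RULES = [
    ("q", ("S", "x0:NP", ("VP", "x1:VBZ", "x2:NP")),
     [("s", "x0"), "wa", ("r", "x2"), "ga", ("q", "x1")], 0.2),
    ("s", ("NP", ("PRO", "he")), ["kare"], 0.7),
]


def match(pattern, tree, env):
    """Match a rule pattern against a tree, recording variable bindings in env."""
    if isinstance(pattern, str):
        if ":" in pattern:
            var, label = pattern.split(":")
            if isinstance(tree, tuple) and tree[0] == label:
                env[var] = tree
                return True
            return False
        return pattern == tree  # literal label or leaf
    return (isinstance(tree, tuple) and len(tree) == len(pattern)
            and all(match(p, t, env) for p, t in zip(pattern, tree)))


def transduce(state, tree):
    """Apply the first matching rule top-down; return (output items, weight)."""
    for rule_state, pattern, rhs, weight in RULES:
        env = {}
        if rule_state == state and match(pattern, tree, env):
            out = []
            for item in rhs:
                if isinstance(item, str):
                    out.append(item)
                else:
                    sub_out, sub_weight = transduce(item[0], env[item[1]])
                    out.extend(sub_out)
                    weight *= sub_weight
            return out, weight
    return [(state, tree)], 1.0  # no rule applies: leave the pair pending


output, weight = transduce("q", TREE)
```

Here `output` begins with "kare", "wa", followed by the pending pair for state r, then "ga" and the pending pair for state q; `weight` is the product of the two rule weights. Rules for states r and q on the remaining subtrees would be needed to reach the final Japanese string.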
Top-Down Tree Transducer
Introduced by Rounds (1970) & Thatcher (1970)
“Recent developments in the theory of automata have pointed to an extension of the domain of definition of automata from strings to trees … parts of mathematical linguistics can be formalized easily in a tree-automaton setting …” (Rounds 1970, “Mappings on Grammars and Trees”, Math. Systems Theory 4(3))
Large theory literature
– e.g., Gécseg & Steinby (1984), Comon et al (1997)
Once again re-connecting with NLP practice
– e.g., Knight & Graehl (2005), Galley et al (2004, 2006), May & Knight (2006, 2010), Maletti et al (2009)
Tree Transducers Can be Extracted from Bilingual Data (Galley et al, 04)
English: i felt obliged to do my part
Chinese: 我 有 责任 尽 一份 力
TREE TRANSDUCER RULES:
– VBD(felt) → 有
– VBN(obliged) → 责任
– VB(do) → 尽
– NN(part) → 一份
– NN(part) → 一份 力
– VP-C(x0:VBN x1:SG-C) → x0 x1
– VP(TO(to) x0:VP-C) → x0
– …
– S(x0:NP-C x1:VP) → x0 x1
English parse tree node labels: S, NP-C, VP, VP-C, VBD, SG-C, VP, VBN, TO, VP-C, VB, NP-C, NPB, PRP, PRP$, NN
Syntax-Based Decoding
Source: 这 7 人 中包括 来自 法国 和 俄罗斯 的 宇航 员 .
Rules:
– RULE 1: DT(these) → 这
– RULE 2: VBP(include) → 中包括
– RULE 4: NNP(France) → 法国
– RULE 5: CC(and) → 和
– RULE 6: NNP(Russia) → 俄罗斯
– RULE 8: NP(NNS(astronauts)) → 宇航, 员
– RULE 9: PUNC(.) → .
– RULE 10: NP(x0:DT, CD(7), NNS(people)) → x0, 7 人
– RULE 11: VP(VBG(coming), PP(IN(from), x0:NP)) → 来自, x0
– RULE 13: NP(x0:NNP, x1:CC, x2:NNP) → x0, x1, x2
– RULE 14: VP(x0:VBP, x1:NP) → x0, x1
– RULE 15: S(x0:NP, x1:VP, x2:PUNC) → x0, x1, x2
– RULE 16: NP(x0:NP, x1:VP) → x1, 的, x0
Derivation (bottom-up):
– Lexical rules: “these”, “include”, “France”, “and”, “Russia”, “astronauts”, “.”
– RULE 13: “France and Russia”
– RULE 11: “coming from France and Russia”
– RULE 16: “astronauts coming from France and Russia”
– RULE 14: “include astronauts coming from France and Russia”
– RULE 10: “these 7 people”
– RULE 15: “These 7 people include astronauts coming from France and Russia”
Derived English Tree
S(NP(DT(These) CD(7) NNS(people)) VP(VBP(include) NP(NP(NNS(astronauts)) VP(VBG(coming) PP(IN(from) NP(NNP(France) CC(and) NNP(Russia)))))) PUNC(.))
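The bottom-up decoding above can be sketched as a small memoized chart parser over the Chinese side of the rules. This is an unweighted toy, assuming the thirteen rules listed on the slides and producing English strings rather than trees; a real decoder would also score derivations with rule weights and a language model. The encoding of rules and the name `decode` are illustrative.

```python
from functools import lru_cache

# (root label, English side, Chinese side).
# English side: words, or int k for variable xk.
# Chinese side: tokens, or (k, label) for variable xk of that category.
RULES = [
    ("DT", ["these"], ["这"]),
    ("VBP", ["include"], ["中包括"]),
    ("NNP", ["France"], ["法国"]),
    ("CC", ["and"], ["和"]),
    ("NNP", ["Russia"], ["俄罗斯"]),
    ("NP", ["astronauts"], ["宇航", "员"]),
    ("PUNC", ["."], ["."]),
    ("NP", [0, "7", "people"], [(0, "DT"), "7", "人"]),
    ("VP", ["coming", "from", 0], ["来自", (0, "NP")]),
    ("NP", [0, 1, 2], [(0, "NNP"), (1, "CC"), (2, "NNP")]),
    ("VP", [0, 1], [(0, "VBP"), (1, "NP")]),
    ("S", [0, 1, 2], [(0, "NP"), (1, "VP"), (2, "PUNC")]),
    ("NP", [0, 1], [(1, "VP"), "的", (0, "NP")]),
]


def decode(tokens, goal="S"):
    """Return every English string the rules derive for the whole input."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def parse(label, i, j):
        """English strings for category `label` covering tokens[i:j]."""
        results = set()
        for root, eng, chi in RULES:
            if root != label:
                continue
            for binding in match(tuple(chi), i, j):
                env = dict(binding)
                results.add(" ".join(env[e] if isinstance(e, int) else e
                                     for e in eng))
        return frozenset(results)

    def match(chi, i, j):
        """Yield variable bindings that make `chi` cover exactly tokens[i:j]."""
        if not chi:
            if i == j:
                yield ()
            return
        head, rest = chi[0], chi[1:]
        if isinstance(head, str):
            if i < j and tokens[i] == head:
                yield from match(rest, i + 1, j)
        else:
            index, label = head
            # Leave at least one token for each remaining item.
            for k in range(i + 1, j - len(rest) + 1):
                for english in parse(label, i, k):
                    for tail in match(rest, k, j):
                        yield ((index, english),) + tail

    return parse(goal, 0, len(tokens))


source = "这 7 人 中包括 来自 法国 和 俄罗斯 的 宇航 员 .".split()
translations = decode(source)
```

With this rule set the chart contains a single complete derivation, so `translations` holds one string, matching the final step of the derivation. The recursion terminates because no rule here has a right-hand side consisting of a single variable.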
General-Purpose Algorithms for Tree Automata
Each row: String Automata Algorithms | Tree Automata Algorithms
– N-best: … paths through a WFSA (Viterbi, 1967; Eppstein, 1998) | … trees in a weighted forest (Jiménez & Marzal, 2000; Huang & Chiang, 2005)
– EM training: forward-backward EM (Baum/Welch, 1971; Eisner 2003) | tree transducer EM training (Graehl & Knight, 2004)
– Determinization: … of weighted string acceptors (Mohri, 1997) | … of weighted tree acceptors (Borchardt & Vogler, 2003; May & Knight, 2005)
– Intersection: WFSA intersection | tree acceptor intersection
– Applying transducers: string → WFST → WFSA | tree → TT → weighted tree acceptor
– Transducer composition: WFST composition (Pereira & Riley, 1996) | many tree transducers not closed under composition (Maletti et al 09)
– General-purpose tools: Carmel, OpenFST | Tiburon (May & Knight 10)
Five Equivalent Meaning Representation Formats
“The boy wants to go.”
PENMAN:
  (w / WANT :agent (b / BOY) :patient (g / GO :agent b))
LOGICAL FORM:
  ∃ w, b, g : instance(w, WANT) ^ instance(g, GO) ^ instance(b, BOY) ^ agent(w, b) ^ patient(w, g) ^ agent(g, b)
PATH EQUATIONS:
  (x0 instance) = WANT
  (x1 instance) = BOY
  (x2 instance) = GO
  (x0 agent) = x1
  (x0 patient) = x2
  (x2 agent) = x1
FEATURE STRUCTURE:
  [instance: WANT
   agent: [1] [instance: BOY]
   patient: [instance: GO
             agent: [1]]]
DIRECTED ACYCLIC GRAPH:
  nodes WANT, BOY, GO; edge labels instance, agent, patient, agent
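The equivalence between the PENMAN and logical-form notations can be made concrete with a short converter. This is a minimal sketch, assuming well-formed input with one concept per variable; the function name `penman_to_triples` and the (relation, source, target) triple layout are illustrative choices, not a standard API.

```python
import re


def penman_to_triples(text):
    """Convert a PENMAN expression into (relation, source, target) triples."""
    tokens = re.findall(r'\(|\)|/|"[^"]*"|[^\s()/]+', text)
    pos = 0
    triples = []

    def node():
        """Parse '(var / CONCEPT :role value ...)' and return the variable."""
        nonlocal pos
        assert tokens[pos] == "(" and tokens[pos + 2] == "/"
        var, concept = tokens[pos + 1], tokens[pos + 3]
        pos += 4
        triples.append(("instance", var, concept))
        while tokens[pos] != ")":
            role = tokens[pos].lstrip(":")
            pos += 1
            if tokens[pos] == "(":
                value = node()        # nested node: recurse
            else:
                value = tokens[pos]   # re-entrant variable or constant
                pos += 1
            triples.append((role, var, value))
        pos += 1  # consume ")"
        return var

    node()
    return triples


triples = penman_to_triples("(w / WANT :agent (b / BOY) :patient (g / GO :agent b))")
for relation, source, target in triples:
    print(f"{relation}({source}, {target})")
```

The six printed conjuncts are the ones in the logical form, in a different order. The bare variable b in ":agent b" is what turns the tree-shaped notation into a graph: it points back to the BOY node rather than creating a new one.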
Example
“Government forces closed on rebel outposts on Thursday, showering the western mountain city of Zintan with missiles and attacking insurgents holed up near the Tunisian border, according to rebel sources.”
(s / say
  :agent (s2 / source :mod (r / rebel))
  :patient (a / and
    :op1 (c / close-on
      :agent (f / force :mod (g / government))
      :patient (o / outpost :mod (r2 / rebel))
      :temporal-locating (t / thursday))
    :op2 (s / shower
      :agent f
      :patient (c / city :mod (m / mountain) :mod (w / west) :name "Zintan")
      :instrument (m2 / missile))
    :op3 (a / attack
      :agent f
      :patient (i / insurgent
        :agent-of (h / hole-up
          :pp-near (b / border
            :gpi (c2 / country :name "Tunisia")))))))
Slogan: “more logical than a parse tree”
General-Purpose Algorithms for Feature Structures (Graphs)
Each row: String Automata Algorithms | Tree Automata Algorithms | Graph Automata Algorithms?
– N-best: … paths through a WFSA (Viterbi, 1967; Eppstein, 1998) | … trees in a weighted forest (Jiménez & Marzal, 2000; Huang & Chiang, 2005) |
– EM training: forward-backward EM (Baum/Welch, 1971; Eisner 2003) | tree transducer EM training (Graehl & Knight, 2004) |
– Determinization: … of weighted string acceptors (Mohri, 1997) | … of weighted tree acceptors (Borchardt & Vogler, 2003; May & Knight, 2005) |
– Intersection: WFSA intersection | tree acceptor intersection |
– Applying transducers: string → WFST → WFSA | tree → TT → weighted tree acceptor |
– Transducer composition: WFST composition (Pereira & Riley, 1996) | many tree transducers not closed under composition (Maletti et al 09) |
– General tools: Carmel, OpenFST | Tiburon (May & Knight 10) |
Automata Frameworks
– Hyperedge-replacement graph grammars (Drewes et al)
– DAG acceptors (Hart 75)
– DAG-to-tree transducers (Kamimura & Slutzki 82)
Mapping Between Meaning and Text
Each example pairs an English sentence with a meaning graph, which in turn maps to foreign text.
– “the boy wants to see”: graph over WANT, BOY, SEE (edge labels: instance, agent, patient, agent)
– “the boy wants to be seen”: graph over WANT, BOY, SEE (edge labels: instance, agent, patient)
– “the boy wants the girl to be seen”: graph over WANT, BOY, SEE, GIRL (edge labels: instance, agent, patient, instance)
– “the boy wants to see the girl”: graph over WANT, BOY, SEE, GIRL (edge labels: instance, agent, patient, instance, agent)
– “the boy wants to see himself”: graph over WANT, BOY, SEE (edge labels: instance, agent, patient, agent)
Bottom-Up DAG-to-Tree Transduction (Kamimura & Slutzki 82)
Input DAG: PROMISE with agent BOY, patient GO, and recipient GIRL; GO has agent BOY (edge labels: instance, agent, patient, recipient, instance, agent)
Transformation (bottom-up, leaves first):
– GIRL → q.NN(girl)
– BOY → q.NN(boy)
– GO → q.VBZ(goes), or alternatively q.VB(go)
– PROMISE → q.VBD(promised)
– q.NN(boy) → q.NP(DT(the) NN(boy)); q.NN(girl) → q.NP(DT(the) NN(girl))
– The boy NP has two incoming agent edges, so it is copied: q.agt NP(DT(the) NN(boy)) and q.subj NP(DT(the) NN(boy))
– GO's agent edge is consumed: q.VB(go) becomes q.VP(TO(to) VB(go))
– The patient edge is consumed: qpatsubj.VP(TO(to) VB(go))
– The recipient edge is consumed: q.rec NP(DT(the) NN(girl))
– The root is consumed, giving the output tree:
  S(NP(DT(the) NN(boy)) VP(VBD(promised) NP(DT(the) NN(girl)) VP(TO(to) VB(go))))
Bottom-Up DAG-to-Tree Transduction (Kamimura & Slutzki 82)
Input DAG: PERSUADE, BOY, GO, GIRL (edge labels: instance, agent, patient, recipient, instance, agent)
Output trees:
– S(NP(DT(the) NN(boy)) VP(VBD(persuaded) NP(DT(the) NN(girl)) VP(TO(to) VB(go))))
– S(NP(DT(the) NN(boy)) VP(VBD(persuaded) NP(DT(the) NN(girl)) VP(that he would go)))
Automata for Statistical MT
Columns: Automata Framework | Devised | Used in SMT
Rows: FST, TDTT, DAG
end