Training Tree Transducers


Training Tree Transducers
Authors: Jonathan Graehl, Kevin Knight
Presented by Zhengbo Zhou, 11/16/2018

Outline
– Finite State Transducers (FSTs) and R
– Trees and Regular Tree Grammars
– xR and Derivation Trees
– Inside-Outside algorithm and EM training
– Turning trees into strings (xRS)
– Example and Related Work
– My thoughts/questions

Finite State Transducers (FSTs)
A finite-state transducer, as we have seen it before: the slide's example has two states q0 and q1, with transitions a:x and b:y, each reading one input symbol and emitting one output symbol.
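As a minimal runnable sketch of this machine (the Python representation is my own assumption, not the paper's formalism; the states and symbol pairs follow the slide's diagram):

```python
# A minimal FST sketch: each transition maps
# (state, input symbol) -> (output symbol, next state).
# States q0/q1 and the pairs a:x, b:y follow the slide's diagram.
transitions = {
    ("q0", "a"): ("x", "q1"),
    ("q1", "b"): ("y", "q0"),
}

def transduce(inp, start="q0"):
    """Run the FST over an input string, returning the output string."""
    state, out = start, []
    for sym in inp:
        emitted, state = transitions[(state, sym)]
        out.append(emitted)
    return "".join(out)

print(transduce("ab"))  # -> "xy"
```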

R transducer
An R transducer compactly represents a potentially infinite set of input/output tree pairs, just as an FST compactly represents such a set of input/output string pairs. R is a generalization of the FST from strings to trees.

Example of R
The sentence "He drinks water" with its parse tree S(PRO(he), VP(V(drinks), NP(water))).

Example of R (cont.)
Rule 1 rewrites the root, sending the input S to the output S(qleft.vp.v VP, qpro PRO, qright.vp.np VP); rules 2, 3, and 4 then continue from the states qleft.vp.v, qpro, and qright.vp.np, producing V, PRO, and NP. The net effect is a reordering:
– English order: S(PRO, VP(V, NP))
– Arabic order: S(V, PRO, NP)
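A small sketch of this reordering in code, with trees as nested tuples; the representation (and the collapsing of rules 1-4 into one function) is an illustration, not the paper's formalism:

```python
# English parse tree: S(PRO(he), VP(V(drinks), NP(water))),
# encoded as (label, child, child, ...) tuples.
english = ("S", ("PRO", "he"),
           ("VP", ("V", "drinks"), ("NP", "water")))

def reorder(t):
    """SVO -> VSO at the root: S(PRO, VP(V, NP)) becomes S(V, PRO, NP).

    Note the VP subtree is matched twice on the right-hand side
    (qleft.vp.v VP and qright.vp.np VP) -- R transducers allow copying."""
    _, pro, vp = t        # match S(PRO, VP)
    _, v, np = vp         # match VP(V, NP)
    return ("S", v, pro, np)

print(reorder(english))
# ('S', ('V', 'drinks'), ('PRO', 'he'), ('NP', 'water'))
```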

Trees
Definitions (formal definitions of trees given as a figure on the slide).

Regular Tree Grammars (RTG)
A regular tree grammar is a common way of compactly representing a potentially infinite set of trees; a weighted RTG (wRTG) is to trees what a weighted FSA is to strings. A wRTG G = (∑, N, S, P):
– ∑: alphabet
– N: nonterminals
– S: start nonterminal
– P: weighted productions

Sample wRTG (given as a figure on the slide).
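Since the slide's sample is an image, here is a hypothetical wRTG in the same spirit: productions carry weights, and top-down expansion yields a tree together with its weight. The grammar itself is invented for illustration:

```python
import random

# Hypothetical wRTG G = (Sigma, N, S, P): each nonterminal maps to a list
# of weighted productions; an RHS is (label, children...), where a child
# that names a nonterminal is expanded recursively.
productions = {
    "S":  [(1.0, ("s", "NP", "VP"))],
    "NP": [(0.7, ("np", "he")), (0.3, ("np", "water"))],
    "VP": [(1.0, ("vp", "drinks", "NP"))],
}

def sample(nonterminal="S"):
    """Sample one tree from the wRTG, returning (tree, weight)."""
    options = productions[nonterminal]
    w, rhs = random.choices(options, weights=[w for w, _ in options])[0]
    label, *children = rhs
    weight, expanded = w, []
    for child in children:
        if child in productions:            # nonterminal: keep expanding
            subtree, sub_w = sample(child)
            expanded.append(subtree)
            weight *= sub_w
        else:                               # terminal leaf
            expanded.append(child)
    return (label, *expanded), weight

print(sample())
# e.g. (('s', ('np', 'he'), ('vp', 'drinks', ('np', 'he'))), 0.49)
```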

Extended-LHS Tree Transducer (xR)
Different from R: xR explicitly represents lookahead and movement with a more specific LHS. The LHS is a tree pattern, drawn from a finite set of tree patterns, that is matched against an input subtree.

Binary Relation: the one-step derivation relations ⇒X and ⇒G (defined formally on the slide).

Derivation Tree
We have many kinds of trees now, but a derivation tree records the transducer's rule applications; it is neither the input tree nor the output tree. A derivation tree does, however, deterministically produce a single weighted output tree.

Derivation tree & derivation wRTG (figure: a transducer X and its derivation wRTG X′).

Inside-Outside algorithm
Basic idea: use the current rule probabilities to estimate the expected frequencies of certain types of derivation steps, then compute new probabilities for those rules [1]. In PCFG terms: the inside probability of A is accumulated bottom-up through rules such as A → BC, while the outside probability of A is accumulated top-down through rules such as C → AB or C → BA, in which A appears on the right-hand side.

Inside-Outside for wRTG
Inside weights using G are given by βG, the total weight of all derivations from a nonterminal; outside weights αG measure the weight of the contexts around it:
βG(n) = Σ_{(n→r, w) ∈ P} w · Π_{n′: nonterminal leaf of r} βG(n′)
αG(S) = 1, and for n ≠ S:
αG(n) = Σ_{(n′→r, w) ∈ P, n a leaf of r} αG(n′) · w · Π_{n″: other nonterminal leaf of r} βG(n″)
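A sketch of the inside recursion for an acyclic wRTG (the toy grammar is hypothetical; in the paper these weights are computed over derivation wRTGs):

```python
from functools import lru_cache

# beta(n) = sum over productions (n -> rhs, w) of w times the product of
# beta over the nonterminal leaves of rhs -- the total weight of all
# trees derivable from n. Toy (acyclic) grammar for illustration only.
productions = {
    "S":  [(1.0, ("s", "NP", "VP"))],
    "NP": [(0.7, ("np", "he")), (0.3, ("np", "water"))],
    "VP": [(1.0, ("vp", "drinks", "NP"))],
}

@lru_cache(maxsize=None)
def beta(n):
    total = 0.0
    for w, rhs in productions[n]:
        for child in rhs[1:]:          # rhs[0] is the tree label
            if child in productions:   # nonterminal leaf
                w *= beta(child)
        total += w
    return total

print(beta("S"))  # 1.0 * beta("NP") * beta("VP") = 1.0
```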

EM training
EM training maximizes the corpus likelihood by repeatedly estimating the expected count of each decision (E-step) and then maximizing, i.e., assigning those counts to the parameters and renormalizing (M-step). Algorithm 2 implements EM xR training by repeatedly computing inside-outside weights.
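A runnable toy of this E/M structure. The real Algorithm 2 computes expected counts with inside-outside weights over derivation wRTGs instead of enumerating derivations; the rule names and data below are invented:

```python
from collections import defaultdict
from math import prod

# Each training example is the set of derivations that could explain it;
# a derivation is the list of rules it uses. Rules are grouped by their
# left-hand side for renormalization.
rules_by_lhs = {"q.S": ["r1", "r2"], "q.NP": ["r3"]}
examples = [
    [["r1", "r3"], ["r2", "r3"]],   # ambiguous: two competing derivations
    [["r1", "r3"]],                 # unambiguous: only r1 explains it
]

# Initialize uniformly within each left-hand side.
p = {r: 1.0 / len(rs) for rs in rules_by_lhs.values() for r in rs}

for _ in range(25):
    counts = defaultdict(float)
    # E-step: posterior over each example's derivations -> expected counts.
    for derivations in examples:
        weights = [prod(p[r] for r in d) for d in derivations]
        z = sum(weights)
        for d, w in zip(derivations, weights):
            for r in d:
                counts[r] += w / z
    # M-step: renormalize expected counts within each left-hand side.
    for rs in rules_by_lhs.values():
        total = sum(counts[r] for r in rs)
        for r in rs:
            p[r] = counts[r] / total

print(p)  # r1 takes mass from r2, driven by the unambiguous example
```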

From tree to string
An extended-LHS tree transducer (xR) turns an input tree (say, a parse tree) into an output tree, but the result is still a (parse) tree, not a sentence in the other language, which is what machine translation needs. For that we have xRS, the tree-to-string transducer.

Tree-to-string transducer
Weighted extended-LHS root-to-frontier tree-to-string transducer: X = (∑, Δ, Q, Qi, R). It is similar to xR, but each rule's RHS is a string instead of a tree.
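A sketch of what an xRS-style rule does, emitting a string rather than a tree; encoding the rule directly as a function, with the state bookkeeping omitted, is an illustration only:

```python
def string_yield(t):
    """Concatenate a tree's leaves; leaves are plain strings."""
    if isinstance(t, str):
        return t
    return " ".join(string_yield(child) for child in t[1:])

def xrs_root(t):
    """xRS-flavored root rule: q S(PRO, VP(V, NP)) -> the string V PRO NP."""
    _, pro, vp = t
    _, v, np = vp
    return " ".join([string_yield(v), string_yield(pro), string_yield(np)])

english = ("S", ("PRO", "he"),
           ("VP", ("V", "drinks"), ("NP", "water")))
print(xrs_root(english))  # -> "drinks he water" (VSO order)
```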

Example
The paper implements the translation model of Yamada and Knight (2001): there is a trainable xRS tree-to-string transducer that embodies it (its components are listed on the slide).

Related Work
– TSG vs. RTG: equivalent
– xR vs. weighted synchronous TSG: similar
– EM training here vs. the forward-backward algorithm for finite-state (string) transducers, and likewise for HMMs

Questions
– Is there any future work on this tree transducer, especially for machine translation? Precision? Recall?
– I am also a little confused by the descriptions of the two relations ⇒X and ⇒G.
– I am not very sure about the inside-outside algorithm.
Questions?

Thank you!!

Reference
[1] Fernando Pereira and Yves Schabes. Inside-Outside Reestimation from Partially Bracketed Corpora. ACL 1992.

What might be useful
An Overview of Probabilistic Tree Transducers for Natural Language Processing, Kevin Knight and Jonathan Graehl.

– R: Top-down transducer, introduced before.
– F: Bottom-up transducer ("Frontier-to-root"), with similar rules, but transforming the leaves of the input tree first and working its way up.
– L: Linear transducer, which prohibits copying subtrees. Rule 4 in Figure 4 is an example of a copying production, so this whole transducer is R but not RL.
– N: Non-deleting transducer, which requires that every left-hand-side variable also appear on the right-hand side. A deleting R-transducer can simply delete a subtree (without inspecting it). The transducer in Figure 4 is the deleting kind, because of rules 34-39. It would also be deleting if it included a rule for dropping English determiners, e.g., q NP(x0, x1) → q x1.
– D: Deterministic transducer, with a maximum of one production per <state, symbol> pair.
– T: Total transducer, with a minimum of one production per <state, symbol> pair.
– PDTT: Push-down tree transducer, the transducer analog of CFTG [36].
– subscript: Regular-lookahead transducer, which can check whether an input subtree is tree-regular, i.e., whether it belongs to a specified RTL. Productions fire only when their lookahead conditions are met.
