National Centre for Language Technology, Dublin City University
NCLT Seminar Series – November 2005
Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Mary Hearne, School of Computing, Dublin City University

Data-Oriented Natural Language Processing using Lexical-Functional Grammar
- Data-Oriented Parsing (DOP): a review
- Parsing with Lexical-Functional Grammar: LFG-DOP
- LFG-based models: what are the challenges?

Experience-Based Parsing: PCFGs
Basic strategy: extract grammar rules and their relative frequencies from a treebank.
For the treebank tree of "john loves mary":
- 'Vanilla' PCFG: S → NP VP (1), VP → V NP (1), V → loves (1), NP → john (1/2), NP → mary (1/2)
- Parent-annotated PCFG: S → NP^S VP^S (1), VP^S → V^VP NP^VP (1), V^VP → loves (1), NP^S → john (1), NP^VP → mary (1)
- Head-lexicalised PCFG: S,loves → NP VP,loves (1), VP,loves → V,loves NP (1), V,loves → loves (1), NP → john (1/2), NP → mary (1/2)
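The rule-extraction strategy above can be sketched in Python. The nested (label, children) tree encoding and all function names here are illustrative choices, not from the talk:

```python
from collections import Counter

# A toy treebank tree for "john loves mary"; leaves are plain strings.
TREE = ("S",
        [("NP", ["john"]),
         ("VP", [("V", ["loves"]),
                 ("NP", ["mary"])])])

def extract_rules(tree, rules):
    """Collect one CFG rule per internal node: LHS -> tuple of child labels."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules[(label, rhs)] += 1
    for c in children:
        if not isinstance(c, str):
            extract_rules(c, rules)

def pcfg_probabilities(treebank):
    """Relative-frequency estimate: count(rule) / count(all rules with same LHS)."""
    rules = Counter()
    for tree in treebank:
        extract_rules(tree, rules)
    lhs_totals = Counter()
    for (lhs, _), n in rules.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rules.items()}

probs = pcfg_probabilities([TREE])
```

With this one-tree treebank the estimate reproduces the vanilla-PCFG frequencies on the slide: NP → john gets 1/2 because NP rewrites two ways, while S → NP VP gets 1.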

Experience-Based Parsing: DOP
Basic strategy: extract all subtrees (fragments) of the treebank trees, together with their relative frequencies.
From the tree for "john loves mary" this yields 17 fragments: 10 rooted in S (the full tree and the variants with john, loves and/or mary cut away, down to the bare S → NP VP), 4 rooted in VP, plus NP → john, NP → mary and V → loves.
Relative frequencies are computed per root category: 1/10 for each S-rooted fragment, 1/4 for each VP-rooted fragment, 1/2 for each NP-rooted fragment, and 1 for V → loves.

Non-local dependencies
Fragments let us capture non-local dependencies as single units, e.g. "keep an eye on NP" (a VP fragment from "keep an eye on the LEDs" with an open NP under the PP) and "from last N to first N" (PP fragments from "from last page to first page" with open N slots).

Tree-DOP: Decomposition
Root operation:
- Select any non-frontier non-terminal node to be root
- Delete all except this new root and the subtree it dominates
Frontier operation:
- Select a (possibly empty) set of non-root non-terminal nodes in the newly-created subtree
- Delete all subtrees dominated by these nodes
Example, from the tree for "john loves mary": root(f) = VP with frontiers(f) = {V, NP} gives the bare VP → V NP fragment; root(f) = NP with frontiers(f) = {} gives NP → john; root(f) = S with frontiers(f) = {} gives the full tree.
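The root and frontier operations can be sketched on the same toy tree. The (label, children-tuple) encoding is an illustrative assumption: an open substitution site is simply a non-terminal with no children:

```python
from itertools import combinations

# Toy tree for "john loves mary"; leaves are strings, nodes are tuples.
TREE = ("S", (("NP", ("john",)),
              ("VP", (("V", ("loves",)),
                      ("NP", ("mary",))))))

def internal_nodes(tree, path=()):
    """Yield (path, subtree) for every non-terminal node."""
    label, children = tree
    yield path, tree
    for i, c in enumerate(children):
        if not isinstance(c, str):
            yield from internal_nodes(c, path + (i,))

def cut(tree, frontier, path=()):
    """Copy the tree, truncating at each frontier node (keep label, drop subtree)."""
    label, children = tree
    if path in frontier:
        return (label, ())          # open substitution site
    return (label, tuple(c if isinstance(c, str)
                         else cut(c, frontier, path + (i,))
                         for i, c in enumerate(children)))

def fragments(tree):
    """Root operation: every non-terminal may become a root.
    Frontier operation: delete subtrees below any subset of the
    non-root non-terminals of that rooted subtree."""
    frags = set()                   # dedupes overlapping frontier choices
    for _, sub in internal_nodes(tree):
        inner = [p for p, _ in internal_nodes(sub)][1:]
        for k in range(len(inner) + 1):
            for combo in combinations(inner, k):
                frags.add(cut(sub, set(combo)))
    return frags

frags = fragments(TREE)
```

On this tree the sketch yields 17 distinct fragments, 10 of them S-rooted and 2 NP-rooted, which is exactly the inventory behind the relative frequencies 1/10 and 1/2 on the DOP slide.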

Tree-DOP: Parsing new input
Composition operation (x ∘ y):
- Identify the leftmost open non-terminal LNT in x
- Substitute y at LNT(x) if root(y) = LNT(x)
Example: the parse for "mary loves john" can be derived in several ways, e.g. by composing an S fragment with open subject and object NPs first with NP → mary and then with NP → john, or by building the same tree from smaller fragments including V → loves.
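The composition operation can be sketched as leftmost substitution over the same illustrative tuple encoding (an open site is a non-terminal with no children); the function names are mine, not the talk's:

```python
def leftmost_site(tree, path=()):
    """Depth-first search for the leftmost open non-terminal; None if tree is closed."""
    label, children = tree
    if children == ():
        return path
    for i, c in enumerate(children):
        if not isinstance(c, str):
            p = leftmost_site(c, path + (i,))
            if p is not None:
                return p
    return None

def substitute(tree, path, frag):
    """Return a copy of tree with frag substituted at the given path."""
    if path == ():
        return frag
    label, children = tree
    i, rest = path[0], path[1:]
    return (label, tuple(c if j != i else substitute(c, rest, frag)
                         for j, c in enumerate(children)))

def compose(x, y):
    """x o y: substitute y at the leftmost open site of x,
    provided root(y) matches the site's category."""
    site = leftmost_site(x)
    if site is None:
        raise ValueError("no open substitution site in x")
    node = x
    for i in site:                  # look up the label at the site
        node = node[1][i]
    if node[0] != y[0]:
        raise ValueError("category mismatch: %s vs %s" % (node[0], y[0]))
    return substitute(x, site, y)

# S fragment with open subject and object NPs, composed leftmost-first:
frag_s = ("S", (("NP", ()), ("VP", (("V", ("loves",)), ("NP", ())))))
parse = compose(compose(frag_s, ("NP", ("mary",))), ("NP", ("john",)))
```

Because substitution is leftmost, "mary" fills the subject slot and "john" the object slot, giving the tree for "mary loves john" as in the slide's example.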

Tree-DOP: Ranking output parses
Relative frequency:
- |e| is the number of occurrences of subtree e in the set of fragments
- r(e) is the root node category of subtree e
- fragment probability: P(e) = |e| / Σ_{u : r(u) = r(e)} |u|
Parse probability:
- Multiply fragment probabilities to calculate derivation probability: P(t) = Π_{e ∈ d(t)} P(e)
- Sum derivation probabilities to calculate parse probability: P_DOP(T) = Σ_{t ∈ D(T)} Π_{e ∈ d(t)} P(e)
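The three quantities (fragment, derivation, and parse probability) can be computed directly. The fragment inventory and counts below are toy values of my own, chosen only to make the arithmetic visible:

```python
from collections import Counter
from math import prod

# Illustrative fragment counts |e|, keyed by (root category, fragment id).
counts = {
    ("S", "full"): 1,      # S tree for "john loves mary", fully lexicalised
    ("S", "open_obj"): 1,  # the same tree with the object NP left open
    ("NP", "john"): 1,
    ("NP", "mary"): 1,
    ("V", "loves"): 1,
}

root_totals = Counter()
for (root, _), n in counts.items():
    root_totals[root] += n

def p_fragment(frag):
    """|e| divided by the total count of fragments sharing e's root category."""
    return counts[frag] / root_totals[frag[0]]

def p_derivation(frags):
    """Product of the fragment probabilities in one derivation."""
    return prod(p_fragment(f) for f in frags)

def p_parse(derivations):
    """Sum of derivation probabilities over all derivations of the parse."""
    return sum(p_derivation(d) for d in derivations)

# Two derivations of the same parse: the full tree in one step,
# or the open-object fragment composed with NP -> mary.
prob = p_parse([[("S", "full")],
                [("S", "open_obj"), ("NP", "mary")]])
```

Here the one-step derivation contributes 1/2 and the two-step derivation 1/2 × 1/2 = 1/4, so the parse probability is 3/4: different derivations of the same tree pool their mass.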

Tree-DOP: Some results
[table: F-Score and Exact Match for the most probable parse (MPP) and the most probable derivation (MPD) at fragment depths d = 1 to d = 4]
- English: full parses only (90.83%)
- French: full parses only (92.36%)

LFG-DOP
Lexical Functional Grammar (LFG): a constraint-based theory of language
- c-structure: context-free phrase structure trees
- f-structure: attribute-value matrix
- φ-links: mapping from c-structure to f-structure
Example: "the yellow LED is flashing", where the c-structure tree is φ-linked to an f-structure with PRED 'flash<SUBJ>', TNS-ASP [MOOD indicative, PERF -, PROG +, TENSE pres] and SUBJ [PRED 'LED', CASE nom, NUM sing, PERS 3, SPEC [SPEC-FORM the, SPEC-TYPE def], ADJUNCT {PRED 'yellow'}].

LFG-DOP: Fragmentation
Root: select any non-frontier non-terminal c-structure node as root, and
- C-structure: delete all except this new root and the subtree it dominates
Frontier: select a (possibly empty) set of non-root non-terminal nodes in the root-created c-structure, and
- C-structure: delete all c-structure subtrees dominated by frontier nodes
Root and Frontier:
- φ-links: delete all φ-links corresponding to deleted c-structure nodes
- F-structure: delete all f-structure units not φ-accessible from the remaining c-structure nodes
- Forms: delete all semantic forms corresponding to deleted terminals
φ-accessibility: an f-structure unit f is φ-accessible from node n iff n is φ-linked to f, i.e. φ(n) = f, or f is contained within φ(n), i.e. there is a chain of attributes leading from φ(n) to f.
Example: the tree for "john loves mary" with its f-structure [PRED 'love<SUBJ,OBJ>', TNS pres, SUBJ [PRED 'john', NUM sg], OBJ [PRED 'mary', NUM sg]].

LFG-DOP: Parsing new input
LFG-DOP composition:
- C-structure: leftmost substitution, subject to category matching
- F-structure: unification, subject to uniqueness, completeness and coherence
Example: composing the fragment for "john" (S → NP VP with its SUBJ f-structure) with the VP fragment for "loves mary" yields the full c-structure and f-structure for "john loves mary".
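The f-structure side of composition can be sketched as recursive unification over nested dicts. This is a simplification: it checks only the uniqueness condition (atomic values must not clash), while completeness and coherence are global conditions left unmodelled, and φ-links and re-entrancy are ignored:

```python
class Clash(Exception):
    """Raised when the uniqueness condition is violated."""

def unify(f, g):
    """Unify two f-structures represented as nested dicts.
    Substructures are unified recursively; atomic values must match."""
    out = dict(f)
    for attr, val in g.items():
        if attr not in out:
            out[attr] = val
        elif isinstance(out[attr], dict) and isinstance(val, dict):
            out[attr] = unify(out[attr], val)
        elif out[attr] != val:
            raise Clash("%s: %r vs %r" % (attr, out[attr], val))
    return out

# Toy f-structures for the "john" and "loves mary" fragments:
subj_frag = {"TNS": "pres", "SUBJ": {"PRED": "john", "NUM": "sg"}}
verb_frag = {"PRED": "love<SUBJ,OBJ>", "TNS": "pres",
             "SUBJ": {"NUM": "sg"},
             "OBJ": {"PRED": "mary", "NUM": "sg"}}
merged = unify(subj_frag, verb_frag)
```

Unifying the two fragments merges the partial SUBJ information; had the verb fragment demanded NUM pl for its subject, `unify` would raise `Clash` instead.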

LFG-DOP: Computing probabilities
Fragment probability: P(e) / Σ_{ex ∈ CS} P(ex), i.e. normalised over the set CS of competing fragments.
Parse probability:
- Multiply fragment probabilities to calculate derivation probability
- Sum derivation probabilities to calculate parse probability
- Normalise over the probabilities of valid parses: P_LFG-DOP(T | T is valid) = P_DOP(T) / Σ_{Tx valid} P_DOP(Tx)

LFG-DOP: Robustness via discard
Discard operation:
- Delete attribute-value pairs from the f-structure while keeping c-structure and φ-links constant
- Restriction: pairs whose values are φ-linked to remaining c-structure nodes are not deleted
Example: composing the fragment for "john" (SUBJ NUM sg) with a "see mary" fragment whose SUBJ is marked NUM pl produces an f-structure that violates uniqueness; discarding the clashing NUM pair (and, in further variants, pairs such as TNS pres) yields well-formed, if less specific, analyses.
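A deliberately simplified sketch of discard: it enumerates the variants obtained by deleting subsets of top-level attribute-value pairs, and ignores both recursion into embedded f-structure units and the φ-link restriction stated above:

```python
from itertools import combinations

def discards(fstruct):
    """Yield every f-structure obtainable by deleting a subset of
    top-level attribute-value pairs, largest (original) first."""
    items = list(fstruct.items())
    for r in range(len(items), -1, -1):
        for kept in combinations(items, r):
            yield dict(kept)

variants = list(discards({"TNS": "pres", "NUM": "pl"}))
```

For a two-pair f-structure this yields four variants, from the original down to the empty f-structure; a robust parser would try them in that order, preferring analyses that discard as little as possible.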

LFG-DOP: What are the challenges?
To define fragmentation operations such that:
- phenomena such as recursion and re-entrancy are handled
- constraints are applied appropriately
- discard is used only to handle ill-formed input
To distinguish between constraining and informative features (translation vs. parsing)
To address the fact that substitution is local but unification is global, i.e.:
- to enforce LFG well-formedness conditions in an accurate and efficient manner
- to sample for the best parse in an accurate and efficient manner
- to define a probability model that doesn't 'leak'

LFG-DOP: Recursion & Re-entrancy
Examples: from "the yellow LED", the fragment (Adj yellow) retains an f-structure in which 'yellow' sits inside the ADJUNCT of a larger SUBJ unit; from the French "jean vient de tomber" ("jean has just fallen"), the fragment (V tomber) retains an f-structure in which the SUBJ is re-entrant between the main verb and the XCOMP.
How can we adequately express the constraints on the composition of fragments such as (Adj yellow) and (V tomber)?

LFG-DOP: Constraint over-specification
Example: the fragment (V flashing) retains an f-structure which requires its subject to carry a definite specifier and an adjunct.
- Is it appropriate to insist that the subject of flashing have an adjunct?
- Is it appropriate to be forced to use discard to allow the subject of flashing to have an indefinite specifier?
Important: we also want to remain language-independent.

LFG-DOP: An alternative fragmentation process
1) Determine a c-structure fragment using root and frontier as for Tree-DOP, but retain the full f-structure given in the original representation.
2) Delete all f-structure units (and the attributes with which they are associated) which are not φ-linked from one or more remaining c-structure nodes, unless that unit is the value of an attribute subcategorised for by a PRED value whose corresponding terminal is dominated by the current fragment root node in the original representation.
   a) Where we have floating f-structure units, also retain the minimal f-structure unit which contains them both, i.e. the unit containing both floating f-structures along with their (nested sequences of) attributes.
3) Delete all semantic forms (including PRED attributes and their values) not associated with one of the remaining c-structure terminals.

LFG-DOP: Constraints vs. information
- To prune attribute-value pairs based on a language-specific algorithm, e.g. English: subject-verb agreement but not object-verb agreement
- To automatically learn which attribute-value pairs should be pruned for a particular dataset (best suited to translation?)
- To do 'soft' pruning: distinguish between constraining features and informative features, and account for the difference during unification (best suited to parsing?)

LFG-DOP: Substitution vs. unification
Substitution is local but unification is global. To be enforced: category matching, uniqueness, coherence and completeness.
- Model M1: enforce category matching during parsing
- Model M2: enforce category matching and uniqueness during parsing
- Model M3: enforce category matching, uniqueness and coherence during parsing
There is no Model M4: completeness can never be checked until a complete parse has been obtained.

LFG-DOP: Sampling
Consider sampling a fragment f_x with root VP for chart span [i][j], where f_x has substitution sites [i][k],V and [i+k][j-k],NP. The exact probability of sampling f_x at [i][j],VP is P_DOP(f_x), multiplied by the sampling probability mass available at each of its substitution sites [i][k],V and [i+k][j-k],NP, and divided by the sampling probability mass available at [i][j],VP.
Problem for computing the exact probability of sampling f_x at [i][j],VP: we cannot know the sampling probability mass available at substitution site [i+k][j-k],NP until it is the leftmost substitution site, unless we stick with Model M1.
Problem for establishing when enough samples have been taken: we cannot know how many valid parses there are until all constraints have been resolved.

LFG-DOP: 'Leaked' probability mass
Example: composing NP → john (NUM sg) with an S fragment whose SUBJ is already marked NUM pl yields a derivation of "john left" which violates the uniqueness condition.
- This derivation is thrown out because it does not satisfy the uniqueness condition
- Its probability is thrown out with it: 'leaked' probability mass
- Normalisation camouflages the problem but does not solve it
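The leak can be seen with toy numbers (mine, not the talk's): derivation probabilities sum to 1 before filtering, the invalid derivation's mass simply disappears, and renormalisation rescales what remains without recovering it:

```python
# Four derivations with probabilities and a validity flag; d3 is the
# derivation thrown out for violating uniqueness (e.g. NUM sg vs pl).
derivations = {
    "d1": (0.40, True),
    "d2": (0.30, True),
    "d3": (0.20, False),
    "d4": (0.10, True),
}

total = sum(p for p, _ in derivations.values())            # 1.0 by construction
valid_mass = sum(p for p, ok in derivations.values() if ok)
leaked = total - valid_mass                                # mass thrown away

# Normalisation redistributes the surviving mass over valid derivations,
# camouflaging the leak without explaining where the 0.2 went:
renormalised = {d: p / valid_mass
                for d, (p, ok) in derivations.items() if ok}
```

After filtering, 0.2 of the probability mass has leaked; renormalisation makes the valid derivations sum to 1 again, but the model still assigned real probability to an event that can never occur.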

Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Questions?