Dependency Parsing

Some slides are based on:
- PPT presentation on dependency parsing by Prashanth Mannem
- Seven Lectures on Statistical Parsing by Christopher Manning

Constituency parsing
- Breaks a sentence into constituents (phrases), which are in turn broken into smaller constituents
- Describes phrase structure and clause structure (NP, PP, VP, etc.)
- Structures are often recursive

[Figure: constituency tree with S, NP, and VP nodes over the example "... mom is an amazing show"]

Dependency parsing
- Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies
- Interested in grammatical relations between individual words (governing and dependent words)
- Does not propose a recursive structure; rather, a network of relations
- These relations can also have labels

Dependency vs. constituency
Dependency structures explicitly represent:
- head-dependent relations (directed arcs)
- functional categories (arc labels)
- possibly some structural categories (parts of speech)
Constituency structures explicitly represent:
- phrases (non-terminal nodes)
- structural categories (non-terminal labels)
- possibly some functional categories (grammatical functions)

Dependency vs. constituency
- A dependency grammar has a notion of a head; officially, CFGs don't
- But modern linguistic theory and all modern statistical parsers (Charniak, Collins, ...) do, via hand-written phrasal "head rules":
  - The head of a Noun Phrase is a noun/number/...
  - The head of a Verb Phrase is a verb/modal/...
(Based on a slide by Chris Manning)

Dependency vs. constituency
- The head rules can be used to extract a dependency parse from a CFG parse (follow the heads); see the sketch below
- A phrase-structure tree can be obtained from a dependency tree, but the dependents are flat
(Based on a slide by Chris Manning)
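To make the head-percolation idea concrete, here is a minimal sketch (not from the original slides) that extracts unlabeled dependencies from a constituency tree using a toy head-rule table. The `Tree` class and the `HEAD_RULES` entries are illustrative assumptions, not any real parser's API.

```python
# Sketch: extracting dependencies from a constituency tree via
# hand-written head rules. Toy data structures, for illustration only.

HEAD_RULES = {          # label -> child labels to try as head, in priority order
    "S":  ["VP", "NP"],
    "VP": ["V", "VP"],
    "NP": ["N", "NP"],
}

class Tree:
    def __init__(self, label, children=None, word=None):
        self.label = label               # e.g. "S", "NP", or a POS tag
        self.children = children or []   # empty for leaf (word) nodes
        self.word = word                 # the token, for leaves

def lexical_head(node):
    """Return the head word of a subtree by following the head rules."""
    if node.word is not None:
        return node.word
    for cand in HEAD_RULES.get(node.label, []):
        for child in node.children:
            if child.label == cand:
                return lexical_head(child)
    return lexical_head(node.children[0])    # fallback: leftmost child

def extract_dependencies(node, deps=None):
    """Each non-head child's head word depends on the parent's head word."""
    if deps is None:
        deps = []
    if node.word is not None:
        return deps
    head = lexical_head(node)
    for child in node.children:
        child_head = lexical_head(child)
        if child_head != head:
            deps.append((head, child_head))  # (governor, dependent)
        extract_dependencies(child, deps)
    return deps
```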

Definition: dependency graph
- An input word sequence $w_1 \dots w_n$
- Dependency graph $G = (V, E)$ where:
  - $V$ is the set of nodes, i.e., word tokens in the input sequence
  - $E$ is the set of unlabeled tree edges $(i, j)$, with $i, j \in V$
  - $(i, j)$ indicates an edge from $i$ (parent, head, governor) to $j$ (child, dependent)

Definition: dependency graph
A dependency graph is well-formed iff:
- Single head: each word has exactly one head
- Acyclic: the graph contains no cycles
- Connected: the graph is a single tree containing all the words in the sentence
- Projective: if word A depends on word B, then all words between A and B are also subordinate to B (i.e., dominated by B)
A sketch of these checks in code follows.
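As a concrete illustration (my sketch, not from the slides), the conditions can be checked directly on a head array, where `head[j]` gives the head of token j and 0 denotes the artificial root:

```python
# Sketch: checking well-formedness of a dependency graph given as a
# head array. Tokens are 1..n; head[j] == 0 means j attaches to ROOT.

def is_well_formed(head):
    n = len(head) - 1      # head[0] is unused (ROOT has no head)

    # Single head is guaranteed by the array encoding (one head per token),
    # so we only check acyclicity, connectedness, and projectivity.

    # Acyclic + connected: every token must reach ROOT (0) without revisits.
    for j in range(1, n + 1):
        seen, cur = set(), j
        while cur != 0:
            if cur in seen:
                return False           # cycle
            seen.add(cur)
            cur = head[cur]

    def dominated_by(tok, anc):
        """Is anc an ancestor of tok (or tok itself)?"""
        cur = tok
        while True:
            if cur == anc:
                return True
            if cur == 0:
                return False
            cur = head[cur]

    # Projective: for every arc (i, j), every token strictly between
    # i and j must be dominated by the head i.
    for j in range(1, n + 1):
        i = head[j]
        lo, hi = min(i, j), max(i, j)
        for k in range(lo + 1, hi):
            if not dominated_by(k, i):
                return False
    return True

# "Ram saw a dog": saw is the root; Ram and dog attach to saw,
# a attaches to dog.
print(is_well_formed([0, 2, 0, 4, 2]))  # True
```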

Non-projective dependencies
Example: "Ram saw a dog yesterday which was a Yorkshire Terrier" (the relative clause "which was a Yorkshire Terrier" depends on "dog", but "yesterday" intervenes and is not dominated by "dog", so the arc crosses)

Parsing algorithms
Dependency parsers can be broadly categorized into:
- Grammar-driven approaches: parsing done using grammars
- Data-driven approaches: parsing by training on annotated/unannotated data

Unlabeled graphs
Klein (2004) showed that labeling is relatively easy and that the difficulty of parsing lies in creating the bracketing. Therefore some parsers run in two steps: 1) bracketing; 2) labeling.

Traditions
- Dynamic programming: e.g., Eisner (1996), McDonald (2006)
- Deterministic search: e.g., Covington (2001), Yamada and Matsumoto (2003), Nivre (2006)
- Constraint satisfaction: e.g., Maruyama, Foth et al.

Data-driven
Two main approaches:
- Global, exhaustive, graph-based parsing
- Local, greedy, transition-based parsing

Graph-based parsing
Assume there is a scoring function $s(i, j)$ that scores an arc from word $i$ to word $j$. The score of a graph $G$ is the sum of its arc scores:
$s(G) = \sum_{(i,j) \in G} s(i, j)$
Parsing for an input string $x$ is finding the highest-scoring graph over all dependency graphs of $x$:
$G^* = \arg\max_{G \in \mathcal{G}(x)} s(G)$
where $\mathcal{G}(x)$ is the set of all dependency graphs for $x$.

MST algorithm (McDonald, 2006)
Scores are based on local features, independent of the other dependencies. Features can include:
- Head and dependent word and POS, separately
- Head and dependent word and POS bigram features
- Words between the head and the dependent
- Length and direction of the dependency
A sketch of such a feature extractor follows.
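For illustration (my sketch, not McDonald's actual feature templates), an arc's features might be collected like this, given parallel token and POS lists:

```python
# Sketch: first-order arc features in the spirit of MST parsing.
# `words` and `tags` are parallel lists; i is the head index, j the dependent.

def arc_features(words, tags, i, j):
    feats = [
        f"head_word={words[i]}",
        f"head_pos={tags[i]}",
        f"dep_word={words[j]}",
        f"dep_pos={tags[j]}",
        # word/POS bigram of head and dependent
        f"pair={words[i]}_{tags[i]}_{words[j]}_{tags[j]}",
        # length and direction of the dependency
        f"dist={abs(i - j)}",
        f"dir={'right' if i < j else 'left'}",
    ]
    # POS tags of the words between head and dependent
    lo, hi = min(i, j), max(i, j)
    for k in range(lo + 1, hi):
        feats.append(f"between_pos={tags[k]}")
    return feats
```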

MST algorithm (McDonald, 2006)
- Parsing can be formulated as a maximum spanning tree problem
- Use the Chu-Liu-Edmonds (CLE) algorithm for the MST (runs in $O(n^2)$ with an efficient implementation, and naturally handles non-projective arcs)
- Uses online learning for determining the weight vector w
See the sketch below.
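As a sketch of the formulation (assuming the `networkx` library, which ships an implementation of Edmonds' algorithm; this is not the implementation McDonald used), finding the best non-projective parse reduces to a maximum spanning arborescence over a fully connected directed graph of scored arcs:

```python
# Sketch: non-projective parsing as a maximum spanning arborescence,
# using networkx's Edmonds implementation.
import networkx as nx

def parse_mst(words, score):
    """words: tokens 1..n (node 0 is ROOT); score(i, j) scores arc i -> j."""
    G = nx.DiGraph()
    n = len(words)
    for j in range(1, n + 1):        # every word needs a head
        for i in range(0, n + 1):    # candidate heads, including ROOT
            if i != j:
                G.add_edge(i, j, weight=score(i, j))
    # ROOT has no incoming edges, so the arborescence is rooted at 0.
    tree = nx.maximum_spanning_arborescence(G)
    return sorted(tree.edges())      # list of (head, dependent) pairs

# Toy scorer: prefer short arcs; ROOT strongly prefers position 2 ("saw").
toy = lambda i, j: (3.0 if (i, j) == (0, 2) else 0.0) - abs(i - j) * 0.1
print(parse_mst(["Ram", "saw", "a", "dog"], toy))
```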

Transition-based parsing
A transition system for dependency parsing defines:
- a set C of parser configurations, each of which defines a (partially built) dependency graph G
- a set T of transitions, each a function t : C → C
- for every sentence x = w0, w1, . . . , wn:
  - a unique initial configuration cx
  - a set Qx of terminal configurations
A minimal data-structure sketch follows.
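In code, a configuration is typically a stack, a buffer, and the set of arcs built so far. This minimal sketch (my illustration, not any specific parser's API) is also reused by the arc-eager example later:

```python
# Sketch: a parser configuration for transition-based dependency parsing.
from dataclasses import dataclass, field

@dataclass
class Config:
    stack: list = field(default_factory=list)    # partially processed tokens
    buffer: list = field(default_factory=list)   # remaining input tokens
    arcs: set = field(default_factory=set)       # (head, dependent) pairs

def initial_config(n):
    """Initial configuration cx: ROOT (0) on the stack, words 1..n buffered."""
    return Config(stack=[0], buffer=list(range(1, n + 1)))

def is_terminal(c):
    """Terminal configurations Qx: the whole input has been consumed."""
    return not c.buffer
```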

Transition sequence
A transition sequence $C_{x,m} = (c_x, c_1, \dots, c_m)$ for a sentence $x$ is a sequence of configurations such that $c_m \in Q_x$ and, for every $c_i$ ($i > 0$), there is a transition $t \in T$ such that $c_i = t(c_{i-1})$. The graph defined by $c_m$ is the dependency graph of $x$.

Transition scoring function
The score of a transition $t$ in a configuration $c$, written $s(c, t)$, represents the likelihood of taking transition $t$ out of configuration $c$. Parsing is finding the optimal transition sequence, $C^* = \arg\max_{C_{x,m}} \sum_{i=1}^{m} s(c_{i-1}, t_i)$, in practice approximated greedily, one best transition at a time (see the loop below).
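Greedy transition-based parsing then reduces to a simple loop. This sketch reuses the `Config` helpers above; `transitions` and `score` are assumed stand-ins for the system's transition set and a trained classifier:

```python
# Sketch: greedy transition-based parsing. At each step, apply the
# highest-scoring legal transition until a terminal configuration.

def greedy_parse(n, transitions, score):
    """transitions: dict name -> (is_legal(c), apply(c)); score(c, name) -> float."""
    c = initial_config(n)
    while not is_terminal(c):
        legal = [name for name, (ok, _) in transitions.items() if ok(c)]
        best = max(legal, key=lambda name: score(c, name))
        c = transitions[best][1](c)   # apply the chosen transition
    return c.arcs
```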

Yamada and Matsumoto (2003)
- A transition-based (shift-reduce) parser
- Considers two adjacent words at a time
- Runs in iterations, continuing as long as new dependencies are created
- In every iteration, considers 3 different actions and chooses one using an SVM (or another discriminative learning technique)
- Time complexity: $O(n^2)$
- Accuracy was shown to be close to that of state-of-the-art algorithms (e.g., Eisner's)

Y&M (2003) actions:
- Shift
- Left
- Right

Y&M (2003) learning
Features (lemma, POS tag) are collected from the context.

Stack-based parsing
- Introduces a stack and a buffer
- The buffer is a queue of all the input words (left to right)
- The stack begins empty; words are pushed onto the stack by the defined actions
- Reduces Y&M's complexity to linear time

Two stack-based parsers
Nivre's (2003, 2006) arc-standard transitions (i = token on top of the stack, j = next token in the buffer):
- Left-arc: add an arc from j to i and pop i (precondition: i doesn't have a head already)
- Right-arc: add an arc from i to j (precondition: j doesn't have a head already)
- Shift: push j onto the stack
[Figure: stack/buffer diagrams for each transition]

Two stack-based parsers
Nivre's (2003, 2006) arc-eager transitions (i = token on top of the stack, j = next token in the buffer):
- Left-arc: add an arc from j to i and pop i (precondition: i doesn't have a head already)
- Right-arc: add an arc from i to j and push j onto the stack
- Shift: push j onto the stack
- Reduce: pop i (precondition: i already has a head)
[Figure: stack/buffer diagrams for each transition]

Example (arc-eager)
Sentence: _ROOT_ Red figures on the screen indicated falling stocks
[Figures: stack (S) and queue (Q) contents after each step]
Transition sequence: Shift, Left-arc, Shift, Right-arc, Shift, Left-arc, Right-arc, Reduce, Reduce, Left-arc, Right-arc, Shift, Left-arc, Right-arc, Reduce, Reduce
(Borrowed from Dependency Parsing, P. Mannem)
A runnable replay of this sequence appears below.
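To make the example concrete, here is a sketch of the four arc-eager transitions replaying that exact sequence (my implementation for illustration; token indices and the resulting arcs follow the example sentence):

```python
# Sketch: arc-eager transitions replaying the example above.
# The stack holds indices (0 = _ROOT_); the buffer is the remaining input.

def parse_arc_eager(n, actions):
    stack, buf, arcs = [0], list(range(1, n + 1)), set()
    for act in actions:
        if act == "shift":
            stack.append(buf.pop(0))
        elif act == "left":                  # arc: buffer front -> stack top
            arcs.add((buf[0], stack[-1]))
            stack.pop()
        elif act == "right":                 # arc: stack top -> buffer front
            arcs.add((stack[-1], buf[0]))
            stack.append(buf.pop(0))
        elif act == "reduce":                # stack top already has a head
            stack.pop()
    return arcs

words = "Red figures on the screen indicated falling stocks".split()
seq = ["shift", "left", "shift", "right", "shift", "left", "right",
       "reduce", "reduce", "left", "right", "shift", "left", "right",
       "reduce", "reduce"]
for h, d in sorted(parse_arc_eager(len(words), seq)):
    print(("_ROOT_" if h == 0 else words[h - 1]), "->", words[d - 1])
```

Replaying the sequence yields the arcs figures→Red, indicated→figures, on→screen, screen→the, figures→on, _ROOT_→indicated, stocks→falling, and indicated→stocks, matching the parse built up across the slides.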

Graph (MSTParser) vs. transitions (MaltParser)
[Chart: accuracy on different languages]
From: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre (2007)

Graph (MSTParser) vs. transitions (MaltParser)
[Chart: sentence length vs. accuracy]
From: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre (2007)

Graph (MSTParser) vs. transitions (MaltParser)
[Chart: dependency length vs. precision]
From: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre (2007)

Known parsers
- Stanford parser (constituency + dependency)
- MaltParser (dependency)
- MSTParser (dependency)
- Hebrew: Yoav Goldberg's parser (http://www.cs.bgu.ac.il/~yoavg/software/hebparsers/hebdepparser/)