1
Dependency Parsing
Some slides are based on:
- PPT presentation on dependency parsing by Prashanth Mannem
- Seven Lectures on Statistical Parsing by Christopher Manning
2
Constituency parsing
- Breaks a sentence into constituents (phrases), which are then broken into smaller constituents
- Describes phrase structure and clause structure (NP, PP, VP, etc.)
- Structures are often recursive
3
[Figure: constituency tree for the sentence "mom is an amazing show", with non-terminal nodes S, NP, VP, VP, NP]
4
Dependency parsing
- Syntactic structure consists of lexical items linked by binary asymmetric relations called dependencies
- The focus is on grammatical relations between individual words (governing and dependent words)
- Does not propose a recursive structure, but rather a network of relations
- These relations can also have labels
6
Dependency vs. Constituency
Dependency structures explicitly represent:
- Head-dependent relations (directed arcs)
- Functional categories (arc labels)
- Possibly some structural categories (parts of speech)
Constituency structures explicitly represent:
- Phrases (non-terminal nodes)
- Structural categories (non-terminal labels)
- Possibly some functional categories (grammatical functions)
7
Dependency vs. Constituency
- A dependency grammar has a notion of a head
- Officially, CFGs don’t
- But modern linguistic theory and all modern statistical parsers (Charniak, Collins, …) do, via hand-written phrasal “head rules”:
  - The head of a Noun Phrase is a noun/number/…
  - The head of a Verb Phrase is a verb/modal/…
Based on a slide by Chris Manning
8
Dependency vs. Constituency
- The head rules can be used to extract a dependency parse from a CFG parse (follow the heads)
- A phrase structure tree can be obtained from a dependency tree, but it is flat: all dependents of a head attach at the same level
Based on a slide by Chris Manning
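To make "follow the heads" concrete, here is a minimal sketch. The tree encoding, the POS tags, and the `HEAD_RULES` table are all illustrative assumptions (real head-rule tables such as those used by Collins are much larger); the sentence is the one from the constituency example earlier in the deck.

```python
# Toy constituency tree: (label, child, ...) for phrases, (POS, word) for leaves.
# The bracketing and POS tags are assumptions made for this illustration.
TREE = ("S",
        ("NP", ("NN", "mom")),
        ("VP", ("VBZ", "is"),
               ("NP", ("DT", "an"), ("JJ", "amazing"), ("NN", "show"))))

# Hypothetical head rules: for each phrase label, preferred head-child labels in priority order.
HEAD_RULES = {
    "S": ["VP"],
    "VP": ["VBZ", "VB", "MD", "VP"],
    "NP": ["NN", "NNS", "CD", "NP"],
}

def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)

def head_child(node):
    """Pick the head child of a phrase according to HEAD_RULES."""
    children = node[1:]
    for wanted in HEAD_RULES.get(node[0], []):
        for child in reversed(children):  # rightmost matching child wins
            if child[0] == wanted:
                return child
    return children[-1]  # fallback when no rule matches: rightmost child

def extract(node, arcs):
    """Return the lexical head of `node`, appending (head_word, dependent_word) pairs to `arcs`."""
    if is_leaf(node):
        return node[1]
    hc = head_child(node)
    heads = [(child is hc, extract(child, arcs)) for child in node[1:]]
    head_word = next(w for is_head, w in heads if is_head)
    for is_head, w in heads:
        if not is_head:  # every non-head child's lexical head depends on the phrase head
            arcs.append((head_word, w))
    return head_word

arcs = []
root = extract(TREE, arcs)
print(root, arcs)
```

Words are used as node identifiers here, which only works because no word repeats in the toy sentence; a fuller version would use token positions.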
9
Definition: dependency graph
- An input word sequence $w_1 \dots w_n$
- Dependency graph $G = (V, E)$, where
  - $V$ is the set of nodes, i.e. word tokens in the input sequence
  - $E$ is the set of unlabeled tree edges $(i, j)$, with $i, j \in V$
  - $(i, j)$ indicates an edge from $i$ (parent, head, governor) to $j$ (child, dependent)
10
Definition: dependency graph
A dependency graph is well-formed iff it is:
- Single-headed: each word has only one head
- Acyclic: the graph contains no cycles
- Connected: the graph is a single tree containing all the words in the sentence
- Projective: if word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B)
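The four conditions can be checked mechanically. The sketch below assumes words are numbered 1..n and that an artificial root node 0 heads the top word (the root node is an assumption of this example, not part of the definition above); `check_dependency_graph` is a hypothetical helper name.

```python
def check_dependency_graph(n, edges):
    """Check the well-formedness conditions for words 1..n with artificial root 0.

    `edges` is a list of (head, dependent) pairs. Returns "ok" or the first
    violated condition as a string.
    """
    heads = {}
    for h, d in edges:
        if d in heads:  # single head: a word may appear as dependent only once
            return "multiple heads"
        heads[d] = h

    # Connected: every word must be attached somewhere.
    if set(heads) != set(range(1, n + 1)):
        return "not connected"

    # Acyclic: following heads upward from any word must reach the root.
    for d in range(1, n + 1):
        seen, node = set(), d
        while node != 0:
            if node in seen:
                return "cycle"
            seen.add(node)
            node = heads[node]

    def dominated_by(b, a):
        """True if a is b itself or a descendant of b."""
        while a != b and a != 0:
            a = heads[a]
        return a == b

    # Projective: every word strictly between head and dependent is dominated by the head.
    for d, h in heads.items():
        lo, hi = sorted((h, d))
        for k in range(lo + 1, hi):
            if not dominated_by(h, k):
                return "non-projective"
    return "ok"

print(check_dependency_graph(3, [(0, 2), (2, 1), (2, 3)]))          # ok
print(check_dependency_graph(4, [(0, 2), (3, 1), (2, 3), (2, 4)]))  # non-projective
```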
11
Non-projective dependencies
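Ram saw a dog yesterday which was a Yorkshire Terrier
The relative clause "which was a Yorkshire Terrier" modifies "dog", so its arc crosses over "yesterday", which depends on "saw".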
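With an artificial root at position 0, non-projectivity shows up as a pair of crossing arcs. The sketch below lists such pairs for this sentence; the head assignments in `HEADS` are an assumed analysis written for this illustration, not taken from the slide.

```python
WORDS = ["_ROOT_", "Ram", "saw", "a", "dog", "yesterday",
         "which", "was", "a", "Yorkshire", "Terrier"]
# HEADS[d] is the position of the head of word d (assumed analysis; None for the root itself).
HEADS = [None, 2, 0, 4, 2, 2, 7, 4, 10, 10, 7]

def crossing_arcs(heads):
    """Return pairs of (head, dependent) arcs whose spans cross each other."""
    spans = [(min(h, d), max(h, d), (h, d))
             for d, h in enumerate(heads) if h is not None]
    found = []
    for l1, r1, a1 in spans:
        for l2, r2, a2 in spans:
            if l1 < l2 < r1 < r2:  # second arc starts inside the first and ends outside it
                found.append((a1, a2))
    return found

for (h1, d1), (h2, d2) in crossing_arcs(HEADS):
    print(f"{WORDS[h1]} -> {WORDS[d1]} crosses {WORDS[h2]} -> {WORDS[d2]}")
```

Under the assumed analysis, the only crossing pair is saw -> yesterday and dog -> was.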
12
Parsing algorithms
Dependency-based parsers can be broadly categorized into:
- Grammar-driven approaches: parsing is done using grammars
- Data-driven approaches: parsing by training on annotated/un-annotated data
13
Unlabeled graphs
- Dan Klein recently showed that labeling is relatively easy and that the difficulty of parsing lies in creating the bracketing (Klein, 2004)
- Therefore some parsers run in two steps: 1) bracketing; 2) labeling
14
Traditions
- Dynamic programming, e.g., Eisner (1996), McDonald (2006)
- Deterministic search, e.g., Covington (2001), Yamada and Matsumoto, Nivre (2006)
- Constraint satisfaction, e.g., Maruyama, Foth et al.
15
Data driven
Two main approaches:
- Global, exhaustive, graph-based parsing
- Local, greedy, transition-based parsing
16
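Graph-based parsing
- Assume there is a scoring function over edges: $s : V \times V \to \mathbb{R}$
- The score of a graph is $s(G) = \sum_{(i, j) \in E} s(i, j)$
- Parsing for input string $x$ is $G^{*} = \arg\max_{G \in D(x)} s(G)$, where $D(x)$ is the set of all dependency graphs for $x$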
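If the score of a graph is taken to be the sum of its per-edge scores (an arc-factored assumption), parsing is an arg max over all candidate trees. The sketch below does that search by brute force for a three-word sentence. The score table `S` is made up for illustration, and real parsers replace the exhaustive enumeration with an efficient algorithm.

```python
from itertools import product

def is_tree(heads):
    """heads[d-1] is the head of word d; node 0 is an artificial root."""
    for d in range(1, len(heads) + 1):
        seen, node = set(), d
        while node != 0:
            if node in seen:  # revisiting a node means a cycle (including self-loops)
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def graph_score(heads, s):
    """Score of a graph = sum of the scores of its edges (head, dependent)."""
    return sum(s[(h, d)] for d, h in enumerate(heads, start=1))

def parse(n, s):
    """Exhaustive arg max over all dependency trees for n words (exponential, toy use only)."""
    candidates = (hs for hs in product(range(n + 1), repeat=n) if is_tree(hs))
    return max(candidates, key=lambda hs: graph_score(hs, s))

# Made-up edge scores for "John(1) saw(2) Mary(3)", with 0 as the root.
S = {(0, 1): 9, (0, 2): 10, (0, 3): 9,
     (1, 2): 20, (2, 1): 30, (2, 3): 30,
     (3, 2): 0, (1, 3): 3, (3, 1): 11}

best = parse(3, S)
print(best, graph_score(best, S))
```

With these scores the best tree attaches "saw" to the root and both "John" and "Mary" to "saw", for a total of 70.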
17
MST algorithm (McDonald, 2006)
- Scores are based on features, independent of other dependencies
- Features can be:
  - Head and dependent word and POS separately
  - Head and dependent word and POS bigram features
  - Words between head and dependent
  - Length and direction of dependency
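As a rough illustration of these feature types, the sketch below builds string-valued features for a single arc and scores them with a sparse weight vector. The feature names, the example tags, and the weights are all hypothetical; MSTParser's actual feature templates are richer.

```python
def arc_features(words, tags, h, d):
    """Features for an arc from head position h to dependent position d."""
    direction = "R" if h < d else "L"
    feats = [
        f"hw={words[h]}", f"hp={tags[h]}",   # head word and POS, separately
        f"dw={words[d]}", f"dp={tags[d]}",   # dependent word and POS, separately
        f"hw,dw={words[h]},{words[d]}",      # word bigram
        f"hp,dp={tags[h]},{tags[d]}",        # POS bigram
        f"len={abs(h - d)},dir={direction}", # length and direction of the dependency
    ]
    lo, hi = sorted((h, d))
    feats += [f"between={words[k]}" for k in range(lo + 1, hi)]  # words in between
    return feats

def arc_score(weights, feats):
    """Dot product of a sparse weight vector with binary features."""
    return sum(weights.get(f, 0.0) for f in feats)

WORDS = ["_ROOT_", "John", "saw", "Mary"]
TAGS = ["ROOT", "NNP", "VBD", "NNP"]
weights = {"hp,dp=VBD,NNP": 2.0, "len=1,dir=L": 0.5}  # made-up weights

print(arc_features(WORDS, TAGS, 0, 3))
print(arc_score(weights, arc_features(WORDS, TAGS, 2, 1)))
```

Because each arc is scored from its own features only, the score of one dependency does not depend on which other dependencies are in the tree.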
18
MST algorithm (McDonald, 2006)
- Parsing can be formulated as a maximum spanning tree problem
- Use the Chu-Liu-Edmonds (CLE) algorithm for MST (runs in $O(n^2)$, considers non-projective arcs)
- Uses online learning for determining the weight vector $w$
19
Transition-based parsing
A transition system for dependency parsing defines:
- a set $C$ of parser configurations, each of which defines a (partially built) dependency graph $G$
- a set $T$ of transitions, each a function $t : C \to C$
- for every sentence $x = w_0, w_1, \dots, w_n$:
  - a unique initial configuration $c_x$
  - a set $Q_x$ of terminal configurations
20
Transition sequence
- A transition sequence $C_{x,m} = (c_x, c_1, \dots, c_m)$ for a sentence $x$ is a sequence of configurations such that $c_m \in Q_x$ and, for every $c_i$ ($c_i \neq c_x$), there is a transition $t \in T$ such that $c_i = t(c_{i-1})$
- The graph defined by $c_m$ is the dependency graph of $x$
21
Transition scoring function
- The score of a transition $t$ in a configuration $c$, $s(c, t)$, represents the likelihood of taking transition $t$ out of configuration $c$
- Parsing is finding the optimal transition sequence
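One common way to make "optimal" concrete, stated here as an assumption rather than as the formula from the original slide, is to score a sequence by the sum of its transition scores. Deterministic parsers then approximate the search greedily, picking the best-scoring transition at each step. The index $t_i$ for the transition taken at step $i$, and $c_0$ as another name for the initial configuration $c_x$, are notation introduced for this sketch.

```latex
% Sequence-level objective (assumed additive scoring):
C^{*}_{x,m} = \operatorname*{arg\,max}_{C_{x,m}} \sum_{i=1}^{m} s(c_{i-1}, t_i),
\qquad c_0 = c_x, \quad c_i = t_i(c_{i-1})

% Greedy approximation: choose the locally best transition at every step
t_i = \operatorname*{arg\,max}_{t \in T} s(c_{i-1}, t)
```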
22
Yamada and Matsumoto (2003)
- A transition-based (shift-reduce) parser
- Considers two adjacent words
- Runs in iterations, continuing as long as new dependencies are created
- In every iteration, considers 3 different actions and chooses one using an SVM (or another discriminative learning technique)
- Time complexity: $O(n^2)$
- Accuracy was shown to be close to that of the state-of-the-art algorithms (e.g., Eisner’s)
23
Y&M (2003) Actions
- Shift
- Left
- Right
24
Y&M (2003) Learning
Features (lemma, POS tag) are collected from the context
25
Stack-based parsing
Introducing a stack and a buffer:
- The buffer is a queue of all input words (left to right)
- The stack begins empty; words are pushed onto the stack by the defined actions
- Reduces the Y&M complexity to linear time
26
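2 stack-based parsers
Nivre’s (2003, 2006) arc-standard
[Figure: stack and buffer before and after each transition, with i the top of the stack and j the front of the buffer; the arc-adding transitions carry the preconditions "i doesn’t have a head already" and "j doesn’t have a head already"]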
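A minimal sketch of one common formulation of arc-standard, with i the stack top and j the buffer front. The exact transition definitions below are an assumption and may differ in detail from the slide's lost figure. Left-arc adds j -> i and pops i; Right-arc adds i -> j, removes j, and moves i back to the front of the buffer; Shift pushes j. Node 0 is an artificial root, and the five-word example sentence and its transition sequence were written for this illustration.

```python
def arc_standard(n, transitions):
    """Replay `transitions` over words 1..n; returns the set of (head, dependent) arcs."""
    stack, buffer, arcs, has_head = [0], list(range(1, n + 1)), set(), set()
    for t in transitions:
        if t == "shift":
            stack.append(buffer.pop(0))
        elif t == "left-arc":
            i, j = stack[-1], buffer[0]
            if i == 0 or i in has_head:  # precondition: i doesn't have a head already
                raise ValueError("left-arc not allowed")
            arcs.add((j, i))
            has_head.add(i)
            stack.pop()
        elif t == "right-arc":
            i, j = stack[-1], buffer[0]
            if j in has_head:            # precondition: j doesn't have a head already
                raise ValueError("right-arc not allowed")
            arcs.add((i, j))
            has_head.add(j)
            stack.pop()
            buffer[0] = i                # j is consumed; i returns to the buffer front
        else:
            raise ValueError(f"unknown transition: {t}")
    return arcs

# "Red(1) figures(2) indicated(3) falling(4) stocks(5)"
SEQ = ["shift", "left-arc", "shift", "left-arc", "shift", "shift",
       "left-arc", "right-arc", "right-arc", "shift"]
print(sorted(arc_standard(5, SEQ)))
```

Note that a right dependent can only be attached once it has collected all of its own dependents, which is why "stocks" picks up "falling" before being attached to "indicated".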
27
2 stack-based parsers
Nivre’s (2003, 2006) arc-eager
28
Example (arc eager)
Borrowed from Dependency Parsing (P. Mannem)
Sentence: _ROOT_ Red figures on the screen indicated falling stocks
[Figures: contents of the stack S and the queue Q after each transition]
Slides 28 to 44 apply one transition each:
- Slide 28: initial configuration
- Slide 29: Shift
- Slide 30: Left-arc
- Slide 31: Shift
- Slide 32: Right-arc
- Slide 33: Shift
- Slide 34: Left-arc
- Slide 35: Right-arc
- Slide 36: Reduce
- Slide 37: Reduce
- Slide 38: Left-arc
- Slide 39: Right-arc
- Slide 40: Shift
- Slide 41: Left-arc
- Slide 42: Right-arc
- Slide 43: Reduce
- Slide 44: Reduce
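The transition labels of this example can be replayed with a small arc-eager sketch. The transition definitions below follow the usual textbook formulation and are an assumption about what the slide figures showed: Left-arc makes the queue front the head of the stack top and pops it; Right-arc makes the stack top the head of the queue front and pushes it; Reduce pops a stack top that already has a head.

```python
WORDS = ["_ROOT_", "Red", "figures", "on", "the", "screen",
         "indicated", "falling", "stocks"]

# Transition labels in the order they appear on the example slides.
SEQUENCE = ["Shift", "Left-arc", "Shift", "Right-arc", "Shift", "Left-arc",
            "Right-arc", "Reduce", "Reduce", "Left-arc", "Right-arc", "Shift",
            "Left-arc", "Right-arc", "Reduce", "Reduce"]

def arc_eager(n_words, transitions):
    """Replay transitions; returns (heads, stack, queue), where heads maps dependent -> head."""
    stack, queue, heads = [0], list(range(1, n_words)), {}
    for t in transitions:
        if t == "Shift":
            stack.append(queue.pop(0))
        elif t == "Left-arc":
            i, j = stack[-1], queue[0]
            if i == 0 or i in heads:      # the root and already-attached words cannot get a head
                raise ValueError("Left-arc not allowed")
            heads[i] = j
            stack.pop()
        elif t == "Right-arc":
            i, j = stack[-1], queue[0]
            if j in heads:
                raise ValueError("Right-arc not allowed")
            heads[j] = i
            stack.append(queue.pop(0))    # j stays available for its own right dependents
        elif t == "Reduce":
            if stack[-1] not in heads:    # only words that already have a head may be popped
                raise ValueError("Reduce not allowed")
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {t}")
    return heads, stack, queue

heads, stack, queue = arc_eager(len(WORDS), SEQUENCE)
for d in sorted(heads):
    print(f"{WORDS[heads[d]]} -> {WORDS[d]}")
```

After the last Reduce only _ROOT_ remains on the stack and the queue is empty, with "indicated" attached to _ROOT_ and "figures" and "stocks" attached to "indicated".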
45
Graph (MSTParser) vs. Transitions (MaltParser)
[Figure: accuracy on different languages]
Source: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007
46
Graph (MSTParser) vs. Transitions (MaltParser)
[Figure: sentence length vs. accuracy]
Source: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007
47
Graph (MSTParser) vs. Transitions (MaltParser)
[Figure: dependency length vs. precision]
Source: Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007
48
Known Parsers
- Stanford (constituency + dependency)
- MaltParser (dependency)
- MSTParser (dependency)
- Hebrew: Yoav Goldberg’s parser (