Dependency Parsing: Parsing Algorithms
Peng Huang (peng.huangp@alibaba-inc.com)
Outline
– Introduction: Phrase Structure Grammar, Dependency Grammar, Comparison and Conversion
– Dependency Parsing: Formal definition
– Parsing Algorithms: Introduction, Dynamic programming, Constraint satisfaction, Deterministic search
Introduction
Syntactic parsing: finding the correct syntactic structure of a sentence in a given formalism/grammar.
Two formalisms: Dependency Grammar (DG) and Phrase Structure Grammar (PSG).
Phrase Structure Grammar (PSG)
– Breaks the sentence into constituents (phrases), which are in turn broken into smaller constituents
– Describes phrase structure and clause structure, e.g. NP, PP, VP
– Structures are often recursive: the clever tall blue-eyed old … man
Tree Structure (example phrase-structure tree)
Dependency Grammar
– Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies
– Interested in grammatical relations between individual words (governing and dependent words)
– Does not propose a recursive structure; rather a network of relations
– These relations can also have labels
Comparison
Dependency structures explicitly represent:
– Head-dependent relations (directed arcs)
– Functional categories (arc labels)
– Possibly some structural categories (parts of speech)
Phrase structures explicitly represent:
– Phrases (non-terminal nodes)
– Structural categories (non-terminal labels)
– Possibly some functional categories (grammatical functions)
Conversion: PSG to DG
– The head of a phrase governs/dominates all its siblings
– Heads are found using heuristics
– Dependency relations are established between the head of each phrase (as parent) and its siblings (as children)
– The tree thus formed is the unlabeled dependency tree (a sketch of the conversion follows below)
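A minimal sketch of the head-percolation idea (our illustration, not from the slides): trees are encoded as (label, children) tuples, and head_rules is a hypothetical stand-in for the usual head-finding heuristics.

```python
def psg_to_dg(tree, head_rules):
    """Convert a toy (label, [children]) constituency tree to unlabeled
    dependency arcs (head_word, dependent_word) by head percolation."""
    arcs = []

    def head_of(node):
        if isinstance(node, str):          # a leaf is its own head
            return node
        label, children = node
        heads = [head_of(child) for child in children]
        hi = head_rules[label](children)   # heuristic: pick the head child
        arcs.extend((heads[hi], h) for i, h in enumerate(heads) if i != hi)
        return heads[hi]

    return head_of(tree), arcs             # (root word, arcs)

# Hypothetical head rules: S is verb-headed, VP verb-initial, NP final
rules = {'S': lambda ch: 1, 'VP': lambda ch: 0, 'NP': lambda ch: len(ch) - 1}
tree = ('S', [('NP', ['John']), ('VP', ['saw', ('NP', ['the', 'dog'])])])
print(psg_to_dg(tree, rules))
# ('saw', [('dog', 'the'), ('saw', 'dog'), ('saw', 'John')])
```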
PSG to DG (example conversion)
Conversion: DG to PSG
– Each head together with its dependents (and their dependents) forms a constituent of the sentence
– It is difficult to assign structural categories (NP, VP, S, PP, etc.) to these derived constituents
– Every projective dependency grammar has a strongly equivalent context-free grammar, but not vice versa [Gaifman 1965]
Outline
– Introduction: Phrase Structure Grammar, Dependency Grammar, Comparison and Conversion
– Dependency Parsing: Formal definition
– Parsing Algorithms: Introduction, Dynamic programming, Constraint satisfaction, Deterministic search
Dependency Tree
Formal definition:
– An input word sequence w_1 … w_n
– A dependency graph D = (W, E), where W is the set of nodes, i.e. the word tokens of the input sequence, and E is the set of unlabeled tree edges (w_i, w_j) with w_i, w_j ∈ W
– (w_i, w_j) indicates an edge from w_i (parent) to w_j (child)
The task of mapping an input string to a dependency graph satisfying certain conditions is called dependency parsing.
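For concreteness, the sketches below assume this plain-Python encoding (our choice, not the slides'): token 0 is an artificial root, and a parse is a set of (head, child) index pairs.

```python
# _ROOT_=0, Red=1, figures=2, indicated=3, falling=4, stocks=5
sentence = ['_ROOT_', 'Red', 'figures', 'indicated', 'falling', 'stocks']
edges = {(2, 1), (3, 2), (0, 3), (5, 4), (3, 5)}  # (w_i, w_j): w_i -> w_j
```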
Well-formedness
A dependency graph is well-formed iff:
– Single head: each word has exactly one head
– Acyclic: the graph contains no cycles
– Connected: the graph is a single tree covering all the words in the sentence
– Projective: if word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B)
(A checker for these conditions is sketched below.)
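A brute-force checker for the four conditions, in the (head, child) encoding above; this is a sketch only, since real parsers enforce these properties by construction.

```python
def is_well_formed(n, edges):
    """Check the four conditions for words 1..n with artificial root 0."""
    heads = {}
    for h, c in edges:
        if c in heads or c == 0:            # single head; root has no head
            return False
        heads[c] = h
    if set(heads) != set(range(1, n + 1)):  # connected: every word attached
        return False
    for c in heads:                         # acyclic: each chain reaches 0
        seen, node = set(), c
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]

    def dominated(h, x):                    # is x dominated by h?
        while x not in (h, 0):
            x = heads[x]
        return x == h

    # projective: words strictly between a head and its child are
    # dominated by that head
    return all(dominated(h, k) for h, c in edges
               for k in range(min(h, c) + 1, max(h, c)))

print(is_well_formed(5, edges))  # True for the example above
```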
Non-Projective Dependency Tree
Ram saw a dog yesterday which was a Yorkshire Terrier
The relative clause 'which was a Yorkshire Terrier' depends on 'dog' but is separated from it by 'yesterday', so the arcs cross in the tree drawing. English has very few non-projective cases.
Outline
– Introduction: Phrase Structure Grammar, Dependency Grammar, Comparison and Conversion
– Dependency Parsing: Formal definition
– Parsing Algorithms: Introduction, Dynamic programming, Constraint satisfaction, Deterministic search
Dependency Parsing
Dependency-based parsers can be broadly categorized into:
– Grammar-driven approaches: parsing is done using grammars
– Data-driven approaches: parsing by training on annotated/unannotated data
Parsing Methods
Three main traditions:
– Dynamic programming: CYK, Eisner, McDonald
– Constraint satisfaction: Maruyama, Foth et al., Duchier
– Deterministic search: Covington, Yamada and Matsumoto, Nivre
Dynamic Programming
Basic idea: treat dependencies as constituents; use, e.g., a CYK parser (with minor modifications).
Eisner 1996
Two novel aspects: a modified parsing algorithm and probabilistic dependency parsing. Time requirement: O(n³).
Modification: instead of storing subtrees, store spans.
– Span: a substring such that no interior word links to any word outside the span
– Idea: in a span, only the boundary words are active, i.e. still need a head or a child
(A sketch of the span-based dynamic program follows below.)
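A sketch of the O(n³) dynamic program in the usual complete/incomplete-span formulation (following the common presentation of Eisner's algorithm, not Eisner's original pseudocode). scores[h, m] is an assumed arc-score matrix with token 0 as root; only the best score is returned, and recovering the tree would need backpointers.

```python
import numpy as np

def eisner_best_score(scores):
    """Best projective tree score; scores[h, m] scores the arc h -> m."""
    n = scores.shape[0]
    NEG = float('-inf')
    # span [s, t]; direction 0 = head on the right (t), 1 = head on the left (s)
    comp = np.full((n, n, 2), NEG)   # complete spans
    inc = np.full((n, n, 2), NEG)    # incomplete spans
    for i in range(n):
        comp[i, i, 0] = comp[i, i, 1] = 0.0
    for k in range(1, n):            # span width
        for s in range(n - k):
            t = s + k
            # join two complete halves and add an arc between s and t
            best = max(comp[s, r, 1] + comp[r + 1, t, 0] for r in range(s, t))
            inc[s, t, 0] = best + scores[t, s]   # arc t -> s
            inc[s, t, 1] = best + scores[s, t]   # arc s -> t
            # absorb a complete span into an incomplete one
            comp[s, t, 0] = max(comp[s, r, 0] + inc[r, t, 0]
                                for r in range(s, t))
            comp[s, t, 1] = max(inc[s, r, 1] + comp[r, t, 1]
                                for r in range(s + 1, t + 1))
    return comp[0, n - 1, 1]         # token 0 (the root) heads everything
```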
Example
_ROOT_ Red figures on the screen indicated falling stocks
Example
_ROOT_ Red figures on the screen indicated falling stocks
Example spans: 'Red figures', 'indicated falling stocks'
Assembly of the correct parse
Start by combining adjacent words into minimal spans, e.g. 'Red figures', 'figures on', 'on the'.
Assembly of the correct parse
Combine spans which overlap in one word; this word must be governed by a word in the left or right span.
'on the' + 'the screen' → 'on the screen'
Assembly of the correct parse
Combine spans which overlap in one word; this word must be governed by a word in the left or right span.
'figures on' + 'on the screen' → 'figures on the screen'
Assembly of the correct parse
'Red figures on the screen' is an invalid span: the interior word 'figures' still needs a head outside the span (it is governed by 'indicated'), but only boundary words may be active.
McDonald’s Maximum Spanning Trees
– Score of a dependency tree = sum of the scores of its dependencies
– Scores are independent of the other dependencies
– Given the scores, parsing can be formulated as a maximum spanning tree problem (sketched below)
– Uses online learning to determine the weight vector w
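A sketch of the MST formulation only (the online learning of w is omitted). It assumes networkx, whose maximum_spanning_arborescence implements the Chu-Liu/Edmonds algorithm, and a caller-supplied arc scorer score(h, m).

```python
import networkx as nx

def mst_parse(n, score):
    """Parse words 1..n (node 0 = root) given an arc scorer score(h, m)."""
    G = nx.DiGraph()
    for h in range(n + 1):
        for m in range(1, n + 1):   # no arcs into the root
            if h != m:
                G.add_edge(h, m, weight=score(h, m))
    # Maximum spanning arborescence = highest-scoring dependency tree;
    # since nothing points into node 0, the tree is rooted there.
    tree = nx.maximum_spanning_arborescence(G)
    return sorted(tree.edges())     # (head, child) pairs
```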
McDonald’s Algorithm
– Scores the entire dependency tree
– Online learning
Parsing Methods
Three main traditions:
– Dynamic programming: CYK, Eisner, McDonald
– Constraint satisfaction: Maruyama, Foth et al., Duchier
– Deterministic search: Covington, Yamada and Matsumoto, Nivre
Constraint Satisfaction
– Uses Constraint Dependency Grammar
– The grammar consists of a set of boolean constraints, i.e. logical formulas that describe well-formed trees
– A constraint is a logical formula with variables that range over a set of predefined values
– Parsing is defined as a constraint satisfaction problem
– Constraint satisfaction removes values that contradict constraints
(A toy illustration follows below.)
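A deliberately naive illustration of the idea, not Maruyama's actual algorithm (which propagates constraints to prune candidate values rather than enumerating): head assignments are generated, and every boolean constraint must accept them.

```python
from itertools import product

def csp_parses(n, constraints):
    """Enumerate head assignments for words 1..n (0 = root) that satisfy
    all constraints; heads[m - 1] is the head chosen for word m."""
    return [heads for heads in product(range(n + 1), repeat=n)
            if all(check(heads) for check in constraints)]

# Two toy constraints: exactly one word attaches to the root,
# and no word is its own head.
constraints = [
    lambda heads: sum(h == 0 for h in heads) == 1,
    lambda heads: all(h != m + 1 for m, h in enumerate(heads)),
]
print(len(csp_parses(3, constraints)))  # surviving analyses for 3 words
```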
Parsing Methods
Three main traditions:
– Dynamic programming: CYK, Eisner, McDonald
– Constraint satisfaction: Maruyama, Foth et al., Duchier
– Deterministic search: Covington, Yamada and Matsumoto, Nivre
Deterministic Parsing
Basic idea:
– Derive a single syntactic representation (dependency graph) through a deterministic sequence of elementary parsing actions
– Sometimes combined with backtracking or repair
Shift-Reduce Type Algorithms
Data structures:
– A stack [..., w_i]S of partially processed tokens
– A queue [w_j, ...]Q of remaining input tokens
Parsing actions built from atomic actions:
– Adding arcs (w_i → w_j, w_i ← w_j)
– Stack and queue operations
Left-to-right parsing
Yamada’s Algorithm
Three parsing actions:
– Shift: [...]S [w_i, ...]Q ⇒ [..., w_i]S [...]Q
– Left: [..., w_i, w_j]S [...]Q ⇒ [..., w_i]S [...]Q, adding arc w_i → w_j
– Right: [..., w_i, w_j]S [...]Q ⇒ [..., w_j]S [...]Q, adding arc w_i ← w_j
Multiple passes over the input give time complexity O(n²)
Yamada and Matsumoto
– Parsing in several rounds: deterministic, bottom-up, O(n²)
– Looks at pairs of words
– Three actions: Shift, Left, Right
– Shift: shifts focus to the next word pair
Yamada and Matsumoto
– Left: decides that the right word depends on the left word (arc w_i → w_j)
– Right: decides that the left word depends on the right one (arc w_i ← w_j)
(A sketch of the control structure follows below.)
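A sketch of the control structure, with decide(left_word, right_word) standing in for Yamada and Matsumoto's trained SVM classifier; action names follow the formal definitions on the Yamada's Algorithm slide above.

```python
def yamada_parse(words, decide):
    """Repeated left-to-right passes over adjacent word pairs; stops when
    a pass builds no arc. decide(...) returns 'shift', 'left' or 'right'."""
    nodes = list(range(len(words)))      # indices of not-yet-reduced words
    arcs = []                            # (head, child) index pairs
    changed = True
    while changed and len(nodes) > 1:
        changed, i = False, 0
        while i < len(nodes) - 1:
            l, r = nodes[i], nodes[i + 1]
            action = decide(words[l], words[r])
            if action == 'left':         # w_i -> w_j: right word attaches
                arcs.append((l, r)); del nodes[i + 1]; changed = True
            elif action == 'right':      # w_i <- w_j: left word attaches
                arcs.append((r, l)); del nodes[i]; changed = True
            else:                        # 'shift': move to the next pair
                i += 1
    return arcs
```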
Nivre’s Algorithm
Four parsing actions:
– Shift: [...]S [w_i, ...]Q ⇒ [..., w_i]S [...]Q
– Reduce: [..., w_i]S [...]Q ⇒ [...]S [...]Q, provided ∃w_k: w_k → w_i
– Left-Arc: [..., w_i]S [w_j, ...]Q ⇒ [...]S [w_j, ...]Q, adding arc w_i ← w_j, provided ¬∃w_k: w_k → w_i
– Right-Arc: [..., w_i]S [w_j, ...]Q ⇒ [..., w_i, w_j]S [...]Q, adding arc w_i → w_j, provided ¬∃w_k: w_k → w_j
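A sketch of arc-eager parsing, with decide(...) standing in for the trained guide; the preconditions from the action table above are enforced inline, and arcs are (head, child) index pairs.

```python
def nivre_parse(words, decide):
    """Arc-eager parsing of words (index 0 must be _ROOT_); decide returns
    'shift', 'reduce', 'left-arc' or 'right-arc'."""
    stack, queue, arcs = [0], list(range(1, len(words))), []
    has_head = set()
    while queue:
        action = decide(stack, queue, arcs)
        if (action == 'left-arc' and stack[-1] != 0
                and stack[-1] not in has_head):
            arcs.append((queue[0], stack[-1]))  # queue front heads stack top
            has_head.add(stack.pop())
        elif action == 'right-arc':
            arcs.append((stack[-1], queue[0]))  # stack top heads queue front
            has_head.add(queue[0])
            stack.append(queue.pop(0))
        elif action == 'reduce' and stack[-1] in has_head:
            stack.pop()
        else:                                   # 'shift'
            stack.append(queue.pop(0))
    return arcs
```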
Nivre’s Algorithm
Characteristics:
– Arc-eager processing of right-dependents
– A single pass over the input: at most 2n transitions, i.e. worst-case linear time O(n)
Example
Sentence: _ROOT_ Red figures on the screen indicated falling stocks
The arc-eager transition sequence (stack S, queue Q):
1. Shift (push Red)
2. Left-arc (Red ← figures)
3. Shift (push figures)
4. Right-arc (figures → on)
5. Shift (push the)
6. Left-arc (the ← screen)
7. Right-arc (on → screen)
8. Reduce (pop screen)
9. Reduce (pop on)
10. Left-arc (figures ← indicated)
11. Right-arc (_ROOT_ → indicated)
12. Shift (push falling)
13. Left-arc (falling ← stocks)
14. Right-arc (indicated → stocks)
15. Reduce (pop stocks)
16. Reduce (pop indicated)
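Feeding this transition sequence to the nivre_parse sketch above as a canned guide reproduces the arcs; the final two Reduce steps of the walkthrough only empty the stack, and the sketch, which stops once the queue is empty, never requests them.

```python
words = ['_ROOT_', 'Red', 'figures', 'on', 'the', 'screen',
         'indicated', 'falling', 'stocks']
gold = iter(['shift', 'left-arc', 'shift', 'right-arc', 'shift',
             'left-arc', 'right-arc', 'reduce', 'reduce', 'left-arc',
             'right-arc', 'shift', 'left-arc', 'right-arc'])
print(nivre_parse(words, lambda s, q, a: next(gold)))
# [(2, 1), (2, 3), (5, 4), (3, 5), (6, 2), (0, 6), (8, 7), (6, 8)]
```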
The End
Q&A