Natural Language Processing
Earley’s Algorithm and Dependencies
Survey Feedback
Expanded office hours: Tuesday evenings, Friday afternoons
More detail in the lectures
Piazza
Quiz & Midterm policy: you don’t get them back
Grading policy
Earley’s Algorithm
Grammar for Examples
S -> NP VP    S -> VP
NP -> N    NP -> DT N    NP -> NP PP    NP -> PNP
VP -> V NP    VP -> VP PP
PP -> P NP
DT -> a    DT -> the
P -> through    P -> with
PNP -> Swabha    PNP -> Chicago
V -> book    V -> books
N -> book    N -> books    N -> flight
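To make the later steps concrete, here is one minimal way this grammar could be held in code; the names GRAMMAR and LEXICON and the tuple encoding are illustrative choices, not part of the lecture. Phrase rules and word-level (preterminal) rules are kept separate because Predict uses the former and Scan uses the latter.

# Hypothetical encoding of the example grammar (names are illustrative).
GRAMMAR = [                      # phrase rules: (lhs, (rhs symbols, ...))
    ("S", ("NP", "VP")), ("S", ("VP",)),
    ("NP", ("N",)), ("NP", ("DT", "N")), ("NP", ("NP", "PP")), ("NP", ("PNP",)),
    ("VP", ("V", "NP")), ("VP", ("VP", "PP")),
    ("PP", ("P", "NP")),
]
LEXICON = [                      # preterminal rules: (tag, word)
    ("DT", "a"), ("DT", "the"),
    ("P", "through"), ("P", "with"),
    ("PNP", "Swabha"), ("PNP", "Chicago"),
    ("V", "book"), ("V", "books"),
    ("N", "book"), ("N", "books"), ("N", "flight"),
]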
Earley’s Algorithm
More “top-down” than CKY, but still dynamic programming.
The Earley chart: the initial item is ROOT → • S [0, 0]; the goal is ROOT → S • [0, n].
Running example: book the flight through Chicago
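One common way to represent chart entries is as a dotted rule plus a span; the field names below are my own sketch, not the lecture’s notation.

from collections import namedtuple

# An Earley item X -> alpha . beta [start, end]: "dot" counts how many
# right-hand-side symbols have been recognized so far.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start", "end"])

def initial_item():
    return Item("ROOT", ("S",), 0, 0, 0)       # ROOT -> . S [0, 0]

def is_goal(item, n):
    # ROOT -> S . [0, n]
    return (item.lhs == "ROOT" and item.dot == len(item.rhs)
            and item.start == 0 and item.end == n)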
Earley’s Algorithm: Predict
Given V → α • X β [i, j] and the rule X → γ, create X → • γ [j, j].
Example: from ROOT → • S [0, 0] and the rule S → VP, predict S → • VP [0, 0] (the new item’s dot starts at the left of γ).
Chart so far:
ROOT → • S [0, 0]
S → • VP [0, 0]
S → • NP VP [0, 0]
...
VP → • V NP [0, 0]
NP → • DT N [0, 0]
Input: book the flight through Chicago
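Continuing the sketch above (Item, GRAMMAR), Predict might look like this: for an item whose dot sits before a nonterminal X, add a zero-width item for every rule expanding X at the right edge of the span.

def predict(item, grammar):
    """Given V -> alpha . X beta [i, j] and rules X -> gamma, yield X -> . gamma [j, j]."""
    if item.dot == len(item.rhs):
        return                                   # dot at the end: nothing to predict
    X = item.rhs[item.dot]
    for lhs, rhs in grammar:
        if lhs == X:
            yield Item(lhs, rhs, 0, item.end, item.end)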
Earley’s Algorithm: Scan
Given V → α • T β [i, j] and the rule T → w_{j+1}, create T → w_{j+1} • [j, j+1].
Example: from VP → • V NP [0, 0] and the rule V → book (the next word), scan V → book • [0, 1].
Chart so far:
ROOT → • S [0, 0]
S → • VP [0, 0]
S → • NP VP [0, 0]
...
VP → • V NP [0, 0]
NP → • DT N [0, 0]
V → book • [0, 1]
Input: book the flight through Chicago
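A matching sketch of Scan, using the same assumed Item and LEXICON encoding: if the dot sits before a preterminal T and the next input word can be tagged T, add the one-word item T -> w_{j+1} . over [j, j+1].

def scan(item, words, lexicon):
    """Given V -> alpha . T beta [i, j] and T -> w_{j+1}, yield T -> w_{j+1} . [j, j+1]."""
    j = item.end
    if item.dot == len(item.rhs) or j >= len(words):
        return
    T = item.rhs[item.dot]
    if (T, words[j]) in lexicon:                 # words are 0-indexed, so words[j] is w_{j+1}
        yield Item(T, (words[j],), 1, j, j + 1)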
Earley’s Algorithm: Complete
Given V → α • X β [i, j] and X → γ • [j, k], create V → α X • β [i, k].
Example: from VP → • V NP [0, 0] and V → book • [0, 1], complete VP → V • NP [0, 1].
Chart so far:
ROOT → • S [0, 0]
S → • VP [0, 0]
S → • NP VP [0, 0]
...
VP → • V NP [0, 0]
NP → • DT N [0, 0]
V → book • [0, 1]
VP → V • NP [0, 1]
Input: book the flight through Chicago
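Complete pairs a finished item with every chart item waiting for its left-hand side. Below is that step plus an exhaustive agenda loop tying the three operations together; it continues the sketch above and is one plausible rendering, not the lecture’s code.

def complete(finished, chart):
    """Given X -> gamma . [j, k], advance each V -> alpha . X beta [i, j] to V -> alpha X . beta [i, k]."""
    for item in list(chart):
        if (item.dot < len(item.rhs) and item.rhs[item.dot] == finished.lhs
                and item.end == finished.start):
            yield Item(item.lhs, item.rhs, item.dot + 1, item.start, finished.end)

def recognize(words, grammar, lexicon):
    """Exhaustive Earley recognizer: True iff ROOT -> S . [0, n] can be derived."""
    chart = {initial_item()}
    agenda = list(chart)
    while agenda:
        item = agenda.pop()
        new_items = set()
        if item.dot < len(item.rhs):
            new_items.update(predict(item, grammar))
            new_items.update(scan(item, words, lexicon))
            # Also advance over constituents that finished before this item existed,
            # so the result does not depend on the order items are popped.
            X = item.rhs[item.dot]
            for done in chart:
                if done.dot == len(done.rhs) and done.lhs == X and done.start == item.end:
                    new_items.add(Item(item.lhs, item.rhs, item.dot + 1, item.start, done.end))
        else:
            new_items.update(complete(item, chart))
        for new in new_items - chart:
            chart.add(new)
            agenda.append(new)
    return any(is_goal(item, len(words)) for item in chart)

# e.g. recognize("book the flight through Chicago".split(), GRAMMAR, LEXICON) -> True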
Thought Questions
Runtime? Memory? Weighted version? Recovering trees?
Parsing as Search
Implementing Recognizers as Search
Agenda = { state0 }
while (Agenda not empty):
    s = pop a state from Agenda
    if s is a success-state:
        return s                        // valid parse tree
    else if s is not a failure-state:
        generate new states from s
        push new states onto Agenda
return nil                              // no parse!
Agenda-Based Probabilistic Parsing
Agenda = { (item, value) : initial updates from equations }    // items take the form [X, i, j]; values are reals
while (Agenda not empty):
    u = pop an update from Agenda
    if u.item is the goal:
        return u.value                  // valid parse
    else if u.value > Chart[u.item]:
        store Chart[u.item] ← u.value
        if u.item combines with other Chart items:
            generate new updates from u and the items stored in Chart
            push new updates onto Agenda
return nil                              // no parse!

“States” on the agenda are (possible) updates to the chart.
Best-first: order the updates by their values. Guarantee: the first time you pop any item, you have its final value!
Extension: order by update value times h, where h introduces more information about which states we like. Under some conditions this is faster and still optimal; even when not optimal, performance is sometimes good.
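A runnable sketch of the best-first version for a binary PCFG (unary rules are omitted to keep it short). The max-priority queue is simulated with heapq and negated values; the rule encodings and the function name are assumptions for illustration, not the lecture’s code.

import heapq

def best_first_parse(words, lexicon, binary_rules):
    """A minimal best-first agenda parser for a binary PCFG (unary rules omitted).

    lexicon:      dict (tag, word) -> probability of the rule tag -> word
    binary_rules: dict (Y, Z) -> list of (X, probability) for rules X -> Y Z
    Items take the form (X, i, j); values are probabilities (higher is better).
    Returns the value of the goal item ("S", 0, n), or None if there is no parse.
    """
    n = len(words)
    chart = {}                                  # item -> final value (fixed on first pop)
    agenda = []                                 # max-heap simulated with negated values
    for i, w in enumerate(words):               # initial updates from the lexical rules
        for (tag, word), p in lexicon.items():
            if word == w:
                heapq.heappush(agenda, (-p, (tag, i, i + 1)))
    while agenda:
        neg_value, item = heapq.heappop(agenda)
        value = -neg_value
        if item == ("S", 0, n):
            return value                        # goal item: best-first, so this is its best value
        if item in chart:
            continue                            # first pop already fixed this item's final value
        chart[item] = value
        X, i, j = item
        for (Y, a, b), v in list(chart.items()):
            if b == i:                          # neighbour on the left: Y [a,i] + X [i,j]
                for parent, p in binary_rules.get((Y, X), []):
                    heapq.heappush(agenda, (-(v * value * p), (parent, a, j)))
            if a == j:                          # neighbour on the right: X [i,j] + Y [j,b]
                for parent, p in binary_rules.get((X, Y), []):
                    heapq.heappush(agenda, (-(value * v * p), (parent, i, b)))
    return None                                 # no parse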
Catalog of CF Parsing Algorithms
Recognition/Boolean vs. parsing/probabilistic
Chomsky normal form (CKY) vs. general grammars (Earley’s)
Exhaustive vs. agenda-driven
Dependency Parsing
Treebank Tree
[Figure: phrase-structure tree with nonterminals S, VP, PP, and NP over the POS tags DT NN NN NN JJ NN VBD CD NNS IN DT NNP]
The luxury auto maker last year sold 1,214 cars in the U.S.
Headed Tree
[Figure: the same tree with a head child marked for each nonterminal (S, VP, PP, NP over DT NN NN NN JJ NN VBD CD NNS IN DT NNP)]
The luxury auto maker last year sold 1,214 cars in the U.S.
Lexicalized Tree
[Figure: the tree with each nonterminal annotated with its lexical head: S_sold, VP_sold, PP_in, NP_maker, NP_year, NP_cars, NP_U.S., over DT NN NN NN JJ NN VBD CD NNS IN DT NNP]
The luxury auto maker last year sold 1,214 cars in the U.S.
Dependency Tree
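The tree figure itself did not survive the transcript. As a stand-in, here is one plausible dependency analysis of the running sentence, written as a head index per word; the arcs are my reconstruction, consistent with the lexicalized heads above, not the slide’s exact figure.

# Reconstruction of a dependency tree for the running sentence (illustrative).
# heads[i] is the index of word i's head; index 0 is an artificial ROOT.
words = ["ROOT", "The", "luxury", "auto", "maker", "last", "year",
         "sold", "1,214", "cars", "in", "the", "U.S."]
heads = [None,   4,     4,        4,      7,       6,      7,
         0,      9,     7,        7,      12,      10]
# e.g. heads[4] == 7: "maker" depends on "sold"; heads[7] == 0: "sold" is the root.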
Methods for Dependency Parsing
Parse with a phrase-structure parser using headed / lexicalized rules: reuse algorithms we know and leverage improvements in phrase-structure parsing.
Maximum spanning tree algorithms: words are nodes, edges are possible links (MSTParser).
Shift-reduce parsing: read words in one at a time, deciding to “shift” or “reduce” to incrementally build tree structures (MaltParser, Stanford NN Dependency Parser).
Maximum Spanning Tree
Each dependency is an edge; assign each edge a goodness score (an ML problem).
Dependencies must form a tree, so find the highest-scoring tree (Chu-Liu-Edmonds algorithm).
Figure: Graham Neubig
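A small sketch of the first step under these assumptions: each possible head-dependent edge gets a score from a learned model (here just a function argument), and each word greedily picks its best head. The full Chu-Liu-Edmonds algorithm then contracts any cycles among the chosen arcs and repeats until the arcs form a tree; that part is omitted here.

def greedy_heads(n_words, score):
    """Pick the highest-scoring head for each word (0 = ROOT).

    score(head, dependent) is a stand-in for the learned edge scorer.
    This is only the starting point of Chu-Liu-Edmonds: the chosen arcs may
    contain cycles, which the full algorithm contracts and re-scores.
    """
    heads = [None]                                   # index 0 is ROOT
    for dep in range(1, n_words + 1):
        candidates = [h for h in range(n_words + 1) if h != dep]
        heads.append(max(candidates, key=lambda h: score(h, dep)))
    return heads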
Shift-Reduce Parsing
Two data structures:
Buffer: words that are being read in
Stack: partially built dependency trees
At each point, choose one of:
Shift: move the next word from the buffer onto the stack
Reduce-left: combine the top two items on the stack, making the top word the head
Reduce-right: combine the top two items on the stack, making the second word the head
Parsing as classification: a classifier says “shift”, “reduce-left”, or “reduce-right”
Shift-Reduce Parsing
[Figure: stack and buffer contents before and after a transition. Figure: Graham Neubig]
Parsing as Classification
Given a state (a stack and a buffer): what action is best? Better classification -> better parsing.
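A sketch of what the feature function over a state might look at; these templates are illustrative (real parsers use many more, including POS tags and children of the stack items).

def make_feats(stack, buffer):
    """Illustrative feature templates over the parser state, with (index, word) entries."""
    s1 = stack[-1][1] if len(stack) > 0 else "<none>"   # word on top of the stack
    s2 = stack[-2][1] if len(stack) > 1 else "<none>"   # second word on the stack
    b1 = buffer[0][1] if len(buffer) > 0 else "<none>"  # next word in the buffer
    return ["s1=" + s1, "s2=" + s2, "b1=" + b1,
            "s1,s2=" + s1 + "_" + s2, "s1,b1=" + s1 + "_" + b1]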
Shift-Reduce Algorithm
ShiftReduce(buffer):
    heads = a list of head indices (initially empty)
    stack = [ (0, “ROOT”, “ROOT”) ]
    while |buffer| > 0 or |stack| > 1:
        feats = MakeFeats(stack, buffer)
        action = Predict(feats, weights)
        if action == shift:
            stack.push(buffer.read())
        elif action == reduce_left:
            heads[stack[-2]] = stack[-1]
            stack.remove(-2)
        else:  # action == reduce_right
            heads[stack[-1]] = stack[-2]
            stack.remove(-1)
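For completeness, a runnable rendering of the same loop in Python. The classifier is replaced by a trivial stand-in policy, so choose_action, the (index, word) stack entries, and the example output are assumptions rather than the lecture’s code; a real parser would call a trained classifier over features like those sketched above.

def choose_action(stack, buffer):
    # Stand-in policy, NOT a trained classifier: shift every word, then reduce
    # left until only ROOT and one word remain, and attach that word to ROOT.
    if buffer:
        return "shift"
    if len(stack) > 2:
        return "reduce_left"
    return "reduce_right"

def shift_reduce(words, choose_action=choose_action):
    """Return heads[i] = index of word i's head (0 is the artificial ROOT)."""
    buffer = list(enumerate(words, start=1))      # (index, word) pairs, left to right
    stack = [(0, "ROOT")]
    heads = {}
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer)
        if action == "shift":
            stack.append(buffer.pop(0))           # move the next word from the buffer onto the stack
        elif action == "reduce_left":
            heads[stack[-2][0]] = stack[-1][0]    # top word becomes the head of the second item
            del stack[-2]
        else:                                     # reduce_right
            heads[stack[-1][0]] = stack[-2][0]    # second word becomes the head of the top item
            del stack[-1]
    return heads

# With the stub policy, every earlier word attaches to the last word, which attaches to ROOT:
# shift_reduce(["book", "the", "flight"])  ->  {2: 3, 1: 3, 3: 0}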