Natural Language Processing

Presentation on theme: "Natural Language Processing" — Presentation transcript:

1 Natural Language Processing
Earley’s Algorithm and Dependencies

2 Survey Feedback
- Expanded office hours: Tuesday evenings, Friday afternoons
- More detail in the lectures
- Piazza
- Quiz & midterm policy: you don’t get them back
- Grading policy

3 Earley’s Algorithm

4 Grammar for Examples
NP -> N    NP -> DT N    NP -> NP PP    NP -> PNP
PP -> P NP
S -> NP VP    S -> VP
VP -> V NP    VP -> VP PP
DT -> a    DT -> the
P -> through    P -> with
PNP -> Swabha    PNP -> Chicago
V -> book    V -> books
N -> book    N -> books    N -> flight

5 Earley’s Algorithm
More “top-down” than CKY. Still dynamic programming.
The Earley chart starts with ROOT → • S [0,0]; the goal is ROOT → S • [0,n].
Sentence: book the flight through Chicago

6 Earley’s Algorithm: Predict
Given V → α • X β [i, j] and the rule X → γ, create X → • γ [j, j].
Example: from ROOT → • S [0,0] and the rule S → VP, predict S → • VP [0,0].
Chart so far: ROOT → • S [0,0], S → • VP [0,0], S → • NP VP [0,0], ..., VP → • V NP [0,0], NP → • DT N [0,0]
Sentence: book the flight through Chicago

7 Earley’s Algorithm: Scan
Given V → α • T β [i, j] and the rule T → w_{j+1} (the next input word), create T → w_{j+1} • [j, j+1].
Example: from VP → • V NP [0,0] and the rule V → book, scan to create V → book • [0,1].
Chart so far: ROOT → • S [0,0], S → • VP [0,0], S → • NP VP [0,0], ..., VP → • V NP [0,0], NP → • DT N [0,0], V → book • [0,1]
Sentence: book the flight through Chicago

8 Earley’s Algorithm: Complete
Given V → α • X β [i, j] and X → γ • [j, k], create V → α X • β [i, k].
Example: from VP → • V NP [0,0] and V → book • [0,1], complete to create VP → V • NP [0,1].
Chart so far: ROOT → • S [0,0], S → • VP [0,0], S → • NP VP [0,0], ..., VP → • V NP [0,0], NP → • DT N [0,0], V → book • [0,1], VP → V • NP [0,1]
Sentence: book the flight through Chicago

9 (figure-only slide; no transcript text)

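Putting predict, scan, and complete together gives a full recognizer. The following is a rough Python sketch, not code from the slides: the grammar is assumed to be a dict from each nonterminal to its right-hand sides, the lexicon a dict from preterminals to word sets, and chart items are tuples (lhs, rhs, dot, start); the lexical step is folded into scan rather than creating separate preterminal items as on the slides.

def earley_recognize(words, grammar, lexicon, start="ROOT"):
    # chart[j] holds items (lhs, rhs, dot, i): rule lhs -> rhs with the dot
    # after position `dot`, spanning words i..j.
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    for gamma in grammar[start]:
        chart[0].add((start, gamma, 0, 0))                 # ROOT -> . S  [0,0]

    for j in range(n + 1):
        agenda = list(chart[j])
        while agenda:
            lhs, rhs, dot, i = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:                         # PREDICT: nxt -> . gamma [j,j]
                    for gamma in grammar[nxt]:
                        item = (nxt, gamma, 0, j)
                        if item not in chart[j]:
                            chart[j].add(item); agenda.append(item)
                if j < n and words[j] in lexicon.get(nxt, ()):   # SCAN the next word
                    chart[j + 1].add((lhs, rhs, dot + 1, i))
            else:                                          # COMPLETE: lhs finished over [i,j]
                for (l2, r2, d2, i2) in list(chart[i]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        item = (l2, r2, d2 + 1, i2)
                        if item not in chart[j]:
                            chart[j].add(item); agenda.append(item)

    return any(l == start and d == len(r) and i == 0       # goal: ROOT -> S . [0,n]
               for (l, r, d, i) in chart[n])

A small usage example with a subset of the slide-4 grammar:

grammar = {"ROOT": [("S",)], "S": [("VP",)], "VP": [("V", "NP")],
           "NP": [("DT", "N"), ("NP", "PP"), ("PNP",)], "PP": [("P", "NP")]}
lexicon = {"V": {"book"}, "DT": {"the"}, "N": {"flight"},
           "P": {"through"}, "PNP": {"Chicago"}}
print(earley_recognize("book the flight through Chicago".split(), grammar, lexicon))  # True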

11 Thought Questions Runtime? Memory? Weighted version? Recovering trees?

12 Parsing as Search

13 Implementing Recognizers as Search
Agenda = { state0 }
while (Agenda not empty):
    s = pop a state from Agenda
    if s is a success-state:
        return s                        // valid parse tree
    else if s is not a failure-state:
        generate new states from s
        push new states onto Agenda
return nil                              // no parse!

14 Agenda-Based Probabilistic Parsing
Agenda = { (item, value) : initial updates from equations }
    // items take the form [X, i, j]; values are reals
while (Agenda not empty):
    u = pop an update from Agenda
    if u.item is goal:
        return u.value                  // valid parse tree
    else if u.value > Chart[u.item]:
        store Chart[u.item] ← u.value
        if u.item combines with other Chart items:
            generate new updates from u and items stored in Chart
            push new updates onto Agenda
return nil                              // no parse!

“States” on the agenda are (possible) updates to the chart.
BEST FIRST: order the updates by their values.
Guarantee: the first time you pop any state, you have its final value!
Extension: order by update times h, where h introduces more information about which states we like. Under some conditions, this is faster and still optimal. Even when not optimal, performance is sometimes good.
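To make the best-first ordering concrete, the agenda can be kept in a priority queue. The fragment below is an illustrative Python sketch only; the update-generating combine() function is hypothetical, since it depends on the grammar and the scoring equations.

import heapq

def agenda_parse(initial_updates, is_goal, combine):
    """Best-first agenda: always pop the highest-valued update.

    initial_updates: iterable of (item, value) pairs
    is_goal(item):   True for the goal item, e.g. [ROOT, 0, n]
    combine(item, chart): yields new (item, value) updates   (hypothetical)
    """
    chart = {}
    agenda = [(-value, item) for item, value in initial_updates]
    heapq.heapify(agenda)                          # heapq is a min-heap, so values are negated
    while agenda:
        neg, item = heapq.heappop(agenda)
        value = -neg
        if is_goal(item):
            return value                           # first pop of the goal already has its final value
        if value > chart.get(item, float("-inf")):
            chart[item] = value                    # store the value in the chart
            for new_item, new_value in combine(item, chart):
                heapq.heappush(agenda, (-new_value, new_item))
    return None                                    # no parse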

15 Catalog of CF Parsing Algorithms
- Recognition/Boolean vs. parsing/probabilistic
- Chomsky normal form/CKY vs. general/Earley’s
- Exhaustive vs. agenda

16 Dependency Parsing

17 Treebank Tree
(figure: phrase-structure tree with S, VP, PP, and NP nodes over the POS tags DT NN NN NN JJ NN VBD CD NNS IN DT NNP, for the sentence “The luxury auto maker last year sold 1,214 cars in the U.S.”)

18 Headed Tree
(figure: the same phrase-structure tree with the head child of each constituent marked, over “The luxury auto maker last year sold 1,214 cars in the U.S.”)

19 Lexicalized Tree
(figure: the same tree with each nonterminal annotated with its lexical head: S_sold, VP_sold, PP_in, NP_maker, NP_year, NP_cars, NP_U.S., over “The luxury auto maker last year sold 1,214 cars in the U.S.”)

20 Dependency Tree

21 Methods for Dependency Parsing
- Parse with a phrase-structure parser with headed / lexicalized rules
  - Reuse algorithms we know
  - Leverage improvements in phrase-structure parsing
- Maximum spanning tree algorithms
  - Words are nodes, edges are possible links
  - MSTParser
- Shift-reduce parsing
  - Read words in one at a time, decide to “shift” or “reduce” to incrementally build tree structures
  - MaltParser, Stanford NN Dependency Parser

22 Maximum Spanning Tree
- Each dependency is an edge
- Assign each edge a goodness score (ML problem)
- Dependencies must form a tree
- Find the highest-scoring tree (Chu-Liu-Edmonds algorithm; see the sketch below)
(Figure: Graham Neubig)
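As a rough illustration (not the course's code), the edge scores can be loaded into a directed graph and the highest-scoring arborescence extracted with an off-the-shelf Chu-Liu-Edmonds implementation. This sketch assumes networkx is installed and uses made-up scores for a toy sentence.

import networkx as nx

# Hypothetical edge scores: score[(head, dependent)] from some learned model.
score = {
    ("ROOT", "book"): 10.0, ("book", "flight"): 8.0, ("flight", "the"): 7.0,
    ("book", "the"): 2.0, ("ROOT", "flight"): 3.0, ("flight", "book"): 1.0,
}

G = nx.DiGraph()
for (head, dep), s in score.items():
    G.add_edge(head, dep, weight=s)

# Chu-Liu-Edmonds: highest-scoring arborescence, i.e. every word gets exactly one head.
tree = nx.maximum_spanning_arborescence(G, attr="weight")
for head, dep in sorted(tree.edges()):
    print(f"{dep} <- {head}")
# book <- ROOT, flight <- book, the <- flight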

23 Shift-Reduce Parsing
Two data structures:
- Buffer: words that are being read in
- Stack: partially built dependency trees
At each point, choose:
- Shift: move the next word from the buffer onto the stack
- Reduce-left: combine the top two items on the stack, making the top word the head of the tree
- Reduce-right: combine the top two items on the stack, making the second word the head of the tree
Parsing as classification: a classifier says “shift”, “reduce-left”, or “reduce-right”

24 Shift-Reduce Parsing
(figure: stack and buffer contents before and after a parsing step; Figure: Graham Neubig)

25 Parsing as Classification
Given a state (stack and buffer): what action is best?
Better classification -> better parsing
(figure: example stack and buffer configuration)

26 Shift-Reduce Algorithm
ShiftReduce(buffer):
    heads = empty list
    stack = [ (0, “ROOT”, “ROOT”) ]
    while |buffer| > 0 or |stack| > 1:
        feats  = MakeFeats(stack, buffer)
        action = Predict(feats, weights)
        if action = shift:
            stack.push(buffer.read())
        elif action = reduce_left:
            heads[stack[-2]] = stack[-1]
            stack.remove(-2)
        else:  # action = reduce_right
            heads[stack[-1]] = stack[-2]
            stack.remove(-1)
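For concreteness, the same loop can be rendered as runnable Python with the learned classifier (MakeFeats, Predict, weights) replaced by a hypothetical oracle that simply supplies the next action; heads maps each word's index to the index of its head.

def shift_reduce(words, oracle):
    """words: list of tokens; oracle(stack, buffer) -> "shift" / "reduce_left" / "reduce_right"."""
    buffer = list(enumerate(words, start=1))       # [(1, w1), (2, w2), ...]
    stack = [(0, "ROOT")]
    heads = {}                                     # dependent index -> head index
    while buffer or len(stack) > 1:
        action = oracle(stack, buffer)
        if action == "shift":
            stack.append(buffer.pop(0))
        elif action == "reduce_left":              # top word becomes the head
            heads[stack[-2][0]] = stack[-1][0]
            del stack[-2]
        else:                                      # reduce_right: second word is the head
            heads[stack[-1][0]] = stack[-2][0]
            del stack[-1]
    return heads

actions = iter(["shift", "shift", "shift", "reduce_left", "reduce_right", "reduce_right"])
print(shift_reduce("book the flight".split(),
                   lambda stack, buffer: next(actions)))
# {2: 3, 3: 1, 1: 0}   i.e. the <- flight, flight <- book, book <- ROOT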

