Natural Language Processing


Natural Language Processing: Earley’s Algorithm and Dependencies

Survey Feedback
- Expanded office hours: Tuesday evenings, Friday afternoons
- More detail in the lectures
- Piazza
- Quiz & Midterm policy: you don’t get them back
- Grading policy

Earley’s Algorithm

Grammar for Examples
  S  -> NP VP       S  -> VP
  NP -> N           NP -> DT N      NP -> NP PP     NP -> PNP
  VP -> V NP        VP -> VP PP
  PP -> P NP
  DT -> a           DT -> the
  P  -> through     P  -> with
  PNP -> Swabha     PNP -> Chicago
  V  -> book        V  -> books
  N  -> book        N  -> books     N  -> flight
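For reference in the sketches below, this toy grammar can be written down as plain data; a minimal Python encoding (the GRAMMAR name and the (lhs, rhs) pair representation are choices made here for illustration, not something the slides prescribe):

    # One possible encoding of the example grammar: rules as (lhs, rhs-tuple) pairs,
    # with lexical rules like V -> book treated the same way as the others.
    GRAMMAR = [
        ("S", ("NP", "VP")), ("S", ("VP",)),
        ("NP", ("N",)), ("NP", ("DT", "N")), ("NP", ("NP", "PP")), ("NP", ("PNP",)),
        ("VP", ("V", "NP")), ("VP", ("VP", "PP")),
        ("PP", ("P", "NP")),
        ("DT", ("a",)), ("DT", ("the",)),
        ("P", ("through",)), ("P", ("with",)),
        ("PNP", ("Swabha",)), ("PNP", ("Chicago",)),
        ("V", ("book",)), ("V", ("books",)),
        ("N", ("book",)), ("N", ("books",)), ("N", ("flight",)),
    ]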

Earley’s Algorithm
More “top-down” than CKY, but still dynamic programming.
The Earley chart:
  Initial item:  ROOT → • S   [0,0]
  Goal item:     ROOT → S •   [0,n]
Input: book the flight through Chicago
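Concretely, a chart item X → α • β [i, j] can be represented as a small record; a minimal sketch in Python (the Item type and its field names are illustrative assumptions used throughout the following sketches):

    from collections import namedtuple

    # An item X -> alpha . beta [i, j]: a grammar rule, how far the dot has advanced,
    # and the input span [i, j] covered by the part before the dot.
    Item = namedtuple("Item", ["lhs", "rhs", "dot", "start", "end"])

    def next_symbol(item):
        """The symbol just after the dot, or None if the item is complete."""
        return item.rhs[item.dot] if item.dot < len(item.rhs) else None

    # The chart is a list of item sets; chart[j] holds the items ending at position j.
    # Recognition starts from ROOT -> . S [0,0] and succeeds if ROOT -> S . [0,n]
    # is ever derived, where n is the length of the input sentence.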

Earley’s Algorithm: Predict
Given V → α • X β [i, j] and the rule X → γ, create X → • γ [j, j].
Example: from ROOT → • S [0,0] and the rule S → VP, predict S → • VP [0,0].
Chart so far:
  ROOT → • S       [0,0]
  S → • VP         [0,0]
  S → • NP VP      [0,0]
  ...
  VP → • V NP      [0,0]
  NP → • DT N      [0,0]
Input: book the flight through Chicago
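A sketch of the predict operation over that representation (assuming the GRAMMAR list and Item type defined above):

    def predict(item, grammar):
        """If a nonterminal X follows the dot, add fresh items X -> . gamma
        that start and end at the current item's end position."""
        X = next_symbol(item)
        return [Item(lhs, rhs, 0, item.end, item.end)
                for lhs, rhs in grammar if lhs == X]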

Earley’s Algorithm: Scan
Given V → α • T β [i, j] and the rule T → w_{j+1}, where T is a preterminal and w_{j+1} is the next input word, create T → w_{j+1} • [j, j+1].
Example: from VP → • V NP [0,0] and the rule V → book, scan the word “book” to create V → book • [0,1].
Chart so far:
  ROOT → • S       [0,0]
  S → • VP         [0,0]
  S → • NP VP      [0,0]
  ...
  VP → • V NP      [0,0]
  NP → • DT N      [0,0]
  V → book •       [0,1]
Input: book the flight through Chicago
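The scan operation in the same style (words is the input sentence as a list of tokens; the membership test assumes the (lhs, rhs) rule encoding above):

    def scan(item, words, grammar):
        """If the symbol after the dot is a preterminal that rewrites to the next
        input word, add a completed item covering that single word."""
        T = next_symbol(item)
        j = item.end
        if j < len(words) and (T, (words[j],)) in grammar:
            return [Item(T, (words[j],), 1, j, j + 1)]
        return []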

Earley’s Algorithm: Complete
Given V → α • X β [i, j] and X → γ • [j, k], create V → α X • β [i, k].
Example: from VP → • V NP [0,0] and V → book • [0,1], create VP → V • NP [0,1].
Chart so far:
  ROOT → • S       [0,0]
  S → • VP         [0,0]
  S → • NP VP      [0,0]
  ...
  VP → • V NP      [0,0]
  NP → • DT N      [0,0]
  V → book •       [0,1]
  VP → V • NP      [0,1]
Input: book the flight through Chicago
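The complete operation, plus a column-by-column driver tying the three steps together, sketched under the same assumptions. This is a recognizer only; recovering trees (one of the thought questions below) would additionally require back-pointers.

    def complete(item, chart):
        """For a finished item X -> gamma . [j, k], advance every item in chart[j]
        that is waiting for an X, producing items that end at k."""
        advanced = []
        for waiting in chart[item.start]:
            if next_symbol(waiting) == item.lhs:
                advanced.append(Item(waiting.lhs, waiting.rhs,
                                     waiting.dot + 1, waiting.start, item.end))
        return advanced

    def earley_recognize(words, grammar):
        """Earley recognizer built from predict/scan/complete (no epsilon rules assumed)."""
        n = len(words)
        chart = [[] for _ in range(n + 1)]
        chart[0].append(Item("ROOT", ("S",), 0, 0, 0))

        def add(items):
            for it in items:
                if it not in chart[it.end]:
                    chart[it.end].append(it)

        for j in range(n + 1):
            pos = 0
            while pos < len(chart[j]):               # chart[j] may grow while we process it
                item = chart[j][pos]
                pos += 1
                if next_symbol(item) is None:
                    add(complete(item, chart))       # finished item: extend waiting items
                else:
                    add(predict(item, grammar))      # nonterminal after the dot
                    add(scan(item, words, grammar))  # preterminal matching the next word
        return Item("ROOT", ("S",), 1, 0, n) in chart[n]

    # Hypothetical usage:
    # earley_recognize("book the flight through Chicago".split(), GRAMMAR)  # -> True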


Thought Questions
Runtime? Memory? Weighted version? Recovering trees?

Parsing as Search

Implementing Recognizers as Search

  Agenda = { state0 }
  while (Agenda not empty):
      s = pop a state from Agenda
      if s is a success-state:
          return s                         // valid parse tree
      else if s is not a failure-state:
          generate new states from s
          push new states onto Agenda
  return nil                               // no parse!
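An executable rendering of this generic agenda loop (a sketch; is_success, is_failure, and expand are hypothetical callbacks that a particular recognizer would supply, not names from the slides):

    def agenda_search(initial_state, is_success, is_failure, expand):
        """Generic agenda-driven recognizer: explore states until one succeeds."""
        agenda = [initial_state]
        while agenda:
            s = agenda.pop()              # LIFO pop gives depth-first search;
            if is_success(s):             # popping from the front would give breadth-first
                return s                  # a valid parse state
            if not is_failure(s):
                agenda.extend(expand(s))  # generate successor states from s
        return None                       # no parse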

Agenda-Based Probabilistic Parsing

  Agenda = { (item, value) : initial updates from the equations }
      // items take the form [X, i, j]; values are reals
  while (Agenda not empty):
      u = pop an update from Agenda
      if u.item is the goal:
          return u.value                   // probability of the best parse
      else if u.value > Chart[u.item]:
          store Chart[u.item] ← u.value
          if u.item combines with other Chart items:
              generate new updates from u and the items stored in Chart
              push new updates onto Agenda
  return nil                               // no parse!

“States” on the agenda are (possible) updates to the chart.
Best first: order the updates by their values.
Guarantee: the first time you pop an update for an item, you have that item’s final value.
Extension: order updates by value × h, where h introduces more information about which items we prefer. Under some conditions this is faster and still optimal; even when not optimal, performance is sometimes good.
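A sketch of how that best-first ordering might be realized with a priority queue (illustrative only; Python’s heapq is a min-heap, so values are negated to pop the largest first, and the combine step is left as a hypothetical callback that proposes new (item, value) updates):

    import heapq

    def best_first_parse(initial_updates, is_goal, combine):
        """Agenda of (item, value) updates, always popping the highest-valued one.
        combine(item, value, chart) yields new (item, value) updates built from
        the popped item and items already stored in the chart."""
        agenda = [(-value, item) for item, value in initial_updates]
        heapq.heapify(agenda)
        chart = {}
        while agenda:
            neg_value, item = heapq.heappop(agenda)
            value = -neg_value
            if is_goal(item):
                return value                       # probability of the best parse
            if value > chart.get(item, 0.0):
                chart[item] = value                # first pop of an item is its final value
                for new_item, new_value in combine(item, value, chart):
                    heapq.heappush(agenda, (-new_value, new_item))
        return None                                # no parse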

Catalog of CF Parsing Algorithms
Recognition (Boolean) vs. parsing (probabilistic)
Chomsky normal form (CKY) vs. general grammars (Earley’s)
Exhaustive vs. agenda-driven

Dependency Parsing

Treebank Tree
[Figure: phrase-structure tree for “The luxury auto maker last year sold 1,214 cars in the U.S.”]

Headed Tree
[Figure: the same tree with the head child of each constituent marked]

Lexicalized Tree
[Figure: the same tree with each nonterminal annotated with its head word: S_sold, VP_sold, PP_in, NP_maker, NP_year, NP_cars, NP_U.S.]

Dependency Tree

Methods for Dependency Parsing
1. Parse with a phrase-structure parser, using headed / lexicalized rules
   - Reuse algorithms we know
   - Leverage improvements in phrase-structure parsing
2. Maximum spanning tree algorithms
   - Words are nodes, edges are possible links
   - Example: MSTParser
3. Shift-reduce parsing
   - Read words in one at a time; decide to “shift” or “reduce” to incrementally build tree structures
   - Examples: MaltParser, Stanford NN Dependency Parser

Maximum Spanning Tree
Each dependency is an edge.
Assign each edge a goodness score (an ML problem).
The dependencies must form a tree.
Find the highest-scoring tree (Chu-Liu-Edmonds algorithm).
[Figure: Graham Neubig]
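To make the setup concrete, here is a hedged sketch of graph-based scoring in Python: a score function (assumed given, e.g. from a trained model) assigns a value to every head → dependent edge, and the simplest decoder just picks the best head for each word independently. That greedy choice can create cycles, which is exactly what the Chu-Liu-Edmonds algorithm repairs by contracting cycles and re-scoring; the cycle-contraction step is omitted here.

    def greedy_heads(words, score):
        """words: list of tokens (position 0 is an artificial ROOT).
        score(h, d): goodness of an edge from head position h to dependent d.
        Returns heads[d] = chosen head position for each word d >= 1.
        NOTE: without cycle handling this only approximates the true MST."""
        heads = [None] * len(words)
        for d in range(1, len(words)):
            candidates = [h for h in range(len(words)) if h != d]
            heads[d] = max(candidates, key=lambda h: score(h, d))
        return heads

    # Hypothetical usage with an assumed scoring function:
    # words = ["ROOT", "He", "likes", "fish"]
    # heads = greedy_heads(words, score=my_model.edge_score)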

Shift-Reduce Parsing
Two data structures:
  - Buffer: the words still being read in
  - Stack: partially built dependency trees
At each point, choose one of:
  - Shift: move the next word from the buffer onto the stack
  - Reduce-left: combine the top two items on the stack, making the top word the head
  - Reduce-right: combine the top two items on the stack, making the second word the head
Parsing as classification: a classifier says “shift”, “reduce-left”, or “reduce-right”.

Shift-Reduce Parsing
[Figure: example stack and buffer configurations; Graham Neubig]

Parsing as Classification
Given a state (the current stack and buffer): what action is best?
Better classification -> better parsing.
[Figure: an example stack and buffer]
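For illustration, the classifier’s input might be simple features of the top of the stack and the front of the buffer. A minimal sketch (make_feats is a lowercase rendering of the MakeFeats routine referenced in the pseudocode on the next slide; the particular features and the (position, word, POS) triple representation are assumptions made here):

    def make_feats(stack, buffer):
        """Features describing the parser state: the word and POS of the top two
        stack items and of the next buffer word. Stack and buffer entries are
        assumed to be (position, word, POS) triples."""
        feats = {}
        if len(stack) >= 1:
            feats["s0.word"], feats["s0.pos"] = stack[-1][1], stack[-1][2]
        if len(stack) >= 2:
            feats["s1.word"], feats["s1.pos"] = stack[-2][1], stack[-2][2]
        if buffer:
            feats["b0.word"], feats["b0.pos"] = buffer[0][1], buffer[0][2]
        return feats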

Shift-Reduce Algorithm

  def ShiftReduce(buffer):                       # buffer: list of (position, word, POS) triples
      heads = {}                                 # heads[dependent position] = head position
      stack = [(0, "ROOT", "ROOT")]
      while len(buffer) > 0 or len(stack) > 1:
          feats = MakeFeats(stack, buffer)
          action = Predict(feats, weights)       # trained classifier chooses the action
          if action == "shift":                  # assumes the classifier never predicts
              stack.append(buffer.pop(0))        # "shift" when the buffer is empty
          elif action == "reduce_left":
              heads[stack[-2][0]] = stack[-1][0] # top word becomes head of the second
              del stack[-2]
          else:                                  # action == "reduce_right"
              heads[stack[-1][0]] = stack[-2][0] # second word becomes head of the top
              del stack[-1]
      return heads