Probabilistic Earley Parsing
Charlie Kehoe, Spring 2004
Based on the 1995 paper by Andreas Stolcke: "An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities"

Overview
What is this paper all about? Key ideas from the title:
- Context-Free Parsing
- Probabilistic
- Computes Prefix Probabilities
- Efficient

Context-Free Parsing
[Diagram: the sentence "The ball is heavy." is fed to a Parser, which produces a parse tree for "The ball is heavy".]

Context-Free Parsing
[Diagram: the same Parser, now consulting a Grammar, maps "The ball is heavy." to its parse tree.]

Context-Free Parsing
[Diagram: the Parser, now consulting both a Grammar and a Lexicon, maps "The ball is heavy." to its parse tree.]

Context-Free Parsing
What if there are multiple legal parses? Example: "Yair looked over the paper." How does the word "over" function?
[Diagram: two parse trees for "Yair looked over the paper" — in one, "over" is part of the verb ("looked over"); in the other, "over" heads a prepositional phrase ("over the paper").]

Probabilistic Parsing
Use probabilities to find the most likely parse. Store typical probabilities for words and rules. In this case: P = 0.99 for one reading and P = 0.01 for the other.
[Diagram: the same two parse trees of "Yair looked over the paper", annotated with these probabilities.]

Prefix Probabilities
How likely is a partial parse?
[Diagram: the two partial parse trees for the prefix "Yair looked over …".]

Efficiency
The Earley algorithm (upon which Stolcke builds) is one of the most efficient known parsing algorithms. Probabilities allow intelligent pruning of the developing parse tree(s).

Parsing Algorithms
How do we construct a parse tree?
- Work from grammar to sentence (top-down)
- Work from sentence to grammar (bottom-up)
- Work from both ends at once (Earley)
Predictably, Earley works best.

Earley Parsing Overview
Start with a root constituent, e.g. sentence. Until the sentence is complete, repeatedly:
- Predict: expand constituents as specified in the grammar
- Scan: match constituents with words in the input
- Complete: propagate constituent completions up to their parents
Prediction is top-down, while scanning and completion are bottom-up.

Earley Parsing Overview
Earley parsing uses a chart rather than a tree to develop the parse. Constituents are stored independently, indexed by word positions in the sentence.
Why do this?
- Eliminate recalculation when tree branches are abandoned and later rebuilt
- Concisely represent multiple parses
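The predict/scan/complete loop and the position-indexed chart described in the last two slides map directly onto code. Below is a minimal sketch of a (non-probabilistic) Earley recognizer in Python, using the toy grammar and lexicon from the example slides that follow; the names GRAMMAR, LEXICON, and earley are illustrative choices, not from Stolcke's paper.

```python
# Minimal Earley recognizer sketch (no probabilities yet).
# Grammar and lexicon are the toy ones from the example slides;
# all identifiers here are illustrative, not from the paper.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["ART", "N"]],
    "VP": [["V", "ADJ"]],
}
LEXICON = {"the": "ART", "ball": "N", "is": "V", "heavy": "ADJ"}

def earley(words, root="S"):
    # chart[i] holds states (lhs, rhs, dot, origin): a rule, how far it
    # has been matched, and the word position where the constituent began.
    chart = [set() for _ in range(len(words) + 1)]
    for expansion in GRAMMAR[root]:
        chart[0].add((root, tuple(expansion), 0, 0))
    for i in range(len(words) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:
                    # PREDICT (top-down): expand the expected nonterminal.
                    for expansion in GRAMMAR[nxt]:
                        s = (nxt, tuple(expansion), 0, i)
                        if s not in chart[i]:
                            chart[i].add(s); agenda.append(s)
                elif i < len(words) and LEXICON.get(words[i]) == nxt:
                    # SCAN (bottom-up): the next input word matches.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # COMPLETE (bottom-up): advance parents waiting for lhs.
                for plhs, prhs, pdot, porigin in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        s = (plhs, prhs, pdot + 1, porigin)
                        if s not in chart[i]:
                            chart[i].add(s); agenda.append(s)
    return any(lhs == root and origin == 0 and dot == len(rhs)
               for lhs, rhs, dot, origin in chart[len(words)])

print(earley("the ball is heavy".split()))  # True
```

Because states live in the chart rather than in a single tree, an abandoned branch (like the NP → PN attempt in the probabilistic example below) costs the surviving parses nothing.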

Earley Parsing Example
Chart over "the ball is heavy", using the grammar S → NP VP, NP → ART N, VP → V ADJ.
States: S Begin.

Earley Parsing Example (continued)
States: S Begin; NP Begin.

Earley Parsing Example (continued)
States: S Begin; NP Pending; ART Scan.

Earley Parsing Example (continued)
States: S Begin; NP Complete; ART Scan; N Scan.

Earley Parsing Example (continued)
States: S Pending; NP Complete; ART Scan; N Scan.

Earley Parsing Example (continued)
States: S Pending; NP Complete; ART Scan; N Scan; VP Begin.

Earley Parsing Example (continued)
States: S Pending; NP Complete; ART Scan; N Scan; VP Pending; V Scan.

Earley Parsing Example (continued)
States: S Pending; NP Complete; ART Scan; N Scan; VP Complete; V Scan; ADJ Scan.

Earley Parsing Example (continued)
States: S Complete; NP Complete; ART Scan; N Scan; VP Complete; V Scan; ADJ Scan.

Probabilistic Parsing
How do we parse probabilistically?
- Assign probabilities to grammar rules and words in the lexicon
- The grammar and lexicon then "randomly" generate all possible sentences in the language
- P(parse tree) = P(sentence generation)

Probabilistic Parsing Terminology
Earley state: each step of the processing that a constituent undergoes. Examples: starting sentence, half-finished sentence, complete sentence, half-finished noun phrase, etc.
Earley path: a sequence of linked states. Example: the complete parse just described.

Probabilistic Parsing
Can represent the parse as a Markov chain: the Markov assumption (state probability is independent of path) applies, due to the CFG.
[Diagram: a chain of transitions — Predict S gives "S → NP VP Begin"; Predict NP gives "NP → ART N Begin" and "NP → PN Begin" (the PN path is abandoned); Scan "the" and Scan "ball" take the NP from Begin through Half Done to Done; Complete NP takes S to Half Done; etc.]

Probabilistic Parsing
Every Earley path corresponds to a parse tree: P(tree) = P(path).
Assign a probability to each state transition:
- Prediction: probability of the grammar rule
- Scanning: probability of the word in the lexicon
- Completion: accumulated ("inner") probability of the finished constituent
P(path) = product of P(transition)s. (A worked sketch follows the tables on the next slide.)

Probabilistic Parse Example
Grammar:
  Rule          P
  S → NP VP     1.0
  NP → ART N    0.7
  NP → PN       0.3
  VP → V NP     0.4
  VP → V ADJ    0.6
Lexicon:
  word    POS   P
  the     ART   1.0
  is      V     1.0
  ball    N     0.8
  apple   N     0.2
  heavy   ADJ   0.4
  blue    ADJ   0.6
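As a sanity check on P(path) = product of P(transition)s, here is a short sketch that scores the parse of "the ball is heavy" under the tables above. The nested-tuple tree encoding and the helper name tree_prob are illustrative, not from the paper.

```python
# Sketch: P(tree) as the product of the rule and lexicon probabilities
# used in its derivation, per the tables above. The tree encoding and
# the tree_prob helper are illustrative choices.

RULE_P = {("S", ("NP", "VP")): 1.0, ("NP", ("ART", "N")): 0.7,
          ("NP", ("PN",)): 0.3, ("VP", ("V", "NP")): 0.4,
          ("VP", ("V", "ADJ")): 0.6}
WORD_P = {("the", "ART"): 1.0, ("is", "V"): 1.0, ("ball", "N"): 0.8,
          ("apple", "N"): 0.2, ("heavy", "ADJ"): 0.4, ("blue", "ADJ"): 0.6}

def tree_prob(node):
    """node is (label, word) at a leaf, or (label, [child nodes]) above."""
    label, body = node
    if isinstance(body, str):                      # lexical leaf
        return WORD_P[(body, label)]
    p = RULE_P[(label, tuple(child[0] for child in body))]
    for child in body:
        p *= tree_prob(child)
    return p

tree = ("S", [("NP", [("ART", "the"), ("N", "ball")]),
              ("VP", [("V", "is"), ("ADJ", "heavy")])])
print(tree_prob(tree))  # 1.0 * (0.7*1.0*0.8) * (0.6*1.0*0.4) ≈ 0.1344
```

The subproducts 0.7 × 1.0 × 0.8 = 0.56 and 0.6 × 1.0 × 0.4 = 0.24 are exactly the NP Complete and VP Complete values that appear in the chart frames below.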

Probabilistic Parse Example
Chart over "the ball is heavy", using the grammar and lexicon above.
States: S Begin 1.0.

Probabilistic Parse Example (continued)
States: S Begin 1.0; NP Begin 0.7; NP Begin 0.3.

Probabilistic Parse Example (continued)
States: S Begin 1.0; NP Pending 0.7; NP Failed 0.3; ART Scan 1.0.

Probabilistic Parse Example (continued)
States: S Begin 1.0; NP Complete 0.56; ART Scan 1.0; N Scan 0.8.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Begin 0.4; VP Begin 0.6.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Pending 0.4; VP Pending 0.6; V Scan 1.0.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Pending 0.4; VP Pending 0.6; V Scan 1.0; NP Begin 0.7; NP Begin 0.3.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Pending 0.4; VP Pending 0.6; V Scan 1.0; NP Failed 0.7; NP Failed 0.3.

Probabilistic Parse Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Failed 0.4; VP Complete 0.24; V Scan 1.0; ADJ Scan 0.4.

Probabilistic Parse Example (continued)
States: S Complete 0.1344 (= 1.0 × 0.56 × 0.24); NP Complete 0.56; ART Scan 1.0; N Scan 0.8; VP Complete 0.24; V Scan 1.0; ADJ Scan 0.4.

Prefix Probabilities
The algorithm so far reports the parse tree probability only when the sentence is completed. What if we don't have a full sentence? So far, probability is tracked by constituent ("inner"), rather than by path ("forward").

Prefix Probabilities
Solution: track a separate path ("forward") probability. It is computed like the inner probability, but is also propagated down on the prediction step. This is the missing link that chains the path probabilities together.
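Here is a hand-computed sketch of the forward bookkeeping for the prefix "the ball is", assuming the grammar and lexicon from the earlier example; per Stolcke's definition, the prefix probability is the sum of the forward probabilities of the states produced by scanning the last word.

```python
# Forward ("path") probabilities for the prefix "the ball is", using the
# grammar and lexicon tables from the example above. An illustrative
# hand computation, not Stolcke's full algorithm.

f_S  = 1.0              # predict S -> NP VP      (rule P = 1.0)
f_NP = f_S * 0.7        # predict NP -> ART N     (rule P = 0.7)
# The competing prediction NP -> PN (0.3) dies when "the" cannot be
# scanned as a PN, so it contributes nothing (the "NP Failed" states).
f_the  = f_NP * 1.0     # scan "the"  (P(the | ART) = 1.0)
f_ball = f_the * 0.8    # scan "ball" (P(ball | N) = 0.8) -> 0.56

# Prediction is where the path probability propagates DOWN (the missing link):
f_v_np  = f_ball * 0.4  # predict VP -> V NP      -> 0.224
f_v_adj = f_ball * 0.6  # predict VP -> V ADJ     -> 0.336

prefix_prob = (f_v_np + f_v_adj) * 1.0  # scan "is" (P(is | V) = 1.0)
print(prefix_prob)  # ≈ 0.56 = P(sentence starts with "the ball is ...")
```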

Prefix Probability Example
Chart over "the ball is heavy", now tracking two probabilities per state: P inner and P forward (scan states carry no forward probability of their own, shown as N/A).
States: S Begin (inner 1.0).

Prefix Probability Example (continued)
States: S Begin 1.0; NP Begin 0.7.

Prefix Probability Example (continued)
States: S Begin 1.0; NP Pending 0.7; ART Scan 1.0 (forward N/A).

Prefix Probability Example (continued)
States: S Begin 1.0; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A).

Prefix Probability Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A).

Prefix Probability Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A); VP Begin.

Prefix Probability Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A); VP Pending; V Scan 1.0 (N/A).

Prefix Probability Example (continued)
States: S Pending 0.56; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A); VP Complete; V Scan 1.0 (N/A); ADJ Scan 0.4 (N/A).

Prefix Probability Example (continued)
States: S Complete; NP Complete 0.56; ART Scan 1.0 (N/A); N Scan 0.8 (N/A); VP Complete; V Scan 1.0 (N/A); ADJ Scan 0.4 (N/A).

Summary
- Use an Earley chart parser for efficient parsing, even with ambiguous or complex sentences
- Use probabilities to choose among multiple possible parse trees
- Track constituent probability for complete sentences
- Also track path probability for incomplete sentences