BİL711 Natural Language Processing: Statistical Parse Disambiguation

Statistical Parse Disambiguation

Problem:
– How do we disambiguate among a set of parses of a given sentence?
– We want to pick the parse tree that corresponds to the correct meaning.

Possible solutions:
– Pass the problem on to semantic processing.
– Use principle-based disambiguation methods.
– Use a probabilistic model to assign likelihoods to the alternative parse trees and select the best one (or at least rank them). Associating probabilities with the grammar rules gives us such a model.

Probabilistic CFGs

– Associate a probability with each grammar rule. The probability reflects the relative likelihood of using that rule to expand the LHS constituent.
– Assume that for a constituent C we have k grammar rules of the form C → α_i. We are interested in calculating P(C → α_i | C), the probability of using rule i for deriving C.
– Such probabilities can be estimated from a corpus of parse trees:

      P(C → α_i | C) = Count(C → α_i) / Count(C)
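
Below is a minimal sketch (not part of the original slides) of how these rule probabilities can be read off a treebank by counting, assuming parse trees are represented as nested Python tuples; the function names and the toy two-tree treebank are illustrative only.

```python
from collections import defaultdict

def count_rules(tree, counts):
    """Recursively count the grammar rules used in one parse tree.
    A tree is a (label, child, child, ...) tuple; a leaf is a bare string."""
    label, children = tree[0], tree[1:]
    # The RHS is the sequence of child labels (or words, for lexical rules).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[label][rhs] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, counts)

def estimate_pcfg(treebank):
    """MLE estimate: P(C -> alpha_i | C) = Count(C -> alpha_i) / Count(C)."""
    counts = defaultdict(lambda: defaultdict(int))
    for tree in treebank:
        count_rules(tree, counts)
    probs = {}
    for lhs, expansions in counts.items():
        total = sum(expansions.values())
        for rhs, n in expansions.items():
            probs[(lhs, rhs)] = n / total
    return probs

# Toy "treebank": the two parses of "book book" used later in these slides.
treebank = [
    ("S", ("NP", ("Noun", "book")), ("VP", ("Verb", "book"))),
    ("S", ("VP", ("Verb", "book"), ("NP", ("Noun", "book")))),
]
for (lhs, rhs), p in sorted(estimate_pcfg(treebank).items()):
    print(lhs, "->", " ".join(rhs), p)
```

By construction, the estimated expansions of each non-terminal sum to 1, which is exactly the property stated on the next slide.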

Probabilistic CFGs (cont.)

– Attach probabilities to grammar rules.
– The expansions for a given non-terminal sum to 1:

      VP → Verb          .55
      VP → Verb NP       .40
      VP → Verb NP NP    .05

Assigning Probabilities to Parse Trees

– Assume that the probability of a constituent is independent of the context in which it appears in the parse tree.
– The probability of a constituent C′ that was constructed from A1′, …, An′ using the rule C → A1 … An is:

      P(C′) = P(C → A1 … An | C) · P(A1′) · … · P(An′)

– At the leaves of the tree, we use the POS probabilities P(C | wi).

Assigning Probabilities to Parse Trees (cont.)

– A derivation (tree) consists of the set of grammar rules that are in the tree.
– The probability of a derivation (tree) is just the product of the probabilities of the rules in the derivation.

Assigning Probabilities to Parse Trees (Example Grammar)

      S    → NP VP     0.6
      S    → VP        0.4
      NP   → Noun      1.0
      VP   → Verb      0.3
      VP   → Verb NP   0.7
      Noun → book      0.2
      Verb → book      0.1

Parse Trees for an Input: "book book"

Parse 1: [S [NP [Noun book]] [VP [Verb book]]]
      P([Noun book]) = P(Noun → book) = 0.2
      P([Verb book]) = P(Verb → book) = 0.1
      P([NP [Noun book]]) = P(NP → Noun) · P([Noun book]) = 1.0 · 0.2 = 0.2
      P([VP [Verb book]]) = P(VP → Verb) · P([Verb book]) = 0.3 · 0.1 = 0.03
      P([S [NP [Noun book]] [VP [Verb book]]]) = P(S → NP VP) · 0.2 · 0.03 = 0.6 · 0.2 · 0.03 = 0.0036

Parse 2: [S [VP [Verb book] [NP [Noun book]]]]
      P([VP [Verb book] [NP [Noun book]]]) = P(VP → Verb NP) · 0.1 · 0.2 = 0.7 · 0.1 · 0.2 = 0.014
      P([S [VP [Verb book] [NP [Noun book]]]]) = P(S → VP) · 0.014 = 0.4 · 0.014 = 0.0056

The second parse is therefore the more probable of the two (0.0056 > 0.0036).
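
To make the arithmetic above concrete, here is a small self-contained sketch (not from the original slides) that encodes the example grammar and recomputes both parse probabilities as products of rule probabilities; the tuple-based tree representation is an assumption of this sketch, not anything prescribed by the slides.

```python
# Rule probabilities copied from the example grammar two slides back.
RULE_P = {
    ("S",    ("NP", "VP")):   0.6,
    ("S",    ("VP",)):        0.4,
    ("NP",   ("Noun",)):      1.0,
    ("VP",   ("Verb",)):      0.3,
    ("VP",   ("Verb", "NP")): 0.7,
    ("Noun", ("book",)):      0.2,
    ("Verb", ("book",)):      0.1,
}

def tree_prob(tree):
    """Probability of a parse tree = product of the probabilities of its rules."""
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_P[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

t1 = ("S", ("NP", ("Noun", "book")), ("VP", ("Verb", "book")))   # parse 1
t2 = ("S", ("VP", ("Verb", "book"), ("NP", ("Noun", "book"))))   # parse 2
print(tree_prob(t1))  # 0.6 * 1.0 * 0.2 * 0.3 * 0.1 = 0.0036
print(tree_prob(t2))  # 0.4 * 0.7 * 0.1 * 1.0 * 0.2 = 0.0056
```

(Floating-point rounding may show a few extra trailing digits, but the values match the hand computation above.)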

Problems with Probabilistic CFG Models

– The main problem with the probabilistic CFG model is that it does not take contextual effects into account.
– Example: pronouns are much more likely to appear in the subject position of a sentence than in object position, but in a PCFG the rule NP → Pronoun has only one probability.
– One simple possible extension: make the probabilities dependent on the first word of the constituent. Instead of P(C → α_i | C), use P(C → α_i | C, w), where w is the first word in C.
– Example: the rule VP → V NP PP is used 93% of the time with the verb put, but only 10% of the time for like.
– This requires estimating a much larger set of probabilities, but it can significantly improve disambiguation performance.
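
As a minimal illustration of the word-conditioned probabilities just described (not from the original slides), the lookup below uses the put/like figures quoted on the slide; the 0.25 backoff value for unseen verbs and the dictionary layout are hypothetical.

```python
# Unlexicalized P(rule | LHS), used as a backoff (the 0.25 is an invented value).
P_RULE = {("VP", ("V", "NP", "PP")): 0.25}

# Word-conditioned P(rule | LHS, w): VP -> V NP PP is used 93% of the time
# with "put" but only 10% of the time with "like" (figures from the slide).
P_RULE_LEX = {
    ("VP", ("V", "NP", "PP"), "put"):  0.93,
    ("VP", ("V", "NP", "PP"), "like"): 0.10,
}

def rule_prob(lhs, rhs, word):
    """Use the word-conditioned probability if we have one, else back off."""
    return P_RULE_LEX.get((lhs, rhs, word), P_RULE[(lhs, rhs)])

print(rule_prob("VP", ("V", "NP", "PP"), "put"))   # 0.93
print(rule_prob("VP", ("V", "NP", "PP"), "eat"))   # unseen verb -> 0.25 backoff
```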

Probabilistic Lexicalized CFGs

– A solution to some of the problems with probabilistic CFGs is to use probabilistic lexicalized CFGs.
– Use the probabilities of particular words in the computation of the probabilities in the derivation.

Example
(The original slide showed only a figure: a lexicalized parse tree in which each constituent is annotated with its head word, e.g. VP(dumped), NP(sacks), PP(in), as referenced on the next slide. The figure is not reproduced in this transcript.)

How to find the probabilities?

– We used to have VP → V NP PP with P(r | VP): the count of this rule divided by the number of VPs in a treebank.
– Now we have VP(dumped) → V(dumped) NP(sacks) PP(in), i.e. P(r | VP ∧ dumped is the verb ∧ sacks is the head of the NP ∧ in is the head of the PP).
– Such rules are not likely to have significant counts in any treebank.

Subcategorization

– When stuck, exploit independence and collect the statistics you can.
– We'll focus on capturing two things:
  – Verb subcategorization: particular verbs have affinities for particular VP expansions.
  – Objects' affinities for their predicates (mostly their mothers and grandmothers): some objects fit better with some predicates than others.
– Condition particular VP rules on their head, so r: VP → V NP PP with P(r | VP) becomes P(r | VP ∧ dumped).
– What's the count? How many times this rule was used with dump, divided by the total number of VPs that dump appears in (a small counting sketch follows below).
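
As a rough sketch of that count (not from the original slides), the following estimates P(r | VP ∧ head verb) from a toy list of (head verb, VP expansion) observations; the data and function name are illustrative, and a real system would collect the counts from a parsed corpus and smooth or back off for unseen combinations.

```python
from collections import Counter

# Toy observations: for each VP in a (hypothetical) treebank, the head verb
# and the expansion used.
vp_observations = [
    ("dumped", ("V", "NP", "PP")),
    ("dumped", ("V", "NP", "PP")),
    ("dumped", ("V", "NP")),
    ("like",   ("V", "NP")),
    ("like",   ("V", "S")),
]

rule_counts = Counter(vp_observations)                 # Count(rule used with verb)
head_counts = Counter(v for v, _ in vp_observations)   # Count(VPs headed by verb)

def p_rule_given_head(rhs, verb):
    """P(VP -> rhs | VP, head=verb) = Count(rule with verb) / Count(VPs with verb)."""
    if head_counts[verb] == 0:
        return 0.0  # unseen verb; a real parser would smooth or back off here
    return rule_counts[(verb, rhs)] / head_counts[verb]

print(p_rule_given_head(("V", "NP", "PP"), "dumped"))  # 2/3
print(p_rule_given_head(("V", "NP", "PP"), "like"))    # 0.0 -> needs smoothing
```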