Syntactic Parsing: Part I Niranjan Balasubramanian Stony Brook University March 8th and March 22nd, 2016 Many slides adapted from: Ray Mooney, Michael Collins, and Chris Manning.



Overview
Syntactic structure (of English): phrase structure and how phrases compose to form sentences.
A context-free grammar for capturing English phrase structure.
Algorithms for parsing (identifying phrase structure) using a CFG.
Why is parsing difficult?

Regularities in language
Word n-grams model regularities in word sequences.
Part-of-speech n-grams model regularities in word-category sequences.
Language has richer structure:
–Phrases: there are groupings of words in sentences that behave like a unit.
–Dependencies: words in a sentence have specific syntactic (and semantic) relationships.

Constituency View of Syntactic Structure
I shot an elephant
–Position: An elephant I shot [Poetic]
–Expansion: I shot a green elephant [Perceptive]
–I shot an elephant that ate my lunch [Vengeful]
Constituents are groups of words (phrases) that behave as a single unit.

Types of Phrases
Noun phrases: The long second assignment that Niranjan gave us was quite interesting. [Sarcastic]
Headed by a noun; has an optional determiner, adjectives (adjective phrases), and post-modifiers (relative clauses, prepositional phrases).
Verb phrases: Everyone absolutely loved CS390. [Optimistic]
Headed by a verb; has other modifiers (adverbs) and object nouns, but no subject noun phrase.

Types of Phrases
Prepositional phrases: I ate pasta with chopsticks. [Chef Bala]
Headed by a preposition, followed by a noun phrase that completes it. Usually part of verb or noun phrases; usually conveys spatial or temporal information.
Adjectival phrases: SAC pizza is edible. [Fact] SAC pizza is quite close to being edible. [Opinion]
Headed by the adjective and can sometimes include other types of phrases.

Two take-aways about phrases
Each type of phrase has some typical structure, e.g., NP: DT* (ADJ)* NN.
Phrases can nest within each other, e.g., an NP can contain another NP or a PP.
Recursion: you can keep expanding phrases forever.
–This can lead to separating words that are dependent on each other: The green salad served to the workers was not fresh.

Why do we care about this syntactic structure?
Many aspects of meaning can be learnt using the syntactic structure.
–The NP preceding the VP is likely the subject of the action.
–The NP following the VP is likely the object of the action.
Knowing the basic units is helpful in modeling language.
–You can use this to predict or complete the sentence.
–Re-organize sentences or simplify them.
Many, many NLP applications use syntactic structure to make decisions: relation extraction, question answering, machine translation, semantic role labeling, …

Capturing Phrase Structure using Grammar
A grammar is a way to specify the acceptable or valid strings in a language. Phrase types have specific structures, so we can hope to write down rules for the valid ways in which phrases can be formed.
How to generate noun phrases (NP) in English? Specify re-write rules:
NP → NP PP
NP → DT NN      NN → hat
NP → DT NNS     NNS → cats
DT → the        PP → IN NP
DT → a          IN → in
Apply rules starting from the NP symbol to generate phrases. If a given phrase can be generated by applying these rules, then it is an NP.
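The rewrite rules above can be sketched directly in code. A minimal sketch (not from the slides) that enumerates every phrase the toy NP grammar derives, with recursion bounded by a depth limit so that the recursive NP → NP PP rule cannot loop forever:

```python
from itertools import product

# Toy NP grammar from the slide; symbols not in `rules` are terminals (words).
rules = {
    "NP": [["NP", "PP"], ["DT", "NN"], ["DT", "NNS"]],
    "PP": [["IN", "NP"]],
    "DT": [["the"], ["a"]],
    "NN": [["hat"]],
    "NNS": [["cats"]],
    "IN": [["in"]],
}

def expand(symbol, depth):
    """All word sequences derivable from `symbol` within `depth` nested expansions."""
    if symbol not in rules:          # terminal: yields itself
        return [[symbol]]
    if depth == 0:                   # budget exhausted: no derivations from here
        return []
    out = []
    for rhs in rules[symbol]:
        # every combination of expansions of the right-hand-side symbols
        for combo in product(*(expand(s, depth - 1) for s in rhs)):
            out.append([w for part in combo for w in part])
    return out

phrases = {" ".join(p) for p in expand("NP", 4)}
print(sorted(phrases))  # includes "the hat", "a cats in the hat", ...
```

Note that the grammar happily derives "a cats": CFG rules of this kind encode structure, not determiner-noun agreement.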

Context-free Grammar
Always starts with a special sentence symbol 'S'.
Has terminal and non-terminal symbols.
–Terminals correspond to words in the language; they don't expand.
–Non-terminals are syntactic categories (or phrase types).
Is a set of rules with an LHS and an RHS.
–The LHS is a single non-terminal.
–The RHS is a combination of non-terminal and terminal symbols.
A valid sentence is one that can be derived from the grammar.
–The structure of the sentence is captured by the derivation. Often this derivation or structure is represented as a tree.
–Parsing is the task of finding the derivation that leads to the sentence.

Context-Free Grammar: Formally
N: a set of non-terminal symbols (or variables) [NP, VP, PP, etc.]
Σ: a set of terminal symbols (disjoint from N) [words]
R: a set of productions or rules of the form A → β [NP → DT NN], where A is a non-terminal and β is a string of symbols from (Σ ∪ N)*
S: a designated non-terminal called the start symbol
Strings that can be generated by applying a sequence of rules from R are said to be in the language of the grammar. Parsing becomes the task of identifying whether a string is generated by the grammar (and recovering the sequence of rules that generated it).

So, are we done? What is so hard about this?
Write a grammar and figure out which set of rules leads to the sentence. [Easy-peasy-lemon-squeezy? Not.]
–Chomsky tried this in his PhD thesis in the 1950s.
–Wrote symbolic grammars (CFG or often richer) and lexicons.
–Used grammar/proof systems to prove parses from words.
Ambiguity + exceptions:
–Most grammars lead to many, many derivations for a given sentence.
–There are always exceptions to cover, and the grammar grows bigger! [All grammars leak! – Sapir]
E.g., Fed raises interest rates 0.5% in effort to control inflation:
–Minimal grammar: 36 parses
–Simple 10-rule grammar: 592 parses
–Real-size broad-coverage grammar: millions of parses [Inflation indeed!]
Small grammars: easy to write but less coverage. Broad-coverage grammars: hard to write and produce many parses.

What is so hard about this? Ambiguity. Warning: You are about to see (possibly) disturbing images. I shot an elephant in my pajamas.

Ambiguity
Two parse trees for I shot an elephant in my pajamas: one attaches the PP in my pajamas inside the NP (an elephant in my pajamas), the other attaches it to the VP (the shooting happened in my pajamas).

Probabilistic CFG: A solution for ambiguity
Some trees (derivations or parses) are more likely than others; some rules are more frequent than others.
Key idea: associate a probability with each rule.

Parsing with PCFG
Goal: argmax Pr(Tree | Sentence). Brute-force search is not efficient! Many trees might share edges, so we might be able to save computations. Top-down and bottom-up algorithms can explore the possible trees without fully generating all possible derivations.

What are some ways to parse given a CFG?
Top-down parsing
Bottom-up parsing
Dynamic programming:
–CYK (or CKY) parsing [bottom-up]
–Earley algorithm [top-down]

Top Down Parsing
Start with the start symbol. Apply rules as long as they lead to the correct sequence of words. If there is a mismatch with what is observed, backtrack. Keep iterating until you have generated the sentence or you have exhausted all paths.
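The procedure above can be sketched as a backtracking recursive-descent recognizer. This is a minimal sketch, using a trimmed-down hypothetical version of the slide's toy grammar; it deliberately omits the left-recursive VP → VP PP rule, which would send this naive top-down parser into an infinite loop:

```python
# Trimmed-down toy grammar (no left recursion); words are the terminals.
grammar = {
    "S": [["VP"]],
    "VP": [["Verb", "NP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "Verb": [["book"]],
    "Det": [["that"]],
    "Noun": [["flight"]],
}

def derives(symbols, words):
    """Can this sequence of symbols be rewritten into exactly this word sequence?"""
    if not symbols:
        return not words                      # success only if all words are consumed
    first, rest = symbols[0], symbols[1:]
    if first not in grammar:                  # terminal: must match the next word
        return bool(words) and words[0] == first and derives(rest, words[1:])
    # non-terminal: try each rule in turn, backtracking on mismatch
    return any(derives(rhs + rest, words) for rhs in grammar[first])

print(derives(["S"], ["book", "that", "flight"]))  # True
print(derives(["S"], ["book", "flight"]))          # False
```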

Top Down Parsing
Target sentence: book that flight
Resulting tree: (S (VP (Verb book) (NP (Det that) (Nominal (Noun flight)))))
Idea: do not explore paths that cannot lead to the sentence.

Top Down Parsing: trace for book that flight
Grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Pronoun
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun
Nominal → Nominal Noun
Nominal → Nominal PP
VP → Verb
VP → Verb NP
VP → VP PP
PP → Prep NP
Det → the | a | that | this
Noun → book | flight | meal
Verb → book | prefer
Pronoun → I | he | she | me
Aux → does
Search (X marks a mismatch that forces backtracking):
1. S → NP VP, NP → Pronoun: no pronoun rewrites to book. X
2. S → NP VP, NP → Proper-Noun: no proper noun rewrites to book. X
3. S → NP VP, NP → Det Nominal: no determiner rewrites to book. X
4. S → Aux NP VP: no aux rewrites to book. X
5. S → VP, VP → Verb: Verb → book matches, but that is left unexplained. X
6. S → VP, VP → Verb NP: Verb → book; NP → Det Nominal; Det → that; Nominal → Noun; Noun → flight. Success!
And on it goes until the full sentence is derived. Rules used: S → VP, VP → Verb NP, Verb → book, NP → Det Nominal, Det → that, Nominal → Noun, Noun → flight.

Top down and Bottom up
Both approaches save some exploration!
Top-down never explores options that will not lead to a full parse,
–but will explore many options that never connect to the actual sentence,
–and can get into left recursion.
Bottom-up never explores options that do not connect to the actual sentence,
–but can explore options that can never lead to a full parse.
Which search wastes more depends on how the grammar branches.
–With too many productions for each non-terminal, top-down will suffer.
–With too many right-hand sides in which a non-terminal appears, bottom-up will suffer.
Straightforward implementations can take exponential time!

CKY Intuition
Imagine your goal is just to identify the nested phrase structure (without labeling the phrases):
A B C D E F
[A] [B C] [D] [E F]
[A B C] [D E F]
[A B C D E F]

CKY Intuition
The optimal solution must include one of the following top-level splits:
[A][BCDEF]  c[1,1] c[2,6]
[AB][CDEF]  c[1,2] c[3,6]
[ABC][DEF]  c[1,3] c[4,6]
[ABCD][EF]  c[1,4] c[5,6]
[ABCDE][F]  c[1,5] c[6,6]
Each of the sub-phrases must also be optimal. The optimum for span i through j composes the optima of the two sides of each possible split point k:
opt(i, j) = max over i ≤ k < j of opt(i, k) · opt(k+1, j)

CKY Parser: Composing from sub-phrases
Book the flight through Houston
c[i,j] contains all constituents for words i through j. Possibilities for c[1,4]:
c[1,3], c[4,4]
c[1,2], c[3,4]
c[1,1], c[2,4]

CKY Parser: Table-filling order
Book the flight through Houston
c[i,j] contains all constituents for words i through j. Fill cells for shorter spans before the longer spans that build on them.

CKY Parser: Example
Grammar (CNF):
S → NP VP
S → X1 VP
X1 → Aux NP
S → book | include | prefer
S → Verb NP
S → VP PP
NP → I | he | she | me
NP → Houston | NWA
NP → Det Nominal
Nominal → book | flight | meal | money
Nominal → Nominal Noun
Nominal → Nominal PP
VP → book | include | prefer
VP → Verb NP
VP → VP PP
PP → Prep NP
Verb → book
Noun → book
Det → the
Prep → through
Filling the chart for Book the flight through Houston, shorter spans first:
Book: S, VP, Verb, Nominal, Noun
the: Det
flight: Nominal, Noun
through: Prep
Houston: NP, ProperNoun
the flight: NP
Book the flight: VP, S
through Houston: PP
flight through Houston: Nominal
the flight through Houston: NP
Book the flight through Houston: VP, S, S (cells like Book the hold no constituent: None)
The top cell holds two S entries, i.e., Parse Tree #1 and Parse Tree #2: the PP through Houston attaches either to the VP or inside the NP.

CKY Parser: Complexity? 1) How many cells? 2) How much work to be done in each cell?
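One way to work through the two questions is as back-of-the-envelope arithmetic, assuming the 5-word running example:

```python
# Illustrative arithmetic for CKY complexity on a sentence of n words.
n = 5                              # e.g., "Book the flight through Houston"
cells = n * (n + 1) // 2           # one cell per span [i, j] with i <= j: O(n^2) cells
max_splits = n - 1                 # at most n-1 split points k per cell: O(n) work,
                                   # times the number of grammar rules |G|
print(cells, max_splits)           # overall CKY cost: O(n^3 * |G|)
```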

Chomsky Normal Form
Original grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Pronoun
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun
Nominal → Nominal Noun
Nominal → Nominal PP
VP → Verb
VP → Verb NP
VP → VP PP
PP → Prep NP
Chomsky Normal Form:
S → NP VP
S → X1 VP
X1 → Aux NP
S → book | include | prefer
S → Verb NP
S → VP PP
NP → I | he | she | me
NP → Houston | NWA
NP → Det Nominal
Nominal → book | flight | meal | money
Nominal → Nominal Noun
Nominal → Nominal PP
VP → book | include | prefer
VP → Verb NP
VP → VP PP
PP → Prep NP
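The X1 trick above (S → X1 VP, X1 → Aux NP) can be sketched as code. A minimal binarization pass (hypothetical helper, not from the slides); a full CNF conversion would also remove unit rules like S → VP and handle mixed terminal/non-terminal right-hand sides, which is omitted here:

```python
# Binarize rules whose right-hand side has more than two symbols by introducing
# fresh intermediate non-terminals (X1, X2, ...), as in the slide's S -> X1 VP.
def binarize(rules):
    out, counter = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            new = f"X{counter}"
            out.append((new, rhs[:2]))    # e.g., X1 -> Aux NP
            rhs = [new] + rhs[2:]         # e.g., S -> X1 VP (loop again if still long)
        out.append((lhs, rhs))
    return out

print(binarize([("S", ["Aux", "NP", "VP"])]))
# [('X1', ['Aux', 'NP']), ('S', ['X1', 'VP'])]
```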

CKY: an algorithm for parsing given a grammar. How to address the ambiguity problem?
–Use some way to score the parses.
–If you cannot decompose the scores into the DP steps, then we are done for!
Scoring: probabilistic context-free grammar (PCFG).

Parsing and Scoring PCFGs: learning the grammar (rule probabilities) from training data.

CKY Parser for PCFG
1) Each cell stores the maximum (optimal) parse for that span.
2) The score of a parse over a span is the probability of the rule times the probabilities of the parses of the sub-spans. For example, with sub-span cells C(2,2) = Y1, C(3,4) = Z1, C(2,3) = Y2, C(4,4) = Z2:
C(2,4) = max { C(2,2) × C(3,4) × Pr(X1 → Y1 Z1), C(2,3) × C(4,4) × Pr(X2 → Y2 Z2) }

Issues with PCFGs Makes strong independence assumptions about language –Lexical independence –Structural independence

Lexical Dependence: PP Attachment Ambiguity workers dumped sacks into a bin

Lexical Dependence: PP Attachment Ambiguity Prob(Tree 1, Sentence) = …. x Prob(VP -> VP PP | VP) x …

Lexical Dependence: PP Attachment Ambiguity Prob(Tree 2, Sentence) = …. x Prob(NP -> NP PP | NP) x …

Lexical Dependence: PP Attachment Ambiguity Prob(Tree 1, Sentence) > Prob(Tree 2, Sentence) if Prob(VP -> VP PP | VP) > Prob(NP -> NP PP | NP)

Lexical Dependence: Co-ordination Ambiguity

Structural Preferences
president of [company in Africa] vs. [president of company] in Africa
The same rules are applied in both cases, so the tree probabilities are the same, yet the left structure is twice as likely in the Wall Street Journal. There are similar issues when PPs can attach to multiple verbs.

Structural Dependence

Structural Dependence: Sub-categorization
Specific verbs take some types of arguments but not others.
–Intransitive, transitive, and di-transitive.
–Finite vs. non-finite verbs.
A generic VP label hides the different argument preferences of the various sub-categories.

How to address these issues?
Lexical dependence:
–Introduce lexical items into the tree.
–Use headwords of constituents as part of the node label. [Charniak 1997]
Structural dependence:
–Add more information to non-terminal categories [state splitting].
–Include information about parents. [Johnson 1998]
–Include fine-grained information (mark possessives, for example).
Trade-off: adding lexical information and fine-grained categories a) increases sparsity (need appropriate smoothing) and b) adds more rules (can affect parsing speed).
Sub-categorization:
–Add information to the non-terminal categories [state splitting].
–E.g., S → NP_firstpersonsingular VP_firstpersonsingular
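The "include information about parents" idea [Johnson 1998] can be sketched as a simple tree transform. Trees here are nested tuples, a representation chosen just for this sketch:

```python
# Parent annotation: split each non-terminal by its parent category,
# e.g. an NP under S becomes NP^S, while an NP under VP becomes NP^VP,
# letting the PCFG learn different expansion statistics for the two.
def annotate(tree, parent=None):
    if isinstance(tree, str):                 # a word: leave unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label,) + tuple(annotate(c, label) for c in children)

t = ("S",
     ("NP", "I"),
     ("VP", ("VBD", "shot"),
            ("NP", ("DT", "an"), ("NN", "elephant"))))
print(annotate(t))  # NP under S -> NP^S; NP under VP -> NP^VP
```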

Lexicalized Charniak Parser
Key idea: identify the heads of constituents and use them to condition probabilities. There are a handful of rules that specify how to identify heads. The probability of a lexicalized parse tree is computed using these two quantities:
P(cur_head = profits | cur_category = NP, parent_head = rose, parent_category = S)
P(rule = r_i | cur_head = profits, cur_category = NP, parent_category = S)