CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 23, 24– Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing)


CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 23, 24– Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 8th, 10th March, 2011 (Lectures 21 and 22 were on Sentiment Analysis by Aditya Joshi)

A note on Language Modeling. Example sentence: “^ The tortoise beat the hare in the race.”

Guided by frequency — N-gram (n=3):
^ the tortoise 5 × 10^-3
the tortoise beat 3 × 10^-2
tortoise beat the 7 × 10^-5
beat the hare 5 × 10^-6

Guided by language knowledge — CFG:
S → NP VP
NP → DT N
VP → V NP PP
PP → P NP

Guided by frequency and language knowledge — Probabilistic CFG:
S → NP VP 1.0
NP → DT N 0.5
VP → V NP PP 0.4
PP → P NP 1.0

Guided by world knowledge — Dependency Grammar: semantic roles (agt, obj, sen, etc.); semantic relations are always between “heads”.

Probabilistic DG: semantic roles with probabilities.

Parse Tree
[S [NP [DT The] [N tortoise]] [VP [V beat] [NP [DT the] [N hare]] [PP [P in] [NP [DT the] [N race]]]]]

UNL Expression
agt(beat@past, tortoise@def)
obj(beat@past, hare@def)
scn(beat@past, race@def)    (scn = scene)

Purpose of LM
Prediction of the next word (speech processing)
Language identification (for the same script)
Belongingness check (parsing)
P(NP → DT N) is the probability that the ‘yield’ of the non-terminal NP is DT N.

Need for Deep Parsing
Sentences are linear structures.
But there is a hierarchy, a tree, hidden behind the linear structure.
There are constituents and branches.

PPs are at the same level: flat with respect to the head word “book”; no distinction in terms of dominance or c-command.
[The big book of poems with the blue cover] is on the table.
Flat structure: [NP The [AP big] book [PP of poems] [PP with the blue cover]]

The “constituency test of replacement” runs into problems.
One-replacement: I bought the big [book of poems with the blue cover], not the small [one]. Here one-replacement targets “book of poems with the blue cover”.
Another one-replacement: I bought the big [book of poems] with the blue cover, not the small [one] with the red cover. Here one-replacement targets “book of poems”.

More deeply embedded structure:
[NP The [N'1 [AP big] [N'2 [N'3 [N book] [PP of poems]] [PP with the blue cover]]]]

Grammar and Parsing Algorithms

A simplified grammar:
S → NP VP
NP → DT N | N
VP → V ADV | V

Example sentence: “People laugh”
Positions: 1 people 2 laugh 3
Lexicon: people – N, V; laugh – N, V
The numbers mark positions between words; the lexicon indicates that both noun and verb are possible for “people” and for “laugh”.

Top-Down Parsing (the number in each state is the position of the input pointer)

State                 Backup State      Action
1.  ((S) 1)           −                 −
2.  ((NP VP) 1)       −                 −
3a. ((DT N VP) 1)     ((N VP) 1)        −
3b. ((N VP) 1)        −                 −
4.  ((VP) 2)          −                 Consume “people”
5a. ((V ADV) 2)       ((V) 2)           −
6.  ((ADV) 3)         ((V) 2)           Consume “laugh” (dead end; backtrack)
5b. ((V) 2)           −                 −
6.  ((.) 3)           −                 Consume “laugh”

Termination condition: all input consumed and no symbols remaining. Note: input symbols can be pushed back on backtracking.

Discussion of Top-Down Parsing
This kind of search is goal-driven.
It gives importance to textual precedence (rule precedence).
It has no regard for the data a priori, so useless expansions are made.
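The backtracking procedure in the table above is easy to sketch in code. Below is a minimal Python sketch (an illustration, not from the original slides): the state is the list of pending symbols plus the remaining input, and the backup states correspond to the untried expansions explored by `any`.

```python
# Minimal top-down, depth-first parser for the slide grammar:
#   S -> NP VP ; NP -> DT N | N ; VP -> V ADV | V
# Lexical ambiguity is handled by listing every category of each word.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["DT", "N"], ["N"]],
    "VP": [["V", "ADV"], ["V"]],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}}

def parse(symbols, words):
    """Return True if `symbols` can derive exactly `words` (backtracking)."""
    if not symbols:                      # no goals left:
        return not words                 # succeed iff the input is exhausted
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                 # non-terminal: try each expansion
        return any(parse(exp + rest, words) for exp in GRAMMAR[first])
    # terminal (POS tag): try to consume the next word
    return bool(words) and first in LEXICON.get(words[0], set()) \
        and parse(rest, words[1:])

print(parse(["S"], ["people", "laugh"]))   # True
```

Note that this naive search loops forever on a left-recursive rule such as A → A B, which is exactly the problem discussed below.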

Bottom-Up Parsing
Some conventions: N12 denotes an N spanning positions 1 to 2. In a dotted item such as S1? → NP12 ° VP2?, the subscripts record positions: the part to the left of the dot (°) is work already done, the part to the right is work remaining, and “?” marks an end position that is not yet known.

Bottom-Up Parsing (pictorial representation)
Positions: 1 People 2 laugh 3
Items: N12, V12 (people); N23, V23 (laugh)
NP12 → N12 °
NP23 → N23 °
VP12 → V12 °
VP23 → V23 °
S1? → NP12 ° VP2?
S → NP12 VP23 °

Problem with Top-Down Parsing: Left Recursion
Suppose the grammar has the rule A → A B. Then the expansion proceeds as
((A) K) → ((A B) K) → ((A B B) K) → …
without ever consuming input.
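One crude remedy, sketched below under the assumption that the grammar has no ε-rules: a goal list longer than the remaining input can never succeed, so such expansions can be pruned, which bounds the left-recursive descent. This is an illustrative guard, not the fix used on the slides.

```python
# Top-down parser with a pruning guard against left recursion.
# Assumes an epsilon-free grammar: each symbol derives at least one word,
# so a goal list longer than the remaining input can never succeed.
def parse(grammar, lexicon, symbols, words):
    if len(symbols) > len(words):        # pruning guard
        return False
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in grammar:
        return any(parse(grammar, lexicon, exp + rest, words)
                   for exp in grammar[first])
    return bool(words) and first in lexicon.get(words[0], set()) \
        and parse(grammar, lexicon, rest, words[1:])

# Left-recursive grammar A -> A B | a, with terminals as their own tags.
GRAMMAR = {"A": [["A", "B"], ["a"]], "B": [["b"]]}
LEXICON = {"a": {"a"}, "b": {"b"}}
print(parse(GRAMMAR, LEXICON, ["A"], ["a", "b", "b"]))   # True, and terminates
```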

Combining top-down and bottom-up strategies

Top-Down Bottom-Up Chart Parsing
Combines the advantages of top-down and bottom-up parsing.
Does not work in case of left recursion.
Example: “People laugh”; people – noun, verb; laugh – noun, verb.
Grammar:
S → NP VP
NP → DT N | N
VP → V ADV | V

Transitive Closure
Positions: 1 People 2 laugh 3
Chart edges:
At 1: S → ° NP VP; NP → ° N; NP → ° DT N
Spanning 1–2 (“people”): NP → N °; S → NP ° VP
At 2: VP → ° V; VP → ° V ADV
Spanning 2–3 (“laugh”): VP → V °; S → NP VP ° (success)

Arcs in Parsing
Each arc represents a dotted rule (an edge in the chart) which records:
Completed work (left of the dot °)
Expected work (right of the dot °)

Example: “People laugh loudly”
Positions: 1 people 2 laugh 3 loudly 4
Chart edges:
At 1: S → ° NP VP; NP → ° N; NP → ° DT N
Spanning 1–2: NP → N °; S → NP ° VP
At 2: VP → ° V; VP → ° V ADV
Spanning 2–3: VP → V °; VP → V ° ADV; S → NP VP °
Spanning 2–4: VP → V ADV °
Spanning 1–4: S → NP VP °
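A chart parser of this kind can be written compactly in the Earley style. The following Python sketch is an illustration, not the slides' exact algorithm: each edge is a dotted rule with a start position, and the predict/scan/complete steps fill the chart position by position.

```python
# A compact Earley-style chart recognizer for the slide grammar.
# An edge (lhs, rhs, dot, start) means  lhs -> rhs[:dot] . rhs[dot:]
# spanning input positions start..i (0-based between-word positions).
GRAMMAR = {"S": [("NP", "VP")], "NP": [("DT", "N"), ("N",)],
           "VP": [("V", "ADV"), ("V",)]}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}, "loudly": {"ADV"}}

def recognize(words):
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("S'", ("S",), 0, 0))            # dummy start edge
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, start = agenda.pop()
            if dot < len(rhs) and rhs[dot] in GRAMMAR:      # predict
                for exp in GRAMMAR[rhs[dot]]:
                    new = (rhs[dot], exp, 0, i)
                    if new not in chart[i]:
                        chart[i].add(new); agenda.append(new)
            elif dot == len(rhs):                           # complete
                for l2, r2, d2, s2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, s2)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
        if i < n:                                           # scan
            for cat in LEXICON.get(words[i], set()):
                chart[i + 1].add((cat, (words[i],), 1, i))
    return ("S'", ("S",), 1, 0) in chart[n]

print(recognize(["people", "laugh"]))             # True
print(recognize(["people", "laugh", "loudly"]))   # True
```

For simplicity the scanner adds every lexical category of a word rather than only the predicted ones; this over-generates edges slightly but does not change what is recognized.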

Dealing with Structural Ambiguity
Multiple parses for a sentence:
The man saw the boy with a telescope.
The man saw the mountain with a telescope.
The man saw the boy with the ponytail.
At the level of syntax, all these sentences are ambiguous, but semantics can disambiguate the 2nd and 3rd sentences.

Prepositional Phrase (PP) Attachment Problem
Configuration: V – NP1 – P – NP2 (here P is a preposition).
Does the PP (P NP2) attach to NP1, or does it attach to V?

Parse Trees for a Structurally Ambiguous Sentence
Let the grammar be:
S → NP VP
NP → DT N | DT N PP
PP → P NP
VP → V NP PP | V NP
For the sentence “I saw a boy with a telescope”:

Parse Tree 1 (PP attached to the verb):
(S (NP (N I))
   (VP (V saw)
       (NP (Det a) (N boy))
       (PP (P with) (NP (Det a) (N telescope)))))

Parse Tree 2 (PP attached to the noun):
(S (NP (N I))
   (VP (V saw)
       (NP (Det a) (N boy)
           (PP (P with) (NP (Det a) (N telescope))))))

Parsing Structural Ambiguity

Parsing for Structurally Ambiguous Sentences
Sentence: “I saw a boy with a telescope”
Grammar:
S → NP VP
NP → ART N | ART N PP | PRON
VP → V NP PP | V NP
ART → a | an | the
N → boy | telescope
PRON → I
V → saw

Ambiguous Parses
Two possible parses:
PP attached to the verb (i.e., I used a telescope to see):
( S ( NP ( PRON “I” ) )
    ( VP ( V “saw” )
         ( NP ( ART “a” ) ( N “boy” ) )
         ( PP ( P “with” ) ( NP ( ART “a” ) ( N “telescope” ) ) ) ) )
PP attached to the noun (i.e., the boy had a telescope):
( S ( NP ( PRON “I” ) )
    ( VP ( V “saw” )
         ( NP ( ART “a” ) ( N “boy” )
              ( PP ( P “with” ) ( NP ( ART “a” ) ( N “telescope” ) ) ) ) ) )

Top Down Parse

State                     Backup State                                  Action / Comments
1   ( ( S ) 1 )           −                                             Use S → NP VP
2   ( ( NP VP ) 1 )                                                     Use NP → ART N | ART N PP | PRON
3   ( ( ART N VP ) 1 )    (a) ( ( ART N PP VP ) 1 )                     ART does not match “I”; backup state (b) used
                          (b) ( ( PRON VP ) 1 )
3B  ( ( PRON VP ) 1 )
4   ( ( VP ) 2 )                                                        Consumed “I”
5   ( ( V NP PP ) 2 )     ( ( V NP ) 2 )                                Verb attachment rule used
6   ( ( NP PP ) 3 )                                                     Consumed “saw”
7   ( ( ART N PP ) 3 )    ( ( ART N PP PP ) 3 ), ( ( PRON PP ) 3 )
8   ( ( N PP ) 4 )                                                      Consumed “a”
9   ( ( PP ) 5 )                                                        Consumed “boy”
10  ( ( P NP ) 5 )
11  ( ( NP ) 6 )                                                        Consumed “with”
12  ( ( ART N ) 6 )       ( ( ART N PP ) 6 ), ( ( PRON ) 6 )
13  ( ( N ) 7 )                                                         Consumed “a”
14  ( ( − ) 8 )                                                         Consumed “telescope”; finish parsing

Top Down Parsing – Observations
Top-down parsing gave us the verb attachment parse tree (i.e., I used a telescope).
To obtain the alternate parse tree, the backup state in step 5 has to be invoked.
Is there an efficient way to obtain all parses? (See the sketch below.)
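One standard answer is dynamic programming over spans: record, once, every sub-parse of each (symbol, span) pair and reuse it. The memoized Python sketch below (an illustration; the grammar and sentence are from the slides, the helper names are my own) returns both the verb attachment and the noun attachment trees.

```python
# Memoized recursive-descent parser returning ALL parse trees.
# The sentence is lowercased; trees are nested tuples, leaves are words.
from functools import lru_cache

GRAMMAR = {
    "S":    [("NP", "VP")],
    "NP":   [("ART", "N"), ("ART", "N", "PP"), ("PRON",)],
    "VP":   [("V", "NP", "PP"), ("V", "NP")],
    "PP":   [("P", "NP")],
    "ART":  [("a",), ("an",), ("the",)],
    "N":    [("boy",), ("telescope",)],
    "PRON": [("i",)],
    "V":    [("saw",)],
    "P":    [("with",)],
}
WORDS = "i saw a boy with a telescope".split()

@lru_cache(maxsize=None)
def parses(sym, start, end):
    """All trees rooted at `sym` spanning WORDS[start:end]."""
    if sym not in GRAMMAR:                       # terminal word
        return [sym] if WORDS[start:end] == [sym] else []
    trees = []
    for rhs in GRAMMAR[sym]:
        for children in expand(rhs, start, end):
            trees.append((sym,) + children)
    return trees

def expand(rhs, start, end):
    """All ways to derive WORDS[start:end] from the symbol sequence rhs."""
    if len(rhs) == 1:
        return [(t,) for t in parses(rhs[0], start, end)]
    results = []
    for mid in range(start + 1, end):            # split point
        for left in parses(rhs[0], start, mid):
            for rest in expand(rhs[1:], mid, end):
                results.append((left,) + rest)
    return results

for tree in parses("S", 0, len(WORDS)):
    print(tree)   # two trees: verb attachment and noun attachment
```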

Bottom Up Parse

I saw a boy with a telescope
1  2  3  4   5   6    7     8

Colour scheme in the original slides: blue for normal parse, green for verb attachment parse, purple for noun attachment parse, red for invalid parse.

Items generated (subscripts give spans; “?” = end position not yet known):
NP12 → PRON12
S1? → NP12 VP2?
VP2? → V23 NP3?
VP2? → V23 NP3? PP??
NP35 → ART34 N45
NP3? → ART34 N45 PP5?
VP25 → V23 NP35
VP2? → V23 NP35 PP5?
S15 → NP12 VP25 (invalid: does not span the whole input)
PP5? → P56 NP6?
NP68 → ART67 N78
NP6? → ART67 N78 PP8?
PP58 → P56 NP68
NP38 → ART34 N45 PP58
VP28 → V23 NP35 PP58 (verb attachment)
VP28 → V23 NP38 (noun attachment)
S18 → NP12 VP28

Bottom Up Parsing – Observations
Both the noun attachment and the verb attachment parses are obtained by simply, systematically applying the rules.
The numbers in subscript help in verifying the parse and in extracting chunks from it.

Exercise For the sentence, “The man saw the boy with a telescope” & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.

Start of Probabilistic Parsing

Example of Sentence Labeling: Parsing
[S1 [S [S [VP [VB Come] [NP [NNP July]]]]
     [, ,] [CC and]
     [S [NP [DT the] [JJ IIT] [NN campus]]
        [VP [AUX is]
            [ADJP [JJ abuzz]
                  [PP [IN with]
                      [NP [ADJP [JJ new] [CC and] [VBG returning]]
                          [NNS students]]]]]]
     [. .]]

Noisy Channel Modeling
Source sentence → Noisy Channel → Target parse
T* = argmax_T P(T | S)
   = argmax_T P(T) · P(S | T)
   = argmax_T P(T), since given the parse the sentence is completely determined, i.e. P(S | T) = 1.

Corpus
A collection of text, called a corpus, is used for collecting various language data.
With annotation: more information, but manual-labour intensive.
Practice: label automatically, then correct manually.
The famous Brown Corpus contains 1 million tagged words.
Switchboard, a very famous corpus: 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.

Discriminative vs. Generative Model
W* = argmax_W P(W | SS)
Discriminative model: compute P(W | SS) directly.
Generative model: compute it via P(W) · P(SS | W) (Bayes' rule).

Language Models
N-grams: sequences of n consecutive words/characters.
Probabilistic / stochastic context-free grammars: simple probabilistic models capable of handling recursion; a CFG with probabilities attached to rules.
Rule probabilities → how likely is it that a particular rewrite rule is used?

PCFGs
Why PCFGs?
Intuitive probabilistic models for tree-structured languages.
Algorithms are extensions of HMM algorithms.
Better than the n-gram model for language modeling.

Formal Definition of PCFG
A PCFG consists of:
A set of terminals {w_k}, k = 1, …, V; e.g. {w_k} = {child, teddy, bear, played, …}
A set of non-terminals {N^i}, i = 1, …, n; e.g. {N^i} = {NP, VP, DT, …}
A designated start symbol N^1
A set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals and non-terminals; e.g. NP → DT NN
A corresponding set of rule probabilities

Rule Probabilities
Rule probabilities are such that the expansions of each non-terminal sum to one: Σ_j P(N^i → ζ^j) = 1 for all i.
E.g., P(NP → DT NN) = 0.2, P(NP → NN) = 0.5, P(NP → NP PP) = 0.3.
P(NP → DT NN) = 0.2 means that 20% of the NP expansions in the training-data parses use the rule NP → DT NN.

Probabilistic Context Free Grammars
S → NP VP 1.0
NP → DT NN 0.5
NP → NNS 0.3
NP → NP PP 0.2
PP → P NP 1.0
VP → VP PP 0.6
VP → VBD NP 0.4
DT → the 1.0
NN → gunman 0.5
NN → building 0.5
VBD → sprayed 1.0
NNS → bullets 1.0
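As an illustration (not part of the slides), this grammar can be stored as a dictionary and checked against the defining constraint that each non-terminal's rule probabilities sum to 1. The lexical rule P → with, implicit in the example parses on the next slides, is added explicitly here.

```python
# A PCFG as a map from LHS to weighted expansions, with a sanity check of
# the constraint  sum_j P(N^i -> zeta^j) = 1  for every non-terminal N^i.
PCFG = {
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("DT", "NN"), 0.5), (("NNS",), 0.3), (("NP", "PP"), 0.2)],
    "PP":  [(("P", "NP"), 1.0)],
    "VP":  [(("VP", "PP"), 0.6), (("VBD", "NP"), 0.4)],
    "DT":  [(("the",), 1.0)],
    "NN":  [(("gunman",), 0.5), (("building",), 0.5)],
    "VBD": [(("sprayed",), 1.0)],
    "NNS": [(("bullets",), 1.0)],
    "P":   [(("with",), 1.0)],   # implicit in the example parse trees
}

for lhs, expansions in PCFG.items():
    total = sum(p for _, p in expansions)
    assert abs(total - 1.0) < 1e-9, f"{lhs} probabilities sum to {total}"
print("All rule probabilities sum to 1 per non-terminal.")
```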

Example Parse t1
“The gunman sprayed the building with bullets.”
(S_1.0 (NP_0.5 (DT_1.0 The) (NN_0.5 gunman))
       (VP_0.6 (VP_0.4 (VBD_1.0 sprayed)
                       (NP_0.5 (DT_1.0 the) (NN_0.5 building)))
               (PP_1.0 (P_1.0 with) (NP_0.3 (NNS_1.0 bullets)))))
P(t1) = 1.0 × 0.5 × 1.0 × 0.5 × 0.6 × 0.4 × 1.0 × 0.5 × 1.0 × 0.5 × 1.0 × 1.0 × 0.3 × 1.0 = 0.0045

Another Parse t2
“The gunman sprayed the building with bullets.”
(S_1.0 (NP_0.5 (DT_1.0 The) (NN_0.5 gunman))
       (VP_0.4 (VBD_1.0 sprayed)
               (NP_0.2 (NP_0.5 (DT_1.0 the) (NN_0.5 building))
                       (PP_1.0 (P_1.0 with) (NP_0.3 (NNS_1.0 bullets))))))
P(t2) = 1.0 × 0.5 × 1.0 × 0.5 × 0.4 × 1.0 × 0.2 × 0.5 × 1.0 × 0.5 × 1.0 × 1.0 × 0.3 × 1.0 = 0.0015
So t1 (verb attachment) is more probable than t2 (noun attachment) under this grammar.
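The two values can be checked mechanically: a parse tree's probability is the product of the probabilities of all rules used in it. A small sketch (words lowercased, trees as nested tuples; again an illustration, not the slides' code):

```python
# Tree probability = product of the probabilities of all rules used.
# Trees are nested tuples (label, child, ...); leaves are word strings.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,   ("NP", ("DT", "NN")): 0.5,
    ("NP", ("NNS",)): 0.3,      ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,   ("VP", ("VP", "PP")): 0.6,
    ("VP", ("VBD", "NP")): 0.4, ("DT", ("the",)): 1.0,
    ("NN", ("gunman",)): 0.5,   ("NN", ("building",)): 0.5,
    ("VBD", ("sprayed",)): 1.0, ("NNS", ("bullets",)): 1.0,
    ("P", ("with",)): 1.0,      # assumed lexical rule, as above
}

def tree_prob(tree):
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)        # recurse into subtrees
    return p

t1 = ("S", ("NP", ("DT", "the"), ("NN", "gunman")),
           ("VP", ("VP", ("VBD", "sprayed"),
                         ("NP", ("DT", "the"), ("NN", "building"))),
                  ("PP", ("P", "with"), ("NP", ("NNS", "bullets")))))
t2 = ("S", ("NP", ("DT", "the"), ("NN", "gunman")),
           ("VP", ("VBD", "sprayed"),
                  ("NP", ("NP", ("DT", "the"), ("NN", "building")),
                         ("PP", ("P", "with"), ("NP", ("NNS", "bullets"))))))
print(tree_prob(t1), tree_prob(t2))   # ~0.0045 and ~0.0015
```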

Probability of a Sentence
Notation: w_ab is the subsequence w_a … w_b; N^j dominates w_a … w_b, i.e. yield(N^j) = w_a … w_b (e.g., an NP dominating “the sweet teddy bear”).
Probability of a sentence:
P(w_1m) = Σ_t P(w_1m, t) = Σ_t P(t) · P(w_1m | t),
where t ranges over parse trees. If t is a parse tree for the sentence w_1m, then P(w_1m | t) = 1, so P(w_1m) = Σ_{t: yield(t) = w_1m} P(t).

Assumptions of the PCFG Model
Place invariance: P(NP → DT NN) is the same at locations 1 and 2.
Context-free: P(NP → DT NN | anything outside “the child”) = P(NP → DT NN).
Ancestor-free: at location 2, P(NP → DT NN | its ancestor is VP) = P(NP → DT NN).
(Illustration: S → NP VP, with NP_1 = “The child” and NP_2 = “The toy”.)

Probability of a Parse Tree
Domination: we say N^j dominates words k to l, symbolized as N^j_{k,l}, if w_{k,l} is derived from N^j.
P(tree | sentence) = P(tree | S_{1,l}), where S_{1,l} means that the start symbol S dominates the word sequence w_{1,l}.
P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).

Probability of a Parse Tree (cont.)
Tree: S_{1,l} → NP_{1,2} VP_{3,l}; NP_{1,2} → DT_{1,1} N_{2,2}; VP_{3,l} → V_{3,3} PP_{4,l}; PP_{4,l} → P_{4,4} NP_{5,l}; the words are w_1 … w_l.
P(t | s) = P(t | S_{1,l})
= P(NP_{1,2}, DT_{1,1}, w_1, N_{2,2}, w_2, VP_{3,l}, V_{3,3}, w_3, PP_{4,l}, P_{4,4}, w_4, NP_{5,l}, w_{5…l} | S_{1,l})
= P(NP_{1,2}, VP_{3,l} | S_{1,l}) · P(DT_{1,1}, N_{2,2} | NP_{1,2}) · P(w_1 | DT_{1,1}) · P(w_2 | N_{2,2}) · P(V_{3,3}, PP_{4,l} | VP_{3,l}) · P(w_3 | V_{3,3}) · P(P_{4,4}, NP_{5,l} | PP_{4,l}) · P(w_4 | P_{4,4}) · P(w_{5…l} | NP_{5,l})
(using the chain rule, context-freeness, and ancestor-freeness)

HMM ↔ PCFG
O, the observed sequence ↔ w_1m, the sentence
X, the state sequence ↔ t, the parse tree
μ, the model ↔ G, the grammar
Three fundamental questions (next slides):

HMM ↔ PCFG
How likely is a certain observation given the model, P(O | μ)? ↔ How likely is a sentence given the grammar, P(w_1m | G)?
How to choose a state sequence which best explains the observations, argmax_X P(X | O, μ)? ↔ How to choose a parse which best supports the sentence, argmax_t P(t | w_1m, G)?

HMM ↔ PCFG
How to choose the model parameters that best explain the observed data, argmax_μ P(O | μ)? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences, argmax_G P(w_1m | G)?

Interesting Probabilities
The gunman sprayed the building with bullets
1        2        3      4    5       6    7
Inside probability: what is the probability of having an NP at this position such that it will derive “the building”?
Outside probability: what is the probability of starting from N^1 and deriving “The gunman sprayed”, an NP, and “with bullets”?

Interesting Probabilities (cont.)
Random variables to be considered:
The non-terminal being expanded, e.g. NP
The word-span covered by the non-terminal, e.g. (4,5) refers to the words “the building”
While calculating probabilities, consider:
The rule to be used for expansion, e.g. NP → DT NN
The probabilities associated with the RHS non-terminals, e.g. the DT subtree's and the NN subtree's inside/outside probabilities

Outside Probability j(p,q) : The probability of beginning with N1 & generating the non-terminal Njpq and all words outside wp..wq N1 Nj w1 ………wp-1 wp…wqwq+1 ……… wm

Inside Probabilities j(p,q) : The probability of generating the words wp..wq starting with the non-terminal Njpq. N1  Nj  w1 ………wp-1 wp…wqwq+1 ……… wm

Outside & Inside Probabilities: Example
For the NP spanning words 4–5 (“the building”) in “The gunman sprayed the building with bullets”:
β_NP(4, 5) = P(the building | NP_{45}, G)
α_NP(4, 5) = P(The gunman sprayed, NP_{45}, with bullets | G)
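Inside probabilities can be computed bottom-up over spans, CYK-style. The sketch below is an illustration for the slide grammar under stated assumptions: words and spans are 0-indexed (so the slide's words 4–5 become the span (3, 5)), the lexical rule P → with is assumed as before, and the unary rule NP → NNS is applied once after the lexical layer.

```python
# Inside probabilities beta_j(p, q), computed bottom-up over spans:
# beta[(sym, p, q)] = P(words p..q | sym spans p..q), 0-indexed.
from collections import defaultdict

WORDS = "the gunman sprayed the building with bullets".split()

LEXICAL = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5,
           ("NN", "building"): 0.5, ("VBD", "sprayed"): 1.0,
           ("NNS", "bullets"): 1.0, ("P", "with"): 1.0}
BINARY = [("S", "NP", "VP", 1.0), ("NP", "DT", "NN", 0.5),
          ("NP", "NP", "PP", 0.2), ("PP", "P", "NP", 1.0),
          ("VP", "VP", "PP", 0.6), ("VP", "VBD", "NP", 0.4)]
UNARY = [("NP", "NNS", 0.3)]

n = len(WORDS)
beta = defaultdict(float)
for i, w in enumerate(WORDS):                       # width-1 spans
    for (sym, word), p in LEXICAL.items():
        if word == w:
            beta[(sym, i, i + 1)] += p
    for lhs, rhs, p in UNARY:                       # e.g. NP -> NNS
        beta[(lhs, i, i + 1)] += p * beta[(rhs, i, i + 1)]
for width in range(2, n + 1):                       # wider spans
    for i in range(n - width + 1):
        k = i + width
        for lhs, r1, r2, p in BINARY:
            for j in range(i + 1, k):               # split point
                beta[(lhs, i, k)] += p * beta[(r1, i, j)] * beta[(r2, j, k)]

print(beta[("NP", 3, 5)])   # inside prob. of NP over "the building": 0.25
print(beta[("S", 0, n)])    # P(sentence) = sum over both parses: ~0.006
```

The value β_S over the whole sentence is exactly the sentence probability P(w_1m) from the earlier slide: here the sum over the two parses, 0.0045 + 0.0015 = 0.006.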