
Context-Free Parsing

2/37 Basic issues
– Top-down vs. bottom-up
– Handling ambiguity
 – Lexical ambiguity
 – Structural ambiguity
– Breadth-first vs. depth-first
– Handling recursive rules
– Handling empty rules

3/37 Some terminology
– Rules are written A → B c
 – Terminal vs. non-terminal symbols
 – Left-hand side (head): always a single non-terminal
 – Right-hand side (body): any number of symbols, a mix of terminals and non-terminals
 – Unique start symbol (usually S)
 – ‘→’ reads “rewrites as”, but the rule is not directional (an ‘=’ sign would be better)
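The rule notation above maps directly onto data structures. As a rough sketch (our own Python encoding, not anything from the original slides), the grammar and lexicon of the coming examples can be written as:

```python
# Rules map a non-terminal head to a list of alternative bodies;
# the lexicon maps a pre-terminal (part of speech) to its words.
# This encoding is our own, chosen to mirror the slide notation.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"]],
    "VP": [["v"], ["v", "NP"]],
}

LEXICON = {
    "det": {"an", "the"},
    "n":   {"elephant", "man"},
    "v":   {"shot"},
}

def is_terminal(symbol):
    # Pre-terminals rewrite directly to words, so they play the role
    # of terminals in the phrase-structure rules here.
    return symbol in LEXICON

print(is_terminal("det"), is_terminal("NP"))  # → True False
```

Treating the parts of speech (det, n, v) as the terminals of the phrase-structure rules matches how the slides use the lexicon.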

4/37 1. Top-down with simple grammar
Grammar: S → NP VP;  NP → det n;  VP → v;  VP → v NP
Lexicon: det → {an, the};  n → {elephant, man};  v → shot
Input: the man shot an elephant
[Tree diagram: S is expanded to NP VP; NP → det n matches “the man”; the first VP rule, VP → v, matches “shot”.]
No more rules, but the input is not completely accounted for…
So we must backtrack, and try the other VP rule.

5/37 1. Top-down with simple grammar (continued)
Grammar and lexicon as before.
[Tree diagram: backtracking, VP → v NP is tried; v matches “shot”, then NP → det n matches “an elephant”.]
No more rules, and the input is completely accounted for.
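The backtracking just traced can be captured in a few lines. This is a minimal top-down recogniser of our own devising (not the slides' code); it tries VP → v first and falls back to VP → v NP exactly as in the trace:

```python
# A minimal top-down backtracking recogniser, assuming the simple
# grammar/lexicon encoding used throughout (our own sketch).

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"]],
    "VP": [["v"], ["v", "NP"]],
}
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}

def parse(symbols, words, i):
    """Yield every position j such that `symbols` can rewrite words[i:j]."""
    if not symbols:
        yield i
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:                       # pre-terminal: match one word
        if i < len(words) and words[i] in LEXICON[first]:
            yield from parse(rest, words, i + 1)
    else:                                      # non-terminal: try each rule,
        for body in GRAMMAR[first]:            # backtracking from VP -> v
            yield from parse(body + rest, words, i)  # to VP -> v NP

def recognise(sentence):
    words = sentence.split()
    return any(j == len(words) for j in parse(["S"], words, 0))

print(recognise("the man shot an elephant"))   # → True
print(recognise("man the shot"))               # → False
```

The generator yields once per way of accounting for a prefix of the input, so backtracking falls out of the enumeration for free.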

6/37 Breadth-first vs depth-first (1)
– When we came to the VP rule we were faced with a choice of two rules
– “Depth-first” means following the first choice through to the end
– “Breadth-first” means keeping all your options open
– We’ll see this distinction more clearly later, and also see that it is quite significant

7/37 2. Bottom-up with simple grammar
Grammar: S → NP VP;  NP → det n;  VP → v;  VP → v NP
Lexicon: det → {an, the};  n → {elephant, man};  v → shot
Input: the man shot an elephant
[Chart diagram: the words are tagged det n v det n; NP → det n covers “the man”; VP → v covers just “shot”; S → NP VP then spans only “the man shot”.]
We’ve reached the top, but the input is not completely accounted for…
So we must backtrack, and try the other VP rule.
[Chart diagram: VP → v NP covers “shot an elephant”, and S → NP VP now spans the whole input.]
We’ve reached the top, and the input is completely accounted for.
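Bottom-up parsing can likewise be sketched as a search over two moves: shift (tag the next word) and reduce (rewrite a rule body on top of a stack into its head). A hypothetical minimal implementation of our own, with backtracking standing in for the chart:

```python
# Naive bottom-up recognition as shift/reduce search (our own sketch,
# not the slides' chart machinery). The backtracking in the slides
# shows up as trying both the VP -> v and VP -> v NP reductions.

RULES = [("S", ["NP", "VP"]), ("NP", ["det", "n"]),
         ("VP", ["v"]), ("VP", ["v", "NP"])]
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}

def succeeds(stack, words):
    # Success: all input consumed and the stack reduced to the start symbol.
    if not words and stack == ["S"]:
        return True
    # Reduce: try every rule whose body matches the top of the stack.
    for head, body in RULES:
        if stack[-len(body):] == body:
            if succeeds(stack[:-len(body)] + [head], words):
                return True
    # Shift: tag the next word with each part of speech it can have.
    if words:
        for pos, forms in LEXICON.items():
            if words[0] in forms:
                if succeeds(stack + [pos], words[1:]):
                    return True
    return False

print(succeeds([], "the man shot an elephant".split()))  # → True
print(succeeds([], "elephant the".split()))              # → False
```

Note how much blind search this does compared with the top-down version: reductions are attempted whether or not they can ever lead to an S.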

8/37 Same again, but with lexical ambiguity
Grammar: S → NP VP;  NP → det n;  VP → v;  VP → v NP
Lexicon: det → {an, the};  n → {elephant, man, shot};  v → shot
Now shot can be v or n.

9/37 3. Top-down with lexical ambiguity
Grammar and lexicon as above.
[Tree diagram: exactly as before, S → NP VP; NP → det n covers “the man”; VP → v NP covers “shot an elephant”.]
Same as before: at the point where we reach “shot” we are looking for a v, and shot fits the bill; the n reading never comes into play.

10/37 4. Bottom-up with lexical ambiguity
Grammar and lexicon as above.
[Chart diagram: the words are tagged det n v det n, with an extra n arc over “shot”; the spurious n reading gives rise to extra arcs that lead nowhere.]
Terminology: the chart is a graph, with nodes (the positions between words) and arcs (edges) spanning stretches of the input.

11–14/37 4. Bottom-up with lexical ambiguity (continued)
[Chart diagrams, progressively tidied: the full chart for “the man shot an elephant”, including the arcs built from the n reading of “shot”; then with all the unused arcs removed; then with the remaining arcs cleared away, leaving just the successful parse tree.]
Let’s get rid of all the unused arcs, and then clear away the arcs altogether.

15/37 Breadth-first vs depth-first (2)
In chart parsing, the distinction is more clear-cut:
– At any point there may be a choice of things to do: which arcs to develop
– Breadth-first vs. depth-first is simply a matter of the order in which they are done
– Queue (FIFO = breadth-first) vs. stack (LIFO = depth-first)
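The queue-vs-stack point can be shown in miniature. In this sketch (our own, with invented item names echoing the rule choices in the examples), the same agenda of pending choices is consumed FIFO or LIFO:

```python
# FIFO vs LIFO agenda in miniature (our own sketch, not a chart parser).

from collections import deque

def explore(children, start, breadth_first):
    """Return the order in which items come off the agenda."""
    agenda = deque([start])
    order = []
    while agenda:
        item = agenda.popleft() if breadth_first else agenda.pop()
        order.append(item)
        kids = children.get(item, [])
        # Reverse for the stack so the first alternative is tried first.
        agenda.extend(kids if breadth_first else reversed(kids))
    return order

# Hypothetical items: developing one opens up the listed alternatives.
choices = {
    "S": ["NP-rule-1", "NP-rule-2"],
    "NP-rule-1": ["VP-rule-1", "VP-rule-2"],
}

print(explore(choices, "S", breadth_first=True))
print(explore(choices, "S", breadth_first=False))
```

Breadth-first visits both NP alternatives before any VP work; depth-first follows NP-rule-1 all the way down before coming back to NP-rule-2.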

16/37 Same again, but with structural ambiguity
Grammar: S → NP VP;  NP → det n;  NP → det n PP;  VP → v;  VP → v NP;  VP → v NP PP;  PP → prep NP
Lexicon: det → {an, the, his};  n → {elephant, man, shot, pyjamas};  v → shot;  prep → in
Input: the man shot an elephant in his pyjamas
We introduce a PP rule in two places: inside NP and inside VP.
[Tree diagram: one reading, with the PP “in his pyjamas” attached under the VP.]

17/37 Same again, but with structural ambiguity (continued)
[Tree diagram: the other reading, with the PP “in his pyjamas” attached under the object NP “an elephant”.]

18/37 5. Top-down with structural ambiguity
Grammar and lexicon as above.
[Tree diagram: S → NP VP; NP → det n matches “the man”.]
At this point, depending on our strategy (breadth-first vs. depth-first), we may consider the NP complete and look for the VP, or we may try the second NP rule, NP → det n PP. Let’s see what happens in the latter case.
The next word, shot, isn’t a prep, so the PP rule simply fails.

19/37 5. Top-down with structural ambiguity (continued)
[Tree diagram: VP → v matches “shot”.]
As before, the first VP rule works, but does not account for all the input.
Similarly, if we try the second VP rule, VP → v NP, with the first NP rule, NP → det n, we cover “shot an elephant” but “in his pyjamas” is left over.

20/37 5. Top-down with structural ambiguity (continued)
So what do we try next: the second NP rule, NP → det n PP (extending the object NP), or the third VP rule, VP → v NP PP (extending the VP)?
– Depth-first: it’s a stack, LIFO
– Breadth-first: it’s a queue, FIFO

21/37 5. Top-down with structural ambiguity (depth-first)
[Tree diagram: the most recent choice point is revisited first, so NP → det n PP is tried: the object NP becomes “an elephant in his pyjamas”, with PP → prep NP covering “in his pyjamas”.]

22/37 5. Top-down with structural ambiguity (breadth-first)
[Tree diagram: the older choice point is revisited first, so VP → v NP PP is tried: the PP “in his pyjamas” attaches to the VP.]

23/37 Recognizing ambiguity
– Notice how the choice of strategy determines which result we get (first)
– In both strategies there are often rules left untried, still on the list (whether queue or stack)
– If we want to know whether our input is ambiguous, at some point we have to follow these through
– As you will see later, trying out all the alternative paths can be quite expensive
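Following every untried alternative through is what it takes to count analyses. Extending the earlier top-down sketch to return trees (again our own code, using the structurally ambiguous grammar):

```python
# Enumerating all analyses with the structurally ambiguous grammar
# (our own sketch; trees are nested (category, children) tuples).

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"], ["det", "n", "PP"]],
    "VP": [["v"], ["v", "NP"], ["v", "NP", "PP"]],
    "PP": [["prep", "NP"]],
}
LEXICON = {"det": {"an", "the", "his"},
           "n": {"elephant", "man", "shot", "pyjamas"},
           "v": {"shot"}, "prep": {"in"}}

def parses(symbols, words, i):
    """Yield (trees, j): analyses of `symbols` spanning words[i:j]."""
    if not symbols:
        yield [], i
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        if i < len(words) and words[i] in LEXICON[first]:
            for trees, j in parses(rest, words, i + 1):
                yield [(first, words[i])] + trees, j
    else:
        for body in GRAMMAR[first]:
            for kids, k in parses(body, words, i):
                for trees, j in parses(rest, words, k):
                    yield [(first, kids)] + trees, j

words = "the man shot an elephant in his pyjamas".split()
all_parses = [t for t, j in parses(["S"], words, 0) if j == len(words)]
print(len(all_parses))   # → 2: the PP attaches to the NP or to the VP
```

Exhaustively enumerating like this is exponential in the worst case, which is exactly the cost that chart parsing (and the Earley algorithm) is designed to avoid.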

24/37 6. Bottom-up with structural ambiguity
Grammar and lexicon as above.
[Chart diagram: all the arcs for “the man shot an elephant in his pyjamas”; both NP → det n PP and VP → v NP PP can apply, so several S arcs are built, not all of them spanning the whole input.]

25/37 6. Bottom-up with structural ambiguity (continued)
[Chart diagram: with the unused arcs removed, complete S arcs remain for both attachments of the PP.]

26/37 Recursive rules
– “Recursive” rules call themselves
– We already have a mutually recursive rule pair:
  NP → det n PP
  PP → prep NP
– Rules can also be immediately recursive:
  AdjG → adj AdjG   e.g. (the) big fat ugly (man)

27/37 Recursive rules (continued)
Left-recursive:
  AdjG → AdjG adj
  AdjG → adj
Right-recursive:
  AdjG → adj AdjG
  AdjG → adj
[Tree diagrams: “big fat rich old” parsed as a left-branching AdjG and as a right-branching AdjG.]

28/37 7. Top-down with left recursion
Grammar: NP → det n;  NP → det AdjG n;  AdjG → AdjG adj;  AdjG → adj
Input: the big fat rich old man
[Tree diagram: expanding AdjG top-down immediately proposes another AdjG at the same position, and so on without end.]
You can’t use left-recursive rules with a top-down parser (it loops), even if the non-recursive rule is listed first.
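The loop is easy to demonstrate. In this sketch (ours), the top-down recogniser from before meets the left-recursive AdjG rule and recurses without consuming any input, so Python's recursion limit is the only thing that stops it:

```python
# Why top-down + left recursion loops: expanding AdjG re-proposes AdjG
# at the same input position. A sketch of our own; we catch Python's
# RecursionError to make the non-termination observable.

GRAMMAR = {
    "NP":   [["det", "n"], ["det", "AdjG", "n"]],
    "AdjG": [["AdjG", "adj"], ["adj"]],       # left-recursive rule
}
LEXICON = {"det": {"the"}, "n": {"man"},
           "adj": {"big", "fat", "rich", "old"}}

def parse(symbols, words, i):
    if not symbols:
        yield i
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        if i < len(words) and words[i] in LEXICON[first]:
            yield from parse(rest, words, i + 1)
    else:
        for body in GRAMMAR[first]:
            # AdjG -> AdjG adj puts AdjG straight back on the front,
            # with i unchanged: no progress, unbounded recursion.
            yield from parse(body + rest, words, i)

words = "the big fat rich old man".split()
try:
    any(j == len(words) for j in parse(["NP"], words, 0))
    print("terminated")
except RecursionError:
    print("infinite regress: AdjG -> AdjG adj -> AdjG adj adj -> ...")
```

Reordering the rules only delays the regress: backtracking eventually reaches the left-recursive expansion, and each expansion adds a symbol without consuming a word.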

29/37 7. Top-down with right recursion
Grammar: NP → det n;  NP → det AdjG n;  AdjG → adj AdjG;  AdjG → adj
Input: the big fat rich old man
[Tree diagram: AdjG → adj AdjG unwinds rightwards through big, fat, rich, old, bottoming out with AdjG → adj, after which n matches “man”.]

30/37 8. Bottom-up with left and right recursion
Grammar: NP → det n;  NP → det AdjG n;  AdjG → AdvG adj AdjG;  AdjG → adj;  AdvG → AdvG adv;  AdvG → adv
The AdjG rule is right-recursive, the AdvG rule is left-recursive.
Input: the very very fat ugly man
[Chart diagram: arcs for both recursive rules are built bottom-up across the input.]
Quite a few useless paths, but overall no difficulty.

31/37 8. Bottom-up with left and right recursion (continued)
[Chart diagram: with the unused arcs removed, the surviving analysis spans the whole input.]

32/37 Empty rules
For example:
  NP → det AdjG n
  AdjG → adj AdjG
  AdjG → ε
Equivalent to:
  NP → det AdjG n
  NP → det n
  AdjG → adj
  AdjG → adj AdjG
Or, using optionality brackets:
  NP → det (AdjG) n
  AdjG → adj (AdjG)
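The “equivalent to” step can be mechanised: for each rule body, emit every variant with the nullable symbol present or absent, and drop the ε-rule itself. A sketch (the function name and encoding are ours):

```python
# Removing an epsilon-rule by expanding each body into all variants
# with the nullable symbol kept or dropped (our own sketch).

from itertools import product

def remove_epsilon(rules, nullable):
    """rules: list of (head, body); nullable: symbols with an ε-rule."""
    out = set()
    for head, body in rules:
        if not body:                 # drop the ε-rule itself
            continue
        # Each nullable symbol contributes two choices: itself or nothing.
        options = [([sym], []) if sym in nullable else ([sym],)
                   for sym in body]
        for choice in product(*options):
            new = [s for part in choice for s in part]
            if new:                  # don't reintroduce an ε-rule
                out.add((head, tuple(new)))
    return sorted(out)

rules = [("NP", ["det", "AdjG", "n"]),
         ("AdjG", ["adj", "AdjG"]),
         ("AdjG", [])]               # AdjG → ε
for head, body in remove_epsilon(rules, {"AdjG"}):
    print(head, "->", " ".join(body))
# prints the four rules listed under "Equivalent to" above
```

This is the standard ε-elimination construction; note that it can square the number of rules when a body contains several nullable symbols.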

33/37 7. Top-down with empty rules
Grammar: NP → det AdjG n;  AdjG → adj AdjG;  AdjG → ε
[Tree diagrams: for “the man”, AdjG rewrites immediately as ε; for “the big fat man”, AdjG → adj AdjG applies twice before bottoming out in ε.]

34/37 8. Bottom-up with empty rules
Grammar: NP → det AdjG n;  AdjG → adj AdjG;  AdjG → ε
Input: the fat man
[Chart diagram: the empty AdjG rule can apply at every position, so empty AdjG arcs appear all along the chart.]
Lots of useless paths, especially in a long sentence, but otherwise no difficulty.

35/37 Top-down vs. bottom-up
– Bottom-up builds many useless trees
– Top-down can propose false trails, sometimes quite long, which are only abandoned when they reach the word level
 – Especially a problem if breadth-first
– Bottom-up is very inefficient with empty rules
– Top-down CANNOT handle left recursion
– Bottom-up can do partial parsing, top-down cannot
 – Partial parsing is especially useful for speech
– Wouldn’t it be nice to combine them to get the advantages of both?

36/37 Left-corner parsing
– The “left corner” of a rule is the first symbol after the rewrite arrow: in S → NP VP, the left corner is NP
– Left-corner parsing starts bottom-up, taking the first item off the input and finding a rule for which it is the left corner
– This provides a top-down prediction, but we continue working bottom-up until the prediction is fulfilled
– When a rule is completed, apply the left-corner principle again: is the completed constituent itself the left corner of some rule?

37/37 9. Left-corner with simple grammar
Grammar: S → NP VP;  NP → det n;  VP → v;  VP → v NP
Input: the man shot an elephant
– the is a det, and det is the left corner of NP → det n, which predicts an n: man fulfils it
– the completed NP is the left corner of S → NP VP, which predicts a VP
– shot is a v, and v is the left corner of both VP rules; VP → v completes first, but the text is not all accounted for, so try VP → v NP
– an is a det, the left corner of NP → det n; elephant fulfils the predicted n, completing the NP, the VP, and the S
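The left-corner procedure can be sketched with three mutually recursive steps: tag a word bottom-up, complete a constituent upwards through rules it is the left corner of, and parse the predicted remainder top-down. This is our own minimal implementation over the slide's grammar, not code from the lecture:

```python
# A sketch of left-corner recognition (our own implementation,
# with backtracking via generators).

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"]],
    "VP": [["v"], ["v", "NP"]],
}
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}

def parse(goal, words, i):
    """Yield each position j where one `goal` constituent can end."""
    if goal in LEXICON:
        if i < len(words) and words[i] in LEXICON[goal]:
            yield i + 1
        return
    if i < len(words):
        # Bottom-up step: tag the next word, then grow it toward the goal.
        for pos, forms in LEXICON.items():
            if words[i] in forms:
                yield from complete(pos, goal, words, i + 1)

def complete(cat, goal, words, i):
    """`cat` spans up to i; grow it left-corner-wise until it fulfils goal."""
    if cat == goal:
        yield i
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            if body[0] == cat:            # cat is this rule's left corner:
                yield from rest(body[1:], head, goal, words, i)

def rest(symbols, head, goal, words, i):
    """Top-down: fulfil the predicted remainder of the body."""
    if not symbols:
        yield from complete(head, goal, words, i)
        return
    for j in parse(symbols[0], words, i):
        yield from rest(symbols[1:], head, goal, words, j)

words = "the man shot an elephant".split()
print(any(j == len(words) for j in parse("S", words, 0)))  # → True
```

The bottom-up tagging step means left recursion in `complete` is only followed when a constituent has actually been built, which is why left-corner parsing avoids the top-down left-recursion loop.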