CPSC 503 Computational Linguistics, Lecture 7. Giuseppe Carenini. 11/13/2018, CPSC503 Winter 2014
Today (25 Sept): Context-Free Grammars (Phrase Structure) / Dependency Grammars; Parsing
Knowledge-Formalisms Map (next three lectures)
State machines (and probabilistic versions): finite state automata, finite state transducers, Markov models -> Morphology
Rule systems (and probabilistic versions), e.g., (probabilistic) context-free grammars -> Syntax
Logical formalisms (first-order logics), AI planners -> Semantics, Pragmatics, Discourse and Dialogue
My conceptual map; this is the master plan. Markov models are used for part-of-speech tagging and dialog. Syntax is the study of the formal relationships between words: how words are clustered into classes (which determine how they group and behave), and how they group with their neighbors into phrases.
Context Free Grammar (summary)
Dependency Grammars
Syntactic structure: binary relations between words.
Links: a grammatical function or a very general semantic relation.
Abstract away from word-order variations (simpler grammars).
Useful features in many NLP applications (classification, summarization, NLG).
Today (25 Sept): Context-Free Grammars (Phrase Structure) / Dependency Grammars; Parsing
Parsing with CFGs
Given a CFG and a sequence of words (e.g., "I prefer a morning flight"), a parser assigns valid parse trees to the input.
Parsing with CFGs refers to the task of assigning correct trees to input strings. Correct here means a tree that covers all and only the elements of the input and has an S at the top; it does not mean that the system can select the correct tree from among all the possible trees.
Parsing as Search
The CFG defines the search space of possible parse trees:
S -> NP VP
S -> Aux NP VP
NP -> Det Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left | arrive
Aux -> do | does
As with everything of interest, parsing involves a search, which involves making choices. Two constraints guide the search: the grammar (the result must be a tree with an S at the top) and the input (the tree must cover all the words in the input).
Parsing: find all trees that cover all and only the words in the input.
Constraints on Search
The parser searches the space defined by the CFG for valid parse trees of the input word sequence.
Search strategies: top-down (goal-directed) or bottom-up (data-directed).
Top-Down Parsing
Since we are trying to find trees rooted with an S (Sentence), start with the rules that give us an S; then work your way down from there to the words.
We'll start with some basic (meaning bad) methods before moving on to the one or two that you need to know.
Next step: Top-Down Space
When POS categories are reached, reject trees whose leaves fail to match all the words in the input.
Bottom-Up Parsing
Of course, we also want trees that cover the input words, so start with trees that link up with the words in the right way; then work your way up from there.
Two more steps: Bottom-Up Space
Top-Down vs. Bottom-Up
Top-down: only searches for trees that can be answers, but suggests trees that are not consistent with the words.
Bottom-up: only forms trees consistent with the words, but suggests trees that make no sense globally.
So Combine Them
Top-down: the control strategy used to generate trees. Bottom-up: used to filter out inappropriate parses.
Three choices determine a control strategy: depth-first vs. breadth-first search; which node to try to expand next; which grammar rule to use to expand a node (left-most).
There are a million ways to combine top-down expectations with bottom-up data to get more efficient searches. We want to incorporate the input as soon as possible, in natural forward (textual) order.
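The basic top-down, depth-first, left-to-right strategy can be sketched in a few lines of Python. The toy grammar and function name below are illustrative choices, not the lecture's code; note that this naive style loops forever on left-recursive rules such as NP -> NP PP.

```python
# A toy CFG: each left-hand side maps to a list of right-hand-side tuples.
GRAMMAR = {
    "S":    [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP":   [("Det", "Noun")],
    "VP":   [("Verb",)],
    "Det":  [("a",)],
    "Noun": [("flight",)],
    "Verb": [("left",), ("arrive",)],
    "Aux":  [("do",), ("does",)],
}

def recognize(symbols, words):
    """True if some expansion of `symbols` derives exactly `words`."""
    if not symbols:
        return not words                     # success iff input fully consumed
    first, rest = symbols[0], list(symbols[1:])
    if first not in GRAMMAR:                 # terminal: must match next word
        return bool(words) and words[0] == first and recognize(rest, words[1:])
    # non-terminal: try each expansion, left-most first (depth-first search)
    return any(recognize(list(rhs) + rest, words) for rhs in GRAMMAR[first])
```

Expanding the left-most non-terminal first and backtracking over alternatives is exactly the depth-first, left-to-right control strategy described above.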
Top-Down, Depth-First, Left-to-Right Search
Sample sentence: "Does this flight include a meal?"
A search state is a partial parse tree plus the next word to match, shown in [].
Example: "Does this flight include a meal?"
Example (continued): "Does this flight include a meal?"
Example (continued): "Does this flight include a meal?"
Adding Bottom-Up Filtering
The preceding sequence was a waste of time because an NP cannot generate a parse tree starting with an Aux.
You should not bother expanding a node if it cannot generate a parse tree matching the next word.
Bottom-Up Filtering: left corners

Category   Left Corners
S          Det, Proper-Noun, Aux, Verb
NP         Det, Proper-Noun
Nominal    Noun
VP         Verb
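A left-corner table like the one above can be computed from any CFG by iterating to a fixpoint. The sketch below (Python, with an illustrative grammar encoding of my own) collects, for each category, every symbol that can appear leftmost in one of its derivations.

```python
def left_corners(grammar):
    """grammar: dict mapping LHS -> list of RHS tuples.
    Returns, for each LHS, the set of symbols that can begin it."""
    lc = {lhs: set() for lhs in grammar}
    changed = True
    while changed:                     # iterate until no set grows
        changed = False
        for lhs, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                # The first RHS symbol, plus everything that can begin it.
                new = {first} | lc.get(first, set())
                if not new <= lc[lhs]:
                    lc[lhs] |= new
                    changed = True
    return lc
```

Filtering the result down to part-of-speech categories reproduces the table: e.g., the left corners of NP are exactly Det and Proper-Noun.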
Problems with Top-Down Parsing with Bottom-Up Filtering
1. Ambiguity: the number of valid parses can grow exponentially in the number of phrases (most of them make no sense semantically, so we would rather not consider them).
2. Repeated parsing: the parser builds valid trees for portions of the input, then discards them during backtracking, only to find that it has to rebuild them again.
SOLUTION: the Earley algorithm (once again, dynamic programming!).
(1) Structural Ambiguity
Three basic kinds: attachment, coordination, and NP-bracketing.
PP-attachment (VP -> V NP with NP -> NP PP, vs. VP -> V NP PP): "I shot an elephant in my pajamas"
Non-PP attachment: "I saw Mary passing by cs2"
Coordination: "new student and profs"
NP-bracketing: "French language teacher"
The number of NP parses grows with the Catalan numbers, (2n)! / ((n+1)! n!): in combinatorial mathematics, the Catalan numbers form a sequence of natural numbers that occur in various counting problems, often involving recursively defined objects. For example, with 6 PPs there are 429 NP parses; with 7, 1430.
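The parse counts above are Catalan numbers; a quick sanity check of the formula (2n)! / ((n+1)! n!) in Python:

```python
from math import comb

def catalan(n):
    # C_n = (2n)! / ((n+1)! n!) = C(2n, n) / (n+1)
    return comb(2 * n, n) // (n + 1)

# The slide's figures (6 PPs -> 429 parses, 7 PPs -> 1430) are C_7 and C_8,
# illustrating how fast the number of attachments grows.
```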
Structural Ambiguity (Ex. 1): PP-attachment
VP -> V NP with NP -> NP PP, vs. VP -> V NP PP: "I shot an elephant in my pajamas"
CPSC 422, Lecture 27
Structural Ambiguity (Ex. 2): non-PP attachment
"I saw Mary passing by cs2"
(ROOT (S (NP (PRP I)) (VP (VBD saw) (NP (NNP Mary)) (VP (VBG passing) (PP (IN by) (NP (NNP cs2)))))))
Structural Ambiguity (Ex. 3): coordination
"new student and profs"
Structural Ambiguity (Ex. 4): NP-bracketing
"French language teacher"
(2) Repeated Work
Parsing is hard, and slow; it is wasteful to redo the same work over and over.
Consider an attempt to top-down parse the following as an NP: "A flight from Indi to Houston on TWA"
Starts from:
NP -> Det Nom
NP -> NP PP
Nom -> Noun
…fails and backtracks.
Restarts from:
NP -> Det Nom
NP -> NP PP
Nom -> Noun
…fails and backtracks.
Restarts from… fails and backtracks.
Restarts from… Success!
But… during backtracking, the same sub-phrases were rebuilt many times (4, 3, 2, and 1 times in the figure).
Dynamic Programming
We need a method that fills a table with solutions to subproblems. For parsing: sub-trees consistent with portions of the input, once discovered, are stored and can be reused. Such a method:
stores ambiguous parses compactly;
does not do (avoidable) repeated work;
solves an exponential problem in polynomial time.
Earley Parsing: O(N^3)
Fills a table in a single sweep over the input words.
The table has N+1 entries, where N is the number of words.
Table entries represent: predicted constituents, in-progress constituents, and completed constituents, together with their locations.
States
The table entries are called states, and express: what is predicted (expected) from that point, what is in progress at that point, and what has been recognized up to that point.
Representation: dotted rules + locations.
S -> · VP [0,0] : a VP is predicted at the start of the sentence
NP -> Det · Nominal [1,2] : an NP is in progress; the Det spans from 1 to 2
VP -> V NP · [0,3] : a VP has been found, starting at 0 and ending at 3
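One convenient machine encoding of these dotted-rule states (the tuple layout and helper names below are my own, not the lecture's):

```python
# A state  lhs -> rhs[:dot] . rhs[dot:] [start,end]  as a plain tuple.
def show(state):
    lhs, rhs, dot, start, end = state
    before = " ".join(rhs[:dot])
    after = " ".join(rhs[dot:])
    return f"{lhs} -> {before} . {after} [{start},{end}]"

def is_complete(state):
    # A state is complete when the dot has reached the right end of the rule.
    lhs, rhs, dot, start, end = state
    return dot == len(rhs)
```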
Earley: the answer
The answer is found by looking in the table (chart) in the right place: the final column should contain the state S -> α · [0,n], i.e., a complete S state that spans the whole input from 0 to n (as with most dynamic programming approaches).
Earley Parsing Procedure
Sweep through the table from 0 to n in order, applying one of three operators to each state:
Predictor: add top-down predictions to the chart.
Scanner: read the input and add the corresponding state to the chart.
Completer: move the dot to the right when a new constituent is found.
Results (new states) are added to the current or next set of states in the chart. No backtracking, and no states are ever removed.
Predictor
Intuition: new states represent top-down expectations.
Applied when a non-part-of-speech non-terminal is to the right of a dot: S -> · VP [0,0]
Adds new states to the end of the current chart, one for each expansion of that non-terminal in the grammar: VP -> · V [0,0] and VP -> · V NP [0,0]
Scanner
New states for a predicted part of speech.
Applicable when a part of speech is to the right of a dot: VP -> · Verb NP [0,0] (input: 0 "Book…" 1)
Looks at the current word in the input; if it matches, adds state(s) to the next chart: Verb -> book · [0,1]
Completer
Intuition: we have found a constituent, so tell everyone who was waiting for it.
Applied when the dot has reached the right end of a rule: NP -> Det Nom · [1,3]
Find all states with a dot at 1 that expect an NP: VP -> V · NP PP [0,1] and VP -> V · NP [0,1]
Add new state(s) to the current chart by advancing the dot: VP -> V NP · PP [0,3] and VP -> V NP · [0,3]
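Putting the three operators together, here is a compact Earley recognizer sketch in Python. The grammar, lexicon, and tuple encoding of states are illustrative assumptions (a state's end position is left implicit in its chart column), not the lecture's exact code.

```python
# Toy grammar and lexicon (illustrative, not the lecture's exact grammar).
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
    "VP": [("Verb",), ("Verb", "NP")],
}
LEXICON = {"book": "Verb", "that": "Det", "flight": "Noun"}

def earley_recognize(words):
    n = len(words)
    # A state is (lhs, rhs, dot, start); its end position is the index
    # of the chart column that holds it.
    chart = [[] for _ in range(n + 1)]

    def add(state, i):
        if state not in chart[i]:
            chart[i].append(state)

    add(("GAMMA", ("S",), 0, 0), 0)            # dummy start state
    for i in range(n + 1):
        for lhs, rhs, dot, start in chart[i]:  # chart[i] grows as we iterate
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                # Predictor: expand the non-terminal to the right of the dot.
                for expansion in GRAMMAR[rhs[dot]]:
                    add((rhs[dot], expansion, 0, i), i)
            elif dot < len(rhs):
                # Scanner: rhs[dot] is a part of speech; match the next word.
                if i < n and LEXICON.get(words[i]) == rhs[dot]:
                    add((rhs[dot], (words[i],), 1, i), i + 1)
            else:
                # Completer: advance every state waiting for this constituent.
                for l2, r2, d2, s2 in chart[start]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, s2), i)
    # Success iff a complete dummy state spans the whole input.
    return ("GAMMA", ("S",), 1, 0) in chart[n]
```

Because states are only ever added, never removed, there is no backtracking, and each chart column is processed in a single left-to-right sweep.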
Example: "Book that flight"
We should find an S from 0 to 3 that is a completed state.
Example: "Book that flight"
So far, only a recognizer…
To generate all parses: when old states waiting for the just-completed constituent are updated, add a pointer from each updated state to the completed one.
Chart[0]: … S5 S -> . VP [0,0] [] ; S6 VP -> . Verb [0,0] [] ; S7 VP -> . Verb NP [0,0] [] ; …
Chart[1]: S8 Verb -> book . [0,1] [] ; S9 VP -> Verb . [0,1] [S8] ; S10 S -> VP . [0,1] [??] ; S11 VP -> Verb . NP [0,1] [??] ; …
Then simply read off all the backpointers from every complete S in the last column of the table.
Error Handling
What happens when we look at the contents of the last table column and don't find an S -> α · [0,n] state? Is it a total loss? No: the chart contains every constituent, and every combination of constituents, possible for the input given the grammar.
This is also useful for partial (shallow) parsing, as used in information extraction.
A Probabilistic Earley Parser as a Psycholinguistic Model
John Hale, The Johns Hopkins University, Baltimore, MD
In Proceedings of NAACL '01, the Second Meeting of the North American Chapter of the Association for Computational Linguistics.
Dynamic Programming Approaches
Earley: top-down, no filtering, no restriction on grammar form.
CKY (we will see this applied to probabilistic CFGs): bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF), i.e., ε-free, with each production of the form A -> B C or A -> a.
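For contrast with Earley, here is a CKY recognizer sketch for a grammar in CNF. The grammar, lexicon, and table encoding below are illustrative assumptions of my own.

```python
from collections import defaultdict

# CNF grammar: binary rules (B, C) -> {A : A -> B C}, and a lexicon
# mapping each word to the categories that can rewrite to it.
BINARY = {("NP", "VP"): {"S"}, ("Det", "Noun"): {"NP"}, ("Verb", "NP"): {"VP"}}
LEXICON = {"a": {"Det"}, "the": {"Det"}, "flight": {"Noun"},
           "book": {"Verb", "Noun"}, "I": {"NP"}}

def cky_recognize(words, start="S"):
    n = len(words)
    table = defaultdict(set)            # table[(i, j)] = categories over words[i:j]
    for i, w in enumerate(words):       # width-1 spans come from the lexicon
        table[(i, i + 1)] = set(LEXICON.get(w, set()))
    for span in range(2, n + 1):        # bottom-up over increasing span widths
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):   # every possible split point
                for b in table[(i, k)]:
                    for c in table[(k, j)]:
                        table[(i, j)] |= BINARY.get((b, c), set())
    return start in table[(0, n)]
```

Note how the CNF restriction pays off: every constituent splits into exactly two parts, so the table can be filled by three nested loops, giving the O(N^3) bound.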
For Next Time
Read Chapter 14: Statistical Parsing.
Optional: read Chapter 15 (Features and Unification); skip the algorithms and implementation.
Start working on assignment 2!
Grammars and Constituency
Of course, there is nothing easy or obvious about how we come up with the right set of constituents and the rules that govern how they combine. That is why there are so many different theories of grammar and competing analyses of the same data. The approach to grammar, and the analyses, adopted here are very generic (and do not correspond to any modern linguistic theory of grammar).
Syntactic Notions so far…
N-grams: the probability distribution for the next word can be effectively approximated knowing the previous n words.
POS categories are based on: distributional properties (what other words can occur nearby) and morphological properties (the affixes they take).
These notions are syntactic in the sense that they extend beyond a single word out of context. By syntax I mean the kind of implicit knowledge of your native language that you had mastered by the time you were 3 or 4 years old without explicit instruction, not the kind of stuff you were later taught in school.