CPSC 503 Computational Linguistics Lecture 8 Giuseppe Carenini 4/11/2019 CPSC503 Winter 2007
Knowledge-Formalisms Map
Formalisms: state machines and their probabilistic versions (finite-state automata, finite-state transducers, Markov models); rule systems and their probabilistic versions (e.g., (probabilistic) context-free grammars); logical formalisms (first-order logics); AI planners.
Linguistic levels: morphology, syntax, semantics, pragmatics, discourse and dialogue.
Last time: the big transition from state machines (regular languages) to CFGs (context-free languages). Parsing: two approaches, top-down (TD) vs. bottom-up (BU), which can be combined using left corners. Both are still inefficient, for three reasons.
Today 4/10
  The Earley Algorithm
  Partial Parsing: Chunking
Parsing with CFGs
[Diagram: a sequence of words (e.g., “I prefer a morning flight”) and a CFG go into the parser, which outputs the valid parse trees.]
A CFG is a declarative formalism. What does that mean? It specifies which trees are valid, but not how to compute them.
Parsing with CFGs refers to the task of assigning correct trees to input strings. Correct here means a tree that covers all and only the elements of the input and has an S at the top. It does not mean that the system can select the correct tree from among the possible trees.
Parsing as Search
The CFG defines the search space of possible parse trees:
  S -> NP VP
  S -> Aux NP VP
  NP -> Det Noun
  VP -> Verb
  Det -> a
  Noun -> flight
  Verb -> left
  Aux -> do, does
As with everything of interest, parsing involves a search, which involves making choices. The search space is defined by the grammar. Two constraints should guide the search: the grammar (the result must be a tree with an S at the top) and the input (the tree must cover all the words in the input).
Parsing: find all trees that cover all and only the words in the input.
Constraints on Search
[Diagram: a sequence of words (e.g., “I prefer a morning flight”) and a CFG, which defines the search space, go into the parser, which outputs the valid parse trees.]
Is the search space finite or infinite?
Search strategies:
  Top-down, or goal-directed
  Bottom-up, or data-directed
Problems with TD-BU Filtering
A typical top-down (TD), depth-first, left-to-right, backtracking strategy (with BU filtering) cannot deal effectively with:
  Left recursion: left-recursive rules can lead a TD, depth-first, left-to-right parser to expand the same non-terminal over and over, leading to an infinite expansion of trees.
  Ambiguity: the number of valid parses can grow exponentially in the number of phrases (most of them make no sense semantically, so we do not consider them).
  Repeated parsing: the parser builds valid trees for portions of the input, discards them during backtracking, and then has to rebuild them.
Solution: the Earley algorithm (once again, dynamic programming!)
(1) Left-Recursion
These rules appear in most English grammars:
  S -> S and S
  VP -> VP PP
  NP -> NP PP
A TD, depth-first, left-to-right parser is led into an infinite expansion. A TD, breadth-first parser would successfully find parses for valid sentences, but given an invalid sentence it would get stuck in an infinite search space.
You can rewrite the grammar to eliminate left recursion:
  A -> A β | α   becomes   A -> α A'   and   A' -> β A' | ε
However, the rewritten grammar yields unnatural grammatical structures and makes semantic interpretation quite difficult.
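The grammar rewrite for immediate left recursion can be sketched in a few lines of Python. This is a minimal illustration, not a general grammar-normalization tool: the function name `remove_immediate_left_recursion` and the tuple-based rule representation are assumptions made for this example, and indirect left recursion is not handled.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, ...]  # a right-hand side as a tuple of symbols (illustrative representation)


def remove_immediate_left_recursion(nt: str, rhss: List[Rule]) -> Dict[str, List[Rule]]:
    """Rewrite A -> A beta | alpha as A -> alpha A', A' -> beta A' | empty."""
    # The "beta" parts: what follows the left-recursive occurrence of nt.
    betas = [rhs[1:] for rhs in rhss if rhs and rhs[0] == nt]
    # The "alpha" parts: right-hand sides that do not start with nt.
    alphas = [rhs for rhs in rhss if not rhs or rhs[0] != nt]
    if not betas:
        return {nt: list(rhss)}  # nothing to rewrite
    new_nt = nt + "'"
    return {
        nt: [alpha + (new_nt,) for alpha in alphas],
        # The empty tuple () stands for the empty production.
        new_nt: [beta + (new_nt,) for beta in betas] + [()],
    }


# NP -> NP PP | Det Nom
result = remove_immediate_left_recursion("NP", [("NP", "PP"), ("Det", "Nom")])
```

For the NP rules above, `result` maps `NP` to `Det Nom NP'` and `NP'` to `PP NP'` or the empty production, which shows why the resulting trees no longer mirror the natural NP structure.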
(2) Structural Ambiguity
Three basic kinds: attachment, coordination, NP-bracketing.
  PP attachment: “I shot an elephant in my pajamas”
    VP -> V NP ; NP -> NP PP
    VP -> V NP PP
  Non-PP attachment: “I saw Mary passing by cs2”
  Coordination: “new student and profs”
  NP-bracketing: “French language teacher”
What are other kinds of ambiguity?
  # of PPs   # of NP parses
  …          …
  6          469
  7          1430
Catalan numbers: $C_n = \frac{1}{n+1}\binom{2n}{n}$
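To get a feel for how fast the Catalan numbers grow, the formula can be evaluated directly. The sketch below assumes Python 3.8+ (for `math.comb`); the function name `catalan` is chosen for this example. How the slide's table rows are indexed against n is not stated, so the sketch does not try to reproduce the table.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number: C(2n, n) / (n + 1), always an integer."""
    return comb(2 * n, n) // (n + 1)


# Values for n = 1..8, showing the rapid growth.
first_values = [catalan(n) for n in range(1, 9)]
```

Here `first_values` is `[1, 2, 5, 14, 42, 132, 429, 1430]`: each additional phrase multiplies the number of bracketings by a factor approaching 4.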
(3) Repeated Work
Parsing is hard and slow, so it is wasteful to redo the same work over and over.
Consider an attempt to parse the following top-down as an NP:
“A flight from Indi to Houston on TWA”
Starts from:
  NP -> Det Nom
  NP -> NP PP
  Nom -> Noun
  …
Fails and backtracks. [Figure: partial tree over “flight”]
Restarts from the same rules; fails and backtracks. [Figure: partial tree over “flight”]
Restarts again; fails and backtracks. [Figure: partial tree over “flight”]
Restarts again: success!
But… [Figure with the labels 4, 3, 2, 1]
Dynamic Programming
Fills tables with solutions to subproblems.
For parsing: sub-trees consistent with portions of the input, once discovered, are stored and can be reused.
  Does not fall prey to left recursion
  Stores ambiguous parses compactly
  Does not do (avoidable) repeated work
  Solves an exponential problem in polynomial time
Earley Parsing
O(N^3)
Fills a table in a single sweep over the input words.
The table has length N + 1, where N is the number of words.
Table entries represent:
  Predicted constituents
  In-progress constituents
  Completed constituents and their locations
States
The table entries are called states and express:
  what is predicted from that point
  what has been recognized up to that point
Representation: dotted rules + location
  S -> · VP [0,0]             A VP is predicted (expected) at the start of the sentence
  NP -> Det · Nominal [1,2]   An NP is in progress; the Det goes from 1 to 2
  VP -> V NP · [0,3]          A VP has been found starting at 0 and ending at 3
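One way to represent such a state in code is a small immutable record holding the rule, the dot position, and the span. This is only a sketch: the class name `State` and its field names are assumptions for illustration, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    lhs: str               # left-hand side of the rule
    rhs: Tuple[str, ...]   # right-hand side symbols
    dot: int               # number of rhs symbols already recognized
    start: int             # where the constituent begins
    end: int               # how far the recognized part extends

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        """Symbol right after the dot, or None if the state is complete."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "·")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"


# The three example states from the slide.
predicted = State("S", ("VP",), 0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
completed = State("VP", ("V", "NP"), 2, 0, 3)
```

Making the record frozen (hashable and comparable by value) is a design choice that makes it easy to check whether a state is already in the chart.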
Graphically
[Figure: the three example states drawn over the input positions]
  S -> · VP [0,0]
  NP -> Det · Nominal [1,2]
  VP -> V NP · [0,3]
Earley: Answer
As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
The following state should be in the final column:
  S -> α · [0,n]
i.e., an S state that spans from 0 to n and is complete.
Earley Parsing Procedure
Sweep through the table from 0 to n in order, applying one of three operators to each state:
  predictor: adds top-down predictions to the chart
  scanner: reads the input and adds the corresponding state to the chart
  completer: moves the dot to the right when a new constituent is found
Results (new states) are added to the current or next set of states in the chart.
No backtracking, and no states are removed.
Predictor
Intuition: new states represent top-down expectations.
Applied when a non-part-of-speech non-terminal is to the right of a dot:
  S --> • VP [0,0]
Adds new states to the end of the current chart, one for each expansion of the non-terminal in the grammar:
  VP --> • V [0,0]
  VP --> • V NP [0,0]
Scanner (part of speech)
Adds new states for a predicted part of speech.
Applicable when a part of speech is to the right of a dot:
  VP --> • Verb NP [0,0]     ( 0 “Book…” 1 )
Looks at the current word in the input. If it matches, adds state(s) to the next chart:
  Verb --> book • [0,1]
Completer
Intuition: we’ve found a constituent, so tell everyone waiting for it.
Applied when the dot has reached the right end of a rule:
  NP --> Det Nom • [1,3]
Finds all states that end at 1 and expect an NP:
  VP --> V • NP [0,1]
Adds the advanced state(s) to the current chart (here the new state is itself complete):
  VP --> V NP • [0,3]
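The three operators can be put together into a small recognizer. The following is a minimal sketch under stated assumptions, not the lecture's reference implementation: the toy `GRAMMAR`, `LEXICON`, and `POS` tables, the dummy `GAMMA` start state, and the tuple encoding of states are all made up for illustration, and rules with an empty right-hand side are not supported.

```python
from typing import Dict, List, Set, Tuple

# Toy grammar and lexicon (assumptions for illustration only).
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("VP",)],
    "VP": [("Verb",), ("Verb", "NP")],
    "NP": [("Det", "Nominal"), ("NP", "PP")],  # second rule is left-recursive
    "Nominal": [("Noun",)],
    "PP": [("Prep", "NP")],
}
LEXICON: Dict[str, Set[str]] = {
    "book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"},
}
POS = {"Verb", "Noun", "Det", "Prep"}

# A state is (lhs, rhs, dot, start); its end is the chart column it is stored in.
State = Tuple[str, Tuple[str, ...], int, int]


def earley_recognize(words: List[str], start_symbol: str = "S") -> bool:
    n = len(words)
    chart: List[List[State]] = [[] for _ in range(n + 1)]

    def enqueue(state: State, col: int) -> None:
        if state not in chart[col]:  # never add a state that is already there
            chart[col].append(state)

    enqueue(("GAMMA", (start_symbol,), 0, 0), 0)  # dummy start state
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] can grow while it is being swept
            lhs, rhs, dot, start = chart[i][j]
            j += 1
            if dot < len(rhs) and rhs[dot] not in POS:
                # Predictor: one new state per expansion of the expected non-terminal.
                for expansion in GRAMMAR.get(rhs[dot], []):
                    enqueue((rhs[dot], expansion, 0, i), i)
            elif dot < len(rhs):
                # Scanner: if the current word has the expected POS, add to the next column.
                if i < n and rhs[dot] in LEXICON.get(words[i].lower(), set()):
                    enqueue((rhs[dot], (words[i].lower(),), 1, i), i + 1)
            else:
                # Completer: advance every state that was waiting for this constituent.
                for l2, r2, d2, s2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        enqueue((l2, r2, d2 + 1, s2), i)
    return ("GAMMA", (start_symbol,), 1, 0) in chart[n]


accepted = earley_recognize(["Book", "that", "flight"])
rejected = earley_recognize(["Book", "that"])
```

Because advancing a state creates a new tuple and the duplicate check in `enqueue` blocks repeated predictions, the left-recursive rule `NP -> NP PP` does not cause an infinite loop in this sketch.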
Example: “Book that flight”
We should find an S from 0 to 3 that is a completed state.
So far only a recognizer…
To generate all parses: when old states waiting for the just-completed constituent are updated, add a pointer from each “updated” state to the “completed” one.
Chart[0]
  …
  S5   S -> . VP         [0,0]  []
  S6   VP -> . Verb      [0,0]  []
  S7   VP -> . Verb NP   [0,0]  []
  …
Chart[1]
  S8   Verb -> book .    [0,1]  []
  S9   VP -> Verb .      [0,1]  [S8]
  S10  S -> VP .         [0,1]  [S9]
  S11  VP -> Verb . NP   [0,1]  [??]   (answer: S8)
  …
Then simply read off all the backpointers from every complete S in the last column of the table.
Error Handling
What happens when we look at the contents of the last table column and don't find a complete S state (S --> α •)?
Is it a total loss? No: the chart contains every constituent and combination of constituents possible for the input given the grammar.
This is also useful for partial parsing or shallow parsing, as used in information extraction.
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search:
  Never place a state into the chart that’s already there
  Copy states before advancing them
Earley and Left Recursion: 1
  S -> NP VP
  NP -> NP PP
The first rule predicts
  S -> · NP VP [0,0]
which adds
  NP -> · NP PP [0,0]
Prediction stops there, since any subsequent prediction would only produce a state that is already in the chart.
Earley and Left Recursion: 2
When a state gets advanced, make a copy and leave the original alone.
Say we have
  NP -> · NP PP [0,0]
We find an NP from 0 to 2, so we create
  NP -> NP · PP [0,2]
But we leave the original state as is, so that it can wait for the recognition of “longer” NPs.
Dynamic Programming Approaches
Earley: top-down, no filtering, no restriction on grammar form.
CKY: bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF), i.e., ε-free, with each production either A -> B C or A -> a.
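For contrast with Earley, a bottom-up CKY recognizer over a CNF grammar can be sketched as follows. The toy rule tables `BINARY` and `LEXICAL` and the function name `cky_recognize` are assumptions for illustration; they are a hand-written CNF grammar for this example, not a conversion of any grammar from the lecture.

```python
from typing import Dict, List, Set, Tuple

# Toy CNF grammar: binary rules A -> B C and lexical rules A -> a.
BINARY: Dict[Tuple[str, str], Set[str]] = {
    ("Verb", "NP"): {"S", "VP"},
    ("Det", "Noun"): {"NP"},
}
LEXICAL: Dict[str, Set[str]] = {
    "book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"},
}


def cky_recognize(words: List[str], start: str = "S") -> bool:
    n = len(words)
    if n == 0:
        return False
    # table[i][k] holds the non-terminals that span words[i:k].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICAL.get(word.lower(), set()))
    for span in range(2, n + 1):              # widen spans bottom-up
        for i in range(0, n - span + 1):
            k = i + span
            for j in range(i + 1, k):         # every split point
                for b in table[i][j]:
                    for c in table[j][k]:
                        table[i][k] |= BINARY.get((b, c), set())
    return start in table[0][n]


cky_ok = cky_recognize(["book", "that", "flight"])
```

The three nested loops over span, start position, and split point are what give the cubic dependence on sentence length.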
Today 4/10
  The Earley Algorithm
  Partial Parsing: Chunking
Chunking
Classify only basic non-recursive phrases (NP, VP, AP, PP):
  Find non-overlapping chunks
  Assign labels to chunks
A chunk typically includes the headword and pre-head material:
  [NP The HD box] that [NP you] [VP ordered] [PP from] [NP Shaw] [VP never arrived]
Approaches to Chunking (1): Finite-State Rule-Based
A set of hand-crafted rules (no recursion!), e.g., NP -> (Det) Noun* Noun
Implemented as FSTs (unioned/determinized/minimized)
F-measure 85-92
To build tree-like structures, several FSTs can be combined [Abney ’96]
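The example rule NP -> (Det) Noun* Noun is a regular pattern over POS tags, so it can be prototyped with a regular expression. The sketch below uses Python's `re` module as a stand-in for a hand-built FST; the one-letter tag encoding `TAG_CODE`, the function name `chunk_nps`, and the tagged example sentence are assumptions for illustration.

```python
import re
from typing import List, Tuple

# Encode each POS tag as one character so a rule becomes a regex over a string.
TAG_CODE = {"Det": "D", "Noun": "N"}   # every other tag maps to "x"
NP_PATTERN = re.compile(r"D?N+")       # (Det) Noun* Noun


def chunk_nps(tagged: List[Tuple[str, str]]) -> List[List[str]]:
    """Return the words of each non-overlapping NP chunk, left to right."""
    codes = "".join(TAG_CODE.get(tag, "x") for _, tag in tagged)
    return [
        [word for word, _ in tagged[m.start():m.end()]]
        for m in NP_PATTERN.finditer(codes)
    ]


sentence = [("the", "Det"), ("morning", "Noun"), ("flight", "Noun"),
            ("left", "Verb"), ("Houston", "Noun")]
nps = chunk_nps(sentence)
```

With this input, `nps` contains two chunks, “the morning flight” and “Houston”; because the pattern has no recursion, the chunks never nest or overlap.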
Approaches to Chunking (2): Machine Learning
A case of sequential classification.
IOB tagging: (I) internal, (O) outside, (B) beginning.
There is an Internal and a Beginning tag for each chunk type, plus a single Outside tag, so the tagset has size 2n + 1, where n is the number of chunk types.
  Find an annotated corpus
  Select a feature set
  Select and train a classifier
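To make the IOB encoding concrete, the sketch below converts a bracketed chunking into per-word tags. The `B-NP` / `I-NP` tag spelling is a common convention assumed here, and the function names `chunks_to_iob` and `tagset_size` are made up for this example.

```python
from typing import List, Tuple


def chunks_to_iob(chunks: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """Turn (label, words) chunks into (word, tag) pairs; label "O" means outside any chunk."""
    tagged = []
    for label, words in chunks:
        for k, word in enumerate(words):
            if label == "O":
                tagged.append((word, "O"))
            else:
                prefix = "B-" if k == 0 else "I-"   # first word begins the chunk
                tagged.append((word, prefix + label))
    return tagged


def tagset_size(num_chunk_types: int) -> int:
    """One B and one I tag per chunk type, plus a single O tag."""
    return 2 * num_chunk_types + 1


# The chunked sentence from the Chunking slide.
chunks = [("NP", ["The", "HD", "box"]), ("O", ["that"]), ("NP", ["you"]),
          ("VP", ["ordered"]), ("PP", ["from"]), ("NP", ["Shaw"]),
          ("VP", ["never", "arrived"])]
iob = chunks_to_iob(chunks)
```

Once the data is in this form, chunking reduces to assigning one tag per word, which is what makes it a sequential classification problem.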
Context window approach
Typical features:
  Current / previous / following words
  Current / previous / following POS
  Previous chunks
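A feature extractor for this kind of window might look like the sketch below. The window of one word on each side, the two previous chunk tags, the feature names, the padding symbols, and the Penn-style POS tags in the example are all assumptions for illustration; the slide does not fix any of them.

```python
from typing import Dict, List


def window_features(words: List[str], pos: List[str],
                    prev_chunk_tags: List[str], i: int) -> Dict[str, str]:
    """Features for classifying word i; prev_chunk_tags holds tags already assigned to words before i."""
    def at(seq: List[str], k: int, pad: str) -> str:
        return seq[k] if 0 <= k < len(seq) else pad  # pad outside the sentence

    return {
        "word": words[i],
        "word-1": at(words, i - 1, "<s>"),
        "word+1": at(words, i + 1, "</s>"),
        "pos": pos[i],
        "pos-1": at(pos, i - 1, "<s>"),
        "pos+1": at(pos, i + 1, "</s>"),
        "chunk-1": at(prev_chunk_tags, i - 1, "<s>"),
        "chunk-2": at(prev_chunk_tags, i - 2, "<s>"),
    }


words = ["The", "HD", "box", "that", "you"]
pos = ["DT", "NN", "NN", "WDT", "PRP"]
features = window_features(words, pos, ["B-NP", "I-NP"], 2)
```

The resulting dictionary can be fed to whatever classifier is chosen; only chunk tags to the left are used because tags to the right have not been assigned yet when classifying left to right.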
Context window approach
The specific choice of machine learning approach does not seem to matter.
F-measure in the 92-94 range.
Common causes of errors:
  POS tagger inaccuracies
  Inconsistencies in the training corpus
  Ambiguities involving conjunctions (e.g., “late arrivals and cancellations/departures are common in winter”)
For Next Time
Read Chapter 14 (Probabilistic CFG and Parsing)
Speaker: Lydia Kavraki, Professor, Rice University
Title: Motion Planning for Physical Systems
Time: 3:30 - 4:50 pm
Venue: Hugh Dempster Pavilion, 6245 Agronomy Rd., Room 310