CPSC 503 Computational Linguistics


CPSC 503 Computational Linguistics, Lecture 9. Giuseppe Carenini. CPSC 503, Winter 2010.

Knowledge-Formalisms Map
[Diagram: linguistic levels (Morphology, Syntax, Semantics, Pragmatics, Discourse and Dialogue) mapped to formalisms: state machines and their probabilistic versions (finite-state automata, finite-state transducers, Markov models); rule systems and their probabilistic versions (e.g., (probabilistic) context-free grammars); logical formalisms (first-order logic); AI planners]
Last time:
- Big transition: state machines (regular languages) -> CFG grammars (context-free languages)
- Parsing: two approaches, top-down vs. bottom-up (combined with left corners)
- Still inefficient, for 3 reasons

Today (7/10):
- The Earley Algorithm
- Partial Parsing: Chunking
- Dependency Grammars / Parsing
- Treebanks

Problems with TD parsing plus BU filtering
1. Left recursion: left-recursive rules can lead a top-down, depth-first, left-to-right parser to expand the same non-terminal over and over, producing an infinite series of ever-deeper trees.
2. Ambiguity: the number of valid parses can grow exponentially in the number of phrases (and most of them make no sense semantically, so we would not even want to consider them).
3. Repeated parsing: the parser builds valid trees for portions of the input, discards them during backtracking, and then finds out it has to rebuild them again.
SOLUTION: the Earley algorithm (once again, dynamic programming!)

(1) Structural Ambiguity
Three basic kinds: attachment / coordination / NP-bracketing.
- Attachment: "I shot an elephant in my pajamas" (PP attachment: VP -> V NP with NP -> NP PP, vs. VP -> V NP PP); non-PP attachment: "I saw Mary passing by cs2"
- Coordination: "new student and profs"
- NP-bracketing: "French language teacher"
What are other kinds of ambiguity?
The number of parses explodes with the number of attachment sites (the slide's table: 6 PPs give 469 NP parses, 7 PPs give 1430). In combinatorial mathematics, the Catalan numbers form a sequence of natural numbers that occur in various counting problems, often involving recursively defined objects: C_n = (2n)! / ((n+1)! n!).
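A quick, illustrative way to see this kind of growth (a sketch, not from the lecture): the Catalan numbers count the distinct binary bracketings, and they grow very fast.

```python
from math import comb

def catalan(n: int) -> int:
    """C_n = (2n)! / ((n+1)! n!), the number of distinct binary bracketings of n+1 items."""
    return comb(2 * n, n) // (n + 1)

for n in range(2, 9):
    print(n, catalan(n))   # 2, 5, 14, 42, 132, 429, 1430, ...
```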

(2) Repeated Work
Parsing is hard and slow, and it is wasteful to redo the same work over and over. Consider a top-down attempt to parse the following as an NP: "A flight from Indi to Houston on TWA".

Starts from: NP -> Det Nom, NP -> NP PP, Nom -> Noun, … [builds a partial tree over "flight"] … fails and backtracks.

Restarts from: NP -> Det Nom, NP -> NP PP, Nom -> Noun … [rebuilds the "flight" subtree] … fails and backtracks.

Restarts from: … [rebuilds the "flight" subtree again] … fails and backtracks.

Restarts from: … Success!

But… [tree diagram with sub-constituents numbered 1–4 that were built and discarded repeatedly along the way]

Dynamic Programming
Fills a table with solutions to subproblems. For parsing, the subproblems are sub-trees consistent with portions of the input: once discovered, they are stored and can be reused. This:
- stores ambiguous parses compactly
- avoids (avoidable) repeated work
- solves an exponential problem in polynomial time

Earley Parsing: O(N^3)
Fills a table in a single sweep over the input words. The table has N + 1 entries, where N is the number of words. Table entries represent:
- predicted constituents
- in-progress constituents
- completed constituents, together with their locations

States
The table entries are called states, and they express:
- what is predicted (expected) from that point
- what is in progress at that point
- what has been recognized up to that point
Representation: dotted rules plus a location (span).
S -> · VP [0,0]            a VP is predicted at the start of the sentence
NP -> Det · Nominal [1,2]  an NP is in progress; the Det spans from 1 to 2
VP -> V NP · [0,3]         a VP has been found, starting at 0 and ending at 3

Earley: the answer
The answer is found by looking in the right place in the table: the final column should contain the state S -> α · [0,n], i.e., a complete S state that spans the whole input from 0 to n (as with most dynamic programming approaches, the result is read off the filled table).

Earley Parsing Procedure
Sweep through the table from 0 to n in order, applying one of three operators to each state:
- Predictor: add top-down predictions to the chart
- Scanner: read the input and add the corresponding state to the chart
- Completer: move the dot to the right when a new constituent is found
Results (new states) are added to the current or the next set of states in the chart. There is no backtracking, and no states are ever removed.

Predictor
Intuition: new states represent top-down expectations. Applied when a non-part-of-speech non-terminal is to the right of the dot, e.g. S -> · VP [0,0]. Adds new states to the end of the current chart entry, one for each expansion of that non-terminal in the grammar:
VP -> · V [0,0]
VP -> · V NP [0,0]

Scanner (part of speech)
New states for a predicted part of speech. Applicable when a part-of-speech category is to the right of the dot, e.g. VP -> · Verb NP [0,0] with input (0 "Book…" 1). The scanner looks at the current word in the input; if it matches, it adds the corresponding state(s) to the next chart entry:
Verb -> book · [0,1]

Completer
Intuition: we have found a constituent, so tell everyone that was waiting for it. Applied when the dot has reached the right end of a rule, e.g. NP -> Det Nom · [1,3]. Find all states with the dot at position 1 that are expecting an NP, e.g. VP -> V · NP [0,1], and add the advanced (here, completed) state(s) to the current chart entry:
VP -> V NP · [0,3]
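To make the three operators concrete, here is a minimal Earley recognizer sketch (not the lecture's code; the toy grammar, lexicon, and state representation are assumptions made only for illustration):

```python
GRAMMAR = {                                   # toy grammar, assumed for illustration
    "S":       [["VP"]],
    "VP":      [["Verb"], ["Verb", "NP"]],
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
}
LEXICON = {"book": "Verb", "that": "Det", "flight": "Noun"}
POS = set(LEXICON.values())

# A state is (lhs, rhs, dot, start, end), e.g. ("VP", ("Verb", "NP"), 1, 0, 1).

def earley_recognize(words):
    chart = [[] for _ in range(len(words) + 1)]

    def enqueue(state, i):
        if state not in chart[i]:
            chart[i].append(state)

    enqueue(("GAMMA", ("S",), 0, 0, 0), 0)            # dummy start state
    for i in range(len(words) + 1):
        for lhs, rhs, dot, start, end in chart[i]:    # chart[i] grows as we iterate
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                # PREDICTOR: expand the non-terminal to the right of the dot
                for expansion in GRAMMAR[rhs[dot]]:
                    enqueue((rhs[dot], tuple(expansion), 0, i, i), i)
            elif dot < len(rhs) and rhs[dot] in POS:
                # SCANNER: if the next word has the predicted POS,
                # add a completed POS state to the *next* chart entry
                if i < len(words) and LEXICON.get(words[i]) == rhs[dot]:
                    enqueue((rhs[dot], (words[i],), 1, i, i + 1), i + 1)
            else:
                # COMPLETER: advance every state waiting for this constituent
                # (no epsilon rules in this toy grammar, so start < i here)
                for w_lhs, w_rhs, w_dot, w_start, _ in chart[start]:
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        enqueue((w_lhs, w_rhs, w_dot + 1, w_start, end), i)
    # Accept iff a complete dummy start state spans the whole input.
    return ("GAMMA", ("S",), 1, 0, len(words)) in chart[-1]

print(earley_recognize(["book", "that", "flight"]))   # True
```

Run on "book that flight", this accepts by finding the complete dummy start state in the last column; the following slides show how to turn such a recognizer into a parser.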

Example: "Book that flight". We should find an S from 0 to 3 that is a completed state.

Example: "Book that flight" [the full chart for this example is filled in on the slide]

So far, only a recognizer…
To generate all parses: when the old states waiting for a just-completed constituent are updated, add a pointer from each "updated" state to the "completed" state.
Chart[0]: … S5: S -> · VP [0,0] []; S6: VP -> · Verb [0,0] []; S7: VP -> · Verb NP [0,0] []; …
Chart[1]: S8: Verb -> book · [0,1] []; S9: VP -> Verb · [0,1] [S8]; S10: S -> VP · [0,1] [S9]; S11: VP -> Verb · NP [0,1] [??] (answer: S8); …
Then simply read off all the backpointers from every complete S in the last column of the table.
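A sketch of the change (assuming the recognizer sketch above is extended so that each state carries a sixth field, its back-pointer list); only the completer needs to change:

```python
def complete_with_backpointers(chart, completed_state, completed_id):
    """COMPLETER with back-pointers: advance every state waiting for this constituent."""
    lhs, rhs, dot, start, end, _back = completed_state
    for (w_lhs, w_rhs, w_dot, w_start, w_end, w_back) in list(chart[start]):
        if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
            advanced = (w_lhs, w_rhs, w_dot + 1, w_start, end,
                        w_back + [completed_id])   # pointer from "updated" to "completed"
            if advanced not in chart[end]:
                chart[end].append(advanced)

# To read off the parses: take every complete S spanning [0, n] in the last
# column and recursively follow its back-pointers down to the words.
```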

Error Handling
What happens when we look at the contents of the last table column and don't find an S -> α · [0,n] state? Is it a total loss? No: the chart still contains every constituent, and combination of constituents, that is possible for the input given the grammar. This is also useful for partial parsing or shallow parsing, as used in information extraction.

Dynamic Programming Approaches
- Earley: top-down, no filtering, no restriction on grammar form
- CKY (we will see it applied to probabilistic CFGs): bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF), i.e., ε-free, with every production of the form A -> B C or A -> a
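For contrast with Earley, here is a compact CKY recognizer sketch (again illustrative; the toy CNF grammar below is an assumption, not the lecture's):

```python
from itertools import product

# Toy CNF grammar: lexical rules word -> categories, and binary rules (B, C) -> A.
LEXICAL = {"book": {"Verb", "VP", "S"}, "that": {"Det"}, "flight": {"Noun", "Nominal"}}
BINARY = {
    ("Det", "Nominal"): {"NP"},
    ("Verb", "NP"):     {"VP", "S"},
}

def cky_recognize(words, start_symbol="S"):
    n = len(words)
    # table[i][j] = set of non-terminals spanning words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(LEXICAL.get(w, set()))
    for span in range(2, n + 1):                 # fill shorter spans first (bottom-up)
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):            # try every split point
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY.get((b, c), set())
    return start_symbol in table[0][n]

print(cky_recognize(["book", "that", "flight"]))   # True with this toy grammar
```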

Today (7/10):
- The Earley Algorithm
- Partial Parsing: Chunking
- Dependency Grammars / Parsing
- Treebanks

Chunking
Classify only basic, non-recursive phrases (NP, VP, AP, PP):
- find non-overlapping chunks
- assign labels to the chunks
A chunk typically includes the head word and its pre-head material ((specifier) head, but not the complements):
[NP The HD box] that [NP you] [VP ordered] [PP from] [NP Shaw] [VP never arrived]

Approaches to Chunking (1): Finite-State, Rule-Based
A set of hand-crafted rules (no recursion!), e.g. NP -> (Det) Noun* Noun, implemented as FSTs (unioned / determinized / minimized). F-measure: 85–92. To build tree-like structures, several FSTs can be combined [Abney ’96]. (Show NLTK demo.)
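One way to run that kind of demo (a sketch, not the lecture's exact demo; the chunk pattern below is an assumption, roughly NP -> (Det) Adj* Noun+):

```python
import nltk

# Rule-based NP chunker over POS-tagged input, in the finite-state spirit above.
grammar = r"NP: {<DT>?<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)

sentence = [("the", "DT"), ("HD", "NN"), ("box", "NN"),
            ("never", "RB"), ("arrived", "VBD")]
print(chunker.parse(sentence))
# roughly: (S (NP the/DT HD/NN box/NN) never/RB arrived/VBD)
```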

Approaches to Chunking (1): Finite-State, Rule-Based (continued)
… several FSTs can be combined. What about ambiguity?

Approaches to Chunking (2): Machine Learning
A case of sequential classification, using IOB tagging: (I) inside, (O) outside, (B) beginning. With an Inside and a Beginning tag for each chunk type, the tagset has size 2n + 1, where n is the number of chunk types. Recipe:
- find an annotated corpus
- select a feature set
- select and train a classifier

Context-Window Approach
Typical features:
- current / previous / following words
- current / previous / following POS tags (e.g., NN = noun)
- previously assigned chunk tags
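A sketch of what such a feature map might look like (the feature names, window size, and sentence-boundary markers are assumptions made for illustration):

```python
def chunk_features(tokens, pos_tags, i, prev_iob):
    """Context-window features for deciding the IOB chunk tag of token i."""
    return {
        "word":      tokens[i],
        "pos":       pos_tags[i],
        "prev_word": tokens[i - 1]   if i > 0 else "<s>",
        "prev_pos":  pos_tags[i - 1] if i > 0 else "<s>",
        "next_word": tokens[i + 1]   if i < len(tokens) - 1 else "</s>",
        "next_pos":  pos_tags[i + 1] if i < len(tokens) - 1 else "</s>",
        "prev_iob":  prev_iob,        # previously assigned chunk tag
    }

# Any standard classifier can then be trained on (features, IOB-tag) pairs
# extracted from an annotated corpus.
print(chunk_features(["He", "reckons", "the", "deficit"],
                     ["PRP", "VBZ", "DT", "NN"], 2, "O"))
```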

Context-Window Approach and Others
The specific choice of machine learning approach does not seem to matter much; F-measures are in the 92–94 range. Common causes of errors:
- POS tagger inaccuracies
- inconsistencies in the training corpus
- inaccuracies in identifying heads (the head is the word in a phrase that is grammatically most important)
- ambiguities involving conjunctions (e.g., "late arrivals and cancellations/departures are common in winter")
Related work: shallow parsing using specialized HMMs (Antonio Molina and Ferran Pla, Journal of Machine Learning Research 2, 2002, pp. 595–613); NAACL ’03.

Today (7/10):
- The Earley Algorithm
- Partial Parsing: Chunking
- Dependency Grammars / Parsing
- Treebanks

Dependency Grammars
- Syntactic structure: binary relations between words
- Links: grammatical functions or very general semantic relations
- Abstract away from word-order variations (simpler grammars)
- Provide useful features for many NLP applications (classification, summarization, NLG)

Dependency Grammars (more verbose)
In CFG-style phrase-structure grammars the main focus is on constituents, but it turns out you can get a lot done with just binary relations among the words in an utterance. In a dependency grammar framework, a parse is a tree whose nodes stand for the words of the utterance and whose links represent dependency relations between pairs of words. Relations may be typed (labeled) or not.

Dependency Relations
(Grammar primer shown in class.) Example, clausal subject: "That he had even asked her made her angry." The clause "that he had even asked her" is the subject of this sentence.

Dependency Parse (example 1): "They hid the letter on the shelf"
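Not part of the lecture, but a quick way to look at a dependency parse of this sentence today is spaCy (the slides point to the MINIPAR and Stanford parser demos instead); this assumes the en_core_web_sm model is installed:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("They hid the letter on the shelf")
for token in doc:
    # word --relation--> head word
    print(f"{token.text:8} --{token.dep_:>6}--> {token.head.text}")
# e.g. "They" --nsubj--> "hid", "letter" --dobj--> "hid", ...
```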

Dependency Parse (ex 2)

Dependency Parsing (see MINIPAR / Stanford demos)
The dependency approach has a number of advantages over full phrase-structure (CFG) parsing:
- it deals well with free-word-order languages, where the constituent structure is quite fluid
- parsing is much faster than with CFG-based parsers
- the dependency structure often captures all the syntactic relations actually needed by later applications (CFG-based approaches often extract this same information from trees anyway)

Dependency Parsing
There are two modern, data-driven approaches to dependency parsing, both based on supervised learning from treebank data (annotated sentences):
- Graph-based / optimization-based [Eisner 1996; McDonald et al. 2005a]: define a space of candidate dependency graphs for a sentence; learning induces a model for scoring an entire dependency graph; inference finds the highest-scoring graph given the induced model.
- Transition-based [Yamada and Matsumoto 2003; Nivre et al. 2004]: define a transition system (state machine) for mapping a sentence to its dependency graph; learning induces a model for predicting the next transition given the transition history; inference constructs the optimal transition sequence given the induced model.
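To make the transition-based idea concrete, here is a bare-bones arc-standard sketch (an illustration, not any particular parser); the scripted action sequence stands in for the learned model that would normally predict the next transition from the current configuration:

```python
def arc_standard(words, next_action):
    """Run SHIFT / LEFT-ARC / RIGHT-ARC transitions; return (head, dependent) arcs."""
    stack, buffer, arcs = ["ROOT"], list(words), []
    while buffer or len(stack) > 1:
        action = next_action(stack, buffer)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":            # second-from-top depends on the top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "RIGHT-ARC":           # top depends on second-from-top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

def scripted(actions):
    """Stand-in for a trained classifier: just replay a fixed action sequence."""
    it = iter(actions)
    return lambda stack, buffer: next(it)

gold = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
print(arc_standard(["They", "hid", "letters"], scripted(gold)))
# [('hid', 'They'), ('hid', 'letters'), ('ROOT', 'hid')]
```

In a real transition-based parser, the scripted policy is replaced by a classifier trained on configurations derived from a treebank.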

Today (7/10):
- The Earley Algorithm
- Partial Parsing: Chunking
- Dependency Grammars / Parsing
- Treebanks

Treebanks
DEF.: corpora in which each sentence has been paired with a parse tree (presumably the right one). They are generally created by first parsing the collection with an automatic parser and then having human annotators revise each parse as necessary. This requires detailed annotation guidelines that provide a POS tagset, a grammar, and instructions for how to deal with particular grammatical constructions.

Penn Treebank
The Penn Treebank is a widely used treebank; the best-known part is its Wall Street Journal section: about 1 million words from the 1987–1989 Wall Street Journal. Penn Treebank phrases are annotated with grammatical function, to make recovery of predicate-argument structure easier.

Treebank Grammars
Treebanks implicitly define a grammar for the language they cover: simply take the local rules that make up the sub-trees of all the trees in the collection and you have a grammar. It is not complete, but with a decent-size corpus you get a grammar with decent coverage.
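As an illustration (not from the lecture), reading a grammar off a treebank takes only a few lines with NLTK's bundled sample of the Penn Treebank (assumes nltk and its 'treebank' corpus data are installed):

```python
import nltk                               # may require: nltk.download('treebank')
from nltk.corpus import treebank
from collections import Counter

rules = Counter()
for tree in treebank.parsed_sents():
    for prod in tree.productions():       # the local rules of every sub-tree
        rules[prod] += 1

print(len(rules), "distinct rules extracted from the sample")
for prod, count in rules.most_common(5):
    print(count, prod)
```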

Treebank Grammars
Such grammars tend to be very flat, because the trees tend to avoid recursion (to ease the annotators' burden). For example, the Penn Treebank has about 4,500 different rules for VPs alone, out of a total of roughly 17,500 rules.

Heads in Trees
Finding heads in treebank trees is a task that arises frequently in many applications, and it is particularly important in statistical parsing. We can visualize the task by annotating each node of a parse tree with the head of the corresponding constituent.

Lexically Decorated Tree

Head Finding The standard way to do head finding is to use a simple set of tree traversal rules specific to each non-terminal in the grammar.

Noun Phrases
For each phrase type, a simple set of hand-written rules finds the head of that kind of phrase. These rules are often called head-percolation rules.
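A tiny head-percolation sketch (illustrative only; the rule table is a toy fragment and an assumption, not the full set of head-finding rules used with the Penn Treebank):

```python
HEAD_RULES = {
    # phrase label: (search direction, label priority list)
    "NP": ("right-to-left", ["NN", "NNS", "NNP", "NP", "JJ"]),
    "VP": ("left-to-right", ["VBD", "VBZ", "VB", "VP"]),
    "PP": ("left-to-right", ["IN", "TO"]),
}

def find_head(phrase_label, children):
    """children: list of (label, head word) pairs for the node's daughters."""
    direction, priorities = HEAD_RULES.get(phrase_label, ("left-to-right", []))
    order = children if direction == "left-to-right" else list(reversed(children))
    for wanted in priorities:
        for label, head in order:
            if label == wanted:
                return head
    return order[0][1]        # fall back to the first child in search order

print(find_head("NP", [("DT", "the"), ("JJ", "last"), ("NN", "flight")]))  # flight
```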

Treebank Uses
- Searching a treebank, e.g. with TGrep2: "NP < PP" matches an NP immediately dominating a PP; "NP << PP" matches an NP dominating a PP.
- Treebanks (and head finding) are particularly critical to the development of statistical parsers (Chapter 14).
- Also valuable for corpus linguistics: investigating the empirical details of various constructions in a given language.

Next time: read Chapter 14. (The Knowledge-Formalisms Map slide from the start of the lecture is shown again here.)