Basic Parsing with Context-Free Grammars. Slides adapted from Julia Hirschberg and Dan Jurafsky.

1 Basic Parsing with Context-Free Grammars Slides adapted from Julia Hirschberg and Dan Jurafsky

2 Homework – Getting Started Data – News articles in TDT4 – Make sure you are looking in the ENG sub-directory – You need to represent each article in .arff form – You need to write a program that will extract “features” from each article – (Note that files with POS tags are now available in Eng_POSTAGGED) The .arff file contains the independent variables as well as the dependent variable

3 An example Start with classifying into topic Suppose you want to start with just the words Two approaches 1. Use your intuition to choose a few words that might disambiguate 2. Start with all words

4 What would your .arff file look like? Words are the attributes. What are the values? Binary: present or not. Frequency: how many times the word occurs. TF*IDF: the term frequency TF (how many times it occurs in this document) weighted by the inverse document frequency (which shrinks as DF, the number of documents the word occurs in, grows).
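
A minimal sketch of this feature-extraction step, assuming a toy two-document corpus held in memory rather than the real TDT4 files (the file names, the whitespace tokenizer, the second topic label, and the output path are placeholders, not the assignment setup). It computes binary, frequency, or TF*IDF values and writes a small .arff file:

    import math
    from collections import Counter

    # Toy stand-ins for the TDT4 articles; real input would be read from the ENG sub-directory.
    docs = {
        "news_2865.input": "reversing a policy that has kept medical errors secret",
        "news_2816.input": "the commonwealth of independent states accepted refugees",
    }
    labels = {"news_2865.input": "BT_9", "news_2816.input": "BT_11"}   # second label is hypothetical

    tokenized = {name: text.split() for name, text in docs.items()}
    vocab = sorted({w for toks in tokenized.values() for w in toks})
    n_docs = len(tokenized)
    df = Counter(w for toks in tokenized.values() for w in set(toks))  # document frequency

    def feature_value(word, toks, scheme="tfidf"):
        tf = toks.count(word)                      # term frequency in this document
        if scheme == "binary":
            return int(tf > 0)
        if scheme == "freq":
            return tf
        return tf * math.log(n_docs / df[word]) if tf else 0.0   # TF weighted by inverse DF

    # Write a minimal .arff file: one numeric attribute per word, plus the class attribute.
    with open("topics.arff", "w") as out:
        out.write("@relation topics\n")
        for w in vocab:
            out.write("@attribute " + w + " numeric\n")
        out.write("@attribute class {" + ",".join(sorted(set(labels.values()))) + "}\n")
        out.write("@data\n")
        for name, toks in tokenized.items():
            row = ["%.3f" % feature_value(w, toks) for w in vocab]
            out.write(",".join(row) + "," + labels[name] + "\n")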

5 news_2865.input NWIRE ENG NYT BT_9 NT_34 Reversing a policy that has kept medical errors secret for more than two decades, federal officials say they will soon allow Medicare beneficiaries to obtain data about doctors who botched their care. Tens of thousands of Medicare patients file complaints each year about the quality of care they receive from doctors and hospitals. But in many cases, patients get no useful information because doctors can block the release of assessments of their performance. Under a new policy, officials said, doctors will no longer be able to veto disclosure of the findings of investigations. Federal law has for many years allowed for review of care received by Medicare patients, and the law says a peer review organization must inform the patient of the ``final disposition of the complaint'' in each case. But the federal rules used to carry out the law say the peer review organization may disclose information about a doctor only ``with the consent of that practitioner.'' The federal manual for peer review organizations includes similar language about disclosure. Under the new plan, investigators will have to tell patients whether their care met ``professionally recognized standards of health care'' and inform them of any action against the doctor or the hospital. Patients could use such information in lawsuits and other actions against doctors and hospitals that provided substandard care. The new policy came in response to a lawsuit against the government

6 All words for news_2865.input
Class          BT_9    dependent
Reversing      1       independent
A              100
Policy         20
That           50
Has            75
Kept           3
Commonwealth   0       (news_2816.input)
Independent    0
States         0
Preemptive     0
Refugees       0

7 Try it! Open your file. Select attributes using Chi-square. You can cut and paste the resulting attributes to a file. Classify. How does it work? Try n-grams, POS or date next in the same way – How many features would each give you?

8 CFG: Example Many possible CFGs for English, here is an example (fragment):
– S → NP VP
– VP → V NP
– NP → DetP N | AdjP NP
– AdjP → Adj | Adv AdjP
– N → boy | girl
– V → sees | likes
– Adj → big | small
– Adv → very
– DetP → a | the
the very small boy likes a girl
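
As a sketch of the grammar-as-generator view, this fragment can be written down as a Python dict and expanded top-down from S. The depth cutoff is only there so the recursive NP and AdjP rules cannot loop forever:

    import random

    # The example fragment: LHS -> list of alternative right-hand sides.
    grammar = {
        "S":    [["NP", "VP"]],
        "VP":   [["V", "NP"]],
        "NP":   [["DetP", "N"], ["AdjP", "NP"]],
        "AdjP": [["Adj"], ["Adv", "AdjP"]],
        "N":    [["boy"], ["girl"]],
        "V":    [["sees"], ["likes"]],
        "Adj":  [["big"], ["small"]],
        "Adv":  [["very"]],
        "DetP": [["a"], ["the"]],
    }

    def generate(symbol, depth=0):
        """Expand a symbol top-down; anything without a rule is a terminal word."""
        if symbol not in grammar:
            return [symbol]
        # Past a depth bound, always take the first (non-recursive) alternative
        # so recursive rules like NP -> AdjP NP cannot recurse without end.
        rhs = grammar[symbol][0] if depth > 6 else random.choice(grammar[symbol])
        words = []
        for sym in rhs:
            words.extend(generate(sym, depth + 1))
        return words

    print(" ".join(generate("S")))   # e.g. "the very small boy likes a girl"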

9 Modify the grammar

10

12 Derivations of CFGs String rewriting system: we derive a string (=derived structure) But derivation history represented by phrase-structure tree (=derivation structure)!

13 Formal Definition of a CFG G = (V, T, P, S)
V: finite set of nonterminal symbols
T: finite set of terminal symbols; V and T are disjoint
P: finite set of productions of the form A → α, with A ∈ V and α ∈ (T ∪ V)*
S ∈ V: start symbol
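
The 4-tuple maps directly onto a data structure. A minimal sketch, with illustrative field names and a tiny made-up grammar:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class CFG:
        nonterminals: frozenset    # V
        terminals: frozenset       # T, disjoint from V
        productions: tuple         # P: pairs (A, rhs) with A in V and rhs a sequence over T ∪ V
        start: str                 # S, a member of V

    g = CFG(
        nonterminals=frozenset({"S", "NP", "VP"}),
        terminals=frozenset({"the", "boy", "sleeps"}),
        productions=(("S", ("NP", "VP")), ("NP", ("the", "boy")), ("VP", ("sleeps",))),
        start="S",
    )
    assert g.nonterminals.isdisjoint(g.terminals)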

14 Context? The notion of context in CFGs has nothing to do with the ordinary meaning of the word context in language. All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself (free of context): A -> B C means that I can rewrite an A as a B followed by a C regardless of the context in which A is found

15 Key Constituents (English) Sentences Noun phrases Verb phrases Prepositional phrases

16 Sentence-Types Declaratives: I do not. S -> NP VP Imperatives: Go around again! S -> VP Yes-No Questions: Do you like my hat? S -> Aux NP VP WH Questions: What are they going to do? S -> WH Aux NP VP

18

19

20 NPs NP -> Pronoun – I came, you saw it, they conquered NP -> Proper-Noun – New Jersey is west of New York City – Lee Bollinger is the president of Columbia NP -> Det Noun – The president NP -> Det Nominal Nominal -> Noun Noun – A morning flight to Denver

21 PPs PP -> Preposition NP – Over the house – Under the house – To the tree – At play – At a party on a boat at night

22

23

24

26

27 Recursion We’ll have to deal with rules such as the following, where the non-terminal on the left also appears somewhere on the right (directly): NP -> NP PP [[The flight] [to Boston]] VP -> VP PP [[departed Miami] [at noon]]

28 Recursion Of course, this is what makes syntax interesting Flights from Denver Flights from Denver to Miami Flights from Denver to Miami in February Flights from Denver to Miami in February on a Friday Flights from Denver to Miami in February on a Friday under $300 Flights from Denver to Miami in February on a Friday under $300 with lunch

29 Recursion [[Flights] [from Denver]] [[[Flights] [from Denver]] [to Miami]] [[[[Flights] [from Denver]] [to Miami]] [in February]] [[[[[Flights] [from Denver]] [to Miami]] [in February]] [on a Friday]] Etc. NP -> NP PP

30 Implications of recursion and context-freeness If you have a rule like – VP -> V NP – it only cares that the thing after the verb is an NP; it doesn’t have to know about the internal affairs of that NP

31 The point VP -> V NP (I) hate flights from Denver flights from Denver to Miami flights from Denver to Miami in February flights from Denver to Miami in February on a Friday flights from Denver to Miami in February on a Friday under $300 flights from Denver to Miami in February on a Friday under $300 with lunch

32 Grammar Equivalence Can have different grammars that generate same set of strings (weak equivalence) – Grammar 1: NP → DetP N and DetP → a | the – Grammar 2: NP → a N | NP → the N Can have different grammars that have same set of derivation trees (strong equivalence) – With CFGs, possible only with useless rules – Grammar 2: NP → a N | NP → the N – Grammar 3: NP → a N | NP → the N, DetP → many Strong equivalence implies weak equivalence

33 Normal Forms &c There are weakly equivalent normal forms (Chomsky Normal Form, Greibach Normal Form) There are ways to eliminate useless productions and so on

34 Chomsky Normal Form A CFG is in Chomsky Normal Form (CNF) if all productions are of one of two forms: A → BC, with A, B, C nonterminals, or A → a, with A a nonterminal and a a terminal. Every CFG has a weakly equivalent CFG in CNF
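
A few lines make the definition operational; the (lhs, rhs-tuple) encoding of productions is just an assumption for this sketch:

    def is_cnf(productions, nonterminals):
        """True if every production is A -> B C (two nonterminals) or A -> a (one terminal)."""
        for lhs, rhs in productions:
            binary = len(rhs) == 2 and all(sym in nonterminals for sym in rhs)
            unary_terminal = len(rhs) == 1 and rhs[0] not in nonterminals
            if not (binary or unary_terminal):
                return False
        return True

    nonterms = {"S", "NP", "VP", "V"}
    print(is_cnf([("S", ("NP", "VP")), ("V", ("likes",))], nonterms))   # True
    print(is_cnf([("VP", ("V", "NP", "PP"))], nonterms))                # False: ternary rule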

35 “Generative Grammar” Formal languages: formal device to generate a set of strings (such as a CFG) Linguistics (Chomskyan linguistics in particular): approach in which a linguistic theory enumerates all possible strings/structures in a language (=competence) Chomskyan theories do not really use formal devices – they use CFG + informally defined transformations

36 Nobody Uses Simple CFGs (Except Intro NLP Courses) All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another All successful parsers currently use statistics about phrase structure and about dependency Derive dependency through “head percolation”: for each rule, say which daughter is head

37 Penn Treebank (PTB) Syntactically annotated corpus of newspaper texts (phrase structure) The newspaper texts are naturally occurring data, but the PTB is not! PTB annotation represents a particular linguistic theory (but a fairly “vanilla” one) Particularities – Very indirect representation of grammatical relations (need for head percolation tables) – Completely flat structure in NP (brown bag lunch, pink-and-yellow child seat) – Has flat Ss, flat VPs

Example from PTB
( (S (NP-SBJ It)
     (VP 's
         (NP-PRD (NP (NP the latest investment craze)
                     (VP sweeping (NP Wall Street)))
                 :
                 (NP (NP a rash)
                     (PP of (NP (NP new closed-end country funds) ,
                                (NP (NP those (ADJP publicly traded) portfolios)
                                    (SBAR (WHNP-37 that)
                                          (S (NP-SBJ *T*-37)
                                             (VP invest
                                                 (PP-CLR in (NP (NP stocks)
                                                                (PP of (NP a single foreign country)))))))))))
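
If NLTK is installed, bracketings in this style can be loaded and inspected directly. A small sketch with a short, well-formed tree (made up here, since the example above is cut off):

    import nltk

    tree = nltk.Tree.fromstring("(S (NP-SBJ It) (VP is (NP-PRD (NP the latest craze))))")
    tree.pretty_print()          # draw the tree in ASCII
    print(tree.productions())    # the CFG rules implicit in this single tree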

39 Types of syntactic constructions Is this the same construction? – An elf decided to clean the kitchen – An elf seemed to clean the kitchen An elf cleaned the kitchen Is this the same construction? – An elf decided to be in the kitchen – An elf seemed to be in the kitchen An elf was in the kitchen

40 Types of syntactic constructions (ctd) Is this the same construction? There is an elf in the kitchen – *There decided to be an elf in the kitchen – There seemed to be an elf in the kitchen Is this the same construction? It is raining/it rains – ??It decided to rain/be raining – It seemed to rain/be raining

41 Types of syntactic constructions (ctd) Conclusion: to seem: whatever the embedded surface subject is can appear in the upper clause; to decide: only full nouns that are referential can appear in the upper clause. Two types of verbs

Types of syntactic constructions: Analysis
[The next four slides are tree diagrams contrasting the two structures. For “an elf decided to be in the kitchen”, the subject “an elf” sits in the upper clause and the embedded clause (“to be in the kitchen”) has an empty PRO subject. For “an elf seemed to be in the kitchen”, “an elf” originates as the subject of the embedded clause and raises into the upper clause, leaving behind a coindexed trace t_i.]

46 Types of syntactic constructions: Analysis to seem: lower surface subject raises to upper clause; raising verb seems (there to be an elf in the kitchen) there seems (t to be an elf in the kitchen) it seems (there is an elf in the kitchen)

47 Types of syntactic constructions: Analysis (ctd) to decide: subject is in upper clause and co-refers with an empty subject in lower clause; control verb an elf decided (an elf to clean the kitchen) an elf decided (PRO to clean the kitchen) an elf decided (he cleans/should clean the kitchen) *it decided (an elf cleans/should clean the kitchen)

48 Lessons Learned from the Raising/Control Issue Use distribution of data to group phenomena into classes Use different underlying structure as basis for explanations Allow things to “move” around from underlying structure -> transformational grammar Check whether explanation you give makes predictions

The Big Picture
[Diagram: the empirical matter (data such as “Maud expects there to be a riot” / *“Teri promised there to be a riot”, “Maud expects the shit to hit the fan” / *“Teri promised the shit to hit the fan”) is what a descriptive linguistic theory is about and what an explanatory theory predicts. The linguistic theory’s content relates morphology to semantics via a surface representation (e.g. phrase structure), a deep representation (e.g. dependency), and the correspondence between them; the theory uses formalisms, which connect to data structures, algorithms, and distributional models.]

50 Syntactic Parsing

51 Syntactic Parsing Declarative formalisms like CFGs, FSAs define the legal strings of a language -- but only tell you ‘this is a legal string of the language X’ Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

52 Parsing as a Form of Search Searching FSAs – Finding the right path through the automaton – Search space defined by structure of FSA Searching CFGs – Finding the right parse tree among all possible parse trees – Search space defined by the grammar Constraints provided by the input sentence and the automaton or grammar

CFG for Fragment of English
S → NP VP            VP → V
S → Aux NP VP        PP → Prep NP
NP → Det Nom         N → old | dog | footsteps | young
NP → PropN           V → dog | include | prefer
Nom → Adj Nom        Aux → does
Nom → N Nom          Prep → from | to | on | of
Nom → N              PropN → Bush | McCain | Obama
Nom → Nom PP         Det → that | this | a | the
VP → V NP

Parse Tree for ‘The old dog the footsteps of the young’ for Prior CFG
[S [NP [Det The] [Nom [N old]]]
   [VP [V dog]
       [NP [Det the]
           [Nom [Nom [N footsteps]]
                [PP [Prep of] [NP [Det the] [Nom [N young]]]]]]]]
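
If NLTK is installed, the fragment can be transcribed into its grammar notation and this tree recovered automatically. A sketch using NLTK's generic ChartParser (not part of the original slides):

    import nltk

    grammar = nltk.CFG.fromstring("""
    S -> NP VP | Aux NP VP
    NP -> Det Nom | PropN
    Nom -> Adj Nom | N Nom | N | Nom PP
    VP -> V | V NP
    PP -> Prep NP
    Det -> 'that' | 'this' | 'a' | 'the'
    N -> 'old' | 'dog' | 'footsteps' | 'young'
    V -> 'dog' | 'include' | 'prefer'
    Aux -> 'does'
    Prep -> 'from' | 'to' | 'on' | 'of'
    PropN -> 'Bush' | 'McCain' | 'Obama'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("the old dog the footsteps of the young".split()):
        print(tree)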

55 Top-Down Parser Builds from the root S node to the leaves Expectation-based Common search strategy – Top-down, left-to-right, backtracking – Try first rule with LHS = S – Next expand all constituents in these trees/rules – Continue until leaves are POS – Backtrack when candidate POS does not match input string
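
To make the backtracking concrete, here is a minimal recursive-descent sketch. The toy grammar and one-tag-per-word lexicon are assumptions made for this example (they are not the fragment above), and, as the left-recursion slides later explain, this strategy would loop forever on a rule like NP -> NP PP, so the grammar here avoids left recursion:

    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["Det", "N"], ["Det", "N", "PP"]],
        "VP": [["V", "NP"], ["V"]],
        "PP": [["P", "NP"]],
    }
    LEXICON = {"the": "Det", "a": "Det", "old": "N", "dog": "V",
               "footsteps": "N", "young": "N", "of": "P"}

    def parse_cat(cat, words, i):
        """Top-down, left-to-right: yield every j such that words[i:j] can be a `cat`."""
        if cat in GRAMMAR:
            for rhs in GRAMMAR[cat]:            # trying the next rule = backtracking
                yield from parse_seq(rhs, words, i)
        elif i < len(words) and LEXICON.get(words[i]) == cat:
            yield i + 1                         # a preterminal matched one word

    def parse_seq(cats, words, i):
        if not cats:
            yield i
        else:
            for mid in parse_cat(cats[0], words, i):
                yield from parse_seq(cats[1:], words, mid)

    sent = "the old dog the footsteps of the young".split()
    print(any(j == len(sent) for j in parse_cat("S", sent, 0)))   # True: some parse spans the input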

56 Rule Expansion “The old dog the footsteps of the young.” Where does backtracking happen? What are the computational disadvantages? What are the advantages?

57 Bottom-Up Parsing Parser begins with words of input and builds up trees, applying grammar rules whose RHS matches:
The/Det old/N dog/V the/Det footsteps/N of/Prep the/Det young/N
The/Det old/Adj dog/N the/Det footsteps/N of/Prep the/Det young/N
Parse continues until an S root node reached or no further node expansion possible

58 What’s right/wrong with…. Top-Down parsers – they never explore illegal parses (e.g. which can’t form an S) -- but waste time on trees that can never match the input Bottom-Up parsers – they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root) For both: find a control strategy -- how to explore the search space efficiently? – Pursue all parses in parallel, or backtrack, or …? – Which rule to apply next? – Which node to expand next?

59 Some Solutions Dynamic Programming Approaches – Use a chart to represent partial results CKY Parsing Algorithm – Bottom-up – Grammar must be in Chomsky Normal Form – The parse tree might not be consistent with linguistic theory Earley Parsing Algorithm – Top-down – Expectations about constituents are confirmed by input – A POS tag for a word that is not predicted is never added Chart Parser
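
To make the chart idea concrete, here is a minimal CKY recognizer sketch; the tiny CNF grammar and the test sentence are made up for this example and are not the course grammar:

    from itertools import product

    binary_rules = {                 # (B, C) -> set of parents A, i.e. rules A -> B C
        ("NP", "VP"): {"S"},
        ("V", "NP"): {"VP"},
        ("Det", "N"): {"NP"},
    }
    lexical_rules = {                # rules A -> word
        "the": {"Det"}, "a": {"Det"},
        "dog": {"N", "V"}, "footsteps": {"N"}, "bites": {"V"},
    }

    def cky_recognize(words):
        n = len(words)
        # table[i][j] holds every nonterminal that can span words[i:j]
        table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            table[i][i + 1] = set(lexical_rules.get(w, ()))
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):                  # try every split point
                    for b, c in product(table[i][k], table[k][j]):
                        table[i][j] |= binary_rules.get((b, c), set())
        return "S" in table[0][n]

    print(cky_recognize("the dog bites a dog".split()))    # True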

60 Earley Parsing Allows arbitrary CFGs Fills a table in a single sweep over the input words – Table is length N+1; N is number of words – Table entries represent Completed constituents and their locations In-progress constituents Predicted constituents

61 States The table-entries are called states and are represented with dotted rules.
S -> · VP                 A VP is predicted
NP -> Det · Nominal       An NP is in progress
VP -> V NP ·              A VP has been found

62 States/Locations It would be nice to know where these things are in the input so…
S -> · VP [0,0]              A VP is predicted at the start of the sentence
NP -> Det · Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2
VP -> V NP · [0,3]           A VP has been found starting at 0 and ending at 3

63 Graphically

64 Earley As with most dynamic programming approaches, the answer is found by looking in the table in the right place. In this case, there should be an S state in the final column that spans from 0 to N and is complete. If that’s the case you’re done. – S -> α · [0,N]

65 Earley Algorithm March through chart left-to-right. At each step, apply 1 of 3 operators – Predictor Create new states representing top-down expectations – Scanner Match word predictions (rule with word after dot) to words – Completer When a state is complete, see what rules were looking for that completed constituent

66 Predictor Given a state – With a non-terminal to right of dot – That is not a part-of-speech category – Create a new state for each expansion of the non-terminal – Place these new states into same chart entry as generated state, beginning and ending where generating state ends. – So predictor looking at S -> · VP [0,0] – results in VP -> · Verb [0,0] VP -> · Verb NP [0,0]

67 Scanner Given a state – With a non-terminal to right of dot – That is a part-of-speech category – If the next word in the input matches this part-of-speech – Create a new state with dot moved over the non-terminal – So scanner looking at VP -> · Verb NP [0,0] – If the next word, “book”, can be a verb, add new state: VP -> Verb · NP [0,1] – Add this state to chart entry following current one – Note: Earley algorithm uses top-down input to disambiguate POS! Only POS predicted by some state can get added to chart!

68 Completer Applied to a state when its dot has reached the right end of the rule. Parser has discovered a category over some span of input. Find and advance all previous states that were looking for this category – copy state, move dot, insert in current chart entry Given: – NP -> Det Nominal · [1,3] – VP -> Verb · NP [0,1] Add – VP -> Verb NP · [0,3]
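
Pulling the three operators together, here is a compact Earley recognizer sketch. The toy grammar and lexicon are made up for the example, and the lexical step is folded into the scanner (advancing the dot directly over the matched POS) rather than adding separate POS states, so treat it as an illustration of the control structure rather than a line-by-line transcription of the textbook pseudocode:

    from collections import namedtuple

    # A state is a dotted rule plus its span: lhs -> rhs with a dot position, covering [start, end].
    State = namedtuple("State", "lhs rhs dot start end")

    GRAMMAR = {
        "S":       [("VP",), ("NP", "VP")],
        "VP":      [("Verb", "NP")],
        "NP":      [("Det", "Nominal")],
        "Nominal": [("Noun",)],
    }
    LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
    POS = {"Verb", "Noun", "Det"}

    def next_cat(st):
        return st.rhs[st.dot] if st.dot < len(st.rhs) else None

    def earley_recognize(words):
        chart = [[] for _ in range(len(words) + 1)]

        def add(st, i):
            if st not in chart[i]:
                chart[i].append(st)

        add(State("GAMMA", ("S",), 0, 0, 0), 0)            # dummy start state
        for i in range(len(words) + 1):
            for st in chart[i]:                            # chart[i] keeps growing as we iterate
                cat = next_cat(st)
                if cat is not None and cat not in POS:     # PREDICTOR: expand the nonterminal
                    for rhs in GRAMMAR.get(cat, []):
                        add(State(cat, rhs, 0, i, i), i)
                elif cat is not None:                      # SCANNER: match a predicted POS to the word
                    if i < len(words) and cat in LEXICON.get(words[i], set()):
                        add(State(st.lhs, st.rhs, st.dot + 1, st.start, i + 1), i + 1)
                else:                                      # COMPLETER: advance states waiting for st.lhs
                    for prev in chart[st.start]:
                        if next_cat(prev) == st.lhs:
                            add(State(prev.lhs, prev.rhs, prev.dot + 1, prev.start, i), i)
        n = len(words)
        return any(s.lhs == "S" and next_cat(s) is None and s.start == 0 for s in chart[n])

    print(earley_recognize("book that flight".split()))    # True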

69 Earley: how do we know we are done? Find an S state in the final column that spans from 0 to N and is complete. If that’s the case you’re done. – S -> α · [0,N]

70 Earley More specifically…
1. Predict all the states you can upfront
2. Read a word
   a. Extend states based on matches
   b. Add new predictions
   c. Go to step 2
3. Look at the final chart entry to see if you have a winner

71 Example Book that flight We should find… an S from 0 to 3 that is a completed state…

72 Example

73 Example

74 Example

75 Details What kind of algorithms did we just describe? – Not parsers – recognizers The presence of an S state with the right attributes in the right place indicates a successful recognition. But no parse tree… no parser. That’s how we solve (not) an exponential problem in polynomial time

76 Converting Earley from Recognizer to Parser With the addition of a few pointers we have a parser Augment the “Completer” to point to where we came from.

Augmenting the chart with structural information [Chart diagram showing states S8–S13, where each completed state carries backpointers to the earlier states it was built from.]

78 Retrieving Parse Trees from Chart All the possible parses for an input are in the table We just need to read off all the backpointers from every complete S in the last column of the table Find all the S -> α · [0,N] Follow the structural traces from the Completer Of course, this won’t be polynomial time, since there could be an exponential number of trees But we can at least represent ambiguity efficiently

79 Left Recursion vs. Right Recursion Depth-first search will never terminate if grammar is left recursive (e.g. NP --> NP PP)

Solutions: – Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive e.g. The man {on the hill with the telescope…}
NP → NP PP (wanted: Nom plus a sequence of PPs)
NP → Nom PP
NP → Nom
Nom → Det N
…becomes…
NP → Nom NP'
Nom → Det N
NP' → PP NP' (wanted: a sequence of PPs)
NP' → ε
Not so obvious what these rules mean…
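
The rewrite in this slide is mechanical enough to automate; a sketch of immediate left-recursion removal (the rule encoding is assumed for the example):

    def remove_immediate_left_recursion(lhs, rules):
        """Split A -> A alpha (left-recursive) from A -> beta, then rebuild as
        A -> beta A' and A' -> alpha A' | epsilon."""
        recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == lhs]
        others = [rhs for rhs in rules if not rhs or rhs[0] != lhs]
        if not recursive:
            return {lhs: rules}
        new = lhs + "'"
        return {
            lhs: [beta + [new] for beta in others],
            new: [alpha + [new] for alpha in recursive] + [[]],   # [] stands for epsilon
        }

    # NP -> NP PP | Nom   becomes   NP -> Nom NP',  NP' -> PP NP' | epsilon
    print(remove_immediate_left_recursion("NP", [["NP", "PP"], ["Nom"]]))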

81 – Harder to detect and eliminate non-immediate left recursion – NP --> Nom PP – Nom --> NP – Fix depth of search explicitly – Rule ordering: non-recursive rules first (NP --> Det Nom before NP --> NP PP)

82 Another Problem: Structural ambiguity Multiple legal structures – Attachment (e.g. I saw a man on a hill with a telescope) – Coordination (e.g. younger cats and dogs) – NP bracketing (e.g. Spanish language teachers)

83 NP vs. VP Attachment

84 Solution? – Return all possible parses and disambiguate using “other methods”

85 Summing Up Parsing is a search problem which may be implemented with many control strategies – Top-Down or Bottom-Up approaches each have problems Combining the two solves some but not all issues – Left recursion – Syntactic ambiguity Next time: Making use of statistical information about syntactic constituents – Read Ch 14