1 CPSC 503 Computational Linguistics, Lecture 9. Giuseppe Carenini (Winter 2009)

2 Knowledge-Formalisms Map
State Machines (and prob. versions): Finite State Automata, Finite State Transducers, Markov Models (Morphology)
Rule systems (and prob. versions), e.g. (Prob.) Context-Free Grammars (Syntax)
Logical formalisms (First-Order Logics) and AI planners (Semantics, Pragmatics, Discourse and Dialogue)

3 Today 7/10: Finish CFGs for the syntax of NL (problems); Parsing; The Earley Algorithm; Partial Parsing: Chunking; Dependency Grammars / Parsing

4 Problems with CFGs: Agreement; Subcategorization

5 Agreement In English:
–Determiners and nouns have to agree in number
–Subjects and verbs have to agree in person and number
Many languages have agreement systems that are far more complex than this (e.g., gender).

6 Agreement Grammatical: This dog / Those dogs / This dog eats / You have it / Those dogs eat. Ungrammatical: *This dogs / *Those dog / *This dog eat / *You has it / *Those dogs eats

7 Possible CFG Solution
OLD grammar: S -> NP VP; NP -> Det Nom; VP -> V NP; …
NEW grammar: SgS -> SgNP SgVP; PlS -> PlNP PlVP; SgNP -> SgDet SgNom; PlNP -> PlDet PlNom; PlVP -> PlV NP; SgVP3p -> SgV3p NP; … (Sg = singular, Pl = plural)

8 CFG Solution for Agreement It works and stays within the power of CFGs, but it doesn't scale all that well: the number of rules explodes.
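The rule blow-up is easy to demonstrate: if every category is split for each combination of agreement feature values, the grammar multiplies in size. A minimal sketch (the rule set, feature table, and `split_for_agreement` helper are hypothetical; a real grammar would tag only the positions that actually agree, but the multiplicative growth is the same):

```python
from itertools import product

def split_for_agreement(rules, features):
    """Make one copy of each rule per combination of feature values,
    prefixing every category with the value tag (e.g. Sg, Pl)."""
    split = []
    for lhs, rhs in rules:
        for combo in product(*features.values()):
            tag = "".join(combo)
            split.append((tag + lhs, [tag + sym for sym in rhs]))
    return split

rules = [("S", ["NP", "VP"]), ("NP", ["Det", "Nom"]), ("VP", ["V", "NP"])]
features = {"number": ["Sg", "Pl"]}  # adding person, gender, ... multiplies further

new_rules = split_for_agreement(rules, features)
print(len(rules), "->", len(new_rules))  # 3 -> 6: each binary feature doubles the grammar
```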

9 Subcategorization *John sneezed the book; *I prefer United has a flight; *Give with a flight. Definition: subcategorization expresses the constraints that a predicate (here, a verb) places on the number and type of its arguments (see the table on the next slide).

10 Subcategorization
Sneeze: John sneezed
Find: Please find [a flight to NY]_NP
Give: Give [me]_NP [a cheaper fare]_NP
Help: Can you help [me]_NP [with a flight]_PP
Prefer: I prefer [to leave earlier]_TO-VP
Told: I was told [United has a flight]_S
…
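One way to read this table is as a lexicon mapping each verb to its allowed argument patterns. A sketch of that idea (the frame names and the `licenses` helper are made up for illustration, not a standard API):

```python
# Hypothetical subcategorization lexicon: verb -> set of allowed argument frames.
SUBCAT = {
    "sneeze": {()},                # John sneezed (no arguments)
    "find":   {("NP",)},           # find [a flight to NY]_NP
    "give":   {("NP", "NP")},      # give [me]_NP [a cheaper fare]_NP
    "help":   {("NP", "PP")},      # help [me]_NP [with a flight]_PP
    "prefer": {("TO-VP",)},        # prefer [to leave earlier]_TO-VP
    "tell":   {("S",)},            # told [United has a flight]_S
}

def licenses(verb, args):
    """True iff the verb subcategorizes for this sequence of arguments."""
    return tuple(args) in SUBCAT.get(verb, set())

print(licenses("sneeze", []))       # True:  "John sneezed"
print(licenses("sneeze", ["NP"]))   # False: *"John sneezed the book"
```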

11 So? The various rules for VPs overgenerate: they allow strings containing verbs and arguments that don't go together. For example, VP -> V NP licenses *Sneezed the book, and VP -> V S licenses strings like *go she will go there.

12 Possible CFG Solution
OLD grammar: VP -> V; VP -> V NP; VP -> V NP PP; …
NEW grammar: VP -> IntransV; VP -> TransV NP; VP -> TransPP_to NP PP_to; … TransPP_to -> hand, give, …
This solution has the same problem as the one for agreement.

13 CFGs for NLP: summary CFGs cover most syntactic structure in English, but there are problems (overgeneration):
–These can be dealt with adequately, although not elegantly, by staying within the CFG framework.
There are simpler, more elegant solutions that take us out of the CFG framework: LFG, XTAG, … See Chapter 15, "Features and Unification".

14 Dependency Grammars
–Syntactic structure: binary relations between words
–Links: grammatical function or a very general semantic relation
–Abstract away from word-order variations (simpler grammars)
–Useful features in many NLP applications (classification, summarization, NLG)

15 Today 7/10: Finish CFGs for the syntax of NL (problems); Parsing; The Earley Algorithm; Partial Parsing: Chunking; Dependency Grammars / Parsing

16 Parsing with CFGs Assign valid trees: trees that cover all and only the elements of the input and have an S at the top. A parser takes a CFG and a sequence of words (e.g., "I prefer a morning flight") and outputs the valid parse trees.

17 Parsing as Search The CFG defines the search space of possible parse trees. Parsing: find all trees that cover all and only the words in the input. Example grammar: S -> NP VP; S -> Aux NP VP; NP -> Det Noun; VP -> Verb; Det -> a; Noun -> flight; Verb -> left, arrive; Aux -> do, does

18 Constraints on Search The parser searches the space defined by the CFG for trees that span the input word sequence. Search strategies: top-down (goal-directed) or bottom-up (data-directed).

19 Top-Down Parsing Since we're trying to find trees rooted with an S (Sentence), start with the rules that give us an S. Then work your way down from there to the words.

20 Next Step: Top-Down Space When POS categories are reached, reject trees whose leaves fail to match all the words in the input.
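The top-down, depth-first, left-to-right strategy with this word-matching rejection can be sketched as a tiny recognizer over the example grammar from the "Parsing as Search" slide (a sketch, recognition only; note that a left-recursive grammar would make this loop forever):

```python
# Example grammar from the "Parsing as Search" slide, as a dict.
GRAMMAR = {
    "S":    [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP":   [["Det", "Noun"]],
    "VP":   [["Verb"]],
    "Det":  [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"], ["arrive"]],
    "Aux":  [["do"], ["does"]],
}

def parse(symbols, words):
    """Depth-first, left-to-right, top-down recognition: expand the
    leftmost non-terminal first (rules in textual order); when a
    terminal is reached, it must match the next input word."""
    if not symbols:
        return not words          # success only if all words are consumed
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:          # non-terminal: try each expansion
        return any(parse(rhs + rest, words) for rhs in GRAMMAR[first])
    # terminal: must match the next word
    return bool(words) and words[0] == first and parse(rest, words[1:])

print(parse(["S"], "a flight left".split()))   # True
print(parse(["S"], "a flight a".split()))      # False
```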

21 Bottom-Up Parsing Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way, then work your way up from there.

22 Two More Steps: Bottom-Up Space

23 Top-Down vs. Bottom-Up
Top-down:
–Only searches for trees that can be answers
–But suggests trees that are not consistent with the words
Bottom-up:
–Only forms trees consistent with the words
–But suggests trees that make no sense globally

24 So, Combine Them Use a top-down control strategy to generate trees, with bottom-up filtering to rule out inappropriate parses. Top-down control strategy choices: depth-first vs. breadth-first search; which node to try to expand next (left-most); which grammar rule to use to expand a node (textual order).

25 Top-Down, Depth-First, Left-to-Right Search Sample sentence: "Does this flight include a meal?"

26 Example "Does this flight include a meal?"

27 Example "Does this flight include a meal?"

28 Example "Does this flight include a meal?"

29 Adding Bottom-Up Filtering The preceding sequence was a waste of time, because an NP cannot generate a parse tree starting with an Aux.

30 Bottom-Up Filtering
Category | Left Corners
S | Det, Proper-Noun, Aux, Verb
NP | Det, Proper-Noun
Nominal | Noun
VP | Verb
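A table like this need not be written by hand: the left corners are the transitive closure of "first symbol of some expansion", computable from the grammar itself. A sketch over an assumed toy grammar (the slide's table keeps only the part-of-speech corners, while this closure also includes intermediate phrasal categories):

```python
def left_corners(grammar):
    """For each non-terminal, collect every category that can begin one
    of its derivations (fixed point of 'first symbol of some RHS')."""
    corners = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for nt, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                new = {first} | corners.get(first, set())
                if not new <= corners[nt]:
                    corners[nt] |= new
                    changed = True
    return corners

GRAMMAR = {   # assumed toy grammar matching the table above
    "S":       [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP":      [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"]],
    "VP":      [["Verb"], ["Verb", "NP"]],
}

for nt, cs in left_corners(GRAMMAR).items():
    print(nt, "->", sorted(cs))
```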

31 Problems with TD + BU filtering: ambiguity and repeated parsing. Solution: the Earley algorithm (once again, dynamic programming!)

32 (1) Structural Ambiguity "I shot an elephant in my pajamas"
# of PPs | # of NP parses
… | …
6 | 469
7 | 1430
… | …
Three basic kinds: Attachment / Coordination / NP-bracketing
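The blow-up in the table is no accident: the number of distinct binary attachment structures grows with the Catalan numbers (exactly how the PP count maps onto the Catalan index depends on the analysis, so treat the correspondence as an assumption). A quick sketch:

```python
from math import comb

def catalan(n):
    """n-th Catalan number: the number of distinct binary
    bracketings (attachment structures) over n+1 leaves."""
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(1, 9)])
# [1, 2, 5, 14, 42, 132, 429, 1430]: ambiguity grows exponentially
```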

33 (2) Repeated Work Parsing is hard, and slow. It's wasteful to redo the same work over and over. Consider an attempt to top-down parse the following as an NP: "A flight from Indi to Houston on TWA"

34 NP -> Det Nom; NP -> NP PP; Nom -> Noun; … The attempt fails and backtracks, then parsing starts over from…

35 NP -> Det Nom; NP -> NP PP; Nom -> Noun; … fails and backtracks; parsing restarts from…

36 …fails and backtracks; parsing restarts from…

37 …restarts from… Success!

38 But… the successful parse rebuilt sub-trees (1-4) that had already been built during the failed attempts.

39 Dynamic Programming Fill tables with the solutions to subproblems. For parsing: sub-trees consistent with the input, once discovered, are stored and can be reused. This 1. stores ambiguous parses compactly and 2. does not do (avoidable) repeated work.

40 Earley Parsing, O(N^3) Fills a table in a single sweep over the input words.
–The table is of length N+1, where N is the number of words
–Table entries represent: predicted constituents, in-progress constituents, and completed constituents with their locations

41 States The table entries are called states and express: what is predicted from that point; what is in progress at that point; what has been recognized up to that point. Representation: dotted rules + locations.
S -> · VP [0,0] (a VP is predicted at the start of the sentence)
NP -> Det · Nominal [1,2] (an NP is in progress; the Det goes from 1 to 2)
VP -> V NP · [0,3] (a VP has been found starting at 0 and ending at 3)

42 Graphically S -> · VP [0,0]; NP -> Det · Nominal [1,2]; VP -> V NP · [0,3]

43 Earley: Answer The answer is found by looking in the table in the right place. The following state should be in the final column: S -> α · [0,N], i.e., an S state that spans from 0 to N and is complete.

44 Earley Parsing Procedure Sweep through the table from 0 to N in order, applying one of three operators to each state:
–Predictor: add top-down predictions to the chart
–Scanner: read the input and add the corresponding state to the chart
–Completer: move the dot to the right when a new constituent is found
Results (new states) are added to the current or next set of states in the chart. No backtracking, and no states are removed.

45 Predictor Intuition: new states represent top-down expectations. Applied when a non-part-of-speech non-terminal is to the right of a dot: S -> · VP [0,0]. Adds new states to the end of the current chart, one for each expansion of the non-terminal in the grammar: VP -> · V [0,0]; VP -> · V NP [0,0]

46 Scanner New states for predicted parts of speech. Applied when a part-of-speech category is to the right of the dot: VP -> · Verb NP [0,0] (input: 0 "Book…" 1). Looks at the current word in the input; if it matches, adds new state(s) to the next chart: Verb -> book · [0,1]

47 Completer Intuition: we've found a constituent, so tell everyone who was waiting for it. Applied when the dot has reached the right end of a rule: NP -> Det Nom · [1,3]. Find all states with a dot at 1 that are expecting an NP, e.g. VP -> V · NP [0,1], and add the new (advanced) state(s) to the current chart: VP -> V NP · [0,3]
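Putting the three operators together gives a compact recognizer. The sketch below assumes a toy grammar and lexicon for "book that flight" (not the textbook's exact grammar), and for brevity folds the scanner's part-of-speech step directly into the dot advance instead of creating separate Verb -> book · states:

```python
from collections import namedtuple

# A state is a dotted rule plus its span: lhs -> rhs with a dot, over [start, end].
State = namedtuple("State", "lhs rhs dot start end")

GRAMMAR = {   # assumed toy grammar
    "S":       [["VP"]],
    "VP":      [["Verb"], ["Verb", "NP"]],
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
}
LEXICON = {"book": "Verb", "that": "Det", "flight": "Noun"}

def earley(words):
    chart = [[] for _ in range(len(words) + 1)]
    def add(state, i):
        if state not in chart[i]:
            chart[i].append(state)
    add(State("GAMMA", ("S",), 0, 0, 0), 0)     # dummy start state
    for i in range(len(words) + 1):
        for state in chart[i]:                  # chart[i] grows as we iterate
            if state.dot < len(state.rhs):
                nxt = state.rhs[state.dot]
                if nxt in GRAMMAR:              # PREDICTOR
                    for rhs in GRAMMAR[nxt]:
                        add(State(nxt, tuple(rhs), 0, i, i), i)
                elif i < len(words) and LEXICON.get(words[i]) == nxt:
                    # SCANNER: the POS matches the next word; advance the dot
                    add(state._replace(dot=state.dot + 1, end=i + 1), i + 1)
            else:                               # COMPLETER
                for waiting in chart[state.start]:
                    if (waiting.dot < len(waiting.rhs)
                            and waiting.rhs[waiting.dot] == state.lhs):
                        add(waiting._replace(dot=waiting.dot + 1, end=i), i)
    return chart

chart = earley("book that flight".split())
done = any(s.lhs == "GAMMA" and s.dot == 1 and s.end == 3 for s in chart[-1])
print(done)  # True: an S spanning the whole input was recognized
```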

48 Example: "Book that flight" We should find an S from 0 to 3 that is a completed state.

49 Example "Book that flight"

50 So far, only a recognizer… To generate all parses: when old states waiting for the just-completed constituent are updated, add a pointer from each "updated" state to the "completed" state. Then simply read off all the backpointers from every complete S in the last column of the table.
Chart[0]:
S5 S -> · VP [0,0] []
S6 VP -> · Verb [0,0] []
S7 VP -> · Verb NP [0,0] []
…
Chart[1]:
S8 Verb -> book · [0,1] []
S9 VP -> Verb · [0,1] [S8]
S10 S -> VP · [0,1] [S9]
S11 VP -> Verb · NP [0,1] [??]
…

51 Error Handling What happens when we look at the contents of the last table column and don't find an S -> α · state?
–Is it a total loss? No: the chart contains every constituent, and combination of constituents, possible for the input given the grammar.
This is also useful for partial (shallow) parsing, as used in information extraction.

52 Dynamic Programming Approaches
–Earley: top-down, no filtering, no restriction on grammar form
–CKY: bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF), i.e., ε-free with each production of the form A -> B C or A -> a
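For contrast, a CKY recognizer is only a few lines once the grammar is in CNF. The sketch assumes a toy CNF grammar for "book that flight" in which unary chains (e.g. S -> VP) have already been collapsed into the binary rules:

```python
from itertools import product

# Assumed toy CNF grammar: preterminal rules A -> word, binary rules A -> B C.
UNARY  = {"book": {"Verb"}, "that": {"Det"}, "flight": {"Noun", "Nominal"}}
BINARY = {("Verb", "NP"): {"VP", "S"},      # S -> VP collapsed into Verb NP
          ("Det", "Nominal"): {"NP"}}

def cky(words):
    """table[i][j] holds every category that derives words[i:j]."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                 # length-1 spans from the lexicon
        table[i][i + 1] = set(UNARY.get(w, ()))
    for span in range(2, n + 1):                  # longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # every split point
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY.get((b, c), set())
    return table

table = cky("book that flight".split())
print("S" in table[0][3])  # True: the whole input is a sentence
```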

53 For Next Time Read Chapter 13 (Parsing): 13.4.2, 13.5. Optional: read Chapter 16 (Features and Unification), skipping the algorithms and implementation.

