Parsing SLP Chapter 13
7/2/2015 Speech and Language Processing - Jurafsky and Martin 2 Outline Parsing with CFGs Bottom-up, top-down CKY parsing Mention of Earley and chart parsing
7/2/2015 Speech and Language Processing - Jurafsky and Martin 3 Parsing Parsing with CFGs refers to the task of assigning trees to input strings Trees that covers all and only the elements of the input and has an S at the top This chapter: find all possible trees Next chapter (14): choose the most probable one
7/2/2015 Speech and Language Processing - Jurafsky and Martin 4 Parsing parsing involves a search
7/2/2015 Speech and Language Processing - Jurafsky and Martin 5 Top-Down Search We’re trying to find trees rooted with an S; start with the rules that give us an S. Then we can work our way down from there to the words.
7/2/2015 Speech and Language Processing - Jurafsky and Martin 6 Top Down Space
7/2/2015 Speech and Language Processing - Jurafsky and Martin 7 Bottom-Up Parsing We also want trees that cover the input words. Start with trees that link up with the words Then work your way up from there to larger and larger trees.
Bottom-Up Space 8
7/2/2015 Speech and Language Processing - Jurafsky and Martin 9 Top-Down and Bottom-Up Top-down Only searches for trees that can be S’s But also suggests trees that are not consistent with any of the words Bottom-up Only forms trees consistent with the words But suggests trees that make no sense globally
7/2/2015 Speech and Language Processing - Jurafsky and Martin 10 Control Which node to try to expand next Which grammar rule to use to expand a node One approach: exhaustive search of the space of possibilities Not feasible Time is exponential in the number of non- terminals LOTS of repeated work, as the same constituent is created over and over (shared sub-problems)
7/2/2015 Speech and Language Processing - Jurafsky and Martin 11 Dynamic Programming DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time (well, no not really – we’ll return to this point) Efficiently store ambiguous structures with shared sub-parts. We’ll cover two approaches that roughly correspond to bottom-up and top-down approaches. CKY Earley – we will mention this, not cover it in detail
7/2/2015 Speech and Language Processing - Jurafsky and Martin 12 CKY Parsing Consider the rule A BC If there is an A somewhere in the input then there must be a B followed by a C in the input. If the A spans from i to j in the input then there must be some k st. i<k<j Ie. The B splits from the C someplace.
7/2/2015 Speech and Language Processing - Jurafsky and Martin 13 Convert Grammar to CNF What if your grammar isn’t binary? As in the case of the TreeBank grammar? Convert it to binary… any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically. The resulting grammar accepts (and rejects) the same set of strings as the original grammar. But the resulting derivations (trees) are different. We saw this in the last set of lecture notes
7/2/2015 Speech and Language Processing - Jurafsky and Martin 14 Convert Grammar to CNF More specifically, we want our rules to be of the form A B C Or A w That is, rules can expand to either 2 non- terminals or to a single terminal.
7/2/2015 Speech and Language Processing - Jurafsky and Martin 15 Binarization Intuition Introduce new intermediate non-terminals into the grammar that distribute rules with length > 2 over several rules. So… S A B C turns into S X C and X A B Where X is a symbol that doesn’t occur anywhere else in the the grammar.
Converting grammar to CNF 1.Copy all conforming rules to the new grammar unchanged 2.Convert terminals within rules to dummy non-terminals 3.Convert unit productions 4.Make all rules with NTs on the right binary In lecture: what these mean; apply to example on next two slides 7/2/
7/2/2015 Speech and Language Processing - Jurafsky and Martin 17 Sample L1 Grammar
7/2/2015 Speech and Language Processing - Jurafsky and Martin 18 CNF Conversion
7/2/2015 Speech and Language Processing - Jurafsky and Martin 19 CKY Build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table. E.g., a non-terminal spanning an entire string will sit in cell [0, n] Hopefully an S
7/2/2015 Speech and Language Processing - Jurafsky and Martin 20 CKY If there is an A spanning i,j in the input A B C is a rule in the grammar Then There must be a B in [i,k] and a C in [k,j] for some i<k<j
7/2/2015 Speech and Language Processing - Jurafsky and Martin 21 CKY The loops to fill the table a column at a time, from left to right, bottom to top. When we’re filling a cell, the parts needed to fill it are already in the table to the left and below
7/2/2015 Speech and Language Processing - Jurafsky and Martin 22 CKY Algorithm
7/2/2015 Speech and Language Processing - Jurafsky and Martin 23 Example Go through full example in lecture
7/2/2015 Speech and Language Processing - Jurafsky and Martin 24 CKY Parsing Is that really a parser? So, far it is only a recognizer Success? an S in cell [0,N] To turn it into a parser … see Lecture
7/2/2015 Speech and Language Processing - Jurafsky and Martin 25 CKY Notes Since it’s bottom up, CKY populates the table with a lot of worthless constituents. To avoid this we can switch to a top-down control strategy Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis.
Dynamic Programming Parsing Methods CKY (Cocke-Kasami-Younger) algorithm based on bottom-up parsing and requires first normalizing the grammar. Earley parser is based on top-down parsing and does not require normalizing grammar but is more complex. More generally, chart parsers retain completed phrases in a chart and can combine top-down and bottom-up search. 26
Conclusions Syntax parse trees specify the syntactic structure of a sentence that helps determine its meaning. John ate the spaghetti with meatballs with chopsticks. How did John eat the spaghetti? What did John eat? CFGs can be used to define the grammar of a natural language. Dynamic programming algorithms allow computing a single parse tree in cubic time or all parse trees in exponential time. 27