LIN 6932: Topics in Computational Linguistics
Hana Filip
Parsing with context-free grammars
Grammar Equivalence and Chomsky Normal Form
–Weak equivalence
–Strong equivalence
Grammar Equivalence and Chomsky Normal Form (CNF)
Many proofs in the field of languages and computability make use of the Chomsky Normal Form.
There are algorithms that decide whether a given string can be generated by a given grammar and that use the Chomsky Normal Form, e.g., the CYK (Cocke-Younger-Kasami) algorithm.
Grammar Equivalence and Chomsky Normal Form (CNF)
Chomsky Normal Form (CNF) is one of the most basic normal forms (roughly: in the context of computing and rewriting systems, a form that cannot be further reduced to a simpler form).
In CNF each production (rewriting rule) has the form A → B C or A → α, where
–A, B and C are nonterminal symbols
–α is a terminal symbol (i.e., a symbol that represents a constant value)
–productions (rewriting rules) are expansive: throughout the derivation of a string, each string of terminals and nonterminals is always either the same length or one element longer than the previous such string
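To make the definition concrete, here is a minimal sketch in Python (the grammar encoding, a dict from each nonterminal to a list of right-hand-side tuples, is an assumption for illustration, not part of the slides):

```python
# Toy CNF checker. Assumed grammar encoding:
# {"S": [("NP", "VP")], "Nom": [("flight",)], ...}
# The dict's keys are the nonterminals; anything else is a terminal.

def is_cnf(grammar):
    nonterminals = set(grammar)
    for lhs, expansions in grammar.items():
        for rhs in expansions:
            if len(rhs) == 2 and all(s in nonterminals for s in rhs):
                continue  # A -> B C with B, C nonterminal
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                continue  # A -> a with a a terminal
            return False
    return True

toy = {"S": [("NP", "VP")],
       "NP": [("Det", "Nom")],
       "Nom": [("flight",)],
       "Det": [("that",)]}
print(is_cnf(toy))  # True
```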
Grammar Equivalence and Chomsky Normal Form (CNF)
For grammars in Chomsky Normal Form the parse tree is always a binary tree. We can talk about the relationship between:
–the depth of the parse tree, and
–the length of its yield.
Grammar Equivalence and Chomsky Normal Form (CNF)
If a parse tree for a word string w is generated by a grammar in CNF and the parse tree
–has a path length of at most i,
–then the length of w is at most 2^(i-1).
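A sketch of why the bound holds, added here as a gloss on the slide's claim (a standard induction on the maximum path length):

```latex
% Induction on the maximum path length i of a CNF parse tree.
\textbf{Base case} ($i = 1$): the tree is a single production $A \to a$,
so $|w| = 1 = 2^{0}$.
\textbf{Inductive step}: the root expands by $A \to B\,C$; each subtree has
path length at most $i - 1$ and hence, by the induction hypothesis, yield
at most $2^{i-2}$, so $|w| \le 2^{i-2} + 2^{i-2} = 2^{i-1}$.
```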
Grammar Equivalence and Chomsky Normal Form (CNF)
Grammar Equivalence and Chomsky Normal Form (CNF)
Every grammar in Chomsky Normal Form is context-free, and conversely, every context-free grammar can be efficiently transformed into a weakly equivalent one in Chomsky Normal Form.
Grammar Equivalence and Chomsky Normal Form (CNF)
CFG for a Fragment of English: G0
Parse Tree for ‘Book that flight’ using G0
FSA and Syntactic Parsing with CFGs
(see previous lecture: the types of formal grammar in the Chomsky Hierarchy, the classes of languages they generate, and the types of finite-state automata that recognize each class)
CFG rule: NP → (Det) Adj* N
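Since the right-hand side (Det) Adj* N is a regular pattern, this one rule could equally be recognized by an FSA. A minimal sketch (assuming the words have already been mapped to the POS tags Det, Adj, N by some tagger):

```python
import re

# The rule (Det)? Adj* N, expressed as a regular expression
# over a space-joined sequence of POS tags.
def matches_np(tags):
    return re.fullmatch(r"(Det )?(Adj )*N", " ".join(tags)) is not None

print(matches_np(["Det", "Adj", "Adj", "N"]))  # True, e.g. "the old old book"
print(matches_np(["Det", "Det", "N"]))         # False
```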
Parsing as a Search Problem
Parsing (in linguistics: syntax analysis) is the process of analyzing a sequence of tokens to determine its grammatical structure with respect to a given formal grammar.
Parsing as a Search Problem
Searching FSAs
–Finding the right path through the automaton
–Search space defined by the structure of the FSA
Searching CFGs
–Finding the right parse tree among all possible parse trees
–Search space defined by the grammar
Constraints provided by
–the input sentence and
–the automaton or grammar
Two Search Strategies
How can we use G0 to assign the correct parse tree(s) to a given string of words?
The constraints provided by the input sentence and the automaton or grammar give rise to two search strategies:
Top-Down (Hypothesis-Directed) Search
–Search for a tree starting from S until the input words are covered
Bottom-Up (Data-Directed) Search
–Start with the words and build upwards toward S
Two Search Strategies
Search strategies and epistemology (the study of knowledge and justified belief; philosophy of science)
Top-Down (Hypothesis-Directed) Search
–Search for a tree starting from S until all input words are covered
–Rationalist tradition: emphasizes the use of prior knowledge
Bottom-Up (Data-Directed) Search
–Start with the words and build upwards toward S
–Empiricist tradition: emphasizes the data
The rationalist vs. empiricist controversy concerns the extent to which we are dependent upon sense experience in our effort to gain knowledge.
Top-Down Parser
Builds from the root S node down to the leaves
Assuming we build all trees in parallel:
–Find all trees with root S
–Next expand all constituents in these trees/rules
–Continue until the leaves are part-of-speech (POS) categories
–Candidate trees failing to match the POS of the input string are rejected
Top-Down: Rationalist Tradition
–Expectation- or theory-driven
–Goal: build a tree for the input starting with S
Top-Down Search Space for G0
Bottom-Up Parsing
The earliest known parsing algorithm (suggested by Yngve 1955)
The parser begins with the words of the input and builds up trees, applying G0 rules whose right-hand sides match
Book that flight
(figure: two candidate POS rows over ‘Book that flight’: N Det N and V Det N)
–‘Book’ is ambiguous
–The parse continues until an S root node is reached or no further node expansion is possible
Bottom-Up: Empiricist Tradition
–Data-driven
–Primary consideration: the lowest sub-trees of the final tree must hook up with the words in the input.
Expanding Bottom-Up Search Space for ‘Book that flight’
Comparing Top-Down and Bottom-Up
Top-Down parsers: never explore illegal parses (e.g. parses that can’t form an S) -- but waste time on trees that can never match the input
Bottom-Up parsers: never explore trees inconsistent with the input -- but waste time exploring illegal parses (no S root)
For both: how to explore the search space?
–Pursuing all parses in parallel or …?
–Which node to expand next?
–Which rule to apply next?
A Possible Top-Down Parsing Strategy
Depth-first search:
–start at the root (selecting some node as the root in the graph case) and expand as far as possible, until
–you reach a state (tree) inconsistent with the input; then backtrack to the most recent unexplored state (tree)
Which node to expand?
–Leftmost
Which grammar rule to use?
–Order in the grammar
Basic Algorithm for Top-Down, Depth-First, Left-to-Right Strategy
Initialize the agenda with the ‘S’ tree, point to the first word, and make this the current search state (cur)
Loop until successful parse or empty agenda
–Apply the next applicable grammar rule to the leftmost unexpanded node (n) of the current tree (t) on the agenda and push the resulting tree (t’) onto the agenda
If n is a POS category and matches the POS of cur, push the new tree (t’’) onto the agenda
Else pop t’ from the agenda
–The final agenda contains the history of a successful parse
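A minimal sketch of this strategy in Python (the grammar and lexicon encodings are assumptions for illustration; a real implementation would also need the depth bound discussed later, since left-recursive rules make plain depth-first search loop):

```python
# Toy top-down, depth-first, left-to-right recognizer.
# The grammar maps each nonterminal to its expansions, tried in order;
# the lexicon maps each POS category to the words it covers.
GRAMMAR = {"S": [["NP", "VP"], ["VP"]],
           "NP": [["Det", "Nom"]],
           "Nom": [["N"]],
           "VP": [["V", "NP"], ["V"]]}
LEXICON = {"Det": {"that", "the"}, "N": {"flight"}, "V": {"book"}}

def parse(symbols, words):
    """Can this symbol sequence expand to exactly these words?"""
    if not symbols:
        return not words                  # success only if input is used up
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                  # expand the leftmost nonterminal
        return any(parse(expansion + rest, words)
                   for expansion in GRAMMAR[first])
    if not words:                         # POS category but no word left
        return False
    return words[0] in LEXICON.get(first, ()) and parse(rest, words[1:])

print(parse(["S"], ["book", "that", "flight"]))  # True
```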
Example: Does this flight include a meal?
Example continued …
Augmenting Top-Down Parsing with Bottom-Up Filtering
We saw: top-down, depth-first, left-to-right parsing
–Expands non-terminals along the tree’s left edge down to the leftmost leaf of the tree
–Moves on to expand down to the next leftmost leaf…
In a successful parse, the current input word will be the first word in the derivation of the unexpanded node that the parser is currently processing
So … look ahead to the left corner of the tree
–B is a left-corner of A if A ⇒* B α
–Build a table with the left corners of all non-terminals in the grammar and consult it before applying a rule
Left Corners
Pre-compute all POS that can serve as the leftmost POS in the derivations of each non-terminal category
Left-Corner Table for G0 (previous example):
Category: Left Corners
S: NP, Det, PropN, Aux, V
NP: Det, PropN
Nom: N
VP: V
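Such a table could be pre-computed along the following lines (a Python sketch using the same assumed grammar encoding as above; the fixed point also collects phrasal left corners such as NP for S, matching the table):

```python
# X is a left corner of A if some derivation A =>* X ... starts with X.
# Start from the first symbol of each rule, then close transitively.
GRAMMAR = {"S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
           "NP": [["Det", "Nom"], ["PropN"]],
           "Nom": [["N"], ["N", "Nom"]],
           "VP": [["V"], ["V", "NP"]]}

def left_corners(grammar):
    corners = {a: {rhs[0] for rhs in expansions}
               for a, expansions in grammar.items()}
    changed = True
    while changed:                 # close under A =>* B ..., B =>* C ...
        changed = False
        for a in corners:
            for b in list(corners[a]):
                extra = corners.get(b, set()) - corners[a]
                if extra:
                    corners[a] |= extra
                    changed = True
    return corners

for cat, lc in sorted(left_corners(GRAMMAR).items()):
    print(cat, "->", sorted(lc))
```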
Summing Up Parsing Strategies
Parsing is a search problem which may be implemented with many search strategies
Top-Down vs. Bottom-Up Parsers
–Both generate too many useless trees
–Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up look-ahead
Left-corner table provides more efficient look-ahead
–Pre-compute all POS that can serve as the leftmost POS in the derivations of each non-terminal category
Three Critical Problems in Parsing
–Left Recursion
–Ambiguity
–Repeated Parsing of Sub-trees
Left Recursion
A long-standing issue regarding algorithms that manipulate context-free grammars (CFGs) in a "top-down", left-to-right fashion is that left recursion can lead to nontermination, i.e., an infinite loop.
Direct left recursion happens when you have a rule that calls itself before anything else. Examples: NP → NP PP, NP → NP and NP, VP → VP PP, S → S and S
Indirect left recursion, example:
NP → Det Nominal
Det → NP ’s
Left Recursion
Indirect left recursion, example:
NP → Det Nominal
Det → NP ’s
(Here NP ⇒ Det Nominal ⇒ NP ’s Nominal: the NP reappears as the leftmost symbol of its own derivation.)
Solutions to Left Recursion
–Don't use recursive rules
–Rule ordering
–Limit depth of recursion in parsing to some analytically or empirically set limit
–Don't use top-down parsing
Solution: Grammar Rewriting
Rewrite a left-recursive grammar to a weakly equivalent one which is not left-recursive.
How?
–By hand (ick) or …
–Automatically
Solution: Grammar Rewriting
I saw the man on the hill with a telescope.
(figure: the sentence segmented as N V NP PP PP)
NP: noun phrase
PP: prepositional phrase
Phrase: characterized by its head (N, V, P)
Ambiguous: 5 possible parses
Solution: Grammar Rewriting
I saw the man on the hill with a telescope.
(1) (parse-tree figure: S → NP VP; VP → V NP; both PPs attached inside the object NP)
Solution: Grammar Rewriting
I saw the man on the hill with a telescope.
(2) (parse-tree figure: S → NP VP; one PP attached to the VP, the other inside the object NP)
Solution: Grammar Rewriting
I saw the man on the hill with a telescope.
(3) (parse-tree figure: S → NP VP; both PPs attached to the VP)
Solution: Grammar Rewriting
I saw the man on the hill with the telescope…
NP → NP PP (recursive)
NP → N PP (nonrecursive)
NP → N
…becomes…
NP → N NP’
NP’ → PP NP’
NP’ → ε
Not so obvious what these rules mean…
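The direct case of this rewriting is mechanical. A Python sketch (same assumed grammar encoding as above; the empty list plays the role of ε): for each left-recursive nonterminal A with rules A → A α | β, it introduces a fresh A’ with A → β A’ and A’ → α A’ | ε.

```python
# Eliminate direct left recursion:  A -> A a | b   becomes
# A -> b A'  and  A' -> a A' | []  (the empty expansion, i.e. epsilon).
def remove_direct_left_recursion(grammar):
    result = {}
    for a, expansions in grammar.items():
        recursive = [rhs[1:] for rhs in expansions if rhs[:1] == [a]]
        other = [rhs for rhs in expansions if rhs[:1] != [a]]
        if not recursive:
            result[a] = expansions
            continue
        a_prime = a + "'"
        result[a] = [rhs + [a_prime] for rhs in other]
        result[a_prime] = [alpha + [a_prime] for alpha in recursive] + [[]]
    return result

np_grammar = {"NP": [["NP", "PP"], ["N", "PP"], ["N"]]}
print(remove_direct_left_recursion(np_grammar))
# {'NP': [['N', 'PP', "NP'"], ['N', "NP'"]], "NP'": [['PP', "NP'"], []]}
```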
Rule Ordering
Bad:
–NP → NP PP
–NP → Det N
Rule ordering: non-recursive rules first
–First: NP → Det N
–Then: NP → NP PP
Depth Bound
–Set an arbitrary bound
–Set an analytically derived bound
–Run tests and derive a reasonable bound empirically
Ambiguity
Lexical Ambiguity
–Leads to hypotheses that are locally reasonable but eventually lead nowhere
–“Book that flight”
Structural Ambiguity
–Leads to multiple parses for the same input
Lexical Ambiguity: Word Sense Disambiguation (WSD) as Text Categorization
Each sense of an ambiguous word is treated as a category.
–“play” (verb): play-game, play-instrument, play-role
–“pen” (noun): writing-instrument, enclosure
Treat the current sentence (or the preceding and current sentence) as a document to be classified.
–“play”:
play-game: “John played soccer in the stadium on Friday.”
play-instrument: “John played guitar in the band on Friday.”
play-role: “John played Hamlet in the theater on Friday.”
–“pen”:
writing-instrument: “John wrote the letter with a pen in New York.”
enclosure: “John put the dog in the pen in New York.”
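A minimal sketch of this idea with off-the-shelf tools (Python with scikit-learn; the handful of training sentences are toy stand-ins for a real sense-tagged corpus):

```python
# Treat each sense of "pen" as a class and each sentence as a document;
# a bag-of-words Naive Bayes classifier then does the disambiguation.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_sentences = [
    "John wrote the letter with a pen in New York",
    "she signed the contract with a fountain pen",
    "John put the dog in the pen in New York",
    "the pigs escaped from the pen on the farm",
]
train_senses = ["writing-instrument", "writing-instrument",
                "enclosure", "enclosure"]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(train_sentences, train_senses)
print(classifier.predict(["the sheep stayed in the pen"]))
```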
Structural Ambiguity
Multiple legal structures
–Attachment (e.g. I saw a man on a hill with a telescope)
–Coordination (e.g. younger cats and dogs)
–NP bracketing (e.g. Spanish language teachers)
Two Parse Trees for an Ambiguous Sentence
Humor and Ambiguity
Many jokes rely on the ambiguity of language:
–Groucho Marx: One morning I shot an elephant in my pajamas. How he got into my pajamas, I’ll never know.
–She criticized my apartment, so I knocked her flat.
–Noah took all of the animals on the ark in pairs. Except the worms, they came in apples.
–Policeman to little boy: “We are looking for a thief with a bicycle.” Little boy: “Wouldn’t you be better using your eyes?”
–Why is the teacher wearing sun-glasses? Because the class is so bright.
Ambiguity is Explosive
Ambiguities compound to generate enormous numbers of possible interpretations. In English, a sentence ending in n prepositional phrases has over 2^n syntactic interpretations.
–“I saw the man with the telescope”: 2 parses
–“I saw the man on the hill with the telescope.”: 5 parses
–“I saw the man on the hill in Texas with the telescope”: 14 parses
–“I saw the man on the hill in Texas with the telescope at noon.”: 42 parses
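The parse counts above (2, 5, 14, 42) are successive Catalan numbers, which count the possible bracketings; this connection is an observation added here, and it can be checked in a few lines of Python:

```python
from math import comb

def catalan(n):
    # C_n = binomial(2n, n) / (n + 1)
    return comb(2 * n, n) // (n + 1)

# Parse counts for the four example sentences above
print([catalan(n) for n in range(2, 6)])  # [2, 5, 14, 42]
```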
What’s the solution?
Return all possible parses and disambiguate using “other methods”
Summing Up
Parsing is a search problem which may be implemented with many control strategies
–Top-Down or Bottom-Up approaches each have problems
Combining the two solves some but not all issues
–Left recursion
–Syntactic ambiguity
Next time: Making use of statistical information about syntactic constituents
Dynamic Programming
Create a table of solutions to sub-problems (e.g. subtrees) as the parse proceeds
Look up subtrees for each constituent rather than re-parsing
Since all parses are implicitly stored, all are available for later disambiguation
Examples: Cocke-Younger-Kasami (CYK) (1960), Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms
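As a concrete instance of the tabular idea, here is a compact CYK recognizer (a Python sketch; it requires a grammar in CNF, which is where the normal form from the start of the lecture pays off; the toy grammar encoding is the same assumption as above):

```python
# CYK recognizer: table[i][j] holds every nonterminal that can derive
# the span of words from position i to j (exclusive).
CNF = {"S": [("Verb", "NP")],
       "NP": [("Det", "Nom")],
       "Verb": [("book",)],
       "Det": [("that",)],
       "Nom": [("flight",)]}

def cyk(words, grammar):
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                  # width-1 spans: A -> a
        for a, expansions in grammar.items():
            if (w,) in expansions:
                table[i][i + 1].add(a)
    for width in range(2, n + 1):                  # longer spans: A -> B C
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):              # split point
                for a, expansions in grammar.items():
                    for rhs in expansions:
                        if (len(rhs) == 2 and rhs[0] in table[i][k]
                                and rhs[1] in table[k][j]):
                            table[i][j].add(a)
    return "S" in table[0][n]

print(cyk(["book", "that", "flight"], CNF))  # True
```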
Earley Algorithm
Jay Earley (1970)
A type of chart parser that uses dynamic programming to do parallel top-down search
Can parse all context-free languages
Dot notation
–Given a production A → B C D, where B, C, and D are symbols in the grammar (terminals or nonterminals), the notation A → B • C D represents a condition in which B has already been parsed and the sequence C D is expected.
Earley Algorithm
A left-to-right pass fills out a chart with N+1 states
–Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
0 Book 1 that 2 flight 3
–For each word position, the chart contains the set of states representing all partial parse trees generated to date. E.g. chart[0] contains all partial parse trees generated at the beginning of the sentence
Chart Entries
Represent three types of constituents: completed constituents, in-progress constituents, and predicted constituents.
We keep track of what we have built with what we call complete edges in the chart, noting where a constituent starts and where it stops. For the lexical complete edges in 0 Book 1 that 2 flight 3 we want:
Verb [0,1]
Det [1,2]
Noun [2,3]
We keep track of what we are looking for with incomplete edges, saying what we are looking for and where it starts (we don't know where it ends yet). We start out looking for an S at 0: S [0]
Progress in the parse is represented by dotted rules
The position of the dot (•) indicates the progress made in recognizing a grammar rule
0 Book 1 that 2 flight 3
S → • VP, [0,0] (predicted)
NP → Det • Nom, [1,2] (in progress)
VP → V NP •, [0,3] (completed)
[x,y] tells us what portion of the input is spanned so far by this rule
Each state s_i: ⟨dotted rule, [start, end]⟩
S → • VP, [0,0]
–The first 0 means the S constituent begins at the start of the input
–The second 0 means the dot is here too
–So, this is a top-down prediction
NP → Det • Nom, [1,2]
–the NP begins at position 1
–the dot is at position 2
–so, Det has been successfully parsed
–Nom is predicted next
0 Book 1 that 2 flight 3
0 Book 1 that 2 flight 3 (continued)
VP → V NP •, [0,3]
–Successful VP parse of the entire input
Some Earley edges (parser states)
1. S → • NP VP [0,0]: Incomplete. We're trying to build an S that starts at 0 using the NP VP rule. We've found nothing yet (the dot is in first position); we're currently looking for an NP that starts at 0.
2. S → NP • VP [0,1]: Incomplete. We're trying to build an S that starts at 0 using the NP VP rule. We've found an NP (the dot is in second position); we're currently looking for a VP that starts at 1.
3. S → NP VP • [0,3]: Complete. We've succeeded in building an S that starts at 0 using the NP VP rule. It ends at 3 (the dot is in the last position); nothing more is expected.
Successful Parse
The final answer is found by looking at the last entry in the chart
If the entry is S •, [nil, N], then the input was parsed successfully
The chart will also contain a record of all possible parses of the input string, given the grammar
Parsing Procedure for the Earley Algorithm
Move through each set of states in order, applying one of three operators to each state:
–predictor: adds predictions (creates new states) in the chart
–scanner: reads the input words and enters states corresponding to those words into the chart
–completer: moves the dot to the right when a new constituent is found
Earley Algorithm from the Book
Earley Algorithm: Essential Ideas
Initialize
To look for an S at 0, we add the following incomplete edge at 0, which uses a dummy category gamma (γ):
γ → • S, [0,0]
Earley Algorithm: Essential Ideas
Predictor
Intuition: create new states for top-down predictions of new phrases
Applied when non-part-of-speech non-terminals are to the right of a dot: S → • VP, [0,0]
Adds new states to the current chart
–One new state for each expansion of the non-terminal in the grammar:
VP → • V, [0,0]
VP → • V NP, [0,0]
Earley Algorithm: Essential Ideas
Scanner
Intuition: create new states for rules matching the part of speech of the next word.
Applicable when a part of speech is to the right of a dot: VP → • V NP, [0,0] ‘Book…’
Looks at the current word in the input
If it matches, adds state(s) to the next chart
–If VP → • Verb NP, [0,0] is processed, the scanner consults the current word in the input; since book can be a verb, it matches the expectation in the current state. This results in the creation of a new state, which is added to the chart: Verb → book •, [0,1]
Earley Algorithm: Essential Ideas
Completer
Intuition: the parser has finished a new phrase, so it must find and advance all states that were waiting for this constituent
Applied when the dot has reached the right end of a rule: NP → Det Nom •, [1,3]
Find all states with the dot at 1 that are expecting an NP: VP → V • NP, [0,1]
Adds new (completed) state(s) to the current chart: VP → V NP •, [0,3]
0 Book 1 that 2 flight 3
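Putting the three operators together, here is a compact sketch of the recognizer core (Python; the grammar and POS encodings are the same assumptions as in the earlier sketches, and a full parser would also store the backpointers discussed later for retrieving trees):

```python
# Compact Earley recognizer. A state is (lhs, rhs, dot, start);
# chart[k] holds the states ending at input position k.
GRAMMAR = {"gamma": [["S"]],
           "S": [["VP"], ["NP", "VP"]],
           "NP": [["Det", "Nom"]],
           "Nom": [["N"]],
           "VP": [["V"], ["V", "NP"]]}
POS = {"Det": {"that"}, "N": {"flight"}, "V": {"book"}}

def add(states, state):
    if state not in states:
        states.append(state)      # appending while iterating extends the loop

def earley(words):
    chart = [[] for _ in range(len(words) + 1)]
    add(chart[0], ("gamma", ("S",), 0, 0))        # INITIALIZE
    for k in range(len(words) + 1):
        for state in chart[k]:
            lhs, rhs, dot, start = state
            if dot < len(rhs) and rhs[dot] in GRAMMAR:          # PREDICTOR
                for expansion in GRAMMAR[rhs[dot]]:
                    add(chart[k], (rhs[dot], tuple(expansion), 0, k))
            elif dot < len(rhs):                                # SCANNER
                if k < len(words) and words[k] in POS.get(rhs[dot], ()):
                    add(chart[k + 1], (lhs, rhs, dot + 1, start))
            else:                                               # COMPLETER
                for (l2, r2, d2, s2) in chart[start]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(chart[k], (l2, r2, d2 + 1, s2))
    return ("gamma", ("S",), 1, 0) in chart[-1]   # gamma -> S • spans all

print(earley(["book", "that", "flight"]))  # True
```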
Example: State Set S0 for Parsing “Book that flight” using Grammar G0 (chart figure)
Example: State Set S1 for Parsing “Book that flight”
VP → • V and VP → • V NP are both passed to the Scanner, which adds them to Chart[1], moving the dots to the right
Last Two States
(chart figure: a final Scanner step, then the Completer yields γ → S •, [nil,3])
Error Handling
A valid sentence will leave the state S •, [nil, N] in the chart
What happens when we look at the contents of the last table column and don't find an S rule?
–Is it a total loss? No...
–The chart contains every constituent and combination of constituents possible for the input, given the grammar
This is also useful for partial parsing or shallow parsing, as used in information extraction
How do we retrieve the parses at the end?
The representation of each state must be augmented with an additional field to store information about the completed states that generated its constituents
–i.e. what state did we advance here?
–Read the pointers back from the final state
Earley’s Keys to Efficiency
Left recursion, ambiguity, and repeated re-parsing of subtrees
–Solution: dynamic programming
Combine top-down predictions with bottom-up look-ahead to avoid unnecessary expansions
Earley is still one of the most efficient parsers
All efficient parsers avoid re-computation in a similar way.
Next Time
* Chapter 1 of Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2007.
* D. Moldovan, S. Harabagiu, M. Pasca, R. Mihalcea, R. Goodrum, R. Girju, and V. Rus. 1999. LASSO: A tool for surfing the answer net. In Proceedings of the Eighth Text Retrieval Conference (TREC-8).
* E. Brill, S. Dumais and M. Banko. 2002. An analysis of the AskMSR question-answering system. In Proceedings of EMNLP 2002.