Syntactic Parsing: Part I Niranjan Balasubramanian Stony Brook University March 8th and March 22nd, 2016 Many slides adapted from: Ray Mooney, Michael Collins, and Chris Manning.
Overview Syntactic Structure (of English) –Phrases and how they compose to form sentences. Context-free Grammar for capturing English phrase structure. Algorithms for parsing (identifying phrase structure) using CFG. Why is parsing difficult?
Regularities in language Word n-grams model regularities in word sequences Part-of-speech n-grams model regularities in word category sequences. Language has richer structure. –Phrases There are groupings of words in sentences that behave like a unit. –Dependencies Words in a sentence have specific syntactic (and semantic) relationships.
Constituency View of Syntactic Structure. I shot an elephant. Position: An elephant I shot. [Poetic] Expansion: I shot a green elephant. [Perceptive] I shot an elephant that ate my lunch. [Vengeful] Constituents are groups of words (phrases) that behave as a single unit.
Types of Phrases Noun phrases The long second assignment that Niranjan gave us was quite interesting. DT ADJ ADJ NN CC NNP VBD NN [Sarcastic] Headed by a noun, and has an optional determiner, adjectives (phrases), post-modifiers (rel. clauses, prep-phrases). Verb Phrases Everyone absolutely loved CS390. [Optimistic] Headed by a verb, and has other modifiers (adverbs), and object nouns. No subject noun phrases.
Types of Phrases Prepositional Phrases[Chef Bala] I ate pasta with chopsticks. Headed by a preposition, and a noun phrase (that completes it). Usually part of verb or noun phrases. Usually convey information about spatial, temporal aspects. Adjectival Phrases SAC pizza is edible.[Fact] SAC pizza is quite close to being edible.[Opinion] Headed by the adjective and can sometimes include other types of phrases.
Two take-aways about phrases Each type of phrase has some typical structure E.g., NP: DT* (ADJ)* NN Phrases can nest within each other. E.g., NP:DT NP NP: PP NN Recursion. You can keep expanding phrases forever. –This can lead to separating words that are dependent on each other. The green salad served to the workers was not fresh.
Why do we care about this syntactic structure? Many aspects of meaning can be learnt using the syntactic structure. –The NP preceding VP is likely the subject of the action. –The NP following the VP is likely the object of the action. Knowing basic units is helpful in modeling language. –You can use this to predict or complete the sentence. –Re-organize sentences or simplify them. Many many NLP applications use syntactic structure to make decisions. –Relation extraction. –Question answering. –Machine translation. –Semantic role labeling. –…
Capturing Phrase Structure using Grammar Grammar is a way to specify acceptable or valid strings in a language. Phrase types have specific structures. –Can hope to write down rules for valid ways in which phrases can be formed. How to generate noun phrases (NP) in English? Specify re-write rules. Phrase rules: NP → NP PP; NP → DT NN; NP → DT NNS; PP → IN NP. Word rules: NN → hat; NNS → cats; DT → the; DT → a; IN → in. Apply rules starting from the NP symbol and generate phrases. If a given phrase can be generated by applying these rules then it is an NP.
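To make "apply rules starting from the NP symbol" concrete, here is a minimal Python sketch that enumerates the phrases derivable from NP using the re-write rules on this slide. The `RULES` dictionary layout, the `generate` function, and the depth bound are illustrative assumptions; the bound exists only because NP → NP PP is recursive and would otherwise generate forever.

```python
from itertools import product

# Re-write rules from the slide; any symbol without an entry is a word (terminal).
RULES = {
    "NP": [["NP", "PP"], ["DT", "NN"], ["DT", "NNS"]],
    "PP": [["IN", "NP"]],
    "DT": [["the"], ["a"]],
    "NN": [["hat"]],
    "NNS": [["cats"]],
    "IN": [["in"]],
}

def generate(symbol, depth):
    """Return all word tuples derivable from `symbol` using at most `depth` nested rule applications."""
    if symbol not in RULES:          # terminal: it is its own yield
        return {(symbol,)}
    if depth == 0:                   # non-terminal, but no expansions left
        return set()
    results = set()
    for rhs in RULES[symbol]:
        child_yields = [generate(child, depth - 1) for child in rhs]
        for combo in product(*child_yields):
            results.add(tuple(word for part in combo for word in part))
    return results

phrases = generate("NP", 4)
print(("the", "hat") in phrases)                        # a phrase the rules generate
print(("the", "cats", "in", "the", "hat") in phrases)   # uses NP -> NP PP
print(("hat", "the") in phrases)                        # not derivable
```

Note that the rules also generate "a cats": a small grammar like this one overgenerates, which is one reason hand-written grammars grow.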
Context-free Grammar Starts with a special sentence symbol ‘S’ always. Has terminal and non-terminal symbols. –Terminals correspond to words in the language that don’t expand. –Non-terminals are syntactic categories (or phrase types). Is a set of rules with LHS and RHS. –LHS is a single non-terminal –RHS is a combination of non-terminal and terminal symbols. A valid sentence is one that can be derived from the grammar. –The structure of the sentence is captured by the derivation. Often this derivation or structure is represented as a tree. –Parsing is the task of finding the derivation that leads to the sentence.
Context-Free Grammar: Formally N: a set of non-terminal symbols (or variables) [NP, VP, PP etc.] Σ: a set of terminal symbols (disjoint from N) [words] R: a set of productions or rules of the form A → β [e.g., NP → DT NN], where A is a non-terminal and β is a string of symbols from (Σ ∪ N)* S: a designated non-terminal called the start symbol. Strings that can be generated by applying a sequence of rules from R are said to be in the language of the grammar. Parsing becomes the task of identifying if a string is generated by the grammar (and recovering the sequence of rules that generated it).
So, are we done? What is so hard about this? Write a grammar and figure out which set of rules leads to the sentence. [Easy-peasy-lemon-squeezy? Not.] –Chomsky tried this in his PhD thesis in the 1950s –Wrote symbolic grammar (CFG or often richer) and lexicon –Used grammar/proof systems to prove parses from words Ambiguity + Exceptions –Most grammars lead to many, many derivations for a given sentence. –There are always exceptions to cover, and the grammar grows bigger! [All grammars leak! - Sapir] E.g., Fed raises interest rates 0.5% in effort to control inflation. Minimal grammar: 36 parses. Simple 10-rule grammar: 592 parses. Real-size broad-coverage grammar: millions of parses. [Inflation indeed!] Small grammars: Easy to write but less coverage. Broad-coverage grammars: Hard to write and produce many parses.
What is so hard about this? Ambiguity. Warning: You are about to see (possibly) disturbing images. I shot an elephant in my pajamas.
Ambiguity [Figures: two parse trees for "I shot an elephant in my pajamas", differing in where the PP "in my pajamas" attaches.]
Probabilistic CFG: A Solution for ambiguity. Some trees (derivations or parses) are more likely than others. –Some rules are more frequent than others. Key idea: –Associate some probability with each rule.
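As a sketch of the key idea, the probability of a tree under a PCFG is the product of the probabilities of the rules used to build it. The rule probabilities below are made-up numbers for illustration (they are not from the slides), and the nested-tuple tree encoding and the `tree_prob` helper are assumptions of this example.

```python
# Hypothetical rule probabilities; for each left-hand side they sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("I",)): 0.4,
    ("NP", ("DT", "NN")): 0.6,
    ("VP", ("VBD", "NP")): 1.0,
    ("VBD", ("shot",)): 1.0,
    ("DT", ("an",)): 1.0,
    ("NN", ("elephant",)): 1.0,
}

def tree_prob(tree):
    """Multiply the probabilities of all rules used in `tree`.

    A tree is (label, child, child, ...); a leaf is a plain word string.
    """
    if isinstance(tree, str):
        return 1.0
    label, *children = tree
    # The rule's right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p

tree = ("S",
        ("NP", "I"),
        ("VP", ("VBD", "shot"),
               ("NP", ("DT", "an"), ("NN", "elephant"))))
print(tree_prob(tree))
```

With these numbers the only rules with probability below 1 are NP → I (0.4) and NP → DT NN (0.6), so the tree scores 0.4 × 0.6.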
Parsing with PCFG argmax Pr(Tree | Sentence) Brute force search is not efficient! Many trees might share edges. We might be able to save computations. –Top down and Bottom up algorithms can explore the possible trees w/o fully generating all possible derivations.
What are some ways to parse given a CFG? Top down parsing Bottom up parsing Dynamic Programming –CYK or CKY parsing. [Bottom up] –Earley Algorithm [Top down]
Top Down Parsing Start with the start symbol. Apply rules as long as the rules lead to the correct sequence of words. If there is a mismatch with what is observed, then backtrack. Keep iterating until you have generated the sentence, or you have exhausted all paths.
Top Down Parsing [Figure: parse tree (S (VP (Verb book) (NP (Det that) (Nominal (Noun flight))))) for "book that flight"] Idea: Do not explore paths that cannot lead to the sentence.
Top Down Parsing [Tree so far: S → NP VP; NP → Pronoun] Sentence: book that flight. Grammar: S → NP VP; S → Aux NP VP; S → VP; NP → Pronoun; NP → Proper-Noun; NP → Det Nominal; Nominal → Noun; Nominal → Nominal Noun; Nominal → Nominal PP; VP → Verb; VP → Verb NP; VP → VP PP; PP → Prep NP. Lexicon: Det → the | a | that | this; Noun → book | flight | meal; Verb → book | prefer; Pronoun → I | he | she | me; Aux → does; ….
Top Down Parsing [Tree so far: S → NP VP; NP → Pronoun. Pronoun fails to match "book" (X).]
Top Down Parsing [Tree so far: S → NP VP; NP → Proper-Noun]
Top Down Parsing [Tree so far: S → NP VP; NP → Proper-Noun. Proper-Noun fails to match "book" (X).]
Top Down Parsing [Tree so far: S → NP VP; NP → Det Nominal]
Top Down Parsing [Tree so far: S → NP VP; NP → Det Nominal. Det fails to match "book" (X).]
Top Down Parsing [Tree so far: S → Aux NP VP]
Top Down Parsing [Tree so far: S → Aux NP VP. Aux fails to match "book" (X).]
Top Down Parsing [Tree so far: S → VP]
Top Down Parsing [Tree so far: S → VP; VP → Verb]
Top Down Parsing [Tree so far: S → VP; VP → Verb; Verb → book]
Top Down Parsing [Tree so far: S → VP; VP → Verb; Verb → book. "book" is matched, but "that" is left uncovered (X).]
Top Down Parsing [Tree so far: S → VP; VP → Verb NP]
Top Down Parsing [Tree so far: S → VP; VP → Verb NP; Verb → book]
Top Down Parsing [Tree so far: S → VP; VP → Verb NP; Verb → book; NP → Pronoun]
Top Down Parsing And on it goes until …. [Final tree: (S (VP (Verb book) (NP (Det that) (Nominal (Noun flight)))))] Rules used: S → VP; VP → Verb NP; Verb → book; NP → Det Nominal; Det → that; Nominal → Noun; Noun → flight
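The walk-through can be sketched as a small backtracking recognizer in Python. The grammar is the one shown on the slides; Proper-Noun and Prep get empty rule lists because the slides elide their lexical entries. The `parse` function and the length-based pruning check are assumptions of this sketch, not the slides' method: since no rule rewrites to the empty string, every pending symbol must cover at least one word, which also keeps the left-recursive rules (Nominal → Nominal Noun, VP → VP PP) from looping forever.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Pronoun",), ("Proper-Noun",), ("Det", "Nominal")],
    "Nominal": [("Noun",), ("Nominal", "Noun"), ("Nominal", "PP")],
    "VP": [("Verb",), ("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
    "Det": [("the",), ("a",), ("that",), ("this",)],
    "Noun": [("book",), ("flight",), ("meal",)],
    "Verb": [("book",), ("prefer",)],
    "Pronoun": [("I",), ("he",), ("she",), ("me",)],
    "Aux": [("does",)],
    "Proper-Noun": [],   # lexical entries elided on the slides
    "Prep": [],          # lexical entries elided on the slides
}

def parse(symbols, words):
    """Expand `symbols` left to right, in rule order, to cover exactly `words`.

    Returns the list of rules used (a leftmost derivation) or None on failure.
    """
    if not symbols:
        return [] if not words else None
    if len(symbols) > len(words):      # prune: each symbol needs at least one word
        return None
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:           # terminal: must match the next word
        return parse(rest, words[1:]) if words[0] == first else None
    for rhs in GRAMMAR[first]:         # try each rule; backtrack on failure
        result = parse(list(rhs) + rest, words)
        if result is not None:
            return [(first, rhs)] + result
    return None

for lhs, rhs in parse(["S"], "book that flight".split()):
    print(lhs, "->", " ".join(rhs))
```

Because rules are tried in the order listed, the search visits the same dead ends as the slides (S → NP VP, S → Aux NP VP, then VP → Verb) before finding the derivation.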
Top down and Bottom up Both approaches save some exploration! Top down never explores options that will not lead to a full parse –But will explore many options that never connect to the actual sentence. –Can get into left recursion. Bottom up never explores options that do not connect to the actual sentence –But can explore options that can never lead to a full parse. Which search wastes more depends on how the grammar branches. –With too many productions for each non-terminal, the TD will suffer. –With too many RHS in which a non-terminal appears, BU will suffer. Straightforward implementations can take exponential time!
CKY Intuition Imagine your goal is just to identify the nested phrase structure (no labeling of the phrases). A B C D E F: [A][BC][D][EF], then [ABC][DEF], then [ABCDEF]
CKY Intuition The optimal solution for A B C D E F must use one of the following top-level splits: [A][BCDEF] = c[1,1] c[2,6]; [AB][CDEF] = c[1,2] c[3,6]; [ABC][DEF] = c[1,3] c[4,6]; [ABCD][EF] = c[1,4] c[5,6]; [ABCDE][F] = c[1,5] c[6,6]. Each of the sub-phrases must itself be optimal. The optimal solution for span i through j is found by composing optimal solutions for every split point k from i to j-1: c[i,i] c[i+1,j] (k = i); c[i,i+1] c[i+2,j] (k = i+1); …; c[i,j-1] c[j,j] (k = j-1). opt(i, j) = max over i ≤ k < j of opt(i, k) · opt(k+1, j)
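A tiny Python sketch of the split enumeration, using the slide's 1-indexed, inclusive span convention (c[1,6] covers words 1 through 6). The function name `splits` is an assumption of this example.

```python
def splits(i, j):
    """All ways to cut span (i, j) into a left part (i, k) and a right part (k+1, j)."""
    return [((i, k), (k + 1, j)) for k in range(i, j)]

for left, right in splits(1, 6):
    print(f"c{list(left)} c{list(right)}")
```

A span of length n has n - 1 split points, and a single-word span has none.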
CKY Parser: Composing from sub-phrases. Sentence: Book the flight through Houston. [Table with rows i and columns j.] c[i,j] contains all constituents for words i through j. c[1,4] possibilities: c[1,3], c[4,4]; c[1,2], c[3,4]; c[1,1], c[2,4]
CKY Parser: Table filling order. Sentence: Book the flight through Houston. [Table with rows i and columns j.] c[i,j] contains all constituents for words i through j
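One valid filling order is by increasing span length, sketched below. This is an assumption of the example rather than a claim about the order drawn in the slide's table; any order works as long as every sub-span of a cell is filled before the cell itself (filling column by column, left to right, is another common choice).

```python
def fill_order(n):
    """Cells (i, j), 1-indexed and inclusive, ordered by span length: shortest spans first."""
    return [(i, i + length - 1)
            for length in range(1, n + 1)
            for i in range(1, n - length + 2)]

print(fill_order(3))
```

For a five-word sentence this visits the five single-word cells first and c[1,5] last.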
CKY Parser: Example. Sentence: Book the flight through Houston. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP. Grammar: S → NP VP; S → X1 VP; X1 → Aux NP; S → book | include | prefer; S → Verb NP; S → VP PP; NP → I | he | she | me; NP → Houston | NWA; NP → Det Nominal; Nominal → book | flight | meal | money; Nominal → Nominal Noun; Nominal → Nominal PP; VP → book | include | prefer; VP → Verb NP; VP → VP PP; PP → Prep NP; Verb → book; Noun → book; Det → the; Prep → through
CKY Parser: Example. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; VP
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; S; VP
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; VP; S
CKY Parser. Chart entries so far: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; VP; S; S
CKY Parser. Completed chart: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; VP; S; S. [Figure: Parse Tree #1]
CKY Parser. Completed chart: S, VP, Verb, Nominal, Noun; Det; Nominal, Noun; None; NP; VP; S; Prep; None; NP, ProperNoun; PP; Nominal; NP; VP; S; S. [Figure: Parse Tree #2]
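The table filling in this example can be reproduced with a short CKY recognizer. This is a minimal sketch, assuming the CNF grammar exactly as listed on the slides; the names `BINARY`, `LEXICAL`, and `cky` are illustrative. The slides' chart also shows Noun for "flight" and ProperNoun for "Houston", but the listed rules do not include those lexical entries, so they are left out here. Words are lowercased to match the listed rules, except the proper names.

```python
from collections import defaultdict

# Binary rules (parent, left child, right child) from the slide's CNF grammar.
BINARY = [
    ("S", "NP", "VP"), ("S", "X1", "VP"), ("X1", "Aux", "NP"),
    ("S", "Verb", "NP"), ("S", "VP", "PP"),
    ("NP", "Det", "Nominal"),
    ("Nominal", "Nominal", "Noun"), ("Nominal", "Nominal", "PP"),
    ("VP", "Verb", "NP"), ("VP", "VP", "PP"),
    ("PP", "Prep", "NP"),
]

# Lexical rules from the slide, indexed by word.
LEXICAL = {
    "book": {"S", "VP", "Verb", "Nominal", "Noun"},
    "include": {"S", "VP"}, "prefer": {"S", "VP"},
    "I": {"NP"}, "he": {"NP"}, "she": {"NP"}, "me": {"NP"},
    "Houston": {"NP"}, "NWA": {"NP"},
    "flight": {"Nominal"}, "meal": {"Nominal"}, "money": {"Nominal"},
    "the": {"Det"}, "through": {"Prep"},
}

def cky(words):
    """Fill chart[i, j] with every category that can span words i..j (1-indexed, inclusive)."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words, start=1):
        chart[i, i] = set(LEXICAL.get(word, ()))
    for length in range(2, n + 1):                 # shorter spans first
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):                  # split into (i..k) and (k+1..j)
                for parent, left, right in BINARY:
                    if left in chart[i, k] and right in chart[k + 1, j]:
                        chart[i, j].add(parent)
    return chart

chart = cky("book the flight through Houston".split())
print(sorted(chart[1, 5]))   # categories spanning the whole sentence
print(sorted(chart[2, 3]))   # categories spanning "the flight"
```

This version only records which categories are possible. Recovering the two parse trees would additionally require storing back-pointers (which rule and split produced each entry).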
CKY Parser: Complexity? 1) How many cells? 2) How much work to be done in each cell?
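A quick way to reason about these two questions: a sentence of n words has n(n+1)/2 cells, and a cell for a span of length L considers L - 1 split points, each checked against the grammar's binary rules. That gives O(n²) cells, O(n) work per cell per rule, and so O(n³ · |G|) overall for a grammar with |G| rules. The sketch below just counts cells and split-point checks; the function name `cky_work` is an assumption of this example.

```python
def cky_work(n):
    """Return (number of cells, total number of split points examined) for n words."""
    cells = n * (n + 1) // 2
    # There are n - length + 1 spans of each length, each with length - 1 split points.
    split_checks = sum((n - length + 1) * (length - 1) for length in range(2, n + 1))
    return cells, split_checks

for n in (5, 10, 20):
    print(n, cky_work(n))
```

The split-point total equals (n³ - n) / 6, which is where the cubic term comes from.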
Chomsky Normal Form. Original Grammar: S → NP VP; S → Aux NP VP; S → VP; NP → Pronoun; NP → Proper-Noun; NP → Det Nominal; Nominal → Noun; Nominal → Nominal Noun; Nominal → Nominal PP; VP → Verb; VP → Verb NP; VP → VP PP; PP → Prep NP. Chomsky Normal Form: S → NP VP; S → X1 VP; X1 → Aux NP; S → book | include | prefer; S → Verb NP; S → VP PP; NP → I | he | she | me; NP → Houston | NWA; NP → Det Nominal; Nominal → book | flight | meal | money; Nominal → Nominal Noun; Nominal → Nominal PP; VP → book | include | prefer; VP → Verb NP; VP → VP PP; PP → Prep NP
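One step of this conversion, binarizing rules with more than two symbols on the right-hand side, can be sketched as follows. The function name `binarize` and the X1, X2, … naming scheme are assumptions modeled on the slide's X1. The sketch deliberately covers only this step: a full conversion also has to remove unit productions such as S → VP and NP → Pronoun, which is how the slide ends up with rules like S → book and NP → I.

```python
def binarize(rules):
    """Split every rule with more than two right-hand-side symbols into binary rules.

    `rules` is a list of (lhs, rhs_tuple). New symbols are named X1, X2, ...
    """
    out = []
    counter = 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            counter += 1
            new_symbol = f"X{counter}"
            out.append((new_symbol, (rhs[0], rhs[1])))   # X -> first two symbols
            rhs = [new_symbol] + rhs[2:]                  # replace them with X
        out.append((lhs, tuple(rhs)))
    return out

for lhs, rhs in binarize([("S", ("Aux", "NP", "VP")), ("S", ("NP", "VP"))]):
    print(lhs, "->", " ".join(rhs))
```

Applied to S → Aux NP VP, this yields X1 → Aux NP and S → X1 VP, matching the slide.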
CKY: An algorithm for parsing given a grammar. How to address the ambiguity problem? –Use some way to score the parses. –If you cannot decompose the scores into the DP steps, then we are done for! Scoring –Probabilistic context free grammar (PCFG)
Parsing and Scoring PCFGs Grammar Learning from training data.
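The slide does not spell out how the grammar is learned. One common approach, assumed here, is relative-frequency estimation from a treebank: P(A → β) = count(A → β) / count(A). The two-tree "treebank", the nested-tuple tree encoding, and the `estimate_pcfg` function below are illustrative assumptions.

```python
from collections import Counter

def estimate_pcfg(trees):
    """Estimate rule probabilities as count(lhs -> rhs) / count(lhs) over all trees.

    A tree is (label, child, child, ...); a leaf is a plain word string.
    """
    rule_counts = Counter()
    lhs_counts = Counter()

    def visit(tree):
        if isinstance(tree, str):
            return
        label, *children = tree
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(label, rhs)] += 1
        lhs_counts[label] += 1
        for child in children:
            visit(child)

    for tree in trees:
        visit(tree)
    return {rule: count / lhs_counts[rule[0]] for rule, count in rule_counts.items()}

# Toy treebank (made up for illustration).
treebank = [
    ("NP", ("DT", "the"), ("NN", "hat")),
    ("NP", ("DT", "a"), ("NNS", "cats")),
]
for (lhs, rhs), p in estimate_pcfg(treebank).items():
    print(lhs, "->", " ".join(rhs), p)
```

In practice the counts come from a large annotated corpus, and rare rules usually need smoothing.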
CKY Parser for PCFG. 1) At each cell, store the maximum-probability (optimal) parse for that span. 2) The score of a parse at a span is the probability of the rule times the probabilities of the parses of the sub-spans. C(2,4) = max { C(2,2) × C(3,4) × Pr(X1 → Y1 Z1), C(2,3) × C(4,4) × Pr(X2 → Y2 Z2) } [Figure labels: C(2,4): X1; C(2,2): Y1; C(3,4): Z1; C(2,3): Y2; C(4,4): Y4]
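The two steps can be sketched as a probabilistic (Viterbi-style) CKY in Python. Everything numeric here is a made-up toy PCFG for illustration, not a grammar from the slides; the names `viterbi_cky` and `build_tree` are likewise assumptions. Each entry stores the best probability for a (span, category) pair plus a back-pointer, so the best tree can be rebuilt afterwards.

```python
# Hypothetical toy PCFG in CNF; probabilities for each left-hand side sum to 1.
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.6, ("VP", "VP", "PP"): 0.4,
    ("NP", "NP", "PP"): 0.2, ("NP", "DT", "NN"): 0.3,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {
    ("NP", "workers"): 0.25, ("NP", "sacks"): 0.25,
    ("V", "dumped"): 1.0, ("P", "into"): 1.0,
    ("DT", "a"): 1.0, ("NN", "bin"): 1.0,
}

def viterbi_cky(words):
    """best[i, j, X] = (max probability of an X spanning words i..j, back-pointer)."""
    n = len(words)
    best = {}
    for i, word in enumerate(words, start=1):
        for (label, w), p in LEXICAL.items():
            if w == word:
                best[i, i, label] = (p, word)
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                for (parent, left, right), p in BINARY.items():
                    if (i, k, left) in best and (k + 1, j, right) in best:
                        score = p * best[i, k, left][0] * best[k + 1, j, right][0]
                        if score > best.get((i, j, parent), (0.0, None))[0]:
                            best[i, j, parent] = (score, (k, left, right))
    return best

def build_tree(best, i, j, label):
    """Follow back-pointers to rebuild the best tree as nested tuples."""
    _, back = best[i, j, label]
    if isinstance(back, str):          # lexical entry: back-pointer is the word
        return (label, back)
    k, left, right = back
    return (label, build_tree(best, i, k, left), build_tree(best, k + 1, j, right))

words = "workers dumped sacks into a bin".split()
best = viterbi_cky(words)
print(best[1, 6, "S"][0])
print(build_tree(best, 1, 6, "S"))
```

With these toy numbers the VP spanning words 2 through 6 can be built two ways (verb attachment or noun attachment of the PP), and the cell keeps only the higher-scoring one.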
Issues with PCFGs Makes strong independence assumptions about language –Lexical independence –Structural independence
Lexical Dependence: PP Attachment Ambiguity workers dumped sacks into a bin
Lexical Dependence: PP Attachment Ambiguity Prob(Tree 1, Sentence) = …. x Prob(VP -> VP PP | VP) x …
Lexical Dependence: PP Attachment Ambiguity Prob(Tree 2, Sentence) = …. x Prob(NP -> NP PP | NP) x …
Lexical Dependence: PP Attachment Ambiguity Prob(Tree 1, Sentence) > Prob(Tree 2, Sentence) If Prob(VP -> VP PP | VP) > Prob(NP -> NP PP | NP)
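A small numeric sketch of why this comparison ignores the words. The two trees for "workers dumped sacks into a bin" share every rule except one: Tree 1 uses VP → VP PP and Tree 2 uses NP → NP PP. All probabilities below are hypothetical values chosen for illustration.

```python
from math import prod, isclose

# Hypothetical rule probabilities (not from the slides).
P = {
    "S -> NP VP": 1.0, "VP -> V NP": 0.6, "PP -> P NP": 1.0, "NP -> DT NN": 0.3,
    "NP -> workers": 0.25, "NP -> sacks": 0.25, "V -> dumped": 1.0,
    "P -> into": 1.0, "DT -> a": 1.0, "NN -> bin": 1.0,
    "VP -> VP PP": 0.4,   # used only by Tree 1 (PP attaches to the verb phrase)
    "NP -> NP PP": 0.2,   # used only by Tree 2 (PP attaches to the noun phrase)
}

shared = ["S -> NP VP", "VP -> V NP", "PP -> P NP", "NP -> DT NN",
          "NP -> workers", "NP -> sacks", "V -> dumped",
          "P -> into", "DT -> a", "NN -> bin"]

p_tree1 = prod(P[r] for r in shared + ["VP -> VP PP"])
p_tree2 = prod(P[r] for r in shared + ["NP -> NP PP"])

# The shared factors cancel, so the ratio depends only on the two attachment rules.
print(p_tree1 / p_tree2, P["VP -> VP PP"] / P["NP -> NP PP"])
```

Swapping "dumped" for another verb or "into" for another preposition changes only the shared factors, so the preferred attachment stays the same. This is the lexical independence problem the slides describe.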
Lexical Dependence: Co-ordination Ambiguity
Structural Preferences president of [company in Africa] vs. [president of company] in Africa. The same rules are applied in both cases, so the tree probability is the same. Yet the left structure is twice as likely in the Wall Street Journal. There are similar issues when PPs can attach to multiple verbs.
Structural Dependence
Structural Dependence: Sub-categorization Specific verbs take some types of arguments but not others. –Intransitive, transitive, and ditransitive –Finite vs. non-finite verbs. A generic VP label hides the different argument preferences of the various sub-categories.
How to address these issues? Lexical dependence –Introduce lexical items into the tree. –Use headwords of constituents as part of the node label. [Charniak 1997] Structural dependence –Add more information to non-terminal categories [state splitting]: include information about parents [Johnson 1998]; include fine-grained information (mark possessives, for example). Trade-off: Adding lexical information and fine-grained categories a) increases sparsity (needs appropriate smoothing) and b) adds more rules (can affect parsing speed). Sub-categorization –Add information to the non-terminal categories [state splitting] –E.g., S -> NP_firstpersonsingular VP_firstpersonsingular
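As an illustration of "include information about parents", here is a minimal sketch of parent annotation on a tree. The `annotate_parents` function, the `^` separator, and the choice to annotate every non-terminal (including part-of-speech tags) are assumptions of this example; actual parsers differ in which nodes they annotate.

```python
def annotate_parents(tree, parent=None):
    """Relabel each non-terminal as LABEL^PARENT; the root and the words are left unchanged.

    A tree is (label, child, child, ...); a leaf is a plain word string.
    """
    if isinstance(tree, str):
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, *[annotate_parents(child, label) for child in children])

tree = ("S", ("NP", ("PRP", "I")), ("VP", ("VBD", "shot")))
print(annotate_parents(tree))
```

After annotation, an NP under S (typically a subject) and an NP under VP (typically an object) become different categories, NP^S and NP^VP, so a PCFG estimated from annotated trees can give them different rule probabilities. The cost is more categories and sparser counts.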
Lexicalized Charniak Parser Key idea is to identify heads of constituents and use them to condition probabilities. There are a handful of rules that specify how to identify heads. Probability of lexicalized parse tree is computed using these two quantities. P(cur_head = profits | cur_category = NP, parent_head = rose, parent_category = S) P(rule = r_i | cur_head = profits, cur_category = NP, parent_category = S)