1
recovering empty categories
2
Penn Treebank
The Penn Treebank Project annotates naturally occurring text for linguistic structure. It produces skeletal parses showing rough syntactic and semantic information: a bank of linguistic trees. It also annotates text with POS tags.
Bracketing (strictly POS vs. syntax and predicates):
(1) (Mary) (visited a very nice boy)
(2) (A very nice boy) (visited Mary)
Parse of (1):
(S (NP Mary) (VP (V visited) (NP (ART a) (ADJP (ADV very) (ADJ nice)) (N boy))))
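The bracketed notation maps directly onto a nested data structure. The following is a minimal sketch, not the Treebank's own tooling: `parse` and `leaves` are hypothetical helper names, and the reader assumes well-formed input in which every opening bracket is followed by a label.

```python
def parse(text):
    """Parse a bracketed tree into (label, children) tuples; words stay plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]      # the token right after "(" is the node label
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                     # skip the closing ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return the words of a tree in surface order."""
    words = []
    for child in tree[1]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


tree = parse("(S (NP Mary) (VP (V visited) (NP (ART a) (ADJP (ADV very) (ADJ nice)) (N boy))))")
print(tree[0])        # S
print(leaves(tree))   # ['Mary', 'visited', 'a', 'very', 'nice', 'boy']
```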
3
syntactic tags
ADJP - Adjective phrase. Example: “outrageously expensive”.
ADVP - Adverb phrase. Examples: “rather timidly”, “very well indeed”.
NP - Noun phrase.
PNP - Proper noun phrase.
PP - Prepositional phrase.
S - Simple declarative clause.
SBAR - Clause introduced by a subordinating conjunction.
SBARQ - Direct question introduced by a wh-word or wh-phrase.
4
syntactic tags
SINV - Inverted declarative sentence, one in which the subject follows the verb.
SQ - That part of an SBARQ that excludes the wh-word or wh-phrase.
VP - Verb phrase. Phrasal category headed by a verb.
WHADVP - Wh-adverb phrase. Examples: “how”, “where”.
WHNP - Wh-noun phrase. Examples: “who”, “whose daughter”, “which book”.
WHPP - Wh-prepositional phrase. Example: “on what”.
QP - Quantifier phrase used within NP.
X - Constituent of unknown or uncertain type.
5
examples
adverb and preposition:
(S (NP He) was (VP (ADVP very hurriedly) throwing (NP clothes) (PP into (NP a suitcase))).)
apposition:
(NP (NP Mr. Smith), (ADJP (NP 65 years) old), (NP chairman (PP of (NP the board))))
comparative:
(S (NP He) (VP is (ADJP as tall (SBAR as (S (NP John) (VP is))))).)
(S (SBAR (X the sooner) (S our vans hit the road)), (S (X the easier) (S we will fulfill that obligation)).)
6
function tags
Subject and predicate NPs:
(S (NP-SBJ I) (VP consider (S (NP-SBJ Kris) (NP-PRD a fool))))
Benefactive:
(S (NP-SBJ I) (VP baked (NP a cake) (PP-BNF for (NP Doug))))
Other tags: ADV (adverbial noun: “a little bit”), VOC (vocative), DTV (dative), DIR (direction, with PPs such as from-to), LOC (locative, with PP), MNR (manner), TMP (temporal), CLR (closely related: predication adjuncts or phrasal verbs), HLN (headline or dateline), TTL (title), etc.
7
gapping
Gap coindexing: a gapped constituent marked =n is matched with the constituent indexed -n in the full clause.
(S (S (NP-SBJ-1 Mary) (VP likes (NP-2 Bach))) and (S (NP-SBJ=1 Susan), (NP=2 Beethoven)))
Predicate-argument structure: LIKES(Mary, Bach) & LIKES(Susan, Beethoven)
(S (NP-SBJ I) (VP (VP eat (NP-1 breakfast) (PP-TMP-2 in (NP the morning))) and (VP (NP=1 lunch) (PP-TMP=2 in (NP the afternoon)))))
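The node labels in these examples pack several pieces of information into one string: a syntactic category, optional function tags, an optional coreference index (-n), and an optional gap index (=n). A minimal sketch of how such a label could be taken apart follows. The regular expression and the name `split_label` are illustrative assumptions covering only label shapes like those shown in the examples, not the full Treebank label inventory.

```python
import re

# category, then any number of -TAG parts, then an optional -index or =gap index
LABEL = re.compile(
    r"^(?P<cat>[A-Z]+)(?P<tags>(?:-[A-Z]+)*)(?:-(?P<index>\d+))?(?:=(?P<gap>\d+))?$"
)


def split_label(label):
    """Split a label such as 'NP-SBJ-1' or 'PP-TMP=2' into its parts."""
    m = LABEL.match(label)
    if m is None:
        raise ValueError(f"unrecognised label: {label!r}")
    return {
        "category": m.group("cat"),
        "function_tags": [t for t in m.group("tags").split("-") if t],
        "index": int(m.group("index")) if m.group("index") else None,
        "gap": int(m.group("gap")) if m.group("gap") else None,
    }


print(split_label("NP-SBJ-1"))
print(split_label("PP-TMP=2"))
```

Pairing each node that has a `gap` value with the node that carries the same `index` value recovers the template that the gapped clause reuses.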
8
empty categories
Empty categories, or null elements, are used for non-local dependencies, discontinuous constituents, and missing elements. They are coindexed with their antecedents in the same sentence. In addition, if a node has a particular grammatical function (such as subject) or semantic role (such as location), it has a function tag indicating that role; empty categories may also have function tags.
NP *: arbitrary or controlled PRO, trace of NP movement
*T*: trace of A'-movement (WHNP)
0: null complementizer (i.e. that)
*U*: unit
9
indexing & *T* examples
Indices are used to express coreference, binding (wh-movement), and close association (it-extraposition).
(S (NP-SBJ Willie) (VP knew (SBAR (WHNP-1 who) (S (NP-SBJ *T*-1) (VP threw (NP the ball))))))
(SBARQ (WHNP-1 what) (SQ are (NP-SBJ you) (VP thinking (PP-CLR about (NP *T*-1)))) ?)
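Coindexation can be followed mechanically: collect every node whose label ends in -n, then look up each empty element that carries the same n. The sketch below is an illustration under simplifying assumptions (one antecedent per index, trees without POS tags as written in the examples); `parse`, `leaves`, and `resolve` are hypothetical helper names.

```python
import re


def parse(text):
    """Parse a bracketed tree into (label, children) tuples; words stay plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1
        return (label, children)

    return node()


def leaves(node):
    out = []
    for child in node[1]:
        out.extend(leaves(child) if isinstance(child, tuple) else [child])
    return out


def resolve(tree):
    """Return (empty element, antecedent label, antecedent words) for each coindexed trace."""
    indexed, empties = {}, []

    def walk(node):
        label, children = node
        m = re.search(r"-(\d+)$", label)
        if m:
            indexed[m.group(1)] = node               # node carrying index n
        for child in children:
            if isinstance(child, tuple):
                walk(child)
            else:
                e = re.fullmatch(r"(\*T\*|\*)-(\d+)", child)
                if e:
                    empties.append((e.group(1), e.group(2)))

    walk(tree)
    return [(kind, indexed[i][0], " ".join(leaves(indexed[i])))
            for kind, i in empties if i in indexed]


tree = parse("(SBARQ (WHNP-1 what) (SQ are (NP-SBJ you) "
             "(VP thinking (PP-CLR about (NP *T*-1)))) ?)")
print(resolve(tree))   # [('*T*', 'WHNP-1', 'what')]
```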
10
NP *
object of passive verb:
(S (NP-SBJ-1 John) (VP was (VP hit (NP *-1) (PP by (NP a ball)))))
reduced relative clause:
(NP (NP an agreement) (VP signed (NP *) (PP by (NP everyone))))
subjects of participial clauses and gerunds:
(S (NP-SBJ-1 I) (VP stopped (S (NP-SBJ *-1) (VP eating (NP chocolate)))))
(S (NP *) (VP Having (VP carefully considered (NP his options))))
adverbial:
(S (NP-SBJ-1 She) (VP left, (S-ADV (NP-SBJ-2 *-1) (VP offended (NP *-2) (PP by (NP their remarks))))))
11
0 and *U*
null complementizer (that):
(S (NP-SBJ I) (VP believe (SBAR 0 (S (NP-SBJ he) (VP will (VP stay))))))
WHNP 0:
(NP (NP a movie) (SBAR (WHNP-1 0) (S (NP-SBJ *) (VP to (VP see (NP *T*-1))))))
WHADVP 0:
(S (NP-SBJ That) (VP is (NP-PRD (NP a good way) (SBAR (WHADVP-1 0) (S (NP-SBJ *) (VP to (VP keep (ADJP-PRD warm) (ADVP-MNR *T*-1))))))))
units:
(NP US$ 5 *U*)
(NP (QP between 12% to 13%) *U*)
12
recovery of empty categories [Campbell 2004]
Recovery refers to:
detection: locate empty categories in the parse tree.
resolution: coindex them with their antecedents and assign them function tags.
The approach is NOT learning- or corpus-based; it is based on syntactic rules.
13
algorithm for recovery
Walk the tree from the top. At each node X:
  Try to insert every empty category c. If X satisfies the (rule-based) syntactic context of c, insert c.
  Assign function tags to X.
  If X = NP *, try to find an antecedent for X.
Rule to insert NP *:
  if X is a passive VP & X has no complement S
    if there is a postmodifying PP Y
      insert NP * before the postmodifiers of Y
    else
      insert NP * before the postmodifiers of X
  else if X is a non-finite S and X has no subject
    insert NP-SBJ * after all premodifiers of X
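The NP * rule above can be sketched in code, with several loudly stated assumptions that are not in the slides: trees carry POS tags on every word; a passive VP is taken to be a VBN-headed VP whose parent is either an NP or a VP headed by a form of "be"; a non-finite S is one whose VP is headed by TO, VBG, or VBN; the "postmodifying PP Y" is read as a PP with no object, since a PP such as "by a ball" must not receive the empty NP; and everything after the head verb counts as a postmodifier. All function names are hypothetical, and this is an illustration rather than Campbell's implementation.

```python
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}
NONFINITE = {"TO", "VBG", "VBN"}


def parse(text):
    """Parse a POS-tagged bracketed tree into mutable [label, children] lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1
        return [label, children]

    return node()


def show(node):
    if isinstance(node, str):
        return node
    return "(" + " ".join([node[0]] + [show(c) for c in node[1]]) + ")"


def is_preterminal(node):
    return all(isinstance(c, str) for c in node[1])


def verb_head(vp):
    """Index and node of the first verbal preterminal (assumed to be the head)."""
    for i, child in enumerate(vp[1]):
        if is_preterminal(child) and (child[0].startswith("VB") or child[0] == "TO"):
            return i, child
    return None, None


def empty(label):
    return [label, [["-NONE-", ["*"]]]]


def is_passive(vp, parent):
    _, head = verb_head(vp)
    if head is None or head[0] != "VBN" or parent is None:
        return False
    if parent[0].startswith("NP"):            # reduced relative clause
        return True
    _, aux = verb_head(parent)
    return (parent[0].startswith("VP") and aux is not None
            and aux[1][0].lower() in BE_FORMS)


def insert_np_star(node, parent=None):
    """Walk top-down and insert NP * / NP-SBJ * according to the rule."""
    if is_preterminal(node):
        return
    label, children = node
    phrases = [c for c in children if not is_preterminal(c)]
    if (label.startswith("VP") and is_passive(node, parent)
            and not any(c[0].startswith("S") for c in phrases)):
        dangling = [c for c in phrases if c[0].startswith("PP")
                    and all(is_preterminal(g) for g in c[1])]
        if dangling:
            dangling[0][1].insert(1, empty("NP"))      # right after the preposition
        else:
            i, _ = verb_head(node)
            children.insert(i + 1, empty("NP"))        # right after the head verb
    elif label == "S" or label.startswith("S-"):
        has_subject = any(c[0].startswith("NP") for c in phrases)
        for i, child in enumerate(children):
            if child[0].startswith("VP") and not is_preterminal(child):
                _, head = verb_head(child)
                if not has_subject and head is not None and head[0] in NONFINITE:
                    children.insert(i, empty("NP-SBJ"))  # just before the VP
                break
    for child in list(children):
        insert_np_star(child, node)


tree = parse("(S (NP (NNP John)) (VP (VBD was) (VP (VBN hit) "
             "(PP (IN by) (NP (DT a) (NN ball))))))")
insert_np_star(tree)
print(show(tree))
```

For the passive example this inserts the empty object between "hit" and the by-phrase, matching the position of NP * in the annotated example for passive verbs. Function tag assignment and antecedent search are left out of this sketch.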
14
parameters
The rules make no use of lexical information: only some function words (auxiliaries or infinitival to) are consulted, never content words.
For WHADVP: check the type of the head of the NP containing the relative clause and add a function tag to *T* (why: PRP, how: MNR, when: TMP, etc.).
(NP (NP the country) (SBAR (WHADVP-1 where) (S (NP-SBJ I) (VP live (ADVP-LOC *T*-1)))))
“time to go” ?
The method depends on the system’s ability to detect passives, infinitives, modifiers, and functional information such as subject.
15
more rules
An extra rule inserts NP * as the subject of an imperative:
(S (NP-VOC Chris), (NP-SBJ *) (VP go (ADVP-DIR home)) !)
To find antecedents of NP *:
If NP * is a non-subject, assign the local subject (“John was hit (NP *) by a ball”).
If NP * is the subject of a non-finite S, search the tree for another NP subject (“I stopped (NP-SBJ *) eating chocolate”).
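The two antecedent rules can be sketched as a search over enclosing clauses. The version below is a simplification under stated assumptions: trees carry function tags and no POS tags (as in the examples), "search the tree for another NP subject" is narrowed to "take the overt subject of the nearest enclosing clause", and empty subjects are skipped as candidates. `find_antecedents` and the other names are hypothetical.

```python
def parse(text):
    """Parse a bracketed tree into (label, children) tuples; words stay plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1
        return (label, children)

    return node()


def leaves(node):
    out = []
    for child in node[1]:
        out.extend(leaves(child) if isinstance(child, tuple) else [child])
    return out


def subject_of(clause):
    """First overt NP-SBJ child of a clause, or None."""
    for child in clause[1]:
        if (isinstance(child, tuple) and child[0].startswith("NP-SBJ")
                and child[1] != ["*"]):
            return child
    return None


def find_antecedents(tree):
    """Return (label of the empty NP, words of its proposed antecedent) pairs."""
    found = []

    def walk(node, clauses):
        label, children = node
        if label == "S" or label.startswith("S-"):
            clauses = clauses + [node]
        if label.startswith("NP") and children == ["*"]:
            # A subject NP * looks outside its own clause; other NP * start locally.
            scope = clauses[:-1] if "SBJ" in label else clauses
            for clause in reversed(scope):
                subject = subject_of(clause)
                if subject is not None:
                    found.append((label, " ".join(leaves(subject))))
                    break
        for child in children:
            if isinstance(child, tuple):
                walk(child, clauses)

    walk(tree, [])
    return found


passive = parse("(S (NP-SBJ John) (VP was (VP hit (NP *) (PP by (NP a ball)))))")
gerund = parse("(S (NP-SBJ I) (VP stopped (S (NP-SBJ *) (VP eating (NP chocolate)))))")
print(find_antecedents(passive))   # [('NP', 'John')]
print(find_antecedents(gerund))    # [('NP-SBJ', 'I')]
```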
16
evaluation
Perfect input: PTB without empty categories.
Correct recovery: label + string position.
Precision (P): correct empty categories / empty categories detected
Recall (R): correct empty categories / empty categories in corpus
F1 = 2PR / (P + R)
Perfect input without function tags.
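The three measures can be computed from two sets of recovered items. In this sketch each empty category is identified by a (sentence id, label, string position) triple; including the sentence id is an assumption made here so that items from different sentences stay distinct, and the example values are invented purely for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted empty categories against gold ones; an item counts as
    correct only if its label and string position both match."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Invented toy data: (sentence id, label, string position)
gold = {(1, "NP *", 3), (1, "*T*", 7), (2, "0", 5), (2, "NP *", 9)}
predicted = {(1, "NP *", 3), (1, "*T*", 7), (2, "NP *", 6)}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")   # P=0.667 R=0.500 F1=0.571
```

Two of the three predictions match a gold item, so P = 2/3 and R = 2/4; the third prediction has the right label but the wrong position and therefore counts as an error.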
17
evaluation
Input: output of Charniak’s parser, a PCFG parser trained and tested on the PTB.
Correct recovery: label + string position.
Low results: the parser introduces errors, and its output carries no function tags.
18
learning & lexical-based?
We need lexical information in some cases: VP (S (VP to…)…)
Is the empty category in subject position of S an NP * or an NP *T*?
“I’d like (NP-SBJ *) to have.”
“Everyone seems (NP-SBJ *) to dislike him.”
“John designed telescopes (NP-SBJ *T*) to sit on Kitt Peak.”
“We bought a broom (NP-SBJ *) to sweep the floor with (NP *T*).”
In the last two examples, verb + to expresses purpose (PRP).
Combined learning + rule-based approach for function or subject tags on NP * and for their antecedents (resolution).