Published by Carlos Kimbrough. Modified over 9 years ago.
1
Using Syntax to Disambiguate Explicit Discourse Connectives in Text
Source: ACL-IJCNLP 2009
Authors: Emily Pitler and Ani Nenkova
Reporter: Yong-Xiang Chen
2
Discourse connectives
Words or phrases that explicitly signal the presence of a discourse relation, such as
– once
– since
– on the contrary
Implicit relations
– the discourse connective is absent and the relation is inferred by the reader
– hard to identify automatically
Explicit relations are much easier to predict, but…
3
Two types of ambiguity
1. Discourse or non-discourse usage
– For example, “once” can be a temporal discourse connective or simply a word meaning “formerly”
2. Some connectives are ambiguous in terms of the relation they mark
– For example, “since” can serve as a temporal connective or as a causal connective
4
Goal
Explore the predictive power of syntactic features for both disambiguation tasks
5
Corpus and features
Corpus: Penn Discourse Treebank (PDTB)
– Each discourse connective is assigned a sense from a three-level hierarchy of senses
– Annotates 40,600 discourse relations (the largest public resource)
– 18,459 explicit relations, covering 100 explicit discourse connectives
– 16,053 implicit relations
– other relations
Annotators were allowed to provide two senses for a given connective
6
Relation categories of discourse connectives in PDTB
This work considers only the top-level categories
– general enough to be annotated with high inter-annotator agreement
1. Expansion (extension: progression / explanation): one clause elaborates on information in the other
2. Comparison (contrast: juxtaposition): information in the two clauses is compared or contrasted
3. Contingency (circumstance: cause-effect / condition): one clause expresses the cause of the other
4. Temporal (sequence: succession): information in the two clauses is related because of its timing
7
Syntactic features
Syntax has not been used for discourse vs. non-discourse disambiguation
– Syntax has been used extensively for dividing sentences into elementary discourse units
Idea: discourse connectives appear in specific syntactic contexts
Four feature categories:
– Self Category
– Parent Category
– Left Sibling Category
– Right Sibling Category
[Figure: parse-tree diagram labeling the Parent, Left Sibling, Self, and Right Sibling nodes]
8
Self Category
The highest node in the tree that dominates the words in the connective but nothing else
– For single-word connectives, this might correspond to the POS tag of the word
– For multi-word connectives, it is the phrase node that covers them
– Example: the cue phrase “in addition” is parsed as (PP (IN In) (NP (NN addition)))
– Preposition + Noun, so the Self Category of the phrase is PP (prepositional phrase)
9
Parent Category
The category of the immediate parent of the Self Category
– Example: “My favorite colors are blue and green”
– when “and” does not have a discourse function, as here, the parent of “and” is an NP (“blue and green”)
10
Left Sibling Category
The syntactic category of the sibling immediately to the left of the Self Category
– If the left sibling does not exist, this feature takes the value “NONE”; the connective is then the first element inside its parent, which suggests a discourse function
– In the example “My favorite colors are blue and green”, by contrast, the left sibling of “and” is an NP, consistent with “and” not having a discourse function there
11
Right Sibling Category
The syntactic category of the sibling immediately to the right of the Self Category
English is a right-branching language
– the right sibling is often the dependent of the potential discourse connective
If the connective string has a discourse function
– this dependent will often be a clause (SBAR)
– Example: “After I went to the store, I went home” (discourse usage: “After” is followed by a clause) vs. “After May, I will go on vacation” (non-discourse usage: “After” is followed by a noun phrase)
12
More features about the right sibling
Example:
– “NASA won’t attempt a rescue; instead, it will try to predict whether any of the rubble will smash to the ground and where.”
– Although the syntactic category of “where” is SBAR, “and” does not have a discourse function
So two additional features describe the contents of the right sibling
– Right Sibling Contains a VP
– Right Sibling Contains a Trace (in this example, a wh-trace)
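The four category features and the two right-sibling content features can be read off a bracketed parse. The following is a minimal sketch, not the authors' implementation: the `Node` class, the function names, the feature keys, and the example parses are all hypothetical and hand-written for illustration. It assumes Penn-Treebank-style brackets, that the connective forms a single constituent, and that traces carry the POS tag `-NONE-`.

```python
import re

class Node:
    """One constituent in a bracketed parse; `word` is set only on POS-tag nodes."""
    def __init__(self, label, parent=None):
        self.label, self.parent = label, parent
        self.children, self.word = [], None

def parse(bracketed):
    """Read a Penn-Treebank-style bracketed string into a Node tree."""
    root = current = None
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    i = 0
    while i < len(tokens):
        if tokens[i] == "(":                  # "(" is always followed by a label
            node = Node(tokens[i + 1], current)
            if current is None:
                root = node
            else:
                current.children.append(node)
            current = node
            i += 2
        elif tokens[i] == ")":
            current = current.parent
            i += 1
        else:                                 # a bare token is the word under a POS tag
            current.word = tokens[i]
            i += 1
    return root

def leaves(node):
    """POS-tag nodes under `node`, left to right."""
    if node.word is not None:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def self_node(root, start, end):
    """Highest node whose leaves all lie inside the connective span [start, end)."""
    all_leaves = leaves(root)
    span = set(all_leaves[start:end])
    node = all_leaves[start]
    while node.parent is not None and set(leaves(node.parent)) <= span:
        node = node.parent
    return node

def contains_label(node, label):
    """True if `node` or any descendant carries `label`."""
    return node.label == label or any(contains_label(c, label) for c in node.children)

def syntactic_features(root, start, end):
    """Feature dict for the connective covering leaf positions [start, end)."""
    node = self_node(root, start, end)
    parent = node.parent
    siblings = parent.children if parent is not None else [node]
    pos = siblings.index(node)
    left = siblings[pos - 1] if pos > 0 else None
    right = siblings[pos + 1] if pos + 1 < len(siblings) else None
    return {
        "Self": node.label,
        "Parent": parent.label if parent is not None else "NONE",
        "LeftSibling": left.label if left is not None else "NONE",
        "RightSibling": right.label if right is not None else "NONE",
        "RightSiblingContainsVP": right is not None and contains_label(right, "VP"),
        # Assumption: traces are marked with the Penn Treebank tag -NONE-
        "RightSiblingContainsTrace": right is not None and contains_label(right, "-NONE-"),
    }

# Hand-written parses for illustration only (not parser output).
after_clause = parse("(S (SBAR (IN After) (S (NP (PRP I)) (VP (VBD went) (PP (TO to) "
                     "(NP (DT the) (NN store)))))) (, ,) (NP (PRP I)) (VP (VBD went) (NP (NN home))))")
after_may = parse("(S (PP (IN After) (NP (NNP May))) (, ,) (NP (PRP I)) "
                  "(VP (MD will) (VP (VB go) (PP (IN on) (NP (NN vacation))))))")
colors = parse("(S (NP (PRP$ My) (JJ favorite) (NNS colors)) "
               "(VP (VBP are) (NP (NP (JJ blue)) (CC and) (NP (JJ green)))))")

print(syntactic_features(after_clause, 0, 1))
print(syntactic_features(after_may, 0, 1))
print(syntactic_features(colors, 5, 6))
```

In these hand-written parses the right sibling of the first “After” contains a VP while the right sibling in “After May” is a bare NP, and “and” in “blue and green” has NP as both parent and left sibling. The exact clause label on the right sibling (S vs. SBAR) depends on the bracketing a parser produces.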
13
Discourse vs. non-discourse usage
Only 11 PDTB connectives appear as a discourse connective more than 90% of the time
– although, in turn, afterward, consequently, additionally, alternatively, whereas, on the contrary, if and when, lest, and on the one hand...on the other hand
– by contrast, “or” serves a discourse function only 2.8% of the times it appears
14
Training and testing
Positive examples:
– explicit discourse connectives annotated in the PDTB
Negative examples:
– the same strings in the PDTB texts that were not annotated as explicit connectives
Results are reported using a maximum entropy classifier
– 2 sections (0 and 1) of the PDTB were used for feature development
– 21 sections (2-22) were used for ten-fold cross-validation
Baseline: the connective string alone
– F-score = 75.33%, accuracy = 85.86%
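The baseline is reported with both an F-score and an accuracy, and the two can diverge when negative examples outnumber positive ones. A minimal sketch of how both are computed for the binary discourse vs. non-discourse task follows; the function name and the toy labels are invented for illustration and are not PDTB data.

```python
def binary_scores(gold, predicted):
    """Precision, recall, F-score and accuracy, treating 1 (discourse usage) as positive."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # discourse, predicted discourse
    fp = sum(1 for g, p in pairs if not g and p)      # non-discourse, predicted discourse
    fn = sum(1 for g, p in pairs if g and not p)      # discourse, predicted non-discourse
    tn = sum(1 for g, p in pairs if not g and not p)  # non-discourse, predicted non-discourse
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "accuracy": accuracy}

# Toy labels (made up): 3 discourse usages among 10 candidate strings.
gold      = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = binary_scores(gold, predicted)
print(scores)
```

On this toy input the accuracy is higher than the F-score because the correctly rejected negatives count toward accuracy but not toward precision or recall.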
15
Combinations of features
Different connectives have different syntactic contexts
1. Pairwise interaction features between the connective and each syntactic feature
– For example: connective=also-RightSibling=SBAR
2. Interaction terms between pairs of syntactic features
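Both kinds of interaction term can be generated mechanically from the connective string and a dictionary of syntactic features. This is a sketch under assumptions: the function name is hypothetical, the feature values are made up, and the string format simply follows the slide's example `connective=also-RightSibling=SBAR`.

```python
from itertools import combinations

def with_interactions(connective, syntactic):
    """Return base features plus connective-syntax and syntax-syntax interaction terms."""
    features = [f"connective={connective}"]
    features += [f"{name}={value}" for name, value in syntactic.items()]
    # 1. connective paired with each syntactic feature
    features += [f"connective={connective}-{name}={value}"
                 for name, value in syntactic.items()]
    # 2. every unordered pair of syntactic features (sorted for a stable order)
    features += [f"{a}={syntactic[a]}-{b}={syntactic[b]}"
                 for a, b in combinations(sorted(syntactic), 2)]
    return features

# Illustrative values only; not taken from the PDTB.
example = with_interactions(
    "also",
    {"Self": "RB", "Parent": "S", "LeftSibling": "NONE", "RightSibling": "SBAR"},
)
for feature in example:
    print(feature)
```

With four syntactic features this yields 1 connective feature, 4 syntactic features, 4 connective-syntax terms, and 6 syntax-syntax terms.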
16
Sense classification
A few connectives are quite ambiguous
– “since” can indicate Temporal or Contingency
– Contingency and Temporal are the senses most often annotated together
Each explicit relation is classified into one of the four senses using a Naive Bayes classifier
The connectives most often doubly annotated are
– when
– and
– as
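To illustrate how a Naive Bayes classifier assigns one of the four senses from categorical features, here is a small self-contained sketch with add-one smoothing. It is not the authors' setup: the function names, the feature names, the smoothing choice, and the six training examples are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples, alpha=1.0):
    """examples: list of (feature_dict, sense). Returns counts needed for prediction."""
    label_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)   # (sense, feature name) -> value counts
    vocab = defaultdict(set)              # feature name -> values seen in training
    for feats, label in examples:
        for name, value in feats.items():
            value_counts[(label, name)][value] += 1
            vocab[name].add(value)
    return label_counts, value_counts, vocab, alpha

def predict_nb(model, feats):
    """Return the sense with the highest smoothed log-probability."""
    label_counts, value_counts, vocab, alpha = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)                       # log prior
        for name, value in feats.items():
            count = value_counts[(label, name)][value]
            # add-alpha smoothing, with one extra slot for unseen values
            score += math.log((count + alpha) / (n + alpha * (len(vocab[name]) + 1)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy training set; real training would use PDTB annotations.
toy = [
    ({"connective": "but", "Self": "CC"}, "Comparison"),
    ({"connective": "but", "Self": "CC"}, "Comparison"),
    ({"connective": "and", "Self": "CC"}, "Expansion"),
    ({"connective": "and", "Self": "CC"}, "Expansion"),
    ({"connective": "because", "Self": "IN"}, "Contingency"),
    ({"connective": "after", "Self": "IN"}, "Temporal"),
]
model = train_nb(toy)
print(predict_nb(model, {"connective": "because", "Self": "IN"}))
print(predict_nb(model, {"connective": "but", "Self": "CC"}))
```

Because each feature contributes an independent factor, the connective string dominates when it is unambiguous, and the syntactic features only tip the decision for connectives seen with more than one sense.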
17
Results
Human inter-annotator agreement on the top-level sense class was also 94%, the same level as the classifier
– suggesting further improvements may not be possible
18
Error analysis
Temporal relations are the least frequent of the four senses (19% of the explicit relations)
But more than half of the errors involve the Temporal class
– the most commonly confused pairing was Contingency relations classified as Temporal relations
– making up 29% of errors
19
Conclusion
Using a few syntactic features leads to state-of-the-art accuracy for discourse vs. non-discourse usage classification
Syntactic features also help sense class identification
– results are already at the level of human annotator agreement