Published by Octavia McLaughlin. Modified over 9 years ago.
1
7/2003 EMNLP03
Learning Extraction Patterns for Subjective Expressions
Ellen Riloff, University of Utah
Janyce Wiebe, University of Pittsburgh
2
Subjectivity
Subjective language includes opinions, speculations, and emotions
Distinguishing subjective and objective information could benefit many applications:
–Information extraction (discard subjective information or label it as uncertain)
–Question answering (find answers reflecting different opinions)
–Summarization (summarize various views on a topic)
3
Goals
Sentence-level subjectivity classification
–Wiebe et al. 2001 found that 44% of sentences in news articles are subjective
6
Goals
Sentence-level subjectivity classification
Learning subjectivity clues from unannotated text corpora
Learning linguistically rich patterns (represented as IE extraction patterns)
7
Previous Work in NLP: Subjectivity Analysis in Text
Document-level subjectivity classification (e.g., Turney 2002; Pang et al. 2002; Spertus 1997) and above (Tong 2001)
Genre classification (e.g., Karlgren and Cutting 1994; Kessler et al. 1997; Wiebe et al. 2001)
Supervised sentence-level classification (Wiebe et al. 1999)
Learning adjectives, adjectival phrases, verbs, nouns, and N-grams (e.g., Turney 2002; Hatzivassiloglou & McKeown 1997; Wiebe et al. 2001; Riloff et al. 2003)
8
Recent Related Work
Yu and Hatzivassiloglou (EMNLP03): unsupervised sentence-level classification. Complementary approach and features.
Dave et al. (WWW03): reviews classified as positive or negative.
Agrawal et al. (WWW03): newsgroup authors partitioned into camps based on quotation links.
Gordon et al. (ACL03): manually developed grammars for some types of subjective language.
11
Extraction Patterns
Extraction patterns are lexico-syntactic patterns used to identify relevant information
Typically they represent role relationships surrounding noun and verb phrases
–hijacking of <np>: hijacked vehicle
–<subj> was hijacked: hijacked vehicle
–<subj> hijacked: hijacker
12
Our Method
Subjective expressions represented as extraction patterns
–get to know <dobj>
–<subj> appear to be
–<subj> was satisfied
–<subj> complained
Subtle variations can be significant: “The comedian bombed last night.”
Often higher precision than sub-expressions
More general than fixed n-grams
14
Our Method
Subjective expressions represented as extraction patterns
–get to know <dobj>
–<subj> appear to be
–<subj> was satisfied
–<subj> complained
Supervised extraction pattern learning
Training data generated automatically
Entire process bootstrapped
15
[Diagram: bootstrapping process. Components: Unannotated Text Collection, Known subjective vocabulary, Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier. Data passed between them: unlabeled sentences, subjective sentences, objective sentences, subjective patterns.]
20
[Diagram: bootstrapping process, as on slide 15]
Results for 1 cycle
21
Test Data
Manual annotation for multiple perspective QA (ARDA AQUAINT NRRC) (working on copyright issues to release corpus this summer)
Good agreement on sentence classes used here
–0.77 average pair-wise kappa
–0.89 average pair-wise kappa with borderline sentences removed (11% of the corpus)
Wilson & Wiebe SIGdial 2003 describes the annotation scheme and agreement study
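As an illustration of how an average pair-wise kappa can be computed, here is a minimal sketch. It assumes Cohen's kappa between each pair of annotators; the slide only says "kappa", so the exact statistic is an assumption. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same sentences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of sentences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def average_pairwise_kappa(annotations):
    """Mean kappa over all pairs of annotators (needs at least two)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```

Removing borderline sentences before calling `average_pairwise_kappa` would correspond to the second figure on the slide, under the same assumption about the statistic.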
23
Unannotated Text Collection
English-language versions of FBIS news articles from a variety of countries
Size: 302,160 sentences
27
Known subjective vocabulary
From previous work
–Manually identified (e.g., entries from Levin 1993)
–Automatically identified (e.g., nouns from Riloff et al. 2003, CoNLL03)
Strongly subjective: most instances subjective
Weakly subjective: objective instances also common
Any data used is separate from the data in this paper
29
[Diagram: Unannotated Text Collection and Known subjective vocabulary; Subjective Classifier and Objective Classifier take unlabeled sentences and output subjective sentences and objective sentences]
Subjective Classifier: >1 strongly subjective clue
–91.3% precision, 31.9% recall
–Test set: 2197 sentences, 59% subjective
30
[Diagram: Unannotated Text Collection and Known subjective vocabulary; Subjective Classifier and Objective Classifier take unlabeled sentences and output subjective sentences and objective sentences]
Subjective Classifier: 2+ strongly subjective clues
Objective Classifier: previous, current, and next sentence contain 0 strongly subjective clues and 0 or 1 weakly subjective clue
–82.6% precision, 16.4% recall
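The two rules on this slide can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: clues are treated as single lowercase words, and the objective rule is read as pooling clue counts over the previous, current, and next sentence. The slide does not say how counts are combined across the window, so that reading is an assumption. The function names are hypothetical.

```python
def count_clues(tokens, clue_set):
    """Number of tokens in a sentence that are known clues (single words only)."""
    return sum(1 for tok in tokens if tok.lower() in clue_set)


def classify_sentences(sentences, strong_clues, weak_clues):
    """Label each tokenized sentence 'subjective', 'objective', or None.

    None means neither high-precision rule fired and the sentence stays unlabeled.
    """
    labels = []
    for i, tokens in enumerate(sentences):
        # Subjective rule: 2+ strongly subjective clues in the sentence.
        if count_clues(tokens, strong_clues) >= 2:
            labels.append("subjective")
            continue
        # Objective rule (assumed pooled over previous, current, next sentence):
        # no strongly subjective clue and at most one weakly subjective clue.
        window = sentences[max(0, i - 1): i + 2]
        strong_total = sum(count_clues(s, strong_clues) for s in window)
        weak_total = sum(count_clues(s, weak_clues) for s in window)
        if strong_total == 0 and weak_total <= 1:
            labels.append("objective")
        else:
            labels.append(None)
    return labels
```

Leaving most sentences unlabeled is what trades recall for precision in both rules.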
32
Extraction Pattern Learner: AutoSlog-TS (Riloff 1996)
–From the Subjective Classifier: 17,000 subjective sentences, used as “relevant texts”
–From the Objective Classifier: 17,000 objective sentences, used as “irrelevant texts”
–Output: subjective patterns
34
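Step 1: Apply Syntactic Templates
Template: example pattern
–<subj> active-verb dobj: <subj> dealt blow
–<subj> verb infinitive: <subj> appear to be
–<subj> aux noun: <subj> has position
–active-verb <dobj>: endorsed <dobj>
–verb infinitive <dobj>: get to know <dobj>
–noun prep <np>: opinion on <np>
–infinitive prep <np>: to resort to <np>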
36
Step 1: Apply Syntactic Templates
Template: <subj> active-verb dobj
Pattern: <subj> dealt blow
Matches any sentence with a verb phrase with head=dealt and a direct object with head=blow.
“The experience certainly dealt a stiff blow to his pride.”
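The head-based matching described on this slide can be illustrated with a small sketch. It assumes each sentence has already been parsed into clauses with identified head words; the `Clause` record and the `ActiveVerbDobjPattern` class are hypothetical stand-ins and do not reflect AutoSlog-TS's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """Hypothetical parser output: head words of one clause."""
    verb_head: str
    voice: str  # "active" or "passive"
    subj_head: Optional[str] = None
    dobj_head: Optional[str] = None


@dataclass(frozen=True)
class ActiveVerbDobjPattern:
    """An instance of the 'active-verb dobj' template, e.g. 'dealt blow'."""
    verb_head: str
    dobj_head: str

    def matches(self, clause: Clause) -> bool:
        # Only heads are compared, so modifiers such as "certainly"
        # or "a stiff" do not block the match.
        return (
            clause.voice == "active"
            and clause.verb_head == self.verb_head
            and clause.dobj_head == self.dobj_head
        )


# Hand-written parse of the slide's example sentence (an assumption, not parser output).
example = Clause(verb_head="dealt", voice="active",
                 subj_head="experience", dobj_head="blow")
dealt_blow = ActiveVerbDobjPattern(verb_head="dealt", dobj_head="blow")
```

Matching on heads rather than surface strings is what makes such a pattern more general than a fixed n-gram.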
37
Step 2: Select Patterns
Apply all learned patterns to the training data
Calculate precision and frequency:
precision(pattern) = (# instances in subjective sentences) / (total # instances)
Select patterns based on their frequency and precision on the training data
(No tuning on the test set)
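The precision and frequency computation above can be sketched as follows. This is a minimal sketch that assumes each training sentence is given as the list of pattern instances found in it plus its automatically assigned label; the function names and the pattern strings in the usage note are illustrative only.

```python
from collections import Counter


def pattern_stats(labeled_sentences):
    """Map each pattern to (frequency, precision) on the training data.

    labeled_sentences: iterable of (pattern_instances, is_subjective) pairs.
    """
    total = Counter()
    in_subjective = Counter()
    for pattern_instances, is_subjective in labeled_sentences:
        for pattern in pattern_instances:
            total[pattern] += 1
            if is_subjective:
                in_subjective[pattern] += 1
    # precision(pattern) = # in subjective sentences / total #
    return {p: (total[p], in_subjective[p] / total[p]) for p in total}


def select_patterns(stats, min_freq, min_precision):
    """Keep patterns meeting both the frequency and precision thresholds."""
    return {
        p for p, (freq, precision) in stats.items()
        if freq >= min_freq and precision >= min_precision
    }
```

For example, `select_patterns(stats, min_freq=10, min_precision=1.0)` would correspond to a setting of F >= 10 and P = 100%.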
38
Examples from Training Data
Pattern: %SUBJ
–<subj> was asked: 100%
–<subj> asked: 63%
–<subj> is talk: 100%
–talk of <np>: 90%
–<subj> will talk: 71%
–was expected from <np>: 100%
–<subj> was expected: 42%
–<subj> is fact: 100%
–fact is <dobj>: 100%
43
Evaluation of Learned Patterns
Test data:
–3947 sentences
–54% subjective
Train: F >= 10, P = 100%. Test: P = 85%, Recall = 41%
Train: F >= 2, P >= 60%. Test: P = 71%, Recall = 92%
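One plausible reading of these figures is sentence-level precision and recall, where a test sentence counts as predicted subjective if it contains at least one selected pattern. The sketch below follows that reading, which is an assumption; the slide does not spell out how the measurement was made, and the function names are hypothetical.

```python
def label_with_patterns(sentence_patterns, selected_patterns):
    """Predict True (subjective) for sentences containing any selected pattern."""
    return [
        any(p in selected_patterns for p in patterns)
        for patterns in sentence_patterns
    ]


def precision_recall(predicted, gold):
    """Precision and recall of the 'subjective' class over parallel boolean lists."""
    true_positives = sum(1 for p, g in zip(predicted, gold) if p and g)
    predicted_positives = sum(1 for p in predicted if p)
    gold_positives = sum(1 for g in gold if g)
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / gold_positives if gold_positives else 0.0
    return precision, recall
```

Loosening the training thresholds selects more patterns, which tends to raise recall and lower precision, the trade-off visible between the two rows on the slide.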
47
[Diagram: unlabeled sentences, Known subjective vocabulary, Subjective Classifier, subjective sentences, Extraction Pattern Learner, subjective patterns]
New subjective sentences: 1 old clue + 1 new, or >1 new
Subjective sentences: old + new
Extraction Pattern Learner: F >= 10, P = 100% on training data
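The augmented rule on this slide can be written as a small predicate. This sketch reads "1 old clue + 1 new" as at least one known strongly subjective clue plus at least one learned pattern, and keeps the original 2+ old-clue rule alongside it; both readings are assumptions, and the function name is hypothetical.

```python
def is_subjective_augmented(n_old_clues, n_new_patterns):
    """Decide whether a sentence is labeled subjective after bootstrapping.

    n_old_clues: count of known strongly subjective clues in the sentence.
    n_new_patterns: count of learned subjective pattern instances in the sentence.
    """
    if n_old_clues >= 2:
        return True  # original high-precision rule
    if n_old_clues >= 1 and n_new_patterns >= 1:
        return True  # 1 old clue + 1 new
    return n_new_patterns >= 2  # >1 new
```

A sentence with a single clue of either kind stays unlabeled, which keeps the rule conservative.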
48
Evaluation on Test Data
Original subjective classifier: 32.9% recall, 91.3% precision
Augmented subjective classifier: 40.1% recall, 90.2% precision
49
Future Work
51
[Diagram: Known subjective vocabulary, Objective Classifier, Extraction Pattern Learner, Pattern-Based Objective Classifier; unlabeled sentences, objective sentences]
Improve the original high-precision classifier
Identify new objective sentences during bootstrapping
53
[Diagram: Unannotated Text Collection, unlabeled sentences, Known subjective vocabulary; Subjective Classifier and Objective Classifier, each shown for Iteration 0 and Iteration 1+]
Iteration 0: use corpus-independent subjectivity clues to generate the initial training set
Iteration 1+: use a supervised learning algorithm to tune to the corpus and combine old and new clues effectively
54
Known subjective vocabulary
Build up a subjective lexicon as the process is applied to additional corpora
Once the bootstrapping process terminates, human review of high-precision patterns
–tough act to follow: linguistic subjectivity
–Rush Limbaugh: opinionated source
–police: “lightning rod” topic
55
Conclusions
High-precision subjectivity classification can be used to generate large amounts of labeled training data
Extraction pattern learning techniques can learn linguistically rich subjective patterns
The bootstrapping process results in higher recall with little loss in precision
56
Known subjective vocabulary
Build up a subjective lexicon as the process is applied to new corpora
Richer representation with deeper knowledge (theta roles, polarity, evaluative?, speculative?, tone, ambiguity, …)
Human review of high-precision patterns
–tough act to follow: linguistic subjectivity
–Rush Limbaugh: opinionated source
–police: “lightning rod” topic
58
[Diagram: Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier; unlabeled sentences, subjective sentences, objective sentences, subjective patterns]
Labels on the diagram: 17000; new subjective sentences
59
[Diagram: Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier; unlabeled sentences, subjective sentences, objective sentences, subjective patterns]
Pattern-Based Subjective Classifier: > 0 instances of patterns with F > 4, P = 100 on training data
Labels on the diagram: 17000 subjective sentences; 9500 new
60
[Diagram: Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier; unlabeled sentences, subjective sentences, objective sentences]
Labels on the diagram: 17000; 7500; 9500 new; new subjective patterns
4248 new patterns with P > .59 on training data
308 new patterns with P = 100 on training data
61
[Diagram: Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier; unlabeled sentences, subjective sentences, objective sentences]
Labels on the diagram: 17000; 7500; 9500 new; new subjective patterns
New + old patterns on test set: recall increased more than precision decreased
–+2R, -0.5P to +4R, -2P
62
Example
The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya.
Source annotations: (writer,FM,FM), (writer,FM), (writer,FM,FM,SD), (writer,FM)
64
Annotation Scheme
The annotation scheme was developed as part of a U.S. government-sponsored project (ARDA AQUAINT NRRC) to investigate multiple perspective question answering.
Annotators labeled private state expressions.
Each private state can have low, medium, or high strength.
Our gold standard considers a sentence to be subjective if it contains at least one private state expression of medium or higher strength.
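The gold-standard rule stated above is simple enough to express directly. The sketch below assumes each sentence's annotations are available as a list of strength strings; that representation and the names used are hypothetical, not the corpus format.

```python
# Ordering of the three strength levels named on the slide.
STRENGTH_ORDER = {"low": 0, "medium": 1, "high": 2}


def is_gold_subjective(private_state_strengths, threshold="medium"):
    """True if any private state expression has at least the threshold strength."""
    minimum = STRENGTH_ORDER[threshold]
    return any(STRENGTH_ORDER[s] >= minimum for s in private_state_strengths)
```

Under this rule a sentence whose only private state expressions are low strength counts as objective in the gold standard.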
65
Two Ways of Expressing Private States
Explicit mentions of private states and speech events
–The United States fears a spill-over from the anti-terrorist campaign
Expressive subjective elements
–The part of the US human rights report about China is full of absurdities and fabrications.
66
Nested Sources
“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
–(writer, Xirao-Nima, US)
–(writer, Xirao-Nima)
–(writer)
“The report is full of absurdities,” he continued.
–(writer, Xirao-Nima)
–(writer)
67
OnlyFactive
“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
–(writer): OnlyFactive=yes
–(writer, Xirao-Nima): OnlyFactive=yes
–(writer, Xirao-Nima, US): OnlyFactive=no