1
Subjectivity and Sentiment Analysis Jan Wiebe Department of Computer Science CERATOPS: Center for the Extraction and Summarization of Events and Opinions in Text University of Pittsburgh
2
What is Subjectivity? The linguistic expression of somebody's opinions, sentiments, emotions, evaluations, beliefs, and speculations (private states):
– Wow, this is my 4th Olympus camera
– Staley declared it to be "one wicked song"
– Most voters believe he won't raise their taxes
3
One Motivation Automatic question answering…
4
Fact-Based Question Answering Q: When is the first day of spring in 2007? Q: Does the US have a tax treaty with Cuba?
5
Fact-Based Question Answering Q: When is the first day of spring in 2007? A: March 21 Q: Does the US have a tax treaty with Cuba? A: Thus, the U.S. has no tax treaties with nations like Iraq and Cuba.
6
Opinion Question Answering Q: What is the international reaction to the reelection of Robert Mugabe as President of Zimbabwe?
7
Opinion Question Answering Q: What is the international reaction to the reelection of Robert Mugabe as President of Zimbabwe? A: African observers generally approved of his victory while Western Governments strongly denounced it.
8
More Motivations
– Product review mining: What features of the ThinkPad T43 do customers like and which do they dislike?
– Review classification: Is a review positive or negative toward the movie?
– Information Extraction: Is "killing two birds with one stone" a terrorist event?
– Tracking sentiments toward topics over time: Is anger ratcheting up or cooling down?
– Etcetera!
9
[Diagram: a subjectivity lexicon (entries such as condemn, great, wicked) marks up text spans like "wicked visuals," "loudly condemned," and "The building has been condemned," feeding applications: QA, review mining, and opinion tracking. Successive builds highlight parts 1-4 of the picture, matching the four sections of the talk.]
14
Outline
– Subjectivity annotation scheme (MPQA)
– Learning subjective expressions from unannotated texts
– Contextual polarity of sentiment expressions
– Word sense and subjectivity
– Conclusions and pointers to related work
15
[Roadmap diagram, part 1 highlighted: the subjectivity annotation scheme.]
16
Corpus Annotation
Wiebe, Wilson, Cardie 2005; Wilson & Wiebe 2005; Somasundaran, Wiebe, Hoffmann, Litman 2006; Wilson 2007
17
Outline for Section 1
– Overview
– Frame types
– Nested sources
– Extensions
18
Overview
Fine-grained: expression level rather than sentence or document level.
Annotate:
– expressions of opinions, sentiments, emotions, evaluations, speculations, …
– material attributed to a source, but presented objectively
19
Overview
Opinions, evaluations, emotions, and speculations are private states.
They are expressed in language by subjective expressions.
Private state: a state that is not open to objective observation or verification.
Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.
20
Overview
Focus on three ways private states are expressed in language:
– Direct subjective expressions
– Expressive subjective elements
– Objective speech events
21
Direct Subjective Expressions
Direct mentions of private states:
The United States fears a spill-over from the anti-terrorist campaign.
Private states expressed in speech events:
"We foresaw electoral fraud but not daylight robbery," Tsvangirai said.
22
Expressive Subjective Elements (Banfield 1982)
"We foresaw electoral fraud but not daylight robbery," Tsvangirai said.
The part of the US human rights report about China is full of absurdities and fabrications.
23
Objective Speech Events
Material attributed to a source, but presented as objective fact (without evaluation):
The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.
25
Nested Sources
"The report is full of absurdities," Xirao-Nima said the next day.
– The speech event (the sentence as a whole) is attributed to the writer: (writer)
– The quoted claim is attributed to Xirao-Nima, as reported by the writer: (writer, Xirao-Nima)
30
"The report is full of absurdities," Xirao-Nima said the next day.
Objective speech event
– anchor: the entire sentence
– source: (writer)
– implicit: true
Direct subjective
– anchor: said
– source: (writer, Xirao-Nima)
– intensity: high
– expression intensity: neutral
– attitude type: negative
– target: report
Expressive subjective element
– anchor: full of absurdities
– source: (writer, Xirao-Nima)
– intensity: high
– attitude type: negative
36
"The US fears a spill-over," said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
Nested sources:
– the sentence as a whole: (writer)
– the quoted speech event: (writer, Xirao-Nima)
– the fear itself: (writer, Xirao-Nima, US)
41
"The US fears a spill-over," said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
Objective speech event
– anchor: the entire sentence
– source: (writer)
– implicit: true
Objective speech event
– anchor: said
– source: (writer, Xirao-Nima)
Direct subjective
– anchor: fears
– source: (writer, Xirao-Nima, US)
– intensity: medium
– expression intensity: medium
– attitude type: negative
– target: spill-over
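To make the frame structure concrete, here is a minimal sketch of how such annotations might be represented in code. The class and field names are illustrative assumptions, not the format of the released MPQA corpus.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative container for an MPQA-style private-state frame.
# Fields follow the slides (anchor, nested source chain, intensity, ...);
# this is not the exact layout of the distributed corpus files.
@dataclass
class Frame:
    frame_type: str               # "objective speech event", "direct subjective", ...
    anchor: str                   # text span the frame attaches to
    source: List[str]             # nested source chain, e.g. ["writer", "Xirao-Nima"]
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    attitude_type: Optional[str] = None
    target: Optional[str] = None

frames = [
    Frame("objective speech event", "the entire sentence", ["writer"]),
    Frame("objective speech event", "said", ["writer", "Xirao-Nima"]),
    Frame("direct subjective", "fears", ["writer", "Xirao-Nima", "US"],
          intensity="medium", expression_intensity="medium",
          attitude_type="negative", target="spill-over"),
]
```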
42
Corpus @ www.cs.pitt.edu/mpqa
English-language versions of articles from the world press (187 news sources).
Also includes contextual polarity annotations.
Themes of the instructions:
– No rules about how particular words should be annotated.
– Don't take expressions out of context and think about what they could mean; judge them as they are used in that sentence.
43
Extensions Wilson 2007
44
I think people are happy because Chavez has fallen.
direct subjective
– span: think
– source: (writer, I)
– attitude:
  – span: think
  – type: positive arguing
  – intensity: medium
  – target span: people are happy because Chavez has fallen
direct subjective
– span: are happy
– source: (writer, I, people)
– attitude:
  – span: are happy
  – type: pos sentiment
  – intensity: medium
  – target span: Chavez
– inferred attitude:
  – span: are happy because Chavez has fallen
  – type: neg sentiment
  – intensity: medium
  – target span: Chavez has fallen
45
Outline Subjectivity annotation scheme (MPQA) Learning subjective expressions from unannotated texts Contextual polarity of sentiment expressions Word sense and subjectivity Conclusions and pointers to related work
46
wicked visuals loudly condemned The building has been condemned QA Review Mining Opinion Tracking condemn great wicked <> </><> </> <> </> <> </> 2
47
Outline for Section 2
– Learning subjective nouns with extraction pattern bootstrapping
– Automatically generating training data with high-precision classifiers
– Learning subjective and objective expressions (not simply words or n-grams)
Riloff, Wiebe, Wilson 2003; Riloff & Wiebe 2003; Wiebe & Riloff 2005; Riloff, Patwardhan, Wiebe 2006
48
Outline for Section 2
– Learning subjective nouns with extraction pattern bootstrapping
– Automatically generating training data with high-precision classifiers
– Learning subjective and objective expressions
49
Information Extraction
Information extraction (IE) systems identify facts related to a domain of interest.
Extraction patterns are lexico-syntactic expressions that identify the role of an object. For example:
– <victim> was killed
– assassinated <victim>
– murder of <victim>
50
Learning Subjective Nouns
Goal: learn subjective nouns from unannotated text.
Method: apply IE-based bootstrapping algorithms that were designed to learn semantic categories.
Hypothesis: extraction patterns can identify subjective contexts that co-occur with subjective nouns.
Example: "expressed <dobj>" → concern, hope, support
51
Extraction Examples
expressed → condolences, hope, grief, views, worries
indicative of → compromise, desire, thinking
inject → vitality, hatred
reaffirmed → resolve, position, commitment
voiced → outrage, support, skepticism, opposition, gratitude, indignation
show of → support, strength, goodwill, solidarity
was shared → anxiety, view, niceties, feeling
52
Meta-Bootstrapping (Riloff & Jones 1999)
[Diagram: a loop over unannotated texts. The best extraction pattern (e.g., "expressed") is selected, its extractions (nouns such as hope, grief, joy, concern, worries) are added to the growing category (e.g., happiness, relief, condolences), and the process repeats.]
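A runnable sketch of the bootstrapping idea, with a toy overlap heuristic standing in for Meta-Bootstrapping's actual pattern scoring; all names and the data layout are assumptions for illustration.

```python
def bootstrap(seed_nouns, pattern_extractions, iterations=5):
    """Illustrative bootstrapping loop: grow a set of subjective nouns by
    repeatedly picking the pattern whose extractions best overlap the lexicon.

    pattern_extractions: dict mapping each pattern to the set of nouns it
    extracts from the unannotated corpus (precomputed by an IE system).
    """
    lexicon = set(seed_nouns)
    for _ in range(iterations):
        # Score each pattern by the fraction of its extractions already
        # known to be subjective (a simple stand-in for the real scoring).
        def score(pattern):
            nouns = pattern_extractions[pattern]
            return len(nouns & lexicon) / len(nouns) if nouns else 0.0
        best = max(pattern_extractions, key=score)
        lexicon |= pattern_extractions[best]   # harvest its extractions
    return lexicon

# Toy usage:
pats = {"expressed <dobj>": {"hope", "grief", "concern", "views"},
        "bought <dobj>": {"car", "house", "hope"}}
print(bootstrap({"hope", "concern"}, pats, iterations=2))
```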
53
Subjective Seed Words
cowardice, embarrassment, hatred, outrage, crap, fool, hell, slander, delight, gloom, hypocrisy, sigh, disdain, grievance, love, twit, dismay, happiness, nonsense, virtue
54
Subjective Noun Results
– Bootstrapping corpus: 950 unannotated news articles
– We ran both bootstrapping algorithms for several iterations
– We manually reviewed the words and labeled them as strong, weak, or not subjective
– 1052 subjective nouns were learned (454 strong, 598 weak), included in our subjectivity lexicon @ www.cs.pitt.edu/mpqa
55
Examples of Strong Subjective Nouns
anguish, exploitation, pariah, antagonism, evil, repudiation, apologist, fallacies, revenge, atrocities, genius, rogue, barbarian, goodwill, sanctimonious, belligerence, humiliation, scum, bully, ill-treatment, smokescreen, condemnation, injustice, sympathy, denunciation, innuendo, tyranny, devil, insinuation, venom, diatribe, liar, exaggeration, mockery
56
Examples of Weak Subjective Nouns
aberration, eyebrows, resistant, allusion, failures, risk, apprehensions, inclination, sincerity, assault, intrigue, slump, beneficiary, liability, spirit, benefit, likelihood, success, blood, peaceful, tolerance, controversy, persistent, trick, credence, plague, trust, distortion, pressure, unity, drama, promise, eternity, rejection
57
Outline for Section 2
– Learning subjective nouns with extraction pattern bootstrapping
– Automatically generating training data with high-precision classifiers
– Learning subjective and objective expressions
58
Training Data Creation
[Diagram: subjective clues feed a rule-based subjective sentence classifier and a rule-based objective sentence classifier; applied to unlabeled texts, these produce subjective and objective sentences.]
59
Subjective Clues
Subjectivity lexicon @ www.cs.pitt.edu/mpqa:
– entries from manually developed resources (e.g., General Inquirer, FrameNet)
– entries automatically identified (e.g., the nouns just described)
60
Creating High-Precision Rule-Based Classifiers
GOAL: use subjectivity clues from previous research to build a high-precision (low-recall) rule-based classifier.
– A sentence is subjective if it contains at least 2 strong subjective clues.
– A sentence is objective if:
  – it contains no strong subjective clues,
  – the previous and next sentences combined contain at most 1 strong subjective clue, and
  – the current, previous, and next sentences combined contain at most 2 weak subjective clues.
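A minimal sketch of these rules, assuming each sentence arrives with its strong and weak clue counts already computed by lexicon matching; the data structures are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Sent:
    strong: int  # strong subjective clue instances in the sentence
    weak: int    # weak subjective clue instances in the sentence

def label(sents, i):
    """Apply the high-precision rules to sentence i; return
    'subjective', 'objective', or None (left unlabeled)."""
    cur = sents[i]
    prev = sents[i - 1] if i > 0 else Sent(0, 0)
    nxt = sents[i + 1] if i + 1 < len(sents) else Sent(0, 0)
    if cur.strong >= 2:
        return "subjective"
    if (cur.strong == 0
            and prev.strong + nxt.strong <= 1
            and cur.weak + prev.weak + nxt.weak <= 2):
        return "objective"
    return None  # low recall by design: abstain when unsure
```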
61
Accuracy of Rule-Based Classifiers (measured against MPQA annotations)
Subjective RBC: Recall 34.2, Precision 90.4, F 49.6
Objective RBC: Recall 30.7, Precision 82.4, F 44.7
62
Generated Data
We applied the rule-based classifiers to ~300,000 sentences from (unannotated) news articles:
– ~53,000 were labeled subjective
– ~48,000 were labeled objective
A training set of over 100,000 labeled sentences!
63
Generated Data
The generated data may serve as training data for supervised learning, and as initial data for self-training (Wiebe & Riloff 2005).
The rule-based classifiers are part of OpinionFinder @ www.cs.pitt.edu/mpqa
Here, we use the data to learn new subjective expressions…
64
Outline for Section 2 Learning subjective nouns with extraction pattern bootstrapping Automatically generating training data with high-precision classifiers Learning subjective and objective expressions
65
Representing Subjective Expressions with Extraction Patterns
EPs can represent expressions that are not fixed word sequences:
drove [NP] up the wall
– drove him up the wall
– drove George Bush up the wall
– drove George Herbert Walker Bush up the wall
step on [modifiers] toes
– step on her toes
– step on the mayor's toes
– step on the newly elected mayor's toes
gave [NP] a [modifiers] look
– gave his annoying sister a really really mean look
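To illustrate the idea of a pattern with an open slot, here is a shallow regular-expression approximation; the real learner matches syntactic templates over parsed text (AutoSlog-TS, next slides), not regexes.

```python
import re

# Illustrative only: a shallow regex stand-in for the pattern
# "drove [NP] up the wall", allowing a short noun phrase in the slot.
PATTERN = re.compile(r"\bdrove\s+(?:\w+\s+){1,6}?up the wall", re.IGNORECASE)

for s in ["It drove him up the wall.",
          "It drove George Herbert Walker Bush up the wall.",
          "He drove up the hill."]:
    print(bool(PATTERN.search(s)), "-", s)
# Matches the first two sentences, not the third.
```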
66
The Extraction Pattern Learner
– Used AutoSlog-TS [Riloff 96] to learn EPs
– AutoSlog-TS needs relevant and irrelevant texts as input
– Statistics are generated measuring each pattern's association with the relevant texts
– The subjective sentences are the relevant texts, and the objective sentences are the irrelevant texts
67
Syntactic forms and example patterns:
passive-vp: was satisfied
active-vp: complained
active-vp dobj: dealt blow
active-vp infinitive: appears to be
passive-vp infinitive: was thought to be
auxiliary dobj: has position
active-vp: endorsed
infinitive: to condemn
active-vp infinitive: get to know
passive-vp infinitive: was meant to show
subject auxiliary: fact is
passive-vp prep: opinion on
active-vp prep: agrees with
infinitive prep: was worried about
noun prep: to resort to
68
AutoSlog-TS (Step 1)
[Diagram: relevant and irrelevant texts are parsed and syntactic templates are applied. From "[The World Trade Center], [an icon] of [New York City], was intentionally attacked very early on [September 11, 2001]," the extraction patterns generated include: was attacked; icon of; was attacked on.]
69
AutoSlog-TS (Step 2)
Each pattern's frequency and probability are computed over the relevant and irrelevant texts:
was attacked: Freq 10, Prob .90
icon of: Freq 5, Prob .20
was attacked on: Freq 8, Prob .79
70
Identifying Subjective Patterns
Subjective patterns:
– Freq > X
– Probability > Y
– ~6,400 learned on the 1st iteration
Evaluation against the MPQA corpus:
– direct evaluation of performance as simple classifiers
– evaluation of classifiers using patterns as additional features
– Does the system learn new knowledge? Adding the patterns to the strong subjective set and re-running the rule-based classifier raises recall by ~20 points while precision drops by ~7 points on the 1st iteration.
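A sketch of the pattern-scoring step under the setup above: count how often each pattern fires in subjective vs. all sentences, then threshold on frequency and probability. The function, data layout, and threshold values (stand-ins for X and Y) are illustrative.

```python
from collections import Counter

def subjective_patterns(matches, subjective_ids, min_freq=5, min_prob=0.75):
    """Keep high-frequency, high-probability patterns.

    matches: iterable of (pattern, sentence_id) firing events.
    subjective_ids: set of sentence ids labeled subjective (relevant texts).
    Returns {pattern: P(subjective | pattern)} for patterns passing both cuts.
    """
    freq, subj_freq = Counter(), Counter()
    for pattern, sid in matches:
        freq[pattern] += 1
        if sid in subjective_ids:
            subj_freq[pattern] += 1
    return {p: subj_freq[p] / freq[p]
            for p in freq
            if freq[p] >= min_freq and subj_freq[p] / freq[p] >= min_prob}
```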
71
Patterns with Interesting Behavior (Freq, P(Subj | Pattern)):
asked: 128, .63
was asked: 11, 1.0
was expected: 45, .42
was expected from: 5, 1.0
talk: 28, .71
talk of: 10, .90
is talk: 5, 1.0
put: 187, .67
put end: 10, .90
is fact: 38, 1.0
fact is: 12, 1.0
72
Conclusions for Section 2
– Extraction pattern bootstrapping can learn subjective nouns (unannotated data)
– Extraction patterns can represent richer subjective expressions
– Learning methods can discover subtle distinctions that are important for subjectivity (unannotated data)
Ongoing work:
– lexicon representation integrating different types of entries, enabling comparisons (e.g., subsumption relationships)
– learning and bootstrapping processes applied to much larger unannotated corpora
– processes applied to learning positive and negative subjective expressions
73
Outline
– Subjectivity annotation scheme (MPQA)
– Learning subjective expressions from unannotated texts
– Contextual polarity of sentiment expressions (briefly): Wilson, Wiebe, Hoffmann 2005; Wilson 2007; Wilson & Wiebe forthcoming
– Word sense and subjectivity
– Conclusions and pointers to related work
74
[Roadmap diagram, part 3 highlighted: contextual polarity of sentiment expressions.]
75
Prior Polarity versus Contextual Polarity
Most approaches use a lexicon of positive and negative words.
Prior polarity: out of context, positive or negative
– beautiful: positive
– horrid: negative
A word may appear in a phrase that expresses a different polarity in context (contextual polarity):
"Cheers to Timothy Whitfield for the wonderfully horrid visuals."
76
Goal of This Work Automatically distinguish prior and contextual polarity
77
Approach
Use machine learning and a variety of features.
Achieve significant results for a large subset of sentiment expressions.
[Diagram: two-step pipeline over a corpus and lexicon. Step 1 classifies all instances as neutral or polar; Step 2 classifies the polar instances by contextual polarity.]
78
Manual Annotations Subjective expressions of the MPQA corpus annotated with contextual polarity
79
Annotation Scheme
Mark the polarity of subjective expressions as positive, negative, both, or neutral:
– positive: African observers generally approved of his victory …
– negative: … while Western governments denounced it.
– both: Besides, politicians refer to good and evil …
– neutral: Jerome says the hospital feels no different than a hospital in the states.
80
Annotation Scheme
Judge the contextual polarity of the sentiment ultimately being conveyed:
They have not succeeded, and will never succeed, in breaking the will of this valiant people.
(Despite the negative expressions it contains, the sentence ultimately conveys positive sentiment toward the will of the valiant people.)
84
Features
Many inspired by Polanyi & Zaenen (2004): Contextual Valence Shifters
Example: little threat; little truth
Others capture dependency relationships between words
Example: "wonderfully horrid" (a positive modifier of a negative word)
85
Step 1 features: 1. word, 2. modification, 3. structure, 4. sentence, 5. document.
Word features:
– word token: terrifies
– word part-of-speech: VB
– context: that terrifies me
– prior polarity: negative
– reliability: strong subjective
86
Modification features (binary):
– preceded by an adjective
– preceded by an adverb (other than not)
– preceded by an intensifier
– self intensifier
– modifies a strong clue / a weak clue
– modified by a strong clue / a weak clue
[Dependency parse tree: "The human rights report poses a substantial challenge …" with det, adj, mod, subj, obj relations.]
87
Structure features (binary):
– in subject: [The human rights report] poses
– in copular: I am confident
– in passive voice: must be regarded
[Same dependency parse tree.]
88
Sentence features:
– count of strong clues in the previous, current, and next sentence
– count of weak clues in the previous, current, and next sentence
– counts of various parts of speech
89
Document feature:
Document topic (15 topics), e.g., economics, health, Kyoto protocol, presidential election in Zimbabwe, …
Example: The disease can be contracted if a person is bitten by a certain tick or if a person comes into contact with the blood of a Congo fever sufferer.
90
Step 2 features: word token, word prior polarity, negated, negated subject, modifies polarity, modified by polarity, conjunction polarity, general polarity shifter, negative polarity shifter, positive polarity shifter.
– word token: terrifies
– word prior polarity: negative
91
Negated (binary), for example:
– not good
– does not look very good
– but not in: not only good but amazing
Negated subject: No politically prudent Israeli could support either of them.
92
Modifies polarity (5 values: positive, negative, neutral, both, not mod)
– substantial: negative (it modifies challenge)
Modified by polarity (5 values)
– challenge: positive (it is modified by substantial)
Example: substantial (pos) challenge (neg)
93
Conjunction polarity (5 values: positive, negative, neutral, both, not mod)
– good: negative (conjoined with evil)
Example: good (pos) and evil (neg)
94
General polarity shifter: have few risks/rewards
Negative polarity shifter: lack of understanding
Positive polarity shifter: abate the damage
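A minimal sketch of a few of these Step-2 features computed over a flat token window; the word lists are tiny illustrative stand-ins for the real lexicon and shifter lists, and the real modification features come from a dependency parse rather than a window.

```python
PRIOR = {"good": "positive", "evil": "negative", "horrid": "negative"}
NEGATIONS = {"not", "no", "never"}
GENERAL_SHIFTERS = {"few", "little", "lack"}   # illustrative shifter list

def polarity_features(tokens, i):
    """Compute a few contextual-polarity features for the word at index i,
    using preceding tokens as a crude stand-in for dependency structure."""
    window = tokens[max(0, i - 4):i]
    return {
        "word_token": tokens[i],
        "prior_polarity": PRIOR.get(tokens[i], "neutral"),
        "negated": any(t in NEGATIONS for t in window),
        "general_shifter": any(t in GENERAL_SHIFTERS for t in window),
    }

print(polarity_features("they have few risks".split(), 3))
# {'word_token': 'risks', 'prior_polarity': 'neutral',
#  'negated': False, 'general_shifter': True}
```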
95
Findings
– Statistically significant improvements can be gained
– Requires combining all feature types
Ongoing work:
– richer lexicon entries
– compositional contextual processing
96
S/O and Pos/Neg: both Important
Several studies have found two steps beneficial:
Yu & Hatzivassiloglou 2003; Pang & Lee 2004; Wilson et al. 2005; Kim & Hovy 2006
[Diagram: Subjective (positive, negative, both, neutral) vs. Objective.]
97
S/O and Pos/Neg: both Important
S/O can be more difficult:
– manual annotation of phrases (Takamura et al. 2006)
– contextual polarity (Wilson et al. 2005)
– sentiment tagging of words (Andreevskaia & Bergler 2006)
– sentiment tagging of word senses (Esuli & Sebastiani 2006)
– later: evidence that S/O is more significant for senses
98
S/O and Pos/Neg: both Important Desirable for NLP systems to find a wide range of private states, including motivations, thoughts, and speculations, not just positive and negative sentiments
99
S/O and Pos/Neg: both Important
[Diagram: subjectivity encompasses sentiment (pos, neg, both) and other private states, contrasted with objective; contextual polarity values are pos, neg, neu, both.]
100
Outline
– Subjectivity annotation scheme (MPQA)
– Learning subjective expressions from unannotated texts
– Contextual polarity of sentiment expressions
– Word sense and subjectivity: Wiebe & Mihalcea 2006
– Conclusions and pointers to related work
101
[Roadmap diagram, part 4 highlighted: word sense and subjectivity. The lexicon entry condemn is now split into senses condemn#1, condemn#2, condemn#3.]
102
Introduction
Continuing interest in word sense:
– sense-annotated resources are being developed for many languages (www.globalwordnet.org)
– active participation in evaluations such as SENSEVAL
103
Word Sense and Subjectivity Though both are concerned with text meaning, until recently they have been investigated independently
104
Subjectivity Labels on Senses
S: alarm, dismay, consternation (fear resulting from the awareness of danger)
O: alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event)
105
Subjectivity Labels on Senses
S: interest, involvement (a sense of concern with and curiosity about someone or something; "an interest in music")
O: interest (a fixed charge for borrowing money; usually a percentage of the amount borrowed; "how much interest do you pay on your mortgage?")
106
WSD using Subjectivity Tagging
– The notes do not pay interest. → Sense 1 (O): "a fixed charge for borrowing money"
– He spins a riveting plot which grabs and holds the reader's interest. → Sense 4 (S): "a sense of concern with and curiosity about someone or something"
A WSD system alone may confuse Sense 1 and Sense 4; a subjectivity classifier tags the first sentence O and the second S, steering the WSD system toward the right sense.
109
Subjectivity Tagging using WSD
Conversely, a subjectivity classifier may mislabel these sentences; running a WSD system first (Sense 1 = O, Sense 4 = S) can correct the subjectivity tags:
– The notes do not pay interest. → Sense 1 → O
– He spins a riveting plot which grabs and holds the reader's interest. → Sense 4 → S
112
Refining WordNet
Semantic richness: find inconsistencies and gaps
– Verb assault: attack, round, assail, lash out, snipe, assault (attack in speech or writing) "The editors of the left-leaning paper attacked the new House Speaker"
– But no such sense for the noun, as in "His verbal assault was vicious"
113
Goals
Explore interactions between word sense and subjectivity:
– Can subjectivity labels be assigned to word senses? (manually; automatically)
– Can subjectivity analysis improve word sense disambiguation?
– Can word sense disambiguation improve subjectivity analysis? (current work)
114
Outline for Section 4
– Motivation and goals
– Assigning subjectivity labels to word senses (manually; automatically)
– Word sense disambiguation using automatic subjectivity analysis
– Conclusions
115
Annotation Scheme Assigning subjectivity labels to WordNet senses –S: subjective –O: objective –B: both
116
Examples
S: alarm, dismay, consternation (fear resulting from the awareness of danger)
– fear, fearfulness, fright (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight))
O: alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event)
– device (an instrumentality invented for a particular purpose; "the device is small enough to wear on your wrist"; "a device intended to conserve water")
117
Subjective Sense Definition When the sense is used in a text or conversation, we expect it to express subjectivity, and we expect the phrase/sentence containing it to be subjective.
118
Subjective Sense Examples
His alarm grew: alarm, dismay, consternation (fear resulting from the awareness of danger)
– fear, fearfulness, fright (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight))
He was boiling with anger: seethe, boil (be in an agitated emotional state; "The customer was seething with anger")
– be (have the quality of being; (copula, used with an adjective or a predicate noun); "John is rich"; "This is not a good answer")
119
Subjective Sense Examples
What's the catch?: catch (a hidden drawback; "it sounds good but what's the catch?")
– drawback (the quality of being a hindrance; "he pointed out all the drawbacks to my plan")
That doctor is a quack.: quack (an untrained person who pretends to be a physician and who dispenses medical advice)
– doctor, doc, physician, MD, Dr., medico
120
Objective Sense Examples
The alarm went off: alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event)
– device (an instrumentality invented for a particular purpose; "the device is small enough to wear on your wrist"; "a device intended to conserve water")
The water boiled: boil (come to the boiling point and change from a liquid to vapor; "Water boils at 100 degrees Celsius")
– change state, turn (undergo a transformation or a change of position or action)
121
Objective Sense Examples
He sold his catch at the market: catch, haul (the quantity that was caught; "the catch was only 10 fish")
– indefinite quantity (an estimated quantity)
The duck's quack was loud and brief: quack (the harsh sound of a duck)
– sound (the sudden occurrence of an audible event)
122
Objective Senses: Observation
We don't necessarily expect phrases/sentences containing objective senses to be objective:
– Will someone shut that darn alarm off?
– Can't you even boil water?
Subjective, but not due to alarm and boil.
123
Objective Sense Definition When the sense is used in a text or conversation, we don’t expect it to express subjectivity and, if the phrase/sentence containing it is subjective, the subjectivity is due to something else.
124
Inter-Annotator Agreement Results
Overall: Kappa = 0.74; percent agreement = 85.5%
Without the 12.3% of cases where a judge is uncertain: Kappa = 0.90; percent agreement = 95.0%
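For reference, the kappa statistic reported here corrects observed agreement for chance agreement; the standard definition (not spelled out on the slide) is:

```latex
% Cohen's kappa: observed agreement P_o corrected for chance agreement P_e
\kappa = \frac{P_o - P_e}{1 - P_e}
```

where P_o is the observed percent agreement and P_e is the agreement expected by chance given each annotator's label distribution.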
125
S/O and Pos/Neg
Hypothesis: moving from words to word senses is more important for S vs. O than for positive vs. negative.
Pilot study with the nouns of the SENSEVAL-3 English lexical sample task:
– ½ have both subjective and objective senses
– only 1 has both positive and negative subjective senses
126
Outline for Section 4
– Motivation and goals
– Assigning subjectivity labels to word senses (manually; automatically)
– Word sense disambiguation using automatic subjectivity analysis
– Conclusions
127
Overview
Main idea: assess the subjectivity of a word sense based on information about the subjectivity of a set of distributionally similar words, in a corpus annotated with subjective expressions.
128
Preliminaries
Suppose the goal were to assess the subjectivity of a word w, given an annotated corpus.
We could consider how often w appears in subjective expressions.
Or, we could take into account more evidence: the subjectivity of a set of distributionally similar words.
129
Lin's Distributional Similarity (Lin 1998)
[Diagram: dependency triples for "I have a brown dog": each word participates in relations R, e.g., (I, R1, have), (have, R2, dog), (brown, R3, dog), …; a word is represented by its set of (relation, word) features.]
130
Lin's Distributional Similarity
[Diagram: Word1 and Word2 are compared via their shared (R, W) features.]
131
Subjectivity of word w
Given an unannotated corpus (BNC), find the distributionally similar words DSW = {dsw_1, …, dsw_j}; then, in the annotated corpus (MPQA):

subj(w) = (#insts(DSW) in subjective expressions - #insts(DSW) not in subjective expressions) / #insts(DSW)

subj(w) ranges over [-1, 1], from highly objective to highly subjective.
133
Subjectivity of word w: example
DSW = {dsw_1, dsw_2}, with instances in the annotated corpus (MPQA): dsw_1 inst 1 (+1, in a subjective expression), dsw_1 inst 2 (-1, not), dsw_2 inst 1 (+1, in).
subj(w) = (+1 - 1 + 1) / 3 = 1/3
134
Subjectivity of word sense w_i
Rather than ±1, add or subtract sim(w_i, dsw_j). In the annotated corpus (MPQA): dsw_1 inst 1 (+sim(w_i, dsw_1)), dsw_1 inst 2 (-sim(w_i, dsw_1)), dsw_2 inst 1 (+sim(w_i, dsw_2)):

subj(w_i) = (sim(w_i, dsw_1) - sim(w_i, dsw_1) + sim(w_i, dsw_2)) / (2·sim(w_i, dsw_1) + sim(w_i, dsw_2))
135
Method, Step 1
Given word w, find its distributionally similar words: DSW = {dsw_j | j = 1..n}
136
Method, Step 2 (example)
word w = alarm; DSW_1 = panic, DSW_2 = detector
Sense w_1, "fear resulting from the awareness of danger": sim(w_1, panic), sim(w_1, detector)
Sense w_2, "a device that signals the occurrence of some undesirable event": sim(w_2, panic), sim(w_2, detector)
137
Method, Step 2
Find the similarity between each word sense and each distributionally similar word. Here wnss can be any concept-based similarity measure between word senses; we use Jiang & Conrath 1997.
138
Method, Step 3
Input: word sense w_i of word w; DSW = {dsw_j | j = 1..n}; sim(w_i, dsw_j); the MPQA Opinion Corpus
Output: subjectivity score subj(w_i)
139
Method, Step 3

totalsim = Σ_j ( #insts(dsw_j) * sim(w_i, dsw_j) )
evi_subj = 0
for each dsw_j in DSW:
    for each instance k in insts(dsw_j):
        if k is in a subjective expression:
            evi_subj += sim(w_i, dsw_j)
        else:
            evi_subj -= sim(w_i, dsw_j)
subj(w_i) = evi_subj / totalsim
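The same computation as a runnable sketch; the data layout (a mapping from similar words to similarity scores and per-instance subjectivity flags) is an assumption for illustration.

```python
def sense_subjectivity(sims, instances):
    """Compute subj(w_i) for one word sense.

    sims: dict mapping each distributionally similar word dsw_j to
          sim(w_i, dsw_j), a concept-based similarity score.
    instances: dict mapping dsw_j to a list of booleans, one per corpus
          instance: True if the instance is inside a subjective expression.
    """
    total_sim = sum(sims[d] * len(insts) for d, insts in instances.items())
    evi_subj = 0.0
    for d, insts in instances.items():
        for in_subj_expr in insts:
            evi_subj += sims[d] if in_subj_expr else -sims[d]
    return evi_subj / total_sim

# Toy usage mirroring the slide's earlier example (all sims = 1):
print(sense_subjectivity({"dsw1": 1.0, "dsw2": 1.0},
                         {"dsw1": [True, False], "dsw2": [True]}))  # 1/3
```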
140
Evaluation
– Calculate subj scores for each word sense, and sort them.
– While 0 is a natural candidate for the division between S and O, we perform the evaluation for different thresholds in [-1, +1].
– Determine the precision of the algorithm at different points of recall.
141
Evaluation: precision/recall curves
[Figure: precision/recall curves; number of distributionally similar words = 160.]
142
Outline for Section 4
– Motivation and goals
– Assigning subjectivity labels to word senses (manually; automatically)
– Word sense disambiguation using automatic subjectivity analysis
– Conclusions
143
Overview
– Augment an existing data-driven WSD system with a feature reflecting the subjectivity of the context of the ambiguous word.
– Compare the performance of the original and subjectivity-aware WSD systems.
– Data: the ambiguous nouns of the SENSEVAL-3 English lexical sample task.
144
Original WSD System
Integrates local and topical features:
– local: context of three words to the left and right, and their parts of speech
– topical: top five words occurring at least three times in the context of a word sense [Ng & Lee, 1996], [Mihalcea, 2002]
Naïve Bayes classifier [Lee & Ng, 2003]
145
Subjectivity Classifier Rule-based automatic sentence classifier from Wiebe & Riloff 2005 Harvests subjective and objective sentences it is certain about from unannotated data Part of OpinionFinder @ www.cs.pitt.edu/mpqa/
146
Subjectivity Tagging for WSD
Sentences of the SENSEVAL-3 data that contain a target noun (e.g., "interest", "atmosphere") are tagged with S, O, or B by the subjectivity classifier; the tags are fed as input to the subjectivity-aware WSD system.
147
WSD using Subjectivity Tagging
[Diagram: for a sentence containing "interest", the subjectivity classifier outputs S, O, or B; the original WSD system chooses between Sense 1 ("a fixed charge for borrowing money") and Sense 4 ("a sense of concern with and curiosity about someone or something"), while the subjectivity-aware WSD system also sees the subjectivity tag.]
148
WSD using Subjectivity Tagging Hypothesis: instances of subjective senses are more likely to be in subjective sentences, so sentence subjectivity is an informative feature for WSD of words with both subjective and objective senses
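A minimal sketch of how the subjectivity tag can be added as one more feature alongside local-context features in a Naive Bayes WSD learner; the featurization and smoothing details are illustrative, not the exact system on the slides.

```python
from collections import Counter, defaultdict
import math

def featurize(tokens, target_index, subj_tag):
    """Local-context features plus the sentence-level subjectivity tag."""
    feats = {f"ctx_{t}" for t in tokens[max(0, target_index - 3):target_index + 4]}
    feats.add(f"subj_{subj_tag}")   # S, O, or B from the subjectivity classifier
    return feats

def train(examples):
    """examples: list of (feature_set, sense). Returns NB counts."""
    counts, priors = defaultdict(Counter), Counter()
    for feats, sense in examples:
        priors[sense] += 1
        for f in feats:
            counts[sense][f] += 1
    return priors, counts

def predict(priors, counts, feats):
    # Add-one smoothed log-likelihood; argmax over senses.
    def logp(sense):
        total = sum(counts[sense].values()) + len(feats)
        return (math.log(priors[sense]) +
                sum(math.log((counts[sense][f] + 1) / total) for f in feats))
    return max(priors, key=logp)
```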
149
Words with S and O Senses
4.3% error reduction; significant (p < 0.05, paired t-test)
[Per-word results table omitted; for one word the S sense did not appear in the data.]
150
Words with Only O Senses
Overall 2.2% error reduction; significant (p < 0.1)
[Per-word results table omitted: performance mostly stays the same (=), occasionally improves (>) or degrades (<); annotated "often target".]
151
Conclusions for Section 4
Can subjectivity labels be assigned to word senses?
– Manually: good agreement (Kappa = 0.74); very good when uncertain cases are removed (Kappa = 0.90)
– Automatically: the method substantially outperforms the baseline
Showed the feasibility of assigning subjectivity labels at the fine-grained level of word senses.
152
Conclusions for Section 4
Can subjectivity analysis improve word sense disambiguation?
– The quality of a WSD system can be improved with subjectivity information.
– It improves performance, but mainly for words with both S and O senses (4.3% error reduction; significant, p < 0.05).
– Performance largely remains the same or degrades for words that don't.
– Once senses have been assigned subjectivity labels, a WSD system could consult them to decide whether to use the subjectivity feature.
153
Pointers To Related Work
Tutorial held at EUROLAN 2007, "Semantics, Opinion, and Sentiment in Text", Iaşi, Romania, August.
Slides, bibliographies, …: www.cs.pitt.edu/~wiebe/EUROLAN07
154
Conclusions
– Subjectivity is common in language
– Recognizing it is useful in many NLP tasks
– It comes in many forms and often is context-dependent
– A wide variety of features seem to be necessary for opinion and polarity recognition
– Subjectivity may be assigned to word senses, promising improved performance for both subjectivity analysis and WSD; promising as well for multi-lingual subjectivity analysis (Mihalcea, Banea, Wiebe 2007)
155
Acknowledgements
CERATOPS: Center for the Extraction and Summarization of Events and Opinions in Text
– Pittsburgh: Paul Hoffmann, Josef Ruppenhofer, Swapna Somasundaran, Theresa Wilson
– Cornell: Claire Cardie, Eric Breck, Yejin Choi, Ves Stoyanov
– Utah: Ellen Riloff, Sidd Patwardhan, Bill Phillips
UNT: Rada Mihalcea, Carmen Banea
NLP@Pitt: Wendy Chapman, Rebecca Hwa, Diane Litman, …