1
Just how mad are you? Finding strong and weak opinion clauses
Theresa Wilson, Janyce Wiebe, Rebecca Hwa
University of Pittsburgh
AAAI-2004
2
AAAI 2004, 7/27/2004
Problem and Motivation
Problem: Opinion Extraction
  Automatically identify and extract attitudes, opinions, sentiments in text
Applications
  Information Extraction, Summarization, Question Answering, Flame Detection, etc.
Focus: Individual clauses and strength
3
Motivating Example
“I think people are happy because Chavez has fallen. But there’s also a feeling of uncertainty about how the country’s obvious problems are going to be solved,” said Ms. Ledesma.
4
Motivating Example
Though some of them did not conceal their criticisms of Hugo Chavez, the member countries of the Organization of American States condemned the coup and recognized the legitimacy of the elected president.
[Strength labels marked on the sentence, in slide order: low strength, high strength, medium strength]
5
Our Goal
Identify opinions below sentence level
Characterize strength of opinions
6
Our Approach
Identify embedded sentential clauses
  Dependency tree representation
Supervised learning to classify strength of clauses
  Strength scale: neutral (NO OPINION), low, medium, high (VERY STRONG)
Significant improvements over baseline
  Mean-squared error < 1.0
7
Our Approach
I am furious that my landlord refused to return my security deposit until I sued them.
Opinionated Sentence (Riloff et al. (2003), Riloff and Wiebe (2003))
[Figure: dependency tree of the sentence, with clauses labeled High Strength, Medium Strength, and Neutral]
8
Outline
Introduction
Opinions and Emotions in Text
Clues and Features
  Subjectivity Clues
  Organizing Clues into Features
Experiments
  Strength Classification
  Results
Conclusions
9
Private States and Subjective Expressions
Private state: covering term for opinions, emotions, sentiments, attitudes, speculations, etc. (Quirk et al., 1985)
Subjective Expressions: words and phrases that express private states (Banfield, 1982)
  “The US fears a spill-over,” said Xirao-Nima.
  “The report is full of absurdities,” he complained.
10
Corpus of Opinion Annotations
Multi-perspective Question Answering (MPQA) Corpus
  Sponsored by NRRC ARDA
  Released November, 2003
  http://nrrc.mitre.org/NRRC/publications.htm
  Freely available
Detailed expression-level annotations of private states: strength
  See Wilson and Wiebe (SIGdial 2003)
11
Outline
Introduction
Opinions and Emotions in Text
Clues and Features
  Subjectivity Clues
  Organizing Clues into Features
Experiments
  Strength Classification
  Results
Conclusions
12
Clues from Previous Work
29 sets of clues
  Culled from manually developed resources
  Learned from annotated/unannotated data
  Words, phrases, extraction patterns
Examples
  SINGLE WORDS: bizarre, hate, concern, applaud, foolish, vexing
  PHRASES: long for, stir up, grin and bear it, on the other hand
  EXTRACTION PATTERNS: expressed (condolences|hope|*), show of (support|goodwill|*)
13
Syntax Clues: Generation
Training data: I think people are happy because Chavez has fallen
Steps: Parse, then Convert to dependency
[Figure: constituency parse (S, NP, VP, SBAR over PRP, VBP, NNS, VBP, JJ) converted to a dependency tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ, edge labels subj, obj, pred, and heads and modifiers marked]
14
Syntax Clues: Generation
5 Classes of Clues
  1. root
  2. leaf
  3. node
  4. all-kids
  5. bilex
Example: bilex(are,VBP,pred,happy,JJ)
Example: allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)
[Figure: Dependency Parse Tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ and edge labels subj, obj, pred, mod]
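The two examples on this slide can be reproduced with a small sketch. It is only an illustration under stated assumptions: the `Node` class and the `syntax_clues` function are hypothetical names, and the reading of the five classes (root for the tree root, leaf for a childless node, node for any node, bilex for one head-modifier edge, allkids for a head with all of its children) is inferred from the class names and the two examples, not taken from the authors' code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word in a dependency tree (hypothetical helper type)."""
    word: str
    tag: str
    rel: str = ""                      # relation to the parent; empty for the root
    kids: list = field(default_factory=list)


def syntax_clues(node, is_root=True):
    """Collect root/leaf/node/bilex/allkids clues from the subtree under node."""
    found = [("node", node.word, node.tag)]
    if is_root:
        found.append(("root", node.word, node.tag))
    if not node.kids:
        found.append(("leaf", node.word, node.tag))
    else:
        allkids = ("allkids", node.word, node.tag)
        for kid in node.kids:
            # bilex: one head word together with one of its modifiers
            found.append(("bilex", node.word, node.tag, kid.rel, kid.word, kid.tag))
            allkids += (kid.rel, kid.word, kid.tag)
        found.append(allkids)
    for kid in node.kids:
        found.extend(syntax_clues(kid, is_root=False))
    return found


# Subtree for "Chavez has fallen", using the edges shown in the allkids example.
fallen = Node("fallen", "VBN", kids=[
    Node("Chavez", "NNP", "subj"),
    Node("has", "VBZ", "mod"),
])

# "fallen" is embedded in the full sentence, so it is not treated as the root here.
clues = syntax_clues(fallen, is_root=False)
for clue in clues:
    print(clue)
```

The tuple for the all-kids clue lists the children in tree order, which matches the argument order of the slide's allkids example.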
15
Syntax Clues: Selection
1. ≥ 70% instances in subjective expressions in training data? NO: Discard. YES: go to 2.
2. Frequency ≥ 5? YES: Highly Reliable. NO: go to 3.
3. Any instances in AUTOGEN Corpus? NO: Not Very Reliable. YES: go to 4.
4. ≥ 80% instances in subjective sentences? YES: Somewhat Reliable. NO: Discard.
Parameters chosen on tuning set
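The selection flowchart can be sketched as a single function. This is one reading of the flattened chart, not the authors' implementation: the branch order (70% test first, then frequency, then the AUTOGEN checks) is an assumption, and the function name and parameter names are hypothetical. The thresholds 70%, 5, and 80% are the ones shown on the slide.

```python
def reliability(freq, frac_subj_expr, autogen_count, frac_subj_sent,
                min_frac=0.70, min_freq=5, min_autogen_frac=0.80):
    """Return a reliability level for a candidate syntax clue, or None to discard.

    freq            -- occurrences of the clue in the training data
    frac_subj_expr  -- fraction of those inside subjective expressions
    autogen_count   -- occurrences of the clue in the AUTOGEN corpus
    frac_subj_sent  -- fraction of AUTOGEN occurrences in subjective sentences
    """
    if frac_subj_expr < min_frac:
        return None                      # not subjective enough in training data
    if freq >= min_freq:
        return "highly reliable"
    if autogen_count == 0:
        return "not very reliable"       # rare, and no extra evidence available
    if frac_subj_sent >= min_autogen_frac:
        return "somewhat reliable"
    return None                          # rare, and AUTOGEN evidence is weak


# Toy inputs, made up for illustration only.
print(reliability(freq=12, frac_subj_expr=0.90, autogen_count=0, frac_subj_sent=0.0))
print(reliability(freq=2, frac_subj_expr=0.75, autogen_count=40, frac_subj_sent=0.85))
```

The defaults are keyword arguments because the slide notes that the parameters were chosen on a tuning set, so other values are plausible for other data.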
16
Syntax Clues
15 sets of clues
  5 classes: root, leaf, node, bilex, allkids
  3 reliability levels: highly reliable, somewhat reliable, not very reliable
17
Organizing Clues into Features
SET1 = {believe, happy, sad, think, … }
SET2 = {although, because, however, …}
…
SET44 = {certainly, unlikely, maybe, …}
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm:
      believe  happy  sad  think  although  because  …
  S1  0        1      0    1      0         1
      SET1  SET2  …  SET44
  S1  2     1     …  0
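The set-level counts for S1 can be reproduced with a short sketch. It assumes simple lowercase whitespace tokenization and single-word clues only; phrases and extraction patterns would need a real matcher. The function name `set_counts` is hypothetical, and the sets contain only the members listed on the slide (the elided members are left out).

```python
def set_counts(sentence, clue_sets):
    """Count, for each clue set, how many tokens of the sentence belong to it."""
    tokens = sentence.lower().split()
    return {name: sum(tok in members for tok in tokens)
            for name, members in clue_sets.items()}


# Only the members shown on the slide; the "…" entries are not filled in.
clue_sets = {
    "SET1": {"believe", "happy", "sad", "think"},
    "SET2": {"although", "because", "however"},
    "SET44": {"certainly", "unlikely", "maybe"},
}

s1 = "I think people are happy because Chavez has fallen"
features = set_counts(s1, clue_sets)
print(features)
```

Collapsing individual clues into one count per set keeps the feature vector at 44 values instead of one value per clue.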
18
Organizing Clues by Strength
Clue sets organized by strength (slide label: Training Data)
  NEUTRAL_SET = {however, … }
  LOW_SET = {because, maybe, think, unlikely, …}
  MEDIUM_SET = {believe, certainly, happy, sad, … }
  HIGH_SET = {condemn, hate, tremendous, …}
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm:
      NEUTRAL_SET  LOW_SET  MEDIUM_SET  HIGH_SET
  S1  0            1        2           0
19
Clues and Features: Summary
Many Types/Sets of Subjectivity Clues
  29 from previous work
  15 new syntax clues
TYPE: features correspond to type sets
  44 features
STRENGTH: features correspond to strength sets
  4 features (neutral, low, medium, high)
20
Outline
Introduction
Opinions and Emotions in Text
Clues and Features
  Subjectivity Clues
  Organizing Clues into Features
Experiments
  Strength Classification
  Results
Conclusions
21
Approaches to Strength Classification
Target Classes: neutral, low, medium, high
Classification: Boosting
  BoosTexter (Schapire and Singer, 2000)
  AdaBoost.MH
  1000 rounds of boosting
Regression: Support Vector Regression
  SVMlight (Joachims, 1999)
  Discretize output into ordinal strength classes
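The discretization step for the regression output might look like the sketch below. The slide does not say how the real-valued output is mapped to classes, so the choices here are assumptions: the classes are coded 0 to 3, the value is rounded to the nearest integer (halves round up), and anything outside the range is clipped. The function name `discretize` is hypothetical.

```python
import math

CLASSES = ["neutral", "low", "medium", "high"]   # assumed ordinal coding 0..3


def discretize(value):
    """Map a real-valued regression output to the nearest ordinal strength class."""
    index = math.floor(value + 0.5)              # round half up
    index = max(0, min(len(CLASSES) - 1, index)) # clip to the valid class range
    return CLASSES[index]


for raw in (-0.4, 1.2, 2.5, 7.0):
    print(raw, discretize(raw))
```

Clipping matters because a regressor can produce values below 0 or above 3, which would otherwise fall outside the four target classes.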
22
Approaches to Strength Classification: Evaluation
Target Classes: neutral, low, medium, high
Classification: Accuracy = total correct / N
Regression: Mean-Squared Error = (1/N) Σ_i (t_i − p_i)^2, where t_i is the true strength and p_i the predicted strength of instance i
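Both evaluation measures are easy to compute once the classes are given numeric codes. The sketch below assumes the coding neutral=0, low=1, medium=2, high=3, which the slide does not state; the function names and the toy labels are made up for illustration.

```python
ORDINAL = {"neutral": 0, "low": 1, "medium": 2, "high": 3}  # assumed coding


def accuracy(gold, predicted):
    """Fraction of instances whose predicted class equals the gold class."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def mean_squared_error(gold, predicted):
    """Average squared distance between gold and predicted class codes."""
    return sum((ORDINAL[g] - ORDINAL[p]) ** 2
               for g, p in zip(gold, predicted)) / len(gold)


# Toy labels, not taken from the experiments.
gold = ["neutral", "low", "medium", "high"]
pred = ["neutral", "medium", "medium", "neutral"]
print(accuracy(gold, pred))
print(mean_squared_error(gold, pred))
```

Accuracy treats every mistake the same, while mean-squared error penalizes predicting neutral for a high-strength clause much more than predicting medium for a low-strength one.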
23
Units of Classification
[Figure: dependency tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ, with clauses marked Level 1, Level 2, and Level 3, and Train/Test combinations of levels]
24
Gold-standard Classes
[Figure: dependency tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ. Level 1: medium. Level 2: medium. Level 3: neutral. The figure also carries the labels medium and low.]
25
Overview of Results
10-fold cross validation over 9313 sentences
Baseline: most frequent class
Bag-of-words (BAG)
Best Results: All Clues + Bag-of-words
  Boosting
    MSE: 48% to 60% improvement over baseline
    Accuracy: 23% to 79% improvement over baseline
  Support Vector Regression
    MSE: 57% to 64% improvement over baseline
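A most-frequent-class baseline and a percentage improvement over it can be sketched as follows. The slide does not say how the improvement percentages were computed, so the relative-error-reduction formula here is an assumption, and the function names and toy numbers are made up for illustration.

```python
from collections import Counter


def most_frequent_class(train_labels):
    """Baseline: always predict the class seen most often in training."""
    return Counter(train_labels).most_common(1)[0][0]


def relative_mse_improvement(baseline_mse, model_mse):
    """Assumed definition: fraction of the baseline error that the model removes."""
    return (baseline_mse - model_mse) / baseline_mse


# Toy values, not the reported results.
train = ["neutral", "low", "neutral", "high", "neutral", "medium"]
print(most_frequent_class(train))
print(relative_mse_improvement(2.0, 1.0))
```

Under this definition a model that halves the baseline MSE shows a 50% improvement.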
26
Results: Mean-Squared Error
[Chart: MSE by Clause Level for SVM and Boosting; series: BASELINE, BAG (Bag-of-words), TYPE Features, STRENGTH Features, STRENGTH + BAG]
BASELINE: 1.9 to 2.5
Improvements over BASELINE
  SVM: 57% to 64%
  Boosting: 48% to 60%
27
Results: Accuracy
[Chart: accuracy by Clause Level for SVM and Boosting; series: BASELINE, BAG (Bag-of-words), TYPE Features, STRENGTH Features, STRENGTH + BAG]
BASELINE: 30.8 to 48.3
Improvements over BASELINE
  SVM: 57% (clause level 1)
  Boosting: 23% to 79%
28
Removing Syntax Clues: MSE
[Chart: MSE by Clause Level for SVM and Boosting; series: All Clues, MINUS Syntax Clues]
29
Removing Syntax Clues: Accuracy
[Chart: % Accuracy by Clause Level for SVM and Boosting; series: All Clues, MINUS Syntax Clues]
30
Related Work
Types of Attitude
  Gordon et al. (2003), Liu et al. (2003)
Tracking sentiment timelines
  Tong (2001)
Positive/Negative Language
  Pang et al. (2002), Morinaga et al. (2002), Turney and Littman (2003), Yu and Hatzivassiloglou (2003), Dave et al. (2003), Nasukawa and Yi (2003), Hu and Liu (2004)
Public sentiment in message boards and stock prices
  Das and Chen (2001)
31
Conclusions
Promising results
  MSE under 0.80 for sentences
  MSE near 1 for embedded clauses
Embedded clauses more difficult
  less information
Wide range of features produces best results
  syntax clues
Organizing features by strength is useful
32
Thank you!
MPQA Corpus
http://nrrc.mitre.org/NRRC/publications.htm