Just how mad are you? Finding strong and weak opinion clauses
Theresa Wilson, Janyce Wiebe, Rebecca Hwa
University of Pittsburgh
AAAI-2004
AAAI /27/2004
Problem and Motivation
Problem: Opinion Extraction. Automatically identify and extract attitudes, opinions, and sentiments in text.
Applications: Information Extraction, Summarization, Question Answering, Flame Detection, etc.
Focus: Individual clauses and strength
Motivating Example
“I think people are happy because Chavez has fallen. But there’s also a feeling of uncertainty about how the country’s obvious problems are going to be solved,” said Ms. Ledesma.
Motivating Example
Though some of them did not conceal their criticisms of Hugo Chavez, the member countries of the Organization of American States condemned the coup and recognized the legitimacy of the elected president.
Annotated strengths: low strength, high strength, medium strength
Our Goal
Identify opinions below the sentence level
Characterize the strength of opinions
Our Approach
Identify embedded sentential clauses (dependency tree representation)
Supervised learning to classify strength of clauses
Strength scale: neutral (NO OPINION), low, medium, high (VERY STRONG)
Significant improvements over baseline
Mean-squared error < 1.0
Our Approach
Example: I am furious that my landlord refused to return my security deposit until I sued them.
[Figure: dependency tree of the example sentence, with clauses labeled High Strength, Medium Strength, and Neutral]
Opinionated Sentence (Riloff et al. (2003), Riloff and Wiebe (2003))
Outline
Introduction
Opinions and Emotions in Text
Clues and Features
  Subjectivity Clues
  Organizing Clues into Features
Experiments
  Strength Classification
  Results
Conclusions
Private States and Subjective Expressions
Private state: covering term for opinions, emotions, sentiments, attitudes, speculations, etc. (Quirk et al., 1985)
Subjective expressions: words and phrases that express private states (Banfield, 1982)
“The US fears a spill-over,” said Xirao-Nima.
“The report is full of absurdities,” he complained.
Corpus of Opinion Annotations
Multi-perspective Question Answering (MPQA) Corpus
Sponsored by NRRC ARDA
Released November, 2003
Detailed expression-level annotations of private states: strength
See Wilson and Wiebe (SIGdial 2003)
Freely Available
Clues from Previous Work
29 sets of clues
Culled from manually developed resources
Learned from annotated/unannotated data
Words, phrases, extraction patterns
Examples
SINGLE WORDS: bizarre, hate, concern, applaud, foolish, vexing
PHRASES: long for, stir up, grin and bear it, on the other hand
EXTRACTION PATTERNS: expressed (condolences|hope|*), show of (support|goodwill|*)
Syntax Clues: Generation
Training data: I think people are happy because Chavez has fallen
Steps: Parse, then convert to dependency
[Figure: phrase-structure parse (S, NP, VP, SBAR, PRP, VBP, NNS, JJ) with head and modifiers marked, converted to a dependency tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ and relations subj, obj, pred]
Syntax Clues: Generation
5 Classes of Clues: 1. root 2. leaf 3. node 4. all-kids 5. bilex
Example: bilex(are,VBP,pred,happy,JJ)
Example: allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)
[Figure: dependency parse tree with nodes think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ and relations subj, obj, pred, mod, i]
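The five clue classes can be illustrated with a small sketch that walks a dependency tree and emits clue strings in the notation of the two examples above. The `Node` class and the `clues` function are hypothetical names, and the readings of root, leaf, and node (the tree's root, a node without children, any node) are assumptions; only the bilex and allkids formats are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    tag: str
    rel: str = ""  # relation to the parent; empty for the root
    kids: list = field(default_factory=list)


def clues(node, is_root=True):
    """Yield clue strings for a node and, recursively, its descendants."""
    head = f"{node.word},{node.tag}"
    yield f"node({head})"
    if is_root:
        yield f"root({head})"
    if not node.kids:
        yield f"leaf({head})"
    else:
        parts = [head]
        for kid in node.kids:
            # bilex: a head word together with one of its modifiers
            yield f"bilex({head},{kid.rel},{kid.word},{kid.tag})"
            parts.append(f"{kid.rel},{kid.word},{kid.tag}")
        # allkids: a head word together with all of its modifiers
        yield f"allkids({','.join(parts)})"
    for kid in node.kids:
        yield from clues(kid, is_root=False)


# Two subtrees from the example sentence, built by hand for illustration.
are = Node("are", "VBP", kids=[Node("people", "NNS", "subj"),
                               Node("happy", "JJ", "pred")])
fallen = Node("fallen", "VBN", kids=[Node("Chavez", "NNP", "subj"),
                                     Node("has", "VBZ", "mod")])

are_clues = list(clues(are))
fallen_clues = list(clues(fallen))
```

Here `are_clues` contains `bilex(are,VBP,pred,happy,JJ)` and `fallen_clues` contains `allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)`, matching the slide's examples.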
Syntax Clues: Selection
≥ 70% of instances in subjective expressions in training data? NO: Discard. YES: continue.
Frequency ≥ 5? YES: Highly Reliable. NO: continue.
Any instances in AUTOGEN Corpus? NO: Not Very Reliable. YES: continue.
≥ 80% of instances in subjective sentences? YES: Somewhat Reliable. NO: Discard.
Parameters chosen on tuning set
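The selection flowchart can be read as a short chain of threshold tests. The sketch below is one reading of it, not the authors' code: the function name, the argument names, and the order in which the branches connect are assumptions. The thresholds (70%, 5, 80%) are the ones shown on the slide.

```python
def reliability(train_subj_frac, train_freq, autogen_count, autogen_subj_frac,
                min_subj=0.70, min_freq=5, min_autogen_subj=0.80):
    """Return a reliability level for a syntax clue, or None to discard it.

    train_subj_frac:   fraction of training instances inside subjective expressions
    train_freq:        number of instances in the training data
    autogen_count:     number of instances in the AUTOGEN corpus
    autogen_subj_frac: fraction of AUTOGEN instances in subjective sentences
    """
    if train_subj_frac < min_subj:
        return None  # not subjective often enough: discard
    if train_freq >= min_freq:
        return "highly reliable"
    if autogen_count == 0:
        return "not very reliable"  # too rare to check on the larger corpus
    if autogen_subj_frac >= min_autogen_subj:
        return "somewhat reliable"
    return None  # rare and not confirmed by the AUTOGEN corpus: discard


levels = [
    reliability(0.90, 12, 0, 0.0),   # frequent and mostly subjective
    reliability(0.75, 2, 0, 0.0),    # rare, unseen in AUTOGEN
    reliability(0.75, 2, 40, 0.85),  # rare, confirmed by AUTOGEN
    reliability(0.50, 30, 40, 0.90), # fails the first test
]
```

Crossing the five clue classes with the three surviving levels gives the 15 sets of syntax clues.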
Syntax Clues
15 sets of clues
5 classes: root, leaf, node, bilex, allkids
3 reliability levels: highly reliable, somewhat reliable, not very reliable
Organizing Clues into Features
SET1 = {believe, happy, sad, think, … }
SET2 = {although, because, however, …}
…
SET44 = {certainly, unlikely, maybe, …}
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm: one feature per clue set, replacing individual word features (believe, happy, sad, think, although, because, …)
S    SET1  SET2  …  SET44
S1   2     1     …  0
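A minimal sketch of the TYPE feature representation, assuming each feature is a count of how many clues from that set occur in the sentence and that matching is done on lowercased whitespace tokens (both are assumptions; the slide only shows the resulting row). The sets contain only the members listed on the slide, and `type_features` is a hypothetical name.

```python
CLUE_SETS = {
    "SET1": {"believe", "happy", "sad", "think"},
    "SET2": {"although", "because", "however"},
    "SET44": {"certainly", "unlikely", "maybe"},
}


def type_features(sentence, clue_sets):
    """Count, for each clue set, the tokens of the sentence that belong to it."""
    tokens = sentence.lower().split()
    return {name: sum(tok in members for tok in tokens)
            for name, members in clue_sets.items()}


features = type_features(
    "I think people are happy because Chavez has fallen", CLUE_SETS)
# features == {'SET1': 2, 'SET2': 1, 'SET44': 0}
```

The counts reproduce the S1 row: "think" and "happy" fall in SET1, "because" falls in SET2, and nothing falls in SET44.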
Organizing Clues by Strength
Strength sets (Training Data):
NEUTRAL_SET = {however, … }
LOW_SET = {because, maybe, think, unlikely, …}
MEDIUM_SET = {believe, certainly, happy, sad, … }
HIGH_SET = {condemn, hate, tremendous, …}
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm: one feature per strength set
S    NEUTRAL_SET  LOW_SET  MEDIUM_SET  HIGH_SET
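The slide does not say how a clue is assigned to a strength set. The sketch below assumes one plausible rule, labeled here as an assumption: each clue goes to the strength it is most often annotated with in the training data. The toy observations are made up for illustration, and `build_strength_sets` and `strength_features` are hypothetical names.

```python
from collections import Counter, defaultdict

STRENGTHS = ["neutral", "low", "medium", "high"]


def build_strength_sets(observations):
    """Group clues by their most frequent strength label (assumed rule).

    observations: iterable of (clue, strength) pairs from training data.
    """
    votes = defaultdict(Counter)
    for clue, strength in observations:
        votes[clue][strength] += 1
    sets = {s: set() for s in STRENGTHS}
    for clue, counts in votes.items():
        sets[counts.most_common(1)[0][0]].add(clue)
    return sets


def strength_features(sentence, sets):
    """Return one count per strength set, in the order of STRENGTHS."""
    tokens = sentence.lower().split()
    return [sum(tok in sets[s] for tok in tokens) for s in STRENGTHS]


# Hypothetical training observations, for illustration only.
toy_observations = [
    ("think", "low"), ("think", "low"), ("think", "medium"),
    ("because", "low"), ("happy", "medium"),
    ("however", "neutral"), ("hate", "high"),
]
strength_sets = build_strength_sets(toy_observations)
s1 = strength_features(
    "I think people are happy because Chavez has fallen", strength_sets)
```

With these toy observations, `s1` is `[0, 2, 1, 0]`: two low-strength clues ("think", "because") and one medium-strength clue ("happy").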
Clues and Features: Summary
Many Types/Sets of Subjectivity Clues: 29 from previous work, 15 new syntax clues
TYPE: features correspond to type sets (44 features)
STRENGTH: features correspond to strength sets (4 features: neutral, low, medium, high)
Approaches to Strength Classification
Target Classes: neutral, low, medium, high
Classification: Boosting
BoosTexter (Schapire and Singer, 2000), AdaBoost.MH
1000 rounds of boosting
Regression: Support Vector Regression
SVMlight (Joachims, 1999)
Discretize output into ordinal strength classes
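Discretizing a regression output into ordinal classes can be sketched as rounding and clipping. The encoding neutral=0 through high=3 and the round-half-up rule are assumptions made for this illustration; the slide states only that the output is discretized.

```python
import math

CLASSES = ["neutral", "low", "medium", "high"]  # assumed encoding: 0, 1, 2, 3


def discretize(score):
    """Map a real-valued regression output to the nearest strength class."""
    # floor(x + 0.5) rounds halves up; the built-in round() would round
    # halves to the nearest even integer instead.
    idx = math.floor(score + 0.5)
    # Clip predictions that fall outside the scale.
    idx = max(0, min(len(CLASSES) - 1, idx))
    return CLASSES[idx]


labels = [discretize(s) for s in (-0.3, 1.4, 2.5, 7.2)]
```

Predictions below the scale collapse to neutral and predictions above it collapse to high, so every output maps to one of the four target classes.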
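Approaches to Strength Classification: Evaluation
Target Classes: neutral, low, medium, high
Classification: Accuracy, $\text{accuracy} = \frac{\text{correct}}{\text{total}}$
Regression: Mean-Squared Error, $\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (t_i - p_i)^2$, where $t_i$ is the true strength of instance $i$, $p_i$ is the predicted strength, and $N$ is the number of instances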
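Both evaluation measures are easy to compute once the strength classes are encoded as integers. The sketch below assumes neutral=0, low=1, medium=2, high=3; the gold and predicted labels are made-up values for illustration.

```python
def accuracy(gold, pred):
    """Fraction of instances whose predicted class equals the gold class."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def mean_squared_error(gold, pred):
    """Average squared distance between gold and predicted strengths."""
    return sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold)


# Hypothetical labels, encoded as neutral=0, low=1, medium=2, high=3.
gold = [0, 1, 2, 3]
pred = [0, 2, 2, 1]

acc = accuracy(gold, pred)            # 2 of 4 correct: 0.5
mse = mean_squared_error(gold, pred)  # (0 + 1 + 0 + 4) / 4: 1.25
```

Accuracy treats every error alike, while MSE penalizes predicting low for a high-strength clause more than predicting medium, which fits the ordinal nature of the classes.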
Units of Classification
[Figure: dependency tree of "I think people are happy because Chavez has fallen" (think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ) with clauses marked Level 1, Level 2, and Level 3, each with Train/Test labels]
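Gold-standard Classes
[Figure: dependency tree of "I think people are happy because Chavez has fallen" (think,VBP; I,PRP; are,VBP; people,NNS; happy,JJ; because,IN; fallen,VBN; Chavez,NNP; has,VBZ)]
Level 1: medium
Level 2: medium
Level 3: neutral
Other strength labels in the figure: medium, low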
Overview of Results
10-fold cross validation over 9313 sentences
Baseline: most frequent class
Bag-of-words (BAG)
Best Results: All Clues + Bag-of-words
Boosting: MSE 48% to 60% improvement over baseline; Accuracy 23% to 79% improvement over baseline
Support Vector Regression: MSE 57% to 64% improvement over baseline
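A most-frequent-class baseline and a relative improvement figure can be sketched as follows. The slide does not define "improvement", so the formulas here are an assumption: the reduction in MSE, or the gain in accuracy, relative to the baseline value. The numbers in the example are hypothetical, not results from the talk.

```python
from collections import Counter


def most_frequent_class(train_labels):
    """Baseline: always predict the most common label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def mse_improvement(baseline_mse, model_mse):
    """Relative reduction in MSE (lower MSE is better)."""
    return (baseline_mse - model_mse) / baseline_mse


def accuracy_improvement(baseline_acc, model_acc):
    """Relative gain in accuracy (higher accuracy is better)."""
    return (model_acc - baseline_acc) / baseline_acc


# Hypothetical values, for illustration only.
baseline_label = most_frequent_class(["neutral", "low", "neutral", "high"])
mse_gain = mse_improvement(2.0, 1.0)         # 0.5, i.e. a 50% improvement
acc_gain = accuracy_improvement(0.25, 0.5)   # 1.0, i.e. a 100% improvement
```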
Results: Mean-Squared Error
Improvements over BASELINE: SVM 57% to 64%; Boosting 48% to 60%
[Figure: MSE by Clause Level for SVM and Boosting; series: BASELINE, BAG (Bag-of-words), TYPE Features, STRENGTH Features, STRENGTH + BAG]
BASELINE: 1.9 to
Results: Accuracy
Improvements over BASELINE: SVM 57% (clause level 1); Boosting 23% to 79%
[Figure: accuracy by Clause Level for SVM and Boosting; series: BASELINE, BAG (Bag-of-words), TYPE Features, STRENGTH Features, STRENGTH + BAG]
BASELINE: 30.8 to 48.3
Removing Syntax Clues: MSE
[Figure: MSE by Clause Level for SVM and Boosting; series: All Clues, All Clues MINUS Syntax Clues]
Removing Syntax Clues: Accuracy
[Figure: % Accuracy by Clause Level for SVM and Boosting; series: All Clues, All Clues MINUS Syntax Clues]
Related Work
Types of Attitude: Gordon et al. (2003), Liu et al. (2003)
Tracking sentiment timelines: Tong (2001)
Positive/Negative Language: Pang et al. (2002), Morinaga et al. (2002), Turney and Littman (2003), Yu and Hatzivassiloglou (2003), Dave et al. (2003), Nasukawa and Yi (2003), Hu and Liu (2004)
Public sentiment in message boards and stock prices: Das and Chen (2001)
Conclusions
Promising results: MSE under 0.80 for sentences, MSE near 1 for embedded clauses
Embedded clauses are more difficult: less information
Wide range of features produces best results (syntax clues)
Organizing features by strength is useful
Thank you!
MPQA Corpus