Published by Deanna Milham. Modified over 9 years ago.

Subjectivity and Sentiment Analysis of Arabic Tweets with Limited Resources
Supervisor: Dr. Verena Rieser
Presented by: Eshrag Refaee
OSACT, 27 May 2014

Outline
1. Introduction
   - The concept of subjectivity and sentiment analysis (SSA)
   - Motivations and challenges of SSA for Arabic
   - Previous work on SSA of Arabic social networks
2. Experimental setup
   - Twitter corpus: collection and annotation
   - Evaluation metrics
   - Machine learners
3. Results and error analysis
4. Summary and future work

Subjectivity and Sentiment Analysis (SSA)
Definition: analysing and understanding people's sentiments, evaluations, opinions, attitudes, and emotions from written text.

Hierarchical Model of Subjectivity and Sentiment Analysis (SSA)
User-generated text
- Subjective
  - Positive
  - Negative
- Objective

Applications
In addition to its significance as a major sub-field of Natural Language Processing (NLP) research, SSA has a range of real-world applications:
- Commercial applications, e.g. measuring the success of a product
- Social applications
- Political applications
- Economic applications

SSA and Social Networks
The growing importance of sentiment analysis coincides with the growth of social media such as micro-blogs.

SSA and Twitter
Twitter (Statistic Brain, 2014):
- ~60 M tweets/day
- >600 M active users
- 10th most popular site in the world
March 2012: Twitter became available in Arabic (Twitter Blog, 2012).

About Arabic
- Arabic is the language of over 422 million people.
- It is the first language of the 22 member countries of the Arab League.
- It is an official language in three other countries (UNESCO, 2013).

About Arabic
Arabic is the language of over 422 million people. The Arabic language can be classified into three major levels (Habash, 2010):
- Classical Arabic (CA)
- Modern Standard Arabic (MSA)
- Arabic dialects (AD)
Used side by side in social networks.

Challenges with Respect to Arabic
- Limited availability of NLP resources for dialectal Arabic (DA).
- Noisy features.
- No large-scale Arabic Twitter corpus annotated for SSA is publicly available.
- Sparse labelled data.
BUT: lots of unlabelled data!

Challenges with Respect to Twitter
- 'Bad language' (Eisenstein, 2013), e.g. "ew" or "ugh" instead of "disgusting", "bro" instead of "brother".
- Unclear sentiment indicators, e.g. المساواة في قمع الحريات الشخصية عدل ("Equality in suppressing personal freedom is justice").
- Dynamic nature / topic shifting (Go et al., 2009).

Previous Work on SSA of Arabic Tweets

| Publication | Feature sets | Classification scheme | Results |
|---|---|---|---|
| Abdul-Mageed et al. (2012) | Stem and lemma word tokens, POS, semantic features, user: person/org | SVM (two-stage binary classification) | Best acc. 65.32% for sentiment analysis and 79.01% for subjectivity analysis |
| Mourad and Darwish (2013) | Stem word tokens, tweet-specific features, stylistic features | SVM and NB with 10-fold cross-validation | Best acc. 64.1% for subjectivity classification and 72.5% for sentiment classification |

- Mainly supervised learning on manually annotated corpora.
- Costly annotations.
- Not scalable or applicable to unseen topics!

Previous Work on SSA of Arabic Tweets

| Publication | Feature sets | Classification scheme | Datasets | Results |
|---|---|---|---|---|
| Abdul-Mageed et al. (2012) | Stem and lemma word tokens, POS, semantic features, user: person/org | SVM (two-stage binary classification) | 3k Arabic tweets | Best acc. 65.32% for sentiment analysis and 79.01% for subjectivity analysis |
| Mourad and Darwish (2013) | Stem word tokens, tweet-specific features, stylistic features | SVM and NB with 10-fold cross-validation | 2,300 Arabic tweets | Best acc. 64.1% for subjectivity classification and 72.5% for sentiment classification |

- Word-based features.
- SVM shown to perform best (large feature sets).
- Evaluation: 10-fold cross-validation; held-out test set from the same corpus.
- No test for unseen topics or for scalability under topic shift!

Methodology and Approach
Unlabelled tweets → human annotators → gold-standard labelled tweets → Arabic ALP tools → features → train machine learning scheme (SVM classifier) → model evaluation on a manually annotated held-out test set

Arabic Twitter SSA Corpora

Arabic Twitter SSA Corpora: Gold Standard Data Set
- Manually annotated for sentiment analysis (total = 3,309)
- 2 native-speaker annotators (weighted Kappa = 0.76)
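
To make the agreement figure above more concrete, here is a minimal sketch of how a weighted Cohen's kappa between two annotators could be computed. The slides do not say which weighting scheme was used, so the linear weights and the category ordering (negative < neutral < positive) are assumptions made only for illustration.

```python
def weighted_kappa(labels_a, labels_b, categories):
    """Cohen's kappa with linear disagreement weights (an assumed scheme).

    `categories` must be listed in order, e.g. negative < neutral < positive.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(labels_a)

    # Observed joint distribution of the two annotators' labels.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal label distributions for each annotator.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed vs. chance-expected disagreement.
    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_dis += weight * observed[i][j]
            expected_dis += weight * row[i] * col[j]
    return 1.0 - observed_dis / expected_dis


CATEGORIES = ["negative", "neutral", "positive"]
annotator_1 = ["negative", "neutral", "positive", "positive"]
annotator_2 = ["negative", "neutral", "positive", "neutral"]
print(weighted_kappa(annotator_1, annotator_2, CATEGORIES))
```

The function is undefined when both annotators use a single identical category for every item (the expected disagreement is zero), so real use would need a guard for that case.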

Arabic Twitter SSA Corpora: Held-out Test Set
963 tweets were manually annotated for evaluating the trained models.

Arabic Twitter SSA Corpora
Examples of annotated tweets:

| Sentiment label | Example | English translation |
|---|---|---|
| Positive | السياحة في اليمن جمال لا يصدق | Tourism in Yemen, unbelievable beauty |
| Negative | حنا للأسف نستخدم ايفون | Unfortunately, we use the iPhone |
| Neutral | ميركل تدعو اوكرانيا لتشكيل حكومة جديدة | Merkel calls for Ukraine to form a new government |

Feature Extraction

| Type | Linguistic tool/resource | Feature set |
|---|---|---|
| Morphological features | Arabic morphological analyser: MADA + TOKAN v3.2 (Habash and Rambow, 2005; Habash and Roth, 2009) | Diacritic, aspect, gender, mood, person, part-of-speech, state, voice, has-morph-analysis |
| Syntactic features | | N-grams of word tokens |
| Semantic features | Polarity lexicons: 1) ArabSenti (Abdul-Mageed et al., 2011); 2) MPQA translation (Wilson et al., 2005) | Has-positive-lexicon, has-negative-lexicon, has-neutral-lexicon, has-negator |
| Stylistic features | | Has-positive-emoticon, has-negative-emoticon |
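
As a rough illustration of the semantic and stylistic features listed above, the sketch below derives a few of the binary indicators from a tweet. The word lists and emoticon lists are tiny hypothetical stand-ins, not the contents of ArabSenti or the translated MPQA lexicon, and plain whitespace tokenisation is assumed in place of MADA + TOKAN.

```python
# Hypothetical toy resources; the real lexicons are far larger.
POSITIVE_WORDS = {"جمال", "سعادة"}
NEGATIVE_WORDS = {"للأسف"}
NEGATORS = {"لا", "لم", "لن"}
POSITIVE_EMOTICONS = (":)", ":-)", ":D")
NEGATIVE_EMOTICONS = (":(", ":-(")


def binary_features(tweet):
    """Return a dict of binary lexicon and emoticon indicators for one tweet."""
    tokens = tweet.split()  # assumption: whitespace tokenisation
    return {
        "has-positive-lexicon": any(t in POSITIVE_WORDS for t in tokens),
        "has-negative-lexicon": any(t in NEGATIVE_WORDS for t in tokens),
        "has-negator": any(t in NEGATORS for t in tokens),
        "has-positive-emoticon": any(e in tweet for e in POSITIVE_EMOTICONS),
        "has-negative-emoticon": any(e in tweet for e in NEGATIVE_EMOTICONS),
    }


print(binary_features("السياحة في اليمن جمال لا يصدق :)"))
```

In a full system these indicators would be concatenated with the n-gram and morphological features before training.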

Subjectivity and Sentiment Classification Experiments

SSA Classification: Problem Formulations
Formulation 1 (hierarchical):
Text
- Subjective
  - Positive
  - Negative
- Objective
Formulation 2 (flat):
Text
- Positive
- Negative
- Neutral
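
The two formulations can be related with a small sketch. It assumes that a "neutral" gold label corresponds to "objective" in the hierarchical scheme, and the two classifier arguments are hypothetical callables standing in for trained models.

```python
def to_two_stage(label):
    """Map a three-way gold label to (stage-1 label, stage-2 label).

    Assumption: 'neutral' tweets are the 'objective' class of stage 1.
    """
    if label == "neutral":
        return ("objective", None)
    return ("subjective", label)


def cascade(is_subjective, polarity, tweet):
    """Two-stage prediction: polarity is only consulted for subjective tweets."""
    if not is_subjective(tweet):
        return "objective"
    return polarity(tweet)


# Usage with trivial stand-in classifiers (hypothetical, for illustration only).
print(to_two_stage("positive"))
print(cascade(lambda t: ":)" in t, lambda t: "positive", "nice day :)"))
```

One consequence of the cascade is that stage-1 errors propagate: a subjective tweet wrongly judged objective never reaches the polarity classifier.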

Machine Learning Classifiers
- Support Vector Machines (SVM): Sequential Minimal Optimization, SMO (Platt, 1999). An SVM aims to identify the optimal hyperplane that linearly separates data instances with the maximum margin (Hsu et al., 2003).
- Majority baseline: ZeroR
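
For readers unfamiliar with the ZeroR baseline, a minimal sketch follows: it ignores all features and always predicts the most frequent training label. The class name and method names are illustrative choices, not the API of any particular toolkit.

```python
from collections import Counter


class ZeroR:
    """Majority-class baseline: always predicts the most frequent training label."""

    def fit(self, labels):
        # Features are ignored entirely; only the label distribution matters.
        self.majority_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, instances):
        return [self.majority_ for _ in instances]


baseline = ZeroR().fit(["neutral", "neutral", "neutral", "positive", "negative"])
print(baseline.predict(["tweet a", "tweet b"]))
```

Any learned model should beat this baseline; its accuracy equals the share of the majority class in the evaluation data.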

Evaluation Metrics
- F-measure
- Accuracy
- Significant differences: t-test with p < 0.05
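
The formulas on this slide did not survive extraction, so the following is a sketch of the standard definitions rather than the presenter's exact notation. It computes accuracy and a per-class F-measure (the harmonic mean of precision and recall); how the per-class scores were averaged in the reported results is not stated in the slides.

```python
def accuracy(gold, predicted):
    """Fraction of instances whose predicted label matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def f_measure(gold, predicted, target):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "neg", "neg"]
print(accuracy(gold, pred), f_measure(gold, pred, "pos"))
```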
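
Results and Evaluation

| Data set | Majority baseline: F | Majority baseline: Acc. (%) | SVM 10-fold cross-validation: F | SVM 10-fold cross-validation: Acc. (%) | SVM held-out test set: F | SVM held-out test set: Acc. (%) |
|---|---|---|---|---|---|---|
| Polar vs. neutral | 0.44 | 59.21 | 0.86 | 86.93 | 0.43 | 46.62 |
| Positive vs. negative | 0.67 | 50.22 | 0.88 | 87.74 | 0.41 | 49.65 |
| Positive vs. negative vs. neutral | 0.44 | 59.21 | 0.85 | 85.39 | 0.28 | 28.24 |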

Error Analysis
The most predictive word unigrams in the two datasets, as evaluated by Chi-Squared:

| ID | Development set (Spring '13): Arabic | English | Chi-Squared | Test set (Autumn '13): Arabic | English | Chi-Squared |
|---|---|---|---|---|---|---|
| 1 | الخير | Well-being | 10.0221 | اجمل | More beautiful | 7.061 |
| 2 | الشعب | Nation | 7.114 | احسن | Better | 5.8727 |
| 3 | اجمل | More beautiful | 6.9927 | آه | (sigh) | 5.236 |
| 4 | ماهر | Skilful | 5.0705 | سعادة | Happiness | 4.689 |
| 5 | مبروك | Congratulations | 4.984 | الخير | Welfare/Well-being | 4.689 |
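
To show what a Chi-Squared score for a unigram measures, here is a minimal sketch using the standard 2x2 contingency-table formula for a word against a binary class. The counts in the usage line are invented for illustration and are not taken from the corpus; the slides do not state which implementation produced the scores in the table.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 table of word presence vs. class.

    a: positive tweets containing the word
    b: negative tweets containing the word
    c: positive tweets without the word
    d: negative tweets without the word
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the word carries no signal
    return n * (a * d - b * c) ** 2 / denominator


# Invented counts: a word seen in 30 positive and 10 negative tweets.
print(chi_squared(30, 10, 10, 30))
```

A word distributed evenly across the classes scores 0, while a word confined to one class scores highest, which is why topic-specific words can dominate one collection period and vanish in another.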

Current Work
A large-scale Arabic Twitter SSA corpus: distant supervision (DS) data set.
Unlabelled tweets → noisy labels (#hashtags & emoticons) → automatically labelled tweets → Arabic ALP tools → features → train machine learning scheme (learn SVM classifier) → model evaluation on a manually annotated test set
Refaee and Rieser (2014). Can we Read Emotions from a Smiley Face? Emoticon-based Distant Supervision for Subjectivity and Sentiment Analysis of Arabic Twitter Feeds. In the 5th International Workshop on Emotion, Social Signals, Sentiment and Linked Open Data.
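
The idea of emoticon-based distant supervision can be sketched as follows: an emoticon in a tweet is treated as a noisy sentiment label, so no manual annotation is needed. The emoticon lists, the rule of discarding tweets with conflicting or no emoticons, and the step that strips the emoticon from the text are all assumptions for illustration, not details taken from the cited paper.

```python
POSITIVE_EMOTICONS = (":)", ":-)", ":D")
NEGATIVE_EMOTICONS = (":(", ":-(")


def noisy_label(tweet):
    """Return 'positive', 'negative', or None if there is no usable signal."""
    has_pos = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_neg = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        return None  # no emoticon at all, or conflicting emoticons
    return "positive" if has_pos else "negative"


def strip_emoticons(tweet):
    """Remove the label-bearing emoticons so a classifier cannot simply memorise them."""
    for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        tweet = tweet.replace(emoticon, "")
    return " ".join(tweet.split())


tweets = ["great day :)", "oh no :(", "hello", "mixed :) :("]
labelled = [(strip_emoticons(t), noisy_label(t)) for t in tweets if noisy_label(t)]
print(labelled)
```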

Current Work
- Annotate and release a newly collected gold-standard Arabic Twitter corpus*
  - Number of instances: 6,894
  - Word frequencies: 91,419
  - Word tokens: 28,373
- Extended feature sets:

| Type | Feature set |
|---|---|
| Twitter-specific features | Has-hashtag, has-URL, is-favourite, is-retweet |
| Social signals | Has-consents, has-dazzle, has-laugh, has-regret, has-sigh |
| Language style | MSA/DA, is-sarcastic |
| Tweet category | Tweet-category {politics, sport, social, religious, internet, commercial, etc.} |

* Available via the ELRA repository; details are described in [Refaee & Rieser, LREC 2014].

Please come and see my poster on May 29, 11:45-13:25. Session: Social Media Processing, P32, No. 317.

Thanks
Looking forward to hearing your feedback, or contact me at Eaar1@hw.ac.uk or @eshragR.

DS for SSA of Social Networks in Other Languages

| Language | Publication | Auto-sentiment feature | Sentiment labels | Feature sets | Classification schemes | Results |
|---|---|---|---|---|---|---|
| English | Go et al. (2009) | Emoticons | Positive vs. negative | Unigrams, bigrams, and POS | NB, SVM, ME | Best accuracy = 83% |
| English | Bifet and Frank (2010) | Emoticons | Positive vs. negative | Unigrams | Multinomial NB, SGD | Best accuracy = 82.45% |
| English | Purver and Battersby (2012) | Emoticons | 6 emotion classes | Unigrams | SVM | Best F-score = 77.5% (detecting happiness) |
| English | Suttles and Ide (2013) | Emoticons, hashtags and emoji | 8 emotion classes (binary classification) | Unigrams | NB, ME | Best acc. 90.6% {joy vs. sadness} |
| Chinese | Yuan and Purver (2012) | Emoticons | 6 emotion classes | Character-based and word-based n-grams | SVM | Best accuracy = 78.2% (detecting happiness) |

Example of Annotation Disagreement
1. لنرى قوتكم يا ارهابيه بشار الاسد لنسحقكم ونحن لا نتشرف بلقياكم يا كلاب الناتو
   "Let's see your power, you terrorists of Bashar Al-Assad, so we can crush you; we do not even want to see you, you NATO dogs."
   Label: Negative
2. يوجد ايفون بين كل اربعة هواتف ذكية
   "There is an iPhone among every four smartphones."
   Annotator 1: Neutral (facts). Annotator 2: Neutral (no clear positive evaluation).
3. تعتبر السياحة مورد هام للاقتصاد البحريني حيث بلغ عدد السائحين في 2007 الى 4.8 مليون سائح ومتوقع ان يزداد بشكل كبير جدا
   "Tourism is considered an important source of revenue for Bahrain's economy, as the number of tourists in 2007 reached 4.8 M and is expected to increase enormously."
   Annotator 1: Positive (positive evaluation). Annotator 2: Neutral (news).
4. علمتنا الثورات العربية ان بشار الاسد عنده حق
   "The political revolution (Arab Spring) has taught us that Bashar Al-Assad is right."
   Annotator 1: Neutral (sarcastic view). Annotator 2: Negative (negative stance).

Approach and Methodology
- Arabic Twitter corpora: build and annotate Twitter corpora for SSA.
- Machine learning algorithm: apply a machine learning scheme, Support Vector Machines (SVM).
- Build a sentiment classifier: learn a statistical classifier to classify a given text as:
  - subjective vs. objective
  - subjective positive vs. subjective negative
- Evaluate and test how well the models generalise, using an independent test set.

Experimental Settings
Pre-processing:
- Remove retweets
- Normalise Latin characters, digits, URLs, user names, and hashtags
- Reduce runs of more than two consecutive repeated characters to two
- Apply a light Arabic stemmer
- Remove stop words
Problem formulations:
- Two-stage binary classification: subjective vs. objective; positive vs. negative
- One-stage multi-class classification: positive vs. negative vs. neutral
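
Some of the pre-processing steps above can be sketched with regular expressions. The placeholder tokens (`URL`, `USER`, `HASHTAG`, `NUM`) and the "RT " prefix test for retweets are assumptions made for illustration; normalisation of Latin characters, stemming, and stop-word removal are not covered here because the slides do not name the rules or tools used for them.

```python
import re


def is_retweet(tweet):
    """Assumed heuristic: retweets start with the conventional 'RT ' prefix."""
    return tweet.startswith("RT ")


def normalise(tweet):
    """Replace URLs, user names, hashtags and digits with placeholder tokens,
    then reduce runs of three or more identical characters to two."""
    tweet = re.sub(r"https?://\S+", "URL", tweet)
    tweet = re.sub(r"@\w+", "USER", tweet)
    tweet = re.sub(r"#\w+", "HASHTAG", tweet)
    tweet = re.sub(r"\d+", "NUM", tweet)
    tweet = re.sub(r"(.)\1{2,}", r"\1\1", tweet)
    return tweet


raw = ["RT @someone old news", "soooo goood http://t.co/abc @user #tag 2014"]
cleaned = [normalise(t) for t in raw if not is_retweet(t)]
print(cleaned)
```

Collapsing elongated words to two repeated characters, rather than one, keeps a trace of the emphasis while still merging variants such as "sooo" and "sooooo" into one feature.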