SemEval-2013 Task 2: Sentiment Analysis in Twitter
Preslav Nakov, Sara Rosenthal, Zornitsa Kozareva, Veselin Stoyanov, Alan Ritter, Theresa Wilson



Task 2 - Overview

Sentiment Analysis
- Understanding how opinions and sentiments are expressed in language
- Extracting opinions and sentiments from human language data

Social Media
- Short messages
- Informal language
- Creative spelling, punctuation, words, and word use
- Genre-specific terminology (#hashtags) and discourse (RT)

Task Goal: Promote sentiment analysis research in Social Media

SemEval Tweet Corpus
- Publicly available (within Twitter TOS)
- Phrase and message-level sentiment
- Tweets and SMS (from the NUS SMS Corpus; Chen and Kan, 2012) for evaluating generalizability

Task Description

Two subtasks:
A. Phrase-level sentiment
B. Message-level sentiment

Classify as positive, negative, or neutral/objective:
- Words and phrases identified as subjective [Subtask A]
- Messages (tweets/SMS) [Subtask B]

Data Collection

1. Extract named entities (NEs) (Ritter et al., 2011)
2. Identify Popular Topics (Ritter et al., 2012): NEs frequently associated with specific dates
3. Extract Messages Mentioning Topics
4. Filter Messages for Sentiment: keep a message if it contains at least one pos/neg term from SentiWordNet (score > 0.3)
5. Data for Annotation
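The sentiment filter in the last step before annotation can be sketched roughly as follows. This is only an illustration: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet, the function names are invented here, and treating each word as having a single positive and a single negative score is an assumption, since the slides do not say how scores were looked up.

```python
# Hypothetical mini-lexicon standing in for SentiWordNet:
# word -> (positive score, negative score). Values are made up for illustration.
TOY_LEXICON = {
    "great": (0.75, 0.0),
    "crowded": (0.0, 0.5),
    "okay": (0.25, 0.0),
}


def has_sentiment_term(message, lexicon=TOY_LEXICON, threshold=0.3):
    """Return True if any token has a positive or negative score above the threshold."""
    for token in message.lower().split():
        word = token.strip(".,!?:;()\"'")  # crude punctuation stripping (assumption)
        pos, neg = lexicon.get(word, (0.0, 0.0))
        if pos > threshold or neg > threshold:
            return True
    return False


def filter_messages(messages, lexicon=TOY_LEXICON, threshold=0.3):
    """Keep only messages that contain at least one sentiment-bearing term."""
    return [m for m in messages if has_sentiment_term(m, lexicon, threshold)]
```

With the toy lexicon, a message containing "great" would be kept, while one whose only lexicon word is "okay" (score 0.25) would be dropped because it does not exceed 0.3.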

Annotation Task

Instructions: Subjective words are ones which convey an opinion. Given a sentence, identify whether it is objective, positive, negative, or neutral. Then, identify each subjective word or phrase in the context of the sentence and mark the position of its start and end in the text boxes below. The number above each word indicates its position. The word/phrase will be generated in the adjacent textbox so that you can confirm that you chose the correct range. Choose the polarity of the word or phrase by selecting one of the radio buttons: positive, negative, or neutral. If a sentence is not subjective, please select the checkbox indicating that "There are no subjective words/phrases". Please read the examples and invalid responses before beginning if this is your first time answering this HIT.

Mechanical Turk HIT (3-5 workers per tweet)

Data Annotations

[Figure: annotations from Worker 1 through Worker 5 and their intersection]

Final annotations determined using majority vote
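A minimal sketch of majority voting over worker labels is shown below. The slides do not say how ties or items without a majority were handled, so returning `None` in those cases is an assumption, and `majority_label` is a name invented for this example.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by a strict majority of workers, or None if there is none."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    # Strict majority: more than half of the workers must agree (assumption).
    return label if count > len(labels) / 2 else None
```

For example, three workers answering positive, positive, negative would yield "positive", while three workers giving three different labels would yield no final label under this rule.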

Example Annotations

friday evening plans were great, but saturday’s plans didnt go as expected – i went dancing & it was an ok club, but terribly crowded :-(

WHY THE HELL DO YOU GUYS ALL HAVE MRS. KENNEDY! SHES A FUCKING DOUCHE

AT&T was okay but whenever they do something nice in the name of customer service it seems like a favor, while T-Mobile makes that a normal everyday thin

obama should be impeached on TREASON charges. Our Nuclear arsenal was TOP Secret. Till HE told our enemies what we had. #Coward #Traitor

My graduation speech: ”I’d like to thanks Google, Wikipedia and my computer! :D #iThingteens

Distribution of Classes

Subtask A
                    Train     Dev     Test-TWEET        Test-SMS
Positive            5,[...]   [...]   [...],734 (60%)   1,071 (46%)
Negative            3,[...]   [...]   [...],541 (33%)   1,104 (47%)
Neutral             [...]     [...]   [...] (3%)        159 (7%)
Total               [...]     [...]   4,635             2,334

Subtask B
                    Train     Dev     Test-TWEET        Test-SMS
Positive            3,[...]   [...]   [...],573 (41%)   492 (23%)
Negative            1,[...]   [...]   [...] (16%)       394 (19%)
Neutral/Objective   4,[...]   [...]   [...],640 (43%)   1,208 (58%)
Total               [...]     [...]   3,814             2,094

Options for Participation

1. Subtask A and/or Subtask B
2. Constrained* and/or Unconstrained (refers to data used for training)
3. Tweets and/or SMS

*Used for ranking

Participation

[Chart: Submissions (148); segments labeled Unconstrained (7), Constrained (21), Unconstrained (15), Constrained (36)]

Participation 14 teams participated in both A & B subtasks

Scoring

- Recall, Precision, F-measure calculated for pos/neg classes for each run submitted
- Score = Ave(Pos F, Neg F)
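The score above can be sketched in a few lines. This assumes that "F-measure" means the balanced F1 (harmonic mean of precision and recall) and that labels are the strings "positive", "negative", and "neutral"; the function names are illustrative and are not taken from the official scorer.

```python
def f_measure(gold, predicted, target):
    """Balanced F-measure (F1) for one class, given parallel gold and predicted label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def task_score(gold, predicted):
    """Average of the positive-class and negative-class F-measures."""
    pos_f = f_measure(gold, predicted, "positive")
    neg_f = f_measure(gold, predicted, "negative")
    return (pos_f + neg_f) / 2
```

Note that the neutral class has no F-measure of its own in this score, but neutral items still matter: a neutral message predicted as positive counts as a false positive for the positive class.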

Subtask A (words/phrases) Results

Tweets / SMS
Top Systems: 1. NRC-Canada 2. AVAYA 3. Bounce
Top Systems: 1. GU-MLT-LT 2. NRC-Canada 3. AVAYA

Subtask B (messages) Results

Tweets / SMS
Top Systems: 1. NRC-Canada 2. GU-MLT-LT 3. KLUE
Top Systems: 1. NRC-Canada 2. GU-MLT-LT 3. teragram

Observations

- Majority of systems were supervised and constrained
- 5 semi-supervised, 1 fully unsupervised
- Systems that made best use of unconstrained option:
  Subtask A: senti.ue-en
  Subtask B Tweet: AVAYA, bwbaugh, ECNUCS, OPTIMA, sinai
  Subtask B SMS: bwbaugh, nlp.cs.aueb.gr, OPTIMA, SZTE-NLP
- Most popular classifiers: SVM, MaxEnt, linear classifier, Naive Bayes

Thank You!

Special thanks to co-organizers: Preslav Nakov, Sara Rosenthal, Alan Ritter, Zornitsa Kozareva, Veselin Stoyanov

SemEval Tweet Corpus
Funding for annotations provided by: JHU Human Language Technology Center of Excellence, ODNI IARPA