Influence detection of famous personalities using Politeness and Likeability Navita Jain.


Motivation
Goal: Model the influence of known personalities on Twitter.
Hypothesis:
- People generally follow personalities who are likeable.
- Politeness is a trait that people like.
Polite and likeable people are influential.

Data
Twitter data, of two different types:
1. For likeability (attitude) detection: a dataset of tweets that refer to each influential or non-influential user. Example tweet: "FOUR Doctors Warn - Trump Has a Narcissistic Personality Disorder! Unstable 2b President? #DonaldsDisorder #VOAV https…"
2. For politeness detection: a dataset of tweets posted by each influential or non-influential user. Example tweet: "@DanScavino Vote trump to save the west. Don't become like Europe - #WakeUpAmerica"

Data Pre-processing
- Collected data from the Twitter API, in English.
- Converted all letters to lowercase for consistency.
- Removed statements that do not convey a message (sentiment).
- Kept 100 genuine tweets (tweets remaining after preprocessing) for each test user.
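The pre-processing steps can be sketched roughly as below. This is only an illustration: the slides do not say how "does not convey a message" was decided, so the sketch assumes a tweet counts as conveying sentiment when it contains at least one word from a small, hypothetical word list (SENTIMENT_WORDS). The function name preprocess is also an assumption.

```python
# Hypothetical stand-in for a sentiment word list; not taken from the slides.
SENTIMENT_WORDS = {"hate", "pleased", "perfect", "thank", "please"}


def preprocess(raw_tweets, max_tweets=100):
    """Lowercase tweets, drop those with no sentiment word, keep at most max_tweets."""
    kept = []
    for text in raw_tweets:
        text = text.lower()
        tokens = set(text.split())
        # Assumed proxy for "conveys a message": at least one sentiment word.
        if tokens & SENTIMENT_WORDS:
            kept.append(text)
        if len(kept) == max_tweets:
            break
    return kept


sample = ["Thank you all!", "http://example.com", "I HATE waiting"]
cleaned = preprocess(sample)
print(cleaned)
```

A real filter would probably need tokenisation that handles punctuation; whitespace splitting is used here only to keep the sketch short.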

Annotation
Tweets were classified manually. Examples:
- Politeness: "we are becoming a third world country because of jerks like him' Great !"
- Politeness: "On EarthDay, reverence and gratitude to our planet that has given us everything" (label: polite)
- Likeability, referring to Barack Obama: "I AM SO PROUD that I was able, in my lifetime, to see a Black man, BARACK H. OBAMA, become the President of the United States of America."
- Likeability, referring to Sarah Palin: "Are you sure, That well known scientist Sarah Palin told me it's all rubbish. Phew! you had me going for a moment." (label: negative)

Tweet2FeatureVector
- Used the Stanford NLP tool to parse the dataset automatically.
- Politeness feature vector: pronouns, adjectives, verbs, ... For example: ['greetings', 'thank you', 'please', 'respected', 'go to hell', 'f*ck', ...]
- Likeable feature vector: adjectives, verbs, ... For example: ['hate', 'pleased', 'perfect', 'envy', 'lmao']
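One way to picture the tweet-to-feature-vector step is a binary presence vector over the feature terms. The sketch below is a simplification and not the author's pipeline: it skips the Stanford NLP parsing entirely and simply checks whether each term occurs in the tweet. The function name to_feature_vector is hypothetical, and the term lists are shortened versions of the examples on the slide.

```python
import re

# Shortened copies of the example terms from the slide.
POLITENESS_TERMS = ["greetings", "thank you", "please", "respected", "go to hell"]
LIKEABILITY_TERMS = ["hate", "pleased", "perfect", "envy", "lmao"]


def to_feature_vector(tweet, terms):
    """Return one 0/1 flag per term, marking whole-word (or phrase) presence."""
    text = tweet.lower()
    vector = []
    for term in terms:
        # Word boundaries stop "please" from matching inside "pleased".
        pattern = r"\b" + re.escape(term) + r"\b"
        vector.append(1 if re.search(pattern, text) else 0)
    return vector


tweet = "Thank you, I am so pleased!"
politeness_vec = to_feature_vector(tweet, POLITENESS_TERMS)
likeability_vec = to_feature_vector(tweet, LIKEABILITY_TERMS)
print(politeness_vec, likeability_vec)
```

Vectors like these could then be passed to a classifier; term counts or part-of-speech features would be natural extensions.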

Classification
Trained a Naïve Bayes classifier for politeness:
- Performed 5-fold cross-validation (k = 5).
- Train on 80% of the data, test on the remaining 20%.
Trained a Support Vector Machine classifier for likeability:
- Kernel: linear.
- Performed 10-fold cross-validation (k = 10).
- Train on 90% of the data, test on the remaining 10%.
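The link between k and the train/test percentages can be checked with a small k-fold splitter. This is a minimal sketch in plain Python, not the code used in the presentation; the function name k_fold_indices and the sample count of 100 are assumptions for illustration, and the folds are contiguous (no shuffling).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


for k in (5, 10):
    train, test = next(k_fold_indices(100, k))
    print(f"k={k}: train={len(train)}, test={len(test)}")
```

With k = 5 each fold holds out 20% of the data, and with k = 10 each fold holds out 10%, which matches the splits listed on the slide. In practice the data would usually be shuffled before splitting.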

Spearman's rank correlation coefficient
A measure of the strength of the association between two sets of ranks:
\rho = 1 - \frac{6 \sum_{i=1}^{N} (x_i - y_i)^2}{N (N^2 - 1)}
where x_i and y_i are the ranks of user i under two different influence measures in a dataset of N test cases. Here x_i is the politeness rank and y_i is the likeability rank.
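Spearman's coefficient for two rankings can be computed with the standard rank-difference formula, rho = 1 - 6 * sum(d_i^2) / (N * (N^2 - 1)), which holds when there are no tied ranks. The sketch below assumes exactly that; the example ranks are made up for illustration and are not results from the presentation.

```python
def spearman_rho(x_ranks, y_ranks):
    """Spearman's rank correlation for two untied rankings of the same users."""
    n = len(x_ranks)
    if n != len(y_ranks) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 items")
    # Sum of squared rank differences d_i = x_i - y_i.
    d_squared = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks for five users (politeness vs. likeability).
politeness_ranks = [1, 2, 3, 4, 5]
likeability_ranks = [2, 1, 3, 5, 4]
rho = spearman_rho(politeness_ranks, likeability_ranks)
print(round(rho, 3))
```

A value near 1 means the two rankings largely agree, and a value near -1 means they are close to reversed. With tied ranks a tie-aware implementation should be used instead.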

Result
Test Case | Politeness Rank/Score | Likeability Rank/Score | Influence Rank/Score
Barack Obama
Donald Trump
Jeb Bush
John Boehner
Kim Kardashian
Narendra Modi
Rahul Gandhi
Sarah Palin
Stephen Smith
Taylor Swift

Ground Truth | Predicted
Influential users: Barack Obama, Kim Kardashian, Donald Trump, Narendra Modi, Narendra Modi, Jeb Bush, Kim Kardashian, Barack Obama, Taylor Swift
Non-influential users: Jeb Bush, Donald Trump, John Boehner, Rahul Gandhi, Sarah Palin, Stephen Smith

Thanks

Influence Detection
- Train on 80% of the data, test on the remaining 20%.
- Politeness correlates positively with influence: the higher the politeness score, the more influential a person should be.
- Likeability is proportional to influence.
- First iteration: sum the scores; if the sum is above some threshold, label the user influential.
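The first-iteration rule (sum the scores and compare with a threshold) might look like the sketch below. The scores, the user names, and the threshold value are all invented for illustration; the slides do not give the actual numbers or say how the threshold was chosen.

```python
def is_influential(politeness_score, likeability_score, threshold):
    """Label a user influential when the summed scores exceed the threshold."""
    return politeness_score + likeability_score > threshold


# Hypothetical (politeness, likeability) scores; not results from the slides.
scores = {"user_a": (0.7, 0.6), "user_b": (0.2, 0.3)}
THRESHOLD = 1.0  # assumed value

labels = {
    name: is_influential(politeness, likeability, THRESHOLD)
    for name, (politeness, likeability) in scores.items()
}
print(labels)
```

A plain sum weights politeness and likeability equally; a weighted sum would be one obvious variation if one trait turned out to matter more.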

Thanks Suggestions

Twitter API command for data collection
# Tweets on: user
raw_tweets = myApi.GetSearch('Donald Trump', lang='en', count=100)
# Tweets by user
raw_tweets =

Opinion lexicon : Positive and negative word list

- Downloaded 100 tweets for each test user for likeability measurement.
- Kept retweets, on the assumption that a person who retweets shares or agrees with the sentiment (not sure about this).
- Removed URLs but kept hashtags because they are informative. For example: "President Donald J. Trump #IrritateMeIn4Words"
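Removing URLs while leaving hashtags untouched can be done with a simple regular expression. This is a minimal sketch, assuming URLs start with http:// or https://; the function name strip_urls and the example URL are hypothetical.

```python
import re

# Assumed URL shape: http:// or https:// followed by non-space characters.
URL_PATTERN = re.compile(r"https?://\S+")


def strip_urls(tweet):
    """Remove URLs from a tweet and tidy whitespace; hashtags are left as-is."""
    without_urls = URL_PATTERN.sub("", tweet)
    return " ".join(without_urls.split())


example = "President Donald J. Trump #IrritateMeIn4Words https://example.com/x"
cleaned_tweet = strip_urls(example)
print(cleaned_tweet)
```

Truncated links such as a bare "https…" would not match this pattern and would need separate handling.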