Influence Detection of Famous Personalities Using Politeness and Likeability
Navita Jain
Data
Twitter data, of two different types:
For likeability (attitude) detection: a dataset of tweets in which each influential or non-influential user is mentioned.
    Example tweet: RT @Chamberlain1973: FOUR Doctors Warn - Trump Has a Narcissistic Personality Disorder! Unstable 2b President? #DonaldsDisorder #VOAV https…
For politeness detection: a dataset of tweets posted by each influential or non-influential user.
    Example tweet: "@R_U_OK_UK: @realDonaldTrump @glozee1 @PaulManafort @CNN @DanScavino Vote trump to save the west. Don't become like Europe - #WakeUpAmerica
Ground Truth
Influential users (from Time magazine):
    1. Barack Obama
    2. Donald Trump
    3. Narendra Modi
    4. Kim Kardashian
    5. Taylor Swift
Non-influential users (partly from CEOWorld.biz, partly self-selected):
    1. Jeb Bush
    2. John Boehner
    3. Rahul Gandhi
    4. Sarah Palin
    5. Johnny Depp
Sources:
    http://time.com/3732203/the-30-most-influential-people-on-the-internet/
    http://ceoworld.biz/2014/11/25/top-30-least-influential-people-united-states-non-influencers-list-2014
Data Pre-processing
    Remove emoticons, links, and hashtags
    Remove numbers, punctuation, and stopwords (a, the, ask, just, …)
    Convert all letters to lowercase for consistency
    Remove statements that do not convey a message
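The cleaning steps above can be sketched in a few lines of standard-library Python. This is only an illustration, not the pipeline used in the talk: the regular expressions, the function names `preprocess` and `clean_corpus`, and the four-word stopword list (the slide elides the rest) are assumptions, and "statements that do not convey a message" is interpreted here as tweets that have no tokens left after cleaning.

```python
import re
import string

# Illustrative stopword list: the slide names only these four and elides the rest.
STOPWORDS = {"a", "the", "ask", "just"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
# Simple ASCII emoticons such as :) ;-( =D, matched only as standalone tokens.
EMOTICON_RE = re.compile(r"(?<!\S)[:;=][-']?[)(DPp](?!\S)")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(tweet):
    """Return the cleaned, lowercased tokens of a single tweet."""
    text = URL_RE.sub(" ", tweet)        # remove links
    text = HASHTAG_RE.sub(" ", text)     # remove hashtags
    text = EMOTICON_RE.sub(" ", text)    # remove emoticons
    text = re.sub(r"\d+", " ", text)     # remove numbers
    text = text.translate(PUNCT_TABLE)   # remove punctuation
    return [t for t in text.lower().split() if t not in STOPWORDS]


def clean_corpus(tweets):
    """Clean every tweet and drop those left empty (assumed 'no message')."""
    cleaned = (preprocess(t) for t in tweets)
    return [tokens for tokens in cleaned if tokens]
```

For example, `clean_corpus([":) http://t.co/abc #tag 123", "Thank you!"])` keeps only the second tweet, as the tokens `thank` and `you`.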
Feature Vector
Likeability feature vector
    Since each tweet refers to a specific test user, detect whether its sentiment is positive or negative.
Politeness feature vector
    Create a politeness feature vector from polite and impolite phrases: ['pardon me', 'thank you', 'if I may say', 'please', 'shut up', 'go to hell', …].
    Annotate the data and train a classifier.
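One possible reading of the politeness feature vector is a count of how often each marker phrase occurs in a tweet. The sketch below assumes that reading. It uses only the six phrases shown on the slide (the full list is elided there), and the name `politeness_vector` is hypothetical. The resulting vectors, together with manual polite/impolite annotations, would then be the input to whatever classifier is trained.

```python
import re

# Only the phrases listed on the slide; the full list is elided in the source.
POLITENESS_PHRASES = [
    "pardon me",
    "thank you",
    "if i may say",
    "please",
    "shut up",
    "go to hell",
]

# One whole-word, case-insensitive pattern per phrase.
PHRASE_PATTERNS = [
    re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    for phrase in POLITENESS_PHRASES
]


def politeness_vector(tweet):
    """Return one occurrence count per phrase, in POLITENESS_PHRASES order."""
    return [len(pattern.findall(tweet)) for pattern in PHRASE_PATTERNS]
```

Matching runs on the raw tweet text rather than on stopword-filtered tokens, so that multi-word phrases stay intact.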
Influence Detection
    Politeness correlates positively with influence: the higher a person's politeness score, the more influential that person should be.
    Likeability is taken to be proportional to influence.
    First iteration: sum the politeness and likeability scores; if the sum is above some threshold, label the user influential.
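The first-iteration rule can be sketched as below. The slide does not say how the per-user scores are computed or what the threshold is. As an assumption, each score is taken here as the fraction of that user's tweets labelled polite (or positive), so both lie in [0, 1], and the default `threshold=1.0` is a placeholder rather than a tuned value. All function names are hypothetical.

```python
def fraction_true(labels):
    """Share of True labels, e.g. polite tweets or positive-sentiment tweets."""
    labels = list(labels)
    if not labels:
        return 0.0
    return sum(1 for label in labels if label) / len(labels)


def influence_score(politeness, likeability):
    """First-iteration score: the plain sum of the two per-user scores."""
    return politeness + likeability


def is_influential(politeness, likeability, threshold=1.0):
    """Label a user influential when the summed score exceeds the threshold."""
    return influence_score(politeness, likeability) > threshold
```

With these assumptions, a user whose own tweets are 75% polite and who is mentioned positively in 50% of tweets scores 1.25 and is labelled influential.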
Thanks
Suggestions
Twitter API Commands for Data Collection
    # myApi is assumed to be an authenticated python-twitter client (twitter.Api)

    # Tweets about a user (likeability dataset)
    raw_tweets = myApi.GetSearch(term='Donald Trump', lang='en', count=100)

    # Tweets by a user (politeness dataset)
    raw_tweets = myApi.GetUserTimeline(screen_name='realDonaldTrump', count=200, exclude_replies=True)