A Network Science Approach to Fake News Detection on Social Media Isayas Adhanom
Introduction Social media are computer-mediated technologies that facilitate the creation and sharing of information, ideas, career interests and other forms of expression via virtual communities and networks. There are more than 2.46 billion active social media users in the world (more than ⅓ of the world population).
Introduction Contd... It is easier and less expensive to produce news on social media than in traditional news media. Social media allow users to comment on, share and discuss the news with other users. 67% of American adults get news on social media (Pew Research Center). During the 2016 U.S. presidential election, the most popular fake news stories were more widely shared on Facebook than the most popular mainstream news stories.
Introduction Contd... Most news on social media is not fact-checked or filtered. Social media platforms are attractive to fake news creators, because it is easy to disseminate fake news on social media and there is a large audience. Fake news: News that is deliberately created to deceive the reader into believing something that is not true.
Introduction Contd... Fake news can have a negative impact on the consumer, society or the nation. Examples: 1. A fake news article about explosions at the White House injuring President Obama, spread by a compromised Associated Press account on Twitter, resulted in a loss of $136.5 billion in the market cap of the S&P 500. 2. A man travelled from Salisbury, NC to Washington, DC to investigate the #Pizzagate conspiracy theory, and fired an assault rifle inside the Comet Ping Pong restaurant in northwest Washington.
Introduction Contd... Current detection strategies: 1. News consumer awareness. 2. Manual flagging. 3. Artificial intelligence based content filtering methods. 4. Machine learning and data mining based outlier detection methods.
Project Idea Due to the increasing complexity of fake news, it is not easy to detect it on social media using only one technique. Social media platforms can all be viewed as large graphs (networks) of entities of different types with various interactions among them. By modeling these interactions as graphs (networks), network science can provide insights beyond what we can see and understand from individual posts.
Project Idea Contd ... We will use a dataset of news posts that were posted on Facebook in October 2016. The news posts have been manually categorized based on their truthfulness, type of post, and political inclination. We will model the user engagements with these posts and the relations among the users as a complex network. We will analyze the graph to spot suspicious user or post behaviour.
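As a rough illustration of the modeling step, the sketch below builds a bipartite user-post engagement graph and computes, for each user, the fraction of engaged posts that carry a "false" label. The toy data, the label values, and the function names are assumptions made for this example; they are not taken from the project dataset.

```python
from collections import defaultdict

# Hypothetical toy data: (user, post) engagement pairs and manual post labels.
engagements = [
    ("u1", "p1"), ("u1", "p2"), ("u2", "p1"),
    ("u2", "p2"), ("u3", "p3"), ("u4", "p3"), ("u4", "p1"),
]
post_label = {"p1": "false", "p2": "false", "p3": "true"}


def build_bipartite(pairs):
    """Return two adjacency maps: user -> posts and post -> users."""
    user_posts, post_users = defaultdict(set), defaultdict(set)
    for user, post in pairs:
        user_posts[user].add(post)
        post_users[post].add(user)
    return user_posts, post_users


def false_engagement_ratio(user_posts, labels):
    """Fraction of each user's engaged posts that are labeled false."""
    return {
        user: sum(labels[p] == "false" for p in posts) / len(posts)
        for user, posts in user_posts.items()
    }


user_posts, post_users = build_bipartite(engagements)
ratios = false_engagement_ratio(user_posts, post_label)
```

A high ratio alone does not make a user suspicious; it is only one simple signal that could be combined with structural properties of the graph.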
Related Work
Information Credibility on Twitter (Castillo et al.) Analyzes microblog postings related to trending topics and classifies them as credible or not credible based on features extracted from them. Message-based features: e.g. the length of a message, whether or not the text contains exclamation marks or question marks, and the number of positive or negative sentiment words in a message. User-based features: e.g. registration age, number of followers, number of followees, and the number of tweets the user has authored in the past.
Information Credibility on Twitter (Castillo et al.) Contd ... Topic-based features: e.g. the fraction of tweets that contain URLs, the fraction of tweets with hashtags, and the fractions of tweets with positive and negative sentiment in a set. Propagation-based features: e.g. the depth of the retweet tree and the number of initial tweets of a topic.
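To make the feature families more concrete, here is a minimal sketch of how a few message-based and topic-based features might be computed. The tiny sentiment word lists, the regular expressions, and the function names are assumptions for illustration only; they are not the feature extractors used by Castillo et al.

```python
import re

# Hypothetical tiny sentiment lexicons, for illustration only.
POSITIVE = {"good", "great", "true"}
NEGATIVE = {"bad", "fake", "terrible"}


def message_features(text):
    """Message-based features for a single post."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "length": len(text),
        "has_exclamation": "!" in text,
        "has_question": "?" in text,
        "positive_words": sum(w in POSITIVE for w in words),
        "negative_words": sum(w in NEGATIVE for w in words),
    }


def topic_features(texts):
    """Topic-based features aggregated over all posts of one topic."""
    n = len(texts)
    return {
        "url_fraction": sum(bool(re.search(r"https?://\S+", t)) for t in texts) / n,
        "hashtag_fraction": sum(bool(re.search(r"#\w+", t)) for t in texts) / n,
    }
```

The resulting dictionaries could be turned into feature vectors for any standard classifier.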
User Behavior Modeling with Large-Scale Graph Analysis (Alex Beutel) Models user behaviour in social media. The problem is divided into three broad aspects: 1. Modelling abnormal behaviour, e.g. detecting accounts working in lockstep. 2. Modelling normal behaviour: creating models of how normal users are supposed to act. 3. Scaling machine learning.
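The idea of accounts working in lockstep can be sketched with a simple heuristic: flag pairs of users who engage with the same posts within the same time window several times. This is a simplified illustration, not the method from the cited work; the event format, the `window` and `min_shared` parameters, and the function name are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def lockstep_pairs(events, window=3600, min_shared=2):
    """Return user pairs that co-engage in at least `min_shared` (post, window) buckets.

    events: iterable of (user, post, unix_timestamp) tuples.
    """
    # Group users by the post they engaged with and the time bucket of the engagement.
    buckets = defaultdict(set)
    for user, post, ts in events:
        buckets[(post, ts // window)].add(user)

    # Count, for every pair of users, how many buckets they share.
    shared = defaultdict(int)
    for users in buckets.values():
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)] += 1

    return {pair for pair, n in shared.items() if n >= min_shared}
```

Fixed time buckets miss pairs whose actions fall just across a bucket boundary, so a sliding window would be a natural refinement.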
Leveraging the Implicit structure within Social Media for Emergent Rumor Detection (Sampson et al.) Proposed a method for classifying conversations in their early stages, and tried to improve the accuracy of classification within mature conversations by discovering implicit linkages. They used two methods for implicit linkage detection: 1. Hashtag linkage: link posts that use the same hashtag. 2. Web linkage: link posts that mention similar web links.
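Hashtag linkage can be illustrated with a short sketch that links any two posts sharing at least one hashtag. The post format, the case-insensitive matching, and the function name are assumptions for this example and may differ from what Sampson et al. implemented.

```python
import re
from collections import defaultdict
from itertools import combinations


def hashtag_links(posts):
    """Link posts that share a hashtag.

    posts: dict mapping post id -> text.
    Returns a set of (id, id) pairs, each with the smaller id first.
    """
    # Index posts by each (lowercased) hashtag they contain.
    by_tag = defaultdict(set)
    for pid, text in posts.items():
        for tag in re.findall(r"#(\w+)", text.lower()):
            by_tag[tag].add(pid)

    # Every pair of posts under the same hashtag becomes an implicit link.
    links = set()
    for ids in by_tag.values():
        links.update(combinations(sorted(ids), 2))
    return links
```

Web linkage could follow the same pattern, indexing posts by the links they mention instead of by hashtag.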
Rumors in a Network: Who is the Culprit (Shah and Zaman) Worked on the problem of finding the source of a rumor in a network. They proposed a rumor spreading model based on the SIR (susceptible-infected-recovered) model. They defined the problem as a maximum likelihood (ML) estimation problem.
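The maximum likelihood formulation can be written compactly as follows. The notation is an assumption made for this sketch: $G_N$ denotes the observed subgraph of $N$ nodes that have received the rumor, and $v^*$ denotes the true, unknown source.

```latex
\hat{v} \in \arg\max_{v \in G_N} \; P\left(G_N \mid v^* = v\right)
```

In words, the estimated source $\hat{v}$ is a node that makes the observed set of infected nodes most probable under the spreading model.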
Detecting Rumor and Disinformation by Web Mining (Boris Galitsky) Proposed a method to detect if a given text is a rumor or disinformation. Based on the hypothesis that most rumors/disinformation are based on some genuine information. The method tries to identify the original source of the information and then detect all the rumors using linguistic means.
Conclusion The structure of social media platforms can be leveraged to allow for the detection of fake news. The relations among the users and their engagement with posts on social media can be used to create rich graphs (networks). Analyzing these networks can allow us to spot unusual behaviour in users and posts. This information can be used to enrich other fake news detection strategies.
Thank You!