Information Propagation Speed and Patterns in Social Networks: a Case Study Analysis of German Tweets International Conference of Algorithms, Computing.

Presentation transcript:

Information Propagation Speed and Patterns in Social Networks: a Case Study Analysis of German Tweets International Conference of Algorithms, Computing and Systems, 10th-Aug, South Korea Raad Bin Tareaf – Social Media Analysis Internet Technologies and Systems Hasso Plattner Institut- Digital Engineering Faculty

Motivation “Local Trends will allow you to learn more about the nuances in our world and discover even more relevant topics that might matter to you.” - By Twitter Inc./blogsite. Information propagation Information Propagation Speed Raad Bin Tareaf, Internet Technologies and Systems

What is Information Propagation Speed? Twitter launched in March 2006 and provides structured data. Social networks are graph-based. The number of interactions (re-tweets) between users within a timeframe is called scale; the infection builds a cascade (range).

Related Work Kwak et al.: "influencers can be identified by calculating the page rank for a set of Twitter users". Hennig et al.: "used information retrieval approaches to identify trends inside unstructured blog data". Yang et al.: "divided the diffusion into three major properties: speed, scale, range". There is a lack of information concerning the difference between locally and worldwide diffused tweets.

Data Collection: Cronjob scheduler (every 15 minutes): top 10 trends, 4 places per day. APIs: live streaming search to collect tweets. JSON processing: metadata, status, tweetID. Data cleansing. Pushes data into a database (local: MySQL, remote: HANA). Save to .json.
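
The JSON processing step could look roughly like the following sketch. It assumes tweets arrive in the Twitter v1.1 JSON layout (`id_str`, `text`, `lang`, `created_at`, `retweeted_status`); the function name `extract_record`, the output keys, and the sample tweet are hypothetical and not taken from the slides.

```python
import json
from datetime import datetime

def extract_record(raw_json):
    """Pull the fields needed for later analysis out of one tweet JSON string."""
    tweet = json.loads(raw_json)
    return {
        "tweet_id": tweet["id_str"],
        "text": tweet.get("text", ""),
        "lang": tweet.get("lang"),
        # v1.1 timestamps look like "Sun Jan 10 20:19:24 +0000 2016".
        "created_at": datetime.strptime(
            tweet["created_at"], "%a %b %d %H:%M:%S %z %Y"
        ),
        # A retweet carries the original tweet under "retweeted_status".
        "is_retweet": "retweeted_status" in tweet,
    }

# Hypothetical sample tweet, for illustration only.
sample = (
    '{"id_str": "1", "text": "#DavidBowie", "lang": "en", '
    '"created_at": "Sun Jan 10 20:19:24 +0000 2016"}'
)
record = extract_record(sample)
```

The resulting dictionary is a plain row that could then be cleaned and written to whichever database is in use.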

Concept – Trends For each trend, place all tweets that were posted during the same hour in one chunk. Why? Analyzing the resulting histograms reveals propagation patterns: - For some trends, the number of tweets decreases during the local night time - Others do not decrease and stay active
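
A minimal sketch of the hourly chunking, assuming each tweet is represented by a timezone-aware timestamp. The function name `hourly_histogram` and the sample timestamps are illustrative and not from the slides.

```python
from collections import Counter
from datetime import datetime, timezone

def hourly_histogram(timestamps):
    """Count tweets per hour bucket (UTC); input are timezone-aware datetimes."""
    counts = Counter()
    for ts in timestamps:
        # Truncate to the start of the hour so all tweets of that hour share a key.
        bucket = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
        counts[bucket] += 1
    return dict(sorted(counts.items()))

# Hypothetical sample: two tweets in the 09:00 hour, one in the 10:00 hour.
sample = [
    datetime(2016, 1, 10, 9, 5, tzinfo=timezone.utc),
    datetime(2016, 1, 10, 9, 40, tzinfo=timezone.utc),
    datetime(2016, 1, 10, 10, 1, tzinfo=timezone.utc),
]
histogram = hourly_histogram(sample)
```

Plotting the values of such a histogram per trend is what makes a night-time dip (or its absence) visible.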

Indicators The time points (i - 1, i, i + 1) are investigated: count the number of retweets over a day and check whether there is a significant increase (51%) in the retweet amount for each hour bucket. If so, there is enough change in the tweet/hashtag status to trigger the categorization process using four main indicators:
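
The check for a significant increase can be sketched as follows. The slide does not say exactly how the 51% is computed, so this sketch assumes a relative increase of hour i over hour i - 1; the function name and the sample counts are hypothetical.

```python
def significant_increases(hourly_retweets, threshold=0.51):
    """Return indices i whose count exceeds the previous hour by >= threshold (relative)."""
    flagged = []
    for i in range(1, len(hourly_retweets)):
        prev, cur = hourly_retweets[i - 1], hourly_retweets[i]
        # Skip empty previous buckets to avoid dividing by zero (an assumption).
        if prev > 0 and (cur - prev) / prev >= threshold:
            flagged.append(i)
    return flagged

# Hypothetical retweet counts for six consecutive hour buckets.
counts = [100, 120, 200, 210, 90, 300]
flagged = significant_increases(counts)
```

Here the jumps from 120 to 200 and from 90 to 300 are flagged, while the smaller changes are ignored.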

Automated Analysis Indicators Day-Night Circle Indicator Night-Inactivity Indicator Language Indicator Short-Day trend Indicator

Day-Night Circle Indicator Trends which are only valid for a certain region/country follow the characteristics of a day-night cycle.

2. Night-Inactivity Indicator All trends classified as local by the Day-Night cycle are also considered local by the Night-Inactivity indicator. The opposite? Then it is a clear indication that the trend is becoming global.
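
One way to express night inactivity in code is sketched below. The slides give no concrete rule, so the night window (local hours 0 to 5) and the 25% ratio are assumptions chosen only for illustration, as is the function name.

```python
def night_inactive(hourly_counts, night_hours=range(0, 6), ratio=0.25):
    """hourly_counts: 24 tweet counts indexed by local hour (0-23).

    Returns True when mean night activity is below `ratio` of mean day activity.
    The night window and the ratio are illustrative assumptions.
    """
    if len(hourly_counts) != 24:
        raise ValueError("expected exactly 24 hourly counts")
    night = [hourly_counts[h] for h in night_hours]
    day = [c for h, c in enumerate(hourly_counts) if h not in night_hours]
    day_mean = sum(day) / len(day)
    if day_mean == 0:
        return False  # no daytime activity, nothing to compare against
    return (sum(night) / len(night)) / day_mean < ratio

# Hypothetical profiles: one goes quiet at night, one stays active.
local_profile = [2] * 6 + [100] * 18
global_profile = [80] * 6 + [100] * 18
```

A trend whose profile keeps returning False across several days would, under these assumptions, be a candidate for a global trend.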

2. Night-Inactivity Indicator

3. Language Indicator Map of tweets and languages: are most of the tweets (>80%) in one language? Does not apply to English. #DavidBowie: 84% English. #Diekmann: 95% German.
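
The language rule can be sketched as below, assuming each tweet carries a language code such as "de" or "en". The function name, the return shape, and the synthetic samples (built only to mirror the two percentages on the slide) are hypothetical.

```python
from collections import Counter

def language_indicator(langs, share=0.8, exempt=("en",)):
    """Return (dominant language, its share, is_local) for a trend's tweets.

    A trend counts as local when one language exceeds `share` of the tweets,
    unless that language is exempt (English does not indicate locality).
    """
    if not langs:
        return None, 0.0, False
    lang, n = Counter(langs).most_common(1)[0]
    frac = n / len(langs)
    return lang, frac, frac > share and lang not in exempt

# Synthetic samples mirroring the slide's shares: 95% German, 84% English.
diekmann = ["de"] * 95 + ["en"] * 5
bowie = ["en"] * 84 + ["de"] * 16
diekmann_result = language_indicator(diekmann)
bowie_result = language_indicator(bowie)
```

Under this sketch #Diekmann is flagged as local, while #DavidBowie is not, because its dominant language is English.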

4. Short Day Trend Indicator The trend does not even cover a whole day-night cycle. → Genuine indicator of locality, since global trends stay active much longer.
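
A minimal sketch of this indicator, assuming a full day-night cycle means 24 hours and that a trend's lifetime is the span between its first and last observed tweet. Both are assumptions; the function name and sample timestamps are illustrative.

```python
from datetime import datetime, timedelta

def is_short_day_trend(timestamps, full_cycle=timedelta(hours=24)):
    """True if the trend's observed activity spans less than one full cycle."""
    if not timestamps:
        return False
    return max(timestamps) - min(timestamps) < full_cycle

# Hypothetical trends: one active for 11 hours, one for more than two days.
short = [datetime(2016, 1, 10, 8), datetime(2016, 1, 10, 19)]
long_running = [datetime(2016, 1, 9, 8), datetime(2016, 1, 11, 19)]
```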

4. Short Day Trend Indicator

Dataset Statistics (1) Number of fully structured tweets: 1.2 million. Number of trending hashtags: 291. Twitter uses WOEIDs to identify places around the world.

Dataset Statistics (2)

Languages Distribution for Hashtags

Dataset Insights Dataset contains most followed users: ~@KatyPerry (95.431.353) ~@JustinBieber (90.886.180) ~@TaylorSwift13 (83.115.401) ~@BarackObama (80.652.888) Most active user: 6.843.102 tweets, ~@notiven (second most: 6.628.870) → bots

Detecting Local to Global Transformation The algorithm checks iteratively every hour whether there is a significant change in the retweet amount over the past 3 days. Factors taken into consideration: tweet scale, time irregularities, language distribution.

Local vs. Global Trends (Time, Language, Location)

Local vs. Global Trends (9th – 11th JAN)

Observations External influences. Bots with an enormous amount of tweets. Considering every re-tweet as a relation, calculating HITS scores produces high hub and authority scores for users who get retweeted often. Limitations of the Twitter API.
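
The HITS computation mentioned above can be sketched with a plain power iteration. This is a generic textbook version, not the authors' implementation; it assumes each retweet is a directed edge from the retweeting user to the original author, and the user names are made up.

```python
def hits(edges, iterations=50):
    """Compute HITS hub and authority scores by power iteration.

    edges: list of (retweeter, original_author) pairs (assumed direction).
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the users who retweeted this user.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub: sum of authority scores of the users this user retweeted.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical retweet relations.
retweets = [("alice", "star"), ("bob", "star"), ("carol", "star"), ("alice", "bob")]
hub, auth = hits(retweets)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

With this edge direction, the frequently retweeted account ends up with the highest authority score, and the account that retweets the most authoritative users gets the highest hub score.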

Future Work Analyse the influence of social media bots. Build ranking predictions for influencers. Verify the local-global categorization on further datasets. Identify new propagation patterns.

Conclusion Implementation of automated analysis indicators: 1. Day-Night Circle Indicator 2. Night-Inactivity Indicator 3. Language Indicator 4. Short-Day trend Indicator. Distinguishing local and global trends makes it possible to discover how big the influence of a trend is, how fast it propagates, and in which way a similar trend will evolve. Friend connections, once identified, can be used for initial marketing campaigns.

Discussion
[1] J. Yang and S. Counts. Predicting the speed, scale, and range of information diffusion in Twitter. ICWSM, 10:355-358, 2010.
[2] A. Java, X. Song, T. Finin and B. Tseng. Why we twitter: understanding microblogging usage and communities. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, pages 56-65. ACM, 2007.
[3] H. Kwak, C. Lee, H. Park and S. Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web, pages 591-600. ACM, 2010.
[4] M. Cha, H. Haddadi, F. Benevenuto and P.K. Gummadi. Measuring user influence in Twitter: The million follower fallacy. ICWSM, 10(10-17):30, 2010.
[5] P. Hennig, P. Berger, C. Lehmann, A. Mascher and C. Meinel. Accelerate the detection of trends by using sentiment analysis within the blogosphere. In Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on, pages 503-508. IEEE, 2014.
[6] M. Naaman, H. Becker and L. Gravano. Hip and trendy: Characterizing emerging trends on Twitter. Journal of the Association for Information Science and Technology, 62(5):902-918, 2011.
Dataset is available: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:73236
Implementation: https://github.com/raadbintareaf/IPS_Meassurement