Yasuko Matsubara (Kyoto University),


Rise and Fall Patterns of Information Diffusion: Model and Implications Yasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), B. Aditya Prakash (CMU), Lei Li (UCB), Christos Faloutsos (CMU) KDD 2012 Y. Matsubara et al.

Motivation: Social media facilitate faster diffusion of news and rumors. Q: How do news and rumors spread in social media?

News spread in social media: MemeTracker [Leskovec et al. KDD’09] tracks short phrases sourced from U.S. politics in 2008. Example plots (# of mentions in blogs, per hour, 1 week): “you can put lipstick on a pig” and “yes we can”.

News spread in social media (annotated): the same MemeTracker plots [Leskovec et al. KDD’09] for “you can put lipstick on a pig” and “yes we can” (# of mentions in blogs, per hour, 1 week), with three phases marked: breaking news, news spread, and decay.

Rise and fall patterns in social media: Twitter (# of hashtags per hour): “#assange” and “#stevejobs” (per hour, 1 week each). Google trend (# of queries per week): “tsunami” (in 2005; per week, 1 year) and “harry potter” (2010 - 2011; per week, 2 years).

Rise and fall patterns in social media: How many patterns are there? Earlier work claims there are several classes: four classes on YouTube [Crane et al. PNAS’08] and six classes on media [Yang et al. WSDM’11].

Rise and fall patterns in social media: Q. How many classes are there after all? A. Our answer is “ONE”! We can represent all patterns with a single model.

Outline: Motivation; Problem definition; Proposed method; Experiments; Discussion – SpikeM at work; Conclusions.

Problem definition. Goal: predict/model social activity. Problem 1 (What-if?). Given: (1) network of bloggers/users, (2) external shock/event, (3) quality of the event β. Find: how blogging activity will evolve over time.

Problem definition. Goal: predict/model social activity. Problem 2 (Model design). Given: behavior of spikes. Find: an equation/model that can explain them, e.g., the number of potential bloggers, the strength of the external shock, the quality of the event β, and the epidemic process by word-of-mouth.

Outline: Motivation; Problem definition; Proposed method; Experiments; Discussion – SpikeM at work; Conclusions.

Proposed method: SpikeM captures 3 properties of real spikes: 1. periodicities.

Proposed method: SpikeM captures 3 properties of real spikes: 1. periodicities; 2. avoid infinity.

Proposed method: SpikeM captures 3 properties of real spikes: 1. periodicities; 2. avoid infinity; 3. power-law fall.

Proposed method: SpikeM captures 3 properties of real spikes: 1. periodicities; 2. avoid infinity; 3. power-law fall. SpikeM captures the behavior of real spikes using few parameters.

Main idea (details): 1. Un-informed bloggers (clique of N bloggers/nodes) at time n=0. Nodes (bloggers) are in one of two states: U (un-informed of rumor) and B (informed, and blogged about rumor).

Main idea (details): 1. Un-informed bloggers (clique of N bloggers/nodes) at time n=0. 2. External shock at time n_b (e.g., breaking news): an event happens at time n=n_b, and bloggers who are informed blog about the news.

Main idea (details): 1. Un-informed bloggers (clique of N bloggers/nodes) at time n=0. 2. External shock at time n_b (e.g., breaking news). 3. Infection (word-of-mouth effects) from time n=n_b+1. The infectiveness of a blog post combines the strength of infection β (quality of news) with a decay function (how infective a blog posting is as it ages).

Main idea (details), decay function: 1. Un-informed bloggers (clique of N bloggers/nodes) at time n=0. 2. External shock at time n_b (e.g., breaking news). 3. Infection (word-of-mouth effects) from time n=n_b+1. The infectiveness of a blog post combines the strength of infection β (quality of news) with a decay function (how infective a blog posting is as it ages). The decay function is a power law with exponent -1.5, plotted in linear and log scales.
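
The power-law decay on this slide can be illustrated with a few lines of Python. This is a minimal sketch, assuming the infectiveness takes the form β · τ^(-1.5), where τ is the age of a post in time-ticks; the function names `decay` and `loglog_slope` are hypothetical and are not taken from the authors' code.

```python
import math


def decay(tau: int, beta: float) -> float:
    """Infectiveness of a post that is `tau` time-ticks old (assumed form: beta * tau^-1.5)."""
    if tau < 1:
        raise ValueError("tau must be at least 1")
    return beta * tau ** -1.5


def loglog_slope(beta: float, tau1: int, tau2: int) -> float:
    """Slope of the decay curve between two ages when both axes are logarithmic."""
    rise = math.log(decay(tau2, beta)) - math.log(decay(tau1, beta))
    run = math.log(tau2) - math.log(tau1)
    return rise / run
```

On a log-log plot the curve is a straight line, so `loglog_slope` returns the exponent (about -1.5) for any pair of ages and any positive β.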

SpikeM-base (details): Equations of SpikeM (base), relating the blogged and un-informed counts. Terms: total population of available bloggers N; strength of infection/news β; external shock at birth (time n_b); background noise.
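
The equations themselves appear only as images on the slide. The block below is a reconstruction from the terms the slide lists, offered as a sketch to be checked against the KDD 2012 paper: ΔB(n) is the number of bloggers who blog at time n, U(n) the number still un-informed, and the symbols S_b (size of the external shock) and ε (background noise) are assumed names.

```latex
\begin{align}
\Delta B(n+1) &= U(n) \cdot \sum_{t=n_b}^{n} \bigl(\Delta B(t) + S(t)\bigr) \cdot f(n+1-t) + \epsilon \\
U(n+1) &= U(n) - \Delta B(n+1) \\
f(\tau) &= \beta \cdot \tau^{-1.5} \\
S(t) &= \begin{cases} S_b & \text{if } t = n_b \\ 0 & \text{otherwise} \end{cases}
\end{align}
```

The initial conditions would be ΔB(0) = 0 and U(0) = N. Because new posts are proportional to the shrinking pool U(n), activity cannot grow without bound, which is the "avoid infinity" property.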

SpikeM with periodicity (details): The full equation of SpikeM adds a periodicity term to the blogged/un-informed equations. Bloggers change their activity over time (e.g., daily, weekly, yearly): for example, peak activity at 12pm and low activity at 3am.
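
A small simulation can make the model concrete. The following is a sketch under stated assumptions, not the authors' implementation: the sinusoidal form of the periodicity term, the parameter names (`s_b`, `eps`, `p_a`, `p_s`, `p_p`), and the clamp that keeps new posts within the remaining un-informed pool are all assumptions added for illustration.

```python
import math


def periodicity(n, p_a, p_s, p_p=24):
    """Activity multiplier in [1 - p_a, 1]; p_a = strength, p_s = phase shift, p_p = period."""
    return 1.0 - 0.5 * p_a * (math.sin(2.0 * math.pi * (n + p_s) / p_p) + 1.0)


def simulate_spikem(N, beta, n_b, s_b, eps, p_a=0.0, p_s=0.0, p_p=24, steps=120):
    """Return a list whose entry n is the number of new posts at time-tick n."""
    delta_b = [0.0] * steps
    uninformed = float(N)  # everyone starts un-informed
    for n in range(steps - 1):
        infection = 0.0
        # Sum the decayed influence of the shock and of every earlier post.
        for t in range(n_b, n + 1):
            shock = s_b if t == n_b else 0.0
            infection += (delta_b[t] + shock) * beta * (n + 1 - t) ** -1.5
        new_posts = periodicity(n + 1, p_a, p_s, p_p) * (uninformed * infection + eps)
        # Guard (assumption): never convert more bloggers than remain un-informed.
        new_posts = min(max(new_posts, 0.0), uninformed)
        delta_b[n + 1] = new_posts
        uninformed -= new_posts
    return delta_b
```

For example, `simulate_spikem(N=1000, beta=0.001, n_b=5, s_b=10.0, eps=0.0)` stays at zero until the shock, rises, and then falls as the un-informed pool empties. Setting `p_a` above zero superimposes a daily cycle when one time-tick is one hour.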

Model fitting (details): SpikeM consists of 7 parameters. Learning the parameters: given a real time sequence, minimize the error using Levenberg-Marquardt (LM) fitting.
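
The fitting step amounts to choosing the parameters that minimize the squared error between the observed sequence and the model output. The sketch below swaps in a coarse grid search over a single parameter (β) in place of Levenberg-Marquardt, purely to stay within the standard library; it is not the authors' fitting code. The base-model equations, the clamp, and all names (`simulate_base`, `squared_error`, `fit_beta`) are assumptions made for illustration.

```python
def simulate_base(N, beta, n_b, s_b, eps, steps):
    """Assumed SpikeM-base sketch: new posts per time-tick, without periodicity."""
    delta_b = [0.0] * steps
    uninformed = float(N)
    for n in range(steps - 1):
        infection = sum(
            (delta_b[t] + (s_b if t == n_b else 0.0)) * beta * (n + 1 - t) ** -1.5
            for t in range(n_b, n + 1)
        )
        # Guard (assumption): keep new posts within the remaining un-informed pool.
        new_posts = min(max(uninformed * infection + eps, 0.0), uninformed)
        delta_b[n + 1] = new_posts
        uninformed -= new_posts
    return delta_b


def squared_error(observed, model):
    """Sum of squared differences between two equally long sequences."""
    return sum((x - m) ** 2 for x, m in zip(observed, model))


def fit_beta(observed, N, n_b, s_b, eps, candidates):
    """Grid search (stand-in for LM): return the candidate beta with the smallest error."""
    steps = len(observed)
    return min(
        candidates,
        key=lambda b: squared_error(observed, simulate_base(N, b, n_b, s_b, eps, steps)),
    )
```

A quick sanity check is to generate a sequence with a known β and confirm that `fit_beta` recovers it from a list of candidates. A real fit would optimize all 7 parameters jointly, which is where a gradient-based method such as LM is preferable to a grid.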

Analysis: SpikeM matches reality, with an exponential rise and a power-law fall. Comparison: SpikeM vs. SI model (susceptible-infected model).

Analysis (rise part): SpikeM: exponential; SI model: exponential. Plots use a reversed x-axis, in linear-log and log-log scales.

Analysis (fall part): SpikeM: power law; SI model: exponential. SpikeM matches reality. Plots are in linear-log and log-log scales.
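
The choice of plot scales follows from taking logarithms of the two candidate fall shapes. In the sketch below, c, a, α, and λ are generic positive constants introduced only for illustration; they are not symbols from the slides.

```latex
\begin{align}
\text{power law:}\quad x(\tau) = c\,\tau^{-\alpha}
  &\;\Longrightarrow\; \log x(\tau) = \log c - \alpha \log \tau \\
\text{exponential:}\quad x(\tau) = a\,e^{-\lambda \tau}
  &\;\Longrightarrow\; \log x(\tau) = \log a - \lambda \tau
\end{align}
```

A power-law fall is therefore a straight line on a log-log plot, while an exponential fall is a straight line on a linear-log plot and bends downward on a log-log plot. That difference is what lets the two plots separate SpikeM from the SI model in the fall part.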

Outline: Motivation; Problem definition; Proposed method; Experiments; Discussion – SpikeM at work; Conclusions.

Experiments: We answer the following questions. Q1. Match real spikes: Q1-1 K-SC clusters; Q1-2 MemeTracker; Q1-3 Twitter; Q1-4 Google trend. Q2. Forecast future patterns.

Q1-1 Explaining K-SC clusters: Six patterns of K-SC [Yang et al. WSDM’11]. SpikeM can generate all patterns in K-SC.

Q1-2 Matching MemeTracker patterns: MemeTracker (memes in blogs) [Leskovec et al. KDD’09]. SpikeM can fit various patterns in blogs. Plots in linear and log scales show noise-robust fitting in the presence of outliers.

Q1-3 Matching Twitter data: Twitter data (hashtags). SpikeM can generate various patterns in social media (plots in linear and log scales).

Q1-4 Matching Google trend data: Volume of searches for queries on Google. SpikeM can capture various patterns.

Q2 Tail-part forecasts: Given the first part of a spike, forecast the tail part. SpikeM can capture the tail part, whereas AR fails.

Outline: Motivation; Problem definition; Proposed method; Experiments; Discussion – SpikeM at work; Conclusions.

SpikeM at work: SpikeM is capable of various applications: A1. What-if forecasting; A2. Outlier detection; A3. Reverse engineering.

A1. “What-if” forecasting: Forecast not only the tail part, but also the rise part! E.g., given (1) the first spike, (2) the release dates of two sequel movies, and (3) the access volume in the two weeks before each release date, what will the upcoming spikes look like?

A1. “What-if” forecasting: Forecast not only the tail part, but also the rise part! SpikeM can forecast upcoming spikes from (1) the first spike, (2) the release date, and (3) the two weeks before release.

A2. Outlier detection: Fitting result of “tsunami” (Google trend) in log-log scale. Marked events: the Indian Ocean earthquake, another earthquake, and one year after the Indian Ocean earthquake.
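
One simple way to turn a fitted curve into an outlier detector is to flag time-ticks where the observed volume is far above what the model predicts. The sketch below is an illustrative rule only: the ratio test, the `factor` threshold, and the `floor` value are assumptions, and the slides do not state how the authors mark outliers.

```python
def flag_outliers(observed, fitted, factor=3.0, floor=1.0):
    """Return indices where observed activity exceeds the fitted value by more than `factor`.

    Both values are floored at `floor` so that near-zero fitted values
    do not produce huge ratios from tiny absolute differences.
    """
    flagged = []
    for n, (x, m) in enumerate(zip(observed, fitted)):
        if max(x, floor) / max(m, floor) > factor:
            flagged.append(n)
    return flagged
```

Given an observed series with a second bump that the fitted single-spike curve does not contain, the function returns the positions of that bump, much as the later earthquake and the one-year anniversary stand out from the fitted “tsunami” curve.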

A3. Reverse engineering: SpikeM provides an intuitive explanation. PDF of parameters over 1,000 memes/hashtags (panels: Meme and Twitter).

A3. Reverse engineering: SpikeM provides an intuitive explanation. PDF of parameters over 1,000 memes/hashtags (panels: Meme and Twitter). Observation 1: Total population N is almost the same.

A3. Reverse engineering: SpikeM provides an intuitive explanation. PDF of parameters over 1,000 memes/hashtags (panels: Meme and Twitter). Observation 2 concerns the strength of the first burst (news).

A3. Reverse engineering: SpikeM provides an intuitive explanation. PDF of parameters over 1,000 memes/hashtags (panels: Meme and Twitter). Observation 3: Daily periodicity with phase shift. Meme: every meme has the same periodicity without lag. Twitter: daily periodicity with more spread in the phase shift (i.e., multiple time zones).

Outline: Motivation; Background; Proposed method; Experiments; Discussion – SpikeM at work; Conclusions.

Conclusions: SpikeM has the following advantages. Unification power: it includes earlier patterns/models. Practicality: it matches real datasets. Parsimony: it requires only 7 parameters. Usefulness: what-if scenarios, outliers, etc.

Acknowledgements: Thanks to Jaewon Yang & Jure Leskovec for the six clusters [WSDM’11]. Funding.

Thank you. Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos. Code: http://www.kecl.ntt.co.jp/csl/sirg/people/yasuko/software.html Email: matsubara.yasuko lab.ntt.co.jp