Analysis and Forecasting of Trending Topics in Online Media Streams 1 ACM MM 2013 Tim Althoff, Damian Borth, Jörn Hees, Andreas Dengel German Research.

Presentation transcript:

Analysis and Forecasting of Trending Topics in Online Media Streams 1 ACM MM 2013 Tim Althoff, Damian Borth, Jörn Hees, Andreas Dengel German Research Center for Artificial Intelligence (DFKI) ALL RIGHTS RESERVED. No part of this work may be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system without expressed written permission from the authors.

2 Motivation

3 Large-scale Record of Human Behavior
Primary method for communication by college students in the U.S.
340 million tweets each day by 500 million users
One billion search requests and about twenty petabytes of user-generated data each day (2009)

4 Motivation
People use the web for news, information, and communication
Online activity becomes indicative of the interests of the global population
› „Mirror of Society“ [Steven Van Belleghem, 2012]

5 Use Cases of Online Activity

6 Tracking Flu Infections
[Figure label: flu remedy]
[Ginsberg et al. Detecting influenza epidemics using search engine query data. Nature]

7-8 Relevance for the MM Community
Panel: Cross-Media Analysis and Mining
Focus more on the time dimension
How do topics of interest develop over time?
Trend-aware recommendation systems
Automatic vocabulary selection for classification

9 Shortcomings of Related Work
1. Focus on very specific use cases (e.g. flu)
2. Nowcasting instead of forecasting
3. Manual correlation analysis instead of fully automatic forecasting
4. No comparison or study of different online and social media channels with respect to emerging trends
[Choi and Varian. Predicting initial claims for unemployment benefits. Google, Inc.]
[Choi and Varian. Predicting the Present with Google Trends. Economic Record, 2012.]
[Ginsberg et al. Detecting influenza epidemics using search engine query data. Nature]
[Cooper et al. Cancer internet search activity on a major search engine. Journal of Medical Internet Research, 2005.]
[Goel et al. Predicting consumer behavior with web search. National Academy of Sciences, 2010.]
[Jin et al. The wisdom of social multimedia: using Flickr for prediction and forecast. ACM MM 2010.]
[Adar et al. Why we search: visualizing and predicting user behavior. WWW, 2007.]

10-11 Goals
1. Multi-channel analysis of trending topics
2. Fully automatic forecasting of trending topics

12 Trending Topics

13 Trending Topics
A trending topic is a subject matter of discussion that is agreed on by a group of people and experiences a sudden spike in user interest or engagement.
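
To make "sudden spike" concrete, the sketch below flags a day whose value exceeds the trailing mean by a multiple of the trailing standard deviation. This is an illustrative assumption only and not the discovery method of the talk, which crawls trend sources daily; the function name `find_spikes`, the 7-day window, the threshold of 3, and the toy series are all hypothetical.

```python
import numpy as np

def find_spikes(series, window=7, threshold=3.0):
    """Return indices whose value lies far above the trailing window.

    Hypothetical rule: value > mean + threshold * std of the previous
    `window` observations.
    """
    series = np.asarray(series, dtype=float)
    spikes = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mean, std = past.mean(), past.std()
        # Guard against a flat history, where std would be zero.
        if series[i] > mean + threshold * max(std, 1.0):
            spikes.append(i)
    return spikes

# Toy daily view counts (made up for illustration).
daily_views = [100, 102, 98, 101, 99, 100, 103, 500, 480, 120]
spike_days = find_spikes(daily_views)
```

Only the first day of the jump is flagged in this toy series, because the jump itself inflates the trailing mean and standard deviation for the days that follow.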

14-15 Trending Topics Discovery
Daily crawling of 10 sources
Capturing: communication needs (Twitter), search patterns (Google), information demand (Wikipedia)
[Figure labels: Google; Trends, Search Germany, Search USA, News Germany, News USA, English Wikipedia, German Wikipedia, Trends Germany, Trends USA, Daily Trends]
Dataset
Observation period – Sept – Sept ,000 items analyzed
~ 3,000 trending topics discovered
[Damian Borth, Adrian Ulges, Thomas Breuel. Dynamic Vocabularies for Web-based Concept Detection by Trend Discovery. ACM Multimedia, 2012]

16-17 Trending Topics Discovery

18 Goals
1. Multi-channel analysis of trending topics
2. Fully automatic forecasting of trending topics

19 Multi-Channel Analysis

20 Example
Neil Armstrong died on August 25, 2012.

21 Lifetime Analysis

22-25 Topic Category Analysis

26 Forecasting

27 Forecasting
[Figure: time axis, observed values, forecast]

28-32 Basic Forecasting Techniques
Naive
Linear Trend
Average/Median
ARMA
Nearest Neighbor
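
The simpler baselines named on the Basic Forecasting Techniques slides can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the toy series, and the 3-day horizon are hypothetical, and ARMA and nearest neighbor forecasting are left out.

```python
import numpy as np

def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return np.full(horizon, history[-1], dtype=float)

def average_forecast(history, horizon):
    """Repeat the mean of the history for every future step."""
    return np.full(horizon, np.mean(history), dtype=float)

def median_forecast(history, horizon):
    """Repeat the median of the history for every future step."""
    return np.full(horizon, np.median(history), dtype=float)

def linear_trend_forecast(history, horizon):
    """Fit a straight line to the history and extrapolate it."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)
    future_t = np.arange(len(history), len(history) + horizon)
    return slope * future_t + intercept

# Toy daily view counts (made up for illustration).
views = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
baseline_forecasts = {
    "naive": naive_forecast(views, 3),
    "average": average_forecast(views, 3),
    "median": median_forecast(views, 3),
    "linear_trend": linear_trend_forecast(views, 3),
}
```

On a steadily rising series like this one, only the linear trend baseline continues the rise; the other three predict a constant level.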

33 Example Signal

34 Forecasting Can Be Very Challenging

35-36 Recurring Signal

37-38 What To Do In This Case?
Semantics!

39 Use Semantically Similar Topics

40-42 Approach

43 Discovering Semantically Similar Topics
Trending Topic: “Olympics 2012”

44 Nearest Neighbor Sequence Matching
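
One way to read nearest neighbor sequence matching is: take the most recent window of the topic to be forecast, slide it over the historical series of semantically similar topics, pick the best-matching window, and use the values that followed it as the forecast. The sketch below rests on assumptions the slide does not state: Euclidean distance, scaling each window by its mean, and a single nearest neighbor. The function name `nearest_neighbor_forecast` and the toy data are hypothetical.

```python
import numpy as np

def nearest_neighbor_forecast(query, candidates, horizon):
    """Forecast `horizon` steps after `query` from the best-matching
    window found in `candidates` (a list of 1-D sequences).

    Returns None if no candidate is long enough.
    """
    query = np.asarray(query, dtype=float)
    w = len(query)
    q_scale = query.mean() or 1.0  # avoid dividing by zero
    best_dist, best_forecast = np.inf, None
    for series in candidates:
        series = np.asarray(series, dtype=float)
        # Each window must leave room for `horizon` following values.
        for start in range(len(series) - w - horizon + 1):
            window = series[start:start + w]
            scale = window.mean() or 1.0
            dist = np.linalg.norm(window / scale - query / q_scale)
            if dist < best_dist:
                continuation = series[start + w:start + w + horizon]
                best_dist = dist
                # Rescale the continuation to the level of the query.
                best_forecast = continuation / scale * q_scale
    return best_forecast

# Toy example: the query doubles each day, like the second candidate.
recent = [2, 4, 8]
similar_topics = [[1, 1, 1, 1, 1, 1], [1, 2, 4, 8, 16, 32]]
nn_forecast = nearest_neighbor_forecast(recent, similar_topics, horizon=2)
```

Scaling by the window mean lets a small topic borrow the shape of a much larger one; whether and how the original system normalizes is not stated on the slide.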

45-50 Forecasting

51 Evaluation

52 Dataset
Public dataset of Wikipedia view statistics
Historical data from 2008 until today
Large-scale: 5 million articles attracting 870 million views each day
2.8 TB compressed
→ Semantic similarity between 5 million articles
→ 9 billion sequences to choose from
Use 200 top trending topics for evaluation
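
The slide does not say how the figure of 9 billion sequences is obtained. One plausible reading, offered here as an assumption only, is that every article contributes one candidate sequence per starting day of the observation period. The period length used below (about five years, 2008 through the 2013 talk) is likewise an assumption.

```python
articles = 5_000_000   # from the slide
days_per_year = 365
years_covered = 5      # assumption: roughly 2008 through 2013

# Assumption: one candidate sequence per article and starting day.
start_days = days_per_year * years_covered
candidate_sequences = articles * start_days

print(f"{candidate_sequences:,}")  # 9,125,000,000
```

Under these assumptions the count lands close to the 9 billion quoted on the slide.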

53 Discovering Semantically Similar Topics

54-69 Example: “The Hunger Games”
[Figure: Wikipedia article views over time in days; labels: Trending Topic Trigger, 14 day horizon, Different models]

70 Quantitative Results

71-73 Root Mean Square Error
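
For reference, root mean square error is the square root of the mean squared difference between actual and forecast values. The sketch below shows the standard definition; the function name `rmse` and the toy numbers are assumptions for illustration, and the slides' actual results are not reproduced here.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Toy values (made up): errors of 10, -10 and 0 views.
rmse_value = rmse([100, 200, 300], [110, 190, 300])
```

Because the errors are squared, RMSE is in the same unit as the series (views) and is dominated by high-traffic topics.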

74-76 Mean Average Percentage Error
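
The percentage error named on these slides is usually defined as the mean absolute percentage error (MAPE): the average of the absolute errors, each divided by the actual value. The sketch below assumes that standard definition; the function name `mape` and the toy numbers are hypothetical.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when `actual` contains zeros.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Toy values (made up): relative errors of 10%, 5% and 0%.
mape_value = mape([100, 200, 300], [110, 190, 300])
```

Unlike RMSE, this measure is relative, so topics with very different traffic levels contribute on a comparable scale.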

77 Conclusion

78 Main Contributions
1. Multi-channel analysis of trending topics
2. Fully automatic forecasting technique exploiting semantic relationships between topics
3. Empirical evaluation on a large-scale Wikipedia dataset

79 Insights
1. Trends on Twitter and Wikipedia are significantly more ephemeral than on Google.
2. All observed media channels tend to specialize in specific topic categories.
3. Semantically similar topics exhibit similar behavior.
4. Exploiting this observation can significantly improve forecasting performance (by 20-90% compared to baselines).

80 Acknowledgments

81 Questions?
Thank you for your attention
Tim Althoff, Damian Borth, Jörn Hees, Andreas Dengel

82 Backup Slides

83 Motivation
Big Data: “High volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization”
Large-scale user behavior
› „Mirror of Society“
[Mark A. Beyer and Douglas Laney. The importance of ’big data’: A definition. June 2012.]

84 Large-scale Record of Human Behavior
1.2 million video minutes per second (est. 2016)
Primary method for communication by college students in the U.S.
25 million articles and 1.8 billion page edits in 285 languages created by 100k contributors
340 million tweets each day by 500 million users
One billion search requests and about twenty petabytes of user-generated data each day (2009)

85 Unemployment Claims
[Figure label: jobs]
[Choi and Varian. Predicting initial claims for unemployment benefits. Google, Inc.]

86 What is the world talking about?
Social media can offer insights into what the world is talking about and how people feel about events, brands, and products, resulting in a collective awareness of trending topics.
A trending topic is a subject matter of discussion that is agreed on by a group of people and experiences a sudden spike in user interest or engagement.
[Damian Borth, Adrian Ulges, Thomas Breuel. Dynamic Vocabularies for Web-based Concept Detection by Trend Discovery. ACM Multimedia, 2012]

87-88 Trending Topics Discovery

89-92 Lifetime Analysis

93-108 Example 2: “Battlefield 3”
[Figure: Wikipedia article views over time in days; labels: Trending Topic Trigger, 14 day horizon]