Attention and Event Detection Identifying, attributing and describing spatial bursts Early online identification of attention items in social media Louis.

1 Attention and Event Detection Identifying, attributing and describing spatial bursts Early online identification of attention items in social media Louis Gong louisgong@ust.hk www.louisgong.com

2 Identifying, attributing and describing spatial bursts Problem Description Related Works Solution Experiment & Result Q&A Michael Mathioudakis, Nilesh Bansal, and Nick Koudas. 2010. Identifying, attributing and describing spatial bursts. Proc. VLDB. 3, 1-2 (September 2010), 1091-1102.

3 BlogScope Automatically collects information (blogosphere, news sources, social networks, online forums). Supports advanced information retrieval tasks with data mining and language processing. Warehouses metadata about the content (time of creation, demographic profile of the author).

4 Problem Description User-generated content that appears on blogs, microblogging websites, wikis and social networks proliferates at profound rates. The goal is to automate the process of information discovery given this vast collection of information. Examples: Barack Obama (2008), Bin Laden (recently).

5 Related Works 1. J. Kleinberg. Bursty and hierarchical structure in streams. In KDD, 2002. Proposes a model for burst identification over document streams. 2. J. M. Kleinberg and E. Tardos. Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields. J. ACM, 2002. Provides a 2-approximation linear programming algorithm applicable to the spatial burst detection problem. 3. Statistical discrepancy functions are used to quantify the difference between distributions and are commonly used to identify regions where two spatial distributions differ significantly. Such regions can be interpreted as areas where one spatial distribution exhibits a burst in comparison with the other.

6 Solution Identify spatial bursts. Burst attribution. Keyword-based description.

7 Spatial Bursts G: grid; for a suitable choice of granularity, geographical entities of interest (cities) correspond to a cell. Rs: the spatial distribution of related documents published within t. Ds: the spatial distribution of all the documents published within t. Spatial bursts are identified as cells for which the value of Rs is large in comparison with Ds.
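As a rough illustration of comparing Rs with Ds cell by cell, the sketch below normalizes per-cell document counts into two distributions and flags cells whose share of related documents exceeds their share of all documents by a chosen factor. The plain ratio test, the function names and the `min_ratio` threshold are assumptions made for illustration; the paper itself builds on discrepancy functions and approximation algorithms rather than this simple rule.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) of a grid cell; an illustrative choice


def normalize(counts: Dict[Cell, int]) -> Dict[Cell, float]:
    """Turn per-cell document counts into a spatial distribution."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize an empty set of documents")
    return {cell: n / total for cell, n in counts.items()}


def bursty_cells(
    related_counts: Dict[Cell, int],
    all_counts: Dict[Cell, int],
    min_ratio: float = 2.0,  # hypothetical threshold, not taken from the slides
) -> List[Cell]:
    """Return the cells where Rs is large compared with Ds (simple ratio test)."""
    rs = normalize(related_counts)
    ds = normalize(all_counts)
    bursty = []
    for cell, r in rs.items():
        d = ds.get(cell, 0.0)
        if d > 0 and r / d >= min_ratio:
            bursty.append(cell)
    return sorted(bursty)


related = {(0, 0): 8, (0, 1): 1, (1, 0): 1}
everything = {(0, 0): 20, (0, 1): 40, (1, 0): 40}
print(bursty_cells(related, everything))
```

Here cell (0, 0) holds 80% of the related documents but only 20% of all documents, so it is the only cell flagged.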

8 Burst Attribution Attribute the burst to profile features. 1. Focus on a specific set of bursty cells and ask which demographic factors are those in the absence of which no burst would have been detected (e.g. “Toronto Film Festival”). 2. Compare a bursty region with a non-bursty region and find the demographic factors that make the difference.
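The first attribution question can be pictured as a leave-one-out test: drop every document whose author has a given demographic value and check whether the cell would still look bursty. The sketch below is only an assumption-laden illustration of that idea. Documents are plain dictionaries, and the `burst_ratio` measure, the `min_ratio` threshold and all names are hypothetical rather than the paper's technique.

```python
from typing import Dict, List

Doc = Dict[str, str]  # e.g. {"cell": "A", "age": "young"}; illustrative schema


def burst_ratio(related: List[Doc], all_docs: List[Doc], cell: str) -> float:
    """Share of related documents in `cell` divided by share of all documents."""
    if not related or not all_docs:
        return 0.0
    r = sum(doc["cell"] == cell for doc in related) / len(related)
    d = sum(doc["cell"] == cell for doc in all_docs) / len(all_docs)
    return r / d if d > 0 else 0.0


def attributing_values(
    related: List[Doc],
    all_docs: List[Doc],
    cell: str,
    feature: str,
    min_ratio: float = 2.0,  # hypothetical burst threshold
) -> List[str]:
    """Feature values whose removal makes the burst in `cell` disappear."""
    result = []
    for value in sorted({doc[feature] for doc in all_docs}):
        related_without = [doc for doc in related if doc[feature] != value]
        all_without = [doc for doc in all_docs if doc[feature] != value]
        if burst_ratio(related_without, all_without, cell) < min_ratio:
            result.append(value)
    return result
```

The result is meaningful only when the cell is bursty on the full data, so a caller would check `burst_ratio(related, all_docs, cell) >= min_ratio` first.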

9 Keyword-based description of bursts Query Expansion: identify the keywords w_i highly related to q (bursts for a query q) and consider the expanded queries q ∪ w_i. Curve Estimation: the keywords w that occur frequently together with q often exhibit a burst themselves over the same interval. q'[t]_est = (1 + ) · min{ b(q)[t], b(w_i)[t] }
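The curve estimate takes, at each time step, the smaller of the two burst curves and scales it by a factor of the form (1 + something). The added term is not legible in this transcript, so the sketch below exposes it as a hypothetical `slack` parameter defaulting to 0; the list-based curve representation is likewise an assumption.

```python
from typing import List, Sequence


def estimate_expanded_curve(
    b_q: Sequence[float],
    b_w: Sequence[float],
    slack: float = 0.0,  # hypothetical stand-in for the term lost after "1 +"
) -> List[float]:
    """Estimate the burst curve of the expanded query q ∪ w, point by point."""
    if len(b_q) != len(b_w):
        raise ValueError("both curves must cover the same time steps")
    return [(1.0 + slack) * min(x, y) for x, y in zip(b_q, b_w)]


b_query = [1.0, 5.0, 3.0]    # b(q)[t]
b_keyword = [2.0, 4.0, 0.0]  # b(w_i)[t]
print(estimate_expanded_curve(b_query, b_keyword))
```

Taking the minimum reflects that a document matching both q and w_i can be no more frequent than either term alone at any given time step.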

10 Experiment & Result Average running time of the algorithms

11 Experiment & Result Queries q were submitted to BlogScope, with temporal interval qt set as the first 10 days of March 2009. Retrieving distributions Rs and Ds for a query.

12 Experiment & Result Parameter Sensitivity

13 Summary Scalable method to identify spatial information bursts. Efficient techniques to attribute bursts to specific demographic factors. Techniques to analyze bursts and effectively identify sets of keywords that describe the burst.

14 Early online identification of attention items in social media Problem Description ISIS Model Experiment Result Q&A Michael Mathioudakis, Nick Koudas, and Peter Marbach. 2010. Early online identification of attention items in social media. In Proceedings of the third ACM international conference on Web search and data mining (WSDM '10). ACM, New York, NY, USA, 301-310.

15 Problem Description Activity in social media is manifested via interactions that involve text, images, links and other information items. Naturally, some items attract more attention than others, expressed as large volumes of linking, commenting or tagging activity. Identifying the information items that gather much attention in such a real-time information collective is a challenging task.

16 Comparison (traditional & social media) Traditional webpages – Graph Model (PageRank). Differences: 1. Social media activity is associated with individual documents, pictures and news articles, so it is reasonable to measure the importance or attention-gathering potential of different items separately. 2. Linking activity in social media is the product of continuous interaction between participating individuals. The dynamic aspects of this process are not captured by the graph model.

17 Comparison (traditional & social media) 3. Linking is not the only action by which structure arises in social media, as individuals also interact by commenting, sharing, recommending or rating.

18 Subject Proposes the first formal definition and analysis of such a model and uses it as a basis to identify attention-gathering items in an online fashion. The aim is to identify individual items that attract a significant number of actions; the main focus is ‘early identification’ of such items.

19 ISIS Model An abstraction of social media activity. Information units (units) – items such as blog posts, status messages, photos, etc. in a social media stream. Information sources (sources) – individuals contributing information. A source participates in two sets of stochastic processes: 1. The process of emitting information units in a streaming fashion. 2. Processes of interaction with other sources.

20 ISIS Model Each unit is associated with a timestamp tp and a validity period dp. The validity periods of units emitted by the same source might overlap.
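A minimal data-structure sketch of an information unit, assuming that a unit emitted at tp stays valid over the half-open interval [tp, tp + dp). The interval convention, the class name and the helper are illustrative assumptions, not definitions taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """An information unit emitted by a source (hypothetical representation)."""

    source: str  # identifier of the emitting source
    tp: float    # timestamp of the unit
    dp: float    # length of its validity period

    def is_valid_at(self, t: float) -> bool:
        # Assumed convention: valid on [tp, tp + dp)
        return self.tp <= t < self.tp + self.dp


def validity_overlaps(a: Unit, b: Unit) -> bool:
    """True when the validity periods of two units intersect."""
    return a.tp < b.tp + b.dp and b.tp < a.tp + a.dp


first = Unit(source="blog-1", tp=0.0, dp=10.0)
second = Unit(source="blog-1", tp=5.0, dp=10.0)
print(validity_overlaps(first, second))
```

Both units come from the same source and are simultaneously valid between times 5 and 10, which is the overlapping case the slide mentions.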

21 ISIS Model Source interaction

22 ISIS Model Source interaction

23 ISIS Model Source interaction

24 Experiment Setting

25 Result Interaction weights of posts in (a) engadget.com (b) techcrunch.com

26 Result Attention Gathering Posts

27 Result Quality vs Efficiency Trade-offs

28 Summary ISIS Model: a general stochastic model for interacting streaming information sources. A measure for the attention-gathering potential of information units. Experimental results on real data collected from a period of blogging activity.

29 Q&A

30 Thank You Louis Gong louisgong@ust.hk www.louisgong.com

