Discovery of Blog Communities based on Mutual Awareness


1 Discovery of Blog Communities based on Mutual Awareness
Yu-Ru Lin, Hari Sundaram, Yun Chi, Jun Tatemura, Belle Tseng
[3rd Annual Workshop on the Weblogging Ecosystems]
Advisor: Dr. Koh Jia-Ling
Reporter: Che-Wei, Liang
Date: 2008/06/19

2 Outline
Introduction
Analytical framework
Community formation
Extracting blog communities
Performance metrics
Experiment

3 Introduction
Blogs
  Have become a prominent social medium
  Enable users to publish content quickly
Blog communities differ from traditional web communities
  They form based on mutual awareness

4 Introduction
Mutual awareness
Individual bloggers become aware of each other’s presence through interaction (e.g., comments, trackbacks).
If no blogger is aware of the others’ actions, the group is not a community!

5 Introduction
New approach to community extraction, in two steps:
(a) Analysis of mutual awareness from bloggers’ actions.
(b) Ranking-based community extraction from mutual awareness.

6 Analytical framework (1/3)
A community emerges through individual bloggers’ behavior
Individual bloggers are aware of each other’s presence through interaction (bi-directional property)
A community needs to be sustained over time

7 Analytical framework (2/3)
Properties of blogs
1. Temporal dynamics: blogs represent easily editable content.
2. Event locality: a typical blog entry is time sensitive.
3. Link semantics: a hyperlink can have different semantics.
4. Community centric: people who are interested in each other’s content.

8 Analytical framework (3/3)
Community extraction problem
Distinct from traditional ranking problems on the web.
The key difference is the semantics of the hyperlinked structure.
[Figure labels: Hub, Authority]

9 Community Formation
Acts of bloggers in the blogosphere
1. Surf/read
2. Create entries
3. Comment
4. Change blogroll

10 Extracting blog communities
1. Computing Mutual Awareness
2. Ranking-Based Clustering Method

11 Computing Mutual Awareness (1/7)
Mutual awareness is affected by
  Type of action
  Number of actions for each type
  When the action occurred
Mutual awareness depends on sustained actions

12 Computing Mutual Awareness (2/7)
Graph G = (V, E): nodes = bloggers, edges = connections between nodes, wij = weight on an edge, a function of mutual awareness
Mutual awareness matrix M: a weighted linear combination of action matrices

13 Computing Mutual Awareness (3/7)
For each action type k at time t, compute the temporal action matrix Xk,t
Each entry xij,k,t of the matrix is the number of times the k-th action ak was performed by blogger i on blogger j.
E.g., blogger i leaves a comment on blogger j’s entry.
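
A minimal Python sketch of how such temporal action matrices could be assembled from an action log. The log format (source blogger, target blogger, action type, time period) and the name build_action_matrices are assumptions made for illustration; they do not come from the presentation.

```python
def build_action_matrices(actions, num_bloggers):
    """Return {(k, t): matrix}, where matrix[i][j] counts how often
    blogger i performed action type k on blogger j during period t."""
    matrices = {}
    for i, j, k, t in actions:
        if (k, t) not in matrices:
            # Start every new (action type, period) pair with an all-zero matrix.
            matrices[(k, t)] = [[0] * num_bloggers for _ in range(num_bloggers)]
        matrices[(k, t)][i][j] += 1
    return matrices


# Hypothetical log entries: (source blogger, target blogger, action type, week).
log = [
    (0, 1, "comment", 1),
    (0, 1, "comment", 1),
    (2, 0, "trackback", 1),
    (1, 0, "comment", 2),
]
X = build_action_matrices(log, num_bloggers=3)
print(X[("comment", 1)])
```

Keeping one matrix per (action type, period) pair leaves the counts directional, so any later weighting or decay can be applied per action type.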

14 Computing Mutual Awareness (4/7)
[Figure: example with action type 1 (yellow), action type 2 (blue), and action type 3 (red), and the resulting matrices X1,t and X2,t]

15 Computing Mutual Awareness (5/7)
The effect of actions on mutual awareness diminishes gradually.
(λk is the decay factor for action type k)
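
The decay formula itself is not reproduced in this transcript. As an illustration only, the sketch below assumes a geometric recurrence, A_t = X_t + decay * A_{t-1}, in which older actions count for less each period. The recurrence, the function name decayed_awareness, and the sample values are all assumptions, and the authors' exact formula may differ.

```python
def decayed_awareness(matrices_over_time, decay):
    """Accumulate per-period action matrices with geometric decay.

    ASSUMPTION: uses A_t = X_t + decay * A_{t-1}. This is one plausible
    form of "diminishes gradually", not the formula from the slides.
    """
    n = len(matrices_over_time[0])
    acc = [[0.0] * n for _ in range(n)]
    for x_t in matrices_over_time:
        # Shrink the accumulated history, then add the current period's counts.
        acc = [[x_t[i][j] + decay * acc[i][j] for j in range(n)] for i in range(n)]
    return acc


# Hypothetical counts for one action type over three consecutive weeks.
weeks = [
    [[0, 2], [0, 0]],
    [[0, 0], [1, 0]],
    [[0, 1], [0, 0]],
]
A = decayed_awareness(weeks, decay=0.5)
print(A)
```

With a decay value below 1, a pair of bloggers keeps a high score only if they keep interacting, which matches the point that mutual awareness depends on sustained actions.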

16 Computing Mutual Awareness (6/7)
Two specific types of actions
(a) Create an entry-to-entry link (mij = 0 if mij < λm)
(b) Send a trackback (r denotes how likely the trackback receiver is to become aware of the trackback sender through the trackback)

17 Computing Mutual Awareness (7/7)
Fusion of actions
Assume the actions ak in the action set are independent of each other.
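
Since the mutual awareness matrix M is described as a weighted linear combination of action matrices, the fusion step can be sketched as a weighted sum. The weights, the action names, and the function name fuse_actions below are hypothetical; the transcript does not give the actual weights.

```python
def fuse_actions(action_matrices, weights):
    """Combine per-action matrices into one matrix as a weighted sum.

    action_matrices: {action type: n x n matrix}
    weights: {action type: weight}  (hypothetical values, not from the slides)
    """
    n = len(next(iter(action_matrices.values())))
    fused = [[0.0] * n for _ in range(n)]
    for k, matrix in action_matrices.items():
        w = weights[k]
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * matrix[i][j]
    return fused


per_action = {
    "comment": [[0, 2], [1, 0]],
    "trackback": [[0, 1], [0, 0]],
}
weights = {"comment": 0.6, "trackback": 0.4}
M = fuse_actions(per_action, weights)
print(M)
```

A plain weighted sum is only reasonable under the stated independence assumption, because it treats each action type's contribution separately.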

18 Ranking-Based Clustering Method
Start from highly ranked blogs to extract dense subgraphs that include popular bloggers
1. Choose seeds based on PageRank (rt is the vector of ranking scores r, α is the damping factor)
2. Find community members associated with the seeds
3. Iterate steps 1 and 2 to discover N communities, excluding blogs already in existing communities
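
The three steps can be sketched as follows. PageRank is computed with standard power iteration. The membership rule (a blog joins a seed's community when the edge weights in both directions reach min_weight) is a simplifying assumption, because the transcript does not say how members are found. The sketch also ranks once on the full graph rather than re-ranking per iteration. The function names and the toy graph are hypothetical.

```python
def pagerank(weights, alpha=0.85, iterations=100):
    """Power-iteration PageRank on a weighted adjacency matrix (weights[i][j]: i -> j)."""
    n = len(weights)
    rank = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_rank = [(1.0 - alpha) / n] * n
        for i in range(n):
            if out_sum[i] == 0:
                # Dangling node: spread its rank uniformly over all nodes.
                for j in range(n):
                    new_rank[j] += alpha * rank[i] / n
            else:
                for j in range(n):
                    if weights[i][j]:
                        new_rank[j] += alpha * rank[i] * weights[i][j] / out_sum[i]
        rank = new_rank
    return rank


def extract_communities(weights, num_communities, min_weight):
    """Seed-based extraction. ASSUMPTION: the membership rule is illustrative only."""
    n = len(weights)
    rank = pagerank(weights)
    assigned = set()
    communities = []
    for _ in range(num_communities):
        # Step 3: exclude blogs that already belong to a community.
        candidates = [v for v in range(n) if v not in assigned]
        if not candidates:
            break
        # Step 1: the highest-ranked remaining blog becomes the seed.
        seed = max(candidates, key=lambda v: rank[v])
        # Step 2: add blogs strongly connected to the seed in both directions.
        members = {seed}
        for v in candidates:
            if v != seed and weights[seed][v] >= min_weight and weights[v][seed] >= min_weight:
                members.add(v)
        assigned |= members
        communities.append(sorted(members))
    return communities


# Hypothetical weighted graph with two separate groups of bloggers.
W = [
    [0, 3, 3, 0, 0],
    [3, 0, 1, 0, 0],
    [3, 1, 0, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 2, 0],
]
communities = extract_communities(W, num_communities=2, min_weight=2)
print(communities)
```

Requiring the weight threshold in both directions reflects the bi-directional nature of mutual awareness. Other membership criteria would fit the same loop.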

19 Performance Metrics
Coverage
  Measures the fraction of edges that are intra-community
  Higher coverage means higher quality
Conductance
  Smaller conductance means higher quality
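
A small sketch of both metrics on a weighted graph. Coverage follows the description above (intra-community edge weight over total edge weight). The transcript does not define conductance, so the code assumes the standard graph-theoretic definition: the weight of edges leaving a cluster divided by the smaller of the cluster's volume and the volume of the rest of the graph. The function names and the toy graph are hypothetical.

```python
def coverage(weights, communities):
    """Fraction of total edge weight that falls inside communities."""
    label = {}
    for c, members in enumerate(communities):
        for v in members:
            label[v] = c
    n = len(weights)
    total = intra = 0.0
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            total += w
            if i in label and label.get(j) == label[i]:
                intra += w
    return intra / total if total else 0.0


def conductance(weights, members):
    """ASSUMPTION: standard definition, cut(C) / min(vol(C), vol(rest))."""
    inside = set(members)
    n = len(weights)
    cut = vol_in = vol_out = 0.0
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if i in inside:
                vol_in += w
                if j not in inside:
                    cut += w  # Edge leaving the cluster.
            else:
                vol_out += w
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Hypothetical symmetric weighted graph with two communities.
W = [
    [0, 2, 1, 0],
    [2, 0, 0, 0],
    [1, 0, 0, 3],
    [0, 0, 3, 0],
]
parts = [[0, 1], [2, 3]]
print(coverage(W, parts))
print(conductance(W, parts[0]))
```

In this toy graph most of the edge weight stays inside the two groups, so coverage is high and the conductance of each group is low, which is the pattern both metrics treat as good.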

20 Performance Metrics Interest Coefficient
Measures how much a community member is interested in his or her assigned community.
Interest coefficient of an individual blogger m toward an assigned cluster Ck
Interest coefficient of a cluster Ck

21 Performance Metrics Sustainability
Average sustainability for a set C of k extracted communities

22 Experimental Results
NEC Blog Dataset
127,467 entries over a period of 25 consecutive weeks
584 seed blogs
40,877 links
2,898 trackback links

23 Experimental Results
Compare the communities extracted using two different features
Baseline adjacency matrix (wij = number of entry-to-entry links)
Mutual awareness matrix (wij = mutual awareness score)

24 Experimental Results Experimental design

25 Experimental Results

26 Experimental Results Sustainability of communities

27 Experimental Results
Use the content of blog entries to validate communities
Red: trackback links; blue: entry-to-entry links

28 Experimental Results

29 Experimental Results
WWW 2006 Blog Workshop Dataset
8.37 million entries
1.43 million blog sites (narrowed down to 122K)

30 Experimental Results

31 Experimental Results

32 Experimental Results

