The Spread of Media Content through the Blogosphere


1 Flash Floods and Ripples: The Spread of Media Content through the Blogosphere
Authors: Meeyoung Cha, Juan A. Navarro, Hamed Haddadi
Affiliations: Max Planck Institute for Software Systems (MPI-SWS); TU Berlin Deutsche Telekom Lab
Venue: ICWSM Data Challenge 2009

2 Motivation
Blogs play a significant role in today’s Internet culture.
Blogs are used to propagate information:
- Discuss political issues
- Review new products and online content
- Form communities and special interest groups
Increasingly, media content is shared through blogs.
How does content spread in blogs? What kinds of content are shared?

3 Our goal
Characterize how the structure of the blogosphere influences the patterns of content spreading
1. Understand the structure of the blogosphere
   - Is the structure ideal for content dissemination?
2. Understand the spreading patterns of content
   - What types of content spread?
   - How quickly does content spread?

4 Part 1. Measurement methodology
Part 2. Analysis of network properties
Part 3. Analysis of spreading patterns

5 Spinn3r dataset
Extracted post URL, site, host, language, timestamps, etc.
Step 1: Focus on top 15 blog domains
Step 2: Scrape content to find embedded HTML links
Code available at
Limitations:
- Comments and blogrolls missing
- Some blogs only post summaries
- Only used dataset with numbered ‘tiers’

6 Step 1: Top 15 blog sites
(Site names are not present in the transcript. One further "English" label appears without a recoverable row.)

# blogs      # posts      posts/blog   Language
390,812      1,217,757    3.1          English
321,730      1,161,103    3.6          Chinese
254,225      1,666,165    6.6
72,376       1,127,383    15.6         Japanese
66,598       2,120,474    31.8
Total
1,196,412    8,794,983    7.4

7 Step 2: Extracting HTML links
- Links to media content
- Links to other blogs
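
The link-extraction step can be illustrated with a minimal sketch. It is not the authors' code: the class `LinkExtractor`, the helper names, and the two domain sets are hypothetical, and the "last two host labels" rule is a naive heuristic that mishandles hosts such as bbc.co.uk.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical domain lists, chosen only for illustration.
MEDIA_DOMAINS = {"youtube.com", "flickr.com", "photobucket.com"}
BLOG_DOMAINS = {"blogspot.com", "wordpress.com", "livejournal.com"}


class LinkExtractor(HTMLParser):
    """Collects URL targets of <a>, <embed>, <iframe> and <img> tags."""

    URL_ATTRS = {"a": "href", "embed": "src", "iframe": "src", "img": "src"}

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def registered_domain(url):
    """Naive heuristic (assumption): keep the last two labels of the host."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def classify_links(post_html):
    """Split the links of one post into media links, blog links and the rest."""
    parser = LinkExtractor()
    parser.feed(post_html)
    result = {"media": [], "blog": [], "other": []}
    for url in parser.links:
        domain = registered_domain(url)
        if domain in MEDIA_DOMAINS:
            result["media"].append(url)
        elif domain in BLOG_DOMAINS:
            result["blog"].append(url)
        else:
            result["other"].append(url)
    return result
```

Relative links have no host name and fall into the "other" bucket in this sketch.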

8 Part 1. Measurement methodology
Part 2. Analysis of network properties
Part 3. Analysis of spreading patterns

9 Network of blogs
Directed network of 85,013 nodes and 129,079 edges
[Figure: example link between blog A and blog B]

10 Network structure
Average node degree: 1.5
Power-law degree distribution
6% of links are reciprocal
35% of links cross blog domains
7% of links cross language boundaries
[ 73% of blogs are in the largest connected component ]
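
Two of these statistics can be sketched from an edge list. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the definitions used (edges per node for average degree, share of directed links whose reverse link also exists for reciprocity) are assumptions.

```python
def average_degree(num_nodes, num_edges):
    """Directed edges per node (equals both mean in-degree and mean out-degree)."""
    return num_edges / num_nodes


def reciprocal_fraction(edges):
    """Fraction of distinct directed links (a, b) for which (b, a) also exists."""
    edge_set = {(a, b) for a, b in edges if a != b}  # drop self-links
    if not edge_set:
        return 0.0
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)
```

Under this definition, the reported 85,013 nodes and 129,079 edges give about 1.52 edges per node, which is consistent with the average degree of 1.5 stated on the slide.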

11 Network structure is sparser than social networks
Density = ratio of observed links to all possible links
The blog network is sparser than typical social networks
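
The density definition can be written out as a short sketch. It assumes a directed network without self-links, so that n nodes allow n(n-1) possible links; the slide does not say which convention was used, and the function name is hypothetical.

```python
def directed_density(num_nodes, num_edges):
    """Observed links divided by all possible directed links, n * (n - 1).

    Assumption: self-links are excluded from the set of possible links.
    """
    if num_nodes < 2:
        return 0.0
    return num_edges / (num_nodes * (num_nodes - 1))
```

With the 85,013 nodes and 129,079 edges reported for the blog network, this convention gives a density of roughly 1.8e-5.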

12 Insights for information propagation
Sparse structure and power-law degree distribution
- Bloggers show a clear preference for particular topics or sources
- Trend setters (high in-degree) and recommenders (high out-degree)
Potential factors that can limit spreading
- Blog domains had no visible effect on linking
- Language barriers inhibit the flow of information

13 Part 1. Measurement methodology
Part 2. Analysis of network properties
Part 3. Analysis of spreading patterns

14 Spreading of media content
What types of content are shared?
How quickly does information spread?

15 Types of content shared
Popular sharing of user-generated content

Rank  Website            # posts
1     youtube.com        206,803
2     photobucket.com    140,194
3     flickr.com         135,327
4     imageshack.us      41,997
5     amazon.com         36,379
6     nytimes.com        33,801
7     twitter.com        30,572
8     technorati.com     27,583
9     tinypic.com        23,899
10    bbc.co.uk          20,893
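
A ranking of this kind could be produced by counting, for each linked website, the number of posts that contain at least one link to it. The sketch below is only an illustration under that assumption; the input layout (`post_links`, a mapping from post id to URLs) and the function names are hypothetical, and subdomains other than "www." are counted as separate sites.

```python
from collections import Counter
from urllib.parse import urlparse


def site_of(url):
    """Host name of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def posts_per_site(post_links):
    """Rank sites by the number of posts linking to them.

    post_links: mapping of post id -> iterable of URLs found in that post.
    Each post counts at most once per site.
    """
    counts = Counter()
    for urls in post_links.values():
        counts.update({site_of(u) for u in urls} - {""})
    return counts.most_common()
```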

16 Popularity of YouTube videos
Video popularity follows a power-law distribution:
- Very large diffusion processes exist
- Preferential attachment may drive linking
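
One common way to check a power-law claim is to estimate the exponent by maximum likelihood. The sketch below uses the standard continuous estimator alpha = 1 + n / sum(ln(x_i / x_min)); the slide does not say how the distribution was fitted, so this is an assumed technique, and for discrete link counts it is only an approximation. The function names and the synthetic sampler are hypothetical.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min=1.0):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only values >= x_min are used: alpha = 1 + n / sum(ln(x / x_min)).
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Synthetic power-law samples via inverse-transform sampling (test data only)."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]
```

Applied to per-video link counts, the estimate is sensitive to the choice of `x_min`, so a fit should be inspected on a log-log plot before drawing conclusions.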

17 Popular video categories
We downloaded metadata of the top 10,000 videos

Category               % of links   % of videos
Music                  27.2         19.6
Taken down             19.9         22.3
News & Politics        18.4         23.5
Comedy                 9.7          8.9
Entertainment          8.0          8.8
Film & Animation       3.8          4.7
People & Blogs         2.5          2.9
Science & Technology   2.4          1.5
Pets & Animals         1.7          1.2
Education              1.4          0.9

Callouts: "Music most popular", "Still spread!", "Keen on politics"

18 Time lag in the spread of videos
[Figure: median video age, with values of 2 days, 72 days, 125 days and 357 days; the range runs from "flash floods" to "ripples"]
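
A median video age per category can be sketched as follows. The sketch assumes that "age" means the time between a video's upload and the blog post linking to it; the record layout and function name are hypothetical, and the dates in the usage example are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_age_days(records):
    """Median age in days per category.

    records: iterable of (category, upload_time, post_time) tuples, where
    age is assumed to be post_time - upload_time.
    """
    ages = defaultdict(list)
    for category, uploaded, posted in records:
        ages[category].append((posted - uploaded).total_seconds() / 86400)
    return {category: median(values) for category, values in ages.items()}


# Made-up example records, for illustration only.
example = [
    ("News & Politics", datetime(2008, 10, 1), datetime(2008, 10, 2)),
    ("News & Politics", datetime(2008, 10, 1), datetime(2008, 10, 4)),
    ("News & Politics", datetime(2008, 10, 1), datetime(2008, 10, 3)),
    ("Music", datetime(2008, 1, 1), datetime(2008, 4, 10)),
]
```

A low median would correspond to the "flash flood" end of the range and a high median to the "ripple" end.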

19 Example spreading pattern
Blogs linking the same video are connected = diffusion through the blogosphere
Example: McCain’s political campaign, linked by 79 blogs
[Figure: spreading graph; legend entry "Other"]

20 Insights from spreading patterns
Videos in different genres spread with very different patterns
- Flash floods: found quickly and spread rapidly
- Ripples: took longer to spread, rediscovered years after upload
Diffusion through links in the blogosphere
- 24% of videos showed some spreading in the blog graph
- Other spreading factors: featuring and search
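
The notion of spreading in the blog graph can be made concrete with a small sketch. It assumes one possible definition: a video spreads if some blog posts it after another blog that it links to has already posted it. The slides do not spell out the exact criterion, and the function names and data layout are hypothetical.

```python
def has_spreading(postings, blog_links):
    """Check one video for spreading along blog-to-blog links.

    postings: list of (timestamp, blog) pairs for the video.
    blog_links: set of directed (source, target) pairs, meaning that
    blog `source` links to blog `target`.
    Returns True if a blog posted the video after a blog it links to.
    """
    earlier = set()
    for _, blog in sorted(postings):
        if any((blog, previous) in blog_links for previous in earlier):
            return True
        earlier.add(blog)
    return False


def fraction_with_spreading(videos, blog_links):
    """Share of videos with at least one such link-following event."""
    if not videos:
        return 0.0
    hits = sum(has_spreading(p, blog_links) for p in videos.values())
    return hits / len(videos)
```

Videos that fail this check may still have spread through other channels, such as featuring and search.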

21 Part 1. Measurement methodology
Part 2. Analysis of network properties
Part 3. Analysis of spreading patterns

22 Conclusion
Identified spreading patterns and factors that limit spreading
Blogs serve as a medium to filter and spread media content
Potential implication: recommendation systems can take into account and exploit different spreading patterns
Future work: spreading patterns of other types of content

