Published by Buck James. Modified over 9 years ago.
1
A Latent Social Approach to YouTube Popularity Prediction
Amandianeze Nwana, Prof. Salman Avestimehr, Prof. Tsuhan Chen
2
Statistics
Up to 60% of all videos are watched through YouTube [1]
[1] P. Gill, Arlitt, Li, Mahanti, “YouTube Traffic Characterization: A View From the Edge”
[2] http://www.sandvine.com
3
Statistics: Campus View
The most popular videos globally typically account for only about 1% of the videos viewed on campus daily [1]
The correlation coefficient between global popularity and local popularity is too low [2]
[1] P. Gill, Arlitt, Li, Mahanti, “YouTube Traffic Characterization: A View From the Edge”
[2] http://www.sandvine.com
4
Typical Request Patterns
Source: UMass Amherst YouTube trace dataset (1 week)
[Figure annotations: “Romnesia speech” goes viral: conventional approach catches them too late. Conventional approach catches them: Sports Highlights, Music Videos.]
5
Main Idea
“lol… did you see that video?”
Requests are correlated in time (and space) because of some hidden social contagion process.
6
Main Idea
Can a record of the transactions reveal information about the network structure and graph?
Can the network structure and a record of the transactions predict future trends?
7
Traditional Caching
[Diagram labels: Gateway Router, YouTube Server, Local Network, Requests, Cache, Response]
8
Predictive (Social) Caching
[Diagram labels: Gateway Router, Transactions, Requests, Cache, Latent Network]
9
Goal
10
Challenge
[Diagram labels: Gateway Router, Transactions, Latent Network]
11
Estimating the Social Network: Mathematical Epidemiology
12
Mathematical Epidemiology: Compartmental Models
13
Diffusion Model: 2 Stages
– Stage 1: Decide who gets infected by whom, independently
– Stage 2: “Decide” the time of infection (observed symptoms)
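The two stages above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the presenters' implementation: the `influence` matrix of infection probabilities, the function name `simulate_cascade`, and the Pareto (power-law) transmission delay with exponent parameter `alpha` are all hypothetical choices made for the example.

```python
import heapq
import random


def simulate_cascade(influence, seed_user, rng, alpha=1.5):
    """Simulate one cascade; returns {user: infection_time}.

    influence[u][v] is an assumed probability that user u infects user v.
    """
    n = len(influence)

    # Stage 1: decide independently, for every ordered pair, whether u
    # would infect v. Each "live" edge also gets a sampled delay, which is
    # used in Stage 2. The Pareto delay is an assumption of this sketch.
    live_delay = {}
    for u in range(n):
        for v in range(n):
            if u != v and rng.random() < influence[u][v]:
                live_delay[(u, v)] = rng.paretovariate(alpha)

    # Stage 2: the infection time of a user is the earliest arrival time
    # over live edges, computed here with Dijkstra's algorithm.
    times = {}
    heap = [(0.0, seed_user)]
    while heap:
        t, u = heapq.heappop(heap)
        if u in times:
            continue  # already infected at an earlier time
        times[u] = t
        for v in range(n):
            if (u, v) in live_delay and v not in times:
                heapq.heappush(heap, (t + live_delay[(u, v)], v))
    return times
```

Only the returned infection times would be visible at the gateway; who infected whom stays hidden, which is what makes the network latent.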
14
Latent Social Network Inference
[Figure labels: t=8, t=1, t=3, t=6, t=4, t=2, t=6; t=10]
15
Latent Social Network Inference
[Figure labels: 1, 3, 4, 6, 8, 2, 10]
16
Inference Steps (Maximum Likelihood Estimation)
Inference occurs in two stages:
– Stage 1: Given the transactions, fit the inter-arrival times to a power law
– Stage 2: Given the estimated power law and the transactions, find the influence matrix that maximizes the likelihood of the observed transactions
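Stage 1 can be illustrated with the standard maximum-likelihood estimator for a continuous power law, p(x) ∝ x^(-alpha) for x >= x_min, which gives alpha = 1 + n / Σ ln(x_i / x_min). The slides do not say which fitting procedure was used, so the estimator, the function name `fit_power_law_exponent`, and the cutoff `x_min` are assumptions of this sketch. Stage 2 (estimating the influence matrix) is not covered here.

```python
import math


def fit_power_law_exponent(inter_arrivals, x_min):
    """Estimate alpha for p(x) proportional to x**(-alpha), x >= x_min.

    Uses the closed-form MLE: alpha = 1 + n / sum(ln(x_i / x_min)).
    Samples below x_min are ignored.
    """
    tail = [x for x in inter_arrivals if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all samples equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum
```

A quick sanity check is to draw synthetic inter-arrival times with `random.paretovariate(a)`, whose density is proportional to x^(-(a+1)), and confirm that the estimate lands near a + 1.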
17
… back to caching
– We now have the social graph over the network of users
– We need a video relevance function to assign relevance scores to videos
– Rank the videos according to relevance scores and store the top K videos in the cache
18
Video Relevance Combine temporal score with social score.
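One way to picture the ranking step is a weighted sum of the two scores followed by a top-K selection. The slides do not give the actual combination rule, so the convex combination, the `weight` parameter, and the function names below are hypothetical and serve only as a sketch.

```python
def combined_relevance(temporal, social, weight=0.5):
    """Blend per-video temporal and social scores (assumed convex combination).

    temporal and social map video id -> score; a missing score counts as 0.
    """
    videos = set(temporal) | set(social)
    return {
        v: weight * temporal.get(v, 0.0) + (1.0 - weight) * social.get(v, 0.0)
        for v in videos
    }


def top_k_videos(scores, k):
    """Return the k highest-scoring video ids; ties are broken by video id."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [video for video, _ in ranked[:k]]
```

For example, `top_k_videos(combined_relevance(temporal, social), K)` would give the set of videos to keep in a cache of size K.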
19
Model Deficiencies
In reality, not all requests can be modeled by diffusion processes:
– Influence external to the network (news sites, aggregators, etc.)
– User preferences/tastes
Insufficient data leads to many isolated vertices:
– On our dataset, 60% of users are isolated vertices
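The share of isolated vertices can be read off an estimated influence matrix by counting users with no incoming or outgoing edge. The matrix representation, the `threshold` parameter, and the function name are assumptions made for this sketch; the slides do not describe how the figure was computed.

```python
def isolated_fraction(influence, threshold=0.0):
    """Fraction of users with no in- or out-edge above threshold.

    influence[u][v] is the assumed estimated influence of user u on user v.
    Self-influence (the diagonal) is ignored.
    """
    n = len(influence)
    if n == 0:
        return 0.0
    isolated = 0
    for u in range(n):
        has_out = any(influence[u][v] > threshold for v in range(n) if v != u)
        has_in = any(influence[v][u] > threshold for v in range(n) if v != u)
        if not (has_out or has_in):
            isolated += 1
    return isolated / n
```

Isolated users contribute nothing to a purely social score, so for them a ranking would have to rely on the temporal score alone.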
20
Results
Table 1: Percentage improvements of algorithms using all users
Comparison               % Improve
Inter-Arrival/CRF        11.6
Combined/Inter-Arrival   13.2
Fig. 1: Average hit rate for all approaches over different cache sizes
Fig. 2: Cache size comparison between purely social and baseline using all users
Few useful cascades lead to many isolated nodes
21
Results: Connected Users
Table 2: Percentage improvements of algorithms without isolated nodes
Comparison     % Improve
Temporal/CRF   15.6
Combined/CRF   21.1
Fig. 3: Average hit rate for all approaches over different cache sizes without isolated nodes
Fig. 4: Cache size comparison between purely social and baseline without isolated nodes
22
Future Directions
– Explore other epidemic models
– On-the-fly update of nodes and edges
– Graph clustering into different communities and influence groups
– User recommendations using the social graph
– Object detection and tagging via Twitter