Published by Dwayne Ryan. Modified over 9 years ago.
1
An Analysis of Facebook Photo Caching
by Huang et al., SOSP 2013
Presented by Phuong Nguyen
Some animations and figures are borrowed from the original paper and presentation.
2
Photos on Facebook: Overview
Profile, Feed, Album
250 billion photos, as of Sep 2013
3
Photos on Facebook: Overview
[Figure labels: Storage Backend, FB Cache Layers, Akamai CDN, Full-stack Study]
4
FACEBOOK PHOTO CACHING: HOW DOES IT WORK?
5
Client-based Browser Cache
[Figure labels: Client, Browser Cache, Local Fetch]
6
Geo-distributed Edge Cache (FIFO)
[Figure labels: Client, Browser Cache (millions); PoP, Edge Cache (tens)]
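The slide labels the Edge Cache as FIFO. As a rough illustration of that policy only (not Facebook's implementation), a minimal FIFO cache might look like the sketch below. The class name, the capacity and the photo keys are hypothetical.

```python
from collections import OrderedDict


class FIFOCache:
    """Fixed-capacity cache that evicts the oldest inserted key first."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # insertion order doubles as eviction order

    def get(self, key):
        # A hit does NOT refresh the entry's position; that is what
        # distinguishes FIFO from LRU.
        return self._items.get(key)

    def put(self, key, value):
        if key in self._items:
            self._items[key] = value  # update in place, keep original position
            return
        if len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # evict the oldest insertion
        self._items[key] = value


# Hypothetical usage with made-up photo keys.
cache = FIFOCache(capacity=2)
cache.put("photo_a", b"...")
cache.put("photo_b", b"...")
cache.get("photo_a")          # a hit, but it does not protect photo_a
cache.put("photo_c", b"...")  # evicts photo_a, the oldest insertion
```

Because hits never reorder entries, a frequently requested photo is evicted as soon as it becomes the oldest insertion, which is one reason the later slides compare FIFO against other algorithms.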
7
Single Global Origin Cache (FIFO)
[Figure labels: Client, Browser Cache (millions); PoP, Edge Cache (tens); Data Center, Origin Cache (four); Hash(url)]
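The Hash(url) label suggests that the Origin server for a request is chosen by hashing the photo URL, so the four data centers behave as one logical cache. A minimal sketch of that idea follows, assuming plain modulo hashing. The server names and the choice of SHA-1 are illustrative assumptions; the slide does not say which hash function or placement scheme is used.

```python
import hashlib

# Hypothetical server names; the slide only says there are four data centers.
ORIGIN_SERVERS = ["origin-dc1", "origin-dc2", "origin-dc3", "origin-dc4"]


def pick_origin(url, servers=ORIGIN_SERVERS):
    """Map a photo URL to one server using a stable hash of the URL."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and reduce it to a server index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# The same URL always maps to the same server, whichever Edge forwards it.
target = pick_origin("https://example.com/photos/123_n.jpg")
```

With this kind of routing each photo has a single home in the Origin layer, so no two data centers hold duplicate copies of the same object.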
8
Haystack Backend
[Figure labels: Client, Browser Cache (millions); PoP, Edge Cache (tens); Data Center, Origin Cache (four), Backend (Haystack)]
9
FULL-STACK CACHE STUDY: DATA COLLECTION
10
Trace Collection
Objective: collect a representative sample that permits correlation of events related to the same request
[Figure labels: Instrumentation Scope; Client, Browser Cache; PoP, Edge Cache; Data Center, Origin Cache, Backend (Haystack)]
11
Sampling Strategies
Request-based: sample requests randomly
  Biased toward popular content
Object-based: focus on a subset of photos selected by a deterministic test on photoId
  Fair coverage of unpopular photos
  Enables cross-stack analysis
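Object-based sampling can be sketched as a hash test on the photo id. The code below is only an illustration of a "deterministic test on photoId". The hash function, the 10% rate and the event format are assumptions, not details taken from the slides.

```python
import hashlib


def is_sampled(photo_id, sample_percent=10):
    """Deterministic test on photoId: a given photo is always in or always out."""
    digest = hashlib.sha256(str(photo_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < sample_percent


def sample_log(events):
    """Keep every logged event whose photo passes the test."""
    return [e for e in events if is_sampled(e["photo_id"])]


# Hypothetical log: 50 photos, each requested 10 times at some layer.
events = [{"photo_id": i % 50, "layer": "edge"} for i in range(500)]
kept = sample_log(events)
```

Since every layer applies the same test, all events for a sampled photo are kept at every layer, which is what makes cross-stack correlation possible. An unpopular photo is as likely to be selected as a popular one.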
12
WORKLOAD ANALYSIS
13
Analysis Objectives
Traffic sheltering effects of caches
Photo popularity distribution
Geographic traffic distribution & collaborative caching
Can we make the cache better?
  Impact of cache size & algorithm
  Can we know which photos to cache?
14
ANALYSIS: TRAFFIC SHELTERING
15
Traffic Sheltering
Layer                          Requests   Hit ratio   Traffic share
Browser Cache (Client)         77.2M      65.5%       65.5%
Edge Cache (PoP)               26.6M      58.0%       20.0%
Origin Cache (Data Center)     11.2M      31.8%       4.6%
Backend (Haystack)             7.6M       n/a         9.9%
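The hit ratios and traffic shares on this slide follow from the request counts: each layer serves whatever it does not forward to the next one. The sketch below redoes that arithmetic from the rounded counts shown on the slide, so its results can differ from the slide's percentages by a few tenths of a percentage point.

```python
# Requests arriving at each layer, in millions, as shown on the slide.
arrivals = [
    ("Browser Cache", 77.2),
    ("Edge Cache", 26.6),
    ("Origin Cache", 11.2),
    ("Backend (Haystack)", 7.6),
]

total = arrivals[0][1]  # every client request first reaches the browser cache
results = {}
for i, (layer, reqs) in enumerate(arrivals):
    # Requests passed on to the next layer; the backend serves all it receives.
    forwarded = arrivals[i + 1][1] if i + 1 < len(arrivals) else 0.0
    served = reqs - forwarded
    results[layer] = {
        "hit_ratio": served / reqs,       # share of this layer's arrivals it served
        "traffic_share": served / total,  # share of all client requests it served
    }

for layer, r in results.items():
    print(f"{layer:20s} hit ratio {r['hit_ratio']:.1%}  "
          f"traffic share {r['traffic_share']:.1%}")
```

The traffic shares sum to 100% because every request is served by exactly one layer. The backend's "hit ratio" comes out as 100% here only because it is the last layer, which is why the slide reports no hit ratio for it.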
16
ANALYSIS: PHOTO POPULARITY IMPACT
17
Popularity Distribution
Popularity skew is reduced after each layer of cache
18
Popularity Impact on Caches
19
ANALYSIS: GEOGRAPHIC TRAFFIC DISTRIBUTION & COLLABORATIVE CACHING
20
Substantial Remote Traffic at Edge
Atlanta: 20% local
Miami: 35% local
Dallas: 50% local
Chicago: 60% local
LA: 18% local
NYC: 35% local
21
Substantial Remote Traffic at Edge
Where requests from Atlanta are served:
Atlanta (local): 20%
Dallas: 5%
D.C.: 35%
NYC: 5%
Miami: 20%
California: 5%
Chicago: 10%
80% of Atlanta's requests are served by remote Edges
22
Collaborative Edge
23
Impact of Using Collaborative Edge
Collaborative Edge increases hit ratio by 18%
24
ANALYSIS: IMPACTS OF CACHE SIZE & ALGORITHM
25
Potential Improvement Study
Methodology: cache simulation
  Replay the trace (first 25% as warm-up)
  Evaluate using the remaining 75%
Improvement factors:
  Cache size
  Caching algorithm
Evaluation metric: hit ratio
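The methodology above can be sketched as a small trace-replay simulator. This is an illustration under stated assumptions, not the study's simulator: the trace is a plain list of object ids, all objects have the same size, and only FIFO and LRU are modeled. The function name and parameters are hypothetical.

```python
import random
from collections import OrderedDict


def simulate(trace, capacity, policy="fifo", warmup_fraction=0.25):
    """Replay a trace of object ids; return the hit ratio after warm-up.

    Assumes capacity >= 1 and unit-size objects.
    """
    cache = OrderedDict()
    warmup_end = int(len(trace) * warmup_fraction)
    hits = measured = 0
    for i, key in enumerate(trace):
        hit = key in cache
        if hit and policy == "lru":
            cache.move_to_end(key)  # LRU refreshes recency on a hit; FIFO does not
        elif not hit:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the front of the queue
            cache[key] = True
        if i >= warmup_end:  # only requests after the warm-up are scored
            measured += 1
            hits += hit
    return hits / measured if measured else 0.0


# Synthetic skewed trace (made-up popularity weights), seeded for repeatability.
rng = random.Random(0)
ids = list(range(1000))
weights = [1.0 / (rank + 1) for rank in ids]
trace = rng.choices(ids, weights=weights, k=20000)

for size in (10, 50, 200):
    print(size, simulate(trace, size, "fifo"), simulate(trace, size, "lru"))
```

Sweeping `capacity` and `policy` over the same trace corresponds to the two improvement factors on the slide, with hit ratio as the metric.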
26
Edge Cache with Different Sizes & Algorithms
[Figure label: Infinite Cache]
The same hit ratio can be achieved with a smaller cache and a higher-performing algorithm
27
Edge Cache with Different Sizes & Algorithms
[Figure label: Infinite Cache]
A more sophisticated algorithm can achieve a better hit ratio with the same cache size
28
ANALYSIS: WHICH PHOTOS TO CACHE?
29
Intuitions
Properties that are intuitively associated with photo traffic:
  The age of the photo
  The number of Facebook followers associated with the owner
30
Content Age Effect
An age-based cache replacement algorithm could be effective
Fresh content is popular and tends to be effectively cached throughout the hierarchy
31
Social Effect
The more popular the photo owner is, the more likely the photo is to be accessed
Browser caches tend to have lower hit ratios for popular users (“viral” effect)
32
DISCUSSIONS
33
Discussions
Evaluation method:
  Only desktop clients are considered; mobile clients are excluded
  Trends by mobility of users
  Sampling: object-based sampling might not represent a realistic workload
  Impact of caching done by the Akamai CDN
  The method for correlating requests is not perfect
Latency issue:
  Evaluation mainly focuses on hit ratio & traffic sheltering, not latency
  Latency of collaborative caching is not evaluated
34
Discussions (cont.)
Other potential improvements:
  Improved caching algorithms that take photo metadata into account
  Optimal placement of resizing functionality along the stack
  Clairvoyant caching might be possible by predicting future accesses
    E.g., photos from the same album, photos that appear on the news feed, etc.
  Address geographic diversity by improving the routing policy (e.g., put more weight on locality)
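For reference, Clairvoyant replacement is usually understood as the offline policy that evicts the object whose next request lies farthest in the future. It needs the whole trace in advance, which is why a practical system would have to predict future accesses instead. The sketch below assumes unit-size objects and always admits a missed object; the function name is hypothetical.

```python
def clairvoyant_hit_ratio(trace, capacity):
    """Offline replacement: evict the key reused farthest in the future.

    Assumes capacity >= 1 and unit-size objects.
    """
    # next_use[i] = index of the next request for trace[i], or infinity if none.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    cache = {}  # key -> index of its next request
    hits = 0
    for i, key in enumerate(trace):
        if key in cache:
            hits += 1
        elif len(cache) >= capacity:
            victim = max(cache, key=cache.get)  # farthest next request
            del cache[victim]
        cache[key] = next_use[i]  # insert, or refresh the next-use index on a hit
    return hits / len(trace) if trace else 0.0


# Tiny made-up trace: with two slots, "b" is evicted because it is never reused.
ratio = clairvoyant_hit_ratio(["a", "b", "a", "c", "a"], capacity=2)
```

A predictor built on signals such as album membership or news feed placement would stand in for `next_use`, with accuracy determining how close the result gets to this offline bound.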
35
THANK YOU!