Download presentation
Presentation is loading. Please wait.
Published byKaty Imm Modified over 10 years ago
1
Qi Huang, Ken Birman, Robbert van Renesse (Cornell), Wyatt Lloyd (Princeton, Facebook), Sanjeev Kumar, Harry C. Li (Facebook) An Analysis of Facebook Photo Caching
2
250 Billion * Photos on Facebook Profile Feed Album 1 * Internet.org, Sept., 2013 Storage Backend Storage Backend Cache Layers Cache Layers Full-stack Study
3
Preview of Results 2 Current Stack Performance Opportunities for Improvement Browser cache is important (reduces 65+% request ) Photo popularity distribution shifts across layers Smarter algorithms can do much better (S4LRU) Collaborative geo-distributed cache worth trying
4
Client Facebook Photo-Serving Stack 3
5
Client-based Browser Cache Client Browser Cache Browser Cache Client 4 Local Fetch
6
Client-based Browser Cache Client Browser Cache Browser Cache (Millions) Client 5
7
Stack Choice Browser Cache Browser Cache Client Caches Storage Caches Storage Facebook Stack Akamai Content Distribution Network (CDN) Content Distribution Network (CDN) Focus: Facebook stack (Millions) 6
8
Geo-distributed Edge Cache (FIFO) Edge Cache Edge Cache (Tens) Browser Cache Browser Cache Client PoP (Millions) 7
9
Geo-distributed Edge Cache (FIFO) Edge Cache Edge Cache (Tens) Browser Cache Browser Cache Client PoP (Millions) 8 Purpose 1.Reduce cross-country latency 2.Reduce Data Center bandwidth
10
Geo-distributed Edge Cache (FIFO) Edge Cache Edge Cache (Tens) Browser Cache Browser Cache Client PoP (Millions) 9
11
Geo-distributed Edge Cache (FIFO) Edge Cache Edge Cache (Tens) Browser Cache Browser Cache Client PoP (Millions) 10
12
Single Global Origin Cache (FIFO) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Tens)(Millions)(Four) 11
13
Single Global Origin Cache (FIFO) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Tens)(Millions)(Four) 12 Purpose 1.Minimize I/O-bound operations
14
Single Global Origin Cache (FIFO) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Tens)(Millions)(Four) Hash(url) 13
15
Single Global Origin Cache (FIFO) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Tens)(Millions)(Four) 14
16
Haystack Backend Backend (Haystack) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Tens)(Millions)(Four) 15
17
How did we collect the trace? 16
18
Trace Collection Instrumentation Scope Backend (Haystack) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center (Object-based sampling) Request-based: collect X% of requests Object-based: collect reqs for X% objects 17
19
Sampling on Power-law 18 Object rank
20
Sampling on Power-law Req-based: bias on popular content, inflate cache perf 19 Object rank Req-based
21
Sampling on Power-law Object-based 20 Object rank Object-based: fair coverage of unpopular content
22
Sampling on Power-law Object-based: fair coverage of unpopular content 21 Object rank Object-based
23
Trace Collection Instrumentation Scope Backend (Haystack) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center Resizer R 22
24
Analysis Traffic sheltering effects of caches Photo popularity distribution Size, algorithm, collaborative Edge In paper –Stack performance as a function of photo age –Stack performance as a function of social connectivity –Geographical traffic flow 23
25
Traffic Sheltering Backend (Haystack) Browser Cache Browser Cache Edge Cache Edge Cache Origin Cache Origin Cache PoP ClientData Center R 24
26
Photo popularity and its cache impact 25
27
Popularity Distribution Browser resembles a power-law distribution 26
28
Popularity Distribution Viral photos becomes the head for Edge 27
29
Popularity Distribution Skewness is reduced after layers of cache 28
30
Popularity Distribution Backend resembles a stretched exponential dist. 29
31
Popularity with Absolute Traffic Storage/cache designers: pick a layer 30
32
Popularity Impact on Caches 31 High Low M Lowest Each has 25% requests
33
Popularity Impact on Caches Browser traffic share decreases gradually 32
34
Popularity Impact on Caches Edge serves consistent share except for the tail 33
35
Popularity Impact on Caches Origin contributes most for low group 34
36
Popularity Impact on Caches Backend serves the tail 35
37
Can we make the cache better? 36
38
Simulation Replay the trace (25% warm up) Estimate the base cache size Evaluate two hit-ratios (object-wise, byte-wise) 37
39
Edge Cache with Different Sizes Picked San Jose edge (high traffic, median hit ratio) 38
40
Edge Cache with Different Sizes x estimates current deployment size (59% hit ratio) 39
41
Edge Cache with Different Sizes Infinite size ratio needs 45x of current capacity 40
42
Edge Cache with Different Algos Both LRU and LFU outperforms FIFO slightly 41
43
S4LRU 42
44
S4LRU 43
45
S4LRU 44
46
S4LRU 45
47
Edge Cache with Different Algos S4LRU improves the most 46
48
Edge Cache with Different Algos Clairvoyant (Bélády) shows much improvement space 47
49
Origin Cache S4LRU improves Origin more than Edge 48
50
Which Photo to Cache Recency & frequency leads S4LRU Does age, social factors also play a role? 49
51
Collaborative cache on the Edge 50
52
Geographic Coverage of Edge 51
53
Geographic Coverage of Edge 52
54
Geographic Coverage of Edge 53
55
Geographic Coverage of Edge 54
56
Geographic Coverage of Edge 55 Atlanta has 80% requests served by remote Edges
57
Geographic Coverage of Edge 56 Substantial remote traffic is normal
58
Geographic Coverage of Edge 57
59
Collaborative Edge 58
60
Collaborative Edge Independent aggregates all high-volume Edges 59
61
Collaborative Edge Collaborative Edge increases hit ratio by 18% 60 Collaborative
62
Related Work 61 Storage Analysis Content Distribution Analysis Web Analysis BSD file system (SOSP 85), Sprite ( SOSP 91), NT (SOSP 99), NetApp (SOSP 11), iBench (SOSP 11) Cooperative caching (SOSP 99), CDN vs. P2P (OSDI 02), P2P (SOSP 03), CoralCDN (NSDI 10), Flash crowds (IMC 11) Zipfian (INFOCOM 00), Flash crowds (WWW 02), Modern web traffic (IMC 11)
63
Conclusion 62 Quantify caching performance Quantify popularity changes across layers of caches Recency, frequency, age, social factors impact cache Outline potential gain of collaborative caching
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.