An Analysis of Facebook Photo Caching
Qi Huang, Ken Birman, Robbert van Renesse (Cornell), Wyatt Lloyd (Princeton, Facebook), Sanjeev Kumar, Harry C. Li (Facebook)

250 billion photos on Facebook* (profile, feed, album)
Full-stack study: cache layers and storage backend
* Internet.org, Sept. 2013

Preview of Results
Current stack performance
– Browser cache is important (absorbs 65+% of requests)
– Photo popularity distribution shifts across layers
Opportunities for improvement
– Smarter algorithms can do much better (S4LRU)
– Collaborative geo-distributed cache is worth trying

Facebook Photo-Serving Stack

Client-based Browser Cache
Diagram: clients (millions), each with a browser cache; a hit is a local fetch

Stack Choice
Diagram: client browser caches (millions) fetch from either the Facebook stack (caches, storage) or the Akamai content distribution network (CDN)
Focus: Facebook stack

Geo-distributed Edge Cache (FIFO)
Diagram: client browser caches (millions), Edge caches at PoPs (tens)
Purpose
1. Reduce cross-country latency
2. Reduce data center bandwidth

Single Global Origin Cache (FIFO)
Diagram: client browser caches (millions), Edge caches at PoPs (tens), Origin cache across data centers (four), addressed by Hash(url)
Purpose
1. Minimize I/O-bound operations
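The Hash(url) label on the Origin slide suggests that a request is mapped to an Origin server by hashing the photo URL. A minimal sketch of that idea follows, assuming plain modulo placement; the function name, the choice of MD5, and the server count are illustrative assumptions and not the production routing scheme.

```python
import hashlib


def origin_server_for(url: str, num_servers: int) -> int:
    """Map a photo URL to one Origin server index (hypothetical scheme)."""
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    # A stable digest keeps the mapping identical across processes and hosts.
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers
```

Because every Edge computes the same index for the same URL, each photo has a single home in the Origin layer, which is what lets the layer behave as one global cache.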

Haystack Backend
Diagram: client browser caches (millions), Edge caches at PoPs (tens), Origin cache and Haystack backend in data centers (four)

How did we collect the trace?

Trace Collection
Diagram (instrumentation scope): browser cache at the client, Edge cache at the PoP, Origin cache and Haystack backend in the data center
Sampling choices (object-based sampling is used)
– Request-based: collect X% of requests
– Object-based: collect requests for X% of objects
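The two sampling schemes on this slide can be sketched as below. This is an illustration under stated assumptions, not the instrumentation used in the study: the trace is assumed to be a list of (request_id, object_id) pairs, and the hash-to-[0, 1) helper and function names are hypothetical.

```python
import hashlib


def _bucket(key: str) -> float:
    """Hash a key to a stable pseudo-random value in [0, 1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def object_based_sample(trace, rate):
    """Keep every request for the objects whose hash falls below `rate`."""
    return [(req, obj) for req, obj in trace if _bucket(str(obj)) < rate]


def request_based_sample(trace, rate):
    """Keep each request on its own, with probability of roughly `rate`."""
    return [(req, obj) for req, obj in trace if _bucket(f"req:{req}") < rate]
```

With object-based sampling, a sampled photo keeps its full request history, so per-object hit ratios can be computed consistently at every layer.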

Sampling on Power-law
Plot: x-axis is object rank
– Request-based: biased toward popular content, inflates cache performance
– Object-based: fair coverage of unpopular content

Trace Collection
Diagram (instrumentation scope): browser cache at the client, Edge cache at the PoP, Origin cache, Resizer (R), and Haystack backend in the data center

Analysis
– Traffic sheltering effects of caches
– Photo popularity distribution
– Size, algorithm, collaborative Edge
In paper
– Stack performance as a function of photo age
– Stack performance as a function of social connectivity
– Geographical traffic flow

Traffic Sheltering
Diagram: browser cache at the client, Edge cache at the PoP, Origin cache, Resizer (R), and Haystack backend in the data center

Photo popularity and its cache impact

Popularity Distribution
– Browser resembles a power-law distribution
– Viral photos become the head for Edge
– Skewness is reduced after layers of cache
– Backend resembles a stretched exponential distribution

Popularity with Absolute Traffic
Storage/cache designers: pick a layer

Popularity Impact on Caches
Photos are split into four popularity groups (High, M, Low, Lowest); each group accounts for 25% of requests
– Browser traffic share decreases gradually
– Edge serves a consistent share except for the tail
– Origin contributes most for the low group
– Backend serves the tail
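The four equal-traffic popularity groups can be built as in the sketch below. It assumes a flat list of requested object ids and reads the slide's "M" label as "medium"; both the label names and the rule for a photo that straddles a 25% boundary (it goes to the group where its traffic starts) are assumptions made for illustration.

```python
from collections import Counter


def popularity_groups(requests, labels=("high", "medium", "low", "lowest")):
    """Assign each object to a group so that groups carry about equal traffic."""
    counts = Counter(requests)
    total = sum(counts.values())
    groups = {}
    seen = 0  # requests accounted for by more popular objects so far
    for obj, n in counts.most_common():
        idx = min(seen * len(labels) // total, len(labels) - 1)
        groups[obj] = labels[idx]
        seen += n
    return groups
```

Per-layer traffic share can then be tallied per group by looking up each request's object in the returned mapping.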

Can we make the cache better?

Simulation
– Replay the trace (25% warm-up)
– Estimate the base cache size
– Evaluate two hit ratios (object-wise, byte-wise)
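The replay procedure above can be sketched as follows: a trace of (object_id, size_bytes) requests is fed through a FIFO cache, the first 25% of requests only warm the cache, and the object-wise and byte-wise hit ratios are measured on the rest. The trace format, function name, and byte-based capacity are assumptions made for illustration, not the authors' simulator.

```python
from collections import OrderedDict


def replay_fifo(trace, capacity_bytes, warmup_fraction=0.25):
    """Return (object_hit_ratio, byte_hit_ratio) measured after warm-up."""
    cache = OrderedDict()  # object_id -> size, oldest insertion first
    used = 0
    warmup = int(len(trace) * warmup_fraction)
    hits = reqs = hit_bytes = req_bytes = 0
    for i, (obj, size) in enumerate(trace):
        hit = obj in cache
        if i >= warmup:  # warm-up requests fill the cache but are not scored
            reqs += 1
            req_bytes += size
            if hit:
                hits += 1
                hit_bytes += size
        if not hit and size <= capacity_bytes:
            # FIFO: evict in insertion order until the new object fits.
            while used + size > capacity_bytes:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            cache[obj] = size
            used += size
    object_ratio = hits / reqs if reqs else 0.0
    byte_ratio = hit_bytes / req_bytes if req_bytes else 0.0
    return object_ratio, byte_ratio
```

The two ratios diverge when hits concentrate on small or large objects, which is why the slide lists both.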

Edge Cache with Different Sizes
– Picked the San Jose Edge (high traffic, median hit ratio)
– "x" marks the estimated current deployment size (59% hit ratio)
– Reaching the infinite-size hit ratio needs 45x the current capacity

Edge Cache with Different Algos
Both LRU and LFU outperform FIFO slightly

S4LRU
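The S4LRU slides are figure-only in this transcript. As a reminder of the idea, S4LRU is a segmented LRU with four queues: a miss inserts at the head of queue 0, a hit promotes the item to the head of the next higher queue, and overflow from a queue's tail is demoted to the head of the queue below (overflow from queue 0 leaves the cache). The sketch below follows that description; counting capacity in items rather than bytes and the class layout are simplifying assumptions.

```python
from collections import OrderedDict


class S4LRU:
    """Segmented LRU with four levels; capacity is counted in items."""

    def __init__(self, capacity, levels=4):
        self.levels = levels
        self.per_level = max(1, capacity // levels)
        # Each queue keeps its most recently used item last.
        self.queues = [OrderedDict() for _ in range(levels)]

    def _insert(self, level, key):
        """Put key at the head of `level`, cascading overflow downward."""
        while True:
            queue = self.queues[level]
            queue[key] = True
            if len(queue) <= self.per_level:
                return
            key, _ = queue.popitem(last=False)  # tail of this queue
            if level == 0:
                return  # dropped from the cache entirely
            level -= 1

    def access(self, key):
        """Record one request; return True on a hit, False on a miss."""
        for level, queue in enumerate(self.queues):
            if key in queue:
                del queue[key]
                self._insert(min(level + 1, self.levels - 1), key)
                return True
        self._insert(0, key)
        return False

    def __contains__(self, key):
        return any(key in queue for queue in self.queues)
```

An item requested more than once climbs out of queue 0, so a burst of one-time requests can only push out other items that were never re-requested.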

Edge Cache with Different Algos
– S4LRU improves the most
– Clairvoyant (Bélády) shows much room for improvement
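The clairvoyant bound can be computed offline because the whole trace is known. The sketch below implements Bélády's rule (on a miss with a full cache, evict the object whose next request is farthest in the future) for unit-size objects; ignoring object sizes and the function name are simplifying assumptions.

```python
def belady_hits(trace, capacity):
    """Count hits for an offline cache using Bélády's eviction rule."""
    if capacity <= 0:
        return 0
    # next_use[i] is the index of the next request for trace[i], or infinity.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i
    cache = {}  # object -> index of its next request
    hits = 0
    for i, obj in enumerate(trace):
        if obj in cache:
            hits += 1
        elif len(cache) >= capacity:
            victim = max(cache, key=cache.get)  # reused farthest in the future
            del cache[victim]
        cache[obj] = next_use[i]
    return hits
```

No online policy can know future requests, so the gap between this count and an online algorithm's count is headroom rather than a deployable target.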

Origin Cache
S4LRU improves Origin more than Edge

Which Photo to Cache
– Recency and frequency drive S4LRU
– Do age and social factors also play a role?

Collaborative cache on the Edge

Geographic Coverage of Edge
– Atlanta has 80% of requests served by remote Edges
– Substantial remote traffic is normal

Collaborative Edge
– "Independent" aggregates all high-volume Edges
– Collaborative Edge increases hit ratio by 18%
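The comparison between independent and collaborative Edges can be sketched as below: each Edge either runs its own cache, or all Edges share one logical cache with the combined capacity. The trace format of (edge_name, object_id) pairs, the use of LRU, and item-counted capacity are assumptions for illustration; the 18% figure on the slide comes from the study's trace, not from this toy model.

```python
from collections import OrderedDict


def lru_hits(requests, capacity):
    """Count hits for a plain LRU cache holding `capacity` items."""
    cache = OrderedDict()
    hits = 0
    for obj in requests:
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits


def independent_vs_collaborative(trace, capacity_per_edge):
    """Return (independent_hit_ratio, collaborative_hit_ratio)."""
    if not trace:
        return 0.0, 0.0
    edges = sorted({edge for edge, _ in trace})
    independent = sum(
        lru_hits([obj for e, obj in trace if e == edge], capacity_per_edge)
        for edge in edges
    )
    collaborative = lru_hits(
        [obj for _, obj in trace], capacity_per_edge * len(edges)
    )
    return independent / len(trace), collaborative / len(trace)
```

The shared cache wins when the same photo is requested through several Edges, because it stores one copy instead of one per Edge.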

Related Work
– Storage analysis: BSD file system (SOSP 85), Sprite (SOSP 91), NT (SOSP 99), NetApp (SOSP 11), iBench (SOSP 11)
– Content distribution analysis: Cooperative caching (SOSP 99), CDN vs. P2P (OSDI 02), P2P (SOSP 03), CoralCDN (NSDI 10), Flash crowds (IMC 11)
– Web analysis: Zipfian (INFOCOM 00), Flash crowds (WWW 02), Modern web traffic (IMC 11)

Conclusion
– Quantify caching performance
– Quantify popularity changes across layers of caches
– Recency, frequency, age, and social factors impact cache
– Outline potential gain of collaborative caching