CSE 486/586 Distributed Systems Case Study: Facebook Photo Stores


CSE 486/586 Distributed Systems
Case Study: Facebook Photo Stores
Steve Ko, Computer Sciences and Engineering, University at Buffalo

Engineering a System
Generally, when you engineer a system, you need to understand your workload and design your system according to it. (Perhaps not in the beginning, since there’s no workload yet.)
Engineering principle: make the common case fast, and rare cases correct. (From the Patterson & Hennessy books.)
  This principle cuts through generations of systems.
  Example: caching
Knowing the common cases == understanding your workload
  E.g., read-dominated? Write-dominated? Mixed?
We’ll look at Facebook’s example.
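
As a small illustration of "make the common case fast" through caching, here is a minimal sketch of a least-recently-used (LRU) cache. The class name `LRUCache`, the loader `slow_load`, and the request sequence are all hypothetical and only stand in for a real backing store and a real workload.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that counts hits and misses (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # key -> value, oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self._items:
            # Common case: serve from memory and mark as most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Rare case: fall back to the slow path, then remember the result.
        self.misses += 1
        value = load(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return value


def slow_load(key):
    """Hypothetical stand-in for an expensive read from backing storage."""
    return f"content-of-{key}"


cache = LRUCache(capacity=2)
requests = ["a", "a", "b", "a", "c", "a", "a"]  # "a" is the popular item
results = [cache.get(key, slow_load) for key in requests]
print(f"hits={cache.hits} misses={cache.misses}")
```

The popular key stays in the cache, so most requests take the fast path, while the rarely requested keys still return correct content through the slow path.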

Facebook Workload
What are the most frequent things you do on Facebook?
  Read/write wall posts/comments/likes
  View/upload photos
These are very different in their characteristics.
Wall posts/comments/likes
  Mix of reads and writes, so more care is necessary in terms of consistency
  But small in size, so probably less performance-sensitive
Photos
  Write-once, read-many, so less care is necessary in terms of consistency
  But large in size, so more performance-sensitive

Facebook Photo Workload
(This is from 2010.)
  260 billion images (~20 PB)
  1 billion new photos per week (~60 TB)
  One million image views per second at peak
Facebook has analyzed their photo workload and discovered two characteristics:
  The popularity distribution follows Zipf.
  Popularity changes over time as photos “age.”

Zipf Distribution
Based on the power law
Models a lot of natural phenomena
  Social graphs, media popularity, wealth distribution, etc.
  A lot of Web content too
[Figure: popularity (y-axis) vs. items sorted by popularity (x-axis)]
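
To make the shape of a Zipf distribution concrete, the following sketch computes how much of the total popularity the top-ranked items receive. The exponent `s=1.0` and the catalog size of 10,000 items are assumptions chosen for illustration; they are not Facebook's measured parameters.

```python
def zipf_weights(n_items, s=1.0):
    """Normalized Zipf popularity for ranks 1..n_items (weight ~ 1 / rank**s)."""
    raw = [1.0 / (rank ** s) for rank in range(1, n_items + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def share_of_top(weights, fraction):
    """Fraction of all views that go to the most popular `fraction` of items."""
    top = max(1, int(len(weights) * fraction))
    return sum(weights[:top])


weights = zipf_weights(10_000)            # assumed catalog size
top_10_percent = share_of_top(weights, 0.10)
print(f"top 10% of items receive {top_10_percent:.0%} of views")
```

Under these assumed parameters a small head of items draws most of the views, while the long tail still adds up to a noticeable share in aggregate.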

Popularity Comes with Age

Facebook Photo Distribution
“Hot” vs. “warm” vs. “cold” photos
  Hot: Popular, a lot of views (approx. 90% of views)
  Warm: Somewhat popular, but still a lot of views in aggregate
  Cold: Unpopular, occasional views
Based on the observations, Facebook divided their workload into three types and designed mechanisms to handle those types differently.
[Figure: popularity (y-axis) vs. items sorted by popularity (x-axis)]

Handling Different Types of Photos
Hot photos
  Facebook uses a CDN (Content Distribution Network) for these.
  Very good performance, but no reliability guarantee
  A CDN is a cache, not permanent storage.
Warm photos
  Facebook has designed its own storage called Haystack.
  Balances performance and reliability
Cold photos
  Facebook has designed an “archival” storage called f4.
  Aims for storage efficiency when storing replicated photos (but not high performance)
For each type of workload, the emphasis changes.

CSE 486/586 Administrivia
PA4 deadline: 5/10
Survey & course evaluation
  Survey: https://forms.gle/eg1wHN2G8S6GVz3e9
  Course evaluation: https://www.smartevals.com/login.aspx?s=buffalo
  If both have 80% or more participation, then for each of you, I’ll take the better of the midterm and the final, and give a 30% weight to the better one and a 20% weight to the other. (Currently, it’s 20% for the midterm and 30% for the final.)
No recitation this week; replaced with office hours

CDN for Hot Photos
Content providers are CDN customers.
Content replication
  A CDN company (e.g., Akamai) installs thousands of servers throughout the Internet, in large datacenters close to users.
  The CDN replicates its customers’ content.
  When a provider updates content, the CDN updates its servers.
[Figure: origin server in North America; CDN distribution node; CDN servers in S. America, Asia, and Europe]

Domain Name System
For a given user, how do we locate a close server? Many CDNs rely on the Domain Name System (DNS).
DNS maps a DNS name to an IP address or to another DNS name (alias).
  E.g., www.cse.buffalo.edu
  Domain: registrar for each top-level domain
  Host name: local administrator assigns to each host
Properties of DNS
  Hierarchical name space
  Distributed over a collection of DNS servers
Hierarchy of DNS servers
  Root servers
  Top-level domain (TLD) servers
  Authoritative DNS servers

Distributed Hierarchical Database
[Figure: DNS name tree under an unnamed root. Generic domains: com, edu, org. Country domains: ac, uk, zw. Also arpa, with in-addr beneath it. Example names: my.east.bar.edu, usr.cam.ac.uk. Example address block: 12.34.56.0/24.]

DNS Root Servers
13 root servers (see http://www.root-servers.org/), labeled A through M
  A: Verisign, Dulles, VA
  B: USC-ISI, Marina del Rey, CA
  C: Cogent, Herndon, VA (also Los Angeles)
  D: U Maryland, College Park, MD
  E: NASA, Mt View, CA
  F: Internet Software C., Palo Alto, CA (and 17 other locations)
  G: US DoD, Vienna, VA
  H: ARL, Aberdeen, MD
  I: Autonomica, Stockholm (plus 3 other locations)
  J: Verisign (11 locations)
  K: RIPE, London (+ Amsterdam, Frankfurt)
  L: ICANN, Los Angeles, CA
  M: WIDE, Tokyo

TLD and Authoritative DNS Servers
Top-level domain (TLD) servers
  Generic domains (e.g., com, org, edu)
  Country domains (e.g., uk, fr, ca, jp)
  Typically managed professionally
    Network Solutions maintains servers for “com”
    Educause maintains servers for “edu”
Authoritative DNS servers
  Provide public records for hosts at an organization
  For the organization’s servers (e.g., Web and mail)
  Can be maintained locally or by a service provider

Example
Host at cis.poly.edu wants the IP address for gaia.cs.umass.edu.
[Figure: numbered steps 1-8 among the requesting host cis.poly.edu, the local DNS server dns.poly.edu, the root DNS server, the TLD DNS server, and the authoritative DNS server dns.cs.umass.edu, which holds the record for gaia.cs.umass.edu]
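
The lookup through the server hierarchy can be sketched as a toy simulation. This assumes the local DNS server resolves iteratively (asking the root, then the TLD server, then the authoritative server in turn); the record tables, the server label `tld-edu`, and the address `192.0.2.10` are made up for illustration and do not reflect real DNS data.

```python
# Toy name servers. Each maps a name suffix to either a referral ("NS", next
# server) or a final answer ("A", address). All records here are hypothetical.
SERVERS = {
    "root": {"edu": ("NS", "tld-edu")},
    "tld-edu": {"cs.umass.edu": ("NS", "dns.cs.umass.edu")},
    "dns.cs.umass.edu": {"gaia.cs.umass.edu": ("A", "192.0.2.10")},
}


def query(server, name):
    """Ask one server and return its record for the longest matching suffix."""
    records = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in records:
            return records[suffix]
    raise KeyError(f"{server} has no record for {name}")


def resolve(name):
    """Iteratively follow referrals from the root until an address is found."""
    server = "root"
    path = []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "A":
            return value, path
        server = value  # referral: ask the next server down the hierarchy


address, path = resolve("gaia.cs.umass.edu")
print(address, "via", " -> ".join(path))
```

Each server only knows about its own part of the name space, which is what lets the database be distributed over many servers.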

How a CDN Works
[Figure, built up over several slides: End-user, facebook.com (content provider), DNS root server, Akamai global DNS server, Akamai regional DNS server, Akamai cluster, nearby Akamai cluster. Numbered steps as labeled in the figure:]
  1-2: The end-user sends HTTP GET index.html to facebook.com; the page references http://cache.facebook.com/facebook.com/foo.jpg.
  3-4: DNS lookup of cache.facebook.com returns ALIAS: g.akamai.net.
  5-6: DNS lookup of g.akamai.net at the Akamai global DNS server; its server selection algorithm returns ALIAS a73.g.akamai.net.
  7-8: DNS lookup of a73.g.akamai.net at the Akamai regional DNS server; its server selection algorithm returns Address 1.2.3.4.
  9: The end-user sends GET /foo.jpg (Host: cache.facebook.com) to the nearby Akamai cluster.
  10: The nearby Akamai cluster responds to the end-user.
  11-12: The Akamai cluster sends GET foo.jpg to facebook.com (the content provider), presumably when it does not already hold the photo.
Each Akamai DNS server runs an algorithm to narrow down close servers.
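
The DNS part of this walkthrough amounts to following a chain of aliases until an address comes back. A minimal sketch, using the names and the address shown on the slides but a static lookup table in place of the server selection algorithms (a real CDN would pick the alias and address per request, so the table is a simplifying assumption):

```python
# Static stand-in for the DNS answers in the walkthrough. The real answers are
# chosen dynamically by the CDN's server selection algorithms.
RECORDS = {
    "cache.facebook.com": ("ALIAS", "g.akamai.net"),
    "g.akamai.net": ("ALIAS", "a73.g.akamai.net"),
    "a73.g.akamai.net": ("A", "1.2.3.4"),
}


def resolve_alias_chain(name, max_hops=8):
    """Follow ALIAS records until an address record is reached."""
    chain = [name]
    for _ in range(max_hops):
        kind, value = RECORDS[name]
        if kind == "A":
            return value, chain
        name = value
        chain.append(name)
    raise RuntimeError("alias chain too long")


address, chain = resolve_alias_chain("cache.facebook.com")
print(" -> ".join(chain), "->", address)
```

Because the content provider only hands out an alias, the CDN keeps control over which of its servers each user ends up contacting.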

Handling Warm Photos: Haystack
Designed for performance and reliability
“Default” photo storage

Haystack Directory
Helps the URL construction for an image
  http://⟨CDN⟩/⟨Cache⟩/⟨Machine id⟩/⟨Logical volume, Photo⟩
Staged lookup
  The CDN strips out its portion.
  The Cache strips out its portion.
  The machine strips out its portion.
  Each stage uses ⟨Logical volume, Photo⟩ as well.
Logical & physical volumes
  A logical volume is replicated as multiple physical volumes.
  Physical volumes are stored.
  Each volume contains multiple photos.
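
The staged lookup can be sketched as each stage removing the leading component of the URL path and forwarding the rest. The example URL, the component values, and the comma-separated encoding of ⟨Logical volume, Photo⟩ are assumptions for illustration; the slides do not give the concrete URL syntax.

```python
def strip_stage(path):
    """Remove the leading path component; return (component, remainder)."""
    head, _, rest = path.partition("/")
    return head, rest


def staged_lookup(url):
    """Show what each stage consumes and what it passes to the next stage."""
    path = url.split("://", 1)[1]
    stages = []
    for stage in ("CDN", "Cache", "Machine"):
        component, path = strip_stage(path)
        stages.append((stage, component, path))  # (who, its portion, forwarded)
    logical_volume, photo = path.split(",", 1)   # assumed "volume,photo" encoding
    return stages, logical_volume, photo


# Hypothetical URL following http://<CDN>/<Cache>/<Machine id>/<Logical volume, Photo>
url = "http://cdn.example.com/cache7/machine42/vol123,photo456"
stages, logical_volume, photo = staged_lookup(url)
for stage, component, forwarded in stages:
    print(f"{stage} strips {component!r}, forwards {forwarded!r}")
```

Every stage still sees the trailing ⟨Logical volume, Photo⟩ pair, so any of them can use it as the lookup key for the photo.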

Haystack Cache & Store
Haystack cache
  Facebook-operated second-level cache using a DHT
  Photo IDs as the key
  Further removes traffic to the Store
Haystack store
  Maintains physical volumes
  One volume is a single large file (100 GB) with many photos (needles)
  Performance-optimized: requires a single disk read for image retrieval
  (Not going into details since it’s more about how a filesystem works.)
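
One way to get a single read per photo from one large volume file is to keep an index from photo ID to byte offset and size, so retrieval is one seek plus one read. The sketch below assumes such an in-memory index and uses an in-memory buffer as a stand-in for the volume file; the class and method names are hypothetical, not Haystack's actual interface.

```python
import io


class Volume:
    """Append-only volume holding many photos ("needles") in one byte stream."""

    def __init__(self):
        self._file = io.BytesIO()  # stand-in for the single large volume file
        self._index = {}           # photo_id -> (offset, size), assumed in memory

    def append(self, photo_id, data):
        """Write the photo at the end of the volume and record where it lives."""
        offset = self._file.seek(0, io.SEEK_END)
        self._file.write(data)
        self._index[photo_id] = (offset, len(data))

    def read(self, photo_id):
        """One index lookup, one seek, one read: no per-photo file metadata."""
        offset, size = self._index[photo_id]
        self._file.seek(offset)
        return self._file.read(size)


volume = Volume()
volume.append(1, b"first-photo-bytes")
volume.append(2, b"second-photo")
print(volume.read(2))
```

Packing many photos into one file avoids a separate filesystem lookup for every small image, which is where the single-read property comes from in this sketch.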

CDN / Haystack / f4
The CDN absorbs much of the traffic for hot photos.
Haystack’s tradeoff: good throughput and reliability, but somewhat inefficient storage space usage (mainly due to replication).
f4’s tradeoff: less throughput, but more storage-efficient.
  ~1 month after upload, photos/videos are moved to f4.
  f4 uses an error-correcting coding scheme to efficiently replicate data.

f4’s Replication
(n, k) Reed-Solomon code
  k data blocks, f == (n-k) parity blocks, n total blocks
  Upon a failure, any k blocks can reconstruct the lost block.
  Can tolerate up to f block failures
  Need to go through the coder/decoder for reads and writes, which affects throughput
Parity example: XOR (Reed-Solomon uses something more complicated than this.)
  XOR bits, e.g., (0, 1, 1, 0) -> P: 0
  Reconstruction after failures: (0, 1, 1, 0) -> P: 0
[Figure: k data blocks followed by f parity blocks]
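
The XOR parity example can be worked through in a few lines. This sketch covers only single-parity XOR, which tolerates one lost block; it is not Reed-Solomon, which uses more involved arithmetic to tolerate f losses. The function names are illustrative.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Recover the one missing data block from the survivors and the parity."""
    return xor_parity(surviving_blocks + [parity])


# The slide's bit example (0, 1, 1, 0), one bit per block.
data = [bytes([0]), bytes([1]), bytes([1]), bytes([0])]
parity = xor_parity(data)

# Lose the second block, then rebuild it from the other three plus the parity.
lost_index = 1
survivors = data[:lost_index] + data[lost_index + 1:]
recovered = reconstruct(survivors, parity)
print(parity[0], recovered[0])
```

XOR-ing the survivors with the parity cancels every block except the missing one, which is why one parity block is enough to repair one failure.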

f4: Single Datacenter
Within a single datacenter, f4 uses a (14, 10) Reed-Solomon code.
  This tolerates up to 4 block failures.
  1.4X storage usage per block
Blocks are distributed across different racks.
  This tolerates four host/rack failures.
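
The numbers on this slide follow directly from n = 14 and k = 10. The sketch below computes the storage overhead and checks the failure tolerance by brute force, assuming each of the 14 blocks sits on its own rack (the rack names are hypothetical) and relying on the property that any k surviving blocks suffice.

```python
from itertools import combinations

N, K = 14, 10  # (n, k) from the slide: 10 data blocks + 4 parity blocks


def storage_overhead(n, k):
    """Bytes stored per byte of user data."""
    return n / k


# Assumed placement: one block per rack, so a rack failure loses one block.
placement = {block: f"rack-{block}" for block in range(N)}


def survives(failed_racks):
    """Data is recoverable if at least K blocks remain readable."""
    remaining = [b for b, rack in placement.items() if rack not in failed_racks]
    return len(remaining) >= K


racks = sorted(set(placement.values()))
tolerates_any_four = all(
    survives(set(combo)) for combo in combinations(racks, N - K)
)
tolerates_five = survives(set(racks[:5]))
print(storage_overhead(N, K), tolerates_any_four, tolerates_five)
```

Spreading the blocks over distinct racks is what turns "4 block failures" into "4 host/rack failures"; if several blocks shared a rack, one rack failure could cost more than one block.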

f4: Cross-Datacenter
Additional parity block
  Can tolerate a single datacenter failure
Overall average space usage per block: 2.1X
  E.g., average for blocks A & B: (1.4*2 + 1.4)/2 = 2.1
With 2.1X space usage:
  4 host/rack failures tolerated
  1 datacenter failure tolerated
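
The 2.1X figure can be reproduced with a short calculation. This sketch assumes the layout implied by the slide's formula: blocks A and B each stored with 1.4X Reed-Solomon overhead, plus one additional parity block (taken here to be A XOR B, kept in another datacenter) that is also stored at 1.4X. The block contents are placeholders.

```python
def xor_blocks(a, b):
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def cross_dc_overhead(rs_overhead, data_blocks=2, parity_blocks=1):
    """Average space per data block when every stored block costs rs_overhead."""
    return rs_overhead * (data_blocks + parity_blocks) / data_blocks


RS_OVERHEAD = 14 / 10  # single-datacenter (14, 10) Reed-Solomon overhead

block_a = b"photo-data-A"  # placeholder contents, equal length
block_b = b"photo-data-B"
cross_dc_parity = xor_blocks(block_a, block_b)

# If the datacenter holding A is lost, A is rebuilt from B and the parity.
recovered_a = xor_blocks(block_b, cross_dc_parity)

overhead = cross_dc_overhead(RS_OVERHEAD)
print(f"{overhead:.1f}X", recovered_a == block_a)
```

Three stored blocks at 1.4X each, shared between two data blocks, gives (1.4*2 + 1.4)/2 = 2.1, matching the slide.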

Summary
Engineering a system needs workload understanding.
Facebook photo workload: hot, warm, and cold
  CDN for hot photos: performance
  Haystack for warm photos: performance & reliability
  f4 for cold photos: reliability and storage efficiency

Acknowledgements These slides contain material developed and copyrighted by Indranil Gupta (UIUC), Michael Freedman (Princeton), and Jennifer Rexford (Princeton).