Content Distribution Networks

Slides:



Advertisements
Similar presentations
Internet Applications INTERNET APPLICATIONS. Internet Applications Domain Name Service Proxy Service Mail Service Web Service.
Advertisements

1 Server Selection & Content Distribution Networks (slides by Srini Seshan, CS CMU)
1 Content Delivery Networks iBAND2 May 24, 1999 Dave Farber CTO Sandpiper Networks, Inc.
Spring 2003CS 4611 Content Distribution Networks Outline Implementation Techniques Hashing Schemes Redirection Strategies.
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
Internet Networking Spring 2006 Tutorial 12 Web Caching Protocols ICP, CARP.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
An Analysis of Internet Content Delivery Systems Stefan Saroiu, Krishna P. Gommadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy Proceedings of.
What’s a Web Cache? Why do people use them? Web cache location Web cache purpose There are two main reasons that Web cache are used:  to reduce latency.
CDNs & Replication Prof. Vern Paxson EE122 Fall 2007 TAs: Lisa Fowler, Daniel Killebrew, Jorge Ortiz.
HTTP and Web Content Delivery COS 461: Computer Networks Spring 2011 Mike Freedman
Anycast Jennifer Rexford Advanced Computer Networks Tuesdays/Thursdays 1:30pm-2:50pm.
1 Web Content Delivery Reading: Section and COS 461: Computer Networks Spring 2007 (MW 1:30-2:50 in Friend 004) Ioannis Avramopoulos Instructor:
Web Caching Schemes For The Internet – cont. By Jia Wang.
Web Caching and CDNs March 3, Content Distribution Motivation –Network path from server to client is slow/congested –Web server is overloaded Web.
Caching and Content Distribution Networks. Web Caching r As an example, we use the web to illustrate caching and other related issues browser Web Proxy.
Content Distribution Networks (CDNs) Mike Freedman COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
CSCI-1680 Web Performance and Content Distribution Based partly on lecture notes by Scott Shenker and John Jannotti Rodrigo Fonseca.
Content Distribution Networks CPE 401 / 601 Computer Network Systems Modified from Ravi Sundaram, Janardhan R. Iyengar, and others.
1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers.
DNS and CDNs (Content Distribution Networks) Paul Francis Cornell Computer Science.
Caching and Content Distribution Networks. Some Interesting Observations r Top 1 % of all documents account for 20% - 35% of proxy requests r Top 10%
Sipat Triukose, Zhihua Wen, Michael Rabinovich WWW 2011 Presented by Ye Tian for Course CS05112.
Redirection and Load Balancing
{ Content Distribution Networks ECE544 Dhananjay Makwana Principal Software Engineer, Semandex Networks 5/2/14ECE544.
1 IP: putting it all together Part 2 G53ACC Chris Greenhalgh.
CPSC 441: Multimedia Networking1 Outline r Scalable Streaming Techniques r Content Distribution Networks.
2: Application Layer1 Chapter 2 outline r 2.1 Principles of app layer protocols r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail r 2.5 DNS r 2.6 Socket.
Making the Best of the Best-Effort Service (2) Advanced Multimedia University of Palestine University of Palestine Eng. Wisam Zaqoot Eng. Wisam Zaqoot.
Internet Address and Domain Name Service (DNS)
DYNAMIC LOAD BALANCING ON WEB-SERVER SYSTEMS by Valeria Cardellini Michele Colajanni Philip S. Yu.
Computer Science Lecture 14, page 1 CS677: Distributed OS Last Class: Concurrency Control Concurrency control –Two phase locks –Time stamps Intro to Replication.
Content Distribution Network, Proxy CDN: Distributed Environment
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
COMP 431 Internet Services & Protocols
Content Distribution Networks (CDNs)
John S. Otto Mario A. Sánchez John P. Rula Fabián E. Bustamante Northwestern, EECS.
Content Distribution Networks
19 – Multimedia Networking
Content Distribution Networks
The Intranet.
Coral: A Peer-to-peer Content Distribution Network
Content Distribution Networks
F5 BIGIP V 9 Training.
The Internet.
CloudFront: Living on the Edge
Content Distribution Networks
Internet Applications
Utilization of Azure CDN for the large file distribution
Internet Networking recitation #12
Co* Projects : CoDNS, CoDeploy, CoMon
CS 4700 / CS 5700 Network Fundamentals
Internet Applications
ECE 671 – Lecture 16 Content Distribution Networks
Distributed Content in the Network: A Backbone View
COS 518: Advanced Computer Systems Lecture 9 Michael Freedman
Distributed Systems CS
Content Distribution Networks
DotSlash: An Automated Web Hotspot Rescue System
Content Distribution Networks
AWS Cloud Computing Masaki.
Content Distribution Networks + P2P File Sharing
INTERNET APPLICATIONS
Computer Networks Primary, Secondary and Root Servers
EE 122: Lecture 22 (Overlay Networks)
Advanced Computer Networks
AKAMAI Content Delivery Services
Anonymous Communication
Content Distribution Networks + P2P File Sharing
Presentation transcript:

Content Distribution Networks COS 518: Advanced Computer Systems Lecture 16 Mike Freedman

Single Server, Poor Performance Single point of failure Easily overloaded Far from most clients Popular content Popular site “Flash crowd” Denial of Service attack

Skewed Popularity of Web Traffic “Zipf” or “power-law” distribution Characteristics of WWW Client-based Traces Carlos R. Cunha, Azer Bestavros, Mark E. Crovella, BU-CS-95-01

Web Caching

Proxy Caches Proxy server origin server HTTP request HTTP request 5 Proxy Caches origin server Proxy server HTTP request HTTP request client HTTP response HTTP response HTTP request HTTP response client

Forward Proxy Cache “close” to the client Explicit proxy Under administrative control of client-side AS Explicit proxy Requires configuring browser Implicit proxy Service provider deploys an “on path” proxy … that intercepts and handles Web requests Proxy server HTTP request client HTTP response HTTP request HTTP response client

Reverse Proxy Cache “close” to server Directing clients to the proxy origin server Cache “close” to server Either by proxy run by server or in third-party CDNs Directing clients to the proxy Map the site name to the IP address of the proxy Proxy server HTTP request HTTP response HTTP request HTTP response origin server

Google Design . . . Router Data Centers Servers Client Reverse Proxy Requests Private Backbone Internet Client Client

Proxy Caches (A) Forward (B) Reverse (C) Both (D) Neither Reactively replicates popular content Reduces origin server costs Reduces client ISP costs Intelligent load balancing between origin servers Offload form submissions (POSTs) and user auth Content reassembly or transcoding on behalf of origin Smaller round-trip times to clients Maintain persistent connections to avoid TCP setup delay (handshake, slow start)

Proxy Caches (A) Forward (B) Reverse (C) Both (D) Neither Reactively replicates popular content (C) Reduces origin server costs (C) Reduces client ISP costs (A) Intelligent load balancing between origin servers (B) Offload form submissions (POSTs) and user auth (D) Content reassembly, transcoding on behalf of origin (C) Smaller round-trip times to clients (C) Maintain persistent connections to avoid TCP setup delay (handshake, slow start) (C)

Limitations of Web Caching Much content is not cacheable Dynamic data: stock prices, scores, web cams CGI scripts: results depend on parameters Cookies: results may depend on passed data SSL: encrypted data is not cacheable Analytics: owner wants to measure hits Stale data Or, overhead of refreshing the cached data

Modern HTTP Video-on-Demand Download “content manifest” from origin server List of video segments belonging to video Each segment 1-2 seconds in length Client can know time offset associated with each Standard naming for different video resolutions and formats: e.g., 320dpi, 720dpi, 1040dpi, … Client downloads video segment (at certain resolution) using standard HTTP request. HTTP request can be satisfied by cache: it’s a static object Client observes download time vs. segment duration, increases/decreases resolution if appropriate

Content Distribution Networks

Content Distribution Network origin server in North America Proactive content replication Content provider (e.g., CNN) contracts with a CDN CDN replicates the content On many servers spread throughout the Internet Updating the replicas Updates pushed to replicas when the content changes CDN distribution node CDN server in S. America CDN server in Asia CDN server in Europe

Server Selection Policy Live server For availability Lowest load To balance load across the servers Closest Nearest geographically, or in round-trip time Best performance Throughput, latency, … Cheapest bandwidth, electricity, … Requires continuous monitoring of liveness, load, and performance

Server Selection Mechanism Application HTTP redirection Advantages Fine-grain control Selection based on client IP address Disadvantages Extra round-trips for TCP connection to server Overhead on the server GET Redirect GET OK

Server Selection Mechanism Advantages No extra round trips Route to nearby server Disadvantages Does not consider network or server load Different packets may go to different servers Used only for simple request-response apps Routing Anycast routing 1.2.3.0/24

Server Selection Mechanism Naming DNS-based server selection 1.2.3.4 DNS query 1.2.3.5 local DNS server

A DNS lookup traverses DNS hierarchy . (root) authority 198.41.0.4 edu.: NS 192.5.6.30 com.: NS 158.38.8.133 io.: NS 156.154.100.3 edu. authority 192.5.6.30 princeton.edu.: NS 66.28.0.14 pedantic.edu.: NS 19.31.1.1 Contact 192.5.6.30 for edu. www.princeton.edu? Client Client is doing a lookup for www.princeton.edu SEGUE: Finally, the local nameserver replies to the client with an answer. www.princeton.edu? Contact 66.28.0.14 for princeton.edu. www.princeton.edu? www.princeton.edu? www.princeton.edu.: A 140.180.223.42 www.princeton.edu A 140.180.223.42 princeton.edu. authority 66.28.0.14 www.princeton.edu.: A 140.180.223.42 Local nameserver . (root): NS 198.41.0.4 edu.: NS 192.5.6.30 princeton.edu.: NS 66.28.0.14

DNS caching Performing all these queries takes time And all this before actual communication takes place Caching can greatly reduce overhead Top-level servers very rarely change, popular sites visited often Local DNS server often has information cached How DNS caching works All DNS servers cache responses to queries Responses include a time-to-live (TTL) field, akin to cache expiry >>> SEGUE: And these TTL fields play a key role in how CDNs like Akamai function.

Server Selection Mechanism Advantages Avoid TCP set-up delay DNS caching reduces overhead Relatively fine control Disadvantage Based on IP address of local DNS server “Hidden load” effect DNS TTL limits adaptation Naming DNS-based server selection 1.2.3.4 DNS query 1.2.3.5 local DNS server

How Akamai Works

cnn.com (content provider) 23 How Akamai Uses DNS cnn.com (content provider) HTTP DNS root server GET index.html Akamai cluster Akamai global DNS server 1 2 HTTP http://cache.cnn.com/foo.jpg Akamai regional DNS server Nearby Akamai cluster end user

cnn.com (content provider) 24 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server DNS lookup cache.cnn.com Akamai cluster Akamai global DNS server 3 1 2 ALIAS: g.akamai.net 4 Akamai regional DNS server Nearby Akamai cluster end user

cnn.com (content provider) 25 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server DNS lookup g.akamai.net Akamai cluster Akamai global DNS server 5 3 1 2 6 4 Akamai regional DNS server ALIAS a73.g.akamai.net Nearby Akamai cluster end user

cnn.com (content provider) 26 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server Akamai cluster Akamai global DNS server 5 3 1 2 6 4 Akamai regional DNS server 7 DNS a73.g.akamai.net 8 Address 1.2.3.4 Nearby Akamai cluster end user

cnn.com (content provider) 27 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server Akamai cluster Akamai global DNS server 5 3 1 2 6 4 Akamai regional DNS server 7 8 9 Nearby Akamai cluster end user GET /foo.jpg Host: cache.cnn.com

cnn.com (content provider) 28 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server GET foo.jpg 11 12 Akamai cluster Akamai global DNS server 5 3 1 2 6 4 Akamai regional DNS server 7 8 9 Nearby Akamai cluster end user GET /foo.jpg Host: cache.cnn.com

cnn.com (content provider) 29 How Akamai Uses DNS cnn.com (content provider) HTTP DNS TLD server 11 12 Akamai cluster Akamai global DNS server 5 3 1 2 6 4 Akamai regional DNS server 7 8 9 Nearby Akamai cluster end user 10

How Akamai Works: Cache Hit 30 How Akamai Works: Cache Hit cnn.com (content provider) HTTP DNS TLD server Akamai cluster Akamai global DNS server 1 2 Akamai regional DNS server 3 4 5 Nearby Akamai cluster end user 6

Mapping System Equivalence classes of IP addresses IP addresses experiencing similar performance Quantify how well they connect to each other Collect and combine measurements Ping, traceroute, BGP routes, server logs E.g., over 100 TB of logs per days Network latency, loss, and connectivity

Mapping System Map each IP class to a preferred server cluster Based on performance, cluster health, etc. Updated roughly every minute Map client request to a server in the cluster Load balancer selects a specific server E.g., to maximize the cache hit rate

Adapting to Failures Failing hard drive on a server Failed server Suspends after finishing “in progress” requests Failed server Another server takes over for the IP address Low-level map updated quickly Failed cluster High-level map updated quickly Failed path to customer’s origin server Route packets through an intermediate node

Conclusion Content distribution is hard Many, diverse, changing objects Clients distributed all over the world Reducing latency is king Contribution distribution solutions Reactive caching Proactive content distribution networks