CSE 534 – Fundamentals of Computer Networks Lecture 11: Content Delivery Networks (Over 1 billion served … each day) Based on slides by D. NEU.

Slides:



Advertisements
Similar presentations
Akamai Content Delivery Network Slides from Bruce Maggs.
Advertisements

CS 4700 / CS 5700 Network Fundamentals Lecture 15: Content Delivery Networks (Over 1 billion served … each day) Revised 10/22/2014.
1 Server Selection & Content Distribution Networks (slides by Srini Seshan, CS CMU)
CSE 390 – Advanced Computer Networks Lecture 11: HTTP/Web (The Internet’s first killer app) Based on slides from Kurose + Ross, and Carey Williamson (UCalgary).
CS 4700 / CS 5700 Network Fundamentals Lecture 16: IXPs (The Underbelly of the Internet) Revised 3/23/2015.
On the Effectiveness of Measurement Reuse for Performance-Based Detouring David Choffnes Fabian Bustamante Fabian Bustamante Northwestern University INFOCOM.
1 Content Delivery Networks iBAND2 May 24, 1999 Dave Farber CTO Sandpiper Networks, Inc.
Engineering a Content Delivery Network Bruce Maggs.
1 From Internet Data Centers to Data Centers in the Cloud Data Centers Evolution − Internet Data Center − Enterprise Data Centers −Web 2.0 Mega Data Centers.
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
CDNs & Replication Prof. Vern Paxson EE122 Fall 2007 TAs: Lisa Fowler, Daniel Killebrew, Jorge Ortiz.
SERVER LOAD BALANCING Presented By : Priya Palanivelu.
Anycast Jennifer Rexford Advanced Computer Networks Tuesdays/Thursdays 1:30pm-2:50pm.
Drafting Behind Akamai (Travelocity-Based Detouring) Aleksandar Kuzmanovic Northwestern University Joint work with: A. Su, D. Choffnes, and F. Bustamante.
1 Drafting Behind Akamai (Travelocity-Based Detouring) AoJan Su, David R. Choffnes, Aleksandar Kuzmanovic, and Fabian E. Bustamante Department of Electrical.
1 Web Content Delivery Reading: Section and COS 461: Computer Networks Spring 2007 (MW 1:30-2:50 in Friend 004) Ioannis Avramopoulos Instructor:
Content Delivery Networks. History Early 1990s sees 100% growth in internet traffic per year 1994 o Netscape forms and releases their first browser.
Tradeoffs in CDN Designs for Throughput Oriented Traffic Minlan Yu University of Southern California 1 Joint work with Wenjie Jiang, Haoyuan Li, and Ion.
Caching and Content Distribution Networks. Web Caching r As an example, we use the web to illustrate caching and other related issues browser Web Proxy.
Content Delivery Networks (CDN) Dr. Yingwu Zhu Reverse Proxy Reverse Proxy Reverse Proxy Intranet Web Cache Architecure Browser Local ISP cache L4 Switch.
Content Distribution Networks (CDNs) Mike Freedman COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
© 2009 AT&T Intellectual Property. All rights reserved. Multimedia content growth: From IP networks to Medianets Cisco-IEEE ComSoc Webinar. Sept. 23, 2009.
1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers.
Traffic Engineering for CDNs Matt Jansen Akamai Technologies APRICOT 2015.
1 Caching  Temporary storage of frequently accessed data (duplicating original data stored somewhere else)  Reduces access time/latency for clients 
Active Network Applications Tom Anderson University of Washington.
Redirection and Load Balancing
1. 1.Charting the CDNs(locating all their content and DNS servers). 2.Assessing their server availability. 3.Quantifying their world-wide delay performance.
{ Content Distribution Networks ECE544 Dhananjay Makwana Principal Software Engineer, Semandex Networks 5/2/14ECE544.
EE616 Technical Project Video Hosting Architecture By Phillip Sutton.
Oasis: Anycast for Any Service Michael J. Freedman Karthik Lakshminarayanan David Mazières in NSDI 2006 Presented by: Sailesh Kumar.
Ao-Jan Su, David R. Choffnes, Fabián E. Bustamante and Aleksandar Kuzmanovic Department of EECS Northwestern University Relative Network Positioning via.
Chapter 4. After completion of this chapter, you should be able to: Explain “what is the Internet? And how we connect to the Internet using an ISP. Explain.
Lecture 8 Page 1 Advanced Network Security Review of Networking Basics: Internet Architecture, Routing, and Naming Advanced Network Security Peter Reiher.
Infrastructure for Better Quality Internet Access & Web Publishing without Increasing Bandwidth Prof. Chi Chi Hung School of Computing, National University.
Global Internet Content Delivery Akamai Technologies and Carnegie Mellon University Bruce Maggs.
Akamai vs. Flash Crowds and Distributed Denial of Service Akamai Technologies & Carnegie Mellon Bruce Maggs.
CPSC 441: Multimedia Networking1 Outline r Scalable Streaming Techniques r Content Distribution Networks.
2: Application Layer1 Chapter 2 outline r 2.1 Principles of app layer protocols r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail r 2.5 DNS r 2.6 Socket.
How Akamai Handles Large Events Bruce Maggs Carnegie Mellon Duke Akamai Technologies.
CDN: Content Distribution Networks  References:  CS613 textbook, “Computer Networking – A Top-Down Approach”, 6 th edition. Chapter  The text.
CSE534- Fundamentals of Computer Networking Lecture 12-13: Internet Connectivity + IXPs (The Underbelly of the Internet) Based on slides by D. Choffnes.
Globally Distributed Content Delivery Presenter: Baoning Wu 03/25/2003.
Content Distribution Network, Proxy CDN: Distributed Environment
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
Content Delivery Networks: Status and Trends Speaker: Shao-Fen Chou Advisor: Dr. Ho-Ting Wu 5/8/
Content Distribution Networks (CDNs)
Engineering a Content Delivery Network Bruce Maggs.
CIS679: Anycast r Review of Last lecture r Network-layer Anycast m Single-path routing for anycast messages r Application-layer anycast.
John S. Otto Mario A. Sánchez John P. Rula Fabián E. Bustamante Northwestern, EECS.
PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.
Drafting Behind Akamai (Travelocity-Based Detouring) Ao-Jan Su, David R. Choffnes, Aleksandar Kuzmanovic and Fabián E. Bustamante Department of EECS Northwestern.
Content Distribution Networks
Engineering a Content Delivery Network
Content Distribution Networks
Caching Temporary storage of frequently accessed data (duplicating original data stored somewhere else) Reduces access time/latency for clients Reduces.
LECTURE 34: WEB PROGRAMMING FOR SCALE
CS 4700 / CS 5700 Network Fundamentals
ECE 671 – Lecture 16 Content Distribution Networks
Managing Online Services
LECTURE 32: WEB PROGRAMMING FOR SCALE
LECTURE 33: WEB PROGRAMMING FOR SCALE
Content Distribution Networks
It Followed Me Home: Exploring Strong Last Hop Devices and CDNs
CSCI-351 Data communication and Networks
Engineering a Content Delivery Network
Content Delivery and Remote DNS services
EE 122: Lecture 22 (Overlay Networks)
LECTURE 33: WEB PROGRAMMING FOR SCALE
Engineering a Content Delivery Network
Presentation transcript:

CSE 534 – Fundamentals of Computer Networks Lecture 11: Content Delivery Networks (Over 1 billion served … each day) Based on slides by D. NEU. Revised Spring 2015 by P. Gill

 Motivation  CDN basics  Prominent example: Akamai (independent reading ) Outline 2

Evolution of Serving Web Content 3  In the beginning…  …there was a single server  Probably located in a closet  And it probably served blinking text  Issues with this model  Site reliability Unplugging cable, hardware failure, natural disaster  Scalability Flash crowds (aka Slashdotting)

Replicated Web service 4  Use multiple servers  Advantages  Better scalability  Better reliability  Disadvantages  How do you decide which server to use?  How to do synchronize state among servers?

Load Balancers 5  Device that multiplexes requests across a collection of servers  All servers share one public IP  Balancer transparently directs requests to different servers  How should the balancer assign clients to servers?  Random / round-robin When is this a good idea?  Load-based When might this fail?  Challenges  Scalability (must support traffic for n hosts)  State (must keep track of previous decisions) RESTful APIs reduce this limitation

Load balancing: Are we done? 6  Advantages  Allows scaling of hardware independent of IPs  Relatively easy to maintain  Disadvantages  Expensive  Still a single point of failure  Location! Where do we place the load balancer for Wikipedia?

Popping up: HTTP performance 7  For Web pages  RTT matters most  Where should the server go?  For video  Available bandwidth matters most  Where should the server go?  Is there one location that is best for everyone?

Why speed matters 8  Impact on user experience  Users navigating away from pages  Video startup delay

Why speed matters 9  Impact on user experience  Users navigating away from pages  Video startup delay  Impact on revenue  Amazon: increased revenue 1% for every 100ms reduction in page load time (PLT)  Shopzilla:12% increase in revenue by reducing PLT from 6 seconds to 1.2 seconds  Ping from BOS to LAX: ~100ms

Strawman solution: Web caches 10  ISP uses a middlebox that caches Web content  Better performance – content is closer to users  Lower cost – content traverses network boundary once  Does this solve the problem?  No!  Size of all Web content is too large   Web content is dynamic and customized Can’t cache banking content What does it mean to cache search results?

 Motivation  CDN basics  Prominent example: Akamai Outline 11

What is a CDN? 12  Content Delivery Network  Also sometimes called Content Distribution Network  At least half of the world’s bits are delivered by a CDN Probably closer to 80/90%  Primary Goals  Create replicas of content throughout the Internet  Ensure that replicas are always available  Directly clients to replicas that will give good performance

Key Components of a CDN 13  Distributed servers  Usually located inside of other ISPs  Often located in IXPs (coming up next)  High-speed network connecting them  Clients (eyeballs)  Can be located anywhere in the world  They want fast Web performance  Glue  Something that binds clients to “nearby” replica servers

Examples of CDNs 14  Akamai  147K+ servers, networks, 650+ cities, 92 countries  Limelight  Well provisioned delivery centers, interconnected via a private fiber-optic connected to 700+ access networks  Edgecast  30+ PoPs, 5 continents, direct connections  Others  Google, Facebook, AWS, AT&T, Level3, Brokers

Mapping clients to servers 15  CDNs need a way to send clients to the “best” server  The best server can change over time  And this depends on client location, network conditions, server load, …  What existing technology can we use for this?  DNS-based redirection  Clients request  DNS server directs client to one or more IPs based on request IP  Use short TTL to limit the effect of caching

DNS Redirection Considerations 16  Advantages  Uses existing, scalable DNS infrastructure  URLs can stay essentially the same  TTLs can control “freshness”  Limitations  DNS servers see only the DNS resolver IP Assumes that client and DNS server are close. Is this accurate?  Small TTLs are often ignored  Content owner must give up control  Unicast addresses can limit reliability

CDN Using Anycast 17  Anycast address  An IP address in a prefix announced from multiple locations AS 1AS 31AS 41AS 32AS 3 AS 20AS /16 ?

Anycasting Considerations 18  Why do anycast?  Simplifies network management Replica servers can be in the same network domain  Uses best BGP path  Disadvantages  BGP path may not be optimal  Stateful services can be complicated

Optimizing Performance 19 Key goal Send clients to server with best end-to-end performance  Performance depends on  Server load  Content at that server  Network conditions  Optimizing for server load  Load balancing, monitoring at servers  Generally solved

Optimizing performance: caching 20  Where to cache content?  Popularity of Web objects is Zipf-like Also called heavy-tailed and power law  N r ~ r -1  Small number of sites cover large fraction of requests  Given this observation, how should cache-replacement work?

Optimizing performance: Network 21  There are good solutions to server load and content  What about network performance?  Key challenges for network performance  Measuring paths is hard Traceroute gives us only the forward path Shortest path != best path  RTT estimation is hard Variable network conditions May not represent end-to-end performance  No access to client-perceived performance

Optimizing performance: Network 22  Example approximation strategies  Geographic mapping Hard to map IP to location Internet paths do not take shortest distance  Active measurement Ping from all replicas to all routable prefixes 56B * 100 servers * 500k prefixes = 500+MB of traffic per round  Passive measurement Send fraction of clients to different servers, observe performance Downside: Some clients get bad performance

 Motivation  CDN basics  Prominent example: Akamai Outline 23

Akamai case study 24  Deployment  147K+ servers, networks, 650+ cities, 92 countries  highly hierarchical, caching depends on popularity  4 yr depreciation of servers  Many servers inside ISPs, who are thrilled to have them Why?  Deployed inside100 new networks in last few years  Customers  250K+ domains: all top 60 eCommerce sites, all top 30 M&E companies, 9 of 10 to banks, 13 of top 15 auto manufacturers  Overall stats  5+ terabits/second, 30+ million hits/second, 2+ trillion deliveries/day, 100+ PB/day, 10+ million concurrent streams  15-30% of Web traffic

Somewhat old network map 25

Akamizing Links 26  Embedded URLs are Converted to ARLs Welcome to xyz.com! Welcome to our Web site! Click here to enter AK

DNS Redirection  Web client’s request redirected to ‘close’ by server  Client gets web site’s DNS CNAME entry with domain name in CDN network  Hierarchy of CDN’s DNS servers direct client to 2 nearby servers Internet Web client Hierarchy of CDN DNS servers Customer DNS servers (1) (2) (3) (4) (5) (6) LDNS Client requests translation for yahoo Client gets CNAME entry with domain name in Akamai Client is given 2 nearby web replica servers (fault tolerance) Web replica servers Multiple redirections to find nearby edge servers 27

Mapping Clients to Servers 28  Maps IP address of client’s name server and type of content being requested (e.g., “g” in a212.g.akamai.net) to an Akamai cluster.  Special cases: Akamai Accelerated Network Partners (AANPs)  Probably uses internal network paths  Also may require special “compute” nodes  General case: “Core Point” analysis

Core points 29  Core point X is the first router at which all paths to nameservers 1, 2, 3, and 4 intersect.  Traceroute once per day from 300 clusters to 280,000 nameservers.

Core Points 30  280,000 nameservers (98.8% of requests) reduced to 30,000 core points  ping core points every 6 minutes

Server clusters 31

Key future challenges 32  Mobile networks  Latency in cell networks is higher  Internal network structure is more opaque Is this AT&T mobile client in Seattle or NYC?  Video  4k/8k UHD = 16-30K Kbps compressed  25K Tbps projected  Big data center networks not enough (5 Tbps each)  Multicast (from end systems) potential solution