Content Delivery Networks - Principles & Practice
  Northeastern & Akamai Technologies
  Ravi Sundaram
Outline
  CDNs - Review of mechanics
  FirstPoint - Traffic Management for mirrored websites
The Web: Simple on the Outside…
  [Diagram labels: Content Providers, Internet, End Users]
…But Problematic on the Inside
  [Diagram labels: Content Providers, Network Providers (UUNet, Qwest, AOL), Peering Points (NAP), End Users]
Why does my click not work?
  Latency - Browser takes a long time to load the page
  Packet Loss - Browser hangs, user needs to hit refresh
  Jitter - Streams are jerky
  Server load - Browser connects but does not fully load the page
  Broken/missing content
The Akamai Solution
  Servers at Network Edge
  [Diagram labels: Content Providers, NAP, End Users]
Downloading - before CDNs
  User enters a URL
  Browser requests IP address for the hostname
  DNS returns IP address
  Browser requests HTML
  Content provider’s web server returns HTML
  Browser obtains IP addresses for hostnames listed in URLs of objects embedded on page
  Browser requests embedded objects
  Content provider’s web server returns embedded objects
  [Diagram labels: DNS, Content Provider’s Web Server]
DNS Resolution
  [Diagram: lookup chain through Browser’s Cache, OS, Local Name Server, .com/.net Root (InterNIC), xyz.com DNS Servers; steps numbered 1 to 4; TTL labels: 1 Day, 30 Minutes]
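The lookup chain on this slide can be sketched as a stack of TTL-bound caches. This is a minimal illustration, not a real resolver: the class and function names, the address 192.0.2.10, and the choice of a 30-minute TTL for the www.xyz.com record are all assumptions made for the example.

```python
class TTLCache:
    """Cache that forgets an answer once its TTL has elapsed."""

    def __init__(self):
        self._entries = {}  # hostname -> (ip, expiry time in seconds)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # stale: drop it and report a miss
            return None
        return ip

    def put(self, name, ip, ttl_seconds, now):
        self._entries[name] = (ip, now + ttl_seconds)


def resolve(name, caches, authoritative, now):
    """Ask each cache in order; fall back to the authoritative answer.

    caches is a list of (level name, TTLCache) pairs, nearest first.
    authoritative maps hostname -> (ip, ttl_seconds); it is a
    hypothetical stand-in for the root and xyz.com DNS servers.
    Returns (ip, level that answered).
    """
    for level, cache in caches:
        ip = cache.get(name, now)
        if ip is not None:
            return ip, level
    ip, ttl = authoritative[name]
    for _, cache in caches:
        cache.put(name, ip, ttl, now)  # every level remembers the answer
    return ip, "authoritative"


caches = [("browser", TTLCache()), ("os", TTLCache()),
          ("local name server", TTLCache())]
authoritative = {"www.xyz.com": ("192.0.2.10", 30 * 60)}  # assumed TTL

print(resolve("www.xyz.com", caches, authoritative, now=0))
print(resolve("www.xyz.com", caches, authoritative, now=60))
print(resolve("www.xyz.com", caches, authoritative, now=31 * 60))
```

The first and third lookups go all the way to the authoritative data, because the caches are empty or expired; the second is answered by the browser cache. A short TTL is what lets a DNS-based system change its answer quickly.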
Delivery of Whole Site
  1. Browser requests DNS for IP of the site's hostname
  2. DNS follows CNAME redirect to the CDN's hostname
  3. DNS returns IP of optimal Akamai server
  4. Browser requests Akamai server for content
  5. Akamai server assembles page, contacting origin as needed
  6. Browser obtains content from optimal Akamai server
  [Diagram labels: DNS, Origin - Content Provider’s Web Server]
Delivery of Whole Site - DNS Redirect
  [Diagram: DNS CNAME record; label "CNAME 2D"]
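The CNAME redirect can be sketched with a toy record table. All hostnames and addresses below are hypothetical, and the final A record is fixed here, whereas the slides describe a DNS that returns the optimal server for each requester.

```python
def resolve_with_cnames(name, records, max_hops=8):
    """Follow CNAME records until an A record is found.

    records maps a hostname to ("CNAME", target) or ("A", ip).
    Returns (ip, chain), where chain lists every hostname visited.
    """
    chain = [name]
    for _ in range(max_hops):
        rtype, value = records[chain[-1]]
        if rtype == "A":
            return value, chain
        chain.append(value)  # CNAME: continue the lookup at the target
    raise RuntimeError("CNAME chain too long or looping")


# Hypothetical zone data: the content provider aliases its site
# hostname to a name served by the CDN's DNS.
records = {
    "www.xyz.com": ("CNAME", "www.xyz.com.cdn.example"),
    "www.xyz.com.cdn.example": ("A", "192.0.2.20"),
}

ip, chain = resolve_with_cnames("www.xyz.com", records)
print(ip, chain)
```

Because the content provider only publishes the alias, the CDN's DNS keeps control over which server address is handed out.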
Delivery of Whole Site - Page Assembly
  Site owners create container pages that can be populated with varying content
  Container Page [TTL=5d]
    [XYZ news, content, promotions, etc. TTL=5d]
    [Breaking headlines TTL=2h]
    [TTL=15m]
    [TTL=8h]
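Page assembly with per-fragment TTLs can be sketched as follows. This is an illustrative model only: `EdgeAssembler`, `parse_ttl`, and the fragment names are hypothetical, and the slide does not name the fragments with TTL=15m and TTL=8h.

```python
def parse_ttl(text):
    """Convert a TTL such as '5d', '2h' or '15m' into seconds."""
    units = {"d": 86400, "h": 3600, "m": 60}
    return int(text[:-1]) * units[text[-1]]


class EdgeAssembler:
    """Builds a page from fragments, refetching each one only when stale."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: fragment name -> content
        self._cache = {}                 # fragment name -> (content, expiry)
        self.origin_fetches = 0

    def fragment(self, name, ttl, now):
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]
        content = self._fetch(name)      # contact origin as needed
        self.origin_fetches += 1
        self._cache[name] = (content, now + parse_ttl(ttl))
        return content

    def assemble(self, layout, now):
        """layout is a list of (fragment name, TTL string) pairs."""
        return "\n".join(self.fragment(n, t, now) for n, t in layout)


# Hypothetical layout mirroring the TTLs on the slide.
layout = [("container", "5d"), ("headlines", "2h"), ("ticker", "15m")]
edge = EdgeAssembler(lambda name: "<" + name + ">")

edge.assemble(layout, now=0)        # cold cache: every fragment is fetched
edge.assemble(layout, now=20 * 60)  # only the 15-minute fragment is stale
print(edge.origin_fetches)
```

Long-lived fragments are served from the edge for days, while short-lived ones are refreshed often, so the origin sees only a fraction of the requests.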
Benefits of CDNs
  Improved end-user experience
    -reduce latency
    -reduce loss
    -reduce jitter
  Reduced network congestion
  Increased scalability
  Improved fault-tolerance
  Reduced vulnerability
  Reduced costs
Outline
  CDNs - Review of mechanics
  FirstPoint - Traffic Management for mirrored websites
What is FirstPoint
  Traffic management system for mirrored websites
  Directs browser to the optimal mirror
  DNS based
  Application level anycast
Why FirstPoint
  Content providers have mirrored websites
  Content providers only want to offload embedded content
    -Control
    -Security
    -Performance
Mapping Problem
  How to improve user experience?
What is the Mapping Problem
  Problem of directing requests to servers so as to optimize end-user experience
    -reduce latency
    -reduce loss
    -reduce jitter
  Assumption - servers are fine
  Applicable to 2 mirrors or 1500 Akamai locations
Attempt
  Measure which is closer
    -Closeness changes over time
  Measure frequently
    -Bothers people
    -Too many to do
  ~500,000 unique nameservers on any given day
  10 sec per measurement cycle
Idea
  Topology
    -relatively static
    -changes in BGP time
    -order of hours if not days
  Congestion
    -dynamic
    -changes in round-trip time
    -order of milliseconds
Topology Discovery - Proxy points
  [Diagram labels: Cluster, X, Y]
Aliasing
  Router fabrics using HSRP (Hot Standby Router Protocol)
    -correlate over time
  Routers with multiple interfaces
    -source address of UDP/ICMP packets
Set cover
  Let sets represent proxy points
  Let elements represent nameservers
  Find minimum collection of proxy points covering nameservers
  [Diagram: X covers 1, 2, 3 and 4]
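Minimum set cover is NP-hard, so in practice it is usually approximated. The slides do not say which algorithm was used; the sketch below uses the standard greedy heuristic as an assumption, with hypothetical proxy points W, X, Y, Z and nameservers 1 to 6.

```python
def greedy_set_cover(universe, candidates):
    """Pick proxy points until every nameserver is covered.

    candidates maps a proxy point to the set of nameservers it covers.
    Each round takes the proxy point covering the most still-uncovered
    nameservers (ties broken by name so the result is deterministic).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sorted(candidates),
                   key=lambda p: len(candidates[p] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some nameservers have no covering proxy point")
        chosen.append(best)
        uncovered -= gained
    return chosen


candidates = {
    "X": {1, 2, 3, 4},  # as on the slide: X covers 1, 2, 3 and 4
    "Y": {4, 5},
    "Z": {5, 6},
    "W": {3},
}
print(greedy_set_cover({1, 2, 3, 4, 5, 6}, candidates))
```

Here X is taken first because it covers four nameservers, then Z because it covers both remaining ones.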
Topology Discovery
  At each mirror maintain list of partial paths to nameservers
  At each epoch extend paths by 1, in randomized fashion, and exchange with other mirror
  If the two (partial) paths to a nameserver have intersected then declare that nameserver done
  If path has reached forbidden IP then wait
  Use pair of proxies in case of failure
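The path-extension loop can be sketched for two mirrors and one nameserver. This is a simplified model under stated assumptions: the full hop lists are given up front and revealed one hop per step, the way a traceroute-style probe would discover them; the randomization picks which mirror extends next; and the function name and hop names are hypothetical.

```python
import random


def discover_proxy_point(path_a, path_b, forbidden=frozenset(), seed=0):
    """Extend two partial paths toward one nameserver until they intersect.

    path_a and path_b are the hop lists from mirrors A and B toward the
    nameserver. Returns the first hop seen on both partial paths, or
    None if no further extension is possible (a forbidden IP is next,
    or both paths are exhausted).
    """
    rng = random.Random(seed)
    full = {"A": path_a, "B": path_b}
    known = {"A": [], "B": []}  # the partial paths discovered so far
    while True:
        extendable = [
            m for m in ("A", "B")
            if len(known[m]) < len(full[m])
            and full[m][len(known[m])] not in forbidden
        ]
        if not extendable:
            return None  # nothing left to probe for now
        mirror = rng.choice(extendable)  # randomized extension
        known[mirror].append(full[mirror][len(known[mirror])])
        # Exchange with the other mirror: compare the two partial paths.
        common = set(known["A"]) & set(known["B"])
        if common:
            return next(hop for hop in known["A"] if hop in common)


path_a = ["a1", "a2", "r1", "r2"]
path_b = ["b1", "r1", "r2"]
print(discover_proxy_point(path_a, path_b))
print(discover_proxy_point(path_a, path_b, forbidden={"r1"}))
```

With these paths the two mirrors converge at r1, which becomes the proxy point for the nameserver; if r1 is a forbidden IP, probing stops short of it and no proxy point is declared.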
Topology Discovery - Proxy points
  Data exchange
Topology Discovery
  500,000 nameservers reduced to 90,000 proxy points (clusters)
Histogram of cluster sizes
Congestion Measurement
  Problem - Still too many measurements to do. 90,000 measurements every 10s with 32B packets requires a few Mbps per mirror.
  Solution - Importance based sampling
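The "few Mbps" figure can be checked with a short calculation. The sketch assumes one 32-byte packet per target per cycle and counts traffic in one direction only; the function name is hypothetical.

```python
def probe_bandwidth_mbps(targets, packet_bytes, cycle_seconds):
    """Average probe traffic in megabits per second."""
    bits_per_cycle = targets * packet_bytes * 8
    return bits_per_cycle / cycle_seconds / 1e6


# 90,000 proxy points, 32-byte packets, 10-second cycle.
print(round(probe_bandwidth_mbps(90_000, 32, 10), 3))   # about 2.3 Mbps
# For comparison: probing all 500,000 nameservers directly.
print(round(probe_bandwidth_mbps(500_000, 32, 10), 3))  # about 12.8 Mbps
```

Clustering alone cuts the probe load by more than a factor of five, but a couple of Mbps per mirror for measurement is still substantial, which motivates sampling.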
CDF of End-user Load
Load Estimation
  500,000 nameservers reduced to 90,000 clusters
  Of the 90,000 clusters, 7,000 account for 95% of end-user load!
Mapping Problem - Solved!
  Maps built every 10s
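One reading of importance-based sampling is to measure only the clusters that carry most of the load. The sketch below selects the smallest set of clusters, heaviest first, that reaches a target share of total load; the function name and the load numbers are made up for illustration.

```python
def clusters_covering(load_by_cluster, fraction=0.95):
    """Return the heaviest clusters that together reach the load share.

    load_by_cluster maps a cluster id to its observed end-user load.
    Clusters are taken in decreasing order of load (ties by id).
    """
    total = sum(load_by_cluster.values())
    picked, running = [], 0.0
    ranked = sorted(load_by_cluster.items(), key=lambda kv: (-kv[1], kv[0]))
    for cluster, load in ranked:
        if running >= fraction * total:
            break
        picked.append(cluster)
        running += load
    return picked


# Hypothetical loads: a few clusters dominate, as the CDF slide suggests.
loads = {"c1": 60, "c2": 25, "c3": 11, "c4": 3, "c5": 1}
print(clusters_covering(loads))
```

With a skewed load distribution, a small subset of clusters is enough to cover nearly all users, so frequent measurement can be limited to that subset.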
FirstPoint
  Customers - how to tell?
    -look for CNAME to akadns.net
  Customers - who?
    -High traffic content providers
    -Yahoo!, Microsoft, TicketMaster etc
  Price - don’t ask :)
  Competitors - who?
    -one-of-a-kind service
    -boxes: Cisco, F5, Foundry
FirstPoint - other aspects
  Load-balancing
    -estimate-based
    -feedback-based: HTTPS, SNMP
    -cost-based: 95/5
  Fast cutout in case of failover
  Highly fault-tolerant
    -hardware duplication, leader election
    -overlay routing, BGP-based anycast
  Integration with other services
    -DoS/load failover
Microsoft
Related Work
  Topology
    -Spring, Mahajan, Wetherall, Sigcomm ‘02
    -Govindan, Tangmunarunkit, Infocom ‘00
  Clustering
    -Krishnamurthy, Wang, Sigcomm ‘00
    -Bestavros, Mehrotra, WWC ‘01
    -Barford, Gast, Globecom ‘02
  Clustering
    -Shaikh, Tewari, Agrawal, Infocom ‘00
    -Krishnamurthy, Wills, Zhang, Sigcomm IMW ‘01
Patents (pending)
  Global load balancing across mirrored data centers. Utility #
  Method for predicting file download time from mirrored data centers in a global computer network. Utility #
  Method for generating a network map. Utility #
  Method and system for protecting websites from public Internet threats. Filed 15 July 2002
Principles
  Open design principle
    -You need all the help you can get
    -Do not eliminate the obvious without trying first
    -Give serendipity a chance
  Scaling principle
    -factor 10 difference means different domain
    -different domains need different techniques
  The common case principle
    -Zipf law is your friend
    -things cluster
    -optimize the common case
Conclusion
  The Internet will never be fast enough in all places
  People will want access to the Internet all the time and everywhere