Load-Balancing in Content-Delivery Networks

Slides:



Advertisements
Similar presentations
Distributed Multimedia Systems Tarek Elshaarani Vahid Rafiei.
Advertisements

Akamai Content Delivery Network Slides from Bruce Maggs.
Akamai networks,48000 servers and 70 countries in the world.
Web Booster HTTP Server Acceleration for Lotus Domino.
Resource Pooling A system exhibits complete resource pooling if it behaves as if there was a single pooled resource. The Internet has many mechanisms for.
Key Algorithms in a Content Delivery System Akamai Technologies and Carnegie Mellon University Bruce Maggs.
1 Content Delivery Networks iBAND2 May 24, 1999 Dave Farber CTO Sandpiper Networks, Inc.
Engineering a Content Delivery Network Bruce Maggs.
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
Network Coding for Large Scale Content Distribution Christos Gkantsidis Georgia Institute of Technology Pablo Rodriguez Microsoft Research IEEE INFOCOM.
Internet Content Providers End Users The Internet: Simple on the Outside…
Wide-area cooperative storage with CFS
Capacity planning for web sites. Promoting a web site Thoughts on increasing web site traffic but… Two possible scenarios…
Content Delivery Networks. History Early 1990s sees 100% growth in internet traffic per year 1994 o Netscape forms and releases their first browser.
Tradeoffs in CDN Designs for Throughput Oriented Traffic Minlan Yu University of Southern California 1 Joint work with Wenjie Jiang, Haoyuan Li, and Ion.
1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers.
Server Load Balancing. Introduction Why is load balancing of servers needed? If there is only one web server responding to all the incoming HTTP requests.
{ Content Distribution Networks ECE544 Dhananjay Makwana Principal Software Engineer, Semandex Networks 5/2/14ECE544.
1 An SLA-Oriented Capacity Planning Tool for Streaming Media Services Lucy Cherkasova, Wenting Tang, and Sharad Singhal HPLabs,USA.
Global NetWatch Copyright © 2003 Global NetWatch, Inc. Factors Affecting Web Performance Getting Maximum Performance Out Of Your Web Server.
Load-Balancing in Content-Delivery Networks Michel Goemans.
COLLABORATIVE SPECTRUM MANAGEMENT FOR RELIABILITY AND SCALABILITY Heather Zheng Dept. of Computer Science University of California, Santa Barbara.
Global Internet Content Delivery Akamai Technologies and Carnegie Mellon University Bruce Maggs.
Akamai vs. Flash Crowds and Distributed Denial of Service Akamai Technologies & Carnegie Mellon Bruce Maggs.
Overcast: Reliable Multicasting with an Overlay Network CS294 Paul Burstein 9/15/2003.
CPSC 441: Multimedia Networking1 Outline r Scalable Streaming Techniques r Content Distribution Networks.
Data Structures & Algorithms and The Internet: A different way of thinking.
How Akamai Handles Large Events Bruce Maggs Carnegie Mellon Duke Akamai Technologies.
Unit – I CLIENT / SERVER ARCHITECTURE. Unit Structure  Evolution of Client/Server Architecture  Client/Server Model  Characteristics of Client/Server.
Sharing Information across Congestion Windows CSE222A Project Presentation March 15, 2005 Apurva Sharma.
Peer-Assisted Content Distribution Pablo Rodriguez Christos Gkantsidis.
Akamai capabilities overview and it’s impact on Iowa.Gov and selected web pages.
Globally Distributed Content Delivery Presenter: Baoning Wu 03/25/2003.
Overlay Networks : An Akamai Perspective
Tackling I/O Issues 1 David Race 16 March 2010.
Engineering a Content Delivery Network Bruce Maggs.
Algorithms used by CDNs Stable Marriage Algorithm Consistent Hashing.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
Seeking the perfect Video Streaming Hosting BY. Romeo.
1 Design and Implementation of a High-Performance Distributed Web Crawler Polytechnic University Vladislav Shkapenyuk, Torsten Suel 06/13/2006 석사 2 학기.
William Stallings Data and Computer Communications
INTRODUCTION TO WEB HOSTING
Distributed, real-time actionable insights on high-volume data streams
Engineering a Content Delivery Network
Chapter 25 Domain Name System.
Netscape Application Server
The Challenges of Delivering Content through the Internet
The Stable Marriage Problem
Scale and Performance in the CoBlitz Large-File Distribution Service
Large-scale file systems and Map-Reduce
Securing the Network Perimeter with ISA 2004
Mohammad Malli Chadi Barakat, Walid Dabbous Alcatel meeting
Network Load Balancing
SwiftServe Technical Workshop
GlassFish in the Real World
Distributed Multimedia Systems
Utilization of Azure CDN for the large file distribution
Software Architecture in Practice
Internet Networking recitation #12
Algorithmic Nuggets in Content Delivery
Edge computing (1) Content Distribution Networks
Content Distribution Networks
AWS Cloud Computing Masaki.
CS246 Search Engine Scale.
Capacity Unit 26 Design a small or home office network
CS246: Search-Engine Scale
Engineering a Content Delivery Network
Content Delivery and Remote DNS services
Challenges with developing a Commercial P2P System
Engineering a Content Delivery Network
Presentation transcript:

Load-Balancing in Content-Delivery Networks Michel Goemans

Covering MSTs [G.-Vondrak ’03] Complete graph Kn with distinct edge weights Q= U {S: |S| ≥ n-k} MST(G[S]) How large can |Q| be (as a function of n and k)? S selected uniformly at random ( Pr[v in S]=0.5 ) Find Q such that Pr [Q contains MST(G[S])] ≥ 1 – 1 / nc How small can one choose |Q| ? April 9, 2003 IMA

Problems with Centralized Content Slow content must traverse multiple backbones and long distances http://www.lalibre.be Unreliable content may be blocked by congestion or backbone peering problems Not scalable usage limited by bandwidth available at origin site www.lemonde.fr (edgesuited) June 4th 2001: Cable & Wireless stopped peering with PSInet (because PSI was going bankrupt) and meant that packets from PSI to C&W were just dropped. No connection whatsoever. If origin server was on PSInet, no client coming from C&W or going thru C&W could get through. Denial-of-service. Denial-of-service attacks March 24, 2003: http://english.aljazeera.net was launched and went down the same day (hacker attacks + load) April 9, 2003 IMA

Content-Delivery Networks Get content from server close to user Fast: Eliminate long distances, peering issues http://www.lemonde.fr Reliable: Less affected by failures in the internet Scalable: Content can be massively accessed simultaneously MSNBC.com served an unprecedented 12.7 million page views on September 11th "We would have 10 times the capacity sitting idle for 364 days of the year, which is ridiculous. Akamai is a very elegant solution for not throwing money at hardware, 90% of which you'd leave turned off most of the time." - Steve White, CTO, MSNBC.com April 9, 2003 IMA

Akamai 13000 servers distributed across internet Delivery All types: Embedded objects http://a177.ch1.akamai.net/... Whole site http://www.fbi.gov/ http/https https://cardholder.paysystems.com All types: html, pictures, software downloads, … live and on-demand streaming java servlets Reconstruction and processing at the edge Logitech: expected 100 times its regular traffic for giving out 20,000 cordless keyboards in Dec 2002. Used Edge java. Java program would randomly select 20,000 winners, generate win/lose page. Origin sever only contacted for a win. Akamai delivered 14 million page views per hour (72 millions over the event) and over 1 Gbps of traffic at peak. Dealer locator for Sony Ericsson in edge java. my.lycos.com personalization at the edge April 9, 2003 IMA

DNS based system www.sonyericsson.com CNAME a1538.g.akamaitech.net MAPPING to Akamai server Time To Live: 20 sec 146.57.248.7 On U. of M. campus April 9, 2003 IMA

Mapping Mapping depends on Mapping = Load Balancing User location IP space (dynamically) clustered in 10-50,000 blocks Content type requested web downloads, live streaming (WindowsMediaServer, QuickTime, Realaudio), secure content, java, … Performance/congestion in internet Latency, packet loss, … Load on Akamai servers Mapping = Load Balancing NOT EASY!!! Goal here is to show some of the complexity and to discuss one core algorithm used in practice April 9, 2003 IMA

Multi-Objective Major goals: Other goals: User experience, quality mapping Not overload servers Other goals: Robustness against load spikes, internet failures, … Bandwidth utilization User experience = function of latency, packet loss, cache hit, load on servers April 9, 2003 IMA

Mapping or Load-Balancing Two levels: Toplevel Mapping to a cluster of servers Updated every < 1 min Lowlevel: Mapping within cluster Constantly being updated If a server goes down, other can seamlessly take over Focus on toplevel April 9, 2003 IMA

LoadBal Complexity: Reaction Times Nameserver resolutions are cached by local nameservers (NS) Impact of mapping changes not immediate Actual load is smoothed Harder to detect load spikes → need to anticipate Stability issues Load graph “Minor” issue since TTLs are small (20 secs) 2000 Cannes Festival Victoria's Secret Fashion show. Visitors from 140 countries, over 15 million requests and up to 1700 hits/second April 9, 2003 IMA

LoadBal Complexity: Load Stickiness Once download/event starts, no way to shift load crucial issue for streaming events (connection times of over an hour) → “trial-and-error” approach unacceptable 16.5 Gbps at peak to 81,000 viewers for Steve Jobs keynote in January 2002. total of 11 terabytes of content to 160,000 unique viewers. Load goes from 10% to 90% in 5 minutes just before keynote. Very persistent load. April 9, 2003 IMA

Loadbal Complexity: Heterogeneity of Traffic Very different content types http, https, live streaming (WMS, Real, QT, …), huge downloads, content with huge cache footprint,… Not every machine can serve every request Customer constraints →Same IP can be mapped at same time to many different machines for different contents →Need to perform millions of assignments every 30 secs Need license for realaudio, windows for WMS For huge downloads, does not make sense to put content available everywhere. BMW movies: high quality streams (300 kbps), 200 TB of data accessed, 14 million film views during first series, DVD-quality movie to be downloaded: 1Gb (=1- mins on a cable/DSL connection) Software downloads for Veritas. Up to 800 MB in size! April 9, 2003 IMA

Yesterday from here www.lemonde.fr, www.msnbc.com, www.bestbuy.com, www.logitech.com, www.monster.com 146.57.248.* (University of Minnesota) a123.r.akareal.net 63.240.15.177 (ATT – New York) https://cardholder.paysystems.com 63.211.40.85 (L3 – New York) a177.ch1.akamai.net (usa.bmwfilms.com) 64.241.238.153 (Savvis – Chicago) April 9, 2003 IMA

LB Complexity: Multi-Dimensional Load Not a single constraining resource! Can be: Bandwidth CPU usage (e.g. key signing for https) Disk usage (e.g. for cache misses, auction sites) Memory (e.g. EdgeJava) Threads (e.g. EdgeJava) Number of licenses in realaudio Not necessarily linear Live streaming: 0-1: whether cluster subscribes to stream April 9, 2003 IMA

LoadBal Complexity: No load conservation Multi-dimensional + non-linear = No load conservation No Network flow April 9, 2003 IMA

LoadBal Complexity: Contracts Network contracts E.g. Akamai machines on U. of M. campus can be used to serve only users from U. of M. Customer contracts E.g. maximum to serve, customer servers, … ? Customer wants Akamai to serve up to x Gbps of traffic. April 9, 2003 IMA

LoadBal Complexity: Extreme Cases 99% of time: Load-balancing easy 1%: extreme conditions NEED to work under most incredible scenarios Internet failures (or DOS attacks) Only with part of input (since highly distributed environment) Scheduled/unscheduled events Load estimates unreliable CRITICAL to run fast (avoid domino effect) Reasonable mappings Internet failures (e.g. C&W and PSInet not peering) Scheduled/unscheduled events (e.g. California earthquake, 9/11, Clinton-Lewinsky, …) 2002 Superbowl: AT&T wireless ad campaign. Surge in traffic to its website and website went down! 24.8 billion hits on day Irak war started 3/20/03 [Jeff Young, --> Washington Post] April 9, 2003 IMA

LoadBal Complexity: Scalability NEED (sub)linear time algorithm that can be parallelized and distributed April 9, 2003 IMA

Stable Marriages Assignment of men and women Each man ranks each woman and vice versa Marriage stable if no pair (m,w) unmatched where m prefers w to his “wife” and w prefers m to her “husband” 3 2 2 4 April 9, 2003 IMA

Beauty of Stable Marriages [Gale and Shapley ’62]: Stable marriage always exists! Algorithm: (“men-propose, women-dispose”) Each unmatched man proposes to women in order of preference A woman (tentatively) accepts if proposal came from a man she prefers to her tentative fiance Stable marriage independent of order of proposals!: “man-optimal” marriage (lattice structure) Running time linear in number of proposals (≤ size of pref lists) [very fast] Works also if incomplete preference lists April 9, 2003 IMA

Residents-Hospitals Extension results + algorithm extends to case in which hospital j can accept c(j) residents In use since 1951 by National Intern Matching Program April 9, 2003 IMA

Stable Allocation Problem [Baïou-Balinski ’98, G. ’00] “demand item” i has demand s(i) and ranks every j “supply item” j has capacity c(j) and ranks every i Assignment (i,j) has capacity u(i,j) (fractional) assignment π is stable if π(i,j)<min(s(i),u(i,j),c(j)) implies π(i,j’)= min(s(i),u(i,j’),c(j’)) for every j’ preferred to j by i, and similarly for j Gale-Shapley algorithm applies Not polynomial, but weakly polynomial for integral inputs [BB ’98] Strongly polynomial if used inductively April 9, 2003 IMA

Stable Allocations With Tree Constraints [G ’00]: resources 1,…,k Supply item j has rooted tree T(j) of constraints V(T(j))={1,…,k} Every node v of T has capacity c(j,v) Demand item i has basic resource b(i) and demand d(i) When x units mapped to supply j, uses x units of each resource on path in T(j) from b(i) to root of T(j) Stability as before April 9, 2003 IMA

Algorithm for Tree Constraints [5,12] [12,12] [2,12] [10,12] [0,12] Demand items request unassigned demands in order of preference When demand i requests x units from j, repeat: Find lowest (in tree) tight constraint, say node v Dispose demands (up to x) of lower preference than i and using resources in subtree rooted at v [12,12] [12,12] [10,12] [8,12] Demand = 2 8 3 5 [5,9] [9,9] [2,9] [0,9] [9,9] [7,9] [7,9] 3 [3,5] [0,5] [4,9] [8,9] [2,9] [4,9] [2,9] [0,9] 1 5 [1,7] [0,7] [5,7] 2 4 2 8 [load,cap] [0,8] [2,8] [8,8] [4,8] [2,6] [0,6] [0,6] April 9, 2003 IMA

Properties Algorithm gives stable allocation (man-optimal) maximizes amount of ith demand allocated and allocates it to best possible supply node among all stable allocations If tries to assign only ≥ 1-ε of each demand then linear in size of preference lists (used) Easily parallelized and distributed (variety of ways) If m demands, n supplies and k resources, then at most nk fractional demands assigned April 9, 2003 IMA

Stable Allocations with Tree Constraints for Load Balancing Demand items: (groups of IPs, rule for mapping) m=millions Supply items: cluster of servers n=thousands (Incomplete) preference lists for demands based on performance + contract rules (Implicit) preference lists for supplies based on alternate choices, contract rules, … Tree of constraints model various resource constraints Almost integral assignment (at most a few thousands fractional) Just the core algorithm: many additional peripheral components Extremely fast! April 9, 2003 IMA

Covering MSTs [G.-Vondrak ’03] Complete graph Kn with distinct edge weights Q= U {S: |S| ≥ n-k} MST(G[S]) How large can |Q| be (as a function of n and k)? ≤ e (k+1) n S selected uniformly at random ( Pr[v in S]=0.5 ) Find Q such that Pr [Q contains MST(G[S])] ≥ 1 – 1 / nc How small can one choose |Q| ? ≤ e (c+1) n log2n April 9, 2003 IMA