1 The Effectiveness of Request Redirection on CDN Robustness Limin Wang, Vivek Pai and Larry Peterson Presented by: Eric Leshay Ian McBride Kai Rasmussen.

Slides:

Advertisements

Similar presentations

Dynamic Replica Placement for Scalable Content Delivery Yan Chen, Randy H. Katz, John D. Kubiatowicz {yanchen, randy, EECS Department.

Advertisements

Scheduling in Web Server Clusters CS 260 LECTURE 3 From: IBM Technical Report.

Information-Centric Networks05c-1 Week 5 / Paper 3 Democratizing content publication with Coral –Michael J. Freedman, Eric Freudenthal, David Mazières.

Ningning HuCarnegie Mellon University1 Optimizing Network Performance In Replicated Hosting Peter Steenkiste (CMU) with Ningning Hu (CMU), Oliver Spatscheck.

On the Effectiveness of Measurement Reuse for Performance-Based Detouring David Choffnes Fabian Bustamante Fabian Bustamante Northwestern University INFOCOM.

1 Content Delivery Networks iBAND2 May 24, 1999 Dave Farber CTO Sandpiper Networks, Inc.

1 Routing and Scheduling in Web Server Clusters. 2 Reference The State of the Art in Locally Distributed Web-server Systems Valeria Cardellini, Emiliano.

The War Between Mice and Elephants Presented By Eric Wang Liang Guo and Ibrahim Matta Boston University ICNP

Spring 2003CS 4611 Content Distribution Networks Outline Implementation Techniques Hashing Schemes Redirection Strategies.

SCAN: A Dynamic, Scalable, and Efficient Content Distribution Network Yan Chen, Randy H. Katz, John D. Kubiatowicz {yanchen, randy,

Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.

Beneficial Caching in Mobile Ad Hoc Networks Bin Tang, Samir Das, Himanshu Gupta Computer Science Department Stony Brook University.

An Analysis of Internet Content Delivery Systems Stefan Saroiu, Krishna P. Gommadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy Proceedings of.

Peer-to-Peer Based Multimedia Distribution Service Zhe Xiang, Qian Zhang, Wenwu Zhu, Zhensheng Zhang IEEE Transactions on Multimedia, Vol. 6, No. 2, April.

Improving Proxy Cache Performance: Analysis of Three Replacement Policies Dilley, J.; Arlitt, M. A journal paper of IEEE Internet Computing, Volume: 3.

Locality-Aware Request Distribution in Cluster-based Network Servers 1. Introduction and Motivation --- Why have this idea? 2. Strategies --- How to implement?

EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University

Flash Crowds And Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites Aaron Beach Cs395 network security.

1 An Overlay Scheme for Streaming Media Distribution Using Minimum Spanning Tree Properties Journal of Internet Technology Volume 5(2004) No.4 Reporter.

Squirrel: A decentralized peer- to-peer web cache Paul Burstein 10/27/2003.

Adaptive Content Delivery for Scalable Web Servers Authors: Rahul Pradhan and Mark Claypool Presented by: David Finkel Computer Science Department Worcester.

1 Web Content Delivery Reading: Section and COS 461: Computer Networks Spring 2007 (MW 1:30-2:50 in Friend 004) Ioannis Avramopoulos Instructor:

Capacity planning for web sites. Promoting a web site Thoughts on increasing web site traffic but… Two possible scenarios…

Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.

World Wide Web Caching: Trends and Technology Greg Barish and Katia Obraczka USC Information Science Institute IEEE Communications Magazine, May 2000 Presented.

Tradeoffs in CDN Designs for Throughput Oriented Traffic Minlan Yu University of Southern California 1 Joint work with Wenjie Jiang, Haoyuan Li, and Ion.

Locality-Aware Request Distribution in Cluster-based Network Servers Presented by: Kevin Boos Authors: Vivek S. Pai, Mohit Aron, et al. Rice University.

Information-Centric Networks05a-1 Week 5 / Paper 1 On the use and performance of content distribution networks –Balachander Krishnamurthy, Craig Wills,

1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers.

On the Use and Performance of Content Distribution Networks Balachander Krishnamurthy Craig Wills Yin Zhang Presenter: Wei Zhang CSE Department of Lehigh.

Achieving Load Balance and Effective Caching in Clustered Web Servers Richard B. Bunt Derek L. Eager Gregory M. Oster Carey L. Williamson Department of.

{ Content Distribution Networks ECE544 Dhananjay Makwana Principal Software Engineer, Semandex Networks 5/2/14ECE544.

Network Aware Resource Allocation in Distributed Clouds.

Overcast: Reliable Multicasting with an Overlay Network CS294 Paul Burstein 9/15/2003.

1 On the Placement of Web Server Replicas Lili Qiu, Microsoft Research Venkata N. Padmanabhan, Microsoft Research Geoffrey M. Voelker, UCSD IEEE INFOCOM’2001,

« Performance of Compressed Inverted List Caching in Search Engines » Proceedings of the International World Wide Web Conference Commitee, Beijing 2008)

A Dynamic Data Grid Replication Strategy to Minimize the Data Missed Ming Lei, Susan Vrbsky, Xiaoyan Hong University of Alabama.

Scalable Web Server on Heterogeneous Cluster CHEN Ge.

Application of Content Computing in Honeyfarm Introduction Overview of CDN (content delivery network) Overview of honeypot and honeyfarm New redirection.

A Survey of Distributed Task Schedulers Kei Takahashi (M1)

1 Distributed Energy-Efficient Scheduling for Data-Intensive Applications with Deadline Constraints on Data Grids Cong Liu and Xiao Qin Auburn University.

Paper Group: 20 Overlay Networks 2 nd March, 2004 Above papers are original works of respective authors, referenced here for academic purposes only Chetan.

A Measurement Based Memory Performance Evaluation of High Throughput Servers Garba Isa Yau Department of Computer Engineering King Fahd University of Petroleum.

1 On the Placement of Web Server Replicas Lili Qiu, Microsoft Research Venkata N. Padmanabhan, Microsoft Research Geoffrey M. Voelker, UCSD IEEE INFOCOM’2001,

Optimal Content Delivery with Network Coding Derek Leong, Tracey Ho California Institute of Technology Rebecca Cathey BAE Systems CISS 2009 March 19, 2009.

VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Partitioning and Replication.

A Throttling Layer-7 Web Switch James Furness. Motivation & Goals Specification & Design Design detail Demonstration Conclusion.

OSDI 2002 Boston, MA 1 The Effectiveness of Request Redirection on CDN Robustness Limin Wang Vivek Pai and Larry Peterson Princeton University.

6 December On Selfish Routing in Internet-like Environments paper by Lili Qiu, Yang Richard Yang, Yin Zhang, Scott Shenker presentation by Ed Spitznagel.

Information-Centric Networks Section # 5.3: Content Distribution Instructor: George Xylomenos Department: Informatics.

CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.

Content Delivery Networks: Status and Trends Speaker: Shao-Fen Chou Advisor: Dr. Ho-Ting Wu 5/8/

Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks Author: P. Kokkinos, K. Christodoulopoulos, A. Kretsis, and E. Varvarigos.

09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.

On the Placement of Web Server Replicas Yu Cai. Paper On the Placement of Web Server Replicas Lili Qiu, Venkata N. Padmanabhan, Geoffrey M. Voelker Infocom.

Content Distribution Networks (CDNs)

/ Fast Web Content Delivery An Introduction to Related Techniques by Paper Survey B Li, Chien-chang R Sung, Chih-kuei.

John S. Otto Mario A. Sánchez John P. Rula Fabián E. Bustamante Northwestern, EECS.

Performance Evaluation of Redirection Schemes in Content Distribution Networks Jussi Kangasharju, Keith W. Ross Institut Eurecom Jim W. Roberts France.

Content Distribution Networks

Memory Management.

Mohammad Malli Chadi Barakat, Walid Dabbous Alcatel meeting

VIRTUAL SERVERS Presented By: Ravi Joshi IV Year (IT)

Content Distribution Networks

AWS Cloud Computing Masaki.

Content Distribution Networks + P2P File Sharing

Replica Placement Heuristics of Application-level Multicast

CSE 542: Operating Systems

Performance-Robust Parallel I/O

Content Distribution Networks + P2P File Sharing

Presentation transcript:

1 The Effectiveness of Request Redirection on CDN Robustness Limin Wang, Vivek Pai and Larry Peterson Presented by: Eric Leshay Ian McBride Kai Rasmussen

2 Outline Introduction Redirection Strategies Methodology Normal Load Results Flash Crowd Results Conclusion

3 CDNs To achieve better performance, networks can be built using redundant resources Content Distribution Networks (CDN) Improves - Response Time Cumulative latencies System Throughput Average number of requests satisfied every second

4 CDN Distribution Factors Network Proximity Minimizes response time Balance System Loads Improves system throughput Locality Select server with page already in cache Overall improvements

5 CDN Model Server Surrogate Caches page normally kept on a set of backend servers Uses replication to improve response time and system throughput Uses request redirectors Transparent system to get user to a file without user knowing about any replication

6 Redirector Mechanisms DNS server augmentation Site Level Caching problems avoided with short expiration times Server Redirection HTTP Redirect Response Adds extra round trip time Consumes bandwidth

7 Redirector Mechanisms Router or proxy rerouting Rewrite outbound request HTTP Redirect Proxies on edge of server Approximate load Information Identifiable client population

8 Hashing Schemes Maps URLs to range of values Modulo Hashing URL is hashed, n % (# of servers) Must change modulus as number of servers change Consistent Hashing URL and names of servers hashed in long circular list URL assigned to closest server in list

9 Hashing Schemes Highest Random Weight Hashes URL and server names by random weights, and sorts result List is traversed to find appropriate server More computation than consistent hashing

10 Request Redirection Strategies Random Request randomly sent to a server surrogate ‘baseline’ to determine reasonable performance Static Server Set Assigns a fixed number of server replicas to each URL Improves Locality Load-Aware Static Server Set Redirects based on approximated load information Dynamic Server Set Adjusts number of replicas for better locality and load balancing Network Proximity Favors shorter network paths

11 Static Server Set Replicated Consistent Hashing (R-CHash) Number of replicas is fixed but configurable URL and replicas hashed to circular space Redirector assigns a request to a replica for the URL Replicated Highest Random Weight (R-HRW) Uses HRW to hash URL and replicas Replicas for each URL decided by top N servers from the ordered weighted list

12 Load-Aware Static Server Set Redirectors maintain estimates of server load Finds least loaded server for redirection Load-Aware counter parts of R-CHash and R-HRW LR-CHash LR-HRW

13 Dynamic Server Set Dynamically adjusts the number of replicas Introduces two new algorithms Coarse Dynamic Replication (CDR) Fine Dynamic Replication (FDR) Factors both load and locality into decision combined with a dynamic set of replicas

14 Coarse Dynamic Replication Uses HRW hashing to create an ordered list of servers Coarse-grained load information is used to select first non-busy server Number of active communications used to approximate load level Can be combined with response latency, bandwidth consumption and other factors

15 CDR Code find_server(url, S) { foreach server s i in server set S, Weight i = hash(url,address(s i )); Sort weight; foreach server s j in decreasing order of weight j { If satisfy_load_criteria(s j ) then { targetServer = s j ; Stop search; } If targetServer is not valid then targetServer = server with highest weight; Route request url to targetServer; }

16 Fine Dynamic Replication Uses URL popularity to decrease unnecessary replication Introduces a walk length to indicate the number of servers that should be searched If all servers are busy the walk length is increased Keep track of modified time. Walk length is decreased after a long time unmodified

17 FDR Code 1 Find_server(url, S){ walk_entry = walkLenHash(url); w_len = walk_entry.length; foreach server s i in server set S, weight i = hash(url,address(s i )); sort weight; s candidate = least-loaded server of top w_len servers; if satisfy_load_criteria(s candidate ) then { targetServer = s candidate ; if (w_len > 1 && (timenow() - walk_entry.lastUpd > chgThresh) walk.entry.length--; }

18 FDR Code 2 else { foreach rest server s j in decreasing weight order { if satisfy_load_criteria(s j ) then { targetServer = s j ; stop search; } walk_entry.length = actual search steps; } if walk_entry.length changed then walk_entry.lastUpd = timenow(); if targetServer is not valid then targetServer = server with highest weight; route request url to targetServer; }

19 Network Proximity Map addresses to a geographic region. Select servers within a specific region Find closer servers Use ping and traceroute to measure topological location Three Network Proximity strategy NP-FDR NPR-CHash NPLR-CHash

Methodology: Simulation Network & OS/server combo NS-2: packet-level simulator Tests TCP implementations Logsim: server cluster simulator Simulates CPU processing, memory usage, and disk access

Methodology: Network Topology NSFNET backbone network T3 topology server surrogates client hosts regional routers backbone routers (with location) 64 servers 1,000 client hosts 1,100 nodes

Results: Normal Load Charts shown are for 64 server case. Optimal Static Replication Performance R-CHash and R-HRW influenced by number of replicas. For 2 to 64 replicas: Increasing replicas help load balance; improves throughput Too many replicas will hinder throughput, as replica working set causes more server disk activity. 10 replicas determined to be optimal number.

23 Results: Normal Load: Capacity R-CHash 119% better than random. R-HRW 99% better than random. LR-CHash and LR-HRW 173% better than random. CDR and FDR 250% better than random.

Results: Normal Load: Server Resource Utilization Hash schemes utilize disk more, processor less. Dynamic schemes (CDR and FDR) utilize processor more, disk less. “With faster simulated machines, we expect the gap between the dynamic schemes and the others to grow even larger.”

Results: Normal Load: Latency Dynamic schemes (CDR, FDR) similarly outperform hash schemes at low response loads. Dynamic and static schemes serve large files roughly the same, since large files are replicated less under CDR/FDR.

Results: Normal Load: Latency FDR outperforms CDR at very high response rates for files of median sizes. Small files are served at roughly the same rate by both schemes. Large files still suffer from under – replication.

27 Results: Normal Load: Scalability Experiments repeated for 8 to 128 servers. Linear growth meaning systems scale well. Server router to backbone router link bandwidth doubled for 128 server case.

28 Behavior under Flash Crowds Simulate the performance of CDNs under a flash crowd or DDoS attack Measured performance by: Capacity – requests/second Latency – response time in seconds Scalability – requests/second

29 Flash Crowd Setup System Capacity – requests/second, latency 1000 clients – 25% intensive requesters Intensive requesters download a URL of 6kb from a predetermined list over and over Clusters of 64 servers Scalability - requests/seconds Varying cluster size from 32 – 128 servers

30 System Capacity – Flash Crowd  FDR's benefit has grown to 91% from 60% over R-HRW and LR-CHash during the flash crowd scenario

 Random has the worst latency, LR-CHash and LR-HRW have the best latency.  In a direct comparison of FDR and CDR, FDR proves to have the best latency

32 Scalability Results – Flash Crowd All the algorithms scale linearly Similar results to the trial under normal load

33 More Flash Crowd Tests Flash Crowds setup 1 hot URL of 1kb 10 hot URLs of 6kb Test parameters Vary intensive requesters from 10% - 80% Vary cluster size from 32 to 64 servers Measured requests/second

FDR and CDR are able to adapt to flash crowds All other algorithms perform worse than random during the flash crowd Tested with 1 hot URL and 10 hot URLs

35 Proximity Comparison  Proximity can benefit latency, but may hurt capacity. This is the case with NPLR-CHash

36 Heterogeneity Random and R- CHash cannot determine speed of links LR-CHash and FDR are able to assign requests fairly between slow and fast links

37 Large File Effects Threshold set where files > 530kb sent to a separate server

38 Conclusion Improved Redirect Algorithms lead to more robust CDN systems FDR allows a 60-91% greater load than previously published systems FDR provides a mechanism for defending against flash crowds or Distributed Denial of Service Attacks

39 Questions/Discussion