1 On the Placement of Web Server Replicas Lili Qiu, Microsoft Research Venkata N. Padmanabhan, Microsoft Research Geoffrey M. Voelker, UCSD IEEE INFOCOM’2001, Anchorage, AK, April 2001

2 Outline
- Overview
- Related work
- Our approach
- Simulation methodology & results
- Summary

3 Motivation
- Growing interest in Web server replicas
  - Exponential growth in Web usage
  - Content providers want to offer better service at lower cost
  - Solution: replication
- Forms of Web server replicas
  - Mirror sites
  - Content Distribution Networks (CDNs)
    - CDN: a network of servers
    - Examples: Akamai, Digital Island
[Figure: clients reach content providers' replicas across the Internet]

4 Placement of Web Server Replicas
- Problem specification
  - Among a set of N potential sites, pick K sites as replicas to minimize users’ latency or bandwidth usage
[Figure: clients and content providers connected through the Internet]

5 Related Work
- Placement of Web proxies [LGI+99]
- Cache location [KRS00]
- Placement of Internet instrumentation [JJJ+00]

6 Our Approach
- Model the Internet as a graph
- Parameterize the graph using measured inputs
  - Number of requests generated from each region
  - Distance between different regions
- Map the placement problem onto a graph optimization problem
  - Assumption: each client uses the single replica that is closest to it
- Solve the graph optimization problem
  - Using various approximation algorithms

7 Minimum K-median Problem
- Given a complete graph $G = (V, E)$ with $d(j)$ and $c(i, j)$
  - $d(j)$: number of requests generated by node $j$
  - $c(i, j)$: distance between nodes $i$ and $j$
    - Latency, hop count, or another metric to be optimized
- Find a subset $V' \subseteq V$ with $|V'| = K$ that minimizes $\sum_{v \in V} \min_{w \in V'} d(v)\, c(v, w)$
- NP-hard problem
[Figure: example graph with weighted edges]
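The objective on this slide can be evaluated directly for any candidate replica set. The following is a minimal sketch, not the authors' code: the function name `placement_cost` and the 4-node demand and distance values are illustrative assumptions.

```python
def placement_cost(demand, dist, replicas):
    """K-median objective: each node v sends its d(v) requests to the
    closest replica w, paying d(v) * c(v, w)."""
    if not replicas:
        raise ValueError("need at least one replica")
    return sum(
        d * min(dist[v][w] for w in replicas)
        for v, d in enumerate(demand)
    )


# Hypothetical toy instance (not taken from the slides).
demand = [10, 1, 1, 5]   # d(v): requests generated at node v
dist = [                 # c(i, j): symmetric distance matrix
    [0, 2, 4, 7],
    [2, 0, 3, 6],
    [4, 3, 0, 2],
    [7, 6, 2, 0],
]

cost_single = placement_cost(demand, dist, [0])    # one replica at node 0
cost_pair = placement_cost(demand, dist, [0, 3])   # replicas at nodes 0 and 3
print(cost_single, cost_pair)
```

On this toy instance, adding a second replica at the other high-demand node lowers the total cost from 41 to 4.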

8 Placement Algorithms
- Tree-based algorithm [LGI+99]
  - Assumes the underlying topology is a tree and models placement as a dynamic programming problem
  - $O(N^3 M^2)$ for choosing M replicas among N potential places
- Random
  - Pick the best among several random assignments
- Hot spot
  - Place replicas near the clients that generate the largest load
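The two simplest heuristics on this slide can be sketched as follows. This is a rough illustration under stated assumptions: the hot-spot variant here places replicas at the highest-demand nodes themselves (a simplification of "near"), the random variant keeps the cheapest of a fixed number of trials, and all names and data are hypothetical.

```python
import random


def placement_cost(demand, dist, replicas):
    # Each node is served by its closest replica.
    return sum(d * min(dist[v][w] for w in replicas)
               for v, d in enumerate(demand))


def hot_spot_placement(demand, k):
    # Simplification: rank nodes by their own request volume, take the top k.
    ranked = sorted(range(len(demand)), key=lambda v: demand[v], reverse=True)
    return ranked[:k]


def random_placement(demand, dist, k, trials=20, seed=0):
    # Draw several random k-subsets and keep the cheapest one.
    rng = random.Random(seed)
    best_sites, best_cost = None, float("inf")
    for _ in range(trials):
        sites = rng.sample(range(len(demand)), k)
        cost = placement_cost(demand, dist, sites)
        if cost < best_cost:
            best_sites, best_cost = sites, cost
    return best_sites, best_cost


# Hypothetical toy instance (not taken from the slides).
demand = [10, 1, 1, 5]
dist = [
    [0, 2, 4, 7],
    [2, 0, 3, 6],
    [4, 3, 0, 2],
    [7, 6, 2, 0],
]

hot_sites = hot_spot_placement(demand, 2)
rand_sites, rand_cost = random_placement(demand, dist, 2)
print(hot_sites, rand_sites, rand_cost)
```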

9 Placement Algorithms (Cont.)
- Greedy algorithm
  - Calculate the cost of assigning clients to each candidate replica
  - Select the replica with the lowest cost
  - Adjust costs based on the assignment; repeat until all replicas are chosen
- Super-optimal algorithm
  - Lagrangian relaxation + subgradient method
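The greedy steps above can be sketched as below. This is one plausible reading of the slide, assuming that "cost" means the total demand-weighted distance once a candidate site is added to the replicas already chosen; the function name and toy data are hypothetical.

```python
def greedy_placement(demand, dist, k):
    """Pick k replica sites one at a time, each time adding the site that
    yields the lowest total cost given the sites already chosen."""
    n = len(demand)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sites")
    chosen = []
    # nearest[v]: distance from v to its closest replica chosen so far.
    nearest = [float("inf")] * n
    total = float("inf")
    for _ in range(k):
        best_site, best_cost = None, float("inf")
        for s in range(n):
            if s in chosen:
                continue
            cost = sum(demand[v] * min(nearest[v], dist[v][s])
                       for v in range(n))
            if cost < best_cost:
                best_site, best_cost = s, cost
        chosen.append(best_site)
        total = best_cost
        # Adjust per-node costs to reflect the new assignment.
        nearest = [min(nearest[v], dist[v][best_site]) for v in range(n)]
    return chosen, total


# Hypothetical toy instance (not taken from the slides).
demand = [10, 1, 1, 5]
dist = [
    [0, 2, 4, 7],
    [2, 0, 3, 6],
    [4, 3, 0, 2],
    [7, 6, 2, 0],
]

sites, cost = greedy_placement(demand, dist, 2)
print(sites, cost)
```

Each round scans every remaining site and re-sums over all nodes, so this sketch runs in O(K N^2) time for N sites.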

10 Simulation Methodology
- Network topology
  - Randomly generated topologies
    - Using the GT-ITM Internet topology generator
  - Real Internet network topology
    - AS-level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
- Web workload
  - Real server traces: MSNBC, ClarkNet, NASA Kennedy Space Center
- Performance metric
  - Relative performance: $\text{cost}_{\text{practical}} / \text{cost}_{\text{super-optimal}}$

11 Simulation Methodology (Cont.)
- Simulate a network of N nodes ($100 \le N \le 3000$)
- Cluster clients using network-aware clustering [KW00]
  - IP addresses with the same address prefix belong to a cluster
- A small number of popular clusters account for most requests
  - The top 10, 100, 1000, and 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests, respectively
- Pick the top N clusters
  - Map them to different nodes
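Grouping clients by address prefix and ranking the clusters by request count might look like the sketch below. It uses Python's standard `ipaddress` module with longest-prefix matching. The prefix list and client addresses are made-up documentation ranges, and the function name is hypothetical; the slides do not specify how [KW00] is implemented.

```python
import ipaddress
from collections import Counter


def cluster_requests(client_ips, prefixes):
    """Count requests per cluster, where a cluster is the longest
    matching prefix for the client's address."""
    # Sort so that more specific (longer) prefixes are tried first.
    networks = sorted(
        (ipaddress.ip_network(p) for p in prefixes),
        key=lambda net: net.prefixlen,
        reverse=True,
    )
    counts = Counter()
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        for net in networks:
            if addr in net:
                counts[str(net)] += 1
                break  # clients matching no prefix are left unclustered
    return counts


# Hypothetical inputs: one request per listed client address.
prefixes = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
requests = ["192.0.2.5", "192.0.2.200", "192.0.2.201",
            "198.51.100.7", "203.0.113.9"]

counts = cluster_requests(requests, prefixes)
top_clusters = counts.most_common(1)   # the "top N clusters" with N = 1
print(top_clusters)
```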

12 Simulation Methodology (Cont.)
- Random trees
- Random graphs
- AS-level topologies
- Sensitivity to the error in the input

13 Random Tree Topologies
- The tree-based algorithm performs well, as expected.
- The greedy algorithm performs equally well.

14 Random Graph Topologies
- The greedy and hot-spot algorithms outperform the tree-based algorithm.

15 Large Random Graph Topologies
- The greedy algorithm performs best, and the hot-spot algorithm performs nearly as well.

16 AS-level Internet Topologies
- The greedy algorithm performs best, and the hot-spot algorithm performs nearly as well.

17 Effects of Imperfect Knowledge about Input Data
- Predicted workload (using moving window average)
- Perfect topology information
- Within 5% degradation when using predicted workload
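A moving window average predictor for per-node load can be sketched as follows. The window length, the per-period demand vectors, and the function name are assumptions for illustration; the slides do not give the window size used.

```python
def moving_window_prediction(history, window):
    """Predict next-period demand per node as the mean of the last
    `window` observed periods."""
    if window < 1 or not history:
        raise ValueError("need a positive window and at least one period")
    recent = history[-window:]
    # zip(*recent) groups the observations belonging to the same node.
    return [sum(per_node) / len(recent) for per_node in zip(*recent)]


# Hypothetical request counts for two nodes over three periods.
history = [
    [10, 2],
    [14, 4],
    [12, 6],
]
predicted = moving_window_prediction(history, window=2)
print(predicted)
```

The predicted vector can then be passed to a placement algorithm in place of the true (unknown) future demand.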

18 Effects of Imperfect Knowledge about Input Data (Cont.)
- Predicted workload (using moving window average)
- Noisy topology information
  - Perturb the distance between two nodes i and j by up to a factor of 2
- Within 15% degradation when using predicted workload and noisy topology information
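One way to inject this kind of topology noise is sketched below. The slides say only "by up to a factor of 2"; drawing the factor log-uniformly from [1/2, 2] and keeping the matrix symmetric are assumptions made here, as are the function name and the sample distances.

```python
import random


def perturb_distances(dist, max_factor=2.0, seed=0):
    """Return a copy of `dist` where each pairwise distance is scaled by a
    random factor between 1/max_factor and max_factor."""
    rng = random.Random(seed)
    n = len(dist)
    noisy = [row[:] for row in dist]
    for i in range(n):
        for j in range(i + 1, n):
            # Log-uniform draw: shrinking and stretching are equally likely.
            factor = max_factor ** rng.uniform(-1.0, 1.0)
            noisy[i][j] = noisy[j][i] = dist[i][j] * factor
    return noisy


# Hypothetical distance matrix (not taken from the slides).
dist = [
    [0, 2, 4, 7],
    [2, 0, 3, 6],
    [4, 3, 0, 2],
    [7, 6, 2, 0],
]
noisy = perturb_distances(dist)
print(noisy)
```

A placement computed on `noisy` would then be scored against the original `dist` to measure the degradation.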

19 Summary
- One of the first experimental studies on placement of Web server replicas
- Knowledge about client workload and topology is needed for provisioning replicas
- The greedy algorithm performs very well
  - Within a factor of 1.1–1.5 of the super-optimal
  - Insensitive to noise
    - Stays within a factor of 2 of the super-optimal when the salted error is a factor of 4
- The hot-spot algorithm performs nearly as well
  - Within a factor of 1.6–2 of the super-optimal
- Obtaining input data
  - Moving window average for load prediction
  - Using BGP router data to obtain topology information

20 Conclusion
- We recommend the greedy algorithm for deciding the placement of Web server replicas

21 Acknowledgement
- Craig Labovitz
- Yin Zhang
- Ravi Kumar
