Efficient and Robust Computation of Resource Clusters in the Internet Chuang Liu, Ian Foster University of Chicago Argonne National Laboratory
2 What is the Problem? l Locate a set of resources with particular network connections in the Internet. l Q1: Find a set of R resources very close to each other: –The network latency between any pair of those resources is less than L milliseconds. l Q2: Find a set of R resources very far from each other: –The network latency between any pair of those resources is more than L milliseconds.
3 Challenges l It is an NP-hard problem. l It requires a large number of measurements. l Unstable networks and resources may cause individual measurements to fail. l Network latency data is noisy because network resources are shared among users.
4 Intuition of Our Method l Clustering –Cluster: a set of resources that have small latency among each other, and much bigger latency to resources not in the cluster –Result of clustering: a partition of the resources such that each part is a cluster, called the cluster structure l Search based on the cluster structure –Q1. Search for resources in a cluster –Q2. Search for resources from different clusters
5 Contributions l An effective clustering algorithm. –This method can find the cluster structure even when the latency measurements are incomplete. –The cluster structure is stable in a dynamic environment l An efficient search algorithm –Answers Q1 and Q2 based on the cluster structure. –Order-of-magnitude performance improvements
6 Outline l Problems l > Clustering Algorithm –Markov Cluster Algorithm –Stability of the Cluster Structure –Robustness of Clustering Algorithm l Search Algorithm l Performance Evaluation l Summary
7 Model of the Problem l Represent the network connections of resources as a weighted graph –Each resource is a node –There is an edge between any two nodes –The reciprocal of the latency measurement is the weight of each edge l In the graph representation of resources, a cluster is a set of nodes connected by heavy-weight edges l How to find the cluster structure in the graph?
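The graph model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `latency_to_weights`, the dictionary layout, and the choice to skip failed measurements (represented here as `None`) are all assumptions made for the example.

```python
def latency_to_weights(latency_ms):
    """Turn pairwise latencies into edge weights (weight = 1 / latency).

    latency_ms maps a pair of resource names to a latency in milliseconds,
    or to None when the measurement failed (an assumed convention).
    """
    weights = {}
    for (a, b), ms in latency_ms.items():
        if ms is None or ms <= 0:
            continue  # assumption: an unusable measurement yields no edge
        w = 1.0 / ms
        weights.setdefault(a, {})[b] = w
        weights.setdefault(b, {})[a] = w
    return weights


# Hypothetical sample: A and B are close, C is far away, B-C was not measured.
sample = {("A", "B"): 2.0, ("A", "C"): 100.0, ("B", "C"): None}
graph = latency_to_weights(sample)
```

Low latency becomes a heavy edge (A-B gets weight 0.5) and high latency a light one, which is what lets a cluster appear as a group of nodes joined by heavy-weight edges.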
8 The Markov Cluster Algorithm l Algorithm developed by S. van Dongen –A walker departs from one node. –It moves to one adjacent node in each step. The outgoing edge is selected randomly, favoring heavy-weight edges. –After k steps, calculate the probability of a random walk starting from node i and ending at node j. If the probability is large, put them in the same cluster. l The walker tends to stay in the same cluster because it chooses high-weight edges with high probability. l Granularity parameter G –With bigger G, the algorithm creates smaller clusters with smaller latency among the resources in a cluster
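The random-walk intuition can be illustrated with a small sketch. It only computes k-step walk probabilities on a toy graph; it is not the full Markov Cluster algorithm, which alternates expansion and inflation steps and is not reproduced here. The latency matrix, the function names, and the choice k = 3 are assumptions for the example.

```python
def transition_matrix(weights):
    """Row-normalize edge weights into one-step walk probabilities."""
    return [[w / sum(row) for w in row] for row in weights]


def k_step_probabilities(p, k):
    """Return the k-step transition matrix p^k (k >= 1)."""
    n = len(p)
    result = p
    for _ in range(k - 1):
        result = [[sum(result[i][m] * p[m][j] for m in range(n))
                   for j in range(n)] for i in range(n)]
    return result


# Hypothetical latencies in ms: nodes 0 and 1 are close, nodes 2 and 3 are close.
latency = [
    [0, 1, 100, 100],
    [1, 0, 100, 100],
    [100, 100, 0, 1],
    [100, 100, 1, 0],
]
# Edge weight is the reciprocal of latency; no self-edges in this sketch.
weights = [[0.0 if i == j else 1.0 / ms for j, ms in enumerate(row)]
           for i, row in enumerate(latency)]
walk = k_step_probabilities(transition_matrix(weights), 3)
```

In this toy case a walker starting at node 0 is far more likely to end at node 1 than at node 2 or 3, so nodes 0 and 1 would be grouped together.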
9 Resources on Planetlab l Resources on Planetlab –400 Computers –200 sites –End-to-end pair-wise latencies collected by Jeremy Stribling
10 Cluster Structure of Resources on Planetlab l Cluster structures with different granularity –East America, West America, Central America, East Asia, South Europe, etc. –California, Texas, China, Korea, etc. –San Jose (HP, UCB, Stanford), Boston (BU, MIT), etc.
11 Outline l Problems l Clustering Algorithm –Markov Cluster Algorithm –> Stability of the Cluster Structure –Robustness of Clustering Algorithm l Search Algorithm l Performance Evaluation l Summary
12 Stability of the Cluster Structure l Latency among resources changes over time due to the dynamic nature of the Internet l Questions –Will the created cluster structures change over time? –Will the difference become larger over time?
13 Stability of the Cluster Structure l Latency measurements were collected at the beginning of each hour over a 5-day period. –24*5=120 sets of data in total l Produce a cluster structure for each set of latency data and calculate the difference between these structures. l Metric of difference: D value –D value is defined as the proportion of nodes that must be exchanged to transform either of the two cluster structures into the other. –D is between 0 and 1 –Small D means the cluster structure is stable
14 Histogram distribution of D values l Difference between each cluster structure and the one based on data from 1 hour earlier. l About 30% of cluster structures change by less than 10% (D < 0.1) from one hour earlier. More than 60% of cluster structures change by between 10% and 15% from one hour earlier l Will the created cluster structures change over time? –Yes, but not much.
15 Histogram distribution of D values l Compare the clustering result with results based on data from two and four hours earlier. l D values do not get larger as the time gap increases. –The distribution of D values is similar for 1, 2, and 4 hours. l Will the difference become larger over time? –No, not within a few hours.
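One way to read the D value definition is as a partition distance: match clusters of the two structures so that as many nodes as possible stay in place, then count the fraction that must move. The sketch below uses a greedy matching, so it gives an upper bound on that distance rather than the exact value; the function name `d_value` and the greedy strategy are assumptions, not the authors' method.

```python
def d_value(structure_a, structure_b):
    """Greedy estimate of the fraction of nodes that must move to turn a into b.

    Each structure is a list of disjoint sets covering the same nodes.
    """
    nodes = set().union(*structure_a)
    if not nodes:
        return 0.0
    # All cluster pairs, largest overlap first.
    overlaps = sorted(
        ((len(a & b), i, j)
         for i, a in enumerate(structure_a)
         for j, b in enumerate(structure_b)),
        reverse=True,
    )
    used_a, used_b, kept = set(), set(), 0
    for size, i, j in overlaps:
        if size and i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            kept += size  # these nodes stay where they are
    return (len(nodes) - kept) / len(nodes)


# Hypothetical structures: node "c" switches clusters between the two snapshots.
hour_0 = [{"a", "b", "c"}, {"d", "e"}]
hour_1 = [{"a", "b"}, {"c", "d", "e"}]
difference = d_value(hour_0, hour_1)
```

Here one of five nodes moves, so the estimate is 0.2; identical structures give 0.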
16 Conclusion l Cluster structure for Planetlab resources is relatively stable in short term l We do not need to rebuild the cluster structure frequently
17 Outline l Problems l Clustering Algorithm –Markov Cluster Algorithm –Stability of the Cluster Structure –> Robustness of Clustering Algorithm l Search Algorithm l Performance Evaluation l Summary
18 Robustness of Clustering Algorithm l Available latency measurements are only a subset of all possible measurements. –On Planetlab, from to , 25% to 30% of measurements are available most of the time. Occasionally, only 10-15% are available.
19 Robustness of Clustering Algorithm l Question –Can the clustering algorithm find the right cluster structure based on an incomplete set of measurements?
20 Experimental Result l Compute cluster structures using 10-90% of all data. l Compare each with the structure based on all data, using the D value. l The clustering algorithm is still effective when running on an incomplete set of data
Frac: 90% 80% 70% 60% 50% 40% 30% 20% 1%
D:
21 Outline l Problems l Clustering Algorithm l > Search Algorithm l Performance Evaluation l Summary
22 Traditional Tree Search Algorithm l Starts with an empty set l Repeatedly picks from the available resources one resource that has the required connections with the current members of the set, and adds it to the set l Rolls back the addition from the previous step if no such resource exists l Finishes when the set contains all required resources l It is an NP-hard problem
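The steps above amount to a backtracking search, which can be sketched as follows. This is an illustrative version under stated assumptions: latencies are complete and stored in a dictionary keyed by `frozenset` pairs, and the names `tree_search` and `ok` are made up for the example.

```python
def tree_search(resources, latency, r, ok):
    """Backtracking search for r resources whose pairwise latencies all satisfy ok()."""
    chosen = []

    def extend(start):
        if len(chosen) == r:
            return True
        for idx in range(start, len(resources)):
            cand = resources[idx]
            # The candidate must have the required connection to every current member.
            if all(ok(latency[frozenset((cand, m))]) for m in chosen):
                chosen.append(cand)
                if extend(idx + 1):
                    return True
                chosen.pop()  # roll back the addition
        return False

    return list(chosen) if extend(0) else None


# Hypothetical latencies in ms between four resources.
resources = ["a", "b", "c", "d"]
latency = {
    frozenset(("a", "b")): 5, frozenset(("a", "c")): 50, frozenset(("a", "d")): 60,
    frozenset(("b", "c")): 55, frozenset(("b", "d")): 8, frozenset(("c", "d")): 70,
}
q1 = tree_search(resources, latency, 2, lambda ms: ms < 10)   # Q1 with R=2, L=10
q2 = tree_search(resources, latency, 3, lambda ms: ms > 40)   # Q2 with R=3, L=40
```

The same routine serves Q1 and Q2 by swapping the predicate. In the worst case it still explores a number of combinations that grows exponentially with R.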
23 Modified Tree Search Algorithm l Q1: pick resources from the same cluster. l Q2: pick resources from different clusters. l Reduces the search space substantially. –Search space is defined as the possible combinations of resources
Granularity:
Ratio: 1.4E-4, 3.6E-6, 1E-6
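To see why restricting the search to the cluster structure shrinks the search space, one can count candidate combinations. The sketch below is one plausible way to quantify the reduction, not necessarily how the ratios on this slide were computed. The function names are hypothetical, and the Q2 count assumes at most one resource is taken from each cluster.

```python
from itertools import combinations
from math import comb, prod


def q1_search_space_ratio(cluster_sizes, r):
    """Share of all R-subsets that lie entirely inside a single cluster."""
    total = comb(sum(cluster_sizes), r)
    within = sum(comb(size, r) for size in cluster_sizes)
    return within / total


def q2_search_space_ratio(cluster_sizes, r):
    """Share of all R-subsets that take one resource from each of R distinct clusters."""
    total = comb(sum(cluster_sizes), r)
    across = sum(prod(group) for group in combinations(cluster_sizes, r))
    return across / total


# Hypothetical cluster structure: 10 resources in clusters of size 4, 4, and 2.
sizes = [4, 4, 2]
ratio_q1 = q1_search_space_ratio(sizes, 3)  # 8 of 120 three-resource subsets
ratio_q2 = q2_search_space_ratio(sizes, 2)  # 32 of 45 two-resource subsets
```

With many small clusters, as a finer granularity produces, the Q1 ratio drops quickly because few R-subsets fit inside any single cluster.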
24 Outline l Problems l Clustering Algorithm l Search Algorithm l > Performance Evaluation l Summary
25 Benchmark Queries l Q1 searches for R resources with latency between any two of them smaller than L. l Q2 searches for R resources with latency between any two of them more than L. l We build 1000 Q1 and 1000 Q2 queries by randomly choosing values of R and L.
26 Performance of Q1 l Cumulative distribution of execution time l Our algorithm answers 80% of queries in less than a few milliseconds.
Algorithm 70% 90%
tree 0.6 s 26 s
modified 1.6 ms 0.4 s
27 Performance of Q2 l Cumulative distribution of execution time
Algorithm 70% 90%
tree 0.6 s 26 s
heuristic 5 s 52 s
28 Summary l The Markov Cluster algorithm can determine the cluster structure based on incomplete latency measurements. l The cluster structure is stable in an Internet environment. l A heuristic search algorithm answers Q1 and Q2. l The algorithm achieves order-of-magnitude performance improvements.
29 Contact l Chuang Liu: l Paper details: l Thank you
31 The Markov Cluster Algorithm l Random walk –A walker departs from one node –In each step, it randomly selects an outgoing edge with probability proportional to the edge weight, and moves to the other end of the edge l Intuition –The walker will stay in the same cluster with high probability because it tends to choose high-weight edges. –Calculate the probability of a random walk starting from node i and ending at node j after k steps, and put them in the same cluster if the probability is large. –Granularity parameter G l Developed by S. van Dongen
32 Capability to Find Resources l Q1 l Q2
Table columns (one table each for Q1 and Q2): Algorithm, Result Found, No Result, Timeout; rows: Tree, Modified
33 Cluster Structure of Resources on Planetlab l Clusters –East America, West America, Central America, East Asia, South Europe, etc. –California, Texas, China, Korea, etc. –San Jose (HP, UCB, Stanford), Boston (BU, MIT), etc.
Table columns: G, # of clusters, Median latency (ms)