Published by Megan Johnston. Modified over 9 years ago.
1
Texas Learning and Computation Center, High Performance Systems Lab
Automatic Clustering of Grid Nodes
Nov 14, 2005
Qiang Xu, Jaspal Subhlok
University of Houston
2
Grid Scheduler
Computational Resource: CPU, memory
Network Link: latency, bandwidth
Network Topology
Scheduler: "I will decide which group of nodes is best for an application!"
3
Network Topology
Fine-grained physical network topology: hard to obtain, given the heterogeneous, dynamic, and distributed nature of a grid system.
We focus on the "logical" network topology: the connectivity between nodes based on observed behavior.
1) Easier to compute
2) Sufficient to tackle the resource selection problem
4
Discover Clusters/Logical Topology
A set of nodes with IP addresses / hostnames
Connectivity?
5
Discover Clusters/Logical Topology
Cluster A, Cluster B, Cluster C
Dist(A-B), Dist(B-C), Dist(A-C)
Nodes close to each other belong to the same cluster.
6
Outline
Introduction
Internet Geometric Space
Automatic Clustering
Experiments and Results
Conclusion
7
Internet Topology Map 1
A macroscopic snapshot of the Internet: 4 April 2005 - 17 April 2005.
8
Internet Topology Map 2
Internet map as of 1998 by Bill Cheswick (Bell Labs) and Hal Burch (CMU).
9
Why Geometric Space?
Internet Topology Map: complex! "I can't tell the distance between nodes!"
Geometric Space (N-dimensional Euclidean space)
GNP (Global Network Positioning): T. S. Eugene Ng and Hui Zhang, INFOCOM'02
10
Magic Landmarks!
Landmarks: a set of nodes distributed across the Internet.
Figure: a node and three landmarks, with measured latencies 3, 12, and 8.
11
Geometric Space
1. One axis per landmark
2. Coordinates of a node ≡ latency from each landmark
Example: X4 = 12, Y4 = 8, Z4 = 3
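A minimal sketch of the coordinate idea on this slide, assuming latencies have already been measured. The function name `coordinates` and the landmark labels "X", "Y", "Z" are hypothetical and are used only to mirror the example values shown above.

```python
def coordinates(latency_by_landmark, landmarks):
    """Return a node's position: one axis per landmark, value = measured latency.

    latency_by_landmark: dict mapping a landmark label to the latency measured
    from that landmark to the node (hypothetical input format).
    landmarks: ordered sequence of landmark labels, which fixes the axis order.
    """
    return tuple(latency_by_landmark[name] for name in landmarks)


# Hypothetical labels for three landmarks; the latencies are the slide's values.
landmarks = ("X", "Y", "Z")
node4 = coordinates({"X": 12.0, "Y": 8.0, "Z": 3.0}, landmarks)
```

Fixing the landmark order once matters: every node must use the same axis order for the positions to be comparable.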
12
Internet Geometric Space
From a complex Internet structure to a simple geometric space.
13
Advantage of Geometric Space
Simple: distance in geometric space is well defined, e.g. the Euclidean distance.
Scalable, for M nodes:
  Pairwise distance among M nodes: M*M probes
  Mapping to geometric space: M*N probes
  N is the number of landmarks; a number ~7 is known to be sufficient.
Easy to manage: only the landmarks need to be controlled.
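The scalability argument can be checked with a short calculation. This sketch uses the slide's own counts (M*M and M*N); the function names and the example of 36 nodes are illustrative assumptions.

```python
def pairwise_probes(m):
    """Probes needed to measure every node against every node (slide's M*M count)."""
    return m * m


def landmark_probes(m, n):
    """Probes needed to measure every node against each of n landmarks (M*N)."""
    return m * n


# Illustrative sizes: 36 nodes and 7 landmarks.
m, n = 36, 7
direct = pairwise_probes(m)      # 1296
mapped = landmark_probes(m, n)   # 252
```

Because N stays roughly constant, the landmark approach grows linearly in M while direct pairwise probing grows quadratically.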
14
Outline
Introduction
Internet Geometric Space
Automatic Clustering
Experiments and Results
Conclusion
15
Again the problem!
Cluster A, Cluster B, Cluster C
Dist(A-B), Dist(B-C), Dist(A-C)
16
Place Nodes in Geometric Space!
Simple Geometric Space
How do I cluster?
17
Distance and Threshold
Network Distance:
Threshold:
If Distance < Threshold, nodes belong to the same logical cluster.
- N is the number of landmarks.
- The parameter T describes how close nodes have to be to be in the same cluster; for a typical domain to be one cluster, T = 1 ms.
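A minimal sketch of the distance test, assuming the network distance is the Euclidean distance between node coordinates (as the other slides state). The slide's exact threshold expression in terms of N and T is not reproduced in this transcript, so the threshold is passed in as a plain parameter here; that is an assumption, not the authors' formula. The function names are hypothetical.

```python
import math


def network_distance(a, b):
    """Euclidean distance between two coordinate tuples (one axis per landmark)."""
    return math.dist(a, b)  # available in Python 3.8+


def same_logical_cluster(a, b, threshold):
    """True if the two nodes are closer than the threshold.

    The threshold value is an input; how it is derived from N and T
    is left to the caller.
    """
    return network_distance(a, b) < threshold


# Two nodes with similar latencies to three landmarks (illustrative values, ms).
close = same_logical_cluster((12.0, 8.0, 3.0), (12.3, 8.4, 3.0), threshold=1.0)
far = same_logical_cluster((12.0, 8.0, 3.0), (30.0, 5.0, 20.0), threshold=1.0)
```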
18
Build Undirected Graph
All grid nodes are graph nodes.
Add an edge between two nodes if Distance < Threshold.
19
Typical Case
An edge exists if Distance < Threshold.
Clusters are obvious and easy to distinguish!
20
Pathological Case
Border node? Where are the clusters?
General case: find maximal cliques in the graph; each clique is a cluster.
21
Summary of Inter-domain Clustering
1. Place nodes in the geometric space.
2. Calculate the Euclidean distance.
3. Build a graph based on distance and threshold.
4. Find the maximal cliques.
Inter-domain clustering: good!
Intra-domain clustering: not good enough!
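The four steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the slides do not say which clique algorithm was used, so the basic Bron-Kerbosch algorithm is swapped in here, and the node names, coordinates, and threshold value are made up.

```python
import math
from itertools import combinations


def build_graph(coords, threshold):
    """Steps 2-3: connect every pair of nodes whose Euclidean distance is below threshold.

    coords: dict mapping node name -> coordinate tuple (step 1 is assumed done).
    Returns an adjacency dict: node name -> set of neighbouring node names.
    """
    adj = {name: set() for name in coords}
    for u, v in combinations(coords, 2):
        if math.dist(coords[u], coords[v]) < threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Step 4: enumerate maximal cliques with the basic Bron-Kerbosch algorithm."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical nodes in a 3-landmark space (latencies in ms).
coords = {
    "a1": (12.0, 8.0, 3.0), "a2": (12.3, 8.4, 3.0),
    "b1": (30.0, 5.0, 20.0), "b2": (30.2, 5.1, 20.1),
}
clusters = maximal_cliques(build_graph(coords, threshold=1.0))
```

In the pathological case, a border node that is within the threshold of two groups appears in more than one maximal clique, so it would be reported in both clusters.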
22
Intra-domain Clustering
Nodes are in the same domain but in different subnets.
Latency is short: less than 1 ms.
The landmark-based approach does not have sufficient resolution: the measurement error is comparable to the real latency.
We need to change the approach for intra-domain clustering!
23
Intra-domain Clustering
1. Distance between nodes is the directly measured latency instead of the projected geometric distance. (This takes M × M measurements, but M is smaller and the measurements are quick.)
2. The basis for clustering is relative: the distance between any two nodes inside a cluster is within β% of the smallest distance in the cluster.
24
Intra-domain Clustering Procedure
Initially each node is a cluster.
Each edge is a measured latency.
REPEAT:
  Select the least-cost edge, say connecting clusters A and B.
  If A and B are not the same cluster, and this edge cost is within β% of the least-cost edges inside A and B, then combine them into one cluster.
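One possible reading of this procedure, sketched in Python. Several points are assumptions, since the slide does not spell them out: "within β%" is taken to mean cost ≤ (1 + β/100) × the cheapest edge already inside a cluster; a single-node cluster has no internal edge and so imposes no constraint; edges are visited once in ascending order of latency. The function name, node names, and latency values are hypothetical.

```python
def intra_domain_clusters(nodes, latency_ms, beta):
    """Greedy merging of clusters over edges sorted by measured latency.

    nodes: iterable of node names.
    latency_ms: dict mapping (u, v) pairs to measured latency in ms.
    beta: tolerance in percent.
    """
    nodes = list(nodes)
    parent = {n: n for n in nodes}            # union-find forest
    least_inside = {n: None for n in nodes}   # cheapest internal edge, keyed by root

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]     # path halving
            n = parent[n]
        return n

    limit = 1.0 + beta / 100.0
    for (u, v), cost in sorted(latency_ms.items(), key=lambda kv: kv[1]):
        a, b = find(u), find(v)
        if a == b:
            continue                          # already in the same cluster
        inside = [c for c in (least_inside[a], least_inside[b]) if c is not None]
        if all(cost <= limit * c for c in inside):
            parent[b] = a                     # combine A and B
            least_inside[a] = min(inside + [cost])

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())


# Illustrative latencies (ms): two tight groups, slower links between them.
latency = {
    ("p1", "p2"): 0.09, ("o1", "o2"): 0.09,
    ("p1", "o1"): 0.30, ("p1", "o2"): 0.32,
    ("p2", "o1"): 0.31, ("p2", "o2"): 0.30,
}
result = intra_domain_clusters(["p1", "p2", "o1", "o2"], latency, beta=50)
```

Under this reading, two isolated single-node clusters always merge on their first edge, so a real implementation would likely need an extra rule for that case.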
25
Outline
Introduction
Internet Geometric Space
Automatic Clustering
Experiments and Results
Conclusion
26
Experiments
Inter-Domain Clustering
  3 landmarks: UT (Austin), Rice, CMU
  36 compute nodes: Rice, UT-Dallas, TAMU-College Station, TAMU-Galveston
Intra-Domain Clustering
  4 clusters at University of Houston: PGH201, Itanium, Opteron, Stokes
TCP ping (not ICMP ping) is used to measure latency.
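The slides do not describe how the TCP ping was implemented. One common approach, shown here purely as an assumption, is to time how long a TCP connection takes to establish. The function name `tcp_ping_ms` and its parameters are hypothetical.

```python
import socket
import time


def tcp_ping_ms(host, port, timeout=2.0):
    """Return the time in milliseconds taken to open a TCP connection to host:port.

    This times the connect handshake only; it needs an open TCP port on the
    target and raises OSError if the connection fails or times out.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

For sub-millisecond links such as those inside one domain, repeating the measurement several times and taking the minimum is a typical way to reduce noise.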
27
Inter-domain Cluster (2 landmarks)
Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
Cannot distinguish between UT Dallas and TAMU Galveston.
28
Inter-domain Cluster (3 landmarks)
Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
The 4 clusters are well distinguished.
29
Inter-domain Cluster (2 landmarks)
Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
30
Intra-domain Cluster Latency
Latency between nodes (ms)

Clusters   PGH201   Opteron   Itanium   Stokes
PGH201     0.09     0.32                0.30
Opteron    0.25     0.09                0.50
Itanium    0.30     0.10                0.35
Stokes     0.40     0.50      0.60      0.10
31
Illustration of Intra-domain Clusters
Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
32
Future Work
Integrate into a grid scheduling system.
Use bandwidth as a factor for clustering.
Dynamically update logical clusters.
Handle nodes behind a NAT (network address translation), i.e. nodes with local IP addresses.
33
Conclusions
An efficient and scalable procedure to hierarchically group distributed nodes into logical clusters.
Validation with experiments on nodes distributed across Texas.
An important step for scheduling in a grid environment.
34
Questions?
Thank you!