Automatic Clustering of Grid Nodes. Qiang Xu, Jaspal Subhlok, University of Houston. Texas Learning and Computation Center, High Performance Systems Lab. Nov 14, 2005.


Texas Learning and Computation Center, High Performance Systems Lab
Automatic Clustering of Grid Nodes
Nov 14, 2005
Qiang Xu, Jaspal Subhlok
University of Houston

Grid Scheduler
- Computational resource: CPU, memory
- Network link: latency, bandwidth
- Network topology
- "I will decide which group of nodes is best for an application!"

Network Topology
- Fine-grained physical network topology: hard to obtain, given the heterogeneous, dynamic, and distributed nature of a grid system.
- We focus on the "logical" network topology: the connectivity between nodes based on the observed behavior.
  1) Easier to compute
  2) Sufficient to tackle the resource selection problem

Discover Clusters/Logical Topology
- A set of nodes with IP addresses / hostnames
- Connectivity?

Discover Clusters/Logical Topology
- Figure labels: Cluster A, Cluster B, Cluster C; Dist(A-B), Dist(B-C), Dist(A-C)
- Nodes close to each other → same cluster

Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Results
- Conclusion

Internet Topology Map 1
- A macroscopic snapshot of the Internet, April 2005.

Internet Topology Map 2
- Internet map as of 1998 by Bill Cheswick (Bell Labs) and Hal Burch (CMU)

Why Geometric Space?
- Internet topology map: complex! "I can't tell the distance between nodes!"
- Geometric space (N-dimensional Euclidean space)
- GNP (Global Network Positioning): T. S. Eugene Ng and Hui Zhang, INFOCOM'02

Magic Landmarks!
- Landmarks: a set of nodes distributed across the Internet
- Figure legend: Node, Landmark

Geometric Space
1. One axis per landmark
2. Coordinate of a node ≡ latency from each landmark
- Figure labels: X4=12, Y4=8, Z4=3
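The coordinate rule on this slide can be sketched in a few lines. This is only an illustration: the function name `node_coordinates`, the landmark names `X`, `Y`, `Z`, and the idea that latencies arrive as a dictionary are assumptions, and the sample values are the figure labels from the slide.

```python
def node_coordinates(latency_ms, landmarks):
    """Return a node's position: one coordinate per landmark, in landmark order.

    latency_ms maps a landmark name to the latency measured from that
    landmark to the node (hypothetical input format).
    """
    return tuple(float(latency_ms[name]) for name in landmarks)


# Hypothetical landmark names; the values are the slide's figure labels.
landmarks = ["X", "Y", "Z"]
node4 = node_coordinates({"X": 12, "Y": 8, "Z": 3}, landmarks)
print(node4)
```

With three landmarks every node becomes a point in a 3-dimensional space, so any two nodes can be compared without probing each other.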

Internet → Geometric Space
- Figure labels: Complex Internet Structure, Simple Geometric Space

Advantage of Geometric Space
- Simple: distance in geometric space is well defined, e.g. the Euclidean distance.
- Scalable: for M nodes,
  pairwise distance among M nodes → M*M probes
  mapping to geometric space → M*N probes
  N is the number of landmarks; a number ~7 is known to be sufficient.
- Easy to manage: only need to control the landmarks.
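To make the scalability argument concrete, the probe counts can be compared directly. This is a small sketch that uses the slide's own counting (M*M for pairwise, M*N for landmark-based); the function name `probe_counts` is an assumption, and the sample sizes are the 36 compute nodes and 3 landmarks from the experiments.

```python
def probe_counts(m_nodes, n_landmarks):
    """Return (pairwise probes, landmark probes) using the slide's counting."""
    pairwise = m_nodes * m_nodes          # every node probes every node
    via_landmarks = m_nodes * n_landmarks  # every node is probed from each landmark
    return pairwise, via_landmarks


pairwise, via_landmarks = probe_counts(36, 3)
print(pairwise, via_landmarks)  # 1296 108
```

The landmark count stays fixed as the grid grows, so the second figure grows linearly in M while the first grows quadratically.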

Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Results
- Conclusion

Again the problem!
- Figure labels: Cluster A, Cluster B, Cluster C; Dist(A-B), Dist(B-C), Dist(A-C)

Place Nodes in Geometric Space!
- Simple geometric space
- How do I cluster?

Distance and Threshold
- Network distance: the Euclidean distance between node coordinates.
- Threshold: defined in terms of N and T.
- If Distance < Threshold, nodes belong to the same logical cluster.
- N is the number of landmarks.
- The T parameter describes how close nodes have to be to be in the same cluster; for a typical domain to be one cluster, T = 1 ms.
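As a reminder of what the distance computes, the standard Euclidean distance in an N-dimensional space can be written as below. The symbols are assumptions for illustration: a and b are two nodes, and a_i is the latency from landmark i to node a. Only the distance is shown; the threshold is not derived here.

```latex
d(a, b) = \sqrt{\sum_{i=1}^{N} \left(a_i - b_i\right)^2}
```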

Build Undirected Graph
- All grid nodes are graph nodes.
- Add an edge between nodes if Distance < Threshold.
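The graph-building step can be sketched as follows. This is a minimal illustration, not the authors' implementation: `build_graph`, the adjacency-set representation, and the sample coordinates are all assumptions, and the threshold is passed in as a plain number.

```python
import math
from itertools import combinations


def build_graph(coords, threshold):
    """Build an undirected graph as {node: set(neighbours)}.

    coords maps a node name to its coordinate tuple (one latency per
    landmark). An edge is added when the Euclidean distance between two
    nodes is below the threshold.
    """
    adjacency = {name: set() for name in coords}
    for a, b in combinations(coords, 2):
        if math.dist(coords[a], coords[b]) < threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical coordinates (ms from three landmarks).
coords = {
    "n1": (10.0, 20.0, 5.0),
    "n2": (10.2, 20.1, 5.1),
    "n3": (30.0, 8.0, 12.0),
}
graph = build_graph(coords, threshold=1.0)
print(graph)
```

Here `n1` and `n2` end up connected, while `n3` stays isolated because it is far from both in the geometric space.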

Typical Case
- An edge exists if Distance < Threshold.
- Clusters are obvious and easy to distinguish!

Pathological Case
- Border node? Where are the clusters?
- General case: find maximal cliques in the graph; each clique is a cluster.
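One way to enumerate maximal cliques is the basic Bron-Kerbosch recursion, sketched below. The slides do not say which clique algorithm was used, so this choice, the function name `maximal_cliques`, and the sample graph are assumptions. The example includes a border node `c` that is adjacent to two groups.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.

    adjacency maps each node to the set of its neighbours (undirected).
    Returns a sorted list of cliques, each a sorted list of node names.
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend it: maximal
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# Hypothetical graph: two triangles that share the border node "c".
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
print(maximal_cliques(graph))
```

The border node appears in both cliques, which is what makes the pathological case ambiguous when each clique is read as a cluster.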

Summary of Inter-domain Clustering
1. Place nodes in the geometric space.
2. Calculate the Euclidean distance.
3. Build a graph based on distance and threshold.
4. Find the maximal cliques.
- Inter-domain clustering: good!
- Intra-domain clustering: not good enough!

Intra-domain Clustering
- Nodes in the same domain but in different subnets.
- Short latency: less than 1 ms.
- Landmark-based approach: resolution is not sufficient! Measurement error ~ real latency.
- We need to change the approach for intra-domain clustering!

Intra-domain Clustering
1. Distance between nodes is the directly measured latency instead of the projected geometric distance. (M × M probes, but M is smaller and measurements are quick.)
2. The basis for clustering is relative: the distance between any two nodes inside a cluster is within β% of the smallest distance in the cluster.

Intra-domain Clustering Procedure
- Initially each node is a cluster.
- Each edge is a measured latency.
- REPEAT:
  Select the least-cost edge, say connecting clusters A and B.
  If A and B are not the same cluster, and this edge cost is within β% of the least-cost edges inside A and B, then combine them into one cluster.
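The procedure reads like a Kruskal-style merge, and a minimal sketch under stated assumptions follows. Two points are interpretations rather than something the slides spell out: "within β%" is taken to mean cost <= (1 + β/100) × the least-cost edge already inside a cluster, and a single-node cluster (which has no internal edge yet) is allowed to merge freely. The function name, the union-find bookkeeping, and the sample latencies are hypothetical.

```python
def intra_domain_clusters(nodes, latency, beta):
    """Greedy merge of clusters by measured latency.

    latency maps a pair (a, b) to the measured latency in ms.
    beta is a percentage, e.g. 50 means "within 50%".
    """
    parent = {n: n for n in nodes}
    least_inside = {n: None for n in nodes}  # least-cost internal edge per root

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    limit = 1.0 + beta / 100.0
    # Visit edges from least to greatest cost.
    for (a, b), cost in sorted(latency.items(), key=lambda kv: kv[1]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already the same cluster
        inside = [c for c in (least_inside[ra], least_inside[rb]) if c is not None]
        # Assumption: singleton clusters impose no constraint.
        if all(cost <= limit * c for c in inside):
            parent[rb] = ra
            least_inside[ra] = min(inside + [cost])

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical measurements (ms): two tight groups, 0.5 ms apart.
nodes = ["a1", "a2", "a3", "b1", "b2"]
latency = {
    ("a1", "a2"): 0.10, ("a1", "a3"): 0.11, ("a2", "a3"): 0.12,
    ("b1", "b2"): 0.10,
    ("a1", "b1"): 0.50, ("a1", "b2"): 0.50, ("a2", "b1"): 0.50,
    ("a2", "b2"): 0.50, ("a3", "b1"): 0.50, ("a3", "b2"): 0.50,
}
print(intra_domain_clusters(nodes, latency, beta=50))
```

Because the test is relative, the 0.5 ms cross edges are rejected even though they are far below the 1 ms scale used for inter-domain clustering.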

Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Results
- Conclusion

Experiments
- Inter-domain clustering
  3 landmarks: UT (Austin), Rice, CMU
  36 compute nodes: Rice, UT-Dallas, TAMU-College Station, TAMU-Galveston
- Intra-domain clustering
  4 clusters at University of Houston: PGH201, Itanium, Opteron, Stokes
- TCP ping (not ICMP ping) to measure latency
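The slides do not describe how the TCP ping was implemented. One common approach, sketched here as an assumption, is to time how long a TCP connection takes to establish. The function name `tcp_ping`, the default timeout, and the choice of port are all hypothetical.

```python
import socket
import time


def tcp_ping(host, port, timeout=2.0):
    """Return the time in milliseconds needed to open a TCP connection.

    Assumption: connection setup time is used as the latency estimate.
    Raises OSError if the connection cannot be established.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

A caller would typically repeat the measurement several times per node and keep the minimum, since single samples at sub-millisecond scale are noisy.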

Inter-domain Cluster (2 landmarks)
- Figure legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
- Cannot distinguish between UT Dallas & TAMU Galveston

Inter-domain Cluster (3 landmarks)
- Figure legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice
- 4 clusters are well distinguished

Inter-domain Cluster (2 landmarks)
- Figure legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice

Intra-domain Cluster Latency
- Table: latency between nodes (ms); rows and columns: PGH201, Opteron, Itanium, Stokes

Illustration of Intra-domain Clusters
- Figure legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice

Future Work
- Integrate into a grid scheduling system
- Use bandwidth as a factor for clustering
- Dynamically update logical clusters
- Nodes behind a NAT (network address translation): nodes with local IP addresses

Conclusions
- Efficient and scalable procedure to hierarchically group distributed nodes into logical clusters
- Validation with experiments on nodes distributed across Texas
- An important step for scheduling in a grid environment

Questions? Thank you!