Building Scalable, High Performance Cluster and Grid Networks: The Role Of Ethernet Thriveni Movva CMPS 5433.


1 Building Scalable, High Performance Cluster and Grid Networks: The Role Of Ethernet Thriveni Movva CMPS 5433

2 Overview  About Grids/Clusters  Uses of Grid  Differences between Grids/Clusters  Benefits of Grid  Grid Architecture  Building Ethernet Network for Grids/Clusters  Examples of Ethernet Grids/Clusters  Conclusion/Summary

3 What Is A Grid Computer?  A hardware and software system  Integrates a collection of distributed system components (computer systems, storage, etc.)  Solves large-scale computation problems  Appears to the user as a single, large “virtualized” computing system  Consists of geographically dispersed computers

4 What is a Cluster?  Multiprocessor system consisting of co-located computers and storage  Viewed as though it were a single computer  Connected through fast local area networks (localized within a room or building)  Provides more speed and/or reliability than a single computer  More cost-effective than single computers of comparable speed or reliability

5 Uses of Grid Computing  Computer systems and other resources need not be dedicated to individual users or applications  They can be made available for dynamic pooling and sharing according to changing needs  Over the Internet, Grid-based resource sharing and collaborative problem solving can be extended to multi-institutional “Virtual Organizations”

6 Differences between Grids/Clusters  Grids: dispersed over a local, metropolitan, or wide area network; span administrative boundaries; focus on problems in distributed computing and resource sharing; distribute workloads among different machine types and operating systems  Clusters: localized within a room or building; under a single administration; focus on compute-intensive problems and HPC; homogeneous (single type of processor and OS)

7 Benefits Of The Grid  Grid Computing offers a number of Potential uses and benefits that can be broadly categorized in the following way:  High Performance Computing (HPC)  Data Federation and Collaboration  Resource Allocation and Optimization

8 High Performance Computing (HPC)  Benefits computationally intensive, parallelizable applications  Uses an array of numerous commodity or specialized computer systems  Most applications of the Grid fall into the HPC classification  Advantages of HPC:  Cost-effective solutions to critical problems  High return on investment  Solves problems that were previously unsolvable within the given time and cost  Solves problems too large for conventional supercomputers  Fields in which the HPC Grid has successfully addressed a wide range of computational problems include:  Climate/weather/ocean modeling and simulation, Internet search engines, signal/image processing, pharmaceutical research, military forces simulation

9 Data Federation and Collaboration  Consolidates data from different sources in a single data service  Hides data location, local ownership and infrastructure from the application  Causes no disruption to local users, applications or data management policies  Facilitates a wide range of integrated applications such as:  Corporate performance dashboards  Marketing analysis tools  Customer service applications  Data mining applications

10 Resource Allocation and Optimization  Sharing of computing and storage to improve resource utilization  For example, applications and batch jobs can be transferred to an idle server  Benefits of resource optimization:  Reclaims much of the stranded capacity of the computing infrastructure  Reduces the level of capital investment  Requires no modification of existing applications

11 Grid Computing Architecture  Basic architecture of Grid consists of  User Interface  Applications  Grid Middleware  Computing Resources  Grid Network

12 Applications  Classification of parallel applications  Embarrassingly Parallel Computations (EPC): divided into independent parts; allocated to multiple processors for simultaneous execution; no communication is required between the processors; example: testing large integers to determine prime numbers  Parametric and Data Parallel Computations: also referred to as Nearly Embarrassingly Parallel Computations (NEPC); each processor works on an independent subset of the data; data is later gathered by a single process; example: Internet search engines  Loosely Coupled Synchronous Parallel Computations: inter-process communication is needed between a small subset of processors before the computation can be completed
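The EPC prime-testing example and the NEPC gather step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`is_prime`, `split`, `primes_in_chunk`, `find_primes`) are hypothetical, and a local thread pool stands in for the processors of a cluster or Grid. In CPython, threads will not speed up this CPU-bound work; the point is the decomposition into independent parts, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; adequate for a small illustration."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def split(candidates, parts):
    """Divide the work into independent, roughly equal chunks."""
    return [candidates[i::parts] for i in range(parts)]


def primes_in_chunk(chunk):
    """Work done by one worker; it needs no data from any other worker."""
    return [n for n in chunk if is_prime(n)]


def find_primes(candidates, workers=4):
    """Scatter the chunks to workers, then gather the partial results."""
    chunks = split(candidates, workers)
    # Hypothetical stand-in: each pool thread plays the role of one processor.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(primes_in_chunk, chunks))
    # Gather step: a single process merges the partial results.
    return sorted(p for part in partial for p in part)


if __name__ == "__main__":
    print(find_primes(list(range(1, 50))))
```

Because the workers never exchange data while computing, the only network traffic in a real deployment would be the initial distribution of chunks and the final gather.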

13 Grid Middleware  Gives the Grid the semblance of a single computer system  Provides coordination among computing resources of the Grid  Provides location transparency  Allows the applications to run over a virtualized layer of networked resources  Available from system vendors and independent software vendors  Example: Globus Toolkit

14 Functions of Middleware  Discovery and monitoring:  Discovers what resources or services are available  Monitors their status  Resource allocation and management:  Matches application requirements to the available computing resources  Creates and schedules remote jobs as required  Ensures optimum load balancing and resource utilization  Security:  Shared resources may contain sensitive information  Secures communications and authenticates user identities using SSL/TLS, etc.  Message passing system:  Used by compute-intensive parallel applications for inter-process communication  Examples: MPI (Message Passing Interface) and PVM (Parallel Virtual Machine)
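The resource allocation function (matching a job's requirements to available resources while balancing load) might look roughly like the sketch below. It is not how Globus Toolkit or any particular middleware works; the `Node` and `Job` classes, their fields, and the `place` function are all hypothetical names chosen for illustration, and the policy (pick the eligible node with the fewest running jobs) is just one simple assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A computing resource as the middleware might see it (hypothetical)."""
    name: str
    cpus: int
    mem_gb: int
    running: int = 0  # jobs currently assigned to this node


@dataclass
class Job:
    """An application's resource requirements (hypothetical)."""
    name: str
    cpus: int
    mem_gb: int


def place(job, nodes):
    """Return the least-loaded node that satisfies the job, or None."""
    # Matching: keep only nodes that meet the job's requirements.
    eligible = [n for n in nodes
                if n.cpus >= job.cpus and n.mem_gb >= job.mem_gb]
    if not eligible:
        return None
    # Load balancing: prefer the eligible node with the fewest jobs.
    best = min(eligible, key=lambda n: n.running)
    best.running += 1
    return best


if __name__ == "__main__":
    pool = [Node("a", cpus=4, mem_gb=8), Node("b", cpus=16, mem_gb=64)]
    for job in (Job("big", 8, 32), Job("small", 1, 1)):
        chosen = place(job, pool)
        print(job.name, "->", chosen.name if chosen else "no match")
```

A real scheduler would also track resources already consumed on each node, queue jobs that cannot be placed, and react to the monitoring data described above.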

15 Ethernet Networks for Clusters and Grids  Single-switch Clusters  Large Clusters  Ethernet Grid Networks

16 Single-switch Clusters  Built using a single high-availability Gigabit Ethernet switch/router as the cluster interconnect  The maximum size of a single-switch Ethernet cluster is determined by the non-blocking port capacity of the switch  Current switch/routers can interconnect over 600 GbE-connected servers  All server ports are configured to be in the same subnet

17 Large Clusters  Built using meshes of federated Ethernet switches  The switches are arranged in non-blocking, Constant Bisectional Bandwidth (CBB) topologies  CBB: provides scalability to support thousands of cluster nodes; provides high-bandwidth connectivity to the network  The core of the cluster gives each node switch an equal share of the load to avoid blocking of ports
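To see how a CBB mesh scales to thousands of nodes, the arithmetic for one common arrangement can be worked through in code. The sketch assumes a two-tier design that the slides do not spell out: every node switch splits its ports evenly between servers and uplinks, and has exactly one uplink to every core switch. All ports are assumed to run at the same speed. The function name `two_tier_cbb` and its parameters are hypothetical.

```python
def two_tier_cbb(node_switch_ports: int, core_switch_ports: int) -> dict:
    """Size a non-blocking two-tier mesh (assumed topology, see lead-in).

    Each node switch uses half its ports for servers and half for uplinks,
    so the bandwidth toward the core equals the bandwidth toward servers.
    """
    if node_switch_ports < 2 or node_switch_ports % 2:
        raise ValueError("node switch needs an even number of ports")
    if core_switch_ports < 1:
        raise ValueError("core switch needs at least one port")
    down = node_switch_ports // 2       # server-facing ports per node switch
    up = node_switch_ports - down       # uplinks per node switch
    core_switches = up                  # one uplink to each core switch
    node_switches = core_switch_ports   # each core port serves one node switch
    return {
        "node_switches": node_switches,
        "core_switches": core_switches,
        "server_ports": node_switches * down,
        "oversubscription": down / up,  # 1.0 means non-blocking
    }


if __name__ == "__main__":
    print(two_tier_cbb(node_switch_ports=48, core_switch_ports=48))
```

Under these assumptions, 48-port switches in both tiers give 48 node switches, 24 core switches, and 1152 server ports with an oversubscription ratio of 1.0.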

18 Ethernet Grid Networks (Campus Grid network based on Ethernet switching)  Ethernet allows the cluster to participate in a broader campus or enterprise Grid structure  Desktop computers and workstations are connected to the campus Grid network using GbE  Server farms outside of the cluster are connected to site switches using GbE  Goals of the campus LAN:  give high priority to general Grid traffic  ensure critical Grid traffic does not incur any added latency

19 Grid Tools  Tools used to prioritize critical Grid traffic  Priority queuing: the forwarding capacity of a congested port is immediately allocated to any high priority traffic that enters the queue  Rate limiting and policing: limits the amount of lower priority traffic that enters the network  Weighted Random Early Discard (WRED):  Packet loss can be eliminated if buffers are never allowed to fill to capacity and overflow  Overflows can be avoided by applying WRED to the lower priority traffic  WRED eliminates the possibility of high priority packets arriving at a buffer that is already overflowing with lower priority packets
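Two of these tools, priority queuing and WRED, can be sketched as follows. This is a simplified model, not any vendor's implementation: the thresholds in `PROFILES`, the two priority levels, and the names `drop_probability`, `admit`, and `StrictPriorityPort` are all assumptions made for illustration. The drop curve is the usual RED-style linear ramp between a minimum and a maximum average queue depth, with a separate profile per priority so that low priority traffic is discarded earlier. Rate limiting is not shown.

```python
import random
from collections import deque

# Assumed per-priority profiles: (min_threshold, max_threshold, max_drop_prob).
# Depths are in packets. Low priority starts dropping much earlier.
PROFILES = {"low": (20, 40, 0.10), "high": (60, 80, 0.02)}


def drop_probability(avg_depth, min_th, max_th, max_p):
    """RED-style curve: no drops below min_th, all dropped at max_th."""
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)


def admit(avg_depth, priority, rng=random.random):
    """Decide whether an arriving packet is queued (True) or discarded."""
    min_th, max_th, max_p = PROFILES[priority]
    return rng() >= drop_probability(avg_depth, min_th, max_th, max_p)


class StrictPriorityPort:
    """Output port that always serves high priority traffic first."""

    def __init__(self):
        self.queues = {"high": deque(), "low": deque()}

    def enqueue(self, packet, priority):
        self.queues[priority].append(packet)

    def dequeue(self):
        # Forwarding capacity goes to the highest non-empty priority level.
        for level in ("high", "low"):
            if self.queues[level]:
                return self.queues[level].popleft()
        return None


if __name__ == "__main__":
    port = StrictPriorityPort()
    port.enqueue("bulk-transfer", "low")
    port.enqueue("mpi-message", "high")
    print(port.dequeue(), port.dequeue())
```

With these assumed profiles, at an average depth of 50 packets every arriving low priority packet is discarded while every high priority packet is still admitted, which is how the buffer is kept from filling with low priority traffic.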

20 Examples of Ethernet Clusters/Grids  TeraGrid:  A multi-institutional effort to build and deploy the world’s most comprehensive computing infrastructure for open scientific research  NASA:  NASA uses the ESDCD “Grid of clusters” to help scientists increase their understanding of the Earth, the solar system and the universe through computational modeling and processing of space-borne observations

21 Conclusion/Summary  Ethernet continues to evolve as a highly cost-effective and flexible technology  The majority of parallel and general Grid applications are very well served by the performance characteristics of Ethernet as the cluster/Grid interconnect  In the future, Ethernet end-to-end data transfer bandwidths, message latencies and CPU utilization will improve dramatically due to NIC enhancements  Volume production is leading to price declines  These developments are expected to improve the overall performance of existing Ethernet clusters/Grids and to broaden the use of cluster/Grid technology to a wider range of commercial enterprises

