High Performance Cluster Computing: Architectures and Systems. Hai Jin, Internet and Cluster Computing Center.

2 Multiple Path Communication
- Introduction
- Heterogeneity in Networks and Applications
- Multiple Path Communication
- Case Study
- Summary and Conclusion

3 Introduction (1)
- Communication is a major factor in the performance of network-based computing
- Utilize all available network resources within a system
- There are substantial benefits in simultaneously exploiting multiple independent communication paths between processors
- Heterogeneous network environment
  - Single application execution
  - Different networks have different performance characteristics

4 Introduction (2)
- Performance-based path determination (PBPD)
  - Exploits multiple physical communication paths and multiple communication protocols within a single parallel application program to reduce communication overhead
  - Different physical networks and protocols have different performance characteristics
  - Different types of communication within an application might be best suited to one of several different communication paths
    - Bulk data transfer (image data): high-bandwidth communication
    - Short control messages (synchronization, acknowledgement): low-latency communication

5 Introduction (3)
- Performance-based path determination (PBPD)
  - Performance-based path selection (PBPS)
    - Dynamically selects the best communication path among several in a given system
  - Performance-based path aggregation (PBPA)
    - Aggregates multiple communication paths into a single virtual path for sending an individual message in an application
    - Each independent path simultaneously carries a fraction of a single message
    - Useful in bandwidth-limited situations
  - Control is based on message size, message type, or network traffic load

6 Heterogeneity in Networks and Applications
- Heterogeneity is inherent in interprocessor communication
- Different types of messages
  - Latency-limited
  - Bandwidth-limited

7 Varieties of Communication Networks
- WAN (1 Mbps)
- MAN (100 Mbps)
- SAN (more than 1 Gbps)
- Ethernet (10 Mbps to 1000 Mbps)
- ATM (155 Mbps)
- FDDI (100 Mbps)
- HiPPI (800 Mbps to 1.6 Gbps)
- InfiniBand (10 Gbps)

8 Exploiting Multiple Communication Paths
- Messages: size, communication groups, priority level
- Different types of network services support different types of message communication
- Network services: connection-oriented and connectionless
  - Both are logically sufficient for the implementation of any communication pattern
  - Each offers performance and programming benefits for some classes of applications

9 Benefits of Supporting Multiple Communication Paths
- Efficient network utilization
  - The most appropriate communication paths can be used
- Alternative network protocols
  - Connection-oriented / connectionless
- Robust communication
  - Multiple networks provide extra reliability
- Network load balancing
- Quality of service
  - Multiple paths can support diverse QoS requirements

10 Multiple Path Communication
- A single application is likely to require several different types of messages for communication
- Different types of messages may each be better suited to a different type of communication mechanism

11 Performance-Based Path Selection (PBPS)
- Useful when one path provides better performance in one situation while another path is better in a different situation
- Makes it possible to dynamically select the appropriate communication path for a given communication event
- $f_1(m_1) = t_1$, $f_2(m_2) = t_2$
- $f_{\mathrm{PBPS}}(m) = \mathrm{Best}[f_i(m)]$, where $i = 1, \dots, N$

12 When sending a message of size $m_i$, performance-based path selection (PBPS) uses the lower latency curve among $f_1(m_i)$ and $f_2(m_i)$
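
The selection rule above can be sketched in a few lines. This is a minimal illustration, not the library's code: the slides do not give the form of $f_i$, so the linear model (fixed startup cost plus size divided by bandwidth), the `Path` class, the `select_path` helper, and all numeric values are assumptions made for the example, and "Best" is taken to mean lowest modeled latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical linear latency model: f(m) = startup_s + m / bytes_per_s."""
    name: str
    startup_s: float     # fixed per-message cost in seconds (assumed)
    bytes_per_s: float   # sustained bandwidth in bytes per second (assumed)

    def latency(self, size_bytes: int) -> float:
        return self.startup_s + size_bytes / self.bytes_per_s


def select_path(paths, size_bytes):
    """PBPS: return the path whose modeled latency f_i(m) is lowest."""
    return min(paths, key=lambda p: p.latency(size_bytes))


# Illustrative numbers only; they are not measurements from the slides.
paths = [
    Path("low-latency", startup_s=0.0002, bytes_per_s=1.0e6),
    Path("high-bandwidth", startup_s=0.0010, bytes_per_s=25.0e6),
]

for size in (64, 100_000):
    print(size, select_path(paths, size).name)
```

With these assumed parameters the two curves cross at roughly 833 bytes, so short control messages take the low-latency path and bulk transfers take the high-bandwidth path.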

13 Performance-Based Path Aggregation (PBPA)
- Can be applied when different paths show similar characteristics
  - Given 2 nearly identical networks, aggregate them into a single virtual network
  - Bandwidth will be nearly doubled
- Divide, transmit, aggregate
- Important consideration: determining the size of the submessages
- $f_{\mathrm{PBPA}}(m) = f_i(m_i)$, where $i = 1, \dots, N$ and $m = \sum_{i=1}^{N} m_i$

14 When using performance-based path aggregation (PBPA) with two networks, a message of size $m_1 + m_2$ is split into two submessages of size $m_1$ and $m_2$, which are sent simultaneously over the networks characterized by $f_1$ and $f_2$
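
One way to choose the submessage sizes is to make both transfers finish at the same time, which is how the condition $f_{\mathrm{PBPA}}(m) = f_i(m_i)$ for every $i$ is read here. The sketch below assumes hypothetical linear models $f_1(m) = a_1 + m/b_1$ and $f_2(m) = a_2 + m/b_2$; the function names and numbers are illustrative and do not come from the slides.

```python
def split_message(size, a1, b1, a2, b2):
    """Split `size` bytes into (m1, m2) so both paths finish together.

    Assumes the hypothetical linear models f1(m) = a1 + m / b1 and
    f2(m) = a2 + m / b2 (startup seconds a, bandwidth b in bytes/s).
    """
    # Solve a1 + m1/b1 == a2 + (size - m1)/b2 for m1.
    m1 = (a2 - a1 + size / b2) / (1.0 / b1 + 1.0 / b2)
    # Clamp: a message too small to segment goes entirely over one path.
    m1 = int(round(min(max(m1, 0.0), float(size))))
    return m1, size - m1


def aggregated_latency(size, a1, b1, a2, b2):
    """Modeled PBPA latency: the slower of the two submessage transfers."""
    m1, m2 = split_message(size, a1, b1, a2, b2)
    t1 = a1 + m1 / b1 if m1 else 0.0   # an unused path costs nothing
    t2 = a2 + m2 / b2 if m2 else 0.0
    return max(t1, t2)


# Illustrative: two nearly identical networks (values are not from the slides).
print(split_message(1_000_000, 0.001, 10e6, 0.001, 10e6))
print(aggregated_latency(1_000_000, 0.001, 10e6, 0.001, 10e6))
```

For two identical paths the message is halved, and the modeled transfer time drops from about 0.101 s on a single path to about 0.051 s, which matches the "nearly doubled bandwidth" claim once the startup cost is small relative to the transfer.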

15 PBPD Library
- Custom library whose main feature is the support of multiple communication paths in a single application program
- Based on a common TCP layer
  - Adds an integer field 'length' that tells the message size
  - Used by PBPA if a message is too small to segment
- Handles the multiplexing of different TCP connections
  - Uses the UNIX select system call with appropriate table lookups
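
The two mechanisms named above, a length field in front of each message and `select`-based multiplexing with a table lookup, can be illustrated as follows. This is a sketch under stated assumptions, not the PBPD library itself: the 4-byte network-order header layout, the helper names (`send_msg`, `recv_exact`, `recv_msg`), and the use of local socket pairs as stand-ins for TCP connections over two different networks are all choices made for the example.

```python
import select
import socket
import struct

# Assumed header layout: one 4-byte unsigned length in network byte order.
HEADER = struct.Struct("!I")


def send_msg(sock, payload: bytes) -> None:
    """Prefix the payload with its length, then send everything."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; stream sockets may return short reads."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock) -> bytes:
    """Read the length field, then the message body it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


# Two local socket pairs stand in for TCP connections over two networks.
tx_a, rx_a = socket.socketpair()
tx_b, rx_b = socket.socketpair()
names = {rx_a: "path A", rx_b: "path B"}   # lookup table: socket -> path

send_msg(tx_a, b"control")
send_msg(tx_b, b"bulk data" * 100)

received = {}
while len(received) < 2:
    # select reports which connections currently have data to read.
    readable, _, _ = select.select(list(names), [], [], 5.0)
    if not readable:
        raise TimeoutError("no path became readable")
    for sock in readable:
        received[names[sock]] = recv_msg(sock)

for s in (tx_a, rx_a, tx_b, rx_b):
    s.close()
print({name: len(data) for name, data in received.items()})
```

Knowing the length up front lets the receiver read a whole message from one connection before returning to `select`, so submessages arriving on different paths are never interleaved within a single read.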

16 Implementation and Protocol Hierarchy of the PBPD Communication Routines

17 Case Study: Multiple Path Characteristics
- Communication types in a parallel application program: point-to-point, collective
- 4 Silicon Graphics Challenge L shared-memory multiprocessors
- 4 or 8 R10000 processors at 196 MHz per node
- 10 Mbps Ethernet and 266 Mbps Fibre Channel

18 Multiple Heterogeneous Network Configuration used in the Experiments

19 Point-to-point and Broadcast Characteristics of Ethernet using the TCP and UDP Communication Protocols

20 Example of a Broadcast Operation of a Separate Addressing Method using the PBPA Technique

21 Point-to-point Characteristics of Ethernet and Fibre Channel using the TCP Protocol

22 Broadcast Characteristics of Ethernet and Fibre Channel using the TCP Protocol

23 Case Study: Communication Patterns of Parallel Applications Performance of PBPD at application-level depends on the communication patterns of the specific application being executed

24 Parallel Benchmark Programs Tested

Program   Description
CG        Conjugate gradient
MG        Multigrid
IS        Integer sort
Filter    Smoothing (averaging) filter
Gauss     Gaussian elimination
Hough     Line recognition algorithm
Kirsch    Image processing
TRFD      Two-electron integral transformation
Warp      Spatial domain image restoration
BT        Simulated CFD application using block tridiagonal solver
LU        Simulated CFD application using LU solver
SP        Simulated CFD application using pentadiagonal solver
MICOM     Miami isopycnic coordinate ocean model

25 Case Study: Computation Model
- Data parallelization
- Medium- to coarse-grained parallelism
- Similar communication pattern for every node, except when starting the application
- Each processor tends to alternate computation and communication at the same time as the others, so communication congestion is inevitable

26 Relative Times of Communication Events for the IS Benchmark

27 Relative Times of Communication Events for the MG Benchmark

28 Case Study: Message Size and Destination Distributions
- Distribution of message destinations
  - Uniformly distributed among all nodes
  - Biased to some destinations for each node
- Distribution of message sizes

29 Overall Message Destination Distribution for All of the Test Programs

30 The Message Destination Distribution for Each Processor in the CG Benchmark

31 The Cumulative Distribution of Message Sizes in the Test Programs

32 Experimental Results
- Parameterize the communication pattern
  - Point-to-point communication
    - Small messages: mean size $l_1$, probability of appearing is $b$
    - Large messages: mean size $l_2$, probability of appearing is $1 - (b + c)$
  - Broadcast communication
    - Mean message size $l_3$, probability of appearing is $c$
- Assumptions
  - 1 master, $p - 1$ slave processors
  - The message size for each communication follows a Poisson distribution with 3 different mean values $l_1$, $l_2$, $l_3$
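
The parameterization above can be turned into a small workload generator. The sketch below is an assumption-laden illustration rather than the benchmark used in the experiments: the function names, the event labels, and every numeric value are hypothetical, and Poisson sampling uses Knuth's multiplication method, which is only adequate for modest means.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson sample with Knuth's method.

    Adequate for modest means only: exp(-mean) must not underflow.
    """
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def generate_events(rng, count, b, c, l1, l2, l3):
    """Return (kind, size) pairs following the slide's parameterization.

    Small pt2pt with probability b, broadcast with probability c,
    large pt2pt with the remaining probability 1 - (b + c).
    """
    events = []
    for _ in range(count):
        u = rng.random()
        if u < b:
            events.append(("pt2pt-small", poisson(rng, l1)))
        elif u < b + c:
            events.append(("broadcast", poisson(rng, l3)))
        else:
            events.append(("pt2pt-large", poisson(rng, l2)))
    return events


# Illustrative parameters only; they are not the values used in the slides.
rng = random.Random(0)
events = generate_events(rng, 20_000, b=0.5, c=0.2, l1=8, l2=200, l3=50)
small = [size for kind, size in events if kind == "pt2pt-small"]
print(len(small) / len(events), sum(small) / len(small))
```

A seeded `random.Random` keeps the synthetic trace reproducible, which makes it easier to compare path-selection policies on identical message sequences.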

33 Parameter Values Used in the Synthetic Benchmark

Columns: application type (A, B, C); pt2pt (small), mean $l_1$, probability $b$; pt2pt (large), mean $l_2$, probability $1 - (b + c)$; broadcast, mean $l_3$, probability $c$
Values as listed: 45%, 10%, 25%, 50%, 5%, 90%

34 Speedup using PBPS with the TCP and UDP Protocols over Ethernet. The speedups are normalized to the case when using the TCP protocol alone

35 Speedup using the PBPA Technique with Ethernet and Fibre Channel using the TCP Protocol. The speedups are normalized to the case when using Ethernet alone

36 Summary and Conclusion
- Communication overhead can be reduced by exploiting heterogeneity in both communication paths and applications
- Reduce communication overhead
  - The PBPD technique can achieve performance improvement based on message type