SCTP versus TCP for MPI Brad Penoff, Humaira Kamal, Alan Wagner Department of Computer Science University of British Columbia.


Outline Self Introduction Research background Research presentation SCTP & MPI background MPI over SCTP design Design features Results Conclusions

Who am I? Born and raised in Columbus area OSU alumni Europa alumni Worked a few years Grad student finishing my MSc at UBC

UBC

Who do I work with? Alan Wagner (Prof, UBC) Humaira Kamal (PhD, UBC) Mike Yao Chen Tsai (MSc, UBC) Edith Vong (BSc, UBC) Randall Stewart (Cisco)

What field do we work in? Parallel computing: concurrently utilize multiple resources (1 cook vs 8 cooks)

What field do we work in? Message passing programming model Message Passing Interface (MPI) Standardized API for applications

What field do we work in? Middleware for MPI Glues necessary components together for parallel environment

What field do we work in? Parallel library component: implements the MPI API for various interconnects (shared memory, Myrinet, InfiniBand, and specialized hardware such as BlueGene/L and ASCI Red)

What field do we work in? TCP/IP protocol stack interconnect Stream Control Transmission Protocol

SCTP versus TCP for MPI Brad Penoff, Humaira Kamal, Alan Wagner Department of Computer Science University of British Columbia Supercomputing 2005, Seattle, Washington USA

What are MPI and SCTP?
Message Passing Interface (MPI): library that is widely used to parallelize scientific and compute-intensive programs.
Stream Control Transmission Protocol (SCTP): general-purpose unicast transport protocol for IP network data communications. Recently standardized by the IETF. Can be used anywhere TCP is used.
Question: Can we take advantage of SCTP features to better support parallel applications using MPI?

Communicating MPI Processes TCP is often used as transport protocol for MPI SCTP

SCTP Key Features Reliable in-order delivery, flow control, full duplex transfer. Selective ACK is built into the protocol. TCP-like congestion control.

SCTP Key Features Message oriented Use of associations Multihoming Multiple streams within an association

Associations and Multihoming Primary address Heartbeats Retransmissions Failover User adjustable controls CMT

Logical View of Multiple Streams in an Association

Partially Ordered User Messages Sent on Different Streams

Can be received in the same order as it was sent (required in TCP).
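
The difference between total ordering and per-stream ordering can be illustrated with a small simulation. This is only a sketch, not the SCTP stack's actual logic: the Msg tuple, the function names, and the arrival pattern are all hypothetical. Message A0 is sent first on stream 0 but arrives last, as if it had been lost and retransmitted.

```python
from collections import namedtuple

# Hypothetical message record: global send order, stream id,
# per-stream sequence number, and payload.
Msg = namedtuple("Msg", "seq stream ssn payload")

def deliver_in_total_order(arrivals):
    """TCP-like delivery: hand data up only in global send order."""
    next_seq, pending, log = 0, {}, []
    for step, msg in enumerate(arrivals):
        pending[msg.seq] = msg
        while next_seq in pending:
            log.append((step, pending.pop(next_seq).payload))
            next_seq += 1
    return log

def deliver_per_stream(arrivals):
    """SCTP-like delivery: order is enforced only within each stream."""
    next_ssn, pending, log = {}, {}, []
    for step, msg in enumerate(arrivals):
        pending[(msg.stream, msg.ssn)] = msg
        expected = next_ssn.get(msg.stream, 0)
        while (msg.stream, expected) in pending:
            log.append((step, pending.pop((msg.stream, expected)).payload))
            expected += 1
        next_ssn[msg.stream] = expected
    return log

# A0 was sent first but arrives last.
arrivals = [Msg(1, 1, 0, "B0"), Msg(2, 1, 1, "B1"), Msg(0, 0, 0, "A0")]
total_order_log = deliver_in_total_order(arrivals)
per_stream_log = deliver_per_stream(arrivals)
```

Each log entry pairs the arrival step at which a message became deliverable with its payload. With total ordering nothing is delivered until A0 arrives; with per-stream ordering B0 and B1 are delivered as soon as they arrive.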

MPI API Implementation
Message matching is done based on Tag, Rank and Context (TRC).
Combinations such as blocking, non-blocking, synchronous, asynchronous, buffered, unbuffered.
Use of wildcards for receive.
MPI_Send(msg, count, type, dest-rank, tag, context)
MPI_Recv(msg, count, type, source-rank, tag, context)
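
Matching on tag, rank and context with receive wildcards can be sketched as follows. This is an illustrative model, not LAM's code: ANY_SOURCE and ANY_TAG are stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG (the real constants are implementation defined), and the queue layout is an assumption.

```python
# Stand-ins for MPI_ANY_SOURCE / MPI_ANY_TAG; real values are
# implementation defined.
ANY_SOURCE = -1
ANY_TAG = -1

def matches(recv, envelope):
    """Both arguments are (tag, rank, context) triples.

    The context must match exactly; tag and rank may be wildcards
    on the receive side only.
    """
    r_tag, r_rank, r_ctx = recv
    tag, rank, ctx = envelope
    return (r_ctx == ctx
            and r_rank in (ANY_SOURCE, rank)
            and r_tag in (ANY_TAG, tag))

def match_first(recv, unexpected):
    """Remove and return the earliest queued (envelope, payload) that matches."""
    for i, (envelope, _payload) in enumerate(unexpected):
        if matches(recv, envelope):
            return unexpected.pop(i)
    return None

# Hypothetical queue of messages that arrived before a receive was posted.
queue = [((7, 1, 0), "a"), ((5, 2, 0), "b"), ((7, 1, 0), "c")]
first = match_first((7, ANY_SOURCE, 0), queue)
second = match_first((ANY_TAG, 2, 0), queue)
wrong_context = match_first((7, 1, 1), queue)
third = match_first((7, 1, 0), queue)
```

Scanning from the front of the queue means two messages with the same TRC are matched in arrival order.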

MPI Messages Using Same Context, Two Processes

Out of order messages with same tags violate MPI semantics

MPI API Implementation Request Progression Layer Short Messages vs. Long Messages

MPI over SCTP: Design and Implementation
LAM (Local Area Multicomputer) is an open-source implementation of the MPI library, with origins at the Ohio Supercomputer Center.
We redesigned the LAM TCP RPI module to use SCTP.
The RPI module is responsible for maintaining state information for all requests.

MPI over SCTP: Design and Implementation
Challenges:
Lack of documentation: code examination; our document is linked from the LAM/MPI website.
Extensive instrumentation: diagnostic traces; identification of problems in the SCTP protocol.

Using SCTP for MPI Striking similarities between SCTP and MPI

Implementation Issues
Maintaining State Information: maintain state appropriately for each request function to work with the one-to-many style.
Message Demultiplexing: extend RPI initialization to map associations to rank; demultiplex each incoming message to direct it to the proper receive function.
Concurrency and SCTP Streams: consistently map MPI tag-rank-context to SCTP streams, maintaining proper MPI semantics.
Resource Management: make the RPI more message-driven; eliminate the use of the select() system call, making the implementation more scalable; eliminate the need to maintain a large number of socket descriptors.
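
One way to picture the demultiplexing and stream-mapping items above is the sketch below. It rests on assumptions that the slides do not spell out: rank is taken to be identified by the association, the remaining (tag, context) pair is folded onto a stream number with an arbitrary formula, and NUM_STREAMS, stream_for and Demux are hypothetical names.

```python
from collections import defaultdict, deque

NUM_STREAMS = 10  # hypothetical number of streams per association

def stream_for(tag, context, num_streams=NUM_STREAMS):
    """Deterministic mapping: the same (tag, context) always yields the
    same stream, so messages that must stay ordered share a stream.
    The formula itself is an arbitrary illustration."""
    return (context * 31 + tag) % num_streams

class Demux:
    """Routes messages from a single one-to-many socket to per-peer queues."""

    def __init__(self, assoc_to_rank):
        # Built once during initialization: association id -> MPI rank.
        self.assoc_to_rank = dict(assoc_to_rank)
        self.queues = defaultdict(deque)

    def on_message(self, assoc_id, stream, payload):
        rank = self.assoc_to_rank[assoc_id]
        self.queues[(rank, stream)].append(payload)
        return rank

demux = Demux({101: 0, 102: 1})  # hypothetical association ids
stream = stream_for(tag=7, context=0)
sender_rank = demux.on_message(102, stream, b"x")
```

Because the mapping is deterministic, two messages with the same tag and context from the same peer always travel on the same stream and therefore keep their order.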

Implementation Issues
Eliminating Race Conditions: find solutions for race conditions due to added concurrency; use a barrier after the association setup phase.
Reliability: modify the out-of-band daemons and the request progression interface (RPI) to use a common transport layer protocol so that all components of LAM can multihome successfully.
Support for Large Messages: devised a long-message protocol to handle messages larger than the socket send buffer.
Experiments with different SCTP stacks.
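
The slides do not describe the long-message protocol itself. The sketch below shows only the basic idea of splitting a message into pieces no larger than a send buffer and reassembling them in order; the function names and the chunk size are hypothetical, and a real protocol would also need headers and flow control.

```python
def split_message(data, max_chunk):
    """Split data into pieces of at most max_chunk bytes, in order."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    chunks = [data[i:i + max_chunk] for i in range(0, len(data), max_chunk)]
    return chunks or [b""]  # an empty message still yields one chunk

def reassemble(chunks):
    """Concatenate chunks that were received in order on one stream."""
    return b"".join(chunks)

message = bytes(range(256)) * 10              # 2560 bytes
chunks = split_message(message, max_chunk=1000)
restored = reassemble(chunks)
```

Sending all chunks of one message on the same stream lets the receiver rely on per-stream ordering for reassembly.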

Features of Design Scalability Head-of-Line Blocking

Scalability TCP

Scalability SCTP

Head-of-Line Blocking

Limitations
Comprehensive CRC32c checksum: offload to the NIC is not yet commonly available.
SCTP bundles messages together, so it might not always be able to pack a full MTU.
The SCTP stack is in its early stages and will improve over time.
Performance is stack dependent (Linux lksctp stack << FreeBSD KAME stack).
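
To give a sense of the per-byte work a software CRC32c involves, here is a minimal bitwise sketch. It uses the standard reflected Castagnoli polynomial; production stacks use table-driven or hardware-assisted versions, so this is an illustration rather than what any SCTP stack actually runs.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):  # eight shift/xor steps for every byte
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

check = crc32c(b"123456789")  # standard check input for CRC catalogues
```

The checksum covers the whole packet, so this cost grows with every byte sent unless the computation is offloaded.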

Experiments
Controlled environment: eight nodes, Dummynet.
Used standard benchmarks as well as real-world programs.
Fair comparison: buffer sizes, Nagle disabled, SACK on, no multihoming, CRC32c off.
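
For reference, disabling Nagle's algorithm on a TCP socket looks like the sketch below. It shows only the TCP side using Python's standard socket module; the helper name is hypothetical, and the experiments themselves were configured inside LAM, not with this code. SCTP sockets expose an analogous SCTP_NODELAY option.

```python
import socket

def make_tcp_socket_without_nagle():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_tcp_socket_without_nagle()
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

With Nagle disabled, small messages are sent immediately and are not held back to be coalesced, which matters for latency-sensitive ping-pong tests.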

Experiments: Benchmarks MPBench Ping Pong Test under No Loss

NAS Benchmarks
The NAS benchmarks approximate real-world parallel scientific applications.
We experimented with a suite of 7 benchmarks and 4 data set sizes.
SCTP performance was comparable to TCP for large datasets.

Latency Tolerant Programs Bulk Farm Processor program Real-world application Non-blocking communication Overlap computation with communication Use of multiple tags

Farm Program - Short Messages

Head-of-line blocking – Short messages

Conclusions
SCTP is better suited for MPI:
Avoids unnecessary head-of-line blocking due to use of streams.
Increased fault tolerance in the presence of multihomed hosts.
Built-in security features.
Robust under loss.
SCTP might be key to moving MPI programs from LANs to WANs.

Future Work Release LAM SCTP RPI module at SC|05 Incorporate our work into Open MPI and/or MPICH2 Modify real applications to use tags as streams

More information about our work is at: Thank you!

Extra Slides

Partially Ordered User Messages Sent on Different Streams

Added Security SCTP's use of a signed cookie. User data can be piggy-backed on the third and fourth legs.

Added Security 32 bit Verification Tag – reset attack Autoclose feature No half-closed state

Farm Program - Long Messages

Head-of-line blocking – Long messages

Experiments: Benchmarks SCTP outperformed TCP under loss for ping pong test.
