Infiniband Architecture

Infiniband Architecture Aniruddha Bohra 1/29/2002 CS 545 - Distributed Systems

Distributed Applications and Data Transfer
- Traditional distributed applications
  - Need low latency message delivery
  - Data volume in transfers between nodes is not too high
- Server applications
  - Need low latency and high bandwidth data transfers
  - Data volumes in transfers are high, e.g. in cluster based storage or streaming multimedia servers
- Need reliable and available services
- Need easy maintenance

Traditional message send
- Path shown on the slide: application memory buffers, system call into the kernel, TCP sendmsg (copy from user space), IP and lower layers (backup buffers), then to the NIC
- One kernel boundary crossing
- Two memory copies

Lessons from parallel computing
- Co-processors that can access memory directly are used for communication
  - FLASH, J-Machine, Alewife
- User level networking and virtual memory mapped communication
  - U-Net, VMMC, VIA

Interconnect bottleneck
- Servers require a high data transfer rate
  - CPUs operate at GHz speeds
  - Gigabit Ethernet is commonly used in cluster based servers
  - Data volumes are high
- The PCI bus is much slower
  - Operates at 32 bit/33 MHz or 64 bit/66 MHz
  - The next generation bus, PCI-X, operates at 133 MHz
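As a rough check on the bus figures quoted above, theoretical peak bandwidth is simply bus width times clock rate. The sketch below assumes one transfer per clock cycle and, for PCI-X, a 64-bit width (the slide gives only the clock rate); sustained throughput on a shared bus is lower in practice.

```python
def peak_bandwidth_gbps(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in Gb/s, assuming one transfer per clock cycle."""
    return width_bits * clock_mhz * 1e6 / 1e9


# Bus parameters from the slide; the PCI-X width is an assumption.
buses = {
    "PCI 32 bit/33 MHz": (32, 33),
    "PCI 64 bit/66 MHz": (64, 66),
    "PCI-X 64 bit/133 MHz (width assumed)": (64, 133),
}

for name, (width, clock) in buses.items():
    print(f"{name}: {peak_bandwidth_gbps(width, clock):.2f} Gb/s")
```

Under these assumptions a 32 bit/33 MHz PCI bus peaks at roughly 1 Gb/s, so a single Gigabit Ethernet adapter can already consume most of a bus that is shared with every other peripheral.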

Some solutions
- HyperTransport
  - Runs at 800 MHz full duplex
  - Bridges with current buses and other HyperTransport buses
- 3GIO
  - Switch based
  - Provides a layered implementation
  - Promises more than 40 Gb/s transfer rate

More problems with bus based interconnects
- Cannot keep up with increasing CPU and peripheral speeds
- The bus is shared between all peripherals
- The pin count is high, and PCB space is limited
- Buses cannot extend over long distances
- Do not support a large number of devices

Outline
- Motivation and background
- Infiniband architecture
- Infiniband components
- Infiniband operation
- Other Infiniband features
- Status
- Summary

Infiniband Architecture
- Provides a switch based interconnect
  - Increased reliability
  - Scalable and easily maintainable
- Supports memory to memory communication
  - Low latency communication
- Provides support for "out of box" components
  - Scalable
  - Easier to manage and operate
- Is complementary to the 3GIO and HyperTransport buses

What is Infiniband?
- Infiniband Architecture (IBA) defines a System Area Network (SAN)
- An IBA SAN is a communications and management infrastructure for I/O and IPC
- IBA defines a switched communications fabric
  - High bandwidth and low latency
  - Protected, remotely managed environment
- IBA hardware off-loads much of the I/O communications operation from the CPU

An IBA SAN
[Figure not included in the transcript]

Topologies and components
- IBA serves as an interconnect for endnodes
- A node can be a processor node, an I/O unit and/or a router to another network
[Figure: nodes attached to the Infiniband fabric]

Topologies and Components
- An IBA network is subdivided into subnets interconnected by routers
- Endnodes can attach to a single subnet or to multiple subnets
- An IBA subnet is composed of endnodes, switches, routers and subnet managers
- Each IBT device may attach to a single switch or to multiple switches, and devices may also attach directly to each other

IBT device: processor node
[Figure labels: Consumer (x3), Verbs, Message and Data Service, Channel Adapter (endnode) with Port (x2)]

Processor node
- Each channel adapter constitutes a node on the fabric
- The architecture supports multiple channel adapters per unit, with each adapter providing one or more ports to the fabric
- The Message and Data service is an OS component
- Verbs describe the functions to configure, manage and operate a host channel adapter
- Verbs are not an API; they provide the framework from which the OS specifies one

Channel Adapter
- An IBA channel adapter (CA) is a programmable DMA engine with special protection features that allow DMA operations to be initiated locally and remotely
- A Host Channel Adapter (HCA) provides a consumer interface offering the functions specified by IBA verbs
- A Target Channel Adapter (TCA) provides an interface to the device

Channel Adapter
[Figure not included in the transcript]

Addressing in IBA
- Each endnode has one or more CAs, and each CA has one or more ports
- Each Queue Pair (QP) has a QP number (QPN) assigned by the CA
- Each port has a Local ID (LID), unique within its subnet, and at least one Global ID (GID), which has the format of an IPv6 address
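Because a GID has the format of an IPv6 address, it can be handled with ordinary IPv6 tooling. The sketch below assumes the usual layout of a 64-bit subnet prefix followed by a 64-bit port GUID; the function name and the example prefix and GUID values are hypothetical and chosen only for illustration.

```python
import ipaddress


def make_gid(subnet_prefix: int, port_guid: int) -> ipaddress.IPv6Address:
    """Combine a 64-bit subnet prefix and a 64-bit port GUID into a 128-bit GID."""
    if not (0 <= subnet_prefix < 2**64 and 0 <= port_guid < 2**64):
        raise ValueError("prefix and GUID must each fit in 64 bits")
    return ipaddress.IPv6Address((subnet_prefix << 64) | port_guid)


# Hypothetical values, for illustration only.
EXAMPLE_PREFIX = 0xFE80000000000000  # link-local style prefix
EXAMPLE_GUID = 0x0002C90300001234

gid = make_gid(EXAMPLE_PREFIX, EXAMPLE_GUID)
print(gid)  # fe80::2:c903:0:1234
```

Within a subnet the short LID is enough to reach a port; the longer GID matters once packets cross routers between subnets.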

Switches
- Do not generate or consume packets; they pass them along based on the destination address
- Are the routing components for intra-subnet routing, and support unicast or multicast
- Every destination is configured with one or more unique Local IDs (LIDs)
- The subnet manager configures switches, including loading their forwarding tables
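The forwarding behaviour described above can be pictured as a table lookup keyed by destination LID. This is a toy model rather than the IBA forwarding table format: the class, its method names and the LID-to-port values are all hypothetical, and only unicast is modelled.

```python
class Switch:
    """Toy switch: forwards by destination LID and never originates packets."""

    def __init__(self, name):
        self.name = name
        self.forwarding_table = {}  # destination LID -> output port number

    def load_forwarding_table(self, table):
        # Called on behalf of the subnet manager; the switch does not
        # compute routes itself.
        self.forwarding_table = dict(table)

    def forward(self, dest_lid):
        """Return the output port for a unicast packet, or None to drop it."""
        return self.forwarding_table.get(dest_lid)


sw = Switch("sw0")
sw.load_forwarding_table({0x0001: 1, 0x0002: 2, 0x0003: 2})
print(sw.forward(0x0002))  # 2
print(sw.forward(0x00FF))  # None: no entry for this LID
```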

Routers
- Routers are inter-subnet routing elements
- Routers forward packets based on the packet's global route header
- Routers expose one or more ports between which packets are relayed
- IPv6 specifies the protocol performed between routers to derive their routing tables

Subnet Managers
- A Subnet Manager (SM) is an entity attached to a subnet that is responsible for its management
- Tasks
  - Discover the topology
  - Configure each CA port with a range of LIDs, GIDs, the subnet prefix and Partition_Keys
  - Maintain LID/GID resolution tables
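The first two tasks can be sketched as a graph sweep followed by address assignment. This is a simplified illustration, not the IBA discovery protocol: it assumes the topology is already available as an adjacency map, hands out one LID per node rather than a range per port, and uses hypothetical node names.

```python
from collections import deque


def discover_and_assign_lids(links, start, first_lid=1):
    """Breadth-first sweep from the SM's own node; returns {node: LID}."""
    lids = {}
    seen = {start}
    queue = deque([start])
    next_lid = first_lid
    while queue:
        node = queue.popleft()
        lids[node] = next_lid  # assign LIDs in discovery order
        next_lid += 1
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return lids


# Hypothetical subnet: two HCAs and one TCA behind a single switch.
links = {
    "hca0": ["sw0"],
    "sw0": ["hca0", "hca1", "tca0"],
    "hca1": ["sw0"],
    "tca0": ["sw0"],
}
lids = discover_and_assign_lids(links, "hca0")
print(lids)
```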

Communication Queuing
- The consumer queues up a set of instructions for the hardware to execute (a work queue)
- Work queues are created in pairs (Queue Pairs, QPs) for send and receive operations
- Each work queue has a corresponding completion queue
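The queueing model can be illustrated with a small in-memory simulation. It is a sketch under simplifying assumptions: both work queues of a QP share one completion queue, a plain function stands in for the channel adapter hardware, and all class and function names are hypothetical rather than taken from the IBA verbs.

```python
from collections import deque


class QueuePair:
    """Toy QP: a send queue and a receive queue sharing one completion queue."""

    def __init__(self, qpn, completion_queue):
        self.qpn = qpn
        self.send_queue = deque()
        self.recv_queue = deque()
        self.cq = completion_queue

    def post_send(self, wr_id, payload):
        self.send_queue.append((wr_id, payload))

    def post_recv(self, wr_id, buffer):
        self.recv_queue.append((wr_id, buffer))


def deliver(src, dst):
    """Stand-in for the hardware: move one message from src to dst."""
    send_id, payload = src.send_queue.popleft()
    recv_id, buffer = dst.recv_queue.popleft()
    if len(payload) > len(buffer):
        raise ValueError("receive buffer too small")
    buffer[:len(payload)] = payload
    # Both sides learn about the finished work through their completion queues.
    src.cq.append((src.qpn, send_id, "SEND done"))
    dst.cq.append((dst.qpn, recv_id, "RECV done"))


cq_a, cq_b = deque(), deque()
a, b = QueuePair(1, cq_a), QueuePair(2, cq_b)
buf = bytearray(16)
b.post_recv(wr_id=100, buffer=buf)  # the receiver posts a buffer first
a.post_send(wr_id=7, payload=b"hello")
deliver(a, b)
print(list(cq_a), list(cq_b), bytes(buf[:5]))
```

The point of the design is that the consumer only posts work and later polls for completions; the data movement itself happens without a system call per message.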

Work Queue Operations
- Send operations
  - SEND: a block in memory space to send to the destination
  - RDMA: RDMA_READ, RDMA_WRITE, ATOMIC
  - Memory binding: alters the memory binding relationship and gives the R_KEY to components, which allows secure DMA
- Receive operation
  - Specifies a receive data buffer
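How an R_KEY makes remote DMA safe can be shown with a minimal model: the remote side may touch registered memory only if it presents the matching key and stays within bounds. The class and function names and the key values below are hypothetical, and the real checks are performed by the channel adapter, not by software like this.

```python
class MemoryRegion:
    """Registered memory; the R_KEY is the token a remote peer must present."""

    def __init__(self, size, r_key):
        self.data = bytearray(size)
        self.r_key = r_key


def _check(region, r_key, offset, length):
    if r_key != region.r_key:
        raise PermissionError("R_KEY mismatch")
    if offset < 0 or length < 0 or offset + length > len(region.data):
        raise ValueError("access outside the registered region")


def rdma_write(region, r_key, offset, payload):
    """Place payload into remote memory without involving the remote CPU."""
    _check(region, r_key, offset, len(payload))
    region.data[offset:offset + len(payload)] = payload


def rdma_read(region, r_key, offset, length):
    """Fetch bytes from remote memory."""
    _check(region, r_key, offset, length)
    return bytes(region.data[offset:offset + length])


region = MemoryRegion(size=32, r_key=0x1234)
rdma_write(region, 0x1234, 4, b"data")
print(rdma_read(region, 0x1234, 4, 4))  # b'data'
```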

Work Queue Operations
[Figure not included in the transcript]

Communication Stack
[Figure not included in the transcript]

Keys
- Keys are used to provide isolation and protection
- M_KEY: enforces the control of a master Subnet Manager
- B_KEY: enforces the control of a subnet baseboard manager
- P_KEY: enforces membership in a partition
- Q_KEY: enforces access rights for reliable or unreliable datagram service
- L_KEY and R_KEY: provide local and remote access rights to registered memory

Virtual Lanes
- A virtual lane (VL) represents a set of transmit and receive buffers in a port
- VL15 is used for subnet management
- Each port must have at least one data VL
- Separate flow control is maintained over each VL
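Per-lane flow control means that one stalled lane does not block the others. The sketch below assumes a credit scheme in which each credit stands for one free receive buffer at the peer; the class names and credit counts are hypothetical, and VL15 is modelled like a data lane purely to keep the example short.

```python
class VirtualLane:
    """One lane with its own credit counter."""

    def __init__(self, credits):
        self.credits = credits  # free receive buffers advertised by the peer
        self.sent = []

    def try_send(self, packet):
        if self.credits == 0:
            return False  # this lane stalls; other lanes are unaffected
        self.credits -= 1
        self.sent.append(packet)
        return True

    def return_credit(self, count=1):
        self.credits += count  # the peer freed buffers


class Port:
    MANAGEMENT_VL = 15

    def __init__(self, data_vls=1, credits_per_vl=2):
        if data_vls < 1:
            raise ValueError("a port needs at least one data VL")
        self.vls = {vl: VirtualLane(credits_per_vl) for vl in range(data_vls)}
        self.vls[self.MANAGEMENT_VL] = VirtualLane(credits_per_vl)


port = Port(data_vls=2, credits_per_vl=1)
first = port.vls[0].try_send("a")   # uses the only credit on VL0
second = port.vls[0].try_send("b")  # VL0 is out of credits
other = port.vls[1].try_send("c")   # VL1 still has its own credit
print(first, second, other)  # True False True
```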

Service Levels
- Service levels (SLs) are maintained by attaching a VL to an SL
- IBA does not specify any QoS levels (e.g. best effort)
- The Subnet Management Agent (SMA) must keep a mapping of Service Level to Virtual Lane and propagate it through the switch
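The SL to VL mapping is essentially a small lookup table per port. The sketch below assumes 16 service levels and uses a simple wrap-around policy when a port has fewer data VLs than SLs; the policy and the function name are hypothetical, since IBA leaves the choice of mapping to the management software.

```python
def build_sl_to_vl(num_data_vls, num_sls=16):
    """Map each service level to a data VL, wrapping when VLs are scarce."""
    if num_data_vls < 1:
        raise ValueError("a port needs at least one data VL")
    return {sl: sl % num_data_vls for sl in range(num_sls)}


table = build_sl_to_vl(num_data_vls=4)
print(table[5])  # 1: SL 5 shares VL 1 with SLs 1, 9 and 13
```

A packet keeps its SL end to end, while the VL it travels on may differ at each hop depending on the table held at that port.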

Status
- Intel Developer Forum had several status talks
  - http://www.intel.com/idf/us
- IBA enabled network storage has been demonstrated at industry shows
  - Banderacom
  - Windriver
- The first products are expected to be on the market by the middle of 2002

Summary
- Future bandwidth requirements for servers will make the interconnect a bottleneck; IBA is an attempt to alleviate the problem
- IBA provides a thorough migration from a bus based to a switch based architecture while maintaining interoperability
- Further deployment is needed to uncover other issues that arise in operation