Internetworking: Hardware/Software Interface

Internetworking: Hardware/Software Interface CS 213, LECTURE 16 L.N. Bhuyan CS258 S99

Protocols: HW/SW Interface
- Internetworking allows computers on independent and incompatible networks to communicate reliably and efficiently.
- Enabling technologies: software standards that allow reliable communication without reliable networks.
- A hierarchy of software layers, each responsible for a portion of the overall communication task, is called a protocol family or protocol suite.
- Transmission Control Protocol/Internet Protocol (TCP/IP): this protocol family is the basis of the Internet. IP makes a best effort to deliver; TCP guarantees delivery.
- TCP/IP is used even when communicating locally: NFS uses IP even though it communicates across a homogeneous LAN.
- Workstation companies used TCP/IP even over a LAN because early Ethernet controllers were cheap but not reliable.
11/15/2018 CS258 S99

TCP/IP packet
- The application sends a message.
- TCP breaks it into segments of at most 64 KB and adds a 20 B header.
- IP adds a 20 B header and sends the result to the network.
- On Ethernet, it is broken into 1500 B packets with headers and trailers.
- Headers and trailers carry a length field, destination, window number, version, ...
Figure labels: Ethernet, IP Header, IP Data, TCP Header, TCP data (≤ 64 KB).
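The sizes on this slide can be turned into a rough overhead estimate. The sketch below is a simplified model, not a description of any real stack. It assumes every Ethernet packet carries one 20 B IP header and one 20 B TCP header inside the 1500 B limit. It also assumes 18 B of Ethernet framing (14 B header plus 4 B frame check sequence), a figure that is not on the slide. The function name `packetize` and its return keys are illustrative.

```python
import math

MTU = 1500          # Ethernet payload limit in bytes (from the slide)
IP_HEADER = 20      # bytes (from the slide)
TCP_HEADER = 20     # bytes (from the slide)
ETH_FRAMING = 18    # assumption: 14 B Ethernet header + 4 B FCS trailer


def packetize(message_bytes: int) -> dict:
    """Estimate packet count and header overhead for one message."""
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    # Data bytes that fit in one packet after both headers are added.
    payload_per_packet = MTU - IP_HEADER - TCP_HEADER
    packets = math.ceil(message_bytes / payload_per_packet)
    overhead = packets * (IP_HEADER + TCP_HEADER + ETH_FRAMING)
    return {
        "packets": packets,
        "overhead_bytes": overhead,
        "wire_bytes": message_bytes + overhead,
        "efficiency": message_bytes / (message_bytes + overhead),
    }


if __name__ == "__main__":
    # A 64 KB application message, the largest size named on the slide.
    print(packetize(64 * 1024))
```

Under these assumptions the per-packet overhead is fixed, so efficiency depends mainly on how full the last packet is.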

Communicating with the Server: The O/S Wall
Problems:
- O/S overhead to move a packet between the network and the application level => protocol stack (TCP/IP)
- O/S interrupts
- Data copying from kernel space to user space and vice versa
- Oh, the PCI bottleneck!
Figure labels: CPU, User, Kernel, NIC, PCI Bus.

The Send/Receive Operation
1. The application writes the transmit data to the TCP/IP sockets interface in payload sizes ranging from 4 KB to 64 KB.
2. The data is copied from user space to kernel space.
3. The OS segments the data into maximum transmission unit (MTU)-size packets and adds TCP/IP header information to each packet.
4. The OS copies the data onto the network interface card (NIC) send queue.
5. The NIC performs the direct memory access (DMA) transfer of each data packet from the TCP buffer space to the NIC and interrupts the CPU to indicate completion of the transfer.
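The send path described above can be sketched as a toy model that counts how many bytes the CPU copies and how many interrupts the NIC raises. This is only an illustration under stated assumptions: a 1500 B MTU, 40 B of TCP/IP headers per packet, and one completion interrupt per packet (real NICs often coalesce interrupts). The names `send`, `nic_drain`, and `nic_queue` are hypothetical and do not correspond to any real kernel interface.

```python
from collections import deque
from typing import Tuple

MTU = 1500           # assumed Ethernet MTU in bytes
HEADER_BYTES = 40    # assumed 20 B TCP + 20 B IP header per packet


def send(app_data: bytes, nic_queue: deque) -> int:
    """Model the copy-based send path; return bytes copied by the CPU."""
    copied = 0

    # Copy the application buffer from user space to kernel space.
    kernel_buf = bytes(bytearray(app_data))
    copied += len(kernel_buf)

    # Segment into MTU-size packets, each with room for its headers.
    payload = MTU - HEADER_BYTES
    for offset in range(0, len(kernel_buf), payload):
        chunk = kernel_buf[offset:offset + payload]
        packet = bytes(HEADER_BYTES) + chunk  # zero-filled placeholder header
        # Copy the packet onto the NIC send queue.
        nic_queue.append(packet)
        copied += len(chunk)

    return copied


def nic_drain(nic_queue: deque) -> Tuple[int, int]:
    """Model the NIC's DMA transfers; return (bytes_dma, interrupts)."""
    bytes_dma = 0
    interrupts = 0
    while nic_queue:
        packet = nic_queue.popleft()
        bytes_dma += len(packet)   # DMA transfer of one packet to the NIC
        interrupts += 1            # assumption: one interrupt per packet
    return bytes_dma, interrupts


if __name__ == "__main__":
    queue = deque()
    cpu_bytes = send(b"x" * 4096, queue)
    print(cpu_bytes, nic_drain(queue))
```

In this model every payload byte is copied twice by the CPU before the NIC touches it, which is the cost the acceleration techniques in the following slides try to remove.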

Transmitting data across the memory bus using a standard NIC (http://www.dell.com/downloads/global/power/1q04-her.pdf)

Timing Measurement in UDP Communication
X. Zhang, L. Bhuyan and W. Feng, “Anatomy of UDP and M-VIA for Cluster Communication,” JPDC, October 2005.

I/O Acceleration Techniques
- TCP offload: offload TCP/IP checksum and segmentation to interface hardware or a programmable device, e.g., TCP offload engines (TOEs). A TOE-enabled NIC using Remote Direct Memory Access (RDMA) can use zero-copy algorithms to place data directly into application buffers.
- O/S bypass: user-level software techniques that bypass the protocol stack, such as a zero-copy protocol. This needs a programmable device in the NIC for direct user-level memory access (virtual-to-physical memory mapping), e.g., VIA.
- Architectural techniques: instruction set optimization, multithreading, copy engines, onloading, prefetching, etc.
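Placing data directly into an application buffer can be illustrated at the API level, with an important caveat. The sketch below uses Python's standard `socket.recv_into` to fill a preallocated buffer instead of allocating a new object per read. It is only an analogy: the kernel still copies the data from its own buffers, and true zero-copy as described above requires NIC or RDMA support. The helper name `receive_into` is hypothetical, and a local socket pair is assumed to stand in for a network connection.

```python
import socket


def receive_into(sock: socket.socket, buffer: bytearray) -> int:
    """Fill `buffer` from `sock` without creating intermediate bytes objects."""
    view = memoryview(buffer)
    received = 0
    while received < len(buffer):
        # recv_into writes straight into the caller-owned buffer slice.
        n = sock.recv_into(view[received:])
        if n == 0:          # peer closed the connection early
            break
        received += n
    return received


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    payload = bytes(range(256)) * 16          # 4 KB of test data
    sender.sendall(payload)
    sender.close()
    app_buffer = bytearray(len(payload))      # preallocated application buffer
    count = receive_into(receiver, app_buffer)
    receiver.close()
    print(count, bytes(app_buffer) == payload)
```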

Comparing standard TCP/IP and TOE enabled TCP/IP stacks (http://www.dell.com/downloads/global/power/1q04-her.pdf)

Chelsio 10 Gbps TOE

Cluster (Network) of Workstations/PCs

Myrinet Interface Card

InfiniBand Interconnection
- Zero-copy mechanism. The zero-copy mechanism enables a user-level application to perform I/O on the InfiniBand fabric without being required to copy data between user space and kernel space.
- RDMA. RDMA facilitates transferring data from remote memory to local memory without the involvement of host CPUs.
- Reliable transport services. The InfiniBand architecture implements reliable transport services so the host CPU is not involved in protocol-processing tasks like segmentation, reassembly, NACK/ACK, etc.
- Virtual lanes. InfiniBand architecture provides 16 virtual lanes (VLs) to multiplex independent data lanes into the same physical lane, including a dedicated VL for management operations.
- High link speeds. InfiniBand architecture defines three link speeds, which are characterized as 1X, 4X, and 12X, yielding data rates of 2.5 Gbps, 10 Gbps, and 30 Gbps, respectively.
Reprinted from Dell Power Solutions, October 2004. By Onur Celebioglu, Ramesh Rajagopalan, and Rizwan Ali.
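The three link speeds are multiples of the 1X rate, which a short calculation makes explicit. The sketch below also derives a usable data rate under one assumption that is not stated in the source: that the link uses 8b/10b line encoding, so 8 of every 10 signaled bits carry data. The function name `link_rates` is illustrative.

```python
LANE_RATE_GBPS = 2.5   # 1X signaling rate, from the text above
DATA_BITS = 8          # assumption: 8b/10b line encoding
LINE_BITS = 10


def link_rates(width: int) -> tuple:
    """Return (signaling_gbps, usable_gbps) for a link of `width` lanes."""
    signaling = width * LANE_RATE_GBPS
    usable = signaling * DATA_BITS / LINE_BITS
    return signaling, usable


if __name__ == "__main__":
    for width in (1, 4, 12):
        signaling, usable = link_rates(width)
        print(f"{width}X: {signaling} Gbps signaling, {usable} Gbps usable")
```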

InfiniBand system fabric

UDP Communication – Life of a Packet
X. Zhang, L. Bhuyan and W. Feng, “Anatomy of UDP and M-VIA for Cluster Communication,” Journal of Parallel and Distributed Computing (JPDC), Special issue on Design and Performance of Networks for Super-, Cluster-, and Grid-Computing, Vol. 65, Issue 10, October 2005, pp. 1290-1298.

Network Bandwidth is Increasing
- TCP requirements rule of thumb: 1 GHz for 1 Gbps.
- Network bandwidth outpaces Moore’s Law.
- The gap between the rate of processing network applications and the fast-growing network bandwidth is increasing.
Figure: GHz and Gbps versus time (1990 to 2010), comparing network bandwidth with Moore’s Law.
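The rule of thumb translates directly into a CPU budget. The sketch below is a back-of-the-envelope calculation only. The function name `cores_needed` and the example link and core speeds are hypothetical, and the rule itself is an approximation rather than a measured constant.

```python
def cores_needed(link_gbps: float, core_ghz: float,
                 ghz_per_gbps: float = 1.0) -> float:
    """Cores' worth of cycles needed to run TCP at full link rate."""
    if core_ghz <= 0:
        raise ValueError("core_ghz must be positive")
    # Rule of thumb: each Gbps of TCP traffic costs about 1 GHz of CPU.
    return link_gbps * ghz_per_gbps / core_ghz


if __name__ == "__main__":
    # Hypothetical example: a 10 Gbps link served by 2.5 GHz cores.
    print(cores_needed(10, 2.5))
```

When the result exceeds the number of cores available, the link cannot be saturated by protocol processing alone, which is the gap the slide describes.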

Profile of a Packet
Figure labels: System Overheads, Descriptor & Header Accesses, IP Processing, Computes, TCB Accesses, TCP Processing, Memory, Memory Copy.
Total average clocks per packet: ~21K
Effective bandwidth: 0.6 Gb/s (1 KB receive)
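The two totals are linked by simple arithmetic: effective bandwidth is the packet size in bits times the number of packets a core can process per second. The slide does not state the clock frequency, so the sketch below solves for the frequency the numbers imply. It assumes that one core is fully occupied with packet processing and that 1 KB means 1024 bytes. Both function names are illustrative.

```python
def effective_bandwidth_gbps(clock_hz: float, clocks_per_packet: float,
                             packet_bytes: int) -> float:
    """Bandwidth sustained when every packet costs a fixed number of clocks."""
    packets_per_second = clock_hz / clocks_per_packet
    return packets_per_second * packet_bytes * 8 / 1e9


def implied_clock_ghz(bandwidth_gbps: float, clocks_per_packet: float,
                      packet_bytes: int) -> float:
    """Clock frequency needed to reach a given bandwidth at that cost."""
    packets_per_second = bandwidth_gbps * 1e9 / (packet_bytes * 8)
    return packets_per_second * clocks_per_packet / 1e9


if __name__ == "__main__":
    # Values from the slide: ~21K clocks per packet, 0.6 Gb/s, 1 KB packets.
    print(implied_clock_ghz(0.6, 21_000, 1024))
```

Under these assumptions the slide's figures are consistent with a processor clocked at roughly 1.5 GHz.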

Five Emerging Technologies
- Optimized Network Protocol Stack (ISSS+CODES, 2003)
- Cache Optimization (ISSS+CODES, 2003, ANCHOR, 2004)
- Network Stack Affinity Scheduling
- Direct Cache Access
- Lightweight Threading
- Memory Copy Engine (ICCD 2005 and IEEE TC)

Stack Optimizations (Instruction Count)
- Separate data and control paths
  - TCP data-path focused
  - Reduce number of conditionals
  - NIC assist logic (L3/L4 stateless logic)
- Basic memory optimizations
  - Cache-line aware data structures
  - SW prefetches
- Optimized computation
  - Standard compiler capability
3X reduction in instructions per packet

Network Stack Affinity
- Assigns network I/O workloads to designated devices.
- Separates network I/O from application work.
- Reduces scheduling overheads.
- More efficient cache utilization.
- Increases pipeline efficiency.
- Intel calls it onloading.
Figure labels: Chipset, Memory, CPU (Core, Core, Core, Core, …), I/O Interface, CPU dedicated for network I/O.

Direct Cache Access (DCA)
Normal DMA writes (CPU, cache, memory controller, memory, NIC):
- Step 1: DMA write
- Step 2: Snoop invalidate
- Step 3: Memory write
- Step 4: CPU read
Direct Cache Access (CPU, cache, memory controller, memory, NIC):
- Step 1: DMA write
- Step 2: Cache update
- Step 3: CPU read
Eliminate 3 to 25 memory accesses by placing packet data directly into cache.

Lightweight Threading
- Builds on helper threads; reduces CPU stalls.
- Continue computing in a single pipeline in the shadow of a cache miss.
Figure labels: Memory informing event (e.g., cache miss), Thread Manager, S/W controlled thread 1, S/W controlled thread 2, Execution pipeline, Single Hardware Context, Single Core Pipeline.

Potential Efficiencies (10X)
Figure panels: Benefits of Affinity; Benefits of Architectural Techniques.
Greg Regnier, et al., “TCP Onloading for Data Center Servers,” IEEE Computer, vol. 37, Nov 2004.
On CPU, multi-gigabit, line-speed network I/O is possible.

I/O Acceleration – Problem Magnitude
Figure labels: Security, Services, Storage over IP, Networking; Memory Copies & Effects of Streaming, CRCs, Crypto, Parsing, Tree Construction.
I/O processing rates are significantly limited by the CPU in the face of data movement and transformation operations.