Sockets Direct Protocol (SDP) for Windows - Motivation and Plans
Gilad Shainer, Mellanox Technologies Inc.
2006 Sonoma Workshop, February 2006

Presentation transcript:

February Sonoma Workshop – Sockets Direct Protocol

Agenda
SDP Protocol Overview
SDP vs. WSD
SDP Stack Components
Current Status

Key Message
With SDP, any traditional TCP sockets-based application can benefit from InfiniBand without any change

Motivation
SDP sockets provide much higher bandwidth and lower latency than traditional TCP sockets
SDP sockets provide lower CPU utilization and can benefit from RDMA
WSD sockets are not supported in all Windows operating systems
–XP, XP Embedded and Vista do not have WSD support

Sockets Direct Protocol (SDP) Overview
Transparent to the application
Maintains SOCK_STREAM semantics
Leverages InfiniBand capabilities
–Transport offload: Reliable Connection
–Zero copy: using RDMA
Standardized wire protocol

SDP Stack Overview
[Figure: stack diagram split into user and kernel space. Box labels: Unmodified Application, Winsock API, Socket Switch, SDP WinSock Provider, WinSock Provider, WSD SAN Provider, SDP, TCP/IP, IPoIB NDIS Miniport, IB Access Layer. Other labels: Windows platform, OpenIB, Application.]

Data Transfer Modes
[Figure: two diagrams, each showing a data source and a data sink.
BCopy: data moves from the source application buffer (App Buf) into SDP buffers (SDP Buf), travels in an SDP Data Message, and is copied from the sink's SDP buffers into its application buffer.
Read ZCopy: data moves between the application buffers directly; the labeled steps are SrcAvail Message, RDMA Read, Read Response, and RdmaRdCompl Msg.]
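The Read ZCopy labels on this slide can be read as a message sequence: the source advertises its buffer (SrcAvail), the sink pulls the data with an RDMA Read, and the sink then reports completion (RdmaRdCompl) so the source can reuse the buffer. The following is a toy in-process sketch of that ordering only. The class and function names are hypothetical, no InfiniBand hardware or real SDP message layout is involved, and the "RDMA Read" is simulated with a memoryview.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Data source: owns the application buffer and advertises it."""
    app_buf: bytes
    buffer_released: bool = False

    def src_avail(self) -> dict:
        # SrcAvail: announce how many bytes are available to read.
        # (Hypothetical message shape; a real one carries more fields.)
        return {"type": "SrcAvail", "length": len(self.app_buf)}

    def rdma_read(self, offset: int, length: int) -> memoryview:
        # Stand-in for serving an RDMA Read: expose the bytes without
        # staging them in an intermediate SDP buffer.
        return memoryview(self.app_buf)[offset:offset + length]

    def on_message(self, msg: dict) -> None:
        # RdmaRdCompl: the sink is done, so the buffer can be reused.
        if msg["type"] == "RdmaRdCompl":
            self.buffer_released = True


@dataclass
class Sink:
    """Data sink: reads straight into its own application buffer."""
    app_buf: bytearray = field(default_factory=bytearray)

    def on_src_avail(self, msg: dict, source: Source) -> dict:
        self.app_buf = bytearray(source.rdma_read(0, msg["length"]))
        return {"type": "RdmaRdCompl", "length": msg["length"]}


def read_zcopy(payload: bytes):
    """Run one transfer and return (trace, received bytes, released flag)."""
    source, sink = Source(payload), Sink()
    trace = []
    adv = source.src_avail()
    trace.append(adv["type"])
    trace.append("RDMA Read")        # issued by the sink
    compl = sink.on_src_avail(adv, source)
    trace.append("Read Response")    # data returned to the sink
    trace.append(compl["type"])
    source.on_message(compl)
    return trace, bytes(sink.app_buf), source.buffer_released
```

The point of the sketch is that no SDP staging buffer appears on either side, which is the difference from BCopy.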

SDP vs. WSD
Attribute | Sockets Direct Protocol (SDP) | Windows Sockets Direct (WSD)
API | Winsock 2.x, POSIX/BSD-like API | Winsock 2.x, POSIX/BSD-like API
WHQL | None | SAN / Winsock Direct
Wire Protocol Specification | IB spec 1.2 | Microsoft proprietary
OS Support | Win XP, WS 2K3 | WS 2K3
Interoperability | Windows, Linux and any OS that conforms to the IB specification | Windows Server only
Code | Open-source | Protocol: Windows proprietary; SAN Provider: open-source
IHV Module | Socket Provider Library, SDP kernel module | SAN Provider Library
Implementation Domain | Mostly kernel (similar to Linux) | Mostly user

ULP Comparison
Attribute | IB Verbs | SDP/WSD | IPoIB
API | Low level | Socket (TCP only) | Socket
Latency | lowest | lower | low
Bandwidth | highest | high | medium
CPU Utilization | lowest (if not polling) | BCopy high, ZCopy low | highest
Kernel bypass | [?] | SDP [?], WSD [?] | [?]
Stack Overhead | Light | Medium | High
Memory Registration | Explicitly by application/middleware | Heuristics by SDP | None
Application Adaptation | Porting/development required | Supports unmodified application | Supports unmodified application

SDP Socket Provider
User-mode library
Implements the Winsock Service Provider Interface (SPI)
–Supports TCP protocol
–WSPxxx function for each socket call
Socket switch implemented in the library
–Policy-based selection of SDP vs. TCP
–SDP calls are redirected to the SDP module
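The slide does not say what the selection policy looks like. As an illustration only, a socket switch could match each connection's destination against an ordered rule list and fall back to TCP when nothing matches. The rule format, the `Rule` class and `choose_provider` below are assumptions made for this sketch, not the actual provider's configuration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


@dataclass(frozen=True)
class Rule:
    """One hypothetical policy entry: destinations that use a provider."""
    network: Network
    port: Optional[int]   # None matches any port
    provider: str         # "SDP" or "TCP"


def choose_provider(rules: Sequence[Rule], dest_ip: str, dest_port: int,
                    default: str = "TCP") -> str:
    """Return the provider of the first matching rule, else the default."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if addr in rule.network and rule.port in (None, dest_port):
            return rule.provider
    return default


# Example policy: one subnet goes over SDP, except port 80 which stays on TCP.
POLICY = [
    Rule(ipaddress.ip_network("192.168.10.0/24"), 80, "TCP"),
    Rule(ipaddress.ip_network("192.168.10.0/24"), None, "SDP"),
]
```

First-match ordering keeps exceptions simple to express: the narrower port rule is listed before the subnet-wide rule.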

SDP Module
Kernel module
–Implemented as a high-level driver
Connection establishment
–Routing
–ARP through IPoIB
–Path Record
–CM
Data transfer mechanism
–Buffer Copy for first release
–Using physical memory region for local SDP buffers

Kernel Implementation
Each user socket is implemented as a struct in the kernel, composed of 3 parts:
1. For sending data
2. For receiving data
3. For accepting new connections (if this is a listening socket)
Currently all the parts are protected by one lock
–We will consider changing that in the future
In the kernel, all operations are implemented as asynchronous
–If an operation is blocking, we block at the user level
The heaviest operation is the data copy
–We might consider moving it out of the lock
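A rough user-space analogy of this design, under stated assumptions: a socket object with send, receive and accept parts guarded by a single lock, a receive operation that never blocks and returns a request object, and a blocking wrapper that simply waits on that request. All names here (`SdpSocket`, `async_recv`, `deliver`, `blocking_recv`) are hypothetical, and Python threads stand in for the kernel/user split.

```python
import threading
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    """Completion handle for one asynchronous operation."""
    done: threading.Event = field(default_factory=threading.Event)
    result: Optional[bytes] = None


@dataclass
class SdpSocket:
    send_part: deque = field(default_factory=deque)     # 1. outgoing data
    recv_data: deque = field(default_factory=deque)     # 2. received data
    recv_waiters: deque = field(default_factory=deque)  # 2. pending receives
    accept_part: Optional[deque] = None                 # 3. listening only
    lock: threading.Lock = field(default_factory=threading.Lock)  # one lock


def async_recv(sock: SdpSocket) -> Request:
    """'Kernel' side: never blocks; completes now or when data arrives."""
    req = Request()
    with sock.lock:
        if sock.recv_data:
            req.result = sock.recv_data.popleft()
            req.done.set()
        else:
            sock.recv_waiters.append(req)
    return req


def deliver(sock: SdpSocket, data: bytes) -> None:
    """Called when data arrives: complete a waiter or queue the data."""
    with sock.lock:
        if sock.recv_waiters:
            req = sock.recv_waiters.popleft()
            req.result = data
            req.done.set()
        else:
            sock.recv_data.append(data)


def blocking_recv(sock: SdpSocket, timeout: Optional[float] = None) -> bytes:
    """'User' side: the only place that blocks."""
    req = async_recv(sock)
    if not req.done.wait(timeout):
        raise TimeoutError("no data arrived in time")
    return req.result
```

Because the wait happens outside the lock, a blocked receiver does not stall senders or other receivers on the same socket.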

Buffer Copy Implementation
16 KB buffers for send and receive
The data is first copied into the buffers, and only then sent
For each process, a second thread is created for user-context operations
If a data copy would otherwise have to be done in interrupt context
–This thread is woken up to perform it, for both send and receive operations
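The copy-then-send step can be sketched as follows. This is a minimal illustration that assumes one reusable 16 KB staging buffer and a caller-supplied `transmit` callback; both `bcopy_send` and `bcopy_recv` are hypothetical names, and the helper thread and interrupt-context handling from the slide are not modeled.

```python
BUF_SIZE = 16 * 1024  # 16 KB, the buffer size given on the slide


def bcopy_send(data: bytes, transmit) -> int:
    """Copy data into a staging buffer chunk by chunk, then send each chunk.

    Returns the number of messages handed to transmit().
    """
    staging = bytearray(BUF_SIZE)
    view = memoryview(data)
    messages = 0
    for offset in range(0, len(data), BUF_SIZE):
        chunk = view[offset:offset + BUF_SIZE]
        n = len(chunk)
        staging[:n] = chunk            # first copy into the buffer...
        transmit(bytes(staging[:n]))   # ...and only then send it
        messages += 1
    return messages


def bcopy_recv(messages) -> bytes:
    """Receive side: copy each buffer's contents into the application data."""
    out = bytearray()
    for message in messages:
        out += message
    return bytes(out)
```

For example, a 51,200-byte payload is sent as three full 16 KB messages followed by one 2,048-byte message.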

Current Status
Maintainer
–Tzachi Dar (Mellanox)
Buffer Copy released Oct/05
–Supported calls: Socket, Connect, Bind, Listen, Accept, Close, Send/Recv
–Only kernel code posted to openib.org repository
Zero Copy and API enhancements: schedule TBD
–Will support overlapped (asynchronous) operation
–Will support Zero Copy

Current Performance
2 sockets (2 processes) bandwidth benchmark
[Figure: chart of Bandwidth (MB) against Message size (B); data points not preserved.]

Resources
OpenIB Wiki –
Openib-windows mailing list –
Sign up to contribute –

Q & A