SRP Update
Bart Van Assche

Overview
- Involvement with SRP
- SRP Protocol Overview
- Recent SRP Driver Changes
- Possible Future Directions
March 30 – April 2, 2014 #OFADevWorkshop

Involvement with SRP
- Maintainer of the open source Linux SRP initiator and the SCST SRP target drivers.
- Member of the Fusion-io ION team. ION is an all-flash H.A. shared storage appliance.
- Flash memory provides low latency and high bandwidth. The focus of RDMA is on low latency and high bandwidth. In other words, RDMA is well suited for remote access to flash memory.
March 30 – April 2, 2014 #OFADevWorkshop

SRP Protocol Overview
- SRP = SCSI RDMA Protocol.
- Defines how to perform SCSI communication over an RDMA network.
- Defines how to discover InfiniBand SRP targets, how to log in, how to transfer SCSI CDBs and how to transfer data via RDMA.
- Revision 16a of the SRP protocol was approved as an official ANSI standard in 2007.
April 2-3, 2014 #2014IBUG

SRP and SCSI
- SRP defines a SCSI transport layer. It enables support for e.g. these SCSI features:
  - Reading and writing data blocks.
  - Read capacity.
  - Command queueing.
  - Multiple LUNs per SCSI host.
  - Inquiring LUN information, e.g. volume identification, caching information and thin provisioning support (a.k.a. TRIM / UNMAP).
  - Atomic (vectored) writes, which help make database software faster.
  - VAAI (WRITE SAME, UNMAP, ATS, XCOPY).
  - End-to-end data integrity (a.k.a. T10-PI).
  - Persistent reservations, a.k.a. cluster support.
  - Asymmetric Logical Unit Access (ALUA).
- Fusion-io is actively involved in the ANSI T10 committee for standardization of new SCSI commands.
April 2-3, 2014 #2014IBUG
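Several of these features can be inspected from the initiator side with standard SCSI tools. A rough sketch using the sg3_utils package (assumed to be installed; /dev/sdc is the example SRP LUN that appears in the login example later in this deck): sg_inq reports device identification, sg_vpd --page=lbpv dumps the Logical Block Provisioning VPD page (thin provisioning / UNMAP support), sg_persist --in --read-keys lists registered persistent reservation keys, and sg_rtpg reports the ALUA target port groups.
# sg_inq /dev/sdc
# sg_vpd --page=lbpv /dev/sdc
# sg_persist --in --read-keys /dev/sdc
# sg_rtpg /dev/sdc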

SRP Protocol - Login
The IB spec defines device management. The login sequence:
1. Initiator sends a device management query to the subnet manager.
2. Subnet manager reports the ports with device management capabilities.
3. Initiator sends an I/O controller query to each port with device management capabilities.
4. SRP target reports its I/O controllers.
5. Initiator sends a login request to the selected I/O controllers.
6. Initiator requests a SCSI LUN report and queries the capacity and identification of each LUN.
7. I/O starts.
April 2-3, 2014 #2014IBUG
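On Linux this discovery and login sequence is normally driven by the srptools utilities described on the next slide. As a minimal sketch, assuming the HCA port connected to the target is reachable through /dev/infiniband/umad0 (the umad device path is an assumption and depends on the local HCA), ibsrpdm queries the fabric for I/O controllers and prints one connection string per discovered target, in the format accepted by the kernel's add_target interface:
# ibsrpdm -c -d /dev/infiniband/umad0
id_ext=0002c90300fc3210,ioc_guid=0002c90300fc3210,dgid=fe800000000000000002c90300fc3211,pkey=ffff,service_id=0002c90300fc3210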

Linux SRP Initiator Support
- Kernel driver ib_srp - implements the SRP protocol.
- User space srptools package:
  - srp_daemon and ibsrpdm executables.
  - Target discovery.
  - Target login.
- Interface between kernel and user space:
  - /sys/class/infiniband_srp/srp-${port}/add_target
  - /sys/class/srp_remote_ports
  - /sys/class/scsi_host/*/{sgid,dgid,...}
  - /sys/class/scsi_device/*/{state,queue_depth,...}
April 2-3, 2014 #2014IBUG
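To log in to a target by hand, a connection string such as the one printed by ibsrpdm or srp_daemon can be written to the add_target attribute of the initiator port. A sketch, assuming an HCA named mlx4_0 and port 1, so that the sysfs directory is srp-mlx4_0-1 (the actual name depends on the local HCA):
# echo id_ext=0002c90300fc3210,ioc_guid=0002c90300fc3210,dgid=fe800000000000000002c90300fc3211,pkey=ffff,service_id=0002c90300fc3210 > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
The kernel then performs the login and LUN scan, after which the new SCSI devices show up in lsscsi as in the example on the next slide. Normally srp_daemon automates this step.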

SRP Login - Example
# cat /etc/srp_daemon.conf
a queue_size=128,max_cmd_per_lun=128
# srp_daemon -oaecd /dev/infiniband/umad1
id_ext=0002c90300fc3210,ioc_guid=0002c90300fc3210,dgid=fe800000000000000002c90300fc3211,pkey=ffff,service_id=0002c90300fc3210
id_ext=0002c90300a543b0,ioc_guid=0002c90300a543b0,dgid=fe800000000000000002c90300fc3221,pkey=ffff,service_id=0002c90300a543b0
[ … ]
# ls /sys/class/srp_remote_ports/
port-453:1 port-459:1 [ … ]
# lsscsi
[5:0:0:0] disk FUSIONIO ION LUN 3243 /dev/sdc
[5:0:0:1] disk FUSIONIO ION LUN 3243 /dev/sdd
April 2-3, 2014 #2014IBUG

Recent SRP Initiator Changes
- Queue size is now configurable. Optimal performance for SSDs and hard disk RAID arrays can only be achieved with a large queue size (128 instead of the default 64).
- Support for modifying the queue depth dynamically has been added.
- Path loss detection time has been reduced from about 40s to about 17s. Further reduction is possible by lowering the subnet timeout on the subnet manager. This makes a significant difference in H.A. setups.
- Added support for the fast_io_fail_tmo and dev_loss_tmo parameters for multipath.
- P_Key support has been added in srp_daemon.
- Many smaller changes in the srptools package.
April 2-3, 2014 #2014IBUG
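A sketch of how these tunables can be exercised on a running system. The 5:0:0:0 SCSI device is taken from the login example and port-5:1 is the corresponding (illustrative) SRP remote port; the attribute locations are the ones exposed by recent kernels. The queue size itself is configured per target in /etc/srp_daemon.conf (queue_size=128,max_cmd_per_lun=128, as shown earlier), while the SCSI queue depth and the multipath timeouts can be changed on the fly:
# echo 128 > /sys/class/scsi_device/5:0:0:0/device/queue_depth
# echo 15 > /sys/class/srp_remote_ports/port-5:1/fast_io_fail_tmo
# echo 60 > /sys/class/srp_remote_ports/port-5:1/dev_loss_tmo
fast_io_fail_tmo is expected to be smaller than dev_loss_tmo; the lower it is, the sooner I/O is failed back to the multipath layer so it can be retried on another path.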

OFED and SRP Support

                        ib_srp    srptools
Upstream Linux kernel   3.14.0    1.0.2
RHEL 6.5                2.6.32+   0.0.4
SLES 11 SP3             3.0.101
MLNX OFED 2.1           3.13.0    1.0.0
OFED 3.12               3.12.0

Fusion-io is working with Linux distribution vendors to keep SRP support up to date.
April 2-3, 2014 #2014IBUG

SRP Initiator and SCSI Core
- The Linux SRP initiator is a SCSI driver.
- The Linux SCSI mid-layer builds on the block layer.
- The SRP initiator relies on the SCSI core for LUN scanning, SCSI error handling, ...
- Path removal triggers a call of scsi_remove_host().
- Path removal during I/O works reliably since Linux kernel 3.8. Fusion-io contributed several patches to make the Linux SCSI core and block layer handle path removal during I/O reliably.
April 2-3, 2014 #2014IBUG
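One way to watch this behaviour from user space during a path failure is to poll the SRP remote port state and the SCSI device state. A sketch; port-5:1 and 5:0:0:0 are the illustrative IDs used earlier, and the state attributes are the ones provided by the SRP transport class and the SCSI core in recent kernels:
# cat /sys/class/srp_remote_ports/port-5:1/state
# cat /sys/class/scsi_device/5:0:0:0/device/state
While the initiator is trying to reconnect, the remote port state changes (e.g. to blocked); once the path is given up on, the SCSI devices disappear because scsi_remove_host() is called as described above.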

Possible Future Directions
- Improving Linux SCSI performance via the scsi-mq project.
- Higher bandwidth by using multiple RDMA channels.
- Latency reduction.
- NUMA performance improvements.
- FRWR support - needed e.g. for Connect-IB HCA support.
- End-to-end data integrity (T10-PI) support; supported by Oracle database software. Builds on FRWR support.
- Adding SR-IOV support.
- Support for Ethernet networks (RoCE and/or iWARP):
  - Requires switching from IB/CM to RDMA/CM.
  - Requires modifying the target discovery software (srptools). The current target discovery software is based on InfiniBand MADs.
April 2-3, 2014 #2014IBUG