Enabling the NVMe™ CMB and PMR Ecosystem

Enabling the NVMe™ CMB and PMR Ecosystem
Stephen Bates, PhD - CTO, Eideticom
Oren Duer - Software Architect, Mellanox
NVM Express Developers Day, May 1, 2018

Outline
- Intro to NVMe™ Controller Memory Buffers (CMBs)
- Use cases for CMBs:
  - Submission Queue Support (SQS) only
  - RDS (Read Data Support) and WDS (Write Data Support) for NVMe p2p copies
  - SQS, RDS and WDS for optimized NVMe™ over Fabrics (NVMe-oF™)
- Software for NVMe CMBs:
  - SPDK (Storage Performance Development Kit) work for NVMe copies
  - Linux kernel work for p2pdma and for offload
- Roadmap for the future

Intro to Controller Memory Buffers
- CMBs were introduced to the NVMe™ standard in 2014, in version 1.2.
- An NVMe CMB is a PCIe BAR (or part thereof) that can be used for certain NVMe-specific data types.
- The main purpose of the CMB is to provide an alternative to placing queues in host memory and to placing data for DMA in host memory.
- As well as a BAR, two optional NVMe registers are needed: CMBLOC (location) and CMBSZ (size and supported types). A rough decode sketch follows below.
- Multiple vendors support CMB today (Intel, Eideticom, Everspin) or soon (Toshiba, Samsung, WDC, etc.).
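To make CMBLOC and CMBSZ concrete, here is a minimal sketch that decodes the two registers according to the NVMe 1.3 register layout (CMBLOC at controller register offset 0x38, CMBSZ at 0x3C). The register values passed in main() are placeholders only; in practice you would obtain them via nvme-cli (nvme show-regs) or by mapping BAR0.

```c
/*
 * Minimal sketch: decoding CMBLOC and CMBSZ per the NVMe 1.3 register layout.
 * The values passed in main() are placeholders, not read from a real device.
 */
#include <stdint.h>
#include <stdio.h>

static void decode_cmb_regs(uint32_t cmbloc, uint32_t cmbsz)
{
        /* CMBSZ.SZU (bits 11:8) selects the size unit: 0=4KiB, 1=64KiB, 2=1MiB, ... */
        uint32_t szu  = (cmbsz >> 8) & 0xF;
        uint64_t unit = 4096ULL << (4 * szu);

        /* CMBSZ.SZ (bits 31:12) is the CMB size in multiples of that unit. */
        uint64_t size = (uint64_t)(cmbsz >> 12) * unit;

        /* CMBLOC.BIR (bits 2:0) names the BAR holding the CMB; CMBLOC.OFST
         * (bits 31:12) is the offset into that BAR, also in size-unit multiples. */
        uint32_t bir    = cmbloc & 0x7;
        uint64_t offset = (uint64_t)(cmbloc >> 12) * unit;

        printf("CMB: BAR%u, offset 0x%llx, size %llu bytes\n",
               bir, (unsigned long long)offset, (unsigned long long)size);
        printf("Supported types: SQS=%u CQS=%u LISTS=%u RDS=%u WDS=%u\n",
               cmbsz & 1, (cmbsz >> 1) & 1, (cmbsz >> 2) & 1,
               (cmbsz >> 3) & 1, (cmbsz >> 4) & 1);
}

int main(void)
{
        decode_cmb_regs(0x00000004 /* BAR4, offset 0 */,
                        0x001f001f /* ~2 MiB, all data types supported */);
        return 0;
}
```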

Intro to Controller Memory Buffers (continued)
Annotations on the PCIe device listing shown on the slide:
- A - This device's manufacturer has registered its vendor ID and device IDs with the PCIe database, so you get a human-readable description of it.
- B - This device has three PCIe BARs: BAR0 is 16KB and is the standard NVMe™ BAR that any legitimate NVMe device must have.
- C - The third BAR is the Controller Memory Buffer (CMB), which can be used for both NVMe queues and NVMe data.
- F - Since this device is an NVMe device, it is bound to the standard Linux kernel NVMe driver.
The slide also shows an example of CMBLOC and CMBSZ obtained via nvme-cli.

Some Fun Use Cases for CMBs
- Placing some (or all) of your NVMe™ queues in CMB rather than host memory reduces latency [Linux Kernel1 and SPDK1]. A minimal SPDK sketch follows below.
- Using the CMB as a DMA buffer allows for offloaded NVMe copies. This can improve performance and offloads the host CPU [SPDK1].
- Using the CMB as a DMA buffer allows RDMA NICs to place NVMe-oF™ data directly into the NVMe SSD, reducing latency and CPU load [Linux Kernel2].
Figure: traditional DMAs (left) load the CPU; P2P DMAs (right) do not.
1 Upstream in the relevant tree. 2 Proposed patches (see last slide for git repo).
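As an illustration of the first use case, the sketch below asks SPDK to place a controller's submission queues in its CMB rather than in host memory. It relies on the use_cmb_sqs option in spdk_nvme_ctrlr_opts as provided by SPDK releases of this era; treat the exact option name as an assumption, and note that probe/attach plumbing and error handling are trimmed.

```c
/*
 * Sketch: request CMB-resident submission queues via SPDK controller options.
 * Assumes the use_cmb_sqs field exists in this SPDK version; trimmed for brevity.
 */
#include <stdbool.h>
#include "spdk/nvme.h"

static bool probe_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
                     struct spdk_nvme_ctrlr_opts *opts)
{
        opts->use_cmb_sqs = true;   /* place SQs in the CMB if the device has one */
        return true;                /* attach to this controller */
}

static void attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
                      struct spdk_nvme_ctrlr *ctrlr,
                      const struct spdk_nvme_ctrlr_opts *opts)
{
        /* I/O qpairs allocated on ctrlr will use CMB-resident SQs when the
         * controller advertises SQS support in CMBSZ. */
}

int place_sqs_in_cmb(void)
{
        /* NULL transport ID probes the local PCIe bus. */
        return spdk_nvme_probe(NULL, NULL, probe_cb, attach_cb, NULL);
}
```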

Software for CMBs - SPDK
- Storage Performance Development Kit (SPDK) is a Free and Open Source (FOSS) user-space framework for high-performance storage, with a focus on NVMe™ and NVMe-oF™.
- Code was added in Feb 2018 to enable P2P NVMe copies when CMBs allow it.
- A simple example application using this new API is also in the SPDK examples (cmb_copy).
Figure: cmb_copy is an example application that uses SPDK's APIs to copy data between NVMe SSDs using P2P DMAs, bypassing the CPU's memory and PCIe subsystems. A sketch of the same idea follows below.
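The following is a minimal sketch in the spirit of SPDK's cmb_copy example, not the example itself. It assumes two controllers have already been probed and a namespace plus I/O qpair selected for each; the CMB allocation call (spdk_nvme_ctrlr_alloc_cmb_io_buffer) is the API from the SPDK release of this era and should be treated as an assumption, and error handling and buffer freeing are trimmed.

```c
/*
 * Sketch of a CMB-based peer-to-peer copy: the source SSD DMAs data directly
 * into the destination SSD's CMB, and the destination SSD then writes it to
 * flash. Host DRAM is never touched by the data path.
 */
#include <stdbool.h>
#include <stdint.h>
#include "spdk/nvme.h"

static void io_done(void *arg, const struct spdk_nvme_cpl *cpl)
{
        *(bool *)arg = true;    /* flag completion for the polling loops below */
}

static int cmb_copy_range(struct spdk_nvme_ns *src_ns, struct spdk_nvme_qpair *src_qp,
                          struct spdk_nvme_ns *dst_ns, struct spdk_nvme_qpair *dst_qp,
                          struct spdk_nvme_ctrlr *dst_ctrlr,
                          uint64_t src_lba, uint64_t dst_lba, uint32_t nlb)
{
        size_t len = (size_t)nlb * spdk_nvme_ns_get_sector_size(src_ns);
        bool read_done = false, write_done = false;

        /* Buffer lives in the destination SSD's CMB, not in host DRAM.
         * Allocation call name is an assumption (era-specific SPDK API). */
        void *cmb_buf = spdk_nvme_ctrlr_alloc_cmb_io_buffer(dst_ctrlr, len);
        if (cmb_buf == NULL)
                return -1;

        /* Source SSD reads its LBAs straight into the peer's CMB ... */
        spdk_nvme_ns_cmd_read(src_ns, src_qp, cmb_buf, src_lba, nlb, io_done, &read_done, 0);
        while (!read_done)
                spdk_nvme_qpair_process_completions(src_qp, 0);

        /* ... and the destination SSD writes the data from its own CMB to flash. */
        spdk_nvme_ns_cmd_write(dst_ns, dst_qp, cmb_buf, dst_lba, nlb, io_done, &write_done, 0);
        while (!write_done)
                spdk_nvme_qpair_process_completions(dst_qp, 0);

        return 0;
}
```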

Software for CMBs - SPDK See https://asciinema.org/a/bkd32zDLyKvIq7F8M5BBvdX42 Software for CMBs - SPDK C B A A A - copied 9MB from SSD A to SSD B. B - less than 1MB of data on PCIe switch Upstream Port. C - SPDK command line

Software for CMBs - The Linux Kernel
- A P2P framework called p2pdma is being proposed for the Linux kernel.
- It is much more general than NVMe™ CMBs: any PCIe device can utilize it (NICs, GPGPUs, etc.).
- PCIe drivers can register memory (e.g. CMBs) or request access to memory for DMA. A kernel-API sketch follows below.
- Initial patches use p2pdma to optimize the NVMe-oF™ target code.
The p2pdma framework can be used to improve NVMe-oF targets. Here we show results from a generic NVMe-oF system: p2pdma can reduce CPU memory load by 50x and CPU PCIe load by 25x. NVMe offload can also be employed to reduce CPU core load by 50x.
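The sketch below shows how a PCIe driver might plug into the p2pdma framework: a donor driver registers part of a BAR (e.g. an NVMe CMB) and publishes it, and a consumer driver finds a suitable peer and allocates a DMA-able buffer from it. The function names follow the p2pdma patches referenced at the end of this deck; since the series was still under review at the time of this talk, treat the exact names and signatures as assumptions.

```c
/*
 * Sketch of donor and consumer sides of the proposed p2pdma framework.
 * API names are taken from the in-flight patch series and may differ from
 * what eventually merged; error handling is trimmed.
 */
#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

/* Donor side: expose part of BAR 4 as peer-to-peer memory and publish it. */
static int expose_cmb(struct pci_dev *pdev, size_t size, u64 bar_offset)
{
        int rc = pci_p2pdma_add_resource(pdev, 4, size, bar_offset);
        if (rc)
                return rc;
        pci_p2pmem_publish(pdev, true);
        return 0;
}

/* Consumer side: find a published peer close to our device and carve out a buffer. */
static void *grab_p2p_buffer(struct pci_dev *client, size_t len,
                             struct pci_dev **provider)
{
        *provider = pci_p2pmem_find(&client->dev);  /* nearest publisher, if any */
        if (!*provider)
                return NULL;
        return pci_alloc_p2pmem(*provider, len);    /* released with pci_free_p2pmem() */
}
```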

Software for CMBs - The Linux Kernel (test setup)
The slide compares two data paths: the legacy data path and the p2pdma data path.
The hardware setup for the NVMe-oF™ p2pdma testing is as shown on the right. The software setup consisted of a modified Linux kernel and standard NVMe-oF configuration tools (mostly nvme-cli and nvmet). The Linux kernel used added support for NVMe™ offload and Peer-2-Peer DMAs using an NVMe CMB provided by the Eideticom NVMe device.
This is the NVMe-oF target configuration used. Note that the RDMA NIC is connected to the PCIe switch and not to the CPU Root Port.

Roadmap for CMBs, PMRs and the Software
- NVMe™ CMBs have been in the standard for a while. However, it is only now that they are becoming available and that software is starting to utilize them.
- SPDK and the Linux kernel are the two main locations for CMB software enablement today:
  - SPDK: NVMe P2P copies; NVMe-oF™ updates coming.
  - Linux kernel: the p2pdma framework should be upstream soon and will be expanded to support other NVMe/PCIe resources (e.g. doorbells).
- Persistent Memory Regions (PMRs) add non-volatile CMBs and will require (lots of) software enablement too. They will enable a path to persistent-memory storage on the PCIe bus.

Further Reading, Resources and References
- Current NVM Express™ standard: http://nvmexpress.org/wp-content/uploads/NVM-Express-1_3a-20171024_ratified.pdf
- PMR TP: http://nvmexpress.org/wp-content/uploads/NVM-Express-1.3-Ratified-TPs.zip
- SPDK: http://www.spdk.io/ and https://github.com/spdk/spdk
- p2pdma Linux kernel patches: https://github.com/sbates130272/linux-p2pmem/tree/pcp-p2p-v4
- Mellanox offload driver: https://github.com/Mellanox/NVMEoF-P2P
- SDC talk on p2pmem: https://www.youtube.com/watch?v=zEXJ549ealM
- Offload+p2pdma kernel code: https://github.com/lsgunth/linux/commits/max-mlnx-offload-p2pdma
- Offload+p2pdma white paper: https://github.com/Mellanox/NVMEoF-P2P and https://docs.google.com/document/d/1GVGCLALneyw3pyKYmRRG7VTNWPtzL0XqrFlYA53rx_M/edit?usp=sharing

2018 Storage Performance Development Kit (SPDK) Summit
May 15th-16th, Dolce Hayes Mansion, 200 Edenvale Avenue, San Jose, California 95136
This will be a great opportunity to meet with other SPDK community members and listen to a new series of talks from SPDK users and developers: everything from case studies and analysis to tech tutorials and live demos. This year we will dedicate a second day just for developers, including a hands-on lab as well as a few hours set aside for active contributors to tackle design issues and discuss future advancements.
Registration is free: http://www.cvent.com/d/qgqnn3
Sponsored by Intel® Corporation