Red Hat RDMA Integration and Testing Processes Doug Ledford.

Slides:



Advertisements
Similar presentations
CyberPatriot: An Introduction to GNU/Linux 9/10/10 Joshua White Director of CyOON R&D Everis Inc (315)
Advertisements

Condor use in Department of Computing, Imperial College Stephen M c Gough, David McBride London e-Science Centre.
The Development of Mellanox - NVIDIA GPUDirect over InfiniBand A New Model for GPU to GPU Communications Gilad Shainer.
OFED TCP Port Mapper Proposal June 15, Overview Current NE020 Linux OFED driver uses host TCP/IP stack MAC and IP address for RDMA connections Hardware.
Uncovering Performance and Interoperability Issues in the OFED Stack March 2008 Dennis Tolstenko Sonoma Workshop Presentation.
OpenFabrics Enterprise Distribution for Windows 1 Stan C. SmithIshai RabinovitzEric Lantz 3/16/2010.
Performance Analysis of Virtualization for High Performance Computing A Practical Evaluation of Hypervisor Overheads Matthew Cawood University of Cape.
Introduction Characteristics of USB System Model What needs to be done Platform Issues Conceptual Issues Timeline USB Monitoring Final Presentation 10.
Lesson 5-Accessing Networks. Overview Introduction to Windows XP Professional. Introduction to Novell Client. Introduction to Red Hat Linux workstation.
Linux+ Guide to Linux Certification Chapter 12 Compression, System Backup, and Software Installation.
Software Configuration Management
Red Hat Installation. Installing Red Hat Linux is the process of copying operating system files from a CD, DVD, or USB flash drive to hard disk(s) on.
Improving the OFED Development Process.
Installing Windows Vista Lesson 2. Skills Matrix Technology SkillObjective DomainObjective # Performing a Clean Installation Set up Windows Vista as the.
Version Control with git. Version Control Version control is a system that records changes to a file or set of files over time so that you can recall.
Linux Basics CS 302. Outline  What is Unix?  What is Linux?  Virtual Machine.
OFA-IWG - March 2010 OFA Interoperability Working Group Update Authors: Mikkel Hagen, Rupert Dance Date: 3/15/2010.
SRP Update Bart Van Assche,.
Windows RDMA File Storage
Windows Azure Conference 2014 Running Docker on Windows Azure.
OFA-IWG Interop Event March 2008 Rupert Dance, Arkady Kanevsky, Tuan Phamdo, Mikkel Hagen Sonoma Workshop Presentation.
Linux Last Update Copyright Kenneth M. Chipps Ph.D. 1.
Arago Project Creating an Open Integration and Distribution System William Mills
RDMA Stacks MOFED, OFED & Linux Kernel
CIS 191 – Lesson 2 System Administration. CIS 191 – Lesson 2 System Architecture Component Architecture –The OS provides the simple components from which.
CONNECT: Install Webinar for Code-A-Thon April 20th, 2010.
Current Status of OFED in SUSE Linux Enterprise Server John Jolly Senior Software Engineer SUSE.
October, Scientific Linux INFN/Trieste B.Gobbo – Compass R.Gomezel - T.Macorini - L.Strizzolo INFN - Trieste.
Open Fabrics BOF Supercomputing 2008 Tziporet Koren, Gilad Shainer, Yiftah Shahar, Bob Woodruff, Betsy Zeller.
OFED for Linux: Status and Next Steps 1 Betsy Zeller (Qlogic), Tziporet Koren (Mellanox) 3/16/2010.
OFED 1.2 Lessons, 1.3 Planning and Field Support May 07 Tziporet Koren.
MAE Continuous Integration Administration guide July 8th, 2013.
COMPUTER OPERATING SYSTEMS THE BIG 3. MENU PC WINDOWS The primary operating system for the majority of computer users around the world is Windows. Many.
Install Software. UNIX Shell The UNIX/LINUX shell is a program important part of a Unix system. interface between the user & UNIX kernel starts running.
Git workflow and basic commands By: Anuj Sharma. Why git? Git is a distributed revision control system with an emphasis on speed, data integrity, and.
OFED Interoperability NetEffect April 30, 2007 Sonoma Workshop Presentation.
InfiniBand in the Lab Erik 1.
© 2012 MELLANOX TECHNOLOGIES 1 Disruptive Technologies in HPC Interconnect HPC User Forum April 16, 2012.
1 Development of a High-Throughput Computing Cluster at Florida Tech P. FORD, R. PENA, J. HELSBY, R. HOCH, M. HOHLMANN Physics and Space Sciences Dept,
IT320 OPERATING SYSTEM CONCEPTS Unit 3: Welcome to Linux September 2012 Kaplan University 1.
OpenFabrics Enterprise Distribution (OFED) Update
Open Fabrics BOF Supercomputing 2009 Tziporet Koren, Gilad Shainer, Yiftah Shahar, Stan Smith Hal Rosenstock, Jeff Squyres, DK Panda,
Mark E. Fuller Senior Principal Instructor Oracle University Oracle Corporation.
Cluster Software Overview
PerfSONAR-PS Functionality February 11 th 2010, APAN 29 – perfSONAR Workshop Jeff Boote, Assistant Director R&D.
Docker and Container Technology
OFA-IWG Interop Event April 2007 Rupert Dance Lamprey Networks Sonoma Workshop Presentation.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
Silberschatz, Galvin and Gagne ©2011 Operating System Concepts Essentials – 8 th Edition Chapter 2: The Linux System Part 1.
Gorman, Stubbs, & CEP Inc. 1 Introduction to Operating Systems Lesson 8 Linux.
Computer Operating Systems And Software applications.
System Requirements  Supports 32 bit i586 and 64 bit x86-64 PC hardware.  PowerPC(PPC) processors.  RAM: 256 MB minimum, 512 MB recommended.  Hard.
This slide deck is for LPI Academy instructors to use for lectures for LPI Academy courses. ©Copyright Network Development Group Module 01 Introduction.
Securing a Host Computer BY STEPHEN GOSNER. Definition of a Host  Host  In networking, a host is any device that has an IP address.  Hosts include.
Red Hat, Inc. The Revolution of Choice. Red Hat, Inc. Founded in 1995 –Bob Young, CEO - Co-founder –Marc Ewing, CTO - Co-founder Headquartered in Research.
© 2007 UC Regents1 Rocks – Present and Future The State of Things Open Source Grids and Clusters Conference Philip Papadopoulos, Greg Bruno Mason Katz,
Advisor: Hung Shi-Hao Presenter: Chen Yu-Jen
DELL PUBLIC Matt Domsch Technology Strategist, Office of the CTO, Dell September 21, 2009 Dynamic Driver Injection.
IT320 Operating System Concepts
High Availability 24 hours a day, 7 days a week, 365 days a year…
Current Generation Hypervisor Type 1 Type 2.
Trends like agile development and continuous integration speak to the modern enterprise’s need to build software hyper-efficiently Jenkins:  a highly.
ITS 145: Intro to Information Systems
Essentials of UrbanCode Deploy v6.1
More Scripting & Chapter 11
XSEDE’s Campus Bridging Project
Enabling the NVMe™ CMB and PMR Ecosystem
Chapter 2: The Linux System Part 1
Section 1: Linux Basics and SLES9 Installation
© 2019 Hazy Limited. Private & confidential.
Presentation transcript:

Red Hat RDMA Integration and Testing Processes Doug Ledford

Historical Releases Red Hat Enterprise Linux 4 –Utilized OFED for both kernel code and user space packages –Took in all of the support present in OFED –The version of the openib rpm was == the version of OFED we pulled into any given release Red Hat Enterprise Linux 5 –Utilized OFED for kernel code, but went directly to upstream releases for user space code –Did not take in all of OFED kernel code either, specifically excluded features not accepted upstream –The version of the openib rpm was == to the version of OFED we pulled the kernel support from 2 April 2-3, 2014#2014IBUG

Current Releases Red Hat Enterprise Linux 6 and upcoming 7 –Uses upstream kernel code. –Uses upstream user space packages. –Uses a new package named rdma as the base kernel configuration package. Starting with EL 6.3 and later, the version of the rdma package can be used to determine the version of the upstream kernel that we pulled the RDMA support from for the current EL kernel. For instance, in EL 6.3, the rdma package is version 3.3, so for that release the core RDMA kernel support came from the upstream version 3.3 Linus kernel. For EL 7, we also encode the release number into the rdma package version to ensure proper sorting. For instance, EL 7.0’s rdma package is currently version 7.0_3.13_rc8-3.el7 3 April 2-3, 2014#2014IBUG

Integration for current releases We perform a full kernel RDMA stack refresh with each point release, but we weed out patches that can’t be taken due to need for other items that would break kABI in our kernels (the RDMA stack is exempt from the kABI list and has been since EL 4). The git whatchanged and git cherry-pick commands are essential for this work. We grab whichever user space packages have updated since the last point release, and rebuilt all dependent packages even if the dependent package didn’t have it’s own source update. 4 April 2-3, 2014#2014IBUG

Testing Environment We have a specific cluster we use to test every update. –56GBit/s Mellanox InfiniBand switch on one fabric –40GBit/s Intel InfiniBand switch on another fabric –40GBit/s Mellanox Ethernet switch on another fabric –10GBit/s Dell Ethernet switch on another fabric –We test a fairly complete matrix of all possible combinations of InfiniBand, iWARP, RoCE/IBoE, P_Keys, VLANs, and SRIOV. And across this matrix we run high level tests (such as MPI tests) as well as low level tests (simple pings run overnight in a loop) and performance oriented tests. 5 April 2-3, 2014#2014IBUG

Testing Environment (cont.) We test across a wide array of host adapters –mthca, mlx4, mlx5, qib, cxgb3, cxgb4, ocrdma –Where possible, we specifically attempt to test cross driver compatibility with each release Testing is mostly automated –We have an install environment that installs the latest release, installs our internal test harness, then runs a complete set of automated tests across the appropriate subset of machines for each test type. Failures are flagged for further analysis. 6 April 2-3, 2014#2014IBUG

Questions? No? Awesome, I must have explained everything perfectly! 7 April 2-3, 2014#2014IBUG

8