Fabtests: test framework ideas/suggestions
Howard Pritchard, LANL
LA-UR-1426578
www.openfabrics.org
OFI WG F2F, 8/2014



Topics
- Current state of fabtests
- Test suites for similar RDMA network protocols
  - OFED tarball
  - PAMI
  - Portals4
  - uGNI
- HPC-style job launcher options
- Content ideas for fabtests

Fabtests: current state
- Only two tests currently
  - unit/provinfo.c: tests fi_getinfo
  - simple/pingpong.c: tests FI_MSG based ping/pong using client/server model
- Need a lot more (we all know this)

OFED tarball
- perftest
  - Set of client/server based tests of send/recv, RDMA performance, etc.
  - Simple job launch script for client side
- qperf
  - Client/server style tests for UC, UD, RC send/recv, RDMA (AMOs) performance
- There doesn't appear to be any source rpm containing a set of unit tests for ibverbs or psm in the OFED tarball

PAMI: finding it
- A little tricky to find, but available at driver/V1R2M2/
- Get the brq-V1R2M2.tar.gz tarball

PAMI testsuite
- The PAMI tests will untar into comm/sys/pami/tests
- Lots of them, for collectives, p2p, PAMI internal funcs, etc.
- Perf tests and unit tests appear to be intermingled
- It appears all tests are launched on BG using poe

Portals4
- At code.google.com/p/portals4
- About 30 basic tests, which can be used with either a matching or a non-matching Portals NIC handle
- Also has several performance tests (e.g. NetPIPE, Portals versions of the Sandia MPI Benchmarks (SMB), …)
- Leverages the Argonne Hydra/simple PMI job launcher for basic runtime support, included in the Portals tarball

GNI (Cray)
- Lots of unit tests in the unit tests rpm (generally not available to customers), generally written by the developers of particular GNI features
- Also an examples rpm intended to give customers guidance on using GNI, not written by the developers
- With a few exceptions, all of the tests and examples use Hydra-lite (or Cray aprun)/PMI for a runtime system

HPC-style runtime/job launcher and fabtests
- The libfabric API does not require an HPC-style runtime/job launch, which is a good thing
- However, for most HPC use cases, some kind of runtime/job launch system will be used
- Having such a runtime system makes writing unit/example tests reflecting HPC use cases much easier
  - Can run tests on production systems without interfering with other users
  - Provides ways for exchanging info out of band (OOB) between processes running a test

Job launcher options for fabtests
- Roll our own using pdsh, etc.
  - May be more familiar to non-HPC users
  - To HPC users, may seem like reinventing the wheel
- HPC job launch options
  - Resource manager specific job launchers
    - SLURM, LSF, etc.
    - Vendor specific (Cray aprun, IBM poe, etc.)
  - Open source options
    - Hydra (Argonne's MPICH job launcher)
    - ORTE (OpenMPI's job launcher)
    - YARN - Hadoop (this is kind of a joke)

Hydra and ORTE Compared

| | Hydra/Simple PMI | ORTE |
|---|---|---|
| License | BSD style | BSD style |
| Packaging | Job launcher for MPICH. Available as a separate package. Simple PMI included in MPICH | Comes as part of OpenMPI package. |
| Batch system/launcher aware | yes | yes |
| Ease of use within fabtests | Simple, high level PMI interface | More complex, lower level interface, likely would require a glue layer of some sort to avoid libfabric developers/testers having to learn ORTE/OPAL |

Hydra & PMI
- Job launch
  - mpiexec -n 2 -hosts node1,node2 ./a.out
- Basic job setup and parameters
  - PMI_Init/PMI_Finalize
  - PMI_Rank
  - PMI_Size
- Barrier function (PMI_Barrier)
- Key-value store
  - PMI_KVS_put/PMI_KVS_get
  - PMI_KVS_commit
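
The put/commit/barrier/get pattern that such a key-value store supports, for example to exchange endpoint addresses out of band, can be sketched as follows. This is an in-process Python stand-in and not the PMI library: `FakeKVS` and `exchange_addresses` are hypothetical names, the addresses are made up, and the barrier is modeled simply by finishing every put before the first get. In a real test each rank would run in its own process and call the PMI functions instead.

```python
class FakeKVS:
    """Hypothetical in-memory stand-in for a PMI key-value space."""

    def __init__(self):
        self._pending = {}   # puts that have not been committed yet
        self._visible = {}   # committed entries, readable by every rank

    def put(self, key, value):
        self._pending[key] = value

    def commit(self):
        self._visible.update(self._pending)
        self._pending.clear()

    def get(self, key):
        # Raises KeyError if the key was never committed.
        return self._visible[key]


def exchange_addresses(local_addrs):
    """Simulate every rank publishing its address and reading all peers'.

    local_addrs[i] is the (made-up) endpoint address of rank i.
    Returns, for each rank, the list of addresses it learned.
    """
    kvs = FakeKVS()
    size = len(local_addrs)

    # Phase 1: each rank puts and commits its own address.
    for rank, addr in enumerate(local_addrs):
        kvs.put(f"addr-{rank}", addr)
        kvs.commit()

    # The barrier belongs here: no rank reads until all have committed.

    # Phase 2: each rank gets the address of every rank.
    return [[kvs.get(f"addr-{peer}") for peer in range(size)]
            for _ in range(size)]
```

Keeping uncommitted puts invisible mirrors why the commit and barrier steps matter: a get issued before the peer has committed would find nothing.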

Content Ideas for fabtests

Job launcher related tests
- Add Hydra/simple PMI to fabtests, much like what is provided with Portals4
- Include some simple smoke tests which only exercise the PMI functionality. If these don't work, there is no sense running fabtests which rely on Hydra/PMI.
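
One way to express that gating is a small runner that executes the PMI smoke tests first and marks every dependent test as skipped if any smoke test fails. This is a minimal sketch: the name `run_suite` and the representation of a test as a zero-argument callable are assumptions made for illustration, not part of fabtests.

```python
def run_suite(smoke_tests, dependent_tests):
    """Run smoke tests first; skip dependents if any smoke test fails.

    Both arguments map a test name to a zero-argument callable that
    returns True on success. Returns a dict of name -> "pass", "fail"
    or "skip".
    """
    results = {}
    for name, test in smoke_tests.items():
        results[name] = "pass" if test() else "fail"

    # At this point results holds only the smoke test outcomes.
    launcher_ok = all(status == "pass" for status in results.values())

    for name, test in dependent_tests.items():
        if not launcher_ok:
            results[name] = "skip"   # launcher is broken, do not run
        else:
            results[name] = "pass" if test() else "fail"
    return results
```

Reporting "skip" rather than "fail" keeps a broken launcher from showing up as a long list of unrelated provider failures.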

Provider checklist tests

Endpoint types
- According to the fabric.7 man page, a provider must support at least one of the following endpoint types (for libfabric version …):
  - FID_MSG: connected/reliable
  - FID_RDM: unconnected/reliable
  - FID_DGRAM: unconnected/unreliable

Endpoint data transfer/CM functionality
- Provider must implement at a minimum the FI_MSG data transfer interface
- Connection management functions for FID_RDM/FID_DGRAM: getname, getpeer, connect, multicast join/leave
- Connection management functions for FID_MSG: getname, getpeer, connect, accept, listen, reject, shutdown

Access Domain Functionality
- Must support opening address vector maps and tables
- Address vectors (AVs) have to support at least the FI_ADDR_PROTO input format, and FI_SOCKADDR_IN(6) if endpoints can be identified by IP address
- AVs must support the following output formats: FI_ADDR, FI_ADDR_INDEX, FI_AV
- Must support opening EQs and counters

Event Queue Functionality
- Must support at least FI_EQ_FORMAT_CONTEXT
- Data transfer completion EQs must support the FI_EQ_FORMAT_DATA format
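
The endpoint type and event queue rules from the checklist slides lend themselves to a table-driven check. The sketch below is an assumption-laden illustration: `check_provider` and the layout of the `caps` dictionary are hypothetical, and a real test would fill in that description by querying the provider through libfabric rather than by hand. Only the identifier strings come from the slides.

```python
# Endpoint types named on the checklist slide.
ENDPOINT_TYPES = {"FID_MSG", "FID_RDM", "FID_DGRAM"}


def check_provider(caps):
    """Return a list of checklist violations for a provider description.

    `caps` is a hypothetical dict with the keys "endpoint_types",
    "eq_formats" and "data_transfer_eq_formats", each a set of strings.
    An empty result means the checked rules are all satisfied.
    """
    problems = []
    # Rule: at least one of the listed endpoint types is supported.
    if not (caps.get("endpoint_types", set()) & ENDPOINT_TYPES):
        problems.append("no supported endpoint type")
    # Rule: EQs support at least FI_EQ_FORMAT_CONTEXT.
    if "FI_EQ_FORMAT_CONTEXT" not in caps.get("eq_formats", set()):
        problems.append("EQ lacks FI_EQ_FORMAT_CONTEXT")
    # Rule: data transfer completion EQs support FI_EQ_FORMAT_DATA.
    if "FI_EQ_FORMAT_DATA" not in caps.get("data_transfer_eq_formats", set()):
        problems.append("data transfer EQ lacks FI_EQ_FORMAT_DATA")
    return problems
```

Returning every violation instead of stopping at the first gives a provider author the whole list in one run.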

Forward compatibility
- Provider expected to be forward compatible
- Able to handle being compiled against expanded fi_xxx_ops…

Other ideas
- Example tests illustrating non-trivial usage of various endpoint types
- Error handling: simulating error events being delivered to a COMP EQ, etc.
- Out-of-order delivery simulation
- Move fabtests project to github or other location more suitable for open source development
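
One possible reading of the out-of-order delivery idea is a harness that tags messages with sequence numbers, hands them to the receiver in a shuffled order, and checks that the receiver still reconstructs the send order. The sketch below assumes that reading; `deliver_out_of_order` and `reassemble` are hypothetical helpers, and nothing here touches libfabric.

```python
import random


def deliver_out_of_order(messages, seed=0):
    """Tag each message with its sequence number and shuffle the result.

    A fixed seed keeps the shuffled order reproducible between runs.
    """
    tagged = list(enumerate(messages))
    random.Random(seed).shuffle(tagged)
    return tagged


def reassemble(tagged):
    """Receiver side: restore the send order from the sequence numbers."""
    return [payload for _, payload in sorted(tagged, key=lambda t: t[0])]
```

A test built on this would send the shuffled sequence through the code under test and compare the reassembled payloads with the original list.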

BACKUP MATERIAL

Hydra / ORTE Compared
- Hydra
  - BSD style license
  - Separate package from MPICH
  - Works with simple PMI client (the app)
  - "template" already with Portals4 package
  - Simple to use PMI interface
  - Batch system aware
- ORTE
  - BSD style license
  - Part of OMPI package/uses OPAL
  - More complex to use than Hydra/PMI, at least looking at ORTE tests
  - Batch system aware