
Data Reorganization Interface (DRI) Overview
Kenneth Cain Jr., Mercury Computer Systems, Inc.
Anthony Skjellum, MPI Software Technology
On behalf of the Data Reorganization Forum
SC2002, November 17, 2002
© 2002 Mercury Computer Systems, Inc. © 2002 MPI Software Technology, Inc. For public release.

Slide 2: Outline
- DRI specification highlights
  - Design/application space
  - Features in DRI 1.0
  - 2-D FFT example*
  - Journal of development (features discussed, but not included in DRI 1.0)
- DRI forum status
- Current/future value of DRI to HPEC
* Thanks to Steve Paavola of Sky Computers for contributing to the 2-D FFT example code.

Slide 3: Design/Application Space
- Algorithms facilitated:
  - Lowpass filter
  - Beamforming
  - Matched filter
  - Others…
- Generic design space:
  - Signal/image processing data flows
  - Data-parallel applications
  - Clique or pipeline designs
  - Stream real-time processing
- Applications facilitated:
  - Space-Time Adaptive Processing (STAP)
  - Synthetic Aperture Radar (SAR)
  - Others…
[Figure: data flow with Range Processing and Azimuth Processing stages]

Slide 4: DRI Important Objects
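The object handles referenced on this slide also appear in the 2-D FFT example later in the deck. A minimal declaration sketch, assuming a DRI 1.0 implementation and a hypothetical dri.h header; the role comments are inferred from how the example uses each object:

/* Sketch: principal DRI 1.0 object handles, as used in the 2-D FFT example.
 * The "dri.h" header name and the role comments are assumptions. */
#include "dri.h"

DRI_Globaldata   gdo;                /* global (whole-problem) data object          */
DRI_Distribution srcDist, dstDist;   /* how global data is partitioned and laid out */
DRI_Reorg        srcReorg, dstReorg; /* send and receive ends of a reorganization   */
DRI_Datapart     dpo;                /* the locally owned portion of the data       */
DRI_Blockinfo    block;              /* description of one local block              */
DRI_Group        allprocs;           /* the participating process set               */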

Slide 5: DRI Parallel Mapping

Slide 6: DRI Communication Setup
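A minimal setup sketch, using only calls that appear in the 2-D FFT example below; the dri.h header name is an assumption, and the distribution and group objects are assumed to have been created as in that example:

/* Sketch: create a matched send/receive reorg pair for a corner turn and
 * connect both endpoints before any transfer. "dri.h" is an assumed header. */
#include "dri.h"

DRI_Distribution srcDist, dstDist;   /* assumed created as in the 2-D FFT example */
DRI_Group        allprocs;           /* all participating processes               */
DRI_Reorg        srcReorg, dstReorg;

DRI_Reorg_create(DRI_REORG_SEND, "cornerturn", DRI_COMPLEX,
                 srcDist, allprocs, 1, 0, &srcReorg);
DRI_Reorg_create(DRI_REORG_RECV, "cornerturn", DRI_COMPLEX,
                 dstDist, allprocs, 1, 0, &dstReorg);

DRI_Reorg_connect(&srcReorg);
DRI_Reorg_connect(&dstReorg);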

Slide 7: DRI Parallel Communication

Slide 8: All DRI Objects

Slide 9: DRI In-Place Processing
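In the 2-D FFT example that follows, each process computes FFTs directly in the buffer handed out by the library and then returns it. A minimal sketch of that get/compute/put cycle, assuming a dri.h header and a hypothetical process_rows() computation (the out-parameter is written here as (void **)&matrix):

/* Sketch: obtain the locally owned block, compute in place in the
 * library-provided buffer, then hand it back. Calls follow the 2-D FFT
 * example; "dri.h" and process_rows() are assumptions. */
#include "dri.h"

void process_rows(complex *rows, int nrows, int ncols);  /* hypothetical user computation */

DRI_Reorg     srcReorg;              /* assumed created and connected as in the example */
DRI_Datapart  dpo;
DRI_Blockinfo block;
complex      *matrix;
int           my_rows, cols = 1024;  /* matrix width as in the example */

DRI_Reorg_get_datapart(srcReorg, (void **)&matrix, &dpo);  /* borrow the local buffer */
DRI_Datapart_get_blockinfo(dpo, 0, &block);
my_rows = DRI_Blockinfo_length(block, 1);
process_rows(matrix, my_rows, cols);       /* compute directly in that buffer */
DRI_Reorg_put_datapart(srcReorg, &dpo);    /* return the buffer to the library */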

Slide 10: DRI 2-D FFT Example (1/2)

DRI_Globaldata gdo;                 // global data object
DRI_Distribution srcDist, dstDist;  // distribution objects
DRI_Reorg srcReorg, dstReorg;       // data reorg objects
DRI_Datapart dpo;                   // Datapart object
DRI_Blockinfo block;                // block description
DRI_Group allprocs;                 // all processes
int gdo_dimsizes[2] = {1024, 512};
int cols = gdo_dimsizes[0];         // 1024 matrix cols
int rows = gdo_dimsizes[1];         // 512 matrix rows
int my_rows, my_cols;               // my rows, cols
int i;                              // loop counter
complex *matrix;

DRI_Init(&argc, &argv);
DRI_Globaldata_create(2, gdo_dimsizes, &gdo);
DRI_Distribution_create(gdo, DRI_GROUPDIMS_ALLNOPREFERENCE,
                        DRI_PARTITION_WB, DRI_LAYOUT_PACKED_01, &srcDist);
DRI_Distribution_create(gdo, DRI_GROUPDIMS_ALLNOPREFERENCE,
                        DRI_PARTITION_BW, DRI_LAYOUT_PACKED_10, &dstDist);
DRI_Reorg_create(DRI_REORG_SEND, "cornerturn", DRI_COMPLEX,
                 srcDist, allprocs, 1, 0, &srcReorg);
DRI_Reorg_create(DRI_REORG_RECV, "cornerturn", DRI_COMPLEX,
                 dstDist, allprocs, 1, 0, &dstReorg);

Slide 11: DRI 2-D FFT Example (2/2)

DRI_Reorg_connect(&srcReorg);
DRI_Reorg_connect(&dstReorg);

DRI_Reorg_get_datapart(srcReorg, (void **)&matrix, &dpo);
DRI_Datapart_get_blockinfo(dpo, 0, &block);
my_rows = DRI_Blockinfo_length(block, 1);
for (i = 0; i < my_rows; i++) {
    // FFT the row starting at address (complex *)matrix + i*cols
}
DRI_Reorg_put_datapart(srcReorg, &dpo);

DRI_Reorg_get_datapart(dstReorg, (void **)&matrix, &dpo);
DRI_Datapart_get_blockinfo(dpo, 0, &block);
my_cols = DRI_Blockinfo_length(block, 1);
for (i = 0; i < my_cols; i++) {
    // FFT the column starting at address (complex *)matrix + i*rows
}
DRI_Reorg_put_datapart(dstReorg, &dpo);

// Destroy all DRI objects with DRI_Object_destroy()
DRI_Finalize();

Slide 12: Journal of Development
- Topics discussed, but not included in DRI 1.0:
  - Piecemeal data production/consumption
  - User-allocated memory
  - Generalized buffer sharing
  - Split data types (e.g., complex, RGB, …)
  - Interoperation with VSIPL
  - Dynamic:
    - Transfer sizes (e.g., mode change support)
    - Buffer sizes (lower memory impact for multi-buffering)
    - Process sets: pick the process set pair at transfer time
  - Non-CPU (device) endpoints
- To address these issues, vendors/users can:
  - Develop desired extensions (necessarily non-portable)
  - Use experience to focus requirements for future standardization (if appropriate)

Slide 13: DRI Forum Status
- DRI 1.0 ratified in September 2002
- Voting institutions:
  - Mercury Computer Systems, Inc.
  - MPI Software Technology, Inc.
  - Sky Computers, Inc.
  - SPAWAR Systems Center, San Diego
  - The MITRE Corporation
- Future:
  - Gather feedback about DRI 1.0 evaluation/use
  - Further specification TBD based on community interest
- See the web page / mailing list for the latest info:
  - Web:
  - Mailing list:

Slide 14: DRI Value to HPEC
- DRI current value to the HPEC community:
  - Implementable with high performance
  - Packages data flow features that were often developed independently
  - High level of abstraction:
    - Reduces source lines of code dedicated to data flow
    - Reduces development time associated with data flow
- DRI future value to the HPEC community:
  - Common framework/terminology in which to focus the community's requirements
  - The DRI 1.0 journal of development provides a good starting list of expanded features

Slide 15: Recommendations
- Implement DRI
- Evaluate/use DRI:
  - Inquire with HW/SW vendors about availability
  - Evaluate implementations, provide feedback
- Focus application requirements:
  - Identify what DRI (and other middlewares in this space) must provide
  - Provide feedback to DRI and other HPEC forums
- Innovate with DRI:
  - Develop reference implementations
  - Consider enhancements:
    - C++ bindings
    - Integration with processing middleware

Slide 16: Expectations
- DRI will be an add-on library that works with MPI-1 (see the sketch below)
- DRI will be optimized by some library developers to be fast
- Both a reference implementation and optimized versions are underway
- Vendor-specific DRI versions that work with each vendor's own message passing will also be offered (e.g., Mercury PAS)
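As a minimal sketch of the add-on model, assuming a DRI implementation layered on MPI-1 in which DRI is initialized after MPI and finalized before it (the ordering and the dri.h header name are assumptions, not taken from the DRI specification):

/* Sketch: DRI as an add-on library over MPI-1. The "dri.h" header name and
 * the init/finalize ordering are assumptions. */
#include <mpi.h>
#include "dri.h"

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);     /* MPI-1 message passing substrate */
    DRI_Init(&argc, &argv);     /* DRI add-on library, as in the 2-D FFT example */

    /* ... create DRI objects, connect reorgs, move and process data ... */

    DRI_Finalize();
    MPI_Finalize();
    return 0;
}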