Integrating Large-Scale Distributed and Parallel High Performance Computing (DPHPC) Applications Using a Component-based Architecture Nanbor Wang 1, Fang.

Presentation transcript:

Integrating Large-Scale Distributed and Parallel High Performance Computing (DPHPC) Applications Using a Component-based Architecture
Nanbor Wang 1, Fang (Cherry) Liu 2, Paul Hamil 1, Stephen Tramer 1, Rooparoni Pundaleeka 1, Randall Bramley 2
1 Tech-X Corporation, Boulder, CO, U.S.A.
2 Indiana University, Bloomington, IN, U.S.A.
Workshop on Component-Based High-Performance Computing, October 16, 2008, Karlsruhe, Germany
Work partially funded by the US Department of Energy, Office of Advanced Scientific Computing Research, Grant #DE-FG02-04ER84099

DPHPC Applications, CBHPC 2008 Nanbor Wang 2
Agenda
Motivation and approach for Distributed and Parallel High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

Distributed and Parallel Component-Based Software Engineering Addresses Modern Needs of Scientific Computing
Motivating scenarios for Distributed and Parallel HPC (DPHPC):
 – Integrate separately developed and established codes (FSP, climate modeling, space weather modeling), each component needing its own architecture
 – Provide ways to better utilize hardware with high CPU counts and combine the computing resources of multiple clusters/computing centers
 – Enable parallel data streaming between a computing task and a post-processing task (no feedback to the solver)
 – Integrate multiple parallel codes using heterogeneous architectures
Existing component standards and frameworks were designed with enterprise applications in mind
 – No support for features that are important for HPC scientific applications: interoperability with scientific programming languages (FORTRAN) and parallel computing infrastructure (MPI)
CCA addresses the needs of HPC scientific applications: combustion modeling, global climate modeling, fusion and plasma simulations
Tasks
 – Explore various distributed technologies and approaches for DPHPC
 – Enhance tool support for DPHPC: F2003 struct support (covered later in Stefan’s talk)

Typical Parallel CCA Frameworks
Support both SPMD and MPMD scenarios
Stay out of the way of component parallelism
 – Components handle parallel communication
[Figure: processes P0, P1, P2, P3; Group A and Group B; SCMD and MCMD]
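The two layouts named on this slide can be sketched without MPI. The sketch below assumes the usual CCA reading of the terms: SCMD (single component, multiple data), where every process runs the same components, and MCMD (multiple component, multiple data), where processes are partitioned into groups that run different components. The function names, the component names and the way ranks are split into groups are all hypothetical; the slide does not specify them.

```python
def scmd_layout(ranks, components):
    """SCMD: every rank instantiates the same list of components."""
    return {rank: list(components) for rank in ranks}


def mcmd_layout(groups):
    """MCMD: each group of ranks runs its own list of components.

    `groups` maps a group name to a (ranks, components) pair.
    """
    layout = {}
    for name, (ranks, components) in groups.items():
        for rank in ranks:
            if rank in layout:
                raise ValueError(f"rank {rank} assigned to more than one group")
            layout[rank] = list(components)
    return layout


if __name__ == "__main__":
    ranks = [0, 1, 2, 3]  # P0..P3 as in the figure
    print(scmd_layout(ranks, ["solver", "analysis"]))
    # Hypothetical split: the slide does not say which ranks form each group.
    print(mcmd_layout({
        "A": ([0, 1], ["solver"]),
        "B": ([2, 3], ["analysis"]),
    }))
```

In a real framework the layout would typically be realized with MPI communicators rather than a dictionary; the dictionary only makes the rank-to-component assignment visible.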

An Illustration of DPHPC Application
[Figure: processes P0, P1, P2, P3 connected through middleware]
Still support conventional CCA component-managed parallelism
Provide additional framework-mediated distributed inter-component communication capability
Cooperative Processing – LLNL
PACO++ – INRIA
Alternative MCMD

Agenda
Motivation for Distributed and Parallel High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

Babel RMI Allows Multiple Implementations
[Figure: Babel RMI client and Babel RMI server connected through the Babel RMI interface over Simple Protocol]
Babel generates the mapping for remote invocations and has its own transfer protocol, “Simple Protocol”, implemented in C
Thanks to Babel’s open architecture and language interoperability, users can take advantage of various distributed technologies through third-party RMI libraries
We have developed a CORBA protocol library for Babel RMI using TAO (version or later)
 – The first third-party Babel RMI library
 – TAO is a C++-based CORBA middleware framework
 – This protocol is essentially a bridge between Babel and TAO
[Figure: Babel RMI client and Babel RMI server connected through the Babel RMI interface over TAOIIOP and TAO]

Using CORBA in Babel RMI Allows CORBA and Babel Objects to Interoperate
Goals
 – Allow interoperability between existing CORBA and Babel objects
 – Retain the performance of the CORBA IIOP protocol
Possible approaches for serialization
 – Encapsulate the Babel Simple Protocol wire format in a block of binary data and transport it using CORBA (as an octet sequence)
 – Encapsulate Babel communications in CORBA Any objects (not pursued because of the inefficiency of Any)
 – Map Babel communications to the CORBA format directly (the adopted approach). CORBA uses Common Data Representation (CDR) on the wire.

Direct Conversions Between CORBA & Babel Types Enable Interoperability with Little Penalty
    module taoiiop {
      module rmi {
        exception ServerException { string info; };
        struct fcomplex { float real; float imaginary; };
        struct dcomplex { double real; double imaginary; };
        /* SIDL arrays are mapped to CORBA structs that keep all the
           metadata; the array values are stored as a CORBA sequence
           following the metadata */
        typedef sequence<long> ArrayDims;  // element type missing in this transcript; long is assumed
        struct Array_Metadata {
          short ordering;
          short dims;
          ArrayDims stride;
          ArrayDims lower;
          ArrayDims upper;
        };
      };
    };
TaoIIOP 2.0 has performance close to a raw socket
 – Optimizations: made the CORBA-Babel mapping types native in TAO by implementing optimized, zero-copy marshaling and demarshaling support
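As a rough illustration of the Array_Metadata idea (metadata first, then a flat sequence of values), the following Python sketch flattens a 2-D array and recovers an element from the flat values using only the metadata. It is not the TaoIIOP marshaling code. The ordering code `ROW_MAJOR = 0`, the use of element (not byte) strides, inclusive upper bounds, and the helper names `pack_2d` and `element` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

ROW_MAJOR = 0  # hypothetical ordering code, not taken from the slide


@dataclass
class ArrayMetadata:
    ordering: int
    dims: int
    stride: List[int]
    lower: List[int]
    upper: List[int]


def pack_2d(rows, lower=(0, 0)):
    """Flatten a rectangular list of lists into (metadata, values), row-major."""
    n_rows, n_cols = len(rows), len(rows[0])
    if any(len(r) != n_cols for r in rows):
        raise ValueError("array must be rectangular")
    meta = ArrayMetadata(
        ordering=ROW_MAJOR,
        dims=2,
        stride=[n_cols, 1],  # element strides for row-major storage
        lower=list(lower),
        upper=[lower[0] + n_rows - 1, lower[1] + n_cols - 1],  # inclusive
    )
    values = [v for row in rows for v in row]
    return meta, values


def element(meta, values, index):
    """Look up one element from the flat values using only the metadata."""
    offset = sum((i - lo) * s for i, lo, s in zip(index, meta.lower, meta.stride))
    return values[offset]


if __name__ == "__main__":
    meta, values = pack_2d([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    print(meta)
    print(element(meta, values, (1, 2)))
```

Keeping the bounds and strides beside the values lets the receiver rebuild the array's shape and indexing without any out-of-band information, which is the point of sending the metadata ahead of the value sequence.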

Agenda
Motivation for Distributed and Parallel High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

Leveraging Oneway and Asynchronous Calls to Increase Application Parallelism
[Figure: timelines of compute-bound tasks, data dumps, data analysis and signals on the simulation cluster and the remote cluster, comparing synchronous invocations with asynchronous/oneway invocations]
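The contrast between synchronous and asynchronous/oneway invocations can be sketched in plain Python. This is only an analogy: a worker thread stands in for the remote cluster, and `compute_step`, `analyze` and the two `run_*` functions are hypothetical names, not part of Babel RMI or TaoIIOP.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_step(step):
    """Stand-in for a compute-bound simulation step."""
    return [step * 10 + i for i in range(4)]


def analyze(data):
    """Stand-in for the remote data analysis."""
    return sum(data)


def run_synchronous(n_steps):
    """Each step waits for its analysis before the next step starts."""
    results = []
    for step in range(n_steps):
        data = compute_step(step)
        results.append(analyze(data))  # blocks the simulation
    return results


def run_asynchronous(n_steps):
    """Analysis is handed off; the simulation continues immediately."""
    pending = []
    with ThreadPoolExecutor(max_workers=1) as remote:
        for step in range(n_steps):
            data = compute_step(step)
            pending.append(remote.submit(analyze, data))  # returns at once
        # Collect results only after all steps have been issued.
        return [f.result() for f in pending]


if __name__ == "__main__":
    print(run_synchronous(3))
    print(run_asynchronous(3))
```

Both functions produce the same results; the difference is that in the second the simulation loop never waits on the analysis. A true oneway call would not return a result at all, so the future here is closer to the asynchronous case.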

Performance Comparison – TaoIIOP Async and Oneway Calls
Figure shows the average time for each time step
Very lightweight data analysis: the emphasis is on transport cost
A payload of 0 actually makes no remote invocation
The Babel team is working on a new RMI implementation

VORPAL is a Versatile Framework for Physics Simulations
Highly flexible, arbitrary-dimension
Plasma and beam simulations using multiple models
Utilizes both MPI and parallel I/O
Use of a robust file to configure a simulation task
    numPhysCells [NX, NY, NZ]
    length [LX, LY, LZ]
    decompType=regular
    kind=yeeEmField
    kind=relBoris
    mass=ELEMACS

Componentize VORPAL to Perform On-demand Data Processing

DPHPC Application – Speed-up for On-line Data Analysis
We developed a prototype that performs online data analysis as a proof of concept
It runs in the same cluster as two groups of processors
~20% speed-up was observed
More speed-up with elaborate data processing
We modified the VORPAL source code separately for this prototype

DPHPC Applications – Remote Monitoring/Steering of Simulations
We have extended the Vorpal component framework to interact with the CCA framework through Babel RMI
Configurable from Vorpal’s initialization file:
    kind = historyKind
    kind = babelSender
    babelRmiURL = eclipse.txcorp.com:8081
Supports specification of a URL group: a list of URLs running parallel tasks
We are able to connect a running simulation to one or multiple workstations
 – For online data processing/analysis
 – For monitoring the simulation
Physicists are most interested in
 – Monitoring
 – Steering
    VpBabelSender: connecting to taoiiophandle://quartic.txcorp.com:8081
    VpBabelSender: endpoint URL: taoiiophandle://quartic.txcorp.com:8081/1000
    VorpalClient constructor
    update 1 time= e-13
    update 2 time= e-12
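The endpoint URL printed by VpBabelSender has an ordinary scheme://host:port/path shape, so it can be taken apart with Python's standard library. The sketch below is illustrative only: `parse_endpoint` is a hypothetical helper, and treating the trailing `1000` as an object identifier is an assumption, since the slide does not say what that component means.

```python
from urllib.parse import urlsplit


def parse_endpoint(url):
    """Split an endpoint URL into scheme, host, port and trailing path part."""
    parts = urlsplit(url)
    if not parts.scheme or parts.hostname is None or parts.port is None:
        raise ValueError(f"not a scheme://host:port URL: {url!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # Assumed to identify the remote object; the slide does not say.
        "object_id": parts.path.lstrip("/") or None,
    }


if __name__ == "__main__":
    print(parse_endpoint("taoiiophandle://quartic.txcorp.com:8081/1000"))
```

A URL group, as mentioned on the slide, could then be handled by applying the same function to each URL in a list.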

Summary
Implemented the distributed proxy components and the TaoIIOP Babel RMI protocol for connecting distributed CCA applications into an integrated system
Conducted performance benchmarking on the preliminary prototype implementation (version 1.0) to identify the key optimizations needed
Implemented the optimizations to minimize the overhead (version 2.0)
Interoperability with CORBA can be achieved with little/no performance penalty

Summary and Future Directions
Interoperability with CORBA can be achieved with little/no performance penalty
Implement more scenarios of mixing distributed and high performance components involving several clusters and real applications
Synergy with MCMD
Support for petascale HPC applications
 – Remote monitoring/steering of large-scale simulations on supercomputers (e.g., Franklin)
 – Can take advantage of CORBA-Babel RMI interoperability for now and switch to TAOIIOP later