Integrating Large-Scale Distributed and Parallel High Performance Computing (DPHPC) Applications Using a Component-Based Architecture

Nanbor Wang (1), Fang (Cherry) Liu (2), Paul Hamil (1), Stephen Tramer (1), Rooparoni Pundaleeka (1), Randall Bramley (2)
(1) Tech-X Corporation, Boulder, CO, U.S.A.
(2) Indiana University, Bloomington, IN, U.S.A.

Workshop on Component-Based High-Performance Computing, October 16, 2008, Karlsruhe, Germany
Work partially funded by the US Department of Energy, Office of Advanced Scientific Computing Research, Grant #DE-FG02-04ER84099
DPHPC Applications, CBHPC 2008, Nanbor Wang

Agenda
– Motivation and approach for Distributed and Parallel High-Performance Computing (DPHPC)
– Enabling distributed technologies
– Application development
Distributed and Parallel Component-Based Software Engineering Addresses Modern Needs of Scientific Computing

Motivating scenarios for Distributed and Parallel HPC (DPHPC):
– Integrate separately developed, established codes (FSP, climate modeling, space-weather modeling), where each component needs its own architecture
– Better utilize hardware with large CPU counts and combine the computing resources of multiple clusters/computing centers
– Enable parallel data streaming between a computing task and a post-processing task (no feedback to the solver)
– Integrate multiple parallel codes that use heterogeneous architectures

Existing component standards and frameworks were designed with enterprise applications in mind:
– No support for features important to HPC scientific applications, such as interoperability with scientific programming languages (Fortran) and parallel computing infrastructure (MPI)

CCA addresses the needs of HPC scientific applications: combustion modeling, global climate modeling, fusion and plasma simulations

Tasks:
– Explore various distributed technologies and approaches for DPHPC
– Enhance tool support for DPHPC, e.g., F2003 struct support (covered later in Stefan's talk)
Typical Parallel CCA Frameworks
– Support both SPMD and MPMD scenarios
– Stay out of the way of component parallelism
– Components handle parallel communication themselves

[Figure: processes P0–P3 organized as a single group (SCMD) or split into Group A and Group B (MCMD)]
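In an MCMD run, the framework's processes are partitioned into component groups (Group A and Group B in the figure above), and each component communicates only within its own group. A minimal Python sketch of that partitioning logic, standing in for an MPI communicator split (the function and group names are illustrative, not part of any CCA framework API):

```python
def split_into_groups(n_procs, group_sizes):
    """Partition process ranks 0..n_procs-1 into named component groups,
    the way an MCMD framework splits a global communicator."""
    if sum(group_sizes.values()) != n_procs:
        raise ValueError("group sizes must cover all processes exactly")
    groups, next_rank = {}, 0
    for name, size in group_sizes.items():
        # Each group gets a contiguous block of ranks.
        groups[name] = list(range(next_rank, next_rank + size))
        next_rank += size
    return groups

# Four processes P0..P3 split into two component groups, as in the figure.
groups = split_into_groups(4, {"A": 2, "B": 2})
```

In a real framework the same idea is expressed with MPI_Comm_split, so each component sees only its own communicator.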
An Illustration of a DPHPC Application
– Still supports conventional CCA component-managed parallelism
– Provides an additional framework-mediated distributed inter-component communication capability
– Related approaches: Cooperative Processing (LLNL), PaCO++ (INRIA), alternative MCMD

[Figure: processes P0–P3 on separate groups connected through middleware]
Agenda
– Motivation for Distributed and Parallel High-Performance Computing (DPHPC)
– Enabling distributed technologies
– Application development
Babel RMI Allows Multiple Implementations
– Babel generates mappings for remote invocations and has its own transfer protocol, the "Simple Protocol", implemented in C
– Thanks to Babel's open architecture and language interoperability, users can take advantage of various distributed technologies through third-party RMI libraries
– We have developed a CORBA protocol library for Babel RMI using TAO (version 1.5.1 or later):
  – The first third-party Babel RMI library
  – TAO is a C++-based CORBA middleware framework
  – This protocol is essentially a bridge between Babel and TAO

[Figure: Babel RMI client and server connected through the Babel RMI interface, over either the Simple Protocol or TAO IIOP]
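Babel dispatches a remote call through whichever protocol library is registered for the URL's scheme, which is what lets a third-party library such as the TAO-based one plug in behind the same RMI interface. A minimal Python sketch of that pluggable-protocol idea (the class names and registry shape here are illustrative assumptions, not Babel's actual API; only the URL schemes come from the slides):

```python
# Sketch of a pluggable RMI protocol registry, in the spirit of Babel RMI.
# ProtocolRegistry, SimpleProtocol, and TaoIiopProtocol are hypothetical names.

class ProtocolRegistry:
    """Maps a URL scheme (e.g. 'simplehandle', 'taoiiophandle') to a protocol."""
    def __init__(self):
        self._protocols = {}

    def register(self, scheme, protocol):
        self._protocols[scheme] = protocol

    def connect(self, url):
        # Pick the protocol implementation from the scheme prefix of the URL.
        scheme, _, rest = url.partition("://")
        if scheme not in self._protocols:
            raise ValueError(f"no protocol registered for scheme {scheme!r}")
        return self._protocols[scheme].connect(rest)

class SimpleProtocol:
    def connect(self, endpoint):
        return f"simple connection to {endpoint}"

class TaoIiopProtocol:
    def connect(self, endpoint):
        return f"IIOP connection to {endpoint}"

registry = ProtocolRegistry()
registry.register("simplehandle", SimpleProtocol())
registry.register("taoiiophandle", TaoIiopProtocol())

conn = registry.connect("taoiiophandle://quartic.txcorp.com:8081/1000")
```

The point of the design is that client code only ever sees the registry's `connect`; which wire protocol runs underneath is decided by the URL.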
Using CORBA in Babel RMI Allows CORBA and Babel Objects to Interoperate
– Goals:
  – Allow interoperability between existing CORBA and Babel objects
  – Retain the performance of the CORBA IIOP protocol
– Possible approaches for serialization:
  – Encapsulate the Babel Simple Protocol wire format in a block of binary data and transport it using CORBA (as an octet sequence)
  – Encapsulate Babel communications in CORBA Any objects (not pursued because of the inefficiency of Any)
  – Map Babel communications to the CORBA format directly (the adopted approach); CORBA uses the Common Data Representation (CDR) on the wire
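The trade-off between the first and third approaches can be shown with plain Python: wrapping an already-serialized message as an opaque byte blob pays for two encodings plus framing, while mapping the fields directly to a fixed wire layout (as CDR does for CORBA types) encodes them once. This is a toy illustration of the idea, not Babel's or CORBA's actual encodings:

```python
import pickle
import struct

# A sample "message": a double-precision complex value, like SIDL's dcomplex.
real, imag = 3.0, -1.5

# Approach 1: serialize with one format, then wrap the result as an opaque,
# length-prefixed binary blob (an octet sequence) for transport.
inner = pickle.dumps((real, imag))
wrapped = struct.pack("<I", len(inner)) + inner  # framing + inner payload

# Approach 3 (adopted): map the fields directly to a fixed wire layout,
# here two little-endian IEEE 754 doubles, analogous to a CDR struct.
direct = struct.pack("<dd", real, imag)

# The direct mapping has a fixed, minimal size; the wrapped form carries
# the inner serializer's overhead plus the framing on top.
decoded = struct.unpack("<dd", direct)
```

The same comparison explains the slide's conclusion: direct mapping keeps the IIOP path close to raw-socket cost, while the blob-in-a-blob approach adds avoidable overhead.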
Direct Conversions Between CORBA and Babel Types Enable Interoperability with Little Penalty

  module taoiiop {
    module rmi {
      exception ServerException {
        string info;
      };
      struct fcomplex {
        float real;
        float imaginary;
      };
      struct dcomplex {
        double real;
        double imaginary;
      };
      /* SIDL arrays are mapped to CORBA structs which keep all the
         metadata information; the array values are stored as a CORBA
         sequence following the metadata */
      typedef sequence<long> ArrayDims;  /* element type lost in extraction; long assumed */
      struct Array_Metadata {
        short ordering;
        short dims;
        ArrayDims stride;
        ArrayDims lower;
        ArrayDims upper;
      };
    };
  };

After optimization, TaoIIOP 2.0 has performance close to a raw socket
– Optimizations: made the CORBA-Babel mapping types native in TAO by implementing optimized, zero-copy marshaling and demarshaling support
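As a rough picture of what the Array_Metadata struct carries on the wire, the following Python sketch round-trips the same fields through a flat binary layout. This is illustrative only: real CDR has its own alignment and encoding rules, and the layout below is an assumption made for the example.

```python
import struct

def pack_array_metadata(ordering, dims, stride, lower, upper):
    """Pack array metadata: two shorts, then three length-prefixed
    sequences of 32-bit ints (a simplified stand-in for CDR sequences)."""
    out = struct.pack("<hh", ordering, dims)
    for seq in (stride, lower, upper):
        out += struct.pack("<I", len(seq))           # sequence length prefix
        out += struct.pack(f"<{len(seq)}i", *seq)    # sequence elements
    return out

def unpack_array_metadata(buf):
    ordering, dims = struct.unpack_from("<hh", buf, 0)
    offset = 4
    seqs = []
    for _ in range(3):
        (n,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        seqs.append(list(struct.unpack_from(f"<{n}i", buf, offset)))
        offset += 4 * n
    stride, lower, upper = seqs
    return ordering, dims, stride, lower, upper

# Round-trip the metadata of a hypothetical 2-D array.
blob = pack_array_metadata(0, 2, [1, 10], [0, 0], [9, 19])
meta = unpack_array_metadata(blob)
```

In the actual TaoIIOP library this conversion is done by TAO's generated marshaling code, with the zero-copy optimization mentioned above avoiding the intermediate buffer copies this sketch makes.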
Agenda
– Motivation for Distributed and Parallel High-Performance Computing (DPHPC)
– Enabling distributed technologies
– Application development
Leveraging Oneway and Asynchronous Calls to Increase Application Parallelism

[Figure: two timelines of a compute-bound task on a simulation cluster dumping data to a data-analysis task on a remote cluster, which signals completion; with synchronous invocations the compute-bound task blocks on each dump, while with asynchronous/oneway invocations it continues computing]
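The pattern the figure contrasts, blocking on a remote data dump versus firing it off one-way and continuing to compute, can be sketched with a thread pool standing in for the remote analysis cluster. This is illustrative only; Babel/TAO oneway calls achieve the same overlap at the middleware level rather than with local threads:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def analyze(data):
    """Stand-in for the remote data-analysis task."""
    time.sleep(0.05)  # pretend the analysis is expensive
    return sum(data)

def compute_step(step):
    """Stand-in for one step of the compute-bound simulation task."""
    return [step, step + 1, step + 2]

# Oneway-style: hand each data dump to the "remote cluster" (the pool)
# and keep computing instead of waiting for the analysis to finish.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = []
    for step in range(4):
        data = compute_step(step)
        futures.append(pool.submit(analyze, data))  # returns immediately
    results = [f.result() for f in futures]          # collect the signals at the end
```

With the synchronous pattern, each `analyze` call would sit between compute steps; with the oneway pattern above, the four dumps overlap with the remaining computation.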
Performance Comparison: TaoIIOP Async and Oneway Calls
– Figure shows the average time for each time step
– Very lightweight data analysis, so the emphasis is on transport cost
– A payload of 0 actually makes no remote invocation
– The Babel team is working on a new RMI implementation
VORPAL Is a Versatile Framework for Physics Simulations
– Highly flexible, arbitrary-dimension plasma and beam simulations using multiple models
– Utilizes both MPI and parallel I/O
– Uses a robust file to configure a simulation task, e.g.:

  numPhysCells = [NX, NY, NZ]
  length = [LX, LY, LZ]
  decompType = regular
  kind = yeeEmField
  kind = relBoris
  mass = ELEMACS
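A minimal sketch of reading such key/value configuration lines into a dictionary follows. It mimics only the flat excerpt above, not VORPAL's actual nested input-file grammar, and the accumulate-repeats behavior for keys like `kind` is an assumption made for the example:

```python
def parse_config(text):
    """Parse flat 'key = value' lines into a dict; repeated keys
    (e.g. several 'kind' entries) accumulate into a list."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in config:
            prev = config[key]
            config[key] = prev + [value] if isinstance(prev, list) else [prev, value]
        else:
            config[key] = value
    return config

cfg = parse_config("""
decompType = regular
kind = yeeEmField
kind = relBoris
""")
```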
Componentize VORPAL to Perform On-Demand Data Processing
DPHPC Application: Speed-Up for Online Data Analysis
– We developed a prototype that performs online data analysis as a proof of concept
– It runs in the same cluster as two groups of processors
– A ~20% speedup was observed
– More speedup is expected with more elaborate data processing
– We modified the VORPAL source code separately for this prototype
DPHPC Applications: Remote Monitoring/Steering of Simulations
– We have extended the VORPAL component framework to interact with a CCA framework through Babel RMI
– Configurable from VORPAL's initialization file:

  kind = historyKind
  kind = babelSender
  babelRmiURL = eclipse.txcorp.com:8081

– Supports specification of a URL group: a list of URLs running parallel tasks
– We are able to connect a running simulation to one or more workstations:
  – For online data processing/analysis
  – For monitoring the simulation
– Physicists are most interested in monitoring and steering

Sample output:
  VpBabelSender: connecting to taoiiophandle://quartic.txcorp.com:8081
  VpBabelSender: endpoint URL: taoiiophandle://quartic.txcorp.com:8081/1000
  VorpalClient constructor
  update 1 time=6.128014e-13
  update 2 time=1.225603e-12
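One natural way to use a URL group, a list of endpoints each running a parallel task, is to give every simulation rank an endpoint to report to. A hedged sketch of such an assignment (the round-robin policy and the function name are assumptions for illustration, not VORPAL's actual scheme; the URL format follows the sample output above):

```python
def assign_endpoints(n_ranks, url_group):
    """Map each parallel rank to an endpoint from the URL group,
    round-robin, so monitoring load is spread across workstations."""
    if not url_group:
        raise ValueError("URL group must not be empty")
    return {rank: url_group[rank % len(url_group)] for rank in range(n_ranks)}

# Hypothetical URL group of two monitoring workstations.
group = [
    "taoiiophandle://quartic.txcorp.com:8081/1000",
    "taoiiophandle://eclipse.txcorp.com:8081/1000",
]
mapping = assign_endpoints(4, group)
```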
Summary
– Implemented the distributed proxy components and the TaoIIOP Babel RMI protocol for connecting distributed CCA applications into an integrated system
– Conducted performance benchmarking on a preliminary prototype implementation (version 1.0) to identify the key optimizations needed
– Implemented those optimizations to minimize the overhead (version 2.0)
– Interoperability with CORBA can be achieved with little or no performance penalty
Summary and Future Directions
– Interoperability with CORBA can be achieved with little or no performance penalty
– Implement more scenarios mixing distributed and high-performance components, involving several clusters and real applications
– Synergy with MCMD
– Support for petascale HPC applications:
  – Remote monitoring/steering of large-scale simulations on supercomputers (e.g., Franklin)
  – Can take advantage of CORBA-Babel RMI interoperability for now and switch to TAOIIOP later