Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France.

Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France

2 IRISA INRIA Rennes CNRS Univ. of Rennes I INSA 600 people (march 06) 233 researchers 184 PhD students 120 Engineers, technicians

3 Project team composition Tenured personnel (11) F. André (Prof IFSIC) G. Antoniu (CR INRIA) J-P. Banâtre (Prof IFSIC) L. Bougé (Prof ENS) Y. Jégou (CR INRIA) D. Margery (IR INRIA) 50% C. Morin (DR INRIA) P. Morillon (IE IFSIC) J.L. Pazat (Prof. INSA) C. Perez (CR INRIA) T. Priol (DR INRIA) Post-docs (5) PhD students (11) Engineers (6)

4 PARIS objectives Objects of study: Cluster and Cluster of Clusters (aka Grids) Main objectives: Study and design operating systems, middleware and runtime systems to make the programming of such computing infrastructures easier with: High-performance High-availability Scalability Design advanced programming models for the programming of Clusters of Clusters Combining both parallel and distributed computing paradigms Evaluation of the proposed operating systems, runtimes and middleware through the development of advanced software (not only software prototype) Technology transfer through collaboration with industrial partners

5 Research activities Operating System and Runtime systems for Clusters and Cluster Federations Single System Image Operating System Grid-aware Operating System Middleware for Computational Grids Component-based middleware for computational Grids Communication framework Parallel components Deployment of parallel components within a grid Adaptive components Large-scale data management for Grids Coupling of Distributed Shared Memories Data sharing services for mutable data using a P2P approach Advanced Models for the Grids High-order Gamma Enactment of workflows based on a chemical metaphor Experimental Grid Infrastructures The Grid’5000 testbed

6 Orsay 1000 (684) Rennes 518 (658) Bordeaux 500 (96) Toulouse 500 (116) Lyon 500 (252) Grenoble 500 (270) Sophia Antipolis 500 (434) Lille 500 (106) Nancy: 500 (94) Grid’5000 testbed 10 Gbps Dark fiber Dedicated Lambda Fully isolated traffic! Provided by RENATER

7 Component-based middleware for computational Grids Apply modern software practices to scientific computing High-performance & Scalability Specialized component models for high performance computing (clusters & grids) Code coupling applications Parametric applications Increase the level of abstraction SPMD paradigm (MxN communications) Master-worker paradigm Data sharing paradigm High-performance communication Independence vis à vis of the networking technologies Adaptation Adaptation to the dynamic behavior of grids Deployment Map components to available resources Technology independent (CCM, CCA, Fractal, CoreGRID GCM) SPMD components Master-Slaves component ThermalDynamics Structural Mechanics Optics

8 High Performance Components for code coupling: SPMD paradigm SPMD component Parallelism is a non-functional property of a component It is an implementation issue Collection of sequential components SPMD execution model Support of distributed arguments API for data redistribution API for communication scheduling w.r.t. network properties Support of parallel exceptions Object Request Broker CORBA stub/skeleton Communication Library ( MPI) Application Application view management - Data distribution description Communication management - Comm. Matrix computation - Comm. Matrix scheduling - Communication execution Redistribution Library 1 Communication Library GridCCM runtime Scheduling Library Component AComponent B

9 High Performance Components for parametric codes: Master-worker paradigm Collection of components Simple model for application developers A request is delivered to an instance of the collection The selection of collection instance is delegated to a request transport policy Enable the use of existing MW environments (DIET, ….) Resources infrastructure independence No dealing with the number of workers No dealing with request delivery concerns Valid for ADL and non ADL based component models Preliminary results Fluid motion estimation Experiment on Grid’5000 4003 components 974 processors 7 sites Speedup of 213 with a round robin pattern Round-Robin Programmer view binding master worker Exposed provided port RR Proxy Set of “request delivery” patterns Number of workers & policy pattern selection Resources Abstract ADL Concrete ADL XML collection definition worker

10 PadicoTM: Communication framework Provides an open integration platform to combine various communication middleware and runtimes Message based runtimes (MPI, PVM, …) DSM-based runtimes (TreadMarks, …) RPC and RMI based middleware (DCE, CORBA, Java, …) Allows several communication middleware and runtime to share various networking technologies Ethernet, Myrinet, Infiniband, Quadrics, SCI Last but not least: get the maximum performance of the network!  Available as an open source software under the GPL licence http://padico.gforge.inria.fr > 200 downloads since July 2002 Madeleine Portability across networks Marcel I/O aware multi-threading MyrinetSCI PadicoTM Core PadicoTM Services Multithreading Networks DSMJVMMPICORBA JXTA TCP Personality Layer Internal engine Mpich OmniORB MICO Orbacus Orbix/E KaffeJuxMem Mome

11 ADAGE: Automatic Deployment of Application in a Grid Environment Deploy a same application on any kind of resources from clusters to grids Support multi-middleware application MPI+CORBA+JUXMEM+... Planner as plugin round robin & random Some successes 29.000 JXTA peers on ~400 nodes 4003 components on 974 processors on 7 sites Alpha support for dynamic application MPI Application Description CCM Application Description Resource Description Generic Application Description Control Parameters Deployment Planning Deployment Plan Execution Application Configuration

Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France.

Similar presentations

Presentation on theme: "Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France.

Similar presentations

Presentation on theme: "Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application Christian Perez INRIA researcher IRISA Rennes, France."— Presentation transcript:

Similar presentations

About project

Feedback