InterCell foundations with ParXXL: Render Large Scale Computations Interactive. Jens Gustedt, INRIA Nancy – Grand Est, AlGorille; Stéphane Vialle, Supélec Metz.

Presentation transcript:

Slide 1: InterCell foundations with ParXXL: Render Large Scale Computations Interactive. Jens Gustedt, INRIA Nancy – Grand Est, AlGorille; Stéphane Vialle, Supélec, Metz campus.

Slide 2: Goals. Provide a programming paradigm and an interface that offers:
- Simplicity: easy to use, close to existing programming habits.
- Performance: competitive, responsive, easy to evaluate.
- Interoperability: close to known standards; copes with heterogeneity of computing hardware, operating systems, and administrative domains.

Slide 3: (figure only; no transcript text)

Slide 4: Problem 1: Communication Latency. Round-trip time > 0.13 s, hence a synchronization frequency < 7.5 Hz.
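With at most one global synchronization per network round trip, the achievable update rate is bounded by the round-trip time:

    f_sync < 1 / RTT = 1 / 0.13 s ≈ 7.7 Hz

The slide's 7.5 Hz is slightly below this raw bound; reading the difference as headroom for processing on top of pure communication is our assumption.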

Slide 5: Overruling Latency.

Slide 6: Problem 2: Data Management. Access? Management?

Slide 7: Delegation of responsibilities.

Slide 8: Isolation in well-defined parts: control and communication.

Slide 9: Two commonly used paradigms.
- Message Passing: standardization (MPI); efficiency in distributed environments; simplified data control. Drawbacks: memory blowup, extra copies.
- Shared Memory: standardization (pthread, shm_open); efficiency on parallel machines; random access during computation. Drawbacks: latency problems, data consistency problems.
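To make the contrast concrete, here is a minimal C++ sketch of the two paradigms, assuming a standard MPI installation and POSIX shared memory; the helper names ping_pong and map_segment are ours, not from any library.

    // Illustrative sketch only. Message passing copies the payload between
    // address spaces (the "extra copies" above); shared memory maps one
    // segment into every process, leaving consistency to the programmer.
    #include <mpi.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <cstddef>
    #include <vector>

    // Message passing: an explicit round trip between ranks 0 and 1.
    void ping_pong(int rank) {
        std::vector<double> buf(1 << 17);           // 1 MiB payload
        int n = static_cast<int>(buf.size());
        if (rank == 0) {
            MPI_Send(buf.data(), n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf.data(), n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf.data(), n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf.data(), n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }

    // Shared memory: every process that maps `name` sees the same bytes,
    // with no copy -- but nothing here enforces data consistency.
    double* map_segment(const char* name, std::size_t n) {
        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd < 0) return nullptr;
        if (ftruncate(fd, n * sizeof(double)) < 0) { close(fd); return nullptr; }
        void* p = mmap(nullptr, n * sizeof(double),
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                                  // the mapping survives
        return p == MAP_FAILED ? nullptr : static_cast<double*>(p);
    }

The memory blowup of message passing shows up as soon as the same data is needed on several ranks: each rank keeps its own copy of the buffer.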

Slide 10: parXXL unifies these paradigms.
- Low level: computational efficiency (distributed and/or parallel); simplified data control; memory efficiency (data size); transfer efficiency.
- High level: abstraction for cellular computation; implementation of commonly used cell networks.
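What the "abstraction for cellular computation" hides can be pictured with a short sketch. This is not the parXXL API; the Cell type and step function below are hypothetical, written in plain C++ to show the shape of the computation the library layer manages:

    #include <cstddef>
    #include <vector>

    // A hypothetical cell in a cell network: a value plus the indices of
    // its neighbours.
    struct Cell {
        double value = 0.0;
        std::vector<std::size_t> neighbours;
    };

    // One relaxation step: every cell averages its neighbourhood.
    // `next` must have net.size() elements. The point of the abstraction is
    // that this loop reads the same whether neighbours are in local shared
    // memory or reachable only through messages.
    void step(std::vector<Cell>& net, std::vector<double>& next) {
        for (std::size_t i = 0; i < net.size(); ++i) {
            double sum = net[i].value;
            for (std::size_t j : net[i].neighbours) sum += net[j].value;
            next[i] = sum / (1.0 + net[i].neighbours.size());
        }
        for (std::size_t i = 0; i < net.size(); ++i) net[i].value = next[i];
    }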

Slide 11: parXXL achievements 2008, low level.
- Packaging and portability.
- Code stability: size-up (fills the whole InterCell cluster); robustness against transient failures during long runs.
- Transfer efficiency: data collectors that gather distinguished data from the cells; mixed communication models (shared memory, files & messages).
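The "data collectors" item admits a small illustration. The sketch below is hypothetical (it reuses the Cell type from the previous sketch, and the threshold criterion is an invented example of "distinguished" data): rather than transferring the whole network, only the cells matching the criterion are gathered.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Gather only the distinguished cells (here: value above a threshold)
    // as (index, value) pairs, instead of shipping the whole network.
    std::vector<std::pair<std::size_t, double>>
    collect(const std::vector<Cell>& net, double threshold) {
        std::vector<std::pair<std::size_t, double>> out;
        for (std::size_t i = 0; i < net.size(); ++i)
            if (net[i].value > threshold)
                out.emplace_back(i, net[i].value);
        return out;
    }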

Slide 12: parXXL achievements 2008, continued: high level.
- Cellular computation: find a compromise between synchronized and de-synchronized cell computation; develop new modes of asynchronous computation (cont.).
- Implementation of commonly used cell networks: optimize the parallel deployment to avoid scaling problems (cont.).
- ... changed lines of code (Sep 07 – Sep 08).
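The synchronized/de-synchronized compromise can be made concrete with one more sketch, again not parXXL code: the two runners below differ only in the barrier (std::barrier is C++20; update is a caller-supplied per-worker step). The barrier guarantees that every worker sees the same generation, at the price of one global synchronization per step, which is exactly the latency bound of slide 4; dropping it keeps the system responsive but requires the computation to tolerate stale neighbour values.

    #include <barrier>       // C++20
    #include <functional>
    #include <thread>
    #include <vector>

    // Synchronized mode: all workers finish step t before any starts t+1.
    void run_synchronized(int workers, int steps,
                          const std::function<void(int, int)>& update) {
        std::barrier<> sync(workers);
        std::vector<std::thread> pool;
        for (int w = 0; w < workers; ++w)
            pool.emplace_back([&, w] {
                for (int t = 0; t < steps; ++t) {
                    update(w, t);            // compute this worker's cells
                    sync.arrive_and_wait();  // global synchronization point
                }
            });
        for (auto& th : pool) th.join();
    }

    // De-synchronized mode: no barrier; workers free-run and may read
    // values from another generation (asynchronous relaxation).
    void run_desynchronized(int workers, int steps,
                            const std::function<void(int, int)>& update) {
        std::vector<std::thread> pool;
        for (int w = 0; w < workers; ++w)
            pool.emplace_back([&, w] {
                for (int t = 0; t < steps; ++t) update(w, t);
            });
        for (auto& th : pool) th.join();
    }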