Introducing Cooperative Parallelism
John May, David Jefferson, Nathan Barton, Rich Becker, Jarek Knap, Gary Kumfert, James Leek, John Tannahill
Lawrence Livermore National Laboratory
Presented to the CCA Forum, 25 Jan 2007

CASC. This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48. UCRL-PRES-XXXXXX.

Outline
- Challenges for massively parallel programming
- Cooperative parallel programming model
- Applications for cooperative parallelism
- Cooperative parallelism and Babel
- Ongoing work

Massive parallelism strains SPMD: new techniques are needed to fill the gap
- Increasingly difficult to make all processors work in lock-step
  - Lack of inherent parallelism
  - Load balance
- New techniques need a richer programming model than pure SPMD with MPI
  - Adaptive sampling
  - Multi-model simulation (e.g., components)
- Fault tolerance requires better process management
  - Need a smaller unit of granularity for failure recovery and checkpoint/restart

Introducing Cooperative Parallelism
[Diagram: runtime system managing parallel symponents that use MPI internally, with ad hoc symponent creation and communication]
- Computational job consists of multiple interacting "symponents"
  - Large parallel (MPI) jobs or single processes
  - Created and destroyed dynamically
  - Appear as objects to each other
  - Communicate through remote method invocation (RMI)
- Apps can add symponents incrementally
- Designed to complement MPI, not replace it!
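
The slides do not show the Co-op API itself; the Python fragment below is only an illustrative sketch, in which launch_symponent and SymponentHandle are invented stand-ins for whatever the real runtime provides, meant to convey the model in which parallel jobs appear to each other as objects driven through method calls.

    # Illustrative sketch only: launch_symponent and SymponentHandle are
    # hypothetical stand-ins, not the actual Co-op API.
    class SymponentHandle:
        """Stand-in for a remote symponent: a parallel (MPI) job or a single
        process that the caller sees as an ordinary object."""
        def __init__(self, name, nprocs):
            self.name, self.nprocs = name, nprocs

        def compute(self, work_item):
            # In the real system this call would be a remote method
            # invocation (RMI) into the running job.
            return f"{self.name} processed {work_item!r} on {self.nprocs} ranks"

        def shutdown(self):
            print(f"terminating {self.name}")

    def launch_symponent(executable, nprocs):
        """Hypothetical launch call: start a job and return a handle to it."""
        return SymponentHandle(executable, nprocs)

    # A driver creates symponents dynamically, calls into them like objects,
    # and destroys them when done; MPI still handles communication inside
    # each symponent.
    solver = launch_symponent("fine_scale_solver", nprocs=64)
    print(solver.compute({"cell": 17}))
    solver.shutdown()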

Cooperative parallelism features
- Three synchronization styles for RMI
  - Blocking (caller waits for return)
  - Nonblocking (caller checks later for result)
  - One-way (caller dispatches request and has no further interaction)
- Target of RMI can be a single process or a parallel job, with parameters distributed to all tasks
- Closely integrated with the Babel framework
  - Symponents written in C, C++, Fortran, F90, Java, and Python interact seamlessly
  - Developer writes interface description files to specify RMI interfaces
  - Exceptions propagated from remote methods
  - Object-oriented structure lets symponents inherit capabilities and interfaces from other symponents
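
The slide names the three synchronization styles but gives no syntax. Below is a minimal Python sketch of how the three styles behave from the caller's side, using the standard library's futures as a stand-in; the actual Babel RMI syntax is not shown in this presentation.

    # Sketch of the three RMI synchronization styles, with a thread pool
    # standing in for the Co-op runtime and a local function standing in
    # for a method on a remote symponent.
    from concurrent.futures import ThreadPoolExecutor

    def remote_method(x):
        return x * x

    pool = ThreadPoolExecutor()

    # 1. Blocking: caller waits for the return value.
    result = remote_method(3)

    # 2. Nonblocking: caller gets a ticket (future) and checks later.
    ticket = pool.submit(remote_method, 4)
    # ... caller overlaps other work here ...
    later = ticket.result()

    # 3. One-way: caller dispatches the request and never looks at a result.
    pool.submit(remote_method, 5)

    print(result, later)
    pool.shutdown(wait=True)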

Benefits of cooperative parallelism
- Easy subletting of work improves load balance
- Simple model for expressing task-based parallelism (rather than data parallelism)
- Nodes can be suballocated dynamically
- Dynamic management of symponent jobs supports fault tolerance
  - Caller notified of failing symponents; can re-launch
- Existing stand-alone applications can be modified and combined as discrete modules
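
The fault-tolerance benefit amounts to a retry loop around dynamic symponent launch: the caller learns that a symponent failed and launches a replacement. A toy sketch, where launch_worker is an invented placeholder for launching a symponent and waiting on its work:

    import random

    def launch_worker(task):
        """Invented placeholder: launch a symponent for one task; raise if
        the underlying job fails."""
        if random.random() < 0.3:           # simulate an occasional failure
            raise RuntimeError("symponent failed")
        return f"done: {task}"

    def run_with_relaunch(task, max_attempts=3):
        # Because symponents are created and destroyed dynamically, the
        # caller can re-launch on failure instead of aborting the whole run.
        for attempt in range(1, max_attempts + 1):
            try:
                return launch_worker(task)
            except RuntimeError:
                print(f"attempt {attempt} failed, re-launching")
        raise RuntimeError(f"gave up on {task!r}")

    print(run_with_relaunch("assemble block 7"))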

But what about MPI?
- Cooperative parallelism
  - Dynamic management of symponents
  - Components are opaque to each other
  - Communication is connectionless, ad hoc, interrupting, and point-to-point
- MPI and MPI-2
  - Mostly static process management (MPI-2 can spawn processes but not explicitly terminate them)
  - Tasks are typically highly coordinated
  - Communication is connection-oriented and either point-to-point or collective; MPI-2 supports remote memory access

Applications: Load balancing
[Diagram: MPI tasks doing well-balanced work, with a server proxy assigning unbalanced work to a pool of servers]
- Divide work into well-balanced and unbalanced parts
- Run balanced work as a regular MPI job
- Set up a pool of servers to handle unbalanced work
  - Server proxy assigns work to available servers
- Tasks with extra work can sublet it in parallel so they can catch up to less-busy tasks
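
A hedged sketch of this pattern: a task keeps its well-balanced work local and sublets the unbalanced remainder through a proxy to whichever servers are free. A thread pool stands in for the server proxy and server pool; in the real setting these would be separate symponents reached via RMI.

    from concurrent.futures import ThreadPoolExecutor

    def local_compute(item):
        return item * item                        # well-balanced work, kept local

    def heavy_compute(item):
        return sum(i * i for i in range(item))   # unbalanced work, worth subletting

    # Stand-in for the server proxy: it hands sublet work to available servers.
    server_pool = ThreadPoolExecutor(max_workers=4)

    work = list(range(20))
    balanced = [w for w in work if w < 15]
    unbalanced = [w for w in work if w >= 15]

    # Sublet the heavy items in parallel so this task can catch up to
    # less-busy tasks.
    tickets = [server_pool.submit(heavy_compute, w) for w in unbalanced]
    results = [local_compute(w) for w in balanced] + [t.result() for t in tickets]

    print(results)
    server_pool.shutdown()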

Applications: Adaptive sampling
[Diagram: coarse-scale model sending requests through a server proxy to fine-scale servers; plot of an unknown function showing previously computed values, a newly computed value, and interpolated values]
- Multiscale model, similar to AMR
  - BUT: can use different models at different scales
- Fine-scale computations requested from remote servers to improve load balance
- Initial results cached in a database
- Later computations check cached results and interpolate if accuracy is acceptable
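
The adaptive-sampling logic described here reduces to: look up previously computed fine-scale results, interpolate when nearby cached points make the accuracy acceptable, and otherwise request (and cache) a new fine-scale evaluation. A toy 1-D sketch, with an in-memory dictionary standing in for the database and a local function standing in for the remote fine-scale servers:

    import bisect

    cache = {}          # previously computed fine-scale results: x -> f(x)
    TOLERANCE = 1.0     # neighbor spacing below which interpolation is acceptable

    def fine_scale_model(x):
        """Stand-in for an expensive fine-scale computation on a remote server."""
        return x ** 3 - 2 * x

    def sample(x):
        xs = sorted(cache)
        i = bisect.bisect_left(xs, x)
        # Interpolate if cached neighbors bracket x closely enough.
        if 0 < i < len(xs) and xs[i] - xs[i - 1] <= TOLERANCE:
            x0, x1 = xs[i - 1], xs[i]
            w = (x - x0) / (x1 - x0)
            return (1 - w) * cache[x0] + w * cache[x1]
        # Otherwise request a new fine-scale evaluation and cache the result.
        cache[x] = fine_scale_model(x)
        return cache[x]

    for x in [0.0, 1.0, 0.4, 0.6, 3.0]:
        print(x, sample(x))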

Applications: Parameter studies
[Diagram: master process overseeing completed and active simulations]
- Master process launches multiple parallel components to complete a simulation, each with different parameters
- Existing simulation codes can be wrapped to form components
- Master can direct the study based on accumulated results, launching new components as others complete
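
A hedged sketch of the master side of a parameter study: it launches a few components with different parameter values and, as each completes, records the result and may launch a refined run. A thread pool and a trivial function stand in for wrapped parallel simulation codes launched as symponents.

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def simulation(pressure):
        """Stand-in for a wrapped, stand-alone simulation code run as a component."""
        return 2.0 * pressure

    MAX_RUNS = 6
    master = ThreadPoolExecutor(max_workers=3)
    pending = {master.submit(simulation, p): p for p in (1.0, 2.0, 4.0)}
    results, launched = {}, 3

    # As each component completes, the master records its result and may
    # direct the study by launching a new component at a refined value.
    while pending:
        done = next(as_completed(pending))
        p = pending.pop(done)
        results[p] = done.result()
        if results[p] < 5.0 and launched < MAX_RUNS:
            p_new = p + 0.5
            pending[master.submit(simulation, p_new)] = p_new
            launched += 1

    print(results)
    master.shutdown()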

Applications: Federated simulations
- Components modeling separate physical entities interact
- Potential example: ocean, atmosphere, sea ice
  - Each modeled in a separate job
  - Interactions communicated through RMI (N-by-M parallel RMI is future work)
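
A hedged sketch of the federated pattern, with two toy models exchanging boundary values each coupling step; in the real setting each model runs as its own parallel job and each exchange is an RMI (eventually N-by-M parallel RMI) rather than a local call.

    class Ocean:
        def __init__(self):
            self.sst = 15.0                 # sea-surface temperature
        def step(self, air_temp):
            self.sst += 0.1 * (air_temp - self.sst)
            return self.sst

    class Atmosphere:
        def __init__(self):
            self.air_temp = 20.0
        def step(self, sst):
            self.air_temp += 0.05 * (sst - self.air_temp)
            return self.air_temp

    ocean, atmosphere = Ocean(), Atmosphere()
    for step in range(3):
        # Each exchange here would be an RMI between separately running jobs.
        sst = ocean.step(atmosphere.air_temp)
        air = atmosphere.step(sst)
        print(f"step {step}: sst={sst:.2f}  air={air:.2f}")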

Cooperative parallelism and Babel
- Babel gives Co-op
  - Language interoperability, SIDL, object-oriented model
  - RMI, including exception handling
- Co-op adds
  - Symponent launch, monitoring, and termination
  - Motivation and resources for extending RMI
  - Patterns for developing task-parallel applications
- Babel and Cooperative Parallelism teams are colocated and share some staff

Status: Runtime software
- Prototype runtime software working on Xeon, Opteron, and Itanium systems
- Completed a 1360-CPU demonstration in September
- Beginning to do similar-scale runs on new platforms
- Planning to port to IBM systems this year
- Ongoing work to enhance software robustness and documentation

Status: Applications
- Ongoing experiments with a material modeling application
  - Demonstrated X speedups on >1000 processors using adaptive sampling
- Investigating use in a parameter study application
- In discussions with several other apps groups at LLNL
- Also looking for possible collaborations with groups outside LLNL

Contacts: John May, David Jefferson