Checkpoint & Restart for Distributed Components in XCAT3 Sriram Krishnan* Indiana University, San Diego Supercomputer Center & Dennis Gannon Indiana University

Long-running Distributed Applications on the Grid
The Problem:
1. Launch simulation at Y
2. Launch simulation at Z
3. Link both simulations
4. Execute both simulations
5. Store results at X
(Diagram: simulations at Y and Z and storage at X, connected across the Grid)
Need an effective way to orchestrate such computations

Checkpoint & Restart
Motivation
- Basic fault tolerance via periodic checkpointing
- Rollback to saved checkpoint upon failure
- Dynamic rescheduling of jobs
- Checkpoint and restart on another location
Checkpointing Goals
- Correctness
- Portability
- Minimal checkpoint size
- Scalability
- Interoperability
- Checkpoint availability

Outline
- Motivation
- Background
  - The XCAT3 framework
  - Checkpoint & Restart
- Checkpointing & Restart in XCAT3
  - Software techniques
  - Algorithms
  - Experiments
- Conclusions & future work

Application Orchestration: Component Architectures
A component architecture consists of two parts:
- Components: software objects that implement a set of required behaviors
- Frameworks: a runtime environment and a set of services used by components
Benefits
- Encapsulation, modular construction of programs (via composition), reuse
Component architectures adopted in various domains
- Business: EJB, CCM, COM/DCOM
- Scientific computing: CCA

Common Component Architecture
- A ComponentID for identification and management purposes
- Ports: the public interfaces of a component
  - Define the different ways we can interact with a component, and the ways the component uses other services and components
- Provides ports: interfaces to functions provided by the component
- Uses ports: interfaces of services used by the component
(Diagram: an Image Processing Component providing setImage(Image), Image getImage(), adjustColor(), setFilter(Filter), and calling doFFT(...) through a uses port)
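To make the provides/uses distinction concrete, here is a rough, self-contained Java sketch. The interfaces and class below are invented for illustration and do not reproduce the actual CCA (gov.cca) or XCAT3 API.

```java
// Illustrative sketch only: these types stand in for the real CCA port
// interfaces, whose exact signatures are not shown in the slides.

// A provides port: the public interface this component exposes.
interface ImageProcessingPort {
    void setImage(byte[] image);
    byte[] getImage();
    void adjustColor();
}

// A uses port: the interface of a service this component calls.
interface FFTPort {
    double[] doFFT(double[] samples);
}

// The component implements its provides ports and holds references
// (uses ports) that the framework wires up at composition time.
class ImageProcessingComponent implements ImageProcessingPort {
    private byte[] image;
    private FFTPort fft;              // uses port, connected by the framework

    public void connectFFTPort(FFTPort port) { this.fft = port; }

    public void setImage(byte[] image) { this.image = image; }
    public byte[] getImage()           { return image; }
    public void adjustColor()          { /* image processing elided */ }
}
```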

XCAT3: CCA Framework for the Grid
- Grid Service Extensions (GSX) toolkit used for OGSI-compatible Grid services
  - Standard protocols used by Grid services: SOAP, HTTP
- A component is represented as a set of Grid services
  - Provides ports and ComponentIDs are Grid services
  - Uses ports are Grid service clients
- Sriram Krishnan and Dennis Gannon. XCAT3: A Framework for CCA Components as OGSA Services. In HIPS 2004, 9th International Workshop on High-Level Parallel Programming Models and Supportive Environments, April 2004.

Checkpointing: Software Techniques
- System-level techniques: automatic, transparent checkpointing for an application at the operating system or middleware level
- User-defined techniques: non-transparent checkpointing for an application that relies on the programmer to identify the minimal information needed for restart

Checkpointing: Software Techniques Transparent to the user: No expertise required Not very portable across platforms Larger checkpoint sizes: Typically complete process images stored Less flexible: Application is treated as a black box Not transparent to the user: Considerable expertise required More portable across platforms Smaller checkpoint sizes: Only minimal state stored More flexible: Application information can be used System-LevelUser-defined

Checkpointing: Examples
System-level techniques
- Condor
- LAM-MPI
- Enterprise Java Beans
- CORBA Components
User-defined techniques
- CUMULVS
- Enterprise Java Beans
- CORBA Components
Global Grid Forum: Grid Checkpoint/Recovery Group
- User-defined checkpointing APIs for Grid services
- Do not address consistent global checkpoints for distributed applications
  - A consistent global checkpoint is a set of individual checkpoints that constitutes a state that occurs in a failure-free, correct execution

Checkpointing Technique in XCAT3
User-defined & system-assisted
- The user is responsible for identifying local component state
- The framework is responsible for:
  - Generating the complete state of the component, viz. local component state, connection state, and environment state
  - Algorithms for generating global component states and storing them in stable storage
The component writer implements the following methods (see the sketch below):
- generateComponentState()
- loadComponentState()
- resumeExecution()
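A minimal sketch of what a component writer might supply. The slides name the three methods but not their signatures, so the Checkpointable interface, the parameter and return types, and the WorkerComponent class below are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the three methods come from the slides, the
// signatures are assumed.
interface Checkpointable {
    Map<String, String> generateComponentState();        // capture local state
    void loadComponentState(Map<String, String> state);  // restore local state
    void resumeExecution();                               // restart control threads
}

class WorkerComponent implements Checkpointable {
    private int iteration;
    private volatile boolean running;

    public Map<String, String> generateComponentState() {
        // Only the minimal, user-identified state is saved (user-defined,
        // system-assisted checkpointing): here, just a loop counter.
        Map<String, String> state = new HashMap<>();
        state.put("iteration", Integer.toString(iteration));
        return state;
    }

    public void loadComponentState(Map<String, String> state) {
        iteration = Integer.parseInt(state.get("iteration"));
    }

    public void resumeExecution() {
        running = true;
        new Thread(this::computeLoop).start();
    }

    private void computeLoop() {
        while (running) { iteration++; /* simulation step elided */ }
    }
}
```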

Distributed Checkpointing Algorithm
Overview: coordinated blocking checkpoint algorithm
- Block all port communication between components
- Take individual checkpoints, and commit them atomically
- Resume port communication between components
Novelty: application to an RPC-based component framework
- Typically, such algorithms are applied to messaging frameworks

The Big Picture
(Diagram: distributed components X, Y, Z on the Grid, an Application Coordinator, and persistent storage provided by a federation of Master (MS) and Individual Storage (IS) services)

Checkpoint Algorithm (step 1): checkpoint components

Checkpoint Algorithm (step 2): block all port communication between components

Checkpoint Algorithm (step 3): all communication between components is blocked

Checkpoint Algorithm (step 4): find the best available Storage service URLs

Checkpoint Algorithm (step 5): store checkpoints into the Storage services

Checkpoint Algorithm (step 6): return storageIDs for the stored state

Checkpoint Algorithm (step 7): atomically update the locators for the individual checkpoints

Checkpoint Algorithm (step 8): unblock communication between components
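Putting the steps above together, the Application Coordinator's side of the coordinated blocking checkpoint might look roughly as follows. All of the interfaces here (ComponentHandle, MasterStorageService, IndividualStorageService) are stand-ins invented for this sketch, not the XCAT3 API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stand-in interfaces for the sketch.
interface ComponentHandle {
    void blockPorts();                        // stop accepting port calls
    void unblockPorts();
    Map<String, String> generateComponentState();
}

interface MasterStorageService {
    String findStorageServiceUrl();           // best available IS for a component
    // Atomically replace the locators for the whole global checkpoint.
    void commitGlobalCheckpoint(List<String> storageIds);
}

interface IndividualStorageService {
    String store(Map<String, String> checkpoint);  // returns a storageID
}

class ApplicationCoordinator {
    void checkpoint(List<ComponentHandle> components,
                    MasterStorageService master,
                    Function<String, IndividualStorageService> connect) {
        // 1. Block all port communication so the global state is consistent.
        for (ComponentHandle c : components) c.blockPorts();
        try {
            List<String> storageIds = new ArrayList<>();
            for (ComponentHandle c : components) {
                // 2. Pick a storage service and store the individual checkpoint.
                IndividualStorageService is = connect.apply(master.findStorageServiceUrl());
                storageIds.add(is.store(c.generateComponentState()));
            }
            // 3. Commit all locators atomically: either the whole new global
            //    checkpoint becomes current, or the old one remains valid.
            master.commitGlobalCheckpoint(storageIds);
        } finally {
            // 4. Resume port communication whether or not the checkpoint succeeded.
            for (ComponentHandle c : components) c.unblockPorts();
        }
    }
}
```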

Checkpointing: Correctness
Consistency of the global checkpoint
- A flavor of coordinated blocking algorithms, which are well accepted to be correct
Atomicity of checkpoints
- Locators for the global checkpoint are updated atomically after all components have been checkpointed
- It is not possible for a global checkpoint to consist of a combination of old and new individual checkpoints

Restart Algorithm
Also implemented by the Application Coordinator
Details (see the sketch below)
- Destroy executing instances, if need be
- Restart all components (possibly on other resources)
- Load the state of the components from the Storage services
- Resume execution of all control threads after the state of every component has been loaded from the Storage services
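A corresponding restart sketch, again using invented stand-in types rather than the real XCAT3 interfaces. The key point is that every component's state is loaded before any control thread is resumed; launching the new instances on other resources is elided.

```java
import java.util.List;
import java.util.Map;

// Invented stand-ins for the sketch.
interface RestartableComponent {
    void destroy();                                  // kill the old instance, if any
    void loadComponentState(Map<String, String> s);  // restore saved state
    void resumeExecution();                          // restart control threads
}

interface StorageClient {
    Map<String, String> fetchCheckpoint(String storageId);
}

class RestartCoordinator {
    void restart(List<RestartableComponent> oldInstances,
                 List<RestartableComponent> newInstances,
                 List<String> storageIds,
                 StorageClient storage) {
        // 1. Destroy executing instances, if any survive.
        for (RestartableComponent c : oldInstances) c.destroy();

        // 2. Load every component's state before resuming anything, so no
        //    component issues port calls against a partially restored peer.
        for (int i = 0; i < newInstances.size(); i++) {
            newInstances.get(i).loadComponentState(
                storage.fetchCheckpoint(storageIds.get(i)));
        }

        // 3. Resume all control threads only after every state is loaded.
        for (RestartableComponent c : newInstances) c.resumeExecution();
    }
}
```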

Test Application: Chem-Eng Simulation
Based on a simulation of copper electro-deposition on a resistive substrate (NCSA-UIUC)
- Master-worker model of execution
- Variable number of workers and variable data size per worker
The generateComponentState(), loadComponentState(), and resumeExecution() methods were added to support checkpointing and restart
- Required identifying the various execution states of the master and worker components

Experiment Setup
Hardware setup
- 8-node Linux cluster
- Dual 2.8 GHz Intel Xeon processors per node
- Red Hat Linux 8.0
- 2 GB memory
- 1 Gbps Ethernet
- Sun JDK 1.4.2_04
Federation of 1 Master and 8 Individual Storage services used
Single GSX-based Handle Resolver

Checkpointing: Master Processing

Checkpointing: Workers Processing

Future Work
Framework
- Integration with the Web Services Resource Framework (WSRF)
Fault tolerance
- Fault monitoring
- Reliable communication between components
- Checkpoint optimizations
- Storage service optimizations
Applications
- Use of XCAT3 for LEAD

Conclusions
A framework for checkpointing & restart of distributed applications on the Grid
- A CCA-based component framework consistent with Grid standards
- User-defined, platform-independent checkpoints
- APIs for checkpointing, and algorithms for capturing global checkpoints and for restart, provided by the framework

Appendix

OGSI Compatibility
Representation of Provides ports
- In traditional Grid/Web services, multiple ports of the same portType are semantically equivalent
- CCA allows multiple ports of the same type
  - CCA ports cannot be mapped directly to Web service ports
- Hence, every Provides port is mapped to a separate Grid service
  - A single portType containing the Provides port interface
Representation of Uses ports
- Clients of Grid services (Provides ports)
- Connections to Provides ports are made at runtime

OGSI Compatibility
Representation of the ComponentID
- Also a Grid service
- Acts as a manager for the other Provides ports
- Contains SDEs (Service Data Elements) holding GSHs/GSRs for the various Provides ports
The Provides port and ComponentID services, and the Uses ports, communicate via shared state

Building Applications by Composition
Connect Uses ports to Provides ports.
(Diagram: an image tool graphical interface component and an image database component connected to the Image Processing Component via its setImage(...), getImage(), and adjustColor() ports; the Image Processing Component in turn connected to an Acme FFT component through doFFT(...))
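Composition then amounts to handing a reference to one component's provides port to another component's uses port. A toy, self-contained illustration follows; the types and wiring call are hypothetical and do not reflect the XCAT3 builder API.

```java
// Illustrative types only (hypothetical, echoing the earlier port sketch).
interface ImagePort { byte[] getImage(); }

class ImageDatabaseComponent implements ImagePort {      // provides ImagePort
    public byte[] getImage() { return new byte[0]; }
}

class ImageToolComponent {                               // uses ImagePort
    private ImagePort imagePort;
    void connectImagePort(ImagePort p) { this.imagePort = p; }
    void refresh() { byte[] img = imagePort.getImage(); /* render elided */ }
}

class Composer {
    public static void main(String[] args) {
        ImageDatabaseComponent db = new ImageDatabaseComponent();
        ImageToolComponent tool = new ImageToolComponent();
        tool.connectImagePort(db);   // connect uses port -> provides port
        tool.refresh();
    }
}
```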

Restart Algorithm

Test Application: Chem-Eng Simulation