Update on ULFM Fault Tolerance Working Group
MPI Forum, San Jose, CA, December 2014

Outline
- Proposal reorganization
- RMA updates
- Dynamic processes
- Papers
- Implementations / testing

Reorganization
- Reorganizing the ticket to split out RMA & Files
- Important reasons
  - The RMA section is still in enough flux that it is not ready for the immediate future
  - The communicator section is essentially ready to go and has not changed in a while (pending fixes in dynamic processes)
- Practical reasons
  - Every time we need to make a minor change in the RMA section, we re-read the entire ticket
  - Splitting cuts down on reading time

RMA Updates
- Mostly updated the section to account for locally shared memory
- Clarify the state of data after failures
- Clarify the state of memory when freeing a window
- The purpose is to stabilize communication, not data (just like communicators), as illustrated below
- Data can be protected using external mechanisms (SCR, GVR, FTI, etc.)
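To make "stabilize communication, not data" concrete, here is a minimal sketch of an RMA access that survives a peer failure by attaching MPI_ERRORS_RETURN to the window and inspecting the error class. MPIX_ERR_PROC_FAILED is the class used by the current ULFM prototypes; the name in an eventual standard text may differ.

```c
#include <mpi.h>

/* Sketch: keep communicating on a window after a peer fails, by having RMA
 * errors returned instead of aborting.  MPIX_ERR_PROC_FAILED is the error
 * class used by the ULFM prototypes; a final standard may name it
 * differently. */
static int get_with_failure_check(MPI_Win win, double *buf, int count,
                                  int target)
{
    int rc, rc2, eclass;

    /* Errors on this window are reported to the caller, not fatal. */
    MPI_Win_set_errhandler(win, MPI_ERRORS_RETURN);

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    rc  = MPI_Get(buf, count, MPI_DOUBLE, target, 0, count, MPI_DOUBLE, win);
    rc2 = MPI_Win_unlock(target, win);
    if (rc == MPI_SUCCESS)
        rc = rc2;

    if (rc != MPI_SUCCESS) {
        MPI_Error_class(rc, &eclass);
        if (eclass == MPIX_ERR_PROC_FAILED) {
            /* Communication on the window is stabilized, but the data the
             * window exposes is now undefined; recovering it is the
             * application's job (e.g. via SCR, GVR, or FTI). */
        }
    }
    return rc;
}
```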

Data Status
- After a failure, all data in the window becomes undefined
  - An implementation might do better
- Data also becomes undefined in overlapping windows, even ones that saw no failure
  - This includes memory locally shared with other processes
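Since everything the window exposes is undefined after a failure, the application (not MPI) has to rebuild it. A minimal sketch, assuming the application keeps a shadow copy of the window contents; in practice a library such as SCR, GVR, or FTI would play the role of `shadow`.

```c
#include <mpi.h>
#include <string.h>

/* Sketch: after a detected failure, treat everything the window exposes
 * (including overlapping and locally shared segments) as garbage and rebuild
 * it from a copy the application protected itself.  'shadow' stands in for
 * whatever external mechanism (SCR, GVR, FTI, ...) holds that copy. */
static void restore_window_data(MPI_Win win, int myrank, double *win_base,
                                const double *shadow, size_t count)
{
    /* Exclusive lock on our own rank so the rewrite is ordered with respect
     * to any accesses from surviving peers. */
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, myrank, 0, win);
    memcpy(win_base, shadow, count * sizeof(double));
    MPI_Win_unlock(myrank, win);
}
```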

Memory Reuse
- There are cases where it may not be possible to reuse memory after a failure
  - Some networks can protect your data from late-arriving messages; some cannot
- It may not be possible to reuse memory that was exposed to a failure, even after freeing the window
- MPI will provide a window attribute to let the user know whether the memory can be reused (sketched below)
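A sketch of how the proposed query might look from user code. The keyval name MPIX_WIN_MEMORY_REUSABLE is hypothetical; the ticket only says that some window attribute will carry this information.

```c
#include <mpi.h>
#include <stdlib.h>

/* MPIX_WIN_MEMORY_REUSABLE is a hypothetical keyval name used only for
 * illustration; the proposal says an attribute will exist but does not
 * name it yet. */
#ifndef MPIX_WIN_MEMORY_REUSABLE
#define MPIX_WIN_MEMORY_REUSABLE MPI_KEYVAL_INVALID
#endif

/* Decide whether a user buffer that backed a window can be handed back to
 * the allocator, or must be quarantined because the network might still
 * deposit late-arriving writes into it after a failure. */
static void free_window_and_buffer(MPI_Win *win, void *buf)
{
    int *reusable = NULL;
    int flag = 0;

    if (MPIX_WIN_MEMORY_REUSABLE != MPI_KEYVAL_INVALID)
        MPI_Win_get_attr(*win, MPIX_WIN_MEMORY_REUSABLE, &reusable, &flag);

    MPI_Win_free(win);

    if (flag && reusable && *reusable)
        free(buf);   /* safe: no late traffic can land in this region */
    /* otherwise leak or quarantine the buffer rather than recycling it */
}
```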

Dynamic Processes
- Reworking this section
- Need to ensure there is a way to "validate" a new communicator (see the sketch below)
- May require stronger synchronization semantics for SPAWN, CONNECT/ACCEPT, etc.
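To illustrate why validation is needed: after MPI_Comm_spawn, surviving parents have no consistent view of whether the spawn succeeded everywhere if a failure hit mid-operation. The sketch below builds such a check in user code with the agreement operation from the current ULFM prototypes (MPIX_Comm_agree); it illustrates the problem, not the semantics the working group will propose.

```c
#include <mpi.h>

/* Sketch: validate a communicator produced by MPI_Comm_spawn by agreeing,
 * across the parent communicator, on whether the operation succeeded
 * everywhere.  MPIX_Comm_agree is the agreement call from the ULFM
 * prototypes; the eventual standardized interface may differ. */
static int spawn_and_validate(const char *cmd, int nprocs, MPI_Comm parents,
                              MPI_Comm *intercomm)
{
    int rc, ok;

    MPI_Comm_set_errhandler(parents, MPI_ERRORS_RETURN);
    rc = MPI_Comm_spawn(cmd, MPI_ARGV_NULL, nprocs, MPI_INFO_NULL,
                        0 /* root */, parents, intercomm,
                        MPI_ERRCODES_IGNORE);

    /* Agreement ANDs the flags of all (surviving) parents, so 'ok' ends up
     * 1 only if every parent saw the spawn complete successfully. */
    ok = (rc == MPI_SUCCESS);
    MPIX_Comm_agree(parents, &ok);

    if (!ok) {
        if (rc == MPI_SUCCESS)
            MPI_Comm_disconnect(intercomm);  /* someone else failed: back out */
        return MPI_ERR_OTHER;
    }
    return MPI_SUCCESS;
}
```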

Papers / Projects
- Programming models: Falanx
- Resilience libraries: Fenix (SC), message logging (ICA3PP), LFLR (EuroMPI/ASIA)
- Applications: PDE solvers (IPDPS workshops), multi-level Monte Carlo (PARCO)
- Evaluations / discussions: lots and lots

Implementations
- MPICH: experimental support added in v3.2a2
- Open MPI: available in a branch from UTK
  - Improved algorithms added recently
  - Working on bringing it up to date with the master branch
- Simulator: developed by Christian Engelmann and Thomas Naughton
  - Evaluates performance at large scale

Testing
- Repository holding a common collection of tests
- Used to validate all ULFM implementations
- Currently housed at ORNL, but will be moved to GitHub soon