Process Migration Motivations: load sharing/balancing - migrate jobs to less utilized machines resource locality - migrate jobs to the machine where needed.

Slides:



Advertisements
Similar presentations
Providing Fault-tolerance for Parallel Programs on Grid (FT-MPICH) Heon Y. Yeom Distributed Computing Systems Lab. Seoul National University.
Advertisements

COS 461 Fall 1997 Workstation Clusters u replace big mainframe machines with a group of small cheap machines u get performance of big machines on the cost-curve.
Distributed System Structures Network Operating Systems –provide an environment where users can access remote resources through remote login or file transfer.
2. Computer Clusters for Scalable Parallel Computing
Distributed Process Management
Chapter 18 Distributed Process Management Dave Bremer Otago Polytechnic, N.Z. ©2008, Prentice Hall Operating Systems: Internals and Design Principles,
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Distributed Systems 1 Topics  What is a Distributed System?  Why Distributed Systems?  Examples of Distributed Systems  Distributed System Requirements.
Cache Coherent Distributed Shared Memory. Motivations Small processor count –SMP machines –Single shared memory with multiple processors interconnected.
Cluster Technologies University of Macedonia Thessaloniki - Greece Parallel & Distributed Processing Laboratory
Objektorienteret Middleware Presentation 2: Distributed Systems – A brush up, and relations to Middleware, Heterogeneity & Transparency.
Distributed components
Task Scheduling and Distribution System Saeed Mahameed, Hani Ayoub Electrical Engineering Department, Technion – Israel Institute of Technology
Example for Scheduling- Structures: Structured HPC Grids.
Scaling Distributed Machine Learning with the BASED ON THE PAPER AND PRESENTATION: SCALING DISTRIBUTED MACHINE LEARNING WITH THE PARAMETER SERVER – GOOGLE,
City University London
Coda file system: Disconnected operation By Wallis Chau May 7, 2003.
2/18/2004 Challenges in Building Internet Services February 18, 2004.
Distributed Process Management
NetSolve Henri Casanova and Jack Dongarra University of Tennessee and Oak Ridge National Laboratory
Overview Distributed vs. decentralized Why distributed databases
1 Introduction to Load Balancing: l Definition of Distributed systems. Collection of independent loosely coupled computing resources. l Load Balancing.
AgentOS: The Agent-based Distributed Operating System for Mobile Networks Salimol Thomas Department of Computer Science Illinois Institute of Technology,
PRASHANTHI NARAYAN NETTEM.
1 Distributed Systems: Distributed Process Management – Process Migration.
DISTRIBUTED COMPUTING
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 12 Slide 1 Distributed Systems Design 1.
07/14/08. 2 Points Introduction. Cluster and Supercomputers. Cluster Types and Advantages. Our Cluster. Cluster Performance. Cluster Computer for Basic.
Chapter 2 Computer Clusters Lecture 2.1 Overview.
Design and Implementation of a Single System Image Operating System for High Performance Computing on Clusters Christine MORIN PARIS project-team, IRISA/INRIA.
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
Ch4: Distributed Systems Architectures. Typically, system with several interconnected computers that do not share clock or memory. Motivation: tie together.
ATIF MEHMOOD MALIK KASHIF SIDDIQUE Improving dependability of Cloud Computing with Fault Tolerance and High Availability.
PMIT-6102 Advanced Database Systems
Module 14: Configuring Print Resources and Printing Pools.
PVM and MPI What is more preferable? Comparative analysis of PVM and MPI for the development of physical applications on parallel clusters Ekaterina Elts.
A brief overview about Distributed Systems Group A4 Chris Sun Bryan Maden Min Fang.
©Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 11Slide 1 Chapter 11 Distributed Systems Architectures.
Service Architecture of Grid Faults Diagnosis Expert System Based on Web Service Wang Mingzan, Zhang ziye Northeastern University, Shenyang, China.
HadoopDB Presenters: Serva rashidyan Somaie shahrokhi Aida parbale Spring 2012 azad university of sanandaj 1.
Young Suk Moon Chair: Dr. Hans-Peter Bischof Reader: Dr. Gregor von Laszewski Observer: Dr. Minseok Kwon 1.
Cluster Workstations. Recently the distinction between parallel and distributed computers has become blurred with the advent of the network of workstations.
Loosely Coupled Parallelism: Clusters. Context We have studied older archictures for loosely coupled parallelism, such as mesh’s, hypercubes etc, which.
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
N. GSU Slide 1 Chapter 05 Clustered Systems for Massive Parallelism N. Xiong Georgia State University.
Multiprossesors Systems.. What are Distributed Databases ? “ A Logically interrelated collection of shared data ( and a description of this data) physically.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
1 Process migration n why migrate processes n main concepts n PM design objectives n design issues n freezing and restarting a process n address space.
 Apache Airavata Architecture Overview Shameera Rathnayaka Graduate Assistant Science Gateways Group Indiana University 07/27/2015.
Presented By: Samreen Tahir Coda is a network file system and a descendent of the Andrew File System 2. It was designed to be: Highly Highly secure Available.
CCNA4 v3 Module 6 v3 CCNA 4 Module 6 JEOPARDY K. Martin.
By, Naga Manojna Chintapalli. CHAPTER 2.2 TRANSPARENCY.
Types of Operating Systems 1 Computer Engineering Department Distributed Systems Course Assoc. Prof. Dr. Ahmet Sayar Kocaeli University - Fall 2015.
Topic Distributed DBMS Database Management Systems Fall 2012 Presented by: Osama Ben Omran.
Design Issues of Prefetching Strategies for Heterogeneous Software DSM Author :Ssu-Hsuan Lu, Chien-Lung Chou, Kuang-Jui Wang, Hsiao-Hsi Wang, and Kuan-Ching.
Core Migration On SCC [keyword : Lookup Table, MPB] Chan Seok Kang 2013/06/19.
DISTRIBUTED COMPUTING
1 Distributed Processing Chapter 1 : Introduction.
CSC 480 Software Engineering Lecture 17 Nov 4, 2002.
Background Computer System Architectures Computer System Software.
Primitive Concepts of Distributed Systems Chapter 1.
Data-Centric Systems Lab. A Virtual Cloud Computing Provider for Mobile Devices Gonzalo Huerta-Canepa presenter 김영진.
Distributed Systems Architectures Chapter 12. Objectives  To explain the advantages and disadvantages of different distributed systems architectures.
Chapter 1: Introduction
Introduction to Load Balancing:
PREGEL Data Management in the Cloud
CSC 480 Software Engineering
Auburn University COMP7500 Advanced Operating Systems I/O-Aware Load Balancing Techniques (2) Dr. Xiao Qin Auburn University.
Process Migration Troy Cogburn and Gilbert Podell-Blume
Database System Architectures
Presentation transcript:

Process Migration Motivations: load sharing/balancing - migrate jobs to less utilized machines resource locality - migrate jobs to the machine where needed resources exist resource sharing - make distinctive resources available to all jobs fault tolerance - migrate jobs away from failing machines system administration - migrate jobs temporarily for administrative reasons mobility/ubiquitous computing - migrate jobs to follow the user Architectures: MPP - massively parallel processors/clusters NOW - networks of workstations

Steps in Migrating a Process kernel process x 1. Select process to migrate kernel process x 2. Detach process from source node kernel process x 3. Redirect communication 4. Extract state external communication kernel process x

Steps in Migrating a Process 5. Create new process at destination node kernel process x source nodedestination node 6. Transfer state kernel process x source nodedestination node 7. Create forwarding reference kernel source nodedestination node process x kernel source nodedestination node process x 8. Resume migrated process

Factors in Process Migration complexity - added functionality imposed on operating system performance - strongly related to state transfer transparency - visibility to the process of migration fault resilience - tolerance of the migrated task to failure of its source node scalability - sustained performance with increasing system size heterogeneity - ability to incorporate diverse architectures

Trade-offs Among Factors upper kernel lower kernel linker destination node migrated process upper kernel lower kernel linker source node transparency ++ complexity -- fault resilience -- Mosix

State Migration Strategies Eager (all) source node destination node source node destination node network paging store Eager (dirty)

State Migration Strategies source node destination node network paging store flushing source node destination node network paging store copy-on- reference

State Migration Strategies source node destination node precopy

Barriers to Use of Migration