CoreGRID Summer School, Bonn, July 24th-28th 2006
HPC4U: Realizing SLA-aware Resource Management
Simon Alexandre, CETIC, Charleroi, Belgium
Matthias Hovestadt, University of Paderborn, Germany

Topics
Motivation
Architecture of an SLA-aware RMS
Phases of Operation
SLA-aware Scheduling
Cross-border Migration
Summary

Grid Computing Today
What do Grids look like today?
Grids are in use, but…
– … commercial usage is rare and limited: only isolated applications
– … they are mostly used as prototype solutions in research: testbeds within research projects
Problem: no contractually fixed QoS levels for deadline-bound, business-critical jobs

What is an SLA?
A Service Level Agreement (SLA) is a contract between provider and customer
– describes all obligations and expectations
– flexible formulation for each use case
SLAs are a focus of research in Grid middleware
Service Level Agreement terms:
– R-Type: HW, OS, compiler, software packages, …
– R-Quantity: number of CPUs, main memory, …
– R-Quality: CPU > 2 GHz, network bandwidth, …
– Deadline: date, time, …
– Policies: demands on security and privacy, …
– Price for resource consumption (fulfilled SLA)
– Penalty fee in case of SLA violation
– Contract parties, responsible persons
– ID or description of the SLA (name, context)
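The listed terms can be pictured as a simple data structure. The following is a minimal Python sketch; the field names are illustrative and not taken from an actual HPC4U or WS-Agreement schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List

@dataclass
class SLA:
    """Illustrative container for the SLA terms listed above."""
    sla_id: str                                # ID or description of the SLA
    parties: List[str]                         # contract parties, responsible persons
    r_type: Dict[str, str]                     # e.g. {"os": "Linux", "compiler": "gcc"}
    r_quantity: Dict[str, int]                 # e.g. {"cpus": 3, "memory_gb": 8}
    r_quality: Dict[str, str]                  # e.g. {"cpu": ">2GHz", "bandwidth": "1Gbit/s"}
    deadline: datetime                         # date and time the results are due
    policies: List[str] = field(default_factory=list)  # security and privacy demands
    price: float = 0.0                         # price for resource consumption (fulfilled SLA)
    penalty: float = 0.0                       # penalty fee in case of SLA violation

# Example: the request used later in the scheduling slides
example = SLA(
    sla_id="demo-job-1",
    parties=["customer", "provider"],
    r_type={"os": "Linux"},
    r_quantity={"cpus": 3},
    r_quality={"cpu": ">2GHz"},
    deadline=datetime(2006, 7, 25, 6, 0),
)
```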

The Gap between Grid and RMS
The user asks for a Service Level Agreement
The Grid middleware realizes the job by means of local RMS systems (machines M1, M2, M3)
BUT: these RMSs only offer best effort, not reliability or quality of service
Goal: an SLA-aware RMS
– runtime responsibility
– reliability (fault tolerance)
Guaranteed!

Demands on an SLA-aware RMS
Negotiation
– active negotiation with upper layers
– accept a new job only if its SLA can be fulfilled
System management
– taking the terms of SLAs into account
– allocation of nodes according to SLAs
Fault tolerance
– ensure the terms of SLAs also in case of failures
– mechanisms for failure handling

Topics
Motivation
Architecture of an SLA-aware RMS
Phases of Operation
SLA-aware Scheduling
Cross-border Migration
Summary

SLA-aware RMS
Central component
– interface to the Grid middleware for SLA negotiation
– interfaces to the subsystems for provision of FT
Tasks
– SLA negotiation
– policies (security, …)
– monitoring
– FT (checkpoints, migration)
– open interfaces

Process Subsystem
Concept: virtual bubble
– virtualization of resources: virtual network devices, virtual process IDs, …
– the application runs in a virtual environment with only minimal impact on job runtime
Checkpoint of the entire virtual bubble
– no re-linking necessary, so also applicable to commercial applications
Restart of a checkpointed virtual bubble
– compatibility has to be ensured
– the application does not detect the restart

Network Subsystem
Provision of FT also for parallel jobs
– communication between nodes: a checkpoint of the network state is necessary
– ensuring consistency between process and network state at restart
Network checkpointing
– checkpoint of network queues
– checkpoint of in-transit packets
Cooperative Checkpoint Protocol (CCP)
– direct communication between process checkpointing and network checkpointing

Storage Subsystem
Tasks of the storage subsystem
– storage-related QoS
– checkpointing of storage
Overall consistency of the checkpoint
– restoring the state of storage at process checkpointing time
– checkpoint = process + network + storage
Storage checkpoints may be huge
– Problem: delay until restart on a remote resource (Grid migration over slow WAN connections)
– Solution: data replication with COW (copy-on-write), i.e. precautionary data transfer to the remote resource

Generation of a new Checkpoint
[Sequence between RMS, checkpointing (CP), network, and storage subsystems:]
1. Checkpoint the job and halt it
2. Capture in-transit packets
3. Return: checkpoint completed
4. Take a storage snapshot
5. Link the checkpoint to the snapshot
6. Resume the job
7. Job is running again
8. Migration is possible from the last checkpoint
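Read as pseudo-code, this sequence could be driven by the RMS roughly as follows. This is a minimal Python sketch; the subsystem interfaces (ProcessCP, Network, Storage) are hypothetical stand-ins for the real HPC4U components.

```python
class ProcessCP:
    def checkpoint_and_halt(self, job): print(f"1. checkpointing and halting {job}")
    def resume(self, job): print(f"6. resuming {job}")

class Network:
    def capture_in_transit_packets(self, job): print("2. capturing in-transit packets")

class Storage:
    def snapshot(self, job): print("4. taking storage snapshot"); return f"snap-{job}"

def generate_checkpoint(job, cp, net, sto):
    """RMS-driven checkpoint sequence, mirroring steps 1-8 above."""
    cp.checkpoint_and_halt(job)                  # 1. CP job + halt
    net.capture_in_transit_packets(job)          # 2. in-transit packets
    print("3. checkpoint completed")             # 3. return to the RMS
    snap = sto.snapshot(job)                     # 4. snapshot of the storage partition
    checkpoint = {"job": job, "snapshot": snap}  # 5. link the checkpoint to the snapshot
    cp.resume(job)                               # 6./7. job is running again
    return checkpoint                            # 8. migration can start from this checkpoint

generate_checkpoint("demo-job-1", ProcessCP(), Network(), Storage())
```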

Checkpointing
Backup of a consistent image of the running job
– running process
– network state (in case of parallel jobs)
– storage partition
Process checkpointing causes a delay in job completion
– depends on number of jobs, memory, interconnect, …
The delay has to be taken into account at job scheduling:
partition size = estimated runtime + checkpointing overhead
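A back-of-the-envelope version of this rule, assuming a fixed per-checkpoint delay and a 60-minute checkpoint interval as in the later scheduling examples (both values are illustrative):

```python
def partition_length(estimated_runtime_h, interval_h=1.0, delay_per_cp_h=0.05):
    """Reserved wall-clock time = estimated runtime + checkpointing overhead."""
    n_checkpoints = int(estimated_runtime_h // interval_h)
    return estimated_runtime_h + n_checkpoints * delay_per_cp_h

print(partition_length(7.0))  # 7h of work plus the delay of 7 checkpoints
```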

Topics
Motivation
Architecture of an SLA-aware RMS
Phases of Operation
SLA-aware Scheduling
Cross-border Migration
Summary

Phases of Operation
Negotiation of the SLA
Pre-runtime: configuration of resources (e.g. network, storage, compute nodes)
Runtime: stage-in, computation, stage-out
Post-runtime: re-configuration

Negotiation Phase
Negotiation: the Grid customer and provider try to agree on a Service Level Agreement
– which resources have to be provided?
– which QoS level is required? (e.g. specification of a deadline)
The RMS is in a central position
– steering of the negotiation process
– the current system condition has to be taken into account

Pre-Runtime Phase
Task of the pre-runtime phase: configuration of all allocated resources
– Goal: fulfill the requirements of the SLA
Reconfiguration affects all system elements
– Resource Management System: e.g. configuration of the assigned compute nodes
– Storage subsystem: e.g. initialization of a new data partition
– Network subsystem: e.g. configuration of the network infrastructure

Runtime Phase
Lifetime of the job in the system
– adherence to the SLA has to be assured
– FT mechanisms have to be utilized
The phase consists of three distinct steps (see the sketch below)
– Stage-in: transmission of the required input data from the Grid customer to the compute resource
– Computation: execution of the application
– Stage-out: transmission of the generated output data from the compute resource back to the Grid customer
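The three steps could be wrapped by the RMS roughly as in this sketch. The stage_in, compute_step, stage_out, and checkpoint callables are hypothetical; the 60-minute interval matches the later scheduling examples.

```python
import time

def run_job(stage_in, compute_step, stage_out, checkpoint, interval_s=3600):
    """Sketch of the runtime phase: stage-in, computation with regular checkpoints, stage-out."""
    stage_in()                        # transfer input data from the Grid customer
    last_cp = time.time()
    while not compute_step():         # compute_step() returns True once the application is done
        if time.time() - last_cp >= interval_s:
            checkpoint()              # regular checkpoint (process + network + storage)
            last_cp = time.time()
    stage_out()                       # transfer output data back to the Grid customer

# Toy usage: three compute steps, a checkpoint after each one
steps = iter(range(3))
run_job(lambda: print("stage-in"),
        lambda: next(steps, None) is None,
        lambda: print("stage-out"),
        lambda: print("checkpoint"),
        interval_s=0)
```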

Post-Runtime Phase
Task of the post-runtime phase: re-configuration of all resources
– e.g. re-configuration of the network
– e.g. deletion of checkpoint datasets
– e.g. deletion of temporary data
Counterpart to the pre-runtime phase
– allocation of resources ends
– update of schedules in RMS and storage
– resources become available for new jobs

Topics
Motivation
Architecture of an SLA-aware RMS
Phases of Operation
SLA-aware Scheduling
Cross-border Migration
Summary

Negotiation of a new SLA
Incoming SLA request
– 3 nodes, 7h runtime, earliest start 20:00, deadline 6:00
– the request can be accepted, with a buffer time frame of 2h
Regular checkpointing
– a new checkpoint is generated every 60 minutes
– checkpointing causes a delay in job completion (depends on the CP system and the job size)
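As a rough illustration of the admission decision, the buffer is the slack between the request's time window and its runtime. This toy calculation ignores existing reservations and the checkpointing delay, which the 2h figure above additionally reflects; the dates are only used to place 20:00 and 6:00 on consecutive days.

```python
from datetime import datetime, timedelta

def admission_buffer(earliest_start, deadline, runtime, cp_overhead=timedelta(0)):
    """Slack left in the window if the job starts as early as possible."""
    window = deadline - earliest_start
    return window - (runtime + cp_overhead)

buffer = admission_buffer(
    earliest_start=datetime(2006, 7, 24, 20, 0),
    deadline=datetime(2006, 7, 25, 6, 0),
    runtime=timedelta(hours=7),
)
print("accept" if buffer >= timedelta(0) else "reject", buffer)
```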

Suspending Jobs
Valuable resources may be blocked by non-SLA jobs
Example:
– 23:00: SLA request for 3 nodes, 7 hours, deadline 6:00
– insufficient capacity: the new SLA request would have to be rejected
– instead: checkpoint and suspend of non-SLA jobs (best effort only)
– acceptance of the request and execution of the SLA job
– resume of the suspended job
– completion of the best-effort job, completion of the SLA job
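A minimal sketch of the suspend decision, assuming each running job is tagged with whether it is SLA-bound (the job records and node counts are illustrative):

```python
def free_nodes_by_suspending(running_jobs, nodes_total, nodes_needed):
    """Checkpoint and suspend best-effort (non-SLA) jobs until enough nodes are free."""
    free = nodes_total - sum(j["nodes"] for j in running_jobs)
    suspended = []
    for job in list(running_jobs):
        if free >= nodes_needed:
            break
        if not job["sla"]:                 # only best-effort jobs may be preempted here
            suspended.append(job)          # checkpoint + suspend would happen at this point
            running_jobs.remove(job)
            free += job["nodes"]
    return (free >= nodes_needed), suspended

jobs = [{"name": "be-1", "nodes": 2, "sla": False},
        {"name": "sla-1", "nodes": 2, "sla": True}]
ok, susp = free_nodes_by_suspending(jobs, nodes_total=5, nodes_needed=3)
print(ok, [j["name"] for j in susp])   # True ['be-1']
```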

Increasing System Utilization
Jobs request a number of nodes and a runtime
– users do not align their requests to free capacities
Reservations must be guaranteed
– no other complete job fits into the gaps
Use of job suspend to exploit gaps for partial job execution
– realization of background jobs

Runtime of an SLA Job
Pre-runtime phase
– configuration of network, storage, and nodes
Runtime phase
– monitoring of the system
– regular checkpointing
Post-runtime phase

Handling of Resource Failures
Resource outage in the partition of a job
– the job crashes immediately
– last checkpoint taken after 4h of runtime: the computation time since the last checkpoint is lost
Recovery:
– allocation of a new partition with 3h runtime (the remaining part of the job)
– restore from the last checkpointed state
– scheduling of regular checkpoint intervals
– resuming the computation
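The recovery arithmetic as a tiny sketch; the 7h total runtime is taken from the earlier SLA request and is otherwise an assumption.

```python
from datetime import timedelta

def remaining_partition(total_runtime, last_checkpoint_progress):
    """Runtime still needed when restoring from the last checkpoint."""
    return total_runtime - last_checkpoint_progress

remaining = remaining_partition(timedelta(hours=7), timedelta(hours=4))
print(remaining)   # 3:00:00 -> allocate a 3h partition, restore, resume
```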

Availability of Spare Resources
Migration presumes the availability of resources
– but: resources may be blocked by other jobs
Solution: suspension of other jobs
Problem: what to do if SLA jobs are blocking the resources?
– an SLA job can only be suspended if its deadline can still be held (see the sketch below)
Buffer nodes: execution of non-SLA jobs only
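The deadline condition for suspending an SLA job might be checked roughly like this; the times and durations are illustrative, not taken from the slide.

```python
from datetime import datetime, timedelta

def can_suspend_sla_job(now, deadline, remaining_runtime, suspend_duration):
    """An SLA job may only be suspended if it can still finish before its deadline."""
    return now + suspend_duration + remaining_runtime <= deadline

print(can_suspend_sla_job(
    now=datetime(2006, 7, 24, 23, 0),
    deadline=datetime(2006, 7, 25, 6, 0),
    remaining_runtime=timedelta(hours=4),
    suspend_duration=timedelta(hours=2),
))   # True: 23:00 + 2h suspension + 4h remaining work = 5:00 <= 6:00
```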

Topics
Motivation
Architecture of an SLA-aware RMS
Phases of Operation
SLA-aware Scheduling
Cross-border Migration
Summary

Cross-Border Migration
Goal: successful execution of SLA jobs
– handling of failures depends on the local load situation
– goal of the provider: utilization of resources
– high load + massive failure → no migration possible → SLA violation
Idea: cross-border migration (see the sketch below)
– usage of resources on other local machines (multiple clusters are available on most sites)
– transfer of the checkpoint dataset to a remote cluster
– resume of the job from the checkpointed state
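Target selection among the site's local clusters could look roughly like this sketch; the cluster records and their fields are hypothetical.

```python
def pick_migration_target(clusters, nodes_needed, remaining_runtime_h):
    """Pick a local cluster that can still run the job from its last checkpoint."""
    for c in clusters:
        if c["free_nodes"] >= nodes_needed and c["free_hours"] >= remaining_runtime_h:
            return c["name"]
    return None   # no local cluster fits -> consider Grid migration (next slide)

clusters = [{"name": "cluster-A", "free_nodes": 1, "free_hours": 8},
            {"name": "cluster-B", "free_nodes": 4, "free_hours": 5}]
print(pick_migration_target(clusters, nodes_needed=3, remaining_runtime_h=3))  # cluster-B
```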

Grid Migration
Cross-border migration enhances the FT level
– additional alternatives for the migration process
Grid migration = usage of the Grid as migration target
– Virtual Resource Manager as active Grid component
– negotiation with the Grid for resources
Migration process
– request for spare resources
– transfer using standard protocols
Transparent for the user
– the user will receive the results from the new site
Problem: compatibility of resources

Compatibility Profile
A checkpoint dataset needs compatible resources for restart:
– processor architecture
– main and storage memory
– interconnect type
– libraries: exact version for loaded libs, compatible version for unloaded libs
– paths
The compatibility profile describes the requirements of checkpointed jobs
– resource query according to this profile (see the sketch below)
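The resource query could match a checkpoint's compatibility profile against candidate resources roughly as follows. The field names are illustrative, and only the exact-version check for loaded libraries is modeled.

```python
def matches(profile, resource):
    """True if the resource satisfies every requirement of the compatibility profile."""
    return (resource["arch"] == profile["arch"]
            and resource["memory_gb"] >= profile["memory_gb"]
            and resource["interconnect"] == profile["interconnect"]
            and all(resource["libs"].get(lib) == ver          # exact version for loaded libs
                    for lib, ver in profile["loaded_libs"].items()))

profile = {"arch": "x86_64", "memory_gb": 8, "interconnect": "infiniband",
           "loaded_libs": {"libmpi": "1.2"}}
resource = {"arch": "x86_64", "memory_gb": 16, "interconnect": "infiniband",
            "libs": {"libmpi": "1.2", "libc": "2.3"}}
print(matches(profile, resource))   # True
```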

Grid Integration
[Architecture figure: Grid customers, Grid middleware, Grid interface, the HPC4U RMS, and the Grid fabric]

Summary
New requirements from future commercial Grids
– transparent fault tolerance, SLA negotiation and management
SLA-aware Resource Management System
– orchestrated operation of the subsystems for process, storage, and network
– SLA scheduling in the RMS
Cross-border migration for an increased FT level
– virtual resource management, compatibility profile
Progress
– support for single-node jobs is running
– support of parallel applications is close to completion
– next: cross-border migration

Further Information
Please visit our website, where you will find…
– … general information about HPC4U
– … movies showing fault tolerance in action
– … a downloadable demo system for playing with
– … links and contact addresses