
Service Level Agreement based Allocation of Cluster Resources: Handling Penalty to Enhance Utility Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS) Lab. Dept. of Computer Science and Software Engineering The University of Melbourne, Australia

2 Problem
- Providing a service market via service-oriented Grid computing: IBM's E-Business on Demand, HP's Adaptive Enterprise, Sun Microsystems' pay-as-you-go
- Grid resources comprise clusters, motivating utility-driven cluster computing
- Service Level Agreement (SLA): differentiates the values and varying requirements of jobs depending on user-specific needs and expectations
- The cluster Resource Management System (RMS) needs to support and enforce SLAs

3 Proposal
- Current cluster RMSs focus on overall job performance and system usage
- Use market-based approaches for utility-driven computing
- Utility is based on users' willingness to pay and varies with users' SLAs: deadline, budget, penalty

4 Impact of Penalty Function on Utility

5 Service Level Agreement (SLA)
- Delay: delay = (finish_time - submit_time) - deadline
- Utility: utility = budget - (delay * penalty_rate)
  - No delay: utility = budget
  - With delay: 0 < utility < budget, or utility < 0 once the penalty exceeds the budget
- LibraSLA considers the risk of penalties: proportional share that considers job properties (run time, number of processors)
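The delay and utility formulas on this slide can be sketched directly; only the two formulas come from the slide, and the function names are illustrative:

```python
# Sketch of the SLA delay and utility computation described above.

def delay(finish_time, submit_time, deadline):
    """Time by which the job overran its deadline (negative if early)."""
    return (finish_time - submit_time) - deadline

def utility(budget, delay_secs, penalty_rate):
    """Provider's utility: the full budget minus any delay penalty."""
    if delay_secs <= 0:       # no delay -> utility = budget
        return budget
    return budget - delay_secs * penalty_rate   # may fall below zero

# A job with budget 100, penalty rate 2/s, finished 10 s late:
# utility = 100 - 10 * 2 = 80  (0 < utility < budget)
```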

6 LibraSLA
- SLA-based proportional share with utility consideration
- Users express utility as a budget, an amount of real money
- Focuses on resource allocation (does not elaborate on other market concepts such as user bidding strategies or auction pricing mechanisms)
- Users only gain utility and pay for service upon job completion (possibly reduced by a penalty)

7 LibraSLA
- The estimated run time provided during job submission is accurate
- The deadline of a job is longer than its estimated run time
- An SLA does not change after job acceptance
- Users submit jobs through the cluster RMS only
- Cluster nodes may be homogeneous or heterogeneous
- Time-shared scheduling is supported at the nodes

8 LibraSLA
- Proportional share of a job i on node j: determined by its deadline and run time
- Total share for all jobs on a node j
- Delay occurs when the total share exceeds the maximum processor time of the node
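The slide names these quantities but not their closed forms; a minimal sketch, assuming the Libra-style share (a job needs runtime/deadline of a processor's time per unit time to finish exactly at its deadline):

```python
# Assumed proportional-share computation; the exact LibraSLA formula
# is not reproduced on the slide.

def share(runtime, deadline):
    """Fraction of one processor a job needs to meet its deadline."""
    return runtime / deadline

def total_share(jobs):
    """Sum of shares for all (runtime, deadline) jobs on a node."""
    return sum(share(r, d) for r, d in jobs)

def is_overloaded(jobs, capacity=1.0):
    """Jobs get delayed when demand exceeds the node's processor time."""
    return total_share(jobs) > capacity
```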

9 LibraSLA
- Return of a job i on node j
  - Return < 0 if utility < 0
  - Favors jobs with shorter deadlines, which carry a higher penalty
- Return of a node j
  - A lower return indicates overloading
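The slide states the properties of the return metric but not its formula. The sketch below assumes return = utility / deadline, which exhibits the stated properties (negative exactly when utility is negative, and larger for shorter deadlines at equal utility); the paper's actual definition may differ:

```python
# Assumed form of the per-job and per-node return metric.

def job_return(utility, deadline):
    """Negative when utility is negative; favors shorter deadlines."""
    return utility / deadline

def node_return(jobs):
    """Aggregate return over (utility, deadline) pairs on a node.
    Lower values signal overloading, since delays erode utilities."""
    return sum(job_return(u, d) for u, d in jobs)
```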

10 LibraSLA
Admission control (accept the new job or not?)
- Determines the return of each node if the new job is accepted
- A node is suitable if it has a higher return and can satisfy a HARD deadline if required
- The new job is accepted if there are as many suitable nodes as requested
- An accepted new job is allocated to the nodes with the highest return

11 LibraSLA
Determining the return of a node:
- Determines the total share of processor time needed to fulfill the deadlines of all its allocated jobs and the new job
- Identifies the job with the highest return and gives it any additional remaining share
- If processor time is insufficient, only the job with the highest return and jobs with hard deadlines are not delayed; jobs with soft deadlines are delayed proportionally
- The returns under these delays are computed accordingly
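The admission-control and allocation steps on the two slides above can be sketched as follows. The inputs are assumed precomputed, and "suitable" is read here as a positive return after acceptance; the paper's precise criterion may differ:

```python
# Sketch of the admission-control decision. Each candidate node is
# assumed precomputed as (node_id, return_if_accepted, meets_hard).

def select_nodes(candidates, requested, hard=False):
    """Return the allocated node ids, or None if the job is rejected."""
    suitable = [(node, ret) for node, ret, hard_ok in candidates
                if ret > 0 and (hard_ok or not hard)]
    if len(suitable) < requested:
        return None                          # not enough suitable nodes
    suitable.sort(key=lambda nr: -nr[1])     # highest return first
    return [node for node, _ in suitable[:requested]]
```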

12 Performance Evaluation: Simulation
Simulated scheduling for a cluster computing environment using the GridSim toolkit

13 Experimental Methodology: Trace Properties
- Feitelson's Parallel Workload Archive
- Last 1000 jobs in the SDSC SP2 trace
- Average inter-arrival time: 2276 s (37.93 min)
- Average run time: 10584 s (2.94 hr)
- Average number of requested processors: 18

14 Experimental Methodology: Cluster Properties
SDSC SP2:
- Number of computation nodes: 128
- SPEC rating of each node: 168
- Processor type on each computation node: RISC System/6000
- Operating system: AIX

15 Experimental Methodology: SLA Properties
20% HIGH-urgency jobs:
- HARD deadline type
- LOW deadline/runtime
- HIGH budget/f(runtime)
- HIGH penalty_rate/g(runtime)
where f(runtime) and g(runtime) are functions representing the MINIMUM budget and penalty rate for the user-specified runtime

16 Experimental Methodology: SLA Properties
80% LOW-urgency jobs:
- SOFT deadline type
- HIGH deadline/runtime
- LOW budget/f(runtime)
- LOW penalty_rate/g(runtime)
where f(runtime) and g(runtime) are functions representing the MINIMUM budget and penalty rate for the user-specified runtime

17 Experimental Methodology: SLA Properties
High:low ratio, e.g. the deadline high:low ratio is the ratio of the means for high deadline/runtime (low urgency) and low deadline/runtime (high urgency)
- Deadline high:low ratio: 7
- Budget high:low ratio: 7
- Penalty rate high:low ratio: 4

18 Experimental Methodology: SLA Properties
- Values normally distributed within each HIGH and LOW class of deadline/runtime, budget/f(runtime), and penalty_rate/g(runtime)
- HIGH- and LOW-urgency jobs randomly distributed in the arrival sequence

19 Experimental Methodology: Performance Evaluation
- Arrival delay factor: models cluster workload through the inter-arrival time of jobs, e.g. an arrival delay factor of 0.01 means a job with 400 s of inter-arrival time now arrives after 4 s
- Mean factor: denotes the mean value for the normal distribution of the deadline, budget, and penalty rate SLA parameters, e.g. a mean factor of 2 means a mean value double that of a factor of 1 (i.e. higher)

20 Experimental Methodology: Performance Evaluation
Comparison with Libra:
- Assumes a HARD deadline
- Selects nodes based on a BEST FIT strategy (i.e. nodes with the least available processor time after accepting the new job are selected first)
Evaluation metrics:
- Number of jobs completed with SLA fulfilled
- Aggregate utility achieved for completed jobs

21 Performance Evaluation: Impact of Various SLA Properties
- Deadline type: hard (no delay) or soft (can accommodate delay; the penalty rate determines the limits of the delay)
- Deadline: time period to finish the job
- Budget: maximum amount of currency the user is willing to pay
- Penalty rate: compensates the user for failure to meet the deadline; reflects flexibility with a delayed deadline (a higher penalty rate limits the delay to be shorter)
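The four SLA parameters described above can be captured as a small record; the field names and types are illustrative:

```python
from dataclasses import dataclass

@dataclass
class SLA:
    deadline_type: str   # "hard" (no delay) or "soft" (delay allowed)
    deadline: float      # time period to finish the job, in seconds
    budget: float        # maximum currency the user is willing to pay
    penalty_rate: float  # currency deducted per unit of delay
```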

22 Deadline Type

23 Deadline Type

24 Deadline Mean Factor

25 Deadline Mean Factor

26 Budget Mean Factor

27 Budget Mean Factor

28 Penalty Rate Mean Factor

29 Penalty Rate Mean Factor

30 Conclusion
- Importance of handling penalty in SLAs
- LibraSLA: fulfills more SLAs through soft deadlines; minimizes penalties to improve utility
- SLA with 4 parameters: (i) deadline type, (ii) deadline, (iii) budget, (iv) penalty rate
- Need to support utility-driven cluster computing and service-oriented Grid computing

End of Presentation. Questions?

32 Motivation
- Cluster-based systems have gained popularity and are widely adopted: 75% of the Top500 supercomputers worldwide are based on a cluster architecture
- Clusters are used not only in scientific computing but also in driving many commercial applications
- Many corporate data centers are cluster-based systems

33 Related Work
- Existing cluster RMSs: Condor, LoadLeveler, LSF, PBS, SGE
- Advanced scheduler: Maui
- Bid-based proportional share: REXEC [Chun B., 2000], Tycoon [Lai K., 2004]

34 Related Work
- Cluster-On-Demand [Irwin D., 2004]: penalty after the deadline instead of the runtime; prioritizes the job with the highest return
- QoPS [Islam M., 2004]: penalty rate instead of slack factor; minimizes penalty to improve utility; proportional share
- Libra [Sherwani J., 2004]: soft deadlines as well as HARD deadlines; examines the return of accepting a new job