Evaluating the Cost-Benefit of Using Cloud Computing to Extend the Capacity of Clusters Presenter: Xiaoyu Sun.



Cluster Computing
Users have to know the cluster very well
Administrative privileges

What is Cloud Computing?
Cloud computing provides computation, software, data access, and storage resources without requiring cloud users to know the location and other details of the computing infrastructure.

Characteristics of Cloud Computing
Empowerment: users control resources themselves rather than through a centralized IT service.
Agility: users can re-provision technological infrastructure resources.
Application Programming Interface
Cost
Device and location independence: users can access systems through a web browser regardless of their location or the device they are using.
Virtualization: servers and storage devices can be shared and their utilization increased.
Reliability and scalability
Performance: monitored through web services as the system interface.
Security: providers are able to devote resources to solving security issues that many customers cannot afford.
Maintenance: applications do not need to be installed on each user's computer and can be accessed from different places.

Purpose
Describe a system that enables an organization to augment its computing infrastructure by allocating resources from a Cloud provider.
Provide various scheduling strategies that aim to minimize the cost of utilizing resources from the Cloud provider.
Evaluate the proposed strategies against different performance metrics, namely average weighted response time, job slowdown, number of deadline violations, number of jobs rejected, and the money spent on using the Cloud.

Strategy Sets
Figure 1: The resource provisioning scenario

Backfilling Policies
Conservative: each request is scheduled when it arrives in the system, and requests are allowed to jump ahead in the queue if they do not delay the execution of other requests.
Aggressive: only the request at the head of the waiting queue, called the pivot, is granted a reservation. Other requests are allowed to move ahead in the queue if they do not delay the pivot.
Selective: requests are given reservations if they have waited long enough in the queue. "Long enough" is determined by the request's expansion factor:
Xfactor = (wait time + run time) / run time (1)
The threshold is given by the average slowdown of previously completed requests.
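The Selective policy's test can be sketched in a few lines of Python. This is a minimal illustration of Eq. (1) and the threshold rule, not the presenters' implementation; the function names and the fallback threshold of 1.0 used before any request has completed are assumptions made for the example.

```python
from statistics import mean


def xfactor(wait_time: float, run_time: float) -> float:
    """Expansion factor from Eq. (1): (wait time + run time) / run time."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return (wait_time + run_time) / run_time


def deserves_reservation(wait_time: float, run_time: float,
                         completed_slowdowns: list[float]) -> bool:
    """Return True once the request's xFactor exceeds the threshold.

    The threshold is the average slowdown of previously completed requests.
    Assumption: with no completed requests yet, fall back to 1.0, the
    xFactor of a request that has not waited at all.
    """
    threshold = mean(completed_slowdowns) if completed_slowdowns else 1.0
    return xfactor(wait_time, run_time) > threshold
```

For example, a request that has waited 30 time units and runs for 10 has an xFactor of 4.0, so it would be granted a reservation if earlier requests averaged a slowdown of 2.5.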

Strategy Sets: Naïve
Both the site and the cloud schedulers use conservative backfilling to schedule requests.
The redirection algorithm is executed at the arrival of each job at the site.
The cloud provider is used when a request cannot start immediately on the local cluster.

Strategy Sets: Shortest Queue
Requests are scheduled with aggressive backfilling in a First-Come-First-Served (FCFS) manner.
The redirection algorithm is executed at the arrival or completion of each job at the site.
It computes the ratio of the number of VMs required by requests to the number of VMs available.
A request is redirected if the cloud provider's ratio is smaller.
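The Shortest Queue comparison can be sketched as follows. This is a hedged reading of the slide: the ratio is taken as VMs required by queued requests over VMs available, computed once for the site and once for the cloud provider. The function and parameter names are hypothetical.

```python
def queue_ratio(vms_required: int, vms_available: int) -> float:
    """Ratio of VMs required by requests to VMs available."""
    if vms_available <= 0:
        raise ValueError("vms_available must be positive")
    return vms_required / vms_available


def redirect_to_cloud(site_required: int, site_available: int,
                      cloud_required: int, cloud_available: int) -> bool:
    """Redirect a request when the cloud provider's ratio is the smaller one."""
    site_ratio = queue_ratio(site_required, site_available)
    cloud_ratio = queue_ratio(cloud_required, cloud_available)
    return cloud_ratio < site_ratio
```

With 288 VMs requested on a 144-VM site (ratio 2.0) and 50 VMs requested on 100 cloud VMs (ratio 0.5), the request would be redirected to the cloud.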

Strategy Sets: Weighted Queue
Requests are scheduled with aggressive backfilling in a First-Come-First-Served (FCFS) manner.
The number of VMs that can be borrowed from the cloud provider is the number of VMs required by requests minus the number of VMs in use.

Strategy Sets: Selective
Requests are scheduled with selective backfilling.
The scheduler computes the ratio of the number of VMs required by requests to the number of VMs available.
When a request's xFactor exceeds the threshold, the scheduler makes a reservation at the place that provides the earliest start time.

Experiments
Simulation of two-month-long periods
SDSC Blue Horizon machine with 144 nodes
Number of VMs
Price of a virtual machine per hour: Amazon EC2's small instance, US $0.10
Network and storage are not considered
Values are averages of 5 simulation runs
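As a rough illustration of how the amount spent accumulates under this pricing, the sketch below charges US $0.10 per VM per hour. Rounding partial hours up to a full billed hour is an assumption made for the example; the slides do not state the billing granularity. Network and storage charges are left out, as in the experiments.

```python
import math

PRICE_PER_VM_HOUR = 0.10  # US$, Amazon EC2 small instance as quoted in the slides


def cloud_cost(num_vms: int, runtime_seconds: float,
               price: float = PRICE_PER_VM_HOUR) -> float:
    """Cost of running num_vms virtual machines for runtime_seconds.

    Assumption: partial hours are billed as full hours.
    """
    billed_hours = math.ceil(runtime_seconds / 3600)
    return num_vms * billed_hours * price
```

Under these assumptions, a request that uses 8 VMs for 90 minutes is billed for 2 hours per VM, 16 VM-hours in total.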

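Performance Metrics
Average Weighted Response Time (AWRT) of site k, in which each request's response time is weighted by its resource consumption:
$$ AWRT_k = \frac{\sum_{j \in T_k} p_j \cdot m_j \cdot (ct_j - st_j)}{\sum_{j \in T_k} p_j \cdot m_j} $$
T_k: the requests submitted to site k
p_j: the runtime of request j
m_j: the number of VMs required by request j
ct_j: the completion time of request j
st_j: the submission time of request j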
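The metric can be computed directly from the per-request quantities listed above. The sketch assumes the usual definition of AWRT, in which each response time (ct_j - st_j) is weighted by the request's resource consumption p_j * m_j; the Request class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    runtime: float      # p_j: runtime of the request
    vms: int            # m_j: number of VMs required
    submit: float       # st_j: submission time
    completion: float   # ct_j: completion time


def awrt(requests: list[Request]) -> float:
    """Average weighted response time over the requests of one site."""
    total_weight = sum(r.runtime * r.vms for r in requests)
    if total_weight == 0:
        raise ValueError("requests must have non-zero total resource consumption")
    weighted_response = sum(
        r.runtime * r.vms * (r.completion - r.submit) for r in requests
    )
    return weighted_response / total_weight
```

Because the weight is runtime times VMs, large requests dominate the average, so a strategy cannot improve its AWRT merely by favouring many small jobs.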

Performance Metrics
The performance improvement cost of a strategy set st is defined in terms of:
Amount spent: the amount spent running virtual machines on the Cloud provider
AWRT_base: the AWRT achieved by a base strategy (FCFS with aggressive backfilling) that schedules requests using only the site's resources
AWRT_st: the AWRT reached by the strategy set st when Cloud resources are also utilized

Performance Improvement Cost
Lublin99's model is used to generate different workloads by varying three parameters:
Umed: the mean number of virtual machines required by a request is set to log2(m) - umed, where m is the maximum number of virtual machines allowed in the system; umed varies from 1.5 to 3.5.
Barr: the inter-arrival time of requests at rush hours, from 0.45 to ...
PB: the proportion p of the first gamma in Lublin99's model is given by p = pa * nodes + pb; pb varies from 0.5 to 1.0.

These three graphs show the site's utilization using the base aggressive backfilling strategy without Cloud resources.
The larger the value of Umed, the smaller the requests.
The larger the value of PB, the shorter the duration of the requests.

Performance Improvement Cost
Figure panels: request size, request arrival time, request duration

Deadline Constrained Applications
Users may have stringent requirements on when the virtual machines are needed.
Deadline-constrained requests have a ready time, a duration, and a deadline.
Cost of using Cloud resources to meet request deadlines and to decrease the number of deadline violations and request rejections

Deadline Aware Strategies
Conservative: both the local site and the Cloud schedule requests using conservative backfilling. A request is placed where it achieves the best start time. If rejections are allowed and the deadline cannot be met, the request is rejected.
Aggressive: both the local site and the Cloud use aggressive backfilling with Earliest Deadline First to schedule requests. If a request's deadline is broken in the local cluster, the cloud provider is tried. If rejections are allowed and the deadline is still broken, the request is rejected.
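The decision order of the Aggressive deadline-aware strategy can be sketched as a small function. It assumes the schedulers have already computed the earliest start time on each side; all names are hypothetical. The final fallback, keeping the request on the local cluster when rejections are not allowed, is an assumption, since the slide does not say where such a request runs.

```python
def place_request(duration: float, deadline: float,
                  local_start: float, cloud_start: float,
                  allow_rejections: bool) -> str:
    """Return "local", "cloud", or "reject" for a deadline-constrained request."""
    # Prefer the local cluster whenever it can meet the deadline.
    if local_start + duration <= deadline:
        return "local"
    # Deadline broken locally: try the cloud provider.
    if cloud_start + duration <= deadline:
        return "cloud"
    # Deadline broken on both sides.
    if allow_rejections:
        return "reject"
    # Assumption: without rejections, run locally and accept the violation,
    # so that no money is spent on a deadline that is missed anyway.
    return "local"
```

Trying the local cluster first keeps the amount spent on the Cloud tied to requests whose deadlines it actually rescues.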

Cost of Reducing Deadline Violations
The non-violation cost is the amount spent per deadline violation avoided:
$$ \text{non-violation cost}_{st} = \frac{Amount\_spent_{st}}{Viol_{base} - Viol_{st}} $$
Amount_spent_st: the amount spent on Cloud resources
Viol_base: the number of deadline violations under the base strategy set (aggressive backfilling in an Earliest Deadline First manner)
Viol_st: the number of deadline violations under the evaluated strategy set
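A small sketch of this metric, reading the non-violation cost as the amount spent divided by the number of deadline violations avoided relative to the base strategy set. That reading is an assumption drawn from the three quantities defined above, and returning infinity when no violation is avoided is a choice made for the example only.

```python
import math


def non_violation_cost(amount_spent: float, viol_base: int, viol_st: int) -> float:
    """Amount spent per deadline violation avoided by strategy set st."""
    avoided = viol_base - viol_st
    if avoided <= 0:
        # Assumption: money spent without avoiding any violation is
        # treated as infinitely expensive.
        return math.inf
    return amount_spent / avoided
```

For instance, spending US $300 to bring violations down from 50 to 20 gives a cost of US $10 per avoided violation.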

Deadline Calculation
The deadline of request j is calculated from:
st_j: the submission time of request j
ct_j: the completion time of request j
ta_j: the difference between the request's completion and submission times
sf: a stringency factor that indicates how urgent the deadlines are

Cost of Reducing Deadline Violations
Figure panels: sf = 0.9, sf = 1.3, sf = 1.7

Cost of Reducing Deadline Violations
Figure panels: tight deadlines, normal deadlines, relaxed deadlines

Cost to Reduce Job Rejections: Aggressive Strategy Set

Conclusions
Different strategy sets can yield different ratios of performance improvement to money spent.
The Naïve strategy has a higher performance improvement cost.
The Selective strategy provides a good ratio of money spent to job slowdown improvement.
When the cloud provider was used to meet job deadlines, less than $3,000 was spent to keep the number of rejections close to zero.