Hadi Goudarzi and Massoud Pedram

Slides:



Advertisements
Similar presentations
February 20, Spatio-Temporal Bandwidth Reuse: A Centralized Scheduling Mechanism for Wireless Mesh Networks Mahbub Alam Prof. Choong Seon Hong.
Advertisements

SLA-Oriented Resource Provisioning for Cloud Computing
Maximum Battery Life Routing to Support Ubiquitous Mobile Computing in Wireless Ad Hoc Networks By C. K. Toh.
Walter Binder University of Lugano, Switzerland Niranjan Suri IHMC, Florida, USA Green Computing: Energy Consumption Optimized Service Hosting.
A system Performance Model Instructor: Dr. Yanqing Zhang Presented by: Rajapaksage Jayampthi S.
A SLA Framework for QoS Provisioning and Dynamic Capacity Allocation Rahul Garg (IBM India Research Lab), R. S. Randhawa (Stanford University), Huzur Saran.
All Hands Meeting, 2006 Title: Grid Workflow Scheduling in WOSE (Workflow Optimisation Services for e- Science Applications) Authors: Yash Patel, Andrew.
Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting Roy, N., A. Dubey, and A. Gokhale 4th IEEE International Conference.
SLA-aware Virtual Resource Management for Cloud Infrastructures
Energy Management and Adaptive Behavior Tarek Abdelzaher.
Beneficial Caching in Mobile Ad Hoc Networks Bin Tang, Samir Das, Himanshu Gupta Computer Science Department Stony Brook University.
Data Communication and Networks Lecture 13 Performance December 9, 2004 Joseph Conron Computer Science Department New York University
Fair Scheduling in Web Servers CS 213 Lecture 17 L.N. Bhuyan.
EE 685 presentation Optimization Flow Control, I: Basic Algorithm and Convergence By Steven Low and David Lapsley Asynchronous Distributed Algorithm Proof.
Mahapatra-Texas A&M-Fall'001 Partitioning - I Introduction to Partitioning.
On the Task Assignment Problem : Two New Efficient Heuristic Algorithms.
Online Auctions in IaaS Clouds: Welfare and Profit Maximization with Server Costs Xiaoxi Zhang 1, Zhiyi Huang 1, Chuan Wu 1, Zongpeng Li 2, Francis C.M.
Efficient agent-based selection of DiffServ SLAs over MPLS networks Thanasis G. Papaioannou a,b, Stelios Sartzetakis a, and George D. Stamoulis a,b presented.
Admission Control and Dynamic Adaptation for a Proportional-Delay DiffServ-Enabled Web Server Yu Cai.
Self-Adaptive QoS Guarantees and Optimization in Clouds Jim (Zhanwen) Li (Carleton University) Murray Woodside (Carleton University) John Chinneck (Carleton.
Ekrem Kocaguneli 11/29/2010. Introduction CLISSPE and its background Application to be Modeled Steps of the Model Assessment of Performance Interpretation.
MAXIMIZING SPECTRUM UTILIZATION OF COGNITIVE RADIO NETWORKS USING CHANNEL ALLOCATION AND POWER CONTROL Anh Tuan Hoang and Ying-Chang Liang Vehicular Technology.
Dynamic and Decentralized Approaches for Optimal Allocation of Multiple Resources in Virtualized Data Centers Wei Chen, Samuel Hargrove, Heh Miao, Liang.
Bargaining Towards Maximized Resource Utilization in Video Streaming Datacenters Yuan Feng 1, Baochun Li 1, and Bo Li 2 1 Department of Electrical and.
OPTIMAL SERVER PROVISIONING AND FREQUENCY ADJUSTMENT IN SERVER CLUSTERS Presented by: Xinying Zheng 09/13/ XINYING ZHENG, YU CAI MICHIGAN TECHNOLOGICAL.
A Unified Modeling Framework for Distributed Resource Allocation of General Fork and Join Processing Networks in ACM SIGMETRICS
Network Aware Resource Allocation in Distributed Clouds.
An Integration Framework for Sensor Networks and Data Stream Management Systems.
Software Pipelining for Stream Programs on Resource Constrained Multi-core Architectures IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEM 2012 Authors:
Power and Performance Modeling in a Virtualized Server System M. Pedram and I. Hwang Department of Electrical Engineering Univ. of Southern California.
An Autonomic Framework in Cloud Environment Jiedan Zhu Advisor: Prof. Gagan Agrawal.
On QoS Guarantees with Reward Optimization for Servicing Multiple Priority Class in Wireless Networks YaoChing Peng Eunyoung Chang.
RECON: A TOOL TO RECOMMEND DYNAMIC SERVER CONSOLIDATION IN MULTI-CLUSTER DATACENTERS Anindya Neogi IEEE Network Operations and Management Symposium, 2008.
SOFTWARE / HARDWARE PARTITIONING TECHNIQUES SHaPES: A New Approach.
1 Nasser Alsaedi. The ultimate goal for any computer system design are reliable execution of task and on time delivery of service. To increase system.
Efficient Provisioning of Service Level Agreements for Service Oriented Applications Valeria Cardellini, Emiliano Casalicchio, Vincenzo Grassi, Francesco.
Silberschatz and Galvin  Operating System Concepts Module 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor.
OPERETTA: An Optimal Energy Efficient Bandwidth Aggregation System Karim Habak†, Khaled A. Harras‡, and Moustafa Youssef† †Egypt-Japan University of Sc.
Problem Formulation Elastic cloud infrastructures provision resources according to the current actual demand on the infrastructure while enforcing service.
Resource Mapping and Scheduling for Heterogeneous Network Processor Systems Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea.
Managing Server Energy and Operational Costs Chen, Das, Qin, Sivasubramaniam, Wang, Gautam (Penn State) Sigmetrics 2005.
A Passive Approach to Sensor Network Localization Rahul Biswas and Sebastian Thrun International Conference on Intelligent Robots and Systems 2004 Presented.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems with Multi-programming Chapter 4.
EE 685 presentation Optimization Flow Control, I: Basic Algorithm and Convergence By Steven Low and David Lapsley.
1 Presented by Sarbagya Buddhacharya. 2 Increasing bandwidth demand in telecommunication networks is satisfied by WDM networks. Dimensioning of WDM networks.
Modeling Virtualized Environments in Simalytic ® Models by Computing Missing Service Demand Parameters CMG2009 Paper 9103, December 11, 2009 Dr. Tim R.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
Content caching and scheduling in wireless networks with elastic and inelastic traffic Group-VI 09CS CS CS30020 Performance Modelling in Computer.
Static Process Scheduling
Efficient Resource Allocation for Wireless Multicast De-Nian Yang, Member, IEEE Ming-Syan Chen, Fellow, IEEE IEEE Transactions on Mobile Computing, April.
Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks Author: P. Kokkinos, K. Christodoulopoulos, A. Kretsis, and E. Varvarigos.
1 / 21 Providing Differentiated Services from an Internet Server Xiangping Chen and Prasant Mohapatra Dept. of Computer Science and Engineering Michigan.
Chapter 4 CPU Scheduling. 2 Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation.
1 Chapter 5 Branch-and-bound Framework and Its Applications.
Basic Concepts Maximum CPU utilization obtained with multiprogramming
Dynamic Resource Allocation for Shared Data Centers Using Online Measurements By- Abhishek Chandra, Weibo Gong and Prashant Shenoy.
Resource Provision for Batch and Interactive Workloads in Data Centers Ting-Wei Chang, Pangfeng Liu Department of Computer Science and Information Engineering,
OPERATING SYSTEMS CS 3502 Fall 2017
The Impact of Replacement Granularity on Video Caching
Dynamic Graph Partitioning Algorithm
Analyzing Security and Energy Tradeoffs in Autonomic Capacity Management Wei Wu.
Measuring Service in Multi-Class Networks
ISP and Egress Path Selection for Multihomed Networks
Module 5: CPU Scheduling
Hidden Markov Models Part 2: Algorithms
3: CPU Scheduling Basic Concepts Scheduling Criteria
Networked Real-Time Systems: Routing and Scheduling
Module 5: CPU Scheduling
Module 5: CPU Scheduling
Chrysostomos Koutsimanis and G´abor Fodor
Presentation transcript:

Multi-dimensional SLA-based Resource Allocation for Multi-tier Cloud Computing Systems Hadi Goudarzi and Massoud Pedram University of Southern California, Los Angeles in IEEE Cloud 2011 (The 4th International Conference on Cloud Computing)

Agenda Introduction System model Problem formulation An upper-bound on the total profit Profit maximization solution Simulation results Conclusion

Introduction Modern internet applications are complex software implemented on multi-tier architectures. Each tier provides a defined service to the next tiers and uses services from the previous tiers. The problem of resource allocation for multi-tier applications is harder than that for single-tier applications tiers are not homogenous a performance bottleneck in one tier can decrease the overall profit, even if the other tiers have acceptable service quality.

Introduction (cont’d) The IT infrastructure provided by the datacenter owners/operators must meet various SLAs established with the clients. The problem of optimal resource provisioning is challenging due to the diversity present in the client applications being hosted and the SLAs. reduce the cost incurred on the datacenter operators meet the clients’ SLAs.

Introduction (cont’d) This paper focus on the SLA based resource allocation in the cloud computing system. Two types of SLA contracts: For Gold SLA class: response time is guaranteed and if this constraint is violated, the cloud provider pays a penalty. For the Bronze SLA class, each client has a defined utility function based on its response time.

Introduction (cont’d) In this paper, we assume that servers are characterized by their maximum capacity in three dimensions: processing power memory usage communication bandwidth Objective An SLA-based multi-dimensional resource allocation scheme for the multi-tier services in a cloud computing system to optimize the total expected profit.

System model vi3 ui1 wi6 vi4 ui2 wi7 vi5 for each server j: 3 6

Problem formulation Profit Total allocated resource  1 Memory If tij is not zero, the value of ztij is set to one, otherwise the value of ztij is zero.

Average response time estimation We consider generalized processor sharing at each queue. Based on the queuing theory, the output of each queue with this characteristic has an exponential distribution with a mean value of 1/(-). : average service rate, : average arrival rate Average response time in tier t : Average response time :

Profit Utility function Cost (power consumption) Gold SLA: Bronze SLA: Ubi(Ri) is a nonincreasing function of the average response time. Cost (power consumption) Profit in a duration Te:

Baseline solution Optimization parameters: Iterative method (IM) [12][13] fix the resource shares and optimize the task distribution rates and then fix the distribution rates and optimize the resource shares

Profit upper-bound problem statement Release some constraints and fix tij Find the best case of assigning the client’s requests to the servers tij . solve the problem for each client independently and sum up the results of best profits for all clients This problem can be solved using KKT condition.

Profit maximization solution An initial solution based on the solution given for the profit upper bound problem is generated. Distribution rates are fixed and resource sharing is improved by a local optimization step. Finally a resource consolidation technique, inspired by the force-directed scheduling, is applied to consolidate resources, determine the active (ON) servers and further optimize the resource assignment.

Initial solution A greedy technique to rank the clients and the application tiers for each client is used to determine the order of resource assignment processing in our constructive approach. For each client, the following equation is used as its ranking metric: For the selected client, tiers are ordered using a similar metric:

Initial solution (cont’d) Based on the solution given for the profit upper bound problem, determine the ij and tij for each client. Replaced by

Force-directed Resource Assignment (FRA) Resource Consolidation using Force-Directed Search A client that has the highest force difference toward a new server is picked and if the required server is available, the load replacement is done. The definition of force is based on the partial profit gained from allocating each portion of the clients’ request in a specific tier to a server with specific type. Gold: Bronze:

Force-directed Resource Assignment (FRA)

Force-directed Resource Assignment (FRA) After this replacement, forces are updated and the new maximum force differential client-to-server assignment is made. This algorithm continues until there are no positive force differentials for any clients. Limit the number of tries in the algorithm and avoid loops and lockouts

Simulations Consider 10 to 100 clients For each client the processing power and memory capacity requirements are set with random variables. Ten different application tiers are considered. The number of application tiers for each client is selected randomly to be between 3 and 5. The probabilities of going forward or backward average of 80% going forward and 20% backward for each tier. Service times for clients with different application tiers are also modeled with random variables. Gold or Bronze SLA class with probability of 50%

Simulations (cont’d) The first scenario (low server to client ratio) The average number of servers is 5n where n denotes the number of clients. The second scenario (high server to client ratio) The average number of servers is set to 10n.

Simulation results

Simulation results (cont’d) An example of Resource Consolidation Improvement in case of a bad initial solution.

Simulation results (cont’d) Computation overhead

Simulation results (cont’d) Average utilization factor of the servers

Conclusion A closed-form formula for calculating the average response time of a client’s request for multi-tier applications. A unified framework for handling both strong (Gold class) and weak (Bronze class) SLAs. A provable upper bound on the total profit in the system for given clients, resources and SLAs. A multi-dimensional resource allocation scheme to consider communication resources for multi-tier applications that incur large inter-server communications.

Comments A well problem formulation. The upper bound is too loose. Computation overhead is high static dispatcher in this paper We can consider the effect of virtualization A host server has multiple application. Another issue How to design a dynamic dispatcher according to request arrival and server condition ?