1
SLA-aware Virtual Resource Management for Cloud Infrastructures
On the Management and Efficiency of Cloud Based Services Eitan Rosenfeld December 8th, 2010
2
Problem Space Automate the management of virtual servers
Why this is challenging:
Must take into account the high-level SLA requirements of hosted applications.
Must take into account resource management costs.
When applications change state, their resource demands are likely to change.
(Speaker note, translated from Hebrew: to automate = to make automatic.)
3
High Level Solution Generate a Global Utility function
Constraint Programming approach.
The utility function captures the degree of SLA fulfillment and the operating costs.
Autonomic resource manager built on the utility function.
Decouple resource provisioning and VM placement.
4
Why use (and automate) the Cloud?
Static allocation of resources results in 15-20% utilization.
VMs allow decoupling of applications from physical servers.
Automation of the management process (scaling up and scaling down) can reduce cost and boost or maintain performance.
5
Decoupling in two stages
Provisioning stage: allocate resource capacity (virtual machines) for a given application. Driven by performance goals associated with the business-level SLAs of the hosted applications.
Placement stage: map virtual machines to physical machines. Driven by data center policies regarding resource management costs. A typical example is lowering energy consumption by minimizing the number of active physical servers.
SLA (Service Level Agreement): expressed through metrics such as average response time or number of jobs completed per unit of time.
6
Allocating new resources with state changes
[Figure: Application Environments A, B, and C run on VM1 through VM7, which are hosted on Physical Machines 1 through 4. An application can span multiple VMs, and a physical machine can host multiple VMs.]
7
Automation criteria: what are the requirements for successful automation?
Dynamic provisioning and placement.
Support for both online applications with stringent QoS requirements and batch-oriented, CPU-intensive applications.
Support for a variety of application topologies.
8
Provisioning maps to application specific functions
Placement maps to a global decision layer.
The utility function is their means of communication. It returns a scalar value from 0 (unsatisfied) to 1 (satisfied).
Application state: workload, resource capacity, SLA.
Both provisioning and placement are modeled as Constraint Satisfaction Problems.
9
Some Definitions
Satisfaction: whether an application is achieving its performance goals.
Constraint Programming: solve a problem by stating constraint relations between variables; the solution must satisfy these constraints.
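As a toy illustration of the constraint programming idea, the sketch below states a few constraints over two variables and enumerates every assignment that satisfies them. It is a brute-force search in plain Python, not a real constraint solver, and the variable names, domains, and constraints are hypothetical.

```python
from itertools import product

def solve(domains, constraints):
    """Return every assignment that satisfies all constraints (brute force)."""
    names = list(domains)
    solutions = []
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            solutions.append(assignment)
    return solutions

# Hypothetical toy problem: pick VM counts for two applications.
domains = {"n_a": range(0, 4), "n_b": range(0, 4)}
constraints = [
    lambda v: v["n_a"] + v["n_b"] <= 4,  # shared capacity of four VMs
    lambda v: v["n_a"] >= 2,             # application A needs at least two VMs
    lambda v: v["n_b"] >= 1,             # application B needs at least one VM
]
solutions = solve(domains, constraints)
```

A dedicated solver prunes the search space instead of enumerating it, but the declarative shape (variables, domains, constraints) is the same.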
10
Assumptions
Physical machines (PMs) can run multiple VMs.
Application Environment (AE): AEs can span multiple VMs.
SLAs apply to AEs.
A VM can run only one AE at a time.
11
High Level Architecture
Local Decision Module (LDM), one for each AE:
Computes satisfaction with current resources, workload, and service metrics (utility function).
Evaluates the opportunity of allocating more VMs to the AE or releasing existing VMs from it.
Global Decision Module (GDM):
Arbitrates resource requirements based on utility functions and the performance of VMs and PMs.
Notifies LDMs of changes to VMs and manages the VM lifecycle (start, stop, migrate).
13
Local Decision Module LDM is associated with two utility functions
(1) Fixed service-level: maps the service level to a utility value.
(2) Dynamic resource-level: maps a resource capacity to a utility value, which is communicated to the GDM.
Variables:
Let $A = (a_1, a_2, \ldots, a_i, \ldots, a_m)$ denote the set of AEs,
$P = (p_1, p_2, \ldots, p_j, \ldots, p_q)$ denote the set of PMs in the datacenter, and
$S = (s_1, s_2, \ldots, s_k, \ldots, s_c)$ denote the set of $c$ classes of VMs, where $s_k = (s_k^{cpu}, s_k^{ram})$ specifies the VM CPU capacity in MHz and the VM memory capacity in megabytes.
14
LDM (cont’d) Utility function (2) $u_i$ for application $a_i$:
$u_i = f_i(N_i)$, where $N_i$ is the VM allocation vector of application $a_i$: $N_i = (n_{i1}, n_{i2}, \ldots, n_{ik}, \ldots, n_{ic})$, where $n_{ik}$ is the number of VMs of class $s_k$ attributed to application $a_i$.
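A minimal sketch of a resource-level utility function of this kind follows. The slides do not give the shape of f_i, so the ratio of allocated CPU to demanded CPU used here is an assumption, and the VM class sizes are hypothetical placeholders.

```python
# Hypothetical VM classes as (cpu MHz, ram MB); not taken from the slides.
VM_CLASSES = [(500, 500), (1000, 1000), (2000, 2000), (4000, 4000)]

def cpu_capacity(allocation, classes=VM_CLASSES):
    """Total CPU (MHz) provided by an allocation vector (one count per class)."""
    return sum(count * cpu for count, (cpu, _ram) in zip(allocation, classes))

def make_utility(cpu_demand_mhz):
    """Build an assumed f_i: fraction of the CPU demand that is covered, capped at 1."""
    def utility(allocation):
        if cpu_demand_mhz <= 0:
            return 1.0  # nothing demanded, so the application is satisfied
        return min(1.0, cpu_capacity(allocation) / cpu_demand_mhz)
    return utility

u_a = make_utility(3000)       # application A currently demands 3000 MHz
partial = u_a((1, 0, 1, 0))    # 500 + 2000 MHz allocated
full = u_a((0, 0, 0, 1))       # 4000 MHz allocated
```

Because the function depends on the current demand, it would have to be rebuilt whenever the workload changes, which matches the "dynamic" label on utility function (2).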
15
Application Constraints
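Each application also provides upper bounds on the number of VMs that it is willing to accept.
Per VM class: $N_i^{max} = (n_{i1}^{max}, n_{i2}^{max}, \ldots, n_{ik}^{max}, \ldots, n_{ic}^{max})$, with
$n_{ik} \le n_{ik}^{max}, \quad 1 \le i \le m \text{ and } 1 \le k \le c \quad (1)$
In total: $T_i^{max}$, with
$\sum_{k=1}^{c} n_{ik} \le T_i^{max}, \quad 1 \le i \le m \quad (2)$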
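The per-class and total bounds can be checked with a few lines of code. This is a sketch only; the function name and the example numbers are hypothetical.

```python
def satisfies_app_constraints(n_i, n_i_max, t_i_max):
    """Check the per-class bound (1) and the total bound (2) for one application."""
    if len(n_i) != len(n_i_max):
        raise ValueError("n_i and n_i_max need one entry per VM class")
    per_class_ok = all(0 <= n <= cap for n, cap in zip(n_i, n_i_max))  # constraint (1)
    total_ok = sum(n_i) <= t_i_max                                      # constraint (2)
    return per_class_ok and total_ok

# Hypothetical bounds: at most (2, 2, 2, 1) VMs per class and 4 VMs overall.
ok = satisfies_app_constraints((1, 2, 0, 1), (2, 2, 2, 1), 4)
too_many_of_one_class = satisfies_app_constraints((3, 0, 0, 0), (2, 2, 2, 1), 4)
too_many_in_total = satisfies_app_constraints((2, 2, 1, 0), (2, 2, 2, 1), 4)
```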
16
Global Decision Module
Duties (each modeled as a Constraint Satisfaction Problem):
Provisioning: determine the VM allocation vectors Ni for each application.
Packing: place VMs on PMs such that the number of active PMs is minimized.
17
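Provisioning
The VMs allocated to all applications are constrained by the total capacity of the physical servers:
CPU capacity: $\sum_{i=1}^{m} \sum_{k=1}^{c} n_{ik} \cdot s_k^{cpu} \le \sum_{j=1}^{q} C_j^{cpu}$
RAM capacity: $\sum_{i=1}^{m} \sum_{k=1}^{c} n_{ik} \cdot s_k^{ram} \le \sum_{j=1}^{q} C_j^{ram} \quad (3)$
where $C_j$ is the capacity of PM $p_j$.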
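A sketch of this aggregate capacity check, assuming each VM class and each PM is described by a (cpu MHz, ram MB) pair. The VM class sizes are hypothetical; the four PMs of 4000 MHz and 4000 MB mirror the simulation environment described later in the slides.

```python
def within_total_capacity(allocations, vm_classes, pm_capacities):
    """True if the summed CPU and RAM of all allocated VMs fit the summed PM capacity."""
    cpu_needed = sum(n * vm_classes[k][0] for n_i in allocations for k, n in enumerate(n_i))
    ram_needed = sum(n * vm_classes[k][1] for n_i in allocations for k, n in enumerate(n_i))
    cpu_total = sum(cpu for cpu, _ram in pm_capacities)
    ram_total = sum(ram for _cpu, ram in pm_capacities)
    return cpu_needed <= cpu_total and ram_needed <= ram_total

vm_classes = [(500, 500), (1000, 1000), (2000, 2000), (4000, 4000)]  # hypothetical
pm_capacities = [(4000, 4000)] * 4                                   # 16000 MHz, 16000 MB

fits = within_total_capacity([(0, 0, 0, 2), (0, 0, 2, 0)], vm_classes, pm_capacities)
overflows = within_total_capacity([(0, 0, 0, 3), (0, 0, 3, 0)], vm_classes, pm_capacities)
```

Note that this check only compares totals. Whether the VMs actually fit onto individual machines is decided later, in the packing stage.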
18
Provisioning Output
The provisioning phase outputs a set of vectors $N_i$ for which constraints (1), (2), and (3) are satisfied.
Comparing the new $N_i$ to the existing $N_i$ tells the GDM which VMs will be created, destroyed, or resized.
The global utility $U_{global}$ is maximized as a weighted sum of application utilities minus operating costs:
$U_{global} = \max \sum_{i=1}^{m} \left( \alpha_i \cdot u_i - \varepsilon \cdot cost(N_i) \right)$
where $\alpha_i$ is the weight of the utility function for application $a_i$, and $\varepsilon$ is a coefficient that lets the administrator trade performance goals against the operating cost of $N_i$.
$cost(N_i)$ is a function of the VM allocation vector $N_i$ and must share the same scale as the application utility functions, i.e. (0, 1). This cost function is not hardwired into the GDM and can be specified arbitrarily.
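The provisioning objective can be sketched as an exhaustive search over candidate allocation vectors. Everything concrete below is an assumption made for illustration: the two VM class sizes, the utility shape (covered fraction of CPU demand), and the cost function (share of total datacenter CPU consumed). The brute-force loop stands in for the constraint solver and only checks the aggregate CPU limit.

```python
from itertools import product

VM_CPU = (1000, 2000)   # hypothetical CPU size (MHz) of each VM class
TOTAL_CPU = 4 * 4000    # four PMs with 4000 MHz each

def cpu(n_i):
    """CPU (MHz) provided by one allocation vector."""
    return sum(n * size for n, size in zip(n_i, VM_CPU))

def utility(n_i, demand):
    """Assumed local utility: covered fraction of the CPU demand, capped at 1."""
    return 1.0 if demand <= 0 else min(1.0, cpu(n_i) / demand)

def cost(n_i):
    """Assumed cost on a 0..1 scale: share of datacenter CPU used by n_i."""
    return cpu(n_i) / TOTAL_CPU

def global_utility(allocs, demands, alphas, eps):
    """Sum over applications of alpha_i * u_i - eps * cost(N_i)."""
    return sum(a * utility(n, d) - eps * cost(n)
               for n, d, a in zip(allocs, demands, alphas))

def provision(demands, alphas, eps, max_per_class=4):
    """Try every combination of allocation vectors and keep the best feasible one."""
    per_app = list(product(range(max_per_class + 1), repeat=len(VM_CPU)))
    best, best_u = None, float("-inf")
    for allocs in product(per_app, repeat=len(demands)):
        if sum(cpu(n) for n in allocs) > TOTAL_CPU:  # aggregate CPU limit
            continue
        u = global_utility(allocs, demands, alphas, eps)
        if u > best_u:
            best, best_u = allocs, u
    return best, best_u

# Both applications ask for more CPU than the datacenter can supply in total.
best, best_u = provision(demands=(12000, 8000), alphas=(0.8, 0.2), eps=0.05)
```

With these assumed inputs the higher-weighted application is served first and the other receives what is left, which is the qualitative effect of the weights described in the slides.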
19
Packing (Placement)
$V = (vm_1, vm_2, \ldots, vm_l, \ldots, vm_v)$ lists all VMs running at the current time.
For each PM $p_j \in P$, the bit vector $H_j = (h_{j1}, h_{j2}, \ldots, h_{jl}, \ldots, h_{jv})$ denotes the set of VMs assigned to $p_j$. Example: $h_{jl} = 1$ if $p_j$ is hosting $vm_l$.
$R = (r_1, r_2, \ldots, r_l, \ldots, r_v)$ is the resource capacity (CPU, RAM) of all VMs, where $r_l = (r_l^{cpu}, r_l^{ram})$.
20
Packing (physical resource) Constraints
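The sum of the resource capacities of the VMs on PM $p_j$ must be less than or equal to the resource capacity of $p_j$:
$\sum_{l=1}^{v} r_l^{cpu} \cdot h_{jl} \le C_j^{cpu}$ and $\sum_{l=1}^{v} r_l^{ram} \cdot h_{jl} \le C_j^{ram}$, for $1 \le j \le q$.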
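A sketch of a feasibility check for a set of placement bit vectors follows. The per-PM capacity test comes from the constraint above; the extra requirement that every VM is hosted by exactly one PM is an assumption added here so that the check is complete. All names and numbers are hypothetical.

```python
def placement_is_feasible(h, vm_resources, pm_capacities):
    """h[j][l] is 1 if PM j hosts VM l; resources and capacities are (cpu, ram) pairs."""
    for h_j, (cap_cpu, cap_ram) in zip(h, pm_capacities):
        used_cpu = sum(bit * cpu for bit, (cpu, _ram) in zip(h_j, vm_resources))
        used_ram = sum(bit * ram for bit, (_cpu, ram) in zip(h_j, vm_resources))
        if used_cpu > cap_cpu or used_ram > cap_ram:
            return False  # this PM is overcommitted
    # Assumption: each VM must be placed on exactly one PM.
    return all(sum(h_j[l] for h_j in h) == 1 for l in range(len(vm_resources)))

vm_resources = [(2000, 2000), (2000, 2000), (1000, 1000)]
pm_capacities = [(4000, 4000), (4000, 4000)]

good = placement_is_feasible([[1, 1, 0], [0, 0, 1]], vm_resources, pm_capacities)
overloaded = placement_is_feasible([[1, 1, 1], [0, 0, 0]], vm_resources, pm_capacities)
unplaced = placement_is_feasible([[1, 1, 0], [0, 0, 0]], vm_resources, pm_capacities)
```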
21
Packing Output Packing produces VM placement vectors Hj
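The GDM is run periodically and uses the previous $H_j$ to determine which VMs need to be migrated.
The goal is to minimize the number of active PMs $X$, i.e. the number of PMs that host at least one VM:
$X = \left| \{\, p_j \in P : \exists\, l,\ h_{jl} = 1 \,\} \right|$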
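To make the packing goal concrete, the sketch below uses a first-fit decreasing heuristic rather than the constraint solver the presentation relies on, so it illustrates the objective (few active PMs) without guaranteeing an optimum. It also shows how a previous placement can be compared with a new one to list the VMs that would need to migrate. VM names, sizes, and the helper functions are hypothetical.

```python
def first_fit_decreasing(vms, pm_capacity, num_pms):
    """Place VMs (name -> (cpu, ram)) largest first on the first PM with room."""
    free = [list(pm_capacity) for _ in range(num_pms)]
    placement = {}
    for name, (cpu, ram) in sorted(vms.items(), key=lambda kv: kv[1], reverse=True):
        for j, (free_cpu, free_ram) in enumerate(free):
            if cpu <= free_cpu and ram <= free_ram:
                free[j][0] -= cpu
                free[j][1] -= ram
                placement[name] = j
                break
        else:
            raise ValueError(f"no PM can host {name}")
    return placement

def active_pms(placement):
    """Number of PMs hosting at least one VM (the quantity X to minimize)."""
    return len(set(placement.values()))

def migrations(old, new):
    """VMs present in both placements whose host PM changed."""
    return sorted(vm for vm in new if vm in old and old[vm] != new[vm])

vms = {"vm1": (2000, 2000), "vm2": (2000, 2000), "vm3": (1000, 1000), "vm4": (1000, 1000)}
previous = {"vm1": 0, "vm2": 1, "vm3": 2, "vm4": 3}  # one VM per PM
current = first_fit_decreasing(vms, pm_capacity=(4000, 4000), num_pms=4)
to_migrate = migrations(previous, current)
```

A real placement step would also weigh the cost of each migration, which this sketch ignores.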
22
Simulation Environment
4 PMs, each with 4000 MHz of CPU and 4000 MB of RAM.
2 applications.
Cost function:
23
Simulation 1
Minimize operating cost impact: ε = 0.05.
4 VM classes.
Given Table II below, A is given priority
24
[Figure: panels showing demand DA and DB, allocated CPUs RA and RB, number of physical machines, global utility, and response times TA and TB.]
At times t0 and t1, the workloads of A and B are both low and CPU demands are negligible, so only one PM is needed. Between times t2 and t3, workloads increase, so the system allocates more CPU accordingly. However, the workloads are not too heavy, and there is still enough resource to keep the response time under the SLA goal (τ = 100 ms) for all applications. In this interval, almost all PMs are mobilized. The impact of the application weights αi is illustrated between times t4 and t5, where demands for applications A and B are both high. As αA = 0.8 > αB = 0.2, A has the higher resource allocation priority. Since there is no CPU resource left to provision to application B, the response time of B exceeds the SLA goal. Note that in these intervals, the global utility decreases slightly (to 0.85) because of the SLA violation of B. At times t6, t7, and t8, a peak of demand for application A corresponds to a trough of demand for application B and vice versa. CPU capacity is switched between the two applications to maintain an optimal global utility.
25
Simulation 2: Operating cost factor ε increases to 0.3
Simulation (cont’d) As the resource demands increase, the global utility value decreases more. The worst value (0.55) is reached at t4 and t5, when both applications need the maximum amount of resource. In these intervals, and even at t7 and t8, the CPU allocation of B descends more quickly and then holds only a small amount compared to Figure 5, even when nearly all resource has been released by A. This behavior arises because the weight αB = 0.2 is negligible against αA = 0.8, and the local utility value of B is not enough to compensate for the operating cost, which is "boosted" by ε = 0.3. Consequently, there are some SLA violations of B, notably at time t7.
26
Simulation 3 New utility function for both A and B
27
Simulation 3 results
Focusing on intervals t4 and t5: as B now uses utility function 4(a), the CPU resource for B descends faster than in the first test (Figure 5), since 4(a), unlike 4(b), has no intermediate values when the resource amount is not enough to satisfy the SLA.
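The contrast between a utility function with and without intermediate values can be sketched as follows. The actual curves 4(a) and 4(b) are not reproduced in this transcript, so both shapes below are assumptions: a step function standing in for the "no intermediate values" case and a linear ramp standing in for the graded case.

```python
def step_utility(capacity, required):
    """Assumed all-or-nothing shape: satisfied only when the requirement is fully met."""
    return 1.0 if capacity >= required else 0.0

def graded_utility(capacity, required):
    """Assumed graded shape: partial capacity still earns partial utility."""
    if required <= 0:
        return 1.0
    return max(0.0, min(1.0, capacity / required))

required = 8000  # hypothetical CPU requirement in MHz
step_values = [step_utility(c, required) for c in (2000, 4000, 8000)]
graded_values = [graded_utility(c, required) for c in (2000, 4000, 8000)]
```

Under the step shape, a partial allocation adds cost without adding utility, so an optimizer has no reason to keep giving CPU to an application whose requirement it cannot meet. That is one plausible reading of why B's allocation drops faster in this test.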
28
Simulation 4: Changing weight factors
αA = 0.3, αB = 0.7.
B obtains enough CPU; A does not and fails to meet its SLA.
The increase of the response time of A (about 250 ms) during high-workload intervals is the result of its lower contribution to the global utility value.
29
Recommendations
The constraint solver used for optimizing provisioning and packing is not discussed.*
No mention of any overheads of migrating to a new PM or allocating a new VM.
The simulations do not dive into the N vectors for VM provisioning.
No discussion of the cost or frequency of running the GDM.
*The Choco open source constraint solver is used.
30
Conclusions
Dynamic placement and attention to application-specific goals are valuable.
Modeling on top of constraints allows for flexibility.
Utility functions provide a uniform way for applications to self-optimize.