Automatic Scaling of Internet Applications for Cloud Computing Services
Zhen Xiao, Qi Chen, and Haipeng Luo
May 2013
To appear in IEEE Transactions on Computers
Outline
1. Introduction
2. Related Work
3. System Architecture
4. Our Design
5. Simulations
6. Experiments
7. Conclusion
Benefits of Cloud Computing
Auto-scaling: cloud computing allows business customers to scale their resource usage up and down based on need.
[Diagram: an L7 switch dispatching requests to servers, each running a hypervisor]
Myth about Cloud Computing
Myth #1: cloud computing provides infinite resources on demand. Reality: it is just statistical multiplexing.
[Diagram: an L7 switch dispatching requests to servers, each running a hypervisor]
When and where should a VM be started for an application?
Goals of Scheduling
Achieve good demand satisfaction: the percentage of application demand that is satisfied should be maximized, even when a large number of applications experience their peak demand around the same time.
Support green computing: the number of servers used should be minimized as long as the needs of all VMs can still be satisfied. Idle servers can be turned off to save energy.
Related Work
Auto scaling in Amazon EC2 (Scalr): works for one application; load balancing.
Google AppEngine: supports Java & Python; the secure sandbox environment has strict limitations; cannot support existing applications.
Microsoft Windows Azure: applications should be stateless; users maintain the number of instances.
System Architecture
[Architecture diagram: client requests arrive at an L7 switch with a scheduler plugin; the Application Scheduler exchanges application info, load distribution, instance lists, and placement decisions with Usher CTRL; each server runs the Xen hypervisor with an Usher local node manager (LNM) in Dom 0 and application VMs in Dom U; a dispatcher, request counter, and monitor collect load information]
Fast start
Complicated applications can take a long time (several minutes) to finish all their initializations.
Suspend & resume: resumption time is independent of the start-up time. It depends only on how fast the server can read the suspended VM file from disk, which is quite short (several seconds) with modern disk technology.
Start-up time is reduced by 70% for a VM with 1 GB of memory.
[Diagram: suspended VM memory files stored on disk, resumed on demand]
Green computing
Put idle servers into standby mode so that they can be woken up quickly on demand.
Measured with TPC-W workloads: a fully utilized server consumes about 205 watts, an idle server about 130 watts, and a server in standby mode about 20 watts. Putting an idle server into standby mode thus saves about 85% of its energy.
Wake-on-LAN (WOL) technology: the standby-to-active transition takes 1-2 seconds; the suspend (to RAM)-to-active transition takes 3-5 seconds.
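The 85% figure follows directly from the measured power numbers; a quick arithmetic check (power values as reported on the slide):

```python
# Power figures measured with TPC-W workloads (from the slide).
P_FULL = 205     # fully utilized server, watts
P_IDLE = 130     # idle server, watts
P_STANDBY = 20   # server in standby mode, watts

# Energy saved by moving an idle server into standby mode.
saving = (P_IDLE - P_STANDBY) / P_IDLE
print(f"standby saves {saving:.0%} relative to idle")  # -> standby saves 85% relative to idle
```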
Problem definition
Our design: Class Constrained Bin Packing (CCBP)
Bin packing: items with different sizes are packed into a minimum number of bins.
The CCBP variant:
- The size of each item is one unit.
- Each bin has capacity v.
- Items are divided into classes; each bin can accommodate items from at most c distinct classes.
Modeling our problem as CCBP:
- Each server is a bin.
- Each class represents an application; items from that class represent its resource demand.
- The capacity of a bin is the amount of resource at a server.
- The class constraint is the maximum number of applications a server can run simultaneously.
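As an illustrative sketch (not the paper's exact algorithm), the CCBP model can be exercised with a simple greedy packer; the function name and the demand encoding here are assumptions:

```python
from collections import defaultdict

def ccbp_greedy(demands, v, c):
    """Greedy CCBP sketch: pack unit-size items into bins (servers).

    demands: dict mapping application (class) -> number of unit demand items
    v: bin capacity (resource units per server)
    c: class constraint (max distinct applications per server)
    Returns a list of bins, each a dict mapping application -> units placed.
    """
    bins = []
    for app, units in demands.items():
        remaining = units
        while remaining > 0:
            target = None
            # Prefer a bin already running this application with spare capacity.
            for b in bins:
                if app in b and sum(b.values()) < v:
                    target = b
                    break
            # Otherwise, a bin with room for one more application class.
            if target is None:
                for b in bins:
                    if len(b) < c and sum(b.values()) < v:
                        target = b
                        break
            # Otherwise, open a new bin (turn on another server).
            if target is None:
                target = defaultdict(int)
                bins.append(target)
            placed = min(remaining, v - sum(target.values()))
            target[app] += placed
            remaining -= placed
    return bins
```

For example, `ccbp_greedy({"A": 3, "B": 4, "C": 5}, v=5, c=2)` packs twelve unit items onto three servers without violating either the capacity or the class constraint.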
Our design: packing example (c = 2, v = 5)
[Figure: unit items from App 1, App 2, and App 3 packed onto Server 1 and Server 2, each server holding at most two distinct applications and five units]
Our design: enhanced color set algorithm
- Colors (applications) are partitioned into color sets; all sets contain exactly c colors except possibly the last one.
- Items from different color sets are packed independently using a greedy algorithm.
- Resource needs of applications vary with time, and applications can join and leave.
- A key observation of our work: not all item movements are equally expensive. Creating a new application instance is expensive; adjusting the load distribution is much cheaper.
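The partition into color sets can be sketched as a simple chunking step (a hypothetical helper, assuming colors are just application identifiers):

```python
def color_sets(colors, c):
    """Partition colors (applications) into sets of exactly c colors,
    except possibly the last set, which may hold fewer."""
    colors = list(colors)
    return [colors[i:i + c] for i in range(0, len(colors), c)]

# Seven applications with c = 3: two full sets and one partial set.
print(color_sets(["app%d" % i for i in range(1, 8)], 3))
```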
Demand varies with time: load increase
A load increase corresponds to the arrival of new items (example with c = 4, v = 5).
[Figure: new items of Apps 1-3 are added to bin 1, bin 2, and the color set's unfilled bin]
Demand varies with time: load decrease
A load decrease corresponds to the departure of already packed items (example with c = 4, v = 5).
[Figure: items of Apps 1-3 are repacked across bin 1, bin 2, and the unfilled bin]
Mathematical analysis
R is determined mostly by c * t (the total load of all applications in a color set).
Practical considerations
- Server equivalence classes: divide the servers into "equivalence classes" based on their hardware settings, and run our algorithm within each equivalence class.
- Periodic execution.
- Optimization: each color set has at most one unfilled bin; use unfilled bins to satisfy the applications whose demands are not completely satisfied.
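Grouping servers into equivalence classes is straightforward; a minimal sketch, assuming each server is described by a few hardware fields (the field names here are illustrative):

```python
from collections import defaultdict

def equivalence_classes(servers):
    """Group servers by hardware configuration; the packing algorithm
    then runs independently within each group."""
    groups = defaultdict(list)
    for s in servers:
        key = (s["cpu"], s["ram_gb"])  # illustrative hardware signature
        groups[key].append(s["name"])
    return groups

servers = [
    {"name": "s1", "cpu": "E5620", "ram_gb": 24},
    {"name": "s2", "cpu": "E5620", "ram_gb": 24},
    {"name": "s3", "cpu": "E5645", "ram_gb": 48},
]
print(dict(equivalence_classes(servers)))
```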
Simulations
Application demand ratio
1000 servers and 1000 applications.
Scalability
Increase both the number of servers and the number of applications from 1,000 to 10,000.
Application number
Fix the number of servers at 1,000 and vary the number of applications from 200 to 2,000.
Experiments
30 Dell PowerEdge servers with Intel E5620 CPUs (8 cores) and 24 GB of RAM, running Xen 4.0 and Linux.
Web applications: Apache servers serving CPU-intensive PHP scripts.
Clients: httperf is used to invoke the PHP scripts.
Load shifting
Auto scaling: green computing and flash crowd
Auto scaling: flash crowd, compared with Scalr (Amazon EC2)
Auto scaling
Our algorithm restores normal QoS in less than 5 minutes, while Scalr still suffers severely degraded performance even after 25 minutes.
Conclusion
We presented the design and implementation of a system that automatically scales the number of application instances up and down based on demand.
An enhanced color set algorithm decides the application placement and the load distribution.
The system achieves a high satisfaction ratio even when the load is very high, and saves energy by reducing the number of running instances when the load is low.
Thank You!
Application joins and leaves
Application leaves (load decrease: demand → 0):
- Shut down its instances.
- Remove the color from its color set.
Application joins:
- Sort the unfilled color sets by number of colors in decreasing order.
- Use a greedy algorithm to add the new colors into those sets.
- While there is more than one unfilled set: sort the unfilled sets by number of colors in decreasing order, then use the last set in the list to fill the first set.
- If new colors remain, partition them into additional color sets.
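The merge step for unfilled color sets can be sketched as follows (an interpretation of the slide's procedure, not the authors' code; `c` is the class constraint):

```python
def consolidate(unfilled, c):
    """Merge unfilled color sets until at most one remains.

    unfilled: list of color lists, each holding fewer than c colors.
    Repeatedly sort by size (descending), then move colors from the
    last (smallest) set into the first (largest) until the first is
    full or the donor is empty.
    """
    unfilled = [s for s in unfilled if s]  # drop already-empty sets
    while len(unfilled) > 1:
        unfilled.sort(key=len, reverse=True)
        first, last = unfilled[0], unfilled[-1]
        while len(first) < c and last:
            first.append(last.pop())
        if len(first) == c:
            unfilled.pop(0)   # now a full set, no longer "unfilled"
        if not last:
            unfilled.pop()    # emptied donor set disappears
    return unfilled
```

Each iteration either fills one set or empties one, so the list shrinks until at most one unfilled set remains.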
Scheduling Model
[Figure: instances of Apps 1-3 placed across Server 1, Server 2, and Server 3]