Published by Jesse Brown. Modified over 9 years ago.
1
Computer Science Dynamic Resource Management in Internet Data Centers Prashant Shenoy University of Massachusetts
2
Motivation
- Internet applications are used in a variety of domains
  - Online banking, online brokerage, online music stores, e-commerce
- Internet usage continues to grow rapidly
  - Broadband deployment is accelerating
- Outages of Internet applications are more common
  - “Site not responding”
  - “Connection timed out”
3
Internet Application Outages
- Holiday Shopping Season 2000:
  - Down for 30 minutes
  - Average download time ~ 260 sec
  - Periodic outages over 4 days
- 9/11: site inaccessible for brief periods
- Cause: too many users leading to overload
4
Internet Workloads are highly variable
- Short-term fluctuations
  - “Slashdot Effect”
  - Flash crowds
- Long-term seasonal effects
  - Time-of-day, month-of-year
- Peak difficult to predict
  - Static overprovisioning not effective
  - Manual allocation: slow
- Key issue: How can we design applications to handle large workload variations?
[Figure: Soccer World Cup ’98]
5
Internet Data Centers
- Internet applications run on data centers
- Server farms
  - Provide computational and storage resources
- Applications share data center resources
- Problem: How should the platform allocate resources to absorb workload variations?
6
Talk Outline
- Motivation
- Internet data center model
- Dynamic provisioning
- Request policing
- Cataclysm Server Platform
- Experimental results
- Summary
7
Data Center Model
- Dedicated hosting: each application runs on a subset of servers in the data center
  - Subsets are mutually exclusive: no server sharing
  - Data center hosts multiple applications
- Free server pool: unused servers
[Figure labels: retail Web site, streaming]
8
Internet Application Model
- Internet applications: multiple tiers
  - Example: 3 tiers: HTTP, J2EE app server, database
- Replicable applications
  - Individual tiers: partially or fully replicable
  - Example: clustered HTTP, J2EE server, shared-nothing db
- Each application employs a sentry
- Each tier uses a dispatcher: load balancing requests
[Figure: sentry and load balancing in front of HTTP, J2EE, and database tiers]
9
Approach
- Dynamic provisioning
  - Allocate servers to applications on-the-fly
- Request policing
  - Turn away excess requests
  - Degrade performance based on SLA
- Couple provisioning and policing
10
Research Questions
- How many servers to allocate and when?
- Multi-tier apps: when and how to provision each tier?
- How many requests should be turned away during overload?
- Multi-tier apps: where should requests be dropped?
- Can we meet SLAs during overloads?
- Is it possible to predict future workloads?
11
Dynamic Provisioning
- Key idea: increase or decrease allocated servers to handle workload fluctuations
  - Monitor incoming workload
  - Compute current or future demand
  - Match number of allocated servers to demand
[Figure: monitor workload -> compute current/future demand -> adjust allocation]
12
Single-tier Provisioning
- Single-tier provisioning well studied [Muse, TACT]
- Non-trivial to extend to multiple tiers
- Strawman #1: use single-tier provisioning independently at each tier
- Problem: independent tier provisioning may not increase goodput
[Figure: three tiers with capacities C=15, C=10, C=10.1 req/s; 14 req/s offered, 14 leave tier 1, 10 leave tier 2; 4 req/s dropped]
13
Single-tier Provisioning
- Single-tier provisioning well studied [Muse, TACT]
- Non-trivial to extend to multiple tiers
- Strawman #1: use single-tier provisioning independently at each tier
- Problem: independent tier provisioning may not increase goodput
[Figure: three tiers with capacities C=15, C=20, C=10.1 req/s; 14 req/s offered, 14 leave tier 1, 14 leave tier 2, 10.1 leave tier 3; 3.9 req/s dropped]
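The two strawman figures can be reproduced with a few lines of arithmetic. The sketch below assumes a simple pipeline in which each tier forwards at most its capacity and drops the rest; the function name `goodput` is illustrative and not from the talk.

```python
def goodput(arrival_rate, tier_capacities):
    """Requests/s that survive every tier when each tier drops its excess."""
    rate = arrival_rate
    for capacity in tier_capacities:
        # A tier cannot forward more than it receives or more than it can serve.
        rate = min(rate, capacity)
    return rate


# First figure: tier 2 (C=10) is the bottleneck.
before = goodput(14, [15, 10, 10.1])
# Second figure: tier 2 is grown to C=20, so the bottleneck moves to tier 3.
after = goodput(14, [15, 20, 10.1])
print(before, after)
```

Doubling the middle tier raises goodput only from 10 to 10.1 req/s, which is the point of the slide: provisioning one tier in isolation just shifts the bottleneck downstream.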
14
Model-based Provisioning
- Black box approach
  - Treat application as a black box
  - Measure response time from outside
  - Increase allocation if response time > SLA
  - Use a model to determine how much to allocate
- Strawman #2: use black box for multi-tier apps
- Problems:
  - Unclear which tier needs more capacity
  - May not increase goodput if bottleneck tier is not replicable
[Figure: 14 req/s offered to tiers with capacities C=15, C=20, C=10.1; 14, 14, and 10.1 req/s leave the tiers]
15
Provisioning Multi-tier Apps
- Approach: holistic view of multi-tier application
  - Determine tier-specific capacity independently
  - Allocate capacity by looking at all tiers (and other apps)
- Predictive provisioning
  - Long-term provisioning: time scale of hours
  - Maintain long-term workload statistics
  - Predict and provision for the next few hours
- Reactive provisioning
  - Short-term provisioning: time scale of several minutes
  - React to “current” workload trends
  - Correct errors of long-term provisioning
  - Handle flash crowds (inherently unpredictable)
16
Workload Prediction
- Long-term workload monitoring and prediction
  - Monitor workload for multiple days
  - Maintain a histogram for each hour of the day
  - Capture time-of-day effects
- Forecast based on
  - Observed workload for that hour in the past
  - Observed workload for the past few hours of the current day
- Predict a high percentile of expected workload
[Figure: per-hour workload for Mon, Tue, Wed, and Today]
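A minimal sketch of the per-hour histogram idea follows. It keeps raw samples per hour of day and returns a nearest-rank percentile; the class name, the 95th-percentile default, and the sample values are assumptions for illustration, and the correction based on the past few hours of the current day is left out.

```python
import math
from collections import defaultdict


class HourlyPredictor:
    """Keeps observed request rates per hour of day and predicts a high percentile."""

    def __init__(self, percentile=95):
        self.percentile = percentile          # integer percentile, e.g. 95
        self.history = defaultdict(list)      # hour of day -> observed rates (req/s)

    def observe(self, hour, rate):
        self.history[hour].append(rate)

    def predict(self, hour):
        samples = sorted(self.history[hour])
        if not samples:
            raise ValueError("no history for this hour yet")
        # Nearest-rank percentile: smallest sample covering `percentile` % of the data.
        rank = max(1, math.ceil(self.percentile * len(samples) / 100))
        return samples[rank - 1]


predictor = HourlyPredictor(percentile=90)
for rate in [100, 120, 110, 300, 105, 115, 125, 130, 90, 95]:
    predictor.observe(14, rate)               # ten past observations for 2 pm
print(predictor.predict(14))
```

Predicting a high percentile rather than the mean biases the forecast toward overprovisioning, which is the cheaper mistake when an SLA is at stake.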
17
Predictive Provisioning
- Queuing-theoretic application model
  - Each individual server is a G/G/1 queue
  - Derive per-tier E(r) from end-to-end SLA
  - Monitor other parameters and determine per-server capacity
  - Use predicted workload λ_pred to determine # servers per tier
  - Assumes perfect load balancing in each tier
- Alternative: each tier G/G/k
[Figure: G/G/1 queues fed by λ_pred]
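The slide does not give the capacity formula, so the sketch below assumes a standard heavy-traffic G/G/1 bound: a server can sustain a rate of at least 1 / (s + (var_a + var_b) / (2 (d - s))) while keeping mean response time under d, where s is the mean service time and var_a, var_b are the variances of inter-arrival and service times. The function names and the numbers in the example are illustrative only.

```python
import math


def per_server_capacity(d, s, var_interarrival, var_service):
    """Request rate (req/s) one G/G/1 server can take with mean response time <= d.

    Assumed heavy-traffic bound; d and s are in seconds, variances in seconds^2.
    """
    if d <= s:
        raise ValueError("response-time target must exceed mean service time")
    return 1.0 / (s + (var_interarrival + var_service) / (2.0 * (d - s)))


def servers_needed(predicted_rate, capacity):
    """Servers for a tier, assuming perfect load balancing across them."""
    return math.ceil(predicted_rate / capacity)


# Hypothetical tier: 20 ms mean service time, 100 ms per-tier response target.
capacity = per_server_capacity(d=0.1, s=0.02, var_interarrival=0.001, var_service=0.001)
print(servers_needed(predicted_rate=200, capacity=capacity))
```

The per-tier target d would come from splitting the end-to-end SLA across tiers; how that split is done is not specified on the slide.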
18
Reactive Provisioning
- Idea: react to current conditions
  - Useful for capturing significant short-term fluctuations
  - Can correct errors in predictions
- Track error between long-term predictions and actual workload
  - Allocate additional servers if error exceeds a threshold
  - Account for prediction errors
- Can be invoked if request drop rate exceeds a threshold
  - Handles sudden flash crowds
- Operates over a time scale of a few minutes
- Pure reactive provisioning: lags workload
  - Reactive + predictive more effective!
[Figure: predicted and actual workload time series; when prediction error > threshold, invoke reactor and allocate servers]
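The reactive trigger can be sketched as a threshold check on relative prediction error and on drop rate. The threshold values, the function name, and the rule of adding at least one server when drops trigger the reactor are assumptions, not details from the talk.

```python
import math


def reactive_extra_servers(predicted_rate, observed_rate, per_server_capacity,
                           drop_rate=0.0, error_threshold=0.2, drop_threshold=0.05):
    """Extra servers to allocate now; 0 when the prediction is holding up.

    predicted_rate must be positive. Rates are in req/s, drop_rate is a fraction.
    """
    error = (observed_rate - predicted_rate) / predicted_rate
    if error <= error_threshold and drop_rate <= drop_threshold:
        return 0                                   # reactor not invoked
    shortfall = max(0.0, observed_rate - predicted_rate)
    # Cover the unpredicted load; add at least one server if drops triggered us.
    return max(1, math.ceil(shortfall / per_server_capacity))


print(reactive_extra_servers(100, 110, 30))              # small error, no action
print(reactive_extra_servers(100, 190, 30))              # flash crowd
print(reactive_extra_servers(100, 100, 30, drop_rate=0.1))
```

Run every few minutes, a check like this corrects the hourly predictor without replacing it, matching the slide's point that reactive and predictive provisioning work best together.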
19
Talk Outline
- Motivation
- Internet data center model
- Dynamic provisioning
- Request policing
- Cataclysm Server Platform
- Experimental results
- Summary
20
Request Policing
- Key idea: if incoming request rate > current capacity
  - Turn away excess requests
  - Degrade performance of requests
- Why police when you can provision?
  - Provisioning is not instantaneous
  - Residual sessions on reallocated server
  - Application and OS installation and configuration overheads
  - Overhead of several (5-30) minutes
[Figure: sentry policing in front of a G/G/1 queue; excess requests dropped]
21
Class-based Differentiation
- Some requests are more important than others
  - Purchase versus catalog browsing
  - Stock trade versus view account balance
- Overload => preferentially let in more important requests
  - Maximize utility during overload
- Incoming requests queued up in class queues
  - Example: gold, silver, bronze class
  - Higher priority to more important classes
[Figure: sentry policing with per-class queues; excess requests dropped]
22
Scalable Policing Techniques
- Examining individual requests is infeasible
  - Incoming rate may be an order of magnitude greater than capacity
  - Need to reduce overhead of policing decisions
- Idea #1: Batch processing
  - Premise: request arrivals are bursty
  - Admit a batch of queued-up requests
  - One admission control test per batch
  - Reduces overhead from O(n) to O(b)
- Idea #2: Use pre-computed thresholds
  - Example: capacity = 100 req/s; G=75, S=50, B=50 req/s
  - Admit all gold, half of silver, and no bronze
  - Periodically estimate λ and s: compute threshold
  - O(1) overhead: trades accuracy for efficiency
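The threshold example works out as follows. This sketch assumes every request costs the same, so capacity can be handed out in req/s from the most important class down; the function name is illustrative, and a real sentry would also fold in the estimated service times.

```python
def admission_fractions(capacity, arrival_rates):
    """Fraction of each class to admit.

    arrival_rates: list of (class_name, rate in req/s), most important first.
    """
    remaining = capacity
    fractions = {}
    for name, rate in arrival_rates:
        admitted = min(rate, remaining)            # give this class what is left
        fractions[name] = admitted / rate if rate > 0 else 1.0
        remaining -= admitted
    return fractions


thresholds = admission_fractions(100, [("gold", 75), ("silver", 50), ("bronze", 50)])
print(thresholds)  # {'gold': 1.0, 'silver': 0.5, 'bronze': 0.0}
```

Once the fractions are computed, each arriving request needs only a class lookup and a comparison, which is where the O(1) overhead comes from. The fractions go stale between recomputations, which is the accuracy cost the slide mentions.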
23
Cataclysm Server Platform
- Prototype data center
  - Commodity hardware
  - 40+ Pentium servers
  - 2 TB of RAID arrays
  - Gigabit switches
- Linux-based platform
24
Cataclysm Software Architecture
- Two key components: control plane and nuclei
- Cataclysm control plane: provisioning, global allocation, app placement
- Server node: runs apps and sentries
  - Nucleus: resource monitoring, local allocation
[Figure: control plane above three server nodes, each with a nucleus, apps, and OS]
25
Cataclysm Node Architecture
- Capsule: component of an app on a node
- QLinux: proportional sharing of node resources
- Nucleus: resource allocations across capsules and VMs
[Figure: node with nucleus and capsule on QLinux (HSFQ CPU scheduler, proportional-share packet scheduler, Cello disk scheduler, SFVM memory manager); node with nucleus on QLinux and capsules in active and dormant VMs (UML, Xen)]
26
Cataclysm Applications
- Multi-tiered apps: RUBiS (e-auctions), RUBBoS (bulletin board)
  - Apache, JBoss, MySQL
- Tier-1 sentry
  - ktcpvs: kernel HTTP load balancer
  - Request policing and class-based differentiation
  - Workload monitoring
- Tier-2 sentry: Apache JBoss redirector, workload monitoring
- Nucleus: Linux Trace Toolkit, /proc to monitor node statistics
- All system components are replicable!
[Figure: ktcpvs (load balancing, policing) in front of Apache, JBoss, and MySQL tiers]
27
Talk Outline
- Motivation
- Internet data center model
- Dynamic provisioning
- Request policing
- Cataclysm Server Platform
- Experimental results
- Summary
28
Dynamic Provisioning
- Server allocation adapts to changing workload
- RUBiS: e-auction application like eBay
[Figure panels: Workload; Server Allocation]
29
Class-based differentiation
[Figure panels: Arrival rate vs. Time (sec) for GLD, SIL, BRZ; Fraction admitted vs. Time (sec) for GLD, SIL, BRZ]
30
Threshold-based: higher scalability
[Figure: Scalability: CPU usage vs. Arrival rate for Batch and Thresh]
31
Other Research Results
- OS resource allocation
  - QLinux [ACM MM00], SFS [OSDI00], DFS [RTAS02]
  - SHARC cluster-based prop. sharing [TPDS03]
- Shared hosting provisioning
  - Measurement-based [IWQOS02], Queuing-based [Sigmetrics03, IWQOS03]
  - Provisioning granularity [Self-manage 03]
  - Application placement [PDCS 2004]
  - Profiling and overbooking [OSDI02]
- Storage issues
  - iSCSI vs NFS [FAST03], Policy-managed [TR03]
32
Glimpse of Other Projects
- Hyperion: network processor based measurement platform
  - Measurement in the backbone and at the edge
  - NP-based measurements in the data center
- RiSE: Rich Sensor Environments
  - Video sensor networks
  - Robotics sensor networks
  - Real-time sensor networks
  - Weather sensors
33
Concluding Remarks
- Internet applications see varying workloads
- Handle workload dynamics by
  - Dynamic capacity provisioning
  - Request policing
- Need to account for multi-tiered applications
- Joint work: Bhuvan Urgaonkar, Abhishek Chandra and Vijay Sundaram
- More at http://lass.cs.umass.edu
34
Predictive Provisioning
- Invoked once every hour
  - Captures long-term variations: time-of-day effects
  - Extensions to seasonal effects (month-of-year, holidays)
- How to initialize?
  - Needs several days of history to work well
- What happens if no servers are available?
  - Use revenue/utility to arbitrate allocation [Muse]
  - Turn away excess requests
- Non-replicable tiers are easy to handle
  - Provision other tiers until non-replicable tier is saturated
35
Degrade or Drop?
- Depends on the application and the SLA
- Degrading increases effective capacity
  - Also degrades performance seen by requests
- Degrade if utility from servicing more requests at lower performance > utility from servicing fewer requests - penalty of dropping requests
- Otherwise drop requests
[Figure: SLA table: < 500 ms: r1; < 1 s: r2; < 10 s: r3]
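The degrade-or-drop rule is a single comparison. The sketch below assumes utility is linear in the number of requests served and that each dropped request incurs a fixed penalty; all parameter names and numbers are hypothetical.

```python
def should_degrade(n_degraded, utility_degraded, n_full, utility_full,
                   n_dropped, drop_penalty):
    """True when serving more requests at lower performance beats dropping."""
    # Option A: admit n_degraded requests, each earning the reduced utility.
    utility_if_degrading = n_degraded * utility_degraded
    # Option B: serve n_full requests at full utility and pay for each drop.
    utility_if_dropping = n_full * utility_full - n_dropped * drop_penalty
    return utility_if_degrading > utility_if_dropping


# 150 requests at 0.6 utility each versus 100 at 1.0 with 50 drops costing 0.5 each.
print(should_degrade(150, 0.6, 100, 1.0, 50, 0.5))
```

Per-request utilities of this kind could be read off an SLA that attaches a different value to each response-time bound.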
36
Use of Virtual Machine Monitors
- Server allocation can be slow (~ 5-20+ minutes)
  - Need residual sessions to terminate
  - Disk scrubbing, OS and app installation, configuration
  - Application and system overheads
- Flash crowds => need fast allocation
- Use virtual machines
  - Each app runs inside a VM, multiple VMs on a server
  - Only one VM is active at any time; other VMs are “hot spares”
  - Server allocation => idle one VM, activate another
  - System overhead reduces to < 1 s
- Still need to account for residual sessions
  - Application issue, no longer a system limitation
37
Threshold-based: loss of accuracy