Published by Emil Freeman. Modified over 9 years ago.
1
UNIVERSITY OF MASSACHUSETTS, AMHERST – Department of Computer Science
Dynamic Resource Management in Internet Data Centers
Bhuvan Urgaonkar
Laboratory for Advanced Systems Software
University of Massachusetts Amherst
http://www.cs.umass.edu/~bhuvan
2
Internet Applications
Proliferation of Internet applications
  [Figures: auction site, online game, online store]
Growing significance in personal, business affairs
Focus: Internet server applications
3
Internet Workloads Are Dynamic
Multi-time-scale variations
  Time-of-day, hour-of-day
  Flash crowds
User threshold for response time: 8-10 s
Key issue: Provide good response time under varying workloads
[Figures: Request Rate (req/min) vs. Time (hrs); Arrivals per min vs. Time (hours) and Time (days)]
4
Data Centers
Clusters of servers
Hosting platforms:
  Rent resources to third-party applications
  Performance guarantees in return for revenue
Benefits:
  Applications: don’t need to maintain their own infrastructure
    o Rent server resources, possibly on demand
  Platform provider: generates revenue by renting resources
5
Goals of a Data Center
Satisfy application performance guarantees under dynamic workloads
  E.g., average response time, throughput
Maximize resource utilization
  E.g., maximize the number of hosted applications
Question: How should a data center manage its resources to meet these goals?
6
Manual Resource Allocation
Resource over-provisioning
  Resource wastage
  A bad estimate could result in under-allocation
Manual reallocation
  Slow allocation time
Challenge: How to handle dynamic workloads while efficiently utilizing resources?
[Figure: WC Soccer 1998]
7
Dynamic Resource Management
How to map an application to servers in the data center?
  Application Placement [OSDI’02, PDCS’04]
How to provide good performance under dynamic workloads?
  Dynamic Capacity Provisioning [Auto Computing’05]
How to remain operational under extreme overloads?
  Scalable Policing [World Wide Web’05]
9
Talk Outline
Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research
10
Data Center Models
Small applications
  Require only a fraction of a server
  Shared Web hosting, $20/month to run own Web site
Shared hosting: multiple applications on a server
  Co-located applications compete for server resources
11
Data Center Models
Large applications
  May span multiple servers
  eBay site uses thousands of servers!
Dedicated hosting: at most one application per server
  Allocation at the granularity of a single server
12
Application Placement [OSDI’02]
How to map an application to servers in the data center?
Step 1: Finding the application’s resource requirement
  Automatic requirement inference technique
Step 2: Identifying servers to host the application
  Easy in dedicated hosting
    o Just assign the desired number of available servers!
  Non-trivial in shared hosting
    o Opportunity for statistical multiplexing of resources on a server
    o Multi-dimensional Knapsack
13
Resource Requirement Inference
[Figure: ON-OFF process over time, divided into measurement intervals; PDF (probability vs. fractional usage) and CDF (cumulative probability vs. fractional usage, with points A and B and the 0.99 level marked)]
14
Requirement Inference Technique
Profiling: process of determining resource usage
  Run the application on an isolated server
  Subject the application to a real workload
Determine CPU and network usage
  Use the Linux trace toolkit [Yaghmour00]
  Track scheduling events, packet transmission times
Implementation on a Linux cluster
  Apache Web server using SPECWeb99
  Streaming media server with VBR MPEG-1 clients
  Postgres database server
  Quake game server
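The profiling step can be illustrated with a minimal sketch. It assumes the trace has already been reduced to a list of (start, end) busy periods for one resource (the ON periods of the ON-OFF process); the function names, the bin count, and the sample trace are hypothetical and are not taken from the talk or from the Linux trace toolkit.

```python
from collections import Counter


def fractional_usage(busy_periods, interval, horizon):
    """Fraction of each measurement interval during which the resource was busy.

    busy_periods: list of (start, end) times of ON periods (assumed input format).
    interval:     length of one measurement interval.
    horizon:      total length of the profiling run.
    """
    n = int(horizon // interval)
    busy = [0.0] * n
    for start, end in busy_periods:
        for i in range(n):
            lo, hi = i * interval, (i + 1) * interval
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                busy[i] += overlap
    return [b / interval for b in busy]


def usage_pdf(usages, bins=10):
    """Histogram of fractional usage, normalized so the bins sum to 1."""
    counts = Counter(min(int(u * bins), bins - 1) for u in usages)
    return [counts.get(b, 0) / len(usages) for b in range(bins)]


def usage_cdf(pdf):
    """Running sum of the PDF bins."""
    total, out = 0.0, []
    for p in pdf:
        total += p
        out.append(total)
    return out


# Hypothetical trace: three ON periods observed over a 4-second run.
trace = [(0.0, 0.5), (1.0, 2.0), (2.25, 2.5)]
usages = fractional_usage(trace, interval=1.0, horizon=4.0)
pdf = usage_pdf(usages, bins=4)
cdf = usage_cdf(pdf)
```

A high percentile of the usage distribution can then be read off the CDF and used as the application's resource requirement.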
15
Application Profiles
[Figures: Probability vs. Fraction of CPU for Apache Web Server, 50% cgi-bin; Probability vs. Fraction of NW bandwidth for Streaming Media Server, 20 clients]
Observation: Resource usage can be bursty
  Peak requirement much higher than a high percentile
Insight: Provisioning for the tail can save resources!
  Under-provisioning of resources
  Occasional violations of resource guarantees
16
Controlled Resource Under-provisioning
Allow applications to specify a violation tolerance V
  Provision for the (100-V)th percentile of resource usage
  Requirements do not necessarily peak simultaneously
    o Probability of violations even less than V
Similar to resource overbooking in airline industry
Determine which servers have enough capacity
  $\sigma_k$: (100-V)th percentile for application k, C: server capacity
  $\sum_{k=1}^{K} \sigma_k^{cpu} \le C^{cpu}$; $\sum_{k=1}^{K} \sigma_k^{net} \le C^{net}$
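The capacity check on this slide can be sketched as follows. This is an illustration under stated assumptions, not the placement algorithm from the cited papers: the nearest-rank percentile, the dictionary layout of profiles, and all function names are hypothetical choices made for the example.

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer between 0 and 100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def requirement(profile, violation_tolerance):
    """sigma for each resource: the (100 - V)th percentile of its usage samples."""
    return {
        res: percentile(samples, 100 - violation_tolerance)
        for res, samples in profile.items()
    }


def fits(server_capacity, hosted_requirements, new_requirement):
    """True if the sum of sigmas stays within capacity for every resource."""
    for res, cap in server_capacity.items():
        used = sum(r[res] for r in hosted_requirements)
        if used + new_requirement[res] > cap:
            return False
    return True


# Hypothetical bursty profile: usage is 0.1 of the CPU except for one spike.
profile = {"cpu": [0.1] * 99 + [0.9], "net": [0.05] * 100}
sigma_strict = requirement(profile, violation_tolerance=0)  # provision for the peak
sigma_relaxed = requirement(profile, violation_tolerance=1)  # provision for the 99th percentile
```

With this made-up profile, provisioning for the peak reserves 0.9 of the CPU while a tolerance of V = 1 reserves 0.1, which is the kind of gap between peak and high percentile that the previous slide describes.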
17
Resource Utilization Gains
1% violations can more than double number of applications!
Small under-provisioning can yield large gains
Bursty applications yield larger benefits
[Figures: Placement of Apache Web Servers (Web Servers Placed vs. Data center size) and Placement of Streaming Media Servers (Media Servers Placed vs. Data center size), each comparing No Viol with Viol=1%]
18
Impact of Under-provisioning on Application Performance
Provisioning for the tail results in tolerable degradation
Large resource savings possible with small degradation
19
Application Placement: Summary [OSDI’02, PDCS’04]
Server applications tend to have bursty usage
Save resources in shared data centers running small applications
  Determine resource usage behavior
  Under-provision resources
  Controlled performance degradation
Theoretical properties of application placement
  NP-hard, approximation algorithms
20
Talk Outline
Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research
21
Dynamic Capacity Provisioning
Key idea: increase or decrease allocated resources to handle workload fluctuations
To handle increased workload …
  Shared hosting: increase resource share
  Dedicated hosting: start replicas on additional servers
Focus: Dedicated hosting, large applications
[Diagram: Monitor workload, Compute future demand, Adjust allocation] [Chandra03, Chase01]
22
Dynamic Capacity Provisioning
[Diagram: Monitor, Predictors, Application Models, Allocator, and Servers. Labeled flows: workload measurements, observed workload, predicted workload, resource reqmts, server allocations]
24
Internet Application Architecture
Multi-tier architecture
  Each tier uses services provided by its successor
  Session-based workloads
  Caching, replication
[Diagram: request processing in an online bookstore across HTTP, J2EE, and Database tiers; a search for “moby” issues queries and returns a response listing Melville’s ‘Moby Dick’ and Music CDs by Moby]
25
Existing Application Models
Models for Web servers [Chandra03, Doyle03]
  Do not model Java server, database etc.
Black-box models [Kamra04, Ranjan02]
  Unaware of bottleneck tier
Extensions of single-tier models [Welsh03]
  Fail to capture interactions between tiers
Existing models inadequate for multi-tier Internet applications
26
Baseline Application Model [SIGMETRICS’05]
Model consists of two components
  Sub-system to capture behavior of clients
  Sub-system to capture request processing inside the application
[Diagram: clients, application]
27
Modeling Clients
Clients think between successive requests
Infinite server system to capture think time Z
  Captures independence of Z from processing in application
[Diagram: Client 1, Client 2, …, Client N, each with think time Z, form Q0 on the clients side, connected to the application]
28
Modeling Request Processing
Transitions defined to capture circulation of requests
  Request may move to next queue or previous queue
Multiple requests are processed concurrently at tiers
  Processor sharing scheduling discipline
Caching effects get captured implicitly!
[Diagram: N sessions; queues Q1, Q2, …, QM for tier 1, tier 2, …, tier M, with service times S1, S2, …, SM and transition probabilities p1, p2, p3, …, pM = 1]
29
Putting It All Together
A closed-queuing model that captures a given number of simultaneous sessions being served
[Diagram: N client sessions with think time Z at Q0, followed by queues Q1, Q2, …, QM for tier 1, tier 2, …, tier M, with service times S1, S2, …, SM and transition probabilities p1, p2, p3, …, pM = 1]
30
Model Solution and Parameter Estimation [SIGMETRICS’05]
Mean Value Analysis (MVA) Algorithm
  Computes mean response time
Visit ratios
  Equivalent to transition probabilities for MVA
  $V_i \approx \lambda_i / \lambda_{req}$; $\lambda_{req}$ at policer, $\lambda_i$ from logs
Service times
  Use residence time $X_i$ logged at tier i
  For last tier, $S_M \approx X_M$
  $S_i = X_i - (V_{i+1} / V_i) \cdot X_{i+1}$
Think time
  Measured at the entry point of application
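As a rough illustration of this slide, the sketch below applies the service-time formula and then runs the textbook exact MVA recursion for a closed network with a think-time (delay) station and one queueing station per tier. It assumes each tier behaves as a single queueing station and may differ from the variant used in the SIGMETRICS’05 paper; the function names and sample numbers are hypothetical.

```python
def estimate_service_times(visit_ratios, residence_times):
    """S_M = X_M for the last tier; S_i = X_i - (V_{i+1} / V_i) * X_{i+1} otherwise."""
    m = len(residence_times)
    service = [0.0] * m
    service[-1] = residence_times[-1]
    for i in range(m - 1):
        ratio = visit_ratios[i + 1] / visit_ratios[i]
        service[i] = residence_times[i] - ratio * residence_times[i + 1]
    return service


def mva(visit_ratios, service_times, think_time, sessions):
    """Exact MVA: returns (mean response time, throughput) for `sessions` sessions."""
    # Service demand of a tier = visits per request * service time per visit.
    demands = [v * s for v, s in zip(visit_ratios, service_times)]
    queue = [0.0] * len(demands)  # mean queue lengths with zero sessions
    response, throughput = 0.0, 0.0
    for n in range(1, sessions + 1):
        # An arriving request sees the queue lengths of the network with n - 1 sessions.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)  # Little's law over the whole cycle
        queue = [throughput * r for r in residence]
    return response, throughput


# Hypothetical two-tier measurements (seconds); not values from the talk.
V = [1.0, 2.0]
X = [0.10, 0.03]
S = estimate_service_times(V, X)
R, throughput = mva(V, S, think_time=1.0, sessions=100)
```

Running the recursion for increasing session counts gives a predicted response-time curve that a provisioning step could compare against a response-time target.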
31
Evaluation of Baseline Model
Auction site RUBiS
One server per tier: Apache, JBOSS, Mysql
Concurrency limits not captured
[Figure: labels 150 and 75]
32
Handling Concurrency Limits
Requests may be dropped due to concurrency limits
Need to model the finiteness of queues!
[Diagram: the closed queueing model (N sessions, Q0 with think time Z, queues Q1, Q2, …, QM with service times S1, S2, …, SM), with dropped requests marked]
33
Handling Concurrency Limits
Approach: Subsystems to capture dropped requests
  Distinguish the processing of dropped requests
[Diagram: the closed queueing model (N sessions, Q0 with think time Z, queues Q1, Q2, …, QM with service times S1, S2, …, SM), with added drop subsystems labeled with p1, …, pM and S1, …, SM]
34
Response Time Prediction
Enhanced model can capture concurrency limits
35
Query Caching at the Database
Caching effects
  Captured by tuning $V_i$ and/or $S_i$
Bulletin-board site RUBBoS
  50 sessions
  SELECT SQL_NO_CACHE causes Mysql to not cache the response to a query
More model enhancements
  Replication at tiers
  Multiple session classes
36
Dynamic Capacity Provisioning
[Diagram: Monitor, Predictors, Application Models, Allocator, and Servers. Labeled flows: workload measurements, observed workload, predicted workload, resource reqmts, server allocations]
37
Handling Unanticipated Workloads
Allocation for an application may be insufficient
  Short-term fluctuations are difficult to predict
  Errors in parameter estimation may cause under-allocation
Reactor: Allocate additional servers over time scale of a few minutes if
  Observed workload exceeds predicted workload
  Request drop rate exceeds a threshold
  Repeated invocations may be needed
Policer: If incoming session rate > current capacity
  Turn away excess sessions
  Highly scalable policing [World Wide Web’05]
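The reactor trigger and the policer rule on this slide can be written down as two small functions. This is only a sketch of the stated conditions: the function names, argument units, and threshold value are assumptions, and the scalable policing design from the cited paper is not reproduced here.

```python
def reactor_should_add_servers(observed_rate, predicted_rate, drop_rate, drop_threshold):
    """True if either reactor condition holds.

    observed_rate / predicted_rate: workload in requests per unit time (assumed units).
    drop_rate / drop_threshold:     fraction of requests dropped (assumed units).
    """
    return observed_rate > predicted_rate or drop_rate > drop_threshold


def police(incoming_sessions, capacity):
    """Admit sessions up to the current capacity and turn away the excess.

    Returns (admitted, turned_away).
    """
    admitted = min(incoming_sessions, capacity)
    return admitted, incoming_sessions - admitted


# Hypothetical readings for one monitoring period.
need_more = reactor_should_add_servers(
    observed_rate=120.0, predicted_rate=100.0, drop_rate=0.0, drop_threshold=0.05
)
admitted, turned_away = police(incoming_sessions=150, capacity=100)
```

Because the slide notes that repeated invocations may be needed, a caller would presumably re-evaluate the reactor condition in each monitoring period rather than only once.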
38
Prototype Data Center
40+ Linux servers
Gigabit switches
Multi-tier applications
  Auction (RUBiS)
  Bulletin-board (RUBBoS)
  Apache, JBOSS (replicable)
  Mysql database
[Diagram: a Control Plane (application placement, dynamic provisioning) and Server Nodes, each with Nucleus, Apps, and OS; labels: applications, request policer, resource monitoring, parameter estimation]
39
Dynamic Capacity Provisioning
Auction application RUBiS
Factor of 4 increase in 30 min
Server allocations increased to match increased workload
Response time kept below 2 seconds
[Figures: Workload, Response time, Server allocations]
40
Talk Outline
Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research
41
Summary
Dynamic resource management in data centers
Application Placement
  Improve utilization by under-provisioning
Dynamic Capacity Provisioning
  Analytical model for Internet applications
  Predictive provisioning
  Reactive provisioning
Handling Extreme Overloads
  Scalable policing
42
Future Research Directions
Virtual machine based hosting
  Trade-off between fast switching and VM overheads
Malicious flash crowds, DoS attacks
  Security mechanisms
Sensor networks
  Constrained environment
  How to provide desired performance to overlying applications?
Mobile computing
  Resource-deficient clients
  How to design Internet servers for such clients?
Focus: Large-scale emerging distributed systems
43
Thank you!
More information at: http://www.cs.umass.edu/~bhuvan
44
Agile Switching Using Virtual Machine Monitors
Use VMMs to enable fast switching of servers
Switching time only limited by residual sessions
VMMs allow multiple “virtual” machines on a server
  E.g., Xen, VMWare, …
[Diagram: a VMM hosting VM 1, VM 2, and VM 3, with VMs marked active or dormant]