University of Massachusetts, Amherst – Department of Computer Science

Dynamic Resource Management in Internet Hosting Platforms
Ph.D. Thesis Defense
Bhuvan Urgaonkar
Advisor: Prashant Shenoy

Internet Applications
- Proliferation of Internet applications: auction sites, online games, online retail stores
- Growing significance in personal and business affairs
- Focus: Internet server applications

Hosting Platforms
- Data centers: clusters of servers, storage devices, high-speed interconnect
- Hosting platforms rent resources to third-party applications: performance guarantees in return for revenue
- Benefits:
  o Applications: no need to maintain their own infrastructure; rent server resources, possibly on demand
  o Platform provider: generates revenue by renting resources

Goals of a Hosting Platform
- Meet service-level agreements: satisfy application performance guarantees (e.g., average response time, throughput)
- Maximize revenue (e.g., maximize the number of hosted applications)
- Question: How should a hosting platform manage its resources to meet these goals?

Challenge #1: Dynamic Workloads
- Multi-time-scale variations: time-of-day, hour-of-day
- Overloads, e.g., flash crowds
- User threshold for response time: 8-10 s
- Key issue: How to provide good response time under varying workloads?
[Figures: request rate (req/min) vs. time (hours), peaking near 140K req/min; arrivals per minute vs. time (days)]

Challenge #2: Complexity of Applications
- Complex software architecture: diverse software components (Web servers, Java application servers, databases)
- Multiple classes of clients: how to provide differentiated service?
- Replicable components: how many replicas to have?
- Tunable configuration parameters (e.g., MaxClients in Apache): how to set these parameters?
- Key issue: How to capture all this complexity?

Talk Outline
- Motivation
- Thesis contributions
- Application modeling
- Dynamic provisioning
- Scalable request policing
- Conclusions

Hosting Platform Models
- Small applications require only a fraction of a server (e.g., shared Web hosting, $20/month to run your own Web site)
- Shared hosting: multiple applications on a server
- Co-located applications compete for server resources

Hosting Platform Models
- Large applications may span multiple servers (the eBay site uses thousands of servers!)
- Dedicated hosting: at most one application per server
- Allocation at the granularity of a single server

Thesis Contributions
Dynamic resource management in hosting platforms
- Shared hosting:
  o Statistical multiplexing and under-provisioning [OSDI 2002]
  o Application placement [PDCS 2004]
- Dedicated hosting:
  o Analytical model for an Internet application [SIGMETRICS 2005]
  o Dynamic provisioning [Autonomic Computing 2005]
  o Scalable request policing [PODC 2004, WWW 2005]

Talk Outline
- Motivation
- Thesis contributions
- Application modeling
- Dynamic provisioning
- Scalable request policing
- Conclusions

Internet Application Architecture
- Multi-tier architecture: each tier uses services provided by its successor
- Session-based workloads
- Example: request processing in an online bookstore. The HTTP tier forwards a search for "moby" to the J2EE tier, which queries the database; the response lists Melville's "Moby Dick" and music CDs by Moby.

Baseline Application Model [SIGMETRICS 2005]
The model consists of two components:
- a sub-system to capture the behavior of clients
- a sub-system to capture request processing inside the application

Modeling Clients
- Clients think between successive requests
- An infinite-server system Q_0 captures the think time Z
- Captures the independence of Z from processing in the application
[Figure: N clients, each with think time Z, feeding the application]

Modeling Request Processing
- Queues Q_1, ..., Q_M model tiers 1, ..., M with service times S_1, ..., S_M
- Transitions capture the circulation of requests: a request may move to the next queue or back to the previous queue (transition probabilities p_1, ..., p_M, with p_M = 1)
- Multiple requests are processed concurrently at each tier (processor-sharing scheduling discipline)
- Caching effects get captured implicitly!

Putting It All Together
- A closed queuing network: client queue Q_0 (think time Z) plus tier queues Q_1, ..., Q_M (service times S_1, ..., S_M)
- Captures a given number N of simultaneous sessions being served

Mean-Value Analysis
- The network is a product-form closed queuing network
- L_m: average length of Q_m
- A_m: average number of clients in Q_m seen by an arriving client
- Arrival theorem: A_m(n+1) = L_m(n)
- Iterative algorithm to compute mean queue lengths and sojourn times
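The iteration on this slide can be sketched in Python. The helper `mva` below is hypothetical (not code from the thesis); it implements the standard exact-MVA recursion with think time Z, using the arrival theorem A_m(n) = L_m(n-1):

```python
def mva(service, visits, Z, N):
    """Exact mean-value analysis for a closed product-form network.

    service[m]: mean service time S_m at tier m
    visits[m]:  visit ratio V_m of tier m
    Z:          mean client think time
    N:          number of concurrent sessions (N >= 1)
    Returns (throughput, total residence time, per-tier queue lengths).
    """
    M = len(service)
    L = [0.0] * M                     # mean queue lengths at population 0
    for n in range(1, N + 1):
        # Arrival theorem: an arrival at population n sees L_m(n-1) ahead of it
        R = [service[m] * (1.0 + L[m]) for m in range(M)]
        R_total = sum(visits[m] * R[m] for m in range(M))
        X = n / (Z + R_total)         # system throughput via Little's law
        L = [X * visits[m] * R[m] for m in range(M)]
    return X, R_total, L
```

For a single tier with S = 1 s, V = 1, Z = 1 s, and one session, this yields a throughput of 0.5 req/s, matching the closed-form result for one customer.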

Parameter Estimation
- Visit ratios (equivalent to transition probabilities for MVA): V_i ≈ λ_i / λ_req, with λ_req measured at the sentry and λ_i taken from tier logs
- Service times: use the residence time X_i logged at tier i; for the last tier, S_M ≈ X_M; otherwise S_i = X_i - (V_{i+1} / V_i) · X_{i+1}
- Think time: measured at the application sentry
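The estimation formulas above can be applied directly to logged rates and residence times. The helper below is a hypothetical sketch (names and inputs are assumptions, not thesis code):

```python
def estimate_parameters(lambda_req, lambda_tier, residence):
    """Estimate per-tier visit ratios and service times from logs.

    lambda_req:  request arrival rate observed at the sentry
    lambda_tier: per-tier request rates lambda_i from server logs
    residence:   per-tier residence times X_i (queueing + service)
    """
    M = len(lambda_tier)
    V = [lam / lambda_req for lam in lambda_tier]       # V_i = lambda_i / lambda_req
    S = [0.0] * M
    S[M - 1] = residence[M - 1]                         # last tier: S_M ~= X_M
    for i in range(M - 2, -1, -1):
        # Subtract the time spent waiting on the downstream tier
        S[i] = residence[i] - (V[i + 1] / V[i]) * residence[i + 1]
    return V, S
```

For example, with λ_req = 100, tier rates [100, 200, 400], and residence times [10, 4, 1], the visit ratios come out as [1, 2, 4] and the service times as [2, 2, 1].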

Evaluation of Baseline Model
- Auction site RUBiS, one server per tier (Apache, JBoss, MySQL)
- The baseline model does not capture concurrency limits
[Figure: predicted vs. observed response times at 75 and 150 sessions]

Handling Concurrency Limits
- Requests may be dropped due to per-tier concurrency limits
- Need to model the finiteness of the queues!

Handling Concurrency Limits
- Approach: add subsystems that capture dropped requests
- Distinguish the processing of dropped requests from that of served requests

Estimating Drop Probabilities and Delay Values
- Drop probability:
  o Step 1: estimate throughput using MVA, assuming no concurrency limits
  o Step 2: estimate p_i_drop as the drop probability of an M/M/1/K_i queue; a tier with throughput t then serves t·(1 - p_i_drop) and drops t·p_i_drop
- Delay value for tier i: subject the application to an offline workload that causes the limit to be exceeded only at tier i; record the response time of failed requests
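Step 2 uses the standard M/M/1/K blocking probability. A minimal sketch (hypothetical helpers, not thesis code) using the textbook formula p_K = (1 - ρ)ρ^K / (1 - ρ^{K+1}):

```python
def mm1k_drop_probability(lam, mu, K):
    """Blocking probability of an M/M/1/K queue: Poisson arrivals at
    rate lam, exponential service at rate mu, at most K in the system."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (K + 1)                 # degenerate case rho = 1
    return (1.0 - rho) * rho ** K / (1.0 - rho ** (K + 1))

def split_throughput(tput, lam, mu, K):
    """Split a tier's throughput t into served and dropped components."""
    p = mm1k_drop_probability(lam, mu, K)
    return tput * (1.0 - p), tput * p
```

With λ = 1, μ = 2, and K = 1, the drop probability is 1/3, so a tier pushing 3 req/s serves 2 req/s and drops 1 req/s.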

Response Time Prediction
- The enhanced model can capture concurrency limits

Replication and Load Imbalances
- Causes of imbalance: "sticky" sessions; variation in session durations and resource requirements
- Imbalance factor for the j-th most-loaded replica of tier i: imbalance(i, j) = num_arrivals(i, j) / num_arrivals(i)
- Scale the visit ratio: V_{i,j} = V_i * imbalance(i, j)
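The per-replica scaling is a one-liner. A hypothetical sketch (the function name and input shape are assumptions):

```python
def scaled_visit_ratios(V_tier, arrivals_per_replica):
    """Scale a tier's visit ratio per replica by its observed share of
    arrivals: V_{i,j} = V_i * num_arrivals(i, j) / num_arrivals(i)."""
    total = sum(arrivals_per_replica)
    return [V_tier * a / total for a in arrivals_per_replica]
```

For instance, a tier with V_i = 2 whose three replicas saw 50%, 30%, and 20% of arrivals gets per-replica visit ratios of 1.0, 0.6, and 0.4.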

Capturing Load Imbalance
- Session affinity causes load imbalance, and the imbalance shifts among replicas
- Our enhancement helps improve response time prediction
[Figures: per-replica request counts over time for three JBoss replicas; observed vs. predicted average response times (perfect load balancing vs. enhanced model) for the least-, medium-, and most-loaded replicas]

Talk Outline
- Motivation
- Thesis contributions
- Application modeling
- Dynamic provisioning
- Scalable request policing
- Conclusions

Dynamic Provisioning [Autonomic Computing 2005]
Key idea: increase or decrease the number of allocated servers to handle workload fluctuations
- Monitor the incoming workload
- Compute current or future demand
- Match the number of allocated servers to the demand
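The monitor/compute/adjust loop can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical fixed per-server capacity (the thesis computes demand from the queuing model, not from a single constant):

```python
import math

def servers_needed(arrival_rate, per_server_capacity):
    """Servers required to sustain the computed demand."""
    return max(1, math.ceil(arrival_rate / per_server_capacity))

def provision_step(current_servers, arrival_rate, per_server_capacity):
    """One monitor -> compute -> adjust iteration.
    Returns the target allocation and the delta to apply."""
    target = servers_needed(arrival_rate, per_server_capacity)
    return target, target - current_servers
```

For example, at 950 req/s with 100 req/s per server, an application currently on 8 servers would be grown to 10 (delta +2).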

Dynamic Provisioning at Multiple Time Scales
- Predictive provisioning:
  o Certain Internet workload patterns can be predicted (e.g., time-of-day effects, increased workload during Thanksgiving)
  o Provision using the model at a time scale of hours or days
- Reactive provisioning:
  o Applications may see unpredictable fluctuations (e.g., increased workload to news sites after an earthquake)
  o Detect such anomalies and react fast (minutes)

Request Policing
- Key idea: if the incoming request rate exceeds current capacity, turn away excess requests
- Why police when you can provision? Provisioning is not instantaneous:
  o Residual sessions on a reallocated server
  o Application and OS installation and configuration overheads
  o Overhead of several (5-30) minutes

Existing Work
- Lots of existing work on request policing [Kanodia00, Li00, Verma03, Welsh03, Abdelzaher99, ...]
- Shortcomings of existing work:
  o Does not attempt to integrate policing and provisioning
  o Does not address the scalability of the policer! The policer itself may become the bottleneck during overloads

Policer: Design Goals
- Each class should sustain its guaranteed admission rate
- Class-based differentiation and revenue maximization
  o Challenging due to the online nature of the problem: an admitted request may cause a more important request arriving later to be dropped
  o Approach: preferential admission for higher-class requests
- Scalability: the policer should remain operational even under extremely high arrival rates

Overview of Policer Design [PODC 2004 / WWW 2005]
The policer has three components:
- a request classifier and per-class leaky buckets
- class-specific queues (gold, silver, bronze, with delays d_gold, d_silver, d_bronze)
- admission control (requests are admitted or dropped)

Class-based Differentiation
- Each incoming request undergoes classification
- Per-class leaky buckets ensure that the rates guaranteed in the SLA are admitted
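A per-class leaky (token) bucket can be sketched as follows. This is an illustrative implementation of the generic mechanism, not the thesis code; rate and burst would come from the class's SLA:

```python
class LeakyBucket:
    """Token bucket enforcing a class's SLA-guaranteed admission rate.

    rate:  tokens (requests) accrued per second
    burst: maximum token balance (allowed burst size)
    """
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0.0

    def admit(self, now):
        # Refill tokens for the elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket with rate 1 req/s and burst 2 admits two back-to-back requests, rejects a third immediate one, and admits again after a second has passed.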

Revenue Maximization
- Idea: impose different delays on requests of different classes, so that more important requests are processed more frequently
- Methodology to compute the delay values in an online manner
- Bounds the probability of a request denying admission to a more important request [Appendix B of the thesis]

Admission Control
- Goal: ensure that an admitted request meets its response time target
- Measurement-based admission control algorithm
- Use information about the current load on the servers and the estimated size of the new request to make the decision
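The admission test can be reduced to a back-of-the-envelope check. The sketch below is a simplified stand-in for the thesis's measurement-based algorithm, assuming load is tracked as seconds of outstanding work on the server:

```python
def admit(outstanding_work, est_size, target):
    """Admit only if the request can plausibly finish within its target.

    outstanding_work: seconds of work already queued at the server
    est_size:         estimated service demand of the new request (seconds)
    target:           response time target for this request (seconds)
    """
    return outstanding_work + est_size <= target
```

With a 2-second target, a 0.4-second request is admitted behind 1.5 seconds of queued work but rejected behind 1.8 seconds.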

Scalability of Admission Control
- Idea #1: reduce the per-request admission control cost
  o Admission control on every request may be expensive
  o Bursty arrivals during overloads and the delays used for class-based differentiation cause batches to form
  o So run admission control on batches instead of individual requests
- Idea #2: sacrifice accuracy for computational overhead
  o When batch-based processing becomes prohibitive, fall back to a threshold-based scheme (e.g., admit all gold requests, drop all silver and bronze requests)
  o Thresholds chosen based on observed arrival rates and service times
  o Extremely efficient, but a wrong threshold means bad response times or fewer requests admitted
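The threshold-based scheme amounts to picking a class cutoff from observed rates. A hypothetical sketch (`choose_cutoff` and its inputs are assumptions for illustration):

```python
def choose_cutoff(capacity, arrival_rates):
    """Pick the lowest class (0 = gold) such that admitting all classes
    up to and including it stays within capacity.

    arrival_rates: observed per-class rates, most- to least-important.
    Returns the cutoff index; classes beyond it are dropped outright.
    """
    admitted, cutoff = 0.0, -1
    for cls, rate in enumerate(arrival_rates):
        if admitted + rate > capacity:
            break
        admitted += rate
        cutoff = cls
    return cutoff
```

With capacity 100 req/s and observed rates [40, 50, 60] for gold/silver/bronze, the cutoff is class 1: gold and silver are admitted, bronze is dropped without running the admission test.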

Scaling Even Further ...
- Protocol processing overheads will saturate sentry resources at extremely high arrival rates
  o Indiscriminate dropping of requests will occur: important requests may be turned away without even undergoing the admission control test
  o Loss in revenue!
- The sentry should still be able to process each arriving request
- Idea: dynamic capacity provisioning for the sentry itself
  o Pull in an additional sentry if the CPU utilization of existing sentries exceeds a threshold (e.g., 90%)
  o Round-robin DNS to load-balance among sentries

Class-based Differentiation
- Three classes of requests: gold, silver, bronze
- The policer is successful in providing preferential admission to important requests

Threshold-based: Higher Scalability
- Threshold-based processing allows the policer to handle up to 4 times higher arrival rates
- A single sentry can handle about 19,000 req/s

Threshold-based: Loss of Accuracy
- Higher scalability comes at a loss in the accuracy of admission control
- More violations of response time targets

Talk Outline
- Motivation
- Thesis contributions
- Application modeling
- Dynamic provisioning
- Scalable request policing
- Summary and future research

Thesis Contributions
Dynamic resource management in hosting platforms
- Shared hosting:
  o Statistical multiplexing and under-provisioning [OSDI 2002]
  o Application placement [PDCS 2004]
- Dedicated hosting:
  o Analytical model for Internet applications [SIGMETRICS 2005]
  o Dynamic provisioning [Autonomic Computing 2005]
  o Scalable request policing [PODC 2004, WWW 2005]

Future Research Directions
- Virtual machine based hosting
  o Recent research has shown the feasibility of migrating VMs across nodes
  o Adds a new dimension to the capacity provisioning problem
- Characterizing multi-tier workloads
  o Workloads for standalone Web servers are well characterized, but what are typical service times at the Java tier, or query processing times at the database?
  o Offshoot of this study: workload generators for multi-tier applications
- Automated determination of provisioning parameters
  o The predictor and reactor are invoked at manually chosen frequencies
  o System administrators use rules of thumb, which is error-prone

Thanks to ...
- Advisor: Prashant Shenoy
- Thesis committee: Emery Berger, Jim Kurose, Don Towsley, Tilman Wolf
- Collaborators: Abhishek Chandra, Pawan Goyal, Giovanni Pacifici, Timothy Roscoe, Arnold Rosenberg, Mike Spreitzer, Asser Tantawi
- All my teachers: Paul Cohen, Mani Krishna, Don Towsley
- Friends and family

Questions or comments?

Query Caching at the Database
- Caching effects are captured by tuning V_i and/or S_i
- Bulletin-board site RUBBoS, 50 sessions
- SELECT SQL_NO_CACHE causes MySQL not to cache the response to a query

Agile Switching Using Virtual Machine Monitors
- VMMs (e.g., Xen, VMware) allow multiple "virtual" machines on a server
- Use VMMs to enable fast switching of servers: switching time is limited only by residual sessions
[Figure: a VMM hosting active and dormant VMs; a dormant VM is activated to take over]

Prototype Data Center
- 40+ Linux servers, gigabit switches
- Multi-tier applications: auction (RUBiS), bulletin board (RUBBoS)
- Apache and JBoss (replicable), MySQL database
- Control plane: application placement, dynamic provisioning
- Per-server nucleus: resource monitoring, parameter estimation
- Sentries and application capsules on the server nodes

Sentry Provisioning (XXX)

System Overview
- Control plane: centralized resource manager (application placement, dynamic provisioning)
- Nucleus: per-server measurements and resource management
- Sentry: per-application admission control
- Capsule: the component of an application running on a server

Existing Application Models
- Models for Web servers [Chandra03, Doyle03]: do not model the Java server, database, etc.
- Black-box models [Kamra04, Ranjan02]: unaware of the bottleneck tier
- Extensions of single-tier models [Welsh03]: fail to capture interactions between tiers
- Existing models are inadequate for multi-tier Internet applications

Existing Work
- Predictable resource management within a single server
  o Proportional-share schedulers for CPU and network [Duda, Goyal, Waldspurger]; multi-processors [Chandra]
  o Memory management [Berger, Waldspurger]
  o Disk scheduling [Shenoy]
- Hosting platforms and Internet applications
  o Rice, Duke, Penn State: shared platforms for Web servers
  o IBM, HP Labs: shared platforms, workload prediction
  o Berkeley: novel architecture for Internet applications
- Main shortcomings
  o Possible statistical multiplexing gains in shared platforms unexplored
  o Most work assumes simplistic applications (e.g., only Web servers)
  o Provisioning either purely reactive or purely predictive
  o Handling of extreme overloads not addressed satisfactorily

Predictive Provisioning
- The monitor feeds workload measurements to the predictors, which produce a predicted workload
- The application models translate the predicted (and observed) workload into resource requirements
- The allocator maps resource requirements onto server allocations

Reactive Provisioning
- Idea: react to current conditions
  o Useful for capturing significant short-term fluctuations
  o Can correct errors in predictions
- Track the error between long-term predictions and the actual workload; allocate additional servers if the error exceeds a threshold
- Can also be invoked if the request drop rate exceeds a threshold
- Operates over a time scale of a few minutes
- Pure reactive provisioning lags the workload; reactive + predictive is more effective!
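The reactor's trigger condition is a simple relative-error check. A hypothetical sketch (the function name and the fractional-error formulation are assumptions):

```python
def reactor_should_fire(predicted, observed, error_threshold):
    """Invoke the reactor when the predictor underestimates the workload
    by more than the configured fraction of the prediction."""
    if predicted <= 0:
        return observed > 0            # any load against a zero prediction
    return (observed - predicted) / predicted > error_threshold
```

With a 25% threshold, an observed rate of 150 against a prediction of 100 fires the reactor, while 110 does not.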

Dynamic Capacity Provisioning
- Auction application RUBiS; workload increased by a factor of 4 in 30 minutes
- Server allocations increased to match the increased workload
- Response time kept below 2 seconds
[Figures: workload, response time, and server allocations over time]