Computer Science Dynamic Resource Management in Internet Data Centers Prashant Shenoy University of Massachusetts.

Presentation transcript:

Computer Science Dynamic Resource Management in Internet Data Centers Prashant Shenoy University of Massachusetts

Motivation
• Internet applications are used in a variety of domains
  - Online banking, online brokerage, online music store, e-commerce
• Internet usage continues to grow rapidly
  - Broadband deployment is accelerating
• Outages of Internet applications are more common
  - "Site not responding", "connection timed out"

Internet Application Outages
• Down for 30 minutes
• Average download time ~260 sec
• Periodic outages over 4 days
• Cause: too many users leading to overload
• Holiday Shopping Season 2000:
• 9/11: site inaccessible for brief periods

Internet Workloads Are Highly Variable
• Short-term fluctuations
  - "Slashdot Effect"
  - Flash crowds
• Long-term seasonal effects
  - Time-of-day, month-of-year
• Peak is difficult to predict
  - Static overprovisioning is not effective
  - Manual allocation: slow
[Figure: Soccer World Cup '98 workload]
Key issue: How can we design applications to handle large workload variations?

Internet Data Centers
• Internet applications run on data centers
  - Server farms
  - Provide computational and storage resources
• Applications share data center resources
Problem: How should the platform allocate resources to absorb workload variations?

Talk Outline
✓ Motivation
• Internet data center model
• Dynamic provisioning
• Request policing
• Cataclysm server platform
• Experimental results
• Summary

Data Center Model
• Dedicated hosting: each application runs on a subset of servers in the data center
  - Subsets are mutually exclusive: no server sharing
• Data center hosts multiple applications
• Free server pool: unused servers
[Figure labels: retail Web site, streaming]

Internet Application Model
• Internet applications: multiple tiers
  - Example: 3 tiers: HTTP, J2EE app server, database
• Replicable applications
  - Individual tiers: partially or fully replicable
  - Example: clustered HTTP, J2EE server, shared-nothing db
• Each application employs a sentry
• Each tier uses a dispatcher: load balancing of requests
[Figure: sentry in front of HTTP, J2EE, and database tiers, with load balancing between tiers]

Approach
• Dynamic provisioning
  - Allocate servers to applications on-the-fly
• Request policing
  - Turn away excess requests
  - Degrade performance based on SLA
• Couple provisioning and policing

Research Questions
• How many servers to allocate and when?
  - Multi-tier apps: when and how to provision each tier?
• How many requests should be turned away during overload?
  - Multi-tier apps: where should requests be dropped?
• Can we meet SLAs during overloads?
• Is it possible to predict future workloads?

Dynamic Provisioning
• Key idea: increase or decrease allocated servers to handle workload fluctuations
  - Monitor incoming workload
  - Compute current or future demand
  - Match number of allocated servers to demand
[Figure: control loop: monitor workload, compute current/future demand, adjust allocation]

Single-tier Provisioning
• Single-tier provisioning is well studied [Muse, TACT]
• Non-trivial to extend to multiple tiers
• Strawman #1: use single-tier provisioning independently at each tier
• Problem: independent tier provisioning may not increase goodput
[Figure, step 1 of 2: three tiers with capacities C=15, C=10, C= (value lost); 4 req/s dropped]
[Figure, step 2 of 2: capacities C=15, C=20, C= (value lost); rate labels 14, 14, 10.1 req/s; 3.9 req/s dropped]
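The strawman's failure can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes an offered load of 14 req/s and a third-tier capacity of 10.1 req/s, a value that is not legible in the slide and is chosen here because it matches the "dropped 3.9 req/s" and "10.1" labels. The function name `goodput` is hypothetical.

```python
def goodput(offered, capacities):
    """Push a request rate through a chain of tiers.

    Each tier forwards at most its capacity and drops the excess.
    Returns (rate leaving the last tier, list of per-tier drop rates).
    """
    rate = offered
    dropped = []
    for capacity in capacities:
        passed = min(rate, capacity)
        dropped.append(rate - passed)
        rate = passed
    return rate, dropped


# Before: the middle tier (C=10) is the bottleneck.
before, drops_before = goodput(14, [15, 10, 10.1])

# After doubling the middle tier (C=20), the bottleneck moves to the last tier.
after, drops_after = goodput(14, [15, 20, 10.1])

print(before, after)  # goodput barely changes despite the extra servers
```

Under these assumed numbers, doubling the middle tier only shifts the drops downstream, which is the argument for looking at all tiers together.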

Model-based Provisioning
• Black box approach
  - Treat application as a black box
  - Measure response time from outside
  - Increase allocation if response time > SLA
  - Use a model to determine how much to allocate
• Strawman #2: use black box for multi-tier apps
• Problems:
  - Unclear which tier needs more capacity
  - May not increase goodput if the bottleneck tier is not replicable
[Figure: 14 req/s arriving at three tiers; C=15, C= (value lost), C= (value lost)]

Provisioning Multi-tier Apps
• Approach: holistic view of multi-tier application
  - Determine tier-specific capacity independently
  - Allocate capacity by looking at all tiers (and other apps)
• Predictive provisioning
  - Long-term provisioning: time scale of hours
  - Maintain long-term workload statistics
  - Predict and provision for the next few hours
• Reactive provisioning
  - Short-term provisioning: time scale of several minutes
  - React to "current" workload trends
  - Correct errors of long-term provisioning
  - Handle flash crowds (inherently unpredictable)

Workload Prediction
• Long-term workload monitoring and prediction
  - Monitor workload for multiple days
  - Maintain a histogram for each hour of the day: captures time-of-day effects
  - Forecast based on:
    observed workload for that hour in the past;
    observed workload for the past few hours of the current day
  - Predict a high percentile of expected workload
[Figure: per-day workload histories for Mon, Tue, Wed, and today]
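A minimal sketch of the per-hour histogram idea follows. It is an illustration under stated assumptions, not the predictor used in the talk: the class name `HourlyPredictor`, the nearest-rank percentile rule, and the default percentile of 0.95 are all hypothetical choices, and the correction based on the past few hours of the current day is left out.

```python
import math
from collections import defaultdict


class HourlyPredictor:
    """Keeps observed request rates per hour of day and predicts a high percentile."""

    def __init__(self, percentile=0.95):
        self.percentile = percentile
        self.history = defaultdict(list)  # hour of day (0-23) -> observed rates

    def observe(self, hour, rate):
        """Record the request rate seen during the given hour on some past day."""
        self.history[hour].append(rate)

    def predict(self, hour):
        """Return the nearest-rank percentile of past rates for this hour."""
        samples = sorted(self.history[hour])
        if not samples:
            raise ValueError("no history for hour %d" % hour)
        rank = max(1, math.ceil(self.percentile * len(samples)))
        return samples[rank - 1]


predictor = HourlyPredictor(percentile=0.75)
for rate in (100, 120, 110, 300):  # 9 am rate on four earlier days (made-up data)
    predictor.observe(9, rate)
print(predictor.predict(9))
```

Predicting a high percentile rather than the mean biases the forecast toward overprovisioning, which is the cheaper mistake when an SLA is at stake.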

Predictive Provisioning
• Queuing-theoretic application model
  - Each individual server is a G/G/1 queue
  - Derive per-tier E(r) from the end-to-end SLA
  - Monitor other parameters and determine λ (per-server capacity)
  - Use predicted workload λ_pred to determine # servers per tier
  - Assumes perfect load balancing in each tier
• Alternative: model each tier as G/G/k
[Figure: λ_pred arriving at a tier of G/G/1 servers]
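The slide does not give the formula used to turn a response-time target into a per-server capacity. As a hedged illustration, the sketch below uses a Kingman-style G/G/1 bound rearranged for the arrival rate, capacity = 1 / (s + (var_a + var_s) / (2 (d - s))), where d is the per-tier response-time target, s the mean service time, and var_a and var_s the variances of inter-arrival and service times. The choice of this bound, the function names, and the sample numbers are assumptions.

```python
import math


def per_server_capacity(d, s, var_arrival, var_service):
    """Largest request rate one G/G/1 server can take while mean response time <= d.

    d: per-tier mean response-time target (seconds)
    s: mean service time (seconds)
    var_arrival, var_service: variances of inter-arrival and service times
    """
    if d <= s:
        raise ValueError("target response time must exceed the mean service time")
    return 1.0 / (s + (var_arrival + var_service) / (2.0 * (d - s)))


def servers_needed(lambda_pred, d, s, var_arrival, var_service):
    """Servers for a tier, assuming perfect load balancing across them."""
    capacity = per_server_capacity(d, s, var_arrival, var_service)
    return math.ceil(lambda_pred / capacity)


# Made-up measurements: 500 ms target, 100 ms service time, 42 req/s predicted.
capacity = per_server_capacity(d=0.5, s=0.1, var_arrival=0.04, var_service=0.04)
count = servers_needed(42, d=0.5, s=0.1, var_arrival=0.04, var_service=0.04)
print(capacity, count)
```

Dividing the predicted rate by a per-server capacity is only valid under the perfect load balancing assumption stated on the slide.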

Reactive Provisioning
• Idea: react to current conditions
  - Useful for capturing significant short-term fluctuations
  - Can correct errors in predictions
• Track error between long-term predictions and actual workload
  - Allocate additional servers if error exceeds a threshold
  - Accounts for prediction errors
• Can be invoked if request drop rate exceeds a threshold
  - Handles sudden flash crowds
• Operates over a time scale of a few minutes
• Pure reactive provisioning: lags workload
  - Reactive + predictive is more effective!
[Figure: time series of predicted vs. actual workload; when prediction error > threshold, invoke reactor and allocate servers]
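The error-threshold trigger can be sketched as below. This is a hypothetical illustration: the relative-error definition, the 20% default threshold, and the rule of covering the shortfall with whole servers are assumptions, and the drop-rate trigger mentioned on the slide is not modeled.

```python
import math


def reactive_extra_servers(predicted, actual, per_server_capacity, error_threshold=0.2):
    """Return how many extra servers to allocate, or 0 if the prediction is close enough.

    predicted, actual: request rates (req/s) from the predictor and from monitoring
    per_server_capacity: req/s one server can sustain within the SLA
    error_threshold: relative under-prediction that invokes the reactor
    """
    if predicted <= 0 or per_server_capacity <= 0:
        raise ValueError("predicted rate and per-server capacity must be positive")
    error = (actual - predicted) / predicted
    if error <= error_threshold:
        return 0
    # Cover the gap between actual and predicted load with whole servers.
    return math.ceil((actual - predicted) / per_server_capacity)


print(reactive_extra_servers(100, 110, 5))  # within threshold
print(reactive_extra_servers(100, 150, 5))  # flash crowd: reactor fires
```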

Talk Outline
✓ Motivation
✓ Internet data center model
✓ Dynamic provisioning
• Request policing
• Cataclysm server platform
• Experimental results
• Summary

Request Policing
• Key idea: if incoming request rate > current capacity
  - Turn away excess requests
  - Degrade performance of requests
• Why police when you can provision?
  - Provisioning is not instantaneous
  - Residual sessions on reallocated server
  - Application and OS installation and configuration overheads
  - Overhead of several (5-30) minutes
[Figure: sentry polices requests in front of G/G/1 servers and drops the excess]

Class-based Differentiation
• Some requests are more important than others
  - Purchase versus catalog browsing
  - Stock trade versus view account balance
• Overload => preferentially let in more important requests
  - Maximize utility during overload
• Incoming requests are queued up in class queues
  - Example: gold, silver, bronze classes
  - Higher priority to more important classes
[Figure: sentry with per-class queues; policing drops excess requests]

Scalable Policing Techniques
• Examining individual requests is infeasible
  - Incoming rate may be an order of magnitude greater than capacity
  - Need to reduce overhead of policing decisions
• Idea #1: batch processing
  - Premise: request arrivals are bursty
  - Admit a batch of queued-up requests
  - One admission control test per batch
  - Reduces overhead from O(n) to O(b)
• Idea #2: use pre-computed thresholds
  - Example: capacity = 100 req/s; G=75, S=50, B=50 req/s
  - Admit all gold, half of silver, and no bronze
  - Periodically estimate λ and s: compute threshold
  - O(1) overhead: trades accuracy for efficiency
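The threshold example on this slide can be reproduced with a short sketch. It assumes strict priority order across classes and a per-class admit fraction as the pre-computed threshold; the function names and the use of a caller-supplied random draw are illustrative choices, not details from the talk.

```python
def admit_fractions(capacity, arrival_rates):
    """Pre-compute, per class, the fraction of requests to admit.

    capacity: total req/s the application can serve
    arrival_rates: list of (class_name, req/s), most important class first
    """
    remaining = capacity
    fractions = {}
    for name, rate in arrival_rates:
        admitted = min(rate, remaining)
        fractions[name] = admitted / rate if rate > 0 else 1.0
        remaining -= admitted
    return fractions


def admit(request_class, fractions, draw):
    """O(1) per-request decision; draw is a uniform random number in [0, 1)."""
    return draw < fractions[request_class]


# Slide example: capacity 100 req/s, gold 75, silver 50, bronze 50 req/s.
fractions = admit_fractions(100, [("gold", 75), ("silver", 50), ("bronze", 50)])
print(fractions)
```

The fractions are only recomputed periodically, so a burst between updates is admitted or rejected using stale estimates; that is the accuracy being traded for efficiency.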

Cataclysm Server Platform
• Prototype data center
• Commodity hardware
  - 40+ Pentium servers
  - 2 TB of RAID arrays
  - Gigabit switches
• Linux-based platform

Cataclysm Software Architecture
• Two key components: control plane and nuclei
[Figure: Cataclysm control plane (provisioning, global allocation, app placement) above server nodes, each stacking Nucleus, Apps, and OS; annotations: server node runs apps and sentries; resource monitoring, local allocation]

Cataclysm Node Architecture
• Capsule: component of an app on a node
• QLinux: proportional sharing of node resources
• Nucleus: resource allocations across capsules and VMs
[Figure, left: Nucleus and Capsule on QLinux, with HSFQ CPU scheduler, prop-share packet scheduler, Cello disk scheduler, SFVM memory manager]
[Figure, right: Nucleus on QLinux managing capsules inside active and dormant VMs (UML, Xen)]

Cataclysm Applications
• Multi-tiered apps: RUBiS (e-auctions), RUBBoS (bulletin board)
  - Apache, JBoss, MySQL
• Tier-1 sentry
  - ktcpvs: kernel HTTP load balancer
  - Request policing and class-based differentiation
  - Workload monitoring
• Tier-2 sentry: Apache JBoss redirector, workload monitoring
• Nucleus: Linux trace toolkit, /proc to monitor node statistics
• All system components are replicable!
[Figure: ktcpvs (load balancing, policing) in front of Apache, JBoss, and MySQL tiers]

Talk Outline
✓ Motivation
✓ Internet data center model
✓ Dynamic provisioning
✓ Request policing
✓ Cataclysm server platform
• Experimental results
• Summary

Dynamic Provisioning
• Server allocation adapts to changing workload
• RUBiS: e-auction application like eBay
[Figure panels: Workload; Server Allocation]

Class-based Differentiation
[Figure panels: arrival rate vs. time (sec); fraction admitted vs. time (sec); curves for GLD, SIL, BRZ classes]

Threshold-based: Higher Scalability
[Figure: Scalability: CPU usage vs. arrival rate for Batch and Thresh policing]

Other Research Results
• OS resource allocation
  - QLinux [ACM MM00], SFS [OSDI00], DFS [RTAS02]
  - SHARC cluster-based prop. sharing [TPDS03]
• Shared hosting provisioning
  - Measurement-based [IWQOS02], queuing-based [Sigmetrics03, IWQOS03]
  - Provisioning granularity [Self-manage 03]
  - Application placement [PDCS 2004]
  - Profiling and overbooking [OSDI02]
• Storage issues
  - iSCSI vs NFS [FAST03], policy-managed [TR03]

Glimpse of Other Projects
• Hyperion: network-processor-based measurement platform
  - Measurement in the backbone and at the edge
  - NP-based measurements in the data center
• RiSE: Rich Sensor Environments
  - Video sensor networks
  - Robotics sensor networks
  - Real-time sensor networks
  - Weather sensors

Concluding Remarks
• Internet applications see varying workloads
• Handle workload dynamics by
  - Dynamic capacity provisioning
  - Request policing
• Need to account for multi-tiered applications
• Joint work: Bhuvan Urgaonkar, Abhishek Chandra and Vijay Sundaram
• More at

Predictive Provisioning
• Invoked once every hour
  - Captures long-term variations: time-of-day effects
  - Extensions to seasonal effects (month-of-year, holidays)
• How to initialize?
  - Needs several days of history to work well
• What happens if no servers are available?
  - Use revenue/utility to arbitrate allocation [Muse]
  - Turn away excess requests
• Non-replicable tiers are easy to handle
  - Provision other tiers until the non-replicable tier is saturated

Degrade or Drop?
• Depends on the application and the SLA
• Degrading increases effective capacity
  - Also degrades performance seen by requests
• Degrade if
  utility from servicing more requests at lower performance
  > utility from servicing fewer requests - penalty of dropping requests
• Otherwise drop requests
SLA:
  < 500 ms   r1
  < 1 s      r2
  < 10 s     r3
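The degrade-or-drop rule is a single inequality, sketched below with made-up numbers. The per-request utilities, the penalty value, and the function name are hypothetical; only the comparison itself comes from the slide.

```python
def should_degrade(n_degraded, utility_degraded, n_full, utility_full,
                   n_dropped, drop_penalty):
    """Return True if degrading service yields more utility than dropping requests.

    n_degraded: requests served if performance is degraded
    utility_degraded: utility per request at the degraded service level
    n_full: requests served at full performance
    utility_full: utility per request at full performance
    n_dropped: requests turned away in the full-performance case
    drop_penalty: penalty per dropped request
    """
    degrade_utility = n_degraded * utility_degraded
    drop_utility = n_full * utility_full - n_dropped * drop_penalty
    return degrade_utility > drop_utility


# 100 requests arrive; at full quality only 70 can be served.
print(should_degrade(100, 6, 70, 10, 30, 5))  # 600 vs 700 - 150 = 550
print(should_degrade(100, 5, 70, 10, 30, 1))  # 500 vs 700 - 30 = 670
```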

Use of Virtual Machine Monitors
• Server allocation can be slow (~ minutes)
  - Need residual sessions to terminate
  - Disk scrubbing, OS and app installation, configuration
  - Application and system overheads
• Flash crowds => need fast allocation
• Use virtual machines
  - Each app runs inside a VM, multiple VMs on a server
  - Only one VM is active at any time; other VMs are "hot spares"
  - Server allocation => idle one VM, activate another
  - System overhead reduces to < 1 s
• Still need to account for residual sessions
  - Application issue, no longer a system limitation

Threshold-based: Loss of Accuracy