Managing Server Energy and Operational Costs Chen, Das, Qin, Sivasubramaniam, Wang, Gautam (Penn State) Sigmetrics 2005
Introduction
Usual arguments
– Variable workloads
– High operational cost even at low workloads
– Power consumption a serious concern
Work is not in the context of virtualization
…Introduction
New points:
– Server on/off and DVS (dynamic voltage scaling) have been used for power management, but without considering the impact of on/off cycles on hardware reliability
– Authors claim no previous paper considered SLA violations, only energy optimization (maybe true in 2005!)
…Introduction
Three new approaches for server provisioning that combine DVS and server on/off, account for response time SLAs, and include the cost of server on/off:
– First: queuing model based
– Second: feedback control theory based
– Third: hybrid of the above two
System model
Cluster with identical servers (hosts)
Any host can run any application
Applications run on multiple hosts
Assuming a web front end, each HTTP request is routed to one of the web servers
– Service time is related to the request parameter (e.g. file size)
We will control the number of servers allocated to a given application
…System model
SLA used: average response time
Goal: use the minimum energy required to meet the SLA
Two options for power management
– Server on/off: costs in terms of (1) time and energy required to boot up, (2) wear and tear of components (esp. disk)
– Dynamic voltage-frequency scaling of the CPU
…System model
Assume l frequency levels: f_1, f_2, …, f_l
Assume “centralized” DVS control:
– Single DVS setting for all hosts running one application
Assumption: only one application per server
Solution required in two steps: (1) how many servers per application, (2) frequency setting for the servers of each application
…Problem Formulation
M identical servers
N different applications
Each application i is allocated m_i servers at any time
Max/min frequency for each server:
– f_max, f_min
…Problem Formulation (Power Model)
Dynamic power consumption of a CPU operating at frequency f is proportional to V^2 · f
Further, V is proportional to f
Thus, the power model of a CPU operating at frequency f is:
– P_fixed + P_f · f^3
Energy consumption of the cluster at frequency f over time t:
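A minimal sketch of the power model above, assuming all servers in the cluster run at the same frequency f for the whole duration t. The function and parameter names are illustrative and do not come from the paper.

```python
def cpu_power(f, p_fixed, p_f):
    """Power drawn by one server at CPU frequency f: P_fixed + P_f * f^3."""
    return p_fixed + p_f * f ** 3


def cluster_energy(num_servers, f, t, p_fixed, p_f):
    """Energy used by num_servers identical servers, all at frequency f, over time t.

    Assumption: every server runs at the same frequency for the whole duration.
    """
    return num_servers * cpu_power(f, p_fixed, p_f) * t
```

Because the frequency enters as a cube, halving f cuts the frequency-dependent part of the power by a factor of eight, while P_fixed is paid for every server that is switched on. That tension is what the later server-count versus frequency tradeoff is about.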
…Problem Formulation
m_i and f_i are controlled periodically with interval t
Over Z such intervals of time t, energy consumption is:
where m_i(z) is the number of servers allocated to application i during interval z
…Problem Formulation
K_$ is the per-unit electricity cost
Total cost:
B_0 is the cost per server turn-on cycle. Then total cost:
…Problem formulation
B_0 itself has two components: the power required to turn on, and the “MTBF” impact cost due to the turn-on
P_max is the power consumed at maximum frequency, T_reboot is the time taken for rebooting, C_r is the MTBF cost (MTBF = Mean Time Between Failures)
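The cost terms described so far can be sketched as follows. Two assumptions are made here and are not stated in these notes: B_0 is taken as K_$ · P_max · T_reboot + C_r (electricity burned during a reboot plus the MTBF cost), and the number of turn-ons in an interval is taken as the positive change in the total server count. All function and parameter names are illustrative.

```python
def turn_on_cost(k_dollar, p_max, t_reboot, c_r):
    """Assumed form of B_0: reboot electricity at P_max plus the MTBF cost C_r."""
    return k_dollar * p_max * t_reboot + c_r


def total_cost(m, f, t, k_dollar, p_fixed, p_f, b0):
    """Electricity cost plus turn-on cost over len(m) intervals of length t.

    m[z][i] and f[z][i] are the server count and frequency of application i
    in interval z. Assumption: the cluster starts with sum(m[0]) servers on,
    so the first interval incurs no turn-on cost.
    """
    energy = 0.0
    turn_ons = 0
    prev_total = sum(m[0])
    for z in range(len(m)):
        for i in range(len(m[z])):
            energy += m[z][i] * (p_fixed + p_f * f[z][i] ** 3) * t
        current_total = sum(m[z])
        # Only increases in the number of powered-on servers are charged.
        turn_ons += max(0, current_total - prev_total)
        prev_total = current_total
    return k_dollar * energy + b0 * turn_ons
```

Counting turn-ons on the cluster total rather than per application reflects the idea that a server can be reassigned between applications without a power cycle; a per-application count would be a different, more pessimistic model.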
…Problem Formulation Objective Function to Minimize, under constraints
…Constraints
Solutions. First: Queuing Theoretic
Assumptions:
Number of servers is managed every time T
Frequency level per application is controlled every time t
T is an integer multiple of t: T = S · t
Optimization period: U intervals of time T
– Each with S intervals of time t
Prediction
Need the following parameters for each application:
– Mean arrival rate (λ)
– Squared coefficient of variation of interarrival times, C_a^2
– Squared coefficient of variation of service times, C_s^2
– Mean file size in bytes:
S-ARMA (seasonal autoregressive moving average) model used for prediction of arrival parameters
“Winters’” smoothing method for file size parameters
Queueing Analysis Model the application on each server as a G/G/1 queue. Use Bolch approximation
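The formula behind the approximation named on this slide is not given in these notes. As a stand-in, the sketch below uses Kingman's well-known G/G/1 approximation, which takes the same inputs (arrival rate, service rate, C_a^2, C_s^2); the paper's exact expression may differ. Names are illustrative.

```python
def gg1_response_time(lam, mu, ca2, cs2):
    """Approximate mean response time of a G/G/1 queue (Kingman's formula).

    lam: mean arrival rate, mu: mean service rate,
    ca2 / cs2: squared coefficients of variation of interarrival / service times.
    Returns infinity when the queue is unstable (utilization >= 1).
    """
    rho = lam / mu
    if rho >= 1.0:
        return float("inf")
    mean_service = 1.0 / mu
    # Mean waiting time in queue, then add the service time itself.
    mean_wait = (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) * mean_service
    return mean_wait + mean_service
```

With C_a^2 = C_s^2 = 1 this reduces to the exact M/M/1 result 1 / (mu - lam), which is a convenient sanity check.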
Optimization Problem
Heuristic Solution
First assume t = T and consider only one interval at a time. Then the objective is to minimize
Start the solution for interval u by finding, for all i, the minimum number of servers m_i(u) required to satisfy the constraint W_i <= W_i,SLA using the highest frequency
Do this for each interval
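The first step of the heuristic (fewest servers that meet the SLA at the highest frequency) can be sketched as a linear search. This sketch makes several assumptions that are not in the notes: requests are split evenly across the m servers, the per-server service rate is f / work for a hypothetical `work` parameter (mean CPU work per request), the variability parameters are unchanged by the split, and Kingman's G/G/1 approximation stands in for the paper's response time formula.

```python
def response_time(lam, mu, ca2, cs2):
    """Kingman's G/G/1 approximation of mean response time (stand-in formula)."""
    rho = lam / mu
    if rho >= 1.0:
        return float("inf")
    return (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) / mu + 1.0 / mu


def min_servers(lam_total, f_max, work, ca2, cs2, w_sla, max_servers):
    """Smallest m with response time <= w_sla when every server runs at f_max.

    Returns None if even max_servers servers cannot meet the SLA.
    """
    mu = f_max / work  # assumed: service rate scales linearly with frequency
    for m in range(1, max_servers + 1):
        if response_time(lam_total / m, mu, ca2, cs2) <= w_sla:
            return m
    return None
```

For example, `min_servers(10.0, 2.0, 1.0, 1.0, 1.0, 1.05, 20)` returns 10: with nine servers the per-server load pushes the approximate response time to 1.125, above the target of 1.05.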
…Heuristic Solution
Then try to reduce f_i(u) and increase m_i(u), if that reduces cost.
Select applications in decreasing order of f_i(u)
– i.e. select the application with the highest frequency first.
…Heuristic Solution
Now consider all intervals (still assume t = T)
Start with the “upper bound” solution of the previous round (an upper bound because intervals were considered separately)
…Heuristic solution
Again consider one interval at a time (from 1st to last), but evaluate the total objective function
In each interval, start from the applications with the highest frequency and go in decreasing order
…Heuristic solution
For each application, compare the number of servers in this interval with the previous one, and “level off” the number of servers if the solution improves
– Search greedily for server counts closer to the count in the previous interval
– Then tune frequencies to get a feasible solution
…Heuristic solution
So far, t = T. For cases where T > t, use average arrival and file size estimates for interval T and apply the above solution
Then, given the m_i(u) from this step, tune f_i(u,s) for each small interval s
Control Theoretic Approach
Set up a feedback control formulation that finds an “aggregate frequency” for each application i
– Objective: meet the response time SLA
Solve the problem of allocating the optimal number of servers that achieves this aggregate frequency
– Objective: minimize power cost
…Control theoretic formulation
F_i(u,s) = m_i(u) · f_i(u,s): aggregate frequency
– Indexed as F_i(k), where k = 1, 2, …, U·S
Implicit assumption: the response time from multiple servers with total capacity F is the same as that from a single server with capacity F
Objective function:
R_F and R_W,i are relative weights
LQR method
Modify the decision variables to formulate the problem as the well-known Linear Quadratic Regulator (LQR):
…Control theoretic approach
The “control gain” is calculated using standard LQR methods
After the above, use an integral controller
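The integral controller's exact update law is not given in these notes. A generic discrete-time integral controller, shown here only as a sketch, adjusts the aggregate frequency in proportion to the response time error. The `gain` below is a hand-picked illustrative constant, not the LQR-derived control gain, and the function name is hypothetical.

```python
def integral_controller_step(f_agg_prev, w_measured, w_sla, gain):
    """One update of a generic integral controller on aggregate frequency.

    If the measured response time exceeds the SLA target, the error is
    positive and the aggregate frequency is raised; if the system is
    faster than required, it is lowered. The result is clamped at zero.
    """
    error = w_measured - w_sla
    return max(0.0, f_agg_prev + gain * error)
```

Because each step adds to the previous output, the errors accumulate over time, which is what drives the steady-state response time error toward zero for a stable loop.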
Server Allocation
Given F_i(u,s), we need to determine m_i(u)
Define m(u) =
– Total number of servers required for all applications
Define F(u) =
– Total frequency required for all applications
The algorithm allocates m_i(u) in proportion to each application's aggregate frequency
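A minimal sketch of the proportional allocation, assuming F(u) is the sum of the per-application aggregate frequencies. Since server counts must be integers, the sketch uses largest-remainder rounding; that rounding rule is a choice made here, not something taken from the paper, and the names are illustrative.

```python
def allocate_servers(total_servers, agg_freqs):
    """Split total_servers across applications in proportion to agg_freqs.

    agg_freqs[i] is the aggregate frequency F_i required by application i.
    Fractional shares are floored, then leftover servers go to the
    applications with the largest fractional remainders.
    """
    total_f = sum(agg_freqs)
    if total_f == 0:
        return [0] * len(agg_freqs)
    shares = [total_servers * f_i / total_f for f_i in agg_freqs]
    alloc = [int(s) for s in shares]
    leftover = total_servers - sum(alloc)
    by_remainder = sorted(
        range(len(shares)), key=lambda i: shares[i] - alloc[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc
```

With this allocation every application ends up at roughly the same per-server frequency F_i / m_i ≈ F / m, which is what lets the later optimization treat the cluster as a single pool.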
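…server allocation
Server frequency lies between f_min and f_max
Therefore, since F(u) = m(u) · f, the number of servers is bounded: m_min(u) <= m(u) <= m_max(u)
where m_min(u) = F(u) / f_max and m_max(u) = F(u) / f_min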
Online server allocation algorithm
Start with Equation 11 and ignore the server on/off costs
The remaining expression quantifies the tradeoff between the cost of a new server and the cost of a higher frequency
Differentiate this w.r.t. m(u) and find the minimum to get the number of servers that minimizes the cost
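Equation 11 itself is not reproduced in these notes. Under the per-server power model P_fixed + P_f · f^3 from the earlier slides, and assuming every server runs at the common frequency f = F(u) / m(u), the differentiation step works out as below. This is a reconstruction under those assumptions, not the paper's own derivation.

```latex
\begin{align*}
C(m) &= K_\$ \, t \, m \left( P_{\text{fixed}} + P_f \left( \frac{F}{m} \right)^{3} \right)
      = K_\$ \, t \left( m \, P_{\text{fixed}} + \frac{P_f F^{3}}{m^{2}} \right) \\
\frac{dC}{dm} &= K_\$ \, t \left( P_{\text{fixed}} - \frac{2 P_f F^{3}}{m^{3}} \right) = 0
\quad\Longrightarrow\quad
m^{*} = F \left( \frac{2 P_f}{P_{\text{fixed}}} \right)^{1/3} \\
\frac{d^{2}C}{dm^{2}} &= K_\$ \, t \, \frac{6 P_f F^{3}}{m^{4}} > 0
\end{align*}
```

The positive second derivative confirms a minimum. The corresponding per-server frequency is f* = F / m* = (P_fixed / (2 P_f))^(1/3), which does not depend on the load F: under this model, more load is absorbed by adding servers at the same frequency, subject to the bounds on m(u) and the available frequency levels.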
… Online server allocation algorithm
This m*(u) will be a minimum: adding even one more server will increase the cost, even before considering the cost of turning a server on (if any)
Now consider two cases:
– m(u-1) is greater than m*(u), or
– Else turn on additional servers with the following algorithm…
Online server allocation algo
…online server allocation
where D denotes rounding to the available discrete frequency levels
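The notes do not say in which direction D rounds. The sketch below assumes rounding up to the next available level, on the reasoning that rounding down could violate the response time SLA; that direction, the function name, and the example levels are all assumptions.

```python
import bisect


def round_up_to_level(f, levels):
    """Return the smallest available frequency level >= f.

    If f exceeds every level, the highest level is returned (the caller
    would then need more servers to make up the shortfall).
    """
    ordered = sorted(levels)
    idx = bisect.bisect_left(ordered, f)
    return ordered[min(idx, len(ordered) - 1)]
```

For example, with levels `[1.0, 1.4, 1.8, 2.2]`, a requested frequency of 1.5 maps to 1.8, and a request of exactly 1.4 stays at 1.4.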