Published by Gwen Bell. Modified over 6 years ago.
1
Forecasting with Cyber-physical Interactions in Data Centers (part 3)
Lei Li 9/28/2011 PDL Seminar
2
Big Picture: Predictive AC Control and Server Management
[Diagram components: Server/workload management; Computing energy model; Sensor measuring; Model of computing energy; Temperature prediction; Cooling energy model; CRAC control]
(c) Lei Li 2012
3
Outline
Overview of time series mining
  Time series examples
  What problems do we solve
Motivation
Experimental setup
ThermoCast: the forecasting model
Results
Other time series models and algorithms
4
Experimental setup
Tested in a JHU data center with 171 1U servers, instrumented with a network of 80 sensors
5
Sample measurements
6
Observations
The temperature difference cycle (max/min temp. on the same rack) is in anti-phase with the air velocity cycle.
The middle and bottom sections of a rack are coldest; the top is hottest.
Shutting down under-utilized servers could reduce energy consumption.
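The anti-phase observation can be illustrated with a correlation check: two cycles in anti-phase have a Pearson correlation close to -1. The sketch below uses synthetic sine waves, not the measured data; the period, amplitudes, and the names `air_velocity` and `temp_diff` are assumptions made for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

# Hypothetical signals: same period, opposite phase (values are made up).
period = 20
air_velocity = [1.0 + 0.3 * math.sin(2 * math.pi * t / period) for t in range(100)]
temp_diff = [4.0 - 1.5 * math.sin(2 * math.pi * t / period) for t in range(100)]

r = pearson(air_velocity, temp_diff)
print(f"correlation = {r:.3f}")  # close to -1 for anti-phase cycles
```

On real sensor traces the correlation would be weaker than in this idealized case, but a strongly negative value is what "anti-phase" predicts.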
7
What happens when shutting down servers?
[Plot annotation: Shut down]
8
Outline
Overview of time series mining
  Time series examples
  What problems do we solve
Motivation
Experimental setup
ThermoCast: the forecasting model
Results
Other time series models and algorithms
9
ThermoCast [Li et al, KDD 2011]
Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed
Goal: forecast the temperature distribution and support thermal-aware placement of workload
Approach: a zonal forecasting model that divides the machine room into zones, and each rack into sections
10
Assumptions
A0: incompressible air
A1: environmental temperature is constant
A2: supply air temperature is constant within a period
A3: constant server fan speed
A4: vertical air flow at the outtake is negligible
A5: vertical air flow at the intake is linear in height
11
Sensor measurements & Air interactions
12
ThermoCast
13
ThermoCast Model
[Equation labels: outlet temp; inlet temp; floor air speed]
Derived from fluid dynamics and thermodynamics together with the assumptions [Li et al, KDD 2011]
14
Parameter Learning
[Equation: an optimization objective with constraints ("s.t."); the formula was not preserved in this transcript]
15
Outline
Overview of time series mining
  Time series examples
  What problems do we solve
Motivation
Experimental setup
ThermoCast: the forecasting model
Results
Other time series models and algorithms
16
ThermoCast Results
Q1: How accurately can a server learn its local thermal dynamics for prediction?
2x better, using 90 minutes of data for training and predicting 5 minutes ahead
[Plot labels: AR; ThermoCast; 75%; 100%; shutdown]
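For context on the AR label in the comparison, here is a minimal sketch of an autoregressive forecaster in the same setting: fit on 90 minutes of readings, then predict 5 minutes ahead by iterating the model. The slide does not give the AR order or the sampling interval, so the first-order model with intercept, the one-reading-per-minute rate, and the synthetic temperature trace are all assumptions. This is not ThermoCast itself.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a * x[t-1] + b (AR(1) with intercept)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    a = cov / var
    b = mean_next - a * mean_prev
    return a, b

def forecast(series, a, b, steps):
    """Iterate the fitted recurrence `steps` times from the last observation."""
    x = series[-1]
    for _ in range(steps):
        x = a * x + b
    return x

# Hypothetical trace: one reading per minute for 90 minutes,
# warming from 20 C towards 25 C (made-up values).
train = [25.0 - 5.0 * 0.95 ** t for t in range(90)]

a, b = fit_ar1(train)
pred = forecast(train, a, b, steps=5)  # temperature 5 minutes ahead
print(f"a = {a:.3f}, b = {b:.3f}, 5-minute forecast = {pred:.2f} C")
```

A purely autoregressive model like this sees only the temperature history of one sensor; it has no input for workload or air speed, which is the kind of information the Given list on the ThermoCast slide adds.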
17
ThermoCast Results
Q2: How far ahead can ThermoCast forecast thermal alarms?
2x faster

          Baseline   ThermoCast
Recall    62.8%      71.4%
FAR       45%        43.1%
MAT       2.3 min    4.2 min

FAR = false alarm rate; MAT = mean look-ahead time
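The three metrics can be computed from a list of true alarm times and a list of warning times. The slide does not define them precisely, so the sketch below assumes: a warning matches an alarm if it precedes the alarm by at most `window` minutes; recall is the fraction of alarms with a matching warning; FAR is the fraction of warnings that match no alarm; MAT is the mean gap between the earliest matching warning and its alarm. The function name, the matching rule, and the example times are hypothetical.

```python
def alarm_metrics(true_alarms, warnings, window):
    """Return (recall, false_alarm_rate, mean_look_ahead_time).

    true_alarms: times (minutes) at which thermal alarms occurred
    warnings:    times (minutes) at which the forecaster raised a warning
    window:      maximum lead (minutes) for a warning to count as a hit
    """
    lead_times = []
    matched = set()
    for alarm in true_alarms:
        hits = [w for w in warnings if 0 <= alarm - w <= window]
        if hits:
            lead_times.append(alarm - min(hits))  # earliest matching warning
            matched.update(hits)
    recall = len(lead_times) / len(true_alarms) if true_alarms else 0.0
    false_alarms = [w for w in warnings if w not in matched]
    far = len(false_alarms) / len(warnings) if warnings else 0.0
    mat = sum(lead_times) / len(lead_times) if lead_times else 0.0
    return recall, far, mat

# Made-up example: three alarms, four warnings, 5-minute matching window.
recall, far, mat = alarm_metrics(
    true_alarms=[10, 30, 50], warnings=[6, 27, 40, 70], window=5
)
print(f"recall = {recall:.3f}, FAR = {far:.2f}, MAT = {mat:.1f} min")
```

In this example two of the three alarms are anticipated (by 4 and 3 minutes), and two of the four warnings match nothing.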
18
Implication on Capacity Gain
Preliminary results comparing workload placement strategies, with a 5-minute forecast length
With the same cooling:
Inlet temp with ThermoCast: [value not preserved] C
Inlet temp with static profiling: 16.5 C
Assuming the servers consume 200 W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling
19
Contributions and Impact
Predictability: a hybrid approach that integrates thermodynamics and sensor data
Scalable learning/training thanks to the zonal thermal model
Real data and instrumentation in a data center with a practical workload
Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 → PUE 1.4)
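The two PUE figures are consistent with the 26% figure under a simple reading. PUE is total facility power divided by IT power. Assuming the non-IT overhead (cooling and everything else) stays fixed while the IT load grows by 26%, a facility at PUE 1.5 moves to about 1.4. The fixed-overhead assumption and the normalized power values are illustrative and do not come from the slides.

```python
def pue(it_power, overhead_power):
    """Power usage effectiveness: total facility power / IT power."""
    return (it_power + overhead_power) / it_power

it_before = 1.0                # normalized IT load (assumed)
overhead = 0.5                 # non-IT overhead that gives PUE 1.5 (assumed fixed)
it_after = it_before * 1.26    # 26% more workload with the same cooling

pue_before = pue(it_before, overhead)
pue_after = pue(it_after, overhead)
print(round(pue_before, 2))  # 1.5
print(round(pue_after, 2))   # 1.4
```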
20
Outline
Overview of time series mining
  Time series examples
  What problems do we solve
Motivation
Experimental setup
ThermoCast: the forecasting model
Results
Other time series models and algorithms
21
DynaMMo: imputation/forecasting
[Figure labels: time; sensor 1, sensor 2, …, sensor m; blackout]
Goal: recover the missing values
Details in [Li et al, KDD 2009]
22
DynaMMo result
[Plot: reconstruction error vs. average missing length (the average length of successive missing values). Methods compared: Spline, MSVD [Srebro’03], Linear Interpolation, and our DynaMMo. Annotations: Ideal; better; harder.]
Why is there a drop at 100? Each point is the average of 10 repeats, and the missing values are generated randomly each time, so there is variance.
Dataset: CMU Mocap #16 (mocap.cs.cmu.edu)
More results in [Li et al, KDD 2009]
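To make the comparison concrete, here is a minimal sketch of the linear interpolation baseline named in the plot, together with a mean squared reconstruction error. It is not DynaMMo. The function names, the use of `None` to mark missing values, and the example sequence are assumptions for illustration.

```python
def interpolate_gaps(values):
    """Fill None entries by linear interpolation between the nearest
    observed neighbours; gaps at either end copy the nearest observation."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled

def reconstruction_error(truth, filled, observed):
    """Mean squared error over the positions that were missing."""
    gaps = [i for i, v in enumerate(observed) if v is None]
    return sum((truth[i] - filled[i]) ** 2 for i in gaps) / len(gaps)

# Made-up sequence with a two-sample gap and a missing final sample.
truth = [1.0, 2.5, 3.0, 4.0, 4.5]
observed = [1.0, None, None, 4.0, None]
filled = interpolate_gaps(observed)
error = reconstruction_error(truth, filled, observed)
print(filled, error)
```

Interpolation treats each sensor independently, so it cannot use the other sequences to bridge a gap.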
23
PLiF and CLDS for clustering
BGP data: hierarchical clustering + PLiF features
Details in [Li et al, VLDB 2010] and [Li & Prakash, ICML 2011]
24
CLDS Clustering Mocap Data
CLDS, two features: accuracy = 93.9%
PCA, top 2 components: accuracy = 51.0%
[Legend: walking motion; running motion]
25
WindMine
Goal: find patterns and anomalies in user-click streams
26
Discoveries by WindMine
[Panel labels: job website; weather; kids; health]
27
Conclusion
Time series mining has many applications
Energy consumption in data centers (DC) is significant, and cooling accounts for much of the cost
Sensor networks find use in data center monitoring
ThermoCast: the forecasting model
Other time series models and algorithms
  DynaMMo for imputation
  PLiF & CLDS for clustering
  WindMine for web clicks
28
References
Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011.
Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011.
Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011.
Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010.
Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD 2009.
29
Thanks!
Contact: Lei Li (leili@cs.cmu.edu)
papers, software, datasets on