Slide 1: Understanding and Predicting Host Load
Peter A. Dinda, Carnegie Mellon University
http://www.cs.cmu.edu/~pdinda
Slide 2: Talk in a Nutshell
- Load is self-similar
- Load exhibits epochal behavior
- Load prediction benefits from capturing self-similarity
- Statistical analysis of two sets of week-long, 1 Hz resolution traces of load on ~40 machines, and evaluation of linear time series models for load prediction
Slide 3: Why Study Load?
- Load partially determines execution time
- We want to model and predict load
[Diagram: an interactive application submits short tasks with deadlines to an unmodified distributed system; the goal is a predicted execution-time interval [t_min, t_max].]
Slide 4: Load and Execution Time
Slide 5: Outline
- Measurement methodology
- Load traces
- Load variance
- New results
  - Self-similarity
  - Epochal behavior
- Benefits of capturing self-similarity in linear models
- Conclusions
Slide 6: Measurement Methodology
[Diagram: the Digital Unix kernel samples the ready-queue length (len_t, len_{t-T}, ..., len_{t-30T}) every T = 2 seconds and folds it into an exponential average, the 1-minute load "average" (avg_t, avg_{t-0.5T}, avg_{t-T}, ...); a user-level measurement tool records this average at a 1 Hz sample rate.]
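A minimal sketch of the measurement pipeline described on this slide, assuming a kernel-style exponential average of ready-queue length. The `sample_queue_length()` helper and the `decay` constant are hypothetical stand-ins; the slide only specifies the T = 2 second kernel update period and the 1 Hz user-level sample rate.

```python
import random

T = 2.0            # kernel update period in seconds (from the slide)
SAMPLE_RATE = 1.0  # user-level tool samples at 1 Hz (from the slide)

def sample_queue_length():
    """Hypothetical stand-in for reading the ready-queue length (len_t)."""
    return random.randint(0, 4)

def simulate_load_trace(duration_s=60, decay=0.1):
    """Simulate the pipeline: every T seconds the kernel folds the current
    ready-queue length into an exponential average (the load "average");
    every 1/SAMPLE_RATE seconds the user-level tool records that average.
    The decay constant is an assumption; the slide does not give it."""
    avg = 0.0
    trace = []
    next_kernel_update = 0.0
    t = 0.0
    while t < duration_s:
        if t >= next_kernel_update:
            avg = (1.0 - decay) * avg + decay * sample_queue_length()
            next_kernel_update += T
        trace.append(avg)          # the user-level tool reads the current average
        t += 1.0 / SAMPLE_RATE
    return trace

print(simulate_load_trace(duration_s=10))
```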
Slide 7: Load Traces
Slide 8: Absolute Variation
Slide 9: Relative Variation
Slide 10: Load Autocorrelation and Periodogram
[Plots: autocorrelation vs. time lag, and periodogram vs. frequency.]
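A sketch of how the two plots on this slide could be computed for a load trace, using only numpy; the `trace` variable is a placeholder for one of the week-long 1 Hz traces.

```python
import numpy as np

def autocorrelation(trace, max_lag):
    """Sample autocorrelation of a load trace for lags 0..max_lag."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def periodogram(trace, sample_rate=1.0):
    """Periodogram (squared FFT magnitude) versus frequency, for 1 Hz sampling."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return freqs, spectrum

# Example with a synthetic trace; the real traces are one week at 1 Hz.
trace = np.random.rand(3600)
acf = autocorrelation(trace, max_lag=600)
freqs, spectrum = periodogram(trace)
```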
Slide 11: Visual Self-Similarity
Slide 12: The Hurst Parameter
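The slides report Hurst parameters but do not say which estimator was used; below is a sketch of one common choice, the aggregated-variance method, where H is read off the slope of log(variance of the m-aggregated series) versus log(m). This is an illustrative assumption, not necessarily the talk's estimator.

```python
import numpy as np

def hurst_aggregated_variance(trace, block_sizes=(2, 4, 8, 16, 32, 64, 128)):
    """Estimate the Hurst parameter H with the aggregated-variance method:
    for each block size m, average the series over non-overlapping blocks and
    compute the variance of the aggregated series; for a self-similar process
    that variance scales as m**(2H - 2)."""
    x = np.asarray(trace, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue
        blocks = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(blocks.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0   # slope = 2H - 2

# H near 0.5 for uncorrelated noise; between 0.5 and 1 for self-similar load.
print(hurst_aggregated_variance(np.random.rand(100000)))
```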
Slide 13: Self-Similarity Statistics
Slide 14: Why is Self-Similarity Important?
- Complex structure
  - Not completely random, nor independent
  - Short-range dependence: excellent for history-based prediction
  - Long-range dependence: possibly a problem
- Modeling implications
  - Suggests models that can capture it: ARFIMA, FGN, TAR
Slide 15: Load Exhibits Epochal Behavior
Slide 16: Epoch Length Statistics
Slide 17: Why is Epochal Behavior Important?
- Complex structure
  - Non-stationary
- Modeling implications
  - Suggests models: ARIMA, ARFIMA, etc.; non-parametric spectral methods
  - Suggests problem decomposition
Slide 18: Linear Time Series Models
- Choose the filter weights to minimize a²
- a is the confidence interval for t+1 predictions
[Diagram: an unpredictable random sequence passes through a fixed linear filter to produce the partially predictable load sequence.]
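A minimal sketch of the "fixed linear filter" idea for the simplest member of the family, an AR(p) predictor fitted by least squares. The order p = 4, the train/test split, and forming the interval as 1.96 times the residual standard deviation are assumptions for illustration, not the talk's exact procedure.

```python
import numpy as np

def fit_ar(trace, p):
    """Fit AR(p) weights by least squares: predict z_t from z_{t-1}..z_{t-p}."""
    z = np.asarray(trace, dtype=float)
    rows = np.array([z[t - p:t][::-1] for t in range(p, len(z))])
    targets = z[p:]
    weights, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return weights

def one_step_predictions(trace, weights):
    """Apply the fitted filter to produce t+1 predictions."""
    z = np.asarray(trace, dtype=float)
    p = len(weights)
    return np.array([np.dot(weights, z[t - p:t][::-1]) for t in range(p, len(z))])

# Example: fit on one part of a trace, measure t+1 error on the rest.
trace = np.random.rand(5000)                     # placeholder for a load trace
w = fit_ar(trace[:4000], p=4)
pred = one_step_predictions(trace[4000:], w)
resid = trace[4000 + 4:] - pred
a = 1.96 * resid.std()                           # assumed form of the t+1 interval
print(f"AR(4) 95% interval half-width: {a:.3f}")
```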
Slide 19: Realizable Pole-Zero Models
- Model family: MA(q), AR(p), ARMA(p,q), ARIMA(p,d,q), ARFIMA(p,d,q)
- p and q are the numbers of model parameters
- d is the degree of differencing
- ARIMA: integer d captures non-stationarity
- ARFIMA: fractional d captures self-similarity (d is related to the Hurst parameter)
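If a library is acceptable, the ARMA/ARIMA members of this hierarchy can be fitted directly; the sketch below uses statsmodels, which is an assumption (the talk's evaluation did not use this library), and its ARIMA class does not cover ARFIMA's fractional d.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

trace = np.random.rand(2000)   # placeholder for a 1 Hz host-load trace

# ARMA(4,4) is ARIMA with d = 0; ARIMA(4,1,4) differences once (integer d).
arma_fit = ARIMA(trace, order=(4, 0, 4)).fit()
arima_fit = ARIMA(trace, order=(4, 1, 4)).fit()

# One-step-ahead (t+1) forecast from each fitted model.
print(arma_fit.forecast(steps=1), arima_fit.forecast(steps=1))
```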
Slide 20: Real-World Benefits of Models
- a is the confidence interval for t+1 predictions
- Map work that would take 100 ms at zero load onto a predicted execution-time range
- axp0: σ_z = 0.54, mean load = 1.0; a(ARMA(4,4)) = 0.109, a(ARFIMA(4,d,4)) = 0.108
  - no model: 1.0 +/- 1.06 (95%) => 100 to 306 ms
  - ARMA: 1.0 +/- 0.22 (95%) => 178 to 222 ms
  - ARFIMA: 1.0 +/- 0.21 (95%) => 179 to 221 ms (~1% tighter than ARMA)
- axp7: σ_z = 0.14, mean load = 0.12; a(ARMA(4,4)) = 0.041, a(ARFIMA(4,d,4)) = 0.025
  - no model: 0.12 +/- 0.27 (95%) => 100 to 139 ms
  - ARMA: 0.12 +/- 0.08 (95%) => 104 to 120 ms
  - ARFIMA: 0.12 +/- 0.05 (95%) => 107 to 117 ms (~40% tighter than ARMA)
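The execution-time ranges on this slide are consistent with the simple mapping exec_time ≈ nominal × (1 + predicted load), with the lower end of the load interval clamped at zero. A small check of the axp0 numbers under that assumption:

```python
def exec_time_range(nominal_ms, load_mean, half_width):
    """Map a predicted load interval onto an execution-time interval,
    assuming exec_time = nominal * (1 + load) and load >= 0."""
    lo = nominal_ms * (1.0 + max(load_mean - half_width, 0.0))
    hi = nominal_ms * (1.0 + load_mean + half_width)
    return lo, hi

# axp0: mean load 1.0; no model +/-1.06, ARMA +/-0.22, ARFIMA +/-0.21
print(exec_time_range(100, 1.0, 1.06))  # (100.0, 306.0) -> 100 to 306 ms
print(exec_time_range(100, 1.0, 0.22))  # (178.0, 222.0) -> 178 to 222 ms
print(exec_time_range(100, 1.0, 0.21))  # (179.0, 221.0) -> 179 to 221 ms
```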
Slide 21: t+1 Prediction
Slide 22: t+8 Prediction
Slide 23: Conclusions
- Load has high variance
- Load is self-similar
- Load exhibits epochal behavior
- Capturing self-similarity in linear time series models improves predictability
Slide 24: Load Traces
- Would a web-accessible load trace database be useful?
- Would you like to contribute?