1 Exploiting Nonstationarity for Performance Prediction
Christopher Stewart (University of Rochester), Terence Kelly and Alex Zhang (HP Labs)

2 Motivation
Enterprise applications are hard to manage
- Complex software hierarchy executes on (globally) distributed platforms
- Application-level performance metrics are more complicated than system-level metrics
- Infrastructure is fragile; system modifications (even for measurement purposes) are not always practical for real applications

3 Previous Work
Performance models ease the burden of system management
- Reduce complex system configurations to end-user response time or throughput prediction
- Achieved via kernel modification [barham-osdi-2004], runtime libraries [chandra-eurosys-2007], and controlled benchmarking [stewart-nsdi-2005, urgaonkar-sigmetrics-2005]
Can we apply model-driven system management when intrusive measurement tools are impractical?

4 Observation
Relative frequencies of transaction types in real enterprise applications are nonstationary
- i.e., they change over time
Nonstationarity allows model calibration using passive observations of application-level performance and system metrics

5 An Example
We want the mean value of a metric for each transaction type
- Nonstationarity allows for model calibration
Passive observations

Observation | Metric value | # of type A requests | # of type B requests
1           | 10           | 2                    | 4
2           | 13           | 3                    | 5

Solve a set of linear equations (2A + 4B = 10 and 3A + 5B = 13): type A = 1, type B = 2
Passive observations are sufficient to calibrate performance models for real systems
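
As a minimal sketch of this example, the two observations can be treated as a 2x2 linear system and solved with Cramer's rule. The function name `solve_two_types` is hypothetical and not from the talk; exact fractions are used so the result is not blurred by rounding.

```python
from fractions import Fraction

def solve_two_types(observations):
    """Solve for the per-type mean metric from exactly two observations.

    Each observation is (metric_value, count_a, count_b), meaning
    metric_value = count_a * mean_a + count_b * mean_b.
    """
    (y1, a1, b1), (y2, a2, b2) = observations
    det = a1 * b2 - a2 * b1
    if det == 0:
        # Same A:B ratio in both observations: the mix did not change,
        # so the two equations cannot separate the per-type values.
        raise ValueError("transaction mix did not change; system is singular")
    mean_a = Fraction(y1 * b2 - y2 * b1, det)
    mean_b = Fraction(a1 * y2 - a2 * y1, det)
    return mean_a, mean_b

# The two passive observations from the slide: (metric, #A, #B).
means = solve_two_types([(10, 2, 4), (13, 3, 5)])
print(means)
```

The singular case is the point of the slide: if both observations had the same ratio of type A to type B requests, no amount of passive data would separate the two per-type values.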

6 Outline
1. Transaction mix nonstationarity is real
   - Investigate 2 production enterprise applications
   - Implications of nonstationarity
2. A performance model for real enterprise applications
3. Performance-aware server consolidation
4. Conclusion

7 Commercial Applications
Codename: VDR
- Internal business-critical HP application
- Services HP users and external customers
- 1-week trace
Codename: ACME
- Large Internet retailer (circa 2000)
- 5-day trace

8 Nonstationarity in Real Applications
VDR Application
- Relative frequency of the two most popular transaction types
- Each point reflects an observation during a 5-minute interval
Almost every ratio is represented
- Transaction-type popularity is not fixed
[Figure: scatter plot; axes: Fraction of Most Popular, Fraction of 2nd Most Popular]
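
A sketch of how such per-interval relative frequencies could be computed from a transaction log, assuming each log entry is a `(timestamp_sec, transaction_type)` pair. The log format, the function name `mix_per_window`, and the sample events are all hypothetical; the traces in the talk are not available here.

```python
from collections import Counter

def mix_per_window(events, window_sec=300):
    """Return {window_index: {transaction_type: fraction}}.

    events     -- iterable of (timestamp_sec, transaction_type)
    window_sec -- window length; 300 s matches the 5-minute intervals
    """
    counts = {}
    for ts, ttype in events:
        window = int(ts // window_sec)
        counts.setdefault(window, Counter())[ttype] += 1
    mixes = {}
    for window, counter in sorted(counts.items()):
        total = sum(counter.values())
        mixes[window] = {t: n / total for t, n in counter.items()}
    return mixes

# Made-up events: the mix flips between the first and second window.
events = [(10, "browse"), (40, "browse"), (200, "buy"), (290, "browse"),
          (310, "buy"), (450, "buy"), (599, "browse"), (599.5, "buy")]
mixes = mix_per_window(events)
print(mixes)
```

Plotting the fraction of the most popular type against the fraction of the second most popular type, one point per window, would give a scatter plot of the kind described on this slide.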

9 Nonstationarity in Real Applications
ACME Application
- Fraction of “add-to-cart” transactions in the ACME workload
- Each point reflects an observation during a 5-minute window
Frequencies vary by 2 orders of magnitude
[Figure: fraction of “add-to-cart” transactions vs. Time (hours), 0 to 120]

10 Implications of Nonstationarity
Performance models
- A wide range of transaction mixes is a first-order concern for real production applications
- Models that consider only request rate are likely to provide poor predictive accuracy under real-world conditions

11 Implications of Nonstationarity
Workload generators
- Popular benchmarks (e.g., RUBiS and TPC-W) use first-order Markov models
- First-order Markov models yield stationary mixes (in the long term)
RUBiS browse-mix shown
- Rethink workload generation
[Figure: scatter plot for the RUBiS browse mix; axes: Fraction of Most Popular, Fraction of 2nd Most Popular]
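
To illustrate why a first-order Markov workload generator settles into a fixed mix, the sketch below repeatedly applies a small transition matrix to two different starting distributions. The two-state matrix `P` is a made-up example, not the RUBiS or TPC-W transition table.

```python
def step(dist, P):
    """Advance a distribution over transaction types by one transition."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def long_run_mix(start, P, steps=200):
    """Expected transaction mix after many transitions from `start`."""
    dist = list(start)
    for _ in range(steps):
        dist = step(dist, P)
    return dist

# Hypothetical transition probabilities between two transaction types.
# Row i gives the probabilities of the next type given current type i.
P = [[0.5, 0.5],
     [0.2, 0.8]]

mix_a = long_run_mix([1.0, 0.0], P)  # every session starts in type 0
mix_b = long_run_mix([0.0, 1.0], P)  # every session starts in type 1
print(mix_a, mix_b)
```

Both runs converge to the same mix (2/7 and 5/7 for this matrix), regardless of the starting point. Over a long run such a generator therefore concentrates around one mix rather than covering the wide range of ratios seen in the production traces.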

12 Outline
1. Transaction mix nonstationarity is real
2. A performance model for real enterprise applications
   - Passive observations in real applications
   - Model design
   - Model validation
3. Performance-aware server consolidation
4. Conclusion

13 Model Overview
Measurements under real workloads are sufficient (with some analytics) to predict application-level performance
We will carefully build a model that can be calibrated from passive observations of response times and resource utilizations

14 Passive Observations
Certain system metrics are easy to acquire and widely available in production environments
- Response times, CPU utilizations, and disk utilizations are routinely collected by tools in commodity operating systems
Passive observations from VDR (abbreviated)

5-min interval (i) | Sum of resp. times (y) | CPU util. | Type 1 count (N_1) | Type 2 count (N_2) | Type 3 count (N_3)
1                  | 17.2 sec               | 6.5%      | 25                 | 78                 | 27
2                  | 124 sec                | 24%       | 48                 | 144                | 45
...                | ...                    | ...       | ...                | ...                | ...
1440               | 4.1 sec                | 1.2%      | 4                  | 19                 | 14

15 Model Design
Each term considers one aspect of response time
The first term considers service time
- N_ij: the count of transaction type j in interval i
- α_j: the typical service time of transaction type j

16 Model Design
The second term considers queuing delay
- U_ir: the utilization of resource r at interval i
- λ_i: the arrival rate of all transactions during interval i
Resource utilization is not known a priori
- Independently calibrated as a function of transaction mix
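
The model equation itself did not survive in this transcript, so the sketch below is an assumption about how the two terms might be combined. The service term is the sum over types of per-type service time times count. The queuing term uses an M/M/1-style mean wait of U^2 / (rate * (1 - U)) per transaction at each resource. The function name, parameter names, and example numbers are hypothetical.

```python
def predict_sum_response(counts, alpha, utilizations, interval_sec):
    """Predict the sum of response times (seconds) for one interval.

    counts       -- count of each transaction type in the interval
    alpha        -- per-type service times (seconds), same order as counts
    utilizations -- utilization of each resource, as fractions in [0, 1)
    interval_sec -- interval length, used to derive the arrival rate
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    # First term: service time, summed over transaction types.
    service = sum(a * n for a, n in zip(alpha, counts))
    # Arrival rate of all transactions during this interval.
    rate = total / interval_sec
    # Second term (ASSUMED form): M/M/1-style mean wait per transaction,
    # summed over resources and multiplied by the number of transactions.
    queuing = 0.0
    for u in utilizations:
        if not 0.0 <= u < 1.0:
            raise ValueError("utilization must be in [0, 1)")
        queuing += total * (u * u) / (rate * (1.0 - u))
    return service + queuing

# Made-up numbers: 2 + 4 transactions in a 10 s interval, one resource at 50%.
y_hat = predict_sum_response([2, 4], [1.0, 2.0], [0.5], 10.0)
print(y_hat)
```

In this form the queuing term depends only on measured utilizations and counts, not on the per-type service times, so for a given interval it is a known constant when the service times are fitted.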

17 Model Calibration
For performance prediction, we must acquire α_j
- The second term is constant for each interval i
- Solve (minimize error) a set of linear equations
Regression technique: least absolute residuals (LAR)
- Robust to outliers, no tunable parameters, maximizes retrospective accuracy
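
A small illustration of least absolute residuals for two transaction types. It relies on the fact that a minimizer of the sum of absolute residuals can always be found among fits that pass exactly through as many observations as there are unknowns, so for a tiny data set every pair of observations can be tried. This brute-force search is only a teaching sketch with made-up data, not the authors' implementation; a realistic calibration would more likely pose LAR as a linear program.

```python
from fractions import Fraction
from itertools import combinations

def fit_through_pair(row1, row2, y1, y2):
    """Exact solution of the 2x2 system defined by two observations."""
    (a, b), (c, d) = row1, row2
    det = a * d - b * c
    if det == 0:
        return None  # same mix in both observations; no unique fit
    return (Fraction(y1 * d - b * y2, det), Fraction(a * y2 - y1 * c, det))

def total_abs_residual(counts, y, alpha):
    """Sum of |observed - predicted| over all observations."""
    return sum(abs(yi - (n[0] * alpha[0] + n[1] * alpha[1]))
               for n, yi in zip(counts, y))

def lar_fit_two_types(counts, y):
    """Return (alpha, cost) minimizing the sum of absolute residuals."""
    best = None
    for i, k in combinations(range(len(y)), 2):
        alpha = fit_through_pair(counts[i], counts[k], y[i], y[k])
        if alpha is None:
            continue
        cost = total_abs_residual(counts, y, alpha)
        if best is None or cost < best[1]:
            best = (alpha, cost)
    return best

# Made-up observations generated with service times (1, 2);
# the third observation is an outlier (28 instead of 3).
counts = [(2, 4), (3, 5), (1, 1), (4, 2), (5, 5)]
y = [10, 13, 28, 8, 15]
alpha, cost = lar_fit_two_types(counts, y)
print(alpha, cost)
```

The single outlier adds its own residual to the cost but does not pull the fitted values away from (1, 2), which is the robustness property the slide refers to.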

18 Model Validation
VDR trace
- ½ for calibration
- ½ for prediction
Our model robustly predicts past and future performance
[Figure: Sum of Response Times (sec.), 0 to 2000, vs. 5-min intervals (in trace order), 0 to 2000]

19 Model Validation
VDR trace: median error
- 7% calibrated set
- 9% predicted set
ACME: 12% median predictive error
An accurate model from passive observations
[Figure: CDF (0% to 100%) of absolute percentage error, |predict - actual| / actual (0% to 150%)]
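
The error metric on this slide, |predict - actual| / actual, and its median can be computed as in the sketch below. The function names and sample values are made up for illustration and are not the VDR or ACME results.

```python
from statistics import median

def absolute_percentage_errors(predicted, actual):
    """Per-interval |predict - actual| / actual, as fractions."""
    errors = []
    for p, a in zip(predicted, actual):
        if a == 0:
            raise ValueError("actual value must be nonzero")
        errors.append(abs(p - a) / a)
    return errors

def median_ape(predicted, actual):
    """Median absolute percentage error across intervals."""
    return median(absolute_percentage_errors(predicted, actual))

# Made-up predictions and measurements for four intervals.
predicted = [110.0, 90.0, 100.0, 130.0]
actual = [100.0, 100.0, 100.0, 100.0]
err = median_ape(predicted, actual)
print(f"median error: {err:.0%}")
```

Sorting the per-interval errors and plotting their cumulative fraction would give a CDF of the kind shown in the figure.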

20 Outline
1. Transaction mix nonstationarity is real
2. Performance prediction for real enterprise applications
3. Performance-aware server consolidation
   - Problem statement
   - Extending our model for server consolidation
   - Validation
4. Conclusion

21 Problem Statement
Performance-aware server consolidation
- Given passive observations of enterprise applications running separately
- Predict post-consolidation performance for each application
- For this work, the hardware platform does not change

22 Performance-Aware Server Consolidation
Post-consolidation performance model
- Application consolidation primarily affects the queuing delay for each application
- Simplifying assumption: Post-consolidation utilization is the sum of pre-consolidation utilizations
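
The simplifying assumption can be sketched as a per-resource sum of the utilizations measured while each application ran alone. The function name, the dictionary layout, and the numbers are hypothetical.

```python
def consolidated_utilization(per_app_utils):
    """Sum pre-consolidation utilizations per resource.

    per_app_utils -- list of dicts mapping resource name to the
                     utilization (fraction in [0, 1)) observed for one
                     application running on its own.
    """
    combined = {}
    for utils in per_app_utils:
        for resource, u in utils.items():
            combined[resource] = combined.get(resource, 0.0) + u
    saturated = [r for r, u in combined.items() if u >= 1.0]
    if saturated:
        # A summed utilization of 1 or more means the shared resource
        # would be overloaded, so a queuing model no longer applies.
        raise ValueError(f"resources saturated after consolidation: {saturated}")
    return combined

# Made-up utilizations for two applications in the same interval.
app_a = {"cpu": 0.25, "disk": 0.125}
app_b = {"cpu": 0.5, "disk": 0.25}
combined = consolidated_utilization([app_a, app_b])
print(combined)
```

Each application's queuing delay would then be evaluated with the combined utilization, while its service-time term stays as calibrated before consolidation.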

23 Validation
Experimental setup
- RUBiS and StockOnline
- Custom nonstationary workloads
  Observed on ACME-variant
  Consolidated on VDR-variant
- 10-hour consolidation with 30-second measurement intervals
Passively calibrated model predicts post-consolidation performance
Median error 6% and 11%
[Figure: CDF (0% to 100%) of absolute percentage error, |predict - actual| / actual (0% to 80%)]

24 Outline
1. Transaction mix nonstationarity is real
2. Performance prediction for real enterprise applications
3. Performance-aware server consolidation
   - Problem statement
   - Model-driven server consolidation
   - Validation
4. Conclusion

25 Future Work
1. Performance prediction across multi-core processor configurations
   Passive observations calibrate simple yet effective models of processor utilization
2. Performance anomaly depiction
   Predictions are used to identify situations where performance does not match model expectations [stewart-hotdep-2006, kelly-worlds-2005]

26 Take Away Points
Transaction mix nonstationarity is a real phenomenon in production applications
Passive observations are sufficient to calibrate performance models
Passively calibrated performance models can guide system management decisions

