Subrata Mitra (Purdue University), Greg Bronevetsky (Google / LLNL), Suhas Javagal (Purdue University), Saurabh Bagchi (Purdue University). PACT 2015, San Francisco.

The need for accurate performance prediction
 Scheduler: more accurate prediction of execution time and resource usage
 Approximate computing: the level of approximation can be adjusted based on predicted performance for different inputs
 Anomaly detection: false alarms can be reduced with an accurate prediction mechanism
Too many factors influence performance: execution environment, configuration parameters, input size, data characteristics. It is not possible to test or model all the combinations.

Example of a performance bug
Test: looks GOOD. Production: blown up.

    int main(int argc, char *argv[]) {
        int p = readInput();
        int q = atoi(argv[1]);
        int r = calculationsOnInput(p);
        float s = otherCalculations();
        float someVal = (r % q);
        int iterCount = s / someVal;   // explodes when someVal is small
        for (int i = 0; i < iterCount; i++) {
            doMoreCalculations();
        }
        return 0;
    }

[Slide shows a table of test vs. production values for q, r, s, someVal, and iterCount]
When a production execution takes too long, how would you know whether it is because of (A) a large input size OR (B) a bug?

Previous works tried to solve this problem
 DIDUCE [1] and Daikon [2]: identify hard invariants. Good for detecting constants, valid ranges, ordering, linear relationships, etc. E.g.: v1 = 5, 2 < v1 < 6, v1 > v2, v1 + v2 = 3*v3
Not suitable for detecting performance anomalies:
- No statistical prediction mechanism: cannot operate on unseen ranges of data
- Binary decision (violated / not violated): no notion of how bad
[1] Hangal et al., ICSE 2002
[2] Ernst et al., Science of Computer Programming, 2007

Previous works related to performance anomaly
 There is much existing research on performance anomaly detection:
- "Fingerprinting the datacenter: automated classification of performance crises" by Bodik et al. (EuroSys 2010)
- "Ensembles of models for automated diagnosis of system performance problems" by Zhang et al. (DSN 2005)
- "X-ray: Automating Root-Cause Diagnosis of Performance Anomalies in Production Software" by Attariyan et al. (OSDI 2012)
None can bound prediction confidence based on the input data.

Roadmap of the talk
 Need for prediction error characterization
 Our tool chain
 Details of the error characterization techniques
 A use case: anomaly detection
 Results and conclusion

Statistical models are NOT fully accurate
 Many configuration parameters with large ranges of acceptable values: impossible to cover all combinations
 Performance changes with input size
 Performance changes with the characteristics of the input data. Examples: density of a graph, sparsity of a matrix
 Models are created with limited training runs; production inputs can be very different
Errors in performance prediction models must be characterized.

Examples of prediction errors
Predictors have inherent inaccuracies; they cannot capture all possible input scenarios.
[Plot: SpMV, prediction error in # executed instructions: % error in prediction vs. distance from the training set (extrapolation error)]
[Plot: CoMD, prediction error in # executed instructions: % error in prediction vs. density (interpolation error)]

Our contributions
 Showed how to characterize the errors in performance predictions made by statistical models
 Built the associated tools for data collection and modeling
 Showed how such error characterization can be used for anomaly detection that is sensitive to input parameters and data

Tool chain: SIGHT – SCULPTOR – E-ANALYZER – GUARDIAN
[Architecture diagram: test/train runs feed SIGHT (features & observations list; identifies duplicate features, removes noisy observations); SCULPTOR performs context-based modeling; E-ANALYZER creates the distribution of prediction errors and a profile for extrapolation error; GUARDIAN does input-aware anomaly detection on production runs and drives an alarm/monitoring system with a graphical interface reporting observation, value, and probability of fault (e.g., L2_miss: high, time: normal); a feedback edge runs from production back to training]

SIGHT: instrumentation & data collection framework
 Tracks the input features and observation features of each module
 Input feature examples: input size, matrix sparsity (SpMV), bit-rate (FFmpeg), # threads (LINPACK), etc.
 Observation feature examples: execution time, # executed instructions, # load instructions, cache miss rate, residual value, etc.
Simple annotation APIs:

    float func(float x, float y) {
        // Create a module for the scope of this function.
        SightModule funcMod("func", inputs("x", x, "y", y));
        DoCalculations();               // some computation done by the function
        float r = calculateResidual();  // computes a residual value to verify convergence
        funcMod.outputs("r", r);        // SIGHT tracks the residual as an output observation
        return r;
    }

SCULPTOR: the modeling tool
[Same tool-chain diagram as before, with SCULPTOR highlighted]

SCULPTOR: the modeling tool
 Creates models for code regions identified by the developers
 Chooses the most useful input and observation features
- Maximal information coefficient (MIC) based analysis [1]
- Identifies whether any relationship exists between two variables
- Works even for non-linear relationships
 Builds a polynomial regression model for each output feature (see the sketch below)
- Polynomial of up to degree 3
- LASSO to restrict the number of terms
[1] Reshef et al., "Detecting Novel Associations in Large Data Sets," Science, Dec 2011
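To make the modeling step concrete, here is a minimal sketch in Python (not SCULPTOR's actual implementation): scikit-learn's PolynomialFeatures and Lasso stand in for the degree-3 polynomial basis and the LASSO term selection described above, and the features and data are hypothetical.

    # Sketch of SCULPTOR-style modeling: degree-3 polynomial regression
    # with LASSO to restrict the number of surviving terms.
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    # Hypothetical training runs: columns are input features,
    # e.g., [input_size, matrix_sparsity].
    X = rng.uniform(0, 10, size=(200, 2))
    # Hypothetical observation feature, e.g., execution time.
    y = 3 * X[:, 0] + 0.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 200)

    model = make_pipeline(PolynomialFeatures(degree=3), Lasso(alpha=0.1))
    model.fit(X, y)

    # Predict the observation feature at a new input point.
    print(model.predict([[5.0, 2.0]]))

With LASSO's L1 penalty, most of the ten cubic basis terms receive zero coefficients, leaving a parsimonious model.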

SCULPTOR also handles discontinuities and supports incremental model updates, and it asks SIGHT for more input features if the model is still not accurate.
[Same tool-chain diagram, highlighting the feedback edge from SCULPTOR back to SIGHT's features & observations list]

E-ANALYZER: the error characterization tool
[Same tool-chain diagram as before, with E-ANALYZER highlighted]

E-ANALYZER: the error characterization tool
 Interpolation error: repeatedly split the available data into training and test points (round 1, round 2, round 3, ...) and pool the prediction errors into an error distribution (probability vs. % error); a sketch of this procedure follows below
 Extrapolation error: train on a sub-region of the available data, predict at points outside it, and profile how the prediction error grows with the extrapolation distance from the training region
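A minimal sketch of the interpolation-error characterization, assuming it amounts to repeated random train/test splits ("rounds") whose prediction errors are pooled into an empirical distribution; the model and data are the hypothetical ones from the SCULPTOR sketch above.

    # Sketch: empirical distribution of interpolation errors over
    # repeated train/test splits of the available data.
    import numpy as np
    from sklearn.model_selection import train_test_split

    def interpolation_error_distribution(model, X, y, rounds=30, seed=0):
        errors = []
        for r in range(rounds):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=0.3, random_state=seed + r)
            model.fit(X_tr, y_tr)
            pred = model.predict(X_te)
            # Percentage error for each held-out prediction.
            errors.extend(100 * np.abs(pred - y_te)
                          / np.maximum(np.abs(y_te), 1e-9))
        return np.asarray(errors)

    # e.g.: errs = interpolation_error_distribution(model, X, y)
    #       np.percentile(errs, 95)  -> a 95th-percentile error bound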

How to calculate the distance?
Example: trained with two input features, f1 (training range lost in the transcript) and f2 (range 0 to 10)
 How far is the production point (f1=2000, f2=11)?
 Which is closer to the training region: (f1=2000, f2=11) or (f1=1100, f2=20)?
Practical limitation: input features cannot simply be normalized, because valid ranges may not be known (no a priori bound, e.g., on input size)
[Same extrapolation-error figure as on the previous slide]

Proposed notion of distance:
[The distance definition on this slide was not preserved in the transcript]

Example of error calculation
 Sort the training data w.r.t. f1 and create an error profile for varying f1
 Sort the training data w.r.t. f2 and create an error profile for varying f2
 To estimate the overall error at a production point (f1 = X1, f2 = X2), combine the projected errors from the feature-specific error profiles (a sketch follows below)
[Two plots: % error vs. f1 and % error vs. f2]
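A minimal sketch of that per-feature procedure, assuming an error profile is simply prediction error recorded as a function of one feature's value, and that projection fits a simple trend to the profile and evaluates it at the production point; the linear-trend fit is one plausible projection scheme, not necessarily the paper's.

    # Sketch: per-feature extrapolation-error profile and projection.
    import numpy as np

    def error_profile(model, X, y, feature, holdout=0.3):
        # Sort the training runs by this feature and hold out the runs
        # with the largest values, so predictions on them extrapolate.
        order = np.argsort(X[:, feature])
        cut = int(len(order) * (1 - holdout))
        tr, te = order[:cut], order[cut:]
        model.fit(X[tr], y[tr])
        pred = model.predict(X[te])
        pct_err = 100 * np.abs(pred - y[te]) / np.maximum(np.abs(y[te]), 1e-9)
        return X[te, feature], pct_err    # profile: feature value -> % error

    def project_error(profile_x, profile_err, x_prod):
        # Fit a linear trend to the profile and evaluate it at the
        # production point's feature value.
        slope, intercept = np.polyfit(profile_x, profile_err, 1)
        return max(0.0, slope * x_prod + intercept)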

Combining individual error components
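The combination formula itself did not survive the transcript. Based on the scaling behavior described on the "How does this equation scale?" backup slide (the overall error rises to √k for k uncorrelated features, an RMS-style combination), a plausible reconstruction, not necessarily the paper's verbatim equation, is:

    E_{\text{overall}} = \sqrt{\sum_{i=1}^{k} E_i^{2}}

where E_i is the error component projected from feature i's profile; if each of k uncorrelated features contributes a unit error component, this indeed yields √k.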

GUARDIAN: the anomaly detection tool
[Same tool-chain diagram as before, with GUARDIAN highlighted]

GUARDIAN: the anomaly detection tool
 How do the extrapolation and interpolation errors fit together?
 Production run: calculate a probability of anomaly (a sketch follows below)
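A minimal sketch of one way the two error characterizations could combine at production time, assuming GUARDIAN-style detection flags a run when its observed prediction error exceeds an input-dependent bound: the interpolation-error distribution, widened by the projected extrapolation error. Names and thresholds here are hypothetical.

    # Sketch: input-aware anomaly probability for a production run.
    import numpy as np

    def anomaly_probability(observed, predicted, interp_errs, extrap_err_pct):
        # observed/predicted: one observation feature (e.g., # instructions);
        # interp_errs: empirical % interpolation errors (from E-ANALYZER);
        # extrap_err_pct: projected % extrapolation error at this input.
        pct_err = 100 * abs(observed - predicted) / max(abs(predicted), 1e-9)
        # Widen the acceptable error by the extrapolation component, so
        # inputs far from the training region raise fewer false alarms.
        adjusted = max(0.0, pct_err - extrap_err_pct)
        # Fraction of the interpolation-error distribution this exceeds:
        # near 1.0 means anomalous even after expected modeling error.
        return float(np.mean(interp_errs <= adjusted))

    # e.g.: raise an alarm when anomaly_probability(...) > 0.95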

How did we evaluate GUARDIAN
 Used 7 applications from different domains:
- CoMD, LULESH (scientific simulations)
- LINPACK, SpMV (math)
- PageRank, BlackScholes, FFmpeg (diverse)
 False positive rate:
- Training: using the given range of parameter values
- Production: parameter values away from the training range (low/high distance)
 Anomaly detection accuracy:
- Injected computation- and memory-intensive code as anomalies
- Varied the intensity of the anomaly
 Compared against fixed-percentage-threshold detection:
- GUARDIAN A: only extrapolation error considered
- GUARDIAN B: both interpolation and extrapolation errors considered

Experiment: false positive rates
GUARDIAN A: only extrapolation error characteristics
GUARDIAN B: both interpolation and extrapolation error characteristics
[Bar chart: % of time a false alarm is raised; lower is better]
GUARDIAN significantly reduces the false positive rate.

Experiment: detection accuracy
[Bar chart: % of time the injected extra computation loop is detected; higher is better]
GUARDIAN maintains high detection accuracy.
Other experiments injected memory-allocation and cache-interference code; refer to the paper for the details.

Summary
 Input parameters have a substantial influence on software performance
 Not all possible scenarios can be modeled or tested
 Confidence in a performance prediction made by a statistical technique must be quantified based on the input data
Future directions
 Input-aware scheduling: adjust estimates based on prediction-error characteristics
 Approximate computing: adjust the level of approximation based on modeling-error characteristics and input properties

Thank you! Questions?

Backup slides

SIGHT: the inputs and observation features we track
[The feature table on this slide was not preserved in the transcript]


Background: MIC
Reference: "Detecting Novel Associations in Large Data Sets" by Reshef et al., Science, Dec 2011
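As a rough illustration of computing MIC for feature selection (not the presenters' code; it assumes the third-party minepy package, and the data are hypothetical):

    # Sketch: MIC between one input feature and one observation feature.
    # Assumes minepy (pip install minepy).
    import numpy as np
    from minepy import MINE

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 500)
    y = np.sin(x) + rng.normal(0, 0.1, 500)   # non-linear relationship

    mine = MINE(alpha=0.6, c=15)   # standard MIC parameters
    mine.compute_score(x, y)
    print(mine.mic())              # near 1 for a strong association

MIC is attractive here because, unlike Pearson correlation, it scores non-linear relationships such as this sine highly.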



How does this equation scale?
 When no features are correlated, the overall error rises to √k for k features (identical to the RMS norm)
 The error grows with the number of features contributing to extrapolation error

Too many factors influence performance
 Many configuration parameters, the execution environment, input size, input characteristics
 Not possible to test all the combinations
 When performance degrades, it is hard to tell whether it was expected or whether something went wrong