
Unveiling the Secret Sauce of Robust Performance Anomaly Detection & Forecasting Solutions: Statistical Modelling & Machine Learning Techniques. Ramya Ramalinga Moorthy, Performance Architect, Founder & Technical Head, EliteSouls Consulting Services

The Secret Sauce (Abstract): Digital product performance management is no longer a luxury; its business impact is now widely recognised, driving active research into performance anomaly detection and performance forecasting models. Choosing the right statistical modelling and machine learning techniques for these models is the secret sauce.

Statistical Modelling (SM) & Machine Learning (ML) Techniques illustrated in this paper. Performance Anomaly Detection: 1) Pearson & Spearman correlation analysis (SM) 2) Twitter's Anomaly Detection algorithm (SM) 3) K-means clustering algorithm (ML) 4) k-NN (k-Nearest Neighbours) algorithm (ML). Performance Forecasting Models: 1) Moving Average model (SM) 2) Simple Exponential Smoothing model (SM) 3) Holt's Exponential Smoothing model (SM) 4) Holt-Winters' Exponential Smoothing model (SM) 5) ARIMA model (SM)

Ecommerce Case Study Overview: A women's apparel retailer wants to expand its online business portfolio and needs decision support on hardware upgrade requirements for the next 1 year. The retailer demanded a Proof of Concept (POC) to assess how statistical models perform on its historical data; this paper is based on our experience from that POC analysis. User traffic statistics are available for the last 3 years (Jan 2013 till Dec 2015); server monitoring statistics are available for the last 6 years (Jan 2010 till Dec 2015).

Performance Anomaly Detection Illustrations. Our two-step solution: 1) Visualize the performance counters over time and report deviations (example: % CPU usage > 80%). 2) Perform correlation analysis to support root cause analysis (explained on the next slide). Data input: a 12-hour sample at 10-minute intervals is used for the anomaly detection analysis.

Anomaly Detection using Correlation Analysis. Pearson and Spearman correlation coefficient analysis (p-value < 0.05) confirms that the correlation between % CPU utilization and % memory utilization is real. Scatter plot of all monitored counters; circled correlation matrix heat-map view.
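The two coefficients can be computed without any statistics package: Spearman's coefficient is simply Pearson's coefficient applied to ranks. A minimal Python sketch with made-up counter samples (the actual POC data is not reproduced here):

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks (no ties)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))

# Illustrative (made-up) counter samples: CPU% and Memory% move together.
cpu = [41, 55, 62, 70, 78, 83, 90]
mem = [35, 48, 57, 66, 71, 79, 88]
print(round(pearson(cpu, mem), 3))   # close to 1.0 for these samples
print(round(spearman(cpu, mem), 3))  # exactly 1.0: the ranks agree perfectly
```

In practice a library routine (e.g. `scipy.stats.pearsonr` / `spearmanr`) would also return the p-values quoted on the slide.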

Anomaly Detection using Twitter's Algorithm. This algorithm is based on Seasonal Hybrid ESD (Extreme Studentized Deviate). Global anomalies extend above or below the expected seasonality and are not subject to the seasonal pattern and underlying trend. Local anomalies occur inside seasonal patterns, where they are masked and harder to detect. Data input: 3 years (Jan 2013 till Dec 2015) of daily page-view data. Anomaly detection results and % deviation calculation: 82% of the detected anomalies (Q3 & Q4 2014) correspond to production-environment usage for performance tests.
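The real algorithm combines an STL-style seasonal decomposition with a robust ESD test; the following is only a simplified stand-in that keeps its two key ideas, seasonal removal and median/MAD-based (robust) scoring. All numbers are illustrative, not the POC data:

```python
import numpy as np

def simple_seasonal_anomalies(series, period, threshold=3.0):
    """Simplified stand-in for Seasonal Hybrid ESD: subtract the per-position
    seasonal median, then flag residuals whose robust z-score (median/MAD
    based, as in the 'hybrid' variant) exceeds the threshold."""
    y = np.asarray(series, float)
    seasonal = np.array([np.median(y[i::period]) for i in range(period)])
    resid = y - np.tile(seasonal, len(y) // period + 1)[:len(y)]
    med = np.median(resid)
    mad = np.median(np.abs(resid - med)) or 1e-9   # avoid division by zero
    z = 0.6745 * (resid - med) / mad               # robust z-score
    return np.where(np.abs(z) > threshold)[0]

# Illustrative daily page views with weekly seasonality and one injected spike.
base = [100, 102, 98, 101, 99, 140, 138] * 8      # 8 weeks, weekend peaks
views = np.array(base, float)
views[24] += 60                                   # anomaly: load-test traffic
print(simple_seasonal_anomalies(views, period=7))  # → [24]
```

Twitter's published R package (AnomalyDetection) additionally tests candidates iteratively with Grubbs-type critical values, which this sketch omits.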

Anomaly Detection using the K-Means Machine Learning Algorithm. K-means is an unsupervised learning algorithm that clusters data based on similarity, computed as the Euclidean distance between data points. The recommended number of clusters is validated using the elbow method. Data input: 35 observations of CPU & disk utilization captured during peak-hour traffic. k=3 clustering and k=4 clustering: a clustered view of the input data allows quick anomaly detection.
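The clustering loop described above (assign to nearest centroid, recompute centroids) can be sketched in a few lines of numpy; the within-cluster sum of squared errors it returns is the quantity plotted against k in the elbow method. The (CPU%, Disk%) observations below are made up for illustration:

```python
import numpy as np

def kmeans(points, k, iters=100):
    """Minimal Lloyd's k-means: assign each point to the nearest centroid by
    Euclidean distance, recompute centroids, repeat until assignments settle."""
    pts = np.asarray(points, float)
    # Simple deterministic init: k points spread across the data order.
    centroids = pts[np.linspace(0, len(pts) - 1, k).astype(int)]
    for _ in range(iters):
        dists = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([pts[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    sse = float(((pts - centroids[labels]) ** 2).sum())  # elbow-method metric
    return labels, centroids, sse

# Illustrative (CPU%, Disk%) observations: a low, a mid, and a high group.
obs = [(20, 15), (22, 18), (25, 20), (50, 45), (55, 48), (52, 50),
      (85, 90), (88, 92), (90, 95)]
labels, _, sse3 = kmeans(obs, k=3)
print(labels)  # → [0 0 0 1 1 1 2 2 2]: high-demand points form their own cluster
```

For the elbow method, run this for k = 1..6 and look for the k after which the SSE stops dropping sharply.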

Anomaly Detection using the k-NN Machine Learning Algorithm. k-Nearest Neighbours is a supervised learning algorithm that stores all available cases and classifies new cases by a majority vote of their k nearest neighbours. Model efficacy depends on the choice of k; we used k = (#observations)^(1/2). Data input: 30 labelled observations of CPU & disk utilization captured during peak traffic hours, 70% used for training and 30% for testing. Prediction accuracy verification (actual versus predicted): classification accuracy in our solution was 85% (1 out of 7 predictions was wrong).
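The majority-vote rule, including the k = sqrt(n) heuristic mentioned above, fits in a few lines of standard-library Python. The labelled observations are invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest training points
    (Euclidean distance), as in the k-NN detector described above."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Illustrative labelled (CPU%, Disk%) observations from peak hours.
train = [(20, 18), (25, 22), (30, 28), (35, 30), (80, 85), (85, 90),
         (88, 92), (90, 95), (92, 96)]
labels = ["normal"] * 4 + ["anomaly"] * 5
k = round(math.sqrt(len(train)))     # heuristic from the talk: k = sqrt(n)
print(knn_predict(train, labels, (87, 91), k))  # → anomaly
```

Because the model is just the stored training set, new observations can be classified as they arrive, which is why the appendix notes k-NN suits online anomaly detection.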

Performance Forecasting Model Illustrations. For simplicity during the POC phase, a 95th-percentile aggregation was used to derive one data point per month over 6 years (12 × 6 = 72 points). Choose the right technique based on time-series component analysis using decomposition. Data input: application server % CPU utilization during the last 6 years (Jan 2010 till Dec 2015).
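The 95th-percentile aggregation step can be sketched as follows, using synthetic 10-minute CPU samples (the real monitoring data is not reproduced here):

```python
import numpy as np

# Hypothetical raw CPU% samples for one month of 10-minute monitoring;
# collapse them to a single monthly data point with the 95th percentile,
# as done for the 72-point POC series.
rng = np.random.default_rng(42)
raw_samples = rng.normal(loc=55, scale=10, size=30 * 24 * 6)
monthly_point = float(np.percentile(raw_samples, 95))
print(round(monthly_point, 1))  # roughly 55 + 1.645 * 10 ≈ 71.5 for this data
```

Using a high percentile rather than the mean keeps the capacity-planning forecast anchored to near-peak demand instead of average load.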

Forecasting Model using the Moving Average Model. The average of the most recent k data values is used to forecast the next period; our solution uses k = 3 for forecasting the next value in the time-series data. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. The forecast accuracy measures, low RMSE = 1.41 and MAPE = 2.25, indicate a good fit. Actual data & forecasts; residuals plot.
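The forecast rule is a one-liner; a sketch with illustrative monthly values:

```python
import numpy as np

def moving_average_forecast(history, k=3):
    """One-step-ahead forecast: the mean of the most recent k observations."""
    return float(np.mean(history[-k:]))

# Illustrative monthly 95th-percentile CPU% values (not the POC data).
cpu = [52.0, 54.0, 53.0, 55.0, 57.0, 56.0]
print(moving_average_forecast(cpu, k=3))  # → 56.0  (mean of 55, 57, 56)
```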

Forecasting Model using Simple Exponential Smoothing. This model uses a smoothing constant to determine how much weight to assign to the actual values when forecasting. Simple ES is applicable for time-series data with only a level: no trend or seasonality. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. The forecast accuracy measures, RMSE = 2.93 and MAPE = 4.94, indicate this is not the best fit. Actual vs filtered data series; actual data & forecasts; residuals plot.
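The smoothing recursion is s_t = α·y_t + (1 − α)·s_{t−1}, and the final smoothed level is the flat forecast for all future periods. A sketch with an illustrative series and an assumed α = 0.5:

```python
def simple_exp_smoothing(series, alpha):
    """s_t = alpha * y_t + (1 - alpha) * s_{t-1}; the final smoothed level
    is the forecast for every future period (no trend, no seasonality)."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Illustrative series; alpha (the smoothing constant) weights recent values.
series = [50, 52, 51, 53, 55, 54]
print(round(simple_exp_smoothing(series, alpha=0.5), 3))  # → 53.75
```

Larger α reacts faster to new observations; smaller α smooths more aggressively. Library fitters (e.g. statsmodels) choose α by minimising the in-sample error.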

Forecasting Model using Holt's Exponential Smoothing. This is applicable for time-series data that have a level and a trend but no seasonality. The actual data series versus the transformed series is shown below. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. The forecast accuracy measures, RMSE = 3.05 and MAPE = 5.4, indicate this is not the best fit. Actual vs filtered data series; actual data & forecasts; residuals plot.
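Holt's method extends simple ES with a second recursion for the trend, so the h-step forecast is level + h·trend. A sketch with an illustrative trending series and assumed smoothing constants:

```python
def holt_forecast(series, alpha, beta, h):
    """Holt's linear method: smooth a level and a trend, then extrapolate
    h steps ahead as forecast = level + h * trend."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + h * trend

# Illustrative series with a steady upward trend of 2 per period.
series = [10, 12, 14, 16, 18, 20]
print(round(holt_forecast(series, alpha=0.8, beta=0.2, h=1), 3))  # → 22.0
```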

Forecasting Model using Holt-Winters' Exponential Smoothing. This is applicable for time-series data that have a level, a trend, and seasonality. The actual versus transformed data series is shown below. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. The forecast accuracy measures, low RMSE = 0.08 and MAPE = 1.58, indicate a good fit. Actual vs filtered data series; actual data & forecasts; residuals plot.
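Holt-Winters adds a third recursion for a repeating seasonal component. A minimal sketch of the additive variant (the multiplicative variant scales, rather than shifts, by the seasonal index), with an invented quarterly series and assumed constants:

```python
def holt_winters_forecast(series, period, alpha, beta, gamma, h):
    """Additive Holt-Winters: smooth level, trend, and a repeating seasonal
    component; forecast = level + h * trend + matching seasonal index."""
    # Initialise from the first season: level = its mean, trend from the
    # season-over-season change, seasonals = deviations from the level.
    level = sum(series[:period]) / period
    trend = (sum(series[period:2 * period]) - sum(series[:period])) / period ** 2
    seasonal = [series[i] - level for i in range(period)]
    for t in range(period, len(series)):
        y, s = series[t], seasonal[t % period]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s
    return level + h * trend + seasonal[(len(series) + h - 1) % period]

# Two repeats of a quarterly pattern around a flat level of 60.
series = [50, 70, 55, 65, 50, 70, 55, 65]
fc = holt_winters_forecast(series, period=4, alpha=0.3, beta=0.1, gamma=0.2, h=1)
print(round(fc, 1))  # close to 50: the next point repeats the seasonal low
```

For the monthly POC series the period would be 12, matching the yearly traffic cycle.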

Forecasting Model using the ARIMA Model. This model can forecast both stationary and non-stationary time-series data. An ARIMA(p, d, q) model has three parameters: the autoregressive order (p), the number of differencing passes (d), and the moving-average order (q). The modelling proceeds in 3 steps: 1) transform to a stationary time series, 2) identify the model parameters (p & q), 3) find the best-fit model and perform forecasts. The forecast accuracy measures, RMSE = 2.98 and MAPE = 5.31, indicate this is not the best fit. ACF & PACF plots (p = 1, q = 0); actual data & forecasts.
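A full ARIMA fit would normally come from a library, but the specific order found here, ARIMA(1,1,0), is simple enough to sketch from scratch: difference once (d = 1), fit an AR(1) to the differences by least squares (p = 1, q = 0, assuming mean-zero differences with no drift term), and integrate the forecasts back. The series is invented for illustration:

```python
import numpy as np

def arima_110_forecast(series, h):
    """ARIMA(1,1,0) sketch: difference once (d=1), fit an AR(1) on the
    differences by least squares (p=1, q=0), then integrate forecasts back."""
    y = np.asarray(series, float)
    dy = np.diff(y)                         # step 1: make the series stationary
    x, target = dy[:-1], dy[1:]
    phi = float(x @ target / (x @ x))       # step 2: least-squares AR(1) coef
    forecasts, last, d = [], y[-1], dy[-1]
    for _ in range(h):                      # step 3: forecast and re-integrate
        d = phi * d                         # next expected difference
        last = last + d                     # cumulate back to the level
        forecasts.append(last)
    return forecasts

series = [60, 62, 63, 63.5, 63.8, 63.9]    # differences shrink geometrically
print([round(f, 2) for f in arima_110_forecast(series, h=3)])
```

Real tooling (e.g. statsmodels' ARIMA) would also estimate intercept and MA terms and select the order via the ACF/PACF plots mentioned above.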

Forecasting Model Accuracy Analysis. The Holt-Winters (HW) Exponential Smoothing model, which has the lowest RMSE & MAPE values, has the highest accuracy of the models compared.
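The two accuracy measures used throughout the comparison are straightforward to compute; a sketch with illustrative actual/forecast pairs:

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast))
                     / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a)
                     for a, f in zip(actual, forecast)) / len(actual)

actual   = [50.0, 52.0, 55.0, 53.0]
forecast = [51.0, 52.0, 54.0, 55.0]
print(round(rmse(actual, forecast), 3), round(mape(actual, forecast), 3))
# → 1.225 1.898
```

RMSE penalises large misses more heavily; MAPE is scale-free, which makes it convenient for comparing models across series, though it is undefined when an actual value is zero.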

Forecasting Model Results Comparison. Based on forecast accuracy, the Holt-Winters Exponential Smoothing model appears to be the best model for the retailer's application time-series data. The forecasting models confirmed that no hardware (CPU capacity) upgrade is required for 2016. Upon comparison with actual data (Jan till June 2016), HW ES has 86% accuracy, ARIMA has 82% accuracy, and MA has 76% accuracy.

Conclusion. Statistical modelling and machine learning techniques play a vital role in data science & data analytics across application domains. Choosing the right technique (the one that yields the highest accuracy) depends purely on the type of data and its characteristics; hence, the best technique can vary across time-series data from different domains. SM & ML techniques might not yield the best results by default: most of the time their parameters must be fine-tuned to reach the required accuracy levels. The rising demand for performance anomaly detection & forecasting solutions that assure stringent performance SLAs can be met only by choosing the right combination of SM & ML techniques that best suits the input data.

References 1) "Workload-aware anomaly detection for web applications" by Tao Wang, Jun Wei, Wenbo Zhang, Hua Zhong, and Tao Huang. 2) "Detection of performance anomalies in web-based applications" by Joao Paulo Magalhaes and Luis Moura Silva. 3) "Root-cause analysis of performance anomalies in web-based applications" by Joao Paulo Magalhaes and Luis Moura Silva. 4) "Performance Anomaly Detection and Bottleneck Identification" by Olumuyiwa Ibidunmoye & Erik Elmroth. 5) R blogs: https://www.r-bloggers.com/ 6) Other blogs: http://ucanalytics.com/blogs/ & http://analyticsvidhya.com

Q & A. Ramya Ramalinga Moorthy, Performance Architect, has 13+ years of experience in performance testing, performance engineering & capacity planning. She has a great passion for learning & experimentation. In her current role, she provides technology consulting & training through her start-up, EliteSouls Consulting Services, which specializes in niche NFR test services.

Appendix. To be referenced when clarifying questions from the audience.

Need for Performance Anomaly Detection. What is a performance anomaly? Performance anomaly types: resource saturation & contention anomalies; user-load & application-workload anomalies. Data for detecting performance anomalies falls into two categories: application metrics such as response time, user load, and throughput; and server metrics such as CPU / memory / disk / network utilization & IOPS. "Amazon faces a 1% decrease in sales for every additional 100 ms of response-time delay." "Google reported a 20% drop in traffic due to a 500 ms delay in response time."

Need for a Performance Forecasting Model. What is forecasting? Forecasting solution categories: analytical modelling (queuing-theory principles) and statistics & machine learning. Time-series models examine data spaced at uniform intervals to extract meaningful information and forecast future behaviour. Why a forecasting model is a necessity: predicting application usage trends; predicting anomalies; predicting performance metrics (response time, throughput, etc.); predicting future hardware demands / capacity planning.

Correlation Analysis Result Snapshots. The Pearson correlation coefficient reported p-value = 2.2e-16 (< 0.05); Spearman's correlation coefficient reported p-value = 5.2e-15 (< 0.05). Pearson correlation coefficient matrix: a matrix view of the metrics with their correlation coefficients.

K-Means Clustering Analysis Result Snapshots. The elbow method analysis recommends 3 or 4 clusters. Range of CPU & disk values in the k=4 clustering; input data template column view. The 4-cluster grouping (Low, Average, High, Very High levels) was preferred by our clients for easy interpretation of anomalies, as data points with high CPU & disk demand fall into the last, high-value cluster.

k-NN Analysis Result Snapshots. Input data usage: 70% training & 30% testing. Test data view; input data (CPU & disk values) summary; input data template column view. The trained model is ready to classify any new input data into groups, so this algorithm can easily be used for online anomaly detection.

Moving Average Analysis Result Snapshots. The Ljung-Box test results are shown below, along with the forecast accuracy analysis using various measures.

Simple ES Analysis Result Snapshots. The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: there is little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. A histogram plot is used to check whether the forecast errors are normally distributed with mean zero and constant variance; here it shows a large deviation from the normal curve. Forecast accuracy analysis using various measures is shown below.
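The Ljung-Box statistic behind these checks can be computed directly from the residual autocorrelations; a sketch with synthetic residuals (18.307, the chi-square 95% critical value at 10 degrees of freedom, is hard-coded to avoid a stats dependency):

```python
import numpy as np

def ljung_box_q(residuals, lags):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k), where
    r_k is the lag-k sample autocorrelation of the residuals. Small Q
    (large p-value) means little evidence of leftover autocorrelation,
    i.e. the model has captured the series' structure."""
    e = np.asarray(residuals, float) - np.mean(residuals)
    n, denom = len(e), float(e @ e)
    q = 0.0
    for k in range(1, lags + 1):
        r_k = float(e[:-k] @ e[k:]) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

# Strongly autocorrelated residuals (a leftover cycle) give a large Q,
# flagging lack of fit; white-noise residuals would give a small Q.
resid = np.sin(np.arange(120) / 4.0)
q = ljung_box_q(resid, lags=10)
print(q > 18.307)  # → True: Q far exceeds the 95% chi-square cutoff at 10 df
```

Library versions (e.g. statsmodels' acorr_ljungbox) also return the p-values quoted on these slides.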

Holt's ES Analysis Result Snapshots. The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: there is little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. The histogram of forecast errors again shows a large deviation from the normal curve. Forecast accuracy analysis using various measures is shown below.

Holt-Winters' ES Analysis Result Snapshots. The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: there is little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. The histogram of forecast errors shows a proper fit to the normal curve. Forecast accuracy analysis using various measures is shown below.

ARIMA Model Analysis Result Snapshots. Transforming the non-stationary data to stationary data confirms the need for one level of differencing (d = 1). Non-stationary data series; stationary data series. Based on the autocorrelation & partial autocorrelation analysis, p = 1 and q = 0. The best ARIMA model fit calculation results are shown below. ACF & PACF plots; best ARIMA model fitting.

ARIMA Model Analysis Result Snapshots (cont.). The Ljung-Box test (which checks a time-series model for lack of fit) resulted in p-value = 0.46, well above 0.05, indicating non-significance: there is little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. In the Q-Q plot, most values lie close to the reference line, indicating approximately normal forecast errors. Forecast accuracy analysis using various measures is shown below.

Thank You