1
Unveiling the Secret Sauce of Robust Performance Anomaly Detection & Forecasting Solutions - Statistical Modelling & Machine Learning Techniques
Ramya Ramalinga Moorthy, Performance Architect, Founder & Technical Head, EliteSouls Consulting Services
2
Abstract
Digital product performance management is no longer a luxury; the business impact of poor performance is now widely recognized. This has driven trending research into performance anomaly detection and performance forecasting models. Choosing the right statistical modelling & machine learning techniques is the secret sauce.
3
Statistical Modeling & Machine Learning Techniques
illustrated in this paper:
Performance Anomaly Detection:
1) Pearson & Spearman Correlation Analysis (SM)
2) Twitter's Anomaly Detection Algorithm (SM)
3) K-Means Clustering Algorithm (ML)
4) K-NN (k-Nearest Neighbours) Algorithm (ML)
Performance Forecasting Models:
1) Moving Average Model (SM)
2) Simple Exponential Smoothing Model (SM)
3) Holt's Exponential Smoothing Model (SM)
4) Holt-Winters Exponential Smoothing Model (SM)
5) ARIMA Model (SM)
4
Ecommerce Case Study Overview
A women's apparel retailer interested in expanding its online business portfolio. Needs decision-making support on hardware upgrade requirements for the next 1 year. Demanded a Proof of Concept (POC) to assess how statistical models perform on their historical data. This paper is based on our experience gained during this POC analysis. User traffic statistics available for the last 3 years (Jan 2013 till Dec 2015). Server monitoring statistics available for the last 6 years (Jan 2010 till Dec 2015).
5
Performance Anomaly Detection illustrations
Our 2-step solution: 1) Visualize the performance counters over time and report deviations (example: % CPU usage > 80%). 2) Perform correlation analysis to support root-cause analysis (explained in the next slide). Data input: a 12-hour sample (10-minute intervals) is used for the anomaly detection analysis.
6
Anomaly Detection using Correlation Analysis
Pearson and Spearman correlation coefficient analysis (p-value < 0.05) confirms that the correlation between % CPU utilization and % memory utilization is real. [Figures: scatter plot of all monitored counters; correlation matrix heat-map view with the significant pair circled]
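A minimal pure-Python sketch of the two correlation measures named on this slide. The CPU/memory numbers below are assumed illustrative values, not the case-study data, and the p-value computation is omitted to keep the sketch dependency-free:

```python
def pearson(x, y):
    # Pearson's r: covariance normalized by both standard deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(v):
    # 1-based ranks, averaging ties.
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    # Spearman's rho is Pearson's r computed on the ranked data,
    # so it also catches monotone but non-linear relationships.
    return pearson(ranks(x), ranks(y))

cpu = [40, 45, 50, 62, 70, 78, 85]   # assumed % CPU samples
mem = [35, 41, 48, 55, 66, 72, 80]   # assumed % memory samples
print(round(pearson(cpu, mem), 3))   # strongly positive, close to 1
print(round(spearman(cpu, mem), 3))  # → 1.0 (both series strictly increasing)
```

Spearman is the more robust choice when the relationship is monotone but not linear, which is why the slide reports both.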
7
Anomaly Detection using Twitter’s Algorithm
This algorithm is based on Seasonal Hybrid ESD (Extreme Studentized Deviate). Global anomalies extend above or below the expected seasonality and are not subject to the seasonal pattern or underlying trend. Local anomalies occur inside seasonal patterns and are masked by them, making detection difficult. Data input: 3 years (Jan 2013 till Dec 2015) of daily page-view data used for anomaly detection. [Figures: anomaly detection results; % deviation calculation] 82% of the detected anomalies (Q3 & Q4 2014) correspond to production-environment usage for performance tests.
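Twitter's actual algorithm combines an STL-style seasonal decomposition with the generalized ESD test. The sketch below is a deliberately simplified stand-in, not S-H-ESD itself: a per-slot median strips the seasonal component, and a robust MAD-based threshold replaces the ESD test. The page-view series and the injected spike are assumptions for illustration:

```python
import statistics

def seasonal_anomalies(series, period, threshold=3.5):
    # Estimate the seasonal component as the median of each slot
    # (e.g., each weekday across all weeks), then flag residuals
    # that are extreme in robust z-score (MAD) units.
    seasonal = [statistics.median(series[i::period]) for i in range(period)]
    residuals = [v - seasonal[i % period] for i, v in enumerate(series)]
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad if mad else 1.0   # 1.4826 makes MAD comparable to sigma
    return [i for i, r in enumerate(residuals)
            if abs(r - center) / scale > threshold]

# Four weeks of synthetic daily page views with one injected spike at day 10.
base = [100, 120, 110, 130, 125, 90, 80]
series = [b + ((rep * 3 + i) % 5) - 2        # small deterministic jitter
          for rep in range(4) for i, b in enumerate(base)]
series[10] += 80
print(seasonal_anomalies(series, period=7))  # → [10]
```

Because the seasonal pattern is removed first, this catches the "local" anomalies the slide describes, which a plain global threshold would miss.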
8
Anomaly Detection using K-Means Machine Learning Algorithm
This is an unsupervised learning algorithm that clusters data based on similarity; clustering is done by calculating the Euclidean distance between data points. The recommended number of clusters is validated using the Elbow method. Data input: 35 observations of CPU & disk utilization captured during peak-hour traffic. [Figures: k=3 clustering; k=4 clustering] The clustered view of the input data enables quick anomaly detection.
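A minimal sketch of the clustering step (Lloyd's k-means algorithm) in pure Python. To keep it reproducible without random initialization, it uses k = 2 with the extreme points as seeds; the (CPU%, Disk%) observations are assumed illustrative data:

```python
def dist2(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, iters=10):
    # Lloyd's algorithm with k=2 and deterministic seeds (min/max points).
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for p in points:                       # assignment step
            nearest = min(range(2), key=lambda c: dist2(p, centers[c]))
            groups[nearest].append(p)
        centers = [tuple(sum(v) / len(g) for v in zip(*g))  # update step
                   for g in groups]
    return centers, groups

# Assumed (CPU%, Disk%) observations; the three high-demand points
# should land in their own cluster, flagging them as anomalous.
points = [(20, 15), (25, 18), (22, 20), (30, 25), (28, 22),
          (85, 90), (90, 88), (88, 95)]
centers, groups = kmeans(points)
print(groups[1])   # the high-utilization cluster: the 3 anomalous points
```

In practice (as the slide notes) the cluster count comes from the Elbow method, and a production implementation would use randomized restarts rather than fixed seeds.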
9
Anomaly Detection using K-NN Machine Learning Algorithm
K-Nearest Neighbours is a supervised learning algorithm that stores all available cases and classifies new cases by a majority vote of their k nearest neighbours. Model efficacy depends on the choice of k; we used k = (#observations)^(1/2). Data input: 30 labeled observations of CPU & disk utilization captured during peak traffic hours. Input data usage: 70% for training, 30% for testing. Prediction accuracy verification (actual versus predicted): classification accuracy in our solution was ~85% (1 out of 7 test predictions was wrong).
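A minimal k-NN classifier sketch in pure Python, using the slide's k = sqrt(#observations) heuristic. The labeled (CPU%, Disk%) points and the "normal"/"anomaly" labels are assumed illustrative data:

```python
from collections import Counter

def knn_predict(train, query, k):
    # Sort labeled cases by Euclidean distance to the query,
    # then take a majority vote among the k nearest.
    neighbours = sorted(
        train,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case[0], query))
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Assumed labeled (CPU%, Disk%) observations.
train = [((25, 20), "normal"), ((30, 22), "normal"), ((28, 25), "normal"),
         ((22, 18), "normal"), ((35, 30), "normal"), ((27, 24), "normal"),
         ((85, 90), "anomaly"), ((90, 85), "anomaly"), ((88, 92), "anomaly")]

k = round(len(train) ** 0.5)               # slide's heuristic: sqrt(9) = 3
print(knn_predict(train, (87, 88), k))     # → anomaly
print(knn_predict(train, (26, 21), k))     # → normal
```

Because the model is just the stored training set, new observations can be classified as they arrive, which is why the slide's appendix recommends k-NN for online anomaly detection.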
10
Performance Forecasting Model Illustrations
For simplicity (during the POC phase), monthly data points were derived by 95th-percentile aggregation, giving 72 points for 6 years (12 × 6 = 72). Choose the right technique based on time-series component analysis using decomposition. Data input: application server % CPU utilization over the last 6 years (Jan 2010 till Dec 2015).
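The aggregation step can be sketched as follows. The percentile uses linear interpolation between order statistics (one common convention; the original work may have used another), and the sample values are assumed for illustration:

```python
def percentile(values, p):
    # Linear-interpolation percentile between the two nearest order statistics.
    s = sorted(values)
    rank = (len(s) - 1) * p / 100
    lo = int(rank)
    frac = rank - lo
    return s[lo] + (s[min(lo + 1, len(s) - 1)] - s[lo]) * frac

# One aggregated point per month: the 95th percentile of that month's
# raw CPU% samples (tiny assumed samples shown here).
monthly_samples = {
    "2010-01": [40, 42, 45, 55, 70, 48],
    "2010-02": [38, 41, 44, 60, 52, 47],
}
aggregated = {month: percentile(v, 95) for month, v in monthly_samples.items()}
print(aggregated)
```

The 95th percentile is a common capacity-planning choice because it tracks sustained peak demand while ignoring the rarest spikes that a plain maximum would over-weight.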
11
Forecasting Model using Moving Average Model
The average of the most recent k data values is used to forecast the next period. Our solution uses k = 3 for forecasting the next value in the time series. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. Forecast accuracy measures (low RMSE; MAPE = 2.25) confirm a good fit. [Figures: actual data & forecasts; residuals plot]
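A minimal sketch of the moving-average forecast with k = 3, plus the two accuracy measures the slides use (RMSE and MAPE). The monthly CPU% values are assumed illustrative data:

```python
def moving_average_forecast(series, k=3):
    # Forecast for each period t is the mean of the k preceding observations;
    # the final element forecasts the first unseen period.
    return [sum(series[t - k:t]) / k for t in range(k, len(series) + 1)]

def rmse(actual, forecast):
    # Root mean squared error.
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)) ** 0.5

def mape(actual, forecast):
    # Mean absolute percentage error, in percent.
    return 100 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

cpu = [52, 55, 53, 58, 60, 57, 62, 64]   # assumed monthly 95th-percentile CPU%
fc = moving_average_forecast(cpu, k=3)
print(fc[-1])                            # next-month forecast: mean of last 3 → 61.0
# Accuracy over the in-sample forecasts (fc[:-1] aligns with cpu[3:]).
print(round(rmse(cpu[3:], fc[:-1]), 2), round(mape(cpu[3:], fc[:-1]), 2))
```

Note the built-in lag: a k-period moving average always trails a trending series, which is why the later slides move on to exponential smoothing families.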
12
Forecasting Model using Simple Exponential Smoothing
This model uses a "smoothing constant" to determine how much weight to assign to the actual values when forecasting. Simple ES is applicable to time-series data with only a level, i.e. no trend or seasonality. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. Forecast accuracy measures (RMSE; MAPE = 4.94) confirm this is not the best fit. [Figures: actual vs filtered data series; actual data & forecasts; residuals plot]
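The recursion behind simple exponential smoothing can be sketched in a few lines. The smoothing constant alpha and the sample series are assumed for illustration; alpha near 1 weights recent observations heavily, alpha near 0 smooths aggressively:

```python
def simple_exp_smoothing(series, alpha):
    # level_t = alpha * actual_t + (1 - alpha) * level_{t-1};
    # the one-step-ahead forecast is simply the latest level
    # (flat forecast: no trend or seasonal terms).
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

data = [50, 52, 51, 53, 55, 54]            # assumed level-only series
print(simple_exp_smoothing(data, alpha=0.5))  # → 53.75
```

Because the forecast is flat, simple ES systematically under-forecasts a trending series like the CPU data here, which matches the slide's "not the best fit" verdict.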
13
Forecasting Model using Holt’s Exponential Smoothing
This is applicable to time-series data that have level and trend but no seasonality. The actual data series versus the transformed (filtered) series is shown below. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. Forecast accuracy measures (RMSE; MAPE = 5.4) confirm this is not the best fit. [Figures: actual vs filtered data series; actual data & forecasts; residuals plot]
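Holt's method extends simple ES with a second smoothed component for the trend. A minimal sketch, with assumed smoothing constants and a perfectly linear assumed series (which the method should track exactly):

```python
def holt_forecast(series, alpha, beta, horizon):
    # Holt's linear method: one smoothing equation for the level,
    # one for the trend; h-step forecast = level + h * trend.
    level, trend = series[1], series[1] - series[0]
    for x in series[2:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

data = [50, 52, 54, 56, 58, 60]   # assumed linear series, +2 per period
print(holt_forecast(data, alpha=0.5, beta=0.5, horizon=3))  # → [62.0, 64.0, 66.0]
```

On trending data this removes the lag of the flat simple-ES forecast, but it still cannot express a repeating seasonal cycle, which is the gap Holt-Winters closes on the next slide.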
14
Forecasting Model using Holt-Winter’s Exponential Smoothing
This is applicable to time-series data that have level, trend and seasonality. Actual versus transformed data series shown below. The forecasted trend for the next 1 year (2016) is provided below along with the residuals. Forecast accuracy measures (low RMSE; MAPE = 1.58) confirm a good fit. [Figures: actual vs filtered data series; actual data & forecasts; residuals plot]
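A sketch of the additive Holt-Winters recursions. The initialization here is a crude assumption of this sketch (textbook implementations initialize the seasonal indices against a centered moving average), and the smoothing constants and synthetic series are likewise assumed:

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    # Additive Holt-Winters: smooths level, trend, and one seasonal
    # offset per slot in the cycle.
    level = sum(series[:period]) / period               # crude init: 1st-season mean
    trend = (sum(series[period:2 * period])
             - sum(series[:period])) / period ** 2       # crude init: season-over-season
    season = [series[i] - level for i in range(period)]
    for t in range(period, len(series)):
        x, s = series[t], t % period
        prev_level = level
        level = alpha * (x - season[s]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[s] = gamma * (x - level) + (1 - gamma) * season[s]
    return [level + h * trend + season[(len(series) + h - 1) % period]
            for h in range(1, horizon + 1)]

# Synthetic series: level 100, trend +1 per step, seasonal cycle [10, -5, 0, -5].
cycle = [10, -5, 0, -5]
series = [100 + t + cycle[t % 4] for t in range(20)]
fc = holt_winters_additive(series, period=4, alpha=0.3, beta=0.1, gamma=0.3, horizon=4)
print(fc)   # close to the true continuation [130, 116, 122, 118]
```

Because the forecast adds a per-slot seasonal offset on top of the trend line, this family can follow yearly CPU usage cycles, which is why it wins the accuracy comparison on the later slides.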
15
Forecasting Model using ARIMA Model
This model can forecast both stationary and non-stationary time-series data. ARIMA(p, d, q) has three parameters: the autoregressive order (p), the number of differencing passes (d), and the moving-average order (q). The model is applied in 3 steps: 1) transform to a stationary time series; 2) identify the model parameters (p & q); 3) find the best-fit model and perform forecasts. Forecast accuracy measures (RMSE = 2.98; MAPE = 5.31) confirm this is not the best fit. [Figures: ACF & PACF plots (p = 1, q = 0); actual data & forecasts]
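The first two ARIMA steps (differencing to stationarity, then inspecting autocorrelations to choose the orders) can be sketched in pure Python; the fitting step itself needs an optimizer and is omitted. The sample series are assumed for illustration:

```python
def difference(series, d=1):
    # Step 1: difference the series d times to remove trend
    # and make it stationary.
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def acf(series, max_lag):
    # Step 2 (half of it): the sample autocorrelation function,
    # inspected together with the PACF to pick p and q.
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    return [sum((series[t] - mean) * (series[t + k] - mean)
                for t in range(n - k)) / var
            for k in range(1, max_lag + 1)]

trend_series = [100 + 2 * t for t in range(30)]   # assumed trending series
print(set(difference(trend_series)))              # → {2}: trend gone after d=1
wave = [(2 * t) % 5 for t in range(40)]           # assumed period-5 pattern
print(acf(wave, 5)[4])                            # strong lag-5 spike → 0.875
```

A pronounced ACF spike at a seasonal lag (like the lag-5 spike above) is exactly the kind of signature read off the ACF/PACF plots on this slide to settle on p = 1, q = 0.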
16
Forecasting Model Accuracy Analysis
The Holt-Winters (HW) Exponential Smoothing model, which has the lowest RMSE & MAPE values, shows the highest accuracy of the models compared.
17
Forecasting Model Results Comparison
Based on forecast accuracy, the Holt-Winters Exponential Smoothing model is the best model for the retailer's time-series data. The forecasting models confirmed that no hardware (CPU capacity) upgrade is required for 2016. Upon comparison with actual data (Jan till June 2016), HW ES achieved 86% accuracy, ARIMA 82% and MA 76%.
18
Conclusion
Statistical modelling & machine learning techniques play a vital role in data science & data analytics across application domains. Choosing the right technique (one that yields the highest accuracy) depends purely on the type of data & its characteristics; hence the best technique can vary across time-series data from different domains. SM & ML techniques do not always yield the best results by default; most of the time their parameters must be fine-tuned to reach the required accuracy levels. The rising demand for performance anomaly detection & forecasting solutions to assure stringent performance SLAs can be met only by choosing the combination of SM & ML techniques that best suits the input data.
19
References
1) "Workload-aware anomaly detection for web applications" by Tao Wang, Jun Wei, Wenbo Zhang, Hua Zhong, and Tao Huang.
2) "Detection of performance anomalies in web-based applications" by Joao Paulo Magalhaes and Luis Moura Silva.
3) "Root-cause analysis of performance anomalies in web-based applications" by Joao Paulo Magalhaes and Luis Moura Silva.
4) "Performance Anomaly Detection and Bottleneck Identification" by Olumuyiwa Ibidunmoye & Erik Elmroth.
20
Q & A
Ramya Ramalinga Moorthy, Performance Architect, has 13+ years of experience in performance testing, performance engineering & capacity planning. She has a great passion for learning & experimentation. In her current role, she provides technology consulting & training through her start-up, EliteSouls Consulting Services, which specializes in niche NFR test services.
21
Appendix
To be referenced while clarifying queries from the audience.
22
Need for Performance Anomaly Detection
What is a performance anomaly? Performance anomaly types: resource saturation & contention anomalies; user-load & application-workload anomalies. Data for detecting performance anomalies is categorized as: application metrics such as response time, user load and throughput; server metrics such as CPU / memory / disk / network utilization & IOPS. "Amazon faces a 1% decrease in sales for an additional 100 ms delay in response time." "Google reports a 20% drop in traffic due to a 500 ms delay in response time."
23
Need for Performance Forecasting Model
What is forecasting? Forecasting solution categories: analytical modeling (queuing theory principles); statistics & machine learning. Time-series models examine data spaced at uniform intervals to extract meaningful information for forecasting future behavior. Why a forecasting model is a necessity: predicting application usage trends; predicting anomalies; predicting performance metrics (response time, throughput, etc.); predicting future hardware demands / capacity planning.
24
Correlation Analysis Result Snapshots
Pearson correlation coefficient reported: p-value = 2.2e-16 (< 0.05). Spearman's correlation coefficient reported: p-value = 5.2e-15 (< 0.05). [Figures: Pearson correlation coefficient matrix; matrix view of metrics with correlation coefficients]
25
K-Means Clustering Analysis Result Snapshots
Elbow method analysis recommends 3 or 4 clusters. [Figures: range of CPU & disk values in k=4 clustering; input data template column view] A 4-cluster grouping (Low, Average, High, Very High levels) was preferred by our clients for easy interpretation of anomalies, as data points with high CPU & disk demand fall into the last, high-value cluster.
26
K-NN Analysis Result Snapshots
Input data usage: 70% training & 30% testing. [Figures: test data view; input data (CPU & disk values) summary; input data template column view] The trained model is ready to classify any new input data into groups as per the learning process; this algorithm can easily be used for online anomaly detection requirements.
27
Moving Average Analysis Result Snapshots
Ljung-Box test results are shown below. Forecast accuracy analysis using various measures is shown below.
28
Simple ES Analysis Result Snapshots
The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. A histogram plot is used to check whether the forecast errors are normally distributed with mean zero and constant variance; here it shows a large deviation from the normal curve. Forecast accuracy analysis using various measures is shown below.
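The Ljung-Box statistic these slides rely on can be computed directly. A pure-Python sketch of the Q statistic; the chi-squared p-value lookup is omitted to stay dependency-free, and the alternating "residuals" are an assumed worst case, chosen because their Q is exactly computable:

```python
def ljung_box_q(residuals, max_lag):
    # Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n - k).
    # Under H0 (independent residuals) Q follows a chi-squared
    # distribution with h degrees of freedom; large Q rejects H0.
    n = len(residuals)
    mean = sum(residuals) / n
    var = sum((x - mean) ** 2 for x in residuals)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho = sum((residuals[t] - mean) * (residuals[t + k] - mean)
                  for t in range(n - k)) / var
        q += rho * rho / (n - k)
    return n * (n + 2) * q

# Perfectly alternating residuals are maximally autocorrelated, so Q lands
# far above the 5% chi-squared critical value (~31.4 at 20 df): the
# no-autocorrelation hypothesis would be firmly rejected for such a model.
resid = [(-1) ** t for t in range(60)]
print(round(ljung_box_q(resid, 20), 6))   # → 1023.0
```

A well-fitted forecasting model leaves residuals that behave like white noise, giving a small Q and the "p-value well above 0.05" reported on these slides.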
29
Holt’s ES Analysis Result Snapshots
The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. A histogram plot is used to check whether the forecast errors are normally distributed with mean zero and constant variance; here it shows a large deviation from the normal curve. Forecast accuracy analysis using various measures is shown below.
30
Holt Winter’s ES Analysis Result Snapshots
The Ljung-Box test resulted in a p-value well above 0.05, indicating non-significance: little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. A histogram plot is used to check whether the forecast errors are normally distributed with mean zero and constant variance; here it shows a proper fit to the normal curve. Forecast accuracy analysis using various measures is shown below.
31
ARIMA Model Analysis Result Snapshots
Transforming the non-stationary data to stationary confirms the need for 1 level of differencing (d = 1). [Figures: non-stationary data series; stationary data series] Based on autocorrelation & partial autocorrelation analysis, p = 1 & q = 0. The best ARIMA model fit calculation results are shown below. [Figures: ACF & PACF plots; best ARIMA model fitting]
32
ARIMA Model Analysis Result Snapshots …cont
The Ljung-Box test analysis (checking lack of fit of a time-series model) resulted in a p-value of 0.46, well above 0.05, indicating non-significance: little evidence of non-zero autocorrelations in the forecast errors for lags 1-20. In the Q-Q plot, most values lie on the line, indicating approximately normal residuals. Forecast accuracy analysis using various measures is shown below.
33
Thank You