Fault Prediction and Software Aging

Presentation transcript:

Fault Prediction and Software Aging Carlos Perez

Outline: Software Lifecycle / Motivation; Software Aging / Problem; Fault Prediction / Approach; Methodology for Detection and Estimation of Software Aging; Approach / Preventive Maintenance; Experiment; Data Analysis; Results; Conclusion

The Software Lifecycle. Youth: software is new, simple, and efficient; functionality might be limited. Maturity: as new requirements arise, software becomes complex and code limitations surface. Elderliness: aging has taken a heavy toll on performance. Death: the legacy application is replaced by a newborn.

DOS: A Case Study. Youth: DOS, simple but with very limited functionality. Maturity: Windows 3.1, a GUI on top of DOS, with more functionality but more bugs; Windows95/97, with more functionality, new bugs, and degraded performance. Elderliness: Windows98, where many bugs have been patched but adding functionality is risky at this point. Death: Windows XP was introduced!

The Software Aging Problem. The main problem with legacy code is aging. What is software aging? Deterioration in the availability of OS resources, data corruption, and numerical error accumulation. Consequences: performance degradation; crash/hang failures.

Causes of Software Aging. Common causes of software aging are: memory bloating or leaks; unreleased file locks; data corruption; storage space fragmentation; accumulation of round-off errors. Legacy code is more likely to experience these kinds of problems.

Combating Software Aging. Research question: How can we combat software aging? Why is it a challenging problem? It is caused by heisenbugs (hard-to-find bugs); it is an inherent characteristic of elderly systems; it is hard to detect; it can be present in critical systems.

Software Rejuvenation Approach. Software rejuvenation is a proactive fault management technique aimed at cleaning up the system's internal state to prevent the occurrence of future failures. Examples of cleaning: garbage collection; kernel table flushing; rebooting. Advantages: prevents crashes from occurring; provides fault tolerance in the presence of bugs. Disadvantages: introduces overhead.

Fault Prediction. Fault prediction tries to anticipate failures before they happen. It monitors system resources in order to detect and estimate aging. It computes an “estimated time to failure”. Preventive measures can then be taken to avoid crashes. This enables software rejuvenation.

S. Garg et al., “A methodology for detection and estimation of software aging,” in Proc. 9th International Symposium on Software Reliability Engineering, 1998. Presents a methodology for fault prediction based on the characterization of software aging.

Approach. Collect UNIX system resource usage at regular intervals using a distributed monitoring tool. Use statistical trend detection techniques to detect and validate the existence of aging in UNIX.

Experimental Setup. Distributed monitoring tool based on SNMP: works like a distributed database; monitors the state of UNIX running on the workstations. Monitoring station: queries the SNMP agent at each workstation; determines the “health” of each system.

SNMP Model. SNMP (Simple Network Management Protocol) supports monitoring of network-attached devices. Pro-Active Fault Management MIB: defines a set of objects that can be queried on any workstation by the managing station. These objects describe the state of the workstation.

PFM MIBs. hostID: provides basic information about the station. timeVal: provides the current time and the time since the last reboot. osResource: describes the state of OS resources such as free memory, file table size, etc. procStats: describes the state of the running processes. Etc.

Data Collection. Heterogeneous UNIX workstations were monitored. Their resource data was gathered every 15 minutes. Crashes were recorded for correlation purposes.
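One way to picture the collected data is as a stream of polled samples that is regrouped into one time series per workstation and monitored object. The sketch below is only an illustration: the `Sample` record, its field names, and the object name `fileTableSize` are hypothetical and are not taken from the monitoring tool described in the slides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One polled value (hypothetical record layout, for illustration only)."""
    host: str      # workstation identifier
    obj: str       # monitored object name, e.g. "fileTableSize"
    minute: int    # minutes since monitoring started
    value: float   # observed value of the object


def to_time_series(samples):
    """Group samples into {(host, obj): [(minute, value), ...]} ordered by time."""
    series = defaultdict(list)
    for s in sorted(samples, key=lambda s: s.minute):
        series[(s.host, s.obj)].append((s.minute, s.value))
    return dict(series)


# Synthetic samples at the 15-minute polling interval mentioned in the slides.
polled = [
    Sample("ws1", "fileTableSize", 30, 122.0),
    Sample("ws1", "fileTableSize", 0, 120.0),
    Sample("ws1", "fileTableSize", 15, 121.0),
    Sample("ws2", "fileTableSize", 0, 80.0),
]
per_object = to_time_series(polled)
```

Each resulting list can then be analysed independently, which matches the idea of one time series per monitored object.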

Data Analysis. The data gathering phase provides a time series for every object monitored. Using these time series, several issues are addressed: Is aging present? What is the nature of the variations in the value? Can failures be related to observed values? Can we quantify aging?

Data Analysis. Visual cues: Can periodicity be clearly seen from time series plots? Is an increasing/decreasing trend visible? What analysis should we do? Classical time series analysis: linear and periodic dependency analysis; trend detection and estimation.

Periodicity and Linear Dependence. Determines the nature of the variations in the data. Approach: autocorrelation function; harmonic analysis. Confirms daily and weekly periodicities in the data.
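As an illustration of how an autocorrelation function can expose a daily cycle, here is a minimal sketch using the standard sample autocorrelation estimator. The data are synthetic, and the figure of 96 samples per day is an assumption that follows from a 15-minute sampling interval; this is not the analysis code used in the study.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


# Synthetic resource usage with a pure daily cycle, sampled every 15 minutes.
SAMPLES_PER_DAY = 96
usage = [100 + 10 * math.sin(2 * math.pi * t / SAMPLES_PER_DAY)
         for t in range(7 * SAMPLES_PER_DAY)]

r_day = autocorrelation(usage, SAMPLES_PER_DAY)        # lag of one full day
r_half = autocorrelation(usage, SAMPLES_PER_DAY // 2)  # lag of half a day
```

A strongly positive value at the one-day lag together with a strongly negative value at the half-day lag is the signature of a daily periodicity.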

Trend Detection and Estimation. Detection: trends indicate the presence of aging; the approach looks for monotonically increasing/decreasing trends in resources. Estimation: trend estimation quantifies the aging; the approach approximates the slope of the trend to estimate the expected time to resource exhaustion.

Trend Detection. Two steps: smoothing, then testing the trend existence hypothesis. Smoothing: robust locally weighted regression; reliable for nonlinear data. Test trend existence hypothesis: seasonal Kendall test; detects trends in the presence of cycles.

Smoothing Step 1. Start at a focal point. Define the window width: a larger size causes heavier smoothing, so the overall trend is captured.

Smoothing Step 2. Choose a weight function. The tricube weight function is the most common.
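For reference, the tricube weight is (1 - |u|^3)^3 for |u| < 1 and 0 otherwise, where u is the distance from the focal point divided by the window half-width. A minimal sketch (the function and argument names are chosen here for illustration):

```python
def tricube(u):
    """Tricube weight for a scaled distance u = (x - x0) / h."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


w_focal = tricube(0.0)   # observation at the focal point
w_mid = tricube(0.5)     # observation halfway to the window edge
w_edge = tricube(1.0)    # observation on the window edge
```

Observations close to the focal point get weights near 1, and the weight falls smoothly to 0 at the edge of the window.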

Smoothing Step 3. Polynomial regression using weighted least squares. Take the fitted value at the focal point from the regression. These steps are repeated at every X.

Smoothing Results. The steps are repeated for every observation in the data. A separate local regression is performed at each X. The fitted value for each focal X is plotted.
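The three smoothing steps can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it fits a degree-1 polynomial, uses a nearest-neighbour window controlled by a hypothetical `span` parameter, and omits the robustness iterations that down-weight outliers in robust locally weighted regression.

```python
def tricube(u):
    """Tricube weight for a scaled distance u."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_smooth(xs, ys, span=0.5):
    """Return one locally weighted linear fit per focal point in xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have equal length of at least 2")
    k = max(2, min(n, round(span * n)))  # number of neighbours in the window
    fitted = []
    for x0 in xs:
        # Step 1: window half-width is the distance to the k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        # Step 2: tricube weights (fall back to exact matches if h is 0).
        w = [tricube((x - x0) / h) if h > 0 else float(x == x0) for x in xs]
        # Step 3: weighted least-squares line, evaluated at the focal point.
        sw = sum(w)
        xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
        ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ybar + slope * (x0 - xbar))
    return fitted


# Synthetic check: exactly linear data should be reproduced by the smoother.
xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
smooth = local_linear_smooth(xs, ys, span=0.5)
```

A larger `span` pulls more neighbours into each local fit, which gives the heavier smoothing described in step 1.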

Trend Hypothesis. Seasonal Kendall test: compares observations from the same season across different periods, so that seasonal cycles are not mistaken for a trend. Determines if a trend exists.
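A minimal sketch of the seasonal Kendall statistic is given below. It computes the Mann-Kendall S statistic separately for each season, sums the results, and converts the total into an approximate standard normal score. The sketch assumes no tied values and independent seasons; the data and the `period` of 4 are synthetic choices for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def seasonal_kendall(values, period):
    """Return (S, Z) for the seasonal Kendall test, assuming no ties."""
    s_total = 0
    var_total = 0.0
    for season in range(period):
        sub = values[season::period]  # same season, successive cycles
        n = len(sub)
        s_total += sum(sign(sub[k] - sub[j])
                       for j in range(n) for k in range(j + 1, n))
        var_total += n * (n - 1) * (2 * n + 5) / 18
    if var_total == 0 or s_total == 0:
        return s_total, 0.0
    # Continuity correction: shrink |S| by one before standardising.
    z = (s_total - sign(s_total)) / math.sqrt(var_total)
    return s_total, z


# Synthetic series: a strong 4-step cycle plus a slow upward drift.
trending = [10 * (i % 4) + i // 4 for i in range(24)]
# Same cycle without any drift.
flat = [10 * (i % 4) for i in range(24)]

s_trend, z_trend = seasonal_kendall(trending, period=4)
s_flat, z_flat = seasonal_kendall(flat, period=4)
```

A |Z| above roughly 1.96 would reject the no-trend hypothesis at the 5% level under these assumptions, while the purely cyclic series yields S = 0.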

Trend Estimation. Once we confirm the existence of a trend, we must estimate its slope. Sen's slope: computes the slope between every pair of points and takes the median of those slopes.
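Sen's slope estimator is usually defined as the median of the slopes over all pairs of observations, which makes it robust to outliers. A minimal sketch with synthetic data (the function and variable names are illustrative):

```python
from statistics import median


def sen_slope(times, values):
    """Median of pairwise slopes between all observations."""
    n = len(times)
    slopes = [(values[j] - values[i]) / (times[j] - times[i])
              for i in range(n) for j in range(i + 1, n)
              if times[j] != times[i]]
    if not slopes:
        raise ValueError("need at least two observations at distinct times")
    return median(slopes)


# Four points on a line of slope 2, plus one large outlier at the end.
times = [0, 1, 2, 3, 4]
values = [1, 3, 5, 7, 100]
slope = sen_slope(times, values)
```

Because the median ignores the few extreme pairwise slopes caused by the outlier, the estimate stays at the slope of the underlying line.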

Results: Periodicities and Linear Dependence. Many values show daily and weekly periodic dependencies.

Results: Existence of Aging. Proved for file table size using seasonal trend decomposition. [Figure panels: original time series; increasing trend from regression; periodicities; residual]

Aging Quantification. Estimated time to failure due to aging is calculated with respect to a particular resource. Estimation is done from Sen’s slope and initial values. Important resources can then be identified for monitoring and managing.
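One simple way to turn a slope and a starting value into a time estimate is linear extrapolation to the resource limit. The sketch below assumes the trend stays constant; the function name, the limits, and the numbers are hypothetical and are not results from the study.

```python
def time_to_exhaustion(current, limit, slope):
    """Time until `current` reaches `limit` at a constant `slope` per time unit.

    Returns None when the trend is flat or moves away from the limit.
    """
    remaining = limit - current
    if remaining == 0:
        return 0.0
    if slope == 0 or remaining * slope < 0:
        return None
    return remaining / slope


# Hypothetical figures: free memory shrinking by 2 MB per hour from 800 MB.
hours_memory = time_to_exhaustion(current=800.0, limit=0.0, slope=-2.0)
# Hypothetical figures: file table growing by 0.5 entries per hour toward 1024.
hours_files = time_to_exhaustion(current=120.0, limit=1024.0, slope=0.5)
```

Comparing such estimates across resources indicates which resource is likely to be exhausted first and therefore deserves closer monitoring.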

Conclusion. Quantification of software aging is presented as a means of fault prediction. Statistical analysis is an appropriate method for the detection and estimation of software aging. It can help in developing a strategy for software rejuvenation.