Application of Maximum Entropy Principle to Software Failure Prediction. Wu Ji, Software Engineering Institute, Beihang University.

Agenda
–Introduction
–Problem and focus
–Method and models
–Results
–Conclusions

Introduction
Failure prediction is one of the key problems in software quality (reliability) estimation. In general, failure prediction can be defined as y = f(x), where:
–y is a failure-related variable
–x is the information on which the prediction is based
So far, x has typically been set to:
–Software execution time → reliability growth prediction
–Software execution traces → anomaly detection

Introduction (cont.)
Reliability is a major concern for software with high reliability requirements (HRR). Because reliability engineering is very costly, reliability testing is seldom performed for software without HRR. Anomaly detection is usually implemented as a built-in module of the software.

Introduction (cont.)
In general, all managers strive for high quality. What do managers really care about in failure prediction?
–Given a usage scenario, will the software survive it?
How to predict software failure from its input is still a new problem.

Problem and focus
How can failure be predicted from software input?

Problem and focus (cont.)
[Figure: execution timeline from execution start s to time t; the left context is everything executed before t, and the question is whether the failure observation at t is 0 or 1.]

Problem and focus (cont.)
If we can model the left context, we obtain the distribution {(lc, fo)} of (left context, failure observation) pairs.
[Figure: failure learning derives a failure law from the {(lc, fo)} pairs; failure prediction then maps a software input to a predicted failure observation.]
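The pipeline above can be sketched as follows. The representation of the left context (here, a sliding window over the most recent input units) is an assumption for illustration; the slides do not define how the left context is truncated.

```python
def make_training_pairs(trace, window=3):
    """Turn one execution trace into (lc, fo) training pairs.

    Each trace element is (input_unit, fo), with fo in {0, 1}. The left
    context lc is taken to be the last `window` input units seen up to
    and including the current observation (a hypothetical choice).
    """
    pairs = []
    inputs = [unit for unit, _ in trace]
    for i, (_, fo) in enumerate(trace):
        lc = tuple(inputs[max(0, i - window + 1):i + 1])
        pairs.append((lc, fo))
    return pairs
```

Each pair (lc, fo) then becomes one training example for the failure-learning step.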

Method and models
The whole left context is hard to model, so we use:
–A probability model: p_o(y|x)
–x: a partial left context; y: the failure observation
The Maximum Entropy Principle (MEP) is applied to model p_o(y|x).
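A conditional maximum-entropy model has the form p(y|x) ∝ exp(Σᵢ λᵢ fᵢ(x, y)). A minimal sketch, in which the feature functions and weights are placeholders rather than the ones used in the talk:

```python
import math

def maxent_prob(weights, features, x, labels):
    """Conditional maximum-entropy model: p(y|x) is proportional to
    exp(sum_i lambda_i * f_i(x, y)).

    `features` is a list of feature functions f(x, y) -> 0/1 and
    `weights` the corresponding lambdas. Returns {y: p(y|x)}.
    """
    scores = {
        y: math.exp(sum(w * f(x, y) for w, f in zip(weights, features)))
        for y in labels
    }
    z = sum(scores.values())  # normalizer Z(x)
    return {y: s / z for y, s in scores.items()}
```

For failure prediction the label set would be {0, 1} (failure observed or not), and each x a partial left context.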

Method and models (cont.)
MEP is a well-known and widely used learning principle:
–Strong generalization ability
–Dynamic and open
–Adapts well to data sparseness

Method and models (cont.)
Two views lead to two models:
–Structure view: failure cannot be well modeled without modeling the underlying faults → Structure Model.
–Surface view: failure can be well modeled from the input alone and its relations with failures → Surface Model.

Method and models (cont.)
–Surface Model: learns the statistical co-occurrence of the surface information.
–Structure Model: learns the statistical cause-effect (fault → failure) relationship.

Method and models (cont.)
Features applied in the surface model: SIU-Seg-Ftrs, SIU-Num-Ftrs, and Failure-Ftrs, all feeding the failure observation (Flr).

Method and models (cont.)
Features applied in the structure model: Fault-Ftrs (Flt → Flr) and Failure-Ftrs, feeding the failure observation (Flr).
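For illustration only, a feature in the spirit of the Failure-Ftrs above might fire on recent failure observations in the left context. The actual SIU-Seg-Ftrs, SIU-Num-Ftrs, and Fault-Ftrs definitions are not given in the slides, so this helper is hypothetical:

```python
def make_failure_history_feature(k):
    """Hypothetical Failure-Ftrs-style feature factory: the returned
    feature fires (returns 1.0) when y predicts a failure and the k-th
    most recent failure observation in the left context was itself a
    failure. The context x is assumed to carry an "fo_history" list."""
    def feature(x, y):
        history = x.get("fo_history", [])
        if y == 1 and len(history) >= k and history[-k] == 1:
            return 1.0
        return 0.0
    return feature
```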

Method and models (cont.)
The models are trained in a supervised fashion on the training data. Objective: maximize the likelihood function.
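Maximizing the conditional log-likelihood can be sketched with plain gradient ascent, where each weight's gradient is the empirical feature count minus the model-expected count. The slides do not specify the training algorithm; maximum-entropy models are often fit with GIS/IIS instead, so this is only a sketch:

```python
import math

def train_maxent(data, features, labels, lr=0.1, epochs=200):
    """Fit maximum-entropy weights by gradient ascent on the conditional
    log-likelihood sum_j log p(y_j | x_j).

    `data` is a list of (x, y) pairs; `features` a list of functions
    f(x, y) -> 0/1. The gradient of each weight is the empirical count
    of its feature minus the expected count under the current model.
    """
    w = [0.0] * len(features)
    for _ in range(epochs):
        grad = [0.0] * len(features)
        for x, y in data:
            # Current model distribution p(.|x), unnormalized then normalized.
            scores = {
                yy: math.exp(sum(wi * f(x, yy) for wi, f in zip(w, features)))
                for yy in labels
            }
            z = sum(scores.values())
            for i, f in enumerate(features):
                expected = sum(scores[yy] / z * f(x, yy) for yy in labels)
                grad[i] += f(x, y) - expected
        w = [wi + lr * g / len(data) for wi, g in zip(w, grad)]
    return w
```

On separable toy data the weights of the discriminative features grow steadily, which is why real implementations add regularization or a stopping criterion.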

Method and models (cont.)
Model evaluation:
–For a given test case: the test engineer runs it to obtain the test_fo_sequence; the prediction model returns the predicted pred_fo_sequence.
–Evaluate by the match degree (precision) between test_fo_sequence and pred_fo_sequence.
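The match degree between the two sequences can be computed as simple positional agreement. This is an assumption, since the slides do not define the match precisely:

```python
def match_precision(test_fo_seq, pred_fo_seq):
    """Fraction of positions where the predicted failure observation
    (0/1) matches the observed one. Assumes the two sequences are
    aligned and of equal length."""
    if len(test_fo_seq) != len(pred_fo_seq):
        raise ValueError("sequences must have the same length")
    hits = sum(1 for t, p in zip(test_fo_seq, pred_fo_seq) if t == p)
    return hits / len(test_fo_seq)
```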

Results
Two groups of experiments were conducted, involving 5 software systems and 17 testing runs in total. Open test method:
–The testing data is kept separate from the training data and remains unknown during training.
Surface Model: average precision:
Structure Model: average precision: 0.858

Results (cont.)
[Figure: evaluation score distribution.]

Potential applications of the prediction model:
–Test case prioritization
–Reliability estimation
–Reliability growth modeling

Conclusions
–A new failure prediction problem.
–A statistical learning method is applied to learn the failure law and then predict failures.
–Two models: the surface model and the structure model.
–Promising evaluation results:
  –Surface Model:
  –Structure Model:

Conclusions (cont.)
Lessons learnt:
–Design and start experiments as soon as possible to verify the model.
–A complex model does not always perform well → simplify the model.
–Do not build too many assumptions into how the data is generated.

Thank you for your attention. Ready for questions!