
Report on Results of Discriminant Analysis Experiment
27 June 2002
Norman F. Schneidewind, PhD
Naval Postgraduate School
2822 Racoon Trail, Pebble Beach, California, USA

Outline
1. Introduction
2. Objectives
3. Technical Approach
4. Risk Factors
5. Risk Factor Examples
6. Results
7. Conclusions

1.0 Introduction
If measurement is to advance to a higher level:
– We must shift our attention to the front end of the development process.
– It is during system conceptualization that errors in specifying requirements are introduced into the process.

1.1 Introduction
A requirements change may introduce ambiguity and uncertainty into the development process, causing errors in implementing the change.
– These errors may pose significant risks in implementing the requirements.
Faults and failures may be induced by requirements changes when the requirements lack precision.

1.2 Introduction
Our use of "risk" pertains to executing a system's software where there is a chance of injury, damage, or loss of the mission if a serious software failure occurs during the mission.
Acronyms
– CRs: Change Requests
– DRs: Discrepancy Reports
– MW: Man-Weeks
– SLOC: Source Lines Of Code

1.3 Introduction
Definitions
– critical value: discriminant value that distinguishes high-quality from low-quality software.
– "issues": number of possible conflicts among requirements.
– "mods": number of modifications or iterations on the proposed change.
– "sloc": number of lines of code affected by the change.
– "space": amount of memory space required to implement the change.

2.0 Objectives
Given the lack of emphasis in metrics research on the critical role that requirements play in determining reliability, we are motivated to investigate the following issues:

2.1 Objectives
– What is the relationship between requirements attributes and reliability? Are there requirements attributes that are strongly related to the occurrence of defects and failures in the software?

2.2 Objectives
– What is the relationship between requirements attributes and software attributes like complexity and size? Are there requirements attributes that are strongly related to the complexity and size of the software?

2.3 Objectives
– Is it feasible to use requirements attributes as predictors of reliability? Can static requirements change attributes, like the size of the change, be used to predict reliability in execution?
– For example, time to next failure and number of failures.

2.4 Objectives
– Are there requirements attributes that can discriminate between high and low reliability, thus qualifying these attributes as predictors of reliability?
– Which requirements attributes affect the risk to reliability the most?

2.5 Objectives
– Develop a process (shown in Figure 1) that is used to:
1) analyze the relationships among requirements changes, complexity, and reliability, and
2) assess and predict reliability risk as a function of requirements changes.

Figure 1. Risk Analysis Process

3.0 Technical Approach
By retrospectively analyzing the relationship between requirements and reliability and maintainability, we identified the risk factors associated with reliability and maintainability, and prioritized them based on the degree to which the relationship is s-significant.

3.1 Technical Approach
To quantify the effect of a requirements change, we used various risk factors, defined as attributes of a requirements change that can induce adverse effects on:
– failure incidence,
– maintainability (e.g., size and complexity of the code), and
– project management (e.g., personnel resources).

3.2 Technical Approach
Table 1 shows the Change Request Hierarchy of the Space Shuttle:
– change requests, discrepancy reports, and failures.
We analyzed categories 1 versus 2.1 and 1 versus 2.2.3 with respect to risk factors as discriminants of the categories.

Table 1: Change Request Hierarchy
Change Requests (CRs)
– 1. No Discrepancy Reports (i.e., CRs with no DRs)
– 2. Discrepancy Reports
  2.1 No failures (i.e., CRs with DRs only)
  2.2 Failures
    2.2.1 Pre-release failures
    2.2.2 Post-release failures
    2.2.3 Exclusive OR of 2.2.1 and 2.2.2 (i.e., CRs with failures)

3.3 Technical Approach: Categorical Data Analysis
We used categorical data analysis to test the null hypothesis H0 (a risk factor is not a discriminant of reliability and maintainability) against the alternative hypothesis H1 (a risk factor is a discriminant of reliability and maintainability).

3.4 Technical Approach
A similar hypothesis is used to assess whether risk factors can serve as discriminants of metrics characteristics. We used the requirements, requirements risk factors, reliability, and metrics data from the Space Shuttle "Three Engine Out" software (the abort sequence invoked when three engines are lost) to test our hypotheses. Samples of these data are shown in Tables 2, 3, and 4.

Table 2: Example Failure Data
Failure Found On Operational Increment: Q
Days from Release When Failure Occurred: 75
Discrepancy Report #: 1
Severity: 2
Failure Date:
Release Date:
Module in Error: 10

Table 3: Example Risk Factors
Change Request Number: A
Source Lines of Code Changed: 1933
Complexity Rating of Change: 4
Criticality of Change: 3
Number of Principal Functions Affected: 27
Number of Modifications of Change Request: 7
Number of Requirements Issues: 238
Number of Inspections Required: 12
Manpower Required to Make Change: MW

Table 4: Example Metrics Data
Module: 10
Operator Count: 3895
Operand Count: 1957
Statement Count: 606
Path Count: 998
Cycle Count: 4
Discrepancy Report Count: 14
Change Request Count: 16

3.5 Technical Approach
Table 5 defines the Change Request samples used in the analysis.
– Sample sizes are small due to the high reliability of the Space Shuttle. However, sample size is one of the parameters accounted for in the statistical tests, which produced s-significant results in certain cases.

Table 5: Definition of Samples

Sample                                                Size
Total CRs                                               24
Instances of CRs with no DRs                            12
Instances of CRs with DRs only                           9
Instances of CRs with failures                           7
Instances of CRs with modules that caused failures       7

A given CR can have multiple instances of DRs, failures, and modules that caused failures.
CR: Change Request. DR: Discrepancy Report.

3.6 Technical Approach
To minimize the confounding effects of a large number of variables that interact in some cases, the statistical categorical data analysis was performed incrementally, using only one category of risk factor at a time.
– We then observed the effect of adding each additional risk factor on the ability to correctly classify change requests that have discrepancy reports or failures versus those that do not.
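A minimal sketch of this incremental idea, in Python. The data, the critical values, the order in which factors are added, and the any-factor-exceeds-its-critical-value classification rule are all hypothetical stand-ins, not the study's actual discriminant procedure:

```python
# Incremental classification sketch: add one risk factor at a time and
# track how well the included factors separate CRs with failures from
# CRs without. All values below are hypothetical.

crs = [
    {"mods": 7, "sloc": 1933, "issues": 238, "space": 420, "failed": True},
    {"mods": 1, "sloc": 120,  "issues": 4,   "space": 16,  "failed": False},
    {"mods": 2, "sloc": 450,  "issues": 12,  "space": 64,  "failed": False},
    {"mods": 5, "sloc": 2750, "issues": 96,  "space": 300, "failed": True},
]

# Hypothetical critical values (discriminants) for each risk factor.
critical = {"space": 128, "issues": 50, "mods": 4, "sloc": 1000}

included = []
for factor in ["space", "issues", "mods", "sloc"]:  # some chosen order
    included.append(factor)
    # Classify a CR as failure-prone if any included factor exceeds
    # its critical value, then count correct classifications.
    correct = sum(
        any(cr[f] > critical[f] for f in included) == cr["failed"]
        for cr in crs
    )
    print(f"{included}: {correct}/{len(crs)} CRs classified correctly")
```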

3.7 Technical Approach
The Mann-Whitney test for difference in medians between categories was used because it requires no assumption about the underlying statistical distribution. In addition, some risk factors are ordinal-scale quantities (e.g., modification level), for which a nonparametric test is appropriate.
– Rank correlation was used to check for risk factor dependencies.
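For illustration, this kind of two-sample median comparison can be run with SciPy's Mann-Whitney U implementation. The sample values below are invented, not the Shuttle data:

```python
from scipy.stats import mannwhitneyu

# Hypothetical "sloc" values for two CR categories:
# CRs with no DRs vs. CRs with failures (invented numbers).
sloc_no_drs   = [10, 57, 124, 220, 305, 410, 498]
sloc_failures = [870, 1250, 1933, 2410, 2750, 3100, 4020]

# Two-sided nonparametric test of H0: no difference in location
# between the two categories.
stat, p_value = mannwhitneyu(sloc_no_drs, sloc_failures,
                             alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: 'sloc' discriminates between the two categories.")
```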

4.0 Risk Factors
One of the software maintenance problems of the NASA Space Shuttle Flight Software organization is evaluating the risk of implementing requirements changes, since these changes can affect the reliability and maintainability of the software. To assess the risk of a change, the software development contractor uses a number of risk factors.

4.1 Risk Factors
This formal process is called a risk assessment. No requirements change is approved by the change control board without an accompanying risk assessment. To date, this qualitative risk assessment has proven useful for identifying possibly risky requirements changes or, conversely, for providing assurance that there are no unacceptable risks in making a change.

4.2 Risk Factors
However, there has been no quantitative evaluation to determine whether software with high risk factor values was really less reliable and maintainable than software with low risk factor values. In addition, there has been no model for predicting the reliability and maintainability of the software if a change is implemented. We addressed both of these issues.

5.0 Risk Factor Examples
Complexity Factors
– Number of modifications or iterations on the proposed change: how many times must the change be modified, or presented to the change control board, before it is approved?
Size Factors
– Number of lines of code affected by the change: how many lines of code must be changed to implement the change?

5.1 Risk Factor Examples
Requirements Issues and Function Factors
– Possible conflicts among requirements changes (requirements issues): will this change conflict with other requirements changes (e.g., lead to conflicting operational scenarios)?
Performance Factors
– Amount of memory required to implement the change: will the change use so much memory that other functions will not have sufficient memory to operate effectively?

6.0 Results
We show the results of the statistical analyses (a, b, and c) in Tables 6, 7, and 8, respectively.
– This process is illustrated in Figure 2.
Only risk factors for which there is sufficient data (i.e., data from seven or more CRs) and for which the results are s-significant are shown. Some quantitative risk factors (e.g., size of change, "sloc") are s-significant;
– no non-quantitative risk factors (e.g., complexity) are s-significant.


6.1 Results
a. Categorical data analysis of the relationship between CRs with no DRs and CRs with failures, and between CRs with no DRs and CRs with DRs only, using the Mann-Whitney test in both cases.

6.2 Results
b. Dependency check on risk factors, using rank correlation coefficients.
c. Identification of modules that caused failures as a result of the CRs, and their metric values.

6.3 Results
a. Categorical Data Analysis
Table 6 (Part 1) shows s-significant results for CRs with no DRs vs. CRs with failures for the risk factors "mods", "sloc", "issues", and "space". There are also s-significant results for CRs with no DRs vs. CRs with DRs only for the risk factors "issues" and "space" (Part 2).

Table 6: S-significant Results (alpha ≤ .05), CRs with no DRs vs. CRs with failures, Mann-Whitney Test (Part 1)

Table 6: S-significant Results (Part 2)

6.4 Results
Since the value of alpha represents the level of s-significance of a risk factor in predicting reliability, we use it in Table 6 to prioritize the risk factors, with low values meaning high priority.
– The priority order is: "space", "issues", "mods", "sloc".
The s-significant risk factors would be used to predict reliability and maintainability problems for this set of data and this version of the software.
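The prioritization step itself amounts to sorting risk factors by their attained significance levels. A minimal sketch, with placeholder alpha values rather than the study's:

```python
# Sort risk factors by attained significance level; smaller alpha means
# higher priority. The alpha values are placeholders, not the study's.
alpha_by_factor = {"space": 0.002, "issues": 0.008,
                   "mods": 0.015, "sloc": 0.031}

priority = sorted(alpha_by_factor, key=alpha_by_factor.get)
print(priority)  # ['space', 'issues', 'mods', 'sloc']
```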

6.5 Results
b. Dependency Check on Risk Factors
To check for possible dependencies among risk factors that could confound the results, rank correlation coefficients are computed in Table 7.
– Using an arbitrary threshold of .7, the results indicate s-significant dependencies between "issues" and "mods" and between "issues" and "sloc" for CRs with no DRs.
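A sketch of this dependency check using Spearman rank correlation, flagging factor pairs whose coefficient exceeds the .7 threshold; the per-CR values are invented for illustration:

```python
from itertools import combinations
from scipy.stats import spearmanr

# Hypothetical per-CR risk factor values (one entry per CR).
factors = {
    "issues": [4, 12, 38, 96, 150, 238],
    "mods":   [1, 3, 2, 5, 7, 6],
    "sloc":   [120, 450, 800, 1500, 2200, 2750],
    "space":  [16, 64, 96, 200, 260, 420],
}

THRESHOLD = 0.7  # arbitrary cutoff, as in the study
for (name_a, a), (name_b, b) in combinations(factors.items(), 2):
    rho, _ = spearmanr(a, b)
    if abs(rho) >= THRESHOLD:
        print(f"possible dependency: {name_a} vs. {name_b}, rho = {rho:.2f}")
```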

Table 7: Rank Correlation Coefficients of Risk Factors

6.6 Results
That is, as the number of conflicting requirements in Table 7 increases, the number of modifications and the size of the change request increase. There is also an s-significant dependency between "space" and "issues" for CRs with failures: as the number of conflicting requirements increases, the memory space required to implement the change request increases.

6.7 Results
c. Identification of Modules That Caused Failures
Requirements change requests may occur on modules with metric values that exceed the critical values. In these cases, there is s-significant risk in making the change, because such modules could fail.

6.8 Results
Table 8 shows that modules that caused failures as a result of the CRs had metric values far exceeding the critical values. A module whose metric values exceed the critical values is predicted to cause failures. For the Space Shuttle, modules with excessive size and complexity consistently lead to failures.
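A sketch of that screening rule: compare each module's metrics against the critical values and flag modules that exceed them. The metric values for module 10 are taken from Table 4; the critical values and the second module are hypothetical:

```python
# Flag modules whose metrics exceed the critical values; such modules
# are predicted to cause failures if changed.
critical_values = {"statements": 500, "paths": 700, "operators": 3000}  # hypothetical

modules = {
    "module_10": {"statements": 606, "paths": 998, "operators": 3895},  # from Table 4
    "module_11": {"statements": 150, "paths": 90,  "operators": 700},   # hypothetical
}

for name, metrics in modules.items():
    exceeded = [m for m, v in metrics.items() if v > critical_values[m]]
    status = "high risk" if exceeded else "ok"
    print(f"{name}: {status} (exceeds: {exceeded})")
```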

Table 8: Selected Risk Factor Module Characteristics

7.0 Conclusions
Risk factors that are s-significant can be used to make decisions about the risk of making changes that impact the reliability and maintainability of the software.

7.1 Conclusions
S-significant results were found for CRs with no DRs vs. CRs with failures;
– in addition, s-significant results were found for CRs with no DRs vs. CRs with DRs only.
Module metrics should be considered in risk analysis because metric values that exceed the critical values are likely to result in unreliable and non-maintainable software. This methodology can be generalized to other risk assessment domains, but the specific risk factors, their numerical values, and statistical results may vary.