Published by Vernon Rodgers. Modified over 9 years ago.
1
Daniel Liu & Yigal Darsa - Presentation
Early Estimation of Software Quality Using In-Process Testing Metrics: A Controlled Case Study
Presenters: Yigal Darsa, Daniel Liu
2
ABSTRACT
o Information about the field quality of a product tends to become available too late in the software development process.
o A controlled case study was conducted at North Carolina State University.
3
ABSTRACT (cont'd)
o Use of a suite of in-process metrics that leverages the software testing effort to provide:
  o an estimation of potential software field quality in early software development phases
  o the identification of low quality software programs
4
INTRODUCTION
o True field quality cannot be measured before a product has been completed and delivered to an internal or external customer.
o Field quality is calculated using the number of failures found by these customers.
5
INTRO (cont'd)
o Because this information is available late in the process, corrective actions tend to be expensive.
o Software developers can benefit from an early warning regarding the quality of their product.
6
INTRO (cont'd)
o An early warning can be built from a collection of internal metrics.
o An internal metric, such as cyclomatic complexity (explained later), is a measure derived from the product itself.
7
INTRO (cont'd)
o An external measure is a measure of a product derived from the external assessment of the behavior of the system.
  o e.g., the number of defects found in test is an external measure.
o Structural object-oriented (O-O) measurements are being used to evaluate and predict the quality of software.
  o e.g., the Chidamber-Kemerer (CK) and MOOD O-O metric suites.
8
INTRO (cont'd)
o The CK metric suite consists of six metrics:
  o weighted methods per class (WMC)
  o coupling between objects (CBO)
  o depth of inheritance tree (DIT)
  o number of children (NOC)
  o response for a class (RFC)
  o lack of cohesion among methods (LCOM)
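As a rough illustration of three of the six CK metrics, the sketch below computes DIT, NOC, and a unit-weight WMC for a toy Python class hierarchy. The classes Shape, Circle, and Square and the helper names are hypothetical, and the study itself measured Java code. The conventions chosen here (the root `object` has depth 0, every method has weight 1) are assumptions, not the settings used in the study.

```python
import inspect


class Shape:
    def area(self):
        return 0.0

    def describe(self):
        return "shape"


class Circle(Shape):
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3.14159 * self.r ** 2


class Square(Shape):
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s ** 2


def dit(cls):
    """Depth of inheritance tree: longest path from cls up to the root.

    Assumption: the root class `object` has depth 0.
    """
    if cls is object:
        return 0
    return 1 + max(dit(base) for base in cls.__bases__)


def noc(cls):
    """Number of children: immediate subclasses only."""
    return len(cls.__subclasses__())


def wmc(cls):
    """Weighted methods per class with every method weighted 1.

    Counts only methods defined directly in the class body.
    """
    return sum(1 for member in vars(cls).values() if inspect.isfunction(member))


for cls in (Shape, Circle, Square):
    print(cls.__name__, "DIT:", dit(cls), "NOC:", noc(cls), "WMC:", wmc(cls))
```

With unit weights WMC is just a method count. A complexity-weighted variant would replace the constant 1 with a per-method complexity value.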
9
INTRO (cont'd)
o These metrics can be a useful early internal indicator of externally-visible product quality in terms of fault-proneness.
10
INTRO (cont'd)
o Section 2 outlines the STREW metric suite.
o Section 3 discusses a controlled experiment in which the STREW metric suite was studied.
o Section 4 presents the experimental results.
o Finally, Section 5 presents the conclusions and future work.
11
Introduction: STREW
o The metric suite used is called the Software Testing and Reliability Early Warning metric suite, also known as STREW.
o STREW is a set of internal, in-process software metrics that are used to make an early estimation of post-release field quality.
o GOAL: early prediction of field quality.
12
STREW Metric Suite
o Reasoning behind this metric suite: unlike traditional reliability estimation models, STREW puts a greater emphasis on internal software metrics, especially those involving the testing effort.
13
STREW Metric Suite
o The use of the STREW metrics is based on the existence of an extensive collection of unit test cases being created as development proceeds.
o During the initial stages of creating any project, such a unit test suite might not be available. In that case, historical data from a comparable project may be used.
14
STREW Metric Suite
o The STREW-J metric suite consists of nine metric ratios.
o The metrics are intended to cross-check each other and to triangulate upon an estimate of post-release field quality.
o Each metric makes an individual contribution towards the estimation of post-release field quality, but the metrics work best when used together.
15
STREW Metric Suite
o Table: the nine STREW metrics, categorized into three groups: test quantification metrics, complexity and O-O metrics, and a size adjustment metric.
16
STREW Metric Suite
Test quantification metrics (SM1 to SM4):
o Specifically intended to cross-check each other to account for different coding/testing styles.
o e.g., one developer might write fewer test cases, each with multiple asserts checking various conditions. Another developer might test the same conditions by writing many more test cases, each with only one assert.
17
STREW Metric Suite
Test quantification metrics (SM1 to SM4):
o Intended to provide useful guidance to each of these developers without prescribing the style of writing the test cases.
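The cross-checking idea behind the test quantification metrics can be sketched with a few ratios. The three ratios below are illustrative choices, not the definitions of SM1 to SM4, which the slides do not spell out. The class name SuiteStats and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SuiteStats:
    """Hypothetical summary of one developer's unit test suite."""
    test_cases: int
    asserts: int
    sloc: int  # source lines of code under test

    def ratios(self) -> dict:
        # Illustrative ratios only; not the official STREW definitions.
        return {
            "asserts_per_sloc": self.asserts / self.sloc,
            "test_cases_per_sloc": self.test_cases / self.sloc,
            "asserts_per_test_case": self.asserts / self.test_cases,
        }


# Style A: few test cases, each with many asserts.
few_big_tests = SuiteStats(test_cases=10, asserts=60, sloc=1000)
# Style B: many test cases, each with a single assert.
many_small_tests = SuiteStats(test_cases=60, asserts=60, sloc=1000)

for name, stats in (("few big tests", few_big_tests),
                    ("many small tests", many_small_tests)):
    print(name, stats.ratios())
```

Judged by test cases per SLOC alone, style A looks under-tested. Asserts per SLOC shows that both suites check the same number of conditions, which is why several ratios are read together.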
18
STREW Metric Suite
The complexity and O-O metrics (SM5 to SM8):
o These metrics examine the relative ratio of test code to source code for control flow complexity and for a subset of the CK metrics.
o These relative ratios for a product in development can be compared with the historical values for similar projects to indicate the relative complexity of the testing effort with respect to the source code.
19
STREW Metric Suite
The complexity and O-O metrics (SM5 to SM8) IN DETAIL:
o The cyclomatic complexity metric can be defined as the number of linearly independent paths in a program.
o Studies of software systems have shown that code complexity correlates strongly with program size measured in lines of code, and that it indicates the extent to which control flow is used.
o The use of conditional statements increases the amount of testing required because there are more logic and data flow paths to be verified.
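A minimal sketch of a cyclomatic complexity estimate and a test-to-source complexity ratio follows. It approximates complexity as the number of decision points plus one, counted over Python code with the standard `ast` module. The study measured Java code, and the snippets SOURCE and TEST, the set of counted node types, and the whole-snippet granularity are all assumptions made for illustration.

```python
import ast

# Node types treated as decision points (an approximation).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of `and` / `or` adds one branch.
            decisions += len(node.values) - 1
    return decisions + 1


SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

TEST = '''
def test_classify():
    assert classify(-1) == "negative"
    assert classify(4) == "composite"
'''

source_cc = cyclomatic_complexity(SOURCE)
test_cc = cyclomatic_complexity(TEST)
ratio = test_cc / source_cc
print("source:", source_cc, "test:", test_cc, "test/source ratio:", ratio)
```

A ratio like this only becomes informative when it is compared with values from similar past projects, as the slides describe.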
20
STREW Metric Suite
The complexity and O-O metrics (SM5 to SM8) IN DETAIL:
o The larger the inter-object coupling, the higher the sensitivity to change, so maintenance of the code is more difficult. As a result, the higher the inter-object class coupling, the more rigorous the testing should be.
o The number of methods and the complexity of the methods involved predict how much time and effort is required to develop and maintain the class.
o The larger the number of methods in a class, the greater the potential impact on its children, since the children inherit all the methods defined in the class.
21
STREW Metric Suite
The final metric is a relative size adjustment factor:
o Defect density has been shown to increase with class size.
o The adjustment factor accounts for differences in lines of code (LOC), because projects that use STREW metric prediction will not all have the same LOC size.
22
STREW Metric Suite
Removal of metrics: some metrics were removed because they did not contribute towards the estimation of post-release field quality:
o Statement coverage
o Branch coverage
o Number of requirements / Source lines of code
o Number of children (test) / Number of children (source)
o Lack of cohesion among methods (test) / Lack of cohesion among methods (source)
23
CONTROLLED EXPERIMENT
o Research Design
o Case Study Limitations
24
Research Design
o To evaluate the predictive ability of STREW-J, a case study was carried out in a junior/senior-level software engineering course at NCSU in the fall 2003 semester.
25
Research Design (cont'd)
o Students developed an open source Eclipse plug-in in Java.
  o Eclipse is an open source integrated development environment.
o Plug-ins were tested via:
  o Unit tests using JUnit
  o Acceptance tests using the FIT tool
o Groups of four or five junior and/or senior students submitted 22 projects.
26
Research Design (cont'd)
o SLOC: Source Lines of Code
o TLOC: Test Lines of Code
27
Research Design (cont'd)
o Evaluated by 45 black box test cases:
  o Exception checking
  o Error handling
  o Boundary test cases
  o Operational correctness of the plug-in
28
Case Study Limitations
o Students' experience varies; not everyone is advanced.
o The experiment was conducted in an academic setting under ideal conditions, so it might not reflect industrial software development.
o Eclipse plug-ins are relatively small compared with industrial applications.
29
Experimental Results
o Black box test failures/KLOC: approximated as the problems that would have been found by the customer had the project been released.
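The quality measure can be sketched as a simple normalization. The function name and the example numbers below are hypothetical, and the sketch assumes that KLOC refers to thousands of source lines of code (SLOC).

```python
def failures_per_kloc(failed_tests: int, sloc: int) -> float:
    """Black box test failures normalized per thousand source lines."""
    if sloc <= 0:
        raise ValueError("sloc must be positive")
    return failed_tests / (sloc / 1000)


# Hypothetical project: 6 of the 45 black box tests fail on 2,000 SLOC.
print(failures_per_kloc(failed_tests=6, sloc=2000))
```

Normalizing by size lets projects of different lengths be compared on the same scale.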
30
Experimental Results
o Using the black box test failures/KLOC obtained by running the 45 test cases as the quality measure, a multiple linear regression analysis was performed.
o Difficulty with multiple linear regression analysis: there is multi-collinearity among the metrics. This can lead to inflated variance in the prediction of post-release field quality.
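Multi-collinearity can be spotted by checking how strongly two predictor metrics correlate with each other. The sketch below computes a Pearson correlation in plain Python. The two metric columns and their values are made up for illustration and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values of two test metrics across six projects.
asserts_per_sloc = [0.02, 0.04, 0.05, 0.07, 0.09, 0.11]
test_cases_per_sloc = [0.01, 0.02, 0.03, 0.03, 0.05, 0.06]

r = pearson(asserts_per_sloc, test_cases_per_sloc)
print(f"correlation between the two predictors: {r:.3f}")
```

When predictors are this strongly correlated, a regression cannot cleanly separate their individual effects, which is the source of the inflated variance.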
31
Experimental Results
o To eliminate multi-collinearity, principal component analysis (PCA) was used.
o PCA transforms the metrics into principal components that are orthogonal to each other, which removes the multi-collinearity.
o This means that changes in one component do not influence the other components, unlike the individual metrics.
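The orthogonality property can be checked on a two-metric example, where PCA reduces to rotating the centered data by a closed-form angle. This is a simplified sketch: the data are hypothetical, only two metrics are used, and the metrics are not standardized first, which a real analysis might do.

```python
from math import atan2, cos, sin


def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pca_2d(xs, ys):
    """Return the scores on the two principal components of (xs, ys)."""
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    # Rotation angle that diagonalizes the 2x2 covariance matrix.
    theta = 0.5 * atan2(2 * b, a - c)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    centered = [(x - mean_x, y - mean_y) for x, y in zip(xs, ys)]
    pc1 = [x * cos(theta) + y * sin(theta) for x, y in centered]
    pc2 = [-x * sin(theta) + y * cos(theta) for x, y in centered]
    return pc1, pc2


# Hypothetical, strongly correlated metric values for six projects.
metric_a = [0.02, 0.04, 0.05, 0.07, 0.09, 0.11]
metric_b = [0.01, 0.02, 0.03, 0.03, 0.05, 0.06]

pc1, pc2 = pca_2d(metric_a, metric_b)
print("covariance of raw metrics:", covariance(metric_a, metric_b))
print("covariance of components: ", covariance(pc1, pc2))
```

The raw metrics have a clearly positive covariance, while the covariance of the two component scores is zero up to rounding error, so the components can be used as regression inputs without the collinearity problem.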
32
Experimental Results
o Ability of the STREW metrics: they can be used to identify programs of low quality.
o All programs with a black box test failures/KLOC value lower than the bound computed from the equation below are classified as high quality, and the remaining programs as low quality.
o Equation (lower bound): $\text{Lower bound} = \mu - z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$, where $\mu$ is the mean and $\sigma$ the standard deviation of black box test failures/KLOC, and $n$ is the sample size.
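A minimal sketch of this classification rule follows. It assumes the standard normal-approximation form of the bound, with the standard deviation divided by the square root of n. It also assumes a 95% confidence level (alpha = 0.05) and the sample standard deviation. The failure densities are hypothetical, not the study's data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def lower_bound(values, alpha=0.05):
    """Lower confidence bound: mean - z_(alpha/2) * stdev / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return mean(values) - z * stdev(values) / sqrt(len(values))


def classify(values, alpha=0.05):
    """Label each program 'high' if its failure density is below the bound."""
    bound = lower_bound(values, alpha)
    return ["high" if v < bound else "low" for v in values]


# Hypothetical black box test failures/KLOC for nine programs.
failures_per_kloc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

bound = lower_bound(failures_per_kloc)
labels = classify(failures_per_kloc)
print(f"lower bound: {bound:.3f}")
print(list(zip(failures_per_kloc, labels)))
```

Using the lower bound rather than the mean makes the rule conservative: a program has to be clearly better than average before it is labeled high quality.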
33
Experimental Results
o Table: Overall classification of program quality
o The estimated percentage of correct classifications is 90.9% (i.e., overall, 20 of the 22 programs were correctly identified as high or low quality programs using STREW).
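The reported percentage follows directly from the two counts given on the slide:

```latex
\[
\text{percentage correctly classified}
  = \frac{20}{22} \times 100\% \approx 90.9\%
\]
```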
34
Conclusion
o From the experiment: 20 of the 22 programs were correctly identified as high or low quality programs.
o Feedback on the potential field quality of software is very useful to developers because it helps identify weaknesses and faults in the software that require fixing.
35
CONCLUSION
o In most production environments, field quality is measured too late to affordably guide significant corrective actions.
o An in-process testing metric suite for providing an early warning regarding post-release field quality, measured by black box test failures/KLOC, and for identifying low quality programs.
36
CONCLUSION (cont'd)
o The STREW metric suite is a practical approach to measuring software post-release field quality.
o STREW-based logistic regression analysis is a feasible technique for detecting low quality programs.
37
ANY QUESTIONS!?
o Why? What is it that you didn't understand?
o Are you sure you read the article?
o If you read and didn't understand, what makes you think that we understood?
o Go easy on us!!