Published by Owen Pearson. Modified over 6 years ago.
1
Presentation on the article: Identifying effective software metrics using genetic algorithms
Presenter: Randy Hunt
Presenter: Vitaliy Krestnikov
Date: April 27, 2009
Course: COMP 589
9/23/2018
2
Introduction
Team leaders commonly use software metrics as a measure of the overall quality of a system's design and of its eventual implementation. Predicting the quality of a software object from a set of software metrics is, in essence, a classification problem.
3
Classification
Take a set of objects with known features (software metrics).
Combine them with group labels (quality rankings).
The result is a classifier that can predict the quality of new objects using only the computed metrics.
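The idea of turning known features plus group labels into a classifier can be sketched in a few lines. This is only an illustration: it uses a nearest-centroid rule as a simple stand-in (not the LDA classifier discussed later in the presentation), and the metric values and the names `train_centroids` and `predict` are invented for the example.

```python
from collections import defaultdict
from math import dist

def train_centroids(features, labels):
    """Average the feature vectors of each group to get one centroid per label."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def predict(centroids, vector):
    """Assign a new object to the group whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))

# Hypothetical objects described by two metrics (values invented for illustration).
known_features = [[120, 40], [150, 55], [900, 10], [1100, 5]]
quality_labels = ["high", "high", "low", "low"]

centroids = train_centroids(known_features, quality_labels)
print(predict(centroids, [130, 50]))  # a new object, classified from its metrics only
```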
4
Software Metrics
Software metrics map a set of numerical values, such as the number of lines of code in a file or the number of methods in a class, to a subjective measure of quality in terms of apparent complexity, maintainability and usability. Not all metrics provide the same classification power, and different combinations of metrics can yield different results.
5
Proposition
This article proposes using a genetic algorithm (GA) feature selection procedure to identify the best metrics to use in the classification process. To test this proposal, the EvIdent software was used.
6
Software Metrics
All 338 software objects in EvIdent were subjectively labeled by an experienced software architect in terms of maintainability. Each Java class was ranked as low, medium-low, medium or high. High represents easy to modify; low represents difficult to modify.
7
Software Metrics
There were 16 different software metrics used:
LOC, SLC, CLC, WLC, RCC, RCS, SMC, MET, ANL, CAN, AE, ALC, ASC, ASL, ACC, AEC
8
The Genetic Algorithm
9
Genetic Algorithm step 1: initialize population
Population of genes: each chromosome is a software metric.
[Figure: bit table with one row per gene (gene #1 to gene #5) and one column per chromosome, labeled with software metrics (TYP, LOC, …, SMC, ANL, CAN, …); each cell holds one bit.]
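A population like the one in the figure can be sketched as a list of bit lists. The metric names come from the earlier slide; reading a 1 as "this metric is used" is an assumption about the encoding, and `init_population` is a name chosen for the example.

```python
import random

# The 16 metric abbreviations listed earlier in the presentation.
METRICS = ["LOC", "SLC", "CLC", "WLC", "RCC", "RCS", "SMC", "MET",
           "ANL", "CAN", "AE", "ALC", "ASC", "ASL", "ACC", "AEC"]

def init_population(size, n_bits, rng):
    """Create `size` genes, each a random list of `n_bits` zeros and ones."""
    return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(size)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = init_population(5, len(METRICS), rng)

for number, gene in enumerate(population, start=1):
    # Assumption: a 1 bit means the corresponding metric is selected.
    selected = [name for name, bit in zip(METRICS, gene) if bit]
    print(f"gene #{number}", gene, selected)
```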
10
2. Begin the algorithm* for creating offspring for generation N, starting with generation 1
* The algorithm is shown on the following slides
11
3. Calculate fitness by LDA*
4. Select pair based on fitness
* LDA is explained later in this presentation
[Figure: the population table (gene #1 to gene #5, one bit per software metric) with a fitness column.]
Fitness (LDA %): gene #1: 44, gene #2: 63, gene #3: 37, gene #4: 67, gene #5: 50
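The slides do not say how the pair is chosen "based on fitness". One common option is fitness-proportional (roulette-wheel) selection, sketched here as an assumption using the fitness values from the figure; `select_pair` is a hypothetical helper.

```python
import random

def select_pair(fitness, rng):
    """Pick two distinct gene indices, each with probability proportional to fitness."""
    indices = list(range(len(fitness)))
    first = rng.choices(indices, weights=fitness, k=1)[0]
    # Remove the first parent so the same gene is not picked twice.
    remaining = [i for i in indices if i != first]
    second = rng.choices(remaining, weights=[fitness[i] for i in remaining], k=1)[0]
    return first, second

fitness = [44, 63, 37, 67, 50]  # LDA % for gene #1 .. gene #5, from the slide
rng = random.Random(1)
a, b = select_pair(fitness, rng)
print(f"selected gene #{a + 1} and gene #{b + 1}")
```

Fitter genes are more likely to be chosen, but weaker genes still have a chance, which keeps some diversity in the population.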
12
5. Produce a child gene by swapping bits, starting from a randomly picked crossover point
[Figure: gene #2 and gene #4 shown bit by bit, with the crossover point marked.]
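A minimal sketch of single-point crossover, assuming the child takes its bits from the first parent up to the crossover point and from the second parent after it. The parent bit patterns are invented for the example.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Return a child built from parent_a before the point and parent_b from it on."""
    # The point is chosen so that each parent contributes at least one bit.
    point = rng.randrange(1, len(parent_a))
    child = parent_a[:point] + parent_b[point:]
    return child, point

rng = random.Random(2)
gene_2 = [1, 1, 0, 0, 1, 0, 1, 1]  # invented bit patterns
gene_4 = [0, 1, 1, 0, 0, 1, 0, 1]
child, point = crossover(gene_2, gene_4, rng)
print("crossover point:", point)
print("child:", child)
```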
13
6. Mutate each child bit for which a random number falls below the control parameter*
* The control parameter (the mutation probability) should be small (e.g. 10%)
[Figure: the child gene after crossover and the same gene after mutation, shown bit by bit.]
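Mutation can be sketched as an independent coin flip per bit. This sketch reads the control parameter as the per-bit mutation probability, so a bit is flipped when a random draw falls below it and a 10% setting changes roughly one bit in ten; `mutate` is a name chosen for the example.

```python
import random

def mutate(child, mutation_rate, rng):
    """Flip each bit independently with probability `mutation_rate`."""
    return [1 - bit if rng.random() < mutation_rate else bit for bit in child]

rng = random.Random(3)
child = [1, 1, 0, 0, 0, 1, 0, 1]   # invented bit pattern
mutated = mutate(child, 0.10, rng)  # 10%, the example value from the slide
print("before:", child)
print("after: ", mutated)
```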
14
7. Insert the child into the population, replacing the least fit gene
[Figure: the population table with gene #3 replaced by the child.]
Fitness (LDA %): gene #1: 44, gene #2: 63, child: N/A, gene #4: 67, gene #5: 50
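The replacement step can be sketched as follows, using the fitness values from the earlier figure. The bit patterns are invented, `insert_child` is a hypothetical helper, and `None` plays the role of the "N/A" fitness of the new child.

```python
def insert_child(population, fitness, child):
    """Overwrite the least fit gene with the child and return its position."""
    worst = min(range(len(fitness)), key=fitness.__getitem__)
    population[worst] = child
    fitness[worst] = None  # not evaluated yet; computed on the next pass through step 3
    return worst

population = [[0, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1], [1, 1, 1]]  # invented genes
fitness = [44, 63, 37, 67, 50]  # LDA % for gene #1 .. gene #5, from the slide
position = insert_child(population, fitness, [1, 0, 0])
print(f"child replaced gene #{position + 1}")
print(fitness)
```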
15
8. Return to step 3 and repeat this process until one generation has reproduced.
* A control parameter, the number of elite genes (those which survive to the next generation), determines when one generation is complete.
16
9. Return to step 2 and repeat for the next generation, until N generations have reproduced.
* A control parameter, the number of generations, determines when this loop terminates.
17
Control parameters for the GA
Number of genes in the population
Number of generations
Percent of elite genes (those that survive to the next generation)
The probability of mutations
* In the previous example, we have a very small population and only one reproduction was demonstrated. There are many reproductions per generation.
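The four control parameters and the two nested loops (reproductions within a generation, generations within a run) can be put together in one sketch. Several things here are assumptions rather than the article's method: the elite fraction is read as setting how many reproductions make up one generation, parents are picked in proportion to fitness, and `toy_fitness` is a placeholder objective standing in for the LDA accuracy.

```python
import random
from dataclasses import dataclass

@dataclass
class GAParams:
    population_size: int = 20    # number of genes in the population
    generations: int = 30        # number of generations
    elite_fraction: float = 0.2  # share of genes not replaced in a generation (assumed reading)
    mutation_rate: float = 0.1   # per-bit probability of mutation

def toy_fitness(gene):
    """Placeholder objective: counts bits that match an arbitrary target pattern."""
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(1 for bit, want in zip(gene, target) if bit == want)

def run_ga(params, n_bits, fitness_fn, rng):
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(params.population_size)]
    n_elite = max(1, int(params.elite_fraction * params.population_size))
    for _ in range(params.generations):
        scores = [fitness_fn(gene) for gene in population]
        # One generation: repeat selection, crossover, mutation and insertion.
        for _ in range(params.population_size - n_elite):
            weights = [score + 1e-9 for score in scores]  # avoid an all-zero weight list
            a, b = rng.choices(range(len(population)), weights=weights, k=2)
            point = rng.randrange(1, n_bits)
            child = population[a][:point] + population[b][point:]
            child = [1 - bit if rng.random() < params.mutation_rate else bit
                     for bit in child]
            worst = min(range(len(scores)), key=scores.__getitem__)
            population[worst] = child
            scores[worst] = fitness_fn(child)
    return max(population, key=fitness_fn)

best = run_ga(GAParams(), 8, toy_fitness, random.Random(0))
print(best, toy_fitness(best))
```

For simplicity this sketch may draw the same gene twice as both parents; a stricter version would force two distinct parents.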
19
Linear Discriminant Analysis (LDA)
20
Computing “Fitness” using LDA
[Figure: three Java objects feed an objective function using LDA, which outputs a fitness value.]
Java object: Zoo. Quality ranking: low. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc.
Java object: Bar. Quality ranking: low. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc.
Java object: Foo. Quality ranking: high. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc.
* This shows computing fitness for only one gene (set of SW metrics)
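The fitness of one gene can be sketched as the percentage of objects classified correctly when only the metrics selected by that gene are used. This is a simplified illustration: a nearest-centroid rule stands in for LDA, accuracy is measured on the same objects used for training, and the data and function names are invented.

```python
from math import dist

def project(row, gene):
    """Keep only the metric values whose bit is set in the gene."""
    return [value for value, bit in zip(row, gene) if bit]

def nearest_centroid_label(train_rows, train_labels, row):
    """Stand-in classifier (not LDA): return the label of the closest group mean."""
    groups = {}
    for train_row, label in zip(train_rows, train_labels):
        groups.setdefault(label, []).append(train_row)
    centroids = {label: [sum(column) / len(rows) for column in zip(*rows)]
                 for label, rows in groups.items()}
    return min(centroids, key=lambda label: dist(centroids[label], row))

def fitness(gene, rows, labels):
    """Percent of objects classified correctly using only the selected metrics."""
    if not any(gene):
        return 0.0  # a gene that selects no metric cannot classify anything
    reduced = [project(row, gene) for row in rows]
    hits = sum(nearest_centroid_label(reduced, labels, row) == label
               for row, label in zip(reduced, labels))
    return 100.0 * hits / len(rows)

# Invented data: the first metric separates the groups, the second does not.
rows = [[1, 50], [2, 10], [9, 48], [10, 14]]
labels = ["high", "high", "low", "low"]
print(fitness([1, 0], rows, labels))  # only the first metric selected
print(fitness([0, 1], rows, labels))  # only the second metric selected
```

A gene that selects the informative metric scores higher, which is what lets the GA steer the population toward useful metric subsets.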
21
[Figure: scatter plot of software objects with two SW metrics (“known features”), LOC and TYP, as axes; the points form clusters for the groups high, medium, medium-low and low.]
* In reality, we can have up to 16 dimensions (only 2 shown here)
22
Aspects of LDA function logic
For a point on the previous graph, the LDA algorithm allocates it to the group under which it has the greatest probability, given the distribution of each group. The prior probability of each group (typically estimated from the group's share of the training objects) is also a factor.
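The allocation rule can be illustrated with a one-dimensional LDA, assuming each group is normally distributed with its own mean and a shared (pooled) variance. The score combines how well a value fits a group with the log of that group's prior; the LOC values and function names below are invented.

```python
from math import log

def fit_lda_1d(values, labels):
    """Estimate group means, group priors and the pooled variance."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n = len(values)
    means = {label: sum(vs) / len(vs) for label, vs in groups.items()}
    priors = {label: len(vs) / n for label, vs in groups.items()}
    # LDA assumes all groups share one variance, estimated from all objects.
    pooled = sum((v - means[y]) ** 2 for v, y in zip(values, labels)) / (n - len(groups))
    return means, priors, pooled

def allocate(x, means, priors, pooled):
    """Return the group with the largest linear discriminant score for x."""
    def score(label):
        mean = means[label]
        return x * mean / pooled - mean ** 2 / (2 * pooled) + log(priors[label])
    return max(means, key=score)

# Invented LOC values for five objects in two quality groups.
loc = [100, 120, 140, 400, 420]
quality = ["high", "high", "high", "low", "low"]
means, priors, pooled = fit_lda_1d(loc, quality)

print(allocate(150, means, priors, pooled))
print(allocate(400, means, priors, pooled))
print(allocate(265, means, priors, pooled))  # midway between the two group means
```

The value 265 lies exactly halfway between the two group means, so the fit terms cancel and the decision is made by the priors alone: the group with more training objects wins.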
23
Results
24
Top 5
These are the 6 metrics that were common to the top 5 genes:
SLC, WLC, RCC, AE, ASL, ACC
25
Conclusion
The metrics selected by the GA appear to indicate that easy-to-read code, together with comments, helps developers understand the purpose of the code.