Metrics.

Presentation on theme: "Metrics."— Presentation transcript:

1 Metrics

2 Learning objectives
Software metrics
Metrics for the various phases
Why metrics are needed
How to collect metrics
How to use metrics

3 Questions
How big is the program? Huge!!
How close are you to finishing? We are almost there!!
Can you, as a manager, make any useful decisions from such subjective information? You need quantitative information such as the cost, effort, and size of the project.

4 Metrics
Quantifiable measures that can be used to characterize a software system or the software development process.
Required in all phases.
Required for effective management: managers need quantifiable information, not subjective information. (Subjective information goes against the fundamental goal of engineering.)

5 Overview
Software process and project metrics are quantitative measures that enable software engineers to gain insight into the efficiency of the software process and the projects conducted using the process framework. In software project management, we are primarily concerned with productivity and quality metrics. There are four reasons for measuring software processes, products, and resources: to characterize, to evaluate, to predict, and to improve.

6 SOFTWARE METRICS FOR PROCESS AND PROJECTS
Software process metrics and project metrics are quantitative measures that enable software professionals to gain insight into the efficacy of the software process and of the projects that are conducted using the process as a framework.

7 SOFTWARE METRICS FOR PROCESS AND PROJECTS
Quality and productivity data are collected, analysed, and compared against past averages to:
Pinpoint/uncover problem areas before they "go critical"
Track potential risks
Assess productivity improvements
Assess the status of an ongoing project
Adjust work flow or tasks
Evaluate the project team's ability to control the quality of software work products

8 Measures that are collected by a project team and converted into metrics for use during a project can also be transmitted to those responsible for software process improvement. For this reason, many of the same metrics are used in both the process and project domains.

9 Definitions
Measure: a quantitative indication of the extent, amount, dimension, capacity, or size of some attribute of a product or process. E.g., number of errors.
Metric: a quantitative measure of the degree to which a system, component, or process possesses a given attribute; "a handle or guess about a given attribute." E.g., number of errors found per person-hour expended.

10 Why Measure Software? Determine the quality of the current product or process Predict qualities of a product/process Improve quality of a product/process

11 Motivation for Metrics
Estimate the cost & schedule of future projects Evaluate the productivity impacts of new tools and techniques Establish productivity trends over time Improve software quality Forecast future staffing needs Anticipate and reduce future maintenance needs

12 Example Metrics
Defect rates and error rates, measured per individual and per module during development. Errors should be categorized by origin, type, and cost.

13 Metric Classification
Products: explicit results of software development activities (deliverables, documentation, by-products)
Processes: activities related to the production of software
Resources: inputs into the software development activities (hardware, knowledge, people)

14 Product vs. Process
Process metrics: give insight into the process paradigm, software engineering tasks, work products, or milestones; lead to long-term process improvement.
Product metrics: assess the state of the project; track potential risks; uncover problem areas; adjust workflow or tasks; evaluate the team's ability to control quality.

15 Types of Measures
Direct measures (internal attributes): cost, effort, LOC, speed, memory
Indirect measures (external attributes): functionality, quality, complexity, efficiency, reliability, maintainability

16 Classification of software metrics
Product metrics: quantify characteristics of the product being developed (size, reliability)
Process metrics: quantify characteristics of the process being used to develop the software (efficiency of fault detection)
Resource metrics: quantify the inputs into the software development activities (hardware, knowledge, people)

17 Issues
Cost of collecting metrics: automation is less costly than manual methods, but a CASE tool may not be free (development cost of the tool, extra execution time for collecting metrics), and interpreting metrics also consumes resources.
Validity of metrics: does the metric really measure what it should? What exactly should be measured?

18 Issues (contd.)
Selection of metrics for measurement: hundreds are available, each with some cost.
Basic metrics: size (e.g., LOC), cost (in $$$), duration (months), effort (person-months), quality (number of faults detected).

19 Selection of metrics
Identify problems from the basic metrics (e.g., high fault rates during the coding phase)
Introduce a strategy to correct the problems
To monitor success, collect more detailed metrics (e.g., fault rates of individual programmers)

20 Utility of metrics
LOC: take the size of the product at regular intervals and find out how fast the project is growing.
What if the number of defects per 1000 LOC is high? Then even if the LOC count is high, most of the code has to be thrown away.

21 Applicability of metrics
Throughout the software process: effort in person-months, staff turnover, cost
Specific to a phase: LOC, # defects detected per hour of reviewing specifications

22 Metrics: planning
When can we plan the entire software project? At the very beginning? After a rapid prototype is made? After the requirements phase? After the specifications are ready? Sometimes there is a need to do it early.

23 Metrics: planning
[Figure: relative range of the cost estimate (y-axis, roughly 4x down to 1x) versus the phase during which the estimate is made (requirements, specifications, design, implementation, integration). The range narrows as the project progresses.]

24 Planning: cost estimation
The client wants to know: how much will I have to pay?
Problems with underestimation (possible loss by the developer) and with overestimation (the client may offer the bid to someone else).
Cost: internal (salaries of personnel, overheads) and external (usually cost + profit).

25 Cost estimation
Other factors: if desperate for work, the developer may charge less; the client may think that low cost implies low quality, so the developer raises the amount.
Too many variables: human factors (quality and experience of the programmers; what if someone leaves midway?), size of the product.

26 Planning: Duration estimation
Problems with underestimation: unable to keep to the schedule, leading to loss of credibility; possible penalty clauses.
Problems with overestimation: the client may go to other developers.
Difficult for reasons similar to those for cost estimation.

27 Metrics: planning - size of product
Units for measurement: LOC = lines of code; KDSI = thousand delivered source instructions.
Problems: creation of code is only a part of the total effort; the language used affects the LOC count; and how should one count LOC? Executable lines only? Data definitions? Comments? What are the pros and cons?
DSI is the primary input to many tools for estimating software cost. The term "delivered" is generally meant to exclude non-delivered support software such as test drivers. However, if these are developed with the same care as delivered software, with their own reviews, test plans, documentation, etc., then they should be counted. The "source instructions" include all program instructions created by project personnel and processed into machine code by some combination of preprocessors, compilers, and assemblers. They exclude comments and unmodified utility software, and include job control language, format statements, and data declarations.
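The counting questions above can be made concrete. A minimal sketch of a hypothetical LOC counter (Python chosen for illustration; the conventions and names are assumptions, not a standard tool), showing how the choice of convention changes the reported size:

```python
# Hypothetical LOC counter: the conventions counted here are illustrative.
def count_loc(source: str) -> dict:
    """Count lines of a source file under several conventions."""
    physical = blank = comment = 0
    for line in source.splitlines():
        stripped = line.strip()
        physical += 1
        if not stripped:
            blank += 1
        elif stripped.startswith("#"):   # full-line comments only
            comment += 1
    # NCSS: non-comment, non-blank source lines
    return {"physical": physical, "blank": blank,
            "comment": comment, "ncss": physical - blank - comment}

sample = """# compute factorial
def fact(n):
    if n == 0:
        return 1

    return n * fact(n - 1)
"""
counts = count_loc(sample)   # physical=6, ncss=4
```

The same file thus yields 6 or 4 "lines of code" depending on the convention, which is exactly why the counting rule must be fixed before LOC figures are compared.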

28 Problems with lines of code
More on how to count: job control language statements? What if lines are changed or deleted? What if code is reused? Not all code is delivered to clients (some code may be for tool support). What if you are using a code generator?
Early on, you can only estimate the lines of code. So the cost estimate is based on another estimated quantity!!!
Job control language (JCL): a programming language used to specify the manner, timing, and other requirements of execution of a task or set of tasks submitted for execution, especially in the background, on a multitasking computer; a programming language for controlling job execution.

29 Estimating size of product
FFP metric for cost estimation of medium-scale products: files, flows, and processes (FFP).
File: a collection of logically or physically related records that are permanently resident.
Flow: a data interface between the product and the environment.
Process: a functionally defined logical or arithmetic manipulation of data.
Size: S = #Files + #Flows + #Processes. Cost: C = d × S, where d is a constant calibrated from past projects.
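A minimal sketch of the FFP computation; the counts and the value of d below are hypothetical (d must be calibrated from an organization's own past projects):

```python
def ffp_size(files: int, flows: int, processes: int) -> int:
    """FFP size: S = #Files + #Flows + #Processes."""
    return files + flows + processes

def ffp_cost(files: int, flows: int, processes: int, d: float) -> float:
    """Cost C = d * S; d is an organization-specific constant."""
    return d * ffp_size(files, flows, processes)

# Hypothetical medium-scale product: 4 files, 10 flows, 6 processes
s = ffp_size(4, 10, 6)            # S = 20
c = ffp_cost(4, 10, 6, d=1000.0)  # C = d * S = 20000.0
```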

30 Techniques of cost estimation
Take into account the following: skill levels of the programmers; complexity of the project; size of the project; familiarity of the development team; hardware; availability of CASE tools; the deadline effect.

31 Techniques of cost estimation
Expert judgement by analogy
Bottom-up approach
Algorithmic cost estimation models: based on mathematical theories (resource consumption during software development obeys a specific distribution); based on statistics (a large number of projects are studied); hybrid models (mathematical models, statistics, and expert judgement).
Bottom-up estimating is a technique in which the people who are going to do the work participate in the estimating process: typically the project team members, who work with the project manager to develop estimates at the lowest level of the work breakdown structure. When the estimates of work, duration, and cost are set at that level, they can be aggregated upward into estimates of higher-level deliverables and of the project as a whole.

32 COCOMO
COnstructive COst MOdel: a series of three models.
Basic: a macro-estimation model
Intermediate COCOMO
Detailed: a micro-estimation model
Estimates total effort in terms of person-months. The cost of development, management, and support tasks is included; secretarial staff are not.

33 Intermediate COCOMO
Obtain an initial estimate (nominal estimate) of the development effort from the estimate of KDSI:
Nominal effort = a × (KDSI)^b person-months

System         a     b
Organic        3.2   1.05
Semi-detached  3.0   1.12
Embedded       2.8   1.20
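The nominal-effort formula and the table above can be sketched as follows (a sketch, using the (a, b) pairs from the table):

```python
# (a, b) pairs from the intermediate-COCOMO table above
COCOMO_PARAMS = {
    "organic":       (3.2, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (2.8, 1.20),
}

def nominal_effort(kdsi: float, mode: str) -> float:
    """Nominal development effort in person-months: a * KDSI**b."""
    a, b = COCOMO_PARAMS[mode]
    return a * kdsi ** b

effort = nominal_effort(3.0, "organic")   # ~10.14 person-months
```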

34 Kinds of systems
Organic: the organization has considerable experience in the area; requirements are less stringent; small teams; simple business systems, data-processing systems.
Semi-detached: a new operating system, a database management system, a complex inventory management system.

35 Kinds of systems
Embedded: ambitious, novel projects; the organization has little experience; stringent requirements for interfacing and reliability; tight constraints from the environment; embedded avionics systems, real-time command systems.

36 Intermediate COCOMO (contd)
Determine a set of 15 multiplying factors from different attributes of the project (cost driver attributes).
Adjust the effort estimate by multiplying the initial estimate by all the multiplying factors.
There is also a phase-wise distribution of effort.

37 Cost Driver Attributes

Cost Driver                               VL    Low   Nom   High  VH    XH
Product attributes
  Required software reliability           0.75  0.88  1.00  1.15  1.40
  Size of application database                  0.94  1.00  1.08  1.16
  Complexity of the product               0.70  0.85  1.00  1.15  1.30  1.65
Hardware attributes
  Run-time performance constraints                    1.00  1.11  1.30  1.66
  Memory constraints                                  1.00  1.06  1.21  1.56
  Volatility of the virtual machine env.        0.87  1.00  1.15  1.30
  Required turnabout time                       0.87  1.00  1.07  1.15

38 Cost Driver Attributes

Cost Driver                               VL    Low   Nom   High  VH
Personnel attributes
  Analyst capability                      1.46  1.19  1.00  0.86  0.71
  Applications experience                 1.29  1.13  1.00  0.91  0.82
  Software engineer capability            1.42  1.17  1.00  0.86  0.70
  Virtual machine experience              1.21  1.10  1.00  0.90
  Programming language experience         1.14  1.07  1.00  0.95
Project attributes
  Application of s/w engineering methods  1.24  1.10  1.00  0.91  0.82
  Use of software tools                   1.24  1.10  1.00  0.91  0.83
  Required development schedule           1.23  1.08  1.00  1.04  1.10

39 Determining the rating
Module complexity multiplier:
Very low: control operations consist of a sequence of constructs of structured programming
Low: nested operators
Nominal: intermodule control and decision tables
High: highly nested operators, compound predicates, stacks and queues
Very high: re-entrant and recursive coding, fixed-priority handling

40 COCOMO example
System for office automation; four major modules:
data entry: 0.6 KDSI
data update: 0.6 KDSI
data query: 0.8 KDSI
report generator: 1.0 KDSI
Total: 3.0 KDSI
Category: organic
Initial effort: 3.2 × (3.0)^1.05 ≈ 10.14 PM (PM = person-months)

41 COCOMO example (contd.)
From the requirements, the ratings were assessed:
Complexity: High, 1.15
Storage: High, 1.06
Experience: Low, 1.13
Programmer capability: Low, 1.17
Other ratings are nominal.
EAF = 1.15 × 1.06 × 1.13 × 1.17 = 1.61
Adjusted effort = 1.61 × 10.14 ≈ 16.3 PM
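The whole example can be checked numerically; a sketch reproducing the slides' arithmetic:

```python
# Reproducing the office-automation example from the slides (organic mode).
modules_kdsi = [0.6, 0.6, 0.8, 1.0]   # entry, update, query, report generator
total_kdsi = sum(modules_kdsi)        # 3.0 KDSI

a, b = 3.2, 1.05                      # organic-mode constants
nominal = a * total_kdsi ** b         # initial (nominal) effort, ~10.14 PM

# Effort adjustment factor: product of the four non-nominal multipliers
eaf = 1.15 * 1.06 * 1.13 * 1.17       # ~1.61
adjusted = eaf * nominal              # ~16.3 PM
```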

42 Metrics: requirements phase
Number of requirements that change during the rest of the software development process: if a large number changed during specification, design, etc., something is wrong in the requirements phase.
Metrics for rapid prototyping: are defect rates and mean time to failure useful? Knowing how often requirements change? Knowing the number of times features are tried?

43 Metrics: specification phase
Size of the specification document may predict the effort required for subsequent products.
What can be counted? The number of items in the data dictionary: number of files, number of data items, number of processes.
This is tentative information: a process in a DFD may later be broken down into different modules, and a number of processes may constitute one module.

44 Metrics: specification phase
Cost
Duration
Effort
Quality: number of faults found during inspection; rate at which faults are found (efficiency of inspection)

45 Metrics: Design Phase
Number of modules (a measure of the size of the target product)
Fault statistics
Module cohesion
Module coupling
Cyclomatic complexity
Fan-in, fan-out

COUPLING: an indication of the strength of the interconnections between program units. Highly coupled systems have program units that depend on each other; loosely coupled systems are made up of units that are independent or almost independent. Modules are independent if they can function completely without the presence of the other. Obviously, modules cannot be completely independent of each other: they must interact to produce the desired outputs. The more connections between modules, the more dependent they are, in the sense that more information about one module is required to understand the other. Three factors: the number of interfaces, the complexity of the interfaces, and the type of information flow along the interfaces.

COHESION: a measure of how well the parts of a module fit together. A component should implement a single logical function or a single logical entity, and all the parts should contribute to the implementation. "Coupling" describes the relationships between modules, and "cohesion" describes the relationships within them. A reduction in interconnectedness between modules (or classes) is therefore achieved via a reduction in coupling. On the other hand, well-designed modules (or classes) should have some purpose; all the elements should be associated with a single task. This means that in a good design, the elements within a module (or class) should have internal cohesion. Coupling between modules can arise for different reasons, some of which are more acceptable, or desirable, than others.
A ranked list [from least desirable to most desirable] might look something like the following:
Internal data coupling [one module modifying the internals of another]
Global data coupling [modules sharing global data]
Control or sequence coupling [one module controlling the sequence of events in another]
Parameter coupling [one module passing information to another through parameters]
Subclass coupling [one module inheriting from another]
As with coupling, cohesion can be ranked on a scale from the weakest (least desirable) to the strongest (most desirable):
Coincidental cohesion [elements are in the same module for no particular reason]
Logical cohesion [elements perform logically related tasks]
Temporal cohesion [elements must be used at approximately the same time]
Communication cohesion [elements share I/O]
Sequential cohesion [elements must be used in a particular order]
Functional cohesion [elements cooperate to carry out a single function]
Data cohesion [elements cooperate to present an interface to a hidden data structure]

46 Cyclomatic complexity
Proposed by McCabe: the number of binary decisions + 1, i.e., the number of branches in a module.
The lower the value, the better.
Measures only control complexity, not data complexity.
For OO, cyclomatic complexity is usually low because methods are mostly small; also, the data component is important for OO but is ignored by cyclomatic complexity.
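A rough sketch of counting binary decisions in Python source using the standard-library ast module. Exact counting rules (e.g. whether each and/or operator adds a branch) vary between tools, so treat this as one common approximation rather than a definitive implementation:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Binary decisions + 1, approximated from the Python AST.

    Counts if/while/for plus the extra branches introduced by and/or.
    """
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.While, ast.For)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # a BoolOp with k operands contributes k-1 decisions
            decisions += len(node.values) - 1
    return decisions + 1

code = """
def classify(n):
    if n < 0:
        return "negative"
    if n == 0 or n == 1:
        return "small"
    return "large"
"""
cc = cyclomatic_complexity(code)   # 2 ifs + 1 'or' + 1 = 4
```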

47 Architecture design as a directed graph
Fan-in of a module: the number of flows into the module plus the number of global data structures accessed by the module.
Fan-out of a module: the number of flows out of the module plus the number of data structures updated by the module.
Measure of complexity: length × (fan-in × fan-out)^2
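The complexity measure above (essentially Henry and Kafura's information-flow metric) is trivial to compute once fan-in and fan-out are known; the module figures below are hypothetical:

```python
def flow_complexity(length: int, fan_in: int, fan_out: int) -> int:
    """Information-flow complexity: length * (fan_in * fan_out)**2."""
    return length * (fan_in * fan_out) ** 2

# Hypothetical module: 100 lines, fan-in 3, fan-out 2
c = flow_complexity(length=100, fan_in=3, fan_out=2)   # 100 * 6**2 = 3600
```

Note how the squared term makes heavily connected modules dominate: doubling fan-out quadruples the complexity.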

48 OO design metrics Assumption: The effort in developing a class is determined by the number of methods. Hence the overall complexity of a class can be measured as a function of the complexity of its methods. Proposal: Weighted Methods per class (WMC)

49 WMC
Let class C have methods M1, M2, ..., Mn, and let Ci denote the complexity of method Mi. Then WMC = C1 + C2 + ... + Cn.
How should Ci be measured?
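Whatever measure is chosen for Ci, WMC itself is just the sum; a sketch with hypothetical per-method complexities (using, say, the cyclomatic complexity of each method as Ci):

```python
def wmc(method_complexities) -> int:
    """Weighted Methods per Class: the sum of the Ci over all methods."""
    return sum(method_complexities)

# Hypothetical class with five methods and their per-method complexities
w = wmc([1, 1, 2, 3, 5])   # WMC = 12
```

If every Ci is set to 1, WMC degenerates to a simple count of methods, which is itself a commonly used simplification.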

50 WMC validation Most classes tend to have a small number of methods, are simple, and provide some specific abstraction and operations. WMC metric has a reasonable correlation with fault-proneness of a class.

51 Depth of inheritance tree
The depth of a class in a class hierarchy determines its potential for reuse: deeper classes have a higher potential for reuse. But inheritance increases coupling, so changing classes becomes harder.
The depth of inheritance (DIT) of class C is the length of the path from the root of the inheritance tree to C. In the case of multiple inheritance, DIT is the maximum length of a path from the root to C.
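A sketch of computing DIT for Python classes by walking the base classes; object is treated as the implicit root (depth 0) and the hierarchy below is hypothetical:

```python
def dit(cls) -> int:
    """Depth of inheritance tree for a Python class.

    With multiple inheritance, the maximum path length is taken,
    matching the definition on the slide.
    """
    bases = [b for b in cls.__bases__ if b is not object]
    if not bases:
        return 0
    return 1 + max(dit(b) for b in bases)

# Hypothetical hierarchy
class Shape: pass
class Polygon(Shape): pass
class Triangle(Polygon): pass
```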

52 DIT evaluation
Basili et al. study, 1995; Chidamber and Kemerer study, 1994.
Most classes tend to be close to the root: the maximum DIT value found was 10, and most classes have DIT = 0.
DIT is significant in predicting the error-proneness of a class: higher DIT leads to higher error-proneness.

53 Metrics: implementation phase
Intuition: more complex modules are more likely to contain faults.
Redesigning complex modules may be cheaper than debugging complex faulty modules.
Measures of complexity: LOC (assume a constant probability of fault per LOC; empirical evidence: the number of faults is related to the size of the product).

54 Metrics: implementation phase
McCabe's cyclomatic complexity: essentially the number of branches in a module; the number of tests needed for branch coverage of the module; easily computed; in some cases, good for predicting faults.
Its validity has been questioned, both on theoretical grounds and experimentally.

55 Metrics: implementation phase
Halstead's software metrics:
Number of distinct operators in the module (+, -, if, goto)
Number of distinct operands
Total number of operators
Total number of operands
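From these four counts Halstead derives vocabulary, length, volume, difficulty, and effort; a sketch with hypothetical counts:

```python
import math

def halstead(n1: int, n2: int, N1: int, N2: int) -> dict:
    """Derived Halstead measures from the four basic counts.

    n1/n2: distinct operators/operands; N1/N2: total operators/operands.
    """
    vocabulary = n1 + n2                     # n
    length = N1 + N2                         # N
    volume = length * math.log2(vocabulary)  # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)        # D
    return {"vocabulary": vocabulary, "length": length,
            "volume": volume, "difficulty": difficulty,
            "effort": difficulty * volume}

m = halstead(n1=4, n2=8, N1=10, N2=16)   # hypothetical counts
```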

56 Metrics: implementation phase
A high correlation has been shown between LOC and other complexity metrics; complexity metrics provide little improvement over LOC.
A problem with the Halstead metrics for modern languages: is a constructor an operator? An operand?

57 Metrics: implementation and integration phase
Total number of test cases
Number of tests resulting in failure
Fault statistics: total number of faults; types of faults (misunderstanding the design, lack of initialization, inconsistent use of variables)
Statistics-based testing: the zero-failure technique

58 Zero failure technique
The longer a product is tested without a single failure being observed, the greater the likelihood that the product is free of faults. Assume that the chance of failure decreases exponentially as testing proceeds. Figure out the number of test hours required without a single failure occurring.

59 Metrics: inspections
Purpose: to measure the effectiveness of inspections; may reflect deficiencies of the development team or the quality of the code.
Measure fault density: faults per page (specification and design inspections); faults per KLOC (code inspections).
Fault detection rate: #faults / hour. Fault detection efficiency: #faults / person-hour.
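A minimal sketch of these inspection computations; the inspection figures below are hypothetical:

```python
def inspection_metrics(faults: int, kloc: float,
                       hours: float, person_hours: float) -> dict:
    return {
        "fault_density": faults / kloc,       # faults per KLOC
        "detection_rate": faults / hours,     # faults per inspection hour
        "efficiency": faults / person_hours,  # faults per person-hour
    }

# Hypothetical code inspection: 12 faults found in 1.5 KLOC during a
# 2-hour meeting with four inspectors (8 person-hours)
m = inspection_metrics(faults=12, kloc=1.5, hours=2.0, person_hours=8.0)
```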

60 Metrics: maintenance phase
Metrics related to the activities performed. What are they?
Specific metrics: total number of faults reported; classification by severity and fault type; status of fault reports (reported/fixed).

61 References
S. R. Schach, Classical and Object-Oriented Software Engineering (see "metrics" in the index).
P. Jalote, An Integrated Approach to Software Engineering (see "metrics" in the index).

