Measuring process attributes
Good Estimates
– Predictions are needed for software development decision-making (Figure 12.1).
– A prediction is useful only if it is reasonably accurate, that is, close enough to the actual value.
– A prediction (estimate) is a range or window, not a single number.
What is an Estimate
– A prediction (estimate) is a range or window, not a single number.
– It is a probabilistic assessment; "estimate" refers to the center of the range.
– Formal definition: the median of the distribution (Figure 12.2).
– Do not set the estimate as a target.
– An estimate should be presented as a triple: the median plus upper and lower bounds (a confidence interval).
Evaluating Estimation Accuracy
– Relative error (RE) in an estimate: $RE = \frac{\text{actual value} - \text{estimated value}}{\text{actual value}}$
– Mean RE for n projects: $\overline{RE} = \frac{1}{n} \sum_{i=1}^{n} RE_i$
Evaluating Estimation Accuracy
– Mean magnitude of RE for n projects: $\overline{MRE} = \frac{1}{n} \sum_{i=1}^{n} MRE_i$, where $MRE_i = |RE_i|$
– If the mean magnitude of RE is small, then our predictions are good.
– Conte, Dunsmore and Shen suggest that an acceptable level is $\overline{MRE} \le 0.25$.
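The RE, mean RE and mean magnitude of RE definitions can be sketched in a few lines of Python. This is only an illustration: the function names and the effort figures are hypothetical and do not come from the slides.

```python
from statistics import mean

def relative_error(actual: float, estimated: float) -> float:
    """RE = (actual - estimated) / actual; a positive value is an underestimate."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return (actual - estimated) / actual

def mean_re(actuals, estimates) -> float:
    """Mean RE over n projects; over- and underestimates can cancel out."""
    return mean(relative_error(a, e) for a, e in zip(actuals, estimates))

def mmre(actuals, estimates) -> float:
    """Mean magnitude of RE: the average of |RE_i|, so errors cannot cancel."""
    return mean(abs(relative_error(a, e)) for a, e in zip(actuals, estimates))

# Hypothetical effort data (person-months) for four past projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [90.0, 240.0, 50.0, 100.0]

print("mean RE:", mean_re(actuals, estimates))
print("mean magnitude of RE:", mmre(actuals, estimates))
```

On this made-up data the mean RE is closer to zero than the mean magnitude of RE, because the signed errors partly cancel; that is the reason the magnitude is used to judge accuracy.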
Evaluating Estimation Accuracy
– Measure of prediction quality: $PRED(q) = k/n$
– Out of n projects, k is the number whose magnitude of relative error is less than or equal to q.
– E.g., PRED(0.25) is the proportion of predicted values that fall within 25% of their actual values.
– Conte, Dunsmore and Shen suggest that an estimation technique is acceptable if PRED(0.25) is at least 0.75.
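A minimal sketch of PRED(q) as defined above, counting the projects whose magnitude of relative error is at most q. The function name and the sample data are hypothetical.

```python
def pred(actuals, estimates, q=0.25):
    """PRED(q) = k / n, where k projects have |RE| <= q out of n projects."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    k = sum(1 for a, e in zip(actuals, estimates) if abs((a - e) / a) <= q)
    return k / len(actuals)

# Hypothetical effort data (person-months) for four past projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [90.0, 240.0, 50.0, 110.0]

score = pred(actuals, estimates, q=0.25)
print("PRED(0.25) =", score)
print("meets the 0.75 threshold:", score >= 0.75)
```

Three of the four hypothetical estimates fall within 25% of their actual values, so this data set sits exactly on the acceptability threshold.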
Evaluating Estimation Accuracy
– DeMarco suggests the EQF (estimating quality factor) to assess the accuracy of the prediction process.
– Estimates are made repeatedly throughout the process as more information becomes known (Figure 12.3).
– Effectiveness of the estimating process = area of the hatched region divided by $D \times A$.
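The ratio on this slide can be illustrated with a short sketch. It rests on assumptions the slides do not state: D is taken to be the actual duration, A the actual final value, and each estimate is assumed to stay in force until the next revision, so the hatched region is a sum of rectangles. The function name and the numbers are hypothetical.

```python
def estimate_deviation_ratio(revisions, duration, actual):
    """Hatched area divided by D x A.

    revisions: list of (time, estimate) pairs sorted by time, starting at time 0.
    duration:  actual duration D (assumed meaning of D).
    actual:    actual final value A (assumed meaning of A).
    """
    if not revisions or revisions[0][0] != 0:
        raise ValueError("the first estimate must be made at time 0")
    hatched = 0.0
    for i, (start, estimate) in enumerate(revisions):
        # Each estimate is assumed to hold until the next revision or project end.
        end = revisions[i + 1][0] if i + 1 < len(revisions) else duration
        hatched += abs(estimate - actual) * (end - start)
    return hatched / (duration * actual)

# Hypothetical project: 10 months long, actual effort 100 person-months,
# estimated at 60, then revised to 80 at month 4 and to 100 at month 8.
ratio = estimate_deviation_ratio([(0, 60), (4, 80), (8, 100)], duration=10, actual=100)
print("hatched area / (D x A) =", ratio)
```

A smaller ratio means the estimates stayed closer to the actual value over the life of the project.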
Cost Estimation: Problems and Approaches
– Cost estimation normally refers to the likely amount of effort, time and staffing levels required to build software.
– The terms cost estimation and effort estimation are sometimes used interchangeably.
Problems with cost estimation
– The nature of the problem.
– Estimates are converted into targets, and estimation parameters are manipulated to fit an already-given outcome (price-to-win).
– Data collection is not encouraged, so there are no historical records on which to base judgments and predictions.
Current Approaches
Techniques for estimating effort and schedule:
– Expert opinion: the estimate is based on the experts' past experience.
– Analogy: identify a similar past project and adjust for the differences.
– Decomposition: divide and conquer.
– Models: use a model relating key inputs to effort.
Decomposition and modeling are preferred.
Current Approaches
– These techniques can be applied either bottom-up or top-down.
– Bottom-up estimation begins with the lowest-level parts of the product or task and provides an estimate for each.
– Top-down estimation begins with the overall process or product.
Models of Effort and Cost
Two types of models; the first is cost models:
– They provide direct estimates of effort or duration.
– They are often based on empirical data reflecting the factors that contribute to overall cost.
– Input consists of one primary input (size) and a number of secondary adjustment factors (cost drivers: characteristics that are expected to influence effort or duration).
– E.g., COCOMO.
Models of Effort and Cost
Two types of models; the second is constraint models:
– They demonstrate the relationship over time between two or more of the parameters effort, duration and staffing level.
– E.g., the Rayleigh curve (Figure 12.4).
Regression-based models
– One of the models used is $E = aS^b$, where E is effort, S is size, and a and b are parameters estimated by regression techniques (see Figure).
– Next step: identify the factors that cause variation between predicted and actual effort (e.g. the experience of the developers). In this way an effort adjustment factor F is obtained.
– The unadjusted result is then multiplied by this factor to give the adjusted effort, $E = aS^bF$.
– F is the product of the cost driver values.
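One common way to estimate a and b is ordinary least squares on the logarithms, since $\log E = \log a + b \log S$ is linear. The sketch below assumes that approach; the slides do not say which regression technique is meant, and the function name and data are hypothetical.

```python
import math

def fit_power_law(sizes, efforts):
    """Fit E = a * S**b by least squares on log E = log a + b * log S."""
    if len(sizes) != len(efforts) or len(sizes) < 2:
        raise ValueError("need at least two (size, effort) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all sizes are identical; b cannot be estimated")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)  # intercept, transformed back
    return a, b

# Hypothetical history: sizes in KLOC and efforts in person-months.
sizes = [10.0, 20.0, 40.0, 80.0]
efforts = [32.0, 70.0, 150.0, 310.0]

a, b = fit_power_law(sizes, efforts)
print(f"E = {a:.2f} * S^{b:.2f}")
```

Fitting in log space minimizes relative rather than absolute deviations, which suits effort data that spans a wide range of project sizes.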
COCOMO
– In the 1970s, Boehm derived the constructive cost model (COCOMO).
– The original COCOMO is a collection of three models: a basic model to be applied early, an intermediate model to be applied after the requirements are specified, and an advanced model to be used when the design is complete.
Original COCOMO: Effort
– All three models have the form $E = aS^bF$, where E is effort in person-months, S is size in thousands of delivered source instructions (KDSI), and F is an adjustment factor (equal to 1 in the basic model).
– The parameters a and b depend on the type of software: organic (data processing), embedded (real-time software within a larger, hardware-based system) or semi-detached (a blend of these); see Table 12.2.
Original COCOMO: Effort
– Example 12.7.
– There are 15 independent adjustment factors (see Table 12.3), i.e. $F = F_1 \times F_2 \times \dots \times F_{15}$, used in the intermediate and advanced models.
– The advanced model applies the intermediate model at the component level; a phase-based model is then used to build up an estimate for the complete project.
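The effort equation $E = aS^bF$ can be sketched as follows. The coefficients are the widely published basic-model values from Boehm (1981) and are an assumption here, since Table 12.2 is not reproduced in these notes. The intermediate model has its own nominal coefficients, so combining the basic values with cost drivers is for illustration only. The function name and the driver values are hypothetical.

```python
import math

# Assumed basic-COCOMO (a, b) pairs; Table 12.2 may list different values.
BASIC_COEFFS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}

def cocomo_effort(kdsi, mode, cost_drivers=()):
    """Effort in person-months: E = a * S**b * F, with F the product of drivers."""
    a, b = BASIC_COEFFS[mode]
    f = math.prod(cost_drivers)  # an empty sequence gives F = 1 (basic model)
    return a * kdsi ** b * f

# A 32 KDSI organic project with no adjustment (F = 1).
print("unadjusted effort:", round(cocomo_effort(32, "organic"), 1), "person-months")

# The same project with two hypothetical cost driver values.
print("adjusted effort:", round(cocomo_effort(32, "organic", [1.15, 0.9]), 1), "person-months")
```

Because F is a plain product, a single driver above 1 raises the whole estimate proportionally, which is why the independence of the drivers matters.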
Original COCOMO: Duration
– For duration, the model $D = aE^b$ is used, where D is duration in months and E is effort in person-months; the parameters a and b are given in a table.
– Example 12.8.
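A minimal sketch of the duration equation $D = aE^b$. The coefficients are the commonly cited basic-COCOMO schedule values from Boehm (1981), assumed here because the table is not reproduced in these notes; the function name is hypothetical.

```python
# Assumed basic-COCOMO schedule (a, b) pairs; the table in the text may differ.
DURATION_COEFFS = {
    "organic": (2.5, 0.38),
    "semi-detached": (2.5, 0.35),
    "embedded": (2.5, 0.32),
}

def cocomo_duration(effort_pm, mode):
    """Duration in months: D = a * E**b, with E in person-months."""
    a, b = DURATION_COEFFS[mode]
    return a * effort_pm ** b

effort = 91.0  # person-months, e.g. the result of an effort estimate
months = cocomo_duration(effort, "organic")
print("duration:", round(months, 1), "months")
print("average staffing:", round(effort / months, 1), "people")
```

Dividing effort by duration gives an average staffing level, which links the two COCOMO equations to the staffing question raised under cost estimation.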
COCOMO 2.0
– An updated, three-stage version, COCOMO 2.0, has since been presented.
– It is based on the three major stages of any development project.
– Stage 1: the project builds prototypes to resolve high-risk issues involving user interfaces, interactions, etc. Size is estimated in object points, based on the number of screens, reports and third-generation language components (refer to pages 265 and 266); reuse is taken into account.
COCOMO 2.0
– Stage 2: function points are employed as the measure of size. Function points estimate the functionality captured in the requirements.
– Stage 3: development has begun, so LOC can be used as the measure of size.
– Other differences between the stages can be seen in the accompanying table.
– COCOMO 2.0 incorporates reuse and takes into account maintenance and breakage.
Putnam's SLIM Model
– Putnam's model assumes that the effort for software development projects is distributed similarly to a collection of Rayleigh curves, one for each major development activity (see Figure).
– It was constructed for US Army use in 1978 to cover projects exceeding 70 KLOC.
Putnam's SLIM Model
– Like the COCOMO model, it is based on empirical studies.
– It is derived from the basic Rayleigh formula: $S = C K^{1/3} t_d^{4/3}$
– Here S is size in LOC, C is a technology factor, K is total project effort in person-years (including maintenance), and $t_d$ is the elapsed time to delivery in years (in theory, the point at which the Rayleigh curve reaches its maximum).
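The remark that $t_d$ is where the Rayleigh curve peaks can be illustrated with the standard Norden-Rayleigh staffing curve, $m(t) = 2Kat\,e^{-at^2}$ with $a = 1/(2t_d^2)$. That formula is not given in the slides and is an assumption here; the function names and numbers are hypothetical.

```python
import math

def rayleigh_staffing(t, total_effort, t_peak):
    """Staffing rate m(t) = 2*K*a*t*exp(-a*t**2), which peaks at t = t_peak."""
    a = 1.0 / (2.0 * t_peak ** 2)
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)

def rayleigh_cumulative(t, total_effort, t_peak):
    """Effort spent up to time t: K * (1 - exp(-a*t**2))."""
    a = 1.0 / (2.0 * t_peak ** 2)
    return total_effort * (1.0 - math.exp(-a * t * t))

K = 100.0  # hypothetical total effort in person-years
t_d = 2.0  # hypothetical delivery time in years

for year in (0.5, 1.0, 2.0, 3.0, 4.0):
    rate = rayleigh_staffing(year, K, t_d)
    spent = rayleigh_cumulative(year, K, t_d)
    print(f"t = {year:.1f}  staffing = {rate:5.1f}  effort spent = {spent:5.1f}")
```

Under this form only part of the total effort K has been spent by the time of peak staffing, which fits the slide's note that K includes maintenance.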
Putnam's SLIM Model
– The model can be used to assess the effect of varying the delivery date on the total effort needed to complete the project.
– E.g., for a 10% decrease in elapsed time with S and C unchanged: $S = C K^{1/3} t_d^{4/3} = C K'^{1/3} (0.9\, t_d)^{4/3}$, which gives $K'/K = 0.9^{-4} \approx 1.52$, i.e. a 52% increase in total life-cycle effort.
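With S and C held fixed, the size equation implies that $K t_d^4$ is constant, so scaling the delivery time by a factor x scales effort by $x^{-4}$. The sketch below tabulates that trade-off; the function name is hypothetical.

```python
def effort_ratio(schedule_factor):
    """K'/K when delivery time becomes schedule_factor * t_d, with S and C fixed."""
    if schedule_factor <= 0:
        raise ValueError("schedule factor must be positive")
    return schedule_factor ** -4

for factor in (0.8, 0.9, 1.0, 1.1, 1.2):
    change = (effort_ratio(factor) - 1.0) * 100.0
    print(f"t_d x {factor:.1f} -> effort changes by {change:+.0f}%")
```

The fourth-power relationship means small schedule compressions are expensive, while a modest extension reduces the predicted effort substantially.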
Putnam's SLIM Model
– To estimate effort and duration, Putnam introduces $D_0 = K / t_d^3$.
– $D_0$ is the manpower acceleration constant: 12.3 for new software with many interfaces and interactions with other systems, 15 for stand-alone systems, and 27 for re-implementations of existing systems.
Putnam's SLIM Model
– From the two equations we can derive $K = (S/C)^{9/7} D_0^{4/7}$.
– SLIM uses separate Rayleigh curves for design and code, test and validation, maintenance, and management.
– Requirements specification is not included.
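The derived formula can be sketched together with the delivery time it implies, $t_d = (K/D_0)^{1/3}$, which follows from $D_0 = K/t_d^3$. The function name and the technology factor used below are hypothetical; only the value 15 for $D_0$ comes from the slides.

```python
def slim_effort_and_time(size_loc, tech_factor, d0):
    """Return (K, t_d): effort in person-years and delivery time in years.

    K   = (S / C)**(9/7) * D0**(4/7)
    t_d = (K / D0)**(1/3), from D0 = K / t_d**3
    """
    k = (size_loc / tech_factor) ** (9 / 7) * d0 ** (4 / 7)
    t_d = (k / d0) ** (1 / 3)
    return k, t_d

# Hypothetical stand-alone system (D0 = 15) of 100,000 LOC with an
# illustrative technology factor C = 5000.
k, t_d = slim_effort_and_time(100_000, 5_000, 15)
print(f"K = {k:.0f} person-years, t_d = {t_d:.2f} years")

# Consistency check: substituting back should reproduce the size.
print("size from S = C * K^(1/3) * t_d^(4/3):", round(5_000 * k ** (1 / 3) * t_d ** (4 / 3)))
```

Substituting the results back into the size equation is a quick way to confirm that the two SLIM equations were combined correctly.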
Multi-project models
– Effort estimates are affected by other projects.
– Cost can be amortized over several upcoming projects.
– Reuse normally involves multiple projects.
Problems with existing modeling methods
– Conte, Dunsmore and Shen suggest that a model be considered acceptable when PRED(0.25) is at least 0.75, but that rarely happens.
– This indicates that the existing models are insufficient.
Problems with existing modeling methods
Reasons:
– Model structure: many studies agree that b in the effort-duration models is about 1/3. However, there is little consensus about the effect of reducing or extending duration.
– Overly complex models: models with many parameters (e.g. cost drivers) are not necessarily preferable (accuracy, subjectivity, independence, static).
– Product size estimation: size estimates in LOC are not available early in the process.
Dealing with problems of current estimation methods
– Use local data definitions.
– Calibrate models in the actual environment.
– Use an independent estimation group.
– Reduce input subjectivity.
– Do a preliminary estimate (a group estimate such as Delphi, or an estimate by analogy) and then re-estimate.
Dealing with problems of current estimation methods
Use measures other than LOC for early size estimation:
– Function points.
– Specification weight metrics, or bang metrics (DeMarco).
– Function bang measure: based on the number of functional primitives (bubbles) in a data-flow diagram.
– Data bang measure: based on the number of entities in the ER model.
Dealing with problems of current estimation methods
Use locally developed cost models. Five steps in defining a local cost model (DeMarco, 1982):
– Decompose the cost element.
– Formulate a cost theory.
– Collect data.
– Analyze the data and evaluate the model.
– Check the model: use PRED(0.25) or another measure to assess whether its accuracy is acceptable.