COCOMO II Overview Ray Madachy madachy@usc.edu CSCI 510 September 14, 2005 9/14/05
Agenda
- COCOMO introduction
- Basic estimation formulas
- Cost factors
- Reuse model
- Sizing
- USC COCOMO tool demo
- Data collection
Software Cost Estimation Methods
- Cost estimation: prediction of both the person-effort and the elapsed time of a project
- Methods:
  - Algorithmic
  - Expert judgement
  - Estimation by analogy
  - Parkinsonian
  - Price-to-win
  - Top-down
  - Bottom-up
- Best approach is a combination of methods
  - compare and iterate estimates, reconcile differences
- COCOMO is the most widely used, thoroughly documented and calibrated cost model
COCOMO Background
- COCOMO: the “COnstructive COst MOdel”
  - COCOMO II is the update to COCOMO 1981
  - ongoing research, with annual calibrations made available
- Originally developed by Dr. Barry Boehm and published in the 1981 book Software Engineering Economics
- COCOMO II is described in the newer book Software Cost Estimation with COCOMO II
- COCOMO can be used as a framework for cost estimation and related activities
Software Estimation Accuracy
[Figure: funnel chart showing the effect of uncertainties over time. Vertical axis: relative size range (4x, 2x, x, 0.5x, 0.25x). Horizontal axis: phases and milestones (phases: Feasibility, Plans/Rqts., Design, Develop and Test; milestones: Operational Concept, Life Cycle Objectives, Life Cycle Architecture, Initial Operating Capability).]
COCOMO Black Box Model
[Figure: COCOMO II drawn as a black box with the inputs and outputs listed below.]
- Inputs:
  - product size estimate
  - product, process, platform, and personnel attributes
  - reuse, maintenance, and increment parameters
  - organizational project data
- Outputs:
  - development, maintenance cost and schedule estimates
  - cost, schedule distribution by phase, activity, increment
  - recalibration to organizational data
Major COCOMO II Features
- Multi-model coverage of different development sectors
- Variable-granularity cost model inputs
- Flexibility in size inputs
  - SLOC
  - function points
  - application points
  - other (use cases ...)
- Range vs. point estimates per funnel chart
COCOMO Uses for Software Decision Making
- Making investment decisions and business-case analyses
- Setting project budgets and schedules
- Performing tradeoff analyses
- Cost risk management
- Development vs. reuse decisions
- Legacy software phaseout decisions
- Software reuse and product line decisions
- Process improvement decisions
Productivity Ranges
- COCOMO provides a natural framework to identify high-leverage productivity improvement factors and estimate their payoffs.
COCOMO Submodels
- Application Composition involves rapid development or prototyping efforts to resolve potential high-risk issues such as user interfaces, software/system interaction, performance, or technology maturity. It is sized with application points (weighted screen elements, reports and 3GL modules).
- The Early Design model involves exploration of alternative software/system architectures and concepts of operation, using function points and a coarse-grained set of 7 cost drivers.
- The Post-Architecture model involves the actual development and maintenance of a software product, using source instructions and/or function points for sizing, with modifiers for reuse and software breakage; a set of 17 multiplicative cost drivers; and a set of 5 factors determining the project's scaling exponent.
COCOMO Effort Formulation

$$\text{Effort (person-months)} = A \times (\text{Size})^{B} \times \prod_{i=1}^{n} EM_i$$

Where $n$ is the number of cost drivers and:
- A is a constant derived from historical project data (currently A = 2.94 in COCOMOII.2000)
- Size is in KSLOC (thousand source lines of code), or converted from function points or object points
- B is an exponent for the diseconomy of scale, dependent on five additive scale drivers according to $B = 0.91 + 0.01 \sum_{i=1}^{5} SF_i$, where $SF_i$ is a weighting factor for the ith scale driver
- $EM_i$ is the effort multiplier for the ith cost driver. The geometric product results in an overall effort adjustment factor to the nominal effort.
- Automated translation effects are not included
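The effort formulation can be sketched in a few lines of Python. This is a minimal illustration rather than the USC COCOMO tool's implementation: the constants A = 2.94, 0.91 and 0.01 come from the slide, while the function names and the example scale-driver and multiplier values are hypothetical.

```python
from math import prod

A = 2.94        # COCOMOII.2000 calibration constant quoted on the slide
B_BASE = 0.91   # constant term of the scale exponent
B_STEP = 0.01   # weight applied to the sum of the scale-driver values


def scale_exponent(scale_factors):
    """B = 0.91 + 0.01 * sum(SF_i) over the scale drivers."""
    return B_BASE + B_STEP * sum(scale_factors)


def effort_person_months(ksloc, scale_factors, effort_multipliers):
    """Effort = A * Size^B * product(EM_i), with Size in KSLOC."""
    b = scale_exponent(scale_factors)
    return A * ksloc ** b * prod(effort_multipliers)


# Hypothetical inputs (not calibrated values): five scale-driver values
# that sum to 16, and three effort multipliers.
example_sf = [3.0, 2.0, 4.0, 3.0, 4.0]
example_em = [1.15, 0.88, 1.0]

print(f"B = {scale_exponent(example_sf):.2f}")
print(f"Effort = {effort_person_months(100, example_sf, example_em):.1f} person-months")
```

Passing an empty list of multipliers gives the nominal effort, because the product of no multipliers is 1.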
Diseconomy of Scale
- Nonlinear relationship when exponent > 1
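A short numeric sketch of the diseconomy: since effort is proportional to Size^B, doubling the size multiplies nominal effort by 2^B. The exponent values below are hypothetical sample points, chosen only to show the behavior on either side of B = 1.

```python
def relative_effort(size_ratio, b):
    """Factor by which nominal effort grows when size grows by size_ratio."""
    return size_ratio ** b


# Hypothetical exponents: below 1 (economy of scale), exactly 1 (linear),
# and above 1 (diseconomy of scale).
for b in (0.91, 1.00, 1.10, 1.20):
    print(f"B = {b:.2f}: doubling size multiplies effort by {relative_effort(2, b):.2f}")
```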
COCOMO Schedule Formulation

$$\text{Schedule (months)} = C \times (\text{Effort})^{0.33 + 0.2\,(B - 1.01)} \times \frac{\text{SCED\%}}{100}$$

Where:
- Schedule is the calendar time in months from the requirements baseline to acceptance
- C is a constant derived from historical project data (currently C = 3.67 in COCOMOII.2000)
- Effort is the estimated person-months excluding the SCED effort multiplier
- B is the scale exponent from the effort formulation, determined by the sum of the project scale factors
- SCED% is the compression / expansion percentage in the SCED cost driver
- This is the COCOMOII.2000 calibration
- The formula can vary to reflect process models for reusable and COTS software, and the effects of application composition capabilities.
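The schedule formulation can be sketched the same way. The constants (3.67, 0.33, 0.2, 1.01) are taken as written on the slide; other calibrations use different values, so they are exposed as parameters here. The function and parameter names are hypothetical.

```python
C = 3.67  # schedule constant quoted on the slide


def schedule_months(effort_pm, b, sced_percent=100.0, c=C, d=0.33, b_ref=1.01):
    """Schedule = C * Effort^(d + 0.2 * (B - b_ref)) * SCED% / 100.

    effort_pm    -- estimated person-months, excluding the SCED multiplier
    b            -- scale exponent B from the effort formulation
    sced_percent -- schedule compression/expansion percentage (100 = nominal)
    """
    exponent = d + 0.2 * (b - b_ref)
    return c * effort_pm ** exponent * sced_percent / 100.0


# Hypothetical project: 400 person-months with B = 1.07, at nominal
# schedule and at 85% of nominal schedule.
print(f"nominal:    {schedule_months(400, 1.07):.1f} months")
print(f"compressed: {schedule_months(400, 1.07, sced_percent=85):.1f} months")
```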
Coverage of Different Processes
- COCOMO II provides a framework for tailoring the model to any desired process
- Original COCOMO was predicated on the waterfall process
  - single-pass, sequential progression of requirements, design, code, test
- Modern processes are concurrent, iterative, incremental, and cyclic
  - e.g. Rational Unified Process (RUP), the USC Model-Based Architecting and Software Engineering (MBASE) process
- Effort and schedule are distributed among different phases and activities per the work breakdown structure of the chosen process
Common Process Anchor Points
- Anchor points are common process milestones around which cost and schedule budgets are organized
- COCOMO II submodels address different development stages anchored by these generic milestones:
  - Life Cycle Objectives (LCO), inception: establishing a sound business case
  - Life Cycle Architecture (LCA), elaboration: commit to a single architecture and elaborate it to cover all major risk sources
  - Initial Operational Capability (IOC), construction: commit to transition and support operations
MBASE Phase Distributions
(see COCOMO II book for complete phase/activity distributions)

Phase            Effort %   Schedule %
Inception            6         12.5
Elaboration         24         37.5
Construction        76         62.5
Transition          12         12.5
COCOMO Total       100        100
Project Total      118        125
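As a sketch of how these percentages are applied, the snippet below spreads a COCOMO estimate across the MBASE phases. The percentages are the ones on the slide (relative to the COCOMO total, which covers Elaboration plus Construction); the function name and the 100 person-month, 16-month example are hypothetical.

```python
# (effort %, schedule %) per phase, as percentages of the COCOMO total.
MBASE_PHASES = {
    "Inception": (6.0, 12.5),
    "Elaboration": (24.0, 37.5),
    "Construction": (76.0, 62.5),
    "Transition": (12.0, 12.5),
}


def distribute_by_phase(cocomo_effort_pm, cocomo_schedule_months):
    """Return {phase: (person-months, months)} for a COCOMO estimate."""
    return {
        phase: (cocomo_effort_pm * effort_pct / 100.0,
                cocomo_schedule_months * schedule_pct / 100.0)
        for phase, (effort_pct, schedule_pct) in MBASE_PHASES.items()
    }


# Hypothetical COCOMO estimate: 100 person-months over 16 months.
plan = distribute_by_phase(100.0, 16.0)
for phase, (pm, months) in plan.items():
    print(f"{phase:<13}{pm:7.1f} PM {months:6.1f} months")
```

Summing the four phases gives the project totals of 118% of the COCOMO effort and 125% of the COCOMO schedule.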
Waterfall Phase Distributions

Phase                Effort %   Schedule %
Plans & rqts             7         20
Product Design          17         26
Programming             58         48
Integration & Test      25         26
Transition              12         12.5
COCOMO Total           100        100
Project Total          119        125
COCOMO II Output Ranges
- COCOMO II provides one-standard-deviation optimistic and pessimistic estimates.
- The ranges reflect sources of input uncertainties per the funnel chart.
- They apply to effort or schedule for all of the stage models.
- They represent 80% confidence limits: the actual value falls below the optimistic estimate 10% of the time and above the pessimistic estimate 10% of the time.
COCOMO Tailoring and Enhancements
- Calibrate effort equations to organizational experience
  - USC COCOMO has a calibration capability
- Consolidate or eliminate redundant cost driver attributes
- Add cost drivers applicable to your organization
- Account for systems engineering, hardware and software integration
Cost Factors
- Significant factors of development cost:
  - scale drivers are sources of exponential effort variation
  - cost drivers are sources of linear effort variation
    - product, platform, personnel and project attributes
    - effort multipliers associated with cost driver ratings
- Defined to be as objective as possible
- Each factor is rated between very low and very high per rating guidelines
  - relevant effort multipliers adjust the cost up or down
- May be difficult to quantify, but better than ignoring important project factors (e.g. Basic vs. Intermediate COCOMO accuracies)
Scale Drivers
- Precedentedness (PREC): degree to which the system is new and past experience applies
- Development Flexibility (FLEX): need to conform with specified requirements
- Architecture/Risk Resolution (RESL): degree of design thoroughness and risk elimination
- Team Cohesion (TEAM): need to synchronize stakeholders and minimize conflict
- Process Maturity (PMAT): SEI CMM process maturity rating
Cost Drivers
- Product factors
  - Reliability (RELY)
  - Data (DATA)
  - Complexity (CPLX)
  - Reusability (RUSE)
  - Documentation (DOCU)
- Platform factors
  - Time constraint (TIME)
  - Storage constraint (STOR)
  - Platform volatility (PVOL)
- Personnel factors
  - Analyst capability (ACAP)
  - Programmer capability (PCAP)
  - Applications experience (APEX)
  - Platform experience (PLEX)
  - Language and tool experience (LTEX)
  - Personnel continuity (PCON)
- Project factors
  - Software tools (TOOL)
  - Multisite development (SITE)
  - Required schedule (SCED)
Example Cost Driver - Required Software Reliability (RELY)
- Measures the extent to which the software must perform its intended function over a period of time.
- Ask: what is the effect of a software failure?
Example Effort Multiplier Values for RELY

Rating      Effect of software failure             Effort multiplier
Very Low    Slight inconvenience                   0.75
Low         Low, easily recoverable losses         0.88
Nominal     Moderate, easily recoverable losses    1.0
High        High financial loss                    1.15
Very High   Risk to human life                     1.39

E.g. a very high reliability system costs 39% more than a nominally reliable system (1.39/1.0 = 1.39), or 85% more than a very low reliability system (1.39/0.75 = 1.85).
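The ratio arithmetic in this example can be checked with a small sketch. The multiplier values are the ones on the slide; the dictionary and function names are hypothetical.

```python
# RELY effort multipliers by rating level, as given on the slide.
RELY = {
    "Very Low": 0.75,
    "Low": 0.88,
    "Nominal": 1.00,
    "High": 1.15,
    "Very High": 1.39,
}


def relative_cost(rating, baseline="Nominal"):
    """Cost of a project at `rating` relative to the same project at `baseline`."""
    return RELY[rating] / RELY[baseline]


print(f"Very High vs Nominal:  {relative_cost('Very High') - 1:.0%} more")              # 39% more
print(f"Very High vs Very Low: {relative_cost('Very High', 'Very Low') - 1:.0%} more")  # 85% more
```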
Scale Factors
- Sum the scale factors $W_i$ across all of the factors to determine a scale exponent, B, using

$$B = 0.91 + 0.01 \sum_i W_i$$
Precedentedness (PREC) and Development Flexibility (FLEX)
- Elaboration of the PREC and FLEX rating scales:
Architecture / Risk Resolution (RESL)
- Use a subjective weighted average of the characteristics:
Team Cohesion (TEAM)
- Use a subjective weighted average of the characteristics to account for project turbulence and entropy due to difficulties in synchronizing the project's stakeholders.
- Stakeholders include users, customers, developers, maintainers, interfacers, and others
Process Maturity (PMAT)
- Two methods based on the Software Engineering Institute's Capability Maturity Model (CMM)
  - Method 1: Overall Maturity Level (CMM Level 1 through 5)
  - Method 2: Key Process Areas (see next slide)
Key Process Areas
- Decide the percentage of compliance for each of the KPAs as determined by a judgement-based averaging across the goals for all 18 Key Process Areas.
Cost Drivers
- Product Factors
- Platform Factors
- Personnel Factors
- Project Factors
Product Factors
- Required Software Reliability (RELY)
  - Measures the extent to which the software must perform its intended function over a period of time.
  - Ask: what is the effect of a software failure?
Product Factors cont’d
- Data Base Size (DATA)
  - Captures the effect large data requirements have on development, through the effort needed to generate the test data that will be used to exercise the program.
  - The rating is determined by calculating the data/program size ratio (D/P).
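A minimal sketch of the D/P calculation follows. The ratio definition (database size in bytes divided by program size in SLOC) and the rating thresholds are not shown on this slide; they are recalled from the published COCOMO II model definition and should be treated as assumptions to check against the DATA rating table in the COCOMO II book. The function names are hypothetical.

```python
def data_program_ratio(db_bytes, program_sloc):
    """D/P: database size in bytes per source line of code (assumed definition)."""
    return db_bytes / program_sloc


def data_rating(db_bytes, program_sloc):
    """Map D/P to a DATA rating using assumed thresholds of 10, 100 and 1000."""
    ratio = data_program_ratio(db_bytes, program_sloc)
    if ratio < 10:
        return "Low"
    if ratio < 100:
        return "Nominal"
    if ratio < 1000:
        return "High"
    return "Very High"


# Hypothetical project: a 2 MB test database for a 10,000 SLOC program.
print(data_program_ratio(2_000_000, 10_000))  # 200.0
print(data_rating(2_000_000, 10_000))         # High
```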
Product Factors cont’d
- Product Complexity (CPLX)
  - Complexity is divided into five areas: control operations, computational operations, device-dependent operations, data management operations, and user interface management operations.
  - Select the area or combination of areas that characterize the product or a sub-system of the product.
  - See the module complexity table, next several slides
Product Factors cont’d
- Module Complexity Ratings vs. Type of Module
  - Use a subjective weighted average of the attributes, weighted by their relative product importance.
Product Factors cont’d
- Required Reusability (RUSE)
  - Accounts for the additional effort needed to construct components intended for reuse.
- Documentation match to life-cycle needs (DOCU)
  - How suitable is the project's documentation to its life-cycle needs?
Platform Factors
- Platform
  - Refers to the target-machine complex of hardware and infrastructure software (previously called the virtual machine).
- Execution Time Constraint (TIME)
  - Measures the constraint imposed upon a system in terms of the percentage of available execution time expected to be used by the system.
Platform Factors cont’d
- Main Storage Constraint (STOR)
  - Measures the degree of main storage constraint imposed on a software system or subsystem.
- Platform Volatility (PVOL)
  - Assesses the volatility of the platform (the complex of hardware and software the software product calls on to perform its tasks).
Personnel Factors
- Analyst Capability (ACAP)
  - Analysts work on requirements, high level design and detailed design.
  - Consider analysis and design ability, efficiency and thoroughness, and the ability to communicate and cooperate.
- Programmer Capability (PCAP)
  - Evaluate the capability of the programmers as a team rather than as individuals.
  - Consider ability, efficiency and thoroughness, and the ability to communicate and cooperate.
Personnel Factors cont’d
- Applications Experience (APEX)
  - Assess the project team's equivalent level of experience with this type of application.
- Platform Experience (PLEX)
  - Assess the project team's equivalent level of experience with this platform, including the OS, graphical user interface, database, networking, and distributed middleware.
Personnel Factors cont’d
- Language and Tool Experience (LTEX)
  - Measures the level of programming language and software tool experience of the project team.
- Personnel Continuity (PCON)
  - The scale for PCON is in terms of the project's annual personnel turnover.
Project Factors
- Use of Software Tools (TOOL)
  - Assess the software tools used to develop the product in terms of their capabilities and maturity.
Project Factors cont’d
- Multisite Development (SITE)
  - Assess and average two factors: site collocation and communication support.
- Required Development Schedule (SCED)
  - Measure the imposed schedule constraint in terms of the percentage of schedule stretch-out or acceleration with respect to a nominal schedule for the project.
Cost Factor Rating
- Whenever an assessment of a cost driver falls between two rating levels, always round toward the Nominal rating
  - e.g. if a cost driver rating is between High and Very High, then select High.
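The rounding rule can be written as a tiny helper. This is an illustrative sketch: the five-level scale follows the "very low to very high" range used in this deck, and the function name is hypothetical.

```python
LEVELS = ["Very Low", "Low", "Nominal", "High", "Very High"]


def round_toward_nominal(level_a, level_b):
    """Given the two adjacent levels an assessment falls between,
    return the one closer to Nominal."""
    nominal = LEVELS.index("Nominal")
    return min((level_a, level_b), key=lambda lvl: abs(LEVELS.index(lvl) - nominal))


print(round_toward_nominal("High", "Very High"))  # High
print(round_toward_nominal("Very Low", "Low"))    # Low
```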
Cost Driver Rating Level Summary
Cost Driver Rating Level Summary cont’d
Reused and Modified Software
- Effort for adapted software (reused or modified) is not the same as for new software.
- Approach: convert adapted software into equivalent size of new software.
Nonlinear Reuse Effects
- The reuse cost function does not go through the origin, due to a cost of about 5% for assessing, selecting, and assimilating the reusable component.
- Small modifications generate disproportionately large costs, primarily due to the cost of understanding the software to be modified and the relative cost of interface checking.
COCOMO Reuse Model
- A nonlinear estimation model to convert adapted (reused or modified) software into equivalent size of new software:
COCOMO Reuse Model cont’d
- ASLOC - Adapted Source Lines of Code
- ESLOC - Equivalent Source Lines of Code
- AAF - Adaptation Adjustment Factor
- DM - Percent Design Modified. The percentage of the adapted software's design which is modified in order to adapt it to the new objectives and environment.
- CM - Percent Code Modified. The percentage of the adapted software's code which is modified in order to adapt it to the new objectives and environment.
- IM - Percent of Integration Required for Modified Software. The percentage of effort required to integrate the adapted software into an overall product and to test the resulting product, as compared to the normal amount of integration and test effort for software of comparable size.
- AA - Assessment and Assimilation. Effort needed to determine whether a fully-reused software module is appropriate to the application, and to integrate its description into the overall product description. See table.
- SU - Software Understanding. Effort increment as a percentage. Only used when code is modified (zero when DM = 0 and CM = 0). See table.
- UNFM - Unfamiliarity. The programmer's relative unfamiliarity with the software, which is applied multiplicatively to the software understanding effort increment (0-1).
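The parameters above combine into an equivalent-size calculation. The equations themselves are not reproduced in this text, so the sketch below uses the form commonly published for the COCOMO II.2000 reuse model (AAF = 0.4 DM + 0.3 CM + 0.3 IM, with a different adjustment above and below AAF = 50); treat the coefficients as an assumption to verify against the COCOMO II book. Function names are hypothetical.

```python
def adaptation_adjustment_factor(dm, cm, im):
    """AAF from percent design modified, code modified, and integration required."""
    return 0.4 * dm + 0.3 * cm + 0.3 * im


def equivalent_sloc(asloc, dm, cm, im, aa=0.0, su=0.0, unfm=0.0):
    """Convert adapted SLOC into equivalent new SLOC (assumed COCOMO II.2000 form).

    dm, cm, im, aa, su are percentages (0-100); unfm is on a 0-1 scale.
    """
    if dm == 0 and cm == 0:
        su = 0.0  # software understanding applies only to modified code
    aaf = adaptation_adjustment_factor(dm, cm, im)
    if aaf <= 50:
        multiplier = (aa + aaf * (1 + 0.02 * su * unfm)) / 100.0
    else:
        multiplier = (aa + aaf + su * unfm) / 100.0
    return asloc * multiplier


# Hypothetical component: 10,000 adapted SLOC, lightly modified.
print(round(equivalent_sloc(10_000, dm=10, cm=20, im=30, aa=4, su=30, unfm=0.4)))
```

With no modification (DM = CM = IM = 0) only the AA term remains, which is why the reuse cost function does not pass through the origin.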
Assessment and Assimilation Increment (AA)
Software Understanding Increment (SU)
- Take the subjective average of the three categories.
- Do not use SU if the component is being used unmodified (DM = 0 and CM = 0).
Programmer Unfamiliarity (UNFM)
- Only applies to modified software
Commercial Off-the-Shelf (COTS) Software
- Current best approach is to treat as reuse
- A COTS cost model is under development
  - Calculate effective size from external interface files and breakage
  - Have identified candidate COTS cost drivers
Reuse Parameter Guidelines
Lines of Code
- Source Lines of Code (SLOCs) = logical source statements
- Logical source statements = data declarations + executable statements
  - Executable statements cause runtime actions
  - Declaration statements are nonexecutable statements that affect an assembler's or compiler's interpretation of other program elements
- Codecount tool available on USC web site
Lines of Code Counting Rules
- Standard definition for counting lines
  - Based on SEI definition checklist from CMU/SEI-92-TR-20
  - Modified for COCOMO II
- When a line or statement contains more than one type, classify it as the type with the highest precedence. Order of precedence is in ascending order.
Lines of Code Counting Rules cont’d
(See COCOMO II book for remaining details)
Counting with Function Points
- Used in both the Early Design and the Post-Architecture models.
- Based on the amount of functionality in a software product and project factors, using information available early in the project life cycle.
- Quantify the information processing functionality with the following user function types:
Counting with Function Points cont’d
- External Input (Inputs)
  - Count each unique user data or user control input type that both
    - enters the external boundary of the software system being measured, and
    - adds or changes data in a logical internal file.
- External Output (Outputs)
  - Count each unique user data or control output type that leaves the external boundary of the software system being measured.
Counting with Function Points cont’d
- Internal Logical File (Files)
  - Count each major logical group of user data or control information in the software system as a logical internal file type. Include each logical file (e.g., each logical group of data) that is generated, used, or maintained by the software system.
- External Interface Files (Interfaces)
  - Files passed or shared between software systems should be counted as external interface file types within each system.
Counting with Function Points cont’d
- External Inquiry (Queries)
  - Count each unique input-output combination, where an input causes and generates an immediate output, as an external inquiry type.
- Each instance of the user function types is then classified by complexity level. The complexity levels determine a set of weights, which are applied to their corresponding function counts to determine the Unadjusted Function Points quantity.
Counting with Function Points cont’d
- The usual Function Point procedure involves assessing the degree of influence of fourteen application characteristics on the software project.
- The contributions of these characteristics are inconsistent with COCOMO experience, so COCOMO II uses Unadjusted Function Points for sizing.
Unadjusted Function Points Counting Procedure
- Step 1 - Determine function counts by type.
  - The unadjusted function counts should be counted by a lead technical person based on information in the software requirements and design documents.
  - The number of each of the five user function types should be counted:
    - Internal Logical File (ILF). Note: the word file refers to a logically related group of data and not the physical implementation of those groups of data.
    - External Interface File (EIF)
    - External Input (EI)
    - External Output (EO)
    - External Inquiry (EQ)
Unadjusted Function Points Counting Procedure cont’d
- Step 2 - Determine complexity-level function counts.
  - Classify each function count into Low, Average and High complexity levels depending on the number of data element types contained and the number of file types referenced. Use the following scheme:
Unadjusted Function Points Counting Procedure cont’d
- Step 3 - Apply complexity weights.
  - Weight the number in each cell using the following scheme. The weights reflect the relative value of the function to the user.
- Step 4 - Compute Unadjusted Function Points.
  - Add all the weighted function counts to get one number, the Unadjusted Function Points.
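Steps 3 and 4 amount to a weighted sum, sketched below. The weight table is not reproduced in this text; the values used here are the standard function point complexity weights and should be treated as an assumption to check against the COCOMO II book. The function name and the example counts are hypothetical.

```python
# Assumed complexity weights per function type (standard function point values).
WEIGHTS = {
    "ILF": {"Low": 7, "Average": 10, "High": 15},
    "EIF": {"Low": 5, "Average": 7, "High": 10},
    "EI": {"Low": 3, "Average": 4, "High": 6},
    "EO": {"Low": 4, "Average": 5, "High": 7},
    "EQ": {"Low": 3, "Average": 4, "High": 6},
}


def unadjusted_function_points(counts):
    """Sum weight * count over every (function type, complexity level) cell.

    counts maps a function type to {complexity level: number of functions}.
    """
    return sum(
        WEIGHTS[ftype][level] * n
        for ftype, levels in counts.items()
        for level, n in levels.items()
    )


# Hypothetical counts from Steps 1 and 2.
example_counts = {
    "EI": {"Low": 5, "Average": 2},
    "EO": {"Average": 4},
    "ILF": {"High": 1},
}
print(unadjusted_function_points(example_counts))  # 15 + 8 + 20 + 15 = 58
```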
USC COCOMO Demo
Cost Driver Ratings Profile
- Need to rate cost drivers in a consistent and objective fashion within an organization.
- Cost driver ratings profile:
  - Graphical depiction of historical ratings to be used as a reference baseline to assist in rating new projects
  - Used in conjunction with estimating tools to gauge new projects against past ones objectively
Example Cost Driver Ratings Profile
Techniques to Generate Cost Driver Ratings Profile
- Single person
  - Time efficient, but may impose bias, and the person may be unfamiliar with all projects
- Group
  - Converge ratings in a single meeting (dominant individual problem)
  - Wideband Delphi technique (longer calendar time, but minimizes biases). See Software Engineering Economics, p. 335
COCOMO Dataset Cost Metrics
- Size (SLOC, function points)
- Effort (person-hours)
- Schedule (months)
- Cost drivers
- Scale drivers
- Reuse parameters
Recommended Project Cost Data
For each project, report the following at the end of each month and for each release:
- SIZE
  - Provide total system size developed to date, and report new code size and reused / modified code size separately. This can be at a project level or at a lower level, as the data supports and is reasonable. For languages not supported by counting tools, such as assembly code, report the number of physical lines separately for each language.
- EFFORT
  - Provide cumulative staff-hours spent on software development per project, at the same granularity as the size components.
Recommended Project Cost Data cont’d
- COST DRIVERS AND SCALE DRIVERS
  - For each reported size component, supply the cost driver ratings for product, platform, personnel and project attributes.
  - For each reported size component, supply scale driver ratings.
- REUSE PARAMETERS
  - For each component of reused/modified code, supply reuse parameters AA, SU, UNFM, DM, CM and IM.
- See Appendix C in COCOMO II book for additional data items
- Post-mortem reports are highly recommended
Effort Staff-Hours Definition
- Standard definition
  - Based on SEI definition checklist from CMU/SEI-92-TR-21
  - Modified for COCOMO II
- Does not include unpaid overtime, production and deployment activities, or customer training activities
- Includes all personnel except level 3 or higher software management (i.e. directors or above who timeshare among projects)
- Person-month is defined as 152 hours
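Because project data is collected in staff-hours while COCOMO effort is expressed in person-months, the 152-hour definition gives a direct conversion. A minimal sketch, with hypothetical function names:

```python
HOURS_PER_PERSON_MONTH = 152  # COCOMO II definition from the slide


def staff_hours_to_person_months(staff_hours):
    """Convert collected staff-hours into COCOMO person-months."""
    return staff_hours / HOURS_PER_PERSON_MONTH


def person_months_to_staff_hours(person_months):
    """Convert a COCOMO effort estimate back into staff-hours."""
    return person_months * HOURS_PER_PERSON_MONTH


print(staff_hours_to_person_months(1520))  # 10.0
print(person_months_to_staff_hours(2.5))   # 380.0
```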
Further Information
- B. Boehm, C. Abts, W. Brown, S. Chulani, B. Clark, E. Horowitz, R. Madachy, D. Reifer, B. Steece, Software Cost Estimation with COCOMO II. Prentice-Hall, 2000.
- B. Boehm, Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
- Main COCOMO website at USC: http://sunset.usc.edu/research/COCOMOII
- COCOMO information at USC: (213) 740-6470
- COCOMO email: cocomo-info@sunset.usc.edu