smi COCOMO II Calibration Status COCOMO Forum October 2004

A Little History
Calibration effort started in January 2002
Confusion
  – Repository in an inconsistent state
  – “Uncharacterized” data from many sources
  – Process for duplicating the 2000 calibration results
  – Schedule compression rating was inconsistent
Expectation
  – New data had a lot of variation but…
  – Affiliates (and the user population in general) want an “Accurate” and up-to-date model – not just one that explained variation
PRED(.25) versus R²
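
The PRED(.25) accuracy measure can be illustrated with a short sketch. It assumes the usual definitions: magnitude of relative error MRE = |actual - estimate| / actual, and PRED(L) as the fraction of projects whose MRE is at most L. The function names and the sample numbers are illustrative only and are not taken from the calibration data.

```python
def mre(actual, estimate):
    """Magnitude of relative error for one project."""
    return abs(actual - estimate) / actual


def pred(actuals, estimates, level=0.25):
    """Fraction of projects whose estimate is within `level` of the actual."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return hits / len(actuals)


# Hypothetical effort values in person-months (not real project data).
actual_pm = [100.0, 200.0, 300.0, 50.0]
estimated_pm = [110.0, 300.0, 290.0, 60.0]

print(pred(actual_pm, estimated_pm))  # 0.75: three of four are within 25%
```

Unlike R², which measures how much variation the model explains, this figure says directly how often an estimate lands close to the actual value.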

Change in Approach
Removed pre-1990 data from dataset used in calibration
  – This removed a lot of “converted” data
Removed “bad” data
  – Incomplete: No duration data, estimated effort, no valid SLOC size
Still use the Bayesian calibration approach developed by Chulani
Changed to a holistic analysis approach: considered effort and duration together
  – Identified data that needed review
  – Schedule compression was automatically set
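
The data-removal rules above could be expressed as a simple filter. This is a sketch under assumptions: the `Project` record and its field names are hypothetical, and "pre-1990" is taken here to mean the project's completion year, which the slides do not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Project:
    # Hypothetical record layout; the real repository schema is not shown.
    end_year: int
    sloc: Optional[int]
    effort_pm: Optional[float]
    effort_is_actual: bool          # False if the effort figure was estimated
    duration_months: Optional[float]


def is_usable(p: Project) -> bool:
    """Apply the removal rules: post-1989, complete, actual (not estimated) data."""
    return (
        p.end_year >= 1990
        and p.sloc is not None and p.sloc > 0
        and p.effort_pm is not None and p.effort_is_actual
        and p.duration_months is not None
    )


projects = [
    Project(1988, 50_000, 120.0, True, 14.0),   # pre-1990: removed
    Project(1995, 80_000, 300.0, False, 20.0),  # estimated effort: removed
    Project(1999, 40_000, 90.0, True, None),    # no duration data: removed
    Project(2001, 60_000, 150.0, True, 16.0),   # kept
]
kept = [p for p in projects if is_usable(p)]
print(len(kept))
```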

Post-1989 Data Using Current COCOMO II Values
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Underestimated / Duration Underestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Effort-Duration Error Interpretation

Effort Estimates | Duration Estimates | Data Validation / Interpretation
Under-estimated  | Under-estimated    | Actual size data is too small due to reuse modeling; actual effort and duration included lifecycle phases not in the model; difficult, low productivity projects
Under-estimated  | Over-estimated     | Schedule compression required
Over-estimated   | Under-estimated    | Fixed staffing levels; project slow-down; schedule stretch-out
Over-estimated   | Over-estimated     | Actual data is too large due to physical SLOC count, reuse modeling; actual effort and duration cover fewer lifecycle phases than estimated; easy, high productivity projects
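
The interpretation table lends itself to a small lookup. The sketch below assumes that "under-estimated" means the model's estimate is below the actual value. The function names and the sample numbers are hypothetical, and the interpretation strings are abbreviated from the table.

```python
# Keys are (effort direction, duration direction).
INTERPRETATION = {
    ("under", "under"): "Size too small, extra lifecycle phases, or a difficult project",
    ("under", "over"): "Schedule compression required",
    ("over", "under"): "Fixed staffing, project slow-down, or schedule stretch-out",
    ("over", "over"): "Size too large, fewer lifecycle phases, or an easy project",
}


def direction(actual, estimate):
    """Return 'under' if the estimate is below the actual, 'over' if above."""
    if estimate < actual:
        return "under"
    if estimate > actual:
        return "over"
    return "match"


def interpret(effort_actual, effort_est, dur_actual, dur_est):
    key = (direction(effort_actual, effort_est), direction(dur_actual, dur_est))
    # An exact match on either axis has no entry in the table.
    return INTERPRETATION.get(key, "No interpretation in the table")


# Hypothetical project: 100 PM actual vs 80 PM estimated,
# 12 months actual vs 15 months estimated.
print(interpret(100.0, 80.0, 12.0, 15.0))
```

Looking at both errors together is what lets a single project be flagged for review; either error alone would not distinguish schedule compression from a size-counting problem.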

[Figure: Effort Estimate Error Compared to Size (Post 1989 – 89 Projects, 2000 Cal)]

[Figure: Duration Estimate Error Compared to Size (Post 1989 – 89 Projects, 2000 Cal)]

Accuracy Results
[Table: Effort Estimation Accuracy. Columns: PRED; 161 Dataset with 2000 Cal Values; 89 Dataset with 2000 Cal Values; 89 Dataset with 89 Cal Values]
[Table: Duration Estimation Accuracy. Columns: PRED; 161 Dataset with 2000 Values; 89 Dataset with 2000 Values; 89 Dataset with 89 Values]

Calibration Progress
Reviewing new data
  – Dataset A: 8 projects
  – Dataset B: 52 projects
  – Dataset C: 13 projects
  – Dataset D: 4 projects
  – Dataset E: 10 projects
  – Dataset F: 8 projects

Dataset A
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Underestimated / Duration Underestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Dataset B
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Underestimated / Duration Underestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Dataset C
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Underestimated / Duration Underestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Dataset D & E
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Dataset F
[Figure: quadrant chart. Quadrant labels: Effort Underestimated / Duration Overestimated; Effort Underestimated / Duration Underestimated; Effort Overestimated / Duration Overestimated; Effort Overestimated / Duration Underestimated]

Observations on New Data
The estimation error of the new datasets lies outside the Post-1989 (Cal 2000) dataset error range
When each dataset is given its own (local) calibration constant, A, accuracy improves
There have been some suggestions on modifying the COCOMO II model
  – “Globbing” data by application domain or platform and providing different model constants for each “glob”
  – Adding a Cost Driver that accounts for “spread” of data
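
A local calibration constant can be computed per dataset as sketched below. The sketch assumes the COCOMO II effort form PM = A × Size^E × ΠEM, with size in KSLOC, and fits A as the log-space mean of the ratio between actual effort and the unscaled estimate. This is one common way to do it, not necessarily the procedure used for the results above. The function name and the synthetic data are hypothetical.

```python
import math


def calibrate_a(projects):
    """Fit the constant A for one dataset.

    projects: list of (actual_pm, ksloc, exponent_e, em_product) tuples, where
    exponent_e is the scale exponent E and em_product is the product of the
    effort multipliers for that project.
    """
    if not projects:
        raise ValueError("need at least one project")
    # ln(A) is the mean of ln(actual) - ln(Size^E * product of EMs).
    log_ratios = [
        math.log(pm) - math.log(ksloc ** e * em)
        for pm, ksloc, e, em in projects
    ]
    return math.exp(sum(log_ratios) / len(log_ratios))


# Synthetic dataset generated with A = 3.5 so the fit can be checked.
true_a = 3.5
inputs = [(20.0, 1.05, 1.10), (75.0, 1.10, 0.90), (150.0, 1.08, 1.25)]
dataset = [(true_a * k ** e * em, k, e, em) for k, e, em in inputs]

print(round(calibrate_a(dataset), 4))
```

Fitting in log space keeps one very large project from dominating the constant, which matters when a dataset has only a handful of projects.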

Proposed New Driver
Domain Expertise Driver
Definition:
  – Cumulative knowledge and experience that has been acquired through a thorough track record that comes to represent the core competencies of an organization

Next Steps
Finish Early COCOTS calibration
  – Tailoring and Glue Code activities to analyze
  – Model definition manual and tool
Finish COCOMO II calibration
  – Consider “Globbing” over adding a new driver
Start COCOMO II Driver Elaboration
  – Make some driver descriptions less subjective
  – Crisper definitions

For more information, requests or questions
Brad Clark, Software Metrics, Inc.
Ye Yang, USC-CSE