Estimating Project Success
June Verner and Barbara Kitchenham
Empirical Software Engineering, NICTA
Outline
Background
Data
–Factors
–Data sets
Methodology
–Factor analysis
–Logistic regression
–Results summary
Conclusions and further work
Background
“Billions of dollars are wasted each year on failed software projects ... we have a dismal history of projects that have gone awry” [Charette, IEEE Spectrum, Sept 2005]
We have been developing software since the 1960s but still have not learned enough to ensure project success
Most project failures are predictable and avoidable
How can we identify these projects early enough to take action?
Failures
Most organizations try to hide their failures
–Not only monetary loss, but also lost opportunity
A recent “Hall of Shame” includes (in $US millions)
–FBI 100
–UK Inland Revenue 33
–Ford Motor Company 400
–Sainsburys 527
–Sydney Water Corp
Other recent Australian problems include:
–National Australia Bank AUD200 million write-down on failed ERP project
–RMIT’s Academic Management System
–Victorian State’s Infrastructure Management System
–Continued controversy over the Federal Government’s new sea cargo import reporting system
Data - Factors
Literature
Discussions with 90+ developers
Categories
–Sponsor
–Customer and users
–Requirements
–Estimation and scheduling
–The project manager
–Project management
–Development process
–Developers
Data sets
Mostly in-house software developments
–North American financial institution: 42 projects, 45% success rate
–Other NE US projects: 79 projects, 71% success rate
–Sydney: 42 projects, 78% success rate
Chile 200+
–In house
–Developments for third parties
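To put the three in-house data sets on one footing, the pooled success rate can be estimated from the figures on this slide. This is only a rough sketch: the per-group success counts are reconstructed by rounding the reported percentages, so they are approximations rather than figures from the study, and the function name is illustrative.

```python
# Reported (projects, success rate) per group, as listed on the slide.
groups = {
    "North American financial institution": (42, 0.45),
    "Other NE US projects": (79, 0.71),
    "Sydney": (42, 0.78),
}

def pooled_counts(groups):
    """Return (total projects, approximate successes) across all groups."""
    total = sum(n for n, _ in groups.values())
    # Assumption: success counts are recovered by rounding n * rate.
    successes = sum(round(n * rate) for n, rate in groups.values())
    return total, successes

total, successes = pooled_counts(groups)
failures = total - successes
print(f"{total} projects, about {successes} successes and {failures} failures")
print(f"approximate pooled success rate: {successes / total:.0%}")
```

On these rounded counts roughly one project in three failed, so failures are the minority class in the pooled data.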
Methodology
Correlate all factors with project success
Consider only those significant at the 95% level
–Overall
–By groups
Remove factors with a large number of missing values
Use factor analysis on the reduced set of variables to develop a new set of variables suitable for predicting project success
Use these variables to develop prediction equations on the entire data set and by groups
–How do these equations compare?
Take the original reduced set of correlated factors and develop prediction equations overall and for each of the groups
Compare the results with the equations developed with the reduced set of factors
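The first screening steps (correlate each factor with success, keep those significant at the 95% level, drop factors with many missing values) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the slides do not say which correlation coefficient or significance test was used, so this uses Pearson's r against a 0/1 outcome with a Fisher z approximation for the p-value; the 20% missing-value threshold, the factor names, and the data are made up.

```python
import math
from statistics import NormalDist

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z approximation (an assumption)."""
    if n <= 3:
        return 1.0
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def screen_factors(factors, success, alpha=0.05, max_missing=0.2):
    """Keep factors with few missing answers and a significant correlation.

    factors: dict of name -> list of numeric answers (None = missing).
    success: list of 0/1 outcomes, one per project.
    """
    kept = {}
    for name, values in factors.items():
        pairs = [(v, s) for v, s in zip(values, success) if v is not None]
        missing_share = 1.0 - len(pairs) / len(success)
        if missing_share > max_missing:
            continue  # too many missing values for this factor
        xs = [v for v, _ in pairs]
        ys = [s for _, s in pairs]
        r = pearson(xs, ys)
        p = approx_p_value(r, len(pairs))
        if p < alpha:
            kept[name] = (r, p)
    return kept

# Made-up answers for ten hypothetical projects (1 = success, 0 = failure).
success = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
factors = {
    "good_estimates": [5, 4, 5, 4, 1, 2, 1, 2, 4, 2],
    "pm_years_experience": [3, 7, 2, 9, 4, 8, 3, 6, 5, 5],
    "central_repository": [1, None, None, None, 0, None, 1, None, None, 0],
}
kept = screen_factors(factors, success)
print(sorted(kept))
```

In this toy data only the estimate-quality factor survives: the experience factor is uncorrelated with the outcome, and the repository factor is dropped for missing values before any correlation is computed.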
Correlations with project success
Only variables correlated at the 95% level across 3 groups and overall
Sponsor - nil
Customer and Users
–Level of confidence of customers in the project manager and team members
–Customers had realistic expectations
Requirements
–Were requirements completed adequately at some stage?
–Good requirements overall
Estimation and Scheduling
–How good were the estimates?
–Staff added late to meet an aggressive schedule?
Project manager
–The PM communicated well with the staff?
–How good was the project manager?
–How well did the project manager relate to software development staff?
Correlations with project success
Development process
–Adequate time was allowed for each of the phases
Development team
–How well did the team members work together?
–How high was the motivation of the team members?
–What was the working environment like?
Not included
Sponsor
–Project manager given full authority to manage project
–Sponsor commitment (2)
Customer and Users
–Level of customer involvement (1)
–Customer turnover (1)
–Large numbers of customers and users
Requirements
–Adequate time made available for requirements gathering (2)
–Central repository (2)
–Size impacted requirements gathering
–Scope was well defined (1)
Estimation and Scheduling
–Estimate of delivery date used adequate requirements information (2)
–Developers were involved in the estimates
–Adequate staff assigned to project (1)
–Developers were involved in the estimates (1)
Not included
Project manager
–Project manager background
–Years of experience
–Experience in the application area
–Project manager had a vision of what the project was to do for the organization (2)
Project management
–Did the PM control the project? (1)
–Staff were appreciated for working long hours (2)
–Staff were rewarded for working long hours (2)
Development process
–Defined development methodology used (1)
–Risks incorporated into project plan (2)
–Requirements managed effectively (2)
Developers
–Total number of staff (1)
–Team members consulted about staff selection (2)
Factor analysis
75% of variance explained with 3 factors
Factor 1 - Project manager
–The PM communicated well with staff
–How good was the project manager?
–How well did the project manager relate to software development staff?
Factor 2 - Customers and requirements
–Level of confidence of customers in the project manager & team members
–Customers had realistic expectations
–Good requirements overall
–How good were the estimates?
Factor 3
–Staff added late to meet an aggressive schedule
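As a reminder of what "variance explained" measures, the usual expression is given below. It assumes the factors were extracted from the correlation matrix of $p$ standardized variables with eigenvalues $\lambda_1 \ge \dots \ge \lambda_p$; the slides do not state the extraction method, so this is an assumption rather than the authors' stated procedure.

```latex
\[
  \text{variance explained}(k)
  \;=\; \frac{\sum_{j=1}^{k} \lambda_j}{\sum_{j=1}^{p} \lambda_j}
  \;=\; \frac{1}{p} \sum_{j=1}^{k} \lambda_j ,
  \qquad \text{with } k = 3 \text{ and a ratio of about } 0.75 \text{ here.}
\]
```

The second equality holds because the eigenvalues of a correlation matrix sum to its trace, which is $p$.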
Logistic regression - overall
All three factors
% failed 78392% succeeded 82% Overall
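For readers unfamiliar with the technique, here is a minimal sketch of fitting a logistic regression on three factor scores and turning the fitted probabilities into success/failure predictions. It assumes the standard logistic model fitted by plain batch gradient descent with a 0.5 cutoff; the study's actual software, coefficients, and cutoff are not given on the slides, and the factor scores below are invented.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights [b0, b1, ...] by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            err = p - target
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, row, cutoff=0.5):
    """Return 1 (success) if the fitted probability reaches the cutoff."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
    return 1 if p >= cutoff else 0

# Invented factor scores (Factor 1, Factor 2, Factor 3) for six projects.
X = [
    [1.2, 0.8, -0.3],
    [0.9, 1.1, 0.1],
    [0.4, 0.6, -0.8],
    [-1.0, -0.7, 0.9],
    [-0.6, -1.2, 0.4],
    [-1.1, -0.3, 1.0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = success, 0 = failure

w = fit_logistic(X, y)
predictions = [predict(w, row) for row in X]
print(predictions)
```

A statistics package would normally be used instead and would also report coefficient significance, which this sketch does not attempt.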
Logistic regression - by group
Group 1 - Factor
Failures   Successes   Overall
%          82%         84%
Group 2 - Factor 2 & Factor
Failures   Successes   Overall
%          95%         81%
Group 3 - Factor
Failures   Successes   Overall
%          100%        90%
Marginally better than overall: 24 wrong versus 23
Logistic regression - original 12 variables - overall
Level of confidence of customers in the project manager and team members
Overall reqts were good
How good were the estimates?
Staff added late to meet an aggressive schedule
Failures   Successes   Overall
%          88%         81%
Logistic regression - original 12 variables - by group
Group 1
–Reqts good overall
–How good were the estimates?
Failures   Successes   Overall
%          64%         81%
Group 2
–Reqts good overall
Failures   Successes   Overall
%          95%         81%
Group 3
–Staff added late to meet an aggressive schedule
–PM communicated well with staff
Failures   Successes   Overall
%          90%         85%
Factors not significant for any of the groups
Project manager given full authority
The project began with a committed champion
The commitment lasted right through the project
The sponsor was involved with decisions
Other stakeholders committed and involved
Senior management impacted the project
Involved customers/users stayed throughout the project
Customers/users involved in schedule estimates
Problems were caused by the large number of customers/users involved
Reqts were gathered using a particular method
Was the manager involved in the estimate?
The project had a schedule
There was a project manager
Years of experience of the PM
Was the project manager experienced in the application area?
Was the project manager able to pitch in and help if needed?
The consultants reported to the project manager
Did all the key people stay throughout the project?
Rewards at end of project motivated team
Results - summary
Did better with the extracted factors overall
–24 wrong versus 26
Did better with the extracted factors with groups
–21 wrong versus 24
Nearly 3 times as many failed projects predicted incorrectly
A data set with too few failures is problematic
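The "wrong" counts and per-class percentages used in these comparisons come from a classification table. A minimal sketch of how such a table can be computed is below; the function name and the example predictions are made up and are not the study's data.

```python
def classification_table(actual, predicted):
    """Summarise predictions per class; 1 = success, 0 = failure."""
    table = {}
    for label, name in ((0, "failures"), (1, "successes")):
        total = sum(1 for a in actual if a == label)
        correct = sum(
            1 for a, p in zip(actual, predicted) if a == label and p == label
        )
        pct = 100.0 * correct / total if total else float("nan")
        table[name] = {"total": total, "correct": correct, "pct_correct": pct}
    right = sum(1 for a, p in zip(actual, predicted) if a == p)
    table["overall"] = {
        "total": len(actual),
        "correct": right,
        "pct_correct": 100.0 * right / len(actual),
    }
    table["wrong"] = len(actual) - right
    return table

# Made-up example: 5 failed and 15 successful projects.
actual = [0] * 5 + [1] * 15
# 3 of the 5 failures and 14 of the 15 successes are predicted correctly.
predicted = [0, 0, 0, 1, 1] + [1] * 14 + [0]

table = classification_table(actual, predicted)
for name in ("failures", "successes", "overall"):
    print(name, f"{table[name]['pct_correct']:.1f}%")
print("wrong:", table["wrong"])
```

In this toy case the overall figure looks respectable even though two of the five failures are missed, which is the kind of imbalance a data set with few failures produces.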
Conclusions and further work
Would it help if we had values for:
–Adequate time was allowed for each of the phases
–How well did the team members work together?
–How high was the motivation of the team members?
–Working environment
Next steps
–How best to deal with the missing values?
–Use Bayesian networks
–What are the missing failure factors?
We don’t believe that in all cases failure is the converse of success