Understanding the Human Estimator


1 Understanding the Human Estimator
Gary D. Boetticher, Nazim Lokhandwala, James C. Helm
Univ. of Houston - Clear Lake, Houston, TX, USA
The 2nd International Predictor Models in Software Engineering (PROMISE) Workshop

2 Chaos Chronicles [Standish03]
Introduction. Chaos Chronicles [Standish03]: 300 billion dollars; 250,000 new projects; 1.2 million dollars per project.

3 Boehm’s 4X http://nas.cl.uh.edu/boetticher/publications.html
Boehm’s Software Engineering Economics.

4 Types of Estimation [Jorgensen04]
Algorithmic and Machine Learners: 7 - 16%
% Human-Based (value missing from the transcript)
Source: [Jorgensen04] Jorgensen M., "A review of studies on Expert Estimation of Software Development Effort", Journal of Systems and Software, 2004.
Notes: Jorgensen summarized and compared various surveys. McAuley's survey gave the numbers above. Heemstra: Human-Based 62%, Algorithmic 14%, Other 9%. Wydenbach: Human-Based 86%, Algorithmic 26%, Other 11%.

5 Research Focus
Number of papers on software estimation in IEEE [Jorgensen02]: Human-Based Estimation (17%), Other (83%). Based on a search of estimation papers in the journals IEEE Transactions on Software Engineering, Journal of Systems and Software, Journal of Information and Software Technology, and Journal of Empirical Software Engineering.

6 How do human demographics affect human-based estimation?
Statement of Problem: How do human demographics affect human-based estimation? Can predictive models be constructed using human demographics?

7 Investigation Procedure
Survey: collect demographics from participants; request participants to estimate software components.
Build models (Estimates vs. Actuals).

8 Which Demographics?
Basic Demographics; Academic Background; Work Experience; Domain Experience

9 The Survey http://nas.cl.uh.edu/boetticher/EffortEstimationSurvey.html

10 Competitive Procurement Software
Diagram components: Supplier Software (Supplier1, Supplier2, ..., Suppliern); Buyer Software (Buyer Admin, Buyer1, ..., Buyern); Distribution Server.

11 Sample Estimation Screenshots

12 Survey Results Screenshots

13 Data Collection
Invitations; Filtered Incomplete Records; 122 Final Records

14 Participant Educational Background
Subject                         Courses    Mean    Maximum  Standard Deviation
Computer Science                Undergrad  8.8525  70       (missing)
Computer Science                Grad       2.4262  15       3.2293
Hardware                        Undergrad  3.5246  64       8.0209
Hardware                        Grad       0.5000  10       1.3252
Management Information Systems  Undergrad  0.7705  12       1.5892
Management Information Systems  Grad       0.4918  9        1.3742
Project Management              Undergrad  0.2951  4        0.6886
Project Management              Grad       0.8115  6        1.1806
Software Engineering            Undergrad  0.9180  7        1.2958
Software Engineering            Grad       2.1557  21       3.1202
Most of the participants hold Bachelors or Masters degrees.

15 Participant Work Experience
                             Mean    Maximum  Standard Deviation (Years)
Years of Experience As
  Hardware Project Manager   0.6557  15       1.9251
  Software Project Manager   1.3443  10       2.0811
No of Projects estimated
  Hardware Projects          0.8279  20       2.6307
  Software Projects          2.9508  28       4.4848

16 Participant Domain Experience
Domain Experience        Mean    Maximum (Years)  Standard Deviation
Procurement and Billing  0.6209  10               1.3818
Process Industry         0.7274  20               2.2512

17 Data Preparation
INPUT: 69% zeros, needs consolidation (Courses, Workshops, Conferences, Programming Exp.). 45 attributes reduced to 14 attributes. Highest Degree Achieved needs transformation.
OUTPUT: MRE = Abs(Total Actual - Total Est.) / (Total Actual)
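The MRE output defined on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `mre` and the sample totals are assumptions, not values from the study.

```python
def mre(total_actual: float, total_estimate: float) -> float:
    """Magnitude of relative error: Abs(Total Actual - Total Est.) / Total Actual."""
    if total_actual == 0:
        # The ratio is undefined when the actual total is zero.
        raise ValueError("total_actual must be non-zero")
    return abs(total_actual - total_estimate) / total_actual


if __name__ == "__main__":
    # Hypothetical totals, not taken from the survey data.
    print(mre(100.0, 80.0))   # under-estimate by 20 units
    print(mre(100.0, 150.0))  # over-estimate by 50 units
```

Because of the absolute value, an under-estimate and an over-estimate of the same size give the same MRE.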

18 Build Models
Linear Regression (Excel); Non-Linear Regression (DataFit); Genetic Programming (GDB_GP)
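As a rough illustration of the simplest of these models, the sketch below fits a least-squares line to one predictor and computes R squared. It is not the authors' Excel setup: the study used multiple demographic attributes, whereas this uses a single hypothetical predictor, and all names and data values are assumptions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("ys must not all be equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical data: years of experience vs. MRE of an estimate.
    years = [0, 1, 2, 3, 4]
    errors = [0.9, 0.7, 0.6, 0.4, 0.3]
    slope, intercept = fit_line(years, errors)
    print(slope, intercept, r_squared(years, errors, slope, intercept))
```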

19 GP Configuration
3 Settings; 1000 Chromosomes; 50 Generations; 20 Trials each

20 Results: All Demographic Factors
Best Values of R Squared with Min. Std. Error
            Linear Regression  Genetic Programming  Non-Linear Regression
R Squared   0.1550             0.9174               0.8847
Std. Error  4.4580             1.3875               1.6470
T-Test between Average R Square Values
            Linear Regression  Genetic Programming  Non-Linear Regression
Mean        0.1550             0.5592               0.8847
T-test: 3.45E-17, 1.87E-15
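The slides do not say which t-test variant was used. One plausible reading, stated here as an assumption, is that the R squared values from the repeated GP trials were compared against the single R squared value of a regression model. Under that assumption, a one-sample t statistic can be sketched as follows; the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(samples, reference):
    """t = (sample mean - reference) / (sample std dev / sqrt(n))."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    s = stdev(samples)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("samples must not all be equal")
    return (mean(samples) - reference) / (s / sqrt(n))


if __name__ == "__main__":
    # Hypothetical R squared values from repeated GP trials (not study data),
    # compared against the linear regression R squared from the slide.
    gp_trials = [0.52, 0.58, 0.55, 0.60, 0.54]
    print(one_sample_t(gp_trials, 0.1550))
```

Turning the statistic into a probability requires the t distribution with n - 1 degrees of freedom, which a statistics package or a spreadsheet function would normally supply.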

21 Results: Educational Factors
Best Values of R Squared with Min. Std. Error
            Linear Regression  Genetic Programming  Non-Linear Regression
R Squared   0.0373             0.2784               0.2136
Std. Error  4.6101             3.9738               4.1667
T-Test between Average R Square Values
            Linear Regression  Genetic Programming  Non-Linear Regression
Mean        0.0373             0.1973               0.2136
T-test: 2.74E-13, 0.0486

22 Results: Work Experience
Best Values of R Squared with Min. Std. Error
            Linear Regression  Genetic Programming  Non-Linear Regression
R Squared   0.0596             0.7572               0.3698
Std. Error  4.5169             2.2855               4.0644
T-Test between Average R Square Values
            Linear Regression  Genetic Programming  Non-Linear Regression
Mean        0.0596             0.5564               0.3698
T-test: 2.73E-19, 1.54E-11

23 Results: Domain Experience
Best Values of R Squared with Min. Std. Error
            Linear Regression  Genetic Programming  Non-Linear Regression
R Squared   0.0243             0.5911               0.3260
Std. Error  4.5425             2.9283               3.9091
T-Test between Average R Square Values
            Linear Regression  Genetic Programming  Non-Linear Regression
Mean        0.0243             0.5405               0.3260
T-test: 3.27E-23, 4.55E-16

24 Summary of All Experiments
R Square Values
                        Linear Regression  Best Case Genetic Prog.  Avg. Case Genetic Prog.  Non-Linear Regression
All Factors             0.1550             0.9174                   0.5592                   0.8847
Education Only          0.0373             0.2784                   0.1973                   0.2136
Work Experience Only    0.0596             0.7572                   0.5564                   0.3698
Domain Experience Only  0.0243             0.5911                   0.5405                   0.3260
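The summary values can be loaded into a small Python structure to see how the factor groups rank under each model. The numbers are the ones on this slide; the dictionary keys (`lr`, `gp_best`, `gp_avg`, `nlr`) and the helper name are labels chosen for this sketch.

```python
# R squared values from the summary slide, keyed by factor group.
SUMMARY = {
    "All Factors":            {"lr": 0.1550, "gp_best": 0.9174, "gp_avg": 0.5592, "nlr": 0.8847},
    "Education Only":         {"lr": 0.0373, "gp_best": 0.2784, "gp_avg": 0.1973, "nlr": 0.2136},
    "Work Experience Only":   {"lr": 0.0596, "gp_best": 0.7572, "gp_avg": 0.5564, "nlr": 0.3698},
    "Domain Experience Only": {"lr": 0.0243, "gp_best": 0.5911, "gp_avg": 0.5405, "nlr": 0.3260},
}


def rank_by(model: str) -> list[str]:
    """Factor groups ordered from highest to lowest R squared for one model."""
    return sorted(SUMMARY, key=lambda group: SUMMARY[group][model], reverse=True)


if __name__ == "__main__":
    for model in ("lr", "gp_best", "gp_avg", "nlr"):
        print(model, rank_by(model))
```

Under every model, All Factors ranks first. For average-case GP, Work Experience and Domain Experience are close (0.5564 vs. 0.5405) and Education is last.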

25 Best Equation: All Factors. r2 = 0.9174
Too Much of a Good Thing? Best Equation: All Factors. r2 = ((Log (TechGradCourses + (TechGradCourses ^ ((Log TotWShops)/(Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (((ProcIndExp + (Log (Sin MgmtGradCourses)))/(Sin SWPMExp)) + (Sin ((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Sin SWPMExp)))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((Log SWProjEstExp) / (((Log (ProcIndExp + (Log (TechGradCourses ^ ((Log SWProjEstExp) / (Log SWProjEstExp)))))) - 3) / (ProcIndExp + (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (TechGradCourses ^ (Log SWProjEstExp))))) / (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp)))))))))))))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) + ((Log SWProjEstExp) / (Log SWProjEstExp)))))) / (Log (Log (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp))))))))))))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Log ((((Log TotLangExp) / (Log SWProjEstExp)) / (Log 
SWProjEstExp)) / (Sin SWPMExp))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))))))) + (((((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + ((TechGradCourses ^ (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))) / (Sin SWPMExp))))))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (Sin SWPMExp)))

26 Conclusions
Viability of a human-based est. model
Model assessment: Non-linear  GP
Impact on Human Based Estimation: 1) All Factors 2) Domain Experience  Work Experience 3) Education

27 Future Directions
Equation Optimizer for GP
Collect More Data
Further analysis without consolidation
Detailed Effect of Educational Factors
Use other statistical indicators
Build other models: Hybrid (Non-linear and GP); Classifiers
Impact of process on estimation

28 Questions? http://nas.cl.uh.edu/boetticher/publications.html

29 Thank You ! http://nas.cl.uh.edu/boetticher/publications.html

