1
Understanding the Human Estimator
Gary D. Boetticher, Univ. of Houston - Clear Lake, Houston, TX, USA
Nazim Lokhandwala, Univ. of Houston - Clear Lake, Houston, TX, USA
James C. Helm, Univ. of Houston - Clear Lake, Houston, TX, USA
The 2nd International Predictor Models in Software Engineering (PROMISE) Workshop
2
Introduction
Chaos Chronicles [Standish03]
300 billion dollars
250,000 new projects
1.2 million dollars per project
3
Boehm’s 4X http://nas.cl.uh.edu/boetticher/publications.html
Boehm’s Software Engineering Economics.
4
Types of Estimation [Jorgensen04]
…% Human-Based
7 - 16% Algorithmic and Machine Learners
[Jorgensen04] Jorgensen, M., “A Review of Studies on Expert Estimation of Software Development Effort,” Journal of Systems and Software, 2004.
Jorgensen summarized and compared various surveys; McAuley’s survey gave the above numbers. Heemstra: Human-Based 62%, Algorithmic 14%, Other 9%. Wydenbach: Human-Based 86%, Algorithmic 26%, Other 11%.
5
Research Focus
Number of Papers on Software Estimation in IEEE [Jorgensen02]
Human-Based Estimation (17%)
Other (83%)
Based on a search of estimation papers in the journals IEEE Transactions on Software Engineering, Journal of Systems and Software, Information and Software Technology, and Empirical Software Engineering.
6
Statement of Problem
How do human demographics affect human-based estimation?
Can predictive models be constructed using human demographics?
7
Investigation Procedure
Collect demographics from participants
Request participants to estimate software components
Build models (Estimates vs. Actuals)
Survey
8
Which Demographics?
Basic Demographics
Academic Background
Work Experience
Domain Experience
9
The Survey http://nas.cl.uh.edu/boetticher/EffortEstimationSurvey.html
10
Competitive Procurement Software
[Diagram labels: Supplier Software (Supplier1, Supplier2, ..., Suppliern); Buyer Software (Buyer Admin, Buyer1, ..., Buyern); Distribution Server]
11
Sample Estimation Screenshots
12
Survey Results Screenshots
13
Data Collection
Invitations
Filtered Incomplete Records
122 Final Records
14
Participant Educational Background
Category                          Level                Mean     Maximum   Standard Deviation
Computer Science                  Undergrad Courses    8.8525   70
Computer Science                  Grad Courses         2.4262   15        3.2293
Hardware                          Undergrad Courses    3.5246   64        8.0209
Hardware                          Grad Courses         0.5000   10        1.3252
Management Information Systems    Undergrad Courses    0.7705   12        1.5892
Management Information Systems    Grad Courses         0.4918   9         1.3742
Project Management                Undergrad Courses    0.2951   4         0.6886
Project Management                Grad Courses         0.8115   6         1.1806
Software Engineering              Undergrad Courses    0.9180   7         1.2958
Software Engineering              Grad Courses         2.1557   21        3.1202

Most of the participants hold Bachelor's or Master's degrees.
15
Participant Work Experience
                                Mean     Maximum   Standard Deviation (Years)
Years of Experience As
  Hardware Project Manager      0.6557   15        1.9251
  Software Project Manager      1.3443   10        2.0811
No. of Projects Estimated
  Hardware Projects             0.8279   20        2.6307
  Software Projects             2.9508   28        4.4848
16
Participant Domain Experience
Domain Experience          Mean     Maximum (Years)   Standard Deviation
Procurement and Billing    0.6209   10                1.3818
Process Industry           0.7274   20                2.2512
17
Data Preparation
INPUT=
69% zeros…Needs Consolidation
Courses, Workshops, Conferences, Programming Exp.
45 attributes reduced to 14 attributes
Highest Degree Achieved…Needs Transformation
OUTPUT=
MRE = Abs(Total Actual - Total Est.) / (Total Actual)
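The MRE output defined on this slide can be computed directly. A minimal Python sketch follows; the function name `mre` and the sample totals are illustrative assumptions, not values from the survey.

```python
def mre(total_actual: float, total_estimate: float) -> float:
    """Magnitude of relative error: |actual - estimate| / actual."""
    if total_actual == 0:
        raise ValueError("total_actual must be non-zero")
    return abs(total_actual - total_estimate) / total_actual


# Hypothetical totals (not survey data): actual 40 hours, estimate 30 hours.
example = mre(40.0, 30.0)
print(example)  # 0.25
```

An MRE of 0 means the participant's total estimate matched the total actual effort; larger values mean a larger miss in either direction.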
18
Build Models
Linear Regression (Excel)
Non-Linear Regression (DataFit)
Genetic Programming (GDB_GP)
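As a rough illustration of how a model of this kind is scored, the sketch below fits a one-variable least-squares line and reports R squared. It is not the Excel, DataFit, or GDB_GP workflow used in the study; the function names and the toy data are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical): one demographic attribute vs. the MRE of an estimate.
years = [0.0, 1.0, 2.0, 3.0, 4.0]
mre_values = [0.9, 0.8, 0.5, 0.45, 0.2]

slope, intercept = fit_line(years, mre_values)
predicted = [slope * x + intercept for x in years]
print(round(r_squared(mre_values, predicted), 4))
```

The same R squared measure applies to the non-linear and genetic programming models; only the way the predictions are produced changes.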
19
GP Configuration
3 Settings
1000 Chromosomes
50 Generations
20 Trials each
20
Results: All Demographic Factors
Best Values of R Squared with Min. Std. Error
              Linear Regression   Genetic Programming   Non-Linear Regression
R Squared     0.1550              0.9174                0.8847
Std. Error    4.4580              1.3875                1.6470

T-Test between Average R Square Values
              Linear Regression   Genetic Programming   Non-Linear Regression
Mean          0.1550              0.5592                0.8847
T-test values (in slide order): 3.45E-17, 1.87E-15
21
Results: Educational Factors
Best Values of R Squared with Min. Std. Error
              Linear Regression   Genetic Programming   Non-Linear Regression
R Squared     0.0373              0.2784                0.2136
Std. Error    4.6101              3.9738                4.1667

T-Test between Average R Square Values
              Linear Regression   Genetic Programming   Non-Linear Regression
Mean          0.0373              0.1973                0.2136
T-test values (in slide order): 2.74E-13, 0.0486
22
Results: Work Experience
Best Values of R Squared with Min. Std. Error
              Linear Regression   Genetic Programming   Non-Linear Regression
R Squared     0.0596              0.7572                0.3698
Std. Error    4.5169              2.2855                4.0644

T-Test between Average R Square Values
              Linear Regression   Genetic Programming   Non-Linear Regression
Mean          0.0596              0.5564                0.3698
T-test values (in slide order): 2.73E-19, 1.54E-11
23
Results: Domain Experience
Best Values of R Squared with Min. Std. Error
              Linear Regression   Genetic Programming   Non-Linear Regression
R Squared     0.0243              0.5911                0.3260
Std. Error    4.5425              2.9283                3.9091

T-Test between Average R Square Values
              Linear Regression   Genetic Programming   Non-Linear Regression
Mean          0.0243              0.5405                0.3260
T-test values (in slide order): 3.27E-23, 4.55E-16
24
Summary of All Experiments
R Square Values
                          Linear Regression   Best Case Genetic Prog.   Avg. Case Genetic Prog.   Non-Linear Regression
All Factors               0.1550              0.9174                    0.5592                    0.8847
Education Only            0.0373              0.2784                    0.1973                    0.2136
Work Experience Only      0.0596              0.7572                    0.5564                    0.3698
Domain Experience Only    0.0243              0.5911                    0.5405                    0.3260
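The comparison this summary invites can be reproduced with a few lines of Python. The R Square values are the ones listed on the slide; the dictionary layout and the helper name `best_model` are assumptions for this sketch.

```python
# R Square values copied from the summary slide.
summary = {
    "All Factors": {
        "Linear Regression": 0.1550,
        "Best Case Genetic Prog.": 0.9174,
        "Avg. Case Genetic Prog.": 0.5592,
        "Non-Linear Regression": 0.8847,
    },
    "Education Only": {
        "Linear Regression": 0.0373,
        "Best Case Genetic Prog.": 0.2784,
        "Avg. Case Genetic Prog.": 0.1973,
        "Non-Linear Regression": 0.2136,
    },
    "Work Experience Only": {
        "Linear Regression": 0.0596,
        "Best Case Genetic Prog.": 0.7572,
        "Avg. Case Genetic Prog.": 0.5564,
        "Non-Linear Regression": 0.3698,
    },
    "Domain Experience Only": {
        "Linear Regression": 0.0243,
        "Best Case Genetic Prog.": 0.5911,
        "Avg. Case Genetic Prog.": 0.5405,
        "Non-Linear Regression": 0.3260,
    },
}


def best_model(row):
    """Return the model name with the highest R Square in one table row."""
    return max(row, key=row.get)


# Which model scores highest for each set of demographic factors?
for factors, row in summary.items():
    winner = best_model(row)
    print(f"{factors}: {winner} ({row[winner]:.4f})")

# Rank the factor sets by the average-case genetic programming result.
ranking = sorted(
    summary,
    key=lambda name: summary[name]["Avg. Case Genetic Prog."],
    reverse=True,
)
print(ranking)
```

Ranking by another column, such as Non-Linear Regression, only requires changing the key in the `sorted` call.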
25
Too Much of a Good Thing?
Best Equation: All Factors. r2 = 0.9174
((Log (TechGradCourses + (TechGradCourses ^ ((Log TotWShops)/(Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (((ProcIndExp + (Log (Sin MgmtGradCourses)))/(Sin SWPMExp)) + (Sin ((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Sin SWPMExp)))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((Log SWProjEstExp) / (((Log (ProcIndExp + (Log (TechGradCourses ^ ((Log SWProjEstExp) / (Log SWProjEstExp)))))) - 3) / (ProcIndExp + (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (TechGradCourses ^ (Log SWProjEstExp))))) / (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp)))))))))))))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) + ((Log SWProjEstExp) / (Log SWProjEstExp)))))) / (Log (Log (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp))))))))))))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Log ((((Log TotLangExp) / (Log SWProjEstExp)) / (Log SWProjEstExp)) / (Sin SWPMExp))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))))))) + (((((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + ((TechGradCourses ^ (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))) / (Sin SWPMExp))))))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (Sin SWPMExp)))
26
Conclusions
Viability of a human-based estimation model
Model assessment
Non-linear
GP
Impact on Human-Based Estimation
1) All Factors
2) Domain Experience, Work Experience
3) Education
27
Future Directions
Equation Optimizer for GP
Collect More Data
Further analysis without consolidation
Detailed Effect of Educational Factors
Use other statistical indicators
Build other models
Hybrid (Non-linear and GP)
Classifiers
Impact of process on estimation
28
Questions? http://nas.cl.uh.edu/boetticher/publications.html
29
Thank You ! http://nas.cl.uh.edu/boetticher/publications.html