High Dependability Computing in a Competitive World
Barry Boehm, USC
NASA/CMU HDC Workshop, January 10, 2001
(boehm@sunset.usc.edu; http://sunset.usc.edu)
HDC in a Competitive World
- The economics of IT competition and dependability
- Software Dependability Opportunity Tree
  - Decreasing defects
  - Decreasing defect impact
  - Continuous improvement
- Attacking the future
- Conclusions and References
01/10/01 ©USC-CSE
Competing on Cost and Quality - adapted from Michael Porter, Harvard
[Chart: ROI vs. price/cost point. Cost leaders (with acceptable quality) and quality leaders (with acceptable cost) sit at the high-ROI ends of the curve; under-priced, over-priced, and "stuck in the middle" competitors earn lower ROI.]
01/10/01 ©USC-CSE
Competing on Schedule and Quality - A risk analysis approach
Risk Exposure: RE = Prob(Loss) * Size(Loss)
"Loss" - financial; reputation; future prospects; ...
For multiple sources of loss: RE = Σ over sources of [Prob(Loss) * Size(Loss)] for each source
01/10/01 ©USC-CSE
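A minimal Python sketch of the multi-source risk-exposure sum; the loss sources, probabilities, and sizes are invented for illustration, not taken from the talk.

```python
# Risk exposure RE = sum over sources of P(Loss) * S(Loss).
# The sources and numbers below are illustrative assumptions.
loss_sources = {
    # source: (probability of loss, size of loss in dollars)
    "critical defect escapes to the field": (0.05, 2_000_000),
    "reputation damage from an outage":     (0.02, 5_000_000),
    "late delivery erodes market share":    (0.30,   500_000),
}

total_re = sum(p * s for p, s in loss_sources.values())
for name, (p, s) in loss_sources.items():
    print(f"{name:40s} RE = {p * s:>10,.0f}")
print(f"{'total risk exposure':40s} RE = {total_re:>10,.0f}")
```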
Example RE Profile: Time to Ship - Loss due to unacceptable dependability
[Chart: RE = P(L) * S(L) vs. time to ship (amount of testing). With little testing there are many defects (high P(L)) and critical defects (high S(L)); with more testing there are few defects (low P(L)) and minor defects (low S(L)), so this RE curve falls as testing continues.]
01/10/01 ©USC-CSE
Example RE Profile: Time to Ship - Loss due to unacceptable dependability; loss due to market share erosion
[Chart: adds a second RE = P(L) * S(L) curve for market share erosion. Shipping early means few, weak rivals (low P(L), low S(L)); shipping late means many, strong rivals (high P(L), high S(L)), so this curve rises with time to ship.]
01/10/01 ©USC-CSE
Example RE Profile: Time to Ship - Sum of Risk Exposures
[Chart: the falling dependability-loss curve and the rising market-erosion curve are summed; the minimum of the summed RE curve marks the Sweet Spot for time to ship (amount of testing).]
01/10/01 ©USC-CSE
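A minimal sketch of how the sweet spot falls out of summing the two RE curves; the curve shapes and constants are assumptions chosen only to mimic the qualitative profiles in the chart.

```python
import math

# Illustrative RE curves as functions of testing time t (months).
def re_dependability(t):    # falls as testing increases (fewer, less critical defects)
    return 10.0 * math.exp(-0.5 * t)

def re_market_erosion(t):   # rises as shipping is delayed (more, stronger rivals)
    return 0.5 * math.exp(0.25 * t)

# Sweet spot = amount of testing that minimizes the summed risk exposure.
candidates = [t / 10 for t in range(0, 201)]   # 0 to 20 months, in 0.1-month steps
sweet_spot = min(candidates, key=lambda t: re_dependability(t) + re_market_erosion(t))
total = re_dependability(sweet_spot) + re_market_erosion(sweet_spot)
print(f"sweet spot at ~{sweet_spot:.1f} months of testing, total RE = {total:.2f}")
```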
Comparative RE Profile: Safety-Critical System
[Chart: for a safety-critical system, the higher S(L) of defects raises the dependability-loss curve, moving the High-Q sweet spot toward more testing than the mainstream sweet spot.]
01/10/01 ©USC-CSE
Comparative RE Profile: Internet Startup
[Chart: for an internet startup, the higher S(L) of delays raises the market-erosion curve, moving the Low-TTM (time-to-market) sweet spot toward less testing than the mainstream sweet spot.]
01/10/01 ©USC-CSE
Conclusions So Far
- Unwise to try to compete on both cost/schedule and quality
  - Some exceptions: major technology or marketplace edge
- There are no one-size-fits-all cost/schedule/quality strategies
- Risk analysis helps determine how much testing (prototyping, formal verification, etc.) is enough
  - Buying information to reduce risk
  - Often difficult to determine parameter values
  - Some COCOMO II values discussed next
01/10/01 ©USC-CSE
Software Development Cost/Quality Tradeoff - COCOMO II calibration to 161 projects
RELY Rating | Defect Risk | Rough MTBF (mean time between failures) | Relative Cost/Source Instruction | Example
Very Low | Slight inconvenience | 1 hour | 0.82 | Startup demo
Low | Low, easily recoverable loss | 1 day | 0.92 | Commercial cost leader
Nominal | Moderate, recoverable loss | 1 month | 1.00 | In-house support software
High | Financial loss | 2 years | 1.10 | Commercial quality leader
Very High | Loss of human life | 100 years | 1.26 | Safety-critical
01/10/01
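As an illustration of how these RELY multipliers scale an effort estimate, here is a hedged Python sketch; the multipliers are the chart's calibrated values, but the nominal-effort curve (coefficient 2.94, exponent 1.10) is a simplified stand-in for the full COCOMO II model, and the 100-KSLOC size is invented.

```python
# Sketch of the RELY cost driver scaling a COCOMO-style effort estimate.
# The base effort equation is a simplified placeholder, not full COCOMO II.

RELY_MULTIPLIER = {
    "Very Low":  0.82,  # slight inconvenience (~1 hour MTBF)  - startup demo
    "Low":       0.92,  # easily recoverable loss (~1 day)     - commercial cost leader
    "Nominal":   1.00,  # moderate recoverable loss (~1 month) - in-house support software
    "High":      1.10,  # financial loss (~2 years)            - commercial quality leader
    "Very High": 1.26,  # loss of human life (~100 years)      - safety-critical
}

def effort_pm(ksloc, rely="Nominal"):
    nominal_effort = 2.94 * ksloc ** 1.10   # simplified nominal-effort curve (assumption)
    return nominal_effort * RELY_MULTIPLIER[rely]

for rating in RELY_MULTIPLIER:
    print(f"{rating:9s}: {effort_pm(100, rating):6.1f} person-months for 100 KSLOC")
```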
COCOMO II RELY Factor Dispersion
[Chart: dispersion of the calibration data across RELY ratings from Very Low to High.] The RELY effect is statistically significant: t = 2.6, versus t > 1.9 required for significance.
01/10/01 ©USC-CSE
COCOMO II RELY Factor Phenomenology
Rqts. and Product Design:
- RELY = Very Low: little detail; many TBDs; little verification; minimal QA, CM, standards, draft user manual, test plans; minimal PDR
- RELY = Very High: detailed verification, QA, CM, standards, PDR, documentation; IV&V interface; very detailed test plans, procedures
Detailed Design:
- RELY = Very Low: basic design information; minimal QA, CM, standards, draft user manual, test plans; informal design inspections
- RELY = Very High: detailed verification, QA, CM, standards, CDR, documentation; very thorough design inspections; very detailed test plans, procedures; IV&V interface; less rqts. rework
Code and Unit Test:
- RELY = Very Low: no test procedures; minimal path test, standards check; minimal QA, CM; minimal I/O and off-nominal tests; minimal user manual
- RELY = Very High: detailed test procedures, QA, CM, documentation; very thorough code inspections; very extensive off-nominal tests; IV&V interface; less rqts., design rework
Integration and Test:
- RELY = Very Low: no test procedures; many requirements untested; minimal QA, CM; minimal stress and off-nominal tests; minimal as-built documentation
- RELY = Very High: very detailed test procedures, QA, CM, documentation; very extensive stress and off-nominal tests; IV&V interface; less rqts., design, code rework
01/10/01 ©USC-CSE
“Quality is Free” - Did Philip Crosby’s book get it all wrong?
Investments in dependable systems:
- Cost extra for simple, short-life systems
- Pay off for high-value, long-life systems
01/10/01 ©USC-CSE
Software Life-Cycle Cost vs. Dependability
[Chart: relative cost vs. COCOMO II RELY rating. Relative cost to develop: 0.82 (Very Low), 0.92 (Low), 1.00 (Nominal), 1.10 (High), 1.26 (Very High). Adding maintenance cost (curves with data points 1.23, 0.99, 1.07, and 1.20, 1.11, 1.05 for a 70%-maintenance case) erases the apparent savings of low dependability.]
Low dependability is inadvisable for evolving systems.
01/10/01 ©USC-CSE
Software Ownership Cost vs. Dependability
[Chart: relative cost to develop, maintain, own, and operate vs. COCOMO II RELY rating (Very Low to Very High). One curve assumes operational-defect cost = 0 (the develop-and-maintain, 70%-maintenance curve above); the other assumes that operational-defect cost at Nominal dependability equals the software life-cycle cost. Under the latter assumption, ownership cost climbs to 1.52 at Low and 2.55 at Very Low RELY, while plotted values at the high-dependability end drop as low as 0.76 and 0.69.]
01/10/01 ©USC-CSE
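A hedged sketch of how such an ownership-cost curve can be assembled from development, maintenance, and operational-defect components; the development multipliers come from the earlier RELY chart, but the maintenance multipliers, defect-cost figures, and 70% maintenance fraction below are illustrative assumptions, not the calibrated values behind the chart.

```python
# Ownership cost ~ development + maintenance + operational-defect cost.
# Only DEV_MULT comes from the RELY chart; everything else is an assumption.

DEV_MULT    = {"Very Low": 0.82, "Low": 0.92, "Nominal": 1.00, "High": 1.10, "Very High": 1.26}
# Assumption: low dependability inflates maintenance; high dependability reduces it.
MAINT_MULT  = {"Very Low": 1.35, "Low": 1.15, "Nominal": 1.00, "High": 0.90, "Very High": 0.85}
# Assumption: operational-defect cost shrinks roughly with the MTBF improvement.
DEFECT_COST = {"Very Low": 1.60, "Low": 0.60, "Nominal": 0.10, "High": 0.02, "Very High": 0.00}

def ownership_cost(rely, maint_fraction=0.7, defect_weight=1.0):
    dev = DEV_MULT[rely]
    maint = maint_fraction * MAINT_MULT[rely]    # maintenance as a fraction of development
    defects = defect_weight * DEFECT_COST[rely]  # operational losses from field defects
    return dev + maint + defects

for rely in DEV_MULT:
    print(f"{rely:9s}: relative ownership cost ~ {ownership_cost(rely):.2f}")
```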
Conclusions So Far - 2
- Quality is better than free for high-value, long-life systems
- There is no universal dependability sweet spot
  - Yours will be determined by your value model
  - And by the relative contributions of dependability techniques
- Let’s look at these next
01/10/01 ©USC-CSE
HDC in a Competitive World
- The economics of IT competition and dependability
- Software Dependability Opportunity Tree
  - Decreasing defects
  - Decreasing defect impact
  - Continuous improvement
- Attacking the future
- Conclusions
01/10/01 ©USC-CSE
Software Dependability Opportunity Tree
- Decrease defect risk exposure
  - Decrease defect Prob(Loss): defect prevention; defect detection and removal
  - Decrease defect impact, Size(Loss): value/risk-based defect reduction; graceful degradation
- Continuous improvement: CI methods and metrics; process, product, people technology
01/10/01 ©USC-CSE
Software Defect Prevention Opportunity Tree
Defect Prevention:
- People practices: IPT, JAD, WinWin,…; PSP, Cleanroom, dual development,…; staffing for dependability
- Standards: rqts., design, code,…; interfaces, traceability,…; checklists
- Languages
- Prototyping: manual execution, scenarios,…
- Modeling & simulation
- Reuse
- Root cause analysis
01/10/01 ©USC-CSE
People Practices: Some Empirical Data
- Cleanroom (Software Engineering Lab): 25-75% reduction in failure rates; 5% vs. 60% of fixes requiring over 1 hour of effort
- Personal Software Process / Team Software Process: 50-75% defect reduction in a CMM Level 5 organization; even higher reductions for less mature organizations
- Staffing: many experiments find factor-of-10 differences in individuals’ defect rates
01/10/01 ©USC-CSE
Root Cause Analysis
Each defect found can trigger five analyses:
- Debugging: eliminating the defect
- Regression: ensuring that the fix doesn’t create new defects
- Similarity: looking for similar defects elsewhere
- Insertion: catching future similar defects earlier
- Prevention: finding ways to avoid such defects
How many does your organization do?
01/10/01 ©USC-CSE
Pareto 80-20 Phenomena
- 80% of the rework comes from 20% of the defects
- 80% of the defects come from 20% of the modules
- About half the modules are defect-free
- 90% of the downtime comes from < 10% of the defects
01/10/01 ©USC-CSE
Pareto Analysis of Rework Costs
[Chart: % of cost to fix SPRs vs. % of software problem reports (SPRs) for TRW Project A (373 SPRs) and TRW Project B (1005 SPRs); a small fraction of the SPRs accounts for most of the rework cost.]
Major rework sources were off-nominal architecture-breakers: A - network failover; B - extra-long messages.
01/10/01 ©USC-CSE
Software Defect Detection Opportunity Tree
Defect Detection and Removal (rqts., design, code):
- Automated analysis: completeness checking; consistency checking (views, interfaces, behavior, pre/post conditions); traceability checking; compliance checking (models, assertions, standards) - a pre/post-condition check is sketched below
- Reviewing: peer reviews, inspections; architecture review boards; pair programming
- Testing: rqts. & design; structural; operational profile; usage (alpha, beta); regression; value/risk-based; test automation
01/10/01 ©USC-CSE
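To illustrate one automated-analysis leaf (consistency and compliance checking of pre/post conditions and assertions), here is a small Python sketch; the function and its contract are invented for illustration.

```python
# Illustrative pre/post-condition (assertion) checking.
# The function and its contract are made up for this example.

def transfer(balance, amount):
    # Preconditions: the request must be internally consistent.
    assert amount > 0, "precondition violated: transfer amount must be positive"
    assert amount <= balance, "precondition violated: insufficient balance"

    new_balance = balance - amount

    # Postconditions: the implementation must preserve the stated behavior.
    assert new_balance == balance - amount, "postcondition violated: wrong arithmetic"
    assert new_balance >= 0, "postcondition violated: balance went negative"
    return new_balance

print(transfer(100.0, 30.0))   # OK: 70.0
# transfer(100.0, 150.0)       # would raise AssertionError before corrupting state
```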
Orthogonal Defect Classification - Chillarege, 1996 01/10/01 ©USC-CSE
UMD-USC CeBASE Experience Comparisons - http://www.cebase.org
“Under specified conditions, …”
Technique Selection Guidance (UMD, USC):
- Peer reviews are more effective than functional testing for faults of omission and incorrect specification; peer reviews catch 60% of the defects
- Functional testing is more effective than reviews for faults concerning numerical approximations and control flow
Technique Definition Guidance (UMD):
- For a reviewer with an average experience level, a procedural approach to defect detection is more effective than a less procedural one
- Readers of a software artifact are more effective in uncovering defects when each uses a different and specific focus; perspective-based reviews catch 35% more defects
01/10/01
Factor-of-100 Growth in Software Cost-to-Fix
[Chart (log scale): relative cost to fix a defect vs. the phase in which it was fixed (requirements, design, code, development test, acceptance test, operation). For larger software projects (IBM-SSD, GTE, SAFEGUARD, TRW survey median, 20-80% bands) the escalation approaches a factor of 100 by operation; for smaller projects it is closer to a factor of 5.]
01/10/01 ©USC-CSE
Reducing Software Cost-to-Fix: CCPDS-R - Royce, 1998
- Architecture first: integration during the design phase; demonstration-based evaluation
- Risk management
- Configuration baseline change metrics
[Chart: average hours per change vs. project development schedule (months 10-40), for design changes, implementation changes, and maintenance changes and ECPs.]
01/10/01
COQUALMO: Constructive Quality Model
Inputs: software size estimate; software product, process, computer, and personnel attributes; defect removal capability levels.
COCOMO II produces the software development effort, cost, and schedule estimate. COQUALMO adds a Defect Introduction Model and a Defect Removal Model, producing the number of residual defects and the defect density per unit of size.
01/10/01 ©USC-CSE
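A minimal sketch of the COQUALMO structure (defects introduced via requirements, design, and code, then filtered by automated analysis, peer reviews, and execution testing); the introduction rates and removal efficiencies below are illustrative placeholders, not COQUALMO's calibrated values.

```python
# Residual defects = defects introduced - defects removed.
# All rates and efficiencies here are illustrative assumptions.

SIZE_KSLOC = 100

# Assumed defects introduced per KSLOC, by artifact type.
INTRODUCTION_RATES = {"requirements": 10, "design": 20, "code": 30}

# Assumed fraction of remaining defects removed by each defect-removal activity.
REMOVAL_EFFICIENCY = {"automated analysis": 0.30, "peer reviews": 0.55, "execution testing": 0.60}

introduced = SIZE_KSLOC * sum(INTRODUCTION_RATES.values())
remaining = introduced
for activity, efficiency in REMOVAL_EFFICIENCY.items():
    remaining -= remaining * efficiency
    print(f"after {activity:18s}: {remaining:8.0f} defects remain")

print(f"residual defect density ~ {remaining / SIZE_KSLOC:.1f} defects/KSLOC")
```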
Defect Impact Reduction Opportunity Tree
Decrease Defect Impact, Size(Loss):
- Value/risk-based defect reduction: business case analysis; Pareto (80-20) analysis; V/R-based reviews; V/R-based testing; cost/schedule/quality as independent variable
- Graceful degradation (sketched below): fault tolerance; self-stabilizing SW; reduced-capability modes; manual backup; rapid recovery
01/10/01 ©USC-CSE
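A small Python sketch of the graceful-degradation branch: fall back to a reduced-capability mode, then to a backup answer, instead of failing outright. The recommendation services named here are hypothetical.

```python
# Graceful degradation: try the full-capability path, fall back to a
# reduced-capability mode, and finally to a backup answer rather than failing.
# The services below are hypothetical.

def full_recommendations(user_id):
    raise TimeoutError("personalization service unavailable")   # simulate a fault

def cached_recommendations(user_id):
    return ["top seller 1", "top seller 2"]                     # reduced-capability mode

def recommendations(user_id):
    try:
        return full_recommendations(user_id)
    except Exception:
        try:
            return cached_recommendations(user_id)
        except Exception:
            return []        # backup: empty list, the page still renders

print(recommendations("user-42"))   # degrades to the cached, reduced-capability result
```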
Software Dependability Opportunity Tree
- Decrease defect risk exposure
  - Decrease defect Prob(Loss): defect prevention; defect detection and removal
  - Decrease defect impact, Size(Loss): value/risk-based defect reduction; graceful degradation
- Continuous improvement: CI methods and metrics; process, product, people technology
01/10/01 ©USC-CSE
CeBASE: The Experience Factory Organization - Exemplar: NASA Software Engineering Lab
[Diagram: the Project Organization takes environment characteristics through 1. Characterize, 2. Set Goals, and 3. Choose Process, producing execution plans for 4. Execute; products, lessons learned, models, and process data flow to the Experience Factory, which performs 5. Analyze (project analysis, process modification) and 6. Package (generalize, tailor, formalize, disseminate) into an Experience Base, returning tailorable knowledge and consulting as Project Support.]
Goal-Model-Question-Metric Approach to Testing
- Goal: Accept product. Models: product requirements. Questions: what inputs and outputs verify requirements compliance? Metrics: percent of requirements tested.
- Goal: Cost-effective defect detection. Models: defect source models - Orthogonal Defect Classification (ODC); defect-prone modules. Questions: how best to mix and sequence automated analysis, reviewing, and testing? Metrics: defect detection yield by defect class; defect detection costs, ROI; expected vs. actual defect detection by class.
- Goal: No known delivered defects. Models: defect closure rate models. Questions: how many outstanding defects? what is their closure status? Metrics: percent of defect reports closed, expected vs. actual.
- Goal: Meeting target reliability, defect density (DDD). Models: operational profiles; reliability estimation models; DDD estimation models. Questions: how well are we progressing toward targets? how long will it take to reach them? Metrics: estimated vs. target reliability, DDD; estimated time to achieve targets.
- Goal: Minimize adverse effects of defects. Models: business-case or mission models of value by feature, quality attribute, delivery time; tradeoff relationships. Questions: what HDC techniques best address high-value elements? how well are project techniques minimizing adverse effects? Metrics: risk exposure reduction vs. time; business-case based "earned value"; value-weighted defect detection yields for each technique.
Note: No one-size-fits-all solution
01/10/01 ©USC-CSE
GMQM Paradigm: Value/Risk-Driven Testing
- Goal: minimize adverse effects of defects
- Models: business-case or mission models of value (by feature, quality attribute, or delivery time); attribute tradeoff relationships
- Questions: what HDC techniques best address high-value elements? how well are project techniques minimizing adverse effects?
- Metrics: risk exposure reduction vs. time; business-case based “earned value”; value-weighted defect detection yields by technique (see the sketch below)
01/10/01 ©USC-CSE
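A small sketch of the value-weighted defect detection yield metric; the defect records and dollar values are invented for illustration.

```python
# Value-weighted defect detection yield: each technique's yield is weighted
# by the business value at risk from the defects it catches.
# The defect records below are illustrative.

defects = [
    # (detected_by, value_at_risk)
    ("peer review",        80_000),
    ("peer review",        15_000),
    ("automated analysis",  5_000),
    ("testing",           120_000),
    ("escaped to field",   200_000),
]

total_value = sum(value for _, value in defects)
by_technique = {}
for technique, value in defects:
    by_technique[technique] = by_technique.get(technique, 0) + value

for technique, value in by_technique.items():
    print(f"{technique:20s}: value-weighted yield = {value / total_value:5.1%}")
```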
“Dependability” Attributes and Tradeoffs
- Robustness: reliability, availability, survivability
- Protection: security, safety
- Quality of service: accuracy, fidelity, performance assurance
- Integrity: correctness, verifiability
The attributes are mostly compatible and synergistic, but there are some conflicts and tradeoffs:
- Spreading information: survivability vs. security
- Fail-safe: safety vs. quality of service
- Graceful degradation: survivability vs. quality of service
01/10/01 ©USC-CSE
Resulting Reduction in Risk Exposure
[Chart: risk exposure from defects, RE = P(L) * S(L), vs. time to ship (amount of testing). Value/risk-driven testing (80-20 value distribution) reduces RE faster than acceptance or defect-density testing that treats all defects as equal.]
01/10/01 ©USC-CSE
20% of Features Provide 80% of Value: Focus Testing on These (Bullock, 2000)
[Chart: % of value for correct customer billing vs. customer type (about 15 types); a few customer types account for most of the billing value.]
01/10/01 ©USC-CSE
Cost, Schedule, Quality: Pick any Two? 01/10/01 ©USC-CSE
Cost, Schedule, Quality: Pick any Two?
Instead, consider cost, schedule, and quality as the independent variables and the feature set as the dependent variable.
[Diagram: two C-S-Q triangles.]
01/10/01 ©USC-CSE
C, S, Q as Independent Variable
- Determine desired delivered defect density (D4), or a value-based equivalent
- Prioritize desired features via QFD, IPT, stakeholder win-win
- Determine core capability (see the sketch below)
  - 90% confidence of achieving D4 within cost and schedule
  - Balance parametric models and expert judgment
- Architect for ease of adding next-priority features
  - Hide sources of change within modules (Parnas)
- Develop the core capability to the D4 quality level, usually in less than the available cost and schedule
- Add next-priority features as resources permit
Versions of this approach were used successfully on 17 of 19 USC digital library projects.
01/10/01 ©USC-CSE
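A hedged sketch of the "determine core capability" step: take features in priority order while the padded effort estimate still fits the budget. The features, estimates, budget, and the 1.3x confidence padding are all assumptions, not project data.

```python
# Greedy core-capability selection: add features in priority order, skipping
# any whose padded estimate no longer fits within the budget.
# Features, estimates, and the padding factor are illustrative assumptions.

FEATURES = [            # (name, priority, estimated person-months)
    ("search catalog",   1, 6),
    ("checkout",         2, 8),
    ("recommendations",  3, 5),
    ("admin dashboard",  4, 4),
]
BUDGET_PM = 20
CONFIDENCE_FACTOR = 1.3   # pad estimates so the core fits with ~90% confidence (assumption)

core, committed = [], 0.0
for name, _priority, estimate in sorted(FEATURES, key=lambda f: f[1]):
    padded = estimate * CONFIDENCE_FACTOR
    if committed + padded <= BUDGET_PM:
        core.append(name)
        committed += padded

print("core capability:", core)
print(f"committed (padded) effort: {committed:.1f} of {BUDGET_PM} person-months")
```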
Attacking the Future: Software Trends
“Skate to where the puck is going” - Wayne Gretzky
- Increased complexity: everything connected; opportunities for chaos (agents); systems of systems
- Decreased control: content, infrastructure, COTS components
- Faster change: time-to-market pressures; marry in haste, no leisure to repent; adapt or die (e-commerce)
- Fantastic opportunities: personal, corporate, national, global
01/10/01 ©USC-CSE
Opportunities for Chaos: User Programming - From NSF SE Workshop: http://sunset.usc.edu/Activities/aug24-25
Mass of user-programmers a new challenge:
- 40-50% of user programs have nontrivial defects
- By 2005, up to 55M “sorcerer’s apprentice” user-programmers loosing agents into cyberspace
Need research on “seat belts, air bags, safe driving guidance” for user-programmers:
- Frameworks (generic, domain-specific)
- Rules of composition, adaptation, extension
- “Crash barriers”; outcome visualization
01/10/01 ©USC-CSE
Dependable COTS-Based Systems
- Establish dependability priorities
- Assess COTS dependability risks: prototype, benchmark, testbed, operational profiles
- Wrap COTS components: just use the parts you need; interface standards and APIs help
- Prepare for inevitable COTS changes: technology watch, testbeds, synchronized refresh
01/10/01 ©USC-CSE
HDC Technology Prospects
- High-dependability change
- Architecture-based analysis and composition
- Lightweight formal methods
- Model-based analysis and assertion checking: domain models, stochastic models
- High-dependability test data generation
- Dependability attribute tradeoff analysis
- Dependable systems from less-dependable components: exploiting cheap, powerful hardware; self-stabilizing software
- Complementary empirical methods: experiments, testbeds, models, experience factory
- Accelerated technology transition, education, training
01/10/01 ©USC-CSE
Conclusions
- Future trends intensify competitive HDC challenges: complexity, criticality, decreased control, faster change
- Organizations need tailored, mixed HDC strategies
  - No universal HDC sweet spot
  - Goal/value/risk analysis is useful
  - Quantitative data and models are becoming available
  - The HDC Opportunity Tree helps sort out mixed strategies
  - Quality is better than free for high-value, long-life systems
- Attractive new HDC technology prospects are emerging: architecture- and model-based methods; lightweight formal methods; self-stabilizing software; complementary theory and empirical methods
01/10/01 ©USC-CSE
References
- V. Basili et al., “SEL’s Software Process Improvement Program,” IEEE Software, November 1995, pp. 83-87.
- B. Boehm and V. Basili, “Software Defect Reduction Top 10 List,” IEEE Computer, January 2001.
- B. Boehm et al., Software Cost Estimation with COCOMO II, Prentice Hall, 2000.
- J. Bullock, “Calculating the Value of Testing,” Software Testing and Quality Engineering, May/June 2000, pp. 56-62.
- CeBASE (Center for Empirically-Based Software Engineering), http://www.cebase.org
- R. Chillarege, “Orthogonal Defect Classification,” in M. Lyu (ed.), Handbook of Software Reliability Engineering, IEEE-CS Press, 1996, pp. 359-400.
- P. Crosby, Quality is Free, Mentor, 1980.
- R. Grady, Practical Software Metrics, Prentice Hall, 1992.
- N. Leveson, Safeware: System Safety and Computers, Addison Wesley, 1995.
- B. Littlewood et al., “Modeling the Effects of Combining Diverse Fault Detection Techniques,” IEEE Trans. SW Engr., December 2000, pp. 1157-1167.
- M. Lyu (ed.), Handbook of Software Reliability Engineering, IEEE-CS Press, 1996.
- J. Musa, Software Reliability Engineered Testing, McGraw-Hill, 1998.
- M. Porter, Competitive Strategy, Free Press, 1980.
- P. Rook (ed.), Software Reliability Handbook, Elsevier Applied Science, 1990.
- W. E. Royce, Software Project Management, Addison Wesley, 1998.
01/10/01 ©USC-CSE
CeBASE Software Defect Reduction Top-10 List - http://www.cebase.org
1. Finding and fixing a software problem after delivery is often 100 times more expensive than finding and fixing it during the requirements and design phase.
2. About 40-50% of the effort on current software projects is spent on avoidable rework.
3. About 80% of the avoidable rework comes from 20% of the defects.
4. About 80% of the defects come from 20% of the modules, and about half the modules are defect-free.
5. About 90% of the downtime comes from at most 10% of the defects.
6. Peer reviews catch 60% of the defects.
7. Perspective-based reviews catch 35% more defects than non-directed reviews.
8. Disciplined personal practices can reduce defect introduction rates by up to 75%.
9. All other things being equal, it costs 50% more per source instruction to develop high-dependability software products than to develop low-dependability software products. However, the investment is more than worth it if significant operations and maintenance costs are involved.
10. About 40-50% of user programs have nontrivial defects.
01/10/01 ©USC-CSE