Balancing System and Software Qualities Barry Boehm, USC STC 2015 Tutorial October 12, 2015 Fall 20151
Outline Critical nature of system qualities (SQs) – Or non-functional requirements (NFRs); ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management Foundations: An initial SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy – Synergies and Conflicts matrix and expansions SQ Opportunity Trees Conclusions Fall 20152
Importance of SQ Tradeoffs Major source of DoD, other system overruns SQs have systemwide impact – System elements generally just have local impact SQs often exhibit asymptotic behavior – Watch out for the knee of the curve Best architecture is a discontinuous function of SQ level – “Build it quickly, tune or fix it later” highly risky – Large system example below Fall 2015 $100M $50M Required Architecture: Custom; many cache processors Original Architecture: Modified Client-Server Response Time (sec) Original Spec After Prototyping Original Cost 3
Types of Decision Reviews Schedule-based commitment reviews (plan-driven) – We’ll release the RFP on April 1 based on the schedule in the plan – $70M overrun to produce overperforming system Event-based commitment reviews (artifact-driven) – The design will be done in 15 months, so we’ll have the review then – Responsive design found to be unaffordable 15 months later Evidence-based commitment reviews (risk-driven) – Evidence: affordable COTS-based system can’t satisfy 1-second requirement Custom solution roughly 3x more expensive – Need to reconsider 1-second requirement Fall
Evidence-Based Decision Result Fall Attempt to validate 1-second response time Commercial system benchmarking and architecture analysis: needs expensive custom solution Prototype: 4-second response time OK 90% of the time Negotiate response time ranges 2 seconds desirable 4 seconds acceptable with some 2-second special cases Benchmark commercial system add-ons to validate their feasibility Present solution and feasibility evidence at evidence- based decision review Result: Acceptable solution with minimal delay
Outline Critical nature of system qualities (SQs) – Or non-functional requirements (NFRs); ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management Foundations: An initial SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy – Synergies and Conflicts matrix and expansions SQ Opportunity Trees Conclusions Fall 20156
Example of SQ Value Conflicts: Security IPT Single-agent key distribution; single data copy – Reliability: single points of failure Elaborate multilayer defense – Performance: 50% overhead; real-time deadline problems Elaborate authentication – Usability: delays, delegation problems; GUI complexity Everything at highest level – Modifiability: overly complex changes, recertification Fall 20157
Proliferation of Definitions: Resilience Wikipedia Resilience variants: Climate, Ecology, Energy Development, Engineering and Construction, Network, Organizational, Psychological, Soil Ecology and Society Organization Resilience variants: Original-ecological, Extended-ecological, Walker et al. list, Folke et al. list; Systemic-heuristic, Operational, Sociological, Ecological-economic, Social-ecological system, Metaphoric, Sustainabilty-related Variants in resilience outcomes – Returning to original state; Restoring or improving original state; Maintaining same relationships among state variables; Maintaining desired services; Maintaining an acceptable level of service; Retaining essentially the same function, structure, and feedbacks; Absorbing disturbances; Coping with disturbances; Self-organizing; Learning and adaptation; Creating lasting value – Source of serious cross-discipline collaboration problems Fall 20158
Example of Current Practice “The system shall have a Mean Time Between Failures of 10,000 hours” What is a “failure?” – 10,000 hours on liveness – But several dropped or garbled messages per hour? What is the operational context? – Base operations? Field operations? Conflict operations? Most management practices focused on functions – Requirements, design reviews; traceability matrices; work breakdown structures; data item descriptions; earned value management What are the effects of or on other SQs? – Cost, schedule, performance, maintainability? Fall 20159
Outline Critical nature of system qualities (SQs) – Or non-functional requirements (NFRs); ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management Foundations: An initial SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy – Synergies and Conflicts matrix and expansions SQ Opportunity Trees Conclusions Fall
Need for SQs Ontology Oversimplified one-size-fits all definitions – ISO/IEC 25010, Reliability: the degree to which a system, product, or component performs specified functions under specified conditions for a specified period of time – OK if specifications are precise, but increasingly “specified conditions” are informal, sunny-day user stories. Satisfying just these will pass “ISO/IEC Reliability,” even if system fails on rainy-day user stories – Need to reflect that different stakeholders rely on different capabilities (functions, performance, flexibility, etc.) at different times and in different environments Proliferation of definitions, as with Resilience Weak understanding of inter-SQ relationships – Security Synergies and Conflicts with other qualities Fall
Nature of an ontology; choice of IDEF5 structure An ontology for a collection of elements is a definition of what it means to be a member of the collection For “system qualities,” this means that an SQ identifies an aspect of “how well” the system performs – The ontology also identifies the sources of variability in the value of “how well” the system performs Functional requirements specify “what;” NFRs specify “how well” After investigating several ontology frameworks, the IDEF5 framework appeared to best address the nature and sources of variability of system SQs – Good fit so far Fall
Initial SERC SQs Ontology Modified version of IDEF5 ontology framework – Classes, Subclasses, and Individuals – Referents, States, Processes, and Relations Top classes cover stakeholder value propositions – Mission Effectiveness, Resource Utilization, Dependability, Changeabiity Subclasses identify means for achieving higher-class ends – Means-ends one-to-many for top classes – Ideally mutually exclusive and exhaustive, but some exceptions – Many-to-many for lower-level subclasses Referents, States, Processes, Relations cover SQ variation Referents: Sources of variation by stakeholder value context: States: Internal (beta-test); External (rural, temperate, sunny) Processes: Operational scenarios (normal vs. crisis; experts vs. novices) Relations: Impact of other SQs (security as above, synergies & conflicts) Fall
Example: Reliability Revisited Reliability is the probability that the system will deliver stakeholder-satisfactory results for a given time period (generally an hour), given specified ranges of: – Stakeholder value propositions: desired and acceptable ranges of liveness, accuracy, response time, speed, capabilities, etc. – System internal and external states: integration test, acceptance test, field test, etc.; weather, terrain, DEFCON, takeoff/flight/landing, etc. – System internal and external processes: status checking frequency; types of missions supported; workload volume, interoperability with independently evolving external systems – Effects via relations with other SQs: synergies improving other SQs; conflicts degrading other SQs Fall
Set-Based SQs Definition Convergence RPV Surveillance Example Fall Effective ness (pixels/ frame) Efficiency (E.g., frames/second) Phase 1 Phase 2 Phase 3 Phase 1. Rough ConOps, Rqts, Solution Understanding Phase 2. Improved ConOps, Rqts, Solution Understanding Phase 3. Good ConOps, Rqts, Solution Understanding Desired Acceptable Desired
Stakeholder value-based, means-ends hierarchy Mission operators and managers want improved Mission Effectiveness – Involves Physical Capability, Cyber Capability, Human Usability, Speed, Accuracy, Impact, Maneuverability, Scalability, Versatility, Interoperability Mission investors and system owners want Mission Cost-Effectiveness – Involves Cost, Duration, Personnel, Scarce Quantities (capacity, weight, energy, …); Manufacturability, Sustainability All want system Dependability: cost-effective defect-freedom, availability, and safety and security for the communities that they serve – Involves Reliability, Availability, Maintainability, Survivability, Safety, Security, Robustness In an increasingly dynamic world, all want system Changeability: to be rapidly and cost-effectively evolvable – Involves Maintainability, Adaptability Fall
U. Virginia: Coq Formal Reasoning Structure Inductive Dependable (s: System): Prop := mk_dependability: Security s -> Safety s -> Reliability s -> Maintainability s -> Availability s -> Survivability s -> Robustness s -> Dependable s. Example aSystemisDependable: Dependable aSystem. apply mk_dependability. exact (is_secure aSystem). exact (is_safe aSystem). exact (is_reliable aSystem). exact (is_maintainable aSystem). exact (is_avaliable aSystem). exact (is_survivable aSystem). exact (is_robust aSystem). Qed. Fall
Maintainability Challenges and Responses Maintainability supports both Dependability and Changeability – Distinguish between Repairability and Modifiability – Elaborate each via means-ends relationships Multiple definitions of Changeability – Distinguish between Product Quality and Quality in Use (ISO/IEC 25010) – Provide mapping between Product Quality and Quality in Use viewpoints Changeability can be both external and internal – Distinguish between Maintainability and Adaptability Many definitions of Resilience – Define Resilience as a combination of Dependability and Changeability Variability of Dependability and Changeability values – Ontology addresses sources of variation – Referents (stakeholder values), States, Processes, Relations with other SQs Need to help stakeholders choose options, avoid pitfalls – Qualipedia, Synergies and Conflicts, Opportunity Trees Fall
Product Quality View of Changeability MIT Quality in Use View also valuable Changeability (PQ): Ability to become different product – Swiss Army Knife, Brick Useful but not Changeable Changeability (Q in Use): Ability to accommodate changes in use – Swiss Army Knife does not change as a product but is Changeable – Brick is Changeable in Use (Can help build a wall or break a window?) Changeability Adaptability Maintainability Self-Diagnosability Self-Modifiability Self-Testability Internal External Modifiability Repairability Defects Changes Testability Means to End Subclass of Understandability Modularity Scalability Portability Diagnosability Accessibility Restorability Fall
A B A BC,D Means to End Subclass of Dependability, Availability Reliability Maintainability Defect Freedom Survivability Fault Tolerance Repairability Test Plans, Coverage Test Scenarios, Data Test Drivers, Oracles Test Software Qualitities Testability Complete Partial Robustness Self-Repairability Graceful Degradation Choices of Security, Safety … Modifiability Dependability, Changeability, and Resilience Testability, Diagnosability, etc. Fall 2015 Changeability Resilience 20
Outline Critical nature of system qualities (SQs) – Or non-functional requirements; ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management Need for and nature of SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy – Synergies and Conflicts matrix and expansions SQ Opportunity Trees Conclusions Fall
7x7 Synergies and Conflicts Matrix Mission Effectiveness expanded to 4 elements – Physical Capability, Cyber Capability, Interoperability, Other Mission Effectiveness (including Usability as Human Capability) Synergies and Conflicts among the 7 resulting elements identified in 7x7 matrix – Synergies above main diagonal, Conflicts below Work-in-progress tool will enable clicking on an entry and obtaining details about the synergy or conflict – Ideally quantitative; some examples next Still need synergies and conflicts within elements – Example 3x3 Dependability subset provided Fall
Fall
Software Development Cost vs. Reliability 0.8 Very Low NominalHigh Very High Relative Cost to Develop COCOMO II RELY Rating MTBF (hours) , ,000 Fall
Software Ownership Cost vs. Reliability 0.8 Very Low NominalHigh Very High Relative Cost to Develop, Maintain, Own and Operate COCOMO II RELY Rating % Maint VL = 2.55 L = 1.52 Operational-defect cost at Nominal dependability = Software life cycle cost Operational - defect cost = 0 MTBF (hours) , ,000 Fall
Fall 2015 Legacy System Repurposing Eliminate Tasks Eliminate Scrap, Rework Staffing, Incentivizing, Teambuilding Kaizen (continuous improvement) Work and Oversight Streamlining Collaboration Technology Early Risk and Defect Elimination Modularity Around Sources of Change Incremental, Evolutionary Development Risk-Based Prototyping Satisficing vs. Optimizing Performance Value-Based Capability Prioritization Composable Components,Services, COTS Cost Improvements and Tradeoffs Get the Best from People Make Tasks More Efficient Simplify Products (KISS) Reuse Components Facilities, Support Services Tools and Automation Lean and Agile Methods Evidence-Based Decision Gates Domain Engineering and Architecture Task Automation Model-Based Product Generation Value-Based, Agile Process Maturity Affordability and Tradespace Framework Reduce Operations, Support Costs Streamline Supply Chain Design for Maintainability, Evolvability Automate Operations Elements Anticipate, Prepare for Change Value- and Architecture-Based Tradeoffs and Balancing 26
Costing Insights: COCOMO II Productivity Ranges Productivity Range Product Complexity (CPLX) Analyst Capability (ACAP) Programmer Capability (PCAP) Time Constraint (TIME) Personnel Continuity (PCON) Required Software Reliability (RELY) Documentation Match to Life Cycle Needs (DOCU) Multi-Site Development (SITE) Applications Experience (AEXP) Platform Volatility (PVOL) Use of Software Tools (TOOL) Storage Constraint (STOR) Process Maturity (PMAT) Language and Tools Experience (LTEX) Required Development Schedule (SCED) Data Base Size (DATA) Platform Experience (PEXP) Architecture and Risk Resolution (RESL) Precedentedness (PREC) Develop for Reuse (RUSE) Team Cohesion (TEAM) Development Flexibility (FLEX) Scale Factor Ranges: 10, 100, 1000 KSLOC Fall 2015 Staffing Teambuilding Continuous Improvement 27
COSYSMO Sys Engr Cost Drivers Fall 2015 Teambuilding Staffing Continuous Improvement 28
Fall 2015 Legacy System Repurposing Eliminate Tasks Eliminate Scrap, Rework Staffing, Incentivizing, Teambuilding Kaizen (continuous improvement) Work and Oversight Streamlining Collaboration Technology Early Risk and Defect Elimination Modularity Around Sources of Change Incremental, Evolutionary Development Risk-Based Prototyping Satisficing vs. Optimizing Performance Value-Based Capability Prioritization Composable Components,Services, COTS Affordability Improvements and Tradeoffs Get the Best from People Make Tasks More Efficient Simplify Products (KISS) Reuse Components Facilities, Support Services Tools and Automation Lean and Agile Methods Evidence-Based Decision Gates Domain Engineering and Architecture Task Automation Model-Based Product Generation Value-Based, Agile Process Maturity Tradespace and Affordability Framework Reduce Operations, Support Costs Streamline Supply Chain Design for Maintainability, Evolvability Automate Operations Elements Anticipate, Prepare for Change Value- and Architecture-Based Tradeoffs and Balancing 29
Fall 2015 COCOMO II-Based Tradeoff Analysis Better, Cheaper, Faster: Pick Any Two? -- Cost/Schedule/RELY: “pick any two” points (RELY, MTBF (hours)) For 100-KSLOC set of features Can “pick all three” with 77-KSLOC set of features 30
Fall 2015 Value-Based Testing: Empirical Data and ROI — LiGuo Huang, ISESE 2005 (a) (b) 31
Value-Neutral Defect Fixing Is Even Worse % of Value for Correct Customer Billing Customer Type Automated test generation tool - all tests have equal value Value-neutral defect fixing: Quickly reduce # of defects Pareto Business Value Fall
Outline Foundations: An initial SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy Critical nature of system qualities (SQs) – Or non-functional requirements (NFRs); ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management SQ Opportunity Trees Conclusions Fall
Context: SERC SQ Initiative Elements SQ Tradespace and Affordability Project foundations – More precise SQ definitions and relationships – Stakeholder value-based, means-ends relationships – SQ strategy effects, synergies, conflicts, Opportunity Trees – USC, MIT, U. Virginia Next-generation system cost-schedule estimation models – Software: COCOMO III – Systems Engineering: COSYSMO 3.0 – USC, AFIT, GaTech, NPS Applied SQ methods, processes, and tools (MPTs) – For concurrent cyber-physical-human systems – Experimental MPT piloting, evolution, improvement – Wayne State, AFIT, GaTech, MIT, NPS, Penn State, USC Fall
SQ Opportunity Trees Means-Ends relationships for major SQs – Previously developed for cost and schedule reduction – Revising an early version for Dependability – Starting on Maintainability Accommodates SQ Synergies and Conflicts matrix – Conflicts via “Address Potential Conflicts” entry – Means to Ends constitute Synergies More promising organizational structure than matrix Fall
Fall Schedule Reduction Opportunity Tree Eliminating Tasks Reducing Time Per Task Reducing Risks of Single-Point Failures Reducing Backtracking Activity Network Streamlining Increasing Effective Workweek Better People and Incentives Transition to Learning Organization Business process reengineering Reusing assets Applications generation Schedule as independent variable Tools and automation Work streamlining (80-20) Increasing parallelism Reducing failures Reducing their effects Early error elimination Process anchor points Improving process maturity Collaboration technology Minimizing task dependencies Avoiding high fan-in, fan-out Reducing task variance Removing tasks from critical path 24x7 development Nightly builds, testing Weekend warriors
Fall Reuse at HP’s Queensferry Telecommunication Division Time to Market (months) Non-reuse Project Reuse project
Fall Software Dependability Opportunity Tree Decrease Defect Risk Exposure Continuous Improvement Decrease Defect Impact, Size (Loss) Decrease Defect Prob (Loss) Defect Prevention Defect Detection and Removal Value/Risk - Based Defect Reduction Graceful Degradation CI Methods and Metrics Process, Product, People Technology
Fall Software Defect Prevention Opportunity Tree IPT, JAD, WinWin,… PSP, Cleanroom, Pair development,… Manual execution, scenarios,... Staffing for dependability Rqts., Design, Code,… Interfaces, traceability,… Checklists Defect Prevention People practices Standards Languages Prototyping Modeling & Simulation Reuse Root cause analysis
Fall Defect Detection Opportunity Tree Completeness checking Consistency checking - Views, interfaces, behavior, pre/post conditions Traceability checking Compliance checking - Models, assertions, standards Defect Detection and Repair - Rqts. - Design - Development Testing Reviewing Automated Analysis Peer reviews, inspections Architecture Review Boards Pair development Requirements & design Structural Operational profile Usage (alpha, beta) Regression Value/Risk - based Test automation
Fall Defect Impact Reduction Opportunity Tree Business case analysis Pareto (80-20) analysis V/R-based reviews V/R-based testing Cost/schedule/quality as independent variable Decrease Defect Impact, Size (Loss) Graceful Degredation Value/Risk - Based Defect Reduction Fault tolerance Self-stabilizing SW Reduced-capability models Manual Backup Rapid recovery
Fall Repairability Opportunity Tree Anticipate Repairability Needs Design/Develop for Repairability Improve Repair Diagnosis Repair facility requirements Repair-type trend analysis Repair skills analysis Repair personnel involvement Replacement parts logistics development Replacement accessibility deaign Multimedia diagnosis guides Fault localization Address Potential Conflicts Repair compatibility analysis Regression test capabilities Address Potential Conflicts Improve Repair, V&V Processes Smart system anomaly, trend analysis Informative error messages Replacement parts trend analysis Repair facility design, development Address Potential Conflicts Switchable spare components Prioritize, Schedule Repairs, V&V
Fall Modifiability Opportunity Tree Anticipate Modifiability Needs Design/Develop for Modifiability Improve Modification, V&V Evolution requirements Trend analysis Hotspot (change source) analysis Modifier involvement Spare capacity Domain-specific architecture in domain Regression test capabilities Address Potential Conflicts Service-orientation; loose coupling Modularize around hotspots Enable user programmability Modification compatibility analysis Prioritize, Schedule Modifications, V&V
Hotspot Analysis: Projects A and B - Change processing over 1 person-month = 152 person-hours Fall CategoryProject AProject B Extra long messages = 5045 Network failover = 3040 Hardware-software interface = = 2832 Encryption algorithms = 1615 Subcontractor interface = 2060 GUI revision =2550 Data compression algorithm 910 External applications interface = 1460 COTS upgrades = = 1461 Database restructure = 1860 Routing algorithms = 692 Diagnostic aids = 979 TOTAL:
Product Line Engineering and Management: NPS Fall
Outline Critical nature of system qualities (SQs) – Or non-functional requirements (NFRs); ilities – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management Foundations: An initial SQs ontology – Nature of an ontology; choice of IDEF5 structure – Stakeholder value-based, means-ends hierarchy – Synergies and Conflicts matrix and expansions SQ Opportunity Trees Conclusions Fall
Conclusions System qualities (SQs) are success-critical – Major source of project overruns, failures – Significant source of stakeholder value conflicts – Poorly defined, understood – Underemphasized in project management SQs ontology clarifies nature of system qualities – Using value-based, means-ends hierarchy – Identifies variation types: referents, states, processes, relations – Relations enable SQ synergies and conflicts identification Fall
Fall ©USC-CSSE List of Acronyms CDConcept Development CPCompetitive Prototyping DCRDevelopment Commitment Review DoDDepartment of Defense ECRExploration Commitment Review EVExpected Value FCRFoundations Commitment Review FEDFeasibility Evidence Description GAOGovernment Accounting Office ICMIncremental Commitment Model KPPKey Performance Parameter MBASEModel-Based Architecting and Software Engineering OCROperations Commitment Review RERisk Exposure RUPRational Unified Process V&VVerification and Validation VBValue of Bold approach VCRValuation Commitment Review
Fall B. Boehm, J. Lane, S. Koolmanojwong, R. Turner, The Incremental Commitment Spiral Model, Addison Wesley, B. Boehm, “Anchoring the Software Process,” IEEE Software, July B. Boehm, A.W. Brown, V. Basili, and R. Turner, “Spiral Acquisition of Software- Intensive Systems of Systems,” Cross Talk, May 2004, pp B. Boehm and J. Lane, “Using the ICM to Integrate System Acquisition, Systems Engineering, and Software Engineering,” CrossTalk, October 2007, pp J. Maranzano et. al., “Architecture Reviews: Practice and Experience,” IEEE Software, March/April R. Pew and A. Mavor, Human-System Integration in the System Development Process: A New Look, National Academy Press, W. Royce, Software Project Management, Addison Wesley, R. Valerdi, “The Constructive Systems Engineering Cost Model,” Ph.D. dissertation, USC, August CrossTalk articles: References
Backup charts Fall
Fall
Metrics Evaluation Criteria Correlation with quality: How much does the metric relate with our notion of software quality? It implies that nearly all programs with a similar value of the metric will possess a similar level of quality. This is a subjective correlational measure, based on our experience.I Importance: How important is the metric and are low or high values preferable when measuring them? The scales, in descending order of priority are: Extremely Important, Important and Good to have Feasibility of automated evaluation: Are things fully or partially automable and what kinds of metrics are obtainable? Ease of automated evaluation: In case of automation how easy is it to compute the metric? Does it involve mammoth effort to set up or can it be plug-and-play or does it need to be developed from scratch? Any OTS tools readily available? Completeness of automated evaluation: Does the automation completely capture the metric value or is it inconclusive, requiring manual intervention? Do we need to verify things manually or can we directly rely on the metric reported by the tool? Units: What units/measures are we using to quantify the metric? Fall