© SAIC. All rights reserved.
National Security | Energy & Environment | Health | Cybersecurity

The Potential High Cost of Simple Systems Engineering Errors
Jim Gottfried
Chief Scientist/Engineer, Logistics and Engineering Solutions, SAIC
March 7, 2012
USC CSSE Annual Research Review 2012

Ground Rules
The projects discussed here were all performed by strong, competent, and well-disciplined engineering companies, often operating under CMMI Level 3 or higher processes
The engineers working these projects were experienced, very competent, and disciplined system and software engineers
Still, problems do occur; they cost money to fix and might have been avoided

Problem #1: Specification Errors
Setting the Stage
– What characteristics describe good requirements?
    – Clear/unambiguous
    – Accurate
    – Complete
    – Necessary, traceable to a higher level requirement
    – Consistent with other requirements/standards
    – Achievable
    – Verifiable

Problem #1: Specification Errors, cont.
Requirements Example
Logistics Metrics (1): The radio system shall provide the capability for a remote or local user to view performance metrics information of the type listed below, as a minimum.
– Availability (Ao): % time system capable of supporting prime mission
– Mean Time Between Failures (MTBF): time in tenths of hours between failure of a software or hardware item
– Mean Down Time (MDT): average downtime in tenths of hours where system cannot perform primary mission
Logistics Metrics (2): The radio system shall be capable of calculating the values of the logistics metrics described above. The remote maintenance software shall be capable of displaying these values on a user screen available to both a local and a remote user. The calculated data will be air base specific.
What is wrong or missing with the above requirements?
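
As a rough illustration of the three metrics named in the requirement, the Python sketch below computes them from a log of outages for one air base. The `Outage` record, the `logistics_metrics` function, and the sample numbers are hypothetical. The formulas (MTBF as uptime divided by failure count, MDT as downtime divided by failure count) are one common convention and are an assumption here, since the slides do not say how the radio system calculated the values.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One period during which the system could not perform its mission."""
    start_h: float  # hours since monitoring began
    end_h: float    # hours since monitoring began


def logistics_metrics(outages, observed_h):
    """Return (Ao in percent, MTBF in hours, MDT in hours).

    MTBF and MDT are rounded to tenths of hours, as the requirement asks.
    They are None when no failure has been recorded.
    """
    if observed_h <= 0:
        raise ValueError("observed_h must be positive")
    down_h = sum(o.end_h - o.start_h for o in outages)
    up_h = observed_h - down_h
    failures = len(outages)
    ao = round(100.0 * up_h / observed_h, 2)
    mtbf = round(up_h / failures, 1) if failures else None
    mdt = round(down_h / failures, 1) if failures else None
    return ao, mtbf, mdt


# Hypothetical log: two outages during 1000 observed hours.
sample_log = [Outage(100.0, 102.5), Outage(600.0, 601.5)]
sample_metrics = logistics_metrics(sample_log, observed_h=1000.0)
print(sample_metrics)  # (99.6, 498.0, 2.0)
```

The observation period is an explicit input to the calculation; the requirement text itself does not say which period the metrics cover.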

Problem #1: Specification Errors, cont.
Note that the system reported the required metrics, and the reporting format was acceptable to the user.
The metrics were calculated accurately.
The user reported the metrics to their management on a quarterly basis.
Could the user perform this reporting function? Why or why not?
– Answer: No. There was no capability to reset the metrics after reading them each quarter
Resolution: Update software and documentation to allow resetting metrics upon command
Cost: Over $80K
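
One way to picture the missing capability is a metrics accumulator with an explicit reset command. This is a minimal sketch under stated assumptions, not the fielded fix: the class name, method names, and hour values are hypothetical, and it assumes the original metrics accumulated from system start.

```python
class MetricsAccumulator:
    """Accumulates downtime for one reporting window (hypothetical design)."""

    def __init__(self):
        self.reset(now_h=0.0)

    def reset(self, now_h):
        """Start a new reporting window at time now_h (hours)."""
        self.window_start_h = now_h
        self.down_h = 0.0
        self.failures = 0

    def record_outage(self, start_h, end_h):
        """Record one outage that falls inside the current window."""
        self.down_h += end_h - start_h
        self.failures += 1

    def report(self, now_h):
        """Return (Ao percent, MTBF hours, MDT hours) for the current window."""
        observed_h = now_h - self.window_start_h
        if observed_h <= 0:
            raise ValueError("report time must be after the window start")
        up_h = observed_h - self.down_h
        ao = round(100.0 * up_h / observed_h, 2)
        mtbf = round(up_h / self.failures, 1) if self.failures else None
        mdt = round(self.down_h / self.failures, 1) if self.failures else None
        return ao, mtbf, mdt


# Hypothetical quarters of 2000 hours each; one 20-hour outage in quarter 1.
acc = MetricsAccumulator()
acc.record_outage(500.0, 520.0)
q1 = acc.report(now_h=2000.0)                # (99.0, 1980.0, 20.0)

# Without a reset, the quarter 2 report still includes the quarter 1 outage.
q2_without_reset = acc.report(now_h=4000.0)  # (99.5, 3980.0, 20.0)

# With a reset at the quarter boundary, quarter 2 stands on its own.
acc.reset(now_h=2000.0)
q2_with_reset = acc.report(now_h=4000.0)     # (100.0, None, None)
```

In this sketch the calculation is accurate in both cases; only the reset makes the result usable for a quarterly report.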

Problem #2: Systems Engineering Design Errors
Setting the Stage
– The power for the system came through an uninterruptible power supply (UPS)
– The UPS was software controlled and monitored for failure
– Commercial UPS specifications were reviewed
– A commercial UPS was selected and installed with the system
– After installation, when facility power failed, large electrical spikes were seen that shut down some of the electronic equipment
– Investigation showed that this UPS was not designed to condition the power as installed on this system

Problem #2: Systems Engineering Design Analysis
Resolution Options
– Option 1: add a transformer between UPS and system
    – Customer does not like this option as a long term solution (for additional bases as well)
    – This would make the first system different from other, future bases
– Option 2: replace the original UPS with a different UPS that will properly condition the power
    – The only available UPSs that will do the job properly have a different software interface
    – This UPS is lower cost and more flexible in sizing
    – Customer wants this solution on future system sites
Action:
– New UPS purchased, system software changed for compatibility
– New UPS installed and tested
Cost: Over $120K

Problems #1 and #2: Lessons Learned
Both problems resulted from relatively simple systems engineering (SE) errors
Both problems resulted in substantial cost additions
How to avoid
– My opinion: we will never eliminate all SE problems; system engineers are human
– The best approach to avoiding this type of problem is extremely thorough peer review of all requirements and design decisions, using quality checklists
– Thorough peer reviews take time and must be planned into the process
– Peer reviews should involve enough engineers to fully represent all stakeholder organizations, including system, design, integration, test, and specialty engineers
– Problem #1 (specification) might have been prevented by developing use cases for all user interactions with the system
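
The use-case point above can be sketched as a simple trace from use-case steps to requirements. Everything in this Python example is hypothetical: the capability labels, the step wording, and the `uncovered_steps` helper are illustrative assumptions rather than artifacts from the project.

```python
def uncovered_steps(use_case_steps, requirements):
    """Return the use-case steps whose needed capability no requirement provides.

    use_case_steps: list of (step description, needed capability) pairs
    requirements:   dict mapping requirement name -> set of capabilities
    """
    provided = set()
    for capabilities in requirements.values():
        provided |= set(capabilities)
    return [step for step, needed in use_case_steps if needed not in provided]


# Capabilities assumed to be covered by the two example requirements.
requirements = {
    "Logistics Metrics (1)": {"view_metrics"},
    "Logistics Metrics (2)": {"calculate_metrics", "display_metrics"},
}

# Hypothetical use case: the user prepares the quarterly report.
quarterly_report = [
    ("System calculates the metrics", "calculate_metrics"),
    ("User views the metrics on screen", "view_metrics"),
    ("User resets the metrics for the next quarter", "reset_metrics"),
]

gaps = uncovered_steps(quarterly_report, requirements)
print(gaps)  # ['User resets the metrics for the next quarter']
```

Walking each user interaction step by step is what surfaces the gap; a reviewer reading the two requirements in isolation has no prompt to ask about a reset.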

Problem #3: A System vs. a Hardware Item
What distinguishes a system from a hardware item (e.g., a communications radio [JTRS, air traffic control, etc.])?
– Some characteristics:
    – More functionality
    – Multiple hardware items
    – More external interfaces
    – Computer controlled; more software/firmware
    – Larger, more dynamic user interfaces
    – … etc.
Problem: Understanding and appreciating the complexity of a system versus the previous hardware item

Problem #3: A System vs. a Hardware Item, cont.
The need to understand and appreciate the complexity of a system is intuitive; the solution, however, is difficult to understand and address
Why?
– Psychology: we (system engineers) are the experts in the hardware item domain and understand it well, so we treat the system as just an extension of what we know and do
– New goals for the system are underestimated: rarely do we build a one-for-one replacement of the hardware
    – Systems are built to add flexibility to the product
        – Flexibility increases development complexity and time
    – Systems are built to add functionality to the product
        – More user/remote control, better user experience, easier maintenance, more capability, more accuracy, more timeliness
    – Systems are built to improve product reliability and availability
        – Better diagnostics, backup capability, redundancy and auto failover
    – Other?

Problem #3: A System vs. a Hardware Item, cont.
Ramifications of failing to understand the system vs. the hardware item
– Development time increases to 2-3 times the original plan
– Cost can increase to 2-4 times the original plan
– Late to market, competitor first to market
– Unhappy customers
– Frustrated management and engineers
– Cancellation of project
Solutions?
– It must start with better appreciation of the problems, goals, and complexity of the system vs. the hardware item
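
To make the schedule and cost multipliers listed above concrete, the short Python sketch below applies them to a made-up baseline. The 12-month, $1,000,000 plan and the `overrun_range` helper are hypothetical; only the 2-3x and 2-4x factors come from the slide.

```python
def overrun_range(planned, low_factor, high_factor):
    """Return the (low, high) outcome for a planned value and a factor range."""
    return planned * low_factor, planned * high_factor


# Hypothetical baseline plan: 12 months and $1,000,000.
schedule_months = overrun_range(12, 2, 3)     # (24, 36)
cost_dollars = overrun_range(1_000_000, 2, 4)  # (2000000, 4000000)

print(schedule_months, cost_dollars)
```

Under these assumed numbers, a one-year, $1M plan becomes a two-to-three-year effort costing $2M to $4M.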