FOR0383 Software Quality Assurance, Lecture 2: ESA Ariane 5 Rocket Flight 501
Dr Andy Brooks, 23/05/2015



4 June 1996
at ~40 seconds into launch, at an altitude of ~3700 m, the launcher veered off path and began to break up
the self-destruct system was triggered
~$500 million (uninsured, maiden flight)
the launcher was unmanned

Board of Inquiry
what was the cause of failure?
was appropriate testing undertaken?
what corrective actions should there be?
the report by the Board of Inquiry was completed in less than 6 weeks

Weather conditions
the weather was acceptable
there was no risk of lightning
but visibility had worsened for a time
the launch was delayed by about 1 hr
The Challenger Space Shuttle disaster was partly due to the weather. Overnight conditions at the launch pad had been extremely cold, which meant the O-rings on the booster rockets were brittle and prone to fracture.

Briefly
nominal behaviour of the launcher until H seconds
the backup Inertial Reference System fails
the active Inertial Reference System fails
 –after the backup
all the rocket nozzles are swivelled into extreme positions
the launcher breaks up and the self-destruct system is triggered

Recovery of material
debris fell back to ground, scattered over a wide area (5 x 2.5 km)
despite mangrove swamps, the two Inertial Reference Systems were recovered
telemetry data was received on the ground
trajectory data was received from radar stations
optical observations (camera and film)

Unrelated Anomaly
at H seconds, variations started in the hydraulic pressure of the actuators of the main engine nozzle, with a frequency of 10 Hz
“This phenomenon is significant and has not yet been fully explained, but after consideration it has not been found relevant to the failure.”

Inertial Reference System (SRI)
complex piece of equipment
measures attitude and movements in space
output transmitted to the On-Board Computer (OBC) executing the flight control program
to improve reliability, two SRIs operated in parallel with identical hardware and software
First question to ask: how is the system backed up?...

Equipment Redundancy
there are two On-Board Computers
a number of other units in the flight control system are also duplicated

So, what really happened?
the OBC received incorrect data
the SRI had declared a failure due to a software exception (Operand Error)
a 64-bit floating-point value being converted was too large to be represented as a 16-bit signed integer
this particular data conversion was not protected
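The difference between a protected and an unprotected conversion can be sketched as follows. This is a minimal illustration in Python, not the flight code: the names `OperandError` and `to_int16_checked` are hypothetical, and the explicit range check stands in for the check that was missing on the failing conversion.

```python
import math

# Range of a 16-bit signed integer.
INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(ArithmeticError):
    """Hypothetical exception: the value does not fit the target type."""


def to_int16_checked(value: float) -> int:
    """Convert a float to a 16-bit signed integer, refusing out-of-range input."""
    if not math.isfinite(value):
        raise OperandError(f"{value!r} is not a finite number")
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value!r} does not fit in 16 bits")
    return result
```

A protected conversion makes the out-of-range case explicit, so the caller has to decide what should happen next instead of the decision being left to a default handler.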

…Different Trajectory
the operand error occurred because Ariane 5 built up a horizontal velocity much more quickly than Ariane 4
 –Ariane 5 built up horizontal velocity five times more quickly than Ariane 4
the failure context was precisely determined from memory readouts from the recovered SRIs

Ariane family
[Figure: the Ariane family of launchers]

…No useful purpose
the software module which generated the exception served no useful purpose after launch!
it was simply re-used from Ariane 4
“Effective reuse requires design by contract. Without a precise specification attached to each reusable component - precondition, postcondition, invariant - no one can trust a supposedly reusable component. Without a specification, it is probably safer to redo than to reuse.”
Jean-Marc Jézéquel and Bertrand Meyer, IEEE Computer, January 1997, p130
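What a contract attached to a reusable component might look like can be sketched in a few lines. This is an illustrative Python sketch only: the `contract` decorator, `ContractViolation`, `MAX_INPUT` and `scale_to_int16` are hypothetical names, and the limit is an assumed value, not one taken from the SRI software.

```python
from functools import wraps


class ContractViolation(Exception):
    """Hypothetical exception: a precondition or postcondition did not hold."""


def contract(pre=None, post=None):
    """Attach an executable precondition and postcondition to a function."""
    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractViolation(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


MAX_INPUT = 32767.0  # assumed limit, stated once where reusers can see it


@contract(pre=lambda v: abs(v) <= MAX_INPUT,
          post=lambda r: -32768 <= r <= 32767)
def scale_to_int16(v: float) -> int:
    """Reusable component: only valid for inputs within MAX_INPUT."""
    return int(v)
```

With the precondition written down next to the component, anyone reusing it in a new context can compare the stated limit against the new operating range before flight, which is the kind of check the quotation argues for.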

Unprotected variables?
3 variables were unprotected “because a maximum workload target of 80% had been set for the SRI computer”
 –remember, this is a real-time system
the justification was not given in source code
the reasoning was that variables were either physically limited or there was a large safety margin
 –this was true for Ariane 4
the decision to protect some but not all of the variables was taken jointly by project partners

The specification of exception-handling contributed to the failure.
According to the specification, in the event of any exception:
the failure should be indicated on the databus
 –the OBC interpreted the diagnostic data it was sent as valid data, causing the nozzle deflections
 –remember, the backup SRI failed first
the failure context should be stored in EEPROM memory
the SRI processor should be shut down
this approach addressed random hardware failures
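One way to stop a consumer from reading diagnostic output as flight data is to tag every frame with its kind. The sketch below is a hypothetical illustration in Python, not a description of the actual Ariane 5 databus: `FrameKind`, `Frame` and `select_attitude` are invented names, and the payload is reduced to a single number.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class FrameKind(Enum):
    FLIGHT_DATA = "flight_data"
    DIAGNOSTIC = "diagnostic"


@dataclass(frozen=True)
class Frame:
    """Hypothetical bus frame: a payload plus an explicit statement of what it is."""
    kind: FrameKind
    payload: float


def select_attitude(frames: Iterable[Frame], last_good: float) -> float:
    """Return the first flight-data payload; fall back to the last good value."""
    for frame in frames:
        if frame.kind is FrameKind.FLIGHT_DATA:
            return frame.payload
    # Only diagnostic frames arrived: never treat them as attitude data.
    return last_good
```

For example, `select_attitude([Frame(FrameKind.DIAGNOSTIC, 9999.0)], last_good=0.2)` returns the previous value rather than the diagnostic pattern.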

Testing
no test was performed to verify that the SRI would behave correctly when subject to the count-down and trajectory of Ariane 5
the SRI specification did not contain Ariane 5 trajectory data as a functional requirement
“It would have been technically feasible to include almost the entire inertial reference system in the overall system simulations which were performed. For a number of reasons it was decided to use the simulated output of the inertial reference system, not the system itself or its detailed simulation. Had the system been included, the failure could have been detected.”
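A toy simulation shows why feeding the new trajectory through the real conversion matters. Everything numeric here is invented for illustration: `BASE_RATE`, the 40-second window and the linear ramp are assumptions, not flight data. Only the factor of five comes from the lecture.

```python
from typing import Optional

INT16_MAX = 32767


def first_overflow_time(rate_per_second: float,
                        duration_s: float,
                        step_s: float = 1.0) -> Optional[float]:
    """Return the first sample time at which a linearly growing value
    no longer fits in a 16-bit signed integer, or None if it always fits."""
    steps = int(duration_s / step_s)
    for i in range(steps + 1):
        t = i * step_s
        if abs(rate_per_second * t) > INT16_MAX:
            return t
    return None


BASE_RATE = 500.0          # invented growth rate for the older profile
FAST_RATE = 5 * BASE_RATE  # "five times more quickly"
WINDOW_S = 40.0            # invented test window

old_profile = first_overflow_time(BASE_RATE, WINDOW_S)  # stays in range
new_profile = first_overflow_time(FAST_RATE, WINDOW_S)  # leaves the range
```

Under these assumed numbers the older profile never leaves the 16-bit range within the window, while the faster one does, so a test driven by the new trajectory would have exposed the problem on the ground.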

Recommendations
R1 … no software function should run during flight unless it is needed
R2 … test facility must include as much real equipment as possible… Complete simulations must take place...
R3 … do not allow sensors to stop sending best effort data
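R3 can be illustrated with a conversion that degrades instead of stopping. This is a sketch of one possible reading of "best effort" in Python, not the fix that was actually adopted: `to_int16_saturating` is a hypothetical name, and returning zero for NaN is an arbitrary assumption.

```python
import math
from typing import Tuple

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_saturating(value: float) -> Tuple[int, bool]:
    """Clamp a float into the 16-bit signed range.

    Returns the converted value and a flag saying whether the original
    value could not be represented faithfully.
    """
    if math.isnan(value):
        return 0, True  # assumption: report zero and flag the sample
    clamped = max(INT16_MIN, min(INT16_MAX, value))
    return int(clamped), clamped != value
```

The sensor keeps producing output, and the flag lets the consumer decide how much to trust each sample.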

… more Recommendations
R5 review all flight software… identify all implicit assumptions
R9 include external participants when reviewing specifications, code and justification documents (someone with a fresh mind can sometimes easily spot mistakes that the authors miss)
R14 provide more transparent organisation of co-operation among partners