Published by Camilla Underwood. Modified over 9 years ago.
1
Famous Software Failures
Why They Failed and Lessons to Be Drawn
Samuel Franklin
G53QAT: Quality Assurance and Testing
2
Overview
Three Software Failures
  Patriot Missile
  Russian Satellite Missile Detection
  London Ambulance Service
Summary of Findings
Questions
3
The Patriot Missile Failure
February 1991, Gulf War
Failed to intercept a Scud missile from Iraq
28 dead
100 injured
The error came from storing a value in a fixed-point register
[Images: The Patriot failing; The Patriot in action]
4
Why it went wrong

Hours    Seconds   Calculated Time (sec)   Inaccuracy (sec)   Approx. shift in Range Gate (meters)
0        0         0                       0                  0
1        3600      3599.9966               .0034              7
8        28800     28799.9725              .0275              55
20 (a)   72000     71999.9313              .0687              137
48       172800    172799.8352             .1648              330
72       259200    259199.7528             .2472              494
100 (b)  360000    359999.6567             .3433              687

(a) The system would miss after 20 hours
(b) The system had been running for 100 hours

The calculations were out by 0.34 seconds
Missed the Scud by over 600 meters
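The inaccuracy figures on this slide can be approximated with a short calculation. The sketch below is not the Patriot's actual code. It assumes the clock counted tenths of a second and that the value 0.1 was chopped to 23 fractional bits in the fixed-point register; both figures come from commonly cited analyses of the failure rather than from this presentation, and the names FRACTION_BITS, TICKS_PER_SECOND, stored_tenth and drift_seconds are made up for illustration.

```python
from fractions import Fraction

# Assumption: number of fractional bits kept when 0.1 is chopped in the register.
FRACTION_BITS = 23
# Assumption: the system clock counts tenths of a second.
TICKS_PER_SECOND = 10


def stored_tenth(bits=FRACTION_BITS):
    """Return 0.1 truncated (not rounded) to the given number of binary places."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def drift_seconds(hours, bits=FRACTION_BITS):
    """Return the accumulated clock error in seconds after running for `hours`."""
    ticks = hours * 3600 * TICKS_PER_SECOND
    error_per_tick = Fraction(1, 10) - stored_tenth(bits)
    return float(ticks * error_per_tick)


for hours in (1, 8, 20, 48, 72, 100):
    print(f"{hours:>3} h  drift = {drift_seconds(hours):.4f} s")
```

Under these assumptions the error per tick is tiny, but it grows linearly with uptime and reaches roughly 0.34 seconds after 100 hours, which is why regular reboots masked the problem.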
5
What America learnt from this
The USA knew of the fault from the Israeli military
The Americans did not reboot regularly enough
The software update arrived the day after the soldiers died
6
Russian Satellite Missile Detection System
Put in place to detect threats from America during the Cold War
Stanislav Petrov monitored the system on 26th September 1983
Oko alerted Petrov that 5 missiles were heading towards Russia
Petrov had to choose:
  Declare it a false alarm
  Start a counterstrike and probably a nuclear war
7
Stanislav Petrov
The Man Who Saved the World
8
What Russia learnt from this
The Russians dissected the Oko system
Found the software full of bugs
Launched the SPRN-2 Prognoz to supplement the Oko system
Cost of this failure could have been: World War III
9
London Ambulance Fiasco
London Ambulance Service (LAS) introduced a Computer Aided Dispatch (CAD) system on 26th October 1992
LAS:
  Carries over 5000 patients per day
  Receives approx. 2500 calls per day
  65% of calls are emergencies
The new system needed near 100% accuracy and full cooperation from all of LAS to succeed
10
26th October 1992
The new CAD system could not handle the volume of calls under regular use
Response times became several hours
Communications between ambulances and LAS were lost
The system had:
  A poor interface between crews and the system
  A number of technical problems:
    Failed to identify duplicate calls
    Did not prioritise exception messages
11
What London learnt from this
Do not use direct conversion
Implement in a step-by-step fashion
Full consultation
Quality assurance and testing
User training
12
Conclusion
Testing is essential for all critical systems
Rushing to get a system in place is bad
Training
Value of humans in the process
13
Questions and Discussion
Any questions?