An Investigation of the Therac-25 Accidents Nancy G. Leveson Clark S. Turner IEEE, 1993 Presented by Jack Kustanowitz April 26, 2005 University of Maryland.


Overview
- What happened
- Accident history
- Development history
- Technical problems
- Company responses
- Lessons learned
- Ethical questions
- Resources

What Happened
- Between June 1985 and January 1987, 6 known accidents involving massive overdoses, causing death and serious injury

Accident History
- June 1985: First overdose
- July-Dec 1985: Two more overdoses; a patient sues AECL and the hospital; two requests for modifications
- Jan-Feb 1986: AECL denies the possibility of an overdose
- Mar-Apr 1986: Two more overdoses; software blamed
- May-Dec 1986: FDA declares the Therac-25 defective; CAPs (Corrective Action Plans) sent back and forth between FDA and AECL; first Therac-25 user group meeting
- Jan 1987: Sixth overdose
- Feb-July 1987: More CAPs back and forth until the fifth revision of the CAP is sent to FDA
- Nov 1988: Final safety analysis report issued
- Grueling first-hand descriptions of what it felt like to get a massive radiation overdose

Development History
- Therac-6: 6 MeV accelerator for x-rays
- Therac-20: 20 MeV dual-mode (x-rays or electrons)
  - Separate hardware interlocks
- Therac-25: 25 MeV dual-mode
  - All safeguards done in software
- Testing
  - “Unit and software testing was minimal, with most effort directed at the integrated system test”
- Software written in assembly on a PDP-11

The Operator Interface
- At first, the operator had to enter information at the treatment table and then re-enter it at a console in the control room
- Operators complained; the safeguard was removed
- Error codes are reported on the screen with no English explanation
  - Example (East Texas Cancer Center): “Malfunction 54” reported, caused by “dose input 2”. An AECL technician testified that “dose input 2” means the dose delivered was either too high or too low (!)
- “Treatment Pause” after a non-critical error, which the operator can ignore by pressing “P”
  - Causes operators to become insensitive to errors
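
The complaint about bare error codes can be made concrete with a small lookup table that attaches a plain-English summary and an explicit operator action to each code. This is only a sketch: the `Malfunction` record, the `MALFUNCTIONS` table, the `describe` function, and the choice to mark code 54 as non-resumable are all assumptions made for illustration, not a description of the Therac-25 software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Malfunction:
    code: int
    summary: str      # plain-English meaning shown to the operator
    resumable: bool   # whether the operator may continue without help


# Hypothetical table. The wording for code 54 follows the slide; marking
# it non-resumable is an assumption made for this example.
MALFUNCTIONS = {
    54: Malfunction(54, "Dose input 2: delivered dose was too high or too low", False),
}


def describe(code):
    """Return an operator-facing message instead of a bare number."""
    m = MALFUNCTIONS.get(code)
    if m is None:
        # Unknown codes fail safe: never invite the operator to continue.
        return f"MALFUNCTION {code}: unknown code (treatment suspended; do not resume)"
    action = "operator may resume" if m.resumable else "treatment suspended; do not resume"
    return f"MALFUNCTION {m.code}: {m.summary} ({action})"


if __name__ == "__main__":
    print(describe(54))
```

The design point is that the message states both what went wrong and what the operator is allowed to do next, so a dose error cannot be dismissed with a single keypress.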

Example Bugs
- Data Entry Bug
  - Setting the bending magnets takes 8 seconds
  - “Delay” subroutine uses shared memory with the data entry subroutine
  - So data changes within 8 seconds will be wiped out when Delay exits!
  - Causes bugs that only show up with proficient users who do data entry in <8 seconds
- Set-Up Test Bug
  - On every 256th pass through Set-Up (one-byte counter), the upper collimator is not checked
  - Problem if operator hits “set” exactly when counter rolls over to 0
- These kinds of bugs are notoriously difficult to track down
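
The Set-Up Test bug is easy to reproduce in miniature. The sketch below models a one-byte flag that is incremented on every pass and treated as "check the collimator" whenever it is nonzero; the function and variable names are hypothetical and the original was PDP-11 assembly, so this illustrates the rollover only and is not a reconstruction of AECL's code.

```python
def passes_skipping_check(n_passes):
    """Return the pass numbers on which the collimator check is skipped."""
    flag = 0  # one-byte flag: nonzero means "run the collimator check"
    skipped = []
    for i in range(1, n_passes + 1):
        flag = (flag + 1) & 0xFF  # one-byte increment wraps 255 -> 0
        if flag == 0:
            skipped.append(i)  # flag reads as "no check needed"
    return skipped


def passes_skipping_check_fixed(n_passes):
    """Same loop, but the flag is assigned a constant instead of incremented."""
    skipped = []
    for i in range(1, n_passes + 1):
        flag = 1  # a fixed nonzero value can never roll over to 0
        if flag == 0:
            skipped.append(i)
    return skipped


if __name__ == "__main__":
    print(passes_skipping_check(600))
    print(passes_skipping_check_fixed(600))
```

Because the check is skipped on only 1 pass in 256, ordinary testing is unlikely to hit the window, which is why bugs of this kind are so hard to track down.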

AECL Responses
- Denial
  - “We did not believe that there could have been any accelerator malfunction”
- Incremental, local band-aid fixes
  - Example: “P” key removed to prevent operators from ignoring warnings
- Dragging feet, doing minimum of FDA’s requests
  - Perhaps justified? See ethics discussion…
- Knee-jerk responses: fix the bugs as they are reported
- Difficulty reproducing bugs (that only happened once in several hundred runs)

Lessons: General
- Focusing on particular software bugs is not the way to make a safe system
  - Assumption that fixing one error would prevent further accidents
  - “There is always another software bug”
- It is a bad idea to remove independent hardware interlocks, and to believe too much in software
  - Assume software will fail, and handle that properly, rather than trying to write “perfect” software
- Don’t believe in numerical claims
  - “Risk assessment can be like the captured spy: if you torture it long enough, it will tell you anything you want to know”
- Record the reasons for design decisions (like duplicate data entry)
- Design for the worst case
- Don’t enhance usability at the expense of safety
- Power of user groups to cause change when companies drag their feet
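
The "assume software will fail" lesson can be sketched as a second, independent check that must agree before the beam is enabled. Everything here is hypothetical: `Mode`, `interlock_permits`, and `fire_beam` are illustrative names, and the `sensed_mode` argument stands in for a reading that real equipment would take from independent hardware rather than from the control software.

```python
from enum import Enum


class Mode(Enum):
    XRAY = "xray"
    ELECTRON = "electron"


def interlock_permits(requested_mode, sensed_mode):
    """Independent check: the sensed machine state must match the request."""
    return sensed_mode is requested_mode


def fire_beam(requested_mode, sensed_mode, software_ready):
    """Enable the beam only if the software AND the interlock both agree."""
    if not software_ready:
        return "blocked: software not ready"
    if not interlock_permits(requested_mode, sensed_mode):
        # Reached even when the control software wrongly reports ready.
        return "blocked: interlock"
    return "beam on"


if __name__ == "__main__":
    # The control software believes it is ready, but the sensed state
    # disagrees with the request, so the interlock refuses.
    print(fire_beam(Mode.ELECTRON, Mode.XRAY, software_ready=True))
```

The second check adds little when the software is correct. Its value shows only in the case the software gets wrong, which is the case the slide says to design for.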

Lessons: Software Engineering
- Documentation should not be an afterthought
- Establish QA practices & standards
- Keep designs simple
- Design audit trails and logging from the beginning
- Perform extensive testing and formal analysis at the module and software level, rather than relying on system-level testing
- Summary of this course!
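
As a minimal sketch of what "audit trails from the beginning" might look like, the snippet below appends one timestamped JSON line per event to a log stream. The `record_event` helper and the event and field names are assumptions made for illustration; a real device would also need tamper-resistant, durable storage.

```python
import io
import json
from datetime import datetime, timezone


def record_event(log, event, **fields):
    """Append one timestamped JSON line describing an operator or machine event."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "event": event,
        **fields,
    }
    log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


if __name__ == "__main__":
    log = io.StringIO()  # stands in for an append-only file
    record_event(log, "dose_entered", operator="op1", dose=200)
    record_event(log, "malfunction", code=54)
    print(log.getvalue(), end="")
```

With a trail like this, an investigator can reconstruct what the operator typed and what the machine reported, rather than relying on recollection after an accident.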

Ethical Questions
- 500 patients treated in East Texas before first serious accident
- Too much government oversight slows progress
- If 1 person were getting hurt for every 1000 helped, would you take the machine out of use? How about 1:100? 1:10000? Where’s the line?

Resources
- /notes/Therac-25/SouthPark/01.htm