COMP60611 Directed Reading 1: Therac-25
A. Michelis, L. Wang, C. Goddard
2nd October 2011

Introduction

- The Therac-25 is a medical linear accelerator, used for destroying tumours with electron beams:
  - Shallow tissue is treated directly with low-energy electron beams (~5–15 MeV).
  - Deeper tissue is treated with X-rays, which are generated by firing a high-energy electron beam (~25 MeV) at a target.
- Between June 1985 and January 1987, six people were massively overdosed:
  - The cause was the 25 MeV beam setting being used in direct irradiation mode.
- Unlike earlier machines (i.e. the Therac-6 and Therac-20), the Therac-25 had no hardwired interlock to prevent the high-energy beam from being used when the patient/turntable was set up for direct irradiation:
  - The Therac-25 relied on software for its safety checks.
- One of the software faults that led to the overdoses was a concurrency bug.
Tyler software bug

- The operator edits the mode/energy input field on the console and returns to the command line.
- The software calls the "magnet" subroutine, which sets the magnet positions (this takes ~8 seconds).
- If the operator edits the mode/energy input field again during these 8 seconds, the change is displayed on the console but is not recognised by the system.
- Therefore, the machine could operate in "electron" mode with a 25 MeV beam.

Yakima software bug

- The shared variable "Class3" is incremented every time "Set Up Test" is executed (which happens several hundred times).
- As Class3 is a single-byte variable, its maximum value is 255.
- On every 256th pass through Set Up Test, Class3 overflows and has the value zero:
  - When Class3 is zero, the collimator position checking subroutine is skipped.
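The two faults described above can be illustrated with a small sketch. This is a simplified, deterministic model written for this summary, not the Therac-25 code: the function names `count_skipped_checks` and `run_setup`, the mode strings, and the use of `& 0xFF` to imitate a one-byte variable are all assumptions made for illustration. The first function shows how an incremented one-byte flag reads as zero on every 256th pass; the second shows how a value latched before a slow step can diverge from what the console displays.

```python
# Illustrative model only; all names are hypothetical and not taken
# from the Therac-25 software.

def count_skipped_checks(passes):
    """Yakima-style fault: a one-byte flag is incremented on every pass,
    so it wraps to zero on every 256th pass."""
    class3 = 0       # stands in for the single-byte shared variable
    skipped = 0
    for _ in range(passes):
        class3 = (class3 + 1) & 0xFF   # one-byte arithmetic: 255 + 1 wraps to 0
        if class3 == 0:
            skipped += 1               # zero is read as "no check required"
        # otherwise the collimator position would be checked here
    return skipped


def run_setup(initial_mode, edits_during_magnet_setting):
    """Tyler-style fault: the mode is latched before the slow magnet step,
    so later edits reach the display but not the setup routine."""
    displayed = initial_mode
    latched = initial_mode             # value the setup routine works from
    for new_mode in edits_during_magnet_setting:
        displayed = new_mode           # the console shows the edit...
        # ...but nothing re-reads the field until magnet setting finishes
    return displayed, latched


if __name__ == "__main__":
    print(count_skipped_checks(256))
    print(run_setup("xray", ["electron"]))
```

In this model, assigning the flag a fixed non-zero value instead of incrementing it would remove the wraparound, and re-reading the input field after the slow step (or rejecting edits while it runs) would keep the displayed and latched modes in agreement.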
Main mistakes

- Reuse of the Therac-20 software for the Therac-25:
  - The circumstances were different, as the hardware safeties had been removed for the Therac-25.
- Too much confidence placed in the software:
  - No safety analysis of the software was carried out at first.
- Bad engineering process:
  - The design was too complex, testing was poor, and error detection and reporting were bad.
- Poor investigations by AECL and poor reactions after it was alerted to the incidents:
  - AECL learned about a first lawsuit but did not act accordingly.
  - When AECL became aware of a bug, it fixed only the "symptom" and not the root cause (the whole design should have been changed).