Death by Software The Therac-25 Radio-Therapy Device Brian MacKay ESE6361 - Requirements Engineering – Fall 2013 Final Presentation
Recap: Software that Kills • Early to mid-1980s • Revolutionary Double-Pass medical particle accelerator • Moved to complete software control • Injured 6 people, killing 3 of them • Two different underlying bugs • But it was more than just bugs • Poor software engineering practices
Let’s Look at the PIG in Detail [PIG diagram; nodes listed, contribution links (+, ++) omitted] • Don’t Kill or Injure People • Injures & Kills People • Increment Overflow Bug • Malfunction 54 Bug • Indecipherable Error Messages • Operator “Malfunction Fatigue” • 40 Malfunctions/Day
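The Increment Overflow Bug can be illustrated in a few lines. The Leveson and Turner investigation listed in the references describes a one-byte flag that was incremented on each pass through a set-up routine rather than set to a fixed non-zero value, so it wrapped to zero on every 256th pass and the check it guarded was skipped. The Python sketch below only simulates that failure mode; the function names and the 8-bit mask are illustrative assumptions, not the original assembly code.

```python
# Illustrative simulation only; all names here are hypothetical.

BYTE_MASK = 0xFF  # assumption: the flag lives in a single 8-bit byte


def setup_pass(flag):
    """Buggy update: increment the flag each pass, wrapping at 256."""
    return (flag + 1) & BYTE_MASK


def setup_pass_fixed(flag):
    """Safer update: assign a fixed non-zero value instead of incrementing."""
    return 1


def check_performed(flag):
    """The guarded safety check runs only while the flag is non-zero."""
    return flag != 0


def skipped_passes(update, passes):
    """Return the 1-based pass numbers on which the check would be skipped."""
    flag = 0
    skipped = []
    for n in range(1, passes + 1):
        flag = update(flag)
        if not check_performed(flag):
            skipped.append(n)
    return skipped
```

Calling `skipped_passes(setup_pass, 600)` reports passes 256 and 512, while the assignment-based variant never skips the check.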
Assembly Language Programming [PIG diagram; nodes listed, contribution links (+, ++) omitted] • Injures & Kills People • Increment Overflow Bug • Malfunction 54 Bug • Assembly Language Programming • Programming Shortcuts • Bad Testability
Code Reuse [PIG diagram; nodes listed, contribution links (+, ++) omitted] • Injures & Kills People • Increment Overflow Bug • Malfunction 54 Bug • Code Reuse • Expensive Hardware, etc. • RT Synchronization Issues • Homemade RT-OS • “Working Code”
Moving to Complete Computer Control [PIG diagram; nodes listed, link and priority markers (+, ++, !!) omitted] • Injures & Kills People • Increment Overflow Bug • Malfunction 54 Bug • Move to Computer Control • Mechanical Controls “Less Cool” • Mechanical Controls Fail • No Mechanical Interlocks • Toxic Situation • Code Reuse
Cross-Cutting Issues [PIG diagram; nodes listed, contribution links (++) omitted] • Injures & Kills People • Increment Overflow Bug • Malfunction 54 Bug • Faith in Software • No Auditing • Hardware Focused Organization
The Real Issue • A combination of: • Code reuse • The removal of the mechanical interlocks • An unreasonable faith in software • Generally poor software engineering practices
The Solution Domain • Based on early-1980s technology • Hindsight is one thing • But 30 years of technological innovation is cheating • Based on my experiences • I was a junior engineer starting my career in process & manufacturing systems
Control System Design • In the 1980s – and now • Uses a “Distributed Control System” • Provides for strong segregation between the layers • Early user of networking technology • Typically combined • Done with a “PLC”
PLC: Programmable Logic Controller • In the 1980s, used the “Ladder Logic” graphical programming language • Program specified by an engineer – programmed by an electrician • Consider…
PLC: Ladder Logic • Programmable by an Electrician [Ladder diagram: one rung with contacts “Pump On Switch” and “Valve Position Open” driving the output “Pump”]
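The pump rung can be read as plain Boolean logic. The sketch below assumes the two contacts sit in series on one rung, which makes the rung an AND of its inputs; the slide residue does not show the wiring, so that layout and every name in the code are assumptions for illustration.

```python
# Hypothetical model of a single ladder-logic rung.
# Assumption: "Pump On Switch" and "Valve Position Open" are series contacts,
# so the "Pump" output coil is energized only when both are closed.


def pump_rung(pump_on_switch, valve_position_open):
    """Solve the rung: series contacts behave like a logical AND."""
    return pump_on_switch and valve_position_open


def scan(inputs):
    """One simplified PLC scan: read inputs, solve the rung, return outputs."""
    return {
        "pump": pump_rung(inputs["pump_on_switch"],
                          inputs["valve_position_open"])
    }
```

For example, `scan({"pump_on_switch": True, "valve_position_open": False})` leaves the pump off, because the valve contact breaks the rung.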
The Rest of the System • Multibus system and enclosure • Intel 8086 with 8087 coprocessor • 512 kilobytes of memory • 20 megabyte disk drive: program, logs and audits • Mark Williams “C” compiler • Intel iRMX-86 real-time operating system • RS-232 and RS-485 serial connections • Commercial terminal management software • ANSI-compatible terminal (e.g. VT-100) • All of this is off the shelf
Error Messages • Even with something like a VT-100 green screen, a “windowed” interface is possible • Lots of terminal management software was available commercially to handle this

Screen 1 (normal treatment display):

    PATIENT NAME     : JOHN DOE
    TREATMENT MODE   : FIX        BEAM TYPE: X        ENERGY (MeV): 25

                                  ACTUAL      PRESCRIBED
    UNIT RATE/MINUTE                   0             200
    MONITOR UNITS                 50  50             200
    TIME (MIN)                      0.27            1.00

    GANTRY ROTATION (DEG)            0.0               0     VERIFIED
    COLLIMATOR ROTATION (DEG)      359.2             359     VERIFIED
    COLLIMATOR X (CM)               14.2            14.3     VERIFIED
    COLLIMATOR Y (CM)               27.2            27.3     VERIFIED
    WEDGE NUMBER                       1               1     VERIFIED
    ACCESSORY NUMBER                   0               0     VERIFIED

    DATE   : 84-OCT-26    SYSTEM : BEAM READY    OP.MODE: TREAT  AUTO
    TIME   : 12:55. 8     TREAT  : TREAT PAUSE            X-RAY  173777
    OPR ID : T25VO2-RO3   REASON : OPERATOR      COMMAND:

Screen 2 (same display with a pop-up error window):

    PATIENT NAME     : JOHN DOE
    TREATMENT MODE   : FIX        BEAM TYPE: X        ENERGY (MeV): 25

                                  ACTUAL      PRESCRIBED
    UNIT RATE/MINUTE                   0             200
    MONITO    ┌──────────────────────────────────────┐
    TIME      │ Error 54:                            │
              │ This is a serious error and could    │
    GANTRY ROT│ compromise patient safety            │     VERIFIED
    COLLIMATOR│ The system must be reset             │     VERIFIED
    COLLIMATOR│ [Enter]                              │     VERIFIED
    COLLIMATOR└──────────────────────────────────────┘     VERIFIED
    WEDGE NUMBER                       1               1     VERIFIED
    ACCESSORY NUMBER                   0               0     VERIFIED

    DATE   : 84-OCT-26    SYSTEM : BEAM READY    OP.MODE: TREAT  AUTO
    TIME   : 12:55. 8     TREAT  : TREAT PAUSE            X-RAY  173777
    OPR ID : T25VO2-RO3   REASON : OPERATOR      COMMAND:
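A pop-up error window of this kind needs little more than cursor addressing. The Python sketch below builds the escape sequences for a boxed message using the standard ANSI cursor-position sequence (`ESC [ row ; col H`). It is not the commercial terminal manager the slide refers to; the function names and default position are hypothetical, and it assumes the terminal can render Unicode box-drawing characters (a real VT-100 would more likely use its own line-drawing character set).

```python
# Hypothetical sketch of a text-mode error window for an ANSI terminal.

ESC = "\x1b"


def move_to(row, col):
    """ANSI cursor-position sequence; row and column are 1-based."""
    return f"{ESC}[{row};{col}H"


def error_window(code, message_lines, row=5, col=11):
    """Return a string that draws a boxed error message at (row, col)."""
    body = [f"Error {code}:"] + list(message_lines) + ["[Enter]"]
    width = max(len(text) for text in body) + 2  # one space of padding per side
    rows = ["┌" + "─" * width + "┐"]
    rows += ["│ " + text.ljust(width - 2) + " │" for text in body]
    rows.append("└" + "─" * width + "┘")
    # Position each row explicitly so the box overlays the existing screen.
    return "".join(move_to(row + i, col) + line for i, line in enumerate(rows))


if __name__ == "__main__":
    print(error_window(54, [
        "This is a serious error and could",
        "compromise patient safety",
        "The system must be reset",
    ]))
```

Because each row is positioned individually, the rest of the treatment screen stays visible around the box, and the operator sees a plain-language message instead of a bare malfunction number.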
Final System Design • Intel 8086/8087 running iRMX-86 • Programmed in “C” • UI implemented using commercial terminal manager software • PLC programmed in ladder logic
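One way to read this layered design is that the supervisory computer may only request the beam, while an independent PLC layer decides whether to permit it, taking over the role of the removed mechanical interlocks. The Python sketch below illustrates that separation under stated assumptions: the mode names, the turntable position names, and both functions are hypothetical and are not taken from the Therac-25 or from the presentation.

```python
# Hypothetical sketch of layer segregation between a supervisory computer
# and an independent PLC interlock. All names and values are assumptions.

# Assumed pairing of treatment mode and required turntable position.
REQUIRED_POSITION = {
    "X-RAY": "XRAY_TARGET",
    "ELECTRON": "SCAN_MAGNETS",
}


def plc_beam_permissive(mode, turntable_position):
    """PLC-side check, evaluated independently of the supervisory software."""
    return REQUIRED_POSITION.get(mode) == turntable_position


def beam_on(computer_requests_beam, mode, turntable_position):
    """The beam fires only if the computer asks AND the PLC permits it."""
    return computer_requests_beam and plc_beam_permissive(mode, turntable_position)
```

With this split, a bug in the supervisory software can at worst issue a bad request; a request such as `beam_on(True, "X-RAY", "SCAN_MAGNETS")` is refused by the PLC layer.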
References • “Medical Devices – The Therac-25”, Leveson, Nancy. http://sunnyday.mit.edu/papers/therac.pdf • “An Investigation of the Therac-25 Accidents”, Leveson, Nancy and Turner, Clark S., IEEE Computer, Vol. 26, No. 7, July 1993, pp. 18-41. http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html • “Fatal Dose - Radiation Deaths linked to AECL Computer Errors”, Rose, Barbara Wade, Saturday Night (magazine), June 1994. http://www.ccnr.org/fatal_dose.html • “Safety-Critical Computing: Hazards, Practices, Standards, and Regulation”, Jacky, Jonathan. http://staff.washington.edu/jon/pubs/safety-critical.html • “Therac-25”, Wikipedia. http://en.wikipedia.org/wiki/Therac-25