
Jacky: “Safety-Critical Computing …”



  1. Jacky: “Safety-Critical Computing …” • The Therac-25 illustrated that computer-controlled equipment could be less safe. • Why use computers at all, if satisfactory techniques already exist? • Who is held responsible? • The Therac-25 relied on computer control instead of physical safeguards.

  2. Jacky (cont. 1) • What are the differences between physical failures and logical/software failures? • The Therac-25 had two modes, producing either electron beams or X-rays. To produce X-rays, the electron beam was up to 100 times more powerful, and a metal target placed in the beam path was supposed to absorb the electrons. • If the technician switched from X-rays to electrons within 8 seconds, the target was withdrawn but the beam was still set to full intensity.
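The hazard on this slide is a combination of two settings that should never occur together: a full-intensity beam with the target withdrawn. A minimal sketch of a check that refuses to fire in that combination is given below. All names (`MachineState`, `beam_level`, `target_in_place`, `is_safe_to_fire`, `fire`) are hypothetical and chosen for illustration; this is not the Therac-25 software, and a software check alone would not replace the physical safeguards the slides mention.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    # Hypothetical fields, not taken from the Therac-25 software.
    beam_level: str        # "low" (electron mode) or "high" (X-ray mode)
    target_in_place: bool  # True when the metal target is in the beam path


def is_safe_to_fire(state: MachineState) -> bool:
    """A high-intensity beam is acceptable only with the target in place."""
    if state.beam_level == "high":
        return state.target_in_place
    return True


def fire(state: MachineState) -> str:
    """Refuse to turn the beam on when the settings are inconsistent."""
    if not is_safe_to_fire(state):
        raise RuntimeError("interlock: high beam with target withdrawn")
    return "beam on"
```

The check reads the actual state at the moment of firing, so it does not depend on how quickly the operator edited the settings beforehand.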

  3. Jacky (cont. 2) • The Therac-25 was not the first system to irradiate and kill patients; 3 patients were killed in 1966 because of a system failure. • Over 500 patients were successfully treated with the Therac-25 before the failures. • This is the problem with software: just because a system works for a certain number of test cases does not necessarily mean we can predict how it behaves in other cases.

  4. Jacky (cont. 3) • Testing software is different from testing typical engineered structures. • If a bridge can sustain a weight of 5000 kg, it can also sustain a weight of 4999 kg, 4998 kg, 4997 kg, … Software offers no such continuity between one input and the next. • Why were the technicians not alarmed when the interface read “Malfunction 54” while they were treating patients?
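The bridge comparison can be made concrete with a small sketch. The bridge model below is monotonic, so one test at the limit covers every smaller load. The software check is a made-up example (the names `bridge_holds` and `check_runs` and the one-byte counter are assumptions for illustration, not code from any real machine): it passes 255 consecutive tests and still fails on the next input.

```python
def bridge_holds(load_kg: int, capacity_kg: int = 5000) -> bool:
    """Physical model: any load up to the capacity is carried."""
    return load_kg <= capacity_kg


def check_runs(pass_count: int) -> bool:
    """Hypothetical buggy logic: a counter stored in one byte wraps to 0
    on every 256th pass, and 0 is misread as 'nothing to check'."""
    flag = pass_count % 256  # simulate an 8-bit counter
    return flag != 0         # True means the safety check is performed


# A bridge that holds 5000 kg also holds every smaller load.
assert all(bridge_holds(kg) for kg in range(0, 5001))

# Passes 1..255 all perform the check, which says nothing about pass 256.
assert all(check_runs(n) for n in range(1, 256))
skipped = [n for n in range(1, 1025) if not check_runs(n)]
# skipped == [256, 512, 768, 1024]
```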

  5. Jacky (cont. 4) • What did AECL do when it became apparent that there was a problem? • Ans. They proposed a solution in which the key cap of the “up arrow” key would be removed and the key covered with electrical tape. This would make it difficult to switch from X-ray to electron mode quickly.

  6. Jacky (cont. 5) • The Therac-25 had a history of problems: • A massive assembly started rotating spontaneously (failure: diode blown out) • Patients were overdosed with Therac-20 (failure: hazy – fuses blown, hardware circuit, but there were hardwired interlocks) • The public's faith in computing is illustrated on page 772, paragraph 5

  7. Jacky (cont. 6) • Why did institutions continue to use the Therac-25? • “The world is largely divided between people whose job it is to track down problems and others who are supposed to get on with production.” (pg. 773) • Virtually all devices now have embedded computers.

  8. Jacky (cont. 7) • The FDA recalls about 400 medical devices per year, e.g., ultrasound units, patient monitors, pacemakers, blood analyzers, ventilators, etc. (pg. 774) • Marvin Minsky: “When a program grows in power by an evolution of partially understood patches and fixes, the programmer begins to lose track of the internal details …”

  9. Jacky (cont. 8) • “… loses his ability to predict what will happen, begins to hope instead of to know, and watches the result as though the program were an individual whose range of behavior is uncertain.” • When do we stop testing? • What does it mean to stop testing when we achieve 2 – 3 errors per 1000 lines of code? What is a line of code?
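The question “What is a line of code?” matters because an errors-per-1000-lines figure scales with whatever is counted. The sketch below counts the same small snippet three ways and then applies the 2 – 3 errors per 1000 lines figure from the slide to a program size. The snippet, the function names, and the 10,000-line program are assumptions made up for illustration.

```python
import ast

# A made-up four-line snippet: one comment, one blank line,
# and two statements sharing a single physical line.
SAMPLE = '''\
# compute dose
def dose(rate, seconds):

    total = rate * seconds; return total
'''


def physical_lines(src: str) -> int:
    """Every line in the file, including blanks and comments."""
    return len(src.splitlines())


def code_lines(src: str) -> int:
    """Lines that are neither blank nor a pure comment."""
    stripped = (line.strip() for line in src.splitlines())
    return sum(1 for line in stripped if line and not line.startswith("#"))


def statements(src: str) -> int:
    """Python statements, regardless of how they are laid out."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(src)))


def estimated_errors(line_count: int, errors_per_1000: float) -> float:
    """Errors implied by a given density for a given line count."""
    return line_count * errors_per_1000 / 1000


counts = (physical_lines(SAMPLE), code_lines(SAMPLE), statements(SAMPLE))
# counts == (4, 2, 3)

# At 2 to 3 errors per 1000 lines, a 10,000-line program
# would still be expected to contain 20 to 30 errors.
low, high = estimated_errors(10_000, 2), estimated_errors(10_000, 3)
```

The three counts differ by up to a factor of two on the same text, so the same error density can describe quite different amounts of remaining risk.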

  10. Jacky (cont. 9) • We work around bugs. That is one reason software has such low standards. • Safety is not the same as reliability. Can a system be reliable, but unsafe? • Safety engineering: Failure of a single component should never be capable of causing an accident.

  11. Jacky (cont. 10) • Tony Hoare: “The principle that the work of an engineer should be inspected and signed off by another more experienced and competent engineer lies at the heart of the codes of safe practice in all branches of engineering.” • What is one criticism of formal software engineering methods?

  12. Jacky (cont. 11) • How useful are formal methods? Are they expensive? (pg. 284) • What of regulation? • John Shore: “We require certification for doctors, lawyers, architects, civil engineers, aircraft pilots, automobile drivers, and even hair stylists! Why not software engineers?”

  13. Jacky (cont. 12) • Tony Hoare: “No industry and no profession has ever voluntarily and spontaneously developed or adopted an effective and relevant code for safe practice. Even voluntary codes are established only in the face of some kind of external pressure or threat, arising from public disquiet, fostered by journals and newspapers and taken up by politicians.”

  14. Jacky (cont. 13) • Is there really a need for regulation? How do programmers compare with each other? • Ans. The best programmers are 25 times better than the worst. Some teams out-produce others by a factor of up to 5. Some managers have a poor grip on their responsibilities.

  15. Jacky (cont. 14) • What types of regulation are proposed? • Regulate programmers, e.g., by requiring them to satisfy educational requirements. • Certify organizations, i.e., companies and departments (e.g., ISO). This is the favored approach. • Regulate the products themselves. • The UK Ministry of Defence requires that formal methods be used for military software.

  16. Jacky (cont. 15) • The UK has a law called the “Machine Safety Directive”, which allows criminal charges to be brought against the director or manager responsible for a device that causes injury.
