
Safety


alykins

Presentation Transcript


  1. Safety
     • Example
     • Terms and Concepts
     • Safety Architectures
     • Safe Design Process
     • Software Specific Stuff
     • Sources
       • Hard Time, by Bruce Powell Douglass, which references Safeware
     CSE 466 – Fall 2000 - Introduction

  2. What is a Safe System?
     [Diagram labels: Brake Pedal, Pedal Sensor, Processor, Bus, Engine, Brake]
     • Is it safe?
     • What does “safe” mean?
     • How can we make it safe?
       • Add an electronic watchdog between the brake and the bus
       • Add a mechanical linkage from a separate brake pedal directly to the brake
       • Add a third mechanical linkage…

  3. Terms and Concepts
     • The reliability of component i can be expressed as the probability Pi(t) that component i is still functioning at some time t.
     • Is system reliability Ps(t) = Π Pi(t), the product of the component reliabilities?
     • Assuming that all components have the same component reliability, is a system with fewer components always more reliable?
     • Does component failure imply system failure?
     [Graph: Pi(t), the probability of continued operation, versus time. Labels: burn-in period; low failure rate means nearly constant probability; 1/(failure rate) = MTBF.]
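The questions on this slide can be explored numerically. The following is a minimal sketch, not taken from the slides. It assumes a constant failure rate, so that Pi(t) = exp(-rate × t), and a series system in which components fail independently and any component failure causes system failure. The function names and the rate and time values are made up for illustration.

```python
import math


def component_reliability(failure_rate, t):
    """Pi(t) for a constant failure rate (assumed exponential model)."""
    return math.exp(-failure_rate * t)


def series_reliability(reliabilities):
    """Ps(t) as the product of the Pi(t).

    Only valid if failures are independent and every component
    failure is also a system failure (both are assumptions).
    """
    return math.prod(reliabilities)


def mtbf(failure_rate):
    """Mean time between failures: 1 / (failure rate)."""
    return 1.0 / failure_rate


rate = 1e-4   # failures per hour (hypothetical value)
t = 1000.0    # hours of operation (hypothetical value)

p = component_reliability(rate, t)
three_parts = series_reliability([p] * 3)
five_parts = series_reliability([p] * 5)

print(f"one component:    {p:.4f}")
print(f"three in series:  {three_parts:.4f}")
print(f"five in series:   {five_parts:.4f}")
print(f"MTBF:             {mtbf(rate):.0f} hours")
```

Under these assumptions, fewer components always gives a higher Ps(t). The answer changes once redundancy is introduced, because a component failure then no longer implies a system failure.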

  4. A Safety System
     • A system is safe if its deployment involves assuming an acceptable amount of risk… acceptable to whom?
     • Risk factors
       • Probability of something bad happening
       • Consequences of something bad happening (severity)
     • Example
       • Airplane travel: high severity, low probability
       • Electric shock from battery-powered devices: high probability, low severity
     [Graph: probability versus severity, showing a PC, an autopilot, and an mp3 player relative to a danger zone (we don’t all have the same risk tolerance!)]

  5. More Precise Terminology
     • Accident or Mishap: (unintended) damage to property or harm to persons. The economic impact of failure to meet warranted performance is outside the scope of safety.
     • Hazard: a state of the system that will inevitably lead to an accident or mishap
       • Release of energy
       • Release of toxins
       • Interference with life support functions
       • Supplying misleading (bad) information to safety personnel or control systems. This is the desktop PC nightmare scenario.
       • Failure to alarm when hazardous conditions exist

  6. Faults
     • A fault is an “unsatisfactory system condition or state”. A fault is not necessarily a hazard. In fact, assessments of safety are based on the notion of fault tolerance.
     • Systemic faults
       • Design errors (including process errors such as failure to test or failure to apply a safety design process)
       • Faults due to software bugs are always systemic
     • Random faults
       • Random events that can cause permanent or temporary damage to the system, including EMI, radiation, component failure, power supply problems, security breaches, etc.

  7. Component v. System
     • Reliability is a component issue
     • Safety and availability are system issues
     • If a system has lots of redundancy, the likelihood of a component failure (a fault) increases, but so may the safety and availability of that system.
     • Safety and availability are different and sometimes at odds. Safety may require the shutdown of a system that may still be able to perform its function.
     • A backup system that can fully operate a nuclear power plant might always shut it down in the event of failure of the primary system.
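The redundancy point can be illustrated with a short calculation. This is a sketch under the assumption that the redundant components fail independently with the same probability; the function names and the 1% figure are hypothetical, and common mode failures would invalidate the independence assumption.

```python
def prob_any_fault(p_fail, n):
    """Probability that at least one of n components has failed."""
    return 1.0 - (1.0 - p_fail) ** n


def prob_all_failed(p_fail, n):
    """Probability that all n redundant components have failed."""
    return p_fail ** n


p_fail = 0.01  # hypothetical per-component failure probability

for n in (1, 2, 3):
    print(
        f"n={n}: some fault present {prob_any_fault(p_fail, n):.6f}, "
        f"function lost {prob_all_failed(p_fail, n):.6f}"
    )
```

As n grows, a fault somewhere in the system becomes more likely, while total loss of the function becomes less likely.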

  8. Single Fault Tolerance (for safety)
     • The existence of any single fault does not result in a hazard
     • Single fault tolerant systems are generally considered to be safe, but more stringent requirements may apply to high-risk cases… airplanes, power plants, etc.
     [Diagram: Main H2 Valve Control and Backup H2 Valve Control linked by a watchdog protocol]
     • If the handshake fails, then either one or both can shut off the gas supply.
     • Is this a single fault tolerant system?

  9. Think Again?
     [Diagram: Main H2 Valve Control and Backup H2 Valve Control linked by a watchdog handshake; annotation: common mode failures]

  10. Now Safe?
      [Diagram: Main H2 Valve Control and Backup H2 Valve Control linked by a watchdog handshake; annotations: separate clock source, power fail-safe]
      • What about a power spike?
      • Does it ever end?
      • Is there another approach?

  11. Time is a Factor
      • The TUV Fault Assessment Flow Chart
        • T0: fault tolerance time of the first failure
        • T1: time after which a second fault is likely
      • Captures time, and the notion of “latent faults”
      • Built-in fault diagnosis and corrective action must occur within specific time frames: Tact < T0 < T1
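The constraint Tact < T0 < T1 can be written as a simple design check. This is an illustrative sketch only: the function name, the messages, and the example times are assumptions, and Tact is read here as the time the system needs to diagnose and correct a fault.

```python
def check_timing(t_act, t0, t1):
    """Return the violated parts of Tact < T0 < T1 (empty list if none).

    All three arguments must use the same time unit.
    """
    problems = []
    if not t_act < t0:
        problems.append("corrective action is slower than the fault tolerance time")
    if not t0 < t1:
        problems.append("a second fault is likely within the fault tolerance time")
    return problems


# Hypothetical numbers, in seconds.
print(check_timing(t_act=0.01, t0=0.1, t1=3600.0))  # no violations
print(check_timing(t_act=0.2, t0=0.1, t1=3600.0))   # action too slow
```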

  12. Latent Faults
      • Any fault that is not detectable by the system during operation has a probability of 1 (it must be assumed to be present), so it doesn’t count as the single fault in a single fault tolerance assessment
      [Diagram: Main H2 Valve Control and Backup H2 Valve Control linked by a watchdog handshake]
      • Stuck valves could be latent if the controllers cannot check their state.

  13. Fail Safe
      • On reset, the processor checks status. If bad, enter “safe mode”:
        • power off
        • reduced/altered functionality
        • alarm
        • restart
      • Safe mode is application dependent
      [Diagram: Processor and Watchdog; labels: protocol, protocol failure, reset, status]
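The reset-then-check behaviour can be sketched in a few lines. This is a simplified simulation rather than firmware: the Watchdog class, the Mode names, and the timeout value are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    SAFE = "safe"  # what safe mode does is application dependent


class Watchdog:
    """Expects a periodic kick; a missed kick counts as a protocol failure."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_kick = 0.0

    def kick(self, now):
        self.last_kick = now

    def expired(self, now):
        return now - self.last_kick > self.timeout


def mode_after_reset(status_ok):
    """On reset the processor checks status and picks its operating mode."""
    return Mode.NORMAL if status_ok else Mode.SAFE


wd = Watchdog(timeout=0.5)
wd.kick(now=1.0)
print(wd.expired(now=1.4))   # kick arrived in time
print(wd.expired(now=1.6))   # kick missed: watchdog would reset the processor
print(mode_after_reset(status_ok=False))
```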

  14. Safety Architectures
      • Single Channel Protected Design
      • Redundancy
      • Diversity or Heterogeneity
      • Watchdog
      [Diagram labels: Brake Pedal, Pedal Sensor, Computer, Computer Bus, Engine Control, Brake, watchdog; annotation: embed checks into the channel]

  15. Single Channel Protection
      • How to protect against corrupt code due to EMI?
        • Perform periodic checksums on the code space
        • How long does this take?
        • Is T0 < T1?
      • Feasibility of SCPD
        • Fault tolerance time
        • Speed of the processor
        • Amount of ROM/RAM
        • Redundant memory subsystems: special hardware
        • Recurring cost v. development cost tradeoff
      [Diagram labels: Brake, Computer (code corruption), Computer Bus, Engine Control; annotation: parity/CRC on the bus]
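A periodic code-space check and a rough answer to "how long does this take?" might look like the sketch below. It uses CRC-32 from Python's standard zlib module as a stand-in for whatever checksum the target hardware supports. The image contents, the cycles-per-byte figure, and the clock speed are made-up values.

```python
import zlib


def code_is_intact(image, expected_crc):
    """Compare the CRC-32 of the code image with a stored reference."""
    return zlib.crc32(image) == expected_crc


def checksum_time_s(code_bytes, cycles_per_byte, clock_hz):
    """Rough time for one full pass over the code space, in seconds."""
    return code_bytes * cycles_per_byte / clock_hz


# Stand-in for the contents of ROM (hypothetical data).
image = bytes(range(256)) * 4
reference = zlib.crc32(image)

# Simulate a single bit flipped by EMI.
corrupted = bytearray(image)
corrupted[10] ^= 0x01

print(code_is_intact(image, reference))
print(code_is_intact(bytes(corrupted), reference))

# 32 KB of code, 10 cycles per byte, 8 MHz clock (all assumed).
print(checksum_time_s(32 * 1024, 10, 8_000_000))
```

The estimated pass time has to fit inside the fault tolerance time, which is why processor speed and memory size appear in the feasibility list.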

  16. Multi-Channel Protection
      • Homogeneous Redundancy
        • Low development cost… just duplicate
        • High recurring cost
        • No protection against systemic faults
      [Diagram labels: Brake, three Computers (one marked code corruption), Voting Bus, Engine Control; annotation: could be implemented same as collision detect in I2C]
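Voting among redundant channels can be sketched as a software majority vote. Note that this is not the I2C-style bus arbitration mentioned on the slide; it is a simpler stand-in, and the function name and sample values are hypothetical.

```python
from collections import Counter


def vote(outputs):
    """Return the value produced by a strict majority of channels, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value
    return None  # no majority: the caller should treat this as a fault


print(vote([42, 42, 17]))  # one corrupted channel is outvoted
print(vote([1, 2, 3]))     # no agreement
```

With identical channels, a design error shows up in every output at once and is voted through unchanged, which is the systemic fault limitation noted on the slide.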

  17. Multi-Channel Protection
      • Heterogeneous Redundancy
        • Protects against random and some systemic faults.
        • Best if implementation teams are kept separate
      [Diagram labels: Brake, Proc/SW 1, Proc/SW 2, Voting Bus, Engine Control]

  18. Design for Safety
      • Hazard Identification and Fault Tree Analysis
      • Risk Assessment
      • Define Safety Measures
      • Create Safe Requirements
      • Create Safe Designs
      • Implement Safety
      • Assure Safety Process
      • Test, Test, Test

  19. Hazard Identification – Ventilator Example

  20. Fault Tree Analysis
      • Pacemaker Example

  21. System Risk Assessment
      • According to TUV, risk factors are:
        • Severity of the hazard
        • Duration of the period of exposure to the hazard
        • Prevention of the hazard
        • Probability of occurrence

  22. Define the Safety Measures
      • Obviation: make it physically impossible (mechanical hookups, etc.).
      • Education: educate users to prevent misuse or hazardous cases
      • Alarming: inform the users/operators of hazardous conditions
      • Interlocks: take steps to eliminate the hazard when conditions exist
      • Restriction of access: high-voltage sources should be in compartments that require tools to access, with proper labels.
      • Labeling
      • Consider
        • Tolerance time
        • Supervision of the system: constant, occasional, unattended. Airport people movers have to be designed to a much higher level of safety than attended trains, even if they both have fully automated control

  23. Create Safe Requirements: Specifications
      • Document the safety functionality
