
Safety



    1. Safety Glen Dobson g.dobson@lancs.ac.uk http://www.comp.lancs.ac.uk/~dobsong/teaching/dependability

    2. Recapping Dependability: “the property of a system such that we can justifiably place our reliance on the service it delivers” Attributes, relationships, conflicts Availability Reliability: “Continuity of Correct Service” Quantitative Measures – POFOD, ROCOF, MTTF Fault-Error-Failure Fault removal/avoidance/tolerance Representative operational profiles hard to achieve

    3. Overview Safety definition Hazards Compromise Safety measures Hazard avoidance Hazard tolerance Risk

    4. Safety Is about… (during normal & abnormal operation) Controlling potentially dangerous systems Preventing injury or death of people Preventing damage to environment Viewed as a specialisation of reliability Minimise occurrence of failures – specifically those with catastrophic consequences

    5. Safety systems Direct safety (primary): Safety critical system System itself can cause damage/injury Power station control, flight controller, etc. Indirect safety (secondary): Support system with safety implications System can lead to damage/injury Treatment db, maintenance manager, etc.

    6. Hazard chain

    7. Similarities ?

    8. Examples Hazard Live electrical cable on the lawn Narrow coolant pipes Incident Lawn mower cuts through cable Coolant pipes become blocked Accident Gardener gets electrocuted Core meltdown

    9. Value for human life

    10. Compromise We must put a price on life and suffering Perfect safety is impossible Very high reliability is very expensive Reach an “acceptable” compromise between: Safety, Practicality, Cost Otherwise we would never do anything! Many social, technical and political issues

    11. Example “I was a recall coordinator, my job was to apply the formula: Take the number of vehicles in the field (A), multiply it by the probable rate of failure (B), then multiply the result by the average out of court settlement (C). [If the result] (A x B x C) is less than the cost of a recall, we don’t do one.”
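The recall formula quoted above can be sketched in a few lines. This is only an illustration of the arithmetic: the function names and the figures in the usage note are hypothetical and do not come from the slides.

```python
def expected_payout(vehicles_in_field, failure_rate, avg_settlement):
    """A x B x C from the quote: expected total cost of out-of-court settlements."""
    return vehicles_in_field * failure_rate * avg_settlement


def recall_needed(vehicles_in_field, failure_rate, avg_settlement, recall_cost):
    """Per the quote, no recall happens when A x B x C is less than the recall cost."""
    payout = expected_payout(vehicles_in_field, failure_rate, avg_settlement)
    return payout >= recall_cost


if __name__ == "__main__":
    # Hypothetical figures, chosen only to exercise both branches.
    print(recall_needed(1000, 0.5, 100.0, 60000.0))
    print(recall_needed(1000, 0.5, 100.0, 40000.0))
```

The point of the slide is that this calculation puts an explicit price on harm; the sketch only makes the comparison visible.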

    12. Manufacturer responsibility Because death/injury must be tolerated Manufacturer is open to litigation Fines by government agencies (e.g. environment agency) Civil proceedings Even imprisonment of employees

    13. Manufacturer defence Demonstrate system “fitness for purpose” “As Safe as Could Reasonably be Expected” Demonstrate lack of negligence Provide warnings (signs, labels, disclaimers) Take out insurance !!!

    14. Evaluating safety Safety is hard to measure Often rely on “Judged” safety level Estimates our “confidence level” From “Very safe” to “Very unsafe” Matter for professional judgement Evidence supported argument Should address product AND process

    15. Factors influencing judgement Reputation of developers Maturity of development process Adherence to standards Well documented V&V: Reviews/inspections Static checking Comprehensive testing Formal proofs Safety case or safety argument

    16. System safety case Justify and defend system Does not prove safety of system Reasoned argument indicating safety Demonstrates design and assessment Presents “evidence” based on: Expert engineering judgement Probabilistic risk analysis Demonstrating risks have been addressed

    17. Proof by contradiction Systematic & mathematical approach Show unsafe state can’t be reached Conditions for hazard can’t exist Focused on single aspect of system Shorter than full formal method “Semi-formal” (?) thus easy to understand

    18. Safety Integrity levels Safety specified using integrity level Exact quantitative value not always possible Probability of accident occurrence: Integrity level 4: 10^-2 to 10^-1 Integrity level 3: 10^-3 to 10^-2 Integrity level 2: 10^-4 to 10^-3 Integrity level 1: 10^-5 to 10^-4 (From IEC 1508 standard)

    19. Give example systems For integrity level 4 (10^-2 to 10^-1)? For integrity level 3 (10^-3 to 10^-2)? For integrity level 2 (10^-4 to 10^-3)? For integrity level 1 (10^-5 to 10^-4)?

    20. Very high safety measures Use of v. high measures is problematic Often impossible to verify achievement We can’t test to such extremes So can we build to these extremes? Maybe such systems are too risky? If we can’t check it - don’t build it!

    21. Severity counts Not all failures have the same severity We can put up with some minor ones… aim for the following integrity levels: Negligible: 10^-2 to 10^-1 Minor effect: 10^-4 to 10^-3 Major effect: 10^-6 to 10^-5 Hazardous: 10^-8 to 10^-7 Catastrophic: 10^-9 and lower (From civil aircraft manufacturing)
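The severity bands above can be read as a lookup from severity class to the highest failure probability that is still tolerated. The sketch below assumes the upper end of each band is the limit; the slide does not state the unit (per hour, per demand, etc.), so none is assumed here, and the names are hypothetical.

```python
# Upper bound of each band on the slide; "catastrophic" is "10^-9 and lower".
MAX_PROBABILITY = {
    "negligible": 1e-1,
    "minor": 1e-3,
    "major": 1e-5,
    "hazardous": 1e-7,
    "catastrophic": 1e-9,
}


def meets_target(severity, estimated_probability):
    """True if the estimated failure probability is within the band's upper limit."""
    return estimated_probability <= MAX_PROBABILITY[severity]


if __name__ == "__main__":
    print(meets_target("major", 1e-6))
    print(meets_target("catastrophic", 1e-8))
```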

    22. Give example systems Of negligible (10^-2 to 10^-1)? Of minor effect (10^-4 to 10^-3)? Of major effect (10^-6 to 10^-5)? Of hazardous (10^-8 to 10^-7)? Of catastrophic (10^-9 and lower)?

    23. Hazards and failures Hazard viewed as specialised “fault” Safety related failure Wider socio-technical perspective Hazards thus managed in similar way: Hazard avoidance (c.f. fault avoidance) Damage limitation (c.f. fault tolerance)

    24. Accident prevention

    25. Hazard avoidance & removal Formal proofs Informal arguments Managed development lifecycle Hazard analysis: Thought support tools Checklists Brainstorming

    26. “Safe” development process Hazard analysis Hazard management (logging, tracing) Engineers with responsibility for safety Extensive use of safety reviews Safety certification Detailed configuration management

    27. Safety development lifecycle

    28. Hazard analysis process

    29. Hazard analysis collaboration Developers Domain experts Safety advisers Managers End user Regulatory bodies Certification organisation

    30. Hazard analysis Long and time consuming Difficult and complex Expensive Boring and tedious Omission and error prone Estimating probabilities and severities is hard

    31. Hazard analysis process

    32. Hazard identification Identify all possible hazards Often many possible hazards Hard to identify all possible hazards Potential for hazard interaction Most accidents are due to multiple hazards/incidents (Perrow 1984)

    33. Identification mechanisms Introspection Group brainstorming Precedence and case studies Thought support tools Checklists

    34. HazOp analysis Supports cooperation between experts Aims to bridge the “culture gap” Systematic “thought support” Prompt human operators Entities and phenomena Domain specific “bad things” Consider all combinations Some make sense, others do not

    35. HazOp concepts Intention - how system should operate Guide word - abstract “bad things” Parameter - changeable entity or phenomenon Deviation - unintended operation (guide word x parameter) Cause - cause of deviation Consequence - results of deviation Suggested action - prevent deviation

    36. Example HazOp analysis Making “a nice cup of tea”

    37. Possible Deviations More tea leaves - too strong Less heat - poor brewing, cold tea Milk late - (tea in first) proteins damaged More sugar - too sweet Other than comfy chair - ruin experience
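A HazOp deviation is a guide word applied to a parameter, and the method asks the team to consider every combination. A minimal sketch of that enumeration follows, using only the guide words and parameters that appear in the tea example; the list and function names are hypothetical, and a real study would use a fuller guide-word set.

```python
from itertools import product

# Guide words and parameters taken from the tea example on the slides.
GUIDE_WORDS = ["more", "less", "late", "other than"]
PARAMETERS = ["tea leaves", "heat", "milk", "sugar", "comfy chair"]


def candidate_deviations(guide_words, parameters):
    """Pair every guide word with every parameter.

    Some pairs make sense ("more tea leaves"), others do not ("late comfy
    chair"); deciding which is the job of the review team, not of this code.
    """
    return [f"{word} {param}" for word, param in product(guide_words, parameters)]


if __name__ == "__main__":
    for deviation in candidate_deviations(GUIDE_WORDS, PARAMETERS):
        print(deviation)
```

The list grows as guide words times parameters, which is one reason hazard analysis is described as long and tedious.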

    38. Hazard analysis process

    39. Hazard classification Nature of damage caused (e.g. toxic) Example being road haulage labels Probability of damage Severity of damage

    40. Hazard analysis process

    41. Risk assessment Produce calculated risk values Consider acceptability of risk: Intolerable As Low As Reasonably Practicable (ALARP) Acceptable Consider social and political factors Take into account costs of prevention Help decide if action needs taking

    42. Risk phenomenon Risk is a very strange thing Subject to illogical thinking Subject to political and social pressure Perceived risk differs from actual risk

    43. Perception of risk Big accident, many fatalities = high impact Small accident, few fatalities = low impact Even though there are many small accidents “Total deaths” is not important !!! What kills more people: Planes or Donkeys? 2004… 9000 trouser related accidents resulted in injury

    44. Strange risk Train crash - many killed Public outcry Government forced into action Introduce train protection system Slower trains, increased fares More passengers choose to drive Cars are less safe than trains More people die than if gov did nothing!!

    45. Risk calculations Hazard probability (occurrence) Incident probability (conversion) Accident probability (completion) Hazard severity (worst case damage) Hazard risk = haz_prob x incident_prob x accident_prob x haz_sev
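The hazard risk product above can be sketched directly. The function name and the range check are assumptions added for illustration; the slide gives only the formula.

```python
def hazard_risk(haz_prob, incident_prob, accident_prob, haz_sev):
    """Hazard risk = haz_prob x incident_prob x accident_prob x haz_sev.

    haz_prob:      probability that the hazard occurs
    incident_prob: probability that the hazard converts into an incident
    accident_prob: probability that the incident completes into an accident
    haz_sev:       worst-case damage, on whatever numeric scale is in use
    """
    for p in (haz_prob, incident_prob, accident_prob):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return haz_prob * incident_prob * accident_prob * haz_sev


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(hazard_risk(0.5, 0.5, 0.5, 8.0))
```

Multiplying the three probabilities treats each stage as conditional on the previous one, which matches the hazard, incident, accident chain from the earlier slides.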

    46. Dimensions of risk Probability - numerical value or scale: Frequent, Probable, Occasional, Remote, Improbable, Incredible (N.B. nothing is impossible!) Severity - numerical value or scale: Catastrophic, Hazardous, Major, Minor, Negligible, No effect Risk - numerical (death/year) or scale: Intolerable, Undesirable, Tolerable, Negligible

    47. Risk estimation question? Identify potential accidents resulting from each of the following and give estimations of perceived and actual risks: Driving on M6 in the snow Flying on Concorde Riding a rollercoaster Being an MSc student

    48. Event tree analysis How hazards contribute to accidents Interaction of hazards and events Effect of combined hazards Help reason about what could happen Uses probability of hazards and events Calculate probability of accident Used in assessment of risk

    49. Example system Boat with leaky hull Sea water detection system Automatic pump Pump failure alarm Level alarm Manual pump available
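An event tree for the leaky boat can be sketched by walking every combination of "works / fails" for each safeguard and multiplying the branch probabilities along each path. Everything numeric here is hypothetical, the events are assumed independent, and the two alarms are merged into a single "alarm" event to keep the tree small.

```python
from itertools import product


def event_tree(branches, classify):
    """Sum path probabilities per outcome.

    branches: list of (event_name, probability_that_it_works)
    classify: function mapping {event_name: bool} to an outcome label
    """
    names = [name for name, _ in branches]
    totals = {}
    for states in product([True, False], repeat=len(branches)):
        path_prob = 1.0
        for (_, p_works), works in zip(branches, states):
            path_prob *= p_works if works else 1.0 - p_works
        outcome = classify(dict(zip(names, states)))
        totals[outcome] = totals.get(outcome, 0.0) + path_prob
    return totals


def boat_outcome(state):
    """Hypothetical logic: automatic pumping first, manual pumping as fallback."""
    if state["detection"] and state["auto_pump"]:
        return "pumped automatically"
    if state["alarm"] and state["manual_pump"]:
        return "pumped manually"
    return "flooding"


# Hypothetical success probabilities for each safeguard.
BOAT_BRANCHES = [
    ("detection", 0.9),
    ("auto_pump", 0.9),
    ("alarm", 0.9),
    ("manual_pump", 0.9),
]

if __name__ == "__main__":
    for outcome, prob in event_tree(BOAT_BRANCHES, boat_outcome).items():
        print(f"{outcome}: {prob:.4f}")
```

Enumerating every path is fine for a handful of events; it doubles with each added event, so larger trees are usually pruned by hand.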

    50. Event tree analysis

    51. Hazard analysis process

    52. Hazard filtration Minimise set of hazards for analysis Remove impossible hazards Remove very improbable hazards Remove very low risk hazards Keep record of removed hazards! Retain rationale for removal
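Filtration with a retained rationale can be sketched as a filter that never discards silently. The threshold, hazard names and risk values below are hypothetical.

```python
def filter_hazards(hazards, risk_threshold):
    """Split hazards into those kept for analysis and those removed.

    hazards: list of (name, risk) pairs
    Returns (kept, removed); each removed entry records why it was dropped,
    so the decision can be revisited later.
    """
    kept, removed = [], []
    for name, risk in hazards:
        if risk < risk_threshold:
            removed.append({
                "hazard": name,
                "risk": risk,
                "rationale": f"risk {risk} below threshold {risk_threshold}",
            })
        else:
            kept.append((name, risk))
    return kept, removed


if __name__ == "__main__":
    # Hypothetical hazards and risk values.
    hazards = [("blocked coolant pipe", 0.4), ("meteor strike", 0.000001)]
    kept, removed = filter_hazards(hazards, 0.001)
    print(kept)
    print(removed)
```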

    53. Hazard analysis process

    54. Hazard Decomposition Identify causes of each hazard Often combination of factors lead to hazard A single hazard may have different causes Essential to understanding of each hazard

    55. Fault tree analysis Systematic documentation of hazards Can utilise probability of events Tables of failure probability available for common components Calculate probability of hazard Tend to produce very large trees Evolve greatly during analysis process
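Calculating the probability of a hazard from a fault tree can be sketched with two gate rules, assuming the basic events are independent: an AND gate multiplies its inputs, and an OR gate is one minus the product of the complements. The tree shape and the probabilities below are hypothetical.

```python
from math import prod


def evaluate(node):
    """Probability of the event at this node.

    A node is either a number (probability of a basic event) or a tuple
    (gate, children) where gate is "AND" or "OR". Events are assumed
    independent, which real components often are not.
    """
    if isinstance(node, (int, float)):
        return float(node)
    gate, children = node
    probs = [evaluate(child) for child in children]
    if gate == "AND":
        return prod(probs)
    if gate == "OR":
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate: {gate}")


if __name__ == "__main__":
    # Hypothetical tree: the hazard occurs if the supply fails,
    # or if both the primary and the backup unit fail.
    tree = ("OR", [0.5, ("AND", [0.5, 0.5])])
    print(evaluate(tree))
```

Common-cause failures break the independence assumption, so a result from this kind of calculation is better read as an optimistic estimate.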

    56. Fault tree analysis

    57. Fault tree circuit example

    58. Example fault tree

    59. Hazard analysis process

    60. Guard proposition Prevent causes of hazards: Interlocks Physical guards Control software Work practices and procedures Block consequences of incidents This overlaps with damage limitation…

    61. Damage limitation

    62. Limitation approaches Assertions and state checks Exception handling Safety states (Fail-safe systems) Human flexibility Incident reporting Emergency procedure (e.g. fire drill)

    63. Safety states Fail-controlled: graceful failure Fail-uncontrolled: disgraceful failure Fail-stop: halts with no output Fail-silent: continue, but with no output Fail-safe: halts and drops into safe state Fail-operational: still some functionality

    64. Human “components” What effect do humans have on a system Inject unreliability and unpredictability? Inject flexibility and resilience? …Probably a bit of both Make use of their advantages Take account of their shortfalls Modern planes still carry a pilot! (opens up whole issue of trust)

    65. Blame All failures are caused by humans: Developers Administrators Operators Operators are good scapegoats if things go bad Especially if they are dead! Often “operator” errors trace back to UI

    66. FMECA (or just FMEA) Failure Mode - a way something can fail Cause - what leads to failures Effect - the consequences of failure Severity - seriousness of effects Occurrence - prob of cause occurring Criticality - severity x occurrence Current control - existing guard on cause Detection - prob of success for control Risk priority - criticality x detection
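The two products on this slide (criticality and risk priority) can be sketched as a small record type. The integer scales, the failure modes and their scores are hypothetical. How detection is scored is also an assumption: the sketch simply multiplies whatever number is supplied, as the slide's formula does.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # seriousness of the effect (hypothetical 1-10 scale)
    occurrence: int  # likelihood of the cause (hypothetical 1-10 scale)
    detection: int   # score for the current control (hypothetical 1-10 scale)

    @property
    def criticality(self):
        """Criticality = severity x occurrence."""
        return self.severity * self.occurrence

    @property
    def risk_priority(self):
        """Risk priority = criticality x detection."""
        return self.criticality * self.detection


def rank(modes):
    """Order failure modes from highest to lowest risk priority."""
    return sorted(modes, key=lambda mode: mode.risk_priority, reverse=True)


if __name__ == "__main__":
    modes = [
        FailureMode("sensor drifts", severity=4, occurrence=6, detection=2),
        FailureMode("pump seal leaks", severity=8, occurrence=3, detection=5),
    ]
    for mode in rank(modes):
        print(mode.name, mode.criticality, mode.risk_priority)
```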

    67. A320 Group exercise In your groups, identify as many potential hazards as you can for flight in the Airbus A320. Consider all socio-technical issues. Consider using the following identification mechanisms: Freeform brainstorming Precedence and case based comparison HazOp analysis

    68. Extended exercise For each hazard identified by the previous exercise, assign the following values: Accident probability Accident severity Hazard risk Possible guards

    69. System evolution How would the hazards that you have identified be affected by the following: Change in flight duration Total removal of pilots Near miss protection system Increased security (e.g. air marshals)
