
Dependability & Maintainability Theory and Methods 5. Markov Models

Andrea Bobbio, Dipartimento di Informatica, Università del Piemonte Orientale "A. Avogadro", 15100 Alessandria (Italy). bobbio@unipmn.it - http://www.mfn.unipmn.it/~bobbio/IFOA. Dependability & Maintainability Theory and Methods, 5. Markov Models. IFOA, Reggio Emilia, June 17-18, 2003.


Presentation Transcript


  1. Andrea Bobbio, Dipartimento di Informatica, Università del Piemonte Orientale "A. Avogadro", 15100 Alessandria (Italy). bobbio@unipmn.it - http://www.mfn.unipmn.it/~bobbio/IFOA. Dependability & Maintainability Theory and Methods, 5. Markov Models. IFOA, Reggio Emilia, June 17-18, 2003

  2. State-Space-Based Models
States and labeled state transitions. The state can keep track of:
- the number of functioning resources of each type
- the state of recovery for each failed resource
- the number of tasks of each type waiting at each resource
- the allocation of resources to tasks
A transition:
- can occur from any state to any other state
- can represent a simple or a compound event

  3. State-Space-Based Models (Continued)
Transitions between states represent the change of the system state due to the occurrence of an event; the model is drawn as a directed graph. The transition label determines the model type:
- Probability: homogeneous discrete-time Markov chain (DTMC)
- Rate: homogeneous continuous-time Markov chain (CTMC)
- Time-dependent rate: non-homogeneous CTMC
- Distribution function: semi-Markov process (SMP)

  4. Should I Use Markov Models? (Modeler's Options)
State-Space-Based Methods:
+ Model Dependencies
+ Model Fault-Tolerance and Recovery/Repair
+ Model Contention for Resources
+ Model Concurrency and Timeliness
+ Generalize to Markov Reward Models for Modeling Degradable Performance

  5. Should I Use Markov Models? (Modeler's Options)
+ Generalize to Markov Regenerative Models for Allowing Generally Distributed Event Times
+ Generalize to Non-Homogeneous Markov Chains for Allowing Weibull Failure Distributions
+ Performance, Availability and Performability Modeling Possible
- Large (Exponential) State Space

  6. In order to fulfil our goals (modeling the performance, availability and performability of complex systems), we need automatic generation and solution of large Markov reward models.

  7. Model-based evaluation
The choice of the model type is dictated by:
- measures of interest
- level of detailed system behavior to be represented
- ease of model specification and solution
- representation power of the model type
- access to suitable tools or toolkits

  8. State space models
A transition s → s' represents the change of state of a single component. Z(t) is the stochastic process; Pr{Z(t) = s} is the probability of finding Z(t) in state s at time t.
Pr{s → s', Δt} = Pr{Z(t + Δt) = s' | Z(t) = s}

  9. State space models
If s → s' represents a failure event:
Pr{s → s', Δt} = Pr{Z(t + Δt) = s' | Z(t) = s} = λ_i Δt
If s → s' represents a repair event:
Pr{s → s', Δt} = Pr{Z(t + Δt) = s' | Z(t) = s} = μ_i Δt
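A minimal sketch, assuming a single repairable component with illustrative failure rate λ and repair rate μ (values chosen for illustration, not taken from the slides): for a small Δt, the exact transition probabilities of the corresponding CTMC are approximately λΔt and μΔt, as stated above.

```python
# Minimal sketch (not from the slides): a single repairable component as a CTMC
# with states {0: up, 1: down}, failure rate lam and repair rate mu.
import numpy as np
from scipy.linalg import expm

lam, mu = 1e-3, 1e-1         # illustrative failure and repair rates (per hour)
Q = np.array([[-lam,  lam],  # generator (rate) matrix: rows sum to zero
              [  mu,  -mu]])

dt = 0.01                    # small time step (hours)
P_dt = expm(Q * dt)          # exact transition probabilities over dt
print(P_dt[0, 1], lam * dt)  # Pr{up -> down in dt} ~ lam*dt
print(P_dt[1, 0], mu * dt)   # Pr{down -> up in dt} ~ mu*dt
```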

  10. Markov Process: definition

  11. Transition Probability Matrix

  12. State Probability Vector

  13. Chapman-Kolmogorov Equations
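A small illustration in Python (the two-state chain and its probabilities are an assumed example, not from the slides): the state probability vector evolves as π(n) = π(0) P^n, and for a homogeneous DTMC the Chapman-Kolmogorov equations reduce to P^(m+n) = P^m P^n.

```python
# Illustrative sketch: a 2-state homogeneous DTMC.
import numpy as np

P = np.array([[0.9, 0.1],    # one-step transition probability matrix
              [0.6, 0.4]])   # rows sum to 1
pi0 = np.array([1.0, 0.0])   # initial state probability vector

# State probability vector after n steps: pi(n) = pi(0) @ P^n
pi3 = pi0 @ np.linalg.matrix_power(P, 3)
print(pi3)

# Chapman-Kolmogorov: P(m+n) = P(m) @ P(n) for a homogeneous chain
m, n = 2, 3
lhs = np.linalg.matrix_power(P, m + n)
rhs = np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)
print(np.allclose(lhs, rhs))  # True
```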

  14. Time-homogeneous CTMC

  15. Time-homogeneous CTMC

  16. The transition rate matrix

  17. C-K Equations for CTMC

  18. Solution equations

  19. Transient analysis
Given the initial state of the Markov chain, the system of differential equations is written based on:
rate of buildup = rate of flow in - rate of flow out
for each state (continuity equation).
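As a hedged illustration of transient analysis (the single repairable component and the rates below are assumed for illustration, not from the slides), the continuity equation leads to the Kolmogorov forward equations dπ(t)/dt = π(t) Q, whose solution π(t) = π(0) exp(Qt) can be computed with a matrix exponential.

```python
# Hedged sketch of CTMC transient analysis for an illustrative 2-state model.
import numpy as np
from scipy.linalg import expm

lam, mu = 1e-3, 1e-1                 # failure and repair rates (per hour)
Q = np.array([[-lam,  lam],          # states [up, down]
              [  mu,  -mu]])

pi0 = np.array([1.0, 0.0])           # start in the up state
for t in (1.0, 10.0, 100.0):
    pi_t = pi0 @ expm(Q * t)         # pi(t) = pi(0) @ exp(Q*t)
    print(t, "P(up) =", pi_t[0])     # instantaneous availability A(t)
```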

  20. Steady-state condition
If the process reaches a steady-state condition, the state probabilities no longer change with time, i.e. dP_i(t)/dt = 0 for every state i.

  21. Steady-state analysis (balance equation)
The steady-state equation can be written as a flow balance equation with a normalization condition on the state probabilities. Since the rate of buildup is zero in steady state:
rate of flow in = rate of flow out
for each state (balance equation).
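A minimal sketch of the steady-state computation for the same illustrative two-state model: replace one balance equation with the normalization condition and solve the resulting linear system.

```python
# Hedged sketch: solve pi @ Q = 0 with sum(pi) = 1 by replacing one
# balance equation with the normalization condition.
import numpy as np

lam, mu = 1e-3, 1e-1
Q = np.array([[-lam,  lam],
              [  mu,  -mu]])

A = Q.T.copy()
A[-1, :] = 1.0                          # normalization row: sum(pi) = 1
b = np.array([0.0, 1.0])
pi = np.linalg.solve(A, b)
print(pi)                               # [mu/(lam+mu), lam/(lam+mu)]
print("steady-state availability:", mu / (lam + mu))
```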

  22. State Classification

  23. 2-component system

  24. 2-component system

  25. 2-component system

  26. 2-component series system and 2-component parallel system (reliability block diagrams with blocks A1 and A2)
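As a quick worked example (the component availabilities below are made up), the series and parallel structures combine component availabilities as follows:

```python
# Illustrative combinatorial availability of 2-component series and parallel systems.
A1, A2 = 0.99, 0.95
A_series   = A1 * A2                      # both must work
A_parallel = 1 - (1 - A1) * (1 - A2)      # at least one must work
print(A_series, A_parallel)               # 0.9405 and 0.9995
```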

  27. 2-component stand-by system (block diagram with units A and B)
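For reference (standard textbook results under assumed conditions, not reproduced from the slides): with exponential failures at rate λ, no repair, and perfect switching, the stand-by arrangement improves MTTF over the parallel one.

```python
# Standard results (assumptions: exponential failures at rate lam, no repair,
# perfect instantaneous fault-free switching):
#   2-component parallel:      MTTF = 1/(2*lam) + 1/lam = 3/(2*lam)
#   2-component cold stand-by: MTTF = 1/lam + 1/lam     = 2/lam
lam = 1e-3
print("parallel MTTF:", 3 / (2 * lam))   # 1500.0 hours
print("stand-by MTTF:", 2 / lam)         # 2000.0 hours
```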

  28. Markov Models: Repairable Systems - Availability

  29. Repairable system: Availability

  30. Repairable system: 2 identical components

  31. Repairable system: 2 identical components

  32. 2-component Markov availability model
Assume we have a two-component parallel redundant system with repair rate μ. Assume that the failure rate of both components is λ. When both components have failed, the system is considered to have failed.

  33. Markov availability model
- Let the number of properly functioning components be the state of the system.
- The state space is {0, 1, 2}, where 0 is the system down state.
- We wish to examine the effects of shared vs. non-shared repair.

  34. Markov availability model (state diagrams over states 2, 1, 0: one for non-shared/independent repair and one for shared repair)

  35. Markov availability model
Note: the non-shared case can be modeled and solved using an RBD or a fault tree (FTREE), but the shared case needs the use of Markov chains.
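A sketch in Python of the two repair disciplines described above. The state ordering [2, 1, 0] follows the slides, while the numeric rates and the helper function are illustrative assumptions.

```python
# Hedged sketch of the two repair disciplines for the 2-component parallel
# system (states = number of working components, ordered [2, 1, 0]).
# lam = per-component failure rate, mu = per-repair-person repair rate.
import numpy as np

def steady_state(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1."""
    A = Q.T.copy()
    A[-1, :] = 1.0
    b = np.zeros(Q.shape[0]); b[-1] = 1.0
    return np.linalg.solve(A, b)

lam, mu = 1e-3, 1e-1                       # illustrative rates

# Shared repair: one repair person, so state 0 is repaired at rate mu.
Q_shared = np.array([[-2*lam,  2*lam,      0.0],
                     [    mu, -(mu+lam),   lam],
                     [   0.0,     mu,     -mu ]])

# Non-shared (independent) repair: both failed components repaired in
# parallel, so state 0 is left at rate 2*mu.
Q_nonshared = np.array([[-2*lam,  2*lam,      0.0],
                        [    mu, -(mu+lam),   lam],
                        [   0.0,   2*mu,    -2*mu]])

for name, Q in (("shared", Q_shared), ("non-shared", Q_nonshared)):
    pi = steady_state(Q)
    print(name, "availability:", pi[0] + pi[1], "unavailability:", pi[2])
```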

  36. Steady-state balance equations
For any state: rate of flow in = rate of flow out. Considering the shared case, let π_i denote the steady-state probability that the system is in state i.

  37. Steady-state balance equations
Hence π_1 = (2λ/μ) π_2 and π_0 = (λ/μ) π_1 = (2λ²/μ²) π_2.
Since π_0 + π_1 + π_2 = 1, we have π_2 = 1 / (1 + 2λ/μ + 2λ²/μ²), or equivalently π_2 = μ² / (μ² + 2λμ + 2λ²).

  38. Steady-state balance equations (Continued)
Steady-state unavailability for the shared case: U_shared = π_0 = 1 - A_shared.
Similarly, for the non-shared case, steady-state unavailability U_non-shared = 1 - A_non-shared.
Downtime in minutes per year = (1 - A) * 8760 * 60.
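Plugging illustrative numbers into the closed-form expressions above gives a sense of scale (the rates are assumed, not from the slides):

```python
# Hedged numeric illustration: closed-form steady-state unavailability for the
# shared-repair case, and the resulting downtime in minutes per year.
lam, mu = 1e-3, 1e-1
pi2 = 1.0 / (1.0 + 2*lam/mu + 2*(lam/mu)**2)
pi0 = 2 * (lam/mu)**2 * pi2              # steady-state unavailability U = 1 - A
A_shared = 1.0 - pi0
downtime_min_per_year = pi0 * 8760 * 60  # (1 - A) * hours/year * min/hour
print(A_shared, downtime_min_per_year)   # roughly 0.9998 and ~103 min/year
```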

  39. Steady-state balance equations

  40. Absorbing states - MTTF

  41. Absorbing states - MTTF
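A hedged sketch of the standard absorbing-state approach to MTTF (the specific model and rates below are assumptions): make the system-down state absorbing, restrict the generator to the transient states Q_T, and use MTTF = π_T(0) (-Q_T)^(-1) 1.

```python
# Hedged sketch of the absorbing-state MTTF computation.
import numpy as np

lam, mu = 1e-3, 1e-1
# Transient states [2, 1] of a 2-component system with repair of the first
# failure (1 -> 2 at rate mu); state 0 (both failed) is absorbing.
Q_T = np.array([[-2*lam,    2*lam   ],
                [    mu, -(mu+lam)  ]])

pi_T0 = np.array([1.0, 0.0])             # start with both components up
tau = np.linalg.solve(-Q_T.T, pi_T0)     # tau_i = expected time spent in state i
print("MTTF:", tau.sum())                # closed form: (3*lam + mu) / (2*lam**2)
```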

  42. Markov Reliability Model with Imperfect Coverage

  43. Markov model with imperfect coverage
Next consider a modification of the 2-component parallel system, proposed by Arnold as a model of duplex processors of an electronic switching system. We assume that not all faults are recoverable and that c is the coverage factor, which denotes the conditional probability that the system recovers given that a fault has occurred. The state diagram is now given by the following picture:

  44. Now allow for imperfect coverage (state diagram with coverage factor c)

  45. Markov model with imperfect coverage
Assume that the initial state is 2, so that P2(0) = 1 and P1(0) = P0(0) = 0. The system of differential equations for P2(t), P1(t) and P0(t) then follows from the state diagram.

  46. Markov model with imperfect coverage
After solving the differential equations we obtain R(t) = P2(t) + P1(t). From R(t), we can obtain the system MTTF. It should be clear that the system MTTF and system reliability are critically dependent on the coverage factor.
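The solved expressions for R(t) and MTTF are not reproduced in this transcript. The following is a hedged sketch assuming the common no-repair variant of the coverage model (the transition structure and rates are assumptions, not read off the slide); it shows how strongly the result depends on c.

```python
# Hedged sketch of a coverage model, assuming the no-repair variant with states
# [2, 1, 0]; 2 -> 1 at rate 2*lam*c (covered fault), 2 -> 0 at 2*lam*(1-c)
# (uncovered fault), 1 -> 0 at lam; state 0 is absorbing.
import numpy as np
from scipy.linalg import expm

lam = 1e-3
for c in (1.0, 0.99, 0.9):
    Q = np.array([[-2*lam, 2*lam*c, 2*lam*(1-c)],
                  [   0.0,   -lam,        lam  ],
                  [   0.0,    0.0,        0.0  ]])
    pi0 = np.array([1.0, 0.0, 0.0])
    R_1000h = (pi0 @ expm(Q * 1000.0))[:2].sum()   # R(t) = P2(t) + P1(t)
    mttf = (1 + 2*c) / (2*lam)                     # closed form for this variant
    print(c, "R(1000h) =", R_1000h, "MTTF =", mttf)
```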

  47. Source of fault coverage data
- Measurement data from an operational system: large amount of data needed; improved instrumentation needed
- Fault-injection experiments: expensive but badly needed; tools from CMU, Illinois, LAAS (Toulouse)
- A fault/error handling submodel (FEHM): phases are detection, location, retry, reconfiguration, reboot; estimate the duration and probability of success of each phase

  48. Redundant System with Finite Detection Switchover Time
Modify the Markov model with imperfect coverage to allow for a finite time to detect as well as imperfect detection. You will need to add an extra state, say D. The rate at which detection occurs is δ. Draw the state diagram and investigate the effects of detection delay on system reliability and mean time to failure.

  49. Redundant System with Finite Detection Switchover Time
Assumptions:
- The two units have the same MTTF and MTTR;
- Single shared repair person;
- Average detection/switchover time t_sw = 1/δ;
- We need to use a Markov model.

  50. Redundant System with Finite Detection Switchover Time (state diagram with states 2, 1D, 1, 0)
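The state diagram itself is not reproduced in this transcript. The sketch below assumes one plausible transition structure (explicitly an assumption, spelled out in the comments) just to show how the MTTF can be studied as a function of the detection/switchover time.

```python
# Hedged sketch of the detection/switchover model (the transition structure is
# an assumption, not read off the slide): states [2, 1D, 1, 0], where 1D means
# one unit has failed and detection/switchover (rate delta) is in progress.
# Assumed transitions: 2 -> 1D at 2*lam, 1D -> 1 at delta, 1D -> 0 at lam,
# 1 -> 2 at mu (repair), 1 -> 0 at lam; state 0 (system down) is absorbing.
import numpy as np

lam, mu = 1e-3, 1e-1
for t_sw in (0.0001, 0.1, 1.0, 10.0):            # detection/switchover time (hours)
    delta = 1.0 / t_sw
    Q_T = np.array([[-2*lam,   2*lam,      0.0     ],   # transient states [2, 1D, 1]
                    [   0.0, -(delta+lam), delta   ],
                    [    mu,     0.0,    -(mu+lam) ]])
    pi_T0 = np.array([1.0, 0.0, 0.0])
    mttf = np.linalg.solve(-Q_T.T, pi_T0).sum()  # MTTF = pi_T(0) @ inv(-Q_T) @ 1
    print(t_sw, "MTTF:", mttf)
```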
