Loading in 2 Seconds...
Loading in 2 Seconds...
Performance and availability of computers and networks. Instructor: Prof. Kishor S. Trivedi Visiting Prof. Of Computer Science and Engineering, IITK Prof. Department of Electrical and Computer Engineering Duke University Durham, NC 27708-0291 Phone: 7576 e-mail: [email protected]
Instructor: Prof. Kishor S. Trivedi
Visiting Prof. Of Computer Science and Engineering, IITK
Prof. Department of Electrical and Computer Engineering
Durham, NC 27708-0291
e-mail: [email protected]
and computer science applications, K. S. Trivedi, Prentice-Hall, 1982 (Indian edition also available).
and computer science applications, K. S. Trivedi, second edition, John Wiley & Sons, 2001.
"The ability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval, assuming that the external resources, if required, are provided."
(non-life-critical/non-safety-critical, business critical)
Dependability– Umbrella term
Trustworthiness of a computer system such that reliance can justifiably be placed on the service it delivers
Faults are the cause of errors that may lead to failures
Middleware (Recovery Software)
Outage-and-recovery behavior not considered
Different levels of performance not considered
at any given instant
Performance under failures
Survivability“ilities” besides performance
Performability measures of the systems ability to perform designated functions
R.A.S.-ability concerns grow. High-R.A.S. not only a selling point for equipment vendors and service providers. But, regulatory outage report required by FCC for public switched telephone networks (PSTN) may soon apply to wireless.
A model is a convenient abstraction
These famous quotes bring out the difficulty of prediction
based on models:
“Does it work, and for how long?\'\'
“Given that it works, how well does it work?\'\'
“How much work will be done(lost) in a given interval including the effects of failure/repair/contention?\'\'
Most believable, most expensive
Not always possible or cost effective during system design
Less believable, Less expensive
1. Discrete-Event Simulation vs. Analytic
2. State-Space Methods vs. Non-State-Space Methods
3. Hybrid: Simulation + Analytic (SPNP)
4. State Space + Non-State Space (SHARPE)
Vaidyanathan et al ISSRE 99
Numerical solution using a tool
(Model Calibration or Parameterization)
Confidence Intervals should be obtained
Boeing, Draper, Union Switch projects
1. Face Validation
2. Input-Output Validation
3. Validation of Model Assumptions
NON-STATE SPACE MODELING TECHNIQUES
SP reliability block diagrams
Non-SP reliability block diagrams
discrete-time Markov chains
continuous-time Markov chains
Markov reward models
(discrete) State space models
Markov regenerative process
Answer “What-if Questions\'\'
Use Measurements + Models
E.g. Fault/Injection + Availability Model
Union Switch and Signals, Boeing, Draper
Workload based adaptive rejuvenation
Should I Use Discrete-Event Simulation?
+ Detailed System Behavior including non-exponential behavior
+ Performance, Dependability and Performability Modeling Possible
- Long Execution Time (Variance Reduction Possible)
- Many users in practice do not realize the need to calculate confidence intervals
Should I Use Non-State-Space Methods?
also called Combinatorial Models
+ Easy specification, fast computation, no distributional assumption
+ Can easily solve models with 100’s of components
- Failure/Repair Dependencies are often present; RBDs, FTREEs cannot easily handle these
(e.g., state dependent failure rate, shared repair, imperfect coverage, reliability with repair)
Should I Use Markov Models?
+ Model Fault-Tolerance and Recovery/Repair
+ Model Dependencies
+ Model Contention for Resources
+ Model Concurrency and Timeliness
+ Generalize to Markov Reward Models for Modeling Degradable Performance
Should I Use Markov Models?
+ Generalize to Markov Regenerative Models for Allowing Generally Distributed Event Times
+ Generalize to Non-Homogeneous Markov Chains for Allowing Weibull Failure Distributions
+ Performance, Availability and Performability Modeling Possible
- Large (Exponential) State Space
The Markov chains tend to be large and complex
Use automated means of generating the Markov chains: Stochastic Petri Nets, Stochastic Reward Nets
Use sparse storage for the matrices
Use sparsity preserving solution methods
Well-known modeling tool (Installed at over 300 Sites; companies and universities)
Combines flexibility of Markov models and efficiency of combinatorial models
Ported to most architectures and operating systems
Used for Education, Research, Engineering PracticeOverview of SHARPE
Hierarchical & Hybrid Compositions
(GSPN & SRN)
Multistate fault tree
Reliability block diagram
Stochastic reward net
Fully symbolic CDF
Fully symbolic MTTF
Fully symbolic PQCDF
Expected interval availability
(No Retrofitting Required)
(Encourages Hybrid and Hierarchical Composition)
Appropriate Solution Algorithm for Each Model Type
i.e., Hierarchy for Solution as well as Specification