Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002
Source Material • “Software Reliability and Risk Management: Techniques and Tools”, Allen Nikora and Michael Lyu, tutorial presented at the 1999 International Symposium on Software Reliability Engineering • Allen Nikora, John Munson, “Determining Fault Insertion Rates For Evolving Software Systems”, proceedings of the International Symposium on Software Reliability Engineering, Paderborn, Germany, November, 1998
Agenda Part I: Introduction Part II: Survey of Software Reliability Models Part III: Quantitative Criteria for Model Selection Part IV: Input Data Requirements and Data Collection Mechanisms Part V: Early Prediction of Software Reliability Part VI: Current Work in Estimating Fault Content Part VII: Software Reliability Tools
Part I: Introduction • Reliability Measurement Goal • Definitions • Reliability Theory
Reliability Measurement Goal • Reliability measurement is a set of mathematical techniques that can be used to estimate and predict the reliability behavior of software during its development and operation. • The primary goal of software reliability modeling is to answer the following question: “Given a system, what is the probability that it will fail in a given time interval, or, what is the expected duration between successive failures?”
Basic Definitions • Software Reliability R(t): The probability of failure-free operation of a computer program for a specified time under a specified environment. • Failure: The departure of program operation from user requirements. • Fault: A defect in a program that causes failure.
Basic Definitions (cont’d) • Failure Intensity (rate) f(t): The expected number of failures experienced in a given time interval. • Mean-Time-To-Failure (MTTF): Expected value of a failure interval. • Expected total failures m(t): The number of failures expected by time t.
Reliability Theory Let T be a random variable representing the failure time or lifetime of a physical system. For this system, the probability that it will fail by time t is: F(t) = P(T ≤ t) The probability of the system surviving until time t is: R(t) = 1 − F(t) = P(T > t)
Reliability Theory (cont’d) Failure rate - the probability that a failure will occur in the interval [t1, t2] given that a failure has not occurred before time t1. This is written as: [R(t1) − R(t2)] / [(t2 − t1) · R(t1)]
Reliability Theory (cont’d) Hazard rate - limit of the failure rate as the length of the interval approaches zero. This is written as: z(t) = lim Δt→0 [R(t) − R(t + Δt)] / [Δt · R(t)] = f(t) / R(t) This is the instantaneous failure rate at time t, given that the system survived until time t. The terms hazard rate and failure rate are often used interchangeably.
Reliability Theory (cont’d) A reliability objective expressed in terms of one reliability measure can be easily converted into another measure as follows (assuming an “average” failure rate, λ, is measured): MTTF = 1/λ, R(t) = e^(−λt), λ = −ln R(t) / t
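The conversions above can be sketched in a few lines of Python. This is a minimal illustration assuming a constant (exponential) failure rate λ; the function names are ours, not from the tutorial.

```python
import math

def mttf_from_rate(lam):
    """MTTF = 1/lambda under a constant (exponential) failure rate."""
    return 1.0 / lam

def reliability(lam, t):
    """R(t) = exp(-lambda * t): probability of surviving past time t."""
    return math.exp(-lam * t)

def rate_for_objective(r_target, t):
    """Failure rate needed to meet reliability r_target over mission time t."""
    return -math.log(r_target) / t
```

For example, a failure rate of 0.01 failures/hour gives an MTTF of 100 hours, and the reliability over one MTTF is e^(−1) ≈ 0.368.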
Part II: Survey of Software Reliability Models • Software Reliability Estimation Models: • Exponential NHPP Models • Jelinski-Moranda/Shooman Model • Musa-Okumoto Model • Geometric Model • Software Reliability Modeling and Acceptance Testing
Jelinski-Moranda/Shooman Models • Jelinski-Moranda model was developed by Jelinski and Moranda of McDonnell Douglas Astronautics Company for use on Navy NTDS software and a number of modules of the Apollo program. The Jelinski-Moranda model was published in 1971. • Shooman's model, developed independently of Jelinski and Moranda's work, was also published in 1971. Shooman's model is identical to the JM model.
Jelinski-Moranda/Shooman (cont'd) Assumptions: • The number of errors in the code is fixed. • No new errors are introduced into the code through the correction process. • The number of machine instructions is essentially constant. • Detections of errors are independent. • The software is operated in a similar manner as the anticipated operational usage. • The error detection rate is proportional to the number of errors remaining in the code.
Jelinski-Moranda/Shooman (cont'd) Let τ represent the amount of debugging time spent on the system since the start of the test phase. From assumption 6, we have: z(τ) = K·r(τ) where K is the proportionality constant, and r is the error rate (number of remaining errors normalized with respect to the number of instructions): r(τ) = ET/IT − εc(τ) ET = number of errors initially in the program IT = number of machine instructions in the program εc(τ) = cumulative number of errors fixed in the interval [0, τ], normalized by the number of instructions
Jelinski-Moranda/Shooman (cont'd) ET and IT are constant (assumptions 1 and 3). No new errors are introduced through the correction process (assumption 2). As τ → ∞, εc(τ) → ET/IT, so r(τ) → 0. The hazard rate becomes: z(τ) = K·[ET/IT − εc(τ)]
Jelinski-Moranda/Shooman (cont'd) The reliability function becomes: R(t) = exp(−K·[ET/IT − εc(τ)]·t) The expression for MTTF is: MTTF = 1 / (K·[ET/IT − εc(τ)])
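The JM/Shooman hazard and MTTF expressions can be sketched directly. A minimal illustration; parameter values and names below are hypothetical examples, not from the tutorial.

```python
def jm_hazard(K, E_T, I_T, eps_c):
    """JM/Shooman hazard rate z = K * (E_T/I_T - eps_c),
    where eps_c is the cumulative corrected-error count per instruction."""
    return K * (E_T / I_T - eps_c)

def jm_mttf(K, E_T, I_T, eps_c):
    """MTTF is the reciprocal of the (constant-between-fixes) hazard rate."""
    return 1.0 / jm_hazard(K, E_T, I_T, eps_c)
```

As errors are corrected (eps_c grows toward E_T/I_T), the hazard rate falls and the MTTF grows, matching assumption 6.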
Geometric Model • Proposed by Moranda in 1975 as a variation of the Jelinski-Moranda model. • Unlike models previously discussed, it does not assume that the number of errors in the program is finite, nor does it assume that errors are equally likely to occur. • This model assumes that errors become increasingly difficult to detect as debugging progresses, and that the program is never completely error free.
Geometric Model (cont'd) Assumptions: • There are an infinite number of total errors. • All errors do not have the same chance of detection. • The detections of errors are independent. • The software is operated in a similar manner as the anticipated operational usage. • The error detection rate forms a geometric progression and is constant between error occurrences.
Geometric Model (cont'd) The above assumptions result in the following hazard rate: z(t) = D·φ^(i−1), 0 < φ < 1 for any time t between the (i − 1)st and the i'th error. The initial value of z(t) is D.
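The geometric progression of the hazard rate is easy to compute. A small sketch, assuming the hazard D·φ^(i−1) with ratio φ between 0 and 1 as above; the function name is ours.

```python
def geometric_hazard(D, phi, i):
    """Hazard rate between the (i-1)st and i-th error: D * phi**(i-1).
    D is the initial hazard; 0 < phi < 1 makes each error harder to find."""
    return D * phi ** (i - 1)
```

With D = 10 and φ = 0.5 the hazard steps down through 10, 5, 2.5, ... — it decreases geometrically but never reaches zero, reflecting the assumption of infinitely many errors.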
Geometric Model (cont'd) [Figure: hazard rate graph — z(t) steps down from D to Dφ to Dφ² at successive errors, with step sizes D(1 − φ), then Dφ(1 − φ)]
Musa-Okumoto Model • The Musa-Okumoto model assumes that the failure intensity function decreases exponentially with the number of failures observed: λ(μ) = λ0·e^(−θμ) • Since λ(t) = dμ(t)/dt, we have the following differential equation: dμ(t)/dt = λ0·e^(−θμ(t)) or e^(θμ(t))·dμ(t)/dt = λ0
Musa-Okumoto Model (cont’d) Note that d/dt[e^(θμ(t))] = θ·e^(θμ(t))·dμ(t)/dt We then obtain d/dt[e^(θμ(t))] = λ0·θ
Musa-Okumoto Model (cont’d) Integrating this last equation yields: e^(θμ(t)) = λ0·θ·t + C Since μ(0) = 0, C = 1, and the mean value function μ(t) is: μ(t) = (1/θ)·ln(λ0·θ·t + 1)
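The derivation above can be checked numerically. A minimal sketch of the Musa-Okumoto mean value function and the corresponding failure intensity; λ0 and θ values used below are illustrative only.

```python
import math

def mo_mean_failures(lam0, theta, t):
    """Mean value function mu(t) = (1/theta) * ln(lam0*theta*t + 1)."""
    return math.log(lam0 * theta * t + 1.0) / theta

def mo_intensity(lam0, theta, t):
    """Failure intensity lambda(t) = lam0 / (lam0*theta*t + 1),
    which equals lam0 * exp(-theta * mu(t))."""
    return lam0 / (lam0 * theta * t + 1.0)
```

Note that mo_intensity is exactly λ0·e^(−θμ(t)): substituting μ(t) into the intensity assumption recovers the closed form, which is a quick sanity check on the integration.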
Software Reliability Modeling and Acceptance Testing Given a piece of software advertised as having a failure rate λ, you can see if it meets that failure rate to a specific level of confidence. • α is the risk (probability) of falsely saying that the software does not meet the failure rate goal. • β is the risk of saying that the goal is met when it is not. • The discrimination ratio, γ, is the factor you specify that identifies acceptable departure from the goal. For instance, if γ = 2, the acceptable failure rate lies between λ/2 and 2λ.
Software Reliability Modeling and Acceptance Testing (cont’d) [Figure: sequential test chart — failure number vs. normalized failure time (time to failure times failure intensity objective), divided into reject, continue, and accept regions]
Software Reliability Modeling and Acceptance Testing (cont’d) We can now draw a chart as shown in the previous slide. Define intermediate quantities A and B as follows: A = ln[(1 − β)/α], B = ln[β/(1 − α)] The boundary between the “reject” and “continue” regions is given by: τ = (n·ln γ − A)/(γ − 1) where n is the number of failures observed and τ is normalized failure time. The boundary between the “continue” and “accept” regions of the chart is given by: τ = (n·ln γ − B)/(γ − 1)
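The two boundaries can be computed for any failure count n. A sketch assuming the sequential-test boundary formulas τ = (n·ln γ − A)/(γ − 1) and τ = (n·ln γ − B)/(γ − 1) given above; the function name and parameter values are ours.

```python
import math

def sprt_boundaries(alpha, beta, gamma, n):
    """Reject/accept boundaries (in normalized failure time) after n failures.
    alpha: risk of falsely rejecting; beta: risk of falsely accepting;
    gamma: discrimination ratio (> 1)."""
    A = math.log((1.0 - beta) / alpha)
    B = math.log(beta / (1.0 - alpha))
    reject = (n * math.log(gamma) - A) / (gamma - 1.0)
    accept = (n * math.log(gamma) - B) / (gamma - 1.0)
    return reject, accept  # continue region lies between these two lines
```

If the n-th failure occurs at a normalized time below the reject boundary the software is rejected; above the accept boundary it is accepted; in between, testing continues.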
Part III: Criteria for Model Selection • Background • Non-Quantitative criteria • Quantitative criteria
Criteria for Model Selection - Background • When software reliability models first appeared, it was felt that a process of refinement would produce “definitive” models that would apply to all development and test situations. • Current situation: • Dozens of models have been published in the literature • Studies over the past 10 years indicate that the accuracy of the models is variable • Analyzing the particular context in which reliability measurement is to take place so as to decide a priori which model to use does not seem possible.
Criteria for Model Selection (cont’d) Non-Quantitative Criteria • Model Validity • Ease of measuring parameters • Quality of assumptions • Applicability • Simplicity • Insensitivity to noise
Criteria for Model Selection (cont’d) Quantitative Criteria for Post-Model Application • Self-consistency • Goodness-of-Fit • Relative Accuracy (Prequential Likelihood Ratio) • Bias (U-Plot) • Bias Trend (Y-Plot)
Criteria for Model Selection (cont’d) Self-consistency - Analysis of a model’s predictive quality can help the user decide which model(s) to use. • The simplest question an SRM user can ask is “How reliable is the software at this moment?” • The time to the next failure, Ti, is usually predicted using the observed times to failure t1, t2, ..., t(i−1) • In general, predictions of Ti can be made K steps ahead, using only the observed times to failure t1, t2, ..., t(i−K) The results of predictions made for different values of K can then be compared. If a model produces “self-consistent” results for differing values of K, this indicates that its use is appropriate for the data on which the particular predictions were made. HOWEVER, THIS PROVIDES NO GUARANTEE THAT THE PREDICTIONS ARE CLOSE TO THE TRUTH.
Criteria for Model Selection (cont’d) Goodness-of-fit - Kolmogorov-Smirnov Test • Uses the absolute vertical distance between two CDFs to measure goodness of fit. • Depends on the fact that the statistic: Dn = sup |Fn(x) − F0(x)| where F0 is a known, continuous CDF, and Fn is the sample CDF, is distribution free.
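The K-S statistic Dn can be computed directly from a sorted sample, checking the empirical CDF against F0 just before and just after each jump. A minimal sketch; the function name is ours.

```python
def ks_statistic(sample, cdf):
    """D_n = sup_x |F_n(x) - F0(x)|, where F_n is the empirical CDF.
    The supremum is attained at a sample point, on one side of its jump."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f0 = cdf(x)
        # empirical CDF jumps from i/n to (i+1)/n at x
        d = max(d, abs((i + 1) / n - f0), abs(i / n - f0))
    return d
```

For interfailure-time data, cdf would be the model's fitted distribution; a large Dn relative to the K-S critical value indicates poor fit.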
Criteria for Model Selection (cont’d) Goodness-of-fit (cont’d) - Chi-Square Test • More suited to determining GOF of failure counts data than to interfailure times. • Value given by: χ² = Σ (j = 1 to k+1) (Nj − n·pj)² / (n·pj) where: • n = number of independent repetitions of an experiment in which the outcomes are decomposed into k+1 mutually exclusive sets A1, A2,..., Ak+1 • Nj = number of outcomes in the j’th set • pj = P[Aj]
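The chi-square statistic above maps directly to code. A small sketch, taking the observed counts Nj and cell probabilities pj as inputs; the function name is ours.

```python
def chi_square_stat(counts, probs):
    """Chi-square GOF statistic: sum of (N_j - n*p_j)^2 / (n*p_j),
    with n the total number of observations."""
    n = sum(counts)
    return sum((N - n * p) ** 2 / (n * p) for N, p in zip(counts, probs))
```

A perfect match between observed and expected counts gives 0; the statistic is compared against the chi-square distribution with k degrees of freedom (less the number of fitted parameters).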
Criteria for Model Selection (cont’d) Prequential Likelihood Ratio • The pdf fi(t) for Ti is based on the observations t1, t2, ..., t(i−1). • For one-step ahead predictions of T(j+1), ..., T(j+n), the prequential likelihood is: PLn = Π (i = j+1 to j+n) fi(ti) • Two prediction systems, A and B, can be evaluated by computing the prequential likelihood ratio: PLRn = PLn(A) / PLn(B) • If PLRn approaches infinity as n approaches infinity, B is discarded in favor of A
Prequential Likelihood Example [Figure: predictive pdfs fi, fi+1, fi+2 compared against the true pdf — one panel showing high bias, low noise; another showing low bias, high noise]
Criteria for Model Selection (cont’d) Prequential Likelihood Ratio (cont'd) When predictions have been made for T(j+1), ..., T(j+n), the PLR is given by: PLRn = p(t(j+1), ..., t(j+n) | A) / p(t(j+1), ..., t(j+n) | B) Using Bayes' Rule, the PLR is rewritten as: PLRn = [p(A | t(j+1), ..., t(j+n)) · p(t(j+1), ..., t(j+n)) / p(A)] / [p(B | t(j+1), ..., t(j+n)) · p(t(j+1), ..., t(j+n)) / p(B)]
Criteria for Model Selection (cont’d) Prequential Likelihood Ratio (cont’d) This equals: PLRn = [p(A | t(j+1), ..., t(j+n)) / p(B | t(j+1), ..., t(j+n))] · [p(B) / p(A)] If the initial conditions were based only on prior belief, the second factor of the final equation is the prior odds ratio. If the user is indifferent between models A and B, this ratio has a value of 1.
Criteria for Model Selection (cont’d) Prequential Likelihood Ratio (cont’d): The final equation is then written as: PLRn = wA / (1 − wA) This is the posterior odds ratio, where wA is the posterior belief that A is true after making predictions with both A and B and comparing them with actual behavior.
Criteria for Model Selection (cont’d) The “u-plot” can be used to assess the predictive quality of a model • Given a predictor, F̂i(t), that estimates the probability that the time to the next failure is less than t, consider the sequence {ui}, where each ui = F̂i(ti) is a probability integral transform of the observed ti using the previously calculated predictor F̂i based upon t1, t2, ..., t(i−1). • If each F̂i were identical to the true, but hidden, Fi, then the ui would be realizations of independent random variables with a uniform distribution in [0,1]. • The problem then reduces to seeing how closely the sequence {ui} resembles a random sample from the uniform distribution on [0,1]
U-Plots for JM and LV Models [Figure: u-plots for the JM and LV models — sample CDFs of the ui values plotted on the unit square against the line of unit slope]
Criteria for Model Selection (cont’d) The y-plot: • Temporal ordering is not shown in a u-plot. The y-plot addresses this deficiency • To generate a y-plot, the following steps are taken: • Compute the sequence of ui • For each ui, compute xi = −ln(1 − ui) • Obtain yi by computing: yi = [Σ (j = 1 to i) xj] / [Σ (j = 1 to m) xj] for i ≤ m, m representing the number of observations made • If the ui really do form a sequence of independent random variables uniform in [0,1], the slope of the plotted yi will be constant.
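The u-plot and y-plot transforms above can be sketched as follows. A minimal illustration, assuming each predictor is supplied as a callable CDF; the function names are ours.

```python
import math

def u_values(predicted_cdfs, times):
    """u_i = F_hat_i(t_i): probability integral transform of each
    observed time using its one-step-ahead predicted CDF."""
    return [F(t) for F, t in zip(predicted_cdfs, times)]

def y_values(us):
    """y-plot: x_i = -ln(1 - u_i), then normalized cumulative sums.
    Good predictions yield points close to the line of unit slope."""
    xs = [-math.log(1.0 - u) for u in us]
    total = sum(xs)
    acc, ys = 0.0, []
    for x in xs:
        acc += x
        ys.append(acc / total)
    return ys
```

Plotting yi against i/m then reveals trend: systematic curvature indicates that the model's bias changes over time.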
Y-Plots for JM and LV Models [Figure: y-plots for the JM and LV models — normalized cumulative yi values plotted on the unit square against the line of unit slope]
Criteria for Model Selection (cont’d) Quantitative Criteria Prior to Model Application • Arithmetical Mean of Interfailure Times • Laplace Test
Arithmetical Mean of Interfailure Times • Calculate the arithmetical mean of interfailure times as follows: t(i) = (1/i) · Σ (j = 1 to i) θj where i = number of observed failures, θj = j’th interfailure time • An increasing series of t(i) suggests reliability growth. • A decreasing series of t(i) suggests reliability decrease.
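This trend test is a one-liner per failure. A minimal sketch computing the running mean t(i) for each i; the function name is ours.

```python
def mean_interfailure_times(thetas):
    """t(i) = average of the first i interfailure times, for each i.
    An increasing sequence suggests reliability growth."""
    means, total = [], 0.0
    for i, theta in enumerate(thetas, start=1):
        total += theta
        means.append(total / i)
    return means
```

For example, interfailure times of 1, 2, 3 hours give running means 1.0, 1.5, 2.0 — an increasing series, suggesting reliability growth.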
Laplace Test • The occurrence of failures is assumed to follow a non-homogeneous Poisson process whose failure intensity is decreasing: λ(t) = e^(a + b·t), b < 0 • The null hypothesis is that occurrences of failures follow a homogeneous Poisson process (i.e., b = 0 above). • For interfailure times, the test statistic is computed by: u(i) = [ (1/(i−1)) · Σ (n = 1 to i−1) Σ (j = 1 to n) θj − T(i)/2 ] / [ T(i) · sqrt(1/(12(i−1))) ] where θj is the j’th interfailure time and T(i) = Σ (j = 1 to i) θj
Laplace Test (cont’d) • For interval data, the test statistic is computed by: u(k) = [ Σ (i = 1 to k) (i−1)·n(i) − ((k−1)/2) · Σ (i = 1 to k) n(i) ] / sqrt( ((k² − 1)/12) · Σ (i = 1 to k) n(i) ) where n(i) is the number of failures observed in interval i and k is the number of intervals
Laplace Test (cont’d) • Interpretation • Negative values of the Laplace factor indicate decreasing failure intensity. • Positive values suggest an increasing failure intensity. • Values varying between +2 and -2 indicate stable reliability. • Significance is that associated with the normal distribution; e.g.: • The null hypothesis “H0 : HPP” vs. “H1 : decreasing failure intensity” is rejected at the 5% significance level for m(T) < -1.645 • The null hypothesis “H0 : HPP” vs. “H1 : increasing failure intensity” is rejected at the 5% significance level for m(T) > 1.645 • The null hypothesis “H0 : HPP” vs. “H1 : there is a trend” is rejected at the 5% significance level for |m(T)| > 1.96
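The Laplace factor has a particularly simple form when expressed over the cumulative failure times observed in a window [0, T]; a sketch of that standard form is shown below (the function name and data are ours, and this is equivalent to the interfailure-time statistic after converting θj to cumulative times).

```python
import math

def laplace_factor(failure_times, T):
    """Laplace factor for failure times observed over [0, T]:
    u = (mean(t_i) - T/2) / (T * sqrt(1 / (12 n))).
    Negative values indicate failures clustered early, i.e. a
    decreasing failure intensity (reliability growth)."""
    n = len(failure_times)
    mean_t = sum(failure_times) / n
    return (mean_t - T / 2.0) / (T * math.sqrt(1.0 / (12.0 * n)))
```

Failures at 10, 20, and 30 hours in a 100-hour window give a factor of -1.8: early clustering, hence evidence of decreasing failure intensity, though not significant at the 5% level against the one-sided -1.645 threshold by much margin.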
Part IV: Input Data Requirements and Data Collection Mechanisms • Model Inputs • Time Between Successive Failures • Failure Counts and Test Interval Lengths • Setting up a Data Collection Mechanism • Minimal Set of Required Data • Data Collection Mechanism Examples