1 / 13

Software Reliability A Survey

Department of Engineering Management, Information and Systems. EMIS 7305. Systems Reliability, Supportability and Availability Analysis. Software Reliability A Survey. Scott Eisenhart. System Reliability.

ramiro
Download Presentation

Software Reliability A Survey

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Department of Engineering Management, Information and Systems EMIS 7305 Systems Reliability, Supportability and Availability Analysis Software Reliability A Survey Scott Eisenhart

  2. System Reliability • System Reliabilityis defined as the ability of a system to perform and maintain its required function sunder nominal and anomalous conditions for a specified period of time in a given environment. • System reliability is typically expressed by a failure-probability density function over time.

  3. Software Reliability • Software reliability is often defined as the probability of failure free operation of a computer program for a specified time in a specified environment. [Lyu] • Many concepts of reliability engineering are adapted from the classic techniques of hardware reliability. However this must be done with care as there are many differences between hardware and software failure modes and processes. [Eusgeld]

  4. Hardware Failure Modes • The largest component of hardware failures is a function of time and results in wear out due to physical faults • A partial list of distinct characteristics of software compared to hardware reliability • Failure Mode: Software defects are primarily design defects • Wear Out: Software does not have an energy related wear put phase. Errors can happen any time. • Time Dependency: Software reliability is not a function of operational time • Reliability prediction: Software reliability cannot be predicted from a physical basis since it depends on human factors in design

  5. Software Failure Modes • The faults seldom repeat in the same fashion and activation is both usage and time dependent. • Generally over time these updates result in increased stability and robustness of the software performance in the system. • During the useful life new features generate defects due to latent defects being activated based on a use case that may not have been envisioned by the design team. Hence the small spike in increased failure rate after a major release. In the final phase, obsolescence, there is not the typical wear out as seen in hardware bathtub curve.

  6. Software Fault Lifecycle • A fault is defined as the cause of a failure in program execution. • A failure is said to occur if the observable outcome of a program execution is different from the expected outcome • The four lifecycle techniques are: • fault prevention, the avoidance of a fault or defect • fault removal, detect and eliminate defects • fault tolerance, providing methods for the on-going function or service in spite of a fault • fault/failure forecasting is the estimate of faults and the consequences of the faults • Fault Forecasting is the primary area of software reliability modeling. • Fault prevention and fault removal are areas of focus for software engineering methodologies using formal methods and inspections, testing etc

  7. Software Reliability Models • The modeling of software elements for reliability calculations is a similar approach to the modeling performed for the overall system and hardware. • There are two primary types of reliability models: prediction models and estimation models. • Prediction models use historical data, used at early stages of the development cycle and predict reliability in future time usually at release or initial deployment. The results can be used to modify or change the software architecture or design early in the development phase. For prediction models to be accurate there must be ample historical data based on similar applications/code. • Estimation models use real time data from the current project, applied during the final test or system test phase of development, and are used to estimate reliability at either the present time or some future time. Both models rely on gathering defect or failure data from the development effort and post release and apply various statistical models for analysis.

  8. Software Reliability Models Continued • Reliability Models make the following assumptions: • The operation environment is the same as the testing environment • Once a failure occurs the fault which causes the failure is immediately removed • The removal process does not introduce new failures/faults • The faults in the software follow a mathematical formulae at least in a statistical sense [Lyu] • How to choose a model? • Collect failure data (failure specification) • Examine data (Density distribution vs. Cumulative distribution) • Select a model • Estimate model parameters • Customize model using the estimated parameters • Goodness-of-fit test • Make reliability predictions [Calgary]

  9. Software Reliability Models • There are many models to choose from and no one model is a perfect fit for a given situation. It is important to analyze the type of system, software complexity, goals for reliability and assumptions when choosing a model. • A list of some of the model mentioned most frequently in the literature include: • Musa’s Execution Time Model • Goel-Okumoto (G-O) Model • Exponential Model • Weibull Distribution Model • Raleigh-Putnam Method • Rome Laboratory TR-92-52 • Bayesian Belief Nets • Non-Homogeneous Poisson Process (NHPP)

  10. Software Reliability Growth Models • The prediction models are often referred to as SRGM or software reliability growth models. • There are 100’s of models and variations of the models in existence. • Common software reliability growth models are the Basic Exponential model which assumes finite failures ν0 in infinite time and the Logarithmic Poisson model which assumes infinite failures. • In software, however, we want to “fix” the problem, i.e., have a lower probability for the remaining failures after a repair (or longer Δti = ti-ti-1). Therefore, we need a model for reliability growth (i.e., reliability change over time). In reliability growth models we are assuming some effort of fault removal. This leads to a variable failure intensity λ(t). Every reliability growth model is based on specific assumptions concerning the change of failure intensity λ(t) through the process of fault removal.[Calgary] • The two measurements related to reliability are the number of failures in a time period and time between failures. Time in this case is CPU execution time not calendar time since it better represents the actual time the software is executing.[Lyu]

  11. Software Reliability Metrics • In order to predict/estimate the reliability of a software product it is necessary to gather the appropriate data thru definition of the appropriate metrics throughout the software development lifecycle. • The software metrics needed for reliability are many of the same metrics collected for project execution and judging software quality. In fact, it is very interesting how the concepts of software quality and the processes and efforts to develop “good code” directly impact the software reliability calculations. • Some of the metrics needed for software reliability include: • Source Lines of Code • Complexity – such a function points • Faults/Defects – when found in the process stage, execution time, class of defect

  12. Industry Practice/Challenges to SRE Comparison of key software reliability assumptions and current software development practices: [Calgary][Strongfellow and Andrews 2003]

  13. Summary • The software engineering community is focused on the first two of the four efforts of managing failures/faults/defects mentioned by [Lyu[; Fault Prevention, Fault Removal, Fault Tolerence, and Fault Forecasting, with vigor and discipline, while the next two, tolerance and forecasting get varying amounts of focus with forecasting probably being reserved for specialized engineers in SQA teams, to achieve software reliability. • There appears to be a common line of thinking that if reliability is dependent on the occurrence of failures created by faults/defects in the code that appear at final test in an operational mode, the best strategy is to focus on fault/defect elimination and hence improve reliability as a benefit of fault reduction. • In the software community the approach to handling fault prevention and removal is a focus on deploying and improving practices such as code inspection, unit testing, integration testing early in the development phases. The result of these efforts is “quality” software that by inference equates to reliable software. • The value in the models and process defined in the papers for software reliability is the ability to set targets for software faults/defects for a given project – in a given phase of the project, beginning or end (including warranty and maintenance phases) that can benefit the program management team and management team to assess the progress of the project versus quality/defect goals and make decisions about the progress of the project.

More Related