An Experimental Evaluation on Reliability Features of N-Version Programming

Teresa Cai, Michael R. Lyu and Mladen A. Vouk

ISSRE'2005
November 10, 2005

Outline


  • Introduction

  • Motivation

  • Experimental evaluation

    • Fault analysis

    • Failure probability

    • Fault density

    • Reliability improvement

  • Discussions

  • Conclusion and future work


Introduction

  • N-version programming (NVP) is one of the main techniques for software fault tolerance

  • It has been adopted in some mission-critical applications

  • Yet, its effectiveness is still an open question:

    • What is the actual reliability enhancement?

    • How does the fault correlation between multiple versions affect the final reliability?
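As background, the core NVP idea of running independently developed versions and adjudicating their outputs by majority vote can be sketched as follows; the three toy "versions" are hypothetical stand-ins, not the experiment's programs.

```python
# Minimal sketch of N-version execution with majority voting.
from collections import Counter

def version_a(x):
    return x * x                          # correct implementation

def version_b(x):
    return x ** 2                         # correct, independently written

def version_c(x):
    return x * x if x < 100 else 0        # faulty in a rare subdomain

def nvp_vote(inputs, versions):
    """Run every version on each input and adjudicate by majority."""
    results = []
    for x in inputs:
        outputs = [v(x) for v in versions]
        winner, count = Counter(outputs).most_common(1)[0]
        if count >= (len(versions) // 2) + 1:
            results.append(winner)        # majority agreement masks the fault
        else:
            results.append(None)          # no majority: system-level failure
    return results

print(nvp_vote([2, 3, 150], [version_a, version_b, version_c]))
# -> [4, 9, 22500]; version_c is outvoted on the faulty input 150
```
The voter masks a single faulty version, which is why correlated faults (two versions failing together) are the central threat to NVP reliability.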

Introduction (cont’)

  • Empirical and theoretical investigations have been conducted based on experiments, modeling, and evaluations

    • Avizienis and Chen (1977), Knight and Leveson (1986), Kelly and Avizienis (1983), Avizienis, Lyu and Schuetz (1988), Eckhardt et al. (1991), Lyu and He (1993)

    • Eckhardt and Lee (1985), Littlewood and Miller (1989), Popov et al. (2003)

    • Belli and Jedrzejowicz (1990), Littlewood et al. (2001), Teng and Pham (2002)

  • No conclusive reliability estimate can be drawn, because these experiments differ in size, population, complexity and comparability

Research questions

  • What is the reliability improvement of NVP?

  • Is fault correlation a big issue that will affect the final reliability?

  • What kind of empirical data can be compared with previous investigations?


Motivation

  • To address the reliability and fault correlation issues in NVP

  • To conduct an experiment comparable with previous empirical studies

  • To investigate the "variant" and "invariant" features in NVP

Experimental background

  • Some features of the experiment

    • Complexity

    • Large population

    • Well-defined

    • Statistical failure and fault records

  • Previous empirical studies

    • UCLA Six-Language project

    • NASA 4-University project

    • Knight and Leveson’s experiment

    • Lyu-He study

Experimental setup

  • RSDIMU avionics application

  • 34 program versions

  • A team of 4 students

  • Comprehensive testing was exercised:

    • Acceptance testing: 800 functional test cases and 400 random test cases

    • Operational testing: 100,000 random test cases

  • Failures and faults collected and studied

  • Qualitative as well as quantitative comparisons with the NASA 4-University project were performed
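The operational-test idea of running a large number of random test cases against each version and counting mismatches with a trusted result can be sketched like this; the oracle, the toy versions, and the failure pattern are all illustrative assumptions, not the experiment's harness.

```python
# Sketch of an operational-test harness: random test cases, failures
# counted against an oracle. Everything here is a toy stand-in.
import random

def oracle(x):
    return x % 7                              # stand-in for the certified result

def version_ok(x):
    return x % 7                              # agrees with the oracle everywhere

def version_buggy(x):
    return x % 7 if x % 1000 != 0 else -1     # fails on a rare input subdomain

def count_failures(version, n_cases, seed=0):
    rng = random.Random(seed)                 # fixed seed for reproducibility
    failures = 0
    for _ in range(n_cases):
        x = rng.randrange(1_000_000)
        if version(x) != oracle(x):
            failures += 1
    return failures

print(count_failures(version_ok, 100_000))    # 0
print(count_failures(version_buggy, 100_000)) # nonzero: rare-subdomain fault
```
Faults hit only in rare subdomains survive acceptance testing and show up only under this kind of high-volume operational testing, which is the pattern the slides report.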

Experimental description

  • Geometry

  • Data flow diagram

Comparisons between the two projects

  • Qualitative comparisons

    • General features

    • Fault analysis in development phase & operational test

  • Quantitative comparisons

    • Failure probability

    • Fault density

    • Reliability improvement

General features comparison

Faults in development phase

Distribution of related faults

Fault analysis in development phase

  • Common related faults

    • Display module (easiest part)

    • Calculation in wrong frame of reference

    • Initialization problems

    • Missing certain scaling computation

  • Faults in NASA project only

    • Division by zero

    • Incorrect conversion factor

    • wrong coordinate system problem.

Fault analysis in development phase (cont')

  • Both cause and effect of some related faults remain the same

  • Related faults occurred in both easy and difficult subdomains

  • Some common problems, e.g., initialization problems, exist across different programming languages

  • The most fault-prone module is the easiest part of the application

Faults in our operational test

Faults in operational test (cont')

  • These faults are all related to the same module, i.e., the sensor failure detection and isolation problem

  • Fault pair (34.2 & 22.1): 25 coincident failures

  • Fault pair (34.3 & 29.1): 32 coincident failures

  • Yet these two pairs are quite different in nature

  • Version 34 shows the lowest quality

    • Poor program logic and design organization

    • Hard coding

  • The overall performance of NVP derived from our data would be better if the data from version 34 were ignored

Input/Output domain classification

  • Normal operations are classified as:

    S_i,j = { i sensors previously failed and j of the remaining sensors fail | i = 0, 1, 2; j = 0, 1 }

  • Exceptional operations: S_others
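The S_i,j classification can be expressed directly in code; summarizing each test case by its (previously failed, newly failing) sensor counts is an assumed representation for illustration.

```python
# Classify a test case into the S_{i,j} subdomains defined above,
# assuming each case is summarized by two sensor-failure counts.
def classify(prev_failed, new_failures):
    """Return the subdomain label for i = 0,1,2 and j = 0,1;
    everything else falls into the exceptional-operations class."""
    if prev_failed in (0, 1, 2) and new_failures in (0, 1):
        return f"S{prev_failed},{new_failures}"
    return "S_others"                     # exceptional operations

print(classify(0, 0))   # S0,0  -- no sensor failures at all
print(classify(2, 1))   # S2,1  -- two prior failures, one new failure
print(classify(3, 0))   # S_others -- too many prior failures
```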

Failures in operational test

  • States S0,0, S1,0 and S2,0 are more reliable than states S0,1, S1,1 and S2,1

  • The exceptional state reveals most of the failures

  • The failure probability in S0,1 is the highest

  • The programs exhibit high reliability on average

Coincident failures

  • Two or more versions fail on the same test case, whether or not their outputs are identical

  • The percentage of coincident failures versus total failures is low:

    • Version 22: 25/618 = 4%

    • Version 29: 32/2760 = 1.2%

    • Version 34: (25+32)/1351 = 4.2%
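The percentages above follow directly from the slide's counts; a quick check:

```python
# Coincident failures as a fraction of each version's total failures,
# using the counts reported on the slide.
failures = {22: 618, 29: 2760, 34: 1351}          # total failures per version
coincident = {22: 25, 29: 32, 34: 25 + 32}        # from pairs (34.2 & 22.1), (34.3 & 29.1)

for v in failures:
    pct = 100 * coincident[v] / failures[v]
    print(f"Version {v}: {coincident[v]}/{failures[v]} = {pct:.1f}%")
# Version 22: 25/618 = 4.0%
# Version 29: 32/2760 = 1.2%
# Version 34: 57/1351 = 4.2%
```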

Fault density

  • Six faults identified in 4 out of 34 versions

  • The size of these versions varies from 1455 to 4512 source lines of code

  • Average fault density: one fault per 10,000 lines of code

  • This is close to the industry standard for high-quality software systems
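Fault density is simply faults found divided by lines of code, usually quoted per thousand lines (faults/KLOC). The total line count below is illustrative, since the slide gives only the per-version size range:

```python
# Fault density in faults per thousand lines of code (faults/KLOC).
def fault_density_per_kloc(faults, sloc):
    return faults / (sloc / 1000)

# Illustrative total: 34 versions averaging roughly 1,800 SLOC each
# is about 60,000 lines. Six faults over 60,000 lines gives
# 0.1 faults/KLOC, i.e. about one fault per 10,000 lines.
print(fault_density_per_kloc(6, 60_000))   # 0.1
```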

Failure bounds for 2-version system

  • Lower and upper bounds for coincident failure probability under the Popov et al. model

  • DP1: normal test cases without sensor failures dominate the test profile

  • DP3: test cases evenly distributed across all subdomains

  • DP2: between DP1 and DP3
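A rough sketch in the spirit of the Popov et al. subdomain model: within each subdomain, the joint failure probability of two versions is bracketed by the Frechet bounds, and the bounds are then weighted by the demand profile. The profile and per-version failure probabilities below are invented for illustration, not the paper's data.

```python
# Subdomain-based bounds on the coincident failure probability of a
# 2-version system. In each subdomain k the joint failure probability
# lies between max(0, qa+qb-1) and min(qa, qb) (Frechet bounds).
def coincident_failure_bounds(profile, qa, qb):
    """profile[k]: probability a demand falls in subdomain k;
    qa[k], qb[k]: failure probabilities of versions A and B there."""
    lower = sum(p * max(0.0, a + b - 1.0) for p, a, b in zip(profile, qa, qb))
    upper = sum(p * min(a, b) for p, a, b in zip(profile, qa, qb))
    return lower, upper

# DP1-like profile: normal, no-sensor-failure cases dominate the demands.
lo, hi = coincident_failure_bounds([0.90, 0.08, 0.02],
                                   [0.001, 0.01, 0.2],
                                   [0.002, 0.02, 0.1])
print(lo, hi)   # lower bound 0.0, upper bound 0.0037
```
Changing the profile toward a uniform one (DP3-like) shifts weight onto the harder subdomains and widens the achievable coincident failure probability, which is why the slide distinguishes DP1, DP2 and DP3.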

Quantitative comparison in operational test

  • NASA 4-University project: 7 out of 20 versions passed the operational testing

  • Coincident failures were found among 2 to 8 versions


Quantitative comparison (cont')

  • The difference in fault number and fault density between the two projects is not significant

  • In the NASA project:

    • The number of failures and coincident failures is much higher

    • Although coincident failures occur in 2- to 8-version combinations, the 3-version system still achieves an 80 to 330 times reliability improvement

  • In our project:

    • The average failure rate is 50 times better

    • The reliability improvement for the 3-version system is 30 to 60 times
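For intuition, the improvement factor of a 3-version majority system over a single version can be estimated under the idealized assumption of independent failures (fault correlation lowers the real gain, which is why the measured factors above matter). The per-version failure probability below is illustrative.

```python
# Failure probability of a 3-version majority system with independent
# failures: it fails when at least two of the three versions fail.
def triple_modular_failure_prob(p):
    return 3 * p**2 * (1 - p) + p**3

p = 0.001                                 # illustrative single-version failure prob.
p_sys = triple_modular_failure_prob(p)
print(p / p_sys)                          # improvement factor, roughly 330x
```
The independence idealization gives a few hundred times improvement for p = 0.001; correlated faults pull the observed factor below this ceiling.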


Discussions

  • Reliable program versions with low failure probability

  • Similar number of faults and fault density

  • Distinguishable reliability improvement for NVP, with 10^2 to 10^4 times enhancement

  • Related faults observed in both difficult and easy parts of the application


  • Compared with the NASA project, our project shows:

    • Some faults not observed

    • Fewer failures

    • Fewer coincident failures

    • Only 2-version coincident failures

    • An overall reliability improvement an order of magnitude larger


  • The improvement in our project may be attributed to:

    • A stable specification

    • Better programming training

    • Experience in NVP experiments

    • A cleaner development protocol

    • Different programming languages and platforms

    Discussions (cont’)

    • The hard-to-detected faults are only hit by some rare input domains

    • New testing strategy is needed to detect such faults:

      • Code coverage?

      • Domain analysis?


Conclusion

  • An empirical investigation was performed to evaluate reliability features through a comprehensive comparison between two NVP projects

  • NVP can provide significant improvement in final reliability, according to our empirical study

  • The low number of coincident failures provides supporting evidence for NVP

  • Possible attributes that may affect the NVP reliability improvement are discussed

Future work

  • Apply more intensive testing to both the Pascal and C programs

  • Conduct cross-comparisons of program versions developed in different programming languages

  • Investigate the reliability enhancement of NVP based on the combined set of program versions
