
Testing Effectiveness and Reliability Modeling for Diverse Software Systems

CAI Xia

Ph.D. Term 4

April 28, 2005


Outline

  • Introduction

  • Background study

  • Reliability modeling

  • Testing effectiveness

  • Future work

  • Conclusion


Introduction

  • Software reliability engineering techniques

    • Fault avoidance

      • structured programming, software reuse, and formal methods

    • Fault removal

      • testing, verification, and validation

    • Fault tolerance

      • single-version technique

      • multi-version technique (design diversity)

    • Fault prediction

      • reliability modeling


Software Fault Tolerance

  • Layers of Software fault tolerance


SFT techniques

  • Single-version techniques

    • Checkpointing and recovery

    • Exception handling

    • Data diversity

  • Multi-version techniques (Design diversity)

    • Recovery block

    • N-version programming

    • N self-checking programming
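To make the multi-version idea concrete, a hypothetical N-version-programming voter can be sketched as follows; the function and the example values are invented for illustration, not taken from the project described here.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of versions;
    raise if no majority exists (the voter cannot mask the fault)."""
    value, votes = Counter(outputs).most_common(1)[0]
    if votes * 2 > len(outputs):
        return value
    raise RuntimeError("no majority among diverse versions")

# Three independently developed versions compute the same result;
# the voter masks the single faulty version's wrong answer.
print(majority_vote([42, 42, 41]))  # -> 42
```

The voter tolerates up to floor((N-1)/2) faulty versions per demand, provided the correct versions agree exactly; real deployments also need inexact-voting rules for floating-point outputs.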


Design diversity

  • To deploy multiple-version programs to tolerate software faults during operation

  • Principle: redundancy

  • Applications

    • Airplane control systems, e.g., Boeing 777 & AIRBUS A320/A330/A340

    • aerospace applications

    • nuclear reactors

    • telecommunications products


Design diversity (cont'd)

  • Controversial issues:

    • Failures of diverse versions may correlate with each other

      • Reliability modeling on the basis of failure data collected in testing

    • Testing is a critical issue to ensure the reliability

      • Testing completeness and effectiveness → test case selection and evaluation → code coverage?

    • Real-world empirical data are needed to perform the above analysis


Research questions

  • How to predict the reliability of design diversity on the basis of the failure data of each individual version?

  • How to evaluate the effectiveness of a test set? Is code coverage a good indicator?


Experimental description

  • Motivated by the lack of empirical data, we conducted the Redundant Strapped-Down Inertial Measurement Unit (RSDIMU) project

  • It took more than 100 students 12 weeks to develop 34 program versions

  • 1200 test cases were executed on these program versions

  • 426 mutants were generated, each by injecting a single fault identified in the testing phase

  • A number of analyses and evaluations were conducted in our previous work


Outline

  • Introduction

  • Background study

  • Reliability modeling

  • Testing effectiveness

  • Future work

  • Conclusion


Reliability models for design diversity

  • Eckhardt and Lee (1985)

    • Variation of difficulty on demand space

    • Positive correlations between version failures

  • Littlewood and Miller (1989)

    • Forced design diversity

    • Possibility of negative correlations

  • Dugan and Lyu (1995)

    • Markov reward model

  • Tomek and Trivedi (1995)

    • Stochastic reward net

  • Popov, Strigini et al. (2003)

    • Subdomains on demand space

    • Upper/lower bounds for failure probability

These models range from conceptual models to structural models, with some falling in between.

PS Model

  • Alternative estimates for the probability of failure on demand (pfd) of a 1-out-of-2 system


PS Model (cont'd)

  • Upper bound of system pfd

  • “Likely” lower bound of system pfd

    - under the assumption of conditional independence
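A minimal sketch of how the two estimates can be computed over demand subdomains, assuming the usual statement of the model (upper bound: sum of p_i * min(pfd_A,i, pfd_B,i); likely lower bound: sum of p_i * pfd_A,i * pfd_B,i under conditional independence); the subdomain probabilities and per-version pfd values are hypothetical.

```python
def ps_bounds(subdomains):
    """PS-model-style bounds on the pfd of a 1-out-of-2 diverse system.

    subdomains: iterable of (p_i, pfd_a_i, pfd_b_i), where p_i is the
    probability that a demand falls in subdomain i, and pfd_a_i/pfd_b_i
    are the two versions' failure probabilities within that subdomain.
    Returns (likely_lower, upper).
    """
    subdomains = list(subdomains)
    # Upper bound: in each subdomain, P(both fail) cannot exceed the
    # smaller of the two versions' pfds.
    upper = sum(p * min(a, b) for p, a, b in subdomains)
    # "Likely" lower bound: assumes the versions fail conditionally
    # independently within each subdomain.
    likely_lower = sum(p * a * b for p, a, b in subdomains)
    return likely_lower, upper

# Hypothetical two-subdomain demand space (numbers are illustrative).
lower, upper = ps_bounds([(0.9, 0.01, 0.02), (0.1, 0.2, 0.1)])
print(lower, upper)
```

The lower bound is only "likely" because positively correlated failures within a subdomain can push the true system pfd above the conditional-independence estimate.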


DL Model

  • Example: Reliability model of DRB
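The execution scheme being modeled here, the recovery block, runs a primary version, checks its result with an acceptance test, and falls back to an alternate version on rejection. A minimal sketch with invented primary/alternate components (the DL reliability model itself is not reproduced in code):

```python
import math

def recovery_block(x, primary, alternate, acceptance_test):
    """Recovery-block scheme: run the primary version, validate its
    result with an acceptance test, and fall back to the alternate
    version if the result is rejected."""
    result = primary(x)
    if acceptance_test(x, result):
        return result
    return alternate(x)

# Hypothetical components: a faulty primary square root, a trusted
# alternate, and an acceptance test checking r * r == x within tolerance.
primary = lambda x: x / 2.0
alternate = math.sqrt
accept = lambda x, r: abs(r * r - x) < 1e-6

print(recovery_block(9.0, primary, alternate, accept))  # -> 3.0
```

The scheme's reliability hinges on the acceptance test: an undetected wrong result from the primary is delivered to the user, which is exactly the kind of event the fault-tree models on the next slide capture.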


DL Model (cont'd)

  • Fault tree models for 2-, 3-, and 4-version systems


Comparison of PS & DL Models


Outline

  • Introduction

  • Background study

  • Reliability modeling

  • Testing effectiveness

  • Future work

  • Conclusion


Testing effectiveness

  • The key issue in software testing is test case selection and evaluation

  • What is a good test case?

    • testing effectiveness and completeness

    • fault coverage

  • To allocate testing resources, how to predict the effectiveness of a given test case in advance?


Testing effectiveness

  • Code coverage: an indicator of fault detection capability?

    • Positive evidence

      • high code coverage brings high software reliability and low fault rate

      • both code coverage and fault detected in programs grow over time, as testing progresses.

    • Negative evidence

      • Can this be attributed to causal dependency between code coverage and defect coverage?


Testing effectiveness (cont'd)

  • Is code coverage a good indicator for fault detection capability?

    (That is, what is the effectiveness of code coverage in testing?)

  • Does such effect vary under different testing profiles?

  • Do different code coverage metrics have various effects?


Basic concepts: code coverage

  • Code coverage - measured as the fraction of program code executed at least once during the test.

  • Block coverage - the portion of basic blocks executed.

  • Decision coverage - the portion of decisions executed.

  • C-Use - computational uses of a variable.

  • P-Use - predicate uses of a variable.
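As a small illustration of what these metrics measure, block and decision coverage can be computed from a recorded execution trace; the block and decision identifiers below are invented:

```python
def block_coverage(executed_blocks, all_blocks):
    """Fraction of basic blocks executed at least once."""
    return len(set(executed_blocks) & set(all_blocks)) / len(set(all_blocks))

def decision_coverage(taken_outcomes, all_decisions):
    """Fraction of decision outcomes exercised; each decision has a
    True branch and a False branch, so there are 2 * |decisions| outcomes."""
    covered = {(d, o) for d, o in taken_outcomes if d in all_decisions}
    return len(covered) / (2 * len(all_decisions))

# Hypothetical program with 4 basic blocks and 2 decisions.
print(block_coverage({"b1", "b2", "b3"}, {"b1", "b2", "b3", "b4"}))   # 0.75
print(decision_coverage({("d1", True), ("d1", False), ("d2", True)},
                        {"d1", "d2"}))                                 # 0.75
```

C-use and P-use coverage follow the same counting pattern over definition-use pairs rather than blocks or branches.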


Basic concepts: testing profiles

  • Functional testing – based on specified functional requirements

  • Random testing - test cases drawn from the input domain according to a predefined distribution function

  • Normal operational testing – based on normal operational system status

  • Exceptional testing - based on exceptional system status


Experimental requirement

  • Complex, real-world application

  • Large population of program versions

  • Controlled development process

  • Bug history recorded

  • Real faults studied

  • Our RSDIMU project satisfies the above requirements


Test cases description

  • The test cases are divided into six regions: I, II, III, IV, V, VI


The correlation between code coverage and fault detection

Is code coverage a good indicator of fault detection capability?

  • In different test case regions

  • Functional testing vs. random testing

  • Normal operational testing vs. exceptional testing

  • In different combinations of coverage metrics


The correlation: various test regions

  • Test case coverage contribution on block coverage

  • Test case coverage contribution on mutant coverage


The correlation: various test regions

  • Linear modeling fitness in test case regions

  • Linear regression relationship between block coverage and defect coverage in whole test set
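The strength of such a linear relationship is commonly summarized by the Pearson correlation coefficient; a self-contained sketch follows, where the coverage figures are made up and are not the RSDIMU data:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-test-set block coverage vs. defect coverage.
block_cov  = [0.48, 0.52, 0.61, 0.70, 0.83]
defect_cov = [0.10, 0.12, 0.25, 0.30, 0.45]
print(pearson_r(block_cov, defect_cov))  # close to +1 for this made-up data
```

An r near +1 corresponds to the strong linear fits reported for some regions; an r near 0 corresponds to the weak fits seen elsewhere.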


The correlation: various test regions

  • Linear regression relationship between block coverage and defect coverage in region IV

  • Linear regression relationship between block coverage and defect coverage in region VI


The correlation: various test regions

Observations:

  • Code coverage: a moderate indicator

  • Reasons behind the large variance between regions IV and VI


The correlation: functional testing vs. random testing

  • Code coverage:

    - a moderate indicator

  • Random testing

    – a necessary complement to functional testing

    • Similar code coverage

    • High fault detection capability


The correlation: functional testing vs. random testing

  • Failure details of mutants that failed on fewer than 20 test cases: detected by 169 functional test cases (out of 800 in total) and 94 random test cases (out of 400 in total)


The correlation: functional testing vs. random testing

  • Number of failures of mutants detected only by functional testing or only by random testing


The correlation: normal operational testing vs. exceptional testing

  • The definition of operational status and exceptional status

    • Defined by specification

    • application-dependent

  • For RSDIMU application

    • Operational status: at most two sensors failed in the input and at most one more sensor failed during the test

    • Exceptional status: all other situations

  • The 1200 test cases are classified into operational and exceptional test cases according to their inputs and outputs


The correlation: normal operational testing vs. exceptional testing

  • Normal operational testing

    • very weak correlation

  • Exceptional testing

    • strong correlation


The correlation: normal operational testing vs. exceptional testing

  • Normal testing: small coverage range (48%-52%)

  • Exceptional testing: two main clusters


The correlation: normal operational testing vs. exceptional testing

  • Number of failures of mutants detected only by normal operational testing or only by exceptional testing


The difference between two pairs of testing profiles

  • The whole testing demand space can be classified into seven subsets according to system status Si,j:

    • S0,0, S0,1, S1,0, S1,1, S2,0, S2,1, Sothers

    • i: number of sensors failed in the input

    • j: number of sensors failed during the test

  • Functional testing vs. random testing

    • large overlap across the seven system statuses

  • Normal testing vs. exceptional testing

    • no overlap across the seven system statuses

  • This may explain the different performance of code coverage on testing effectiveness under the two pairs of testing profiles
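Following the slide's definitions, and assuming the Si,j naming shown above, the subset classification can be sketched as:

```python
def classify_demand(sensors_failed_in_input, sensors_failed_during_test):
    """Map a demand to one of the seven system-status subsets Si,j:
    i = sensors failed in the input, j = sensors failed during the test;
    anything beyond i = 2 or j = 1 falls into Sothers."""
    i, j = sensors_failed_in_input, sensors_failed_during_test
    if 0 <= i <= 2 and 0 <= j <= 1:
        return f"S{i},{j}"
    return "Sothers"

# Normal operational status per the slide: at most two sensors failed
# in the input and at most one more failed during the test.
def is_operational(i, j):
    return classify_demand(i, j) != "Sothers"

print(classify_demand(1, 0))  # -> S1,0
print(is_operational(3, 0))   # -> False
```

Because operational status covers exactly the six named subsets, normal and exceptional test cases partition the demand space with no overlap, unlike the functional/random pair.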


The correlation: under different combinations

  • Combinations of testing profiles

  • Observations:

    • Combinations containing exceptional testing indicate strong correlations

    • Combinations containing normal testing inherit weak correlations


The correlation: under different coverage metrics

  • Similar patterns as block coverage

  • Insignificant difference under normal testing

    • Decision/P-use: control flow change related

    • Larger variation in code coverage yields more faults detected


Discussions

  • Does the effect of code coverage on fault detection vary under different testing profiles?

    • A significant correlation exists in exceptional test cases, while no correlation is observed in normal operational test cases.

    • Higher correlation is revealed in functional testing than in random testing, but the difference is insignificant.

  • Do different coverage metrics have various effects on such relationship?

    • Not obvious with our experimental data


Discussions (cont'd)

  • This is the first time that the effect of code coverage on fault detection has been examined under different testing profiles

  • Overall, code coverage is a moderate indicator for testing effectiveness

  • The correlation in small code coverage range is insignificant

  • Our findings of the positive correlation can give guidelines for the selection and evaluation of exceptional test cases


Future work

  • Generate 1 million test cases and exercise them on the current 34 versions to collect statistical failure data

  • Conduct cross-comparison with previous project to investigate the “variant” and “invariant” features in design diversity

  • Quantify the relationship between code coverage and testing effectiveness


Conclusion

  • Survey of software fault tolerance: evolution, techniques, applications, and modeling

  • Evaluate the performance of current reliability models on design diversity

  • Investigate the effect of code coverage under different testing profiles and find it is a clear indicator for fault detection capability, especially for exceptional test cases


Q & A

Thank you!

