
An Empirical Study on Testing and Fault Tolerance for Software Reliability Engineering

Presented by: CAI Xia. Ph.D. Term 2 Presentation, April 28, 2004.



  1. An Empirical Study on Testing and Fault Tolerance for Software Reliability Engineering • Presented by: CAI Xia • Ph.D. Term 2 Presentation • April 28, 2004

  2. Outline • Background and motivations • Project descriptions and experimental procedure • Preliminary experimental results on testing and fault tolerance • Evaluation of current reliability models • Conclusion and future work

  3. Background • Fault removal and fault tolerance are two major approaches in software reliability engineering • Software testing is the main fault removal technique • Data flow coverage testing • Mutation testing • The main fault tolerance technique is software design diversity • Recovery blocks • N-version programming • N self-checking programming

  4. Design Diversity • N-version Programming (NVP) • Different development teams build different program versions from one single specification • The target is to achieve quality and reliability of software systems by detecting and tolerating software faults during operation • The final output of NVP is decided by voting among the multiple versions • Problem: the possibility of correlated faults in multiple versions
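The voting step of NVP can be illustrated with a minimal sketch. It assumes a strict-majority voter over exactly comparable outputs; the function name `majority_vote` and the sample values are hypothetical and are not taken from the project.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of versions, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


# One faulty version out of three is outvoted by the two correct ones.
masked = majority_vote([42, 42, 41])
# A correlated fault shared by two versions wins the vote with a wrong value.
correlated = majority_vote([41, 41, 42])
# Three different answers give no majority at all.
undecided = majority_vote([1, 2, 3])
print(masked, correlated, undecided)
```

The second call shows why correlated faults are the central problem: the voter cannot tell a wrong majority from a right one.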

  5. Background • Conclusive evidence about the relationship between test coverage and software reliability is still lacking • Mutants with hypothetical faults are either too easily killed, or too hard to be activated • The effectiveness of design diversity heavily depends on the failure correlation among the multiple program versions, which remains a debatable research issue.

  6. Motivation • The lack of real world project data for investigation on software testing and fault tolerance techniques • The lack of comprehensive analysis and evaluation on software testing and fault tolerance together • The lack of evaluation and validation on current software reliability models for design diversity

  7. Motivation • Conduct a real-world project to engage multiple teams for development of program versions • Perform detailed experimentation to study the nature, source, type, detectability and effect of faults uncovered in the versions • Apply mutation testing with real faults and investigate different hypotheses on software testing and fault tolerance schemes • Evaluate the current reliability models

  8. Project descriptions • In spring 2002, 34 teams were formed to develop a critical industry application in a 12-week project in a software engineering course • Each team was composed of 4 senior-level undergraduate computer science majors from the Chinese University of Hong Kong

  9. Project descriptions • The RSDIMU project • Redundant Strapped-Down Inertial Measurement Unit • [Figure: RSDIMU System Data Flow Diagram]

  10. Software development procedure • Initial design document (3 weeks) • Final design document (3 weeks) • Initial code (1.5 weeks) • Code passing unit test (2 weeks) • Code passing integration test (1 week) • Code passing acceptance test (1.5 weeks)

  11. Program metrics

  12. Mutant creation • Revision control was applied in the project and code changes were analyzed • Faults found during each stage were identified and injected into the final program of each version to create mutants • Each mutant contains one design or programming fault • 426 mutants were created for 21 program versions

  13. Setup of evaluation test • The ATAC tool was employed to analyze and compare test coverage • 1200 test cases were exercised on 426 mutants • All the resulting failures from each mutant were analyzed, their coverage measured, and cross-mutant failure results compared • 60 Sun machines running Solaris were involved in the test; one cycle took 30 hours and generated a total of 1.6 million files (around 20 GB)

  14. Static analysis: fault classification and distribution • Mutant defect type distribution • Mutant qualifier distribution • Mutant severity distribution • Fault distribution over development stage • Mutant effect code lines

  15. Static Analysis result (1) • Qualifier Distribution • Defect Type Distribution

  16. Static Analysis result (2) • Severity Distribution

  17. Static Analysis result (3) • Development Stage Distribution • Fault Effect Code Lines

  18. Dynamic analysis of mutants • Software testing related • Effectiveness of code coverage • Test case contribution: test coverage vs. mutant coverage • Finding non-redundant set of test cases • Software fault tolerance related • Relationship between mutants • Relationship between the programs with mutants

  19. Fault Detection Related to Changes of Test Coverage

  20. Test Case Contribution on Program Coverage

  21. Percentage of Test Case Coverage

  22. Test Case Contributions on Mutant • Average: 248 (58.22%) • Maximum: 334 (78.40%) • Minimum: 163 (38.26%)
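Figures of this kind (how many of the 426 mutants a single test case detects) can be derived from a test-by-mutant kill matrix. The sketch below assumes a hypothetical in-memory layout, `kills[t][m]` being true when test case `t` makes mutant `m` fail; the tiny matrix is made-up data, not project results.

```python
from statistics import mean
from typing import List, Tuple


def contribution_stats(kills: List[List[bool]]) -> Tuple[float, int, int]:
    """Return (average, maximum, minimum) number of mutants killed per test case."""
    per_test = [sum(row) for row in kills]  # True counts as 1
    return mean(per_test), max(per_test), min(per_test)


# Hypothetical 3 test cases x 4 mutants.
kills = [
    [True, True, False, False],   # test 0 kills 2 mutants
    [True, False, False, False],  # test 1 kills 1 mutant
    [True, True, True, False],    # test 2 kills 3 mutants
]
avg, most, least = contribution_stats(kills)
n_mutants = len(kills[0])
print(f"average {avg} ({100 * avg / n_mutants:.2f}%), max {most}, min {least}")
```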

  23. Non-redundant Set of Test Cases • Gray: redundant test cases (502/1200) • Black: non-redundant test cases (698/1200) • Reduced set size: 58.2% of the original 1200 test cases (a 41.8% reduction)
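The slide does not say how the non-redundant set was computed. One common way to reduce a test set while keeping its fault detection is a greedy set-cover heuristic, sketched below under that assumption; `greedy_reduce` and the test and mutant names are hypothetical.

```python
from typing import Dict, List, Set


def greedy_reduce(kill_sets: Dict[str, Set[str]]) -> List[str]:
    """Pick test cases until every detectable mutant is covered.

    kill_sets maps a test case name to the set of mutants it detects.
    Ties are broken by test case name so the result is deterministic.
    """
    remaining: Set[str] = set().union(*kill_sets.values())
    selected: List[str] = []
    while remaining:
        # Choose the test case that detects the most still-uncovered mutants.
        best = max(sorted(kill_sets), key=lambda t: len(kill_sets[t] & remaining))
        selected.append(best)
        remaining -= kill_sets[best]
    return selected


kill_sets = {
    "t1": {"m1", "m2"},
    "t2": {"m2"},        # redundant: everything it detects is detected by t1
    "t3": {"m3", "m4"},
    "t4": {"m1"},        # redundant for the same reason
}
reduced = greedy_reduce(kill_sets)
print(reduced)
```

The greedy choice is a heuristic: it keeps full mutant coverage but does not guarantee the smallest possible set.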

  24. Mutants Relationship • Related mutants: two mutants with the same 1200-bit binary string of success/failure results over the test cases • Similar mutants: two mutants with the same binary string and the same erroneous output variables • Exact mutants: two mutants with the same binary string, the same erroneous output variables, and exactly the same erroneous output values
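The three definitions nest, which a small classifier makes explicit. This is a sketch only: the record layout `MutantResult`, the variable names, and the label "unrelated" for pairs that match none of the definitions are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MutantResult:
    outcome: str  # one character per test case: '1' = failure, '0' = success
    bad_outputs: Tuple[Tuple[str, float], ...]  # (output variable, erroneous value)


def relationship(a: MutantResult, b: MutantResult) -> str:
    """Classify a pair of mutants as exact, similar, related, or unrelated."""
    if a.outcome != b.outcome:
        return "unrelated"
    if {name for name, _ in a.bad_outputs} != {name for name, _ in b.bad_outputs}:
        return "related"   # same pass/fail string only
    if set(a.bad_outputs) != set(b.bad_outputs):
        return "similar"   # same string and same erroneous variables
    return "exact"         # same string, variables, and values


m1 = MutantResult("0110", (("out_a", 1.5),))
m2 = MutantResult("0110", (("out_a", 2.0),))
m3 = MutantResult("0110", (("out_b", 1.5),))
m4 = MutantResult("1110", (("out_a", 1.5),))
print(relationship(m1, m1), relationship(m1, m2),
      relationship(m1, m3), relationship(m1, m4))
```

In the project the binary string would have 1200 characters, one per test case.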

  25. Observation • Coverage measures and mutation scores cannot be evaluated in isolation, and an effective mechanism to distinguish related faults is critical • A good test case should be characterized not only by its ability to detect more faults, but also by its ability to detect faults which are not detected by other test cases in the same test set

  26. Observation • The individual fault detection capability of each test case in a test set does not represent the overall capability of the test set to cover more faults; the diversity of the test cases is more important • Design diversity involving multiple program versions can be an effective solution for software reliability engineering, since the portion of program versions with exact faults is very small • Software fault removal and fault tolerance are complementary rather than competitive, yet the quantitative tradeoff between the two remains a research issue

  27. Evaluations on Current Reliability Models • Popov and Strigini’s reliability bounds estimation model (PS model) • Dugan and Lyu’s dependability model (DL model)

  28. PS Model • The PS model gives the upper and "likely" lower bounds on the probability of failure on demand (pfd) for a 1-out-of-2 diverse system • As it is hard to obtain complete knowledge of the whole demand space, the demand space can be partitioned into independent subsets, called subdomains • Given knowledge of the subdomains, the failure probability of the whole system can be estimated as a function of the subdomain to which a demand belongs

  29. PS Model

  30. PS Model • The upper bound on the probability of system failure is a weighted sum of upper bounds within subdomains: • The "likely" lower bound can be drawn from the assumption of conditional independence:
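The equations on this slide are not preserved in the transcript. As a hedged sketch of the usual form of such bounds for a 1-out-of-2 system: within each subdomain the joint failure probability of versions A and B cannot exceed the smaller of their two conditional pfds, and under conditional independence it equals their product. Each term is weighted by the probability that a demand falls in the subdomain. The function `ps_bounds` and all numbers below are illustrative assumptions, not project data.

```python
from typing import Sequence, Tuple


def ps_bounds(profile: Sequence[float],
              pfd_a: Sequence[float],
              pfd_b: Sequence[float]) -> Tuple[float, float]:
    """Return ("likely" lower bound, upper bound) on the system pfd.

    profile[i] : probability that a demand falls in subdomain i
    pfd_a[i]   : pfd of version A given a demand in subdomain i
    pfd_b[i]   : pfd of version B given a demand in subdomain i
    """
    if not (len(profile) == len(pfd_a) == len(pfd_b)):
        raise ValueError("all inputs need one entry per subdomain")
    if abs(sum(profile) - 1.0) > 1e-9:
        raise ValueError("the demand profile must sum to 1")
    # Upper bound: both versions fail at most as often as the better one fails.
    upper = sum(p * min(a, b) for p, a, b in zip(profile, pfd_a, pfd_b))
    # "Likely" lower bound: failures are independent within each subdomain.
    lower = sum(p * a * b for p, a, b in zip(profile, pfd_a, pfd_b))
    return lower, upper


lower, upper = ps_bounds(
    profile=[0.5, 0.3, 0.2],
    pfd_a=[0.01, 0.0, 0.1],
    pfd_b=[0.02, 0.05, 0.0],
)
print(f"likely lower bound {lower:.6f}, upper bound {upper:.6f}")
```

In this made-up example only the first subdomain contributes, because it is the only one where both versions can fail.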

  31. PS Model: mutants passed acceptance test

  32. PS Model Demand profile

  33. PS Model: joint pfds

  34. PS Model

  35. DL Model • Dugan and Lyu’s dependability model • A Markov model details the system structure • Two fault trees represent the causes of unacceptable results in the initial configuration and in the reconfigured degraded state • Three parameters can be estimated: • the probability of an unrelated fault in a version • the probability of a related fault between two versions • the probability of a related fault in all versions
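To show how the three parameters could combine, here is a deliberately simplified, software-only calculation. It is not the DL model itself: it ignores the Markov model, the degraded configuration, and any hardware or decider failures, and it assumes three versions, a 2-out-of-3 majority voter, and independent basic events. The function name and parameter names are hypothetical.

```python
def p_unacceptable_3v(p_v: float, p_rv: float, p_rall: float) -> float:
    """Probability that a 3-version majority vote yields an unacceptable result.

    p_v    : probability of an unrelated fault in a single version
    p_rv   : probability of a related fault between one pair of versions
    p_rall : probability of a related fault in all versions
    """
    # At least two of the three versions fail through unrelated faults.
    p_two_unrelated = 3 * p_v ** 2 * (1 - p_v) + p_v ** 3
    # The vote is acceptable only if none of the failure causes occurs:
    # no double unrelated failure, no related fault in any of the three
    # version pairs, and no fault common to all versions.
    p_acceptable = (1 - p_two_unrelated) * (1 - p_rv) ** 3 * (1 - p_rall)
    return 1 - p_acceptable


print(p_unacceptable_3v(p_v=0.1, p_rv=0.0, p_rall=0.0))
```

Even in this reduced form, related faults enter the result roughly linearly while unrelated faults enter quadratically, so small related-fault probabilities tend to dominate.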

  36. DL Model

  37. DL Model

  38. Conclusion • Our target is to investigate software testing, fault correlation and reliability modeling for design diversity • We perform an empirical investigation on evaluating fault removal and fault tolerance issues as software reliability engineering techniques • Mutation testing was applied with real faults

  39. Conclusion • Static and dynamic analyses were performed to evaluate the relationship between fault removal and fault tolerance techniques • Different reliability models were applied to our project data to evaluate their validity and prediction accuracy

  40. Future Work • For further evaluation and investigation, more test cases should be generated and executed on the mutants and versions • Comparison with existing project data will be made to observe the “variants” as well as “invariants” of design diversity

  41. Q & A Thank you!
