
Lecture 11 Testing, Verification, Validation and Certification



  1. CS 540 – Quantitative Software Engineering Lecture 11 Testing, Verification, Validation and Certification • You can’t test in quality • Independent system testers

  2. Software Testing • Dijkstra: “Testing can show the presence of bugs but not their absence!” • Independent testing is a necessary but not sufficient condition for trustworthiness. • Good testing is hard and occupies 20% of the schedule • Poor testing can dominate 40% of the schedule • Test to assure confidence in operation, not to find bugs

  3. Types of Tests • Unit • Interface • Integration • System • Scenario • Reliability • Stress • Verification • Validation • Certification
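To make the unit level concrete, here is a minimal sketch of a unit test in Python; the `discount` function and its rules are hypothetical, invented only for illustration.

```python
import unittest

def discount(price: float, percent: float) -> float:
    """Hypothetical function under test: apply a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)

class DiscountUnitTest(unittest.TestCase):
    def test_typical_value(self):
        self.assertAlmostEqual(discount(200.0, 25), 150.0)

    def test_invalid_percent_rejected(self):
        with self.assertRaises(ValueError):
            discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()
```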

  4. When to Test • Boehm: errors discovered in the operational phase cost 10 to 90 times more to fix than those caught in the design phase • Over 60% of errors were introduced during design • Two-thirds of these were not discovered until operations • Test requirements specifications, architectures and designs

  5. Testing Approaches • Coverage based - all statements must be executed at least once • Fault based - artificially seed faults and determine whether tests catch at least X% of them • Error based - focus on typical errors such as boundary values (off by 1) or max elements in a list • Black box - function/specification based; test cases derived from the specification • White box - structure/program based; testing considers the internal logical structure of the software • Stress based - no load, impulse, uniform, linear growth, exponential growth by powers of 2
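As one hedged example of the error-based approach, a boundary-value sketch around an off-by-one-prone capacity check; `MAX_ELEMENTS` and `add_item` are hypothetical names invented for illustration.

```python
# Error-based (boundary value) testing sketch: probe just below, at, and
# above the boundary -- the classic off-by-one spots.
MAX_ELEMENTS = 10

def add_item(items: list, value) -> bool:
    """Add value if the list is below capacity; return True on success."""
    if len(items) >= MAX_ELEMENTS:
        return False
    items.append(value)
    return True

def test_boundary_values():
    items = list(range(MAX_ELEMENTS - 1))        # one below capacity
    assert add_item(items, "x") is True          # fills to exactly MAX_ELEMENTS
    assert add_item(items, "y") is False         # at capacity: must be rejected
    assert len(items) == MAX_ELEMENTS

test_boundary_values()
```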

  6. Testing Vocabulary • Error - human action producing an incorrect result • Fault - a manifestation of an error in the code • Failure - a system anomaly; executing a fault induces a failure • Verification - “The process of evaluating a system or component to determine whether the products of a given development phase satisfy conditions imposed at the start of the phase,” e.g., ensure the software correctly implements a certain function - have we built the system right? • Validation - “The process of evaluating a system or component during or at the end of the development process to determine whether it satisfies specified requirements” - have we built the right system? • Certification - “The process of assuring that the solution solves the problem.”

  7. Test Process (diagram): a test strategy selects a subset of inputs; the program or document executes that subset to produce actual output, which is compared with the expected output from a prototype or model; the comparison yields the test results.
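A minimal sketch of the compare step in this process, assuming a hypothetical program under test and a trusted prototype/model acting as the oracle:

```python
# Run the program on a subset of inputs and compare actual output against
# the expected output produced by a prototype or model (the oracle).
def program_under_test(x: int) -> int:
    return x * x                # implementation being tested

def reference_model(x: int) -> int:
    return x ** 2               # trusted prototype/model

def run_tests(input_subset):
    results = []
    for x in input_subset:
        actual, expected = program_under_test(x), reference_model(x)
        results.append((x, actual, expected, actual == expected))
    return results

for x, actual, expected, passed in run_tests([0, 1, -3, 7]):
    print(f"input={x} actual={actual} expected={expected} pass={passed}")
```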

  8. Fault Detection vs. Confidence Building • Testing provokes failure behavior - a good strategy for fault detection, but it does not inspire confidence • The user wants failure-free behavior - high reliability • Automatic recovery minimizes user doubts • Test team results can demoralize end users, so report only those that impact them. A project with no problems is in deep trouble.

  9. Cleanroom • Developer does not execute code - convinced of correctness through static analysis • Modules are integrated and tested by independent testers using traffic based input profiles. • Goal: Achieve a given reliability level considering expected use.

  10. Testing requirements • Review or inspection to check that all aspects of the system have been described • Scenarios with prospective users resulting in functional tests • Common errors in a specification: • Missing information • Wrong information • Extra information

  11. Boehm’s specification criteria • Completeness- all components present and described completely - nothing pending • Consistent- components do not conflict and specification does not conflict with external specifications --internal and external consistency. Each component must be traceable • Feasibility- benefits must outweigh cost, risk analysis (safety-robotics) • Testable - the system does what’s described • Roots of ICED-T

  12. Traceability Tables • Features - requirements related to observable system/product features • Source - source of each requirement • Dependency - relation of requirements to each other • Subsystem - requirements by subsystem • Interface - relation of requirements to internal and external interfaces

  13. Traceability Table (Pressman): a matrix relating requirements to the subsystems that satisfy them.
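A small sketch of how such a table can be kept alongside the code; the requirement IDs, sources and subsystem names are hypothetical.

```python
# Hypothetical traceability data: each requirement records its source,
# any dependency on other requirements, and the subsystems that realize it.
traceability = {
    "REQ-001": {"source": "marketing",  "depends_on": [],          "subsystems": ["ui", "billing"]},
    "REQ-002": {"source": "customer A", "depends_on": [],          "subsystems": ["billing"]},
    "REQ-003": {"source": "REQ review", "depends_on": ["REQ-001"], "subsystems": ["reporting"]},
}

def requirements_for(subsystem: str):
    """Subsystem view: which requirements are allocated to this subsystem?"""
    return [rid for rid, row in traceability.items() if subsystem in row["subsystems"]]

print(requirements_for("billing"))   # -> ['REQ-001', 'REQ-002']
```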

  14. Maintenance Testing • More than 50% of the project life is spent in maintenance • Modifications induce another round of tests • Regression tests: a library of previous tests plus additions (especially if the fix was for a fault not uncovered by previous tests) • The issue is whether to retest all vs. retest selectively - an expense-related decision (and one related to the state of the architecture/design: when entropy sets in, test thoroughly!) • Cuts the testing interval in half.
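A rough sketch of the “retest all vs. selective retest” decision, with a hypothetical test library tagged by the modules each test exercises:

```python
# Selective regression retest sketch: rerun only tests touching changed
# modules, but fall back to the full library when most of it is affected.
test_library = {
    "test_login":   {"modules": {"auth"}},
    "test_invoice": {"modules": {"billing", "db"}},
    "test_report":  {"modules": {"reporting", "db"}},
}

def select_tests(changed_modules: set, retest_all_threshold: float = 0.75):
    selected = [name for name, info in test_library.items()
                if info["modules"] & changed_modules]
    if len(selected) / len(test_library) > retest_all_threshold:
        return list(test_library)        # savings would be small: retest everything
    return selected

print(select_tests({"db"}))    # -> ['test_invoice', 'test_report']
print(select_tests({"auth"}))  # -> ['test_login']
```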

  15. V&V planning and documentation • IEEE 1012 specifies what should be in the Test Plan • The Test Design Document specifies, for each software feature, the details of the test approach and lists the associated tests • The Test Case Document lists inputs, expected outputs and execution conditions • The Test Procedure Document lists the sequence of actions in the testing process • The Test Report states what happened for each test case; sometimes these are required as part of the contract for system delivery • In small projects many of these can be combined

  16. IEEE 1012 test plan outline: Purpose; Referenced documents; Definitions; V&V overview (organization, master schedule, resources summary, responsibilities, tools, techniques and methodologies); Life-cycle V&V (management of V&V, requirements phase V&V, design phase V&V, implementation V&V, test phase V&V, installation and checkout phase V&V, O&M V&V); Software V&V reporting; V&V administrative procedures (anomaly reporting and resolution, task iteration policy, deviation policy, control procedures, standard practices and conventions)

  17. Human static testing • Reading - peer reviews (best and worst technique) • Walkthroughs and Inspections • Scenario Based Evaluation (SAAM) • Correctness Proofs • Stepwise Abstraction from code to spec

  18. Inspections • Sometimes referred to as Fagan inspections • Basically a team of about 4 people examines the code, statement by statement • Code is read before the meeting • The meeting is run by a moderator • 2 inspectors or readers paraphrase the code • The author is a silent observer • Code is analyzed using a checklist of faults: wrongful use of data, declarations, computation, relational expressions, control flow, interfaces • Results in identified problems that the author corrects and the moderator reinspects • Constructive attitude essential; do not use inspections for programmers’ performance reviews

  19. Walk throughs • Guided reading of code using test data to run a “simulation” • Generally less formal • Learning situation for new developers • Parnas advocates a review with specialized roles where the roles define questions asked - proven to be very effective - active reviews • Non-directive listening

  20. The Value of Inspections/Walk-Throughs (Humphrey 1989) • Inspections can be 20 times more efficient than testing • Code reading detects twice as many defects per hour as testing • 80% of development errors were found by inspections • Inspections resulted in a 10x reduction in the cost of finding errors • Beware: bureaucratic code reviews drive away gurus

  21. SAAM • Software Architecture Analysis Method • Scenarios that describe both current and future behavior • Classify the scenarios by whether the current architecture supports them directly (full support) or indirectly • Develop a list of changes to the architecture/high-level design - if semantically different scenarios require a change in the same component, this may indicate flaws in the architecture • Cohesion - the glue that keeps a module together; low = bad • Functional cohesion - all components contribute to the single function of that module • Data cohesion - encapsulate abstract data types • Coupling - strength of inter-module connections; loosely coupled modules are easier to comprehend and adapt; low = good

  22. Coverage based Techniques(unit testing) • Adequacy of testing based on coverage, percent statements executed, percent functional requirements tested • All paths coverage is an exhaustive testing of code • Control flow coverage: • All nodes coverage, all statements coverage recall Cyclomatic complexity graphs • All edge coverage or branch coverage, all branches chosen at least once • Multiple condition coverage or extended branch coverage covers all combinations of elementary predicates • Cyclomatic number criterion tests all linearly independent paths
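A small sketch of all-edge (branch) coverage on a hypothetical function: the two test cases below take every branch at least once, though all-paths coverage of this function would need four cases.

```python
# Branch coverage sketch: the function has two decisions; the two tests
# below traverse every edge of its control-flow graph at least once.
def classify(age: int, member: bool) -> str:
    if age < 18:
        category = "minor"
    else:
        category = "adult"
    if member:
        category += "-member"
    return category

def test_branch_coverage():
    assert classify(12, True) == "minor-member"   # age<18 true,  member true
    assert classify(30, False) == "adult"         # age<18 false, member false

test_branch_coverage()
```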

  23. Coverage Based Techniques - 2 • Data flow coverage - considers definitions and uses of variables • A variable is defined if it is assigned a value in a statement • A definition is live at a later point if the variable is not reassigned at an intermediate statement, i.e., there is a definition-clear path to that point • Variable use: P-use (as a predicate), C-use (as anything else) • Testing each possible use of each definition is all-uses coverage
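A tiny sketch of these data-flow notions on a hypothetical function: one definition of `total`, a P-use in the predicate and a C-use in the computation; the two calls exercise a definition-clear path to each use.

```python
def bill(quantity: int, unit_price: float) -> float:
    total = quantity * unit_price      # definition of `total`
    if total > 100:                    # P-use of `total` (in a predicate)
        total = total * 0.9            # redefinition: the first definition dies here
    return round(total, 2)             # C-use of `total` (in a computation)

# All-uses coverage for the first definition of `total`:
assert bill(2, 10.0) == 20.0           # predicate false: definition reaches the C-use unchanged
assert bill(20, 10.0) == 180.0         # predicate true: definition reaches the P-use, then is redefined
```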

  24. Requirements coverage • Transform the requirements into a graph • nodes denoting elementary requirements • edges denoting relations between elementary requirements • Derive test cases • Use control flow coverage

  25. Model of Requirements Specification (flow): enter fields → are all required fields completed? - yes: check Dept budget; no: notify the user
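A sketch of turning a requirements flow like the one above into a graph and deriving one test case per edge; the node and edge names are hypothetical paraphrases of that flow.

```python
# Requirements-coverage sketch: nodes are elementary requirements, edges are
# relations between them; all-edges coverage yields one test case per edge.
requirements_graph = {
    "enter_fields":       ["check_completeness"],
    "check_completeness": ["check_dept_budget",   # yes: all required fields completed
                           "notify_user"],        # no: some required fields missing
    "check_dept_budget":  [],
    "notify_user":        [],
}

def edges(graph):
    return [(src, dst) for src, targets in graph.items() for dst in targets]

for src, dst in edges(requirements_graph):
    print(f"derive a test case covering {src} -> {dst}")
```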

  26. Fault Seeding to estimate faults in a program • Artificially seed faults, then test to discover both seeded and new faults: Total faults ≈ (total faults found − seeded faults found) × (total seeded faults / seeded faults found) • Assumes real and seeded faults have the same distribution, but manually generated faults may not be realistic • Alternative: use two groups - real faults found by X become the seeded faults for Y • Trust the results when most faults found are seeded ones • Finding many real faults is a bad sign: redesign the module • The probability of more faults in a module is proportional to the number of faults already found!
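A quick sketch of this estimator with made-up numbers:

```python
# Fault-seeding estimate (hypothetical figures): scale the real faults found
# by how many of the seeded faults the same testing managed to uncover.
def estimate_total_faults(faults_found: int, seeded_found: int, seeded_total: int) -> float:
    real_faults_found = faults_found - seeded_found
    return real_faults_found * seeded_total / seeded_found

# 25 faults found in total, 20 of them among the 25 seeded faults:
# estimated real faults = (25 - 20) * 25 / 20 = 6.25, of which 5 are already found.
print(estimate_total_faults(faults_found=25, seeded_found=20, seeded_total=25))
```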

  27. Orthogonal Array Testing • Intelligent selection of test cases • Fault model being tested is that simple interactions are a major source of defects • Independent variables - factors and number of values they can take -- if you have four variables, each of which could have 3 values, exhaustive testing would be 81 tests (3x3x3x3) whereas OATS technique would only require 9 tests yet would test all pair-wise interactions
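A sketch of the 4-factor, 3-level case using the standard L9(3^4) orthogonal array; the factor names and levels are hypothetical, and the assertion checks the pairwise-coverage property the slide describes.

```python
from itertools import combinations

# Standard L9(3^4) orthogonal array (levels 0..2): 9 runs instead of 81,
# yet every pair of levels for every pair of factors appears at least once.
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

factors = {                         # hypothetical factors and levels
    "browser": ["chrome", "firefox", "safari"],
    "os":      ["linux", "windows", "macos"],
    "locale":  ["en", "de", "ja"],
    "payment": ["card", "invoice", "wallet"],
}
names = list(factors)

# Sanity check: all 3x3 level combinations occur for every pair of columns.
for a, b in combinations(range(4), 2):
    assert {(row[a], row[b]) for row in L9} == {(i, j) for i in range(3) for j in range(3)}

test_cases = [{names[i]: factors[names[i]][row[i]] for i in range(4)} for row in L9]
for case in test_cases:
    print(case)
```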

  28. Humphrey, 1989 Top-down and Bottom-up

  29. Some Specialized Tests • Testing GUIs • Testing with Client/Server architectures • Testing documentation and help facilities • Testing real time systems • Acceptance test • Conformance test

  30. Software Testing Footprint (chart): tests completed vs. time - planned tests compared with tests run successfully; poor module quality shows up as a widening gap and a rejection point.

  31. Test Status

  32. Software Engineering Vision • People, Process, Product, Project (control, risk, schedule, trustworthiness) • Technology and Platforms (rules, tools, assets) • People work days, computers work nights • Work, not people, needs to be mobile • Productivity must continue to double with no loss of reliability or performance

  33. Customer Interests • Before installation: features, price, schedule • After installation: reliability, response time, throughput

  34. Why bad things happen to good systems • Customer buys off-the-shelf • System works with 40-60% flow-through • BUT • Developers comply with enhancements • Customer refuses the critical Billing Module • Customer demands 33 enhancements and tinkers with the database • Unintended system consequences

  35. Lessons Learned • One common process is not the goal • Commonly managed processes are possible • Scalability is essential

  36. Brooks: a Program becomes a Programming System (x3 effort) or a Programming Product (x3 effort); combining both yields a Programming Systems Product (x9 effort)

  37. Mindset • Move from a culture of minimal change to one of maximal change • Move to a "make it work, make it work right, make it work better" philosophy through prototyping and delaying code optimization • Give the test teams the "right of refusal" for any code that was not reasonably tested by the developers

  38. Productivity • Productivity = F{people, system nature, customer relations, capital investment}

  39. Trends in Software Productivity • Expansion factor: the ratio of source lines of code to machine-level lines of code • It grows by an order of magnitude every twenty years • Technology changes: machine instructions, macro assemblers, high-level languages, database managers, on-line development, prototyping, sub-second time sharing, object-oriented programming, large-scale reuse, regression testing, 4GLs, small-scale reuse • Each date on the chart is an estimate of widespread use of a software technology

  40. The Project Manager’s crib sheet • Who’s in charge? • What’s the problem and what changed? • When did it happen? • Where is it? • How did it happen? • What should we do? • It depends.
