Presentation Transcript


  1. Operations Research Methods for Software-Intensive Systems Session 8

  2. Course Outline
  Day 1
  • Part 0: Student Introduction
    • Paper Helicopter - Pt 0: Use what you know
  • Part 1: DOE Introduction
    • What is a Designed Experiment?
  • Part 2: Planning
    • Understand the test item's process from start to finish
    • Identify test objective – screen, characterize, optimize, compare
    • Response variables
    • Identify key factors affecting performance
    • Paper Helicopter - Pt 1: Planning
  • Part 3: Hypothesis Testing
    • Random variables
    • Understanding hypothesis testing
    • Demonstrate a hypothesis test
    • Sample size, risk, and constraints
    • Seatwork Exercise 1 – Hang-time measurements
  • Part 4: Design and Execution
  Day 2
  • Part 4: Design and Execution
    • Understanding a test matrix
    • Choose the test space – set levels
    • Factorials and fractional factorials
    • Execution – randomization and blocking
    • In-Class F-18 LEX Study - Design Build
    • Paper Helicopter - Pt 2: Design for Power
  • Part 5: Analysis
    • Regression model building
    • ANOVA
    • Interpreting results – assess results, redesign, and plan further tests
    • Optimization
    • In-Class F-18 LEX Study - Analysis
    • Seatwork Exercise 2 – NASCAR
    • Paper Helicopter – Pt 3: Execute and Analyze
    • Paper Helicopter – Pt 4: Multiple Response Optimization
  • Part 6: Case Studies
  • Part 7: Re-engineering Proposed Designs with DOT&E (New!)
  • Part 8: Design of Experiments for Software-Intensive Systems (New!)

  3. Science of Test (DOE) – Metrics of Note
  I. Plan sequentially for discovery → Factors, responses, and levels
  II. Design with confidence and power to span the battlespace → N, α, power, test matrices
  III. Execute to control uncertainty → Randomize, block, replicate
  IV. Analyze statistically to model performance → Model, predictions, bounds

  4. Test Agency Objectives
  Mitigate the risk that a flaw in system design, functionality, or performance is discovered in Operational Test (or in use):
  • DT: Ensure the system meets design specifications and functional requirements
  • DT/OT: Ensure the system meets performance thresholds
  • OT: Ensure the system meets user needs
  Mitigate the risk that an end-user (or any support function) concludes the system has little utility (+ilities)

  5. Here’s where we are…
  • There is a methodical way to consider operational science for testing Systems of Systems (SoS) that are software-intensive
  • It can often be more complicated than "sitting down with the testers for a few hours"
  • Its value must be weighed against the time and resources required to apply it
  • Available methods are generally 'tactics' and lack a codified set of principles as a basis

  6. Overview
  • How to span complex configuration trees
  • How to search uniformly over many dimensions
  • How to distribute test resources objectively
  [Figure: allocation of test resources (%) across Objectives 1, 2, and 3]

  7. Software-Intensive Systems
  Traditional DOE steps and the issues they raise for SIS:
  • 0: Objective Identification
    • Test questions → No modeling? Always coverage
    • Test hypotheses → What is 'significant'?
  • 1: Process Decomposition
    • Measures → Black box?
    • Factors/levels → Categorical and overabundant; constrained and nested
  • 2: Planning → Non-orthogonal?
  • 3: Execution → Anti-randomizing? (gasp!)
  • 4: Analysis → N/A?

  8. Software-Intensive Systems
  Traditional Metrics              | Issues for SIS
  α > 0                            | α → 0
  β > 0                            | β >> 0
  σ > 0                            | σ → 0
  δ: difference we care about      | S: fault that we care about (strength/region)
  n > 0                            | n → 1

  9. Software-Intensive Systems
  Physical Phenomena               | Issues for SIS
  Nature                           | Logic
  Physics                          | Algorithmics
  Analog/Continuous/Infinite       | Digital/Discrete/Finite
  Live                             | Simulated
  Stochastic                       | Deterministic
  Statistical variability          | Repeatable outcomes
  A lifetime of experience tells us how to manipulate physical objects… that knowledge must be re-learned for algorithms in order to understand how to test them

  10. Basic Probability
  • Combinations: "n choose r", C(n, r) = n! / (r! (n − r)!)
    • Example: How many pairings can I form from the set S = {a, b, c, d}? C(4, 2) = 6:
    • {ab}, {ac}, {ad}, {bc}, {bd}, {cd}
  • Permutations: P(n, r) = n! / (n − r)!
    • Example: How many ordered pairings can I form from S? P(4, 2) = 12:
    • {ab}, {ac}, {ad}, {bc}, {bd}, {cd}, {ba}, {ca}, {da}, {cb}, {db}, {dc}
  Probability is the science of chance; in test, we face the chance that we won't find something we want to discover. (A quick check of these counts appears below.)
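  As a sanity check of the counts above, here is a minimal sketch using only the standard library (Python is assumed here; the deck itself does not prescribe a language):

      from itertools import combinations, permutations
      from math import comb, perm

      S = ["a", "b", "c", "d"]

      # Unordered pairings: C(4, 2) = 4! / (2! * 2!) = 6
      pairs = list(combinations(S, 2))
      assert len(pairs) == comb(4, 2) == 6

      # Ordered pairings: P(4, 2) = 4! / 2! = 12
      ordered = list(permutations(S, 2))
      assert len(ordered) == perm(4, 2) == 12

      print(pairs)    # [('a', 'b'), ('a', 'c'), ..., ('c', 'd')]
      print(ordered)  # all 12 ordered pairs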

  11. Spanning Simple Configurations
  • A = {1, 2, 3}
  • B = {1, 3}
  • C = {2, 4, 6}
  • There are 3 ∙ 2 ∙ 3 = 18 combinations of the levels of all three factors taken together
  What is the minimum number of test cases needed to 'cover' all the combinations of the levels of only 2 factors taken together?

  12. Spanning Simple Configurations
  • 6 combinations of A & B
  • 9 combinations of A & C
  • 6 combinations of B & C
  Can we form a test matrix that covers all 6 + 9 + 6 = 21 'pairwise' combinations in fewer than 18 cases?

  13. Factor Covering
  • FCA(3, 2, 9) is a Factor Covering Array (FCA) with 3 factors, of strength 2, in 9 cases
  • It 'covers' all 21 combinations of the levels of any 2 factors taken together
  • (verify – a sketch that performs this check programmatically follows below)
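  The slide's own 9-case matrix is not reproduced in this transcript, so the array below is one illustrative FCA(3, 2, 9) for A = {1, 2, 3}, B = {1, 3}, C = {2, 4, 6} (an assumed example, not necessarily the slide's), together with a minimal Python check that all 21 pairwise combinations appear:

      from itertools import combinations

      # One possible FCA(3, 2, 9): each row is a test case (A, B, C).
      # NOTE: illustrative array, not necessarily the slide's own matrix.
      fca = [
          (1, 1, 2), (1, 3, 4), (1, 1, 6),
          (2, 3, 2), (2, 1, 4), (2, 3, 6),
          (3, 1, 2), (3, 3, 4), (3, 1, 6),
      ]

      levels = [{1, 2, 3}, {1, 3}, {2, 4, 6}]  # A, B, C

      # For every pair of factor columns, every pair of levels
      # must appear together in at least one row.
      for i, j in combinations(range(3), 2):
          needed = {(a, b) for a in levels[i] for b in levels[j]}
          seen = {(row[i], row[j]) for row in fca}
          assert needed <= seen, f"uncovered pairs in columns {i},{j}: {needed - seen}"

      print("All 6 + 9 + 6 = 21 pairwise combinations covered in 9 cases.")

  Note that 9 cases is the minimum here: the A & C pair alone has 3 × 3 = 9 combinations, so no smaller matrix can be strength-2.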

  14. Factor Covering

  15. Factor Covering
  It takes FIVE cases to cover all pairs for FOUR factors (2 levels each).
  We can extend into TEN dimensions with only ONE more case…
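  To illustrate the first claim, here is one strength-2 covering array for four binary factors in five runs, versus 2^4 = 16 runs exhaustively (again an assumed example in Python, checked with the same pair-coverage idea as above):

      from itertools import combinations

      # Five runs covering all level pairs across four binary factors.
      # NOTE: one valid array among several; not necessarily the slide's.
      runs = [
          (0, 0, 0, 0),
          (0, 1, 1, 1),
          (1, 0, 1, 1),
          (1, 1, 0, 1),
          (1, 1, 1, 0),
      ]

      for i, j in combinations(range(4), 2):
          seen = {(r[i], r[j]) for r in runs}
          assert seen == {(0, 0), (0, 1), (1, 0), (1, 1)}, (i, j)

      print("5 runs cover all pairs across 4 binary factors (vs. 16 exhaustive).")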

  16. Factor Covering
  • National Institute of Standards and Technology (NIST)
  • Automated Combinatorial Test Software (ACTS)
  • Sequence Covering Arrays

  17. Searching Simple Input Spaces
  The function under test, a fixed-point square-root estimator, reformatted here for readability with its original logic intact (the slides that follow search its (P, E) input space for failure regions):

      Function SquareRootEst(P As Double, E As Double, Optional J As Integer = 0)
          D = 1
          X = 0
          C = 2 * P
          Dim Step As String
          ' Guard: only defined for 0 <= P < 1 and 0 < E <= 1
          ' (note: 'End' aborts the whole program, not just the function)
          If P >= 1 Or P < 0 Or E > 1 Or E <= 0 Then
              End
          End If
          ' Binary digit-by-digit refinement of X toward Sqr(P)
          While D > E
              D = D / 2
              T = C - (2 * X + D / 2)
              If T >= 0 Then
                  C = 2 * (C - (2 * X + D))
                  X = X + D
              Else
                  C = 2 * C
              End If
          Wend
          If J = 0 Then
              SquareRootEst = X
          Else
              SquareRootEst = Step   ' Step is never assigned
          End If
      End Function

  • 300 randomly drawn test points result in a 95% chance of detecting a "1%" error:
  • P(no detections in 300 trials) = (1 − 0.01)^300 ≈ 0.05
  • (Note: for a single-point failure in an infinite design space, Pdetect ≈ 0. For a finite design space of size R and a test matrix of size n, the detection probability is n/R for any design.)
  [Figure: 300 random test points over the (E, P) input space]
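  The 95% figure follows from the complement rule. A quick Python sketch computes it analytically and confirms it by simulation (the 1% failure region is modeled as an arbitrary corner square purely for illustration):

      import random

      p_fail = 0.01   # fraction of the (P, E) input space occupied by the defect
      n = 300         # randomly drawn test points

      # Analytic: P(at least one detection) = 1 - (1 - p)^n
      p_detect = 1 - (1 - p_fail) ** n
      print(f"analytic P(detect) = {p_detect:.3f}")   # ~0.951

      # Monte Carlo over many 300-shot test campaigns, with the failure
      # region modeled (assumption, for illustration only) as the corner
      # square [0, 0.1] x [0, 0.1] of a unit input space - area 0.01.
      trials = 5_000
      hits = sum(
          any(random.random() < 0.1 and random.random() < 0.1 for _ in range(n))
          for _ in range(trials)
      )
      print(f"simulated P(detect) ~ {hits / trials:.3f}")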

  18. Searching Simple Input Spaces
  • A naive 144-point grid design (~½ of 300) can cover the design space such that the maximum distance between nearest-neighbor points guarantees discovery of a contiguous, symmetric failure region of area .01, but can still miss Amorphous Defect Phenomena (ADP)
  • The second chart represents the true failure region for the software
  [Figures: the 144-point grid over the (E, P) input space, annotated r = .0564 and l = 0.0798, alongside the actual failure regions]
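  The r = .0564 annotation is consistent with the radius of a disk of area 0.01 (r = √(0.01/π) ≈ 0.0564). A short sketch, assuming a unit-square input space and a 12 × 12 grid of cell-centered points (assumptions; the slide's exact layout isn't in the transcript), compares the grid's covering radius against the failure region's size:

      import math

      # Failure region of area 0.01: as a disk its radius is ~0.0564
      # (matching the slide's r = .0564 annotation); as an axis-aligned
      # square its side is 0.1.
      r_disk = math.sqrt(0.01 / math.pi)       # ~0.0564
      side_square = math.sqrt(0.01)            # 0.1

      # 12 x 12 = 144 points centered in equal cells of a unit square.
      spacing = 1 / 12                         # ~0.0833
      covering_radius = spacing * math.sqrt(2) / 2   # ~0.0589

      # An axis-aligned 0.1 x 0.1 failure square must contain a grid
      # point, since the spacing (0.0833) is less than the side (0.1):
      assert spacing < side_square

      # A disk is marginal: every point of the space lies within 0.0589
      # of a grid point, just over the 0.0564 disk radius, so the
      # guarantee depends on grid placement and the region's exact shape.
      print(f"r_disk={r_disk:.4f}, covering_radius={covering_radius:.4f}")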

  19. Searching Simple Input Spaces
  • A sphere-packing 110-point uniform design (~⅓ of 300) can cover the design space with the same fault-finding capability as the naive approach.
  • In two dimensions, it's not difficult… but even with m dimensions and nested levels, computer-generated designs (nested space-filling designs using swapping algorithms) can achieve the same notion of design quality
  [Figures: the 110-point design over the (E, P) input space, annotated h = 0.0846 and l = 0.098, alongside the actual failure regions]

  20. Space Filling
  • 3 popular algorithms (a sketch of the first follows below):
  • Sphere-Packing
    • Maximize the smallest distance between neighbors
    • Effect: moves points out to the boundaries
  • Uniform
    • Minimize discrepancy from a uniform distribution
    • Effect: spreads points within the interior
  • Latin Hypercube
    • Assign n congruent levels and minimize covariance
    • Effect: a combination of the above
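  As a sketch of the sphere-packing idea only (a naive point-exchange maximin search in Python; production tools such as JMP or NIST ACTS use far better algorithms), the following greedily improves the smallest pairwise distance among n points:

      import math
      import random

      def min_pairwise_dist(pts):
          """Smallest Euclidean distance between any two points."""
          return min(
              math.dist(p, q)
              for i, p in enumerate(pts)
              for q in pts[i + 1:]
          )

      def sphere_packing(n, dim=2, iters=5000, seed=1):
          """Naive maximin design: random single-point moves, keeping a
          move only if it increases the minimum pairwise distance."""
          rng = random.Random(seed)
          pts = [[rng.random() for _ in range(dim)] for _ in range(n)]
          best = min_pairwise_dist(pts)
          for _ in range(iters):
              i = rng.randrange(n)
              old = pts[i]
              pts[i] = [rng.random() for _ in range(dim)]
              d = min_pairwise_dist(pts)
              if d > best:
                  best = d          # accept: points are more spread out
              else:
                  pts[i] = old      # reject: restore the previous point
          return pts, best

      pts, d = sphere_packing(n=12)
      print(f"12-point maximin design, min distance = {d:.3f}")
      # Expect points pushed toward the boundary of the unit square.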

  21. Sphere-Packing

  22. Uniform

  23. Latin Hypercube

  24. Searching Complex Input Spaces
  • Space filling is an efficient way to search or cover continuous input spaces
  • Space-filling algorithms spread out test points using tailored optimality criteria
  • They assume a couple of strange things that probably aren't true
  • It's still better than random… but what about categorical or mixed inputs?

  25. Assigning Resources
  [Figure: decision diagram weighing the presence & nature of defects, likelihood of detection, and criticality (RIGHT?) against test case development, defect detection effectiveness, test decisions, and defect detection efficiency (WRONG?)]

  26. Session 8: Summary of DOE for C4ISR
  • THERE IS A SCIENCE TO SOFTWARE SYSTEM TEST
  • IT IS FUNDAMENTALLY ALGORITHMIC AND PROBABILISTIC
  • There is NOT a well-developed, overarching discipline with a codified set of principles and methods.
  • There ARE tools and techniques (we covered some) that have utility for C4ISR test design:
    • Decision Modeling for managing risk
    • Factor Covering for covering sub-configurations
    • Space Filling for spanning regions
  • This seminar has been an introduction to the idea of "probabilistic methods" as a complementary approach to software-test/system-test/functional-test in DoD acquisition
  THERE IS MUCH MORE WORK TO BE DONE…
