
Dimensions of Formal Verification and Validation


Presentation Transcript


  1. Dimensions of Formal Verification and Validation. Doron Drusinsky, Bret Michael, Mantak Shing. Naval Postgraduate School.

  2. Contents • Tradeoffs: Model Checking (MC) vs. Theorem Proving (TP) vs. Execution-based Model-Checking (EMC).

  3. The Role of Specification: "Have we built the right product?"
     Customer cognitive requirements, e.g., "if pump pressure is turned Low then High and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds."
     Spec. = formal representation of the cognitive requirement, kept separate from the implementation code:
       class PumpCtl { int x; void pumpOn() { … } }
     [Slide diagram: timeline marking the 10 sec pattern window and the 20 sec lockout window.]

  4. The Role of Verification: "Have we built the product right?"
     E.g., "if pump pressure is turned Low then High and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds."
     Verification = the bridge between specification and implementation:
       class PumpCtl { int x; void pumpOn() { … } }
     [Slide diagram: timeline marking the 10 sec pattern window and the 20 sec lockout window.]
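     As a rough illustration of what keeping the specification separate from PumpCtl can look like in code, here is a minimal hand-written Java sketch of a monitor for the pump timing requirement. All class and method names are hypothetical; StateRover expresses such properties graphically as statechart-assertions and generates the monitor code, so this is only the idea, not the tool's output.

       // Hypothetical stand-alone assertion monitor for:
       // "if pump pressure goes Low, High, Low within 10 s,
       //  then the pump must not be High for at least 20 more seconds"
       public class PumpPressureAssertion {
           private enum Phase { IDLE, SAW_LOW, SAW_LOW_HIGH, LOCKOUT }

           private Phase phase = Phase.IDLE;
           private long windowStartMillis;   // time of the first Low
           private long lockoutStartMillis;  // time the Low-High-Low pattern completed
           private boolean failed = false;

           public void pressureLow(long nowMillis) {
               switch (phase) {
                   case IDLE:
                   case SAW_LOW:
                       phase = Phase.SAW_LOW;
                       windowStartMillis = nowMillis;
                       break;
                   case SAW_LOW_HIGH:
                       if (nowMillis - windowStartMillis <= 10_000) {
                           phase = Phase.LOCKOUT;           // pattern completed within 10 s
                           lockoutStartMillis = nowMillis;
                       } else {
                           phase = Phase.SAW_LOW;           // too slow; restart the window
                           windowStartMillis = nowMillis;
                       }
                       break;
                   default:
                       break;                               // Low during lockout is allowed
               }
           }

           public void pressureHigh(long nowMillis) {
               if (phase == Phase.LOCKOUT) {
                   if (nowMillis - lockoutStartMillis < 20_000) {
                       failed = true;                       // High too soon: requirement violated
                   } else {
                       phase = Phase.IDLE;                  // lockout satisfied
                   }
               } else if (phase == Phase.SAW_LOW) {
                   phase = Phase.SAW_LOW_HIGH;
               }
           }

           public boolean isSuccess() {
               return !failed;
           }
       }

     The assertion object is fed the same Low/High events as the implementation and its verdict is queried separately, so the requirement keeps its own "representative" independent of PumpCtl.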

  5. Verification vs. Validation Emphasis: most academic work is verification-centered. We care about modeling, programming, and validation just as much.

  6. Background: Primary Verification Techniques. "True" Model-Checking: automatic verification.
     FM promise: no need to execute the system.
     Cognitive/NL requirement --(formalization and validation)--> Formal spec
     System --(manual modeling, e.g., in Promela, or via an abstraction tool)--> (Finite-state) model of the system
     Formal spec ==? Model of the system; the check stands in for a test suite of many input sequences.

  7. Background: Primary Verification Techniques. "True" Model-Checking, limitations:
     (1) limited validation (weak, hard-to-use specification languages)
     (2) state explosion
     Formal spec ==? (Finite-state) model of the system, obtained by manual translation (e.g., into Promela) or via an abstraction tool.

  8. Background: Primary Verification Techniques. "True" Model-Checking, limitations (cont.):
     (1) limited validation (weak, hard-to-use specification languages): a significant limitation in our opinion
     (2) state explosion: the focus of academic interest
     Formal spec ==? (Finite-state) model of the system, obtained by manual translation (e.g., into Promela) or via an abstraction tool.

  9. Background: Primary Verification Techniques. Theorem Proving.
     FM promise: no need to execute the system.
     Cognitive/NL requirement --(formalization and validation)--> Formal spec
     Formal spec ==? (Infinite-state) system; the check stands in for a test suite of many input sequences.

  10. Background: Primary Verification Techniques. Theorem Proving, limitations:
      (1) limited validation (even weaker, hard-to-use specification languages)
      (2) requires a Ph.D. driver
      Formal spec ==? (Infinite-state) system

  11. Background: Primary Verification Techniques. Manual Testing: the cognitive requirement is expressed in natural language. Limitations:
      (1) slow, poor verification coverage, expensive, hard to repeat
      (2) requires many human testers (slow and expensive)
      (3) validation is missing (effectively the tester does both V&V)

  12. Background: Primary Verification Techniques. Execution-based Model-Checking (EMC) = Run-time Execution Monitoring (REM/RV) + Automatic Test Generation (ATG):
      Cognitive/NL requirement --(formalization and validation)--> assertion monitor (REM), exercised against the system by the ATG.

  13. Background: Primary Verification Techniques. Execution-based Model-Checking (EMC).
      PRO: better validation: easy-to-use, expressive languages, with simulation.
      Limitations: (1) no absolute coverage (but more can be specified…)
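      A minimal sketch of the EMC loop described above, reusing the hypothetical PumpPressureAssertion monitor from the earlier sketch: an automatic test generator fires randomized event sequences at the system and mirrors them to the run-time monitor. Real ATG (including StateRover's) is model-driven rather than purely random; this only illustrates the REM + ATG combination.

        import java.util.Random;

        // Minimal sketch of execution-based model checking:
        // automatic test generation + run-time execution monitoring.
        public class EmcSketch {
            public static void main(String[] args) {
                Random rnd = new Random(42);

                for (int run = 0; run < 1_000; run++) {          // many generated runs
                    PumpPressureAssertion monitor = new PumpPressureAssertion();
                    long clock = 0;

                    for (int step = 0; step < 50; step++) {      // one generated event sequence
                        clock += rnd.nextInt(5_000);             // advance the clock by up to 5 s
                        if (rnd.nextBoolean()) {
                            // pumpCtl.setLow();                 // drive the real system here
                            monitor.pressureLow(clock);          // ...and mirror the event to the monitor
                        } else {
                            // pumpCtl.setHigh();
                            monitor.pressureHigh(clock);
                        }
                    }

                    if (!monitor.isSuccess()) {
                        System.out.println("Assertion violated in generated run #" + run);
                    }
                }
            }
        }

      Each generated run is a verification run (the monitor checks it) and, when replayed by hand, also a validation aid: does the verdict match the cognitive expectation?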

  14. The Coverage Cube (more is better)

  15. The Coverage Cube (more is better) (cont.)

  16. The Cost Cube (more is worse)

  17. The Cost Cube (more is worse) (cont.)

  18. Example #1 of a Validation Issue: Weak Specification Coverage. Customer cognitive requirement: "if pump pressure is repeatedly turned Low then High N or more times (N > 1) within 10 seconds, then the pump should not be Low for at least 20 additional seconds."

  19. Example #1 (cont.): the customer cognitive requirement expressed as a statechart-assertion for RV and EMC. [Figure: statechart-assertion diagram.]

  20. Example #1 (cont.) Outside the scope of MC/TP. They do not support:
      • Real-time constraints (10 sec, 20 sec…)
      • Counting (N times…)
      • In general, they support at most ω-regular properties.
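      To show why counting combined with real-time constraints is natural to express operationally (and hence as a statechart-assertion), here is a hand-written Java sketch of a monitor for Example #1. The class and method names are hypothetical, and the sliding-window bookkeeping is one possible reading of "N or more times within 10 seconds"; it is not StateRover-generated code.

        import java.util.ArrayDeque;
        import java.util.Deque;

        // Hypothetical monitor for Example #1:
        // "if pressure is turned Low then High N or more times within 10 s,
        //  then the pump must not be Low for at least 20 additional seconds"
        public class RepeatedToggleAssertion {
            private final int n;                                        // threshold N (> 1)
            private final Deque<Long> toggleTimes = new ArrayDeque<>(); // times of Low -> High toggles
            private boolean lastWasLow = false;
            private long lockoutUntil = -1;                             // end of the 20 s "must not be Low" window
            private boolean failed = false;

            public RepeatedToggleAssertion(int n) { this.n = n; }

            public void pressureLow(long nowMillis) {
                if (nowMillis < lockoutUntil) {
                    failed = true;                       // Low during the lockout window
                }
                lastWasLow = true;
            }

            public void pressureHigh(long nowMillis) {
                if (lastWasLow) {
                    toggleTimes.addLast(nowMillis);      // record a Low -> High toggle
                    lastWasLow = false;
                }
                // keep only toggles inside the sliding 10 s window
                while (!toggleTimes.isEmpty() && nowMillis - toggleTimes.peekFirst() > 10_000) {
                    toggleTimes.removeFirst();
                }
                if (toggleTimes.size() >= n) {
                    lockoutUntil = nowMillis + 20_000;   // start (or extend) the lockout
                }
            }

            public boolean isSuccess() { return !failed; }
        }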

  21. Example #2 of Poor Specification Coverage. Customer cognitive requirement (time-series), expressed as a statechart-assertion for RV and EMC:
      "Whenever the track count's Average Arrival Rate (ART) exceeds 80% of MAX_COUNT_PER_MIN, ART must be reduced back to 50% of MAX_COUNT_PER_MIN within 2 minutes, and ART must remain below 60% of MAX_COUNT_PER_MIN for at least 10 minutes."
      [Figure: statechart-assertion and a timeline annotated with the 2-minute and 10-minute windows.]

  22. More about Specification Languages: LTL or Büchi automata vs. statechart-assertions.
      LTL and Büchi automata have lower specification coverage and are more expensive to use; a partial list of reasons:
      • Theoretical: weak descriptive power (ω-regular at best).
      • Hard to use: the National Team can attest w.r.t. LTL.
      • Lack of support for the most basic constraints (real time).
      • Infinite-sequence semantics.
      • They are propositional (e.g., Always P -> Eventually Q), while real systems are both conditional (propositional) and event-driven (see the UML standard).

  23. Example of Poor Program Coverage: can we verify the property in the context of the REAL code? Program: InfusionPump.java

  24. Validation using JUnit or MSC. The StateRover uses JUnit-based simulation for validation. JUnit-based scenario:
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        assertTrue(assertion.isSuccess());
      Three Q's after the first P. Is that OK? It depends on the cognitive expectation behind "No more than N (e.g., 2) Q events can follow a P event".
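      For a self-contained picture of this JUnit-based validation style, here is a sketch using plain JUnit 4 and a hand-written stand-in for the generated assertion class. The PQAssertion class, its methods, and its reset-on-P behavior are assumptions for illustration only; StateRover generates the real assertion class from the statechart-assertion.

        import static org.junit.Assert.assertFalse;
        import static org.junit.Assert.assertTrue;
        import org.junit.Test;

        // Hypothetical assertion: "no more than 2 Q events can follow a P event".
        class PQAssertion {
            private int qCountSinceP = 0;
            private boolean seenP = false;
            private boolean failed = false;

            void P() { seenP = true; qCountSinceP = 0; }   // a P (re)starts the count
            void Q() { if (seenP && ++qCountSinceP > 2) failed = true; }
            boolean isSuccess() { return !failed; }
        }

        public class PQAssertionValidationTest {
            @Test
            public void mainScenarioSatisfiesAssertion() {
                PQAssertion assertion = new PQAssertion();
                assertion.P();
                assertion.Q();
                assertion.Q();          // two Q's after a P: allowed
                assertTrue(assertion.isSuccess());
            }

            @Test
            public void threeQsAfterOnePViolateAssertion() {
                PQAssertion assertion = new PQAssertion();
                assertion.P();
                assertion.Q();
                assertion.Q();
                assertion.Q();          // third Q after the P: violation expected
                assertFalse(assertion.isSuccess());
            }
        }

      Note that the slide's scenario P, Q, Q, P, Q passes or fails depending on whether a second P resets the count, which is exactly the cognitive-expectation question the slide raises; in this sketch a second P does reset it.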

  25. Validation: What Can Go Wrong?
      1. The assertion is incorrect (usually where blame is assigned).
      2. The natural language is ambiguous.
      3. The NL was written for the main scenario and doesn't work as well for other scenarios.
      4. The validation scenario is not what we think it is…
      JUnit-based scenario:
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        assertTrue(assertion.isSuccess());
      Three Q's after the first P. Is that OK? It depends on the cognitive expectation behind "No more than N (e.g., 2) Q events can follow a P event".

  26. Thank you

  27. Blunt User Questions. Q1: A property says "the light must be on for at least 5 seconds after the door opens". My program already implements that; why write a spec-property for it?
      A: Indeed, if everything we implemented were always correct, the world would be a nice place…
      When the implementation changes, who is the "lobbyist" for this requirement? We need a separate representative for each requirement.

  28. Blunt User Questions. Q2: Why not write the specification in Java (or in the language of the model)?
      A: We write specs as statechart-assertions. The motivation for not writing them in Java is the same motivation that applies to using a code generator in general.
      Example requirement: "No more than N newCar events can follow a newTruck event."

  29. Blunt User Questions. Q3: What's the difference between a model and a program?
      A: Abstraction. Once the model has sufficient detail to be used as source code, it is a program. That is how StateRover statechart models/programs are used.

  30. Blunt User Questions. Q4: Who says the spec is correct?
      A: Validation. The StateRover uses JUnit-based simulation for validation. For "No more than N (e.g., 2) Q events can follow a P event", a JUnit-based scenario:
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        assertTrue(assertion.isSuccess());
      Three Q's after the first P. Is that OK? It depends on the cognitive expectation.

  31. Comments of IV&V Director Dr. Caffal:
      1. Natural language requirements are typically vague, inconsistent, and incomplete.
      2. Natural language requirements frequently have counter-examples to the expressed logic. The counter-examples are not easily observed by reading the requirement.
      3. Unlike other disciplines, software developers oftentimes do not employ tools to describe behavior and elicit requirements.
      4. Behavior specification comes in three flavors: what we want the system to do, what we do not want the system to do, and what we want the system to do under adverse conditions.
      5. It is nearly impossible to detect missing requirements.
      6. Natural language requirements typically express constraints and limitations; they rarely express behavior. The Team was hard pressed to come up with behavioral requirements.
      7. Without specifying behavior, developers implicitly allow programmers to define behaviors. As such, system behaviors emerge without design and structure. Thus, emergent behaviors of systems are frequently an unhappy surprise to developers.

  32. Behavioral specifications about what we want the system to do: "Whenever a stop command is received, the vehicle should reach a complete stop within 30 seconds."
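      A minimal sketch of this "positive", bounded-response behavior as a hand-written monitor. The class, its event methods, and the speed-sampling interface are hypothetical; a statechart-assertion would carry a timer instead.

        // Hypothetical monitor: "whenever a stop command is received,
        // the vehicle must reach a complete stop within 30 seconds".
        public class StopWithin30sAssertion {
            private long stopCommandedAt = -1;   // -1: no pending stop command
            private boolean failed = false;

            public void stopCommand(long nowMillis) {
                if (stopCommandedAt < 0) {
                    stopCommandedAt = nowMillis;             // start the 30 s deadline
                }
            }

            public void speedSample(double metersPerSecond, long nowMillis) {
                if (stopCommandedAt < 0) {
                    return;                                  // no deadline pending
                }
                if (metersPerSecond == 0.0) {
                    stopCommandedAt = -1;                    // stopped in time
                } else if (nowMillis - stopCommandedAt > 30_000) {
                    failed = true;                           // still moving after the deadline
                }
            }

            public boolean isSuccess() { return !failed; }
        }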

  33. Behavioral specifications about what we do not want the system to do ("negative behavior"): "The pump should never operate until at least two seconds after valve-shut."
      This is where the end user says: "I've already implemented this behavior the positive way; why do I need a negative-behavior assertion?"
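      A matching sketch of this "negative", minimum-separation behavior, read here as: after each valve-shut event the pump must not operate for the next two seconds. Names are hypothetical; a stricter reading, where a prior valve-shut is a precondition for any pump operation at all, would need a different monitor.

        // Hypothetical monitor: "the pump should never operate until
        // at least two seconds after valve-shut".
        public class PumpAfterValveShutAssertion {
            private long lastValveShutAt = -1;   // -1: no valve-shut event seen yet
            private boolean failed = false;

            public void valveShut(long nowMillis) {
                lastValveShutAt = nowMillis;
            }

            public void pumpOperate(long nowMillis) {
                // violation only if a valve-shut happened less than 2 s ago
                if (lastValveShutAt >= 0 && nowMillis - lastValveShutAt < 2_000) {
                    failed = true;
                }
            }

            public boolean isSuccess() { return !failed; }
        }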

  34. Behavioral specifications about what the system will do under adverse conditions (recovery).

  35. Doing More for Validation. Under development: a tool that points out missing assertion simulation/validation scenarios.

  36. Thank You

  37. backup

  38. Behavioral specifications about what we do not want the system to do ("negative behavior"): "As of cruiseSet, speed should not change by more than 2% unless the incline is more than 5% for more than 10 seconds."
      [Figure: timeline contrasting speed-instability and stable intervals, annotated with 5 sec markers.]
      The NL is ambiguous:
      a. the incline occurs after the speed instability
      b. the incline occurs during the speed instability

  39. Behavioral specifications about what we do not want the system to do ("negative behavior").
      Negative statement: "As of cruiseSet, speed should not change by more than 2% unless the incline is more than 5% for more than 10 seconds."
      Positive statement: "As of cruiseSet, speed should be 98% stable unless the incline is more than 5% for more than 10 seconds."
      The key point about negative behavior is not the way it's phrased. It's the fact that a system is built to do the positive, so it is assumed the negative is already taken care of.
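      Formalizing this requirement forces a choice between the two readings from the previous slide. The sketch below (hypothetical names and sampling interface) implements reading (b), where a deviation is excused only if the incline has already been above 5% for more than 10 seconds when the deviation occurs; reading (a), where a later sustained incline may excuse an earlier deviation, would require the monitor to hold the violation as pending instead of failing immediately.

        // Hypothetical monitor for one reading of the cruise-control requirement:
        // a speed deviation of more than 2% from the cruiseSet speed is a violation
        // unless, at the moment of the deviation, the incline has already been
        // above 5% for more than 10 seconds.
        public class CruiseStabilityAssertion {
            private double cruiseSpeed = Double.NaN;   // NaN until cruiseSet is seen
            private long inclineAbove5Since = -1;      // -1: incline currently <= 5%
            private boolean failed = false;

            public void cruiseSet(double speedMps) {
                cruiseSpeed = speedMps;
            }

            public void inclineSample(double inclinePercent, long nowMillis) {
                if (inclinePercent > 5.0) {
                    if (inclineAbove5Since < 0) {
                        inclineAbove5Since = nowMillis;    // incline just crossed 5%
                    }
                } else {
                    inclineAbove5Since = -1;
                }
            }

            public void speedSample(double speedMps, long nowMillis) {
                if (Double.isNaN(cruiseSpeed)) {
                    return;                                // cruise not engaged yet
                }
                boolean deviates = Math.abs(speedMps - cruiseSpeed) > 0.02 * cruiseSpeed;
                boolean excused = inclineAbove5Since >= 0
                        && nowMillis - inclineAbove5Since > 10_000;
                if (deviates && !excused) {
                    failed = true;
                }
            }

            public boolean isSuccess() { return !failed; }
        }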
