Translation Validation of Compilers for Model-based Programming

Supratik Mukhopadhyay, supratik@csee.wvu.edu
Research Heaven, West Virginia

Why Model-based Programming?
• The most effective way to amortize software development cost is to make the software plug and play
• Immobots are programmed by specifying component models of hardware and software behavior to support plug and play
• Developing model libraries reduces design time, facilitates reuse, and amortizes modeling costs
• Reduces sensitivity to modeling inaccuracies and hardware errors
• Validation can be done in an early phase

Model-based Development at NASA
• Much-publicized use of the Remote Agent autonomy architecture in Deep Space
• The Mode Identification and Recovery (MIR) component uses the Lisp-based Livingstone (L1) Integrated Vehicle Health Management (IVHM) system
• Accepts models of the components of a system; infers the overall behavior of the system
• Being used in the next-generation shuttle project for vehicle health management

Livingstone: How it works
[Diagram: Livingstone (L2) Source in C++ → C++ Compiler → Livingstone Executable; Model in JMPL → JMPL Compiler → Model in XMPL; Model in XMPL → Livingstone Executable → System Behavior]
• Are these translations correct?

In other words…
• Is the right model getting fed to Livingstone?
• Is Livingstone correctly inferring the behavior of the system?

Things can go wrong…
for(i=0; i<=max; i++){ … }
• For implementations that disregard arithmetic overflows to improve performance, the loop may not terminate
[Flowchart: i=0 → test 0<=i<=max → yes: loop body …, then i++ and back to the test; no: exit]
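
The overflow hazard on this slide can be reproduced in a small, well-defined form. The sketch below is an illustration only, not taken from the slides: it assumes an 8-bit unsigned counter, because unsigned wraparound is defined in C++ whereas signed overflow is undefined behavior. The function name `count_iterations` and the `step_limit` safety cap are hypothetical additions so that the demonstration itself always returns.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many times the body of `for (i = 0; i <= max; i++)` runs when
// `i` is an 8-bit unsigned counter. `step_limit` is a safety cap added for
// this sketch. Returns -1 if the cap is exceeded, which is how the sketch
// reports "this loop shows no sign of terminating".
long count_iterations(std::uint8_t max, long step_limit) {
    long steps = 0;
    for (std::uint8_t i = 0; i <= max; i++) {
        // When max == 255, i wraps from 255 back to 0, so i <= max is
        // always true and only the cap stops the loop.
        if (++steps > step_limit) return -1;
    }
    return steps;
}
```

Calling `count_iterations(254, 1000)` runs the body for i = 0 through 254, while `count_iterations(255, 1000)` hits the cap because the counter wraps before the test can fail.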

Things can go wrong…
• Actual machines have finite stack sizes, while programming languages allow unbounded recursion

Why do we care?
[Diagram: Livingstone (L2) Source in C++ → C++ Compiler → Livingstone Executable; Model in JMPL → JMPL Compiler → Model in XMPL; Model in XMPL → Livingstone Executable → System Behavior. The compiler steps are marked "Validate these".]
• Validating high-level source code is useless if correctness does not transfer to the machine code that is finally executed

Why Validate Translations?
• Mistrust in compilers is one of the reasons why safety-critical software is certified at the level of machine or assembly code
• Results:
  • increased time and cost
  • error-prone
  • difficult to maintain; no modularity
  • difficult to reuse
• Vulnerability to ‘self-modifying’ code
• Question:
  • How to bridge such a huge gap in the software development cycle?

Why Validate Translations?
• Answer 1:
  • Hoare, Mueller-Olm et al.: verify the compiler
  • Feasible?
  • Too complicated; too many details
  • Equally time-consuming and costly
  • ‘Freezes’ updates to the compiler
• Answer 2:
  • Validate each run of the compiler individually
  • Manageable; no need to go into the low-level compiler details
  • Independent of the particular compiler; depends only on the source and target languages

Why is the model-based landscape so special?
• Involves concurrency and components
• Embedded and real-time aspects
• More high-level than traditional programs
• Procedural (Livingstone C++)
• Object-oriented (source of L2)
• Declarative (JMPL)
• Object-oriented to unstructured
• Declarative to declarative
• Declarative to procedural (e.g., MPL to SMV)
• Dynamics
• Optimizations

Which parts are important?
[Diagram: Source Code → Scan → Parse → Generate Code → Target Code]
• Generate Code assigns correct target programs to ASTs; it is the most interesting stage, where bugs are most likely

So what do we need?
• Source code and target code represented using a common semantic framework
• Establish a refinement mapping from target code to source code
• Consideration: XMPL is in prefix notation
• Consideration: in the containers for “equals”, “or”, etc., XMPL allows n-ary arguments whereas JMPL allows 2 arguments
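
The two considerations above (prefix notation, and n-ary versus two-argument operators) mean that a chain of nested binary operators on the source side may correspond to a single n-ary operator on the target side. The sketch below shows one way to normalize such trees before comparing them. Everything in it is an assumption made for illustration: the `Expr` type, the helper names, and the parenthesized prefix rendering are hypothetical and do not reproduce actual JMPL or XMPL syntax. Flattening is applied only to operators the sketch treats as associative ("or", "and").

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical binary expression tree, as a two-argument form might produce.
struct Expr {
    std::string op;                  // operator name; empty for a leaf
    std::string name;                // leaf name; used only when op is empty
    std::shared_ptr<Expr> lhs, rhs;  // children; used only when op is set
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string& n) {
    return std::make_shared<Expr>(Expr{"", n, nullptr, nullptr});
}
ExprPtr bin(const std::string& op, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{op, "", std::move(l), std::move(r)});
}

// Assumption of this sketch: only these operators may be flattened.
bool is_associative(const std::string& op) { return op == "or" || op == "and"; }

// Gathers the operands of a chain of nested `op` nodes into `out`.
void collect(const ExprPtr& e, const std::string& op, std::vector<ExprPtr>& out) {
    if (e->op == op && is_associative(op)) {
        collect(e->lhs, op, out);
        collect(e->rhs, op, out);
    } else {
        out.push_back(e);
    }
}

// Renders a tree in an illustrative n-ary prefix form, e.g. "(or a b c)".
std::string to_prefix(const ExprPtr& e) {
    if (e->op.empty()) return e->name;
    std::vector<ExprPtr> args;
    collect(e->lhs, e->op, args);
    collect(e->rhs, e->op, args);
    std::string s = "(" + e->op;
    for (const auto& a : args) s += " " + to_prefix(a);
    return s + ")";
}
```

With these definitions, `to_prefix(bin("or", bin("or", leaf("a"), leaf("b")), leaf("c")))` yields `(or a b c)`, while a nested "equals" is left binary because the sketch does not treat it as associative.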

Translation Validation Technology Developed
• Use a symbolic logical semantic framework: Quantified Propositional Temporal Logic (QPTL) with fixpoints (for loops)
• Translate both source and target programs to their logical semantics (QPTL formulas)
• Developed an automatic tool to generate logical semantics from C++ source code; it can handle multi-threading in the source program
• Developed a classification methodology for acceptable and unacceptable failures in the target program

Translation Validation Technology Illustrated
• The tool obtains logical semantics (QPTL formulas) from C++ source code bottom-up
• For an assignment x=e; followed by code with semantics ψ, the semantics φ of the whole is φ = ◊(A ∨ ψ[x→e]), where A = set of acceptable failures

Establishing Refinement Mapping
• Refinement = the logical semantics of the target code entails that of the source code
• Refinement checking is done using a tool called Temporal Logic Verifier (TLV)
• TLV implements a decision procedure for QPTL, but not for the fixpoint part
• TLV is programmable; the decision procedure for the fixpoint part is being implemented on top of TLV in TLV-Basic
[Diagram labels: Source Code, Target Code, Automatic Tool, Refinement Calculus, Abstract Framework, TLV, Yes, Counterexample]
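
The definition of refinement as entailment can be made concrete in a much simpler setting than QPTL. The sketch below is a propositional stand-in, not the TLV decision procedure: it assumes formulas over a small number of boolean variables and checks entailment by enumerating every assignment, returning a counterexample when the target admits a behavior the source does not. The names `Formula`, `bit`, and `check_refinement` are hypothetical, and the code needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// A formula over n boolean variables, evaluated on an assignment packed
// into the low n bits of an unsigned integer (bit i = value of variable i).
using Formula = std::function<bool(std::uint32_t)>;

// Value of variable i in assignment a.
inline bool bit(std::uint32_t a, unsigned i) { return ((a >> i) & 1u) != 0; }

// Returns std::nullopt if `target` entails `source`, i.e. every assignment
// satisfying `target` also satisfies `source`. Otherwise returns the first
// assignment (in increasing numeric order) that satisfies `target` but not
// `source`. Assumes n_vars < 32.
std::optional<std::uint32_t> check_refinement(const Formula& target,
                                              const Formula& source,
                                              unsigned n_vars) {
    const std::uint32_t limit = 1u << n_vars;
    for (std::uint32_t a = 0; a < limit; ++a) {
        if (target(a) && !source(a)) return a;  // counterexample found
    }
    return std::nullopt;  // entailment holds for all assignments
}
```

Exhaustive enumeration grows exponentially with the number of variables, so it only serves to show the shape of the check (a Yes answer or a counterexample), not how a tool would scale.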

Refinement of Source Code
• Tool built using Lex/Yacc and 500 lines of Awk code
• Used our tool to automatically generate the logical semantics of methods in L2 code written in C++
• 1000 lines of code handled in less than 10 seconds
• A refinement calculus for JMPL is currently being implemented

Abstraction of Target Code
• Currently developing an abstraction calculus for the assembly and machine language of the Pentium 4
• An abstraction calculus for XMPL is being implemented

State Space Explosion
• Abstraction and refinement lead to state explosion
• Need to be less ambitious
• More “abstract” methods coming up

New Methods for Refinement Checking
• Randomized refinement checking
  • At each branching point, pretend to go along all branches, with different probabilities
• Bounded refinement checking and refinement testing
  • Bound the size of the models built by TLV; experiments show this is faster at finding counterexamples
  • Automatically generate (based on the specifications of the source code) a sequence of models and check whether they are counterexamples
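
The idea of generating candidate models and checking whether each is a counterexample can be sketched in a propositional setting. This is an assumption-laden illustration, not the method used with TLV: candidate models are random boolean assignments, the search stops after a fixed budget, and the names `Formula` and `sample_counterexample` are hypothetical. The code needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <random>

// A formula over n boolean variables, evaluated on an assignment packed
// into the low n bits of a 64-bit unsigned integer.
using Formula = std::function<bool(std::uint64_t)>;

// Draws up to `budget` random assignments and returns the first one that
// satisfies `target` but not `source`. Returns std::nullopt when the budget
// is exhausted, which does NOT prove that refinement holds.
std::optional<std::uint64_t> sample_counterexample(const Formula& target,
                                                   const Formula& source,
                                                   unsigned n_vars,
                                                   unsigned budget,
                                                   std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    // Keep only the low n_vars bits of each random draw.
    const std::uint64_t mask =
        (n_vars >= 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << n_vars) - 1);
    for (unsigned k = 0; k < budget; ++k) {
        const std::uint64_t a = rng() & mask;
        if (target(a) && !source(a)) return a;  // genuine counterexample
    }
    return std::nullopt;
}
```

The trade-off is the one the slide hints at: a bounded or sampled search can find counterexamples quickly, but a run that finds none is only evidence, whereas any assignment it does return is a real violation.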

Validating Compiler Optimizations
• Optimizations are a potential cause of introduced errors
• Code motion can convert a terminating program to a non-terminating one and vice versa
• Most compiler optimizations are conveniently represented as rewrite rules of the form I → I′, φ, where φ is a logical condition

Rewriting and Static Analysis
[Diagram: Source Code → Optimizer → Optimized Code]
• Developed a preliminary tool for validating compiler optimizations, combining rewriting and static analysis
• Binds free variables in conditions to program locations and program variables
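
A rewrite rule with a side condition, whose free variable is bound to a program variable and discharged by static analysis facts, might look like the sketch below. It is a toy illustration and not the tool described on the slide: the rule (`x / x` rewritten to `1` provided `x` is known to be non-zero), the `Facts` structure, and the string-based instruction format are all hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Facts assumed to come from a separate static analysis pass.
struct Facts {
    std::set<std::string> nonzero;  // variables proven non-zero at this location
};

// Hypothetical rule I -> I', phi:
//   I   = "v / v"
//   I'  = "1"
//   phi = "v is non-zero at this program location"
// The free variable v is bound to whatever variable appears in the
// instruction. Returns the rewritten text, or the input unchanged if the
// pattern does not match or the condition cannot be established.
std::string rewrite_self_division(const std::string& instr, const Facts& facts) {
    const std::string sep = " / ";
    const auto slash = instr.find(sep);
    if (slash == std::string::npos) return instr;         // not a division
    const std::string lhs = instr.substr(0, slash);
    const std::string rhs = instr.substr(slash + sep.size());
    if (lhs != rhs) return instr;                         // pattern v / v fails
    if (facts.nonzero.count(lhs) == 0) return instr;      // phi not established
    return "1";
}
```

Checking the condition before rewriting is the point of the example: without the non-zero fact, replacing `y / y` by `1` would silently remove a possible division by zero and change the program's behavior.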

Translation Validation: System Architecture
[Diagram labels: Source Code, Compiler, Target Code, Refinement tool, Abstraction tool, Translation Validator, TLV, Counterexample, Bad, Proof Script, Rudimentary Proof Checker, OK, Fault indication (Not OK)]

Current status
• Automatic tool for logical semantics of C++ code: implemented
• Abstraction calculus for Pentium 4 assembly code: developed, currently under implementation
• Preliminary tool for validating compiler optimizations: implemented
• Refinement calculus for JMPL: developed, to be implemented
• Experiments with new methods for refinement checking conducted
  • Found bounded refinement checking to be faster in some cases
• Preliminary case studies on Livingstone source code
  • Translated several methods of Livingstone to their logical semantics
  • Maximum ~1400 lines, taking < 12 seconds

To do… (next quarter)
• Developing and implementing an abstraction calculus for XMPL and Pentium 4 machine language
• Studying and developing an abstraction calculus for PowerPC machine language
• Completing the pending implementations
• More rigorous case studies

Related Work
• Translation Validation for Synchronous Languages (Pnueli et al.)
• Proof-carrying compilation (Necula et al.)
• Compiler verification (Hoare, Mueller-Olm et al.)

Lessons learnt
• Semi-automatic tools for translation validation are possible
• Features of model-based programming bring both advantages (less data dependency) and disadvantages (communication)
• Use a combination of techniques
• Supratik’s law: software reliability can be transferred from source to target code (reliability can be compiled)
