
Code-Carrying Theory

Aytekin Vargun, Rensselaer Polytechnic Institute



  1. Code-Carrying Theory Aytekin Vargun Rensselaer Polytechnic Institute

  2. Outline • Introduction • Proof-Carrying Code (PCC) • Code-Carrying Theory (CCT) • Generic Proofs • Organizing Theorems and Proofs • Conclusions and Future Work

  3. Potential Problems to be Solved • Memory Safety: illegal operations or illegal access to memory • Security: unauthorized access to data or system resources • Functional Correctness: whether the code correctly does what it is formally required to do

  4. Two Solutions • Proof-Carrying Code (PCC) • Code-Carrying Theory (CCT)

  5. Proof-Carrying Code (PCC) • Developed by Necula and Lee [1996] at CMU. • Basic Idea: Use machine-checkable proofs as certificates. • Proof construction is harder than proof checking • Code producer provides the proof • Code consumer checks it

  6. Code-Carrying Theory (CCT) • The consumer gives the specification of the function • The producer starts with axioms that define functions • The axioms have a form from which executable code is easy to extract • The producer proves that the defined functions meet three requirements: termination, consistency, and correctness

  7. Code-Carrying Theory (CCT) • The producer transmits axioms, theorems, and proofs; no code is transmitted explicitly • The consumer checks the proofs to see whether the theorems are proved • If proof checking succeeds, the consumer applies the code extractor to the axioms and obtains the executable code
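The consumer-side gate described on this slide can be sketched in Python as follows. This is only an illustration: `check_proof` and `extract_code` are hypothetical callables standing in for the proof checker and the code extractor, and the `package` dictionary layout is an assumption, not the format used by CCT.

```python
def consume(package, check_proof, extract_code):
    """Return extracted code only if every transmitted theorem checks.

    package       -- assumed layout: {"axioms": [...], "theorems": [(theorem, proof), ...]}
    check_proof   -- hypothetical stand-in for the consumer's proof checker
    extract_code  -- hypothetical stand-in for the code extractor
    """
    for theorem, proof in package["theorems"]:
        if not check_proof(package["axioms"], theorem, proof):
            return None  # reject: nothing is extracted, nothing runs
    return extract_code(package["axioms"])
```

The point of the design is that the code never travels: a failed proof check means the extractor is never invoked.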

  8. PCC/CCT Differences • PCC starts from code and assertions; CCT starts from assertions only and later extracts code from them • PCC concentrates on safety properties, which are relatively easy to prove fully automatically; we have concentrated on functional correctness properties, which are more difficult • We concentrate more on the proof issues raised by these more challenging properties, and less on programming language issues or other issues that PCC deals with more directly

  9. Code-Carrying Theory (CCT) • Proving Termination: use TCGEN to produce the termination condition (TC) and termination axioms, then prove the TC • Proving Consistency: use CCGEN to produce the consistency condition (CC), then prove it • Proving Correctness: prove the correctness conditions (CTC) given by the consumer

  10. [Diagram: CCT workflow] • Code Producer: from the function-defining axioms (FDA), the general requirements are the Consistency Condition (CC) and the Termination Condition (TC); once both CC and TC are proved, assert the FDA and prove the correctness condition (CTC) from the application-specific requirements; send the axioms (FDA) and the proofs • Code Consumer: the proof checker checks the proof of CC and the proof of TC; if both proofs check, assert the FDA and check the proof of CTC; if the CTC is proved, CODE goes to the CPU

  11. [Diagram: tampering scenario] • A Hacker sits between the Code Producer (Axioms (FDA), CC, TC, Proofs) and the Code Consumer • The consumer's proof checker works with a different CC and a different TC, then follows the same steps: check the proofs of CC and TC, assert the FDA if both proofs check, check the proof of CTC, and pass CODE to the CPU if the CTC is proved

  12. [Diagram: tampering scenario] • Same layout as slide 11, with the Hacker between producer and consumer; here the consumer's CC and TC are not marked as different

  13. [Diagram: tampering scenario] • Same as slide 11: the consumer's proof checker works with a different CC and a different TC

  14. Issues • Encoding axioms and proofs • Proof checking • Implementation of CCGEN, TCGEN, and CODEGEN

  15. ATHENA • Implemented by K. Arkoudas • A language for both: • Ordinary Computation • Logical Deduction

  16. ATHENA: Ordinary Computation Language • Provides higher-order functions • Has primitive functions for • Unification • Matching • Substitution

  17. ATHENA: Logical Language • Special Deductive Forms • dcheck, dseq, assume, … • Primitive Deduction Methods • mp, both, left-and, … • Declarations • structure, declare, … • Directives • load-file, clear-assumption-base, … • Calls to external automatic resolution theorem provers like SPASS and Vampire

  18. ATHENA: Advantages • Better Proof Readability • Machine-checkable proofs • Makes it possible to formulate and write proofs as methods • Good for writing generic proofs • write the proof once and instantiate it to prove specific cases • But:

  19. ATHENA: No built-in rewriting methods • We added the following methods to support equational rewriting: • (setup c t): initializes cell c with term t • (reduce c u E): attempts to transform the term t in c into the given term u, using theorem E as a left-to-right rewriting rule • (expand c u E): attempts to transform the term t in c into the given term u, using theorem E as a right-to-left rewriting rule • (combine left right): deduces (= t u) if left contains (= t t’), right contains (= u u’), and t’ and u’ are identical terms

  20. CCT - Tools • Small trusted computing base • TCGEN + CCGEN + CODEGEN ≈ 1000 lines • Tested with hundreds of axioms/theorems and more than 10,000 lines of proofs

  21. Termination of a function • Termination is undecidable • But it can be solved in special cases • Does a measure of arguments decrease in the ordering with each recursive call of the function? • This requires an ordering relation to be defined every time

  22. TCGEN: Termination of a function • Our approach is similar but does not use an ordering relation • We construct the proof of termination as a proof by induction that mirrors the recursion structure in the axioms • We generate a termination axiom for each axiom • Construct a termination condition • Prove the termination condition using the termination axioms

  23. Function-defining Axioms:
         (forall ?x (= (power ?x zero) one))
         (forall ?x ?n (= (power ?x (succ ?n)) (Times ?x (power ?x ?n))))
     Steps:
     • Rename power to power_t
     • Check the right-hand sides. If the rhs is a constant, eliminate it (one is a constant)
     • If there are nested function applications in the rhs, conjoin them
     • Construct an implication from the new lhs and rhs: ``if rhs lhs’’
     • Eliminate the applications of known total functions (Times_t is total)
     • Assert these and prove the termination condition
     Termination Axioms (before eliminating total functions):
         (forall ?x (power_t ?x zero))
         (forall ?x ?n (if (and (power_t ?x ?n) (Times_t ?x (power ?x ?n))) (power_t ?x (succ ?n))))
     Termination Axioms (after eliminating total functions):
         (forall ?x (power_t ?x zero))
         (forall ?x ?n (if (power_t ?x ?n) (power_t ?x (succ ?n))))
     Termination Condition:
         (forall ?x ?n (power_t ?x ?n))
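The steps on this slide can be sketched in Python for the simple case of unconditional equational axioms. This is an illustration under assumptions, not TCGEN itself: terms are nested tuples, the caller supplies which symbols are functions (as opposed to constructors such as succ) and which of those are known to be total, and all helper names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def variables(t, seen=None):
    """Variables of t in order of first appearance."""
    seen = [] if seen is None else seen
    if is_var(t):
        if t not in seen:
            seen.append(t)
    elif isinstance(t, tuple):
        for arg in t[1:]:
            variables(arg, seen)
    return seen

def applications(t, functions):
    """Function applications inside t, innermost first."""
    found = []
    if isinstance(t, tuple):
        for arg in t[1:]:
            found += applications(arg, functions)
        if t[0] in functions:
            found.append(t)
    return found

def rename(app):
    # power -> power_t; the arguments are left untouched.
    return (app[0] + "_t",) + app[1:]

def termination_axiom(lhs, rhs, functions, total):
    head = rename(lhs)
    conds = [rename(a) for a in applications(rhs, functions) if a[0] not in total]
    if not conds:                      # constant rhs: nothing to require
        body = head
    elif len(conds) == 1:
        body = ("if", conds[0], head)
    else:
        body = ("if", ("and",) + tuple(conds), head)
    return ("forall",) + tuple(variables(lhs)) + (body,)

def show(t):
    return t if isinstance(t, str) else "(" + " ".join(show(a) for a in t) + ")"

power_axioms = [
    (("power", "?x", "zero"), "one"),
    (("power", "?x", ("succ", "?n")), ("Times", "?x", ("power", "?x", "?n"))),
]
for lhs, rhs in power_axioms:
    print(show(termination_axiom(lhs, rhs, {"power", "Times"}, {"Times"})))
```

Passing an empty set for `total` reproduces the intermediate axiom that still mentions Times_t; conditional axioms would need an extra case that this sketch does not cover.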

  24. CCGEN: Consistency of axioms • Input is function-defining axioms • Output is a predicate (the consistency condition) • It states that it is possible to define a function that satisfies the axioms: • For every tuple of values of the function domain, there exists a range value y

  25. Function-defining Axioms:
         (forall ?y (= (f ?y zero) one))
         (forall ?x ?y (if (not (= ?y zero)) (= (f ?x ?y) two)))
     Steps:
     • Rename ?y to ?w
     • Add or update conditions
     • Replace (f ?x ?w) with ?y, conjoin the propositions, and add ``exists ?y’’
     Axioms after renaming and adding conditions:
         (forall ?x ?w (if (= ?w zero) (= (f ?x ?w) one)))
         (forall ?x ?w (if (not (= ?w zero)) (= (f ?x ?w) two)))
     Consistency Condition:
         (forall ?x ?w
           (exists ?y
             (and (if (= ?w zero) (= ?y one))
                  (if (not (= ?w zero)) (= ?y two)))))
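The last step on this slide, building the existential statement from the normalized axioms, can be sketched in Python. The sketch assumes the renaming and condition-adding steps have already been done, so each axiom arrives as a `(condition, value)` pair; the function name and input layout are hypothetical and are not CCGEN's interface.

```python
def show(t):
    return t if isinstance(t, str) else "(" + " ".join(show(a) for a in t) + ")"

def consistency_condition(params, clauses, result="?y"):
    """params: the function's parameters; clauses: (condition, value) pairs."""
    parts = tuple(("if", cond, ("=", result, value)) for cond, value in clauses)
    body = parts[0] if len(parts) == 1 else ("and",) + parts
    return ("forall",) + tuple(params) + (("exists", result, body),)

# The two normalized axioms for f from the slide.
clauses = [
    (("=", "?w", "zero"), "one"),
    (("not", ("=", "?w", "zero")), "two"),
]
cc = show(consistency_condition(["?x", "?w"], clauses))
print(cc)
```

The result variable ?y is why the slide first renames the axioms' own ?y to ?w: the name has to be free for the existential quantifier.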

  26. Proving Correctness

  27. Application-specific Requirements (from the consumer):
         (define sum-list@-empty
           (= (sum-list@ Nil) zero))
         (define sum-list@-nonempty
           (forall ?L ?x
             (= (sum-list@ (Cons ?x ?L)) (Plus ?x (sum-list@ ?L)))))
     • Note: executable but inefficient code can be extracted from these axioms
     Function-defining Axioms (Producer), which define an efficient function:
         (define sum-list-empty
           (= (sum-list Nil) zero))
         (define sum-list-nonempty
           (forall ?L ?x
             (= (sum-list (Cons ?x ?L)) (sum-list-compute ?L ?x))))
         (define sum-list-compute-empty
           (forall ?x
             (= (sum-list-compute Nil ?x) ?x)))
         (define sum-list-compute-nonempty
           (forall ?L ?x ?y
             (= (sum-list-compute (Cons ?y ?L) ?x)
                (sum-list-compute ?L (Plus ?x ?y)))))
     Correctness Condition:
         (define sum-list-correctness
           (forall ?L (= (sum-list ?L) (sum-list@ ?L))))
     Correctness Proof (Producer):
         (by-induction sum-list-correctness
           (Nil
             (dseq
               (!setup left (sum-list Nil))
               (!setup right (sum-list@ Nil))
               (!reduce left zero sum-list-empty)
               (!reduce right zero sum-list@-empty)
               (!combine left right)))
           ((Cons x L)
             (dseq
               (!setup left (sum-list (Cons x L)))
               (!setup right (sum-list@ (Cons x L)))
               (!reduce left (sum-list-compute L x) sum-list-nonempty)
               (!reduce right (sum-list-compute L x) sum-list-compute-relation)
               (!combine left right))))
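As a sanity check on what this slide proves, the two definitions can be transcribed into Python and compared on sample inputs. This is testing, not proof: the Athena proof covers all lists by induction, while the sketch below only illustrates the statement. The Python names are transliterations chosen here, and Python integers stand in for the naturals.

```python
def sum_list_spec(xs):
    """sum-list@: the consumer's specification (not tail-recursive)."""
    return 0 if not xs else xs[0] + sum_list_spec(xs[1:])

def sum_list_compute(xs, acc):
    """sum-list-compute: accumulator-passing helper from the producer's axioms."""
    return acc if not xs else sum_list_compute(xs[1:], acc + xs[0])

def sum_list(xs):
    """sum-list: the producer's efficient definition."""
    return 0 if not xs else sum_list_compute(xs[1:], xs[0])

samples = [[], [7], [1, 2, 3], [5, 0, 5, 10]]

# Correctness condition: sum-list and sum-list@ agree.
assert all(sum_list(xs) == sum_list_spec(xs) for xs in samples)

# The lemma used in the induction step (sum-list-compute-relation).
assert all(sum_list_spec([x] + xs) == sum_list_compute(xs, x)
           for xs in samples for x in (0, 4))
```

The induction step of the proof rests entirely on the lemma in the second check, which links the specification to the accumulator-passing helper.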

  28. Application-specific Requirements (from the consumer):
         (define reverse-range-Correctness
           (forall ?i ?j
             (if (valid (range ?i ?j))
                 (forall ?M
                   (= (access-range (reverse-range ?M (range ?i ?j)) (range ?i ?j))
                      (reverse (access-range ?M (range ?i ?j))))))))
     • Note: the specification is not executable. The correctness condition itself is a specification.
     • Note: the proof is by range induction
       • Basis cases:
         • Empty range: (range i i)
         • Range of one element: (range i (++ i))
       • Induction step:
         • Assume the property for (range (++ i) (-- j))
         • Show that it holds for (range i j)
     Function-defining Axioms (Producer):
         (define reverse-empty-range-axiom
           (forall ?i ?M
             (= (reverse-range ?M (range ?i ?i))
                ?M)))
         (define reverse-nonempty-range-axiom1
           (forall ?i ?j ?M
             (if (and (not (= ?i ?j))
                      (= (++ ?i) ?j))
                 (= (reverse-range ?M (range ?i ?j))
                    ?M))))
         (define reverse-nonempty-range-axiom2
           (forall ?i ?j ?M
             (if (and (valid (range ?i ?j))
                      (and (not (= ?i ?j))
                           (not (= (++ ?i) ?j))))
                 (= (reverse-range ?M (range ?i ?j))
                    (reverse-range (swap ?M (* ?i) (* (-- ?j)))
                                   (range (++ ?i) (-- ?j)))))))

  29. CODEGEN: Code Extraction • Quantified Equations and Conditional Equations • These are clauses of a recursive function definition • CODEGEN has to be able to combine these into a recursive function • Target language is currently Oz • Oz has pattern matching • Possible to extract efficient code: Oz has ``last call optimization’’, so it executes tail-recursive functions in constant stack size

  30. CODEGEN: Code Extraction • Can extract both: • Memory-observing functions (examine data structures but don’t make any changes) • access, access-range, sum-list, find, find-if, power • Memory-updating functions (make in-place changes) • assign, assign-range, swap, reverse-range, rotate, copy • Does optimizations when necessary

  31. Function-defining Axioms (Producer):
         (define sum-list-empty
           (= (sum-list Nil) zero))
         (define sum-list-nonempty
           (forall ?L ?x
             (= (sum-list (Cons ?x ?L)) (sum-list-compute ?L ?x))))
         (define sum-list-compute-empty
           (forall ?x
             (= (sum-list-compute Nil ?x) ?x)))
         (define sum-list-compute-nonempty
           (forall ?L ?x ?y
             (= (sum-list-compute (Cons ?y ?L) ?x)
                (sum-list-compute ?L (Plus ?x ?y)))))
     Code Extraction (Consumer):
         fun {SumList L}
            case L of nil then 0
            [] X|L then {SumListCompute L X}
            end
         end
         fun {SumListCompute L X}
            case L of nil then X
            [] Y|L then {SumListCompute L (X + Y)}
            end
         end
     • Note: SumListCompute has two variables, but ``case [L X]’’ has been optimized to ``case L’’ by CODEGEN

  32. Function-defining Axioms (Producer):
         (define reverse-empty-range-axiom
           (forall ?i ?M
             (= (reverse-range ?M (range ?i ?i))
                ?M)))
         (define reverse-nonempty-range-axiom1
           (forall ?i ?j ?M
             (if (and (not (= ?i ?j))
                      (= (++ ?i) ?j))
                 (= (reverse-range ?M (range ?i ?j))
                    ?M))))
         (define reverse-nonempty-range-axiom2
           (forall ?i ?j ?M
             (if (and (valid (range ?i ?j))
                      (and (not (= ?i ?j))
                           (not (= (++ ?i) ?j))))
                 (= (reverse-range ?M (range ?i ?j))
                    (reverse-range (swap ?M (* ?i) (* (-- ?j)))
                                   (range (++ ?i) (-- ?j)))))))
     Code needs to be optimized:
         fun {ReverseRange M R}
            case R of range(I I) then M
            [] range(I J) then
               if {And {Not (I == J)} {Not ({`++` I} == J)}} then
                  {ReverseRange {Swap M {`*` I} {`*` {`--` J}}} range({`++` I} {`--` J})}
               elseif {And {Not (I == J)} ({`++` I} == J)} then M
               end
            end
         end
     • Note: CODEGEN optimizes it

  33. Code needs to be optimized:
         fun {ReverseRange M R}
            case R of range(I I) then M
            [] range(I J) then
               if {And {Not (I == J)} {Not ({`++` I} == J)}} then
                  {ReverseRange {Swap M {`*` I} {`*` {`--` J}}} range({`++` I} {`--` J})}
               elseif {And {Not (I == J)} ({`++` I} == J)} then M
               end
            end
         end
     Optimized Code:
         fun {ReverseRange M R}
            case R of range(I I) then M
            [] range(I J) then
               if {Not (I == J)} then
                  if ({`++` I} == J) then M
                  else {ReverseRange {Swap M {`*` I} {`*` {`--` J}}} range({`++` I} {`--` J})}
                  end
               end
            end
         end
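For readers unfamiliar with Oz, the optimized control structure can be mirrored in Python. This is a sketch under assumptions rather than CODEGEN output: memory is modeled as a Python list, iterators as integer indices, and the range as the half-open interval from i up to (not including) j, which is assumed to be valid (i <= j).

```python
def reverse_range(mem, i, j):
    """Reverse mem[i:j] in place, following the three axioms' case split."""
    if i == j:            # empty range: nothing to do
        return mem
    if i + 1 == j:        # one element: nothing to do
        return mem
    # General case: swap the end elements, then recurse on the inner range.
    mem[i], mem[j - 1] = mem[j - 1], mem[i]
    return reverse_range(mem, i + 1, j - 1)

print(reverse_range([1, 2, 3, 4, 5], 1, 4))
```

The shared test `I != J` is hoisted out of both branches in the optimized Oz code; the early returns here play the same role, so each comparison is evaluated at most once per call.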

  34. CODEGEN: Code Extraction • We have been working on simple functions. But: • In analogy to STL, it is useful to have a library of simple functions from which more complex functions can be composed, especially if the functions are generic • It is possible for CODEGEN to extract complex functions composed of such simple functions

  35. Generic Proof Writing • Proofs are very large • Generic proofs might be a solution: there is no need to develop and transmit similar proofs to the consumer • Generic proofs are harder to write, but once the consumer has them, they can be instantiated in many different ways • Athena is a higher-order language, so we can express generic functions and proofs

  36. Generic Proof Writing • Generic property definitions and proofs are constructed in the form of programs that are parameterized with operator mappings • Generic theorem: • it is a generic property • contains a single property, for which there is an associated generic proof • Provide functions which perform operator mappings • Instantiate the generic proof with a particular operator mapping later

  37. Generic Property Definitions in CCT
         (define (sum-list@-definition name ops)
           (let ((Plus (ops 'Plus))
                 (Zero (ops 'Zero)))
             (match name
               ('sum-list@-empty
                 (= (sum-list@ Nil) Zero))
               ('sum-list@-nonempty
                 (forall ?L ?y
                   (= (sum-list@ (Cons ?y ?L))
                      (Plus ?y (sum-list@ ?L))))))))
         (define (sum-list-compute-relation name ops)
           (match name
             ('sum-list-compute-relation
               (forall ?L ?x
                 (= (sum-list@ (Cons ?x ?L))
                    (sum-list-compute ?L ?x))))))
     • Parts of each definition: name and parameter list, local declarations (the let bindings), and the generic axioms or theorems (the match clauses)

  38. A Generic Proof Method in CCT
         (define (sum-list-correctness-proof name ops)
           (dlet ((Zero (ops 'Zero))
                  (left (cell true))
                  (right (cell true))
                  (prop (method (name) (!property name ops Sum-list-theory)))
                  (theorem (sum-list-correctness name ops)))
             (by-induction theorem
               (Nil
                 (dseq
                   (!setup left (sum-list Nil))
                   (!setup right (sum-list@ Nil))
                   (!reduce left Zero (!prop 'sum-list-empty))
                   (!reduce right Zero (!prop 'sum-list@-empty))
                   (!combine left right)))
               ((Cons x L)
                 (dseq
                   (!setup left (sum-list (Cons x L)))
                   (!setup right (sum-list@ (Cons x L)))
                   (!reduce left (sum-list-compute L x) (!prop 'sum-list-nonempty))
                   (!reduce right (sum-list-compute L x) (!prop 'sum-list-compute-relation))
                   (!combine left right))))))
     • Parts of the method: name and parameter list, local declarations (the dlet bindings), and the generic proof (the by-induction form)

  39. Instantiation of a Generic Axiom
     Generic definition:
         (define (sum-list@-definition name ops)
           (let ((Plus (ops 'Plus))
                 (Zero (ops 'Zero)))
             (match name
               ('sum-list@-empty
                 (= (sum-list@ Nil) Zero))
               ('sum-list@-nonempty
                 (forall ?L ?y
                   (= (sum-list@ (Cons ?y ?L))
                      (Plus ?y (sum-list@ ?L))))))))
     Operator mapping 1:
         (define (Monoid-ops op)
           (match op ('Plus Plus) ('Zero zero)))
     Instantiated Axioms:
         (= (sum-list@ Nil) zero)
         (forall ?L ?y (= (sum-list@ (Cons ?y ?L)) (Plus ?y (sum-list@ ?L))))
     Operator mapping 2:
         (define (Times-ops op)
           (match op ('Plus Times) ('Zero one)))
     Instantiated Axioms:
         (= (sum-list@ Nil) one)
         (forall ?L ?y (= (sum-list@ (Cons ?y ?L)) (Times ?y (sum-list@ ?L))))
     Operator mapping 3:
         (define (Monoid-ops op)
           (match op ('Plus Append) ('Zero Nil)))
     Instantiated Axioms:
         (= (sum-list@ Nil) Nil)
         (forall ?L ?y (= (sum-list@ (Cons ?y ?L)) (Append ?y (sum-list@ ?L))))
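The effect of swapping operator mappings can be illustrated with a small Python analogue, in which one generic definition is parameterized by a mapping from the abstract names 'Plus' and 'Zero' to concrete operations. The dictionaries and the name `fold_list` are illustrative choices made here; in particular `append_ops` is a hypothetical name for the third mapping, which the slide also calls Monoid-ops.

```python
import operator

def fold_list(xs, ops):
    """Generic sum-list@: Zero for the empty list, Plus head (fold tail) otherwise."""
    plus, zero = ops["Plus"], ops["Zero"]
    if not xs:
        return zero
    return plus(xs[0], fold_list(xs[1:], ops))

monoid_ops = {"Plus": operator.add, "Zero": 0}    # sums numbers
times_ops = {"Plus": operator.mul, "Zero": 1}     # multiplies numbers
append_ops = {"Plus": operator.add, "Zero": []}   # concatenates lists

print(fold_list([1, 2, 3, 4], monoid_ops))
print(fold_list([1, 2, 3, 4], times_ops))
print(fold_list([[1], [2, 3]], append_ops))
```

The definition is written once; only the mapping changes between instances, which is the same economy the generic proofs aim for.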

  40. Conclusions • CCT provides strong assurance for correctness • Only very small examples so far, but a basis for tackling larger examples • Readable proofs • Generic proof writing • Tools for organizing theorems and proofs

  41. Future Work • Test CCT with more examples, including the ones that are larger and more complex • Complete the extension of CODEGEN to check preconditions where necessary • Use CCT to prove safety properties • A Really Longer Term Goal: Verifying Compiler – Tony Hoare’s grand challenge problem

  42. Organizing Theorems and Proofs • We have a few hundred axioms, theorems, and proofs • Prove some lemmas and use them in the proofs of other theorems • Main idea: Group the related properties under the same theories • Searches for a stored theorem are faster

  43. Organizing Theorems and Proofs • We define a structured theory as an abstract data type with the following functions: • theory: creates a structured theory from a generic property function containing axioms • evolve: extends an existing structured theory with a new generic theorem and its proof • refine: creates a new structured theory as a composition of one or more existing structured theories and a generic property function • property: retrieves an instance of a generic property function and its corresponding proof
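One way to picture this abstract data type is the following Python sketch. It is a guess at the shape of the interface, not the Athena implementation: the representation (a dictionary from property names to a statement builder and an optional proof) is an assumption, and `get_property` is used in place of `property` only because `property` is a Python built-in.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Theory:
    # name -> (statement builder taking an operator mapping, proof or None for axioms)
    entries: dict = field(default_factory=dict)

def theory(axioms):
    """Create a structured theory from generic axioms (no proofs attached)."""
    return Theory({name: (stmt, None) for name, stmt in axioms.items()})

def evolve(th, name, stmt, proof):
    """Extend a theory with one new generic theorem and its proof."""
    return Theory({**th.entries, name: (stmt, proof)})

def refine(parents, axioms):
    """Compose existing theories with further generic axioms."""
    merged = {}
    for parent in parents:
        merged.update(parent.entries)
    merged.update(theory(axioms).entries)
    return Theory(merged)

def get_property(th, name, ops):
    """Instantiate a stored generic property with an operator mapping."""
    stmt, proof = th.entries[name]
    return stmt(ops), proof

# Hypothetical usage with the sum-list@ basis axiom.
ops = {"Plus": "Plus", "Zero": "zero"}
lists = theory({
    "sum-list@-empty": lambda o: ("=", ("sum-list@", "Nil"), o["Zero"]),
})
lists2 = evolve(lists, "sum-list-correctness",
                lambda o: ("forall", "?L", ("=", ("sum-list", "?L"), ("sum-list@", "?L"))),
                "proof-placeholder")
```

Keeping each theory immutable and building new ones in `evolve` and `refine` means an earlier theory stays usable after it has been extended.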

  44. [Diagram: theory hierarchy] • Naturals: zero, succ • Lists: Nil, Cons • Memory Theory: Access, Assign, Swap • Iterator Theory: ++, --, *, I-, I+, I-I • Range Theory: valid, range • Memory Range Theory: Access-range, Assign-range • Legend: ++ preincrement; -- predecrement; I- iterator subtraction; I+ iterator addition; I-I iterator difference
