
Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University

Lecture 3, July 12, 2001: VC Generation and Proof Representation. Lipari School on Foundations of Wide Area Network Programming.


Presentation Transcript


  1. Mobility, Security, and Proof-Carrying Code • Peter Lee, Carnegie Mellon University • Lecture 3, July 12, 2001: VC Generation and Proof Representation • Lipari School on Foundations of Wide Area Network Programming

  2. Whew!

  3. Recap • When the host system receives certified code, it • inspects the code, generating verification conditions (VCs), and • finds a proof for each VC (if it can). • [Abstractly, one thinks of generating a single predicate, which is the conjunction of all the VCs.] • Generation of VCs is done relative to a safety policy.

  4. High-Level Architecture • [Figure: architecture diagram. Labels: Code, Verification condition generator, Checker, Explanation, Agent, Safety policy, Host.]

  5. What Is a “Safety Policy”? • Yesterday, we gave the intuition of a reference interpreter that aborts the program just prior to any unsafe operation. • In this case, the reference interpreter essentially defines the safety policy.

  6. Safety Policies • More formally, we begin by defining the small-step operational semantics of a machine, call it the s86. • $(\Pi, \rho, pc) \rightarrow_{instr} (\rho', pc')$, where $\Pi$ is the program, $\rho$ the register state, and $pc$ the program counter. • We define the machine so that only safe executions are defined.

  7. Safety Policies, cont’d • For convenience we choose the s86 to be a restriction of the x86. • Hence all s86 programs will execute faithfully on a real x86. • The goal then is to prove that any given program always makes progress (or returns) in the s86. • With such a proof, the x86 is then just as good as an s86.
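The idea of a machine in which only safe executions are defined can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the three-instruction toy machine, the `step` and `run` functions, and the `readable` address set are all hypothetical and are not the s86 itself. `step` returns `None` (no transition) exactly where a reference interpreter would abort.

```python
# Hypothetical toy machine (not the real s86): a state is (registers, pc).
# step() returns None when no transition is defined, which models the
# reference interpreter aborting just before an unsafe operation.

def step(program, regs, pc, memory, readable):
    op, *args = program[pc]
    if op == "mov":                      # mov dst, constant
        dst, value = args
        return {**regs, dst: value}, pc + 1
    if op == "add":                      # add dst, src
        dst, src = args
        return {**regs, dst: regs[dst] + regs[src]}, pc + 1
    if op == "load":                     # load dst, [addr_reg]
        dst, addr_reg = args
        addr = regs[addr_reg]
        if addr not in readable:         # unsafe read: no transition defined
            return None
        return {**regs, dst: memory[addr]}, pc + 1
    return None                          # unknown instruction: stuck


def run(program, regs, memory, readable):
    """Run straight-line code until it ends ("returned") or gets stuck."""
    pc = 0
    while pc < len(program):
        nxt = step(program, regs, pc, memory, readable)
        if nxt is None:
            return "stuck", regs
        regs, pc = nxt
    return "returned", regs


memory = {100: 7, 104: 9}
safe_prog = [("mov", "a", 100), ("load", "b", "a")]
unsafe_prog = [("mov", "a", 200), ("load", "b", "a")]
```

A progress proof for `safe_prog` would say that `run` never reports "stuck" for it; no such proof can exist for `unsafe_prog`, whose load has no defined transition.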

  8. Verification Conditions • The point of the verification conditions, then, is to provide such progress theorems for each instruction in the program. • In other words, a VC’s validity says that the corresponding instruction has a defined execution in the s86 operational semantics.

  9. Symbolic Evaluator • We can define the verification condition generator (VCGen) via a symbolic evaluator • SE,,0,Post(i, , L) • The result of symbolic evaluation is a conjunction of VCs, so the overall progress theorem is then • Pre ⇒ SE,,0,Post(i, , L) • [Callouts on the formula: annotations, LF signature, entry point, postcondition.]

  10. Soundness • For particular operational semantics (a safe x86 and a safe Alpha), we have presented theorems that say, essentially: • Thm: If Pre ⇒ SE,,0,Post(i, , L), then execution of the program, given Pre and the initial state, and starting from entry point i, will always make progress (or return).

  11. Getting from Concept to Implementation • In an actual implementation, it is also handy to have a bit more than just a VC generator. • Precise syntax for VCs. • Pre/post-conditions for each entry point expected by the host in any downloaded code. • Precisely specified logical system for proving the VCs.

  12. Safety Policy Implementations • Safety policies are thus given in three parts: • A verification-condition generator (VCGen). • A specification of the pre & post conditions for all required procedures. • A specification of the inference rules for constructing valid proofs. • LF is used for the rule and pre/post specifications, C for the VCGen.

  13. C?!@$#@! • The use of C to define and implement the VCGen is, at best, expedient and at worst dubious. • However, since any code-inspection system must parse object files (not trivial!) and understand the instruction set, this seems to have practical benefits. • Clearly, a more formal approach would be desirable.

  14. Example: Java Type-Safety Specification • Our largest example of a safety-policy specification is for the “SpecialJ” Java native-code compiler. • It contains about 140 inference rules. • Roughly speaking, these rules can be separated into 5 classes.

  15. Safety Policy Rule Excerpts • 1. Standard syntax and rules for first-order logic.
      Syntax of predicates:
        /\  : pred -> pred -> pred.
        \/  : pred -> pred -> pred.
        =>  : pred -> pred -> pred.
        all : (exp -> pred) -> pred.
      Type of valid proofs, indexed by predicate:
        pf : pred -> type.
      Inference rules:
        truei : pf true.
        andi  : {P:pred} {Q:pred} pf P -> pf Q -> pf (/\ P Q).
        andel : {P:pred} {Q:pred} pf (/\ P Q) -> pf P.
        ander : {P:pred} {Q:pred} pf (/\ P Q) -> pf Q.
        …
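To make the role of `pf` concrete, here is a minimal Python sketch of a checker for the conjunction rules above. It is an illustration only, not LF type checking: the tuple encoding of predicates and proofs, the `infer` and `check` functions, and the `("hyp", P)` proof form for assumptions are all hypothetical, and the explicit `{P:pred} {Q:pred}` arguments of the real rules are left implicit.

```python
# Predicates: "true", an atom name, or ("and", P, Q).
# Proof terms mirror the rule names on the slide; ("hyp", P) is a
# hypothetical extra form standing for an assumption supplied by VCGen.

def infer(proof, assumptions):
    """Return the predicate that `proof` proves, or raise ValueError."""
    tag = proof[0]
    if tag == "truei":                      # truei : pf true
        return "true"
    if tag == "hyp":
        if proof[1] not in assumptions:
            raise ValueError(f"unknown assumption: {proof[1]!r}")
        return proof[1]
    if tag == "andi":                       # pf P -> pf Q -> pf (/\ P Q)
        left = infer(proof[1], assumptions)
        right = infer(proof[2], assumptions)
        return ("and", left, right)
    if tag in ("andel", "ander"):           # pf (/\ P Q) -> pf P (or pf Q)
        conj = infer(proof[1], assumptions)
        if not (isinstance(conj, tuple) and conj[0] == "and"):
            raise ValueError(f"{tag} applied to a non-conjunction")
        return conj[1] if tag == "andel" else conj[2]
    raise ValueError(f"unknown rule: {tag!r}")


def check(proof, goal, assumptions=()):
    """Accept a proof only if it proves exactly the goal."""
    try:
        return infer(proof, assumptions) == goal
    except ValueError:
        return False
```

For example, `check(("ander", ("hyp", ("and", "p", "q"))), "q", assumptions=(("and", "p", "q"),))` accepts, while a proof that applies `andel` to `truei` is rejected.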

  16. Safety Policy Rule Excerpts • 2. Syntax and rules for arithmetic and equality. • “csuble” means ≤ in the x86 machine.
        =  : exp -> exp -> pred.
        <> : exp -> exp -> pred.
        eq_le    : {E:exp} {E':exp} pf (csubeq E E') -> pf (csuble E E').
        moddist+ : {E:exp} {E':exp} {D:exp} pf (= (mod (+ E E') D) (mod (+ (mod E D) E') D)).
        =sym     : {E:exp} {E':exp} pf (= E E') -> pf (= E' E).
        <>sym    : {E:exp} {E':exp} pf (<> E E') -> pf (<> E' E).
        =tr      : {E:exp} {E':exp} {E'':exp} pf (= E E') -> pf (= E' E'') -> pf (= E E'').

  17. Safety Policy Rule Excerpts • 3. Syntax and rules for the Java type system.
        jint    : exp.
        jfloat  : exp.
        jarray  : exp -> exp.
        jinstof : exp -> exp.
        of      : exp -> exp -> pred.
        faddf : {E:exp} {E':exp} pf (of E jfloat) -> pf (of E' jfloat) -> pf (of (fadd E E') jfloat).
        ext   : {E:exp} {C:exp} {D:exp} pf (jextends C D) -> pf (of E (jinstof C)) -> pf (of E (jinstof D)).

  18. Safety Policy Sample Rules • 4. Rules describing the layout of data structures.
        aidxi : {I:exp} {LEN:exp} {SIZE:exp}
                pf (below I LEN) -> pf (arridx (add (imul I SIZE) 8) SIZE LEN).
        wrArray4 : {M:exp} {A:exp} {T:exp} {OFF:exp} {E:exp}
                pf (of A (jarray T)) -> pf (of M mem) -> pf (nonnull A) -> pf (size T 4) ->
                pf (arridx OFF 4 (sel4 M (add A 4))) -> pf (of E T) -> pf (safewr4 (add A OFF) E).
      • This “sel4” means the result of reading 4 bytes from heap M at address A+4.

  19. Safety Policy Sample Rules • 5. Quick hacks.
        nlt0_0 : pf (csubnlt 0 0).
        nlt1_0 : pf (csubnlt 1 0).
        nlt2_0 : pf (csubnlt 2 0).
        nlt3_0 : pf (csubnlt 3 0).
        nlt4_0 : pf (csubnlt 4 0).
      • Sometimes “unclean” things are put into the specification...

  20. How Do We Know That It’s Right?

  21. Homework Exercise • 4. Some of the proof rules are specific to the type system of the source language (Java), even though we are actually verifying x86 machine code. • Why has this been done?

  22. A Note about Memory • We define a type for valid heap memory states: • mem : exp • and operators for reading and writing heap memory: • (sel M A) • (upd M A E)
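The `sel`/`upd` operators can be modeled as symbolic terms. The sketch below is a hedged illustration: the tuple encoding and the simplification in `sel` (the standard read-over-write law for array-like memories) are assumptions and are not stated on the slide.

```python
# Symbolic heap terms, following the slide: ("upd", M, A, E) is memory M
# after writing E at address A; sel(M, A) reads address A from M.
# The read-over-write simplification below is an assumption of this sketch.

def upd(mem, addr, value):
    return ("upd", mem, addr, value)


def sel(mem, addr):
    """Build a read term, simplifying when the answer is syntactically clear."""
    while isinstance(mem, tuple) and mem[0] == "upd":
        _, inner, waddr, value = mem
        if waddr == addr:
            return value                    # reading what was just written
        if isinstance(waddr, int) and isinstance(addr, int):
            mem = inner                     # distinct constants: skip the write
        else:
            break                           # may alias: leave the term symbolic
    return ("sel", mem, addr)
```

With `m = upd(upd("rm_1", 8, "x"), 12, "y")`, the call `sel(m, 8)` yields `"x"`, while a read through a write at a symbolic address stays an unsimplified `sel` term.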

  23. The VCGen, via Detailed Examples

  24. High-Level Architecture • [Figure: same architecture diagram as on slide 4.]

  25. Example: Source Code
        public class Bcopy {
          public static void bcopy(int[] src, int[] dst) {
            int l = src.length;
            int i = 0;
            for (i = 0; i < l; i++) {
              dst[i] = src[i];
            }
          }
        }

  26. Example: Target Code
              ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
              .text
              .align 4
              .globl _bcopy__6arrays5BcopyAIAI
      _bcopy__6arrays5BcopyAIAI:
              cmpl $0, 4(%esp)
              je L6
              movl 4(%esp), %ebx
              movl 4(%ebx), %ecx
              testl %ecx, %ecx
              jg L22
              ret
      L22:    xorl %edx, %edx
              cmpl $0, 8(%esp)
              je L6
              movl 8(%esp), %eax
              movl 4(%eax), %esi
      L7:     ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
                       MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
              cmpl %esi, %edx
              jae L13
              movl 8(%ebx, %edx, 4), %edi
              movl %edi, 8(%eax, %edx, 4)
              incl %edx
              cmpl %ecx, %edx
              jl L7
              ret
      L13:    call __Jv_ThrowBadArrayIndex
              ANN_UNREACHABLE
              nop
      L6:     call __Jv_ThrowNullPointer
              ANN_UNREACHABLE
              nop

  27. Cut Points • Each loop entry must be annotated as a cut point. • VCGen requires this so that checking can be performed in a single scan of the code. • As a convenience, the modified registers are also declared in the cut annotations.
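What VCGen does on first reaching a cut point can be sketched as below. This is a hedged illustration: the function `enter_cut`, the tuple encoding of invariants, and the fresh-name scheme are hypothetical. Each register declared as modified gets a fresh symbolic value, and the invariants are then assumed over the new state.

```python
import itertools

_counter = itertools.count(1)        # source of fresh-name suffixes


def subst(expr, regs):
    """Replace register names in an expression by their symbolic values."""
    if isinstance(expr, str):
        return regs.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subst(e, regs) for e in expr[1:])
    return expr                      # integer constant


def enter_cut(regs, modified, invariants):
    """Forget modified registers, then assume the invariants."""
    n = next(_counter)
    fresh = dict(regs)
    for r in modified:
        fresh[r] = f"{r}_{n}"        # fresh symbolic value for r
    assumed = [subst(inv, fresh) for inv in invariants]
    return fresh, assumed
```

For the `bcopy` loop, calling `enter_cut` with `modified={"edx"}` and the invariant `("csubb", "edx", "ecx")` replaces `edx` by a fresh name and yields the assumption that this fresh value is below the symbolic value of `ecx`.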

  28. Example: Target Code • [Same target code as on slide 26.] • VCGen requires annotations in order to simplify the process.

  29. Example: Source Code • [Same source code as on slide 25.]

  30. The VCGen Process (1)
      Code:
        _bcopy__6arrays5BcopyAIAI:
              cmpl $0, src
              je L6
              movl src, %ebx
              movl 4(%ebx), %ecx
              testl %ecx, %ecx
              jg L22
              ret
        L22:  xorl %edx, %edx
              cmpl $0, dst
              je L6
              movl dst, %eax
              movl 4(%eax), %esi
        L7:   ANN_LOOP(INV = …
      VCGen assumptions and register state:
        A0 = (type src_1 (jarray jint))
        A1 = (type dst_1 (jarray jint))
        A2 = (type rm_1 mem)
        A3 = (csubneq src_1 0)
        ebx := src_1
        ecx := (sel4 rm_1 (add src_1 4))
        A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
        edx := 0
        A5 = (csubneq dst_1 0)
        eax := dst_1
        esi := (sel4 rm_1 (add dst_1 4))

  31. The VCGen Process (2)
      Code:
        L7:   ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
                       MODREG = (EDI, EDX, EFLAGS,FFLAGS,RM))
              cmpl %esi, %edx
              jae L13
              movl 8(%ebx,%edx,4), %edi
              movl %edi, 8(%eax,%edx,4)
              …
      VCGen assumptions and register state:
        A3
        A5
        A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
        edi := edi_1
        edx := edx_1
        rm := rm_2
        A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))
        !!Verify!! (saferd4 (add src_1 (add (imul edx_1 4) 8)))

  32. The Checker (1) • The checker is asked to verify that
        (saferd4 (add src_1 (add (imul edx_1 4) 8)))
      under assumptions
        A0 = (type src_1 (jarray jint))
        A1 = (type dst_1 (jarray jint))
        A2 = (type rm_1 mem)
        A3 = (csubneq src_1 0)
        A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
        A5 = (csubneq dst_1 0)
        A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
        A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))
      • The checker looks in the PCC for a proof of this VC.

  33. The Checker (2) • In addition to the assumptions, the proof may use axioms and proof rules defined by the host, such as
        szint : pf (size jint 4)
        rdArray4 : {M:exp} {A:exp} {T:exp} {OFF:exp}
                pf (type A (jarray T)) -> pf (type M mem) -> pf (nonnull A) -> pf (size T 4) ->
                pf (arridx OFF 4 (sel4 M (add A 4))) -> pf (saferd4 (add A OFF)).

  34. Checker (3) A proof for (saferd4 (add src_1 (add (imul edx_1 4) 8))) in the Java specification looks like this (excerpt): (rdArray4 A0 A2 (sub0chk A3) szint (aidxi 4 (below1 A7))) This proof can be easily validated via LF type checking.

  35. VCGen Summary • VCGen is a symbolic evaluator for the object language. • It essentially implements a reference interpreter, except: • it uses symbolic values in order to model all possible executions, and • instead of performing run-time checks, it asks a Checker to verify the safety of “dangerous” instructions.
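The summary above can be illustrated with a small symbolic evaluator. This is a minimal sketch under stated assumptions, not the actual VCGen (which is written in C and handles x86 code): the instruction names `set`, `load`, `jump_unless`, and `ret`, the tuple term encoding, and the `vcgen` function are all hypothetical, and only forward branches are supported, so no cut points are needed.

```python
def subst(expr, regs):
    """Replace register names in an expression by their symbolic values."""
    if isinstance(expr, str):
        return regs.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subst(e, regs) for e in expr[1:])
    return expr                          # integer constant


def vcgen(program, regs, pc=0, assumptions=()):
    """Return a list of (assumptions, goal) verification conditions."""
    vcs = []
    while True:
        op, *args = program[pc]
        if op == "ret":
            return vcs
        if op == "set":                  # symbolic assignment, no check
            dst, expr = args
            regs = {**regs, dst: subst(expr, regs)}
        elif op == "load":               # "dangerous": emit a VC, do not run
            dst, addr = args
            addr = subst(addr, regs)
            vcs.append((assumptions, ("saferd", addr)))
            regs = {**regs, dst: ("sel", "mem", addr)}
        elif op == "jump_unless":        # explore both outcomes of the test
            cond, target = args
            cond = subst(cond, regs)
            vcs += vcgen(program, regs, target, assumptions + (("not", cond),))
            assumptions = assumptions + (cond,)
        pc += 1


program = [
    ("jump_unless", ("below", "i", "len"), 3),      # bounds check
    ("load", "v", ("add", "a", ("mul", "i", 4))),
    ("ret",),
    ("ret",),                                       # failure path
]
vcs = vcgen(program, {"a": "a_0", "i": "i_0", "len": "len_0"})
```

The single VC produced asks the Checker to show `saferd` of the computed address under the assumption recorded by the bounds check, which mirrors the "!!Verify!!" step in the `bcopy` walk-through.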

  36. Homework Exercises • 5. When a loop invariant is encountered for the second time, what actions must the VCGen perform? • 6. In principle, how big can a VC get, relative to the size of the program? • 7. What kind of program might make a VC get very large?

  37. Another Example [by George Necula] • (Skip this example.)
        void fir (int *data, int dlen, int *filter, int flen) {
          int i, j;
          for (i=0; i<=dlen-flen; i++) {
            int s = 0;
            for (j=0; j<flen; j++)
              s += filter[j] * data[i+j];
            data[i] = s;
          }
        }

  38. Compiled Example
        /* rd=data, rdl=dlen, rf=filter, rfl=flen */
                ri = 0
                sub t1 = rdl, rfl
        L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
                le t2 = ri, t1
                jeq t2, L3
                rs = 0
                rj = 0
        L1:     CUT(rj,rs,t2,t3,t4)
                lt t2 = rj, rfl
                jeq t2, L2
                ult t2 = rj, rfl
                jeq t2, Labort
                ld t3 = [rf + 4*rj]
                add t2 = ri, rj
                ult t4 = t2, rdl
                jeq t4, Labort
                ld t2 = [rd + 4*t2]
                mul t2 = t3, t2
                add rs = rs, t2
                add rj = rj, 1
                jmp L1
        L2:     ult t2 = ri, rdl
                jeq t2, Labort
                st [rd + 4*ri] = rs
                add ri = ri, 1
                jmp L0
        L3:     ret
        Labort: call abort

  39. The Safety Policy • The safety policy defines verification conditions of the form: • true, E = E • saferd(M, E), safewr(M, E, E) • array(E_A, E_S, E_L), vector(E_A, E_S, E_L) • Pre_fir = array(rd,4,rdl), vector(rf,4,rfl) • Post_fir = true

  40. VCGen Example
      Code:
                ri = 0
                sub t1 = rdl, rfl
        L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
                le t2 = ri, t1
                jeq t2, L3
                …
        L3:     ret
      VCGen actions:
        Set rd=cd; rdl=cdl; rf=cf; rfl=cfl; rm=cm
        Assume precondition: array(cd,4,cdl) vector(cf,4,cfl)
        Set ri = 0
        Set t1 = sub(cdl,cfl)
        Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
        Set t2 = le(ci, sub(cdl,cfl))
        Assume not(le(ci, sub(cdl,cfl)))
        Check postcondition; Check rd,rdl,rf,rfl have initial values

  41. VCGen Example
      Code:
                ri = 0
                sub t1 = rdl, rfl
        L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
                le t2 = ri, t1
                jeq t2, L3
                rs = 0
                rj = 0
        L1:     CUT(rj,rs,t2,t3,t4)
                lt t2 = rj, rfl
                jeq t2, L2
                …
        L2:     ult t2 = ri, rdl
                jeq t2, Labort
                st [rd + 4*ri] = rs
      VCGen actions:
        Set ri = 0
        Set t1 = sub(cdl,cfl)
        Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
        Set t2 = le(ci, sub(cdl,cfl))
        Assume le(ci, sub(cdl,cfl))
        Set rs = 0
        Set rj = 0
        Set rj=cj’; rs=cs’; t2=c2’; t3=c3’; t4=c4’
        Set t2 = lt(cj’, cfl)
        Assume not(lt(cj’, cfl))
        Set t2 = ult(ci, cdl)
        Assume ult(ci, cdl)
        Check safewr(cm’, add(cd,mul(4,ci)),cs’)

  42. More on the Safety Policy • The safety policy is defined as an LF signature.
        rdarray  : saferd(M,add(A,mul(S,I))) <- array(A,S,L), ult(I,L).
        rdvector : saferd(M,add(A,mul(S,I))) <- vector(A,S,L), ult(I,L).
        wrarray  : safewr(M,add(A,mul(S,I)),V) <- array(A,S,L), ult(I,L).

  43. The Checker • When the Checker is invoked on • safewr(cm’, add(cd,mul(4,ci)), cs’) • There are assumptions: • assume0 : ult(ci,cdl). • assume1 : not(lt(cj’,cfl)). • assume2 : le(ci, sub(cdl,cfl)). • assume3 : vector(cf,4,cfl). • assume4 : array(cd,4,cdl).

  44. The Checker, cont’d • The VC • safewr(cm’, add(cd,mul(4,ci)), cs’) • can be verified by using the rule • wrarray : safewr(M,add(A,mul(S,I)),V) <- array(A,S,L), ult(I,L). • and assumptions • assume0 : ult(ci,cdl). • assume4 : array(cd,4,cdl).

  45. Proof Representation • A simple (but somewhat naïve) representation of the proof is simply the sequence of proof rules: • wrarray, assume4, assume0 • We shall see that better representations are possible. • LF typechecking is sufficient for proofchecking.
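Checking the naive rule-sequence representation can be sketched as follows. This is an illustration under stated assumptions, not the LF type checker the lecture relies on: the tuple encoding, the convention that capitalized strings are rule variables, and the functions `unify` and `check` are hypothetical, and rule variables are not renamed, so each rule may be used only once per proof.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Extend substitution s so that a and b become equal, else None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


# The wrarray rule from the safety policy: (conclusion, premises).
RULES = {
    "wrarray": (("safewr", "M", ("add", "A", ("mul", "S", "I")), "V"),
                [("array", "A", "S", "L"), ("ult", "I", "L")]),
}


def check(proof, goal, assumptions):
    """Replay a sequence of rule/assumption names against the goal."""
    names = iter(proof)

    def prove(g, s):
        name = next(names, None)
        if name in assumptions:
            return unify(g, assumptions[name], s)
        if name in RULES:
            conclusion, premises = RULES[name]
            s = unify(conclusion, g, s)
            for p in premises:
                if s is None:
                    return None
                s = prove(p, s)
            return s
        return None

    s = prove(goal, {})
    return s is not None and next(names, None) is None


goal = ("safewr", "cm'", ("add", "cd", ("mul", 4, "ci")), "cs'")
assumptions = {"assume0": ("ult", "ci", "cdl"),
               "assume4": ("array", "cd", 4, "cdl")}
```

With these definitions, `check(["wrarray", "assume4", "assume0"], goal, assumptions)` accepts the proof from the slide, while the same names in a different order are rejected.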

  46. Optimized Code • The previous example was somewhat simplified. • More realistic code is optimized, usually based on inferences about integer values. • Such optimizations require that arithmetic invariants be placed in the cut points.

  47. Optimized Example
        /* rd=data, rdl=dlen, rf=filter, rfl=flen */
                ri = 0
                sub t1 = rdl, rfl
        L0:     CUT(ri>0,{ri,rj,…})
                le t2 = ri, t1
                jeq t2, L3
                rs = 0
                rj = 0
        L1:     CUT(rj>0,{rj,rs,…})
                lt t2 = rj, rfl
                jeq t2, L2
                ld t3 = [rf + 4*rj]
                add t2 = ri, rj
                ld t2 = [rd + 4*t2]
                mul t2 = t3, t2
                add rs = rs, t2
                add rj = rj, 1
                jmp L1
        L2:     st [rd + 4*ri] = rs
                add ri = ri, 1
                jmp L0
        L3:     ret

  48. VCGen Example
      Code:
                ri = 0
                sub t1 = rdl, rfl
        L0:     CUT(ri>0, {ri,rj,rs,t2,t3,t4,rm})
                le t2 = ri, t1
                jeq t2, L3
                rs = 0
                rj = 0
                …
      VCGen actions:
        Set ri = 0
        Set t1 = sub(cdl,cfl)
        Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
        Assume >(ci,0)
        Set t2 = le(ci, sub(cdl,cfl))
        Assume le(ci, sub(cdl,cfl))
