
CS711 Foundational PCC



Presentation Transcript


  1. CS711 Foundational PCC. Greg Morrisett, Cornell University

  2. Claimed Contributions
     • Types and typing rules not baked in
       • PCC has to prove soundness & consistency of these rules at the meta-level
     • Allocation & initialization
       • PCC/TAL treated these issues in a funky way
     • Much wider variety of type constructors
       • records, tagged unions, 1st-class functions and labels, ADTs, unions, intersections, covariant recursive types
     • Remove VCGen (i.e., decompilation of machine code into logic)
       • no need to prove correctness of VCGen
     Lang. Based Security

  3. The Logic
     • Higher-order logic
     • Natural numbers, arithmetic, induction
     • Semantics of machine instructions
     • Predicates describing readability/writability/jumpability of machine addresses

  4. Semantics: Relational
     upd(f,d,x,f') =def= All z. (d = z & f'(z) = x) or (d != z & f'(z) = f(z))
     add(d,s1,s2)(r,m,r',m') =def= upd(r,d,r(s1)+r(s2),r') & m = m'
     load(d,s,c)(r,m,r',m') =def= readable(r(s)+c) & upd(r,d,m(r(s)+c),r') & m = m'
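
The relational definitions on this slide can be read as executable predicates over finite states. The following is a minimal sketch, not the paper's encoding: it assumes register files and memories are Python dicts, and it takes `readable` as a caller-supplied predicate (both are modelling choices made here for illustration).

```python
def upd(f, d, x, f2):
    """upd(f,d,x,f'): f' maps d to x and agrees with f everywhere else."""
    keys = set(f) | set(f2) | {d}
    return all(
        (f2.get(z) == x) if z == d else (f2.get(z) == f.get(z))
        for z in keys
    )

def add(d, s1, s2):
    """add(d,s1,s2) as a relation between (r, m) and (r', m')."""
    def rel(r, m, r2, m2):
        return upd(r, d, r[s1] + r[s2], r2) and m == m2
    return rel

def load(d, s, c, readable):
    """load(d,s,c): only related to a successor when the address is readable."""
    def rel(r, m, r2, m2):
        addr = r[s] + c
        return readable(addr) and upd(r, d, m[addr], r2) and m == m2
    return rel
```

Note that each instruction is a relation that either holds or fails for a given pair of states; an unreadable address simply leaves `load` with no related successor state.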

  5. Semantics Contd.
     store(s1,s2,c)(r,m,r',m') =def= writable(r(s2)+c) & upd(m,r(s2)+c,r(s1),m') & r = r'
     jump(d,s,c)(r,m,r',m') =def= Exists r''. upd(r,pc,r(s)+c,r'') & upd(r'',d,pc,r') & m = m'
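
A sketch of `store` and `jump` in the same style, again assuming dict-based register files and memories and a caller-supplied `writable` predicate. Two readings are assumptions made here: the value saved in `d` by `jump` is taken to be the old `r(pc)`, and the existential over `r''` is discharged by constructing its only possible witness rather than searching for one.

```python
def upd(f, d, x, f2):
    """upd(f,d,x,f'): f' maps d to x and agrees with f everywhere else."""
    keys = set(f) | set(f2) | {d}
    return all(
        (f2.get(z) == x) if z == d else (f2.get(z) == f.get(z))
        for z in keys
    )

def store(s1, s2, c, writable):
    """store(s1,s2,c): memory changes at r(s2)+c, registers are unchanged."""
    def rel(r, m, r2, m2):
        addr = r[s2] + c
        return writable(addr) and upd(m, addr, r[s1], m2) and r == r2
    return rel

def jump(d, s, c):
    """jump(d,s,c): set pc to r(s)+c, then save the old pc in d (assumed reading)."""
    def rel(r, m, r2, m2):
        mid = dict(r)
        mid["pc"] = r[s] + c          # the unique r'' with upd(r, pc, r(s)+c, r'')
        return upd(mid, d, r["pc"], r2) and m == m2
    return rel
```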

  6. Decoding is Explicit
     • decode(e,m,i)
       • address e of memory m holds instruction i
       • Exists d,s1,s2. format(m(e),0,d,s1,s2) & i = add(d,s1,s2)
         or Exists d,s1,s2. format(m(e),1,d,s1,s2) & i = addi(d,s1,s2)
         or ...
     • Here, format(...) is a predicate for showing that a given instruction assembles into a particular number.

  7. The Step and Multi-Step Relations
     • step(r,m,r',m') =def= Exists i, r''. decode(r(pc),m,i) & upd(r,pc,r(pc)+1,r'') & i(r'',m,r',m')
       • only holds when code is "safe"
     • The multi-step relation is a co-inductive extension of the single-step relation
       • intuition: start with the set of all (finite & infinite) sequences of states and weed out those that can't arise from the step relation.
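
The step relation can be approximated by a partial function that returns the successor state, or `None` when no step is possible. This is only a sketch under several assumptions that are not in the slides: instructions are stored in memory as already-decoded tuples (so the `format` predicate is skipped), only `add`, `load` and `store` are handled, unmapped words read as 0, and `safe_for` is a hypothetical bounded check that stands in for the co-inductive multi-step relation, which also covers infinite runs.

```python
def step(r, m, readable, writable):
    """Return the successor state (r', m'), or None if the machine is stuck."""
    instr = m.get(r["pc"])                 # plays the role of decode(r(pc), m, i)
    if not isinstance(instr, tuple):
        return None                        # nothing decodes at pc
    r = dict(r, pc=r["pc"] + 1)            # upd(r, pc, r(pc)+1, r'')
    op = instr[0]
    if op == "add":
        _, d, s1, s2 = instr
        return dict(r, **{d: r[s1] + r[s2]}), m
    if op == "load":
        _, d, s, c = instr
        addr = r[s] + c
        if not readable(addr):
            return None                    # unsafe read: no successor
        return dict(r, **{d: m.get(addr, 0)}), m
    if op == "store":
        _, s1, s2, c = instr
        addr = r[s2] + c
        if not writable(addr):
            return None                    # unsafe write: no successor
        return r, {**m, addr: r[s1]}
    return None

def safe_for(r, m, n, readable, writable):
    """Bounded stand-in for safety: the machine can take n steps without getting stuck."""
    state = (r, m)
    for _ in range(n):
        state = step(state[0], state[1], readable, writable)
        if state is None:
            return False
    return True
```

The key point the sketch preserves is that an unsafe instruction has no successor state at all, so safety is the same thing as never getting stuck.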

  8. Neat Things
     • No restriction (in principle) on self-modifying code
     • In practice, you immediately assume as part of the safety policy that code is immutable.

  9. Types
     • int(m)(v) =def= true
       • any value is an integer
     • record(t1,t2) m v =def= readable(v) & readable(v+1) & t1 m (m v) & t2 m (m(v+1))
       • v and v+1 must be readable
       • the values in memory m at locations v and v+1 must have types t1 and t2 respectively (in memory m)
     • Necula's rule for traversal of a pair is then trivial

  10. Problem: how to create a pair?
      • Because the types are inductively defined, any write to memory could invalidate the type of something else.
      • record2(record2(int,int),int) m r1
        • means in memory m, r1 has type (int*int)*int
        • if I do a store, then I get a new memory m'
        • so it is no longer guaranteed that r1 has type (int*int)*int in m'
      • we could re-establish this fact as long as we knew the update either:
        • (a) preserved the type of r1 or
        • (b) didn't write any locations in common with r1's traversal (e.g., r1, m(r1), m(r1+1), m(m(r1)), m(m(r1)+1))
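
The problem is easy to reproduce with the memory-indexed types written as predicates. The sketch below assumes memories are Python dicts and that `readable` is a caller-supplied predicate whose readable addresses are all mapped in memory; the names `int_ty` and `record` follow the slides' `int` and two-field record type, but the encoding is an illustration only.

```python
def int_ty(m):
    """int(m)(v): any value is an integer."""
    return lambda v: True

def record(t1, t2, readable):
    """record(t1,t2) m v: v and v+1 are readable and hold values of types t1 and t2."""
    def ty(m):
        return lambda v: (
            readable(v) and readable(v + 1)
            and t1(m)(m[v]) and t2(m)(m[v + 1])
        )
    return ty

# Hypothetical heap: address 100 holds a pointer to the inner pair at 104.
heap = {100: 104, 101: 7, 104: 1, 105: 2}
is_readable = lambda addr: addr in (100, 101, 104, 105)
pair_of_pair = record(record(int_ty, int_ty, is_readable), int_ty, is_readable)

before = pair_of_pair(heap)(100)                         # r1 = 100 is well typed
after_clobber = pair_of_pair({**heap, 100: 7})(100)      # overwrite the inner pointer
after_field = pair_of_pair({**heap, 105: 99})(100)       # overwrite an int field
```

Overwriting the pointer at address 100 with a non-pointer breaks the typing of `r1`, while overwriting an `int` field preserves it, which is case (a) on the slide.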

  11. Solution for allocation:
      • Types are indexed not only by memory, but also by the current set of allocated locations:
        • record(t1,t2)(a,m) v =def= v in a & v+1 in a & readable(v) & readable(v+1) & t1(a,m)(m v) & t2(a,m)(m(v+1))
      • a is a predicate characterizing the set of allocated locations.
      • In this paper, it is just the set of values in some pre-determined range (e.g., 100 up to r6)

  12. Extension
      • To support allocation, a value should retain its type even if:
        • the allocated set grows
        • there are writes to the unallocated set
      • Hence, the notion of validity on types:
        valid(t) =def=
          (All a,a',m,v. (a subset a') => t(a,m)v => t(a',m)v)
          & (All a,m,m',v. (All x in a. m(x) = m'(x)) => t(a,m)v => t(a,m')v)
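
On a small finite universe, validity can be checked by brute force, which makes the two closure conditions concrete. The sketch below is an illustration under assumed parameters: a two-address memory, three possible values, types written as `t(a, m)(v)` with `a` a frozenset, and the `readable` predicate left out. The example types `ptr_to_zero`, `unallocated` and `peeks` are hypothetical.

```python
from itertools import combinations, product

ADDRS = (0, 1)        # assumed tiny address space
VALUES = (0, 1, 2)    # assumed tiny value space

def subsets(xs):
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def memories():
    return [dict(zip(ADDRS, vs)) for vs in product(VALUES, repeat=len(ADDRS))]

def valid(t):
    """Check both clauses of valid(t) by enumerating every a, a', m, m', v."""
    allocs, mems = subsets(ADDRS), memories()
    # Clause 1: the type survives growth of the allocated set.
    grow = all(
        t(a2, m)(v)
        for a in allocs for a2 in allocs if a <= a2
        for m in mems for v in VALUES if t(a, m)(v)
    )
    # Clause 2: the type survives writes outside the allocated set.
    write = all(
        t(a, m2)(v)
        for a in allocs for m in mems for m2 in mems
        if all(m[x] == m2[x] for x in a)
        for v in VALUES if t(a, m)(v)
    )
    return grow and write

ptr_to_zero = lambda a, m: lambda v: v in a and m[v] == 0   # only inspects allocated cells
unallocated = lambda a, m: lambda v: v not in a             # broken by growing a
peeks = lambda a, m: lambda v: m[0] == v                    # reads a cell it never claims
```

Here `ptr_to_zero` passes both clauses, `unallocated` fails the growth clause, and `peeks` fails the write clause because it depends on a cell that may lie outside the allocated set.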

  13. Remarks
      • What they're doing is giving a semantics for types using the machine.
        • We're used to seeing a semantics for types as sets or PERs or some other mathematical objects.
      • These extension properties had to be shown for TAL as well.
        • if a heap H is described by T, then H[x->v] is also described by T.
        • Of course, we had to show this was true.
        • Here, you can't define a type unless it is true.

  14. Given This Setup
      constty i (a,m) v =def= v = i
      char (a,m) v =def= 0 <= v < 256
      boxed (a,m) v =def= v >= 256
      ptr t (a,m) v =def= v in a & readable(v) & t(a,m)(m v)
      offset i t (a,m) v =def= t(a,m)(v+i)
      field i t =def= offset i (ptr t)
      union(t1,t2)(a,m) v =def= t1(a,m)v or t2(a,m)v
      intersect(t1,t2)(a,m) v =def= t1(a,m)v & t2(a,m)v
      record2(t1,t2) =def= intersect(field 0 t1, field 1 t2)
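
These constructors translate almost directly into higher-order functions. The sketch below assumes a type is a function `t(a, m)` returning a predicate on values, with `a` a set of allocated addresses and `m` a dict; `readable(v)` is modelled as `v in m`, which is an assumption made only for this illustration.

```python
def constty(i):
    return lambda a, m: lambda v: v == i

char = lambda a, m: lambda v: 0 <= v < 256
boxed = lambda a, m: lambda v: v >= 256

def ptr(t):
    # readable(v) is modelled as "v is mapped in m" (assumption)
    return lambda a, m: lambda v: v in a and v in m and t(a, m)(m[v])

def offset(i, t):
    return lambda a, m: lambda v: t(a, m)(v + i)

def field(i, t):
    return offset(i, ptr(t))

def union(t1, t2):
    return lambda a, m: lambda v: t1(a, m)(v) or t2(a, m)(v)

def intersect(t1, t2):
    return lambda a, m: lambda v: t1(a, m)(v) and t2(a, m)(v)

def record2(t1, t2):
    # A pair is a value that is both "field 0 has type t1" and "field 1 has type t2".
    return intersect(field(0, t1), field(1, t2))
```

The design point worth noticing is that `record2` is derived rather than primitive: it needs no rule of its own, because it is just an intersection of two field types.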

  15. More Constructors
      aref t (a,m) v =def= v in a & readable(v) & Exists a'. (a' subset a) & v not in a' & t(a',m)(m v)
      • acyclic mutable refs
      • giving a semantics for (possibly) cyclic refs is hard (see next paper)
      existential(F)(a,m) v =def= Exists t. (F t)(a,m)v & valid(t)
      universal(F)(a,m) v =def= All t. valid(t) => (F t)(a,m)v

  16. Code Pointers
      codeptr(P)(a,m) v =def=
        All r',m'. r'(pc) = v
          & stdp(r',m')              // global invariants
          & P(stda(r',m'),m')(r')    // "call convention"
          => safe(r',m')
      • The code type wraps up the implicit global invariants (e.g., the heap can grow without invalidating the old types) as well as the calling convention (e.g., this register holds the allocation pointer).

  17. Recursive Types
      • subtype(t1,t2) =def= All a,m,v. t1(a,m)v => t2(a,m)v
      • rec(F)(a,m)(v) =def= All t. valid(t) => subtype(F(t),t) => t(a,m)(v)
      • e.g., list(char) =def= rec(fn t => union(constty 0, intersect(boxed, record2(char,t))))

  18. Validity & Recursive Types
      • rec(F) is valid for any function that preserves validity and is monotone:
        • preserves_valid(F) =def= All t. valid(t) => valid(F(t))
        • monotone(F) =def= All t1,t2. subtype(t1,t2) => subtype(F(t1),F(t2))
      • In particular, this gives us: rec(F)(a,m)(v) <=> (F(rec(F)))(a,m)(v)
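
For a fixed allocation set and memory, the unfolding equation can be observed on list(char) by computing a least fixed point. This sketch swaps in a different technique from the slides: instead of quantifying over all valid types, it iterates the unfolding from the empty type, which for a monotone F over a finite universe of values reaches the least fixed point. The names `list_char_step` and `rec` and the set-based representation of types are assumptions, and `readable(v)` is again modelled as `v in m`.

```python
def list_char_step(a, m, s):
    """One unfolding of F(t) = union(constty 0, intersect(boxed, record2(char, t))).

    The type t is given extensionally as the set s of values known to have it.
    """
    def cell(v):
        return (
            v >= 256 and v in a and v + 1 in a and v in m and v + 1 in m
            and 0 <= m[v] < 256 and m[v + 1] in s
        )
    universe = {0} | set(m) | set(m.values())
    return frozenset(v for v in universe if v == 0 or cell(v))

def rec(step, a, m):
    """Iterate the unfolding from the empty type until nothing changes."""
    s = frozenset()
    while True:
        nxt = step(a, m, s)
        if nxt == s:
            return s
        s = nxt

# Hypothetical heap: a two-element list at 300 -> 302 -> 0, plus a cyclic cell at 400.
heap = {300: 104, 301: 302, 302: 105, 303: 0, 400: 1, 401: 400}
lists = rec(list_char_step, set(heap), heap)
```

The cyclic cell at 400 is excluded because the least fixed point only admits values whose typing can be justified in finitely many unfoldings, which matches the acyclic flavor of these recursive types.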

  19. Bad News
      • Does not handle code types (in general)
        • e.g., datatype d = D of d->d
        • any negative occurrence will not be monotone
      • This is (also) the problem with supporting refs in general
        • when the heap has cycles, you need recursive types to describe it.
      • Could construct a more elaborate semantics based on domain theory
        • but then you have to code up all of domain theory in the logic

  20. Good News
      • See the next paper, with McAllester
        • there is a simple model
        • easy to code up in the logic
        • (go over this next time)

  21. Stepping Back
      What's been accomplished with FPCC?
      • assume only higher-order logic and machine semantics (TCB)
      • build enough math to encode the semantics of your types
      • proofs are much more detailed
      • less brittle than syntactic approaches?
        • yes: working at the machine level, can define new derived type constructors without reproving correctness, admissible rules, etc.
        • no: have to bake in details (e.g., allocation framework, safety policy, etc.) and scaling the semantics to realistic languages is much harder
