Next Generation Type Systems
Greg Morrisett, Harvard University
A Famous Phrase: “Well typed programs won’t go wrong.” More formally:
1. Describe an abstract machine: M ::= [s, c]
2. Give a transition relation M1 → M2:
   [s, (x:=42 ; c)] → [s{x↦42}, c]
   [s, if true then e1 else e2] → [s, e1]
   [s, if false then e1 else e2] → [s, e2]
3. Classify all terminal states as “bad” or “good”:
   good: [s, exit(0)], [s, 42]
   bad: [s, if 42 then e1 else e2], [s, “Simon” / true]
4. Prove well-typed code never reaches bad states.
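For concreteness, here is a minimal OCaml sketch of such an abstract machine (the toy syntax and names are my own, not from the talk). A machine pairs a state with a command; step is the transition relation; a state with no transition is terminal, and terminal states other than exit are stuck, i.e. “bad”:

  (* toy expressions and commands; Str and Div exist only so that
     stuck states like [s, "Simon" / true] are representable *)
  type exp = Int of int | Bool of bool | Str of string | Div of exp * exp
  type cmd =
    | Assign of string * exp * cmd   (* x := e ; c *)
    | If of exp * cmd * cmd
    | Exit of int

  type machine = (string * int) list * cmd

  (* one transition step; None marks a terminal state *)
  let step ((s, c) : machine) : machine option =
    match c with
    | Assign (x, Int n, rest) -> Some ((x, n) :: s, rest)  (* [s, x:=n; c] → [s{x↦n}, c] *)
    | If (Bool true,  c1, _)  -> Some (s, c1)
    | If (Bool false, _, c2)  -> Some (s, c2)
    | _ -> None  (* Exit is a good terminal state; anything else
                    that reaches here is stuck, i.e. "bad" *)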
What’s “good” and “bad”?
• I could say [s, “Simon” / true] → [s, 42].
• I could say [s, exit(0)] is “bad”.
• It’s up to you!
• But of course, for even simple safety policies, statically proving that a program (much less every program in a language) won’t “go wrong” is quite challenging.
Thus, we cheat:
• For real languages (Java, C#, ML, …) we add some artificial transitions:
  [s, 42 / 0] → [s, throw(DivByZero)]
  and then label some of the bad states as good:
  [s, throw(DivByZero)]
• Other examples: null-pointer dereference, array index out of bounds, bad downcast, stack-inspection error, file already closed, deadlock, …
• So the reality is that today, well-typed programs don’t continue to go wrong.
  Better than a code-injection attack, but little comfort when your airplane crashes.
Exceptions
• The escape hatch for typing: throw : ∀α. exn → α
• In languages such as ML & Haskell, they don’t appear in interfaces:
  div : int → int → int
  sub : ∀α. array α → int → α
• All too easy to forget to write a handler.
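A minimal OCaml sketch of the point: the exception never shows up in the type, so nothing reminds you to write the handler.

  let div (x : int) (y : int) : int = x / y
  (* val div : int -> int -> int  -- Division_by_zero is invisible *)

  let avg (sum : int) (n : int) : int =
    div sum n   (* compiles fine; raises Division_by_zero when n = 0 *)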
Throws Clauses
In Java we have throws clauses:
• div : int → int → int throws DivByZero
• sub : ∀α. array α → int → α throws Subscript
Much better documentation. But it introduces a couple of problems…
Problems with Throws:
• Iterators (more generally, abstraction) break:
  map : ∀α,β. (α → β) → list α → list β
  map div vs. map sub
  map : ∀α,β,ε. (α → β throws ε) → list α → list β throws ε
• Must add the attendant polymorphism.
• False positives:
  if (n != 0) avg := div(sum,n); else avg := 0;
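A hedged OCaml sketch of that “attendant polymorphism” (my encoding, not from the talk): representing the throws clause with the result type, so map abstracts over the error type 'e the way the slide’s map abstracts over ε.

  let rec map_r (f : 'a -> ('b, 'e) result) (xs : 'a list)
      : ('b list, 'e) result =
    match xs with
    | [] -> Ok []
    | x :: rest ->
        (match f x with
         | Error e -> Error e
         | Ok y ->
             (match map_r f rest with
              | Error e -> Error e
              | Ok ys -> Ok (y :: ys)))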
What We Really Want:
• Refinements:
  div : int → (y:int) → int requires y != 0
  sub : ∀α. (x:array α) → (i:int) → α requires i >= 0 && i < size(x)
  head : ∀α. (x:list α) → α requires x != NULL
• And even:
  printf : (x:string) → (vs:list obj) → unit requires (∃ts. parses(x,ts) && have_types(vs,ts))
  prove : (p:prop) → (b:bool) ensures (b = true => p)
  compile : (x:ast) → (y:x86) ensures (bisimilar(x,y))
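A rough OCaml approximation of the div refinement (my encoding, not from the talk): an abstract witness type stands in for the static proof that y != 0.

  module NonZero : sig
    type t
    val check : int -> t option   (* runtime check in place of a static proof *)
    val div : int -> t -> int     (* callable only with a checked divisor *)
  end = struct
    type t = int
    let check y = if y <> 0 then Some y else None
    let div x y = x / y
  end

  let safe_avg sum n =
    match NonZero.check n with
    | Some d -> NonZero.div sum d
    | None -> 0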
Driving Research Question:
Static type-checking has proven to be an effective, lightweight formal method. Over the next 10 years, can we scale it to:
• Eliminate the remaining language-level errors?
• Eliminate more API- and user-level errors?
• Go all the way to correctness?
Extended Static Checking
ESC/M3, ESC/Java, Spec#, Cyclone, Deputy, …
• Take existing languages (Java, C#, C).
• Aimed at eliminating language-level bugs: null pointers, array bounds, downcasts, …
• Augment types with pre/post-conditions.
• Calculate refinements at each program point:
  use weakest-preconditions or strongest-postconditions,
  in conjunction with abstract-interpretation techniques to generate loop invariants.
• Use an SMT prover to check the pre/post-conditions.
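A minimal OCaml sketch of the weakest-precondition calculation such tools perform on straight-line code (the tiny assertion language and names are my own): wp(x := e, Q) = Q[e/x] and wp(c1; c2, Q) = wp(c1, wp(c2, Q)).

  type exp = Var of string | Int of int | Add of exp * exp
  type pred = Le of exp * exp | And of pred * pred
  type stmt = Assign of string * exp | Seq of stmt * stmt

  (* substitute e for x in an expression *)
  let rec subst_e x e = function
    | Var y when y = x -> e
    | Add (a, b) -> Add (subst_e x e a, subst_e x e b)
    | other -> other

  (* substitute e for x in a predicate *)
  let rec subst_p x e = function
    | Le (a, b) -> Le (subst_e x e a, subst_e x e b)
    | And (p, q) -> And (subst_p x e p, subst_p x e q)

  let rec wp (s : stmt) (q : pred) : pred =
    match s with
    | Assign (x, e) -> subst_p x e q     (* wp(x := e, Q) = Q[e/x] *)
    | Seq (s1, s2) -> wp s1 (wp s2 q)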
Simple Example: the checker calculates a refinement at each program point (shown here as comments).
void strncpy(char *d, char *s, int n)
  requires(n <= |d| && n <= |s|)
{
  int i = 0;
  // n <= |d|, n <= |s|, i = 0
  while (i < n) {
    // n <= |d|, n <= |s|, 0 <= i < n
    d[i] = s[i];
    // n <= |d|, n <= |s|, 0 <= i < n
    i++;
  }
  // n <= |d|, n <= |s|, i >= n
}
Tremendous Progress
• Scalable analysis:
  in spite of path & context sensitivity
  one key: modularity
  another key: efficient representation techniques
• Much improvement in provers:
  SMT provers integrate decision procedures
  advances with SAT, BDDs, ILPs, …
• Improved invariant finders:
  e.g., octagon domains
  counter-example guided refinement
For the 70 Kloc Cyclone compiler, these techniques discharge 95% of the null & array-bounds checks.
All Is Not Rosy:
foldl : ∀α,β,γ. (f : (α,β,γ) → β, α, β, list_t<γ>) → β

int strcpy_offset(string_t d, int i, string_t s)
  requires (i + |s| <= |d|)
  ensures (result = i + |s|);

int add_size(int dummy, int a, string_t s)
  ensures (result = a + |s|);

string_t concat(list_t<string_t> xs) {
  int n = foldl(add_size, 0, 0, xs);
  string_t res = new_string(n);
  foldl(strcpy_offset, res, 0, xs);
  return res;
}
Even if we inline the HO code:
int strcpy_offset(string_t d, int i, string_t s)
  requires (i + |s| <= |d|)
  ensures (result = i + |s|);

string_t concat(list_t<string_t> xs) {
  int n = 0;
  for (let t = xs; t != NULL; t = t->tl)
    n += size(t->hd);
  string_t res = new_string(n);
  int i = 0;
  for (let t = xs; t != NULL; t = t->tl)
    i = strcpy_offset(res, i, t->hd);
  return res;
}
The checker can’t synthesize the needed inductive invariant, much less prove it automatically!
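For concreteness, here is one plausible form of the missing invariant (my reconstruction, not from the talk). At the head of the second loop, with t pointing at the remaining suffix of xs:

  |res| = n  ∧  i + Σ{ |u| : u ∈ t } = n

From this, i + |t->hd| <= n = |res| follows for the current element, which is exactly strcpy_offset’s precondition, and the invariant is preserved by each iteration. But synthesizing a quantified sum over a list is beyond today’s invariant-inference engines.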
Forced to Rewrite Again:
int strcpy_offset(string_t d, int i, string_t s)
  requires (i + |s| <= |d|)
  ensures (result = i + |s|);

string_t concat(list_t<string_t> xs) {
  int n = 0;
  for (let t = xs; t != NULL; t = t->tl)
    n += size(t->hd);
  string_t res = new_string(n);
  int i = 0;
  for (let t = xs; t != NULL; t = t->tl)
    if (i + size(t->hd) <= n)
      i = strcpy_offset(res, i, t->hd);
    else
      throw Error;
  return res;
}
The Reality of ES Checkers:
• Still too many false positives:
  still 1,000 checks left in the Cyclone compiler
  programmers will dismiss false positives
• Many culprits:
  the specification language is too weak
  the calculated invariants are too weak
  the theorem provers are too weak
  memory, aliasing, framing (more on this later)
• And this is even with unsound assumptions!
  no overflow, no aliasing, no concurrency, …
• Seems hopeless, no?
What Can We Do?
1. Change the language to minimize VCs:
• polymorphism minimized downcasts in C#.
• make not-NULL the default (see the sketch below).
• aliasing issues are minimized when most values are immutable or localized.
• working with maps/folds/iterators on arrays (or other finite maps) avoids indexing problems.
• bignums are easier to reason about.
• …
If we don’t do this, we’ll spend 10 more years in the weeds working on problems we could define away.
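A tiny OCaml illustration of the not-NULL-by-default bullet: a plain string can never be null, and possible absence must be marked with option, so at most use sites the checker has nothing to prove.

  let greet (name : string option) : string =
    match name with
    | Some n -> "hello, " ^ n            (* n is statically known to exist *)
    | None   -> "hello, stranger"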
Other Big Change:
Give programmers the ability to work around the shortcomings of automation.
• Magic is good as long as it doesn’t prevent you from getting real work done…
In particular, give them a way to build explicit proofs within the language.
• If automation can’t find a proof, at least the programmer can try to construct one.
Not a new idea: type theory!
• in particular, Coq, Isabelle/HOL, NuPRL, Epigram, …
• but many challenges remain in making this practical.
Programming in Type Theory:
sub : (v:vector α, i:nat, pf: i < size(v)) → α
cmp : (i:nat, j:nat) → LT{i<j} + GTE{i>=j}

checked_sub(v:vector α, i:nat) : option α =
  match cmp(i, size(v)) with
  | LT{pf} => SOME(sub(v,i,pf))
  | _ => NONE
  end
Another Example:
sub : (v:vector α, i:nat, pf: i < size(v)) → α
lt_trans : ∀i j k, (i < j) → (j <= k) → (i < k)

sum_upto(v:vector int, e:nat, pf: e <= size(v)) =
  let loop(i:nat, a:int) : int =
    match cmp(i,e) with
    | LT pf2 => loop(i+1, a + sub(v, i, lt_trans pf2 pf))
    | _ => a
  in loop(0,0) end
In Coq:
sum_upto(v:vector int, e:nat, pf: e <= size(v)) =
  let loop(i:nat, a:int) : int =
    match cmp(i,e) with
    | LT pf2 => loop(i+1, a + sub(v,i,[•]))
    | _ => a
  in loop(0,0) end
Can leave “holes” for proofs:
• proof obligations are generated.
• engage in interactive proof, using tactics to discharge or simplify obligations.
• output is a proof object (which can be independently checked).
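After extraction the proofs erase and sum_upto reads as ordinary code — a minimal OCaml sketch, assuming e <= Array.length v (the obligation the pf argument used to carry):

  let sum_upto (v : int array) (e : int) : int =
    let rec loop i a =
      if i < e then loop (i + 1) (a + v.(i))   (* the bounds proof is gone *)
      else a
    in
    loop 0 0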
Uniformity and Modularity:
In Coq, we can abstract over:
• terms (i.e., values, functions)
• types (i.e., polymorphism)
• predicates (just another kind of type)
• proofs (just another kind of term)
For example:
vmap : ∀α (P Q : α → Prop).
  ((x:α, P x) → (y:α, Q y)) →
  (xs:vector α, vall P xs) → (ys:vector α, vall Q ys)
where vall P v = ∀(i:nat) (pf: i < |v|), P (sub v i pf)
After-the-Fact Reasoning:
Often, we don’t know in advance which properties we need to specify for a given function, e.g., vector sort:
• (x:vector int) → (y:vector int, pf: |y| = |x|)
• (x:vector int) → (y:vector int, pf: sorted(y))
• (x:vector int) → (y:vector int, pf: permutation(x,y))
Coq lets us prove properties of definitions after the fact (e.g., by induction on the length of the vector) and then export those proofs independently:
• sort : vector int → vector int
• pf1 : ∀x, permutation(x, sort(x))
• pf2 : ∀x, sorted(sort(x))
How Does All This Scale?
X. Leroy [POPL ’06]: a correct, optimizing compiler from C to PowerPC:
• Build an interpreter for C code.
• Build an interpreter for PowerPC code.
• compile : S → (T, pf : Cinterp(S) ≈ PPCinterp(T))
• The compiler is comparable to a good undergraduate-class compiler:
  CSE, constant propagation, register allocation, …
  decomposed into a series of intermediate stages
• Coq extracts OCaml code by erasing the proofs:
  not just modeling the code and proving the model correct.
  software/proof co-design
Bottom line: it’s feasible to build mechanically verified software using this kind of approach.
Great Progress, but…
• The 4,000-line compiler needs:
  7,000 lines of lemmas and theorems
    (includes the interpreters/models of C and PPC code; much is re-usable in other contexts)
  17,000 lines of proof scripts
    (though with the right tactics, this could be cut at least in half; and keep in mind, this is a very deep property)
• Many research opportunities here:
  Advances in SMT provers have not yet been adopted.
  Can we maintain proofs when the code changes?
  Proof scripts (à la Coq) are unreadable, though smaller & less sensitive to change than explicit proofs.
  Explicit proofs (à la Twelf) are bigger, but perhaps force better abstraction, readability, & maintainability.
Another Big Problem:
Systems like Coq are limited to pure, total functions:
• no hash tables, union-find, splay trees, …
  Leroy was forced to use functional data structures.
  Not a bad thing per se, but we should be able to get good algorithmic complexity where needed (e.g., unification).
• no I/O, no exceptions, no diverging computations, no concurrency, …
  So building a server in Coq is out of the question.
Note: you can model these things in Coq, but then you have the model/code disconnect.
Why Purely Functional?
1. Reasoning principles:
foo(x:nat, v:vector int, p: x < |v|) {
  … x := rand(); … sub(v,x,p) … y := 2*x;
}
Predicates (& other types) are persistent:
• what is p’s type after x := rand()?
We also want substitution principles:
• y = 2*x and 2*x = x+x, ergo y = rand()+rand()?
Why Only Total Functions?
• Again, it simplifies reasoning:
  e.g., can use list induction on e to prove sorted(sort e).
• But perhaps more importantly, in ML, Haskell, C#, …:
  fun bot() = bot()   with bot : ∀α. unit → α
• If we could code bot in Coq, then bot() : 42 < 0,
  so the logic becomes inconsistent.
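The same bot, written in OCaml: it typechecks at unit -> 'a, inhabiting every type.

  (* If Coq accepted this, instantiating 'a at (an encoding of)
     42 < 0 would "prove" a falsehood. *)
  let rec bot () : 'a = bot ()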
A Solution: Monads
Similar to Haskell, distinguish purity with types:
• e : int
  e is equivalent to an integer value.
• e : ◇int
  e is a delayed computation which, when run in a world w, either diverges or yields an int and some new world w’.
• Because computations are delayed, they are pure, so we can safely manipulate them within types and proofs.
• e : ◇(42 < 0)
  meaningful, but it means e must diverge when run!
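A minimal OCaml sketch of this “delayed computation” reading (my encoding, with the worlds left implicit): the monad is modeled as a thunk, so building a computation is pure and only running it can loop or have effects.

  type 'a comp = unit -> 'a

  let return (x : 'a) : 'a comp = fun () -> x

  let bind (m : 'a comp) (f : 'a -> 'b comp) : 'b comp =
    fun () -> f (m ()) ()

  (* A computation at an uninhabited result type is fine to write
     down, but must diverge when run -- the analogue of e : ◇(42 < 0). *)
  let never : 'a comp = fun () -> let rec go () = go () in go ()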
Reasoning with ◇:
By refining ◇ with predicates, we can capture the effects of an imperative computation within its type:
e : ◇{P} x:int {Q}
When run in a world satisfying P, e either
• diverges, or else
• terminates with an integer x and a world satisfying Q.
I.e., Hoare logic meets type theory.
Ynot:
• Extends Coq with a refined monad:
  non-termination, (higher-order) state, exceptions
  working on I/O, FFI, and concurrency
• Can extract imperative Haskell code directly.
• Able to code & prove correct imperative ADTs, e.g., hash tables.
• “Principal specifications”:
  automatically infers the weakest pre- and strongest post-condition for straight-line imperative code.
  (Recursive functions require explicit invariants.)
For Example:
insert (h:table) (k:key) (v:value) :=
  Let i := mod(hash k, size h) in
  bucket <- (sub h i _);
  upd h i ((k,v) :: bucket) _
end
What Coq “Infers”:
insert(h:table, k:key, v:value) :
{ bind_pre (read_pre (vsub h (mod(hash k, size h))))
           (read_post (vsub h (mod(hash k, size h))))
           (fun _ : list (key * value) =>
              allocated (vsub h (mod(hash k, size h)))) }
_:unit
{ bind_post (read_pre (vsub h (mod(hash k, size h))))
            (read_post (vsub h (mod(hash k, size h))))
            (fun bucket =>
               write_post (vsub h (mod(hash k, size h))) ((k,v) :: bucket)) }
Essentially, the code is reflected in the type.
In Practice:
Cast to the desired spec, but with an explicit proof:
do : ◇{P1}x:a{Q1} → (P2 ⇒ P1 ∧ Q1 ⇒ Q2) → ◇{P2}x:a{Q2}
For example:
insert (h:table) (k:key) (v:val) : ◇{reps(h,F)} _:unit {reps(h, ins(k,v,F))} :=
  do (Let i := mod(hash k, size h) …) _
<16-line proof script>
Footprints in Hoare Logic
Consider:
inc(x:loc) : ◇{x -> v} _:unit {x -> v+1}
Suppose I know {a -> 0 ∧ b -> 4}.
Q: What do I know after calling inc(a)?
A: Only {a -> 1}.
I lose the fact that b -> 4 (and anything else not specified in the post-condition of inc)!
Minimizing Footprints:
Traditionally, we add a “modifies” clause:
void inc(x:int ref) modifies(x)
But this suffers from many problems:
• As before, HO code needs to abstract over modifies clauses.
• What we modify will depend (in general) on the inputs.
• For ADT interfaces, we don’t want to reveal the specifics of the changed locations.
  e.g., if I change my hash-table buckets from lists to splay trees, the interface has to change.
This has been a key issue since Hoare logic was introduced.
Much Recent Progress
• Substructural types (e.g., linearity, uniqueness):
  At most one reference to a unique object.
  So inc() can’t access a ref unless we pass it directly.
  Therefore, any unique ref not passed to inc() has its value preserved.
  But of course, uniqueness is a huge constraint.
• Confined/ownership types:
  Label references with an “owner” (e.g., the buckets are owned by the Hashtable).
  ADTs are allowed to modify locations they “own” without revealing this in their interfaces.
  Works well for local state, but is problematic when ownership needs to be transferred.
Separation Logic [Reynolds, Pym, O’Hearn, …]
Specifications are rigged to be “parametric” in an extended heap. If e : ◇{P}x:a{Q} then:
• for all heaps h = h1 ⊎ h2 where P(h1),
• when we run e in h, we get out an h’
• where h’ = h1’ ⊎ h2 and Q(h1’).
Anything not mentioned by P is preserved!
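This is exactly the content of separation logic’s frame rule (standard in the literature, stated here in the deck’s notation):

  {P} e {Q}
  ─────────────────────
  {P * R} e {Q * R}

Any R describing the rest of the heap can be carried across the call unchanged, which is what makes modular reasoning about footprints possible.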
For Example:
Define (P * Q)(h) := h = h1 ⊎ h2 and P(h1) and Q(h2).

inc x : ◇{x -> v} _:unit {x -> v+1}
insert h k v : ◇{reps(h,F)} _:unit {reps(h, ins(k,v,F))}

{a -> 0 * b -> 4 * reps(h,F)}
inc(a);
{a -> 1 * b -> 4 * reps(h,F)}
insert h k v;
{a -> 1 * b -> 4 * reps(h, ins(k,v,F))}
To Wrap Up
• Type theory makes it possible to capture & check deep properties of code:
  from simple types to full correctness.
• It provides a uniform, modular framework for:
  types and specifications;
  code, models, and proofs;
  abstraction at all levels.
• Recent advances scale it from pure languages to more realistic programming environments without losing modularity:
  refined monads;
  separation logic.
But Lots to Do:
• More experience:
  e.g., a server guaranteed to make progress.
• Scaling the theory further:
  concurrency;
  total correctness, liveness, …
• More automation:
  better inference;
  adapt good decision procedures from SMT;
  termination analyses, shape analyses, etc.
• Re-think languages & environments:
  in particular, for discharging explicit proofs.
I Remain Optimistic
These obstacles will be overcome. We won’t develop all software with proofs of correctness, but I do believe that within another 10 years:
• type systems for mainstream languages will rule out language-level errors, and many library errors, and
• a lot more safety- & security-critical code will be developed with machine-checked proofs of key properties.