A Translation from Typed Assembly Language to Certified Assembly Programming
Zhong Shao, Yale University
Joint work with Zhaozhong Ni
Paper URL: http://flint.cs.yale.edu/flint/publications/talcap.html
August 11, 2006
Research objective (of the FLINT group)
To build a certified software platform with real guarantees of reliability & security!
[Figure: a software stack on hardware, showing legacy SW layers 1-4 alongside certified L1 software and certified L2-L5 SW]
The lowest SW layer is the key!
Buggy L1 software can easily take over the machine!
[Figure: the same software stack on hardware, with buggy L1 software (or VM) at the bottom and the L2-L5 SW layers above it shown as infected]
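Structure of our certified framework
• certified code (proof + machine code)
• machine model
• safety policy
• mechanized meta-logic
• proof checker
[Figure: the proof and the machine code are fed to the proof checker together with the safety policy; the checker answers Yes or No, and accepted machine code runs on the CPU. Label: "Must be Trusted!"]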
What makes a good mechanized meta logic?
You'd better be very paranoid!
• The logic must be "rock-solid", i.e., consistent!
• The logic must be expressive enough to express everything a hacker wants to say
• It must support explicit, machine-checkable proof objects
• The logic must be simple, so that the proof checker can be hand-verified
• It can serve as a logical framework and a meta-logical framework, so that one can prove things using specialized logics
• It must be compatible with "automated proof construction"
How to scale? Modularity, modularity, modularity!
[Figure: six separately certified modules, each a triple of specification Si, binary code Ci, and formal proof Pi (i = 1..6), are combined by Linking into a single specification S, binary code C, and formal proof P]
Another form of modularity
Software is often organized as a vertical stack of abstractions!
Not everything is certified at the assembly level!
Must accurately specify & certify all these interfaces!
[Figure: a stack on hardware of certified L1 software and certified L2-L5 SW]
A really "juicy" research area …
Many interesting & exciting problems:
• How to certify each standard language and OS abstraction?
  • general code pointers
  • procedure call/return
  • general stack-based control abstraction
  • mutable data structures (& malloc/free …)
  • self-modifying code (& OS boot loader …)
  • interrupt/signal handling
  • device drivers and IO managers
  • thread libraries and synchronization
  • multiprocessor and memory model
  • OS kernel/user abstraction
  • …
• How to combine a proof assistant with general-purpose programming?
• Other exciting interplays between machine-checked proofs & computation
Related research projects at Yale
Certifying different language & OS abstractions:
• certified assembly programming [CAP ESOP'03]
• embedded code pointers [XCAP POPL'06]
• non-preemptive threads [CCAP ICFP'04 & CMAP ICFP'05]
• stack-based control abstractions [SCAP PLDI'06]
• self-modifying code & local reasoning [Cai et al GCAP on-going]
• thread libraries and synchronization [Ni et al on-going]
• interrupts & multiprocessors [Ferreira et al on-going]
• open framework for interoperability [Feng et al OCAP on-going]
• boot-loaders & preemptive threads [Feng et al on-going]
• memory management using malloc/free [CAP ESOP'03]
• garbage collector & mutator [McCreight et al on-going]
Features of a CAP-style system
• All built on a mechanized meta logic (e.g., Coq)
• Both the machine-level program and the property are specified by formulas in the meta logic
  • Like TLA, except that our meta logic is mechanized
• Hoare-style assertions & inference rules enforce both the correctness & type safety properties
  • No need for a separate type system; not a "refinement"
• Assertion languages can vary:
  • Borrow those from Coq (shallow embedding): CAP
  • Hybrid, Coq assertions + a thin layer of syntax: XCAP
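The idea of a shallow embedding, where an assertion is simply a predicate of the host logic over machine states, can be illustrated outside Coq. The following is a minimal Python sketch, not the CAP implementation: the toy register machine, the `step_add` instruction, and the `holds_on_samples` helper are all hypothetical names, and the check runs over a finite set of sample states, so it is testing rather than a machine-checked proof.

```python
from itertools import product

# Toy machine state: a dict from register names to integers (hypothetical model).
def step_add(state, rd, rs, rt):
    """Execute `add rd, rs, rt` and return the new state."""
    new = dict(state)
    new[rd] = state[rs] + state[rt]
    return new

# Shallow embedding: an assertion is just a host-language predicate on states.
def pre(state):
    return state["r1"] >= 0 and state["r2"] >= 0

def post(state):
    return state["r3"] >= 0

def holds_on_samples(pre, instr, post, samples):
    """Check the triple {pre} instr {post} on sample states only (not a proof)."""
    return all(post(instr(s)) for s in samples if pre(s))

# Sample states with r1, r2 ranging over -3..3.
samples = [{"r1": a, "r2": b, "r3": 0} for a, b in product(range(-3, 4), repeat=2)]

def add_r3(state):
    return step_add(state, "r3", "r1", "r2")

ok = holds_on_samples(pre, add_r3, post, samples)
```

In a CAP-style system the same triple would be stated as a formula of the meta logic and discharged by a proof object rather than by enumeration.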
TAL vs. CAP
Type-based approach
• TAL [Morrisett98]
• Touchstone PCC [Colby00]
• Syntactic FPCC [Hamid02]
• FTAL [Crary03]
• LTAL [Chen03]
• …
Strengths: modular; generates proofs easily; type safety
Logic-based approach
• Original PCC [Necula98]
• CAP [Yu03]
• CCAP/CMAP [Yu04, Feng05]
• XCAP [Ni & Shao 06]
• SCAP [Feng et al 06]
• …
Strengths: expressive; advanced properties; good interoperability
Can we have the best of both worlds?
• Can a Hoare-style CAP system support:
  • embedded code pointers?
  • closures?
  • exceptions?
  • runtime stacks?
  • general references w. weak update?
  • recursive data structures?
[Slide annotations: SCAP [Feng et al PLDI'06]; XCAP [Ni & Shao POPL'06]; This talk!]
We also show how to embed TAL into our new XCAP!
Mechanized meta logic All implemented in the Coq proof assistant!
Validity rules for PropX
$[\![P]\!] = \cdot \vdash P$
Memory mutation in XCAP?
• Strong update!
  • special conjunction (p * q) in separation logic
  • directly definable in Prop and PropX
  • explicit alias control, popular in system-level code
• Weak update (general reference)?
  • mutable reference (int ref) in ML
  • managed data pointers (int __gc*) in .NET
  • rely on GC to recycle memory
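To make the separating conjunction concrete, here is a small Python sketch under stated assumptions: heaps are modeled as finite dicts from addresses to values, assertions are predicates on heaps, and `splits`, `points_to`, `star`, and `store` are hypothetical helper names. It only illustrates the meaning of p * q on finite heaps and is not the Prop/PropX definition used in XCAP.

```python
from itertools import combinations

def splits(heap):
    """Yield every way to split a heap (dict addr -> value) into two disjoint parts."""
    addrs = list(heap)
    for k in range(len(addrs) + 1):
        for left in combinations(addrs, k):
            h1 = {a: heap[a] for a in left}
            h2 = {a: heap[a] for a in addrs if a not in left}
            yield h1, h2

def points_to(addr, value):
    """Assertion `addr |-> value`: the heap consists of exactly that one cell."""
    return lambda heap: heap == {addr: value}

def star(p, q):
    """Separating conjunction p * q: the heap splits into disjoint parts,
    one satisfying p and the other satisfying q."""
    return lambda heap: any(p(h1) and q(h2) for h1, h2 in splits(heap))

def store(heap, addr, value):
    """Return a copy of the heap with the cell at `addr` overwritten."""
    new = dict(heap)
    new[addr] = value
    return new

heap = {100: 1, 200: 2}
before = star(points_to(100, 1), points_to(200, 2))
after = star(points_to(100, 7), points_to(200, 2))
updated = store(heap, 100, 7)
```

Because the two conjuncts own disjoint cells, overwriting address 100 changes only the first conjunct and leaves the assertion about address 200 intact, which is what makes strong update sound without extra aliasing reasoning.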
Recursive specification in XCAP?
• Simple recursive data structures!
  • linked list, queue, stack, tree, etc.
  • supported via inductive definition of Prop
• Complex recursive structures with ECP?
  • object (self refers to the entire object)
  • threading invariant (each thread assumes others)
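As an illustration of a simple recursive data structure specification, the sketch below mirrors an inductive linked-list predicate as a recursive Python function over a finite heap. The node layout (value at address p, next pointer at p + 1, 0 as null) and the name `is_list` are assumptions made for this example only; the inductive definition in Coq would be stated over Prop.

```python
def is_list(heap, ptr, values):
    """Return True if `heap` is exactly a null-terminated list at `ptr` holding `values`.

    Assumed layout: a node at address p stores its value at p and its
    next pointer at p + 1; address 0 plays the role of null.
    """
    if not values:
        # Base case: the empty list is the null pointer and owns no memory.
        return ptr == 0 and heap == {}
    if ptr == 0 or ptr not in heap or ptr + 1 not in heap:
        return False
    if heap[ptr] != values[0]:
        return False
    # Inductive case: remove the head node's two cells and check the tail
    # against the remaining (disjoint) part of the heap.
    rest = {a: v for a, v in heap.items() if a not in (ptr, ptr + 1)}
    return is_list(rest, heap[ptr + 1], values[1:])

two_nodes = {10: 1, 11: 20, 20: 2, 21: 0}
cyclic = {10: 1, 11: 10}
```

Removing the head cells before recursing gives each node exclusive ownership of its memory, so a cyclic heap such as `cyclic` is rejected.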
TAL-to-XCAP translation
Step #1: we build a "logic-based" TAL that uses semantic subtyping
Step #2: translating regular TAL into "logic-based" TAL (this is fairly straightforward!)
Step #3: translating "logic-based" TAL into XCAP
TAL-to-XCAP: other translations
• translation of preconditions
• translation of code heap types
• translation of data heap types
Conclusion
XCAP can be extended to support general references, weak update, and recursive specifications!
We give a direct embedding of TAL into XCAP.
[Figure: a software stack of user application, library, device driver, OS kernel, and firmware, labeled with TAL and XCAP]