PhD Thesis Defense

Modular Machine Code Verification

Zhaozhong Ni

Advisor: Zhong Shao

Committee: Zhong Shao, Paul Hudak

Carsten Schürmann, David Walker

Department of Computer Science, Yale University

Nov. 29, 2006


19 Lines of Code on Every PC

swapcontext:
  ; store old context
  mov eax, [esp+4]
  mov [eax+0], OK
  mov [eax+4], ebx
  mov [eax+8], ecx
  mov [eax+12], edx
  mov [eax+16], esi
  mov [eax+20], edi
  mov [eax+24], ebp
  mov [eax+28], esp
  ; load new context
  mov eax, [esp+8]
  mov esp, [eax+28]
  mov ebp, [eax+24]
  mov edi, [eax+20]
  mov esi, [eax+16]
  mov edx, [eax+12]
  mov ecx, [eax+8]
  mov ebx, [eax+4]
  mov eax, [eax+0]
  ret


19 Lines of Code in Every ms

swapcontext:

  • Runs thousands of times per second

  • Used by assembly, C, MSIL, JVML, etc.

  • Basis of multi-tasking, OS, and software

  • Safety and correctness taken for granted


19 Lines of Code Looks Simple

[Figure: machine state before and after call swapcontext. Labels: registers eax, ebx, ecx, edx, esi, edi, ebp, esp; values a1 to a8 and b1 to b8; contexts old and new; OK; return pointers retp and retp’.]


19 Lines of Code Proven Hard

swapcontext:

  • Simple code, complex reasoning!

    • stack / heap / memory mutation

    • procedure call / first-class code pointer

    • protection / polymorphism

  • Lacks specification and verification that are

    • formal (machine checkable in sound logic)

    • general (allows all possible usage of context)

    • realistic (usable from assembly and C level)


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Software Reliability

  • Bugs are costly

  • Especially important for

    • mission-critical software

    • consumer electronics software

    • internet software


Test-Patch Approach

  • Works most of the time

  • Gives no guarantee

  • Could make things worse

[Flowchart: test, debug, and create patch steps around a “pre-release?” decision with yes and no branches.]


Language-based Approach

  • Uses types and other formal specifications

  • Excludes all bugs in certain categories

    illegal command, overflow, dangling pointer, etc.

  • Successful and popular

    ML, Java, C#, etc.

  • Reached virtual machine code level

    JVML, MSIL, TIL, TAL, etc.

  • Meta-theorems can make guarantees


Traditional Assumptions

  • Types are for application software

    you cannot write an OS without (void *)

  • Types are for high-level languages

    not much to talk about 89 84 24 07 5B CD 15

  • Types are only for “no blue screen”

    how about “variable x is a prime number”

  • Type safety is bad for performance

    turn off array-bound checking before release


Program Specification

syntactic types

bool prime (int n) {
  assert (n > 0);
  for (int i = 2; i < n; i ++)
    // n mod 2,…,i-1 ≠ 0
    if (n % i == 0)
      return false;
  // n mod 2,…,n-1 ≠ 0
  return true;
}

machine-logical specifications

meta-logical specifications
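The commented invariants in prime can also be written as executable checks. The following is a minimal sketch in C, assuming a hypothetical helper no_divisor_in and a renamed function prime_checked (neither name is from the slides). Checking at run time only illustrates what the comments claim; it is not the static verification the talk is about.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when no k in [2, hi) divides n. */
bool no_divisor_in(int n, int hi) {
    for (int k = 2; k < hi; k++)
        if (n % k == 0)
            return false;
    return true;
}

/* Same control flow as the slide's prime, with each comment asserted. */
bool prime_checked(int n) {
    assert(n > 0);
    for (int i = 2; i < n; i++) {
        assert(no_divisor_in(n, i));   /* n mod 2,...,i-1 != 0 */
        if (n % i == 0)
            return false;
    }
    assert(no_divisor_in(n, n));       /* n mod 2,...,n-1 != 0 */
    return true;
}
```

As in the slide's version, the precondition n > 0 admits n = 1, for which the loop body never runs and the function returns true.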


Machine Code Verification

  • Motivations

    • everything goes down to binary

    • high-level safety efforts lost in compilation

    • critical code directly written in low level

  • Challenges

    • Expressiveness

    • Modularity

  • Goals

    • both user and system level code

    • modular specification + certification


Proof-Carrying Code

  • Proposed 10 years ago [Necula & Lee]

  • machine code

  • machine checkable proof

[Figure: PCC components: code, specification, proof, meta theory, checker.]


Foundational PCC

  • Proposed by [Appel]

[Figure: FPCC components: code, specification, proof, meta theory, checker, mathematical logic theory, mathematical logic checker.]


Approaches to PCC

  • Type-based PCC

    • TAL [Morrisett98]

    • Touchstone PCC [Colby00]

    • Syntactic FPCC [Hamid02]

    • FTAL [Crary03]

    • LTAL [Chen03]

    • Modular

    • Generate proof easily

    • Type safety

  • Logic-based PCC

    • Original PCC [Necula98]

    • Semantic FPCC [Appel01]

    • CAP [Yu03]

    • Open Verifier [Chang05]

    • CCAP/CMAP [Yu04, Feng05]

    • Expressive

    • Advanced properties

    • Good interoperability


PCC After 10 Years

In principle, can verify any machine code!

In reality, many programs are not verified.

For some code, we do not know HOW!

[Figure: PCC components: code, specification, proof, meta theory, checker.]


User-level Code: List Append

Adapted from [Reynolds02]

……




ECP Problem with Hoare Logic

  • Embedded code pointers (ECP)

    Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses

    “… are difficult to describe in … Hoare logic” [Reynolds02]

  • Previous approaches

    • Ignore ECP [Necula98, Yu04]

    • Limit ECP specifications to types [Hamid04]

    • Sacrifice modularity [Yu03]

    • Use complex indexed semantic models [Appel01]
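To make the notion concrete: an embedded code pointer is a code address stored in data and jumped to later. The sketch below is in C with hypothetical names (task, run, make_task, add_one, twice), none of which come from the talk. The step field of the heap-allocated record is the ECP, and any specification of run has to say something about the code that t->step points to.

```c
#include <assert.h>
#include <stdlib.h>

/* A record with an embedded code pointer (the `step` field). */
struct task {
    int (*step)(int);
    int arg;
};

int add_one(int x) { return x + 1; }
int twice(int x)   { return 2 * x; }

/* Indirect call: the callee is known only through the stored pointer. */
int run(const struct task *t) {
    return t->step(t->arg);
}

/* Build a task on the heap; returns NULL if allocation fails. */
struct task *make_task(int (*step)(int), int arg) {
    struct task *t = malloc(sizeof *t);
    if (t != NULL) {
        t->step = step;
        t->arg = arg;
    }
    return t;
}
```

Because step can be overwritten at any time, what run does depends on the current heap contents, which is the situation a plain Hoare triple for run does not describe on its own.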


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


The XCAP Framework [POPL’06]

  • A logic-based PCC framework

    • modular verification of machine code

    • supports ECP without compromise

  • Supports both system and user code

  • Consists of

    • target machine (not fixed)

    • assertion language (consistency)

    • inference rules (soundness)




Certified Assembly Programming

[Yu03, Hamid04, Yu04, Feng05]

  • Hoare logic in CPS

  • Use general predicate logic for assertions

    example:

  • Mechanized in a proof assistant (Coq)

  • Extensions made: CCAP, CMAP, etc.




The ECP Problem

cptr(f, a) = ?


Previous Approach

  • Internalize Hoare-derivation for ECP

Circularity!

  • Stratification

    [OHearn97, Naumann01]

    • Works for simple case

    • Hard for assembly

    • Hard for polymorphism

  • Step-Indexing

    [Appel01, Appel02, Schneck03]

    • Works for polymorphism

    • Heavyweight

    • Not standard Hoare logic


CAP’s Approach

  • Specify ECP by checking against code spec

  • Verify all code specs are indeed valid

  • Modularity problem


The XCAP Approach

  • Specify ECP independent of code spec

  • Check ECP against global code spec

  • Verify global code spec is indeed valid




How XCAP Works with ECP

(SEQ)

(ECP)

(JMP)

(JD)



Impredicative Polymorphisms

  • Important for ECP

  • Naïve interpretation function fails


New Interpretation

Interpretation

Soundness of interpretation

Consistency


Recursive Specification

  • Simple recursive data structures

    • linked list, queue, stack, tree, etc.

    • supported via inductive definition of Prop

  • Complex recursive structures with ECP

    • object (self refers to the entire object)

    • threading invariant (each thread assumes others)

  • Recursive specification


Memory Mutation

  • Strong update

    • special conjunction (p * q) in separation logic

    • directly definable in Prop and PropX

    • explicit alias control, popular in system level

  • Weak update (general reference)

    • mutable reference (int ref) in ML

    • managed data pointers (int __gc*) in .NET

    • rely on GC to recycle memory

    • popular in user level
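For reference, the separating conjunction has a direct definition over heaps, which is why it can be written in an ordinary predicate logic. The formulation below is the standard one from separation logic; the heap variables h, h_1, h_2 and the use of \uplus for the union of disjoint heaps are notational assumptions, not taken from the slides.

```latex
(p * q)\,h \;\triangleq\; \exists h_1\, h_2.\;\; h = h_1 \uplus h_2 \;\land\; p\,h_1 \;\land\; q\,h_2
```

Since p and q describe disjoint parts of the heap, a store into a cell covered by p cannot invalidate q, so the assertion about that cell may change freely after the store (a strong update).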


Weak Update

  • Reference cell

  • Interpretation

  • Record macro


Implementation in Coq

  • PropX can share similar tactics with Prop


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Why Thread Library?

  • Concurrent verification

    • primitives’ correctness is assumed

    • primitives are not really “primitive”!

    • poor portability due to lack of formal spec

  • Core of OS kernel

    • assignment 1 of OS course

    • written in C and Assembly

    • requires both safety and efficiency


A Mini Thread Library

  • Modeled after Pth

  • Non-preemptive user level threads

  • Written in (subset of) x86 assembly




Verify That 19 Lines of Code

Step 1: specify machine context

Step 2: specify function call/return

Step 3: specify swapcontext()

Step 4: prove it!


Machine Context

typedef struct mctx_st *mctx_t;

struct mctx_st { int eax; int ebx; int ecx; int edx;
                 int esi; int edi; int ebp; int esp; };

[Figure: layout of an mctx record. Labels: retv, bx, cx, dx, cs, si, di, bp, sp, ret; regions marked public and private.]
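The operands [eax+0] through [eax+28] in swapcontext correspond to eight consecutive 4-byte fields of mctx_st. The sketch below checks that correspondence at compile time. It assumes C11 (_Static_assert) and uses int32_t instead of int so that the layout does not depend on the host; that substitution is an assumption made here, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width fields so the layout matches 32-bit x86 on any host. */
struct mctx_st {
    int32_t eax, ebx, ecx, edx;
    int32_t esi, edi, ebp, esp;
};
typedef struct mctx_st *mctx_t;

/* Compile-time checks against the [eax+N] operands in swapcontext. */
_Static_assert(offsetof(struct mctx_st, eax) == 0,  "eax at +0");
_Static_assert(offsetof(struct mctx_st, ebx) == 4,  "ebx at +4");
_Static_assert(offsetof(struct mctx_st, ecx) == 8,  "ecx at +8");
_Static_assert(offsetof(struct mctx_st, edx) == 12, "edx at +12");
_Static_assert(offsetof(struct mctx_st, esi) == 16, "esi at +16");
_Static_assert(offsetof(struct mctx_st, edi) == 20, "edi at +20");
_Static_assert(offsetof(struct mctx_st, ebp) == 24, "ebp at +24");
_Static_assert(offsetof(struct mctx_st, esp) == 28, "esp at +28");
_Static_assert(sizeof(struct mctx_st) == 32, "eight 4-byte slots");
```

If any field were reordered or widened, compilation would fail, which is a cheap guard for the C declaration but says nothing yet about the behavior of the assembly routine.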


Function Call / Return

[Figure: stack layout around a call, listed as on the slide: excess space; local storage; esp; return address; argument 1; argument 2; argument n; caller frames.]

swapcontext()

void swapcontext (mctx_t old, mctx_t new);

mov eax, [esp+4]

mov [eax+ 0], OK

mov [eax+ 4], ebx

mov [eax+ 8], ecx

mov [eax+12], edx

mov [eax+16], esi

mov [eax+20], edi

mov [eax+24], ebp

mov [eax+28], esp

mov eax, [esp+8]

mov esp, [eax+28]

mov ebp, [eax+24]

mov edi, [eax+20]

mov esi, [eax+16]

mov edx, [eax+12]

mov ecx, [eax+ 8]

mov ebx, [eax+ 4]

mov eax, [eax+ 0]

ret
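The effect of these instructions can be traced on a toy machine. The sketch below is a C model, not the formal x86 semantics used in the thesis: machine_t, ld, st, and swapcontext_model are hypothetical names, memory is a small word array, the value of OK is assumed to be 1, and esp is assumed to point at the return address on entry (so [esp+4] is old and [esp+8] is new).

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64u          /* toy memory: 64 words = 256 bytes */
#define OK 1u                  /* assumed value; the slides only name it */

typedef struct {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp, eip;
    uint32_t mem[MEM_WORDS];
} machine_t;

/* Word load/store at a byte address (must be 4-aligned and in range). */
static uint32_t ld(const machine_t *m, uint32_t addr) {
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    return m->mem[addr / 4];
}
static void st(machine_t *m, uint32_t addr, uint32_t v) {
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    m->mem[addr / 4] = v;
}

/* One statement per instruction, except ret, which takes two. */
void swapcontext_model(machine_t *m) {
    m->eax = ld(m, m->esp + 4);          /* eax = old */
    st(m, m->eax + 0, OK);
    st(m, m->eax + 4, m->ebx);
    st(m, m->eax + 8, m->ecx);
    st(m, m->eax + 12, m->edx);
    st(m, m->eax + 16, m->esi);
    st(m, m->eax + 20, m->edi);
    st(m, m->eax + 24, m->ebp);
    st(m, m->eax + 28, m->esp);
    m->eax = ld(m, m->esp + 8);          /* eax = new */
    m->esp = ld(m, m->eax + 28);
    m->ebp = ld(m, m->eax + 24);
    m->edi = ld(m, m->eax + 20);
    m->esi = ld(m, m->eax + 16);
    m->edx = ld(m, m->eax + 12);
    m->ecx = ld(m, m->eax + 8);
    m->ebx = ld(m, m->eax + 4);
    m->eax = ld(m, m->eax + 0);
    m->eip = ld(m, m->esp);              /* ret: pop the return address */
    m->esp += 4;
}
```

The model makes one point visible: the return address that ret pops is read from the new stack, so control resumes wherever the new context was suspended, not at the caller of this invocation.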


Other Context Routines

void loadcontext (mctx_t mctx);

void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);


Thread Control Block

[Figure: queue q of thread control blocks (mth). Each block has a next pointer, a state, and a machine context; the last next pointer is NULL.]

typedef struct mth_st *mth_t;

struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };
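A queue of such blocks is an ordinary NULL-terminated singly linked list. The sketch below shows one way to maintain it in C. The queue type mth_queue, the functions rq_push and rq_pop, and the three mth_state_t values are assumptions introduced for illustration; the slides do not show the library's own queue code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed thread states; the slides do not list them. */
typedef enum { MTH_READY, MTH_RUNNING, MTH_DEAD } mth_state_t;

struct mctx_st { int eax, ebx, ecx, edx, esi, edi, ebp, esp; };

typedef struct mth_st *mth_t;
struct mth_st {
    mth_t next;
    mth_state_t state;
    struct mctx_st mctx;
};

/* Hypothetical ready queue: singly linked, NULL-terminated, FIFO. */
struct mth_queue { mth_t head, tail; };

/* Append t at the tail and mark it ready. */
void rq_push(struct mth_queue *q, mth_t t) {
    t->next = NULL;
    t->state = MTH_READY;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

/* Remove and return the head, or NULL if the queue is empty. */
mth_t rq_pop(struct mth_queue *q) {
    mth_t t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
        t->next = NULL;
    }
    return t;
}
```

FIFO order gives round-robin scheduling when a yielding thread is pushed back at the tail.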


Threading Invariant

[Figure: threading invariant. Labels: mctx_sched, sched, scheduler context; mth_cur, cur; mth_rq, st, ready threads.]


Threading Routines

void mth_yield (void);

mth_t mth_spawn (int stacksize, void *(*func)(void *), void *arg);

void mth_scheduler (void);
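The interplay of yield and scheduler can be tried out with the POSIX context routines. The sketch below is not the mini thread library from the talk: it uses getcontext, makecontext, and swapcontext from <ucontext.h>, which operate on ucontext_t rather than mctx_t, and it assumes a Linux/glibc system. The names run_demo, worker, and yield_to_scheduler are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define NTHREADS 2
#define STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx;             /* scheduler's saved context */
static ucontext_t th_ctx[NTHREADS];      /* one context per thread */
static char stacks[NTHREADS][STACK_SIZE];
static int done[NTHREADS];
static int cur;                          /* index of the running thread */
static char trace[32];
static size_t trace_len;

/* Cooperative yield: save this thread, resume the scheduler. */
static void yield_to_scheduler(void) {
    swapcontext(&th_ctx[cur], &sched_ctx);
}

static void worker(void) {
    int id = cur;                        /* kept on this thread's own stack */
    for (int step = 0; step < 2; step++) {
        trace[trace_len++] = (char)('A' + id);
        yield_to_scheduler();
    }
    done[id] = 1;                        /* returning resumes uc_link */
}

/* Round-robin until every thread has finished; returns the run trace. */
const char *run_demo(void) {
    trace_len = 0;
    memset(done, 0, sizeof done);
    for (int i = 0; i < NTHREADS; i++) {
        getcontext(&th_ctx[i]);
        th_ctx[i].uc_stack.ss_sp = stacks[i];
        th_ctx[i].uc_stack.ss_size = STACK_SIZE;
        th_ctx[i].uc_link = &sched_ctx;
        makecontext(&th_ctx[i], worker, 0);
    }
    for (int live = NTHREADS; live > 0; ) {
        live = 0;
        for (cur = 0; cur < NTHREADS; cur++) {
            if (!done[cur]) {
                swapcontext(&sched_ctx, &th_ctx[cur]);
                live += !done[cur];
            }
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Each worker records a letter and yields twice, so the two threads interleave. The only place control moves between stacks is swapcontext, which is why its specification carries the weight of the whole threading invariant.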


Implementation

  • 40,000 lines of Coq code

  • Where does the complexity come from?

    • lemma library: large and reusable

    • x86 machine: finite integer

    • embedding: de Bruijn indices

    • engineering: limited proof re-use

    • target code: this is the kernel of software!


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Typed Assembly Language

  • TAL [Morrisett et al]

  • Top-level typing judgment

  • Target of type-preserving compilation

  • For user and simple system level code


TAL to XCAP Translation (1)

  • Translation of value types


TAL to XCAP Translation (2)

  • Translation of preconditions

  • Translation of code heap types

  • Translation of data heap types



Application Scenario

[Figure: application scenario. Labels: TAL; user application; library; device driver; OS kernel; XCAP; firmware.]


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Summarizing XCAP

  • Support user-level machine code

    • demonstrated by type-preserving translation

  • Support system-level machine code

    • demonstrated by mini thread library

  • Support modular machine code verification

    • as modular as types

    • as expressive as logic


Other Work

  • A syntactic approach to FPCC [LICS’02]

    • Simple type safety, no need of indexed model

  • Stack-based control abstractions [PLDI’06]

    • utilizes the fixed ECP pattern to simplify things

  • An open framework for FPCC [TLDI’07]

    • allows different verification styles in a system


Some Future Directions

  • Add logic power to higher level languages

    C and C#, certifying compilation

  • Certify safe “unsafe” code

    garbage collector, preemptive thread library, device driver, etc.

  • Consider other properties

    correctness, liveness, security, etc.

  • Build tools for productivity

    concrete syntax and parser, large lemma libraries, etc.