PhD Thesis Defense

Modular Machine Code Verification

Zhaozhong Ni

Advisor: Zhong Shao

Committee: Zhong Shao, Paul Hudak

Carsten Schürmann, David Walker

Department of Computer Science, Yale University

Nov. 29, 2006


19 Lines of Code on Every PC

swapcontext:
  ; store old context
  mov eax, [esp+4]
  mov [eax+0], OK
  mov [eax+4], ebx
  mov [eax+8], ecx
  mov [eax+12], edx
  mov [eax+16], esi
  mov [eax+20], edi
  mov [eax+24], ebp
  mov [eax+28], esp
  ; load new context
  mov eax, [esp+8]
  mov esp, [eax+28]
  mov ebp, [eax+24]
  mov edi, [eax+20]
  mov esi, [eax+16]
  mov edx, [eax+12]
  mov ecx, [eax+8]
  mov ebx, [eax+4]
  mov eax, [eax+0]
  ret


19 Lines of Code in Every ms

swapcontext:

  • Runs thousands of times per second

  • Used by assembly, C, MSIL, JVML, etc.

  • Basis of multi-tasking, OS, and software

  • Safety and correctness taken for granted


19 Lines of Code Look Simple

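[Diagram: effect of call swapcontext with arguments old and new. Registers eax, ebx, ecx, edx, esi, edi, ebp, esp hold a1 to a8 before the call. The diagram also shows the values OK and a2 to a8 with return pointer retp, and the values b1 to b8 with return pointer retp’.]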


19 Lines of Code Proven Hard

swapcontext:

  • Simple code, complex reasoning!

    • stack / heap / memory mutation

    • procedure call / first-class code pointer

    • protection / polymorphism

  • Lacks specification and verification that are

    • formal (machine checkable in sound logic)

    • general (allows all possible usage of context)

    • realistic (usable from assembly and C level)


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Software Reliability

  • Bugs are costly

  • Especially important for

    • mission-critical software

    • consumer electronics software

    • internet software


Test-Patch Approach

  • Works most of the time

  • Gives no guarantee

  • Could make things worse

[Flowchart: test, debug, pre-release? (yes / no), create patch]


Language-based Approach

  • Uses types and other formal specifications

  • Excludes all bugs in certain categories

    illegal command, overflow, dangling pointer, etc.

  • Successful and popular

    ML, Java, C#, etc.

  • Reached virtual machine code level

    JVML, MSIL, TIL, TAL, etc.

  • Meta-theorems can make guarantees


Traditional Assumptions

  • Types are for application software

    you cannot write an OS without (void *)

  • Types are for high-level languages

    not much to talk about 89 84 24 07 5B CD 15

  • Types are only for “no blue screen”

    how about “variable x is a prime number”

  • Type safety is bad for performance

    turn off array-bound checking before release


Program Specification

bool prime (int n) {
  assert (n > 0);
  for (int i = 2; i < n; i ++)
    // n mod 2,…,i-1 ≠ 0
    if (n % i == 0)
      return false;
  // n mod 2,…,n-1 ≠ 0
  return true;
}

[Labels on the code: syntactic types, machine-logical specifications, meta-logical specifications]
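The three levels of specification can be made concrete with a small C sketch. It assumes that "syntactic types" refers to the function signature, "machine-logical specifications" to the executable assert, and "meta-logical specifications" to the comments. The helper no_divisor_below is hypothetical and exists only to turn the commented invariants into runtime checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when no k in [2, bound) divides n.
   It makes the commented loop invariant executable. */
static bool no_divisor_below(int n, int bound) {
    for (int k = 2; k < bound; k++)
        if (n % k == 0)
            return false;
    return true;
}

/* Syntactic types: the signature only says "int in, bool out". */
bool prime(int n) {
    /* Machine-logical specification: a check the machine can evaluate. */
    assert(n > 0);
    for (int i = 2; i < n; i++) {
        /* Meta-logical specification: n mod 2, ..., i-1 != 0. */
        assert(no_divisor_below(n, i));
        if (n % i == 0)
            return false;
    }
    /* Meta-logical specification: n mod 2, ..., n-1 != 0. */
    assert(no_divisor_below(n, n));
    return true;
}
```

The types alone cannot rule out a wrong answer; only the logical annotations state what the result means. As written on the slide, the loop never runs for n = 1, so the function returns true for that input.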


Machine Code Verification

  • Motivations

    • everything goes down to binary

    • high-level safety efforts lost in compilation

    • critical code directly written in low level

  • Challenges

    • Expressiveness

    • Modularity

  • Goals

    • both user and system level code

    • modular specification + certification


Proof-Carrying Code

  • Proposed 10 years ago [Necula & Lee]

  • machine code

  • machine checkable proof

[Diagram labels: Code, Specification, Proof, Meta theory, Checker]


Foundational PCC

  • Proposed by [Appel]

[Diagram labels: Code, Specification, Proof, Meta theory, Checker, mathematical logic theory, mathematical logic checker]


Approaches to PCC

  • Type-based PCC

    • TAL [Morrisett98]

    • Touchstone PCC [Colby00]

    • Syntactic FPCC [Hamid02]

    • FTAL [Crary03]

    • LTAL [Chen03]

    • Modular

    • Generate proof easily

    • Type safety

  • Logic-based PCC

    • Original PCC [Necula98]

    • Semantic FPCC [Appel01]

    • CAP [Yu03]

    • Open Verifier [Chang05]

    • CCAP/CMAP [Yu04, Feng05]

    • Expressive

    • Advanced properties

    • Good interoperability


PCC After 10 Years

In principle, can verify any machine code!

In reality, many programs are not verified.

For some code, we do not know HOW!

[Diagram labels: Code, Specification, Proof, Meta theory, Checker]


User-level Code: List Append

Adapted from [Reynolds02]

……


ECP Problem w. Hoare Logic

  • Embedded code pointers (ECP)

    Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses

    “… are difficult to describe in … Hoare logic” [Reynolds02]

  • Previous approaches

    • Ignore ECP [Necula98, Yu04]

    • Limit ECP specifications to types [Hamid04]

    • Sacrifice modularity [Yu03]

    • Use complex indexed semantic models [Appel01]
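To make the term concrete, here is a minimal C sketch of embedded code pointers: a code pointer stored inside a data record and invoked through an indirect call. All names (closure_t, apply, twice, add_env, mul_env) are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* A data record that embeds a code pointer. */
typedef struct closure {
    int (*code)(int env, int arg); /* embedded code pointer */
    int env;                       /* data stored alongside it */
} closure_t;

int add_env(int env, int arg) { return arg + env; }
int mul_env(int env, int arg) { return arg * env; }

/* Indirect call: the jump target is only known at run time. */
int apply(const closure_t *c, int arg) {
    return c->code(c->env, arg);
}

/* Higher-order use: run the same closure twice. */
int twice(const closure_t *c, int arg) {
    return apply(c, apply(c, arg));
}
```

A Hoare-style specification of apply has to say something about whatever code c->code points to, which is where the difficulty described above comes from.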


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


The XCAP Framework [POPL’06]

  • A logic-based PCC framework

    • modular verification of machine code

    • supports ECP without compromise

  • Support both system and user code

  • Consists of

    • target machine (not fixed)

    • assertion language (consistency)

    • inference rules (soundness)


Target Machine


Dynamic Semantics


Certified Assembly Programming

[Yu03, Hamid04, Yu04, Feng05]

  • Hoare logic in CPS

  • Use general predicate logic for assertions

    example:

  • Mechanized in a proof assistant (Coq)

  • Extensions made: CCAP, CMAP, etc.


How CAP Certifies Instructions


How CAP Certifies Programs


The ECP Problem

cptr(f, a) = ?


Previous Approach

  • Internalize Hoare-derivation for ECP

Circularity!

  • Stratification

    [OHearn97, Naumann01]

    • Works for simple case

    • Hard for assembly

    • Hard for polymorphism

  • Step-Indexing

    [Appel01, Appel02, Schneck03]

    • Works for polymorphism

    • Heavyweight

    • Not standard Hoare logic


CAP’s Approach

  • Specify ECP by checking against code spec

  • Verify all code specs are indeed valid

  • Modularity problem


The XCAP Approach

  • Specify ECP independent of code spec

  • Check ECP against global code spec

  • Verify global code spec is indeed valid
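One way to picture these three steps is a lookup table: the global code specification (called psi here) maps code labels to assertions, and a claim cptr(f, a) is checked by looking f up in that table. The C sketch below is a loose analogy and not the XCAP definition. Assertions are reduced to integer identifiers, and every name in it (label_t, assertion_id, spec_entry, code_spec, cptr_holds) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned label_t;     /* code label f */
typedef int assertion_id;     /* stand-in for an assertion a */

typedef struct spec_entry {
    label_t f;
    assertion_id a;
} spec_entry;

/* Global code specification: a finite map from labels to assertions. */
typedef struct code_spec {
    const spec_entry *entries;
    size_t len;
} code_spec;

/* In this toy model, cptr(f, a) holds under psi
   when psi maps f to exactly a. */
bool cptr_holds(const code_spec *psi, label_t f, assertion_id a) {
    for (size_t i = 0; i < psi->len; i++)
        if (psi->entries[i].f == f)
            return psi->entries[i].a == a;
    return false; /* f is not a known code label */
}
```

A module can mention cptr(f, a) without knowing the rest of the table, and the check against the table happens once, when the whole program is assembled.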


Extended Propositions


XCAP Rules


How XCAP Works with ECP

[Derivation figure using the rules (SEQ), (ECP), (JMP), (JD)]


Verification of append()


Impredicative Polymorphism

  • Important for ECP

  • Naïve interpretation function fails


New Interpretation

Interpretation

Soundness of interpretation

Consistency


Recursive Specification

  • Simple recursive data structures

    • linked list, queue, stack, tree, etc.

    • supported via inductive definition of Prop

  • Complex recursive structures with ECP

    • object (self refers to the entire object)

    • threading invariant (each thread assumes others)

  • Recursive specification


Memory Mutation

  • Strong update

    • special conjunction (p * q) in separation logic

    • directly definable in Prop and PropX

    • explicit alias control, popular at the system level

  • Weak update (general reference)

    • mutable reference (int ref) in ML

    • managed data pointers (int __gc*) in .NET

    • relies on GC to recycle memory

    • popular at the user level


Weak Update

  • Reference cell

  • Interpretation

  • Record macro


Implementation in Coq

  • PropX can share similar tactics with Prop


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Why Thread Library?

  • Concurrent verification

    • primitives’ correctness is assumed

    • primitives are not really “primitive”!

    • poor portability due to lack of formal spec

  • Core of OS kernel

    • assignment 1 of an OS course

    • written in C and Assembly

    • requires both safety and efficiency


A Mini Thread Library

  • Modeled after Pth

  • Non-preemptive user-level threads

  • Written in (subset of) x86 assembly


Threading Model


Modules and Interfaces


Verify That 19 Lines of Code

Step 1: specify machine context

Step 2: specify function call/return

Step 3: specify swapcontext()

Step 4: prove it!


Machine Context

typedef struct mctx_st *mctx_t;

struct mctx_st { int eax; int ebx; int ecx; int edx;
                 int esi; int edi; int ebp; int esp; };

[Diagram: machine context mctx with slots retv, bx, cx, dx, si, di, bp, sp, and ret, annotated with the labels public, private, and cs]
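The context layout lines up with the offsets used by the assembly code (eax at +0 through esp at +28), and this can be checked mechanically. The sketch below assumes 4-byte registers and uses int32_t instead of int to pin that width down. The accessor mctx_slot is a hypothetical helper that mirrors the [eax+N] addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct mctx_st *mctx_t;

/* Assumption: each saved register is one 32-bit word, as on x86-32. */
struct mctx_st {
    int32_t eax, ebx, ecx, edx;
    int32_t esi, edi, ebp, esp;
};

/* Compile-time checks of the offsets the assembly relies on. */
_Static_assert(offsetof(struct mctx_st, eax) == 0, "eax at +0");
_Static_assert(offsetof(struct mctx_st, esp) == 28, "esp at +28");
_Static_assert(sizeof(struct mctx_st) == 32, "eight 4-byte slots");

/* Hypothetical accessor: the word at [m + byte_offset]. */
int32_t *mctx_slot(mctx_t m, size_t byte_offset) {
    return (int32_t *)((char *)m + byte_offset);
}
```

If a compiler inserted padding or used a different integer width, the static assertions would fail at compile time rather than letting the assembly read the wrong slot.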


Function Call / Return

[Stack layout, top to bottom as drawn: excess space; local storage; esp pointing at the return address; argument 1; argument 2; argument n; caller frames]


swapcontext()

void swapcontext (mctx_t old, mctx_t new);

mov eax, [esp+4]
mov [eax+ 0], OK
mov [eax+ 4], ebx
mov [eax+ 8], ecx
mov [eax+12], edx
mov [eax+16], esi
mov [eax+20], edi
mov [eax+24], ebp
mov [eax+28], esp
mov eax, [esp+8]
mov esp, [eax+28]
mov ebp, [eax+24]
mov edi, [eax+20]
mov esi, [eax+16]
mov edx, [eax+12]
mov ecx, [eax+ 8]
mov ebx, [eax+ 4]
mov eax, [eax+ 0]
ret
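The effect of these 19 instructions can be replayed on a toy machine model in C. This is a sketch under assumptions, not the formal model used in the thesis: memory is a small array of 32-bit words addressed by byte offset, the constant OK is given the value 1, ret is modeled as popping the return address into a pc field, and the names machine, load, store, and sim_swapcontext are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, NREGS };
enum { OK = 1, MEM_WORDS = 64 }; /* the value of OK is an assumption */

typedef struct machine {
    uint32_t reg[NREGS];
    uint32_t mem[MEM_WORDS]; /* byte addresses 0..255, word aligned */
    uint32_t pc;
} machine;

static uint32_t load(const machine *m, uint32_t addr) {
    return m->mem[addr / 4];
}

static void store(machine *m, uint32_t addr, uint32_t v) {
    m->mem[addr / 4] = v;
}

/* On entry: [esp] = return address, [esp+4] = old, [esp+8] = new. */
void sim_swapcontext(machine *m) {
    uint32_t *r = m->reg;

    /* store old context */
    r[EAX] = load(m, r[ESP] + 4);
    store(m, r[EAX] + 0, OK);
    for (uint32_t i = EBX; i <= ESP; i++)
        store(m, r[EAX] + 4 * i, r[i]);

    /* load new context: esp first, eax last, as in the listing */
    r[EAX] = load(m, r[ESP] + 8);
    r[ESP] = load(m, r[EAX] + 28);
    for (uint32_t i = EBP; i >= EBX; i--)
        r[i] = load(m, r[EAX] + 4 * i);
    r[EAX] = load(m, r[EAX] + 0);

    /* ret: pop the return address from the new stack */
    m->pc = load(m, r[ESP]);
    r[ESP] += 4;
}
```

Even in this simplified model, the return address that ret uses comes from the new stack, not from the stack the caller pushed onto. A specification of swapcontext therefore has to describe the code pointer stored in the new context.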


Other Context Routines

void loadcontext (mctx_t mctx);

void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);


Thread Control Block

[Diagram: queue q pointing to a NULL-terminated linked list of mth thread control blocks, each with next, state, and machine context fields]

typedef struct mth_st *mth_t;

struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };
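A queue of thread control blocks can be sketched in C as a singly linked, NULL-terminated list. The states in mth_state_t, the mth_queue type, and the functions mth_enqueue and mth_dequeue are assumptions made for illustration. The source only gives the struct layout.

```c
#include <assert.h>
#include <stddef.h>

struct mctx_st { int eax, ebx, ecx, edx, esi, edi, ebp, esp; };

/* Assumed set of thread states. */
typedef enum { MTH_READY, MTH_RUNNING, MTH_DEAD } mth_state_t;

typedef struct mth_st *mth_t;
struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };

/* Hypothetical queue: head for removal, tail for insertion. */
typedef struct mth_queue { mth_t head, tail; } mth_queue;

void mth_enqueue(mth_queue *q, mth_t t) {
    t->next = NULL;
    if (q->tail)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

/* Returns NULL when the queue is empty. */
mth_t mth_dequeue(mth_queue *q) {
    mth_t t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head)
            q->tail = NULL;
        t->next = NULL;
    }
    return t;
}
```

Because each block embeds a machine context, and each context holds a saved return address, the queue is a data structure that contains code pointers.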


Threading Invariant

[Diagram: mctx_sched (labelled sched, the scheduler context), mth_cur (labelled cur), and mth_rq (labelled st, the ready threads)]


Threading Routines

void mth_yield (void);

mth_t mth_spawn (int stacksize,
                 void *(*func)(void *),
                 void *arg);

void mth_scheduler (void);
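The control flow of non-preemptive scheduling can be sketched in portable C without any context switching. This sketch swaps in a simpler technique than the library's: each task is a step function that runs one slice and returns true to yield or false when it is done, and the scheduler calls the live tasks round-robin. The names sched, sched_spawn, sched_run, and countdown are hypothetical and are not the library's API.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_TASKS = 8 };

/* One call runs one slice: true = yield and stay ready, false = done. */
typedef bool (*task_fn)(void *arg);

typedef struct task { task_fn fn; void *arg; bool live; } task;
typedef struct sched { task tasks[MAX_TASKS]; int count; } sched;

/* Returns the task index, or -1 when the table is full. */
int sched_spawn(sched *s, task_fn fn, void *arg) {
    if (s->count == MAX_TASKS)
        return -1;
    s->tasks[s->count] = (task){ fn, arg, true };
    return s->count++;
}

/* Round-robin until every task has finished; returns slices executed. */
int sched_run(sched *s) {
    int slices = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < s->count; i++) {
            task *t = &s->tasks[i];
            if (!t->live)
                continue;
            t->live = t->fn(t->arg);
            slices++;
            progress = true;
        }
    }
    return slices;
}

/* Example task: decrements a counter and yields until it reaches zero. */
bool countdown(void *arg) {
    int *n = arg;
    return --*n > 0;
}
```

A real thread library keeps each thread's stack and registers alive across yields by switching machine contexts, which this sketch avoids by making tasks restartable functions.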


Implementation

  • 40,000 lines of Coq code

  • Where does the complexity come from?

    • lemma library: large and reusable

    • x86 machine: finite integer

    • embedding: de Bruijn indices

    • engineering: limited proof re-use

    • target code: this is the kernel of software!
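The "finite integer" item is easy to underestimate. On a 32-bit machine, address arithmetic such as eax+28 wraps modulo 2^32, so a proof cannot treat it as ordinary integer addition without a side condition. A minimal C sketch, with hypothetical helper names add32 and add32_overflows:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit machine addition: the result wraps modulo 2^32. */
uint32_t add32(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b);
}

/* Side condition a proof needs before treating add32 as plain "+". */
bool add32_overflows(uint32_t a, uint32_t b) {
    return a > UINT32_MAX - b;
}
```

Every offset computation in the verified code carries an obligation of this kind, which is one reason the lemma library grows large.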


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Typed Assembly Language

  • TAL [Morrisett et al]

  • Top-level typing judgment

  • Target of type-preserving compilation

  • For user and simple system level code


TAL to XCAP Translation (1)

  • Translation of value types


TAL to XCAP Translation (2)

  • Translation of preconditions

  • Translation of code heap types

  • Translation of data heap types


Typing Preservation


Application Scenario

[Diagram: software stack of user application, library, device driver, OS kernel, and firmware, annotated with TAL and XCAP]


Outline

  • Introduction

  • The XCAP Framework

  • Mini Thread Library

  • Connect XCAP to TAL

  • Conclusion


Summarizing XCAP

  • Support user-level machine code

    • demonstrated by type-preserving translation

  • Support system-level machine code

    • demonstrated by mini thread library

  • Support modular machine code verification

    • as modular as types

    • as expressive as logic


Other Work

  • A syntactic approach to FPCC [LICS’02]

    • simple type safety, no need for an indexed model

  • Stack-based control abstractions [PLDI’06]

    • utilizes the fixed ECP pattern to simplify things

  • An open framework for FPCC [TLDI’07]

    • allows different verification styles in a system


Some Future Directions

  • Add logic power to higher level languages

    C and C#, certifying compilation

  • Certify safe “unsafe” code

    garbage collector, preemptive thread library, device driver, etc.

  • Consider other properties

    correctness, liveness, security, etc.

  • Build tools for productivity

    concrete syntax and parser, large lemma libraries, etc.


Thank You!

