
Bit Vector Decision Procedures: A Basis for Reasoning about Hardware & Software

Randal E. Bryant

Carnegie Mellon University

http://www.cs.cmu.edu/~bryant


Collaborators

  • Sanjit Seshia, Bryan Brady

    • UC Berkeley

  • Daniel Kroening

    • ETH Zurich

  • Joel Ouaknine

    • Oxford University

  • Ofer Strichman

    • Technion

  • Randy Bryant

    • Carnegie Mellon

3 continents

4 countries

5 institutions

6 authors


Motivating Example #1

  • Do these functions produce identical results?

  • Strategy

    • Represent and reason about bit-level program behavior

    • Specific to machine word size, integer representations, and operations

int abs(int x) {
    /* mask is all ones when x is negative (arithmetic shift), else zero */
    int mask = x>>31;
    return (x ^ mask) + ~mask + 1;
}

int test_abs(int x) {
    return (x < 0) ? -x : x;
}
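One way to see what "reasoning at the bit level" means here is to shrink the word size until exhaustive checking is feasible. The sketch below is an illustration under assumptions, not the method from the talk: it models both functions on 8-bit two's-complement words stored in a `uint8_t`, and the names `abs_bits8`, `test_abs8`, and `check_abs_equivalence` are hypothetical. A decision procedure would answer the same question for 32-bit words without enumerating inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free version on an 8-bit two's-complement word. The word is kept
   in a uint8_t so every operation wraps modulo 256. */
uint8_t abs_bits8(uint8_t x) {
    /* All ones when the sign bit is set, else zero: models x >> 7 as an
       arithmetic shift without relying on signed-shift behavior in C. */
    uint8_t mask = (x & 0x80u) ? 0xFFu : 0x00u;
    return (uint8_t)((x ^ mask) + (uint8_t)~mask + 1u);
}

/* Conditional version: negate modulo 256 when the sign bit is set. */
uint8_t test_abs8(uint8_t x) {
    return (x & 0x80u) ? (uint8_t)(0u - x) : x;
}

/* Exhaustively compare the two versions on all 256 inputs. */
void check_abs_equivalence(void) {
    for (unsigned v = 0; v < 256; v++) {
        assert(abs_bits8((uint8_t)v) == test_abs8((uint8_t)v));
    }
}
```

Both versions map the most negative word (0x80) to itself, which is the kind of finite-representation detail that an unbounded-integer model would miss.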


Motivating Example #2

void fun() {
    char fmt[16];
    fgets(fmt, 16, stdin);
    fmt[15] = '\0';
    /* user-controlled string is used directly as the format string */
    printf(fmt);
}

  • Is there an input string that causes value 234 to be written to address a4a3a2a1?

  • Answer

    • Yes: "a1a2a3a4%230g%n"

      • Depends on details of compilation

    • But no exploit for buffer size less than 8

    • [Ganapathy, Seshia, Jha, Reps, Bryant, ICSE ’05]


Motivating Example #3

bit[W] popSpec(bit[W] x)
{
    int cnt = 0;
    for (int i=0; i<W; i++) {
        if (x[i]) cnt++;
    }
    return cnt;
}

bit[W] popSketch(bit[W] x)
{
    loop (??) {
        x = (x&??) + ((x>>??)&??);
    }
    return x;
}

  • Is there a way to expand the program sketch to make it match the spec?

  • Answer

    • W=16:

x = (x&0x5555) + ((x>>1)&0x5555);
x = (x&0x3333) + ((x>>2)&0x3333);
x = (x&0x0077) + ((x>>8)&0x0077);
x = (x&0x000f) + ((x>>4)&0x000f);

    • [Solar-Lezama, et al., ASPLOS ’06]
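The W=16 answer uses an unusual step order (the shift by 8 comes before the shift by 4), so it is worth checking rather than trusting by eye. The sketch below compares it against the bit-by-bit specification on all 65,536 inputs. The function names are hypothetical, and plain C stands in for the sketching language used on the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Specification: count the 1 bits of a 16-bit word one bit at a time. */
unsigned pop_spec16(uint16_t x) {
    unsigned cnt = 0;
    for (int i = 0; i < 16; i++) {
        if ((x >> i) & 1u) cnt++;
    }
    return cnt;
}

/* The completed sketch from the slide for W = 16. */
unsigned pop_sketch16(uint16_t x) {
    x = (uint16_t)((x & 0x5555) + ((x >> 1) & 0x5555)); /* 2-bit fields hold 0..2 */
    x = (uint16_t)((x & 0x3333) + ((x >> 2) & 0x3333)); /* 4-bit fields hold 0..4 */
    x = (uint16_t)((x & 0x0077) + ((x >> 8) & 0x0077)); /* fold high byte into low byte: 0..8 */
    x = (uint16_t)((x & 0x000f) + ((x >> 4) & 0x000f)); /* add the two remaining fields: 0..16 */
    return x;
}

/* Exhaustively compare sketch and specification on every 16-bit input. */
void check_pop_equivalence(void) {
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        assert(pop_sketch16((uint16_t)v) == pop_spec16((uint16_t)v));
    }
}
```

The mask 0x0077 is sufficient in the third step because each 4-bit field holds at most 4 at that point, which fits in three bits.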


Motivating Example #4

Sequential Reference Model

Pipelined Microprocessor

  • Is pipelined microprocessor identical to sequential reference model?

  • Strategy

    • Automatically generate abstraction function from pipeline to program state [Burch & Dill, CAV ’94]

    • Represent machine instructions, data, and state as bit vectors

      • Compatible with hardware description language representation


Task

  • Bit Vector Formulas

    • Fixed width data words

    • Arithmetic operations

      • E.g., add/subtract/multiply/divide & comparisons

      • Two’s complement, unsigned, …

    • Bit-wise logical operations

      • E.g., and/or/xor, shift/extract and equality

    • Boolean connectives

  • Reason About Hardware & Software at Bit Level

    • Formal verification

    • Security analysis

    • Test & program generation

      • What function arguments will cause program to take specified branch?


Decision Procedures

[Figure: Formula → Decision Procedure → Satisfying solution, or Unsatisfiable (+ proof)]

  • Core technology for formal reasoning

  • Boolean SAT

    • Pure Boolean formula

  • SAT Modulo Theories (SMT)

    • Support additional logic fragments

    • Example theories

      • Linear arithmetic over reals or integers

      • Functions with equality

      • Bit vectors

      • Combinations of theories



    BV Decision Procedures: Some History

    • B.C. (Before Chaff)

      • String operations (concatenate, field extraction)

      • Linear arithmetic with bounds checking

      • Modular arithmetic

    • SAT-Based “Bit Blasting”

      • Generate Boolean circuit based on bit-level behavior of operations

      • Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker

      • Handles arbitrary operations

      • Effective in many applications

        • CBMC [Clarke, Kroening, Lerda, TACAS ’04]

        • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]

        • CVC-Lite [Dill, Barrett, Ganesh], Yices [de Moura, et al.], STP
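To make "bit blasting" concrete, the sketch below builds addition out of single-bit Boolean operations only, the way a ripple-carry circuit would be expressed before conversion to CNF. It is an illustration under assumptions: the function names are hypothetical, and a real tool would emit clauses for a SAT solver rather than evaluate the circuit on concrete inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Add two width-bit words using only per-bit AND, OR, and XOR.
   Each loop iteration corresponds to one full adder in the circuit. */
uint32_t add_bitblasted(uint32_t a, uint32_t b, unsigned width) {
    assert(width >= 1 && width <= 32);
    uint32_t sum = 0;
    uint32_t carry = 0;
    for (unsigned i = 0; i < width; i++) {
        uint32_t ai = (a >> i) & 1u;
        uint32_t bi = (b >> i) & 1u;
        uint32_t si = ai ^ bi ^ carry;                   /* sum bit */
        carry = (ai & bi) | (ai & carry) | (bi & carry); /* carry out */
        sum |= si << i;
    }
    return sum; /* the final carry is dropped: arithmetic modulo 2^width */
}

/* Compare the circuit with word-level addition on all inputs of a small
   width. Returns 1 when they agree everywhere, 0 otherwise. */
int adder_matches_word_level(unsigned width) {
    assert(width >= 1 && width <= 8);
    uint32_t limit = 1u << width;
    for (uint32_t a = 0; a < limit; a++) {
        for (uint32_t b = 0; b < limit; b++) {
            if (add_bitblasted(a, b, width) != ((a + b) & (limit - 1u))) return 0;
        }
    }
    return 1;
}
```

An adder needs only a handful of gates per bit. Multipliers and dividers grow much faster, which is one reason to look for word-level abstractions.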


    Research Challenge

    • Is there a better way than bit blasting?

    • Requirements

      • Provide same functionality as with bit blasting

      • Find abstractions based on word-level structure

      • Improve on performance of bit blasting

    • A New Approach

      • [Bryant, Kroening, Ouaknine, Seshia, Strichman, Brady, TACAS ’07]

      • Use bit blasting as core technique

      • Apply to simplified versions of formula

      • Successive approximations until formula is solved or shown unsatisfiable


    Approximating Formula

    [Figure: original formula φ shown between its overapproximation φ+ and its underapproximation φ−]

    • Overapproximation φ+

      • More solutions: if φ+ is unsatisfiable, then so is φ

    • Underapproximation φ−

      • Fewer solutions: a satisfying solution of φ− also satisfies φ

    • Example Approximation Techniques

      • Underapproximating

        • Restrict word-level variables to smaller ranges of values

      • Overapproximating

        • Replace subformula with Boolean variable
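The two approximation techniques can be tried on a toy formula small enough to enumerate. The sketch below is a hypothetical example, not one from the talk: the formula, the range bound of 4, and all function names are assumptions. It restricts variable ranges to get an underapproximation, replaces the multiplication subformula with a fresh Boolean variable `b` to get an overapproximation, and checks that the solution sets are nested.

```c
#include <assert.h>

/* Hypothetical formula over 4-bit words:
   phi(x, y) = (x * y mod 16 == 6) AND (x > y). */
int phi(unsigned x, unsigned y) {
    return ((x * y) & 0xFu) == 6u && x > y;
}

/* Underapproximation: same formula with both variables restricted to [0, 4). */
int phi_under(unsigned x, unsigned y) {
    return x < 4u && y < 4u && phi(x, y);
}

/* Overapproximation: the multiplication subformula is replaced by a fresh
   Boolean variable b, so only (b AND x > y) remains. */
int phi_over(unsigned x, unsigned y, int b) {
    return b && x > y;
}

/* Count the assignments (x, y) that satisfy each formula over the full
   4-bit domain. For phi_over, (x, y) counts if some value of b works.
   The asserts check that every solution of the underapproximation solves
   phi, and every solution of phi solves the overapproximation. */
void count_solutions(int *under, int *exact, int *over) {
    *under = *exact = *over = 0;
    for (unsigned x = 0; x < 16; x++) {
        for (unsigned y = 0; y < 16; y++) {
            int u = phi_under(x, y);
            int e = phi(x, y);
            int o = phi_over(x, y, 0) || phi_over(x, y, 1);
            assert(!u || e);
            assert(!e || o);
            *under += u;
            *exact += e;
            *over += o;
        }
    }
}
```

In this example the underapproximation keeps a single solution (x = 3, y = 2), which is already enough to show the original formula satisfiable.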


    Starting Iterations

    • Initial Underapproximation φ1−

      • (Greatly) restrict ranges of word-level variables

      • Intuition: Satisfiable formula often has small-domain solution


    First Half of Iteration

    [Figure: SAT check of φ1−. If SAT, then done. If UNSAT, the proof is used to generate overapproximation φ1+]

    • SAT Result for φ1−

      • Satisfiable

        • Then have found solution for original formula φ

      • Unsatisfiable

        • Use UNSAT proof to generate overapproximation φ1+

        • (Described later)


    Second Half of Iteration

    [Figure: SAT check of φ1+. If UNSAT, then done. If SAT, the solution is used to generate refined underapproximation φ2−]

    • SAT Result for φ1+

      • Unsatisfiable

        • Then have shown original formula φ unsatisfiable

      • Satisfiable

        • Solution indicates variable ranges that must be expanded

        • Generate refined underapproximation φ2−


    Iterative Behavior

    [Figure: underapproximations φ1−, φ2−, …, φk− and overapproximations φ1+, φ2+, …, φk+ of the original formula φ]

    • Underapproximations

      • Successively more precise abstractions of φ

      • Allow wider variable ranges

    • Overapproximations

      • No predictable relation

      • UNSAT proof not unique


    Overall Effect

    [Figure: the iteration ends with SAT on some underapproximation φk− or with UNSAT on some overapproximation φk+]

    • Soundness

      • Only terminate with solution on underapproximation

      • Only terminate as UNSAT on overapproximation

    • Completeness

      • Successive underapproximations approach original formula φ

      • Finite variable ranges guarantee termination

        • In worst case, get φk− equivalent to φ
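The soundness and termination argument can be seen in a much simplified loop. The sketch below covers only the underapproximation side, and it rests on several assumptions: brute-force enumeration stands in for bit blasting plus a SAT solver, ranges are widened by one bit per round instead of being guided by an overapproximation built from an UNSAT proof, and the formulas and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A formula over two 8-bit words. */
typedef int (*formula8)(uint8_t x, uint8_t y);

/* Hypothetical satisfiable formula: x + y == 100 and x - y == 20 (mod 256). */
int phi_sat(uint8_t x, uint8_t y) {
    return (uint8_t)(x + y) == 100u && (uint8_t)(x - y) == 20u;
}

/* Hypothetical unsatisfiable formula: x + x == 1 (mod 256) has no solution. */
int phi_unsat(uint8_t x, uint8_t y) {
    (void)y;
    return (uint8_t)(x + x) == 1u;
}

/* Underapproximation: search only x, y in [0, 2^bits). */
int solve_under(formula8 phi, unsigned bits, uint8_t *x_out, uint8_t *y_out) {
    assert(bits >= 1 && bits <= 8);
    unsigned limit = 1u << bits;
    for (unsigned x = 0; x < limit; x++) {
        for (unsigned y = 0; y < limit; y++) {
            if (phi((uint8_t)x, (uint8_t)y)) {
                *x_out = (uint8_t)x;
                *y_out = (uint8_t)y;
                return 1;
            }
        }
    }
    return 0;
}

/* Widen the ranges until a solution appears. At bits == 8 the
   underapproximation is the formula itself, so a failure there means
   the formula is unsatisfiable. Returns the width at which a solution
   was found, or 0 if there is none. */
unsigned solve_by_widening(formula8 phi, uint8_t *x_out, uint8_t *y_out) {
    for (unsigned bits = 1; bits <= 8; bits++) {
        if (solve_under(phi, bits, x_out, y_out)) return bits;
    }
    return 0;
}
```

A solution is only ever reported from a restricted search, so it always satisfies the full formula, and the finite word width bounds the number of rounds. What this sketch lacks is the early UNSAT exit that the overapproximations provide.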


    Generating Overapproximation

    [Figure: the UNSAT proof for φ1− is used to generate overapproximation φ1+, which in turn leads to φ2−]

    • Given

      • Underapproximation φ1−

      • Bit-blasted translation of φ1− into Boolean formula

      • Proof that Boolean formula unsatisfiable

    • Generate

      • Overapproximation φ1+

      • If φ1+ satisfiable, must lead to refined underapproximation

        • Generate φ2− such that φ1− ⇒ φ2− ⇒ φ


    Bit-Vector Formula Structure

    [Figure: formula DAG whose leaves include x = y, a comparison of x + 2z with 1, Boolean variable a, w & 0xFFFF = x, and x % 26 = v]

    • DAG representation to allow shared subformulas


    Structure of Underapproximation

    [Figure: the formula DAG conjoined (∧) with range constraints on the word-level variables w, x, y, z, giving underapproximation φ−]

    • Linear complexity translation to CNF

      • Each word-level variable encoded as set of Boolean variables

      • Additional Boolean variables represent subformula values


    Encoding Range Constraints

    [Figure: range constraints 0 ≤ w < 8 and −4 ≤ x < 4 attached to word-level variables w and x]

    • Explicit

      • View as additional predicates in formula

    • Implicit

      • Reduce number of variables in encoding

        Constraint      Encoding

        0 ≤ w < 8       0 0 0 ··· 0 w2 w1 w0

        −4 ≤ x < 4      xs xs xs ··· xs xs x1 x0

      • Yields smaller SAT encodings
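The implicit encoding amounts to zero extension for unsigned ranges and sign extension for signed ranges: only k Boolean variables are free, and every other bit of the word is either a constant 0 or a copy of the sign bit xs. A minimal sketch on 32-bit words follows. The function names and the `free_bits` parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Implicit encoding of 0 <= w < 2^k: only the k low bits are free, and
   all upper bits are fixed to 0 (zero extension). */
uint32_t encode_unsigned_range(uint32_t free_bits, unsigned k) {
    assert(k >= 1 && k <= 31);
    uint32_t low_mask = (1u << k) - 1u;
    return free_bits & low_mask;
}

/* Implicit encoding of -2^(k-1) <= x < 2^(k-1): k free bits, with the top
   free bit (the sign bit xs) copied into every upper position (sign
   extension). The result is the 32-bit two's-complement bit pattern. */
uint32_t encode_signed_range(uint32_t free_bits, unsigned k) {
    assert(k >= 1 && k <= 31);
    uint32_t low_mask = (1u << k) - 1u;
    uint32_t low = free_bits & low_mask;
    uint32_t sign = (low >> (k - 1)) & 1u;
    return sign ? (low | ~low_mask) : low;
}
```

With k = 3 this reproduces the two rows of the table: `encode_unsigned_range` yields the values 0 through 7, and `encode_signed_range` yields the patterns for −4 through 3.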


    UNSAT Proof

    [Figure: formula DAG of ∧, ∨, and ¬ nodes over the leaf predicates, with range constraints on w, x, y, z. The UNSAT proof identifies an unsatisfiable subgraph]

    • Subset of clauses that is unsatisfiable

    • Clause variables define portion of DAG

    • Subgraph that cannot be satisfied with given range constraints


    Generated Overapproximation

    [Figure: overapproximation φ1+ keeps x = y, the comparison of x + 2z with 1, and a. The remaining subformulas are replaced by fresh Boolean variables b1 and b2]

    • Identify subformulas containing no variables from UNSAT proof

    • Replace by fresh Boolean variables

    • Remove range constraints on word-level variables

    • Creates overapproximation

      • Ignores correlations between values of subformulas


    Refinement Property

    [Figure: overapproximation φ1+ combined with the range constraints on w, x, y, z is UNSAT]

    • Claim

      • φ1+ has no solutions that satisfy φ1−’s range constraints

        • Because φ1+ contains portion of φ1− that was shown to be unsatisfiable under range constraints


    Refinement Property (Cont.)

    [Figure: overapproximation φ1+ with fresh Boolean variables b1 and b2 and no range constraints]

    • Consequence

      • Solving φ1+ will expand range of some variables

      • Leading to more exact underapproximation φ2−


    Effect of Iteration

    [Figure: φ1− → (UNSAT proof: generate overapproximation) → φ1+ → (SAT: use solution to generate refined underapproximation) → φ2−]

    • Each Complete Iteration

      • Expands ranges of some word-level variables

      • Creates refined underapproximation


    Approximation Methods

    • So Far

      • Range constraints

        • Underapproximate by constraining values of word-level variables

      • Subformula elimination

        • Overapproximate by assuming subformula value arbitrary

    • General Requirements

      • Systematic under- and over-approximations

      • Way to connect from one to another

    • Goal: Devise Additional Approximation Strategies


    Function Approximation Example

    [Figure: multiplication x * y, with some cases marked by a special symbol]

    • Motivation

      • Multiplication (and division) are difficult cases for SAT

    • Marked cases: Prohibited

      • Gives underapproximation

      • Restricts values of (possibly intermediate) terms

    • Marked cases: f(x,y)

      • Overapproximate as uninterpreted function f

      • Value constrained only by functional consistency
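The two treatments of a hard operator can be compared on a toy formula. The sketch below is a hypothetical illustration: the formula, the bound `SMALL`, the assumption that the marked cases are those with a large operand, and all names are assumptions, not details from the talk. Multiplication is exact when both operands are below `SMALL`. Otherwise the underapproximation rejects the assignment, while the overapproximation lets the product be any value `fxy`.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL 4u /* hypothetical bound: operands below this multiply exactly */

/* Hypothetical 8-bit formula: x * y == 12 and x + y == 7 (mod 256). */
int psi(uint8_t x, uint8_t y) {
    return (uint8_t)(x * y) == 12u && (uint8_t)(x + y) == 7u;
}

/* Underapproximation: cases with a large operand are prohibited. */
int psi_under(uint8_t x, uint8_t y) {
    if (x >= SMALL || y >= SMALL) return 0; /* prohibited case */
    return psi(x, y);
}

/* Overapproximation: outside the small range the product is an
   uninterpreted value fxy chosen freely. With a single application of f,
   functional consistency adds no further constraint. */
int psi_over(uint8_t x, uint8_t y, uint8_t fxy) {
    uint8_t product = (x < SMALL && y < SMALL) ? (uint8_t)(x * y) : fxy;
    return product == 12u && (uint8_t)(x + y) == 7u;
}

/* Count satisfying (x, y) pairs for each version. For psi_over, a pair
   counts if some value of fxy satisfies it. The asserts check that the
   three solution sets are nested. */
void count_psi(int *under, int *exact, int *over) {
    *under = *exact = *over = 0;
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            int u = psi_under((uint8_t)x, (uint8_t)y);
            int e = psi((uint8_t)x, (uint8_t)y);
            int o = 0;
            for (unsigned f = 0; f < 256 && !o; f++) {
                o = psi_over((uint8_t)x, (uint8_t)y, (uint8_t)f);
            }
            assert(!u || e);
            assert(!e || o);
            *under += u;
            *exact += e;
            *over += o;
        }
    }
}
```

Here the underapproximation is unsatisfiable even though the formula has the solutions (3, 4) and (4, 3), and the overapproximation admits spurious solutions such as x = 0, y = 7. Either outcome would trigger another round of refinement.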


    Results: UCLID BV vs. Bit-blasting

    • UCLID always better than bit blasting

    • Generally better than other available procedures

    • SAT time is the dominating factor

    [results on 2.8 GHz Xeon, 2 GB RAM]


    UCLID BV Run-Time Analysis

    wi: Maximum word-level variable size

    si: Maximum word-level variable instantiation

    Generated abstractions are small

    Few iterations of refinement loop needed


    Why This Work is Worthwhile

    • Realistic Semantic Model for Hardware and Software

      • Captures all details of actual operation

        • Detects errors related to overflow and other artifacts of finite representation

      • Allows mixing of integer and bit-level operations

      • Can capture many abstractions that are currently applied manually

    • SAT-Based Methods Are Only Logical Choice

      • Bit blasting is only way to capture full set of operations

      • SAT solvers are good & getting better

    • Abstraction / Refinement Allows Better Scaling

      • Take advantage of cases where formula easily satisfied or disproven


    Future Work

    • Lots of Refinement & Tuning

      • Selecting under- and over-approximations

      • Iterating within under- or over-approximation

        • E.g., attempt to control variable ranges when overapproximating

      • Reusing portions of bit-blasted formulas

        • Take advantage of incremental SAT

    • Additional Abstractions

      • More general use of functional abstraction

        • Subsume use of uninterpreted functions in current verification methods

