An smt based method for optimizing arithmetic computations in embedded software code
This presentation is the property of its rightful owner.
Sponsored Links
1 / 48

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code PowerPoint PPT Presentation


  • 64 Views
  • Uploaded on
  • Presentation posted in: General

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code. Presented by: Kuldeep S. Meel Adapted from slides by Hassan Eldib and Chao Wang (Virginia Tech). A Robotic Dream. Having a tool that automatically synthesizes the optimum version of a software program.

Download Presentation

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


An smt based method for optimizing arithmetic computations in embedded software code

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code

Presented by: Kuldeep S. Meel

Adapted from slides by Hassan Eldib and Chao Wang (Virginia Tech)


A robotic dream

A Robotic Dream

  • Having a tool that automatically synthesizes the optimum version of a software program.


Embedded software

Embedded Software


Some bugs matter more than others

Some bugs matter more than others

https://www.youtube.com/watch?v=AjUpe7Oh1j0#t=75


Some even more

Some even more ……

Cost: US $350 million


The age of sci fi cs

The Age of Sci-Fi CS

  • Verification

    • Verify if there is overflow/underflow bug in the code

  • Synthesis

    • Given a reference implementation, synthesize an implementation that does not have any of these bugs


Objective

Objective

  • Synthesizing an optimal version of the C code with fixed-point linear arithmetic computation for embedded devices.

    • Minimizing the bit-width.

    • Maximizing the dynamic

      range.


Motivating example

Motivating Example

  • Compute average of Aand B on a microcontroller with signed 8-bit fixed-point

  • Given: A, B ∈ [-20, 80].

    • (A +B)/2 may have overflow errors

    • A/2 + B/2 may have truncation errors

    • A + (B-A)/2 has neither overflow nor truncation errors .


Bit width versus range

Bit-width versus Range

  • Larger range requires a larger bit-width.

  • Decreasing the bit-width, will reduce the range.


Fixed point representation

Fixed-point Representation

Representations for 8-bit fixed-point numbers

Range ∝ Bit-width

Resolution ∝ Bit-width

  • Range: -128 ↔ 127

  • Resolution = 1

  • Range : -16 ↔ 15.875

  • Resolution = 1/8


Problem statement

Problem Statement

Program:

Optimized program:

Range & resolution of the input variables:

A -1000 3000

res. 1/4

B -1000 3000

res. 1/4


Problem statement1

Problem Statement

  • Given

    • The C code with fixed-point linear arithmetic computation

      • Fixed-point type: (s,N,m)

    • The range and resolution of all input variables

  • Synthesize the optimized C code with

    • Reduced bit-width with same input range, or

    • Larger input range with the same bit-width


Restrictions

Restrictions

  • Bounded loops (via unrolling)

  • Same bit-width for all the variables and constants

    • Can have different bit representations


Smt based inductive program synthesis

SMT-based Inductive Program Synthesis


Smt based inductive program synthesis1

SMT-based Inductive Program Synthesis


Step 1 finding a candidate program

Step 1: Finding a Candidate Program

  • Create the most general ASTthat can represent any arithmetic equation, with reduced bit-width.

  • Use SMT solver to find a solution such that

    • For some test inputs (samples),

    • output of the AST is the same as the desired computation


Smt based solution

SMT-based Solution

  • SMT encoding for the general equation AST structure

    • Each Op node can any operation from *, +, -, >> or <<.

    • Each L node can be an input variable or a constant value.

  • SMT Solver finds a solution by equating the AST output to that of the desired program

Fig. General Equation AST.


Smt encoding

SMT Encoding

  • : Desired input program to be optimized.

  • : General AST with reduced bit-width.

  • : Same input values.

  • Same output value.

  • : Test cases (inputs).

  • : Blocked solutions.


Smt based solution an example

SMT-based Solution (an example)


Smt based inductive program synthesis2

SMT-based Inductive Program Synthesis


Step 2 verifying the solution

Step 2: Verifying the Solution

  • Is the program good for all possible inputs?

    • Yes, we found an optimized program

    • No, block this (bad) solution, and try again


Smt encoding1

SMT Encoding

  • : Desired input program to be optimized.

  • : Found candidate solution.

  • : Same input values.

  • : Different output value.

  • : Ranges of the input variables.

  • : Resolution of the input variables.


Smt based inductive program synthesis3

SMT-based Inductive Program Synthesis


The next solution

The Next Solution

B + ≡


Smt based inductive program synthesis4

SMT-based Inductive Program Synthesis


Scalability problem

Scalability Problem

  • Advantage of the SMT-based approach

    • Find optimal solution within an AST depth bound

  • Disadvantage

    • Cannot scale up to larger programs

      • Sketch tool by Solar-Lezama& Bodik (5 nodes)

      • Our own tool based on YICES (9 nodes)


Ou r incremental approach

Our Incremental Approach


Incremental optimization

Incremental Optimization

  • Combine static analysis and SMT-based inductive synthesis.

  • Apply SMT solver only to small code regions

    • Identify an instruction that causes overflow/underflow.

    • Extract a small code region for optimization.

    • Compute redundant LSBs (allowable truncation error).

    • Optimize the code region.

    • Iterate until no more further optimization is possible.


Region for optimization

Region for Optimization

  • BFS order (Tree has return as root)

    • Parent node doesn’t have overflow or underflow

    • Parent node restores the value created in the overflowing node back to the normal range in orig.

    • Larger extracted region allows for more opportunities

    • AST with at most 5 levels, 63 AST nodes


Region for optimization1

Region for optimization

Detecting Overflow Errors

  • The addition of a and b may overflow

The parent nodes

Some sibling nodes

Some child nodes


Truncation error margins

Truncation Error margins

Computing Redundant LSBs

  • The redundant LSBs of a are computed as 4 bits

  • The redundant LSBs of b are computed as 3 bits.


Truncation error margins1

Truncation Error margins

  • step(x*y) = step(x) + step(y)

  • step(x+y) = min(step(x),step(y))

  • step(x-y) = min(step(x),step(y))

  • step(x << c) = step(x) + c

  • step(x >> c) = max(step(x)-c,0)


Extracting code region

Extracting Code Region

  • Extract the code surrounding the overflow operation.

  • The new code requires a smaller bit-width.


Inductive generation of new regions

Inductive Generation of New Regions

  • : Extracted region of bit width (bw1).

  • : Desired solution with bit wi.

  • : Same input values.

  • : Same output value.

  • : random tests.

  • : Blocked programs


Verification

Verification

  • : Desired input program to be optimized.

  • : Found candidate solution.

  • : Same input values.

  • : Different output value.

  • : Ranges of the input variables.

  • : Resolution of the input variables.


The incremental approach

The Incremental Approach


Implementation

Implementation

  • Clang/LLVM + Yices SMT solver

  • Bit-vector arithmetic theory

  • Evaluated on a set of public benchmarks for embedded control and DSP applications


Benchmarks embedded control software

Benchmarks(embedded control software)

All benchmark examples are public-domain examples


Experiment increase in range

Experiment (increase in range)

  • Average increase in range is 307%

    (602%, 194%, 5%, 40%, 32%, 1515%, 0% , 103%)


Experiment decrease in bit width

Experiment (decrease in bit-width)

  • Required bit-width: 32-bit 16-bit

    64-bit 32-bit


Experiment scaling error

Experiment (scaling error)

Original program New program

If we reduce microcontroller’s bit-width, how much error will be introduced?


Experiment runtime statistics

Experiment (runtime statistics)

64 bit


Related work sketch

Related Work: Sketch

  • CEGIS based

  • Synthesis is over the whole program

  • Solver not capable of efficiently handling linear fixed-point arithmetic computations (?)


Related work gulwani et al pldi 2011

Related Work: Gulwani et al. (PLDI 2011)

  • User specified logical specification (e.g., unoptimized program)

  • Uses CEGIS to find optimized version

  • Slow (This technique performs worse than enumeration/stochastic techniques)


Related work rupak et al ices 12

Related Work: Rupak et al. (ICES 12)

  • Optimization of bit-vector representations

  • Does not change structure of the program

  • Uses Mixed Integer Linear Programming (MILP) Solver


Conclusions

Conclusions

  • We presented a new SMT-based method for optimizing fixed-point linear arithmetic computations in embedded software code

    • Effective in reducing the required bit-width

    • Scalable for practice use

  • Future work

    • Other aspects of the performance optimization, such as execution time, power consumption, etc.


More on related w ork

More on Related Work

  • Solar-Lezamaet al. Programming by sketching for bit-streaming programs, ACM SIGPLAN’05.

    • General program synthesis. Does not scale beyond 3-4 LoC for our application.

  • Gulwaniet al. Synthesis of loop-free programs, ACM SIGPLAN’11.

    • Synthesizing bit-vector programs. Largest synthesized program has 16 LoC, taking >45mins. Do not have incremental optimization.

  • Jha. Towards automated system synthesis using sciduction, Ph.D. dissertation, UC Berkeley, 2011.

    • Computing the minimal required bit-width for fixed-point representation. Do not change the code structure.

  • Rupaket al. Synthesis of minimal-error control software, EMSOFT’12.

    • Synthesizing fixed-point computation from floating-point computation. Again, only compute minimal required bit-widths, without changing code structure.


  • Login