An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code - PowerPoint PPT Presentation

An smt based method for optimizing arithmetic computations in embedded software code
Download
1 / 48

  • 81 Views
  • Uploaded on
  • Presentation posted in: General

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code. Presented by: Kuldeep S. Meel Adapted from slides by Hassan Eldib and Chao Wang (Virginia Tech). A Robotic Dream. Having a tool that automatically synthesizes the optimum version of a software program.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


An smt based method for optimizing arithmetic computations in embedded software code

An SMT Based Method for Optimizing Arithmetic Computations in Embedded Software Code

Presented by: Kuldeep S. Meel

Adapted from slides by Hassan Eldib and Chao Wang (Virginia Tech)


A robotic dream

A Robotic Dream

  • Having a tool that automatically synthesizes the optimum version of a software program.


Embedded software

Embedded Software


Some bugs matter more than others

Some bugs matter more than others

https://www.youtube.com/watch?v=AjUpe7Oh1j0#t=75


Some even more

Some even more ……

Cost: US $350 million


The age of sci fi cs

The Age of Sci-Fi CS

  • Verification

    • Verify if there is overflow/underflow bug in the code

  • Synthesis

    • Given a reference implementation, synthesize an implementation that does not have any of these bugs


Objective

Objective

  • Synthesizing an optimal version of the C code with fixed-point linear arithmetic computation for embedded devices.

    • Minimizing the bit-width.

    • Maximizing the dynamic

      range.


Motivating example

Motivating Example

  • Compute average of Aand B on a microcontroller with signed 8-bit fixed-point

  • Given: A, B ∈ [-20, 80].

    • (A +B)/2 may have overflow errors

    • A/2 + B/2 may have truncation errors

    • A + (B-A)/2 has neither overflow nor truncation errors .


Bit width versus range

Bit-width versus Range

  • Larger range requires a larger bit-width.

  • Decreasing the bit-width, will reduce the range.


Fixed point representation

Fixed-point Representation

Representations for 8-bit fixed-point numbers

Range ∝ Bit-width

Resolution ∝ Bit-width

  • Range: -128 ↔ 127

  • Resolution = 1

  • Range : -16 ↔ 15.875

  • Resolution = 1/8


Problem statement

Problem Statement

Program:

Optimized program:

Range & resolution of the input variables:

A -1000 3000

res. 1/4

B -1000 3000

res. 1/4


Problem statement1

Problem Statement

  • Given

    • The C code with fixed-point linear arithmetic computation

      • Fixed-point type: (s,N,m)

    • The range and resolution of all input variables

  • Synthesize the optimized C code with

    • Reduced bit-width with same input range, or

    • Larger input range with the same bit-width


Restrictions

Restrictions

  • Bounded loops (via unrolling)

  • Same bit-width for all the variables and constants

    • Can have different bit representations


Smt based inductive program synthesis

SMT-based Inductive Program Synthesis


Smt based inductive program synthesis1

SMT-based Inductive Program Synthesis


Step 1 finding a candidate program

Step 1: Finding a Candidate Program

  • Create the most general ASTthat can represent any arithmetic equation, with reduced bit-width.

  • Use SMT solver to find a solution such that

    • For some test inputs (samples),

    • output of the AST is the same as the desired computation


Smt based solution

SMT-based Solution

  • SMT encoding for the general equation AST structure

    • Each Op node can any operation from *, +, -, >> or <<.

    • Each L node can be an input variable or a constant value.

  • SMT Solver finds a solution by equating the AST output to that of the desired program

Fig. General Equation AST.


Smt encoding

SMT Encoding

  • : Desired input program to be optimized.

  • : General AST with reduced bit-width.

  • : Same input values.

  • Same output value.

  • : Test cases (inputs).

  • : Blocked solutions.


Smt based solution an example

SMT-based Solution (an example)


Smt based inductive program synthesis2

SMT-based Inductive Program Synthesis


Step 2 verifying the solution

Step 2: Verifying the Solution

  • Is the program good for all possible inputs?

    • Yes, we found an optimized program

    • No, block this (bad) solution, and try again


Smt encoding1

SMT Encoding

  • : Desired input program to be optimized.

  • : Found candidate solution.

  • : Same input values.

  • : Different output value.

  • : Ranges of the input variables.

  • : Resolution of the input variables.


Smt based inductive program synthesis3

SMT-based Inductive Program Synthesis


The next solution

The Next Solution

B + ≡


Smt based inductive program synthesis4

SMT-based Inductive Program Synthesis


Scalability problem

Scalability Problem

  • Advantage of the SMT-based approach

    • Find optimal solution within an AST depth bound

  • Disadvantage

    • Cannot scale up to larger programs

      • Sketch tool by Solar-Lezama& Bodik (5 nodes)

      • Our own tool based on YICES (9 nodes)


Ou r incremental approach

Our Incremental Approach


Incremental optimization

Incremental Optimization

  • Combine static analysis and SMT-based inductive synthesis.

  • Apply SMT solver only to small code regions

    • Identify an instruction that causes overflow/underflow.

    • Extract a small code region for optimization.

    • Compute redundant LSBs (allowable truncation error).

    • Optimize the code region.

    • Iterate until no more further optimization is possible.


Region for optimization

Region for Optimization

  • BFS order (Tree has return as root)

    • Parent node doesn’t have overflow or underflow

    • Parent node restores the value created in the overflowing node back to the normal range in orig.

    • Larger extracted region allows for more opportunities

    • AST with at most 5 levels, 63 AST nodes


Region for optimization1

Region for optimization

Detecting Overflow Errors

  • The addition of a and b may overflow

The parent nodes

Some sibling nodes

Some child nodes


Truncation error margins

Truncation Error margins

Computing Redundant LSBs

  • The redundant LSBs of a are computed as 4 bits

  • The redundant LSBs of b are computed as 3 bits.


Truncation error margins1

Truncation Error margins

  • step(x*y) = step(x) + step(y)

  • step(x+y) = min(step(x),step(y))

  • step(x-y) = min(step(x),step(y))

  • step(x << c) = step(x) + c

  • step(x >> c) = max(step(x)-c,0)


Extracting code region

Extracting Code Region

  • Extract the code surrounding the overflow operation.

  • The new code requires a smaller bit-width.


Inductive generation of new regions

Inductive Generation of New Regions

  • : Extracted region of bit width (bw1).

  • : Desired solution with bit wi.

  • : Same input values.

  • : Same output value.

  • : random tests.

  • : Blocked programs


Verification

Verification

  • : Desired input program to be optimized.

  • : Found candidate solution.

  • : Same input values.

  • : Different output value.

  • : Ranges of the input variables.

  • : Resolution of the input variables.


The incremental approach

The Incremental Approach


Implementation

Implementation

  • Clang/LLVM + Yices SMT solver

  • Bit-vector arithmetic theory

  • Evaluated on a set of public benchmarks for embedded control and DSP applications


Benchmarks embedded control software

Benchmarks(embedded control software)

All benchmark examples are public-domain examples


Experiment increase in range

Experiment (increase in range)

  • Average increase in range is 307%

    (602%, 194%, 5%, 40%, 32%, 1515%, 0% , 103%)


Experiment decrease in bit width

Experiment (decrease in bit-width)

  • Required bit-width: 32-bit 16-bit

    64-bit 32-bit


Experiment scaling error

Experiment (scaling error)

Original program New program

If we reduce microcontroller’s bit-width, how much error will be introduced?


Experiment runtime statistics

Experiment (runtime statistics)

64 bit


Related work sketch

Related Work: Sketch

  • CEGIS based

  • Synthesis is over the whole program

  • Solver not capable of efficiently handling linear fixed-point arithmetic computations (?)


Related work gulwani et al pldi 2011

Related Work: Gulwani et al. (PLDI 2011)

  • User specified logical specification (e.g., unoptimized program)

  • Uses CEGIS to find optimized version

  • Slow (This technique performs worse than enumeration/stochastic techniques)


Related work rupak et al ices 12

Related Work: Rupak et al. (ICES 12)

  • Optimization of bit-vector representations

  • Does not change structure of the program

  • Uses Mixed Integer Linear Programming (MILP) Solver


Conclusions

Conclusions

  • We presented a new SMT-based method for optimizing fixed-point linear arithmetic computations in embedded software code

    • Effective in reducing the required bit-width

    • Scalable for practice use

  • Future work

    • Other aspects of the performance optimization, such as execution time, power consumption, etc.


More on related w ork

More on Related Work

  • Solar-Lezamaet al. Programming by sketching for bit-streaming programs, ACM SIGPLAN’05.

    • General program synthesis. Does not scale beyond 3-4 LoC for our application.

  • Gulwaniet al. Synthesis of loop-free programs, ACM SIGPLAN’11.

    • Synthesizing bit-vector programs. Largest synthesized program has 16 LoC, taking >45mins. Do not have incremental optimization.

  • Jha. Towards automated system synthesis using sciduction, Ph.D. dissertation, UC Berkeley, 2011.

    • Computing the minimal required bit-width for fixed-point representation. Do not change the code structure.

  • Rupaket al. Synthesis of minimal-error control software, EMSOFT’12.

    • Synthesizing fixed-point computation from floating-point computation. Again, only compute minimal required bit-widths, without changing code structure.


  • Login