Dynamic floating point error detection

Dynamic Floating-Point Error Detection

Mike Lam, Jeff Hollingsworth and Pete Stewart

Motivation

Finite precision -> roundoff error

Compromises ill-conditioned calculations

Hard to detect and diagnose

Increasingly important as HPC grows

Single-precision is faster on GPUs

Double-precision fails on long-running computations

Previous solutions are problematic

Numerical analysis requires training

Manual re-writing and testing in higher precision is tedious and time-consuming

Our Solution

Instrument floating-point instructions


Minimize developer effort

Ensure analysis consistency and correctness


Include shared libraries w/o source code

Include compiler optimizations



Our Solution

Three parts

Utility that inserts binary instrumentation

Runtime shared library with analysis routines

GUI log viewer

General overview

Find floating-point instructions and insert calls to shared library

Run instrumented program

View output with GUI

Our Solution

Dyninst-based instrumentation


No special hardware required

Stack walking and binary rewriting

Java GUI


Minimal development effort

Our Solution

Cancellation detection

Instrument addition & subtraction

Compare runtime operand values

Report cancelled digits

Side-by-side (“shadow”) calculations

Instrument all floating-point instructions

Higher/lower precision

Different representation (e.g., rationals)

Report final errors

Cancellation Detection


Loss of significant digits during operations

For each addition/subtraction:

Extract value of each operand

Calculate result and compare magnitudes (binary exponents)

If e_ans < max(e_x, e_y), there is a cancellation

For each cancellation event:

Record a “priority”: max(e_x, e_y) - e_ans

Save event information to log
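
The exponent comparison above can be sketched in a few lines of Python. This is only an illustration of the check, not the tool's actual analysis routine: the function names, the use of `math.frexp` to read binary exponents, the handling of zero operands, and the choice to report 53 bits for a result of exactly zero are all assumptions made for this example.

```python
import math

MANTISSA_BITS = 53  # assumption: operands are IEEE 754 doubles


def exponent(v):
    """Binary exponent e with v = m * 2**e and 0.5 <= |m| < 1."""
    return math.frexp(v)[1]


def detect_cancellation(x, y, subtract=False):
    """Return (result, priority); priority > 0 signals a cancellation.

    priority = max(e_x, e_y) - e_ans, i.e. roughly the number of
    leading binary digits lost in the addition or subtraction.
    """
    ans = x - y if subtract else x + y
    # Zero operands carry no exponent information, so skip them
    # (a choice made for this sketch).
    operands = [v for v in (x, y) if v != 0.0]
    if not operands:
        return ans, 0
    e_max = max(exponent(v) for v in operands)
    if ans == 0.0:
        # Complete cancellation: report the full mantissa width
        # (an assumption of this sketch).
        return ans, MANTISSA_BITS
    return ans, max(e_max - exponent(ans), 0)
```

`math.frexp` normalizes the mantissa to [0.5, 1), so its exponents are offset by one from the IEEE 754 convention; the offset cancels in the difference, so the priority is unaffected.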

Gaussian Elimination

  • A -> [L,U]

  • Comparison of eight methods

    • Classical

    • Classical w/ partial pivoting

    • Classical w/ full pivoting

    • Bordering (“Sherman’s march”)

    • “Pickett’s charge”

    • “Pickett’s charge” w/ partial pivoting

    • Crout’s method

    • Crout’s method w/ partial pivoting
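
For readers unfamiliar with the factorization A -> [L,U], the first two variants in the list (classical elimination, with and without partial pivoting) might look roughly like the sketch below. It is a textbook formulation written for this transcript, not the code that was benchmarked; the function name `lu_decompose` and its `pivot` flag are hypothetical.

```python
def lu_decompose(a, pivot=False):
    """Classical Gaussian elimination on a square matrix (list of rows).

    Returns (l, u, perm) such that l * u equals the rows of `a`
    reordered by `perm`. With pivot=False, perm is the identity and a
    zero pivot raises ZeroDivisionError.
    """
    n = len(a)
    u = [row[:] for row in a]            # work on a copy of the input
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        if pivot:
            # Partial pivoting: bring the largest remaining entry of
            # column k into the pivot position.
            p = max(range(k, n), key=lambda i: abs(u[i][k]))
            if p != k:
                u[k], u[p] = u[p], u[k]
                l[k], l[p] = l[p], l[k]
                perm[k], perm[p] = perm[p], perm[k]
        l[k][k] = 1.0
        for i in range(k + 1, n):
            m = u[i][k] / u[k][k]        # multiplier stored in L
            l[i][k] = m
            for j in range(k, n):
                u[i][j] -= m * u[k][j]   # subtraction: a cancellation site
    return l, u, perm
```

The subtraction in the innermost loop is where cancellation can occur, which is why the choice of elimination order and pivoting strategy changes what a cancellation detector reports.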

Gaussian Elimination

[Figure: Classical vs. Bordering]

SPEC Benchmarks

Results are hard to interpret without domain knowledge


Roundoff Error

Sparse “shadow value” table

Maps memory addresses to alternate values

Shadow values can be single-, double-, quad- or arbitrary-precision

Other ideas: rationals, # of significant digits, etc.

Instrument every FP instruction

Extract operation type and operand addresses

Perform the same operation on corresponding shadow values

Output shadow values and errors upon termination
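
The shadow-value scheme can be illustrated with a small Python model. This is a sketch under stated assumptions rather than the tool's implementation: the `ShadowTable` class, its method names, and the use of string keys as stand-ins for memory addresses are invented for the example, and exact rationals from the standard `fractions` module play the role of the alternate representation.

```python
from fractions import Fraction
import operator

# The operations the sketch knows how to mirror.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


class ShadowTable:
    """Sparse map from an address to a program value and its shadow."""

    def __init__(self):
        self.memory = {}  # address -> float (what the program computes)
        self.shadow = {}  # address -> Fraction (exact shadow value)

    def store(self, addr, text):
        """Initialize an address from a decimal string such as "0.1"."""
        self.memory[addr] = float(text)
        self.shadow[addr] = Fraction(text)

    def execute(self, op, dst, a, b):
        """Run op on the floats and the same op on the shadow values."""
        f = OPS[op]
        result = f(self.memory[a], self.memory[b])
        shadow_result = f(self.shadow[a], self.shadow[b])
        self.memory[dst] = result
        self.shadow[dst] = shadow_result

    def report(self):
        """Relative error of each float against its shadow value."""
        errors = {}
        for addr, exact in self.shadow.items():
            approx = Fraction(self.memory[addr])
            diff = abs(approx - exact)
            errors[addr] = float(diff / abs(exact)) if exact != 0 else float(diff)
        return errors
```

Storing "0.1" at one address and adding it ten times into an accumulator that starts at "0" leaves the shadow accumulator at exactly 1, while the floating-point accumulator drifts slightly; `report()` surfaces that difference as a small nonzero relative error.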

More Gaussian Elimination

Issues & Possible Solutions

Expensive overheads (100-500X)

Optimize with inline snippets

Reduce workload with data flow analysis

Following values through compiler optimizations

Selectively instrument MOV instructions

Filtering false positives

Deduce “root cause” of error using data flow



Analysis of floating-point error is hard

Our tool provides automatic analysis of such error

Work in progress

Thank you!