
Dynamic Floating-Point Error Detection

## PowerPoint Slideshow about ' Dynamic Floating-Point Error Detection' - airell


Presentation Transcript

Motivation

Finite precision -> roundoff error

Compromises ill-conditioned calculations

Hard to detect and diagnose

Increasingly important as HPC grows

Single-precision is faster on GPUs

Even double precision can fail on long-running computations as roundoff accumulates

Previous solutions are problematic

Numerical analysis requires training

Manual re-writing and testing in higher precision is tedious and time-consuming

Our Solution

Instrument floating-point instructions

Automatic

Minimize developer effort

Ensure analysis consistency and correctness

Binary-level

Include shared libraries w/o source code

Include compiler optimizations

Runtime

Data-sensitive

Our Solution

Three parts

Utility that inserts binary instrumentation

Runtime shared library with analysis routines

GUI log viewer

General overview

Find floating-point instructions and insert calls to shared library

Run instrumented program

View output with GUI

Our Solution

Dyninst-based instrumentation

Cross-platform

No special hardware required

Stack walking and binary rewriting

Java GUI

Cross-platform

Minimal development effort

Our Solution

Cancellation detection

Instrument addition & subtraction

Compare runtime operand values

Report cancelled digits

Side-by-side (“shadow”) calculations

Instrument all floating-point instructions

Higher/lower precision

Different representation (e.g., rationals)

Report final errors

Cancellation Detection

Overview

Loss of significant digits during operations

For each addition/subtraction:

Extract value of each operand

Calculate result and compare magnitudes (binary exponents)

If e_ans < max(e_x, e_y), there is a cancellation (e_x, e_y, e_ans: binary exponents of the two operands and the result)

For each cancellation event:

Record a “priority”: max(e_x, e_y) - e_ans

Save event information to log
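
The exponent comparison above can be sketched in a few lines of Python. This is only an illustration of the rule on this slide, not the tool's binary-level implementation: the function names, the log format, and the treatment of an exactly-zero result (counted here as losing all 53 mantissa bits of a double) are assumptions made for the example.

```python
import math
import sys

def exponent(value):
    """Binary exponent of a float as reported by math.frexp; None for zero."""
    if value == 0.0:
        return None
    return math.frexp(value)[1]

def check_cancellation(x, y, subtract=False):
    """Return the cancellation priority (bits lost) for x + y or x - y, 0 if none."""
    ans = x - y if subtract else x + y
    operand_exps = [e for e in (exponent(x), exponent(y)) if e is not None]
    if not operand_exps:
        return 0  # both operands are zero: nothing to cancel
    e_max = max(operand_exps)
    e_ans = exponent(ans)
    if e_ans is None:
        # Assumption: a result of exactly zero counts as losing every mantissa bit.
        return sys.float_info.mant_dig
    # A result exponent above e_max (a carry) is not a cancellation.
    return max(0, e_max - e_ans)

event_log = []  # hypothetical in-memory stand-in for the tool's log file

def logged_sub(x, y):
    """Subtract y from x, recording a log entry if digits were cancelled."""
    priority = check_cancellation(x, y, subtract=True)
    if priority > 0:
        event_log.append({"op": "sub", "x": x, "y": y, "priority": priority})
    return x - y

logged_sub(1.0, 0.9999999)   # nearly equal operands: logged
logged_sub(5.0, 1.0)         # no exponent drop: not logged
```

Sorting `event_log` by `priority` would surface the operations that lost the most bits first.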

Gaussian Elimination

- A -> [L,U]
- Comparison of eight methods
  - Classical
  - Classical w/ partial pivoting
  - Classical w/ full pivoting
  - Bordering (“Sherman’s march”)
  - “Pickett’s charge”
  - “Pickett’s charge” w/ partial pivoting
  - Crout’s method
  - Crout’s method w/ partial pivoting
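
To make the comparison concrete, here is a minimal sketch of two of the listed variants, classical elimination with and without partial pivoting. It is a textbook formulation written for illustration, not the code used in the presentation's experiments; the function names and the 2x2 test matrix are assumptions.

```python
def lu_decompose(a, partial_pivoting=False):
    """Classical Gaussian elimination.

    Returns (perm, L, U) such that L*U equals A with its rows reordered by perm.
    """
    n = len(a)
    u = [row[:] for row in a]
    l = [[float(i == j) for j in range(n)] for i in range(n)]
    perm = list(range(n))
    for k in range(n):
        if partial_pivoting:
            # Pick the row with the largest magnitude entry in column k.
            p = max(range(k, n), key=lambda i: abs(u[i][k]))
            if p != k:
                u[k], u[p] = u[p], u[k]
                perm[k], perm[p] = perm[p], perm[k]
                # Keep the multipliers already stored in L aligned with the rows.
                for j in range(k):
                    l[k][j], l[p][j] = l[p][j], l[k][j]
        if u[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot")
        for i in range(k + 1, n):
            m = u[i][k] / u[k][k]
            l[i][k] = m
            for j in range(k, n):
                u[i][j] -= m * u[k][j]
    return perm, l, u

def reconstruction_error(a, perm, l, u):
    """Largest absolute entry of L*U - (row-permuted A)."""
    n = len(a)
    return max(
        abs(sum(l[i][k] * u[k][j] for k in range(n)) - a[perm[i]][j])
        for i in range(n) for j in range(n)
    )

# A tiny pivot makes the unpivoted factorization lose an entire matrix entry.
A = [[1e-20, 1.0], [1.0, 1.0]]
err_plain = reconstruction_error(A, *lu_decompose(A))
err_pivot = reconstruction_error(A, *lu_decompose(A, partial_pivoting=True))
```

On this matrix the unpivoted factors fail to reproduce A, while the pivoted ones do; the huge multiplier in the unpivoted run is exactly the kind of operation a cancellation detector would flag.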

Gaussian Elimination

Classical vs. Bordering

Roundoff Error

Sparse “shadow value” table

Maps memory addresses to alternate values

Shadow values can be single-, double-, quad- or arbitrary-precision

Other ideas: rationals, # of significant digits, etc.

Instrument every FP instruction

Extract operation type and operand addresses

Perform the same operation on corresponding shadow values

Output shadow values and errors upon termination
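
The shadow-value scheme above can be mimicked at a high level in Python. In this sketch a dictionary stands in for the sparse table, string keys stand in for memory addresses, and exact rationals (`fractions.Fraction`) are chosen as the shadow representation; all class and function names are hypothetical, and the real tool operates on machine instructions rather than Python calls.

```python
from fractions import Fraction
import operator

OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "div": operator.truediv}

class ShadowTable:
    """Sparse map from an address-like key to an alternate (shadow) value."""

    def __init__(self):
        self._shadows = {}

    def lookup(self, addr, native):
        # First use of an address: seed the shadow from the native value.
        if addr not in self._shadows:
            self._shadows[addr] = Fraction(native)
        return self._shadows[addr]

    def store(self, addr, shadow):
        self._shadows[addr] = shadow

    def report(self, memory):
        """Return {addr: (native, shadow, relative_error)} for shadowed addresses."""
        rows = {}
        for addr, shadow in self._shadows.items():
            abs_err = abs(Fraction(memory[addr]) - shadow)
            rel_err = float(abs_err / abs(shadow)) if shadow != 0 else float(abs_err)
            rows[addr] = (memory[addr], shadow, rel_err)
        return rows

def shadowed_op(table, memory, op, dst, src1, src2):
    """Run op on the native operands and the same op on their shadow values."""
    s1 = table.lookup(src1, memory[src1])
    s2 = table.lookup(src2, memory[src2])
    memory[dst] = OPS[op](memory[src1], memory[src2])
    table.store(dst, OPS[op](s1, s2))

# Accumulate 0.1 ten times; the shadow tracks the exact sum of the stored doubles.
memory = {"acc": 0.0, "step": 0.1}
table = ShadowTable()
for _ in range(10):
    shadowed_op(table, memory, "add", "acc", "acc", "step")
native, shadow, rel_err = table.report(memory)["acc"]
```

Rationals are exact but grow without bound in long computations, which is one reason a fixed higher precision may be the more practical shadow type.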

Issues & Possible Solutions

Expensive overheads (100-500X)

Optimize with inline snippets

Reduce workload with data flow analysis

Following values through compiler optimizations

Selectively instrument MOV instructions

Filtering false positives

Deduce “root cause” of error using data flow

Conclusion

Analysis of floating-point error is hard

Our tool provides automatic analysis of such error

Work in progress
