
Reconfigurable Computing Lecture 14: Floating Point

This lecture covers floating-point operations in reconfigurable computing, including number formats, addition, and common implementation challenges.



Presentation Transcript


  1. CPRE 583 Reconfigurable Computing, Lecture 14: Wed 10/12/2011 (Floating Point) Instructor: Dr. Phillip Jones (phjones@iastate.edu) Reconfigurable Computing Laboratory, Iowa State University, Ames, Iowa, USA http://class.ee.iastate.edu/cpre583/

  2. Announcements/Reminders • Project teams: form by Monday 10/10 • MP2 due Friday 10/14 • Project proposal due Friday 10/14 (midnight) • High-level topic and high-level plan for execution • I'll give feedback • Project proposal class presentation on Wed 10/19 • 5-10 PowerPoint slides • I plan to have exams back to you this Friday

  3. Project Grading Breakdown • 50% Final Project Demo • 30% Final Project Report • 20% of your project report grade will come from your 5-6 project updates (due Fridays at midnight) • 20% Final Project Presentation

  4. Projects Ideas: Relevant conferences • Micro • Super Computing • HPCA • IPDPS • FPL • FPT • FCCM • FPGA • DAC • ICCAD • Reconfig • RTSS • RTAS • ISCA

  5. Projects: Target Timeline • Teams formed and topic: Mon 10/10 • Project idea in PowerPoint, 3-5 slides • Motivation (why is this interesting, useful) • What will be the end result • High-level picture of final product • Project team list: name, responsibility • High-level Plan/Proposal: Fri 10/14 • PowerPoint, 5-10 slides (presentation to class Wed 10/19) • System block diagrams • High-level algorithms (if any) • Concerns • Implementation • Conceptual • Related research papers (if any)

  6. Projects: Target Timeline • Work on projects: 10/19 - 12/9 • Weekly update reports • More information on updates will be given • Presentations: Finals week • Present / Demo what is done at this point • 15-20 minutes (depends on number of projects) • Final write up and Software/Hardware turned in: Day of final (TBD)

  7. Initial Project Proposal Slides (5-10 slides) • Project team list: name, responsibility (who is the project leader) • Team size: 3-4 (5 case-by-case) • Project idea • Motivation (why is this interesting, useful) • What will be the end result • High-level picture of final product • High-level plan • Break the project into milestones • Provide an initial schedule: I would schedule aggressively to have the project complete by Thanksgiving, since issues will pop up and cause the schedule to slip • System block diagrams • High-level algorithms (if any) • Concerns • Implementation • Conceptual • Research papers related to your project idea

  8. Weekly Project Updates • The current state of your project write-up • Even in the early stages of the project you should be able to write a rough draft of the Introduction and Motivation sections • The current state of your Final Presentation • Your initial project proposal presentation (due Wed 10/19) should make a good starting point for your Final Presentation • What things are working & not working • What roadblocks are you running into

  9. Common Questions

  10. Common Questions

  11. Overview • Floating point on FPGAs (Chapter 21.4 and 31) • Why is it viewed as difficult? • Options for mitigating issues

  12. Floating Point Format (IEEE-754) • Single precision: S (1 bit) | exp (8 bits) | Mantissa (23 bits) • Mantissa = b_-1 b_-2 b_-3 ... b_-23, with fractional value sum_{i=1}^{23} b_-i * 2^-i • Floating point value = (-1)^S * 2^(exp - 127) * (1.Mantissa) • Example: S = 0, exp = x"80", Mantissa = "110" ++ x"00000" = (-1)^0 * 2^(128-127) * (1 + 1/2 + 1/4) = 2^1 * 1.75 = 3.5 • Double precision: S (1 bit) | exp (11 bits) | Mantissa (52 bits) • Floating point value = (-1)^S * 2^(exp - 1023) * (1.Mantissa)
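
To make the field layout concrete, here is a minimal Python sketch (not from the slides; the function name is mine) that decodes a single-precision value from its three fields and reproduces the slide's 3.5 example. Zero, denormals, infinity, and NaN are ignored:

```python
def decode_single(sign: int, exp: int, mantissa: int) -> float:
    """sign: 1 bit, exp: 8-bit biased exponent, mantissa: 23-bit fraction."""
    frac = 1.0 + mantissa / 2**23           # restore the implicit leading 1
    return (-1)**sign * 2.0**(exp - 127) * frac

# Slide example: S = 0, exp = x"80", mantissa = "110" ++ x"00000"
m = 0b110_0000_0000_0000_0000_0000          # 23 bits: 1/2 + 1/4 = 0.75
print(decode_single(0, 0x80, m))            # 3.5
```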

  13. Fixed Point • Format: W whole bits b_{W-1} ... b_1 b_0 followed by F fractional bits b_-1 b_-2 ... b_-F • Example formats (W.F): 5.5, 10.12, 3.7 • Example in 5.5 format: 01010 01100 = 10 + 1/4 + 1/8 = 10.375 • Compare floating point and fixed point: floating point 0 x"80" "110" x"00000" = 3.5; 10-bit fixed point (3.7 format) for 3.5 = 011 1000000
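
A matching sketch for fixed point (again illustrative; the name is mine): an unsigned W.F bit pattern is just an integer scaled by 2^-F, which checks both examples above:

```python
def fixed_to_float(bits: int, f: int) -> float:
    """Value of an unsigned fixed-point pattern with f fractional bits."""
    return bits / 2**f                       # whole part is the high bits

print(fixed_to_float(0b01010_01100, 5))      # 10.375 (5.5 format)
print(fixed_to_float(0b011_1000000, 7))      # 3.5    (3.7 format)
```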

  14. Fixed Point (Addition) • [Diagram: Operand 1 (Whole | Fractional) + Operand 2 (Whole | Fractional) = sum (Whole | Fractional), with the binary points already aligned]

  15. Fixed Point (Addition) • 11-bit 4.7 format: Operand 1 = 0011 1110000 = 3.875, Operand 2 = 0001 1010000 = 1.625, sum = 0101 1000000 = 5.5 • You can use a standard ripple-carry adder!
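
Because the binary points already line up, the whole operation is a single integer add on the raw patterns, which is exactly why a plain ripple-carry adder suffices. A quick sketch with the slide's 4.7-format values:

```python
F = 7                            # fractional bits in 4.7 format
op1 = 0b0011_1110000             # 3.875
op2 = 0b0001_1010000             # 1.625
total = op1 + op2                # one integer add, no alignment needed

print(total / 2**F)              # 5.5
print(f"{total:011b}")           # 01011000000, i.e. 0101 1000000
```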

  16. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 = 1.625 = 0 x"7F" 101 x"00000"

  17. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 = 1.625 = 0 x"7F" 101 x"00000" • Common exponent (i.e., align binary points) • Make x"80" -> x"7F" or vice versa?

  18. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 = 1.625 = 0 x"7F" 101 x"00000" • Common exponent (i.e., align binary points): make x"7F" -> x"80", losing the least significant bits of Operand 2 • Add the difference x"80" - x"7F" = 1 to x"7F" • Shift the mantissa of Operand 2 right by that difference (remember the "implicit" 1 of the original mantissa) • Aligned: Operand 2 = 0 x"80" 110 x"80000" (still 1.625)
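
A minimal sketch of just this alignment step (helper name and layout are mine): restore the implicit 1s, then shift the smaller operand's mantissa right by the exponent difference:

```python
def align(exp1, man1, exp2, man2):
    """Returns the shared exponent and two 24-bit mantissas
    (implicit 1 made explicit), with the smaller operand shifted right."""
    m1, m2 = man1 | 1 << 23, man2 | 1 << 23     # restore implicit 1s
    if exp1 < exp2:                             # ensure operand 1 is larger
        exp1, m1, exp2, m2 = exp2, m2, exp1, m1
    return exp1, m1, m2 >> (exp1 - exp2)        # low bits of m2 are lost

# Slide values: 3.875 (exp x"80") and 1.625 (exp x"7F")
e, m1, m2 = align(0x80, 0b111_1000_0000_0000_0000_0000,
                  0x7F, 0b101_0000_0000_0000_0000_0000)
print(hex(e), f"{m2:024b}")    # 0x80 011010000000000000000000 (= 0.1101 * 2^1)
```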

  19. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 (aligned) = 1.625 = 0 x"80" 110 x"80000" • Add the mantissas

  20. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 (aligned) = 1.625 = 0 x"80" 110 x"80000" • Add the mantissas (implicit bits included): result = 1 110 x"00000". Overflow!

  21. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 (aligned) = 1.625 = 0 x"80" 110 x"80000" • Add the mantissas: result = 1 110 x"00000". Overflow! • You can't just overflow the mantissa into the exponent field • You are actually overflowing the implicit "1" of Operand 1, so you sort of have an implicit "2" (i.e., "10")
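
A three-line check of that overflow (values from the slides): adding the two 24-bit mantissas, implicit bits included, needs a 25th bit, the "implicit 2":

```python
m1 = 0b1111_1000_0000_0000_0000_0000   # 1.1111b (operand 1, implicit 1 explicit)
m2 = 0b0110_1000_0000_0000_0000_0000   # 0.1101b (operand 2 after alignment)
print(f"{m1 + m2:025b}")               # 1011000...0, i.e. 10.11b * 2^1 = 5.5
```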

  22. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 (aligned) = 1.625 = 0 x"80" 110 x"80000" • Add the mantissas • Deal with the mantissa overflow by normalizing: • Shift the mantissa right by 1 (a "0" shifts in because of the implicit "2") • Increment the exponent by 1 • Result: 0 x"81" 011 x"00000"

  23. Floating Point (Addition) • Operand 1 = 3.875 = 0 x"80" 111 x"80000" • Operand 2 (aligned) = 1.625 = 0 x"80" 110 x"80000" • Add the mantissas, then normalize (shift the mantissa right by 1, increment the exponent by 1) • Final result: 0 x"81" 011 x"00000" = 5.5
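
Putting the whole walk-through together, a minimal sketch (my own structure; it covers only positive normal operands, with no rounding): align, add, and renormalize on overflow:

```python
def fp_add(exp1, man1, exp2, man2):
    """Add two positive single-precision values given as (biased exp,
    23-bit mantissa field); returns the result in the same form."""
    m1, m2 = man1 | 1 << 23, man2 | 1 << 23   # restore implicit 1s
    if exp1 < exp2:
        exp1, m1, exp2, m2 = exp2, m2, exp1, m1
    m2 >>= exp1 - exp2                        # align binary points
    total = m1 + m2                           # may carry into bit 24
    if total >> 24:                           # mantissa overflow ("implicit 2")
        total >>= 1                           # normalize: shift right...
        exp1 += 1                             # ...and bump the exponent
    return exp1, total & (2**23 - 1)          # drop the implicit 1

e, m = fp_add(0x80, 0b111_1000_0000_0000_0000_0000,
              0x7F, 0b101_0000_0000_0000_0000_0000)
print(hex(e), hex(m))                         # 0x81 0x300000 ("011" ++ x"00000")
print(2.0**(e - 127) * (1 + m / 2**23))       # 5.5
```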

  24. Floating Point (Addition): Other concerns • [Slide shows the single-precision format again: S (1 bit) | exp (8 bits) | Mantissa (23 bits)]

  25. Fixed Point (Addition): Hardware

  26. Floating Point (Addition): High-level Hardware • [Datapath diagram] Exponent path: E0/E1 feed a Difference and a Greater Than compare; the compare drives a Mux/SWAP of the operands, and the Difference gives the right-Shift value • Mantissa path: M0/M1 are swapped, the smaller is Right Shifted, then Add/Sub using the standard adder from the previous slide • Normalize: a Priority Encoder produces the Left Shift value, the mantissa is Left Shifted and Rounded (Denormal?), and the exponent is adjusted with a Sub/const • Outputs: E and M

  27. Floating Point • Both Xilinx and Altera supply floating point soft-cores (which I believe are IEEE-754 compliant), so don't be too afraid if you need floating point in your class project • There should also be floating point open cores that are freely available

  28. Fixed Point vs. Floating Point • Floating point advantages: • Application designer does not have to think "much" about the math • Floating point format supports a wide range of numbers (+/- 3x10^38 down to +/- 1x10^-38, single precision) • If IEEE-754 compliant, then easier to accelerate existing floating-point-based applications • Floating point disadvantages: • Ease of use comes at great hardware expense • 32-bit fixed-point add: ~32 DFFs + 32 LUTs • 32-bit single-precision floating point add: ~250 DFFs + 250 LUTs, about 10x more resources, thus 1/10 the best-case parallelism • Floating point typically needs a deep pipeline to achieve high clock rates (i.e., high throughput) • No hard resources (such as the fixed-point carry chain) to take advantage of

  29. Fixed Point vs. Floating Point • Range example (using decimal for clarity): • Assume we can only use 3 digits • For fixed point, all 3 digits are used for the whole part (3.0 format) • For floating point, 2 digits are used for the mantissa, 1 digit for the exponent • What is the largest number you can represent in each? • Precision example (using decimal for clarity): • For the same formats above, represent 125
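
One possible reading of the exercise, sketched in Python (the worked answers are mine, not the slides'):

```python
# Range: fixed point tops out at 999; floating point reaches 99 * 10^9.
fixed_max = 999                  # 3.0 format, all digits in the whole part
float_max = 99 * 10**9           # 2-digit mantissa, 1-digit exponent

# Precision: 125 is exact in fixed point, but a 2-digit mantissa can
# only hold 12 * 10^1 = 120 (or 13 * 10^1 = 130); the trailing 5 is lost.
print(fixed_max, float_max)      # 999 99000000000
print(125, 12 * 10**1)           # 125 120
```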

  30. Mitigating Floating Point Disadvantages • Only support a subset of the IEEE-754 standard • Could use software to off-load special cases • Modify the floating point format to support a smaller data type (e.g., 18-bit instead of 32-bit) • Link to Cornell class: • http://instruct1.cit.cornell.edu/courses/ece576/FloatingPoint/index.html • Add hardware support in the FPGA for floating point • Hard multipliers: added by FPGA companies in the early 2000s • Altera: hard shared paths for floating point (Stratix-V, 2011) • "How to get 1-TFLOP throughput on FPGAs" article: • http://www.eetimes.com/design/programmable-logic/4207687/How-to-achieve-1-trillion-floating-point-operations-per-second-in-an-FPGA • http://www.altera.com/education/webcasts/all/wc-2011-floating-point-dsp-fpga.html
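
As a hedged illustration of the smaller-data-type idea (the 1/8/9 field split below is hypothetical, not necessarily the Cornell format): truncate a single-precision value into 18 bits and decode it back, trading mantissa precision for area:

```python
import struct

def to_float18(x: float) -> int:
    """Pack into a hypothetical 18-bit float: 1 sign, 8 exp, 9 mantissa."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]   # raw float32 bits
    sign, exp = bits >> 31, (bits >> 23) & 0xFF
    man9 = (bits & 0x7FFFFF) >> 14                        # keep top 9 mantissa bits
    return sign << 17 | exp << 9 | man9

def from_float18(b: int) -> float:
    """Decode the 18-bit pattern (zero/denormals/inf/NaN ignored)."""
    sign, exp, man9 = b >> 17, (b >> 9) & 0xFF, b & 0x1FF
    return (-1)**sign * 2.0**(exp - 127) * (1 + man9 / 2**9)

print(from_float18(to_float18(3.5)))   # 3.5, exactly representable
print(from_float18(to_float18(0.1)))   # ~0.09998, precision lost
```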

  31. Mitigating Fixed Point Disadvantages (21.4) • Block floating point (mitigates the range issue) • All data in a block share an exponent • Periodic rescaling • Makes use of fixed-point hardware • Useful in applications where data is processed in stages and a limited value range can be placed on each stage (e.g., an FFT)
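
A minimal sketch of the block floating point idea (function names and block handling are mine, not from the chapter): quantize a block to integer mantissas sharing one exponent, so the per-sample arithmetic stays in cheap fixed point, and rescale between stages:

```python
import math

def to_block_fp(samples, mant_bits=16):
    """Quantize a block of floats to integer mantissas + one shared exponent."""
    peak = max(abs(s) for s in samples) or 1.0
    _, exp = math.frexp(peak)                 # peak = f * 2**exp, 0.5 <= f < 1
    scale = 2 ** (mant_bits - 1 - exp)        # largest sample fills mant_bits
    return [round(s * scale) for s in samples], exp

def from_block_fp(mants, exp, mant_bits=16):
    scale = 2 ** (mant_bits - 1 - exp)
    return [m / scale for m in mants]

mants, exp = to_block_fp([0.5, -3.875, 1.625])
print(mants, exp)                  # all three samples share one exponent
print(from_block_fp(mants, exp))   # [0.5, -3.875, 1.625]
```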

  32. Next Lecture • Review Mid-term • EDK (XPS) tool overview

  33. Questions/Comments/Concerns • Write down: • The main point of the lecture • One thing that's still not quite clear, OR • If everything is clear, an example of how to apply something from the lecture

  34. Lecture Notes • Altera app notes on computing FLOPs for Stratix-III or Stratix-IV • Older Altera app notes on floating point add/mult • Link to a floating point single-precision calculator • Block (fixed) floating point (build-slide explanation example) • Numbers comparing CPU/FPGA/GPU floating point throughput • Pre-make slides showing some examples of the fixed-point advantage for: • Representing the precision of a number • Precision when adding a convoluted number like 1M.0001

  35. Lecture Notes • Points: • Original 286, 386: no floating point HW • Next: floating point coprocessor (on a separate chip) • Next: floating point on the same chip • Why ripple-carry is used over more advanced "high-performing" generate-propagate adders (~0.1 ns for 4 LUTs via the carry chain vs ~0.4 ns for 1 LUT)
