
Embedded ISA Support for Enhanced Floating-Point to Fixed-Point ANSI C Compilation

Embedded ISA Support for Enhanced Floating-Point to Fixed-Point ANSI C Compilation. Tor Aamodt and Paul Chow University of Toronto { aamodt, pc }@eecg.utoronto.ca. 3rd ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems, Nov. 17-18th, 2000, San Jose CA.


Presentation Transcript


  1. Embedded ISA Support for Enhanced Floating-Point to Fixed-Point ANSI C Compilation Tor Aamodt and Paul Chow University of Toronto { aamodt, pc }@eecg.utoronto.ca 3rd ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems, Nov. 17-18th, 2000, San Jose CA

  2. What is this presentation about? • FOCUS: Signal processing applications developed using high-level language representation and floating-point data types... • WANT: Faster fixed-point software development... • QUESTION: Are there “better” fixed-point DSP instruction sets in terms of runtime, power, or roundoff-noise performance?

  3. Presentation Outline • Motivation & Background • Focus on… • Automatic Conversion to Fixed-Point • Architectural Enhancements • Some Experimental Results • Summary / Future Directions

  4. Motivation • 80% of DSPs in use are Fixed-Point. Why? • Because fixed-point hardware is cheaper and uses less power … • … however, it is much harder to develop signal-processing software for.

  5. Background • UTDSP Project: DSP Compiler/Architecture Co-design • Traditional DSP architectures are hard for compilers to generate efficient code for, e.g. extended-precision accumulators • First Generation Silicon, Sept. 30, 1999: 108-pin PGA, 0.35 µm CMOS, 63 MHz (Sean Peng’s M.A.Sc.) • 16-bit Fixed-Point VLIW DSP with a novel 2-level instruction-fetching architecture (reduced pin count) • June 2000: Synopsys CoCentric Fixed-Point Designer Tool • First commercial tool for transforming floating-point ANSI C programs into fixed-point ($20,000 US)

  6. Background: Fixed-Point versus Floating-Point • 32-bit Floating-Point (IEEE): sign bit, 8-bit exponent (excess 127), 23+1 bit normalized mantissa; explicit binary point • Fixed-Point: sign bit, integer part, fractional part; implied binary point
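To make the fixed-point layout concrete, here is a minimal sketch of converting between `float` and a 16-bit fixed-point word. The Q15 layout (1 sign bit, no integer bits, 15 fractional bits) and the helper names `float_to_q15` and `q15_to_float` are assumptions chosen for illustration; the slide does not fix a particular word length or split.

```c
#include <stdint.h>

/* Assumed format: 16-bit word, 1 sign bit, 0 integer bits, 15 fractional bits. */
#define Q15_ONE 32768.0f

/* Quantize a real value to Q15, rounding to nearest and saturating
   at the ends of the representable range [-1, 1). */
static int16_t float_to_q15(float x)
{
    float scaled = x * Q15_ONE;
    scaled += (scaled >= 0.0f) ? 0.5f : -0.5f;   /* round half away from zero */
    if (scaled >= 32767.0f)  return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (int16_t)scaled;                      /* truncation completes the rounding */
}

/* Recover the real value that a Q15 word stands for. */
static float q15_to_float(int16_t q)
{
    return (float)q / Q15_ONE;
}
```

The binary point never appears in the stored word; it exists only in the scale factor that both helpers agree on, which is what "implied binary point" means here.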

  7. Background: Using Fixed-Point Arithmetic • Floating-Point: y_n = y_{n-1} + x_n • Fixed-Point: y_n = (( • y_{n-1} >> 3) + x_n) << 1 • The shifts are Explicit Scaling Operations

  8. Automatic Conversion Process • Traditional Optimizing Compiler: Input Program → Parser → Optimizer → Code Generator → Processor • CONSTRAINT: Input/Output Invariance • GOAL: Application Speedup, i.e. make code faster, but do not break anything!

  9. Automatic Conversion Process • Traditional Optimizing Compiler (Input Program, Parser, Optimizer, Code Generator, Processor) extended with a Floating-Point to Fixed-Point Translator that uses Sample Inputs • “RELAX” CONSTRAINTS… • GOALS: “Good” Input/Output Fidelity (e.g. good signal-to-noise ratio); Fast/Low-Power Operation (10-500× faster than FP emulation)

  10. Floating-Point to Fixed-Point Translation • Floating-point source: float a, b, x[N]; y = a*x[i] + b*x[i+1]; • Steps: 1. Type Conversion 2. Scaling Operations 3. Fractional Fixed-Point Operations • Fixed-point result: int a, b, x[N]; y = ((a•x[i]) >> 2) + b•x[i+1]; (• is the fractional fixed-point multiply; the parentheses are needed because in C “>>” binds less tightly than “+”)
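The three translation steps can be sketched in plain C as follows. This is only an illustration under stated assumptions: 16-bit Q15 operands, a hypothetical helper `fract_mul_q15` standing in for the fractional multiply, and the 2-bit right shift taken directly from the slide's example rather than derived from profile data.

```c
#include <stdint.h>

/* Fractional multiply (assumed Q15 x Q15 -> Q15): form the 32-bit product
   and keep its upper part so the result has the same format as the inputs.
   Relies on arithmetic right shift of negative values, which is
   implementation-defined in C but provided by common compilers. */
static int16_t fract_mul_q15(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* Q30 */
    return (int16_t)(product >> 15);             /* back to Q15 */
}

/* Fixed-point counterpart of y = a*x[i] + b*x[i+1].
   The ">> 2" aligns the first product with the second one, as in the
   slide's example; a real translator would choose it from range data. */
static int16_t weighted_sum(int16_t a, int16_t b, const int16_t *x, int i)
{
    int16_t t1 = fract_mul_q15(a, x[i]);         /* step 3: fractional op */
    int16_t t2 = fract_mul_q15(b, x[i + 1]);
    return (int16_t)((t1 >> 2) + t2);            /* step 2: scaling op */
}
```

Note that the scaling shift is parenthesized explicitly; writing `t1 >> 2 + t2` would shift by `2 + t2` instead.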

  11. Floating-Point to Fixed-Point Translator • Components shown: SUIF Parser*, Optimizer, Identifier Assignment, Fixed-Point Conversion, Instrument Code, Sample Inputs, Profile • *SUIF = Stanford University Intermediate Format. See: http://suif.stanford.edu

  12. Collecting Dynamic Range Information • Consider the ANSI C code: float a, b, x[N]; y = a*x[i] + b*x[i+1]; • Equivalent Expression Tree with ID Assignment: root “+” producing y (ID “0”), left child “*” of a and x[i] (ID “1”: tmp_1), right child “*” of b and x[i+1] (ID “2”: tmp_2) • Code Instrumentation: tmp_1 = a*x[i]; tmp_2 = b*x[i+1]; y = tmp_1 + tmp_2; profile(tmp_1,1); profile(tmp_2,2); profile(y,0);
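A minimal sketch of what the instrumentation could look like when compiled and run on sample inputs. The slide only shows the calls to `profile(value, id)`; the body given here (tracking the largest magnitude seen per ID), the table `max_abs`, the bound `MAX_IDS`, and the wrapper `instrumented_sum` are all assumptions made for illustration.

```c
#include <assert.h>

#define MAX_IDS 16                 /* hypothetical bound on assigned IDs */

static double max_abs[MAX_IDS];    /* largest magnitude observed per ID */

/* Record one observed value for the expression-tree node with this ID. */
static void profile(double value, int id)
{
    double mag = (value < 0.0) ? -value : value;
    assert(id >= 0 && id < MAX_IDS);
    if (mag > max_abs[id])
        max_abs[id] = mag;
}

/* Instrumented form of y = a*x[i] + b*x[i+1]: every intermediate result
   is given a name and reported, not just the final value. */
static float instrumented_sum(float a, float b, const float *x, int i)
{
    float tmp_1 = a * x[i];
    float tmp_2 = b * x[i + 1];
    float y = tmp_1 + tmp_2;
    profile(tmp_1, 1);
    profile(tmp_2, 2);
    profile(y, 0);
    return y;
}
```

After the sample inputs have been processed, `max_abs[id]` holds the dynamic range of each node, including the intermediate products that a leaf-only analysis would have to estimate.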

  13. Generating Scaling Operations • Signal Scaling: Integer Word Length (IWL) • Word layout shown: sign bit, integer part (labelled IWL), fractional part • Definition: IWL[x] = ⌊log2 max|x|⌋ + 1
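The IWL definition can be evaluated directly from a profiled maximum. The sketch below assumes the floor-based reading IWL = floor(log2(max|x|)) + 1 and computes it by repeated halving and doubling so that no math library is required; the function names `iwl_from_max` and `frac_bits`, and the convention of one separate sign bit, are illustrative assumptions.

```c
#include <assert.h>

/* IWL = floor(log2(max_abs)) + 1 for a finite, non-negative maximum.
   Returns 0 for max_abs == 0, which is an arbitrary choice made here. */
static int iwl_from_max(double max_abs)
{
    int iwl = 0;
    assert(max_abs >= 0.0);
    if (max_abs == 0.0)
        return 0;
    while (max_abs >= 1.0) { max_abs *= 0.5; iwl++; }  /* values of 1 or more */
    while (max_abs < 0.5)  { max_abs *= 2.0; iwl--; }  /* small values: negative IWL */
    return iwl;
}

/* Fractional bits left in a word of word_len bits, assuming 1 sign bit. */
static int frac_bits(int word_len, int iwl)
{
    return word_len - 1 - iwl;
}
```

For example, a signal whose magnitude peaks at 3.0 needs an IWL of 2, which leaves 13 fractional bits in a 16-bit word, while a signal that never exceeds 0.3 has an IWL of -1 and can use 16 fractional bits.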

  14. Generating Scaling Operations • Example: “A op B” • Converted Sub-Expressions A and B each carry two values: IWL_A measured and IWL_A current; IWL_B measured and IWL_B current • The result carries IWL_{A op B} measured and IWL_{A op B} current • (Figure: operator node “op” above A and B, with the scaling on each edge marked “?”)

  15. Automatic Conversion Process: IRP: Using Intermediate Result Profile Data • Previous Algorithms: • ‘Worst-Case Evaluation’: Markus Willems et al. FRIDGE: An Interactive Code Generation Environment for HW/SW CoDesign. ICASSP, April 1997. (a.k.a. predecessor to the Synopsys CoCentric Fixed-Point Designer Tool) • A ‘Statistical’ Approach: Ki-Il Kum, Jiyang Kang, and Wonyong Sung. A Floating-Point to Fixed-Point C Converter for Fixed-Point Digital Signal Processors. In Proc. 2nd SUIF Compiler Workshop, August 1997. • Neither uses Intermediate Result Profile data; instead, they combine range information from leaf nodes → Is Useful Information Lost?

  16. IRP: Additive Operations • “A ± B” → “(A << nA) ± (B >> [n − nB])” where: nA = IWL_A current − IWL_A measured; nB = IWL_B current − IWL_B measured; n = IWL_A measured − IWL_B measured • Result: IWL_{A±B} current = IWL_A measured • For example, assume |A| > |B| and IWL_{A±B} measured ≤ IWL_A measured (figure: B is shifted right by n to align with A)
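The additive rule can be sketched as a small C routine. Everything beyond the three shift formulas is an assumption for illustration: the struct `fx_t` that carries a 16-bit value together with its current and measured IWL, the helper `scale`, and the choice to record the result's measured IWL as equal to that of A (the slide's example case).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping: a 16-bit value plus its current and measured IWL. */
typedef struct { int16_t v; int iwl_cur; int iwl_meas; } fx_t;

/* Shift left by s when s >= 0, otherwise right by -s.
   Left shifts are done by multiplication to avoid shifting negative
   values; right shifts rely on arithmetic shift behaviour. */
static int32_t scale(int32_t v, int s)
{
    return (s >= 0) ? v * (1 << s) : v >> -s;
}

/* "A + B" -> "(A << nA) + (B >> [n - nB])", covering the slide's case
   where A has the larger measured IWL and the sum does not outgrow it. */
static fx_t fx_add(fx_t a, fx_t b)
{
    int nA = a.iwl_cur - a.iwl_meas;   /* unused headroom in A */
    int nB = b.iwl_cur - b.iwl_meas;   /* unused headroom in B */
    int n  = a.iwl_meas - b.iwl_meas;  /* alignment distance */
    fx_t r;
    assert(n >= 0);
    r.v = (int16_t)(scale(a.v, nA) + scale(b.v, nB - n));
    r.iwl_cur = a.iwl_meas;
    r.iwl_meas = a.iwl_meas;           /* assumption: sum stays within A's range */
    return r;
}
```

As a usage example, take A = 1.5 stored with current IWL 2 and measured IWL 1 (value 12288) and B = 0.25 stored with current IWL 0 and measured IWL -1 (value 8192). The routine yields 28672 with IWL 1, which is 1.75 at 14 fractional bits.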

  17. IRP: Multiplication • “A • B” → “(A << nA) • (B << nB)” where: nA = IWL_A current − IWL_A measured; nB = IWL_B current − IWL_B measured • Result: IWL_{A•B} current = IWL_A measured + IWL_B measured
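A sketch of the multiplication rule under the same illustrative conventions: 16-bit words with one sign bit, a hypothetical `fx_t` record holding the value with its current and measured IWL, and a fractional multiply modelled as a 32-bit product shifted right by 15. None of these details are given on the slide beyond the shift and IWL formulas.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping: a 16-bit value plus its current and measured IWL. */
typedef struct { int16_t v; int iwl_cur; int iwl_meas; } fx_t;

/* "A . B" -> "(A << nA) . (B << nB)": normalize both operands to their
   measured IWL, then take the fractional product. */
static fx_t fx_mul(fx_t a, fx_t b)
{
    int nA = a.iwl_cur - a.iwl_meas;
    int nB = b.iwl_cur - b.iwl_meas;
    int32_t an, bn;
    fx_t r;
    assert(nA >= 0 && nB >= 0);
    an = (int32_t)a.v * (1 << nA);     /* left shift by multiplication */
    bn = (int32_t)b.v * (1 << nB);
    /* The normalized operands still fit in 16 bits, so the product fits in
       32 bits. The right shift assumes arithmetic shift of negative values. */
    r.v = (int16_t)((an * bn) >> 15);
    r.iwl_cur = a.iwl_meas + b.iwl_meas;
    r.iwl_meas = r.iwl_cur;            /* assumption: no separate profile for the product */
    return r;
}
```

With A = 1.5 (value 12288, current IWL 2, measured IWL 1) and B = 0.25 (value 8192, current IWL 0, measured IWL -1), the result is 12288 with IWL 0, which is 0.375 at 15 fractional bits.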

  18. IRP: Division • “A / B” → “(A >> [n_dividend − nA]) / (B << nB)” where: nA = IWL_A current − IWL_A measured; nB = IWL_B current − IWL_B measured; n_diff = IWL_{A/B} measured − IWL_A measured + IWL_B measured; n_dividend = n_diff if n_diff ≥ 0, and 0 otherwise
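The division rule only determines shift amounts, so the sketch below computes those and stops short of performing the divide, since the slide does not state the format of the quotient. The struct `div_shifts_t` and the function `division_shifts` are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Shift amounts for "(A >> [n_dividend - nA]) / (B << nB)". */
typedef struct {
    int dividend_right;   /* right shift for A; a negative value means a left shift */
    int divisor_left;     /* left shift for B */
} div_shifts_t;

static div_shifts_t division_shifts(int iwl_a_cur, int iwl_a_meas,
                                    int iwl_b_cur, int iwl_b_meas,
                                    int iwl_q_meas)
{
    int nA = iwl_a_cur - iwl_a_meas;
    int nB = iwl_b_cur - iwl_b_meas;
    int n_diff = iwl_q_meas - iwl_a_meas + iwl_b_meas;
    int n_dividend = (n_diff >= 0) ? n_diff : 0;
    div_shifts_t s;
    /* A current IWL smaller than the measured one would already overflow. */
    assert(nA >= 0 && nB >= 0);
    s.dividend_right = n_dividend - nA;
    s.divisor_left = nB;
    return s;
}
```

For instance, with A at current IWL 2 and measured IWL 1, B at current IWL 0 and measured IWL -1, and a measured quotient IWL of 4, n_diff is 2, so A is shifted right by 1 and B left by 1.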

  19. IRP-SA: Using ‘Shift Absorption’ • Example: y = (a*x[i] + (b*x[i+1]>>1)) << 1 • Question: Is information discarded unnecessarily here? • Consider the following alternative: y = (a*x[i]<<1) + b*x[i+1] • BUT: Can we really discard most significant bits and get roughly the same answer? YES!
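One way to see why discarding most significant bits can be harmless is two's-complement wraparound: an intermediate value may overflow, yet the final sum is still correct as long as it fits the word. The sketch below models this with 16-bit words; it is an illustration under that assumption, with `t1` and `t2` standing for the two products and the helper names (`wrap16`, `sum_late_shift`, `sum_absorbed_shift`) invented here.

```c
#include <stdint.h>

/* Keep only the low 16 bits of v and reinterpret them as a signed word,
   which models a 16-bit datapath that discards upper bits. */
static int16_t wrap16(int32_t v)
{
    uint16_t u = (uint16_t)v;
    return (u & 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

/* Original form: y = (t1 + (t2 >> 1)) << 1. The low bit of t2 is lost.
   Assumes arithmetic right shift of negative values. */
static int16_t sum_late_shift(int16_t t1, int16_t t2)
{
    return wrap16(((int32_t)t1 + (t2 >> 1)) * 2);
}

/* Shift-absorbed form: y = (t1 << 1) + t2. The shifted t1 may overflow
   the word, but the wrapped sum is still correct if y itself fits. */
static int16_t sum_absorbed_shift(int16_t t1, int16_t t2)
{
    int16_t shifted = wrap16((int32_t)t1 * 2);   /* may wrap */
    return wrap16((int32_t)shifted + t2);
}
```

With t1 = 20000 and t2 = -15001, shifting t1 left overflows to -25536, yet the absorbed form returns 24999 while the original form returns 24998, because the low bit of t2 is no longer thrown away.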

  20. Architectural Support • Common occurrence (using IRP-SA): A•B << n • Fractional Multiplication with internal Left Shift (figure: operand A with IWL_A, operand B with IWL_B, product A*B with IWL_A + IWL_B, shifted left by n)
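The benefit of shifting inside the multiplier can be emulated in C. This sketch assumes 16-bit operands, a full-width product, and that bits shifted out of the top are simply discarded; the function names `fmls` and `mul_then_shift` and the limit on `n` are illustrative, not a description of the actual hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low 16 bits of v, reinterpreted as a signed word. */
static int16_t wrap16(int64_t v)
{
    uint16_t u = (uint16_t)v;
    return (u & 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

/* Emulated fractional multiply with internal left shift: the shift is
   applied to the full-width product before it is cut down to 16 bits,
   so n extra low-order product bits reach the result. */
static int16_t fmls(int16_t a, int16_t b, unsigned n)
{
    int64_t product;
    assert(n < 16);
    product = (int64_t)a * b;              /* exact Q30 product */
    product *= ((int64_t)1 << n);          /* internal left shift */
    return wrap16(product >> 15);          /* assumes arithmetic right shift */
}

/* Conventional sequence for comparison: fractional multiply, then a
   separate left shift that can only bring in zeros. */
static int16_t mul_then_shift(int16_t a, int16_t b, unsigned n)
{
    int64_t truncated;
    assert(n < 16);
    truncated = ((int64_t)a * b) >> 15;
    return wrap16(truncated * ((int64_t)1 << n));
}
```

For a = b = 1000 and n = 2, the conventional sequence yields 120 while the emulated FMLS yields 122; the exact scaled product is about 122.07, so the internal shift recovers precision and also saves the separate shift instruction.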

  21. Experimental Results: Benchmarks • 4th Order Cascaded/Parallel IIR Filter (IIR-C, IIR-P) • (Normalized) Lattice Filter (LAT, NLAT) • 128-Point Radix 2 Decimation in Time FFT (FFT-NR, FFT-MW) • Levinson-Durbin Recursion (LEVDUR) • 10x10 Matrix-Multiply (MMUL10) • Nonlinear Control (INVPEND) • Trig Function (SIN)

  22. SQNR Enhancement: FMLS and/or IRP-SA

  23. What Is The Effect of “Shift Absorption”?

  24. Experimental Results: Rotational Inverted Pendulum • U of T System Control Group Non-linear Testbench

  25. Closed-Loop System Response: Rotational Inverted Pendulum • 12-bit Controller Comparison • WC: 32.8 dB • IRP-SA: 41.1 dB • IRP-SA w/ FMLS: 48.0 dB
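To relate these dB figures to word length, one common rule of thumb is that each additional bit of precision is worth 20·log10(2), about 6.02 dB of SQNR. The sketch below applies that rule to the numbers on this slide; the conversion and the function name `equivalent_bits` are an illustration, and the talk's own way of measuring "equivalent bits" may differ.

```c
/* SQNR gain per additional bit of precision: 20*log10(2), about 6.02 dB. */
#define DB_PER_BIT 6.0206

/* Express an SQNR improvement over a baseline as an equivalent number
   of extra bits of precision. */
static double equivalent_bits(double sqnr_db, double baseline_db)
{
    return (sqnr_db - baseline_db) / DB_PER_BIT;
}
```

Under this rule, going from 32.8 dB (WC) to 48.0 dB (IRP-SA with FMLS) corresponds to roughly 2.5 extra bits, and going from 32.8 dB to 41.1 dB (IRP-SA alone) to roughly 1.4 bits.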

  26. 128-Point Radix-2 FFT (Generated by MATLAB RealTime Workshop)

  27. Speedup? Rotational Inverted Pendulum: Fractional Multiply Output Shift Relative Frequencies

  28. …Yup!

  29. Speedup* Using FMLS

  30. SQNR Enhancement for various Output Shift Sets

  31. Summary • The Fractional Multiply with internal Left Shift (FMLS) operation can improve runtime and signal-to-noise performance: speedups of up to 35% and an SQNR enhancement equivalent to up to 2 bits, maybe even 4 bits (depending on how you choose to measure it) • Easy VLSI implementation, and easy for the compiler to use.

  32. Future Directions • Higher Level Transformations: • Automatic Generation of Block-Floating-Point... • Quantization Error Feedback… • BOTH need signal-flow-graph representation… therefore probably need a better DSP language than ANSI C • Variable Precision Arithmetic (How much precision does each operation need?)
