Matlab Extensions for the Development, Testing and Verification of Real-Time DSP Software



**Matlab Extensions for the Development, Testing and Verification of Real-Time DSP Software**

David P. Magee, Communication Systems Engineer, Texas Instruments, Dallas, TX

**Presentation Outline**

- DSP Software Development
- DSP Simulator
- Introduction to Intrinsics
- FFT Example
- Algorithm Optimization Results
- Other Matlab and Simulink Extensions
- Closing Remarks
- Q & A

**DSP Software Development**

Common steps for DSP software development:

- Step 1: Develop Understanding
  - Develop Floating Point Simulation
  - Debug Simulation
- Step 2: Address Scaling Issues
  - Develop Fixed Point Simulation
  - Debug Simulation
- Step 3: Optimize for Performance
  - Develop Assembly Code
  - Debug Assembly Code

**Issues with the 3 Step Approach**

- Each step takes time and resources
- Algorithm testing at each stage
- Multiple versions of the algorithm (version control headaches)
- Evaluation of processor instruction set compatibility and MIPS requirements often occurs late in the software development cycle
- Debugging algorithms on a pipelined and/or parallel processor can be very difficult (the problem is getting more difficult as processors become more complicated)

Can the development cycle be improved?
Yes!

**Improved Software Development Cycle**

Merge Steps 2 and 3:

- Step 1: Develop Understanding
  - Develop Floating Point Simulation
  - Debug Simulation
- Step 2: Address Scaling Issues and Optimize for Performance
  - Simultaneously Develop Fixed Point Simulation and Assembly Code
  - Simultaneously Debug Simulation and Assembly Code

Question: How can these steps be combined?

**Matlab + DSP Simulator**

[Diagram: Floating Point Simulation and System Simulation (Matlab Simulation Environment); Fixed Point Simulation and System Simulation (Host Environment, DSP Simulator)]

- Develop Floating Point and Fixed Point Simulations in a single development environment: Matlab
- Develop and test C/C++ code for Fixed Point Simulation in cooperation with the DSP Simulator
- Migrate the C/C++ code directly to the target DSP

**DSP Simulator in Matlab**

[Diagram: DSP Simulator, C/C++ code and MEX-file inside Matlab]

Develop and Debug Fixed Point C/C++ Code in Matlab

Benefits:

- Accelerate the development and analysis of DSP code
- A mechanism to implement your IP blocks in efficient DSP code
- Process large amounts of data
- Compare fixed point and floating point algorithm implementations
- Provide mixed simulation environment with fixed point and floating point algorithm implementations
- Advanced graphing capabilities

**What is a MEX-file?**

- A file containing one function that interfaces C/C++ code to the Matlab shell
- MathWorks specifies the syntax for this function: `void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])`
- See http://www.mathworks.com and enter "mex files" into their search engine

**What is a DSP Simulator?**

- A library of functions that simulate the mathematical operations of DSP assembly instructions
- For TI DSPs, the compiler recognizes special functions called Intrinsics and maps them directly into inline assembly instructions
- In the DSP Simulator, make each function represent a supported compiler Intrinsic

**Intrinsic Example**

- ADD2: adds the upper and lower 16-bit portions of a 32-bit register
- Intrinsic: `dst = _add2(src1,src2)`
- Assembly Instruction: `ADD2 (.unit) src1,src2,dst`

C code:

    Function Example()
    {
        .
        y = _add2(a,b);
        .
    }

Compiles to C6x Assembly Code:

    Example:
        .
        ADD2 .S1 A1,A2,A3
        .
        .

**DSP Simulator Example**

C Code with `_add2()` Intrinsic:

    Function Example()
    {
        .
        y = _add2(a,b);
        .
    }

DSP Simulator:

    typedef struct _REG32X2
    {
        short lo;
        short hi;
    } reg32x2;

    int32 _add2(int32 a,int32 b)
    {
        int32 y;
        reg32x2 *pa,*pb,*py;

        pa = (reg32x2 *)&a;
        pb = (reg32x2 *)&b;
        py = (reg32x2 *)&y;
        py->lo = pa->lo+pb->lo;
        py->hi = pa->hi+pb->hi;
        return(y);
    } // end of _add2() function

**DSP Simulator**

How many Intrinsics exist for each DSP family?

- TMS320C54x: 36
- TMS320C55x: 42
- TMS320C62x: 59
- TMS320C64x: 135
- TMS320C64+: 162
- TMS320C67x: 68

Most algorithms previously written in assembly code can now be expressed in C/C++ code with Intrinsic function calls.

**DSP Simulator**

- Consists of two files
  - C6xSimulator.c
  - C6xSimulator.h
- Contains C functions for representing the numerical operations of 158 DSP assembly instructions
- Can control endianness with a symbolic constant

**DSP Simulator and C++**

- DSP Simulator works in C++ programming environments
- Partition data into appropriate types (real, complex) and bit widths (8/16/32 bits)
- Write functions in C++
- Use operator overloading for required data types to map operators to the desired Intrinsic functions

Benefit: Operator overloading allows for easy migration to next generation DSP instruction sets

**Using the DSP Simulator**

- Develop C/C++ code with Intrinsic function calls
- Compile and link the C/C++ code and the DSP Simulator to form a Matlab executable file
- Debug and evaluate the performance of the fixed point algorithms in Matlab
- Rely on TI tools to generate an optimized assembly version of the C/C++ code for the target DSP

Benefit: One version of C/C++ code runs in Matlab and in the target DSP!

**Migrating C/C++ Code to the DSP**

How does it work?
C/C++ code can directly access DSP assembly instructions without actually writing assembly code.

Benefit: Eliminate headaches associated with assembly programming

- Pipeline scheduling
- Register allocation
- Unit allocation
- Stack manipulation
- Parallel instruction debug

Conclusion: Make the compiler do the hard work!

**When is the C/C++ Code Optimized?**

- Look at the compiler report in the assembly file to determine unit loading.
- Look at the assembly code. Are all the units being used each cycle?
- Try to balance loading by using a different sequence of Intrinsics to perform the same overall mathematical operation.
  - e.g. `X * 4` => `X << 2`
- May require manual unrolling of loops.
- Determine the ideal number of MAC operations for an algorithm and compare it to the compiler report.

**Limitations**

- DSP software engineer must perform algorithm mapping from floating point to fixed point manually
  - ranges for floating point values
  - fixed point scaling issues
  - saturation issues
- DSP software architecture is limited to the creativity of the software engineer

Recommendation: Develop an automated tool that converts Matlab/Simulink floating point files to fixed point DSP C/C++ code using the programming guidelines discussed in the paper.

**FFT Example**

Developed an FFT for the C64x DSP architecture

Briefly discuss

- FFT Functions
- FFT Simulation File
- Development time between hand coded assembly and C code with Intrinsics
  - Software development time
  - Software performance

**FFT Functions**

The FFT functions

- Main FFT function
- First FFT stage
- Radix-2 stage
- Radix-4 stage
- Last FFT stage

Example: Radix-2 stage

- Uses `_mpyhir()` and `_mpylir()` Intrinsics

Note: Twiddle factor indexing not shown in this Example

    // inside the Radix-2 stage
    for(k=Nover2;k>0;k--)
    {
        .
        // compute the real part
        // (x0.real-x1.real)*w1.real
        reg2 = _mpyhir(w1,reg1real);
        // (x0.imag-x1.imag)*w1.imag
        reg3 = _mpylir(w1,reg1imag);
        reg2 -= reg3;
        // compute the imag part
        // (x0.imag-x1.imag)*w1.real
        reg4 = _mpyhir(w1,reg1imag);
        // (x0.real-x1.real)*w1.imag
        reg5 = _mpylir(w1,reg1real);
        reg4 += reg5;
        .
    }

**FFT Simulation File**

The simulation file is a Matlab script file

- Performs the simulation
- Calls the floating point Matlab FFT function `fft()`
- Calls the fixed point FFT function `ti_fft()`
- Compares the frequency responses of fixed point and floating point FFTs in Matlab
- Computes the SNR, NSR, etc. using Matlab

Script:

    % test_fft.m
    % initialize some parameters
    Nin = 64;
    N = 128;
    NumFFTs = 1000;
    % create a random input (one row per FFT, zero-padded to N columns)
    h = rand(NumFFTs,Nin);
    h = [h,zeros(NumFFTs,N-Nin)];
    % compute FFT using Matlab function
    Hd = fft(h,[],2);
    % call the fixed point function
    [H] = ti_fft(h1dfilt,Nin,N);
    % compute the NSR in dB scale
    e = Hd-H;
    NSR = 10*log10(sum(abs(e).^2,2)...
        ./sum(abs(Hd).^2,2));

**FFT Development Time**

Software Development Time Comparison

- Time required to develop hand-coded assembly functions
  - 2-3 person months
- Time required to develop C code with Intrinsic function calls
  - 2-3 person weeks

Development time is reduced by a factor of 4 to 5!

**FFT Performance Comparison**

Metric: Kernel sizes and cycle counts

- Kernel sizes for hand-coded assembly functions
  - FirstFFTStage: 18*(N/16)
  - R2Stage: 7*(N/8)
  - R4Stage: 12*(N/8)
  - LastFFTStage: 24*(N/16)
- Kernel sizes for C code with Intrinsic function calls
  - FirstFFTStage: 19*(N/16)
  - R2Stage: 8*(N/8)
  - R4Stage: 14*(N/8)
  - LastFFTStage: 27*(N/16)

Intrinsics performance is within 15% of assembly!

**Algorithm Optimization Results**

In most cases, Intrinsics performance is within 10%!

**Matlab Function Libraries**

[Diagram: DSP Simulator, Library (Function 1, Function 2, Function N), C/C++ code and MEX-file inside Matlab]

For a particular DSP application

- The DSP Simulator emulates the numerical behavior of the DSP instructions
- Power User develops a library of optimized algorithms that contain Intrinsic function calls
- General user writes C/C++ code that calls the optimized functions in the library
- The user's C/C++ code is compiled with the DSP Simulator, the library and the MEX-file
- User tests the algorithms for performance, evaluates cycle counts, etc. in Matlab
- The same C/C++ code is migrated directly to the target DSP

**Matlab Function Library Examples**

[Diagram: Math Library, Communications Library and Controls Library, with example functions NoiseEst, ChanEst, ResEqu, SlidingMode, Hinf, OuterProduct, InnerProduct, PID, FIR, RS, BF, IC, VectorSum, Viterbi]

Benefit: Ability to share fixed-point DSP C/C++ code and test vectors between multiple users

**Closing Remarks**

DSP Simulator Benefits

- Develop fixed point DSP code in Matlab
- Easily compare floating point and fixed point algorithm implementations in Matlab
- Bit-true, fixed point simulations
- Reduce software development time by a factor of 4 to 5
- Incorporate DSP code into higher level system simulations
- Debugging code in Matlab is easier than in a real-time system
- Easily evaluate/predict MIPS requirements
- Run the same C/C++ source code in Matlab and in the DSP
- Easily migrate algorithms to new DSP instruction sets
- Develop software before next generation DSPs are available

**Q & A**

- Thanks for attending my presentation!
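
Supplementary sketch for the ADD2 and operator-overloading slides. The slide version of `_add2()` overlays a struct on the 32-bit word, which is why the simulator needs an endianness constant. A mask-and-shift version avoids that dependence. The names `pack2`, `lo16`, `hi16`, `sim_add2` and `Packed16x2` below are hypothetical and are not taken from `C6xSimulator.h`. This is a minimal sketch of the idea, not the simulator's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only (not from C6xSimulator.h).

// Pack two signed 16-bit values into one 32-bit word (hi in bits 31..16).
inline std::uint32_t pack2(std::int16_t hi, std::int16_t lo) {
    return (static_cast<std::uint32_t>(static_cast<std::uint16_t>(hi)) << 16) |
           static_cast<std::uint16_t>(lo);
}

// Extract the halves again; assumes two's complement 16-bit conversion.
inline std::int16_t lo16(std::uint32_t w) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(w & 0xFFFFu));
}
inline std::int16_t hi16(std::uint32_t w) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(w >> 16));
}

// Emulate ADD2: two independent 16-bit additions with wraparound.
// A carry out of the low half never reaches the high half.
inline std::uint32_t sim_add2(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t lo = (a + b) & 0x0000FFFFu;
    const std::uint32_t hi = ((a >> 16) + (b >> 16)) << 16;
    return hi | lo;
}

// Operator overloading maps "+" on a packed type to the emulated intrinsic.
// On the target DSP the operator body could call the compiler's _add2
// intrinsic instead, leaving the calling code unchanged.
struct Packed16x2 {
    std::uint32_t bits;
};

inline Packed16x2 operator+(Packed16x2 a, Packed16x2 b) {
    return Packed16x2{sim_add2(a.bits, b.bits)};
}

// Small usage check; call it from a test harness or from main().
inline void add2_demo() {
    const Packed16x2 x{pack2(1, 2)};
    const Packed16x2 y{pack2(3, 4)};
    const Packed16x2 z = x + y;
    assert(hi16(z.bits) == 4 && lo16(z.bits) == 6);
}
```

Code written against `Packed16x2` uses only `+`, so retargeting to a different instruction set means changing the operator body rather than every call site.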
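
Supplementary sketch for the Radix-2 stage slide. The loop body there forms a complex product from four `_mpyhir()`/`_mpylir()` calls. The emulation below rests on two assumptions. First, each intrinsic multiplies one signed 16-bit half of its first argument by the 32-bit second argument, adds 2^14, and shifts right by 15, which is a reading of TI's instruction documentation rather than something stated on the slides. Second, the twiddle factor `w1` holds its real part in the upper half and its imaginary part in the lower half, as the slide comments suggest. `sim_mpyhir`, `sim_mpylir`, `Cplx32` and `twiddle_mul` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical emulation helpers (illustrative; not from C6xSimulator.h).
// Assumed semantics: (signed 16-bit half of w) * x, rounded by adding 2^14,
// then arithmetically shifted right by 15 (a rounded Q15 multiply).
// Right-shifting a negative value is assumed to be an arithmetic shift.
inline std::int32_t sim_mpyhir(std::uint32_t w, std::int32_t x) {
    const std::int64_t h =
        static_cast<std::int16_t>(static_cast<std::uint16_t>(w >> 16));
    return static_cast<std::int32_t>((h * x + (1 << 14)) >> 15);
}

inline std::int32_t sim_mpylir(std::uint32_t w, std::int32_t x) {
    const std::int64_t l =
        static_cast<std::int16_t>(static_cast<std::uint16_t>(w & 0xFFFFu));
    return static_cast<std::int32_t>((l * x + (1 << 14)) >> 15);
}

struct Cplx32 {
    std::int32_t re;
    std::int32_t im;
};

// Multiply the difference term d = x0 - x1 by a packed Q15 twiddle factor,
// mirroring the four multiplies in the Radix-2 loop body:
//   real = d.re*w.re - d.im*w.im,  imag = d.im*w.re + d.re*w.im
inline Cplx32 twiddle_mul(std::uint32_t w, Cplx32 d) {
    return Cplx32{sim_mpyhir(w, d.re) - sim_mpylir(w, d.im),
                  sim_mpyhir(w, d.im) + sim_mpylir(w, d.re)};
}

// Small usage check: multiplying by 0.5 (0x4000 in Q15) halves the input.
inline void twiddle_demo() {
    const Cplx32 y = twiddle_mul(0x40000000u, Cplx32{1000, -2000});
    assert(y.re == 500 && y.im == -1000);
}
```

Running the same function in a floating point reference and comparing the outputs is the kind of bit-level check the simulation file on the next slide performs for the whole FFT.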
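
Supplementary sketch for the FFT Simulation File slide. The Matlab script computes a noise-to-signal ratio per FFT as the error energy divided by the reference energy, in dB. The same metric can be evaluated in host-side C++ when comparing a fixed point output against a floating point reference outside Matlab. The function name `nsr_db` is hypothetical, and the sketch handles a single FFT (one row of the Matlab matrices).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Noise-to-signal ratio in dB for one FFT output:
//   NSR = 10*log10( sum|ref - test|^2 / sum|ref|^2 )
// ref is the floating point reference, test is the fixed point result
// after conversion to double. Assumes ref has nonzero energy.
inline double nsr_db(const std::vector<std::complex<double>>& ref,
                     const std::vector<std::complex<double>>& test) {
    assert(ref.size() == test.size() && !ref.empty());
    double noise = 0.0;
    double signal = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        noise += std::norm(ref[i] - test[i]);  // squared magnitude of error
        signal += std::norm(ref[i]);           // squared magnitude of reference
    }
    return 10.0 * std::log10(noise / signal);
}
```

More negative values mean the fixed point result tracks the reference more closely. An error energy equal to the signal energy gives 0 dB.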