DSPs for future wireless systems


Presentation Transcript


  1. DSPs for future wireless systems. Sridhar Rajagopal

  2. Motivation
     [Diagram labels: Wireless Mobile device; RF Unit; A/D; D/A; Programmable Baseband Communications Processor; Higher Layers; Add-on PCMCIA Network Interface Card]
     • Mobile: switch between standards and between parameters
     • Base-station: varying number of users with different parameters

  3. The problem
     [Figure labels: GPP, DSP, FPGA, VLSI; Performance, Power, Flexibility]

  4. An approach for the solution
     • Algorithms well understood at VLSI level
     • Can design real-time systems
     • Pushing it higher in the chain
     • Current DSPs not powerful enough for our application
     • Using the IMAGINE simulator to see what kind of architecture features would be useful in a future DSP for such applications

  5. History of my work
     [Diagram labels, in extraction order: Multiuser channel estimation; Multiuser detection; Distant Past; Algorithms; VLSI; Task-partitioning; Parallelism; Pipelining; FPGA; Recent Past; Conventional arithmetic; On-line arithmetic; DSP; Instruction set extensions; Co-processor support; Functional unit design and usage; Recent and Near Future; IMAGINE]

  6. Contents
     • Programmable architecture design using the IMAGINE simulator
     • Multiuser estimation and detection implementation
     • Performance comparisons and results
     • Other extensions for possible integration
     • Conclusions

  7. The IMAGINE architecture and simulator
     • IMAGINE is a media signal processor
     [Block diagram: Imagine Stream Processor. Labels: Host Processor; Network; Network Interface; Stream Controller; Microcontroller; Stream Register File; Streaming Memory System with four SDRAM banks; ALU Clusters 0 to 7]

  8. Why the IMAGINE simulator?
     • Great for media processing algorithms
     • Has a VLIW-based cluster: allows comparisons with DSPs
     • A good base architecture: 1024-pt FFT
     • RSIM, SimpleScalar…: more general-purpose architecture simulators

  9. What does the simulator give us?
     • Execution time for the different parts of the code
     • Functional unit utilization
     • Insights into the bottlenecks
     • Flexibility to add and remove functional units already present or design your own
     • Graphical view of the schedule on the functional units

  10. Down-side
     • Two-level C++ programming
       • StreamC: transfers streams of data between main memory and the stream register file (SRF)
       • KernelC: transfers streams from the SRF to the ALU clusters
     • Code is optimized to the number of ALU clusters and the size of the data
     • Compiler may fail register allocation if there are too many variables or if the functional units are modified

  11. Contents
     • Programmable architecture design using the IMAGINE simulator
     • Multiuser estimation and detection implementation
     • Performance comparisons and results
     • Other extensions for possible integration
     • Conclusions

  12. Typical workload representation (Base-station)
     Wireless LAN:
     • Equalization
     • FFT
     • Viterbi decoding
     W-CDMA:
     • Channel estimation
     • Multiuser detection
     • Viterbi/Turbo decoding
     If you felt that life was too easy:
     • Multiple antennas
     • Long spreading codes
     • Space-Time codes

  13. Estimation/Detection (64, 32 sizes)
     • Multiuser Estimation: Kernels 1, 2, 3
     • Massaging matrices for detection: Kernels 4, 5
     • Multiuser Detection: Kernels 6, 7

  14. Kernels
     • 1. Update: update Rbb, Rbr
     • 2. Mmult: multiply Rbb * A
     • 3. Iterate: gradient descent
     • 4. MmultL: calculate L
     • 5. MmultC: calculate C
     • 6. Mf: matched filter
     • 7. Pic: one Parallel Interference Cancellation stage
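To make kernels 6 and 7 concrete, here is a minimal sketch of a matched filter followed by one parallel interference cancellation stage. It is not the code from the talk: it assumes real-valued BPSK bits, synchronous users, known per-user amplitudes, and short toy spreading codes, and all function and variable names are hypothetical.

```python
def sign(x):
    """Hard BPSK decision (ties resolved to +1)."""
    return 1 if x >= 0 else -1


def matched_filter(codes, received):
    """Correlate the received chips with each user's spreading code."""
    return [sum(c * r for c, r in zip(code, received)) for code in codes]


def cross_correlation(codes):
    """Code cross-correlation matrix; entry [k][j] couples user j into user k."""
    return [[sum(a * b for a, b in zip(ck, cj)) for cj in codes] for ck in codes]


def pic_stage(mf_out, corr, amplitudes, prev_bits):
    """One PIC stage: subtract every other user's estimated contribution."""
    new_bits = []
    for k, y in enumerate(mf_out):
        interference = sum(
            corr[k][j] * amplitudes[j] * prev_bits[j]
            for j in range(len(prev_bits))
            if j != k
        )
        new_bits.append(sign(y - interference))
    return new_bits


# Toy example (assumed values): a weak user 0 next to a strong user 1.
codes = [[1, 1, 1, 1], [1, 1, 1, -1]]
amplitudes = [1, 3]
bits = [1, -1]
received = [
    sum(amplitudes[k] * bits[k] * codes[k][n] for k in range(2))
    for n in range(4)
]

mf_out = matched_filter(codes, received)
first_decisions = [sign(y) for y in mf_out]
refined = pic_stage(mf_out, cross_correlation(codes), amplitudes, first_decisions)
```

In this toy case the matched filter alone gets the weak user's bit wrong, and a single cancellation stage recovers it.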

  15. Kernel 2 (mmult) for 3 adders (+), 2 multipliers (*)
     • Divider not being utilized
     • Adders have limited FU utilization
     • O(N^3) *, O(N^3) +
     • Multipliers 100% utilized in the loop
     • Replace the divider (/) with a multiplier (*)

  16. Kernel 2 (mmult) for 3 adders (+), 3 multipliers (*)
     • Better adder utilization
     • Needs sufficient registers for scaling [register allocation may fail]
     • Code may also need slight tuning of variables for optimization
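The adder-utilization observation on the two Kernel 2 slides can be reproduced with a back-of-the-envelope model. The sketch below assumes an idealized cluster in which every functional unit issues one operation per cycle and the schedule is perfect; the function names and the choice of N = 32 are hypothetical, and the real simulator schedule will differ.

```python
from fractions import Fraction


def matmul_op_counts(n):
    """Operation counts for a dense n x n by n x n product."""
    multiplies = n ** 3
    additions = n ** 3  # one accumulate per multiply
    return multiplies, additions


def ideal_utilization(n, adders, multipliers):
    """Utilization bound when the busiest unit type sets the cycle count."""
    mults, adds = matmul_op_counts(n)
    mult_cycles = Fraction(mults, multipliers)
    add_cycles = Fraction(adds, adders)
    cycles = max(mult_cycles, add_cycles)
    return {
        "cycles": cycles,
        "multiplier_utilization": mult_cycles / cycles,
        "adder_utilization": add_cycles / cycles,
    }


three_add_two_mul = ideal_utilization(32, adders=3, multipliers=2)
three_add_three_mul = ideal_utilization(32, adders=3, multipliers=3)
```

Under these assumptions, two multipliers are saturated while the three adders sit idle a third of the time; adding a third multiplier balances the two unit types.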

  17. Contents
     • Programmable architecture design using the IMAGINE simulator
     • Multiuser estimation and detection implementation
     • Performance comparisons and results
     • Other extensions for possible integration
     • Conclusions

  18. FU utilization on each cluster
     • Time for detection at 128 Kbps for each of 32 users at 500 MHz: 4000 cycles
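One way to read the 4000-cycle figure is as a per-bit cycle budget: at 128 Kbps, each bit period lasts a fixed number of clock cycles, and detection for all users has to fit inside it. The sketch below works out that number, assuming 1 Kbps = 1000 bit/s; the slide does not say whether its figure is this budget or a measured time, and the function name is hypothetical.

```python
def cycles_per_bit(clock_hz, bit_rate_bps):
    """Clock cycles available during one bit period."""
    return clock_hz / bit_rate_bps


CLOCK_HZ = 500_000_000   # 500 MHz, from the slide
BIT_RATE_BPS = 128_000   # 128 Kbps per user, assuming 1 Kbps = 1000 bit/s

budget = cycles_per_bit(CLOCK_HZ, BIT_RATE_BPS)
bit_period_us = 1e6 / BIT_RATE_BPS
```

Under that reading the budget comes to roughly 3900 cycles per bit period of about 7.8 microseconds, close to the 4000 cycles quoted on the slide.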

  19. Comparisons with DSPs
     [Plot: execution time (in seconds, log scale from 10^-6 to 10^-2) versus number of users (0 to 35). Series: single DSP implementation; 2 DSP implementation; our architecture based on Imagine. Target data rate: 128 Kbps/user]

  20. Current work
     • Evaluating performance of wireless communication algorithms such as estimation, detection and decoding on this architecture
     • Studying bottlenecks and the functional unit design needed to attain real-time performance
     • The insights gained from the design can also be applied to other processors such as DSPs
