
APPLIED SIGNAL PROCESSING AND IMPLEMENTATION (ASPI)



Presentation Transcript


  1. APPLIED SIGNAL PROCESSING AND IMPLEMENTATION (ASPI) • Introduction for 7th semester, Fall 2005 • Embedded Systems group: pk, yml, abo, ssc, jmk, dlc, rab, oo • Dicom group: kjh, pr, uh, ....

  2. Outline • Rationale for ASPI • Basic ASPI Model (A3) • Trends: S8 -> S9 -> S10 • Course structure • Project examples: S8 – S9/S10 • Lab facilities • Demonstrations • Conclusion

  3. Rationale for ASPI/1 • Embedded System: • a collection of heterogeneous parts • subject to stringent design constraints such as ...

  4. Rationale for ASPI/2: Embedded Systems (figure labels: From, To, Nokia 7710)

  5. Rationale for ASPI/3: Shannon Beats Moore’s Law and Energy Plays a Major Role (figure: curves for Algorithmic Complexity (Shannon’s Law), Processor Performance (~Moore’s Law) and Battery Capacity, with the generations 1G, 2G and 3G marked) Source: Jan Rabaey, Summer Course, 2000

  6. Basic ASPI Model (A3) • Applications (e.g. Equalizer) • Algorithms (e.g. FIR/IIR) • Architectures (e.g. DSP/FPGA) • For each application => many candidate algorithms • For each algorithm => many implementation architectures => Large number of solutions => Large Design Space => ASPI challenge

  7. FPGA • FPGA components: • Dedicated I/O blocks • Programmable Logic Array Blocks (LAB): combinational/sequential circuits, routing resources • Dedicated blocks: RAM blocks, multipliers, processors (ARM/PowerPC) • Development tools

  8. FPGA

  9. ASPI Design Principle (figure labels: Serial, Parallel, Pipelined) • Transform a serial specification into a combination of serial, parallel and pipelined units • that satisfies the design constraints: Area, Time => Power
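The trade-off on this slide can be illustrated with a small cycle-count model. This is only a sketch under simplifying assumptions that are not on the slide: every stage takes exactly one clock cycle, the parallel variant replicates a whole serial unit, and the pipelined variant has one hardware unit per stage. The function names are hypothetical.

```python
import math

def serial_cycles(n_items, n_stages):
    # One unit pushes each item through all stages, one stage per cycle.
    return n_items * n_stages

def parallel_cycles(n_items, n_stages, n_units):
    # n_units identical serial units share the items as evenly as possible.
    return math.ceil(n_items / n_units) * n_stages

def pipelined_cycles(n_items, n_stages):
    # After the pipeline has filled (n_stages cycles for the first item),
    # one further item completes on every cycle.
    return n_stages + n_items - 1

if __name__ == "__main__":
    items, stages = 100, 4
    print("serial   :", serial_cycles(items, stages))
    print("parallel :", parallel_cycles(items, stages, n_units=4))
    print("pipelined:", pipelined_cycles(items, stages))
```

Under this model the parallel and pipelined variants both approach one result per cycle, but they pay for it with more hardware (area), which is the Area/Time trade-off the slide refers to.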

  10. Trends: S8 -> S9 -> S10 (diagram: Applications, Algorithms, Architectures; steps numbered 1 to 3) • Application: Non-Linear Signal Processing/Mobile Communication • Algorithm selection • Simulation • Architecture selection and mapping • Example later

  11. Compiler optimization • Compiler optimiser • C code modifications

  12. Trends: S8 -> S9 -> S10 (diagram: Applications, Algorithms, Architectures; steps numbered 1 to 5) • Application: Non-Linear Signal Processing/Mobile Communication • Algorithm selection • Simulation • Architecture selection and modelling • Design Space Exploration • HW/SW Co-Design

  13. Design Space Exploration • Constraints: Area, Time => Power = Area*fclock (figure: Area versus Time plot with the limits Amax and Tmax; the possible solutions lie along A*T ~ K)
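A minimal sketch of this exploration, assuming the candidate solutions lie exactly on the curve A*T = K and that power is modelled as Area*fclock, as stated on the slide. The function names, the constant K and the candidate areas are hypothetical illustration values, not project data.

```python
def explore(k, candidate_areas, a_max, t_max, f_clock):
    """Return the feasible (area, time, power) points on the curve A*T = K."""
    feasible = []
    for area in candidate_areas:
        time = k / area                  # A*T ~ K  =>  T = K / A
        if area <= a_max and time <= t_max:
            power = area * f_clock       # Power = Area * fclock (slide model)
            feasible.append((area, time, power))
    return feasible

def lowest_power(points):
    """Pick the feasible point with the smallest modelled power, if any."""
    return min(points, key=lambda p: p[2]) if points else None

if __name__ == "__main__":
    points = explore(k=100.0, candidate_areas=[1, 2, 4, 5, 10, 20, 50],
                     a_max=20, t_max=25, f_clock=1.0)
    print(points)
    print(lowest_power(points))
```

With a fixed clock, the lowest-power feasible point is the smallest area that still meets Tmax, so the time constraint decides how much hardware is needed.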

  14. HW/SW Co-Design

  15. Trends: S8 -> S9 -> S10 (diagram labels: Applications, Constraints, Algorithms, Properties, Architectures) • Implementing a complete design trajectory • With solutions where properties satisfy constraints

  16. ASPI Course Structure • Algorithm analysis • SW Platform analysis • HW Platform analysis • SW compilers • HW compilers • Design Space Exploration • Design Methodology (diagram spans the 8th and 9th semesters)

  17. 8th Semester Courses 

  18. 9th Semester Courses • EL: Elective Course

  19. Technology • Simulation tools / Language: • Matlab/M • Ptolemy/(M)any • Design Trotter/C • Processors / Language: • ARM/C++, ASM • TI 320-6413/C++, ASM • Blackfin/C++, ASM • Microblaze/C++, ASM • NIOS/C++, ASM • Programmable Logic: • Xilinx FPGA/Handel-C • Altera FPGA/Handel-C

  20. Technology: Lab facilities • Xilinx Virtex FPGA • Celoxica RC203 board

  21. Technology: Lab facilities • Altera Stratix FPGA • Altera Stratix board

  22. Technology: Lab facilities • Analog Devices Blackfin board • Analog Devices Blackfin DSP

  23. Project Examples: S8/S9/S10 • S8: Noise Suppression in Speech • S9: FPGA implementation of a JPEG 2000 encoder/decoder • S10: Reed-Solomon Decoder for DVB-H • Most projects involve external contacts in other research groups or companies

  24. Noise Suppression in Speech • ASPI 8, Group 840 • Søren Birk Sørensen, Andreas Popp, Michael Smed Kristensen

  25. Agenda • Application • System overview • Algorithm • Principle of the algorithm • Results • Architecture • Implementation

  26. System overview • Requirements • Improved speech intelligibility • Improved signal-to-noise ratio (SNR) • Acceptable delay in the system (latency)

  27. Principle

  28. Results • SNR not significantly improved • Speech intelligibility: from "Very poor" to "Good" • Latency: 35 ms

  29. Implementation • Parts of the algorithm were implemented on a TI TMS320C6713 development board • Floating point • Varying pipeline depth • 8 instructions in parallel • Analysis of the compilation result • Subsequent optimization

  30. Optimizations performed • Execution time • A different algorithm for the autocorrelation computation • Loop unrolling gives more parallelism • Informing the compiler about data dependencies • Exploiting the pipeline • A different division computation • Shorter execution time
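The loop-unrolling idea can be sketched on the autocorrelation sum r[k] = sum over n of x[n]*x[n+k]. This is not the group's code; it is an illustrative sketch, written in Python only to show the transformation. On a VLIW DSP the benefit comes from the four independent accumulators, which remove the dependency between consecutive iterations so that several multiply-accumulates can be scheduled in parallel. The function names and the unroll factor of 4 are assumptions.

```python
def autocorr(x, max_lag):
    """Reference form: one accumulator, one multiply-accumulate per iteration."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def autocorr_unrolled(x, max_lag):
    """Same result with the inner loop unrolled by 4 (illustrative factor)."""
    n = len(x)
    r = []
    for k in range(max_lag + 1):
        m = n - k                      # number of products for this lag
        acc0 = acc1 = acc2 = acc3 = 0.0
        i = 0
        while i + 4 <= m:              # main body: 4 independent accumulations
            acc0 += x[i] * x[i + k]
            acc1 += x[i + 1] * x[i + 1 + k]
            acc2 += x[i + 2] * x[i + 2 + k]
            acc3 += x[i + 3] * x[i + 3 + k]
            i += 4
        while i < m:                   # remainder loop for the leftover products
            acc0 += x[i] * x[i + k]
            i += 1
        r.append(acc0 + acc1 + acc2 + acc3)
    return r
```

Note that splitting the sum over several accumulators changes the order of the floating-point additions, so results on real signals can differ in the last bits from the single-accumulator version.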

  31. Result of the optimization • Autocorrelation computation • 24096 cycles -> 2624 cycles • 153% more than the estimated minimum number of cycles • Levinson function • 3842 cycles -> 1122 cycles • 26% more than the estimated minimum number of cycles
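The cycle counts above can be turned into speedups and implied minimum cycle counts with a few lines of arithmetic. The sketch assumes that "p% more than the estimated minimum" means optimized = minimum * (1 + p/100); the minimum values themselves are not given on the slide and are only derived here under that assumption.

```python
def speedup(cycles_before, cycles_after):
    """Ratio of cycle counts before and after optimization."""
    return cycles_before / cycles_after

def implied_minimum(cycles_after, percent_over_minimum):
    """Minimum cycle count implied by 'p% more than the estimated minimum'."""
    return cycles_after / (1 + percent_over_minimum / 100)

if __name__ == "__main__":
    for name, before, after, pct in [("autocorrelation", 24096, 2624, 153),
                                     ("Levinson", 3842, 1122, 26)]:
        print(name,
              "speedup: %.2f" % speedup(before, after),
              "implied minimum: %.0f cycles" % implied_minimum(after, pct))
```

Under that reading, the autocorrelation became roughly 9.2 times faster and the Levinson function roughly 3.4 times faster, with implied minimum estimates of about 1037 and 890 cycles respectively.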

  32. 9th semester project example: ”FPGA implementation of a JPEG 2000 encoder/decoder”

  33. FPGA implementation of a JPEG 2000 encoder/decoder • Motivation • JPEG 2000 is up to six times more complex to implement than JPEG • Two complex DSP algorithms at the heart of JPEG 2000 • Discrete Wavelet Transform (DWT) • Embedded Block Coding with Optimized Truncation (EBCOT) • FPGAs provide the ability to accelerate arithmetic operations via parallel processing (figure: JPEG 2000 encoder block diagram)
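To give a feel for the DWT mentioned above, here is a one-level, one-dimensional sketch of the reversible 5/3 lifting transform, the integer wavelet filter defined in JPEG 2000 Part 1. The slides do not say which filter or structure the project implemented, so this is an assumption for illustration; the function names and the restriction to even-length inputs are also choices made here.

```python
def dwt53_forward(x):
    """One level of 5/3 lifting on an even-length list of ints.

    Returns (s, d): low-pass and high-pass coefficients.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    d = []
    for i in range(half):
        left = x[2 * i]
        # Mirror the signal at the right edge (symmetric extension).
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        d.append(x[2 * i + 1] - (left + right) // 2)       # predict step
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]                 # mirror at the left edge
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)        # update step
    return s, d

def dwt53_inverse(s, d):
    """Undo dwt53_forward exactly by running the lifting steps backwards."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (prev + d[i] + 2) // 4           # undo update
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2          # undo predict
    return x
```

Each predict step depends only on neighbouring input samples, and each update step only on neighbouring high-pass values, which is the kind of local, regular arithmetic that maps well onto parallel FPGA hardware.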

  34. FPGA implementation of a JPEG 2000 encoder/decoder • Project flow • Analysis of reference C-code • processing analysis (search for potential parallelism) • memory analysis (memory requirements) • Sketch an architecture based on the analysis (architectural exploration) • FPGA implementation • Handel-C language to describe the architecture • Handel-C to FPGA (Celoxica Design-suite) • Analysis -> architectural refinement

  35. S10 Project: Reed-Solomon Decoder (figure labels: Nokia 7710, Parity, Data, Data) • Application: • from DVB-T to DVB-H • FEC: RS(n,k,t) => RS(255, 191, 64) • Constraints: • Frame size: up to 2 MB • Data rate: 2 MB/s • Time constraint: ASAP

  36. S10 Project: Reed-Solomon Decoder • Complexity: • Execution on ARM: 22 min/2MB frame

  37. S10 Project: Reed-Solomon Decoder • Algorithm: • Galois field arithmetic GF(2^8) • Data: 8-bit bytes • Operators: binary +, *, not • Properties: • no carry, overflow or rounding error => bitwise operations in parallel • Short critical path (delay) => high clock rate • Identification of parallelism • coarse grain @ function level • fine grain @ operations level
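The properties listed above follow directly from how GF(2^8) arithmetic works, which a short sketch makes concrete. The field polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption here; it is the polynomial commonly used for DVB Reed-Solomon codes, but the slide does not state it. The function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1

def gf_add(a, b):
    """Addition in GF(2^8) is a bitwise XOR: no carry and no overflow."""
    return a ^ b

def gf_mul(a, b):
    """Shift-and-add multiplication, reduced modulo PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # add the current multiple of a
        a <<= 1                    # multiply a by x
        if a & 0x100:
            a ^= PRIM_POLY         # reduce back to 8 bits
        b >>= 1
    return result

def gf_inv(a):
    """Inverse via a^254, since a^255 = 1 for every non-zero element."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result
```

Because every operation is built from XORs, shifts and a conditional reduction on 8-bit values, each result bit is a fixed XOR of input bits, which is why a hardware multiplier has a short critical path and many of them can run side by side.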

  38. S10 Project: Reed-Solomon Decoder • Results: • Execution on ARM: 22 min/2 MB frame • Parallelism: the error locator and the evaluator polynomial can be computed concurrently • Reusable DataPath: syndrome computation, Chien search, polynomial evaluation and error correction can be performed on the same parallel DataPath

  39. S10 Project: Reed-Solomon Decoder • Results: • DataPath: 65 8-bit blocks • Design Space Exploration:
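One way to see why these steps can share a datapath is that syndrome computation and Chien search are both polynomial evaluations over GF(2^8). The sketch below uses a single Horner-style evaluator for both. It is not the project's design: the field polynomial 0x11D, the primitive element alpha = 2, and generator roots alpha^0 .. alpha^63 are assumptions, and all function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed field polynomial for GF(2^8)

def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result

def gf_pow(a, e):
    """a raised to the non-negative integer power e."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[0] is the highest-degree coefficient."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c   # one multiply-add per coefficient
    return acc

def syndromes(received, n_parity=64, alpha=2):
    """Evaluate the received word at the assumed generator roots."""
    return [poly_eval(received, gf_pow(alpha, i)) for i in range(n_parity)]

def chien_search(locator):
    """Return every non-zero field element that is a root of the locator."""
    return [x for x in range(1, 256) if poly_eval(locator, x) == 0]
```

A valid codeword gives all-zero syndromes, and a single error of value e at degree j gives S_i = e * alpha^(i*j); the multiply-add inside `poly_eval` is the operation a shared hardware datapath would replicate.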

  40. S10 Project: Reed-Solomon Decoder (DSE)

  41. Conclusion • ASPI salient features: • based on Models and Methods • application independent but also • application related • encompasses new technologies and tools • driven by current research projects • local & global industry cooperation • Any questions before the student presentation continues?

  42. Advertisement • My A3 'upbringing' has really paid off: we switch back and forth between application, algorithm and architecture exactly as we did in the good old days in the VLSI group. Unfortunately we do not get to do much about the arithmetic: the synthesis tools come with very efficient module generators for multipliers, adders etc., and in the 0.18u technology we work in they are more than fast enough. So the arithmetic is more a part of my background for understanding what the module generators spit out, and how we best exploit them. (And yet, things are looking up: I am about to design a divider for the next generation IC !-) • Excerpt from an e-mail from: Jack Andersen <jandersen@d2audio.com>

  43. ASPI Home Page, Staff etc. • Home Page: http://kom.aau.dk/~dsp/aspi-05/sites/default/ • Secretary: • Dorthe Sparre, NJV12 A5-214, Tel. 9635 8616, dsp@kom.aau.dk • Staff: • Peter Koch, Yannick LeMoullec, Ole Olsen • Daniel Lázaro Cuadrado, Anders B. Olsen, Jesper Michael Kristensen, Søren Skovgaard Christensen, Rasmus Abildgren • Location: • Offices: B1-208, -211, -213, NJV12 A5-207 • Lab: NJ14 3-015 • Students: A6-108
