
The Microprocessor is no more General Purpose


marinel

Presentation Transcript


  1. The Microprocessor is no more General Purpose

  2. Design Gap

  3. Problems with the Fine Grained Approach (FPGAs)
     • Area inefficient
     • Percentage of chip area for wiring far too high
     • Too slow
     • Unavoidable critical paths too long
     • Routing and placement are very complex

  4. Problems with Fine Grained FPGAs

  5. Coarse Grained Reconfigurable Computing
     • Uses reconfigurable arrays with path widths greater than 1 bit
     • More area-efficient
     • Massive reduction in configuration memory and configuration time
     • Drastic reduction in complexity of placement and routing
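
As a rough illustration of why wider paths cut configuration memory, the toy model below counts the select bits needed to route one operand through one switch point. It assumes that every independently routed unit (a single wire in the fine grained case, a whole bus in the coarse grained case) needs ceil(log2(choices)) select bits. The function name, the parameters, and the numbers are hypothetical and are not taken from the slides.

```python
import math

def routing_config_bits(operand_width: int, path_width: int, choices: int) -> int:
    """Select bits needed to route one operand through one switch point.

    Toy assumption: each independently routed unit (a group of
    `path_width` wires) needs ceil(log2(choices)) select bits.
    """
    units = math.ceil(operand_width / path_width)  # independently routed groups
    select_bits = math.ceil(math.log2(choices))    # bits to pick one source
    return units * select_bits

# Illustrative numbers only: a 32-bit operand, 8 possible sources.
fine_bits = routing_config_bits(operand_width=32, path_width=1, choices=8)
coarse_bits = routing_config_bits(operand_width=32, path_width=32, choices=8)
print(fine_bits, coarse_bits)
```

In this model the bit-level fabric configures every wire separately, while the word-level fabric configures the bus once, so the gap grows with the path width.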

  6. Coarse Grained Architectures: Classification
     • Mesh based
     • Linear array based
     • Cross-bar based

  7. Mesh Based Architectures
     • Arranges PEs in a 2-D array
     • Encourages nearest neighbor links between adjacent PEs
     • E.g., KressArray, Matrix, RAW, CHESS
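
The nearest neighbor idea can be sketched in a few lines. The helper below is a minimal sketch that assumes a plain 4-neighbor (north, south, west, east) mesh with no wrap-around at the edges; the architectures listed above each define their own, richer link sets, and the function name is hypothetical.

```python
def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Return the in-bounds N, S, W, E neighbors of PE (row, col)."""
    candidates = [
        (row - 1, col),  # north
        (row + 1, col),  # south
        (row, col - 1),  # west
        (row, col + 1),  # east
    ]
    # Drop coordinates that fall outside the rows x cols array.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]

print(mesh_neighbors(0, 0, rows=4, cols=4))  # corner PE: two neighbors
print(mesh_neighbors(1, 1, rows=4, cols=4))  # interior PE: four neighbors
```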

  8. Matrix – Mesh based Architecture

  9. Matrix – Mesh Based Architecture

  10. Architectures Based on Linear Arrays
     • Aimed at mapping pipelines onto linear arrays
     • If the pipeline has forks, longer lines spanning all or part of the array are used
     • E.g., RaPiD, PipeRench
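
To make the point about forks concrete, the sketch below places pipeline stage i on PE i of a linear array and separates edges that fit a nearest neighbor link from edges that would need a longer line. The placement rule and the function name are assumptions made for illustration; they do not describe how RaPiD or PipeRench actually map pipelines.

```python
def classify_links(edges: list[tuple[int, int]]):
    """Split pipeline edges into neighbor links and longer lines.

    Assumes stage i sits on PE i and that src != dst for every edge.
    Longer lines are reported as (src, dst, span in PEs).
    """
    neighbor_links, long_lines = [], []
    for src, dst in edges:
        span = abs(dst - src)
        if span == 1:
            neighbor_links.append((src, dst))
        else:
            long_lines.append((src, dst, span))
    return neighbor_links, long_lines

# Stage 1 forks to stages 2 and 3; both branches rejoin at stage 4.
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4)]
neighbor_links, long_lines = classify_links(edges)
print(neighbor_links)
print(long_lines)
```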

  11. PipeRench – Linear Array based architecture

  12. PipeRench – Linear Array Based Architecture

  13. Cross-bar Based Architectures
     • Communication network is easy to route
     • Uses restricted cross-bars with hierarchical interconnect to save area
     • E.g., PADDI-1, PADDI-2, Pleiades

  14. PADDI-2 – Cross-bar based architecture

  15. PADDI-2 Cross-bar based Architecture

  16. Coarse Grained Architectures

  17. EGRA
     • Architectural template to enable design space exploration
     • Executes expressions as opposed to operations
     • Supports heterogeneous cells and various memory interfaces

  18. EGRA

  19. Evolution of fine grained and coarse grained architectures

  20. EGRA – at Cell Level

  21. Architectural Exploration

  22. Architectural exploration

  23. EGRA vs CGRA vs FPGA

  24. EGRA – at array level
     • Organized as a mesh of cells of three types
        • RACs
        • Memories
        • Multipliers
     • Cells are connected using both nearest neighbor links and horizontal-vertical buses
     • Each cell has an I/O interface, a context memory, and a core

  25. Control Unit

  26. EGRA Operation
     • DMA mode
        • Used to transfer data in bursts to EGRA
        • To program cells and to read/write from scratchpad memories
     • Execution mode
        • Control unit orchestrates data flow between cells
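
The two modes can be pictured as a small state machine. The sketch below is a toy model only: the class, its methods, and the rule that DMA transfers are rejected during execution are all assumptions made for illustration, not the EGRA interface.

```python
from enum import Enum

class Mode(Enum):
    DMA = "dma"
    EXECUTION = "execution"

class ToyEgra:
    """Hypothetical model of the DMA and execution modes."""

    def __init__(self):
        self.mode = Mode.DMA
        self.context = {}     # cell id -> configuration word (hypothetical)
        self.scratchpad = {}  # address -> data word

    def _require(self, mode: Mode) -> None:
        if self.mode is not mode:
            raise RuntimeError(f"operation needs {mode.value} mode")

    def dma_program_cell(self, cell_id, config) -> None:
        self._require(Mode.DMA)
        self.context[cell_id] = config

    def dma_write_burst(self, base_addr: int, words) -> None:
        self._require(Mode.DMA)
        for offset, word in enumerate(words):
            self.scratchpad[base_addr + offset] = word

    def dma_read_burst(self, base_addr: int, length: int) -> list:
        self._require(Mode.DMA)
        return [self.scratchpad.get(base_addr + i) for i in range(length)]

    def start_execution(self) -> None:
        # In execution mode the control unit would orchestrate the cells.
        self.mode = Mode.EXECUTION

    def finish_execution(self) -> None:
        self.mode = Mode.DMA

egra = ToyEgra()
egra.dma_program_cell("rac0", 0b1010)
egra.dma_write_burst(0, [1, 2, 3])
egra.start_execution()
egra.finish_execution()
print(egra.dma_read_burst(0, 3))
```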

  27. EGRA – at array level

  28. Experimental Results

  29. Experimental Results

  30. Experimental Results

  31. EGRA Memory Interface
     • Data register at the output of computational cells
     • Memory cells can be scattered around the array
     • A scratchpad memory outside the reconfigurable mesh

  32. Architectural exploration - Area

  33. Architectural exploration - Delay

  34. MORA

  35. The reconfigurable Cell

  36. Operating modes of RC

  37. Interconnection Topology
     • Hierarchical
     • Level 1 is used within a 4x4 quadrant to provide nearest neighbor connectivity
     • Interleaved horizontal and vertical connectivity of length two
     • Each RC can receive data from at most two other RCs and send data to at most four other RCs
     • Data and control transfer across quadrants is guaranteed over the Level 2 interconnection
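
The fan-in and fan-out limits stated above (at most two sources and at most four destinations per RC) are easy to check for a proposed set of links. The checker below is a minimal sketch; the link representation and all names are hypothetical, and it does not model quadrants or the two interconnection levels.

```python
from collections import Counter

MAX_FAN_IN = 2   # an RC receives data from at most two other RCs
MAX_FAN_OUT = 4  # an RC sends data to at most four other RCs

def check_fan_limits(links: list[tuple[str, str]]) -> list[tuple[str, str, int]]:
    """Return (rc, kind, count) for every RC that exceeds a limit.

    `links` holds (source_rc, destination_rc) pairs.
    """
    fan_out = Counter(src for src, _ in links)
    fan_in = Counter(dst for _, dst in links)
    violations = []
    for rc, count in fan_in.items():
        if count > MAX_FAN_IN:
            violations.append((rc, "fan-in", count))
    for rc, count in fan_out.items():
        if count > MAX_FAN_OUT:
            violations.append((rc, "fan-out", count))
    return violations

ok_links = [("rc0", "rc2"), ("rc1", "rc2")]
bad_links = ok_links + [("rc3", "rc2")]  # rc2 now has three sources
print(check_fan_limits(ok_links))
print(check_fan_limits(bad_links))
```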

  38. Interconnection Topology

  39. Computational Strategies
     • Temporal computational load balancing
     • Spatial computational load balancing
