VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors

University of Rome "Tor Vergata", Department of Electronic Engineering. Authors: G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re.

Presentation Transcript


  1. University of Rome "Tor Vergata", Department of Electronic Engineering. VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors. Authors: G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re

  2. Index
     • Introduction / motivation
     • Reconfigurable Functional Units
     • Multicontext Logic Blocks
     • Traditional and proposed cell comparison
     • Performance evaluation
       • Delay
       • Power consumption
       • Area requirements
     • Conclusions

  3. Motivation
     • In some applications, operands are usually shorter than the native processor word length
     • General-purpose processors are inefficient when processing shorter data
     [Diagram: Data 1 and Data 2 as operands of XOR and AND operations, each producing a Result]

  4. Possible Solution
     • Execution speed can be increased by using a reconfigurable unit for "custom" instructions
     [Diagram: processor containing a register file, an ALU, and a reconfigurable unit]

  5. Reconfigurable Units: Attached Processing Unit (APU)
     • Located outside the processor core
     • "Slow" data transfer between APU and processor
     • Original instruction set
     [Diagram: processor with a processor core (register file, ALU) and an APU]

  6. Reconfigurable Units: Coprocessor
     • Located outside the processor core
     • Faster interaction with the processor core than APUs
     • Instruction set extension needed
     [Diagram: processor with a processor core (register file, ALU) and a coprocessor with its own coprocessor register file]

  7. Reconfigurable Units: Reconfigurable Functional Units (RFUs)
     • Integrated into the processor core
     • Fastest interaction with the processor
     • Core redesign needed
     • Instruction set extension needed
     [Diagram: processor containing a register file, an ALU, and an RFU]

  8. Reconfigurable Units: Reconfigurable Unit requirements
     • Fast data transfer between RU and processor (RFU approach chosen)
     • Fast reconfiguration of the RU
     • Silicon area as small as possible
     • Low power consumption

  9. Multicontext Reconfigurable Cells
     • Traditional approach (LUT-based): one Look-Up Table for each context (operation)
     • Reconfigurable Logic Block: a single reconfigurable block, complete with a memory containing the contexts
     [Diagrams: a LUT-based cell (LUT Context 1 to LUT Context N feeding a selector, with input, output, and context selection) and a reconfigurable logic block (configurable block plus context memory, with input, output, and context selection)]
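
The traditional approach on this slide (one look-up table per context, with a selector choosing the active one) can be sketched as a small behavioural model. This is only an illustration: the class name `MulticontextLutCell`, its methods, and the bit ordering of the LUT address are assumptions, not taken from the presentation.

```python
from typing import List, Sequence


class MulticontextLutCell:
    """Behavioural model of a LUT-based multicontext cell (one truth table per context)."""

    def __init__(self, n_inputs: int, n_contexts: int) -> None:
        self.n_inputs = n_inputs
        # One LUT of 2**n_inputs one-bit entries per context, initialised to 0.
        self.luts: List[List[int]] = [
            [0] * (2 ** n_inputs) for _ in range(n_contexts)
        ]

    def program(self, context: int, truth_table: Sequence[int]) -> None:
        """Load the truth table of one operation into one context."""
        if len(truth_table) != 2 ** self.n_inputs:
            raise ValueError("truth table must have 2**n_inputs entries")
        self.luts[context] = list(truth_table)

    def evaluate(self, context: int, inputs: Sequence[int]) -> int:
        """Select a context, then use the input bits as the LUT address."""
        # inputs[0] is taken as the least significant address bit (an assumption).
        address = sum(bit << i for i, bit in enumerate(inputs))
        return self.luts[context][address]


# Hypothetical usage: a 3-input, 16-context cell with two contexts programmed.
cell = MulticontextLutCell(n_inputs=3, n_contexts=16)
bits = (0, 1)
cell.program(0, [a ^ b ^ c for c in bits for b in bits for a in bits])  # 3-input XOR
cell.program(1, [a & b & c for c in bits for b in bits for a in bits])  # 3-input AND
print(cell.evaluate(0, [1, 0, 0]), cell.evaluate(1, [1, 1, 1]))
```

Switching operation in this model only changes the `context` index, which mirrors the fast reconfiguration that a multicontext cell is meant to provide; the cost is that every context stores a full truth table.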

  10. Proposed Logic Block
     • Full-adder based
     • Additional blocks for its configuration
     • 4 configuration bits (2^4 = 16 contexts)
     • 3 input bits / 1 output bit
     [Diagram: data inputs D1, D2, D3 routed through MUXes (controlled by configuration bits S0, S1, S2) to the X, Y, and CIN inputs of a full adder; Sum and COUT outputs; a switch selecting LB Out; COUT routed to the CIN of the next LB]
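
As a rough illustration of why a full adder is a useful basis for a configurable cell, the sketch below tabulates what its two outputs compute when the carry input is held constant. The transcript does not say how the cell's MUXes and switch actually drive CIN, so fixing CIN to 0 or 1 is an assumption made here for illustration, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple


def full_adder(x: int, y: int, cin: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum, cout)."""
    s = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return s, cout


def gate_view(cin: int) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Tabulate (sum, cout) for every (x, y) pair with CIN held constant."""
    return {(x, y): full_adder(x, y, cin) for x, y in product((0, 1), repeat=2)}


# 4 bits can distinguish 2**4 = 16 contexts, as stated on the slide.
n_contexts = 2 ** 4

for (x, y), (s, cout) in gate_view(0).items():
    # With CIN = 0, SUM behaves as XOR and COUT as AND.
    assert s == x ^ y and cout == x & y

for (x, y), (s, cout) in gate_view(1).items():
    # With CIN = 1, SUM behaves as XNOR and COUT as OR.
    assert s == 1 - (x ^ y) and cout == x | y
```

A single full adder therefore yields addition plus several two-input logic functions from a handful of control bits, at the price of covering fewer functions than a general 3-input look-up table.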

  11. Reconfigurable Cell Comparison: Proposed Reconfigurable Cell
     • Logic Block: a single reconfigurable logic block based on a full adder, complete with a memory containing the context configuration bits
     [Diagram: logic block with a 16x3 context memory; labels: Context Selection (4 bits), Context Enable, Out (3 bits), data inputs D1, D2, D3 (3 bits), CIN, SUM, COUT]

  12. Reconfigurable Cells Comparison: Traditional (LUT-based) implementation of the same cell
     [Diagram: two 16x8 LUTs, each with Context Enable and an 8-bit Out feeding a MUX; labels: Context Selection (4 bits), Data Input D1, D2, D3 (3 bits), S0, CIN, SUM, COUT]
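
The memory sizes labelled on the two comparison diagrams can be turned into a quick storage estimate. This is a back-of-the-envelope sketch, not a figure reported in the presentation: it assumes "16x3" and "16x8" mean 16 words of 3 and 8 bits, that the LUT-based cell uses two such LUTs (one per output), and it counts storage bits only, ignoring MUXes and decoders.

```python
def memory_bits(words: int, width: int, copies: int = 1) -> int:
    """Total storage bits for `copies` memories of `words` x `width`."""
    return words * width * copies


# Proposed cell: one 16x3 context memory (assumed reading of the diagram label).
proposed_bits = memory_bits(words=16, width=3)

# LUT-based cell: two 16x8 LUTs (assumed reading of the diagram labels).
lut_bits = memory_bits(words=16, width=8, copies=2)

storage_saving = 1 - proposed_bits / lut_bits

print(f"proposed: {proposed_bits} bits, LUT-based: {lut_bits} bits")
print(f"storage saving: {storage_saving:.2%}")
```

Under these assumptions the proposed cell stores 48 bits against 256 for the LUT-based version. The presentation itself reports savings in area and transistor count rather than in storage bits, so this number is only indicative.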

  13. Performance evaluation
     • Simulation software: SPECTRE, Cadence Virtuoso Suite
     • Process used: CL018 by TSMC, Taiwan (0.18 μm feature size)
     • Process-related simulation data: NCSU Design Kit

  14. Performance evaluation: layout
     [Figures: LUT-based cell layout and proposed cell layout]
     • Area: 0.00903 mm² (proposed cell) vs 0.0212 mm² (LUT-based cell), 57.4% less
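
The quoted area saving can be checked directly from the two layout areas. The sketch below is plain arithmetic on the slide's numbers; the helper name `relative_reduction` is hypothetical, and the assignment of the smaller area to the proposed cell follows from the "57.4% less" wording.

```python
def relative_reduction(new: float, baseline: float) -> float:
    """Fractional reduction of `new` with respect to `baseline`."""
    return 1.0 - new / baseline


AREA_PROPOSED_MM2 = 0.00903  # proposed full-adder-based cell
AREA_LUT_MM2 = 0.0212        # traditional LUT-based cell

area_saving = relative_reduction(AREA_PROPOSED_MM2, AREA_LUT_MM2)
area_ratio = AREA_LUT_MM2 / AREA_PROPOSED_MM2

print(f"area saving: {area_saving:.1%}")
print(f"LUT-based cell is about {area_ratio:.2f}x larger")
```

The result agrees with the 57.4% stated on the slide, which corresponds to the LUT-based cell occupying a bit more than twice the silicon area of the proposed one.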

  15. Performance evaluation: delay
     • Maximum delays of the proposed cell: [table not preserved in the transcript]
     • Maximum delays of the traditional LUT-based cell: [table not preserved in the transcript]

  16. Performance evaluation: power
     • Simulation conditions:
       • 100 MHz operating frequency
       • 100% input node activity
     • Power consumption of the proposed cell: 0.572 mW
     • Power consumption of the traditional LUT-based cell: 1.097 mW
     • Average power consumption reduced by 48%
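
The power figures above can be cross-checked, and converted into an energy per clock cycle, with a few lines of arithmetic. The energy-per-cycle values are derived here and are not reported in the presentation; they assume the quoted numbers are average power at the stated 100 MHz operating frequency.

```python
FREQ_HZ = 100e6          # operating frequency from the simulation conditions
P_PROPOSED_W = 0.572e-3  # proposed cell, 0.572 mW
P_LUT_W = 1.097e-3       # traditional LUT-based cell, 1.097 mW


def energy_per_cycle_pj(power_w: float, freq_hz: float) -> float:
    """Average energy per clock cycle in picojoules (E = P / f)."""
    return power_w / freq_hz * 1e12


power_saving = 1.0 - P_PROPOSED_W / P_LUT_W
e_proposed_pj = energy_per_cycle_pj(P_PROPOSED_W, FREQ_HZ)
e_lut_pj = energy_per_cycle_pj(P_LUT_W, FREQ_HZ)

print(f"power saving: {power_saving:.1%}")
print(f"energy per cycle: {e_proposed_pj:.2f} pJ vs {e_lut_pj:.2f} pJ")
```

The computed saving is roughly 47.9%, consistent with the rounded 48% on the slide.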

  17. Performance evaluation: summary
     • Summary of performance comparison: [table not preserved in the transcript]

  18. Conclusions
     • Architecture advantages:
       • Fast reconfiguration
       • Low transistor count (68.8% less) and area requirements
       • Low power consumption
     • Main limitations:
       • Reduced flexibility compared to a LUT-based cell
     • Future work:
       • Use of the proposed cell in a complete RFU architecture
       • Integration of the RFU in an existing embedded processor
