VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors


University of Rome “Tor Vergata”, Department of Electronic Engineering. VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors. Authors: G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re.

Presentation Transcript
slide1

University of Rome “Tor Vergata”

Department of Electronic Engineering

VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors

Authors:

G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re

slide2

Index

  • Introduction / motivation
  • Reconfigurable Functional Units
  • Multicontext Logic Blocks
  • Traditional and proposed cell comparison
  • Performance evaluation
    • Delay
    • Power consumption
    • Area requirements
  • Conclusions
slide3

Motivation

  • In some applications, operands are usually shorter than the native processor word length

Poor efficiency of general-purpose processors when processing shorter data

[Figure: operands Data 1 and Data 2, XOR and AND operations, Result]

slide4

Possible Solution

  • Execution speed can be increased by using a reconfigurable unit for “custom” instructions

[Figure: PROCESSOR block diagram with Register File, ALU, and Reconfigurable Unit]

slide5

Reconfigurable Units

Attached Processing Unit (APU)

  • Located outside of the processor core
  • “Slow” data transfer between APU and processor
  • Original instruction set

[Figure: PROCESSOR block diagram with Processor core, Register File, ALU, and APU]

slide6

Reconfigurable Units

Coprocessor

  • Located outside of the processor core
  • Faster interaction with the processor core than APUs
  • Instruction set extension needed

[Figure: PROCESSOR block diagram with Processor core, Register File, ALU, and Coprocessor]

slide7

Reconfigurable Units

Reconfigurable Functional Units (RFUs)

  • Integrated into the processor core
  • Fastest interaction with the processor
  • Core redesign needed
  • Instruction set extension needed

[Figure: PROCESSOR block diagram with Register File, ALU, and RFU]

slide8

Reconfigurable Units

Reconfigurable Unit requirements:

  • Fast data transfer between RU and processor (RFU approach chosen)
  • Fast reconfiguration of the RU
  • Silicon area as small as possible
  • Low power consumption
slide9

Multicontext Reconfigurable Cells

Traditional approach (LUT-based): one look-up table for each context (operation)

[Figure: LUT-based cell; labels: LUT (Context 1) to LUT (Context N), Selector, context selection, input, output]

Reconfigurable Logic Block: a single reconfigurable block, complete with a memory containing the contexts

[Figure: reconfigurable logic block; labels: Configurable Block, Context Memory, context selection, input, output]
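The traditional approach can be illustrated with a minimal behavioral sketch in Python. It is not the authors' circuit: it assumes a 3-input, 1-output cell, and the class name `MulticontextLutCell`, the helper `truth_table`, and the two example contexts (XOR and AND) are hypothetical choices made only for this example.

```python
from typing import Callable, List


class MulticontextLutCell:
    """Behavioral model: one truth table (LUT) per context plus a selector."""

    def __init__(self, luts: List[List[int]], n_inputs: int = 3) -> None:
        size = 1 << n_inputs
        if any(len(lut) != size for lut in luts):
            raise ValueError("each LUT needs 2**n_inputs entries")
        self.luts = luts
        self.n_inputs = n_inputs

    def evaluate(self, context: int, inputs: List[int]) -> int:
        # Pack the input bits into a LUT address (inputs[0] is the LSB).
        address = sum(bit << i for i, bit in enumerate(inputs))
        # The selector picks the LUT that belongs to the active context.
        return self.luts[context][address]


def truth_table(fn: Callable[..., int], n_inputs: int = 3) -> List[int]:
    """Enumerate fn over all input combinations to fill one LUT."""
    return [
        fn(*[(addr >> i) & 1 for i in range(n_inputs)]) & 1
        for addr in range(1 << n_inputs)
    ]


# Two example contexts: context 0 is a 3-input XOR, context 1 a 3-input AND.
xor3 = truth_table(lambda a, b, c: a ^ b ^ c)
and3 = truth_table(lambda a, b, c: a & b & c)
cell = MulticontextLutCell([xor3, and3])

print(cell.evaluate(0, [1, 0, 0]))  # XOR context
print(cell.evaluate(1, [1, 1, 1]))  # AND context
```

In this model every additional context costs a full truth table, which is the storage overhead the single-block approach tries to avoid.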

slide10

Proposed Logic Block

  • Full-adder based
  • Additional blocks for its configuration
  • 4 configuration bits (2^4 = 16 contexts)
  • 3 input bits / 1 output bit

[Figure: proposed logic block schematic; labels: S0, S1, S2, P, D1, D2, D3, Data, Configuration Bits, MUX, Full Adder (X, Y, CIN, Sum, COUT), Switch, LB Out, To CIN of next LB]
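A rough behavioral sketch in Python shows why a full adder is a useful base for a configurable cell: with the carry input tied to 0 its sum is XOR and its carry is AND, and with a third data bit as carry input the sum is a 3-input XOR. The exact wiring of the slide's multiplexers and switch is not recoverable from the transcript, so the roles given to the four configuration bits below (named `s0`, `s1`, `s2`, `p` after the figure labels) are assumptions made for illustration, as are the function names `logic_block` and `ripple_add`.

```python
from typing import Tuple


def full_adder(x: int, y: int, cin: int) -> Tuple[int, int]:
    """Standard 1-bit full adder: returns (sum, carry_out)."""
    s = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return s, cout


def logic_block(d1: int, d2: int, d3: int,
                config: Tuple[int, int, int, int],
                chain_cin: int = 0) -> Tuple[int, int]:
    """Hypothetical configurable block: returns (lb_out, carry_to_next_block)."""
    s0, s1, s2, p = config
    # Assumption: s0 picks the carry input, D3 (0) or the carry chain (1).
    cin = chain_cin if s0 else d3
    # Assumption: p optionally inverts the Y operand.
    y = d2 ^ p
    s, cout = full_adder(d1, y, cin)
    # Assumption: s1 picks the block output, Sum (0) or COUT (1).
    out = cout if s1 else s
    # Assumption: s2 closes the switch that forwards COUT to the next block.
    next_cin = cout if s2 else 0
    return out, next_cin


def ripple_add(a: int, b: int, width: int = 4) -> int:
    """Chain `width` blocks in adder mode (carry chain on, Sum selected)."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = logic_block((a >> i) & 1, (b >> i) & 1, 0,
                                 (1, 0, 1, 0), carry)
        result |= bit << i
    return result


print(logic_block(1, 1, 0, (0, 0, 0, 0))[0])  # XOR of D1, D2 when D3 = 0
print(logic_block(1, 1, 0, (0, 1, 0, 0))[0])  # AND of D1, D2 when D3 = 0
print(ripple_add(5, 6))                       # 4-bit addition
```

Under these assumptions, changing four bits switches the same hardware between bitwise logic and arithmetic, at the cost of covering fewer functions than a general LUT.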

slide11

Reconfigurable Cell Comparison

Proposed Reconfigurable Cell: a single reconfigurable logic block based on a full adder, complete with a memory containing the context configuration bits

[Figure: proposed cell; labels: Logic Block, 16x3 Context Memory, Context Selection, Context Enable, D1, D2, D3, CIN, SUM, COUT, Out; bus widths 4, 3, 3]

slide12

Reconfigurable Cells Comparison

Traditional (LUT-based) implementation of the same cell:

[Figure: LUT-based cell; labels: two 16x8 LUTs, three MUX blocks, Context Selection, Context Enable, Data Input (D1, D2, D3), S0, CIN, SUM, COUT, Out; bus widths 8, 8, 4, 3, 3]

slide13

Performance evaluation

  • Simulation software: SPECTRE, Cadence Virtuoso Suite
  • Process used: CL018 by TSMC, Taiwan (0.18 μm feature size)
  • Process-related simulation data: NCSU Design Kit
slide14

Performance evaluation: layout

LUT-based cell layout:

Proposed cell layout:

Area: 0.00903 mm² (proposed) vs 0.0212 mm² (LUT-based), 57.4% less
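The quoted area saving can be checked with a few lines of Python. This sketch assumes the smaller figure belongs to the proposed cell and the larger one to the LUT-based cell; the function name `reduction_percent` is a hypothetical helper.

```python
def reduction_percent(proposed: float, baseline: float) -> float:
    """Relative saving of `proposed` with respect to `baseline`, in percent."""
    return (1.0 - proposed / baseline) * 100.0


# Areas in mm^2 as given on the slide.
area_saving = reduction_percent(0.00903, 0.0212)
print(f"{area_saving:.1f}% less area")  # prints "57.4% less area"
```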

slide15

Performance evaluation: delay

Maximum delays of the proposed cell:

Maximum delays of the traditional LUT-based cell:

slide16

Performance evaluation: power

  • Simulation conditions:
    • 100 MHz operating frequency
    • 100% input node activity

Power consumption of the proposed cell: 0.572 mW

Power consumption of the traditional LUT-based cell: 1.097 mW

Average power consumption reduced by 48%
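The power figures can be cross-checked, and converted into average energy per clock cycle (E = P / f), with a short Python sketch. It assumes both values are average power at the stated 100 MHz operating frequency; the function name `energy_per_cycle_pj` is a hypothetical helper.

```python
def energy_per_cycle_pj(power_mw: float, freq_mhz: float) -> float:
    """Average energy per clock cycle in picojoules: E = P / f.

    mW / MHz gives nJ per cycle, so multiply by 1000 to get pJ.
    """
    return power_mw / freq_mhz * 1e3


P_PROPOSED_MW = 0.572  # proposed cell, from the slide
P_LUT_MW = 1.097       # traditional LUT-based cell, from the slide
FREQ_MHZ = 100.0       # simulation operating frequency

power_saving = (1.0 - P_PROPOSED_MW / P_LUT_MW) * 100.0
print(f"power reduced by {power_saving:.0f}%")  # prints "power reduced by 48%"
print(f"proposed:  {energy_per_cycle_pj(P_PROPOSED_MW, FREQ_MHZ):.2f} pJ/cycle")
print(f"LUT-based: {energy_per_cycle_pj(P_LUT_MW, FREQ_MHZ):.2f} pJ/cycle")
```

Because both cells are simulated at the same frequency, the energy-per-cycle ratio equals the power ratio.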

slide17

Performance evaluation: summary

Summary of performance comparison:

slide18

Conclusions

  • Architecture advantages:
    • Fast reconfiguration
    • Low transistor count (68.8% less) and area requirements
    • Low power consumption
  • Main limitations:
    • Reduced flexibility compared to a LUT-based cell
  • Future work:
    • Use of the proposed cell in a complete RFU architecture
    • Integration of the RFU in an existing embedded processor