VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors


University of Rome “Tor Vergata”, Department of Electronic Engineering. VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors. Authors: G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re.

Presentation Transcript
slide1

University of Rome “Tor Vergata”

Department of Electronic Engineering

VLSI Implementation of Reconfigurable Cells for RFU in Embedded Processors

Authors:

G.C. Cardarilli, L. Di Nunzio, R. Fazzolari, C. Lenci, M. Re

slide2

Index

  • Introduction / motivation
  • Reconfigurable Functional Units
  • Multicontext Logic Blocks
  • Traditional and proposed cell comparison
  • Performance evaluation
    • Delay
    • Power consumption
    • Area requirements
  • Conclusions
slide3

Motivation

  • In some applications, operands are usually shorter than the native processor word length

Poor efficiency of general-purpose processors when processing shorter data

[Figure labels: Data 1, Data 2, Result (twice), XOR, AND]

slide4

Possible Solution

  • Execution speed can be increased by using a reconfigurable unit for “custom” instructions

[Figure labels: PROCESSOR, Register File, ALU, Reconfigurable Unit]

slide5

Reconfigurable Units

Attached Processing Unit (APU)

  • Located outside of the processor core
  • “Slow” data transfer between APU and processor
  • Original instruction set

[Figure labels: PROCESSOR, Processor core, Register File, ALU, APU]

slide6

Reconfigurable Units

Coprocessor

  • Located outside of the processor core
  • Faster interaction with the processor core than APUs
  • Instruction set extension needed

[Figure labels: PROCESSOR, Processor core, Register File (twice), ALU, Coprocessor (twice)]

slide7

Reconfigurable Units

Reconfigurable Functional Units (RFUs)

  • Integrated into the processor core
  • Fastest interaction with the processor
  • Core redesign needed
  • Instruction set extension needed

[Figure labels: PROCESSOR, Register File, ALU, RFU]

slide8

Reconfigurable Units

Reconfigurable Unit requirements:

  • Fast data transfer between RU and processor → RFU approach chosen
  • Fast reconfiguration of the RU
  • Silicon area as small as possible
  • Low power consumption
slide9

Multicontext Reconfigurable Cells

Traditional approach (LUT-based):

One Look-Up Table for each context (operation)

Reconfigurable Logic Block:

A single reconfigurable block, complete with a memory containing the contexts

[Figure labels: Configurable Block, LUT Context 1 ... LUT Context N, Context Memory, Selector, input, output, context selection]

slide10

Proposed Logic Block

  • Full-adder based
  • Additional blocks for its configuration
  • 4 configuration bits (2^4 = 16 contexts)
  • 3 input bits / 1 output bit

[Figure labels: D1, D2, D3, S0, S1, S2, P, Data, Configuration Bits, MUX (twice), Full Adder (X, Y, CIN, Sum, COUT), Switch, LB Out, To CIN of next LB]
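
As a rough illustration of why a full adder is a useful base for a configurable logic block, the Python sketch below shows how tying the carry input to a constant turns the adder's two outputs into basic two-input gates. The function names and the idea of forcing CIN are assumptions made for this example; the actual MUX and switch wiring of the proposed block cannot be recovered from the transcript.

```python
from itertools import product


def full_adder(x: int, y: int, cin: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry_out)."""
    total = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return total, cout


def functions_from_forced_carry(x: int, y: int) -> dict[str, int]:
    """Two-input functions obtained by tying CIN to a constant.

    Hypothetical configuration, for illustration only: the slide's real
    MUX and switch wiring is not recoverable from the transcript.
    """
    sum0, cout0 = full_adder(x, y, 0)  # CIN tied to 0
    sum1, cout1 = full_adder(x, y, 1)  # CIN tied to 1
    return {"XOR": sum0, "AND": cout0, "XNOR": sum1, "OR": cout1}


CONFIG_BITS = 4                   # from the slide
NUM_CONTEXTS = 2 ** CONFIG_BITS   # 16 selectable contexts

for x, y in product((0, 1), repeat=2):
    print(x, y, functions_from_forced_carry(x, y))
```

With CIN at 0 the sum output behaves as XOR and the carry output as AND; with CIN at 1 they behave as XNOR and OR. A small amount of steering logic around the adder can therefore cover several operations without a full truth table per context.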

slide11

Reconfigurable Cell Comparison

Proposed Reconfigurable Cell

A single reconfigurable logic block based on a full adder, complete with a memory containing the context configuration bits

[Figure labels: Logic Block, 16x3 Context Memory, Context Enable, Context Selection, D1, D2, D3, CIN, SUM, COUT, Out; bus widths 4, 3, 3]

slide12

Reconfigurable Cells Comparison

Traditional (LUT-based) implementation of the same cell:

[Figure labels: 16x8 LUT (twice), each with Context Enable and Out; MUX (three times); Context Selection; Data Input; D1, D2, D3; S0; CIN; SUM; COUT; bus widths 8, 8, 4, 3, 3]
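
The LUT-based reference cell can be modeled in a few lines of Python. This sketch assumes that "16x8 LUT" means 16 contexts with one 8-entry truth table each (3 data inputs, 4 context-selection bits), and that one LUT drives the sum output and the other the carry output. The class name, the index encoding, and the functions programmed into each context are hypothetical.

```python
from itertools import product

NUM_CONTEXTS = 16  # 4 context-selection bits
NUM_INPUTS = 3     # D1, D2, D3


class MulticontextLUT:
    """Assumed model of a 16x8 LUT: one 8-entry truth table per context."""

    def __init__(self) -> None:
        self.tables = [[0] * (2 ** NUM_INPUTS) for _ in range(NUM_CONTEXTS)]

    @staticmethod
    def _index(d1: int, d2: int, d3: int) -> int:
        # Arbitrary address encoding chosen for this sketch.
        return (d3 << 2) | (d2 << 1) | d1

    def program(self, context: int, func) -> None:
        """Fill one context's truth table from a 3-input Boolean function."""
        for d1, d2, d3 in product((0, 1), repeat=NUM_INPUTS):
            self.tables[context][self._index(d1, d2, d3)] = func(d1, d2, d3) & 1

    def read(self, context: int, d1: int, d2: int, d3: int) -> int:
        """Select a context, then look up the output for the data inputs."""
        return self.tables[context][self._index(d1, d2, d3)]

    def storage_bits(self) -> int:
        return NUM_CONTEXTS * 2 ** NUM_INPUTS


sum_lut, cout_lut = MulticontextLUT(), MulticontextLUT()

# Context 0 (hypothetical assignment): one-bit full adder.
sum_lut.program(0, lambda a, b, c: a ^ b ^ c)
cout_lut.program(0, lambda a, b, c: (a & b) | (c & (a ^ b)))

# Context 1 (hypothetical assignment): 3-input AND on the first LUT.
sum_lut.program(1, lambda a, b, c: a & b & c)

print(sum_lut.read(0, 1, 1, 0), cout_lut.read(0, 1, 1, 0))
print(sum_lut.storage_bits() + cout_lut.storage_bits(), "configuration bits")
```

Under these assumptions the two LUTs hold 2 x 16 x 8 configuration bits, which is the flexibility the LUT-based cell pays for in storage: any 3-input function can be loaded into any context.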

slide13

Performance evaluation

  • Simulation software: SPECTRE, Cadence Virtuoso Suite
  • Process used: CL018 by TSMC, Taiwan (0.18 μm feature size)
  • Process-related simulation data: NCSU Design Kit
slide14

Performance evaluation: layout

LUT-based cell layout:

Proposed cell layout:

0.00903 mm² vs 0.0212 mm² (57.4% less)
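
The quoted percentage can be checked with a short Python calculation. The sketch assumes the smaller figure belongs to the proposed cell and the larger one to the LUT-based cell, which the slide text does not state explicitly; the helper name is made up for this example.

```python
def percent_reduction(new_value: float, old_value: float) -> float:
    """Relative reduction of new_value with respect to old_value, in percent."""
    return 100.0 * (1.0 - new_value / old_value)


PROPOSED_AREA_MM2 = 0.00903  # assumed to be the proposed cell
LUT_AREA_MM2 = 0.0212        # assumed to be the LUT-based cell

area_saving = percent_reduction(PROPOSED_AREA_MM2, LUT_AREA_MM2)
print(f"{area_saving:.1f}% less area")
```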

slide15

Performance evaluation: delay

Maximum delays of the proposed cell:

Maximum delays of the traditional LUT-based cell:

slide16

Performance evaluation: power

  • Simulation conditions:
    • 100 MHz operating frequency
    • 100% input node activity

Power consumption of the proposed cell: 0.572 mW

Power consumption of the traditional LUT-based cell: 1.097 mW

Average power consumption reduced by 48%
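
To put the power figures in per-operation terms, the Python sketch below divides each cell's power by the 100 MHz operating frequency to get an energy per clock cycle, and recomputes the relative saving. The energy values are derived here from the slide's numbers and are not reported in the presentation; the function name is hypothetical.

```python
FREQ_HZ = 100e6               # operating frequency from the slide
PROPOSED_POWER_W = 0.572e-3   # proposed cell
LUT_POWER_W = 1.097e-3        # traditional LUT-based cell


def energy_per_cycle_pj(power_w: float, freq_hz: float) -> float:
    """Average energy per clock cycle in picojoules (E = P / f)."""
    return power_w / freq_hz * 1e12


proposed_pj = energy_per_cycle_pj(PROPOSED_POWER_W, FREQ_HZ)
lut_pj = energy_per_cycle_pj(LUT_POWER_W, FREQ_HZ)
saving_pct = 100.0 * (1.0 - PROPOSED_POWER_W / LUT_POWER_W)

print(f"proposed: {proposed_pj:.2f} pJ/cycle, LUT-based: {lut_pj:.2f} pJ/cycle")
print(f"power saving: {saving_pct:.0f}%")
```

Because both cells were simulated at the same frequency and input activity, the ratio of energies per cycle equals the ratio of powers, so the saving matches the roughly 48% quoted on the slide.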

slide17

Performance evaluation: summary

Summary of performance comparison:

slide18

Conclusions

  • Architecture advantages:
    • Fast reconfiguration
    • Low transistor count (68.8% less) and area requirements
    • Low power consumption
  • Main limitations:
    • Reduced flexibility compared to a LUT-based cell
  • Future work:
    • Use of the proposed cell in a complete RFU architecture
    • Integration of the RFU in an existing embedded processor