
Architectural Exploration:

802.11a Transmitter

Arvind, Nirav Dave, Steve Gerding, Mike Pellauer

Computer Science & Artificial Intelligence Laboratory

Massachusetts Institute of Technology

MIT-Nokia Architecture Group

Helsinki, June 5, 2006

Why architectural exploration
  • Architects are clever people and can think of a variety of designs
  • But often cannot determine which design is best for a given metric (e.g., power)
    • Too little time and manpower to take several designs far enough for proper evaluation

⇒ Guesswork instead of architectural exploration

New design tools can change all that

This talk
  • Architectural exploration of 802.11a transmitter
    • The goal is to show that it is easy and economical to do so in Bluespec
    • You don’t have to know 802.11a or Bluespec to understand the talk
802.11a Transmitter Overview

[Figure: headers and data → Controller → Scrambler → Encoder → Interleaver → Mapper → IFFT → Cyclic Extend; diagram labels: "24 Uncoded bits", "One OFDM symbol (64 Complex Numbers)"]

  • Depending upon the transmission rate, consumes 1, 2 or 4 tokens to produce one OFDM symbol
  • IFFT transforms 64 (frequency domain) complex numbers into 64 (time domain) complex numbers
  • IFFT accounts for > 95% of the area
  • Must produce one OFDM symbol every 4 μsec

Combinational IFFT

[Figure: in0 … in63 → 16 Radix-4 nodes → Permute_1 → 16 Radix-4 nodes → Permute_2 → 16 Radix-4 nodes → Permute_3 → out0 … out63. Inset: one Radix-4 node, built from multiplications by t0 … t3, adders, subtractors and one multiplication by j]

All numbers are complex and represented as two sixteen-bit quantities. Fixed-point arithmetic is used to reduce area, power, ...
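As a rough illustration of the number representation described above, here is a small Python sketch of a complex multiply on pairs of signed 16-bit fixed-point words. The choice of 12 fractional bits, the wrap-on-overflow behavior and the helper names (`FRAC_BITS`, `wrap16`, `to_fixed`, `cmul`) are assumptions made for this sketch; the slides do not state the format actually used.

```python
FRAC_BITS = 12  # hypothetical: fractional bits in each 16-bit word (not given in the talk)

def wrap16(v: int) -> int:
    """Wrap an integer into the signed 16-bit range, as plain hardware adders would."""
    return ((v + 0x8000) & 0xFFFF) - 0x8000

def to_fixed(x: float) -> int:
    """Quantize a real number to a signed 16-bit fixed-point integer."""
    return wrap16(round(x * (1 << FRAC_BITS)))

def cmul(a, b):
    """Multiply two fixed-point complex numbers given as (re, im) integer pairs."""
    ar, ai = a
    br, bi = b
    # Each product carries 2*FRAC_BITS fractional bits, so shift back down.
    re = ((ar * br) >> FRAC_BITS) - ((ai * bi) >> FRAC_BITS)
    im = ((ar * bi) >> FRAC_BITS) + ((ai * br) >> FRAC_BITS)
    return wrap16(re), wrap16(im)

half = to_fixed(0.5)
product = cmul((half, half), (half, -half))  # (0.5 + 0.5j) * (0.5 - 0.5j)
print(product)
```

A real design would also have to choose a rounding and saturation policy; this sketch simply truncates and wraps.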

Design Tradeoffs
  • We can decrease the area by multiplexing some circuits

It may be a win if the throughput requirements can be met without increasing the frequency

  • Power can be lowered by lowering the clock frequency, since a lower frequency allows a lower supply voltage

power ∝ (voltage)²

Combinational IFFT: Opportunity for reuse

[Figure: the combinational IFFT again: in0 … in63 → 16 Radix-4 nodes → Permute_1 → 16 Radix-4 nodes → Permute_2 → 16 Radix-4 nodes → Permute_3 → out0 … out63]

Reuse the same circuit three times

Circular pipeline: Reusing the Pipeline Stage

[Figure: in0 … in63 → 64 4-way muxes → one stage of 16 Radix-4 nodes → Permute_1, Permute_2, Permute_3 → out0 … out63, with a stage counter controlling the muxes]

16 Radix-4s can be shared but not the three permutations. Hence the need for muxes

Superfolded circular pipeline: Just one Radix-4 node!

[Figure components: in0 … in63; 64 4-way muxes; 4 16-way muxes; one Radix-4 node; 4 16-way demuxes; Permute_1, Permute_2, Permute_3; out0 … out63; index counter (0 to 15); stage counter (0 to 2)]

Designs with 2, 4, and 8 Radix-4 modules make sense too!
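The trade-off between sharing and clock rate can be put into numbers with a short Python sketch. It assumes that a folded design with n Radix-4 nodes spends one clock cycle per group of n nodes, so 3 stages × 16/n cycles per symbol, and that the IFFT alone must finish within the 4 μsec symbol period. The other transmitter blocks and any overlap between symbols are ignored, so these are illustrative lower bounds, not figures from the talk.

```python
SYMBOL_PERIOD_S = 4e-6   # one OFDM symbol every 4 microseconds (from the talk)
NODES_PER_STAGE = 16     # Radix-4 nodes in one IFFT stage
STAGES = 3               # IFFT stages

def cycles_per_symbol(radix4_nodes: int) -> int:
    """Cycles to push one symbol through an IFFT built from `radix4_nodes` nodes."""
    if NODES_PER_STAGE % radix4_nodes != 0:
        raise ValueError("node count must divide 16")
    return STAGES * (NODES_PER_STAGE // radix4_nodes)

def min_clock_hz(radix4_nodes: int) -> float:
    """Lowest IFFT clock that still completes one symbol per symbol period."""
    return cycles_per_symbol(radix4_nodes) / SYMBOL_PERIOD_S

for n in (16, 8, 4, 2, 1):
    print(f"{n:2d} nodes: {cycles_per_symbol(n):2d} cycles, "
          f"{min_clock_hz(n) / 1e6:.2f} MHz minimum")
```

Under these assumptions the single-node design needs 48 cycles per symbol, which matches the index counter (0 to 15) and stage counter (0 to 2) in the figure.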

Which design consumes the least energy to transmit a symbol?
  • Can we quickly code up all the alternatives?
    • single source with parameters?

Not practical in traditional hardware description languages like Verilog/VHDL

Bluespec code: Radix-4 Node

function Vector#(4,Complex) radix4(Vector#(4,Complex) t, Vector#(4,Complex) k);
   Vector#(4,Complex) m = newVector(),
                      y = newVector(),
                      z = newVector();

   // Multiply each input k[n] by its twiddle factor t[n]
   m[0] = k[0] * t[0]; m[1] = k[1] * t[1];
   m[2] = k[2] * t[2]; m[3] = k[3] * t[3];

   // First layer of adders and subtractors; i is the imaginary unit (*j in the diagram)
   y[0] = m[0] + m[2]; y[1] = m[0] - m[2];
   y[2] = m[1] + m[3]; y[3] = i*(m[1] - m[3]);

   // Second layer of adders and subtractors
   z[0] = y[0] + y[2]; z[1] = y[1] + y[3];
   z[2] = y[0] - y[2]; z[3] = y[1] - y[3];

   return(z);
endfunction

Polymorphic code: works on any type of numbers for which *, + and - have been defined
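To see what the node computes, the radix4 function above can be transcribed into Python and compared with a directly evaluated 4-point inverse DFT. This is a sketch for checking the arithmetic only: it uses Python's built-in complex type instead of the fixed-point Complex type, and `inverse_dft4` is a reference helper introduced here, not part of the talk.

```python
import cmath

def radix4(t, k):
    """Python transcription of the slide's radix4 (t: twiddles, k: data)."""
    m = [k[n] * t[n] for n in range(4)]
    y = [m[0] + m[2], m[0] - m[2], m[1] + m[3], 1j * (m[1] - m[3])]
    return [y[0] + y[2], y[1] + y[3], y[0] - y[2], y[1] - y[3]]

def inverse_dft4(x):
    """Direct, unnormalized 4-point inverse DFT, used only as a reference."""
    return [sum(x[r] * cmath.exp(2j * cmath.pi * r * q / 4) for r in range(4))
            for q in range(4)]

unit_twiddles = [1, 1, 1, 1]
data = [1 + 2j, -0.5 + 0.25j, 3 - 1j, 0.75j]
node_out = radix4(unit_twiddles, data)
ref_out = inverse_dft4(data)
max_error = max(abs(a - b) for a, b in zip(node_out, ref_out))
print(max_error)
```

With unit twiddle factors the node is an unnormalized 4-point inverse DFT in natural output order; the twiddle multiplications are what let several such nodes be combined into a larger transform.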

Combinational IFFT: Can be used as a reference

[Figure: the combinational IFFT again: in0 … in63 → 16 Radix-4 nodes → Permute_1 → 16 Radix-4 nodes → Permute_2 → 16 Radix-4 nodes → Permute_3 → out0 … out63; one layer of Radix-4 nodes plus its permutation is labeled "stage_f function"]

Repeat it three times

Bluespec Code for Combinational IFFT

function SVector#(64, Complex) ifft (SVector#(64, Complex) in_data);
   // Declare vectors: stage_data[s] is the input to stage s
   SVector#(4,SVector#(64, Complex)) stage_data = replicate(newSVector);
   stage_data[0] = in_data;
   for (Integer stage = 0; stage < 3; stage = stage + 1)
      stage_data[stage+1] = stage_f(fromInteger(stage), stage_data[stage]);
   return(stage_data[3]);
endfunction

The code is unfolded to generate a combinational circuit

Stage function

function SVector#(64, Complex) stage_f(Bit#(2) stage,
                                       SVector#(64, Complex) stage_in);
   SVector#(64, Complex) stage_temp = newSVector;
   SVector#(64, Complex) stage_out = newSVector;
   // 16 Radix-4 nodes, each working on four consecutive inputs
   for (Integer i = 0; i < 16; i = i + 1)
   begin
      Integer idx = i * 4;
      let twid = getTwiddle(stage, fromInteger(i));
      let y = radix4(twid, stage_in[idx:idx+3]);
      stage_temp[idx] = y[0]; stage_temp[idx + 1] = y[1];
      stage_temp[idx + 2] = y[2]; stage_temp[idx + 3] = y[3];
   end
   // Permutation
   for (Integer i = 0; i < 64; i = i + 1)
      stage_out[i] = stage_temp[permute[i]];
   return(stage_out);
endfunction
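The three-stage structure can be checked numerically with a Python sketch of a 64-point radix-4 inverse DFT. It is written recursively rather than with the explicit per-stage twiddle and permutation tables of the code above (getTwiddle and permute are not shown in the slides), so it is an equivalent formulation under that assumption, not a transcription. `inverse_dft` is a slow reference helper and `node_calls` is a counter added for this sketch.

```python
import cmath

node_calls = [0]  # counts Radix-4 node evaluations

def radix4(t, k):
    """Radix-4 node as on the slide: multiply by twiddles, then two butterfly layers."""
    node_calls[0] += 1
    m = [k[n] * t[n] for n in range(4)]
    y = [m[0] + m[2], m[0] - m[2], m[1] + m[3], 1j * (m[1] - m[3])]
    return [y[0] + y[2], y[1] + y[3], y[0] - y[2], y[1] - y[3]]

def ifft_radix4(x):
    """Unnormalized inverse DFT of a sequence whose length is a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    quarter = n // 4
    # Transform the four interleaved subsequences x[r], x[r+4], x[r+8], ...
    sub = [ifft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(quarter):
        twid = [cmath.exp(2j * cmath.pi * r * k / n) for r in range(4)]
        z = radix4(twid, [sub[r][k] for r in range(4)])
        for q in range(4):
            out[k + q * quarter] = z[q]
    return out

def inverse_dft(x):
    """Direct O(N^2) unnormalized inverse DFT, used only as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

symbol = [complex((7 * j) % 5 - 2, (3 * j) % 4 - 1) for j in range(64)]
fast = ifft_radix4(symbol)
slow = inverse_dft(symbol)
max_error = max(abs(a - b) for a, b in zip(fast, slow))
print(node_calls[0], max_error)
```

For 64 inputs the recursion is three levels deep and evaluates 16 nodes per level, the same 3 × 16 node count as the combinational circuit.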

Synchronous pipeline

[Figure: x → inQ → f1 → sReg1 → f2 → sReg2 → f3 → outQ]

rule sync_pipeline (True);
   inQ.deq();
   sReg1 <= f1(inQ.first());
   sReg2 <= f2(sReg1);
   outQ.enq(f3(sReg2));
endrule

This is real IFFT code; just replace f1, f2 and f3 with stage_f code
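A cycle-by-cycle Python model may help show how the synchronous pipeline rule behaves. The stage functions `f1`, `f2`, `f3` here are arbitrary stand-ins, and the use of `None` to mark an empty register (so that the pipeline can fill and drain cleanly) is an assumption added for this sketch; the rule on the slide has no valid bits.

```python
from collections import deque

# Hypothetical stand-ins for the three stage functions.
def f1(x): return x + 1
def f2(x): return x * 2
def f3(x): return x - 3

def run_sync_pipeline(inputs):
    """Simulate the three-stage pipeline one clock cycle per loop iteration."""
    in_q, out_q = deque(inputs), deque()
    s_reg1 = s_reg2 = None  # None marks an empty register (assumption of this model)
    while in_q or s_reg1 is not None or s_reg2 is not None:
        # Every right-hand side reads the register values from the start of
        # the cycle, mirroring the simultaneous updates (<=) in the rule.
        next1 = f1(in_q.popleft()) if in_q else None
        next2 = f2(s_reg1) if s_reg1 is not None else None
        if s_reg2 is not None:
            out_q.append(f3(s_reg2))
        s_reg1, s_reg2 = next1, next2
    return list(out_q)

samples = [1, 2, 3, 4]
piped = run_sync_pipeline(samples)
reference = [f3(f2(f1(x))) for x in samples]
print(piped, reference)
```

The pipelined result matches the combinational composition f3(f2(f1(x))), with each value arriving two cycles after it entered sReg1.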

Folded pipeline

[Figure: x → inQ → f → outQ, with the output of f fed back through sReg; a stage register selects f1, f2 or f3 inside f]

function f (stage,sx);
   case (stage)
      1: return f1(sx);
      2: return f2(sx);
      3: return f3(sx);
   endcase
endfunction

rule folded_pipeline (True);
   if (stage==1)
      begin inQ.deq();
            sxIn = inQ.first(); end
   else sxIn = sReg;
   sxOut = f(stage,sxIn);
   if (stage==3) outQ.enq(sxOut);
   else sReg <= sxOut;
   stage <= (stage==3)? 1 : stage+1;
endrule

This is real IFFT code too ...
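The folded pipeline rule can be modeled in Python in the same way, one rule firing per loop iteration. As before, `f1`, `f2` and `f3` are arbitrary stand-in functions chosen for the sketch, and the cycle counter is added only to make the cost of folding visible.

```python
from collections import deque

# Hypothetical stand-ins for the three stage functions.
def f1(x): return x + 1
def f2(x): return x * 2
def f3(x): return x - 3

def f(stage, sx):
    """Select the stage function, like the case statement on the slide."""
    return {1: f1, 2: f2, 3: f3}[stage](sx)

def run_folded_pipeline(inputs):
    """Simulate the folded pipeline; returns the outputs and the cycle count."""
    in_q, out_q = deque(inputs), deque()
    stage, s_reg, cycles = 1, None, 0
    while in_q or stage != 1:
        # Stage 1 takes a fresh input; later stages reuse the stored result.
        sx_in = in_q.popleft() if stage == 1 else s_reg
        sx_out = f(stage, sx_in)
        if stage == 3:
            out_q.append(sx_out)
        else:
            s_reg = sx_out
        stage = 1 if stage == 3 else stage + 1
        cycles += 1
    return list(out_q), cycles

samples = [1, 2, 3, 4]
folded, cycles = run_folded_pipeline(samples)
reference = [f3(f2(f1(x))) for x in samples]
print(folded, cycles)
```

The outputs agree with the unfolded composition, but each input now occupies the single shared stage for three cycles, which is the area-for-time trade the talk is exploring.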

Expressing these designs in Bluespec is easy
  • All these designs were done in less than one day!
  • Area and power estimates?

How long will it take to write these designs in Verilog? VHDL? SystemC?

Bluespec Tool flow

[Figure: tool flow]
  • Bluespec SystemVerilog source → Bluespec Compiler → Verilog 95 RTL or C
  • C → Bluespec C sim (cycle accurate)
  • Verilog 95 RTL → Verilog sim
  • Verilog 95 RTL → RTL synthesis → gates → Place & Route → Physical → Tapeout
  • Other boxes in the diagram: VCD output, Debussy Visualization, FPGA, Power estimation tool (Sequence Design PowerTheater)

802.11a Transmitter Synthesis results for various IFFT designs

TSMC .18 micron; numbers reported are before place and route.

Some areas will be larger after layout.

Algorithmic Improvements

[Figure: the combinational IFFT again: in0 … in63 → 16 Radix-4 nodes → Permute_1 → 16 Radix-4 nodes → Permute_2 → 16 Radix-4 nodes → Permute_3 → out0 … out63]

1. All three permutations can be made identical

⇒ more savings in area

2. One multiplication can be removed from Radix-4

802.11a Transmitter Synthesis results: old vs. new IFFT designs

[Table not recovered; its annotations read "???" and "expected"]

TSMC .18 micron; numbers reported are before place and route.

802.11a Transmitter Synthesis results with new IFFT designs

TSMC .18 micron; numbers reported are before place and route.

802.11a Transmitter with new IFFT designs: Power Estimates

Work in progress

c3 = min clock × scaling factor;

c4 is raw data collected by the Sequence Design PowerTheater;

c5 = c4 × c3 / 100 MHz / voltage scaling (= 10);

c6 = c5 × 4 μsec
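The column formulas above can be written out as a small Python sketch. It assumes c3 is a clock in MHz, that c4 is a power figure taken at a 100 MHz clock, and that c6 is therefore an energy per symbol in (power unit × μsec). The input values in the example are hypothetical and are not taken from the talk's table.

```python
REFERENCE_CLOCK_MHZ = 100.0  # clock in the c5 formula on the slide
VOLTAGE_SCALING = 10.0       # divisor quoted on the slide
SYMBOL_TIME_US = 4.0         # one OFDM symbol every 4 microseconds

def scaled_power(c4_raw_power: float, c3_clock_mhz: float) -> float:
    """c5: raw power rescaled to the design's own clock and to the voltage scaling."""
    return c4_raw_power * c3_clock_mhz / REFERENCE_CLOCK_MHZ / VOLTAGE_SCALING

def energy_per_symbol(c5_power: float) -> float:
    """c6: energy spent in one symbol time, in (power unit x microseconds)."""
    return c5_power * SYMBOL_TIME_US

# Hypothetical inputs for illustration only.
c5 = scaled_power(c4_raw_power=200.0, c3_clock_mhz=12.0)
c6 = energy_per_symbol(c5)
print(c5, c6)
```

Comparing c6 across designs is what answers the talk's question of which design consumes the least energy per symbol.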

Summary
  • It is essential to do architectural exploration for better (area, power, performance, ...) designs.
  • It is possible to do so with new design tools and methodologies.
  • Better and faster tools for estimating area, timing and power would dramatically increase our capability to do architectural exploration.

Thanks