A comparison of DSP Architectures: Blackfin ADSP-BFXXX Compute Unit



Based on an ENEL619.23 white paper prepared by Darrell Anklovitch

Blackfin Compute Unit

REV B


Overview

  • Architecture Overview

  • Register Map

  • ALU features and sample instructions

  • Multiplier features and sample instructions

  • Shifter features and sample instructions

References

  • ADSP-BF535 Blackfin Processor Hardware Reference, Rev 2, April 2004, Analog Devices. Section 2.

  • Blackfin Processor Instruction Set Reference, Rev 2, May 2003, Analog Devices. Sections 8 to 10, 14 and 15.

  • A number of the figures in this presentation are based on figures found in the ADSP-BF535 Blackfin Processor Hardware Reference.

ADSP-2106x Core Architecture

[Figure: ADSP-2106x core block diagram. Labels: cache memory (32 x 48), program sequencer, DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24), timer, flags, JTAG test & emulation; PMA bus (24), DMA bus (32), PMD bus (48), DMD bus (40), bus connect; register file (16 x 40), floating & fixed-point multiplier with fixed-point accumulator, 32-bit barrel shifter, floating-point & fixed-point ALU.]

Register File and COMPUTE Units

  • Key issues

    • 5 data paths FROM COMPUTE units

    • 5 data paths TO COMPUTE units

    • Highly parallel operations UNDER THE RIGHT CONDITIONS

BF533 Memory Accesses

Under the right conditions, 4 memory accesses can occur at the same time:

64 bit instruction fetch, 2 x 32 bit data loads, 32 bit data store

PLUS up to 2 ALU (32 bit) and 2 MAC (16 bit) operations at the same time

PLUS background DMA activity

Compute Unit Architecture

[Figure: compute unit block diagram showing the register file, 2 multipliers, 2 ALUs, 1 shifter and 1 set of video ALUs.]

Register File

[Figure: register file with 8 x 32 bit OR 16 x 16 bit data registers and 2 x 40 bit accumulators.]

  • DATA REGISTER SYNTAX:

    • R0, R1 etc. refer to 32 bit registers

    • R0.L refers to the low 16 bits of the 32 bit R0 register

    • R0.H refers to the high 16 bits of the R0 register

  • ACCUMULATOR SYNTAX:

    • A0.L => low 16 bits

    • A0.H => next 16 bits

    • A0.W => least significant 32 bit word

    • A0.X => most significant 8 bit extension

SHARC comparison: 16 32-bit data registers, usable for integer and float values. There is a pair of SHARC accumulator registers too.
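The register syntax above can be pictured as bit-field views of a wider value. The following is a minimal C sketch, not Blackfin code: the helper names are hypothetical, a data register is modelled as a uint32_t, and a 40 bit accumulator is modelled in the low 40 bits of a uint64_t.

```c
#include <stdint.h>

/* Hypothetical helpers: views of a 32 bit data register such as R0. */
static uint16_t reg_lo(uint32_t r) { return (uint16_t)(r & 0xFFFFu); }   /* R0.L */
static uint16_t reg_hi(uint32_t r) { return (uint16_t)(r >> 16); }       /* R0.H */

/* Hypothetical helpers: views of a 40 bit accumulator such as A0,
   stored in the low 40 bits of a uint64_t. */
static uint16_t acc_l(uint64_t a) { return (uint16_t)(a & 0xFFFFu); }          /* A0.L */
static uint16_t acc_h(uint64_t a) { return (uint16_t)((a >> 16) & 0xFFFFu); }  /* A0.H */
static uint32_t acc_w(uint64_t a) { return (uint32_t)(a & 0xFFFFFFFFu); }      /* A0.W */
static uint8_t  acc_x(uint64_t a) { return (uint8_t)((a >> 32) & 0xFFu); }     /* A0.X */
```

For example, with the accumulator value 0xAB12345678 the sketch gives A0.X = 0xAB, A0.W = 0x12345678, A0.H = 0x1234 and A0.L = 0x5678.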

ALU Data Flow

[Figure: ALU data flow, with 2 x 32 bit paths to the dual Multiplier/ALU units and 2 x 32 bit paths back to the register file.]

Sample instructions

ALU Features

[Figure: ALU operation types on 32 bit registers Rm, Rp and Rn: single 16 bit ops, dual 16 bit ops, dual 16 bit cross ops and single 32 bit ops.]

ALU Sample Instructions

[Figure: example instructions for single 16 bit ops, dual 16 bit ops, quad 16 bit ops, single 32 bit ops and dual 32 bit ops, with operands labelled A, B, C and D.]

Notes on the examples:

  • Does not work in parallel

  • Must have this option

  • Operator order is important: + must come before -

  • A & B registers must stay on the same side of the ‘|’ for both instructions

  • For dual and quad 16 bit operations the (CO) option causes the destination registers to cross
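A dual 16 bit operation with and without the (CO) option can be sketched in C. This is a behavioural model under assumptions, not Blackfin code: it assumes the operator to the left of the '|' applies to the high halves, that results wrap rather than saturate, and that (CO) swaps the two 16 bit results in the destination. The function names are hypothetical.

```c
#include <stdint.h>

/* Pack two 16 bit halves into one 32 bit register value. */
static uint32_t pack(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Model of Rd = Ra +|- Rb: add the high halves, subtract the low halves.
   Assumption: wraparound arithmetic, no saturation.
   A non-zero cross models the (CO) option by swapping the two results. */
static uint32_t dual_add_sub(uint32_t a, uint32_t b, int cross)
{
    uint16_t hi = (uint16_t)((a >> 16) + (b >> 16));
    uint16_t lo = (uint16_t)((a & 0xFFFFu) - (b & 0xFFFFu));
    return cross ? pack(lo, hi) : pack(hi, lo);
}
```

With a = 0x00050003 and b = 0x00010002 the model gives 0x00060001 without (CO) and 0x00010006 with it.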

Multiply Data Flow

The multipliers share the same operand/result buses as the ALU.

[Figure: multiply data flow, with 2 x 32 bit paths to the dual Multiplier/ALU units, 2 x 40 bit accumulators and 2 x 32 bit paths back to the register file.]

Multiply Features

[Figure: multiplier inputs selected from the high (H) and low (L) 16 bit halves of the source registers.]

  • Multiplies are signed fractional by default

  • The signed fractional multiply result is automatically left shifted by 1 bit

  • Signed fractional multiply != signed integer multiply

  • Rounding is available on fractional number multiplies and on a special option of integer number multiplies
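The difference between the two multiply types can be sketched in C. This is a behavioural model, not Blackfin code, and the function names are hypothetical. It treats the 16 bit operands as 1.15 fractions, so the raw product is doubled (shifted left 1 bit) to give a 1.31 result. Saturating the single overflow case, 0x8000 * 0x8000, is an assumption of the sketch.

```c
#include <stdint.h>

/* Signed integer multiply: plain 16 x 16 -> 32 bit product. */
static int32_t mul_int(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Signed fractional multiply: same product, shifted left by 1 bit.
   Assumption: -1.0 * -1.0 (0x8000 * 0x8000) saturates to the largest
   positive value instead of overflowing. */
static int32_t mul_frac(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    if (p == 0x40000000)
        return INT32_MAX;
    return p * 2;   /* doubling is the 1 bit left shift */
}
```

For 0x4000 * 0x4000 the integer product is 0x10000000, while the fractional product is 0x20000000, which is 0.5 * 0.5 = 0.25 in 1.31 format.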

Rounding

[Figure: the 2 rounding cases, with source registers Rm and Rp. In each case 0x8000 is added to the 32 bit result and the top 16 bits go to the destination register Rd.]

Rounding adds 0x8000 to the 32 bit multiplier result or to the accumulator value (the 2 cases) before extracting a 16 bit value to the destination register.
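The rounding step can be sketched in C as follows. This is a minimal model, not Blackfin code: the function name is hypothetical, the clamp on overflow is an assumption, and the sketch relies on the compiler performing an arithmetic right shift on negative values.

```c
#include <stdint.h>

/* Round a 32 bit value to 16 bits: add 0x8000, then keep the top 16 bits. */
static int16_t round_to_16(int32_t v)
{
    int64_t r = (int64_t)v + 0x8000;   /* widen so the add cannot overflow */
    if (r > INT32_MAX)
        r = INT32_MAX;                 /* assumption: clamp instead of wrapping */
    return (int16_t)(r >> 16);         /* assumes arithmetic right shift */
}
```

For example, 0x12347FFF rounds down to 0x1234 and 0x12348000 rounds up to 0x1235.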

Fractional Multiply

Fractional multiply != integer multiply

  • When extracting a 16 bit fractional value from an accumulator, the high 16 bits are taken

  • Where in the destination register the value goes depends on which accumulator is being extracted from
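The extraction can be sketched in C. This is a simplified model, not Blackfin code: the function name is hypothetical, rounding and saturation are left out, and the placement rule (A1 to the high half, A0 to the low half of the destination) is taken from the sample instructions later in this deck, such as R3.H = (A1 += ...) and R3.L = (A0 += ...).

```c
#include <stdint.h>

/* Extract a 16 bit fractional value from a 40 bit accumulator (held in
   the low 40 bits of acc) into one half of destination register rd.
   acc_index 1 models A1 (result goes to Rd.H); 0 models A0 (Rd.L). */
static uint32_t extract_frac16(uint32_t rd, uint64_t acc, int acc_index)
{
    uint32_t half = (uint32_t)((acc >> 16) & 0xFFFFu);  /* bits 31..16 */
    if (acc_index == 1)
        return (rd & 0x0000FFFFu) | (half << 16);       /* A1 -> Rd.H */
    return (rd & 0xFFFF0000u) | half;                   /* A0 -> Rd.L */
}
```

The other half of the destination register is left unchanged in this sketch.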

Integer Multiply

Fractional multiply != integer multiply

  • When extracting a 16 bit integer value from an accumulator, the low 16 bits are taken

  • Where in the destination register the 16 bit value goes depends on which accumulator is being extracted from
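Why the low 16 bits carry an integer result can be seen from a small multiply-accumulate loop in C. This is an illustrative model, not Blackfin code: the function name is hypothetical and no saturation is modelled, so the returned value is only meaningful when the true sum fits in 16 bits.

```c
#include <stdint.h>

/* Integer multiply-accumulate over n pairs, then keep the low 16 bits
   of the accumulator, as in an integer extraction. */
static uint16_t int_mac_low16(const int16_t *x, const int16_t *y, int n)
{
    int64_t acc = 0;                               /* wide accumulator */
    for (int i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)y[i];      /* integer product, no shift */
    return (uint16_t)((uint64_t)acc & 0xFFFFu);    /* low 16 bits */
}
```

With x = {1, 2, 3} and y = {4, 5, 6} the accumulator holds 32, and the low 16 bits return that value directly.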

Multiply Sample Instructions

Multi-issue MAC Instruction Examples

Dual MAC with no extraction:

A1 += R1.H * R2.L, A0 += R1.L * R2.L;

16 bit extraction from ACC 1 (to R3.H) and from ACC 0 (to R3.L):

R3.H = (A1 += R1.H * R2.L), R3.L = (A0 += R1.L * R2.L);

Any combination of .H and .L in the 2 operands is allowed

32 bit extraction:

R3 = (A1 += R1.H * R2.L), R2 = (A0 += R1.L * R2.L);

Where destination registers must be paired as follows:

R[1,0], R[3,2], R[5,4] and R[7,6]

16 bit extraction from ACC 1 only:

R3.H = (A1 += R1.H * R2.L), A0 += R1.L * R2.L;
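The first example, A1 += R1.H * R2.L, A0 += R1.L * R2.L, can be modelled in C as below. This is a behavioural sketch under assumptions, not Blackfin code: the type and function names are hypothetical, the multiplies use the default signed fractional mode, the accumulators are clamped to the signed 40 bit range, and the cast of a 16 bit pattern to int16_t is assumed to wrap as it does on common compilers.

```c
#include <stdint.h>

/* Two 40 bit accumulators, held in int64_t for simplicity. */
typedef struct { int64_t a0, a1; } Accs;

/* Clamp a value to the signed 40 bit range (assumed saturation). */
static int64_t sat40(int64_t v)
{
    const int64_t max40 = 0x7FFFFFFFFFLL;
    const int64_t min40 = -max40 - 1;
    if (v > max40) return max40;
    if (v < min40) return min40;
    return v;
}

/* Signed fractional product: 16 x 16 multiply, then shift left 1 bit. */
static int32_t frac_product(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (p == 0x40000000) ? INT32_MAX : p * 2;
}

/* Model of: A1 += R1.H * R2.L, A0 += R1.L * R2.L; */
static void dual_mac(Accs *acc, uint32_t r1, uint32_t r2)
{
    int16_t r1h = (int16_t)(r1 >> 16);       /* R1.H */
    int16_t r1l = (int16_t)(r1 & 0xFFFFu);   /* R1.L */
    int16_t r2l = (int16_t)(r2 & 0xFFFFu);   /* R2.L */
    acc->a1 = sat40(acc->a1 + frac_product(r1h, r2l));
    acc->a0 = sat40(acc->a0 + frac_product(r1l, r2l));
}
```

Starting from cleared accumulators with R1 = 0x40002000 and R2 = 0x00004000, one call leaves A1 = 0x20000000 and A0 = 0x10000000. Both MACs read the same R2.L operand, which matches the shared operand in the assembly example.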

Shifter Sample Instructions

[Figure: example shift instructions. Labels: arithmetic shift; 3 operand register shift; 3 operand immediate shift; 2 operand register shifts; 2 operand immediate shifts.]

Parallel Instruction Examples

  • In general there are 16 and 32 bit versions of the arithmetic instructions

  • Most of the 32 bit instructions can be executed in parallel with 2 x 16 bit memory/index operations

  • Exceptions are DIVS, DIVQ and MULTIPLY with 32 bit operands

  • || means parallel

  • Examples:

    • A1 = R2.L * R1.L, A0 = R2.H * R1.H || R2.H = W[I2++] || [I3++] = R3;

    • R2 = R2 +|+ R4, R4 = R2 -|- R4 || I0 += M0 || R1 = [I0];
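The arithmetic half of the second example, R2 = R2 +|+ R4, R4 = R2 -|- R4, can be modelled in C. This is a sketch under assumptions, not Blackfin code: the function names are hypothetical, results wrap rather than saturate, and both results are assumed to be computed from the register values held before the instruction issues, which is why the sources are copied first.

```c
#include <stdint.h>

/* Pack two 16 bit halves into one 32 bit register value. */
static uint32_t pack16(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Model of: R2 = R2 +|+ R4, R4 = R2 -|- R4;
   Four 16 bit operations on the same pair of source registers. */
static void quad_add_sub(uint32_t *r2, uint32_t *r4)
{
    uint32_t a = *r2, b = *r4;   /* read both sources before any write */
    uint16_t sum_hi = (uint16_t)((a >> 16) + (b >> 16));
    uint16_t sum_lo = (uint16_t)((a & 0xFFFFu) + (b & 0xFFFFu));
    uint16_t dif_hi = (uint16_t)((a >> 16) - (b >> 16));
    uint16_t dif_lo = (uint16_t)((a & 0xFFFFu) - (b & 0xFFFFu));
    *r2 = pack16(sum_hi, sum_lo);
    *r4 = pack16(dif_hi, dif_lo);
}
```

With R2 = 0x00050003 and R4 = 0x00010002 the model leaves R2 = 0x00060005 and R4 = 0x00040001. The memory and index operations after the || are not modelled.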
