A comparison of DSP Architectures BlackFin ADSP-BFXXX Compute Unit

Based on an ENEL619.23 white paper prepared by Darrell Anklovitch

Blackfin Compute Unit

REV B

- Architecture Overview
- Register Map
- ALU features and sample instructions
- Multiplier features and sample instructions
- Shifter features and sample instructions

- ADSP-BF535 Blackfin Processor Hardware Reference, Rev 2, April 2004, Analog Devices. – Section 2
- Blackfin Processor Instruction Set Reference, Rev 2, May 2003, Analog Devices. – Sections 8 ~ 10, 14 & 15
- A number of the figures in this presentation are based on figures found in the ADSP-BF535 Blackfin Processor Hardware Reference.

ADSP-2106x Core Architecture

[Figure: ADSP-2106x core architecture block diagram. Recoverable labels: program sequencer, cache memory (32 x 48), DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24), timer, flags, JTAG test & emulation, PMA bus (24), DMA bus (32), PMD bus (48), DMD bus (40), bus connect, register file (16 x 40), floating & fixed-point multiplier with fixed-point accumulator, 32-bit barrel shifter, floating-point & fixed-point ALU.]

- Key issues
- 5 data paths FROM COMPUTE units
- 5 data paths TO COMPUTE units
- Highly parallel operations UNDER THE RIGHT CONDITIONS

Under the right conditions -- 4 memory accesses at same time

64 bit Instruction Fetch, 2x32 bit Data Loads, 32 bit Data Store

PLUS up to 2 ALU(32 bit) and 2 MAC(16 bit) operations at the same time

PLUS background DMA activity

[Figure: Blackfin compute unit block diagram: register file, 2 multipliers, 2 ALUs, 1 shifter, 1 set of video ALUs.]

Register file: 8 x 32 bit registers OR 16 x 16 bit registers, plus 2 x 40 bit accumulators

- DATA REGISTER SYNTAX:
  - R0, R1 etc. refer to 32 bit registers
  - R0.L refers to the low 16 bits of the 32 bit R0 register
  - R0.H refers to the high 16 bits of the R0 register
- ACCUMULATOR SYNTAX:
  - A0.L => low 16 bits
  - A0.H => next 16 bits
  - A0.W => least significant 32 bit word
  - A0.X => MS 8 bit extension
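
The register and accumulator field names above can be illustrated with a small Python model. This is only a sketch: the helper names (`reg_low`, `reg_high`, `acc_fields`) are made up for illustration and are not part of any Analog Devices tool.

```python
def reg_low(r):
    """R0.L: low 16 bits of a 32 bit data register value."""
    return r & 0xFFFF


def reg_high(r):
    """R0.H: high 16 bits of a 32 bit data register value."""
    return (r >> 16) & 0xFFFF


def acc_fields(a):
    """Split a 40 bit accumulator value into the fields named on the slide."""
    a &= (1 << 40) - 1           # keep only 40 bits
    return {
        "L": a & 0xFFFF,          # A0.L: low 16 bits
        "H": (a >> 16) & 0xFFFF,  # A0.H: next 16 bits
        "W": a & 0xFFFFFFFF,      # A0.W: least significant 32 bit word
        "X": (a >> 32) & 0xFF,    # A0.X: most significant 8 bit extension
    }


r0 = 0x12345678
a0 = 0xAB12345678
print(hex(reg_high(r0)), hex(reg_low(r0)))
print({name: hex(value) for name, value in acc_fields(a0).items()})
```

Note that A0.W overlaps A0.H and A0.L: it is the same 32 bits viewed as one word.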

SHARC: 16 32-bit data registers, integer and float. There is a pair of SHARC accumulator registers too.

2 x 32 bit paths to dual Multiplier/ALU units

2 x 32 bit paths back to register file

[Figure: ALU operations on source registers Rm and Rp with result register Rn. Can be: single 16 bit ops, dual 16 bit ops, dual 16 bit cross ops, single 32 bit ops.]

[Figure: ALU sample instructions for single 16 bit, dual 16 bit, quad 16 bit, single 32 bit and dual 32 bit ops. Annotations on the examples: "Does not work in parallel"; "Must have this option"; "Operator order is important: + must come before -".]

- A & B registers must stay on the same side of the ‘|’ for both instructions
- For dual and quad 16 bit operations the (CO) option causes the destination registers to cross
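
A dual 16 bit operation and the effect of the (CO) option can be modelled roughly as below. This is a sketch under stated assumptions: it assumes that in `Rn = A +|- B` the first operator applies to the high halves and the second to the low halves, that (CO) swaps the two 16 bit results in the destination, and that results wrap around (saturation is not modelled). The function names are hypothetical.

```python
def halves(r):
    """Return (high, low) 16 bit halves of a 32 bit register value."""
    return (r >> 16) & 0xFFFF, r & 0xFFFF


def pack(hi, lo):
    """Build a 32 bit register value from two 16 bit halves (wrap-around)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def add_sub_dual(a, b, cross=False):
    """Model Rn = A +|- B: add the high halves, subtract the low halves.

    With cross=True (the CO option) the two results swap places
    in the destination register.
    """
    a_hi, a_lo = halves(a)
    b_hi, b_lo = halves(b)
    hi = a_hi + b_hi
    lo = a_lo - b_lo
    if cross:
        hi, lo = lo, hi
    return pack(hi, lo)


a, b = 0x00050003, 0x00020001
print(hex(add_sub_dual(a, b)))              # high = 5 + 2, low = 3 - 1
print(hex(add_sub_dual(a, b, cross=True)))  # same results, halves swapped
```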

2 x 32 bit paths to dual Multiplier/ALU units

The multipliers share the same operand/result buses as the ALUs

2 x 40 bit accumulators

2 x 32 bit paths back to register file

[Figure: multiplier operand selection from the high (H) and low (L) 16 bit halves of the source registers.]

- Multiplies are signed fractional by default
- Signed fractional multiply result is automatically left shifted 1 bit.
- Signed fractional multiply != signed integer multiply
- Rounding available on fractional number multiplies and special option of integer number multiplies
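
The difference between the two multiply types comes down to the 1 bit left shift. The Python sketch below treats 16 bit operands as signed 1.15 fractions and the 32 bit product as a 1.31 fraction; the helper names are illustrative, and saturation (for example of -1.0 * -1.0) is not modelled.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mult_integer(a, b):
    """Signed integer multiply: plain 16 x 16 -> 32 bit product."""
    return to_signed16(a) * to_signed16(b)


def mult_fractional(a, b):
    """Signed fractional multiply: product left shifted by 1 bit."""
    return (to_signed16(a) * to_signed16(b)) << 1


def q31_to_float(p):
    """Read a 32 bit product as a signed 1.31 fraction."""
    return p / 2**31


a, b = 0x4000, 0x2000  # 0.5 and 0.25 as 1.15 fractions
print(hex(mult_integer(a, b)))                          # 0x8000000
print(hex(mult_fractional(a, b)))                       # 0x10000000
print(q31_to_float(mult_fractional(a, b)))              # 0.125
```

Without the shift, the product of two 1.15 numbers has two sign bits (2.30 format); the shift removes the redundant one so the result lines up as 1.31.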

[Figure: rounding of a 32 bit result, 2 cases: 0x8000 is added and the top 16 bits go to the destination register Rd.]

Rounding adds 0x8000 to the 32 bit multiplier result or accumulator value before extracting a 16 bit value to the destination register
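
The rounding step described above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch models only the "add 0x8000, keep the top 16 bits" behaviour stated on the slide; saturation and any other rounding modes of the processor are not modelled.

```python
def truncate_to_16(value32):
    """Keep the top 16 bits of a 32 bit value without rounding."""
    return (value32 >> 16) & 0xFFFF


def round_to_16(value32):
    """Add 0x8000, then keep the top 16 bits of the 32 bit value."""
    return ((value32 + 0x8000) >> 16) & 0xFFFF


for value in (0x12347FFF, 0x12348000):
    print(hex(value), hex(truncate_to_16(value)), hex(round_to_16(value)))
```

A value whose low half is 0x8000 or more rounds up to the next 16 bit value; anything below that rounds down, which is the same as truncation.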

Fractional Multiply != Integer Multiply

- When extracting a 16 bit fractional value from an accumulator, the high 16 bits are taken
- Where in the destination register it goes depends on which accumulator is being extracted from

Fractional Multiply != Integer Multiply

- When extracting a 16 bit integer value from an accumulator, the low 16 bits are taken.
- Where in the destination register the 16 bit value goes depends on which accumulator is being extracted from
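
The two extraction rules can be put side by side in a short Python sketch. It assumes, following the form `R3.H = (A1 += ...), R3.L = (A0 += ...)`, that a value extracted from A1 lands in the high half of the destination and a value from A0 in the low half. Rounding and saturation are left out, and the function names are illustrative only.

```python
def extract16(acc, fractional):
    """Pick the 16 bits taken from an accumulator value.

    Fractional extraction takes the high 16 bits of the 32 bit word,
    integer extraction takes the low 16 bits.
    """
    if fractional:
        return (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def write_half(rd, value16, from_a1):
    """Place a 16 bit value in Rd.H (from A1) or Rd.L (from A0)."""
    if from_a1:
        return (rd & 0x0000FFFF) | ((value16 & 0xFFFF) << 16)
    return (rd & 0xFFFF0000) | (value16 & 0xFFFF)


acc = 0x0012345678
print(hex(extract16(acc, fractional=True)))   # high 16 bits of the word
print(hex(extract16(acc, fractional=False)))  # low 16 bits of the word

rd = 0
rd = write_half(rd, extract16(acc, True), from_a1=True)
rd = write_half(rd, extract16(acc, False), from_a1=False)
print(hex(rd))
```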

Multi-issue MAC Instruction Examples

A1 += R1.H * R2.L, A0 += R1.L * R2.L;

16 bit extraction from ACC 1 (into R3.H) and from ACC 0 (into R3.L):

R3.H = (A1 += R1.H * R2.L), R3.L = (A0 += R1.L * R2.L);

Any combination of .H and .L in the 2 operands is allowed

32 bit extraction:

R3 = (A1 += R1.H * R2.L), R2 = (A0 += R1.L * R2.L);

Where destination registers must be paired as follows: R[1,0], R[3,2], R[5,4] and R[7,6]

R3.H = (A1 += R1.H * R2.L), A0 += R1.L * R2.L;
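
To make the dual MAC form concrete, the Python sketch below models `A1 += R1.H * R2.L, A0 += R1.L * R2.L` in the default signed fractional mode, followed by a 16 bit extraction of both accumulators into one register. It is a simplified model with hypothetical function names: accumulators wrap at 40 bits rather than saturate, and extraction uses the add-0x8000 rounding described earlier.

```python
def s16(x):
    """Interpret the low 16 bits of x as a signed number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def s40(x):
    """Wrap x to a signed 40 bit accumulator value (no saturation)."""
    x &= (1 << 40) - 1
    return x - (1 << 40) if x & (1 << 39) else x


def dual_mac(a1, a0, r1, r2):
    """Model A1 += R1.H * R2.L, A0 += R1.L * R2.L (signed fractional)."""
    r1_hi, r1_lo = s16(r1 >> 16), s16(r1)
    r2_lo = s16(r2)
    a1 = s40(a1 + ((r1_hi * r2_lo) << 1))
    a0 = s40(a0 + ((r1_lo * r2_lo) << 1))
    return a1, a0


def extract_pair(a1, a0):
    """Model Rd.H = A1, Rd.L = A0 with rounding before extraction."""
    hi = ((a1 + 0x8000) >> 16) & 0xFFFF
    lo = ((a0 + 0x8000) >> 16) & 0xFFFF
    return (hi << 16) | lo


a1, a0 = 0, 0
r1, r2 = 0x40002000, 0x00004000  # R1.H = 0.5, R1.L = 0.25, R2.L = 0.5
for _ in range(2):               # two MAC steps with the same operands
    a1, a0 = dual_mac(a1, a0, r1, r2)
print(hex(a1), hex(a0), hex(extract_pair(a1, a0)))
```

After two steps A1 holds 2 x 0.25 = 0.5 and A0 holds 2 x 0.125 = 0.25 in 1.31 format, so the extracted register has 0x4000 in its high half and 0x2000 in its low half.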

[Figure: arithmetic shift sample instructions: 3 op register shift, 3 op immediate shift, 2 operator register shifts, 2 operator immediate shifts.]

- In general there are 16 and 32 bit versions of the arithmetic instructions
- Most of the 32 bit instructions can be executed in parallel with 2 x 16 bit memory/index operations
- Exceptions are DIVS, DIVQ and MULTIPLY with 32 bit operands
- || means parallel
- Examples:
- A1 = R2.L * R1.L, A0 = R2.H * R1.H || R2.H = W[I2++] || [I3++] = R3;
- R2 = R2 +|+ R4, R4 = R2 -|- R4 || I0 += M0 || R1 = [I0];
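
The ALU part of the second example, `R2 = R2 +|+ R4, R4 = R2 -|- R4`, is worth a closer look because R2 appears as both a source and a destination. The Python sketch below assumes that both halves of the instruction read the register values from before the instruction executes, so the subtraction does not see the newly written R2. Results wrap at 16 bits and the function names are hypothetical.

```python
def halves(r):
    """Return (high, low) 16 bit halves of a 32 bit register value."""
    return (r >> 16) & 0xFFFF, r & 0xFFFF


def pack(hi, lo):
    """Build a 32 bit register value from two 16 bit halves (wrap-around)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def quad_add_sub(r2, r4):
    """Model R2 = R2 +|+ R4, R4 = R2 -|- R4 issued as one instruction.

    Both results are computed from the old R2 and R4 values.
    """
    r2_hi, r2_lo = halves(r2)
    r4_hi, r4_lo = halves(r4)
    new_r2 = pack(r2_hi + r4_hi, r2_lo + r4_lo)
    new_r4 = pack(r2_hi - r4_hi, r2_lo - r4_lo)
    return new_r2, new_r4


r2, r4 = 0x00050003, 0x00020001
r2, r4 = quad_add_sub(r2, r4)
print(hex(r2), hex(r4))  # sums in R2, differences in R4
```

Writing the two operations as separate sequential statements would instead subtract R4 from the already updated R2, which gives a different R4.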
