Architectural Analysis of a DSP Device, the Instruction Set and the Addressing Modes

SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications

Miodrag Bolic

Outline
  • FIR filter on ADSP-21xx

DSP Requirements

  • Fast Multiply-Accumulates (Data-path)
  • Extended Precision Accumulator Register (Data-path)
  • Dual Operand Fetch (Memory)
  • Circular Buffering (Addressing)
  • Zero-Overhead Looping (Instruction set)

Analog Devices Architectures and Programming

  • SHARC
  • Blackfin
  • Performance Optimization
ADSP-21xx

Copied from [Kester03]

CALCULATING OUTPUTS OF 4-TAP FIR FILTER USING A CIRCULAR BUFFER

Memory Location    0       1       2       3

Read               x(0)    x(1)    x(2)    x(3)
Write              x(4)
Read               x(4)    x(1)    x(2)    x(3)
Write                      x(5)
Read               x(4)    x(5)    x(2)    x(3)

y(3) = h(0) x(3) + h(1) x(2) + h(2) x(1) + h(3) x(0)

y(4) = h(0) x(4) + h(1) x(3) + h(2) x(2) + h(3) x(1)

y(5) = h(0) x(5) + h(1) x(4) + h(2) x(3) + h(3) x(2)

Copied from [Kester03]
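
The memory table and output equations above can be reproduced with a short sketch. This is an illustration only, not code from [Kester03]: the function name `make_fir`, the coefficient values, and the input values are all hypothetical, chosen so that each term of y(n) is easy to spot in the result.

```python
def make_fir(h):
    """Return (step, buf): a per-sample FIR routine and its circular buffer."""
    n = len(h)
    buf = [0] * n          # circular buffer holding the last n input samples
    state = {"w": 0}       # location that the next incoming sample overwrites

    def step(x):
        buf[state["w"]] = x                 # newest sample replaces the oldest
        acc = 0
        idx = state["w"]                    # start reading at the newest sample
        for k in range(n):
            acc += h[k] * buf[idx]          # h(k) * x(n - k)
            idx = (idx - 1) % n             # step back one sample, wrapping around
        state["w"] = (state["w"] + 1) % n   # advance the write pointer, wrapping
        return acc

    return step, buf


# Hypothetical values: h(0..3) = 1, 10, 100, 1000 and x(0..5) = 1..6.
step, buf = make_fir([1, 10, 100, 1000])
outputs = [step(x) for x in [1, 2, 3, 4, 5, 6]]
print(outputs[3:])   # y(3), y(4), y(5)
print(buf)           # buffer holds x(4), x(5), x(2), x(3), as in the last Read row
```

The modulo operations stand in for what the slide calls circular buffering; on the DSP the address generator performs the wrap, so no instructions are spent on it.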

FIR filter steps

1. Obtain a sample with the ADC; generate an interrupt

2. Detect and manage the interrupt

3. Move the sample into the input signal's circular buffer

4. Update the pointer for the input signal's circular buffer

5. Zero the accumulator

6. Control the loop through each of the coefficients

7. Fetch the coefficient from the coefficient's circular buffer

8. Update the pointer for the coefficient's circular buffer

9. Fetch the sample from the input signal's circular buffer

10. Update the pointer for the input signal's circular buffer

11. Multiply the coefficient by the sample

12. Add the product to the accumulator

13. Move the output sample (accumulator) to a holding buffer

14. Move the output sample from the holding buffer to the DAC

Copied from [Kester03]

FIR filter steps (cont.)

ADSP-21xx example code:

CNTR = N-1;
DO convolution UNTIL CE;
convolution:
MR = MR + MX0 * MY0 (SS), MX0 = DM(I0,M1), MY0 = PM(I4,M5);

The multiply-accumulate, together with its two parallel operand fetches, is a single-cycle instruction.

Copied from [Kester03]
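
A rough software model of the loop above may help show why the counter is N-1 rather than N. It rests on assumptions the slide does not show: that MX0 and MY0 are preloaded with the first operand pair before the loop, that one more multiply-accumulate follows the loop, and that the modify registers M1 and M5 both hold 1. The function name `fir_mac_loop` is hypothetical.

```python
def fir_mac_loop(samples, coeffs):
    """Model one pass of the MAC loop over N samples and N coefficients."""
    n = len(coeffs)
    i0 = i4 = 0                          # stand-ins for index registers I0 and I4

    # Assumed preload (not on the slide): first sample and first coefficient.
    mx0, my0 = samples[i0], coeffs[i4]
    i0, i4 = (i0 + 1) % n, (i4 + 1) % n  # post-modify with wraparound

    mr = 0                               # accumulator MR, cleared before the loop
    for _ in range(n - 1):               # CNTR = N-1
        mr += mx0 * my0                  # MR = MR + MX0 * MY0
        mx0, my0 = samples[i0], coeffs[i4]   # the two fetches issued in parallel
        i0, i4 = (i0 + 1) % n, (i4 + 1) % n

    mr += mx0 * my0                      # assumed final MAC after the loop
    return mr


print(fir_mac_loop([1, 2, 3, 4], [1, 10, 100, 1000]))
```

In this model each loop iteration multiplies the operands fetched by the previous one, so N products need only N-1 iterations plus one trailing multiply-accumulate.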

Motorola DSP5600X

Copied from [Takala05]

ADSP-21xx MAC

www.analog.com/dsp

SHARC Architecture ADSP-2106X

Copied from [Takala05]

Hardware loops
  • Software loop:

      MOVE #16,B           ; Initialize loop counter B
LOOP: MAC (R0)+,(R4)+,A    ; Register-indirect addressing with post-increment
      DEC B
      JNE LOOP

  • Hardware loops: no time is spent on
    • Decrementing counters
    • Checking to see if the loop is finished
    • Branching back to the top of the loop

RPT #16

MAC (R0)+,(R4)+,A

[Lapsley97]
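
The saving can be put in numbers by counting executed instructions for the two listings above. This is a simplified sketch: it counts instructions, not cycles, and real cycle counts depend on the processor (for example, on branch penalties). Both function names are hypothetical.

```python
def software_loop_instructions(n):
    """MOVE once, then MAC + DEC + JNE on each of the n passes."""
    return 1 + 3 * n


def hardware_loop_instructions(n):
    """RPT once, then a single MAC on each of the n passes."""
    return 1 + n


n = 16  # loop count used in both listings
print(software_loop_instructions(n), hardware_loop_instructions(n))
```

Under this count, two of every three instructions in the software loop are bookkeeping, which is the overhead the hardware loop removes.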

ADI General Purpose DSP Product Families

  • TigerSHARC: High-Performance, $35 - $200
    • Up to 4800 MMACS (16-bit) or 1200 MMACS (32-bit)
    • 2.5G/3G Infrastructure
    • Medical Imaging
    • Industrial Imaging
    • Multiprocessing
  • SHARC: Low-Cost Floating Point, $10 - $100
    • Up to 600 MMACS (32-bit)
    • Audio
    • Infotainment
    • Industrial
  • Blackfin: Media Enabled, $5 - $30
    • Up to 3000 MMACS
    • Image compression
    • Digital Still/Video Camera
    • MMOIP
    • Telematics
    • Biometrics
  • ADSP-218x/9x: Power Efficient, $5 - $10
    • Up to 160 MMACS
    • Wired Voice
    • Wireless Voice
    • VOIP/VON
    • Industrial Control

[Chart axis: Performance]

www.analog.com/dsp

SHARC Architecture

Copied from [Smith97]

SHARC Architecture - Features
  • The Super Harvard ARChitecture
  • 100 MHz Core / 300 MFLOPS Peak
  • Parallel Operation of: Multiplier, ALU, 2 Address Generators & Sequencer
    • No Arithmetic Pipeline; All Computations Are Single-Cycle
  • High Precision and Extended Dynamic Range
    • 32/40-Bit IEEE Floating-Point Math
    • 32-Bit Fixed-Point MACs with 64-Bit Product & 80-Bit Accumulation
  • Single-Cycle Transfers with Dual-Ported Memory Structures
    • Supported by Cache Memory and Enhanced Harvard Architecture
  • Glueless Multiprocessing Features
  • JTAG Test and Emulation Port
  • DMA Controller, Serial Ports, Link Ports, External Bus, SDRAM Controller, Timers

www.analog.com/dsp
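
The point of accumulating 64-bit products in an 80-bit register can be illustrated with a small sketch. It models the accumulator as a two's-complement field that wraps on overflow, which is an assumption made for illustration (the actual overflow behavior of the device is not described on the slide). The names `wrap_signed` and `accumulate` are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a two's-complement field of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def accumulate(products, acc_bits):
    """Sum the products in an accumulator that is acc_bits wide."""
    acc = 0
    for p in products:
        acc = wrap_signed(acc + p, acc_bits)
    return acc


# Largest positive product of two signed 32-bit operands, repeated four times.
products = [(2**31 - 1) ** 2] * 4
exact = sum(products)

print(accumulate(products, 80) == exact)  # the 80-bit accumulator holds the sum
print(accumulate(products, 64) == exact)  # a 64-bit accumulator has wrapped
```

The 16 bits between the product width and the accumulator width act as headroom, so long sums of products can be formed before any scaling or saturation is needed.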

ADSP-2106x Core Architecture

[Block diagram. Recoverable labels: cache memory (32 x 48), JTAG test & emulation, flags, timer, program sequencer, DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24), PMA bus (24), DMA bus (32), PMD bus (48), DMD bus (40), bus connect, register file (16 x 40), floating- & fixed-point multiplier with fixed-point accumulator, 32-bit barrel shifter, floating-point & fixed-point ALU.]

www.analog.com/dsp

Example - Dot product - C code

Copied from [Smith97]

Example - Dot product - Assembly

Copied from [Smith97]

Example - Dot product - Assembly (cont.)

Copied from [Smith97]
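
The listings from [Smith97] did not survive in this transcript. As a stand-in, the computation the three slides refer to can be sketched as follows. This is not the code from [Smith97]; the function name `dot_product` and the sample vectors are hypothetical.

```python
def dot_product(x, y):
    """Sum of element-wise products of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    acc = 0.0                  # accumulator, cleared before the loop
    for a, b in zip(x, y):
        acc += a * b           # one multiply-accumulate per element pair
    return acc


print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Each pass of the loop is one multiply-accumulate with two operand fetches, which is the operation the DSP requirements in the outline are meant to speed up.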

C or Assembly
  • How complicated is the program?
  • Are you pushing the maximum speed of the DSP?
  • How many programmers will be working together?
  • Which is more important, product cost or development cost?
  • What is your background?
  • What does the DSP's manufacturer suggest you use?

Copied from [Smith97]

BLACKfin Processor Core

[Block diagram. Recoverable labels: Address Arithmetic Unit (SP, FP, P0-P5; DAG0 and DAG1 with I0-I3, M0-M3, B0-B3, L0-L3), Sequencer, Data Arithmetic Unit (R0-R7, Barrel Shifter, Acc0, Acc1; datapath widths marked 16, 8 and 40).]

Two 16-bit Multipliers

Two 40-bit ALUs, Four 8-bit Video ALUs

Barrel Shifter

Sixteen 16-bit /Eight 32-bit Math Registers

Two DAGs, byte addressing

Eight 32-bit pointer registers

Four Sets of 32-bit Index, Modify, Length, Base

16-bit Instructions, 32-bit Instructions

Multi-Issue, 64-bit Instructions

Interlocked Pipeline

Micro Signal Architecture, developed with Intel

www.analog.com/dsp
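
The four sets of Index, Modify, Length and Base registers listed above support circular buffering in the address generators. The update they perform can be sketched roughly as below. This is a simplified model under stated assumptions: the modifier is no larger in magnitude than the buffer length, and a length of zero is treated as plain linear addressing. The function name `post_modify` and the addresses are hypothetical.

```python
def post_modify(index, modify, base, length):
    """Return the index value after a circular post-modify step."""
    if length == 0:
        return index + modify        # assumed: zero length means linear addressing
    index += modify
    if index >= base + length:       # ran past the end of the buffer
        index -= length
    elif index < base:               # ran before the start of the buffer
        index += length
    return index


# Hypothetical 16-byte buffer at 0x1000, stepped in 4-byte words.
addr = 0x1000
trace = []
for _ in range(5):
    trace.append(hex(addr))
    addr = post_modify(addr, 4, 0x1000, 16)
print(trace)
```

Because the wrap is a compare and a single add or subtract, it can be done by the address generator alongside the data access instead of by extra instructions in the loop body.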

ADSP-BF535 BLACKfin Processor Architecture

Great Performance Value

  • Highest Frequency (350 MHz)
  • 1.0V to 1.6V
  • 260 PBGA

High System Integration

  • Address range 768Mbytes
  • SPORTs support 8 Channels of I2S Audio
  • (532Mbps) I/O Bandwidth, DMA Bandwidth & Memory Bandwidth
  • Microcontroller features include WDT, PCI, USB 1.1, SDRAM controller

[Block diagram. Recoverable labels: BLACKfin Processor Core (to 350 MHz); User Peripherals: USB 1.1, SPORTs (2), SPI (2), UART (2), Timers (3, 32-bit), GPIO (16), PCI; System Peripherals: Dynamic Power Management, PLL, Watchdog, JTAG, Real Time Clock; Memory: 308 Kbytes On-Chip SRAM, 48 Kbytes On-Chip Cache, 264 Kbytes On-Chip SRAM; Interfaces: FLASH/SRAM, DMA, SDRAM.]

www.analog.com/dsp