Chapter 3 Parallel and Pipelined Processing

Basic Ideas

[Figure: two processor-versus-time charts for processors P1 to P4, one for parallel processing and one for pipelined processing. The cells are labeled a1 to d4. Colors: different types of operations performed. a, b, c, d: different data streams processed.]

Parallel processing
  • Less inter-processor communication
  • Complicated processor hardware

Pipelined processing
  • More inter-processor communication
  • Simpler processor hardware
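The trade-off can be sketched in a few lines of Python. This is only one reading of the slide, not the author's code: the four operations in `OPS`, the function names, and the `transfers` counter are all hypothetical, and the loops run sequentially rather than on real processors. In the parallel version each processor applies every operation to its own stream, so it needs hardware for all operation types but never exchanges data. In the pipelined version each processor applies a single operation and hands the result to the next one.

```python
# Four hypothetical operation types (the "colors"), applied in this order.
OPS = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]


def run_parallel(streams):
    """Processor i applies every operation to stream i; nothing is exchanged."""
    results, transfers = [], 0
    for value in streams:          # one processor per data stream
        for op in OPS:             # each processor must support all operations
            value = op(value)
        results.append(value)
    return results, transfers


def run_pipelined(streams):
    """Processor j applies only operation j, then passes the value to processor j+1."""
    results, transfers = [], 0
    for value in streams:          # streams enter the pipeline one after another
        for stage, op in enumerate(OPS):
            value = op(value)      # each processor supports a single operation
            if stage < len(OPS) - 1:
                transfers += 1     # hand-off to the next processor
        results.append(value)
    return results, transfers


print(run_parallel([1, 2, 3, 4]))   # ([1, 9, 25, 49], 0)
print(run_pipelined([1, 2, 3, 4]))  # ([1, 9, 25, 49], 12)
```

Both versions compute the same results; only the number of inter-processor hand-offs and the set of operations each processor must support differ.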

(C) 1997-2006 by Yu Hen Hu

Data Dependence

Parallel processing requires no data dependence between processors.

Pipelined processing involves inter-processor communication.

[Figure: processor-versus-time charts for P1 to P4 under parallel and pipelined processing.]


Usage of Pipelined Processing

Inserting latches or registers between combinational logic circuits shortens the critical path.

Consequence:
  • Reduced clock cycle time
  • Increased clock frequency

Pipelining suits DSP applications that process very long (in principle infinite) data streams.

Method to incorporate pipelining: cut-set retiming

Cut set: a set of edges of a graph such that removing these edges from the original graph splits it into two separate graphs.

Retiming: the timing of an algorithm is readjusted while the partial ordering of execution is kept unchanged, so that the results remain correct.
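As a minimal sketch of these two ideas, the Python below checks the cut-set definition on a small graph and shows how a register placed on the cut shortens the longest register-to-register delay. The three-block chain, its delays (10, 2, 2 time units), and the helper names are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict


def components(nodes, edges):
    """Count connected components, treating edges as undirected."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adj[node] - seen)
    return count


def is_cut_set(nodes, edges, cut):
    """True if removing `cut` splits a connected graph into exactly two graphs."""
    remaining = [e for e in edges if e not in cut]
    return components(nodes, edges) == 1 and components(nodes, remaining) == 2


def clock_period(delays, registers_after):
    """Longest register-to-register delay in a linear chain of logic blocks.

    registers_after: indices of blocks that are followed by a pipeline register.
    """
    worst = current = 0
    for i, d in enumerate(delays):
        current += d
        worst = max(worst, current)
        if i in registers_after:
            current = 0            # a register starts a new combinational segment
    return worst


# Hypothetical datapath: one multiplier followed by two adders.
nodes = ["mul", "add1", "add2"]
edges = [("mul", "add1"), ("add1", "add2")]
delays = [10, 2, 2]

print(is_cut_set(nodes, edges, {("mul", "add1")}))  # True
print(clock_period(delays, set()))                  # 14: no pipeline register
print(clock_period(delays, {0}))                    # 10: register on the cut
```

The sketch only models timing. In a real design the register on the cut also adds one clock cycle of latency, which is acceptable for long data streams.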


Graphic Transpose Theorem

[Figure: signal flow graph of a three-tap FIR filter with input x[n], two z^-1 delays, coefficient multipliers h[0], h[1], h[2], and output y[n].]
  • The transfer function of a signal flow graph remains unchanged if
    • the direction of each arc is reversed, and
    • the input and output labels are switched.
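The theorem can be checked numerically for a small FIR filter. The sketch below is an illustration rather than the author's code: the function names and the test coefficients are assumptions. It implements the direct form (delay line on the input) and the transposed form (input broadcast to all multipliers, delays between the adders) and compares their outputs.

```python
def fir_direct(x, h):
    """Direct form: delay line on the input, then a sum of products."""
    taps = [0] * len(h)                  # holds x[n], x[n-1], ...
    y = []
    for sample in x:
        taps = [sample] + taps[:-1]      # shift the delay line
        y.append(sum(c * t for c, t in zip(h, taps)))
    return y


def fir_transposed(x, h):
    """Transposed form: input broadcast to every multiplier, delays between adders."""
    state = [0] * (len(h) - 1)           # register contents between the adders
    y = []
    for sample in x:
        products = [c * sample for c in h]
        y.append(products[0] + (state[0] if state else 0))
        # Each register stores its own product plus the next register's old value.
        state = [
            products[i + 1] + (state[i + 1] if i + 1 < len(state) else 0)
            for i in range(len(state))
        ]
    return y


h = [1, 2, 3]                            # hypothetical h[0], h[1], h[2]
x = [1, 0, 0, 2]
print(fir_direct(x, h))                  # [1, 2, 3, 2]
print(fir_transposed(x, h))              # [1, 2, 3, 2]
```

Both structures realize y[n] = h[0]x[n] + h[1]x[n-1] + h[2]x[n-2]; only the position of the delays differs.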

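[Figure: the transposed signal flow graph, with two z^-1 delays, coefficient multipliers h[0], h[1], h[2], and the signal labels x[n], u[n], and y[n].]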


Data Broadcast Structure

An algorithm transformation may lead to a pipelined structure without adding delays.

Given an FIR filter SFG:
  • Critical path: T_M + 2T_A (T_M: multiplication time, T_A: addition time)

Use the graph transposition theorem:
  • Reverse all arcs
  • Swap input and output

We obtain the data broadcast structure:
  • Critical path: T_M + T_A
  • No additional delay added!
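The two critical paths can be reproduced with a small longest-path computation. The sketch below is an assumption-laden illustration: the node names, the edge lists for a three-tap filter, and the values T_M = 10 and T_A = 2 are hypothetical, and the helper assumes the graph has no cycle made only of zero-delay edges.

```python
from functools import lru_cache


def critical_path(node_delay, edges):
    """Longest total computation time along any path that crosses no delay element.

    node_delay: {node: computation time}
    edges: (src, dst, num_delays); only zero-delay edges chain combinationally.
    Assumes there is no cycle consisting solely of zero-delay edges.
    """
    succ = {n: [] for n in node_delay}
    for src, dst, num_delays in edges:
        if num_delays == 0:
            succ[src].append(dst)

    @lru_cache(maxsize=None)
    def longest_from(node):
        return node_delay[node] + max((longest_from(n) for n in succ[node]), default=0)

    return max(longest_from(n) for n in node_delay)


T_M, T_A = 10, 2  # hypothetical multiplier and adder times

# Direct form: delays on the input line, adders chained without delays.
direct_nodes = {"x": 0, "m0": T_M, "m1": T_M, "m2": T_M, "a1": T_A, "a2": T_A}
direct_edges = [
    ("x", "m0", 0), ("x", "m1", 1), ("x", "m2", 2),
    ("m0", "a1", 0), ("m1", "a1", 0), ("a1", "a2", 0), ("m2", "a2", 0),
]

# Data broadcast (transposed) form: input feeds all multipliers, delays between adders.
bcast_nodes = {"x": 0, "m0": T_M, "m1": T_M, "m2": T_M, "a0": T_A, "a1": T_A}
bcast_edges = [
    ("x", "m0", 0), ("x", "m1", 0), ("x", "m2", 0),
    ("m2", "a1", 1), ("m1", "a1", 0), ("a1", "a0", 1), ("m0", "a0", 0),
]

print(critical_path(direct_nodes, direct_edges))  # 14 = T_M + 2*T_A
print(critical_path(bcast_nodes, bcast_edges))    # 12 = T_M + T_A
```

Both graphs contain the same two delay elements; the transposition only moves them onto the adder chain, which is why the path shortens without extra delays.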


Fine-Grain Pipelining

To shorten the critical path below T_M, the multiplier itself is split into two pipelined stages with delays T_M1 and T_M2.

Critical path = max{T_M1, T_M2, T_A}
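A short worked example may make the gain concrete. The numbers are assumptions chosen for illustration and are not taken from the slides: T_M = 10, T_A = 2, with the multiplier split into stages of T_M1 = 6 and T_M2 = 4 time units.

```latex
\begin{align*}
\text{Direct-form FIR:}        \quad & T_M + 2T_A = 10 + 2 \cdot 2 = 14 \\
\text{Data broadcast:}         \quad & T_M + T_A = 10 + 2 = 12 \\
\text{Fine-grain pipelining:}  \quad & \max\{T_{M1},\, T_{M2},\, T_A\} = \max\{6,\, 4,\, 2\} = 6
\end{align*}
```

Under these assumed values the clock period is set by the slower multiplier stage, so a more balanced split (for example 5 and 5) would shorten it further.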


Block Processing

One form of vectorized parallel processing of DSP algorithms (not parallel processing in the most general sense).

Block vector: [x(3k) x(3k+1) x(3k+2)]

Clock cycle: can be 3 times longer than the sampling period, because each clock cycle processes a block of three samples.

Derivation for an FIR filter:
  • Start from the original FIR filter equation.
  • Rewrite 3 equations at a time.
  • Define the block vector.
  • Obtain the block formulation.
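The block idea can be sketched in Python as follows. The slide's equations did not survive in the transcript, so the filter here is an assumption: a generic FIR filter y(n) = sum over i of h[i]·x(n-i) with zero initial conditions, and hypothetical function names. Each loop iteration stands for one slow clock cycle that consumes the block vector [x(3k) x(3k+1) x(3k+2)] and produces three outputs that do not depend on each other.

```python
def fir_sample(x, h):
    """Sample-by-sample reference: y(n) = sum_i h[i] * x(n - i), x(n) = 0 for n < 0."""
    return [
        sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
        for n in range(len(x))
    ]


def fir_block(x, h, block=3):
    """Process `block` input samples per iteration (one slow clock cycle each)."""
    assert len(x) % block == 0, "input length must be a multiple of the block size"
    history = [0] * (len(h) - 1)         # most recent past samples, oldest first
    y = []
    for k in range(len(x) // block):
        xk = x[block * k: block * (k + 1)]   # block vector for clock cycle k
        window = history + xk
        # The outputs of one block are mutually independent, so hardware
        # could compute them in parallel.
        yk = [
            sum(h[i] * window[len(history) + j - i] for i in range(len(h)))
            for j in range(block)
        ]
        y.extend(yk)
        history = window[len(window) - (len(h) - 1):]
    return y


x = [1, 2, 3, 4, 5, 6]
h = [1, 2, 3]                            # hypothetical coefficients
print(fir_sample(x, h))                  # [1, 4, 10, 16, 22, 28]
print(fir_block(x, h))                   # [1, 4, 10, 16, 22, 28]
```

The block version produces the same samples as the reference, but needs only one iteration for every three input samples.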


General approach for block processing


Block Processing for IIR Digital Filter

Derivation:
  • Start from the original formulation.
  • Rewrite it for the samples of one block.
  • Define block vectors.
  • Obtain the block formulation.

Time indices:
  • n: sampling period
  • k: clock period (processor)
  • k = 2n: one clock period spans two sampling periods

Note:
  • Pipelining: clock period = sampling period.
  • Block (parallel) processing: clock period ≠ sampling period.
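The slide's equations are not preserved in the transcript, so the following Python sketch rests on an assumption: a first-order IIR filter y(n) = a·y(n-1) + x(n) with zero initial conditions, processed in blocks of two samples. Substituting the recursion into itself gives y(2k) = a²·y(2k-2) + a·x(2k-1) + x(2k) and y(2k+1) = a²·y(2k-1) + a·x(2k) + x(2k+1), so each output depends only on its own value one clock period earlier. All names are hypothetical.

```python
def iir_sample(x, a):
    """Sample-by-sample reference: y(n) = a * y(n-1) + x(n), with y(-1) = 0."""
    y, prev = [], 0.0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y


def iir_block2(x, a):
    """Block size 2: one iteration (clock period k) yields y(2k) and y(2k+1)."""
    assert len(x) % 2 == 0, "input length must be even"
    a2 = a * a
    y_even_prev = 0.0   # y(2k-2)
    y_odd_prev = 0.0    # y(2k-1)
    x_odd_prev = 0.0    # x(2k-1)
    y = []
    for k in range(len(x) // 2):
        x0, x1 = x[2 * k], x[2 * k + 1]
        y0 = a2 * y_even_prev + a * x_odd_prev + x0   # y(2k)
        y1 = a2 * y_odd_prev + a * x0 + x1            # y(2k+1)
        y.extend([y0, y1])
        y_even_prev, y_odd_prev, x_odd_prev = y0, y1, x1
    return y


x = [1, 2, 3, 4, 5, 6, 7, 8]
a = 0.5                                   # hypothetical feedback coefficient
reference = iir_sample(x, a)
blocked = iir_block2(x, a)
print(all(abs(p - q) < 1e-12 for p, q in zip(reference, blocked)))  # True
```

The two outputs of one block do not depend on each other, which is what allows two parallel units to run at half the sampling rate.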


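Block IIR Filter

[Figure: block IIR filter with block size 2. A serial-to-parallel converter (S/P) splits x(n) into x(2k) and x(2k+1). One adder produces y(2k), which is fed back through a delay D as y(2(k-1)); a second adder produces y(2k+1), which is fed back through a delay D as y(2(k-1)+1). A parallel-to-serial converter (P/S) merges the two outputs into y(n).]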


Timing Comparison

  • Pipelining
  • Block processing

[Figure: timing diagrams comparing pipelining and block processing. The charts show the input samples x(n), the clock cycles of the MAC, Add, and Mul units, and the output samples y(n). In one chart the input is split into even samples x(2), x(4), x(6), x(8) and odd samples x(1), x(3), x(5), x(7), each held for two clock cycles.]
