CSE 8383 - Advanced Computer Architecture

1 / 40

# CSE 8383 - Advanced Computer Architecture - PowerPoint PPT Presentation

CSE 8383 - Advanced Computer Architecture. Week-3 Week of Jan 26, 2004 engr.smu.edu/~rewini/8383. Contents. Linear Pipelines Nonlinear pipelines Instruction Pipelines Arithmetic Operations Design of Multifunction Pipeline. Linear Pipeline. Processing Stages are linearly connected

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'CSE 8383 - Advanced Computer Architecture' - omer

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### CSE 8383 - Advanced Computer Architecture

Week-3

Week of Jan 26, 2004

engr.smu.edu/~rewini/8383

Contents
• Linear Pipelines
• Nonlinear pipelines
• Instruction Pipelines
• Arithmetic Operations
• Design of Multifunction Pipeline
Linear Pipeline
• Processing Stages are linearly connected
• Perform fixed function
• Synchronous Pipeline
• Clocked latches between Stage i and Stage i+1
• Equal delays in all stages
• Asynchronous Pipeline (Handshaking)
Latches

S1

S2

S3

L1

L2

Slowest stage determines delay

Equal delays  clock period

Reservation Table

Time

S1

S2

S3

S4

Non Linear Pipelines
• Variable functions
• Feed-Forward
• Feedback
Linear Instruction Pipelines
• Assume the following instruction execution phases:
• Fetch (F)
• Decode (D)
• Operand Fetch (O)
• Execute (E)
• Write results (W)
Dependencies
• Data Dependency

• Instruction Dependency

(Branching)

Will that Cause a Problem?

Data Dependency

I1 -- Add R1, R2, R3

I2 -- Sub R4, R1, R5

1

2

3

4

5

6

F

D

O

E

W

Solutions
• STALL
• Forwarding
• Write and Read in one cycle
• ….
Instruction Dependency

I1 – Branch o

I2 –

1

2

3

4

5

6

F

D

O

E

W

Solutions
• STALL
• Predict Branch taken
• Predict Branch not taken
• ….
Floating Point Multiplication
• Inputs (Mantissa1, Exponenet1), (Mantissa2, Exponent2)
• Add the two exponents  Exponent-out
• Multiple the 2 mantissas
• Normalize mantissa and adjust exponent
• Round the product mantissa to a single length mantissa. You may adjust the exponent
Linear Pipeline for floating-point multiplication

Round

Normalize

Exponents

Multiply

Mantissa

Round

Normalize

Accumulator

Partial

Products

Exponents

Re

normalize

Partial

Shift

Find

Mantissa

Partial

Shift

Subtract

Exponents

Round

Re

normalize

B

Partial

Products

G

C

H

F

A

Partial

Shift

Find

Mantissa

Partial

Shift

Exponents

Subtract

Round

Re

normalize

E

D

Nonlinear Pipeline Design
• Latency

The number of clock cycles between two initiations of a pipeline

• Collision

Resource Conflict

• Forbidden Latencies

Latencies that cause collisions

Nonlinear Pipeline Design cont
• Latency Sequence

A sequence of permissible latencies between successive task initiations

• Latency Cycle

A sequence that repeats the same subsequence

• Collision vector

C = (Cm, Cm-1, …, C2, C1), m <= n-1

n = number of column in reservation table

Ci = 1 if latency i causes collision, 0 otherwise

Collision Vector for Multiply after Multiply

Forbidden Latencies:1, 2

Collision vector

0 0 0 0 1 1  11

Maximum forbidden latency = 2  m = 2

Example

Y

X

S1

S2

S3

Forbidden Latencies
• X after X
• X after Y
• Y after X
• Y after Y
X after X

2

S1

S2

S3

5

S1

S2

S3

X after X

4

S1

S2

S3

7

S1

S2

S3

Collision Vector
• Forbidden Latencies: 2, 4, 5, 7
• Collision Vector =

1 0 1 1 0 1 0

Y after Y

S1

S2

S3

S1

S2

S3

Collision Vector
• Forbidden Latencies: 2, 4
• Collision Vector =

1 0 1 0

State Diagram for X

8+

1 0 1 1 0 1 0

8+

3

8+

6

1*

1 0 1 1 0 1 1

1 1 1 1 1 1 1

3*

6

Cycles
• Simple cycles each state appears only once

(3), (6), (8), (1, 8), (3, 8), and (6,8)

• Greedy Cycles  simple cycles whose edges are all made with minimum latencies from their respective starting states

(1,8), (3)  one of them is MAL