Cpe 242 computer architecture and engineering designing a multiple cycle processor
Download
1 / 44

CpE 242 Computer Architecture and Engineering Designing a Multiple Cycle Processor - PowerPoint PPT Presentation


  • 97 Views
  • Uploaded on

CpE 242 Computer Architecture and Engineering Designing a Multiple Cycle Processor. Outline of Today’s Lecture. Recap and Introduction (5 minutes) Introduction to the Concept of Multiple Cycle Processor (15 minutes) Multiple Cycle Implementation of R-type Instructions (15 minutes)

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' CpE 242 Computer Architecture and Engineering Designing a Multiple Cycle Processor' - lyn


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Cpe 242 computer architecture and engineering designing a multiple cycle processor

CpE 242Computer Architecture and EngineeringDesigning a Multiple Cycle Processor


Outline of today s lecture
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


A single cycle processor
A Single Cycle Processor

ALUop

ALU

Control

ALUctr

3

RegDst

func

op

3

Main

Control

Instr<5:0>

6

ALUSrc

6

:

Instr<31:26>

Instruction<31:0>

Branch

Instruction

Fetch Unit

Jump

Rd

Rt

<21:25>

<16:20>

<11:15>

<0:15>

Clk

RegDst

1

0

Mux

Rt

Rs

Rd

Imm16

Rs

Rt

RegWr

ALUctr

5

5

5

MemtoReg

busA

Zero

MemWr

Rw

Ra

Rb

busW

32

32 32-bit

Registers

0

ALU

32

busB

32

0

Clk

Mux

32

Mux

32

1

WrEn

Adr

1

Data In

32

Data

Memory

Extender

imm16

32

16

Instr<15:0>

Clk

ALUSrc

ExtOp


Instruction fetch unit

1

Mux

PC

0

0

Adder

Mux

1

Adder

Clk

Instruction Fetch Unit

30

Addr<31:2>

30

PC<31:28>

Addr<1:0>

“00”

4

Target

Instruction

Memory

30

Instruction<25:0>

26

30

32

30

“1”

Jump

Instruction<31:0>

30

SignExt

30

imm16

16

Instruction<15:0>

Branch

Zero


The main control

.

.

.

.

.

.

op<5>

op<5>

op<5>

op<5>

op<5>

op<5>

.

.

.

.

.

.

<0>

<0>

<0>

<0>

<0>

op<0>

R-type

ori

lw

sw

beq

jump

The Main Control

RegWrite

ALUSrc

RegDst

MemtoReg

MemWrite

Branch

Jump

ExtOp

ALUop<2>

ALUop<1>

ALUop<0>


Drawbacks of this single cycle processor
Drawbacks of this Single Cycle Processor

  • Long cycle time:

    • Cycle time must be long enough for the load instruction:

      • PC’s Clock -to-Q +

      • Instruction Memory Access Time +

      • Register File Access Time +

      • ALU Delay (address calculation) +

      • Data Memory Access Time +

      • Register File Setup Time +

      • Clock Skew

  • Cycle time is much longer than needed for all other instructions. Examples:

    • R-type instructions do not require data memory access

    • Jump does not require ALU operation nor data memory access


Outline of today s lecture1
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


Overview of a multiple cycle implementation
Overview of a Multiple Cycle Implementation

  • The root of the single cycle processor’s problems:

    • The cycle time has to be long enough for the slowest instruction

  • Solution:

    • Break the instruction into smaller steps

    • Execute each step (instead of the entire instruction) in one cycle

      • Cycle time: time it takes to execute the longest step

      • Keep all the steps to have similar length

    • This is the essence of the multiple cycle processor


Overview of a multiple cycle implementation1
Overview of a Multiple Cycle Implementation

  • The advantages of the multiple cycle processor:

    • Cycle time is much shorter

    • Different instructions take different number of cycles to complete

      • Load takes five cycles

      • Jump only takes three cycles

    • Allows a functional unit to be used more than once per instruction


The five steps of a load instruction
The Five Steps of a Load Instruction

Instruction Fetch

Instr Decode /

Address

Data Memory

Reg Wr

Reg. Fetch

Clk

Clk-to-Q

Old Value

New Value

PC

Instruction Memory Access Time

Rs, Rt, Rd,

Op, Func

Old Value

New Value

Delay through Control Logic

ALUctr

Old Value

New Value

ExtOp

Old Value

New Value

ALUSrc

Old Value

New Value

RegWr

Old Value

New Value

Register File Access Time

busA

Old Value

New Value

Delay through Extender & Mux

Register File Write Time

busB

Old Value

New Value

ALU Delay

Address

Old Value

New Value

Data Memory Access Time

busW

Old Value

New


Register file memory write timing ideal vs reality

WrEn

Adr

32

Ideal

Memory

Din

Dout

32

32

Clk

WrEn

Adr

32

Ideal

Memory

Din

Dout

32

32

Register File & Memory Write Timing: Ideal vs. Reality

  • In previous lectures, register file and memory are simplified:

    • Write happens at the clock tick

    • Address, data, and write enable must bestable one “set-up” time before the clock tick

  • In real life:

    • Neither register file nor ideal memory has the clock input

    • The write path is a combinational logic delay path:

      • Write enable goes to 1 and Din settles down

      • Memory write access delay

      • Din is written into mem[address]

    • Important: Address and Data must bestable BEFORE Write Enable goes to 1


Race condition between address and write enable

RegWr

Ra

5

Rb

busA

5

32

Reg File

Rw

busB

5

32

busW

32

WrEn

Adr

32

Ideal

Memory

Din

Dout

32

32

Race Condition Between Address and Write Enable

  • This “real” (no clock input) register file may notwork reliably in the single cycle processor because:

    • We cannot guarantee Rw willbe stable BEFORE RegWr = 1

    • There is a “race” between Rw (address)and RegWr (write enable)

  • The “real” (no clock input) memory may not workreliably in the single cycle processor because:

    • We cannot guarantee Address willbe stable BEFORE WrEn = 1

    • There is a race between Adr and WrEn


How to avoid this race condition
How to Avoid this Race Condition?

  • Solution for the multiple cycle implementation:

    • Make sure Address is stable by the end of Cycle N

    • Assert Write Enable signal ONE cycle later at Cycle (N + 1)

    • Address cannot change until Write Enable is disasserted


Dual port ideal memory

MemWr

00

RAdr<1:0>

<31:2>

30

Ideal

Memory

32

WrAdr

Din

Dout

32

32

Dual-Port Ideal Memory

  • Dual Port Ideal Memory

    • Independent Read (RAdr, Dout) and Write (WAdr, Din) ports

    • Read and write (to different location) can occur at the same cycle

  • Read Port is a combinational path:

    • Read Address Valid -->

    • Memory Read Access Delay -->

    • Data Out Valid

  • Write Port is also a combinational path:

    • MemWrite = 1 -->

    • Memory Write Access Delay -->

    • Data In is written into location[WrAdr]


Instruction fetch cycle in the beginning

ALU

32

ALU

Control

32

Instruction Fetch Cycle: In the Beginning

  • Every cycle begins right AFTER the clock tick:

    • mem[PC] PC<31:0> + 4

Clk

One “Logic” Clock Cycle

You are here!

PCWr=?

PC

32

MemWr=?

IRWr=?

32

32

RAdr

Clk

4

32

Ideal

Memory

Instruction Reg

WrAdr

32

Dout

Din

32

ALUop=?

32

Clk


Instruction fetch cycle the end

ALU

32

ALU

Control

32

Instruction Fetch Cycle: The End

  • Every cycle ends AT the next clock tick (storage element updates):

    • IR <-- mem[PC] ; PC<31:0> <-- PC<31:0> + 4

Clk

One “Logic” Clock Cycle

You are here!

PCWr=1

PC

32

MemWr=0

IRWr=1

32

00

32

RAdr

Clk

4

32

Ideal

Memory

Instruction Reg

32

WrAdr

Dout

Din

ALUOp = Add

32

32

Clk


Instruction fetch cycle overall picture cycle 1

Ifetch

ALUOp=Add

1: PCWr, IRWr

x: PCWrCond

RegDst, Mem2R

Others: 0s

Target

32

0

Mux

0

Mux

1

ALU

1

32

ALU

Control

Instruction Fetch Cycle: Overall PictureCycle 1

PCWr=1

PCWrCond=x

PCSrc=0

BrWr=0

Zero

ALUSelA=0

IorD=0

MemWr=0

IRWr=1

1

Mux

32

PC

0

Zero

32

32

RAdr

32

busA

Ideal

Memory

32

Instruction Reg

32

4

0

32

WrAdr

32

1

32

Din

Dout

busB

32

2

32

3

ALUSelB=00

ALUOp=Add


Register fetch instruction decode cycle 2

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Register Fetch / Instruction Decode:Cycle 2

  • busA <- RegFile[rs] ; busB <- RegFile[rt] ;

  • ALU is not being used: ALUctr = xx

PCWr=0

PCWrCond=0

PCSrc=x

Zero

ALUSelA=x

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Op

Go to the

Control

6

Imm

ALUSelB=xx

Func

6

16

ALUOp=xx


Register fetch instruction decode cycle 2 continue

Rfetch/Decode

ALUOp=Add

1: BrWr, ExtOp

ALUSelB=10

x: RegDst, PCSrc

IorD, MemtoReg

Others: 0s

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

<< 2

Extend

Register Fetch / Instruction Decode:Cycle 2 (Continue)

busA <- Reg[rs] ; busB <- Reg[rt] ;

Target <- PC + SignExt(Imm16)*4

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=1

Zero

ALUSelA=0

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Control

Beq

Op

Rtype

Imm

6

ALUSelB=10

Ori

Func

Memory

6

16

32

ALUOp=Add

:

ExtOp=1


Branch completion cycle 3

BrComplete

ALUOp=Sub

ALUSelB=01

x: IorD, Mem2Reg

RegDst, ExtOp

1: PCWrCond

ALUSelA

PCSrc

Target

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

<< 2

Extend

Branch Completion: Cycle 3

  • if (busA == busB)

    • PC <- Target

PCWr=0

PCWrCond=1

PCSrc=1

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

32

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

ALUSelB=01

16

32

ALUOp=Sub

ExtOp=x


Outline of today s lecture2
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


Instruction decode cycle 2 we have a r type

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

<< 2

Extend

Instruction Decode: Cycle 2, We have a R-type!

  • Next Cycle: R-type Execution

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=1

Zero

ALUSelA=0

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Control

Beq

Op

Rtype

Imm

6

ALUSelB=10

Ori

Func

Memory

6

16

32

ALUOp=Add

:

ExtOp=1


R type execution cycle 3

RExec

1: RegDst

ALUSelA

ALUSelB=01

ALUOp=Rtype

x: PCSrc, IorD

MemtoReg

ExtOp

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Mux

1

0

<< 2

Extend

16

R-type Execution: Cycle 3

  • ALU Output <- busA op busB

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=1

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

32

ALUOp=Rtype

ExtOp=x

MemtoReg=x

ALUSelB=01


R type completion cycle 4

Rfinish

ALUOp=Rtype

1: RegDst, RegWr

ALUselA

ALUSelB=01

x: IorD, PCSrc

ExtOp

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

32

1

ALU

Control

Mux

1

0

<< 2

Extend

16

R-type Completion: Cycle 4

  • R[rd] <- ALU Output

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=1

RegWr=1

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

Din

Dout

busW

busB

32

2

32

3

Imm

32

ALUOp=Rtype

ExtOp=x

MemtoReg=0

ALUSelB=01


Outline of today s lecture3
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


A multiple cycle delay path

0

Mux

1

ALU

0

Mux

1

ALU

Control

Mux

1

0

A Multiple Cycle Delay Path

  • There is no register to save the results between:

    • Register Fetch: busA <- Reg[rs] ; busB <- Reg[rt]

    • R-type Execution: ALU output <- busA op busB

    • R-type Completion: Reg[rd] <- ALU output

Register here to save

outputs of Rfetch?

ALUselA

PCWr

Register here to save

outputs of RExec?

Zero

Rs

Ra

5

busA

32

Rt

Rb

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

1

Rd

32

busW

busB

32

2

3

ALUselB

ALUOp


A multiple cycle delay path continue
A Multiple Cycle Delay Path (Continue)

  • Register is NOT needed to save the outputs of Register Fetch:

    • IRWr = 0: busA and busB will not change after Register Fetch

  • Register is NOT needed to save the outputs of R-type Execution:

    • busA and busB will not change after Register Fetch

    • Control signals ALUSelA, ALUSelB, and ALUOpwill not change after R-type Execution

    • Consequently ALU output will not change after R-type Execution

  • In theory (P. 320, P&H), you need a register to hold a signal value if:

    • (1) The signal is computed in one clock cycle and used in another.

    • (2) AND the inputs to the functional block that computes this signal can change before the signal is written into a state element.

  • You can save a register if Cond 1 is true BUT Cond 2 is false:

    • But in practice, this will introduce a multiple cycle delay path:

      • A logic delay path that takes multiple cycles to propagate from one storage element to the next storage element


Pros and cons of a multiple cycle delay path

0

Zero

Rs

Mux

Ra

5

busA

32

Rt

1

Rb

32

ALU

Instruction Reg

Reg File

5

32

0

4

Rt

0

Rw

32

Mux

1

Rd

32

busW

busB

32

1

2

ALU

Control

Mux

1

0

3

ALUselB

Pros and Cons of a Multiple Cycle Delay Path

  • A 3-cycle path example:

    • IR (storage) -> Reg File Read -> ALU -> Reg File Write (storage)

  • Advantages:

    • Register savings

    • We can share time among cycles:

      • If ALU takes longer than one cycle, still “a OK” as longas the entire path takes less than 3 cycles to finish


Pros and cons of a multiple cycle delay path continue

0

Zero

Rs

Mux

Ra

5

busA

32

Rt

1

Rb

32

ALU

Instruction Reg

Reg File

5

32

0

4

Rt

0

Rw

32

Mux

1

Rd

32

busW

busB

32

1

2

ALU

Control

Mux

1

0

3

ALUselB

Pros and Cons of a Multiple Cycle Delay Path (Continue)

  • Disadvantage:

    • Static timing analyzer, which ONLY looks at delay between two storage elements, will report this as a timing violation

    • You have to ignore the static timing analyzer’s warnings

    • But you may end up ignoring real timing violations

    • We always TRY to put in registers between cycles to avoid MCDP

      assume we add registers A,B, ALUOut, and Mem. Data register

A

ALUOut

(Can also be

Used

intead of

the Target reg.)

B

Mem Data


Outline of today s lecture4
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


Instruction decode cycle 2 we have an ori

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

<< 2

Extend

Instruction Decode: Cycle 2, We have an Ori!

  • Next Cycle: Ori Execution

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=1

Zero

ALUSelA=0

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Intruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Control

Beq

Op

Rtype

Imm

6

ALUSelB=10

Ori

Func

Memory

6

16

32

ALUOp=Add

:

ExtOp=1


Ori execution cycle 3

OriExec

ALUOp=Or

1: ALUSelA

ALUSelB=11

x: MemtoReg

IorD, PCSrc

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Mux

1

0

<< 2

Extend

Ori Execution: Cycle 3

ALU output <- busA or ZeroExt[Imm16]

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=0

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

16

32

ALUOp=Or

ExtOp=0

MemtoReg=x

ALUSelB=11


Ori completion cycle 4

OriFinish

ALUOp=Or

x: IorD, PCSrc

ALUSelB=11

1: ALUSelA

RegWr

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

32

1

ALU

Control

Mux

1

0

<< 2

Extend

16

Ori Completion: Cycle 4

  • Reg[rt] <- ALU output

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=0

RegWr=1

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

Din

Dout

busW

busB

32

2

32

3

Imm

32

ALUOp=Or

ExtOp=0

MemtoReg=0

ALUSelB=11


Outline of today s lecture5
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


Instruction decode cycle 2 we have a memory access

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

<< 2

Extend

Instruction Decode: Cycle 2, We have a Memory Access!

  • Next Cycle: Memory Address Calculation

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=1

Zero

ALUSelA=0

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Control

Beq

Op

Rtype

Imm

6

ALUSelB=10

Ori

Func

Memory

6

16

32

ALUOp=Add

:

ExtOp=1


Memory address calculation cycle 3

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Mux

1

0

<< 2

Extend

Memory Address CalculationCycle 3

AdrCal

1: ExtOp

ALUSelA

ALUSelB=11

ALUOp=Add

x: MemtoReg

ALU output <- busA + SignExt[Imm16]

PCSrc

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=x

RegWr=1

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

16

32

ALUOp=Add

ExtOp=1

MemtoReg=x

ALUSelB=11


Memory access for store cycle 4

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

32

1

ALU

Control

Mux

1

0

<< 2

Extend

16

Memory Access for StoreCycle 4

SWmem

1: ExtOp

MemWr

ALUSelA

ALUSelB=11

ALUOp=Add

x: PCSrc,RegDst

PCWr=0

  • mem[ALU output] <- busB

PCWrCond=0

PCSrc=x

BrWr=0

MemtoReg

Zero

ALUSelA=1

IorD=x

MemWr=1

IRWr=0

RegDst=x

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

Din

Dout

busW

busB

32

2

32

3

Imm

32

ALUOp=Add

ExtOp=1

MemtoReg=x

ALUSelB=11


Memory access for load cycle 4

LWmem

1: ExtOp

ALUSelA, IorD

ALUSelB=11

ALUOp=Add

x: MemtoReg

PCSrc

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Mux

1

0

<< 2

Extend

Memory Access for LoadCycle 4

  • Mem Dout <- mem[ALU output]

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=1

MemWr=0

IRWr=0

RegDst=0

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

16

32

ALUOp=Add

ExtOp=1

MemtoReg=x

ALUSelB=11


Write back for load cycle 5

LWwr

1: ALUSelA

RegWr, ExtOp

MemtoReg

ALUSelB=11

ALUOp=Add

x: PCSrc

IorD

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

1

ALU

Control

Mux

1

0

<< 2

Extend

Write Back for LoadCycle 5

  • Reg[rt] <- Mem Dout

PCWr=0

PCWrCond=0

PCSrc=x

BrWr=0

Zero

ALUSelA=1

IorD=x

MemWr=0

IRWr=0

RegDst=0

RegWr=0

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

32

Din

Dout

busW

busB

32

2

32

3

Imm

16

32

ALUOp=Add

ExtOp=1

MemtoReg=1

ALUSelB=11


Outline of today s lecture6
Outline of Today’s Lecture

  • Recap and Introduction (5 minutes)

  • Introduction to the Concept of Multiple Cycle Processor (15 minutes)

  • Multiple Cycle Implementation of R-type Instructions (15 minutes)

  • What is a Multiple Cycle Delay Path and Why is it Bad? (10 minutes)

  • Multiple Cycle Implementation of Or Immediate (5 minutes)

  • Multiple Cycle Implementation of Load and Store (15 minutes)

  • Putting it all Together (5 minutes)


Putting it all together multiple cycle datapath

Target

32

0

Mux

0

Mux

1

ALU

0

1

Mux

32

1

ALU

Control

Mux

1

0

<< 2

Extend

16

Putting it all together: Multiple Cycle Datapath

PCWr

PCWrCond

PCSrc

BrWr

Zero

ALUSelA

IorD

MemWr

IRWr

RegDst

RegWr

1

Mux

32

PC

0

Zero

32

Rs

Ra

32

RAdr

5

32

Rt

Rb

busA

32

Ideal

Memory

32

Instruction Reg

Reg File

5

32

4

Rt

0

Rw

32

WrAdr

32

1

32

Rd

Din

Dout

busW

busB

32

2

32

3

Imm

32

ALUOp

ExtOp

MemtoReg

ALUSelB


Summary
Summary

  • Disadvantages of the Single Cycle Proccessor

    • Long cycle time

    • Cycle time is too long for all instructions except the Load

  • Multiple Cycle Processor:

    • Divide the instructions into smaller steps

    • Execute each step (instead of the entire instruction) in one cycle

  • Do NOT confuse Multiple Cycle Processor with Multiple Cycle Delay Path

    • Multiple Cycle Processor executes eachinstruction in multiple clock cycles

    • Multiple Cycle Delay Path: a combinational logic path between two storage elements that takes more than one clock cycle to complete

  • It is possible (desirable) to build a MC Processor without MCDP:

    • Use a register to save a signal’s value whenever a signal is generated in one clock cycle and used in another cycle later


Putting it all together control state diagram

Ifetch

Rfetch/Decode

BrComplete

ALUOp=Add

1: PCWr, IRWr

ALUOp=Add

ALUOp=Sub

x: PCWrCond

1: BrWr, ExtOp

ALUSelB=01

RegDst, Mem2R

ALUSelB=10

x: IorD, Mem2Reg

Others: 0s

RegDst, ExtOp

x: RegDst, PCSrc

IorD, MemtoReg

1: PCWrCond

ALUSelA

Others: 0s

PCSrc

RExec

1: RegDst

ALUSelA

ALUOp=Or

ALUSelB=01

1: ALUSelA

ALUOp=Rtype

ALUSelB=11

x: PCSrc, IorD

MemtoReg

x: MemtoReg

ExtOp

IorD, PCSrc

Rfinish

1: ALUSelA

ALUOp=Rtype

ALUOp=Or

RegWr, ExtOp

1: RegDst, RegWr

MemtoReg

x: IorD, PCSrc

ALUselA

ALUSelB=11

ALUSelB=11

ALUSelB=01

ALUOp=Add

1: ALUSelA

x: IorD, PCSrc

x: PCSrc

RegWr

ExtOp

IorD

Putting it all together: Control State Diagram

beq

AdrCal

1: ExtOp

ALUSelA

ALUSelB=11

lw or sw

ALUOp=Add

x: MemtoReg

Ori

PCSrc

Rtype

lw

sw

OriExec

SWMem

LWmem

1: ExtOp

ALUSelA, IorD

1: ExtOp

MemWr

ALUSelB=11

ALUSelA

ALUOp=Add

ALUSelB=11

x: MemtoReg

ALUOp=Add

PCSrc

x: PCSrc,RegDst

MemtoReg

OriFinish

LWwr


Where to get more information
Where to get more information?

  • Next two lectures:

    • Multiple Cycle Controller: Appendix C of your text book.

    • Microprogramming: Section 5.7 of your text book. Appendix C will e added from the CD to the notes folder in the new ecampus


ad