
# Chapter 5

Chapter 5. Instructor: Mozafar Bag-Mohammadi, Spring 2010, Ilam University. Our methodology: only flip-flops, all on the same edge (e.g. falling), all with the same clock; no need to draw clock signals; all logic finishes in one cycle; no clock gating.


### Chapter 5

Instructor: Mozafar Bag-Mohammadi

Spring 2010

Ilam University

Our Methodology

• Only flip-flops

• All on the same edge (e.g. falling)

• All with same clock

• No need to draw clock signals

• All logic finishes in one cycle

• No clock gating!

• Correct design:

• Assumption: get whole instruction done in one long cycle

• Instructions:

• add, sub, and, or, slt, lw, sw, and beq

• To do

• For each instruction type

• Putting it all together

• Fetch instruction, then increment PC

• Same for all types

• Assumes

• PC updated every cycle

• No branches or jumps

• After this instruction fetch next one

• and \$1, \$2, \$3 # \$1 <= \$2 & \$3

• E.g. MIPS R-format

| Field | Opcode | rs | rt | rd | shamt | function |
|---|---|---|---|---|---|---|
| Bits | 6 | 5 | 5 | 5 | 5 | 6 |

• lw \$1, immed(\$2) # \$1 <= M[SE(immed)+\$2]

• E.g. MIPS I-format:

| Field | Opcode | rs | rt | immed |
|---|---|---|---|---|
| Bits | 6 | 5 | 5 | 16 |

• beq \$1, \$2, addr # if (\$1==\$2) PC = PC + addr<<2

• Actually

newPC = PC + 4

target = newPC + (addr << 2) # in MIPS, the offset is relative to newPC

if ((\$1 - \$2) == 0)

PC = target

else

PC = newPC
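The branch rule above can be sketched in a few lines of Python. This is only an illustration: the function names `sign_extend16` and `next_pc` are made up here, and the 16-bit sign extension of `addr` is assumed to follow the `SE(immed)` convention used for `lw`.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc: int, reg_a: int, reg_b: int, addr: int) -> int:
    """Next PC for beq: the word offset is relative to PC + 4."""
    new_pc = (pc + 4) & 0xFFFFFFFF
    target = (new_pc + (sign_extend16(addr) << 2)) & 0xFFFFFFFF
    zero = ((reg_a - reg_b) & 0xFFFFFFFF) == 0  # the ALU's Zero output
    return target if zero else new_pc


print(hex(next_pc(0x1000, 7, 7, 3)))       # registers equal, branch taken
print(hex(next_pc(0x1000, 7, 8, 3)))       # registers differ, fall through
print(hex(next_pc(0x1000, 7, 7, 0xFFFF)))  # taken with offset -1
```

Note the parentheses around the shift: without them, `newPC + addr << 2` would shift the sum rather than the offset.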

• Single-cycle implementation

• Datapath: combinational logic, I-mem, regs, D-mem, PC

• Last three written at end of cycle

• Need control – just combinational logic!

• Inputs:

• Instruction (I-mem out)

• Zero (for beq)

• Outputs:

• Control lines for muxes

• ALUop

• Write-enables

• Fast control

• Divide up work on “need to know” basis

• Logic with fewer inputs is faster

• E.g.

• Global control need not know which ALUop

• Assume ALU uses

• ALU-ctrl = f(opcode,function)

• To simplify ALU-ctrl

• ALUop = f(opcode), where ALUop is 2 bits and opcode is 6 bits

• ALU-ctrl = f(ALUop, function), where ALU-ctrl is 3 bits, ALUop is 2 bits, and function is 6 bits

• Requires only five gates plus inverters
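As a rough illustration of how small this logic can be, the sketch below writes ALU-ctrl as three Boolean equations using five gates plus inverters. The 3-bit encoding (000 and, 001 or, 010 add, 110 sub, 111 slt) and the ALUop values (00 for lw/sw, 01 for beq, 10 for R-format) are assumptions taken from the usual textbook convention; the slide's own table did not survive in this transcript, so it may differ.

```python
def alu_ctrl(alu_op: int, funct: int) -> int:
    """3-bit ALU control from 2-bit ALUop and 6-bit function field.

    Assumed encoding: 000 and, 001 or, 010 add, 110 sub, 111 slt.
    """
    op1, op0 = (alu_op >> 1) & 1, alu_op & 1
    f3, f2, f1, f0 = [(funct >> i) & 1 for i in (3, 2, 1, 0)]
    bit2 = op0 | (op1 & f1)        # OR + AND
    bit1 = (op1 ^ 1) | (f2 ^ 1)    # OR of two inverted inputs
    bit0 = op1 & (f3 | f0)         # AND + OR
    return (bit2 << 2) | (bit1 << 1) | bit0


# MIPS function codes for the R-format instructions in this subset.
FUNCT = {"add": 0b100000, "sub": 0b100010, "and": 0b100100,
         "or": 0b100101, "slt": 0b101010}

for name, funct in FUNCT.items():
    print(name, format(alu_ctrl(0b10, funct), "03b"))
print("lw/sw", format(alu_ctrl(0b00, 0), "03b"))
print("beq", format(alu_ctrl(0b01, 0), "03b"))
```

Only function bits 3 through 0 are needed here, which is one way the "need to know" split keeps the ALU control fast.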

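| Format | Fields (width in bits) |
|---|---|
| R-format | opcode (6), rs (5), rt (5), rd (5), shamt (5), function (6) |
| I-format | opcode (6), rs (5), rt (5), address/immediate (16) |
| J-format | opcode (6), target address (26) |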

• Route instruction[25:21] as read reg1 spec

• Route instruction[20:16] as read reg2 spec

• Route instruction[20:16] (load) and instruction[15:11] (others) to the write reg mux

• Call instruction[31:26] op[5:0]
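The field routing above amounts to fixed bit slices of the 32-bit instruction word. A minimal Python sketch follows; the helper names `decode` and `write_reg` are hypothetical, and the two sample words are hand-assembled encodings of `and $1, $2, $3` and `lw $1, 8($2)`.

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # instruction[31:26]
        "rs":    (instr >> 21) & 0x1F,  # instruction[25:21], read reg1
        "rt":    (instr >> 16) & 0x1F,  # instruction[20:16], read reg2
        "rd":    (instr >> 11) & 0x1F,  # instruction[15:11]
        "shamt": (instr >> 6) & 0x1F,   # instruction[10:6]
        "funct": instr & 0x3F,          # instruction[5:0]
        "immed": instr & 0xFFFF,        # instruction[15:0]
    }


def write_reg(fields: dict, reg_dst: int) -> int:
    """Write-register mux: rd when RegDst is 1, rt (the lw destination) when 0."""
    return fields["rd"] if reg_dst else fields["rt"]


and_fields = decode(0x00430824)  # and $1, $2, $3
lw_fields = decode(0x8C410008)   # lw $1, 8($2)
print(and_fields, write_reg(and_fields, reg_dst=1))
print(lw_fields, write_reg(lw_fields, reg_dst=0))
```

Every field is extracted regardless of format; control decides later which ones matter.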

• Global control outputs

• ALU-ctrl - see above

• ALUSrc - R-format, beq vs. lw/sw

• MemWrite - sw

• MemtoReg - lw

• RegDst - lw dst in bits 20:16, not 15:11

• RegWrite - all but beq and sw

• PCSrc - beq taken

• Global control outputs

• Replace PCSrc with Branch (asserted for beq)

• PCSrc = Branch * Zero

• What are the inputs needed to determine above global control signals?

• Just Op[5:0]

• RegDst = ~Op[0]

• ALUSrc = Op[0]

• RegWrite = ~Op[3] * ~Op[2]
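These three equations can be checked against the four opcodes in this subset. The sketch below assumes the standard MIPS opcodes (R-format 000000, lw 100011, sw 101011, beq 000100); the table of expected values is written out from the signal descriptions earlier, with `None` marking a don't-care.

```python
OPCODES = {"R-format": 0b000000, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}


def bit(op: int, i: int) -> int:
    """Return Op[i] as 0 or 1."""
    return (op >> i) & 1


def control(op: int) -> dict:
    """Global control signals as functions of Op[5:0] only."""
    return {
        "RegDst": 1 - bit(op, 0),
        "ALUSrc": bit(op, 0),
        "RegWrite": (1 - bit(op, 3)) & (1 - bit(op, 2)),
    }


# Expected values; None means the signal is a don't-care for that instruction.
EXPECTED = {
    "R-format": {"RegDst": 1, "ALUSrc": 0, "RegWrite": 1},
    "lw":       {"RegDst": 0, "ALUSrc": 1, "RegWrite": 1},
    "sw":       {"RegDst": None, "ALUSrc": 1, "RegWrite": 0},
    "beq":      {"RegDst": None, "ALUSrc": 0, "RegWrite": 0},
}


def matches(name: str) -> bool:
    got = control(OPCODES[name])
    return all(want is None or got[sig] == want
               for sig, want in EXPECTED[name].items())


for name in OPCODES:
    print(name, control(OPCODES[name]), "ok" if matches(name) else "MISMATCH")
```

The equations are this short only because the subset has four opcodes and RegDst is a don't-care for sw and beq.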

• More complex with entire MIPS ISA

• Need more systematic structure

• Want to share gates between control signals

• Common solution: PLA

• MIPS opcode space designed to minimize PLA inputs, minterms, and outputs

• See MIPS Opcode map (Fig A.19)

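$$\frac{\text{Time}}{\text{Program}} = \underbrace{\frac{\text{Instructions}}{\text{Program}}}_{\text{code size}} \times \underbrace{\frac{\text{Cycles}}{\text{Instruction}}}_{\text{CPI}} \times \underbrace{\frac{\text{Time}}{\text{Cycle}}}_{\text{cycle time}}$$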

What’s wrong with single cycle?

• Critical path probably lw:

• I-mem, reg-read, alu, d-mem, reg-write

• Other instructions faster

• E.g. rrr: skip d-mem

• Instruction variation much worse for full ISA and real implementation:

• FP divide

• Cache misses (what the heck is this? – chapter 7)

• Solution

• Variable clock?

• Too hard to control, design

• Fixed short clock

• Variable cycles per instruction

• Clock cycle = max(i-mem, reg-read + reg-write, ALU, d-mem)
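To make the trade-off concrete, here is a small calculation with made-up unit latencies. All numbers are hypothetical, chosen only for illustration, and the cycle counts assume the multi-cycle control described later in this chapter (IF, ID, EX, then MEM and WB where needed).

```python
# Hypothetical unit latencies in ns (assumed for illustration, not from the slides).
LATENCY = {"i-mem": 2, "reg-read": 1, "ALU": 2, "d-mem": 2, "reg-write": 1}

# Single cycle: the clock must cover the lw critical path.
LW_PATH = ["i-mem", "reg-read", "ALU", "d-mem", "reg-write"]
single_cycle_clock = sum(LATENCY[unit] for unit in LW_PATH)

# Multi-cycle: the clock only has to cover the slowest step.
multi_cycle_clock = max(
    LATENCY["i-mem"],
    LATENCY["reg-read"] + LATENCY["reg-write"],
    LATENCY["ALU"],
    LATENCY["d-mem"],
)

# Cycles per instruction, assuming one cycle per state visited.
CYCLES = {"lw": 5, "sw": 4, "rrr": 4, "beq": 3, "j": 3}


def multi_cycle_time(instr: str) -> int:
    """Execution time of one instruction on the multi-cycle design."""
    return CYCLES[instr] * multi_cycle_clock


print("single-cycle clock:", single_cycle_clock, "ns for every instruction")
print("multi-cycle clock:", multi_cycle_clock, "ns")
for instr in CYCLES:
    print(instr, multi_cycle_time(instr), "ns")
```

With these particular numbers a multi-cycle lw takes longer than the single-cycle clock while beq and j finish sooner, so the overall result depends on the instruction mix.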

• Reuse combinational logic on different cycles

• One memory

• One ALU without other adders

• But

• Control is more complex

• Need new registers to save values (e.g. IR)

• Used again on later cycles

• Logic that computes signals is reused

• Note:

• Instruction register, memory data register

• One memory with address bus

• One ALU with ALUOut register

• Share wires to reduce the number of signals

• Distributed multiplexor

• Multiple sources driving one bus

• Ensure only one is active!
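The "only one active driver" rule can be modelled as a simple check. This is a behavioural sketch with a hypothetical helper `resolve_bus`, not a description of any particular bus circuit.

```python
def resolve_bus(drivers):
    """Resolve a shared bus.

    drivers: list of (enable, value) pairs, one per source.
    Returns the driven value, or None if no source is enabled (bus floats).
    Raises ValueError if more than one source is enabled (contention).
    """
    active = [value for enable, value in drivers if enable]
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else None


print(resolve_bus([(0, 5), (1, 9), (0, 3)]))  # only the second source drives
print(resolve_bus([(0, 5), (0, 9)]))          # nobody drives the bus
```

In hardware the control logic must guarantee this by construction, since nothing catches the error at run time.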

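[Figure: FSM block diagram. The inputs and the current state feed the next-state function (State Fn) and the output function (Output Fn), which produce the next state and the outputs.]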

Multi-cycle Control

• Function of Op[5:0] and current step

• Defined as Finite State Machine (FSM) or

• Micro-program or microcode (later)

• FSM – App. B

• State is combination of step and which path

• For each state, define:

• Control signals for datapath for this cycle

• Control signals to determine next state

• All instructions start in same IF state

• Instructions terminate by making IF next

• After proper PC update

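[Figure: multi-cycle control state diagram. States and the control signals asserted in each:]

| Stage | Entered | Control signals |
|---|---|---|
| IF | Start | ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00 |
| ID | from IF | ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 |
| EX | LW \| SW | ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 |
| EX | RRR | ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10 |
| EX | BEQ | ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01 |
| EX | J | PCWrite, PCSource = 10 |
| MEM | LW | IorD = 1 |
| MEM | SW | IorD = 1, MemWrite |
| WB | RRR | RegDst = 1, RegWrite, MemtoReg = 0 |
| WB | LW | RegDst = 0, RegWrite, MemtoReg = 1 |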

Multi-cycle Example (and)

• You will be using FSM control for your processor implementation

• You will be producing the state machine and control outputs from binary (ISA/datapath controls)

• There are multiple methods for specifying a state machine

• Moore machine (output is function of state only)

• Mealy machine (output is function of state/input)

• There are different methods of assigning states

• State assignment is converting logical states to binary representation

• Is state assignment interesting/important?

• Judicious choice of state representation can make the next-state function and output function need fewer gates

• Optimal solution is hard, but having intuition is helpful (CAD tools can also help in practice)

• 10 states in multicycle control FSM

• Each state can have 1 of 16 (2^4) encodings with “dense” state representation

• Any choice of encoding is fine functionally as long as all states are unique

• Appendix C-26 example: RegWrite signal


State Assignment, RegWrite Signal

• The ten states are numbered State 0 through State 9

• RegWrite is asserted in two states: State 4 (0100b) and State 7 (0111b)

• Reassign: State 7 (0111b) becomes State 9 (1001b); State 4 (0100b) becomes State 8 (1000b)

• Original: 2 inverters, 2 AND3s, 1 OR2

• New: no gates, just bit 3!
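The effect of the reassignment can be checked by brute force over the ten states. In this sketch the mapping of old states 4 and 7 to new states 8 and 9 follows the slide; moving the old states 8 and 9 into the freed codes 4 and 7 is an assumption made here so that every state keeps a unique encoding.

```python
def regwrite_original(state: int) -> int:
    """RegWrite under the original encoding (asserted in states 0100b and 0111b).

    Uses 2 inverters, 2 three-input ANDs and 1 two-input OR; bit 3 is
    ignored because codes 12 and 15 never occur with only ten states.
    """
    s2, s1, s0 = (state >> 2) & 1, (state >> 1) & 1, state & 1
    return (s2 & (1 - s1) & (1 - s0)) | (s2 & s1 & s0)


def regwrite_new(state: int) -> int:
    """RegWrite under the new encoding: just bit 3 of the state."""
    return (state >> 3) & 1


# Old state number -> new state number (8->4 and 9->7 are assumed here).
REMAP = {4: 8, 7: 9, 8: 4, 9: 7}

for old in range(10):
    new = REMAP.get(old, old)
    print(old, "->", new, regwrite_original(old), regwrite_new(new))
```

Both encodings assert RegWrite for the same two logical states; only the gate count differs.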


Multi-cycle Example (lw)


Multi-cycle Example (sw)


Multi-cycle Example (beq T)


Multi-cycle Example (j)
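The example walks named above (and, lw, sw, beq taken, j) can be traced with a small Moore-style model of the control FSM. This is a sketch under stated assumptions: the state names such as `EX_mem` and `WB_lw` are invented here for readability, and the control strings are copied from the state diagram's labels.

```python
# Control outputs per state (Moore machine: outputs depend on the state only).
OUTPUTS = {
    "IF":     "ALUSrcA=0 IorD=0 IRWrite ALUSrcB=01 ALUOp=00 PCWrite PCSource=00",
    "ID":     "ALUSrcA=0 ALUSrcB=11 ALUOp=00",
    "EX_mem": "ALUSrcA=1 ALUSrcB=10 ALUOp=00",
    "EX_rrr": "ALUSrcA=1 ALUSrcB=00 ALUOp=10",
    "EX_beq": "ALUSrcA=1 ALUSrcB=00 ALUOp=01 PCWriteCond PCSource=01",
    "EX_j":   "PCWrite PCSource=10",
    "MEM_lw": "IorD=1",
    "MEM_sw": "IorD=1 MemWrite",
    "WB_rrr": "RegDst=1 RegWrite MemtoReg=0",
    "WB_lw":  "RegDst=0 RegWrite MemtoReg=1",
}

# Which EX state the ID state dispatches to for each instruction.
DISPATCH = {"and": "EX_rrr", "lw": "EX_mem", "sw": "EX_mem",
            "beq": "EX_beq", "j": "EX_j"}


def next_state(state: str, instr: str) -> str:
    """Next-state function; an instruction terminates by returning to IF."""
    if state == "IF":
        return "ID"
    if state == "ID":
        return DISPATCH[instr]
    if state == "EX_rrr":
        return "WB_rrr"
    if state == "EX_mem":
        return "MEM_lw" if instr == "lw" else "MEM_sw"
    if state == "MEM_lw":
        return "WB_lw"
    return "IF"


def trace(instr: str) -> list:
    """List the states one instruction visits, starting at IF."""
    states = ["IF"]
    while True:
        nxt = next_state(states[-1], instr)
        if nxt == "IF":
            return states
        states.append(nxt)


for instr in DISPATCH:
    path = trace(instr)
    print(f"{instr}: {len(path)} cycles")
    for state in path:
        print("   ", state, "|", OUTPUTS[state])
```

The length of each trace is that instruction's cycle count, which is where the variable CPI of the multi-cycle design comes from.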

• Processor implementation

• Datapath

• Control

• Single cycle implementation

• Next: microprogramming