- 122 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about ' Chapter 5' - lainey

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Our Methodology

- Only flip-flops
- All on the same edge (e.g. falling)
- All with same clock
- No need to draw clock signals

- All logic finishes in one cycle

Our Methodology, cont’d

- No clock gating!
- Book has bad examples

- Correct design:

Datapath – 1 CPI

- Assumption: get whole instruction done in one long cycle
- Instructions:
- add, sub, and, or slt, lw, sw, & beq

- To do
- For each instruction type
- Putting it all together

Fetch Instructions

- Fetch instruction, then increment PC
- Same for all types

- Assumes
- PC updated every cycle
- No branches or jumps

- After this instruction fetch next one

ALU Instructions

- and $1, $2, $3 # $1 <= $2 & $3
- E.g. MIPS R-format
Opcode rs rt rd shamt function

6 5 5 5 5 6

Load/Store Instructions

- lw $1, immed($2) # $1 <= M[SE(immed)+$2]
- E.g. MIPS I-format:
Opcode rt rt immed

6 5 5 16

Branch Instructions

- beq $1, $2, addr # if ($1==$2) PC = PC + addr<<2
- Actually
newPC = PC + 4

target = newPC + addr << 2 # in MIPS offset from newPC

if (($1 - $2) == 0)

PC = target

else

PC = newPC

Control Overview

- Single-cycle implementation
- Datapath: combinational logic, I-mem, regs, D-mem, PC
- Last three written at end of cycle

- Need control – just combinational logic!
- Inputs:
- Instruction (I-mem out)
- Zero (for beq)

- Outputs:
- Control lines for muxes
- ALUop
- Write-enables

- Datapath: combinational logic, I-mem, regs, D-mem, PC

Control Overview

- Fast control
- Divide up work on “need to know” basis
- Logic with fewer inputs is faster

- E.g.
- Global control need not know which ALUop

ALU Control

- Assume ALU uses

ALU Control

- ALU-ctrl = f(opcode,function)
- To simplify ALU-ctrl
- ALUop = f(opcode)
2 bits 6 bits

- ALUop = f(opcode)

ALU Control

- ALU-ctrl = f(ALUop, function)
- 3 bits 2 bits 6 bits
- Requires only five gates plus inverters

Global Control

- R-format: opcode rs rt rd shamt function
6 5 5 5 5 6

- I-format: opcode rs rt address/immediate
6 5 5 16

- J-format: opcode address
6 26

Global Control

- Route instruction[25:21] as read reg1 spec
- Route instruction[20:16] are read reg2 spec
- Route instruction[20:16] (load) and instruction[15:11] (others) to
- Write reg mux

- Call instruction[31:26] op[5:0]

Global Control

- Global control outputs
- ALU-ctrl - see above
- ALU src - R-format, beq vs. ld/st
- MemRead - lw
- MemWrite - sw
- MemtoReg - lw
- RegDst - lw dst in bits 20:16, not 15:11
- RegWrite - all but beq and sw
- PCSrc - beq taken

Global Control

- Global control outputs
- Replace PCsrc with
- Branch beq
- PCSrc = Branch * Zero

- Replace PCsrc with
- What are the inputs needed to determine above global control signals?
- Just Op[5:0]

Global Control (Fig. 5.20)

- RegDst = ~Op[0]
- ALUSrc = Op[0]
- RegWrite = ~Op[3] * ~Op[2]

Global Control

- More complex with entire MIPS ISA
- Need more systematic structure
- Want to share gates between control signals

- Common solution: PLA
- MIPS opcode space designed to minimize PLA inputs, minterms, and outputs

- See MIPS Opcode map (Fig A.19)

Cycles

Time

X

X

Instruction

Program

Cycle

(code size)

(CPI)

(cycle time)

What’s wrong with single cycle?- Critical path probably lw:
- I-mem, reg-read, alu, d-mem, reg-write

- Other instructions faster
- E.g. rrr: skip d-mem

- Instruction variation much worse for full ISA and real implementation:
- FP divide
- Cache misses (what the heck is this? – chapter 7)

Single Cycle Implementation

- Solution
- Variable clock?
- Too hard to control, design

- Fixed short clock
- Variable cycles per instruction

- Variable clock?

Multi-cycle Implementation

- Clock cycle = max(i-mem,reg-read+reg-write, ALU, d-mem)
- Reuse combination logic on different cycles
- One memory
- One ALU without other adders

- But
- Control is more complex
- Need new registers to save values (e.g. IR)
- Used again on later cycles
- Logic that computes signals is reused

High-level Multi-cycle Data-path

- Note:
- Instruction register, memory data register
- One memory with address bus
- One ALU with ALUOut register

Comment on busses

- Share wires to reduce #signals
- Distributed multiplexor

- Multiple sources driving one bus
- Ensure only one is active!

State

Current

Next

state

State Fn

outputs

Inputs

Output

Fn

Multi-cycle Control- Function of Op[5:0] and current step
- Defined as Finite State Machine (FSM) or
- Micro-program or microcode (later)

- FSM – App. B
- State is combination of step and which path

Finite State Machine (FSM)

- For each state, define:
- Control signals for datapath for this cycle
- Control signals to determine next state

- All instructions start in same IF state
- Instructions terminate by making IF next
- After proper PC update

ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

Multi-cycle Example (and)Nuts and Bolts--More on FSMs

- You will be using FSM control for your processor implementation
- You will be producing the state machine and control outputs from binary (ISA/datapath controls)
- There are multiple methods for specifying a state machine
- Moore machine (output is function of state only)
- Mealy machine (output is function of state/input)

- There are different methods of assigning states

FSMs--State Assignment

- State assignment is converting logical states to binary representation
- Is state assignment interesting/important?
- Judicious choice of state representation can make next state fcn/output fcn have fewer gates
- Optimal solution is hard, but having intuition is helpful (CAD tools can also help in practice)

State Assignment--Example

- 10 states in multicycle control FSM
- Each state can have 1 of 16 (2^4) encodings with “dense” state representation
- Any choice of encoding is fine functionally as long as all states are unique

- Appendix C-26 example: RegWrite signal

ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

State Assignment,RegWrite SignalState 0

State 1

State 2

State 6

State 8

State 9

State 4

State 7

State 5

State 3

State 7 (0111b)

State 9 (1001b)

State 4 (0100b)

State 8 (1000b)

Original: 2 inverters, 2 and3s, 1 or2

New: No gates--just bit 3!

ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

Multi-cycle Example (lw)ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

Multi-cycle Example (sw)ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

Multi-cycle Example (beq T)ID

MemRead

ALUSrcA=0

IorD = 0

IRWrite

ALUSrcB = 01

ALUOp = 00

PCWrite

PCSrc = 00

ALUSrcA = 0

ALUSrcB = 11

ALUOp = 00

Start

LW | SW

J

RRR

BEQ

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 10

ALUSrcA = 1

ALUSrcB = 00

ALUOp = 01

PCWriteCond

PCSource = 01

PCWrite

PCSource = 10

ALUSrcA = 1

ALUSrcB = 10

ALUOp = 00

EX

LW

SW

MemRead

IorD = 1

MemWrite

IorD = 1

RegDst = 1

RegWrite

MemtoReg = 0

WB

MEM

RegDst = 0

RegWrite

MemtoReg = 1

WB

Multi-cycle Example (j)Summary

- Processor implementation
- Datapath
- Control

- Single cycle implementation
- Next: microprogramming

Download Presentation

Connecting to Server..