CSCI-2500: Computer Organization

CSCI-2500:Computer Organization Processor Design

Datapath • The datapath is the interconnection of the components that make up the processor. • The datapath must provide connections for moving bits between memory, registers and the ALU. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 2

Control • The control is a collection of signals that enable/disable the inputs/outputs of the various components. • You can think of the control as the brain, and the datapath as the body. • the datapath does only what the brain tells it to do. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 3

Processor Design The sequencing and execution of instructions • We already know about many of the individual components that are necessary: • ALU, Multiplexors, Decoders, Flip-Flops • We need to discuss how to use a clock • We need to think about registers and memory. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 4

The Clock The clock generates a never-ending sequence of alternating 1s and 0s. All operations are synchronized to the clock. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 5

Clocking Methodology • Determines when (relative to the clock) a signal can be read and written. • Read: signal value is used by some component. • Written: a signal value is generated by some component. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 6

Simple Example: Enabled AND • We want an AND gate that holds it’s output value constant until the clock switches from 0 (lo) to 1 (hi). • We can use a flip-flop to hold the inputs to the AND gate constant during the time we want the output constant. • We use a clocked flip-flop to make things happen when the clock changes. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 7

D Q Clock Q D Flip-Flop Reminder The output (Q) changes to reflect D only when the Clock is a 1. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 8

D Flip-Flop Timing 1 0 D 1 0 C 1 0 Q CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 9

Clocked AND gate A D flip-flop D Q C A•B (clocked) B D flip-flop D Q C Clock CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 10

Edge-triggered Clocking • Values stored are updated (can change) only on a clock edge. • When the clock switches from 0 to 1 everybody allows signals in. • everybody means state elements • combinational elements always do the same thing, they don’t care about the clock (that’s why we added the flip-flops to our AND gate). CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 11

State Elements • Any component that stores one or more values is a state element. • The entire processor can be viewed as a circuit that moves from one state (collection of all the state elements) to another state. • At time i a component uses values generated at time i-1. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 12

Register File R e a d r e g i s t e r • Contains multiple registers • each holds 32 bits • Two output ports (read ports) • One input port (write port) • To change the value of a register: • supply register number • supply data • clock (the Write control signal) R e a d n u m b e r 1 d a t a 1 R e a d r e g i s t e r n u m b e r 2 R e g i s t e r f i l e W r i t e r e g i s t e r R e a d d a t a 2 W r i t e d a t a W r i t e CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 13

R e a d r e g i s t e r n u m b e r 1 R e g i s t e r 0 R e g i s t e r 1 M u R e a d d a t a 1 x R e g i s t e r n – 1 R e g i s t e r n R e a d r e g i s t e r n u m b e r 2 M u R e a d d a t a 2 x Implementation of Read Ports Figure B.19 CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 14

W r i t e C 0 R e g i s t e r 0 1 D n - t o - 1 C R e g i s t e r n u m b e r d e c o d e r R e g i s t e r 1 D n – 1 n C R e g i s t e r n – 1 D C R e g i s t e r n D R e g i s t e r d a t a Implementation of Write CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 15

Memory • Memory is similar to a very large register file: • single read port (output) • chip select input signal • output enable input signal • write enable input signal • address lines (determine which memory element) • data input lines (used to write a memory element) CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 16

D i n [ 1 ] D i n [ 0 ] D D D D C l a t c h Q C l a t c h Q W r i t e e n a b l e E n a b l e E n a b l e 0 2 - t o - 4 D D D D d e c o d e r C l a t c h Q C l a t c h Q E n a b l e E n a b l e 1 D D D D A d d r e s s C l a t c h Q C l a t c h Q E n a b l e E n a b l e 2 D D D D C l a t c h Q C l a t c h Q E n a b l e E n a b l e 3 D o u t [ 1 ] D o u t [ 0 ] 4 x 2 Memory (SRAM) CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 17

Memory Usage • For now, we treat memory as a single component that supports 2 operations: • write (we change the value stored in a memory location) • read (we get the value currently stored in a memory location). • We can only do one operation at a time! CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 18

Instruction & Data Memory • It is useful to treat the memory that holds instructions as a separate component. • instruction memory is read-only • Typically there is really one memory that holds both instructions and data. • as we will see when we talk more about memory, the processor often has two interfaces to the memory, one for instructions and one for data! CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 19

Designing a Datapath for MIPS • We start by looking at the datapaths needed to support a simple subset of MIPS instructions: • a few arithmetic and logical instructions • load and store word • beq and j instructions CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 20

Functions for MIPS Instructions • We can generalize the functions we need to: • using the PC register as the address, read a value from the memory (read the instruction) • Read one or two register values (depends on the specific instruction). • ALU Operation , Memory read or write, … • Possibly change the value of a register. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 21

Fetching the next instruction • PC Register holds the address • Memory holds the instruction • we need to read from memory. • Need to update the PC • add 4 to current value CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 22

Instruction Fetch DataPath A d d 4 R e a d P C a d d r e s s I n s t r u c t i o n I n s t r u c t i o n m e m o r y CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 23

5 bits 5 bits 5 bits 5 bits 6 bits 6 bits op rs rt rd shamt funct Supporting R-format instructions Includes add, sub, slt, and & or instructions. Generalization: • read 2 registers and send to ALU. • perform ALU operation • store result in a register CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 24

MIPS Registers • MIPS has 32 general purpose registers. • Register File holds all 32 registers • need 5 bits to select a register • rs,rt & rd fields in R-format instructions. • MIPS Register File has 2 read ports. • can get at both source registers at the same time. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 25

Datapath for R-format Instructions A L U o p e r a t i o n 3 R e a d r e g i s t e r 1 R e a d d a t a 1 R e a d Z e r o r e g i s t e r 2 I n s t r u c t i o n R e g i s t e r s A L U A L U W r i t e r e s u l t r e g i s t e r R e a d d a t a 2 W r i t e d a t a R e g W r i t e CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 26

Load and Store Instructions Need to compute the address • offset (part of the instruction) • base (stored in a register). For Load: • read from memory • store in a register For Store: • read from register • write to memory CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 27

Computing the address • 16 bit signed offset is part of the instruction. • We have a 32 bit ALU. • need to sign extend the offset (to 32 bits). • Feed the 32 bit offset and the contents of a register to the ALU • Tell the ALU to “add”. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 28

Load/Store Datapath A L U o p e r a t i o n 3 R e a d r e g i s t e r 1 M e m W r i t e R e a d d a t a 1 R e a d Z e r o r e g i s t e r 2 I n s t r u c t i o n A L U R e g i s t e r s A L U R e a d W r i t e r e s u l t A d d r e s s d a t a r e g i s t e r R e a d d a t a 2 W r i t e D a t a d a t a m e m o r y W r i t e R e g W r i t e d a t a 1 6 3 2 S i g n M e m R e a d e x t e n d CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 29

Supporting beq • 2 registers compared for equality • 16 bit offset used to compute target address. • signed offset is relative to the PC • offset is in words not in bytes! • Might branch, might not (need to decide). CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 30

Computing target address • Recall that the offset is actually relative to the address of the next instruction. • we always add 4 to the PC, we must make sure we use this value as the base. • Word vs. Byte offset • we just need to shift the 16 bit offset 2 bits to the right (fill with 2 zeros). CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 31

P C + 4 f r o m i n s t r u c t i o n d a t a p a t h A d d S u m B r a n c h t a r g e t S h i f t l e f t 2 A L U o p e r a t i o n 3 R e a d r e g i s t e r 1 I n s t r u c t i o n R e a d d a t a 1 R e a d r e g i s t e r 2 T o b r a n c h R e g i s t e r s A L U Z e r o c o n t r o l l o g i c W r i t e r e g i s t e r R e a d d a t a 2 W r i t e d a t a R e g W r i t e 1 6 3 2 S i g n e x t e n d Branch Datapath CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 32

Datapaths We looked at individual datapaths that support: • Fetching Instructions • Arithmetic/Logical Instructions • Load & Store Instructions • Conditional branch We need to combine these in to a single datapath. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 33

Issues • When designing one datapath that can be used for any operation: • the goal is to be able to handle one instruction per cycle. • must make sure no datapath resource needs to be used more than once at the same time. • if so – we need to provide more than one! CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 34

Sharing Resources • We can share datapath resources by adding a multiplexor (and a control line). • for example, the second input to the ALU could come from either: • a register (as in an arithmetic instruction) • from the instruction (as in a load/store – when computing the memory address). CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 35

Sharing with a Multiplexor Example Operand 1 A A+B (Control==0) Operand 2 ADD A+C (Control==1) B C Control CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 36

Combining Datapaths for memory instructions and arithmetic instructions • Need to share the ALU • For memory instructions used to compute the address in memory. • For Arithmetic/Logical instructions used to perform arithmetic/logical operation. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 37

R e a d r e g i s t e r 1 R e a d d a t a 1 R e a d r e g i s t e r 2 I n s t r u c t i o n R e g i s t e r s R e a d R e a d W r i t e d a t a 2 d a t a r e g i s t e r M M u u W r i t e x D a t a x d a t a m e m o r y W r i t e d a t a 1 6 3 2 S i g n e A L U o p e r a t i o n 3 M e m W r i t e M e m t o R e g A L U S r c Z e r o A L U A L U A d d r e s s r e s u l t R e g W r i t e M e m R e a d x t e n d New Controls Sharing Multiplexors CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 38

Adding the Instruction Fetch • One memory for instructions, separate memory for data. • otherwise we might need to use the memory twice in the same instruction. • Dedicated Adder for updating the PC • otherwise we might need to use the ALU twice in the same instruction. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 39

A d d 4 R e g i s t e r s R e a d A L U o p e r a t i o n 3 M e m W r i t e r e g i s t e r 1 R e a d P C R e a d M e m t o R e g R e a d a d d r e s s d a t a 1 r e g i s t e r 2 A L U S r c Z e r o I n s t r u c t i o n A L U R e a d A L U R e a d W r i t e A d d r e s s r e s u l t d a t a 2 r e g i s t e r d a t a M M u I n s t r u c t i o n u W r i t e x D a t a x m e m o r y d a t a m e m o r y W r i t e R e g W r i t e d a t a 1 6 3 2 S i g n M e m R e a d e x t e n d Dedicated Adder Two Memory Units CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 40

Need to add datapath for beq • Register comparison (requires ALU). • Another adder to compute target address. • One input to adder is sign extended offset, shifted by 2 bits. • Other input to adder is PC+4 CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 41

P C S r c M A d d u x A L U A d d 4 r e s u l t S h i f t l e f t 2 R e g i s t e r s A L U o p e r a t i o n 3 R e a d M e m W r i t e A L U S r c R e a d r e g i s t e r 1 P C R e a d a d d r e s s R e a d M e m t o R e g d a t a 1 Z e r o r e g i s t e r 2 I n s t r u c t i o n A L U A L U R e a d W r i t e R e a d A d d r e s s r e s u l t M d a t a r e g i s t e r d a t a 2 M u I n s t r u c t i o n u x W r i t e m e m o r y D a t a x d a t a m e m o r y W r i t e R e g W r i t e d a t a 3 2 1 6 S i g n M e m R e a d e x t e n d New adder and mux CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 42

Whew! • Keep in mind that the datapath we now have supports just a few MIPS instructions! • Things get worse (more complex) as we support other instructions: j jal jr addi • We won’t worry about them now… CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 43

Control Unit • We need something that can generate the controls in the datapath. • Depending on what kind of instruction we are executing, different controls should be turned on (asserted) and off (deasserted). • We need to treat each control individually (as a separate boolean function). CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 44

Controls • Our datapath includes a bunch of controls: • ALU operation (3 bits) • RegWrite • ALUSrc • MemWrite • MemtoReg • MemRead • PCSrc CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 45

ALU Operation Control • A 3 bit control (assumes the ALU designed in chapter 3): CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 46

ALU Functions for other instructions lw , sw (load/store): addition beq: subtraction add, sub, and, or, slt (arithmetic/logical): All R-format instructions CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 47

5 bits 5 bits 5 bits 5 bits 6 bits 6 bits op rs rt rd shamt funct R-Format Instructions Operation is specified by some bits in the funct field in the instruction. CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 48

6 bits MIPS Instruction OPCODEs • The MS 6 bits are an OPCODE that identifies the instruction. • R-Format: always 000000 • (funct identifies the operation) lw sw beq 100011 101011 000100 op varies depending on instruction CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 49

Generating ALU Controls We can view the 3 bit ALU control as 3 boolean functions. Inputs are: • the op field (OPCODE) • funct field (for R-format instructions only) CSCI-2500 FALL 2012, Processor Design, Chapter 4 — 50

CSCI-2500: Computer Organization