1 / 31

Building a Multi-Cycle Datapath

This material covers the implementation of a multicycle MIPS datapath. It discusses the disadvantages of a single-cycle datapath and introduces the concept of multicycle implementations.

lcarol
Download Presentation

Building a Multi-Cycle Datapath

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 14:332:331Computer Architecture and Assembly LanguageSpring 2006Week 10Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]

  2. Head’s Up • This week’s material • Multicycle MIPS datapath implementation • Reading assignment – PH 5.5 and C.3

  3. Review: Single Cycle Data and Control Path Instr[25-0] 1 Shift left 2 28 32 26 0 PC+4[31-28] 0 Add Add 1 4 Shift left 2 PCSrc Jump ALUOp Branch MemRead Instr[31-26] Control MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] 0 Read Data 1 ALU Write Addr Read Data 2 0 1 Write Data 0 Instr[15 -11] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

  4. Disadvantages of the Single Cycle Datapath • Uses the clock cycle inefficiently – the clock cycle must be timed to accommodate the slowest instruction • especially problematic for more complex instructions like floating point multiply • Is wasteful of area since some functional units must be duplicated since they can not be “shared” during an instruction execution • e.g., need separate adders to do PC update and branch target address calculations, as well as an ALU to do R-type arithmetic/logic operations and data memory address calculations

  5. Multicycle Implementation Overview • Each step in the execution takes 1 clock cycle • An instruction takes more than 1 clock cycle to complete • Not every instruction takes thesame number of clock cycles to complete • Multicycle implementations allow • functional units to be used more than once per instruction as long as they are used on different clock cycles, as a result • only need one memory • need only one ALU/adder • faster clock rates • different instructions to take a different number of clock cycles

  6. IR A ALUout B MDR The Multicycle Datapath – A High Level View • Registers have to be added after every major functional unit to hold the output value until it is used in a subsequent clock cycle Address Memory Read Addr 1 PC Read Data 1 Register File Read Addr 2 Read Data (Instr. or Data) ALU Write Addr Write Data Read Data 2 Write Data

  7. System Clock clock cycle MemWrite RegWrite IR Address Memory Read Addr 1 A PC Read Data 1 Register File Read Addr 2 Read Data (Instr. or Data) ALUout ALU Write Addr Write Data Read Data 2 B Write Data MDR Clocking the Multicycle Datapath

  8. Multicycle Approach • Break up the instructions into steps where each step takes a cycle while trying to • balance the amount of work to be done in each step • restrict each cycle to use only one major functional unit • At the end of a cycle • Store values needed in a later cycle by the current instruction in a state element (internal register) not visible to the programmer IR – Instruction Register MDR – Memory Data Register A and B – register file read data registers ALUout – ALU output register • All (except IR) hold data only between a pair of adjacent clock cycles (so don’t need a write control signal) • Data used by subsequent instructions are stored in programmer visible state elements (i.e., register file, PC, or memory)

  9. MDR The Complete Multicycle Data with Control PCWriteCond PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA MemtoReg RegWrite IRWrite RegDst PC[31-28] Instr[31-26] Shift left 2 28 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  10. Multicycle Approach, con’t • Reading/writing to • any of the internal registers or the PC occurs (quickly) at the end of a clock cycle • reading/writing to the register file takes ~50% of a clock cycle since it has additional control and access overhead (reading can be done in parallel with decode) • Have to add multiplexors in front of several of the functional unit inputs because the functional units are shared by different instruction cycles • All operations occurring in one step occur in parallel within the same clock cycle • This limits us to one ALU operation, one memory access, and one register file access per step (per clock cycle)

  11. Five Instruction Steps • Instruction Fetch • Instruction Decode and Register Fetch • R-type Instruction Execution, Memory Read/Write Address Computation, Branch Completion, or Jump Completion • Memory Read Access, Memory Write Completion or R-type Instruction Completion • Memory Read Completion (Write Back) INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

  12. Step 1: Instruction Fetch • Use PC to get instruction from the memory and put it in the Instruction Register • Increment the PC by 4 and put the result back in the PC • Can be described succinctly using RTL "Register-Transfer Language“ IR = Memory[PC]; PC = PC + 4; Can we figure out the values of the control signals? What is the advantage of updating the PC now?

  13. Fetch Control Signals Settings Instr Fetch Start

  14. Step 2: Instruction Decode and Register Fetch • Don’t know what the instruction is yet, so can only • Read registers rs and rt in case we need them • Compute the branch address in case the instruction is a branch • RTL: A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut = PC +(sign-extend(IR[15-0])<< 2); • Note we aren't setting any control lines based on the instruction (since we are busy "decoding" it in our control logic)

  15. MDR Datapath Activity During Instruction Decode PCWriteCond PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA MemtoReg RegWrite IRWrite RegDst PC[31-28] Instr[31-26] Shift left 2 28 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  16. Decode Control Signals Settings Decode IorD=0 MemRead;IRWrite ALUSrcA=0 ALUsrcB=01 PCSource,ALUOp=00 PCWrite Instr Fetch Unless otherwise assigned PCWrite,IRWrite, MemWrite,RegWrite=0 others=X Start

  17. Step 3 (instruction dependent) • ALU is performing one of four functions, based on instruction type • Memory reference (lw and sw): ALUOut = A + sign-extend(IR[15-0]); • R-type: ALUOut = A op B; • Branch: if (A==B) PC = ALUOut; • Jump: PC = PC[31-28] || (IR[25-0] << 2);

  18. MDR Datapath Activity During Instruction Execute PCWriteCond PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA MemtoReg RegWrite IRWrite RegDst PC[31-28] Instr[31-26] Shift left 2 28 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  19. Execute Control Signals Settings Decode IorD=0 MemRead;IRWrite ALUSrcA=0 ALUsrcB=01 PCSource,ALUOp=00 PCWrite Instr Fetch Unless otherwise assigned PCWrite,IRWrite, MemWrite,RegWrite=0 others=X ALUSrcA=0 ALUSrcB=11 ALUOp=00 PCWriteCond=0 Start (Op = R-type) (Op = beq) (Op = lw or sw) (Op = j) Execute

  20. Step 4 (instruction dependent) • Memory reference: MDR = Memory[ALUOut]; -- lwor Memory[ALUOut] = B;-- sw • R-type instruction completion Reg[IR[15-11]] = ALUOut; • Remember, the register write actually takes place at the end of the cycle on the clock edge

  21. MDR Datapath Activity During Memory Access PCWriteCond PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA MemtoReg RegWrite IRWrite RegDst PC[31-28] Instr[31-26] Shift left 2 28 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  22. Memory Access Control Signals Settings Decode IorD=0 MemRead;IRWrite ALUSrcA=0 ALUsrcB=01 PCSource,ALUOp=00 PCWrite Instr Fetch Unless otherwise assigned PCWrite,IRWrite, MemWrite,RegWrite=0 others=X ALUSrcA=0 ALUSrcB=11 ALUOp=00 PCWriteCond=0 Start (Op = R-type) (Op = beq) (Op = lw or sw) (Op = j) ALUSrcA=1 ALUSrcB=10 ALUOp=00 PCWriteCond=0 ALUSrcA=1 ALUSrcB=00 ALUOp=01 PCSource=01 PCWriteCond ALUSrcA=1 ALUSrcB=00 ALUOp=10 PCWriteCond=0 PCSource=10 PCWrite Execute (Op = lw) (Op = sw) Memory Access

  23. Step 5: Memory Read Completion (Write Back) • All we have left is the write back into the register file the data just read from memory for lw instruction Reg[IR[20-16]]= MDR; What about all the other instructions?

  24. MDR Datapath Activity During Write Back PCWriteCond PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA MemtoReg RegWrite IRWrite RegDst PC[31-28] Instr[31-26] Shift left 2 28 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  25. Write Back Control Signals Settings Decode IorD=0 MemRead;IRWrite ALUSrcA=0 ALUsrcB=01 PCSource,ALUOp=00 PCWrite Instr Fetch Unless otherwise assigned PCWrite,IRWrite, MemWrite,RegWrite=0 others=X ALUSrcA=0 ALUSrcB=11 ALUOp=00 PCWriteCond=0 Start (Op = R-type) (Op = beq) (Op = lw or sw) (Op = j) ALUSrcA=1 ALUSrcB=10 ALUOp=00 PCWriteCond=0 ALUSrcA=1 ALUSrcB=00 ALUOp=01 PCSource=01 PCWriteCond ALUSrcA=1 ALUSrcB=00 ALUOp=10 PCWriteCond=0 PCSource=10 PCWrite Execute (Op = lw) (Op = sw) Memory Access RegDst=1 RegWrite MemtoReg=0 PCWriteCond=0 MemRead IorD=1 PCWriteCond=0 MemWrite IorD=1 PCWriteCond=0 Write Back

  26. RTL Summary

  27. Simple Questions • How many cycles will it take to execute this code? lw $t2, 0($t3) lw $t3, 4($t3) beq $t2, $t3, Label #assume not add $t5, $t2, $t3 sw $t5, 8($t3)Label: ... • What is going on during the 8th cycle of execution? • In what cycle does the actual addition of $t2 and $t3 takes place? • In what cycle is the branch target address calculated?

  28. Datapath control points Combinational control logic State Reg Inst Opcode Next State Multicycle Control • Multicycle datapath control signals are not determined solely by the bits in the instruction • e.g., op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next • We’ll use a finite state machine for control • a set of states (current state stored in State Register) • next state function (determined by current state and the input) • output function (determined by current state and the input) • We’ll use a Moore machine (so control signals based only on current state) . . . . . . . . .

  29. Finite State Machine Implementation PCWrite PCWriteCond IorD MemRead MemWrite IRWrite MemtoReg Combinational control logic PCSource Outputs ALUOp ALUSourceB ALUSourceA RegWrite RegDst Inputs Op5 Op4 Op3 Op2 Op1 Op0 Next State State Reg Inst[31-26] System Clock

  30. Datapath Control Outputs Truth Table

  31. Next State Truth Table

More Related