CS/COE0447 Computer Organization & Assembly Language

nizana
Presentation Transcript


  1. CS/COE0447 Computer Organization & Assembly Language: Multi-Cycle Execution

  2. A Multi-cycle Datapath • A single memory unit for both instructions and data • Single ALU rather than ALU & two adders • Registers added after every major functional unit to hold the output until it is used in a subsequent clock cycle

  3. Multi-Cycle Control: What we need to cover • Adding registers after every functional unit • Need to modify the “instruction execution” slides to reflect this • Breaking instruction execution down into cycles • What can be done during the same cycle? What requires a cycle? • Need to modify the “instruction execution” slides again • Timing • Control signal values • What they are per cycle, per instruction • Finite state machine which determines signals based on instruction type + which cycle it is • Putting it all together

  4. Execution: single-cycle (reminder) • add, e.g. add $t2,$t1,$t0 • Fetch instruction and add 4 to PC • Read the two source registers ($t1 and $t0) • Add the two values ($t1 + $t0) • Store the result to the destination register ($t1 + $t0 → $t2)

  5. A Multi-cycle Datapath • For add: • Instruction is stored in the instruction register (IR) • Values read from rs and rt are stored in A and B • Result of ALU is stored in ALUOut

  6. Execution: single-cycle (reminder) • lw (load word), e.g. lw $t0,-12($t1) • Fetch instruction and add 4 to PC • Read the base register ($t1) • Sign-extend the immediate offset (fff4 → fffffff4) • Add the two values to get the address (X = fffffff4 + $t1) • Access data memory with the computed address (M[X]) • Store the memory data to the destination register ($t0)
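
The sign-extension and address steps for lw can be checked with a few lines of Python. This is only a sketch: the function names and the contents chosen for $t1 are assumptions made for illustration, not part of the slides.

```python
def sign_extend16(value: int) -> int:
    """Sign-extend a 16-bit immediate to 32 bits (returned as an unsigned int)."""
    value &= 0xFFFF
    if value & 0x8000:          # bit 15 set: the immediate is negative
        value |= 0xFFFF0000     # fill the upper 16 bits with ones
    return value


def effective_address(base: int, imm16: int) -> int:
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


t1 = 0x10010020                               # hypothetical contents of $t1
print(hex(sign_extend16(0xFFF4)))             # 0xfffffff4
print(hex(effective_address(t1, 0xFFF4)))     # 0x10010014, i.e. $t1 - 12
```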

  7. A Multi-cycle Datapath • For lw, e.g. lw $t0, -12($t1) • Instruction is stored in the IR • Contents of rs ($t1) stored in A • Output of ALU (address of memory location to be read) stored in ALUOut • Value read from memory is stored in the memory data register (MDR)

  8. Execution: single-cycle (reminder) • sw, e.g. sw $t0,-4($t1) • Fetch instruction and add 4 to PC • Read the base register ($t1) • Read the source register ($t0) • Sign-extend the immediate offset (fffc → fffffffc) • Add the two values to get the address (X = fffffffc + $t1) • Store the contents of the source register to the computed address ($t0 → Memory[X])

  9. A Multi-cycle Datapath • For sw, e.g. sw $t0, -12($t1) • Instruction is stored in the IR • Contents of rs ($t1) stored in A • Contents of rt ($t0, the value to be stored) stored in B • Output of ALU (address of memory location to be written) stored in ALUOut

  10. Execution: single-cycle (reminder) • beq, e.g. beq $t0,$t1,L • Assume that L is +4 instructions away • Fetch instruction and add 4 to PC • Read the two source registers ($t0, $t1) • Sign-extend the immediate and shift it left by 2 (0x0003 → 0x0000000c) • Perform the test, and update the PC if it is true • If $t0 == $t1, then PC = PC + 0x0000000c
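
The branch-target arithmetic above can be sketched in Python. The function name and the address chosen for the beq are assumptions for illustration only. Note that the offset is added to the already incremented PC, which is why an immediate of 3 lands four instructions after the beq.

```python
def branch_target(pc: int, imm16: int) -> int:
    """Target address of a taken beq located at address pc."""
    offset = imm16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x10000                      # negative two's-complement immediate
    # The offset counts instructions (words), so shift left by 2 to get bytes,
    # and add it to PC + 4 (the PC after the fetch step).
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


pc = 0x00400000                                # hypothetical address of the beq
print(hex(branch_target(pc, 0x0003)))          # 0x400010, four instructions after the beq
```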

  11. A Multi-cycle Datapath • For beq, e.g. beq $t0,$t1,label • Instruction stored in IR • Registers rs and rt are stored in A and B • Result of ALU (rs – rt) is stored in ALUOut

  12. Execution: single-cycle (reminder) • j • Fetch instruction and add 4 to PC • Take the 26-bit immediate field • Shift left by 2 (to make 28-bit immediate) • Get 4 bits from the current PC and attach to the left of the immediate • Assign the value to PC
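
A small Python sketch of the jump-address formation described above. The function name and the example addresses are assumptions; the sketch also assumes the upper four bits come from the already incremented PC, consistent with the PC being updated during fetch.

```python
def jump_target(pc: int, instr: int) -> int:
    """Target of a j instruction (32-bit word instr) located at address pc."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    imm26 = instr & 0x03FFFFFF                 # low 26 bits of the instruction
    # Upper 4 bits of the PC, then the 26-bit field shifted left by 2 (28 bits).
    return (pc_plus_4 & 0xF0000000) | (imm26 << 2)


# Hypothetical encoding of "j 0x00400020": opcode 000010, field = target >> 2
instr = (0b000010 << 26) | (0x00400020 >> 2)
print(hex(jump_target(0x00400000, instr)))     # 0x400020
```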

  13. A Multi-cycle Datapath • For j • No accesses to registers or memory; no need for ALU

  14. Multi-Cycle Control: What we need to cover • Adding registers after every functional unit • Need to modify the “instruction execution” slides to reflect this • Breaking instruction execution down into cycles • What can be done during the same cycle? What requires a cycle? • Need to modify the “instruction execution” slides again • Timing • Control signal values • What they are per cycle, per instruction • Finite state machine which determines signals based on instruction type + which cycle it is • Putting it all together

  15. Multicycle Approach • Break up the instructions into steps • each step takes one clock cycle • balance the amount of work to be done in each step/cycle so that they are about equal • restrict each cycle to use at most once each major functional unit so that such units do not have to be replicated • functional units can be shared between different cycles within one instruction

  16. Operations • These take time: • Memory (read/write); register file (read/write); ALU operations • The other connections and logical elements have no latency (for our purposes)

  17. Five Execution Steps • Each takes one cycle • In one cycle, there can be at most one memory access, at most one register access, and at most one ALU operation • But, you can have a memory access, an ALU op, and/or a register access, as long as there is no contention for resources • Changes to registers are made at the end of the clock cycle • PC, ALUOut, A, B, etc. save information for the next clock cycle

  18. Step 1: Instruction Fetch • Access memory w/ PC to fetch instruction and store it in Instruction Register (IR) • Increment PC by 4 • We can do this because the ALU is not being used for something else this cycle

  19. Step 2: Decode and Reg. Read • Read registers rs and rt • We read both of them regardless of necessity • Compute the branch address in case the instruction is a branch • We can do this because the ALU is not busy • ALUOut will keep the target address

  20. Step 3: Various Actions • The ALU performs one of three functions based on instruction type; a jump does not use the ALU (per-instruction summaries follow later) • Memory reference • ALUOut <= A + sign-extend(IR[15:0]); • R-type • ALUOut <= A op B; • Branch: • if (A==B) PC <= ALUOut; • Jump: • PC <= {PC[31:28],IR[25:0],2'b00};

  21. Step 4: Memory Access… • If the instruction is memory reference • MDR <= Memory[ALUOut]; // if it is a load • Memory[ALUOut] <= B; // if it is a store • Store is complete! • If the instruction is R-type • Reg[IR[15:11]] <= ALUOut; • Now the instruction is complete!

  22. Step 5: Register Write Back • Only the lw instruction reaches this step • Reg[IR[20:16]] <= MDR;
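
Following the five steps above, each instruction class finishes after a fixed number of cycles: 3 for beq and j, 4 for R-type and sw, 5 for lw. The sketch below turns that into an average cycles-per-instruction figure. The instruction mix is entirely hypothetical and only serves to show the arithmetic.

```python
# Cycles per instruction class, read off the step descriptions above.
CYCLES = {"R-type": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}


def average_cpi(mix: dict) -> float:
    """Weighted average of cycle counts; mix maps class name to instruction count."""
    total = sum(mix.values())
    return sum(CYCLES[kind] * count for kind, count in mix.items()) / total


# Hypothetical mix of 100 executed instructions (not taken from the slides).
mix = {"R-type": 44, "lw": 24, "sw": 12, "beq": 18, "j": 2}
print(average_cpi(mix))   # 4.04
```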

  23. Multicycle Execution Step (1): Instruction Fetch • IR = Memory[PC]; • PC = PC + 4; • (Datapath figure; highlighted values: 4, PC + 4)

  24. Multicycle Execution Step (2): Instruction Decode & Register Fetch • A = Reg[IR[25-21]]; (A = Reg[rs]) • B = Reg[IR[20-16]]; (B = Reg[rt]) • ALUOut = PC + (sign-extend(IR[15-0]) << 2); (branch target address) • (Datapath figure; highlighted values: Reg[rs], Reg[rt], PC + 4, Branch Target Address)

  25. Multicycle Execution Step (3): Memory Reference Instructions • ALUOut = A + sign-extend(IR[15-0]); • (Datapath figure; highlighted values: Reg[rs], Reg[rt], PC + 4, Mem. Address)

  26. Multicycle Execution Step (4): Memory Access - Write (sw) • Memory[ALUOut] = B; • (Datapath figure; highlighted values: Reg[rs], Reg[rt], PC + 4)

  27. Multicycle Execution Step (4): Memory Access - Read (lw) • MDR = Memory[ALUOut]; • (Datapath figure; highlighted values: Mem. Address, Reg[rs], Reg[rt], PC + 4, Mem. Data)

  28. Multicycle Execution Step (5): Memory Read Completion (lw) • Reg[IR[20-16]] = MDR; • (Datapath figure; highlighted values: Reg[rs], Reg[rt], Mem. Address, PC + 4, Mem. Data)

  29. Multicycle Execution Step (3): ALU Instruction (R-Type) • ALUOut = A op B; • (Datapath figure; highlighted values: Reg[rs], Reg[rt], PC + 4, R-Type Result)

  30. Multicycle Execution Step (4): ALU Instruction (R-Type) • Reg[IR[15-11]] = ALUOut; • (Datapath figure; highlighted values: Reg[rs], Reg[rt], PC + 4, R-Type Result)

  31. Multicycle Execution Step (3): Branch Instructions • if (A == B) PC = ALUOut; • (Datapath figure; highlighted values: Reg[rs], Reg[rt], Branch Target Address)

  32. Multicycle Execution Step (3): Jump Instruction • PC = PC[31-28] concat (IR[25-0] << 2); • (Datapath figure; highlighted values: Reg[rs], Reg[rt], Branch Target Address, Jump Address)

  33. For Reference • The next 5 slides give the steps, one slide per instruction

  34. Multi-Cycle Execution: R-type (example: sub $t0,$t1,$t2) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Decode instruction/register read: A <= Reg[IR[25:21]]; (rs) B <= Reg[IR[20:16]]; (rt) ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: ALUOut <= A op B; (op = add, sub, and, or, …) • Completion: Reg[IR[15:11]] <= ALUOut; ($t0 <= ALU result)

  35. Multi-cycle Execution: lw (example: lw $t0,-12($t1)) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Instruction decode/register read: A <= Reg[IR[25:21]]; (rs) B <= Reg[IR[20:16]]; ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: ALUOut <= A + sign-extend(IR[15:0]); ($t1 + -12, sign-extended) • Memory access: MDR <= Memory[ALUOut]; (M[$t1 + -12]) • Write-back: Reg[IR[20:16]] <= MDR; ($t0 <= M[$t1 + -12])

  36. Multi-cycle Execution: sw (example: sw $t0,-12($t1)) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Decode/register read: A <= Reg[IR[25:21]]; (rs) B <= Reg[IR[20:16]]; (rt) ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: ALUOut <= A + sign-extend(IR[15:0]); ($t1 + -12, sign-extended) • Memory access: Memory[ALUOut] <= B; (M[$t1 + -12] <= $t0)

  37. Multi-cycle Execution: beq (example: beq $t0,$t1,label) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Decode/register read: A <= Reg[IR[25:21]]; (rs) B <= Reg[IR[20:16]]; (rt) ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: if (A == B) then PC <= ALUOut; (if $t0 == $t1, perform the branch)

  38. Multi-cycle Execution: j (example: j label) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Decode/register read: A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]]; ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: PC <= {PC[31:28],IR[25:0],2'b00};
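
The five per-instruction summaries above can be collected into one small Python model. This is a sketch under stated assumptions, not the course's reference implementation: memory is a dictionary keyed by byte address that holds both instructions and data (mirroring the single memory unit), only add and sub are modeled for R-type, writes to register 0 are not special-cased, and the function names are invented for illustration.

```python
def sext16(x: int) -> int:
    """Interpret the low 16 bits of x as a signed two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def run_instruction(pc, reg, mem):
    """Execute one instruction step by step; return (new_pc, cycles_used)."""
    M = 0xFFFFFFFF
    # Step 1: instruction fetch
    ir = mem[pc]
    pc = (pc + 4) & M
    # Step 2: decode / register read (performed for every instruction)
    op = ir >> 26
    a = reg[(ir >> 21) & 31]                    # A <= Reg[rs]
    b = reg[(ir >> 16) & 31]                    # B <= Reg[rt]
    imm = sext16(ir)
    alu_out = (pc + (imm << 2)) & M             # branch target, computed speculatively
    # Step 3: action depends on the instruction type
    if op == 0x04:                              # beq
        return (alu_out if a == b else pc), 3
    if op == 0x02:                              # j
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2), 3
    if op == 0x00:                              # R-type (add and sub only)
        funct = ir & 0x3F
        if funct == 0x20:
            alu_out = (a + b) & M
        elif funct == 0x22:
            alu_out = (a - b) & M
        else:
            raise ValueError("funct not modeled in this sketch")
        reg[(ir >> 11) & 31] = alu_out          # Step 4: completion
        return pc, 4
    alu_out = (a + imm) & M                     # lw / sw address
    if op == 0x2B:                              # sw, Step 4: memory write
        mem[alu_out] = b
        return pc, 4
    if op == 0x23:                              # lw
        mdr = mem[alu_out]                      # Step 4: memory read
        reg[(ir >> 16) & 31] = mdr              # Step 5: write-back
        return pc, 5
    raise ValueError("opcode not modeled in this sketch")


# lw $t0,-12($t1) with hypothetical addresses and data
reg = [0] * 32
reg[9] = 0x100C                                 # $t1
mem = {0x400000: 0x8D28FFF4, 0x1000: 42}        # instruction word and one data word
print(run_instruction(0x400000, reg, mem), reg[8])   # (4194308, 5) 42
```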

  39. Multi-Cycle Control: What we need to cover • Adding registers after every functional unit • Need to modify the “instruction execution” slides to reflect this • Breaking instruction execution down into cycles • What can be done during the same cycle? What requires a cycle? • Need to modify the “instruction execution” slides again • Timing • Control signal values • What they are per cycle, per instruction • Finite state machine which determines signals based on instruction type + which cycle it is • Putting it all together
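
The finite state machine mentioned above picks the next state from the current cycle and the instruction type. A minimal Python sketch of that idea follows. The state names are invented for illustration and are not taken from the slides; a real controller would also emit the control signal values in each state.

```python
def next_state(state: str, kind: str) -> str:
    """Next controller state for an instruction of the given kind."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return {"R-type": "execute", "lw": "mem_addr", "sw": "mem_addr",
                "beq": "branch", "j": "jump"}[kind]
    if state == "mem_addr":
        return "mem_read" if kind == "lw" else "mem_write"
    if state == "mem_read":
        return "write_back"
    if state == "execute":
        return "r_complete"
    # mem_write, write_back, r_complete, branch, jump: instruction is finished
    return "fetch"


def state_sequence(kind: str) -> list:
    """States visited by one instruction, from fetch until it completes."""
    seq = ["fetch"]
    while True:
        nxt = next_state(seq[-1], kind)
        if nxt == "fetch":
            return seq
        seq.append(nxt)


print(state_sequence("lw"))    # ['fetch', 'decode', 'mem_addr', 'mem_read', 'write_back']
print(len(state_sequence("beq")))   # 3
```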

  40. Datapath w/ Control Signals

  41. Final Version w/ Control

  42. Example from beginning to end • lw $t0,4($t1) • Machine code: opcode = 100011 (IR[31:26]), rs = 01001 (IR[25:21]), rt = 01000 (IR[20:16]), immediate = 0000 0000 0000 0100 (IR[15:0])
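
The machine code above can be reproduced by packing the four fields into a 32-bit word. The helper names below are assumptions for illustration; the register numbers 8 ($t0) and 9 ($t1) match the rt and rs bit patterns given on the slide.

```python
def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack an I-type instruction: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i_type(word: int) -> tuple:
    """Split a 32-bit word back into (opcode, rs, rt, imm16)."""
    return (word >> 26, (word >> 21) & 31, (word >> 16) & 31, word & 0xFFFF)


T0, T1 = 8, 9                                   # register numbers of $t0 and $t1
word = encode_i_type(0b100011, rs=T1, rt=T0, imm=4)   # lw $t0,4($t1)
print(hex(word))                                # 0x8d280004
print(decode_i_type(word))                      # (35, 9, 8, 4)
```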

  43. Multi-cycle Execution: lw (example: lw $t0,-12($t1)) • Instruction fetch: IR <= Memory[PC]; PC <= PC + 4; • Instruction decode/register read: A <= Reg[IR[25:21]]; (rs) B <= Reg[IR[20:16]]; ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution: ALUOut <= A + sign-extend(IR[15:0]); ($t1 + -12, sign-extended) • Memory access: MDR <= Memory[ALUOut]; (M[$t1 + -12]) • Write-back: Reg[IR[20:16]] <= MDR; ($t0 <= M[$t1 + -12])

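  44. Example: Load (1) • (Datapath figure annotated with control signal values: 00 1 1 0 0 1 0 01 00)

  45. Example: Load (2) • (Datapath figure annotated with rs, rt and control signal values: 0 11 00)

  46. Example: Load (3) • (Datapath figure annotated with control signal values: 1 10 00)

  47. Example: Load (4) • (Datapath figure annotated with control signal values: 1 1 0)

  48. Example: Load (5) • (Datapath figure annotated with control signal values: 1 0 1)

  49. Example: Jump (1) • (Datapath figure annotated with control signal values: 00 1 1 0 0 1 0 01 00)

  50. Example: Jump (2) • (Datapath figure annotated with control signal values: 0 11 00)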
