# Chapter 4 - PowerPoint PPT Presentation

1 / 146
Chapter 4

## Chapter 4

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. Chapter 4 The Processor

2. Introduction §4.1 Introduction EET 4250: Microcomputer Architecture • CPU performance factors • Instruction count • Determined by ISA and compiler • CPI and Cycle time • Determined by CPU hardware • We will examine two MIPS implementations • A simplified version • A more realistic pipelined version • Simple subset, shows most aspects • Memory reference: lw, sw • Arithmetic/logical: add, sub, and, or, slt • Control transfer: beq, j

3. Instruction Execution EET 4250: Microcomputer Architecture • PC  instruction memory, fetch instruction • Register numbers register file, read registers • Depending on instruction class • Use ALU to calculate • Arithmetic result • Memory address for load/store • Branch target address • Access data memory for load/store • PC  target address or PC + 4

4. CPU Overview EET 4250: Microcomputer Architecture

5. Multiplexers • Can’t just join wires together • Use multiplexers EET 4250: Microcomputer Architecture

6. Control EET 4250: Microcomputer Architecture

7. Logic Design Basics §4.2 Logic Design Conventions EET 4250: Microcomputer Architecture • Information encoded in binary • Low voltage = 0, High voltage = 1 • One wire per bit • Multi-bit data encoded on multi-wire buses • Combinational element • Operate on data • Output is a function of input • State (sequential) elements • Store information

8. A Y B A A Mux I0 Y + Y Y I1 ALU B B S F Combinational Elements • Adder • Y = A + B • Arithmetic/Logic Unit • Y = F(A, B) • Multiplexer • Y = S ? I1 : I0 EET 4250: Microcomputer Architecture • AND-gate • Y = A & B

9. D Q Clk Clk D Q Sequential Elements EET 4250: Microcomputer Architecture • Register: stores data in a circuit • Uses a clock signal to determine when to update the stored value • Edge-triggered: update when Clk changes from 0 to 1

10. Clk D Q Write Write D Clk Q Sequential Elements EET 4250: Microcomputer Architecture • Register with write control • Only updates on clock edge when write control input is 1 • Used when stored value is required later

11. Clocking Methodology EET 4250: Microcomputer Architecture • Combinational logic transforms data during clock cycles • Between clock edges • Input from state elements, output to state element • Longest delay determines clock period

12. Building a Datapath • Datapath • Elements that process data and addressesin the CPU • Registers, ALUs, mux’s, memories, … • We will build a MIPS datapath incrementally • Refining the overview design §4.3 Building a Datapath EET 4250: Microcomputer Architecture

13. Creating a Single Datapath from the Parts • Assemble the datapath segments and add control lines and multiplexors as needed • Single cycle design – fetch, decode and execute each instructions in one clock cycle • no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders) • multiplexors needed at the input of shared elements with control lines to do the selection • write signals to control writing to the Register File and Data Memory • Cycle time is determined by length of the longest path

14. Instruction Fetch Increment by 4 for next instruction 32-bit register EET 4250: Microcomputer Architecture

15. R-Format Instructions • Read two register operands • Perform arithmetic/logical operation • Write register result EET 4250: Microcomputer Architecture

16. Load/Store Instructions • Read register operands • Calculate address using 16-bit offset • Use ALU, but sign-extend offset • Load: Read memory and update register • Store: Write register value to memory EET 4250: Microcomputer Architecture

17. Branch Instructions • Read register operands • Compare operands • Use ALU, subtract and check Zero output • Calculate target address • Sign-extend displacement • Shift left 2 places (word displacement) • Add to PC + 4 • Already calculated by instruction fetch EET 4250: Microcomputer Architecture

18. Branch Instructions Justre-routes wires Sign-bit wire replicated EET 4250: Microcomputer Architecture

19. Branch Addressing – Absolute 1st Instruction 2nd Instruction EET 4250: Microcomputer Architecture • Simplified branch ADDR field • If immediate field is 3 bits instead of 16 • beq \$t0, \$t1, ADDR • How do we maximize how big ADDR can be? • Case 1: ADDR is an absolute address • Limit to 23 = 8 addresses in program! • Only 2 instructions (at address 0 & 4)

20. Branch Addressing – Byte Offset 1st Instr. Any of the 232 addresses 2nd Instr. EET 4250: Microcomputer Architecture • Simplified branch ADDR field • If immediate field is 3 bits instead of 16 • Case 2: ADDR is a byte-offset from PC • PC is 32 bits = 232 addresses • Branch can go PC ± 22 bytes = PC ± 4 bytes = PC ± 1 instructions!!

21. Branch Addressing – Word Offset EET 4250: Microcomputer Architecture • Simplified branch ADDR field • If immediate field is 3 bits instead of 16 • Case 3: ADDR is a word-offset from PC • PC is 32 bits = 232 addresses • Branch can go PC ± 22 words = PC ± 16 bytes = PC ± 4 instructions!!

22. Branch Instructions Justre-routes wires Sign-bit wire replicated EET 4250: Microcomputer Architecture

23. Composing the Elements • First-cut data path does an instruction in one clock cycle • Each datapath element can only do one function at a time • Hence, we need separate instruction and data memories • Use multiplexers where alternate data sources are used for different instructions EET 4250: Microcomputer Architecture

24. R-Type/Load/Store Datapath EET 4250: Microcomputer Architecture

25. Full Datapath EET 4250: Microcomputer Architecture

26. 0 4 35 or 43 rs rs rs rt rt rt rd address address shamt funct 31:26 31:26 31:26 25:21 25:21 25:21 20:16 20:16 20:16 15:11 10:6 15:0 15:0 5:0 The Main Control Unit • Control signals derived from instruction R-type Load/Store Branch opcode always read read, except for load write for R-type and load sign-extend and add EET 4250: Microcomputer Architecture

27. ALU Control • ALU used for • Load/Store: F = add • Branch: F = subtract • R-type: F depends on funct field §4.4 A Simple Implementation Scheme EET 4250: Microcomputer Architecture

28. ALU Control • Assume 2-bit ALUOp derived from opcode • Combinational logic derives ALU control EET 4250: Microcomputer Architecture

29. Datapath With Control EET 4250: Microcomputer Architecture

30. Add 4 Fetch PC = PC+4 Instruction Memory Exec Decode Read Address PC Instruction Fetching Instructions • Fetching instructions involves • reading the instruction from the Instruction Memory • updating the PC value to be the address of the next (sequential) instruction • PC is updated every clock cycle, so it does not need an explicit write control signal just a clock signal • Reading from the Instruction Memory is a combinational activity, so it doesn’t need an explicit read control signal

31. Fetch PC = PC+4 Exec Decode Read Addr 1 Read Data 1 Register File Read Addr 2 Write Addr Read Data 2 Write Data Decoding Instructions • Decoding instructions involves • sending the fetched instruction’s opcode and function field bits to the control unit Control Unit Instruction • reading two values from the Register File • Register File addresses are contained in the instruction

32. 31 25 20 15 10 5 0 R-type: op rs rt rd shamt funct RegWrite ALU control Fetch PC = PC+4 Read Addr 1 Read Data 1 Register File Read Addr 2 overflow Instruction zero Exec Decode ALU Write Addr Read Data 2 Write Data Executing R Format Operations • R format operations (add, sub, slt, and, or) • perform operation (op and funct) on values in rs and rt • store the result back into the Register File (into location rd) • Note that Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File

37. Branch-on-Equal Instruction EET 4250: Microcomputer Architecture

41. 2 address 31:26 25:0 Implementing Jumps • Jump uses word address • Update PC with concatenation of • Top 4 bits of old PC • 26-bit jump address • 00 • Need an extra control signal decoded from opcode Jump EET 4250: Microcomputer Architecture

43. Executing Jump Operations • Jump operation involves • replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits Add 4 4 Jump address Instruction Memory Shift left 2 28 Read Address PC Instruction 26

44. Datapath With Jumps Added EET 4250: Microcomputer Architecture

45. Performance Issues • Longest delay determines clock period • Critical path: load instruction • Instruction memory  register file  ALU  data memory  register file • Not feasible to vary period for different instructions • Violates design principle • Making the common case fast • We will improve performance by pipelining EET 4250: Microcomputer Architecture

46. Instruction Critical Paths • What is the clock cycle time assuming negligible delays for muxes, control unit, sign extend, PC access, shift left 2, wires, setup and hold times except: • Instruction and Data Memory (200 ps) • ALU and adders (200 ps) • Register File access (reads or writes) (100 ps)

47. Pipelining Analogy • Pipelined laundry: overlapping execution • Parallelism improves performance §4.5 An Overview of Pipelining EET 4250: Microcomputer Architecture

48. MIPS Pipeline • Five stages, one step per stage • IF: Instruction fetch from memory • ID: Instruction decode & register read • EX: Execute operation or calculate address • MEM: Access memory operand • WB: Write result back to register EET 4250: Microcomputer Architecture

49. Pipeline Performance • Assume time for stages is • 100ps for register read or write • 200ps for other stages • Compare pipelined datapath with single-cycle datapath EET 4250: Microcomputer Architecture

50. Pipeline Performance Single-cycle (Tc= 800ps) Pipelined (Tc= 200ps) EET 4250: Microcomputer Architecture