COE 308

ziarre

Presentation Transcript


  1. COE 308 Building the Datapath

  2. Definitions
     • The processor executes instructions that operate on data.
     • Data is modified, moved, produced and discarded as the program works toward its goal.
     • The Datapath is the hardware circuitry of the processor where the data is:
       • Temporarily stored
       • Modified
     • By contrast, the Control Unit is responsible for controlling and monitoring the Datapath to ensure the proper execution of the instructions.

  3. Digital Circuit Model
     A Digital Circuit is modeled as a Sequential Circuit. It consists of several Memorization Elements connected to one another through combinational logic blocks.
     • The Memorization Elements are governed by clocking signals defined by a clocking methodology.
     • The inputs of any combinational logic block come from memorization elements, and its outputs are fed back into memorization elements.
     [Figure: Memorization Elements connected through Combinational Logic]

  4. Clocking Methodology
     Use an edge-triggered clocking methodology:
     • Memorization element values are updated only on the active edge of the clock.
     • Memorization element outputs are available immediately after their update and remain stable for the whole duration of the clock cycle.
     No infinite loop arises when using edge-triggered Memorization Elements.
     [Figure: D/Q memorization element clocked by Ck and feeding Combinational Logic; timing diagram of Ck, Input and Output showing the Memorization Element Delay and the Combinational Logic Delay]
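
A minimal Python sketch of the edge-triggered behaviour described on this slide. The names Register, clock_edge and increment are illustrative and do not come from the slides; the combinational block is assumed to be a simple 32-bit incrementer.

```python
class Register:
    """Edge-triggered memorization element: q changes only in clock_edge()."""

    def __init__(self, value=0):
        self.q = value  # output, stable for the whole clock cycle
        self.d = value  # input, sampled at the next active edge

    def clock_edge(self):
        self.q = self.d


def increment(x):
    """Combinational logic block (assumed): output depends only on the input."""
    return (x + 1) & 0xFFFFFFFF


reg = Register(0)
trace = []
for _ in range(3):
    reg.d = increment(reg.q)  # logic settles during the cycle; q is unchanged
    trace.append(reg.q)       # value visible during this cycle
    reg.clock_edge()          # active edge: q takes the prepared value
```

Because q only changes inside clock_edge(), feeding the output back through the combinational block does not loop forever: each cycle performs exactly one update.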

  5. General Design Methodology
     • Clearly understand the list of requirements.
     • Translate the requirements list into design-oriented requirements that can be directly mapped into design blocks.
     • Divide the design into several design blocks, organized by function, that implement the list of requirements.
     • Clearly define the inputs and outputs of each block (its interface).
     • Connect the blocks together to form the design entity as a whole.
     • Refine the details by verifying that each item in the requirements list is fully met. Make modifications where needed.

  6. Requirements
     • To simplify the design, only a subset of the MIPS instruction set is to be supported by the datapath. The subset consists of the following instructions:
       • Arithmetic and Logic instructions:
         • Add, Sub, And, Or and Slt
         • Both Register and Immediate addressing modes are included
       • Memory Access instructions:
         • Lw and Sw
       • Control Sequence instructions:
         • Branch if equal (Beq)
     • Start with a simple implementation that uses a single clock cycle for every instruction.

  7. Single Clock Cycle Model
     In a single clock cycle execution model:
     • Every instruction is executed in a single clock cycle.
     • The state elements are modified at the end of the execution of each instruction.
     This means:
     • The Datapath architecture should not have intermediate registers to store partial results in the middle of the execution of one instruction.
     [Figure: State Element and Combinational Logic; Ck timing diagram of the Instruction Flow: Instruction i-1, Instruction i, Instruction i+1, Instruction i+2]

  8. How to Proceed
     Need to implement a Datapath that executes 3 types of instructions:
     • Arithmetic and Logic Instructions
     • Memory Access Instructions
     • Control of Sequence Instructions
     It is difficult to address the details and specifics of all 3 instruction types at the same time.
     Solution: use an Incremental Design Methodology. Start with a Datapath that can execute Arithmetic and Logic Instructions alone.

  9. A/L Instruction Properties
     The processor executes the Arithmetic and Logic Instructions as follows:
     • Fetch the instruction from Memory
     • Fetch its operands. The operands are either:
       • Both registers: R-type
       • One register and one immediate value: I-type
     • Execute the required operation (add, sub, and, or, slt)
     • Store the result in the destination register

  10. Fetching Instructions
      • The processor fetches the instructions from memory one by one before executing them.
      • The processor needs to know the address of the instruction to be able to fetch it.
      • The address is usually kept in a register called PC (Program Counter).
      • The PC is incremented to point to the next instruction in memory.
      The PC is the first State Element.

  11. PC Block
      • The PC contains the address of the next instruction to be executed.
      • Instructions are in memory.
      • For simplicity, we assume that the instruction and data memories are separate entities.
      • The Instruction Memory is read-only: no need for the R/Wb signal or the Din bus.
      • The instruction address (PC register output) is fed into the instruction memory to fetch the instruction.
      • Question: Is the instruction memory a combinational or a sequential block?
        • A clocked (synchronous) memory requires at least two clock cycles: the 1st to read the instruction from the memory and the 2nd to execute it. This conflicts with the single-cycle execution requirement.
        • A non-clocked memory (treated as a combinational circuit) satisfies the single-cycle execution requirement.
      [Figure: PC feeding the Addr input of the Instruction Memory; pins shown: Addr, Dout (Iout), Din, Ck, R/Wb]

  12. Register File
      • MIPS contains 32 registers. Need a Register File.
      • First support A/L Instructions of R-type. Need 2 Read operations and 1 Write operation per instruction.
      • Option 1: 1 Read/Write Port. Cannot perform 2 reads and 1 write in a single clock cycle.
      • Option 2: 1 Read Port and 1 Read/Write Port. Still cannot perform 2 reads and 1 write in a single clock cycle.
      • Option 3: 2 Read Ports and 1 Write Port. The Read ports are asynchronous and the Write port is clocked.
      [Figure: Register File with 5-bit inputs Read Register 1, Read Register 2 and Write Register (coming from the instruction), a 32-bit Write Data input, and 32-bit outputs Read Data 1 and Read Data 2]
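
Option 3 can be sketched in Python as follows. This is only an illustration: the class and method names are hypothetical, the reg_write enable anticipates the RegWrite signal introduced on a later slide, and keeping register 0 at zero follows the usual MIPS convention rather than anything stated here.

```python
class RegisterFile:
    """32 x 32-bit registers: 2 asynchronous read ports, 1 clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Asynchronous (combinational) reads: no clock involved.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, reg_write, write_reg, write_data):
        # Clocked write: takes effect only at the active clock edge.
        # Register 0 is kept at zero (MIPS convention, assumed here).
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(True, 8, 5)
rf.clock_edge(True, 9, 7)
both = rf.read(8, 9)           # two reads in the same cycle
rf.clock_edge(False, 8, 99)    # write disabled: register 8 keeps its value
```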

  13. Instruction Register?
      Do we need an Instruction Register to save the fetched instruction?
      • No, because:
        • The Instruction Memory is asynchronous
        • AND the datapath executes each instruction in a single cycle
        • AND the PC is modified only after the current instruction has finished executing
      No Instruction Register means: use the output of the Instruction Memory as the Instruction Register.
      [Figure: timing diagram of Ck, PC out (after the PC delay) and Iout (after the Instruction Memory delay)]

  14. Connecting the Register File
      For R-type A/L instructions, connect the Rs, Rt and Rd fields of the instruction to the register address inputs of the register file.
      R-type instruction format:
        Field:  op      rs      rt      rd      shamt   funct
        Bits:   31-26   25-21   20-16   15-11   10-6    5-0
      • Connect the Rs field, I[25:21], to Read Port Address 1 (Read Register 1)
      • Connect the Rt field, I[20:16], to Read Port Address 2 (Read Register 2)
      • Connect the Rd field, I[15:11], to the Write Port Address (Write Register)
      [Figure: PC → Addr of the Instruction Memory; 32-bit Iout split into the 5-bit Rs, Rt and Rd fields feeding the Register File; 32-bit Read Data 1, Read Data 2 and Write Data]
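
The field wiring on this slide amounts to slicing fixed bit ranges out of the 32-bit instruction word. A small Python sketch, where decode_r_type is an illustrative name and the sample word encodes add $8, $9, $10 using the standard MIPS funct code 0x20:

```python
def decode_r_type(instr):
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "op": (instr >> 26) & 0x3F,     # bits 31-26
        "rs": (instr >> 21) & 0x1F,     # bits 25-21 -> Read Register 1
        "rt": (instr >> 16) & 0x1F,     # bits 20-16 -> Read Register 2
        "rd": (instr >> 11) & 0x1F,     # bits 15-11 -> Write Register
        "shamt": (instr >> 6) & 0x1F,   # bits 10-6
        "funct": instr & 0x3F,          # bits 5-0
    }


# add $8, $9, $10: op=0, rs=9, rt=10, rd=8, shamt=0, funct=0x20
word = (0 << 26) | (9 << 21) | (10 << 16) | (8 << 11) | (0 << 6) | 0x20
fields = decode_r_type(word)
```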

  15. Executing The Operation
      • Use the ALU to perform the Operation.
      • Read Data 1 is connected to the 1st ALU operand.
      • Read Data 2 is connected to the 2nd ALU operand.
      • The Result is written back into the Register File through its single Write Port.
      [Figure: Register File outputs Read Data 1 and Read Data 2 feeding the ALU (control input ALUOp, output Zero); ALU result connected to Write Data; RegWrite controls the write]
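
A behavioural Python sketch of the ALU for the five supported operations, including the Zero output shown in the figure. The function names and the use of operation names as strings are illustrative assumptions; the real ALU is selected through ALUOp.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return (result, zero) for a 32-bit add, sub, and, or, slt."""
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unsupported operation: " + op)
    result &= MASK                   # keep the result within 32 bits
    return result, int(result == 0)  # Zero is 1 when the result is 0
```

For example, alu("sub", 7, 7) yields a zero result with Zero = 1, which is the property the branch instruction relies on later.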

  16. Getting to The Next Instruction
      To get to the next instruction we need to:
      • Increment the PC: PC ← PC + 4 (memory is byte-addressed, so the next 32-bit instruction is 4 bytes further on)
      • Wait for the rising edge of the clock
      Because we are in a single-cycle execution model, the PC is updated only at the rising edge of the clock, which means we must prepare the next value of the PC beforehand.
      We need an adder to perform PC + 4. We have two choices:
      • Use the ALU
      • Add an adder dedicated to adding 4 to the PC

  17. Using the ALU to Increment the PC
      First option: connect the two sources, the PC and the Registers, directly to the ALU.
      Result: Bus Conflict.
      [Figure: PC and the constant 4 wired to the ALU inputs alongside Read Data 1 and Read Data 2 of the Register File]

  18. Using the ALU to Increment the PC (2)
      Second option: connect the two sources, the PC and the Registers, through Muxes that select which block uses the ALU.
      Result: needs 2 phases (clock cycles), which conflicts with the single-cycle execution requirement.
      [Figure: Muxes in front of the ALU inputs selecting between PC / 4 and Read Data 1 / Read Data 2 of the Register File]

  19. PC Increment Block
      No choice but to add an adder dedicated to computing the value PC + 4.
      [Figure: PC and the constant 4 feeding a dedicated adder (+)]

  20. Datapath for R-type A/L Inst.
      [Figure: PC with its dedicated + 4 adder; PC → Addr of the Instruction Memory; Iout → Read Register 1, Read Register 2 and Write Register of the Register File; Read Data 1 and Read Data 2 → ALU (ALUOp, Zero); ALU result → Write Data; RegWrite]

  21. Support for I-type Instructions
      For I-type Instructions, the second operand is an immediate value.
      I-type instruction format:
        Field:  op      rs      rt      I Field
        Bits:   31-26   25-21   20-16   15-0
      The second operand is a signed 16-bit immediate value:
      • It needs to be extended to 32 bits → Sign extend it
      • Add a Sign Extension Block
      The register destination number comes from the rt field (bits [20:16]) instead of the rd field (bits [15:11]):
      • Add a Mux to select between the rd and rt fields for the register destination number
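
What the Sign Extension Block computes can be sketched in a few lines of Python. The function name sign_extend16 is illustrative; the result is returned as a 32-bit pattern.

```python
def sign_extend16(imm16):
    """Extend a 16-bit two's-complement value to a 32-bit pattern."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:              # sign bit set: fill the upper 16 bits with 1s
        return imm16 | 0xFFFF0000
    return imm16                    # sign bit clear: upper 16 bits stay 0


positive = sign_extend16(0x0004)    # +4 stays 0x00000004
negative = sign_extend16(0xFFFC)    # -4 becomes 0xFFFFFFFC
```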

  22. I-type Support
      • ALU Source Mux added to select the second operand between the Rt register value (Read Data 2) and the sign-extended immediate value.
      • Register Destination Mux added to select the register destination number between the Rd field (R-type) and the Rt field (I-type).
      • Sign Extension Block added to sign extend the immediate value from 16 to 32 bits.
      [Figure: Instruction Memory Iout split into Rs: I[25:21] → Read Reg. 1, Rt: I[20:16] → Read Reg. 2, Rd: I[15:11] and I Field: I[15:0] → Sign Extend (16 → 32 bits); Reg. File, ALU with Zero output, Reg. Write]

  23. Memory Access Instructions
      Two instructions access the memory: lw and sw.
      • Both instructions use the displacement addressing mode, where the address is the sum of a register value and a signed displacement: EA = Rs + disp
      • Rs is read from the Register File, so the address calculation takes place in the ALU.
      • The address is then used to access the data memory.
      • The Data Memory is a Read/Write Memory.
      • Two Control Signals are required for the Data Memory: MemRead and MemWrite
      [Figure: Data Memory with inputs Addr, Din, MemRead and MemWrite and output Dout]
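
A short Python sketch of the address calculation EA = Rs + disp, assuming a 32-bit address space and a 16-bit displacement field. The name effective_address is illustrative.

```python
def effective_address(rs_value, disp16):
    """EA = Rs + sign-extended 16-bit displacement, wrapped to 32 bits."""
    disp = disp16 & 0xFFFF
    if disp & 0x8000:        # negative displacement in two's complement
        disp -= 0x10000
    return (rs_value + disp) & 0xFFFFFFFF


forward = effective_address(0x1000, 0x0008)   # base + 8
backward = effective_address(0x1000, 0xFFFC)  # base - 4
```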

  24. Load Instruction
      Lw reads a word from the memory and stores it in the Register File.
      • The Data Memory read access works like the Instruction Memory access: it is asynchronous and behaves like a combinational circuit.
      • The memory read data needs to be stored in the Register File. However, the Write Data input of the Register File is already connected to the output of the ALU (for Arithmetic/Logic Instructions).
      • Solution: insert a Mux to select between the ALU output and the Data Memory read data.
      [Figure: Instruction Memory, Reg. File, Sign Extend, ALU and Data Memory (Addr, Din, Dout); a Mux in front of Write Data]

  25. Store Instruction
      Sw writes a register into the memory.
      • Need to connect the Din bus of the Data Memory to the appropriate signal: the value of the register being stored, which is read on Read Data 2 (the rt register).
      [Figure: Instruction Memory, Reg. File, Sign Extend, ALU and Data Memory (Addr, Din, Dout)]

  26. Branch Equal (beq) Instruction
      Beq rs, rt, label: Branch to “label” if rs == rt. This means:
        if rs == rt → PC ← PC + Offset + 4
        else PC ← PC + 4
      Structural Conflict: need to both
      • Compare Rs and Rt
      • Compute PC + Offset + 4
      Resolution:
      • Use the ALU to perform the comparison.
      • Add a new adder for the displacement.
      How to connect the 16-bit Offset to a 32-bit adder?
      • Beq is I-type.
      • The Offset field is a 16-bit signed value → Sign extend it
      • It represents a word Offset → Shift left by 2 positions
      In summary:
      • Add a new Adder to compute PC + 4 + Offset
      • The Offset is sign-extended and shifted left to form a 32-bit value
      • The ALU is used for the comparison
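
The next-PC selection for beq can be sketched as below. This is a behavioural illustration only: next_pc is a hypothetical name, and the comparison is modelled as a subtraction whose zero result plays the role of the ALU Zero output.

```python
def next_pc(pc, offset16, rs_value, rt_value):
    """Return the next PC for beq rs, rt, offset."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF            # dedicated PC + 4 adder
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                          # sign extend the 16-bit offset
        offset -= 0x10000
    # Word offset -> byte offset (shift left by 2), then the second adder.
    branch_target = (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF
    zero = ((rs_value - rt_value) & 0xFFFFFFFF) == 0   # ALU compares by subtracting
    return branch_target if zero else pc_plus_4


taken = next_pc(0x100, 3, 5, 5)           # equal: 0x104 + 3 * 4
not_taken = next_pc(0x100, 3, 5, 6)       # not equal: fall through to PC + 4
backward = next_pc(0x100, 0xFFFF, 1, 1)   # offset -1: 0x104 - 4
```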

  27. Branch Equal Instruction (2)
      Mux inserted to select between:
      • PC + Offset + 4: when it is a branch instruction and the branch is taken
      • PC + 4: in all the other cases
      [Figure: PC with the + 4 adder and a second adder fed by Sign Extend and Left Shift; Instruction Memory, Reg. File, ALU and Data Memory]

  28. Showing The Control Signals
      [Figure: the complete datapath (PC, + 4 adder, branch adder with Sign Extend and Left Shift, Instruction Memory, Reg. File, ALU with Zero output, Data Memory) annotated with the control signals PCSrc, RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite and MemtoReg]

  29. Control Signals

  30. A Word about ALU Control
      The ALU, as designed in section 4, has two control signals:
      • Operation
      • Binvert
      In all the preceding slides, the ALU includes its own control signal generation block, which takes the ALUOp control signal and translates it into the Operation and Binvert signals.
      [Figure: ALUOp → ALU Ctrl Gen. → Binvert and Operation → ALU]
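
A hedged Python sketch of such a control generation block. The slides do not give the encodings, so everything numeric here is an assumption borrowed from the common textbook convention: ALUOp 0b00 forces an add (lw/sw), 0b01 forces a subtract (beq), 0b10 decodes the R-type funct field; Operation is 0 = AND, 1 = OR, 2 = add, 3 = slt. Only the funct codes are the standard MIPS ones. Immediate arithmetic instructions are not covered.

```python
# Standard MIPS funct codes mapped to (Binvert, Operation); the Operation
# numbering itself is an assumed encoding, not taken from the slides.
FUNCT_TABLE = {
    0x20: (0, 2),  # add
    0x22: (1, 2),  # sub: add with the second operand inverted
    0x24: (0, 0),  # and
    0x25: (0, 1),  # or
    0x2A: (1, 3),  # slt: subtract, then take the sign of the result
}


def alu_control(alu_op, funct):
    """Translate ALUOp (and funct for R-type) into (Binvert, Operation)."""
    if alu_op == 0b00:      # lw / sw: address calculation
        return (0, 2)
    if alu_op == 0b01:      # beq: comparison by subtraction
        return (1, 2)
    if alu_op == 0b10:      # R-type: decode the funct field
        return FUNCT_TABLE[funct]
    raise ValueError("unsupported ALUOp")
```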

  31. Control Signals Settings
      [Table not preserved in this transcript]
      (*) if Zero (from ALU) is equal to 1

  32. Datapath and Control
      [Figure: the complete datapath with a Control block that generates Branch, RegDst, RegWrite, ALUSrc, ALUOp, MemtoReg, MemRead and MemWrite; PCSrc is derived from Branch and the ALU Zero output]
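
The sketch below is not a reconstruction of the slides' settings table, which is missing from this transcript. It lists the settings that follow from the mux descriptions in the earlier slides under assumed polarities: RegDst = 1 selects rd, ALUSrc = 1 selects the sign-extended immediate, MemtoReg = 1 selects the Data Memory output. None marks a don't-care. The instruction class names are illustrative.

```python
CONTROL = {
    "r_type": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0),
    "lw": dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
               MemRead=1, MemWrite=0, Branch=0),
    "sw": dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
               MemRead=0, MemWrite=1, Branch=0),
    "beq": dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                MemRead=0, MemWrite=0, Branch=1),
}


def pc_src(instr_class, zero):
    """PCSrc = Branch AND Zero: take the branch target only for a taken beq."""
    return int(bool(CONTROL[instr_class]["Branch"]) and bool(zero))
```

Only beq can set PCSrc, and only when the ALU reports Zero = 1; every other instruction class selects PC + 4.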

  33. Multicycle Implementation
      • Multiple clock cycles are used to execute a single instruction.
      • The clock cycle is shorter.
      • No conflict between different operations over block usage: a block can be used by one entity in one clock cycle and reused for something else in another clock cycle.
      • Need intermediate registers to save intermediate values.
      • Optimal implementation: a trade-off between the shortest clock cycle time and the smallest number of clock cycles per instruction.

  34. How many clock cycles?
      Choose a clock cycle that performs ONE operation at a time. For example:
      • Instruction Fetch
      • Operand Fetch
      • ALU Operation
      • PC ← PC + 4
      • Memory Access
      • Result Write Back in Register File

  35. Instruction Fetch
      • PC → Instruction Memory
      • Read Instruction
      • Store Instruction into the Instruction Register
      The IR is introduced to allow the PC to change its value without waiting until the last clock cycle of the instruction execution.
      Since the ALU is unused during the Instruction Fetch cycle: use the ALU to perform PC + 4 and store it back in the PC.
      [Figure: PC → Addr of the Instruction Memory; Iout → IR; constant 4]