
CSC 4250 Computer Architectures

CSC 4250 Computer Architectures. September 19, 2006 Appendix A. Pipelining. Three Classes of Pipeline Hazards. Structural Hazards: Arise from resource conflicts when hardware cannot support the overlapped execution of all possible combinations of instructions


  1. CSC 4250 Computer Architectures. September 19, 2006. Appendix A. Pipelining

  2. Three Classes of Pipeline Hazards • Structural Hazards: Arise from resource conflicts when hardware cannot support the overlapped execution of all possible combinations of instructions • Data Hazards: Arise when an instruction depends on the result of a previous instruction in a way that is exposed by overlapped execution in the pipeline • Control Hazards: Arise from pipelining of branches and other instructions that change the PC (what is the PC?)

  3. Structural Hazards • A functional unit is not fully pipelined, e.g., FP divide • One register write port: two writes in a cycle; when can this happen? • Single memory pipeline shared by data and instructions: a conflict arises when an instruction contains a data memory reference
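The single-memory-port case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a five-stage pipeline (IF, ID, EX, MEM, WB) that issues one instruction per cycle with no other stalls, and a hypothetical helper name `memory_port_conflicts` that does not come from the slides.

```python
def memory_port_conflicts(uses_data_memory):
    """Find cycles where a data access and an instruction fetch need the one memory port.

    uses_data_memory[i] is True when instruction i (0-based) is a load or store.
    Assumption: instruction i is fetched in cycle i + 1 and is in MEM in cycle i + 4.
    Returns a list of (cycle, memory_instruction, fetching_instruction).
    """
    conflicts = []
    n = len(uses_data_memory)
    for i, is_mem in enumerate(uses_data_memory):
        fetch_instr = i + 3  # the instruction whose IF falls in cycle i + 4
        if is_mem and fetch_instr < n:
            conflicts.append((i + 4, i, fetch_instr))
    return conflicts


# A load followed by three non-memory instructions: the load's MEM stage
# and the fetch of the third following instruction both fall in cycle 4.
print(memory_port_conflicts([True, False, False, False]))
```

In this sketch the conflict is reported rather than resolved; real hardware would stall the fetch for one cycle.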

  4. Figure A.4. Load with One Memory Port

  5. Why Allow Structural Hazards? • Reduce cost • Pipelining (or duplicating) all functional units is expensive (e.g., fully pipelining FP multiply) • Processors that support both an instruction and a data cache access every cycle require twice as much total memory bandwidth

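  6. Data Hazards • Pipelining changes the relative order of read/write accesses to operands: DADD R1,R2,R3; DSUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR R10,R1,R11 • DADD writes R1 in its WB stage (cycle 5) • DSUB reads R1 in its ID stage (cycle 3) → data hazard • Same problem for the AND instruction, which reads R1 in cycle 4 • What about OR? OR reads R1 in cycle 5, the same cycle in which DADD writes R1; this works if the register file is written in the first half of the cycle and read in the second half • XOR reads R1 in cycle 6, after the write, so it is safe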
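The cycle counting on this slide can be checked with a small Python sketch. It assumes no forwarding, one instruction issued per cycle with no stalls, and a register file that writes in the first half of a cycle and reads in the second half (so a read in the same cycle as the write is safe). The function name `raw_hazards` and the tuple layout are illustrative choices, not part of the course material.

```python
def raw_hazards(program):
    """Return (consumer, producer, register) for each read that happens too early.

    program is a list of (name, dest_register, source_registers).
    Assumption: instruction k (0-based) reads registers in ID at cycle k + 2
    and writes its result in WB at cycle k + 5.
    """
    hazards = []
    for j, (name_j, _, sources) in enumerate(program):
        read_cycle = j + 2
        for i in range(j):
            name_i, dest, _ = program[i]
            write_cycle = i + 5
            # Same-cycle write-then-read is allowed under the split-cycle assumption.
            if dest in sources and read_cycle < write_cycle:
                hazards.append((name_j, name_i, dest))
    return hazards


program = [
    ("DADD", "R1", ("R2", "R3")),
    ("DSUB", "R4", ("R1", "R5")),
    ("AND", "R6", ("R1", "R7")),
    ("OR", "R8", ("R1", "R9")),
    ("XOR", "R10", ("R1", "R11")),
]
print(raw_hazards(program))
# [('DSUB', 'DADD', 'R1'), ('AND', 'DADD', 'R1')]
```

OR and XOR do not appear in the result because they read R1 in cycles 5 and 6, no earlier than the write in cycle 5.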

  7. Fig. A.6. Use of DADD Result Causes Data Hazard

  8. Minimize Data Hazard Stalls by Forwarding • ALU result from both EX/MEM and MEM/WB pipeline registers always fed back to ALU inputs • If forwarding hardware detects that previous ALU operation writes the register corresponding to current source for ALU operation, then control logic selects forwarded result as input
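The selection made by the forwarding control logic might be sketched as follows. This is a simplified model, not the actual hardware description: the function name `select_alu_input` and the string labels are hypothetical, `None` stands for "this pipeline register holds no register write", and R0 is treated as the hardwired zero register, as in MIPS.

```python
def select_alu_input(src_reg, ex_mem_dest, mem_wb_dest):
    """Choose where one ALU operand comes from.

    src_reg      : source register named by the instruction now in EX
    ex_mem_dest  : destination of the instruction one ahead (in EX/MEM), or None
    mem_wb_dest  : destination of the instruction two ahead (in MEM/WB), or None
    """
    if src_reg == "R0":
        return "REGFILE"   # R0 always reads as zero, never forwarded
    if ex_mem_dest == src_reg:
        return "EX/MEM"    # most recent result takes priority
    if mem_wb_dest == src_reg:
        return "MEM/WB"
    return "REGFILE"


print(select_alu_input("R1", "R1", None))   # EX/MEM
print(select_alu_input("R1", "R4", "R1"))   # MEM/WB
print(select_alu_input("R7", "R1", "R4"))   # REGFILE
```

Checking EX/MEM before MEM/WB matters when both hold a write to the same register: the younger result is the one the current instruction should see.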

  9. Fig. A.7. Use Forwarding Paths to Avoid Data Hazard

  10. Fig. A.23. Extra Hardware for Forwarding to ALU

  11. Forwarding • Generalized forwarding: a result is forwarded from the pipeline register at the output of one unit to the input of another unit • Forwarding fails: a load followed immediately by an instruction that uses its result causes a delay that forwarding cannot hide • Pipeline interlock: hardware detects a hazard and stalls the pipeline until the hazard is cleared • MIPS: Microprocessor without Interlocked Pipeline Stages
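A pipeline interlock for the load case can be approximated by counting how many loads are immediately followed by a consumer of the loaded register. The sketch below assumes full forwarding otherwise, treats every listed source as needed at the start of EX (so it does not model a store whose data operand could be forwarded later), and uses an illustrative instruction sequence and a hypothetical `LOADS` set.

```python
LOADS = {"LD", "LW"}  # opcodes treated as loads in this sketch (assumption)


def count_load_stalls(program):
    """Count one-cycle interlock stalls in a list of (opcode, dest, sources)."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        opcode, dest, _ = prev
        _, _, sources = curr
        # The loaded value exists only after MEM, one cycle too late for
        # the next instruction's EX stage, so the interlock inserts a bubble.
        if opcode in LOADS and dest in sources:
            stalls += 1
    return stalls


program = [
    ("LD", "R1", ("R2",)),
    ("DSUB", "R4", ("R1", "R5")),
    ("AND", "R6", ("R1", "R7")),
    ("OR", "R8", ("R1", "R9")),
]
print(count_load_stalls(program))  # 1
```

Only DSUB stalls: by the time AND and OR reach EX, the loaded value is available through the normal forwarding paths or the register file.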

  12. Fig. A.8. Forwarding of Operand Required by Stores

  13. Figure A.9. Load Instruction Causes Stall

  14. Figure A.17. Implementation of MIPS Data Path

  15. Figure A.18. Pipeline Data Path by Adding Pipeline Registers

  16. Control Hazard • Branch may change value of PC • Branch is taken or untaken • Three cycles of delay on MIPS

  17. MIPS Branch Delay
      Clock number      1      2      3      4      5      6      7      8      9
      Branch instr.     IF     ID     EX     ME     WB
      Instr. i+1               IF     stall  stall  stall  stall
      Branch target                                 IF     ID     EX     ME     WB
      Branch target+1                                      IF     ID     EX     ME
      Branch target+2                                             IF     ID     EX

  18. How MIPS Reduces Branch Delay • Consider only BEQZ and BNEZ • Move zero test into ID stage (from EX stage) • Compute both PCs (taken and untaken) early • Additional adder in ID stage (old: use ALU) • Only one cycle stall on branches • Branch on result of immediately preceding ALU instruction causes data hazard
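The benefit of cutting the branch stall from three cycles to one can be estimated with the usual relation CPI = ideal CPI + branch frequency × branch penalty. The sketch below assumes every branch pays the full penalty; the 20% branch frequency is a made-up value for illustration only, not a measurement from the course.

```python
def cpi_with_branch_stalls(branch_frequency, branch_penalty, ideal_cpi=1.0):
    """Effective CPI when each branch costs branch_penalty stall cycles."""
    return ideal_cpi + branch_frequency * branch_penalty


BRANCH_FREQUENCY = 0.20  # hypothetical fraction of instructions that are branches

slow = cpi_with_branch_stalls(BRANCH_FREQUENCY, branch_penalty=3)
fast = cpi_with_branch_stalls(BRANCH_FREQUENCY, branch_penalty=1)
print(f"CPI with 3-cycle stall: {slow:.2f}")
print(f"CPI with 1-cycle stall: {fast:.2f}")
print(f"Speedup from moving the test into ID: {slow / fast:.2f}x")
```

The model ignores the extra data hazard that the early zero test introduces, so the real gain would be somewhat smaller.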

  19. Figure A.24. Branch Hazard Stall Reduced to One Cycle

  20. Data Hazard in ALU Instr. followed by Branch
      Clock number        1    2    3    4    5    6    7
      ALU instruction     IF   ID   EX   ME   WB
      Branch instruction       IF   ID   ID   EX   ME   WB
      Example: ADD R1,R2,R3 followed by BEQZ R1,name
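The timing on this slide can be reproduced with a short Python sketch. It assumes the branch's zero test happens in ID and that the ALU result can be forwarded to that test only in the cycle after the ALU instruction's EX stage; both function names are hypothetical.

```python
def branch_id_stalls(alu_fetch_cycle, branch_fetch_cycle):
    """Extra ID cycles the branch needs while waiting for the ALU result."""
    alu_ex_cycle = alu_fetch_cycle + 2        # IF, ID, then EX
    branch_id_cycle = branch_fetch_cycle + 1  # IF, then ID
    # The test can run no earlier than the cycle after the ALU's EX stage.
    return max(0, alu_ex_cycle + 1 - branch_id_cycle)


def timing_row(fetch_cycle, id_stalls=0):
    """Map clock cycle -> stage for one instruction, repeating ID on a stall."""
    stages = ["IF"] + ["ID"] * (1 + id_stalls) + ["EX", "ME", "WB"]
    return {fetch_cycle + k: stage for k, stage in enumerate(stages)}


stalls = branch_id_stalls(alu_fetch_cycle=1, branch_fetch_cycle=2)
print(stalls)                 # 1
print(timing_row(1))          # ALU instruction: IF ID EX ME WB in cycles 1-5
print(timing_row(2, stalls))  # branch: IF ID ID EX ME WB in cycles 2-7
```

If one independent instruction separates the ALU instruction from the branch, `branch_id_stalls(1, 3)` returns 0 and no stall is needed under these assumptions.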

  21. Delayed Branch • Heavily used in early RISC processors • Works well with branch delay of one cycle • Sequential successor is in branch delay slot. This instruction is executed whether or not branch is taken: • Branch instruction • Sequential successor • Branch target if taken
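The execution order with a one-cycle delay slot can be illustrated with a toy interpreter. This is a sketch only: the program encoding (name, branch target index or `None`, taken flag) is invented for the example, and a branch placed in a delay slot is not handled.

```python
def execution_order(program, max_steps=100):
    """Return the names of instructions in the order they execute.

    program is a list of (name, target_index_or_None, taken).
    The instruction after a branch (the delay slot) always executes;
    a taken branch redirects the PC only after that instruction.
    """
    order = []
    pc = 0
    pending_target = None  # where to go once the delay slot has executed
    while pc < len(program) and len(order) < max_steps:
        name, target, taken = program[pc]
        order.append(name)
        next_pc = pc + 1
        if pending_target is not None:
            next_pc = pending_target
            pending_target = None
        if target is not None and taken:
            pending_target = target
        pc = next_pc
    return order


taken = [
    ("branch", 3, True),
    ("sequential successor", None, False),
    ("skipped", None, False),
    ("branch target", None, False),
]
print(execution_order(taken))
# ['branch', 'sequential successor', 'branch target']
```

With the same program and the branch marked untaken, all four instructions execute in sequence, and the sequential successor still runs right after the branch.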

  22. Figure A.14. Schedule Branch Delay Slot
