
Lecture 10: Processors

EEN 312: Processors: Hardware, Software, and Interfacing. Department of Electrical and Computer Engineering, Spring 2014, Dr. Rozier (UM).


Presentation Transcript


  1. Lecture 10: Processors EEN 312: Processors: Hardware, Software, and Interfacing Department of Electrical and Computer Engineering Spring 2014, Dr. Rozier (UM)

  2. LAB 3

  3. PROCESSORS

  4. What needs to be done to “Process” an Instruction?
     • Check the PC
     • Fetch the instruction from memory
     • Decode the instruction and set control lines appropriately
     • Execute the instruction
       • Use ALU
       • Access Memory
       • Branch
     • Store Results
     • PC = PC + 4, or PC = branch target
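The steps on this slide can be sketched as a small interpreter loop. This is only an illustration under stated assumptions: the tuple-based instruction format, the `run` function, and the three supported operations are hypothetical and do not correspond to a real ARM or MIPS encoding.

```python
# Toy fetch/decode/execute loop. Instructions are hypothetical tuples,
# not a real ISA encoding; each one occupies 4 bytes of address space.
def run(program, regs, max_steps=100):
    """Execute `program` until the PC leaves the program or max_steps is hit."""
    pc = 0
    steps = 0
    while 0 <= pc < 4 * len(program) and steps < max_steps:
        instr = program[pc // 4]        # check the PC, fetch the instruction
        op = instr[0]                   # decode
        next_pc = pc + 4                # default: PC = PC + 4
        if op == "add":                 # execute in the ALU, store the result
            _, rd, rn, rm = instr
            regs[rd] = regs[rn] + regs[rm]
        elif op == "sub":
            _, rd, rn, rm = instr
            regs[rd] = regs[rn] - regs[rm]
        elif op == "beq":               # branch: PC = branch target if equal
            _, rn, rm, target = instr
            if regs[rn] == regs[rm]:
                next_pc = target
        else:
            raise ValueError(f"unknown op {op!r}")
        pc = next_pc
        steps += 1
    return regs


program = [
    ("add", "r0", "r1", "r2"),   # address 0
    ("beq", "r0", "r0", 12),     # address 4: always taken, skips address 8
    ("sub", "r0", "r0", "r0"),   # address 8: skipped
    ("add", "r3", "r0", "r1"),   # address 12
]
result = run(program, {"r0": 0, "r1": 2, "r2": 3, "r3": 0})
```

A non-pipelined machine like this one finishes each instruction before fetching the next, which is the baseline the pipelining slides improve on.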

  5. Pipeline [Figure: pipeline diagram with blocks labeled Fetch, Decode, Issue, Integer, Multiply, Floating Point, Load Store, and Write Back]

  6. Pipelining Analogy (§4.5 An Overview of Pipelining)
     • Pipelined laundry: overlapping execution
     • Parallelism improves performance
     • Four loads: Speedup = 8/3.5 = 2.3
     • Non-stop (n loads): Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
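The speedup figures on this slide can be reproduced with a few lines of arithmetic. The sketch assumes what the numbers imply: four stages of half an hour each, so one load takes 2 hours alone and a pipelined run of n loads takes 0.5n + 1.5 hours. The function names are illustrative only.

```python
def sequential_hours(n_loads, stages=4, stage_hours=0.5):
    """Each load runs all stages before the next load starts."""
    return n_loads * stages * stage_hours


def pipelined_hours(n_loads, stages=4, stage_hours=0.5):
    """The first load fills the pipeline; each later load finishes one stage-time after."""
    return (stages + n_loads - 1) * stage_hours


def speedup(n_loads, stages=4, stage_hours=0.5):
    return (sequential_hours(n_loads, stages, stage_hours)
            / pipelined_hours(n_loads, stages, stage_hours))


four_loads = speedup(4)        # 8 / 3.5
many_loads = speedup(10**6)    # approaches the number of stages
```

With only four loads the pipeline spends much of its time filling and draining, which is why the speedup is well below the stage count.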

  7. Hazards
     • Situations that prevent starting the next instruction in the next cycle
     • Structure hazards
       • A required resource is busy
     • Data hazard
       • Need to wait for previous instruction to complete its data read/write
     • Control hazard
       • Deciding on control action depends on previous instruction

  8. Structure Hazards
     • Conflict for use of a resource
     • In pipeline with a single memory
       • Load/store requires data access
       • Instruction fetch would have to stall for that cycle
       • Would cause a pipeline “bubble”
     • Hence, pipelined datapaths require separate instruction/data memories
       • Or separate instruction/data caches

  9. Data Hazards
     • An instruction depends on completion of data access by a previous instruction
         add r0, r4, r1
         sub r2, r0, r3

  10. Forwarding (aka Bypassing)
      • Use result when it is computed
        • Don’t wait for it to be stored in a register
        • Requires extra connections in the datapath

  11. Load-Use Data Hazard
      • Can’t always avoid stalls by forwarding
        • If value not computed when needed
        • Can’t forward backward in time!
          ldr r0, [r2, #0]
          sub r1, r1, r0

  12. Code Scheduling to Avoid Stalls
      • Reorder code to avoid use of load result in the next instruction
      • C code for A = B + E; C = B + F;
          ldr r1, [r0, #0]
          ldr r2, [r0, #4]
          add r3, r1, r2      (stall: r2 is loaded by the previous instruction)
          str r3, [r0, #12]
          ldr r4, [r0, #8]
          add r5, r1, r4      (stall: r4 is loaded by the previous instruction)
          str r5, [r0, #16]
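A rough way to check the effect of reordering is to count load-use pairs in program order. The sketch below assumes a 5-stage pipeline with full forwarding, where the only stall is one cycle when an instruction reads a register loaded by the instruction immediately before it. The tuple format, the function names, and the particular reordering (moving the third load up) are assumptions for illustration; the transcript does not preserve the slide's reordered column.

```python
# Each instruction: (mnemonic, destination register or None, source registers).
def count_load_use_stalls(program):
    """Count instructions that read the result of the load directly before them."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "ldr" and prev_dest in curr_sources:
            stalls += 1
    return stalls


def total_cycles(program, stages=5):
    """Cycles to drain the pipeline: fill time + one per instruction + stalls."""
    return (stages - 1) + len(program) + count_load_use_stalls(program)


original = [
    ("ldr", "r1", ("r0",)),
    ("ldr", "r2", ("r0",)),
    ("add", "r3", ("r1", "r2")),
    ("str", None, ("r3", "r0")),
    ("ldr", "r4", ("r0",)),
    ("add", "r5", ("r1", "r4")),
    ("str", None, ("r5", "r0")),
]

# Hoist the load of r4 above the first add so no add follows its own load.
reordered = [original[i] for i in (0, 1, 4, 2, 3, 5, 6)]
```

Under these assumptions the original order incurs two stalls and the reordered version none, so the same seven instructions finish two cycles sooner.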

  13. Control Hazards
      • Branch determines flow of control
        • Fetching next instruction depends on branch outcome
        • Pipeline can’t always fetch correct instruction
          • Still working on ID stage of branch
      • In pipeline
        • Need to compare registers and compute target early in the pipeline
        • Add hardware to do it in ID stage

  14. Stall on Branch
      • Wait until branch outcome determined before fetching next instruction
          adds r4, r5, r6
          beq label
          add r7, r8, r9

  15. Branch Prediction
      • Longer pipelines can’t readily determine branch outcome early
        • Stall penalty becomes unacceptable
      • Predict outcome of branch
        • Only stall if prediction is wrong
      • In pipeline
        • Can predict branches not taken
        • Fetch instruction after branch, with no delay

  16. MIPS with Predict Not Taken
      • Prediction correct
          adds r4, r5, r6
          beq label
          add r7, r8, r9
      • Prediction incorrect
          adds r4, r5, r6
          beq label
          sub r7, r8, r9

  17. More-Realistic Branch Prediction
      • Static branch prediction
        • Based on typical branch behavior
        • Example: loop and if-statement branches
          • Predict backward branches taken
          • Predict forward branches not taken
      • Dynamic branch prediction
        • Hardware measures actual branch behavior
          • e.g., record recent history of each branch
        • Assume future behavior will continue the trend
          • When wrong, stall while re-fetching, and update history
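One common way to "record recent history of each branch" is a 2-bit saturating counter per branch address. The slide does not name a specific scheme, so the counter, the `TwoBitPredictor` class, its initial state, and the helper below are assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counter: states 0-1 predict not taken, 2-3 taken."""

    def __init__(self, initial_state=1):
        self.initial_state = initial_state   # assumed start: weakly not taken
        self.counters = {}                   # branch address -> state in 0..3

    def predict(self, pc):
        return self.counters.get(pc, self.initial_state) >= 2

    def update(self, pc, taken):
        state = self.counters.get(pc, self.initial_state)
        self.counters[pc] = min(3, state + 1) if taken else max(0, state - 1)


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare with the actual outcome, then update the history."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


# A loop-closing branch: taken 9 times, then falls through; the loop runs twice.
loop_outcomes = ([True] * 9 + [False]) * 2
mispredicted = count_mispredictions(TwoBitPredictor(), 0x40, loop_outcomes)
```

The two bits of hysteresis mean a single loop exit does not flip the prediction, so the second run of the loop mispredicts only on its exit.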

  18. Pipeline Summary (The BIG Picture)
      • Pipelining improves performance by increasing instruction throughput
        • Executes multiple instructions in parallel
        • Each instruction has the same latency
      • Subject to hazards
        • Structure, data, control
      • Instruction set design affects complexity of pipeline implementation

  19. MIPS Pipelined Datapath (§4.6 Pipelined Datapath and Control)
      • Right-to-left flow leads to hazards
      [Figure: pipelined datapath with the MEM and WB stages labeled]

  20. Pipeline registers
      • Need registers between stages
        • To hold information produced in previous cycle

  21. Pipeline Operation
      • Cycle-by-cycle flow of instructions through the pipelined datapath
        • “Single-clock-cycle” pipeline diagram
          • Shows pipeline usage in a single cycle
          • Highlight resources used
        • cf. “multi-clock-cycle” diagram
          • Graph of operation over time
      • We’ll look at “single-clock-cycle” diagrams for load & store

  22. IF for Load, Store, …

  23. ID for Load, Store, …

  24. EX for Load

  25. MEM for Load

  26. WB for Load (wrong register number)

  27. Corrected Datapath for Load

  28. EX for Store

  29. MEM for Store

  30. WB for Store

  31. Multi-Cycle Pipeline Diagram: form showing resource usage

  32. Multi-Cycle Pipeline Diagram: traditional form

  33. Single-Cycle Pipeline Diagram: state of pipeline in a given cycle

  34. Data Hazards in ALU Instructions (§4.7 Data Hazards: Forwarding vs. Stalling)
      • Consider this sequence:
          sub r2, r1, r3
          and r7, r2, r5
          or  r8, r6, r2
          add r9, r2, r2
          str r10, [r2, #100]
      • We can resolve hazards with forwarding
        • How do we detect when to forward?

  35. Dependencies & Forwarding
          sub r2, r1, r3
          and r7, r2, r5
          or  r8, r6, r2
          add r9, r2, r2
          str r10, [r2, #100]

  36. Detecting the Need to Forward
      • Pass register numbers along pipeline
        • e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register
      • ALU operand register numbers in EX stage are given by ID/EX.RegisterRs, ID/EX.RegisterRt
      • Data hazards when
        1a. EX/MEM.RegisterRd = ID/EX.RegisterRs  (fwd from EX/MEM pipeline reg)
        1b. EX/MEM.RegisterRd = ID/EX.RegisterRt  (fwd from EX/MEM pipeline reg)
        2a. MEM/WB.RegisterRd = ID/EX.RegisterRs  (fwd from MEM/WB pipeline reg)
        2b. MEM/WB.RegisterRd = ID/EX.RegisterRt  (fwd from MEM/WB pipeline reg)

  37. Detecting the Need to Forward
      • But only if forwarding instruction will write to a register!
        • EX/MEM.RegWrite, MEM/WB.RegWrite

  38. Forwarding Paths

  39. Double Data Hazard
      • Consider the sequence:
          add r1, r1, r2
          add r1, r1, r3
          add r1, r1, r4
      • Both hazards occur
        • Want to use the most recent
      • Revise MEM hazard condition
        • Only fwd if EX hazard condition isn’t true
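The forwarding conditions from slides 36 to 39 can be written as one selection function per ALU operand. This is a sketch, not the datapath's actual control logic: the dictionary layout, the `forward_select` name, and the returned labels are hypothetical. Checking EX/MEM first implements the revised MEM hazard condition, so the most recent result wins in a double data hazard.

```python
def forward_select(operand_reg, ex_mem, mem_wb):
    """Choose where one ALU operand in the EX stage should come from.

    operand_reg: register number held in ID/EX (Rs or Rt).
    ex_mem, mem_wb: dicts with "reg_write" (bool) and "rd" (destination register).
    """
    # EX hazard (1a/1b): the instruction one ahead will write this register.
    if ex_mem["reg_write"] and ex_mem["rd"] == operand_reg:
        return "EX/MEM"
    # MEM hazard (2a/2b): only reached when the EX hazard condition is false.
    if mem_wb["reg_write"] and mem_wb["rd"] == operand_reg:
        return "MEM/WB"
    return "REGFILE"


# Third add of "add r1,r1,r2 / add r1,r1,r3 / add r1,r1,r4" is in EX:
# both earlier adds write r1, so both hazard conditions match.
double_hazard = forward_select(
    "r1",
    ex_mem={"reg_write": True, "rd": "r1"},
    mem_wb={"reg_write": True, "rd": "r1"},
)
```

The `reg_write` checks cover slide 37: an instruction that does not write a register, such as a store, never acts as a forwarding source even if its register field happens to match.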

  40. Datapath with Forwarding

  41. Load-Use Data Hazard
      • Need to stall for one cycle
          ldr r2, [r1, #20]
          and r4, r2, r5
          or  r8, r2, r6
          add r9, r4, r2
          add r1, r6, r7

  42. Stall/Bubble in the Pipeline
          ldr r2, [r1, #20]
          and becomes nop      (stall inserted here)
          and r4, r2, r5
          or  r8, r2, r6
          add r9, r4, r2

  43. Datapath with Hazard Detection
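The transcript does not preserve the detection condition shown in this slide's figure. A minimal sketch of the usual load-use check follows, assuming the standard condition: stall when the instruction in EX is a load and its destination matches a source of the instruction in ID. The dictionary fields and function names are hypothetical.

```python
def needs_stall(id_ex, if_id):
    """True when the instruction in ID reads the register a load in EX will write."""
    return id_ex["mem_read"] and id_ex["rd"] in if_id["sources"]


def insert_bubbles(program):
    """Return the program with a "nop" placed after each load that needs a stall."""
    scheduled = []
    for i, instr in enumerate(program):
        scheduled.append(instr["text"])
        nxt = program[i + 1] if i + 1 < len(program) else None
        if nxt is not None and needs_stall(instr, nxt):
            scheduled.append("nop")
    return scheduled


sequence = [
    {"text": "ldr r2, [r1, #20]", "mem_read": True,  "rd": "r2", "sources": ("r1",)},
    {"text": "and r4, r2, r5",    "mem_read": False, "rd": "r4", "sources": ("r2", "r5")},
    {"text": "or r8, r2, r6",     "mem_read": False, "rd": "r8", "sources": ("r2", "r6")},
    {"text": "add r9, r4, r2",    "mem_read": False, "rd": "r9", "sources": ("r4", "r2")},
]
with_bubbles = insert_bubbles(sequence)
```

Only the `and` directly after the load needs the bubble; by the time `or` and `add` reach EX, the loaded value can be forwarded or read from the register file.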

  44. Stalls and Performance (The BIG Picture)
      • Stalls reduce performance
        • But are required to get correct results
      • Compiler can arrange code to avoid hazards and stalls
        • Requires knowledge of the pipeline structure

  45. Pitfalls
      • Poor ISA design can make pipelining harder
        • e.g., complex instruction sets (VAX, IA-32)
          • Significant overhead to make pipelining work
          • IA-32 micro-op approach
        • e.g., complex addressing modes
          • Register update side effects, memory indirection
        • e.g., delayed branches
          • Advanced pipelines have long delay slots

  46. Concluding Remarks
      • ISA influences design of datapath and control
      • Datapath and control influence design of ISA
      • Pipelining improves instruction throughput using parallelism
        • More instructions completed per second
        • Latency for each instruction not reduced
      • Hazards: structural, data, control

  47. WRAP UP

  48. For next time • Continued Pipelines and Processors
