Download
pipelining n.
Skip this Video
Loading SlideShow in 5 Seconds..
Pipelining PowerPoint Presentation

Pipelining

53 Views Download Presentation
Download Presentation

Pipelining

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Pipelining See: P&H Chapter 4.5

  2. The Kids • Alice • Bob • They don’t always get along…

  3. The Bicycle

  4. The Materials Saw Drill Glue Paint

  5. The Instructions • N pieces, each built following same sequence: Saw Drill Glue Paint

  6. Design 1: Sequential Schedule • Alice owns the room • Bob can enter when Alice is finished • Repeat for remaining tasks • No possibility for conflicts

  7. Sequential Performance • Elapsed Time for Alice: 4 • Elapsed Time for Bob: 4 • Total elapsed time: 4*N • Can we do better? time1 2 3 4 5 6 7 8 … Latency: Throughput: Concurrency:

  8. Design 2: Pipelined Design • Partition room into stagesof a pipeline Dave Carol Bob Alice One person owns a stage at a time 4 stages 4 people working simultaneously Everyone moves right in lockstep

  9. time1 2 3 4 5 6 7… Pipelined Performance Latency: Throughput: Concurrency: What if some tasks don’t need drill?

  10. Unequal Pipeline Stages 90 min 30 min 45 min 15 min 0h 1h 2h 3h… Latency: Throughput: Concurrency:

  11. Poorly-balanced Pipeline Stages 90 min 30 min 45 min 15 min 0h 1h 2h 3h… Latency: Throughput: Concurrency:

  12. Re-Balanced Pipeline Stages 90 min 30 min 45 min 15 min 0h 1h 2h 3h… Latency: Throughput: Concurrency:

  13. Splitting Pipeline Stages 45+45 min 30 min 45 min 15 min 0h 1h 2h 3h… Latency: Throughput: Concurrency:

  14. Pipeline Hazards • Q: What if glue step of task 3 depends on output of task 1? 0h 1h 2h 3h… Latency: Throughput: Concurrency:

  15. Lessons • Principle: • Latencies can be masked by parallel execution • Pipelining: • Identify pipeline stages • Isolate stages from each other • Resolve pipeline hazards

  16. A Processor memory registerfile inst alu +4 +4 addr =? PC din dout control cmp offset memory new pc target imm extend

  17. A Processor memory registerfile inst alu +4 addr PC din dout control memory new pc computejump/branchtargets imm extend Write-Back InstructionDecode InstructionFetch Memory Execute

  18. Basic Pipeline Five stage “RISC” load-store architecture • Instruction fetch (IF) • get instruction from memory, increment PC • Instruction Decode (ID) • translate opcode into control signals and read registers • Execute (EX) • perform ALU operation, compute jump/branch targets • Memory (MEM) • access memory if needed • Writeback (WB) • update register file Slides thanks to Sally McKee & Kavita Bala

  19. Pipelined Implementation • Break instructions across multiple clock cycles (five, in this case) • Design a separate stage for the execution performed during each clock cycle • Add pipeline registers to isolate signals between different stages