Computer Architecture

Computer Architecture. Lecture 18 Superscalar Processor and High Performance Computing. Static Superscalar Pipeline. Fetch 64-bits/clock cycle; Int on left, FP on right – Can only issue 2nd instruction if 1st instruction issues – More ports for FP registers to do FP load & FP op in a pair


Presentation Transcript


  1. Computer Architecture Lecture 18 Superscalar Processor and High Performance Computing

  2. Static Superscalar Pipeline
     • Fetch 64 bits/clock cycle; Int on left, FP on right
     • Can only issue the 2nd instruction if the 1st instruction issues
     • More ports for FP registers to do an FP load and an FP op in a pair
     Type              Pipe stages
     Int. instruction  IF  ID  EX  MEM WB
     FP instruction    IF  ID  EX  MEM WB
     Int. instruction      IF  ID  EX  MEM WB
     FP instruction        IF  ID  EX  MEM WB
     Int. instruction          IF  ID  EX  MEM WB
     FP instruction            IF  ID  EX  MEM WB
     • A 1-cycle load delay can delay up to 3 instructions in a superscalar: the instruction in the right half of the same pair cannot use the loaded value, nor can the instructions in the next issue slot
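
The "up to 3 instructions" figure in the last bullet can be checked with a small sketch. It assumes instructions are packed two per cycle in program order and that the load sits in the left (integer) slot; the function name `load_delay_shadow` and its parameters are illustrative, not from the lecture.

```python
def load_delay_shadow(num_instructions, load_pos, issue_width=2, load_delay=1):
    """Return positions of instructions that cannot use the load's result without stalling.

    Assumption: instruction i issues in cycle i // issue_width, and a loaded
    value is unusable by anything issued within `load_delay` cycles of the load.
    """
    load_cycle = load_pos // issue_width
    shadow = []
    for pos in range(load_pos + 1, num_instructions):
        cycle = pos // issue_width
        if cycle - load_cycle <= load_delay:
            shadow.append(pos)
    return shadow


# Dual issue: the FP instruction paired with the load plus the whole next pair.
print(load_delay_shadow(6, load_pos=0))
# Single issue: only the one instruction right after the load.
print(load_delay_shadow(6, load_pos=0, issue_width=1))
```

With `issue_width=2` the shadow covers three instructions; with `issue_width=1` it covers only the one instruction that follows the load in a scalar pipeline.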

  3. Dynamic Super Scalar pipeline in operation [Figure: Instr. Cache feeding two ISSUE/Rename to RS stages (Check for RAW, Check for RS); reservation stations (Wait for Operands) in front of the LD/ST unit (EX TAC, Mem Access), the FP pipelines A1 to A4 and M1 .. M7, and Divide; CDB #1 and CDB #2 on a wider bus; Read Reg and Write Reg]

  4. Example 1
     Loop: L.D    F0,0(R1)     ; F0 = array element
           ADD.D  F4,F0,F2     ; add scalar in F2
           S.D    F4,0(R1)     ; store result
           ADDIU  R1,R1,#-8    ; 8 bytes (per DW)
           BNE    R1,R2,Loop   ; branch if R1 != R2
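
For readers less used to MIPS, here is a rough high-level equivalent of the Example 1 loop. It assumes the array is non-empty and that R2 holds the address just below the first element, so the loop walks from the last element down to the first; `add_scalar` and the index variable are illustrative names, not part of the lecture.

```python
def add_scalar(array, scalar):
    """Add `scalar` (F2) to every element, walking from the last element down."""
    i = len(array) - 1                 # R1 starts at the last element (assumption)
    while True:
        array[i] = array[i] + scalar   # L.D F0 / ADD.D F4,F0,F2 / S.D F4
        i -= 1                         # ADDIU R1,R1,#-8 (one double word)
        if i == -1:                    # BNE falls through once R1 == R2
            break
    return array


print(add_scalar([1.0, 2.0, 3.0], 0.5))
```

Like the assembly, this is a do-while loop: the body runs once before the branch condition is tested.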

  5. Dual issue, 1 Integer Unit

  6. Dual issue, 1 Integer Unit

  7. Dual issue, 1 Integer Unit

  8. Dual issue, 1 Integer Unit

  9. Dual issue, 1 Integer Unit

  10. Dual issue, 1 Integer Unit

  11. Dual issue, 1 Integer Unit

  12. Dual issue, 1 Integer Unit

  13. Dual issue, 1 Integer Unit

  14. Dual issue, 1 Integer Unit

  15. Dual issue, 1 Integer Unit

  16. Dual issue, 1 Integer Unit

  17. Dual issue, 1 Integer Unit

  18. Dual issue, 1 Integer Unit

  19. Dual issue, 1 Integer Unit

  20. Dual issue, 1 Integer Unit

  21. Dual issue, 1 Integer Unit

  22. Dual issue, 1 Integer Unit

  23. Separate MEM and INT [Figure: Instr. Cache feeding two ISSUE/Rename to RS stages (Check for RAW, Check for RS); reservation stations (Wait for Operands) in front of the LD/ST unit (EX TAC, Mem Access), a separate Integer unit (EX), the FP pipelines A1 to A4 and M1 .. M7, and Divide; CDB #1 and CDB #2 on a wider bus; Read Reg and Write Reg]

  24. Dual issue, 2 Integer Unit

  25. Dual issue, 2 Integer Unit

  26. Dual issue, 2 Integer Unit

  27. Dual issue, 2 Integer Unit

  28. Dual issue, 2 Integer Unit

  29. Dual issue, 2 Integer Unit

  30. Dual issue, 2 Integer Unit

  31. Dual issue, 2 Integer Unit

  32. Dual issue, 2 Integer Unit

  33. Dual issue, 2 Integer Unit

  34. Dual issue, 2 Integer Unit

  35. Dual issue, 2 Integer Unit

  36. Dual issue, 2 Integer Unit

  37. Dual issue, 2 Integer Unit

  38. Speculative Execution
      • Need to overcome
        • Branch hazards
        • Precise exceptions

  39. Speculative Pipeline [Figure: ISSUE/Rename to RS (Check for RAW, Check for RS) and Read Reg; reservation stations (Wait for Operands) in front of the LD/ST unit (EX TAC, Mem Access), the Integer unit (EX), the FP pipelines A1 to A4 and M1 .. M7, and Divide; a CDB, the ROB, and Write Reg]

  40. The Hardware: Reorder Buffer
      • If instructions write their results in program order, registers and memory always get the correct values
      • Reorder buffer (ROB): puts out-of-order instructions back into program order at the time they write registers/memory (commit)
      • If an instruction goes wrong, handle it at commit time: just flush the instructions after it
      • An instruction cannot write registers/memory immediately after execution, so the ROB also buffers the results
      • Note: there is no such place in the original Tomasulo design
      [Figure: IM, Fetch Unit, Decode, Rename, Reorder Buffer, Regfile, S-buf, L-buf, RS, RS, FU1, FU2, DM]
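
The in-order commit idea can be illustrated with a minimal sketch. This is a toy model under simplifying assumptions (register destinations only, no memory, any number of commits per call); the class and method names are illustrative, not taken from the lecture.

```python
from collections import deque


class ReorderBuffer:
    """Toy ROB: entries are kept in program order, oldest at the left."""

    def __init__(self):
        self.entries = deque()

    def allocate(self, dest):
        # Called at issue time, in program order.
        entry = {"dest": dest, "value": None, "done": False}
        self.entries.append(entry)
        return entry

    def write_result(self, entry, value):
        # Called when a functional unit finishes, possibly out of order.
        entry["value"] = value
        entry["done"] = True

    def commit(self, regfile):
        # Only the head may update the register file, so writes stay in order.
        committed = []
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            regfile[head["dest"]] = head["value"]
            committed.append(head["dest"])
        return committed


rob, regs = ReorderBuffer(), {}
older = rob.allocate("F0")
younger = rob.allocate("F4")
rob.write_result(younger, 2.0)   # the younger instruction finishes first
print(rob.commit(regs))          # nothing commits: the head is not done yet
rob.write_result(older, 1.0)
print(rob.commit(regs))          # both commit, in program order
```

The younger result sits in the buffer until the older instruction completes, which is the buffering role described in the last bullet of the slide.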

  41. Speculative Tomasulo Algorithm
      • Issue: get instruction from FP Op Queue
        • Condition: a free RS at the required FU
        • Actions: (1) decode the instruction; (2) allocate an RS and a ROB entry; (3) do source register renaming; (4) do dest register renaming; (5) read register file; (6) dispatch the decoded and renamed instruction to the RS and ROB
      • Execution: operate on operands (EX)
        • Condition: at a given FU, at least one instruction is ready
        • Action: select a ready instruction and send it to the FU
      • Write result: finish execution (WB)
        • Condition: at a given FU, some instruction finishes FU execution
        • Actions: (1) FU writes to the CDB, which broadcasts to all RSs and to the ROB; (2) FU broadcasts the tag (ROB index) to all RSs; (3) de-allocate the RS. Note: no register status update at this time

  42. Speculative Tomasulo Algorithm
      • Commit: update register with reorder result
        • Condition: the ROB is not empty and the instruction at the ROB head has finished execution
        • Actions if there is no misprediction/exception: (1) write the result to register/memory; (2) update register status; (3) de-allocate the ROB entry
        • Actions if there is a misprediction/exception: flush the pipeline, e.g. (1) flush the IFQ; (2) clear register status; (3) flush all RSs and reset the FUs; (4) reset the ROB
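
The commit stage above can be sketched as a single function. This is a simplified model, not the lecture's implementation: the ROB is a plain list with the head at index 0, the register status table maps a register name to the ROB tag that will produce it, and the IFQ, RSs, and FUs are left out. All names (`commit_step`, the entry fields) are assumptions made for illustration.

```python
def commit_step(rob, regfile, reg_status):
    """Try to commit the ROB head; return 'wait', 'commit', or 'flush'."""
    # Condition: ROB not empty and head instruction has finished execution.
    if not rob or not rob[0]["done"]:
        return "wait"
    head = rob[0]
    if head["mispredicted"]:
        # Misprediction/exception: discard all speculative state.
        rob.clear()          # reset the ROB
        reg_status.clear()   # clear register status
        return "flush"
    if head["dest"] is not None:
        regfile[head["dest"]] = head["value"]          # (1) write result
        if reg_status.get(head["dest"]) == head["tag"]:
            del reg_status[head["dest"]]               # (2) update register status
    rob.pop(0)                                         # (3) de-allocate ROB entry
    return "commit"


regs = {}
status = {"R2": 1, "R1": 3}
rob = [
    {"tag": 1, "dest": "R2", "value": 7, "done": True, "mispredicted": False},
    {"tag": 2, "dest": None, "value": None, "done": True, "mispredicted": True},
    {"tag": 3, "dest": "R1", "value": 99, "done": True, "mispredicted": False},
]
print(commit_step(rob, regs, status), regs)   # the first instruction commits
print(commit_step(rob, regs, status), regs)   # the mispredicted branch flushes
```

The instruction behind the mispredicted branch had already finished executing, yet its value never reaches the register file because only the head of the ROB is allowed to write.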

  43. Example
      while (A(i) <> x) { A(i)++; i++; }
      Loop: LD      R2,0(R1)     ; R1 = base address of A()
            DADDIU  R2,R2,#1
            SD      R2,0(R1)     ; store result
            DADDIU  R1,R1,#4
            BNE     R2,R3,Loop   ; x = R3
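
As a reading aid, the assembly loop can be traced instruction by instruction in a high-level form. Note that, as written, the assembly increments the element first and then compares the incremented value with x, and it always runs the body at least once. The function name and the use of a list index in place of the byte address in R1 are assumptions for illustration; the sketch also assumes some element reaches x, otherwise it runs off the end of the list.

```python
def increment_until(a, x):
    """Mirror the assembly: increment a[i], advance i, stop once the new value equals x."""
    i = 0
    while True:
        r2 = a[i] + 1    # LD R2,0(R1) ; DADDIU R2,R2,#1
        a[i] = r2        # SD R2,0(R1)
        i += 1           # DADDIU R1,R1,#4 (next element)
        if r2 == x:      # BNE R2,R3,Loop falls through when R2 == R3
            return i


values = [1, 2, 3, 9]
print(increment_until(values, 4), values)
```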

  44. Non-Speculative execution: Dual issue, 2 CDB, 2 Int Units

  45. Non-Speculative execution: Dual issue, 2 CDB

  46. Non-Speculative execution: Dual issue, 2 CDB

  47. Non-Speculative execution: Dual issue, 2 CDB

  48. Non-Speculative execution: Dual issue, 2 CDB

  49. Non-Speculative execution: Dual issue, 2 CDB

  50. Non-Speculative execution: Dual issue, 2 CDB
