CSL718 : Superscalar Processors

Presentation Transcript

  1. CSL718 : Superscalar Processors. Issue and Despatch. 23rd Jan, 2006. Anshul Kumar, CSE IITD

  2. Early proposals/prototypes (timeline, 1982 to 1989)
     • Term "superscalar" (marked on the timeline)
     • IBM: Cheetah project, America project (4)
     • DEC: Multititan project (2)
     • Stanford U: Match (2), Torch (4)
     • Kyushu U: SIMP (4), DSNS (4)

  3. Commercial superscalars: RISCs
     • Intel 960KA/KB → 960CA (3), 1989
     • IBM Power 1 RS/6000 (4), 1990
     • HP PA7000 → PA7100 (2), 1992
     • SUN SPARC → SuperSparc (3), 1992
     • DEC Alpha 21064 (2), 1992
     • Motorola MC88100 → MC88110 (2), 1993
     • Motorola PowerPC 601/603 (3), 1993
     • MIPS R4000 → R8000 (4), 1994

  4. Commercial superscalars: CISCs
     • Intel 80486 → Pentium (2), 1993
     • Motorola MC68040 → MC68060 (2), 1993
     • Gmicro Gmicro/100p → Gmicro 500 (2), 1993
     • AMD K5 (2), 4 RISC instr, 1995
     • CYRIX M1 (2), 1995

  5. Tasks of superscalar processing
     • Parallel decoding and issue
     • Parallel instruction execution
     • Preserving the sequential consistency of instruction execution and exception processing

  6. Superscalar decode and issue
     • Scalar issue: I-cache → Instruction buffer → Decode & Issue; pipeline stages IF, D/I
     • Superscalar issue: I-cache → Instruction buffer → Decode & Issue; pipeline stages IF, D, I

  7. Parallel Decoding
     • Fetch multiple instructions into the instruction buffer
     • Decode multiple instructions in parallel (the instruction window)
     • Possibly check dependencies among these, as well as with the instructions already under execution
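
The dependency check in slide 7 can be sketched as follows. This is only an illustration under stated assumptions: a hypothetical three-operand instruction format, and the names `Instr` and `raw_dependent`, none of which appear in the slides. Real hardware performs these comparisons in parallel with comparators; the loop below is a sequential model of the same check for read-after-write dependencies.

```python
from typing import List, NamedTuple, Set, Tuple

class Instr(NamedTuple):
    """Hypothetical decoded instruction: one destination, two sources."""
    name: str
    dest: str
    srcs: Tuple[str, str]

def raw_dependent(window: List[Instr], in_flight_dests: Set[str]) -> List[str]:
    """Names of window instructions that read a result not yet produced."""
    pending = set(in_flight_dests)      # destinations of instructions under execution
    blocked = []
    for ins in window:                  # window is in program order
        if any(src in pending for src in ins.srcs):
            blocked.append(ins.name)
        pending.add(ins.dest)           # later instructions may depend on this one
    return blocked

window = [
    Instr("a", "r1", ("r2", "r3")),
    Instr("b", "r4", ("r1", "r5")),     # reads r1, written by a in the same window
    Instr("c", "r6", ("r7", "r8")),     # reads r8, still being computed
    Instr("d", "r9", ("r2", "r3")),
]
print(raw_dependent(window, in_flight_dests={"r8"}))
```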

  8. Pre-decoding
     • Do partial decoding while instructions are being loaded into the I-cache
     • Decoded information is appended to the instruction
     • This includes instruction class, resources required, etc.
     [Figure: Second level cache or main memory → (N bits/cycle) → Pre-decode unit → (N + n bits/cycle) → I-cache]
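
A minimal sketch of the pre-decoding idea in slide 8, assuming a made-up 32-bit instruction word with the opcode in its top 6 bits and 4 appended class bits. The opcode-to-class table and the function names are hypothetical and do not describe any real processor.

```python
N_BITS = 32          # N: instruction width arriving from L2 cache or memory (assumed)
PREDECODE_BITS = 4   # n: bits appended by the pre-decode unit (assumed)

# Illustrative one-hot instruction classes; not a real ISA.
CLASS_ALU, CLASS_MEM, CLASS_BRANCH, CLASS_OTHER = 0b0001, 0b0010, 0b0100, 0b1000
CLASS_OF_OPCODE = {0x00: CLASS_ALU, 0x23: CLASS_MEM, 0x2B: CLASS_MEM, 0x04: CLASS_BRANCH}

def predecode(word: int) -> int:
    """Return the (N + n)-bit entry stored in the I-cache for one instruction."""
    opcode = (word >> (N_BITS - 6)) & 0x3F
    cls = CLASS_OF_OPCODE.get(opcode, CLASS_OTHER)
    return (word << PREDECODE_BITS) | cls    # class bits appended below the word

def instruction_class(entry: int) -> int:
    """Read the class bits back without decoding the opcode again."""
    return entry & ((1 << PREDECODE_BITS) - 1)

word = (0x23 << 26) | 0x1234                 # hypothetical memory-class instruction
entry = predecode(word)
print(bin(instruction_class(entry)))
```

The point of the design is that the issue logic can read the class bits directly from the cache entry instead of repeating the opcode decode on every fetch.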

  9. Number of Pre-decode bits
     Processor             No. of predecode bits
     PA 7200 (1995)        5
     PA 8000 (1996)        5
     PowerPC 620 (1996)    7
     UltraSparc (1995)     4
     HAL PM1 (1995)        4
     AMD K5 (1995)         5 (per byte)
     R 10000 (1996)        4

  10. Issue vs Dispatch
      • Blocking issue: decode and issue to EU. Instructions may be blocked due to data dependency
      • Non-blocking issue: decode and issue to buffer; from the buffer, dispatch to EU. Instructions are not blocked due to data dependency

  11. Blocking Issue
      [Figure: Instruction buffer (issue window) → Decode, Check & Issue → EU, EU, EU]

  12. Non-blocking (shelved) Issue
      [Figure: Instruction buffer → Decode & Issue → three reservation stations, each with dep. checking/dispatch → EU, EU, EU]

  13. Handling of Issue Blockages
      • Preserving issue order: in-order / out of order
      • Alignment of instruction issue: aligned / unaligned

  14. Issue Order
      • Issue in strict program order: issue window e d c b a; instructions issued: a
      • Out of order issue: issue window e d c b a; instructions issued: c a. Example: MC 88110, PowerPC 601
      • Legend: independent instruction, dependent instruction, issued instruction
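
The two issue orders of slide 14 can be sketched as below. The function names are illustrative, and the choice of which instructions are independent (a and c) is an assumption made so that the result matches the issued instructions shown on the slide.

```python
from typing import Dict, List

def issue_in_order(window: List[str], independent: Dict[str, bool]) -> List[str]:
    """Strict program order: stop at the first dependent instruction."""
    issued = []
    for ins in window:
        if not independent[ins]:
            break                      # a dependent instruction blocks all later ones
        issued.append(ins)
    return issued

def issue_out_of_order(window: List[str], independent: Dict[str, bool]) -> List[str]:
    """Out of order: every independent instruction in the window may issue."""
    return [ins for ins in window if independent[ins]]

window = ["a", "b", "c", "d", "e"]     # a is the oldest instruction
independent = {"a": True, "b": False, "c": True, "d": False, "e": False}  # assumed

print(issue_in_order(window, independent))
print(issue_out_of_order(window, independent))
```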

  15. Alignment
      • Aligned issue: fixed window, then next window
      • Unaligned issue: gliding window
      Instructions a to h in program order:
      Cycle   Aligned: checked   Aligned: issued   Unaligned: checked   Unaligned: issued
      1       h g f e d c b a    a                 h g f e d c b a      a
      2       h g f e d c b      c b               h g f e d c b        c b
      3       h g f e d          d                 h g f e d            f e d
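
A small simulation can make the difference between the fixed and the gliding window in slide 15 concrete. It is a sketch under assumptions that are not in the slides: a window of 4 instructions, in-order issue within the window, and a hypothetical `ready_cycle` table giving the earliest cycle in which each instruction can issue. The table was chosen so that the first three cycles match the slide.

```python
from typing import Dict, List

def simulate_issue(program: List[str], ready_cycle: Dict[str, int],
                   width: int, aligned: bool) -> List[List[str]]:
    """Return the instructions issued in each cycle."""
    pending = list(program)            # instructions not yet in the window
    window: List[str] = []
    schedule = []
    cycle = 0
    while pending or window:
        cycle += 1
        if aligned:
            if not window:             # fixed window: refill only when fully drained
                window, pending = pending[:width], pending[width:]
        else:
            free = width - len(window) # gliding window: top up every cycle
            window, pending = window + pending[:free], pending[free:]
        issued = []
        for ins in window:             # in-order issue inside the window
            if ready_cycle[ins] > cycle:
                break
            issued.append(ins)
        window = window[len(issued):]
        schedule.append(issued)
    return schedule

program = list("abcdefgh")
ready_cycle = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 3, "f": 3, "g": 4, "h": 4}  # assumed

print(simulate_issue(program, ready_cycle, width=4, aligned=True))
print(simulate_issue(program, ready_cycle, width=4, aligned=False))
```

In the third cycle the aligned variant can issue only d, because e and f are not yet in its window, while the gliding window already holds them.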

  16. Design choices in instruction issue
      • Coping with false data dependencies: no / register renaming
      • Coping with unresolved control dependencies: wait / speculative
      • Use of shelving: blocking / shelved
      • Handling of issue blockages
      • Issue rate (2-6)

  17. Frequently used issue policies in scalar processors
      • Traditional scalar issue: i386, MC68030, R3000, Sparc
      • Traditional scalar issue with shelving: CDC 6600
      • Traditional scalar issue with shelving and renaming: IBM 360/91
      • Traditional scalar issue with spec. execution: i486, MC68040, R4000, MicroSparc

  18. Frequently used issue policies in superscalar processors (speculative execution in all)
      • Straightforward superscalar issue: aligned / unaligned
      • Straightforward superscalar issue with shelving
      • Straightforward superscalar issue with renaming
      • Advanced superscalar issue (renaming + shelving)
      • Processors named: R10000, PentiumPro, PowerPC602, PA8000, Sparc64, Am29000, K5, MC88110, R8000, MC68060, PA7200, UltraSparc, Pentium, PowerPC601, PA7100, SuperSparc, Alpha21164

  19. Frequently used issue policies
      • Traditional scalar issue
      • Traditional scalar issue with spec. execution
      • Straightforward superscalar issue: aligned / unaligned
      • Advanced superscalar issue

  20. Design Space of Shelving
      • Scope of shelving: partial / full
      • Layout of shelving buffers
      • Operand fetch policy
      • Instruction dispatch scheme

  21. Layout of Shelving Buffers
      • Type of the shelving buffers: stand alone (RS) / combined with renaming and reordering
      • Number of shelving buffer entries: individual 2-4, group 6-16, central 20, total 15-40
      • Number of read and write ports: depends on no. of EUs connected

  22. Reservation Stations (RS)
      • Individual RSs
      • Group RSs
      • Central RS
      [Figure: RSs feeding EUs in each of the three arrangements]

  23. Combined Buffer (for Shelving, Renaming, Reordering)
      • DRIS: Deferred scheduling, Register renaming and Instruction Shelving
      [Figure: From decode/issue → DRIS → EU, EU]

  24. Operand Fetch Policies
      • Issue bound fetch
      • Dispatch bound fetch

  25. Issue bound operand fetch (with single register file)
      [Figure: Decode/issue, one RF, four RSs, four EUs; instruction and data paths marked]

  26. Dispatch bound operand fetch (with single register file)
      [Figure: Decode/issue, four RSs, one RF, four EUs; instruction and data paths marked]

  27. Issue bound operand fetch (with multiple register files)
      [Figure: Decode/issue, two RFs, four RSs, four EUs; instruction and data paths marked]

  28. Dispatch bound operand fetch (with multiple register files)
      [Figure: Decode/issue, four RSs, two RFs, four EUs; instruction and data paths marked]

  29. Updating RFs and RSs
      [Figure: Decode/issue, two RFs, four RSs, four EUs; instruction and data paths marked]

  30. Instruction dispatch scheme
      • Dispatch policy
      • Dispatch rate: single instr/cycle (individual RS) / multiple instr/cycle (group or central RS)
      • Checking operand availability
      • Treatment of empty RS

  31. Dispatch policy
      • Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
      • Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
      • Dispatch order
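
The selection and arbitration rules of slide 31 can be sketched as two small functions. The `RSEntry` layout, the `age` field used to encode program order, and the function names are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class RSEntry:
    age: int                       # smaller value = earlier in program order (assumed)
    op: str
    src_valid: Tuple[bool, bool]   # availability of each source operand

def select(entries: List[RSEntry]) -> List[RSEntry]:
    """Selection rule: entries whose source operands are all available."""
    return [e for e in entries if all(e.src_valid)]

def arbitrate(ready: List[RSEntry]) -> Optional[RSEntry]:
    """Arbitration rule: the earliest ready instruction has priority."""
    return min(ready, key=lambda e: e.age) if ready else None

def dispatch_one(entries: List[RSEntry]) -> Optional[RSEntry]:
    """Dispatch at most one instruction per cycle from this RS."""
    chosen = arbitrate(select(entries))
    if chosen is not None:
        entries.remove(chosen)
    return chosen

rs = [
    RSEntry(0, "mul", (True, False)),   # still waiting for one operand
    RSEntry(1, "add", (True, True)),
    RSEntry(2, "sub", (True, True)),
]
first = dispatch_one(rs)
print(first.op)
```

Here `add` is dispatched ahead of the older `mul`, which is out-of-order dispatch; an in-order policy would instead examine only the oldest entry.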

  32. Dispatch order
      • in-order
      • partially out of order
      • out of order
      [Figure: two RSs with the entries that are checked for dispatch marked]

  33. Checking availability of operands
      • Check of score-board bits (usual for dispatch bound operand fetch): control flow approach
      • Direct check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
      (control flow / data flow: Flynn's terminology)

  34. Score-board
      • Introduced with CDC 6600
      [Figure: Register File in which each register's data is accompanied by a 1-bit status entry]

  35. Checking in dispatch bound fetch
      • Decoded instruction (Rs1, Rs2, Rd): reset V bit of Rd in the Register File
      • Reservation station entry: OC (opcode), Rs1, Rs2, Rd
      • Dispatch: check V bits of sources; the Register File supplies Os1, Os2 (operand values); OC, Os1, Os2 go to the EU
      • EU returns result, Rd: update Rd, set V bit
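
A minimal sketch of the dispatch bound scheme in slide 35, where the reservation station keeps only register identifiers and the V (score-board) bits are checked when an instruction is about to be dispatched. The class and function names are hypothetical. The model is deliberately simplified: it dispatches in order from a single RS and assumes that Rd differs from both sources and that no two in-flight instructions write the same register.

```python
from typing import List, Optional, Tuple

class ScoreboardRF:
    """Register file with one V (valid) bit per register."""
    def __init__(self, n: int) -> None:
        self.value = [0] * n
        self.valid = [True] * n

Entry = Tuple[str, int, int, int]       # (OC, Rs1, Rs2, Rd): identifiers only

def issue(rf: ScoreboardRF, rs: List[Entry], oc: str, rs1: int, rs2: int, rd: int) -> None:
    rs.append((oc, rs1, rs2, rd))
    rf.valid[rd] = False                # reset V bit of Rd

def try_dispatch(rf: ScoreboardRF, rs: List[Entry]) -> Optional[Tuple[str, int, int, int]]:
    """Dispatch the oldest entry if the V bits of both sources are set."""
    if not rs:
        return None
    oc, rs1, rs2, rd = rs[0]
    if rf.valid[rs1] and rf.valid[rs2]:
        rs.pop(0)
        return (oc, rf.value[rs1], rf.value[rs2], rd)   # operands fetched at dispatch
    return None

def write_back(rf: ScoreboardRF, rd: int, result: int) -> None:
    rf.value[rd] = result               # update Rd
    rf.valid[rd] = True                 # set V bit

rf = ScoreboardRF(8)
rf.value[1], rf.value[2], rf.value[4] = 10, 20, 5
rs: List[Entry] = []
issue(rf, rs, "add", 1, 2, 3)           # r3 = r1 + r2
issue(rf, rs, "add", 3, 4, 5)           # r5 = r3 + r4, must wait for r3
print(try_dispatch(rf, rs))
```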

  36. Checking in issue bound fetch
      • Decoded instruction (Rs1, Rs2, Rd): reset V bit of Rd in the Register File
      • Register File supplies Os1, Os2 (operand values) to the reservation station
      • Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
      • Dispatch: check Vs1, Vs2; OC, Os1, Os2, Rd go to the EU
      • EU returns result, Rd: update Rd and set V bit in the Register File; associative update of Is1, Is2 with Rd, set Vs bits
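
The issue bound scheme of slide 36 can be sketched in the same style. Operands are read at issue time; if a source is not yet valid, the RS entry stores its register identifier (Is) with the Vs bit cleared, and the entry is completed later by an associative update when the result for that register arrives. The dictionary layout and function names are assumptions for illustration, and the sketch assumes that no two in-flight instructions write the same register.

```python
from typing import Dict, List, Optional, Tuple

class RegFile:
    """Register file with one V (valid) bit per register."""
    def __init__(self, n: int) -> None:
        self.value = [0] * n
        self.valid = [True] * n

def read_operand(rf: RegFile, r: int) -> Tuple[int, bool]:
    """Return (Os, True) if the register is valid, else (Is, False)."""
    return (rf.value[r], True) if rf.valid[r] else (r, False)

def issue(rf: RegFile, rs: List[Dict], oc: str, rs1: int, rs2: int, rd: int) -> None:
    s1, v1 = read_operand(rf, rs1)      # operands fetched at issue
    s2, v2 = read_operand(rf, rs2)
    rs.append({"oc": oc, "s1": s1, "v1": v1, "s2": s2, "v2": v2, "rd": rd})
    rf.valid[rd] = False                # reset V bit of Rd

def dispatch_ready(rs: List[Dict]) -> Optional[Dict]:
    """Check Vs1 and Vs2 directly in the RS; dispatch the first ready entry."""
    for e in rs:
        if e["v1"] and e["v2"]:
            rs.remove(e)
            return e
    return None

def write_back(rf: RegFile, rs: List[Dict], rd: int, result: int) -> None:
    rf.value[rd] = result               # update Rd, set V bit
    rf.valid[rd] = True
    for e in rs:                        # associative update of Is1, Is2 with Rd
        for s, v in (("s1", "v1"), ("s2", "v2")):
            if not e[v] and e[s] == rd:
                e[s], e[v] = result, True

rf = RegFile(8)
rf.value[1], rf.value[2], rf.value[4] = 10, 20, 5
rs: List[Dict] = []
issue(rf, rs, "add", 1, 2, 3)           # r3 = r1 + r2
issue(rf, rs, "add", 3, 4, 5)           # r5 = r3 + r4, holds Is1 = 3 until r3 is written
```

Because readiness is visible in the RS entry itself, dispatch does not need to consult the register file again, which is the data flow flavour of operand checking.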

  37. Treatment of an empty RS
      • Straightforward approach: at least one cycle stay in RS
      • Bypassing RS if empty
      • Processors named: Sparc64, PowerPC 604, Nx586

  38. Approaches in dispatching
      • Straightforward: in order, single instr/cycle, individual RSs. Examples: Power1, PPC603, Nx586, Am29000
      • Enhanced: partially out of order, single instr/cycle, individual RSs. Examples: Power2, PPC604, 620
      • Advanced: out of order, multiple instr/cycle, group/central RSs. Examples: PM1, PentiumPro, PA8000, R10000