
8. Microarchitecture of Superscalars (6) Register renaming

8. Microarchitecture of Superscalars (6) Register renaming. Dezső Sima Fall 2006.  D. Sima, 2006. Overview. 1 The Principle of register renaming. 2 Design Space. 2.1 Overview. 2.2 Types of rename buffers. 3 Operation of register renaming. 4 Design parameters of register renaming.


Presentation Transcript


  1. 8. Microarchitecture of Superscalars (6) Register renaming. Dezső Sima, Fall 2006. D. Sima, 2006

  2. Overview
     1 The principle of register renaming
     2 Design space
       2.1 Overview
       2.2 Types of rename buffers
     3 Operation of register renaming
     4 Design parameters of register renaming
     5 Implementation of renaming in superscalars
       5.1 The chronology of introducing register renaming
       5.2 Basic implementation schemes of register renaming
     6 Examples

  3. 1. Principle of register renaming (1)
     Aim:
     • Eliminating false data dependencies to relieve the issue bottleneck
     False data dependencies:
     • WAW, Write After Write (output dependency). Example: I1: mul r1, r2, r3; I2: add r1, r4, r5 (both instructions write r1).
     • WAR, Write After Read (anti-dependency). Example: I1: mul r1, r2, r3; I2: add r2, r4, r5 (I2 writes r2, which I1 reads).
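
The two false dependency types on this slide can be told apart mechanically. The following is a minimal sketch, assuming a three-operand format in which the first register is the destination and the other two are sources; the names `Instr` and `false_dependencies` are illustrative and not from the slides.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    op: str
    dest: str               # assumed: first register is the destination
    srcs: Tuple[str, ...]   # assumed: remaining registers are sources


def false_dependencies(first: Instr, second: Instr) -> Set[str]:
    """Return the false dependencies of `second` on the earlier `first`."""
    found = set()
    if second.dest == first.dest:
        found.add("WAW")    # both write the same register
    if second.dest in first.srcs:
        found.add("WAR")    # second overwrites a register that first reads
    return found


i1 = Instr("mul", "r1", ("r2", "r3"))
waw = false_dependencies(i1, Instr("add", "r1", ("r4", "r5")))
war = false_dependencies(i1, Instr("add", "r2", ("r4", "r5")))
```

A true (read-after-write) dependency, such as a later instruction reading r1, is deliberately not reported, since renaming cannot remove it.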

  4. 1. Principle of register renaming (2)
     Basic principle to eliminate false data dependencies: generated results are written temporarily to buffers, called rename buffers (RB), instead of the referenced architectural registers (AR).
     Then
     - during dispatching, a new rename buffer needs to be allocated to each instruction whose destination register causes a false data dependency (1),
     - referenced source operands need to be fetched from the RB file if they are actually renamed, else from the AR file,
     - during retirement, buffered results need to be transferred from the RB file to the AR file.
     (1) Usually, processors allocate a rename buffer to each dispatched instruction without checking for the existence of false data dependencies, to reduce logic complexity.
     Figure 1.1: The principle of register renaming (figure labels: EU, source register numbers, results, retirement, RB, AR, Ops.)
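
The dispatch-time part of this principle can be sketched in a few lines. This is an illustration under stated assumptions, not the scheme of any particular processor: every destination gets a fresh buffer without a dependency check (as the footnote describes), buffers are named `rb0`, `rb1`, ... for readability, and the third instruction of the sample program is added here only to show a renamed source.

```python
def rename(program, n_buffers=8):
    """Allocate a fresh rename buffer to every destination register."""
    latest = {}                                   # arch. register -> buffer holding its newest value
    free = [f"rb{i}" for i in range(n_buffers)]   # hypothetical buffer names
    renamed = []
    for op, dest, src1, src2 in program:
        # Sources are read from the RB file if renamed, else from the AR file.
        s1 = latest.get(src1, src1)
        s2 = latest.get(src2, src2)
        rb = free.pop(0)                          # IndexError if no buffer is left
        latest[dest] = rb
        renamed.append((op, rb, s1, s2))
    return renamed


program = [
    ("mul", "r1", "r2", "r3"),
    ("add", "r1", "r4", "r5"),   # WAW on r1 with the mul
    ("sub", "r6", "r1", "r2"),   # added for illustration: reads the newest r1
]
renamed = rename(program)
```

After renaming, the two writes to r1 target different buffers, so the output dependency is gone, while the `sub` still reads the value produced by the `add`.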

  5. 2. Design space of register renaming
     2.1 Overview
     Design aspects of register renaming:
     • Scope of register renaming
     • Rename rate
     • Layout of the rename buffers
     • Layout of the register mapping
     • Type of rename buffers

  6. 2.2 Types of rename buffers
     • Rename register file (RR)
     Figure labels: Reg. nrs. (register numbers), Res. (results), Ret. (retirement), AR, RR, Ops. (operands)

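  7. Rename register file: states of a rename register
     • Available (initialized) → Allocated, not valid: allocate, if an instruction is dispatched
     • Allocated, not valid → Allocated, valid: update, if the instruction is finished
     • Allocated, valid → Available: reclaim, if the instruction is retired
     • Allocated → Available: reclaim, if the instruction is canceled
     Figure labels: Reg. nrs., Res., Ret., AR, RR, Ops.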
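
The life cycle of a rename register on this slide reads naturally as a small state machine. The sketch below is a hedged illustration: the event names (`dispatch`, `finish`, `retire`, `cancel`) are chosen here, and allowing `cancel` from both allocated states is an assumption, since the transcript does not show where the cancel arrow starts.

```python
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    ALLOCATED_NOT_VALID = "allocated, not valid"
    ALLOCATED_VALID = "allocated, valid"


TRANSITIONS = {
    (State.AVAILABLE, "dispatch"): State.ALLOCATED_NOT_VALID,
    (State.ALLOCATED_NOT_VALID, "finish"): State.ALLOCATED_VALID,
    (State.ALLOCATED_VALID, "retire"): State.AVAILABLE,
    # Assumption: a canceled instruction frees its register from either allocated state.
    (State.ALLOCATED_NOT_VALID, "cancel"): State.AVAILABLE,
    (State.ALLOCATED_VALID, "cancel"): State.AVAILABLE,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise ValueError for an illegal event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal event {event!r} in state {state.value!r}") from None
```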

  8. 2.2 Types of rename buffers
     • Rename register file. Examples: PowerPC 603 (1993), PowerPC 604 (1995), PowerPC 620 (1996), Power3 (1998), PA 8000 (1996), PA 8200 (1997), PA 8500 (1999)
     • Future file (FF)
     Figure labels: Reg. nrs., Res., Ret., AR, RR, FF, Ops.

  9. Future file
     The FF has as many entries as the AR and holds the most recent register values.
     Entry states after initialization: Valid and Not valid.
     • Valid → Not valid: invalidate, when an instruction refers to the same register as its destination
     • Not valid → Valid: update, if the instruction is finished
     Figure labels: Reg. nrs., Res., Ret., FF, AR, Ops.
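
A future file entry can be sketched as a value plus a valid bit. The code below is a rough illustration only: the class name, the `tag` used to recognize the newest writer, and the choice to start all entries as valid are assumptions made here, none of them stated on the slide.

```python
class FutureFile:
    """One entry per architectural register, holding its most recent value."""

    def __init__(self, n_regs):
        self.value = [0] * n_regs
        self.valid = [True] * n_regs       # assumption: entries start out valid
        self.pending = [None] * n_regs     # assumption: tag of the newest writer

    def dispatch(self, dest, tag):
        # A new instruction names `dest` as its destination: invalidate the entry.
        self.valid[dest] = False
        self.pending[dest] = tag

    def finish(self, dest, tag, result):
        # Only the newest writer may update the entry and make it valid again.
        if self.pending[dest] == tag:
            self.value[dest] = result
            self.valid[dest] = True
```

With two in-flight writers of the same register, the entry stays invalid until the younger one finishes, so readers never see a stale value marked as valid.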

  10. 2.2 Types of rename buffers
     • Rename register file. Examples: PowerPC 603 (1993), PowerPC 604 (1995), PowerPC 620 (1996), Power3 (1998), PA 8000 (1996), PA 8200 (1997), PA 8500 (1999)
     • Future file. Examples: UltraSPARC III (1999), K7 (FX) (1999), K8 (FX) (2003)
     • Merged architectural and rename register file
     Figure labels: Reg. nrs., Res., Ret., AR, RR, FF, Ops.

  11. Merged architectural and rename register file: states of a physical register
     • Available (initialized) → RB, not valid: an entry is allocated to a dispatched instruction
     • RB, not valid → RB, valid: the instruction is finished
     • RB, valid → AR, valid: the instruction is completed
     • RB → Available: the instruction is canceled
     • AR, valid → Available: the architectural register is reclaimed if this architectural register becomes renamed anew
     It needs a large number of physical registers. During completion, no physical transfer is needed from the rename buffer to the referenced architectural register; instead, the former rename buffer changes its state and becomes the referenced architectural register.
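
The key point of the merged scheme, that completion changes a state instead of copying a value, can be shown with a small model. This is a sketch under assumptions: the class and method names are invented here, physical registers 0..n_arch-1 are taken to hold the architectural state initially, and the previous architectural copy is reclaimed when the renaming instruction completes, which is one possible reading of the slide's reclaim rule.

```python
class MergedRegisterFile:
    def __init__(self, n_arch, n_phys):
        if n_phys <= n_arch:
            raise ValueError("need more physical than architectural registers")
        # Assumption: physical register i initially holds architectural register i.
        self.state = ["AR, valid"] * n_arch + ["available"] * (n_phys - n_arch)
        self.value = [0] * n_phys
        self.committed = list(range(n_arch))   # arch. register -> physical register

    def dispatch(self):
        p = self.state.index("available")      # ValueError if none is free
        self.state[p] = "RB, not valid"
        return p

    def finish(self, p, result):
        self.value[p] = result
        self.state[p] = "RB, valid"

    def complete(self, arch_dest, p):
        # No data is moved: the rename buffer simply becomes the architectural register.
        old = self.committed[arch_dest]
        self.state[old] = "available"          # previous copy is reclaimed (assumed timing)
        self.state[p] = "AR, valid"
        self.committed[arch_dest] = p
```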

  12. 2.2 Types of rename buffers
     • Rename register file. Examples: PowerPC 603 (1993), PowerPC 604 (1995), PowerPC 620 (1996), Power3 (1998), PA 8000 (1996), PA 8200 (1997), PA 8500 (1999)
     • Future file. Examples: UltraSPARC III (1999), K7 (FX) (1999), K8 (FX) (2003)
     • Merged architectural and rename register file. Examples: Power1 (1990), Power2 (1993), R10000 (1996), R12000 (1999), Alpha 21264 (1998), Pentium 4 (FP) (2000), K7 (FP) (1999), K8 (FP) (2003)
     • Holding renamed values in the ROB
     Figure labels: Reg. nrs., Res., Ret., AR, RR, FF, ROB, Ops.

  13. Holding renamed values in the ROB: states of a ROB entry
     • Available (initialized) → Allocated, not valid: allocate, if an instruction is dispatched
     • Allocated, not valid → Allocated, valid: update, if the instruction is finished
     • Allocated, valid → Available: reclaim, if the instruction is retired
     • Allocated → Available: reclaim, if the instruction is canceled
     ROB entries are extended to hold results as well. During dispatching, a new ROB entry with its result field is allocated to each dispatched instruction. (The result field serves as the allocated rename buffer.)
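
A ROB whose entries carry a result field can be modeled as a queue. The sketch below is illustrative: the class name, the dictionary layout of an entry, and the use of a running tag are choices made here, and retiring strictly in program order from the head is assumed as the usual ROB discipline rather than taken from the slide.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self, size):
        self.size = size
        self.entries = deque()   # oldest instruction at the left
        self.next_tag = 0

    def dispatch(self, dest):
        """Allocate an entry; its result field acts as the rename buffer."""
        if len(self.entries) == self.size:
            raise RuntimeError("ROB full")
        entry = {"tag": self.next_tag, "dest": dest, "result": None, "valid": False}
        self.next_tag += 1
        self.entries.append(entry)
        return entry["tag"]

    def finish(self, tag, result):
        for e in self.entries:
            if e["tag"] == tag:
                e["result"], e["valid"] = result, True
                return
        raise KeyError(tag)

    def retire(self, arch_regs):
        """Move finished results at the head into the AR file; return the count."""
        retired = 0
        while self.entries and self.entries[0]["valid"]:
            e = self.entries.popleft()
            arch_regs[e["dest"]] = e["result"]
            retired += 1
        return retired
```

A younger instruction that finishes early waits in the buffer until all older instructions have finished, which is what keeps the architectural state in program order.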

  14. 2.2 Types of rename buffers
     • Rename register file. Examples: PowerPC 603 (1993), PowerPC 604 (1995), PowerPC 620 (1996), Power3 (1998), PA 8000 (1996), PA 8200 (1997), PA 8500 (1999)
     • Future file. Examples: UltraSPARC III (1999), K7 (FX) (1999), K8 (FX) (2003)
     • Merged architectural and rename register file. Examples: Power1 (1990), Power2 (1993), R10000 (1996), R12000 (1999), Alpha 21264 (1998), Pentium 4 (FP) (2000), K7 (FP) (1999), K8 (FP) (2003)
     • Holding renamed values in the ROB. Examples: K5 (1995), K6 (1997), Pentium Pro (1995), Pentium II (1997), Pentium III (1999), Pentium 4 (FX) (2000), Pentium M (2003), Core (2006)
     Figure labels: Reg. nrs., Res., Ret., AR, RR, FF, ROB, Ops.

  15. 3. Operation of register renaming (1) The actual rename process depends on both the rename technique implemented and the underlying microarchitecture. Assumptions: Rename technique: using rename registers and mapping tables

  16. Rename registers provide buffer space to temporarily hold instruction results. Each entry of the rename register file (RR) has a Valid bit (V). During dispatching, the Valid bit of the allocated rename register is cleared (V = 0). When the instruction finishes, its result is transferred to the allocated rename register and the Valid bit is set (V = 1) to indicate that the corresponding value is available.

  17. 3. Operation of register renaming (1) The actual rename process depends on both the rename technique implemented and the underlying microarchitecture. Assumptions: Rename technique: using rename registers and mapping tables

  18. Mapping table
     Figure: a mapping table with entries 0 to n-1 and the columns "Entry valid" and "RB index"; a look-up for r7 returns "12" (RB index = 12).
     The mapping table includes an entry for each architectural register. Each entry has an "Entry valid" bit that indicates whether or not the corresponding architectural register is renamed, and, if it is renamed, the entry holds the index of the associated rename buffer (RB index).
     • A new entry is created while an instruction is dispatched
       • by setting the "Entry valid" bit and
       • by writing the index of the allocated rename buffer ("RB index") to the entry that corresponds to the destination register of the dispatched instruction.
     • A valid mapping is updated by writing a new "RB index" into it when the architectural register belonging to that entry is renamed again.
     • An entry is invalidated when the instruction that actually belongs to that entry is retired.
     In this way the mapping table continuously holds the latest allocations.
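
The three mapping table operations described on this slide (create or update at dispatch, look up, invalidate at retirement) can be sketched as follows. The class and method names are illustrative; comparing the stored RB index at retirement is the assumption used here to decide whether the retiring instruction is still the one that "actually belongs" to the entry.

```python
class MappingTable:
    def __init__(self, n_arch):
        self.entry_valid = [False] * n_arch
        self.rb_index = [None] * n_arch

    def dispatch(self, dest, rb):
        # Create a new mapping, or overwrite a valid one when dest is renamed again.
        self.entry_valid[dest] = True
        self.rb_index[dest] = rb

    def lookup(self, reg):
        # Return the RB index if the register is renamed, else None (read the AR file).
        return self.rb_index[reg] if self.entry_valid[reg] else None

    def retire(self, dest, rb):
        # Invalidate only if the retiring instruction still owns the entry.
        if self.entry_valid[dest] and self.rb_index[dest] == rb:
            self.entry_valid[dest] = False


table = MappingTable(32)
table.dispatch(7, 12)          # r7 is renamed to rename buffer 12
first_lookup = table.lookup(7)
```

If r7 is renamed again before the first instruction retires, that retirement leaves the newer mapping in place, so the table keeps pointing at the latest allocation.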

  19. 3. Operation of register renaming (1)
     The actual rename process depends on both the rename technique implemented and the underlying microarchitecture.
     Assumptions:
     • Rename technique:
       • using rename registers and mapping tables
     • Underlying microarchitecture:
       • in-order dispatching
       • dynamic instruction issue
       • split FX and FP register files
       • operand fetch policy: both alternatives are discussed

  20. 3. Operation of register renaming (2)
     Considered part of the microarchitecture for both dispatch bound and issue bound operand fetching:
     • it executes only FX instructions,
     • it consists of an architectural register file (AR) and a single execution unit (EU).

  21. 3. Operation of register renaming (3)
     Figure 3.1: An FX-core assuming buffered issue and dispatch bound operand fetching
     Steps shown in the figure:
     • Dispatch: decoded instructions (OC, Rd, Rs1, Rs2) arrive; the destination and source registers are renamed through the mapping table (Rd', Rs1', Rs2').
     • Fetching operands from the rename register file (RR) or the architectural register file (AR): values if valid, else tags. A reservation station (RS) entry holds OC, Rd', Op1/Rs1', Op2/Rs2', V1, V2.
     • Issue: the valid bits are checked and an instruction is issued when its operands are ready (OC, Rd', Op1, Op2 are passed to the EU).
     • After the instruction is executed, the result and Rd' are used to update the RS and the RR (bypassing).
     • When the instruction is retired, the AR is updated.
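
The "values if valid, else tags" rule of dispatch bound operand fetching, together with the later RS update, can be sketched as below. The function names and the dictionary-based register files are assumptions made for illustration; the RS entry layout follows the Op/V fields named in the figure.

```python
def fetch_operand(src, mapping, rr_valid, rr_value, ar_value):
    """Dispatch bound fetch: return (value, True) or (tag, False)."""
    rb = mapping.get(src)
    if rb is None:                 # not renamed: read the AR file
        return ar_value[src], True
    if rr_valid[rb]:               # renamed and already produced
        return rr_value[rb], True
    return rb, False               # renamed but pending: keep the tag


def update_rs(rs_entries, tag, result):
    """After execution, replace matching tags in the RS by the result."""
    for e in rs_entries:
        for i in (0, 1):
            if not e["valid"][i] and e["ops"][i] == tag:
                e["ops"][i], e["valid"][i] = result, True


def ready(entry):
    return all(entry["valid"])


mapping = {"r1": 3}                # r1 is renamed to rename register 3
rr_valid, rr_value = {3: False}, {3: None}
ar_value = {"r1": 10, "r2": 20}
op1 = fetch_operand("r1", mapping, rr_valid, rr_value, ar_value)
op2 = fetch_operand("r2", mapping, rr_valid, rr_value, ar_value)
entry = {"ops": [op1[0], op2[0]], "valid": [op1[1], op2[1]]}
```

The entry waits in the RS holding tag 3 until a result tagged 3 is broadcast; only then do both valid bits allow it to issue.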

  22. 3. Operation of register renaming (4)
     Figure 3.2: An FX-core assuming buffered issue and issue bound operand fetching
     Steps shown in the figure:
     • Dispatch: decoded instructions (OC, Rd, Rs1, Rs2) arrive; the destination and source registers are renamed through the mapping table (Rd', Rs1', Rs2').
     • Dispatching instructions into the reservation station (RS). An RS entry holds OC, Rd', Rs1', Rs2'.
     • Checking for availability of (Rs1'), (Rs2').
     • Issue: an instruction is issued when its operands are valid, and the operands (Op1, Op2) are fetched from the rename register file (RR) or the architectural register file (AR).
     • Executing the instruction (OC, Rd') and updating the RR when the instruction is finished (bypassing of Result, Rd').
     • Updating the AR when the instruction retires.
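
With issue bound fetching the RS holds only register identifiers, and values are read at issue time. The following sketch makes that concrete under assumptions: each renamed source is represented as a pair `("RR", index)` or `("AR", name)`, and the function names are invented for this illustration.

```python
def operand_ready(src, rr_valid):
    kind, idx = src
    # Architectural registers are always readable; rename registers need V = 1.
    return kind == "AR" or rr_valid[idx]


def try_issue(entry, rr_valid, rr_value, ar_value):
    """Return (OC, Rd', Op1, Op2) if all operands are valid, else None."""
    if not all(operand_ready(s, rr_valid) for s in entry["srcs"]):
        return None
    ops = [ar_value[i] if kind == "AR" else rr_value[i] for kind, i in entry["srcs"]]
    return (entry["oc"], entry["dest"], *ops)


entry = {"oc": "add", "dest": 5, "srcs": [("RR", 3), ("AR", "r2")]}
rr_valid, rr_value = {3: False}, {3: None}
ar_value = {"r2": 20}
before = try_issue(entry, rr_valid, rr_value, ar_value)

# The producer of rename register 3 finishes and updates the RR.
rr_valid[3], rr_value[3] = True, 99
after = try_issue(entry, rr_valid, rr_value, ar_value)
```

Compared with the dispatch bound case, the RS entry is narrower because it stores no operand values, at the cost of a register file read on the issue path.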

  23. 4. Design parameters of register renaming (1) Source: Sima, D. „Register Renaming Techniques”, Computer Engineering Handbook, CRC PRESS 2006

  24. 4. Design parameters of register renaming (2) Source: Sima, D. „Register Renaming Techniques”, Computer Engineering Handbook, CRC PRESS 2006

  25. 5. Implementation of renaming in superscalars 5.1 The chronology of introducing register renaming Figure 5.1: Chronology of introducing register renaming Source: Sima, D. „Register Renaming Techniques”, Computer Engineering Handbook, CRC PRESS 2006

  26. 5.2 The basic implementation schemes of register renaming
     Table: each scheme combines a type of rename buffer (rename register file, future file, merged architectural and rename register file, holding renamed values in the ROB) with an operand fetch policy (dispatch bound or issue bound).
     Proposals: Keller (75), Smith, Pleszkun (85), Johnson (87), Sohi, Vajapeyam (87)
     Examples: PowerPC 603 (93), K7 (FX) (99), PM1 (SPARC 64) (95), ES/9000 (92), Pentium Pro (95), K8 (FX) (03), POWER1 (90), Pentium II (97), PowerPC 604 (95), Pentium III (99), POWER2 (93), PowerPC 620 (96), UltraSPARC III (99), Pentium M (03), P2SC (96), Core (06), POWER4 (01), POWER3 (98), POWER5 (04), PA 8000 (96), Nx586 (94), Am29000 (95), PA 8200 (97), R10000 (96), K5 (95), PA 8500 (99), R12000 (99), Lightning* (91), Pentium 4 (00), K6* (97), K7 (FP) (99), K8 (FP) (03)

  27. 6. Examples (1) Rename register file Figure 6.1: The microarchitecture of the POWER3 Source: Song, P. „IBM’s Power3 to Replace P2SC”, Microprocessor Report, Nov. 17, 1997

  28. 6. Examples (2) Future file WARF: Working and Architectural Register File (Future file) Figure 6.2: The microarchitecture of the UltraSPARC-III Source: Horel, T. „UltraSPARC-III”, IEEE MICRO, May-June 99, pp. 73-95

  29. 6. Examples (3) Merged architectural and rename register file. Figure 6.3: The microarchitecture of the Alpha 21264. Source: Kessler, R.E. et al., „The Alpha 21264 Microprocessor Architecture”, h18002.www1.hp.com/alphaserver

  30. 6. Examples (4) Holding renamed values in the ROB Figure 6.4: The microarchitecture of the Core processor Source: Kanter, D., „Intel’s next Generation Microarchitecture Unveiled”, Real World Tech., 2006 March 9.
