Lecture 11: Modern Superscalar Processor Models

Generic Superscalar Models, Issue Queue-based Pipeline, Multiple-Issue Design

Generic Superscalar Processor Models

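[Figure: issue-queue-based pipeline. Blocks: Fetch, Rename, Wakeup/select, Regfile, FU, D-cache, bypass, commit. Stage labels: schedule, execute.]

[Figure: reservation-station-based pipeline (already studied). Blocks: Fetch, Rename, Reg, ROB, Wakeup/select, FU, D-cache, bypass, commit. Stage labels: schedule, execute.]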

Revised from Palacharla PhD thesis 1998

Issue Queue Based Pipeline

Fetch -> Rename -> Issue -> Reg-read -> Execute -> Writeback/Commit

Core structure: register mapping table

  • Rename: translate architectural registers into physical registers
  • Issue: send instruction out to register read and then execution
  • Commit: Process mis-prediction/exception, update register renaming

Why study? Used in Alpha 21264, MIPS R10000, Intel P4

Compare Reservation Station and Issue Queue
  • Pipeline Stage Sequence
    • RS: IF -> REN -> REG/ROB->SCHD->…
    • IQ: IF -> REN -> SCHD -> REG ->…
  • Mapping Table vs. Status Table
    • RS: Status table chooses architectural register or ROB
    • IQ: Always renames to a physical register
  • Register file
    • RS: Architectural register file stores architectural states
    • IQ: Physical register file; No architectural register file! Mapping table determines architectural states
Compare Reservation Station and Issue Queue
  • Scheduling entry fields
    • RS: busy, fu, op, Qj, Qk, Vj, Vk
    • IQ: busy, fu, op, Pj, Pk, ReadyJ, ReadyK
  • ROB
    • RS: Store register values
    • IQ: No register contents

Pros and Cons of IQ:

    • Good: No copying between ROB and register file
    • Good: Efficient use of registers
    • Bad: Complex mapping table design
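
The issue queue entry fields listed above (busy, fu, op, Pj, Pk, ReadyJ, ReadyK) can be sketched in Python to show how wakeup and select might use them. This is a minimal illustration under simplifying assumptions, not a model of any real processor: the names `IQEntry`, `wakeup`, and `select` are hypothetical, and selection is simply oldest-first.

```python
from dataclasses import dataclass


@dataclass
class IQEntry:
    """One issue-queue entry: tags and ready bits only, no operand values."""
    op: str
    fu: str
    pj: int                 # physical register tag of source J
    pk: int                 # physical register tag of source K
    ready_j: bool = False
    ready_k: bool = False
    busy: bool = True


def wakeup(queue, dest_tag):
    """Broadcast a produced physical-register tag and set matching ready bits."""
    for entry in queue:
        if entry.busy:
            if entry.pj == dest_tag:
                entry.ready_j = True
            if entry.pk == dest_tag:
                entry.ready_k = True


def select(queue, width):
    """Pick up to `width` busy entries whose two sources are ready (oldest first)."""
    chosen = [e for e in queue if e.busy and e.ready_j and e.ready_k][:width]
    for entry in chosen:
        entry.busy = False  # the entry is released once the instruction issues
    return chosen


queue = [
    IQEntry(op="ADD", fu="alu", pj=32, pk=3, ready_k=True),              # waits on P32
    IQEntry(op="SUB", fu="alu", pj=4, pk=5, ready_j=True, ready_k=True),
]
first = select(queue, width=2)   # only SUB has both sources ready
wakeup(queue, dest_tag=32)       # P32 is produced, so ADD wakes up
second = select(queue, width=2)
print([e.op for e in first], [e.op for e in second])
```

Unlike a reservation station entry (Vj, Vk), the entry here never holds operand values; they are read from the physical register file after the instruction is selected.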
Register Mapping Table

Records the mapping from virtual (architectural) registers to physical registers

Mapping is stored in RAM or CAM memories

Arch reg (virtual) => Phy reg

R1 => P3
R2 => P10
R3 => P6
R4 => P8
R5 => P12


Register Renaming Examples

Original code:

Loop:
LW R2, 0(R1)
ADD R2, R2, 1
SW R2, 0(R1)
ADD R1, R1, 4
BNE R2, R3, Loop

LW returns 100, R1=4000

Renamed dynamic instructions:

BNE P2, P3, Loop
LW P32, 0(P1)
ADD P33, P32, 1
SW P33, 0(P1)
ADD P34, P1, 4
BNE P33, P3, Loop

Assume that when the first BNE is renamed, R1-R31 are mapped to P1-P31 and P32-P127 are free

The first BNE may be predicted either correctly or incorrectly
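
The renaming of the loop body can be reproduced with a short Python sketch. It assumes, as stated in the example, that R1-R31 start mapped to P1-P31 and that P32-P127 are free; the function name `rename` and the `(op, dest, sources)` tuple encoding are hypothetical choices made only for this illustration.

```python
def rename(program, num_arch=31, num_phys=128):
    """Rename instructions given as (op, dest_or_None, [sources]) tuples."""
    # Assumption from the example: R1-R31 -> P1-P31, P32-P127 are free.
    mapping = {f"R{i}": f"P{i}" for i in range(1, num_arch + 1)}
    free_list = [f"P{i}" for i in range(num_arch + 1, num_phys)]
    renamed = []
    for op, dest, srcs in program:
        # Sources are translated before the destination gets its new register.
        new_srcs = [mapping.get(s, s) for s in srcs]  # immediates pass through
        new_dest = None
        if dest is not None:
            new_dest = free_list.pop(0)
            mapping[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed, mapping


loop_body = [
    ("LW", "R2", ["R1"]),         # LW  R2, 0(R1)
    ("ADD", "R2", ["R2", "1"]),   # ADD R2, R2, 1
    ("SW", None, ["R2", "R1"]),   # SW  R2, 0(R1)
    ("ADD", "R1", ["R1", "4"]),   # ADD R1, R1, 4
]
renamed, mapping = rename(loop_body)
for instruction in renamed:
    print(instruction)
```

Each instruction with a register output takes the next free physical register, so the two writes to R2 land in different physical registers and the store reads the second one.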
Register Mapping Status

Mapping table after each instruction is renamed:

At first BNE: R1 => P1, R2 => P2, R3 => P3, R4 => P4, R5 => P5
After LW: R1 => P1, R2 => P32, R3 => P3, R4 => P4, R5 => P5
After ADD (R2): R1 => P1, R2 => P33, R3 => P3, R4 => P4, R5 => P5
After SW: R1 => P1, R2 => P33, R3 => P3, R4 => P4, R5 => P5
After ADD (R1): R1 => P34, R2 => P33, R3 => P3, R4 => P4, R5 => P5

At commit (possible sequence), physical register values:

P1=4000, P2=200, P32=100, P33=?, P34=4004
P1=4000, P2=200, P32=100, P33=101, P34=4004
P1=4000, P2=200, P32=100, P33=101, P34=4004
No change
P1=4000, P2=200, P32=100, P33=101, P34=4004


Commit and Rollback

On successful commit: make the next mapping status the committed mapping status, and free the previous physical register

On mis-prediction/exception: flush the pipeline and discard the mappings that follow the offending instruction

[Figure: the sequence of mapping statuses below, with markers for the rename point and the commit point.]

R1 => P1, R2 => P2, R3 => P3, R4 => P4, R5 => P5
R1 => P1, R2 => P32, R3 => P3, R4 => P4, R5 => P5
R1 => P1, R2 => P33, R3 => P3, R4 => P4, R5 => P5
R1 => P1, R2 => P33, R3 => P3, R4 => P4, R5 => P5
R1 => P34, R2 => P33, R3 => P3, R4 => P4, R5 => P5

P1=4000, P2=200, P32=100, P33=?, P34=4004
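
The commit and rollback rules can be sketched as follows. This is an illustrative Python model under stated assumptions, not a description of any specific processor: the class `RenameState` and its methods are hypothetical, only R1 and R2 are tracked, and a flush simply discards every uncommitted mapping.

```python
class RenameState:
    """Tracks a committed mapping, a speculative (current) mapping, and a free list."""

    def __init__(self):
        self.committed = {"R1": "P1", "R2": "P2"}
        self.current = dict(self.committed)
        self.free_list = ["P32", "P33", "P34"]
        self.in_flight = []  # (arch, new_phys, old_phys), oldest first

    def rename_dest(self, arch):
        new_phys = self.free_list.pop(0)
        self.in_flight.append((arch, new_phys, self.current[arch]))
        self.current[arch] = new_phys
        return new_phys

    def commit_oldest(self):
        arch, new_phys, old_phys = self.in_flight.pop(0)
        self.committed[arch] = new_phys
        self.free_list.append(old_phys)  # the previous physical register is freed

    def flush(self):
        # Discard all uncommitted mappings and reclaim their registers.
        for _, new_phys, _ in self.in_flight:
            self.free_list.append(new_phys)
        self.in_flight.clear()
        self.current = dict(self.committed)


state = RenameState()
state.rename_dest("R2")   # LW  writes R2 -> P32
state.rename_dest("R2")   # ADD writes R2 -> P33
state.rename_dest("R1")   # ADD writes R1 -> P34
state.commit_oldest()     # LW commits: R2 => P32 is committed, P2 is freed
state.flush()             # e.g. a mis-prediction: later mappings are discarded
print(state.current, state.free_list)
```

After the flush, the current mapping equals the committed one, so fetch can restart with a consistent architectural state.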

Program Execution Correctness
  • Only committed instructions write to registers and memory

Yes, from the programmer's viewpoint: only committed instructions' register outputs become visible

  • Maintain correct data flow: a child instruction always uses the values from its parents

Yes, in renamed form, and not affected by speculative execution

  • Register/memory receives the value of the last write

Yes, from the programmer's viewpoint: the architectural mapping status is updated in program order

Note that memory correctness is not affected

Mapping Table Design – MIPS R10000

[Figure: mapping tables and branch stack. The current mapping and the committed mapping are shown alongside a branch stack whose entries are: Mapping after Br1 / Alternative PC1, Mapping after Br2 / Alternative PC2, Mapping after Br3 / Alternative PC3, Mapping after Br4 / Alternative PC4.]

RAM-based structure:

  • The current mapping is saved automatically, in parallel, for each branch at rename
  • On mis-prediction: restore the previous mapping immediately, flush the pipeline, and restart fetch at the alternative PC
  • On commit of a branch instruction: make the corresponding mapping the committed one
  • Stall if the branch stack is full
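
The branch stack behavior described above can be sketched in Python. This is a simplified illustration, not the actual hardware design: the class `BranchStack`, its method names, and the example PCs are hypothetical, and the capacity of four simply mirrors the four entries (Br1 to Br4) in the figure.

```python
class BranchStack:
    """Checkpoints the mapping at each branch so mis-prediction recovery is immediate."""

    def __init__(self, mapping, capacity=4):
        self.current = dict(mapping)
        self.committed = dict(mapping)
        self.capacity = capacity
        self.stack = []  # (branch_id, saved mapping, alternative PC), oldest first

    def rename_branch(self, branch_id, alt_pc):
        if len(self.stack) == self.capacity:
            return False  # branch stack full: rename must stall
        self.stack.append((branch_id, dict(self.current), alt_pc))
        return True

    def mispredict(self, branch_id):
        index = next(i for i, (b, _, _) in enumerate(self.stack) if b == branch_id)
        _, saved, alt_pc = self.stack[index]
        self.current = dict(saved)   # restore the mapping saved at the branch
        del self.stack[index:]       # this branch and all younger ones are squashed
        return alt_pc                # fetch restarts here

    def commit_branch(self):
        _, saved, _ = self.stack.pop(0)
        self.committed = dict(saved)  # the saved mapping becomes the committed one


bs = BranchStack({"R1": "P1", "R2": "P2"})
bs.rename_branch("Br1", alt_pc=0x100)
bs.current["R2"] = "P32"              # an instruction after Br1 renames R2
bs.rename_branch("Br2", alt_pc=0x200)
bs.current["R1"] = "P34"              # an instruction after Br2 renames R1
restart_pc = bs.mispredict("Br2")
print(hex(restart_pc), bs.current)
```

Recovery cost does not depend on how many instructions followed the branch, which is why branch mis-prediction is handled this way.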
Mapping Table Design – MIPS R10000
  • How about precise exceptions?
    • Cannot preserve the mapping status for every instruction
  • Solution: record the change of mapping in the ROB
    • ROB: contains the destination architectural register, the renamed physical register, and the old renamed physical register
    • On exception: roll back the mapping instruction by instruction, at four instructions per cycle
    • Slow, but how frequent are exceptions?

Note that branch mis-prediction has fast recovery
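
The ROB-based rollback can be sketched as a walk from the youngest entry back to the faulting instruction. This is a minimal Python illustration under assumptions: the function `rollback`, the tuple layout `(arch, new_phys, old_phys)`, and the cycle count computed as a ceiling division are hypothetical simplifications.

```python
def rollback(mapping, free_list, rob, fault_index, per_cycle=4):
    """Undo the renames of ROB entries from the youngest back to `fault_index`.

    Each ROB entry is (arch, new_phys, old_phys). Returns the cycles spent,
    assuming `per_cycle` entries are undone per cycle.
    """
    undone = 0
    while len(rob) > fault_index:
        arch, new_phys, old_phys = rob.pop()  # youngest entry first
        mapping[arch] = old_phys              # restore the old mapping
        free_list.append(new_phys)            # the squashed register is free again
        undone += 1
    return -(-undone // per_cycle)            # ceiling division


mapping = {"R1": "P34", "R2": "P33"}
rob = [
    ("R2", "P32", "P2"),    # LW  R2
    ("R2", "P33", "P32"),   # ADD R2 (assume this one raises the exception)
    ("R1", "P34", "P1"),    # ADD R1
]
free_list = []
cycles = rollback(mapping, free_list, rob, fault_index=1)
print(mapping, free_list, cycles)
```

Walking in reverse order matters: when a register was renamed more than once, the oldest squashed entry is processed last and leaves the correct pre-exception mapping.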

Mapping Table Design – Alpha 21264

[Figure: CAM mapping table. Entries p0, p1, p2, ..., pk each hold an Arch. Reg # and valid bits (shown as 1 1, 1 0, 0 1, 1 1); the valid-bit columns are labeled committed mapping and current mapping; a lookup hits the entry that is "match and valid".]

CAM structure

  • Associative search on the architectural register index; outputs the physical register index (through an encoder)
  • One column represents one mapping, allocated at rename to each instruction with a register output
  • One pair of valid bits changes per destination renaming
  • Fast recovery even on exceptions
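
A CAM-style mapping table can be sketched in Python to show why one renaming flips exactly one pair of valid bits and why restoring a saved column of valid bits recovers an earlier mapping. This is a simplified model, not the actual Alpha 21264 circuit: the class `CamMapTable` and its methods are hypothetical, and the associative search is written as a sequential loop although hardware would compare all entries in parallel.

```python
class CamMapTable:
    """One entry per physical register: an architectural register number and a valid bit."""

    def __init__(self, num_phys, initial):
        # `initial` maps architectural index -> physical index.
        self.arch = [None] * num_phys
        self.valid = [False] * num_phys
        for arch_index, phys_index in initial.items():
            self.arch[phys_index] = arch_index
            self.valid[phys_index] = True

    def lookup(self, arch_index):
        # "Match and valid": the hit position is the physical register index.
        for phys_index, tag in enumerate(self.arch):
            if self.valid[phys_index] and tag == arch_index:
                return phys_index
        raise KeyError(arch_index)

    def snapshot(self):
        return list(self.valid)  # one column of valid bits is one mapping

    def rename_dest(self, arch_index, new_phys):
        old_phys = self.lookup(arch_index)
        self.valid[old_phys] = False   # one pair of valid bits changes
        self.arch[new_phys] = arch_index
        self.valid[new_phys] = True

    def restore(self, column):
        self.valid = list(column)


table = CamMapTable(num_phys=8, initial={1: 1, 2: 2})
saved = table.snapshot()
table.rename_dest(2, new_phys=5)
after_rename = table.lookup(2)
table.restore(saved)
after_restore = table.lookup(2)
print(after_rename, after_restore)
```

The sketch relies on the old entry keeping its architectural register number until the register is actually reused, so restoring the valid bits alone is enough to recover the mapping.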
Multiple Issue Pipelines

Each pipeline stage accepts k instructions: a k-issue processor

  • Alpha 21264 – 4-issue
  • MIPS R10000 – 4-issue
  • Intel P4 – 3-issue

Memory structures must have a number of ports proportional to the issue width!

What if the k instructions at rename have dependences among them? Need dependence check logic!

Dependence Check Logic

[Figure: four instructions renamed in parallel. Inputs to the mapping table: Rs0, Rt0, Rd0; Rs1, Rt1, Rd1; Rs2, Rt2, Rd2; Rs3, Rt3, Rd3. Outputs: renamed sources (labeled Ps0, Ps1 for each instruction) and destinations Pd0, Pd1, Pd2, Pd3. No dependence check yet.]

Any change to the first renaming?

What is the change to the second one? The third and fourth ones?
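
One way to think about these questions is to rename a group of instructions from the same table snapshot and then correct each source against the destinations of earlier instructions in the group. The Python sketch below illustrates that idea under assumptions: the function `rename_group`, the `(rs, rt, rd)` tuple layout, and the example registers are hypothetical, and the inner loop stands in for comparators that hardware would evaluate in parallel.

```python
def rename_group(group, mapping, free_list):
    """Rename a group of (rs, rt, rd) instructions in one step."""
    # Step 1: every source is looked up in the mapping as it was before the group.
    looked_up = [(mapping[rs], mapping[rt]) for rs, rt, _ in group]
    new_dests = [free_list.pop(0) for _ in group]
    renamed = []
    for i, (rs, rt, _) in enumerate(group):
        ps, pt = looked_up[i]
        # Step 2: dependence check. A source that matches the destination of an
        # earlier instruction in the group takes that instruction's new register;
        # the latest earlier match wins.
        for j in range(i):
            if group[j][2] == rs:
                ps = new_dests[j]
            if group[j][2] == rt:
                pt = new_dests[j]
        renamed.append((ps, pt, new_dests[i]))
    # Step 3: the table is updated in program order, so the last writer wins.
    for (_, _, rd), pd in zip(group, new_dests):
        mapping[rd] = pd
    return renamed


mapping = {f"R{i}": f"P{i}" for i in range(1, 6)}
free_list = ["P32", "P33", "P34", "P35"]
group = [
    ("R1", "R2", "R3"),
    ("R3", "R1", "R4"),   # reads R3 written by instruction 0
    ("R4", "R3", "R3"),   # reads R4 from instruction 1 and R3 from instruction 0
    ("R3", "R3", "R5"),   # reads R3 written by instruction 2
]
result = rename_group(group, mapping, free_list)
for entry in result:
    print(entry)
```

In this sketch the first instruction needs no correction, while instruction i needs its two sources compared against i earlier destinations, so the comparison count grows with the square of the issue width.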