ia 64 microarchitecture itanium processor l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
IA-64 Microarchitecture --- Itanium Processor PowerPoint Presentation
Download Presentation
IA-64 Microarchitecture --- Itanium Processor

Loading in 2 Seconds...

play fullscreen
1 / 21

IA-64 Microarchitecture --- Itanium Processor - PowerPoint PPT Presentation


  • 172 Views
  • Uploaded on

IA-64 Microarchitecture --- Itanium Processor. Jun Feng Jun Xie Huafeng Lü. Outline. Introduction Pipeline Issue Performance Comparison Summary. Itanium Processor. First implementation of IA-64 Compiler based exploitation of ILP Also has many features of superscalar.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'IA-64 Microarchitecture --- Itanium Processor' - junior


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
outline
Outline
  • Introduction
  • Pipeline Issue
  • Performance Comparison
  • Summary
itanium processor
Itanium Processor
  • First implementation of IA-64
  • Compiler based exploitation of ILP
  • Also has many features of superscalar
10 stage pipeline
10-stage Pipeline
  • Front-end
  • Instruction delivery
  • Operand delivery
  • Execution
front end
Front-end
  • IPG, Fetch, Rotate

Prefetches up to 32 bytes per cycle (2 bundles) into a prefetch buffer (up to hold 8 bundles)

Branch prediction is done using a multilevel adaptive predictor

instruction delivery
Instruction delivery
  • EXP and REN

Distributes up to 6 instructions to the 9 functional units

Implements registers renaming for both rotation and register stacking

operand delivery
Operand delivery
  • WLD and REG

Accesses the register file

Performs register bypassing

Accesses and updates a register scoreboard

Checks predicate dependences

execution
Execution
  • EXE, DET and WRB

Executes instructions through ALUs and load/store units

Detects exceptions and posts NaTs

Retires instructions and performs write-back

integer performance specint benchmark considerably slower
Integer PerformanceSPECint benchmark: considerably slower
  • Itanium is considerably slower than Alpha 21264 and Pentium 4.
  • Only: 60% of of P4, 68% of Alpha

Itanium: HP rx4610, 800MHz, 4MB off-chip L3 cache

Alpha 21264: Compaq GS320, 1GHz, on-chip L2 cache

Pentium 4: Compaq Precision 330, 2GHz, 256KB on-chip L2 cache

floating point performance specfp benchmarks a different story
Floating Point Performance SPECfp benchmarks: a different story
  • Itanium is quicker than Alpha 21264 and Pentium 4.
  • 108% of of P4, 120% of Alpha

Itanium: HP rx4610, 800MHz, 4MB off-chip, L3 cache

Alpha 21264: Compaq GS320, 1GHz, on-chip L2 cache

Pentium 4: Compaq Precision 330, 2GHz, on-chip L2 cache

discussion on specfp
Discussion on SPECfp
  • Floating point app: competitive

.higher degrees of ILP

.aggressive memory system

Art benchmark: 4 times of Pentium 4

Alpha: outperform when tuned

In terms of power: worse than P4

56% of floating point performance per watt

summary by us
Summary By Us
  • Good floating point performance
  • Poor integer performance
  • Overall: not so good as Intel has advertised
conclusion
Conclusion
  • Large code size
  • Only static instruction-level parallelism
  • Cannot manage cache misses/hits flexibly
  • Lack of applications