
Itanium


Presentation Transcript


  1. Itanium CSE 820

  2. IA-64
     Intel introduced a new ISA with no backward compatibility to x86 (IA-32). What do you get from a clean sheet?
     Michigan State University Computer Science and Engineering

  3. IA-64
     The first product line is the Itanium. Status:
     • NEC announced that a 32-processor, Itanium 2-based server has achieved the world's best TPC-C benchmark result on a 32-processor SMP platform.
     • 1 GHz, 3 MB tertiary cache, 512 GB RAM

  4. SPEC (top in 3/03)
     SPECint2000
     • Pentium 4, 3 GHz: 1100
     • IBM 690, 1.3 GHz: 839
     • Pentium 4, 2.2 GHz: 811
     • Itanium 2, 1 GHz: 810
     SPECfp2000
     • Itanium 2, 1 GHz: 1431
     • IBM 690, 1.3 GHz: 1266
     • Pentium 4, 3 GHz: 1090

  5. Registers
     • 128 @ 65-bit general-purpose registers (64-bit + NaT)
     • 128 @ 82-bit floating-point registers (2 extra exponent bits over IEEE 80-bit)
     • 64 @ 1-bit predicate registers
     • 8 @ 64-bit branch registers (for indirect branches)
     • Other registers for system control, memory mapping, performance counters, and communication with the OS

  6. Integer Registers
     • 0-31: general purpose
     • 32-127: used as a register stack, similar to SPARC: registers are renamed for function calls, and the current frame marker (CFM) plays the role of a frame pointer
     • Special hardware also handles stack overflow

  7. Register Rotation
     Rotation of registers 32-127 is used to allocate registers in software-pipelined loops.
     When combined with predication, loops can be unrolled without a separate prologue and epilogue, which reduces the code expansion overhead of loop unrolling.
     That is, unrolling becomes cheap enough that smaller loops can be unrolled.
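A minimal sketch of how rotation renames registers, assuming a rotating region that starts at r32 and a rotating register base (`rrb`) that is decremented once per loop iteration. The function names and the region size used below are hypothetical, chosen only for illustration.

```python
ROT_BASE = 32  # first rotating register, per the slide


def physical_reg(virtual, rrb, size):
    """Map a virtual register number to a physical one.

    Registers outside the rotating region [ROT_BASE, ROT_BASE + size)
    are not renamed.
    """
    if virtual < ROT_BASE or virtual >= ROT_BASE + size:
        return virtual
    return ROT_BASE + (virtual - ROT_BASE + rrb) % size


def rotate(rrb, size):
    """Advance to the next loop iteration by decrementing the base."""
    return (rrb - 1) % size


# A value written to r32 in one iteration is visible as r33 in the next,
# so each pipeline stage can pass data along without explicit moves.
size = 8
rrb = 0
written = physical_reg(32, rrb, size)
rrb = rotate(rrb, size)
read_next = physical_reg(33, rrb, size)
```

In this model the loop body always names the same virtual registers, while the hardware shifts which physical register each name refers to.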

  8. Explicit Parallelism
     One important aspect of IA-64 is to allow the compiler to do more and to communicate more information to the hardware.
     In particular, the compiler can indicate when an instruction cannot be executed in parallel with its successors.

  9. Group
     A sequence of consecutive instructions with no data dependences among them.
     All instructions in a group can be executed in parallel if sufficient hardware is available and memory dependences are preserved.
     A group can be arbitrarily long, but the compiler must explicitly mark the boundary between groups with a stop.
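The grouping rule can be sketched as a greedy pass over a straight-line instruction list. This is an illustrative model, not a real compiler pass: instructions are hypothetical `(dest, sources)` tuples, only register dependences on earlier writes in the same group are checked, and memory dependences are ignored.

```python
def split_into_groups(instrs):
    """Split instructions into groups, placing a stop wherever an
    instruction reads or rewrites a register already written in the
    current group."""
    groups = []
    current = []
    written = set()
    for dest, srcs in instrs:
        depends = dest in written or any(s in written for s in srcs)
        if depends:
            # Stop: close the current group and start a new one.
            groups.append(current)
            current = []
            written = set()
        current.append((dest, srcs))
        written.add(dest)
    if current:
        groups.append(current)
    return groups


program = [
    ("r1", ("r2", "r3")),  # independent
    ("r4", ("r5",)),       # independent of the first
    ("r6", ("r1",)),       # reads r1, so a stop is needed before it
    ("r7", ("r6",)),       # reads r6, so another stop
]
groups = split_into_groups(program)
group_sizes = [len(g) for g in groups]
```

Here the first two instructions share a group, while the last two each need their own.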

  10. Bundle
      128 bits wide:
      • Three 41-bit instructions
        • 4 MSBs are the opcode
        • 6 LSBs specify the predicate register
      • 5-bit template
        • Encoded
        • Specifies the execution unit for each instruction
        • Indicates “stops”
      The full opcode combines the 4 MSBs with the template information.
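The field widths above add up to 3 × 41 + 5 = 128 bits. A sketch of packing and unpacking such a bundle follows; it assumes the template occupies the low-order 5 bits with the three slots above it, an ordering the slide does not state. The function names are hypothetical.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into one integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3 or any(not 0 <= s <= SLOT_MASK for s in slots):
        raise ValueError("need three slots of at most 41 bits each")
    bundle = template
    for i, slot in enumerate(slots):
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Inverse of pack_bundle: return (template, [slot0, slot1, slot2])."""
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(3)
    ]
    return template, slots


def slot_fields(slot):
    """Extract the two fields named on the slide from a 41-bit slot."""
    return {
        "major_opcode": slot >> (SLOT_BITS - 4),  # 4 most significant bits
        "predicate": slot & 0x3F,                 # 6 least significant bits
    }


example = pack_bundle(0x10, [1, 2, 3])
```

Six predicate bits are exactly enough to select one of the 64 predicate registers.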

  11. Execution Slots
      • I-unit: ALU ops, shifts, moves
      • M-unit: ALU ops, loads, stores
      • F-unit: FP ops
      • B-unit: Branches
      • L+X: Extended immediates, stops, NOPs; takes 2 instruction slots for 64-bit immediates

  12. Predication
      • Predicate registers are set using compare or test instructions
        • 10 tests
        • Write 2 predicate registers (complement)
      • Multiple comparisons can be handled in one instruction
      • A conditional branch is simply a predicated branch
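A small sketch of if-conversion with complementary predicates: one compare writes a predicate and its complement, and both arms of the conditional become predicated operations with no branch. The helper names are hypothetical. Treating p0 as always true follows the architecture's convention.

```python
NUM_PREDICATES = 64


def compare_lt(pred, p_true, p_false, a, b):
    """Compare once and write two complementary predicate registers."""
    result = a < b
    pred[p_true] = result
    pred[p_false] = not result


def predicated(pred, qp, op):
    """Execute op only if its qualifying predicate is set;
    otherwise the instruction has no effect."""
    if pred[qp]:
        op()


def if_converted_select(a, b):
    """Branch-free version of: x = 1 if a < b else 2."""
    pred = [False] * NUM_PREDICATES
    pred[0] = True  # p0 is always true
    regs = {"x": None}

    def set_x(value):
        regs["x"] = value

    compare_lt(pred, 1, 2, a, b)
    predicated(pred, 1, lambda: set_x(1))  # (p1) x = 1
    predicated(pred, 2, lambda: set_x(2))  # (p2) x = 2
    return regs["x"]
```

Both predicated operations are issued every time, and exactly one of them takes effect.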

  13. Deferred Exception Handling
      Itanium uses poison bits:
      • NaT = “Not a Thing” (65th GPR bit)
      • NaTVal = “Not a Thing Value” (special FP value)
      Both are generated by speculative loads, and all operations propagate NaT and NaTVal.
      Nonspeculative loads also exist; they do not defer exceptions.
      FP exceptions are handled separately using special FP status registers.

  14. Deferred Exception Handling
      If a NaT (or NaTVal) reaches
      • a nonspeculative instruction, e.g. a store: an immediate exception is raised
      • a chk.s: control branches to a compiler-generated routine that recovers from the speculative operation
      (Special instructions exist so that the OS can save registers holding NaT on a context switch.)
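The poison-bit behaviour on the two slides above can be sketched with registers modelled as `(value, nat)` pairs. All names here are hypothetical, memory is a plain dictionary, and a missing address stands in for any faulting access.

```python
class NaTConsumption(Exception):
    """Raised when a nonspeculative instruction consumes a NaT."""


def ld_s(memory, addr):
    """Speculative load: defer a fault by setting the NaT bit."""
    if addr in memory:
        return (memory[addr], False)
    return (0, True)  # poisoned register, no exception yet


def add(a, b):
    """Ordinary ops propagate NaT from either source."""
    return (a[0] + b[0], a[1] or b[1])


def chk_s(reg, recovery):
    """Check a speculative result; run recovery code if it is poisoned."""
    return recovery() if reg[1] else reg


def store(memory, addr, reg):
    """Nonspeculative use: a NaT here raises immediately."""
    if reg[1]:
        raise NaTConsumption(f"store of NaT to {addr:#x}")
    memory[addr] = reg[0]


mem = {0x100: 7}
good = add(ld_s(mem, 0x100), (1, False))   # (8, False)
bad = add(ld_s(mem, 0x999), (1, False))    # NaT propagates through add
fixed = chk_s(bad, lambda: (0, False))     # recovery supplies a clean value
store(mem, 0x200, good)
```

In real code the recovery routine would re-execute the load nonspeculatively, so a genuine fault would surface there rather than being replaced by a default value.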

  15. Advanced Loads
      Hoist loads above stores they may depend on.
      The ld.a instruction creates an entry in the ALAT that records the destination register and the memory address.
      On a store, the ALAT is searched by memory address to check for a conflict. If there is a conflict, the ALAT entry is marked invalid.

  16. Advanced Load
      Before any nonspeculative instruction (e.g. a store) uses the value from an advanced load, the ALAT is checked.
      If OK, clear the ALAT entry.
      If not OK:
      • with ld.c, re-execute the load
      • with chk.a, re-execute the load and any speculative instructions that depend on it
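The advanced-load protocol from the two slides above, sketched as a toy model. The class and method names are hypothetical, the table is keyed by destination register, and real hardware details such as table capacity and partial address matching are left out.

```python
class ALAT:
    """Toy advanced load address table: destination register -> address."""

    def __init__(self):
        self.entries = {}

    def ld_a(self, mem, regs, dest, addr):
        """Advanced load: load early and record (dest, addr)."""
        regs[dest] = mem[addr]
        self.entries[dest] = addr

    def store(self, mem, addr, value):
        """Store: invalidate any entry whose address conflicts."""
        mem[addr] = value
        for reg in [r for r, a in self.entries.items() if a == addr]:
            del self.entries[reg]

    def ld_c(self, mem, regs, dest, addr):
        """Check load: return True if the load had to be re-executed."""
        if dest in self.entries:
            del self.entries[dest]  # speculation was safe; clear the entry
            return False
        regs[dest] = mem[addr]      # conflict detected: reload
        return True


mem = {0x10: 1, 0x20: 2}
regs = {}
alat = ALAT()

alat.ld_a(mem, regs, "r4", 0x10)
alat.store(mem, 0x20, 9)                       # different address
no_conflict = alat.ld_c(mem, regs, "r4", 0x10)

alat.ld_a(mem, regs, "r4", 0x10)
alat.store(mem, 0x10, 5)                       # same address
conflict = alat.ld_c(mem, regs, "r4", 0x10)
```

A chk.a would differ from ld.c only in what happens on failure: instead of reloading one value, it would branch to recovery code that also redoes the dependent speculative instructions.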
