
Mechanisms: Run-time and Compile-time



Presentation Transcript


  1. Mechanisms: Run-time and Compile-time

  2. Outline • Agents of Evolution • Run-time • Branch prediction • Trace Cache • SMT, SSMT • The memory problem (L2 misses) • Compile-time • Block structured ISA • Predication • Fast track, slow track • Wish branches • Braids • Multiple levels of cache

  3. Evolution of the Process • Pipelining • Branch Prediction • Speculative execution (and recovery) • Special execution units (FP, Graphics, MMX) • Out of order execution (and in-order retirement) • Pipelining with hiccups (R/S, ROB) • Wide issue • Trace cache • SMT, SSMT • Soft errors • Handling L2 misses

  4. Run-time Mechanisms • Branch prediction • Trace Cache • SMT, SSMT • The memory problem (L2 misses)

  5. The Conditional Branch Problem • Why? Because there are pipelines. (Note: HEP) • Mechanisms to solve it • Delayed branch • Take both paths • Eliminate branches (predication, compound predicates) • Branch Prediction • Static: Always taken, BTFN • Early dynamic: LT, 2-bit counter • 2-level, gshare, agree, hybrid • Indirect jumps • Perceptron, TAGE • Expose branch prediction hardware to the software • Wish branches
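To make the step from the 2-bit counter to gshare concrete, here is a minimal sketch of a gshare-style predictor. It is an illustration only, not taken from the slides: the class name `GsharePredictor`, the table size, the initial counter value, and the helper `accuracy` are all assumptions.

```python
class GsharePredictor:
    """Toy gshare: (branch address XOR global history) indexes 2-bit counters.

    Illustrative sketch; sizes and initial values are assumptions.
    With index_bits=0 it degenerates to a single 2-bit saturating counter.
    """

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0                          # global branch history register
        self.counters = [1] * (1 << index_bits)   # 1 = weakly not-taken

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the actual outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, training after each branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A branch that alternates taken / not-taken.
pattern = [i % 2 == 0 for i in range(1000)]
single_counter = accuracy(GsharePredictor(index_bits=0), 0x40, pattern)
gshare = accuracy(GsharePredictor(index_bits=10), 0x40, pattern)
```

On this alternating pattern the lone 2-bit counter keeps flipping between its two weak states and never predicts correctly, while the history-indexed table learns the pattern after a short warm-up. That gap is the motivation for folding history into the index.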

  6. Trace Cache Issues • Storage partitioning (and set associativity) • Path Associativity • Partial Matching • Inactive Issue • Branch Promotion • Trace Packing • Dual Path Segments • Block collection (retire vs. issue) • Fill unit latency

  7. SMT and SSMT • SMT • Invented by Mario Nemirovsky, called DMS (HICSS-94) • Multiple threads sharing common back end • SSMT (aka “helper threads”) • Proposed in ISCA 99 (Rob Chappell et al.) • Simultaneously invented by M. Dubois, USC tech report • When the application only has one thread • Subordinate threads work on chip infrastructure • Subordinate threads: in microcode or at ISA level

  8. The memory problem • Simply: off-chip, on-chip access time imbalance • (assuming bandwidth is not a problem) • What to do about it • MLP replacement policy • DIP replacement • V-Way cache • Runahead execution • Address, value prediction • 3D Stacking • Two-level register file • Denser encoding of instructions, data • Better organization of code, data
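A small arithmetic sketch of the on-chip versus off-chip imbalance, using the standard average access time formula (hit time plus miss rate times miss penalty). The cycle counts and miss rate below are hypothetical round numbers chosen for illustration, not figures from the slides.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time in cycles: hit_time + miss_rate * miss_penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 15-cycle L2 hit, 300-cycle off-chip miss penalty.
L2_HIT_CYCLES = 15
OFF_CHIP_PENALTY_CYCLES = 300

for miss_rate in (0.01, 0.02, 0.05):
    avg = average_access_time(L2_HIT_CYCLES, miss_rate, OFF_CHIP_PENALTY_CYCLES)
    print(f"L2 miss rate {miss_rate:.0%}: about {avg:.0f} cycles per L2 access")
```

Under these assumed numbers, even a few percent of L2 misses adds as much latency as the hits themselves, which is why the mechanisms listed on the slide aim either to reduce misses or to overlap and hide their latency.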

  9. Compile-time mechanisms • Predication (and wish branches) • Block structured ISA • Fast track, slow track • Braids • Multiple levels of cache

  10. Wish Branches (Predicated execution to the next level) • Compile-time and Run-time • At compile time: • Does predication even make sense? • If no, regular branch and branch prediction • If yes, mark wish branch and defer to run-time • At run time: • Prediction accuracy low: predicate • Prediction accuracy high: branch predict
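The two-stage decision on this slide can be sketched as follows. This is a rough model under stated assumptions: the run-time "prediction accuracy" test is approximated by a resetting confidence counter, and the names `compile_time_choice` and `WishBranchPolicy` and the threshold values are hypothetical, not the mechanism described in the wish branch work itself.

```python
def compile_time_choice(predication_makes_sense):
    """Compile-time step: emit a wish branch only where predication could pay off."""
    return "wish branch" if predication_makes_sense else "regular branch"


class WishBranchPolicy:
    """Run-time step for one wish branch.

    Assumption: confidence is a counter that increments on each correct
    prediction and resets to zero on a misprediction.
    """

    def __init__(self, threshold=12, max_confidence=15):
        self.threshold = threshold
        self.max_confidence = max_confidence
        self.confidence = 0

    def mode(self):
        # High confidence: trust the branch predictor.
        # Low confidence: fall back to predicated execution.
        if self.confidence >= self.threshold:
            return "branch predict"
        return "predicate"

    def record(self, prediction_was_correct):
        if prediction_was_correct:
            self.confidence = min(self.max_confidence, self.confidence + 1)
        else:
            self.confidence = 0


policy = WishBranchPolicy()
modes = [policy.mode()]
for _ in range(12):            # a run of correct predictions builds confidence
    policy.record(True)
modes.append(policy.mode())
policy.record(False)           # one misprediction drops back to predication
modes.append(policy.mode())
```

The point of the split is that the compiler only decides whether predication is an option at all, while the hardware picks per dynamic instance between paying the predication overhead and risking a misprediction.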

  11. Block-structured ISA • References: • Stephen Melvin dissertation (Berkeley, 1991) • Eric Hao dissertation (Michigan, 1997) • ICS paper (Melvin and Patt, 1989) • ISCA paper (Melvin and Patt, 1991) • Micro-29 paper (Hao, Chang, Evers, Patt, 1996) • The Atomic Unit of Processing • Enlarged Blocks • Savings on Register Pressure • Faulting Branches, Trapping Branches • Serial Execution on Exceptions • Comparison to Superblocks, Hyperblocks

  12. Compile-time mechanisms • Predication (and wish branches) • Block structured ISA • Fast track, slow track • Braids (Francis Tseng, ISCA 2008) • Multiple levels of cache
