

Pushing Performance, Efficiency and Scalability of Microprocessors

CERCS IAB Meeting, Fall 2006

Gabriel Loh


Research Overview

  • Funding from state of GA, Intel, MARCO

  • Currently 2 PhD students, 2 MS

    • Active undergrad research as well

  • Collaborations

    • Universities: PSU, UO, Rutgers

    • Industry: Intel, IBM


Research Focus

  • “Near-term” microprocessor design issues

    • ~ 5-year time scale

    • Power/performance/complexity

    • Traditional uniprocessor performance

    • Multi-core performance

  • “Longer-term”

    • Keeping Moore’s Law alive for the longer term

    • Primarily, 3D integration for now


Scaling Performance and Efficiency

  • Multi-cores are here, but single-thread perf still matters

    • Intel Core 2 Duo is multi-core, but…

    • Single core is more OOO than ever

      • Larger instruction window, improved branch prediction, speculative load-store ordering, wider pipe and decoders

    • But power also really matters

      • Lower clock speeds, different channel length transistors, more uop fusion, …


Research Focus

  • Maximum performance within bounds

    • Bounds = power, area, TDP, …

  • Single-core performance helps multi-core performance, too

    • For future multi-core systems, need to strike a good balance between 1T and MT

  • Most of our research is at the uarch level

    • Caches, branch predictors, instruction schedulers, memory queue design, memory dependence prediction, etc.


Highlight: Traditional Caching [MICRO’06]

  • Well known that different apps respond differently to different replacement policies

  • Previous work in the OS domain has described adaptive replacement with provable bounds on performance

  • Adapted techniques for on-chip caches


Idea…


Adaptive Cache Implementation

  • Theoretical Guarantees

    • Miss rate provably bounded to be within a factor of two of the better algorithm

In practice, it’s much better
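
The adaptive idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MICRO’06 implementation: it assumes LRU and LFU as the two component policies, a single fully associative set, and cumulative miss counters, none of which are specified in the slides. All class and function names (`LRUTags`, `LFUTags`, `AdaptiveCache`, `run`) are hypothetical. Each component policy is tracked with a tag-only shadow array; on a miss, the cache evicts a block that the currently better policy would not be holding.

```python
from collections import OrderedDict


class LRUTags:
    """Shadow tag array tracking what an LRU cache would hold."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tags = OrderedDict()  # least recently used first
        self.misses = 0

    def __contains__(self, tag):
        return tag in self.tags

    def access(self, tag):
        if tag in self.tags:
            self.tags.move_to_end(tag)
            return
        self.misses += 1
        if len(self.tags) >= self.capacity:
            self.tags.popitem(last=False)  # drop the LRU tag
        self.tags[tag] = None


class LFUTags:
    """Shadow tag array tracking what an LFU cache would hold."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}  # tag -> use count while resident
        self.misses = 0

    def __contains__(self, tag):
        return tag in self.counts

    def access(self, tag):
        if tag in self.counts:
            self.counts[tag] += 1
            return
        self.misses += 1
        if len(self.counts) >= self.capacity:
            del self.counts[min(self.counts, key=self.counts.get)]
        self.counts[tag] = 1


class AdaptiveCache:
    """Follows whichever component policy has missed less so far (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []  # resident tags
        self.misses = 0
        self.policies = [LRUTags(capacity), LFUTags(capacity)]

    def access(self, tag):
        # Shadow arrays see every access, hit or miss.
        for policy in self.policies:
            policy.access(tag)
        if tag in self.blocks:
            return
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            better = min(self.policies, key=lambda p: p.misses)
            # Evict a block the better policy is not holding.
            victim = next((b for b in self.blocks if b not in better),
                          self.blocks[0])
            self.blocks.remove(victim)
        self.blocks.append(tag)


def run(trace, capacity):
    """Return (adaptive, LRU, LFU) miss counts for a trace of block tags."""
    cache = AdaptiveCache(capacity)
    for tag in trace:
        cache.access(tag)
    lru, lfu = cache.policies
    return cache.misses, lru.misses, lfu.misses
```

For example, `run([1, 1, 1, 2, 3, 1, 2, 3, 1], 2)` returns `(6, 7, 5)`: the sketch lands between LRU and LFU on this tiny trace. The factor-of-two bound quoted above applies to the published scheme and is not claimed for this simplified version.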


Current Research

  • Working on multi-core generalizations of adaptive caching and other ways to manage shared resources

  • Uniprocessor microarchitecture

    • Scalable memory scheduling [MICRO’06]

    • Memory dependence prediction [HPCA’06]

    • Branch prediction […]

    • And more…


Longer-Term Processor Scaling

  • Limitations/Obstacles

    • Wire scaling

      • Latency/performance

      • Power

    • Feature size

      • Lithography, parametric variations

    • Off-chip communication


3D Integration

  • Wire

    • Power/perf.

  • Off-chip

  • Feature size

    • Limitations, variations

[Figure: die/wafer stacking. Labels, in order: Active Layer 1, Metal Layers 1, Die-to-Die Vias, Metal Layers 2, Active Layer 2]

Less RC → faster, lower-power
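
The "less RC" claim can be made concrete with a first-order wire model. The sketch below assumes an unrepeated wire with uniformly distributed resistance and capacitance, whose 50% delay is roughly 0.38·R·C, and a switching energy of C·V² per charge cycle. The per-millimetre values and the 4 mm length are hypothetical numbers chosen only for illustration; they do not come from the slides.

```python
def wire_delay(r_per_mm, c_per_mm, length_mm):
    """Approximate 50% delay (seconds) of an unrepeated distributed RC wire."""
    resistance = r_per_mm * length_mm   # total R grows with length
    capacitance = c_per_mm * length_mm  # total C grows with length
    return 0.38 * resistance * capacitance


def switching_energy(c_per_mm, length_mm, vdd):
    """Energy (joules) drawn from the supply to charge the wire once."""
    return c_per_mm * length_mm * vdd ** 2


# Hypothetical wire parameters: 100 ohm/mm, 0.2 pF/mm, 1.0 V supply.
R_PER_MM, C_PER_MM, VDD = 100.0, 0.2e-12, 1.0

full = 4.0         # hypothetical 2D wire length in mm
half = full / 2.0  # same wire if stacking halves its length

delay_ratio = wire_delay(R_PER_MM, C_PER_MM, half) / wire_delay(R_PER_MM, C_PER_MM, full)
energy_ratio = switching_energy(C_PER_MM, half, VDD) / switching_energy(C_PER_MM, full, VDD)
print(f"delay ratio: {delay_ratio:.2f}, energy ratio: {energy_ratio:.2f}")
```

Under this model, delay scales with the square of wire length and switching energy scales linearly, so a wire shortened to half its length has about a quarter of the delay and half the switching energy. Repeated wires and driver effects are ignored here.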


Example: Caches

[Figure panels: Simplified 2D SRAM Array, 3D Wordline Stacking, 3D Bitline Stacking]

3D Bitline Stacking

  • Bitline length halved

  • BL reduction has greater impact on power savings

  • Split decoder → no activity stacking

We’ve studied a wide variety of other CPU building blocks


Uarch-level 3D design

Smaller footprint → faster and lower-power

Width-based gating → even lower power, close to original power density

Overall: 47% performance gain at only 2 degree temperature increase

Example: 4-die significance-partitioned datapath

Use uarch prediction mechanism for early determination of width
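
Width prediction with gating might look roughly like the following. This is a hedged sketch, not the design from the cited papers: it assumes a 64-bit datapath split into four 16-bit slices (one per die), unsigned values whose unused upper slices are all zero, and a per-PC last-width predictor. The names `slices_needed`, `LastWidthPredictor`, and `simulate` are hypothetical. When the predictor guesses too narrow, the sketch simply enables the extra slices and counts the event; a real pipeline would need some recovery mechanism at that point.

```python
SLICE_BITS = 16  # assumed bits per die
NUM_SLICES = 4   # 4-die partitioning of a 64-bit datapath


def slices_needed(value):
    """Number of 16-bit slices holding non-zero bits of an unsigned value."""
    value &= (1 << (SLICE_BITS * NUM_SLICES)) - 1
    return max(1, (value.bit_length() + SLICE_BITS - 1) // SLICE_BITS)


class LastWidthPredictor:
    """Predicts that an instruction's result is as wide as it was last time."""

    def __init__(self):
        self.table = {}  # pc -> last observed slice count

    def predict(self, pc):
        # Full width is the safe default for an unseen instruction.
        return self.table.get(pc, NUM_SLICES)

    def update(self, pc, actual):
        self.table[pc] = actual


def simulate(stream):
    """Return (slice activations, underpredictions) for (pc, value) pairs."""
    predictor = LastWidthPredictor()
    activations = 0
    underpredictions = 0
    for pc, value in stream:
        guess = predictor.predict(pc)
        actual = slices_needed(value)
        if guess < actual:
            underpredictions += 1
        # Slices above the larger of guess and actual stay gated off.
        activations += max(guess, actual)
        predictor.update(pc, actual)
    return activations, underpredictions


stream = [(0x40, 5), (0x40, 7), (0x40, 0x1_0000), (0x44, 2 ** 40), (0x44, 1)]
activations, underpredictions = simulate(stream)
print(activations, "slice activations vs", NUM_SLICES * len(stream), "ungated")
```

On this made-up stream the sketch activates 14 slices with one underprediction, versus 20 activations if all four slices were always on.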


3D Research Summary

  • Circuit-level [ICCD’05,ISVLSI’06,ISCAS’06,GLSVLSI’06]

  • Uarch-level [MICRO’06 (w/ ),HPCA’07]

  • Tutorial papers [JETC’06]

  • Tutorial [MICRO’06]

  • Tools [DATE’06,TCAD’07] w/ GTCAD &

  • Parametric Variations w/ Jim Meindl

  • Funding, equip from ,


Summary

  • loh@cc

  • http://www.cc.gatech.edu/~loh

  • Lots of exciting work going on here