
Enhancing SMP Scalability Through CMR and HAS Innovations

This presentation by Terry Arnold II explores the challenges and solutions for scaling Symmetric Multiprocessing (SMP) technologies. It addresses skepticism regarding SMP scalability due to bandwidth limitations, presenting Coherent Memory Replication (CMR) and Hierarchical Affinity Scheduling (HAS) as innovative approaches. Both methods optimize memory access patterns to improve performance, particularly for Online Transaction Processing (OLTP) workloads. The session discusses implementation details, competitive comparisons, results achieved, and poses critical questions about software dependencies and compatibility with other Distributed Shared Memory (DSM) solutions.


Presentation Transcript


  1. WildFire: A Scalable Path for SMPs • Erik Hagersten and Michael Koster, Sun Microsystems Inc. • Presented by Terry Arnold II

  2. Introduction • What was the goal? • How did they achieve it? • CMR • HAS • Competitive Comparisons • Results • Questions

  3. The Goal • In the past, people have been skeptical that SMPs can continue to scale, given their bandwidth limitations • The trend has been to switch to cc-NUMA • The goal: to improve the scalability of SMP technologies

  4. cc-NUMA issues • Scales well but has less-than-optimal “access patterns” • Requires heavy software optimization to cope with capacity and conflict misses • Nontrivial scheduling, etc. (resource and memory management)

  5. How? • The answer is the same as the answer to all engineering problems, that is, throwing new acronyms at the problem • Coherent Memory Replication (CMR) • Hierarchical Affinity Scheduling (HAS) • Both of these exploit locality as a means of increasing performance (that is, for OLTP workloads)

  6. The Overview

  7. The Acronyms: CMR • S-COMA with fixed home locations for each address • Shadow physical pages • Coherence at the hardware level (64-byte granularity) • Pages start out as cc-NUMA and are changed to CMR based on hardware counters that monitor memory access patterns • Limitations – memory-resident pages and large physical pages can only be replicated explicitly
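
The counter-driven switch from cc-NUMA to CMR described on this slide can be sketched roughly as follows. This is only an illustration, not the WildFire implementation: the `Page` class, the `remote_misses` counter, the `pinned_or_large` flag, and the threshold value of 64 are all assumptions made for the example, since the slides do not say what the hardware counters measure exactly or when the switch is triggered.

```python
from dataclasses import dataclass

CC_NUMA = "cc-NUMA"
CMR = "CMR"


@dataclass
class Page:
    """Hypothetical per-node view of one page (names are illustrative)."""
    mode: str = CC_NUMA            # every page starts out as cc-NUMA
    remote_misses: int = 0         # stands in for the hardware counter
    pinned_or_large: bool = False  # memory-resident or large physical page


def record_remote_miss(page: Page, threshold: int = 64) -> str:
    """Count one remote miss and switch the page to CMR once the
    counter reaches the (assumed) threshold. Returns the page mode."""
    if page.mode == CMR:
        return page.mode  # simplification: stop counting once replicated
    page.remote_misses += 1
    # Pinned and large pages are never switched automatically.
    if page.remote_misses >= threshold and not page.pinned_or_large:
        page.mode = CMR
    return page.mode


def replicate_explicitly(page: Page) -> str:
    """Explicit replication, the only path for pinned or large pages."""
    page.mode = CMR
    return page.mode
```

A page that is touched remotely often enough ends up in CMR mode, while a pinned or large page stays cc-NUMA until `replicate_explicitly` is called, which mirrors the limitation listed on the slide.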

  8. The Acronyms: HAS • Schedules a process in the following order of preference: • The last processor it ran on • A processor on the same node • A processor on a remote node (only when the load imbalance exceeds a “threshold”)
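
The three-level preference on this slide can be sketched as a small placement function. This is a minimal sketch under stated assumptions, not the scheduler from the paper: the `Cpu` class, the use of run-queue length as the load metric, the rule that the last processor is reused only when it is idle, and the default `imbalance_threshold` of 2 are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cpu:
    cpu_id: int
    node: int
    load: int  # runnable threads queued here (assumed load metric)


def pick_cpu(cpus: List[Cpu], last_cpu_id: int,
             imbalance_threshold: int = 2) -> int:
    """Pick a CPU for a thread that last ran on `last_cpu_id`."""
    by_id = {c.cpu_id: c for c in cpus}
    last = by_id[last_cpu_id]

    # 1. Last processor it ran on (assumption: only if it is idle).
    if last.load == 0:
        return last.cpu_id

    # 2. Least-loaded processor on the same node.
    same_node = [c for c in cpus if c.node == last.node]
    local_best = min(same_node, key=lambda c: (c.load, c.cpu_id))

    # 3. Remote node, only when the imbalance exceeds the threshold.
    global_best = min(cpus, key=lambda c: (c.load, c.cpu_id))
    if local_best.load - global_best.load > imbalance_threshold:
        return global_best.cpu_id
    return local_best.cpu_id
```

With a small imbalance the thread stays on its home node even though a remote processor is idle; it migrates across nodes only once the gap between the best local and best overall load is larger than the threshold.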

  9. Implementation • 2 ASICs – NIAC (coherence), NIDC (bit-sliced interconnect) • These improve on the latency of a switch • NIAC – Interface and Global-Coherence Layer • Translators and Counters

  10. Competition • The SGI Origin and Sequent’s NUMA-Q

  11. Results 1

  12. Results 2

  13. Questions? • Is this “solution” too dependent on the software (kernel modifications)? • How compatible are CMR and HAS with the other DSM solutions?
