
RDMA@HP

This presentation discusses HP's philosophy in the RDMA fabric space, summarizes RDMA enablement across HP's product line, and emphasizes the importance of the software ecosystem in high-performance computing. It also covers the historical perspective, scalable architectures, volume economics, and HP's hardware and software offerings.


Presentation Transcript


  1. RDMA@HP Steve Langdon HP Fellow High Performance Computing Chief Technologist

  2. Intent of this presentation • Discuss HP’s philosophy in the RDMA fabric space • Summarize RDMA enablement and experience in HP’s product line • Keep the presentation short to leave time for Q&A

  3. Historical Perspective • HP has experience with RDMA enabled fabrics that dates back to the ’90s • Memory Channel, Quadrics, etc. • HP was a significant participant in the definition of both Infiniband and iWarp • People like Dwight Barron and Mike Krause • Our experience with file system use of “converged” or “unified” fabrics is also substantial • AlphaServer SC since the ’90s, Lustre since ’04

  4. HPC has been the proving ground • The move to clusters in HPC has been a driving force • Latency • Bandwidth • Scalability • Infiniband has been a major beneficiary of this market • iWarp (or even plain 10 GbE) has not really been a player as a cluster interconnect

  5. The high-end drives HPC innovation • Scalable architectures: NUMA & clusters • Scalable interconnects: IB • Scalable software development: MPI, UPC, SHMEM • Cluster/parallel file systems • Cluster and resource management • Volume economics drives the market • Rapid technical evolution of high-volume components and systems • Processors driven by PCs, GPUs by games • Volume drives ecosystem development • Applications, development tools, etc. • Volume drives cost reduction • [Diagram: “The Leveraged Model for HPC” – new technology capability flows down from enterprise and divisional systems to departmental and workgroup systems, while volume economics flows up from commercial servers and PCs]

  6. Guiding Principles • There is never one fabric that satisfies everyone • In general, products need to allow choice, although there is a place for fabric-specific optimization • HP supports open industry standards • Today, in fabric terms, that means Infiniband and iWarp-enhanced Ethernet • The software ecosystem is critical • Which is why HP is an OFA member

  7. Current HP Hardware offerings • Infiniband • Rack mounted Proliant and Integrity servers • Full range of single and dual port SDR and DDR HCAs • Blades • Single port DDR mezzanine today, dual port coming soon • 10 GbE RNIC • Proliant rack mount servers support a single port 10 Gb RNIC with plans for iWarp support • Additional alternatives planned for Blades, Integrity, etc. • A variety of IB and Ethernet switches

  8. HP 4X DDR IB Switch Module World’s first 4X DDR IB switch for blades • 960 Gb/sec total switching bandwidth • Fully non-blocking fabric switch • 16 4X DDR downlinks connecting blade servers • 8 external 4X DDR ports for scale-out • Multiple switches supported in a c7000 enclosure for redundancy and aggregate bandwidth
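  (For reference, the headline figure is consistent with the port count: a 4X DDR link carries 20 Gb/s per direction, so 24 ports — 16 downlinks plus 8 external — at 20 Gb/s in each of two directions gives 24 × 20 Gb/s × 2 = 960 Gb/s of aggregate switching bandwidth.)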

  9. Current HP Software offerings • Linux • Legacy “gen1” stacks from Voltaire and Cisco • Latest releases of the XC clusters and SFS (Lustre) file system offerings have just migrated to OFED 1.1 • HP-UX • Native IB support has been in place for a couple of years • RNIC support planned • Windows • IB support based on Voltaire-supplied drivers (IBAL and Winsock Direct) • HP-MPI • Supports a range of RDMA fabrics and SW stacks including OFED 1.1
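  Since the deck repeatedly points at the common OFED stack as the key to the software ecosystem, a minimal sketch may help make that concrete. The example below is not part of the original presentation; it assumes a Linux host with OFED's libibverbs installed and at least one RDMA-capable adapter present. It opens the first device found and registers a buffer for remote read/write — the same verbs calls the stack is intended to expose whether the adapter is an Infiniband HCA or an iWarp RNIC.

    /* rdma_reg_demo.c - minimal OFED verbs sketch (illustrative only).
     * Build (assumes libibverbs from OFED): gcc rdma_reg_demo.c -libverbs -o rdma_reg_demo
     */
    #include <infiniband/verbs.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        int num;
        struct ibv_device **devs = ibv_get_device_list(&num);  /* enumerate HCAs/RNICs */
        if (!devs || num == 0) {
            fprintf(stderr, "no RDMA devices found\n");
            return 1;
        }

        struct ibv_context *ctx = ibv_open_device(devs[0]);    /* same call for IB or iWarp */
        struct ibv_pd *pd = ibv_alloc_pd(ctx);                 /* protection domain */

        size_t len = 1 << 20;                                  /* 1 MiB buffer */
        void *buf = malloc(len);
        memset(buf, 0, len);

        /* Pin and register the buffer so the adapter can DMA to/from it directly. */
        struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                       IBV_ACCESS_LOCAL_WRITE |
                                       IBV_ACCESS_REMOTE_READ |
                                       IBV_ACCESS_REMOTE_WRITE);

        printf("registered %zu bytes on %s, lkey=0x%x rkey=0x%x\n",
               len, ibv_get_device_name(devs[0]), mr->lkey, mr->rkey);

        /* A real application (or a library such as HP-MPI sitting on top of the
         * stack) would now create completion queues and queue pairs and exchange
         * rkeys with a peer before posting RDMA reads/writes. */
        ibv_dereg_mr(mr);
        ibv_dealloc_pd(pd);
        ibv_close_device(ctx);
        ibv_free_device_list(devs);
        free(buf);
        return 0;
    }

  The point of the sketch is portability: an application or middleware layer written against these verbs does not need to know which vendor's adapter or which fabric sits underneath, which is exactly the ecosystem argument made on slides 6 and 10.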

  10. “Converged” or “Unified” Fabric support • HP has the infrastructure in place and can work with advanced customers now • Cluster file system usage with the (Lustre based) HP SFS product is well established • The barrier to broader adoption is software availability and maturation • HP is opposed to developing this market based on switch vendor proprietary software stacks • OFED 1.2 should help, but more work (and experience) will be needed

  11. Closing thoughts – a cliché-based approach • High-speed fabrics with RDMA support are “crossing the chasm” from HPC to commercial scale-out computing (with apologies to Geoffrey Moore) • We need to remember that “It’s the software, stupid” (with apologies to James Carville) • The Ethernet-based alternatives are “not your father’s Ethernet” (with apologies to a defunct GM division) • Ironically, the Infiniband-based ecosystem has had more RDMA maturation time
