SNIC/KTH Proposal

Presentation Transcript


  1. SNIC/KTH Proposal
     • Objective
       • Improved energy efficiency over common IA32-based nodes by a factor of at least 5
       • High compute density, possibly 10 times that of an IA32-based system in single precision (SP)
       • Modest increase in programming complexity
       • High-volume component technologies
     • Technology
       • Embedded processor technology
       • ARM Cortex-A9 4-core CPU (0.8 – 2 GHz, 0.4 – 2 W)
       • TI DSP (designed in Nice)
       • Hybrid programming: OpenMP + MPI
     • Industry Partners (tentative)
       • TI (4th largest IC company by revenue in 2009, after Intel, Samsung and Toshiba)
       • Supermicro
       • Smooth Stone (start-up with funding from ARM), targeting energy-efficient servers for Internet and web applications; CPU chip by TI

  2. High Performance Compute Node
     • HPC Compute Node Performance
       • 1024 GMAC (1 TMAC) (MAC = multiply-accumulate, 32-bit)
       • 512 single-precision floating-point operations per cycle: 512 GF SP @ 1 GHz (= 614.4 GF SP @ 1.2 GHz)
       • Support for double-precision floating point announced
       • Approximately 50 to 60 W
       • DDR3 (number of DIMMs not yet fixed)
     • Interconnect
       • DSP to DSP: SRIO
       • CPU to DSP: PCIe x2 Gen2 (5 GT/s per lane)
       • Node to node: 10G Ethernet
     • Board
       • 4 – 8 nodes
       • 2.5 – 5 TF SP per board/blade
       • 3.5 – 7 TF/U SP
       • ~400 – 800 W/U
     • Programming Model
       • MPI across compute nodes
       • OpenMP within a node
       • DSPs and ARM processor both programmed in a high-level language
       • OpenMP-style directives define accelerated regions that are executed on the DSPs

     [Figure: HPC Compute Node block diagram. Four Texas Instruments 8-core DSPs @ 1.2 GHz with DDR3-1333 acceleration memory; a CPU (ARM/x86) with external memory; PCIe/SRIO/Eth connectivity; 10G Ethernet.]
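The peak figures on slide 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated on the slide (512 single-precision operations per cycle, a 1.2 GHz DSP clock, roughly 50 to 60 W per node, 4 to 8 nodes per board). The function names are illustrative and not part of the proposal, and the GF/W values are derived here rather than quoted from the slides.

```python
# Cross-check of the slide-2 peak figures. All inputs come from the slide;
# the function names are hypothetical helpers introduced for illustration.

FLOPS_PER_CYCLE = 512        # single-precision operations per cycle, per node
CLOCK_GHZ = 1.2              # DSP clock frequency
NODE_POWER_W = (50.0, 60.0)  # approximate node power range
NODES_PER_BOARD = (4, 8)     # node count range per board/blade


def node_gflops(flops_per_cycle=FLOPS_PER_CYCLE, clock_ghz=CLOCK_GHZ):
    """Peak single-precision GF for one compute node."""
    return flops_per_cycle * clock_ghz


def board_tflops(nodes):
    """Peak single-precision TF for a board with the given node count."""
    return nodes * node_gflops() / 1000.0


def gflops_per_watt(power_w):
    """Peak single-precision GF per watt for one node at the given power."""
    return node_gflops() / power_w


if __name__ == "__main__":
    print(f"Node peak: {node_gflops():.1f} GF SP")
    for n in NODES_PER_BOARD:
        print(f"Board with {n} nodes: {board_tflops(n):.2f} TF SP")
    for p in NODE_POWER_W:
        print(f"Node efficiency at {p:.0f} W: {gflops_per_watt(p):.2f} GF/W")
```

Under these inputs a node peaks at 614.4 GF SP, and a board of 4 to 8 nodes lands at about 2.46 to 4.92 TF SP, which agrees with the rounded "2.5 – 5 TF SP per board/blade" on the slide. The derived node efficiency is roughly 10 to 12 GF/W at peak.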
