
Jacquard: Architecture and Application Performance Overview NERSC Users’ Group October 2005



Presentation Transcript


  1. Jacquard: Architecture and Application Performance Overview NERSC Users’ Group October 2005

  2. Outline
     An engineering-level overview of the hardware and software that make up Jacquard:
     • CPUs
     • Memory
     • OS
     • Interconnect
     Seaborg will be used as a point of reference.

  3. seaborg.nersc.gov (review?)
     [Diagram: Seaborg, 380 x 16-way SMP NHII nodes; labels: main memory, GPFS, MPI, Colony Switch (CSS0, CSS1), crossbar, HPSS]
     • 6080 dedicated CPUs, 96 shared login CPUs
     • Hierarchy of caching, speeds
     • Bottleneck determined by first depleted resource

  4. jacquard.nersc.gov basics
     [Diagram: Jacquard, 320 x 2-way Opteron nodes; labels: main memory, GPFS, MPI, InfiniBand switch, HT, IB, HPSS]
     • 640 dedicated CPUs, 8 shared login CPUs
     • Smaller caches, HT, really fast
     • SMP? NUMA? SUMO.

  5. Opteron Block Diagram: Not strictly SMP
     [Diagram labels: SDRAM, SDRAM, switch, I/O]
     • 1 TLB per CPU
     • 1K entries x 4 KB pages = 4 MB coverage
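
The TLB coverage figure on this slide is just entries times page size. A minimal sketch of that arithmetic follows, assuming "1K entries" means 1024 and "4K pages" means 4096-byte pages; the function names and the example array size are hypothetical and only illustrate the calculation.

```python
def tlb_coverage_bytes(entries: int, page_bytes: int) -> int:
    """Memory addressable through the TLB without a miss: entries * page size."""
    return entries * page_bytes


def fits_in_tlb(n_doubles: int, coverage_bytes: int) -> bool:
    """True if an array of 8-byte doubles fits within the TLB coverage."""
    return n_doubles * 8 <= coverage_bytes


ENTRIES = 1024          # assumption: "1K entries" read as 1024
PAGE_BYTES = 4 * 1024   # assumption: "4K pages" read as 4096 bytes

coverage = tlb_coverage_bytes(ENTRIES, PAGE_BYTES)
print(coverage // (1024 * 1024), "MB of TLB coverage")

# Hypothetical working set: one million doubles is about 8 MB.
print(fits_in_tlb(1_000_000, coverage))
```

A working set larger than the coverage can still run, but accesses spread across more pages than the TLB holds will incur TLB misses.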

  6. HyperTransport: Good Stuff
     Little conflict between data movement and computation

  7. SMP size and memory contention
     Why is Jacquard 2-way SMP?
     Jacquard’s numbers:
     • 1 task: 100%
     • 2 tasks: 98%

  8. Flops @ 2.2 GHz
     • Peak Theoretical Flops
       • Double (64-bit) floats: 1 add + 1 mult = 2.2 GFlop/s
       • Single (32-bit) floats: 2 add + 2 mult = 4.4 GFlop/s
     • Peak Realized Flops
       • Double (64-bit) floats: 1.9 GFlop/s
       • Single (32-bit) floats: 3.4 GFlop/s
     • Your Flops?
       • Walltime is more important than flops
       • For a known algorithm, flops are a sanity check
     Memory BW: 4 GB/sec per CPU
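
One way to use flops as a sanity check for a known algorithm is to divide its operation count by the measured walltime and compare the result with the realized peak quoted on this slide. The sketch below assumes a dense matrix multiply with the standard 2n^3 operation count; the function names, the matrix size, and the 2.0 second timing are hypothetical placeholders, not measurements from Jacquard.

```python
def matmul_flops(n: int) -> int:
    """Floating-point operations in a dense n x n matrix multiply (2 * n^3)."""
    return 2 * n ** 3


def achieved_gflops(flop_count: int, walltime_s: float) -> float:
    """Achieved rate in GFlop/s for a known flop count and measured walltime."""
    if walltime_s <= 0:
        raise ValueError("walltime must be positive")
    return flop_count / walltime_s / 1e9


# Realized double-precision peak taken from the slide.
REALIZED_PEAK_DOUBLE_GFLOPS = 1.9

# Hypothetical run: n = 1000 finishing in 2.0 seconds of walltime.
rate = achieved_gflops(matmul_flops(1000), 2.0)
fraction = rate / REALIZED_PEAK_DOUBLE_GFLOPS

print(f"{rate:.2f} GFlop/s, {fraction:.0%} of realized peak")
```

A rate far below the realized peak suggests looking at memory traffic or compiler options first, while a rate above it usually means the flop count or the timing is wrong.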

  9. MPI Bandwidth: seaborg

  10. MPI Bandwidth: Jacquard

  11. Linux for AIX Users
      Linux and AIX are more similar than different
      • Linux is not as good as AIX at keeping processes scheduled on the same CPU, so processor affinity takes some work.
      • Linux has easy interfaces to architectural and process performance information: /proc/cpuinfo, /proc/self, etc.
      • AIX MPI is in /usr/{bin,lib}; Linux MPI is in modules
      • Linux doesn’t need -bmaxdata!
      • Little vs. big endian
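
Two of the points above lend themselves to a short illustration: reading architectural information from /proc/cpuinfo-style text, and the byte-order difference that affects binary files moved between a big-endian machine and a little-endian Opteron. The following is a minimal sketch, not a NERSC tool; `count_processors` and the sample text are hypothetical, and only the standard-library `struct` and `sys` modules are used.

```python
import struct
import sys


def count_processors(cpuinfo_text: str) -> int:
    """Count logical CPUs by counting 'processor' entries in cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


# Hypothetical excerpt in the layout x86 Linux uses for /proc/cpuinfo.
SAMPLE_CPUINFO = (
    "processor\t: 0\nmodel name\t: (hypothetical)\n\n"
    "processor\t: 1\nmodel name\t: (hypothetical)\n"
)
print(count_processors(SAMPLE_CPUINFO), "logical CPUs in the sample")

# The same 32-bit integer written in each byte order.
value = 0x01020304
big = struct.pack(">I", value)      # big-endian layout
little = struct.pack("<I", value)   # little-endian layout

# Reading big-endian bytes with a little-endian format gives a different number.
misread = struct.unpack("<I", big)[0]
print(big.hex(), little.hex(), misread == value)

# Byte order of the machine running this script.
print(sys.byteorder)
```

On a Linux node the same function can be applied to `open("/proc/cpuinfo").read()`. For unformatted binary data written on a big-endian system, naming the byte order explicitly when reading (as with `">I"` here) avoids the silent misread shown above.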

  12. Conclusions
      • The underlying HW technologies (HT, IB, etc.) are quite promising. Opteron systems are delivering great price/performance.
      • Still working on some SDRAM, OS, and SW issues.
      • What’s useful to you? Let us know.
