
Results of the Fermilab 64-bit Linux Hardware and Software Evaluation

Spring 2005 HEPiX meeting, Karlsruhe, Germany. Ken Schumacher, Steven Timm.


Presentation Transcript


  1. Results of the Fermilab 64-bit Linux Hardware and Software Evaluation. Spring 2005 HEPiX meeting, Karlsruhe, Germany. Ken Schumacher, Steven Timm. Fermilab 64-bit Linux Evaluation

  2. Goals of the Evaluation
     • Gain experience with the x86_64 architecture of the Linux kernel and see if it is a stable OS platform.
     • Evaluate the AMD Opteron CPU and the associated hardware platforms to see if they are reliable.
     • Obtain relative performance numbers between Intel Xeon EM64T “Nocona” and AMD Opteron processors.
     • Obtain relative performance numbers on applications compiled in 32-bit and 64-bit mode.

  3. 64-bit hardware
     • Intel IA64 as implemented in Itanium 2
       • Not considered in this evaluation
       • Not binary-compatible with the IA32 instruction set
       • Expensive
     • Intel: EM64T Xeon “Nocona”
       • Fermilab already has >240 of these in production
     • AMD: AMD64 Opteron
     • Note: SPEC CINT2000 scores are about the same: Opteron 250 = 1452 and Xeon 3.6 GHz = 1429.
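To put "about the same" into a number, the two SPEC CINT2000 scores quoted on this slide can be compared directly. The following is only an illustrative sketch; the function name `relative_difference` is made up for this example.

```python
def relative_difference(a: float, b: float) -> float:
    """Return how much larger a is than b, as a fraction of b."""
    return (a - b) / b


# SPEC CINT2000 scores as quoted on the slide.
opteron_250 = 1452
xeon_3_6ghz = 1429

diff = relative_difference(opteron_250, xeon_3_6ghz)
print(f"Opteron 250 leads by {diff:.1%}")  # prints "Opteron 250 leads by 1.6%"
```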

  4. Extending the 32-bit instruction set
     • The Intel and AMD schemes are very similar
     • 48-bit virtual address space
     • 64-bit General Purpose Registers
     • Support for 64-bit addressing and integer math
     • Eight extra GPRs added
     • Eight extra XMM registers added
     • Difference: EM64T supports SSE3 instructions; Opteron has 3DNow!
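The sizes behind these bullets are easy to work out. The sketch below compares a 32-bit address space with the 48-bit virtual address space named on the slide, and adds the eight extra registers to the eight GPRs and eight XMM registers of 32-bit x86. The baseline of eight is general x86 background rather than something stated on the slide.

```python
GIB = 2 ** 30
TIB = 2 ** 40

# Bytes addressable with 32-bit and 48-bit virtual addresses.
virtual_32 = 2 ** 32
virtual_48 = 2 ** 48
print(virtual_32 // GIB, "GiB")  # 4 GiB
print(virtual_48 // TIB, "TiB")  # 256 TiB

# Register counts: 32-bit x86 baseline (background fact, not from the slide)
# plus the eight extra registers of each kind listed on the slide.
BASE_GPRS = 8
BASE_XMM = 8
EXTRA = 8
gprs_64 = BASE_GPRS + EXTRA
xmm_64 = BASE_XMM + EXTRA
print(gprs_64, "GPRs,", xmm_64, "XMM registers")  # 16 GPRs, 16 XMM registers
```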

  5. Vendor Selection
     • Only used vendors that Fermilab has previous experience with.
     • Requested 12 evaluation units, got 9.
     • Opteron units from Koi, ASA, Penguin, CSI, Rackable, IBM, HP, Sun.
     • Purposely requested a variety of CPU speeds.
     • Motherboard manufacturers represented include Tyan, Accelertech, Sun (by Newisys), IBM (by MSI), HP.
     • Dell PowerEdge SC1425 Xeon unit (3.6 GHz) from Dell, as a reference. (Dell doesn’t offer Opteron.)

  6. Machine configurations

  7. Hardware features
     • Dual Opteron boards are designed with NUMA
       • Each CPU has its own memory bank
       • No contention between CPUs on a front side bus
     • Some remote management is available on all of them; we did not test it.
     • Several have SATA drives; they work fine.
     • Broadcom tg3 is the network interface on all.
     • Rackable has the low-voltage Opteron 246HE chip: only 55 W, but the same compute power as a regular Opteron 246.

  8. Evaluation units

  9. OS Installation
     • Successfully installed all systems with 64-bit NPACI-Rocks, Scientific Linux Fermi i386, and Scientific Linux Fermi x86_64.
     • Tested operation of the XFS file system: OK.
     • The default kernel in SL 3.0.3 is 2.4.21-20; we ran with that most of the time.
     • A 2.6.9 kernel is needed to take full advantage of the NUMA architecture of the Opterons; that works too.

  10. Linux kernels and distros
     • One architecture, x86_64; kernels come compiled for ia32e (Xeon) and amd64 (Opteron).
     • This is similar to the i386 architecture, with its separate i686 and athlon kernels.
     • All other RPMs are the same for either.
     • We were able to run almost all of our 32-bit applications under the 64-bit kernel/distro in compatibility mode with little trouble.
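When 32-bit and 64-bit binaries share one machine, it helps to check which mode a given process is actually running in. The sketch below is not part of the evaluation. It uses only the Python standard library to report the architecture the kernel announces and the pointer width of the running interpreter. A 32-bit interpreter on an x86_64 kernel would typically report `x86_64` with 32-bit pointers.

```python
import platform
import struct


def pointer_bits() -> int:
    """Width of a native pointer in this Python process, in bits."""
    return struct.calcsize("P") * 8


def describe() -> str:
    """Summarize the kernel-reported architecture and the process word size."""
    return f"kernel reports {platform.machine()}, process is {pointer_bits()}-bit"


print(describe())
```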

  11. Reliability Testing
     • Full Fermilab acceptance test for 30 days
       • Continual disk activity on both disks
       • Both CPUs continuously busy
     • 20 days in 64-bit mode, 10 in 32-bit mode
     • Excluding one node with two catastrophic disk failures (which was disqualified), the other seven Opterons had 97.6% uptime.
     • Downtime was due to kernel hangs in 64-bit mode that we haven’t been able to reproduce since.
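To get a feel for what 97.6% uptime means in absolute terms, it can be converted to node-hours. This sketch assumes the figure is aggregated over all seven remaining nodes for the full 30-day test. The slide does not say how it was computed, so the result is a rough estimate only.

```python
def downtime_node_hours(nodes: int, days: int, uptime_fraction: float) -> float:
    """Node-hours of downtime implied by an aggregate uptime fraction."""
    total_node_hours = nodes * days * 24
    return total_node_hours * (1.0 - uptime_fraction)


# Assumption: 97.6% is measured across 7 nodes x 30 days.
total = 7 * 30 * 24
lost = downtime_node_hours(nodes=7, days=30, uptime_fraction=0.976)
print(f"about {lost:.0f} of {total} node-hours lost")
```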

  12. Benchmarks
     • All major Fermilab computing users contributed benchmarks and people to run them.
     • CDF: reconstruction
     • D0: reconstruction
     • CMS: OSCAR and ORCA simulation and digitization, Root stress test, Pythia
     • SDSS: Supernova search program
     • LQCD: QCDStreams, MILC lattice code
     • General: SETI@home, CERN unit benchmark, tiny
     • Many more details in our paper

  13. CMS Root Benchmark
     • 64-bit mode gives gains on Opterons of about 40%.

  14. Fermi Cycles
     • Reconstruction farms use Fermi Cycles (to account for differences in clock speed between Intel and AMD hardware).
     • A Pentium III 1 GHz is defined to have 1000 Fermi Cycles.
     • All other platforms take the average of the performance of CDF reconstruction and D0 reconstruction, normalized to PIII 1 GHz performance.
     • The D0 and CDF executables are 32-bit, optimized only for the Pentium architecture, and were not recompiled.
     • We find that the D0 legacy executable runs 2.93x faster on the Opteron 250 and 2.38x faster on the Xeon 3.6 (than on the PIII 1 GHz).
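The Fermi Cycles definition above can be written as a small function. This is a sketch of the definition as worded on the slide, not the farms' actual accounting code, and the function and variable names are illustrative. The transcript gives only the D0 ratios, so the CDF ratios are left as inputs and no full rating is computed for either CPU.

```python
PIII_1GHZ_FERMI_CYCLES = 1000


def fermi_cycles(cdf_ratio: float, d0_ratio: float) -> float:
    """Average the CDF and D0 reconstruction speed ratios (each relative
    to a PIII 1 GHz) and scale so that the PIII 1 GHz rates 1000."""
    return PIII_1GHZ_FERMI_CYCLES * (cdf_ratio + d0_ratio) / 2


# Sanity check: the reference machine itself rates 1000 by definition.
print(fermi_cycles(1.0, 1.0))  # 1000.0

# D0 ratios quoted on the slide (CDF ratios are not in the transcript).
d0_opteron_250 = 2.93
d0_xeon_3_6 = 2.38
d0_only_advantage = d0_opteron_250 / d0_xeon_3_6
print(f"On D0 alone, Opteron 250 is {d0_only_advantage:.2f}x the Xeon 3.6")
```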

  15. Compilers
     • Use “tiny” (3000-line mock reconstruction program in Fortran, runs all in cache)
     • Opteron 250
       • Legacy executable, i386: 1290 VUPS
       • Gcc 3.4.2 optimized: 2440 VUPS
       • Pathscale compiler: 2677 VUPS
     • Xeon 3.6
       • Legacy executable, i386: 1386 VUPS
       • Gcc 3.4.2 optimized: 2309 VUPS
       • Intel 8.1 compiler: 2910 VUPS
       • Intel 8.1 compiler with profile feedback: 4332 VUPS
     • Intel Fortran (and C) 8.1 uses SSE3 instructions to optimize, which makes it incompatible with Opterons.
     • For comparison, Pentium III 1.0 GHz = 568 VUPS.
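The VUPS figures are easier to compare as speedups over each CPU's legacy i386 executable. The sketch below does only that arithmetic on the numbers quoted on the slide; the dictionary layout and function name are illustrative choices.

```python
# VUPS results for the "tiny" benchmark, as quoted on the slide.
VUPS = {
    "Opteron 250": {
        "legacy i386": 1290,
        "gcc 3.4.2 optimized": 2440,
        "Pathscale": 2677,
    },
    "Xeon 3.6": {
        "legacy i386": 1386,
        "gcc 3.4.2 optimized": 2309,
        "Intel 8.1": 2910,
        "Intel 8.1 + profile feedback": 4332,
    },
}
PIII_1GHZ_VUPS = 568


def speedups(results: dict, baseline: str = "legacy i386") -> dict:
    """Return each build's VUPS divided by the baseline build's VUPS."""
    base = results[baseline]
    return {name: vups / base for name, vups in results.items()}


for cpu, results in VUPS.items():
    print(cpu)
    for name, ratio in speedups(results).items():
        vs_piii = results[name] / PIII_1GHZ_VUPS
        print(f"  {name}: {ratio:.2f}x legacy, {vs_piii:.2f}x PIII 1 GHz")
```

On these numbers the best compiler for each CPU roughly doubles the legacy result on the Opteron and triples it on the Xeon when profile feedback is used.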

  16. Run II benchmarks

  17. Power Draw
     • In general, Opterons draw 10-27% less current at full load than comparable Xeon chips.
     • The four Opteron 248 systems vary in current draw, which is explained by larger numbers of fans and higher-performance disk drives.
     • The low-voltage Opteron 246HE saves 10-15% over the regular (higher-voltage) Opteron 246.
     • We need to average 10 kVA per rack in our facility. We have many racks now that draw 12 kVA.
     • 10 kVA/rack = 2.1 A/node
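The slide gives the per-node budget of 2.1 A without the inputs behind it. One combination that reproduces the figure is a 120 V supply with 40 nodes per rack; both values are assumptions made here for illustration and are not stated in the presentation.

```python
def amps_per_node(rack_va: float, volts: float, nodes_per_rack: int) -> float:
    """Per-node current budget for a rack with a given apparent-power limit."""
    return rack_va / volts / nodes_per_rack


# Assumed, not from the slide: 120 V supply, 40 nodes per rack.
ASSUMED_VOLTS = 120
ASSUMED_NODES = 40

budget_10kva = amps_per_node(10_000, ASSUMED_VOLTS, ASSUMED_NODES)
budget_12kva = amps_per_node(12_000, ASSUMED_VOLTS, ASSUMED_NODES)
print(f"10 kVA rack: {budget_10kva:.2f} A per node")  # 2.08 A, about 2.1 A
print(f"12 kVA rack: {budget_12kva:.2f} A per node")  # 2.50 A
```

Under the same assumptions, a 12 kVA rack corresponds to 2.5 A per node, which shows how far the existing racks sit above the target.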

  18. Conclusions
     • 64-bit Linux OS is a stable operating platform.
     • Opteron CPU and associated platforms have sufficient reliability for Fermilab production farms.
     • Opteron CPU gives us slightly better performance for significantly less power draw and about the same price as Xeon.
     • Using 64-bit compilation and optimization can lead to significant performance gains on AMD and Intel.

  19. References
     • Fermilab Evaluation Results: http://www-oss.fnal.gov/scs/public/qualify2005/opteron_external.ps
     • AMD Developer Symposium 2002: “Optimizing for the AMD Opteron™ Processor” by Tim Wilkens, Ph.D. http://www.amd.com/us-en/assets/content_type/DownloadableAssets/Optimization_-_Tim_Wilkens.pdf

  20. Power Supply Efficiency
     • General information on PS efficiency: http://www.efficientpowersupplies.org/
     • “Energy Efficiency of Computer Power Supplies” from the CEPE website: http://www.cepe.ethz.ch/download/staff/bernard/28_formated.pdf
     • http://www.xbitlabs.com/articles/other/display/psu-methodology.html
