High Performance Compute Cluster - PowerPoint PPT Presentation
Presentation Transcript

  1. High Performance Compute Cluster
     Abdullah Al Owahid, Graduate Student, ECE, Auburn University (ELEC 5200-001/6200-001)

  2. Topic Coverage
     • Cluster computer
     • Cluster categories
     • Auburn's vSMP HPCC
     • Software installed
     • Accessing HPCC
     • How to run simulations in HPCC
     • Demo
     • Performance
     • Points of Contact

  3. Cluster Computer
     • Multi-processor, distributed network
     • A computer cluster is a group of linked computers
     • The machines work together closely, so in many respects they form a single computer
     • They are connected to each other through fast local area networks

  4. Cluster Computer: Categories
     • High-availability (HA) clusters
       -- operate by having redundant nodes
     • Load-balancing clusters
       -- multiple computers are linked together to share the computational workload
     • Compute clusters (HPCC)
       -- built for computational purposes
       -- the cluster shares a dedicated network
       -- a compute job uses one or a few nodes and needs little or no inter-node communication (grid computing)
       -- uses MPI or PVM (Parallel Virtual Machine); see the sketch below
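A minimal shell sketch of that distinction. The node names (compute-1 … compute-4), the per-case script, and the solver name are placeholders for illustration, not part of the presentation:

        # Grid-style work: independent tasks farmed out to nodes, no inter-node communication.
        for i in 1 2 3 4; do
            ssh compute-$i "cd ~/sim && ./run_case.sh case_$i" &
        done
        wait
        # A tightly coupled job instead goes through a message-passing launcher, e.g.:
        #   mpiexec -n 32 ./coupled_solver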

  5. HPCC

  6. Auburn's vSMP HPCC: Samuel Ginn College of Engineering Computational Cluster
     • Dell M1000E blade chassis server platform
     • 4 M1000E blade chassis "fat nodes"
     • 16 M610 half-height Intel dual-socket blades per chassis
     • 2 CPUs per blade, quad-core Nehalem 2.80 GHz processors
     • 24 GB RAM and two 160 GB SATA drives per blade
     • Single operating system image (CentOS)

  7. Auburn's vSMP HPCC (contd.)
     • Each M610 blade server is connected internally to the chassis via a Mellanox Quad Data Rate (QDR) InfiniBand switch at 40 Gb/s for creation of the ScaleMP vSMP
     • Each M1000E fat node is interconnected via 10 GbE using M6220 blade switch stacking modules for parallel clustering with OpenMPI/MPICH2
     • Each M1000E fat node also has independent 10 GbE connectivity to the Brocade TurboIron 24X core LAN switch
     • Each node provides 128 cores @ 2.80 GHz (Nehalem)
     • Total of 512 cores @ 2.80 GHz, 1.536 TB of shared RAM, and 20.48 TB of raw internal storage

  8. vSMP (ScaleMP)
     • ScaleMP is a leader in virtualization for high-end computing
     • The Versatile SMP (vSMP) architecture aggregates multiple x86 systems into a single virtual x86 system, delivering an industry-standard, high-end symmetric multiprocessing (SMP) computer
     • vSMP Foundation aggregates up to 16 x86 systems to create a single system with 4 to 32 processors (128 cores) and up to 4 TB of shared memory

  9. vSMP HPCC Configuration Diagram

  10. Network Architecture

  11. Software Installed
     • MATLAB (/export/apps/MATLAB) -- parallel distributed computing toolbox with 128 workers
     • Fluent (/export/apps/Fluent.Inc) -- 512-way parallel license
     • LS-DYNA (/export/apps/ls-dyna) -- 128-way parallel license
     • STAR-CCM+ (/export/apps/starccm) -- 128-way parallel license
     • MPICH2 (Argonne National Laboratory) -- /opt/mpich2-1.2.1p1, also available as /opt/mpich2 (see the launch sketch below)
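A brief sketch of building and launching an MPI job with the installed MPICH2. The install paths come from this slide; the host file, node count, and program name are placeholders. MPICH2 1.2.x uses the MPD process manager, which is consistent with the .mpd.conf file mentioned on slide 13:

        export PATH=/opt/mpich2/bin:$PATH
        mpicc -o my_solver my_solver.c      # build against MPICH2
        mpdboot -n 4 -f ~/mpd.hosts         # start an MPD ring on 4 nodes listed in ~/mpd.hosts
        mpiexec -n 32 ./my_solver           # run 32 MPI processes across the ring
        mpdallexit                          # shut the ring down when finished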

  12. Accessing HPCC
     http://www.eng.auburn.edu/ens/hpcc/Access_information.html
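Access is typically over SSH with your Auburn user ID; the hostname below is only a placeholder, so consult the access page above for the actual address and login details:

        ssh au_user_id@<cluster-head-node>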

  13. How to Run Simulations in HPCC
     • Save the .rhosts file in your home directory
     • Save the .mpd.conf file in your home directory
     • Your H:\ drive is already mapped
     • Add RSA keys by running ssh compute-i and then exit, for i = 1, 2, 3, 4
     • mkdir folder_name
     • In your script file add the line
       #PBS -d /home/au_user_id/folder_name   (the path as reported by "pwd"; a sample script is sketched below)
     • Make the script executable: chmod 744 script_file.sh
     • Submit the script: qsub ./script_file.sh
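A minimal submission-script sketch following these steps. The -d directive and the MPICH2 path come from the slides; the job name, resource request, and solver name are placeholders:

        #!/bin/sh
        #PBS -N my_simulation                      # job name (placeholder)
        #PBS -d /home/au_user_id/folder_name       # working directory, as reported by pwd
        #PBS -l nodes=1:ppn=8                      # requested resources (placeholder)
        /opt/mpich2/bin/mpiexec -n 8 ./my_solver   # launch the solver (placeholder name)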

  14. Basic Commands
     • showq
     • runjob job_id
     • canceljob job_id
     • pbsnodes -a
     • pbsnodes compute-1
     • ssh compute-1
     • ps -ef | grep any_process_you_want_to_see
     • pkill process_name
     • kill -9 aberrant_process_id
     • exit
     (A typical sequence using these commands is sketched below.)
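A short example of how the commands fit together when watching a job; the job ID and process name are placeholders:

        qsub ./script_file.sh     # submit; the scheduler prints a job ID, e.g. 1234
        showq                     # check the queue and see where the job sits
        pbsnodes -a               # inspect the state of all compute nodes
        ssh compute-1             # log in to a node the job is running on
        ps -ef | grep my_solver   # find the solver processes
        exit                      # leave the compute node
        canceljob 1234            # cancel the job if something looks wrong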

  15. Demo
     Live demo (25 minutes)
     • Accessing the cluster
     • Setting up the paths and home space
     • Making changes in the script based on requirements
     • Submitting multiple jobs
     • Obtaining the data
     • Viewing the load
     • Tracing the processes

  16. Performance

  17. Performance (contd.)

  18. Performance (contd.)

  19. Points of Contact
     • James Clark, Information Technology Master Specialist -- Email: jclark@auburn.edu
     • Shannon Price, Information Technology Master Specialist -- Email: pricesw@auburn.edu
     • Abdullah Al Owahid -- Email: azo0012@auburn.edu

  20. Thank You
     Question & Answer

  21. References
     • http://en.wikipedia.org/wiki/Computer_cluster
     • http://www.eng.auburn.edu/ens/hpcc/index.html