
Overview of Cluster Hardware and Software


Presentation Transcript


  1. Overview of Cluster Hardware and Software Class 2

  2. TDG Cluster

  3. Hardware Configuration • 1 master node + 64 compute nodes + Gigabit Interconnection • Master Node • Dell PE2650, P4-Xeon 2.8GHz x 2 • 4GB RAM, 36GB x 2 U160 SCSI (mirror) • Gigabit Ethernet ports x 2 • SCSI Storage • Dell PV220S • 73GB x 10 (RAID5)

  4. Hardware Configuration • Compute Nodes • Dell PE2650, P4-Xeon 2.8GHz x 2 • 2GB RAM, 36GB U160 SCSI HD • Gigabit Ethernet ports x 2 • Gigabit Ethernet Switch • Extreme BlackDiamond 6816 • 256Gbps backplane • 72 Gigabit ports (8-port cards x 9)

  5. Software List • Operating System • ROCKS 3.1.0 from http://www.rocksclusters.org • MPI and PVM libraries • LAM/MPI 7.0.4 • MPICH 1.2.5.2 • mpiJava 1.2 • PVM 3.4.3-6beolin • Compilers • GCC 3.2.3 • PGI C/C++/f77/f90/hpf version 5.1 • J2SDK 1.4.2

  6. Software List • Applications • Amber 7 • Gamess • Gaussian 03 • Gromacs 3.2 • MATLAB 6.5 with MPITB • NAMD 2.5 • Q-Chem 2.1 • R 1.8.1 • VMD 1.8.2 • WebMO 4.1

  7. Software List • MATH and Other Libraries • ATLAS 3.6.0 • ScaLAPACK 1.7 • SPRNG 2.0a • HPJava 1.0 • Editors • Vi • Pico • Emacs • joe • Queuing system • Torque/PBS • Maui scheduler

  8. Cluster O.S. – ROCKS 3.1.0 • Developed by NPACI and SDSC • Based on Red Hat Enterprise Linux 3.0 • Allows setup of all 64 nodes in 1 hour • Useful commands for users to monitor jobs on all nodes, e.g. • cluster-fork date • cluster-ps morris • cluster-kill morris • Web-based management and monitoring • http://tdgrocks.sci.hkbu.edu.hk

  9. Ganglia

  10. Hostnames • Master node • External : tdgrocks.sci.hkbu.edu.hk • Internal : frontend-0 • Compute nodes • comp-pvfs-0-1, …, comp-pvfs-0-64 • Alias • comp-pvfs-0-1.local, …, comp-pvfs-0-64.local • cp0-1, cp0-2, …, cp0-64

  11. Network Design

  12. Kickstart Graph • http://tdgrocks.sci.hkbu.edu.hk/homepage/dot-graph.php

  13. Login Master Node • Remote login is allowed from all HKBU networked PCs via SSH or vncviewer • SSH Login (Terminal) • Use your favourite SSH client software, e.g. PuTTY or SSH Secure Shell on Windows and OpenSSH on Linux/UNIX • e.g. on all SCI workstations (spc01 – spc30), type ssh tdgrocks.sci.hkbu.edu.hk

  14. Login Master Node • VNC Login (GUI) • Download vncviewer from http://www.uk.research.att.com/vnc/ • e.g. on spc01 – spc30.sci.hkbu.edu.hk, run vncviewer vnc.sci.hkbu.edu.hk:51 • e.g. on Windows, run vncviewer and, when asked for the server address, type vnc.sci.hkbu.edu.hk:51

  15. Username and Password • Unified password authentication has been implemented • Your password is the same as that of your NetWare account • Password authentication uses NDS-AS • The setup is similar to the UNIX servers in ITSC

  16. SSH Key Generation • To make use of multiple nodes in the PC cluster, users are required to use SSH. • Key generation is done automatically, once, during your first login • You may input a passphrase to protect the key pair • The key pair is stored in your $HOME/.ssh/ • Use the following script to regenerate the SSH key: /u1/local/share/makeKey

  17. User Policies • Users are allowed to log in remotely from other networked PCs in HKBU. • All users must use their own user account to log in. • The master node (frontend) is used only for login, simple editing of program source code, preparing job dispatching scripts and dispatching jobs to the compute nodes. No foreground or background jobs may be run on it. • Dispatching of jobs must be done via the Torque/PBS system.

  18. Torque/PBS System • Provides a fair and efficient job dispatching and queuing system for the cluster • A PBS script must be written to run a job • Both sequential and parallel jobs can be handled by PBS • Job error and output are stored in separate files, named according to job IDs.

  19. PBS Script Example (Sequential) • PBS scripts are shell scripts with directives prefixed by #PBS. The example below requests only 1 node and submits the job named ‘prime’ to the default queue. • #!/bin/bash • #PBS -l nodes=1 • #PBS -N prime • #PBS -q default • # the above are the PBS directives used in batch queuing • echo Running on host `hostname` • /u1/local/share/example/pbs/prime 216091

  20. Delivering a Sequential Job • Prepare and compile the executable: cp /u1/local/share/example/pbs/prime.c ./ • cc -o prime prime.c -lm • Prepare and edit the PBS script as above: cp /u1/local/share/example/pbs/prime.bat ./ • Submit the job: qsub prime.bat

  21. PBS Script Example (Parallel) • #!/bin/sh • #PBS -N cpi • #PBS -r n • #PBS -e cpi.err • #PBS -o cpi.log • #PBS -l nodes=5:ppn=2 • #PBS -l walltime=01:00:00 • # This job's working directory • echo Working directory is $PBS_O_WORKDIR • cd $PBS_O_WORKDIR • echo Running on host `hostname` • echo This job runs on the following processors: • echo `cat $PBS_NODEFILE` • # Define the number of processors • NPROCS=`wc -l < $PBS_NODEFILE` • echo This job has allocated $NPROCS processors • # Run the parallel MPI executable "cpi" • /u1/local/mpich-1.2.5/bin/mpirun -v -machinefile $PBS_NODEFILE -np $NPROCS /u1/local/share/example/pbs/cpi

  22. Delivering a Parallel Job • Copy the PBS script example cp /u1/local/share/example/pbs/runcpi ./ • Submit the PBS job qsub runcpi • Note the error and output files, named cpi.e??? and cpi.o??? according to the job ID

  23. MPI Libraries

  24. MPICH • MPICH is an open-source, portable implementation of the Message-Passing Interface Standard. It contains a complete implementation of version 1.2 of the MPI Standard and also significant parts of MPI-2, particularly in the area of parallel I/O. • URL: http://www-unix.mcs.anl.gov/mpi/ • MPICH 1.2.5.2 with gcc • /u1/local/mpich-1.2.5/bin • MPICH 1.2.5.2 with pgi • /u1/local/mpich-pgi/bin
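
  To make the library concrete: below is a minimal sketch of a cpi-style MPI program in C. It is not the course's own cpi source, just an illustration of the standard MPI-1 calls (MPI_Init, MPI_Bcast, MPI_Reduce) that both MPICH and LAM/MPI provide.

      #include <stdio.h>
      #include <mpi.h>

      /* Estimate pi by the midpoint rule, splitting the intervals among ranks. */
      int main(int argc, char *argv[])
      {
          int rank, size, i, n = 100000;
          double h, x, local = 0.0, pi = 0.0;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          /* Rank 0 owns n; broadcast it so every rank uses the same value. */
          MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

          h = 1.0 / (double)n;
          for (i = rank; i < n; i += size) {  /* each rank takes every size-th interval */
              x = h * ((double)i + 0.5);
              local += 4.0 / (1.0 + x * x) * h;
          }

          /* Sum the partial results onto rank 0. */
          MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              printf("pi is approximately %.16f\n", pi);

          MPI_Finalize();
          return 0;
      }

  Compile it with the mpicc wrapper of the MPI build you intend to run under (e.g. /u1/local/mpich-1.2.5/bin/mpicc -o mypi mypi.c) and submit it through a PBS script exactly like the cpi example in slide 21.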

  25. LAM/MPI • LAM/MPI is a high-performance, freely available, open source implementation of the MPI standard. • LAM/MPI supports all of the MPI-1 Standard and much of the MPI-2 standard. • LAM/MPI is not only a library that implements the mandated MPI API, but also the LAM run-time environment: a user-level, daemon-based run-time environment that provides many of the services required by MPI programs. • URL: http://www.lam-mpi.org/ • LAM 6.5.9 • /usr/bin • LAM 7.0.4 • /u1/local/lam-7.0.4/bin

  26. mpiJava • mpiJava provides an object-oriented Java interface to the Message Passing Interface (MPI) standard, for use on parallel or distributed computing platforms. • It also includes a comprehensive test suite for the Java interface. • It includes some simple examples and demos. • URL: http://www.hpjava.org/mpiJava.html • mpiJava 1.2: • /u1/local/mpiJava/

  27. Applications

  28. AMBER 7 • Assisted Model Building with Energy Refinement • Includes: • a set of molecular mechanical force fields for the simulation of biomolecules (which are in the public domain, and are used in a variety of simulation programs); • and a package of molecular simulation programs which includes source code and demos. • URL: http://amber.scripps.edu • AMBER 7 • /u1/local/amber7

  29. Gamess • The General Atomic and Molecular Electronic Structure System (GAMESS) • A general ab initio quantum chemistry package. • Calculates molecular properties, locates transition states, models solvent effects, computes orbitals, etc. • URL: http://www.msg.ameslab.gov/GAMESS/GAMESS.html • Gamess: • /u1/local/gamess

  30. Gaussian 03 • Performs calculations based on the basic laws of quantum mechanics • Predicts the energies, molecular structures, and vibrational frequencies of molecular systems along with numerous molecular properties. • It can be used to study molecules and reactions under a wide range of conditions, including both stable species and compounds which are difficult or impossible to observe experimentally, such as short-lived intermediates and transition structures. • URL: http://www.gaussian.com/ • Gaussian 03 • /u1/local/g03

  31. Gaussian 03 – GaussView

  32. Gromacs 3.2 • GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. • URL: http://www.gromacs.org • Gromacs 3.2 • /u1/local/gromacs

  33. MATLAB 6.5 with MPITB • Call MPI library routines from within the MATLAB interpreter. • Processes can be spawned and arranged in topologies, MATLAB variables can be sent/received. • Commands for remote execution may be passed as strings to the desired MATLAB "computing instance", and the result may be sent back to the MATLAB "host instance". • URL: http://atc.ugr.es/javier-bin/mpitb_eng • MATLAB 6.5 with MPITB • /u1/local/matlab/toolbox/mpitb

  34. NAMD 2.5 • Reads X-PLOR, CHARMM, AMBER, and GROMACS input files. • Generates structure and coordinate files. • Efficient conjugate gradient minimization. • Fixed atoms and harmonic restraints. • Thermal equilibration via periodic rescaling, reinitialization, or Langevin dynamics • Simulation of the temperature, pressure, etc. • URL: http://www.ks.uiuc.edu/Research/namd/ • NAMD 2.5 • /u1/local/namd2

  35. Q-Chem 2.1 • Supports Ground State Self-Consistent Field Methods, Wave Function Based Treatments of Electron Correlation, Excited State Methods, Properties Analysis and Basis Sets. • URL: http://www.q-chem.com/ • Q-Chem 2.1 • /u1/local/qchem

  36. R 1.8.1 • Provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. • Produces well-designed publication-quality plots. • URL: http://cran.r-project.org/ • R 1.8.1 • /u1/local/R

  37. R 1.8.1

  38. VMD 1.8.2 • Supports various kinds of molecular representations, coloring styles, transparency and materials, and display and rendering features • Supports various kinds of plugins • Supports interactive molecular dynamics simulation • URL: http://www.ks.uiuc.edu/Research/vmd/ • VMD 1.8.2 • /u1/local/vmd-1.8.2

  39. VMD 1.8.2

  40. MATH and Other Libraries

  41. ATLAS 3.6.0 • The ATLAS (Automatically Tuned Linear Algebra Software) project is an ongoing research effort focusing on applying empirical techniques in order to provide portable performance. At present, it provides C and Fortran77 interfaces to a portably efficient BLAS implementation, as well as a few routines from LAPACK. • URL: http://sourceforge.net/projects/math-atlas/ • ATLAS 3.6.0 • /u1/local/ATLAS
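
  As a concrete illustration of ATLAS's C interface, the sketch below multiplies two small matrices with the CBLAS routine cblas_dgemm. The header name cblas.h and the link line are the usual ATLAS conventions, but verify them against the local install under /u1/local/ATLAS.

      #include <stdio.h>
      #include <cblas.h>   /* C interface to the BLAS, shipped with ATLAS */

      int main(void)
      {
          /* Compute C = alpha*A*B + beta*C for 2x2 row-major matrices. */
          double A[4] = {1, 2, 3, 4};
          double B[4] = {5, 6, 7, 8};
          double C[4] = {0, 0, 0, 0};

          cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                      2, 2, 2,        /* M, N, K */
                      1.0, A, 2,      /* alpha, A, lda */
                      B, 2,           /* B, ldb */
                      0.0, C, 2);     /* beta, C, ldc */

          printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);
          return 0;
      }

  A typical (assumed) build line is cc -I/u1/local/ATLAS/include -o dgemm_test dgemm_test.c -L/u1/local/ATLAS/lib -lcblas -latlas.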

  42. ScaLAPACK 1.7 • The ScaLAPACK library contains distributed-memory versions (PBLAS) of the Level 1, 2 and 3 BLAS, and a set of Basic Linear Algebra Communication Subprograms (BLACS) for the communication tasks that arise frequently in parallel linear algebra computations. In the ScaLAPACK routines, all interprocessor communication occurs within the PBLAS and the BLACS. One design goal of ScaLAPACK was to have its routines resemble their LAPACK equivalents as much as possible. • URL: http://www.netlib.org/scalapack/scalapack_home.html • ScaLAPACK 1.7 • /u1/local/SCALAPACK
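
  The first step of any ScaLAPACK program is setting up a BLACS process grid. The sketch below uses the C wrappers (Cblacs_*) distributed with the BLACS; their prototypes are declared by hand because header conventions vary between installs, so treat this as an assumed but typical usage.

      #include <stdio.h>

      /* C wrappers for the BLACS; prototypes declared by hand (typical forms). */
      void Cblacs_pinfo(int *mypnum, int *nprocs);
      void Cblacs_get(int icontxt, int what, int *val);
      void Cblacs_gridinit(int *icontxt, char *order, int nprow, int npcol);
      void Cblacs_gridinfo(int icontxt, int *nprow, int *npcol, int *myrow, int *mycol);
      void Cblacs_gridexit(int icontxt);
      void Cblacs_exit(int doneflag);

      int main(void)
      {
          int me, np, ctxt, nprow = 2, npcol = 2, myrow, mycol;

          Cblacs_pinfo(&me, &np);    /* my process number and the total count */
          Cblacs_get(-1, 0, &ctxt);  /* obtain the default system context */
          Cblacs_gridinit(&ctxt, "Row", nprow, npcol);  /* 2x2 row-major grid */
          Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

          printf("process %d of %d is at grid position (%d,%d)\n",
                 me, np, myrow, mycol);

          Cblacs_gridexit(ctxt);  /* release the grid ... */
          Cblacs_exit(0);         /* ... and shut down the BLACS */
          return 0;
      }

  Distributed matrices are then described with array descriptors and passed to ScaLAPACK drivers, whose calling sequences follow their LAPACK counterparts closely, as noted above.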

  43. SPRNG 2.0a • The Scalable Parallel Random Number Generators Library (SPRNG) • SPRNG is a set of libraries for scalable and portable pseudorandom number generation, developed with the requirements of users involved in parallel Monte Carlo simulations in mind. • The generators implemented in SPRNG are (i) two versions of Linear Congruential with Prime Addend, (ii) modified Additive Lagged Fibonacci, (iii) Multiplicative Lagged Fibonacci, (iv) Combined Multiple Recursive Generator and (v) Prime Modulus Linear Congruential Generator. • URL: http://sprng.cs.fsu.edu/ • SPRNG 2.0a for MPICH • /u1/local/sprng2.0 • SPRNG 2.0a for LAM/MPI • /u1/local/sprng2.0-lam
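
  A minimal sketch of the SPRNG 2.0 default C interface follows. The init_sprng signature with a leading generator-type argument, the SPRNG_DEFAULT parameter and the 0 = additive lagged Fibonacci convention are taken from the SPRNG documentation; verify them against the headers under /u1/local/sprng2.0 before relying on them.

      #include <stdio.h>
      #include "sprng.h"   /* SPRNG 2.0 default interface (header assumed on the include path) */

      int main(void)
      {
          int streamnum = 0, nstreams = 1, seed = 985456376, i;
          int *stream;

          /* Generator type 0 selects the additive lagged Fibonacci generator;
             SPRNG_DEFAULT selects that generator's default parameter set. */
          stream = init_sprng(0, streamnum, nstreams, seed, SPRNG_DEFAULT);

          for (i = 0; i < 5; i++)
              printf("%f\n", sprng(stream));  /* uniform double in [0,1) */

          free_sprng(stream);  /* release the stream */
          return 0;
      }

  In a parallel Monte Carlo run, each MPI process would typically pass its own rank as streamnum (with nstreams equal to the process count) so that every process draws from an independent stream.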

  44. HPJava 1.0 • HPJava is an environment for scientific and parallel programming using Java. It is based on an extended version of the Java language. One feature that HPJava adds to Java is a multi-dimensional array, or multiarray, with properties similar to the arrays of Fortran. • URL: http://www.hpjava.org • HPJava 1.0 • /u1/local/hpjdk

  45. Homework 1 • Copy the files in /u1/local/share/example/pbs cp /u1/local/share/example/pbs/cpi.c . cp /u1/local/share/example/pbs/Makefile . cp /u1/local/share/example/pbs/runcpi . • Type make to create the cpi executable • Modify runcpi to run the cpi executable in this directory • Run runcpi in PBS with 1, 2, 4, 6, 8, 10, 12, 16 and 24 nodes (CPUs) and plot the timing graph. Hint: use qsub -l nodes=12 runcpi to request 12 nodes • From the graph, find the best number of processors for running this program • Due date: Jan 13, 2005. E-mail to morris@hkbu.edu.hk

  46. END
