
Using NPACI Systems Part I. NPACI Environment (http://www.npaci.edu/Resources)


Presentation Transcript


  1. Using NPACI Systems Part I. NPACI Environment (http://www.npaci.edu/Resources) NPACI Parallel Computing Institute, August 13-17, 2001, San Diego Supercomputer Center

  2. NPACI Environment Outline: • Logging in • ssh, startup scripts • File System • $HOME, $WORK, HPSS, HSI • Accounting • reslist

  3. NPACI environment: logging in • ssh horizon.npaci.edu -l username • ssh (secure shell) required for secure access • see: http://security.sdsc.edu/

  4. NPACI environment: unix startup files • .cshrc, .login, and .profile are provided by NPACI • set up the programming environment • set up the unix environment • execute COPYDEFAULTS to restore the default startup files: /usr/local/paci/shellrc/bin/COPYDEFAULTS (then source .cshrc)
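A minimal restore sequence, as a sketch assuming a csh-family login shell; the backup filename .cshrc.bak is only illustrative:

  tf004i% cp .cshrc .cshrc.bak                       # optional: keep a copy of any local customizations (illustrative name)
  tf004i% /usr/local/paci/shellrc/bin/COPYDEFAULTS   # restore the NPACI-provided startup files
  tf004i% source .cshrc                              # pick up the restored environment in the current shell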

  5. Filesystem Structure • $HOME/: 50-100 MB per user of backed-up storage, used for unix essentials, etc. • $WORK/: >200 GB of scratch disk, purged after 96 hours; your primary workspace • Archival Storage: HPSS (SDSC) / DMF (TACC) • disk/tape system providing TBytes of storage
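A typical workflow under these limits, sketched with hypothetical file names (myjob.f, input.dat, results.tar): keep sources in $HOME, stage and run jobs in $WORK, and archive anything needed beyond the 96-hour purge to HPSS (the hsi command is covered on the next slides):

  tf004i% cp $HOME/myjob.f $HOME/input.dat $WORK/   # stage inputs into the large, purged workspace
  tf004i% cd $WORK                                  # run the job from the primary workspace
  tf004i% tar cf results.tar output*                # bundle results before the 96-hour purge
  tf004i% hsi "put results.tar"                     # archive the bundle to HPSS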

  6. HPSS - Archival Storage (SDSC) • Interactive execution: "hsi" is the recommended HPSS interface utility; pftp can also be used - Beginning Interactive Execution: tf004i% pftp Saving a file in HPSS: pftp> put filename Retrieving a file from HPSS: pftp> get filename Ending Interactive Execution: pftp> quit
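Since hsi is the recommended interface, here is a matching interactive hsi session as a sketch (filename is a placeholder, and the prompt shown is only illustrative; it varies with the hsi version):

  tf004i% hsi
  hsi> put filename     # save a local file in HPSS
  hsi> get filename     # retrieve a file from HPSS
  hsi> ls               # list files in your HPSS home directory
  hsi> quit             # end the interactive session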

  7. HPSS - Archival Storage (SDSC) • Single line execution: • Saving a file in HPSS: tf004i% pftp put filename • Retrieving a file from HPSS: tf004i% pftp get filename • For tips and examples of using HPSS, see http://www.npaci.edu/HPSS http://www.sdsc.edu/Storage/hsi/

  8. NPACI environment: accounting • Normally updated every 24 hours • beware of going over your allocation • Commands: • reslist -u username (-a accountname) • resalloc

  9. Using NPACI Systems Part II. Blue Horizon (http://www.npaci.edu/Horizon) NPACI Parallel Computing Institute, August 13-17, 2001, San Diego Supercomputer Center

  10. Overview • Hardware overview • Compiling programs • How CPUs are requested • Running parallel jobs • Interactive • LoadLeveler scripts

  11. Configuration • 144 IBM SP-High Nodes • 8 shared memory processors/node for 1152 processors • 4 Gbytes/node • Power3 processors • 375 MHz • 4 floating point operations per cycle • 1.5 GFLOPS/processor • Nodes connected together with IBM switch • 350 MB/second
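To put the peak numbers together: 4 floating-point operations per cycle at 375 MHz gives 375 MHz x 4 = 1500 MFLOPS = 1.5 GFLOPS per processor, and 144 nodes x 8 processors = 1152 processors, for an aggregate peak of roughly 1.5 GFLOPS x 1152, or about 1.7 TFLOPS.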

  12. Programming Environment • Compilers • Fortran: xlf, xlf90, mpxlf, mpxlf90 • C/C++: xlc, xlC, mpcc, mpCC, gcc (GNU C/C++) • Tools • Debuggers • Profilers • Parallel Environment (PE) and Parallel Operating Environment (POE) • Enables running parallel jobs • Message passing libraries: LAPI, MPL, MPI, MPICH

  13. Compiling C • C • xlc -qtune=pwr3 -qarch=pwr3 -O3 program.c • C MPI • mpcc -qtune=pwr3 -qarch=pwr3 -O3 program.c • C OpenMP • cc_r -qsmp=noauto:omp program.c • C MPI and OpenMP • mpcc_r -qsmp=noauto:omp program.c • C++ • xlC, mpCC, xlC_r, mpCC_r
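The tuning and optimization flags shown above can be combined with the thread-safe MPI compiler for a hybrid MPI/OpenMP build; a sketch, with hybrid.c and the -o output name purely illustrative:

  tf004i% mpcc_r -qsmp=noauto:omp -qtune=pwr3 -qarch=pwr3 -O3 -o hybrid hybrid.c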

  14. Compiling Fortran • Fortran • xlf -qtune=pwr3 -qarch=pwr3 -O3 program.f • Fortran MPI • mpxlf -qtune=pwr3 -qarch=pwr3 -O3 program.f • Fortran OpenMP • xlf_r -qsmp=noauto:omp program.f • Fortran MPI and OpenMP • mpxlf_r -qsmp=noauto:omp program.f • Fortran 90 • xlf90, mpxlf90, xlf90_r, mpxlf90_r

  15. How CPUs are Requested • Blue Horizon has 144 nodes with 8 CPUs on each node: 144*8=1152 CPUs • Request: # of nodes, # of tasks per node, # of threads per task (hybrid code only) • # of CPUs requested = nodes*tasks*threads • Hybrid code spawns threads per MPI task • Speaker notes (Laura C. Nett): the number of nodes and tasks is requested via the POE or LoadLeveler environment, while the number of threads is set via an environment variable. Current hardware limits fast (US) communication to 4 MPI tasks per node, so 4 CPUs per node sit idle in that mode; you are not charged for them, and the new Colony switch will change that. When using threads, set the number of threads to spawn per MPI task.
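For example, the hybrid request used in LoadLeveler Example 2 later in this part (16 nodes, 4 tasks per node, 2 OpenMP threads per task) asks for 16 * 4 * 2 = 128 CPUs, exactly filling the 8 CPUs on each of the 16 nodes.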

  16. Running Interactive Jobs • Interactive jobs: poe a.out arg_list -nodes n -tasks_per_node m -rmpool 1 • Interactive debugging: pedb a.out arg_list -nodes n -tasks_per_node m -rmpool 1 • Alternatively, use environment variables: setenv MP_NODES n setenv MP_TASKS_PER_NODE m setenv MP_RMPOOL 1 • tasks_per_node: 1-8 (the number of CPUs on each node) • Speaker notes (Laura C. Nett): interactive jobs run much as on the SP2, except that you define the number of CPUs by giving the number of nodes and tasks; if you want threads to be spawned, define this before running poe. The interactive CPUs are set aside for use during the day and, unlike on the SP2, are not shared. There are various environment variables you may want to set; they appear in all lower case as command-line options. Remember that tasks_per_node is a number between 1 and 8 (the number of CPUs on a node), but for straight MPI code using fast communication the maximum is 4.
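A concrete sketch of the environment-variable form, assuming an MPI executable a.out and illustrative counts of 2 nodes with 4 tasks per node, which runs 2 x 4 = 8 MPI tasks interactively:

  tf004i% setenv MP_NODES 2
  tf004i% setenv MP_TASKS_PER_NODE 4
  tf004i% setenv MP_RMPOOL 1
  tf004i% poe a.out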

  17. Batch Jobs with LoadLeveler • Develop a program and make sure it runs correctly • Write a script file that has information about your job and the nodes/tasks/threads you want to run it on • Submit the script file to the LoadLeveler • Check the status of your job • Batch Script Generator (sample) http://www.sdsc.edu/~lnett/Tools/batchform.html

  18. LoadLeveler Scripts: Example 1

#!/bin/ksh
# @ environment = MP_EUILIB=us; MP_SHARED_MEMORY=YES; MP_PULSE=0; MP_INTRDELAY=100;
# @ class = high
# @ job_type = parallel
# @ node=16
# @ tasks_per_node=8
# @ node_usage=not_shared
# @ network.MPI=css0,not_shared,US
# @ wall_clock_limit = 3:30:00
# @ input = /dev/null
# @ output = LL_out.$(jobid)
# @ error = LL_err.$(jobid)
# @ initialdir = /work/login-name/mydir
# @ notify_user = login-name@npaci.edu
# @ notification = always
# @ queue
poe a.out          (or: poe myscript)

where myscript is:

#!/bin/ksh
a.out

  19. LoadLeveler Scripts: Example 2 (4 CPUs per node PLUS 2 threads per task)

#!/bin/ksh
# @ environment = MP_EUILIB=us; MP_SHARED_MEMORY=YES; MP_PULSE=0; MP_INTRDELAY=100;
# @ class = high
# @ job_type = parallel
# @ node=16
# @ tasks_per_node=4
# @ node_usage=not_shared
# @ network.MPI=css0,not_shared,US
# @ wall_clock_limit = 3:30:00
# @ input = /dev/null
# @ output = LL_out.$(jobid)
# @ error = LL_err.$(jobid)
# @ initialdir = /work/login-name/mydir
# @ notify_user = login-name@npaci.edu
# @ notification = always
# @ queue
export OMP_NUM_THREADS=2
export SPINS=0
export YIELDS=0
export SPINLOOPTIME=5000
poe a.out          (or: poe myscript)

where myscript is:

#!/bin/ksh
export OMP_NUM_THREADS=2
export SPINS=0
export YIELDS=0
export SPINLOOPTIME=5000
a.out

  20. Keywords in LoadLeveler Scripts • arguments - arguments to pass to the executable • input - file to use as stdin • output - file to use as stdout • initialdir - initial working directory for the job • executable - executable or script to run • job_type - can be parallel or serial • notify_user - user to send e-mail to regarding this job • Keywords are case insensitive

  21. Keywords for LoadLeveler Scripts (cont.) • notification - when to e-mail notify_user (always, error, start, never, complete) • node - how many nodes you want • tasks_per_node - MPI tasks per node (max of 4 for now) • wall_clock_limit - maximum wall clock time for the job • class - class (queue) to run in (low, normal, high, express) • queue - puts the job in the queue

  22. LoadLeveler Commands • Job submission: • llsubmit filename - submit job filename to LoadLeveler • llcancel [-q] jobid - cancel the job with the given id • llq [-s job_number] - list jobs [or the given job number] • Job status: • showq - show the job queue • showbf - show backfill
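A typical submit-and-monitor sequence using these commands, with LLexample (from the lab session) standing in for your script and jobid as a placeholder for the identifier that llsubmit reports:

  tf004i% llsubmit LLexample     # submit the LoadLeveler script from slide 18 or 19
  tf004i% llq                    # check where the job sits in the queue
  tf004i% llcancel jobid         # cancel the job, if needed, using the reported id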

  23. Documentation NPACI Blue Horizon documentation http://www.npaci.edu/Horizon IBM documentation http://www.rs6000.ibm.com/resource/aix_resource/sp_books

  24. Lab session-Blue Horizon • Log into horizon ssh -l username horizon.npaci.edu • Ensure you have the default environment /usr/local/paci/shellrc/bin/COPYDEFAULTS source .cshrc • Copy sample LoadLeveler script and code cp /work/NPACI/Training/LoadLeveler/LLexample . cp /work/NPACI/Training/LoadLeveler/hello_mpi.f . • Compile sample code: mpxlf hello_mpi.f • Run code interactively: MPI ONLY: poe a.out -nodes 1 -tasks_per_node 4 -rmpool 1 MPI/OpenMP: setenv OMP_NUM_THREADS 2 poe a.out -nodes 1 -tasks_per_node 4 -rmpool 1 • Submit job to LoadLeveler batch queue: • First edit LLexample llsubmit LLexample • Check current usage reslist

  25. Using NPACI Systems Part III. Sun HPC 10000 (http://www.npaci.edu/HPC10000) NPACI Parallel Computing Institute, August 13-17, 2001, San Diego Supercomputer Center

  26. Outline • System Overview • Compiling programs • Running jobs • Lab example

  27. Sun HPC System Overview • 64 UltraSPARC II processors with 64GB RAM • Uniform Memory Access (UMA) shared memory parallelism • Each processor has 16 kB of direct-mapped data cache, 4 MB of external cache, 400 MHz clock speed, and peak performance of 800 MFLOPS.
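Putting these figures together: 64 processors at a peak of 800 MFLOPS each gives 64 x 800 MFLOPS = 51.2 GFLOPS of aggregate peak performance, and the 64 GB of shared memory averages 1 GB per processor, all uniformly accessible under the UMA design.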

  28. Accessing the Sun HPC • The Sun HPC 10000 is actually a collection of systems • Compute Engine: ultra (64 processors) • for batch jobs only • Front End: gaos.sdsc.edu (12 processors) • development environment compatible with ultra • launch all jobs here • Connecting to the system: ssh gaos.sdsc.edu -l username

  29. Compiling • Serial programs with the Forte (Workshop 6.1) compilers: f77, f90, cc, CC • MPI programs with the HPC 3.1 front-end compilers: tmf77, tmf90, tmcc, tmCC (link with -lmpi) • OpenMP programs using the Kuck & Associates preprocessors: guidef77, guidec • OpenMP programs using the Sun Forte compilers: f90 -mp=openmp -xparallel -stackvar
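Two compile lines as a sketch, with hello_mpi.f and omp_prog.f standing in for your own sources and the -o names purely illustrative:

  gaos % tmf90 -o hello_mpi hello_mpi.f -lmpi                         # MPI build with the HPC 3.1 front end
  gaos % f90 -mp=openmp -xparallel -stackvar -o omp_prog omp_prog.f   # OpenMP build with the Forte compiler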

  30. Running Interactively • Serial gaos % myprog • OpenMP gaos % setenv OMP_NUM_THREADS 4 gaos % ompprog • MPI gaos % bsub -I -n 4 -m ultra mpiprog (Will run in LSF “hpc” interactive queue)

  31. Batch Queue Structure • LSF batch queues • a holding place for batch requests, according to the information supplied at submission • LSF dispatches each job for batch execution on host(s) that satisfy its resource requirements • Batch Script Generator http://www.sdsc.edu/~lnett/Tools/batchform.html

  32. Batch Queues

  33. Submitting a Batch Job #1 • Simple MPI program: gaos % bsub -q normal -m ultra -n 16 -oe out pam prog • Equivalent batch script: gaos % cat mpiprog.bsub.1

#!/bin/csh
#BSUB -q normal
#BSUB -m ultra
#BSUB -n 16          # total process reservations
#BSUB -oe out
#BSUB
cd /paci/sdsc/frost/src/mydir
pam prog

• Submitting the batch script: gaos % bsub < mpiprog.bsub.1

  34. Submitting a Batch Job #2 • Batch script for an OpenMP job: gaos % cat ompprog.bsub.1

#!/bin/csh
#BSUB -q high
#BSUB -m ultra
#BSUB -o ompprog.log.1
#BSUB -e ompprog.err.1
#BSUB -u user@somewhere.net
#BSUB -n 16          # total thread reservations
#BSUB
cd /paci/sdsc/frost/src/mydir
setenv OMP_NUM_THREADS 16
ompprog

• Submitting the batch script: gaos % bsub < ompprog.bsub.1

  35. Monitoring and Managing Jobs • Status of your job: gaos % bjobs [-l jobid] • All jobs: gaos % bjobs -u all • Job load per machine: gaos % bhosts • Killing a job (jobid=33443): gaos % bkill 33443

  36. Lab session-Ultra • Login % ssh gaos.sdsc.edu -l username • Ensure you have the default environment % /usr/local/paci/shellrc/bin/COPYDEFAULTS % source .cshrc • Check your current usage % reslist • Get job script example and sample code % cp /work/NPACI/Training/LSF/lsf_script . % cp /work/NPACI/Training/LSF/hello_mpi.f . • Compile % tmf90 hello_mpi.f -lmpi • Run job interactively % bsub -I -m ultra -n 4 a.out • Submit job to LSF batch queue • First edit the lsf_script % bsub < lsf_script
