Introduction to the NERSC HPCF NERSC User Services


Presentation Transcript


  1. Introduction to the NERSC HPCF, NERSC User Services • Hardware, Software, & Usage • Mass Storage • Access & Connectivity

  2. Hardware, part 1 • Cray Parallel Vector Processor (PVP) Systems • 96 CPUs, shared-memory parallelism (Cray tasking, OpenMP) • J90SE clock is 100 MHz; peak performance is 200 Mflops/CPU (~125, actual) • SV1 clock is 300 MHz; peak performance is 1200 Mflops/CPU (~300, actual) • J90SE and SV1 are not binary compatible • Cray T3E MPP System • mcurie • 692 PEs: 644 application, 33 command, 15 OS; 256 MB/PE • PE clock is 450 MHz; peak performance is 900 Mflops/PE (~100, actual) Introduction to NERSC - User Services Group

  3. Hardware, part 2 • IBM SP MPP System • gseaborg, Phase 1 • 304 nodes (608 CPUs): 256 (512) compute, 8 (16) login, 16 (32) GPFS, 8 (16) network, 16 (32) service; 1 GB/node • Node clock is 200 MHz; peak performance is 800 Mflops per CPU (~200, actual) • Phase 2 will be bigger and faster • Visualization Server • escher; SGI Onyx 2 • 8 CPUs, 5 GB RAM, 2 graphics pipes • CPU clock is 195 MHz; 2 simultaneous video streams • Math Server • newton; Sun UltraSPARC-II • 1 CPU, 512 MB RAM • CPU clock is 248 MHz Introduction to NERSC - User Services Group

  4. Hardware, part 3 • Parallel Distributed Systems Facility (PDSF) • High Energy Physics facility for detector simulation and data analysis • Multiple clustered systems; Intel Linux PCs, Sun Solaris workstations • Energy Sciences Network (ESNet) • Major component of the Internet; ATM Backbone • Specializing in information retrieval, infrastructure, and group collaboration • High Performance Storage System (HPSS) • Multiple libraries, hierarchical disk and tape archive systems • High speed transfers to NERSC systems • Accessible from outside NERSC • Multiple user interface utilities • Directories for individual users and project groups Introduction to NERSC - User Services Group

  5. PVP: File Systems, part 1 • $HOME • “permanent” (but not archival) • 5 GB quota, regular backups, file migration • local to killeen, NFS-mounted on seymour and batch systems • poor performance for batch jobs • /u/repo/u10101 • /Un/u10101 • /u/ccc/u10101 • /U0/u10101 Introduction to NERSC - User Services Group

  6. PVP: File Systems, part 2 • $TMPDIR • temporary (created/destroyed each session) • no quota (but NQS limits 10 GB - 40 GB) • no backups, no migration • local to each machine • high-performance RAID arrays • system manages this for you • A.K.A. $BIG • /tmp • location of $TMPDIR • 14-day lifetime • A.K.A. /big • you manage this for yourself Introduction to NERSC - User Services Group

  7. PVP: Environment, part 1 • UNICOS • Shells • Supported • sh • csh • ksh (same as sh) • Unsupported • tcsh (get it by “module load tcsh”) • bash (get it by “module load tools”) Introduction to NERSC - User Services Group

  8. PVP: Environment, part 2 • Modules • Found on many Unix systems • Sets all or any of environment variables, aliases, executable search paths, man search paths, header file include paths, library load paths • Exercise care modifying startup files! • Cray’s PrgEnv is modules-driven • Provided startup files are critical! • Add to .ext files, don’t clobber originals • Append to paths, don’t set them, and this only if necessary • If you mess up, no compilers, etc. • Useful commands • module list • module avail • module load modfile • module display modfile • module help modfile Introduction to NERSC - User Services Group
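
For illustration, a minimal sketch of everyday module usage, assuming the C shell and the .cshrc.ext customization file described above (the path appended to PATH is a placeholder):

    module list              # show currently loaded modules
    module avail             # list modules available for loading
    module load tcsh         # load an optional package named on an earlier slide
    module display tcsh      # see exactly what the module changes
    # In .cshrc.ext, append to an existing path rather than overwriting it:
    setenv PATH ${PATH}:${HOME}/bin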

  9. PVP: Environment, part 3 • Programming • Fortran 90 - f90 • C/C++ - cc, CC • Assembler - as • Use the compiler (f90, cc, CC) for linking also • f90 file naming conventions • filename.f - fixed form Fortran-77 code • filename.F - fixed form Fortran-77 code, run preprocessor first • filename.f90 - free form Fortran 90 code • filename.F90 - free form Fortran 90 code, run preprocessor first • Multiprocessing (aka multitasking, multithreading…) • setenv NCPUS 4 (csh) • export NCPUS=4 (ksh) • Typing “a.out” gives “a.out: Command not found.” because “.” is not in your search path; run “./a.out …” instead (note: no parallelism is specified at execution time) Introduction to NERSC - User Services Group
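
A minimal compile-and-run sketch under the C shell; the program name myprog and its input/output files are placeholders, not from the slides:

    f90 -o myprog myprog.f90     # the compiler driver also performs the link
    setenv NCPUS 4               # allow up to 4 CPUs for multitasked regions
    ./myprog < input > output    # "./" is needed because "." is not in the search path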

  10. PVP: Environment, part 4a • Execution modes • Interactive serial • 10 hours on killeen and seymour • 80 MW max memory • Interactive parallel • No guarantee of real-time concurrency • Batch queues (table not reproduced here; * = killeen, seymour, franklin, bhaskara; ** = franklin, bhaskara) • To see them: qstat -b • Queues shuffled at night, and sometimes during the day • Subject to change Introduction to NERSC - User Services Group

  11. PVP: Environment, part 4b • Batch • User creates a shell script (e.g., “myscript”) • Submits to NQE with “cqsub myscript” • Returns an NQE task id (e.g., “t1234”) • NQE selects a machine and forwards to NQS • Job remains pending (“NPend”) until resources are available • NQS runs the job • Assigns an NQS request id (e.g., “5678.bhaskara”) • Runs the job in the appropriate batch queue • Job log returned upon completion Introduction to NERSC - User Services Group
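
A hedged sketch of what “myscript” might contain, assuming C-shell syntax; the queue name, time limit, and file names are illustrative, and the exact #QSUB options should be checked against the NQS/NQE documentation:

    #QSUB -q regular              # illustrative queue name
    #QSUB -lT 3600                # illustrative per-request CPU time limit (seconds)
    #QSUB -eo                     # merge stderr into stdout
    cd $TMPDIR                    # run in the high-performance scratch space
    cp $HOME/myprog $HOME/input . # stage files (placeholder names)
    setenv NCPUS 4
    ./myprog < input > output
    cp output $HOME/              # copy results home before $TMPDIR is cleaned up

Submit it with “cqsub myscript” and watch for the returned NQE task id (e.g., t1234).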

  12. PVP: Environment, part 5 • Libraries • Mathematics • nag, imsl, slatec, lsode, harwell, etc. • Graphics • ncar, gnuplot, etc. • I/O • HDF, netCDF, etc. • Applications • Amber, Ansys, Basis, Gamess, Gaussian, Nastran, etc. Introduction to NERSC - User Services Group

  13. PVP: Environment, part 6 • Tools • ja - job accounting • hpm - Hardware Performance Monitor • prof - Execution time profiler & viewer • flowtrace/flowview - Execution time profiler & viewer • atexpert - Autotasking performance predictor • f90 - Compiler feedback • totalview - Debugger (visual and line-oriented) Introduction to NERSC - User Services Group

  14. T3E: File Systems, part 1 • $HOME • “permanent” (but not archival) • 2 GB quota, regular backups, file migration • poor performance for batch jobs • /u/repo/u10101 • /Un/u10101 • /u/ccc/u10101 • /U0/u10101 Introduction to NERSC - User Services Group

  15. T3E: File Systems, part 2 • $TMPDIR • temporary (created/destroyed each session) • 75 GB quota (but NQS limits 4 GB - 32 GB) • no backups, no migration • high-performance RAID arrays • system manages this for you • Can be used for parallel files • /tmp • location of $TMPDIR • 14-day lifetime • A.K.A. /big • you manage this for yourself Introduction to NERSC - User Services Group

  16. T3E: Environment, part 1 • UNICOS/mk • Shells: sh/ksh, csh, tcsh • Supported: • sh • csh • ksh (same as sh) • Unsupported: • tcsh (get it by “module load tcsh”) • bash (get it by “module load tools”) Introduction to NERSC - User Services Group

  17. T3E: Environment, part 2 • Modules - manages user environment • Paths, Environment variables, Aliases, same as on PVP systems • Cray’s PrgEnv is modules-driven • Provided startup files are critical! • Add to .ext files, don’t clobber originals • Append to paths, don’t set them, and this only if necessary • If you mess up, no compilers, etc. • Useful commands • module list • module avail • module load modfile • module display modfile • module help modfile Introduction to NERSC - User Services Group

  18. T3E: Environment, part 3a • Programming • Fortran 90: f90 • C/C++: cc, CC • Assembler: cam • Use the compiler (f90, cc, CC) for linking also • Same naming conventions as on the PVP systems • pghpf - Portland Group HPF • KCC - Kuck and Assoc. C++ • Get it via “module load KCC” • Multiprocessing • Execution in Single-Program, Multiple-Data (SPMD) mode • In Fortran 90, C, and C++, all processors execute the same program Introduction to NERSC - User Services Group
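
A small compile-and-link sketch for the T3E; file names are placeholders:

    f90 -o myprog myprog.f90     # Fortran 90: compile and link in one step
    cc -o myprog_c myprog.c      # C: same idea
    module load KCC              # make the Kuck & Assoc. C++ compiler available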

  19. T3E: Environment, part 3b • Executables - Malleable or Fixed • Specified at compilation and/or execution • f90 -X npes ... (e.g., -X64) creates a “fixed” executable • Always runs on the same number of (application) processors • Type ./a.out to run • f90 -Xm ... or omitting the -X option creates a “malleable” executable • ./a.out will run on a command PE • mpprun -n npes ./a.out runs on npes APP PEs • Executing code can ask for: • Process id (from zero up) - MPI_COMM_RANK(...) • Total number of PEs - MPI_COMM_SIZE(...) • The PE or process/task ID is used to establish “master/slave” identities, controlling execution Introduction to NERSC - User Services Group
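
A sketch of the fixed vs. malleable distinction, using placeholder file names and an illustrative PE count:

    f90 -X64 -o fixed64 myprog.f90     # fixed: always runs on 64 application PEs
    ./fixed64

    f90 -o flexible myprog.f90         # malleable: PE count chosen at run time
    ./flexible                         # runs on a single command PE
    mpprun -n 16 ./flexible            # runs on 16 application PEs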

  20. T3E: Environment, part 4a • Execution modes • Interactive serial • < 60 minutes on one command PE, 20 MW max memory • Interactive parallel • < 30 minutes on < 64 processors, 29 MW memory per PE • Batch queues • To see them: qstat -b • Queues shuffled at night • Subject to change Introduction to NERSC - User Services Group

  21. T3E: Environment, part 4b • (Old, obsolete) Example of T3E management and queue scheduling Introduction to NERSC - User Services Group

  22. T3E: Environment, part 5 • Math & graphics libraries, and application codes are similar to those on the PVP systems • Libraries are needed for communication: • MPI (Message-Passing Interface) • PVM (Parallel Virtual Machine) • SHMEM (SHared MEMory; non-portable) • BLACS (Basic Linear Algebra Communication Subprograms) • ScaLAPACK (SCAlable [parts of] LAPACK) • LIBSCI (including parallel FFTs), NAG, IMSL • I/O libraries • Cray’s FFIO • NetCDF (NETwork Common Data Format) • HDF (Hierarchical Data Format) Introduction to NERSC - User Services Group

  23. T3E: Environment, part 6 • Tools • Apprentice - finds performance problems and inefficiencies • PAT - Performance analysis tool • TAU - ACTS tuning and analysis utility • Vampir - commercial trace generation and viewing utility • Totalview - multiprocessing-aware debugger • f90 - compiler feedback Introduction to NERSC - User Services Group

  24. SP: File Systems, part 1 • AIX is a Virtual Memory operating system • Each node has its own disks, with OS image, swap and paging spaces, and scratch partitions. • Two types of user-accessible file systems: • Large, globally accessible parallel file system, called GPFS • Smaller node-local partitions Introduction to NERSC - User Services Group

  25. SP: File Systems, part 2 • Environment variables identify directories • $HOME - your personal home directory • Located in GPFS, so globally available to all jobs • Home directories are not currently backed up! • Quotas: 4 GB, and 5000 inodes • $SCRATCH - one of your temporary spaces • Located in GPFS • Very large - 3.5 TB • Transient - purged after session or job termination • $TMPDIR - another of your temporary spaces • Local to a node • Small - only 1 GB • Not particularly fast • Transient - purged on termination of creating session or batch job Introduction to NERSC - User Services Group
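
A sketch of how these spaces are typically combined in a job, with placeholder file names:

    cp $HOME/big_input.dat $SCRATCH/    # large working data goes to global GPFS scratch
    cd $TMPDIR                          # small node-local scratch, purged at job end
    ./a.out $SCRATCH/big_input.dat
    cp $SCRATCH/results.dat $HOME/      # copy results back; scratch spaces are transient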

  26. SP: File Systems, part 3 • Directly-specified directory paths can also be used • /scratch - temporary space • Located in GPFS • Very large • Not purged at job termination • Subject to immediate purge • Quotas: 100 GB and 6000 inodes • Your $SCRATCH directory is set up in /scratch/tmpdirs/{nodename}/tmpdir.{number} where {number} is system-generated • /scratch/{username} - user-created temporary space • Located in GPFS • Large, fast, encouraged usage • Not purged at job termination • Subject to purge after 7 days, or as needed • Quotas: 100 GB and 6000 inodes Introduction to NERSC - User Services Group

  27. SP: File Systems, part 4 • /scr - temporary space • Local to a node • Small - only 1 GB • Your session-local $TMPDIR is set up in /scr/tmpdir.{number} where {number} is system-generated • Not user-accessible, except for $TMPDIR • /tmp - System-owned temporary space • Local to a node • Very small - 65 MB • Intended for use by utilities, such as vi for temporary files • Dangerous - DO NOT USE! • If filled up, it can cause the node to crash! Introduction to NERSC - User Services Group

  28. SP: Environment, part 1 • IBM's AIX - a true virtual memory kernel • Not a single system image, as on the T3E • Local implementation of the module system • No modules are loaded by default • Default shell is csh • Shell startup files (e.g., .login, .cshrc, etc.) are links; DON’T delete them! • Customize extension files (e.g., .cshrc.ext), not startup files Introduction to NERSC - User Services Group

  29. SP: Environment, part 2 • SP Idiosyncrasies • All nodes have unique identities; different logins may put you on different nodes • Must change password, shell, etc. on the gsadmin node • No incoming FTP allowed • xterms should not originate on the SP • Different sessions may be connected to different nodes • High speed I/O is done differently from the T3E • Processors are faster than on the T3E, but communication is slower • PFTP is faster than native FTP • SSH access methods differ slightly Introduction to NERSC - User Services Group

  30. SP: Environment, part 3a • Programming in Fortran • Fortran - Fortran 77, Fortran 90, and Fortran 95 • Multiple "versions" of the XLF compiler • xlf, xlf90 for ordinary serial code • xlf_r, xlf90_r for multithreaded code (shared memory parallelism) • mpxlf90, mpxlf90_r for MPI-based parallel code • Currently, you must specify a separate temporary directory for Fortran 90 “modules”: xlf90 -qmoddir=$TMPDIR -I$TMPDIR modulesource.F source.F • IBM's HPF (xlhpf) is also available Introduction to NERSC - User Services Group
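
A sketch of the common XLF invocations listed above; file names are placeholders:

    xlf90 -o serial prog.f90        # ordinary serial Fortran 90
    xlf90_r -o threaded prog.f90    # thread-safe compiler for shared-memory parallelism
    mpxlf90 -o parallel prog.f90    # MPI-based parallel Fortran 90
    # Sources that define Fortran 90 modules currently need an explicit module directory:
    xlf90 -qmoddir=$TMPDIR -I$TMPDIR modulesource.F source.F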

  31. SP: Environment, part 3b • Programming in C and C++ • C & C++ languages supported by IBM • Multiple "versions" of the XLC compiler • cc, xlc for ordinary serial C code • xlC for ordinary serial C++ code • cc_r, xlc_r for multithreaded C code (shared memory parallelism) • xlC_r for multithreaded C++ code (shared memory parallelism) • mpcc for MPI-based parallel C code • mpCC for MPI-based parallel C++ code • Kuck & Assoc. KCC also available in its own module Introduction to NERSC - User Services Group

  32. SP: Environment, part 4a • Execution • Many ways to run codes: • serial, parallel • shared-memory parallel, message-based parallel, hybrid • interactive, batch • Serial execution is easy: ./a.out <input_file >output_file • Parallel execution - SPMD Mode, as with T3E • Uses POE, a supra-OS resource manager • Uses Loadleveler to schedule execution • There is some overlap in options specifiable to POE and LoadLeveler • You can use one or both processors on each node • environment variables and batch options control this Introduction to NERSC - User Services Group

  33. SP: Environment, part 4b • Shared memory parallel execution • Within a node only • OpenMP, POSIX threads, IBM SMP directives • Message-based parallel execution • Across nodes and within a node • MPI, PVM, LAPI, SHMEM (planned) • Hybrid parallel execution • Threading and message passing • Most likely to succeed: OpenMP and MPI • Currently, MPI understands inter- vs. intra-node communication, and sends intra-node messages efficiently Introduction to NERSC - User Services Group
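
A hedged sketch of a hybrid OpenMP + MPI build and launch, assuming C-shell syntax; the -qsmp=omp flag enables OpenMP in the IBM compilers, and the file names, thread count, and process count are illustrative:

    mpxlf90_r -qsmp=omp -o hybrid hybrid.f90   # thread-safe MPI compiler plus OpenMP
    setenv OMP_NUM_THREADS 2                   # two threads per MPI task (one per CPU)
    poe ./hybrid -procs 2 < input > output     # node and task placement is further
                                               # controlled by POE/LoadLeveler settings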

  34. SP: Environment, part 4c • Interactive execution • Interactive jobs run on login nodes or compute nodes • Currently, there are 8 login nodes • Serial execution is easy: ./a.out <input_file >output_file • Parallel execution involves POE: poe ./a.out -procs 4 <input_file >output_file • Interactive parallel jobs may be rejected due to resource scarcity; no queueing • By default, parallel interactive jobs use both processors on each node • Batch execution • Batch jobs run on the compute nodes • By default, parallel batch jobs use both processors on each node; you will be charged for both, even if you override this • Use LoadLeveler utilities to submit, monitor, cancel, etc. • Requires a script specifying resource usage details, execution parameters, etc. • Several job classes for charging and resource limits: premium, regular, low; two job types - serial and parallel Introduction to NERSC - User Services Group
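
A hedged sketch of a LoadLeveler batch script; the class name, node counts, and limits are illustrative, and the exact keyword set should be checked against the local LoadLeveler documentation:

    #@ job_type         = parallel
    #@ class            = regular          # illustrative class name
    #@ node             = 2
    #@ tasks_per_node   = 2                # use both processors on each node
    #@ wall_clock_limit = 00:30:00
    #@ output           = job.out
    #@ error            = job.err
    #@ queue
    ./a.out < input_file > output_file

Submit with “llsubmit myjob”, monitor with “llq”, and cancel with “llcancel”.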

  35. SP: Environment, part 4d • SP Batch Queues and Resource Limits • Limits: • 3 jobs running • 10 jobs considered for scheduling (idle) • 30 jobs submitted Introduction to NERSC - User Services Group

  36. SP: Environment, part 5 • Libraries and Other Software • Java, Assembler • Aztec, PETSc, ScaLAPACK • Emacs • Gaussian 98, NWChem • GNU Utilities • HDF, netCDF • IMSL, NAG, LAPACK • MASS, ESSL, PESSL • NCAR Graphics • TCL/TK Introduction to NERSC - User Services Group

  37. SP: Environment, part 6 • Tools • VT - visualization tool for trace visualization and performance monitoring • Xprofiler - graphical code structure and execution time monitoring • Totalview - multiprocessing-aware debugger • Other Debugging Tools • Totalview - available in its own module • adb - general purpose debugger • dbx - symbolic debugger for C, C++, Pascal, and FORTRAN programs • pdbx - based on dbx, with functionality for parallel programming • TAU - ACTS tuning and analysis utility - planned! • Vampir - commercial trace generation and viewing utility - future! • KAP Suite - future? • PAPI - future? Introduction to NERSC - User Services Group

  38. HPSS Mass Storage • HPSS • Hierarchical, flexible, powerful, performance-oriented • Multiple user interfaces allow easy, flexible storage management • Two distinct physical library systems • May be logically merged in future software release • Accessible from any system from inside or outside NERSC • hpss.nersc.gov, archive.nersc.gov (from outside NERSC) • hpss, archive (from inside NERSC) • Accessible via several utilities • HSI, PFTP, FTP • Can be accessed interactively or from batch jobs • Compatible with system maintenance utilities (“sleepers”) Introduction to NERSC - User Services Group
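
A sketch of the storage interfaces named above; file and directory names are placeholders, and HSI command strings are shown in their simplest quoted form:

    hsi "mkdir mydir; cd mydir; put bigfile.dat"   # store a file in HPSS
    hsi "cd mydir; get bigfile.dat"                # retrieve it later
    pftp hpss.nersc.gov                            # high-speed parallel FTP session
    ftp archive.nersc.gov                          # ordinary FTP to the user archive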

  39. HPSS Mass Storage • HPSS • Allocated and accounted, just like CPU resources • Storage Resource Units (SRUs) • Open ended - you get charged, but not cut off, if you exceed your allocation • “Project” spaces available, for easy group collaboration • Used for system backups and user archives • hpss is used for both purposes • archive is for user use only • Has modern access control • DCE allows automatic authentication • Special DCE accounts needed • Not uniformly accessible from all NERSC systems • Problems with PFTP on the SP system • Modern secure access methods are problematic • ftp tunneling doesn’t work (yet…) Introduction to NERSC - User Services Group

  40. Accessing NERSC • NERSC recognizes two connection contexts: • Interaction (working on a computer) • File transfer • Use of SSH is required for interaction (telnet, rlogin are prohibited) • SSH is (mostly) standardized and widely available • Most Unix & Linux systems come with it • Commercial (and some freeware) versions available for Windows and Macs • SSH allows telnet-like terminal sessions, but protects the account name and password with encryption • Simple and transparent to set up and use • Can look and act like rlogin • SSH can forward xterm connections • Sets up a special “DISPLAY” environment variable • Encrypts the entire session, in both directions Introduction to NERSC - User Services Group

  41. Accessing NERSC • SSH is encouraged for file transfers • SSH contains “scp”, which acts like “rcp” • scp encrypts login info and all transferred data • SSH also allows secure control connections through “tunneling” or “forwarding” • Here’s how tunneling is done: • Set up a terminal connection to a remote host with port forwarding enabled • This specifies a port on your workstation that ssh will forward to another host • FTP to the forwarded port - looks like you are ftp’ing to your own workstation • Control connection (login process) is forwarded encrypted • Data connections proceed as any ftp transfer would, unencrypted • Ongoing SSH issues being investigated by NERSC staff • Not all firewalls allow ftp tunneling, without “passive” mode • HPSS won’t accept tunneled ftp connections • Workstation platform affects tunneling method • Methods differ slightly on the SP • New options, must use xterm forwarding, no ftp tunneling... • Different platforms accept different ciphers Introduction to NERSC - User Services Group
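
A sketch of the access patterns described above, with placeholder host, user, and file names (the port number is illustrative):

    ssh u10101@somehost.nersc.gov                      # encrypted terminal session
    scp results.dat u10101@somehost.nersc.gov:         # encrypted file copy, rcp-style
    # FTP tunneling: forward a local port to a remote ftp server, then ftp to it
    ssh -L 2021:somehost.nersc.gov:21 u10101@somehost.nersc.gov
    ftp localhost 2021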

  42. Information Sources - NERSC Web Pages Introduction to NERSC - User Services Group

  43. Information Sources - On-Line Lecture Materials Introduction to NERSC - User Services Group
