
Getting Started on Emerald


Presentation Transcript


  1. Getting Started on Emerald ITS- Research Computing Group

  2. Course Objectives • Word for the Day: Heterogeneous • Emerald: the Swiss army knife of computing, something for everyone :) • Something you can use today • A reference for something you can use tomorrow

  3. Course Objectives Cont. • Educate users on the broader aspects of research computing • Practical knowledge to allow you to efficiently perform your research • Pointers towards more advanced topics

  4. Course Outline • Course Objectives • What are compute clusters and Emerald in particular? • Accessing Emerald • login • file systems • Running jobs on Emerald – Job Management • job schedulers • batch commands • submitting jobs • specialty scripts • Available Software • software • package space • Compiling Code

  5. Help Documentation • Getting Started on Emerald • http://help.unc.edu/6020 • General overview of Emerald for range of users • Short Course – Getting Started on Emerald • http://help.unc.edu/6479 • Detailed notes for beginning Emerald users

  6. What is a compute cluster? What is Emerald?

  7. Emerald Linux Cluster

  8. What is Emerald? • General Purpose Linux Cluster • Maintained by Research Computing Group • Appropriate for all users regardless of expertise level • Other Servers: • Cedar/Cypress (128-processor SGI/Altix) • a large shared memory system • Topsail (4160-processor Dell Linux Cluster) • homogeneous capability cluster with fast interconnect • Mass Storage • Account access

  9. What is a compute cluster? Some Typical Components • Compute Nodes • Interconnect • Shared File System • Software • Operating System (OS) • Job Scheduler/Manager • Mass Storage

  10. Emerald is a Heterogeneous Cluster • Compute Nodes: Xeon blades, IBM Power 4 and Power 5 • Interconnect: Gigabit Ethernet (aka gigE or GbE) • Shared File Systems: AFS, NFS, and GPFS • Mass Storage: ~/ms • Software: much licensed and public-domain software in package space • Operating Systems (OS): RH5 (64 bit), RH4 (32 bit), and AIX (64 bit) • Job Scheduler/Manager: all handled by LSF

  11. Emerald Overview

  12. Advantages of Using Emerald • High performance • Large capacity • Parallel processing • Many available software packages • Variety of compiling options • Shared file systems • Mass storage

  13. Emerald Compute Nodes • Mostly IBM BladeCenter Xeon blades • all are dual-socket Intel Xeons • 1, 2, or 4 cores/socket (i.e. 2, 4, or 8 processors/node) • 2.0, 2.8, 3.0, 3.2 GHz processors • varying memory, mostly 2 or 4 GB per core • IBM Power 4 and 5 • large memory, varying processor speeds • Cluster is constantly evolving

  14. Emerald Blades • [Two images, labeled "No!" and "Yes!"] • A chassis with 14 blades

  15. Emerald Summary • Over 200 host blade nodes, Intel Xeon • Over 800 blade cores • typically 2-4 GB memory per core • 4 IBM AIX p575’s, Power 5 • 64 cores, large memory • 1 IBM AIX p690, Power 4 • 32 cores, large (128 GB) shared memory • Gigabit Ethernet switching fabric • Running 32 and 64 bit Linux and 64 bit AIX

  16. Emerald Details • Run the lshosts command to see resources for each node (host). Note host, model, ncpus, maxmem, resources • % lshosts • HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES • bc12-n01 X86_64 Xeon_3_2 12.0 2 3954M 996M Yes (X64bit blade blade12 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon32) • bc10-n10 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade10 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28) • bc09-n01 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade9 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28) • bc01-n01 X86_64 Xeon_3_0 11.9 8 32190M 29313M Yes (X64bit blade blade1 L26 lammpi mem32 mpich2 mpichp4 RH5 tmp100G xeon30)
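The RESOURCES column is what the -R option of bsub matches against, so it can help to filter lshosts output by resource tag. The following is a minimal sketch, not an official RC tool: the function names sample_lshosts and hosts_with_resource are made up for illustration, and the sample rows are copied from the slide so the script runs without LSF installed.

```shell
#!/bin/bash
# Sample rows copied from the lshosts output shown on the slide
# (stand-in data so the sketch runs without LSF).
sample_lshosts() {
cat <<'EOF'
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
bc10-n10 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade10 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc09-n01 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade9 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc01-n01 X86_64 Xeon_3_0 11.9 8 32190M 29313M Yes (X64bit blade blade1 L26 lammpi mem32 mpich2 mpichp4 RH5 tmp100G xeon30)
EOF
}

# hosts_with_resource RESOURCE (hypothetical helper):
# reads lshosts-style text on stdin and prints "host ncpus maxmem"
# for every host whose RESOURCES list contains RESOURCE.
hosts_with_resource() {
  awk -v want="$1" 'NR > 1 {
    for (i = 9; i <= NF; i++) {      # resource tags start at field 9
      tag = $i
      gsub(/[()]/, "", tag)          # strip the surrounding parentheses
      if (tag == want) { print $1, $5, $6; break }
    }
  }'
}

sample_lshosts | hosts_with_resource xeon30
```

On a login node the same filter could presumably be fed real data with `lshosts | hosts_with_resource xeon30`, assuming the column layout matches the slide.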

  17. Accessing Emerald

  18. Logging Into Emerald • UNIX/Linux/OSX • ssh my_onyen@emerald.unc.edu • ssh -l my_onyen emerald.unc.edu • Windows: SSH Secure Shell • X windows software -> shareware.unc.edu • Setting up a Profile for Emerald • Forwarding X11 packets

  19. Head Nodes • Emerald has multiple head nodes or login nodes for • login and basic file manipulation • compiling • testing short (~ <1 min), small memory jobs • Login nodes run the Linux operating system • take the Introduction to Linux class or see some of the many online tutorials if you are unfamiliar with Linux

  20. Home Directory on Emerald • Home directory: /afs/isis/home/m/y/my_onyen/ • 250 MB quota • ~/private/ • Files backed up daily [ ~/OldFiles ] • Space quota/usage in home directory: fs lq

  21. Work Directories on Emerald • No space limit but periodically cleaned • Not backed up!!! • Work directories: /netscr/my_onyen, /nas/my_onyen, /nas2/my_onyen • totals 26.2 TB • /largefs • optimized for large file operations (> 1 MB) • 23 TB • /smallfs • optimized for small file operations (< 1 MB) • 16 TB
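The 1 MB guideline for /largefs versus /smallfs can be turned into a quick check. This is only an illustrative sketch: pick_scratch is a hypothetical helper, and treating 1 MB as 1048576 bytes is an assumption, since the slide does not say which convention is meant.

```shell
#!/bin/bash
# pick_scratch FILE (hypothetical helper): suggest a scratch file system
# for FILE based on its size. The 1 MB threshold comes from the slide;
# counting 1 MB as 1048576 bytes is an assumption.
pick_scratch() {
  local size
  size=$(wc -c < "$1") || return 1
  if [ "$size" -gt 1048576 ]; then
    echo /largefs
  else
    echo /smallfs
  fi
}

# Demonstration on two temporary files of different sizes.
tmpdir=$(mktemp -d)
head -c 2097152 /dev/zero > "$tmpdir/big.dat"   # 2 MB of zero bytes
printf 'x y z\n' > "$tmpdir/small.dat"          # a few bytes
pick_scratch "$tmpdir/big.dat"
pick_scratch "$tmpdir/small.dat"
```

The guideline is about typical file sizes in a workload, so a per-file check like this is a rough aid rather than a rule.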

  22. File Permissions • Your home directory is in AFS space. AFS is a distributed networked file system. • Permissions are determined by ACLs (access control lists) • see Introduction to AFS (http://help.unc.edu/215) • The other file systems (/largefs, /netscr, etc.) are controlled by the usual Linux file permissions • making everything under /netscr/myOnyen accessible: chmod -R a+rX /netscr/myOnyen
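To see what chmod -R a+rX actually changes, it can be tried on a throwaway directory first. The sketch below uses a temporary directory as a stand-in for /netscr/myOnyen; the file names are made up, and the stat -c syntax assumes GNU coreutils as found on Linux.

```shell
#!/bin/bash
# Build a small private tree in a temporary directory
# (stand-in for /netscr/myOnyen; names are illustrative only).
demo=$(mktemp -d)
mkdir "$demo/results"
touch "$demo/results/data.txt" "$demo/results/run.sh"
chmod 700 "$demo" "$demo/results" "$demo/results/run.sh"
chmod 600 "$demo/results/data.txt"

# a+r : everyone may read.
# X   : add execute only to directories and to files that are
#       already executable by someone, so data files stay non-executable.
chmod -R a+rX "$demo"

# Show the resulting octal modes (GNU stat syntax).
stat -c '%a %n' "$demo/results" "$demo/results/data.txt" "$demo/results/run.sh"
```

The capital X is the reason the directory and the script end up world-executable while the plain data file only becomes world-readable.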

  23. Mass Storage • access via ~/ms • looks like ordinary disk file system – data is actually stored on tape • “limitless” capacity • data is backed up • For storage only, not a work directory (i.e. don’t run jobs from here) • if you have many small files, use tar or zip to create a single file for better performance “To infinity … and beyond” - Buzz Lightyear
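The advice to combine many small files before moving them to mass storage might look like the following sketch. bundle_dir and the run42 directory are hypothetical names, and the demonstration works in a temporary directory rather than touching ~/ms.

```shell
#!/bin/bash
# bundle_dir SRC_DIR ARCHIVE (hypothetical helper):
# pack SRC_DIR into a single gzip-compressed tar file.
bundle_dir() {
  local src=$1 archive=$2
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# Demonstration with 100 small files in a temporary directory.
work=$(mktemp -d)
mkdir "$work/run42"
for i in $(seq 1 100); do
  echo "result $i" > "$work/run42/out_$i.txt"
done

bundle_dir "$work/run42" "$work/run42.tar.gz"

# Count the archived files to confirm everything made it in.
tar -tzf "$work/run42.tar.gz" | grep -c 'out_.*\.txt$'
```

The single archive would then be copied into ~/ms, for example with `cp "$work/run42.tar.gz" ~/ms/`, and unpacked in a work directory when the data is needed again.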

  24. Job Scheduling and Management LSF

  25. What does a Job Scheduler and batch system do? Manage Resources • allocate user tasks to resource • monitor tasks • process control • manage input and output • report status, availability, etc • enforce usage policies

  26. LSF • All Research Computing clusters use LSF to do job scheduling and management • LSF (Load Sharing Facility) is a (licensed) product from Platform Computing • Fairly distribute compute nodes among users • enforce usage policies for established queues • most common queues: int, now, week, month • RC uses Fair Share scheduling, not first come, first served (FCFS) • LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, … • see man pages for much more info!

  27. Simplified view of LSF • User logged in to login node submits job: bsub -R X64bit -q week myjob • Job routed to queue (jobs queued: job_J, job_F, myjob, job_7) • Job dispatched to run on an available host that satisfies the job requirements

  28. Common batch commands • bsub - submit jobs • bqueues - view info on defined queues • bqueues -l week • bkill - stop/cancel submitted job • bjobs - view submitted jobs • bjobs -u all • bhist - job history • bhist -l <jobID> • bhosts - status and resources of hosts (nodes)

  29. Common batch commands • bpeek - display output of running job • Use man pages to get much more info! • man bjobs • bfree - query LSF to find job slots currently available that fit your resource requirement • this is an RC command extension • bfree -help (or -h) • jobmon - monitor changes in job status • this is an RC command, typically run in a separate window

  30. Submitting Jobs: bsub Command • Submit jobs with bsub • All files must be in scratch space, e.g. /netscr, /largefs, /smallfs • Home directory is not mounted on compute nodes • bsub [-bsub_opts] executable [-exec_opts]

  31. bsub continued • Common bsub options: • -o <filename> • -o out.%J • -q <queue name> • -q now • -R "resource specification" • -R xeon30 • -n <number of processes> • used for parallel, MPI jobs • -a <application specific esub> • -a mpichp4 (used on MPI jobs)

  32. Two methods to submit jobs: • bsub example: submit the executable, myexe, to the week queue to run on a 64-bit Linux OS and redirect output to the file out.<jobID> (default is to mail output) • Method 1: Command Line • bsub -q week -R X64bit -o out.%J myexe • Method 2: Create a file (details to follow) called, for example, myexe.bsub, and then submit that file. Note the redirect symbol, < • bsub < myexe.bsub

  33. Method 2 cont. • The file you submit contains all the bsub options you want, so for this example myexe.bsub will look like this • #BSUB -q week • #BSUB -o out.%J • #BSUB -R X64bit • myexe • This is actually a shell script, so the top line could be the usual #!/bin/csh, etc., and you can run any commands you like. • if this doesn't mean anything to you, then never mind :)
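A batch file like the one on this slide can also be generated from a small helper, which is handy when the same job is submitted to different queues. This is a sketch only: write_bsub_file is a hypothetical function, the #!/bin/bash first line is one choice among the shells the slide allows, and the script stops short of calling bsub so that it runs on any machine.

```shell
#!/bin/bash
# write_bsub_file FILE QUEUE RESOURCE EXECUTABLE (hypothetical helper)
# Writes an LSF batch file with the same three options used on the slide.
write_bsub_file() {
  local file=$1 queue=$2 resource=$3 exe=$4
  cat > "$file" <<EOF
#!/bin/bash
#BSUB -q $queue
#BSUB -o out.%J
#BSUB -R $resource
$exe
EOF
}

batch=$(mktemp)
write_bsub_file "$batch" week X64bit myexe
cat "$batch"
# On a login node the file would then be submitted with:
#   bsub < "$batch"
```

Because the #BSUB lines are comments to the shell, the same file remains an ordinary script; LSF reads the options only when the file is fed to bsub with the < redirect.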

  34. Parallel Job example • Batch Command Line Method • bsub -q week -o out.%J -n 30 -a mpichp4 mpirun.lsf myParallelExe • Batch File Method • bsub < myexe.bsub • where myexe.bsub will look like this • #BSUB -q week • #BSUB -o out.%J • #BSUB -a mpichp4 • mpirun.lsf myexe
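The command-line form on this slide passes -n 30, while the batch file as shown does not list a process count. Since a batch file can carry any bsub option, a file meant to mirror the command line would presumably include a #BSUB -n line as well. The sketch below prints such a file; mpi_bsub_script is a hypothetical helper and nothing is submitted.

```shell
#!/bin/bash
# mpi_bsub_script NPROCS EXECUTABLE (hypothetical helper)
# Prints a batch file for an MPI job; NPROCS becomes the -n option.
mpi_bsub_script() {
  local nprocs=$1 exe=$2
  # Reject an empty or non-numeric process count.
  case $nprocs in
    ''|*[!0-9]*) echo "NPROCS must be a whole number" >&2; return 1 ;;
  esac
  printf '#BSUB -q week\n#BSUB -o out.%%J\n#BSUB -n %s\n#BSUB -a mpichp4\nmpirun.lsf %s\n' \
    "$nprocs" "$exe"
}

mpi_bsub_script 30 myParallelExe
```

Redirecting the output to a file, for example `mpi_bsub_script 30 myParallelExe > myexe.bsub`, gives something that could be submitted with `bsub < myexe.bsub`.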

  35. Submitting Jobs: Specialty Scripts • Running a SAS job through batch (2 ways) • bsub -q week -R blade sas program.sas • bsas test.sas • Running a Matlab job through batch (2 ways) • bsub -q week -R blade matlab -nodisplay -nojvm -nosplash program.m -logfile program.log • bmatlab test.m

  36. Interactive Jobs: Setup X-Windows • Linux/OSX: X11 client • Windows: X-Win32 • Offered on UNC Software Acquisition site: https://shareware.unc.edu • Port forwarding on SSH Secure Shell • Setting up a session on X-Win32

  37. Interactive Jobs: Submission • -Ip or -Is • bsub -q int -R blade -Ip sas • bsub -q int -R blade -Ip gv • bsub -q int -R blade -Ip matlab • bsub -q int -Is tcsh • Specialty Scripts • xsas • xstata

  38. Software

  39. Licensed Software • Over 20 licensed software applications (some are site licensed, others restricted) • Matlab, Maple, Mathematica, Gaussian, Accelrys Materials Studio and Discovery Studio modules, Sybyl, Schrodinger, SAS, Stata, ArcGIS, NAG, IMSL, Totalview, and more. • Compilers (licensed and otherwise) • Intel, PGI, Absoft, GNU, IBM • Numerous other packages provided for research and technical computing, including BLAST, PyMol, SOAP, PLINK, NWChem, R, Cambridge Structural Database, Amber, Gromacs, Petsc, Scalapack, Netcdf, Babel, Qt, Ferret, Gnuplot, Grace, iRODS, XCrySDen, and more.

  40. Available Software • Most of the software is installed under AFS and is made available through package space. • AFS (Andrew File System) is a distributed networked file system. Your home directory and software packages are mounted in AFS space. • Changes made to your package space are preserved over login sessions.

  41. Package Space • Use ipm (Isis Package Manager) to manage your packages. • ipm commands • ipm add (ipm a) • ipm remove (ipm r) • ipm query (ipm q) • Available packages • http://help.unc.edu/1689 • man ipm

  42. Compiling

  43. Compiling on Emerald • Compilers • FORTRAN 77/90/95 • C/C++ • Parallel Computing • MPI (MPICH, LAM/MPI, MPICH-GM) • OpenMP

  44. Compiling Details on Emerald

  45. Compiling MPI programs • Use the MPI wrappers to compile your program • mpicc, mpiCC, mpif90, mpif77 • the wrappers will find the appropriate include files and libraries and then invoke the actual compiler • for example, mpicc will invoke either gcc, icc, or pgcc depending upon which package you have loaded

  46. Compiling Details on Emerald • Add a compiler into your working environment • ipm add package_name • Compile a code • command code.c -o executable • Run executable on a compute node using the bsub command • bsub -q week -R blade executable

  47. Contacting Research Computing • Questions? • For assistance with Emerald, please contact the Research Computing Group: • Email: research@unc.edu • Phone: 919-962-HELP • Submit help ticket at http://help.unc.edu
