Introduction to HPC resources for BCB 660

Introduction to HPC resources for BCB 660. Nirav Merchant nirav@email.arizona.edu www.iplantcollaborative.org. Topic Coverage. What is Parallel Computing ? General overview of HPC systems Overview of batch system (and why we need them) Getting started with Ranger


Presentation Transcript


  1. Introduction to HPC resources for BCB 660 Nirav Merchant nirav@email.arizona.edu www.iplantcollaborative.org

  2. Topic Coverage • What is parallel computing? • General overview of HPC systems • Overview of batch systems (and why we need them) • Getting started with Ranger • Understanding the default user environment • Introduction to modules (and why we need them) • Submitting your first job (and monitoring it) • Moving your data in and out of HPC systems • Q/A

  3. What is computing? • von Neumann Architecture • Named after the Hungarian mathematician John von Neumann, who first authored the general requirements for an electronic computer in his 1945 papers. • Since then, virtually all computers have followed this basic design of: • Memory (RAM) • Control Unit (CPU) • Arithmetic Logic Unit (ALU) • Input/Output (e.g., keyboard)

  4. What does it look like (your computer)? Image courtesy Univ. of Washington

  5. What is Parallel Computing? A good introduction to concepts for parallel programming is at: https://computing.llnl.gov/tutorials/parallel_comp/ Parallel computing: use of multiple processors or computers working together on a common task. • Each processor works on part of the problem • Processors can exchange information

  6. Why we need it • Traditional software is written to execute serially, i.e. one task at a time running on one CPU • As the size of data (tasks) increases, we need to utilize multiple CPUs • Size of data also has implications for how much RAM and disk space the task requires (we may need more RAM or disk than fits on one computer)

  7. HPC systems: Not very different Image courtesy TACC at Univ of Texas

  8. Some Terminology (Jargon) of HPC • HPC: High Performance Computing = supercomputing • Node: one self-contained computer (many of which are connected together to form a “cluster”) • CPU, socket, processor, core: often used loosely as synonyms; strictly, a socket holds one processor, which contains one or more cores • Interconnect: networking between nodes (can be fiber optic, or regular Ethernet like your computers), e.g. InfiniBand or GigE

  9. More Terminology (Jargon) of HPC Scalability: Ability to use additional resources to execute tasks faster Embarrassingly Parallel: Data Parallel tasks where each task is independent and not much communication or coordination is required among tasks Observed Speedup: “wall time” taken for serial task divided by wall time for parallel task
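
As a rough worked example of the observed-speedup definition above, the sketch below divides a serial wall time by a parallel wall time. The timing values and variable names are made up for illustration, and the efficiency figure (speedup divided by core count) is an extra metric that the slides do not mention.

```shell
#!/bin/bash
# Hypothetical wall-clock times in seconds (illustrative values, not measurements).
serial_wall=3600
parallel_wall=300
cores=16

# Observed speedup = serial wall time / parallel wall time.
speedup=$(awk -v s="$serial_wall" -v p="$parallel_wall" 'BEGIN { printf "%.2f", s / p }')

# Parallel efficiency = speedup / number of cores (assumption: not defined in the slides).
efficiency=$(awk -v x="$speedup" -v c="$cores" 'BEGIN { printf "%.2f", x / c }')

echo "speedup=$speedup efficiency=$efficiency"
```

With these made-up numbers the task runs 12 times faster on 16 cores, which is an efficiency of 0.75; a perfectly scalable task would reach 1.0.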

  10. Types of HPC • Shared memory • All CPUs (processors) have access to shared RAM • Distributed memory • Each CPU (processor) has its own local memory, but can be connected to other nodes via a fast interconnect

  11. Again, why do we need it? • Limits of single CPU computing • Performance • Available memory (disk and RAM) • Parallel computing allows one to: • Execute tasks that don’t fit on a single CPU • Complete tasks in a reasonable time • Again, please check https://computing.llnl.gov/tutorials/parallel_comp/ for a basic intro to parallel computing concepts

  12. RANGER • Compute power • 504 Teraflops • 3,936 four-socket nodes • 62,976 cores, 2.0 GHz AMD Opteron • Memory • 125 Terabytes • 2 GB/core, 32 GB/node • Disk subsystem • 1.7 PB Storage (Lustre Parallel File System) • 1 PB in /work filesystem • Interconnect • 8 Gb/s InfiniBand • Lonestar and other machines have similar (or much larger) specs

  13. Filesystem Access • HOME • Store your source code and build your executables here • Use $HOME to reference your home directory in scripts • WORK • Store large files here • This file system is NOT backed up, use $ARCHIVE for important files! • Use $WORK to reference this directory in scripts • SCRATCH • Store large input or output files here – TEMPORARILY • This file system is NOT backed up, use $ARCHIVE for important files! • Use $SCRATCH to reference this directory in scripts • ARCHIVE • Massive, long-term storage and archive system • Check with staff before using this on your account
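
A minimal sketch of how these variables might be used in a script: write temporary output under $SCRATCH, then copy what you want to keep into $WORK. The directory and file names are hypothetical, and the fallback to temporary directories is only there so the sketch also runs on a machine where these variables are not set.

```shell
#!/bin/bash
# On TACC systems WORK and SCRATCH are set for you; the fallbacks below
# (temporary directories) are an assumption so the sketch runs anywhere.
: "${WORK:=$(mktemp -d)}"
: "${SCRATCH:=$(mktemp -d)}"

# Hypothetical directory names for one analysis run.
run_dir="$SCRATCH/bcb660_run"
keep_dir="$WORK/bcb660_results"
mkdir -p "$run_dir" "$keep_dir"

# Stand-in for a real analysis step that writes large temporary output.
echo "placeholder output" > "$run_dir/result.txt"

# Copy the files worth keeping out of scratch space.
cp "$run_dir/result.txt" "$keep_dir/"
echo "kept: $keep_dir/result.txt"
```

Neither location is backed up, so anything important still belongs in the archive system described above.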

  14. Limits on your filesystem

  15. How is it connected

  16. MUST READ THIS Please visit the TACC new user guide for RANGER You will pick up many hints that will make your life MUCH easier for running tasks on TACC resources http://www.tacc.utexas.edu/user-services/user-guides/ranger-user-guide http://goo.gl/0xyN5 (same as above)

  17. Batch, Module system • With multiple users we need a way to organize tasks • We need a way to assign suitable resources to the tasks (track, prioritize) • With multiple software packages we need a way to deal with version and dependency conflicts per task • The batch scheduler used on all TACC systems is SGE (Sun Grid Engine), now owned by Oracle.

  18. Batch submission

  19. RANGER: Queue Options

  20. Common SGE commands

  21. Let's get working ssh trainXXX@ranger.tacc.utexas.edu

  22. Module Commands

  23. Compbio stack/modules

  24. But my favorite app is … • Modules are for global use, so cutting-edge code is hard to get as a module (limited staff time) • You can always compile and use your own versions without waiting for a module to be built • When possible, build your applications from source rather than running pre-compiled binaries • If you choose to use “make install”, pass --prefix to the “configure” script to change where it is installed: ./configure --prefix=$HOME/bin • For best performance, use the Intel compilers • For best compatibility, use the gcc compilers • More in the “bleeding edge s/w” slide
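
The build steps above can be sketched as a small helper, assuming the software ships with a standard autotools-style configure script. The function name and the source directory in the usage comment are hypothetical; only the --prefix value comes from the slide.

```shell
#!/bin/bash
# Install prefix inside the home directory, as on the slide.
prefix="$HOME/bin"

# Hypothetical helper: configure, build and install one source tree under $prefix.
# The subshell keeps the caller's working directory unchanged.
build_from_source() {
    local src_dir="$1"
    ( cd "$src_dir" && ./configure --prefix="$prefix" && make && make install )
}

# Usage (the directory name is a placeholder):
#   build_from_source "$HOME/src/myapp-1.0"
```

Packages built this way typically place executables in a bin directory under the prefix, so with this prefix they would usually end up in $HOME/bin/bin; check where the files landed before adding a directory to your PATH.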

  25. Preparing for tasks • Number of cores and nodes to use is set with: #$ -pe Nway 16*M • N represents the number of cores to utilize per node (Ranger: 1≤N≤16, Lonestar: 1≤N≤12) • M is the number of nodes to utilize • The TOTAL number of cores used is thus: N*M
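
To make the N and M arithmetic concrete, the sketch below computes the values for a hypothetical request and writes them into a small SGE job script. The job name, the queue name (development), the run-time limit, and the echo command are placeholders chosen for illustration; check the Ranger user guide for the queues and limits that apply to your account.

```shell
#!/bin/bash
# N = tasks per node, M = number of nodes (illustrative values).
tasks_per_node=8
nodes=4

# Following the slide: the second argument of -pe is 16*M,
# while the number of cores actually used is N*M.
slot_count=$((16 * nodes))
total_tasks=$((tasks_per_node * nodes))

# Write a job script. Lines starting with "#$" are read by SGE as options;
# the backslashes stop this shell from expanding "$" while writing the file.
cat > example_job.sge <<EOF
#!/bin/bash
#\$ -N example_job
#\$ -cwd
#\$ -j y
#\$ -o \$JOB_NAME.o\$JOB_ID
#\$ -q development
#\$ -l h_rt=00:10:00
#\$ -pe ${tasks_per_node}way ${slot_count}
echo "Running ${total_tasks} tasks on ${nodes} nodes"
EOF

echo "total tasks: ${total_tasks}"
```

With N=8 and M=4 the generated line is "#$ -pe 8way 64" and 32 cores are used. The file would then be submitted with qsub example_job.sge, monitored with qstat, and cancelled with qdel followed by the job ID.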

  26. Preparing a job submission

  27. Some more SGE options

  28. Working with bleeding edge s/w http://genomics.tacc.utexas.edu/projects/ls4compbio/wiki http://goo.gl/QYnIo (same URL as above, just shortened) Let's look at the tutorial section towards the end of the page

  29. More from that page

  30. Getting data in and out • SCP will work well for most smaller files • Specialized options (bbcp and gridftp) need special end point installation • As files get larger (10 GB+), moving them around gets time consuming • Easier to move your data into the iPlant data store from your desktop/server (parallel transfers) • Pull that data to where you need it (and push more into it) • Command line and GUI options (including dropbox for science)
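
For the SCP case, a minimal sketch is shown below. The helper names and file names are hypothetical, and trainXXX stands for whatever login you were given; the host name is the one used earlier in the slides.

```shell
#!/bin/bash
# Replace trainXXX with your own login.
remote="trainXXX@ranger.tacc.utexas.edu"

# Copy one local file to your home directory on the remote system.
push_file() {
    scp "$1" "${remote}:"
}

# Copy one remote file (path relative to the remote home directory)
# into the current local directory.
pull_file() {
    scp "${remote}:$1" .
}

# Usage (file names are placeholders):
#   push_file reads.fastq
#   pull_file results.tar.gz
```

Both helpers are meant to be run from your desktop or server, not from the HPC system itself.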

  31. iPlant data store Details at: http://goo.gl/4xzhA Connecting from RANGER: • module load irods • iinit • Answer the prompts using info from the above link • You are now connected (without future need of passwords to the iPlant data store)

  32. From RANGER After loading the irods module, i.e., module load irods
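
The slide image is not reproduced in this transcript, so the sketch below is not necessarily what it showed. It wraps three standard iRODS icommands (ils, iput, iget) in hypothetical helper functions and assumes module load irods and iinit have already been run in the session.

```shell
#!/bin/bash
# Hypothetical helpers around the iRODS icommands.
# Assumes "module load irods" and "iinit" were run beforehand.

# List the contents of the current collection in the data store.
list_store() {
    ils
}

# Upload one local file into the current collection.
upload_to_store() {
    iput "$1"
}

# Download one file from the current collection into the local directory.
download_from_store() {
    iget "$1"
}

# Usage (file names are placeholders):
#   upload_to_store results.tar.gz
#   list_store
#   download_from_store reads.fastq
```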

  33. Parametric Launcher • You have many tasks that you want to run and they are naturally parallel (“embarrassingly parallel”) • Parametric Job Launcher: a simple utility for submitting multiple serial applications simultaneously. • % module load launcher • 2 key components: • paramlist: execution commands • launcher.sge: job submission script
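
A paramlist can be generated with a short loop rather than typed by hand. The sketch below assumes the file holds one complete command per line; the sample names and the ./my_analysis command with its options are hypothetical stand-ins for your own serial application.

```shell
#!/bin/bash
# Hypothetical sample names; each one becomes an independent serial task.
samples=(sampleA sampleB sampleC)

# Start with an empty paramlist, then append one command line per sample.
: > paramlist
for s in "${samples[@]}"; do
    echo "./my_analysis --input ${s}.fastq --output ${s}.out" >> paramlist
done

# Show how many tasks were written.
task_count=$(wc -l < paramlist | tr -d ' ')
echo "tasks in paramlist: ${task_count}"
```

Because every line is independent of the others, this is the embarrassingly parallel case described earlier; the launcher.sge script then takes care of spreading the lines across the requested cores.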

  34. Parametric Launcher Check http://genomics.tacc.utexas.edu/projects/ls4compbio/wiki/TACC_NGS_Course_Practical_1 http://goo.gl/YBHKx Look at shrimp_launcher.sge for ideas

  35. Gratitude TACC Staff for slides Matt Vaughn Michael Gonzalez And many more URL http://www.tacc.utexas.edu/user-services/user-guides/
