
Using Kure and Topsail

Mark Reed

Grant Murphy

Charles Davis

ITS Research Computing

Outline
  • Compute Clusters
    • Topsail
    • Kure
  • Logging In
  • File Spaces
  • User Environment and Applications, Compiling
  • Job Management
Logistics
  • Course Format
  • Lab Exercises
  • Breaks
  • UNC Research Computing
  • Getting started Topsail page
  • Getting started Kure page
What is a compute cluster?

Some Typical Components

  • Compute Nodes
  • Interconnect
  • Shared File System
  • Software
  • Operating System (OS)
  • Job Scheduler/Manager
  • Mass Storage
Compute Cluster Advantages
  • fast interconnect, tightly coupled
  • aggregated compute resources
  • large (scratch) file spaces
  • installed software base
  • scheduling and job management
  • high availability
  • data backup
Initial Topsail Cluster
  • Initially: 1040 CPU Dell Linux Cluster
    • 520 dual socket, single core nodes
  • Infiniband interconnect
  • Intended for capability research
  • Housed in ITS Franklin machine room
  • Fast and efficient for large computational jobs
Topsail Upgrade 1
  • Topsail upgraded to 4,160 CPU
    • replaced blades with dual socket, quad core
  • Intel Xeon 5345 (Clovertown) Processors
    • Quad-Core with 8 CPU/node
  • Increased number of processors, but decreased individual processor speed (was 3.6 GHz, now 2.33)
  • Decreased energy usage and necessary resources for cooling system
  • Summary: slower clock speed, better memory bandwidth, less heat, quadrupled the core count
    • Benchmarks tend to run at the same speed per core
    • Topsail shows a net ~4X improvement
    • Of course, this number is VERY application dependent
Topsail – Upgraded blades
  • 52 Chassis: Basis of node names
    • Each holds 10 blades -> 520 blades total
    • Nodes = cmp-chassis#-blade#
  • Old Compute Blades: Dell PowerEdge 1855
    • 2 single-core Intel Xeon EM64T 3.6 GHz procs
    • 800 MHz FSB
    • 2MB L2 Cache per socket
    • Intel NetBurst MicroArchitecture
  • New Compute Blades: Dell PowerEdge 1955
    • 2 quad-core Intel 2.33 GHz procs
    • 1333 MHz FSB
    • 4MB L2 Cache per socket
    • Intel Core 2 MicroArchitecture
Topsail Upgrade 2
  • Most recent Topsail upgrade (Feb/Mar ‘09)
  • Refreshed much of the infrastructure
  • Improved IBRIX filesystem
  • Replaced and improved Infiniband cabling
  • Moved cluster to ITS-Manning building
    • Better cooling and UPS
Top 500 History
  • The Top 500 list comes out twice a year
    • ISC conference in June
    • SC conference in Nov
  • Topsail debuted at 74 in June 2006
  • Peaked at 25 in June 2007
  • Still in the Top 500
Current Topsail Architecture
  • Login node: 8 CPU @ 2.3 GHz Intel EM64T, 12 GB memory
  • Compute nodes: 4,160 CPU @ 2.3 GHz Intel EM64T, 12 GB memory
  • Shared disk: 39 TB IBRIX Parallel File System
  • Interconnect: Infiniband 4x SDR
  • 64bit Linux Operating System
Multi-Core Computing
  • Processor Structure on Topsail
    • 500+ nodes
    • 2 sockets/node
    • 1 processor/socket
    • 4 cores/processor (Quad-core)
    • 8 cores/node
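
The processor structure above multiplies out to the core counts quoted on the earlier slides. A minimal sketch of that arithmetic, assuming the 52 chassis of 10 blades from the "Upgraded blades" slide (the variable names are only illustrative):

```shell
#!/bin/sh
# Values from the slides: 52 chassis x 10 blades, 2 sockets per node,
# 4 cores per (quad-core) processor.
chassis=52
blades_per_chassis=10
sockets_per_node=2
cores_per_socket=4

nodes=$((chassis * blades_per_chassis))
cores_per_node=$((sockets_per_node * cores_per_socket))
total_cores=$((nodes * cores_per_node))

echo "nodes:          $nodes"
echo "cores per node: $cores_per_node"
echo "total cores:    $total_cores"
```

This gives 520 nodes, 8 cores per node, and 4,160 cores in total, which matches the "4,160 CPU" figure for the upgraded Topsail.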
Multi-Core Computing
  • The trend in High Performance Computing is towards multi-core or many-core computing.
  • More cores at slower clock speeds for less heat
  • Now, dual- and quad-core processors are becoming common.
  • Soon 64+ core processors will be common
    • And these may be heterogeneous!
The Heat Problem

Taken From: Jack Dongarra, UT

More Parallelism

Taken From: Jack Dongarra, UT

Infiniband Connections
  • Connections come in single (SDR), double (DDR), and quad (QDR) data rates.
    • Topsail is SDR.
  • Single data rate is 2.5 Gbit/s in each direction per link.
  • Links can be aggregated - 1x, 4x, 12x.
    • Topsail is 4x.
  • Links use 8B/10B encoding —10 bits carry 8 bits of data — useful data transmission rate is four-fifths the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s respectively.
  • Data rate for Topsail is 8 Gbit/s, i.e. about 1 GB/s (4x SDR: four links at 2 Gbit/s each).
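
The encoding and aggregation figures above can be checked with a few lines of shell arithmetic. This is a sketch only: the function name ib_data_rate is made up for illustration, and rates are kept in Mbit/s so that integer arithmetic is enough. It assumes the raw quad data rate is four times the 2.5 Gbit/s single rate, as the 2, 4, 8 Gbit/s figures imply.

```shell
#!/bin/sh
# ib_data_rate RAW_MBIT_PER_LINK LINKS
# Prints the useful data rate in Mbit/s after 8B/10B encoding
# (8 data bits carried in every 10 transmitted bits).
ib_data_rate() {
    raw_mbit=$1
    lanes=$2
    echo $(( raw_mbit * lanes * 8 / 10 ))
}

sdr_4x=$(ib_data_rate 2500 4)    # Topsail: 4x SDR
qdr_4x=$(ib_data_rate 10000 4)   # Kure: 4x QDR

echo "4x SDR: $sdr_4x Mbit/s = $((sdr_4x / 8)) MB/s"
echo "4x QDR: $qdr_4x Mbit/s = $((qdr_4x / 8)) MB/s"
```

For 4x SDR this works out to 8000 Mbit/s, or 1000 MB/s per direction. For 4x QDR it is four times that.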
Infiniband Benchmarks
  • Point-to-point (PTP) intranode communication on Topsail for various MPI send types
  • Peak bandwidth:
    • 1288 MB/s
  • Minimum Latency (1-way):
    • 3.6 µs
Infiniband Benchmarks
  • Scaled aggregate bandwidth for MPI Broadcast on Topsail
  • Note good scaling throughout the tested range (from 24-1536 cores)
Kure
  • The newest, “latest and greatest” compute cluster in RC
  • Named after the beach in North Carolina
  • It’s pronounced like the Nobel prize winning physicist and chemist, Madame Curie
Kure Compute Cluster
  • Heterogeneous Research Cluster
  • Hewlett Packard Blades
  • 79 Compute Nodes, mostly
    • Xeon 5560 2.8 GHz
    • Nehalem Microarchitecture
    • Dual socket, quad core
    • 48 GB memory
    • over 600 cores
    • some higher memory nodes
  • Infiniband 4x QDR
  • priority usage for patrons
    • Buy-in is cheap
  • Storage
    • Scratch space same as Emerald
    • No AFS home
Kure Cont.
  • The current configuration of Kure is mostly homogeneous but it will become increasingly heterogeneous as patrons and others add to it.
  • Most nodes have 48 GB of memory, but there are currently four high-memory nodes:
  • 2 nodes each with 128 GB of memory
  • 2 nodes each with 96 GB of memory
Topsail/Kure Comparison




Topsail
  • homogeneous
  • 4000+ cores
  • 2.33 GHz cores, Intel Core microarch.
  • 12 GB memory/node
  • IB 4x SDR interconnect

Kure
  • 600+ cores
  • 2.8 GHz cores, Intel Nehalem microarch.
  • 48 GB memory/node
  • IB 4x QDR interconnect
Login to Topsail/Kure
  • Use ssh to connect:
  • SSH Secure Shell with Windows
    • see
  • For use with X-Windows Display:
    • ssh -X <hostname>
    • ssh -Y <hostname>
  • Off-campus users (i.e. connecting from domains outside the campus network) must use a VPN connection
Topsail File Space
  • Home directories
    • /ifs1/home/<onyen>
    • home directories over 15 GB are not backed up
  • Scratch Space
    • /ifs1/scr/<onyen>
    • over 39 TB of scratch space
    • run jobs with large output in this space
  • Mass Storage
    • ~/ms
Kure File Space
  • Home directories
    • /nas02/home/<a>/<b>/<onyen>
      • a = first letter of onyen, b = second letter of onyen
    • hard limit of 15 GB
  • Scratch Space – still evolving
    • /nas – to be upgraded to 15 TB
    • /largefs – to be upgraded to 30 TB
    • run jobs with large output in these spaces
  • Mass Storage
    • ~/ms
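
The Kure home directory layout above can be derived from an onyen with a short helper. This is a sketch; the function name kure_home and the onyen "jdoe" are made up for illustration.

```shell
#!/bin/sh
# kure_home ONYEN
# Prints the Kure home directory path for the given onyen,
# following the /nas02/home/<a>/<b>/<onyen> layout from the slide.
kure_home() {
    onyen=$1
    a=$(printf '%s' "$onyen" | cut -c1)   # first letter of the onyen
    b=$(printf '%s' "$onyen" | cut -c2)   # second letter of the onyen
    printf '/nas02/home/%s/%s/%s\n' "$a" "$b" "$onyen"
}

kure_home jdoe     # "jdoe" is a made-up onyen
```

For the made-up onyen jdoe this prints /nas02/home/j/d/jdoe.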
Mass Storage
  • long term archival storage
  • access via ~/ms
  • looks like ordinary disk file system – data is actually stored on tape
  • “limitless” capacity
  • data is backed up
  • For storage only, not a work directory (i.e. don’t run jobs from here)
  • if you have many small files, use tar or zip to create a single file for better performance
  • Sign up for this service on

“To infinity … and beyond” - Buzz Lightyear
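
The advice to bundle many small files before archiving can be sketched as follows. The directory and file names are placeholders, and the example works in a temporary directory so that it can be tried anywhere. The final copy to ~/ms is left as a comment because it only makes sense on the cluster.

```shell
#!/bin/sh
# Work in a throwaway directory so the sketch can be run anywhere.
workdir=$(mktemp -d)
mkdir "$workdir/results"

# Stand-in for many small output files.
i=1
while [ "$i" -le 100 ]; do
    echo "sample $i" > "$workdir/results/run_$i.txt"
    i=$((i + 1))
done

# Bundle the directory into a single compressed archive.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# Count the regular files stored in the archive (directory entries end in "/").
file_count=$(tar -tzf "$workdir/results.tar.gz" | grep -c -v '/$')
echo "archived $file_count files into results.tar.gz"

# On the cluster, the single archive would then be copied to mass storage:
#   cp "$workdir/results.tar.gz" ~/ms/
```

One large file is written to tape far more efficiently than a hundred small ones, which is the point of the tar/zip recommendation.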

  • The user environment is managed by modules
  • Modules modify the user environment by modifying and adding environment variables such as PATH or LD_LIBRARY_PATH
  • Typically you set these once and leave them
  • Note there are two module settings, one for your current environment and one to take effect on your next login (e.g. batch jobs running on compute nodes)
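
To make the effect of a module concrete, the following sketch does by hand roughly what adding a module does to the environment: it prepends a package's directories to PATH and LD_LIBRARY_PATH. The install location /opt/example/mytool-1.0 is hypothetical and is not a real path on Topsail or Kure.

```shell
#!/bin/sh
# Hypothetical install location of a package; not a real path on Topsail/Kure.
pkg_root=/opt/example/mytool-1.0

# Roughly what adding a module does: prepend the package's directories.
PATH="$pkg_root/bin:$PATH"
LD_LIBRARY_PATH="$pkg_root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH LD_LIBRARY_PATH

echo "PATH now starts with:            ${PATH%%:*}"
echo "LD_LIBRARY_PATH now starts with: ${LD_LIBRARY_PATH%%:*}"
```

In practice the module commands make these changes for you and can undo them again, so there is normally no need to edit the variables by hand.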
Common Module Commands
  • module avail
    • module avail apps
  • module help
  • module list
  • module add
  • module rm

Login version

  • module initlist
  • module initadd
  • module initrm

For more on modules, see

Parallel Jobs with MPI
  • There are three implementations of the MPI standard installed:
    • mvapich
    • mvapich2 (currently only on topsail)
    • openmpi
  • Performance is similar for all three, and all three run on the IB fabric. Mvapich is the default. Openmpi and mvapich2 have more of the MPI-2 features implemented.
Compiling MPI programs
  • Use the MPI wrappers to compile your program
    • mpicc, mpiCC, mpif90, mpif77
    • the wrappers will find the appropriate include files and libraries and then invoke the actual compiler
    • for example, mpicc will invoke either gcc or icc depending upon which module you have loaded
Compiling on Topsail/Kure
  • Serial Programming
    • Intel Compiler Suite for Fortran77, Fortran90, C and C++ (recommended by Research Computing)
      • icc, icpc, ifort
    • GNU
      • gcc, g++, gfortran
  • Parallel Programming
    • MPI (see previous page)
    • OpenMP
      • Compiler flag: -openmp for Intel, -fopenmp for GNU
      • Must set OMP_NUM_THREADS in submission script
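
The OpenMP settings above can be sketched as a small helper that picks the flag for the compiler in use. The helper name openmp_flag and the source file hello_omp.c are made up for illustration, and the compile lines are printed rather than run. The flags are those given on the slide; more recent Intel compilers spell theirs -qopenmp.

```shell
#!/bin/sh
# openmp_flag COMPILER
# Prints the OpenMP flag named on the slide for the given compiler.
openmp_flag() {
    case "$1" in
        icc|icpc|ifort)    echo "-openmp" ;;   # Intel
        gcc|g++|gfortran)  echo "-fopenmp" ;;  # GNU
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the compile lines; hello_omp.c is a made-up file name.
echo "icc $(openmp_flag icc) -o hello_omp hello_omp.c"
echo "gcc $(openmp_flag gcc) -o hello_omp hello_omp.c"

# In the submission script, set the thread count before running the program.
# 8 matches the 8 cores per node on the current Topsail and Kure nodes.
export OMP_NUM_THREADS=8
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

OpenMP threads share memory, so a thread count above the number of cores on one node usually brings no benefit.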
Debugging - Totalview
  • If you are debugging code there is a powerful commercial debugger, totalview
  • See
  • parallel and serial code
  • Fortran/C/C++
  • GUI for source level control
  • too many features to list!
What does a Job Scheduler and batch system do?

Manage Resources

  • allocate user tasks to resource
  • monitor tasks
  • process control
  • manage input and output
  • report status, availability, etc
  • enforce usage policies
Job Scheduling Systems
  • Allocates compute nodes to job submissions based on user priority, requested resources, execution time, etc.
  • Many types of schedulers
    • Load Sharing Facility (LSF) – Used by Topsail/Kure
    • IBM LoadLeveler
    • Portable Batch System (PBS)
    • Sun Grid Engine (SGE)
  • All Research Computing clusters use LSF to do job scheduling and management
  • LSF (Load Sharing Facility) is a (licensed) product from Platform Computing
    • Fairly distribute compute nodes among users
    • enforce usage policies for established queues
      • most common queues: int, now, week, month
    • RC uses Fair Share scheduling, not first come, first served (FCFS)
  • LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, …
    • see man pages for much more info!
Simplified view of LSF

  • User logged in to the login node submits a job:
    • bsub -n 64 -a mvapich -q week mpirun myjob
  • Job is routed to a queue, where it waits with the other queued jobs
  • Job is dispatched to run on an available host which satisfies the job requirements

Running Programs on Topsail
  • Upon ssh to Topsail/Kure, you are on the Login node.
  • Programs SHOULD NOT be run on Login node.
  • Submit programs to one of the many, many compute nodes.
  • Submit jobs using Load Sharing Facility (LSF) via the bsub command.
Common batch commands
  • bsub – submit jobs
  • bqueues – view info on defined queues
    • bqueues -l week
  • bkill – stop/cancel submitted job
  • bjobs – view submitted jobs
    • bjobs -u all
  • bhist – job history
    • bhist -l <jobID>
Common batch commands
  • bhosts – status and resources of hosts (nodes)
  • bpeek – display output of running job
  • Use man pages to get much more info!
    • man bjobs
Submitting Jobs: bsub Command

Submit Jobs - bsub

Run large jobs out of scratch space; smaller jobs can run out of your home space.

bsub [-bsub_opts] executable [-exec_opts]

Common bsub options:

  • -o <filename>
    • e.g. -o out.%J
  • -q <queue name>
    • e.g. -q week
  • -R "resource specification"
    • e.g. -R "span[ptile=8]"
  • -n <number of processes>
    • used for parallel, MPI jobs
  • -a <application specific esub>
    • e.g. -a mvapich (used on MPI jobs)

Two methods to submit jobs:
  • bsub example: submit the executable job, myexe, to the week queue and redirect output to the file out.<jobID> (default is to mail output)
  • Method 1: Command Line
    • bsub -q week -o out.%J myexe
  • Method 2: Create a file (details to follow) called, for example, myexe.bsub, and then submit that file. Note the redirect symbol, <
    • bsub < myexe.bsub
Method 2 cont.
  • The file you submitted will contain all the bsub options you want in it, so for this example myexe.bsub will look like this

#BSUB -q week

#BSUB -o out.%J

myexe


  • This is actually a shell script, so the top line could be the normal #!/bin/csh, etc., and you can run any commands you would like.
    • if this doesn’t mean anything to you then never mind :)
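
As a sketch of Method 2, the batch file can be written out in one step with a here-document. The example assumes an executable called myexe in the submission directory and uses /bin/sh on the top line, though any shell would do. The bsub call itself is left as a comment because it only works on a cluster login node.

```shell
#!/bin/sh
# Write the batch file; the quoted 'EOF' keeps %J and # literal.
cat > myexe.bsub <<'EOF'
#!/bin/sh
#BSUB -q week
#BSUB -o out.%J
./myexe
EOF

# Show what was written.
cat myexe.bsub

# On the login node, the file would then be submitted with:
#   bsub < myexe.bsub
```

LSF reads the #BSUB lines as options and replaces %J with the job ID, while the shell treats them as ordinary comments.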
Parallel Job example

Batch Command Line Method

  • bsub -q week -o out.%J -n 64 -a mvapich mpirun myParallelExe

Batch File Method

  • bsub < myexe.bsub
  • where myexe.bsub will look like this

#BSUB -q week

#BSUB -o out.%J

#BSUB -a mvapich

#BSUB -n 64

mpirun myParallelExe
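
When the processes of a parallel job should be packed onto full nodes, the -R "span[ptile=8]" option listed among the common bsub options can be combined with -n. The following sketch works out how many nodes the 64-process example would occupy at 8 processes per node, and prints the matching command without running it.

```shell
#!/bin/sh
# Values from the example: 64 MPI processes; ptile=8 is the
# "span[ptile=8]" resource request listed among the bsub options.
nprocs=64
ptile=8

# Round up so a partial node still counts as a node.
nodes=$(( (nprocs + ptile - 1) / ptile ))
echo "$nprocs processes at $ptile per node need $nodes nodes"

# Print (rather than run) the matching submission command.
echo "bsub -q week -o out.%J -n $nprocs -R \"span[ptile=$ptile]\" -a mvapich mpirun myParallelExe"
```

With 8 cores per node, ptile=8 fills each node completely, so the 64-process job occupies 8 nodes.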


Some Topsail Queues
  • For access to the 512cpu queue the scalability must be demonstrated
Some Kure Queues

Most users have a 32 job slots limit unless they have been granted extra slots.

Queues are always subject to change and probably will change as Kure production ramps up. Use the bqueues command to find the current status.

Common Error 1
  • If job immediately dies, check err.%J file
  • err.%J file has error:
    • Can't read MPIRUN_HOST
  • Problem: MPI environment settings were not correctly applied on compute node
  • Solution: Include mpirun in bsub command
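
A quick way to check for this error is to search the job's error file for the message. The sketch below assumes the job was submitted with an error file option such as -e err.%J. The helper name check_mpirun_host is made up, and a temporary file stands in for the real err.<jobID> file.

```shell
#!/bin/sh
# check_mpirun_host ERRFILE
# Succeeds (exit 0) if the error file contains the MPIRUN_HOST message.
check_mpirun_host() {
    grep -q "Can't read MPIRUN_HOST" "$1"
}

# Demo with a stand-in error file; on the cluster this would be err.<jobID>.
errfile=$(mktemp)
echo "Can't read MPIRUN_HOST" > "$errfile"

if check_mpirun_host "$errfile"; then
    echo "MPI environment not set up: add mpirun to the bsub command"
fi
rm -f "$errfile"
```

If the message is found, resubmit with mpirun in front of the executable, as in the parallel job example.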
Common Error 2
  • Job immediately dies after submission
  • err.%J file is blank
  • Problem: ssh passwords and keys were not correctly set up at initial login to Topsail
  • Solution:
    • cd ~/.ssh/
    • mv id_rsa id_rsa-orig
    • mv
    • Logout of Topsail
    • Login to Topsail and accept all defaults
Interactive Jobs
  • To run long shell scripts on Topsail or Kure, use the int queue
  • bsub -q int -Ip /bin/bash
    • This bsub command provides a prompt on a compute node
    • Can run a program or shell script interactively from the compute node
Specialty Scripts
  • There are specialty scripts provided on Kure for user convenience.
  • Batch scripts
    • bmatlab, bsas, bstata
  • X-window scripts
    • xmatlab, xsas, xstata
  • Interactive scripts
    • imatlab, istata
MPI/OpenMP Training
  • Courses are taught throughout the year by Research Computing
  • See schedule for next course
    • MPI
    • OpenMP
Further Help with Topsail/Kure
  • More details can be found on the Getting Started help documents:
    • - Topsail
    • Kure
    • - ON CAMPUS
  • For assistance with Topsail/Kure, please contact the ITS Research Computing group
    • Email:
    • Phone: 919-962-HELP
    • Submit help ticket at
  • For immediate assistance, see manual pages
    • man <command>