Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute)



Anne Weill – Zrahia

Technion, Computer Center

October 2008

Resources needed for applications arising from Nanotechnology
  • Large memory – Tbytes
  • High floating point computing speed – Tflops
  • High data throughput – state of the art …
Cluster architecture

[Diagram: processor–memory pairs, each pair connected to a shared interconnection network]

Why not a cluster
  • Single SMP system easier to purchase/maintain
  • Ease of programming in SMP systems
Why a cluster
  • Scalability
  • Total available physical RAM
  • Reduced cost
  • But …
Having an application which exploits the parallel capabilities requires studying the application or applications which will run on the cluster.

Other requirements
  • Space, power, and cooling constraints; strength of floors
  • Software configuration:
      • Operating system
      • Compilers & application development tools
      • Load balancing and job scheduling
      • System management tools
Configuration

[Diagram: nodes node1, node2, … node64, each drawn with two processors (P) and one memory (M), all connected to an Infiniband switch]

Before finalizing our choice …

One should check, on a similar system:

  • Single processor peak performance
  • Infiniband interconnect performance
  • SMP behaviour
  • Behaviour of non-commercial parallel applications
Parallel applications issues
  • Execution time
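  • Parallel speedup: S_p = T_1 / T_p, where T_1 is the execution time on one processor and T_p the execution time on p processors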
  • Scalability
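
As a minimal sketch of the speedup definition above, the following Python computes S_p = T_1 / T_p together with the derived parallel efficiency S_p / p, which is one common way to judge scalability. The function names are hypothetical and the timings are made-up illustrative values, not Nanco measurements.

```python
def speedup(t1: float, tp: float) -> float:
    """Parallel speedup S_p = T_1 / T_p."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("execution times must be positive")
    return t1 / tp


def efficiency(t1: float, tp: float, p: int) -> float:
    """Parallel efficiency E_p = S_p / p (1.0 means ideal scaling)."""
    if p < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t1, tp) / p


if __name__ == "__main__":
    # Illustrative timings in seconds (assumed, not measured on Nanco).
    timings = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 160.0}
    t1 = timings[1]
    for p, tp in sorted(timings.items()):
        print(f"p={p:2d}  T_p={tp:7.1f}s  "
              f"S_p={speedup(t1, tp):5.2f}  E_p={efficiency(t1, tp, p):5.2f}")
```

An efficiency that drops quickly as p grows suggests the application will not make good use of a larger cluster.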
Benchmark design
  • Must give a good estimate of performance of your application
  • Acceptance test – should match all its components
What did work
  • Running MPI code interactively
  • Running a serial job through the queue
  • Compiling C code with MPI
What did not work
  • Compiling F90 or C++ code with MPI
  • Running MPI code through the queue
  • Queues do not do accounting per CPU
Parallel performance results

Theoretical peak: 2.1 Tflops

Nanco performance on HPL: 0.58 Tflops
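
The two figures above can be combined into a single HPL efficiency, the fraction of theoretical peak actually achieved. The short Python sketch below does that arithmetic; the function name is hypothetical, and only the two Tflops values come from the slide.

```python
def hpl_efficiency(measured_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak achieved by a measured HPL run."""
    if peak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return measured_tflops / peak_tflops


# Values quoted on the slide.
THEORETICAL_PEAK_TFLOPS = 2.1
HPL_MEASURED_TFLOPS = 0.58

ratio = hpl_efficiency(HPL_MEASURED_TFLOPS, THEORETICAL_PEAK_TFLOPS)
print(f"HPL efficiency: {ratio:.1%}")  # prints "HPL efficiency: 27.6%"
```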

Conclusions from acceptance tests
  • New gcc (gcc4) is faster than Pathscale for some applications
  • MPI collective communication functions are implemented differently across MPI versions
  • Disk access times are crucial: use attached storage when possible
Scheduling decisions
  • Assessing priorities between user groups
  • Assessing parallel efficiency of different job types (MPI, serial, OpenMP, commercial software) and designing special queues for them
  • Avoiding starvation by giving weight to the urgency parameter
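
One generic way to avoid starvation is to let a job's priority grow with the time it has waited. The Python sketch below illustrates that idea only. The slides do not say how the scheduler's urgency parameter is defined, so the `Job` fields, the `urgency_weight` name, and the linear formula are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    group_priority: float  # base priority of the owning user group (assumed)
    wait_hours: float      # time the job has already spent in the queue


def effective_priority(job: Job, urgency_weight: float) -> float:
    """Base group priority plus a bonus that grows with waiting time."""
    return job.group_priority + urgency_weight * job.wait_hours


def pick_next(queue: List[Job], urgency_weight: float = 0.5) -> Job:
    """Return the queued job with the highest effective priority."""
    if not queue:
        raise ValueError("queue is empty")
    return max(queue, key=lambda j: effective_priority(j, urgency_weight))


if __name__ == "__main__":
    queue = [
        Job("high_group_new", group_priority=10.0, wait_hours=0.0),
        Job("low_group_old", group_priority=2.0, wait_hours=20.0),
    ]
    # With no urgency weight the low-priority job can wait indefinitely.
    print(pick_next(queue, urgency_weight=0.0).name)
    # With a positive weight, the long-waiting job eventually wins.
    print(pick_next(queue, urgency_weight=0.5).name)
```

The weight controls the trade-off: a larger value favors fairness over group priorities, a smaller one does the reverse.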
Observations during production mode
  • Assessing users' understanding of the machine – support in writing scripts and in efficient parallelization
  • Lack of visualization tools – writing a script to show current usage of the cluster
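
The slides do not show the usage script itself, so the following is only a hypothetical sketch of what such a tool might print: a text bar per node plus a cluster-wide total. The function name, the input dictionary, and the value of two cores per node are assumptions. A real script would obtain the busy-core counts from the batch system rather than from hard-coded values.

```python
from typing import Dict


def usage_report(busy_cores: Dict[str, int], cores_per_node: int,
                 width: int = 20) -> str:
    """Render one text bar per node and a cluster-wide summary line."""
    if not busy_cores:
        return "no nodes reported"
    lines = []
    for node, busy in sorted(busy_cores.items()):
        filled = round(width * busy / cores_per_node)
        bar = "#" * filled + "." * (width - filled)
        lines.append(f"{node:<8}[{bar}] {busy}/{cores_per_node}")
    busy_total = sum(busy_cores.values())
    capacity = cores_per_node * len(busy_cores)
    lines.append(f"total: {busy_total}/{capacity} cores busy "
                 f"({busy_total / capacity:.0%})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data for three nodes (not real Nanco usage).
    sample = {"node1": 2, "node2": 1, "node3": 0}
    print(usage_report(sample, cores_per_node=2))
```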
Conclusion
  • Correct benchmark design is crucial for testing the capabilities of the proposed architecture
  • Acceptance tests make it possible to negotiate with vendors and give insight into future choices
  • Only after several weeks of running the cluster at full capacity can we make informed decisions about its management