
Condor and the NGS

John Kewley
NGS Support Centre Manager


Outline

  • What is High Throughput Computing?

  • What is Condor?

  • Condor and the NGS

NGS Innovation Forum, Manchester


HPC vs. HTC

HPC (High Performance Computing)

  • Large amounts of [simultaneous] computing power for comparatively short periods of time

HTC (High Throughput Computing)

  • Large amounts of computing over significantly longer periods, not necessarily all at the same time


Various job types

Parameter Studies

OpenMP

Master-worker

Parallel

Parameter Search

Serial

Embarrassingly Parallel

Sequential

Parameter Sweep

Monte Carlo

MPI

PVM


Terminology

Parallel

  • Tightly-coupled processes

  • Need synchronisation

  • Information sharing

    • Message passing

    • Shared memory

  • If one process fails, the whole job fails

  • Single large homogeneous resource

  • Processors used simultaneously

Independent

  • Unordered (so not serial/sequential)

  • Nothing embarrassing about it

  • No communication once job starts

  • Might not need all results

  • Could run on different machines with different operating systems.
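
The properties of independent work can be illustrated with a small sketch. This is not Condor code: it uses Python's standard `concurrent.futures` module on a single machine, and the names `estimate_pi` and `run_sweep`, the seeds and the sample count are assumptions made for the example. Each Monte Carlo task runs without communicating with the others, and results are collected in whatever order they finish.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def estimate_pi(seed, samples=20_000):
    """One independent Monte Carlo task: no communication with other tasks."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def run_sweep(seeds):
    """Run one task per seed and gather results in completion order."""
    results = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(estimate_pi, seed): seed for seed in seeds}
        for future in as_completed(futures):  # unordered: first finished, first collected
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    estimates = run_sweep(range(8))
    print(sum(estimates.values()) / len(estimates))
```

Because no task depends on another, the same tasks could equally be farmed out to separate machines, and a partial set of results would still give a usable (if less precise) average.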


What is Condor?

  • A job submission framework which utilises spare computing power

  • Works within a heterogeneous computer network

  • Desktop PCs, Linux workstations, servers, clusters, teaching lab resources can all be included in the Condor pool

  • Uses matchmaking to connect jobs with resources

  • Supports High Throughput Computing (HTC)

  • Developed over the past 20 years at the University of Wisconsin in Madison
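
The matchmaking idea can be sketched as a toy in plain Python. Condor itself expresses these descriptions as ClassAds; the dictionaries, lambdas, machine names and attribute values below are illustrative assumptions, not Condor's own data structures or API. The point shown is that matching is two-sided: the job states what it needs from a machine, the machine states which jobs it will accept, and the job ranks the acceptable machines.

```python
def match(job, machines):
    """Pick the machine the job ranks highest among mutually acceptable ones."""
    candidates = [
        m for m in machines
        if job["requirements"](m) and m["requirements"](job)
    ]
    if not candidates:
        return None  # the job would stay queued until a suitable machine appears
    return max(candidates, key=job["rank"])["name"]


# Hypothetical pool: two Windows lab PCs and one Linux workstation.
MACHINES = [
    {"name": "lab-pc-01", "opsys": "WINDOWS", "memory_mb": 512,
     "requirements": lambda job: True},
    {"name": "lab-pc-02", "opsys": "WINDOWS", "memory_mb": 2048,
     "requirements": lambda job: True},
    {"name": "linux-ws-01", "opsys": "LINUX", "memory_mb": 4096,
     "requirements": lambda job: job["owner"] == "physics"},
]

# Hypothetical job: needs Windows with at least 1 GB, prefers more memory.
WINDOWS_JOB = {
    "owner": "chemistry",
    "requirements": lambda m: m["opsys"] == "WINDOWS" and m["memory_mb"] >= 1024,
    "rank": lambda m: m["memory_mb"],
}

if __name__ == "__main__":
    print(match(WINDOWS_JOB, MACHINES))
```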


Useful Features

  • Automatic resubmission when jobs fail

  • Ability to cluster groups of jobs

  • Checkpointing / migration

  • DAGMan - Directed Acyclic Graph / workflow manager

  • Integration with Grid resources, especially through Condor-G

  • Staging and retrieval of data

  • Glide-in – dynamically add Grid worker nodes to your Condor pool
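
As a rough illustration of what a Directed Acyclic Graph workflow manager has to do, the sketch below orders jobs so that each one runs only after the jobs it depends on. It is not DAGMan and does not read DAGMan's input files; it uses Python's standard `graphlib` module (Python 3.9 or later), and the job names and the `run_workflow` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, run_job):
    """Run jobs in dependency order.

    `dependencies` maps each job name to the set of jobs that must finish first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    order = []
    while sorter.is_active():
        for job in sorter.get_ready():  # every job whose parents are all done
            run_job(job)
            order.append(job)
            sorter.done(job)
    return order


# Hypothetical diamond-shaped workflow: prepare, two simulations, then collect.
WORKFLOW = {
    "prepare": set(),
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "collect": {"simulate_a", "simulate_b"},
}

if __name__ == "__main__":
    run_workflow(WORKFLOW, run_job=lambda name: print("running", name))
```

In this sketch the two simulation jobs become ready at the same time, which is where a real workflow manager could dispatch them to different machines.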


The NGS and Cardiff

[Diagram labels: Central Manager, Submit Nodes, Execute Nodes]

  • NGS Partner site since April 05

  • First resource was a 32 processor SGI cluster (Apr 05)

  • Second resource was the Condor pool (Jun 07)

    • Over 1000 Windows XP workstations

    • Mixture of P4s (80%) and C2Ds (20%)

    • Capped at 200 jobs running concurrently

    • Used by 10 different numbered accounts

    • See www.cf.ac.uk/arcca


Other Condor on the NGS

  • Bristol: ~50 Windows XP machines in a Condor pool fronted by a Linux server

  • Reading: ~400 Linux machines (coLinux under Windows XP)


What is Condor-G?

[Diagram labels: Submit Node (condor_submit …), Queue (Job 1, Job 2), Internet, Firewall, Remote Site, Head Node (Globus), Batch System, Execute Node]
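
With Condor-G, a job bound for a remote Globus-fronted site is still handed to `condor_submit` as a submit description file. The sketch below renders such a file as text. It assumes the grid universe with a `gt2` (pre-web-services Globus) resource; the host name `grid.example.org`, the script name and the helper `grid_submit_description` are placeholders invented for illustration, and the right `grid_resource` value for any real NGS site would have to come from that site's documentation.

```python
def grid_submit_description(executable, gatekeeper, jobmanager="pbs", count=1):
    """Render a Condor-G style submit description as text (illustrative only)."""
    lines = [
        "universe      = grid",
        # Assumed resource type: gt2, followed by host/jobmanager-<batch system>.
        f"grid_resource = gt2 {gatekeeper}/jobmanager-{jobmanager}",
        f"executable    = {executable}",
        # $(Cluster) and $(Process) keep output files of a job cluster apart.
        "output        = job.$(Cluster).$(Process).out",
        "error         = job.$(Cluster).$(Process).err",
        "log           = job.log",
        f"queue {count}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical script and gatekeeper host.
    print(grid_submit_description("simulate.sh", "grid.example.org", count=2))
```

The resulting text could be saved to a file and passed to `condor_submit`; the `queue` line with a count is also how a group of similar jobs is submitted as one cluster.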


OxGrid: Overview

[Diagram labels: Oxford e-Research Centre; Resource Broker/Login (Condor); Storage (SRB); BDII, VOMS, SSO CA...; Department/College (three instances); Condor pool (two instances); Departmental Clusters; Other University/Institution (three instances); National Grid Service Resource; National Grid Service Cluster; Microsoft Cluster; Super-computing centre]


[Diagram labels: User login; Condor-G portal; MyProxy server; Condor-G central manager; Condor-G submit host; CSD-Physics cluster (ulgbc2); CSD AMD cluster (ulgbc1); NW-GRID cluster (ulgbc3); NW-GRID/POL cluster (ulgp4); Condor ClassAds; Globus file staging]


University of Manchester, Research Computing Services

  • 100 cores (an additional 400 in 2nd pool)

  • Condor used as backfill for the SGE queues

  • IP-tunnelling used to enable connection to the NW-Grid backend nodes from Condor (rather than the provided GCB, the Generic Connection Broker)


Novel Architecture !?

  • Condor itself is not that new

  • Some NGS users request Windows resources, but most previous NGS nodes used PBS, LSF or SGE on Linux

  • Campus Grids are being developed to harness all available processing power (incl. teaching pools, servers and clusters)

  • Condor can help NGS provide access to Windows resources


Windows on the NGS

Many users are looking for Windows resources on which to run their computations.

As well as the resources provided by Cardiff, Bristol and Reading, Southampton have made available a group of 100 processors running under Windows Compute Cluster Server.


Other work

  • Jean-Alain Grunchec of the University of Edinburgh is trying Condor Glidein to add NGS resources to his Condor pool

  • The e-Minerals project utilised a Condor submission mechanism to submit jobs to both local Condor pools and Grid resources such as NGS and NW-Grid

  • Both the EGEE resource broker (being trialled by NGS) and Gridway metascheduler are based on Condor technologies

  • STFC Daresbury Laboratory (another NW-Grid site), in collaboration with the Cockcroft Centre, is setting up a Campus Grid using NW-Grid and Condor resources


Summary

  • Condor pools can be part of the NGS

  • Condor can be used in many ways with the NGS

  • Condor is being combined with the NGS in many Campus Grids

  • Condor can help NGS provide access to Windows resources

    Information on NGS resources can be found at

    http://www.grid-support.ac.uk/content/view/239/157/


Acknowledgements

  • Some slides are based on material from the University of Wisconsin-Madison Condor team.

  • Some of the slides describing the UK university Condor work are based on ones the universities produced themselves (I hope nothing was "lost in translation").
