
Distributed computing at the Facility level: applications and attitudes

Tom Griffin

STFC ISIS Facility

[email protected]

NOBUGS 2008, Sydney


Spare cycles

  • Typical PC CPU usage is about 10%

  • Usage is minimal from 5 pm to 8 am

  • Most desktop PCs are really fast

  • Idle machines are a waste of energy

  • How can we use ("steal"?) unused CPU cycles to solve computational problems?


Types of application

  • CPU Intensive

  • Low to moderate memory use

  • Not too much file output

  • Coarse grained

  • Command line / batch driven

  • Licensing issues?


Distributed computing solutions

Lots of choice: Condor, Grid Engine, Grid MP…

  • Grid MP Server hardware

    • Two dual-Xeon 2.8 GHz servers with RAID 10

  • Software

    • Servers run Red Hat Enterprise Linux / DB2

    • Unlimited Windows (and other) clients

  • Programming

    • Web Services interface – XML, SOAP

    • Accessed with C++, Java, C#

  • Management Console

    • Web browser based

    • Can manage services, jobs, devices, etc.

  • Large industrial user base

    • GSK, J&J, Novartis etc.


Installing and running Grid MP

  • Server installation: about 2 hours

  • Client installation: create MSI and RPM using 'setmsiprop'; about 30 seconds per client

  • Manual install: better security on Linux and Macs


Adapting a program for Grid MP

  • Fairly easy to write

  • Interface to grid via Web Services

    • C++, Java, C#

  • Think about how to split your data

  • Wrap your executable

  • Write the application service

    • Pre- and post-processing
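The "split your data" step is where most of the adaptation effort goes: the workload must be cut into coarse-grained packages, one per workunit. A minimal illustrative sketch in Python (hypothetical names; the real Grid MP interface is the C++/Java/C# web-services API described above):

```python
# Hypothetical sketch: split a coarse-grained workload into fixed-size
# packages, each of which would become one Grid MP workunit.
# These function and variable names are illustrative, not Grid MP API.

def split_into_packages(items, package_size):
    """Yield successive chunks of `items`, each chunk one workunit's input."""
    for start in range(0, len(items), package_size):
        yield items[start:start + package_size]

# 100 input records in packages of 25 -> 4 workunits
packages = list(split_into_packages(list(range(100)), 25))
```

Coarse granularity matters here: each package should represent enough CPU work to dwarf the cost of shipping its files to a client.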


Package your executable

[Diagram: the executable, DLLs, standard data files and environment variables are bundled into a program module (optionally compressed and encrypted), which is uploaded to, and resident on, the server.]
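The bundling idea in the diagram can be sketched with a standard-library archive (illustrative only; Grid MP has its own module-packaging tooling):

```python
# Hypothetical sketch of building a "program module": bundle the
# executable, DLLs and standard data files into one compressed archive
# ready to upload to the server. Not the actual Grid MP packaging tool.
import os
import tarfile

def build_program_module(archive_path, files):
    """Create a gzip-compressed tar containing every file in `files`."""
    with tarfile.open(archive_path, "w:gz") as tar:  # "Compress?" -> gzip
        for path in files:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path
```

Encryption, the other option on the slide, would be a second pass over the resulting archive.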


Create / run a job

[Diagram: client-side datasets (e.g. proteins and molecules) are uploaded over HTTPS; on the server side a job is created, the cross product of the datasets generates packages (Pkg1–Pkg4) as workunits, and the job is started.]


Code examples

Mgsi.Job job = new Mgsi.Job();

job.application_gid = app.application_gid;

job.description = txtJobName.Text.Trim();

job.state_id = 1;

job.job_gid = ud.createJob(auth, job);

Mgsi.JobStep js = new Mgsi.JobStep();

js.job_gid = job.job_gid;

js.state_id = 1;

js.max_concurrent = 1;

js.max_errors = 20;

js.num_results = 1;

js.program_gid = prog.program_gid;


Code examples (continued)

Mgsi.DataSet ds = new Mgsi.DataSet();

ds.job_gid = job.job_gid;

ds.data_set_name = job.description + "_ds_" + DateTime.Now.Ticks;

ds.data_set_gid = ud.createDataSet(auth, ds);

Mgsi.Data[] datas = new Mgsi.Data[(int)numWorkunits.Value]; // one entry per workunit

for (int i = 1; i <= numWorkunits.Value; i++) {

    FileTransfer.UploadData uploadD = ft.uploadFile(auth, Application.StartupPath + "\\testdata.tar");

    Mgsi.Data data = new Mgsi.Data();

    data.data_set_gid = ds.data_set_gid;

    data.index = i;

    data.file_hash = uploadD.hash;

    data.file_size = long.Parse(uploadD.size);

    datas[i - 1] = data;
}

ud.createDatas(auth, datas);

ud.createWorkunitsFromDataSetsAsync(auth, js.job_step_gid, new string[] { ds.data_set_gid }, options);


Performance

Famotidine form B

  • 13 degrees of freedom

  • P21/c, V = 1421 Å³

  • Sync data to 1.64 Å

  • 1 × 10⁷ moves per run, 64 runs

  • Standard DASH on a 2.4 GHz Core2 Quad using a single core: job complete in 9 hrs

  • GDASH submit to a test grid of 5 in-use PCs (4 × 2.4 GHz Core2 Quad, 1 × 2.8 GHz Core2 Quad): job complete in 24 minutes

  • Speedup = 22.5×
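The quoted speedup follows directly from the two run times:

```python
# Speedup of the 5-PC test grid over single-core DASH:
# 9 hours = 540 minutes, grid run = 24 minutes.
single_core_minutes = 9 * 60
grid_minutes = 24
speedup = single_core_minutes / grid_minutes  # 540 / 24 = 22.5
```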


Performance – 999 SA runs, full grid

4 days 18 hours of CPU time in ~40 minutes of elapsed time

317 cores from 163 devices:

  • 42 Athlons: 1.6–2.2 GHz

  • 168 Core 2 Duos: 1.8–3 GHz

  • 36 Core 2 Quads: 2.4–2.8 GHz

  • 1 Duron @ 1.2 GHz

  • 42 P4s: 2.4–3.6 GHz

  • 27 Xeons: 2.5–3.6 GHz

[Chart: workunits completed vs. time.]


A particular success – McStas

HRPD supermirror guide design:

  • Complex design; meaningful simulations take a long time

  • Want to try lots of ideas

  • Many runs of >200 CPU days

  • Simpler model was best value

  • Massive improvement in flux; significant cost savings


Problems

  • McStas: interactions in the wild with Symantec Anti-Virus

    • Did not show up in testing

    • McStas now restricted to night-time running only


User attitudes

  • A range of reactions:

  • "Theft"

  • "I'm not having that on my machine"

  • The grid is the first thing to get blamed

    • Gaining more trust over time

    • Evangelism by users


Flexibility with virtualisation

  • Request to run the 'GARefl' code

  • ISIS is Windows-based, with few Linux PCs

  • VMware Server is free

  • 8 hosts gave 26 cores

  • More cores = more demand

  • 56 real cores recruited from servers and a 64-core Beowulf cluster

  • 10 Mac cores

  • Run Linux as a job



The future

  • Grid growing in power every day: new machines added, old ones still left on

  • Electricity: energy-saving drive at STFC – switch machines off

    • Wake-on-LAN 'magic packets' + remote hibernate

  • Laptops – good or bad?
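A Wake-on-LAN "magic packet" is simply the byte 0xFF repeated six times followed by the target's 6-byte MAC address repeated sixteen times, broadcast over UDP. A minimal sketch (the MAC address shown is hypothetical):

```python
# Minimal Wake-on-LAN sketch: build and broadcast a magic packet.
# The MAC address used in the example comment is made up.
import socket

def make_magic_packet(mac):
    """6 x 0xFF followed by 16 repetitions of the 6-byte MAC address."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet on the usual WoL discard port."""
    packet = make_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, port))

# wake("00:11:22:33:44:55")  # hypothetical target MAC
```

This lets hibernated machines stay off overnight yet be recruited on demand when work arrives.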


Summary

  • Distributed computing: perfect for coarse-grained, CPU-intensive, 'disk-lite' applications

  • Resources: use existing resources; power increases with time; no need to write off assets; scalable

  • Not just faster: allows one to try different scenarios

  • Virtualisation: Linux under Windows, Windows under Linux

  • Green credentials: PCs are running anyway, better to utilise them; can be powered down and up


Acknowledgements

  • ISIS Data Analysis Group

    • Kenneth Shankland

    • Damian Flannery

  • STFC FBU IT Service Desk and ISIS Computing Group

  • Key Users

    • Richard Ibberson (HRPD)

    • Stephen Holt (GARefl)

  • Questions?

