
Issues with Production Grids

Tony Hey, Director of UK e-Science Core Programme. The ‘Grid’ is a set of core middleware services running on top of high-performance global networks to support research and innovation.


Presentation Transcript


  1. Issues with Production Grids. Tony Hey, Director of UK e-Science Core Programme.

  2. The ‘Grid’ is a set of core middleware services running on top of high-performance global networks to support research and innovation.

  3. Grids of Grids of Simple Services
     [Diagram labels: CPUs, Clusters, MPPs, Compute Resource Grids; Methods, Services, Functional Grids; Databases, Federated Databases, Data Resource Grids; Sensor, Sensor Nets; Overlay and Compose; Grids of Grids]

  4. NGS “Today”
     Projects:
     • e-Minerals
     • e-Materials
     • Orbital Dynamics of Galaxies
     • Bioinformatics (using BLAST)
     • GEODISE project
     • UKQCD Singlet meson project
     • Census data analysis
     • MIAKT project
     • e-HTPX project
     • RealityGrid (chemistry)
     Users:
     • Leeds
     • Oxford
     • UCL
     • Cardiff
     • Southampton
     • Imperial
     • Liverpool
     • Sheffield
     • Cambridge
     • Edinburgh
     • QUB
     • BBSRC
     • CCLRC
     Interfaces:
     • OGSI::Lite

  5. NGS Hardware

  6. NGS Software

  7. RealityGrid AHM Experiment
     • Measuring protein-peptide binding energies (Gbind) is vital, e.g. for understanding fundamental physical processes at play at the molecular level and for designing new drugs.
     • Computing a peptide-protein binding energy traditionally takes weeks to months.
     • We have developed a grid-based method to accelerate this process. We computed Gbind during the UK AHM, i.e. in less than 48 hours.
     [Figure labels: ligand; Src SH2 domain]

  8. Experiment Details
     • A Grid-based approach, using the RealityGrid steering library, enables us to launch, monitor, checkpoint and spawn multiple simulations.
     • Each simulation is a parallel molecular dynamics simulation running on a supercomputer-class machine.
     • At any given instant, we had up to nine simulations in progress (over 140 processors) on machines at 5 different sites, e.g. 1x TG-SDSC, 3x TG-NCSA, 3x NGS-Oxford, 1x NGS-Leeds, 1x NGS-RAL.

  9. Experiment Details (2)
     • In all, 26 simulations were run over 48 hours. We simulated over 6.8 ns of classical molecular dynamics in this time.
     • Real-time visualization and off-line analysis required bringing back data from simulations in progress.
     • We used UKLight between UCL and the TeraGrid machines (SDSC, NCSA).
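
The figures quoted on the two "Experiment Details" slides can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses the example site breakdown from the slide, and since the slides say "over 140 processors" and "over 6.8 ns", the derived rates should be read as rough lower bounds rather than measured values. All variable and function names are made up for this example.

```python
# Example snapshot of concurrent simulations per machine, as listed on the slide.
concurrent = {
    "TG-SDSC": 1,
    "TG-NCSA": 3,
    "NGS-Oxford": 3,
    "NGS-Leeds": 1,
    "NGS-RAL": 1,
}


def total_by_grid(counts):
    """Group per-machine counts by grid prefix ("TG" or "NGS")."""
    totals = {}
    for site, n in counts.items():
        grid = site.split("-", 1)[0]
        totals[grid] = totals.get(grid, 0) + n
    return totals


peak_simulations = sum(concurrent.values())
by_grid = total_by_grid(concurrent)

# Figures quoted on the slides; "over" means these are lower bounds.
TOTAL_SIMULATIONS = 26
SIMULATED_NS = 6.8
WALL_CLOCK_HOURS = 48
PEAK_PROCESSORS = 140

# Aggregate simulated time per wall-clock hour, across all simulations.
ns_per_hour = SIMULATED_NS / WALL_CLOCK_HOURS
# Average simulated time contributed by each of the 26 simulations.
ns_per_simulation = SIMULATED_NS / TOTAL_SIMULATIONS
# Average processor count per simulation at peak concurrency.
processors_per_simulation = PEAK_PROCESSORS / peak_simulations

print(f"peak concurrent simulations: {peak_simulations}")
print(f"by grid: {by_grid}")
print(f"aggregate rate: {ns_per_hour:.3f} ns per wall-clock hour")
print(f"average per simulation: {ns_per_simulation:.3f} ns")
print(f"average processors per simulation: {processors_per_simulation:.1f}")
```

The per-machine counts sum to the nine concurrent simulations mentioned on the slide, spread over five machines on two grids.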

  10. The e-Infrastructure
      [Network diagram. UK NGS: Leeds, Manchester, Oxford, RAL. US TeraGrid: SDSC, NCSA, PSC. Also shown: UCL; UKLight; Starlight (Chicago); Netherlight (Amsterdam); AHM 2004 (local laptops and Manchester vncserver). Legend: Computation, Steering clients, Service Registry, Network PoP. All sites connected by production network (not all shown).]

  11. The scientific results … Some simulations require extending, and more sophisticated analysis needs to be performed.

  12. … and the problems
      • Restarted the GridService container Wednesday evening
      • Numerous quota and permission issues, especially at TG-SDSC
      • NGS-Oxford was unreachable Wednesday evening to Thursday morning
      • The steerer and launcher occasionally failed
      • We were unable to checkpoint two simulations
      • The batch queuing systems occasionally did not like our simulations
      • 5 simulations died of natural causes
      • Overall, up to six people were working on this calculation to solve these problems
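
Simulations that die and simulations that cannot be checkpointed are closely related problems: a run that dies can only resume from its last checkpoint. The toy model below is a minimal sketch of that trade-off. It is not the RealityGrid steering library or its API; the `ToyRun` class, its fields and the step counts are all hypothetical and chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass
class ToyRun:
    """Hypothetical stand-in for one simulation (not the RealityGrid API)."""
    total_steps: int
    checkpoint_every: int
    step: int = 0
    last_checkpoint: int = 0
    steps_executed: int = 0  # total work done, including repeated steps

    def advance(self, fail_at=None):
        """Run until finished, or until the step number `fail_at` is reached.

        Returns True if the run completed and False if it 'died'.
        """
        while self.step < self.total_steps:
            if fail_at is not None and self.step == fail_at:
                return False
            self.step += 1
            self.steps_executed += 1
            if self.step % self.checkpoint_every == 0:
                self.last_checkpoint = self.step
        return True

    def restart(self):
        """Resume from the most recent checkpoint, losing work done since."""
        self.step = self.last_checkpoint


def run_with_one_failure(total_steps, checkpoint_every, fail_at):
    """Return the total steps executed when the run dies once at `fail_at`."""
    run = ToyRun(total_steps, checkpoint_every)
    if not run.advance(fail_at=fail_at):
        run.restart()
        run.advance()
    return run.steps_executed


# A 100-step run that dies at step 57.
with_checkpoints = run_with_one_failure(100, checkpoint_every=10, fail_at=57)
# An interval longer than the run means no checkpoint is ever written.
without_checkpoints = run_with_one_failure(100, checkpoint_every=1000, fail_at=57)

print(with_checkpoints, without_checkpoints)
```

In this toy setting the checkpointed run repeats only the 7 steps since step 50, while the run without a checkpoint repeats all 57.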

  13. NGS “Tomorrow”
      [Diagram labels: Grid Operation Support Centre; Web Services based National Grid Infrastructure]

  14. Web Service Grids: An Evolutionary Approach to WSRF
      • WS-I: Standards that have broad industry support and multiple interoperable implementations
      • ‘WS-I+’ profile: Specifications that are emerging from standardisation process and are recognised as being ‘useful’
      • Specifications that have/will enter a standardisation process but are not stable and are still experimental

  15. OMII Vision
      • To be the national provider of reliable, interoperable, open source grid middleware
      • Provide one-stop portal and software repository for grid middleware
      • Provide quality assured software engineering, testing, packaging and maintenance for our products
      • Lead the evolution of Grid middleware through a managed programme and wide-reaching collaboration with industry

  16. OMII Distribution 1 Oct 2004
      • Collection of tested, documented and integrated software components for Web Service Grids
      • A base built from off-the-shelf Web Services technology
      • A package of extensions that can be enabled as required
      • An initial set of Web Services for building file-compute collaborative grids
      • Technical preview of Web Service version of OGSA-DAI database middleware
      • Sample applications

  17. OMII future distributions
      • Include the services in previous distributions +…
      • OMII managed programme contributions
      • Database service
      • Workflow service
      • Registry service
      • Reliable messaging service
      • Notification service
      • Interoperability with other grids

  18. Why Workflows and Services?
      Workflow = general technique for describing and enacting a process
      Workflow = describes what you want to do, not how you want to do it
      Web Service = how you want to do it
      Web Service = automated programmatic internet access to applications
      • Automation
      • Capturing processes in an explicit manner
      • Tedium! Computers don’t get bored/distracted/hungry/impatient!
      • Saves repeated time and effort
      • Modification, maintenance, substitution and personalisation
      • Easy to share, explain, relocate, reuse and build
      • Available to wider audience: don’t need to be a coder, just need to know how to do Bioinformatics
      • Releases Scientists/Bioinformaticians to do other work
      • Record
      • Provenance: what the data is like, where it came from, its quality
      • Management of data (LSID - Life Science IDentifiers)
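
The distinction drawn above between a workflow (what to do) and a web service (how to do it) can be sketched in a few lines. This is a minimal illustration, not Scufl, Taverna or any real workflow engine: the "services" are ordinary local functions standing in for remote web services, and every name here (`fetch_sequence`, `gc_content`, `enact`, the sample sequence) is invented for the example.

```python
# Hypothetical stand-ins for remote web services: the "how".
def fetch_sequence(seq_id):
    """Pretend to look up a nucleotide sequence by identifier."""
    return {"id": seq_id, "residues": "GGCCAATT"}


def gc_content(record):
    """Fraction of G and C residues in the sequence."""
    residues = record["residues"]
    return (residues.count("G") + residues.count("C")) / len(residues)


def report(record, gc):
    """Format a one-line summary."""
    return f"{record['id']}: GC content {gc:.2f}"


SERVICES = {"fetch": fetch_sequence, "gc": gc_content, "report": report}

# The workflow: the "what". Each step names a service and the data it
# consumes, without saying anything about how the service is implemented.
WORKFLOW = [
    {"name": "record", "service": "fetch", "inputs": ["seq_id"]},
    {"name": "gc", "service": "gc", "inputs": ["record"]},
    {"name": "summary", "service": "report", "inputs": ["record", "gc"]},
]


def enact(workflow, services, initial):
    """Run the steps in order and keep a simple provenance record."""
    data = dict(initial)
    provenance = []
    for step in workflow:
        args = [data[name] for name in step["inputs"]]
        data[step["name"]] = services[step["service"]](*args)
        provenance.append((step["name"], step["service"], tuple(step["inputs"])))
    return data, provenance


results, provenance = enact(WORKFLOW, SERVICES, {"seq_id": "demo-1"})
print(results["summary"])
for produced, service, inputs in provenance:
    print(f"{produced} <- {service}{inputs}")
```

Because the workflow is plain data, a service can be substituted by changing one entry in `SERVICES`, and the provenance list records which service produced each result from which inputs.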

  19. Workflow Components
      • Scufl: Simple Conceptual Unified Flow Language
      • Taverna: writing, running workflows & examining results
      • Freefluo: workflow engine to run workflows
      • SOAPLAB: makes applications available
      [Diagram labels: SOAPLAB Web Service; Any Application; Web Service, e.g. DDBJ BLAST]

  20. The Williams Workflows
      A: Identification of overlapping sequence
      B: Characterisation of nucleotide sequence
      C: Characterisation of protein sequence

  21. The Workflow Experience
      Have workflows delivered on their promise? YES!
      • Correct and biologically meaningful results
      • Automation
      • Saved time, increased productivity
      • Process split into three, you still require humans!
      • Sharing
      • Other people have used and want to develop the workflows
      • Change of work practices
      • Post hoc analysis. Don’t analyse data piece by piece; receive all data at once
      • Data stored and collected in a more standardised manner
      • Results amplification
      • Results management and visualisation

  22. Future UK e-Infrastructure?
      Users get common access, tools, information, nationally supported services, through NGS and robust, standards-compliant middleware from the OMII.
      [Diagram labels: VRE, VLE, IE; HPCx + HECToR; LHC; ISIS TS2; GOSC; Regional and Campus grids; Integrated internationally]
