
Presentation Transcript


  1. EPSRC Annual e-Science Meeting, 22 April 2005
     What makes Grid computing difficult?
     Peter Coveney, Centre for Computational Science, University College London

  2. Talk contents
     • What is Grid computing?
     • How to do it
     • Problems
     • Making grids more usable

  3. Grid computing
     My preferred definition: Grid computing is distributed computing performed transparently across multiple administrative domains.
     Notes:
     • "Computing" means any activity involving digital information -- no distinction between numeric/symbolic, or numeric/data/viz.
     • Transparency implies minimal complexity for users of the technology.
     See: Phil. Trans. R. Soc. London A (2005)

  4. Grid computing is NOT…
     • …launching isolated jobs onto medium-sized or big iron, as is the case for
       - most work being done on TeraGrid machines
       - most work being done on the National Grid Service
       What added value is there to "grid-enabling" NGS and TeraGrid machines? A common file-store system is one important and valuable feature.
     • …talking about or merely complying with middleware specifications and standards
     Note: We have interests in Grid-based "informatics" projects (OGSA-DAI and all that) for the £1.1M EPSRC-funded "Discovery of Novel Functional Oxides" project.

  5. Is "worse is better"?
     • The Global Grid Forum
     • The WSRF debacle of 2004
     • Credibility problem -- an expensive talking shop?
     • Angels dancing on a pinhead, or the European plug revisited?
     • De facto standards
     • We need workable, usable solutions
     • There must be continual engagement between users and grid techies

  6. Transferring binary data
     Web services applications need effective, standard methods for handling binary data. On 25 January 2005 the World Wide Web Consortium (W3C, http://www.w3.org/) published three new Web Services Recommendations:
     • XML-binary Optimized Packaging (XOP)
     • SOAP Message Transmission Optimization Mechanism (MTOM)
     • Resource Representation SOAP Header Block (RRSHB)
     These Recommendations provide ways to efficiently package and transmit binary data included in, or referenced by, a SOAP 1.2 message (the sketch below shows why this matters).
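To make the motivation concrete, here is a minimal sketch (Python standard library only) of the overhead these Recommendations avoid: plain SOAP must base64-encode binary payloads inline, inflating them by roughly a third, whereas XOP/MTOM transmit the raw bytes and reference them from the XML.

```python
# Why XOP/MTOM exist: base64-inlining binary data in XML costs ~33% extra bytes.
import base64
import os

payload = os.urandom(1_000_000)        # 1 MB of opaque binary data
inline = base64.b64encode(payload)     # what a plain SOAP message must carry

print(f"raw:    {len(payload):>9,} bytes")
print(f"base64: {len(inline):>9,} bytes  ({len(inline) / len(payload):.2f}x)")
```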

  7. Grid computing: how?
     To do grid computing we need to find or build a grid which is:
     • Stable
     • Persistent
     • Usable
     from which we can pick and choose the resources we need. Where do we find such a grid -- one that lasts longer than a demo?
     • UK: National Grid Service (since mid-2004)
     • US: TeraGrid
     Note: all of the above use elements of Globus Toolkit 2.

  8. Demos versus real science
     A tension exists:
     • Demos can help us make progress
     versus
     • Here today, gone tomorrow
     Grid infrastructure must be made persistent in order to perform real science.

  9. RealityGrid: a £4 million project funded by EPSRC
     [Architecture diagram: a user with a laptop or PDA performs steering and performance control/monitoring via Grid middleware, which connects HPC resources (scalable MD, MC and mesoscale modelling), storage devices, instruments (XMT devices, LUSI, …), and visualization engines feeding VR and/or Access Grid nodes.]

  10. Building services on GT2 grids
      • Globus Toolkit 2 has limited usable functionality, so we:
        - track specs and standards
        - provide functionality as easily as possible
        - put this on top of GT2 grid middleware
      • We do NOT wait for heavyweight generic solutions provided by others:
        - GT3 (obsolescent)
        - GT4 (yes, but when?)
        - it's a recipe for being sidelined indefinitely…
      • Lightweight middleware makes provision of a service-oriented architecture a pleasant experience for all.

  11. Grid computing headaches
      • Deployment on existing grids:
        - it takes a long time and much effort by many people to get applications properly deployed
        - often requires extensive re-working of existing application code
        - lots of things can go wrong
        - many people have given up -- the return on investment is too low
      • Lack of persistent grid infrastructure and capabilities -- steering, visualization, bandwidth provision; we need interactive access
      • Security issues: clunky, not very usable; the existing model is not taken seriously by people who care about it

  12. TeraGyroid Grid 2003
      [Network diagram: US TeraGrid sites (ANL, PSC, Caltech, NCSA, SDSC, Phoenix) and UK sites (Manchester, Daresbury, UCL) linked via Starlight (Chicago) and Netherlight (Amsterdam) at 10 Gbps, with BT-provisioned 2 x 1 Gbps links and the MB-NG and SJ4 production networks. Legend: computation, visualization, service registry, network PoP, Access Grid node, dual-homed system.]

  13. "STIMD Grid 2004" grid infrastructure
      [Network diagram: UK NGS sites (Leeds, Manchester, Oxford, RAL, UCL) and US TeraGrid sites (SDSC, NCSA, PSC) linked via UKLight, Starlight (Chicago) and Netherlight (Amsterdam); steering clients at AHM 2004 ran on local laptops, PDAs and a Manchester vncserver. Legend: computation, steering clients, service registry, network PoP.]
      Both the US TeraGrid and the UK NGS use GT2 middleware. All sites are connected by a production network (not all shown).

  14. RealityGrid demos @ NeSC
      • Lattice-Boltzmann study of complex fluid flow through porous media (oilfield application)
      • Molecular dynamics/thermodynamic integration to compute SH2-protein/peptide and HIV-protease/drug binding affinities (the method is summarized below)
      • Now achieving flexible multi-modal means to control/steer these applications using the Qt steerer, a PDA and a web portal
      • Grid infrastructure only comes together "around the demo" -- one has to work very hard to get that, and even harder to see it persist beyond the one-week time frame.
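For readers unfamiliar with the second demo's method: thermodynamic integration obtains a binding free energy by coupling the system Hamiltonian to an alchemical parameter λ and integrating the ensemble-averaged λ-derivative. This is the textbook form of the method, not a statement about the specific RealityGrid codes:

```latex
\Delta A \;=\; A(\lambda{=}1) - A(\lambda{=}0)
        \;=\; \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\!\lambda} \mathrm{d}\lambda
```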

  15. Problems for users
      • Lack of a common API for usable core functionality (e.g. file transfer) across distinct grid applications and domains (a sketch of such an API follows below)
      • Heterogeneous software stacks make grid-application portability a nightmare for users
      • Security: a high barrier to getting certificates accepted beyond the issuing domain -- some improvements in the past year for US/UK projects
      • Non-uniform scheduling and job-launching resources, and often incompatible policies, in different administrative domains
      • Complex grid middleware is detrimental to scientific research, and contrary to the stipulated goals of grid computing
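As a hedged illustration of the first point, the sketch below shows the kind of single-call file-transfer API users are asking for: one function that hides which transport a given site accepts. The tool names are real (globus-url-copy ships with the Globus Toolkit; scp with OpenSSH), but the dispatch policy and function name are hypothetical, not any existing project's API.

```python
# Illustrative only: a uniform transfer() call dispatching on URL scheme.
import subprocess
from urllib.parse import urlparse

def transfer(src: str, dst: str) -> None:
    """Copy src to dst, picking a transport from the URL schemes (hypothetical API)."""
    schemes = (urlparse(src).scheme, urlparse(dst).scheme)
    if "gsiftp" in schemes:
        cmd = ["globus-url-copy", src, dst]          # GridFTP (Globus Toolkit)
    else:
        cmd = ["scp",
               src.removeprefix("file://"),
               dst.removeprefix("file://")]          # plain OpenSSH copy
    subprocess.run(cmd, check=True)

# transfer("file:///tmp/data.dat", "gsiftp://ngs.example.ac.uk/home/u/data.dat")
```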

  16. X.509 digital certificates in grid computing
      • Users share certificates because "it's too hard to get my own", "it's too hard to get my certificate authorised for that site, but my colleague managed to get his done", "my certificate doesn't work properly", etc.
      • Users store private keys in multiple locations
      • Users protect private keys with no passphrase or with trivial passphrases
      • Users re-use certificates obtained for one specific purpose for another "because it is too difficult to get another one"
      Adapted from Bruce Beckles

  17. Security and usability
      • Usability considerations alone lead one to the conclusion that these certificates are unsatisfactory: they are extremely difficult to use, particularly as implemented in current grid environments.
      • Security solutions which are difficult to use are inherently insecure -- users inadvertently use them in an insecure way, or deliberately subvert them in an attempt to "just get my work done without all this stuff getting in the way".
      Adapted from Bruce Beckles

  18. In other words…
      …it is too difficult to use this hopeless mess properly, and, anyway, I've got better things to do with my time, so…
      "Can't use it. Won't use it."
      Adapted from Bruce Beckles

  19. Lightweight middleware
      • OGSI::Lite / WSRF::Lite, by Mark McKeown of the University of Manchester:
        - lightweight OGSI/WSRF implementations, written in Perl
        - use existing software (e.g. for SSL) where possible; simple installation
      • Using OGSI::Lite (2003):
        - Grid-based job submission and steering were retrofitted onto the LB2D workstation-class simulation code within a week
        - standards compliance: we were able to steer simulations from a web browser, with no custom client software needed (see the sketch below)
      • Necessary for all RealityGrid grid work, e.g. TeraGyroid
      • Now developing extended capabilities using WSRF::Lite on the US TeraGrid and UK NGS
      • We have developed WEDS -- a web-services environment for distributed simulation
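Because WSRF::Lite services speak plain SOAP over HTTP, "steering from a web browser" needs no grid client stack at all; the sketch below drives a steering operation with nothing but the Python standard library. The endpoint URL, namespace and SetParameter operation are hypothetical illustrations, not the actual RealityGrid steering interface.

```python
# Minimal sketch: poking a WSRF::Lite-hosted steering service with plain SOAP
# over HTTP. Endpoint, namespace and operation names are made up for illustration.
import urllib.request

ENDPOINT = "http://example.org:50000/Session/SteeringService/1234"  # hypothetical

SOAP_BODY = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <SetParameter xmlns="urn:example:steering">
      <name>temperature</name>
      <value>1.05</value>
    </SetParameter>
  </soap:Body>
</soap:Envelope>"""

req = urllib.request.Request(
    ENDPOINT,
    data=SOAP_BODY.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "urn:example:steering#SetParameter"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read()[:200])   # inspect the SOAP response
```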

  20. Scientists developing middleware!
      • Rapid prototyping of usable grid middleware (EPSRC funded)
      • Robust application hosting under WSRF::Lite (OMII funded)
      • Total value > £500K
      OMII = Open Middleware Infrastructure Institute (UK), www.omii.ac.uk

  21. WEDS architecture
      • Each resource runs a WSRF::Lite container holding a WEDS machine service and factory services for each hosted application.
      • Each machine that a user wishes to use is registered with a broker service.
      • The user contacts the broker with the details of the job to run.
      • The broker match-makes the job details against the capabilities advertised by each machine service and decides where to invoke the service (see the sketch below).
      • The broker passes the contact details of the service instance back to the client.
      [Diagram: client → broker → machine service → service factory → wrapper service → invoked application, all hosted on the managed resource.]
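A minimal sketch of the match-making step described above, assuming a made-up capability record. The real WEDS broker is a Perl/WSRF::Lite service; the class and field names here are illustrative only.

```python
# Illustrative broker match-making: pick a registered machine whose advertised
# capabilities cover the job, then hand back its endpoint (not the WEDS code).
from dataclasses import dataclass, field

@dataclass
class Machine:
    endpoint: str                                  # WSRF::Lite container URL
    apps: set[str] = field(default_factory=set)    # hosted application factories
    free_cpus: int = 0                             # advertised capacity

def match(job_app: str, cpus_needed: int, machines: list[Machine]) -> str:
    """Return the endpoint of the first machine that can host the job."""
    for m in machines:
        if job_app in m.apps and m.free_cpus >= cpus_needed:
            return m.endpoint      # the broker would now call the app factory here
    raise LookupError(f"no registered machine can run {job_app!r}")

machines = [
    Machine("http://hpc1.example.org:50000/", {"lb3d", "namd"}, free_cpus=64),
    Machine("http://hpc2.example.org:50000/", {"namd"}, free_cpus=8),
]
print(match("lb3d", 32, machines))   # -> http://hpc1.example.org:50000/
```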

  22. Robust application hosting
      • Developing our lightweight hosting tools to meet the needs of application scientists
      • No preconceptions about the "right way" to do things, and no pre-determined adherence to particular specifications or "workflows"
      • Gain experience by working with real-world problems, refactoring the design as required
      • Projects/people we are collaborating with as "end-users":
        - Daniel Mason (Imperial) -- polystyrene-surface interactions (see demo)
        - CCP5's DL-MESO project (Rongshan Qin, Daresbury Laboratory) -- mesoscale modelling/simulation
        - Jonathan Essex (Southampton) -- NAMD for protein modelling
        - Integrative Biology EPSRC e-Science project
        - IBiS (Integrative Biological Simulation) BBSRC Bioinformatics & e-Science project
      • Close collaboration with OMII and its middleware

  23. Summary
      • We are using the Grid to do real science
      • When successful, this leads to a step jump in our capabilities
      • We are working with the US TeraGrid and UK National Grid Service to try to ensure compatibility between the two grids into the future (GT4, …)
      • We're being held back by the state of existing "grid infrastructure"

  24. Summary
      • There remain large barriers to routine use of flexible computational grids
      • Lightweight middleware greatly facilitates deployment of users' applications on grids
      • We're working with several "computational user communities", from physics through to biology, to try to attract them onto grids in this manner

  25. Acknowledgements
      • Many colleagues, post-graduates and post-docs
      • EPSRC
      • OMII
      • NSF
