
Connecting OurGrid & GridSAM


jarah

Presentation Transcript


  1. Connecting OurGrid & GridSAM: A Short Overview

  2. Content • Goals • OurGrid: architecture overview • OurGrid: short overview • GridSAM: short overview • GridSAM: example deployment with Condor • Different paradigms: OurGrid • Different paradigms: GridSAM • Issues: File staging • Issues: many related job submissions • OurGrid<>GridSAM connector

  3. Goals • To maintain two grid environments in parallel: OurGrid & Condor • To handle the job submission process through a common interface, JSDL, using GridSAM • To build a connector that lets GridSAM talk to OurGrid • GridSAM can already talk to Condor through a connector, so there are no problems here

  4. OurGrid: architecture overview

  5. OurGrid: short overview • Workers are typically desktop computers that can run jobs directly in their OS or through virtualization (Xen, VMware, VirtualBox etc.) • „Clouds of Workers” are controlled by Peers • Jobs are submitted through Brokers • Two possibilities here: • A Broker can be a dedicated web site interfacing with specific Peers • A Broker can be any machine with the MyGrid tool installed that communicates with specified Peers

  6. GridSAM: short overview • Web Service-type middleware sitting between the job submitter and the core grid machinery • Modular architecture: can talk to many grid infrastructures through specific connectors • Collects job submissions sent as XML JSDL files • Manages multiple submissions thanks to persistence and monitors the submission lifecycle • After accepting JSDLs, re-submits jobs directly to the underlying grid machinery as defined in specific connectors
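To make "job submissions sent as XML JSDL files" concrete, here is a minimal sketch that builds such a document with the Python standard library. The element names and namespaces follow the JSDL 1.0 specification; the function name `build_jsdl`, the `overwrite` creation flag, and any host names used with it are assumptions for illustration only, not something taken from the slides.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

ET.register_namespace("jsdl", JSDL_NS)
ET.register_namespace("jsdl-posix", POSIX_NS)


def build_jsdl(executable, arguments, stage_in=None, stage_out=None):
    """Return a single-job JSDL description as an XML string.

    stage_in / stage_out map a file name in the job's working directory
    to the URI it is fetched from / delivered to (hypothetical helper).
    """
    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    job_desc = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    # The command to run, expressed with the POSIX application extension.
    app = ET.SubElement(job_desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg

    # One DataStaging element per file: Source for inputs, Target for outputs.
    for direction, files in (("Source", stage_in or {}), ("Target", stage_out or {})):
        for file_name, uri in files.items():
            staging = ET.SubElement(job_desc, f"{{{JSDL_NS}}}DataStaging")
            ET.SubElement(staging, f"{{{JSDL_NS}}}FileName").text = file_name
            ET.SubElement(staging, f"{{{JSDL_NS}}}CreationFlag").text = "overwrite"
            endpoint = ET.SubElement(staging, f"{{{JSDL_NS}}}{direction}")
            ET.SubElement(endpoint, f"{{{JSDL_NS}}}URI").text = uri

    return ET.tostring(job_def, encoding="unicode")
```

Note that one document describes exactly one job, and that every staged file is named by a URI rather than by a path on the submitter's machine.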

  7. GridSAM: example deployment with Condor • Machine (B) runs a GridSAM instance in a secured OMII container • Machine (B) is capable of directly re-submitting jobs to the Condor Pool (C) • An authorized job submitter (A) can submit jobs over the internet to the GridSAM instance running on (B)

  8. Different paradigms: OurGrid • Designed for labs that have access to a pool of desktop machines whose free CPU cycles can be utilized • Bag-of-Tasks: jobs are usually disjoint units with independent input and output • Data sets are often small enough to be transferred many times across many machines • As end-user friendly as possible: asks the job submitter only for a JDL job submission specification, input files and output files • All details of job scheduling and file transfer are hidden from the job submitter

  9. Different paradigms: GridSAM • Designed primarily for labs utilizing high performance computing (HPC) techniques on a few powerful machines • HPC is typically used for CPU-demanding computations that use extensive data sets • Every millisecond is important: job specification, input and output files must be handled with minimum human and OS intervention • Jobs often depend on very large data sets, so file transfer should be minimized • Data must be accessed in a fast and secure way, preferably through URIs, which require minimum external intervention • The URIs must be specified directly in the JSDL file

  10. Issues: File staging • In OurGrid, the MyGrid tool takes care of the transfer of input files, distributing them according to the BoT paradigm, and of the transfer of output files back to the job submitter • Also, when submitting through a web site, feedback is sent when output files are available for download • The job submitter can just point to files on its own machine, or upload them to some storage server accessible to MyGrid • No dedicated storage is needed for MyGrid to work

  11. Issues: File staging • GridSAM does not handle input and output files by itself; it delegates this subtask to yet another middleware, Apache VFS • VFS was designed to access resources identified by URIs based on fully qualified host names and a few recognized protocols (FTP/SFTP, HTTP, GridFTP, WebDAV etc.) • When submitting JSDL using a GridSAM client on a particular machine, one cannot just point to local files; they must be uploaded to some dedicated storage space that is identifiable to the VFS machinery through a URI • Only when correctly specified in the JSDL (reliable URIs!) and uploaded to dedicated storage may files be further processed by GridSAM
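The constraint described above (a recognized protocol, a fully qualified host name, no bare local paths) can be checked before a JSDL is submitted. The following is only a rough sketch: the scheme names in `ACCEPTED_SCHEMES` are assumed spellings of the protocols listed on the slide, the "contains a dot" test is a crude stand-in for "fully qualified", and `staging_uri_problem` is a hypothetical helper, not part of GridSAM or Apache VFS.

```python
from urllib.parse import urlsplit

# Assumed scheme names for the protocols named on the slide
# (FTP/SFTP, HTTP, GridFTP, WebDAV); adjust to the actual deployment.
ACCEPTED_SCHEMES = {"ftp", "sftp", "http", "gsiftp", "webdav"}


def staging_uri_problem(uri):
    """Return None if the URI looks usable for staging, else a reason string."""
    parts = urlsplit(uri)
    if parts.scheme not in ACCEPTED_SCHEMES:
        # Covers bare local paths such as "input.txt" as well as file:// URIs.
        return f"unsupported or missing scheme: {parts.scheme!r}"
    host = parts.hostname or ""
    if "." not in host:
        # Crude heuristic: names like "localhost" are not resolvable elsewhere.
        return f"host is not fully qualified: {host!r}"
    if not parts.path or parts.path == "/":
        return "no file path given"
    return None
```

A submitter-side tool could run this over every staging URI in a JSDL and refuse to submit if any check returns a reason. It does not verify that the file actually exists or is readable; that still requires contacting the storage server.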

  12. Issues: File staging • Possible solution 1: define dedicated storage in the form of an SFTP/GridFTP file server, accessible to both OurGrid and GridSAM, and write all URIs in JSDL files according to this dedicated storage • Possible solution 2: let the job submitter decide on its own storage mechanisms; accept a URI if it is accessible (readable/writable), process the job as usual, and let VFS do the rest

  13. Issues: File staging • In both cases, security is an important feature to consider • JSDL processing is secure enough in GridSAM but secure access to external storage must be maintained separately

  14. Issues: many related job submissions • In OurGrid, the job submitter can submit a JDL job specification with many jobs defined • Also, specific environment variables set by OurGrid can be utilized to differentiate between multiple jobs and multiple input/output files • No specific support for the parameter sweep concept is provided, but the job submitter can simulate it by using a properly written JDL job specification

  15. Issues: many related job submissions • With GridSAM, the job submitter submits a JSDL that contains details for a single job only • In theory, it is possible to submit multiple JSDLs within a short time; they should be internally scheduled by GridSAM using its persistence mechanisms, then gradually re-submitted to the grid machinery through a specified queuing strategy • The parameter sweep JSDL extension is currently not supported in GridSAM; in theory, the job submitter can submit a bunch of JSDLs that simulate it
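Simulating a parameter sweep with "a bunch of JSDLs" amounts to expanding one template into one single-job document per parameter value. The sketch below does only the expansion; how the documents are then sent to GridSAM is deliberately left out. The template contents, the `result-N.txt` output naming, and the function name `expand_sweep` are assumptions made for illustration.

```python
from string import Template
from xml.sax.saxutils import escape

# A single-job JSDL skeleton; $executable, $value and $index are filled in
# per sweep point. Element names follow the JSDL 1.0 POSIX extension.
JSDL_TEMPLATE = Template("""<?xml version="1.0" encoding="UTF-8"?>
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
                    xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <posix:POSIXApplication>
        <posix:Executable>$executable</posix:Executable>
        <posix:Argument>$value</posix:Argument>
        <posix:Output>result-$index.txt</posix:Output>
      </posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
""")


def expand_sweep(executable, values):
    """Return one JSDL document (as a string) per parameter value."""
    return [
        JSDL_TEMPLATE.substitute(
            executable=escape(executable),  # escape XML special characters
            value=escape(str(value)),
            index=index,
        )
        for index, value in enumerate(values)
    ]
```

For example, `expand_sweep("/usr/bin/simulate", [0.1, 0.2, 0.3])` yields three independent documents, each of which would be submitted as a separate job.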

  16. Issues: many related job submissions • Possible solution 1: rely on GridSAM scheduling mechanisms; accept multiple submissions within a very short time and let GridSAM re-submit them according to its own strategies • Possible solution 2: implement the parameter sweep JSDL extension in the OurGrid connector or even in the GridSAM core module itself • Solution 1 is very straightforward; however, the behaviour of GridSAM under those conditions needs to be examined closely • Solution 2 is feasible, but requires much time and resources

  17. OurGrid<>GridSAM connector • For OurGrid, a MyGrid tool instance (either installed on a local machine or as a component of a job submission web site) is a single „contact point” for the job submitter, hiding all the underlying grid-specific mechanisms • The connector should be a wrapper over a MyGrid instance
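The "wrapper over a MyGrid instance" idea can be sketched as two steps: translate an accepted JSDL into a grid-neutral task description, then hand that description to whatever actually talks to MyGrid. This is a hypothetical illustration in Python, not the GridSAM connector API; `TaskSpec`, `jsdl_to_task`, `OurGridConnector`, and the injected `submit` callable are all invented names, and the real MyGrid invocation is deliberately not shown.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

JSDL = "{http://schemas.ggf.org/jsdl/2005/11/jsdl}"
POSIX = "{http://schemas.ggf.org/jsdl/2005/11/jsdl-posix}"


@dataclass
class TaskSpec:
    """Grid-neutral view of one job (hypothetical intermediate form)."""
    command: str
    inputs: List[Tuple[str, str]] = field(default_factory=list)   # (source URI, file name)
    outputs: List[Tuple[str, str]] = field(default_factory=list)  # (file name, target URI)


def jsdl_to_task(jsdl_text: str) -> TaskSpec:
    """Extract the command line and staging instructions from a JSDL document."""
    root = ET.fromstring(jsdl_text)
    posix = root.find(f".//{POSIX}POSIXApplication")
    if posix is None:
        raise ValueError("JSDL has no POSIXApplication element")
    words = [posix.findtext(f"{POSIX}Executable", default="")]
    words += [arg.text or "" for arg in posix.findall(f"{POSIX}Argument")]
    task = TaskSpec(command=" ".join(words))
    for staging in root.iter(f"{JSDL}DataStaging"):
        name = staging.findtext(f"{JSDL}FileName", default="")
        source = staging.findtext(f"{JSDL}Source/{JSDL}URI")
        target = staging.findtext(f"{JSDL}Target/{JSDL}URI")
        if source:
            task.inputs.append((source, name))
        if target:
            task.outputs.append((name, target))
    return task


class OurGridConnector:
    """Wraps a MyGrid-facing backend behind a single submit_jsdl() call."""

    def __init__(self, submit: Callable[[TaskSpec], str]):
        # `submit` stands in for the code that drives the MyGrid instance;
        # it receives a TaskSpec and returns a job identifier.
        self._submit = submit

    def submit_jsdl(self, jsdl_text: str) -> str:
        return self._submit(jsdl_to_task(jsdl_text))
```

Injecting the backend keeps the translation logic testable without a running grid; a real backend would turn the `TaskSpec` into an OurGrid job specification and pass it to MyGrid, whose format and command line are not covered in these slides.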
