
RunJob Discussion Slides

RunJob Discussion Slides. Greg Graham 8-April-2004. Outline. How RunJob is Used in CMS RunJob in the Abstract Diagrammed Examples of RunJob in Specific Cases The Intended “User Experience” The Intended “Admin Experience”. RunJob in CMS.


Presentation Transcript


  1. RunJob Discussion Slides Greg Graham 8-April-2004

  2. Outline • How RunJob is Used in CMS • RunJob in the Abstract • Diagrammed Examples of RunJob in Specific Cases • The Intended “User Experience” • The Intended “Admin Experience”

  3. RunJob in CMS • RunJob is an Application Configuration and Job Creation Tool • RunJob uses metadata to abstract application steps and bolt them together into a workflow (Configurators) • RunJob uses metadata to abstract service invocations and iterators to parallelize job creation (other Configurators) • RunJob has an abstract model of a “batch job” that can be translated into concrete jobs for a variety of environments (local or grid) later depending on what modules are plugged in.
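The last point, one abstract "batch job" translated into concrete jobs for different environments, can be illustrated with a minimal Python sketch. Every name here (`AbstractJob`, `render_local`, `render_batch`, the `--top-mass` flag, the `# queue:` comment line) is a hypothetical stand-in chosen for illustration and is not part of RunJob itself.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractJob:
    """Environment-neutral job description (hypothetical, not RunJob's real class)."""
    executable: str
    arguments: list = field(default_factory=list)


def render_local(job):
    """Render the job as a plain shell script for local execution."""
    command = " ".join([job.executable, *job.arguments])
    return "#!/bin/sh\n" + command + "\n"


def render_batch(job, queue):
    """Render the same job with an illustrative queue annotation for a batch system."""
    command = " ".join([job.executable, *job.arguments])
    return "#!/bin/sh\n# queue: " + queue + "\n" + command + "\n"


# The "plugged in" module decides which concrete form the abstract job takes.
RENDERERS = {
    "local": render_local,
    "batch": lambda job: render_batch(job, "CMS"),
}

job = AbstractJob("pythia", ["--top-mass", "275.0"])  # made-up command line
local_script = RENDERERS["local"](job)
batch_script = RENDERERS["batch"](job)
```

The job description is written once; only the renderer changes when the target environment changes.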

  4. RunJob in CMS • RunJob does job creation in macro scripts • RunJob macro scripts support external service invocations, looping over units of data, inline job submission, metadata constraint modeling among applications and services, persistence, and conventional provenance. • Macro scripts can be factored into simple scripts and contexts. Contexts may contain site level details, administrative details, logical details about constraints, or default input parameters for applications • Saving the context scripts can enable “meta-provenance” in the sense that recording constraints provides clues as to why certain parameters are set the way they are instead of just their value.

  5. RunJob in CMS • Specific Uses • Official Monte Carlo Generation • Over 30 sites, over 75 million GEANT events, controlled centrally by central RefDB. Assignments are pulled from the RefDB. (Monte Carlo Simulation) • Data Challenge 2004 • Used at CERN to govern online reconstruction jobs of simulated data at 25 Hz (Data Reprocessing) • Used at FNAL to create physics plots from reconstructed data transferred there (Batch Analysis)

  6. RunJob in CMS • Extensions to RunJob • XMLP: An XML persistency mechanism • shREEK: Runtime extensions to RunJob for monitoring, and delayed job specification. • logger: An event logger • scriptObjects: Abstract descriptions of jobs that serve as input to code generators producing actual jobs for specific environments • CAST: A GUI for constructing Configurators and workflows.

  7. RunJob Abstract Architecture (without extensions)
     (diagram, top to bottom)
     • User: RunJob Scripts
     • Admin: RunJob Contexts
     • Linker: A container for Configurators that enables constraint propagation, framework calls, provenance, scripting, and context resolution of scripts
     • Configurator: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments
     • Databases & Catalogs, Applications, Execution Managers, Local/Grid Resources
     • ScriptObjects

  8. RunJob Abstract Architecture (without extensions)
     • Configurator Layer
       • Applications and services are exposed as namespaces that contain specific metadata.
       • Configurators receive messages
         • from the Linker through the framework, executing pre-defined actions at specified times, or
         • directly from the user through macro interface
       • Configurators can have dependencies on other Configurators.
     • Linker Layer
       • Runs the framework
       • Interprets and dispatches macro commands from the user
       • Allows metadata elements in one namespace to refer to metadata elements in another namespace, and to be refreshed periodically.
       • Maintains a partial order based on the dependencies.
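The partial order that the Linker maintains over Configurator dependencies can be sketched with a topological sort. This is only an illustration using Python's standard `graphlib` module (Python 3.9+); the dependency table and the `dispatch_order` helper are assumptions made for the example, not RunJob code.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: Configurator name -> Configurators it depends on.
dependencies = {
    "ControlDB": set(),
    "Pythia": {"ControlDB"},
    "GEANT": {"ControlDB", "Pythia"},
    "FBS": {"GEANT"},
}


def dispatch_order(deps):
    """Return an order in which every Configurator comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = dispatch_order(dependencies)
print(order)
```

Dispatching framework calls in such an order means that a Configurator which refers to another namespace sees metadata that has already been refreshed.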

  9. RunJob Abstract Architecture (without extensions)
     • Abstract Configurator Examples
       • Applications
         • Metadata are executable name and version and selected input parameters.
         • Actions are to refresh the metadata periodically and to execute some program at specific framework calls. (Usually, the “program” is to generate some code or wrapper script for later submission to a batch manager.)
       • Services
         • Metadata are query data or query results
         • Actions are to perform some query periodically on external services
       • ScriptObject
         • A “snapshot” of a Configurator plus selected output of Configurator action.
         • State is saved in a transportable format. (Usually, the wrapper script generated by the above “program” is saved here.)
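A rough Python sketch of an application Configurator and its ScriptObject snapshot follows. The class, its method names, the `x.y` version string, the wrapper-script layout, and the choice of JSON as the "transportable format" are all assumptions made for illustration; the slides do not specify them.

```python
import json


class ApplicationConfigurator:
    """Hypothetical stand-in for an application Configurator (not RunJob's real class)."""

    def __init__(self, name, executable, version):
        self.name = name
        # Metadata: executable name and version plus selected input parameters.
        self.metadata = {"Executable": executable, "Version": version}
        self.wrapper = None

    def define(self, key, value):
        """Set one input parameter in this Configurator's namespace."""
        self.metadata[key] = value

    def make_script(self):
        """Action: generate a wrapper script for later submission."""
        params = " ".join(
            f"{key}={value}"
            for key, value in sorted(self.metadata.items())
            if key not in ("Executable", "Version")
        )
        self.wrapper = f"#!/bin/sh\n{self.metadata['Executable']} {params}\n"
        return self.wrapper

    def snapshot(self):
        """Return a ScriptObject-like snapshot: metadata plus the generated wrapper."""
        return json.dumps(
            {"name": self.name, "metadata": self.metadata, "wrapper": self.wrapper},
            sort_keys=True,
        )


pythia = ApplicationConfigurator("Pythia", "pythia", "x.y")
pythia.define("TopMass", 275.0)
pythia.make_script()
script_object = pythia.snapshot()   # a string that can be stored or shipped
restored = json.loads(script_object)
```

Because the snapshot is plain text, it can be handed to a batch manager module or saved for provenance without keeping the Configurator itself alive.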

  10. Specific Examples • The number of things that can be done with RunJob is very large. Please keep in mind that the following examples are meant to be illustrative, and that in most cases the specific components can be mixed and matched. • That is the point of RunJob, but it doesn’t fit on “one slide” • I am also omitting specific InputFiles and OutputFiles to save space. Assume all application Configurators have this metadata. • Filenames can be set directly, or built from “Unique” metadata in the Configurator

  11. Running Pythia User overrides admin set defaults User does not know physical resources Some parameters come from a Database

  12. Admin and User (diagram)
      Admin (RunJob Contexts):
        - Physics Parameters set by Physics Group, e.g. HiggsMass=115 GeV/c², TopMass=173 GeV/c², Random Seeds from a Control DB
        - Physical Description of Site Resources, e.g. BatchManager=FBS
        - Configuration of Site Resources, e.g. FBSQueueName=CMS; Submit Jobs Inline
      User (RunJob Scripts):
        - User’s Job Description, e.g. attach Pythia, attach BatchManager
        - Physics Parameters of User Interest, e.g. TopMass=275 GeV/c²
        - Specification of Which Context To Use
      Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts
        - Read User Script
        - Read Context(s) specified by User
        - Combine all specified contexts
        - Fully Specified Workflow Description, e.g. attach ControlDB; attach Pythia (HiggsMass=115 GeV/c², TopMass=275 GeV/c², Random Seeds from a Control DB); attach FBS (FBS QueueName=CMS; Submit Jobs Inline)
      Result Text: Results from Configurators

  13. Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts -Fully Specified Workflow Description eg- attach ControlDB attach Pythia HiggsMass=115 GeV/c2 TopMass=275 GeV/c2 Random Seeds from a Control DB attach FBS FBS QueueName=CMS; Submit Job FBS Submit Results Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments attach ControlDB Get Random Seeds attach Pythia HiggsMass=115 TopMass=275 random seeds {x} attach FBS FBSQueueName=CMS Submit Job Now return random seeds: {x} FBS Submit Results return Pythia scriptObject Queue CMS MySQL Control DB Pythia FBS

  14. What would the Macro Scripts Look Like?
      User written script or canned script for user consumption:
          attach Pythia
          attach BatchManager
          cfg Pythia define TopMass 275.0
          if SubmitInline in @args
            cfg BatchManager define SubmitInline true
          end
          source CommonFramework.mcj
      Admin written context script that operates on user scripts:
          attach ControlDB
          cfg ControlDB make RandomSeeds
          contextBlock Class=Pythia
            adddep ControlDB
            define TopMass 173.0
            define HiggsMass 115.0
            define RandomSeeds ::ControlDB:RandomSeeds
          end
          namespace add BatchManager Class=FBS
          contextBlock Class=FBS
            define QueueName CMS
          end

  15. What would the Macro Scripts Look Like?
      The User Types:
          Linker.py script=userScript.mcj context=context.mcj [SubmitInline]
      Equivalent “User Script” after modification by the context:
          attach ControlDB
          cfg ControlDB make RandomSeeds
          attach Pythia
          cfg Pythia adddep ControlDB
          cfg Pythia define TopMass 275.0
          cfg Pythia define HiggsMass 115.0
          cfg Pythia define RandomSeeds ::ControlDB:RandomSeeds
          attach FBS
          cfg FBS QueueName CMS
          source CommonFramework.mcj
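The effect of context resolution in this example (admin defaults filled in, user values winning where both are set) can be sketched in Python as a per-namespace dictionary merge. The `resolve` function and the dictionary layout are assumptions for illustration only; the real Linker operates on macro scripts rather than dictionaries.

```python
def resolve(context_defaults, user_settings):
    """Merge admin context defaults with user settings; user values take precedence."""
    resolved = {}
    for namespace in sorted(set(context_defaults) | set(user_settings)):
        merged = dict(context_defaults.get(namespace, {}))   # start from admin defaults
        merged.update(user_settings.get(namespace, {}))      # apply user overrides
        resolved[namespace] = merged
    return resolved


# Values taken from the Pythia example: the admin context sets defaults,
# the user script overrides TopMass only.
context = {
    "Pythia": {"TopMass": 173.0, "HiggsMass": 115.0},
    "FBS": {"QueueName": "CMS"},
}
user = {"Pythia": {"TopMass": 275.0}}

workflow = resolve(context, user)
print(workflow["Pythia"])
```

The user never mentions HiggsMass or the queue name, yet both appear in the fully specified workflow, which is the behavior the equivalent script above shows.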

  16. Running Pythia and GEANT User overrides admin set defaults User specifies two step workflow Some parameters come from a Database

  17. Admin and User (diagram)
      Admin (RunJob Contexts):
        - Physics Parameters set by Physics Group, e.g. HiggsMass=…, ECALThresh=…, Random Seeds and thresholds from a Control DB
        - Physical Description of Site Resources, e.g. BatchManager=FBS
        - Configuration of Site Resources, e.g. FBSQueueName=CMS; Submit Jobs Inline
      User (RunJob Scripts):
        - User’s Job Description, e.g. attach Pythia, attach GEANT3, cfg GEANT3 define Input ::Pythia:Output
        - Physics Parameters of User Interest, e.g. ECALThresh=500 keV
        - Specification of Which Context To Use
      Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts
        - Read User Script
        - Read Context(s) specified by User
        - Combine all specified contexts
        - Fully Specified Workflow Description, e.g. attach ControlDB; attach Pythia (HiggsMass=…); attach GEANT3 (Detector Thresh=…, Random Seeds from a Control DB); attach FBS (FBS QueueName=CMS; Submit Jobs Inline)
      Result Text: Results from Configurators

  18. Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts -Combine all specified contexts -Fully Specified Workflow Description eg- attach ControlDB attach Pythia HiggsMass=… attach GEANT3 ECALThresh=… Random Seeds from a Control DB attach FBS FBS QueueName=CMS; Submit Jobs Inline FBS Submit Results Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments attach ControlDB Get Random Seeds attach Pythia HiggsMass=… random seeds {x} attach GEANT DetThresh=… Input=PythiaOutput attach FBS FBSQueueName=CMS Submit Job Now return Pythia scriptObject return GEANT scriptObject return random seeds: {x} FBS Submit Results Queue CMS MySQL Control DB Pythia GEANT FBS

  19. What would the Macro Scripts Look Like?
      User written script or canned script for user consumption:
          attach Pythia
          attach GEANT
          cfg GEANT adddep Pythia
          cfg GEANT define Input ::Pythia:Output
          cfg GEANT define ECALThresh 0.5
          attach BatchManager
          if SubmitInline in @args
            cfg BatchManager define SubmitInline true
          end
          source CommonFramework.mcj
      Admin written context script that operates on user scripts:
          attach ControlDB
          cfg ControlDB make RandomSeeds
          cfg ControlDB make Thresholds
          contextBlock Class=Pythia
            adddep ControlDB
            define TopMass …
            define RandomSeeds ::ControlDB:RandomSeeds
          end
          contextBlock Class=GEANT
            adddep ControlDB
            define ECALThresh ::ControlDB:ECALThresh
            define OtherThresh ::ControlDB:OtherThresh
            define RandomSeed ::ControlDB:RandomSeed
          end
          namespace add BatchManager Class=FBS
          contextBlock Class=FBS
            define QueueName CMS
          end

  20. What would the Macro Scripts Look Like?
      The User Types:
          Linker.py script=userScript.mcj context=context.mcj [SubmitInline]
      Equivalent “User Script” after modification by the context:
          attach ControlDB
          cfg ControlDB make RandomSeeds
          cfg ControlDB make Thresholds
          attach Pythia
          cfg Pythia adddep ControlDB
          cfg Pythia define TopMass 275.0
          cfg Pythia define HiggsMass 115.0
          cfg Pythia define RandomSeeds ::ControlDB:RandomSeeds
          attach GEANT
          cfg GEANT adddep ControlDB
          cfg GEANT adddep Pythia
          cfg GEANT define ECALThresh 0.5
          cfg GEANT define OtherThresh ::ControlDB:OtherThresh
          cfg GEANT define RandomSeed ::ControlDB:RandomSeed
          cfg GEANT define Input ::Pythia:Output
          attach FBS
          cfg FBS QueueName CMS
          source CommonFramework.mcj
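Cross-namespace references such as `::Pythia:Output` and `::ControlDB:RandomSeed` can be pictured as lookups into another Configurator's metadata. The sketch below is an assumption about how such a reference might be resolved; the `resolve_reference` helper, the seed value, and the file name are invented for the example.

```python
def resolve_reference(value, namespaces):
    """Follow a '::Namespace:Key' reference until a plain value is reached."""
    if isinstance(value, str) and value.startswith("::"):
        namespace, key = value[2:].split(":", 1)
        return resolve_reference(namespaces[namespace][key], namespaces)
    return value


# Made-up metadata; only the reference syntax comes from the slides.
namespaces = {
    "ControlDB": {"RandomSeed": 12345},
    "Pythia": {"Output": "pythia_events.ntpl"},
    "GEANT": {
        "Input": "::Pythia:Output",
        "RandomSeed": "::ControlDB:RandomSeed",
        "ECALThresh": 0.5,
    },
}

geant = {key: resolve_reference(value, namespaces)
         for key, value in namespaces["GEANT"].items()}
print(geant)
```

Resolving references only after the referenced Configurator has run is the reason the `adddep` declarations in the scripts matter.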

  21. What Can RunJob Do? • Once the developer writes a Configurator, it all looks like metadata to the system. The metadata undergoes transformations across certain framework calls, work can be performed during other framework calls, and Configurators can refer to and use the results of other Configurators. • Excellent for “Job Building” that involves parallelization or metadata handling. • Excellent for adapting old jobs to new environments

  22. Running with SAM (how it could work) User overrides admin set defaults User specifies one step analysis workflow Files come from a SAM Project Not Parallelized, local execution Metadata Flow Specification

  23. Admin and User (diagram)
      User (RunJob Scripts):
        - User’s Job Description, e.g. attach Analysis
        - Physics Parameters of User Interest, e.g. JetAlgo=Cone7
        - Specification of Which Context To Use
      Admin (RunJob Contexts):
        - Physics Parameters set by Physics Group, e.g. JetAlgo=Cone11, MCInfo=false
        - Physical Description of Site Resources, e.g. BatchManager=fork, RunJobInline, Use SAM to get files.
      Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts
        - Read User Script
        - Read Context(s) specified by User
        - Combine all specified contexts
        - Fully Specified Workflow Description, e.g. attach SAM; attach Analysis (JetAlgo=…, Input=GetNextFile() result); attach Fork; cfg Fork RunJobInline
      Result Text: Results from Configurators

  24. Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts -Combine all specified contexts -Fully Specified Workflow Description eg- attach SAM attach Analysis JetAlgo=… Input=GetNextFile() result attach Fork cfg Fork RunJobInline Fork Results Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments attach SAM Get next file attach Analysis JetAlgo=Cone7 attach Fork Run Job Now return Analysis scriptObject return next file Fork Results SAMDB SAM (Configurator) Analysis A Subshell

  25. What would the Macro Scripts Look Like?
      User written script or canned script for user consumption:
          attach Analysis
          cfg Analysis define JetAlgo Cone7
          attach BatchManager
          source CommonFramework.mcj   # (Call order: “PreJob”, “RunJob”, “PostJob”)
      Admin written context script that operates on user scripts:
          attach SAM
          cfg SAM define ProjectName ::@args:ProjectName
          cfg SAM start project
          cfg SAM oncall PreJob do GetNextFile   # Results stored in “CurrentFile”
          contextBlock Class=Analysis
            adddep SAM
            define JetAlgo Cone11
            define ScriptObjectName SomeAnalysis
            define InputProject ::SAM:ProjectName
            oncall PreJob do define InputFile ::SAM:CurrentFile
            oncall RunJob do MakeScript   # Predefined method, stores result in a scriptObject
          end
          namespace add BatchManager Class=Fork
          contextBlock Class=Fork
            adddep Analysis
            define ScriptObjectToGet ::Analysis:ScriptObjectName
            oncall RunJob do GetAndRunScriptObject
          end
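The `oncall` lines in this example bind actions to the framework calls “PreJob”, “RunJob”, and “PostJob”. A minimal Python sketch of that dispatch idea follows. The `Framework` class, the hook functions, and the file names are hypothetical, and within one call this sketch simply uses registration order, whereas the slides indicate the Linker orders Configurators by their dependencies.

```python
CALL_ORDER = ["PreJob", "RunJob", "PostJob"]


class Framework:
    """Hypothetical dispatcher that runs registered actions call by call."""

    def __init__(self):
        self.hooks = {call: [] for call in CALL_ORDER}

    def oncall(self, call, action):
        """Register an action to run when the named framework call is dispatched."""
        self.hooks[call].append(action)

    def run(self):
        """Dispatch every framework call in the fixed call order."""
        for call in CALL_ORDER:
            for action in self.hooks[call]:
                action()


sam_files = iter(["file_001", "file_002"])   # made-up file names
sam, analysis, log = {}, {}, []


def get_next_file():
    sam["CurrentFile"] = next(sam_files)
    log.append("PreJob: SAM.GetNextFile")


def set_input_file():
    analysis["InputFile"] = sam["CurrentFile"]
    log.append("PreJob: Analysis.InputFile")


def make_script():
    log.append("RunJob: MakeScript " + analysis["InputFile"])


framework = Framework()
framework.oncall("RunJob", make_script)      # registered first, still runs after PreJob
framework.oncall("PreJob", get_next_file)
framework.oncall("PreJob", set_input_file)
framework.run()
```

The script is built only after the input file is known, regardless of the order in which the hooks were declared.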

  26. Running with SAMGrid (how it could work) User overrides admin set defaults User specifies one step analysis workflow Files come from a SAM Project Parallelized, Grid Execution

  27. DZero Job Characteristics

  28. Admin User -User’s Job Description -Physics Parameters of User Interest -Specification of Which Context To Use -Physics Parameters set by Physics Group -Physical Description of Grid Services RunJob Scripts RunJob Contexts Result Text -Read User Script -Read Context(s) specified by User Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts -Combine all specified contexts -Fully Specified Workflow Description Results from Configurators

  29. Asynchronous Mode: RunJob as Staging Proxy for SAM.GetNextFile() Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts -Combine all specified contexts -Fully Specified Workflow Description Submission Results Done! Job Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments The Grid get Application scriptObjects return Application scriptObjects Scheduler GetNextFile() executes locally Submission Results Job SAMDB SAM (Configurator) Applications Grid Submit

  30. Asynchronous Comments • GetNextFile() runs at job creation time, and files are staged to execution hosts • Would work better if GetNextFile() had asynchronous extensions to it, so you could get a “next” file before the last one was consumed • Staging can be done by RunJob FileMetaBroker or directly by the SAM SRM interface (does that exist?) • The job itself provides notification of successful consumption to SAM. • SAM stations are not needed on the remote sites • Native Grid schedulers such as Condor or LCG Resource Broker can be used.
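The asynchronous extension wished for above, getting a “next” file before the last one has been consumed, resembles a bounded prefetch buffer. The Python sketch below illustrates that idea with the standard `queue` and `threading` modules; `prefetch` and `fake_get_next_file` are invented stand-ins and do not call SAM.

```python
import queue
import threading


def prefetch(get_next_file, depth=1):
    """Yield files while a background thread fetches up to `depth` files ahead."""
    buffer = queue.Queue(maxsize=depth)
    done = object()   # sentinel marking the end of the file stream

    def producer():
        while True:
            item = get_next_file()
            if item is None:          # stand-in convention: None means no more files
                buffer.put(done)
                return
            buffer.put(item)          # blocks while the buffer is full

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


pending = ["f1", "f2", "f3"]          # made-up file names


def fake_get_next_file():
    """Hypothetical stand-in for SAM's GetNextFile()."""
    return pending.pop(0) if pending else None


consumed = list(prefetch(fake_get_next_file))
print(consumed)
```

While the consumer works on one file, the producer is already requesting the next, so staging latency overlaps with processing instead of adding to it.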

  31. Synchronous Mode: scriptObjects execute SAM.GetNextFile() Linker:A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts SAMDB -Combine all specified contexts -Fully Specified Workflow Description Submission Results Job Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments The Grid return Application scriptObjects Scheduler Job Starts project only; returns project info Submission Results SAMDB SAM (Configurator) Applications Grid Submit

  32. Synchronous Comments • GetNextFile() runs at job execution time, and files are staged to execution hosts • Worker node or local storage element needs access to a SAM station or proxy • The job itself provides notification of successful consumption to SAM. • Native Grid schedulers such as Condor or LCG Resource Broker can still be used.
