
Observations from LCG

David Smith, Maarten Litmaath, for LCG-GD, CERN. CRM Interface discussion, Rome INFN, 17-18 Feb 2005.


Presentation Transcript


  1. Observations from LCG. David Smith, Maarten Litmaath, for LCG-GD, CERN. CRM Interface discussion, Rome INFN, 17-18 Feb 2005

  2. For submission LCG uses:
     • ‘Workload Management’ software (from EDG project)
     • Condor-G
     • Globus GRAM
     • The user submits to a submission service, which is in turn a client that sends submission requests to the resource interfaces themselves
     • The CRM interface periodically reports the status of all managed jobs back to clients
     • Problems:
       • Scaling problems, primarily on the resource interfaces
       • Network requirements can cause deployment complications
       • Initially the CRM interface used a fairly simple query of the batch system to detect state changes; this has caused load problems
       • Some problems are higher-level job management issues but may have implications for CRM interfaces
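One way to picture the periodic status reporting described above is a poller that issues a single bulk query per cycle and reports only the jobs whose state changed. This is a minimal sketch, not the LCG implementation: `StatusPoller`, the `query_all` callable, and the `"GONE"` state are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict


class StatusPoller:
    """Tracks job states using one bulk batch-system query per polling cycle.

    `query_all` is a hypothetical callable that returns {job_id: state}
    for every job the interface manages.
    """

    def __init__(self, query_all: Callable[[], Dict[str, str]]) -> None:
        self._query_all = query_all
        self._last: Dict[str, str] = {}

    def poll(self) -> Dict[str, str]:
        """Return only the jobs whose state changed since the previous poll."""
        current = self._query_all()
        changes = {
            job: state
            for job, state in current.items()
            if self._last.get(job) != state
        }
        # Jobs that disappeared from the batch system are reported once,
        # using the made-up state "GONE".
        for job in self._last:
            if job not in current:
                changes[job] = "GONE"
        self._last = current
        return changes
```

The point of the design is that the cost on the batch system is one query per cycle regardless of how many jobs or clients there are; clients then receive only the differences.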

  3. Some features of job management in LCG-2:
     • Selection of job destination according to requirements expressed in a JDL
     • Optional job resubmission in case of error
     • Supply of input files
     • Retrieval of nominated output files
     • Best resource selection
       • Submit a job to a chosen destination
       • Use a metric to measure response time
       • Once a resource is chosen, the job stays there until it completes (or fails)
     • Network connectivity
       • All the LCG-2 service machines require inbound connectivity
       • But several applications, e.g. GRAM, use port ranges to provide callback addresses. This has been a source of deployment problems
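The selection step in this slide (filter destinations by the job's requirements, then pick the one with the best response-time metric) can be sketched as follows. This is an illustration under stated assumptions, not the Workload Management code: the `Resource` fields, the `estimated_response_s` metric, and `select_resource` are all hypothetical, and a plain Python predicate stands in for a JDL requirements expression.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    max_wall_minutes: int
    # Hypothetical metric: expected wait in seconds before a job starts.
    estimated_response_s: float


def select_resource(
    resources: List[Resource],
    requirement: Callable[[Resource], bool],
) -> Optional[Resource]:
    """Pick the matching resource with the lowest estimated response time.

    Returns None when no resource satisfies the requirement.
    """
    candidates = [r for r in resources if requirement(r)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.estimated_response_s)


# Usage: a job that needs at least two hours of wall-clock time.
sites = [
    Resource("site-a", free_cpus=4, max_wall_minutes=60, estimated_response_s=10.0),
    Resource("site-b", free_cpus=0, max_wall_minutes=2880, estimated_response_s=900.0),
    Resource("site-c", free_cpus=8, max_wall_minutes=1440, estimated_response_s=120.0),
]
chosen = select_resource(sites, lambda r: r.max_wall_minutes >= 120)
```

Because the choice is made once at submission time and the job then stays at the chosen resource, the quality of the response-time estimate matters: a stale or optimistic metric cannot be corrected later by moving the job.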

  4. Some users have arrived at broadly similar solutions to common problems:
     • The submission of a job which contacts an application-specific server to request a task
     • This may improve the distribution of tasks and the response time
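The pattern described in this slide, where a submitted job asks an application-specific server for work, can be sketched in-process as below. `TaskServer`, `request_task`, and `run_job` are hypothetical names, and the loop that keeps requesting tasks until none remain is an assumption; the slide only says the job requests a task. A real deployment would use a network protocol rather than a direct method call.

```python
import queue
from typing import Callable, Iterable, Optional


class TaskServer:
    """Stand-in for an application-specific server that hands out tasks."""

    def __init__(self, tasks: Iterable[str]) -> None:
        self._tasks: "queue.Queue[str]" = queue.Queue()
        for task in tasks:
            self._tasks.put(task)

    def request_task(self) -> Optional[str]:
        """Return the next pending task, or None when nothing is left."""
        try:
            return self._tasks.get_nowait()
        except queue.Empty:
            return None


def run_job(server: TaskServer, execute: Callable[[str], None]) -> int:
    """Body of the submitted job: pull tasks until the server has none.

    Returns the number of tasks this job executed.
    """
    done = 0
    while True:
        task = server.request_task()
        if task is None:
            return done
        execute(task)
        done += 1


# Usage: two jobs share one server; the first to start drains the queue.
results = []
server = TaskServer(["evt-001", "evt-002", "evt-003"])
first = run_job(server, results.append)
second = run_job(server, results.append)
```

Since work is bound to a job only once that job is actually running on a worker node, tasks flow to whichever resources start first, which is one way the distribution of tasks and the response time may improve.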
