
ICFA Workshop On Grid Activities LHCb Data Management Tools


Presentation Transcript


  1. ICFA Workshop On Grid Activities LHCb Data Management Tools

  2. Overview
  • Brief Introduction to the LHCb Computing Model
  • Data Management Requirements
    • RAW, Stripped, MC
  • DIRAC Data Management System
    • Storage Element, File Catalogues, Replica Manager, Transfer Agent, Bulk Transfer, FTS
  • Automatic Data Transfers
    • ReplicationAgent, RAW/Stripped DST Replication
  • File Integrity Checking

  3. Computing Model Intro
  • CERN – Central production centre
    • Distribution of RAW data
    • Quasi-real time to the 6 LHCb Tier1s
  • Tier1s (including CERN) – RAW data reconstruction and stripping
    • Stripped DSTs to be distributed to all other Tier1s
    • Load-balanced availability for analysis
  • Tier2s – Monte Carlo production centres
    • Simulation files uploaded to Tier1s/CERN

  4. DM Requirements 1
  • RAW data files produced at the LHCb Online Farm
    • Files created at 60 MB/s
    • Dedicated 1 GB link to Castor at the Computing Centre
  • Files divided between Tier1 centres
    • Ratio determined by pledged computing resources
  • Files transferred to their assigned Tier1 centre
    • RAW files in Castor have one Tier1 replica
  • Reliable bulk transfer system required
    • Capable of a sustained 60 MB/s out of CERN
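As a rough illustration of splitting files across Tier1s in proportion to pledged resources, a minimal Python sketch follows; the site names are real Tier1s, but the pledge fractions are placeholders, not LHCb's actual shares.

```python
# Sketch of dividing RAW files between Tier1s in proportion to pledged
# computing resources. The pledge fractions below are placeholders only.

import random

PLEDGE_FRACTIONS = {      # hypothetical shares, must sum to 1.0
    "CNAF": 0.15, "FZK": 0.15, "IN2P3": 0.25,
    "PIC": 0.10, "RAL": 0.20, "SARA": 0.15,
}

def assign_tier1(lfn):
    """Pick the single Tier1 that will hold the replica of this RAW file."""
    sites = list(PLEDGE_FRACTIONS)
    weights = [PLEDGE_FRACTIONS[s] for s in sites]
    return random.choices(sites, weights=weights, k=1)[0]
```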

  5. DM Requirements 2
  • Stripped DST files produced at Tier1 sites (including CERN)
    • RAW files reconstructed (currently in groups of 20/40)
    • Resulting rDSTs stripped once created
    • Stripped DSTs to be distributed to all other Tier1s
  • Reliable transfer system required between Tier1 sites
    • Either copy stripped DSTs 'file-by-file'
    • Or collect files at Tier1s and perform bulk transfers
  • Monte Carlo files mostly produced at Tier2 sites
    • Uploaded to CERN/Tier1s
    • Typical T2-T1 throughput ~1.1 MB/s yearly average

  6. DIRAC DM System
  • The main components are:
    • Storage Element and storage access plug-ins
    • Replica Manager
    • File Catalogs
  [Diagram: DIRAC Data Management Components – Data Management Clients (UserInterface, WMS, TransferAgent) use the ReplicaManager, which talks to File Catalogs A/B/C and the RequestDB; the StorageElement reaches physical storage through the SE Service and the HTTPStorage, GridFTPStorage and SRMStorage plug-ins.]

  7. Storage Element
  • The DIRAC StorageElement is an abstraction of a storage facility
  • Access to storage is provided by plug-in modules for each available access protocol
    • Pluggable transport modules: srm, gridftp, bbftp, sftp, http, …
  • The Storage Element is used mostly to get access to the files
    • The Grid SE (also called a Storage Element) is the underlying resource used
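A minimal Python sketch of how such a protocol plug-in scheme can be structured; the class and method names are illustrative, not the actual DIRAC StorageElement API.

```python
# Illustrative sketch of a protocol plug-in scheme; names are hypothetical,
# not the real DIRAC StorageElement interface.

class StoragePlugin:
    """Base class for one access protocol (srm, gridftp, http, ...)."""
    protocol = None

    def get_file(self, path, local_path):
        raise NotImplementedError


class GridFTPStorage(StoragePlugin):
    protocol = "gridftp"

    def get_file(self, path, local_path):
        # a real module would wrap a gridftp client here
        print(f"gridftp get {path} -> {local_path}")


class SRMStorage(StoragePlugin):
    protocol = "srm"

    def get_file(self, path, local_path):
        # a real module would wrap an SRM client here
        print(f"srm get {path} -> {local_path}")


class StorageElement:
    """Abstraction of a storage facility: tries its plug-ins in order."""

    def __init__(self, name, plugins):
        self.name = name
        self.plugins = plugins

    def get_file(self, path, local_path):
        for plugin in self.plugins:
            try:
                return plugin.get_file(path, local_path)
            except Exception:
                continue  # fall back to the next protocol
        raise RuntimeError(f"all protocols failed for {path}")
```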

  8. File Catalogs
  • DIRAC Data Management was designed to work with multiple File Catalogs
    • All available catalogues have identical APIs
    • Can be used interchangeably
  • Available catalogues
    • LCG File Catalog – LFC
      • Current baseline choice
    • Processing Database File Catalog
      • Exposes the Processing DB Datafiles and Replicas tables as a File Catalog
      • (more later)
    • BK database replica tables
      • To be phased out
    • + others…
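A sketch of the "identical API" idea: every catalogue implements the same small interface, so clients can swap them freely. The class and method names here are hypothetical.

```python
# Illustrative sketch of catalogues sharing one interface so they can be
# used interchangeably; these are hypothetical classes, not the DIRAC code.

class FileCatalog:
    """Common API every catalogue implements."""

    def addFile(self, lfn, pfn, se):
        raise NotImplementedError

    def getReplicas(self, lfn):
        raise NotImplementedError


class LFCCatalog(FileCatalog):
    def addFile(self, lfn, pfn, se):
        print(f"LFC: register {lfn} -> {pfn} at {se}")

    def getReplicas(self, lfn):
        return {}


class ProcessingDBCatalog(FileCatalog):
    def addFile(self, lfn, pfn, se):
        print(f"ProcDB: register {lfn} at {se}")

    def getReplicas(self, lfn):
        return {}
```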

  9. Replica Manager
  • The Replica Manager provides the logic for all data management operations
    • File upload/download to/from the Grid
    • File replication across SEs
    • Registration in catalogues
    • etc.
  • Keeps a list of active File Catalogues
    • All registrations are applied to all catalogues
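A sketch of the "all registrations applied to all catalogues" logic, assuming catalogue and storage-element objects with interfaces like those sketched above; again, the names are illustrative, not the real DIRAC code.

```python
# Sketch of the "apply every registration to all active catalogues" idea;
# ReplicaManager here is a simplified stand-in for the real component.

class ReplicaManager:
    def __init__(self, catalogues, storage_elements):
        self.catalogues = catalogues               # active FileCatalog objects
        self.storage_elements = storage_elements   # name -> StorageElement

    def putAndRegister(self, lfn, local_path, se_name):
        """Upload a file to one SE, then register it in every catalogue."""
        se = self.storage_elements[se_name]
        se.put_file(local_path, lfn)               # assumed SE upload method
        results = {}
        for catalogue in self.catalogues:
            try:
                catalogue.addFile(lfn, pfn=f"{se_name}:{lfn}", se=se_name)
                results[type(catalogue).__name__] = "OK"
            except Exception as err:
                results[type(catalogue).__name__] = f"Failed: {err}"
        return results                             # full log of what happened
```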

  10. Transfer Agent + RequestDB
  • Data Management requests stored in the RequestDB
    • XML containing the parameters required for the operation
    • e.g. Operation, LFN, SourceSE, TargetSE, etc.
  • Transfer Agent
    • Picks up requests from the RequestDB and executes them
    • Operations performed through the Replica Manager
    • Replica Manager returns a full log of the operations
    • Transfer Agent performs retries based on these logs
    • Retries attempted until success
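A sketch of the request/retry pattern, with a made-up example request; the XML layout, RequestDB methods and LFN are illustrative only.

```python
# Sketch of the RequestDB / Transfer Agent pattern: requests carry the
# parameters of one operation and are retried until they succeed.
# All names here are illustrative, not the actual DIRAC interfaces.

import xml.etree.ElementTree as ET

# An example request, as it might be stored in the RequestDB (hypothetical).
request_xml = """
<request status="Waiting">
  <operation name="replicateAndRegister">
    <lfn>/lhcb/data/RAW/example_00001234.raw</lfn>
    <sourceSE>CERN-RAW</sourceSE>
    <targetSE>RAL-RAW</targetSE>
  </operation>
</request>
"""

def execute_request(xml_text, replica_manager):
    """Parse one request and hand it to the Replica Manager."""
    root = ET.fromstring(xml_text)
    op = root.find("operation")
    lfn = op.findtext("lfn")
    source = op.findtext("sourceSE")
    target = op.findtext("targetSE")
    # The Replica Manager is assumed to return (success, log).
    return replica_manager.replicate(lfn, source, target)

def transfer_agent_cycle(request_db, replica_manager):
    """One agent pass: execute Waiting requests, keep failures for retry."""
    for request in request_db.get_requests(status="Waiting"):
        ok, log = execute_request(request, replica_manager)
        if ok:
            request_db.set_status(request, "Done")
        else:
            # Leave the request Waiting; the next pass retries it.
            print("retry later:", log)
```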

  11. Bulk Data Management
  • Bulk asynchronous file replication
    • Requests set in the RequestDB
  • Transfer Agent executes periodically
    • 'Waiting' or 'Running' requests obtained from the RequestDB
    • FTS bulk transfer jobs submitted and monitored
  [Diagram: the LHCb DIRAC DMS (RequestDB, Transfer Agent, Replica Manager, FC Interface, Transfer Manager Interface) driving the LCG machinery (LCG File Catalog, File Transfer Service, transfer network) between the Tier0 SE and Tier1 SEs A/B/C.]
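A sketch of the bulk-transfer cycle, assuming a generic FTS client wrapper (fts_client) and a RequestDB handle; neither name corresponds to the real FTS or DIRAC interfaces.

```python
# Sketch of bulk replication via FTS: group waiting requests by
# (source, target) channel and submit one bulk job per channel.
# fts_client and request_db are hypothetical stand-ins for the real services.

from collections import defaultdict

def submit_bulk_transfers(request_db, fts_client):
    """One Transfer Agent pass for bulk FTS submission."""
    # Gather waiting requests and group them per point-to-point channel.
    channels = defaultdict(list)
    for req in request_db.get_requests(status="Waiting"):
        channels[(req["sourceSE"], req["targetSE"])].append(req)

    # Submit one FTS job per channel, containing all of its files.
    for (source, target), requests in channels.items():
        files = [(r["sourceSURL"], r["targetSURL"]) for r in requests]
        job_id = fts_client.submit(source, target, files)
        for r in requests:
            request_db.set_status(r, "Running", fts_job=job_id)

def monitor_bulk_transfers(request_db, fts_client):
    """Poll running FTS jobs and mark completed requests Done."""
    for req in request_db.get_requests(status="Running"):
        state = fts_client.status(req["fts_job"])
        if state == "Finished":
            request_db.set_status(req, "Done")
        elif state == "Failed":
            request_db.set_status(req, "Waiting")  # retry on the next pass
```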

  12. FTS Architecture
  • Point-to-point channels defined:
    • CERN-T1s
    • Tier1-Tier1 matrix
  • Bulk transfers tested during SC3 and LHCb's DC06
  [Diagram: each Tier1 (SARA, RAL, CNAF, FZK, IN2P3, PIC) runs an FTS server that manages its incoming channels.]

  13. Bulk Transfer Performance
  [Chart: transfer rate (MB/s, 0-60) vs date, 9 Oct – 6 Nov 2005, one curve per channel: CERN_Castor to RAL_dCache, PIC_Castor, SARA_dCache, IN2P3_HPSS, GRIDKA_dCache and CNAF_Castor (SC3); annotated with 'Many Castor 2 problems', 'Service intervention required' and 'SARA problems'.]

  14. Half-Time Summary
  • RAW data arrives at Castor
    • 60 MB/s out of CERN to the Tier1s
    • DIRAC Transfer Agent interfaced to LCG FTS
  • Monte Carlo files generated at Tier2s
    • Uploaded to a Grid SE using DIRAC DMS functionality
  • Stripped DSTs created at Tier1s
    • Mechanism still to be chosen for distribution
      • Files transferred as they become available, or
      • Wait for a collection of files and perform bulk transfers
      • Utilizing the Tier1-Tier1 channels
    • Strategy for replication also to be decided

  15. LHCb Online to Castor
  • Files created at the LHCb Online Farm at 60 MB/s
  • These files must be transferred to Castor
  • DIRAC instance installed on a gateway at the Farm
    • Online 'data mover' places a transfer request via XML-RPC
    • Processed by the ReplicaManager and TransferAgent
  [Diagram: DIRAC at the pit – the Online Run Database and data mover place requests (XML-RPC) in the RequestDB; the Transfer Agent and Replica Manager move files from Online Storage to CERN Castor and register them through the FC API in the LFC, BK DB and ADTDB.]
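A sketch of the data mover placing a request over XML-RPC; the endpoint URL and method name are invented for illustration.

```python
# Sketch of the Online 'data mover' placing a transfer request over XML-RPC.
# The endpoint URL and method name are hypothetical; the real RequestDB
# service exposes its own interface.

import xmlrpc.client

def place_transfer_request(run_file, source_se="OnlineStorage",
                           target_se="CERN-Castor"):
    """Ask the DIRAC instance at the pit to move one RAW file to Castor."""
    proxy = xmlrpc.client.ServerProxy("http://dirac-gateway.example:8080/RequestDB")
    request = {
        "operation": "putAndRegister",
        "lfn": run_file,
        "sourceSE": source_se,
        "targetSE": target_se,
    }
    # The service is assumed to return a request identifier.
    return proxy.setRequest(request)
```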

  16. Auto Data Transfers
  • DIRAC components developed to perform data-driven production, reconstruction and stripping
  • ProcessingDB contains a pseudo file catalogue
    • Offers an API to manipulate catalogue entries
    • Based on 'transformations' contained in the DB
      • File 'mask' applied to the LFN
      • Can select files of given properties and locations
  • A Data Management instance, the 'AutoDataTransferDB', was spawned from this
  • TransformationAgent manipulates the ProcessingDB API
    • Selects files of a particular type, e.g. raw/dst/rdst
    • Submits DIRAC jobs to the WMS based on these files
      • To perform reconstruction or stripping
  • This component was adapted to create the 'ReplicationAgent' for Data Management operations
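A sketch of applying a transformation file mask and grouping the selected files into job-sized chunks; the mask format, helper names and example LFNs are illustrative only.

```python
# Sketch of a transformation 'file mask' applied to LFNs: select files of a
# given type at a given location and group them into job-sized chunks.

import re

def apply_mask(files, lfn_pattern, source_se):
    """files: list of dicts with 'lfn' and 'se'; return those passing the mask."""
    mask = re.compile(lfn_pattern)
    return [f for f in files
            if mask.search(f["lfn"]) and f["se"] == source_se]

def group_into_jobs(selected, files_per_job):
    """Split the selected files into lists, one per DIRAC job."""
    return [selected[i:i + files_per_job]
            for i in range(0, len(selected), files_per_job)]

# Example: pick RAW files sitting at CERN and make jobs of 20 files each.
files = [{"lfn": "/lhcb/data/run1/file_0001.raw", "se": "CERN-RAW"},
         {"lfn": "/lhcb/data/run1/file_0002.rdst", "se": "CERN-RDST"}]
raw_at_cern = apply_mask(files, r"\.raw$", "CERN-RAW")
jobs = group_into_jobs(raw_at_cern, files_per_job=20)
```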

  17. ReplicationAgent
  • Replication agent developed to allow automatic data transfers when files become available
  • Transformations defined for each DM operation to be performed
    • Define the source and target SEs
    • File mask
    • Number of files to be transferred in each job
  • ReplicationAgent operation
    • Checks active files in the ProcDB
    • Applies the mask based on file type
    • Checks the location of each file
    • Files which pass the mask and match the SourceSE are selected for the transformation
    • Once a threshold number of files is found, bulk transfer jobs are submitted
  • ReplicationAgent logic generalised so that multiple transformations can be defined and run simultaneously
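A sketch of one ReplicationAgent pass for a single transformation, with hypothetical ProcDB and WMS handles.

```python
# Sketch of one ReplicationAgent pass: select active files matching the
# mask and source SE, and submit bulk transfer jobs once a threshold
# number of files has accumulated. All names are illustrative.

def replication_agent_pass(proc_db, wms, transformation):
    """transformation: dict with 'mask', 'sourceSE', 'targetSE', 'files_per_job'."""
    candidates = [
        f for f in proc_db.get_active_files()
        if transformation["mask"] in f["lfn"]
        and f["se"] == transformation["sourceSE"]
    ]
    # Only submit once enough files have accumulated for one bulk job.
    n = transformation["files_per_job"]
    while len(candidates) >= n:
        batch, candidates = candidates[:n], candidates[n:]
        wms.submit_transfer_job(
            lfns=[f["lfn"] for f in batch],
            source_se=transformation["sourceSE"],
            target_se=transformation["targetSE"],
        )
        proc_db.mark_assigned([f["lfn"] for f in batch])
```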

  18. Automatic RAW Replication
  [Diagram: automatic RAW replication from CERN Castor to the Tier1 SEs via gLite FTS, involving the Replication Agent, ProcDB, RequestDB, Transfer Agent, Replica Manager, FC API, LFC, BK DB, ADTDB and the DIRAC WMS.]

  19. Performance…

  20. Stripped DST Replication
  [Diagram: stripped DSTs produced on Tier1 worker nodes (Tier1 CE/WN) are replicated to the other Tier1 SEs via gLite FTS, using the same chain of Replication Agent, RequestDB, Transfer Agent, Replica Manager, FC API, LFC, BK DB, ADTDB and the DIRAC WMS.]

  21. File Integrity Checking
  • Need to maintain the integrity of the file catalogues
  • Catalogue entries present on SEs
    • Regular listing of catalogue entries
    • Check that these entries exist on the SEs
      • via SRM functionality
    • Files missing from the SEs can be re-replicated
  • SE contents against catalogues
    • List the contents of the SE
    • Check against the catalogue for corresponding replicas
      • Possible because of the file naming conventions
      • File paths on an SE are always 'SEhost/SAPath/LFN'
    • Files missing from the catalogue can be
      • Re-registered in the catalogue
      • Deleted from the SE
      • Depending on the file properties
  • These processes will eventually be run regularly
    • As a DIRAC Agent or daemon process
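A sketch of the two checks, using the 'SEhost/SAPath/LFN' convention to map between LFNs and physical paths; the catalogue and SE objects are hypothetical stand-ins for the real interfaces.

```python
# Sketch of the two integrity checks, relying on the naming convention that
# a replica's physical path is always '<SEhost>/<SAPath>/<LFN>'.

def catalogue_vs_se(catalogue, se, sa_path):
    """Find catalogue entries whose replica is missing from the SE."""
    missing = []
    for lfn in catalogue.list_lfns():
        if not se.exists(sa_path + lfn):   # e.g. an SRM existence check
            missing.append(lfn)            # candidate for re-replication
    return missing

def se_vs_catalogue(catalogue, se, sa_path):
    """Find SE files with no corresponding replica in the catalogue."""
    dark_data = []
    for physical_path in se.list_files(sa_path):
        lfn = physical_path[len(sa_path):]   # recover the LFN from the path
        if not catalogue.has_replica(lfn, se.host):
            dark_data.append(physical_path)  # re-register or delete
    return dark_data
```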

  22. Summary
  • DIRAC DMS built from the ReplicaManager accessing the File Catalogue and Storage Element interfaces
    • The TransferAgent was also extended to perform bulk transfers using FTS
  • DMS used to get RAW data from LHCb Online to Castor
    • Then to distribute it to the Tier1s in a load-balanced way
  • Reconstruction jobs created automatically
    • Data-driven mechanism to perform reconstruction and stripping
  • Transfer jobs created automatically to distribute the data

  23. Questions…?
