ICFA Workshop On Grid Activities LHCb Data Management Tools

Overview
  • Brief Introduction to LHCb Computing Model
  • Data Management Requirements
    • RAW, Stripped, MC.
  • DIRAC Data Management System
    • Storage Element, File Catalogues, Replica Manager, Transfer Agent, Bulk Transfer, FTS.
  • Automatic Data Transfers
    • ReplicationAgent, RAW/Stripped DST Replication
  • File Integrity Checking
Computing Model Intro.

CERN – Central production centre

  • Distribution of RAW data
  • Quasi-real time to 6 LHCb Tier1s

Tier1s (including CERN) – RAW data reconstruction and stripping

  • Stripped DSTs to be distributed to all other Tier1s
  • Load balanced availability for Analysis

Tier2s – Monte Carlo production centres

  • Simulation files uploaded to Tier1s/CERN
DM Requirements 1

RAW data files produced at LHCb Online Farm

  • Files created at 60MB/s
  • Dedicated 1 Gb/s link to Castor at the Computing Centre
  • Files divided between Tier1 centres
    • Ratio determined by pledged computing resources
  • Files transferred to assigned Tier1 centre
    • RAW files in Castor have one Tier1 replica
  • Reliable bulk transfer system required
    • Capable of sustained 60MB/s out of CERN
DM Requirements 2

Stripped DST files produced at Tier1 sites (including CERN)

  • RAW files reconstructed (currently in groups of 20/40)
  • Resulting rDSTs stripped once created
  • Stripped DSTs to be distributed to all other Tier1s
  • Reliable transfer system required between Tier1 sites
    • Either copy stripped DSTs 'file-by-file', or
    • Collect files at the Tier1s and perform bulk transfers

Monte Carlo files mostly produced at Tier2 sites

  • Uploaded to CERN/Tier1s
    • Typical T2-T1 throughput ~1.1MB/s yearly average
DIRAC DM System
  • The main components are:
    • Storage Element and Storage access plug-ins
    • Replica Manager
    • File Catalogs

[Diagram: DIRAC Data Management components. Clients (UserInterface, WMS, TransferAgent) use the ReplicaManager, which registers files in the File Catalogs (FileCatalogA/B/C) and accesses physical storage through the StorageElement with its SRMStorage, GridFTPStorage and HTTPStorage plug-ins, or via the SE Service. The TransferAgent works from the Request DB.]

Storage Element
  • DIRAC StorageElement is an abstraction of a Storage facility
    • Access to storage is provided by plug-in modules for each available access protocol.
    • Pluggable transport modules: srm, gridftp, bbftp, sftp, http, … (see the sketch below)
  • The Storage Element is used mostly to access files
  • The underlying resource is a Grid SE (also called a Storage Element)
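
A minimal sketch, in Python (DIRAC's implementation language), of what such a plug-in scheme can look like. The plug-in class names SRMStorage and GridFTPStorage appear in the component diagram above; the method names and the selection logic are illustrative assumptions, not the actual DIRAC API.

```python
# Illustrative sketch only: plug-in pattern for DIRAC-style Storage Elements.
# Method names (get_file/put_file) and the selection logic are assumptions.

class StorageBase:
    """Common interface implemented by every access-protocol plug-in."""
    protocol = None

    def get_file(self, path, local_dest):
        raise NotImplementedError

    def put_file(self, local_src, path):
        raise NotImplementedError


class SRMStorage(StorageBase):
    protocol = "srm"
    def get_file(self, path, local_dest):
        print(f"srmcp {path} -> {local_dest}")   # would call the SRM client here
    def put_file(self, local_src, path):
        print(f"srmcp {local_src} -> {path}")


class GridFTPStorage(StorageBase):
    protocol = "gridftp"
    def get_file(self, path, local_dest):
        print(f"globus-url-copy {path} -> {local_dest}")
    def put_file(self, local_src, path):
        print(f"globus-url-copy {local_src} -> {path}")


class StorageElement:
    """Abstraction of a storage facility: picks a plug-in per protocol."""
    def __init__(self, name, plugins):
        self.name = name
        self.plugins = {p.protocol: p for p in plugins}

    def get_file(self, path, local_dest, protocol="srm"):
        return self.plugins[protocol].get_file(path, local_dest)


# Example: a hypothetical Tier1 SE reachable via SRM or GridFTP.
se = StorageElement("CERN-RAW", [SRMStorage(), GridFTPStorage()])
se.get_file("/lhcb/data/run123/file.raw", "/tmp/file.raw")
```
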
File Catalogs
  • DIRAC Data Management was designed to work with multiple File Catalogs
  • All available catalogs have identical APIs
    • Can be used interchangeably (see the sketch below)
  • Available catalogs
    • LCG File Catalog – LFC
      • Current baseline choice
    • Processing Database File Catalog
      • Exposes the Processing DB Datafiles and Replicas tables as a File Catalog
      • (more later)
    • BK database replica tables
      • To be phased out
    • +others….
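
A minimal sketch of what an "identical API" across catalogs could look like; the class and method names are assumptions, not the real DIRAC or LFC interfaces.

```python
# Illustrative sketch only: a common File Catalog API so that the LFC, the
# Processing DB catalog, etc. can be swapped without changing client code.
# Method names here are assumptions, not the actual DIRAC interface.

class FileCatalogBase:
    def add_file(self, lfn, size, guid):
        raise NotImplementedError
    def add_replica(self, lfn, storage_element, pfn):
        raise NotImplementedError
    def get_replicas(self, lfn):
        raise NotImplementedError


class LcgFileCatalog(FileCatalogBase):
    """Baseline choice: would wrap the LFC client library."""
    def __init__(self):
        self._replicas = {}
    def add_file(self, lfn, size, guid):
        self._replicas.setdefault(lfn, [])
    def add_replica(self, lfn, storage_element, pfn):
        self._replicas.setdefault(lfn, []).append((storage_element, pfn))
    def get_replicas(self, lfn):
        return self._replicas.get(lfn, [])


class ProcessingDBCatalog(LcgFileCatalog):
    """Would expose the Processing DB Datafiles/Replicas tables instead."""


# Because the API is identical, either catalog can be used interchangeably.
for catalog in (LcgFileCatalog(), ProcessingDBCatalog()):
    catalog.add_file("/lhcb/data/run123/file.raw", size=2_000_000_000, guid="abc")
    catalog.add_replica("/lhcb/data/run123/file.raw", "CERN-RAW", "srm://cern.ch/...")
    print(catalog.get_replicas("/lhcb/data/run123/file.raw"))
```
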
Replica Manager
  • Replica Manager provides logic for all data management operations
    • File upload/download to/from Grid
    • File replication across SEs
    • Registration in catalogs
    • etc.
  • Keeps a list of active File Catalogs
    • All registrations are applied to every active catalogue (see the sketch below)
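
A sketch of the registration logic implied by the two bullets above: replicate once, then register the new replica in every active catalog. Class and method names are assumptions, chosen to line up with the catalog and storage sketches above rather than with the real DIRAC code.

```python
# Illustrative sketch only: ReplicaManager-style logic that replicates a file
# to a target SE and then registers the new replica in every active catalog.

class ReplicaManager:
    def __init__(self, storage_elements, catalogs):
        self.storage_elements = storage_elements   # name -> StorageElement-like object
        self.catalogs = catalogs                   # list of active File Catalogs

    def replicate_and_register(self, lfn, source_se, target_se):
        # 1. physical copy between storage elements (details abstracted away)
        pfn = self.storage_elements[target_se].copy_from(
            self.storage_elements[source_se], lfn)
        # 2. register the new replica in ALL active catalogs
        results = {}
        for catalog in self.catalogs:
            try:
                catalog.add_replica(lfn, target_se, pfn)
                results[catalog.__class__.__name__] = "OK"
            except Exception as exc:   # a failed catalog must not hide the others
                results[catalog.__class__.__name__] = f"failed: {exc}"
        return results
```
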
Transfer Agent + RequestDB
  • Data Management requests stored in RequestDB
    • An XML document containing the parameters required for the operation
    • e.g. Operation, LFN, SourceSE, TargetSE, … (see the example below)
  • Transfer Agent
    • Picks up requests from RequestDB and executes them
    • Operations performed through Replica Manager
    • Replica Manager returns full log of operations
    • Transfer Agent performs retries based on logs
      • Retries are attempted until the operation succeeds
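
A hypothetical example of such an XML request, built with Python's standard library; the element and attribute names are assumptions, and only the listed parameters (Operation, LFN, SourceSE, TargetSE) come from the slide.

```python
# Illustrative sketch only: building a DM request like those stored in the RequestDB.
import xml.etree.ElementTree as ET

request = ET.Element("DataManagementRequest")
op = ET.SubElement(request, "Operation", {
    "Type": "replicateAndRegister",          # hypothetical operation name
    "SourceSE": "CERN-RAW",
    "TargetSE": "RAL-RAW",
})
ET.SubElement(op, "LFN").text = "/lhcb/data/run123/file.raw"

print(ET.tostring(request, encoding="unicode"))
# The Transfer Agent picks such a request up from the RequestDB, executes it
# through the Replica Manager and retries until it succeeds.
```
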
Bulk Data Management

[Diagram: the LHCb/DIRAC DMS (Request DB, Transfer Agent, Replica Manager, FC Interface to the LCG File Catalog) drives the LCG machinery (Transfer Manager Interface, File Transfer Service) to replicate data from the Tier0 SE to Tier1 SEs A, B and C over the transfer network.]
  • Bulk asynchronous file replication
    • Requests set in RequestDB
    • Transfer Agent executes periodically
    • ‘Waiting’ or ‘Running’ requests obtained from RequestDB
    • FTS bulk transfer jobs are submitted and monitored (see the sketch below)
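
A sketch of this periodic cycle, assuming hypothetical helpers for the RequestDB and FTS interfaces; it only illustrates the grouping of 'Waiting' requests into bulk FTS jobs, not the real Transfer Agent code.

```python
# Illustrative sketch only: the periodic bulk-replication cycle described above.
# fetch_requests(), submit_fts_job() and monitor_fts_job() are hypothetical
# stand-ins for the RequestDB and FTS interfaces.

def fetch_requests(request_db, states=("Waiting", "Running")):
    """Return DM requests in the given states from the RequestDB."""
    return [r for r in request_db if r["State"] in states]

def transfer_agent_cycle(request_db, submit_fts_job, monitor_fts_job):
    # Group waiting transfers per (SourceSE, TargetSE) pair so that each FTS
    # job is a bulk job over a single channel.
    bulk = {}
    for req in fetch_requests(request_db, states=("Waiting",)):
        bulk.setdefault((req["SourceSE"], req["TargetSE"]), []).append(req["LFN"])

    for (source, target), lfns in bulk.items():
        job_id = submit_fts_job(source, target, lfns)   # one bulk FTS job per channel
        print(f"submitted FTS job {job_id}: {len(lfns)} files {source} -> {target}")

    # Requests already marked Running are just monitored until FTS finishes.
    for req in fetch_requests(request_db, states=("Running",)):
        print(req["LFN"], monitor_fts_job(req["FTSJobID"]))

# The agent would execute this cycle periodically, e.g.:
#     while True:
#         transfer_agent_cycle(db, submit, monitor); sleep a few minutes
```
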


FTS Architecture

[Diagram: each Tier1 site (SARA, RAL, CNAF, FZK, IN2P3, PIC) runs its own FTS server, which manages that site's incoming channels.]
  • Point-to-point channels defined:
    • CERN-T1s
    • Tier1-Tier1 matrix (enumerated in the sketch below)
  • Bulk Transfers Tested During SC3 and LHCb’s DC06
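
For illustration, the channel sets implied by these bullets can be enumerated as follows; treating channels as ordered source/destination pairs is an assumption.

```python
# Illustrative sketch only: enumerating the point-to-point FTS channels
# implied by the slide (CERN to each Tier1, plus the full Tier1-Tier1 matrix).
from itertools import permutations

TIER1S = ["SARA", "RAL", "CNAF", "FZK", "IN2P3", "PIC"]

cern_channels = [("CERN", t1) for t1 in TIER1S]
tier1_matrix = list(permutations(TIER1S, 2))   # every ordered Tier1 pair

print(len(cern_channels), "CERN-T1 channels")      # 6
print(len(tier1_matrix), "Tier1-Tier1 channels")   # 30
```
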
Bulk Transfer Performance

[Plot: transfer rate (MB/s, 0-60) out of CERN Castor to the Tier1 SEs during SC3 (RAL dCache, PIC Castor, SARA dCache, IN2P3 HPSS, GRIDKA dCache, CNAF Castor), 9/10/05 to 6/11/05, against the required 60 MB/s rate; annotations mark Castor 2 problems, a service intervention and SARA problems.]
Half-Time Summary
  • RAW data arrives at Castor
    • 60MB/s out of CERN to Tier1s
    • DIRAC Transfer Agent Interfaced to LCG FTS
  • Monte Carlo files generated at Tier2s
    • Upload to GRID SE using DIRAC DMS functionality
  • Stripped DST created at Tier1s
    • Mechanism still to be chosen for distribution
      • Files transferred as they become available or
      • Wait for a collection of files and perform bulk transfers
        • Utilizing Tier1-Tier1 channels
    • Strategy for replication also to be decided
LHCb Online to Castor

[Diagram: the DIRAC instance at the pit: Online Run Database, Data Mover, XML-RPC interface to the RequestDB, Transfer Agent and Replica Manager moving files from the Online Storage to CERN Castor, with registration in the LFC, BK DB and ADTDB through the FC API.]
  • Files created at LHCb Online Farm at 60MB/s
    • These files must be transferred to Castor
  • DIRAC Instance installed on gateway at Farm
    • Online 'data mover' places a transfer request (see the sketch below)
    • Processed by the ReplicaManager and TransferAgent
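
A sketch of how the data mover's request placement might look over XML-RPC; the service URL, method name and request fields are hypothetical, only the overall flow comes from the slide.

```python
# Illustrative sketch only: the online "data mover" placing a transfer request.
import xmlrpc.client

# Hypothetical endpoint of the RequestDB service on the DIRAC gateway at the pit
REQUEST_DB_URL = "http://dirac-gateway.example.cern.ch:9130"

def request_transfer_to_castor(lfn, online_pfn):
    proxy = xmlrpc.client.ServerProxy(REQUEST_DB_URL)
    request = {
        "Operation": "putAndRegister",      # hypothetical operation name
        "LFN": lfn,
        "SourcePFN": online_pfn,
        "TargetSE": "CERN-Castor",
    }
    return proxy.setRequest(request)        # picked up later by the Transfer Agent

# Example (requires the RequestDB service to be running at the pit):
#     request_transfer_to_castor("/lhcb/data/run123/file.raw",
#                                "file:///online/storage/run123/file.raw")
```
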

Auto Data Transfers
  • DIRAC components developed to perform data driven production, reconstruction and stripping
    • ProcessingDB contains pseudo file catalogue
      • Offers API to manipulate catalogue entries
      • Based on ‘transformations’ contained in the DB
        • File ‘mask’ applied to LFN
      • Can select files of given properties and locations
      • A Data Management instance of this catalogue, the 'AutoDataTransferDB' (ADTDB), was spawned
    • TransformationAgent drives the ProcessingDB API
      • Selects files of a particular type, e.g. raw/dst/rdst (see the sketch below)
      • Submits DIRAC jobs to the WMS based on these files
        • To perform reconstruction or stripping
      • This component was adapted to create the 'ReplicationAgent' for Data Management operations
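
A sketch of a transformation whose file mask is applied to LFNs; the field names and the mask convention (matching on the LFN suffix) are assumptions.

```python
# Illustrative sketch only: a "transformation" with a file mask applied to
# LFNs, used to pick out files of a given type and location.
import re

transformation = {
    "FileMask": r"\.raw$",        # select RAW files
    "SourceSE": "CERN-RAW",
    "TargetSE": "RAL-RAW",
    "FilesPerJob": 100,
}

catalog_entries = [
    {"LFN": "/lhcb/data/run123/0001.raw", "SE": "CERN-RAW"},
    {"LFN": "/lhcb/data/run123/0001.rdst", "SE": "CERN-RAW"},
    {"LFN": "/lhcb/data/run124/0002.raw", "SE": "PIC-RAW"},
]

selected = [
    e["LFN"] for e in catalog_entries
    if re.search(transformation["FileMask"], e["LFN"])
    and e["SE"] == transformation["SourceSE"]
]
print(selected)   # only the RAW file already at the source SE
```
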
ReplicationAgent
  • Replication agent developed to allow automatic data transfers when files become available
  • Transformations defined for each DM operation to be performed
    • Defines source and target SEs
    • File mask
    • Number of files to be transferred in each job
  • ReplicationAgent operation
    • Checks active files in ProcDB
    • Applies mask based on file type
    • Checks the location of each file
    • Files which pass the mask and match the SourceSE are selected for the transformation
    • Once a threshold number of files is found, bulk transfer jobs are submitted (see the sketch below)
  • ReplicationAgent logic generalised so that multiple transformations can be defined and run simultaneously
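
A sketch of the threshold step, with submit_bulk_transfer() as a hypothetical stand-in for the creation of a DIRAC bulk transfer job.

```python
# Illustrative sketch only: the ReplicationAgent decision step. Once the number
# of files passing a transformation's mask/SourceSE selection reaches the
# per-job threshold, a bulk transfer job is submitted.

def replication_agent_cycle(transformations, active_files, submit_bulk_transfer):
    for transform in transformations:
        matching = [
            f["LFN"] for f in active_files
            if f["Type"] == transform["FileType"] and f["SE"] == transform["SourceSE"]
        ]
        # submit one bulk job per full batch; leftovers wait for the next cycle
        batch_size = transform["FilesPerJob"]
        while len(matching) >= batch_size:
            batch, matching = matching[:batch_size], matching[batch_size:]
            submit_bulk_transfer(batch, transform["SourceSE"], transform["TargetSE"])

# Several transformations can be defined and run in the same cycle, e.g. RAW
# replication to one Tier1 and stripped-DST replication to all the others.
```
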
Automatic RAW Replication

[Diagram: the Replication Agent, using the ProcDB/ADTDB and the LFC and BK DB catalogues via the FC API, creates requests handled by the DIRAC Data WMS and RequestDB; the Transfer Agent and Replica Manager then replicate RAW data from CERN Castor to the Tier1 SE via gLite FTS.]

Stripped DST Replication

[Diagram: stripping jobs on worker nodes at the Tier1 CE write stripped DSTs to the local Tier1 SE; the Replication Agent, DIRAC Data WMS, RequestDB, Transfer Agent and Replica Manager (with the LFC, BK DB and ADTDB via the FC API) then distribute them to the other Tier1 SEs via gLite FTS.]

File Integrity Checking
  • Need to maintain integrity of file catalogues
    • Catalogue entries present on SEs
      • Regular listing of catalogue entries
      • Check that these entries exist on the SEs
        • via SRM functionalities
      • Files missing from SEs can be re-replicated
    • SE contents against catalogues
      • List the contents of the SE
      • Check against the catalogue for corresponding replicas
        • Possible because of file naming conventions
        • File paths on the SE are always 'SEhost/SAPath/LFN'
      • Files missing from the catalogue can be
        • Re-registered in catalogue
        • Deleted from SE
        • Depending on file properties
    • These processes will eventually be run regularly
      • By a DIRAC Agent or daemon process (see the sketch below)
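
A sketch of the two checks as set comparisons, assuming hypothetical inputs listing catalogue LFNs and SE paths; only the 'SEhost/SAPath/LFN' convention comes from the slide.

```python
# Illustrative sketch only: the two integrity checks described above.

SA_PATH = "/castor/cern.ch/grid"      # hypothetical storage-area prefix for this SE

def lfn_to_se_path(lfn):
    return SA_PATH + lfn              # files on the SE always live at SAPath + LFN

def check_integrity(catalog_lfns, se_paths):
    expected_paths = {lfn_to_se_path(lfn): lfn for lfn in catalog_lfns}

    # 1. Catalogue entries with no file on the SE -> candidates for re-replication
    missing_on_se = [lfn for path, lfn in expected_paths.items() if path not in se_paths]

    # 2. SE files with no corresponding catalogue replica -> re-register in the
    #    catalogue or delete from the SE, depending on the file's properties
    missing_in_catalog = [p for p in se_paths if p not in expected_paths]

    return missing_on_se, missing_in_catalog

# Example with toy data:
catalog = ["/lhcb/data/run123/0001.raw", "/lhcb/data/run123/0002.raw"]
on_se = [SA_PATH + "/lhcb/data/run123/0002.raw", SA_PATH + "/lhcb/data/run124/0009.raw"]
print(check_integrity(catalog, on_se))
```
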
Summary
  • DIRAC DMS built from ReplicaManager accessing File Catalogue and Storage Element interfaces.
    • TransferAgent also extended to perform bulk transfer using FTS
  • DMS utilized to get RAW data from LHCb to Castor
  • Then to distribute it to the Tier1s in a load-balanced way
    • Reconstruction jobs created automatically
  • Data-driven mechanism to perform reconstruction and stripping
    • Transfer jobs created automatically to distribute data