
A proposal: from CDR to CDH

  1. A proposal: from CDR to CDH
  Paolo Valente – INFN Roma [Acknowledgements to A. Di Girolamo]
  NA62 collaboration meeting

  2. Requirements/1
  1. Execute each operation [transfer, reconstruction, …]
  2. Log operations and errors
  • Execute/launch the transfer/reconstruction operations
    • Typically done with a set of scripts, in part running as daemons, in part controlled by an operator
    • The [adapted] NA48 CDR and [adapted] COMPASS CDR were used in 2007-2008 and during the technical run
  • Logging and [in some cases] error recovery

  3. Requirements/2
  1. Execute each operation, controlling the sequence of all steps
  2. Record every operation; keep a catalog of all files and the relative operations on them
  • Not only execute operations, but also know their status, recognize success/failure, handle anomalies, interface with the operator, …
  • Know and control the sequence of operations
  • Handle/notify the status of the "sequence"
  3. Monitor/display the status of the entire process, following each element during its lifetime
  Central Data Recording → Central Data Handler

  4. States and Transitions
  • The atomic unit is the burst (e.g. burst 99923-0000 → file cdr00099923-0000).
  • A burst is connected to a sequence of operations to be performed:
    • First of all, generation of the RAW file
    • From the RAW file, a number of tasks involving generation of other files or file transfers
  • Each operation is a transition from one state to another:
    • RAW_on_farm_disk → RAW_file_on_disk_pool → RAW_on_tape
    • RAW_generated → RECO-1_generated → THIN-1_generated → …
  • An "operation" is performed for each transition: for RAW → RECO-1 the reconstruction pass-1 has to be executed; for file transfers, the appropriate copy or remote-copy command
  • A new entry has to be created in the file catalog for each transition
  • Essentially, two kinds of transition:
    • File generation
    • File transfer
  [Diagram of the transition operations: RAW → RECO (Reconstruction), RAW → RAW (Filter), RECO → THIN (Thinning), RECO → RECO (Split), THIN → NTUP (MakeNtup)]
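The transition table on this slide can be written down as data. A minimal sketch: the state names are taken from the slide, while the operation names and the lookup function are hypothetical placeholders.

```python
# Hypothetical sketch of the burst state machine described above.
# State names follow the slide; operation names are illustrative only.
TRANSITIONS = {
    # file transfers
    ("RAW_on_farm_disk", "RAW_file_on_disk_pool"): "copy_to_disk_pool",
    ("RAW_file_on_disk_pool", "RAW_on_tape"): "migrate_to_tape",
    # file generations
    ("RAW_generated", "RECO-1_generated"): "reconstruction_pass_1",
    ("RECO-1_generated", "THIN-1_generated"): "thinning_pass_1",
}

def operation_for(current: str, target: str) -> str:
    """Look up the operation that realizes a given state transition."""
    try:
        return TRANSITIONS[(current, target)]
    except KeyError:
        raise ValueError(f"no operation defined for {current} -> {target}")
```

Keeping the transitions in one table means both kinds of operation (generation and transfer) are driven by the same mechanism.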

  5. The idea
  • We must have the catalog of all the files [+ their metadata, e.g. data-quality information, basic information from TDAQ, etc.]
  • Link all the files relative to a given burst: the logical unit is the burst
  • Define the sequence of states through which each burst has to pass
  • Each state transition defines an operation to be performed on the files
  • Define a "task" as the operations to be applied to a given set of entries in the file catalog [thus causing a state transition for the relative bursts]
  • We build a "Handler" process to control operations; given a task, the Handler will:
    • Create the list of files on which to execute the command(s)
    • Trigger the execution of the appropriate command(s) on them [typically launching a script]
    • The trigger for starting this can be either automatic or performed by an operator
    • Check the execution and notify/handle anomalies or failures
  [Diagram: the file catalog and the "Handler" driving the file-storage sequence on_farm_disk → on_disk_pool → on_T0_tape → Distributed_to_T1]
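The Handler cycle described on this slide can be sketched in a few lines. This is a toy illustration, not the proposed implementation: the `Catalog` class is a minimal in-memory stand-in, and the `Task` fields are assumed names.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class Task:
    """Hypothetical task description (field names are illustrative)."""
    command: list        # e.g. a copy script plus its fixed arguments
    input_state: str     # bursts to act on
    output_state: str    # state reached on success

class Catalog:
    """Minimal in-memory stand-in for the file catalog."""
    def __init__(self):
        self.state = {}   # file path -> state
        self.errors = {}  # file path -> error message
    def files_in_state(self, s):
        return [f for f, st in self.state.items() if st == s]
    def set_state(self, f, s):
        self.state[f] = s
    def flag_failure(self, f, err):
        self.errors[f] = err

def handle(catalog: Catalog, task: Task) -> None:
    """One Handler cycle: build the file list, launch the command on each
    file, check for success, and advance the state or record the failure."""
    for f in catalog.files_in_state(task.input_state):
        result = subprocess.run(task.command + [f], capture_output=True)
        if result.returncode == 0:
            catalog.set_state(f, task.output_state)
        else:
            catalog.flag_failure(f, result.stderr.decode())
```

The trigger for calling `handle` can equally be a timer (automatic) or an operator action, as the slide notes.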

  6. Burst 99923-0000
  • The atomic unit is the burst. A burst is connected to a number of files:
    • There is only one RAW file for each burst
    • Many RECO, THIN, NTUP, … files can be generated starting from one burst
    • The files can have multiple copies on different filesystems and at different sites
    • Files of different kinds are generated (RECO, THIN, …)
  • Use the burst id as the primary key.
  • Generate the first entry as soon as the RAW file appears on the farm disk
  • Then attach to it all the following steps in the lifetime of the burst:
    • cdr00099923-0000.dat
    • cdr00099923-0000.reco-1
      • cdr00099923-0000.reco-1.thin-1
      • cdr00099923-0000.reco-1.thin-2
      • …
    • cdr00099923-0000.reco-2
      • cdr00099923-0000.reco-2.thin-1
      • cdr00099923-0000.reco-2.thin-2
      • …
    • …
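Since every file name on this slide starts with the cdr prefix and the burst id, the primary key can be recovered from any derived file. A small, hypothetical parser (the naming convention is taken from the examples above):

```python
import re

# cdr file names start with "cdr" + 8-digit run + "-" + 4-digit burst,
# e.g. cdr00099923-0000.dat or cdr00099923-0000.reco-1.thin-2
BURST_RE = re.compile(r"^cdr(\d{8}-\d{4})")

def burst_id(filename: str) -> str:
    """Extract the burst id (the catalog primary key) from a cdr file name."""
    m = BURST_RE.match(filename)
    if m is None:
        raise ValueError(f"not a cdr file name: {filename}")
    return m.group(1)
```

This is what lets every RECO, THIN, or NTUP file be attached to the Burst record it descends from.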

  7. Let’s make a toy example: file storage
  For the first step it would be ideal to have the MERGER insert a new record into the catalog for each new burst, as soon as it creates a new RAW file (otherwise we’ll have to poll):
  • /merger/../cdr00099923-0000.dat → on_farm_disk
  • /merger/../cdr00099924-0000.dat → on_farm_disk
  • /merger/../cdr00099925-0000.dat → on_farm_disk
  • The Handler queries for bursts in the state on_farm_disk and creates the list of files to be copied
  • The Handler creates the appropriate transfer command [on the slide: xrdcp, with //eos/na62/data/cdr and root://eosna62.cern.ch, producing //eos/../cdr00099923-0000.dat, …]
  • The Handler issues the execution of the command on each of the files in the list and checks for success:
    • If success:
      • Create a new entry in the file catalog, corresponding to the new replica of the RAW file
      • Change the status of the burst to on_disk_pool
    • Otherwise: handle or just notify the failure
  Probably intermediate states are needed in order to correctly handle the progress of the operation: on_farm_disk → on_disk_pool_pending → on_disk_pool_started → on_disk_pool, plus on_disk_pool_failed and on_disk_pool_canceled for anomalies
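The intermediate states suggested at the end of this slide can be sketched as a simple linear flow. The state names come from the slide; the linear ordering and the `advance` helper are assumptions for illustration.

```python
# Happy path of the toy file-storage step (state names from the slide):
PENDING_FLOW = [
    "on_farm_disk",           # RAW file sitting on the merger disk
    "on_disk_pool_pending",   # queued for transfer
    "on_disk_pool_started",   # xrdcp launched
    "on_disk_pool",           # transfer verified: final state
]
FAILURE_STATES = {"on_disk_pool_failed", "on_disk_pool_canceled"}

def advance(state: str) -> str:
    """Move a burst one step along the happy path; failures leave the flow
    and need operator handling or notification."""
    if state in FAILURE_STATES:
        raise RuntimeError(f"burst needs attention: {state}")
    i = PENDING_FLOW.index(state)
    if i == len(PENDING_FLOW) - 1:
        return state          # already in the final state
    return PENDING_FLOW[i + 1]
```

Recording the pending/started steps in the catalog is what makes a crashed or interrupted transfer recognizable on the next Handler pass.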

  8. The database
  • The file catalog and the states, plus all necessary information, will be in this database
  • Basic tasks of the catalog:
    • Give a unique file-id and relate it to the local filename
    • Relate each file to its metadata
  • We also want to:
    • Keep the relations between all the files related to the same burst
    • Keep the state related to the reconstruction/transfer steps
  • The Handler will trigger the transition, based on the current state of the file
  Tables:
  • Burst: Number*, MotherRAW [File], RunType, RunNumber, …
  • File: Name*, FileType [FileType], CustodialLevel, Version, CreationTimestamp, ModificationTimestamp, DeletionTimeStamp, Site [Site], Storage [Storage], CopyNumber, Mother [File], …
  • Site: Name*, SiteType [SiteType], Location, ContactPerson, isActive, … [e.g. NA62-FARM, CERN-PROD, RAL, INFN-CNAF, …]
  • Storage: Name*, StorageType [StorageType], isActive, hasReplica, … [e.g. SCRATCH-1, FARMDISK-1, EOSNA62, CASTORNA62, …]
  • SiteType: Name*, hasTape, hasDisk, … [e.g. FARM, TIER-0, TIER-1, TIER-2, …]
  • StorageType: Name*, isCustodial, … [e.g. TAPE, EOS, DISK, …]
  • FileType: Name*, isData, hasVersion, … [e.g. RAW, RECO, THIN, NTUP, …]
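A subset of this schema, written out as DDL so the relations are explicit. SQLite is used here purely for illustration (the slide does not fix an engine), and most columns are omitted.

```python
import sqlite3

# Sketch of a subset of the catalog schema; columns follow the slide's
# table listing, the engine and types are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Burst (
    Number     TEXT PRIMARY KEY,              -- burst id, e.g. 00099923-0000
    RunNumber  INTEGER
);
CREATE TABLE Site (
    Name     TEXT PRIMARY KEY,                -- NA62-FARM, CERN-PROD, RAL, ...
    SiteType TEXT,                            -- FARM, TIER-0, TIER-1, ...
    isActive INTEGER
);
CREATE TABLE File (
    Name       TEXT PRIMARY KEY,
    FileType   TEXT,                          -- RAW, RECO, THIN, NTUP, ...
    Burst      TEXT REFERENCES Burst(Number), -- links all files of one burst
    Site       TEXT REFERENCES Site(Name),
    CopyNumber INTEGER,
    Mother     TEXT REFERENCES File(Name)     -- parent file; NULL for the RAW
);
""")
conn.execute("INSERT INTO Burst VALUES ('00099923-0000', 99923)")
conn.execute("INSERT INTO Site VALUES ('CERN-PROD', 'TIER-0', 1)")
conn.execute("INSERT INTO File VALUES "
             "('cdr00099923-0000.dat', 'RAW', '00099923-0000', 'CERN-PROD', 1, NULL)")

# All files of a burst are reachable through the Burst foreign key:
rows = conn.execute("SELECT Name FROM File WHERE Burst = '00099923-0000'").fetchall()
```

The `Mother` self-reference on `File` is what encodes the generation chain (RAW → RECO → THIN) inside the catalog itself.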

  9. Example
  [Diagram: the lifetime of one burst in the catalog. The Burst record links the RAW file on the farm, then its replicas on the disk pool, on T0 tape, and on T1 disk and tape; reconstruction & thinning add RECO-1 and THIN-1 files on T1/T2 disk and tape; a first reprocessing adds a second RAW copy on T1 disk plus RECO-2 and THIN-2 files.]
  300k bursts/year × 3 years ≈ 1,000,000 bursts × O(100) entries = 100M entries
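The size estimate above, spelled out as arithmetic (the rounding from 900k to one million bursts follows the slide):

```python
# Back-of-the-envelope catalog size from the slide:
bursts = 300_000 * 3            # 300k bursts/year for 3 years = 900k
bursts_rounded = 1_000_000      # rounded up, as on the slide
entries_per_burst = 100         # O(100) files/replicas per burst
total_entries = bursts_rounded * entries_per_burst   # 100M catalog entries
```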

  10. Which DB technology?
  300k bursts/year × 3 years ≈ 1,000,000 bursts × O(100) entries = 100M entries
  • Looks huge, e.g. for MySQL, but AliEn (the ALICE distributed environment, including catalog and job management) successfully uses MySQL
  • A number of optimizations/tricks can be used:
    • Partitioning
    • Indexes
    • Common queries/caching
    • …
  • Of course there are alternatives. SQUID caching is necessary.
  • By the way…
    • AliEn is a very close example: it uses open-source software and can be inspirational or even reused
    • The AliEn project started to provide a file catalog to ALICE and then expanded
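As a concrete illustration of the "indexes" trick listed above (shown in SQLite for brevity; the slide discusses MySQL, but the principle is identical): an index on the burst id turns the most common catalog lookup from a full table scan into an index search.

```python
import sqlite3

# Toy demonstration that an index on the burst column changes the query plan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE File (Name TEXT, Burst TEXT, FileType TEXT)")
conn.execute("CREATE INDEX idx_file_burst ON File (Burst)")

# Ask the engine how it would run a by-burst lookup:
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT Name FROM File WHERE Burst = '00099923-0000'"
).fetchone()
# The plan's detail column now reports a search using idx_file_burst.
```

With ~100M rows, partitioning by run or by year would complement this, but that part is engine-specific.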

  11. AliEn

  12. Grid services
  The other piece needed to have a complete system…
  [Diagram: User Interface, Catalog, Job management, Handler]

  13. [Figures: the WMS of ALICE, LHCb, and ATLAS]

  14. Pull vs. push job submission
  • gLite: a set of grid middleware components responsible for the distribution and management of tasks across grid resources
  • Push model:
    • Works as a super-batch system
    • Jobs are submitted to the WMS, which schedules them to a Grid CE (computing element)
    • Computing centers implement their internal batch queues to schedule jobs on the worker nodes
    • Experiments have implemented their own solutions to integrate the middleware and application layers:
      • Frameworks born to manage high-level workflows
      • Direct control of the translation from workflow into grid jobs
  Independently, the LHC experiments are evolving towards “pilot job” systems:
  • Pull model:
    • Pilot jobs are asynchronously submitted jobs which run on the worker nodes
    • Users submit jobs to a centralized queue
    • Pilot jobs communicate with the WMS (pilot aggregator), pulling user jobs from the repository
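The pull model above can be illustrated with a toy queue: pilots already running on worker nodes keep pulling user jobs from a central repository until it is empty. All names and payloads here are illustrative, not part of any real WMS API.

```python
import queue

# Users submit jobs to a centralized queue (the "repository"):
central_queue = queue.Queue()
for job in ["reco burst 99923", "reco burst 99924", "reco burst 99925"]:
    central_queue.put(job)

def pilot(results: list) -> None:
    """One pilot job: pull and run work until no jobs are left.
    Nothing is pushed to the worker node; the pilot asks for work."""
    while True:
        try:
            job = central_queue.get_nowait()
        except queue.Empty:
            return                         # queue drained, pilot exits
        results.append(f"done: {job}")     # stand-in for running the job

done = []
pilot(done)
```

The contrast with the push model is exactly the direction of the request: here the worker node initiates every assignment, so a dead worker simply stops pulling instead of holding scheduled jobs hostage.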

  15. To be continued…
