
Existing system Real data bookkeeping New concepts for real data Browsers

Bookkeeping Working Group report. Software week, May 2007. O. Callot, LAL. Slides from Ph. Charpentier for the Bookkeeping WG. Outline: existing system; real data bookkeeping; new concepts for real data; browsers.

Presentation Transcript


  1. Bookkeeping Working Group report
     Software week, May 2007
     O. Callot, LAL
     Slides from Ph. Charpentier for the Bookkeeping WG
     • Existing system
     • Real data bookkeeping
     • New concepts for real data
     • Browsers

  2. Mandate
     • The group should understand the requirements of LHCb for provenance & metadata to allow dataset characterisation of real data. The group should address the requirements of physics analysis & sub-detectors. It should also consider the implementation issues associated with its recommendations.
     • Issues to consider:
        • Metadata/data quality requirements for raw data (from online)
        • Metadata/data quality requirements for processed real data (rDST, DST)
        • Query requirements from interested parties
           • physics analysts, sub-detector experts
        • Consideration of implementation:
           • is the current BK implementation suitable
           • schema changes
           • use of views
           • provenance vs other metadata
        • Access needs: GANGA GUI, web access, Gaudi application, ...

  3. Membership and meetings
     • O. Callot, Ph. Charpentier (chair), C. Cioffi, O. Deschamps, M. Frank, A. Maier, N. Neufeld, M. Schmelling
     • 5 meetings so far
        • started March 19th with a review of the existing system
     • tWiki page:
        • https://twiki.cern.ch/twiki/bin/view/LHCb/BKWG

  4. Brief review of the existing system
     [Figure: diagram of the warehouse schema and the views schema (used for queries). Recoverable labels: roottree; JOBFILEINFOn; NBEVENTS, EVENTTYPE, EVENTDESCRIPTION, DBVERSION, CONFIG, FILETYPE; Program0 (step 1), InputFile0 (output 1), Program1 (step 2), InputFile1 (output 2), Program2 (step 3), InputFile2 (output 3); NumberBEvent, FileSize, FileName, Laboratory, WriteDate.]

  5. Real data bookkeeping
     • Split the BKDB in two distinct parts
        • real data / MC data
        • same warehouse schema
        • possibly two views schemas if needed
        • different query mechanisms
     • BK configuration
        • DAQ partition + run type
        • Examples: LHCb - Physics / Calo - LED calibration
     • Source of real data (“job” in the BK)
        • DAQ run: produces several RAW files
     • Further processing
        • as currently for MC: reconstruction step, stripping step, etc.
     • The current warehouse schema seems adapted to real data
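
As an illustration of the slide above, the following is a minimal sketch of how a BK configuration (DAQ partition + run type) and a chain of "jobs" (DAQ run, reconstruction, stripping) could be modelled. All class names, field names and file names are hypothetical; they are not taken from the actual BKDB schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class BKConfiguration:
    """Hypothetical model: a configuration is a DAQ partition plus a run type."""
    partition: str
    run_type: str

    def label(self) -> str:
        return f"{self.partition} - {self.run_type}"


@dataclass
class Job:
    """Hypothetical 'job' record: a DAQ run or a later processing step."""
    name: str
    configuration: BKConfiguration
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def ancestors(file_name: str, jobs: List[Job]) -> List[str]:
    """Walk back from a file to the jobs that produced it.

    Simplification: only the first input of each job is followed.
    """
    producer: Dict[str, Job] = {out: job for job in jobs for out in job.outputs}
    chain = []
    job = producer.get(file_name)
    while job is not None:
        chain.append(job.name)
        job = producer.get(job.inputs[0]) if job.inputs else None
    return chain


# Illustrative data only (file names are invented).
physics = BKConfiguration("LHCb", "Physics")
daq_run = Job("DAQ run 1234", physics,
              outputs=["run1234_0001.raw", "run1234_0002.raw"])
reco = Job("reconstruction", physics,
           inputs=["run1234_0001.raw"], outputs=["run1234_0001.rdst"])
strip = Job("stripping", physics,
            inputs=["run1234_0001.rdst"], outputs=["run1234_0001.dst"])
JOBS = [daq_run, reco, strip]
```

Here the DAQ run plays the role of the first job, with no input files and several RAW outputs, while later steps are chained exactly as they would be for MC.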

  6. Processing pass (new concept)
     • More important for users than the actual application version
     • Definition: a set of (application versions/CondDB tags)
        • <pass> = { [<pass> &] <application>&<tag> [& <application>&<tag>] }, 1 to n times
     • Implementation: new table in BKDB
     • Similar concept for MC data (e.g. “DC06 - physics quality”)
     • Purpose: used in the view for queries
     • Example:
        • Processing_1:
           • Brunel-v30r16/Tag-v31r7 + Brunel-v30r17/Tag-v31r7
        • Processing_2-Stripping_1:
           • Processing_2 & DaVinci-v20r5/Tag-v33r5 & Brunel-v31r6/Tag-v33r5
     • Views with application versions are still needed for experts
     • Currently limited to 3 ancestor jobs, Carmine working on this
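
The processing pass definition above can be sketched as a small recursive expansion. This is only an illustration under stated assumptions: "+" is read as separating alternative chains, "&" as chaining steps, and the pass "Processing_1-Stripping_1" is a hypothetical example built on the slide's Processing_1. The table layout and the function name `expand` are invented for this sketch.

```python
from itertools import product

# Hypothetical content of a processing-pass table.
# Each pass maps to a list of alternatives ("+"); each alternative is a
# chain ("&") of elements, where an element is either another pass name
# or an "<application>/<tag>" string.
PASSES = {
    "Processing_1": [
        ["Brunel-v30r16/Tag-v31r7"],
        ["Brunel-v30r17/Tag-v31r7"],
    ],
    # Assumed example, not from the slides.
    "Processing_1-Stripping_1": [
        ["Processing_1", "DaVinci-v20r5/Tag-v33r5"],
    ],
}


def expand(name, passes):
    """Return every chain of (application, tag) steps accepted by a pass.

    No cycle detection is done; the table is assumed to be acyclic.
    """
    chains = []
    for alternative in passes[name]:
        options = []
        for element in alternative:
            if element in passes:
                # Nested pass: substitute all of its chains.
                options.append(expand(element, passes))
            else:
                application, tag = element.split("/")
                options.append([[(application, tag)]])
        # Combine one choice per element, then flatten into a single chain.
        for combo in product(*options):
            chains.append([step for part in combo for step in part])
    return chains
```

A view used for queries could then match a file's ancestor steps against the expanded chains, so that users select by pass name while experts still see the application versions.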

  7. Data taking period (new concept)
     • Set of runs that can be mixed at processing stage (stripping)
        • similar conditions: beam, detector stability, trigger conditions
     • Definition of periods is <pass>-dependent
        • First pass: change period only when noticeable events occur
           • after LHC shutdown
           • major trigger changes
           • detector missing or modified
        • Re-processing: periods result from the first pass analysis
           • possibly finer granularity
     • Implementation
        • Additional table in BKDB
        • For MC data: set of productions
        • Should be related to the processing DB in DIRAC
           • jobs should not mix data from different periods (stripping)
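
The rule that a job should not mix data from different periods can be sketched as a simple check. This is a minimal illustration: the period names, run ranges and function names below are invented, and the real table in the BKDB or the DIRAC processing DB may look quite different.

```python
# Hypothetical period table: periods are defined per processing pass,
# here as ranges of run numbers (all values invented for illustration).
PERIODS = {
    "Processing_1": {
        "Period_A": range(1000, 2000),
        "Period_B": range(2000, 3500),
    },
    # A re-processing may use a finer granularity.
    "Processing_2": {
        "Period_A1": range(1000, 1500),
        "Period_A2": range(1500, 2000),
        "Period_B": range(2000, 3500),
    },
}


def period_of(run, pass_name, periods):
    """Return the period a run belongs to for a given pass, or None."""
    for period_name, runs in periods[pass_name].items():
        if run in runs:
            return period_name
    return None


def check_job(runs, pass_name, periods):
    """Return the single period shared by all runs, or raise ValueError."""
    found = {period_of(run, pass_name, periods) for run in runs}
    if None in found:
        raise ValueError("some runs do not belong to any period")
    if len(found) != 1:
        raise ValueError(f"job mixes periods: {sorted(found)}")
    return found.pop()
```

Because the definition is pass-dependent, the same set of runs can be acceptable for the first pass and rejected for a re-processing with finer periods.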

  8. Bookkeeping queries, datasets
     • Recommendation to support only a standalone GUI for complex queries
        • Web interface only for global statistical queries
        • Too heavy to support two implementations (web and GUI)
        • Generation of Gaudi cards from web interface is cumbersome
        • Need to integrate in Ganga
     • Recommendation to implement a tree-like browsing GUI
        • Need to possibly define the order of browsing
        • Default mimicking the main physics use case
     • Datasets
        • LFC can also be used to define datasets (directories with symbolic links)
        • Recommendation to have joint discussion with Ganga and DIRAC on a common definition
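
The idea of a tree-like browser with a configurable order of browsing can be sketched as follows. This is a minimal illustration only: the record fields, values and the function `build_tree` are hypothetical and do not reflect the actual BK views or the feicim browser.

```python
def build_tree(records, order):
    """Group flat file records into nested dicts following `order`.

    `order` lists the record fields to browse by, from the root down.
    The leaves are lists of file names.
    """
    tree = {}
    for record in records:
        node = tree
        for key in order[:-1]:
            node = node.setdefault(record[key], {})
        node.setdefault(record[order[-1]], []).append(record["file"])
    return tree


# Invented records, standing in for rows returned by a view.
RECORDS = [
    {"configuration": "LHCb - Physics", "period": "Period_A",
     "processing_pass": "Processing_1", "file_type": "DST", "file": "a.dst"},
    {"configuration": "LHCb - Physics", "period": "Period_A",
     "processing_pass": "Processing_1", "file_type": "rDST", "file": "a.rdst"},
    {"configuration": "LHCb - Physics", "period": "Period_B",
     "processing_pass": "Processing_1", "file_type": "DST", "file": "b.dst"},
]

# An assumed default order, and an alternative order chosen by the user.
DEFAULT_ORDER = ["configuration", "period", "processing_pass", "file_type"]
by_type = build_tree(RECORDS, ["file_type", "period"])
default_tree = build_tree(RECORDS, DEFAULT_ORDER)
```

Keeping the order as a plain list makes it easy to ship a default that mimics the main physics use case while still letting a user rearrange the levels.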

  9. Tree-like browser
     • Example from feicim (developed at Dublin)

  10. Conclusions and outlook
     • The current warehouse schema seems adequate for real data
        • Exact parameters still to be defined, in particular those that are to be queried on
     • Additional tables for new concepts:
        • Data taking period
        • Processing pass
     • Views to be adapted for queries
        • Not necessarily the same for MC and real data
     • Browser supported as GUI only
        • Tree-like (versatile) browsing
     • Next steps
        • Define a timescale for prototyping
        • Identify manpower for implementation
        • Define run parameters with Online
        • Further discuss dataset concept with DIRAC and Ganga
