ATLAS Distributed Data Management

ATLAS Distributed Data Management. Simone Campana. ATLAS DDM (DQ2) moves from a file-based system to one based on datasets. It hides file-level granularity from users, and a hierarchical structure makes cataloging more manageable. However, file-level access is still possible.


Presentation Transcript


  1. ATLAS Distributed Data Management. Simone Campana

  2. ATLAS DDM (DQ2)
     • Moves from a file-based system to one based on datasets
     • Hides file-level granularity from users
     • A hierarchical structure makes cataloging more manageable
     • However, file-level access is still possible
     • Scalable global data discovery and access via a catalog hierarchy
     • No global physical file replica catalog, but a global dataset replica catalog and a global dataset location catalog
     [Figure: diagram labelled Files, Datasets, Sites]

  3. You do not need to know that!!!

  4. ‘Global’ catalogs
     • Dataset Repository: holds all dataset names and unique IDs (plus system metadata)
     • Dataset Hierarchy: maintains versioning information and information on ‘container datasets’, i.e. datasets consisting of other datasets
     • Dataset Content Catalog: maps each dataset to its constituent files. It holds information on every logical file, so it must be highly scalable; however, it can be heavily partitioned, for example using metadata
     • Dataset Location Catalog: stores the locations of each dataset
     All are logically global but may be distributed physically.
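As a rough illustration of how the four global catalogs relate, the sketch below models them as in-memory Python dictionaries. It is not the DQ2 API: the names `repository`, `hierarchy`, `content`, `locations` and `expand`, and all dataset names, IDs and GUIDs, are made up for illustration only.

```python
# Hypothetical in-memory stand-ins for the four global catalogs (not the DQ2 API).
repository = {"ds.A": "id-001", "ds.B": "id-002", "container.AB": "id-003"}  # name -> unique ID
hierarchy = {"container.AB": ["ds.A", "ds.B"]}  # container dataset -> constituent datasets
content = {  # dataset -> list of (guid, lfn) pairs
    "ds.A": [("guid1", "lfn1"), ("guid2", "lfn2")],
    "ds.B": [("guid3", "lfn3")],
}
locations = {"ds.A": ["CNAF", "LYON"], "ds.B": ["CNAF"]}  # dataset -> sites holding it


def expand(dataset):
    """Return the (guid, lfn) pairs of a dataset, following container datasets recursively."""
    if dataset in hierarchy:
        files = []
        for child in hierarchy[dataset]:
            files.extend(expand(child))
        return files
    return list(content.get(dataset, []))
```

Calling `expand("container.AB")` walks the hierarchy and collects the files of both constituent datasets, which is why the content catalog only needs entries for plain datasets in this sketch.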

  5. ‘Local’ Catalogs
     • Local Replica Catalog: one per grid/site/tier, providing the logical-to-physical file name mapping
     • In LCG, the local replica catalog is the LCG File Catalog (LFC)
     • Currently, all ‘Local’ catalogs are deployed at the ATLAS T1s

  6. CENTRAL CATALOGS
     • Dataset Content Catalog
         Dataset Name: <mydataset>
         Content: <guid1> <lfn1>
                  <guid2> <lfn2>
                  <…> <…>
     • Dataset Location Catalog
         Dataset Name: <mydataset>
         Dataset Location: CNAF,LYON
     LOCAL CATALOGS
     • CNAF LFC
         Entries: <guid1> </…/lfn1>
                  <guid2> </…/lfn2>
                  <…> <…>
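The lookup chain in this example can be sketched as follows: the central content catalog gives the files of a dataset, the central location catalog says which sites hold it, and the site's local catalog maps each GUID to a physical name. This is a minimal sketch with plain dictionaries, not the DQ2 or LFC API; `resolve` and the dictionary names are hypothetical, and the elided `/…/` paths are kept exactly as on the slide.

```python
# Hypothetical stand-ins for the central and local catalogs shown on the slide.
content_catalog = {"mydataset": [("guid1", "lfn1"), ("guid2", "lfn2")]}
location_catalog = {"mydataset": ["CNAF", "LYON"]}
local_replica_catalogs = {
    "CNAF": {"guid1": "/…/lfn1", "guid2": "/…/lfn2"},
    "LYON": {"guid1": "/…/lfn1", "guid2": "/…/lfn2"},
}


def resolve(dataset, site):
    """Map each LFN of `dataset` to its physical entry in the local catalog of `site`."""
    if site not in location_catalog.get(dataset, []):
        raise LookupError(f"{dataset} is not located at {site}")
    lfc = local_replica_catalogs[site]
    return {lfn: lfc[guid] for guid, lfn in content_catalog[dataset]}
```

Note that no central catalog stores physical file names; only the per-site catalog does, which matches the "no global physical file replica catalog" design.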

  7. About LFNs
     • ATLAS has a naming convention for MCProd Logical File Names (LFNs)
     • <lfn>=csc11.007060.singlepart_e_E50.evgen.EVNT.v11004201_tid002912._00027.pool.root.1
     • <project>=csc11
     • <datasetname>=csc11.007060.singlepart_e_E50.evgen.EVNT.v11004201_tid002912
     • In the central content catalog, the LFN appears as it is
     • In the Local Replica Catalog (LFC), the LFN is namespaced, i.e. placed under a directory structure
     • /grid/atlas/dq2/<project>/<dataset>/<lfn>
     • /grid/atlas/dq2/csc11/csc11.007060.singlepart_e_E50.evgen.EVNT.v11004201_tid002912/csc11.007060.singlepart_e_E50.evgen.EVNT.v11004201_tid002912._00027.pool.root.1
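The namespacing rule above can be expressed as a small helper. This is only a sketch: `lfc_path` is a hypothetical function, not part of DQ2, and it assumes the project is the first dot-separated field of the dataset name, which holds for the example on this slide but is not stated as a general rule.

```python
def lfc_path(dataset, lfn, prefix="/grid/atlas/dq2"):
    """Build the namespaced LFC entry /grid/atlas/dq2/<project>/<dataset>/<lfn>."""
    # Assumption: the project is the first dot-separated field of the dataset name.
    project = dataset.split(".", 1)[0]
    return f"{prefix}/{project}/{dataset}/{lfn}"


dataset = "csc11.007060.singlepart_e_E50.evgen.EVNT.v11004201_tid002912"
lfn = dataset + "._00027.pool.root.1"
print(lfc_path(dataset, lfn))
```

For the example dataset and LFN, this prints the same path as the last bullet of the slide.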

  8. Subscriptions
     • A site can subscribe to data
     • Dataset A is present in site Y but not site X
     • X subscribes to Dataset A
     • A is transferred to Site X and registered properly in catalogs
     • A kind of magic…
     [Figure: Site ‘X’ does not contain Dataset A. Site ‘Y’ contains Dataset ‘A’ (File 1, File 2). Subscriptions: Dataset ‘A’ | Site ‘X’]

  9. Complications…
     • A dataset can be closed or open
     • If you subscribe a closed dataset A to site X:
       • Files will be transferred to the site
       • The subscription will be honored and then discarded
     • If you subscribe an open dataset A to site X:
       • Files will be transferred
       • The subscription will remain active
       • If new files are added to the dataset and stored in Y, they will be streamed to X
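The difference between subscribing a closed and an open dataset can be sketched as one pass of a toy subscription agent. Everything here is hypothetical (`Dataset`, `serve_subscriptions`, the in-memory sets); it illustrates the behaviour described on the slide and is not how DQ2 is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    files: set = field(default_factory=set)
    closed: bool = False


def serve_subscriptions(dataset, subscriptions, site_contents):
    """One pass of a toy agent over a list of (dataset name, site) subscriptions."""
    for sub in list(subscriptions):
        ds_name, site = sub
        if ds_name != dataset.name:
            continue
        # "Transfer" any files the subscribing site does not have yet.
        site_contents.setdefault(site, set()).update(dataset.files)
        # A closed dataset cannot grow, so its subscription is dropped once served.
        if dataset.closed:
            subscriptions.remove(sub)


a = Dataset("A", {"file1", "file2"})
subs, sites = [("A", "X")], {}
serve_subscriptions(a, subs, sites)   # open dataset: files copied, subscription kept
a.files.add("file3")
serve_subscriptions(a, subs, sites)   # the new file is streamed to X
a.closed = True
serve_subscriptions(a, subs, sites)   # closed dataset: subscription honored, then removed
```

Running the agent repeatedly is what makes new files in an open dataset reach the subscribing site; once the dataset is closed, a single further pass is enough.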

  10. The dq2 API
     • Installation instructions can be found at:
       • https://uimon.cern.ch/twiki/bin/view/Atlas/DDMOperations
       • https://uimon.cern.ch/twiki/bin/view/Atlas/DDM
       • https://uimon.cern.ch/twiki/bin/view/Atlas/ExecutorsCommon (VERY PRAGMATIC)
     • Once you have installed the API, you can run the “dq2” command to get the help page
     • The API is also available in AFS at CERN: /afs/cern.ch/atlas/offline/external/GRID/ddm/pro02

  11. Monitoring Subscriptions
     • Subscriptions can be monitored at http://atlas-ddm-monitoring.web.cern.ch/atlas-ddm-monitoring/
     • To report problems with subscriptions you can:
       • Use savannah: https://savannah.cern.ch/projects/dq2-ddm-ops/
       • Report to atlas-t1-ddm-oper@cern.ch
       • Use GGUS: www.ggus.org
