ATLAS Distributed Computing Tutorial

Tags: What, Why, When, Where and How? Mike Kenyon, University of Glasgow.

Presentation Transcript


  1. ATLAS Distributed Computing Tutorial Tags: What, Why, When, Where and How? Mike Kenyon University of Glasgow

  2. Tags • What are tags? • Why have them? • When are they produced? • Where are they? • How can they be used?

  3. What are Event Tags? • Event-level metadata: summary information about each event, with a “pointer” to the corresponding event in AOD/ESD/RDO format • Useful for selecting events for physics analysis • Should be no bigger than 1 kB per event (~1% of AOD size)

  4. Why have Event Tags? • To make the physicist’s life easier and analysis faster • Allow you to exclude uninteresting events from the data sample used for analysis without searching through AOD/ESD files • Samples of specific interest to an analysis can be extracted into a smaller set of files for repeated running • Provide a global view of the data, useful for data mining • Not intended to be analysed directly

  5. Tag Use Cases • Some use cases for physicists: • Using official Tags with a query in job options • Using a local Tag “database” for preliminary analysis • Using the global Tag database to look for events • Using the global Tag database to build an input list for Athena jobs

  6. What do they look like? • The LCG POOL infrastructure is used to store Tags • Hence use of “collection” terminology • They exist in 2 forms: • ROOT files • Relational database (MySQL and Oracle) • Why keep 2 forms? • ROOT files useful for local work • DB useful for queries, global view of data • Tag content: collection information + event information

  7. Tag Content • Collection Information • Collection ID, AOD/ESD/RDO references • Global Event Quantities • Event no., run no., no. of tracks, missing ET, etc. • Trigger Decisions • Electrons, Photons, Muons • Number, pT, η, φ, etc. • Jets, Taus • Number, pT, η, φ, etc.
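The tag content listed on slide 7 can be pictured as a small record per event. The following is a minimal sketch only: the class names, field names and the reference string are hypothetical illustrations, not the actual ATLAS tag schema or the POOL collection API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Particle:
    """Kinematic summary of one reconstructed object (hypothetical layout)."""
    pt: float   # transverse momentum, GeV
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle, radians


@dataclass
class EventTag:
    """One tag record: collection info plus event-level summary quantities."""
    # Collection information
    collection_id: str
    aod_ref: str  # opaque "pointer" to the full event (made-up format)
    # Global event quantities
    run_number: int
    event_number: int
    n_tracks: int
    missing_et: float  # GeV
    # Trigger decisions (names here are invented for illustration)
    triggers_passed: List[str] = field(default_factory=list)
    # Per-object summaries
    electrons: List[Particle] = field(default_factory=list)
    jets: List[Particle] = field(default_factory=list)


tag = EventTag(
    collection_id="example_collection",
    aod_ref="example-file-id/entry-17",
    run_number=4312,
    event_number=17,
    n_tracks=42,
    missing_et=23.5,
    triggers_passed=["example_e25"],
    electrons=[Particle(31.0, 0.4, 1.2), Particle(27.5, -1.1, -2.0)],
)

# A selection works on the summary alone; the AOD is only opened via aod_ref.
is_candidate = len(tag.electrons) >= 2 and tag.missing_et > 10.0
```

The point of the layout is that a selection can be evaluated from the summary fields alone, and the reference is followed only for events that pass.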

  8. When are Tags produced? • Written to ROOT files at Tier 0 during AOD production – “Explicit collections” • Data then imported into central relational database (Oracle at CERN) • Database replicated to Tier 1 and lower • Oracle where available; MySQL otherwise • Users can create their own tag files

  9. Sample Queries • General Collection Information • How many events in collection A? • What are the names and types of Tag attributes? • What production task(s) produced these Tags? • Content Queries • Give me all events with at least 2 electrons and missing ET > 10 GeV which are ‘good for physics’ • Summary Queries • Give me the number of events for some content query • Give me sum of the luminosity for some content query
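To make the three kinds of query on slide 9 concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for a relational tag collection. The table name, column names and sample rows are assumptions made up for illustration; the real tag database uses Oracle or MySQL through POOL, with its own schema.

```python
import sqlite3

# In-memory stand-in for a relational tag collection (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tags (
           run_number INTEGER,
           event_number INTEGER,
           n_electrons INTEGER,
           missing_et REAL,          -- GeV
           good_for_physics INTEGER  -- 1 = good, 0 = not good
       )"""
)
rows = [
    (1, 1, 2, 25.0, 1),
    (1, 2, 1, 40.0, 1),
    (1, 3, 3, 5.0, 1),
    (2, 1, 2, 12.0, 0),
    (2, 2, 2, 30.0, 1),
]
conn.executemany("INSERT INTO tags VALUES (?, ?, ?, ?, ?)", rows)

# General collection information: event count and attribute names/types.
n_total = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
attributes = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(tags)")]

# Content query: at least 2 electrons, missing ET > 10 GeV, good for physics.
SELECTION = "n_electrons >= 2 AND missing_et > 10 AND good_for_physics = 1"
selected = conn.execute(
    f"SELECT run_number, event_number FROM tags WHERE {SELECTION} "
    "ORDER BY run_number, event_number"
).fetchall()

# Summary query: number of events passing the same content query.
n_selected = conn.execute(
    f"SELECT COUNT(*) FROM tags WHERE {SELECTION}"
).fetchone()[0]
```

A luminosity sum would follow the same aggregate pattern, using whichever luminosity attribute the real schema provides.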

  10. How can Tags be used? • Collection tools • Athena • Tag Navigator Tool (TNT)

  11. Collection Tools • To use Tags in Athena, you need to know what the attributes are • POOL Collection tools can be used for this • Can copy collections, append collections, print list of files used, etc • Allows queries on the input collections • See Tutorial Exercises, part 1

  12. Tags in Athena • Both ROOT and Relational Tags can be read directly from Athena • You need a file catalogue to find the AOD files, and an Athena version matching the one used to produce the Tags • One can also produce private ROOT Tags from AOD • Focus here is on reading, rather than building, Tags

  13. Local Tag Files with Athena • jobOptions for event selection look like:

  14. Remote Tag Database with Athena • Not many Tags available in central database yet • This constrains the exercises somewhat, but we can at least illustrate the principles • jobOptions must include lines like:
      EventSelector.InputCollections = ['rome_4312_merge_H12_140_gamgam_AOD_tags']
      EventSelector.Connection = 'oracle://atlas_tags/atlas_tags_rome'
      EventSelector.CollectionType = 'ExplicitRAL'

  15. Tag Navigator Tool (TNT) • A utility which aims to allow ATLAS physicists to use the Tag database for analysis • Runs a query on the database and outputs a local ROOT collection • Divides this into a number of sub-collections • Submits user jobs to LCG, one per sub-collection • Output files can be registered as new DQ2 dataset
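The "divide into sub-collections, one job per sub-collection" step on slide 15 can be sketched as follows. This is not TNT's code: the function name, the even-split policy and the job dictionaries are assumptions chosen for illustration, and no grid submission is performed.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_collection(events: Sequence[T], n_sub: int) -> List[List[T]]:
    """Split events into at most n_sub contiguous, near-equal sub-collections."""
    if n_sub < 1:
        raise ValueError("n_sub must be at least 1")
    size, extra = divmod(len(events), n_sub)
    subs: List[List[T]] = []
    start = 0
    for i in range(n_sub):
        # The first `extra` sub-collections take one additional event each.
        stop = start + size + (1 if i < extra else 0)
        if stop > start:  # skip empty sub-collections when events < n_sub
            subs.append(list(events[start:stop]))
        start = stop
    return subs


# Stand-in for the (run, event) pairs returned by a database query.
selected_events = [(1, n) for n in range(1, 11)]

sub_collections = split_collection(selected_events, 3)

# One hypothetical job description per sub-collection.
jobs = [
    {"job_id": i, "n_events": len(sub)}
    for i, sub in enumerate(sub_collections)
]
```

Keeping the sub-collections contiguous and near-equal in size is only one possible policy; a real tool might instead group events by the files they point to.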

  16. What’s there now? • There is still a lot of work to be done to get an efficient Tag system running • Currently running performance / scalability tests on central database • Need Tags to be produced and loaded into database as a matter of course • Tag database from Rome workshop is still there, now awaiting Tags from Streaming Tests

  17. And finally… • Tags will become ever more useful as real data appears • Infrastructure is still being developed • Wednesday’s exercises aimed at familiarisation with ideas and methods
