
Lecture 2.2: PAT Embedding. Andreas Hinzmann, RWTH Aachen University. PAT Tutorial @ FNAL, November 2010. Embedding in PAT. This lecture is about a mechanism in PAT introduced to optimize the event size of a pat::Tuple: Embedding


Lecture 2.2: PAT Embedding

Andreas Hinzmann

RWTH Aachen University

PAT Tutorial @ FNAL, November 2010

Embedding in PAT

  • This lecture is about a mechanism in PAT introduced to optimize the event size of a pat::Tuple: Embedding

  • In order to save disk space we have to reduce the EventContent to what is really needed. This means dropping some collections from the output pat::Tuple.

  • But one may want to retain some collections for later use.

  • PAT embedding was conceived to retain useful information without inflating the pat::Tuple size.

  • We will see:

    • how to get information on the event size using the edmEventSize tool

    • what the issue with internal references is

    • what the PAT solution is: embedding


  • The typical event size of a standard PAT output is ~6-60 kB

    • It depends on which collections the user chooses to keep

  • There are some tools to estimate the event size. They are useful when you have to decide which event content to keep and which to drop

  • Run this command and inspect its output:

    edmEventSize -v patTuple.root

  • Have a look at SWGuidePATEventSize for more details:

  • https://twiki.cern.ch/twiki/bin/view/CMS/SWGuidePATEventSize

The problem of references

  • To save disk space, reco::Objects contain pointer references to other objects instead of copies of them.

  • Note:

    • when collections are dropped, all pointer relations to them become invalid!

    • you have to know all pointer relations to estimate the consequences of dropping a collection.

    • even when the target collections are still in the event, not all references can be resolved in FWLite.

Examples : Jets & CaloTowers

  • CaloJets keep references to the CaloTowers they were produced from:

  • Note:

  • when the CaloTower collection is dropped, the references become invalid.

  • calling the corresponding member function will then cause an edm::Exception.
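
The failure mode above can be illustrated with a small toy model in plain Python. This is only a sketch: the classes Event, Ref, CaloJet and DanglingRefError are hypothetical stand-ins invented for illustration and are not the CMSSW/EDM API. The assumption is simply that a reference stores a collection label and an index rather than the object itself.

```python
# Toy model only: these classes are hypothetical and are NOT the CMSSW/EDM API.


class DanglingRefError(Exception):
    """Stands in for the edm::Exception raised on an unresolved reference."""


class Event:
    """An event modelled as a dict: collection label -> list of objects."""

    def __init__(self, collections):
        self.collections = dict(collections)

    def drop(self, label):
        # Dropping removes the objects, but not the references pointing to them.
        del self.collections[label]


class Ref:
    """A reference stores only (collection label, index), never the object."""

    def __init__(self, label, index):
        self.label = label
        self.index = index

    def get(self, event):
        if self.label not in event.collections:
            raise DanglingRefError(f"collection '{self.label}' is not in the event")
        return event.collections[self.label][self.index]


class CaloJet:
    """A jet that only keeps references to its constituent towers."""

    def __init__(self, tower_refs):
        self.tower_refs = tower_refs

    def towers(self, event):
        return [ref.get(event) for ref in self.tower_refs]


towers = [{"et": 12.0}, {"et": 7.5}, {"et": 0.4}]
event = Event({"towerMaker": towers})
jet = CaloJet([Ref("towerMaker", 0), Ref("towerMaker", 1)])

et_before = [t["et"] for t in jet.towers(event)]  # references resolve

event.drop("towerMaker")  # drop the tower collection from the event
try:
    jet.towers(event)
    failed = False
except DanglingRefError:
    failed = True  # the same call now fails
```

The jet object itself is unchanged by the drop; only the call that follows the references fails, which is why the consequences are easy to overlook.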

The PAT Answer

  • The PAT answer is to embed objects into the pat::Objects when access to them is required.

  • The following objects can be embedded:

    • gsfTrack and SuperCluster in Electron

    • All of the tracks in Muon

    • Calo towers in Jets

    • etc.

  • You can safely use the 'reference pointer' in FWLite

  • Note: Embedding is fully supported in compiled FWLite. Some functionality is for the moment not available in interactive FWLite (CINT) or PyFWLite, so it is safer to use compiled FWLite code.

Configure the input

Embedding in <= CMSSW_3_7_X

  • Before the CMSSW_3_8_X release, PAT embedding was implemented by hard-copying the referenced objects into the pat::Objects!

  • Calling the member function of the pat::Object checks for embedding internally and returns the corresponding references (completely transparent to the user).

  • You can (and indeed should) safely drop the embedded collection from the EventContent. This reduces the size of the pat::Tuple, and you can still refer to the embedded objects, which have become part of the pat::Object itself.
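
The hard-copy idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the pat::Jet interface: the class ToyPatJet and its methods embed_towers and towers are hypothetical names, and the event is modelled as a simple dict.

```python
import copy

# Toy model only: hypothetical names, not the pat::Jet interface.


class ToyPatJet:
    def __init__(self, tower_refs):
        self.tower_refs = tower_refs  # list of (collection label, index)
        self.embedded_towers = None   # filled only when embedding is requested

    def embed_towers(self, event):
        # Hard-copy the referenced towers into the jet itself.
        self.embedded_towers = [
            copy.deepcopy(event[label][index]) for label, index in self.tower_refs
        ]

    def towers(self, event):
        # Embedded copies take precedence; otherwise resolve the references.
        if self.embedded_towers is not None:
            return self.embedded_towers
        return [event[label][index] for label, index in self.tower_refs]


event = {"towerMaker": [{"et": 12.0}, {"et": 7.5}, {"et": 0.4}]}
jet = ToyPatJet([("towerMaker", 0), ("towerMaker", 1)])

jet.embed_towers(event)
del event["towerMaker"]  # drop the original collection from the event

embedded_et = [t["et"] for t in jet.towers(event)]  # still works after the drop
```

The caller uses the same towers() accessor in both cases, which is the sense in which the check for embedding is transparent to the user. The price is that every jet carries its own copies of the towers.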

Embedding in >= CMSSW_3_8_X

In >= CMSSW_3_8_X the implementation of the embedding has changed, which improves the performance of access to the pat::Object.

Let’s see a concrete example: the pat::Jets and the CaloTowers.

In 3_8_X a separate collection is stored that contains only the CaloTowers clustered into the pat::Jets. Of course its size is significantly smaller than that of the collection of ALL CaloTowers!
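
The idea of a separate, reduced collection can be sketched as follows. This is a toy illustration in plain Python and not the CMSSW implementation: the function build_reduced_collection is a hypothetical name, jets are modelled as lists of indices into the full tower list, and storing a tower only once when it is referenced more than once is an assumption of the sketch.

```python
# Toy model only: hypothetical function, not the CMSSW implementation.


def build_reduced_collection(jets, all_towers):
    """Keep only the towers referenced by the jets and re-point the jets to them.

    jets:       list of jets, each a list of indices into all_towers
    all_towers: the full tower collection
    Returns (reduced_towers, remapped_jets).
    """
    new_index = {}  # old index in all_towers -> new index in reduced
    reduced = []
    remapped = []
    for tower_indices in jets:
        new_refs = []
        for old in tower_indices:
            if old not in new_index:
                new_index[old] = len(reduced)
                reduced.append(all_towers[old])
            new_refs.append(new_index[old])
        remapped.append(new_refs)
    return reduced, remapped


all_towers = [f"tower{i}" for i in range(10)]
jets = [[2, 5], [5, 7, 9]]

reduced, remapped = build_reduced_collection(jets, all_towers)
```

In this example only four of the ten towers survive in the reduced collection, and the jets keep lightweight indices into it instead of carrying copies.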

But how much space can we gain by embedding CaloTowers in the pat::Jet collection?

Let’s go to the next slide!

Embed CaloTowers in pat::Jets

What about the size of patJets + CaloTowers without embedding, and when keeping all CaloTowers?

With CaloTowers embedded

Branch name                                  Uncompressed   Compressed   (average bytes per event)
patJets_cleanPatJets__PAT.                   41840.6        5318.05
CaloTowers_selectedPatJets_caloTowers_PAT.   20906.3        5803.96

Without CaloTowers embedded

Branch name                                  Uncompressed   Compressed   (average bytes per event)
patJets_cleanPatJets__PAT.                   29660.6        4497.83
CaloTowers_selectedPatJets_caloTowers_PAT.   413.94         289.85


Without embedding, keeping all the CaloTowers

Branch name                                  Uncompressed   Compressed   (average bytes per event)
CaloTowersSorted_towerMaker__RECO.           143693         24379.2
patJets_cleanPatJets__PAT.                   41840.6        5318.05

The size of the pat::Jets together with their CaloTowers decreases from about 11 kB per event with embedded CaloTowers to about 5 kB per event without embedded CaloTowers.

Keeping all CaloTowers increases the size by about 24 kB per event.
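
The per-event figures quoted above can be cross-checked by summing the last column of the size listings. The following plain-Python sketch does this for the numbers on this slide. It assumes that each data line has the form "branch name, uncompressed size, compressed size" in bytes per event; the helper names parse_sizes and compressed_total are hypothetical and not part of any CMSSW tool.

```python
# Sketch only: parses size listings copied from the slide, not a CMSSW tool.


def parse_sizes(text):
    """Return {branch: (uncompressed, compressed)} from whitespace-separated lines."""
    sizes = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "branch number number"
        branch, uncompressed, compressed = parts
        sizes[branch] = (float(uncompressed), float(compressed))
    return sizes


def compressed_total(text):
    """Sum of the compressed sizes (bytes per event) over all listed branches."""
    return sum(compressed for _, compressed in parse_sizes(text).values())


with_embedding = """
patJets_cleanPatJets__PAT. 41840.6 5318.05
CaloTowers_selectedPatJets_caloTowers_PAT. 20906.3 5803.96
"""

without_embedding = """
patJets_cleanPatJets__PAT. 29660.6 4497.83
CaloTowers_selectedPatJets_caloTowers_PAT. 413.94 289.85
"""

keeping_all_towers = """
CaloTowersSorted_towerMaker__RECO. 143693 24379.2
patJets_cleanPatJets__PAT. 41840.6 5318.05
"""

total_with = compressed_total(with_embedding)
total_without = compressed_total(without_embedding)
total_all = compressed_total(keeping_all_towers)

for label, total in [
    ("with embedding", total_with),
    ("without embedding", total_without),
    ("keeping all CaloTowers", total_all),
]:
    print(f"{label}: {total / 1000:.1f} kB per event")
```

The sums come out at roughly 11.1 kB, 4.8 kB and 29.7 kB per event, in line with the rounded 11 kB and 5 kB quoted above and with the additional ~24 kB taken by the full CaloTower collection.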

More details are given in Exercise 6.

Explain Exercise n°6

  • You are now ready to go to the Exercise 6 section:

    • https://twiki.cern.ch/twiki/bin/view/CMS/SWGuidePATEmbeddingExercise

  • where you will learn:

    • what the problem of internal references is within the EDM.

    • how PAT solves this situation.

    • how to embed extra information into a pat::Candidate.

  • You will find some exercises to practice what you have learnt. Please go through all of them and fill in the answers in the e-learning results for this section.

  • Have fun!
