BABAR: the new computing model

  • Computing Model Working Group

  • Analysis Model

  • Event Store Technologies

  • Computing/Analysis Sites

  • Implementation/Migration

J. Walsh, INFN-Pisa


Note: The new Babar Computing Model is still under development. Everything I present here today is still under discussion within Babar.

Need for new computing model

  • Babar Computing Model established end of 2000

    • changes in computing environment → require review

  • Babar Computing Review: April, 2002

    • large luminosity of PEP-II puts big burden on Babar computing resources

    • Analysis Groups produce huge ntuples that essentially duplicate micro-dst information → not scalable with luminosity

    • two event store technologies, Root/IO (Kanga/Root) and Objectivity → burden on Computing Group

    • several other issues raised: role of remote sites, etc.

  • Computing Model Working Group 2: update the Babar Computing Model. So far, work has concentrated on two main areas:

    • New proposal for analysis model

    • Event store technology: Root vs. Objectivity debate

  • Plus, I will comment on remote sites: Tier A, Tier C, etc.

Luminosity Projection

[Figure: luminosity projection plot, with labels "where we are now" and "2002 forecasts" (original label: Previsioni '02)]

design luminosity (3 × 10^33 cm^-2 s^-1) exceeded in 2001

already expect 3 times current data sample in 2005


Computing Model Working Group 2: Members

  • Members:

    David Brown/LBL

    Claudio Campagnari

    Andreas Hoecker

    Hassan Jawahery (co-chair)

    Yury Kolomensky (co-chair)

    David Lange

    Mike Roney

    Aaron Roodman

    Anders Ryd

    Bernhard Spaan

    John Walsh

    Fergus Wilson

  • Technical Experts:

    • Jacek Becla

    • Fabrizio Bianchi

    • Nicole Chevalier

    • Nicolo De Groot

    • Gregory Dubois-Felsmann

    • Peter Elmer

    • Steffen Luitz

    • Mauro Morandin

  • Ex-Officio:

    • Stephen Gowdy CC

    • Rainer Bartoldus DepCC

    • Richard Mount SCS

    • Dominique Boutigny Tier-A Rep.

    • Livio Lanceri DepPAC

Current Computing Model - Jargon

  • Analysis:

    • performed exclusively on micro format

    • Beta/Framework is analysis program, C++ based

    • Central Skims produce subsets of data pertinent to various analyses

    • Analysis Working Groups (AWGs) often produce ntuples (or rootuples) that are large and redundant (they contain the same info as the micro)

    • Fortran (or root scripts) analyze ntuples (rootuples)

  • Event Store:

    • Originally only in Objectivity

    • Kanga/Root developed for micro format only, to enhance analysis performance, data distribution

    • micro currently maintained in Objy and Kanga/Root

  • Computing/Analysis Sites:

    • Full copy of micro data available at SLAC, IN2P3, RAL

    • Production reprocessing in place at INFN-Padova

    • Numerous institutes produce MC at remote sites

Tier A Sites

New Analysis Model – Key Points

  • Data formats:

    • Mini: new format with detailed detector information available.

    • New Micro: upgraded form of current micro, allowing faster access and customizable content

    • Tag (nano-dst): no major changes, general cleanup

    • Note: large RECO and RAW objects no longer written to event store

  • Skims:

    • New Micro will make centralized skims more useful and efficient

  • Large ntuple production:

    • Hopefully rendered obsolete, freeing up resources (CPU, disk space and manpower)

Mini Format

  • New format introduced over the last year:

    • contains essentially full detector information

      • track hits, calorimeter crystals, DRC hits, etc.

    • efficiently packed to optimize space

      • about 8 kBytes per event (compare to micro: 2 kBytes/event)

    • increased analysis capability w.r.t. micro, e.g.:

      • track extrapolation through detector material

      • follow changing conditions (e.g. SVT alignment)

      • event display

      • etc.
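
To give a feel for the per-event sizes quoted above, the following is a minimal sketch that converts them into a total sample size. Only the 8 kBytes (mini) and 2 kBytes (micro) figures come from the slide; the function name and the one-billion-event sample are hypothetical and are not Babar numbers.

```python
# Approximate per-event sizes quoted on the slide, in kBytes.
EVENT_SIZE_KB = {"mini": 8.0, "micro": 2.0}


def sample_size_tb(n_events: int, fmt: str) -> float:
    """Return the approximate size of a sample in TB (taking 1 TB = 1e9 kB)."""
    return n_events * EVENT_SIZE_KB[fmt] / 1e9


# Hypothetical sample size, chosen only for illustration.
n_events = 1_000_000_000
for fmt in ("micro", "mini"):
    print(f"{fmt}: {sample_size_tb(n_events, fmt):.1f} TB")
```

Whatever the real event count, the ratio is fixed by the slide's figures: the mini takes about four times the space of the micro for the same events.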

Mini Format - II

  • Additional characteristics:

    • customization

    • larger and slower than micro-dst:

      • develop coherent staging system to retrieve events from tape. Target access times:

        • < 1 hr for small (< 100 events) samples

        • < 1 day for medium (< 1 k events) samples

        • < 2-4 weeks for large samples

    • exact use pattern of mini not really predictable

      • need to remain flexible on implementation
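
The target access times listed above can be read as a simple lookup on the requested sample size. The sketch below encodes that reading; the thresholds (100 and 1000 events) are the ones on the slide, while the function name is hypothetical and no actual staging-system interface is implied.

```python
def target_access_time(n_events: int) -> str:
    """Map a requested mini sample size to the target staging time on the slide."""
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    if n_events < 100:       # "small" sample
        return "< 1 hr"
    if n_events < 1000:      # "medium" sample
        return "< 1 day"
    return "< 2-4 weeks"     # "large" sample


for n in (50, 500, 5_000_000):
    print(n, "events ->", target_access_time(n))
```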

New-Micro Format

  • Radical improvement w.r.t. current micro-dst

    • Dual usage:

      • regular framework/Beta job (current Babar norm)

      • interactive use with Root

    • Customizable content

      • option to store detector info or not

      • additional user info can be added

      • composite candidates

      • different mass hypo track fits, etc.

    • High speed: the aim is to reach 1 kHz with framework/Beta → will require Beta development

      • current rate: few tens of Hz

      • higher read rates envisioned with Root access (at cost of reduced functionality)
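
To illustrate what the rate target means in practice, here is a rough sketch of the wall-clock time for a single pass over a sample. The 1 kHz target is from the slide; the 30 Hz value merely stands in for "few tens of Hz", and the 100-million-event sample is a hypothetical figure, not a Babar number.

```python
def processing_hours(n_events: int, rate_hz: float) -> float:
    """Hours needed for one job to read n_events at a steady rate in events/s."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return n_events / rate_hz / 3600.0


n_events = 100_000_000  # hypothetical sample size
for label, rate in (("current (assumed 30 Hz)", 30.0), ("target (1 kHz)", 1000.0)):
    print(f"{label}: {processing_hours(n_events, rate):.1f} hours")
```

Under these assumptions a single pass drops from roughly 926 hours to roughly 28 hours, which is the kind of gain that would make working directly from the new-micro competitive with ntuples.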

New Micro Format – II

  • Impact on users:

    • much analysis in Babar done at ntuple level

    • ntuple analysis code will have to be adapted/converted to new-micro (use of paw/Fortran discouraged) → potentially disruptive

  • Comment on Objectivity:

    • since interactive Root access is a basic feature of the new-micro → Objectivity event store is not an option

    • new-micro will be in Root/IO format

Event Store Debate

  • Current system: hybrid with Objectivity at SLAC and IN2P3 and Kanga/Root at RAL, INFN-Padova

  • Problems with Objectivity:

    • lock collisions

    • Prompt Reconstruction and Simulation Production performance issues

    • poor record of scaling with luminosity: every jump in data sample has been accompanied by Objy problems

    • distribution difficulties → problems getting data samples to Tier C sites

    • other HEP experiments have dropped Objy as an option

    • concerns about viability of Objectivity Company

      • we don’t have source code

      • how much expertise will be around in 2007?

Event Store Debate - II

  • Kanga/Root

    • much easier maintenance

    • easier to export

    • smaller event size (although Objy event size is decreasing with deployment of compression and redesign of navigational info)

    • more efficient CPU usage

    • becoming HEP standard – easy to attract manpower to support Kanga/Root

Event Store Debate - III

  • So, why not drop Objy and go with Kanga/Root system?

    • Cost of migration (effort and disruption): estimates range from 1 to 2 years to achieve migration → most in Babar agree that a switch taking more than 2 years is probably not worth doing.

    • Kanga/Root has some technical issues that need to be addressed:

      • file server to handle huge number of files

      • lack of transactions

      • lack of cross-file associations (e.g. mini-to-micro navigation)

      • bookkeeping

      • staging system

    • Political/human issues.

  • Note: conditions database implemented in Objectivity

    • too costly (estimate 2-3 years) to convert to Root-based DB

    • Babar relationship with Objy will continue in any case

Work to address these issues is ongoing (not just in the Babar context) → probably no show-stoppers, but it is work.

Event Store Debate - IV

  • Alternative to Kanga/Root-only system: a hybrid system where:

    • new-micro in Kanga/Root format only

    • everything else (event reconstruction, simulation production, mini format) in Objectivity

  • Hybrid system has advantage of easier, less-disruptive migration, but we still need to support 2 event store technologies

  • Final decision/recommendation on event store coming soon

Computing/Analysis Sites

  • The Working Group is just starting on this subject → here I just present the issues

  • Role of Tier A sites – large sites that significantly reduce the computing burden at SLAC

    • Primarily analysis: IN2P3, RAL

    • Production: INFN-Padova

    • Issues:

      • data replication at Tier A’s

      • data partitioning at Tier A’s (micro, mini, beam data, MC)

      • transparent access to data across Tier A’s (BabarGrid)

      • specialization of Tier A’s: skimming, (re-)processing, etc.

  • Role of Tier C sites – smaller sites at remote institutes

    • main contribution so far in MC production (majority of MC events produced away from SLAC)

    • analysis at Tier C’s has been difficult due to problems with data distribution → need to resolve with new Computing Model

Implementation

  • Mini

    • Already implemented in Objectivity (minor fixes, improvements ongoing)

    • Feasibility of Root implementation has been studied → could be ready by early 2003

  • New-micro

    • Dual usage (Beta and Root) prototype implementation has been achieved. Additional development needed:

      • customization

      • persistent composites

      • Beta/Framework optimization

Migration

  • Essential requirement: minimal disturbance to Babar capability to produce physics results

  • Mini

    • currently doing reprocessing in Padova of all data, producing mini format output

    • the mini is a “new” feature, so disruption of ongoing analyses is minimal

  • New-micro

    • exploit Babar’s data replication to ease migration

      • maintain old-micro at SLAC and IN2P3 sites

      • introduce new-micro at RAL site

      • users have choice of format during transition period

  • Dependence on other parts of Computing Model

    • use of Tier A sites

    • choice of event store technology, etc.

Summary

  • Babar is currently updating its computing model to be able to deal with the large increase in the data set expected in the coming years

  • A new analysis model, based on the new-micro and mini data formats, has been proposed and largely agreed to.

    • the mini will permit more in-depth analysis

    • the new-micro will eliminate largely wasteful/redundant ntuples

  • The working group is also considering the future of event store technologies employed in Babar.

    • Should Objectivity event store be dropped in favor of Root-based technology?

    • Is Kanga/Root ready to be used as a full-scale event store?

    • Does a hybrid system do enough to alleviate the problems of Objy?

  • An important part of the new model will be how best to use remote computing/analysis sites: Tier A and Tier C

    • Work starting on this subject within the Working Group

The following are backup slides


Skims with New-Micro

  • The customization features of the new-micro make it an attractive tool to use with Centralized Skims

  • The idea is that each Analysis Working Group (AWG) will provide the appropriate event selection and content customization to the Central Skim group

  • Small skims will be encouraged: deep-copies, which provide fast access, will be possible for small skims (< few % selection rate)

  • In addition to skims, a generic new-micro containing all physics events will be available → important for new analyses

  • Aiming for increased frequency for Central Skims – every 3 months

    • feasibility being evaluated
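
The rule of thumb above (deep copies only for skims with a selection rate below a few percent) can be written as a small decision function. This is only a sketch: the slide does not give an exact cutoff, so the 3% default below is an assumed placeholder, and the function and parameter names are hypothetical.

```python
def choose_skim_mode(n_selected: int, n_total: int,
                     deep_copy_max_rate: float = 0.03) -> str:
    """Pick a skim type from the selection rate.

    deep_copy_max_rate is an assumed stand-in for "< few %"; the slide
    does not state a precise threshold.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    rate = n_selected / n_total
    return "deep copy" if rate < deep_copy_max_rate else "pointer"


print(choose_skim_mode(1_000, 1_000_000))    # 0.1% selection rate
print(choose_skim_mode(200_000, 1_000_000))  # 20% selection rate
```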

Tag (nano-dst) Format

  • The Tag format will continue to be maintained

  • Optimization/cleanup to remove unused or redundant information – should get a factor of 2 size reduction

Deep Copy vs. Pointer Skims

  • Deep copy

    • copy full event to new location

    • faster read rate

    • more disk space

  • Pointer (shallow) copy

    • write pointer only

    • slower read rate

    • less disk space
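
The trade-off between the two skim types can be sketched with a toy in-memory event store, selecting events 2 and 4 out of four. The `Event` class and the function names are hypothetical illustrations and do not correspond to the Babar event store interfaces.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    payload: bytes  # stands in for the full event content


def deep_copy_skim(store, selected_ids):
    """Copy the full selected events into a new, independent collection."""
    return [Event(e.event_id, bytes(e.payload))
            for e in store if e.event_id in selected_ids]


def pointer_skim(store, selected_ids):
    """Record only the positions of the selected events in the original store."""
    return [i for i, e in enumerate(store) if e.event_id in selected_ids]


def read_pointer_skim(store, pointers):
    """Reading a pointer skim means going back to the original store."""
    return [store[i] for i in pointers]


store = [Event(i, bytes(8)) for i in range(1, 5)]  # events 1 to 4
deep = deep_copy_skim(store, {2, 4})   # self-contained, costs extra space
ptrs = pointer_skim(store, {2, 4})     # tiny, but depends on the original
print([e.event_id for e in deep], ptrs)
```

The deep copy can be read without touching the original collection, which is why it is faster but uses more disk; the pointer skim stores almost nothing but every read is an indirect access into the full sample.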

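[Figure: a collection of four events (Ev 1 to Ev 4); the deep copy holds full copies of Ev 2 and Ev 4, while the shallow copy holds only pointers (Ptr 2, Ptr 4) to them]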

Use Cases

  • A mature analysis (like sin2β) could create a Mini skim of a relatively small number of events and work from that

  • An analysis with loose skim cuts (2-body charmless) would customize a new micro skim, dropping unneeded info and saving B candidates. Mini could be used near end of analysis when number of events is sufficiently reduced.

  • A new analysis would use allEvents generic new micro to explore concept and define cuts. Final analysis could require a customized new micro skim or a mini skim (if event sample is small enough).

  • An AWG could produce a skim that serves many analyses. Specific analyses could make pointer skims of the skim, or deep copy skims if small enough.

  • etc.
