
Experiences with Conditions DB in BaBar

Experiences with Conditions DB in BaBar. Igor A.Gaponenko Lawrence Berkeley National Laboratory (IAGaponenko@lbl.gov). http://www.slac.stanford.edu/~gapon/CERN2000/Experience.ppt. The virtual tour on the BaBar Conditions/DB. Introduction. Some history. The current design.


Presentation Transcript


  1. Experiences with Conditions DB in BaBar
     Igor A. Gaponenko, Lawrence Berkeley National Laboratory (IAGaponenko@lbl.gov)
     http://www.slac.stanford.edu/~gapon/CERN2000/Experience.ppt

  2. The virtual tour on the BaBar Conditions/DB
     • Introduction
     • Some history
     • The current design
     • The Conditions/DB Setup at SLAC
     • The problems...
     • Ongoing / planned developments...
     • Performance
     • Lessons
     Igor A. Gaponenko: Experiences with Conditions DB in BaBar

  3. Introduction…
     • The Conditions/DB has been developed in the context of the BaBar experiment at the Stanford Linear Accelerator Center:
       • in use since May 1999
     • Basic features:
       • The database is meant to store:
         • detector alignments;
         • calibration constants;
         • other time-dependent conditions under which the experimental events are taken.
       • The recorded conditions are “time based” (rather than “run based”):
         • the resolution of time is 1 second
         • the data are recorded every 30 minutes (on average).
       • Multiple versions of conditions data are possible;
       • The persistency is explicitly exposed to the end-users;
       • The meta-data (information about stored conditions) and the conditions themselves are kept in separate databases and containers:
         • for more efficient access and easier management

  4. Introduction…(cont.)
     • The technology is coherent with the general trends in the BaBar software:
       • Provides an API for applications in C++
       • Uses Objectivity/DB as the underlying storage technology
       • Uses CORBA to implement more efficient access to persistent data:
         • Currently it is only used to optimize read-only access in special setups of massively parallel data processing engines (like “Online Prompt Reconstruction” and “Reprocessing”);
         • There are more ambitious plans to serve the “Physics Analysis” jobs and to extend the list of services.

  5. The most important events and key decisions in the history of the Conditions/DB
     • 1993 December: Start of BaBar
     • 1995 June: A key decision to use Objectivity is made. The beginning of the Conditions/DB design.
     • 1996 July: The first prototype of the Conditions/DB is available
     • 1997 Spring: “Proxy Dictionary”:
       - persistent / transient separation
       - transient proxies in the Conditions/DB
       The code development begins...

  6. The most important events and key decisions in the history of the Conditions/DB (cont.)
     • 1998 December: Use of the “BaBar Database Authorization System” implemented in the Conditions/DB code
     • 1999 January: The improved implementation of the “Data Clustering and Placement System”
     • 1999 February: “Schema Evolution Strategy” for the Conditions/DB. Start working on the “Revision Extension”.
     • 1999 May: The “Revision Extension” is available:
       - means extended persistent schema for meta-data
       - access to the “old” conditions
       BaBar starts taking data!!!

  7. The most important events and key decisions in the history of the Conditions/DB (cont.)
     • 1999 June: The “History Extension” is available:
       - persistent bookkeeping for operations affecting the database
       - “rollbacks” are possible now
       - improved manageability
     • 1999 August: The “User Area Extension” makes it possible to keep common and private (owned by users) conditions in a single federation
     • 1999 September: First problems are seen after 3 months of data taking:
       - the amount of data due to the Ambient/DB reaches ~20 GB
       - the scalability limits are seen (early warning)
     • 1999 November: The need for the “Multi Federations Setup” at SLAC, to avoid the locking problem, is finally realized.

  8. The most important events and key decisions in the history of the Conditions/DB (cont.)
     • 2000 January: The “Multi Federations Setup” of the Conditions/DB is deployed. Three major federations:
       - DAQ
       - Online Event Processing (OPR)
       - Physics Analysis
     • 2000 February: The REPRO federation (a functional clone of OPR) is introduced. Some of the conditions need to be “shared” with OPR. The “Rolling Calibration” is deployed at both OPR and REPRO.
     • 2000 April: The problem of “staircases” in REPRO, which is specific to the “Rolling Calibration”, is identified and solved.
     • 2000 May: The problem of the long “Startup Time” in OPR and REPRO is identified. The first experiments with CORBA are started.

  9. The current design of the Conditions/DB
     • The Conditions/DB classes in the overall BaBar software architecture:
       • Two types of applications:
         • BaBar Framework applications
         • Standalone applications
       • The roles of components:
         • The Environment
         • The Proxies
         • The Conditions/DB
     • The architecture of the Conditions/DB
       • The structure of the API it provides to its clients: Basic Read/Write, Revisions, Information and Management
       • Types of clients
       • The services it relies on: Clustering, Authorization, Objectivity

  10. The current design of the Conditions/DB: Data Model
     • The namespace for conditions:
       • Two-layered namespace: <detector>, <condition>
       • There is just one set of conditions per federation:
         • EXCEPTION: “User Area Extension”
     • Meta-data and conditions objects are separated:
       • Time, intervals, versions, objects, revisions
       • Indexing
       • The History records
     • Physical data placement
       • A layout of the Persistent store
       • The file system of the database

  11. The current design of the Conditions/DB: API
     • The API (as it appears to the end users):
       • Transient classes (BdbTime, BdbDatabase, BdbCondDatabaseMgr, etc.)
       • Persistent classes (BdbObject, BdbInterval, etc.)
       • Dealing with “revisions”
     • Management Tools
       • > 50% of the time was spent developing these tools!!!
     • The code
       • 4 core packages
       • >> 10 detector-specific packages
       • nearly 100 K lines of code in C++, Perl, Java in core packages

  12. The Conditions/DB in BaBar Software Architecture
     [Diagram labels: Framework Applications; Standalone Applications; The Environment (AbsEnv, EmcEnv, XxxEnv); Persistency; Proxies (EmcXtalProxy); Conditions/DB]

  13. The role of proxies
     [Diagram labels: Transient classes (EmcFooClassT); Proxy; Persistent Store (EmcFooClassP, EmcFooClassP_001, EmcFooClassP_002)]
     The proxy:
     • constructs and caches a transient object
     • performs the data conversion
     • loads persistent objects
     • manages a transaction

  14. The Structural Components
     [Diagram labels: Users’ code; Proxies; Management Utilities; BDB services in use (Authorization, Data Clustering & Placement); Conditions/DB (Transient cache, Transaction Management, Data store & fetch API, Management API, Configuration, Persistent cache); Persistent Store (Objectivity/DB)]

  15. The Namespace for Conditions
     Characteristics:
     • The namespace has two layers only: <detector>/<condition>
     • The condition names are unique within a detector.
     • A detector name is exactly 3 characters long.
     [Diagram: example tree with detectors emc and dch and conditions Status, Xtals, A, Status, DriftV, Gas]
     Notes: Two detector names, “tmp” and “usr”, are treated in a special way: they aren’t associated with any real detector.
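
The naming rules on this slide can be illustrated with a small validator. This is only a sketch under stated assumptions: `ConditionName`, `parseConditionName` and `isSpecialDetector` are hypothetical names introduced here, not part of the BaBar API, and the slide does not say how the real code checks names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the two layers of a condition name.
struct ConditionName {
    std::string detector;   // exactly 3 characters, e.g. "emc"
    std::string condition;  // unique within its detector, e.g. "Xtals"
};

// Splits "<detector>/<condition>" and enforces the rules from the slide:
// exactly two layers and a 3-character detector name.
// Returns false (leaving 'out' untouched) if the path breaks a rule.
bool parseConditionName(const std::string& path, ConditionName& out) {
    const std::string::size_type slash = path.find('/');
    if (slash != 3) return false;                                    // detector must be 3 characters
    if (path.find('/', slash + 1) != std::string::npos) return false; // two layers only
    if (path.size() == slash + 1) return false;                      // empty condition name
    out.detector = path.substr(0, slash);
    out.condition = path.substr(slash + 1);
    return true;
}

// "tmp" and "usr" are not associated with any real detector.
bool isSpecialDetector(const std::string& detector) {
    return detector == "tmp" || detector == "usr";
}
```

With this sketch, "emc/Xtals" parses into detector "emc" and condition "Xtals", while "em/Xtals" and "emc/a/b" are rejected.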

  16. Meta-data and Conditions Objects
     NOTE: The plain timeline for a condition is shown.
     [Diagram: validity time axis from -Infinity to +Infinity, with the FIRST and LAST intervals marked]
     • The condition is parameterized by a single parameter: validity time.
     • The timeline of a condition object is divided into intervals: [begin, end).
     • Each interval covers a period in the validity time where the value of the condition is constant.
     • The intervals are connected into a linked list, but Objectivity’s indexing is used to speed up locating the desired interval for a given time.
     • Each interval has a pointer to the actual condition object residing in a different database.
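
The plain timeline described on this slide can be sketched in transient C++. This is an illustration, not the BaBar implementation: `Timeline`, `Interval` and `objectId` are hypothetical names, a `std::map` stands in for the linked list plus the Objectivity index, and a string stands in for the pointer to the condition object. Overwriting the covered range on `store` is a simplification made here for the plain (unversioned) case.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

typedef std::int64_t Time;  // validity time, e.g. in seconds
const Time kMinusInfinity = std::numeric_limits<Time>::min();
const Time kPlusInfinity  = std::numeric_limits<Time>::max();

// One half-open validity interval [begin, end) and a stand-in for the
// pointer to the condition object kept in a different database.
struct Interval {
    Time begin;
    Time end;
    std::string objectId;
};

class Timeline {
public:
    // The timeline always covers (-Infinity, +Infinity).
    Timeline() {
        Interval whole = {kMinusInfinity, kPlusInfinity, ""};
        intervals_[kMinusInfinity] = whole;
    }

    // Makes [begin, end) point at objectId, splitting neighbours as needed.
    void store(Time begin, Time end, const std::string& objectId) {
        assert(begin < end);
        splitAt(begin);
        splitAt(end);
        intervals_.erase(intervals_.lower_bound(begin), intervals_.lower_bound(end));
        Interval fresh = {begin, end, objectId};
        intervals_[begin] = fresh;
    }

    // Indexed lookup: the interval whose [begin, end) contains t.
    const Interval& find(Time t) const {
        assert(t < kPlusInfinity);
        std::map<Time, Interval>::const_iterator it = intervals_.upper_bound(t);
        --it;  // safe: the key -Infinity is always present
        return it->second;
    }

private:
    // Ensures that some interval begins exactly at t.
    void splitAt(Time t) {
        if (t == kPlusInfinity) return;
        std::map<Time, Interval>::iterator it = intervals_.upper_bound(t);
        --it;
        if (it->second.begin == t) return;
        Interval tail = {t, it->second.end, it->second.objectId};
        it->second.end = t;
        intervals_[t] = tail;
    }

    std::map<Time, Interval> intervals_;  // keyed by interval begin
};
```

The ordered map gives logarithmic lookup by time, which plays the role the slide assigns to the index; walking the map in key order corresponds to following the linked list.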

  17. Meta-data and Conditions Objects: versions
     NOTE: The versioned timeline for a condition is shown.
     [Diagram: timeline from -Infinity to +Infinity with baseline intervals V0 between FIRST and LAST, and versions V1 to V4 stacked above them]
     • The intervals above the baseline level are called versions.
     • Each interval may have many versions within its validity time limits.
     • The versions are organized into vertical trees. Each tree grows from a baseline interval.

  18. The Concept of Revisions
     [Diagram: revisions stacked over the timeline from -Infinity to +Infinity: <0> is the “baseline”; <1> = <0> + its own intervals; <2> = <1> + its own intervals; <3> = <2> + its own intervals]
     • A revision is a second (vertical) key used when a condition is being fetched.
     • A revision is made of intervals.
     • Each revision, except the baseline one, has a base revision.
     • Each revision, together with its direct and indirect base revisions, provides a complete timeline for the condition.
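
A fetch keyed by both time and revision can be sketched as a walk down the chain of base revisions. The names `Revision`, `Interval` and `fetch` are hypothetical and the persistent schema is not reproduced; the sketch only assumes what the slide states, namely that a revision holds its own intervals and falls back to its base for the rest of the timeline.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

typedef std::int64_t Time;

struct Interval {
    Time begin;            // validity is the half-open range [begin, end)
    Time end;
    std::string objectId;  // stand-in for the reference to the condition object
};

// A revision owns only the intervals added on top of its base revision.
struct Revision {
    const Revision* base;                // nullptr for the baseline revision
    std::map<Time, Interval> intervals;  // keyed by begin; may leave gaps

    Revision() : base(nullptr) {}
    explicit Revision(const Revision* b) : base(b) {}

    void add(Time begin, Time end, const std::string& objectId) {
        assert(begin < end);
        Interval iv = {begin, end, objectId};
        intervals[begin] = iv;
    }
};

// Looks in the requested revision first, then in its direct and indirect
// base revisions. Returns nullptr only if no revision in the chain covers t.
const Interval* fetch(const Revision& revision, Time t) {
    for (const Revision* r = &revision; r != nullptr; r = r->base) {
        std::map<Time, Interval>::const_iterator it = r->intervals.upper_bound(t);
        if (it == r->intervals.begin()) continue;  // nothing begins at or before t
        --it;
        if (t < it->second.end) return &it->second;
    }
    return nullptr;
}
```

For example, with a baseline covering [0, 100) and a revision <1> that adds [10, 20), a fetch at time 12 through <1> returns the newer object, while a fetch at time 50 through <1> falls through to the baseline.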

  19. User defined persistent classes
     NOTE: All user-defined persistent classes have to be derived from the special one.
     [Class diagram: user classes A and B derive from BdbObject, which derives from <<persistent>> ooObj]

  20. A Layout of the Persistent Store
     NOTE: Just one detector is shown.
     [Diagram: containers A, B, C shown in each kind of database]
     • Link database: a namespace for conditions. Each container has a symbolic link.
     • Index database(s): each container keeps a history of a particular condition.
     • Object database(s): “real” condition objects are stored in here.

  21. The File System of the Database
     ...conditions/                        (DOMAIN)
       emc/
       ...
       drc/                                (DETECTOR)
         con_drc_Link
         con_drc_Index_core
         con_drc_Index_opr-core
         drc/
           drc002000-002100/
             con_drc_drc002000
             con_drc_drc002001
             con_drc_drc002002

  22. An Example of the Conditions/DB Use: Storing
     #include "BdbClustering/BdbCondClusteringHint.hh"
     #include "UserPackage/MyObject.hh"
     #include "BdbTime/BdbTime.hh"
     #include "BdbCond/BdbDatabase.hh"
     #include "BdbCond/BdbObject.hh"

     // Create a new persistent object at given location
     BdbCondClusteringHint theHint( "emc" );
     BdbHandle(BdbObject) theH = new( theHint.updatedHint( )) MyObject( );

     // Register a new object in the database (create meta-data).
     BdbDatabase theDb( "emc" );
     BdbTime theBegin = 1;
     BdbTime theEnd = 10;
     BdbStatus status;
     status = theDb.store( theH, "MyClass", theBegin, theEnd );
     if( BdbcSuccess != status ) {
         // handle the failure here (left empty on the slide)
     }

  23. An Example of the Conditions/DB Use: Fetching
     #include "UserPackage/MyObject.hh"
     #include "BdbTime/BdbTime.hh"
     #include "BdbCond/BdbDatabase.hh"
     #include "BdbCond/BdbObject.hh"

     // Find an object for given criteria
     BdbDatabase theDb( "emc" );
     BdbHandle(BdbObject) theH;
     BdbTime theTime = 1;
     BdbStatus status;
     status = theDb.fetch( theH, "MyClass", theTime );
     if( BdbcSuccess != status ) {
         // handle the failure here (left empty on the slide)
     }

     // Cast to a concrete class
     BdbHandle(MyObject) myH;
     myH = (BdbHandle(MyObject)) theH;

  24. The Conditions/DB Setup at SLAC
     • Two stages in the development of the setup:
       • A single, one-directional chain of IR2, OPR federations
       • The Conditions/DB distributed into 3 federations
     • 4 types of conditions:
       • online, constants, “Rolling Calibrations”
       • various update frequencies
     • Data synchronization procedures:
       • “sweep”
       • “merge”
       • propagate new types of conditions
     • Some statistics:
       • The numbers:
         • amount of data
         • number of persistent classes
         • number of proxies
       • TABLE: a distribution of conditions between detectors
       • TABLE: a distribution of databases between detectors

  25. The Federation Setup at SLAC (Stage I)
     Stage I: Before February 2000. All conditions are loaded/updated in one place only. No “Rolling Calibration” at (re-)processing time.
     [Diagram labels: DAQ; Prompt Reconstruction Farm: x150 nodes; Physics Analysis Jobs; federations IR2, OPR, PHYS; access modes R/W, RO, RO; To other federations]
     • This is the only place where conditions are created and updated.
     • All other federations access conditions for reading only.

  26. The Federation Setup at SLAC (Stage II)
     Stage II: February 2000+. All conditions are loaded/updated in 3 federations. The “Rolling Calibration”.
     [Diagram labels: DAQ; Prompt Reconstruction Farm: x150 nodes; Re-Reconstruction Farm: x150 nodes; Physics Analysis Jobs; federations IR2, OPR, REPRO, PHYS; access modes R/W, R/W, R/W (REPRO), RO; To other federations]
     • Now the conditions are created and updated in all three federations simultaneously.

  27. The Types of Conditions produced by Federations
     • The role of each “production” federation (the type of data it owns/generates):
       • IR2 Federation
         • online calibrations
         • other conditions under which the events are taken
       • OPR Federation
         • detector alignments
         • original “Rolling Calibration” constants
       • REPRO Federation
         • updated “Rolling Calibration” constants
     • Each of these federations is “responsible” for managing its own set of conditions.
       • See the tables on the next pages...
     • Each federation (including the “consumer” ones) has:
       • a superposition of all conditions data produced in the three federations above

  28. The Federation Setup at SLAC (distributed conditions)
     Idea: The conditions are distributed (created/updated) between 3 federations. Each federation has a copy of conditions from the others.
     Technology: The two-layered namespace <detector>/<condition> is mapped to <origin>-s. Inter-federation synchronization mechanisms.
     [Diagram: conditions A to F and X, Y replicated across the IR2, OPR and REPRO federations]
     “Sharing” a condition:
     [Diagram: conditions X and Y along the time axis, alternating between REPRO and OPR]
     • These conditions are “shared” by OPR and REPRO.
     • The “Rolling Calibration” conditions

  29. Statistics: General
     • The current profile of the Conditions database:
       • The current amount of data in the database: ~6 GB
       • Database files: >180
       • Types of conditions: >400
       • Number of persistent classes: ~200
       • Number of transient proxies: >100
     • More than 10 active developers contributed to the development of the concrete persistent classes and transient proxies.

  30. Statistics: Conditions
     Table: There are more than 400 conditions grouped into 10 detectors.

  31. Statistics: Databases (from OPR) Table: Physical (Objectivity) database files.

  32. Problems...
     • Pitfalls of the current design:
       • “Staircases”
       • “Composite objects”
       • trouble managing the complexity
       • exposed persistency
         • a transient API or IDL would be better
     • Performance problems:
       • coming from the implementation:
         • have been partially studied
       • caused by the current setup (Objectivity AMS)
         • The startup time of OPR and REPRO jobs.
     • Scalability issues (expected from the recent performance studies)
       • Are connected to the number of persistent objects in a container

  33. The problem of “Staircases” Use Case: The “Rolling Calibration” at reprocessing time...

  34. The problem of “Staircases” (cont.) The Solution: The “purging” algorithm

  35. Problem of “composite objects”
     Use Case: Deep copy operations involving condition objects...
     [Diagram: two copies of a composite object A: step I copies the intervals (leaving borrowed references), step II copies the objects]
     • The branches may be of a different type than the top-level objects.
     • The branches are not seen directly via intervals.

  36. Problem of “composite objects” (cont.)
     Solution: Extended API of a base class of condition objects.
     [Class diagram: class A derives from BdbObject, which derives from <<persistent>> ooObj; A refers to an object of class B via ooRef(B); generic references are ooRef(ooObj)]
     • ooObj copy(): persistent “memcpy”.
     • BdbObject clone(): this public method makes a copy of a composite object.
     • BdbObject virtual userClone(): the default implementation does nothing.
     • A userClone(): an extended implementation should follow the reference and make a clone of the object of class B.
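
The clone()/userClone() split can be sketched with ordinary transient C++ classes. This is an illustration only: `CondObject`, `shallowCopy` and the members below are hypothetical, `std::shared_ptr` stands in for Objectivity references, and the real BdbObject interface is not reproduced. The method names clone() and userClone() are the only ones taken from the slide.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the base class of condition objects.
class CondObject {
public:
    virtual ~CondObject() {}

    // Public entry point: shallow copy first, then let the concrete class
    // repair any references that must not be shared with the original.
    std::shared_ptr<CondObject> clone() const {
        std::shared_ptr<CondObject> result = shallowCopy();
        result->userClone();
        return result;
    }

protected:
    // Plays the role of the persistent "memcpy": references are copied as-is.
    virtual std::shared_ptr<CondObject> shallowCopy() const = 0;

    // Default implementation does nothing (fine for non-composite objects).
    virtual void userClone() {}
};

// A branch object, reachable only through a reference held by A.
class B : public CondObject {
public:
    explicit B(int v) : value(v) {}
    int value;
protected:
    std::shared_ptr<CondObject> shallowCopy() const {
        return std::make_shared<B>(*this);
    }
};

// A composite top-level object holding a reference to a branch.
class A : public CondObject {
public:
    explicit A(const std::shared_ptr<B>& b) : branch(b) {}
    std::shared_ptr<B> branch;
protected:
    std::shared_ptr<CondObject> shallowCopy() const {
        return std::make_shared<A>(*this);  // still shares 'branch'
    }
    // Extended implementation: follow the reference and clone the branch.
    void userClone() {
        if (branch) branch = std::static_pointer_cast<B>(branch->clone());
    }
};
```

Without the override in A, the copy would keep a borrowed reference to the original branch, which is the situation the previous slide describes after step I.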

  37. The Problem of The Long Startup Time
     • The problem is specific to the OPR and REPRO farms.
     • There are 150 parallel jobs in the current setup.
     • It takes 10..15 minutes before the jobs get to the very first event:
       • loading the conditions has proven to be the major contributor:
         • Each job loads about 50 MB of data from Objectivity from < 200 various conditions.
         • All 150 jobs compete for the AMS server, loading the same data.
         • The performance of the AMS server is limited to 4 MB/sec.

  38. The Startup Time at Reconstruction farms
     Problem Definition: It takes too long for (150) jobs to “warm up”.
     [Plot: job startup with phases I, II and III marked; the problem region is highlighted]

  39. On-going development…
     • Short-term activities…
       • OID Server deployment in production for the next set of runs (January 2001)
       • Performance studies
         • Optimize the current code
       • About to begin the routine use of the “Revision System”
         • There are two reconstructions of the current data (2 sets of conditions)
       • Improving the management
         • better automation is required to sweep conditions data between federations
         • GUI based tool: “Conditions Browser”
       • Moving from Solaris 2.6 to Solaris 7 in production:
         • A bug in Objectivity 5.2 (for Solaris 7) is to be fixed in Objectivity 6
         • The API changed between Objectivity 5.2 and 6: code changes would be required
       • Accessing multiple federations from a process (Objectivity v6):
         • technical problems (the feature is said to be available, but we have never tried it)
         • transaction management and context switching
       • “Super-intervals” (aimed at the scalability problem and the locking problem caused by multi-federation access from OPR and REPRO)
         • alternative solution: use the OID Server to write back into the persistent store

  40. On-going development…(cont.)
     • Long-term developments…
       • Improved data model (to get rid of “staircases”)
         • NOTE: The backward compatibility is an issue!
       • Less persistency in the API
       • The “Environment” class for the Conditions/DB itself to control its global parameters.
       • Full access to the database via CORBA (yet to be studied):
         • too many persistent classes
         • the transient classes produced by proxies are too complicated
         • ???

  41. The OID Server
     • The problem statement:
     • Solution:
     • Fortunately:
       • Up to 80% of the data relate to the “meta-data”: the service information which is meant to locate the OID-s of the conditions objects.
         • Hashed containers
         • Objectivity’s index
         • Fetching algorithm
       • The OID itself is just an 8-byte structure.
       • All the jobs are looking for the same set of OID-s.
     • This is why the idea to serve the OID-s via a CORBA server emerged:
       • Load 80% of the data just once and then serve results (OID-s) to jobs from the transient cache.
     • Performance:
       • Single-threaded OID Server
         • NOTE: An MT version has been implemented but has not been tested for performance:
           • Gain in performance by running on multiple-CPU machines.
           • Non-blocking service from the cache while a new condition is being loaded provides better service time for clients.
       • 50, 100, 150 and 200 node configurations
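
The "load once, serve many" idea can be sketched as a transient cache in front of an expensive meta-data lookup. This is not the OID Server code: `OidCache`, `OidInterval` and the loader callback are hypothetical, the CORBA transport is left out entirely, and caching whole validity intervals (rather than single lookups) is an assumption made here. The sketch is single-threaded, like the server version whose performance the slide reports.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

typedef std::int64_t  Time;
typedef std::uint64_t Oid;  // the slide describes an OID as an 8-byte structure

// Result of one meta-data lookup: the OID and the validity range it covers.
struct OidInterval {
    Time begin;  // valid for the half-open range [begin, end)
    Time end;
    Oid  oid;
};

class OidCache {
public:
    // The loader stands in for the expensive meta-data search in the database.
    typedef std::function<OidInterval(const std::string&, Time)> Loader;

    explicit OidCache(Loader loader) : loader_(loader), loads_(0) {}

    // Serves from the transient cache when possible; otherwise loads once
    // and remembers the whole interval for later requests.
    Oid getOid(const std::string& condition, Time t) {
        std::map<Time, OidInterval>& timeline = cache_[condition];
        std::map<Time, OidInterval>::iterator it = timeline.upper_bound(t);
        if (it != timeline.begin()) {
            --it;
            if (t < it->second.end) return it->second.oid;  // cache hit
        }
        const OidInterval loaded = loader_(condition, t);   // cache miss
        ++loads_;
        timeline[loaded.begin] = loaded;
        return loaded.oid;
    }

    // Number of times the database had to be consulted.
    int loads() const { return loads_; }

private:
    Loader loader_;
    int loads_;
    std::map<std::string, std::map<Time, OidInterval> > cache_;
};
```

If 150 jobs ask for the same condition at times within one validity interval, the loader in this sketch runs once and the other 149 requests are answered from memory, which is the effect the slide is after.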

  42. The OID Server
     [Diagram: four Jobs, the AMS, the OIDServer, BdbCondRemoteCmd and the CORBA Naming Service]
     LEGEND: Objectivity protocol; CORBA protocol
     Labelled interactions:
     • Write new conditions objects and meta-data
     • Finalize()
     • Read conditions objects
     • Read meta-data
     • Get OID / OID
     • Manage Cache
     • Register and Resolve (CORBA Naming Service)

  43. The Transaction Management
     [Diagram: Job 1, Job 2, ... Job N issuing getOid and commit requests over time]
     LEGEND: READ transaction; READ transaction within timeout; UPDATE transaction in finalize
     • The timeout expires after the last getOid request. If there are no more requests, the server stops its own READ transaction.

  44. Performance Studies
     Copy an interval container: This is one of the most commonly used management procedures with meta-data. The scary thing is the parabolic growth (noticeably higher for conditions having multiple vertical layers). For instance, if we had a condition having 50 K intervals in the baseline (each interval being one reconstruction run), then it would take ~1/2 hour to make a copy of this condition.
     Possible Reasons: The persistent memory allocation in a container, poor performance of the indexing, or the cache management in the Objectivity kernel. The actual reason is yet to be studied.

  45. Performance Studies (cont.)
     The “Rolling Calibration”: Here we are storing new conditions on top of the existing ones in the middle of a timeline. The storing time is a linear function of the number of already existing persistent interval objects for a particular condition. This may be a scalability issue after a few years of operations (now we are working at the level of <20 K intervals in the baseline and 2..4 vertical layers).
     Possible Reasons: The persistent memory allocation in a container, poor performance of the indexing, or the cache management in the Objectivity kernel. The actual reason is yet to be studied.

  46. Performance Studies: OID Server (I) Conditions loading requests are shown without and with the OID Server.

  47. Performance Studies: OID Server (II) The average improvement if the OID Server is used.

  48. Lessons...
     • In spite of the troubles the project was quite successful:
       • Objectivity is also very usable: the point is how to use it effectively and how to avoid the “booby-traps”.
     • The product is still under development:
       • we gain new experience; we have a better understanding; new requirements also show up
     • The main lessons:
       • Data Model:
         • Splitting the meta-data from the conditions was the right decision
           • although the implementation of the meta-data was poor, which was a source of many problems.
         • No problems with the “time-based” approach to the validity of conditions.
         • The two-layered namespace is too restrictive.
         • Storing the “Bookkeeping” (log) information persistently helps a lot.
         • Attention must be paid to what is stored in the database (evolution of data types, “composite” objects, etc.).

  49. Lessons…(cont.)
     • The main lessons (cont.):
       • API:
         • Exposed persistency complicates migration of services to CORBA
         • If we started the design from the ground up, then one of the following would be a better solution:
           • Completely transient API
           • CORBA IDL (more portability, more flexibility, but … less performance)
           • Any other stable and vendor independent standard(?)
       • Database management could be an issue:
         • More than 50% of our code is about the management
