The Event as an Object-Relational Database: Avoiding the Dependency Nightmare

Learn how using an object-relational model can solve the dependency issues in event data, leading to shorter compilation and link times, and improved code maintenance.


Presentation Transcript


  1. The Event as an Object-Relational Database: Avoiding the Dependency Nightmare
     Christopher D. Jones, Cornell University, USA

  2. Introduction
     • Data within an event often relate to one another
       • E.g., tracks matched to EM showers
     • A simple object-oriented design has these data items contain pointers to one another
     • Unfortunately, this causes serious dependency issues
       • Long compilation times
       • Extremely long link times
       • Broken code affects more systems
     • Using an object-relational model avoids these problems and opens up new possibilities
     C. Jones CHEP03

  3. Object Oriented Approach
     • Design
       • Related data are grouped into a class
         • E.g., Track, EM Shower
       • Data values are stored in objects of the appropriate class
       • Links between objects are made by embedding pointers in the objects
     • Appeal
       • Easy for users to navigate the relationships between objects

  4. Object Oriented Approach
     [Diagram: Track, EM Shower, and Hit objects]

  5. Problems with OOA: Interface
     • Adding new relationships means changing the classes
       • Must recompile all code that uses those classes
     • Published objects must be mutable
       • Need to be able to change an object to set its relation to another object
     • How to handle links when multiple algorithms produce the same data?
       • E.g., tracks from different track-finders matched to the same EM showers
     • Where to put the data that describes the relationship?
     • How do two people refer to the same object if each has made a sub-selection of a list?
       • Use the index in the original list?
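
The interface problems listed on this slide can be seen in a minimal sketch of the pointer-based design. The class and member names below (`Track`, `EMShower`, `setMatchedShower`) are hypothetical and are not taken from the presenter's code; the sketch only illustrates why the relation forces a setter and a new data member into the class.

```cpp
class EMShower;  // a forward declaration is enough to hold a pointer

// Hypothetical Track class in the pointer-based design.
class Track {
public:
    explicit Track(double momentum) : momentum_(momentum) {}
    double momentum() const { return momentum_; }

    // The relation lives inside the class, so a published Track must
    // stay mutable until the matching algorithm has filled it in.
    void setMatchedShower(const EMShower* shower) { shower_ = shower; }
    const EMShower* matchedShower() const { return shower_; }

private:
    double momentum_;
    // Every new kind of relation needs another member like this one,
    // which changes the class and forces its clients to recompile.
    const EMShower* shower_ = nullptr;
};

// Hypothetical EMShower class.
class EMShower {
public:
    explicit EMShower(double energy) : energy_(energy) {}
    double energy() const { return energy_; }

private:
    double energy_;
};
```

A single `shower_` member also has no obvious place for data describing the match itself, and no way to hold matches from two different track-finders side by side.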

  6. Problems with OOA: Compile & Link
     • In highly coupled systems, if one piece of code is broken the whole system can break
       • E.g., if tracking is broken, you may not be able to do EM shower work
     • To avoid excess compilation dependencies, you must only forward declare data in header files
     • To avoid excess link dependencies, associated objects cannot internally access each other's member functions
       • E.g., cannot have a member function that calculates the energy of an EM shower divided by the momentum of a track
       • Can relax this requirement if you organize your code so that the associated routines are in a separate object file
     • Reference-counting smart pointers cause strong compile- and link-time dependencies
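
The forward-declaration rule and the "separate object file" workaround can be sketched as follows. The three file names in the comments and the function `energyOverMomentum` are assumptions made for illustration; the code is shown as one listing only so that it compiles on its own.

```cpp
// --- Track.h (hypothetical): forward declares EMShower, never includes its header ---
class EMShower;

class Track {
public:
    Track(double momentum, const EMShower* shower)
        : momentum_(momentum), shower_(shower) {}
    double momentum() const { return momentum_; }
    const EMShower* shower() const { return shower_; }

private:
    double momentum_;
    const EMShower* shower_;  // a pointer to an incomplete type is allowed
};

// --- EMShower.h (hypothetical) ---
class EMShower {
public:
    explicit EMShower(double energy) : energy_(energy) {}
    double energy() const { return energy_; }

private:
    double energy_;
};

// --- EOverP.cc (hypothetical): the only translation unit that needs both definitions ---
// Precondition: track.shower() is not null and track.momentum() is non-zero.
double energyOverMomentum(const Track& track) {
    return track.shower()->energy() / track.momentum();
}
```

Because the ratio is a free function compiled into its own object file, code that uses only `Track` does not have to link against anything that calls `EMShower` member functions.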

  7. Problems with OOA: Storage
     • Direct references in objects complicate storage
       • Need to convert pointers to/from persistent values
       • If using bidirectional links, must construct both objects before linking them
     • Often causes developers to couple objects directly to the storage system
       • Reading/writing causes compile/link/run-time dependencies
       • Occurs even if an object only holds pointers to other objects
     • Can avoid dependencies by reading back unlinked objects
       • User must specify when to make the link
       • Burdens the user with the responsibility to be sure the link is made before she tries to use it

  8. Event as Object Relational Database
     • No objects have pointers to objects outside ‘atomic’ storage boundaries
       • E.g., MC Particles can hold pointers to their children if all MC Particles are stored together
     • Objects in lists have a unique identifier
       • Physicists use the identifier when talking with other physicists
       • In our system, our own templated Table class holds lists of objects and sorts them via their identifier method
     • In our system, lists are identified via unique keys based on the type of object in the list and two character strings
     • Relationships are defined via a separate object: the Lattice
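
A minimal sketch of the two ideas on this slide, an identifier-ordered table and a list key built from a type plus two strings, might look like the following. This is not the presenter's Table class: the interface (`insert`, `find`, `size`), the `ListKey` struct, and the field names `firstLabel` and `secondLabel` are all assumptions.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <typeindex>
#include <typeinfo>

using Identifier = unsigned int;

// Hypothetical stand-in for the templated Table: objects are kept
// ordered by the value returned from their identifier() method.
template <typename T>
class Table {
public:
    void insert(const T& item) { items_.emplace(item.identifier(), item); }

    // Returns nullptr when no object with that identifier is present.
    const T* find(Identifier id) const {
        auto it = items_.find(id);
        return it == items_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::map<Identifier, T> items_;
};

// Hypothetical key for a list: the object type plus two character strings.
struct ListKey {
    std::type_index type;
    std::string firstLabel;
    std::string secondLabel;

    bool operator<(const ListKey& other) const {
        return std::tie(type, firstLabel, secondLabel) <
               std::tie(other.type, other.firstLabel, other.secondLabel);
    }
};

template <typename T>
ListKey makeKey(const std::string& firstLabel, const std::string& secondLabel) {
    return ListKey{std::type_index(typeid(T)), firstLabel, secondLabel};
}

// Example element type with the identifier() method the Table relies on.
struct Track {
    Identifier id;
    double momentum;
    Identifier identifier() const { return id; }
};
```

With keys of this shape, tracks from two different track-finders can sit in the same event as two separately keyed tables of the same type.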

  9. Lattice
     • Links relationship data to the identifiers of two different objects (denoted Left and Right)
     • Supports 16 different configurations
       • 1 or many Lefts per Link
       • 1 or many Rights per Link
       • 1 or many Links per Left
       • 1 or many Links per Right
     [Diagram: Left 1, Left 2; Link A, Link B; Right 1, Right 2]
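
The Lattice idea can be sketched as a container of links, each holding relationship data together with the identifiers on its two sides. The interface below (`connect`, `linksGivenLeft`, `hasLinkForLeft`) is hypothetical, and this sketch always allows the many-to-many case; it does not enforce the sixteen cardinality configurations named on the slide.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Identifier = unsigned int;

// Hypothetical sketch of a Lattice: relationship data stored apart from
// the related objects, which are referred to by identifier only.
template <typename LinkData>
class Lattice {
public:
    struct Link {
        std::vector<Identifier> lefts;
        std::vector<Identifier> rights;
        LinkData data;
    };

    void connect(std::vector<Identifier> lefts,
                 std::vector<Identifier> rights,
                 LinkData data) {
        links_.push_back(Link{std::move(lefts), std::move(rights), std::move(data)});
    }

    // All links that mention the given Left identifier. The returned
    // pointers are invalidated by a later call to connect().
    std::vector<const Link*> linksGivenLeft(Identifier left) const {
        std::vector<const Link*> found;
        for (const Link& link : links_) {
            if (std::find(link.lefts.begin(), link.lefts.end(), left) != link.lefts.end()) {
                found.push_back(&link);
            }
        }
        return found;
    }

    bool hasLinkForLeft(Identifier left) const {
        return !linksGivenLeft(left).empty();
    }

private:
    std::vector<Link> links_;
};
```

A query such as `hasLinkForLeft` touches only identifiers, so neither the Left nor the Right objects need to be constructed to answer it.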

  10. Object Relational Approach
     [Diagram: Track, Hit, and EM Shower objects labelled by identifier (Track:1 and Track:2, Hit:1 to Hit:8, EM Shower:1 to EM Shower:3), with separate link entries such as 1:data:1 and 1:data:1,2]

  11. Improving Usability
     • Objects that link directly to related objects are easier to use
     • Created ‘Navigation’ objects that give direct access to related objects
       • Internally look up the relationship in the appropriate Lattice
       • Related objects are obtained using the regular data access mechanism
       • I.e., Navigation objects just do what users would otherwise have to do
     • NOTE: To avoid interdependencies in crucial software, only analysis code is allowed to use Navigation objects
     • Took special care that you become compile/link-time dependent on an object only if you access it via Navigation
       • E.g., if you do not use EM showers then you do not need to link them
     • Only one library is allowed interdependencies
       • Makes maintenance easier
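
A Navigation object can be sketched as a thin wrapper that performs the identifier lookup on the user's behalf. Everything named below is hypothetical: `Event` is a simplified stand-in in which a `std::map` plays the role of the shower table and a `std::multimap` plays the role of the track-to-shower Lattice, and `NavTrack` is an invented name.

```cpp
#include <map>
#include <vector>

using Identifier = unsigned int;

struct Track { Identifier id; double momentum; };
struct EMShower { Identifier id; double energy; };

// Simplified stand-in for the event contents (an assumption, see above).
struct Event {
    std::map<Identifier, EMShower> showers;
    std::multimap<Identifier, Identifier> trackToShower;  // track id -> shower id
};

// Hypothetical Navigation object: gives direct access to related objects
// by doing the lookup a user would otherwise write by hand.
class NavTrack {
public:
    NavTrack(const Track& track, const Event& event)
        : track_(track), event_(event) {}

    const Track& track() const { return track_; }

    std::vector<const EMShower*> matchedShowers() const {
        std::vector<const EMShower*> result;
        auto range = event_.trackToShower.equal_range(track_.id);
        for (auto it = range.first; it != range.second; ++it) {
            auto shower = event_.showers.find(it->second);
            if (shower != event_.showers.end()) {
                result.push_back(&shower->second);
            }
        }
        return result;
    }

private:
    const Track& track_;  // the wrapped object must outlive the NavTrack
    const Event& event_;
};
```

In this sketch `Track` itself knows nothing about `EMShower`; only code that includes the Navigation class picks up the dependency on both.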

  12. Advantages
     • Shortens link times
       • Usually less than 30 seconds on a moderate machine
       • We use dynamic loading, so you only have to link to the libraries your module directly needs
     • Simplifies storage code
       • Easier to support many specialized storage formats
     • Speeds up data read-back
       • Only have to retrieve data the user actually uses
       • E.g., can ask if a Track is matched to an EM Shower without having to construct the EM Showers
     • Can use multiple data sources on read-back
       • E.g., build an event by combining a physicist’s data skim and the experiment’s event database

  13. Conclusion
     • Compile/link/run-time dependencies make code less robust
     • Avoid unnecessary dependencies by encapsulating relationships in a separate object
     • Provide directly linked objects only to analysis users
     • Users’ productivity and satisfaction will increase
       • Shorter compile times
       • Shorter run times
