Data Storage with the POOL persistency framework



  1. Data Storage with the POOL persistency framework
     • Motivation
     • Strategy
     • Storage model
     • Storage operation
     • Summary
     Giacomo Govi, PPARC-LCG, CERN IT/DB
     GridPP7, June 30 - July 2, 2003

  2. Motivation
     • Provide storage and retrieval of C++ objects
       • No intrusion into the experiments' data models
     • Support for various types of data
       • Event data, detector data, analysis data
       • Different volumes
       • Different access patterns
     • Persistency technology may change over time
       • Different technologies may be used at the same time
     Avoid binding to a single choice
     • Physics software should be independent of the underlying data storage technology

  3. Strategy
     • Hide all technology details from the clients
       • Clients deal with objects or object references
       • Keep the transient data representation free from ‘knowledge’ about persistency
     Each technology can be handled transparently
     • Run-time binding of transient data to the underlying technology

  4. Strategy (cont’d)
     • Objects maintain their state when made persistent
     • Allow for queries, selections and independent element access
     • Backend layers built on the technology
       • Use object features when supported (the backend needs to be instructed)
       • Split into primitives if there is no object support (needs full access to member data)
     • Need for an object description: the “dictionary”
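The "split into primitives" path can be illustrated with a minimal sketch. It assumes a hand-written dictionary for one hypothetical class, `Hit`; the names `FieldDesc`, `hit_dictionary`, `to_primitives` and `from_primitives` are invented for this example and are not part of POOL or the LCG dictionary, which the deck says is generated from headers rather than written by hand.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical transient class; it carries no persistency code itself.
struct Hit {
    double x;
    double y;
    double energy;
};

// One dictionary entry: a member's name and a pointer-to-member to reach it.
struct FieldDesc {
    std::string name;
    double Hit::* member;
};

// Hand-written stand-in for a generated dictionary describing Hit.
inline const std::vector<FieldDesc>& hit_dictionary() {
    static const std::vector<FieldDesc> fields = {
        {"x", &Hit::x}, {"y", &Hit::y}, {"energy", &Hit::energy}};
    return fields;
}

// Split an object into named primitive values, e.g. one column per member.
inline std::map<std::string, double> to_primitives(const Hit& hit) {
    std::map<std::string, double> row;
    for (const FieldDesc& f : hit_dictionary()) {
        row[f.name] = hit.*(f.member);
    }
    return row;
}

// Rebuild an object from its primitives; missing columns stay zero.
inline Hit from_primitives(const std::map<std::string, double>& row) {
    Hit hit{};
    for (const FieldDesc& f : hit_dictionary()) {
        auto it = row.find(f.name);
        if (it != row.end()) {
            hit.*(f.member) = it->second;
        }
    }
    return hit;
}
```

The point of the sketch is that `Hit` stays free of storage code: only the dictionary knows how to reach its members, which is what a backend without native object support would need.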

  5. Storage scheme
     • Define a model for an object storage system
       • Identify commonalities among different technologies
     • Adapts to any technology with direct record access
       • Need to know the record identifier in advance
     • RDBMS: more or less traditional
       • Primary key must be uniquely determined before writing

  6. Persistency model
     [Diagram labels: Transient (Objects & references); Persistent (Objects, object IDs, DBs); C++ pointer >> object ID]

  7. Storage functions
     • Write objects
       • Return a unique identifier of their ‘address’ in the database (Token)
     • Read back / modify / delete stored objects
       • Locate objects in the database using the Tokens
     • Support for object associations
       • Provide a transparent way to navigate through object references
     • Available: Root I/O backend
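The storage functions listed above can be sketched with a toy in-memory store. This is an illustration under stated assumptions, not POOL's API: `Token` here is a hypothetical struct holding a database name, a container name and an entry number, and `ToyStore` is an invented stand-in for a backend with direct record access.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical token: the 'address' of one stored object.
struct Token {
    std::string database;
    std::string container;
    std::size_t entry;
};

// Toy in-memory stand-in for a storage backend (not the POOL interface).
class ToyStore {
public:
    explicit ToyStore(std::string database) : database_(std::move(database)) {}

    // Write a payload and return the token that addresses it.
    Token write(const std::string& container, std::string payload) {
        std::size_t entry = next_entry_[container]++;
        containers_[container][entry] = std::move(payload);
        return Token{database_, container, entry};
    }

    // Locate a record by token; nullptr if it does not exist.
    const std::string* read(const Token& t) const {
        if (t.database != database_) return nullptr;
        auto c = containers_.find(t.container);
        if (c == containers_.end()) return nullptr;
        auto r = c->second.find(t.entry);
        return r == c->second.end() ? nullptr : &r->second;
    }

    // Replace the payload of an existing record; false if it is missing.
    bool update(const Token& t, std::string payload) {
        if (read(t) == nullptr) return false;
        containers_[t.container][t.entry] = std::move(payload);
        return true;
    }

    // Delete a record; false if it was not there.
    bool erase(const Token& t) {
        if (read(t) == nullptr) return false;
        containers_[t.container].erase(t.entry);
        return true;
    }

private:
    std::string database_;
    std::map<std::string, std::map<std::size_t, std::string>> containers_;
    std::map<std::string, std::size_t> next_entry_;  // entry numbers are never reused
};
```

The entry counter is kept separate from the records so that a token handed out earlier never ends up addressing a different object after a delete.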

  8. Components breakdown
     [Diagram labels: CLIENT SIDE, POOL SIDE, Client, Ref<T>, DataService, Cache, PersistencyService, Storage Service, LCG Dictionary]

  9. Dictionary generation
     [Diagram labels: .h, .xml, GCC-XML, Code Generator, ROOTCINT, LCG dictionary code, CINT dictionary code, LCG dictionary, CINT dictionary, Gateway, Reflection, Data I/O, I/O, Other Clients, Technology dependent]

  10. Data Access through Reference
     • References are implemented as smart pointers (Ref<T>)
       • Maintain access to the embedded class members
       • Provide services to handle persistency
       • Take care of the memory clean-up
     [Diagram labels: Access to persistency service, Ref<T>, Reference in the object cache, Dereference, Pointer to object]
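A smart-pointer reference of this kind can be sketched as follows. This is a minimal illustration, not POOL's `Ref<T>`: the class name `LazyRef`, its loader callback and the `Track` type are all assumptions made for the example. The loader stands in for whatever the persistency service would do to fetch the object on first access.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical smart reference: fetches its object on first dereference.
template <typename T>
class LazyRef {
public:
    using Loader = std::function<std::shared_ptr<T>()>;

    explicit LazyRef(Loader loader) : loader_(std::move(loader)) {}

    // Member access goes straight through to the embedded object.
    T* operator->() { return get(); }
    T& operator*() { return *get(); }

    bool is_loaded() const { return static_cast<bool>(object_); }

private:
    T* get() {
        if (!object_) {
            object_ = loader_();  // load once, then reuse the cached pointer
        }
        return object_.get();
    }

    Loader loader_;
    std::shared_ptr<T> object_;  // shared ownership handles the clean-up
};

// Hypothetical client class with no persistency code of its own.
struct Track {
    double pt;
};
```

A client would write `ref->pt` exactly as with a plain pointer; the first access triggers the load and later ones reuse the cached object. The sketch assumes the loader never returns a null pointer.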

  11. Cache Access by Smart Pointer
     [Diagram labels: Ref<T>, Cache Ref, Data Service, Data Service object cache (Object, Token, <pointer>, <…>), Token (Pointer, Storage type, Object type), Persistent Reference, File Catalog, Persistency Service]

  12. Data operation: WRITE
     [Diagram labels: Client, Data Service, Object Cache, PersistencyService, Storage Service; Start Transaction; Ref<A>, Ref<B>, Ref<C> mark for write; Commit Transaction]
       cache->transaction().start(...);
       refA.mark_write(placement);
       ...
       refC.mark_write(placement);
       cache->transaction().commit();

  13. Data operation: READ/UPDATE/DELETE
     [Diagram labels: Client, Data Service, Object Cache, PersistencyService, Storage Service, Tokens; Start Transaction; Ref<A>; Ref<B> mark for update; Ref<C> mark for delete; Commit Transaction]
       cache->transaction().start(...);
       refA->myMethod();
       refB.mark_update();
       refC.mark_delete();
       cache->transaction().commit();
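The mark-then-commit pattern used in the write and read/update/delete sequences can be sketched with a toy cache. Everything here is hypothetical: `ToyCache` and its integer object IDs are invented for the example, and the real calls take arguments (such as the placement) that are not reproduced. The sketch only shows that marked operations are queued and reach the store at commit time.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy cache with deferred operations (not the POOL interface).
class ToyCache {
    enum class Op { Write, Update, Delete };
    struct Pending {
        Op op;
        int id;
        std::string value;
    };

public:
    void start() {
        pending_.clear();
        active_ = true;
    }

    void mark_write(int id, std::string value) { queue(Op::Write, id, std::move(value)); }
    void mark_update(int id, std::string value) { queue(Op::Update, id, std::move(value)); }
    void mark_delete(int id) { queue(Op::Delete, id, std::string()); }

    // Apply all queued operations in order, then close the transaction.
    void commit() {
        for (Pending& p : pending_) {
            if (p.op == Op::Write) {
                store_[p.id] = std::move(p.value);
            } else if (p.op == Op::Update) {
                auto it = store_.find(p.id);
                if (it != store_.end()) it->second = std::move(p.value);
            } else {
                store_.erase(p.id);
            }
        }
        pending_.clear();
        active_ = false;
    }

    // Drop everything queued since start().
    void rollback() {
        pending_.clear();
        active_ = false;
    }

    // Committed state only; nullptr if the object is not stored.
    const std::string* find(int id) const {
        auto it = store_.find(id);
        return it == store_.end() ? nullptr : &it->second;
    }

private:
    void queue(Op op, int id, std::string value) {
        if (!active_) throw std::logic_error("no active transaction");
        pending_.push_back(Pending{op, id, std::move(value)});
    }

    bool active_ = false;
    std::vector<Pending> pending_;
    std::map<int, std::string> store_;
};
```

Queuing the operations keeps the store untouched until `commit()`, so a `rollback()` is just a matter of discarding the queue.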

  14. Follow Object Associations
     [Diagram labels: Object cache entry (Object, Token, <pointer>, <…>); Reference (Entry ID, Link ID); Local lookup table in each file: Link ID (<number>) and Link Info (DB/Cont.name, ...)]
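The per-file lookup table can be sketched as below. This is an assumption-laden illustration: `LinkInfo`, `StoredRef` and `LinkTable` are invented names, and the link information is reduced to a database name and a container name. The idea shown is that a stored reference only needs two small numbers, with the longer target description kept once per file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical link information: where the referenced objects live.
struct LinkInfo {
    std::string database;
    std::string container;
};

// What a stored reference holds: a row in the link table plus an entry number.
struct StoredRef {
    std::size_t link_id;
    std::size_t entry_id;
};

// Toy version of the local lookup table kept in each file.
class LinkTable {
public:
    // Return the link ID for this target, adding a row only if it is new.
    std::size_t intern(const std::string& database, const std::string& container) {
        for (std::size_t i = 0; i < rows_.size(); ++i) {
            if (rows_[i].database == database && rows_[i].container == container) {
                return i;
            }
        }
        rows_.push_back(LinkInfo{database, container});
        return rows_.size() - 1;
    }

    // Resolve a link ID back to its target; nullptr if the ID is unknown.
    const LinkInfo* lookup(std::size_t link_id) const {
        return link_id < rows_.size() ? &rows_[link_id] : nullptr;
    }

    std::size_t size() const { return rows_.size(); }

private:
    std::vector<LinkInfo> rows_;
};
```

Following an association then means looking up the link ID to find the database and container, and using the entry ID to pick the object inside it.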

  15. Summary
     • The POOL framework provides persistency services with a generic storage technology
     • The POOL model can be applied to other technologies based on database files, collections and objects within collections
     • POOL allows clients to choose technologies according to their needs
     • Root I/O backend implemented
     • Proof-of-concept prototype RDBMS backend started
