
Persistency at LHC

Persistency at LHC. Vincenzo Innocente CERN. History is as old as Persistency. Sources and Contributions. Presentations at last RD45 workshop Presentations at the “Architecture Working Group” Experiments’ Web pages Contributions to this Workshop Focus on LHC experiments’ prototypes



  1. Persistency at LHC. Vincenzo Innocente, CERN. History is as old as Persistency.

  2. Sources and Contributions • Presentations at last RD45 workshop • Presentations at the “Architecture Working Group” • Experiments’ Web pages • Contributions to this Workshop • Focus on • LHC experiments’ prototypes • New generation experiments (BaBar, STAR, RunII) experience and plans Vincenzo Innocente LCB workshop

  3. Persistency in General

  4. Persistency: what for? A process saves its state to be later re-used by • the same process • a different process running the same executable • a different process running a different executable. Ideal persistency: Core Dump! [Diagram labels: Process 1, Process 2, Process 3; Volatile Memory; Permanent Storage]

  5. Use Cases • Extended (in space and time) virtual memory: • proprietary format optimized for computational and storage performance of a single application • Import/Export in a heterogeneous environment • “standard” application-independent format • conversion to/from internal application format • Management of different versions (identification, query mechanism) and of concurrency (locking) • proprietary internal mechanism • rely on the file system • DBMS

  6. Caveat • Conversion is always required • What makes the difference is the level at which it is done • Operating System (or below) • Persistency Service Provider • Application Framework • Application Code • Doing it at a given level does not imply that it has not also been done at a lower level • Doing it at a higher level adds flexibility but reduces performance • Doing it at a lower level improves performance but requires tight integration (binds to a given solution) • Concurrency is not only for banks: “Myprog.cc changed on disk; really edit the buffer?” (emacs, not Oracle)

  7. Object Persistency • Objects • are atomic entities • have a state (data members, including relationships) • provide services (methods) • Persistent objects • survive process boundaries • when “retrieved”, have the same state and provide the same services as when they were “stored” [Diagram labels: Event objects in Volatile Memory and in Permanent Storage]

  8. Object Persistency • Persistency • Objects retain their state between two program contexts • Storage entity is a complete object • State of all data members • Object class • OO Language Support • Abstraction • Inheritance • Polymorphism • Parameterised Types (Templates)

  9. OO Language Binding • User had to deal with copying between program and I/O representations of the same data • User had to traverse the in-memory structure • User had to write and maintain specialised code for I/O of each new class/structure type • Tight Language Binding • An ODBMS allows persistent objects to be used directly as variables of the OO language • C++, Java and Smalltalk (heterogeneity) • I/O on demand: no explicit store & retrieve calls

  10. Problems with Naïve OP • Storing services (methods ready to run) is non-trivial • persistency services are just object-data stores • configuration management takes care of code • frameworks can use dynamic loading to match data & code • Clean and performant object design is difficult: • Different (partial) representations of the state of an object may be required to cope with computational, storage and I/O efficiencies (and code development efficiency) • Object design and implementation evolve, persistent objects stay the same • “Old” persistent objects need to be converted
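The last point of the slide above, that “old” persistent objects need to be converted, can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from the talk: the `Track` class, the two schema versions, and the idea of tagging each stored record with a `version` field are assumptions chosen only to show a version-dispatched converter.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Current in-memory (transient) shape: hypothetical schema version 2."""
    p: float      # momentum magnitude
    theta: float  # polar angle
    phi: float    # azimuthal angle


def _from_v1(rec):
    # Hypothetical old schema: Cartesian momentum components.
    px, py, pz = rec["px"], rec["py"], rec["pz"]
    p = math.sqrt(px * px + py * py + pz * pz)
    theta = math.acos(pz / p) if p > 0 else 0.0
    phi = math.atan2(py, px)
    return Track(p, theta, phi)


def _from_v2(rec):
    # Current schema: stored fields match the class directly.
    return Track(rec["p"], rec["theta"], rec["phi"])


# One converter per stored schema version (an assumption of this sketch).
CONVERTERS = {1: _from_v1, 2: _from_v2}


def load_track(rec):
    """Convert a stored record of any known version to the current class."""
    try:
        convert = CONVERTERS[rec["version"]]
    except KeyError:
        raise ValueError(f"no converter for schema version {rec['version']!r}")
    return convert(rec)
```

The stored data never changes; only the table of converters grows as the class design evolves.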

  11. More Problems with Naïve OP • Object granularity does not match raw I/O granularity (which in turn is device dependent) • small objects should be physically clustered according to users’ access patterns • Object logical relationships do not necessarily reflect access patterns (old rows vs columns dilemma) • How objects become persistent • At construction time (user can control clustering) • By reachability: an object becomes persistent when “attached” to an already persistent object (clustering control difficult)
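Persistence by reachability, as named in the slide above, amounts to a graph traversal from the already persistent objects. The sketch below is only an illustration under assumed names (`Obj`, `refs`, `persistent_closure` are hypothetical); it does not describe how any particular ODBMS implements it.

```python
class Obj:
    """A toy object holding outgoing references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def persistent_closure(roots):
    """Return every object reachable from the already persistent roots.

    Depth-first traversal; identity is tracked with id() so that cycles
    in the reference graph do not cause infinite loops.
    """
    seen_ids = set()
    result = []
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen_ids:
            continue
        seen_ids.add(id(obj))
        result.append(obj)
        stack.extend(obj.refs)
    return result
```

The traversal order is dictated by the reference graph, not by the access pattern, which is one way to see why clustering is hard to control in this mode.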

  12. Physical Model and Logical Model • Physical model may be changed to optimise performance • Existing applications continue to work

  13. Realistic Object Persistency [Diagram labels: Tape; disk; I/O Buffer; Application Cache; Application Algorithm; data units: file, page, objects, object; annotations: “Conversion from/to machine dependent format”, “compression?”, “Conversion from/to computational optimal format?”, “new shape”]

  14. Components of a POM • Storage manager • manage the physical structure on “disk” • Transaction/concurrency manager • client transaction, journaling, locking mechanisms • (or rely on OS and file system protections) • RTTI system • identifies the concrete type of object to retrieve/store • Converters • from storage format to “user” format and vice versa • machine-dependencies, schema-evolutions, user-hooks
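The role of the RTTI component listed above can be sketched as a registry that maps a stored type name back to a concrete class. This is a toy illustration: the `persistent` decorator, the `Hit` and `Cluster` classes, and the choice of JSON as the storage format are all assumptions, not part of any product discussed in the talk.

```python
import json

_REGISTRY = {}  # type name -> class; stands in for the RTTI system


def persistent(cls):
    """Class decorator: register cls so stored objects can be rebuilt."""
    _REGISTRY[cls.__name__] = cls
    return cls


@persistent
class Hit:
    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y


@persistent
class Cluster:
    def __init__(self, energy=0.0):
        self.energy = energy


def store(obj):
    """Write the concrete type name next to the state of all data members."""
    return json.dumps({"type": type(obj).__name__, "state": vars(obj)})


def retrieve(blob):
    """Look up the concrete class by name and restore its state."""
    rec = json.loads(blob)
    cls = _REGISTRY[rec["type"]]
    obj = cls.__new__(cls)          # bypass __init__, state comes from storage
    obj.__dict__.update(rec["state"])
    return obj
```

A caller only needs `retrieve(store(obj))`; the concrete type is recovered from the stored record rather than supplied by the reader.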

  15. Components of a POM • Application Cache manager • dynamic memory management with garbage collection • Tools and (G)UI • naming, indexing, query mechanisms • interactive browsing and query • development tools • administration tools

  16. Objectivity/DB: ODBMS close to ODMG standard (library not framework) • Storage Manager based on fixed physical hierarchy slot-page-container-database(file)-federation • Lock-server and journals to manage transactions • Proprietary parsing of extension of C++ (ooddlx) • Objects are converted when “opened” • schema-evolution effects: automatic or user defined • Basic naming, indexing and query mechanisms • Crude Browsing and administration tools • but Objy is integrated with some third-party frameworks

  17. ROOT: Application Framework with embedded I/O • Storage Manager based on • logical hierarchy Tbasket-branch-tree • physical “logical-records” in files • No transactions, no concurrency management • Parsing of C++ subset via CINT • Objects are converted when retrieved (Streamer) • Automatically or by user (schema-evolution only by user) • Basic naming, indexing or query mechanisms • and CINT scripting • “Paw”erful interactive environment

  18. (Wrapped O)RDBMS • Powerful, reliable and efficient storage managers with full concurrency and transaction management • SQL query mechanisms with transparent (hidden) indexing and naming • User friendly, fully integrated browsers and tools (for relational tables) • Poor object integration (developers should be both OO and ER experts at the same time)

  19. Persistency in HEP

  20. HEP Data • Environmental data • Detector and Accelerator status • Calibrations, Alignments • Event-Collection Meta-Data (luminosity, selection criteria, …) • … • Event Data, User Data [Diagram labels: Event Collection; Collection Meta-Data; Event; Tracks; Electrons; Tracker Alignment; Ecal calibration; User Tag (N-tuple)]

  21. Environmental Data [Diagram: along a time axis, Geometry (Versions A, B, C), Alignment (Versions A, B, C) and Calibration (Versions A, B) appear as successive versions] Parameters Snapshot for Environmental data items valid for the currently processed event.
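The snapshot of environmental data valid for the currently processed event can be sketched as an interval-of-validity lookup. The sketch assumes, beyond what the slide states, that each version of an item is valid from its start time until the next version of the same item begins; the class and function names are hypothetical.

```python
import bisect


class ConditionsStore:
    """Versions of one environmental data item, ordered by start of validity."""

    def __init__(self):
        self._starts = []    # sorted validity start times
        self._payloads = []  # payload for each start time

    def add(self, valid_from, payload):
        i = bisect.bisect_left(self._starts, valid_from)
        self._starts.insert(i, valid_from)
        self._payloads.insert(i, payload)

    def lookup(self, event_time):
        """Return the version whose validity interval contains event_time."""
        i = bisect.bisect_right(self._starts, event_time) - 1
        if i < 0:
            raise LookupError("no version valid at this time")
        return self._payloads[i]


def snapshot(stores, event_time):
    """Collect, for every item, the version valid at event_time."""
    return {name: store.lookup(event_time) for name, store in stores.items()}
```

A framework would typically refresh such a snapshot only when the event time crosses a validity boundary, rather than on every event.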

  22. Event Structure & Placement (BaBar) [Diagram labels: Event Header; Tag; Sim, Raw, Emc, Trk, Pid and Beta Headers; Sim Data; Raw Data; Emc, Trk, Pid and Beta Data; Databases: Evs, Tag, Hdr, Sim, Raw, Rec, Esd, Aod]

  23. BaBar Event Structure • Decoupling of placement & navigation • Hierarchical Placement Regions • Sim (Simulated Data) ~100 kBytes/event • Tru (Simulated Truth Data) ~40 kBytes/event • Raw (Raw Data) ~30 kBytes/event • Rec (Reconstructed Data) ~100 kBytes/event • Esd (Event Summary Data) ~20 kBytes/event • Aod (Analysis Object Data) ~2 kBytes/event • Tag (Event Selection Tag) ~200 Bytes/event • Navigation Trees • Minimize size of navigation headers • Allow for expansion of data without schema evolution

  24. Root Physical Clustering

  25. ODBMS-MSS Integration SLAC-Objy Plan • Extensible AMS • Allows use of any type of filesystem via oofs layer • Generic Authentication Protocol • Allows proper client identification • Opaque Information Protocol • Allows passing of hints to improve filesystem performance • Defer Request Protocol • Accommodates hierarchical filesystems • Redirection Protocol • Accommodates terabyte+ filesystems • Provides for dynamic load balancing

  26. Dynamic Load Balancing Hierarchical Secure AMS [Diagram labels: ams; vfs; Redwood; hpss client; Dynamic Selection]

  27. One Technology for All ? • Event catalogues • Update (add and remove) items of a catalogue • Searchable: SQL or equivalent • Event data • Write once-read many (WORM) • Often on tertiary (sequential) storage • Bulk data used by the entire collaboration (Raw, Rec,…) • User extracted data (N-tuples)

  28. One Technology for All ? • Detector data • Updates of data items • Versioning of data items • Version configuration • Statistical data • Understandable by interactive tools • A single coherent solution (non-optimal for all purposes), or an ad-hoc optimal product for each given type?

  29. LHCb Event Persistency [Diagram labels: AppManager; Algorithms; Event Data Service; Transient Event Store; Persistency Service; Converters; SicbCnvSvc (Sicb/Zebra, Sicb data files); RootCnvSvc (Root I/O, Root data files); OutputStreams]

  30. LHCb Generic Persistent Model [Diagram labels: 12-byte OID; lookup table with columns Link ID and Link Info (DB/Cont. name, Technology); Converter; steps (1) to (4)]

  31. LHCb Link Tables • One Link table per Storage technology per DB • Link to Objy object • no link table • 8 Bytes are enough to hold ooRef directly • Link to ROOT object • Link table entry must contain all navigation info • File name • Tree/Branch name • Link to ZEBRA (SICB) object • Link Table contains file name + ZEBRA bank name
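The link-table idea above, where a persistent reference carries only a small link ID and the table holds the full navigation information, can be sketched as follows. The class and field names are hypothetical and the layout is an assumption made for illustration; it is not the actual LHCb implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootLink:
    """Navigation info for a ROOT target: file name plus tree/branch name."""
    file_name: str
    tree_branch: str


@dataclass(frozen=True)
class ZebraLink:
    """Navigation info for a ZEBRA (SICB) target: file name plus bank name."""
    file_name: str
    bank_name: str


class LinkTable:
    """Maps small integer link IDs to navigation info, stored once per target."""

    def __init__(self):
        self._entries = []  # link ID -> link info
        self._ids = {}      # link info -> link ID

    def link_id(self, info):
        """Return the ID for info, adding a new table entry if needed."""
        if info not in self._ids:
            self._ids[info] = len(self._entries)
            self._entries.append(info)
        return self._ids[info]

    def resolve(self, link_id):
        """Return the full navigation info for a stored link ID."""
        return self._entries[link_id]
```

Because many references point into the same file and branch, storing the long strings once in the table keeps each individual reference small.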

  32. Hybrid Event Store in STAR • Adoption of ROOT I/O for the event store leaves Objectivity with one role left to cover: the true ‘database’ functions of the event store • Navigation among event collections, runs/events, event components • Data locality (now translates basically to file lookup) • Management of dynamic, asynchronous updating of the event store from one end of the processing chain to the other • From initiation of an event collection (run) in online through addition of components in reconstruction, analysis and their iterations • But with the concerns and weight of Objectivity it is overkill for this role. • So we went shopping… • looking to leverage the world around us, as always • and eyeing particularly the rising wave of Internet-driven tools and open software • and came up with MySQL in May.

  33. Vincenzo Innocente LCB workshop

  34. Experiments’ Status and Plans

  35. CMS • Uses Objectivity in production • Test Beam DAQ • Montecarlo (GEANT3) reconstruction • Objectivity fully integrated in Application Framework (CARF) • CARF manages transactions, physical clustering and the whole persistent object structure and its relations with the transient structure • users access persistent objects through C++ pointers • CARF takes care of pinning • leaf inheritance from ooObj often used

  36. CMS • Limited use of Objectivity “extensions” • associations, indexes, maps, query predicates, etc. • object copy, move, versions • Schema evolution routinely used • No complex object conversion attempted so far • Multi-federation environment to decouple • production • analysis • development

  37. ATLAS • Used Objectivity in several test-bed applications • HCAL test-beam • ATLFAST++ • 1TB Milestone (HPSS used as MSS) • Plan to use Objectivity in future test-beams and MonteCarlo reconstruction • The application framework will provide a “database” independent interface

  38. ALICE • Simulation and reconstruction framework fully integrated in ROOT • Used in MonteCarlo simulation and reconstruction • Will be used in test beams • Mock-up Data Challenge done: 7 TB in seven days • Use HPSS and/or CASTOR for file management

  39. ALICE DC II [Diagram labels: data sources NA57 and ALICE DAQ (DATE = GDC + LDC); LDCs; switches; GB eth; GDC Event Builder; pipe; ROOT Objectifier; Intel/Linux PC cluster (10/15 nodes); 9 PowerPC AIX; Intel/PC Linux + PowerPC/AIX + Sun; Computer Centre; HPSS; CASTOR ??; rates 5 MB/s and 10 MB/s]

  40. LHCb • Do not want to limit to one persistency technology • Speed, when you need speed • Functionality, when you need functionality • Ease migration to upcoming (superior) technologies • Independence • Well defined interface to persistency technologies • Interface: abstract technology independent API • Example: ODBC for relational DBs

  41. LHCb • LHCb application framework (GAUDI) is independent of the persistency technology • Manages its own application caches (data services) specialized in • event data • detector data • statistical data • Abstract interface for user provided converters
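An abstract interface for user-provided converters, with the framework unaware of the storage technology behind it, might look like the sketch below. It is not the GAUDI API: `Converter`, `PersistencyService`, and the two toy technologies (JSON and comma-separated text) are hypothetical names chosen for illustration.

```python
import json
from abc import ABC, abstractmethod


class Converter(ABC):
    """Technology-independent interface seen by the framework."""

    @abstractmethod
    def to_persistent(self, hit):
        """Turn a transient object into its stored representation."""

    @abstractmethod
    def to_transient(self, data):
        """Rebuild the transient object from its stored representation."""


class JsonConverter(Converter):
    def to_persistent(self, hit):
        return json.dumps(hit)

    def to_transient(self, data):
        return json.loads(data)


class CsvConverter(Converter):
    def to_persistent(self, hit):
        return f"{hit['x']},{hit['y']}"

    def to_transient(self, data):
        x, y = data.split(",")
        return {"x": float(x), "y": float(y)}


class PersistencyService:
    """Dispatches to the converter registered for a storage technology."""

    def __init__(self):
        self._converters = {}

    def register(self, technology, converter):
        self._converters[technology] = converter

    def write(self, technology, obj):
        return self._converters[technology].to_persistent(obj)

    def read(self, technology, data):
        return self._converters[technology].to_transient(data)
```

Algorithms would only ever see the transient object; swapping or adding a technology means registering another converter, with no change to the calling code.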

  42. BaBar • Taking data since May • Uses Objectivity for all kinds of data • many home-made tools to manage the database • Complete decoupling between transient objects (seen by end user) and their persistent representations • No schema evolution (explicit renaming of classes) • Starting to use multiple federations to decouple running environments

  43. STAR • Moved away from Objectivity mainly because of configuration management issues • Hybrid solution: • ROOT for event file • MySQL for event catalog and environmental data • MySQL under test for event tags as well • HPSS (through Grand Challenge) for tertiary storage management

  44. Objectivity Burdens in STAR • The list of burdens imposed by Objectivity grew as our experience and lessons from BaBar mounted • Management, development burden imposed by ensuring consistent schema in a single experiment-wide federation • Schema evolution unusable if forward compatibility is desired (ability to run old executables on new data) • Do-it-yourself access control, particularly with AMS • Risk of major impact from platform lock-in due to porting delays; both Linux and Sun • Scalability concerns (fall ‘98) -- lock manager performance issues in parallel usage?

  45. Requirements: STAR 8/99 View

  46. Fermi RUNII (CDF & DØ) • Sequential access model based on RUNI experience • focus on efficient data access from hierarchical storage • clustering optimized to largest data volume access pattern • Use • ROOT (CDF) and EVpack (modified DSPACK) (DØ) for event files (MSQL and Oracle8 evaluated by DØ) • these are just I/O back-ends to EDM and DØOM • DØ uses SAM for event catalog and file management • Oracle8 supporting database

  47. Data Organization [Diagram labels: Tiers; Event Information; Metadata; User and physics group (derived) data; Warm Cache; Physical Clustering] From Oct 1997 Review - Lee Lueking

  48. Data Access [Diagram labels: Mass Storage; Pipeline; Consumers; Metadata; Thumbnail; Freight Train; Pick Event; User File. Legend: Group of Users; Single User; Data flow; File; Disk Storage; Tape Storage; Pipeline Name; Event Metadata] Lee Lueking - October 1997

  49. Season IV - aggregate bandwidths, summed from spreadsheet

  50. (non-technical) Risk Analysis
