
Presentation Transcript


  1. Physics Services Support. Relational Databases for the LHC Computing Grid: The LCG Distributed Database Deployment (3D) and Conditions Database (COOL) Projects. Andrea Valassi (CERN IT-PSS-DP), NEC2007, Varna, Bulgaria, 14th September 2007

  2. Acknowledgements • Several people have ‘lent’ me their slides or contributed useful suggestions for this talk • Dirk Duellmann and the 3D team • Maria Girone and the CERN Physics DB team • The COOL and CORAL teams • Several users in the experiments • Many thanks to all of them! 3D and COOL - 2

  3. Outline • Relational databases for LHC computing • Reliable services at CERN and other LCG sites • The 3D project: distributed database deployment • COOL and other conditions data • COOL development and deployment status • Conclusions 3D and COOL - 3

  4. Relational databases for LHC • In LHC computing, relational databases will be crucial for storing the metadata of both physics applications and grid services • Detector conditions (calibration, geometry…) • Experiment data production bookkeeping • Core grid services for cataloguing, monitoring and distributing LHC data (e.g. the LFC file catalog) • Key requirements for relational DB services • High availability, backup and recovery, performance and scalability, security… 3D and COOL - 4

  5. The 3D Project • Distributed Database Deployment • The LCG initially provided tools for distributed access and replication of file-based data • The aim of the 3D project is to provide a similar infrastructure for data stored in RDBMS services • There is long-standing experience in running RDBMS services at CERN and several other LCG sites • Goals of the 3D project as part of LCG • Increase database availability and scalability • Allow applications to access databases in a consistent and location-independent way (see the sketch below) • Provide database replication between sites • Coordinate the setup and deployment of the database and replication infrastructure 3D and COOL - 5
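One concrete example of this "consistent and location-independent" access is the connection indirection offered by CORAL: jobs open a logical database name, and a site-local lookup file maps it to the physical replicas available at that site. A minimal sketch, assuming the dblookup.xml conventions used by CORAL (the logical name, hosts and file names below are purely illustrative):

    <!-- dblookup.xml sketch: maps a logical database name to site-local replicas -->
    <servicelist>
      <logicalservice name="ConditionsDB">
        <!-- first choice: the local Oracle replica -->
        <service name="oracle://t1-db-cluster/COOL_READER" accessMode="read" />
        <!-- fallback: a local SQLite slice shipped with the software release -->
        <service name="sqlite_file:conditions_slice.db" accessMode="read" />
      </logicalservice>
    </servicelist>

The same job can then run unchanged at T0 or at a T1, with only the site-local lookup file differing.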

  6. 3D Service Architecture 3D and COOL - 6

  7. Building block – db cluster • CERN db services use Oracle 10g RAC • High availability – redundant storage and network • Scalability – for CPUs and storage independently • Cost reduction – commodity hardware on Linux • Homogeneous h/w and s/w setup for all physics DBs • Similar setup is used by most T1 sites as well 3D and COOL - 7
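At the client side, a RAC looks like a single database service. As an illustration (not the actual CERN configuration), a tnsnames.ora entry along these lines spreads sessions over the cluster nodes and fails over to a surviving node if one goes down; all host and service names are made up:

    # tnsnames.ora sketch for a two-node RAC (hostnames and service name illustrative)
    LCG_PHYSDB =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (LOAD_BALANCE = on)
          (FAILOVER = on)
          (ADDRESS = (PROTOCOL = TCP)(HOST = physdb-node1.cern.ch)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = physdb-node2.cern.ch)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = lcg_physdb.cern.ch)
          (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
        )
      )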

  8. Physics DB services at CERN • Size of Oracle services for physics • 110 mid-range servers, 110 disk arrays • i.e. 220 CPUs, 440 GB RAM, 300 TB disk space • Several production clusters • One offline RAC per LHC experiment (up to 8 nodes), Atlas online RAC, COMPASS RAC • In addition, development and validation services, matching the application release cycle (development → validation → production) 3D and COOL - 8

  9. Frontier and CMS • Read-only access to Oracle data via HTTP • Oracle server at T0, Tomcat (Frontier) server at T0, Squid web caches at T0/T1/T2 (a minimal cache configuration is sketched below) • Frontier is used in CMS and under evaluation in Atlas (integrated in CORAL/COOL) • Successfully tested in CMS CSA’06, many improvements in 2007 • CMS are confident that they have ways to avoid stale-cache issues 3D and COOL - 9
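The caching layer in this chain is ordinary web-proxy technology: the Squid servers at T1/T2 only see HTTP requests from the Frontier client and either answer them from cache or forward them to the Tomcat/Frontier servlet at T0. A minimal squid.conf sketch, assuming a standard Squid installation (hostnames, port and network ranges are illustrative):

    # squid.conf sketch for a T1/T2 Frontier cache
    http_port 3128
    cache_mem 256 MB
    cache_dir ufs /var/spool/squid 10000 16 256
    # forward all cache misses to the central Frontier server, never to the origin directly
    cache_peer frontier.cern.ch parent 8000 0 no-query
    never_direct allow all
    # serve only local clients
    acl local_clients src 10.0.0.0/8
    http_access allow local_clients
    http_access deny all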

  10. Replication – Oracle Streams (Capture, Propagation, Apply) Barbara Martelli, INFN T1/T2 Workshop, Nov. 2006 3D and COOL - 10
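To give a flavour of what "capture, propagation, apply" means in practice, the sketch below registers schema-level capture and propagation rules with Oracle's standard DBMS_STREAMS_ADM package, driven from Python via cx_Oracle (apply rules are registered analogously at the destination). The account, schema, queue and database names are illustrative, not the actual 3D production setup:

    # Sketch: register Streams capture and propagation rules for one schema.
    # All names (accounts, schemas, queues, DB aliases) are illustrative.
    import cx_Oracle

    conn = cx_Oracle.connect("strmadmin", "secret", "source_db")
    cur = conn.cursor()

    # Capture: pick up changes to the schema's tables from the redo logs.
    cur.execute("""
        BEGIN
          DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
            schema_name  => 'COOL_SCHEMA',
            streams_type => 'capture',
            streams_name => 'capture_cool',
            queue_name   => 'strmadmin.capture_queue',
            include_dml  => TRUE,
            include_ddl  => TRUE);
        END;""")

    # Propagation: ship the captured changes from the source queue to the T1 queue.
    cur.execute("""
        BEGIN
          DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
            schema_name            => 'COOL_SCHEMA',
            streams_name           => 'prop_cool_to_t1',
            source_queue_name      => 'strmadmin.capture_queue',
            destination_queue_name => 'strmadmin.apply_queue@t1_site_db',
            include_dml            => TRUE,
            include_ddl            => TRUE,
            source_database        => 'SOURCE_DB');
        END;""")

    conn.commit()
    conn.close()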

  11. Replication – T0 to T1 • CERN data are replicated to ten T1 sites • Streams used by Atlas (10 T1) and LHCb (6 T1) • More details in the slides about COOL deployment • The present setup can sustain 2 GB/day to T1 • This is the Atlas requirement for COOL user data 3D and COOL - 11

  12. Streams downstream capture • This technology provides isolation of the source database against problems with the network or with the destination databases • In 3D, this shields the CERN T0 services from problems in the replication to T1 sites • The redo log retention on the downstream database is optimized (e.g. 5 days) to allow for re-synchronisation without recall from tape 3D and COOL - 12
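A hedged sketch of the two knobs behind this, again via cx_Oracle and with purely illustrative names: the source database only ships its redo logs to the downstream database (so no capture process can slow it down), and the downstream capture process keeps enough checkpoints and logs to cover the quoted retention window:

    # Sketch of downstream-capture configuration (all names illustrative).
    import cx_Oracle

    # 1. Source (T0) database: only ship redo to the downstream box; the capture
    #    process itself runs there, so T1 or network problems cannot affect T0.
    src = cx_Oracle.connect("sys_admin", "secret", "source_db")
    src.cursor().execute(
        "ALTER SYSTEM SET LOG_ARCHIVE_DEST_2="
        "'SERVICE=downstream_db ASYNC NOREGISTER "
        "VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)' SCOPE=BOTH")

    # 2. Downstream database: keep capture checkpoints (and hence the logs needed
    #    for re-synchronisation) for about 5 days, avoiding any recall from tape.
    dwn = cx_Oracle.connect("strmadmin", "secret", "downstream_db")
    dwn.cursor().execute("""
        BEGIN
          DBMS_CAPTURE_ADM.ALTER_CAPTURE(
            capture_name              => 'capture_cool',
            checkpoint_retention_time => 5);
        END;""")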

  13. Replication – online to offline • Streams used by Atlas, LHCb and CMS • For LHCb offline to online too (see COOL slides) • Work in progress with Atlas to test replication of the full PVSS archive • Allow detector expert analysis without impacting the performance of the online production server • Data rates (6 GB/day) much higher than COOL • Tests over the last two months are promising 3D and COOL - 13

  14. 3D service operation • DB service level according to WLCG MoU • At T0: piquet service being set up to replace current 24x7 best-effort operation • Streams interventions 8x5 for now • At T1: need more experience to confirm coverage • Some policies proposed by CERN T0 have been accepted also by the T1 sites • Backup and recovery (Oracle RMAN) • Security patch application (frequency, procedure) • Database and Streams monitoring, usage reports • Integration with WLCG procedures • GGUS tickets, intervention announcement 3D and COOL - 14

  15. Outline • Relational databases for LHC computing • Reliable services at CERN and other LCG sites • The 3D project: distributed database deployment • COOL and other conditions data • COOL development and deployment status • Conclusions 3D and COOL - 15

  16. What are conditions data? • Non-event detector data that vary with time • And may also exist in different versions • Data produced both online and offline • Geometry, detector control, alignment, calibration... • Data used for event processing and more • Detector experts • Alignment and calibration • Event reconstruction and analysis 3D and COOL - 16

  17. CondDB in the 4 experiments • ALICE • Alice-specific software for time/version handling • ROOT files with AliEn file catalog • ALICE-managed deployment (AliEn MySQL at T0) • CMS • CMS-specific software for time/version handling • Oracle (via POOL-ORA) with Frontier web cache • 3D/CMS deployment: Oracle/Frontier (T0), Squid (T1/T2) • ATLAS and LHCb • COOL common software for time/version handling • Common development of Atlas, LHCb and CERN IT • Oracle, MySQL, SQLite, Frontier (via COOL API) • 3D/Atlas/LHCb deployment: Oracle (T0/T1) with Streams 3D and COOL - 17

  18. COOL software overview • Consistent approach to many use cases • Single-version (DCS) and multi-version (calib/align) • Technology-neutral C++ API • API is not relational - no direct SQL user access • Same user code can be used on all backends (see the sketch below) • Maximize reuse of other LCG AA software • CORAL and SEAL for C++ implementation • ROOT/Reflex for python bindings (PyCool) • Single relational implementation via CORAL • Same code for Oracle, MySQL, SQLite, Frontier • Same relational schema for all backends • Emphasis on read and write performance • Best practices (bulk operations, bind variables) • Detailed performance studies and optimizations 3D and COOL - 18
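The technology neutrality is visible directly at the API level: the same code runs against Oracle, MySQL, SQLite or Frontier, and only the connection string changes. A minimal PyCool sketch, assuming the COOL 2.x Python bindings (connection strings and database names are illustrative):

    # Open an existing COOL database: same code for every backend,
    # only the connection string differs (strings below are illustrative).
    from PyCool import cool

    dbSvc = cool.DatabaseSvcFactory.databaseService()

    # A local SQLite file ...
    db = dbSvc.openDatabase("sqlite://;schema=conditions.db;dbname=MYCOND")
    # ... or an Oracle production service (everything after this line is identical):
    # db = dbSvc.openDatabase("oracle://db-server;schema=READER_ACCOUNT;dbname=MYCOND")

    print(db.listAllNodes())   # list folders and folder sets, backend-independent
    db.closeDatabase()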

  19. COOL relational implementation • Modeling of conditions data “objects” • System-managed common “metadata” • Data items: many tables, each with many “channels” • Interval of validity - IOV: since, until • Versioning information with handling of interval overlaps • User-defined schema for “data payload” • Support for simple C++ types (a usage sketch follows below) 3D and COOL - 19
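To make the model concrete, the sketch below creates a single-version folder with a user-defined payload and stores one object with its interval of validity and channel. It assumes the PyCool API of COOL 2.x; the folder path, payload field and values are made up:

    # Sketch of the COOL data model: per channel, payload records tagged with
    # an interval of validity [since, until). All names and values illustrative.
    from PyCool import cool

    dbSvc = cool.DatabaseSvcFactory.databaseService()
    db = dbSvc.createDatabase("sqlite://;schema=conditions.db;dbname=MYCOND")

    # User-defined payload schema: one float per measurement.
    spec = cool.RecordSpecification()
    spec.extend("temperature", cool.StorageType.Float)

    # Single-version folder, typical for DCS-type data.
    fspec = cool.FolderSpecification(cool.FolderVersioning.SINGLE_VERSION, spec)
    folder = db.createFolder("/DCS/Temperature", fspec,
                             "DCS temperatures", True)  # True: create parent nodes

    # Store one object: channel 3, valid for times [1000, 2000).
    payload = cool.Record(spec)
    payload["temperature"] = 21.5
    folder.storeObject(1000, 2000, payload, 3)

    # Read back whatever is valid at time 1500 in channel 3.
    obj = folder.findObject(1500, 3)
    print(obj.payload()["temperature"])
    db.closeDatabase()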

  20. Development summary • Milestones • COOL 1.0 released in April 2005 • Basic functionality (development started in Nov. 2004) • COOL 2.0 released in January 2007 • Major backward-incompatible API and schema changes • Current focus is performance optimization • Separate optimizations for different use cases • Several performance issues solved in 2007 • Feedback from and for Atlas/LHCb stress tests • Work in progress also on support for new platforms and a few functional enhancements 3D and COOL - 20

  21. COOL data distribution • Replication at the database backend level • Oracle Streams (see next slides) • Cross-technology replication is possible (same schema for all backends), but has not really been attempted yet • Oracle remote access via Frontier • Under evaluation in Atlas • Replication tools based on the COOL API (sketched below) • Static (copy once) or dynamic (copy then update) • Data slicing/selection is also possible • Cross-technology replication is possible • Many use cases for SQLite files in Atlas and LHCb 3D and COOL - 21
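An API-level copy is conceptually just a browse on the source and a store on the destination, which is why cross-technology copies and validity/channel slices come almost for free. A hedged sketch of a static copy of one folder into an SQLite slice, again assuming the COOL 2.x PyCool API (this only illustrates the idea, it is not the actual replication tools; connection strings and paths are made up):

    # Static copy of one folder from a source COOL database into an SQLite slice.
    from PyCool import cool

    dbSvc = cool.DatabaseSvcFactory.databaseService()
    src = dbSvc.openDatabase("oracle://db-server;schema=READER_ACCOUNT;dbname=MYCOND", True)
    dst = dbSvc.createDatabase("sqlite://;schema=slice.db;dbname=MYCOND")

    srcFolder = src.getFolder("/DCS/Temperature")
    dstFolder = dst.createFolder(
        "/DCS/Temperature",
        cool.FolderSpecification(srcFolder.versioningMode(),
                                 srcFolder.payloadSpecification()),
        srcFolder.description(), True)

    # Copy only the IOVs overlapping a chosen validity slice, in all channels.
    it = srcFolder.browseObjects(1000, 2000, cool.ChannelSelection.all())
    while it.goToNext():
        obj = it.currentRef()
        dstFolder.storeObject(obj.since(), obj.until(), obj.payload(), obj.channelId())
    it.close()

    src.closeDatabase()
    dst.closeDatabase()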

  22. Deployment in LHCb • Computing model • Reconstruction at T0/T1 • Only MC prod at T2 • COOL stores only conditions data for event reconstruction • Oracle at PIT, T0, T1 with replication via Streams • Geometry and conditions for MC sent to T2 as SQLite file • Online db master at PIT • Replicated forward to T0 and T1 via Streams • Data from PVSS processes • Offline db master at T0 • Replicated back to PIT and forward to T1 via Streams • Data computed in offline calibration/alignment jobs (Marco Clemencic, COOL meeting, 3 July 2006) 3D and COOL - 22

  23. Deployment in Atlas • Largest COOL data set comes from DCS • Via the PVSS2COOL data transfer (1.5 GB/day) • From the online RAC in the T0 computer centre • For offline reconstruction and detector experts • Many options open for T2 replication • Many use cases (simulation, calibration, analysis) • Static/dynamic replication to sqlite/mysql, Frontier (Florbela Viegas, CHEP 2007) 3D and COOL - 23

  24. COOL deployment status • The T0 setup is (almost) complete • The LHCb online server is being set up these days • Atlas and LHCb T1 sites are all connected • SARA, RAL, PIC, IN2P3, GridKa, CNAF (both) • Plus NorduGrid, Triumf, BNL, Taiwan (Atlas only) • Distributed tests underway in both experiments • Much larger data rates in ATLAS! 3D and COOL - 24

  25. Atlas scalability tests (1) 3D and COOL - 25

  26. Atlas scalability tests (2) 3D and COOL - 26

  27. Outline • Relational databases for LHC computing • Reliable services at CERN and other LCG sites • The 3D project: distributed database deployment • COOL and other conditions data • COOL development and deployment status • Conclusions 3D and COOL - 27

  28. Conclusions • The 3D project has set up a world-wide distributed database infrastructure for LHC • This is one of the largest distributed deployments of the Oracle database worldwide (over 100 nodes at CERN and a few nodes at each of ten T1 sites) • T0/T1 are ready for ramp-up to LHC production • The COOL software is used by both Atlas and LHCb to store their conditions data • The COOL deployment is one of the largest users of 3D • First results from the Atlas scalability tests confirm that the allocated resources should match the required number of jobs per hour 3D and COOL - 28

  29. For more information • Physics database services at CERN • http://cern.ch/phydb • The 3D project • https://twiki.cern.ch/twiki/bin/view/PSSGroup/LCG3DWiki • The COOL project • http://cern.ch/cool • The CORAL project • http://pool.cern.ch/coral 3D and COOL - 29
