
COOL deployment in ATLAS



Presentation Transcript


  1. COOL deployment in ATLAS
     • A brief overview to give a flavour of COOL activities in ATLAS
     • COOL usage online and offline
     • Current COOL database instances
     • Conditions database deployment model
     • Testing - from online to Tier-n
     • Some ATLAS feedback
     Richard Hawkings (CERN), LCG COOL meeting, 03/07/06

  2. COOL usage in ATLAS
     • COOL is now widely deployed as the ATLAS conditions database solution
       • Only some legacy 2004 combined testbeam data remains in Lisbon MySQL - migrating…
     • COOL usage in online software
       • CDI - interface between the online distributed information system (IS) and COOL
         • Archives IS ‘datapoints’ into COOL to track history - run parameters, status, monitoring
       • PVSS2COOL - application for copying selected DCS data from the PVSS Oracle archive into COOL, for use in offline/external analysis
       • Interfaces between the TDAQ ‘OKS’ configuration database and COOL (oks2cool)
       • Direct use of the COOL and CORAL APIs in subdetector configuration code
     • COOL in offline software (Athena)
       • Fully integrated for reading (DCS, calibration, …) and for calibration data writing
       • Uses inline data payloads (including CLOBs), and COOL references to POOL files
       • Supporting tools developed:
         • COOL_IO - interface to text and ROOT files; new AtlCoolCopy tool
         • Use of PyCoolConsole and PyCoolCopy (a minimal PyCool read sketch follows this slide)
         • Use of Torre’s web browser, plus the new Lisbon and Orsay Java/Athena-plugin browsers
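
As an illustration of the PyCool usage mentioned above, the sketch below reads back one folder from an SQLite replica. It is a minimal sketch, not ATLAS production code: the connection string, dbname (TESTCOOL) and folder path (/DEMO/TEST) are invented, and the exact Python API (in particular the iterator calls) differs slightly between COOL releases; the pattern shown follows the COOL 1.3-era PyCool bindings.

    # Minimal PyCool read sketch (hypothetical connection string, dbname and folder)
    from PyCool import cool

    dbSvc = cool.DatabaseSvcFactory.databaseService()
    # open an SQLite replica read-only; Oracle/MySQL differ only in the connection string
    db = dbSvc.openDatabase("sqlite://;schema=mycool.db;dbname=TESTCOOL", True)

    folder = db.getFolder("/DEMO/TEST")                  # hypothetical folder path
    objs = folder.browseObjects(cool.ValidityKeyMin,     # whole validity range ...
                                cool.ValidityKeyMax,
                                cool.ChannelSelection.all())   # ... and all channels
    while objs.hasNext():                                # COOL 1.3-style iterator
        obj = objs.next()
        print("%s [%s,%s) %s" % (obj.channelId(), obj.since(), obj.until(), obj.payload()))
    objs.close()
    db.closeDatabase()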

  3. COOL database instances in ATLAS
     • Now around 25 GB of COOL data on the production RAC
       • Most ATLAS subdetectors have something; the largest volume comes from DCS (PVSS)
       • Current effort to understand how best to use PVSS smoothing/filtering techniques to reduce the data volume without reducing the information content
     • Data split between several database instances for offline production (data challenges / physics analysis, …), hardware commissioning and the 2004 combined testbeam
       • Mostly using COOL 1.3, but some COOL 1.2 data from ID cosmic tests
         • Reading this offline (with COOL 1.3) needs some gymnastics: replication to SQLite files
     • Some of this data is replicated nightly out of Oracle to SQLite files (a copy sketch follows this slide)
       • 6 MB of COOL SQLite file data is used in offline software simulation/reconstruction
       • These files are included in the release ‘kits’ shipped to outside locations - for this ‘statically’ replicated data there is no need to access the CERN central Oracle servers from the outside world
     • The ATLAS COOL replica from ID cosmics is a 350 MB SQLite file - still works fine
       • (but it takes 10-15 minutes to produce the replica using the C++ version of PyCoolCopy)
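
The nightly Oracle-to-SQLite replication above is done with tools such as AtlCoolCopy and PyCoolCopy; the sketch below shows only the basic per-folder copy loop behind such a tool, under several simplifying assumptions: the connection strings, dbname and folder path are placeholders, the destination folder is assumed to exist already with the same payload specification, and tags, channel names and folder metadata (which the real tools also handle) are ignored.

    # Schematic per-folder IOV copy from a source database into an SQLite replica
    from PyCool import cool

    dbSvc  = cool.DatabaseSvcFactory.databaseService()
    source = dbSvc.openDatabase("oracle://SRCSERVER;schema=MYSCHEMA;dbname=TESTCOOL", True)
    dest   = dbSvc.openDatabase("sqlite://;schema=replica.db;dbname=TESTCOOL", False)

    srcFolder = source.getFolder("/DEMO/TEST")   # hypothetical folder path
    dstFolder = dest.getFolder("/DEMO/TEST")     # assumed already created, same payload spec

    objs = srcFolder.browseObjects(cool.ValidityKeyMin, cool.ValidityKeyMax,
                                   cool.ChannelSelection.all())
    while objs.hasNext():
        obj = objs.next()
        # re-insert each IOV with the same interval, payload and channel
        dstFolder.storeObject(obj.since(), obj.until(), obj.payload(), obj.channelId())
    objs.close()

    source.closeDatabase()
    dest.closeDatabase()

The real tools use COOL's bulk-insertion buffering rather than storing one object at a time, which matters for something the size of the 350 MB ID-cosmics replica quoted above.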

  4. Conditions data deployment model
     [Deployment diagram: the online Oracle CondDB at the ATLAS pit (fed by online / PVSS / HLT-farm applications) is replicated via Oracle Streams, over a dedicated 10 Gbit link through the ATCN/CERN GPN gateway, to the offline Oracle master in the computer centre; calibration updates go into the offline master, Streams replication feeds Tier-1 replicas in the outside world, and SQLite replication feeds the Tier-0 farm.]
     • At present, all data is on the ATLAS RAC
     • A separate online server will be introduced soon, once tests are complete

  5. Conditions database testing
     • Tests of the online database server
       • Using the COOL verification client from David Front, have started writing data from the pit to the online Oracle server - using COOL as an ‘example application’ (alongside other online applications)
       • Scale: 500 COOL folders, 200 channels/folder, 100 bytes/channel, every 5 minutes
         • A 3 GB/day DCS-type load (which would come from the PVSS Oracle archive via PVSS2COOL); see the back-of-envelope check after this slide
       • Working - will add Oracle Streams replication to the ‘offline’ server soon
       • Working towards the correct network configuration to bring the online Oracle server into production (it will be on the private ATCN network, not visible on the CERN GPN)
     • Replication for the HLT - a challenge
       • The online HLT farm (level 2 and event filter) needs to read 10-100 MB of COOL conditions data at the start of each fill into each of O(10000) processes, as fast as possible
       • Possible solutions under consideration (work getting underway):
         • Replication of the data for the required run to an SQLite file distributed to all hosts
         • Replication into MySQL slave database servers on each HLT rack fileserver
         • Running squid proxies on each fileserver and using Frontier (the same data for each)
         • Using a more specialised DBProxy that understands e.g. the MySQL protocol and could even multicast to a set of nodes (worries about local network bandwidth to the HLT nodes)
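
The 3 GB/day figure for the DCS-type test load can be checked with simple arithmetic on the quoted scale (payload bytes only; COOL IOV bookkeeping and indexing overhead come on top):

    # Back-of-envelope check of the quoted DCS-type load
    folders             = 500
    channels_per_folder = 200
    bytes_per_channel   = 100
    cycles_per_day      = 24 * 60 // 5       # one write cycle every 5 minutes = 288/day

    payload_per_cycle = folders * channels_per_folder * bytes_per_channel   # 10 MB per cycle
    payload_per_day   = payload_per_cycle * cycles_per_day                  # ~2.9 GB per day

    print("%.1f MB per cycle" % (payload_per_cycle / 1e6))
    print("%.2f GB per day" % (payload_per_day / 1e9))   # ~2.9 GB payload, i.e. ~3 GB/day with overhead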

  6. Conditions database testing, continued
     • Replication for Tier-0
       • Tier-0 does prompt reconstruction, run-by-run; jobs start incoherently
       • Several hours per job, data spanning O(1 minute), 1000s of jobs in parallel
       • The easiest solution is to extract all the COOL data needed for each run (O(10-100 MB?)) once into an SQLite file and distribute that to the worker nodes
       • A solution with SQLite files (and POOL payload data) on replicated AFS is being tested now
     • Replication outside CERN
       • Reprocessing of RAW data will be done at the Tier-1s (Tier-0 is busy with new data)
       • Need replication of all COOL data required for offline reconstruction
       • Use Oracle Streams replication - being tested in the 3D project ‘throughput phase’
       • Once done, do some dedicated ATLAS tests (as for online -> offline), then production
     • Tier-2s and beyond need subsets (some folders) of the conditions data
       • Analysis, Monte Carlo simulation, calibration tasks, …
       • Either use COOL API-based dynamic replication to MySQL servers at the Tier-2s, or Frontier web-cache-based replication from Tier-1 Oracle (the same client code works against any backend; see the sketch after this slide)
       • With squids at Tier-1s and Tier-2s, need to solve stale-cache problems (by policy?)
       • First plans for testing this are being made - David Front, Argonne
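
One point implicit in the Tier-0/Tier-1/Tier-2 options above is that the client code is the same for every replica technology: only the connection string changes. The sketch below illustrates this; the server, schema and dbname values are placeholders rather than real ATLAS accounts, and the Frontier/squid case is omitted.

    # The same PyCool client can target any replica by switching the connection string
    from PyCool import cool

    CONNECT = {
        "tier0_sqlite": "sqlite://;schema=run12345_cool.db;dbname=TESTCOOL",    # per-run extract
        "tier1_oracle": "oracle://T1SERVER;schema=ATLAS_COOL;dbname=TESTCOOL",  # Streams replica
        "tier2_mysql":  "mysql://T2SERVER;schema=ATLAS_COOL;dbname=TESTCOOL",   # dynamic replica
    }

    dbSvc = cool.DatabaseSvcFactory.databaseService()
    db = dbSvc.openDatabase(CONNECT["tier0_sqlite"], True)   # read-only open
    print(db.listAllNodes())                                 # identical API whatever the backend
    db.closeDatabase()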

  7. Some ATLAS feedback on COOL
     • COOL is (at last) being heavily used, both online and offline
       • Seems to work well - so far so good…
       • Online applications are stressing the performance, offline less so
       • The ability to switch between Oracle, (MySQL) and SQLite is very useful
     • Commonly-heard comments from subdetector users
       • Why can’t we have payload queries?
         • This causes people to think about reinventing COOL, or accessing the COOL data tables directly, e.g. via CORAL
       • We would like COOL tables holding foreign keys to other tables
         • Want to do COOL queries that include the ‘join’ to the payload data
         • This can be emulated with a 2-step COOL+CORAL lookup (sketched after this slide), but it is not efficient for bulk access
         • A headache for replication…
       • COOL is too slow - e.g. in multichannel bulk insert
       • We need a better browser (one that can handle large amounts of data)
       • Why can’t COOL 1.3 read the COOL 1.2 schema?
     • I know the ‘COOL-team’ answers to these questions, but it is still useful to give them here - this is feedback from the end-users!
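
For concreteness, the 2-step COOL+CORAL emulation mentioned above looks roughly like the sketch below: step 1 reads a foreign-key value stored in a COOL payload, step 2 follows it with a PyCoral query on the referenced table. All folder, table, column and attribute names (/DEMO/CALIBKEYS, CALIB_PAYLOAD, CALIB_ID, calib_id) are invented for illustration, the payload-attribute access syntax differs between COOL releases, and the extra round trip per object is exactly why this is inefficient for bulk access.

    # Step 1: look up the foreign key in COOL at a given point in time
    from PyCool import cool
    import coral

    dbSvc  = cool.DatabaseSvcFactory.databaseService()
    cooldb = dbSvc.openDatabase("oracle://SERVER;schema=MYSCHEMA;dbname=TESTCOOL", True)
    folder = cooldb.getFolder("/DEMO/CALIBKEYS")            # hypothetical folder
    obj    = folder.findObject(1234567890, 0)               # point in time, channel 0
    calibId = obj.payload()["calib_id"]                     # access syntax varies by COOL release (may need .data())
    cooldb.closeDatabase()

    # Step 2: follow the key into the payload table with CORAL (PyCoral)
    connSvc = coral.ConnectionService()
    session = connSvc.connect("oracle://SERVER/MYSCHEMA", coral.access_ReadOnly)
    session.transaction().start(True)                       # read-only transaction
    query = session.nominalSchema().tableHandle("CALIB_PAYLOAD").newQuery()
    cond  = coral.AttributeList()
    cond.extend("cid", "long")
    cond["cid"].setData(int(calibId))
    query.setCondition("CALIB_ID = :cid", cond)
    cursor = query.execute()
    while cursor.next():
        print(cursor.currentRow())
    del query
    session.transaction().commit()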
