
Conditions data handling in FDR2c




  1. Conditions data handling in FDR2c
     • Tag hierarchies set up (largely by Paul) and communicated in advance
     • No real problems uploading data to the correct tag
     • Calibration experts starting to deal with ‘real’ IOVs (data valid for a calibration period)
     • New POOL file registration scripts worked fine
     • Calibration users need to be in AFS group atlcond:poolcond
     • Consider doing calibration uploads from a ‘calibration’ account, not personal ones?
     • No instances of data in COOL with a missing (or wrong) corresponding POOL file upload
     • No use of run-signoff database pages yet
     • System was not ready and integrated yet (holidays; too busy with other things)
     • But only one set of runs, and all calibrations were ‘accepted’ - no real test
     • Handling of detector status information works technically
     • Merging and transfer to LBSUMM folder (for ESD/AOD) still done by hand
     • Limited mapping of DQ histograms to status flags restricts usefulness
     • Need to make sure this improves for real data
     • Need to clarify how detector status flags are dealt with in ES1, ES2 processing
     Richard Hawkings / Paul Laycock
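The ‘real’ IOVs in slide 1 are intervals of validity: a calibration payload applies to a bounded range of runs rather than to all data. The lookup this implies can be sketched as below. This is a toy illustration, not the COOL API; the class name `IovStore`, the run numbers and the payload strings are all hypothetical.

```python
from bisect import bisect_right


class IovStore:
    """Toy interval-of-validity store (hypothetical, not the COOL API).

    Each payload is valid for runs in the half-open range [since, until).
    """

    def __init__(self):
        self._since = []    # sorted start runs, used for binary search
        self._entries = []  # (since, until, payload), same order as _since

    def store(self, since, until, payload):
        if since >= until:
            raise ValueError("IOV must satisfy since < until")
        i = bisect_right(self._since, since)
        # Reject any overlap with the neighbouring intervals.
        if i > 0 and self._entries[i - 1][1] > since:
            raise ValueError("overlaps previous IOV")
        if i < len(self._entries) and self._entries[i][0] < until:
            raise ValueError("overlaps next IOV")
        self._since.insert(i, since)
        self._entries.insert(i, (since, until, payload))

    def lookup(self, run):
        """Return the payload valid for `run`, or None if no IOV covers it."""
        i = bisect_right(self._since, run) - 1
        if i >= 0:
            since, until, payload = self._entries[i]
            if since <= run < until:
                return payload
        return None


store = IovStore()
store.store(100, 200, "calib-period-A")  # hypothetical run ranges
store.store(200, 300, "calib-period-B")
print(store.lookup(150), store.lookup(200), store.lookup(300))
```

Half-open ranges let consecutive calibration periods share a boundary run number without overlapping, and a run outside every interval returns `None` instead of silently picking up a neighbouring calibration.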

  2. Conditions DB access problems
     • Big problems in Tier-0 conditions DB access Thursday night / Friday morning
     • Combination of several factors
     • 2 of the 4 Oracle server nodes got into trouble and restarted
     • Kernel patch being applied this week, some interdependencies not fully understood yet
     • Server full of ‘stuck’ connections which were never released or cleaned up - deadlock
     • Very high load due to FDR2 bulk reprocessing and cosmics reprocessing going on in parallel, plus FCT, ATN, RTT, TCT tests, plus user jobs
     • All jobs accessing Oracle directly, no use of SQLite replicas at present
     • A replica is only useful once the run has ended online - applicable to ES2 and bulk reco only
     • Vulnerability in that ALL Athena jobs accessing Oracle use the same reader account
     • Limit of 800 concurrent sessions, now changed to 4 x 800
     • Each Athena job holds O(10) connections in parallel until end of first event (one per subdetector schema) - typically for 5 minutes or so. Vulnerable to ‘deadlock’
     • Further actions being pursued
     • Deploy SQLite replica for bulk processing (but not for cosmics / express stream)
     • Use a dedicated COOL reader account for Tier-0 jobs - guarantee number of connections
     • Reduce connection load from Athena jobs (short/long term actions)
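The session figures in slide 2 allow a back-of-envelope estimate of how many jobs the shared reader account can carry. The sketch below assumes exactly 10 connections per job (the slide says O(10)), a hold time of exactly 5 minutes, and no other clients on the account; the function names are illustrative only.

```python
def max_concurrent_jobs(session_limit, conns_per_job):
    """Jobs that can hold their connections at the same time."""
    return session_limit // conns_per_job


def max_job_start_rate(session_limit, conns_per_job, hold_minutes):
    """Sustainable job starts per minute.

    Uses Little's law: concurrency = arrival rate x holding time.
    """
    return max_concurrent_jobs(session_limit, conns_per_job) / hold_minutes


# Assumed values: 10 connections per job, held for 5 minutes.
for limit in (800, 4 * 800):
    jobs = max_concurrent_jobs(limit, 10)
    rate = max_job_start_rate(limit, 10, 5)
    print(f"limit {limit}: {jobs} jobs initialising at once, {rate} starts/min")
# limit 800: 80 jobs initialising at once, 16.0 starts/min
# limit 3200: 320 jobs initialising at once, 64.0 starts/min
```

Under these assumptions the original limit is reached by about 80 jobs in their initialisation phase at once, which is why parallel reprocessing campaigns plus test and user jobs could exhaust it, and why reducing the connections each job holds helps as much as raising the limit.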

  3. Next steps - discussion needed
     • Work on conditions DB access problems
     • Deployment of SQLite replicas to be used where possible
     • Start to set up tag hierarchies for first data
     • Separate top-level tags to be used by HLT, monitoring, Tier-0, reprocessing
     • Define calibration loop model for first data
     • Cosmics processing has no calibration loop, and several ‘express’ streams
     • Same plan for single beam running, or move to ‘calibration loop’?
     • Calibration 24 hrs might be needed for code fixes even if no prompt calibration can be done yet; might have multiple processings at Tier-0
     • What to do for first collisions?
     • Sign-off tool and Tier-0/conditions integration to support all this?
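The idea of separate top-level tags per consumer (HLT, monitoring, Tier-0, reprocessing) can be sketched as a two-level mapping: each top-level tag resolves to one tag per conditions folder. Every tag name and folder path below is made up for illustration, and the code does not use the COOL API.

```python
# Hypothetical hierarchy: top-level tag -> {folder path -> folder-level tag}.
HIERARCHY = {
    "TOP-TIER0-EXAMPLE": {
        "/EXAMPLE/Calib/Pedestals": "Pedestals-prompt-01",
        "/EXAMPLE/Align": "Align-nominal-01",
    },
    "TOP-REPRO-EXAMPLE": {
        "/EXAMPLE/Calib/Pedestals": "Pedestals-refined-02",
        "/EXAMPLE/Align": "Align-nominal-01",
    },
}


def resolve(top_tag, folder, hierarchy=HIERARCHY):
    """Return the folder-level tag that a top-level tag points to."""
    try:
        return hierarchy[top_tag][folder]
    except KeyError:
        raise LookupError(f"no tag for {folder!r} under {top_tag!r}") from None


def differing_folders(tag_a, tag_b, hierarchy=HIERARCHY):
    """Folders where two top-level tags resolve to different folder tags."""
    a, b = hierarchy[tag_a], hierarchy[tag_b]
    return sorted(f for f in a.keys() & b.keys() if a[f] != b[f])


print(resolve("TOP-TIER0-EXAMPLE", "/EXAMPLE/Calib/Pedestals"))
print(differing_folders("TOP-TIER0-EXAMPLE", "TOP-REPRO-EXAMPLE"))
```

With one top-level tag per consumer, a job only needs to be configured with that single tag, and a reprocessing pass can pick up refined calibrations without changing what Tier-0 or the HLT sees.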
