
Central DQM Shift Tutorial Online/Offline


lrobby


Presentation Transcript


  1. Central DQM Shift Tutorial Online/Offline

  2. Overview of the CMS DAQ and useful terminology
     • Detector signals are collected through individual data acquisition systems (cables and boards) that end up at the FEDs, the first element of the global Data Acquisition system (DAQ)
     • FED (Front-End Driver, the detector front-end boards): multiple FEDs per detector collect event fragments that are sent to the online event processing farm
     • Builder Units: computing farm that collects event fragments from all FEDs and merges them to produce full event information
     • Filter Units: computing farm where the High Level Trigger (HLT) is run to filter interesting events
     • Storage Manager: application that saves the events selected by the HLT to local disks

  3. Online DQM
     • Online DQM: suite of CMSSW applications that run either on all events in the Filter Farm or on a selection of events served by the Storage Manager
     • Since December 2009, Online DQM consumes DCS information in addition to event data
     • The Online DQM infrastructure is completed by the DQM data transfer system up to the DQM servers, where the histograms and other DQM data are uploaded and made visible to the shifters and the CMS community
     • Scope of Online DQM shifts: identify problems with detector performance or data integrity during the run

  4. Offline Data Processing and Offline DQM
     • Prompt Reconstruction at T0 and CAF (CERN) is performed between one hour and 48 hours after the data is transferred from P5 (the online environment) to T0 and CAF
     • Subsequent iterations of re-reconstruction at the T1s periodically follow the Prompt Reco, with improved alignment and calibration constants and bug fixes
     • Offline DQM is part of the offline data processing; in addition to detector data analyses, it includes higher-level reconstruction objects, aka Physics Objects (POGs)
     • Scope of Offline DQM shifts: produce the data certification for the various reconstruction iterations. USED FOR CMS OFFICIAL GOOD RUN LISTS!

  5. DQM Shift Tools
     • DQM GUI: graphical user interface for histogram viewing
     • DQM Run Registry: web interface to the database that holds run information. Used by shifters to register interesting runs (Online shifters) and to collect quality information
     • Elog: for end-of-shift summaries and problem reports
     • TWiki pages: for shift instructions

  6. DQM Shifts - Overview
     • Online shifts at P5 (3/day for 24-hour coverage): 23:00-7:00 | 7:00-15:00 | 15:00-23:00
     • On the first day of global running, shifts start at 9:00 (for this shift you can use the regular 8:15 shuttle)
     • For the latest run plan updates (shift cancellations), subscribe to: hn-cms-commissioning@cern.ch
     • Offline shifts run at control rooms away from P5 (4/day):
        - 1:00-7:00 at Fermilab
        - 7:00-13:00 at the CERN-CMS Centre (Meyrin site)
        - 13:00-19:00 at DESY
        - 19:00-1:00 at Fermilab
     • In order to certify reprocessed data, Offline DQM shifts can also be scheduled outside of global data-taking periods
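The shift slots above can be written down as a small lookup, which makes the midnight wrap-around of the 23:00-7:00 and 19:00-1:00 slots explicit. This is only an illustrative sketch: the function names are hypothetical, and it assumes all hours are given in the same local time used by the schedule on the slide.

```python
# Shift slots copied from the slide: (start hour, end hour, site).
# Assumption: all hours are in the same local time as the slide's schedule.
ONLINE_SHIFTS = [(23, 7, "P5"), (7, 15, "P5"), (15, 23, "P5")]
OFFLINE_SHIFTS = [
    (1, 7, "Fermilab"),
    (7, 13, "CERN-CMS Centre (Meyrin)"),
    (13, 19, "DESY"),
    (19, 1, "Fermilab"),
]


def in_slot(hour, start, end):
    """Return True if `hour` falls in [start, end), allowing slots that cross midnight."""
    if start < end:
        return start <= hour < end
    return hour >= start or hour < end


def find_shift(hour, shifts):
    """Return the (start, end, site) slot covering the given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    for start, end, site in shifts:
        if in_slot(hour, start, end):
            return start, end, site
    raise ValueError("no shift covers this hour")


print(find_shift(0, OFFLINE_SHIFTS))
print(find_shift(3, ONLINE_SHIFTS))
```

The two calls show that midnight belongs to the 19:00-1:00 offline slot and that 3:00 belongs to the 23:00-7:00 online slot.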

  7. Access to Control Rooms on CERN Site
     • Safety requirements for P5: the online CMS level 4C safety class must be passed
     • Each shifter should have the appropriate access rights, requested through EDH:
        - request access to « CMS CR » for P5
        - request access to « CMS CEN » for the CMS Centre at Meyrin
     • A regular shuttle service runs 7 days per week: https://twiki.cern.ch/twiki/bin/view/CMS/P5Shuttle

  8. In preparation for the shift
     • Read the DQM shift instructions, even if you have been on shift before:
        - https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftInstructions
        - https://twiki.cern.ch/twiki/bin/view/CMS/OnlineDQMShifts
        - https://twiki.cern.ch/twiki/bin/view/CMS/OfflineDQMShifts
        - https://twiki.cern.ch/twiki/bin/view/CMS/DQMOnlineShortTermInstr
        - https://twiki.cern.ch/twiki/bin/view/CMS/DQMOfflineShortTermInstr
     • Check the instructions at the beginning of each shift to find out whether something has changed since your last shift

  9. In preparation for the shift
     • For newcomers:
        - Schedule your first shift in the daytime so assistance will be readily available if needed
        - Attend the shift tutorial on Monday (possibly the latest one before your shift or the one before that)
        - Attend a trainee shift between the tutorial and your first shift

  10. Task 0: Applications
     • (1) Make sure the DQM applications and GUI are running during data taking:
        - Online: check that histograms are updated during runs
        - Offline: check the arrival of histograms from Tier-0/CAF processing
     • (2) Check that the Run Registry is updating correctly
     • (3) In case of problems in the DQM tools (persisting longer than 15 mins), call the DQM on-call expert: 165579
     • (4) During Online shifts, stay in close contact with the Shift Leader (P5)
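The 15-minute rule in item (3) is a simple elapsed-time check. The sketch below is only an illustration of that rule; the function name and the idea of recording when a problem was first noticed are assumptions, not part of the DQM tools.

```python
from datetime import datetime, timedelta

# Threshold taken from the slide: problems persisting longer than 15 minutes.
ESCALATION_DELAY = timedelta(minutes=15)


def should_call_expert(problem_first_seen, now):
    """Return True once a DQM tool problem has persisted longer than 15 minutes."""
    return now - problem_first_seen > ESCALATION_DELAY


first_seen = datetime(2011, 5, 1, 10, 0)
print(should_call_expert(first_seen, datetime(2011, 5, 1, 10, 10)))
print(should_call_expert(first_seen, datetime(2011, 5, 1, 10, 20)))
```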

  11. Task 1: Histogram Inspection
     • Follow the online/offline shift instructions
     • Run-by-run procedure:
     • (1) Enter significant runs in the Run Registry
        - Online shifter: decides which runs are significant enough to analyze and registers them in the Run Registry (confirm with the Shift Leader!)
        - Offline shifter: analyzes the runs that have previously been registered by the Online shifter
     • (2) Shift histogram inspection
        - Look at the Summary, the Reports, and the Shift workspaces in the DQM GUI
        - Make an effort to look at all the plots one by one
        - If you spot a problem or have a question regarding a specific plot:
           - Check the sub-system instructions and take action accordingly
           - If it is not explained in the instructions, discuss it with the P5 shift leader and, if needed, ask them to call the Detector On Call (DOC)
           - In offline, inform the sub-system DQM experts by mail or phone, as noted in the instruction page
           - Make an Elog entry (Type: "Problem Report")

  12. GUI – Summary Workspace https://cmsweb.cern.ch/dqm/online

  13. GUI – Summary Workspace https://cmsweb.cern.ch/dqm/online

  14. GUI – Reports Workspace https://cmsweb.cern.ch/dqm/online

  15. GUI – Shift Workspace

  16. GUI – Shift Workspace
     • Inspect the error plots (Online GUI -> Work Space -> Shift -> Errors). They show plots that indicate errors
     • Please read the sub-system short-term instructions on how to evaluate them

  17. DQM Shift Instructions

  18. DQM Shift Instructions
     • Check that the HV information on the DQM GUI summary ("Info" histogram) and in the Run Registry ("LumiSec" table) are consistent
     • Sometimes "LS" in the Runs table shows 0. If this happens:
        - make an Elog entry of type "Problem Report" indicating the run number
        - ensure that the DQM expert on-call is aware of the problem
        - put the information manually into the general comments section of the Run Registry. Example of the format: LS 0 = CASTOR, Strips, RPC, Pixel, and DT with HV OFF. All others with HV ON.
     • Correlation between data certification and HV conditions:
        - If HV remains OFF throughout the entire run, mark the subsystem as BAD
        - If HV varies, do not mark the subsystem as BAD on that basis; proceed by following the subsystem instructions as usual
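The HV rule and the comment format above can be sketched as two small helpers. This is an illustration only: the function names and the per-lumi-section list of HV states are assumptions, and the wording used when no subsystem is OFF is not taken from the slide.

```python
def hv_flag_hint(hv_on_per_ls):
    """Apply the HV rule from the slide to one subsystem.

    hv_on_per_ls: one boolean per lumi section, True when HV was ON.
    Returns "BAD" if HV was OFF for the entire run, otherwise None,
    meaning the flag must come from the usual subsystem instructions.
    """
    if not hv_on_per_ls:
        raise ValueError("need at least one lumi section")
    if not any(hv_on_per_ls):
        return "BAD"
    return None


def hv_comment(lumi_section, subsystems_off):
    """Build a Run Registry general comment in the style of the slide's example."""
    names = list(subsystems_off)
    if not names:
        # Wording for this case is an assumption; the slide gives no example.
        return f"LS {lumi_section} = All subsystems with HV ON."
    if len(names) == 1:
        listing = names[0]
    elif len(names) == 2:
        listing = " and ".join(names)
    else:
        listing = ", ".join(names[:-1]) + ", and " + names[-1]
    return f"LS {lumi_section} = {listing} with HV OFF. All others with HV ON."


print(hv_flag_hint([False, False, False]))
print(hv_comment(0, ["CASTOR", "Strips", "RPC", "Pixel", "DT"]))
```

The second call reproduces the example comment given on the slide.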

  19. Task 2: Run Registry
     • Hover over the quality flag to see the comment

  20. Run Registry
     • The Run Registry collects automated run information and information filled in by the DQM shifters: data quality and comments
     • There are three views in the Run Registry, browsable through the buttons at the top of the page:
        - RunInfo
        - Table
        - LumiSec

  21. Run Registry: RunInfo View
     • The RunInfo table contains information that is entered automatically
     • All runs are entered in this table, from short start/stop runs to stable long runs
     • This table is used by the ONLINE DQM SHIFTER to select significant runs:
        - Register only runs that have more than 10,000 events and/or have been running for more than 10 minutes (if you are in doubt, ask the shift leader)
        - Follow the run-by-run workflow, inspecting the shift histograms
        - Ask the shift leader to confirm the reported information and then move the entry to « SIGNOFF »
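The selection criterion for significant runs can be expressed as a one-line predicate. This sketch reads the slide's "and/or" as an inclusive OR, which is an interpretation; the function and parameter names are hypothetical, and borderline cases should still be confirmed with the shift leader.

```python
def is_significant_run(n_events, duration_minutes,
                       min_events=10_000, min_minutes=10):
    """Return True if a run passes the registration thresholds from the slide.

    The slide's "and/or" is interpreted here as an inclusive OR:
    either threshold being exceeded is enough.
    """
    return n_events > min_events or duration_minutes > min_minutes


print(is_significant_run(n_events=250_000, duration_minutes=4))
print(is_significant_run(n_events=500, duration_minutes=2))
```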

  22. Run Registry: Table View, Online Shifter’s use
     • Runs that are registered by the Online DQM shifters are included and visible in the DB table of the “Table” view
     • This is the view used by both Online and Offline DQM shifters to log information, i.e. comments and per-subsystem quality flags
     • Online shifters: click on the dataset “/Global/Online/ALL” and then on “Edit” to open the Run Registry editing page. Analyze the subsystems one by one through the related histograms in the GUI, then assign the run type and set the quality flags in the editing page of the Run Registry

  23. Online: Register Runs in Run Registry
     • During the run: click the run number itself -> View details -> Edit
     • Based on the shift histogram instructions, set the online subsystem flags (GOOD/BAD) and enter comments
     • If a subsystem is BAD, inform the shift leader and the subsystem expert, and enter a comment
     • Also enter the relevant luminosity range for “physics” (click “Beam Status” towards the bottom of the page). You can get this information either from the shift leader or from the shift elog
     • Try to provide complete information by adding details such as Beam 1 or 2, etc. in the “comments” field

  24. Register Runs in Run Registry
     • After the run:
        - Enter the ‘stop reason’ (in the stop reason field, NOT under comments)
        - The certification results must be confirmed by the shift leader before the status of a given run is moved to « SIGNOFF »
        - From the main “Runs info” view -> click on the dataset name -> move to SIGNOFF
        - Once the run is in the SIGNOFF state, it cannot be modified by the Online shifter

  25. Online DQM Shifter: Run Classification
     • Note that runs are classified through the Group name in the Run Registry
     • Assigning the correct Group name is of vital importance, as it affects the offline determination of which runs are used for different analyses
     • The current rules follow; make sure you check the updated rules in the Online DQM Shift Instructions at the start of your shift:
        1. Select "Collisions11" if the run is taken for physics analysis purposes and contains at least one lumi section with two stable beams (colliding or non-colliding)
        2. Select "Cosmics11" if the run is taken for analysis purposes and there is no beam activity throughout the run, i.e. stable "no beam" conditions
        3. Select "Commissioning11" for all other runs, i.e. those taken for tests or specific detector studies only and not meant for general offline physics analysis
     • Ask the shift leader if in doubt!
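The three classification rules can be read as an ordered decision, sketched below. The function and its inputs are hypothetical simplifications of what the shifter learns from the shift leader and the run information; the group names are those quoted on the slide and may differ in the current instructions.

```python
def classify_run(taken_for_analysis, n_ls_two_stable_beams, any_beam_activity):
    """Pick the Run Registry Group name following the slide's three rules.

    taken_for_analysis: True if the run is meant for (physics) analysis.
    n_ls_two_stable_beams: number of lumi sections with two stable beams.
    any_beam_activity: True if there was any beam activity during the run.
    """
    # Rule 1: analysis run with at least one lumi section with two stable beams.
    if taken_for_analysis and n_ls_two_stable_beams >= 1:
        return "Collisions11"
    # Rule 2: analysis run with stable "no beam" conditions throughout.
    if taken_for_analysis and not any_beam_activity:
        return "Cosmics11"
    # Rule 3: everything else (tests, specific detector studies).
    return "Commissioning11"


print(classify_run(True, 120, True))
print(classify_run(True, 0, False))
print(classify_run(False, 0, True))
```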

  26. Run Registry: Table View, Offline Shifter’s use
     • Offline DQM shifter:
        - Select which run to analyze: the oldest run where the “/Global/Online/ALL” dataset is in SIGNOFF status
        - ADD the Offline dataset to analyze (as per the instructions), proceed with the subsystem evaluation based on the relevant histograms in the GUI, and assign quality flags
        - Move the dataset entry to SIGNOFF when all subsystems are analyzed
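The run selection for the Offline shifter can be sketched as a filter followed by a minimum. This is not the Run Registry API: the list-of-dicts layout, the function name, and the `already_done` argument are all assumptions made for illustration, and "oldest" is taken to mean the lowest run number.

```python
ONLINE_DATASET = "/Global/Online/ALL"


def next_run_to_analyze(runs, already_done=frozenset()):
    """Return the oldest run whose online dataset is in SIGNOFF, or None.

    runs: list of dicts such as
        {"run": 160431, "datasets": {"/Global/Online/ALL": "SIGNOFF"}}
    already_done: run numbers the offline shifter has already handled
        (hypothetical bookkeeping, not described on the slide).
    """
    candidates = [
        entry["run"]
        for entry in runs
        if entry["datasets"].get(ONLINE_DATASET) == "SIGNOFF"
        and entry["run"] not in already_done
    ]
    # Assumption: run numbers increase with time, so the minimum is the oldest.
    return min(candidates) if candidates else None


example_runs = [
    {"run": 160433, "datasets": {ONLINE_DATASET: "SIGNOFF"}},
    {"run": 160431, "datasets": {ONLINE_DATASET: "SIGNOFF"}},
    {"run": 160440, "datasets": {ONLINE_DATASET: "OPEN"}},
]
print(next_run_to_analyze(example_runs))
print(next_run_to_analyze(example_runs, already_done={160431}))
```

The run numbers and the "OPEN" state in the example are made up for the demonstration.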

  27. Run Registry – Datasets (P5, Meyrin)

  28. ELOG - http://cmsonline.cern.ch/portal/page/portal/CMS%20online%20system/Elog
     • Log in with your AFS account
     • Click on "Elog" and choose Subsystems -> "Event Display and DQM"
     • Problem Report (1 entry per problem)
        - For each problem arising during your shift, make a "Problem Report" entry
        - Please use the Elog to report problems!
     • Shift Summary (1 entry per shift)
        - At the end of each shift, write a short "Shift Summary"
     • N.B. (!!!):
        - Use the types "Problem Report" or "Shift Summary" only; do NOT create new types
        - Do not enter run-by-run information in the Elog; it belongs in the Run Registry

  29. Shift Hand-over
     • Make sure to arrive 5-10 minutes early for the shift hand-over
     • Upon your arrival in the control room, the previous shifter will be there
     • Get from her/him the information about the current status of the data taking and what happened during the previous shift
     • The previous shifter will show you where the tools you will be using are running (DQM GUI, CMS Online page, Run Registry)
     • If anything about your tasks is not clear, ask at that moment!
     • At the end of your shift, wait for the next shifter to arrive and provide the same support

  30. Links
     • Shift instructions:
        - https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftInstructions
        - https://twiki.cern.ch/twiki/bin/view/CMS/OnlineDQMShifts
        - https://twiki.cern.ch/twiki/bin/view/CMS/OfflineDQMShifts
        - https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftHistograms
     • DQM GUI (online and offline):
        - http://cmsweb.cern.ch/dqm/online/
        - http://cmsweb.cern.ch/dqm/offline/
        - Follow the certificate instructions at https://twiki.cern.ch/twiki/bin/view/CMS/DQMGUIGridCertificate
     • Run Registry page:
        - http://pccmsdqm04.cern.ch/runregistry
        - presently only working from within CERN (a tunnel is needed from outside CERN)
     • Elog:
        - http://cmsonline.cern.ch/portal/page/portal/CMS%20online%20system

  31. Final Suggestions
     • From now until your shift:
        - Get familiar with the DQM TWiki pages (general structure and the dedicated Online/Offline pages): https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftInstructions
     • Shortly before each shift you MUST read:
        - The short-term instructions
        - The Event Filter/DQM Announcements HN (https://hypernews.cern.ch/HyperNews/CMS/get/EvFDqmAnnounce.html)
        - The Online/Offline Shift Histograms by Subsystem (especially Online shifters)
        - The Elog of the shift before yours, to be aware of the recent activity (please read both the Shift Leader Elog and the Event Display and DQM Elog)

  32. Backup

  33. Run Registry – Message Board
     • Communication between shift persons and DQM operators
     • Opening:
        - Close all browser windows on the workstation
        - Open the Run Registry (RR)
        - Log in as yourself
        - Select Tools -> Message Board
     • Use the phone or the RR message board. Do not use the Elog or email (!)
     • Stick to using English

  34. Run Registry – Message Board (screenshot panels: automatic messages, chat window, logged users)
