v5-01-Release P. Hristov 19/12/2011
Changes: v5-01-Rev-17 • #89676 Porting modifs in AliGenpythia for PYQUEN usage to release, 53388 • #89731 Port request: ZDC timing cut default values • #89772 Request to port PHOS trigger info reconstruction code to the Release. From rev. 52412,53508,53524,53553,53577 • #89357 commit in STEER and port to release T0 AOD. From rev. 53616 • #89816: Hardware cables swapping discovered for HMPID chamber 2. From rev. 53586 • #89819: Request to port bugfix in AliHLTTPCHWClusterMerger to v5-01-Release. From rev. 53494
Changes: v5-01-Rev-17 • #89840 Request to port to release TPC code - for pass2. From rev. 53471 • #89860: TRD request :: New addition to ESD track. From rev. 53584, 53643 • #89887: Detector status ULong_t in AliESDEvent. From rev. 53609,53610 • #89872: Request: Port bug fix TRD calibration code to release. From rev. 53593 • #89822: Provide default track cuts for AOD production (LHC11h). From rev. 53104,53576,53604,53608,53619 + 53654 • #89817: Commit calorimeter AOD. From rev. 53658
Changes in v5-01-Rev-17. #88914: Very high memory consumption in reco of 2011 Pb+Pb • Rev. 53510, 53511: bug fix in AliESDTagCreator + method to merge tags. • Rev. 53512 : possibility to delete the recpoints, digits after each event reco. • Rev. 53513 : possibility to safely stop reco if memory exceeds specified limits. • Rev. 53583 : Bug fix + keeping the clusters in the TClonesArrays
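The safeguard from rev. 53513 (stopping the reconstruction safely once memory exceeds specified limits) can be illustrated with a minimal C++ sketch. This is not the AliRoot implementation: `MemorySample`, `MemoryLimits`, `exceedsLimits` and `eventsBeforeStop` are hypothetical names, and the memory readings are passed in as plain numbers rather than queried from the operating system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; they are not AliRoot classes.
struct MemorySample { double residentMB; double virtualMB; };
struct MemoryLimits { double maxResidentMB; double maxVirtualMB; };

// True if either the resident or the virtual memory is above its limit.
bool exceedsLimits(const MemorySample& sample, const MemoryLimits& limits) {
  return sample.residentMB > limits.maxResidentMB ||
         sample.virtualMB > limits.maxVirtualMB;
}

// afterEachEvent[i] is the memory measured after event i was reconstructed.
// Returns how many events were reconstructed before the loop ended, either
// because the input was exhausted or because a limit was exceeded.
std::size_t eventsBeforeStop(const std::vector<MemorySample>& afterEachEvent,
                             const MemoryLimits& limits) {
  assert(limits.maxResidentMB > 0 && limits.maxVirtualMB > 0);
  std::size_t reconstructed = 0;
  for (const MemorySample& sample : afterEachEvent) {
    ++reconstructed;                            // this event is complete
    if (exceedsLimits(sample, limits)) break;   // stop cleanly, keep the output
  }
  return reconstructed;
}
```

The check runs after each completed event, so the events reconstructed so far remain usable and the job ends in a controlled way instead of being killed by the batch system.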
Changes: v5-01-Rev-18 • #89822: Provide default track cuts for AOD production (LHC11h). From rev. 52345,52488,52523,52536 • AliCentralitySelectionTask: Update for LHC11h pass2. From rev. 53675 • Fix for memory corruption in PWG3/vertexerHF
Changes: OCDB • #89724 upload in alien OCDB T0 time-amplitude calibration • #89777 Please put PHOS trigger object to OCDB
LHC11h Pass2 – processing strategy • CPass0 + calib train on all MB triggers (possibly chain of RAW) • > 90% job success • OCDB snapshot • Merging+OCDB update • If TPC OK • Pass 2 • OCDB snapshot #2 • QA merging • AOD merging
LHC11h Pass2 – reconstruction details • Start in inverse time order (last runs first, “LIFO”): OK • Check RAW data chain for CPass0 (with MB trigger): problem in the file merging, under investigation (high memory consumption: Rev-16 cannot merge output of Rev-18, exhausted disk space: big log files) • Exercise the full production setup on runs from ‘grey area’ • Need list of runs to process, QA filtered ‘bad runs’ • Run with TPC pools: OK • Work on a local raw file: OK • Use OCDB snapshot: OK • Keep only the rec. points for the current event: OK • Switch off QA: OK • Switch off MUON, if the memory consumption is still too high
Requested changes • Bug fixes • #87875: Big memory leak in AOD • #89822: Provide default track cuts for AOD production (LHC11h) • #90007: Request to commit/port fix in ANALYSIS/AliFileMerger.{h,cxx} • OCDB • #90005: Request to update TPC gain maps in Raw OCDB for pass 2 Pb-Pb production
Other reports (12/12/2011) • #89782 Set up test system for the event display • #89781 Per object quality instead of general quality flag • #89730 Flag "fIsMisaligned" in AliCluster set incorrectly during/after reconstruction for (at least) TRD
Changes: v5-01-Rev-16 • #89427 Porting modification for material budget issues to the Release. 53274,51506,53406,53427 • #89536 Request to port 53389: unnecessary cloning of AliESDtrack in AliCascadeVertexer • #89558 port to Release AliT0QADataMakerRec.cxx. From rev. 53403 • #89581 Request to port update of HLT TPC components to v5-01-Release. From rev. 52685,52776,52872,52874,52996,53052,53054 • #89582 Request to port merging of HLT TPC clusters at readout branch-borders to v5-01-Release. From rev. 53173,53179,53180,53226,53333,53342,53378,53380,53393,53394,53404 + 53181,53447,53448,53449,53450,53451 • #89586 Request to port new classes for automatic emulation of HLT TPC compression to v5-01-Release. From rev. 52995,53042,53110,53132
Changes: v5-01-Rev-16 • #89652: Request to update HLT/ChangeLog of v5-01-Release • #89601: Request to port AliPHOSRawDigiProducer to release. From rev. 53421 • Added histos for other triggers and flag for filling the histos. From rev. 53459 • Correct treatment of negative values. From rev. 53422 • #89681: Request to port commits 53249 and 53472 (ConfigVertexingHF_highmult.C) to the v5-01-Release
Message 05/12/2011 • Dear colleagues, As discussed during the Physics board meeting last Thursday and today at the weekly offline meeting, we will freeze v5-01-Release. Today is the last set of "general purpose" changes that was discussed and accepted. The only changes we will consider in v5-01-Release from now on are related to: • reduced memory consumption; • improved time to reconstruct events; • fixes for crashes; • fixes for serious bugs in the reconstruction that affect physics (if any); • changes needed by the online processing; • updates of OCDB objects. • We will not accept changes in these particular parts of AliRoot that were heavily modified during the recent weeks: • QA; • calibration algorithms; • analysis. • The goal is to have a stable version of AliRoot for the Christmas production by 16/12/2011. The new release v5-02-Release is expected at the end of January, so the new development will be taken from the SVN trunk.
Blockers for the Christmas production • Very high memory consumption: (3.5Gb RAM, 4.5Gb virtual): see next slide • Increased time to reconstruct one event (more central events this year) • solved by the changes in the cascade finder • Irreproducible crashes (G__exception): now reproduced, under investigation • “Cured” after resubmission => point to memory corruption problems; • significantly reduce the efficiency • seems to be related to the huge TPC.RecPoints.root file • #89651 Split HLT TPC clusters at readout branch border have impact to dca of high pt tracks: under investigation?
Very high memory consumption • Virtual memory improved by the “memory pools” of Ruben (TPC): in production • Still too high resident memory • Additional “pools”, to be committed to the trunk • tcmalloc doesn’t help on SLC5: LD_PRELOAD problem, to be repeated with linked libraries • xrootd studies: local files vs xrootd access
Very high memory consumption • Too big files with rec. points: option to keep only the current (last) event: needed if we test reconstruction of local raw file • Switch off QA and MUON reconstruction • Size of the libraries: loadlibs.C => libITSrec.so takes ~80Mb • no obvious reason • AliITSclustererV2.cxx: static Short_t pairs[1000][1000]; • Rec. points: split mode? • Option and macros and scripts to reconstruct one big chunk in several consecutive aliroot processes + merging of the ESDs, ESDfriends and tags => “last resort” • Test of event ordering with full pools and local raw+OCDB • More profiling…
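For scale, the footprint of the static array quoted above can be computed directly. The sketch assumes that `Short_t` is a 16-bit signed integer, as in ROOT's basic type definitions; `ShortStandIn`, `pairsSizeBytes` and `pairsSizeMB` are hypothetical names used so that the example compiles without ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for ROOT's Short_t (assumed here to be a 16-bit signed integer).
using ShortStandIn = std::int16_t;

constexpr std::size_t kRows = 1000;
constexpr std::size_t kCols = 1000;

// Size in bytes of an array declared as: static Short_t pairs[1000][1000];
constexpr std::size_t pairsSizeBytes() {
  return sizeof(ShortStandIn[kRows][kCols]);
}

// The same size expressed in MB (1 MB = 10^6 bytes).
constexpr double pairsSizeMB() {
  return static_cast<double>(pairsSizeBytes()) / 1.0e6;
}

static_assert(pairsSizeBytes() == 2000000, "1000 x 1000 x 2 bytes");
```

Under this assumption the array accounts for about 2 MB of the ~80 MB quoted for libITSrec.so.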
Changes: v5-01-Rev-15 • #24466: Prepare a Geant 4 production request. From rev. 53143 • #89265: VZERO equalization factor not transmitted correctly during filtering. From rev. 53303 • #88914: Very high memory consumption in reco of 2011 Pb+Pb. From rev. 53245 • #89237 Code to port in release. From rev. 51957 • #89259 ZDC code porting request. From rev. 53161 • #89270 Request for porting TRD/PWG1 code. From rev. 53069,53082,53137,53154,53155,53169,53171 • #89301 Request to port PWG3/muon filtering update (rev. 53190) • #89324 Request to port r53200 and regenerate ITSRecoParams: Switch to not create tracklets/tracks refs. in PbPb
Changes: v5-01-Rev-15 • #89333 EMCAL: Port track matching modification to improve reconstruction speed to release • #89334 AOD Calorimeters: store distance to matched tracks to clusters in AODs, port to trunk and release. From rev. 53354 • #89335 Request to port AliPHOSRawFitterv4 to release. From rev. 53114,53211,53214 • #89354 commit in STEER and port to Release AliESDTZERO with fixed const. From rev. 53341 • #89355 EMCal Port r53227 & r53230 to the release • #89357 commit in STEER and port to release T0 AOD. From rev. 53358 • #88861 EMCAL: Port Trigger QA analysis task to release
Changes: v5-01-Rev-15 • #88417: Request to commit/port fixes to ANALYSIS/AliFileMerger.{h,cxx}. From rev. 53324 • #88368 Centrality determination updates to be ported to the release. From rev. 52533,52962,53028,53066,53090,53120,53162,53239 • #88827: Request for porting updates to TOF QA task into release. From rev. 53166 • More AddTimeStamp's added for #88914 • #89298: Additional trigger class for semicentral / central -> soon replaces the old ones. From rev. 53071,53199,53241,53243,53367 • #89368: QA final merging crashes REV-14. From rev. 53356
Other reports (28/11/11) • #89170 How to propagate promptly changes in the physics selection/trigger configuration to the QA • #89189 SPD Dead to RAW OCDB • #89233 Centrality selection for all events in the QA train • #89260 Adding sum of 4 tower PMTs vs. common PMT equalization in reconstruction • #89266 Reconstruction timing • #89298 Additional trigger class for semicentral / central -> soon replaces the old ones
Other reports (21/11/11) • #88880 Error: Symbol G__exception in reconstruction • #89002 Update of QA macro • #89012 Implementation of cosmics tracker in standard reconstruction • #89021 AliEve - segmentation fault when executing macros: geom_emcal.C & emcal_all.C • #89071 cpass0 failed / ocdb update failed for 168103, 168076, 168068, 168177
Ongoing investigations • Problems with v5-01-Rev-13 on the GRID • High load on the OCDB servers, problems to access “TPC/Calib/Correction” • Not understood: overcoming the problem using a local copy of the OCDB file • #88914 Very high memory consumption in reco of 2011 Pb+Pb • Event ordering by size: implemented, tested only with local raw file • “Memory pools” implemented by Ruben: under tests in GSI • Additional syswatch points (Ruben) • Slow processing in the cascade finder • Memory trashing in FillESD and in the ESD friends • Working tcmalloc on SLC5 (used by LHCb and ATLAS via LD_PRELOAD)
Changes: OCDB • #89288 TRD: update of Chamber Status for LHC10d • #89302 EMCAL: Port updated bad maps for 2011 to alien
Changes: v5-01-Rev-14 • #87404: Implementing the CDB snapshot. From rev. 51894,51992,52568,52590,52654,52814 • #88417: Request to commit/port fixes to ANALYSIS/AliFileMerger.{h,cxx}. From rev. 53125 • #88861 EMCAL: Port Trigger QA analysis task to release • #88936 Request to port 52924 (MUON DQM) • #88966 pPb configuration. From rev. 52982 • #88980 Request to port trunk rev. 52952 to the Release (fixed warnings) • #88987 request of porting of revision 52961,52965 in the release • #89005 Porting request (update in PWG1/TRD). From rev. 51710,51733,51810,52217,52240,52954,52960,52966
Changes: v5-01-Rev-14 • #89027 Request to port a fix r/53002: Small memory leak in AliQADataMaker • #89031 PHOS trigger in PhysicsSelection. From rev. 53006,53009,53068 • #89058 Port request: change in the trigger names of CVLN. Rev. 53009 in ticket #89031 • #89068 Please port new QA wagon to the Revision. From rev. 52831,52845,52863 • #89113 Port HMPID files to the release. From rev. 52929,52942 • #89120 Request to port optimisation of memory allocation in AliHLTTPCDataCompressionDecoder to v5-01-Release. From rev. 52974,52984,52997,53053
Changes: v5-01-Rev-14 • #89123 port to Release AliT0Reconstructor.cxx with important fix in reconstruction of simulated data. From rev. 53062 • #73877: Interaction time in MC. From rev. 50709,51126 • #88605: Request to port additional VZERO QA analysis task for Pb-Pb run. From rev. 53072 • #89203: Request to port an updated version of the MeanVertexer to the Release branch. From rev. 53117 • Possibility to reconstruct events in decreasing size order
Changes: OCDB • #88991 Request to update TOF OCDB for 2010 pp 900 GeV runs
Problems with v5-01-Rev-13 on the GRID • Very high load on the OCDB servers not seen with v5-01-Rev-12 • Clean restart with Rev-13 did not help • Emergency measure: go back to v5-01-Rev-12 for the GRID production: works, but several important changes are missing • Ongoing investigations (Raffaele, Alina, Latchezar, me): • The tests during the preparation of v5-01-Rev-13 did not show any anomaly, including in the processing of RAW data from AliEn • The differences in the code show no influence on the OCDB access • The log files from Rev-13 show exactly the same OCDB access for each active detector • The log files on the build server are OK • The tar balls from the build server are OK (local test) • Possible old memory corruption that showed up now: run with Valgrind (slow) • Stand alone test with dedicated server planned for tomorrow
#88914 Very high memory consumption in reco of 2011 Pb+Pb • The problem occurred after we moved to high luminosity. The memory goes up to ~3.5 Gb resident, 4.5Gb virtual memory: the jobs are killed • Temporary solution to provide possibility for QA: reconstruct only the first 80 ev. • The memory (resident/virtual) jumps at some high-multiplicity events • Investigations (TPC: Jacek, Marian; HLT: Matthias; ITS: Annalisa; Offline: Ruben, me) • Profiling with Google performance tools • Profiling with massif • Main allocations in TPC, more details on the next slides • Suspected pile-up events are not the only reason • Possibility to reject event based on N_TPC/N_ITS clusters or T0 times: not obvious • Technical solutions • Reconstruction of events in decreasing size order (Andreas) • tcmalloc
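One way to derive a processing order for "reconstruction of events in decreasing size order" is sketched below in C++. It is only an illustration under stated assumptions: the function name `decreasingSizeOrder` is hypothetical, the event sizes are assumed to be known in advance (for example from the raw-data headers), and the actual AliRoot code may work differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the event indices sorted so that the largest event comes first.
// Events of equal size keep their original relative order (stable sort).
std::vector<std::size_t> decreasingSizeOrder(
    const std::vector<std::size_t>& eventSizes) {
  std::vector<std::size_t> order(eventSizes.size());
  std::iota(order.begin(), order.end(), std::size_t{0});  // 0, 1, 2, ...
  std::stable_sort(order.begin(), order.end(),
                   [&eventSizes](std::size_t a, std::size_t b) {
                     return eventSizes[a] > eventSizes[b];
                   });
  return order;
}
```

The reconstruction loop would then iterate over the returned indices instead of over the events in file order; the input file itself is left untouched.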
Stop saving the non-calibrated ESDs • Remove the non-merged AODs/QA output once the merging is done • Reduce the AliESDfriends to 1% (as in LHC10h) • Note: ESDs (and ancillaries) – 10% of RAW, AODs (and ancillaries) – 1% of RAW
Other production issues • Memory: at the limit, 3Gb RAM, 4Gb virtual. • G__exception that is cured after the resubmission of the failed job: memory corruption
#88626 DQM related problems • #72148 Recuperate thresholds for DQM into SHUTTLE • #84558 Memory leak in ACORDE DQM agent • #84566 Memory leak in HMPID DQM agent • #85143 Memory leak in PHOS DQM agent • #85149 Memory leak in TRI DQM agent • #85151 Memory leak in T0 DQM agent • #85152 Memory leak in TRD DQM agent • #85155 Memory leak in SSD DQM agent • #85175 Memory leak in EMCAL DQM agent
#88626 DQM related problems • #87363 DQM FXS implementation in the Shuttle • #87460 Memory leak in DAQ DQM agent • #88169 SPD Vertex DQM plots absent • #88173 DQM agent T00QAshifter crashes in technical runs • #88175 AMORE GUI unstable in v1.44 • #88210 Crash in the AmoreDA called from the TPC DAs • #88574 Problem with event display • #88576 Port the latest version of aliroot to DQM • #88619 amoreHLT crashes • #88622 amoreQA unstable
#88626 DQM related problems • #88655 Request to port rev 52689 (MTR DQM bug fix) • #88661 custom amore agent SSD01 crash • #88822 AMORE GUI crashes due to VertexXY object produced by SPD DA
Changes: v5-01-Rev-13 • #87623: Request to port a new V0 DA code to the release. From rev. 52803 • #88251 Introduction of the Pb-Pb trigger classes into the phys sel. From rev. 52754,52755 • #88605 Request to port additional VZERO QA analysis task for Pb-Pb run. From rev. 52744,52768, 52801, 52802 • #88698 makeOCDB.C : TRD update adjustment of the validate threshold for the chamber without data. From rev. 52722 • #88763 Porting request (momentum dependent cos(PA) cut for V0). From rev. 52750,52751,52759 • #88779 EMCAL: Fix mem. leak in QAChecker port to release. From rev. 52772,52777,52779 • #88793 Port AddTaskTPCCalib.C to v5-01-Release. From rev. 52780
Changes: v5-01-Rev-13 • #88798: Request to port fixes for: Beam type convention in the GRP. From rev. 52824,52836,52839,52842,52843,52862,52871 • #88807 Fix for AliTriggerPFProtection. From rev. 52778,52786,52797 • #88818 Port changes to AliRoot release - 5.01(AddTaskTPCCalib.C: low flux to high flux ). From rev. 52800 • #88820 new centrality OADB to port in release. From rev. 52781,52784,52804 • #88821 EMCAL: Port setting of clusterizer v2 in reconstruction. From rev. 50740,50741,50748,50749,50750,50753 • #88824 Request to port fix in QADataMaker: r52810 • #88827 Request for porting updates to TOF QA task into release. From rev. 52761,52811
Changes: v5-01-Rev-13 • #88829 Fix in counting processed events. From rev. 52819 • #88837 TPC port to Release request: AliTPCcalibTime.cxx. From rev. 52748,52793,52854 • #88847 Port revisions 52826 & 52827 to the release • #88849 Request to port VZERO event plane implementation. From rev. 51446,51730,52829,52917 • #88852 Request for porting rev. 52798,52821 to the release • #88862 Request: Port update of TRD ExB calibration code to release. From rev. 52822,52847,52857,52875 • #88864 Please port 52850 to release - On-line DQM scaling of histograms
Changes: v5-01-Rev-13 • #88865 Please port 52216,52849 to release - recognise p-A collisions • #88868 ZDC request to port code to the release. From rev. 52565,52687,52818,52852 • #88876 Request for TRD : reduce verbosity in AliTRDclusterizer. From rev. 51395 • #88926: EMCAL: Port QA reference file to release. From rev. 52912
Changes: v5-01-Rev-12 • #88679: Centrality task crashes in the release 5-01-Rev-11. From rev. 52716 • #88674: Porting request for 52710 (HLT event display) • #88686: Porting request for 52714 • #88681: Request to port fix for AliMagF (parser for p-A,A-p beam types): r52711. Created problems in the phys. selection • Technical fix from rev. 52709 (treatment of cosmic reco params)
Changes: v5-01-Rev-11 • #88178 Request to port bugfix in the AddTaskPHOSPbPb.C. From rev. 52362,52363 • #88197 Adding TPC cluster map for clusters used in fit. From rev. 52442,52443+52241,52262 • #88206 Request to port 52390 (MUON simulation w/ raw OCDB) • #88228 Request to port fix for proof reco. From rev. 52393 • #88242 Request to port bugfix in the AliAnalysisTaskPHOSPbPbQA.cxx. From rev. 52400,52427 • #88243 Vertex Diamond DA committing and porting request. From rev. 52357,52394,52401 • #88255 Request to port trunk rev. 52402 to the Release (use BPTX clock-shift in TOF calib)
Changes: v5-01-Rev-11 • #88261 Port r52410 to the release (disable EMCAL trigger emulator) • #88293 Request to port 52418 to release: protection against ill-formed QA cloning request • #88325 Request to port rawstream update to v5-01-Release. From rev. 52375 • #88329 port to Release AliT0QADataMakerRec.cxx and AliT0CalibTimeEq.cxx. From rev. 52212,52433,52434 • #88331 Request to port updates allowing the reconstruction of SPD+MUON. From rev. 52425 • #88333 EMCAL: Port L1 QA code for DQM to release. From rev. 52435,52437 • #88334 port to Release code for new T0 reconstruction scheme. From rev. 51643,52436,52479
Changes: v5-01-Rev-11 • #88344 Request to commit and port changes - TPCdedx info. 52445,52481 • #88350 Request to port 52451 to release: new macro to add in-reco analysis train • #88353 Request to port 52453 to release: fix in resetting cloned histos • #88354 Porting request for vertex QA. From rev. 52238,52450 • #88358 Request: Port update of TRD calibration code to release. From rev. 52361,52364,52366,52367,52389,52414,52432,52519 • #88368 Centrality determination updates to be ported to the release. From rev. 51433,52348,52391,52455 • #87900: Request to port changes in AliTOFQADataMakerRec code into release. From rev. 52454,52456
Changes: v5-01-Rev-11 • #88394: Port TPC changes to 5-01-Release - bug fix. From rev. 52500,52505 • #88406: Port TPC changes to 5-01 revisions - AliTPCcalibTimeGain.cxx. From rev. 52511 • #88417: Request to commit/port fixes to ANALYSIS/AliFileMerger. From rev. 52528. • #88455: Request to port commit rev=52408 to the release branch • #88462: TPC request: Update of the AliTPCPreprocesorOffline. From rev. 52517 • #88484 Port changes requested in task #23160. From rev. 52512 • #88477: port to Release AliT0CalibTimeEq, AliT0CalibSeasonTimeShift. From rev. 52543
Changes: v5-01-Rev-11 • #88382: request to port PIDqa related code to the release. From rev. 51070,51614,51739,52215+51790,52209,52213,52230,52317,52349,52369,52384,52407 • #88467: Request to update two VZERO OCDB objects for the forthcoming Pb-Pb run. From rev. 52535 • #88468: Request to commit and include in the new tag the changes in AMPT. From rev. 50767,51229,52570 • #88488: Request for TRD : code for PbPb 2011. From rev. 51823,51834,51836,51902,51949,51960,51969,51975,52163,52330,52340,52552,52553,52555+52554 • #88251: Introduction of the Pb-Pb trigger classes into the phys sel. From rev. 50745,50859,51220,51486,52234,52249,52250,52319,52347,52423,52516,52522
Changes: v5-01-Rev-11 • #88431: Memory corruption related to unpacking of HLT GlobalTriggerDecision during reconstruction. From rev. 52563 • #88468 Request to commit and include in the new tag the changes in AMPT. From rev. 52646 • #88537 Request to commit/port fix in EVE/alice-macros. From rev. 52648 • #88541 Request to port to release Muon QA analysis train from rev. 52059,52575,52576 • #88543 EMCAL: Port better description of resolution to release. Patch from rev. 52592 • #88549 Request to port bugfix in the AliAnalysisTaskPHOSPbPbQA.cxx to the Release. From rev. 52585
Changes: v5-01-Rev-11 • #88600 Request to port CPass0 code in v5-01-Release. From rev. 52467,52514,52515,52561,52564,52626,52634 • #88556 Request to port rev52603 into release (TOF matching code) • #88565 port to Release new T0Physda.cxx and AliT0Reconstructor.cxx. From rev. 52608,52610,52611,52633 • #82052: Request to port r50024 to v4-20-Release: Hidden OCDB call in Muon HLT code • #88169: SPD Vertex DQM plots absent. From rev. 52604,52605
Changes: v5-01-Rev-11 • #88573 Request to port to release TPC code - fix item # 88519. From rev. 52596 • #88583 SDD preprocessor updated on the trunk, commits 52613 and 52614 -> port to the release • #88587 porting request for cPass0 mean vertex code for PbPb processing. From rev. 52627 • #88591 Request to commit/port fix in AliRawReaderChain (setter instead of hardwired search path). From rev. 52667 • #88592 Request to port commit 52628 to the v5-01-Release • #88593 Request to port to release TPC code - fix task #23702. From rev. 52632