
ATLAS Trigger & Data Acquisition system: concept & architecture





Presentation Transcript


  1. ~25 minimum-bias events (>2k particles) every 25 ns. Higgs → 2e+2μ: O(1/hr). ATLAS Trigger & Data Acquisition system: concept & architecture. Kostas KORDAS, INFN – Frascati. XI Bruno Touschek spring school, Frascati, 19 May 2006

  2. ATLAS Trigger & DAQ: the need (1). LHC vs. TeVatron. Low luminosity = 10 fb⁻¹/yr. The total cross section is ~100 mb, while the very interesting physics is at ~1 nb to ~1 pb, i.e., a ratio of 1:10⁸ to 1:10¹¹. ATLAS TDAQ concept & architecture - Kostas KORDAS

  3. ATLAS Trigger & DAQ: the need (2). In pp collisions at 40 MHz, the full info per event is ~1.6 MB every 25 ns, i.e., ~60 TB/s. We need high luminosity to get to observe the very interesting events, and on-line selection to write mostly interesting events to disk: ~300 MB/s at ~200 Hz.
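The slide's numbers can be checked with a quick back-of-the-envelope calculation (all input values are taken from the slide; the script itself is only an illustration):

```python
# Raw data rate and required on-line rejection, using the slide's figures.
event_size_B = 1.6e6                        # ~1.6 MB full event
crossing_rate_Hz = 40e6                     # 25 ns bunch spacing -> 40 MHz
raw_Bps = event_size_B * crossing_rate_Hz   # ~6.4e13 B/s, i.e. ~60 TB/s
disk_Bps = 300e6                            # ~300 MB/s to mass storage
rejection = raw_Bps / disk_Bps              # overall reduction TDAQ must deliver
print(f"raw: {raw_Bps / 1e12:.0f} TB/s, need 1-in-{rejection:.0f} rejection")
```

The same factor of ~2×10⁵ appears in the rates: 40 MHz of crossings down to ~200 Hz written to disk.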

  4. ATLAS Trigger & DAQ: architecture. [Diagram: Trigger and DAQ columns.] Level 1 (Calo, MuTrCh, other detectors): 40 MHz input, 2.5 μs latency, L1 accept at 100 kHz; detector read-out flows via the RODs into the Read-Out Systems (ROS, hosting the ROBs) at 160 GB/s. RoI dataflow: ROIB → L2SV → Level 2 (~10 ms), which sends RoI requests to the ROSs and reads back RoI data (~2%, ~3 GB/s) over the LVL2 network (L2N) to the L2 processors (L2Ps). L2 accept (~3.5 kHz) → DFM and Event Builder network (EBN, ~3+6 GB/s) → SFIs → Event Filter (EFN, EFPs, ~sec per event) → SFOs; EF accept (~0.2 kHz) writes ~300 MB/s (full info per event ~1.6 MB). Levels 2 and 3 together form the High Level Trigger.

  5. Interactions every 25 ns: in 25 ns particles travel 7.5 m. Cable length ~100 meters: in 25 ns signals travel 5 m. From the detector (22 m × 44 m, weight 7000 t) into the Level-1 Trigger: total Level-1 latency = 2.5 μs (TOF + cables + processing + distribution). For those 2.5 μs, all signals must be stored in electronic front-end (FE) pipelines.
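That 2.5 μs latency fixes the depth of the front-end pipelines; a one-line check using the slide's values:

```python
# Front-end pipeline depth implied by the slide's numbers.
bunch_crossing_ns = 25.0     # one interaction batch every 25 ns
lvl1_latency_ns = 2500.0     # total Level-1 latency of 2.5 us
depth = lvl1_latency_ns / bunch_crossing_ns  # crossings buffered while LVL1 decides
print(depth)  # 100.0 -> every channel holds 100 bunch crossings
```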

  6. Upon LVL1 accept: buffer data & get RoIs. On L1 accept (100 kHz), data flow from the detector read-out through the Read-Out Drivers (RODs), over the Read-Out Links (S-LINK), into the Read-Out Buffers (ROBs) of the Read-Out Systems (ROSs), at 160 GB/s total; in parallel, the Region of Interest Builder (ROIB) collects the LVL1 RoI information.

  7. LVL1 finds Regions of Interest for the next levels, as η-φ addresses. In this example: 4 Regions of Interest: 2 muons, 2 electrons.

  8. Upon LVL1 accept: buffer data & get RoIs (cont.) • On average, LVL1 finds ~2 Regions of Interest (in η-φ) per event • Data in the RoIs is a few % of the Level-1 throughput

  9. LVL2: work with the “interesting” ROSs/ROBs. The LVL2 Supervisor (L2SV) assigns events to the LVL2 Processing Units (L2PUs), which send RoI requests to the Read-Out Systems and receive the RoI data (~2% of the event, ~3 GB/s) over the LVL2 network (L2N). For each detector there is a simple correspondence: η-φ Region of Interest → ROB(s) • LVL2 Processing Units: for each RoI, the list of ROBs with the corresponding data from each detector is quickly identified. RoI-based Level-2 trigger: a much smaller read-out network … at the cost of higher control traffic.
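The η-φ → ROB correspondence can be sketched as a simple lookup. This is illustrative only, not ATLAS code: the detector name, the granularity (coarse η bands and φ quadrants) and the ROB naming are invented.

```python
# Hypothetical lookup table: for one detector, each (eta band, phi quadrant)
# cell maps to the Read-Out Buffer holding that region's data.
ROB_MAP = {
    "calo": {(e, p): f"calo-rob-{e}{p}" for e in range(4) for p in range(4)},
}

def robs_for_roi(detector, eta_band, phi_quadrant):
    """Return the ROB(s) whose data LVL2 must request for this RoI."""
    return [ROB_MAP[detector][(eta_band, phi_quadrant)]]

# An L2PU handling an RoI at (eta band 2, phi quadrant 1) requests only:
print(robs_for_roi("calo", 2, 1))  # ['calo-rob-21']
```

The point of the slide is exactly this: the lookup is cheap, and only the few listed ROBs are contacted, instead of shipping the whole event through a full read-out network.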

  10. After LVL2: build full events. On L2 accept (~3.5 kHz), the Dataflow Manager (DFM) assigns the event to a Sub-Farm Input (SFI) of the Event Builder, which collects the fragments from all Read-Out Systems over the Event Building Network (EBN); total Event Builder input is ~3+6 GB/s.
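The event-building step can be sketched as follows. This is a toy model, not the ATLAS software: the class name, method names and the 3-ROS count are invented.

```python
N_ROS = 3  # toy value; the real system has ~150 Read-Out Systems

class ToySFI:
    """Collects one fragment per ROS for each LVL1 event ID."""
    def __init__(self):
        self.partial = {}  # event_id -> {ros_id: fragment}

    def add_fragment(self, event_id, ros_id, fragment):
        frags = self.partial.setdefault(event_id, {})
        frags[ros_id] = fragment
        if len(frags) == N_ROS:                # every ROS has contributed:
            return self.partial.pop(event_id)  # the full event is built
        return None                            # still waiting for fragments

sfi = ToySFI()
assert sfi.add_fragment(42, 0, b"a") is None   # incomplete
assert sfi.add_fragment(42, 1, b"b") is None   # incomplete
print(sfi.add_fragment(42, 2, b"c"))  # {0: b'a', 1: b'b', 2: b'c'}
```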

  11. LVL3: the Event Filter deals with full-event info. Events built by the SFIs (~3.5 kHz) are served over the Event Filter Network (EFN) to a farm of Event Filter Processors (EFPs), which take ~sec per event and accept ~200 Hz.

  12. From the Event Filter to local (TDAQ) storage: events accepted by the Event Filter (~0.2 kHz) go from the Event Filter Processors to the Sub-Farm Outputs (SFOs), which write ~300 MB/s to local storage.

  13. TDAQ, High Level Trigger & DataFlow. [Same architecture diagram as slide 4, now highlighting the split: Level 2 and the Event Filter form the High Level Trigger; the read-out, event-building and output chain forms the Dataflow.]

  14. TDAQ, High Level Trigger & DataFlow (cont.) High Level Trigger (HLT) • Algorithms developed offline (with the HLT in mind) • HLT infrastructure (the TDAQ job): “steer” the order of algorithm execution • Alternate steps of “feature extraction” & “hypothesis testing” → fast rejection (min. CPU) • Reconstruction in Regions of Interest → min. processing time & network resources
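The steering pattern in the bullets above — alternating feature extraction with hypothesis testing so that most events are rejected after the cheapest possible step — can be sketched like this (the step names and thresholds are invented for illustration):

```python
def run_chain(event, steps):
    """Run ordered (feature-extraction, hypothesis-test) pairs; reject early."""
    for extract, test in steps:
        feature = extract(event)   # compute only what this step needs
        if not test(feature):      # failed hypothesis: stop, spend no more CPU
            return False
    return True                    # all hypotheses passed: accept the event

steps = [
    (lambda ev: ev["calo_et"],  lambda et: et > 20.0),  # cheap calorimeter cut first
    (lambda ev: ev["track_pt"], lambda pt: pt > 10.0),  # costlier tracking only if needed
]
print(run_chain({"calo_et": 25.0, "track_pt": 12.0}, steps))  # True
print(run_chain({"calo_et": 5.0,  "track_pt": 99.0}, steps))  # False (rejected at step 1)
```

Ordering the cheap, high-rejection steps first is what keeps the average CPU cost per event low.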

  15. TDAQ, High Level Trigger & DataFlow (cont.) DataFlow • Buffer & serve data to the HLT • Act according to the HLT result; otherwise the HLT is a “black box” which gives answers • Software framework based on C++ and the STL

  16. High Level Trigger & DataFlow run on PCs (Linux): ~150 ROS nodes, ~500 LVL2 nodes, ~100 Event Builder (SFI) nodes, ~1600 Event Filter nodes; plus infrastructure: control, communication, databases.

  17. TDAQ at the ATLAS site. UX15: the ATLAS detector and the first-level trigger. USA15: ~150 Read-Out Subsystem (ROS) PCs, fed by the VME Read-Out Drivers (RODs) over 1600 Read-Out Links, plus the RoI Builder and the Timing Trigger Control (TTC); data of events accepted by the first-level trigger are pushed at ≤100 kHz in 1600 fragments of ~1 kByte each. SDX1 (dual-CPU nodes): the second-level trigger (LVL2 farm, ~500), the Event Builder Sub-Farm Inputs (SFIs, ~100), the Event Filter (EF, ~1600) and the Local Storage Sub-Farm Outputs (SFOs, ~30), plus the pROS (stores LVL2 output), the DataFlow Manager, the LVL2 Supervisor and Gigabit Ethernet network switches. Event data are pulled (event data requests, requested event data, delete commands): partial events @ ≤100 kHz, full events @ ~3 kHz; event rate ~200 Hz to data storage at the CERN computer centre. A “pre-series” system, ~10% of the final TDAQ, is in place.

  18. Example of worries in such a system: CPU power. • At the Technical Design Report we assumed: • 100 kHz LVL1 accept rate • 500 dual-CPU PCs for LVL2 • 8 GHz per CPU at LVL2 • So: • each L2PU does 100 Hz • 10 ms average latency per event in each L2PU • 8 GHz per CPU will not come • But dual-core, dual-CPU PCs show scaling! Test with an AMD dual-core, dual-CPU machine @ 1.8 GHz, 4 GB total: ROS preloaded with muon events, running muFast @ LVL2. We should reach the necessary performance per PC at the cost of higher memory needs & latency (a shared-memory model would be better here).
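The TDR arithmetic in the bullets above, written out (the inputs are the slide's values, assuming one L2PU per CPU):

```python
lvl1_rate_hz = 100e3          # LVL1 accept rate
n_l2pu = 500 * 2              # 500 dual-CPU PCs, one L2PU per CPU (assumption)
rate_per_l2pu = lvl1_rate_hz / n_l2pu   # events each L2PU must handle
latency_s = 1.0 / rate_per_l2pu         # average time budget per event
print(rate_per_l2pu, latency_s)  # 100.0 0.01 -> 100 Hz and 10 ms per L2PU
```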

  19. Cosmics in ATLAS, in the pit. Last Sept: cosmics in the Tile hadronic calorimeter, brought via the pre-series (monitoring algorithms). This July: cosmic run with LAr EM + Tile hadronic calorimeters (+ muon detectors?).

  20. Summary • Triggering at hadron colliders: • Need high luminosity to get rare events • Cannot write all data to disk • No sense otherwise: offline, we would be wasting our time looking for a needle in the haystack! • ATLAS TDAQ: • 3-level trigger hierarchy • Use Regions of Interest from the previous level: small data movement • Feature extraction + hypothesis testing: fast rejection → min. CPU power • TDAQ will be ready in time for LHC data taking • We are in the installation phase of the system • Cosmic run with the central calorimeters (+ muon system?) this summer

  21. Thank you

  22. Read-Out Systems: 150 PCs with special cards. A ROS unit contains 12 Read-Out Buffers; 150 units are needed for ATLAS (~1600 ROBs). A ROS unit is implemented as a 3.4 GHz PC housing 4 custom PCI-X cards (ROBINs). Not all ROSs are equal in the rate of data requests: ROD→ROS re-mapping can reduce the requirements on the busiest (“hottest”) ROS. Measurements on real ROS hardware show that the performance of the final ROS (PC + ROBIN) is above requirements in both the low- and high-luminosity operating regions [plot: LVL2 accept rate (% of input) vs. LVL1 accept rate (kHz), with the “hottest” ROS from the paper model marked]. 12 ROSs are in place, more arriving. Note: we also have the ability to access individual ROBs if wanted/needed.

  23. Event Building needs • Throughput requirements: • 100 kHz LVL1 accept rate • 3.5% LVL2 accept rate → 3.5 kHz event building • 1.6 MB event size → 3.5 × 1.6 = 5600 MB/s total input • Network limited (CPUs are fast): • event building uses 60–70% of the Gbit network → ~70 MB/s into each Event Building node (SFI). So we need: 5600 MB/s into the EB system / (70 MB/s into each EB node) → ~80 SFIs for full ATLAS. When an SFI also serves the EF, its throughput decreases by ~20% → we actually need 80/0.80 = 100 SFIs. 6 prototypes are in place and PCs are being evaluated now; expect big event-building needs from day 1: >50 PCs by the end of the year.
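The bullet points above amount to the following arithmetic (all values from the slide):

```python
lvl1_hz = 100e3
eb_hz = lvl1_hz * 0.035            # 3.5% LVL2 accept -> ~3.5 kHz into the EB
event_MB = 1.6
input_MBps = eb_hz * event_MB      # ~5600 MB/s total Event Builder input
per_sfi_MBps = 70                  # ~70 MB/s into each SFI (60-70% of GbE)
n_sfi = input_MBps / per_sfi_MBps  # ~80 SFIs for full ATLAS
n_sfi_serving_ef = n_sfi / 0.80    # ~20% loss when also serving the EF
print(round(input_MBps), round(n_sfi), round(n_sfi_serving_ef))  # 5600 80 100
```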

  24. Tests of LVL2 algorithms & RoI collection. Setup: emulated ROSs, an L2SV, an L2PU, a pROS and a DFM, plus 1 Online Server and 1 MySQL database server. Di-jet, μ and e simulated events were preloaded on the ROSs, with the RoI info on the L2SV. Findings: 1) The majority of events are rejected fast (the electron sample is pre-selected). 2) Processing takes ~all of the latency: the RoI data-collection time is small. 3) The RoI data request per event is small. Note: neither the trigger menu nor the data files are a representative mix of ATLAS (that is the aim for a late-2006 milestone).

  25. Scalability of the LVL2 system • The L2SV gets RoI info from the RoIB • It assigns an L2PU to work on the event • It load-balances its L2PU sub-farm • Can this scheme cope with the LVL1 rate? • Test: RoI info preloaded into the RoIB, which triggers the TDAQ chain, emulating LVL1 • The LVL2 system is able to sustain the LVL1 input rate: • a 1-L2SV system for a LVL1 rate ~35 kHz • a 2-L2SV system for a LVL1 rate ~70 kHz (50%–50% sharing) • the rate per L2SV is stable within 1.5%. ATLAS will have a handful of L2SVs → it can easily manage a 100 kHz LVL1 rate.
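A quick check of that scaling claim, assuming rates add linearly across supervisors as the 2-L2SV measurement suggests (the ceiling step is my addition):

```python
import math

per_l2sv_hz = 35e3        # one L2SV sustains a ~35 kHz LVL1 rate (measured)
target_hz = 100e3         # design LVL1 accept rate
n_l2sv = math.ceil(target_hz / per_l2sv_hz)   # assuming linear scaling
print(n_l2sv)  # 3 -- "a handful" of L2SVs indeed covers 100 kHz
```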

  26. EF performance scales with farm size. With a dummy algorithm (always accept, but with a fixed delay) and an event size of 1 MB, the system is initially CPU limited but eventually bandwidth limited. • Test of the e/γ & μ selection algorithms • HLT algorithms seeded by the L2Result • simulated e & μ events pre-loaded on an SFI emulator serving the EF farm • Results here are for muons: running the muon algorithms, the rate scales with EF farm size (still CPU limited with 9 nodes). • The previous Event Filter I/O protocol limited the rate for small event sizes (e.g., partially built events); this is changed in the current TDAQ software release.

  27. ATLAS Trigger & DAQ: philosophy (latencies and rates). LVL1: hardware based (FPGA, ASIC), calo/muon at coarse granularity; pipeline memories hold the 40 MHz input (Muon, Calo, Inner) for the 2.5 μs latency; accepts ~100 kHz into the Read-Out Drivers and the Read-Out Subsystems hosting the Read-Out Buffers. LVL2: software (specialised algorithms), uses the LVL1 Regions of Interest, all sub-detectors at full granularity, emphasis on early rejection; ~10 ms, accepts ~3 kHz into the event-builder cluster. Event Filter: offline algorithms seeded by the LVL2 result, works with the full event and full calibration/alignment info; ~1 s per event in the EF farm, accepts ~200 Hz to local storage at ~300 MB/s. LVL2 + Event Filter = High Level Trigger.

  28. Data Flow and Message Passing

  29. A Data Collection application example: the Event Builder (SFI). Activities within the SFI: Event Input (receives event fragments from the ROSs & pROS), Request (sends data requests), Assignment (receives event assignments from the Data Flow Manager), Event Assembly (builds the full event from the fragments), Event Handler (serves the built event to the trigger, i.e., the Event Filter), and Event Sampler (event monitoring).
