
New LHCb Trigger and DAQ Strategy: Gigabit-Ethernet-based System Architecture

This paper presents the system architecture of the new LHCb Trigger and Data Acquisition (DAQ) strategy, which is based on Gigabit Ethernet technology. It covers the two software trigger levels, the front-end electronics, the event-building network, the CPU farm, and queuing latencies, and discusses the challenges of context-switching latency and scheduling on a multi-tasking OS. The architecture is scalable and built from commercial components.


Presentation Transcript


  1. The New LHCb Trigger and DAQ Strategy: A System Architecture based on Gigabit-Ethernet
  RT2003, Montreal
  Niko Neufeld, CERN-EP & Univ. de Lausanne, for the LHCb Collaboration

  2. LHCb Trigger

  3. Two Software Trigger Levels
  • Both run on commercial PCs
  • Level-1
    • uses a reduced data set: only part of the sub-detectors (mostly the vertex detector and some tracking), with limited-precision data
    • has a limited latency, because the data must be buffered in the front-end electronics while it decides
    • reduces the event rate from 1.1 MHz to 40 kHz by selecting events with displaced secondary vertices
  • High Level Trigger (HLT)
    • uses all detector information
    • reduces the event rate from 40 kHz to 200 Hz for permanent storage

  4. Features
  • Two data streams to handle:
    • Level-1 trigger: 4.8 kB @ 1.1 MHz
    • High Level Trigger: 38 kB @ 40 kHz
  • Fully built from commercial components
  • (Gigabit) Ethernet throughout
  • Push-through protocol, no re-transmissions
  • Centralized flow control
  • Latency control for Level-1 at several stages
  • Scalable by adding CPUs and/or switch ports
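As a rough cross-check of the two streams, a back-of-the-envelope payload calculation (a minimal Python sketch using only the fragment sizes and rates quoted above; the architecture slide quotes somewhat higher bandwidths, presumably because transport overhead and headroom are included):

    # Aggregate payload throughput of the two streams (sizes and rates from the slide above)
    streams = {
        "Level-1": {"event_size_kB": 4.8, "rate_Hz": 1.1e6},
        "HLT":     {"event_size_kB": 38.0, "rate_Hz": 40e3},
    }
    for name, s in streams.items():
        gb_per_s = s["event_size_kB"] * 1e3 * s["rate_Hz"] / 1e9
        print(f"{name}: {gb_per_s:.1f} GB/s payload")   # Level-1: 5.3 GB/s, HLT: 1.5 GB/s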

  5. Architecture
  [Dataflow diagram, Gigabit Ethernet throughout, with Level-1, HLT and mixed traffic shown as separate paths: the front-end electronics (FE, plus the TRM) send the HLT stream over 349 links at 40 kHz (2.3 GB/s) and the Level-1 stream over 126-240 links at 44 kHz (5.5-11.0 GB/s); a multiplexing layer of 31 switches (HLT) and 62-83 switches (Level-1) concentrates these onto 64-157 links at 88 kHz plus 33 links at 1.7 GB/s into the Readout Network; 90-153 links (5.5-10 GB/s) lead to 90-153 Sub-farm Controllers (SFCs) in front of a CPU farm of ~1400 CPUs; the L1-Decision Sorter, the TFC system and the storage system are also connected.]

  6. Front-end electronics
  • Separation of the Level-1 and HLT paths: two Ethernet links into the network
  • Data must be packaged into IPv4 packets
  • Must be able to pack several events into “super-events” to reduce the packet rate into the network (see the sketch below)
  • Must provide sufficient buffer space to allow the Level-1 trigger algorithm to decide (53 ms total)
  • Must assign the destination, which is distributed centrally (with the trigger system)
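A minimal sketch of the super-event idea (the packet layout and names are illustrative assumptions, not the actual front-end format): concatenating a fixed number of consecutive event fragments into one IPv4 datagram divides the packet rate into the network by the packing factor.

    PACKING_FACTOR = 25          # events per "super-event" (chosen parameter, see slide 13)
    L1_INPUT_RATE_HZ = 1.1e6     # Level-1 input rate

    def pack_super_event(fragments):
        """Concatenate (event_id, payload) fragments into one datagram payload.
        A 2-byte header records how many events are inside, so the receiving
        SFC can split the super-event up again."""
        header = len(fragments).to_bytes(2, "big")
        body = b"".join(eid.to_bytes(4, "big") + data for eid, data in fragments)
        return header + body

    print(f"packet rate into the network: {L1_INPUT_RATE_HZ / PACKING_FACTOR / 1e3:.0f} kHz")   # 44 kHz

The 44 kHz Level-1 rate quoted on the architecture slide is exactly 1.1 MHz divided by this packing factor of 25.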

  7. Event Building Network
  • Built from Gigabit Ethernet switches (1000BaseT, a.k.a. UTP copper)
  • Try to optimise the link load (~80%, or 100 MB/s) by using (cheap) office switches to multiplex the links from the front-end
  • Need a large core switch with ~100 x 100 ports; it can be built from smaller elements
  • Need switches with sufficient buffering and good internal congestion control
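The link count follows directly from the target load: at roughly 100 MB/s of useful payload per Gigabit Ethernet link (~80% load), the minimum number of links is the aggregate throughput divided by that figure. A sketch (the counts on the architecture slide are larger because they include headroom and partitioning, which are not modelled here):

    import math

    LINK_PAYLOAD_MB_S = 100.0     # ~80% of a Gigabit Ethernet link

    def links_needed(throughput_GB_s):
        """Minimum number of GbE links for a given aggregate throughput."""
        return math.ceil(throughput_GB_s * 1000.0 / LINK_PAYLOAD_MB_S)

    print(links_needed(5.5))    # 55 links for 5.5 GB/s
    print(links_needed(10.0))   # 100 links for 10 GB/s
    print(links_needed(2.3))    # 23 links for the 2.3 GB/s HLT stream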

  8. CPU farm
  • More than 1000 PCs, partitioned into sub-farms, each consisting of
    • a Sub-farm Controller (SFC), acting as the gateway into the readout network
    • a number of worker CPUs, known only to their sub-farm
  • The SFC
    • builds events from the “super-event” fragments it receives
    • distributes them among its workers in a load-balancing manner
    • receives the trigger decisions from the workers and
      • passes them on to permanent storage (HLT events)
      • passes them to the decision sorter (Level-1 events)
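A toy model of the SFC logic described above (hypothetical class and method names; the real SFC software is not part of the slides): fragments are collected until a super-event is complete, then it is handed to the least-loaded worker.

    from collections import defaultdict

    class SubFarmController:
        """Toy SFC: event building plus load-balanced dispatch to workers."""

        def __init__(self, n_sources, workers):
            self.n_sources = n_sources               # front-end links feeding this SFC
            self.in_flight = {w: 0 for w in workers}
            self.partial = defaultdict(dict)         # super-event id -> {source: fragment}

        def receive(self, sev_id, source, fragment):
            self.partial[sev_id][source] = fragment
            if len(self.partial[sev_id]) == self.n_sources:   # event building complete
                self.dispatch(sev_id, self.partial.pop(sev_id))

        def dispatch(self, sev_id, fragments):
            worker = min(self.in_flight, key=self.in_flight.get)   # least-loaded worker
            self.in_flight[worker] += 1
            print(f"super-event {sev_id} -> {worker}")

        def decision(self, worker, accepted):
            # Worker returns its trigger decision; it would be forwarded to storage
            # (HLT) or to the L1-Decision Sorter (Level-1); the worker is free again.
            self.in_flight[worker] -= 1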

  9. Latencies
  [The architecture diagram again (front-end electronics, multiplexing layer, Readout Network, TFC system, L1-Decision Sorter, SFCs, CPU farm), annotated with where Level-1 latency accumulates:
  • queuing latencies in the network (switch buffers),
  • queuing in the SFC (“all nodes are busy with a L1 event”),
  • reception of the event and invocation of the trigger algorithm on a worker CPU.]

  10. Latencies due to queuing in the network or the farm
  • Latencies in the network can only be estimated from simulation, because they arise from competition between large packets for the same output port (the forwarding latency of a packet in a switch is negligible)
  • Latencies in the sub-farm are due to statistical fluctuations in the Level-1 processing time
  • Simulation using simulated raw data shows that the fraction of events which run into the Level-1 time-out because of this is very small (< 10⁻⁴)
  • It goes down as the sub-farms grow in number of workers (see the toy model below)
  • This can and will be measured
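The last two bullets can be illustrated with a toy Monte Carlo (the rate, processing-time distribution and latency budget below are purely illustrative assumptions, not the detailed simulation with simulated raw data mentioned above): events arrive at a sub-farm, each waits for the first free worker, and one counts how often waiting plus processing exceeds the budget.

    import heapq
    import random

    def timeout_fraction(n_events=200_000, n_workers=20, rate_hz=1_000.0,
                         mean_proc_ms=10.0, sigma_ms=5.0, budget_ms=50.0, seed=1):
        """Fraction of events whose waiting + processing time exceeds the budget."""
        random.seed(seed)
        free_at = [0.0] * n_workers           # time (ms) at which each worker is free
        heapq.heapify(free_at)
        t, late = 0.0, 0
        for _ in range(n_events):
            t += random.expovariate(rate_hz) * 1000.0      # next arrival, in ms
            start = max(t, heapq.heappop(free_at))         # wait for a free worker
            finish = start + max(0.0, random.gauss(mean_proc_ms, sigma_ms))
            heapq.heappush(free_at, finish)
            if finish - t > budget_ms:
                late += 1
        return late / n_events

    print(timeout_fraction())                  # essentially zero for these parameters
    print(timeout_fraction(n_workers=10))      # much larger for a smaller sub-farm

Increasing the number of workers per sub-farm at a fixed input rate pushes the time-out fraction down sharply, which is the behaviour claimed on the slide.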

  11. Context Switching Latency
  • What is it? On a multi-tasking OS, whenever the OS switches from one process to another, it needs a certain time to do so
  • Why do we worry? Because we run the L1 and the HLT algorithms concurrently on each CPU node
  • Why do we want this concurrency? Because we want to use every available CPU cycle

  12. Scheduling and Latency
  • Using Linux 2.5.55 we have established two facts about the scheduler:
    • Soft realtime priorities work: the Level-1 task is never interrupted until it finishes
    • The context-switch latency is low: < 10.1 ± 0.2 µs
  • The measurements were done on a high-end server (2.4 GHz Pentium 4 Xeon, 400 MHz FSB); we should have machines at least 2x faster in 2007
  • Conclusion: the scheme of running both tasks concurrently is sound
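On Linux, "soft realtime priorities" correspond to the POSIX real-time scheduling classes (SCHED_FIFO/SCHED_RR): a real-time task is not preempted by ordinary time-sharing tasks, so the Level-1 process runs to completion while the HLT process uses whatever cycles are left. A minimal sketch of the mechanism (not the actual LHCb farm software; it needs root or CAP_SYS_NICE, and the priority value is an arbitrary choice):

    import os

    def make_level1_realtime(priority=50):
        """Give the calling process a SCHED_FIFO ("soft realtime") priority so it
        is not preempted by ordinary SCHED_OTHER tasks such as the HLT process
        sharing the same node."""
        try:
            os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        except PermissionError:
            print("need root or CAP_SYS_NICE to switch to SCHED_FIFO")

    if __name__ == "__main__":
        make_level1_realtime()
        print("scheduling policy:", os.sched_getscheduler(0))   # 1 == SCHED_FIFO on Linux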

  13. System Design
  • God-given parameters: trigger rates, transport overheads, raw-data size distributions per front-end link
  • Chosen parameters: number of CPUs (1400), average link load (80%), maximum acceptable event-building rate at an SFC (80 kHz), packing factor of events into “super-events” for transport (25)
  • Munch through a huge spread-sheet, apply some reasonable rounding, take care of partitioning, and voilà! (A compressed version is sketched below.)
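A heavily compressed version of that spread-sheet (a sketch only; N_SFCS is an assumed round number inside the 90-153 range quoted on the architecture slide, and transport overheads, the 80 kHz SFC limit and partitioning are not modelled):

    # Inputs quoted on this and earlier slides
    L1_RATE_HZ     = 1.1e6     # Level-1 input rate
    PACKING_FACTOR = 25        # events per "super-event"
    N_CPUS         = 1400      # chosen farm size
    N_SFCS         = 100       # assumption: a round number within the quoted 90-153 range

    super_event_rate_khz = L1_RATE_HZ / PACKING_FACTOR / 1e3     # 44 kHz into the network
    rate_per_sfc_khz     = super_event_rate_khz / N_SFCS         # ~0.44 kHz of super-events per SFC
    workers_per_subfarm  = N_CPUS / N_SFCS                       # 14 worker CPUs per sub-farm

    print(super_event_rate_khz, rate_per_sfc_khz, workers_per_subfarm)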

  14. Some Numbers

  15. Summary
  • LHCb’s new software trigger system operates two read-out streams, at 40 kHz and 1.1 MHz, on the same infrastructure
  • One event stream requires hard latency restrictions to be obeyed
  • The system is based on Gigabit Ethernet and uses commercial, mostly commodity, hardware throughout
  • The system could be built today and will be affordable three years from now
