Presentation Transcript


  1. The ATLAS Online High Level Trigger Framework: Experience Reusing Offline Software Components in the ATLAS Trigger
     Werner Wiedenmann, University of Wisconsin, Madison, Wisconsin, on behalf of the ATLAS Collaboration
     CHEP 2009, Prague, 21-27 March 2009

  2. The ATLAS Trigger
  • Level-1: hardware trigger; output rate < 75 (100) kHz; latency 2.5 μs
  • High Level Triggers (HLT) = Level-2 + Event Filter: software triggers running on commodity hardware
  • Level-2: target processing time ~ 40 ms on a 2 GHz / 4-core CPU; output rate ~ 3 kHz
  • Event Filter: target processing time ~ 4 s on a 2 GHz / 4-core CPU; output rate ~ 200 Hz to data recording
  • Event size ~ 1.6 MB

  3. HLT Hardware
  • Final system: 17 L2 racks, 62 EF racks; 1 rack = 31 PCs (1U); 28 racks configurable as L2 or EF (XPU)
  • 1 Local File Server per rack, served by Central File Servers (CFS)
  • 27 XPU racks installed, ~35% of the final system
  • XPU nodes: 2x Harpertown quad-core CPUs @ 2.5 GHz, 2 GB memory per core, network booted

  4. ATLAS High Level Triggers
  • Level-2 (75 kHz → 3 kHz)
    • Partial event reconstruction in Regions of Interest (RoI) provided by Level-1; event data in the RoIs are requested by the selection algorithms over the network from the readout system
    • The HLT selection software runs in the Level-2 Processing Unit (L2PU); selection algorithms run in a worker thread
  • Event Filter (3 kHz → 200 Hz)
    • Full event reconstruction, seeded by the Level-2 result, with offline-type algorithms
    • Independent Processing Tasks (PT) run the selection software on the Event Filter (EF) farm nodes
  • The HLT event selection software is based on the ATLAS Athena offline framework
    • The HLT framework interfaces the HLT event selection algorithms to the online environment
    • It is driven by the run control and data-flow software: the event loop is managed by the data-flow software
    • This allows HLT algorithms to run unchanged in the trigger and in the offline environment (a sketch of this pattern follows the slide)
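  The last point relies on a simple inversion of control: algorithms implement only per-event hooks, while the surrounding framework owns the event loop. The following is a minimal, self-contained C++ sketch of that pattern; the names (IEventAlgorithm, DummySelection, DataFlowDriver) are illustrative assumptions and not the actual Athena/Gaudi interfaces.

      // Minimal illustrative sketch (not the actual Athena/Gaudi API): the
      // framework owns the event loop, algorithms only implement per-event hooks.
      #include <iostream>

      struct IEventAlgorithm {
          virtual ~IEventAlgorithm() = default;
          virtual bool initialize() = 0;              // once, at configuration
          virtual bool execute(int eventNumber) = 0;  // once per event
          virtual bool finalize() = 0;                // once, at end of run
      };

      // A selection algorithm written only against the hook interface, so the
      // identical code can be driven offline (athena) or online (L2PU / EF PT).
      struct DummySelection : IEventAlgorithm {
          bool initialize() override { std::cout << "init\n"; return true; }
          bool execute(int evt) override { return evt % 2 == 0; }  // toy accept/reject
          bool finalize() override { std::cout << "done\n"; return true; }
      };

      // Offline-style driver: a plain loop over the input.
      void offlineLoop(IEventAlgorithm& alg, int nEvents) {
          alg.initialize();
          for (int i = 0; i < nEvents; ++i) alg.execute(i);
          alg.finalize();
      }

      // Online-style driver: the data-flow software pushes events one at a time
      // and receives the decision back.
      struct DataFlowDriver {
          IEventAlgorithm& alg;
          void configure() { alg.initialize(); }
          bool onEvent(int evt) { return alg.execute(evt); }
          void stopRun() { alg.finalize(); }
      };

      int main() {
          DummySelection sel;
          offlineLoop(sel, 4);          // same algorithm, offline event loop
          DataFlowDriver online{sel};   // same algorithm, loop owned by data flow
          online.configure();
          online.onEvent(0);
          online.stopRun();
      }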

  5. HLT Software in the Level-2 and Event Filter Data Flow
  [Diagram: the data-collection applications (L2PU and EF PT) embed the HLT event selection software through a Steering Controller. In Level-2, events are processed in worker threads; in the Event Filter, in Processing Tasks. The HLT framework itself is based on ATHENA/GAUDI.]

  6. HLT Event Selection Software
  [Package diagram: the HLT selection software framework, based on ATHENA/GAUDI, contains the HLT core software (Steering, Monitoring Service, Data Manager, ROB Data Collector, MetaData Service), the HLT algorithms and the Event Data Model. It imports the Athena/Gaudi core, the StoreGate interface, the offline Event Data Model and offline reconstruction algorithms, and is used by the L2PU application and the Event Filter processing tasks (1..*). The components are common to Level-2 and the EF; offline algorithms are reused, mainly in the EF. The HLT data-flow software sits alongside as a separate dependency.]

  7. HLT Framework
  [Diagram: the same algorithms run on Athena services both online (L2PU/EF PT inside the data flow) and offline (athena); the online service backends connect to the Online Information Service and the conditions database.]
  The framework provides special Athena services where necessary so that algorithms run transparently online and offline; wherever possible the offline infrastructure services are reused. Athena services with a special "online" backend are used, for example, for:
  • interfacing to the online run control, state machine and online configuration database (e.g. for the readout buffer configuration)
  • interfacing the algorithm configuration to the Trigger Database, which holds the menu configuration
  • interfacing to online conditions data
  • forwarding data requests to the data flow in Level-2
  • ...
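  A hedged C++ sketch of the "same interface, different backend" idea behind these services: algorithm code programs against an abstract interface, and the configuration selects an offline or online implementation. The class names (IConditionsSvc, OfflineConditionsSvc, OnlineConditionsSvc) and the stubbed behaviour are assumptions for illustration, not the real Athena services.

      // Hypothetical sketch of the "same interface, different backend" pattern;
      // the real services are Gaudi/Athena services, not these classes.
      #include <iostream>
      #include <memory>
      #include <string>

      struct IConditionsSvc {                         // what algorithm code sees
          virtual ~IConditionsSvc() = default;
          virtual double get(const std::string& key) const = 0;
      };

      struct OfflineConditionsSvc : IConditionsSvc {  // offline: conditions database (stubbed)
          double get(const std::string&) const override { return 1.0; }
      };

      struct OnlineConditionsSvc : IConditionsSvc {   // online: online information service (stubbed)
          double get(const std::string&) const override { return 1.0; }
      };

      // Algorithm code is unchanged; only the job configuration picks the backend.
      double calibratedEnergy(const IConditionsSvc& svc, double rawEnergy) {
          return rawEnergy * svc.get("emScale");
      }

      int main(int argc, char**) {
          std::unique_ptr<IConditionsSvc> svc;
          if (argc > 1) svc = std::make_unique<OnlineConditionsSvc>();   // "online" run
          else          svc = std::make_unique<OfflineConditionsSvc>();  // offline job
          std::cout << calibratedEnergy(*svc, 10.0) << "\n";
          return 0;
      }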

  8. HLT Framework
  The HLT framework is responsible:
  • for packaging and forwarding the HLT decisions to the online system, e.g.
    • the detailed Level-2 and Event Filter decision record for subsequent reconstruction steps
    • the event output destination (stream tag information)
    • flagging an event for calibration purposes and building it only partially
    • ...
  • for error handling and for forwarding error conditions from the algorithms to the online applications
  • for forwarding monitoring histograms and counters to the online monitoring infrastructure
  • ...
  Level-2 and the Event Filter share many common framework components; the differences are mainly due to the different data access (RoI-based versus full event). A sketch of such a decision record follows.
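  To make the first point concrete, the fragment below sketches the kind of per-event record handed back to the data flow: an accept/reject decision, packed per-chain bits for later reconstruction steps, and stream tags that may request only partial event building. The type and field names are illustrative assumptions, not the actual ATLAS HLT result or EDM classes.

      // Illustrative field names only, not the actual ATLAS HLT result / EDM classes.
      #include <cstdint>
      #include <string>
      #include <vector>

      struct StreamTag {
          std::string name;       // e.g. a physics, calibration or debug stream name
          std::string type;       // "physics", "calibration", "debug", ...
          bool fullEventBuild;    // false => request only partial event building
      };

      struct HLTResult {
          bool accepted;                            // overall Level-2 / EF decision
          std::vector<std::uint32_t> triggerInfo;   // packed per-chain bits for later steps
          std::vector<StreamTag> streamTags;        // output destinations for the event
      };

      int main() {
          HLTResult r;
          r.accepted = true;
          r.triggerInfo = {0x1u};
          r.streamTags.push_back({"physics_Example", "physics", true});
          return r.accepted ? 0 : 1;
      }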

  9. Development Model
  [Diagram: the same Steering Controller and algorithms run either inside the online L2PU/EF PT applications in the data flow, or inside the offline emulators athenaMT/athenaPT in the ATHENA environment.]
  HLT algorithm development, testing and optimization is done mainly in the offline environment. Developers can use the online emulators, athenaMT for Level-2 and athenaPT for the Event Filter, to test their code. This decouples HLT algorithm development from TDAQ data-flow development and gives developers offline-style support.
  The emulators athenaMT/athenaPT:
  • emulate the complete L2PU/PT environment with the specific boundary conditions for data access
  • are command-line tools, so there is no need to set up a complex data-flow system
  • allow testing of the complete state machine
  • are written in Python/C++

  10. The ATLAS HLT Software Project
  [Diagram: project dependencies, with LCG and Gaudi at the base, the offline projects AtlasCore through AtlasTrigger above them, the online projects tdaq and tdaq-common alongside, and AtlasHLT on top.]
  The HLT framework packages form the AtlasHLT software project, which is set up like the other offline software projects:
  • uses many facilities of the offline build and testing system
  • is hosted in the offline version control repository
  • follows the offline build and patching strategy
  • automatic tests, including an automatic memory-leak test for Level-2
  • extends Gaudi base services directly and contains the HLT algorithms
  Since AtlasHLT has dependencies on most of the ATLAS software projects, it has to be last in the build sequence.

  11. Operational Experience
  • The HLT framework has been successfully tested and used over many years, e.g. in
    • the ATLAS test beam in 2004 and TDAQ commissioning runs
    • cosmic data taking and detector commissioning periods
    • the first beam on September 10th 2008, configured in pass-through mode
  • The HLT framework is used regularly for cosmic data taking and commissioning runs, e.g. continuous cosmic data taking after September 12th: 216 million events, 453 TB of data, 400k files
  • It proved to be very versatile and allowed fast reaction to changing detector conditions
  • The emulators athenaMT/athenaPT were used to analyze events offline in the DEBUG streams and, where appropriate, to reassign them to the analysis streams
  • This provided valuable feedback on functionality, stability and reliability

  12. Operational Experience
  System tests with the full 10^31 menu and simulated data showed that the event-processing times are within specification (on 2.5 GHz quad-core CPUs):
  • Level-2: ~ 45 ms
  • Event Filter: ~ 260 ms (the Event Filter was not fully exploited by this setup)
  [Plot: 7-hour run with L2 input of ~ 60 kHz (80% of the design rate) and 2880 L2PUs (70% of the final system), showing the Level-2 and Event Filter rejected-event rates; cron jobs are marked.]

  13. Event Parallelism and Multi-Core Machines
  • Multi-core machines are used for the HLT processors; processor development brings a rapidly increasing number of CPU cores
  • This calls for massive parallelism, which is typically inherent to high-energy-physics selection and reconstruction as "event parallelism"
  • The original ATLAS HLT design exploits event parallelism in
    • Level-2: parallel event processing in multiple worker threads (with the possibility to fall back to multiple selection processes)
    • Event Filter: multiple event-processing tasks
  • The HLT framework supports both ways of exploiting event parallelism; a special version of the Gaudi/Athena framework creates the selection-algorithm instances for the worker threads
  • Experience with multi-threading in Level-2 has shown that it is very difficult to keep a large code base, maintained by an open developer community, thread-safe and thread-efficient
  • The Level-2 processing unit is therefore run with a single worker thread, with one processing unit per CPU core
    • This also allows Level-2 to profit from the large ATLAS offline code base (mostly written and designed in the "pre-multi-core era")
    • It increases the software commonality between Level-2 and the Event Filter and allows a transparent move of Level-2 algorithms to the Event Filter
    • It does, however, increase the resource usage (memory, network connections, ...)

  14. Experience with Multi-Threading in the HLT Framework
  [Plot: timeline of an L2PU with 3 worker threads during initialization and event processing; the worker threads are blocked due to a common memory pool for the STL containers.]
  Event processing in multiple worker threads:
  • code sharing
  • small context-switch times ("lightweight processes")
  • automatic sharing of many hardware resources
  • but it requires careful tuning and proves difficult to keep the code thread-safe and efficient over release cycles, which impacts the ability to react quickly to changing conditions and to deploy new code
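  A toy C++ illustration of the kind of measurement behind the plot: the same allocation-heavy per-event work is run with one and with several worker threads. With ideal scaling both wall times are equal; serialization on a shared allocator pool would push the multi-threaded time above the single-threaded one. Whether contention is visible depends on the allocator in use; the workload and numbers here are invented.

      // Toy measurement: identical allocation-heavy work with 1 vs. 3 worker threads.
      #include <chrono>
      #include <iostream>
      #include <thread>
      #include <vector>

      void allocationHeavyEvent() {
          // Stand-in for per-event work dominated by STL container churn.
          for (int i = 0; i < 2000; ++i) {
              std::vector<double> cells(256, 1.0);
              cells.push_back(static_cast<double>(i));
          }
      }

      double timeWorkers(int nThreads, int eventsPerThread) {
          auto t0 = std::chrono::steady_clock::now();
          std::vector<std::thread> workers;
          for (int t = 0; t < nThreads; ++t)
              workers.emplace_back([=] {
                  for (int e = 0; e < eventsPerThread; ++e) allocationHeavyEvent();
              });
          for (auto& w : workers) w.join();
          return std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
      }

      int main() {
          // Each worker processes the same number of events, so equal wall times
          // mean perfect scaling; a larger multi-thread time indicates contention.
          std::cout << "1 worker : " << timeWorkers(1, 50) << " s\n";
          std::cout << "3 workers: " << timeWorkers(3, 50) << " s\n";
      }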

  15. Experience with Multiple Processes in the HLT Framework
  Run one instance of a Level-2 processing unit or Event Filter processing task on every CPU core:
  • easy to do with the existing code base; a priori no code changes are required
  • facilitates the reuse of offline components
  • good scaling with the number of cores is observed and provides the required throughput [plot: throughput vs. number of process instances on a machine with 8 cores in total]
  • but the resource requirements are multiplied by the number of process instances:
    • memory: ~ 1-1.5 GB per application [plot: Event Filter memory requirements]
    • file descriptors and network sockets
    • number of controlled applications: ~ 7k at present, ~ 20k in the final system
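  A minimal C++ sketch of the process-per-core model using plain fork(): each child gets its own address space and runs its own event loop, which is exactly why memory and the number of controlled applications scale with the core count. In ATLAS the worker applications are started and supervised by the TDAQ run control, not by a parent process like this; the sketch only illustrates the process model.

      // Process-per-core sketch with plain fork(); illustration only.
      #include <sys/wait.h>
      #include <unistd.h>
      #include <cstdio>
      #include <thread>
      #include <vector>

      void eventLoop(int workerId) {
          // Stand-in for an L2PU / EF processing task handling its share of events.
          std::printf("worker %d (pid %d) processing events\n", workerId, (int)getpid());
      }

      int main() {
          unsigned nCores = std::thread::hardware_concurrency();
          if (nCores == 0) nCores = 1;

          std::vector<pid_t> children;
          for (unsigned i = 0; i < nCores; ++i) {
              pid_t pid = fork();
              if (pid == 0) {                    // child: own address space per core
                  eventLoop(static_cast<int>(i));
                  _exit(0);
              }
              children.push_back(pid);
          }
          for (pid_t pid : children) waitpid(pid, nullptr, 0);  // supervise the workers
          return 0;
      }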

  16. Resource Optimization
  [Diagram: per-core processing tasks (P/T) each holding their own per-event data in memory, while a large block of sharable data could be held once and accessed by all cores through the memory controller.]
  • Typically in HEP applications all processes use a large amount of constant configuration and detector-description data
  • This is a common problem for offline reconstruction and the trigger
  • Different prototypes have been tried; see contribution #244, S. Binet, "Harnessing multicores: strategies and implementations in ATLAS"
  • OS-based memory optimization such as KSM is also very interesting
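  One possible strategy in this direction (a sketch, not necessarily the approach adopted in ATLAS) is to build the large constant data once and only then fork the per-core workers, so that the read-only pages stay physically shared via copy-on-write; KSM can additionally merge identical pages of already-running processes. The C++ sketch below illustrates the fork-after-initialization idea; the data size and structure are invented for illustration.

      // Fork-after-initialization sketch: the large constant data is built once in
      // the parent; forked workers that only read it keep the pages shared (CoW).
      #include <sys/wait.h>
      #include <unistd.h>
      #include <cstdio>
      #include <vector>

      int main() {
          // Large constant configuration / detector description, built once.
          std::vector<double> geometry(50000000, 1.0);   // ~400 MB, read-only from here on

          const int nWorkers = 4;
          for (int i = 0; i < nWorkers; ++i) {
              if (fork() == 0) {
                  // Workers only read the data, so no private copies are created.
                  double sum = 0.0;
                  for (double x : geometry) sum += x;
                  std::printf("worker %d sees sum %.0f\n", i, sum);
                  _exit(0);
              }
          }
          for (int i = 0; i < nWorkers; ++i) wait(nullptr);
          return 0;
      }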

  17. Conclusions
  • The ATLAS HLT framework, based on the ATHENA offline framework, has been used successfully in many different data-taking periods and provides the required performance
  • The reuse of offline software components makes it possible to benefit from a large developer community and a large code base
  • The ability to develop, optimize and test trigger algorithms in the offline environment is used successfully to establish the ATLAS HLT selection menu
  • The optimal use of multi-core processors requires further framework enhancements to reduce resource utilization; many of these issues are shared with offline and can profit from common solutions
  • The HLT framework is ready for the new data-taking periods
