
Reference Software Framework

ATLAS Trigger/DAQ Workshop, Chamonix, 19-23 October 1998. Reference Software Framework.


Presentation Transcript


  1. ATLAS Trigger/DAQ Workshop, Chamonix, 19-23 October 1998 • Reference Software Framework • A. Belias (5), R. Bock (2), A. Bogaerts (2), M. Boosten (2), J. Bystricky (8), D. Botterill (5), D. Calvet (8), P. Clarke (9), J.M. Conte (8), R. Cranfield (9), G. Crone (9), M. Dobson (2), A. Dos Anjos (2), S. Falciano (7), F. Giacomini (2), A. Guglielmi (3), R. Hauser (2), M. Huet (8), R. Hughes-Jones (4), K. Korcyl (2), P. Le Du (8), I. Mandjavidze (8), R.P. Middleton (5), D. Pelkman (2), S. Qian (6), J. Schlereth (1), M. Sessler (2), P. Sherwood (9), S. Tapprogge (2), W. Wiedenmann (6), P. Werner (2), F. Wickens (5), H. Zobernig (6) • Affiliations: (1) Argonne National Lab, (2) CERN, (3) DEC Joint Project Office, (4) University of Manchester, (5) Rutherford Appleton Laboratory, (6) University of Wisconsin, (7) University of Rome, (8) CEA Saclay, (9) UCL • A. Bogaerts (2)

  2. Objectives • Full software chain from RoI data collection to trigger decision • common framework for different aspects within LVL2 (testbeds, algorithms, communications, special components) • Diversity of architectures • small systems for FeX algorithms, steering, trigger performance • large scale testbeds for network performance • HPCN farms (technology watch) • Decouple communications technology • TCP/IP or UDP over Ethernet to test functionality • allow alternative communications technologies (ATM, Ethernet, SCI, …) • Isolate algorithms (Steering, FeX) from architecture/framework • coping with distributed algorithms, distributed data • Implement on open, affordable platform • PCs with WNT or Linux • Good engineering, good quality software • introduction of software process • OO design & implementation • emphasis on portability (isolation of OS services, ANSI compliance, STL)

  3. Functional Components • Supervisor (Emulator) • LVL1 Interface (provides RoI lists & LVL1 result to Steering) • EF interface (passes LVL2 result to the Event Filter) • Communicates Final Decision to Event Filter/DAQ • Receives Event Copied message from Event Builder • Clears Readout Buffers • Steering • Global Decision driven by a Trigger Menu table (sequential or parallel) • schedules Feature Extraction as defined by Algorithm table • Feature Extraction • initiates Data Collection • extracts Features from RoI Data • Data Collection • initiates data transfers from RoBs (or reads from a file) • collates RoB fragments • Readout Buffer (Emulator) • data source for LVL2 and Event Filter/DAQ
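
The table-driven Steering described on this slide can be illustrated with a small sketch. The table formats below (`MenuStep`, `TriggerMenu`, `AlgorithmTable`) and the function `steer` are hypothetical names invented for illustration; the slides only state that a Trigger Menu table drives the decision and an Algorithm table selects the Feature Extraction. Only the sequential mode is shown.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical table formats (assumption: not taken from the slides).
struct MenuStep { std::string feature; double threshold; };
using TriggerMenu      = std::vector<MenuStep>;
using FeatureExtractor = std::function<double()>;   // returns one feature value
using AlgorithmTable   = std::map<std::string, FeatureExtractor>;

// Sequential steering: run one FeX per menu step and reject at the first
// step whose feature is below threshold. If 'executed' is non-null, the
// names of the steps that were actually run are appended to it.
inline bool steer(const TriggerMenu& menu, const AlgorithmTable& algorithms,
                  std::vector<std::string>* executed = nullptr) {
    for (const MenuStep& step : menu) {
        auto it = algorithms.find(step.feature);
        if (it == algorithms.end()) return false;       // no algorithm configured
        assert(it->second && "empty algorithm table entry");
        if (executed) executed->push_back(step.feature);
        if (it->second() < step.threshold) return false; // early reject
    }
    return true;                                         // all steps passed
}
```

A caller would fill the `AlgorithmTable` with one callable per feature name and pass a menu such as `{{"calo_et", 20.0}, {"trt_pt", 5.0}}`; later steps are skipped as soon as one step fails.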

  4. Test Setups • Feature Extraction -- Single CPU • “stand alone” Feature Extraction for a single detector • purpose: algorithm development, physics performance • Single Node -- Single CPU • single node implementation of Steering & Feature Extraction for multiple detectors • data file reader replaces Supervisor and Data Collector/RoB emulators • purpose: functional tests, physics performance, trigger strategy • Single Farm -- Multiple CPU • Steering & Feature Extraction combined in a single thread • multiple threads per Node • one processor farm (Steering & FeX), one RoB farm, one network • purpose: performance including communications, component tests • Split Steering/FeX Farms (HPCN cluster) -- Multiple CPU • assumes a single HPCN farm per subdetector, one global farm • direct connection to detector data without external network (rob-ins) • requires a single small but high bandwidth external network • purpose: alternative for a Single Farm

  5. Software Components • Applications • Application Framework (common to all functional components) • Emulators: Supervisor, RoB • Algorithms • Steering (“Stephandler”), FeX algorithms (presently TRT, Calorimeter) • Menu & algorithm table • ASCII data files • Run Control • Configuration File • Process manager, Run Control, Error Logging, Monitoring • client/server approach, compatible with DAQ/backend • Infrastructure • Operating System Services (encapsulates OS dependencies) • Object management • Message passing Interface and Buffer Management (encapsulates communications) • Application Interfaces to Run Control (Configuration database, Error Logging, Monitoring)

  6. Independence of underlying Architecture • All functional components (e.g. Supervisor, Steering, Algorithms, Data Collectors, RoB) are defined as objects • Objects interact through the invocation of methods (with arguments) • This simple model is always used to preserve independence of architecture • But CORBA-like techniques (RPC, proxy-objects, marshalling of arguments, serialisation of objects) are used to cross processor boundaries • Example: interface between TRT FeX Algorithm and Steering:

         class TRTHandler {
         public:
             void extract(Region, EventId, Algorithm);  // TRT Feature Extraction
             bool poll();                               // polls if result has already been produced
             list<TRTTrack> retrieve();                 // allows asynchronous execution
         };

     [Diagrams: object chain Supervisor, Steering, TRTHandler, TRTCollector, RoB; Steering and TRTHandler exchanging extract(…), poll() and retrieve(…) calls, directly and via Proxy TRTHandler / Proxy Steering objects]
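
A minimal sketch of the proxy idea on this slide, under stated assumptions: `Region`, `EventId`, `Algorithm` and `TRTTrack` are reduced to hypothetical stand-in types, the classes `LocalTRTHandler`, `ProxyTRTHandler` and the helper `count_tracks` are invented for illustration, and the "network" is replaced by a `std::function`. The point is only that the Steering-side code is the same whether the handler is local or a proxy.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <utility>

// Hypothetical stand-ins for the framework types named on the slide.
struct Region { double eta; double phi; };
using EventId = unsigned long;
enum class Algorithm { TRTScan };
struct TRTTrack { double pt; };

// Interface seen by Steering: the three methods shown on the slide.
class TRTHandler {
public:
    virtual ~TRTHandler() = default;
    virtual void extract(Region, EventId, Algorithm) = 0;
    virtual bool poll() = 0;
    virtual std::list<TRTTrack> retrieve() = 0;
};

// Local implementation: feature extraction runs in the caller's process.
class LocalTRTHandler : public TRTHandler {
public:
    void extract(Region r, EventId, Algorithm) override {
        result_ = { TRTTrack{10.0 * r.eta} };  // placeholder "algorithm"
        ready_ = true;
    }
    bool poll() override { return ready_; }
    std::list<TRTTrack> retrieve() override {
        assert(ready_ && "retrieve() called before a result is ready");
        ready_ = false;
        return std::exchange(result_, {});
    }
private:
    std::list<TRTTrack> result_;
    bool ready_ = false;
};

// Proxy: same interface, but each call is packed into a request and handed
// to a transport. A real system would marshal the arguments and send them
// to another processor; here the transport is just a callable.
struct Request { Region region; EventId event; Algorithm algo; };
using Transport = std::function<std::list<TRTTrack>(const Request&)>;

class ProxyTRTHandler : public TRTHandler {
public:
    explicit ProxyTRTHandler(Transport t) : send_(std::move(t)) {}
    void extract(Region r, EventId e, Algorithm a) override {
        reply_ = send_(Request{r, e, a});      // synchronous, for brevity
        ready_ = true;
    }
    bool poll() override { return ready_; }
    std::list<TRTTrack> retrieve() override {
        assert(ready_ && "retrieve() called before a result is ready");
        ready_ = false;
        return std::exchange(reply_, {});
    }
private:
    Transport send_;
    std::list<TRTTrack> reply_;
    bool ready_ = false;
};

// Steering-side code: identical for local and proxy handlers.
inline std::size_t count_tracks(TRTHandler& h, Region r, EventId e) {
    h.extract(r, e, Algorithm::TRTScan);
    while (!h.poll()) { /* other work could be scheduled here */ }
    return h.retrieve().size();
}
```

In this sketch the proxy blocks inside `extract`; an asynchronous transport would return immediately and let `poll()` report completion, which is the use case the slide's `poll()`/`retrieve()` pair suggests.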

  7. Application Framework [Architecture Independent & Data Location Independent] • [Layer diagram] Components: Supervisor, Steering, FeX, RC, RoB, Error Logger, Config, Information, Histogram • Application (Supervisor, RoB, Steering) • Object Management [Hide Event Data Location]: Distribution - proxies; Location - broker; Transport - messages • Control Services • Network Abstraction [Hide Network Technologies]: Ethernet, SCI, ATM... • Operating System Abstraction [Hide OS Specifics]: Threads, Pipes, Synchronisation, Shared memory, Timers, Sockets, etc. (WNT, Linux, ...)

  8. Objects, Threads, Processes, Processors • Objects • Since all functional components are objects they can easily be instantiated, exist in multiple copies or as multiple variants • Simplest architecture is the classical single threaded single process which is used for tests, development of algorithms and physics performance studies • Threads • multiple threading allows efficient parallelism (light weight process) and easy communication (shared address space) • parallelism allows overlap between program execution and I/O as well as exploitation of SMP systems • Processes • Use of multiple processes is only kept as a convenient substitute for multi-processors for testing as multi-threading is preferred • Processors • CORBA-like techniques with proxy-objects hide distribution of algorithms and data to preserve independence of architectures • Communication (with associated message passing, buffer management and queuing) is hidden in proxy-objects

  9. Status • Design: March - Oct 1998 • Supervisor & RoB emulators, Application Framework, infrastructure • Steering and Feature Extraction algorithms • Prototyping: Sept - Oct 1998 • Single Node: Steering (“Stephandler”), FeX algorithms (TRT, Calorimeter), data files • Multi Node: Application Framework, Supervisor & RoB emulator, Messaging over TCP/UDP, Configuration, Error Reporting • CVS Repository • Full implementation: Nov - Dec 1998 • Integration of communications technologies (ATM, Ethernet, SCI) • Full set of representative Algorithms (SCT in preparation …) • Complete Run Control, Configuration, Error Reporting, Monitoring • Complete, improve and tune Application Framework and Emulators • Distribute turn-key system for WNT and Linux on PCs • [Timeline figure: 98, 99, TP, < 2000; phases: Design & Implementation, Testbed Operations]

  10. Supervisor • Tasks of the Supervisor • Provides interface to Level 1 by collecting data needed to steer Level 2 decision • Receives back Level 2 decision from steering object • Notifies RoB object to release buffers • Level 1 information is summarized in an object, LVL1result • object contains data and access methods for steering object • RoI objects accessed as STL containers • 3 sources of Level 1 data • Internally generated distributions based on parameter file • Read from Event store to synchronize with RoB emulator, Steering • Read from RoI Builder hardware via S-link interfaces (not yet implemented) • Status • Simple Supervisor using Event Store source has been integrated with Application Framework; used to exchange messages with steering object
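
As an illustration of "data and access methods" with RoIs exposed as an STL container, here is a small sketch. The `RoI` fields, the method names on `LVL1Result` and the helper `countRoIsOfType` are assumptions made for this example; the slide does not give the actual class layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical RoI record (assumption: fields chosen for illustration).
struct RoI { double eta; double phi; int type; };

class LVL1Result {
public:
    using const_iterator = std::vector<RoI>::const_iterator;

    explicit LVL1Result(unsigned long eventId) : eventId_(eventId) {}

    void addRoI(const RoI& roi) { rois_.push_back(roi); }
    unsigned long eventId() const { return eventId_; }

    // STL-style read access for the steering object.
    const_iterator begin() const { return rois_.begin(); }
    const_iterator end() const { return rois_.end(); }
    std::size_t size() const { return rois_.size(); }
    const RoI& at(std::size_t i) const {
        assert(i < rois_.size());
        return rois_[i];
    }
private:
    unsigned long eventId_;
    std::vector<RoI> rois_;
};

// Steering-side query written against the container interface only.
inline std::size_t countRoIsOfType(const LVL1Result& result, int type) {
    return static_cast<std::size_t>(
        std::count_if(result.begin(), result.end(),
                      [type](const RoI& r) { return r.type == type; }));
}
```

Because `LVL1Result` provides `begin()` and `end()`, it also works directly in range-based `for` loops and with standard algorithms.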

  11. RoB Emulator • Follows the standard conventions of the Application Framework • It is a single communications “node” • It responds to • data requests originating from the RoB Data Collector • delete event <event list> originating from the Supervisor • Event Filter/DAQ allows three implementations: • mapping to the entire detector (a single RoB provides all data) • mapping to a subdetector (one RoB per subdetector) • mapping to part of a subdetector (each RoB holds a slice of the RoI data) • It may access the EventStore to obtain data from a file and preload events in memory
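
The two message types the RoB emulator responds to can be sketched as an in-memory buffer. The class `RobEmulator`, its method names and the `Fragment` type are hypothetical; message transport and the EventStore file format are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using EventId  = unsigned long;
using Fragment = std::vector<std::uint8_t>;  // hypothetical raw RoB fragment

class RobEmulator {
public:
    // Preload one event fragment into memory (e.g. after reading a file).
    void preload(EventId id, Fragment data) {
        assert(buffer_.count(id) == 0 && "event already buffered");
        buffer_[id] = std::move(data);
    }

    // Data request from a Data Collector; nullptr if the event is not held.
    const Fragment* request(EventId id) const {
        auto it = buffer_.find(id);
        return it == buffer_.end() ? nullptr : &it->second;
    }

    // "delete event <event list>" from the Supervisor.
    // Returns the number of events actually released.
    std::size_t release(const std::vector<EventId>& events) {
        std::size_t released = 0;
        for (EventId id : events) released += buffer_.erase(id);
        return released;
    }

    std::size_t buffered() const { return buffer_.size(); }
private:
    std::map<EventId, Fragment> buffer_;
};
```

Unknown event identifiers in a delete list are simply ignored here; whether the real emulator reports them as errors is not stated in the slides.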

  12. Error Reporting & Logging • Uses the high-level design employed by DAQ-1's MRS • Independently implemented • Senders “inject” messages into ERL using C++ stream I/O to send ASCII strings • Receivers “subscribe” to get a subset of the messages (selection criteria) • Consists of: C++ API, Message Server, Command Module, Message Database • Status: all implemented and tested except Message Database • [Diagram: ERL Server connected to Sender, Receiver and Commander]
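
The inject/subscribe pattern can be sketched in a single process as follows. Everything named here (`ErlServer`, `ErlStream`, `Severity`, `Message`) is a hypothetical illustration, not the ERL API; the real system places a Message Server between senders and receivers.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message record; the slide only says messages are ASCII strings.
enum class Severity { Info, Warning, Error };
struct Message { Severity severity; std::string text; };

// In-process stand-in for the message server.
class ErlServer {
public:
    using Filter  = std::function<bool(const Message&)>;
    using Handler = std::function<void(const Message&)>;

    // A receiver subscribes with a selection criterion and a callback.
    void subscribe(Filter filter, Handler handler) {
        assert(filter && handler);
        subscribers_.emplace_back(std::move(filter), std::move(handler));
    }

    // Deliver a message to every receiver whose filter accepts it.
    void inject(const Message& m) const {
        for (const auto& s : subscribers_)
            if (s.first(m)) s.second(m);
    }
private:
    std::vector<std::pair<Filter, Handler>> subscribers_;
};

// Sender-side helper: accumulates stream output and injects the finished
// string when the temporary is destroyed at the end of the statement.
class ErlStream {
public:
    ErlStream(ErlServer& server, Severity severity)
        : server_(server), severity_(severity) {}
    ~ErlStream() { server_.inject(Message{severity_, buffer_.str()}); }

    template <typename T>
    ErlStream& operator<<(const T& value) {
        buffer_ << value;
        return *this;
    }
private:
    ErlServer& server_;
    Severity severity_;
    std::ostringstream buffer_;
};
```

A sender would then write `ErlStream{server, Severity::Error} << "RoB " << 3 << " timeout";`, and only receivers whose selection criterion matches that message see it.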

  13. Process Manager • Service Layer for Run Control to handle startup and shutdown of LVL2 tasks • Modeled after BE Process Manager • Run Control sends requests to, receives replies from “Agents” • One Agent on each LVL2 node, started at boot time • Agents create/destroy LVL2 processes and maintain a process database • Status: not yet implemented • [Diagram labels: Process, Application, Client API, Agent, DataBase; arrows: requests, replies, info, status]

  14. Run Control • Tasks: Startup of LVL2 processes; Control and Monitoring; Intervention in case of malfunctioning; Controlled shutdown; User Interface for interaction • Basic Element: Controller (concept borrowed from BE) • receives commands; reads Configuration DB; performs actions; reacts to events; reports status • Hierarchy of Controllers (tree structure but two levels expected to be sufficient) • State machines • each Controller represents processes under its control by a standard Finite State Machine • Status: not yet implemented • [Diagram: Controller receives commands and events, reads the Configuration Database, performs actions on the Component under Control and reports status]
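
A Controller built around a finite state machine, with a two-level hierarchy, might look like the sketch below. The states and commands are invented for illustration (the slide does not list them), and since Run Control is marked as not yet implemented this is only one possible shape.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical states and commands (assumption: not given in the slides).
enum class State   { Initial, Configured, Running };
enum class Command { Configure, Start, Stop, Reset };

class Controller {
public:
    State state() const { return state_; }

    // Apply a command. Returns false and leaves the state unchanged
    // if the transition is not allowed from the current state.
    bool handle(Command c) {
        static const std::map<std::pair<State, Command>, State> table = {
            {{State::Initial,    Command::Configure}, State::Configured},
            {{State::Configured, Command::Start},     State::Running},
            {{State::Running,    Command::Stop},      State::Configured},
            {{State::Configured, Command::Reset},     State::Initial},
        };
        auto it = table.find({state_, c});
        if (it == table.end()) return false;
        state_ = it->second;
        return true;
    }
private:
    State state_ = State::Initial;
};

// Two-level hierarchy: the root forwards each command to all children
// and reports success only if every child accepted it.
class RootController {
public:
    explicit RootController(std::size_t nChildren) : children_(nChildren) {}

    bool handle(Command c) {
        bool allOk = true;
        for (Controller& child : children_) allOk = child.handle(c) && allOk;
        return allOk;
    }
    const Controller& child(std::size_t i) const {
        assert(i < children_.size());
        return children_[i];
    }
private:
    std::vector<Controller> children_;
};
```

Keeping the transitions in a table makes illegal commands easy to reject and report, which fits the "reports status" role of a Controller.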

  15. Integration • Lab equipped with 6 dual PII PCs booting Linux or WNT; monitor switch + 2 displays; AFS and NFS common file base; CVS sw repository • Fast Ethernet network with 8-port switch; 500Mbytes/s SCI/PCI network with 4-port switch • Single CPU/Linux: Steering, TRT and Calorimeter FeX, ASCII data files • Multi-CPU/Linux: Farm (Supervisor, 2 Steering/FeX, 2 RoB), dummy algorithms, Error Reporting • Idem for WNT • Multi-CPU/Linux farm with Algorithms, ASCII data files • SCI Message passing tested under WNT

  16. Multi-CPU Farm • [Diagram] Supervisor: Eventlist, LVL1Decisions, Steering Proxy; messages: LVL1Results, LVL2Results • Farm Node (two shown): Supervisor Proxy, Queue of LVL1Decisions, worker threads of Steering/FeX/DataCollectors, RoB Proxy • Messages between Farm Nodes and RoBs: Data Requests, Data Responses • RoB (two shown): EventList, DataCollector Proxy • ASCII Data File
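
The farm-node structure in this diagram (a queue of LVL1 decisions drained by worker threads) can be sketched with standard C++ threads. `BlockingQueue`, `run_farm_node` and the record types are hypothetical, and the "decision" is a placeholder; proxies and network messaging are omitted.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal records (assumption: real messages carry RoI data).
struct LVL1Decision { unsigned long eventId; int nRoI; };
struct LVL2Result   { unsigned long eventId; bool accept; };

// Thread-safe FIFO; close() lets consumers drain the queue and then stop.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        { std::lock_guard<std::mutex> lock(mutex_); queue_.push(std::move(value)); }
        ready_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> lock(mutex_); closed_ = true; }
        ready_.notify_all();
    }
    // Blocks until an item is available; returns false once closed and empty.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
    bool closed_ = false;
};

// One farm node: nWorkers threads each take one decision at a time.
inline std::vector<LVL2Result> run_farm_node(const std::vector<LVL1Decision>& input,
                                             std::size_t nWorkers) {
    assert(nWorkers > 0);
    BlockingQueue<LVL1Decision> decisions;
    std::vector<LVL2Result> results;
    std::mutex resultsMutex;
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < nWorkers; ++i) {
        workers.emplace_back([&] {
            LVL1Decision d;
            while (decisions.pop(d)) {
                // Placeholder for Steering/FeX: accept events with >= 2 RoIs.
                LVL2Result r{d.eventId, d.nRoI >= 2};
                std::lock_guard<std::mutex> lock(resultsMutex);
                results.push_back(r);
            }
        });
    }
    for (const LVL1Decision& d : input) decisions.push(d);
    decisions.close();
    for (std::thread& w : workers) w.join();
    return results;
}
```

Results arrive in whatever order the workers finish, so a caller that needs event order would sort by `eventId` or key the results by event.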

  17. Summary • High Level Design of Functional Components (including Steering, TRT and Calorimeter algorithms) finished • Single CPU system prototype: Steering, 2 FeX algorithms (TRT, Calorimeter) with data read from ASCII files prototyped and integrated. • Multi-CPU farm prototype: Application Framework (Farm node) integrated with Supervisor and RoB Emulators. • [Diagram: two RoBs, Supervisor, ERL and two Steering/FeX nodes connected via TCP/Fast Ethernet]

  18. Pending items • handling of the LVL2 result • software performance evaluation • quality assurance of the software • system robustness, error recovery • monitoring • construction of testbeds • integration of technologies
