CHEP 2000

CHEP 2000. Data Handling in KLOE. I. Sfiligoi, INFN LNF, Frascati, Italy. The KLOE experiment: KS → π+π−, KL → π+π− (CP violating), at the DAΦNE φ-factory. Main goal: CP violation study. Other interesting fields: kaon form factors, kaon rare decays, radiative φ decays.

Presentation Transcript


  1. CHEP 2000 Data Handling in KLOE I.Sfiligoi INFN LNF, Frascati, Italy

  2. The KLOE experiment • KS → π+π−, KL → π+π− (CP violating) • at the DAΦNE φ-factory • main goal: • CP violation study • other interesting fields: • kaon form factors • kaon rare decays • radiative φ decays • [Figure labels: KS → π+π−, KL → 3π0 → 6γ]

  3. KLOE Requirements • Data acquisition (at full DAΦNE luminosity) • 10^11 events per year acquired • 50 MB/s sustained throughput • Computing power • ALL the events need to be reconstructed • Storage requirements • one petabyte of raw and reconstructed events • hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)
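
The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and a 365-day year. The per-event figure additionally assumes that the petabyte corresponds to 10^11 events, which the slide does not state.

```python
# Figures quoted on the slide (decimal units are an assumption).
THROUGHPUT_BYTES_PER_S = 50e6   # 50 MB/s sustained throughput
EVENTS_PER_YEAR = 1e11          # events acquired per year
STORAGE_BYTES = 1e15            # one petabyte of raw and reconstructed events

SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

# Time needed to write one petabyte at the sustained DAQ rate.
seconds_to_fill = STORAGE_BYTES / THROUGHPUT_BYTES_PER_S
days_to_fill = seconds_to_fill / SECONDS_PER_DAY

# Upper bound on the yearly volume if the DAQ ran at full rate all year.
max_bytes_per_year = THROUGHPUT_BYTES_PER_S * SECONDS_PER_YEAR

# Storage per event, ASSUMING the petabyte holds 1e11 events.
bytes_per_event = STORAGE_BYTES / EVENTS_PER_YEAR

print(f"days to fill 1 PB at 50 MB/s: {days_to_fill:.1f}")
print(f"max volume per year: {max_bytes_per_year / 1e15:.2f} PB")
print(f"storage per event: {bytes_per_event / 1e3:.0f} kB")
```

Under these assumptions, one petabyte corresponds to roughly two thirds of a year of uninterrupted data taking at the full rate.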

  4. KLOE computing environment • Based on a set of medium-sized servers • Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet) • Heterogeneous environment, several platforms: • IBM AIX on PowerPC • Sun Solaris on Sparc • Compaq Tru64 Unix on Alpha • HP-UX on PA-RISC

  5. KLOE storage pool • Different policies for different types of data: • raw and reconstructed events on tape libraries, with big disk pools for data caching • related data managed by a disk based database system • analysis output on disk pools

  6. Disk pools • Four categories of disk pools are present: • each data acquisition node in the farm has its own small disk pool • computing nodes write their output to centralized, NFS mounted disk pools • separate disk pools are used as a cache for the events on tape • analysis output is written to its own, central AFS mounted disk pool

  7. Tape library • Several automated tape libraries supported (at the moment the 5500 slot tape library is partitioned between two tape servers) • Accessed using commercial software • IBM ADSM with the current tape library

  8. KLOE software • Three distinct categories • DAQ (or online): ANSI C • reconstruction and analysis (or offline): FORTRAN inside A_C • Monte Carlo: FORTRAN • The interface to the Data Handling System must be compatible with all of them

  9. KLOE Data Handling System • Composed of four elements: • Database System • Archiving System • Spy System • KLOE Integrated Dataflow (KID)

  10. KLOE Data Handling System • A mix of commercial and custom software • the dependency on commercial software is minimized by the layers of custom software • commercial software carries out all the vital functions • custom software mostly extends and coordinates the functionality of the commercial software

  11. KLOE Data Handling System • Based on a set of multi-threaded non-privileged daemons and related libraries • Distributed across several nodes • Communication by means of TCP/IP sockets on high ports • Slide callouts: bypasses TCP/IP filtering; flexible, programming language and operating system independent; no configuration needed on the client side

  12. KLOE Data Handling System • Composed of four elements: • Database System • Archiving System • Spy System • KLOE Integrated Dataflow (KID)

  13. Database System • Two distinct database systems are used • offline database system • based on HepDB • data stored as ZEBRA banks • online database system • based on a Relational DBMS • data are structured in fields • extended for distributed environments

  14. Online Database System • data stored in a Relational DBMS • IBM DB2 Universal Database at the moment • communication between the clients (user applications) and the RDBMS through a database daemon • [Diagram: several applications (app) connected to the database daemon (DD), which is connected to the RDBMS]

  15. Database Daemon • The database daemon is the only link between the applications and the RDBMS • if the RDBMS is changed in the future, only the database daemon will need to be changed • Different kinds of commands are managed by the daemon • general SQL commands • KLOE specific commands

  16. Database Daemon • Different kinds of commands are managed by the daemon • general SQL commands • passed directly to the RDBMS • example: select run_nr from run_logger where status = 'OK' • KLOE specific commands • managed by the daemon itself • the RDBMS is used to retrieve and store data needed by the daemon itself • example: log that I am starting processing file relative to run 3
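
The split between general SQL commands and KLOE specific commands can be sketched as a small dispatcher. This is only an illustration, not the KLOE daemon: the `kloe` command prefix, the `start_processing` command name, and the `processing_log` table are hypothetical, and an in-memory SQLite database stands in for IBM DB2. Only the `run_logger` query comes from the slide.

```python
import sqlite3


class DatabaseDaemonSketch:
    """Toy stand-in for the database daemon (SQLite replaces the real RDBMS)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("create table run_logger (run_nr integer, status text)")
        # Hypothetical table holding data the daemon needs for its own commands.
        self.db.execute("create table processing_log (run_nr integer, state text)")
        self.db.executemany(
            "insert into run_logger values (?, ?)",
            [(1, "OK"), (2, "BAD"), (3, "OK")],
        )

    def handle(self, command):
        verb, _, rest = command.partition(" ")
        if verb == "kloe":  # hypothetical prefix marking KLOE specific commands
            return self._kloe_command(rest)
        # General SQL commands are passed directly to the RDBMS.
        return self.db.execute(command).fetchall()

    def _kloe_command(self, rest):
        name, _, arg = rest.partition(" ")
        if name == "start_processing":  # hypothetical command name
            run_nr = int(arg)
            # The daemon itself uses the RDBMS to store what it needs.
            self.db.execute(
                "insert into processing_log values (?, 'started')", (run_nr,)
            )
            return [("logged", run_nr)]
        raise ValueError(f"unknown KLOE command: {name!r}")


daemon = DatabaseDaemonSketch()
ok_runs = daemon.handle("select run_nr from run_logger where status = 'OK'")
ack = daemon.handle("kloe start_processing 3")
```

Because clients only ever talk to `handle`, replacing the RDBMS would touch this class alone, which is the point slide 15 makes about the daemon.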

  17. Database Daemon • The use of KLOE specific commands has several advantages • additional checks and restrictions are possible • data consistency management is centralized • fast central caches can be implemented • for example, the DAQ configuration cache reduces the typical access time from 4 to 0.1 s
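
A central cache of this kind can be sketched in a few lines. The class name, the expiry policy, and the `fetch` callback are assumptions made for illustration; the slide does not describe how the KLOE cache is implemented or invalidated.

```python
import time


class ConfigCache:
    """Minimal read-through cache with a maximum entry age (assumed policy)."""

    def __init__(self, fetch, max_age_s=60.0, clock=time.monotonic):
        self.fetch = fetch          # slow lookup, e.g. a query to the RDBMS
        self.max_age_s = max_age_s
        self.clock = clock
        self.entries = {}           # key -> (time stored, value)

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] <= self.max_age_s:
            return hit[1]           # served from memory, no backend access
        value = self.fetch(key)     # slow path: go to the backend
        self.entries[key] = (now, value)
        return value


backend_calls = []


def slow_fetch(key):
    """Hypothetical backend lookup; records each call so hits can be counted."""
    backend_calls.append(key)
    return {"config": key}


cache = ConfigCache(slow_fetch)
first = cache.get("daq")
second = cache.get("daq")   # answered from the cache
```

Only the first request reaches the backend; later requests for the same key within the age limit are answered from memory.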

  18. A light version • The RDBMS is used to ensure flexibility, reliability and performance • Demanding in terms of computing resources and management effort • stand-alone environments often cannot afford it • A RDBMS-independent version of the database daemon is under development

  19. A light version • A RDBMS-independent version of the database daemon is under development • limited to KLOE specific and the most frequently used SQL commands • based on use of flat files containing a small portion of the data • not suitable for a production environment, but enough for home use

  20. KLOE Data Handling System • Composed of four elements: • Database System • Archiving System • Spy System • KLOE Integrated Dataflow (KID)

  21. KLOE Archiving System • Expected event data managed by KLOE • 1 PB • Tape libraries needed • data storage and retrieval are nontrivial • random access to data is very inefficient • Disk-based intermediate buffers used

  22. KLOE Archiving System • Two types of intermediate buffers • DAQ, offline and Monte Carlo output are structured as YBOS files and written on their disk output areas • event data needed by offline as input are read from the archiving system disk-cache

  23. KLOE Archiving System • Data needs to be migrated • from output areas to the tape library as soon as possible (taking into account also efficiency concerns) • from the tape library to the disk cache when an application needs it (or even better, a bit earlier) • Migration is totally automated and transparent to the applications

  24. KLOE Archiving System • The Archiving System is made of four components • storage managers • disk space managers • output areas • cache areas • archival director • cache manager • Communication by means of TCP/IP sockets • Coordinated by the online database • Daemon names shown on the slide: archADSM, spacekeeper, filekeeper, archiver, retrieve

  25. Storage Managers • One for each logical tape library • Allows • queries about tape library content • file archival • file retrieval • Transaction oriented (if the underlying tape library software supports it)

  26. Storage Managers • The only link between the tape library and the rest of the system • interface independent of the underlying archiving software • IBM ADSM is used with the current tape library • if another product is used in the future, only a specific storage manager will need to be developed

  27. Disk Space Managers • One for each disk pool • Create and delete files • unused files get deleted to make space for new ones
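
The "unused files get deleted to make space for new ones" policy can be sketched with an in-memory model of a disk pool. The least-recently-used eviction order and the `in_use` set are assumptions; the slide does not say how the KLOE disk space managers choose which unused files to delete.

```python
from collections import OrderedDict


class DiskPoolSketch:
    """In-memory model of one disk pool; no real files are touched."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # name -> size, least recently used first
        self.in_use = set()         # files currently open by some application

    def used(self):
        return sum(self.files.values())

    def touch(self, name):
        """Record an access so the file becomes the most recently used."""
        self.files.move_to_end(name)

    def create(self, name, size):
        # Delete unused files, oldest access first, until the new file fits.
        for victim in list(self.files):
            if self.used() + size <= self.capacity:
                break
            if victim not in self.in_use:
                del self.files[victim]
        if self.used() + size > self.capacity:
            raise OSError("not enough free space in the pool")
        self.files[name] = size


pool = DiskPoolSketch(capacity=100)
pool.create("a", 40)
pool.create("b", 40)
pool.touch("a")        # "a" was read recently, so "b" is now the oldest
pool.create("c", 40)   # does not fit until "b" is deleted
remaining = list(pool.files)
```

In this sketch a file that is still in use is never deleted, so a request can fail when the pool is full of active files.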

  28. Archival Director • Fully automated • Works in polling mode • from time to time looks for files ready to be archived • starts archiving only when enough data is available • Files are ordered and grouped to minimize the expected retrieve time • Several groups of files can be archived in parallel
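
One polling pass of such a director might look like the sketch below. The grouping key (the stream), the ordering key (the run number), and the size threshold are assumptions chosen for illustration; the slide only states that files are ordered and grouped to minimize the expected retrieve time and that archiving starts when enough data is available.

```python
from collections import defaultdict


def plan_archival(ready_files, min_group_bytes):
    """Return the groups of file names to archive in this polling pass.

    ready_files: list of dicts with hypothetical keys
                 'name', 'stream', 'run_nr' and 'size' (bytes).
    """
    groups = defaultdict(list)
    for f in ready_files:
        groups[f["stream"]].append(f)   # assumed grouping: one group per stream

    plan = []
    for stream in sorted(groups):
        files = sorted(groups[stream], key=lambda f: f["run_nr"])
        # Start archiving a group only when enough data is available.
        if sum(f["size"] for f in files) >= min_group_bytes:
            plan.append([f["name"] for f in files])
    return plan


ready = [
    {"name": "ksl_5001", "stream": "ksl", "run_nr": 5001, "size": 600},
    {"name": "rad_5000", "stream": "rad", "run_nr": 5000, "size": 300},
    {"name": "ksl_5000", "stream": "ksl", "run_nr": 5000, "size": 500},
]
plan = plan_archival(ready, min_group_bytes=1000)
```

Each returned group could then be handed to a separate worker, matching the statement that several groups can be archived in parallel. Files in groups that are still too small simply wait for the next polling pass.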

  29. Cache Manager • User driven • when a file is needed, the application asks the cache manager where it is located • a retrieve is performed by the manager if needed • Several requests can be issued at the same time • the manager reorders them internally to minimize the tape mounts • Communication by means of TCP/IP sockets
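
The reordering step can be illustrated by grouping pending requests by tape, so that each tape is mounted once. The function name, the `tape_of` lookup table, and the `cached` set are hypothetical; the real cache manager obtains this information from the online database and the disk space managers.

```python
def order_requests(requests, tape_of, cached):
    """Split requests into cache hits and per-tape retrieval batches.

    requests: file names in arrival order
    tape_of:  dict mapping file name -> tape label (hypothetical catalogue)
    cached:   set of file names already present in the disk cache
    """
    hits = []
    batches = {}   # tape label -> files to read during one mount
    for name in requests:
        if name in cached:
            hits.append(name)                 # no retrieve needed
        else:
            batches.setdefault(tape_of[name], []).append(name)
    # Dicts keep insertion order, so tapes are served in first-request order.
    return hits, list(batches.items())


tape_of = {"f1": "T01", "f2": "T02", "f3": "T01", "f4": "T02"}
hits, mounts = order_requests(["f1", "f2", "f3", "f4"], tape_of, cached={"f4"})
```

Served in arrival order, the three uncached files would need three mounts (T01, T02, T01); after grouping, two mounts are enough.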

  30. KLOE Archival System • [Diagram: the archiver and retrieve daemons, the DB, n archADSM daemons in front of the tape libraries, and m spacekeeper and k filekeeper daemons, each in front of a disk pool; connections are labeled NFS mount, local file system, and TCP/IP socket]

  31. KLOE Data Handling System • Composed of four elements: • Database System • Archiving System • Spy System • KLOE Integrated Dataflow (KID)

  32. Spy System • KLOE data acquisition software allows the event data to be read out before they are written to disk • The mechanism that reads those data is called Spy • Based on use of shared memory buffers • DAQ processes are piped using this mechanism • the spy system reads data from the buffers without interfering with the DAQ

  33. KLOE Data Handling System • Composed of four elements: • Database System • Archiving System • Spy System • KLOE Integrated Dataflow (KID)

  34. KLOE Integrated Dataflow (KID) • Integration library • database accesses and retrieve operations hidden • Offers a single point of access to all the services • URI-based selection • spy:/buffer : open a spy channel and pass the events to the application • datarec:(run_nr=5000) and (stream='ksl') : read the list from DB, ask the cache manager for the files, pass the events from the files to the application
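
URI-based selection amounts to reading the scheme in front of the first colon and handing the rest to the matching service. The sketch below uses the two URIs from the slide; the function name `open_kid` and the handlers are hypothetical and only return a description of what a real handler would do.

```python
def open_kid(uri, handlers):
    """Dispatch a KID-style URI to the handler registered for its scheme."""
    scheme, sep, selector = uri.partition(":")
    if not sep or scheme not in handlers:
        raise ValueError(f"unsupported URI: {uri!r}")
    return handlers[scheme](selector)


def open_spy(buffer_path):
    # A real handler would attach to the shared memory buffer.
    return ("spy channel", buffer_path)


def open_datarec(condition):
    # A real handler would query the DB for the file list, ask the
    # cache manager for each file, and stream the events from the files.
    return ("file list query", condition)


HANDLERS = {"spy": open_spy, "datarec": open_datarec}

live = open_kid("spy:/buffer", HANDLERS)
stored = open_kid("datarec:(run_nr=5000) and (stream='ksl')", HANDLERS)
```

The application code is the same for live and archived events; only the URI string changes.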

  35. Management effort • The entire system is managed by only a few people: • 3 people (2 full time) are engaged in KLOE computing system management (including storage) • 1 person is engaged in the development and management of the online database and the archiving system • 2 people spend a few percent of their time on the maintenance of the offline database

  36. CHEP 2000 Data Handling in KLOE I.Sfiligoi INFN LNF, Frascati, Italy