1 / 20

NOVA N etworked O bject-based En V ironment for A nalysis

NOVA N etworked O bject-based En V ironment for A nalysis. P. Nevski, A. Vaniachine, T. Wenaus

mona-deleon
Download Presentation

NOVA N etworked O bject-based En V ironment for A nalysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. NOVANetworked Object-based EnVironment for Analysis P. Nevski, A. Vaniachine, T. Wenaus NOVA is a project to develop distributed object oriented physics analysis components, adaptable to different experiments. A job configuration manager uses a scripting interface to provide web-based editing, submission and cataloguing of analysis jobs, both user-level and experiment-wide, centrally managed in a database. A client/server system distributed over compute nodes provides job submission and monitoring across facilities, which may span several sites. A file catalog records production relationship of data files generated by an experiment. NOVA provides database tools for geometry and parameter object storage. A NOVA web-based browser navigates a relational database storing hierarchically structured dataObjects. Clients may access database information from the code or through a CORBA-specified interface. NOVA components have been tested and deployed in the STAR and ATLAS environments. February 7, 2000

  2. Outline • Goals • Requirements • Architecture • Components • Details • Summary CHEP in Padova

  3. Motivations • Unprecedented data volume and software complexity in new large High Energy and Nuclear Physics experiments at RHIC (BNL) and LHC (CERN) • New approaches to analysis and data handling software • Distributed computing environment (DCE) is vital and increasingly powerful • Experience in developing DCE solutions for STAR • Build on experience to develop DCE tools for use in similarly challenging environments CHEP in Padova

  4. Goals • Develop software tools for • coordination and control of widely distributed analysis development and physics analysis activity • distributed management and analysis of very large datasets • enhanced robustness, reusability and maintainability of analysis software • For application in many global computing environments (ATLAS, STAR, …) • generic tools not tied to specific implementation choices • select, templatable implementations provided such that NOVA components can be used in a baseline framework CHEP in Padova

  5. Requirements • Support wide area data intensive analysis • Define middleware services are required to permit analysis applications to effectively run over wide area networks • Provide a rich set of features that applications can select and use to obtain the level of service they need to operate • Define the features and the API's necessary to allow the application and middleware to communicate • Integrate the middleware API's with the applications CHEP in Padova

  6. Design Approach • Small, modular components; application-neutral interfaces • Can be used as a coherent framework or in isolation to extend existing analysis systems • Focused on support for C++ based analysis • Used for all RHIC, LHC, other large experiments • Emphasis on user participation in iterative development; real-world prototyping and testing (STAR, ATLAS) • Extensive use of existing tools and technologies • Must be readily available, true or de facto standards, well supported, widely used or showing good growth CHEP in Padova

  7. Component-based Architecture CHEP in Padova

  8. Tools and Technologies • Third party tools and technologies used in NOVA: • MySQL: relational database for catalogs, state information and simple objects: C-structs • Perl: Unix scripting and web development tool • Apache: customizable (Perl & PHP) web server for communication and monitoring • CORBA: low-volume interprocess data exchange • ROOT: visualization and analysis tools CHEP in Padova

  9. Components NOVA components fall into four domains • Regional Center • Central management and execution of analysis • Remote Client • Mobile Analysis • Middleware Components • Data exchange and navigation tools • Client/Server object request brokerage • Data Management • Data repository, catalogue, and interface • Data model for simple objects (C-structs) CHEP in Padova

  10. Dynamic Binding • Problem: • A user has a new idea that was not foreseen at the beginning. User modifies the structure of one object in his application. Application stores new objects in the database. • Remote applications unaware of a new functionality may request objects in old format. • Solution: • Application: provides metadata request (name, time, selectors...) and the application dataObject dictionary • Database server: provides dataObject and the dictionary • Object Request Broker module: converts dataObject according to the application dictionary CHEP in Padova

  11. Application DataObject Application Dictionary Database DataObject Database Dictionary Dynamic Object Broker Object Request Broker Middleware Services Remote Application Clients Parameters Repository Central Database Server CHEP in Padova

  12. Forward Compatibility • Benefits: • Separation of database and analysis applications • Robust interface (via built-in type checking) • Dictionary built from C-header files or IDL-files • Database access is independent of application code version: user can read new dataObjects with an old executable • Usage: • Parameters data management (versioned geometry and reconstruction constants support) CHEP in Padova

  13. Static Binding • Problem: • Remote application (web browser) navigates current database hierarchy. • Solution: • Object Request Broker at the Regional Center serves dynamic HTML dataObjects in format tailored according to application ID: Netscape or MS Internet Explorer CHEP in Padova

  14. Static Object Broker NOVA Browser Application DataObject Apache Web Server Application ID Database API Module Database DataObject Middleware Services Database API Call Parameters Repository Regional Center Database Server Remote Application Client CHEP in Padova

  15. Layered Interface CHEP in Padova

  16. Data Model CHEP in Padova

  17. Job configuration manager Cataloguing Analysis Workflow Job monitoring system fileCatalog CHEP in Padova

  18. GCA Interface gcaClient FileCatalog IndexFeeder Grand Challenge Interface GC System STAR Components Query Estimator StIOMaker Query Monitor database fileCatalog Cache Manager tagDB Index Builder CHEP in Padova

  19. & GCA-dependent • gcaClient interface • Experiment sends queries and get back filenames through the gcaClient library calls Limiting Dependencies Experiment-specific • IndexFeeder server • IndexFeeder read the “tag database” so that GCA “index builder” can create index • FileCatalog server • FileCatalog queries the “file catalog” database of the experiment to translate fileID to HPSS & disk path CHEP in Padova

  20. Summary What is NOVA? • Framework components for distributed computing What are NOVA components? • Configuration manager for analysis jobs • Distributed job submission and monitoring system • Analysis workflow catalog • Database for versioned dataObjects • Brokered extraction of dataObjects • Web-based database navigation tool CHEP in Padova

More Related