
01/12/2010

Universal Access Layer. Presented by: G. Manduchi. TF Leader: G. Falchetto. Deputies: R. Coelho, D. Coster. EFDA CSU Contact Person: D. Kalupin. 01/12/2010.



Presentation Transcript


  1. Universal Access Layer. Presented by: G. Manduchi. TF Leader: G. Falchetto. Deputies: R. Coelho, D. Coster. EFDA CSU Contact Person: D. Kalupin. 01/12/2010

  2. The need for data layer abstraction • Within the ITM framework, a unique data interface is defined. • Many data access tools are used in the fusion community: • Different data formats; • Different views of data structures. • No data access tool has been selected to be used directly in ITM. • Rather, a unique data interface is exposed to users, hiding the actual implementation. • This layer is therefore the only way users can deal with data.

  3. Database abstraction • A Data Model is presented to the simulation program. • The model is decoupled from its actual implementation. • In object-oriented terminology: program to interfaces. • Interface decoupling allows: • Changing the underlying implementation whenever technology provides a better solution; • Using different solutions, possibly mixed in distributed systems.
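
The "program to interfaces" idea on this slide can be illustrated with a minimal Python sketch. Every name here (`DataBackend`, `InMemoryBackend`, `LoggingBackend`, `run_simulation`, the path `equilibrium/ip`) is hypothetical and is not part of the UAL API; the sketch only shows that client code written against an interface keeps working when the implementation behind it is swapped.

```python
from abc import ABC, abstractmethod


class DataBackend(ABC):
    """Hypothetical interface: the only thing a simulation program sees."""

    @abstractmethod
    def put(self, path, value):
        ...

    @abstractmethod
    def get(self, path):
        ...


class InMemoryBackend(DataBackend):
    """Stand-in for one storage technology (a plain dictionary)."""

    def __init__(self):
        self._store = {}

    def put(self, path, value):
        self._store[path] = value

    def get(self, path):
        return self._store[path]


class LoggingBackend(DataBackend):
    """Stand-in for a second technology: wraps a backend and records calls."""

    def __init__(self, inner):
        self.inner = inner
        self.calls = []

    def put(self, path, value):
        self.calls.append(("put", path))
        self.inner.put(path, value)

    def get(self, path):
        self.calls.append(("get", path))
        return self.inner.get(path)


def run_simulation(db):
    """Depends only on the DataBackend interface, never on a concrete class."""
    db.put("equilibrium/ip", 1.5e6)  # invented field name, for illustration
    return db.get("equilibrium/ip") * 2


plain = InMemoryBackend()
logged = LoggingBackend(InMemoryBackend())
result_plain = run_simulation(plain)
result_logged = run_simulation(logged)
```

`run_simulation` is unchanged between the two runs, which is the property the slide attributes to interface decoupling.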

  4. A Bus-Like View of the UAL [Figure: diagram with the labels Grid, Simulation Interface (KEPLER), Batch processor, HPC server, HPC server, UAL Bus, HDF5 Files, MDSplus data server]

  5. The Data Model • Users normally have access to basic data types such as integers, floats, doubles, strings, and n-dimensional arrays. • Physical entities are represented by several pieces of information, which together contribute to their complete definition. • In the ITM framework, physical entities are represented by Consistent Physical Objects (CPOs). • Every CPO is represented by a (possibly complex) hierarchical data structure. • In essence: only CPOs can be read and written in the ITM framework.

  6. A neutral language-independent definition of CPOs • XML has been chosen to provide an abstract definition of the hierarchical structure of Consistent Physical Objects. • XML appears to be the best candidate for the generic definition of hierarchical data structures. • XML is therefore the common language for establishing how data are organized in ITM. • Several tools are available for analysis and graphical display of XML descriptions. • Such a graphical view is published on the ITM web page.
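
To make the idea of an XML-defined hierarchy concrete, here is a small Python sketch using the standard `xml.etree.ElementTree` module. The XML vocabulary (`cpo`, `struct`, `field`, the `name` and `type` attributes) and the CPO called `example_cpo` are invented for illustration; the actual ITM schema is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Invented, simplified description of one hierarchical object.
CPO_XML = """
<cpo name="example_cpo">
  <field name="time" type="double"/>
  <struct name="geometry">
    <field name="r" type="double[]"/>
    <field name="z" type="double[]"/>
  </struct>
</cpo>
"""


def leaf_paths(node, prefix):
    """Return (path, type) for every <field> below node, depth first."""
    paths = []
    for child in node:
        path = prefix + "/" + child.get("name")
        if child.tag == "field":
            paths.append((path, child.get("type")))
        else:
            # Any other element is treated as a nested structure.
            paths.extend(leaf_paths(child, path))
    return paths


root = ET.fromstring(CPO_XML)
fields = leaf_paths(root, root.get("name"))
```

The resulting list of typed paths is language neutral: the same description could drive bindings for any of the supported languages.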

  7. Consistent Physical Objects and Time • CPOs can describe time-independent information, • E.g. a geometry description. • In other cases, CPOs describe phenomena over time. • Every CPO instance represents a snapshot of the described physical quantity. • Every time-dependent CPO has the field « time », i.e. the time that the snapshot refers to. • The time evolution of a given physical object is represented by an array of CPOs in the ITM database.
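
The snapshot model above can be sketched in Python as a list of objects that each carry a `time` field. The `Snapshot` class, its `value` field, and the `closest_slice` helper are assumptions made for illustration only; they do not reproduce any real CPO definition or UAL routine.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical time-dependent object: one instance per time slice."""
    time: float   # the time this snapshot refers to
    value: float  # placeholder for the physical content


def closest_slice(slices, t):
    """Pick the snapshot whose time is nearest to t."""
    return min(slices, key=lambda s: abs(s.time - t))


# Time evolution represented as an array of snapshots.
evolution = [Snapshot(0.0, 1.0), Snapshot(0.5, 1.2), Snapshot(1.0, 1.1)]
picked = closest_slice(evolution, 0.6)
```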

  8. From Abstract Representation to Data Access • XML schemas are an effective way to agree on a widely accepted data structure. • The UAL provides an Application Programming Interface (API) for reading and writing structured data in a variety of languages (Fortran, C, Java, Matlab, Python). • A one-to-one mapping exists between the XML structure description and the actual language-specific implementation. • The language-specific API is the exclusive interface between applications and the database implementation.

  9. UAL implementation - Architecture • The UAL interface is split into two levels: • The high-level interface provides the user with a language-specific API; • The low-level interface provides a system-independent API composed of a set of low-level data access routines. • Mapping between the high-level and low-level layers is carried out by code generated from the XML interface definition. • Two implementations of the low-level API are currently available, for MDSplus and HDF5.
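
A rough Python sketch of the two-level split follows. It assumes a hypothetical low-level store that only understands paths and values, and a high-level pair of functions driven by a field list of the kind that could be derived from an XML definition. In the UAL this mapping code is generated; here it is written by hand, and all names (`LowLevelStore`, `FIELDS`, `put_cpo`, `get_cpo`) are invented.

```python
class LowLevelStore:
    """Hypothetical low-level layer: knows paths and values, not CPOs."""

    def __init__(self):
        self.data = {}

    def put(self, path, value):
        self.data[path] = value

    def get(self, path):
        return self.data[path]


# Leaf paths as they might be derived from an XML definition (invented here).
FIELDS = ["time", "geometry/r", "geometry/z"]


def put_cpo(store, name, cpo):
    """High-level write: one low-level put per leaf of the nested dict."""
    for field in FIELDS:
        node = cpo
        for key in field.split("/"):
            node = node[key]
        store.put(name + "/" + field, node)


def get_cpo(store, name):
    """High-level read: rebuild the nested dict from low-level gets."""
    cpo = {}
    for field in FIELDS:
        *parents, leaf = field.split("/")
        node = cpo
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = store.get(name + "/" + field)
    return cpo


store = LowLevelStore()
original = {"time": 0.25, "geometry": {"r": [1.0, 2.0], "z": [0.0, 0.5]}}
put_cpo(store, "example_cpo", original)
restored = get_cpo(store, "example_cpo")
```

Because only `LowLevelStore` touches storage, a different back end could be substituted without changing `put_cpo` or `get_cpo`.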

  10. MDSplus and HDF5 • MDSplus is used in the fusion community as a common format for data exchange. • HDF5 is used to provide efficient, machine-independent storage of and access to hierarchical databases. • The abstract data structures defined in the UAL are mapped onto specific data structures for both systems. • The hierarchical view of CPOs in the UAL fits well in both systems, since both support a hierarchical data view. • A comparison between the two systems is provided in: • "Commonalities and differences between MDSplus and HDF5 data systems", Fusion Engineering and Design 85(3-4), 583-590

  11. UAL structure • Structured in modular layers: one layer can be changed without modifying the others. • High level: • Knows about CPO structure • Language specific • Dynamically generated • Low level: • Deals with single elements: GET/PUT scalars, vectors, arrays, … • Can use multiple storage methods • Transport: managed by MDS+ • Storage: MDS+, memory, HDF5 [Figure: layer diagram with the labels F90; Java; C, C++; Matlab; Python; C; MDSip, TDI func; MDSplus (file or memory); HDF5]

  12. UAL data access within the ITM framework • When programs are executed within the Kepler framework, the programs themselves perform no data access. • Programs receive the current input and output CPO references as arguments when they are called by the framework. • The Kepler framework provides all the required data access and prepares the language-specific CPO structures to be used by programs. • Two Kepler modules will provide the source and the sink of the data used in a simulation. • During a Kepler simulation, CPOs can be stored in memory using the same interface. • This is achieved using the ability of MDSplus to map pulse files into a memory cache.
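
The division of labour described on this slide, where the framework handles data access and programs only receive and return CPOs, can be sketched as follows. This is plain Python and not Kepler code; `actor`, `run_workflow`, and the dictionary layout of the objects are hypothetical, and the source and sink are ordinary lists standing in for the two data modules.

```python
def actor(cpo_in):
    """Physics code: receives a CPO, returns a CPO, performs no I/O."""
    return {"time": cpo_in["time"], "value": cpo_in["value"] * 2.0}


def run_workflow(source, actors, sink):
    """Hypothetical framework loop: read from source, chain actors, write to sink."""
    for cpo in source:
        for step in actors:
            cpo = step(cpo)
        sink.append(cpo)


# Stand-ins for the source and sink modules.
source = [{"time": 0.0, "value": 1.0}, {"time": 1.0, "value": 2.5}]
sink = []
run_workflow(source, [actor, actor], sink)
```

Keeping `actor` free of I/O means the same function could be fed from memory or from persistent storage without modification.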

  13. UAL and HDF5 in highly data-intensive programs in HPC • The HDF5 UAL implementation is used for simulations handling a huge amount of data. • Parallel I/O is being integrated in the UAL for HDF5. • HDF5 files produced on HPC systems will be transferred via (Grid)FTP to the gateway orchestrating the simulation. • This approach differs from the MDSplus one, where data are exported by a data server and not stored in local files. • It is possible to mix UAL access to MDSplus pulse files and HDF5 in a transparent way, even from the same program.

  14. Collecting experiment data for the UAL • Simulations must work on real data. • Even if experiments use different data systems, most of them also provide an MDSplus « access point ». • Using MDSplus remote data access, experiment-specific data can be collected and, through the UAL interface, ITM experimental databases can be created. • MDSplus remote data access is also used to implement remote UAL access. • The exceptions are data-intensive applications that require local storage and produce very large databases, which are then transferred via FTP.

  15. Conclusions • The UAL represents the Data Bus for ITM-TF applications. • It is not tied to the Kepler framework used for simulation in ITM-TF • It provides an abstract high level view of structured data, offered in 5 different languages • High level data view is decoupled from low level data management. • Currently data storage is provided by MDSplus and HDF5; data transport and memory mapping by MDSplus.
