
Data Catalogue Service


tavita

Presentation Transcript


  1. Data Catalogue Service Work Package 4

  2. Main Objective:
     • Deployment, operation and evaluation of a cataloguing service for scientific data.
     Why:
     • Potential benefits beyond the convenience of powerful data searching/retrieving.
     WP4

  3. Outcomes
     • develop the generic software infrastructure to support the interoperation of facility data catalogues,
     • deploy this software to establish a federated catalogue of data across the partners,
     • provide data services based upon this generic framework which will enable users to deposit, search, visualise, and analyse data across the partners’ data repositories,
     • evaluate the service (also from the perspective of facility users),
     • manage jointly the evolution of this software and the services based upon it,
     • promote the take-up of this technology and the services based upon it beyond the project.

  4. Relations and dependencies
     • User AAA services (WP3)
     • Virtual Laboratories (WP5)
     • Requires an established shared user AAA service to underpin the integrated data catalogue; both are required to enable seamless access to the content through the virtual laboratories.

  5. Methodology
     Builds on:
     • PaNdata Support Action
     • user AAA services
     in order to provide a service to the virtual labs.
     No intention to develop a new metadata catalogue: STFC’s ICAT is an advanced implementation.
     • Deployed in various facilities including Elettra/NFFA (+VCR)
     Comparison with other systems will be necessary:
     • MCA, MCAT, Artemis and Fireman (outdated candidates?)
     • Check: AMGA (Fireman replacement in gLite)

  6. The current system will need further development. Issues that have to be addressed:
     • how to link logical files (indexed by metadata) to physical files
     • how to query metadata
     • how to authorize user access to metadata (WP3 feedback?)
     • what API to propose to programs to access metadata and data
       (ICAT API at the catalogue level; pHDF5/NeXus, Common Data Model? for the actual data, in line with PaNdata)
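
The first three issues on this slide (linking logical to physical files, querying metadata, authorizing access) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `LogicalFile` and `Catalogue` classes, their fields, and the per-entry user set are hypothetical and are not the ICAT API or the WP3 AAA mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalFile:
    """Hypothetical catalogue entry: a logical name indexed by metadata."""
    logical_name: str
    metadata: dict
    physical_urls: list = field(default_factory=list)  # replicas of the same file
    allowed_users: set = field(default_factory=set)    # users allowed to see the entry


class Catalogue:
    """Toy in-memory catalogue (an illustration, not ICAT)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.logical_name] = entry

    def query(self, user, **criteria):
        """Return the logical names the user may see whose metadata match all criteria."""
        return sorted(
            name
            for name, entry in self._entries.items()
            if user in entry.allowed_users
            and all(entry.metadata.get(k) == v for k, v in criteria.items())
        )

    def resolve(self, user, logical_name):
        """Map a logical name to its physical locations, enforcing authorization."""
        entry = self._entries[logical_name]
        if user not in entry.allowed_users:
            raise PermissionError(f"{user} may not access {logical_name}")
        return list(entry.physical_urls)
```

Keeping the physical locations behind `resolve` means a program only ever handles logical names and metadata until it actually needs the data, which is one possible way to separate the catalogue-level API from the data-level one.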

  7. Additional
     • Should we “migrate” old files / archived datasets too? (converters?)
     Initial requirement
     • Set of keywords for the metadata catalogue
     • Expansion based on existing implementations + PaNdata SA Integration WP outcome + Dublin Core?
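
If Dublin Core were adopted as part of the keyword set, facility-specific keywords could be mapped onto its 15 core elements, with everything else kept as catalogue-specific metadata. The sketch below assumes this approach; the facility keyword names and the mapping itself are hypothetical, and only the Dublin Core element names are taken from the standard.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical mapping from facility keywords to Dublin Core elements.
KEYWORD_TO_DC = {
    "proposal_title": "title",
    "principal_investigator": "creator",
    "measurement_date": "date",
    "file_format": "format",
    "dataset_id": "identifier",
}


def to_dublin_core(record):
    """Split a metadata record into Dublin Core fields and remaining keywords."""
    dc, extra = {}, {}
    for key, value in record.items():
        element = KEYWORD_TO_DC.get(key)
        if element is None:
            extra[key] = value   # no Dublin Core equivalent: keep as-is
        else:
            dc[element] = value
    return dc, extra
```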

  8. Populating the catalogue
     • virtual laboratories (WP5) – demonstration & test
     • Existing data archives of other partners
     • May require converters + metadata generation
     • Distributed access
     • accessing data distributed over multiple sites via their metadata
     • performance and scalability will be evaluated (as elaborated in WP5)

  9. Task 4.1
     • Survey existing systems
     • ICAT and others
     • Examine them against the metadata, authorisation, performance, and ontological requirements of vLab (WP5) and uCAT AAA (WP3)
     Task 4.2
     • Deployment of the chosen metadata catalogue solution (=ICAT)
     Task 4.3
     • Remote API access to the individual catalogues
     • Single search capability across the collaborating facilities
     Task 4.4
     • Benchmarking: evaluation of the performance
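
The single search capability of Task 4.3 could, for instance, fan a query out to each facility's remote catalogue and merge the answers. The following is a minimal sketch under that assumption: `federated_search` and the per-facility search callables are hypothetical stand-ins for whatever remote API the individual catalogues end up exposing.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(facilities, criteria):
    """Query every facility catalogue and merge the results.

    `facilities` maps a facility name to a callable that takes the search
    criteria and returns a list of dataset identifiers (hypothetical API).
    Returns (hits, errors): hits is a list of (facility, dataset) pairs,
    errors maps the facilities that failed to the error message.
    """
    def run(item):
        name, search = item
        try:
            return name, search(criteria), None
        except Exception as exc:  # one unreachable site must not break the search
            return name, [], str(exc)

    with ThreadPoolExecutor(max_workers=max(1, len(facilities))) as pool:
        outcomes = list(pool.map(run, facilities.items()))

    hits = [(name, ds) for name, results, _ in outcomes for ds in results]
    errors = {name: err for name, _, err in outcomes if err is not None}
    return hits, errors
```

Reporting failures per facility rather than raising keeps partial results usable, and the per-site timing of such calls is the kind of quantity the benchmarking in Task 4.4 would measure.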

  10. Indicators of success
      • Searchable data catalogue established in participating facilities (more than 50% uptake)
      • Cross-facility searching in place for data from different facilities

  11. Deliverables
      • D4.1. Requirements analysis for common data catalogue (M9: June 2012)
      • D4.2. Populated metadata catalogue with data from the virtual laboratories (M15: Dec. 2012)
      • D4.3. Deployment of cross-facility metadata searching (M21: June 2013)
      • D4.4. Benchmark of performance of the metadata catalogue (M27: Dec. 2013)
