The Faster Research Cycle Interoperability for better science

Presentation Transcript

  1. The Faster Research Cycle: Interoperability for better science. Brian Matthews, Leader, Information Management Group, E-Science Centre, STFC Rutherford Appleton Laboratory

  2. The Research Lifecycle E-Science: providing the infrastructure for the research lifecycle

  3. How do we speed up this cycle? • By speeding up the cycle we can increase the volume of good science • Make a better return from the investment in science • Make breakthroughs in science earlier • Do this via: • Integration • Support the whole lifecycle • See Kerstin’s talk • Interoperability • Support across lifecycles

  4. Interoperability • Sharing across boundaries • Across different research lifecycles • Across institutions • Across information objects • Across disciplines • Across time • Characteristics • Loosely coupled • Across different authorities • Different internal models

  5. Enabling better science [Figure: neutron diffraction, NMR and X-ray diffraction combined as SCIENCE MASHUPS, leading to high-quality structure refinement]

  6. Vision Infrastructure to support science across disciplines, scientific institutions and research groups

  7. EDNS • European Data Infrastructure for Neutron and Synchrotron Sources • Combining European Neutron and Synchrotron Facilities • Already a common user community • Across many disciplines • Materials, chemistry, proteomics, pharmaceuticals, nuclear physics, archaeology …

  8. Interoperability Across Facilities • Neutrons: ILL (France), ISIS (UK) • Synchrotron X-Rays: ESRF (France), Diamond (UK) • e-Science

  9. Common CRIS [Diagram] • Different Infrastructures → Different User Experiences: Facility 1, Facility 2 and Facility 3 each hold their own User Data, Raw Data, Data Analysis, Analysed Data, Published Data and Publications • Single Infrastructure → Single User Experience: User Catalogue, Raw Data Catalogue, Analysed Data Catalogue, Published Data Catalogue, Publications Catalogue, Data Analysis • Underlying resources: User Registries, Capacity Storage, Software Repositories, Data Repositories, Publications Repositories • Integration and interoperation across facilities

  10. Potential Impact • Most of Research Lifecycle • User Management, Data Collection, Analysis, Publication • Establish a Production service • benefit to users – usability, findability: user info, data, pubs, software • benefit to facilities – manageability: users, data, pubs, software • Outreach and expansion • Linking with other facilities in Europe and the wider world • USA, Canada, Australia • Linking with User communities But at the moment, we are still in the planning and discussion phase

  11. Sharing Users [Diagram labels: Facility User, FedID, Facility UserID, DN, Shibboleth ID, SRB, System UID, SSH PK] • Sharing knowledge of users • Enhancing level of support for users • Can correlate similar applications put into different facilities • Facilities can provide a continuity of service • Facilities can increase accuracy • Common Authentication • Common UID? • Shibboleth • Grid Certificates • SSO at STFC, ShibGrid • Virtual Organisation Support • Policy Issues • Data protection • Institutional Security policy
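
The idea of one federated ID tying together a user's per-system identifiers can be sketched as a small lookup registry. This is only an illustration under stated assumptions: `FederatedUser`, `UserRegistry`, the scheme names and the example values are all hypothetical and do not describe any facility's actual system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FederatedUser:
    """One person, with the identifier each system knows them by."""
    fed_id: str
    # scheme name (e.g. "shibboleth_id") -> identifier value in that scheme
    identifiers: Dict[str, str] = field(default_factory=dict)


class UserRegistry:
    """Hypothetical registry mapping any known identifier to a federated ID."""

    def __init__(self) -> None:
        self._users: Dict[str, FederatedUser] = {}
        self._index: Dict[Tuple[str, str], str] = {}  # (scheme, value) -> fed_id

    def link(self, fed_id: str, scheme: str, value: str) -> None:
        """Record that `value` in `scheme` belongs to the user `fed_id`."""
        user = self._users.setdefault(fed_id, FederatedUser(fed_id))
        user.identifiers[scheme] = value
        self._index[(scheme, value)] = fed_id

    def resolve(self, scheme: str, value: str) -> Optional[FederatedUser]:
        """Find the federated user behind a system-specific identifier."""
        fed_id = self._index.get((scheme, value))
        return self._users.get(fed_id) if fed_id is not None else None


registry = UserRegistry()
registry.link("fed-001", "facility_user_id", "isis-42")
registry.link("fed-001", "shibboleth_id", "jane@example.ac.uk")
user = registry.resolve("shibboleth_id", "jane@example.ac.uk")
```

Resolving through a shared index is what would let two facilities recognise the same applicant; the policy issues listed on the slide (data protection, institutional security) are outside the scope of this sketch.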

  12. Sharing Data • Sharing data is hard: • Different data formats • Different access rights • Complex objects • Maintaining context • Metadata is key • Structural Metadata (CSMD) • Conceptual structures (Ontologies) – maintain meaning • Metadata is hard to collect • Consistent data policies are needed

  13. eCrystals ‘Data Federation’ Model [Diagram components: Data creation & capture in “Smart lab”; Laboratory repository (Deposit); Data analysis; Institutional data repositories (Deposit, Validation); Subject Repository (Deposit, Validation); Institution Library & Information Services (Curation, Preservation); Aggregator services (Search, harvest); Presentation services / portals (Data discovery, linking, citation); Publishers: peer-review journals, conference proceedings, etc. (Publication)]

  14. Data Policy: Trust and Security for NGGs [Diagram labels: Goals & Requirements, …, Self-*, Dynamic VO, Policies, VO Mngt, …, Usage control, Resources, …] • Data policy • Retention • Quality • Access • Learning how to manage policy as part of the SOA infrastructure • E.g. GridTrust • Consequence – looking at Data Policy • Remains a very large business question

  15. Sharing Publications • Institutional Repository s/w now very well established • ePrints, DSpace, Fedora, ePubs • Large body of expertise available • Standard metadata models and protocols: • DC-APs, FRBR, OAI-PMH, OAI-ORE • Not yet embedded in science practice • except HEP! • Linking science data and publications • Not yet well established • Needs data citation • Needs peer review of data • Can (and should) be done on a P2P basis
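
OAI-PMH harvesting works through plain HTTP requests that carry a `verb` parameter. A minimal sketch of building a `ListRecords` request for Dublin Core records follows; the base URL is a placeholder rather than a real repository, and `list_records_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL.

    `from_date` (e.g. "2008-01-01") enables selective, incremental
    harvesting; `set_spec` restricts the harvest to one repository set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, assumed for illustration only.
url = list_records_url("https://repository.example.org/oai", from_date="2008-01-01")
```

An aggregator would issue a request like this against each institutional repository and follow any resumption tokens in the responses; that part is omitted here.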

  16. STFC

  17. Sharing Software • Analysis software tends to be specialised • Dependent on specific data formats • Dependent on the nature of the data • Dependent on the particular result to be demonstrated • Nevertheless common s/w repositories exist • GAMS, StarLink, NAG, CCPForge etc. • Advantages in sharing it • Saves programmer effort • Verification of results • Common algorithms • Visualisation tools • Little work on systematic preservation of s/w • Significant properties of s/w

  18. Common Representation and Transport of Information • To support the infrastructure we need a means to share information • Lightweight • Minimal impact on internal systems • Keeps control at the source • Easy to share and merge • Can share conceptual information • The Semantic Web (still) provides the best current option
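
Why triple-based descriptions are "easy to share and merge" can be shown with a small sketch: if each source publishes statements as (subject, predicate, object) triples that use shared identifiers, merging is just a set union. The `ex:` identifiers and the helper names below are assumptions for illustration; a real deployment would use an RDF library and proper URIs.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def merge(*graphs: Set[Triple]) -> Set[Triple]:
    """Merge graphs from independent sources; duplicates collapse in the set."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph: Set[Triple], subject: str) -> Dict[str, Set[str]]:
    """Collect everything the graph says about one subject."""
    description: Dict[str, Set[str]] = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


# Two sources describe the same sample without coordinating with each other.
facility_a = {("ex:sample1", "ex:measuredBy", "ex:neutronDiffraction")}
facility_b = {
    ("ex:sample1", "ex:measuredBy", "ex:xrayDiffraction"),
    ("ex:sample1", "ex:title", "Sample 1"),
}
combined = merge(facility_a, facility_b)
about_sample = describe(combined, "ex:sample1")
```

Each source keeps control of its own statements, and neither has to change its internal systems for the merge to work.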

  19. DataWebs • DataWeb concept • David Shotton, Oxford • Biological images • Publishing metadata locally • With different conceptual description • Mapped to core Ontology • Search and aggregator service • Integration comes for free
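
The DataWeb pattern (local metadata, local vocabulary, a mapping to a core ontology, and a search service on top) might look roughly like the sketch below. All field names, source names and records are invented for illustration and are not taken from the DataWeb project itself.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Mapping = Dict[str, str]  # local field name -> core ontology term


def to_core(record: Record, mapping: Mapping) -> Record:
    """Re-express a locally described record using core ontology terms."""
    return {mapping.get(key, key): value for key, value in record.items()}


def search(
    sources: Dict[str, Tuple[List[Record], Mapping]],
    core_term: str,
    value: str,
) -> List[Tuple[str, Record]]:
    """Aggregator: query every source through its mapping to the core terms."""
    hits = []
    for name, (records, mapping) in sources.items():
        for record in records:
            core = to_core(record, mapping)
            if core.get(core_term) == value:
                hits.append((name, core))
    return hits


# Hypothetical publishers, each with its own field names.
sources = {
    "lab_a": ([{"organism": "mouse", "img": "a1.tif"}],
              {"organism": "core:species", "img": "core:image"}),
    "lab_b": ([{"taxon": "mouse", "file": "b7.png"}, {"taxon": "zebrafish", "file": "b8.png"}],
              {"taxon": "core:species", "file": "core:image"}),
}
hits = search(sources, "core:species", "mouse")
```

The publishers never agree on field names; only the mappings are shared, which is the sense in which integration "comes for free" once the mapping exists.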

  20. SKOS: Simple conceptual relationships
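
SKOS relates concepts with properties such as `skos:broader`, and defines `skos:broaderTransitive` for following those links across several levels. A minimal sketch of that transitive lookup over a toy vocabulary; the concept labels are invented for illustration and are not from any published scheme.

```python
from typing import Dict, Set


def broader_transitive(broader: Dict[str, Set[str]], concept: str) -> Set[str]:
    """Return every concept reachable by following 'broader' links upward."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, ()):
            if parent not in seen:  # guard against revisiting (and cycles)
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy vocabulary: concept -> its directly broader concepts (assumed labels).
broader = {
    "neutron diffraction": {"diffraction"},
    "x-ray diffraction": {"diffraction"},
    "diffraction": {"scattering techniques"},
}
ancestors = broader_transitive(broader, "neutron diffraction")
```

A search for a broad concept can then also match records tagged with any of its narrower concepts, without the sources sharing a full formal ontology.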

  21. A Reality Check: the SUPER Report • Do the users really want all this? • Study of User Priorities for e-Infrastructure for e-Research (SUPER) • Survey commissioned by the UK NeSC • Steven Newhouse, Jennifer Schopf, Andrew Richards, Malcolm Atkinson • Covered 45 people from over 30 e-Science projects • Small survey • Selected from the already converted! • Available:

  22. SUPER Results • Some Concerns: • How to share data with colleagues • Large-scale data sets (files) • Metadata standards seen as key • Automatic capture of provenance • Long-term data curation • Help with best practice to curate data • Authentication • Simpler authentication mechanisms • Easier use of Virtual Organisations • Training and outreach We seem to be hitting the right points!

  23. Summary • Interoperability gives leverage to speed up the science lifecycle • Access to resources across institutions and disciplines • Metadata is key • Policy is key • Need to use semantic description to share meaning • Loose coupling of resources via the Semantic Web