Linking Data to Open Access Publications

Linking Data to Open Access Publications. EGU, 23 April 2012, Najla Rettberg, OpenAIRE, University of Göttingen. In 12 Minutes… OpenAIRE – Publications and Data; Demonstrators for Enhanced Publications; Use Case Scenarios; Services for Users. OpenAIRE – Second Phase.

Presentation Transcript


  1. Linking Data to Open Access Publications • EGU, 23 April 2012 • Najla Rettberg, OpenAIRE, University of Göttingen

  2. In 12 Minutes… • OpenAIRE – Publications and Data • Demonstrators for Enhanced Publications • Use Case Scenarios • Services for Users

  3. OpenAIRE – Second Phase • Open Access, participatory infrastructure for scientific information, linking publications, datasets, funding • Disseminates OA/RDM information in Europe • Opens its content (search, browse, stats), also to 3rd-party/service providers • Capitalizes on the OpenAIRE infrastructure, built for the Open Access pilot, FP7-funded articles (measuring the impact of EC SC39)

  4. Portal: Search, Access, Deposit

  5. Past, present and OpenAIREplus (diagram; labels only) • OpenAIREplus • OpenAIRE+ Guidelines for Data Providers • Dataset repositories • Metadata on data sets • FP7 publications • 5,600,000 OA publications • 311 validated repositories • OpenAIRE Guidelines v2.0 • National funding publications • OpenAIRE Guidelines v1.0 • DRIVER Guidelines • EC project metadata • National project metadata • Publication repositories network (institutional & thematic)

  6. Covering ‘European Knowledge’ • Open Data Infrastructures • OA Publication Infrastructure • ESFRI, EU-wide infrastructures

  7. A ‘Static’ publication (slide from Jens Klump)

  8. Enhanced Publications (EPs) • Compound information objects: represent the aggregation of distinct information objects through meaningful relationships • Example of SURF-EPs: textual publications enhanced with links to datasets • OpenAIREplus provides EP services: • Management: creation and curation • Visualization, browsing, querying • Import: OAI-PMH/ORE harvesting of EPs from external providers • Export: OAI-PMH/ORE publishing of EPs, Linked Data representation
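
The idea of an EP as a compound object, distinct information objects aggregated through meaningful relationships, can be sketched as a small data structure. This is only an illustration, not the OpenAIREplus data model: the class names, the `isSupportedBy` relation label and the `10.1234/...` identifiers are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InfoObject:
    """One part of an enhanced publication (hypothetical structure)."""
    identifier: str  # e.g. a DOI or URL; the values used below are placeholders
    kind: str        # "publication", "dataset", "project", ...
    title: str


@dataclass
class EnhancedPublication:
    """Aggregation of information objects plus typed relationships."""
    parts: dict = field(default_factory=dict)      # identifier -> InfoObject
    relations: list = field(default_factory=list)  # (source, relation, target)

    def add(self, obj):
        self.parts[obj.identifier] = obj

    def relate(self, source_id, relation, target_id):
        # Both ends must already be part of the aggregation.
        if source_id not in self.parts or target_id not in self.parts:
            raise KeyError("both objects must be added before linking")
        self.relations.append((source_id, relation, target_id))

    def related(self, source_id, kind):
        """Objects of the given kind that source_id points to."""
        return [self.parts[target] for source, _, target in self.relations
                if source == source_id and self.parts[target].kind == kind]


def build_example():
    """Build a text publication enhanced with a link to one dataset."""
    ep = EnhancedPublication()
    ep.add(InfoObject("doi:10.1234/example-paper", "publication", "Example paper"))
    ep.add(InfoObject("doi:10.1234/example-data", "dataset", "Example dataset"))
    # "isSupportedBy" is an assumed relation name, not taken from the slides.
    ep.relate("doi:10.1234/example-paper", "isSupportedBy", "doi:10.1234/example-data")
    return ep
```

Calling `build_example().related("doi:10.1234/example-paper", "dataset")` returns the linked dataset object. Keeping relations as typed triples is one way to leave room for a later Linked Data export.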

  9. ‘Information in Context’

  10. Cross-discipline approach • Attempt at a generic workflow • No one-size-fits-all for data • Use different data types, PIs, policies, access levels, standards • Look at research-driven disciplines, different communities • Incremental, based on prototypes • “..any roadmap for OA infrastructure must address this natural tension between diversity and infrastructure” C. Meier zu Verl & W. Horstmann (Eds.), 2011. Studies on Subject-Specific Requirements for Open Access Infrastructure.

  11. Subject-specific pilots • Learning lessons from interoperation of data infrastructures • Interoperability pilots between OpenAIREplus and subject-specific infrastructures • In the Life Sciences • In the Social Sciences • Exploitation in modelling and implementation for OpenAIRE data model • Relationship entities: projects, publications, datasets

  12. The Challenges • Aggregation and discovery of resources • Representation of diverse disciplines in a ‘generic’ infrastructure • Access restrictions/reuse policies • User-friendly way for researchers to link research results with project information • Machine-readable (Linked Open Data)

  13. Two disciplines… • SSH - DANS/EASY • Produce handmade EPs at file level • Experienced data modelling and research work (Veteran tapes) • Life Sciences – EMBL-EBI • Text mine abstracts/full texts • Link bio-entities to database • Enriched information could be transferred to generic infrastructure

  14. Demonstrator • Data model • Generalised • Extract citation info for datasets • from e.g. UniProt and full text • Derive Persistent Identifiers • from URLs (URNs and PMC-Ids) • Transfer of linked entities • community services and OpenAIRE infrastructure
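
As a rough illustration of the "derive persistent identifiers from URLs" step, the sketch below pulls a URN, a PMC ID or a DOI out of a URL with regular expressions. It is not the demonstrator's actual code: the patterns are simplified assumptions, the function name `derive_pid` is invented here, and the example URLs and identifiers in the usage note are placeholders.

```python
import re
from urllib.parse import unquote, urlparse

# Simplified patterns (assumptions for this sketch, not a full specification).
URN_RE = re.compile(r"urn:[a-z0-9][a-z0-9-]{0,31}:\S+", re.IGNORECASE)
PMC_RE = re.compile(r"PMC\d+", re.IGNORECASE)
DOI_RE = re.compile(r"10\.\d{4,9}/\S+")


def derive_pid(url):
    """Return (scheme, identifier) for the first pattern found, else None."""
    text = unquote(url)

    # A URN may be embedded anywhere in a resolver URL.
    match = URN_RE.search(text)
    if match:
        return ("urn", match.group(0))

    # PMC IDs and DOIs are looked up in the path part only.
    path = urlparse(text).path
    match = PMC_RE.search(path)
    if match:
        return ("pmcid", match.group(0).upper())
    match = DOI_RE.search(path)
    if match:
        return ("doi", match.group(0))
    return None
```

For instance, `derive_pid("https://doi.org/10.1234/abc.5678")` yields `("doi", "10.1234/abc.5678")`, while a URL containing none of the patterns yields `None`, so unmatched links can be left for manual curation.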

  15. Use Cases • Import EP created in DANS or SURF • Proof of Services Interoperability

  16. Use Cases • Import EP created in DANS or SURF • Proof of Services Interoperability • Manual composition of EP in OpenAIRE • Proof of Tools: Editor, Discovery of Research data in OpenAIRE

  17. Use Cases • Import EP created in DANS or SURF • Proof of Services Interoperability • Manual composition of EP in OpenAIRE • Proof of Tools: Editor, Discovery of Research data in OpenAIRE • Automatic generation of EP by extracting citation information (or mining), auto-linking • Proof that rich metadata can be represented in a user-friendly way • Possible Linked Open Data compliance

  18. Use Cases • Reuse and enrichment: annotations added by users to datasets or publications • An EP is used by researcher in publication • Adequate documentation • Test legal framework • Study into licensing of publications and data • Analyse requirements of legal protection of research data • Legal prototype of restraints

  19. Research Scenario 1 • You are an EC-project researcher • OA publication • Dataset with a DOI • Generate the link in OpenAIRE • Researcher completes data output with paper • No data repository • Submit dataset to OpenAIRE ‘orphan’ repository

  20. Research Scenario 2 • You search for ‘mouse genome literature’ in OpenAIRE • Find a citation for publication • Funding details of project • Related data, say a protein link to GenBank • Create your own links to this

  21. Service activities • For publication providers - OpenAIRE’s Guidelines for repository managers • Metadata: (DC) and Protocols: (OAI etc.) • For data providers: accessing (metadata of) datasets from providers while minimizing effort to comply • Metadata: indications on minimal metadata about datasets (e.g., identifiers, date of creation, title, URLs) and best practices for interlinking datasets and publications • Access protocols: no requirements for adopting precise protocols (e.g., OAI, FTP) or ID/URL frameworks (e.g., OpenURL, DOI) to comply
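
A minimal-metadata rule of this kind could be checked along the following lines. The sketch assumes a record is a plain dictionary and takes the four example fields from the slide (identifier, date of creation, title, URL) as the required set; the field names, function names and sample values are hypothetical and are not the OpenAIREplus guidelines themselves.

```python
# Assumed field names, based on the examples listed on the slide.
REQUIRED_FIELDS = ("identifier", "title", "date_created", "url")


def missing_fields(record):
    """List required fields that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_FIELDS
            if not str(record.get(name) or "").strip()]


def link_dataset(record, publication_id):
    """Return a copy of the record with a link to the given publication.

    Raises ValueError if the minimal metadata is incomplete.
    """
    missing = missing_fields(record)
    if missing:
        raise ValueError("incomplete dataset metadata: " + ", ".join(missing))
    linked = dict(record)
    existing = set(record.get("related_publications", []))
    linked["related_publications"] = sorted(existing | {publication_id})
    return linked


# Placeholder record; identifiers and URL are invented for the example.
example_record = {
    "identifier": "doi:10.1234/example-data",
    "title": "Example dataset",
    "date_created": "2012-04-23",
    "url": "http://repository.example.org/data/1",
}
```

Keeping the check to a short list of fields reflects the stated aim of minimizing the effort a data provider needs to comply; anything beyond the required set is simply passed through.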

  22. Service activities: Users • Registered end-users (e.g., EC personnel, project coordinators, researchers, authors) • Search, browse and access statistics • Deposit files and metadata of publications and datasets into the Orphan Repository • Ingest (claim) into the information space metadata • Create EP by combining datasets from different communities • Reuse of datasets as secondary data (with respect to IPR)

  23. Service activities: Users • Content provider managers (e.g. datasets and publications repository managers) • Registration and validation (OpenAIREplus guidelines) of publication and dataset repositories • Data curators (administrative tasks) • Collect and aggregate publications, project data and dataset metadata • Third-party application developers • Bulk-fetch content from the (curated) information space

  24. The Future… • “Forget PDFs, imagine an ideal publication where you click on tables to get through to raw data, where you can contribute and discuss some aspects and later update or correct parts of a paper in subsequent versions. The latter is similar to Wikipedia, actually.” • PhD Student, UGOE

  25. Thank you… • najla.rettberg@gwdg.de • @openaire_eu

  26. Linking: Publication to Database

  27. Author-supplied supplementary info: TIFF, MOV • PLoS: O’Toole, Greenan, Lange, Srayko, Müller-Reichert

  28. Research Impact • OpenAIRE lays the foundations to measure research impact per publication, researcher, project, institution, country, …

  29. Data Management Issues • Good data practices • Data policies, standards • Drivers for deposit? What’s in it for researchers? • Work with publishers, DOIs • Where do researchers deposit data? Figshare?

  30. Potential issues: unstructured data with different kinds of media files • Persistent IDs: resolvable and managed by the originator of resource • Preservation: responsibility lies in the trusted repositories

  31. Demonstrators • Demonstrators for Enhanced Publications • Explore how links are managed between publications and research data in Life Sciences and SSH • How data can be mutually complemented and exchanged in generic infrastructures • Example: how a publication ‘reported’ in OpenAIRE is enriched via UKPMC with links to databases • Report: “Connection Data and Publications through e-Infrastructure”
