
CERIF for Datasets: Background and Key Findings

CERIF for Datasets: Background and Key Findings. Workshop, London, 26th July 2013. CERIF slides reproduced from presentations by euroCRIS members: Keith Jeffery, Brigitte Joerg, Anna Clements. C4D Summary. JISC MRD Programme



Presentation Transcript


  1. CERIF for Datasets: Background and Key Findings • Workshop, London, 26th July 2013 • CERIF slides reproduced from presentations by euroCRIS members: Keith Jeffery, Brigitte Joerg, Anna Clements

  2. C4D Summary • JISC MRD Programme • Consortium: Sunderland, Glasgow, St Andrews, NERC, EPSRC, DCC and euroCRIS • “CERIFication” of the metadata about research datasets • Focus on MEDIN* standard: NERC requirement for http://www.bodc.ac.uk/ • * http://www.oceannet.org/

  3. Datasets & metadata • Datasets have sparked interest in metadata standards that support their: • Discoverability • Description • Usability • Re-use

  4. For example … • CKAN : www.ckan.org • Software platform; default schema is DC • eGMSDescription • UK e-Government metadata standard; based on DC • ‘flat’ model; single entity (a resource or dataset); keep adding attributes • DCAT • RDF schema vocabulary for PSI (public sector info) • Some normalisation; can’t capture different roles/semantics in relationships

  5. Houssos, N., Joerg, B., Matthews, B. A multi-level metadata approach for a Public Sector Information data infrastructure. CRIS2012, Prague, 06-09 June 2012. http://www.engage-project.eu/engage/wp/

  6. … so what about CERIF? • Common European Research Information Format • A conceptual model for describing the complete research domain • A standard for the development, implementation and interoperability of current research information systems (CRIS) and their various applications • Est. 1991; maintained by www.euroCRIS.org

  7. … and euroCRIS? • Not-for-profit organisation of experts • Research organisations; funders; publishers; systems providers; standards organisations • 109 institutional, 38 personal & 20 affiliate members (euroCRIS annual report 2012) • 41 countries; not just Europe • Main activity is the development, maintenance and implementation of CERIF

  8. euroCRIS : Strategic Partners

  9. In the UK : The CERIF landscape

  10. UK CERIF adoption • 1/3 of UK HEIs have a CERIF-compliant CRIS* • Driven by desire to better support research management at the institutional level • … and streamline reporting to funders • * Source: UKOLN (R. Russell), Adoption of CERIF in Higher Education Institutions in the UK: A Landscape Study, March 2012 • http://www.ukoln.ac.uk/isc/reports/cerif-landscape-study-2012/CERIF-UK-landscape-report-v1.1.pdf

  11. [Timeline figure: CERIF versions, 1991 to 2013. Year axis: 1991, 2000, 2002, 2006, 2012, 2013. Versions shown: CERIF 91, CERIF 2000 Model, CERIF 2006 / 2008 Model, CERIF 1.3, CERIF 1.4 (XML), CERIF 1.5, CERIF 1.6.] • Labels: Base, Link, Semantics, Language, 2nd Level, Roles • Entity labels: PROJECT, PERSON, OrgUnit, EXPERTISE, RESULTS, EQUIPMENT, PROJECT CLASSIFICATION • Data Model: Infrastructure (Facility, Equipment, Service); Measurement & Indicator; Entities and Link Tables; Geographic Bounding Box • CERIF 1.3 Vocabulary: UUIDs, Terms, Schemes • CERIF 1.4: new XML format • CERIF 1.5: Federated Identifiers • Data Model: C4D datasets • Acronym: ERGO; Participants: Keith Jeffery, Anne Asserson, Rutherford Appleton Lab, Univ Bergen, many more • Data Model: Model Normalization; Robust/Consistent Structure; Extensible Structure; Semantic Layer • XML Exchange Specification; Elaboration on Publication • CERIF Core Semantics (2008 1.2) • Data Model: Multilinguality; Controlled Vocabulary; Roles / Types; User-driven • EC Recommendation to Member States: Networking of DBs; Exchange of Records • EC Recommendation to Member States + Linked Data

  12. CERIF Entity Types • Base Entities • Result Entities • Infrastructure Entities • 2nd Level Entities • Link Entities • CERIF Features • Multiple Language • Semantics • Measures & Indicators • Geographic Bounding Box

  13. Person: ID, URI, Gender, FirstNames, OtherNames, FamilyNames, NameVariants, ResearchInterest, Keywords • Project: ID, URI, Acronym, StartDate, EndDate, Title, Abstract, Keywords • OrganisationUnit: ID, URI, Acronym, Name, HeadCount, CurrencyCode, Turnover, ResearchActivity, Keywords

  14. cfPerson: cfID, cfURI, cfGender, cfBirthdate • cfProject: cfID, cfURI, cfAcronym, cfStartDate, cfEndDate • cfOrganisationUnit: cfID, cfURI, cfAcronym, cfHeadCount, cfCurrencyCode, cfTurnover • Further attributes (in slide order): cfTitle, cfAbstract, cfKeywords, cfFamilyNames, cfFirstNames, cfName, cfDescription, cfOtherNames, cfKeywords, cfNameVariants, cfKeywords
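Slides 13 and 14 appear to list the same attributes twice, once with plain names and once with a "cf" prefix. The sketch below checks that reading for the Person entity. The prefix rule is an assumption inferred from the two slides, not something the transcript states, and the helper names `cf_name` and `split_by_presence` are hypothetical.

```python
# Person attributes as listed on slide 13 (plain names).
PERSON_ATTRIBUTES = [
    "ID", "URI", "Gender", "FirstNames", "OtherNames",
    "FamilyNames", "NameVariants", "ResearchInterest", "Keywords",
]

# Every cf-prefixed name that appears on slide 14. The transcript lost the
# grouping by entity, so this is one flat set for the whole slide.
SLIDE_14_NAMES = {
    "cfPerson", "cfID", "cfURI", "cfGender", "cfBirthdate",
    "cfProject", "cfAcronym", "cfStartDate", "cfEndDate",
    "cfOrganisationUnit", "cfHeadCount", "cfCurrencyCode", "cfTurnover",
    "cfTitle", "cfAbstract", "cfKeywords", "cfFamilyNames", "cfFirstNames",
    "cfName", "cfDescription", "cfOtherNames", "cfNameVariants",
}


def cf_name(plain_name: str) -> str:
    """Assumed convention: prepend 'cf' to the plain attribute name."""
    return "cf" + plain_name


def split_by_presence(attributes, known_names):
    """Return (found, missing) plain names after applying cf_name."""
    found = [a for a in attributes if cf_name(a) in known_names]
    missing = [a for a in attributes if cf_name(a) not in known_names]
    return found, missing


found, missing = split_by_presence(PERSON_ATTRIBUTES, SLIDE_14_NAMES)
print("found:", found)
print("missing:", missing)
```

Under this assumption, every Person attribute from slide 13 except ResearchInterest has a cf-prefixed counterpart somewhere on slide 14.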

  15. ResultPublication: ID, URI, Title, Subtitle, Abstract, Bibl. Note, PublicationDate, TotalPages, StartPage, EndPage, Keywords • ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords • ResultProduct: ID, URI

  16. cfResultPublication: cfID, cfURI, cfNumber, PublicationDate, cfStartPage, cfEndPage, cfTotalPages, cfEdition, cfSeries, cfIssue, cfVolume, cfISBN, cfISSN • cfResultPatent: cfID, cfURI, cfPatentNumber, cfCountryCode, cfRegistrationDate, cfApprovalDate • cfResultProduct: cfID, cfURI • Further attributes (in slide order): cfTitle, cfBibliographicNote, cfSubtitle, cfAbstract, cfVersionInfo, cfKeywords, cfAbbreviation, cfName, cfName, cfDescription, cfAbstract, cfKeywords, cfKeywords, cfVersionInfo, cfVersionInfo

  17. Advantages of CERIF • CERIF has many advantages as the canonical model (the research information entities, attributes, associations and semantics) for contextual metadata for datasets: • Covers all aspects of research information: researchers, projects, organisations, funding, outputs, equipment, services, and so on; • An optimal (relational) architecture allowing the expression of any kind of relation between entities/attributes with every relation “time-stamped” and semantically defined; • Very fine-grained structure, allowing output of the metadata to virtually any format; • A separated “semantic layer” allowing the use of multiple (any) controlled vocabularies (classifications, typologies) as well as their cross-linking and mapping; • Ability to cope with multiple languages
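The idea of relations that are "time-stamped" and semantically defined through a separate semantic layer can be sketched as follows. This is an illustration of the concept only, not the CERIF schema: the class names `Term` and `Link`, the `roles_on` helper, the vocabulary name, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    """A term from a controlled vocabulary (the separate 'semantic layer')."""
    scheme: str  # hypothetical vocabulary name
    label: str   # hypothetical term label


@dataclass(frozen=True)
class Link:
    """A time-stamped, semantically typed relation between two entities."""
    source_id: str
    target_id: str
    role: Term
    start: date
    end: Optional[date] = None  # None means the relation has no end date yet

    def active_on(self, day: date) -> bool:
        """True if the relation holds on the given day (bounds inclusive)."""
        return self.start <= day and (self.end is None or day <= self.end)


def roles_on(links: List[Link], source_id: str, target_id: str, day: date) -> List[str]:
    """Labels of all roles linking source to target that are active on a day."""
    return [
        link.role.label
        for link in links
        if link.source_id == source_id
        and link.target_id == target_id
        and link.active_on(day)
    ]


# Illustrative data only; none of these values come from the slides.
creator = Term(scheme="example-roles", label="Creator")
curator = Term(scheme="example-roles", label="Curator")

links = [
    Link("person-1", "dataset-1", creator, date(2011, 1, 1), date(2012, 6, 30)),
    Link("person-1", "dataset-1", curator, date(2012, 7, 1)),
]

print(roles_on(links, "person-1", "dataset-1", date(2013, 7, 26)))  # ['Curator']
```

Because the role is a reference into a vocabulary rather than a fixed column, the same pair of entities can carry several relations with different meanings and different validity periods, which is the property the slide highlights.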

  18. Mapping to CERIF • 24 of 30 MEDIN elements mapped to CERIF

  19. DataCite version 3.0 • Related Identifier • Relation Type • Description • GeoLocation More work required? • Data Format • Version • Rights • Geolocation Place • Language of Resource • Alternate Identifier • Related Metadata • Size

  20. CERIF 1.6 released for testing 25th July 2013 http://www.cerifsupport.org/2013/07/24/cerif-1-6-formal-models-released-for-testing/

  21. Mapping to other schemata C4D vs RE3Data vs DCI vs DataCite

  22. Key Findings • CERIF metadata model • can be used to record rich metadata about datasets • can be related to other pieces of the research landscape • can evolve / extend within formal euroCRIS governance structure BUT … • Needs testing in production environments • Is cfResProd appropriate? Not just a research result? • Ongoing need for agreed vocabularies • CASRAI • RCUK harmonisation

  23. Case Study: DaMaRo • Have used C4D as basis for checking whether DataFinder is rich and detailed enough • Once the C4D profile has been finalised, DaMaRo will embark on implementation of C4D-compliant outputs • Most fields map to C4D

  24. Next Steps • Further consultation with euroCRIS/CERIF TG in terms of best approach • Aiming to achieve most comprehensive set of metadata (incorporating RE3Data, DataCite, etc.) • Move new Pure model to production (after REF) • Exporting and importing CERIF-XML from systems; exploring this with http://ckan.org • Aggregation of data into national data register model

  25. scott.brander@st-andrews.ac.uk anna.clements@st-andrews.ac.uk
