CERIF Tutorial: CERIF 1.3 Release
Brigitte Jörg, M.A. (Information Science), German Research Center for Artificial Intelligence (DFKI), Language Technology Lab, Berlin, Germany
Introduction of Speaker
Brigitte Jörg, M.A. Information Science (Information Systems, Economics)
• Researcher and Project Manager, DFKI GmbH, Language Technology Lab; Manager, Virtual Information Center, Berlin
• CERIF TG Leader and Board Member, euroCRIS
Contact: brigitte.joerg@dfki.de, http://www.dfki.de/~brigitte/
Outline
• Grounding Explanations
• The CERIF Model
• Entities
• Model Structure
• Model Components
• Semantic Layer
• CERIF Semantics
• Use and Implementations
• Ongoing and Next Steps
• Discussion
The CERIF Model: Common European Research Information Format
[Diagram: entities Person, Organisation, Project, Funding, Service, Skills, Publication, Equipment, CV, Patent, Product, Event, Classification (Semantics)]
What is a model?
A model …
• is a simplified view to describe a particular area of interest
• allows for better communication between parties (mutual understanding)
• supports (re-)design decisions
• supports workflow identification
• supports documentation
• supports formalization
• can be exchanged, re-used, iterated, extended
[Diagram: nodes A, B, C, D, F, G, X and Z with relationship labels "is member of", "is part of", "supports" and "co-ordinates"]
What is Metadata?
"Metadata is structured data which describes the characteristics of a resource." (An Introduction to Metadata, by Chris Taylor, University of Queensland)
"Metadata is sometimes defined literally as 'data about data,' but the term is normally understood to mean structured data about resources that can be used to help support a wide range of operations. These might include, for example, resource description and discovery, the management of information resources and their long-term preservation." (Metadata in a Nutshell, by Michael Day, UKOLN)
What is Metadata? Data about Data
Example: resources that all carry the title "The Hitchhiker's Guide to the Galaxy"
• Book: Title: The Hitchhiker's Guide to the Galaxy; Date of Publication: 1979
• Radio Series: Title: The Hitchhiker's Guide to the Galaxy; Description: "is a science fiction comedy series created by Douglas Adams. Originally a radio comedy broadcast on BBC Radio 4 in 1978, […]"; Source: Wikipedia; Date of Query: May 30, 2008
• Series of five Books: Title: The Hitchhiker's Guide to the Galaxy; Between: 1979 - 1982
• TV Series: Title: The Hitchhiker's Guide to the Galaxy; Screened: 1981
• Computer Game: Title: The Hitchhiker's Guide to the Galaxy; Released: 1984
• Comic Book Adaptions: Title: The Hitchhiker's Guide to the Galaxy; Between: 1993 – 1996
• Game Cover Image: The Hitchhiker's Guide to the Galaxy; Source: http://egotron.com/; Retrieved: May 30, 2008
• Links:
  • http://www.bbc.co.uk/cult/hitchhikers/ (HTML-Title: Cult – The Hitchhiker's Guide to the Galaxy)
  • http://en.wikipedia.org/wiki/The_Hitchhiker's_Guide_to_the_Galaxy (HTML-Title: The Hitchhiker's Guide to the Galaxy)
Structure:
• Type of Resource
• Title
• Description
• Source
• Date
• Author, Creator, …
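The "Structure" list on this slide can be read as a simple record schema. The following is a minimal sketch under that reading: the class `MetadataRecord` and the helper `types_for_title` are hypothetical names introduced here for illustration, and only the field values are taken from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical record type; fields mirror the slide's 'Structure' list."""
    resource_type: str
    title: str
    date: Optional[str] = None
    description: Optional[str] = None
    source: Optional[str] = None
    creator: Optional[str] = None


TITLE = "The Hitchhiker's Guide to the Galaxy"

# Values copied from the slide; each record describes a different resource.
records = [
    MetadataRecord("Book", TITLE, date="1979"),
    MetadataRecord("TV Series", TITLE, date="1981"),
    MetadataRecord("Computer Game", TITLE, date="1984"),
]


def types_for_title(items, title):
    """Return the resource types that share the given title, in input order."""
    return [r.resource_type for r in items if r.title == title]


print(types_for_title(records, TITLE))
```

The title alone does not identify a resource here; the structured fields (type, date) are what tell the book apart from the TV series or the game.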
What is formal, semantic Metadata? Metadata Categories
• Descriptive Metadata [intellectual contents]
• Administrative Metadata
  • Technical [file formats ...]
  • Rights Management [permissions ...]
  • Provenance [creation, subsequent treatment, ...]
  • ...
• Structural Metadata [internal structure of items: page order ...]
• Contextual Metadata
  • Project Context [funding programme, participating organisations …]
  • Publication Context [number of authors, external authors, first …]
  • Usage Context [downloads, requests, …]
  • ...
Formalization, Semantics = based on a Model
What is Research Information?
Data/Metadata or Information about
• Scientists
• Project Managers
• Ongoing and Completed Projects
• Research Departments
• Funding Organisations and Programmes
• Research Results
• Publications
• Equipment
• Facilities
• their timely Relationships (Semantics)
• ...
What is a CRIS?
Current Research Information System = CRIS: an integrated approach towards managing research information
• … that means
  • Timeliness
  • Vitality
• … information about
  • People +
  • Organisations +
  • Projects +
  • Funding Programmes +
  • Research Results +
  • …
• … driven by
  • A Concept
  • A Model
• … incorporated as an
  • Implementation (ICT)
What is a CRIS? (continued)
[Annotations added to the previous slide: CERIF; Metadata; heterogeneous entities; changing relationships; Integration]
What do CRISs aim at?
CRISs to enable Integration and Interchange; Metadata Host & Carrier
[Diagram: Data Silos: Funding Programmes, Project Management, Projects, Terminologies, Organisations, Publications, Finance, Human Resource Management, Patents, People]
What do CRISs aim at? CERIF as a Middle-Layer
[Diagram: the Data Silos (Funding Programmes, Project Management, Projects, Terminologies, Organisations, Publications, Finance, Human Resource Management, Patents, People) with CERIF as a middle layer]
CRIS and Repositories at 1 institution (Slide by Keith Jeffery)
[Diagram components:]
• End-User
• CRIS: Research Context [projects, persons, organisational units, funding, products, patents, publications, facilities, equipment, events]
• OA Repository: (hypermedia) Documents
• e-Research repository: Datasets and Software
• Labels between components: CERIF, CERIF, OAI-PMH, Various protocols
Who is in need of Research Information?
Stakeholders: Researchers, Decision Makers, Funding Organisations, Project Managers, Research Organisations, Publishers, Education, Intermediaries / Brokers, General Public, SMEs, Media
Needs named on the slide:
• to find partners, track competitors, form collaborations
• to decide on priorities and resourcing, compare with other countries
• to assess performance, assess output, find proposal reviewers
• to find potential authors, find paper reviewers
• to find research products, identify ideas to be carried forward
• for interest
• to communicate results
Who is in need of Research Information? (continued)
• Research is International
• Research Information involves various Entities
What kind of Questions do we want to answer?
• How many articles has author X published in 2007 as a first author?
• How often have articles by author X been cited?
• Did author X publish with institutionally external authors?
• In how many FP7 projects does organisation Z participate?
• How many publications have resulted from project Y?
• How many people have been employed in the course of FP6 projects from the 1st call in the NMS?
• How many PhD students have participated in FP6 projects?
• How many women have been involved in FP6 projects?
• How often have articles in journal A been requested in 2007?
• How many articles have been published in the field of B?
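Questions of this kind become simple counts once persons and publications are connected by records that carry a role. The sketch below illustrates the first question only; the classes, the function `count_first_author` and all sample data are hypothetical and invented for illustration, with the role value `author1` borrowed from the role examples shown later in the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    pub_id: str
    year: int


@dataclass(frozen=True)
class PersonPublication:
    """Hypothetical person-to-publication link carrying a role."""
    person_id: str
    pub_id: str
    role: str


def count_first_author(links, publications, person_id, year):
    """Count publications of `year` where `person_id` holds the role 'author1'."""
    years = {p.pub_id: p.year for p in publications}
    return sum(
        1
        for link in links
        if link.person_id == person_id
        and link.role == "author1"
        and years.get(link.pub_id) == year
    )


# Invented sample data, for illustration only.
publications = [
    Publication("pub1", 2007),
    Publication("pub2", 2007),
    Publication("pub3", 2006),
]
links = [
    PersonPublication("X", "pub1", "author1"),
    PersonPublication("X", "pub2", "author"),
    PersonPublication("X", "pub3", "author1"),
]

print(count_first_author(links, publications, "X", 2007))
```

The role lives on the link rather than on the person or the publication, which is what lets the same person be first author of one article and co-author of another.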
What is CERIF? The Common European Research Information Format
• Conceptual Level (Specification): a description of the concepts of the entities involved in research and their relationships
• Logical Level (Conceptual Model): a formal description of the entities involved in research and their relationships according to a concept
• Physical Level (Database Scripts): a formal machine-readable description of the entities and their relationships according to a concept
• Semantic Layer (current Formal Semantics): a formalized controlled vocabulary describing a general contextual semantics of the research domain in line with the conceptual, logical and machine description
[Illustration: SQL Script: CREATE Table Person, CREATE Table Project, CREATE Table OrgUnit]
[Illustration: Person_Publication Scheme: is author of, is author (numbered) of, is author (percentage) of, is editor of, is editor (numbered) of, is publisher of, is translator of, is reviewer of, is subject of]
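To make the physical level and the Person_Publication scheme more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are simplified assumptions chosen for illustration; they are not the official CERIF database scripts, and the sample rows are invented.

```python
import sqlite3

# Simplified, hypothetical schema: table and column names are illustrative
# and are NOT the official CERIF database scripts.
SCHEMA = """
CREATE TABLE Person (id TEXT PRIMARY KEY, family_names TEXT);
CREATE TABLE Publication (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Person_Publication (
    person_id TEXT REFERENCES Person(id),
    publication_id TEXT REFERENCES Publication(id),
    role TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with one person, one publication, one link."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Person VALUES ('p1', 'Example')")
    conn.execute("INSERT INTO Publication VALUES ('r1', 'A Sample Article')")
    conn.execute(
        "INSERT INTO Person_Publication VALUES ('p1', 'r1', 'is author of')"
    )
    conn.commit()
    return conn


def roles_of(conn, person_id):
    """Return (role, publication_id) pairs for one person."""
    rows = conn.execute(
        "SELECT role, publication_id FROM Person_Publication "
        "WHERE person_id = ? ORDER BY publication_id",
        (person_id,),
    )
    return rows.fetchall()


conn = build_demo_db()
print(roles_of(conn, "p1"))
```

In this sketch the role string stands in for a term from the controlled vocabulary; adding "is editor of" or "is reviewer of" needs a new row, not a new column or table.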
The CERIF Evolution
Timeline: 1987, 1991, 2000, 2006, 2008, 2011/2012
[Timeline diagram; recoverable labels:]
• Similar Ideas: UN/UNESCO, OECD, CODATA
• EU Working Group on Research Databases; Workshop
• CERIF 91: PROJECT (Acronym: ERGO; Participant: Keith Jeffery, Anne Asserson, many more; Organisations: Rutherford Appleton, University of Bergen, …)
• CERIF 2000 Model: PERSON, OrgUnit, PROJECT, EXPERTISE, RESULTS, EQUIPMENT, CLASSIFICATION, Roles
• CERIF 2006 / 2008 Model: Base, Link, Semantics, Language, 2nd Level; FORMAL SEMANTICS
• CERIF 1.3
[Feature notes along the timeline:]
• Exchange of Records
• EC Recommendation to Member States
• Data Model; Networking of DBs
• EC Recommendation to Member States
• Data Model; Multilinguality; Controlled Vocabulary; Roles / Types; User-driven
• Model Normalization; Robust/Consistent Structure; Extensible Structure; Semantic Layer
• XML Exchange Specification; Elaboration on Publication
• CERIF Core Semantics (2008 1.2)
• + Linked Data
• + CERIF Ontology
What is CERIF? Common European Research Information Format
• CERIF is an EU Recommendation to Member States: http://cordis.europa.eu/cerif/
• The European Commission (EC) has authorised euroCRIS to maintain and develop CERIF and its usage: http://www.eurocris.org/cerif/cerif-releases/
One View of the CERIF Model Structure
CERIF Entity Types
• Base Entities
• Result Entities
• Infrastructure Entities
• 2nd Level Entities
• Link Entities
CERIF Features
• Multiple Language
• Semantics
• Measure & Indicator
• Geographic Bounding
[Three items are marked "new" on the slide]
CERIF Base Entities in Detail
• Person: ID, URI, Gender, FirstNames, OtherNames, FamilyNames, NameVariants, ResearchInterest, Keywords
• Project: ID, URI, Acronym, StartDate, EndDate, Title, Abstract, Keywords
• OrganisationUnit: ID, URI, Acronym, Name, HeadCount, CurrencyCode, Turnover, ResearchActivity, Keywords
CERIF Result Entities in Detail
• ResultPublication: ID, URI, Title, Subtitle, Abstract, Bibl. Note, PublicationDate, TotalPages, StartPage, EndPage, Keywords
• ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords
• ResultProduct: ID, URI, InternationalID
CERIF Infrastructure Entities: Equipment, Facility, Service
CERIF Infrastructure Entities in Detail
• Service: ID, Acronym, URI, Title, Description, Keywords
• Equipment: ID, Acronym, URI, Title, Description, Keywords
• Facility: ID, Acronym, URI, Title, Description, Keywords
Research Infrastructures (Definitions)
• A distributed RI is constituted by geographically distributed implementations / facilities but managed by a single body. A distributed RI is thus considered a single RI, modelled in CERIF as a tree of facilities connected to the Facility entity representing the entire RI via recursive cfFacility_Facility relationships with appropriate semantics (e.g. isPartOf), while the RI is connected with its managing body through the cfOrgUnit_Facil link entity.
• A network of RIs comprises several RIs connected in some respect, but each one with its own governance / management body. The relationships between RIs in a network are modelled in CERIF using the CERIF Semantic Layer, using the cfFacil_Class entity (Class: Network class, Role expression: isANetwork) and the cfFacil_Facil entity (Class: Network class, Role expression: belongsTo).
• A virtual RI is defined as an e-infrastructure, either a distributed e-infrastructure that connects several RIs (databases, HP computers, facilities, etc.) or a single-sited electronic RI (e-library, e-archive, data repository, etc.).
• A cluster is a grouping of RIs based on certain similarities (scientific domain, geographical region). A cluster of RIs may be supported in some way or funded, e.g. by industry entities. A cluster is modelled in CERIF as a virtual organisation through the cfOrgUnit entity, connected with the individual RIs belonging to the cluster with a cfOrgUnit_Facil link entity.
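The first definition describes a distributed RI as a tree of facilities held together by recursive facility-to-facility links with an isPartOf role. A minimal sketch of walking such a tree follows; the link rows, the facility identifiers and the function `facilities_of` are hypothetical and only illustrate the idea, not the actual CERIF tables.

```python
from collections import defaultdict

# Hypothetical facility-to-facility link rows: (facility, related facility, role).
# The identifiers are invented; "isPartOf" is the example role from the slide.
FACILITY_LINKS = [
    ("site-a", "ri-root", "isPartOf"),
    ("site-b", "ri-root", "isPartOf"),
    ("lab-a1", "site-a", "isPartOf"),
    ("other", "elsewhere", "isPartOf"),
]


def facilities_of(root, links):
    """Return every facility below `root` via isPartOf links, sorted by id."""
    children = defaultdict(list)
    for child, parent, role in links:
        if role == "isPartOf":
            children[parent].append(child)
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return sorted(found)


print(facilities_of("ri-root", FACILITY_LINKS))
```

Because the link is recursive, the same traversal works for any depth of nesting; the `found` set also guards against accidental cycles in the data.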
CERIF 2008 1.2 Entities (Types in Colors)
Facility, Equipment, Funding, ExpertiseAndSkills, Service, Qualification, ElectronicAddress, Prize, PostalAddress, CV, Country, Citation, Currency, Metrics, Event, Language
CERIF 1.3 Entities (Types in Colors)
Funding, Equipment, Facility, ExpertiseAndSkills, Qualification, Service, Prize, ElectronicAddress, CV, PostalAddress, Citation, GeographicBoundingBox, Metrics, Indicator, Measurement, Country, Event, Language, Currency
CERIF Second Level Entities
Funding, ExpertiseAndSkills, Qualification, Prize, ElectronicAddress, CV, PostalAddress, Citation, Metrics, Country, Event, Language, Currency
[Group labels on the slide:]
• Infrastructure Entities: Equipment, Facility, Service
• Geographic: GeographicBoundingBox
• Measurement & Indicator: Indicator, Measurement
Measuring Impact in CERIF (MICE)
MICE, a JISC-funded project coordinated by Richard Gartner, King's College London, UK
CERIF Measurement & Indicator (inspired by MICE; further developed within CERIFy)
[Diagram: Indicator and Measurement, each shown with Person, Project, Organisation, Publication, Patent, Product, Facility, Equipment, Service, PAddr, Country, GeoBounding Box and Classification]
CERIFy, a JISC-funded project managed by Mahendra Mahey, UKOLN, University of Bath, UK
CERIF Measurement & Indicator (continued)
[Same diagram, with the added labels Impact and Esteem]
CERIF Measurement & Indicator (continued): Impact
Examples on the slide:
• Number of Patents 2007-2008
• Number of Staff employed
• Number of cancer patients 2008-2010
• Theater performances 2005-2010
Impact indicators are categories that include such concepts as 'improving performance of existing businesses', 'improved health outcomes' and 'cultural enrichment'.
Measurement & Indicator
• cfMeasureIdentifier
• cfCountInteger
• cfCountIntegerChange
• cfValueFloatingPoint
• cfCountFloatingPointChange
• cfValueJudgementalNumeric
• cfValueJudgementalNumericChange
• cfValueJudgementalText
• cfValueJudgementalTextChange
• cfURI
CERIF Entity Structure (Generic)
• Identifier
• URI
• Attributes
• Multilingual Attributes
• Relationships (Links)
CERIF Link Entity Structure (Generic, Applied)
• Triple Structure
• Time Stamps
• Contextual Roles
• Semantic Layer Vocabulary
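The link entity ingredients named on this slide (two linked entities, a role, time stamps) can be sketched as a small record type. This is an assumption-laden illustration: the class `Link`, its field names, the function `roles_on` and the sample data are hypothetical, and the role strings are simply examples of the kind a controlled vocabulary would supply.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Link:
    """Hypothetical link record: two entity ids, a role and a validity period."""
    source_id: str
    target_id: str
    role: str
    start: date
    end: date


def roles_on(links, source_id, target_id, day):
    """Return the roles linking source to target that are valid on `day`."""
    return [
        link.role
        for link in links
        if link.source_id == source_id
        and link.target_id == target_id
        and link.start <= day <= link.end
    ]


# Invented sample data: one person changes role within the same organisation.
links = [
    Link("person-1", "org-1", "researcher", date(2005, 1, 1), date(2007, 12, 31)),
    Link("person-1", "org-1", "project-manager", date(2008, 1, 1), date(2010, 12, 31)),
]

print(roles_on(links, "person-1", "org-1", date(2008, 6, 1)))
```

Keeping the time stamps on the link means a change of role is recorded as a new link row, so earlier relationships stay queryable instead of being overwritten.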
Some CERIF Semantic Features
Semantic Features associated with Link Entities
[Diagram: example roles on links]
• role=author1 institute
• role=author
• role=deliverable1.2
• role=CEO
• role=funder
• role=coordinator
More CERIF Semantic Features
Semantic Features associated with Link Entities
[Diagram: example roles on links]
• role=author1-institute, role=author, role=author1, role=editor, role=reviewer, role=... ?
• role=deliverable1.2, role=journal article, role=public report, role=... ?
• role=CEO, role=researcher, role=project-manager, role=member, role=manager
• role=funder, role=investigator, role=coordinator