Interoperability Aspects in Europeana Antoine Isaac aisaac@few.vu.nl Workshop on Research Metadata in Context 7./8. September 2010, Nijmegen
europeana.eu: the mission • Making European cultural heritage more (web-)accessible • Federating (online) cultural collections across countries and domains • Hundreds of institutions, millions of objects
europeana.eu in practice • We rely on aggregating from our providers: • Metadata • References to digital objects • We have a portal • End-user “showcase” • We will strive to become a metadata distributor • Allowing partners to get enriched (contextualized) data for their objects • Allowing third parties to deploy object access functions similar to Europeana’s, in their own services
Current status – provider data • Very heterogeneous: different communities, different institutions, different interests and means • Descriptions of original objects and digital objects use hundreds of vocabularies, e.g.: • Libraries: “MARC-style” records • Museums: very diverse, richest ones with event-based descriptions (CIDOC-CRM) • Archives: “EAD-style” hierarchical finding aids • Cross-field “container” formats: METS
Current status – provider data • Grain varies • Quality varies • Free keyword indexing • Explicit or implicit use of controlled vocabularies • Ad hoc vs. more standard (DDC, AAT, etc.) • Persistent identifier usage not widespread • (National) Libraries are doing better
Current Europeana metadata stream • Europeana Semantic Elements (ESE) for ingestion of descriptive metadata and pointers to digital objects • Dublin Core fields + Europeana-specific ones • Providers do the mapping from their data to ESE • Ingestion process: OAI-PMH, still often via files • (Fielded) full-text search using Solr/Lucene
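The OAI-PMH ingestion step mentioned above can be illustrated with a small sketch of how a harvester might build `ListRecords` requests. The verb and the `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH protocol; the base URL, the `ese` prefix value and the function name are assumptions made for illustration, not Europeana's actual configuration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent alone with the verb to fetch the next batch.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Assumed provider endpoint and metadata prefix, for illustration only.
first_request = list_records_url("http://example.org/oai", "ese")
next_request = list_records_url("http://example.org/oai", resumption_token="abc")
```

A harvester would repeat the second form until the provider stops returning a resumption token.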
Limitations of ESE • Simple “flat” format • Losing richer (structured) data • OK for full-text indexing and search • Not OK for all the rest (display, access to data, richer search) • Variations of DC field usage across collections • dc:coverage • dc:rights
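To make the "flat format" limitation concrete, here is a minimal sketch of what can happen when an event-based description is mapped to flat Dublin Core-style fields. The input record, its field names and the mapping are invented for illustration; they are not an actual provider format or the real ESE mapping.

```python
def flatten_to_flat_fields(record):
    """Map a hypothetical event-based record to flat DC-style fields."""
    flat = {"dc:title": [record["title"]]}
    for event in record.get("events", []):
        # Each event contributes values, but the link between an actor,
        # a date and a place (and the event type itself) is dropped.
        flat.setdefault("dc:creator", []).append(event["actor"])
        flat.setdefault("dc:date", []).append(event["date"])
        flat.setdefault("dc:coverage", []).append(event["place"])
    return flat


# Invented example record with two events of different kinds.
structured = {
    "title": "Example object",
    "events": [
        {"type": "creation", "actor": "Actor A", "date": "1642", "place": "Place X"},
        {"type": "acquisition", "actor": "Actor B", "date": "1808", "place": "Place Y"},
    ],
}
flat = flatten_to_flat_fields(structured)
```

The flat result still supports full-text search over all values, but it can no longer say which date is the creation date, or that "Actor B" acquired rather than created the object.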
Digression: talk about rights? • Lots of objects, with rights not cleared yet • Collection-level approaches are difficult to implement • Rights on metadata differ from rights on the “real” objects • Result: users of Europeana don’t know the rights status of the objects they can access • They have to go to the provider’s site for each object • Deterring reference and re-use • Recent developments: trying to • Encourage provision of rights at object level • Use “controlled vocabularies” for rights (CC) • Promote public domain (esp. for metadata)
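One way to picture a "controlled vocabulary" for rights is a lookup that turns free-text rights statements into a small set of URIs. The sketch below is an assumption-laden illustration: the lookup table, its keys and the function name are hypothetical, and the choice of Creative Commons URIs and license versions is ours, not a list prescribed by Europeana.

```python
# Hypothetical mapping from normalized free-text statements to rights URIs.
RIGHTS_URIS = {
    "public domain": "http://creativecommons.org/publicdomain/mark/1.0/",
    "cc0": "http://creativecommons.org/publicdomain/zero/1.0/",
    "cc by": "http://creativecommons.org/licenses/by/3.0/",
    "cc by-sa": "http://creativecommons.org/licenses/by-sa/3.0/",
}


def normalize_rights(statement):
    """Return a rights URI for a free-text statement, or None if unknown."""
    # Lowercase and collapse whitespace so trivial variants match.
    key = " ".join(statement.lower().split())
    return RIGHTS_URIS.get(key)


known = normalize_rights("  Public   Domain ")
unknown = normalize_rights("Contact the museum for permission")
```

Statements that do not match stay unresolved (`None`) rather than being guessed, which keeps the object-level rights field trustworthy.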
The future • A new data model as a solution? • EDM – Europeana Data Model
EDM requirements & principles • Distinction between “provided object” (painting, book, program) and digital representation • Distinction between object and metadata record describing an object • Allow for multiple records for same object, containing potentially contradictory statements about an object • Support for objects that are composed of other objects • Standard metadata format that can be specialized • Standard vocabulary format that can be specialized • EDM should be based on existing standards
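The requirements above can be read as a small data structure. The following is only a sketch under our own naming: the classes `ProvidedObject`, `DigitalRepresentation` and `MetadataRecord` and their fields are hypothetical and are not the EDM class names. It shows the object/representation/record distinctions, support for parts, and how contradictory statements from several providers can coexist.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalRepresentation:
    url: str        # where the digital surrogate lives
    mime_type: str


@dataclass
class MetadataRecord:
    provider: str      # who made these statements
    statements: dict   # property name -> value, as given by the provider


@dataclass
class ProvidedObject:
    identifier: str
    representations: list = field(default_factory=list)
    records: list = field(default_factory=list)   # several records per object
    parts: list = field(default_factory=list)     # objects composed of objects

    def values_for(self, prop):
        """Collect each provider's value for a property, without merging."""
        return {r.provider: r.statements[prop]
                for r in self.records if prop in r.statements}


# Invented example: two providers disagree on the date of the same object.
painting = ProvidedObject("obj-1")
painting.representations.append(
    DigitalRepresentation("http://example.org/obj-1.jpg", "image/jpeg"))
painting.records.append(MetadataRecord("Provider A", {"dc:date": "1642"}))
painting.records.append(MetadataRecord("Provider B", {"dc:date": "ca. 1640"}))
dates = painting.values_for("dc:date")
```

Keeping statements attached to their record, instead of merging them into one description, is what allows contradictory values to be stored side by side.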
EDM basics Re-using available vocabularies • OAI ORE for organization of metadata about an object • Dublin Core for core metadata representation • SKOS for vocabulary representation
EDM basics • A semantic web-inspired model • E.g., DC would not be used with text fields alone; reference to controlled vocabularies (via URIs) will be encouraged • Keeping original descriptive metadata • Achieving interoperability through mapping (cf. Peter’s “profile matching”?) • Flexibility (ingesting richer original metadata) is a main requirement • Even though we might not ourselves use all of the data to its full potential, e.g. for search
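The combination of ORE, Dublin Core and SKOS described in the two "EDM basics" slides can be sketched as plain subject-predicate-object triples. The namespace URIs are the published ones for each vocabulary; the `example.org` resources, the literal values and the helper functions are assumptions, and the sketch does not reproduce the actual EDM classes or properties.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DC = "http://purl.org/dc/elements/1.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical resources, for illustration only.
obj = "http://example.org/object/1"
image = "http://example.org/image/1.jpg"
aggregation = "http://example.org/aggregation/1"
concept = "http://example.org/vocab/portraits"

triples = [
    # ORE groups the object and its digital representation.
    (aggregation, RDF_TYPE, ORE + "Aggregation"),
    (aggregation, ORE + "aggregates", obj),
    (aggregation, ORE + "aggregates", image),
    # Dublin Core carries the core description; here title is a plain string.
    (obj, DC + "title", "Portrait of a woman"),
    # dc:subject points to a vocabulary concept by URI, not to a text value.
    (obj, DC + "subject", concept),
    # SKOS describes the concept itself.
    (concept, RDF_TYPE, SKOS + "Concept"),
    (concept, SKOS + "prefLabel", "portraits"),
]


def objects(subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subject_labels(resource):
    """Follow dc:subject links and return the concepts' preferred labels."""
    labels = []
    for concept_uri in objects(resource, DC + "subject"):
        labels.extend(objects(concept_uri, SKOS + "prefLabel"))
    return labels


labels = subject_labels(obj)
```

Because the subject is a URI, further labels, translations or broader concepts can be attached to the concept once and reused by every object that points to it.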
Around the data model • Opportunity (and need) to get and produce richer metadata • De-duplication • Semantic enrichment with contextual resources (thesauri, authority lists) within and outside Europeana • Alignment of contextual resources • Linked Data: serving data on the web, pointing to others’ data • Fits Europeana’s mission very well
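Semantic enrichment with contextual resources could, in its simplest form, mean matching free keywords against the labels of a thesaurus and replacing them with concept URIs. The sketch below uses exact, case-insensitive label matching as a stand-in technique; the vocabulary, its URIs and the function name are invented, and a real enrichment process would need more robust matching.

```python
# Hypothetical thesaurus: concept URI -> set of known labels.
VOCABULARY = {
    "http://example.org/vocab/portraits": {"portraits", "portrait"},
    "http://example.org/vocab/maps": {"maps", "map"},
}


def enrich(keywords):
    """Split keywords into matched concept URIs and unmatched strings."""
    # Build a lowercase label -> URI index once per call.
    index = {label.lower(): uri
             for uri, labels in VOCABULARY.items()
             for label in labels}
    matched, unmatched = [], []
    for keyword in keywords:
        uri = index.get(keyword.strip().lower())
        if uri is None:
            unmatched.append(keyword)
        elif uri not in matched:
            # Different spellings of one concept collapse to a single URI.
            matched.append(uri)
    return matched, unmatched


matched, unmatched = enrich(["Portrait", "portraits", "etching"])
```

Unmatched keywords are kept as they are, so the original free-text indexing is not lost when enrichment fails.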
Around the data model Rationalization of data ingestion, archival and dissemination process (OAIS) makes more explicit what Europeana needs to do to behave more like a real metadata archive • Not only feeding a Lucene/Solr instance • Cope with enrichments, versions, pointers to external resources • Registries of vocabularies (metadata structures and controlled value vocabularies) and links between them
Encouraging community initiatives Best practices for representing and providing metadata can be seen as a complement to the general EDM. Building interoperability cores at community-level • Museums: • ATHENA project (LIDO format) • Audio/visual: • PrestoPrime, European Film Gateway • Archives: • APEnet (using EAD)
Thank you! References for ESE and EDM: http://version1.europeana.eu/web/guest/technical-requirements/ http://version1.europeana.eu/web/europeana-project/technicaldocuments/