Lessons from an examination of Dublin Core
Paul Miller
UK Interoperability Focus
P.Miller@ukoln.ac.uk
<URL:http://www.ukoln.ac.uk/interop-focus/>

Arts & Humanities Data Service
• AHDS <URL: http://ahds.ac.uk/> is funded by the UK Higher Education Funding Councils, and comprises:
  • Archaeology Data Service (York et al.)
  • History Data Service (Essex Data Archive)
  • Oxford Text Archive (Oxford)
  • Performing Arts Data Service (Glasgow)
  • Visual Arts Data Service (Farnham)
  • an Executive (King’s College, London).

Aims of the AHDS
• AHDS is
  • a distributed collection of discipline-specific services
  • each with additional service-wide responsibility for a data ‘type’
  • a model for decentralised data archiving and access
• AHDS is building
  • a single gateway to Arts & Humanities data of interest to UK academics
  • data remain distributed in many locations, linked by means of Z39.50, Dublin Core, etc.

Data in the Arts & Humanities (1)
• Arts & Humanities data encompass a wide range of types and formats, including
  • text
    • raw, SGML marked-up, PDF, etc.
  • databases
    • flat file, relational, spatial, temporal, GIS, etc.
  • images
    • manuscripts, works of art, remote sensing, film, video, etc.
  • sound
    • recordings, MIDI, etc.

Data in the Arts & Humanities (2)
• These data not only span diverse technical formats, they are also
  • constructed within differing conceptual frameworks
    • ‘geographies’, theoretical paradigms, etc.
    • ‘Creator’ may not be quite synonymous with ‘Author’
  • recorded following different (and inconsistent) cataloguing practices
  • described using many different ‘metadata’ systems, if formally described at all.

Data in the Arts & Humanities (3)
These data are too diverse to be effectively retrieved by means of any one search system
…but…
a description of the ‘core metadata’ for each resource may prove comparable within and between disciplines, facilitating effective resource discovery.

What is ‘Metadata’?
• meaningless jargon, or;
• a fashionable term for what we’ve always done, or;
• “a means of turning data into information”, and;
• “data about data”, and;
• the name of a film director (‘Luc Besson’), and;
• the title of a book (‘The Lord of the Flies’)
• etc.
• Metadata means many things to many people.

The Dublin Core (1)
• probably the best tool for providing core resource discovery metadata
• international, cross-domain effort to achieve definition of a core element set
• defines 15 core elements
• allows optional qualification of these through addition of thesauri and lookup tables (SCHEME), sub-classification of the elements (SUBELEMENT) and metadata language (LANG)
• hopes to capture the essence of any resource…
• …but is it too Core?

The Dublin Core (2)
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
http://purl.org/dc/

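As an illustration only, a record built from the fifteen elements might be embedded in a web page as HTML META tags. The Python sketch below assumes a `DC.` prefix on each element name; the function name `to_meta_tags` and the sample record are hypothetical and are not taken from the workshops.

```python
from html import escape

# The fifteen element names listed on the slide.
DC_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]


def to_meta_tags(record):
    """Render a dict of element -> value (or list of values) as META tags.

    Assumption: names are written as "DC.<Element>". Elements may repeat
    and all are optional, so missing keys are simply skipped.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError("not Dublin Core elements: {}".format(sorted(unknown)))

    lines = []
    for element in DC_ELEMENTS:  # emit in the slide's order
        values = record.get(element, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append('<meta name="DC.{}" content="{}">'.format(
                element, escape(value, quote=True)))
    return "\n".join(lines)


# Hypothetical record; the values are illustrative only.
example = {
    "Title": "Excavation archive, example site",
    "Creator": ["Example Archaeological Unit"],
    "Date": "1997",
}
print(to_meta_tags(example))
```

Keeping the element list in one place means an unknown name such as ‘Author’ is rejected rather than silently published.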
AHDS/UKOLN Workshops (1)
• an attempt to discover what users and depositors require from a Core Element set
• created jointly by AHDS and UKOLN to
  • resolve AHDS’ immediate problems
  • explore the wider issues of cross-domain, interdisciplinary, distributed resource discovery.
• Dublin Core used as reference set, but
  • participants examined both where it failed to meet their needs and where it offered more than required
  • DC was not seen as a replacement for other standards.

AHDS/UKOLN Workshops (2)
• Six workshops held during 1997
  • two (digital sound and moving images) for Performing Arts, one for each of the other Service Providers
  • integrated with ongoing technical deliberations
• Invitees included
  • experts in holding and describing domain-specific data
  • those depositing these data
  • current and potential users of the data
  • me.

AHDS/UKOLN Workshops (3)
• Draft reports widely circulated for comment
• Final reports from each workshop made available on Service Provider sites
• Integrated report published October 1997
  • Discovering Online Resources Across the Humanities: a practical implementation of the Dublin Core. Edited by Paul Miller & Daniel Greenstein
  • Available on-line from <URL: http://ahds.ac.uk/> or order printed copy from info@ahds.ac.uk.

Assessing the Dublin Core (1)
• Dublin Core is not
  • a replacement for existing detailed metadata schemes
    • they still have an (important) role to play
  • a means for describing data sets, concepts, or subject issues in great detail
  • the answer to all our problems (!)
• Many of the problems encountered by workshops were not with Dublin Core itself, but were related to more generic data description and cataloguing issues
• In many cases, workshops began by confusing these external issues with those integral to Dublin Core.

Assessing the Dublin Core (2)
• Dublin Core is
  • a useful means by which discrete data types and sets may be described in a comparable fashion
  • small enough to remain manageable, yet extensible enough to (hopefully) be suitably descriptive
  • a fascinating example of inter-disciplinary and international co-operation
  • (if used in conjunction with the concepts of the Warwick Framework) an extremely powerful means of drawing complex metadata and data together, facilitating access and re-use.

Assessing the Elements (1)
• Dublin Core found to be fit for purpose
• definitions found to be unsatisfactory
  • interpreted too differently by the six workshops
  • AHDS agreed single interpretation
  • Current review of elements across DC community
• CREATOR and CONTRIBUTORS found to be confusing
  • notions of primary intellectual responsibility difficult to assign
  • some workshops suggest a single element, NAMES, instead. AHDS agreed to ignore CONTRIBUTORS.

Assessing the Elements (2)
• SUBJECT open to abuse
  • easily overloaded with many terms from many word lists
  • potential conflict with COVERAGE and TYPE
  • what is the subject of ‘Hamlet’, anyway?!
• PUBLISHER means too many different things to different people.

Assessing the Elements (3)
• DATE not sufficient for requirements
  • creation of original work? publication date of version later digitised? release date of electronic version? update cycle dates?
• TYPE represents a confusing collection of concepts.

Assessing the Elements (4)
• FORMAT concept extended to non-digital
  • AHDS suggests inclusion of film running times, video formats, etc. where absolutely required
• SOURCE and RELATION need clarification
  • AHDS Service Providers hold different notions of ‘source’
  • both could be misused with over-inclusion of ‘useful’ relationships.

Assessing the Elements (5)
• COVERAGE is ‘complex’
  • is the Deutsche Bibliothek the SUBJECT or COVERAGE of a photograph?
  • what are the usefully recorded spatial COVERAGEs for a Frankish bowl made in Aachen, excavated in Trier and on view in London’s British Museum?
    • The Holy Roman Empire? Aachen? France? Germany? Trier? British Museum? London? Europe?…
    • what is ‘The Holy Roman Empire’?
  • temporal COVERAGE?

Assessing the Elements (6)
• RIGHTS essential
  • AHDS developed a simple rights management coding scheme to be used in conjunction with a mandatory link to detailed rights management information for each individual resource.

Assessing the Qualifiers
• ‘optional’ extensibility of SCHEME and SUBELEMENT found to be essential. LANG useful in certain cases
• every use of a SCHEME or SUBELEMENT increases Dublin Core’s value to one discipline, and reduces interoperability with the others
• many SCHEMEs and SUBELEMENTs identified in workshop reports
  • integrated report attempts to aggregate these, moving back towards interoperable generalisations
• where is the middle ground between value to one discipline and the overarching goal of interoperability?

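One way to picture the trade-off between discipline-specific qualifiers and interoperability is to strip the qualifiers from a record and see what another discipline would still understand. The Python sketch below is illustrative only: the `Statement` tuple, the `dumb_down` function and the sub-element names (`created`, `released`, `spatial`) are hypothetical, and the values are invented for the example.

```python
from collections import namedtuple

# One qualified metadata statement: an element plus optional qualifiers.
# The qualifier fields mirror SUBELEMENT, SCHEME and LANG from the slides.
Statement = namedtuple("Statement", "element subelement scheme lang value")


def dumb_down(statements):
    """Drop all qualifiers, keeping only element -> list of values.

    The result is what a service that ignores SUBELEMENT, SCHEME and LANG
    would see. Duplicate values within an element are collapsed.
    """
    simple = {}
    for s in statements:
        values = simple.setdefault(s.element, [])
        if s.value not in values:
            values.append(s.value)
    return simple


# Hypothetical qualified record; sub-element names are assumptions.
record = [
    Statement("Date", "created", None, None, "1996"),
    Statement("Date", "released", None, None, "1997"),
    Statement("Coverage", "spatial", None, None, "Aachen"),
    Statement("Coverage", "spatial", None, None, "Trier"),
    Statement("Subject", None, "LCSH", "en", "Excavations (Archaeology)"),
]
print(dumb_down(record))
```

After stripping, both dates survive but the distinction between creation and release is lost, which is the interoperability cost the workshops identified.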
Moving Forward (1)
• Resources of interest to AHDS are
  • diverse
    • an archaeological excavation database and a recording of the Berlin Philharmonic playing ‘Ode to Europe’
  • distributed
    • a database physically mounted in York, the Scottish NMR in Edinburgh, and the Shetland Amenities Trust SMR in Lerwick; all accessible to the user in Pisa or Antwerp
  • ‘living’
    • a Local Authority SMR, updated every day
  • rarely in HTML
    • so ‘Harvesting’ is not the best solution.

Moving Forward (2)
• Z39.50 seen as the solution
  • preserves distributed nature of resources
  • capable of expressing many data types
  • (relatively) large body of implementation experience
  • allows ‘easy’ integration with CIMI, Aquarelle, etc.
• having gained sufficient expertise, targets may be implemented at collaborating organisations, extending system functionality.
  • probably I-Site (it’s free, and spatially aware).

A Model
[Diagram: WWW browser; AHDS Gateway; Z Targets at HDS, OTA, ADS, PADS and VADS]

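The gateway model can be sketched as a single query fanned out to each Service Provider, with the hits merged for the user. The Python sketch below is an assumption-laden illustration, not the AHDS implementation: it makes no real Z39.50 calls, each target is an in-memory stand-in, and the names `make_target`, `gateway_search` and the sample records are hypothetical.

```python
def make_target(records):
    """Return a stand-in for a Z39.50 target: a function from term to hits.

    A real target would be reached over the network; here the 'collection'
    is just a list of dicts holding Dublin Core style fields.
    """
    def search(term):
        term = term.lower()
        return [r for r in records if term in r["Title"].lower()]
    return search


def gateway_search(targets, term):
    """Send one query to every target and merge the hits.

    Data stay with each service; only matching records are returned,
    each tagged with the service it came from.
    """
    merged = []
    for name, search in targets.items():
        try:
            hits = search(term)
        except Exception:
            continue  # skip an unavailable target rather than fail the search
        for hit in hits:
            merged.append(dict(hit, service=name))
    return merged


# Hypothetical collections for the five Service Providers.
TARGETS = {
    "ADS": make_target([{"Title": "Excavation database, example site"}]),
    "HDS": make_target([{"Title": "Census extract, example parish"}]),
    "OTA": make_target([{"Title": "Hamlet (electronic text)"}]),
    "PADS": make_target([{"Title": "Hamlet (recorded performance)"}]),
    "VADS": make_target([{"Title": "Poster collection, example archive"}]),
}

for hit in gateway_search(TARGETS, "hamlet"):
    print(hit["service"], "-", hit["Title"])
```

Because every target returns records with the same core fields, the gateway can merge them without knowing anything about each discipline’s detailed schema.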
A Model … extended for ADS
[Diagram labels: WWW browser / Z Target; AHDS Gateway etc.; HDS, OTA, ADS, PADS, VADS; CIMI Test-beds; Aquarelle; NMRE; NMRS; NMRW; SMR (c.50); SCRAN (Scotland); Museums (world-wide); H-SYS (England); ADAP (USA); NGDF (UK); NUTS/SABE (EU); Thesauri (CoE, GII, etc.); plus local ADS collections]

The reality
• AHDS Z39.50 Gateway
  • <http://prospero.ahds.ac.uk:8080/ahds_live/>
• Z39.50 ‘Targets’ at all five Service Providers
• Domain-specific interfaces to collections also available
  • <http://ads.ahds.ac.uk/catalogue/>
• On-going development of disciplinary Gateways at several Service Providers.