Dublin Core and metadata: a tutorial

Presentation Transcript


1. Dublin Core and metadata: a tutorial Lorcan Dempsey Andy Powell UKOLN, University of Bath (with a little help from our friends) http://www.ukoln.ac.uk/metadata

2. 2 - Lux, 1-2 Dec 1997 Questions for you ... Metadata EAD, CIMI, TEI PICS, XML, RDF MARC 856 Dublin Core you are geeks/people with sensible shoes goers/doers

3. 3 - Lux, 1-2 Dec 1997 Overview UKOLN and metadata Metadata landscape Dublin Core Metadata management Interoperability Harvesting Future

4. 4 - Lux, 1-2 Dec 1997 UKOLN and metadata ROADS subject gateways WHOIS++ templates BIBLINK CIP for electronic data Dublin Core (+ MARC) Desire WHOIS++, GILS, Dublin Core Z39.50/WHOIS++ NewsAgent current awareness, Ariadne Dublin Core, DC-dot MODELS collection description?? Agora PRIDE Initiatives

5. Metadata landscape

6. 6 - Lux, 1-2 Dec 1997 What is metadata …? It’s just cataloguing, isn’t it … ? Yes and no … Data which supports operations carried out on information objects … discover, buy, ... In the company of strangers (Brody) Relieve user of having to have full advance knowledge of characteristics of resources …

7. 7 - Lux, 1-2 Dec 1997 Metadata model: the library example

8. 8 - Lux, 1-2 Dec 1997 Variety of formal and informal metadata models

9. 9 - Lux, 1-2 Dec 1997 Variety of operations ... Discovery Location Selection fit for use Acquire terms Manipulate Exploit IPR Document Contextualise Preserve Manage dates, people, structures, … Agent/client access ….

10. 10 - Lux, 1-2 Dec 1997 Variety of sectors ... Curatorial traditions ‘cataloguing’/documentation libraries, archives, text archives, museums, geospatial data, etc Network resource discovery directory services, search engines, etc influence from computer science Network information management web developments, W3C, database sitemap, time to live, ... pragmatic - market needs, vendor push

11. 11 - Lux, 1-2 Dec 1997 Variety of creation models ... Author/creator web pages? Repository/site manager effective disclosure better management Third party creator e.g. eLib subject gateways Library

12. 12 - Lux, 1-2 Dec 1997 Metadata ... Variety of metadata models syntax, semantics, content scope sectors/domains Variety of operations supported Variety of creation models Variety of architectures for disclosure/discovery Search and retrieve Disclosure/distribution Management

13. 13 - Lux, 1-2 Dec 1997 Some formats

14. Dublin core in the metadata landscape

15. 15 - Lux, 1-2 Dec 1997 Dublin Core Metadata model Simple element set focus on semantics - several target syntaxes Operations resource discovery on the web Explicitly cross sector/domain No constraint on creation model or application architecture

16. 16 - Lux, 1-2 Dec 1997 Dublin core - why success? Simple Coincides with strategic needs in each of sectors we identified Curatorial: semantic interoperability between richer metadata models Resource discovery: a simple format for descriptive metadata (DLOs) Web management: associate metadata with Web resources Inclusive (countries/domains/traditions) Stu Weibel

17. Introduction to Dublin Core

18. 18 - Lux, 1-2 Dec 1997 Dublin Core - elements Title Subject Description Creator Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights

19. 19 - Lux, 1-2 Dec 1997 Dublin Core - HTML Example
<HTML><HEAD>
<TITLE>UKOLN Home Page</TITLE>
<META NAME="DC.Title" CONTENT="UKOLN: UK Office for Library and Information Networking">
<META NAME="DC.Subject" CONTENT="national centre, network information support, library community, awareness, research, information services, public library networking, bibliographic management, distributed library systems, metadata, resource discovery, conferences, lectures, workshops">
<META NAME="DC.Description" CONTENT="UKOLN is a national centre for support in network information management in the library and information communities. It provides awareness, research and information services">
<META NAME="DC.Creator" CONTENT="Isobel Stark">
</HEAD> ...
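The embedding convention on this slide is regular enough to generate mechanically. The following is a minimal sketch, not the tool used at UKOLN: the function name dc_meta_tags and the list-of-pairs record layout are assumptions made for illustration. It shows one META tag being written per element value, with attribute values escaped so that a stray quote cannot break the tag.

```python
from html import escape


def dc_meta_tags(record):
    """Render (element, value) pairs as Dublin Core <META> tags.

    `record` is a list of pairs rather than a dict so that an element
    such as Subject or Creator can be repeated.
    """
    lines = []
    for element, value in record:
        # quote=True also escapes double quotes, keeping CONTENT="..." intact
        content = escape(value, quote=True)
        lines.append('<META NAME="DC.%s" CONTENT="%s">' % (element, content))
    return "\n".join(lines)


record = [
    ("Title", "UKOLN: UK Office for Library and Information Networking"),
    ("Creator", "Isobel Stark"),
]
print(dc_meta_tags(record))
```

Generating the tags from a record, rather than typing them by hand, avoids the mismatched-quote errors that are easy to make in an HTML editor.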

20. Management

21. 21 - Lux, 1-2 Dec 1997 Data creation Practical issues of using Dublin Core for Internet resource description... UKOLN metadata system Requirements 3 models for metadata management Implementation at UKOLN

22. 22 - Lux, 1-2 Dec 1997 UKOLN metadata system requirements Easy to use Work with a variety of methods of creating HTML Simple migration to future metadata formats Separate metadata from resource

It would be nice to say that the first thing we did when we began to think about embedding metadata in Ariadne was to draw up a list of requirements. It is closer to the truth to say that we arrived at the following requirements as we went along. First, we wanted a system that was easy to use (for the people creating the metadata) and that integrated with the other tools in use. At UKOLN this means, by and large, Windows 95 and Microsoft Office type tools. We wanted the system to work with a variety of ways of creating HTML Web pages: staff at UKOLN currently use different tools, including Word IA, Netscape Gold and text editors under Windows 95, and some of us still use text editors under UNIX. We wanted a system that would allow us to move to alternative metadata formats in the future, for example PICS-ng, and to handle any changes in the syntax for embedding DC in HTML fairly easily. So it seems sensible to try to separate the metadata from the resource itself. There are good examples of this in other areas, such as style sheets, which separate presentation-specific information from content and structural information.

23. 23 - Lux, 1-2 Dec 1997 Managing Dublin Core (1) HTML Authoring tool Pros… Simple May be useful for training and familiarisation Cons… May not be possible with all editors Maintenance problems Easy to make errors

24. 24 - Lux, 1-2 Dec 1997 DC-dot A Web based tool for creating Dublin Core <meta> tags Automatic generation of some tags based on content of the resource Forms based editing of tags Cut-and-paste output into HTML Conversion to other formats… SOIF, ROADS/WHOIS++, USMARC, GILS...

25. 25 - Lux, 1-2 Dec 1997 Managing Dublin Core (2) Web-site management tool Pros… Use of Web-site management tools likely to increase Object-oriented database approach Cons… Proprietary formats Early days - too early to evaluate use for metadata yet?

26. 26 - Lux, 1-2 Dec 1997 Managing Dublin Core (3) On the fly generation Pros… Separates metadata from resource Future migration fairly simple Cons… Performance Lack of integration with HTML tools Server specific

27. 27 - Lux, 1-2 Dec 1997 UKOLN metadata system (1) Embed on-the-fly Apache SSI script Store metadata using SOIF records Use MS-Access as tool to create the records Associate metadata with resource by co-locating them in the Web server filestore
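The on-the-fly model on this slide can be sketched as a small script that reads a SOIF record stored next to the page and writes out META tags for inclusion at serving time. This is only an illustration under stated assumptions: the parser handles a simplified SOIF layout (one attribute per line, single-line values, byte counts not checked), and the attribute names DC.Title and DC.Creator inside the record, like the function names, are hypothetical rather than taken from the UKOLN implementation.

```python
import re
from html import escape

# One SOIF attribute line: Name{byte-count}:<TAB>value
# Simplification: multi-line values are not handled and the count is ignored.
ATTRIBUTE = re.compile(r"^([^{\s]+)\{(\d+)\}:\t(.*)$")
HEADER = re.compile(r"^@(\S+)\s*\{\s*(\S+)\s*$")


def parse_soif(text):
    """Return (url, [(attribute, value), ...]) for a single SOIF record."""
    lines = text.strip().splitlines()
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("not a SOIF record: %r" % lines[0])
    attributes = []
    for line in lines[1:]:
        if line.strip() == "}":  # closing brace ends the record
            break
        match = ATTRIBUTE.match(line)
        if match:
            attributes.append((match.group(1), match.group(3)))
    return header.group(2), attributes


def render_meta(attributes):
    """Turn parsed attributes into META tags ready to embed in <HEAD>."""
    return "\n".join(
        '<META NAME="%s" CONTENT="%s">' % (name, escape(value, quote=True))
        for name, value in attributes
    )


SAMPLE = (
    "@FILE { http://www.ukoln.ac.uk/\n"
    "DC.Title{15}:\tUKOLN Home Page\n"
    "DC.Creator{12}:\tIsobel Stark\n"
    "}\n"
)

url, attributes = parse_soif(SAMPLE)
print(render_meta(attributes))
```

Because the page itself never contains the tags, a change of embedding syntax only requires a change to render_meta, which is the migration benefit claimed for this model.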

28. 28 - Lux, 1-2 Dec 1997 UKOLN metadata system (2)

29. 29 - Lux, 1-2 Dec 1997 UKOLN metadata system (3)

30. 30 - Lux, 1-2 Dec 1997 UKOLN metadata system (4)

31. 31 - Lux, 1-2 Dec 1997 Issues Performance Interaction with Web caches Dublin Core vs Alta Vista style metadata <META NAME="Description" CONTENT="blah, blah"> <META NAME="Keywords" CONTENT="xxx, yyy, zzz"> Granularity Which pages should have metadata?

32. A short history: Dublin to Helsinki We have borrowed some of this material from Stu Weibel, with permission

33. 33 - Lux, 1-2 Dec 1997 Dublin Core Workshop Series .. DC-1: OCLC/NCSA Metadata Workshop Mar, 1995 Limited Scope: Discovery of document-like objects 13 element Dublin Core Interdisciplinary consensus DC-2: OCLC/UKOLN Warwick Workshop April, 1996 Warwick Framework - modularity Syntax issues

34. 34 - Lux, 1-2 Dec 1997 .. Dublin Core Workshop Series DC-3: CNI/OCLC Image Metadata Workshop, Sep, 1996 Images are in scope 15 element core; some element name changes DC-4: Canberra Metadata Workshop Mar, 1997 Minimalists and Structuralists Canberra Qualifiers (additional information useful for interpretation of metadata)

35. 35 - Lux, 1-2 Dec 1997 Dublin core - qualifiers Language of element value Scheme specifies a context for interpretation <META NAME="DC.Subject" SCHEME="ddc.21" CONTENT="170.42"> Sub-element specifies a facet - narrows <META NAME="DC.Creator.Address" CONTENT="[email protected]">
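How a consumer might separate the qualifiers from the element can be sketched as follows. The dotted NAME convention and the SCHEME attribute come from the slide; the QualifiedElement type, the function name parse_dc_meta and the example address someone@example.org are hypothetical and used only for illustration.

```python
from typing import NamedTuple, Optional


class QualifiedElement(NamedTuple):
    element: str                 # e.g. "Subject"
    sub_element: Optional[str]   # e.g. "Address", or None if unqualified
    scheme: Optional[str]        # e.g. "ddc.21", or None if no SCHEME given
    value: str


def parse_dc_meta(name, content, scheme=None):
    """Split a META NAME such as 'DC.Creator.Address' into its parts."""
    parts = name.split(".")
    if len(parts) < 2 or parts[0].upper() != "DC":
        raise ValueError("not a Dublin Core name: %r" % name)
    element = parts[1]
    # Anything after the element name is treated as the sub-element
    sub_element = ".".join(parts[2:]) or None
    return QualifiedElement(element, sub_element, scheme, content)


print(parse_dc_meta("DC.Subject", "170.42", scheme="ddc.21"))
print(parse_dc_meta("DC.Creator.Address", "someone@example.org"))
```

An indexer that does not understand qualifiers can still fall back on the element field alone, which is the sense in which a sub-element only narrows the meaning.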

36. 36 - Lux, 1-2 Dec 1997 DC-5 DC-5: National Library of Finland/OCLC Workshop, October 1997 Formal Data Model (expressed in RDF) many other problems are hereby made simpler Resource Description Framework The return of modularity Finnish finish (of unqualified DC) minimalist DC is done and will not be changed Semantics for additional sub-structure a small number of sub-elements will be established Closer DC-W3C collaboration

37. 37 - Lux, 1-2 Dec 1997 Working groups Data Model date, relationship, source what is a resource? 1:1 RDF Relationships Typology Sub-elements Date

38. 38 - Lux, 1-2 Dec 1997 RFCs in preparation Simple DC semantics (the minimalist position) Simple DC syntax for embedded HTML DC semantics with qualifiers DC syntax with qualifiers HTML 2.0 HTML 4.0 RDF

39. Dublin Core implementation

40. 40 - Lux, 1-2 Dec 1997 Projects 30 projects; 10 countries http://purl.org/metadata/dublin_core/projects.html “Interdisciplinary and international recognition as the lingua franca for resource discovery metadata for electronic resources” Stu Weibel Support for use for non-digital objects

41. 41 - Lux, 1-2 Dec 1997 The HTML 2.0 “kludge” Convention for simple embedded metadata Bootstrapping early Dublin Core deployments META tags and standard HTML syntax Useful for simple metadata without qualifiers Can support Dublin Core qualifiers, but with risks for interoperability and indexing purity

42. 42 - Lux, 1-2 Dec 1997 HTML 4.0 - DC influences the web

43. 43 - Lux, 1-2 Dec 1997 Some quick statistics UK (academic sites only) Total pages: ~1.5M (a guess!) Embedded DC: ‘a few hundred’ http://www.cs.ukc.ac.uk/people/staff/djb1/ Sweden Total pages: 1.4M Embedded DC: ‘a few dozen’ http://www.lub.lu.se/nwiPaper/

44. Interoperability

45. 45 - Lux, 1-2 Dec 1997 Interoperability What do we mean by interoperability? Issues Z39.50 and Dublin Core Metadata registries

46. 46 - Lux, 1-2 Dec 1997 Interoperability? Unify access to data in different domains - Web, library, museums, archives, ... Issues Protocols - Z39.50, WHOIS++, … gateways Attribute names - author/creator/... Semantic interoperability - mapping tables Format of results format converters

47. 47 - Lux, 1-2 Dec 1997 Protocol Gateways - an example ZEXI - a Z39.50 to WHOIS++ gateway Based on CNIDR's Isite Accepts Z39.50 searches Converts them to WHOIS++ Returns SUTRS records

48. 48 - Lux, 1-2 Dec 1997 Attribute names Different databases may use different ‘names’ for the same thing ‘creator’ vs ‘author’ Need to be able to construct searches that ‘work’ against different databases irrespective of the ‘names’ in use
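One way to make a single search 'work' against several databases is a mapping table from a common vocabulary to each database's local attribute names. The sketch below assumes two hypothetical databases, web_index and library_catalogue, and a hypothetical function translate_query; none of these names refer to a real service.

```python
# Common (Dublin Core style) attribute name -> local name, per database.
# Both databases and their local names are made up for this example.
CROSSWALK = {
    "web_index": {"creator": "creator", "title": "title"},
    "library_catalogue": {"creator": "author", "title": "title"},
}


def translate_query(query, database):
    """Rewrite a query phrased in common names into a database's own names."""
    mapping = CROSSWALK[database]
    translated = {}
    for attribute, term in query.items():
        if attribute not in mapping:
            # Better to fail loudly than to silently search the wrong field
            raise KeyError("%s has no equivalent of %r" % (database, attribute))
        translated[mapping[attribute]] = term
    return translated


query = {"creator": "Stark"}
for database in CROSSWALK:
    print(database, translate_query(query, database))
```

The user phrases the search once; the table absorbs the difference between 'creator' and 'author'.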

49. 49 - Lux, 1-2 Dec 1997 Format of results Different databases may return results in different formats USMARC, GRS-1, SUTRS, IAFA, ... Early stages of searching ideally need results to be returned in single ‘simple’ format

50. 50 - Lux, 1-2 Dec 1997 Z39.50 and DC - searching Version 2 Searches phrased in terms of single attribute set only Either need to add DC attributes to Bib-1 map DC to Bib-1 Version 3 Multiple attribute sets allowed for searching New simple DC attribute set to be proposed Other attributes taken from Bib-1

51. 51 - Lux, 1-2 Dec 1997 Z39.50 and DC - retrieval To return Dublin Core ‘records’ using Z39.50… use GRS-1 (General Record Syntax) elements are assigned tags DC elements have been added to tagset-G

52. 52 - Lux, 1-2 Dec 1997 Format conversion - issues Simple to rich, e.g. DC to MARC May not generate valid rich record without manual enhancement Use of DC qualifiers required for decent MARC record Rich to simple, e.g. MARC to DC Loss of data
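The asymmetry between the two directions can be seen in a toy conversion. The 'rich' field names below (main_title, personal_author and so on) are invented for the example and are not MARC tags; the sketch only shows that several rich fields collapse into one Dublin Core element and that fields with no Dublin Core equivalent are dropped.

```python
# Hypothetical rich-format field -> Dublin Core element.
# Several rich fields map to the same DC element (many-to-one).
RICH_TO_DC = {
    "main_title": "Title",
    "personal_author": "Creator",
    "corporate_author": "Creator",
    "topical_subject": "Subject",
    "geographic_subject": "Subject",
}


def rich_to_dc(rich):
    """Convert a rich record to simple DC; unmapped fields are lost."""
    dc = {}
    for field, value in rich.items():
        element = RICH_TO_DC.get(field)
        if element is None:
            continue  # no DC element for this field, so it is dropped
        dc.setdefault(element, []).append(value)
    return dc


def dropped_fields(rich):
    """List the rich fields that would not survive the conversion."""
    return sorted(field for field in rich if field not in RICH_TO_DC)


rich = {
    "main_title": "An example title",
    "personal_author": "A. Person",
    "corporate_author": "An Organisation",
    "shelf_mark": "X 123",
}
print(rich_to_dc(rich))
print(dropped_fields(rich))
```

Going back from the simple record, nothing says which Creator was the person and which the organisation, which is why unqualified DC needs qualifiers or manual enhancement to yield a decent rich record.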

53. 53 - Lux, 1-2 Dec 1997 Metadata registries Semantics Agreement on element meanings Agreement on enumerated lists Qualifiers Thesaurus naming Publishing existing metadata sets Re-use by others - prevent duplication of work e.g. Administrative metadata

54. 54 - Lux, 1-2 Dec 1997 Some pointers Mapping tables http://www.ukoln.ac.uk/metadata/interoperability/ Software General http://www.ukoln.ac.uk/metadata/software-tools/ d2m : Dublin Core to MARC converter http://www.bibsys.no/meta/d2m/ USEMARCON http://www2.echo.lu/libraries/en/projects/usemarc.html

55. Harvesting

56. 56 - Lux, 1-2 Dec 1997 Harvesting Dublin Core General Issues Building a Web index Harvest and NWI Building a ‘local’ search engine Harvest, SWISH-E, Isite, Zebra DC as cataloguer’s aid

57. 57 - Lux, 1-2 Dec 1997 Harvesting - issues Mappings Multiple element values Multiple languages Complex data values e.g. DC.Date, DC.Coverage SCHEMES
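A harvester has to cope with repeated elements, schemes and language tags rather than assuming one value per name. The following sketch uses Python's standard html.parser module; the class name DCHarvester, the function harvest and the sample page are assumptions made for illustration and are not taken from Harvest or NWI.

```python
from collections import defaultdict
from html.parser import HTMLParser


class DCHarvester(HTMLParser):
    """Collect every DC.* META tag, keeping repeated values in page order."""

    def __init__(self):
        super().__init__()
        self.elements = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # HTMLParser lowercases attribute names
        name = attributes.get("name") or ""
        if not name.lower().startswith("dc."):
            return  # ignore Alta Vista style Description/Keywords tags
        self.elements[name].append(
            (attributes.get("content") or "",
             attributes.get("scheme"),
             attributes.get("lang"))
        )


def harvest(html):
    """Return {meta name: [(content, scheme, lang), ...]} for one page."""
    parser = DCHarvester()
    parser.feed(html)
    return dict(parser.elements)


PAGE = """<HTML><HEAD>
<META NAME="DC.Subject" CONTENT="metadata">
<META NAME="DC.Subject" SCHEME="ddc.21" CONTENT="170.42">
<META NAME="Keywords" CONTENT="xxx, yyy, zzz">
</HEAD></HTML>"""

print(harvest(PAGE))
```

Keeping the scheme alongside each value leaves the decision about how to index '170.42' to the indexer, instead of flattening it into free text at harvest time.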

58. 58 - Lux, 1-2 Dec 1997 Harvesting - issues Frames Harvesting non-embedded metadata HTML 3.2 vs HTML 4.0 Hidden pages Controlling the robot

59. 59 - Lux, 1-2 Dec 1997 Harvest Resource discovery suite of tools - robot, summarisers, indexers SOIF records Supports a variety of indexers Supports database brokerage model CGI based user-interface UKOLN’s HTML summariser is Dublin Core aware

60. 60 - Lux, 1-2 Dec 1997 Nordic Web Index Custom robot - NWI/Combine Dublin Core aware GILS-II records Indexed using Zebra Searched using Z39.50 User interface based on Europagate

61. 61 - Lux, 1-2 Dec 1997 Other software SWISH-E system for indexing local collections of Web pages or other text files http://sunsite.berkeley.edu/SWISH-E/ Isite text indexer (Isearch) and Z39.50 http://www.cnidr.org/ir/isite.html Zebra text indexer and Z39.50 http://www.indexdata.dk

62. 62 - Lux, 1-2 Dec 1997 DC as cataloguer’s aid ROADS Software to create, manage and search Internet resource descriptions WHOIS++ Records created manually ‘Pump-prime’ metadata record with values based on embedded DC using robot

63. 63 - Lux, 1-2 Dec 1997 DC as cataloguer’s aid BIBLINK Flow of information from publishers to National Bibliographic Agencies MARC based catalogues of electronic publications Initial MARC record based on DC description supplied by publisher using email

64. 64 - Lux, 1-2 Dec 1997 Building a Web index Centralised databases Harvest, database brokerage Multiple databases - parallel NWI - Z39.50 Multiple databases - query routing WHOIS++ Common Indexing Protocol, centroids

65. 65 - Lux, 1-2 Dec 1997 Architecture - centralised

66. 66 - Lux, 1-2 Dec 1997 Architecture - Harvest

67. 67 - Lux, 1-2 Dec 1997 Architecture - multiple databases

68. 68 - Lux, 1-2 Dec 1997 WHOIS++ - Query routing Centroid generated by database C contains… “you’ll find the string ‘mona’ in the ‘title’ attribute of at least one record in this database”.
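The idea behind a centroid can be sketched in a few lines: each database publishes, per attribute, the set of words that occur in at least one of its records, and a router forwards a query only to databases whose centroid contains the term. This is a simplified illustration, not the WHOIS++ Common Indexing Protocol itself; the databases A, B and C, their records and the function names are made up.

```python
def build_centroid(records):
    """Map each attribute to the set of words found in any record."""
    centroid = {}
    for record in records:
        for attribute, value in record.items():
            centroid.setdefault(attribute, set()).update(value.lower().split())
    return centroid


def route_query(centroids, attribute, term):
    """Return the databases worth searching for `term` in `attribute`."""
    term = term.lower()
    return sorted(
        name for name, centroid in centroids.items()
        if term in centroid.get(attribute, set())
    )


databases = {
    "A": [{"title": "Network resource discovery"}],
    "B": [{"title": "Renaissance painting", "description": "Includes the Mona Lisa"}],
    "C": [{"title": "Mona Lisa"}],
}
centroids = {name: build_centroid(records) for name, records in databases.items()}

print(route_query(centroids, "title", "mona"))
```

Database B mentions 'mona' only in a description, so a title search is routed to C alone. The centroid says that a match exists somewhere in the database, not which record holds it, so the query still has to be run there.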

69. Dublin Core - critique

70. 70 - Lux, 1-2 Dec 1997 Limits In development Syntax Simple Discovery Document like objects Weak model Administrative metadata Addressed in Helsinki

71. Futures The material on RDF has been adapted from Stu Weibel’s material, with permission

72. 72 - Lux, 1-2 Dec 1997 Dublin Core futures Internal Syntax and semantics External environment

73. 73 - Lux, 1-2 Dec 1997 Syntax HTML 2, HTML 4, RDF, ... RDF - W3C (World Wide Web Consortium) initiative “RDF is the realization of the Warwick Framework for the Web” RDF will be the foundation for an architecture for metadata on the Web Resource description Electronic commerce Site mapping Third party rating Digital signatures

74. 74 - Lux, 1-2 Dec 1997 RDF: Why is it important? RDF provides a coherent data model and syntactical framework for ‘plug-n-play’ metadata the semantics and structure of metadata packages will be determined by stakeholder communities via independently developed and maintained metadata element sets e.g.: MARC, DC, TEI, GILS, CIMI, Ratings…. Political imperatives for deployment Software infrastructure will be ubiquitous (and come for free in browsers and servers)

75. 75 - Lux, 1-2 Dec 1997 Semantics Tension simple vs complex generic vs specific interoperability vs selfstanding Development relationship sub-elements scheme

76. 76 - Lux, 1-2 Dec 1997 Environment ‘Save the time of the user’ Diverse resources Broker/middleware/ gateway/trading place/… Variety of protocols and metadata models DC simple - volume ‘shallow’ - interop

77. 77 - Lux, 1-2 Dec 1997 Further Information Dublin Core Home Page http://purl.org/metadata/dublin_core W3 Metadata Overview and RDF Working Group Home Page http://www.w3.org/Metadata/RDF UKOLN metadata page http://www.ukoln.ac.uk/metadata/
