Introduction to Metadata: Discovering Structured Data

Learn about metadata, structured data that is encountered in everyday life, and its use in various contexts such as libraries, museums, and online shopping sites. Explore different metadata formats like MARC, ONIX, Dublin Core, and RSLP Collection Description.

Presentation Transcript


  1. First steps in metadata • Ann Chapman, Policy and Advice team, UKOLN

  2. What is metadata? • Structured data about something • Encountered every day • bus & rail timetables • phone directories • Internet shopping sites (e.g. Amazon) • ingredient lists on food items • calendars (public holidays, religious festivals) • event programmes (e.g. seminar, workshop)

  3. More about metadata • Structured data about resources • Library catalogues • Abstracting and indexing services • Archival finding aids • Museum documentation • Collection description • Community information • Carriers • Formats (e.g. MARC) • Markup languages (e.g. HTML, SGML, XML)

  4. Markup languages • SGML = Standard Generalized Markup Language - controls document formatting for publication • XML = Extensible Markup Language - “next generation” SGML • HTML = HyperText Markup Language - SGML application, controls display of web pages • All use tags (usually paired) to structure text into elements, e.g. headings, paragraphs, lists: <title> </title> <p> </p> <li> </li>
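
A minimal sketch of how paired tags structure text into named elements, using Python's standard `xml.etree.ElementTree`. The sample text and the `<doc>` wrapper element are assumptions made for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text; <doc> is an assumed wrapper so there is one root element.
well_formed = (
    "<doc>"
    "<title>First steps in metadata</title>"
    "<p>Metadata is structured data about something.</p>"
    "<li>MARC</li>"
    "<li>ONIX</li>"
    "</doc>"
)

root = ET.fromstring(well_formed)
# Each paired tag becomes one element with a name and its enclosed text.
elements = [(child.tag, child.text) for child in root]
for name, text in elements:
    print(f"{name}: {text}")


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The <p> element is never closed, so an XML parser rejects the text.
unpaired = "<doc><p>Metadata is structured data.</doc>"
print(is_well_formed(well_formed), is_well_formed(unpaired))
```

An XML parser insists on every tag being paired. HTML and SGML are more lenient and allow some end tags to be omitted, which is why the slide says "usually paired".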

  5. Overview • MARC • ONIX • Dublin Core & application profiles • RSLP Collection Description • MARC 21 Community Information • Other metadata types

  6. MARC Formats • MAchine-Readable Cataloguing records • Library of Congress, 1960s • Now widespread use in many countries • Catalogue once, use record many times • Holdings can be attached • 1960s: books, serials, maps, music scores • 2006: any physical or digital resource

  7. MARC - structure • Structured format and carrier • Numeric and alpha tags • Fixed fields • Leader, 001-008, 010-099 • Variable fields • 100, 110, 111, 245, 260, etc.

  8. MARC - elements • 1XX Main entry • 2XX Title, Statement of Responsibility, edition, publication • 3XX Physical description • 4XX Series information • 5XX Notes • 6XX Subject access • 7XX Added entries (alternative titles, multiple authors, etc.) • 8XX Added entries for series • 9XX References and local use fields
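
The first digit of a MARC tag identifies its block, so the list above can be sketched as a simple lookup. The function name `marc_block` and the fallback label for 0XX tags are illustrative assumptions; the block descriptions are taken from the slide.

```python
# Block descriptions as listed on the slide, keyed by the first digit of the tag.
MARC_BLOCKS = {
    "1": "Main entry",
    "2": "Title, statement of responsibility, edition, publication",
    "3": "Physical description",
    "4": "Series information",
    "5": "Notes",
    "6": "Subject access",
    "7": "Added entries",
    "8": "Added entries for series",
    "9": "References and local use fields",
}


def marc_block(tag: str) -> str:
    """Return the block description for a three-digit MARC tag."""
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError(f"expected a three-digit tag, got {tag!r}")
    # 0XX tags are not described in the list above, hence the assumed label.
    return MARC_BLOCKS.get(tag[0], "0XX (not covered in this list)")


for tag in ("100", "245", "650"):
    print(tag, "->", marc_block(tag))
```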

  9. MARC 21 record • 020 $a 0761952926 • 082 $a 338.9 $2 21 • 100 $a Nederveen Pieterse, Jan P. • 245 $a Development theory: $b deconstruction. • 260 $a London: $b Sage, $c 2001 • 300 $a xii, 195p. $c 25cm $e cased • 440 $a Theory, culture and society • 650 $a Economic development
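
A rough sketch of how one field of the record above breaks down into a tag and `$`-coded subfields. It parses the human-readable display form shown on the slide, one field per string, not the binary exchange format used to transmit MARC records. The function name `parse_field` is an assumption, and the parser would mis-split any value that itself contains a `$` sign.

```python
import re

# A subfield is "$" plus a one-character code, followed by text up to the next "$".
SUBFIELD = re.compile(r"\$(\w)\s*([^$]*)")


def parse_field(line: str):
    """Split 'TAG $a value $b value' into (tag, [(code, value), ...])."""
    tag, _, rest = line.strip().partition(" ")
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(rest)]
    return tag, subfields


# Fields copied from the slide, one per string.
lines = [
    "100 $a Nederveen Pieterse, Jan P.",
    "245 $a Development theory: $b deconstruction.",
    "260 $a London: $b Sage, $c 2001",
    "650 $a Economic development",
]

record = [parse_field(line) for line in lines]
for tag, subfields in record:
    print(tag, subfields)
```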

  10. ONIX Formats • Primary use • Publishers to Internet booksellers • Rich product information • 3 Formats for product information metadata • Books, Serials, Licensing Terms • ONIX for Books in use: • First version 1999 • Current version release 2.0 (2001) • Carrier – XML • Elements – XML reference name and tag

  11. ONIX - elements • Message header • Product record • identifiers, author, title, edition, language, subject, audience, descriptions, publisher, dates • territorial rights, dimensions, suppliers, availability, promotions • Main series and sub-series records

  12. ONIX for Books - record <ISBN> 0123456789 </ISBN> <DistinctiveTitle> Alice in Wonderland </DistinctiveTitle> <Contributor> <ContributorRole> Author </ContributorRole> <PersonNameInverted> Carroll, Lewis </PersonNameInverted> </Contributor> <Publisher> Collins </Publisher> <PublicationDate> 2000 </PublicationDate>
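
Because ONIX is carried in XML, a record like this can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on the simplified record from the slide, with every element closed. The enclosing `<Product>` element and the `summary` dictionary are assumptions added for illustration, and the fragment is not a complete or valid ONIX message.

```python
import xml.etree.ElementTree as ET

# Simplified record from the slide; <Product> is an assumed wrapper element.
record = """
<Product>
  <ISBN>0123456789</ISBN>
  <DistinctiveTitle>Alice in Wonderland</DistinctiveTitle>
  <Contributor>
    <ContributorRole>Author</ContributorRole>
    <PersonNameInverted>Carroll, Lewis</PersonNameInverted>
  </Contributor>
  <Publisher>Collins</Publisher>
  <PublicationDate>2000</PublicationDate>
</Product>
"""

product = ET.fromstring(record)

# Contributor is a nested group, so a record can hold several of them.
contributors = [
    (c.findtext("ContributorRole"), c.findtext("PersonNameInverted"))
    for c in product.findall("Contributor")
]

summary = {
    "isbn": product.findtext("ISBN"),
    "title": product.findtext("DistinctiveTitle"),
    "contributors": contributors,
    "publisher": product.findtext("Publisher"),
    "date": product.findtext("PublicationDate"),
}
print(summary)
```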

  13. Dublin Core - structure • Simple resource discovery • DCMES – Dublin Core Metadata Element set • HTML the most common ‘carrier’ • Comprises 15 elements with • Element qualifiers • Element encoding schemes • Optional/mandatory elements • Application profiles

  14. Dublin Core - elements • Title • Creator • Subject • Description • Publisher • Contributor • Date • Resource Type • Format • Resource identifier • Source • Language • Relation • Coverage • Rights

  15. Dublin Core - record <title> Alice in Wonderland </title> <creator> Lewis Carroll </creator> <subject><LCSH> Fiction </LCSH></subject> <publisher> Project Gutenberg </publisher> <date> 2000 </date> <format> ASCII file via FTP </format> <identifier> http://promo.net/pg/… </identifier>
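
Since HTML is named as the most common carrier, the record above might be embedded in a web page as `<meta>` tags. This sketch follows the widely used `DC.element` naming convention for HTML metadata. The function name `dc_meta_tags` is an assumption, and the identifier is left out because its URL is truncated in the source.

```python
from html import escape

# The 15 DCMES element names in their lower-case form.
DCMES_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dc_meta_tags(pairs):
    """Render (element, value) pairs as HTML <meta> tags for a page <head>."""
    tags = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in pairs:
        if element not in DCMES_ELEMENTS:
            raise ValueError(f"not a DCMES element: {element}")
        content = escape(value, quote=True)  # keep quotes and angle brackets safe
        tags.append(f'<meta name="DC.{element}" content="{content}">')
    return tags


# Values taken from the slide's record.
record = [
    ("title", "Alice in Wonderland"),
    ("creator", "Lewis Carroll"),
    ("subject", "Fiction"),
    ("publisher", "Project Gutenberg"),
    ("date", "2000"),
    ("format", "ASCII file via FTP"),
]

tags = dc_meta_tags(record)
print("\n".join(tags))
```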

  16. RSLP Collection Description • Schema developed May 2000 for RSLP programme • MS Access database for RSLP – summer 2001 • Web-based implementations: Revealweb, Cornucopia, Backstage, PADDI, MASC25, SCONE, Cecilia, RASCAL • Based on same model: SCONE • General attributes • Subject • Dates • Associated agents • External relationships

  17. Coll. Desc. - elements • General: title, identifier, description, strength, physical characteristics, language, type, access control, accrual status, legal status, custodial history, note, location • Subject: concept, object, name, place, time • Dates: accumulation, contents • Agents: creator, owner • Relationships: sub & super-collections, catalogues and descriptions, associated collections and publications

  18. Coll. Desc. - record • Title: Pitman Collection • Strength: Shorthand – national significance • Phys.Char.: printed texts and manuscripts • Lang: English, Spanish, Esperanto, …. • Access: Written request to the Librarian, University of Bath • Accrual: passive, deposit • Location: The Library, University of Bath, Bath • Subject: shorthand, Sir Isaac Pitman, phonetic alphabets • Owner: Pitman Publishing Co. • Catalogue: University of Bath Library OPAC
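
One way to see how this record relates to the attribute groups on the previous slide is to sort its fields by group. The sketch below does that with plain dictionaries. The mapping from the record's abbreviated labels to groups is an interpretation of the two slides, and the names `ATTRIBUTE_GROUPS` and `group_record` are assumptions, not part of the RSLP schema.

```python
# Assumed mapping from the record's labels to the slide's attribute groups.
ATTRIBUTE_GROUPS = {
    "Title": "General",
    "Strength": "General",
    "Phys.Char.": "General",
    "Lang": "General",
    "Access": "General",
    "Accrual": "General",
    "Location": "General",
    "Subject": "Subject",
    "Owner": "Agents",
    "Catalogue": "Relationships",
}

# Field values as given on the slide; the language list is elided in the source.
record = {
    "Title": "Pitman Collection",
    "Strength": "Shorthand – national significance",
    "Phys.Char.": "printed texts and manuscripts",
    "Lang": "English, Spanish, Esperanto, …",
    "Access": "Written request to the Librarian, University of Bath",
    "Accrual": "passive, deposit",
    "Location": "The Library, University of Bath, Bath",
    "Subject": "shorthand, Sir Isaac Pitman, phonetic alphabets",
    "Owner": "Pitman Publishing Co.",
    "Catalogue": "University of Bath Library OPAC",
}


def group_record(fields):
    """Arrange flat label/value pairs under their attribute group."""
    grouped = {}
    for label, value in fields.items():
        group = ATTRIBUTE_GROUPS.get(label, "Ungrouped")
        grouped.setdefault(group, {})[label] = value
    return grouped


grouped = group_record(record)
for group, fields in grouped.items():
    print(group, "->", ", ".join(fields))
```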

  19. MARC 21 Community Information • Same principles as MARC 21 Bibliographic • Leader • Individual / organization / program / event / other • Fixed fields • 001-008, 010-099 • 007 disability facilities • 008 special aspects • Variable fields

  20. M 21 Comm. Inf. – elements • 1XX Name • 2XX Title and Address • 3XX Physical description • 4XX Series (for events) • 5XX Notes • 6XX Subject access • 7XX Added entries • 8XX Other variable fields

  21. M 21 Comm. Inf. – record • 110 $a CILIP • 245 $a CILIP HQ • 247 $a LA HQ $f 19?? – 2002 • 270 $a Ridgmount St, London WC1E 7AE $k 020 7255 0505 $m info@cilip.org.uk $r 9am to 6pm • 311 $a Ewart Room $d seats 50 $g £100 per day • 312 $a Overhead projector $f £10 per day • 581 $a Library + Information Update • 856 $a http://www.cilip.org.uk

  22. Other metadata formats • IEEE LOM – learning object metadata • EAD – Encoded Archival Description • Theatre Information Group DTD – performance data

  23. Metadata – fit for purpose • MARC 21 Bibliographic – libraries • ONIX – book trade and libraries • Dublin Core – Internet • EAD – archives • Collection description – archives, libraries, museums • M21 Community Information – primarily libraries

  24. Contact details • Ann Chapman • a.d.chapman@ukoln.ac.uk • UKOLN, University of Bath, Bath BA2 7AY • www.ukoln.ac.uk
