
Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources

ALA/CLA Annual Meeting, 22 June 2003, Toronto, CA. Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources. Timothy W. Cole (t-cole3@uiuc.edu), University of Illinois at Urbana-Champaign. http://dli.grainger.uiuc.edu/Publications/TWCole/ALA2003OAI/

Presentation Transcript


  1. ALA/CLA Annual Meeting, 22 June 2003, Toronto, CA. Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources. Timothy W. Cole (t-cole3@uiuc.edu), University of Illinois at Urbana-Champaign. http://dli.grainger.uiuc.edu/Publications/TWCole/ALA2003OAI/

  2. Order of Presentation • Perspectives on OAI-PMH • Illinois OAI metadata harvesting project • Goals & objectives • Findings regarding metadata • Findings regarding search & discovery • New OAI projects at Illinois • IMLS digital collections & content • CIC OAI metadata harvesting project ALA 2003 / OAI-PMH

  3. OAI Protocol for Metadata Harvesting • Harvesting approach to interoperability at metadata level • Divides world into Metadata Providers & Service Providers • Builds on HTTP, XML, & Dublin Core • http://www.openarchives.org/

  4. OAI Antecedents • Call to other E-Print archives (July 1999): Paul Ginsparg, Rick Luce, & Herbert Van de Sompel: “…mobilize core group to work towards achieving a universal service for author self-archived scholarly literature.” • Santa Fe Mtgs. (Oct. 1999 & June 2000) • OAI-PMH version history: • First Alpha Release, Sept. 2000 • 1.0 (Beta) Release, January 2001 • 1.1 (Beta 2) Release, July 2001 • 2.0 (Production) Release, June 2002

  5. Original OAI Organization • OAI Executive: • Carl Lagoze & Herbert Van de Sompel • OAI Steering Committee: • Co-Chairs: Dan Greenstein, Cliff Lynch • OAI Technical Committee • Funded by NSF, DLF & CNI • Seeks to be user community driven

  6. OAI-PMH as a tool • All about moving metadata around • Designed to be a building block, usable by many different communities • Can facilitate (in some cases enable) services & functions • Assumes widely distributed content, but centralized indexing (!) & services • Build once, use for many applications • Focus of OAI is interoperability

  7. Harvesting vs. Broadcast • Competing approaches to interoperability • Distributed/Broadcast searching: search and discovery run against remote services and data • Harvesting: data/metadata is transferred from the remote source to the destination where search & discovery services are located (e.g., union catalogs) • OAI-PMH is a harvesting protocol

  8. As Compared to Z39.50

  9. Metadata vs. Resources • Resource refers to information objects or digital representations of information objects • Metadata item is a collection of properties about a resource (e.g. title, author, etc.) • Metadata record is a metadata item expressed in a specific syntax according to an XSD • OAI focuses on metadata, with the implicit understanding that metadata contains useful links to the source information object(s)
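To make the item/record distinction concrete, here is a minimal Python sketch that reads the properties of one metadata item out of a metadata record expressed as unqualified Dublin Core XML. The sample record (title, type, and the `museum.example.org` identifier) and the helper name `dc_properties` are hypothetical and chosen only for illustration; the two namespace URIs are the standard `oai_dc` and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record, for illustration only.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Cotton coverlet</dc:title>
  <dc:type>Image</dc:type>
  <dc:identifier>http://museum.example.org/items/42</dc:identifier>
</oai_dc:dc>"""


def dc_properties(xml_text):
    """Return the DC properties of a record as {element name: [values]}."""
    root = ET.fromstring(xml_text)
    props = {}
    for child in root:
        # Keep only elements in the Dublin Core namespace.
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            # DC elements are repeatable, so collect values in a list.
            props.setdefault(name, []).append((child.text or "").strip())
    return props


item = dc_properties(SAMPLE_RECORD)
```

In this sketch the `identifier` value plays the role of the link back to the source information object that the slide mentions.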

  10. When to use OAI-PMH • Metadata is sufficient for services desired • Normalization, deduplication, metadata augmentation desired • Content is widely distributed across small, non-Z39.50 enabled repositories • OAI-PMH is more lightweight than Z39.50 • Portals can use BOTH Z39.50 & OAI-PMH

  11. What OAI-PMH Is Not • Not search & discovery on its own • Not a database management system • Not a single metadata schema • Not OAIS (the Open Archival Information System reference model)

  12. How OAI Works • OAI “verbs”: Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord • Service Provider (harvester) sends an OAI HTTP request (OAI verb) • Metadata Provider (repository) returns an HTTP response (valid XML)
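A minimal Python sketch of how a harvester might form the HTTP requests described on this slide. The six verbs come from the slide; `metadataPrefix` and `from` are standard OAI-PMH 2.0 request arguments. The base URL `http://repository.example.org/oai` and the helper name `build_request` are hypothetical, and the sketch only builds the URL rather than contacting a repository.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed on the slide.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request(base_url, verb, **args):
    """Build the GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for key, value in args.items():
        # "from" is a Python keyword, so callers pass from_ instead.
        params[key.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, for illustration only.
url = build_request(
    "http://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
    from_="2003-01-01",
)
```

A real harvester would send this URL with an HTTP GET and parse the XML response; that step is left out here so the sketch stays self-contained.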

  13. OAI Provider Architectures • Descriptive metadata sources: HTML <meta>, XML, DBMS • OAI administrative metadata, e.g., IDs, datestamps, sets, formats • OAI application (CGI, ASP, PHP, etc.) • Web server (HTTP) • OAI harvesters

  14. A few projects using OAI-PMH • Basic building block of the National Science Digital Library • Large-scale implementations in E-Prints, OLAC, NDLTD, … • Built into ENCompass, ContentDM, Michigan’s DLXS, D-Space, and other products • Open Archives Forum in Europe; will be part of federation activities in the UK and EU

  15. Univ. of Illinois OAI Metadata Harvesting Project • Funded by Andrew W. Mellon Foundation (July 2001 – May 2003) • Primary objectives: • Develop & make available OAI harvesting tools • Build search services for aggregated metadata in the domain of cultural heritage • Examine metadata aggregation issues, including use of EAD in OAI context • Investigate utility of aggregated metadata, including preliminary testing with end-users

  16. Types of resources • 39 data providers • Academic libraries • Museums / cultural orgs • Digital libraries • Public library • 1.1 million original DC records • + 1.5 million derived from EAD

  17. Variations in DC element usage • Records containing subject & description element • Many different controlled and local vocabularies in use • Granularity: a record may describe a collection of coins or just one coin

  18. Excerpt of a metadata record describing a cotton coverlet • Description: Digital image of a single-sized cotton coverlet for a bed with embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin. • Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in. Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at upper left and right hand side for head board; fabric is woven in a variation of a rib weave; color each of yellow and gray; hand-embroidered cotton butterflies and flowers from two shades of each color of embroidery floss - blue, pink, green and purple and single top 20 in. bordered with blue and black cotton embroidery thread; stitches used for embroidery: running stitch, chain stitch, French knot and back stitches; selvage edges left unfinished; lower edges turned under and finished with large gray running stitches made with embroidery floss. • Format: Epson Expression 836 XL Scanner with Adobe Photoshop version 5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web. • Coverage: — • Date Created: 2001-09-19 09:45:18; Updated: 20011107162451; Created: 2001-04-05; Created: 1912-1920? • Type: Image

  19. Excerpt of a metadata record describing “American woven coverlet” • Description: Materials: Textile--Multi, Pigment--Dye; Manufacturing Process: Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen coverlet, worked in overshot weave in plain geometric variant of a checkerboard pattern. Coverlet is constructed from finely spun, indigo-dyed wool and undyed linen, woven with considerable skill. Although the pattern is simpler, the overall craftsmanship is higher than 1934.01.0094A. - D. Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving construction, probably dating to the 1820's and is not attributable to any particular weaver. -- Georgette Meredith, 10/9/1973 • Source: — • Format: 228 x 169 x 1.2 cm (1,629 g) • Coverage: Euro-American; America, North; United States; Indiana? Illinois? • Date: Early 19th c. CE • Type: cultural; physical object; original

  20. Implications • Service providers • Automatically normalize metadata encoding where possible (e.g., dates) • Normalize for and co-locate by type / format where possible • Metadata providers • Create metadata for interoperability • Consider a more expressive schema, e.g., Qualified DC, MARC
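As a rough illustration of automatic date normalization by a service provider, the Python sketch below tries a short list of encodings and returns an ISO 8601 date when one matches. The list of formats is an assumption drawn only from the date strings visible in the slide 18 excerpt, and the helper name `normalize_date` is hypothetical; it does not represent the project's actual normalization code. Values it cannot parse (such as "1912-1920?" or "Early 19th c. CE") are returned as `None` so they can be left alone or reviewed by hand.

```python
from datetime import datetime

# Assumed encodings, taken from the date strings in the slide 18 excerpt.
KNOWN_FORMATS = [
    "%Y-%m-%d %H:%M:%S",  # e.g. 2001-09-19 09:45:18
    "%Y%m%d%H%M%S",       # e.g. 20011107162451
    "%Y-%m-%d",           # e.g. 2001-04-05
]


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None


normalized = [
    normalize_date(value)
    for value in ["2001-09-19 09:45:18", "20011107162451", "1912-1920?"]
]
```

Returning `None` rather than guessing keeps uncertain or free-text dates from being silently rewritten.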

  21. Original interface • Portal had two search pages—simple (keyword) and advanced.

  22. Pilot study with student teachers • 23 users in honors-level C&I class • Assignment: Use the site in preparing a lesson plan (high school social studies) • Introduced to “aggregated metadata” concept • Focus group interviews conducted • Students’ papers examined • Transaction logs analyzed

  23. Results of initial user testing 1. Users expected all links to point to digital objects • Some records pointed to finding aids • Some records pointed to collection’s web site • Some records described analog objects 2. Users were unable to make use of search results • Simple searches produced 1000s of unranked results • Advanced search (with limits) rarely used 3. Distinction between portal and data providers was unimportant to users

  24. What does “online access” mean? • To librarian & curator • To student teacher

  25. Response to test results • EAD-derived records segregated • Analog-only collections excluded • Categories of resource types reduced to 3: • Images and Video • Text, Sheet Music, and Websites • Museums and Archival Collections

  26. Revised interface • Simple keyword & advanced search put on one page • Clarify “online access” • Natural language in Boolean operators

  27. Revised search results • Link goes to finding aid or collection page? “Learn more.” • Link displays object? “View item.” • Subject/Description expanded

  28. IMLS Digital Collections & Content • Build a registry of all National Leadership Grant collections with digital content. • Assist and guide NLG projects in making item-level metadata sharable using OAI. • Build a repository and search & discovery tools for integrated access to the content of NLG collections (unique metadata schema?). • Research best practices for sharing metadata about diverse digital content and for supporting the interests of diverse user communities.

  29. http://imlsdcc.grainger.uiuc.edu/

  30. CIC OAI metadata harvesting • Univ. of Illinois at UC will host an OAI-PMH metadata harvesting service for 10 CIC libraries • Project Goals (3 year experimentation phase) • Improve access to selected resources at CIC libraries • Advertise these resources (internally & externally) • Prepare member institutions for future grant-mandated OAI-based resource sharing • Serve as a useful testbed for experimentation with OAI-PMH, development of metadata best practices, usability and user needs testing, etc.

  31. Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources. http://dli.grainger.uiuc.edu/Publications/TWCole/ALA2003OAI/ Timothy W. Cole (t-cole3@uiuc.edu), University of Illinois at Urbana-Champaign
