FRBR information exchange

Presentation Transcript

  1. FRBR information exchange. Thomas Hickey & Jenny Toves, OCLC Research

  2. Current FRBR information exchange
     • Sets of MARC 21 records
     • Both bibliographic and authority
     • Sometimes extended
     • pKeys
     • Unique pKeys
     • Lists of sets of control numbers
     • xISBN web service
     • superWork records

  3. Some background
     • Our FRBRization has been done primarily at the work level
     • We have FRBRized OCLC WorldCat
     • ~60,000,000 records
     • ~1,000,000,000 holdings
     • Used in Open WorldCat, FictionFinder now
     • Will be visible in FirstSearch displays this fall
     • Norwegian BIBSYS records
     • Finnish national bibliography (now in WorldCat)
     • Electronic thesis metadata
     • Processing done on a 24-node Beowulf Linux cluster

  4. MARC 21 bibliographic data
     • Basic method of accepting information
     • Other formats get mapped into it
     • Fields we use:
     • Author main entry
     • Titles
     • ISBN
     • Personal name added entries
     • Language
     • Extensions
     • BIBSYS use of 490 fields to indicate hierarchy

  5. MARC 21 authority data
     • Map personal names using cross references
     • Map author-titles using cross references
     • Fields we currently use
     • 008 fixed field
     • 100, 130, 400
     • Extensions
     • Files of additional cross references
     • Common title patterns
     • xISBN matching

  6. pKeys
     • An author-title key for matching
     • Derived from MARC-like records & authority data
     ocm00019613  shakespeare, william\1564 1616/hamlet
     ocm00615676  /hamlet/shakespeare, william\1564 1616
     ocm14055779  hamlet motion picture 1948
     ocm00290352  /hamlet/ocm00290352
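The slide does not spell out how a pKey is built, but the examples suggest a normalized `author\dates/title` string, with the control number standing in when no author is available. A minimal sketch under that assumption follows; the function names `normalize` and `make_pkey`, and the exact normalization rules, are illustrative guesses and not OCLC's actual algorithm.

```python
import re


def normalize(text, keep=""):
    """Lowercase, turn punctuation (except characters in `keep`) into
    spaces, and collapse runs of whitespace. Assumed rule, inferred
    from the sample keys on the slide."""
    pattern = r"[^\w\s" + re.escape(keep) + "]"
    cleaned = re.sub(pattern, " ", text.lower())
    return " ".join(cleaned.split())


def make_pkey(author, dates, title, control_number):
    """Build an author-title key shaped like the slide's examples
    (hypothetical helper)."""
    if author:
        key = normalize(author, keep=",")  # commas survive in names
        if dates:
            key += "\\" + normalize(dates)
        return key + "/" + normalize(title)
    # No author: fall back to the control number so the key stays unique.
    return "/" + normalize(title) + "/" + control_number


print(make_pkey("Shakespeare, William", "1564-1616", "Hamlet", "ocm00019613"))
print(make_pkey(None, None, "Hamlet", "ocm00290352"))
```

This sketch covers only two of the four key shapes shown above; the title-first form and the uniform-title form would need information (added entries, authority data) that is not modeled here.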

  7. Unique pKeys
     • pKeys that have been sorted and counted
     692  sw00008899  milton, john\1608 1674/poems
     691  sw00255854  puccini, giacomo\1858 1924/tosca
     690  sw00020874  chaucer, geoffrey\d 1400/canterbury tales
     688  sw00237074  melville, herman\1819 1891/moby dick
     682  sw03620985  china/laws etc
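"Sorted and counted" can be illustrated with a few lines of Python. This is only a sketch: the function name `unique_pkeys` is hypothetical, and the input is assumed to be a plain list of pKey strings, one per record.

```python
from collections import Counter


def unique_pkeys(pkeys):
    """Count identical pKeys and return (pkey, count) pairs,
    most frequent first, ties broken alphabetically."""
    counts = Counter(pkeys)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input; the real counts on the slide run into the hundreds.
records = [
    "melville, herman\\1819 1891/moby dick",
    "china/laws etc",
    "melville, herman\\1819 1891/moby dick",
]
for pkey, count in unique_pkeys(records):
    print(count, pkey)
```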

  8. Lists of control numbers
     • sw00000089 00206765 01261413 00000089 01236648 03975229 08360541 07363127
     • sw00000169 00000169 01647333 00420563 10957239 05205626 02325844 07299473 08244692 08555721 24509677 02533498 03967788 24728032 10130242 04849080 09477230 23323184 22051264 38870301 54266609 56760701 08366329
     • sw00000182 00000182 00102731
     • sw00000201 00000201 02786659
     • sw00000210 00000210 09175561
     • sw00000245 00000245 34103639
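Each entry above reads as a work-set identifier followed by the control numbers of its member records. Assuming whitespace-separated fields, which is an inference from the slide and not a documented file format, such a list could be loaded and inverted like this (`parse_work_sets` and `invert` are hypothetical names):

```python
def parse_work_sets(lines):
    """Map each work-set id to the list of control numbers on its line."""
    work_sets = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        work_sets[fields[0]] = fields[1:]
    return work_sets


def invert(work_sets):
    """Map each control number back to the work set that contains it."""
    return {
        control_number: work_id
        for work_id, members in work_sets.items()
        for control_number in members
    }


sample = [
    "sw00000182 00000182 00102731",
    "sw00000201 00000201 02786659",
]
sets = parse_work_sets(sample)
print(invert(sets)["00102731"])
```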

  9. xISBN web service
     • Takes an ISBN as input
     • Returns list of ISBNs in associated work
     • Significant processing
     • Starts with control-number list of work-sets
     • Uses ISBNs to pull work-sets together
     • Allows fuzzy-matching on author/title
     • Ends up with consistent clusters
     • In general larger than those in control-number list
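The step "uses ISBNs to pull work-sets together" can be pictured as merging any work-sets that share an ISBN. The sketch below uses a union-find structure for that merge; this is a swapped-in textbook technique, not necessarily what the xISBN service does, and it leaves out the fuzzy author/title matching entirely. The work-set ids and ISBN placeholders are made up for illustration.

```python
def cluster_isbns(work_sets):
    """work_sets: dict mapping a work-set id to an iterable of ISBNs.
    Returns sorted clusters of ISBNs connected through shared work-sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for isbns in work_sets.values():
        isbns = list(isbns)
        for isbn in isbns:
            find(isbn)  # register every ISBN, even singletons
        for other in isbns[1:]:
            union(isbns[0], other)

    clusters = {}
    for isbn in list(parent):
        clusters.setdefault(find(isbn), set()).add(isbn)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical data: w1 and w2 share ISBN "B", so they merge.
example = {"w1": ["A", "B"], "w2": ["B", "C"], "w3": ["D"]}
print(cluster_isbns(example))
```

Because clusters merge transitively, the result can be larger than any single input work-set, which matches the slide's remark that the final clusters are in general larger than those in the control-number list.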

  10. xISBN examples
      [0130188549, 0130188476]:
        sw11067396 barnea, amir/agency problems and financial contracting
        sw13096363 barnea, amir/agency problems on financial contracting
      [000713407x, 0007126360, 0007134053, 0007134061, 0007126441]:
        sw48486275 /collins new school dictionary/ocm48486275
        sw49740193 /collins new school dictionary/ocm49740193
        sw49740203 /collins new school dictionary/ocm49740203

  11. xISBN XML response
      <?xml version="1.0" encoding="UTF-8" ?>
      <idlist>
        <isbn>000713407x</isbn>
        <isbn>0007126360</isbn>
        <isbn>0007134053</isbn>
        <isbn>0007134061</isbn>
        <isbn>0007126441</isbn>
      </idlist>
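A client could read a response shaped like the one above with Python's standard `xml.etree.ElementTree` module. This is a sketch only: `parse_idlist` is a hypothetical helper, the sample bytes are copied from the slide, and no request to the actual service is made because the slides give no endpoint.

```python
import xml.etree.ElementTree as ET

# Response body as shown on the slide, held as bytes so the
# encoding declaration is honored by the parser.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8" ?>'
    b"<idlist>"
    b"<isbn>000713407x</isbn>"
    b"<isbn>0007126360</isbn>"
    b"<isbn>0007134053</isbn>"
    b"<isbn>0007134061</isbn>"
    b"<isbn>0007126441</isbn>"
    b"</idlist>"
)


def parse_idlist(xml_bytes):
    """Return the ISBN strings found in an <idlist> document."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


print(parse_idlist(SAMPLE))
```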

  12. superWorks format
      • Developed for FictionFinder
      • XML format
      • Includes expression-level information
      • All the information needed
      • We are adapting it to the Curiouser project

  13. superWork record layout
      • pKey
      • # manifestations, holdings, sw-id, control #s
      • publication dates
      • expressions
      • expression
      • classes
      • language
      • authors
      • titles
      • subjects
      • components
      • author, title, publication data
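The layout above can be read as a nested record: work-level fields, a list of expressions, and a list of components. The sketch below models that reading with dataclasses. The nesting is an assumption, since the transcript has lost the slide's indentation, and every class and field name is hypothetical; the actual superWork format is XML and its element names are not given in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expression:
    # Assumed to hold the items listed under "expression" on the slide.
    classes: List[str] = field(default_factory=list)
    language: str = ""
    authors: List[str] = field(default_factory=list)
    titles: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)


@dataclass
class Component:
    author: str = ""
    title: str = ""
    publication_data: str = ""


@dataclass
class SuperWork:
    pkey: str
    sw_id: str
    manifestation_count: int = 0
    holdings_count: int = 0
    control_numbers: List[str] = field(default_factory=list)
    publication_dates: List[str] = field(default_factory=list)
    expressions: List[Expression] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)


# The pKey and sw-id come from the "Unique pKeys" slide; the
# language code and title are illustrative values only.
work = SuperWork(pkey="melville, herman\\1819 1891/moby dick", sw_id="sw00237074")
work.expressions.append(Expression(language="eng", titles=["Moby Dick"]))
print(work.sw_id, len(work.expressions))
```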

  14. Summary
      • Simpler when only work-level relationships are needed
      • Even for work-level relationships, a number of different formats are useful
      • Information needed for an interface gets much more complicated