Implementing FRBR on Large Databases

Implementing FRBR on Large Databases. Thomas Hickey, Diane Vizine-Goetz, OCLC Research. What is FRBR? IFLA study group report: Functional Requirements for Bibliographic Records. A bibliographic model independent of cataloging rules. Clusters bibliographic items into a four-level structure.

Presentation Transcript


  1. Implementing FRBR on Large Databases • Thomas Hickey • Diane Vizine-Goetz • OCLC Research

  2. What is FRBR? • IFLA study group report: Functional Requirements for Bibliographic Records • Bibliographic model independent of cataloging rules • Clusters bibliographic items into a four-level structure: Work, Expression, Manifestation, Item

  3. Control of Entities in FRBR • Entities: Work, Expression, Manifestation, Item; Person, Corporate Body; Concept, Object, Event, Place • Surrogates: Uniform titles, Citations, Names, Subjects

  4. Why FRBR? • Potential to improve cataloging, discovery, and delivery • By bringing versions of works together • By showing relationships of various kinds • By enabling users to navigate to level of interest

  5. Research on FRBR & WorldCat • Subsets • By library, region • Example/problem sets • Shakespeare, the Bible • Humphry Clinker • 1,000 random works • By genre • Dissertations • Fiction • Whole file, 47 million bibliographic records

  6. Our Approach • Concentrating on work-level • Problems with expression-level clusters • Efficient, maintainable, understandable • Few, if any, false matches with correct cataloging • Err on the side of missed matches • Some accommodation of frequent variants • Compare with manually clustered

  7. The Algorithm • A key is generated for each record • Extract author, title • Look up in NACO authority file • Added entry information as needed • Form a key from bibliographic record • Author, title, added entry information • These can be sorted, compared
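The key-building step on this slide can be illustrated with a minimal sketch. It assumes a simplified normalization (lowercasing, dropping apostrophes, turning other punctuation into spaces) rather than the full NACO rules, skips the authority-file lookup and added entries, and omits the `\` subfield separator and author dates seen in the real keys. The record fields `id`, `author`, and `title`, and the function names, are hypothetical.

```python
import re
from collections import defaultdict

def normalize(text):
    """Simplified normalization (NOT the full NACO rules): lowercase,
    drop apostrophes, turn other punctuation except commas into spaces,
    and collapse runs of whitespace."""
    text = text.lower().replace("'", "")
    text = re.sub(r"[^\w\s,]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def make_key(author, title):
    """Join the normalized author and title with '/'.
    Records without an author get a title-only key."""
    parts = [normalize(p) for p in (author, title) if p]
    return "/".join(parts)

def cluster(records):
    """Group record ids by their author/title key."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[make_key(rec.get("author"), rec["title"])].append(rec["id"])
    return dict(clusters)

# Hypothetical sample records, not taken from WorldCat.
records = [
    {"id": 1, "author": "Twain, Mark", "title": "Adventures of Huckleberry Finn"},
    {"id": 2, "author": "Twain, Mark", "title": "Adventures of Huckleberry Finn."},
    {"id": 3, "author": "Swift, Jonathan", "title": "Gulliver's Travels"},
    {"id": 4, "author": None, "title": "Arabian Nights"},
]
clusters = cluster(records)
```

Because the keys are plain strings, they can be sorted and compared directly, which is what makes the approach practical on a file of tens of millions of records.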

  8. Problems • Many (17%) records do not have • Author main-entry • Uniform title • In general these cannot be matched • Look at added entries • Information at the expression and manifestation levels • Handled separately • 180,000 clusters involving ~400,000 records
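One way to picture "handled separately" is a partition step followed by a fallback key built from added entries. The sketch below is only an illustration: it assumes a record is set aside when it lacks both an author main entry and a uniform title, and the fallback key format, the field names (`author`, `uniform_title`, `added_entries`, `title`), and both function names are hypothetical, not the presenters' actual rules.

```python
def split_by_matchability(records):
    """Partition records: those with an author main entry or a uniform
    title go to the main key-building path; the rest are set aside."""
    main, separate = [], []
    for rec in records:
        if rec.get("author") or rec.get("uniform_title"):
            main.append(rec)
        else:
            separate.append(rec)
    return main, separate

def fallback_key(rec):
    """Hypothetical fallback key: lowercased title plus sorted added
    entries. Returns None when there are no added entries to use."""
    added = sorted(a.lower() for a in rec.get("added_entries", []))
    if not added:
        return None
    return rec["title"].lower() + "/" + "\\".join(added)

# Hypothetical sample records.
sample = [
    {"id": 1, "author": "Defoe, Daniel", "title": "Robinson Crusoe"},
    {"id": 2, "title": "Annual Report", "added_entries": ["Acme Society"]},
    {"id": 3, "title": "Proceedings"},
]
main, separate = split_by_matchability(sample)
keys = [fallback_key(rec) for rec in separate]
```

Records for which `fallback_key` returns `None` stay unmatched, in line with the stated preference for missed matches over false ones.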

  9. Top 10 WorldCat Clusters
     # Recs   Author/Title Key
     8,383    bible\n t
     8,055    bible
     6,174    bible\authorized
     4,033    bible\o t\psalms
     3,964    haggadah
     3,477    great britain/treaties etc
     2,402    bible\o t
     2,248    koran
     2,153    arabian nights

  10. Top 10 from a Public Library
     # Recs   Author/Title Key
     89       bible\authorized
     85       mother goose
     84       chopin, frederic\1810 1849/piano music
     81       schulz, charles m/peanuts
     63       davis, jim/garfield
     61       moore, clement clarke\1779 1863/night before christmas
     60       mozart, wolfgang amadeus\1756 1791/instrumental music
     58       bach, johann sebastian\1685 1750/cantatas
     57       beethoven, ludwig van\1770 1827/sonatas
     56       twain, mark\1835 1910/adventures of huckleberry finn

  11. Results • Manual estimate: 1.5 manifestations/work in WorldCat • Algorithm: ~1.3 • 25,844 clusters have 20 or more records • 401,659 clusters have 5 or more records

  12. Preliminary Plans • Build structures for FRBR into new catalog • Expose FRBR clustering for searching • Make visible in cataloging • As consensus on implementation is developed • As cataloging rules accommodate FRBR

  13. Spin-offs • NACO normalization code • Testbed • Server • Authority work • ePrints UK • FRBR in other projects • FictionFinder • NDLTD union catalog

  14. Fiction Subset • 2,665,662 WorldCat records • 1,758,479 work clusters • 1.5 records/cluster • 3,866 clusters have 20 or more records • 50,540 clusters have 5 or more records
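The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Figures taken from the Fiction Subset slide.
fiction_records = 2_665_662
work_clusters = 1_758_479
clusters_20_plus = 3_866
clusters_5_plus = 50_540

# Average number of records per work cluster.
ratio = fiction_records / work_clusters

# Share of clusters that reach each size threshold.
share_20 = clusters_20_plus / work_clusters
share_5 = clusters_5_plus / work_clusters

print(f"{ratio:.2f} records/cluster")
print(f"{share_20:.2%} of clusters have 20 or more records")
print(f"{share_5:.2%} of clusters have 5 or more records")
```

The ratio rounds to the 1.5 records/cluster reported on the slide, and the shares show that only a small fraction of work clusters are large: most works are represented by very few records.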

  15. Top 10 clusters for fiction
     # Recs   Author/Title Key
     1,296    defoe, daniel\1661 1731/robinson crusoe
     1,248    carroll, lewis\1832 1898/alices adventures in wonderland
     971      cervantes saavedra, miguel de\1547 1616/don quixote
     828      stevenson, robert louis\1850 1894/treasure island
     689      twain, mark\1835 1910/adventures of huckleberry finn
     624      twain, mark\1835 1910/adventures of tom sawyer
     618      swift, jonathan\1667 1745/gullivers travels
     600      andersen, h c\hans christian\1805 1875/tales
     581      stowe, harriet beecher\1811 1896/uncle toms cabin
     570      arabian nights

  16. FictionFinder • Employs work clusters in a prototype system for searching and browsing bibliographic records for fiction • Indexes records at the work level and organizes displays by work and expression (primarily language) • Includes records for textual items; additional modes of expression (moving image, sound) to be added later
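Organizing a display by work and then by expression can be sketched as a two-level grouping. This is an illustration only: it assumes each record already carries a work key and a language code, and the field names `work_key`, `language`, and `id`, the function name, and the sample records are all hypothetical.

```python
from collections import defaultdict

def organize(records):
    """Group record ids first by work key, then by language
    (used here as a stand-in for the expression level)."""
    display = defaultdict(lambda: defaultdict(list))
    for rec in records:
        display[rec["work_key"]][rec["language"]].append(rec["id"])
    return {work: dict(langs) for work, langs in display.items()}

# Hypothetical sample records.
records = [
    {"id": 101, "work_key": "crichton, michael\\1942/jurassic park", "language": "eng"},
    {"id": 102, "work_key": "crichton, michael\\1942/jurassic park", "language": "spa"},
    {"id": 103, "work_key": "crichton, michael\\1942/jurassic park", "language": "eng"},
    {"id": 104, "work_key": "crichton, michael\\1942/sphere", "language": "eng"},
]
display = organize(records)
```

A results list would then show one entry per top-level key, with the per-language groups available on the work-level display.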

  17. 395 records for author “crichton, michael\1942” clustered into 17 entries

  18. Typical Results Set Display

  19. Typical Work-level Display

  20. Typical Results Set Display

  21. Typical Work-level Display

  22. Benefits • Aggregated displays for works and expressions • Enhancement of (fiction) records at work level • with elements from records within the work cluster (e.g., summaries, genre terms, subject headings, class numbers) • with external data (e.g., literary prizes, prequels/sequels, evaluative content)

  23. Challenges • Identifying appropriate bibliographic data for systematically grouping or differentiating works and expressions • Works • Genre (graphic novel vs. novel) • Genre + mode of expression (audio book vs. radio play) • Degree of modification (abridgement of a juvenile work vs. an adaptation for young children) • Expressions • Translators, illustrators, editors

  24. Next Steps • FRBR algorithm • Explore applications • Refine algorithm as needed • FictionFinder • Add records for sound and image • Conduct user studies

  25. Links • Functional Requirements for Bibliographic Records, Final Report: http://www.ifla.org/VII/s13/frbr/frbr.htm • Experiments with the IFLA Functional Requirements for Bibliographic Records (FRBR): http://www.dlib.org/dlib/september02/hickey/09hickey.html • OCLC Research Activities and IFLA's Functional Requirements for Bibliographic Records: http://www.oclc.org/research/projects/frbr/index.shtm • Implementing FRBR on Large Databases: http://staff.oclc.org/~vizine/CNI/OCLCFRBR.htm
