Explore the challenges and possibilities of FRBRization, converting bibliographic information using the FRBR model, and cross-catalogue FRBRization in Europe. Discover the importance of structured bibliographic information, difficulties in identifying entities and relationships, and specific issues with Persons, Works, Expressions, and Manifestations. Learn about the experience of FRBRization in various catalogues and the common problems faced, offering insights into achieving richer FRBRization.
FRBRization of European Catalogues: challenges and some solutions • Trond Aalberg, Norwegian University of Science and Technology (NTNU) • Workshop on FRBR in The European Library, 9 October 2008, National Library of Portugal, Lisbon, Portugal
Overview • FRBRization? • FRBR and new requirements for bibliographic information • Challenges, problems and possibilities • With some examples
FRBRization • Catchy term for “the FRBR model applied to existing bibliographic information” • Converting existing bibliographic information • Or just interpreting it (at run-time) • Different levels of ambition: • Following the FRBR model or just FRBR-inspired • User interface only: presenting search results and allowing users to navigate along the axes of FRBR relationships • Data model that implements (part of) the FRBR model
Cross-catalogue FRBRization • FRBRization is even more relevant in a broader context: • reuse of information across catalogues • as a framework for portals: integrated access to multiple catalogues or cross-domain integration • novel, explorative user interfaces • In Europe • Diversity in language, format and cataloguing practice
What FRBR is really about • Emphasis on “content” and the documentation of intellectual/artistic endeavour • What are the works and expressions in this product? • Who are the actors and how do they relate to the expressions and works? • It’s like drawing a map... • More consistently structured bibliographic information • That can be processed and interpreted, not only searched and displayed
Our focus • Conceptual models are ideal solutions • “This is where we want to go” objective • But how do we get there? • Existing bibliographic information is a valuable asset • One of the problems for future implementations of FRBR will be compatibility with already created information • Identification of entities and relationships • Experimenting with different rules, algorithms etc. • Gathering statistics and evaluating the results • Looking for solutions
Our experience so far... • Based on FRBRization of different collections • BIBSYS (Norwegian catalogue, BIBSYSMARC) • The Slovenian National Bibliography (UNIMARC) • BTJ (Swedish catalogue, MARC 21) • Different catalogues, different formats, different practices • Many catalogue-specific rules are needed • A certain level of FRBRization is easy to achieve • For “richer” FRBRization there are a number of common problems related to the poor structuring capabilities of the MARC formats
Persons and Corporate Bodies • Persons and Corporate Bodies are usually easy to identify • Specific fields for these entities • Duplicate entities are a frequent problem • Despite the use of authority control • Relator codes are needed to associate persons and corporate bodies with the correct kind of product entity • For records with multiple persons and multiple works/expressions it is often difficult to set up the correct relationships
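The role of relator codes can be illustrated with a minimal sketch. The codes used here (`aut`, `cmp`, `trl`, `edt`, `ill`, `pbl`) are standard MARC relator codes, but the mapping from each code to a FRBR entity level and the `target_entity` helper are assumptions made for illustration, not rules taken from the presentation.

```python
# Assumed, illustrative mapping from MARC relator codes to the FRBR
# entity a person or corporate body should be attached to.
RELATOR_TARGET = {
    "aut": "work",           # author
    "cmp": "work",           # composer
    "trl": "expression",     # translator
    "edt": "expression",     # editor
    "ill": "expression",     # illustrator
    "pbl": "manifestation",  # publisher
}


def target_entity(relator_code):
    """Return the assumed FRBR level for a relator code.

    Missing or unknown codes yield "unresolved", which signals that
    catalogue-specific rules are needed to place the name.
    """
    if not relator_code:
        return "unresolved"
    return RELATOR_TARGET.get(relator_code.strip().lower(), "unresolved")


print(target_entity("trl"))
print(target_entity(None))
```

A name field without `$4` falls into the "unresolved" case, which is where the difficulty with multiple persons and multiple works/expressions shows up.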
Works and Expressions • Works can be identified by titles and associated creators (if applicable) • Major challenge is to find and select the title, identify multiple works, ... • Problems related to the identification of persons are “inherited” • An expression can be identified by the work it is associated with and additional expression-level information • Typical problems • Lack of original title/uniform title when the title statement is inappropriate • Often inconsistent practice for work titles within and across catalogues
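One way to picture identification by title and creator is a normalized key, as in the sketch below. The function names `normalize`, `work_key` and `expression_key` are hypothetical, and the normalization choices (case folding, removing punctuation, stripping diacritics) are assumptions; stripping diacritics in particular can merge titles that a real catalogue would need to keep apart.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def work_key(creator, title):
    """Assumed work identity: normalized creator plus normalized title."""
    return (normalize(creator), normalize(title))


def expression_key(work, language):
    """Assumed expression identity: the work key plus a language label."""
    return work + (normalize(language),)


a = work_key("Sjöwall, Maj,", "Det slutna rummet.")
b = work_key("Sjöwall, Maj", "det slutna rummet")
print(a == b)
print(expression_key(a, "Tyska"))
```

Records whose keys match would be treated as candidates for the same work; inconsistent work titles within and across catalogues are exactly what such a key cannot repair on its own.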
Not always easy...
100 1 $a Sjöwall, Maj, $d 1935-
240 14 $a Den vedervärdige mannen från Säffle. $l Tyska
245 14 $a Das Ekel aus Säffle ; $b Verschlossen und verriegelt : zwei Romane / $c Maj Sjöwall, Per Wahlöö
260 $a Erftstadt : $b Area, $c 2006
300 $a 639 s.
500 $a Den vedervärdige mannen från Säffle / ... in der deutschen übersetzung von Eckerhard Schultz -- Det slutna rummet / ... in der deutschen übersetzung von Hans-Joachim Maass
700 12 $a Sjöwall, Maj, $d 1935-. $t Det slutna rummet. $l Tyska
700 12 $a Wahlöö, Per, $d 1926-1975. $t Det slutna rummet. $l Tyska
700 1 $a Schultz, Eckehard $4 trl
700 1 $a Maass, Hans-Joachim $4 trl
700 12 $a Wahlöö, Per, $d 1926-1975. $t Den vedervärdige mannen från Säffle. $l Tyska
740 4 $a Det slutna rummet
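To make the difficulty in the record above concrete, the following sketch groups creators by work title using only the 100, 240 and 700 `$t` fields. It assumes the record is available as plain text with one field per line in the display form shown on the slide; the parser and the `works_with_creators` helper are hypothetical and far simpler than a real MARC reader.

```python
import re
from collections import defaultdict

# Subset of the record, one field per line (assumed input layout).
RECORD = """\
100 1 $a Sjöwall, Maj, $d 1935-
240 14 $a Den vedervärdige mannen från Säffle. $l Tyska
700 12 $a Sjöwall, Maj, $d 1935-. $t Det slutna rummet. $l Tyska
700 12 $a Wahlöö, Per, $d 1926-1975. $t Det slutna rummet. $l Tyska
700 1 $a Schultz, Eckehard $4 trl
700 12 $a Wahlöö, Per, $d 1926-1975. $t Den vedervärdige mannen från Säffle. $l Tyska
"""


def parse_field(line):
    """Split a display-form line into (tag, {subfield code: value})."""
    tag = line[:3]
    pairs = re.findall(r"\$(\w)\s+([^$]*)", line)
    return tag, {code: value.strip() for code, value in pairs}


def works_with_creators(record):
    """Group creator names by work title from 100/240 and 700 $t fields."""
    works = defaultdict(list)
    main_author = None
    for line in record.splitlines():
        tag, sf = parse_field(line)
        if tag == "100":
            main_author = sf["a"].rstrip(" ,.")
        elif tag == "240":
            works[sf["a"].rstrip(" .")].append(main_author)
        elif tag == "700" and "t" in sf:
            works[sf["t"].rstrip(" .")].append(sf["a"].rstrip(" ,."))
    return dict(works)


for title, creators in works_with_creators(RECORD).items():
    print(title, "<-", creators)
```

The two works and their two authors can be recovered this way, but the translator entries (`$4 trl`) carry no title, so linking each translator to the right expression would require parsing the free text of the 500 note.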
Manifestations • Each record describes a single manifestation • and manifestations can easily be identified by e.g. ISBN and/or title statement etc. • But there are different solutions used for multivolume publications • Record linking • Note fields • Linking fields
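Identification by ISBN can be sketched as a normalization step that turns the content of an 020 `$a` subfield into a comparable key. The `isbn13` helper is hypothetical; it assumes the ISBN is the first whitespace-delimited token of the subfield and converts ISBN-10 to ISBN-13 with the standard 978 prefix and check-digit calculation. It does not validate the incoming check digit.

```python
import re


def isbn13(raw):
    """Return a 13-digit ISBN string for an 020 $a value, or None."""
    tokens = raw.split()
    if not tokens:
        return None
    chars = tokens[0].replace("-", "").upper()
    if re.fullmatch(r"\d{13}", chars):
        return chars
    if not re.fullmatch(r"\d{9}[\dX]", chars):
        return None
    # ISBN-10 to ISBN-13: prefix 978, drop the old check digit,
    # then weight the twelve digits alternately by 1 and 3.
    core = "978" + chars[:9]
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


print(isbn13("0396070213 :"))   # 9780396070214
print(isbn13("0-396-07021-3"))  # 9780396070214
```

Records that share a key are candidates for the same manifestation; multivolume publications, where one ISBN may cover a set or each volume has its own, still need the catalogue-specific handling listed above.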
Major challenges for FRBRization • A number of techniques and a complex set of rules must be applied when interpreting records • Inspecting fields, subfields and even parsing the text in note fields • Interpreting relator codes • No single set of rules for all catalogues • Still struggling with the basic relationships... • Results must be evaluated and corrected • Equivalent entities have to be identified • Erroneously identified entities and relationships have to be removed
What are the consequences? • The current (rather simple) interfaces are tolerant to errors and inconsistencies • The FRBR context adds new requirements to the data
The reason why
020 $a 0396070213 : $c $6.95
040 $a DLC $c DLC $d DLC
050 00 $a PZ3.C4637 $b Hh3 $a PR6005.H66
082 00 $a 823/.9/12
100 1 $a Christie, Agatha, $d 1890-1976.
245 10 $a Hercule Poirot's early cases / $c Agatha Christie.
260 $a New York : $b Dodd, Mead, $c [1974]
300 $a 250 p. ; $c 22 cm.
505 0 $a The affair at the victory ball.--The adventure of the Clapham cook. --The cornish mystery.--The adventure of Johnnie Waverly.--The double clue.--The king of clubs. --The Lemesurier inheritance.--The lost mine.--The Plymouth express.--The chocolate box. --The submarine plans.--The third-floor flat.--Double sin.--The market basing mystery. --Wasps' nest.--The veiled lady.--Problem at sea.--How does your garden grow?
650 0 $a Poirot, Hercule (Fictitious character) $x Fiction.
650 0 $a Private investigators $z England $x Fiction.
650 0 $a Detective and mystery stories, English.
984 $a gsl
991 $b c-GenColl $h PZ3.C4637 $i Hh3 $p 00022213155 $t Copy 1 $w BOOKS
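In the record above the individual stories exist only as free text in the 505 contents note. A minimal sketch of recovering them, assuming titles are separated by `--` as in this record, is shown below; `split_contents` is a hypothetical helper and the note is abbreviated to four of the titles.

```python
import re

# Abbreviated excerpt of the 505 $a value from the record.
CONTENTS_NOTE = (
    "The affair at the victory ball.--The adventure of the Clapham cook. "
    "--The cornish mystery.--How does your garden grow?"
)


def split_contents(note):
    """Split a contents note on '--' and drop trailing full stops."""
    parts = re.split(r"\s*--\s*", note)
    return [p.strip().rstrip(".").strip() for p in parts if p.strip()]


for title in split_contents(CONTENTS_NOTE):
    print(title)
```

Each resulting title is only a candidate work. Attributing it to the 100 author is a further assumption that holds for this record but not for collections with several contributors.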
What quality can we achieve? • A large number of records have a “simple” FRBR structure • Single creator, published once... • The quality of the results from the more complex records is more questionable • But this is where FRBR is most needed • Errors and problems that users would never notice become very visible when FRBRizing
Concluding remarks • Is MARC sufficient for FRBR? • More structured information about expressions and works is possible even in MARC • Extensive use of relator codes is needed • Field linking (in MARC 21) could solve many of the problems caused by multiplicity • Can we automatically improve existing records? • By implementing more intelligent entity discovery solutions • Using information from other records/catalogues when interpreting a record