Virtual International Authority File

Presentation Transcript

  1. Virtual International Authority File. ALA, June 2006

  2. Virtual International Authority File • Link authority records from national bibliographic agencies • Build on their authority work • Expand the concept of universal bibliographic control • Allow national or regional variations in authorized form to co-exist • Support needs for variations in preferred language, script, and spelling

  3. Joint VIAF Project

  4. Semantic Web Building Blocks • (Library) authority files • A&I controlled vocabularies • Other controlled vocabularies • “Ontologies” • End-user

  5. Project Goal: Demonstrate the feasibility of linking personal names across: • Personennamendatei (PND) • Library of Congress Name Authority File (LCNAF)

  6. What is the VIAF? • System: links between files; Web browser access; multilingual and multi-script • Maintenance: national agencies control their records; records harvested from national systems • Scalable: any number of national authority files

  7. Matching Variations In the LCNAF and PND authority files: • Same name, same person • Same name, different people • Different names, same person • Missing person in one file

  8. Same Name, Different People. Two Different People – One Name: Adams, Mike • PND: a golfer • LCNAF: author of a Beatles collector's guide

  9. Different Names, Same Person. One Person – Two Names • LCNAF: Morel, Pierre • PND: Morellus, Petrus

  10. Enhancing the Authorities (diagram labels): Bibliographic Record • Derived Authority • Authority Record • Enhanced Authority

  11. Mining the Bibliographic Record
      Labeled elements: Usage • Language • LC Control Number • LC Classification • Title • Publisher • Place of Publication • Date of Publication • Material Type • Authors
      LDR 00826ccm 2200289 a 4500
      1 ocm10025532
      5 20031229650847.0
      8 840627s1982 nyuuua n eng
      10 $a 84758340
      40 $a DLC $c DLC
      19 $a 17706440
      20 $c $2.95
      28 22 $a 48418 $b G. Schirmer
      45 2 $b d198006 $b d198007
      48 $b va01 $b ve01 $a ka01
      50 00 $a M1529.3 $b .T
      100 1 $a Thomson, Virgil, $d 1896-
      245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
      260 $a New York : $b G. Schirmer, $c c1982.
      300 $a 1 score (11 p.) ; $c 31 cm.
      500 $a For soprano, baritone, and piano.
      650 0 $a Vocal duets with piano.
      600 10 $a Larson, Jack $x Musical settings.
      700 1 $a Larson, Jack.

  12. Derived Authority Record
      Notes: All text is normalized • Subjects are grouped into broad subject areas • Coauthor • Publication date is by decade • Material type is coded
      00525nz 2200229n 4500
      1 xlc 1
      3 OCoLC
      5 20040721111415.0
      8 040721nneanz||abbn n and d
      40 $a OCoLC $b eng $c OCoLC $f viaf
      100 1 $a Larson, Jack.
      903 $a 84758340
      910 14 $a the cat $b duet for soprano and baritone
      921 $a g schirmer
      922 $a nyu
      930 $a jack larson
      940 $a eng
      942 $a 234
      943 $a 198x
      944 $a cm
      950 1 $a thomson, virgil $d 1896
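The derivation on slides 11 and 12 can be sketched in Python. This is only an illustration: the dictionary keys, the function names, and the exact normalization rule (lowercase, punctuation replaced by spaces) are assumptions inferred from sample values such as "g schirmer" and "198x", not the project's actual code.

```python
import re

def normalize(text):
    """Lowercase, turn punctuation into spaces, collapse whitespace (assumed rule)."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())

def decade(year):
    """Reduce a four-digit year to its decade, e.g. 1982 -> '198x'."""
    return str(year)[:3] + "x"

def derive_authority(bib, name):
    """Build a derived-authority dict for one name in a bibliographic record.

    The keys are hypothetical stand-ins for the 9xx fields on the slide.
    """
    return {
        "name": name,
        "lccn": bib["lccn"],
        "title": normalize(bib["title"]),
        "publisher": normalize(bib["publisher"]),
        "language": bib["language"],
        "decade": decade(bib["year"]),
        # Every other name on the record is treated as a coauthor.
        "coauthors": [normalize(a) for a in bib["authors"] if a != name],
    }

# Values taken from the bibliographic record on slide 11.
bib = {
    "lccn": "84758340",
    "title": "The cat : duet for soprano and baritone",
    "publisher": "G. Schirmer",
    "language": "eng",
    "year": 1982,
    "authors": ["Thomson, Virgil, 1896-", "Larson, Jack."],
}
record = derive_authority(bib, "Larson, Jack.")
```

Here `record["title"]` comes out as "the cat duet for soprano and baritone" and `record["decade"]` as "198x", in line with the 910 and 943 fields shown on the slide.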

  13. Enhanced Authority Record

  14. Strong Matching Attributes • A work (title) in common • Common control numbers (ISBN, ISSN, or LCCN) • Exact birth and death year • Joint authors • Name as subject

  15. Weaker Attributes • Only one of birth/death date(s) (allows some variation) • Subject area of works (two levels) • Format (books, films, musical scores, etc.) • Language • Publisher • Partial title match • Date of publication • Country • Role (author, illustrator, composer, etc.)
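One plausible way to combine the strong and weaker attributes is a weighted score. The slides do not say how the attributes are combined, so the weights, the field names, and the scoring scheme in this Python sketch are all assumptions made for illustration.

```python
# Hypothetical field names for the strong and weaker attribute groups.
STRONG = ("titles", "control_numbers", "coauthors")
WEAK = ("languages", "publishers", "decades", "roles")

def shared(a, b, field):
    """True when both records have at least one value in common for the field."""
    return bool(set(a.get(field, ())) & set(b.get(field, ())))

def match_score(a, b, strong_weight=3, weak_weight=1):
    """Score two same-name records; higher means more likely the same person."""
    score = 0
    dates_a = (a.get("birth"), a.get("death"))
    dates_b = (b.get("birth"), b.get("death"))
    if None not in dates_a and dates_a == dates_b:
        score += strong_weight  # exact birth and death year
    elif a.get("birth") is not None and a.get("birth") == b.get("birth"):
        score += weak_weight    # only one of the dates agrees
    score += strong_weight * sum(shared(a, b, f) for f in STRONG)
    score += weak_weight * sum(shared(a, b, f) for f in WEAK)
    return score

# Illustrative records only; the second one is made up for the example.
lc = {"titles": ["the cat duet for soprano and baritone"],
      "languages": ["eng"], "decades": ["198x"]}
pnd = {"titles": ["the cat duet for soprano and baritone"],
       "languages": ["eng", "ger"], "decades": ["199x"]}
score = match_score(lc, pnd)  # one shared title (strong) plus one shared language (weak)
```

A threshold on the score would then separate likely matches from cases such as the two people named Adams, Mike, who share a name but no works.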

  16. Exact name match with dates • Standard number • Corporate name • Joint author • Language • Publisher • Subject • Decade • Role • Exact title match

  17. LC Names • Established Names: 4,187,973 • Names from Bib Records: 3,440,706 • Active Established Names: 2,556,824 • Uncontrolled Names: 883,882 • Orphaned Names: 1,631,149

  18. DDB Names • Established Names: 2,659,276 • Names from Bib Records: 2,319,829 • Active Established Names: 2,013,618 • Uncontrolled (Undif’d) Names: 306,211 • Orphaned Names: 645,658

  19. Results • Matches 558,618 • Complex Matches 70,797 • Unique Matches 487,821
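The counts on slides 17 to 19 are internally consistent, which suggests how the categories relate: names from bibliographic records appear to be the active established names plus the uncontrolled names, and orphaned names appear to be the established names that are not active. The slides do not define these terms, so the reading below is an inference and the variable names are mine.

```python
# Figures copied from slides 17, 18, and 19.
lc = {"established": 4_187_973, "from_bib": 3_440_706,
      "active": 2_556_824, "uncontrolled": 883_882, "orphaned": 1_631_149}
ddb = {"established": 2_659_276, "from_bib": 2_319_829,
       "active": 2_013_618, "uncontrolled": 306_211, "orphaned": 645_658}

def consistent(f):
    """Check the two inferred relationships between the name categories."""
    return (f["active"] + f["uncontrolled"] == f["from_bib"]
            and f["established"] - f["active"] == f["orphaned"])

matches, complex_matches, unique_matches = 558_618, 70_797, 487_821

lc_ok = consistent(lc)
ddb_ok = consistent(ddb)
results_ok = complex_matches + unique_matches == matches
```

All three checks hold for the published figures.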

  20. VIAF File • LC Names: 4,187,973 • DDB Names: 2,659,276 • Common: 558,618 (70% of potential)

  21. Next Steps • Move to incremental updates • Start harvesting national files • Bring up Web interface (to full files) • Make OAI accessible • Bring in new participants • Handle non-Roman matching • Move to other types of authorities • Corporate names • Geographic names • …

  22. Stage 3: Build OAI Server. OAI Server(s): LCNAF, DDB/PND
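Harvesting from an OAI server is normally done with OAI-PMH `ListRecords` requests, and the `from` argument supports the incremental updates listed under Next Steps. The Python sketch below only builds request URLs. The base URL is a placeholder, and `marc21` as the metadata prefix is an assumption, since each repository advertises its own prefixes.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="marc21",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url and the default metadata_prefix are assumptions for illustration.
    """
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint; fetch only records changed since 1 June 2006.
url = list_records_url("https://example.org/oai", from_date="2006-06-01")
```

A harvester would request this URL, process the returned records, and repeat with the resumption token from each response until none is returned.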

  23. Stage 4: Ongoing maintenance

  24. Stage 5: Build End User Interface with Unicode displays. User’s cookie specifies Hangul is preferred. Display 700 form, building on local system’s authority structure.
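The display preference on this slide can be sketched as choosing, among the linked forms of a name, the first one written in the user's preferred script. The function names and the sample name forms in this Python sketch are illustrative assumptions, not taken from the slides, and the script test relies on Unicode character names.

```python
import unicodedata

def uses_script(text, script):
    """True if any letter's Unicode name starts with the script name, e.g. 'HANGUL'."""
    return any(unicodedata.name(ch, "").startswith(script)
               for ch in text if ch.isalpha())

def preferred_form(forms, script):
    """Return the first form in the preferred script, else the first form listed."""
    for form in forms:
        if uses_script(form, script):
            return form
    return forms[0]

# Illustrative forms of one name: an established Latin-script heading and a
# Hangul form of the kind a 700 linking field might carry.
forms = ["Kim, Sowol", "김소월"]
display = preferred_form(forms, "HANGUL")
```

With a preference of "HANGUL" the Hangul form is selected. When no form in that script exists, the function falls back to the first form listed.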

  25. Thank you. T. Hickey (http://errol.oclc.org/laf/n82-54463.html), ALA, June 2006