WEB ARCHIVING IN THE BRITISH LIBRARY

John Tuck, Head of British Collections. February 2004.


  1. WEB ARCHIVING IN THE BRITISH LIBRARY • John Tuck, Head of British Collections. • February 2004.

  2. BRITISH LIBRARY: CONTEXT • Created by British Library Act 1972. • National Library of the United Kingdom. • Origins from 1753. • One of world’s greatest research libraries. • 160 million collection items.

  3. BRITISH LIBRARY: COLLECTION DEVELOPMENT • Building as completely as possible the UK national published archive: current and retrospective gap filling; print and electronic. • Collecting research-level English-language material published world-wide in the humanities, social sciences, STM. • Buying foreign-language material selectively. • Material acquired through: legal deposit, voluntary deposit from publishers, purchase, donation, exchange.

  4. LEGISLATION • Legal Deposit Libraries Act 2003: enabling legislation. • VDEP: Voluntary Deposit of Electronic Publications.

  5. DOMAIN.UK • Six-month experiment to select and capture 100 UK web-sites, 2001. • Audit change, loss, links, etc. • Determine next steps.

  6. DOMAIN.UK: Why? • Short-lived nature/changing content of many web-sites. • Loss of information. • Increasing reference to web-sites in research/scholarship.

  7. DOMAIN.UK: Voluntary/Rights Cleared Approach • Voluntary. • Requiring explicit agreement of website publishers to take part in pilot. • No public access.

  8. DOMAIN.UK: Selection • Websites of historical or cultural significance. • Cross-section of Dewey Decimal Classification.

  9. DOMAIN.UK: Process • E-mail selected sites for approval and to check whether already archived. • Measure sites for links, size, change, etc. • Frequency of visits: every three weeks or more in some cases. • Supported by those sites approached. • Report recommended scaling up.

  10. BRITISH LIBRARY WEB ARCHIVING PROGRAMME • Building on Domain.uk. • BL to play leading role in collecting UK web presence in partnership with other institutions nationally and internationally. • Selective approach.

  11. BRITISH LIBRARY WEB ARCHIVING PROGRAMME contd. • Co-ordinate a snapshot of entire UK web presence at occasional intervals. • Achieve more regular capture of limited and well-defined range of sites. • Sites judged to be research-level, whether in terms of stated intentions of sites themselves or of potential to be primary resources for research.

  12. WEB ARCHIVING PROGRAMME • Comprises a series of complementary projects and activities. • Based entirely on a voluntary, rights-cleared approach pending secondary legal deposit legislation. • Aims to embed web archiving within the BL's overall collection development policy. • Aims to provide the infrastructure to collect, preserve and make accessible web-site material alongside material in other formats.

  13. WEB ARCHIVING PROGRAMME STRANDS • Four main strands: • Definition of collection development policy. • UK Web Archiving Consortium. • International Internet Preservation Consortium. • Internet Archive: incunabula of the internet.

  14. COLLECTION DEVELOPMENT • Appointment of Curator, Web Archiving. • Extension of policy defined for Domain.uk. • Sites of national, historical and cultural significance. • Research level now/in the future.

  15. UK WEB ARCHIVING CONSORTIUM • Two-year project. • Six partners: BL (lead); National Library of Scotland, National Library of Wales, National Archives, Joint Information Systems Committee, Wellcome Library. • Plan to use PANDAS software developed by National Library of Australia. • Rights to use individual sites to be cleared with rights-holders.

  16. UK WEB ARCHIVING CONSORTIUM contd. • Procurement exercise in progress to recruit supplier to host service. • Intention to let contract in April 2004 and to be operational in summer 2004. • Sites to be made accessible to users. • Each partner to collect up to 500 sites per year, i.e. 6,000 across the six partners during the two-year project.

  17. INTERNATIONAL INTERNET PRESERVATION CONSORTIUM • Project involving national libraries. • Led by Bibliothèque nationale de France. • Also includes BL, Library of Congress, Library and Archives of Canada, Nordic countries, Italy, Australia, Internet Archive.

  18. INTERNATIONAL INTERNET PRESERVATION CONSORTIUM contd. • Aims to develop automated web-crawler mechanism. • Open-source tools to search web at regular intervals matching agreed collection development policies. • Working groups in: access tools, content management, deep web, framework, metrics and test-beds, researcher requirements. • Developmental at this stage.

  19. INTERNET ARCHIVE • Collecting and saving sites since 1997. • Wayback Machine. • Legal, technical and procurement issues.

  20. SOME CHALLENGES • Defining UK. • Rapid technology change. • Third party rights (not always subject to UK law). • Libel/defamation issues. • Software issues / which platform? • Validity of a snapshot.

  21. SOME CHALLENGES contd. • Formats for archiving. • Metadata standards. • Archiving ‘look and feel’. • Authenticity.
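
Slide 9 mentions measuring captured sites for links, size and change. The slides do not say how this was done, so the following is only a minimal illustrative sketch in Python, not the Domain.uk method: it assumes a capture is available as an HTML string, and the names `LinkCounter`, `measure_snapshot` and `has_changed` are hypothetical.

```python
import hashlib
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects the href targets of <a> tags in one captured page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def measure_snapshot(html: str) -> dict:
    """Return size, link count and a content digest for one capture (assumed metrics)."""
    parser = LinkCounter()
    parser.feed(html)
    data = html.encode("utf-8")
    return {
        "size_bytes": len(data),
        "link_count": len(parser.links),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def has_changed(earlier: dict, later: dict) -> bool:
    """Treat any difference in the content digest as a change between visits."""
    return earlier["sha256"] != later["sha256"]


if __name__ == "__main__":
    # Two hypothetical captures of the same page, taken on separate visits.
    first = measure_snapshot('<p><a href="/about">About</a></p>')
    second = measure_snapshot('<p><a href="/about">About</a> <a href="/news">News</a></p>')
    print(first["link_count"], second["link_count"], has_changed(first, second))
```

A byte-level digest flags even trivial differences such as a changed timestamp, so a real audit would probably need a more tolerant comparison; that choice is outside what the slides describe.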
