Web Archiving at the National Library of Australia. Russell Latham, Senior Web Archivist, National Library of Australia.
“The Web's ever-expanding size, the dynamic and ephemeral nature of its content, and how this is to be captured, stored and made accessible for the long-term are some of the key questions being addressed by electronic archiving programs.” PADI, http://www.nla.gov.au/padi/topics/92.html
What is web archiving?
• A web archive is not the same as the live web
• Brings a different value to web content
• Creating artefacts from the web
• Preserved snapshots, slices, gobbets of time
• Challenge of timeliness
• At certain times some things are more interesting and valuable
• Focus on the future and long term access (preservation objective)
History: web archiving at the NLA
• April Fools' Day 1996: ‘Electronic Unit’ established
• May 1998: public access to PANDORA titles
• July 1998: first PANDORA ‘partner’ began participation
• 10th participant joined in 2003
• June 2001: PANDAS v.1 released
• Web archiving workflow system developed by NLA
• 2002: Digital Archiving Branch
• Our own identity at last!
• Began first trial of ‘mainstreaming’ web archiving in Serials and Govt Deposit sections
History: web archiving at the NLA
• August 2002: PANDAS v.2 released
• July 2003: joined IIPC
• 2004: PANDORA added to UNESCO Australian Memory of the World Register
• July 2005: first .au domain harvest
• Subsequent harvests in 2006, 2007, 2008 & 2009
• December 2006: “Web Archiving and Digital Preservation Branch”
• July 2007: PANDAS v.3 released (at last!)
• 2010: PANDORA search moved to Trove
• May 2010: Proposal for whole-of-govt ‘opt-out’ arrangements through SIGB
What we collect
• Selective approach
• Collaboration with PANDORA participating agencies
• Modest in size
• High quality, timely, high value collection, described and searchable
• Accessible to the public
Searching the collections
• Subjects
• Browse list
• Collections
• Agency based
• Trove – Archived Websites
• Trove – bibliographic
• Search engines
Subjects/Browsing
• When looking for non-specific resources
• Wish to browse a topic area
Agency based
• Use the partners page: http://pandora.nla.gov.au/partners.html
Federal Elections: 1996, 1998, 2001, 2004, 2007, 2010
Australian web domain harvests
• Annual domain harvests 2005-2009
• Working with the Internet Archive
• Covers .au top level domain and a bit more …
• No public access
• Quantity over quality; content not assessed or described; opportunistic rather than timely
Comparative statistics: Domain Harvests and PANDORA