Archive-It: Archiving & Preserving Digital Content

LionelDale

Presentation Transcript

  1. Archive-It: Archiving & Preserving Digital Content

  2. Internet Archive
     • We are a digital library
     • Founded in 1996 by Brewster Kahle
     • Located in San Francisco, California

  3. www.archive.org
     • Largest publicly available web archive in existence
     • Accessible starting in 2001
     • 400 billion+ URLs
     • 80+ million websites
     • Content in 40+ languages
     • Collects a snapshot of the web every 60-90 days
     • 361 billion pages saved

  4. Web Archiving Service: Archive-It
     Archive-It is a subscription service launched in February 2006.
     • Web-based application that allows users to create, manage, access, and store collections of digital content
     • Fully hosted service that includes access and storage
     • Provides tools for selection and scoping, including cataloging with metadata
     • Can capture content at 10 different frequencies
     • Archived content includes HTML, text, video, audio, social media, PDF, images, and online newspapers
     • Archived content can be browsed 24 hours after a capture is complete; full-text search is available within 7 days
     • Restricted access options are available
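
The two access delays on this slide can be worked through as simple date arithmetic. The sketch below is illustrative only: `access_timeline` is a hypothetical helper, not part of any Archive-It tool, and the capture time is an arbitrary example value. It assumes the 24-hour and 7-day figures are measured from the moment a capture completes.

```python
from datetime import datetime, timedelta

# Delays quoted on the slide (assumed to count from capture completion).
BROWSE_DELAY = timedelta(hours=24)
SEARCH_DELAY = timedelta(days=7)


def access_timeline(capture_completed: datetime) -> dict:
    """Hypothetical helper: when a finished capture becomes usable."""
    return {
        # Archived pages can be browsed 24 hours after the capture ends.
        "browsable_from": capture_completed + BROWSE_DELAY,
        # Full-text search is available no later than 7 days afterwards.
        "searchable_by": capture_completed + SEARCH_DELAY,
    }


# Arbitrary example: a capture that finished on 1 March 2014 at 09:00.
timeline = access_timeline(datetime(2014, 3, 1, 9, 0))
print(timeline["browsable_from"])  # 2014-03-02 09:00:00
print(timeline["searchable_by"])   # 2014-03-08 09:00:00
```

Note that the 7-day figure is an upper bound ("within 7 days"), so the second value is a latest expected time, not an exact one.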

  5. Archive-It Partners

  6. What is Web Archiving? Web archiving is the process of collecting portions of web content, preserving the collections, and then providing access to the archives for use and reuse. A web archive is a collection of archived URLs grouped by theme, event, subject area, or web address.
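
The definition above (archived URLs grouped by theme, event, subject area, or web address) can be pictured as a small data structure. This is a minimal sketch for illustration, not Archive-It's actual data model: the `ArchivedUrl` class, the `group_captures` function, and the example.org URLs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedUrl:
    """Hypothetical record for one archived URL."""
    url: str
    group: str  # theme, event, subject area, or web address


def group_captures(captures):
    """Group archived URLs into collections keyed by their group label."""
    collections = defaultdict(list)
    for capture in captures:
        collections[capture.group].append(capture.url)
    return dict(collections)


# Made-up example data.
captures = [
    ArchivedUrl("http://example.org/election/candidates", "2012 election"),
    ArchivedUrl("http://example.org/election/results", "2012 election"),
    ArchivedUrl("http://example.org/campus/news", "university history"),
]

web_archive = group_captures(captures)
for name, urls in web_archive.items():
    print(name, len(urls))
```

Each key in `web_archive` stands for one collection, and its list holds the archived URLs that belong to it.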

  7. Challenge: a lot of data
     • Amount of content that is being archived
     • Amount of data being created by content providers
     http://www.helenbrowngroup.com/2011/02/rescue-from-the-digital-firehose/gushing-firehose-by-joseph-robertson/
     http://www.chaitalag.com/new/s/tubig

  8. Challenge: What to archive? …What is important to you? What do you want people to know about? What are your organization’s collecting activities? Vision?

  9. Archive-It Use Cases
     • Create a thematic/topical web archive on a specific subject or event
     • Different perspectives and social commentary (tweets, blogs, comments)
     • Can include spontaneous events
     • Often related to traditional collecting activity around the same focus
     • Mandate to capture/preserve institutional memory and history: construct a historical record of an institution’s web presence over time
     • Support an electronic records system to meet records retention requirements
     • Capture publications that aren’t being deposited in print form
     • Closure crawls

  10. Access to Public Collections
      Partners:
      • Can view through a private web application with login/password
      General Public:
      • Can view from the Archive-It website: http://www.archive-it.org/
      • Landing pages: view from the organization’s website with a branded page that links back to Archive-It hosted data
      • Integration with existing systems and catalogs

  11. Storage & Preservation
      Multiple ways to store and preserve
      Storage:
      • 2 copies of the archived data (primary and back-up) are stored at the San Francisco data center
      • Collections are transferred to the General Archive as a third copy
      • A copy of archived data can be shipped on a hard drive
      • Files can be downloaded from Internet Archive servers
      Digital Preservation:
      • 2008: LOCKSS
      • 2013: DuraCloud

  12. Web Archiving Life Cycle Model http://www.archive-it.org/publications

  13. Questions & Answers
      Lori Donovan
      lori@archive.org
      Thank you!