
Welcome to the SERI Educational Webinar June 10, 2014



Presentation Transcript


  1. Welcome to the SERI Educational Webinar, June 10, 2014 • Let us know who you are, where you’re from, and who is participating with you today. • Use the chat box on the right of the screen to type your name, state, and the names of those watching the webinar with you. • You can connect to the audio portion of today’s webinar through your phone line or through VoIP.

  2. Acknowledgements • This webinar is made possible by a grant from the National Historical Publications & Records Commission (NHPRC). SERI Educational Webinar - June 10, 2014

  3. Electronic Records Inventory • Tibaut Houzanme, Electronic Records Specialist, Indiana Commission on Public Records • Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society

  4. Indiana’s Electronic Records Inventory: Towards a Statewide Digital Preservation Repository. June 10, 2014. Presented by Tibaut Houzanme, Electronic Records Specialist, Indiana Commission on Public Records, thouzanme@icpr.in.gov

  5. Foreword: This is a work in progress, Indiana’s internal macro-level study (2013 data) towards a statewide digital repository. Policy and economic considerations are still being debated, and options for cost savings are still being explored. Overall, Indiana is planning with a high sense of urgency while leaving enough wiggle room for the unknown, negotiation, and cost modeling, which could still lead to a significant achievement. The ultimate goal of this inventory is to support our business case and facilitate access to records through a unified interface.

  6. Background • Established as an agency in 1979, the Indiana Commission on Public Records has oversight of state and local records. The Commission manages the State Archives, Records Center, Forms Management, Records Management, and Imaging Lab. • In 2011-12, Indiana began developing an electronic records program. One component of that process was an inventory of the electronic holdings of the State Archives, along with calculations to determine the requirements for hosting the files as native, normalized, and access copies. • The Commission is looking at the feasibility of building a statewide (state and local governmental units+) digital repository for permanent electronic records.

  7. How we went about it.

  8. Overall inventory process • Access to, and use of, previous reports (incl. the Indiana Office of Technology’s report on data) • Current reports from the Archives’ accession database • Visits, inspection, recounts, verification (ongoing) • Management review; Assumptions review (ongoing) • In-house and expert opinion estimates of records sizes if digitized • Estimates/Models (refinement ongoing)

  9. What we assumed.

  10. Scope of records considered • Born-digital and accessioned records (including all electronic media) • Surrogate records (from external partners, Ancestry, Family Search) • Web pages (Archive-It) • Analog records to be digitized (for access/preservation reasons) • Paper/Microfilm (text/image records) • Audio (tapes) • Video (tapes, films) • Estimates of current records that might come from IOT’s data center • Records and records growth compared to data growth • Social media was out of scope for this inventory

  11. What will be stored in a digital repository? The repository would contain, depending on the type of record: • 1 original record • 1 migrated/normalized copy • 1 or multiple access copies, depending on the material • Metadata (some will grow over the life of the records, e.g. the audit trail) • The entire repository should be replicated at least once. The requirements that guided us were OAIS, TRAC and the Digital Preservation Capability Assessment. Though we kept the number of access copies at one, for some categories of records such as audio, Indiana is looking at 3 access copies (iTunes, WMA & MP3).
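As a rough illustration of how the copy counts on slide 11 translate into storage, the sketch below adds up one original, one normalized copy, a configurable number of access copies, and metadata, then multiplies by the number of full repository copies. The function name, its parameters, and the sizes in the example are hypothetical placeholders, not figures from the Indiana inventory.

```python
def repository_footprint_gb(original_gb, normalized_gb, access_copy_gb,
                            access_copies=1, metadata_gb=0, replicas=1):
    """Estimate total storage for one record category.

    replicas is the number of additional full copies of the repository
    (the slide asks for at least one replication).
    """
    if access_copies < 1:
        raise ValueError("at least one access copy is expected")
    if replicas < 0:
        raise ValueError("replicas cannot be negative")
    # Size of a single, unreplicated copy of the repository content.
    one_copy = (original_gb + normalized_gb
                + access_copies * access_copy_gb + metadata_gb)
    # The primary copy plus each replica.
    return one_copy * (1 + replicas)


# Hypothetical audio category: 3 access copies, replicated once.
print(repository_footprint_gb(100, 80, 10, access_copies=3,
                              metadata_gb=1, replicas=1))
```

Because metadata such as audit trails grows over time, `metadata_gb` would need to be revisited periodically rather than fixed once.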

  12. What is the initial size per category of digitized records? Note: Scan tests on 10 samples of paper records gave the average size for each category: bitonal, grayscale, black & white, and color. Microfilm numbers came from our imaging lab, and we relied on expert advice for audiovisual digitization.

  13. What percentage of data held could be considered “records”? Based on a survey by the Compliance, Governance and Oversight Council (CGOC), 31% of electronic data are records needing to be retained for some time period. *Source: CGOC Benchmark Report on Information Governance in Global 1000 Companies https://www.cgoc.com/files/CGOC_Workshop_Nov2012_NYC_PROCEEDINGS.pdf CGOC, est. 2004, is a forum of over 1,900 legal, IT, records and IM executive professionals from corporations and government agencies. *Disclaimer: This data reflects records-to-data ratios from the corporate world, based on a survey of Fortune 1000 companies' legal, records and IT staff across 10 industries.

  14. How might the repository content grow? Based on International Data Corporation (IDC)’s data storage predictions, data volume will grow tenfold in the next 7 years, at a rate of 40% each year. Source: IDC’s Digital Universe Study – 2014: http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm IDC is a market research, advisory and events services firm for information technology, telecommunications and consumer technology. IDC is a subsidiary of IDG, a global IT and technology company that owns brands such as CIO®, CSO®, Computerworld®, GamePro®, InfoWorld®, Macworld®, Network World®, PCWorld® and TechWorld®, which reach an audience of more than 280 million technology buyers in 97 countries.
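The two numbers on slide 14 are consistent with each other: compounding 40% per year for 7 years gives a multiple of a little over ten. A minimal sketch of that arithmetic follows; the function names are illustrative and the starting volume in the example is a hypothetical placeholder.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Cumulative multiple after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def projected_size(start_size: float, annual_rate: float, years: int) -> float:
    """Project a starting volume forward under constant compound growth."""
    return start_size * growth_factor(annual_rate, years)


factor = growth_factor(0.40, 7)
print(f"{factor:.1f}x after 7 years")  # about 10.5x, i.e. roughly tenfold

# Hypothetical 100 TB starting volume, not a figure from the inventory.
print(f"{projected_size(100, 0.40, 7):.0f} TB")
```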

  15. What are our results/numbers?

  16. Inventory results: Born Digital • Websites harvested: 1,741 GB. Some may be permanent, some may not.

  17. Inventory results: Surrogates

  18. Inventory results: Paper records

  19. Inventory results: Microfilm records

  20. Inventory results: Audiovisual records

  21. Undeclared Digital Records Estimate – State Data Center *Disclaimer: Governments are different from corporations and may be required to retain more than 31% of their data as records. However, the percentage of permanent records that gets transferred to the State Archives could be less than 31%. For paper records, that share averages around 3 to 5%. Indiana used the formula of 10% of 31%, that is, 3.1% of the data held.
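The "10% of 31%" formula on slide 21 can be sketched as a two-step multiplication. The 31% records share comes from the CGOC survey cited on slide 13 and the 10% permanent share is Indiana's stated assumption; the function name and the data center volume in the example are hypothetical, since the transcript does not give the actual volume.

```python
RECORDS_SHARE_OF_DATA = 0.31     # CGOC survey: share of data that are records
PERMANENT_SHARE_OF_RECORDS = 0.10  # Indiana's assumption for archival transfer


def permanent_records_estimate(data_volume_tb: float) -> float:
    """Estimate the volume of permanent records hidden in a data center."""
    records_tb = data_volume_tb * RECORDS_SHARE_OF_DATA
    return records_tb * PERMANENT_SHARE_OF_RECORDS


# Hypothetical 1,000 TB data center, used only to show the calculation.
estimate = permanent_records_estimate(1000)
print(f"{estimate:.1f} TB")  # 3.1% of the input volume
```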

  22. Total repository and growth – state & local governments *We hope to build an infrastructure that is scalable enough to accommodate twice the State of Indiana’s digital repository content. This will enable any of the 2,355 local government units to join and share cost. It will also allow for growth.

  23. Learning and perspectives The following considerations emerge from the results: • Roughly 5 PB is the best estimate we have for the repository, based on the assumptions and calculations. • Managing 3 to 5 PB of records requires sound and viable options. For example, storing 5 PB for 5 years will cost the following: • IOT Storage = ~ $91,226,112 (@ $0.29/month) • Amazon Gov Cloud = ~ $23,730,585 (+ access/get & downloads) • DIY hard drive storage: ($276,352 to $654,320) + redundancies & management • Better options? • Two priorities emerge: addressing the obsolescence of audiovisual and removable media, and building a consortium of state and local governments to participate. • Addressing current records or data will be key to reducing repository size and metadata cost and to improving efficiency through governance.
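The IOT figure on slide 23 can be reproduced exactly under two assumptions that the slide does not state: that the $0.29 rate is per GB per month, and that 1 PB is counted as 1,024 × 1,024 GB. The sketch below works in whole cents to avoid floating-point rounding; the function and parameter names are illustrative. The Amazon and DIY figures are not reproduced here because the transcript does not give their rates.

```python
GB_PER_PB = 1024 * 1024  # assumption: binary petabytes (1 PB = 1,048,576 GB)


def storage_cost_usd(petabytes: int, cents_per_gb_month: int, months: int) -> float:
    """Flat-rate storage cost, assuming the rate applies per GB per month."""
    total_cents = petabytes * GB_PER_PB * cents_per_gb_month * months
    return total_cents / 100


# 5 PB for 5 years at an assumed $0.29 per GB per month.
iot_cost = storage_cost_usd(petabytes=5, cents_per_gb_month=29, months=5 * 12)
print(f"${iot_cost:,.0f}")  # $91,226,112, matching the slide's IOT figure
```

A flat-rate model like this ignores growth during the five years and any tiered pricing, so it is best read as an upper-level budgeting check rather than a quote.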

  24. Evolution of an Inventory: Development and Use of an Inventory for Long Term Preservation Planning. Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society, sarah.grimm@wisconsinhistory.org

  25. Who we are…

  26. Why an Inventory? • For budgetary planning and requests • To raise institutional awareness • To collect the “bits and pieces” that had not necessarily been accounted for.

  27. Creation of the Inventory • Started with the basic questions… • What is it? • Who owns it? • What does it consist of? • Where is it right now? • How critical is the data? • How is it being stored? • What about access?

  28. What is it? • Title • Description • Genre Term • Dates • Estimated growth over time *** • General Notes

  29. Date Considerations • Date Original: dates associated with the content (1860-1865) • Date Digital: date of files, created or modified (2009) • Date Received: if relevant / possible (2011) • Example: Shawano Probate Cases 1860-1865, digitized by USG in 2009, received by WHS in 2011

  30. Who owns it? • Owner - Who currently “owns” the digital content? • Responsible staff - Who knows the most about it? • Creator (Internal or External) - Who created the digital content? Diagram labels: Collections Department / Personnel, Digital Management, Creator. These may be different people (or not).

  31. What does it consist of? • Medium (SAN, 6 CDs, 1 hard drive, 115 floppy disks) • Extent = Format + Amount (600 .pdfs, 30 .doc) • File Size (MB, GB, TB)

  32. Where is it right now? Locations of content are important: • List primary locations • List locations of all backups/copies (hard drive in the storage room, weekly backup tapes, offsite location) …remembering to change locations as content moves.

  33. How critical is the data? • Data Criticality • Business Criticality • Ownership

  34. Data Criticality • Rated on a scale of 1 to 5 • 1 - Digital and we hold the only copy • 2 - We have a digital copy but physical copies are at high risk (ex: audio tapes) • 3 - We have a digital copy but physical copies reside elsewhere • 4 - We have a digital copy but digital copies reside elsewhere • 5 - We have a digital copy and still hold original physical item

  35. Business Criticality • Rated on a scale of 1 to 4 • 1 – Irrecoverable • 2 – Major Impact • 3 – Minor Impact • 4 – No Impact

  36. Ownership • Do we have a statutory requirement to hold a collection? • Do we have a donor contract? • Did we purchase it?

  37. How is it being stored? • Standard Backup • Dark Archive • Recovery Time

  38. What about access? • Data Access • Restrictions
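The inventory questions on slides 27 to 38 map naturally onto a single record per collection. The sketch below is one hypothetical way to model such an entry, with the two criticality scales from slides 34 and 35 as enumerations so that out-of-range ratings are rejected. The class and field names are illustrative only; the transcript does not describe the actual Excel or MS Access layout, and the owner, staff, medium, extent, size, and rating values in the example are made up.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class DataCriticality(IntEnum):
    ONLY_DIGITAL_COPY = 1            # digital, and we hold the only copy
    PHYSICAL_AT_HIGH_RISK = 2        # e.g. audio tapes
    PHYSICAL_ELSEWHERE = 3
    DIGITAL_ELSEWHERE = 4
    ORIGINAL_PHYSICAL_HELD = 5


class BusinessCriticality(IntEnum):
    IRRECOVERABLE = 1
    MAJOR_IMPACT = 2
    MINOR_IMPACT = 3
    NO_IMPACT = 4


@dataclass
class InventoryEntry:
    title: str
    owner: str
    responsible_staff: str
    creator: str
    medium: str
    extent: str
    file_size_gb: float
    data_criticality: DataCriticality
    business_criticality: BusinessCriticality
    locations: list[str] = field(default_factory=list)
    date_original: str = ""
    date_digital: str = ""
    date_received: str = ""

    def __post_init__(self) -> None:
        # Coerce plain integers and reject values outside each scale
        # (the enum constructor raises ValueError for unknown ratings).
        self.data_criticality = DataCriticality(self.data_criticality)
        self.business_criticality = BusinessCriticality(self.business_criticality)


# Dates follow the Shawano example on slide 29; all other values are made up.
entry = InventoryEntry(
    title="Shawano Probate Cases 1860-1865",
    owner="Collections Department",
    responsible_staff="Electronic Records Archivist",
    creator="External",
    medium="1 hard drive",
    extent="600 .pdfs",
    file_size_gb=12.5,
    data_criticality=3,
    business_criticality=2,
    locations=["primary storage", "weekly backup tapes"],
    date_original="1860-1865",
    date_digital="2009",
    date_received="2011",
)
print(entry.data_criticality.name, entry.business_criticality.name)
```

Keeping the ratings as enumerations rather than free text makes it easier to sort or filter the inventory by risk later, which fits the slide 44 plan of reusing inventory fields as repository metadata.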

  39. What we learned along the way…

  40. Test your inventory

  41. Pick the right tool • Started with Excel • BUT it took foorreevveerr to scroll across the page (resulting in this) • Moved it to MS Access

  42. Take some time… • To get people involved • To find the content • …plan for that

  43. Take another look • Identified collections that needed more work before they were ready for the repository…

  44. Next Steps • Evolving the existing inventory to a Pre-SIP tracking mechanism • Incorporating some of the inventory fields into our future repository as metadata

  45. Contacts • Tibaut Houzanme, Electronic Records Specialist, Indiana Commission on Public Records, thouzanme@icpr.in.gov • Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society, sarah.grimm@wisconsinhistory.org

  46. Questions & comments

  47. Webinar Evaluation • We really do appreciate your feedback! • After you exit the webinar, you will automatically be taken to an online webinar evaluation. Please take a couple of minutes to complete the survey and help us plan future webinars.

  48. Upcoming SERI webinars • Tuesday, July 8, 2014: SERI Webinar Topic To-Be-Determined • Tuesday, July 22, 2014: SERI Webinar PERTTS Portal Overview

  49. Stay connected & informed • CoSA Website: http://www.statearchivists.org • CoSA Resource Center: http://rc.statearchivists.org • CoSA Blog: http://statearchivists.wordpress.com • CoSA Twitter Handle: @StateArchivists • CoSA Facebook Page: www.facebook.com/CouncilOfStateArchivists • SERI Facebook Page: www.facebook.com/SERIproject
