Digital Preservation in State Government: Best Practices Exchange 2006

Preservation Strategies in the North Carolina Geospatial Data Archiving Project (NCGDAP). Steve Morris, Head of Digital Library Initiatives, NCSU Libraries. Digital Preservation in State Government: Best Practices Exchange 2006.


Presentation Transcript


  1. Preservation Strategies in the North Carolina Geospatial Data Archiving Project (NCGDAP). NCSU Libraries. Steve Morris, Head of Digital Library Initiatives. Digital Preservation in State Government: Best Practices Exchange 2006

  2. Overview
     • Digital geospatial data preservation issues
     • Technical solutions
     • Organizational/cultural solutions

  3. NC Geospatial Data Archiving Project
     • Partnership between university library (NCSU) and state agency (NCCGIA), with the Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
     • One of 8 initial NDIIPP partnerships (only state project)
     • Focus on state and local geospatial content in North Carolina (state demonstration)
     • Tied to the NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
     • Objective: engage existing state/federal geospatial data infrastructures in preservation

  4. Targeted Content
     • Resource Types
       • GIS “vector” data
       • Digital orthophotography
       • Digital maps
       • Tabular data
     • Content Producers
       • Mostly state, local, regional
       • Some university, commercial
       • Selected local federal projects

  5. Today’s geospatial data as tomorrow’s cultural heritage
     Future uses of data are difficult to anticipate (as with Sanborn Maps).

  6. Risks to Digital Geospatial Data
     • Producer focus on current data
       • Time-versioned content generally not archived
     • Future support of data formats in question
       • Vast and complex range of data formats in use
     • Shift to “streaming data” for access
       • Archives have been a by-product of providing access
     • Preservation metadata requirements
       • Descriptive, administrative, technical, DRM
     • Geodatabases
       • Complex functionality

  7. Different Ways to Approach Preservation
     • Technical solutions: How do we preserve acquired content over the long term?
     • Cultural/organizational solutions: How do we make the data more preservable, and more likely to be preserved, at the point of production?

  8. Vector Data Format Options
     • Option A: use an open format, and accept a really unfortunate transformation and limited vendor support for the output object
     • Option B: use a closed format, but retain the original content and count on short- and medium-term vendor support
     • Option C: do both to buy time, and look for an open, ASCII solution (watch GML activity)
     No sweet spot, just an evolving and changing mix of flawed options that are used in combination.
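
Option C, keeping the original alongside an open-format derivative, could be recorded with a small fixity manifest. The following is only a minimal sketch under stated assumptions: the function names, the role labels, and the manifest layout are hypothetical and are not taken from the NCGDAP project.

```python
import hashlib


def checksum(payload: bytes) -> str:
    """Return the SHA-256 hex digest used here as a fixity value."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(dataset_id, original, derivative):
    """Describe both copies of a dataset kept under 'Option C'.

    `original` and `derivative` are (filename, bytes) pairs. The role
    names and the manifest structure are illustrative assumptions.
    """
    entries = []
    for role, (name, payload) in (
        ("original", original),
        ("open_derivative", derivative),
    ):
        entries.append(
            {
                "role": role,
                "file": name,
                "bytes": len(payload),
                "sha256": checksum(payload),
            }
        )
    return {"dataset": dataset_id, "files": entries}


# Hypothetical example: a closed-format original and a GML derivative.
manifest = build_manifest(
    "parcels",
    ("parcels.shp", b"original bytes"),
    ("parcels.gml", b"<gml/>"),
)
```

Recording both roles side by side keeps it visible that neither copy replaces the other, which matches the "no sweet spot" framing of the slide.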

  9. Preserving Cartographic Representation
     The counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

  10. Preserving Geodatabases
      • Spatial databases in general vs. ESRI Geodatabase “format”
      • Not just data layers and attributes, but also topology, annotation, relationships, behaviors
      • Growing use of geodatabases by municipal and county agencies
      • Some looking to Geodatabase as archive platform (in addition to feature class export)
      • ESRI Geodatabase archiving approaches
        • Feature Class Export, XML Export, Geodatabase History, File Geodatabase, Geodatabase Replication

  11. Harnessing Geospatial Web Services
      • Image atlases from WMS services?
      • Capturing cartographic representation?
      • Capturing records from decision-making processes?
      • Later: data transfer via WFS and GML? Other?
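
One way to picture an "image atlas" harvested from a WMS service is as a grid of GetMap requests covering an area of interest. The sketch below only builds the request URLs, using the standard WMS 1.1.1 GetMap parameters. The endpoint, the layer name, the extent, and the grid size are hypothetical, and this is not presented as the project's actual harvesting method.

```python
from urllib.parse import urlencode


def tile_bboxes(minx, miny, maxx, maxy, cols, rows):
    """Yield (minx, miny, maxx, maxy) boxes that tile the extent, row by row."""
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    for r in range(rows):
        for c in range(cols):
            yield (
                minx + c * dx,
                miny + r * dy,
                minx + (c + 1) * dx,
                miny + (r + 1) * dy,
            )


def getmap_url(endpoint, layer, bbox, width=1024, height=1024,
               srs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.1.1 GetMap URL for one atlas page."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(f"{v:.6f}" for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer; the extent is an arbitrary example box.
urls = [
    getmap_url("https://example.org/wms", "orthos_2005", box)
    for box in tile_bboxes(-84.0, 34.0, -76.0, 36.0, cols=2, rows=2)
]
```

Each URL returns a rendered image, so an atlas built this way captures the cartographic representation as served on a given day, not the underlying data.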

  12. Project Repository Approach
      • Interest in how geospatial content interacts with widely available digital repository software
      • Focus on salient, domain-specific issues
      • Challenge: remain repository agnostic
      • Avoid “imprinting” on repository software environment
      • Preservation package should not be the same as the ingest object of the first environment
      • Tension between exploiting repository software features vs. becoming software dependent

  13. Organizational/Cultural Approaches
      • Provide feedback to producer organizations and inform state geospatial infrastructure
      • Take the data as is, in the manner in which it can be obtained
      • “Wrangle” and archive data
      Note the ‘Project’ in ‘North Carolina Geospatial Data Archiving Project’: the process, the learning experience, and the engagement with industry and infrastructure are more important than the archive.

  14. Points of Engagement with Spatial Data Infrastructure
      • Framework data communities
        • Snapshot frequency, naming schemes, classification, GML application schemas, format strategies
      • Metadata standards and outreach
        • Persistent identifiers, versioning, feedback on metadata quality
      • Content replication/transfer
        • For data improvement projects, disaster preparedness, aggregation by regional service providers, … and archives
      • Where does archiving and preservation fit in?
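
A snapshot naming scheme of the kind raised above could combine producer, layer, and capture date so that names sort chronologically. This is a minimal sketch of one possible convention. The field order, the separators, and the function name are assumptions, not a scheme the project or NC OneMap is known to use.

```python
import re
from datetime import date


def snapshot_id(producer: str, layer: str, captured: date) -> str:
    """Build a sortable name such as 'wake-county_parcels_2006-03-01'."""

    def slug(text: str) -> str:
        # Lowercase, and collapse anything non-alphanumeric into a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(producer)}_{slug(layer)}_{captured.isoformat()}"


# Hypothetical quarterly snapshots of one layer from one producer.
names = [
    snapshot_id("Wake County", "Parcels", date(2006, 6, 1)),
    snapshot_id("Wake County", "Parcels", date(2006, 3, 1)),
]
latest = sorted(names)[-1]
```

Because the date is in ISO 8601 form, plain string sorting orders the snapshots of a layer by time without parsing the names.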

  15. Points of Engagement with the Open Geospatial Consortium (OGC)
      • Geography Markup Language (GML) for archiving (PDF/A version of GML?)
      • GeoDRM
        • Adding preservation use cases
      • Content Packaging
        • Will there be an industry solution?
      • Web Map Context Documents
        • Can we save data state as well as application state?
      • Content Replication
        • Is this a layer in the overall architecture?
      • Persistent Identifiers

  16. Points of Engagement with Industry
      • Software vendors
        • Better support for temporal data management
        • Tools for retrospective data conversion
      • Web mashup and open source communities
        • WMS caching schemes
        • Standard tiling schemes with temporal component?
      • Data vendors
        • Cultivate market for older data (scaled pricing?)
        • Tech transfer on archiving practices?
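
The question about tiling schemes with a temporal component can be illustrated by adding a snapshot date to an otherwise ordinary tile cache key. The sketch below is an assumption-laden illustration: the key layout, the function names, and the sample dates are hypothetical and do not describe any existing caching standard.

```python
from bisect import bisect_right


def tile_key(layer: str, snapshot: str, z: int, x: int, y: int) -> str:
    """Cache key that places the snapshot date between layer and tile address."""
    return f"{layer}/{snapshot}/{z}/{x}/{y}.png"


def snapshot_for(requested: str, available: list):
    """Return the latest snapshot dated on or before `requested`.

    `available` must be a sorted list of ISO 8601 date strings.
    Returns None when the request predates every snapshot.
    """
    i = bisect_right(available, requested)
    return available[i - 1] if i else None


# Hypothetical snapshot dates for one cached layer.
available = ["1999-01-01", "2003-06-15", "2005-09-30"]
snap = snapshot_for("2004-01-01", available)
key = tile_key("orthos", snap, 12, 1130, 1620)
```

With the date in the key, older tiles stay addressable after the layer is refreshed instead of being overwritten.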

  17. Project Status
      Cultivating a market for older data.

  18. Project Status
      Cultivating tools for retrospective conversion.

  19. Conclusion
      • Geospatial data is complex, introducing manifold challenges to ingest processes and repository development
      • Vector data and spatial databases are especially complex
      • Geospatial data exists in very large quantities and is subject to frequent update
      • Need to engage industry in the solution
      • Need to engage point of production

  20. Questions?
      Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
      Email: Steven_Morris@ncsu.edu
      Web site: http://www.lib.ncsu.edu/ncgdap/
