Indiana University

Indiana University. Data Publishing Service. Stacy Kowalczyk. April 9, 2010. Questions. Which phases of the data life cycle are managed by your repository? How do data management requirements differ across the data life cycle? What systems do you use to support the data life cycle?

Presentation Transcript


  1. Indiana University Data Publishing Service Stacy Kowalczyk April 9, 2010

  2. Questions • Which phases of the data life cycle are managed by your repository? • How do data management requirements differ across the data life cycle? • What systems do you use to support the data life cycle? • Can you generalize the mechanisms used to migrate data between different phases of the data life cycle?

  3. Data Publishing Service • A new service of the IUScholarWorks institutional repository and the Scholarly Data Services • Providing data management support and data access • Data will have a persistent URL so it can be linked to publications • The service will combine our DSpace repository with IU’s Scholarly Data system (formerly known as MDSS), a system that researchers already use • Allows discovery over the Web • Preservation: bit level

  4. Current Data Lifecycle Model • Implementation: Scholarly Data Service, IU ScholarWorks • Data creation: research design; data management planning; data collection (surveying, experimentation, measuring, etc.); data checking and cleaning • ↓ Data analysis: analysis; derived data creation; creation of data documentation • ↓ End of research: research outputs; preparing data for preservation • ↓ Preservation of data: storage of data; migration to suitable format/medium; metadata creation • ↓ Distribution/publication of data • ↓ Re-use of data: by same researcher; by other researchers • Source: http://www.data-archive.ac.uk/sharing/lifecycle.asp

  5. Scholarly Data Service • Massive Data Storage System (MDSS) • Current system for research data storage • Installed in 1998 • Based on IBM-developed High Performance Storage System (HPSS) software • Offers over 2.8 petabytes of disk- and tape-based storage, distributed between the Indianapolis and Bloomington campuses

  6. [Diagram: storage system distributed between IUB and IUPUI] • IUB Subsystem: Bloomington Users; IUB Campus Network; HPSS Movers; Disk Arrays; Tape Library; SAN; Research Network • IUPUI Subsystem: Indianapolis Users; IUPUI Campus Network; HPSS Movers; Disk Arrays; Tape Library; SAN; Research Network • Also shown: HPSS Core Servers; TCP/IP Wide Area Network

  7. Data Publishing in IUScholarWorks • Discovery and access of datasets and related publications through the IUScholarWorks Repository service • DSpace records that are searchable, indexed, harvested, and available at stable URLs • DSpace records that contain DSpace bitstreams for small datasets • DSpace records that link via stable URLs to large datasets in IU MDSS
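
The split described on this slide, small datasets held as bitstreams and large datasets linked in MDSS, can be pictured with a short Python sketch. Everything named here is an assumption made for illustration: the 2 GiB cutoff, the `Dataset` type, the `plan_storage` function, and the `mdss.example.edu` base URL are hypothetical and do not come from the slides, DSpace, or the IU systems.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical cutoff; the slides do not say what counts as a "small" dataset.
SMALL_DATASET_LIMIT_BYTES = 2 * 1024**3

# Hypothetical base URL standing in for the MDSS web server.
MDSS_BASE_URL = "https://mdss.example.edu/datasets"


@dataclass
class Dataset:
    name: str
    size_bytes: int


def plan_storage(dataset: Dataset) -> dict:
    """Decide how a dataset is attached to its repository record.

    Small datasets are stored in the record itself (a bitstream);
    large ones stay in mass storage and the record carries a stable URL.
    """
    if dataset.size_bytes <= SMALL_DATASET_LIMIT_BYTES:
        return {"mode": "bitstream", "name": dataset.name}
    return {"mode": "link", "url": f"{MDSS_BASE_URL}/{quote(dataset.name)}"}
```

Calling `plan_storage(Dataset("survey.csv", 1024))` under these assumptions yields a bitstream plan, while a multi-gigabyte archive above the cutoff yields a link plan carrying the URL to record in the item.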

  8. IUScholarWorks Data: Linking to MDSS and delivery via HTTP • [Diagram labels] Item record with URLs of datasets in MDSS; HTTP Server; MDSS web server; hpssfs filesystem; IU MDSS

  9. Data Publishing in IUScholarWorks • Facilitating the submission process for both the researcher and collection manager • We facilitate the process for submitters via the DSpace Configurable Submission system • We facilitate the data collection manager’s process via steps in the DSpace workflow system

  10. IUScholarWorks Data: Item submission user interface (Phase 2, automated workflow) • DSpace Configurable Submission System: Instructions and preparation; Describe item metadata form(s); MDSS and dataset info/form; File upload step; Review step; Finalize/Accept License • Non-interactive processing steps: Update metadata; Initiate MDSS actions (move datasets, etc.); Query MDSS technical metadata (checksum, etc.) • IU MDSS
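
The checksum query among the non-interactive steps suggests a fixity comparison between a file and the value the storage system reports. A minimal sketch, assuming SHA-256 and a locally readable copy of the file; the slides do not name the algorithm MDSS uses, and `file_checksum` and `verify_fixity` are hypothetical helpers, not DSpace or HPSS APIs.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large datasets are never loaded whole."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, reported_checksum: str, algorithm: str = "sha256") -> bool:
    """Compare a locally computed checksum with the value reported by storage.

    The reported value is normalized (whitespace stripped, lowercased)
    before comparison, since hex digests are case-insensitive.
    """
    return file_checksum(path, algorithm) == reported_checksum.strip().lower()
```

Reading in chunks keeps memory use constant regardless of file size, which matters for the large datasets that this workflow leaves in mass storage.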

  11. Planning for a More Curated Life Cycle Model http://libraries.mit.edu/guides/subjects/data-management/cycle.html

  12. Active and Social Curation • Engage researchers during projects, not at the end • Use immediate benefits to drive automatic capture and “volunteering” of metadata • Reduce costs by re-engineering curation processes to leverage this rich metadata and volunteered effort

  13. Data Curation Lifecycle Elements • [Diagram labels] Active Curation; OAIS Repository Federation; Curation Boundary; Automated Curation Workflow/Rule Engine; Metadata Management; Data Acquisition, Analysis and Simulation; Scholarly Communication; Operates on Metadata, Content Objects and Trigger Events; DDI3, METS, PREMIS, MODS, DC, SensorML, OGC, …; Ingest scripts: fixity, integrity, authentication, transformation; Ingest, AIPs; Trusted Digital Repository Federation (OAIS compliant); Appraisal and Selection; Active Data Systems; Compound Objects - OAI-ORE; Preservation Actions; Dissemination Packages; Wide-Area File System; Search, Browse, Annotation, Visualization Tools; Migration and Emulation Tools; Use, Reuse, Repurposing Tools; Access Mechanisms and E-Scholarship Services; Contributor; User