Data Curation: Challenges and Opportunities for Research Libraries

Presentation Transcript


  1. Data Curation: Challenges and Opportunities for Research Libraries. Brian E. C. Schottlaender, The Audrey Geisel University Librarian. OSU “Library Futures” Seminar

  2. Should I Talk About … • … declining: • budgets? • numbers of staff? • transactions? • … closing branch libraries? • … “rationalizing” collections? • … repurposing space? • … bottom-up strategic planning? • … moving to a service program–based organizational structure?

  3. No, I Think I’ll Talk About … DATA CURATION

  4. Overview • The Scholarly Record • Stewardship • Data Curation • Why do data need to be curated? • Why should libraries curate data? • What should research libraries do?

  5. The Scholarly Record? The scholarly record is … “… that which has already been written in all disciplines ... that stable body of graphic information, upon which each discipline bases its discussions, and against which each discipline measures its progress.” Ross Atkinson. “Text Mutability and Collection Administration.” Library Acquisitions: Practice & Theory, Vol. 14 (1990)

  6. What Does the Scholarly Record Include? • E-only journals • Reviews • Preprints and working papers • Encyclopedias, dictionaries, and annotated content Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

  7. “The Scholarly Record” [diagram labels: Scholarly Publishing (e.g., journal articles); Libraries; Trusted Third Parties (e.g., JSTOR, Portico); Stable]

  8. What Does the Scholarly Record Include? • E-only journals • Reviews • Preprints and working papers • Encyclopedias, dictionaries, and annotated content • Data resources Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

  9. “The Scholarly Record” [diagram labels: Scholarly Publishing (e.g., journal articles); Libraries; Trusted Third Parties (e.g., JSTOR, Portico); Stable]

  10. What Does the Scholarly Record Include? • E-only journals • Reviews • Preprints and working papers • Encyclopedias, dictionaries, and annotated content • Data resources • Blogs • Discussion forums • Professional and academic hubs Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

  11. “The Scholarly Record” Infrastructures largely self-contained. [diagram labels: Scholarly Publishing (e.g., journal articles); Scholarly Raw Material (e.g., archives, data); Scholarly Inquiry/Discourse (e.g., blogs, wikis, open notebooks); INPUTS, OPERATORS, OUTPUTS; Libraries; Archives; Data Centers [Some in Libraries; Some Not]; ?????; Trusted Third Parties (e.g., JSTOR, Portico); Very unstable; Emergent; Less Stable; Stable]

  12. Stewardship 1 “Stewardship is a core value that includes notions of mission, responsibility, integrity, trust, accountability, service, preservation and sustainability for future use.” Sharon E. Farb. “Libraries, Licensing, and the Challenge of Stewardship.” First Monday, Vol. 11, No. 7 (3 July 2006) “As a society and as educational institutions, we have a collective responsibility to preserve and make available, along a continuum of a life cycle, our digital heritage.” Jeffrey L. Horrell. “Converting and Preserving the Scholarly Record: An Overview.” LRTS, Vol. 52, No. 1 (January 2008)

  13. Stewardship 2 • “There is a need for a close linking between digital data archives, scholarly publications, and associated communication. The potential for an expanded role for research libraries in the area of digital data stewardship affords opportunities to address these important linkages.” • “Stakeholder groups have different expertise, outlooks, assumptions, and motivations … Collaboration models to share expertise and resources will be critical.” To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering: A Report to the National Science Foundation from the ARL Workshop on New Collaborative Relationships (2006)

  14. Stewardship 3 • “Historically, universities have played a leadership role in the advancement of knowledge and shouldered substantial responsibility for the long-term preservation of knowledge through their university libraries. An expanded role for some research and academic libraries and universities, along with other partners, in digital data stewardship is a topic for critical debate and affirmation.” • “The scale of the challenge regarding the stewardship of digital data requires that responsibilities be distributed across multiple entities and partnerships that engage institutions, disciplines, and interdisciplinary domains.” To Stand the Test of Time … (2006)

  15. Data Curation: What Is It? “The activity of managing and promoting the use of data from its point of creation, to ensure it is fit for contemporary purpose, and available for discovery and reuse. For dynamic datasets this may mean continuous enrichment or updating to keep it fit for purpose. Higher levels of curation will also involve maintaining links with annotation and other published materials.” Philip Lord, Alison Macdonald, Liz Lyon, and David Giaretta. “From Data Deluge to Data Curation.” eScience All Hands Meeting 2004 (2004)

  16. Data Curation: What’s It Include? • Design • Creation or Collection • Processing • Analysis • Appraisal • Selection • Description • Discovery • Dissemination • Repurposing • Storage • Preservation • Etc.

  17. Curation Model Panos Constantopoulos, et al. “DCC&U: An Extended Digital Curation Lifecycle Model.” The International Journal of Digital Curation, Issue 1, Vol. 4 (2009)

  18. Actors … “As we move from small to large scale data sharing, where data are managed and maintained for broad access, we also are seeing an increase in the number and type of intermediaries. Intermediaries, in the form of organizations and the people who work for them, prepare data for reuse by eliciting, organizing, storing, packaging and/or preserving data, and by performing various roles in dissemination and facilitation …” Ixchel M. Faniel and Ann Zimmerman. “Beyond the Data Deluge: A Research Agenda for Large-Scale Data Sharing and Reuse.” The International Journal of Digital Curation, Issue 1, Vol. 6 (2011)

  19. … and Stakeholders • Disciplinary experts • Functional experts • Developers • Curators • Preservationists • Users • Archives • Data Centers • Libraries • Institutions • Professional Societies • Publishers • Governments

  20. The Curation Ecosystem 1 [diagram labels: Data Providers; Policy Makers; Funders; Service Providers; Systems Providers; Data Consumers]

  21. The Curation Ecosystem 2 “… the activities of curation are highly interconnected within a system of systems, including institutional, national, scientific, cultural, and social practices as well as economic and technological systems. Data curation is a nascent set of technologies and practices emerging in the context of this complex and rapidly evolving socio[economic]-technical ecosystem.” Anna Gold. “Data Curation and Libraries: Short-Term Developments, Long-Term Prospects.” http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1027&context=lib_dean

  22. Why do data need to be curated? • “The more effectively that data can be manipulated, mined, managed, analyzed and served to communities, the better the conduct of science can be supported.” • “The more we can eliminate boundaries in this exponentially growing sea of data, the better data can be shared enabling multidisciplinary and collaborative research …” • “The more effectively students and faculty gain the data intensive knowledge and skills, the larger the impact will be on science and society.” NSF-OCI Task Force on Data and Visualization. Report Draft Final (March 7, 2011)

  23. Why do data need to be curated? • Because data reuse requires it. • Why do data need to be reused? • Because trans-domain research requires it. • Why is trans-domain research important? • Because solving grand challenges requires it. • Why is solving grand challenges important? • Because they affect all of us.

  24. Why do data need to be curated? 3 Because the government says so.

  25. Why Should Research Libraries Curate Data? • Because we can: “Research libraries, archives, and other stewardship institutions have the capacity to aggregate and hold data, manage metadata, deal with rights management and access, and help users.” • Because we must: “… uncurated data are as good as lost, even if the bits are stored forever, because they cannot be interpreted correctly.” • Because, left to their own devices, scientists won’t: “… many if not most scientists focus on the shortest path to a particular scientific result rather than the best long-term solution for data reuse or data-service …” NSF-OCI Task Force on Data and Visualization. Report Draft Final (March 7, 2011)

  26. What Should Research Libraries Do? • Stop waiting and start proactive engagement locally. • Stake a claim in the production cycle. • Start retraining and repurposing staff. • Be a doer, not a broker, wherever possible. • Consider digital curation collaborations. • Actualize collaborative engagement. Tyler Walters and Katherine Skinner. New Roles for New Times: Digital Curation for Preservation. Association of Research Libraries (2011)

  27. What Have I Done? • Reached out to the San Diego Supercomputer Center (on whose Executive Committee I sit) to co-create the campus’ Research Cyberinfrastructure Initiative (RCI), funded by the Chancellor. • Leveraged the NDSA-funded Chronopolis Federated Preservation Environment to create a Research Data Curation Services Program. • Hired a Director, and reallocated portions of two domain specialists and a metadata analyst to her. • Created Sample Data Management Plans for various NSF Directorates. • Launched five curation pilots in the Humanities and the Sciences. • Joined DPN and am preparing to field-test Chronopolis as a DPN data triad.

  28. And So …

  29. And So …

  30. And So …

  31. An Example

  32. And So …

  33. Conclusion • Digital scholarly output cannot be de-coupled from the raw material and inquiry operations that generate that output, at least not as easily as analog scholarly output can be. • It can’t be, it needn’t be, and it shouldn’t be. • Its stewardship calls for a more expansive view of what constitutes the scholarly record, a view that encompasses more and different inputs, outputs, and stakeholders; and a more distributed and interoperant organizational and technical infrastructure.

  34. QUESTIONS?
