ARL Survey on E-science and Data Support: Initial Findings Wendy Pradt Lougee E-Science Working Group October 14, 2009
ARL E-Science • 2006 Task Force inquiry about library activity • 2007 Task Force Report recommendations • Education, awareness • Workforce development • Relationships with relevant organizations • Infrastructure development (CNI) • Policy development, new publishing genre • 2008 Forum, e-research presentations • 2009 Working Group survey of membership
E-Science Survey • Status of institutional planning • Campus structures • Infrastructure development • Status of library planning & engagement • Involvement in campus planning • Library services, infrastructure, capacity (staff) • Pressure points, areas of interest
Institutional structure/organization • 52 respondents • Institutional infrastructure in place or planned 75% • Most institutions have hybrid of institution-wide and unit planning & infrastructure (59%) • Only 10% are pursuing institution-wide approach • Institution-wide approaches: include IT, Library, faculty/researchers, Office of Research • Unit-based focus: grounded in science/medicine units “Organizational structure is too strong a word. There is periodic interaction and base-touching between science departments, ITS and sometimes the Libraries that considers infrastructure needs…”
Institutional infrastructure Of those institutions with focused infrastructure (N=41): • 50% report designated unit to provide data curation support • 40% have conducted assessment of data resources and needs • 53% use combo of central/distributed data centers; 44% only distributed centers • Lack of awareness about digital lab notebook support
Decentralized themes “Most science and engineering departments, labs and centers…have some infrastructure to support high performance computing, or provide software tools to process/visualize research data. But none…are clearly documented on a single webpage or other place where researchers can easily locate.” “Currently there is no one central group or effort that focuses on overall planning, but a collection of overlapping initiatives and activities – This is largely because the university is highly decentralized and others in the institution do not think in terms of e-science but in terms of research supported by cyberinfrastructure.”
Centralized strategies “A cyberinfrastructure task force is in the planning stages, and it will report to the President of the University.” “Two groups exist: Cyberinfrastructure Council and Knowledge Management Committee. The Council is most involved in the high performance computing, data centers, other computing and network issues. The Knowledge Management Committee is more oriented to the content of escience and data curation…” “An eScience Institute was formed in January 2008 by the Vice President for Technology and the Provost, in partnership with key Deans and a group of highly distinguished research faculty…”
External funding • 16 institutions engaged in DataNet proposal development, 15 involved the library • 13 libraries involved in other e-science grants (NIH, NSF, Mellon and Gates Foundations) “Investigating Data Curation Profiles Across Multiple Disciplines (Explores who is willing to share what data with whom)…Awarded by IMLS to Purdue, Libraries PI.” “NSF Office of Cyberinfrastructure award to [Cornell’s] Mann Library: III-CXT: Promoting the curation of research data through library-laboratory collaboration.”
Library eScience Support • Of libraries with institutional activity, 72% of respondents reported library involvement. • Organization: group or group/department/ individual lead • 86% libraries offering service collaborate with other units (e.g., IT, colleges/departments, centers, Ofc. Research) E-Research Working Group, Data Curation Working Group, e-Data Archiving Group, Science Data Services Team, Data Executive Group, E-Research Team
Data Assessment/audits • Washington http://www.washington.edu/lst/research_development/research_projects/LSTsurvey • U Oregon http://libweb.uoregon.edu/faculty/SciDataAudit.html • Purdue/UIUC http://www.datacurationprofiles.org • Wisconsin http://digital.library.wisc.edu/1793/34859 and http://digital.library.wisc.edu/1793/21443
Library service portfolio • Finding, using available infrastructure • 8 libraries maintain web site on services • Finding relevant data, developing data management plans, rights management • 8 libraries offer training in data management • Metadata and archiving consultation/support • Most (86%) rely on discipline librarians, many (69%) also have data librarians
Library Technology Infrastructure • Institutional repositories • Domain-specific repositories • Virtual community support (e.g., VIVO) • Short-term storage, partner in campus storage solutions • Publishing infrastructure • GIS and social science data services, tools
Library Staff & Staff Development • 62% reassigning existing staff • 42% have hired, and 39% are planning to hire, e-science expertise • 62 positions detailed: • Two named chairs • 70% had library/info science degree • 32% had disciplinary degree (10% PhD only) • Staff development: • Conference support, in-house presentations, course support • 7 institutions collaborating with iSchools
Wisconsin • Research Data Management Study Group (2008) • Proposed pilot: jointly funded and managed with research partners DoIT and Library • Easily accessed, maintained storage and backup for data; projects, address consultation needs • Libraries/DoIT digital curation service (2009) • Assess institutional models • User-based data management applications integrated with storage/retrieval system • Develop digital curation processes & procedures, data management assistance
University of Washington • Institution level • Study of campus needs: no clear consensus on technology priorities. Areas of convergence: data management, shared expertise, computing power & network access, data collection & analysis, communication & collaboration • eScience Institute formed 2008, interdisciplinary and institution-wide coordinating body; Library interfaces on planning and referral on data curation • Library: • Informal, evolving structure involving metadata unit, research services unit, and health sciences libraries. • Planned Libraries integrated data services unit
Purdue • Institutional level • No one central group, but collection of overlapping initiatives • Planned task force on data management (VP Research, IT, Libraries, Provost office, colleges/schools) • Libraries Distributed Data Curation Center (D2C2) pursues curation issues of organizing, facilitating access to, archiving for and preserving research data and data sets in complex environments. Brings together project teams to apply for grants and promote collaboration on advancing solutions for data management.
Cornell University • DISCOVER Research Service Group • Partnership: domain scientists, Center for Advanced Computing, Library, Fedora Commons; Sponsored by VP Research • Facilitates collaboration, fosters cross-disciplinary analysis of data using data mining and visualization tools, supports development of cyberinfrastructure • Library Data Working Group white paper: http://hdl.handle.net/1813/10903 • Library maintains Research Data Management & Publishing Support site • Library’s DataStar project: supports collaboration, short term storage, & data sharing during research
Johns Hopkins • Institute for Data Intensive Engineering and Science (IDIES), most visible umbrella organization for eScience http://idies.jhu.edu • Coalesces data-intensive science efforts • Brings together scholars from School of Arts/Sciences, Engineering, Sheridan Libraries to form interdisciplinary teams. • Facilitates development of tools and methods • DataNet award: Data Conservancy Project addressing creation, implementation, and sustained management of an integrated and comprehensive data curation strategy across initial disciplinary base of astronomy, biodiversity, earth sciences, and social sciences.
UCSD • Blueprint for the Digital University (2009) recommendations: colocation facilities, centralized disk storage, digital curation & data services, CI network, “condo clusters,” expertise (labor pool). • Move toward centralized data service (new facility). Service sponsored by UC Ofc of President, available to UC researchers • Collaboration between SD Supercomputer Center and Libraries. • Partner in Chronopolis, national center for the management, long-term preservation, and promulgation of national digital assets http://chronopolis.sdsc.edu/
Pressure Points for ARL Libraries • Organizational: • Low recognition of importance of e-science support • Turf issues • Complexity of structures • Resources: • Staff with relevant expertise • Technology infrastructure • Budget constraints
Information Exchange Interests • Initiatives at member institutions • Share organizational models, position descriptions • Assessments of researcher needs, environmental scans • Support for digital humanities (models, programs) • Data curation technologies, data preservation
Next steps • Leverage the survey to increase info exchange: • Occasional paper including cases • Repeat survey in 2 years • Briefing paper by and for VPs for Research • “Institute” for e-science teams (discipline/data librarians and professionals) • Build on Reinventing Science Librarianship Forum