Please take the Workshop Survey

Learn fundamental data management practices for improved usability, efficiency, and understandability of your research data. Workshop sponsored by ORNL Distributed Active Archive Center and CC&E Joint Science Workshop.

jnoble

Presentation Transcript


  1. Please take the Workshop Survey https://www.surveymonkey.com/s/update 

  2. Data Management Practices for Early Career Scientists: Closing
     Robert Cook
     ORNL Distributed Active Archive Center
     Environmental Sciences Division
     Oak Ridge National Laboratory
     Oak Ridge, TN
     cookrb@ornl.gov
     CC&E Joint Science Workshop
     College Park, MD
     April 19, 2015

  3. Plan for archiving data: “Begin with the end in mind”
     • Identified the Data Center
     • Collaborated with data center during project
     • Communicated:
       • Volume and Number of Files
       • Special needs
       • Delivery dates

  4. Followed Fundamental Data Practices
     • Define the contents of your data files
     • Define the variables
     • Use consistent data organization
     • Use stable file formats
     • Assign descriptive file names
     • Preserve processing information
     • Perform basic quality assurance
     • Provide documentation
     • Protect your data
     • Preserve your data
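
As one illustration of the "Perform basic quality assurance" practice, the sketch below counts missing values and flags readings outside a plausible range. The function name `basic_qa`, the fill value of -9999, and the sample range are assumptions made for this example; they are not taken from the workshop materials.

```python
def basic_qa(values, valid_min, valid_max, fill_value=-9999):
    """Count fill values and list readings outside [valid_min, valid_max].

    fill_value is an assumed missing-data marker; use whatever your
    data set documents as its fill value.
    """
    missing = sum(1 for v in values if v == fill_value)
    out_of_range = [
        v for v in values
        if v != fill_value and not (valid_min <= v <= valid_max)
    ]
    return {"n": len(values), "missing": missing, "out_of_range": out_of_range}


# Hypothetical canopy heights in metres; 140.0 is a likely entry error.
report = basic_qa([12.5, -9999, 140.0, 8.2], valid_min=0.0, valid_max=100.0)
print(report)
```

A check like this does not replace review by someone who knows the measurements, but it can catch typing errors and unit mix-ups before the files are submitted.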

  5. What to submit to the archive?
     • Well-structured data files, with variables, units, and fill values well-defined
     • Document that describes the data set
     • Additional information
       • Article written with the data set
       • Files that describe project, protocols, or field sites (photographs)
       • Material from Project Web site or Wiki
     • Basic description of the data (15 questions)
       • http://daac.ornl.gov/PI/questions.shtml
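
A minimal sketch of what "well-structured data files, with variables, units, and fill values well-defined" could look like in practice. The variable table, the fill value of -9999, and the helper `write_data_csv` are all hypothetical choices for illustration; the archive may ask for a different layout.

```python
import csv
import io

FILL_VALUE = -9999  # assumed missing-data marker; document whichever value you use

# Hypothetical variable definitions: name -> (units, description).
VARIABLES = {
    "site_id": ("none", "Field site identifier"),
    "date": ("YYYY-MM-DD", "Sampling date"),
    "canopy_height": ("m", "Mean canopy height"),
}


def write_data_csv(rows, out):
    """Write dict rows as CSV in a fixed column order, replacing None with the fill value."""
    writer = csv.DictWriter(out, fieldnames=list(VARIABLES), lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {name: FILL_VALUE if row.get(name) is None else row[name] for name in VARIABLES}
        )


rows = [
    {"site_id": "A1", "date": "2014-06-01", "canopy_height": 12.5},
    {"site_id": "A2", "date": "2014-06-01", "canopy_height": None},  # missing reading
]
buf = io.StringIO()
write_data_csv(rows, buf)
print(buf.getvalue())
```

Keeping the variable definitions in one table means the same names and units can be copied into the accompanying documentation.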

  6. Issues with data sets received
     • Descriptive information about data files and content is incomplete
       • Data description and collection method
       • Field sites
       • Quality / uncertainty of data
     • Inconsistencies with publication
     • Files uploaded are not identified / described
     • Variable names are not defined or vague
       • “Height” unclear, change to “canopy_height”
       • Perhaps append the method/sensor for added clarity
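
The variable-naming advice on this slide can be sketched as a small check, shown below. The list of vague names and both helper functions are hypothetical; only the “Height” to “canopy_height” example and the idea of appending the method or sensor come from the slide.

```python
# Hypothetical set of names considered too vague on their own.
VAGUE_NAMES = {"height", "temp", "value", "data", "depth"}


def find_vague_names(columns):
    """Return column names that are blank or match a known vague name."""
    return [c for c in columns if not c.strip() or c.strip().lower() in VAGUE_NAMES]


def descriptive_name(quantity, method=None):
    """Build a snake_case variable name, optionally appending the method or sensor."""
    parts = [quantity] + ([method] if method else [])
    return "_".join(p.strip().lower().replace(" ", "_") for p in parts)


columns = ["site_id", "Height", "date"]
print(find_vague_names(columns))
print(descriptive_name("canopy height"))
print(descriptive_name("canopy height", "lidar"))
```

Here "lidar" is only a stand-in for whichever method or sensor produced the measurement.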

  7. Information about Data (15 questions)
     Information About Your Data Set
     • Have you looked at our Best Data Management Practices?
     • Who produced this data set?
     • What agency and program funded the project? What awards funded this project? (comma separate multiple awards)
     Data Set Description
     • Provide a title for your data set. (maximum 84 characters) What type of data does your data set contain? What does the data set describe? (2-3 sentences)
     • What parameters did you measure, derive, or generate? (comma separated, limit to ten)
     • Have you analyzed the uncertainty in your data? Briefly describe your uncertainty analysis. (2-3 sentences) Will the uncertainty estimates be included with your data set?
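
Two of these questions state explicit limits: a title of at most 84 characters and a comma-separated list of at most ten parameters. A minimal sketch of checking draft answers against those limits before submitting follows. The function `check_description` is hypothetical and is not part of any DAAC tool.

```python
MAX_TITLE_CHARS = 84  # from the questionnaire: "maximum 84 characters"
MAX_PARAMETERS = 10   # from the questionnaire: "limit to ten"


def check_description(title, parameters):
    """Return a list of problems; an empty list means both answers fit the stated limits."""
    problems = []
    if len(title) > MAX_TITLE_CHARS:
        problems.append(f"title has {len(title)} characters (max {MAX_TITLE_CHARS})")
    names = [p.strip() for p in parameters.split(",") if p.strip()]
    if len(names) > MAX_PARAMETERS:
        problems.append(f"{len(names)} parameters listed (max {MAX_PARAMETERS})")
    return problems


# Hypothetical draft answers.
print(check_description("Canopy Height and Biomass, Example Forest Plots, 2013-2014",
                        "canopy_height, biomass, stem_density"))
```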

  8. Information about Data (cont)
     Temporal and Spatial Characteristics
     • What date range does the data cover? (YYYY-MM-DD) What is a representative sampling frequency or temporal resolution for your data?
     • Where were the data collected/generated?
     • Which of the following best describes the spatial nature of your data? (single point, multiple points, transect, grid, polygon, n/a)
     • What is a representative spatial resolution for these data?
     • Provide a bounding box around your data.
     Data Preparation and Delivery
     • What are the formats of your data files? How many data files does your product contain? What is the total disk volume of your data set? (MB)
     • Is this data set final, unrestricted, and available for release? What are the reasons to restrict access to the data set?
     • Has this data set been described and used in a published paper? If so, provide a DOI or upload a digital copy of the manuscript with the data set.
     • Are the data and documentation posted on a public server? If so, provide the URL.
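
The date range and bounding box asked for here can be derived directly from the sampling records. The sketch below assumes coordinates in decimal degrees that do not cross the antimeridian; the function names and the sample points are hypothetical.

```python
from datetime import date


def bounding_box(points):
    """Return (west, south, east, north) for (lat, lon) pairs in decimal degrees.

    Assumes the points do not straddle the 180-degree meridian.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lons), min(lats), max(lons), max(lats))


def date_range(dates):
    """Return the earliest and latest date as YYYY-MM-DD strings."""
    return (min(dates).isoformat(), max(dates).isoformat())


# Hypothetical sampling locations and dates.
points = [(35.93, -84.31), (36.01, -84.25), (35.88, -84.40)]
dates = [date(2014, 6, 1), date(2013, 5, 15), date(2014, 9, 30)]
print(bounding_box(points))
print(date_range(dates))
```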

  9. Data Center: Stewardship and Archive Functions
     • Ingest
       • perform QA checks
       • compile project-provided metadata
       • generate additional metadata
       • convert to archival file formats
     • Metadata / Documentation
       • prepare final metadata record and documentation
     • Archive / Release
       • generate citation and DOI (digital object identifier)
     • Exploration and Distribution
       • provide tools to explore, access, and extract data
     • Post-Project Data Support
       • provide long-term secure archiving
       • serve as a buffer between end users and PIs
       • provide usage statistics
     • Stewardship
       • security, disaster recovery
       • migration to new computer systems

  10. Workshop Goal
      Provide fundamental data management practices that investigators should perform during the course of data collection.
      • To improve the usability of data sets for:
        • You
        • Collaborators
        • People outside your project
      • By following the practices taught in this workshop, your data will be
        • less prone to error,
        • more efficiently structured for analysis, and
        • more readily understandable for any future research.

  11. Please take the Workshop Survey • https://www.surveymonkey.com/r/72MJWGF 

  12. Thank you! Workshop Sponsors
