Data Stewardship and Data Provenance Activities
May 10-11, 2011
Steve Kempler and Greg Leptoukh
Data Stewardship
NASA Earth science data systems are authoritative sources of science data and are recognized as national assets. They are systems of record. Data stewardship therefore plays a vital role.
Science data stewardship is the protection of science data records, including their integrity and long-term utility, together with other actions that maximize the return on investment.
Science data stewardship includes areas such as:
From Berrick, dsds.nasa.gov/day1/D1_LessonsLearned_Berrick.ppt
Project Highlights: Rebuilding and Organizing 1960’s Era Datasets: Achievements
Project Highlights: Rebuilding and Organizing 1960’s Era Datasets: Lessons Learned
Where to find the knowledge about data?
It is scattered in scientific papers, the actual code, unwritten assumptions, folklore, etc.
Assess sensitivity of the results to variations in processing algorithms/steps…
Work closely with scientists to guarantee science quality
How to deliver provenance?
Deliver to users together with the data
Present to users in a convenient, easy-to-read fashion
Provide recommendations for different data usage (applications vs. climate studies)
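One way to read the delivery points above is as a provenance record that travels in the same package as the data and can be rendered as plain text for the user. The following is a minimal Python sketch under stated assumptions: the record layout (the keys `source`, `processing_steps`, `recommended_uses`), the function names, and the example values are all hypothetical and are not taken from any NASA data system.

```python
import json


def package_with_provenance(data, provenance):
    """Bundle data values and their provenance into one JSON document."""
    return json.dumps({"data": data, "provenance": provenance}, sort_keys=True)


def summarize_provenance(package_json):
    """Render the provenance of a package as short, readable text lines."""
    prov = json.loads(package_json)["provenance"]
    lines = [f"Source: {prov['source']}"]
    for i, step in enumerate(prov["processing_steps"], start=1):
        lines.append(f"Step {i}: {step}")
    # One recommendation per kind of usage (e.g. applications vs. climate studies).
    for use, advice in sorted(prov["recommended_uses"].items()):
        lines.append(f"Recommended for {use}: {advice}")
    return "\n".join(lines)


# Hypothetical example values, for illustration only.
example_package = package_with_provenance(
    data=[0.12, 0.15, 0.11],
    provenance={
        "source": "example instrument",
        "processing_steps": ["calibration", "cloud screening"],
        "recommended_uses": {
            "applications": "yes",
            "climate studies": "with caution",
        },
    },
)
example_summary = summarize_provenance(example_package)
```

Keeping data and provenance in one document means a user cannot obtain one without the other; a real system would more likely embed this information in the data file's own metadata.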
Are these quality flags compatible?
Capture and classify the details of measurement technique, data collection and processing
Identify and spell out similarities and differences
Assess importance of these differences
Deliver all this information in such a way that a user can easily see and understand the details
Present recommendations that guide data usage and help avoid apples-to-oranges comparisons and fusion
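The steps above (capture, identify similarities and differences, assess their importance, recommend) can be sketched as a comparison of two provenance records. This is a minimal Python illustration, not an actual implementation: the attribute names, the example sensor records, and the choice of which attributes count as critical are all assumptions made here. In practice, judging the importance of a difference needs input from scientists.

```python
def compare_provenance(record_a, record_b):
    """Split two provenance records into shared and differing attributes."""
    same, different = {}, {}
    for key in sorted(set(record_a) | set(record_b)):
        a, b = record_a.get(key), record_b.get(key)
        if a == b:
            same[key] = a
        else:
            different[key] = (a, b)
    return same, different


def usage_recommendation(different, critical_keys):
    """Flag a comparison as risky if any critical attribute differs."""
    risky = sorted(k for k in different if k in critical_keys)
    if risky:
        return "Caution: differs in " + ", ".join(risky)
    return "No critical differences found"


# Hypothetical records for two sensors, for illustration only.
sensor_a = {
    "technique": "passive radiometer",
    "overpass_time": "10:30",
    "cloud_flag_bits": 2,
}
sensor_b = {
    "technique": "passive radiometer",
    "overpass_time": "13:30",
    "cloud_flag_bits": 4,
}

same, different = compare_provenance(sensor_a, sensor_b)
# Assumption: only the overpass time is treated as critical in this example.
advice = usage_recommendation(different, critical_keys={"overpass_time"})
```

Separating the comparison from the recommendation keeps the factual differences available to the user even when the importance assessment changes.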