Citations Top to Bottom: The Fir Group – Breakout 3 (http://etherpad.ooici.org/geodata-fir3)


  1. Citations Top to Bottom
     http://etherpad.ooici.org/geodata-fir3
     The Fir Group – Breakout 3
     Kerstin Lehnert, John Graybeal, Dmitri Mozzherin, Vivian Hutchison, Giri Palanisamy, Eric Wolf, Ron Weaver, Jan Peters, Walt Snyder, Mary Marlino, Cheryl Morris, Benjamin D Branch, Steve Tessler, Lisa Raymond, Jeanine Aquino, Scott Jensen, Percy Donaghay, Dave Folker, Sze-Ling Celine Chan, Doug Walker

  2. Why we cite - Reasons for creating a citation for a dataset or data
     • Give credit to creator (Credit)
     • Allow humans to know about the data and machines to find the source (Use)
     • Know the provenance of the data (History)
     • Give rigor and reproducibility to analysis (Rigor)
     • Allow specificity and exactness (possibly down to single item)

  3. Why we cite: caveat - Citation and metadata records come from the data source (History)
     • Both must come from the data source
     • Citation: the source can give the most detailed and appropriate description, including the persistence of the data
     • Metadata: the source understands the data and can describe them well at any granularity. The source can also record what the user did to discover/download the data.

  4. Why we cite – Rigor/Reproduce (Rigor)
     • The scientific method requires that we can replicate results and reproduce an experiment to get the same data and/or result
     • Can the data source reliably reproduce and/or recover the same result based on the same search/request?
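One way to make the question on this slide concrete is to issue the same request twice and compare the bytes that come back. The sketch below is only an illustration: `static_source` and `variable_source` are hypothetical stand-ins for real data sources, and byte-for-byte equality is an assumed (and fairly strict) definition of "the same result" that the slides do not prescribe.

```python
import hashlib
from typing import Callable


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a result payload."""
    return hashlib.sha256(data).hexdigest()


def same_result(fetch: Callable[[str], bytes], query: str) -> bool:
    """Issue the same query twice and report whether the bytes match."""
    return digest(fetch(query)) == digest(fetch(query))


# Hypothetical sources, used only for illustration.
def static_source(query: str) -> bytes:
    # Always answers the same query with the same bytes.
    return f"rows for {query}".encode()


_calls = {"n": 0}


def variable_source(query: str) -> bytes:
    # Content changes between requests and the change is not tracked.
    _calls["n"] += 1
    return f"rows for {query}, revision {_calls['n']}".encode()


print(same_result(static_source, "sst 2010"))    # True
print(same_result(variable_source, "sst 2010"))  # False
```

A source that fails this check can still yield sound science; the check only shows that a citation to the source alone will not pin down what was analyzed.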

  5. Data sources are really variable! (Credit, history, rigor)
     • Persistence is a defining factor – Persistence means that the data, or some version of them, can be found in perpetuity (?)
       1. Persistent and static or tightly versioned – same query or request produces exactly the same result
       2. Persistent but variable – changes and versions are not tracked, but basic dataset/data type is available. Same query produces similar results, but possibly with differences
       3. Not persistent/streaming – data and data sources come and go and are valuable while there.
     • THESE ALL PRODUCE IMPORTANT RESULTS!

  6. Persistent and static or infrequently versioned data (rigor)
     • Citation is easy and rigorous (although we still have to define it)
     • Metadata stable
     • User gets the same result
     • Source maintains the whole record

  7. How about the other 99% of data sources? (rigor?!)
     • What is appropriate for these data sources? Community recognizes that this is an appropriate scientific activity that yields reliable and important results.
     • Move toward persistent and stable
     • Create a SNAPSHOT
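The three kinds of data source and the "create a SNAPSHOT" recommendation can be read as a small decision rule. The sketch below encodes that reading; the names `Persistence` and `needs_snapshot` are invented for illustration and are not part of any agreed standard.

```python
from enum import Enum


class Persistence(Enum):
    """The three kinds of data source described in the slides."""
    STATIC = "persistent and static or tightly versioned"
    VARIABLE = "persistent but variable"
    STREAMING = "not persistent / streaming"


def needs_snapshot(kind: Persistence) -> bool:
    """Assumed rule: only a static or tightly versioned source can be cited as-is."""
    return kind is not Persistence.STATIC


for kind in Persistence:
    action = "create a snapshot" if needs_snapshot(kind) else "cite the source directly"
    print(f"{kind.value}: {action}")
```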

  8. What is a SNAPSHOT
     • It is what was downloaded
     • The User of the data is the instigator
     • The Source(s) provide citation and metadata
     • It is not appropriate for persistent and static sources
     • It provides the rigor for analysis but not extraction
     • It must be made immutable because the source is not
     • It must be persistent somewhere (library, source, other)
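A minimal sketch of a snapshot record, assuming it holds the downloaded bytes, the citation and metadata supplied by the source, the user who instigated it, and a checksum. The field names and the choice of SHA-256 are assumptions made for illustration, and the example values are invented. Persisting the record (library, source, other) is out of scope here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Snapshot:
    content: bytes          # what was downloaded
    citation: str           # provided by the source
    metadata: str           # provided by the source
    requested_by: str       # the user who instigated the snapshot
    retrieved_at: datetime  # when the download happened (UTC)
    sha256: str             # checksum of the content at download time

    def is_intact(self) -> bool:
        """Check that the stored content still matches its recorded checksum."""
        return hashlib.sha256(self.content).hexdigest() == self.sha256


def take_snapshot(content: bytes, citation: str, metadata: str,
                  requested_by: str) -> Snapshot:
    """Build a snapshot record for a completed download."""
    return Snapshot(
        content=content,
        citation=citation,
        metadata=metadata,
        requested_by=requested_by,
        retrieved_at=datetime.now(timezone.utc),
        sha256=hashlib.sha256(content).hexdigest(),
    )


# Invented example values.
snap = take_snapshot(
    b"station,temp\nA,11.2\n",
    citation="Example Source, version unknown",
    metadata="CSV, two columns",
    requested_by="a.user",
)
print(snap.is_intact())  # True

try:
    snap.content = b"edited"
except AttributeError as err:
    print(type(err).__name__)  # FrozenInstanceError
```

Freezing the record only guards against accidental reassignment in memory; the checksum is what lets a later reader confirm that a stored copy has not changed.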

  9. How to cite something - USE
     • Human interaction
       • Assess source for quality and create trust
       • Know the author, source, time, version - someone will figure out how to format/specify, or the source will give the information
     • Machine
       • Where is the source and is it a snapshot?
       • Resolve to something humans can use (mostly)
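The elements listed on this slide (author, source, time, version for humans; location and snapshot status for machines) could be carried in a single record. The sketch below is hypothetical: the class name, field names, output layout, and example values are all invented for illustration, since the slides leave the format to be defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    author: str
    source: str
    accessed: str      # date of access, ISO 8601
    version: str
    locator: str       # where a machine can find the source
    is_snapshot: bool

    def for_humans(self) -> str:
        """Render a readable citation string (layout is an assumption)."""
        kind = "snapshot" if self.is_snapshot else "source"
        return (f"{self.author}. {self.source}, version {self.version} "
                f"({kind}). Accessed {self.accessed}. {self.locator}")

    def for_machines(self) -> dict:
        """Answer the two machine questions: where is it, and is it a snapshot?"""
        return {"locator": self.locator, "is_snapshot": self.is_snapshot}


# Invented example values.
cite = DataCitation(
    author="A. Researcher",
    source="Example Buoy Archive",
    accessed="2011-08-15",
    version="2",
    locator="https://example.org/data/123",
    is_snapshot=True,
)
print(cite.for_humans())
print(cite.for_machines())
```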

  10. Use, history, and rigor seem to be OK, what about Credit?
      • Highest level seems to be tractable
      • Should be given to original sources, contributors, compilers, collectors
      • I did the work, give me some credit.
      • HOW?
      • In a meaningful way (ISI)
