Citations Top to Bottom The Fir Group – Breakout 3 Kerstin Lehnert, John Graybeal, Dmitri Mozzherin, Vivian Hutchison,


1 Citations Top to Bottom
http://etherpad.ooici.org/geodata-fir3
The Fir Group – Breakout 3
Kerstin Lehnert, John Graybeal, Dmitri Mozzherin, Vivian Hutchison, Giri Palanisamy, Eric Wolf, Ron Weaver, Jan Peters, Walt Snyder, Mary Marlino, Cheryl Morris, Benjamin D Branch, Steve Tessler, Lisa Raymond, Jeanine Aquino, Scott Jensen, Percy Donaghay, Dave Folker, Sze-Ling Celine Chan, Doug Walker

2 Why we cite - Reasons for creating a citation for a dataset or data
- Give credit to the creator (Credit)
- Allow humans to know about the data and machines to find the source (Use)
- Know the provenance of the data (History)
- Give rigor and reproducibility to analysis (Rigor)
- Allow specificity and exactness (possibly down to a single item)

3 Why we cite: caveat - Citation and metadata records come from the data source (History)
- Both must come from the data source
- Citation – the source can give the most detailed and appropriate description, including the persistence of the data
- Metadata – the source understands the data and can describe them well at any granularity. The source can also record what the user did to discover or download the data.

4 Why we cite – Rigor/Reproduce (Rigor)
- The scientific method requires that we can replicate results and reproduce an experiment to get the same data and/or result
- Can the data source reliably reproduce and/or recover the same result from the same search or request?

5 Data sources are really variable! (Credit, history, rigor)
- Persistence is a defining factor – Persistence means that the data, or some version of them, can be found in perpetuity (?)
  1. Persistent and static or tightly versioned – the same query or request produces exactly the same result
  2. Persistent but variable – changes and versions are not tracked, but the basic dataset/data type is available. The same query produces similar results, but possibly with differences
  3. Not persistent/streaming – data and data sources come and go and are valuable while there.
- THESE ALL PRODUCE IMPORTANT RESULTS!
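The three categories above can be told apart, roughly, by repeating a query and comparing what comes back. The following is a minimal sketch of that idea, not a method from the slides: the names `SourceKind` and `classify` are hypothetical, and two matching responses only suggest a static source; they do not prove it.

```python
import hashlib
from enum import Enum
from typing import Optional


class SourceKind(Enum):
    STATIC = "persistent and static or tightly versioned"
    VARIABLE = "persistent but variable"
    TRANSIENT = "not persistent/streaming"


def classify(first: bytes, second: Optional[bytes]) -> SourceKind:
    """Compare two responses to the same query, issued at different times.

    `second` is None when the source could no longer answer the query.
    This is a heuristic: identical responses are evidence of a static
    source, not a guarantee.
    """
    if second is None:
        return SourceKind.TRANSIENT
    # Hashing lets large responses be compared by a short fingerprint.
    if hashlib.sha256(first).digest() == hashlib.sha256(second).digest():
        return SourceKind.STATIC
    return SourceKind.VARIABLE
```

A source that returns different bytes for the same request falls into the second category, which is where a snapshot becomes relevant.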

6 Persistent and static or infrequently versioned data (rigor)
- Citation is easy and rigorous (although we still have to define it)
- Metadata are stable
- User gets the same result
- Source maintains the whole record

7 How about the other 99% of data sources? (rigor?!)
- What is appropriate for these data sources? The community recognizes that using them is an appropriate scientific activity that yields reliable and important results.
- Move toward persistent and stable
- Create a SNAPSHOT

8 What is a SNAPSHOT
- It is what was downloaded
- The User of the data is the instigator
- The Source(s) provide citation and metadata
- It is not appropriate for persistent and static sources
- It provides the rigor for analysis but not extraction
- It must be made immutable because the source is not
- It must be persistent somewhere (library, source, other)
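One way to picture the snapshot properties listed above is a record that fingerprints exactly what was downloaded and cannot be altered afterwards. This is a sketch under stated assumptions: the `Snapshot` fields and the `make_snapshot` and `verify` helpers are illustrative names, not something the group specified, and where the bytes are kept (library, source, other) is left open.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Snapshot:
    source: str        # where the data came from (hypothetical identifier)
    query: str         # the request the user issued
    retrieved_at: str  # ISO 8601 timestamp of the download, in UTC
    sha256: str        # fingerprint of exactly what was downloaded


def make_snapshot(source: str, query: str, payload: bytes) -> Snapshot:
    """Record what was downloaded, identified by a hash of the bytes."""
    return Snapshot(
        source=source,
        query=query,
        retrieved_at=datetime.now(timezone.utc).isoformat(),
        sha256=hashlib.sha256(payload).hexdigest(),
    )


def verify(snapshot: Snapshot, payload: bytes) -> bool:
    """Return True if the stored bytes still match the recorded fingerprint."""
    return hashlib.sha256(payload).hexdigest() == snapshot.sha256
```

The hash supports rigor for analysis (anyone holding the archived bytes can confirm they are the ones cited), but it does not let the original extraction be rerun against the source.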

9 How to cite something - USE
- Human interaction
  - Assess source for quality and create trust
  - Know the author, source, time, version - someone will figure out how to format/specify, or the source will give the information
- Machine
  - Where is the source and is it a snapshot?
  - Resolve to something humans can use (mostly)
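The human and machine needs above can be carried by one small record. The sketch below is only an illustration: the `DataCitation` class, its field names, and the output format are assumptions, since the slides leave formatting to "someone" or to the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    author: str        # who created or compiled the data
    source: str        # the data source that issued the citation
    accessed: str      # when the data were retrieved
    version: str       # version reported by the source
    locator: str       # where a machine can find the source (hypothetical URL)
    is_snapshot: bool  # machine-readable answer to "is it a snapshot?"

    def for_humans(self) -> str:
        """Resolve the record to a line a person can read."""
        kind = "snapshot" if self.is_snapshot else "live source"
        return (
            f"{self.author}. {self.source}, version {self.version} "
            f"({kind}). Accessed {self.accessed}. {self.locator}"
        )


# Placeholder values, not a real dataset.
example = DataCitation(
    author="A. Author",
    source="Example Data Archive",
    accessed="2011-03-01",
    version="1.2",
    locator="https://example.org/data/42",
    is_snapshot=True,
)
print(example.for_humans())
```

Machines read the fields directly; people get the formatted line.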

10 Use, history, and rigor seem to be OK, what about Credit?
- Highest level seems to be tractable
- Should be given to original sources, contributors, compilers, collectors
- I did the work, give me some credit.
- HOW?
- In a meaningful way (ISI)

