Data Organization Quality Assurance and Transformations.


1 Data Organization Quality Assurance and Transformations

2 Data Validation
Check for missing, impossible, anomalous values
–Plotting
–Mapping
Examine summary statistics
Verify data transfers from notebooks to digital files
Verify data conversion from one file format to another
Hook, et al. 2010. Best Practices for Preparing Environmental Data Sets to Share and Archive. Available online: http://daac.ornl.gov/PI/BestPractices-2010.pdf.
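The checks on this slide can be sketched in a few lines of code. The slides use R; the example below is a minimal Python stand-in using only the standard library. The readings, the valid range, and the function name `validate` are all hypothetical and chosen purely for illustration.

```python
import math
import statistics

# Hypothetical water-temperature readings (degrees C); None marks a missing value.
readings = [12.1, 11.8, None, 12.4, 99.9, 12.0, -40.0, 11.9]

# Assumed plausible range, for illustration only; a real range depends on the instrument and site.
VALID_MIN, VALID_MAX = -2.0, 40.0


def validate(values, lo, hi):
    """Return (missing row indices, out-of-range row indices, accepted values)."""
    missing, impossible, ok = [], [], []
    for i, v in enumerate(values):
        if v is None or (isinstance(v, float) and math.isnan(v)):
            missing.append(i)
        elif not lo <= v <= hi:
            impossible.append(i)
        else:
            ok.append(v)
    return missing, impossible, ok


missing, impossible, ok = validate(readings, VALID_MIN, VALID_MAX)

# Summary statistics computed on the accepted values only.
summary = {
    "n": len(ok),
    "min": min(ok),
    "max": max(ok),
    "mean": statistics.mean(ok),
    "median": statistics.median(ok),
}

print("missing at rows:", missing)
print("out of range at rows:", impossible)
print("summary:", summary)
```

Flagging row indices rather than silently dropping values keeps the decision about each suspect value visible and reviewable.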

3 Preserve & Record Information
Keep Original (Raw) File
–Do not include transformations, interpolations, etc.
–Make the raw data “read-only”
Save as a new file
Processing Script (R)
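One way to follow this advice in a script is sketched below. The slide names R as the processing language; this is a Python stand-in using only the standard library. The file names, the column `temp_c`, and the "drop rows with a blank temperature" rule are hypothetical.

```python
import csv
import os
import stat
import tempfile

# Work in a throwaway directory so the sketch is self-contained.
workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "raw_temperature.csv")      # hypothetical raw file
clean_path = os.path.join(workdir, "clean_temperature.csv")  # hypothetical derived file

# Create a small raw file to stand in for the original data.
with open(raw_path, "w", newline="") as f:
    csv.writer(f).writerows(
        [["site", "temp_c"], ["A", "12.1"], ["B", ""], ["C", "11.8"]]
    )

# Make the raw data read-only by clearing all write permission bits.
os.chmod(raw_path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)

# Process: read the raw file, drop rows with a blank temperature,
# and save the result as a NEW file. The raw file is never opened for writing.
kept = 0
with open(raw_path, newline="") as src, open(clean_path, "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row["temp_c"] != "":
            writer.writerow(row)
            kept += 1

print("rows kept:", kept)
```

Because the script only ever reads the raw file, rerunning it regenerates the cleaned file from scratch without risk to the original.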

4 Data Manipulation
You will need to repeat reduction and analysis procedures many times
–You need to have a workflow that recognizes this
–Scripted languages can help capture the workflow
–You could just document all steps by hand
–After the 20th iteration through your data set, however, you may feel more fondly towards scripted languages
Learn the analytical tools of your field
–Talk to colleagues, etc. and choose at least one tool to master

5 Preserve Processing Information
Scripts used in file cleaning
Programs / algorithms
Document workflows or data file transformations
[Workflow diagram labels: Temperature data (T); Salinity data (S); Data import into R; Data in R format; Quality control & data cleaning; “Clean” T & S data; Analysis; Summary statistics; Graph Production]
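A scripted workflow of this kind might look like the sketch below, with one function per stage (import, quality control, analysis). The slide describes the workflow in R; this is a Python stand-in using only the standard library, and graph production is left out. The station names, values, column names, and the "salinity must be non-negative" rule are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical temperature (T) and salinity (S) data; blank cells are missing values.
RAW_CSV = """station,temp_c,salinity_psu
S1,12.1,35.0
S2,,34.8
S3,11.8,-1.0
S4,12.4,35.2
"""


def to_float(text):
    """Convert a CSV cell to float, mapping a blank cell to None."""
    return float(text) if text else None


def import_data(text):
    """Stage 1: parse CSV text into a list of typed records."""
    return [
        {
            "station": rec["station"],
            "temp_c": to_float(rec["temp_c"]),
            "salinity_psu": to_float(rec["salinity_psu"]),
        }
        for rec in csv.DictReader(io.StringIO(text))
    ]


def quality_control(rows):
    """Stage 2: keep rows with both values present and a non-negative salinity."""
    return [
        r for r in rows
        if r["temp_c"] is not None
        and r["salinity_psu"] is not None
        and r["salinity_psu"] >= 0
    ]


def analyze(rows):
    """Stage 3: summary statistics on the cleaned data."""
    return {
        "n": len(rows),
        "mean_temp_c": statistics.mean(r["temp_c"] for r in rows),
        "mean_salinity_psu": statistics.mean(r["salinity_psu"] for r in rows),
    }


data = import_data(RAW_CSV)
clean = quality_control(data)
stats = analyze(clean)
print(stats)
```

Keeping each stage as its own function means the script itself documents the order of transformations, and any stage can be revised and the whole chain rerun.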

6 Preserving: Scripted Notes
Use a scripted language to process data
–R Statistical package (free, powerful)
–SAS
–MATLAB
Processing scripts record processing
–Steps are recorded in textual format
–Can be easily revised and re-executed
–Easy to document
GUI-based analysis may be easier, but harder to reproduce

7 Reproducibility Methods
Do use version control
Do document software environment
Only save what cannot be reconstructed from original data + code
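Documenting the software environment can itself be scripted. In R, `sessionInfo()` reports this information; the sketch below is a Python stand-in using only the standard library. The function name `describe_environment` and the package names passed to it are hypothetical.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages=()):
    """Collect interpreter, OS, and package version details into a dict."""
    env = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            env["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly rather than skipping the package.
            env["packages"][name] = None
    return env


# Hypothetical package list; the second name is deliberately not installed.
env = describe_environment(["pip", "definitely-not-installed-pkg"])
print(json.dumps(env, indent=2, sort_keys=True))
```

Saving this output next to the processing scripts, under version control, records which software produced a given result.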

