Data management for NEES
Stanislav (Standa) Pejša, NEEScomm Data Curator
Content
- Benefits of data management
- File formats
- File names
- File description
- Storage
- Reports/Publications
- Copyright
- Curation
- Things to remember
Benefits of data management
- Better chances to find what you are looking for
- Predictable location
- Meaningful structure
- More efficient work with data
- Easier sharing of your data and transfer of knowledge
- Safe location of files
- Less stress when finishing a project
File formats
- Use common and mainstream formats
- Use formats consistently
- Each format requires a different preservation approach
- If possible, avoid bundling and embedding different formats, which can result in loss of functionality or data
- Recommended formats:
  - Sensor measurements: tab-delimited ASCII or CSV
  - Reports, publications, and other documentation: PDF
  - Images: PNG, JPG, and GIF; avoid BMP
  - Frame captures: ZIP, TAR, TAR.GZ
  - Video: currently there are no restrictions; avoid formats that require a specific codec, e.g. ASF
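The format recommendations above can be turned into a quick check before upload. The following is a minimal sketch, not a NEEShub tool: the category names, the function names, and the mapping of "tab-delimited ASCII" to a .txt extension are assumptions made for illustration.

```python
from pathlib import Path

# Recommended extensions per content type, following the slide above.
# Category names and the ".txt" spelling for tab-delimited ASCII are assumptions.
RECOMMENDED = {
    "sensor": {".txt", ".csv"},
    "document": {".pdf"},
    "image": {".png", ".jpg", ".gif"},
    "frame_capture": {".zip", ".tar", ".tar.gz"},
}

# Formats the slide asks to avoid.
DISCOURAGED = {".bmp", ".asf"}


def extension_of(name):
    """Return the lower-cased extension, treating .tar.gz as one extension."""
    suffixes = [s.lower() for s in Path(name).suffixes]
    if suffixes[-2:] == [".tar", ".gz"]:
        return ".tar.gz"
    return suffixes[-1] if suffixes else ""


def check_format(name, kind):
    """Classify a file as 'recommended', 'discouraged', or 'not listed'."""
    ext = extension_of(name)
    if ext in DISCOURAGED:
        return "discouraged"
    if ext in RECOMMENDED.get(kind, set()):
        return "recommended"
    return "not listed"
```

A result of "not listed" only means the slide does not name the format; it is not necessarily a rejection. Video, for example, has no restrictions apart from codec-specific formats such as ASF.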
File names
- A file naming convention is a good idea
- Be consistent in using lower case and upper case
- Use file extensions consistently – do not mix JPG and jpg (lower case is preferred)
- Make filenames meaningful
- Avoid forbidden characters:
- Do not start filenames with a period "."
- Avoid whitespace; use underscore (_) or hyphen (-) instead
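The naming rules above are easy to check automatically. The sketch below covers only the rules that are spelled out on the slide (leading period, whitespace, lower-case extensions). The list of forbidden characters is not given here, so it is left out. The function names are hypothetical.

```python
import re


def filename_problems(name):
    """Return a list of naming-rule violations found in a single filename."""
    problems = []
    if name.startswith("."):
        problems.append("starts with a period")
    if re.search(r"\s", name):
        problems.append("contains whitespace")
    stem, dot, ext = name.rpartition(".")
    # Only inspect the extension when the name really has one.
    if dot and stem and ext != ext.lower():
        problems.append("extension is not lower case")
    return problems


def suggest_name(name):
    """Propose a cleaned-up name: no leading period, underscores, lower-case extension."""
    name = name.lstrip(".")
    name = re.sub(r"\s+", "_", name.strip())
    stem, dot, ext = name.rpartition(".")
    if dot and stem:
        name = stem + "." + ext.lower()
    return name
```

For example, filename_problems("Trial 1.JPG") reports the whitespace and the upper-case extension, and suggest_name("Trial 1.JPG") proposes Trial_1.jpg. Whether a name is meaningful still has to be judged by a person.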
File description
- Descriptions on directories and/or files
  - Make retrieval and identification of files easier
  - Help researchers to understand the purpose of the file or directory
- A description should include:
  - Notes about data processing
  - Software used for creation or processing of data
  - Useful information necessary for rendering of files
  - Notes about the context of the file
Storage
NEEShub will keep your data
- SAFE
  - Your laptop or desktop is not enough
  - Save as you work on experiments
  - Do not wait to be told to upload your data and your experiments
- ORGANIZED
  - Data (./Experiment-n/Trial-n/Rep-n/Type_of_data)
  - Sensor metadata (./Experiment-n/Documentation/Sensors)
  - Material properties (./Experiment-n/MaterialNNNN)
  - Technical drawings: specimen, instrumentation plans (./Experiment-n/Documentation/Drawings)
  - Analytical files (./Experiment-n/Analysis)
  - Presentations, reports, images (./Documentation)
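The directory layout above can be prepared locally before any data exists, so that files land in the right place from the first day. This is a minimal sketch that mirrors the paths on the slide. The function names and the example data-type folder name are assumptions, and the material folders are omitted because their numbering scheme (MaterialNNNN) is not specified here.

```python
from pathlib import Path


def experiment_layout(experiment, trials, reps, data_types):
    """List the relative directories for one experiment, following the slide's paths."""
    exp = Path(f"Experiment-{experiment}")
    dirs = [
        exp / "Documentation" / "Sensors",   # sensor metadata
        exp / "Documentation" / "Drawings",  # specimen and instrumentation plans
        exp / "Analysis",                    # analytical files
        Path("Documentation"),               # project-level presentations, reports, images
    ]
    for t in range(1, trials + 1):
        for r in range(1, reps + 1):
            for data_type in data_types:
                dirs.append(exp / f"Trial-{t}" / f"Rep-{r}" / data_type)
    return dirs


def create_layout(root, dirs):
    """Create the directories under root; existing directories are left untouched."""
    for d in dirs:
        (Path(root) / d).mkdir(parents=True, exist_ok=True)
```

Calling create_layout("my_project", experiment_layout(1, 2, 1, ["Converted_Data"])) would create two trial folders with one repetition each. The folder name Converted_Data is only a placeholder for Type_of_data.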
Reports
- Reports are a requirement
  - Final report (project level)
  - Executive summary (project level)
  - Experimental setup report (experiment level)
- Essential tool for understanding the research and its context
- NEEShub accepts
  - MSci/PhD theses
  - Pre-prints/post-prints
    - Pre-prints – drafts before they are peer-reviewed
    - Post-prints – drafts that incorporate peer reviewers' comments
    - Researchers typically DON’T own copyright to their articles
  - Reports to other grant agencies
Reports, Presentations…
Two methods for publishing resources in the NEEShub:
- Resources uploaded from WP
  - Project-related materials
- Resources uploaded through the Contrib module
  - Materials of interest to the EE community, but not related to research projects in the NEES Data Repository
Resources in the Project warehouse…
- Project-related resources
  - Articles
  - Conference papers
  - Theses
  * currently a limited set of document types
- Retrievable within the NEEShub
- Discoverable through Google
Other relevant resources
NOT related to projects in PW
- Learning Objects (presentations, syllabi, assignments)
- Active Documents (papers, articles, how-tos)
- Historical Documents (documenting past practices and research)
- Publications (published works) – IP!
- Multimedia (audio, video, etc.)
- Tools
NEEShub
Let others know that they can use your data
- Open Data: data
- Creative Commons: presentations, reports, pre-prints/post-prints, teaching materials
- Open Source: software
More on intellectual property considerations
Curation
- IS a service that helps researchers archive their data in a meaningful way
- IS about planning and organizing data, metadata, and documentation
- IS concerned with current and future use of data
- IS an iterative and interactive process between research teams and the curator
- IS a continuum of actions from creation through publication and preservation of data
Curation and metadata
- Metadata need to be:
  - consistent
  - accurate
  - standardized
- Example: relationship among:
  - Sensor metadata
  - Data file headers
  - Instrumentation
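The relationship between sensor metadata and data file headers can be checked mechanically: every sensor listed in the metadata should appear as a column in the data file, and the reverse. The sketch below assumes a tab-delimited file whose first row holds the column names and a time column named "Time". The function name, the sensor names, and that column name are all hypothetical.

```python
import csv
import io


def compare_sensors(metadata_names, data_file_text, delimiter="\t", ignore=("Time",)):
    """Compare sensor names from the metadata with the header row of a data file."""
    reader = csv.reader(io.StringIO(data_file_text), delimiter=delimiter)
    header = [column.strip() for column in next(reader, [])]
    # Columns such as the time axis are not sensors and are skipped.
    columns = {c for c in header if c and c not in ignore}
    metadata = set(metadata_names)
    return {
        "missing_from_data": sorted(metadata - columns),
        "missing_from_metadata": sorted(columns - metadata),
    }
```

Two empty lists mean the names agree. A name that appears in only one place usually points to a spelling difference, which is the kind of inconsistency curation is meant to catch.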
Curation and metadata
On the experiment level
- CURATION self-check
  - in ‘EDIT’ mode you can repeatedly check your progress and compliance with the data model
  - the self-check indicates whether files were uploaded to the correct location
  - use the provided box to communicate with the curator
  - once done, send a curation request to the curator
Things to remember
- Save files as you work on them
- Plan ahead
- Do not wait to be told to upload your data
- Be consistent
- If you need help with upload or organization of data
  - Search for curation
    - Many documents are tagged ‘curation’ or ‘data curation’
  - Or contact the NEEScomm Data Curator
Thank you! And if you have any questions, contact me at: