Published by Preston Beasley. Modified over 9 years ago.
1
Preservation Strategies: Framing The Approach
Nancy Hoebelheinrich
Knowledge Motifs LLC
nhoebel@kmotifs.com
Data Management Workshop
American Geophysical Union
San Francisco, CA
Tuesday, December 6, 2011
2
Overview
  Preservation strategies to pursue once the argument for data stewardship & data preservation is won
  Background of previous issues & discussions re: data stewardship & data management
  Provides a framework of questions that a scientist can answer to facilitate the preservation of his/her data for the long term
3
Relevance to Data Management
  Why is this important????
  As a metaphorical example, consider the following situation:
4
Relevance to Data Management
  Documentation for My Latest Research Project
  To Data Manager: Don’t worry, the connections are all there [– in my head!]
5
Relevance to Data Management
  Documentation for My Latest Research Project
  To Data Manager: See, here’s the primary algorithm I used…
6
Relevance to Data Management
  Documentation for My Latest Research Project
  To Data Manager: Here’s the schedule we used to gather the data – although some months it was a little different…
7
Relevance to Data Management
  Documentation for My Latest Research Project
  To Data Manager: Oh, and here’s the team – our PI wasn’t available for the photo, so we put a placeholder for him – see the guy with the mustache below on the stick? And the project manager – she’s the one with the long ears…what was her name?
8
Relevance to Data Management
  So, what’s the Data Manager gonna do with all this stuff??
  Ensure long-term integrity & viability of your data, including:
    Various levels of processed data / data products, if desired
    Metadata (MD) you have (in your head or in documentation)
    Context & Provenance – “audit” trail of sources, processing, products
  By ingesting, identifying, storing, locating & providing access, if desired, to all of the above
  Deploy preservation strategies such as:
    Assigning checksums and/or identifiers to each “item” of a data set
    Migrating to non-proprietary and/or new formats over time
    Migrating to new storage media over time
    Refreshing the data over time
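To make the first strategy concrete, here is a minimal sketch of assigning a checksum to each "item" of a data set and later checking that nothing has changed. It assumes the data set is a directory of files and that SHA-256 is an acceptable checksum. The function names (`sha256_of_file`, `build_manifest`, `changed_items`) are hypothetical and are not taken from the presentation or from any particular archive's tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to dataset_dir) to its checksum."""
    return {
        path.relative_to(dataset_dir).as_posix(): sha256_of_file(path)
        for path in sorted(dataset_dir.rglob("*"))
        if path.is_file()
    }


def changed_items(dataset_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest items that are now missing or whose checksum differs."""
    changed = []
    for relative_name, expected in manifest.items():
        path = dataset_dir / relative_name
        if not path.is_file() or sha256_of_file(path) != expected:
            changed.append(relative_name)
    return changed
```

A manifest built at ingest time can be stored alongside the data and re-checked with `changed_items` after each migration to a new format or storage medium. A check like this only detects changes; it does not repair them.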
9
How can I (the scientist) help?
  Framework Questions:
  Besides me, who’s going to care?
    Sponsor mandates to archive
      Specific requirements from sponsor, e.g., NASA, NOAA, USGS
    Data archive requirements & desirements
      Negotiated & documented in Submission Information Package (OAIS SIP)
    Future scientists who want to use/re-use your data!!
  What kind of data should be kept?
    Formulae for decision-making, e.g., NOAA National Climatic Data Center’s Climate Data Record Maturity Matrix; factors include software readiness, existence / state of metadata & (other) documentation, utility of data, validity of product (based on certainty estimates), desire for / restrictions upon public access
    Documentation of specific disciplinary requirements, e.g., CDRs from Satellite Passive Microwave Sounders
    Allowing for serendipity & cyclical nature of scientific data
10
Example Data Maturity Index
11
How can I (the scientist) help?
  Key Framework Question for future scientists who want to use/re-use my data: what will they need to know? (= MD that I probably know best)
    Documentation including restrictions on access & use
    Assumptions, hypotheses, algorithms about data (who, what, when, where, why & how) = “provenance & context”
    Sequence of time, date, technical details of data creation / acquisition and relationships among data units (or how to figure them out) = “preservation MD”
    Key people, roles & their organizations = “citation MD”
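One way to get this metadata out of the scientist's head is to write it down in a small machine-readable record stored next to the data. The sketch below is only an illustration under assumptions: the class names (`DatasetRecord`, `Contributor`), field names and sample values are hypothetical placeholders that mirror the four categories on this slide. They do not follow any particular metadata standard or archive requirement.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Contributor:
    """One "citation MD" entry: a person, their role and organization."""
    name: str
    role: str
    organization: str


@dataclass
class DatasetRecord:
    """Hypothetical record grouping the metadata categories from the slide."""
    title: str
    access_restrictions: str = ""  # restrictions on access & use
    provenance_and_context: dict = field(default_factory=dict)  # who, what, when, where, why & how
    preservation: dict = field(default_factory=dict)  # creation details, relationships among data units
    citation: list = field(default_factory=list)  # list of Contributor entries

    def missing_sections(self) -> list:
        """Return the names of metadata sections that are still empty."""
        sections = (
            "access_restrictions",
            "provenance_and_context",
            "preservation",
            "citation",
        )
        return [name for name in sections if not getattr(self, name)]

    def to_json(self) -> str:
        """Serialize the record, including nested contributors, to JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values only; a real record would come from the project itself.
record = DatasetRecord(
    title="Example field campaign",
    access_restrictions="Public after embargo",
    provenance_and_context={"how": "Primary algorithm described in project notes"},
    citation=[Contributor("A. Scientist", "PI", "Example University")],
)
print(record.missing_sections())  # the "preservation" section is still empty
```

Listing the empty sections gives the data manager a short, specific set of questions to take back to the project team before the data is submitted.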
12
What if I don’t have an existing archive for my data?
  Some disciplines may not have a data center or archive set up for them – what resources are available?
    Institutions with experience: governmental agencies (UK Data Centers, UK Digital Curation Centre; in the US: NASA, NOAA, USGS, NARA), research libraries, national & international libraries, archives and data centers
    Comprehensive information resources about preservation and archiving, e.g., CIESIN’s Geospatial Clearinghouse, at http://geopreservation.org/index.jsp;jsessionid=7nV0TZhFPJc4b22zWbjn1J9p3LK9RR81fHhpGgM9gzJWYlQJJ690!480336224, US Library of Congress, etc., and DuraSpace, at http://www.duraspace.org/
    DataONE – an NSF-funded consortium focused on preservation and access to multi-scale, multi-discipline, and multi-national science data, https://www.dataone.org/
    Data Conservancy – an NSF-funded consortium focused on scientific data curation as a means to collect, organize, validate and preserve data, http://dataconservancy.org/
13
References and Resources
  NASA Earth Science Data Preservation Content Specification (Nov 2011), http://earthdata.nasa.gov/sites/default/files/field/document/NASA_ESD_Preservation_Spec.pdf
  NASA, 2011: Metadata Requirements – Base Reference for NASA Earth Science Data Products (Nov 2011), http://earthdata.nasa.gov/sites/default/files/field/document/NASA%20ESD%20Base%20Metadata%20Requirements_V1_20110922_0.pdf
  Preliminary Principles and Guidelines for Archiving Environmental and Geospatial Data at NOAA: Interim Report, http://www.nap.edu/catalog.php?record_id=11659
  Archiving Strategy for USGS EROS Center & Our Future Direction, March 29, 2010, http://ddp.nist.gov/workshop/ppts/01_07_John_Faundeen%20US_DPIF_NIST%20WS.pdf
  Example disciplinary requirements: NOAA Workshop on Climate Data Records from Satellite Passive Microwave Sounders, http://www.star.nesdis.noaa.gov/star/documents/meetings/CDR2011/NOAAWorkshopCDRsFinalReport.pdf
  NOAA NCDC Climate Data Record (CDR) Maturity Matrix, http://www1.ncdc.noaa.gov/pub/data/sds/cdrp-mtx-0008-v3.0-maturity-matrix.pdf
  ESIP Data Stewardship & Preservation Cluster, wiki found at http://wiki.esipfed.org/index.php/Preservation_and_Stewardship
14
Other Relevant Modules
  The case for data stewardship
  Managing your data
  Creating documentation and metadata
  Working with your archive organization