Science Archives Workshop - April 25, 2007 - Page 1 Archive Policies and Implementation: A Personal View from a NASA Heliophysics Data Policy Perspective.

Presentation transcript:

Science Archives Workshop - April 25, 2007
Archive Policies and Implementation: A Personal View from a NASA Heliophysics Data Policy Perspective
D. Aaron Roberts, NASA GSFC
25 April 2007

Define: Archive (some Google results)
- A site containing a large number of files, possibly acquired over time, and often publicly accessible. (100 Best Web Hosting)
- A function permitting users to copy one or more files to a long-term storage device. Archive copies can:
  - Accompany descriptive information;
  - Imply data compression software usage;
  - Be retrieved by archive date, file name, or description (Tivoli Storage Manager)
- Archive is a London-based Trip-hop group. (Wikipedia)

Science Data Archive Definition
- Easily accessible, scientifically useable, well-documented, secure data = a good archive.
- Requires:
  - Open data policy
  - Independently useable data
  - Science input (data preparation and serving)
  - Proper registration and backup

Archiving Homilies
- Archiving is a journey, not a destination
- “Archive early, archive often” as a natural extension of serving data
- “Central” archiving is more about knowledge than acquisition
- Knowledge must be easily available: presentation matters
- The customer is always right
- Standards are only as good as the community that supports them, but they are essential: “It’s the metadata, stupid”
- Consider the legacy

Archiving is a journey
Properly described, well-documented, accessible data should easily move from one archiving stage to the next:
- NASA missions produce Active Archives (nothing is “ingested”)
- Products, delivery, and initial long-term data plans in Project Data Management Plan
- Virtual Observatories provide uniform descriptions and access to many such archives
- The archive continues to develop in the extended mission
- A Mission Archive Plan provides updates to the Senior Reviews on status, plans, and actions for post-mission products and service
- After the mission, a Resident Archive can continue to serve data
- Active upgrades of data products to be funded by other means
- NSSDC manages the RAs
- “Permanent” archiving may just be moving the data and documentation to a more generic Resident Archive (e.g., SDAC, SPDF) for continued access
- At all stages, backups and registries maintain safety and knowledge of the data products

“Central” archiving
More about knowledge than acquisition:
- What exists?
- Where is it?
- Is it well documented?
- Is it safe?
- New focus for NSSDC role (at least for HP): knowledge of data environment; management of RAs.
- (Harvested) VO registries augmented as needed can provide a complete set of resources.
- Information about the above should be available in ways that provide easy overviews as well as details.
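The four questions on this slide map naturally onto a small registry record. The following is a minimal sketch, assuming a hypothetical `RegistryEntry` structure; the field names, product identifiers, and example URLs are illustrative only and do not come from NSSDC, SPASE, or any VO registry.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """One data product as a central registry might track it (hypothetical fields)."""
    product_id: str                 # what exists
    location: str                   # where it is served from
    documentation_url: str = ""     # empty string: no documentation registered
    backup_sites: list = field(default_factory=list)  # where copies are held


def status_report(entries):
    """Answer the four questions for every registered product."""
    return [
        {
            "what": e.product_id,
            "where": e.location,
            "documented": bool(e.documentation_url),
            "safe": len(e.backup_sites) > 0,
        }
        for e in entries
    ]


# Illustrative entries only; the identifiers and hosts are made up.
entries = [
    RegistryEntry(
        "example/mag_1min",
        "https://archive-a.example/mag",
        "https://archive-a.example/mag/doc",
        ["site-b"],
    ),
    RegistryEntry("example/plasma_raw", "https://archive-c.example/plasma"),
]

for row in status_report(entries):
    print(row)
```

A report like this gives the "easy overview"; the full entries remain available for the details.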

The customer is always right
The community determines directions:
- Peer review of VOs, RAs, Data Centers, Missions: What is working? What could be improved? What can go?
- HP Data and Computing Working Group provides feedback on HQ directions
- “Top-down vision, bottom-up implementation”
- “Market-driven” including what we want from archives

It’s the metadata, stupid
Standards that work:
- Value of sharing data
- SPASE data model provides a uniform description of data products
- SPASE description + data = “SIP”, “AIP”, and “DIP”
- Preserved data should be in common, open, supported formats (e.g., FITS, HDF, CDF, documented ASCII, …)
- Communication and other standards TBD
- Important to decide the level of description
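The "description + data = package" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the description keys below are hypothetical and are not the SPASE schema, and the SHA-256 fixity check is an added convenience that the slide does not mention.

```python
import hashlib


def make_package(description: dict, data: bytes) -> dict:
    """Bind a metadata description to a data payload via a checksum and size."""
    return {
        "description": description,            # hypothetical keys, not SPASE
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }


def verify_package(package: dict, data: bytes) -> bool:
    """Check that the payload still matches what the package recorded."""
    return (
        len(data) == package["size_bytes"]
        and hashlib.sha256(data).hexdigest() == package["sha256"]
    )


payload = b"time,bx,by,bz\n0,1.0,2.0,3.0\n"   # stand-in for a documented ASCII file
package = make_package(
    {"product": "example_mag", "format": "documented ASCII"},
    payload,
)

print(verify_package(package, payload))          # payload unchanged
print(verify_package(package, payload + b"x"))   # payload altered
```

Because the same description travels with the data at every stage, the package handed over, the package stored, and the package served can be the same object.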

Consider the legacy
Preserving and serving what matters for the long term:
- What is most useful? (If “all” is not possible)
- What works now, and what will last (and how)?
- Calibrated, best-effort products should accompany level-zero plus software/algorithms

A model Heliophysics never quite implemented
Main problems:
(1) “Planning” is a mission function (in collaboration with VOs and others)
(2) “Ingest” is replaced by “production” and “transfer”
(3) “Access” is a distributed function as are the archives in general

The New Heliophysics Mission Data Lifecycle and Framework

Summary
- Easily accessible, scientifically useable, well-documented, secure data = a good archive.
- Archiving is a journey, not a destination
- “Central” archiving is more about knowledge than acquisition
- Knowledge must be easily available: presentation matters
- The customer is always right
- Standards are only as good as the community that supports them, but they are essential: “It’s the metadata, stupid”
- Consider the legacy

Backup Slides (HP Data Policy)

The HP Data Environment
- Data from the Heliophysics Great Observatory reside in a distributed environment and are served from multiple sources.
- Multimission Data Centers
  - Solar Data Analysis Center
  - Space Physics Data Facility (CDAWeb, OMNIWeb, etc.)
  - National Space Science Data Center
- Mission-level active archives: e.g., ACE, TIMED, TRACE, Cluster, etc.
- Much of our data are served from individual instrument sites.
- We are moving into a new data environment of
  - Virtual Observatories for convenient search and access of the distributed data, and
  - Resident Archives to retain the distributed data sources even after mission termination.
- We have a Data and Computing Working Group to help us move ahead.

Goals of the HP Science Data Management Policy
- Improve management of and access to HP mission data.
- Clarify the architecture and associated data lifecycle milestones of the data environment.
- Provide guidelines for proposals, Project Data Management Plans, NRAs, peer reviews, and other activities related to the HP data environment.

Basic Philosophy
- Evolve the existing HP data environment:
  - take advantage of new computer and Internet technologies to
  - respond to our evolving mission set and community research needs (enable the HP Great Observatory)
- Blend ‘bottom-up’, ‘market-driven’ implementation approaches with a ‘top-down’ vision for an integrated data environment.
- Assure that the HP science community participates in all levels of data management.

Guiding Principles
- All data produced by the HP missions will be open and made available as soon as is practical.
  - Gurman's “Right Amount of Glue” from the Fall 2002 AGU meeting sets the philosophy [see ...], a key component of which is a standard of behavior: share one’s data with everyone.
- Data will be independently scientifically usable. This requires:
  - adequate documentation including uniform SPASE descriptions
  - sustainable and open data formats
  - easy electronic access
  - provision of appropriate analysis tools.

Architecture
- The environment will be distributed
  - Many archives with different internal workings
- Data integration capabilities provided by discipline-based virtual observatories (“VxOs”; VSO first for x = “Solar” and now 5 others)
  - linked by a central dictionary (“SPASE Data model”) and machine-to-machine communication routines.
  - Easily permits the inclusion of essential data sets from non-NASA sources.
  - Provides a context for services and advanced analysis tools developed under, e.g., AISRP, LWS TR&T, and the VxOs.
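The role of a shared dictionary in a distributed environment can be illustrated with a toy federated search. This is a sketch under loud assumptions: the archive names, the description keys (`product`, `measurement`, `region`), and the in-memory lists are all hypothetical stand-ins; real VxOs communicate over network services that are not modelled here.

```python
def matches(description: dict, criteria: dict) -> bool:
    """True when every requested key has the requested value in the description."""
    return all(description.get(key) == value for key, value in criteria.items())


def federated_search(archives: dict, **criteria) -> list:
    """Query every archive's descriptions and tag each hit with its source archive."""
    hits = []
    for archive_name, descriptions in archives.items():
        for description in descriptions:
            if matches(description, criteria):
                hits.append({"archive": archive_name, **description})
    return hits


# Each archive may work differently internally, but all of them describe
# their products with the same (hypothetical) vocabulary.
archives = {
    "mission_archive_a": [
        {"product": "mag_1min", "measurement": "MagneticField", "region": "Heliosphere"},
    ],
    "non_nasa_archive_b": [
        {"product": "b_field_hourly", "measurement": "MagneticField", "region": "Heliosphere"},
        {"product": "euv_images", "measurement": "Irradiance", "region": "Sun"},
    ],
}

results = federated_search(archives, measurement="MagneticField")
for hit in results:
    print(hit["archive"], hit["product"])
```

Because the search depends only on the common vocabulary, adding a non-NASA source amounts to adding another entry whose descriptions use the same keys.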

Policy Recommendations, Etc.
- The Policy includes:
  - Roles of data environment components,
  - “Rules of the Road” for data use,
  - Recommendations for Project Data Management Plans and Mission Archive Plans,
  - A timeline of the HP mission data lifecycle

Implementation
- Use peer-review processes to assist in managing the elements of the environment.
  - NRAs for: (a) VxOs, (b) Data quality and access improvement, (c) Resident Archives, and (d) Value-added services.
  - Mission and Data Center Senior Reviews; RA reviews.
- Success will be determined by community use and feedback. The process is “market-driven.”

Current Activities
- Finalizing the Data Policy with community input.
  - Our goal is to have this ready for the MIDEX AO
- Implementing a second round of VxOs and processing the next round of proposals for VxOs and related services.
- Coordinating these efforts through frequent interactions and work with the SPASE group.
- Implementing Resident Archives and the processes to manage these archives.
- Working with new missions to incorporate the Data Policy from the start, and “retrofitting” older missions through VxOs and other means.
- Working on collaboration with other NASA science divisions, other US agencies, and international partners.
- Maintaining a web site for latest news about our data environment: