The Path Toward Data System Integration. Raymond J. Walker, Todd A. King, Steven P. Joy. Science Archives in the 21st Century, University of Maryland, April 25-26, 2007.


The Path Toward Data System Integration
Raymond J. Walker, Todd A. King, Steven P. Joy
Science Archives in the 21st Century, University of Maryland, April 25-26, 2007

“Yes, Haven, most of us enjoy preaching, and I have such a bully pulpit!” (Theodore Roosevelt)
Prepare for a sermon!

A Persistent Dream
A global data environment in which all Earth and space science data are organized in a common way, with “one stop shopping” for any data product.
After decades of trying, we are not very close to achieving that dream.

Goals for Science Data Systems
Help scientists locate the data required for a given study.
Provide scientists with access to those data.
Assure that those data are useable.
Preserve the data forever.
Aid scientists in using the data.

A Few Realities
Don’t try to build a centralized system. The data are distributed and will remain so.
It is all about science. Allow the science needs and the scientific community to drive the system.
Adopt community-wide standards. The key to interoperability within a data system is the metadata.
No data model is perfect. New requirements emerge continually.
Leverage what already exists. There are a variety of valuable community assets.

Can there be a Single Solution?
Not yet, because:
Each community has distinct needs.
Each community has a unique history.
Science must continue while systems are deployed.
  – Changes must be evolutionary.
  – Leverage existing systems and assets.
The resources are limited.
  – Revolutions are expensive.

Plus… The Data are Found Worldwide
More nations are active participants in space.
Each mission enhances and complements the current body of data.
  – All data are important.
Answers to current science questions require data from multiple sources.
Individual communities need autonomy.
  – Governmental
  – Project organization
  – Efficiency

A Mature Data System (Planetary)
The Planetary Data System (PDS)
  – Serving the NASA planetary science community for almost 20 years.
  – Mature data model.
  – Well suited for archiving.
IPDA (International Planetary Data Alliance)
  – U.S., EU, Japan, China, Russia, and India.
  – Formed in 2006 to define a standard based on (inspired by) the PDS data model.
  – A draft is expected to be vetted by the community in late 2007.

What do you do if there are no accepted metadata standards and the data are highly distributed?

An Emerging Data System (Heliophysics)
Needed a way to connect existing systems.
  – A new model that would be an “interlingua” was required.
SPASE (Space Physics Archive Search and Extract)
  – Concept defined in
  – Formed in 2003 to define a standard for data exchange for Space Physics.
  – International participation (U.S., France, Britain, Japan, Canada). Open to all.
  – Releases:
      Version released in November 2005
      Version released in August 2006
      Version release is imminent
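The “interlingua” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPASE data model: the common field names, the two archive-local record layouts, and the function names are all hypothetical. The point it shows is that each archive supplies one translator into a shared vocabulary, rather than one translator per partner archive.

```python
from datetime import datetime, timezone

# Hypothetical common vocabulary (NOT the real SPASE schema).
COMMON_FIELDS = ("resource_id", "observatory", "start", "stop")


def from_archive_a(rec):
    """Translate archive A's local record (hypothetical layout) to the common form."""
    return {
        "resource_id": "A:" + rec["dataset"],
        "observatory": rec["spacecraft"],
        # Archive A stores ISO 8601 strings, assumed here to be UTC.
        "start": datetime.fromisoformat(rec["begin"]).replace(tzinfo=timezone.utc),
        "stop": datetime.fromisoformat(rec["end"]).replace(tzinfo=timezone.utc),
    }


def from_archive_b(rec):
    """Translate archive B's local record (hypothetical layout) to the common form."""
    return {
        "resource_id": "B:" + str(rec["id"]),
        "observatory": rec["mission_name"],
        # Archive B stores times as seconds since the Unix epoch.
        "start": datetime.fromtimestamp(rec["t0"], tz=timezone.utc),
        "stop": datetime.fromtimestamp(rec["t1"], tz=timezone.utc),
    }


def is_common(rec):
    """Check that a translated record carries exactly the common fields."""
    return set(rec) == set(COMMON_FIELDS)


# Made-up sample records from two differently organized archives.
a = from_archive_a({"dataset": "mag-1min", "spacecraft": "ExampleSat",
                    "begin": "2001-01-01T00:00:00", "end": "2001-12-31T00:00:00"})
b = from_archive_b({"id": 42, "mission_name": "ExampleSat", "t0": 0, "t1": 86400})
```

Once both records are in the common form, a search tool can compare them field by field without knowing how either archive stores its metadata internally.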

The Heliophysics Data Environment – December 2007
[Diagram; labels: VMO, VxO, Resident Archive, Individual Researcher]

Implementing a Space Physics Data System
Virtual Observatories
  – Provide standards-based access by sub-discipline.
  – Aid data providers in making their resources available.
  – Serve as integrating portal to existing data repositories.
      Mission databases
      Resident archives
      Researcher data sets
  – Serve as integrating portal to services.
      See CoSEC (Collaborative Sun-Earth Connector) for a functioning example.
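The “integrating portal” role can be sketched as follows. This is a toy model under stated assumptions, not the design of any actual virtual observatory: the `Resource`, `Repository`, and `VirtualObservatory` classes, their fields, and the sample records are all hypothetical. It illustrates one property from the slide: the portal holds no data itself; it forwards a query to each distributed repository and merges the metadata hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Metadata record describing one data product (hypothetical fields)."""
    resource_id: str
    measurement: str
    start_year: int
    stop_year: int
    repository: str  # where the data actually live


class Repository:
    """Stands in for a mission database, resident archive, or researcher data set."""

    def __init__(self, name, records):
        self.name = name
        self._records = [Resource(repository=name, **r) for r in records]

    def search(self, measurement, year):
        return [r for r in self._records
                if r.measurement == measurement
                and r.start_year <= year <= r.stop_year]


class VirtualObservatory:
    """Integrating portal: stores no data, forwards queries, merges results."""

    def __init__(self, repositories):
        self.repositories = list(repositories)

    def search(self, measurement, year):
        hits = []
        for repo in self.repositories:
            hits.extend(repo.search(measurement, year))
        return sorted(hits, key=lambda r: r.resource_id)


# Made-up holdings for three independent repositories.
vo = VirtualObservatory([
    Repository("mission_db", [
        {"resource_id": "mission/mag", "measurement": "magnetic_field",
         "start_year": 1995, "stop_year": 2003},
    ]),
    Repository("resident_archive", [
        {"resource_id": "resident/plasma", "measurement": "plasma",
         "start_year": 1990, "stop_year": 2010},
        {"resource_id": "resident/mag", "measurement": "magnetic_field",
         "start_year": 2000, "stop_year": 2005},
    ]),
    Repository("researcher", [
        {"resource_id": "researcher/mag", "measurement": "magnetic_field",
         "start_year": 2004, "stop_year": 2006},
    ]),
])

results = vo.search("magnetic_field", 2001)
```

Each hit records the repository it came from, so the data stay where they are and each repository keeps control of its own holdings.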

When is Enough, Enough?
The metadata have to be relevant to the purpose.
  – Initially PDS developed a metadata standard which was very rich scientifically.
  – Data providers complained that it was too rich and that the effort required to generate the metadata was too large.
  – PDS then modified the metadata standards to be more in keeping with what the data providers could support.
  – Clarity comes from usage.
The threshold of participation must be kept low.

Assure that the Data are Useable
Data Quality
  – The data systems have a responsibility to provide the best quality data available.
  – You only learn about data by trying to do science with them.
  – Peer review has proven to be an important tool for assuring data quality.
Data Processing
  – Most science is done with highly processed data.
  – Frequently only raw data plus algorithms and calibrations (or software) are archived.
  – Data need to be readily useable; secondary users often don’t have the resources to process raw data into physical units, even with well-documented data or software.

Formats! Formats! Formats!
One of the most contentious issues during PDS’s 20-year history concerns data formats.
Long ago the astronomy community settled on the Flexible Image Transport System (FITS) as their main format.
FITS is inappropriate for many types of data (e.g. time series tables).
Many planetary scientists objected to FITS, so PDS accepts most formats (provided they can be described by the PDS metadata standard).
The result is many formats, and even accepted formats are sometimes hard to describe.
Is it possible and desirable to limit this ever-growing list? YES!!!

Preserve the Data Forever
Long term preservation remains a serious issue.
  – PDS has been archiving the data on “hard media” (CD, DVD).
  – They have media as old as 20 years.
  – They have found problems with media (both stamped and write-once) that are only a couple of years old (M. Martin and B. Harris).
  – The decay time is short compared to the time over which we had planned to renew media.
  – Tape media have even shorter lifetimes than CD and DVD.
Long term preservation is a common concern!
  – There is no current or anticipated hard media that meets our durability and capacity requirements.
  – How do you build an adequate preservation system?

What should we do?
For your discipline:
Endorse and adopt a standard.
  – Don’t start over for each mission and project.
  – If the metadata standard in your discipline is inadequate, work to improve it.
  – The standards keepers must be willing to work with the community and respond quickly.
Pick a minimal set of data formats.
  – One size fits all does not work, but neither does allowing all formats.
  – One for each type of data.
Retrofit existing tools.
  – We can’t afford to continually start over; we must evolve.
Make compliance a contractual obligation.
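A rule such as “one format for each type of data” only helps if deliveries can be checked against it. The sketch below shows one way such a check might look. Everything in it is an assumption made for illustration: the policy table, the format assigned to each data type, the product record layout, and the function name are hypothetical and are not recommendations from the talk.

```python
# Hypothetical discipline policy: exactly one accepted format per data type.
# The specific pairings are placeholders, not recommendations.
ACCEPTED_FORMATS = {
    "time_series": "cdf",
    "image": "fits",
    "table": "csv",
}


def check_delivery(products):
    """Return a list of human-readable problems; an empty list means compliant."""
    problems = []
    for p in products:
        expected = ACCEPTED_FORMATS.get(p["data_type"])
        if expected is None:
            problems.append(f"{p['name']}: unknown data type {p['data_type']!r}")
        elif p["format"].lower() != expected:
            problems.append(
                f"{p['name']}: {p['data_type']} must be {expected}, got {p['format']}"
            )
        # Every product must also come with a metadata description.
        if not p.get("metadata"):
            problems.append(f"{p['name']}: missing metadata description")
    return problems


# Made-up deliveries: one compliant, one with two problems.
good = [{"name": "p1", "data_type": "time_series", "format": "CDF", "metadata": "p1.xml"}]
bad = [{"name": "p2", "data_type": "image", "format": "png", "metadata": ""}]
```

A check of this kind could be run at delivery time, which is one way a contractual compliance requirement might be made verifiable.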
