Amanda Leon ESIP Summer 2017 Virtual Collections
What is a “virtual collection”?
Virtual: the datasets are not all in a single location or repository.
Collection: a curation of specific datasets and products relevant to a theme or application.
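To make the definition concrete, a virtual collection can be pictured as a themed list of pointers to datasets that stay in their home repositories. The sketch below is an illustration only: the class names, fields, and dataset short names are hypothetical placeholders, not an ESIP or NASA schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass(frozen=True)
class DatasetRef:
    """Pointer to a dataset that stays in its home repository."""
    repository: str   # e.g. a DAAC name
    short_name: str   # hypothetical dataset identifier


@dataclass
class VirtualCollection:
    """A curated, themed list of dataset pointers (no data is copied)."""
    theme: str
    members: List[DatasetRef] = field(default_factory=list)

    def add(self, ref: DatasetRef) -> None:
        # Skip duplicates so the same dataset is listed only once.
        if ref not in self.members:
            self.members.append(ref)

    def repositories(self) -> List[str]:
        # "Virtual": members may live in several repositories.
        return sorted({m.repository for m in self.members})

    def to_json(self) -> str:
        # A serialized form is one way a collection could be shared.
        return json.dumps(asdict(self), sort_keys=True)


vc = VirtualCollection(theme="soil moisture")
vc.add(DatasetRef("NSIDC DAAC", "EXAMPLE_SMAP_PRODUCT"))      # placeholder name
vc.add(DatasetRef("ORNL DAAC", "EXAMPLE_AIRMOSS_PRODUCT"))    # placeholder name
vc.add(DatasetRef("NSIDC DAAC", "EXAMPLE_SMAP_PRODUCT"))      # duplicate, ignored
print(vc.repositories())
```

The collection stores references only, so adding a member never moves or copies data; the JSON form is what another user or service would receive.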
What is a “virtual collection”?
ESDSWG 2015: collected use cases and types of resources that could be curated into a collection.
Community use cases: NSIDC DAAC User Working Group; Soil Moisture Active Passive (SMAP) Science Team and Applications Working Group.
Enabling Virtual Collections
Creating components for curating VCs:
Subsetting Services: Variable, Spatial, Temporal
Data Transformation Services: Reformat, Reproject
Granule Metadata: UMM-G
Collection Metadata: UMM-C
Enabling Virtual Collections
What components are still needed?
Subsetting Services
File Metadata: UMM-G
Collection Metadata: UMM-C
Data Transformation Services
Variable Metadata: UMM-V
Service Metadata: UMM-S
Data stewards are creating them
Around missions
Examples: Soil Moisture Active Passive (SMAP) Satellite Mission; AirMOSS (Airborne Microwave Observatory of Subcanopy and Subsurface): Soil Moisture Visualizer
DAACs shown: ORNL DAAC, NSIDC DAAC, GES DAAC, ASF DAAC
Data stewards are creating them
Around Earth science domains
Examples: Satellite Observations of Arctic Change (SOAC); Sea Level Change Portal
Organizations shown: NSIDC DAAC, University of Maryland, NASA GMAO, PO.DAAC, JPL
Data stewards are creating them, but…
We know where to find data
And we harmonize different data access protocols, formats, projections, grids, etc.
But we can only manually curate for a limited set of use cases
Researchers are creating them
Around research and applied science
Data sources shown: GES DAAC, NSIDC DAAC, OB.DAAC, PO.DAAC
Earthdata Search: discover and customize data
Example: flood in Texas on 24-25 May 2015; 6 variables from 5 sensors for 3 months
More harmonizing
Researchers are creating them, but…
There are still barriers to finding and harmonizing data
And their VCs typically are not captured or shareable
Moving forward
How do we make curating VCs scalable and interoperable?
How do we define a standardized service to capture, disseminate, and share VCs?
Prototyped a VC with CMR tagging
How might we enable VCs so the user community can easily create and share them?
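The CMR tagging prototype suggests one way a shared VC could be looked up: tag the member collections with a common key, then search by that key. A minimal sketch, assuming the CMR collection search endpoint and its `tag_key` and `page_size` parameters; the tag key is a made-up placeholder, and the function only builds the request URL without calling the service.

```python
from urllib.parse import urlencode

# Assumed CMR collection search endpoint (JSON response format).
CMR_COLLECTIONS_URL = "https://cmr.earthdata.nasa.gov/search/collections.json"


def vc_search_url(tag_key: str, page_size: int = 50) -> str:
    """Build a search URL for collections carrying the given tag key."""
    if not tag_key:
        raise ValueError("tag_key must be non-empty")
    query = urlencode({"tag_key": tag_key, "page_size": page_size})
    return f"{CMR_COLLECTIONS_URL}?{query}"


# Hypothetical tag key naming a virtual collection.
url = vc_search_url("org.example.vc.texas_flood_2015")
print(url)
```

Fetching the URL with any HTTP client would list the tagged collections. Creating the tag and associating collections with it requires authenticated calls and is not shown here.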
Thank you!