Presentation is loading. Please wait.

Presentation is loading. Please wait.

End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009.

Similar presentations


Presentation on theme: "End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009."— Presentation transcript:

1 End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009

2 Vision “Unidata’s vision calls for providing comprehensive, well- integrated and end-to-end data services for the geosciences. These include an array of functions for collecting, finding, and accessing data; data management tools for generating, cataloging, and exchanging metadata; and submitting or publishing, sharing, analyzing, visualizing, and integrating data.” What does this vision statement mean to each of us?

3 Background Providing real-time weather data (and related tools) was the primary reason why Unidata was created and it has been the bread and butter of Unidata’s mission for more than two decades. But as our work has evolved, along with our community, it has become clear that just provision of real-time data or facilitating data access is not enough. Hence the vision statement. Let’s think about how some of the capabilities available on Amazon/Ebay/You Tube/Flickr can be facilitated for geosciences “data”.

4 Objectives 1. Create “integrated” data services across all stages of data life cycle, beginning with observations and ending with data curation/archiving. A) Observations/Sensors  Ingest  Data collection systems  Data providers  Disseminate  Users (both end users and data archival systems) B) From beginning till end of a workflow (LEAD example): Observations  Ingest  Analysis/Assimilation  Prediction  Output  Dissemination  Users (both end users and data repositories)

5 Imperatives Integrated services does not imply a monolithic system, but a set of modular services that are configurable, flexible, extensible, and scalable. Need think what [essential] services are needed by our users and the use cases. –Users include students, faculty, scientists, data providers, outreach providers, field project personnel –Use cases include class room & lab use, research studies, weather websites, field projects, projects like LEAD, portals, and data centers –Both programmatic and interactive invocation We may not work on all of the functionalities ourselves but we need to facilitate as many of the as possible.

6 Strategies and Tactics Integration achieved via both loosely and tightly coupled components and services Incrementalism is the only practical option for a program like Unidata where many technologies already exist and resources are scarce. Leverage as much as possible both our own technologies as well as what is available from the outside.

7 What do I mean by Data? Scientific data (binary, ASCII, netCDF, HDF, XML, GRIB, …and XML) Metadata (ASCII, XML, etc.) Data in data bases (e.g., SQL) GIS data (Shapefiles, KML, etc.) Derived products from scientific data Ancillary data objects (images, videos, documents – pdf, Word, html, ppt, etc.

8 Integration Capabilities Different data types (feeds, obs., platforms, model output, and GIS information) Different data formats Data on different projections Distributed data holdings Data operation (e.g., GDS, netCDF operators) Metadata addition Integration of scientific data with metadata content, documents, and other information

9 Not develop ourselves but perhaps provide hooks to Collaboration tools Wikis Forums Blogs Chat and IM/SMS Social network apps (Facebook, Twitter, etc.) RSS, email and other notifications

10 A list of possible data services Data collection service (routine ingest via LDM, FTP, etc.) Data submission service Metadata service for submitting, editing, and exchanging metadata Cataloging service Data discovery service Monitoring and notification service for new data, metadata, and products Data access services Data delivery/transport services, including copying/moving data to other servers and personal space and streaming data on demand Security and authentication services Subsetting service, including capability for progressive disclosure Aggregation services for data and metadata Services for CF conformance checking Decoding and data translation services Unit conversion services Visualization and product generation services Data fusion and data manipulation services (e.g., netCDF operator services) GIS services Output handling services

11 IMO, Beyond Unidata’s Scope Data mining Ontologies Brokering and workflow orchestration Federation and mediation Provenance Curation and stewardship

12 Final Thoughts It is important that we develop consensus on what we mean by integrated, end-to-end data services. Therefore, we need to hear your thoughts. Once we have an idea of what it is that we want to build, we need to agree on how to go about building it. Again, I believe in an incremental approach, but there may be other ideas. With RAMADDA, THREDDS, netCDF, and LDM, many of the pieces already exist upon which to build E2E data services. We need to identify the next steps and a concrete project in this potentially long journey. Develop a pilot effort? A prototype?


Download ppt "End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009."

Similar presentations


Ads by Google