End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009.

Vision
“Unidata’s vision calls for providing comprehensive, well-integrated and end-to-end data services for the geosciences. These include an array of functions for collecting, finding, and accessing data; data management tools for generating, cataloging, and exchanging metadata; and submitting or publishing, sharing, analyzing, visualizing, and integrating data.”
What does this vision statement mean to each of us?

Background
Providing real-time weather data (and related tools) was the primary reason Unidata was created, and it has been the bread and butter of Unidata’s mission for more than two decades.
But as our work has evolved, along with our community, it has become clear that merely providing real-time data or facilitating data access is not enough. Hence the vision statement.
Let’s think about how some of the capabilities available on Amazon/eBay/YouTube/Flickr can be made available for geoscience “data”.

Objectives
1. Create “integrated” data services across all stages of the data life cycle, beginning with observations and ending with data curation/archiving.
A) Observations/Sensors → Ingest → Data collection systems → Data providers → Disseminate → Users (both end users and data archival systems)
B) From the beginning to the end of a workflow (LEAD example): Observations → Ingest → Analysis/Assimilation → Prediction → Output → Dissemination → Users (both end users and data repositories)
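As a rough illustration of chain B, the stages can be thought of as small, independent steps that each take a record and pass it on. The sketch below is only that, under stated assumptions: the stage functions, the record fields (`station`, `temp_c`, `temp_k`, `stages`), and the Celsius-to-Kelvin step standing in for analysis are all hypothetical and do not come from LEAD or any Unidata software.

```python
from typing import Callable, Dict, List

# A stage is any callable that takes a record and returns a new record.
Stage = Callable[[Dict], Dict]


def ingest(record: Dict) -> Dict:
    # Hypothetical ingest step: note that the observation was received.
    return {**record, "stages": record.get("stages", []) + ["ingest"]}


def analyze(record: Dict) -> Dict:
    # Hypothetical analysis step: a unit conversion stands in for real analysis.
    return {
        **record,
        "temp_k": record["temp_c"] + 273.15,
        "stages": record["stages"] + ["analysis"],
    }


def disseminate(record: Dict) -> Dict:
    # Hypothetical dissemination step: mark the record as delivered to users.
    return {**record, "stages": record["stages"] + ["dissemination"]}


def run_workflow(record: Dict, stages: List[Stage]) -> Dict:
    # Apply each stage in order; every stage sees the previous stage's output.
    for stage in stages:
        record = stage(record)
    return record


# "STATION_A" is a made-up station identifier used only for this example.
result = run_workflow(
    {"station": "STATION_A", "temp_c": 20.0},
    [ingest, analyze, disseminate],
)
```

Because each stage only depends on the record it receives, a stage can be replaced or a new one added without touching the others, which is the sense of "integrated" used here.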

Imperatives
Integrated services do not imply a monolithic system, but a set of modular services that are configurable, flexible, extensible, and scalable.
We need to think about which [essential] services our users and their use cases require.
–Users include students, faculty, scientists, data providers, outreach providers, and field project personnel
–Use cases include classroom and lab use, research studies, weather websites, field projects, projects like LEAD, portals, and data centers
–Both programmatic and interactive invocation
We may not work on all of the functionality ourselves, but we need to facilitate as much of it as possible.

Strategies and Tactics
Integration is achieved via both loosely and tightly coupled components and services.
Incrementalism is the only practical option for a program like Unidata, where many technologies already exist and resources are scarce.
Leverage as much as possible both our own technologies and what is available from the outside.

What do I mean by Data?
Scientific data (binary, ASCII, netCDF, HDF, XML, GRIB, …)
Metadata (ASCII, XML, etc.)
Data in databases (e.g., SQL)
GIS data (Shapefiles, KML, etc.)
Derived products from scientific data
Ancillary data objects (images, videos, documents: PDF, Word, HTML, PPT, etc.)

Integration Capabilities
Different data types (feeds, obs., platforms, model output, and GIS information)
Different data formats
Data on different projections
Distributed data holdings
Data operation (e.g., GDS, netCDF operators)
Metadata addition
Integration of scientific data with metadata content, documents, and other information

Not develop ourselves but perhaps provide hooks to
Collaboration tools
Wikis
Forums
Blogs
Chat and IM/SMS
Social network apps (Facebook, Twitter, etc.)
RSS and other notifications

A list of possible data services
Data collection service (routine ingest via LDM, FTP, etc.)
Data submission service
Metadata service for submitting, editing, and exchanging metadata
Cataloging service
Data discovery service
Monitoring and notification service for new data, metadata, and products
Data access services
Data delivery/transport services, including copying/moving data to other servers and personal space and streaming data on demand
Security and authentication services
Subsetting service, including capability for progressive disclosure
Aggregation services for data and metadata
Services for CF conformance checking
Decoding and data translation services
Unit conversion services
Visualization and product generation services
Data fusion and data manipulation services (e.g., netCDF operator services)
GIS services
Output handling services
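To make the idea of separately invocable services a little more concrete, here is a minimal sketch of a registry in which two of the listed services (subsetting and unit conversion) are registered by name and called programmatically. Everything in it is an assumption made for illustration: the `SERVICES` registry, the `service` decorator, the `invoke` helper, and the service names are hypothetical and are not part of THREDDS, RAMADDA, netCDF, or LDM.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry mapping a service name to the function that implements it.
SERVICES: Dict[str, Callable[..., Any]] = {}


def service(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    # Decorator that registers a function under the given service name.
    def register(func: Callable[..., Any]) -> Callable[..., Any]:
        SERVICES[name] = func
        return func
    return register


@service("unit_conversion")
def celsius_to_kelvin(values: List[float]) -> List[float]:
    # Unit conversion service: Celsius to Kelvin only, for illustration.
    return [v + 273.15 for v in values]


@service("subset")
def subset(values: List[float], start: int, stop: int) -> List[float]:
    # Subsetting service: return the values with index in [start, stop).
    return values[start:stop]


def invoke(name: str, *args: Any, **kwargs: Any) -> Any:
    # Programmatic invocation: look the service up by name and call it.
    if name not in SERVICES:
        raise KeyError(f"unknown service: {name}")
    return SERVICES[name](*args, **kwargs)


temps_c = [10.0, 15.0, 20.0, 25.0]
window = invoke("subset", temps_c, 1, 3)    # keep the two middle values
kelvin = invoke("unit_conversion", window)  # convert only the subset
```

Chaining `subset` into `unit_conversion` shows how services on the list could be combined without either one knowing about the other; an interactive front end could call the same `invoke` entry point.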

IMO, Beyond Unidata’s Scope
Data mining
Ontologies
Brokering and workflow orchestration
Federation and mediation
Provenance
Curation and stewardship

Final Thoughts
It is important that we develop consensus on what we mean by integrated, end-to-end data services. Therefore, we need to hear your thoughts.
Once we have an idea of what it is that we want to build, we need to agree on how to go about building it. Again, I believe in an incremental approach, but there may be other ideas.
With RAMADDA, THREDDS, netCDF, and LDM, many of the pieces upon which to build E2E data services already exist.
We need to identify the next steps and a concrete project in this potentially long journey. Develop a pilot effort? A prototype?