Unidata Seminar Series - 30 January 2004 OPeNDAP and THREDDS: Access and Discovery of Distributed Scientific Data Yuan Ho Ethan Davis UCAR Unidata.


1 Unidata Seminar Series - 30 January 2004 OPeNDAP and THREDDS: Access and Discovery of Distributed Scientific Data Yuan Ho Ethan Davis UCAR Unidata

2 Access and Discovery of Distributed Scientific Data
OPeNDAP – access to scientific data, but no standard inventory or discovery mechanisms
THREDDS – cataloging, describing, and discovery of scientific data

3 What is OPeNDAP
OPeNDAP (Open source Project for a Network Data Access Protocol) is a protocol for accessing distributed scientific data (also known as DODS DAP).
OPeNDAP is a generic data exchange mechanism that lies at the core of a variety of discipline data systems.
OPeNDAP is two reference implementations of the protocol (C++ and Java).
OPeNDAP is a software framework that simplifies all aspects of scientific data networking, allowing simple access to remote data.
OPeNDAP is a community of users and developers.
OPeNDAP is a non-profit corporation called OPeNDAP Inc.

4 Design Principles
Users should be able to share their data via OPeNDAP over the network (server).
Users should be able to use their own application package to examine or analyze the data of interest (client).

5 Client/Server Interaction
Data access (client)
–Access to remote data in the user's normal application: IDL (win32), Matlab, Ferret, GrADS, any netCDF application, Excel
–No need to know the format in which the data is stored
–Can access data subsets
Data publishing (server)
–Network interface via HTTP
–DAP provides a common network representation for data
–Can serve data in various formats: netCDF, HDF, SQL, FreeForm, JGOFS, DSP
–Allows subsetting of data

6 OPeNDAP Status
OPeNDAP/DODS 3.4 release
OPeNDAP Java 1.1.3
OPeNDAP Data Connector 2.3X
OPeNDAP DAP Specification 4.0

7 OPeNDAP Data Objects
Three important OPeNDAP data objects:
–DDX: an XML representation of the structure of all or part of a dataset, as well as a description of the variables within that dataset.
–Blob: binary data transferred from the data source to the client. The Blob contains the serialized data represented by the DDX.
–ErrorX: an XML document containing information about any errors that the server may have encountered while processing a request.

8 DDX Example
<Datasets name="fnoc1.nc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.opendap.org/ns/OPeNDAP" xsi:schemaLocation="http://www.opendap.org/ns/OPeNDAP http://dods.coas.oregonstate.edu:8080/opendap/opendap.xsd">
Fleet Numerical Wind Data U_Wind_Vector http://dcz.opendap.org/dap/data/nc/fnoc1.nc?u

9 Variables and Attributes
Each variable consists of a name, a type, a value, and a collection of attributes.
–Atomic variables: atomic data types are indivisible. They include integer, floating-point, string, and binary images.
–Constructor variables: a constructor variable is assembled from collections of other variables, including both atomic and constructor types. They include array, structure, grid, and sequence.

10 Variables and Attributes
An attribute is composed of a name, a type, and a value.
–Each variable may have zero or more attributes.
–Types: Boolean, Byte, IntXX, UIntXX, FloatXX, String, URL.
–Example: 18 Mar 03 GOES 898976
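
The variable and attribute model described on these two slides can be sketched as a few Python data classes. This is only an illustration of the description above, not the OPeNDAP reference implementation: the class names, the `members` field, the helper `atomic_names`, and the sample variable names and values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Attribute:
    """An attribute: a name, a type, and a value."""
    name: str
    type: str          # e.g. "String", "Float32" (illustrative type names)
    value: Any


@dataclass
class Variable:
    """A variable: a name, a type, a value, and zero or more attributes."""
    name: str
    type: str
    value: Any = None
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Structure(Variable):
    """A constructor variable assembled from other variables."""
    members: List[Variable] = field(default_factory=list)


def atomic_names(var: Variable) -> List[str]:
    """Collect the names of all atomic variables reachable from var."""
    if isinstance(var, Structure):
        names: List[str] = []
        for member in var.members:
            names.extend(atomic_names(member))
        return names
    return [var.name]


# Hypothetical sample data: a structure holding two atomic variables.
station = Structure(
    name="station",
    type="Structure",
    members=[
        Variable("temperature", "Float32", 12.5,
                 [Attribute("units", "String", "degC")]),
        Variable("depth", "Float32", 10.0),
    ],
)
print(atomic_names(station))
```

Modeling constructor types as variables that contain other variables mirrors the slide's point that constructors can nest both atomic and constructor types.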

11 Requests/Responses
Responses: four categories of information pass from the server to the client
–Information about the data: DDX
–The data: Blob
–Error messages: ErrorX object
–Information about the server: version messages and server capabilities document
Requests: a constraint expression provides a way for a client to request certain information from a dataset, such as certain variables, or parts of certain variables.
–Projection clause: a collection of one or more project elements
–Selection clause: one or more select elements.
–Example: 34.0” target=“sample”/>
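
To make the projection/selection split concrete, here is a minimal sketch that assembles a constraint expression in the URL query form used by DAP 2 (a comma-separated projection followed by `&`-prefixed selection clauses). That form is an assumption: the example on this slide is in an XML form and is truncated in the transcript. The function names and the `lat`/`lon` variables are hypothetical; the dataset URL is the one shown in the DDX example. No URL escaping is applied.

```python
from typing import Iterable


def build_constraint(projection: Iterable[str],
                     selection: Iterable[str] = ()) -> str:
    """Join projection and selection clauses into one constraint expression."""
    proj = ",".join(projection)                       # which variables (or parts)
    sel = "".join("&" + clause for clause in selection)  # which values to keep
    return proj + sel


def build_request_url(dataset_url: str,
                      projection: Iterable[str],
                      selection: Iterable[str] = ()) -> str:
    """Append the constraint expression to the dataset URL, if there is one."""
    ce = build_constraint(projection, selection)
    return f"{dataset_url}?{ce}" if ce else dataset_url


# Projection only: request the variable u from the dataset.
print(build_request_url("http://dcz.opendap.org/dap/data/nc/fnoc1.nc", ["u"]))

# Projection plus one selection clause (hypothetical variable names).
print(build_constraint(["lat", "lon"], ["lat>34.0"]))
```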

12 Problems of searching and retrieving datasets from an OPeNDAP server
Metadata
–Use metadata: metadata at the data level
–Search metadata: metadata at the directory level
OPeNDAP has been built up from the data level, with high functionality at the data acquisition level.
OPeNDAP AIS (Ancillary Information Service) adds metadata information to the OPeNDAP data stream. The role of ancillary data is to aid translation of and access to data.
ODC is more of a directory service, with limited data searching functionality.

13 Summary of OPeNDAP
The OPeNDAP data delivery architecture provides remote access to data via the Internet.
OPeNDAP uses HTTP (FTP, GridFTP, Telnet, et cetera) to transport its data objects.
OPeNDAP has proved very versatile.
XML is used for the persistent form of the data objects.
OPeNDAP is a data access tool; it needs a data discovery tool to complement it.

14 THREDDS Project
Develop a framework to bridge the gap between data providers and data users, to make scientific data discoverable and usable as well as referenceable from scientific publications and educational materials.
The framework should be:
–Scalable for large and small projects
–Easy to use yet powerful and flexible
–Capable of supporting various user interfaces

15 THREDDS Catalogs
Hierarchical structure of datasets
Dataset access methods
Structure on which to hang (reference) metadata
THREDDS catalogs are for communicating information about a collection of datasets

17 THREDDS Catalogs
<metadata metadataType="DublinCore" xlink:href="http://server/dods/eta.xml" />
<access serviceType="DODS" urlPath="http://server/dods/2003092412_eta.nc" />
…
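
As a rough illustration of how a client might read the two elements shown on this slide, the sketch below parses them with Python's standard `xml.etree.ElementTree`. The enclosing `<dataset name="example">` wrapper and the `summarize` helper are hypothetical, added only so the fragment is well-formed. The sketch also assumes the elements carry no default XML namespace; a real catalog may declare one, in which case the tag lookups would need the namespace prefix.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# The two elements from the slide, wrapped in a hypothetical parent element
# that declares the xlink prefix.
CATALOG_FRAGMENT = """
<dataset name="example" xmlns:xlink="http://www.w3.org/1999/xlink">
  <metadata metadataType="DublinCore" xlink:href="http://server/dods/eta.xml" />
  <access serviceType="DODS" urlPath="http://server/dods/2003092412_eta.nc" />
</dataset>
"""


def summarize(xml_text: str) -> dict:
    """Return the metadata references and access methods found in the fragment."""
    root = ET.fromstring(xml_text)
    metadata = [
        (m.get("metadataType"), m.get(f"{{{XLINK}}}href"))
        for m in root.iter("metadata")
    ]
    access = [
        (a.get("serviceType"), a.get("urlPath"))
        for a in root.iter("access")
    ]
    return {"metadata": metadata, "access": access}


summary = summarize(CATALOG_FRAGMENT)
print(summary["metadata"])
print(summary["access"])
```

The split into metadata references and access methods follows the slide's distinction between the structure on which metadata hangs and the dataset access methods.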

22 THREDDS Catalogs
NCEP Eta 80km CONUS model data
NOAA/NCEP
NCEP Eta Model data; Real-time data
This collection of real-time NOAA/NCEP Eta model data contains five days' worth of data. The data is on an 80km CONUS grid (GRIB grid 211). Daily 00Z and 12Z runs are available, where each dataset includes analysis data and forecast data from a single Eta run. Each dataset contains forecasts for every 6 hours going out two and a half days (60hrs) from the run time.
…

24 THREDDS DQC (Dataset Query Capabilities)
THREDDS DQC documents describe how a subset of a data collection can be requested.
–Large and time-varying data collections are cumbersome to view as a hierarchical structure
THREDDS DQC documents describe the set of requests that can be made to one or more DQC services and the form of those requests.
THREDDS DQC documents are an abstract representation of a collection of datasets

25 THREDDS DQC Subsetting Large Collections

26 THREDDS DQC
<query base="http://motherlode.ucar.edu/cgi-bin/thredds/RadarServer.pl" construct="append" returns="catalog"/>
…
<choice name=".5 reflectivity.54nm res" value="N0R" description=".5 reflectivity.54nm res 16 levels id 19/r"/>
…
…

27 THREDDS Services
THREDDS catalogs are sources of information about a collection of data on top of which complex services can be built. For instance, tools that:
–Provide interoperability with GIS systems
–Supply external discovery systems with needed information (e.g., Dublin Core, DIF, FGDC)
–Supply information to improve data display and analysis, e.g., geolocation information

28 THREDDS and Discovery Systems
To supply external discovery services with the information they require, we need:
–The proper information added to a catalog, e.g., title and description of a dataset, spatial and temporal ranges, parameters, dataset ID.
–Service to provide metadata in desired encoding
–Service to feed information to discovery system
Use discovery systems to search for data
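
One way to picture the "service to provide metadata in desired encoding" is a small generator that turns catalog fields into Dublin Core elements. The sketch below is an assumption-laden illustration, not the THREDDS implementation: the `FIELD_MAP` mapping from catalog fields to Dublin Core element names, the `to_dublin_core` function, the `<metadata>` wrapper element, and the sample dataset ID are all hypothetical. Only the Dublin Core element set namespace and the element names `title`, `description`, `identifier`, `subject`, and `coverage` come from the Dublin Core standard; the title and grid text come from the catalog example in this talk.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)  # serialize with the conventional dc: prefix

# Hypothetical mapping from catalog fields to Dublin Core element names.
FIELD_MAP = {
    "title": "title",
    "description": "description",
    "dataset_id": "identifier",
    "parameters": "subject",
    "spatial_range": "coverage",
    "temporal_range": "coverage",
}


def to_dublin_core(record: dict) -> str:
    """Encode the catalog fields present in record as Dublin Core XML."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for field_name, dc_name in FIELD_MAP.items():
        value = record.get(field_name)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            ET.SubElement(root, f"{{{DC}}}{dc_name}").text = str(item)
    return ET.tostring(root, encoding="unicode")


record = {
    "title": "NCEP Eta 80km CONUS model data",
    "dataset_id": "example/eta_211",            # hypothetical ID
    "spatial_range": "80km CONUS grid (GRIB grid 211)",
}
dc_xml = to_dublin_core(record)
print(dc_xml)
```

Keeping the mapping in a single table makes it easy to add other encodings (for example DIF or FGDC) as separate tables without touching the catalog itself.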

29 THREDDS and Discovery Systems
[Diagram: Communicate with Discovery Systems. Components: data server, THREDDS services with data server, catalog, Dublin Core generator, metadata repository, metadata harvester, discovery system (e.g., DLESE). Arrow labels: reads, writes, references, searches.]

30 Search and Discovery Services

31 THREDDS Status
Working on new versions of the catalog and DQC schemas
Working on updating existing tools to use new schemas
Working with UCAR DMWG and NCAR CDP on enhancing descriptive metadata
Working with OPeNDAP developers on integrating THREDDS and OPeNDAP

32 OPeNDAP and THREDDS
Enhance the OPeNDAP C++ implementation to serve THREDDS catalogs
Replace OPeNDAP File Servers with THREDDS DQC

33 OPeNDAP and THREDDS More Information
OPeNDAP Web page: http://www.unidata.ucar.edu/packages/dods/
OPeNDAP Email list: dods@unidata.ucar.edu, subscribe at http://www.unidata.ucar.edu/packages/dods/home/mailLists/
THREDDS Email list: thredds@unidata.ucar.edu, subscribe at http://www.unidata.ucar.edu/projects/THREDDS/maillists/
THREDDS Web page: http://www.unidata.ucar.edu/projects/THREDDS/
Support questions: support@unidata.ucar.edu

