Published by Avice Gaines. Modified over 9 years ago.
1
ECMWF WMO Metadata Workshop – Beijing Sep 2005
Experience with the WMO core metadata in the SIMDAT/VGISC project
Baudouin Raoult, ECMWF
2
The SIMDAT/VGISC project
- SIMDAT: EU-funded Grid project
  - 7 technologies: Grid infrastructure, Virtual Organisation, Ontologies, Analysis Services, Workflows, Distributed data access, Knowledge Services
  - 4 activities: Automotive, Aerospace, Pharmacy and Meteorology
- Meteorology activity: build a Virtual GISC (V-GISC)
  - Partners: DWD, UKMO, Météo-France, EUMETSAT, ECMWF
3
V-GISC infrastructure
4
V-GISC conceptual view
- Through the Distributed Portal, users search for and retrieve data and subscribe to services, subject to authentication and authorization
- The Virtual Database Service provides a single view of the partners' databases
5
VGISC Distributed Architecture
6
VGISC Node Functional Design
7
Why do we need metadata (in this project)?
- Create a catalogue (discovery metadata)
  - Searchable (keyword, geographical location, time range)
  - Browsable (directory hierarchy)
- Implement the V-GISC (service metadata)
  - Describe where the data resides (physical location)
  - Describe how to request the data
  - Describe the data format (useful for offering a list of transformations, e.g. sub-sampling of gridded data, plots or format conversions)
  - Describe associated data policies
8
Study of the WMO core
- Starting point
  - XML files available on the WMO web site
  - XML files from an earlier DWD prototype
- Trying to describe the ECMWF archive (1.3 × 10^10 GRIB fields)
9
XML root element
[Two alternative root elements, separated by "or"; markup not preserved]
- Namespaces are a nightmare to use (especially with XPath when there is a default namespace)
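The default-namespace difficulty can be reproduced in a few lines. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the document, the element names and the namespace URI `http://example.org/wmo-core` are placeholders invented for illustration, not taken from the WMO core profile.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a default namespace (URI and element names are placeholders).
DOC = """<Metadata xmlns="http://example.org/wmo-core">
  <identificationInfo>
    <title>Synop Heathrow</title>
  </identificationInfo>
</Metadata>"""

root = ET.fromstring(DOC)

# An unqualified path matches nothing: every element lives in the default namespace.
unqualified = root.findall(".//title")

# Binding a prefix of our own choosing to the namespace URI makes the same path match.
NS = {"m": "http://example.org/wmo-core"}
qualified = root.findall(".//m:title", NS)

print(len(unqualified), [t.text for t in qualified])
```

The prefix `m` exists only in the query; the document itself never declares it, which is what makes this behaviour easy to trip over.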
10
XML keywords
- Russian Federation, Moscow region, Temperature, Clouds, Meteorology, Observation, Pressure, Rainfall, Snow, Snowfall, Weather, Wind, Phenomenon
- Or… EARTH SCIENCE > Cryosphere > Sea Ice; EARTH SCIENCE > Atmosphere; EARTH SCIENCE > Oceans; EARTH SCIENCE > Solid Earth; ocean, atmosphere, ice, land
- Or… METAR aviation hourly weather observation temperature dew point precipitation amount visibility cloud amount type height weather runway colour state
11
XML geographical extent
- 50.78 6.1
- Or… CCCC2
- Or… -126.3 39.9
12
XML temporal extent
- 0100-01-01 0299-12-31 monthly daily
- Or… 2004-02-05T00:00:00 2004-02-05T06:00:00
- Or… 2004-01-28 creationDate
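Handling such variants means normalising them to one internal form before indexing. The sketch below, in Python, reduces a list of one or two date strings to a (begin, end) pair; the function names are hypothetical, and treating a single date as a zero-length period is an assumption, not something the slides specify.

```python
from datetime import datetime

def parse_instant(text):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS' into a datetime."""
    for fmt in ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")

def normalise_extent(values):
    """Turn one or two date strings into a (begin, end) pair.

    Assumption: a single instant is stored as a zero-length period.
    """
    instants = [parse_instant(v) for v in values]
    return min(instants), max(instants)

print(normalise_extent(["2004-02-05T00:00:00", "2004-02-05T06:00:00"]))
print(normalise_extent(["2004-01-28"]))
```

Each new encoding found in the wild would need another entry in the format list, which illustrates the programming cost of a very flexible standard.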
13
Repetition of XML elements (means extension)
3.5 992.5 mb -180 +180 -90 +90 Global 1900-01-01 1999-12-31 monthly daily
14
Repetition of XML elements (means redefinition)
- Global Grid 2.5 degree latitude and 2.5 degree longitude steps, 6 sectors, one sector per GRIB bulletin; Sector S: -180 -60 0 90
- Global Grid 2.5 degree latitude and 2.5 degree longitude steps, 6 sectors, one sector per GRIB bulletin; Sector T: -60 60 0 90
15
Findings
- A flexible format, which leads to a lack of consistency
  - Different ways to encode geographical extents, keywords and temporal extents
- Missing information (for the V-GISC)
  - To create a directory
  - To locate the data
  - To create retrieval requests
  - To describe available transformations
  - To implement data policies
16
Findings (cont.)
- Seems to be designed for human consumption
  - Free text in XML elements
- Not scalable
  - Some documents may change frequently (hourly?)
  - Some documents are orders of magnitude larger than the data itself
  - Cannot represent very large archives with small granularity
17
SIMDAT/VGISC problem
- Each site has its own practices
  - We have to be ready for variability in the XML
  - We will have to handle XML from other WMO programmes
- We need to handle tens of thousands of documents
  - A lot of repeated information
  - We need fast search
- We need to automatically
  - Index the keywords, the geographical extent and the temporal extent
  - Create a browsable directory (similar to NCAR’s Community Data Portal)
  - Locate and retrieve the data
  - Implement the data policy
18
Solution: split XML documents into fragments
- WMO core metadata is structured
- Some parts are shared amongst many documents
  - All metadata share the Core part
  - All UKMO metadata share the Owner part
  - All synops (should) share the same description
  - All observations at Heathrow have the same location
  - The date part is variable but is very small
- Example fragments: WMO (Core), UKMO (Owner), Synop (Data type), Heathrow (Station, geographical extent), 2005-10-12 (Date, temporal extent)
19
XML fragments are hierarchically linked
[Diagram: fragments WMO, UKMO, Synop, Heathrow, "Heathrow Synop" and "Heathrow Synop 2005-10-12" linked in a hierarchy]
20
Fragments: advantages
- Factorizing commonalities into static fragments
  - Reduces size of XML documents
  - Indexation done once
  - Avoids redundancy of information
  - Faster searches
- Frequently updated documents are small
  - Manageable
  - Scalable
- Complete XML document can be rebuilt
  - For exchange outside the V-GISC
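Rebuilding a complete document from linked fragments can be sketched as a walk from the leaf to the root followed by a merge. The Python below is an illustration only: the fragment identifiers, the field names and the dict-based store are hypothetical, and it assumes a single parent per fragment, which is simpler than the hierarchy drawn on the slides.

```python
# Hypothetical in-memory store: fragment id -> (parent id, fields contributed by this fragment).
FRAGMENTS = {
    "wmo": (None, {"standard": "WMO core"}),
    "ukmo": ("wmo", {"owner": "UKMO"}),
    "synop": ("ukmo", {"data_type": "Synop"}),
    "heathrow_synop": ("synop", {"station": "Heathrow"}),
    "heathrow_synop_20051012": ("heathrow_synop", {"date": "2005-10-12"}),
}

def rebuild(fragment_id, fragments=FRAGMENTS):
    """Merge a fragment with all its ancestors into one complete document."""
    chain = []
    while fragment_id is not None:
        parent, fields = fragments[fragment_id]
        chain.append(fields)
        fragment_id = parent
    document = {}
    for fields in reversed(chain):  # root first, so more specific fragments override
        document.update(fields)
    return document

print(rebuild("heathrow_synop_20051012"))
```

Only the small date fragment changes from day to day; the four static ancestors are stored and indexed once.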
21
Indexing of XML fragments
[Diagram: the same fragment hierarchy, with fragments indexed by Keywords, Geographical Extent and Temporal Extent]
22
Prototype implementation
- XML fragments are stored as “text”
  - Fragment table
  - Hierarchy table
- Indexed at insertion time
  - Keywords table
  - Locations table
  - Periods table
  - Directory table
- Implemented with MySQL
  - With OpenGIS extension
  - With text search extension
- Indexes are “inherited”
  - OO approach
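The table layout and the idea of inherited indexes can be illustrated with a small relational sketch. The example below uses Python's built-in `sqlite3` rather than MySQL so that it runs without a server; the table and column names are assumptions, only the keyword index is shown, and the recursive query is one possible way to express inheritance, not necessarily how the prototype does it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fragment  (id TEXT PRIMARY KEY, xml TEXT NOT NULL);
CREATE TABLE hierarchy (child TEXT NOT NULL, parent TEXT NOT NULL);
CREATE TABLE keyword   (fragment_id TEXT NOT NULL, word TEXT NOT NULL);
""")

def insert_fragment(frag_id, xml, parent=None, keywords=()):
    """Store the fragment as text and index its own keywords at insertion time."""
    conn.execute("INSERT INTO fragment VALUES (?, ?)", (frag_id, xml))
    if parent is not None:
        conn.execute("INSERT INTO hierarchy VALUES (?, ?)", (frag_id, parent))
    conn.executemany("INSERT INTO keyword VALUES (?, ?)",
                     [(frag_id, w.lower()) for w in keywords])

def search(word):
    """Return ids of fragments that carry the keyword or inherit it from an ancestor."""
    rows = conn.execute("""
        WITH RECURSIVE hit(id) AS (
            SELECT fragment_id FROM keyword WHERE word = ?
            UNION
            SELECT h.child FROM hierarchy h JOIN hit ON h.parent = hit.id
        )
        SELECT id FROM hit ORDER BY id
    """, (word.lower(),)).fetchall()
    return [r[0] for r in rows]

insert_fragment("synop", "<fragment/>", keywords=["Synop", "Observation"])
insert_fragment("heathrow_synop", "<fragment/>", parent="synop", keywords=["Heathrow"])
insert_fragment("heathrow_synop_20051012", "<fragment/>", parent="heathrow_synop")

print(search("synop"))
print(search("heathrow"))
```

The daily fragment is inserted without any keywords of its own, yet a search for either keyword still finds it through its ancestors.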
23
Object Oriented Approach - Behaviours
[Diagram: the fragment hierarchy annotated with behaviours: Index as geography, Index as keyword, Index as period]
24
Fragment properties - Behaviours
- Only the owner of the data knows how to:
  - Describe the data (indexation information)
  - Request the data (create internal request)
  - Extract a subset of the data (define an interface to extract a subset)
- Ancillary metadata can be associated with each fragment to describe how to index, request and sub-select the data
- Behaviours are inherited
  - Object-oriented approach
25
Behaviours example: indexing
//identificationInfo/descriptiveKeywords
//identificationInfo/dataExtent/geographicElement/boundingBox
//identificationInfo/dataExtent/geographicElement/polygon
//identificationInfo/referenceDate/date
//identificationInfo/dataExtent/temporalElement
//identificationInfo/referenceDate/period
//identificationInfo/topicCategory
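An indexing behaviour of this kind can be thought of as a table mapping XPath expressions to the index they feed. The sketch below applies three of the expressions listed above with Python's `xml.etree.ElementTree`; the assignment of each path to a "keyword" or "period" index, the sample document and its child element names are assumptions made for illustration, and the sample deliberately has no namespace.

```python
import xml.etree.ElementTree as ET

# Assumed behaviour table: index kind -> XPath expressions feeding that index.
DEFAULT_RULES = {
    "keyword": ["//identificationInfo/descriptiveKeywords",
                "//identificationInfo/topicCategory"],
    "period": ["//identificationInfo/dataExtent/temporalElement"],
}

def index_fragment(xml_text, rules=DEFAULT_RULES):
    """Collect the text found under each rule's paths, grouped by index kind."""
    root = ET.fromstring(xml_text)
    index = {}
    for kind, paths in rules.items():
        for path in paths:
            # ElementTree only accepts relative paths, hence the leading "."
            for element in root.findall("." + path):
                text = " ".join(" ".join(element.itertext()).split())
                if text:
                    index.setdefault(kind, []).append(text)
    return index

# Hypothetical fragment; the child element names are invented.
SAMPLE = """<Metadata><identificationInfo>
  <descriptiveKeywords><keyword>Temperature</keyword><keyword>Wind</keyword></descriptiveKeywords>
  <topicCategory>climatologyMeteorologyAtmosphere</topicCategory>
  <dataExtent><temporalElement><begin>2004-02-05</begin><end>2004-02-06</end></temporalElement></dataExtent>
</identificationInfo></Metadata>"""

print(index_fragment(SAMPLE))
```

A child fragment could inherit `DEFAULT_RULES` and override a single entry, for example with `{**DEFAULT_RULES, "period": [...]}`, which mirrors the inheritance of behaviours described on the previous slide.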
26
V-GISC extension
- An element from the “http://www.vgisc.org/” namespace is embedded in all the fragments
- It contains all the information needed to implement the V-GISC that is not defined by the WMO core because it is not relevant outside the scope of the V-GISC
  - Internal unique ID
  - Hierarchy relationship
  - Physical location (which V-GISC node holds the data)
  - Information used to create data requests
  - Information used to create web pages
- It is removed when the full XML document is recomposed for use outside the V-GISC
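Removing the extension before export amounts to dropping every element that belongs to the V-GISC namespace. A minimal Python sketch follows; the namespace URI comes from the slide, while the element names `extension` and `id`, the prefix `v` and the surrounding document are hypothetical.

```python
import xml.etree.ElementTree as ET

VGISC_NS = "http://www.vgisc.org/"

def strip_extension(root):
    """Remove, in place, every element belonging to the V-GISC namespace."""
    for parent in list(root.iter()):
        for child in list(parent):
            # ElementTree stores qualified names as "{namespace-uri}localname"
            if child.tag.startswith("{" + VGISC_NS + "}"):
                parent.remove(child)
    return root

# Hypothetical fragment; only the namespace URI is taken from the slide.
FRAGMENT = """<Metadata xmlns:v="http://www.vgisc.org/">
  <v:extension><v:id>urn:akrotiri</v:id></v:extension>
  <title>Akrotiri</title>
</Metadata>"""

cleaned = strip_extension(ET.fromstring(FRAGMENT))
print([child.tag for child in cleaned])
```

Keeping all local information in one namespaced subtree makes this export step a single pass over the document.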
27
Fragment example
http://www.vgisc.org/
urn:akrotiri.synop.land.second.record.20050629
urn:akrotiri
urn:int.wmo.synop.land.second.record
ecmwf.obs
2005-06-29
28
Variables and requests
- Some datasets have too many items
  - Impossible to describe every one of them
  - But describing the whole dataset is simple
- Some datasets are very homogeneous
  - E.g. same parameters for a long period of time
  - This can be described in a compact form (… and …)
  - But we still need to specify that individual dates can be requested by the user
29
Variables and requests (cont.)
- Associate two elements with an XML fragment:
  - One holds information on how to generate a valid request to the data repository
  - The other holds information on how to create a web interface that lets the user select items from the dataset
- Web portal
  - We use the WMO core for discovery
  - We use the second of these elements to present selection dialogues to the user
30
Fragment example: ECMWF Reanalysis
http://www.vgisc.org/
urn:int.ecmwf.era40.sfc
urn:int.wmo.core
ecmwf.mars
e4 sfc marser
1980-01-01 1990-12-31
2t msl
0000 0600 1200 1800
ECMWF 40 Years reanalysis ERA40 ERA-40 in GRIB
NWP Outputs > ECMWF > 40 years reanalysis
1980-01-01 1990-12-31
…
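The values in this fragment show how a compact description can still support requests for individual items. The Python sketch below expands a begin/end pair into selectable dates and validates a user selection; the dictionary keys, the function names and the `key=value` request format are invented for illustration and are not the request syntax of any real archive.

```python
from datetime import date, timedelta

# Hypothetical compact description, using the values of the ERA-40 fragment above.
DATASET = {
    "begin": date(1980, 1, 1),
    "end": date(1990, 12, 31),
    "params": ["2t", "msl"],
    "times": ["0000", "0600", "1200", "1800"],
}

def selectable_dates(dataset):
    """Yield every individual date covered by the compact begin/end pair."""
    day = dataset["begin"]
    while day <= dataset["end"]:
        yield day
        day += timedelta(days=1)

def build_request(dataset, day, param, time):
    """Validate a user selection and return an illustrative key=value request string."""
    if not dataset["begin"] <= day <= dataset["end"]:
        raise ValueError("date outside dataset")
    if param not in dataset["params"] or time not in dataset["times"]:
        raise ValueError("unknown parameter or time")
    return f"date={day:%Y%m%d},param={param},time={time}"

print(sum(1 for _ in selectable_dates(DATASET)))
print(build_request(DATASET, date(1985, 6, 1), "2t", "1200"))
```

Two dates, two parameters and four times describe tens of thousands of retrievable items without listing any of them in the metadata.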
31
Directory structure
- Problem: create a browsable hierarchy of topics, like the “Google directory” (see NCAR’s Community Data Portal)
  - Not to be confused with the internal “fragment hierarchy”, which is not exposed to the end user
- Currently using the … element
  - NWP Outputs > ECMWF > 40 years reanalysis
- The same product can appear in several locations of the directory
  - Observations > By Type > Profile > Temp Land
  - Observations > By Region > Asia > China
- Usage should be recommended by WMO
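Topic paths written with ">" separators translate directly into a tree. The following Python sketch builds a nested-dict directory from the paths quoted above; the function name, the reserved `_products` key and the product identifier `temp-china` are hypothetical.

```python
def add_path(tree, path, product):
    """Insert a product under an 'A > B > C' topic path in a nested-dict tree."""
    node = tree
    for topic in (part.strip() for part in path.split(">")):
        node = node.setdefault(topic, {})
    # "_products" is a reserved key (an assumption of this sketch), not a topic name.
    node.setdefault("_products", []).append(product)

directory = {}
add_path(directory, "NWP Outputs > ECMWF > 40 years reanalysis", "urn:int.ecmwf.era40.sfc")
# The same (hypothetical) product filed under two different branches.
add_path(directory, "Observations > By Type > Profile > Temp Land", "temp-china")
add_path(directory, "Observations > By Region > Asia > China", "temp-china")

print(sorted(directory))
print(sorted(directory["Observations"]))
```

Because the tree is derived purely from strings, two sites that spell a topic differently end up in different branches, which is why an agreed topic hierarchy matters.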
32
Conclusion
- The approach taken in the V-GISC should help us support the large variety of XML documents
- Nevertheless, the standard is too flexible
  - A lot of programming is required to support all possible variations
- The WMO must provide “best practices” guidelines
  - How to encode a point in time, how to encode ranges, …
  - A topic hierarchy must be defined, to create the directory
- WMO core metadata need only contain sufficient information for discovery
  - The rest can be implemented as a series of local extensions, as long as they are not exported or exchanged