1
METADATA from observation to its use
Dr Esa Falkenroth, Information Architect, SMHI
1st Data Provider Workshop, St Petersburg, November 2016
2
three perspectives - producer - infra-structure - users
- Create data sets
- Format data sets
- Provide data sets online
- Write abstract
- Geo-relate data sets
- Classify data sets
- Enter metadata
- Maintain metadata
- Develop metadata standard
- Develop classification system
- Locate datasets matching keyword
- Locate datasets matching geo
- Listing matching data sets
- Show data sample
- Search using classification
- Search based on geolocation
- Search based on keyword
- Pick data sets
- Provide download service
- Download dataset
- Open dataset
- Understand dataset
- Use data set
3
[Diagram: the same task list as on slide 2]
4
SOMEBODY ELSE'S RESPONSIBILITY
Producer view (all very busy): “research is done”, “can’t update all portals.”
5
digital infra-structure
Infra-structure view (all very busy): “I don’t know the data…”, “..OGC, XML, WFS, HDF!” SOMEBODY ELSE
6
SOMEBODY ELSE
User view (all very busy): “I just want to search, download and use the data”
7
digital infra-structure
MIND THE GAP. Helicopter view: producer, digital infra-structure, user.
8
digital infra-structure
MIND THE GAP. Helicopter view: producer, digital infra-structure, user.
Who writes metadata for old, inactive projects?
Who should make the classification: producer, user or “mediators”?
9
“where is that book?” … before the librarians
not a new problem…
10
We can do better with metadata for open data. “Somebody else's problem” does not help. Collaborate with providers. NOT a software issue, just hard work.
11
SWITCH-ON METADATA LIBRARY
Diagram labels: producer, UPLOAD TOOL, SWITCH-ON METADATA LIBRARY, SEARCH TOOL, user, DOWNLOAD, DOWNLOAD (SVN). MIND THE GAP.
12
SWITCH-ON METADATA LIBRARY
FILLING METADATA
Diagram labels: producer, UPLOAD TOOL, SWITCH-ON METADATA LIBRARY (Maintain ontologies, Catalogue resources, Create metadata), SEARCH TOOL, “BYOD”, user. MIND THE GAP.
Hydrologists work during the summer period to improve abstracts, geographical information and licence information for data sets by contacting data providers.
13
8300 resources usable metadata
- varying formats
- many licences
- ways to access
- being added to GEOSS
LICENSE (chart categories, percentages missing): free, copyright, acknowledgement, science, special, non-commercial
ACCESS (chart): direct download 78 %, request (percentage missing), viewing (percentage missing), other 10 %
- Downloadable datasets (direct)
- Request datasets (require registration)
- View services (no download)
- Download services (e.g. ftp servers)
- Other websites (with open data)
FORMAT (chart, SWITCH-ON datasets): hdf, netcdf (percentage missing), asc, txt 7 %, dat 9 %, excel 24 %, shape 9 %, html 10 %
14
SWITCH-ON innovation in usable metadata search
- Innovative (usable) classification
- Innovative (usable) geospatial data
- Innovative (usable) interface
- Innovative budget (0.2 % for “librarian work”)
- Extend/correct incomplete or missing abstracts
- More detailed spatial coverage for point sources
- Reclassification for water-science (user perspective)
15
classification problems
(1) Generic themes give many “hits” (not specific); a generic portal has generic keywords…
- GEOSS Water ( hits)
- GEOSS Climate (24436 hits)
- GEOSS Agriculture (11866 hits)
(2) Producers and users use different sets of keywords.
- Producer: WFS, realtime portal, operational data store
- User: mass fraction pm2p5 nitrate dry aerosol, runoff
(3) Neither the producer, the user nor the mediators necessarily have the “whole picture” needed to make a good classification. Here, user communities can help develop usable classifications (that work for search).
16
usability-driven classification
Resources are catalogued based on how the users will search instead of using the producers' terms.
Balancing “specialisation-degree”:
- Too specific (zero hits)
- Too generic (too many hits)
- Good enough ( hits)
SWITCH-ON extended the well-known CUAHSI ontology for the hydrosphere with additional keywords to cover land-use and population data.
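The balancing idea can be sketched in a few lines of Python. This is only an illustration, not the SWITCH-ON implementation: the thresholds `MIN_HITS` and `MAX_HITS`, the keyword path and all hit counts except the GEOSS Climate figure from slide 15 are invented for the example, because the slide's own "good enough" number is missing.

```python
# Hypothetical thresholds: the slide does not state what counts as "good enough".
MIN_HITS = 1
MAX_HITS = 50


def classify_hits(hits, min_hits=MIN_HITS, max_hits=MAX_HITS):
    """Label a keyword by the number of search hits it returns."""
    if hits < min_hits:
        return "too specific"
    if hits > max_hits:
        return "too generic"
    return "good enough"


def pick_keyword(path, hit_counts):
    """Return the most specific keyword on `path` with a usable hit count.

    `path` is ordered from generic to specific; keywords without a known
    count are treated as having zero hits. Returns None if nothing fits.
    """
    for keyword in reversed(path):
        if classify_hits(hit_counts.get(keyword, 0)) == "good enough":
            return keyword
    return None


# Invented keyword path and hit counts, for illustration only.
path = ["water", "surface water", "runoff", "hourly runoff at one gauge"]
hit_counts = {"water": 12000, "surface water": 900, "runoff": 35}

print(classify_hits(24436))            # the GEOSS Climate count from slide 15
print(pick_keyword(path, hit_counts))
```

Walking the path from the specific end means a dataset is tagged as narrowly as possible while still being findable.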
17
Good fit with GEOSS DAB thesaurus
18
Usable spatial search and the world box problem
Bounding boxes are great for describing coverage of maps and gridded data. However, for in-situ data, bounding boxes give false positives. More detailed spatial resolution, with individual in-situ positions for datasets, facilitates search on a local or regional scale. Technically, bounding boxes / polygons are replaced with multipoint coverage.
19
Usable spatial search and the world box problem
What happens if a user searches for their area of interest? The search box matches the bounding box of the data, but the dataset found contains no relevant data.
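The world box problem can be reproduced with a small Python sketch. The station coordinates and the search box are invented for the example, and the `Box` class is a hypothetical helper rather than part of any SWITCH-ON or GEOSS API. For simplicity it ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned lon/lat box (no antimeridian handling)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, lon, lat):
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def bounding_box(points):
    """Smallest box enclosing all (lon, lat) points."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return Box(min(lons), min(lats), max(lons), max(lats))


def matches_by_box(points, search):
    """Classic test: dataset bounding box against search box."""
    return bounding_box(points).intersects(search)


def matches_by_points(points, search):
    """Multipoint test: at least one station lies inside the search box."""
    return any(search.contains(lon, lat) for lon, lat in points)


# Invented in-situ dataset with stations on three continents.
stations = [(18.07, 59.33), (-47.93, -15.78), (149.13, -35.28)]
# Invented search area in central Africa, where the dataset has no stations.
search = Box(10.0, -10.0, 30.0, 10.0)

print(matches_by_box(stations, search))     # bounding box reports a match
print(matches_by_points(stations, search))  # multipoint coverage does not
```

The bounding box of three widely spread stations covers much of the globe, so almost any search box intersects it. Testing the individual positions removes that false positive.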
20
Balancing and pragmatic approach to metadata
Not all data sets are equally popular. More popular datasets need more refined/detailed metadata. Not all ISO metadata attributes are necessary for search. Water scientists mainly want to search by classification and/or spatial reference (co-location). The rest is a simple matter of automatic filtering (as implemented by geoportal) or simply sifting through the often limited results. This means the search metadata can be simplified while still maintaining compatibility with GEOSS and ISO standards.
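As a rough sketch of what simplified search metadata could look like, the Python below reduces a full record to the two things the slide says water scientists search by: classification keywords and spatial reference. The field names (`id`, `keywords`, `points`, `lineage`, `contact`) and the sample records are assumptions made for this example and are not ISO element names. The identifier is kept so the slim record can still point back to the complete, standards-compliant one.

```python
def to_search_record(full):
    """Keep only what the search needs; `id` links back to the full record."""
    return {
        "id": full["id"],
        "keywords": {k.lower() for k in full.get("keywords", [])},
        "points": list(full.get("points", [])),
    }


def search(index, keyword=None, box=None):
    """Filter by classification keyword and/or box (west, south, east, north)."""
    hits = []
    for rec in index:
        if keyword is not None and keyword.lower() not in rec["keywords"]:
            continue
        if box is not None:
            west, south, east, north = box
            if not any(west <= lon <= east and south <= lat <= north
                       for lon, lat in rec["points"]):
                continue
        hits.append(rec["id"])
    return hits


# Invented full records; the extra fields stand in for the rest of the metadata.
full_records = [
    {"id": "ds-1", "keywords": ["Runoff"], "points": [(15.0, 60.0)],
     "lineage": "(not needed for search)", "contact": "(not needed for search)"},
    {"id": "ds-2", "keywords": ["Land use"], "points": [(15.5, 60.5)]},
    {"id": "ds-3", "keywords": ["runoff"], "points": [(-60.0, -3.0)]},
]
index = [to_search_record(r) for r in full_records]

print(search(index, keyword="runoff"))
print(search(index, box=(14.0, 59.0, 16.0, 61.0)))
print(search(index, keyword="runoff", box=(14.0, 59.0, 16.0, 61.0)))
```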
21
agile development of user interfaces
The agile approach is a proven/established method of finding user requirements and developing software:
- Easier interfaces
- Less coding
- Easier testing
- Faster time to market
- Happier users
22
Very basic search tool
23
Welcome to the SWITCH-ON Portal
SWITCH-ON is developing a large number of commercial water-information products and services.
SWITCH-ON will give free access to:
- Open Virtual Water-Science Laboratory: research infrastructure to facilitate collaboration, transparency and repeatable computational experiments.
- Tools for data search and knowledge brokering, for development and marketing of commercial information products and services.
- A one-stop shop with water information and tools for water scientists, consultancies and managers: tailored data, research results and marketing.
24
Increasing the use of GEOSS: summary from three perspectives
Producers can do better:
- Share data to enable innovation and better research, e.g. in climate.
- Use clear, permissive licences, preferably Creative Commons.
- Provide complete and correct metadata in standard machine-readable formats.
Mediators (portals, brokers, data hubs) can do better:
- Monitor availability, completeness and usability of the data sets.
- Encourage open data and the adoption of Creative Commons.
- Collaborate actively and pragmatically with data providers to increase the precision of spatial information (e.g. multipoint coverages), update broken links (some 4% yearly loss of data in our collection) and fix missing descriptions of all data sets.
- Librarian effort in SWITCH-ON is less than 0.2% of the total project cost.
User communities, research organisations and product developers:
- Better communicate their primary requirements for data search.
- Contribute to metadata (especially for data sets from smaller local projects).
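To give a feel for what "some 4% yearly loss" means if links are never repaired, here is a small Python calculation. It assumes the loss rate stays constant and compounds from year to year, and it applies the rate to the 8300 resources mentioned on slide 13. Both are assumptions made for illustration, not figures reported in the talk.

```python
def remaining_resources(initial, yearly_loss, years):
    """Resources still reachable after `years`, assuming a constant loss rate."""
    return initial * (1.0 - yearly_loss) ** years


# 8300 resources (slide 13) and 4% yearly loss (this slide), combined here
# purely as an illustrative assumption.
for years in (1, 5, 10):
    print(years, round(remaining_resources(8300, 0.04, years)))
```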
25
Thank you!