Environmental Information Data Centre: enabling the discovery of CEH-held data. John Watkins, Deputy Director, EIDC
CEH & NERC in a UK Government setting
CEH monitoring and data collection: as diverse as our science; micro- to macro-scale; many sources (monitoring campaigns, 180+ field sites, state-of-the-art facilities, regulator networks, volunteers, model outputs); long-term and unique.
CEH data coordination – in partnership
[Diagram labels: Land Cover Map; National River Flow Archive; Environmental Change Network; NERC Environmental Bioinformatics Centre; Biological Records Centre; Other Data; NERC Designated Data Centre; CEH data; Data Transfer Process; EIDC Data Hub; Long-term Storage and Curation; CEH Information Gateway; Metadata catalogue (data discovery); View & download (data access); Query & visualisation tools; Linked data and integration; Web Access; Users; NERC Catalogue; UK Gov Catalogue]
gateway.ceh.ac.uk
Links to NERC Data Catalogue Service
Links to UK Government Portal
Links to European INSPIRE Portal
Data citation via the Data Hub: “... the data have been allocated a digital object identifier ( 5285/1a91c7d1-ec af2-98d80f169bbd).”
Harmonising data definitions: CEH Analytical Services Thesaurus (CAST). No specified vocabulary!
Making definitions open access: CEH Analytical Services Thesaurus (CAST). Created to the Simple Knowledge Organization System (SKOS) W3C standard. Designed to describe the whole process. Top concepts: determinands, machine descriptions, measurement units, methods, filtration, preservation.
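To make the SKOS structure concrete, here is a minimal sketch of a concept scheme held as plain subject-predicate-object triples. Only the SKOS terms (`ConceptScheme`, `Concept`, `prefLabel`, `hasTopConcept`, `topConceptOf`) come from the W3C standard. The `http://example.org/cast/` namespace and the helper names are assumptions made for illustration, not CAST's published URIs, and the top concept labels are taken from the slide as read.

```python
# Minimal sketch: a SKOS-style concept scheme as plain triples (stdlib only).
# ASSUMPTION: BASE and the slug scheme are hypothetical, not CAST's real URIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/cast/"  # hypothetical namespace

# Top concept labels as listed on the slide.
TOP_CONCEPTS = [
    "determinands",
    "machine descriptions",
    "measurement units",
    "methods",
    "filtration",
    "preservation",
]


def slug(label):
    """Turn a label into a URI-friendly fragment."""
    return label.replace(" ", "-")


def build_scheme(labels):
    """Return triples declaring one concept scheme and its top concepts."""
    scheme = BASE + "scheme"
    triples = [(scheme, RDF_TYPE, SKOS + "ConceptScheme")]
    for label in labels:
        concept = BASE + slug(label)
        triples += [
            (concept, RDF_TYPE, SKOS + "Concept"),
            (concept, SKOS + "prefLabel", label),
            (concept, SKOS + "topConceptOf", scheme),
            (scheme, SKOS + "hasTopConcept", concept),
        ]
    return triples


def top_concept_labels(triples):
    """Read the preferred labels of all top concepts back out of the triples."""
    tops = {o for _, p, o in triples if p == SKOS + "hasTopConcept"}
    return sorted(
        o for s, p, o in triples if p == SKOS + "prefLabel" and s in tops
    )


triples = build_scheme(TOP_CONCEPTS)
```

A real deployment would more likely serialise these triples with an RDF library, but the shape of the data is the same: every definition gets its own URI that others can link to.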
Importing definitions through Web links
Importing information through Web links
Resource-oriented discovery: CEH Analytical Services Thesaurus (CAST). SKOS allows links to externally hosted vocabularies, e.g. ChEBI. This adds further value to datasets tagged using CAST, as they can be integrated with datasets tagged using concepts from the linked vocabularies.
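The integration benefit can be sketched as follows: if a CAST concept is declared equivalent to a concept in an external vocabulary, a search on one finds datasets tagged with either. The sketch uses `skos:exactMatch`, which SKOS defines as symmetric and transitive. All URIs, dataset names, and function names below are hypothetical placeholders, not real CAST or ChEBI identifiers.

```python
# Minimal sketch: discovering datasets across linked vocabularies.
# ASSUMPTION: every URI and dataset name here is a made-up placeholder.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

CAST_NITRATE = "http://example.org/cast/nitrate"
EXT_NITRATE = "http://example.org/external-vocab/nitrate"

# One mapping link from the local thesaurus to an external vocabulary.
links = [(CAST_NITRATE, SKOS_EXACT_MATCH, EXT_NITRATE)]

# Datasets and the concept URIs they are tagged with.
datasets = {
    "river-chemistry": {CAST_NITRATE},
    "soil-survey": {EXT_NITRATE},
    "bird-counts": {"http://example.org/cast/species-count"},
}


def equivalents(concept, links):
    """Return the concept plus everything reachable via exactMatch links,
    followed in both directions (exactMatch is symmetric and transitive)."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for s, p, o in links:
            if p != SKOS_EXACT_MATCH:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def find_datasets(concept, datasets, links):
    """Names of datasets tagged with the concept or any equivalent concept."""
    wanted = equivalents(concept, links)
    return sorted(name for name, tags in datasets.items() if tags & wanted)
```

With the link in place, `find_datasets(CAST_NITRATE, datasets, links)` returns both the locally tagged and the externally tagged dataset; with an empty link list it returns only the local one.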
Linking ecological concepts
Linking to multilingual definitions
Enabling complex environmental queries: the Web as a research data resource
Issues & challenges: Researchers can ask complex questions across diverse data sources using Linked Open Data (LOD). How to incentivise data providers to document & tag data => buy-in (e.g. DOIs)! Tools to automate the process: tagging at source/time of creation (e.g. LIMS). Automating the creation of semantic information for legacy data using diverse information sources (e.g. text mining of past reports and science papers).
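As a rough illustration of the last point, the simplest form of automated tagging for legacy text is to match thesaurus labels against a report and suggest the corresponding concept URIs. This is a naive keyword sketch, not a description of any tool CEH uses. The thesaurus entries, URIs, and the `suggest_tags` helper are hypothetical, and real text mining would need to handle synonyms, plurals, and context.

```python
import re

# ASSUMPTION: a tiny, made-up label-to-URI thesaurus for illustration only.
THESAURUS = {
    "nitrate": "http://example.org/cast/nitrate",
    "dissolved organic carbon": "http://example.org/cast/dissolved-organic-carbon",
    "filtration": "http://example.org/cast/filtration",
}


def suggest_tags(text, thesaurus):
    """Return (concept URI, hit count) pairs for labels found in the text,
    most frequent first. Matching is case-insensitive on whole words."""
    lowered = text.lower()
    found = {}
    for label, uri in thesaurus.items():
        pattern = r"\b" + re.escape(label.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            found[uri] = hits
    # Sort by descending count, then by URI for a stable order.
    return sorted(found.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical snippet standing in for a past report.
report = "Nitrate was measured after filtration. Nitrate concentrations rose."
suggestions = suggest_tags(report, THESAURUS)
```

Suggestions like these would still need review by a data provider before being stored as tags, which ties back to the buy-in question above.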
Thank you!