Quality control of biodiversity data: tools & techniques Leen Vandepitte On behalf of WoRMS, EurOBIS & LifeWatch data management teams.



What needs to be checked?

LifeWatch: home for a multitude of web services
Part of European Strategy Forum on Research Infrastructures (ESFRI)
Distributed virtual laboratory:
– Biodiversity research
– Climatological & environmental impact studies
– Support development of ecosystem services
– Provide information for policy makers
– Biodiversity observatories, databases, web services and modelling tools
– Integration of existing systems, upgrades, new systems
LifeWatch wants
– Standardization of species data
– Integration of distributed biodiversity data repositories & operating facilities
LifeWatch needs
– Species information services

LifeWatch offers compilation and combination of several web services
These services = taxonomic backbone
– Taxonomy access services
– Taxonomic editing environment
– Species occurrence services
– Catalogue services
LifeWatch infrastructure:
– Identify, analyze and design online data services, models and applications
– Make use of all LifeWatch data
– = interactive part of LifeWatch

LifeWatch web services Login / password required System keeps track of all your “jobs”

Taxonomic QC

All quality checks relevant for OBIS in one:
– OBIS data format validation
– Are mandatory fields available?
– Is data/information in the mandatory fields available?
– Plotting of coordinates on map => identifies land versus sea points
– Validation of the dates (= check format)
– Taxon match, based on World Register of Marine Species (WoRMS)
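The two mandatory-field questions above can be sketched in a few lines of Python. This is only an illustration, not the LifeWatch service itself: the field names in `MANDATORY_FIELDS` are assumed (Darwin Core style terms chosen for the example), and `check_mandatory` is a hypothetical helper.

```python
# Assumed list of mandatory fields, for illustration only;
# consult the OBIS data format documentation for the real list.
MANDATORY_FIELDS = ["scientificName", "decimalLatitude", "decimalLongitude", "eventDate"]


def check_mandatory(records):
    """Return (row_number, field, problem) tuples for every failed check.

    Two separate questions are asked for each record:
    is the field present at all, and does it actually hold a value?
    """
    issues = []
    for row_number, record in enumerate(records, start=1):
        for field in MANDATORY_FIELDS:
            if field not in record:
                issues.append((row_number, field, "field missing"))
            elif record[field] is None or str(record[field]).strip() == "":
                issues.append((row_number, field, "value empty"))
    return issues
```

The returned list can be passed back to the data provider as a simple report, one line per problem.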

Data validations and QC services – Check OBIS file

NEXT

Use this report as feedback to your provider

Taxonomic quality control
Taxon match: World Register of Marine Species (WoRMS)
Taxon match: LifeWatch taxon match:
– World Register of Marine Species
– Integrated Taxonomic Information System (ITIS)
– Catalogue of Life (CoL)
– International Plant Names Index (IPNI)
– Index Fungorum (IF)
– PalaeoBiology Database (Palaeo-DB)
– Pan-European Species directories Infrastructure (PESI)

WoRMS Taxon Match Tool
Freely available, no password/login required
This tool uses the following components:
– TAXAMATCH fuzzy matching algorithm by Tony Rees
– PHP/MySQL port of TAXAMATCH by Michael Giddens
– Scientific Names Parser by Dmitry Mozzherin

– Prepare your own file (Plain text [TXT], Comma Separated [CSV] & Excel Sheet [XLS, XLSX])
– For convenience => column “scientific_name”
– Upload onto website
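Preparing such a file can be scripted. The sketch below builds CSV text with a single “scientific_name” column, as suggested on the slide; the helper name `names_to_csv` and the clean-up rules (collapsing whitespace, dropping blanks and duplicates) are assumptions added for the example.

```python
import csv
import io


def names_to_csv(names):
    """Return CSV text with one 'scientific_name' column.

    Whitespace is normalised, and blank or repeated names are dropped,
    so that each name is submitted to the matching tool only once.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["scientific_name"])
    for name in names:
        cleaned = " ".join(name.split())
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            writer.writerow([cleaned])
    return buffer.getvalue()
```

The resulting text can be saved with a `.csv` extension and uploaded through the website.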

WoRMS taxon match results:
– Exact match
– Phonetic match
– Near_1 match
– Near_2 match
– No match
Check and verify everything that is not an exact match…
Some examples:
– Phonetic: Fragilaria aurivillii => Fragilaria aurivilii
– Near_1: Chaetoceros seychellarum => Chaetoceros seychellarus
– Near_2: Gammarus finnmarchius => Gammarus finmarchicus; Syllis armoricanus => Syllis armoricana
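Once the match results are downloaded, the “check everything that is not an exact match” rule is easy to apply automatically. The sketch below does not perform any matching itself; it only sorts existing results. The record layout (`name`, `match_type`, `suggestion`) and the lower-case match labels are assumptions made for the example, not the tool's actual output format.

```python
def needs_review(results):
    """Group every non-exact result by its match type for manual checking."""
    to_check = {}
    for result in results:
        if result["match_type"] != "exact":
            to_check.setdefault(result["match_type"], []).append(result)
    return to_check


# Example input, using the names shown on the slide.
results = [
    {"name": "Syllis armoricana", "match_type": "exact", "suggestion": "Syllis armoricana"},
    {"name": "Fragilaria aurivillii", "match_type": "phonetic", "suggestion": "Fragilaria aurivilii"},
    {"name": "Gammarus finnmarchius", "match_type": "near_2", "suggestion": "Gammarus finmarchicus"},
    {"name": "Syllis armoricanus", "match_type": "near_2", "suggestion": "Syllis armoricana"},
]
review = needs_review(results)
```

Here `review` holds one phonetic and two near_2 cases; the exact match is left out because it needs no further attention.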

LifeWatch taxon match tool

Currently available taxon services
If a taxon is not in WoRMS:
- Send to
- Let us know if it is available in any of the other registers

Use this report as feedback to your provider / WoRMS

Geographic quality control
LifeWatch: Show on map
LifeWatch: Marine Regions Gazetteer services
– Get lat-lon by MrgID
– Get lat-lon by name
– Get Gazetteer name by lat-lon
– Get lat-lon by accepted name

Geographic QC – the concept
Before quality control => after quality control:
– 18°30’25’’N – 5°15’E => 18.51 ; …
– …,23N – 16.5S => 54.23 ; …
Communication with provider
WGS84 = World Geodetic System 1984; most used geographical reference system
Decimal degrees => easy to work with
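The conversion to decimal degrees follows the usual rule: degrees + minutes/60 + seconds/3600, negative for the southern and western hemispheres. A minimal sketch, assuming coordinates written like the slide example (the pattern and the function name `dms_to_decimal` are illustrative, and real datasets will contain more notations than this handles):

```python
import re

# Accepts forms such as 18°30’25’’N or 5°15’E; minutes and seconds are optional.
DMS_PATTERN = re.compile(
    r"""^\s*(\d+)\s*°"""                               # degrees
    r"""(?:\s*(\d+)\s*[’'])?"""                        # minutes
    r"""(?:\s*(\d+(?:\.\d+)?)\s*(?:[’']{2}|["”]))?"""  # seconds
    r"""\s*([NSEW])\s*$"""                             # hemisphere letter
)


def dms_to_decimal(text):
    """Convert one degrees-minutes-seconds coordinate to decimal degrees."""
    match = DMS_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    minutes = int(minutes) if minutes else 0
    seconds = float(seconds) if seconds else 0.0
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes or seconds out of range: {text!r}")
    value = int(degrees) + minutes / 60 + seconds / 3600
    # South and west are negative in decimal degrees.
    return -value if hemisphere in "SW" else value
```

For the slide example, `round(dms_to_decimal("18°30’25’’N"), 2)` gives 18.51. Anything the pattern does not recognise raises an error rather than being guessed at, which is the point where communication with the provider comes in.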

Coordinates are indispensable
Coordinates = basis of a biogeographic information system
When no coordinates are provided…
Check with the data provider / the source
When existing: complete the file & run QC
When not existing:
– Derive from provided map
– Check Marine Regions to assign coordinates

Marine Regions = standard, relational list of geographic names
Coupled with information and maps of the geographic location
Improves access to and clarity of the different geographic, mainly marine names such as seas, sandbanks, ridges and bays

Fish species “A” present in Kenya
Marine species on land?
Link with adjacent sea area: EEZ
Indicate precision!

The importance of geographical QC
Some examples:
– “Monitoring in Kongsfjorden area”
– “Monitoring in Belgian part of the North Sea”
– Latitude & longitude switched
– “+” & “-” signs switched
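Both kinds of error can be detected when the expected study area is known. The sketch below tests whether a point falls inside a bounding box and, if not, whether switching latitude and longitude or flipping a sign would put it there. The box values are rough, assumed numbers for the Belgian part of the North Sea, used only for illustration; `diagnose` is a hypothetical helper.

```python
# Rough, assumed bounding box: (min_lat, max_lat, min_lon, max_lon).
BELGIAN_NORTH_SEA = (51.0, 52.0, 2.0, 3.5)


def diagnose(lat, lon, bbox):
    """Suggest the most likely coordinate error for a point outside bbox."""
    min_lat, max_lat, min_lon, max_lon = bbox

    def inside(la, lo):
        return min_lat <= la <= max_lat and min_lon <= lo <= max_lon

    # Each candidate is a label plus the coordinates after undoing that error.
    candidates = [
        ("ok", lat, lon),
        ("latitude and longitude switched", lon, lat),
        ("longitude sign switched", lat, -lon),
        ("latitude sign switched", -lat, lon),
        ("both signs switched", -lat, -lon),
    ]
    for label, la, lo in candidates:
        if inside(la, lo):
            return label
    return "outside expected area, check with provider"
```

A suggestion from this function is a hint, not a correction: the change should still be confirmed with the data provider before the record is edited.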

Sightings and strandings of marine turtles around the coast of UK and Ireland
Left: coordinates as received; right: corrected. Errors due to missing minus sign

What else to check…? Use common sense…

Dates
OBIS data format check includes check on the date format:
– Year: “1972” vs “72” vs “972”
– Month: between 1-12
– Day: between 1-31, check takes into account the given month
but…
– Dataset from 1990, with a few records in 1909…
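The format checks, and the “common sense” check that the format check misses, might look as follows. This is a sketch under assumptions, not the OBIS implementation: date parts are taken as strings, and the tolerance of 10 years around the dataset's median year is an arbitrary value chosen for the example.

```python
import calendar
import statistics


def check_date(year, month, day):
    """Format check on year, month and day given as strings."""
    issues = []
    year_ok = year.isdecimal() and len(year) == 4 and int(year) >= 1
    month_ok = month.isdecimal() and 1 <= int(month) <= 12
    if not year_ok:
        issues.append("year should have four digits")
    if not month_ok:
        issues.append("month should be between 1 and 12")
    if month_ok:
        # Without a valid year, fall back to a leap year (most permissive).
        reference_year = int(year) if year_ok else 2000
        max_day = calendar.monthrange(reference_year, int(month))[1]
    else:
        max_day = 31
    if not (day.isdecimal() and 1 <= int(day) <= max_day):
        issues.append("day is not valid for the given month")
    return issues


def unusual_years(years, tolerance=10):
    """Return years lying more than `tolerance` years from the median year."""
    median = statistics.median(years)
    return sorted({y for y in years if abs(y - median) > tolerance})
```

A record dated 1909 passes `check_date` without complaint, but `unusual_years` flags it in a dataset that otherwise dates from 1990, which is exactly the case the slide warns about.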

Units
OBIS can capture:
– Counts
– Biomass
– Depth
Are units defined?
– Counts: individuals per m², cm², liter, m³
– Biomass: wet weight, dry weight, ash-free dry weight
– Depth: meter, centimeter
Significance
– Needs thorough documenting
– Know what you are dealing with
– Comparison
– Convert to OBIS standards
Depth: in meter, positive values
Abundance: NULL versus 0, positive values
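The two conversion rules at the end of the slide can be sketched as below. The assumptions are stated in the comments: negative depths are treated as a sign convention (below the surface) rather than as errors, and an empty abundance is kept as NULL (`None`) so that “not recorded” is never confused with a true zero. Function names and unit labels are illustrative.

```python
# Conversion factors to metres for the two depth units named on the slide.
TO_METRES = {"m": 1.0, "cm": 0.01}


def standardise_depth(value, unit):
    """Return depth in metres as a positive value.

    Assumption: a negative depth is a sign convention, not an error.
    """
    if unit not in TO_METRES:
        raise ValueError(f"undefined depth unit: {unit!r}")
    return abs(value) * TO_METRES[unit]


def standardise_abundance(value):
    """Keep NULL and 0 apart: None means 'not recorded', 0 means 'none found'."""
    if value is None or str(value).strip() == "":
        return None
    number = float(value)
    if number < 0:
        raise ValueError("abundance must be a positive value or zero")
    return number
```

Raising an error on an undefined unit, instead of guessing, reflects the slide's point that units need thorough documenting before values can be compared.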

Questions?