PESI: Pan-European Species-directories Infrastructure
European GBIF nodes Meeting, Paris, 4 April 2011
Walter Berendsohn (based on presentation by Yde de Jong)
PESI project
Aim: Creating a Pan-European taxonomic backbone (an All-Species Checklist of Europe)
Based on the 3 existing large European databases:
–Fauna Europaea: European animal species (terrestrial & freshwater)
–European Register of Marine Species: European marine species
–Euro+Med PlantBase: European plant species (terrestrial & freshwater)
+ European Fungi (Index Fungorum) & Algae (AlgaeBase)
2.64 million Euro, 40 partners, May April 2011
PESI in the context of other EU projects
Framework Programmes shown: (FP 4) (FP 5) (FP 6) (FP 7)
Project categories shown: Digitisation & Infrastructures; Integration & e-Science; Networks of Excellence
Sustaining Pan-European checklists - issues
–Sustaining the expert networks
–Sustaining the database systems
–Sustaining the maintenance (updating) tools & functions
–Sustaining the interoperability (e.g. Global Name Architecture)
–Sustaining the data dissemination (webportal)
–Sustaining the data verification (quality control / validation)
–Sustaining the implementation of European taxonomic standards
Labels shown alongside the issues: SMEBD; Host Institutions; EDIT Platform for Cybertaxonomy; Focal Networks; VLIZ / Hosts (LifeWatch?); *4Life projects (LifeWatch?)
PESI WP2: Expert Networks
Project structure diagram, labels: Management & Coordination; Infrastructural Networks; Community Networks; Zoological Community; Botanical Community; Marine Community; Mycological Community; Phycological Community; Expert networks; Focal point networks; Authority files & Standards; Data e-Infrastructure; e-Services
Expert Networks Society for the Management of Electronic Biodiversity Data (SMEBD)
PESI Focal Points Networks
PESI Focal Point Networks
Major tasks of PESI National Focal Points
Cross-validation of pan-European lists with local species lists
–TaxonMatch Tool
Provide meta-data on local expertise (experts, resources, etc.)
–Focal Points Expertise database
PESI Validation Tools: Taxon Match Tool
–TAXAMATCH fuzzy matching algorithm by Tony Rees
–PHP/MySQL port of TAXAMATCH by Michael Giddens
–Scientific Names Parser by Dmitry Mozzherin
PESI validation tools: Taxon Match Tool - 2
Mapping between two taxon name lists (exact and fuzzy)
1. Exact match test
2. Phonetic match test
3. Custom Modified Damerau-Levenshtein Distance (MDLD)
4. Modified n-gram comparison of author names and dates, including known abbreviations
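The first three matching steps can be sketched as a simple cascade. This is a minimal illustration only, not the TAXAMATCH code: `phonetic_key` is a crude, hypothetical stand-in for the phonetic rules, `osa_distance` is the standard optimal-string-alignment form of Damerau-Levenshtein rather than the custom MDLD, and the author/date comparison of step 4 is left out. The function names and the `max_distance` threshold are assumptions.

```python
def phonetic_key(name: str) -> str:
    """Crude phonetic key (hypothetical rules, not those of TAXAMATCH)."""
    s = name.lower()
    for old, new in (("ae", "e"), ("oe", "e"), ("ph", "f"), ("y", "i"), ("k", "c")):
        s = s.replace(old, new)
    out = []
    for ch in s:  # collapse doubled letters, e.g. "pp" -> "p"
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def osa_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance (optimal string alignment variant)."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            # adjacent transposition counts as a single edit
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def match_name(query, reference, max_distance=2):
    """Return (match_type, matched_name), trying exact, phonetic, then fuzzy."""
    q = " ".join(query.lower().split())
    normalised = {" ".join(r.lower().split()): r for r in reference}
    if q in normalised:
        return ("exact", normalised[q])
    q_key = phonetic_key(q)
    for key, original in normalised.items():
        if phonetic_key(key) == q_key:
            return ("phonetic", original)
    best = min(normalised, key=lambda k: osa_distance(q, k), default=None)
    if best is not None and osa_distance(q, best) <= max_distance:
        return ("fuzzy", normalised[best])
    return ("none", None)


local_list = ["Parus major", "Hippophae rhamnoides", "Quercus robur"]
for candidate in ("parus  major", "Hippofae rhamnoides", "Quercus robru"):
    print(candidate, "->", match_name(candidate, local_list))
```

Running the cheap tests first means the more expensive edit-distance comparison is only reached for names that fail both the exact and the phonetic test.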
PESI validation tools: Taxon Match Tool - 3 Excel file export
Taxonomic standards & authority files
Conceptual integration
What are the properties of a Taxonomic Backbone?
–connecting different uses of the same name for multiple classifications
–persistent name-name relationships of species names
Optimise data sharing / interoperability
–Linked Data mark-up
–persistent identifiers: globally unique IDs (GUIDs, LSIDs)
–DarwinCore Archive format for transport
Standardised ontologies / vocabularies
–consensus classification
–consensus distribution and occurrence scheme
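As an illustration of the persistent identifiers mentioned above, an LSID has the general form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. The sketch below splits such a string into its parts. The `Lsid` type, the `parse_lsid` function and the example identifier are hypothetical and are not taken from the PESI systems.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into its components, or raise ValueError."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"missing urn:lsid prefix: {text!r}")
    if any(p == "" for p in parts[2:5]):
        raise ValueError(f"empty LSID component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


# Made-up identifier on a placeholder authority, for illustration only
print(parse_lsid("urn:lsid:example.org:taxname:12345"))
```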
PESI consensus distribution and occurrence scheme Gazetteer:
Taxonomic information e-infrastructure
EDIT Platform dataflow in PESI domain © Walter Berendsohn PESI Phyco-Myco databases
Quality control mechanisms Inconsistency checks used in the merging process
PESI Data Warehouse Model > taxon names ~ valid species names
PESI Data Warehouse - statistics
e-Services for users & dissemination
PESI project website
PESI dataportal
Linking to Global Names Architecture
Acknowledgement (PESI SC & management)
Nihat Aktaç, Ward Appeltans, Walter Berendsohn, Phillip Boegh, Louis Boumans, Thierry Bourgoin, Mark Costello, Charles Hussey, Roger Hyam, Yde de Jong, Julia Kouwenberg, David Ouvrard, Henrik Pedersen