Building the European Register of Marine Species
  Richard White
    Biodiversity & Ecology Research Division, School of Biological Sciences, University of Southampton, UK
  Mark Costello & Chris Emblow
    Ecological Consultancy Services Ltd (EcoServe), Dublin, Ireland
European Register of Marine Species (ERMS)
  Funded as a Concerted Action project by the MAST (Marine Science and Technology) programme of the European Union
  Managed by EcoServe in Dublin
  Team of participants including “list editors”
ERMS
ERMS versus URMO
  URMO is creating global lists of marine organisms but is not taxonomically complete
  ERMS is creating a regional list (for European waters) but it is (almost) taxonomically complete
  With the Fauna Europaea and Euro+Med PlantBase projects, Europe will have an (almost) complete list of its species
How many species? About 29,500
ERMS geographical area
Incoming data
  Approximately 100 separate lists for different taxonomic groups
  Mostly compiled as spreadsheets
  Scientific names, synonyms, geography (at least Atlantic or Mediterranean)
  Some optional fields
List conversion is carried out in several stages:
  Excel spreadsheets are exported to text files
  Tab-delimited text files are converted to a “holding format” (formerly XDF, now a client-server MySQL database)
  Database query results are passed through templates to generate either RTF (for the printed publication) or HTML (for the Web site)
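The stages listed above can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's own Perl, and it skips the database step by rendering straight from parsed records. The column order in `COLUMNS`, the `HTML_ROW` template and the function names are all assumptions made for the example, not the project's actual formats.

```python
import csv
import io
from html import escape
from string import Template

# Hypothetical column order for one exported list; the real lists varied.
COLUMNS = ["genus", "species", "authority"]

# Hypothetical HTML template for a single species entry.
HTML_ROW = Template("<li><i>$genus $species</i> $authority</li>")


def read_tab_delimited(text):
    """Turn exported tab-delimited text into a list of dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def render_html(records):
    """Pass each record through the template to build an HTML list."""
    items = [
        HTML_ROW.substitute({key: escape(value) for key, value in rec.items()})
        for rec in records
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


sample = "Caretta\tcaretta\t(Linnaeus, 1758)\nChelonia\tmydas\t(Linnaeus, 1758)\n"
records = read_tab_delimited(sample)
html_page = render_html(records)
print(html_page)
```

Swapping `HTML_ROW` for an RTF template would give the printed-publication output from the same records, which is the point of keeping the templates separate from the data.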
Data flow
Variations on a theme
  Fields may be combined or separated, e.g. genus species authority date
  Higher taxa may be:
    repeated in fields of the species record
    given once in separate preceding records
    in various different formats
  Synonyms may be:
    in a separate field of the species record, or mixed with other remarks, with various delimiters and separators
    in separate records, linked by code or by name, or even abbreviated
    implied, e.g. Genus1 specname (Smith as Genus2)
  Geographical information is often free text
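As one illustration of the "combined fields" case, a single cell holding genus, species, authority and date can be split with a regular expression. This is a sketch in Python, not the project's code. The pattern `NAME_RE` and the function `split_name` are hypothetical, and they assume a simple "Genus species Authority, Year" shape with the authority optionally in parentheses. Real lists would need more cases (subgenera, subspecies, missing dates).

```python
import re

# Assumed shape: "Genus species Authority, Year" or "Genus species (Authority, Year)".
NAME_RE = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<species>[a-z-]+)\s+"
    r"(?P<authority>\(?.+?),\s*(?P<date>\d{4})\)?$"
)


def split_name(combined):
    """Split a combined name cell into its parts, or return None if it does not fit."""
    text = combined.strip()
    match = NAME_RE.match(text)
    if match is None:
        return None
    parts = match.groupdict()
    # Record whether the authority was parenthesised, then drop the bracket.
    parts["parenthesised"] = text.endswith(")")
    parts["authority"] = parts["authority"].lstrip("(")
    return parts


print(split_name("Caretta caretta (Linnaeus, 1758)"))
print(split_name("Anthelura elongata Norman & Stebbing, 1886"))
```

Returning `None` for anything that does not match keeps unparseable rows visible for a list editor to check by hand instead of silently guessing.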
Conversion: simple case
  #!/usr/bin/perl -w
  # Porifera.pl: convert an ERMS list text file to an XDF file
  use PerlStart;
  use ERMS;
  &speciesList();
  __END__
  list code PF
  list version 1
  list rank phylum
  record 1 fields
  field 1 genus
  field 2 species
  field 3 species authority and date
  field 4 used
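The description after `__END__` reads like a small declarative specification of the list. How the ERMS module actually interprets it is not shown on the slide, so the following Python sketch is only a guess at the idea. It assumes one directive per line, treats `list` lines as metadata and `field` lines as a column-number-to-name map, and ignores `record` lines. The functions `parse_spec` and `label_row` and the sample row are invented for the illustration.

```python
SPEC = """\
list code PF
list version 1
list rank phylum
record 1 fields
field 1 genus
field 2 species
field 3 species authority and date
field 4 used
"""


def parse_spec(spec_text):
    """Collect 'list <key> <value>' and 'field <n> <name>' directives."""
    list_info, fields = {}, {}
    for line in spec_text.splitlines():
        words = line.split()
        if len(words) >= 3 and words[0] == "list":
            list_info[words[1]] = " ".join(words[2:])
        elif len(words) >= 3 and words[0] == "field" and words[1].isdigit():
            fields[int(words[1])] = " ".join(words[2:])
        # 'record' directives are ignored in this sketch.
    return list_info, fields


def label_row(row_text, fields):
    """Attach field names to the cells of one tab-delimited row."""
    cells = row_text.rstrip("\n").split("\t")
    return {name: cells[n - 1] for n, name in fields.items() if n <= len(cells)}


list_info, fields = parse_spec(SPEC)
# Invented placeholder row, not real ERMS data.
row = label_row("Genusname\tspeciesname\tAuthor, 1900\ty", fields)
print(list_info)
print(row)
```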
More complicated case
  #!/usr/bin/perl -w
  # Tardigrada.pl: convert an ERMS list text file to an XDF file
  use PerlStart;
  use ERMS;
  &speciesList ( sub { &extractSynonyms(10, "syn.:"); } );
  __END__
  list code TG
  list version 2
  list rank phylum
  record 1 title
  record 2 fields
  field 1 order
  field 2 family
  field 3 genus
  field 4 species
  field 5 subspecies
  field 6 species authority
  field 7 species date
  field 8 geography
  field 9 reference
  field 10 remarks
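The call `extractSynonyms(10, "syn.:")` suggests that synonyms are pulled out of the remarks field (field 10) wherever the marker `syn.:` appears. The real Perl routine is not shown, so this Python sketch only illustrates one plausible behaviour. It assumes that everything after the marker is a list of synonyms separated by semicolons. The function name `extract_synonyms`, the `separator` parameter and the sample row are hypothetical.

```python
def extract_synonyms(cells, field_no, marker, separator=";"):
    """Split synonyms out of a free-text field.

    cells    -- list of cell strings for one record
    field_no -- 1-based number of the field to search
    marker   -- text that introduces the synonyms, e.g. "syn.:"

    Returns (synonyms, remaining_text).
    """
    text = cells[field_no - 1]
    before, found, after = text.partition(marker)
    if not found:
        return [], text.strip()
    synonyms = [s.strip() for s in after.split(separator) if s.strip()]
    return synonyms, before.strip()


# Invented ten-field row; only the last cell (remarks) matters here.
cells = [""] * 9 + ["deep water. syn.: Aus bus; Aus cus"]
synonyms, remarks = extract_synonyms(cells, 10, "syn.:")
print(synonyms, remarks)
```

Passing the field number and marker as arguments mirrors the slide's call, where each list's script supplies only what differs from the simple case.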
“Holding format” XDF file
  (HIGHER:informal:Tetrapoda)
  (HIGHER:order:Testudines)
  (HIGHER:family:Cheloniidae)
  TP00001:Caretta:caretta:(Linnaeus, 1758):species::::::::Cosmopolitan warm to temperate waters::::Loggerhead turtle::::
  TP00002:Chelonia:mydas:(Linnaeus, 1758):species::::::::Cosmopolitan warm water::::Green turtle::::
  TP00003:Eretmochelys:imbricata:(Linnaeus, 1766):species::::::::Cosmopolitan warm water::::Hawksbill turtle::::
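Judging only from the sample shown, an XDF file mixes `(HIGHER:rank:name)` lines, which set the classification for the records that follow, with colon-delimited species records. The Python sketch below reads that shape. The names given to the first five fields are inferred from the example, the remaining fields are kept positionally because their meaning is not documented on the slide, and `parse_xdf` is a hypothetical helper, not part of the project.

```python
import re

HIGHER_RE = re.compile(r"^\(HIGHER:([^:]+):([^:)]+)\)$")


def parse_xdf(text):
    """Parse XDF-like text into records carrying their higher classification."""
    higher = {}   # rank -> name, updated as HIGHER lines are met
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HIGHER_RE.match(line)
        if match:
            higher[match.group(1)] = match.group(2)
            continue
        cells = line.split(":")
        if len(cells) < 5:
            continue  # not a species record as far as this sketch can tell
        records.append({
            "id": cells[0],
            "genus": cells[1],
            "species": cells[2],
            "authority": cells[3],
            "rank": cells[4],
            "other": cells[5:],      # undocumented fields, kept by position
            "higher": dict(higher),  # snapshot of the current classification
        })
    return records


sample = """\
(HIGHER:informal:Tetrapoda)
(HIGHER:order:Testudines)
(HIGHER:family:Cheloniidae)
TP00001:Caretta:caretta:(Linnaeus, 1758):species::::::::Cosmopolitan warm to temperate waters::::Loggerhead turtle::::
TP00002:Chelonia:mydas:(Linnaeus, 1758):species::::::::Cosmopolitan warm water::::Green turtle::::
"""
records = parse_xdf(sample)
for rec in records:
    print(rec["id"], rec["genus"], rec["species"], rec["higher"]["family"])
```

This sketch does not clear lower ranks when a new higher-rank line appears, so a new order would keep the previous family until another family line is read. A fuller version would need to handle that.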
Example RTF file for the book
  Order Isopoda
  Suborder Anthuridea
  Family Antheluridae
    Ananthura
      abyssorum (Norman & Stebbing, 1886) A
    Anthelura
      elongata Norman & Stebbing, 1886 A
      ovalis (Barnard, 1925) M
        = Ananthura ovalis
      sulcaticauda (Barnard, 1925) A
        = Ananthura sulcaticauda
      truncata (Hansen, 1916)
Static versus dynamic web pages
  Initial web pages were generated statically (in advance) from the XDF “holding format” (without synonyms)
  RTF files were generated from the database (with synonyms)
  Future web pages will be generated dynamically (on demand) from the database (with synonyms)
Database schema (simplified)
  Taxon file:
    taxon ID (PK)
    geography
    etc.
  Name table:
    name ID (PK)
    taxon ID (I, FK)
    Genus (I, FK)
    species (I)
    author
    etc.
  Hierarchy table:
    taxon (PK)
    rank
    parent (I, FK)
    etc.
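To make the simplified schema concrete, here is a sketch using Python's built-in `sqlite3` module in place of the MySQL server the project used. Reading the markers as PK for primary key, I for indexed and FK for foreign key, the three tables might look like this. The column types, the index names, the choice to have `genus` reference the hierarchy table, and the column name `taxon_rank` (for "rank") are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE hierarchy (
    taxon      TEXT PRIMARY KEY,
    taxon_rank TEXT,
    parent     TEXT REFERENCES hierarchy(taxon)
);
CREATE INDEX hierarchy_parent ON hierarchy(parent);

CREATE TABLE taxon (
    taxon_id  TEXT PRIMARY KEY,
    geography TEXT
);

CREATE TABLE name (
    name_id  INTEGER PRIMARY KEY,
    taxon_id TEXT REFERENCES taxon(taxon_id),
    genus    TEXT REFERENCES hierarchy(taxon),
    species  TEXT,
    author   TEXT
);
CREATE INDEX name_taxon   ON name(taxon_id);
CREATE INDEX name_genus   ON name(genus);
CREATE INDEX name_species ON name(species);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# One species from the XDF example, loaded by hand for illustration.
conn.execute("INSERT INTO hierarchy VALUES ('Cheloniidae', 'family', NULL)")
conn.execute("INSERT INTO hierarchy VALUES ('Caretta', 'genus', 'Cheloniidae')")
conn.execute(
    "INSERT INTO taxon VALUES ('TP00001', 'Cosmopolitan warm to temperate waters')"
)
conn.execute(
    "INSERT INTO name (taxon_id, genus, species, author) "
    "VALUES ('TP00001', 'Caretta', 'caretta', '(Linnaeus, 1758)')"
)

# Join a name to its geography and to the parent of its genus.
rows = conn.execute(
    """
    SELECT n.genus, n.species, n.author, h.parent, t.geography
    FROM name AS n
    JOIN taxon AS t ON t.taxon_id = n.taxon_id
    JOIN hierarchy AS h ON h.taxon = n.genus
    """
).fetchall()
print(rows)
```

Keeping names separate from taxa lets several names (an accepted name plus its synonyms) point at one taxon ID, and the self-referencing `parent` column lets the classification be walked upwards to any depth.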