Download presentation
Presentation is loading. Please wait.
Published byNatalie Watkins Modified over 9 years ago
1
Representing taxonomy MarBEF-IODE workshop Oostende, 19-23 March 2007
2
Philosophy Structure has to be as simple as possible But not any simpler!! Alternatives to represent classification and hierarchy Alternatives to represent synonymy
3
Hierarchy: flat table Every rank in the hierarchy is represented by a field in the table Simplest solution Easy to create Easy to query
4
Hierarchy: flat table
5
Problems not normalised! Not a real problem if a quick-and-dirty solution is all that is needed Difficult to maintain hierarchy in the long term ‘Standard’ problems with non-normalised database Possible conflicting information, inefficient storage… Cfr MASDEA; too simple
6
Hierarchy: normalised tables Every rank is represented by a separate table Not very difficult to write a query to regenerate flat table Every taxon can have additional information Extra fields with description…
7
Hierarchy: cascading tables
10
Hierarchy: normalised tables Avantages Easy to maintain and query Normailised, possible to add information at any level of the hierarchy Drawbacks Ranks are hard-wired on the structure of the database New rank would require change of the structure of the database And probably of the user interface, web interface… Number of tables Lot of functionality duplicated
11
Taxonomic reality Ranks used depend on the taxonomic group Botany: mainly infra-specific; zoology: mainly on higher levels Many of the ranks are only sparsely used Needs for a more flexible system Much of the functionality is the same across all ranks ‘parent’, synonymy Authority, description…
12
‘Open Hierarchy’ Possible to define new ranks without having to rewrite the structure of the database All taxonomic names are stored in a field in a single table; other fields indicate parent and rank Many-to-one relation: a single parent, several descendants Include ID of parent in the record of the descendant
15
Open Hierarchy Avantages Completely normalised Flexible Drawbacks Difficult to query classification Queries of the type ‘all species of the Echinodermata’… Solution: ‘Calculated field’ Programmatical (loop in computer language) Recursive query
16
Synonymy Every taxon can have several synonyms; in principle, only one valid name for any synonym Many-to-one relation: one valid name, many synonymous names Include ID of the valid name in the record of the synonymous name Other fields for the type of synonymy…
18
Implementation in OBIS (PostgreSQL)
19
Calculated field: ‘stored path’ Calculate a field, as a concatenation of id of parent, parent of parent… E.g. x5x45x65x 5: Animalia 45: Arthropoda 65: Crustacea Stored path of all taxa belonging to Crustacea start with x5x45x65x
20
Query the Stored Path Get all species from Echinodermata: select * from obis.tnames where storedpath~(select '^'||storedpath||id||'x' from obis.tnames where tname='Echinodermata')::text and rank_id=220
21
Recursive query All taxa belonging to given taxon: with recursive includedtaxa(id, tname) as ( select id, tname from obis.tnames where tname='Semelidae' union select tnames.id, tnames.tname from obis.tnames inner join includedtaxa on tnames.parent_id=includedtaxa.id ) select * from includedtaxa order by tname
22
The other way Finding parent of given rank of a species with recursive parenttaxa(id, parent_id, tname) as ( select id, parent_id, tname from obis.tnames where tname='Abra alba' union select tnames.id, tnames.parent_id, tnames.tname from obis.tnames inner join parenttaxa on parenttaxa.parent_id=tnames.id and tnames.rank_id>=140 ) select * from parenttaxa order by tname
23
Rest of the taxonomic model Ranks should be in a separate table Information on the level of the rank can be added Possibility of extra quality control Rank of a parent as compared to rank of descendants Rank of siblings should be same
25
Documentation Documenting sources of information Add sources/references ‘Audit trail’: source of the information in the database Taxonomic information: reference of the original description Type of the source: expert, database, publication Date and person responsible for the last revision of the record
26
Sources Many-to-many relation Every source can contain information on several taxa A single taxon can be documented in several sources Necessitates an extra table to represent the relationship Divide one many-to-many in to one-to-many relationships
28
Add distribution Localities from where a taxon has been reported Many-to-many relation One locality has several taxa One taxon is found on several localities Relation must be qualified Source! Validity of the observation
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.