Representing taxonomy MarBEF-IODE workshop Oostende, 19-23 March 2007.

1 Representing taxonomy MarBEF-IODE workshop Oostende, 19-23 March 2007

2 Philosophy
- The structure has to be as simple as possible
- But not any simpler!
- Alternatives to represent classification and hierarchy
- Alternatives to represent synonymy

3 Hierarchy: flat table
- Every rank in the hierarchy is represented by a field in the table
- Simplest solution
  - Easy to create
  - Easy to query

4 Hierarchy: flat table

5 Problems
- Not normalised!
  - Not a real problem if a quick-and-dirty solution is all that is needed
  - Difficult to maintain the hierarchy in the long term
- ‘Standard’ problems with a non-normalised database
  - Possibly conflicting information, inefficient storage…
- Cf. MASDEA: too simple

6 Hierarchy: normalised tables
- Every rank is represented by a separate table
- Not very difficult to write a query that regenerates the flat table
- Every taxon can have additional information
  - Extra fields with description…
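
The point about regenerating the flat table can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) rather than the workshop's own database, and the table and column names (phylum, class_, species, phylum_id, class_id) are hypothetical, chosen only for this example.

```python
import sqlite3

# Hypothetical schema: one table per rank, each pointing at the rank above it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phylum  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE class_  (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                          phylum_id INTEGER NOT NULL REFERENCES phylum(id));
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                          class_id INTEGER NOT NULL REFERENCES class_(id));
    INSERT INTO phylum  VALUES (1, 'Mollusca'), (2, 'Echinodermata');
    INSERT INTO class_  VALUES (1, 'Bivalvia', 1), (2, 'Asteroidea', 2);
    INSERT INTO species VALUES (1, 'Abra alba', 1), (2, 'Asterias rubens', 2);
""")

# Joining the rank tables back together yields one flat row per species.
flat_rows = conn.execute("""
    SELECT phylum.name, class_.name, species.name
    FROM species
    JOIN class_ ON species.class_id = class_.id
    JOIN phylum ON class_.phylum_id = phylum.id
    ORDER BY species.name
""").fetchall()

for row in flat_rows:
    print(row)
```

Each extra rank adds one more table and one more join, which is the cost listed under the drawbacks of this design.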

7 Hierarchy: cascading tables

10 Hierarchy: normalised tables
- Advantages
  - Easy to maintain and query
  - Normalised; possible to add information at any level of the hierarchy
- Drawbacks
  - Ranks are hard-wired into the structure of the database
    - A new rank would require a change to the structure of the database
    - And probably to the user interface, web interface…
  - Number of tables
  - A lot of functionality is duplicated

11 Taxonomic reality
- The ranks used depend on the taxonomic group
  - Botany: mainly infra-specific; zoology: mainly at higher levels
  - Many of the ranks are only sparsely used
- Need for a more flexible system
- Much of the functionality is the same across all ranks
  - ‘Parent’, synonymy
  - Authority, description…

12 ‘Open Hierarchy’
- Possible to define new ranks without having to rewrite the structure of the database
- All taxonomic names are stored in one field of a single table; other fields indicate parent and rank
- Many-to-one relation: a single parent, several descendants
  - Include the ID of the parent in the record of the descendant
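
A minimal sketch of such a single-table layout follows, assuming SQLite via Python's sqlite3 module. The column names tname, rank_id and parent_id follow the OBIS queries later in the talk; the ranks table, the rank numbers and the taxon ID 70 are hypothetical and not taken from OBIS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ranks (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tnames (
        id        INTEGER PRIMARY KEY,
        tname     TEXT NOT NULL,
        rank_id   INTEGER NOT NULL REFERENCES ranks(id),
        parent_id INTEGER REFERENCES tnames(id)   -- NULL for the root taxon
    );
    -- Toy rank numbers, invented for this example.
    INSERT INTO ranks VALUES (10, 'kingdom'), (30, 'phylum'), (40, 'subphylum');
    INSERT INTO tnames VALUES
        (5,  'Animalia',   10, NULL),
        (45, 'Arthropoda', 30, 5),
        (65, 'Crustacea',  40, 45);
""")

# A new rank is just a new row: no table or column has to be added.
conn.execute("INSERT INTO ranks VALUES (60, 'class')")
conn.execute("INSERT INTO tnames VALUES (70, 'Malacostraca', 60, 65)")

# Direct descendants are found through the parent_id column.
children = [row[0] for row in conn.execute(
    "SELECT tname FROM tnames WHERE parent_id = 65 ORDER BY tname")]
print(children)
```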

15 Open Hierarchy
- Advantages
  - Completely normalised
  - Flexible
- Drawbacks
  - Difficult to query the classification
    - Queries of the type ‘all species of the Echinodermata’…
  - Solutions
    - ‘Calculated field’
    - Programmatic (loop in a programming language)
    - Recursive query
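
The programmatic option can be sketched as a loop that walks the hierarchy one level at a time. This is only an illustration, assuming SQLite via Python's sqlite3 module, a toy tnames table, and a hypothetical helper named descendants; it also assumes the parent links contain no cycles.

```python
import sqlite3

def descendants(conn, root_name):
    """Collect the names of all taxa below root_name, one level per query."""
    row = conn.execute(
        "SELECT id FROM tnames WHERE tname = ?", (root_name,)).fetchone()
    if row is None:
        return []
    frontier = [row[0]]          # IDs whose children still have to be fetched
    found = []
    while frontier:
        placeholders = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id, tname FROM tnames WHERE parent_id IN ({placeholders})",
            frontier).fetchall()
        found.extend(name for _, name in rows)
        frontier = [taxon_id for taxon_id, _ in rows]
    return sorted(found)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames (id INTEGER PRIMARY KEY, tname TEXT NOT NULL,
                         parent_id INTEGER REFERENCES tnames(id));
    INSERT INTO tnames VALUES
        (1, 'Echinodermata',   NULL),
        (2, 'Asteroidea',      1),
        (3, 'Asterias rubens', 2),
        (4, 'Echinoidea',      1),
        (5, 'Mollusca',        NULL);
""")

echinoderm_taxa = descendants(conn, "Echinodermata")
print(echinoderm_taxa)
```

The loop issues one query per level of the hierarchy, which is simple to write but moves work out of the database and into the client.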

16 Synonymy
- Every taxon can have several synonyms; in principle, there is only one valid name for any synonym
  - Many-to-one relation: one valid name, many synonymous names
  - Include the ID of the valid name in the record of the synonymous name
  - Other fields for the type of synonymy…
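
A minimal sketch of this many-to-one relation, assuming SQLite via Python's sqlite3 module. The column names valid_id and synonym_type, the helper valid_name, and the convention that a valid name points at itself are assumptions made for this example; the taxon names are invented placeholders, not real taxa.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames (
        id           INTEGER PRIMARY KEY,
        tname        TEXT NOT NULL,
        valid_id     INTEGER NOT NULL REFERENCES tnames(id),
        synonym_type TEXT                 -- NULL for the valid name itself
    );
    -- Placeholder names; record 1 is valid and therefore points at itself.
    INSERT INTO tnames VALUES
        (1, 'Aus validus',  1, NULL),
        (2, 'Aus antiquus', 1, 'junior synonym'),
        (3, 'Bus validus',  1, 'new combination');
""")

def valid_name(conn, name):
    """Resolve any name, valid or synonymous, to its valid name."""
    row = conn.execute("""
        SELECT v.tname
        FROM tnames AS s
        JOIN tnames AS v ON s.valid_id = v.id
        WHERE s.tname = ?
    """, (name,)).fetchone()
    return row[0] if row else None

# All synonyms of a valid name: records pointing at it, excluding itself.
synonyms = [row[0] for row in conn.execute(
    "SELECT tname FROM tnames WHERE valid_id = 1 AND id <> 1 ORDER BY tname")]

print(valid_name(conn, "Aus antiquus"), synonyms)
```

Letting a valid name reference itself means a single join resolves every name; storing NULL for valid names would work too, at the cost of a special case in the query.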

18 Implementation in OBIS (PostgreSQL)

19 Calculated field: ‘stored path’
- Calculate a field as a concatenation of the id of the parent, the parent of the parent…
- E.g. x5x45x65x
  - 5: Animalia
  - 45: Arthropoda
  - 65: Crustacea
  - The stored paths of all taxa belonging to Crustacea start with x5x45x65x
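
How such a path could be calculated is sketched below in plain Python. The IDs 5, 45 and 65 come from the slide; the other IDs (900, 650, 901), the parent_of mapping and the function name stored_path are hypothetical. The sketch assumes that a taxon's stored path lists its ancestors only, root first, which matches the query on the next slide.

```python
# Hypothetical parent links: taxon ID -> parent ID (None for the root).
parent_of = {
    5: None,    # Animalia
    45: 5,      # Arthropoda
    65: 45,     # Crustacea
    900: 65,    # some taxon inside Crustacea
    650: 45,    # another child of Arthropoda whose ID merely starts with "65"
    901: 650,   # a taxon below 650, so NOT inside Crustacea
}

def stored_path(parent_of, taxon_id):
    """Return 'x<root>x...x<parent>x' for the ancestors of taxon_id."""
    ancestors = []
    current = parent_of[taxon_id]
    while current is not None:
        ancestors.append(current)
        current = parent_of[current]
    return "x" + "".join(f"{ancestor}x" for ancestor in reversed(ancestors))

# Everything inside Crustacea has a path starting with Crustacea's own
# path followed by Crustacea's ID and the delimiter.
crustacea_prefix = stored_path(parent_of, 65) + "65x"
in_crustacea = [taxon_id for taxon_id in parent_of
                if stored_path(parent_of, taxon_id).startswith(crustacea_prefix)]
print(crustacea_prefix, in_crustacea)
```

The trailing delimiter matters: without it, the path of taxon 901 (x5x45x650x) would wrongly match a prefix ending in 65.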

20 Query the Stored Path
- Get all species from Echinodermata:

    select *
    from obis.tnames
    where storedpath ~ (select '^' || storedpath || id || 'x'
                        from obis.tnames
                        where tname = 'Echinodermata')::text
      and rank_id = 220

21 Recursive query
- All taxa belonging to a given taxon:

    with recursive includedtaxa(id, tname) as (
        select id, tname
        from obis.tnames
        where tname = 'Semelidae'
      union
        select tnames.id, tnames.tname
        from obis.tnames
        inner join includedtaxa on tnames.parent_id = includedtaxa.id
    )
    select * from includedtaxa order by tname
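
To try the same idea without access to the OBIS database, a small stand-in can be built with SQLite, which also supports WITH RECURSIVE. This is a sketch on toy data: the table is a local tnames without the obis schema, and the IDs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames (id INTEGER PRIMARY KEY, tname TEXT NOT NULL,
                         parent_id INTEGER REFERENCES tnames(id));
    INSERT INTO tnames VALUES
        (1, 'Bivalvia',    NULL),
        (2, 'Semelidae',   1),
        (3, 'Abra',        2),
        (4, 'Abra alba',   3),
        (5, 'Abra nitida', 3),
        (6, 'Mytilidae',   1);
""")

# Seed row: the starting taxon. Recursive step: every taxon whose parent
# is already in the result set.
included = [row[0] for row in conn.execute("""
    WITH RECURSIVE includedtaxa(id, tname) AS (
        SELECT id, tname FROM tnames WHERE tname = ?
        UNION
        SELECT tnames.id, tnames.tname
        FROM tnames
        JOIN includedtaxa ON tnames.parent_id = includedtaxa.id
    )
    SELECT tname FROM includedtaxa ORDER BY tname
""", ("Semelidae",))]
print(included)
```

As in the slide's query, the starting taxon itself is part of the result.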

22 The other way
- Finding the parent of a given rank for a species:

    with recursive parenttaxa(id, parent_id, tname) as (
        select id, parent_id, tname
        from obis.tnames
        where tname = 'Abra alba'
      union
        select tnames.id, tnames.parent_id, tnames.tname
        from obis.tnames
        inner join parenttaxa on parenttaxa.parent_id = tnames.id
                             and tnames.rank_id >= 140
    )
    select * from parenttaxa order by tname
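
A variation on the upward walk, sketched with SQLite on toy data: instead of stopping at a rank threshold as in the query above, it climbs the full chain of parents and then keeps only the ancestor of the requested rank. The rank_name column is a hypothetical simplification of rank_id, used so that the example does not have to guess OBIS rank numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames (id INTEGER PRIMARY KEY, tname TEXT NOT NULL,
                         rank_name TEXT NOT NULL,
                         parent_id INTEGER REFERENCES tnames(id));
    INSERT INTO tnames VALUES
        (1, 'Bivalvia',  'class',   NULL),
        (2, 'Semelidae', 'family',  1),
        (3, 'Abra',      'genus',   2),
        (4, 'Abra alba', 'species', 3);
""")

def ancestor_of_rank(conn, name, rank_name):
    """Walk up from `name` and return the ancestor with the given rank."""
    row = conn.execute("""
        WITH RECURSIVE parenttaxa(id, parent_id, tname, rank_name) AS (
            SELECT id, parent_id, tname, rank_name
            FROM tnames WHERE tname = ?
            UNION
            SELECT t.id, t.parent_id, t.tname, t.rank_name
            FROM tnames AS t
            JOIN parenttaxa AS p ON p.parent_id = t.id
        )
        SELECT tname FROM parenttaxa WHERE rank_name = ?
    """, (name, rank_name)).fetchone()
    return row[0] if row else None

print(ancestor_of_rank(conn, "Abra alba", "family"))
```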

23 Rest of the taxonomic model
- Ranks should be in a separate table
  - Information on the level of the rank can be added
  - Possibility of extra quality control
    - Rank of a parent as compared to the rank of its descendants
    - Rank of siblings should be the same
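
The two quality-control checks could look roughly like this, assuming SQLite via Python's sqlite3 module. The ranks table with a numeric level column (smaller means higher in the hierarchy) and all IDs are hypothetical; one record is entered wrongly on purpose so that both checks have something to report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ranks (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                        level INTEGER NOT NULL);
    CREATE TABLE tnames (id INTEGER PRIMARY KEY, tname TEXT NOT NULL,
                         rank_id INTEGER NOT NULL REFERENCES ranks(id),
                         parent_id INTEGER REFERENCES tnames(id));
    INSERT INTO ranks VALUES (1, 'family', 1), (2, 'genus', 2), (3, 'species', 3);
    INSERT INTO tnames VALUES
        (1, 'Semelidae',   1, NULL),
        (2, 'Abra',        2, 1),
        (3, 'Abra alba',   3, 2),
        (4, 'Abra nitida', 2, 2);   -- deliberate error: species entered as genus
""")

# Check 1: a descendant must have a lower rank (larger level) than its parent.
bad_parent_rank = [row[0] for row in conn.execute("""
    SELECT c.tname
    FROM tnames AS c
    JOIN tnames AS p  ON c.parent_id = p.id
    JOIN ranks  AS cr ON c.rank_id = cr.id
    JOIN ranks  AS pr ON p.rank_id = pr.id
    WHERE cr.level <= pr.level
    ORDER BY c.tname
""")]

# Check 2: all children of one parent should share the same rank.
mixed_sibling_ranks = [row[0] for row in conn.execute("""
    SELECT p.tname
    FROM tnames AS c
    JOIN tnames AS p ON c.parent_id = p.id
    GROUP BY p.id, p.tname
    HAVING COUNT(DISTINCT c.rank_id) > 1
    ORDER BY p.tname
""")]

print(bad_parent_rank, mixed_sibling_ranks)
```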

25 Documentation
- Documenting sources of information
- Add sources/references
  - ‘Audit trail’: source of the information in the database
  - Taxonomic information: reference of the original description
  - Type of the source: expert, database, publication
- Date and person responsible for the last revision of the record

26 Sources
- Many-to-many relation
  - Every source can contain information on several taxa
  - A single taxon can be documented in several sources
- Necessitates an extra table to represent the relationship
  - Divide one many-to-many relationship into two one-to-many relationships
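
The extra table can be sketched as follows, assuming SQLite via Python's sqlite3 module. The table names sources and taxon_sources, their columns, and the source titles are hypothetical placeholders, not real references.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames  (id INTEGER PRIMARY KEY, tname TEXT NOT NULL);
    CREATE TABLE sources (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                          source_type TEXT NOT NULL);
    -- Link table: one row per (taxon, source) pair.
    CREATE TABLE taxon_sources (
        taxon_id  INTEGER NOT NULL REFERENCES tnames(id),
        source_id INTEGER NOT NULL REFERENCES sources(id),
        PRIMARY KEY (taxon_id, source_id)
    );
    INSERT INTO tnames  VALUES (1, 'Abra alba'), (2, 'Abra nitida');
    INSERT INTO sources VALUES (1, 'Checklist A', 'publication'),
                               (2, 'Expert B',    'expert');
    INSERT INTO taxon_sources VALUES (1, 1), (1, 2), (2, 1);
""")

# One taxon, several sources.
sources_for_taxon = [row[0] for row in conn.execute("""
    SELECT s.title
    FROM sources AS s
    JOIN taxon_sources AS ts ON ts.source_id = s.id
    JOIN tnames AS t ON t.id = ts.taxon_id
    WHERE t.tname = ?
    ORDER BY s.title
""", ("Abra alba",))]

# One source, several taxa.
taxa_in_source = [row[0] for row in conn.execute("""
    SELECT t.tname
    FROM tnames AS t
    JOIN taxon_sources AS ts ON ts.taxon_id = t.id
    JOIN sources AS s ON s.id = ts.source_id
    WHERE s.title = ?
    ORDER BY t.tname
""", ("Checklist A",))]

print(sources_for_taxon, taxa_in_source)
```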

28 Add distribution
- Localities from where a taxon has been reported
- Many-to-many relation
  - One locality has several taxa
  - One taxon is found at several localities
- Relation must be qualified
  - Source!
  - Validity of the observation
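
A qualified relation means the link table carries its own attributes. The sketch below assumes SQLite via Python's sqlite3 module; the distribution table, its is_valid flag, the locality names and all records are hypothetical and do not describe real observations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tnames     (id INTEGER PRIMARY KEY, tname TEXT NOT NULL);
    CREATE TABLE localities (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE sources    (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Each distribution record names its source and carries a validity flag.
    CREATE TABLE distribution (
        taxon_id    INTEGER NOT NULL REFERENCES tnames(id),
        locality_id INTEGER NOT NULL REFERENCES localities(id),
        source_id   INTEGER NOT NULL REFERENCES sources(id),
        is_valid    INTEGER NOT NULL,          -- 1 = accepted, 0 = doubtful
        PRIMARY KEY (taxon_id, locality_id, source_id)
    );
    INSERT INTO tnames     VALUES (1, 'Abra alba'), (2, 'Abra nitida');
    INSERT INTO localities VALUES (1, 'Locality A'), (2, 'Locality B');
    INSERT INTO sources    VALUES (1, 'Checklist A'), (2, 'Expert B');
    INSERT INTO distribution VALUES (1, 1, 1, 1), (1, 2, 1, 0), (2, 1, 2, 1);
""")

# Localities with an accepted record for one taxon.
valid_localities = [row[0] for row in conn.execute("""
    SELECT DISTINCT l.name
    FROM distribution AS d
    JOIN localities AS l ON l.id = d.locality_id
    JOIN tnames AS t ON t.id = d.taxon_id
    WHERE t.tname = ? AND d.is_valid = 1
    ORDER BY l.name
""", ("Abra alba",))]

# Taxa with an accepted record at one locality.
taxa_at_locality = [row[0] for row in conn.execute("""
    SELECT DISTINCT t.tname
    FROM distribution AS d
    JOIN tnames AS t ON t.id = d.taxon_id
    JOIN localities AS l ON l.id = d.locality_id
    WHERE l.name = ? AND d.is_valid = 1
    ORDER BY t.tname
""", ("Locality A",))]

print(valid_localities, taxa_at_locality)
```

Including source_id in the primary key lets the same taxon and locality pair be reported by more than one source, each report with its own validity.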

