A standardised database for fisheries data
CM 2004/FF:15
Vojtěch Kupča
Marine Research Institute, Iceland
Why another database?
- DST2 project
- Gadget-tuned data source
- other reasons:
  - standardised data format (all species the same, more countries, etc.)
  - mild aggregation of data (hides details, allows sharing)
  - local databases not always in good shape
  - consistent (fast) data source
  - possibly used in other projects (flexibility)
  - VPA data
Data Management
Data flow in the DST2 project:
- collection of ecosystem data
- input and storage of data in the local database
- extraction and transformation process
- standardised data format
- upload of formatted data into the new database
- extraction of data in the Gadget format
- Gadget simulation using extracted data
- output of simulation, postprocessing, presentation
Database content
Database contains:
- biological sample (survey + catch)
- stomach data
- catch data (landings)
- tagging data
- acoustic data
- environmental data
Database hierarchy
Characteristics
Database characteristics:
- controlled content using “lookup tables”, for example:
    Gadus morhua      Cod
    Sebastes marinus  Redfish
- no indexing information
Data characteristics:
- normalized (no redundancy in data)
- slightly aggregated
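As an illustration of how a lookup table can control content, the following minimal sketch rejects species names that are not listed. Only the two species entries come from the slide; the names `SPECIES_LOOKUP` and `unknown_species` are hypothetical, and the actual database may enforce this differently (for example with foreign keys).

```python
# Hypothetical lookup table: scientific name -> common name.
# The two entries are the examples shown on the slide.
SPECIES_LOOKUP = {
    "Gadus morhua": "Cod",
    "Sebastes marinus": "Redfish",
}


def unknown_species(names):
    """Return the names that do not appear in the lookup table."""
    return [name for name in names if name not in SPECIES_LOOKUP]


# A misspelt name is caught before it can enter the database.
rejected = unknown_species(["Gadus morhua", "Gadus morhau"])
print(rejected)
```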
Data import
- upload into the database defined by a simple column-based text format
    LEC COD 600
    LEC COD 610
- table code, year, quarter, month, area code, species code, length
- a metadata web page describes the contents
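A reader for such a column-based format could look roughly like the sketch below. It assumes one whitespace-separated record per line with exactly the seven fields listed on the slide, in that order; the field types, the class `ImportRecord`, and the sample rows (in particular the year, quarter, month, and area values) are hypothetical and not taken from the real upload format.

```python
from dataclasses import dataclass

# Field order as listed on the slide; the types are assumptions.
FIELDS = ("table_code", "year", "quarter", "month",
          "area_code", "species_code", "length")


@dataclass(frozen=True)
class ImportRecord:
    table_code: str
    year: int
    quarter: int
    month: int
    area_code: str
    species_code: str
    length: int


def parse_line(line):
    """Parse one whitespace-separated record into an ImportRecord."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} columns, got {len(parts)}: {line!r}"
        )
    table_code, year, quarter, month, area_code, species_code, length = parts
    return ImportRecord(table_code, int(year), int(quarter), int(month),
                        area_code, species_code, int(length))


def parse_import(text):
    """Parse a whole upload file, skipping blank lines."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample rows (only LEC, COD, 600 and 610 appear on the slide).
sample = "LEC 1999 1 2 A1 COD 600\nLEC 1999 1 2 A1 COD 610\n"
records = parse_import(sample)
```

Rejecting rows with the wrong number of columns at parse time is one way to keep malformed source data from reaching the database.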
Data export
- extraction of data from the database controlled by keywords (parameters) in the “control file”
    filetype GALK
    years 1999
    species COD
- an extraction program assembles the queries and returns a Gadget-formatted output file
    GALK year step area age length number age1 len age1 len2 1
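The idea of a keyword-driven extraction can be sketched as follows. This is not the project's extraction program: it assumes the control file holds one keyword per line followed by its values, and the table name `age_length` and the column names in the query are hypothetical stand-ins for the real schema.

```python
from typing import Dict, List, Tuple


def parse_control(text: str) -> Dict[str, List[str]]:
    """Read 'keyword value [value ...]' lines into a dictionary."""
    params: Dict[str, List[str]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        keyword, *values = parts
        params[keyword] = values
    return params


def build_query(params: Dict[str, List[str]]) -> Tuple[str, List[str]]:
    """Assemble a parameterised SQL query from the control keywords.

    The table and column names are assumptions made for illustration.
    """
    clauses: List[str] = []
    args: List[str] = []
    for keyword, column in (("years", "year"), ("species", "species_code")):
        values = params.get(keyword)
        if values:
            placeholders = ", ".join("?" for _ in values)
            clauses.append(f"{column} IN ({placeholders})")
            args.extend(values)
    sql = "SELECT year, step, area, age, length, number FROM age_length"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, args


# The control file shown on the slide.
control = "filetype GALK\nyears 1999\nspecies COD\n"
params = parse_control(control)
sql, args = build_query(params)
```

Passing the values as query parameters rather than pasting them into the SQL string keeps the control file from being able to alter the query structure.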
Some problems
- various kinds of errors propagating from the source databases can be difficult to spot
- large-scale deletion may be complicated due to the way indexing of records is handled
More information