Components of the OTN Data Centre
Document-oriented web-based data repository
 – Data and metadata in secure folders per user, in their original formats
 – Managed by the user
 – Access and authorship permissions set by data owner
Standardized metadata sheets for user projects
 – Receivers and Sentinel Tags
 – Tagging
 – AUVs (Gliders)
 – Environmental Sensors
 – Cruise/Mission metadata
Public-facing summary page describing project scopes and general location
 – Citation text, abstract, contact information
Components of the OTN Data Centre
Database
 – Open-source, geospatially-aware database engine
 – Project-level permissions, with finer permissions available to power users (people fluent in SQL)
 – Data and metadata verification done by data managers and by associated scripts
 – Aggregation of all project data into a queryable format
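The script-based verification mentioned on this slide could look something like the following. This is a minimal sketch, not the OTN verification code: the field names (`station`, `latitude`, `longitude`, `deploy_date`, `recover_date`), the station name, and the specific checks are assumptions chosen for illustration.

```python
from datetime import date

def verify_receiver_deployment(record):
    """Return a list of problems found in one receiver deployment record.

    Field names are hypothetical, not the columns of the OTN metadata sheets.
    """
    problems = []
    if not record.get("station"):
        problems.append("missing station name")
    lat, lon = record.get("latitude"), record.get("longitude")
    if lat is None or not -90 <= lat <= 90:
        problems.append("latitude missing or out of range")
    if lon is None or not -180 <= lon <= 180:
        problems.append("longitude missing or out of range")
    deployed, recovered = record.get("deploy_date"), record.get("recover_date")
    if deployed is None:
        problems.append("missing deploy date")
    elif recovered is not None and recovered < deployed:
        problems.append("recovered before deployed")
    return problems

# Invented example records, for illustration only.
good = {"station": "HFX001", "latitude": 44.5, "longitude": -63.5,
        "deploy_date": date(2014, 5, 1), "recover_date": date(2015, 5, 1)}
bad = dict(good, latitude=144.5, recover_date=date(2013, 5, 1))

print(verify_receiver_deployment(good))
print(verify_receiver_deployment(bad))
```

Returning a list of problems rather than raising on the first one lets a data manager review everything wrong with a record in a single pass.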
Components of the OTN Data Centre
Interfaces to various endpoints for your data
 – Web and GIS-friendly representations of data
 – International metadata standards for acoustic detections (submitting data to OBIS)
 – Publishable aspects of project data delivered in many formats
Data Nodes:
Choice of DMS and web front end left to local data managers
All collection data kept private, even from the OTN parent node
Project metadata can be harvested if the node chooses to generate it (green area of DB)
Database internal layout common to all nodes, allowing for protocol and code exchange
Access to community code repositories for data loading and generating data products
Puppet script repository – automated deployment of nodes
Publishing OS and DB build process in Git for managing evolving data formats and structures
Still in development
Easy to deploy via VM tools like Vagrant
Hopefully useful when deploying to cloud services
Node Toolset - Community of Acoustic Telemetry DMs
Continuously evolving data parsers and insertion scripts for common data formats
Data managers can author their own scripts or otherwise enhance the existing toolset
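As an illustration of the kind of parser a data manager might author, here is a minimal sketch that reads detection rows from CSV text. The column names (`datetime_utc`, `receiver`, `tag_id`) and the timestamp format are assumptions; a real vendor export will have its own headers, and this is not part of the existing toolset.

```python
import csv
import io
from datetime import datetime

def parse_detections(csv_text):
    """Parse detection rows from CSV text into dictionaries.

    Column names and the timestamp format are hypothetical.
    """
    rows = []
    # Data starts on line 2 because line 1 is the header row.
    for line_no, raw in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        try:
            rows.append({
                "detected_at": datetime.strptime(raw["datetime_utc"], "%Y-%m-%d %H:%M:%S"),
                "receiver": raw["receiver"].strip(),
                "tag_id": raw["tag_id"].strip(),
            })
        except (KeyError, ValueError, AttributeError) as err:
            # Report the offending line so the source file can be corrected.
            raise ValueError(f"line {line_no}: {err}") from err
    return rows

# Invented sample data, for illustration only.
sample = (
    "datetime_utc,receiver,tag_id\n"
    "2014-06-01 12:00:00,RCV-1,TAG-100\n"
    "2014-06-01 12:05:30, RCV-1 ,TAG-200\n"
)
detections = parse_detections(sample)
print(len(detections), detections[1]["receiver"])
```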
Node Toolset - Community of Acoustic Telemetry DMs
A platform for code-sharing with permissions ranging from personal to completely public
Mechanism for documentation, feedback, feature suggestion, and dissemination
Contributions can be suggested with code or through issue-tracking
Summary
OTN Database Node structure available for sharing
 – Built to support the OTN Data Policy
 – Helpful in identifying mystery / orphan tags across OTN
Coding community for managing OTN DB Nodes
 – But also for visualization, statistical modelling, etc!
 – Programmers can contribute, non-programmers can define problems and OTN et al. can help solve them
Equipment loans to extend existing telemetry effort
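To make the mystery / orphan tag idea concrete, here is a minimal sketch of the matching step, assuming tag IDs are plain strings and that each project's released tags are available as a set. The function name and data layout are hypothetical, not the OTN implementation.

```python
def match_mystery_tags(detected, own_tags, tags_by_project):
    """Map each detected tag that the local project did not release to the
    projects that list it. An empty list marks a still-unidentified tag.
    """
    matches = {}
    for tag_id in sorted(set(detected) - set(own_tags)):
        matches[tag_id] = sorted(
            project for project, tags in tags_by_project.items() if tag_id in tags
        )
    return matches

# Invented example data, for illustration only.
detected = ["TAG-1", "TAG-2", "TAG-3", "TAG-2"]
own_tags = {"TAG-1"}
tags_by_project = {"project_a": {"TAG-2"}, "project_b": {"TAG-9"}}

print(match_mystery_tags(detected, own_tags, tags_by_project))
```

In this example `TAG-2` resolves to `project_a`, while `TAG-3` stays unmatched and would remain an orphan until some project reports it.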
Installing an OTN DB Node VM
1. Install Vagrant (vagrantup.com), VirtualBox (4.3.xx) and Git (
2. Create a login to OTN’s GitLab – Jon will add you to the OTN Partner Nodes group so you can download the Node manifest
3. Create a folder on your computer to hold your Virtual Machines and navigate to it using Git Bash or Terminal
4. Type the following commands:
   git clone partner-nodes/db-node-puppet-installer.git
   cd db-node-puppet-installer
   vagrant up
Installing the Node VM will give you a temporary copy of the OTN database to work in. It will only be accessible from your local machine, and can be deleted whenever you like. We’ll use it to go through some data ingestion and formatting exercises over the next few days.
Draft OTN Node Training Overview
What you get in a Node VM
PostgreSQL
PostGIS Template
OTN DB Templates
Tomcat
GeoServer
ERDDAP
VM-only
 – Miniconda Python environment
 – Jupyter (IPython Notebook)
 – OTN.ipynb repo for data loading and verification
The OTN Database Node
Just one component of an acoustic telemetry data management system
Bridge between input data from research groups and well-formatted output data to researchers, stakeholders and the public
Other crucial components:
 – Document management system for input data
 – Web portal for output data
Tentative Schedule for DB Training
Monday
Projects and schemas
 – Create project metadata
 – Create and populate project schema
 – Vendor-provided metadata
Tagging Metadata
 – Load raw tag metadata
 – Verification
 – Load cache tables + verify OBIS-like tables
Tuesday
Receivers
 – Receiver metadata: Short form (OTN), Long form (OTN)
 – Add station records
 – Verification
 – Sentinel tags if there’s time
Wednesday
Detection processing
 – Events and Detections loading from CSV
 – Batch processing w/ ULFX (if time)
 – YYYY tables – sharding and inheritance
 – Sensor-enhanced detections
 – Generating Detection Extracts
 – Discovery process – creating publishable metadata
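The "YYYY tables" item refers to splitting detections into per-year tables. A minimal sketch of the routing step is below, assuming each detection carries a `detected_at` timestamp and that yearly tables are named `detections_YYYY`. Both the field name and the naming pattern are assumptions for illustration, not the OTN schema.

```python
from collections import defaultdict
from datetime import datetime

def yearly_table_name(detected_at, prefix="detections"):
    """Build the name of the per-year table for a timestamp (hypothetical naming)."""
    return f"{prefix}_{detected_at.year}"

def shard_by_year(detections):
    """Group detection records by the yearly table they would be loaded into."""
    shards = defaultdict(list)
    for det in detections:
        shards[yearly_table_name(det["detected_at"])].append(det)
    return dict(shards)

# Invented example data, for illustration only.
records = [
    {"tag_id": "TAG-1", "detected_at": datetime(2013, 12, 31, 23, 59)},
    {"tag_id": "TAG-1", "detected_at": datetime(2014, 1, 1, 0, 1)},
    {"tag_id": "TAG-2", "detected_at": datetime(2014, 7, 15, 8, 30)},
]
shards = shard_by_year(records)
print({table: len(rows) for table, rows in shards.items()})
```

In a PostgreSQL setup using table inheritance, each yearly table would inherit from a common parent so that queries against the parent see all years; that part is not shown here.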
Three main components of acoustic telemetry data
[Diagram labels: Detection Event; Detection Data; Receiver Deployment; Tagging Activity]
Receiver deployments: generally uncontroversial and publishable data. Useful for informing potential collaborators of existing equipment deployed in their intended study areas that could detect their tags.
Detection data: protected but not very informative without associated tagging activity data to add the ‘what’ to the ‘where’ and ‘when’.
Tagging activity: The history of which tags are in which animals, where those animals were released, how long the tag will live, and all auxiliary measurements and observations made at tagging time by the researchers. Very sensitive, embargoed.
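The way the three components combine can be sketched as a simple lookup: the detection supplies the 'when', the receiver deployment supplies the 'where', and the tagging record supplies the 'what'. The record layouts, field names, and values below are invented for illustration and are not the OTN table structure.

```python
from datetime import datetime

def enrich_detection(detection, deployments, tagging):
    """Attach 'where' (station) and 'what' (animal) to one detection."""
    when = detection["detected_at"]
    # Find the deployment of this receiver that was in the water at that time.
    where = next(
        (d for d in deployments
         if d["receiver"] == detection["receiver"]
         and d["deployed"] <= when
         and (d["recovered"] is None or when <= d["recovered"])),
        None,
    )
    what = tagging.get(detection["tag_id"])
    return {
        "when": when,
        "where": where["station"] if where else None,
        "what": what["species"] if what else None,
    }

# Invented example data, for illustration only.
deployments = [
    {"receiver": "RCV-1", "station": "STN-A",
     "deployed": datetime(2014, 5, 1), "recovered": datetime(2014, 10, 1)},
    {"receiver": "RCV-1", "station": "STN-B",
     "deployed": datetime(2014, 10, 2), "recovered": None},
]
tagging = {"TAG-100": {"species": "Atlantic salmon"}}

known = enrich_detection(
    {"receiver": "RCV-1", "tag_id": "TAG-100", "detected_at": datetime(2014, 6, 1, 12)},
    deployments, tagging)
unknown = enrich_detection(
    {"receiver": "RCV-1", "tag_id": "TAG-999", "detected_at": datetime(2014, 11, 1)},
    deployments, tagging)
print(known)
print(unknown)
```

The second detection shows why detection data alone is not very informative: without a tagging record, the 'what' stays empty even though the 'where' and 'when' are known.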
Jon Pye – Portal Manager Ocean Tracking Network Dalhousie University Halifax, Nova Scotia Canada 1 (902) oceantrackingnetwork.org