OBIS. Current situation Working on new IT platform Present technology 8 years old Data ingestion going fine Including data quality Position, time Taxonomy.

Presentation transcript:

OBIS

Current situation
- Working on new IT platform
  - Present technology 8 years old
- Data ingestion going fine
  - Including data quality: position, time, taxonomy
- Web site well visited

Number of records (M)

Number of datasets

Average size dataset (K)

Web statistics

Data statistics

Analysis of content
- First preliminary analyses
- Has to take into account huge bias
  - Geography: mostly coastal, mostly northern hemisphere
  - Taxonomy
  - Presence-only
- ‘Safety in numbers’

Number of records
- For known species most important to your project, what major discoveries have been made about their range or distribution?
- What is least known with regard to their distribution that you would like to know?

Number of species

Hurlbert’s index (es(50))
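
As a point of reference for the ES(50) maps, Hurlbert's index is the expected number of species in a random subsample of n records drawn without replacement, ES(n) = Σ [1 − C(N − Nᵢ, n) / C(N, n)], where N is the total number of records and Nᵢ the number of records of species i. The following is a minimal sketch of that formula; the function name `hurlbert_es` and the input format (a list of per-species record counts) are illustrative assumptions, not OBIS code.

```python
from math import comb


def hurlbert_es(counts, n=50):
    """Expected number of species in a random subsample of n records.

    counts: records per species (hypothetical input format).
    """
    total = sum(counts)
    if total < n:
        raise ValueError("sample holds fewer than n records")
    denom = comb(total, n)
    # comb(total - c, n) is 0 whenever total - c < n, i.e. the species
    # is certain to appear in every subsample of size n.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)
```

A cell with 50 records that all belong to different species gives ES(50) = 50, and a cell holding a single species gives ES(50) = 1, however many records it contains.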

Large marine ecosystems

‘Age’ of record – trends study

Latitudinal gradient ES(50)

Marine fish to be discovered
Mora et al. (2007). The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes. Proc. R. Soc. B, published online.
[Map legend: percentage completeness, 1 to 100]

How good is the data?
- Data are from many sources
- Inconsistencies become apparent
  - Differences in names used
  - Mistakes in transformations
    - Decimalising lat/lon
- Needs quality control
- Data collection driven by priorities
  - Sampling bias; resolution
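
One transformation the slide singles out is decimalising latitude and longitude. A plausible mistake (an assumption for illustration, the slides do not list specific cases) is reading 51°30′ as 51.30 rather than 51.5. The sketch below shows the conversion with basic range checks; the function name and argument layout are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere letter to decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be N, S, E or W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180
    if value > limit:
        raise ValueError("coordinate out of range")
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value
```

For example, `dms_to_decimal(51, 30, 0, "N")` returns 51.5 and `dms_to_decimal(2, 15, 0, "W")` returns -2.25.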

Quality control
- Check formal record structure
- Check date/time
- Check position
  - In the ocean?
  - In dataset bounding box?
- Check taxonomy
  - Problem: no reference list
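
The checks listed on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the OBIS pipeline: the field names, the ISO date format and the `(min_lon, min_lat, max_lon, max_lat)` bounding-box layout are all hypothetical. The "in the ocean?" test needs a land/sea mask and is left out.

```python
from datetime import date

# Hypothetical minimal record structure.
REQUIRED = ("scientific_name", "latitude", "longitude", "date")


def check_record(record, bbox, today=None):
    """Return a list of problems found in one occurrence record."""
    missing = [f for f in REQUIRED if record.get(f) in (None, "")]
    if missing:
        return ["missing field: " + f for f in missing]

    problems = []
    try:
        collected = date.fromisoformat(record["date"])
        if collected > (today or date.today()):
            problems.append("date in the future")
    except ValueError:
        problems.append("unparseable date")

    lat, lon = record["latitude"], record["longitude"]
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        problems.append("position outside valid range")
    else:
        min_lon, min_lat, max_lon, max_lat = bbox
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            problems.append("position outside dataset bounding box")
    return problems
```

Returning a list of problems rather than a single pass/fail flag keeps every failed check visible for each record.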

New species are discovered Data from

Problems with taxonomic names
- Misspellings
- Mixed with other information
  - Gadus sp.; Gadus sp. A; Gadus sp. a…
  - Gadus morhua?; Gadus cfr morhua; Gadus aff. morhua…
  - Gadus morhua juv.; Gadus morhua juvenile; Gadus morhua juveniles…
  - Mixed with ecological/sampling information
- Also variation in classification and author string
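
To illustrate how names "mixed with other information" might be reduced to a clean name, here is a minimal sketch that strips the qualifiers shown on this slide. The patterns are assumptions covering only these examples. Real data would need a much longer list, and whether a qualifier should be discarded at all is a curatorial decision.

```python
import re

# Uncertainty qualifiers such as "cfr", "cf." and "aff.".
QUALIFIERS = re.compile(r"\b(?:cfr|cf|aff)\b\.?\s*", re.IGNORECASE)
# Trailing life-stage remarks such as "juv." or "juveniles".
LIFE_STAGE = re.compile(r"\s+(?:juv\.?|juveniles?)$", re.IGNORECASE)
# Trailing "sp.", optionally followed by a label such as "A".
OPEN_NOMENCLATURE = re.compile(r"\s+sp\.(?:\s+\w+)?$", re.IGNORECASE)


def clean_name(raw):
    """Strip qualifiers from a raw name string, keeping genus and epithet."""
    name = raw.strip().rstrip("?").strip()
    name = LIFE_STAGE.sub("", name)
    name = QUALIFIERS.sub("", name)
    name = OPEN_NOMENCLATURE.sub("", name)
    return " ".join(name.split())
```

With these patterns, all the "Gadus morhua" variants above collapse to `Gadus morhua`, and the "Gadus sp." variants collapse to the genus name `Gadus`.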

Examples of variation
- Callorhinchus callorynchus
- Cirrhinus or Cirrhina
- Cirrhinus cirrhosa or C. cirrhosus
- Cirrhina cirrhosa or C. cirrhosus
- Microsoft helping a bit:
  - Calinectes ornatus Ordway, 1863
  - Calinectes ornatus Ordway, 1864
  - …
  - Calinectes ornatus Ordway, 1891
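
Spelling variants like these can often be caught by approximate matching against a reference list. The sketch below uses `difflib.get_close_matches` from the Python standard library. The three-name reference list is a hypothetical stand-in for a real register, the 0.8 cutoff is an arbitrary assumption, and nothing here is claimed to be the matching method OBIS actually used. The author string is stripped first, so that a year incremented by spreadsheet autofill does not affect the match.

```python
import re
from difflib import get_close_matches

# Hypothetical reference list; a real one would come from a taxonomic register.
REFERENCE = [
    "Callorhinchus callorynchus",
    "Cirrhinus cirrhosus",
    "Callinectes ornatus",
]

# Trailing authority such as "Ordway, 1864" or "(Linnaeus, 1758)".
AUTHORITY = re.compile(r"\s+\(?[A-Z][^\d,]*,\s*\d{4}\)?$")


def match_name(raw, reference=REFERENCE, cutoff=0.8):
    """Return the closest reference name, or None if nothing is similar enough."""
    name = AUTHORITY.sub("", raw.strip())
    hits = get_close_matches(name, reference, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```

Here `match_name("Calinectes ornatus Ordway, 1891")` resolves to `Callinectes ornatus` regardless of the year, and `match_name("Cirrhina cirrhosa")` resolves to `Cirrhinus cirrhosus`. Fuzzy hits are best reviewed before being accepted, since two valid names can also be near-identical.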

Number of ‘species’ in OBIS
- 147K unique ‘scientific names’
- 132K ‘clean names’
  - Approx 10% reduced (from 147K)
  - 80K match with WoRMS
  - 11K known synonyms or misspellings
  - Non-matches assumed valid
- 121K ‘valid names’
  - Approx 20% reduced
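
The percentages on this slide can be checked with a few lines of arithmetic. The sketch assumes that the 121K 'valid names' are the 132K clean names minus the 11K known synonyms or misspellings. That reading fits the figures but is not stated explicitly on the slide.

```python
unique_names = 147_000                    # unique 'scientific names'
clean_names = 132_000                     # after cleaning the name strings
known_synonyms_or_misspellings = 11_000   # among the names matched with WoRMS

# Assumed derivation: clean names minus known synonyms/misspellings.
valid_names = clean_names - known_synonyms_or_misspellings

reduction_clean = 1 - clean_names / unique_names   # about 0.10
reduction_valid = 1 - valid_names / unique_names   # about 0.18

print(f"valid names: {valid_names}")
print(f"reduction after cleaning: {reduction_clean:.1%}")
print(f"reduction after synonym resolution: {reduction_valid:.1%}")
```

The second reduction works out to roughly 18% of the original 147K, which is in line with the slide's "approx 20%".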

Reduction of es(50) per 5d square

Same for fish

General patterns indistinguishable
[Figure panels: All, Fish; Dirty, Clean]

Completeness

How to get OBIS data?
- Web site
- DiGIR provider
- OGC-compliant web services
  - Exist on experimental basis
- Google Base
- Ask us!
  - Custom data extraction
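
For readers unfamiliar with OGC web services, a request to a Web Feature Service (WFS) is typically an HTTP GET with key-value parameters. The sketch below only builds such a URL using the standard WFS 1.0.0 `GetFeature` parameters. The endpoint `https://example.org/wfs` and the layer name `obis:occurrences` are placeholders, not real OBIS addresses, and the slides do not say which OGC services were offered.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, bbox, max_features=100):
    """Build a WFS 1.0.0 GetFeature URL for a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": ",".join(str(v) for v in bbox),
        "maxFeatures": max_features,
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and layer name, for illustration only.
url = wfs_getfeature_url("https://example.org/wfs", "obis:occurrences", (0, 50, 10, 70))
```

The resulting URL could then be fetched with any HTTP client. The response format depends on the server.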

Data from field projects
- Not always easy to ‘trace’
  - Not well documented what is CoML data, and which field project it belongs to
- Needs mechanism to better document
  - Part of the metadata?
- Exercise was done at iOBIS
  - Spreadsheet will be made available
  - Please check
  - In general, good agreement with our understanding and information from annual reports

Field projects
Columns: Acronym; # datasets as per group; # records as per group; # datasets in OBIS; # records in OBIS
Rows: CeDaAMar; CAML; ArcOD; CoMargE; POST: 1?; CReefs; ICoMM; Mar-ECO: 1?12744; NaGISA; GoMA; CenSeam; TOPP: 0000; ChEss; CMarZ: 11141; HMAP: 10319,; FMAP

How to get data in OBIS?
- Dialogue ongoing with all major providers
  - All field projects
  - Regional OBIS Nodes (RONs)
  - FishBase, OBIS SEAMAP…
- iOBIS needs time to ingest data
  - Quality control…
  - Data cycle
- Lag in data availability ~3 months
  - Depending on quality of the data