AquaMaps - behind the scene Josephine “Skit” D. Rius FishBase Project/INCOFISH WP1 WorldFish Center INCOFISH WP3 Technical Workshop Campinas, Brazil 18-22.


AquaMaps - behind the scene Josephine “Skit” D. Rius FishBase Project/INCOFISH WP1 WorldFish Center INCOFISH WP3 Technical Workshop Campinas, Brazil April 2006

– Description of tables (HCAF, HSPEN, HSPEC)
– How the tables relate
– Rules on calculating the envelope values of environmental parameters
– How the probabilities are calculated

HCAF (Half-degree Cells Authority File)
Contains information on:
a) various cell names (codes) that are used by this and other databases;
b) statistical cell properties such as center, limits, and area;
c) membership in relevant areas (FAO areas, EEZs or LMEs);
d) physical attributes (depth, salinity or temperature);
e) biological properties (e.g. primary production).
Data providers: Sea Around Us Project, CSIRO, Kansas Geological Survey, Kristin Kaschner

HSPEN (Species Environmental Envelope)
Contains information used for describing the environmental tolerance and preference of a species:
– distribution using FAO areas and bounding box
– range of values per environmental parameter (min., preferred min., preferred max., max.)
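As an illustration of the four values HSPEN holds per environmental parameter, here is a minimal sketch. The class name `Envelope`, its field names, and the example numbers are hypothetical and are not taken from the actual HSPEN table layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """One species/parameter envelope (hypothetical layout, not the real HSPEN schema)."""
    minimum: float    # absolute minimum tolerated
    pref_min: float   # lower end of the preferred range
    pref_max: float   # upper end of the preferred range
    maximum: float    # absolute maximum tolerated

    def __post_init__(self):
        # The four values only make sense when they are ordered.
        if not (self.minimum <= self.pref_min <= self.pref_max <= self.maximum):
            raise ValueError("expected minimum <= pref_min <= pref_max <= maximum")


# Made-up sea surface temperature envelope, for illustration only.
sst = Envelope(minimum=2.0, pref_min=8.5, pref_max=21.0, maximum=27.5)
```

Keeping the ordering check in the constructor means a malformed envelope is rejected before any probability is computed from it.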

HSPEC (Half-degree Species Assignment)
– Contains the assignment of a species to a half-degree cell and the corresponding probability of occurrence of the species in that cell.
– Overall probability is the product of the individual probabilities for each of the environmental parameters (SST, salinity, prim. prod., sea ice concentration, distance to land).

How the tables relate…

Get candidate species and corresponding occurrence data
Using the Occurrence table of FishBase, get all ‘good’ point records of marine species, where:
– ‘Good point’ records are a set of at least 10 non-doubtful point records, which should occur in at least 10 ‘good cells’.
– A ‘good cell’ lies in the FAO area OR bounding box in which the species occurs.
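The selection rule above can be sketched as follows. The record layout (`PointRecord`), the bounding-box order (south, west, north, east), and the function names are assumptions made for illustration, and the box test ignores boxes that cross the antimeridian. None of this reflects the actual FishBase Occurrence table.

```python
from typing import NamedTuple


class PointRecord(NamedTuple):
    """Hypothetical occurrence record; not the FishBase schema."""
    cell_id: str     # half-degree cell containing the point
    fao_area: int    # FAO area of that cell
    lat: float
    lon: float
    doubtful: bool   # True if the record is flagged as doubtful


MIN_RECORDS = 10      # at least 10 non-doubtful point records
MIN_GOOD_CELLS = 10   # falling in at least 10 good cells


def in_bbox(rec, bbox):
    """bbox = (south, west, north, east); assumed not to cross the antimeridian."""
    south, west, north, east = bbox
    return south <= rec.lat <= north and west <= rec.lon <= east


def good_cells(records, species_fao_areas, bbox):
    """Cells with a non-doubtful record inside the species' FAO areas OR bounding box."""
    cells = set()
    for rec in records:
        if rec.doubtful:
            continue
        if rec.fao_area in species_fao_areas or in_bbox(rec, bbox):
            cells.add(rec.cell_id)
    return cells


def is_candidate(records, species_fao_areas, bbox):
    """True if the species has enough good records to compute an envelope."""
    usable = [r for r in records if not r.doubtful]
    cells = good_cells(records, species_fao_areas, bbox)
    return len(usable) >= MIN_RECORDS and len(cells) >= MIN_GOOD_CELLS
```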

Compute the envelope for each parameter and store in HSPEN
Step 1. Using the good cells:
a. take the observed minimum value as the true minimum
b. take the 10th percentile value as the preferred minimum
c. take the 90th percentile value as the preferred maximum
d. take the observed maximum value as the true maximum
Step 2. Get the extended min. value by: 25th percentile - 1.5 * interquartile range
if observed min. > extended min., use the extended min. value as the current min.
Get the extended max. value by: 75th percentile + 1.5 * interquartile range
if observed max. < extended max., use the extended max. value as the current max.
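Steps 1 and 2 can be sketched in a few lines. Two points are assumptions here: the slides do not say which percentile method is used, so a simple linear-interpolation percentile stands in, and the factor 1.5 in the extended bounds follows the usual box-plot fence convention. Function names are hypothetical.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile of a sorted list (method is an assumption)."""
    if not sorted_vals:
        raise ValueError("no values")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def initial_envelope(values, fence=1.5):
    """Return (current min, preferred min, preferred max, current max)."""
    vals = sorted(values)
    # Step 1: observed extremes and the 10th/90th percentiles.
    obs_min, obs_max = vals[0], vals[-1]
    pref_min = percentile(vals, 10)
    pref_max = percentile(vals, 90)
    # Step 2: extended bounds from the quartiles and interquartile range.
    q1, q3 = percentile(vals, 25), percentile(vals, 75)
    iqr = q3 - q1
    ext_min = q1 - fence * iqr
    ext_max = q3 + fence * iqr
    # Use the extended bound whenever it lies outside the observed extreme.
    cur_min = ext_min if obs_min > ext_min else obs_min
    cur_max = ext_max if obs_max < ext_max else obs_max
    return cur_min, pref_min, pref_max, cur_max
```

For example, with cell values 0, 1, ..., 20 the quartiles are 5 and 15, so the extended bounds fall at -10 and 30 and replace the observed extremes 0 and 20.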

Step 3. Set a minimum for the preferred range by:
If preferred max. - preferred min. < set min. range, then
current pref. min. = MidPrefRange - 0.5 * set min. range
current pref. max. = MidPrefRange + 0.5 * set min. range
where: MidPrefRange = (preferred max. + preferred min.) / 2
Set min. range varies depending on parameter:
SST = 1
Salinity = 1
Primary Production = 2
Sea ice concentration = N.A.
Distance to land = 2
Step 4. Set a minimum difference between the current min. and current preferred min. by:
if current pref. min. - current min. < set min. difference, then
current min. = current pref. min. - set min. difference
Set a minimum difference between the current max. and current preferred max. by:
if current max. - current pref. max. < set min. difference, then
current max. = current pref. max. + set min. difference
Set min. difference varies depending on parameter:
SST = 0.5
Salinity = 0.5
Primary Production = 1
Sea ice concentration = N.A.
Distance to land = 1
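A minimal sketch of Steps 3 and 4 follows, using the per-parameter constants listed above. The dictionary keys and the function name `widen_envelope` are hypothetical. Sea ice concentration is simply left out of both dictionaries, which is one way to read "N.A." (no widening applied).

```python
# Per-parameter constants from Steps 3 and 4; keys are hypothetical names.
MIN_PREF_RANGE = {"sst": 1.0, "salinity": 1.0,
                  "primary_production": 2.0, "distance_to_land": 2.0}
MIN_DIFFERENCE = {"sst": 0.5, "salinity": 0.5,
                  "primary_production": 1.0, "distance_to_land": 1.0}


def widen_envelope(param, cur_min, pref_min, pref_max, cur_max):
    """Apply the minimum preferred range (Step 3) and minimum difference (Step 4)."""
    min_range = MIN_PREF_RANGE.get(param)   # None means N.A. for this parameter
    min_diff = MIN_DIFFERENCE.get(param)

    # Step 3: stretch a too-narrow preferred range around its midpoint.
    if min_range is not None and pref_max - pref_min < min_range:
        mid = (pref_max + pref_min) / 2.0
        pref_min = mid - 0.5 * min_range
        pref_max = mid + 0.5 * min_range

    # Step 4: keep the outer limits at least min_diff away from the preferred limits.
    if min_diff is not None:
        if pref_min - cur_min < min_diff:
            cur_min = pref_min - min_diff
        if cur_max - pref_max < min_diff:
            cur_max = pref_max + min_diff

    return cur_min, pref_min, pref_max, cur_max
```

Step 4 runs after Step 3 on purpose: widening the preferred range can push it up against the outer limits, and the minimum difference then restores the gap.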

Compute the probability and store in HSPEC
1. For each species, compare its envelope values (HSPEN) for each parameter with those of each cell in the HCAF table.
2. Compute individual probabilities for each parameter as:
   0: if cell value < envelope minimum
   0: if cell value > envelope maximum
   1: if cell value is within the envelope preferred values
   (cell value - env. min.) / (pref. min. - env. min.): if cell value is between envelope min. and preferred min.
   (env. max. - cell value) / (env. max. - pref. max.): if cell value is between envelope max. and preferred max.
3. Compute overall probability: Depth x SST x Primary Production x Sea ice concentration x Distance to Land
4. Check and note if the cell is a ‘good cell’.
5. Store the combination of Species ID, CellID and probability.
6. Pass data to mapper.
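The per-parameter rule and the overall product can be sketched as below. The two ramps are written as linear interpolation from 0 at the envelope limit to 1 at the preferred limit, which is an interpretation of the slide's formula. The function names and the dictionary-based cell and envelope layout are hypothetical.

```python
def parameter_probability(value, env_min, pref_min, pref_max, env_max):
    """Trapezoidal response: 0 outside the envelope, 1 inside the preferred range."""
    if value < env_min or value > env_max:
        return 0.0
    if pref_min <= value <= pref_max:
        return 1.0
    if value < pref_min:
        # Rising ramp between envelope min. and preferred min.
        return (value - env_min) / (pref_min - env_min)
    # Falling ramp between preferred max. and envelope max.
    return (env_max - value) / (env_max - pref_max)


def overall_probability(cell, envelopes):
    """Multiply the per-parameter probabilities for one cell.

    cell:      {parameter name: cell value}
    envelopes: {parameter name: (env_min, pref_min, pref_max, env_max)}
    """
    p = 1.0
    for name, env in envelopes.items():
        p *= parameter_probability(cell[name], *env)
    return p
```

Because the result is a product, a single parameter outside its envelope sets the overall probability for that cell to zero.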

Stats… (number of species, with remarks)
– Marine species: 12498 (marine with occurrence points)
– With sufficient no. of good cells to compute envelopes: 4546
– With sufficient no. of good cells to compute envelopes AND with complete depth range: 2909 (expected number of species for which map data can be generated)
– With map data: 691 (so far generated)