Methodology workshop focused on technology for identifying marine habitats Trine Bekkby Workshop at NIVA, Oslo, May 29-30 2007.


Presenting Norwegian Institute for Water Research

District offices and daughter companies. District offices: Trondheim, Hamar, Bergen, Grimstad. Solbergstrand Marine Research Station. Main office. Daughter companies. NIVA group: 250 employees.

Categories of work: technical services, research (basic and applied), development, monitoring, counselling, knowledge communication.

Most important areas of research: water resource management; taxonomy and biodiversity; physiology and ecotoxicology; physical processes and modelling; geochemistry; cleaning and transport of drinking and bilge water; water chemistry and chemical analyses.

Experience from more than 70 countries…

Presenting Oslo Centre for Interdisciplinary Environmental and Social Research

Tandbergbygget at Brekke. Area: m². Employees: 500. Cost: 270 million NOK. Started: April 2005. Inhabited: Oct. CIENS partners: NIBR, NINA, NILU, NIVA, TØI, UiO, met.no, CICERO + NVE.

Heat from the ground covers 90% of the cooling and 60% of the heating. The biggest solar panel in Norway.

Presenting ICZM&P in Norway

ICZM&P in Norway - Background. Norway has complex terrain, with high mountains, deep fjords and a large archipelago. Hence, large marine areas are found within the baseline. We have many rivers and large freshwater runoff to the ocean, hence a large interaction across the coastline. We have many water types: outer exposed coast, archipelago and inner sheltered areas. Because of all this, the habitats are many and complex, and biodiversity is often high.

ICZM&P in Norway – Management and planning. Norway is bound by the Water Framework Directive (WFD) because we are in the EEA, and it covers large marine areas (since we have such a large archipelago). We are not bound by the Habitats Directive and Natura 2000 (because we are not in the EU). We do not have any MPAs (marine protected areas), only suggestions under discussion. We have Ramsar areas (for bird protection), landscape protection areas, national parks etc., but no true marine protection. We have management plans for selected areas (e.g. the Barents Sea), i.e. areas defined as being of extra value regarding biodiversity. To fulfil the requirements of the WFD, we have suggested areas and stations for reference monitoring (i.e. relatively pristine) and areas and stations for trend monitoring (with pressures, not pristine).

Legal borders of Norway: coastal areas (1 nm outside the baseline); territorial waters (12 nm outside the baseline); exclusive economic zone (200 nm outside the baseline, with exceptions).

ICZM&P in Norway – Management and planning. Water types according to the WFD work.

ICZM&P in Norway – Management and planning. Reference areas according to the WFD. Trend monitoring areas according to the WFD. Areas of particular interest when it comes to biodiversity. Suggestions for MPAs.

ICZM&P in Norway – ”All” collected data

Reference and trend monitoring stations (WFD) suggested

Presenting different projects

Presentation of selected projects “MarModell” - finding criteria for habitat modelling “CoastScenes” - modelling effects of scenarios “Dynamod” – developing models for the Skagerrak area Sugar kelp modelling – in the Skagerrak area “NorGIS” - modelling habitats at the Nordic level “MarNatur” - The national program for mapping and modelling of marine habitats. “Balance” Others


The aim of “MarModell”: study the relationship between environmental factors and the distribution and abundance of marine coastal habitats. Develop methodology for habitat modelling. Study the effects of scale. The link between geology and biology is crucial.

Predictors and responses. Predictors: bathymetry and terrain (depth, slope, curvature); wave exposure at different scales; tidal current (together with UiO); light exposure; light %. Responses: presence/absence; coverage.

Data Field work Input models Statistical model building, analyses, model selection


The aim of “Coast-scenes”: study the relationship between environmental factors and the distribution and abundance of marine coastal habitats, and develop a model. Define the natural conditions of the area at the site of a fish farm and compare them with the existing conditions. Analyse/model the effect of scenarios for the development of human activity.


The aim of “Dynamod”: develop methodology for modelling of marine substrate and habitats, both rocky and soft seabed. Developing base models, i.e. light models. Comparing wave exposure models. Developing current models. Separating rocks from soft sediment. Separating different soft sediment classes. Modelling ecological status? Modelling rocky shore macroalgae.


Presenting equipment and methods for sampling

Equipment and methods for sampling (sample design, equipment, sampling). Sample design: a preliminary model is the basis for selecting stations. We need to cover the range of each predictor (depth, slope, terrain, wave exposure, currents etc.). Stations are randomly selected within the study area.
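A minimal sketch of how such a design could be drawn, assuming the preliminary model gives a list of candidate cells tagged with a depth class and a wave exposure class. The class labels, the field names and the function `select_stations` are hypothetical and only illustrate the idea: stations are drawn at random within each combination of predictor classes, so that the whole predictor range is covered.

```python
import random
from collections import defaultdict

def select_stations(cells, per_stratum, seed=0):
    """Randomly pick up to `per_stratum` cells from every (depth, exposure) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for cell in cells:
        strata[(cell["depth_class"], cell["exposure_class"])].append(cell)
    stations = []
    for key in sorted(strata):
        members = strata[key]
        stations.extend(rng.sample(members, min(per_stratum, len(members))))
    return stations

# Hypothetical candidate cells: 3 depth classes x 3 exposure classes, 20 cells each
cells = []
for depth_class in ("0-10 m", "10-20 m", "20-30 m"):
    for exposure_class in ("sheltered", "moderately exposed", "exposed"):
        for _ in range(20):
            cells.append({
                "id": len(cells),
                "depth_class": depth_class,
                "exposure_class": exposure_class,
            })

stations = select_stations(cells, per_stratum=3)
print(len(stations), "stations selected")
```

Drawing within strata rather than over the whole area is a design choice made for this sketch; it guards against rare predictor combinations being missed by chance.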

Equipment and methods for sampling (sample design, equipment, sampling). Equipment in the field: ROV; pico-ROV (small portable camera, may be operated by hand); singlebeam echosounder for recording depth in the field, used together with the pico-ROV; multibeam echosounder, used at selected locations; Sediment Profile Image (SPI) camera for sediment penetration depth and ecological status; grab (sediment samples); FerryBox (recording equipment on ferries); divers.

Equipment and methods for sampling (sample design, equipment, sampling). Recorded in the field. Usually we use small boats and record: depth (from the echosounder); substrate (visually), presence/absence and coverage; habitat presence and absence; habitat coverage. If larger boats are used, some of the following are also recorded: substrate classified based on multibeam at selected locations; penetration depth (using SPI); redox depth (from SPI or other equipment); grain size (from grab); species composition (from grab of sediment or diving on rocky substrate); environmental state (from SPI pictures).

Field work

Similarities and differences Norway - Poland

Similarities and differences compared with Polish conditions. Poland is in the EU and is bound by both the Water Framework Directive and the Habitats Directive. Norway is not in the EU and is only bound by the WFD (because we are in the EEA). Norway and Poland have different bathymetry and topography; the terrain variability is less in Poland than in Norway. The exposure levels are higher and more variable in Norway than in Poland. The number of habitats differs between the two countries. The pressures are different (?). In Norway, the pressures are mainly fishing, fish farms, kelp harvesting, waterfall regulations and, in some areas, changing of habitats for recreational purposes. In Poland: ? More?

Presenting the modelling approach in more detail

The basic idea: terrain structures and environmental factors determine the distribution of marine habitats. But what kind and how? And how to make good predictions?

Modelling in more detail – the Norwegian approach (geophysical factors, substrate & habitat). Geophysical base models:
- Depth model (25 m resolution for the whole of Norway, better in selected areas), includes some land data to ensure good models in the coastal zone
- Wave exposure model (25 m resolution for the whole of Norway, 10 m in selected areas)
- Terrain models for selected areas (e.g. slope, curvature, basins, tops)
- Current circulation models for selected areas
- Light percentage models for selected areas (% of surface light reaching the seabed, depends on Secchi depth)
- Light exposure models (an index based on optimal slope and aspect)
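The slide does not say how the light percentage is derived from Secchi depth. As a rough illustration only, the sketch below assumes exponential light attenuation with depth and the common rule of thumb that the attenuation coefficient is about 1.7 divided by the Secchi depth. Both the formula and the function name `light_percent_at_seabed` are assumptions, not the model used in the presentation.

```python
import math

def light_percent_at_seabed(depth_m, secchi_depth_m, coefficient=1.7):
    """Percent of surface light left at `depth_m`, assuming exponential attenuation.

    Assumption: attenuation coefficient k = coefficient / Secchi depth (rule of thumb).
    """
    if secchi_depth_m <= 0:
        raise ValueError("Secchi depth must be positive")
    k = coefficient / secchi_depth_m
    return 100.0 * math.exp(-k * depth_m)

# Hypothetical example: 10 m Secchi depth, seabed at 5, 10 and 20 m
for depth in (5, 10, 20):
    print(depth, "m:", round(light_percent_at_seabed(depth, 10.0), 1), "% of surface light")
```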

Isæus (2004) Modelled wave exposure

Depth

Slope

Curvature

Modelled current

Light - % of surface level

Light – related to optimal slope and aspect

Modelling in more detail – the Norwegian approach (geophysical factors, substrate & habitat). Substrate:
- Binomial models separating rocks from sediment based on slope and curvature
- Probability model separating rocks from sediment
- Probability model separating sand from softer sediment (based on data on penetration depth)

Seabed substrate

Binomial seabed substrate modelling

Probability seabed substrate modelling

Probability soft seabed sediment modelling

Modelling in more detail – the Norwegian approach (geophysical factors, substrate & habitat). Habitat:
- Kelp forest – binomial models for Norway
- Zostera meadows – binomial models for Norway
- EUNIS classes – binomial models to level 2 for Norway
- Large shallow inlets and bays (Natura 2000 habitat) – binomial models for selected areas
- Kelp – probability models for selected areas
- Zostera meadows – probability models for selected areas

Modelling approach – methodology, some examples. Binomial modelling – pros and cons
+ Uses empirical data to find max and min values
+ Uses expert judgement to set borders
+ Provides modelled areas on maps that may be measured (area)
- Absolute borders, easy to miscommunicate
- The uncertainty in the models is not included, no probability measures

Binomial modelling of kelp forest. Skagerrak: in exposed and moderately exposed areas down to 20 m depth. North Sea: in exposed areas down to 25 m depth and moderately exposed areas down to 20 m depth. Norwegian Sea to South-Trøndelag: as in the North Sea. Norwegian Sea from to the Barents Sea: exposed areas down to 25 m (moderately exposed areas are grazed by sea urchins).
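The kelp rules above can be written down as a small lookup table. This is only a sketch: the region keys, the exposure class labels and the function `kelp_forest_predicted` are hypothetical names, depth is assumed to be given in metres as a positive number, and the southern limit of the northernmost region is missing on the slide, so the key for it is a placeholder.

```python
# Maximum depth (m) of modelled kelp forest per region and wave exposure class,
# taken from the slide text; classes not listed give no modelled kelp forest.
KELP_MAX_DEPTH_M = {
    "Skagerrak": {"exposed": 20, "moderately exposed": 20},
    "North Sea": {"exposed": 25, "moderately exposed": 20},
    "Norwegian Sea to South-Trøndelag": {"exposed": 25, "moderately exposed": 20},
    # Placeholder key: the slide does not state where this region starts
    "Norwegian Sea to Barents Sea": {"exposed": 25},
}

def kelp_forest_predicted(region, exposure_class, depth_m):
    """Return True if the binomial rule predicts kelp forest (a 1), else False (a 0)."""
    max_depth = KELP_MAX_DEPTH_M[region].get(exposure_class)
    return max_depth is not None and 0 <= depth_m <= max_depth

print(kelp_forest_predicted("North Sea", "exposed", 22))
print(kelp_forest_predicted("North Sea", "moderately exposed", 22))
```

The output is a hard yes/no per cell, which is exactly the "absolute borders" property listed among the cons of binomial modelling.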

Binomial modelling of eelgrass (Zostera marina): in shallow (down to 7 m depth), relatively flat (<7 degrees), sheltered to moderately exposed areas.
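Applied to gridded data, the eelgrass rule becomes a cell-by-cell mask whose area can be measured. The sketch below uses plain nested lists in place of GIS rasters; the grid values, the class labels and the names `eelgrass_mask` and `CELL_SIZE_M` are made up for illustration, and the 25 m cell size is an assumption borrowed from the resolution quoted for the national depth model.

```python
CELL_SIZE_M = 25  # assumed grid resolution in metres
SUITABLE_EXPOSURE = {"sheltered", "moderately exposed"}

def eelgrass_mask(depth_m, slope_deg, exposure_class):
    """Apply the rule to three equally shaped 2-D grids; returns a grid of booleans."""
    return [
        [0 <= d <= 7 and s < 7 and e in SUITABLE_EXPOSURE
         for d, s, e in zip(depth_row, slope_row, exposure_row)]
        for depth_row, slope_row, exposure_row in zip(depth_m, slope_deg, exposure_class)
    ]

# Hypothetical 2 x 2 grids
depth = [[2, 5], [8, 6]]
slope = [[3, 9], [2, 6.5]]
exposure = [["sheltered", "sheltered"], ["sheltered", "moderately exposed"]]

mask = eelgrass_mask(depth, slope, exposure)
area_m2 = sum(cell for row in mask for cell in row) * CELL_SIZE_M ** 2
print(mask, area_m2)
```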

Predictions – habitat modelling. Green: modelled kelp forest; pink: modelled eelgrass; yellow: modelled shell sand; turquoise: modelled Pecten maximus.

Binomial modelling of EUNIS classes. Based on the data available for the whole of Norway, it has been possible to model EUNIS down to level 2, using wave exposure and depth classes. The depth classes are: 0-30 m, 30-50, , , , and deeper than 700 m. Wave exposure classes are:

Wave exposure (SWM)    EUNIS class
< 1200                 Ultra sheltered
1200 – 4000            Extremely sheltered
4000 – 10000           Very sheltered
–                      Sheltered
–                      Moderately exposed
–                      Exposed
–                      Very exposed
>                      Extremely exposed

Modelling approach – methodology, some examples. Probability modelling – pros and cons
+ Uses empirical data to find max and min values
+ Includes the uncertainty of the data in the models, has probabilities
+ Probabilities make it possible to select different approaches: overestimate (precautionary) or underestimate (e.g. for time-efficient searching)
+ More intuitive, easier to explain discrepancies from observations
- Cannot include expert judgement
- Depends a lot on the empirical data set; an insufficient data set will give a bad model
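The third advantage can be illustrated by cutting the same probability map at two different thresholds. The probabilities and the cut-off values 0.2 and 0.8 below are invented for this sketch, and `classify` is a hypothetical helper, not part of any tool mentioned in the presentation.

```python
def classify(prob_map, threshold):
    """Turn a grid of habitat probabilities into presence (True) / absence (False)."""
    return [[p >= threshold for p in row] for row in prob_map]

def count_present(mask):
    return sum(cell for row in mask for cell in row)

# Hypothetical modelled probabilities of finding the habitat in each cell
prob_map = [[0.9, 0.6], [0.3, 0.1]]

precautionary = classify(prob_map, 0.2)  # low cut-off: overestimates the habitat
strict = classify(prob_map, 0.8)         # high cut-off: only the most certain cells

print(count_present(precautionary), count_present(strict))
```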

Laminaria hyperborea kelp forest

Seagrass (Zostera marina) meadows

Analyses – “separating the information from the noise”. Integrating data in a GIS. Linking data for analyses: predictor data (depth, slope, wave exposure, currents etc.) and response data (habitat presence/absence, coverage etc.). Analyses and model building: finding significant factors (traditional H0 testing with p-values) OR building different alternative models and using model selection techniques (e.g. AIC).

Three traditions
1. Frequentism (p-values): H0 hypothesis testing, p-values, significance.
2. Likelihood (Akaike's Information Criterion, AIC): testing the models (and the hypotheses) relative to each other; finding the model that loses the least information.
3. Bayesian “IC”: often called BIC, but it has nothing to do with information theory and is not as well founded on theory as AIC. Often gives either no effects or very large effects. Regarded as better than frequentism, but not as good as AIC.

Traditional H0 testing or AIC model selection techniques
Finding significant factors (traditional H0 testing with p-values):
- Did we believe in the H0 in the first place?
- What does “significant p” really mean?
- We test the H0 (not the H1), and accept the H1 because of the rejection
Building different alternative models and using model selection techniques (AIC):
- My models are my hypotheses and model selection is hypothesis selection
- All hypotheses are formulated as models; a priori “neck-up thinking” is essential
- Testing the models (and the hypotheses) relative to each other
- AIC finds the model that loses the least information
- AIC weighs the benefit of a better and more complicated model against the cost of including more factors

One example of “neck-down” models. Kelp forest presence (P) is determined by:
- wave exposure (WE) only
- light attenuation (LA) only
- sea bed substrate (SS) only
- WE and LA
- WE and SS
- LA and SS
- WE, LA and SS
- WE, LA and WE*LA
- WE, SS and WE*SS
- LA, SS and LA*SS
- WE, LA, SS and WE*LA
- WE, LA, SS and WE*SS
- WE, LA, SS and LA*SS
- WE, LA, SS and WE*LA*SS
“Neck-up” choice of hypotheses and models is essential.

More about AIC (Akaike Information Criterion). AIC finds the model that loses the least information. AIC weighs the benefit of a better and more complicated model against the cost of including more factors. A bit of introduction to the math: a maximum likelihood estimate (MLE) or an RSS value (residual sum of squares from a least squares estimate, LSE) is needed for each hypothesis-based model (obtained from e.g. an ANOVA). MLE maximises the likelihood; LSE minimises the sum of squared errors. ML or RSS: RSS assumes normal, independent data and linear relationships, which is often not the case with ecological data, so ML is most often the best choice. AIC = -2log(L) + 2K → -2log(L) is the deviance, i.e. the measure of lack of fit. This is linked to the chi-square analysis (ChiSq = -2log(La/Lb)). The model fit often gets better with more factors, but you are “punished” for complicating the model (+2K), i.e. a cost-benefit approach.
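To make the link to the chi-square analysis concrete, the sketch below computes ChiSq = -2log(La/Lb) for two nested models from their log-likelihoods. The log-likelihood values are invented, the function names are hypothetical, and the p-value helper only covers the case where the models differ by one parameter (one degree of freedom), using the identity between the chi-square tail and the complementary error function.

```python
import math

def likelihood_ratio_statistic(loglik_simple, loglik_complex):
    """ChiSq = -2 log(La / Lb) = -2 (log La - log Lb), with natural logarithms."""
    return -2.0 * (loglik_simple - loglik_complex)

def chi2_p_value_1df(statistic):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("the statistic must be non-negative for nested models")
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical log-likelihoods: model a (WE only) nested in model b (WE and LA)
chisq = likelihood_ratio_statistic(-120.0, -117.0)
p_value = chi2_p_value_1df(chisq)
print(chisq, p_value)
```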

“All models are wrong, but some are useful”

A bit of math. AIC = -2log(L) + 2K → -2log(L) is the deviance, i.e. the measure of lack of fit. → K is the number of parameters in the model. This is linked to the chi-square analysis (ChiSq = -2log(La/Lb)). The model fit often gets better with more factors, but you are “punished” for complicating the model (+2K), i.e. a cost-benefit approach.
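A small worked example of the formula, assuming presence/absence data and natural logarithms. The observations, the predicted probabilities and the parameter counts are invented for illustration, and `bernoulli_loglik` and `aic` are hypothetical helper names.

```python
import math

def bernoulli_loglik(observed, predicted):
    """log(L) for 0/1 observations given predicted probabilities of presence."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(observed, predicted))

def aic(loglik, k):
    """AIC = -2 log(L) + 2K, where K is the number of estimated parameters."""
    return -2.0 * loglik + 2 * k

observed = [1, 1, 0, 0, 1]  # hypothetical kelp presence/absence at five stations

# Null model: the same probability everywhere (K = 1)
null_pred = [0.6] * 5
# Alternative model with one predictor (K = 2); probabilities are made up
alt_pred = [0.9, 0.8, 0.2, 0.1, 0.7]

aic_null = aic(bernoulli_loglik(observed, null_pred), k=1)
aic_alt = aic(bernoulli_loglik(observed, alt_pred), k=2)
print(round(aic_null, 2), round(aic_alt, 2))
```

Here the extra parameter costs 2 units, but the drop in deviance is larger, so the alternative model ends up with the lower AIC.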

Some more math. The smaller the AIC value, the better the model. The delta value shows the difference in AIC between the best model and an alternative model. Delta <= 2: the alternative model has good support. Delta 4-7: the alternative model has low support. Delta > 10: the alternative model has no support. Wi: the Akaike weight, the probability that the model is in fact the best among those compared (“how many tickets do I have in the lottery”); Wi = 0.66 means a 66% chance that the model is the best. To know whether the best model is in fact good (not only the best of the bad), combine AIC with adjusted R2 and residual plotting.
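The delta values and Akaike weights can be computed directly from a set of AIC values, with each weight proportional to exp(-delta/2). The AIC values and model labels below are hypothetical, as is the function name `akaike_weights`.

```python
import math

def akaike_weights(aic_values):
    """Return (deltas, weights) for a dict mapping model name to AIC."""
    best = min(aic_values.values())
    deltas = {name: value - best for name, value in aic_values.items()}
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    weights = {name: r / total for name, r in relative.items()}
    return deltas, weights

# Hypothetical AIC values for three candidate models of kelp presence
aic_values = {"WE": 210.0, "WE + LA": 204.0, "WE + LA + SS": 205.5}
deltas, weights = akaike_weights(aic_values)

for name in aic_values:
    print(name, "delta =", deltas[name], "weight =", round(weights[name], 3))
```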

So, what if more than one model is good?
1. Describe them all, but choose one for your predictions
2. Model averaging (“multi-model inference”): models are weighted using the Wi value. This is most often recommended
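A minimal sketch of option 2 for a single map cell, assuming each candidate model gives a predicted probability of habitat presence and that the Akaike weights are already known. All numbers and the name `model_averaged_prediction` are hypothetical.

```python
def model_averaged_prediction(predictions, weights):
    """Weighted mean of the model predictions, using the Akaike weights Wi."""
    total = sum(weights[name] for name in predictions)
    return sum(weights[name] * predictions[name] for name in predictions) / total

# Hypothetical predictions for one cell and hypothetical Akaike weights
predictions = {"model A": 0.8, "model B": 0.6}
weights = {"model A": 0.75, "model B": 0.25}

averaged = model_averaged_prediction(predictions, weights)
print(round(averaged, 3))
```

Dividing by the sum of the weights keeps the result a proper average even when only a subset of the candidate models is used.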

GRASP for GIS prediction – comments and concerns
1. Uses GAM (Generalised Additive Models) to build models
2. Uses AIC to select the models
Concerns:
1. The AIC algorithm used in GRASP only applies to large datasets; an additional algorithm should be added to correct for this
2. GRASP does not allow for model averaging
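The slide does not name the correction it has in mind. One widely used candidate is the small-sample version AICc = AIC + 2K(K+1)/(n-K-1), sketched below as an assumption; the function `aicc` is a hypothetical helper and not part of GRASP.

```python
def aicc(aic, n, k):
    """Small-sample corrected AIC for n observations and K estimated parameters."""
    if n - k - 1 <= 0:
        raise ValueError("need n > K + 1 observations")
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical AIC of 100 for a model with K = 3
print(aicc(100.0, n=20, k=3))     # small data set: noticeable extra penalty
print(aicc(100.0, n=2000, k=3))   # large data set: almost identical to AIC
```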

Model validation using field data
1. Cross validation re-using the data from the predictive modelling. No point when using AIC, because in Akaike's development of the Kullback-Leibler methodology into AIC, the expectation of the cross validation ends up the same as or similar to the expectation of the AIC. So cross validation adds nothing.
2. Validation using fresh data. From the predictions, you get probabilities of finding a habitat at a certain site (pixel). Collecting data in the field (e.g. presence/absence data), you get binomial data (0s or 1s) that can be compared with the modelled values using logistic regression. Look at the R2 and the residual plot.
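A minimal sketch of option 2, assuming the fresh field data are 0/1 observations paired with the modelled probability at each station. The data are invented, the logistic regression is fitted with a hand-written Newton-Raphson loop to keep the example self-contained, and McFadden's pseudo-R2 is used as one possible reading of "the R2" on the slide; all of these are assumptions.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # negative Hessian (information matrix)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def loglik(y, p):
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p))

# Hypothetical validation data: modelled probability and observed presence/absence
modelled = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.9]
observed = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(modelled, observed)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * m))) for m in modelled]
mean_presence = sum(observed) / len(observed)
pseudo_r2 = 1.0 - loglik(observed, fitted) / loglik(observed, [mean_presence] * len(observed))
residuals = [o - f for o, f in zip(observed, fitted)]  # values for a residual plot
print(round(b1, 2), round(pseudo_r2, 2))
```

A clearly positive slope means that stations with higher modelled probability were more often found to hold the habitat; the residuals can then be plotted against the modelled values to look for patterns.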

Habitat valorisation. We haven't come too far, due to lack of information on habitat distribution and function (e.g. little knowledge on rare and threatened species). The national program for mapping of marine habitats has established some criteria for nationally very important (A), regionally important (B) and locally important (C) occurrences.
Ecological criteria:
- Ecological function (richness, size, age, production rate, functionally close to natural state)
- Rareness (rare both regionally and nationally, close to natural state when it comes to biodiversity)
- Threatenedness (small occurrences, vulnerable, decreasing in abundance)
Cultural criteria:
- Aesthetics
- Use (provides understanding of nature, important for recreation, teaching, research, long time series and knowledge of trends)
A: includes the categories critically endangered, endangered and vulnerable. B: includes near threatened.