Mechanistic models for macroecology: moving beyond correlation Nicholas J. Gotelli Department of Biology University of Vermont Burlington, VT 05405.

Presentation transcript:

Mechanistic models for macroecology: moving beyond correlation
Nicholas J. Gotelli, Department of Biology, University of Vermont, Burlington, VT 05405

?? What causes geographic variation in species richness ??

Understanding species richness patterns
Data sources
A critique of current methods
Range cohesion and the mid-domain effect
Mechanistic models for species richness
Model selection
Summary

Nicholas Gotelli, University of Vermont
Gary Entsminger, Acquired Intelligence
Rob Colwell, University of Connecticut
Gary Graves, Smithsonian
Carsten Rahbek, University of Copenhagen
Thiago Rangel, Federal University of Goiás

Data sources Gridded map of domain

Avifauna of South America “There can be no question, I think, that South America is the most peculiar of all the primary regions of the globe as to its ornithology.” P.L. Sclater (1858)

South American Avifauna
2891 breeding species
2248 species endemic to South America and associated land-bridge islands

Minimum: 18 species Maximum: 846 species

Data sources Gridded map of domain Species occurrence records within grid cells

Geographic Ranges For Individual Species: Myioborus cardonai, Phalacrocorax brasilianus, Anas puna

Geographic Ranges Species Richness
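The step from individual geographic ranges to a species richness map can be sketched as a simple overlay count. This is a minimal illustration, not the authors' code: the species names, the grid coordinates, and the `species_richness` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical ranges: each species maps to the set of (row, col) grid cells it occupies.
ranges = {
    "species_a": {(0, 0), (0, 1), (1, 1)},
    "species_b": {(0, 1), (1, 1), (1, 2)},
    "species_c": {(1, 1)},
}

def species_richness(ranges):
    """Count how many species ranges overlap each grid cell."""
    richness = Counter()
    for cells in ranges.values():
        richness.update(cells)
    return richness

richness = species_richness(ranges)
for cell in sorted(richness):
    print(cell, richness[cell])
```

Storing each range as a set of cells keeps the overlay independent of grid shape, which matters for an irregular domain such as a continent.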

Data sources Gridded map of domain Species occurrence records within grid cells Quantitative measures of potential predictor variables within grid cells (NPP, temperature, habitat diversity)

Climate, Habitat Variables Measured at Grid Cell Scale

How are these macroecological data typically analyzed? Curve-fitting!

Criticisms of Curve-Fitting
“Correlation does not equal causation”
–Common to all of macroecology!
Non-linearity & non-normal, spatially correlated errors
–LOESS, Poisson, Spatial Regression (SAM)
Choosing among correlated predictor variables
–Model selection strategies, stepwise regression, AIC
Sensitivity to spatial scale, taxonomic resolution, geographic range size
–Stratify analysis

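Conceptual Weakness of Curve-Fitting Paradigm
[Diagram: Potential Predictor Variables (tonnes/ha, °C) are used to generate Predicted Species Richness (S / grid cell), which is fitted to Observed Species Richness (S / grid cell) by minimizing residuals; the link from predictors to predictions is labeled “?? MECHANISM ??”]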

Alternative Strategy: Mechanistic Simulation Models
[Diagram: Potential Predictor Variables (tonnes/ha, °C) feed an Explicit Simulation Model (the mechanism), which generates Predicted Species Richness (S / grid cell) for comparison with Observed Species Richness (S / grid cell)]

How can we build explicit simulation models for macroecology?

One-dimensional geographic domain
Species geographic ranges are randomly placed line segments within the domain
Peak of species richness in geographic center of domain
[Figure axis: Species Number]

domain

geographic range

der Pfankuchen Guild Pancakus spp.

Reduced species richness at margins of the domain

Mid-domain peak of species richness in the center of the domain
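The one-dimensional mid-domain effect described above can be reproduced with a short simulation. This is a sketch under stated assumptions: the domain length, the number of species, and the rule for drawing segment endpoints (two uniform random cells) are illustrative choices, not values taken from the slides.

```python
import random

def simulate_1d_mde(n_cells=50, n_species=2000, seed=1):
    """Drop random line segments on a 1-D domain and count overlaps per cell."""
    rng = random.Random(seed)
    richness = [0] * n_cells
    for _ in range(n_species):
        # Assumption: both endpoints are drawn uniformly at random within the domain.
        a = rng.randrange(n_cells)
        b = rng.randrange(n_cells)
        lo, hi = min(a, b), max(a, b)
        for cell in range(lo, hi + 1):
            richness[cell] += 1
    return richness

richness = simulate_1d_mde()
center = richness[len(richness) // 2]
print("edge cells:", richness[0], richness[-1])
print("center cell:", center)
```

Because every segment must fit inside the domain, cells near the middle are overlapped by more segments than cells at the margins, with no environmental gradient involved.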

2-dimensional MDE Model
Random point of origination within continent (speciation)
Random spread of geographic range into contiguous unoccupied cells
Spreading dye model (Jetz & Rahbek 2001) predicts peak richness in center of continent (r² = 0.17)
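A spreading-dye style algorithm can be sketched as follows. This is one possible reading of the two rules on the slide, not the published implementation: the rectangular 15 x 15 domain, the rook (four-neighbor) adjacency, the uniform choice among frontier cells, and the hypothetical range sizes are all assumptions.

```python
import random
from collections import Counter

def spreading_dye_range(n_rows, n_cols, range_size, rng):
    """Grow one cohesive range from a random origin into adjacent unoccupied cells."""
    origin = (rng.randrange(n_rows), rng.randrange(n_cols))
    occupied = {origin}
    frontier = set()

    def add_neighbors(cell):
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols and (nr, nc) not in occupied:
                frontier.add((nr, nc))

    add_neighbors(origin)
    while len(occupied) < range_size and frontier:
        # Sorting makes the random choice reproducible for a given seed.
        cell = rng.choice(sorted(frontier))
        frontier.remove(cell)
        occupied.add(cell)
        add_neighbors(cell)
    return occupied

def simulate_richness(n_rows=15, n_cols=15, n_species=300, seed=1):
    """Overlay many simulated ranges and count species per cell."""
    rng = random.Random(seed)
    richness = Counter()
    for _ in range(n_species):
        size = rng.randint(1, 100)  # hypothetical range sizes, in cells
        richness.update(spreading_dye_range(n_rows, n_cols, size, rng))
    return richness

richness = simulate_richness()
corners = [(0, 0), (0, 14), (14, 0), (14, 14)]
mean_corner = sum(richness[c] for c in corners) / len(corners)
print("mean corner richness:", mean_corner)
print("center richness:", richness[(7, 7)])
```

In an analysis of real data, the range sizes would come from the observed ranges and the domain would follow the continental outline rather than a rectangle.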

Assumptions of MDE models
Placement of ranges within domain is random with respect to environmental gradients
–Controversial, but logical for a null model for climatic effects
Geographic ranges are cohesive within the domain
–Rarely discussed, but important as the basis for a mechanistic model of species richness

Range Cohesion vs. Range Scatter

At the 1° × 1° scale, > 95% of species of South American birds have contiguous geographic ranges
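Whether a gridded range is contiguous can be checked with a flood fill. The sketch below is illustrative only: it assumes rook (four-neighbor) adjacency, which the slides do not specify, and the `is_contiguous` helper and example ranges are hypothetical.

```python
from collections import deque

def is_contiguous(cells):
    """Return True if the occupied cells form one rook-connected patch."""
    cells = set(cells)
    if not cells:
        return True  # an empty range is treated as trivially contiguous
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for neighbor in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbor in cells and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(cells)

cohesive_range = {(0, 0), (0, 1), (1, 1)}
scattered_range = {(0, 0), (2, 2)}
print(is_contiguous(cohesive_range), is_contiguous(scattered_range))
```

Allowing diagonal neighbors as well would classify more ranges as contiguous, so the adjacency rule should be stated alongside any reported percentage.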

Causes of Range Cohesion
Extrinsic Causes
–Coarse Spatial Scale
–Spatial Autocorrelation in Environments
Intrinsic Causes
–Limited Dispersal
–Philopatry & Site Fidelity
–Metapopulation & Source/Sink Structure
–Fine-scale Genetic Structure & Local Adaptation
–Spatially Mediated Species Interactions

Strict Range Cohesion vs. Stepping Stone
* The mid-domain effect does not require strict range cohesion. A mid-domain peak in species richness will also arise from stepping stone models with limited dispersal and from neutral model dynamics (Rangel & Diniz-Filho 2005).

Homogeneous Environment vs. Heterogeneous Environment
Almost all MDE models have assumed a homogeneous environment: grid cells are equiprobable

RANGE COHESION × ENVIRONMENT
Range cohesion enforced, homogeneous environment: Classic MDE
Range cohesion relaxed, homogeneous environment: Statistical Null (slope = 0)
Range cohesion enforced, heterogeneous environment: Range Cohesion Models
Range cohesion relaxed, heterogeneous environment: Range Scatter Models
Range Cohesion Models are a hybrid that describes a stochastic MDE model in a more realistic heterogeneous environment. Range Scatter Models also incorporate environmental heterogeneity, but do not place any constraints on species geographic ranges.

Modeling Strategy
Establish simple algorithms that describe P(occupancy) based on environmental variables
Simulate origin and placement of each species' geographic range in a heterogeneous landscape (with or without range cohesion)
Repeat simulation to estimate predicted species richness per grid cell
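The three steps of the modeling strategy can be sketched in one small simulation. This is a hedged illustration rather than the authors' software: the 8 x 8 grid, the linear environmental gradient used as `weights`, the rook adjacency, the range sizes, and the function names `place_range` and `predicted_richness` are all assumptions made for the example.

```python
import random
from collections import Counter

def neighbors(cell, n_rows, n_cols):
    """Yield the rook-adjacent cells that lie inside the grid."""
    r, c = cell
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < n_rows and 0 <= nc < n_cols:
            yield (nr, nc)

def place_range(weights, n_rows, n_cols, size, rng, cohesion):
    """Place one range of `size` cells, sampling cells in proportion to `weights`."""
    all_cells = sorted(weights)
    origin = rng.choices(all_cells, weights=[weights[c] for c in all_cells])[0]
    occupied = {origin}
    while len(occupied) < size:
        if cohesion:
            # Range cohesion: only cells adjacent to the current range are eligible.
            candidates = sorted({n for cell in occupied
                                 for n in neighbors(cell, n_rows, n_cols)} - occupied)
        else:
            # Range scatter: any unoccupied cell in the domain is eligible.
            candidates = [c for c in all_cells if c not in occupied]
        if not candidates:
            break
        pick = rng.choices(candidates, weights=[weights[c] for c in candidates])[0]
        occupied.add(pick)
    return occupied

def predicted_richness(weights, n_rows, n_cols, range_sizes, n_reps, cohesion, seed=0):
    """Average richness per cell over repeated simulations of all species."""
    rng = random.Random(seed)
    total = Counter()
    for _ in range(n_reps):
        for size in range_sizes:
            total.update(place_range(weights, n_rows, n_cols, size, rng, cohesion))
    return {cell: total[cell] / n_reps for cell in weights}

n_rows, n_cols = 8, 8
# Hypothetical environment: P(occupancy) weight increases from west to east.
weights = {(r, c): 1.0 + c for r in range(n_rows) for c in range(n_cols)}
range_sizes = [5, 10, 20] * 10  # range sizes are taken as given, as in the talk

scatter = predicted_richness(weights, n_rows, n_cols, range_sizes, 20, cohesion=False)
cohesive = predicted_richness(weights, n_rows, n_cols, range_sizes, 20, cohesion=True)
print("scatter, west vs east column:",
      sum(scatter[(r, 0)] for r in range(n_rows)),
      sum(scatter[(r, n_cols - 1)] for r in range(n_rows)))
```

Switching the single `cohesion` flag moves between the range scatter and range cohesion variants while holding the environmental weights and range sizes fixed, which makes the two sets of predictions directly comparable.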

What determines P(cell occurrence)?
Simple environmental models
–$P(\text{occurrence}) \propto$ Measured Environmental Variable (NPP, Temperature, etc.)
Formal analytical models
–Species-Energy Model (Currie et al. 2004): $P(\text{occurrence}) \propto (\text{NPP})(\text{Grid-cell Area})$
–Temperature Kinetics (Brown et al. 2004): $P(\text{occurrence}) \propto e^{-E/kT}$
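The two formal models can be turned into per-cell weights as in the sketch below. Several details are assumptions rather than content from the slides: the weights are normalized to sum to 1 across cells, temperature is supplied in degrees Celsius and converted to kelvin, the activation energy is set to 0.65 eV purely for illustration, and the cell names and data values are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant k, in eV per kelvin
ACTIVATION_ENERGY_EV = 0.65    # assumed value of E; the slides do not give one

def normalize(weights):
    """Rescale weights so they sum to 1 across grid cells."""
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}

def species_energy_weights(npp, area):
    """P(occurrence) proportional to NPP times grid-cell area."""
    return normalize({cell: npp[cell] * area[cell] for cell in npp})

def temperature_kinetics_weights(temp_celsius, energy_ev=ACTIVATION_ENERGY_EV):
    """P(occurrence) proportional to exp(-E / kT), with T in kelvin."""
    return normalize({
        cell: math.exp(-energy_ev / (BOLTZMANN_EV_PER_K * (t + 273.15)))
        for cell, t in temp_celsius.items()
    })

# Hypothetical data for three grid cells.
npp = {"cell_1": 1.0, "cell_2": 3.0, "cell_3": 4.0}
area = {"cell_1": 1.0, "cell_2": 1.0, "cell_3": 1.0}
temperature = {"cell_1": 5.0, "cell_2": 15.0, "cell_3": 25.0}

energy_w = species_energy_weights(npp, area)
kinetics_w = temperature_kinetics_weights(temperature)
print(energy_w)
print(kinetics_w)
```

Either dictionary could serve as the per-cell weights when placing simulated ranges in a heterogeneous domain.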

Model-Selection in Curve-Fitting Analyses
Simple tests against the null hypothesis that b = 0
No consideration of what the expected slope should be under a specific mechanism
Least-squares and AIC criteria to try to select a subset of variables that best accounts for variation in S

H0: b = 0

Model Selection with Mechanistic Simulation Models
Models make quantitative predictions of expected species richness
Test slope of observed richness versus predicted richness
Hypothesis of an acceptable fit H1: b = 1.0
Rank acceptable models according to slope, intercept, and r²
AIC criteria not appropriate

[Figure: Observed S plotted against Predicted S, comparing the observed slope b with the theoretical slope b = 1.0]
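The slope comparison can be sketched with an ordinary least-squares fit of observed on predicted richness. This is an illustrative calculation only: the data values are made up, the `slope_test` function is hypothetical, and the plain t statistic ignores spatial autocorrelation among grid cells, which a real analysis would need to address.

```python
import math

def slope_test(predicted, observed, expected_slope=1.0):
    """Regress observed on predicted richness and compare the slope with expected_slope."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    syy = sum((y - mean_y) ** 2 for y in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(predicted, observed))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    # t statistic for the hypothesis that the slope equals expected_slope (n - 2 df).
    t_stat = (slope - expected_slope) / se_slope if se_slope > 0 else float("nan")
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1 - sse / syy,
        "t_vs_expected": t_stat,
    }

# Hypothetical predicted and observed richness for five grid cells.
predicted_s = [10, 20, 30, 40, 50]
observed_s = [12, 19, 33, 38, 52]
result = slope_test(predicted_s, observed_s)
print(result)
```

Candidate models could then be ranked by how close the slope is to 1.0 and the intercept to 0, together with r², in line with the ranking criteria listed above.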

Summary
Curve-fitting framework does not incorporate explicit mechanisms
Use mechanistic simulations to define the placement of geographic ranges in a gridded domain
Specify rules for P(occurrence) = f(environmental variables)
Test model fit against expected slope = 1.0

Criticisms & Rejoinders
“Each species has a unique and distinctive response to different environmental variables. Species ranges should be modeled independently, not with a single function for all species.”
–If this is true, why are there widespread repeatable patterns of species richness (e.g., latitude, elevation, area, productivity)?
–Often not enough data to model each species individually. We need a simple framework for analysing entire floras and faunas at a biogeographic scale.

Criticisms & Rejoinders
“1:1 scaling of environmental variables with P(occurrence) is unrealistic and arbitrary.”
–Perhaps, but this is a parsimonious mechanistic model that relates environmental variables to geographic range placement.
–Linearity in P(occurrence) is not unreasonable over the empirical ranges of environmental variables measured in South America. (Linearity of P(occurrence) ≠ Linearity of (Species Richness))
–Mechanistic models are scarce in this literature (n = 2)! We have to begin somewhere!

Criticisms & Rejoinders
“Many environmental variables, but especially NPP, show non-linear relationships with peaks in richness at intermediate levels. This is not captured by linear models.”
–At least at this spatial scale, there is no evidence for a hump-shaped pattern of avian species richness when plotted against NPP or other variables.

Criticisms & Rejoinders
“Using slope comparisons will not successfully distinguish between models with intercorrelated predictor variables.”
–Not a problem for these analyses. From an initial set of ~ 100 candidate models (10 variables x 2 algorithms x 5 range size quartiles), we reduced the set down to only 4 or 5 possible contenders.

Criticisms & Rejoinders
“The model is not truly mechanistic because it does not model the sizes of the geographic ranges, only their placement.”
–True! Our model takes range sizes as a given and then uses algorithms to place them in a heterogeneous domain. A more realistic model would describe the processes of speciation, dispersal, and extinction of an evolving fauna.
–But how can the parameters of such a model (e.g. speciation and dispersal rates) ever be measured in the real world? Same problems have plagued most empirical evaluations of the neutral model.
–Our models are designed to analyze the data that macroecologists typically have: gridded maps of environmental variables and species geographic ranges.

Criticisms & Rejoinders
“The range cohesion and range scatter models don’t seem like they would give predictions that are any different from just a regression with the underlying variables themselves. What is the added value of these simulation models?”
–The predictions are not the same. For species with large geographic ranges, the range cohesion models always fit the data better than the range scatter models, regardless of which environmental variable is considered.

Key Differences: Curve-Fitting vs. Mechanistic Models
Unit of Study
–Curve-Fitting: Species Richness
–Mechanistic Models: Underlying geographic ranges
Predicted values
–Curve-Fitting: Minimization of residuals (data dependent)
–Mechanistic Models: Algorithms for origin and spread of geographic ranges (data independent)
Model Selection Criteria
–Curve-Fitting: Smallest number of variables that reduce residual sum of squares
–Mechanistic Models: Quantitative fit to model predictions
Statistical Tests
–Curve-Fitting: H0: (b = 0) tests for any effect that is larger than 0
–Mechanistic Models: H0: (b = 1.0) tests for quantitative match between observed and predicted S

To Be Continued… Carsten Rahbek. Perception of Species Richness Patterns: The Role of Range Sizes