Community and gradient analysis: Matrix approaches in macroecology
The world comes in fragments.


Statistical inference means comparing your hypothesis H1 with an appropriate null hypothesis H0. Two errors can occur: the type I error (rejecting a true H0) and the type II error (not rejecting a false H0).

Simple examples in ecology are:
- The correlation between species richness and area (H0: no correlation, t-test)
- Differences in productivity between plots with different soil properties (H0: no difference between means, ANOVA)

But what about more complex patterns?
- Relative abundance distributions
- Productivity-diversity relationship
- Succession
- Community assembly

Galapagos Islands

We compare diversities on islands. A t-test points to significant differences in diversity.

But: your variance estimator comes from the underlying distribution of species and individuals. Does the variance stem from
- species interactions?
- random processes?
- evolutionary history?
- ecological history?

Bootstrapped or jackknifed variance estimators only capture the variability in the underlying distribution. In fact, we do not have an appropriate null hypothesis.

Statistical inference

Is species co-occurrence random or do species have similar habitat requirements?

A simple regression analysis points to joint occurrences. P F (r=0) <

Abundances scale exponentially, and extreme values bias the results. Spearman's r = 0.67, P F (r=0) <

Classical Fisherian testing relies on an equiprobable null assumption: all values are equiprobable. In ecology this assumption is often not realistic.
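The point about exponentially scaled abundances and extreme values can be illustrated with a small sketch. The data below are invented for illustration and are not the values behind the slide. The helper names (`pearson`, `ranks`, `spearman`) are assumptions of this example. A rank correlation ignores the size of extreme values, whereas the product-moment correlation is pulled away from 1 even though the relation is perfectly monotonic.

```python
from statistics import mean

def pearson(x, y):
    """Product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical example: abundances that double from site to site.
site = [1, 2, 3, 4, 5, 6]
abundance = [2 ** s for s in site]
print(round(pearson(site, abundance), 3), round(spearman(site, abundance), 3))
```

Neither coefficient addresses the choice of null assumption discussed on this slide; the sketch only shows why a rank-based statistic is less sensitive to extreme abundances.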

Species do not have the same abundances in the metacommunity, and sites differ in capacity. Statistical testing should incorporate such differences in occurrence probabilities.

Ecologists often have a good H1 hypothesis. Much discussion is about the appropriate null assumption H0. What do we expect if colonization of these three islands is random?

Ecology is interested in the differences between observed pattern and random expectation. Our statistical tests should deal with these differences and not with the raw pattern!

If we use classical Fisherian testing, nearly all empirical ecological matrices are significantly non-random. Thus we cannot separate ecological interactions from mass effects.

Theory of island biogeography (Galapagos Islands)

The theory of island biogeography tries to understand diversity from a stochastic, species-based approach.

- We treat the theory as H1: the theory gives us expectations that have to be confirmed by observation.
- We treat the theory as H0: the theory gives us random expectations. Residuals need ecological interpretation.

Figure annotation: 95% confidence limits.

Multispecies metapopulation and patch occupancy models

Islands in a fragmented landscape: random dispersal of individuals between islands results in a stable pattern of colonization. The change of occupancy p in time depends on patch size and distance according to a logistic growth equation.

Metapopulation models are single-species equivalents of the island biogeography model. Multispecies metapopulation models give null expectations on community structure.
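The slide does not give the equation itself. As a minimal sketch, assume the classical Levins form dp/dt = c·p·(1 − p) − e·p, where the colonization rate c and the extinction rate e are placeholders of this example. In a spatially explicit model, distance would typically enter through c and patch size through e, but that mapping is also an assumption here. The function name `simulate_occupancy` is hypothetical.

```python
def simulate_occupancy(p0, c, e, dt, steps):
    """Euler integration of dp/dt = c*p*(1 - p) - e*p (assumed Levins form).

    p0: initial fraction of occupied patches
    c:  colonization rate (assumed to decrease with isolation)
    e:  extinction rate (assumed to decrease with patch size)
    Returns the whole trajectory as a list of occupancies.
    """
    trajectory = [p0]
    p = p0
    for _ in range(steps):
        p = p + dt * (c * p * (1.0 - p) - e * p)
        trajectory.append(p)
    return trajectory

# With c > e the occupancy approaches the equilibrium 1 - e/c.
path = simulate_occupancy(p0=0.1, c=0.5, e=0.1, dt=0.1, steps=2000)
print(round(path[-1], 4))
```

Under this assumed form, occupancy grows logistically towards a stable equilibrium 1 − e/c, which corresponds to the stable pattern of colonization mentioned above.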

The neutral theory of biodiversity

Neutral models try to explain ecological patterns by five basic stochastic processes:
- Simple birth processes
- Simple death processes
- Immigration of individuals
- Dispersal of individuals
- Lineage branching

Neutral models are the individual-based equivalents to the species-based theory of island biogeography! Although they make predictions about diversities, they do not explicitly refer to species. Diversities refer to evolutionary lineages.

Ecological drift

The main trigger of neutrality is dispersal. At high dispersal rates, species-specific traits are of minor importance for the shape of basic ecological distributions.
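A strongly simplified sketch of ecological drift may help to fix ideas. It assumes a zero-sum local community of constant size in which each step combines one death with either a local birth or an immigration event. Lineage branching and spatially explicit dispersal are left out. The function name `neutral_drift` and the parameter `m` (immigration probability) are choices of this example, not taken from the slides.

```python
import random

def neutral_drift(community, metacommunity, m, steps, rng):
    """Zero-sum drift: every step one individual dies and is replaced.

    community:     list of lineage labels, one per individual
    metacommunity: list of lineage labels that immigrants are drawn from
    m:             probability that the replacement is an immigrant
    rng:           a random.Random instance (for reproducibility)
    """
    community = list(community)  # work on a copy
    for _ in range(steps):
        dead = rng.randrange(len(community))            # death
        if rng.random() < m:
            newcomer = rng.choice(metacommunity)         # immigration
        else:
            # Local birth; for simplicity the parent is drawn from the
            # whole community, including the individual that dies.
            newcomer = rng.choice(community)
        community[dead] = newcomer
    return community

rng = random.Random(1)
start = ["a"] * 10 + ["b"] * 10
print(sorted(set(neutral_drift(start, ["a", "b", "c"], 0.05, 2000, rng))))
```

All individuals follow the same rules regardless of their label, which is the defining assumption of neutrality. Differences in abundance arise from drift and immigration alone.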

Used as H1, neutral models make explicit predictions about:
- Shape and parameters of species rank order distributions
- Species-area relationships
- Abundance-range size relations
- Local diversity patterns
- Patterns of succession
- Local and regional species numbers
- Branching patterns of taxonomic lineages

Used as H0, residuals from model predictions are a measure of ecological interactions.

- The model contains a number of hidden variables (dispersal limitation, branching mode, dispersal probability, isolation, matrix shape, …).
- CPU times are a limiting resource.
- Variable carrying capacities are needed to obtain realistic evolutionary time scales.

Ecological realism without too many parameters

The neutral, metapopulation and island biogeography models contain too many hidden variables to be of use as null hypotheses. We need null models that are ecologically realistic and rely on few assumptions that apply to all species.

Gradient of null model assumptions including more and more constraints: null models only use information given in the matrix. These are matrix fill, marginal totals, and degree distributions.

Gradient of null model assumptions including more and more constraints

Start from an empty matrix and fill it randomly, either without constraints or according to some constraints:
- Retain fill
- Retain fill and row totals
- Retain fill and column totals
- Retain fill and row degree distribution
- Retain fill and column degree distribution
- Retain fill and row and column degree distribution
- Retain row and column totals

Constraints refer to the degree distribution or to the marginal totals.

Possible constraints:

Rows \ Columns                  | equiprobable | proportional to marginal totals | marginal totals fixed
equiprobable                    | x            | x                               | x
proportional to marginal totals | x            | x                               | x
marginal totals fixed           | x            | x                               | x

Gradient of null model assumptions including more and more constraints

Null models (rows - columns):
- Equiprobable - equiprobable
- Proportional - proportional
- Equiprobable - fixed
- Fixed - equiprobable
- Fixed - proportional
- Fixed - fixed

Properties along the gradient, from the most liberal to the most constrained end:
- Includes mass effects. Most liberal. Identifies nearly all empirical matrices as being not random. Low discrimination power.
- Partly includes mass effects. Appropriate if species abundances or site capacities are equal. Identifies most empirical matrices as being not random.
- Partly excludes mass effects. Appropriate if species abundances or site capacities are proportional to metapopulation abundances or site capacities. Identifies many empirical matrices as being not random.
- Excludes most mass effects. Appropriate if column totals are proportional to site capacities. Identifies many empirical matrices as being random.
- Excludes mass effects. Appropriate if nothing is known about abundances and capacities. Identifies most empirical matrices as being random.
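The two least constrained ends of this gradient can be sketched as follows. The sketch assumes a presence-absence matrix stored as a list of rows (species) by columns (sites). Only the matrix fill is retained. Cells are either equiprobable or chosen with a probability proportional to the product of the observed row and column totals. The function name `null_matrix` and the flag `proportional` are hypothetical and do not come from any particular software.

```python
import random

def null_matrix(observed, proportional, rng):
    """Fill an empty matrix with the observed number of occurrences.

    proportional=False: every cell is equally likely (equiprobable rows
                        and columns).
    proportional=True:  cell weight = row total * column total of the
                        observed matrix (proportional rows and columns).
    """
    n_rows, n_cols = len(observed), len(observed[0])
    row_tot = [sum(row) for row in observed]
    col_tot = [sum(col) for col in zip(*observed)]
    fill = sum(row_tot)

    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    if proportional:
        weights = [row_tot[i] * col_tot[j] for i, j in cells]
    else:
        weights = [1] * len(cells)

    result = [[0] * n_cols for _ in range(n_rows)]
    for _ in range(fill):
        # Weighted draw without replacement: remove each chosen cell.
        k = rng.choices(range(len(cells)), weights=weights)[0]
        i, j = cells.pop(k)
        weights.pop(k)
        result[i][j] = 1
    return result

observed = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1]]
print(null_matrix(observed, True, random.Random(3)))
```

Because neither version fixes the marginal totals, randomized matrices can differ strongly from the observed one in row and column sums, which is consistent with the low discrimination power noted for the most liberal models.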

Algorithms for the fixed-fixed null model

Fill algorithm
An initial empty matrix is filled step by step at random. If a placement violates the above constraints, the algorithm steps back and places the occurrence elsewhere. The process continues until all occurrences are placed.
Major drawbacks:
- Long computation times
- Potential dead ends

Swap algorithm
The algorithm screens the original matrix for checkerboards and swaps them, which leaves row and column sums constant. Use at least 10*species*sites swaps.
Major drawbacks:
- Generates biased matrices in dependence on the original distribution

Trial algorithm (sum of squares reduction)
The algorithm starts with a random matrix according to the row and column constraints and sequentially swaps all 2x2 submatrices until only 1 and 0 remain.
Major drawbacks:
- Randomized matrices have a low variance and are therefore prone to type II errors.
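The swap algorithm can be sketched in a few lines. This is a minimal illustration and not the implementation of any particular software. The function name `swap_randomize` and the safety limit `max_tries` are assumptions of this example. A checkerboard is a 2x2 submatrix of the form 10/01 or 01/10, and flipping it changes neither row nor column sums.

```python
import random

def swap_randomize(matrix, n_swaps, rng, max_tries=None):
    """Randomize a 0/1 matrix by checkerboard swaps (fixed-fixed model).

    Needs at least two rows and two columns. Stops after n_swaps
    successful swaps or after max_tries random 2x2 draws.
    """
    m = [row[:] for row in matrix]  # keep the original untouched
    n_rows, n_cols = len(m), len(m[0])
    if max_tries is None:
        max_tries = 1000 * n_swaps
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Checkerboard: equal diagonals, unequal neighbours.
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
            done += 1
    return m

observed = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0]]
species, sites = len(observed), len(observed[0])
randomized = swap_randomize(observed, 10 * species * sites, random.Random(7))
print(randomized)
```

The usage line follows the rule of thumb of at least 10*species*sites swaps. Drawing 2x2 submatrices at random instead of screening the whole matrix is a simplification made here for brevity.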

The swap algorithm is most often used

1. Sequential swap: First make a burn-in of swaps, then use each further 5000 swaps as a new random matrix.
2. Independent swap: Generate each random matrix from the original matrix using at least 10*species*sites swaps.

Compare the observed metric scores with the simulated ones (100 or more randomized matrices).

Figure: frequency distribution of the simulated scores (x-axis: scores, y-axis: frequency), marking the observed score, the lower and upper CL, and the corresponding Z-scores.
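The comparison of observed and simulated scores can be sketched as below. It assumes the usual definition Z = (observed − mean of simulated scores) / standard deviation of simulated scores, and simple percentile-based confidence limits. Both helper names (`z_score`, `confidence_limits`) and the percentile rule are choices of this example.

```python
import math
from statistics import mean, stdev

def z_score(observed, simulated):
    """Standardized effect size of the observed score."""
    return (observed - mean(simulated)) / stdev(simulated)

def confidence_limits(simulated, level=0.95):
    """Two-sided percentile limits of the simulated scores."""
    ordered = sorted(simulated)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper

# Hypothetical scores of 101 randomized matrices and one observed score.
simulated_scores = list(range(101))
observed_score = 120
lcl, ucl = confidence_limits(simulated_scores)
print(lcl, ucl, round(z_score(observed_score, simulated_scores), 2))
```

An observed score outside the interval between the lower and upper limit would be interpreted as non-random with respect to the chosen null model.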

Using abundances

Including abundances in null models increases the number of possible null models.

Abundance:
- equiprobable
- proportional to observed totals
- marginal totals fixed

Species:
- proportional to marginal totals
- equiprobable
- marginal totals fixed

Populations:
- proportional to marginal totals
- equiprobable
- populations fixed

These 27 combinations regard rows, columns, and rows and columns.

Testing of null models and metrics using proportional random matrices (200 random matrices). The metrics should not detect these matrices as being non-random.

Metrics, each with a lower (LCL) and an upper confidence limit (UCL): CA, Morisita, Variance, Mantel.

Null models tested:
- Prop-prob, total abundance fixed
- Prop-prob, row/column abundances fixed
- Prop-prop, row/column richness fixed
- Prop-prob, total richness fixed
- Row/column richness and abundance fixed
- Occurrences fixed
- Occurrences and row/column abundances fixed
- Populations fixed
- Populations per column fixed
- Populations per row fixed

Abundance matrices are more often detected as being non-random.

Figure: fraction of 185 matrices detected as being significantly (two-sided 95% CL) segregated (dark bars) or aggregated (white bars).