Single Season Model Part I. 2 Basic Field Situation From a population of S sampling units, s are selected and surveyed for the species. Units are closed.

Slides:



Advertisements
Similar presentations
Introduction to Monte Carlo Markov chain (MCMC) methods
Advertisements

Properties of Least Squares Regression Coefficients
Krishna Pacifici Department of Applied Ecology NCSU January 10, 2014.
Detectability Lab. Outline I.Brief Discussion of Modeling, Sampling, and Inference II.Review and Discussion of Detection Probability and Point Count Methods.
Model Assessment, Selection and Averaging
Maximum likelihood estimates What are they and why do we care? Relationship to AIC and other model selection criteria.
Species interaction models. Goal Determine whether a site is occupied by two different species and if they affect each others' detection and occupancy.
Chapter 4 Multiple Regression.
Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 10: Hypothesis Tests for Two Means: Related & Independent Samples.
End of Chapter 8 Neil Weisenfeld March 28, 2005.
Statistical Background
Topic 3: Regression.
Using ranking and DCE data to value health states on the QALY scale using conventional and Bayesian methods Theresa Cain.
Linear and generalised linear models Purpose of linear models Least-squares solution for linear models Analysis of diagnostics Exponential family and generalised.
Maximum likelihood (ML)
Chapter 14 Inferential Data Analysis
Bootstrapping applied to t-tests
Problem A newly married couple plans to have four children and would like to have three girls and a boy. What are the chances (probability) their desire.
Chapter 10 Hypothesis Testing
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
Introduction to plausible values National Research Coordinators Meeting Madrid, February 2010.
Overview G. Jogesh Babu. Probability theory Probability is all about flip of a coin Conditional probability & Bayes theorem (Bayesian analysis) Expectation,
Model Inference and Averaging
Chap 20-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 20 Sampling: Additional Topics in Sampling Statistics for Business.
Role of Statistics in Geography
Detecting trends in dragonfly data - Difficulties and opportunities - Arco van Strien Statistics Netherlands (CBS) Introduction.
Using historic data sources to calibrate and validate models of species’ range dynamics Giovanni Rapacciuolo University of California Berkeley
Resource Selection Functions and Patch Occupancy Models: Similarities and Differences Lyman L. McDonald Senior Biometrician Western EcoSystems Technology,
BINOMIALDISTRIBUTION AND ITS APPLICATION. Binomial Distribution  The binomial probability density function –f(x) = n C x p x q n-x for x=0,1,2,3…,n for.
Bayesian vs. frequentist inference frequentist: 1) Deductive hypothesis testing of Popper--ruling out alternative explanations Falsification: can prove.
2.4 Units of Measurement and Functional Form -Two important econometric issues are: 1) Changing measurement -When does scaling variables have an effect.
Three Frameworks for Statistical Analysis. Sample Design Forest, N=6 Field, N=4 Count ant nests per quadrat.
Chapter 2 Statistical Background. 2.3 Random Variables and Probability Distributions A variable X is said to be a random variable (rv) if for every real.
Roghayeh parsaee  These approaches assume that the study sample arises from a homogeneous population  focus is on relationships among variables 
A generalized bivariate Bernoulli model with covariate dependence Fan Zhang.
Chapter 20 Classification and Estimation Classification – Feature selection Good feature have four characteristics: –Discrimination. Features.
The effect of variable sampling efficiency on reliability of the observation error as a measure of uncertainty in abundance indices from scientific surveys.
Machine Learning 5. Parametric Methods.
Tutorial I: Missing Value Analysis
Statistics Sampling Distributions and Point Estimation of Parameters Contents, figures, and exercises come from the textbook: Applied Statistics and Probability.
1 Chapter 8: Model Inference and Averaging Presented by Hui Fang.
Review of statistical modeling and probability theory Alan Moses ML4bio.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University.
Matrix Models for Population Management & Conservation March 2014 Lecture 10 Uncertainty, Process Variance, and Retrospective Perturbation Analysis.
Multiple Season Model Part I. 2 Outline  Data structure  Implicit dynamics  Explicit dynamics  Ecological and conservation applications.
Estimation of Animal Abundance and Density Miscellaneous Observation- Based Estimation Methods 5.2.
Statistical Methods. 2 Concepts and Notations Sample unit – the basic landscape unit at which we wish to establish the presence/absence of the species.
Capture-recapture Models for Open Populations “Single-age Models” 6.13 UF-2015.
 1 Species Richness 5.19 UF Community-level Studies Many community-level studies collect occupancy-type data (species lists). Imperfect detection.
Hypothesis Testing. Statistical Inference – dealing with parameter and model uncertainty  Confidence Intervals (credible intervals)  Hypothesis Tests.
Monitoring and Estimating Species Richness Paul F. Doherty, Jr. Fishery and Wildlife Biology Department Colorado State University Fort Collins, CO.
Sampling Design and Analysis MTH 494 LECTURE-11 Ossam Chohan Assistant Professor CIIT Abbottabad.
Single Season Occupancy Modeling 5.13 UF Occupancy Modeling State variable is proportion of patches that is occupied by a species of interest.
The inference and accuracy We learned how to estimate the probability that the percentage of some subjects in the sample would be in a given interval by.
1 Occupancy models extension: Species Co-occurrence.
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
 1 Modelling Occurrence of Multiple Species. 2 Motivation Often there may be a desire to model multiple species simultaneously.  Sparse data.  Compare/contrast.
 Multi-state Occupancy. Multiple Occupancy States Rather than just presence/absence of the species at a sampling unit, ‘occupancy’ could be categorized.
Multiple Season Study Design. 2 Recap All of the issues discussed with respect to single season designs are still pertinent.  why, what and how  how.
Single Season Study Design. 2 Points for consideration Don’t forget; why, what and how. A well designed study will:  highlight gaps in current knowledge.
 Occupancy Model Extensions. Number of Patches or Sample Units Unknown, Single Season So far have assumed the number of sampling units in the population.
Hierarchical Models. Conceptual: What are we talking about? – What makes a statistical model hierarchical? – How does that fit into population analysis?
Prediction and Missing Data. Summarising Distributions ● Models are often large and complex ● Often only interested in some parameters – e.g. not so interested.
Chloe Boynton & Kristen Walters February 22, 2017
12 Inferential Analysis.
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
Estimating mean abundance from repeated presence-absence surveys
12 Inferential Analysis.
Parametric Methods Berlin Chen, 2005 References:
Learning From Observed Data
Presentation transcript:

Single Season Model Part I

2 Basic Field Situation From a population of S sampling units, s are selected and surveyed for the species. Units are closed to changes in occupancy during a common ‘season’. Units must be repeatedly surveyed within a season.

3 Conceptual Sampling Framework Season 1 12k1k1... Surveys 2 12k2k2... T 12kTkT Local Extinction Colonization Closure

4 Resulting Data Unit s 000

5 Single Season Model Investigating patterns in occupancy. Variety of approaches all recognizing that an observed ‘absence’ may be the result of a true absence or a nondetection. (e.g., Geissler and Fuller 1987, Azuma et al. 1990, MacKenzie et al. 2002, Tyre et al. 2003, Wintle et al and Stauffer et al. 2004) MacKenzie et al. (2002) provide most general treatment of the problem.

6 Single Season Model  = probability a unit is occupied. p j = probability species is detected at a unit in survey j (given presence).

7 Single Season Model Consider the data consists of 2 ‘layers’ 1. True presence/absence of the species. 2. Observed data which is conditional upon species distribution. Knowledge about the first layer is imperfect. Must account for the observation process to make reliable inferences about occurrence.

8 Model Development Biological Reality Field Observations

9 Observed Data Likelihood Model all possible stochastic processes that may have resulted in observed detection histories. Take verbal description of the observed data and translate it into a mathematical equation.

10 Observed Data Likelihood For example, Verbal description: species is present at the unit, was detected in first and third survey, not detected in second survey. Mathematical translation:

11 Observed Data Likelihood For example, Verbal description: species is present at the unit and was never detected, OR species is absent. Mathematical translation:

12 Observed Data Likelihood Biological Reality Present Absent Observations

13 Observed Data Likelihood Biological Reality Present Absent Observations

14 Observed Data Likelihood Biological Reality Present Absent Observations

15 Observed Data Likelihood Model likelihood is the product of the probability statements. Likelihood can be maximized to obtain MLE’s, or used within a Bayesian framework.

16 Complete Data Likelihood The data we wish we had! z i is true presence/absence of species at unit i Bernoulli random variable with Pr(success) = ψ

17 Complete Data Likelihood Compare with:

18 Complete Data Likelihood h ij is detection/nondetection of species (given presence) in survey j of unit i Bernoulli random variable with Pr(success) = p

19 Complete Data Likelihood Combining these terms, the overall likelihood can be expressed as: Note that many terms will disappear

20 Complete Data Likelihood For example, if z i = 1 and h i = 101

21 Complete Data Likelihood For example, if z i = 1 and h i = 000

22 Complete Data Likelihood For example, if z i = 0 and h i = 000

23 Complete Data Likelihood Unfortunately, the true values will typically be unknown Solutions: Replace with expected values Expectation – Maximization (EM) Algorithm Replace with imputed values (data augmentation) Markov Chain Monte Carlo (MCMC)

24 Complete Data Likelihood CDL can also be expanded to include units that were never surveyed

25 Covariates Season-specific constant within a season, but may vary between seasons. e.g., habitat type, patch size, generalized weather patterns Survey-specific may vary between surveys. e.g., local environmental conditions, observers

26 Covariates Occupancy and detection probabilities may be functions of season-specific covariates (via logit link, say).

27 Covariates Detection probabilities may also be a function of survey-specific covariates.

28 Covariates Covariates may be continuous or categorical. Advisable to standardize continuous covariates on to some meaningful scale such that covariates are approximately symmetrically distributed about zero. z-transformation can be done within PRESENCE.

29 Covariates Categorical covariates with m categories should be represented with m – 1 indicator (dummy) variables. e.g., if 4 habitat types, use 3 indicator variables; HabA, HabB, HabC, with habitat D considered the ‘standard’. However, suggest all m indicator variables be included in data file.

30 ‘Missing’ Observations Implicit assumption that j th survey of all units are conducted at (approximately) the same time; possibly unlikely in practice. Weather or breakdowns may result in some units not being surveyed. Equal sampling effort may not be possible with available resources.

31 ‘Missing’ Observations Day Unit

32 ‘Missing’ Observations Survey-specific covariates can only be missing if associated detection survey is also missing. Season-specific covariates cannot be missing.

33 Pr(occupied|nondetection) “Given the species was not detected, is the unit occupied?” is sometimes of interest. Can be derived from modelling results using Bayes theorem.

34 Pr(occupied|nondetection) Standard errors can be derived with the delta method.

Single Season Model Part II

36 Model Assumptions Closure. Surveys are independent. No unmodelled heterogeneity. Species identified correctly (no false detections)

37 What if Occupancy Changes? If species physically occupies units at random within a season, ‘occupancy’ parameter relates to probability a unit is used by the species. Immigration/emigration only; ‘occupancy’ parameter relates to probability is present at a unit at end/beginning of season respectively. should allow detection probability to vary within a season. should pool some survey occasions.

38 What if Occupancy Changes? Other non-random changes may cause biases. Is a ‘season’ defined appropriately? Is time between surveys appropriate? Area of active research.

39 What if Occupancy Changes? Closure implies 2 nd component = 1. When changes are random, ‘detection probability’ is the product of 2 nd and 3 rd components

40 Lack of Independence Surveys are not independent if the outcome of survey A is dependent upon the outcome of survey B. Some forms of dependence may be accommodated with good designs or modelling. In some instances, parameter estimates may be OK, but standard errors too small.

41 Lack of Independence To account for ‘trap response’; define a survey-specific covariate that equals 1 for all surveys after first detection at a site, 0 otherwise. X ij h

42 Accounting for Lack of Closure/Independence “Spatial Correlation” option in PRESENCE (Hines et al. 2010, Ecol. Aps.) Motivating example: surveying tigers on trails in India Introduce 2 new parameters θ – Pr(tiger on trail given not on trail in previous segment): thta0 θ’ – Pr(tiger on trail given on trail in previous segment): thta1 θ’’ – Pr(tiger on trail in first segment)

43 Accounting for Lack of Closure/Independence

44 Accounting for Lack of Closure/Independence

45 Accounting for Lack of Closure/Independence Could also possibly be applied to account for temporal correlation or non-random changes in occupancy within a season.

46 Unmodelled Heterogeneity In occupancy probabilities parameter estimates should still be valid as average values across the units surveyed. In detection probabilities occupancy will be underestimated. covariates may account for some sources of variation.

47 Incorporating Heterogeneity Finite Mixtures Assume occupied units consist of G groups. Each group has a different p. Group membership is unknown hence a unit may belong to any of the G groups.

48 Incorporating Heterogeneity Random Effects Assume p i is a random value from a continuous probability distribution (e.g., beta distribution, logit-normal). Closed form expressions possible for some distributions. Easily implemented using WinBUGS.

49 Abundance Induced Heterogeneity Differences in the local abundance of the species between units may induce heterogeneity in detection probability. Royle and Nichols (2003) suggested an extension of the above method to accommodate this.

50 Abundance Induced Heterogeneity Local abundance is unknown, but a spatial distribution could be assumed (e.g., Poisson).

51 Abundance Induced Heterogeneity

52 Abundance Induced Heterogeneity Occupancy is now a derived parameter: (1–e –  ). Implicit assumption that the number of animals at a unit is constant. Care must be taken in how ‘abundance’ should be interpreted.

53 Abundance Induced Heterogeneity Similar model can result through random effect on complementary log- log link function for p. A good approach for incorporating heterogeneity in detection for occupancy estimation, less reliable if inferences about abundance are desired.

54 Species Misidentification If species is falsely detected then occupancy could be overestimated. Could be incorporated into the models if there is data on known false detections. e.g., from observer training, comparison of lab and field identification of species, etc.

55 Species Misidentification Recent paper by Miller et. al. (2011) Ecology develops method that can account for misidentification provided there is a subset of detections for which misidentifications can not occur Incorporated in PRESENCE

56 Assessing Model Fit MacKenzie and Bailey (2004) suggest a test based on the observed and expected number of sites with each possible detection history. The expected number is predicted by the model, which may include covariates.

57 Assessing Model Fit For example, for the history 101 : Unit Total0.216

58 Assessing Model Fit Use parametric bootstrap to assess the evidence for lack of fit. used to adjust SE’s and AIC values.

59 Assessing Model Fit Test has been shown to perform well to identify poor model structure w.r.t. detection probabilities, but not occupancy probabilities. Found to have generally low-power, especially for the sample sizes expected in many applications ( s <100). Recommend ≥10,000 bootstraps.

60 Assessing Model Fit Test should be conducted on the most complicated model under consideration (the global model) and results applied to all models in candidate set. e.g., if from global model, this value is used to adjust AIC values and SE’s for all models.

61 Assessing Model Fit Posterior predictive checks could be used within a Bayesian setting to determine whether observed features of the data is “similar” to what a fitting model would expect.

62 Finite Population In some circumstances, the sample of s units may constitute a large fraction of the population of interest. Strictly speaking, the methods above estimate the probability of occupancy. an underlying characteristic of the population. The proportion of units occupied is a realisation of this process.

63 Finite Population Distinction between them is of little practical consequence for ‘infinite’ populations, but may be for ‘finite’ populations. SE’s will be too large if not accounted for.

64 Finite Population The proportion of occupied sites could be calculated as: SE derived from delta method

65 Finite Population When no covariates in the model, variance (ie SE 2 ) of the proportion is:

66 Finite Population Alternatively, easily implemented using the data augmentation approach. Presence/absence of the species is predicted for units where species not detected. Occupancy state can also be predicted for units that were never surveyed.

67 Finite Population Post. Distn.  mean = 0.56, sd = 0.10 {2.5, 50, 97.5} %iles = {0.37, 0.56, 0.78} Post. Distn.  * mean = 0.56, sd = 0.08 {2.5, 50, 97.5} %iles = {0.46, 0.54, 0.74}

68 Spatial Correlation Probability of occupancy at a unit may depend upon whether a ‘neighbouring’ unit is also occupied. Some forms of clustering may be well explained by covariates. May not be important to account for if interest is in an overall measure of occupancy. May be more important if interest is in maps or predictions of unit-level occupancy.

69 Spatial Correlation What fraction of cells are occupied? What would be the appropriate SE?

70 Spatial Correlation Recent simulations have confirmed that overall estimates are unbiased, with appropriate standard errors, without accounting for spatial correlation when units are sampled randomly.

71 Spatial Correlation Sargeant et al. (2005) suggested an approach using image restoration methods. doesn’t generalize easily to incorporate covariates. Magoun et al. (2007) assumed spatially correlated residuals with logistic regression.

72 Spatial Correlation Alternatively, could model occupancy with the autologistic function. essentially the logit link with an ‘effect’ related to the number of occupied neighboring units. g i is a function of the occupied ‘neighboring’ units But the exact occupancy state of units will often be unknown…

73 Spatial Correlation Theoretically, could integrate over all unknown occupancy states for all units in the landscape (i.e., develop ODL). Practically, much easier to develop the complete data likelihood and use data augmentation or EM algorithm.

74 Spatial Correlation Detection probability may also be a function of the number of neighboring occupied units (e.g., harder to detect a species at the edge of it’s range).

75

76

77

78

79

80 Summary Historically, investigating patterns in occupancy has been the main focus of such studies. A suite of flexible methods are now available that account for: detectability covariates unequal sampling effort heterogeneity finite populations spatial correlation Useful for assessing a snapshot of a population, but not for understanding the underlying dynamics.