Mixture models for estimating population size with closed models. Shirley Pledger, Victoria University of Wellington, New Zealand. IWMC, December 2003.

2 Acknowledgements
Gary White, Richard Barker, Ken Pollock, Murray Efford, David Fletcher, Bryan Manly

3 Background
Closed populations: no birth, death or migration.
Short time frame, K samples.
Estimate abundance, N.
Capture probability p: which model?
Otis et al. (1978) framework.

4
M(tbh)
M(tb) M(th) M(bh)
M(t) M(b) M(h)
M(0)

5 Models for p
M(0), null model: p constant.
M(t), Darroch model: p varies over time.
M(b), Zippin model: behavioural response to first capture, move from p to c.
M(h), heterogeneity: p varies by animal.
M(tb), M(th), M(bh) and M(tbh): combinations of these effects.

6 Likelihood-based models
M(0), M(t) and M(b): in CAPTURE, MARK.
M(tb): need to assume a connection, e.g. c and p series additive on the logit scale.
M(h) and M(bh): Norris and Pollock (1996).
M(th) and M(tbh): Pledger (2000).
Heterogeneous models use finite mixtures.

7 M(h)
C animal classes, unknown membership.
Animal i is from class c with probability $\pi_c$.
[Diagram: animal i belongs to Class 1 with probability $\pi_1$ (capture probability $p_1$) or to Class 2 with probability $\pi_2$ (capture probability $p_2$).]

8 $M(h_2)$ parameters
N
$\pi_1$ and $\pi_2$
$p_1$ and $p_2$
Only four independent, as $\pi_1 + \pi_2 = 1$.
Can extend to $M(h_3)$, $M(h_4)$, etc.
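A two-class mixture of this kind implies that the number of times one animal is caught over K samples follows a mixture of binomials. The following is a minimal sketch of that calculation, assuming p is constant over samples within each class; the function names and the numeric values are illustrative and not taken from the talk.

```python
from math import comb

def capture_count_prob(x, K, pi, p):
    """P(an animal is caught exactly x times in K samples) under a finite mixture.

    pi: class proportions (summing to 1); p: capture probability for each class.
    """
    return sum(w * comb(K, x) * q**x * (1 - q)**(K - x) for w, q in zip(pi, p))

def prob_never_caught(K, pi, p):
    """Probability that an animal is missed in all K samples."""
    return capture_count_prob(0, K, pi, p)

# Hypothetical values: 60% of animals with p = 0.1, 40% with p = 0.5, K = 5 samples.
pi = [0.6, 0.4]
p = [0.1, 0.5]
K = 5
probs = [capture_count_prob(x, K, pi, p) for x in range(K + 1)]
```

The probability of never being caught is what links the number of animals seen to N: with low p in one class, a large share of that class goes unseen.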

9 M(th) parameters
N
$\pi_1$ and $\pi_2$ (if C = 2)
p matrix, C by K: $p_{cj}$ is the capture probability for class c at sample j.
Two versions:
1. Interactive, M(t x h), different profiles.
2. Additive (on logit scale), M(t+h).

10 M(t x h), interactive
Different classes of animals have different profiles for p.
Species richness applications.

11 M(t+h), additive (on logit scale)
For Class 1, $\log(p_j/(1-p_j)) = \tau_j$.
For Class 2, $\log(p_j/(1-p_j)) = \tau_j + \eta_2$.
Parameter $\eta_2$ adjusts p up or down for class 2.
Similar to Chao M(th).
Example: Duvaucel's geckos.
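The additive structure can be sketched numerically: each class shares the same time profile on the logit scale, shifted by a class adjustment. This is a minimal sketch with hypothetical values; the names `tau` and `eta` for the time effects and class adjustments are assumptions made for the illustration.

```python
from math import exp, log

def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + exp(-x))

def logit(p):
    return log(p / (1.0 - p))

def additive_th(tau, eta):
    """Capture probabilities for an additive time + class model.

    tau: time effect for each sample (logit scale).
    eta: adjustment for each class, with eta[0] = 0 for class 1.
    Returns a list of rows, one row of probabilities per class.
    """
    return [[inv_logit(t + e) for t in tau] for e in eta]

# Hypothetical values: three samples, class 2 shifted down by 0.5 on the logit scale.
p = additive_th(tau=[0.0, 1.0, -1.0], eta=[0.0, -0.5])
```

On the logit scale the two rows are parallel: the gap between classes is the same at every sample, which is what distinguishes M(t+h) from the interactive M(t x h).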

12 M(bh) parameters
N
$\pi_1, \ldots, \pi_C$ (C classes, $\sum_c \pi_c = 1$)
$p_1, \ldots, p_C$ for first capture
$c_1, \ldots, c_C$ for recapture
Two versions:
1. Interactive, M(b x h), different profiles.
2. Additive (on logit scale), M(b+h).

13 M(b x h), interactive
Different sizes of trap-shy response.
One class bold for first capture, large trap response.
Second class timid at first, slight trap response.

14 M(b + h), additive (logit scale)
Parallel lines on logit scale.
For Class 1, $\log(p/(1-p)) = \mu_1$ and $\log(c/(1-c)) = \mu_1 + \beta$.
For Class 2, $\log(p/(1-p)) = \mu_2$ and $\log(c/(1-c)) = \mu_2 + \beta$.
Common $\beta$ adjusts for behaviour effect.

15 M(tbh)
Parameters N and $\pi_1, \ldots, \pi_C$ (C classes).
Interactive version: each class has a p series and a c series, all non-parallel.
Fully additive version: on the logit scale, have a basic sequence for p over time, use $\beta$ to adjust for recapture and $\eta$ to adjust for different classes.
There are also other intermediate models, partially additive.

16 M(t x b x h)
For class c, sample j,
$\mathrm{logit}(p_{jc}) = \mu + \tau_j + \beta\delta + \eta_c + (\tau\beta)_j\delta + (\tau\eta)_{jc} + (\beta\eta)_c\delta + (\tau\beta\eta)_{jc}\delta$
where $\delta$ is a 0/1 dummy variable, value 1 for a recapture. (Constraints occur.)

17 Other Models
M(t+b+h): omit interaction terms.
M(t x h): omit terms with $\beta$.
M(t + h): also omit the $(\tau\eta)$ interaction term.
M(b x h): omit $\tau$ terms.
M(0) has $\mu$ only.
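How the simpler models arise by dropping terms can be sketched for the additive case. The sketch below covers only the main effects (an intercept, a time effect, a recapture adjustment and a class adjustment) and leaves out all interaction terms; the argument names `mu`, `tau_j`, `beta` and `eta_c` are assumed labels for those effects, and the numbers are hypothetical.

```python
from math import exp

def inv_logit(x):
    return 1.0 / (1.0 + exp(-x))

def capture_prob(mu, tau_j=0.0, beta=0.0, eta_c=0.0, recapture=False):
    """Capture probability under an additive logit model (main effects only).

    mu: intercept; tau_j: time effect for sample j;
    beta: adjustment applied only on recapture; eta_c: adjustment for class c.
    """
    dummy = 1.0 if recapture else 0.0   # 1 for a recapture, 0 for a first capture
    return inv_logit(mu + tau_j + beta * dummy + eta_c)

# M(t+b+h): all main effects present (hypothetical values).
p_first = capture_prob(mu=-1.0, tau_j=0.3, beta=-0.8, eta_c=0.5)
c_recap = capture_prob(mu=-1.0, tau_j=0.3, beta=-0.8, eta_c=0.5, recapture=True)

# M(t+h): drop the behaviour term, so first capture and recapture coincide.
p_th = capture_prob(mu=-1.0, tau_j=0.3, eta_c=0.5)
c_th = capture_prob(mu=-1.0, tau_j=0.3, eta_c=0.5, recapture=True)

# M(0): intercept only.
p_null = capture_prob(mu=-1.0)
```

A negative behaviour adjustment gives a trap-shy response (recapture probability below first-capture probability); a positive one gives a trap-happy response.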

18 M(t x b)
Can't do M(t x b): too many parameters for the minimal sufficient statistics.
Can do M(t+b) using logit. Similar to Burnham's power series model in CAPTURE.
Why can we do M(t x b x h) (which has more parameters), but not M(t x b)?

19 Now have these models:
M(t x b x h)
M(t x b) M(t x h) M(b x h)
M(t) M(b) M(h)
M(0)
M(t+b) M(t+h) M(b+h)
M(t+b+h)

20 Example - skinks
Polly Phillpot, unpublished M.Sc. thesis.
Spotted skink, Oligosoma lineoocellatum.
North Brother Island, Cook Strait, 1999.
Pitfall traps.
April: 8 days, 171 adults, 285 captures. Daily captures varied from 2 to 99 (average < 40).
November: 7 days, 168 adults, 517 captures (20 to 110 daily, average > 70).

23 April: table of Rel(AICc) and npar by model (numeric values not recoverable). Models in the order listed:
M(t + b + h), M(t x h), M(t x b x h), M(t + h), M(t), M(t + b), M(b x h), M(b + h), M(b), M(h), M(0)

24 November: table of Rel(AICc) and npar by model (numeric values not recoverable). Models in the order listed:
M(t x b x h), M(t x h), M(t + b + h), M(t + h), M(t + b), M(t), M(b + h), M(h), M(b x h), M(0), M(b)

25 Abundance Estimates
Used model averaging.
April: N estimate = 206 (s.e. = 33.0), 95% CI (141, 270).
November: N estimate = 227 (s.e. = 38.7), 95% CI (151, 302).
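The slides do not say which averaging scheme was used. One common choice is to weight each model's estimate by its Akaike weight and to add a between-model term to the standard error. The sketch below shows that scheme under this assumption, with hypothetical per-model estimates, standard errors and Rel(AICc) values (none of them from the talk).

```python
from math import exp, sqrt

def akaike_weights(rel_aicc):
    """Convert AICc values (absolute or relative) into weights that sum to 1."""
    best = min(rel_aicc)
    raw = [exp(-0.5 * (a - best)) for a in rel_aicc]
    total = sum(raw)
    return [r / total for r in raw]

def model_average(estimates, std_errors, rel_aicc):
    """Weighted average of estimates, with a standard error that includes
    the spread of the per-model estimates around the average."""
    w = akaike_weights(rel_aicc)
    n_bar = sum(wi * ni for wi, ni in zip(w, estimates))
    se = sum(wi * sqrt(si**2 + (ni - n_bar)**2)
             for wi, ni, si in zip(w, estimates, std_errors))
    return n_bar, se

# Hypothetical inputs for three candidate models.
n_hat, se = model_average([200.0, 215.0, 190.0], [30.0, 35.0, 25.0], [0.0, 1.2, 4.0])

# Symmetric normal-theory interval, the same form as estimate +/- 1.96 s.e.
ci = (n_hat - 1.96 * se, n_hat + 1.96 * se)
```

The intervals quoted on the slide are consistent with this symmetric form: for April, 206 - 1.96 x 33.0 is about 141 and 206 + 1.96 x 33.0 is about 271.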

26 Using MARK
Data entry: as usual, e.g. an encounter history followed by "5;" for 5 animals with that encounter history.
Select "Full closed Captures with Het."
Select input data file, name data base, give number of occasions, choose number of classes, click OK.
Starting model is M(t x b x h).
Following example has 2 classes, 5 sampling occasions.
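The grouped data-entry format (an encounter history, a count, then a semicolon) can be produced from a 0/1 capture matrix. This is a minimal sketch; the function name and the toy data are hypothetical, and only animals caught at least once appear in the input.

```python
from collections import Counter

def mark_input_lines(capture_matrix):
    """Collapse a 0/1 capture matrix (one row per animal, one column per
    occasion) into lines of the form 'history count;'."""
    histories = Counter("".join(str(int(x)) for x in row) for row in capture_matrix)
    return [f"{history} {count};"
            for history, count in sorted(histories.items(), reverse=True)]

# Toy data: five animals share one history, two share another.
captures = [[1, 0, 1, 0, 0]] * 5 + [[0, 1, 1, 0, 1]] * 2
lines = mark_input_lines(captures)
```

Grouping identical histories keeps the file short; listing one animal per line with a count of 1 would carry the same information.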

27 Parameters for M(t x b x h)
[Parameter index table; rows: $\pi_1$ (index 1), p for class 1, p for class 2, c for class 1, c for class 2, N (index 20).]

28 M(t x h): set p=c
[Parameter index table; rows: $\pi_1$ (index 1), p for class 1, p for class 2, c for class 1, c for class 2, N (index 12).]

29 M(b x h): constant over time
[Parameter index table; rows: $\pi_1$ (index 1), p for class 1, p for class 2, c for class 1, c for class 2, N (index 6).]

30 M(t)
[Parameter index table; rows: $\pi_1$ (index 1, fixed), p for class 1, p for class 2, c for class 1, c for class 2, N (index 7).]

31 M(b)
[Parameter index table; rows: $\pi_1$ (index 1, fixed), p for class 1, p for class 2, c for class 1, c for class 2, N (index 4).]

32 M(0)
[Parameter index table; rows: $\pi_1$ (index 1, fixed), p for class 1, p for class 2, c for class 1, c for class 2, N (index 3).]
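The index given for N on each of these slides (20, 12, 6, 7, 4 and 3, for 2 classes and 5 sampling occasions) can be reproduced by counting the distinct parameters that come before it. The sketch below assumes the numbering order is mixing proportion, then p, then c, then N, and that the mixing proportion keeps its index even when it is fixed. The function name is hypothetical, and it does not check whether a model is identifiable.

```python
def index_of_N(time, behaviour, heterogeneity, n_classes=2, n_occasions=5):
    """Index of N when parameters are numbered: mixing proportions,
    p series, c series, then N."""
    n_pi = n_classes - 1                       # indexed even when fixed
    classes = n_classes if heterogeneity else 1
    p_per_class = n_occasions if time else 1
    if behaviour:
        # No recapture is possible at the first occasion.
        c_per_class = n_occasions - 1 if time else 1
    else:
        c_per_class = 0                        # c shares the p indices
    return n_pi + classes * (p_per_class + c_per_class) + 1

indices = {
    "M(t x b x h)": index_of_N(True, True, True),
    "M(t x h)": index_of_N(True, False, True),
    "M(b x h)": index_of_N(False, True, True),
    "M(t)": index_of_N(True, False, False),
    "M(b)": index_of_N(False, True, False),
    "M(0)": index_of_N(False, False, False),
}
```

Because the count includes a fixed mixing proportion, the index of N is one more than the number of estimated parameters in M(t), M(b) and M(0).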

33 M(t + h): use M(t x h) parameters (as below), plus a design matrix
[Parameter index table; rows: $\pi_1$ (index 1), p for class 1, p for class 2, c for class 1, c for class 2, N (index 12).]

34 Design matrix for M(t + h)
Use logit link.
[Design matrix; columns B1 to B8; rows: $\pi_1$, p class 1, p class 2, N.]
$\beta_7$ is $\eta$: adjusts for class 2.
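The entries of the design matrix did not survive in this transcript, so the following is only a sketch of one layout that fits what remains: 8 columns, 12 rows for the M(t x h) parameters with 5 occasions and 2 classes, and column B7 as the class-2 adjustment. The row order (mixing proportion, p for class 1, p for class 2, N) and the placement of the other columns are assumptions.

```python
def design_matrix_t_plus_h(n_occasions=5):
    """One possible 0/1 design matrix for an additive time + class model
    with two classes. Columns: mixing proportion, one per occasion,
    class-2 adjustment, N."""
    n_cols = 1 + n_occasions + 1 + 1

    def row(*ones):
        r = [0] * n_cols
        for k in ones:
            r[k] = 1
        return r

    class2_col = 1 + n_occasions               # column B7 when n_occasions = 5
    rows = [row(0)]                                                   # mixing proportion
    rows += [row(1 + j) for j in range(n_occasions)]                  # p, class 1
    rows += [row(1 + j, class2_col) for j in range(n_occasions)]      # p, class 2
    rows.append(row(n_cols - 1))                                      # N
    return rows

X = design_matrix_t_plus_h()
```

Each class-2 row repeats the class-1 row for the same occasion with one extra 1 in the adjustment column, which is what makes the two profiles parallel on the logit scale.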

35 M(b + h)
Start with M(b x h) and use this design matrix, with logit link.
[Design matrix; columns B1 to B5; rows: $\pi_1$, p class 1, p class 2, c class 1, c class 2, N.]

36 M(t + b + h)
Start with M(t x b x h).
Use one $\beta$ to adjust for recapture.
For each class above 1, use another $\beta$ for the class adjustment.

37 Time Covariates
Time effect could be weather, search effort.
Logistic regression: in logit(p), replace $\tau_j$ with a linear response, e.g. $\gamma x_j + \lambda w_j$, where $x_j$ is search effort and $w_j$ is a weather variable (temperature, say) at sample j.
Logistic factors: use dummy variables to code for (say) different searchers, or low and high rainfall.
Skinks: maximum daily temperature gave good models, but not as good as full time effect.

38 Multiple Groups
Compare: same capture probabilities?
If equal-sized grids at different locations, N indexes density: compare densities in different habitats.
Cielle Stephens, M.Sc. (in progress): skinks.
Good design: eight equal grids, two in each of four different habitat types.
Between and within habitat density comparisons.
Temporary marks.

39 Discussion
Advantages of maximum likelihood estimation: AICc, LRTs (likelihood ratio tests), PLIs (profile likelihood intervals).
Working well for model comparison.
Two classes enough? Try three or more classes, look at estimates.

40 If heterogeneity is detected, models including h have higher N and s.e.(N).
If heterogeneity is not supported by AICc, the heterogeneous models may fail to fit. See the parameter estimates.
M(t x b x h) often fails to fit: see parameter estimates (watch for zero s.e., p or c at 0 or 1).

41 Alternative M(h): use Beta distribution for p (infinite mixture).
Which performs better? Depends on region of parameter space chosen by the data.
Often similar N estimates.
Don't believe in the classes or the Beta distribution. Just a trick to allow p to vary and hence reduce bias in N.
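With a Beta distribution for p, the number of captures of one animal over K samples follows a beta-binomial distribution. The sketch below computes those probabilities, assuming p is constant over samples for each animal; the parameter names `a` and `b` for the two Beta parameters and the numeric values are illustrative.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(x, K, a, b):
    """P(an animal is caught exactly x times in K samples) when its
    capture probability is drawn from a Beta(a, b) distribution."""
    return comb(K, x) * exp(log_beta(x + a, K - x + b) - log_beta(a, b))

# Hypothetical values: Beta(0.8, 3.0) over K = 5 samples.
K = 5
probs = [beta_binomial_pmf(x, K, 0.8, 3.0) for x in range(K + 1)]
p_never_caught = probs[0]
```

When the first Beta parameter is below 1, the density of p is unbounded near zero, so many animals are nearly uncatchable and the probability of never being caught is large.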

42 All models poor if not enough recaptures.
Warning signals needed.
Finite mixtures: one class with very low p.
Beta distribution: first parameter estimate < 1.
Often with finite mixtures, estimates of $\pi$ and p are imprecise, but N estimates are good.