Wildlife Population Analysis


1 Wildlife Population Analysis
Lecture 07 – Tag recovery models, Mark-recapture models, Goodness-of-fit

2 “Open” Population models
Dynamic processes: mortality (1 - S), natality, fertility (F), emigration, immigration
Generally do not estimate population size (N)
Estimate one or more dynamic processes
Simplest models just examine survival rates

3 Tag recovery models Lecture 7 – Part I

4 Tag recovery model Open population models
Employed to estimate survival rates based on markers returned from harvested animals.
Assumptions:
  Recoveries occur at the same time (estimates are relatively robust to violations of this assumption)
  Heterogeneity in the parameters of interest is modelled appropriately

5 Recovery data
m-array (also called the encounter or recovery matrix): each row = one multinomial distribution.
Model solution = simultaneous solution (sum) of all of the log-likelihoods.
Example encounter histories: 110000 110000 011000 101000 100100 001100

6 Tag recovery models
Known:
  Ri – number of animals marked in year i
Random variables:
  mij – number of bands or tags reported
Parameters of interest (pij):
  Sj – conditional probability of survival
  rj – conditional probability of being reported (fi in Brownie et al.)

7 Tag Recovery Dendrogram
Capture history (release, then recovery occasions 1 2 3), fate, and probability:
  1100  dead in year 1, reported      (1-S1)r1
  1000  dead in year 1, unreported    (1-S1)(1-r1)
  1010  dead in year 2, reported      S1(1-S2)r2
  1000  dead in year 2, unreported    S1(1-S2)(1-r2)
  1001  dead in year 3, reported      S1S2(1-S3)r3
  1000  dead in year 3, unreported    S1S2(1-S3)(1-r3)
  1000  alive after year 3            S1S2S3

8 m-array
Very concise way to display the data as it relates to the capture histories and the multinomial likelihoods.
Each row contains the data for a multinomial likelihood.

9 m-array
Each row represents a cohort of marked individuals released on occasion i: R(i).
This number includes those initially marked at other times and released again at time i.
The columns indicate the sampling occasions 2 through m.
mij – number of individuals released at i and next encountered at time j.
The total on the right is the number of individuals released at occasion i that were seen again.

10 m-array - points
Handles both band recoveries (dead animals) and recapture/re-sighting data.
Live releases are treated as newly marked animals in year i + 1.
Each cell in the m-array has an associated probability (model).
Each row has its own multinomial likelihood.
It is the simultaneous solution of all of the likelihoods that makes this analysis possible.
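The point that each m-array row has its own multinomial likelihood, and that the rows are solved together, can be sketched in Python. This is a minimal illustration under stated assumptions, not the routine Program MARK uses; the function names and the example counts are hypothetical.

```python
import math


def row_log_likelihood(R_i, m_row, p_row):
    """Multinomial log-likelihood for one cohort (one m-array row).

    m_row holds the observed recoveries per occasion and p_row the matching
    cell probabilities. The final, implicit cell holds animals never recovered.
    """
    counts = list(m_row) + [R_i - sum(m_row)]
    probs = list(p_row) + [1.0 - sum(p_row)]
    # log of the multinomial coefficient: log R! - sum(log m!)
    ll = math.lgamma(R_i + 1) - sum(math.lgamma(c + 1) for c in counts)
    # add count * log(probability) for every cell with a nonzero count
    ll += sum(c * math.log(q) for c, q in zip(counts, probs) if c > 0)
    return ll


def total_log_likelihood(R, m, p):
    """Sum of the row log-likelihoods (rows treated as independent cohorts)."""
    return sum(row_log_likelihood(R_i, m_i, p_i) for R_i, m_i, p_i in zip(R, m, p))


# Hypothetical two-cohort example.
R = [100, 120]
m = [[5, 3], [6]]
p = [[0.05, 0.03], [0.05]]
print(total_log_likelihood(R, m, p))
```

In a real analysis the cell probabilities would be functions of the survival and reporting parameters, and an optimizer would search for the parameter values that maximize this sum.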

11 Cell Probabilities
Cell probabilities (pij) are written in terms of:
  Sij – survival rates
  rij – reporting rates
Expected number of recoveries: E(mij) = cell probability (pij) × releases (Ri).
Use the expected values to assess goodness of fit.
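A short Python sketch of how cell probabilities and expected recoveries could be computed for a time-varying model. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
def cell_probabilities(S, r):
    """Return the upper-triangular matrix of cell probabilities p[i][j].

    S[k] and r[k] are the survival and reporting probabilities for year k
    (0-based). An animal released in year i and recovered in year j >= i
    survives years i..j-1, dies in year j, and is reported.
    """
    n = len(S)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        alive = 1.0  # probability of surviving from release to the start of year j
        for j in range(i, n):
            p[i][j] = alive * (1.0 - S[j]) * r[j]
            alive *= S[j]
    return p


def expected_recoveries(R, p):
    """E(m_ij) = R_i * p_ij, with the expected number never recovered appended."""
    expected = []
    for R_i, row in zip(R, p):
        cells = [R_i * p_ij for p_ij in row]
        expected.append(cells + [R_i - sum(cells)])
    return expected


# Hypothetical values, for illustration only.
S = [0.6, 0.7, 0.65]
r = [0.10, 0.12, 0.08]
R = [1000, 1200, 900]

p = cell_probabilities(S, r)
E = expected_recoveries(R, p)
for row in E:
    print([round(x, 2) for x in row])
```

Comparing these expected counts with the observed m-array is the basis of the goodness-of-fit assessment.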

12 Saturated (General) Model
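As many parameters as cells.
Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S2)r2, S1S2(1-S3)r3, S1S2S3(1-S4)r4, S1S2S3S4(1-S5)r5; not recovered: 1 - Σj p1j
  Cohort 2 (occasions 2-5): (1-S6)r6, S6(1-S7)r7, S6S7(1-S8)r8, S6S7S8(1-S9)r9; not recovered: 1 - Σj p2j
  Cohort 3 (occasions 3-5): (1-S10)r10, S10(1-S11)r11, S10S11(1-S12)r12; not recovered: 1 - Σj p3j
  Cohort 4 (occasions 4-5): (1-S13)r13, S13(1-S14)r14; not recovered: 1 - Σj p4j
  Cohort 5 (occasion 5): (1-S15)r15; not recovered: 1 - Σj p5j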

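13 Saturated Model
As many parameters as cells.
Expected tag recoveries, E(mij); rows are release cohorts, columns are recovery occasions:
  R1 released (occasions 1-5): R1(1-S1)r1, R1S1(1-S2)r2, R1S1S2(1-S3)r3, R1S1S2S3(1-S4)r4, R1S1S2S3S4(1-S5)r5; not recovered: R1 - Σj E(m1j)
  R2 released (occasions 2-5): R2(1-S6)r6, R2S6(1-S7)r7, R2S6S7(1-S8)r8, R2S6S7S8(1-S9)r9; not recovered: R2 - Σj E(m2j)
  R3 released (occasions 3-5): R3(1-S10)r10, R3S10(1-S11)r11, R3S10S11(1-S12)r12; not recovered: R3 - Σj E(m3j)
  R4 released (occasions 4-5): R4(1-S13)r13, R4S13(1-S14)r14; not recovered: R4 - Σj E(m4j)
  R5 released (occasion 5): R5(1-S15)r15; not recovered: R5 - Σj E(m5j)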

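14 Model {St rt}
The subscript t indicates that parameters vary among years.
Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S2)r2, S1S2(1-S3)r3, S1S2S3(1-S4)r4, S1S2S3S4(1-S5)r5; not recovered: 1 - Σj p1j
  Cohort 2 (occasions 2-5): (1-S2)r2, S2(1-S3)r3, S2S3(1-S4)r4, S2S3S4(1-S5)r5; not recovered: 1 - Σj p2j
  Cohort 3 (occasions 3-5): (1-S3)r3, S3(1-S4)r4, S3S4(1-S5)r5; not recovered: 1 - Σj p3j
  Cohort 4 (occasions 4-5): (1-S4)r4, S4(1-S5)r5; not recovered: 1 - Σj p4j
  Cohort 5 (occasion 5): (1-S5)r5; not recovered: 1 - Σj p5j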

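15 Model {St rt}
The subscript t indicates that parameters vary among years.
Expected tag recoveries, E(mij); rows are release cohorts, columns are recovery occasions:
  R1 released (occasions 1-5): R1(1-S1)r1, R1S1(1-S2)r2, R1S1S2(1-S3)r3, R1S1S2S3(1-S4)r4, R1S1S2S3S4(1-S5)r5; not recovered: R1 - Σj E(m1j)
  R2 released (occasions 2-5): R2(1-S2)r2, R2S2(1-S3)r3, R2S2S3(1-S4)r4, R2S2S3S4(1-S5)r5; not recovered: R2 - Σj E(m2j)
  R3 released (occasions 3-5): R3(1-S3)r3, R3S3(1-S4)r4, R3S3S4(1-S5)r5; not recovered: R3 - Σj E(m3j)
  R4 released (occasions 4-5): R4(1-S4)r4, R4S4(1-S5)r5; not recovered: R4 - Σj E(m4j)
  R5 released (occasion 5): R5(1-S5)r5; not recovered: R5 - Σj E(m5j)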

16 Model {S. rt} Survival constant; recoveries vary among years
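Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S1)r2, S1S1(1-S1)r3, S1S1S1(1-S1)r4, S1S1S1S1(1-S1)r5; not recovered: 1 - Σj p1j
  Cohort 2 (occasions 2-5): (1-S1)r2, S1(1-S1)r3, S1S1(1-S1)r4, S1S1S1(1-S1)r5; not recovered: 1 - Σj p2j
  Cohort 3 (occasions 3-5): (1-S1)r3, S1(1-S1)r4, S1S1(1-S1)r5; not recovered: 1 - Σj p3j
  Cohort 4 (occasions 4-5): (1-S1)r4, S1(1-S1)r5; not recovered: 1 - Σj p4j
  Cohort 5 (occasion 5): (1-S1)r5; not recovered: 1 - Σj p5j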

17 Model {S. rt} Survival constant; recoveries vary among years
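Expected tag recoveries, E(mij); rows are release cohorts, columns are recovery occasions:
  R1 released (occasions 1-5): R1(1-S1)r1, R1S1(1-S1)r2, R1S1S1(1-S1)r3, R1S1S1S1(1-S1)r4, R1S1S1S1S1(1-S1)r5; not recovered: R1 - Σj E(m1j)
  R2 released (occasions 2-5): R2(1-S1)r2, R2S1(1-S1)r3, R2S1S1(1-S1)r4, R2S1S1S1(1-S1)r5; not recovered: R2 - Σj E(m2j)
  R3 released (occasions 3-5): R3(1-S1)r3, R3S1(1-S1)r4, R3S1S1(1-S1)r5; not recovered: R3 - Σj E(m3j)
  R4 released (occasions 4-5): R4(1-S1)r4, R4S1(1-S1)r5; not recovered: R4 - Σj E(m4j)
  R5 released (occasion 5): R5(1-S1)r5; not recovered: R5 - Σj E(m5j)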

18 Model {S. r.} survival and recovery constant
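Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S1)r1, S1S1(1-S1)r1, S1S1S1(1-S1)r1, S1S1S1S1(1-S1)r1; not recovered: 1 - Σj p1j
  Cohort 2 (occasions 2-5): (1-S1)r1, S1(1-S1)r1, S1S1(1-S1)r1, S1S1S1(1-S1)r1; not recovered: 1 - Σj p2j
  Cohort 3 (occasions 3-5): (1-S1)r1, S1(1-S1)r1, S1S1(1-S1)r1; not recovered: 1 - Σj p3j
  Cohort 4 (occasions 4-5): (1-S1)r1, S1(1-S1)r1; not recovered: 1 - Σj p4j
  Cohort 5 (occasion 5): (1-S1)r1; not recovered: 1 - Σj p5j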

19 Model {S. r.} survival and recovery constant
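Expected tag recoveries, E(mij); rows are release cohorts, columns are recovery occasions:
  R1 released (occasions 1-5): R1(1-S1)r1, R1S1(1-S1)r1, R1S1S1(1-S1)r1, R1S1S1S1(1-S1)r1, R1S1S1S1S1(1-S1)r1; not recovered: R1 - Σj E(m1j)
  R2 released (occasions 2-5): R2(1-S1)r1, R2S1(1-S1)r1, R2S1S1(1-S1)r1, R2S1S1S1(1-S1)r1; not recovered: R2 - Σj E(m2j)
  R3 released (occasions 3-5): R3(1-S1)r1, R3S1(1-S1)r1, R3S1S1(1-S1)r1; not recovered: R3 - Σj E(m3j)
  R4 released (occasions 4-5): R4(1-S1)r1, R4S1(1-S1)r1; not recovered: R4 - Σj E(m4j)
  R5 released (occasion 5): R5(1-S1)r1; not recovered: R5 - Σj E(m5j)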

20 Model {S. r3} survival constant and recovery different in year 3
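Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S1)r1, S1S1(1-S1)r3, S1S1S1(1-S1)r1, S1S1S1S1(1-S1)r1; not recovered: 1 - Σj p1j
  Cohort 2 (occasions 2-5): (1-S1)r1, S1(1-S1)r3, S1S1(1-S1)r1, S1S1S1(1-S1)r1; not recovered: 1 - Σj p2j
  Cohort 3 (occasions 3-5): (1-S1)r3, S1(1-S1)r1, S1S1(1-S1)r1; not recovered: 1 - Σj p3j
  Cohort 4 (occasions 4-5): (1-S1)r1, S1(1-S1)r1; not recovered: 1 - Σj p4j
  Cohort 5 (occasion 5): (1-S1)r1; not recovered: 1 - Σj p5j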

21 PIM – Parameter Index Matrix
Used to specify the parameters Si and ri (separate PIMs for the Si and the ri).
Like m-arrays: columns = occasions, rows = cohorts.
Each estimated parameter gets an index number.
Parameters that are constrained to be equal get the same index number.
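The indexing rule can be sketched in Python. The helper below is hypothetical (it is not part of Program MARK) and only shows how constant, time-varying, and fully saturated structures translate into index numbers.

```python
def pim(n_occasions, kind, start=1):
    """Build an upper-triangular PIM as a list of rows (one row per cohort).

    kind = "constant": one index shared by every cell (e.g. S. or r.)
    kind = "time":     one index per occasion/column (e.g. St or rt)
    kind = "full":     a separate index for every cell (saturated model)
    Row i holds the cells for occasions i..n_occasions.
    """
    rows = []
    next_index = start
    for i in range(n_occasions):
        width = n_occasions - i
        if kind == "constant":
            row = [start] * width
        elif kind == "time":
            row = [start + j for j in range(i, n_occasions)]
        elif kind == "full":
            row = list(range(next_index, next_index + width))
            next_index += width
        else:
            raise ValueError(f"unknown kind: {kind}")
        rows.append(row)
    return rows


# Time-varying survival (indices 1-5) with one shared reporting index (6).
survival_pim = pim(5, "time", start=1)
reporting_pim = pim(5, "constant", start=6)
for row in survival_pim:
    print(row)
```

Cells that share an index number are forced to share one estimate, which is how a constrained model is expressed without rewriting the cell probabilities.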

22 Model: {S.r.}
Cell probabilities for tag recovery (pij), occasions 1-5:
  (1-S1)r1, S1(1-S1)r1, S1S1(1-S1)r1, S1S1S1(1-S1)r1, S1S1S1S1(1-S1)r1
PIM indices: survival probabilities = 1; reporting probabilities = 2

23 Model: {S.r.}

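24 Model: {St r.}
Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1 (occasions 1-5): (1-S1)r1, S1(1-S2)r1, S1S2(1-S3)r1, S1S2S3(1-S4)r1, S1S2S3S4(1-S5)r1
  Cohort 2 (occasions 2-5): (1-S2)r1, S2(1-S3)r1, S2S3(1-S4)r1, S2S3S4(1-S5)r1
  Cohort 3 (occasions 3-5): (1-S3)r1, S3(1-S4)r1, S3S4(1-S5)r1
  Cohort 4 (occasions 4-5): (1-S4)r1, S4(1-S5)r1
  Cohort 5 (occasion 5): (1-S5)r1
PIM indices: survival probabilities = 1 2 3 4 5 (one per occasion); reporting probabilities = 6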

25 Model: {St r.}

26 Age-specific Model: {Sj Ss Sa r}
Animals are marked as young.
Different survival probabilities for three age classes (juvenile, subadult, and adult).
Single recovery probability.

27 Model: {Sj Ss Sa r}
Cell probabilities for tag recovery (pij) by occasion:
  (1-Sj)r, Sj(1-Ss)r, SjSs(1-Sa)r, SjSsSa(1-Sa)r, SjSsSaSa(1-Sa)r
PIM indices: survival probabilities = 1 2 3 (Sj, Ss, Sa); reporting probability = 4

28 Saturated (most general) model
Cell probabilities for tag recovery (pij); rows are release cohorts, columns are recovery occasions:
  Cohort 1: (1-S1)r1, S1(1-S2)r2, S1S2(1-S3)r3, S1S2S3(1-S4)r4, S1S2S3S4(1-S5)r5
  Cohort 2: (1-S6)r6, S6(1-S7)r7, S6S7(1-S8)r8, S6S7S8(1-S9)r9
  Cohort 3: (1-S10)r10, S10(1-S11)r11, S10S11(1-S12)r12
  Cohort 4: (1-S13)r13, S13(1-S14)r14
  Cohort 5: (1-S15)r15
PIM indices (one row per cohort):
  Survival probabilities    Reporting probabilities
  1  2  3  4  5             16 17 18 19 20
  6  7  8  9                21 22 23 24
  10 11 12                  25 26 27
  13 14                     28 29
  15                        30

29 Model: {Sj Ss Sa r}

30 Multiple Groups
Parallel studies are conducted on (1) males and females, (2) different study areas, or (3) animals subjected to different treatments.
Example recovery data for MALES and FEMALES (Ri, then recoveries on occasions j = 1-5):
2583 91 89 24 18 16 1478 40 31 8 11 3075 141 45 52 50 1525 72 20 15 7 1195 27 21 319 3418 156 92 1805 63 3100 113 1400 39

31 Multiple Groups
Much larger array of models.
Extremes:
  Model {S. r.} - both S and r are constant
  Model {Sg*t, rg*t} - both S and r are group- and time-specific
IMPORTANT CONCEPT! Develop a suitable family of models a priori.
DO NOT RUN ALL POSSIBLE MODELS. Doing so invites:
  data-dredging
  "over-fitting"
  "detection" of spurious effects

32 Multiple Groups

33 Multiple Groups

34 Unequal time intervals
CMR analysis handles data in much the same way we use apparent nest success to calculate daily survival rates (DSRs).
Specify the relative length of the intervals; interval survival rates are then calculated.
Cell probabilities (occasions 1, 2, 3-4, 5):
  Cohort 1: (1-S1)r, S1(1-S2)r, S1S2S3(1-S4)r, S1S2S3S4(1-S5)r
  Cohort 2: (1-S2)r, S2S3(1-S4)r, S2S3S4(1-S5)r
  Cohort 3: (1-S4)r, S4(1-S5)r
  Cohort 4: (1-S5)r
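A minimal Python sketch of the interval idea, assuming that survival over an interval of relative length t equals the per-unit survival rate raised to the power t (the same logic as raising a daily survival rate to the number of days). The function name and values are hypothetical.

```python
def interval_survival(S_unit, lengths):
    """Survival over each interval, given a per-unit-time survival rate.

    lengths gives the relative length of each interval; an interval twice
    as long has survival S_unit ** 2.
    """
    return [S_unit ** t for t in lengths]


# Hypothetical: occasions 1, 2, 3-4, 5 give intervals of length 1, 2, 1.
rates = interval_survival(0.9, [1, 2, 1])
print([round(x, 4) for x in rates])
```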

35 Identifiability
Some parameters cannot be "identified," i.e., a unique solution cannot be determined.
Analogous to attempting to solve for a regression model with only one observation (n = 1).
Structural identifiability: parameters must appear separately somewhere in the likelihood.

36 Identifiability
Problems arise in models where r varies by year or age, unless more than one cohort is marked.
Model {St rt}: lack of identifiability in the last term in each row of the m-array. S5 and r5 do not appear separately anywhere in the matrix and cannot be estimated independently; only the product (1-S5)r5 can be estimated.
  Cohort 1: (1-S1)r1, S1(1-S2)r2, S1S2(1-S3)r3, S1S2S3(1-S4)r4, S1S2S3S4(1-S5)r5
  Cohort 2: (1-S2)r2, S2(1-S3)r3, S2S3(1-S4)r4, S2S3S4(1-S5)r5
  Cohort 3: (1-S3)r3, S3(1-S4)r4, S3S4(1-S5)r5
  Cohort 4: (1-S4)r4, S4(1-S5)r5
  Cohort 5: (1-S5)r5
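The confounding of the final survival and reporting parameters can be checked numerically. The Python sketch below uses hypothetical parameter values (four occasions for brevity) and shows that two different pairs with the same product (1 - S)r give identical cell probabilities, so the data cannot tell them apart.

```python
def cell_probabilities(S, r):
    """Cell probabilities for a time-varying tag recovery model, one row per cohort."""
    n = len(S)
    p = []
    for i in range(n):
        alive, row = 1.0, []
        for j in range(i, n):
            row.append(alive * (1.0 - S[j]) * r[j])
            alive *= S[j]
        p.append(row)
    return p


# Two parameter sets that differ only in the final S and r,
# chosen so that (1 - S_last) * r_last = 0.10 in both.
S_a = [0.6, 0.7, 0.65, 0.5]
r_a = [0.10, 0.12, 0.08, 0.20]
S_b = [0.6, 0.7, 0.65, 0.8]
r_b = [0.10, 0.12, 0.08, 0.50]

p_a = cell_probabilities(S_a, r_a)
p_b = cell_probabilities(S_b, r_b)
same = all(
    abs(x - y) < 1e-12
    for row_a, row_b in zip(p_a, p_b)
    for x, y in zip(row_a, row_b)
)
print(same)
```

Because every cell probability matches, the likelihood is identical for both parameter sets, which is exactly what a lack of identifiability means.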

37 Identifiability – other problems
K (the number of estimable parameters) cannot be determined by numerical methods when:
  Parameter estimates = 1 or 0 (the boundaries)
  Data are sparse (i.e., empty or 0 cells in the m-array)

38 Mark-recapture or Cormack-Jolly-Seber models
Lecture 7 - Part II

39 Readings:
Pollock, K. H., J. D. Nichols, C. Brownie, and J. E. Hines. 1990. Statistical inference for capture-recapture experiments. Wildlife Monographs 107.
Seber, G. A. F. The estimation of animal abundance and related parameters, 2nd ed. Macmillan, New York, NY.
Lebreton, J. D., K. P. Burnham, J. Clobert, and D. R. Anderson. 1992. Modeling survival and testing biological hypotheses using marked animals: a unified approach with case studies. Ecological Monographs 62(1).

40 Cormack-Jolly-Seber models
Open population models
Similar to the band/tag recovery models
Animals are captured, given a unique mark, and released
Subsequently, marked and unmarked animals are captured:
  Marked animals are recorded and released
  Unmarked animals are marked and released
Accidental deaths on capture are allowed and recorded
Live recaptures (CJS)

41 Cormack-Jolly-Seber model (CJS)
Cormack-Jolly-Seber backstory
General model allows year-specific estimates of:
  Apparent survival (Phi, Φ)
  Capture probability (p)
Can model covariates of both
Apparent survival = 1 – (mortality + emigration)
Apparent survival < the true survival rate
Studies are restricted to a specific locality
MARK example data set (dipper.dbf)

42 Assumptions
Tagged animals are representative of the population of interest.
Numbers of releases are known exactly.
There is no tag loss, no misreading of tags, no data entry errors, etc.
Captures, releases and recaptures occur over a brief time period.
The fates of individuals are independent (no over-dispersion).
Animals in an identifiable group have the same survival and recapture probabilities.
Parameter estimates are based on the correct model.

43 Constants
Ri – The number of animals released in year i is known and includes the number of previously marked animals recaptured and re-released in year i.

44 Random variables
mij – As before, these are the data: the matrix of first recaptures of individuals released in year i and recaptured in year j.
Recaptures can be live encounters of any type, including recapture, re-sighting, or otherwise detecting the presence and status of marked individuals.

45 Parameters
Φj – Conditional probability of apparent survival in interval j, given alive at the beginning of interval j
pj – Conditional probability of capture or recapture at time j, given alive at the beginning of interval j
K – The number of estimable parameters in the model

46 Capture histories Data summarized in a capture-history or encounter-history matrix. Typical encounter histories for 5 occasions might look like:

47 m-arrays
Slightly different: the Ri include re-releases
mij include only the first recapture of each individual after its release at i
Each individual appears only once in each row
For example, consider the following m-array for a simulated study with 2000 new animals released on each occasion:
  i   Ri     j=2   j=3   j=4   j=5
  1   2000    30    70   114    43
  2   2030          80    97    55
  3   2150               167    46
  4   2378                      72

48 Expectations
Cell expectations are different.
For example, the capture history {10101}: released at time 1 and encountered again at 3 and 5.
The relationship between the encounter history and the estimable parameters gives the probability of observing this encounter history:
  Φ1(1-p2)Φ2p3Φ3(1-p4)Φ4p5

49 Complications
Not encountered on the last occasion: the last term includes the probability of having died in the final interval.
  P{11110} = Φ1p2Φ2p3Φ3p4[Φ4(1-p5) + (1-Φ4)]
When used to develop the expected values for the m-array, we see additional differences.
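A Python sketch of how the probability of any encounter history could be computed under the time-varying CJS model, including the "never seen again" term needed when a history ends in zeros. The function name and the parameter values are hypothetical.

```python
def history_probability(history, phi, p):
    """Probability of a CJS encounter history, conditional on first release.

    history: string of 0/1 characters, one per occasion.
    phi[k]:  apparent survival from occasion k+1 to k+2 (phi[0] is Phi1).
    p[k]:    capture probability at occasion k+1 (p[0] is never used,
             because the model conditions on the first capture).
    """
    n = len(history)
    first = history.index("1")
    last = history.rindex("1")

    prob = 1.0
    # Between first and last capture the animal is known to be alive.
    for k in range(first, last):
        prob *= phi[k]
        prob *= p[k + 1] if history[k + 1] == "1" else (1.0 - p[k + 1])

    # chi: probability of never being seen again after the last capture
    # (died, or survived but was missed on every later occasion).
    chi = 1.0
    for k in range(n - 2, last - 1, -1):
        chi = (1.0 - phi[k]) + phi[k] * (1.0 - p[k + 1]) * chi
    return prob * chi


# Hypothetical parameter values for a 5-occasion study.
phi = [0.8, 0.7, 0.6, 0.5]
p = [None, 0.4, 0.5, 0.6, 0.3]

print(history_probability("10101", phi, p))
print(history_probability("11110", phi, p))
```

For a history ending in a capture the extra term is 1, so the result reduces to the simple product of survival and capture terms.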

50 Expectations
Capture histories can be used to construct the m-array; the m-array cannot be used to reconstruct the capture-history matrix.
  i   Ri   j=2   j=3   j=4   j=5
  1   R1   m12   m13   m14   m15
  2   R2         m23   m24   m25
  3   R3               m34   m35
  4   R4                     m45
m23 includes capture histories {111...} and {011...}: E(m23) = R2Φ2p3
m24 includes capture histories {11010} and {01010}: E(m24) = R2Φ2(1-p3)Φ3p4
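Building an m-array from capture histories can be sketched in Python. The sketch assumes every captured animal is re-released (no losses on capture); the function name and the five example histories are hypothetical.

```python
def m_array(histories):
    """Build R (releases) and m (first recaptures) from encounter histories.

    histories: list of 0/1 strings, one per animal. Every capture except one
    on the final occasion counts as a release at that occasion; the next
    capture after it, if any, adds one to m[i][j] (0-based occasions).
    """
    n = len(histories[0])
    R = [0] * (n - 1)
    m = [[0] * n for _ in range(n - 1)]
    for h in histories:
        seen = [k for k, c in enumerate(h) if c == "1"]
        for idx, i in enumerate(seen):
            if i == n - 1:
                continue  # captures on the last occasion are not releases
            R[i] += 1
            if idx + 1 < len(seen):
                m[i][seen[idx + 1]] += 1
    return R, m


histories = ["11010", "01010", "10101", "11110", "01100"]
R, m = m_array(histories)
print(R)
for row in m:
    print(row)
```

Here m[1][3] corresponds to m24 in the slide's 1-based notation, and both {11010} and {01010} contribute to it. The reverse step is impossible because the m-array does not record which animals were released at each occasion.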

51 Model specification
The basic CJS model is model {Φt pt} (similar to {St rt}).
Models may be constructed using PIMs and the design matrix.
Likelihoods are based on the multinomial distributions of cohorts of (re-)releases.

52 Jolly-Seber models
N and B are estimable, but subject to substantial bias from:
  Heterogeneity in capture probability
    Nichols, J. D., Hines, J. E., & Pollock, K. H. (1984). Effects of permanent trap response in capture probability on Jolly-Seber capture-recapture model estimates. The Journal of Wildlife Management, 48(1).
  Tag loss (need an independent estimate)
    Arnason, A. N., & Mills, K. H. (1981). Bias and loss of precision due to tag loss in Jolly-Seber estimates for mark-recapture experiments. Canadian Journal of Fisheries and Aquatic Sciences, 38(9).

53 POPAN
Stand-alone version; also an estimator in MARK
Schwarz, C. J., & Arnason, A. N. (1996). A general methodology for the analysis of capture-recapture experiments in open populations. Biometrics.
Robust re-parameterization of Jolly-Seber
Parameters estimated:
  Apparent survival (Phi, Φ)
  Capture probability (p)
  Entry probability (pent): immigration or births
  Births (Bi)
  Super-population size (N)
  Population size (Ni)

54 Goodness of fit Lecture 07 – Part III

55 Resources:
Cooch and White, Chapter 5.
Pollock, K. H., Nichols, J. D., Brownie, C., & Hines, J. E. (1990). Statistical inference for capture-recapture experiments. Wildlife Monographs, 107, 1-97.

56 Assumptions of open and closed models
Categorized and summarized by Burnham et al. (1987:51-54).
Study planning, field procedures, and generality of inference:
  Marked animals are representative of the population of interest
  Test conditions are representative
  Treatment and control animals are biologically identical; initial handling, marking, and holding do not affect survival rate
  Numbers of animals released are known exactly
  Marking is accurate; there are no mark losses and no misread marks
  Releases and recaptures occur in brief time intervals; recaptured individuals are released immediately

57 Assumptions
Stochastic components of the model:
  The fate of each individual is independent of the fate of any other individual
  With multiple lots (or other replication), the data are statistically independent over lots
Model structure:
  Statistical analyses of the data are based on the correct model
  Treatment and control fish move downstream together
  Capture and re-release do not affect subsequent survival or recapture
  All individuals in the study of an identifiable class have the same survival and capture probabilities

58 Goodness of fit (GOF) Why do I care?
Underlying assumption: the data fit the most general model.
If they do not, fit (i.e., the likelihood) is over-estimated (same effect as underestimating variance).

59 GOF testing Form of contingency table analysis
Example from genetics labs
Do the frequencies of individuals exhibiting particular encounter histories match those expected under a given general model?
Under many circumstances the "general model" is {Φt pt} or {St rt}.

60 QAICc
Remember the limited discussion of ĉ when we covered QAICc?
Note that the model likelihood term (-2 ln L) is divided by ĉ.
If ĉ > 1, then the contribution to the QAICc value from the model likelihood will decline, and the relative penalty for parameters will increase.
As ĉ increases, QAICc increasingly favors models with fewer parameters.
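The effect of ĉ on model ranking can be illustrated with a short Python sketch. It assumes the usual form QAICc = (-2 ln L)/ĉ + 2K + 2K(K+1)/(n-K-1); the function name and the two example models are hypothetical.

```python
def qaicc(neg2_log_lik, K, n, c_hat=1.0):
    """QAICc = (-2 ln L) / c_hat + 2K + 2K(K + 1) / (n - K - 1).

    neg2_log_lik: -2 times the maximized log-likelihood
    K: number of estimated parameters
    n: effective sample size
    """
    if n - K - 1 <= 0:
        raise ValueError("effective sample size n must exceed K + 1")
    return neg2_log_lik / c_hat + 2 * K + 2 * K * (K + 1) / (n - K - 1)


# Hypothetical models: a simple one (K = 3) and a complex one (K = 10).
n = 300
for c_hat in (1.0, 3.0):
    simple = qaicc(520.0, 3, n, c_hat)
    complex_ = qaicc(500.0, 10, n, c_hat)
    best = "simple" if simple < complex_ else "complex"
    print(c_hat, round(simple, 2), round(complex_, 2), best)
```

With these made-up numbers the complex model has the lower value when ĉ = 1, but the simple model is preferred once ĉ = 3, which is the shift toward fewer parameters described above.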

61 Estimating ĉ – 4 methods
1. Model deviance divided by model degrees of freedom
     Tends to overestimate ĉ
2. RELEASE GOF χ² statistic divided by the model df (less subject to bias)
     NOT available for all models (or sparse data sets)
     Determines where lack of fit (LOF) occurs
3. Bootstrap approach – theoretically robust; only an estimate
4. Median ĉ – also an estimate; best alternative to date
Example output for model {S(g*t) r(g*t) PIM}:
  Number of Estimated Parameters = 30
  DEVIANCE =
  DEVIANCE Degrees of Freedom = 39
  ĉ =
  AIC =
  AICc =

62 Pearson GOF 2 Program RELEASE
Stand-alone program for survival analysis, but only runs certain specialized models for experimental projects. Includes GOF tests Release GOF also runs within Program MARK (Tests/Release GOF).

63 RELEASE GOF - 3 standard tests
TEST 1 – omnibus test for the comparison of groups. NOT used for GOF. Possible to do much more sophisticated tests in MARK.
TEST 2 and TEST 3 – assumptions:
  All marked animals in the population have the same chance of being captured at any time.
  Among the marked individuals in the population, all animals have the same probability of surviving, regardless of when they were marked.

64 RELEASE from within MARK
Simply pull down the ‘Tests’ menu, and select ‘Program RELEASE GOF’ (only available with the ‘Recaptures’ data type).
Results will be output into a Notepad window.
Output:
  Information concerning recent updates to RELEASE, program limits
  Listing of capture histories, summary tabulation as an m-array
  TEST 3 and TEST 2 results for each group (respectively)
  Summary statistics – overall GOF (TEST 2 + TEST 3)

65 Bootstrap GOF in MARK
Provides a GOF diagnostic
Estimates the magnitude of the lack of fit (ĉ)
Probability distribution

66 Bootstrap GOF in MARK - procedure
1. Uses the parameter estimates (survival, recapture, recovery rate...) from the fitted model.
2. Simulated data are generated via Monte Carlo methods. The simulated data exactly meet the model assumptions:
     no over-dispersion
     animals are totally independent
     no violations of model assumptions
3. The simulated data are analyzed (the model is fit). Output: model deviance or deviance ĉ. If the model fits perfectly, ĉ = 1.
4. Steps 2 and 3 are repeated for the number of simulations requested.

67 Bootstrap GOF in MARK - procedure
Compare the deviance of the original model to the distribution of the bootstrapped deviances of the simulated data.
The proportion of bootstrapped deviances larger than the original deviance is interpreted as the probability of a deviance larger than the observed deviance.
If that proportion is small, the observed data do NOT fit the model.
Do the same with the ĉ values.

68 Simulations – a 2-stage process.
Run 100 simulations, and compare the observed deviance to the distribution of the 100 values.
If the probability of the observed deviance is > 0.2, then more simulations are unlikely to change the results. Conclude there is adequate model fit.
If the P-value is < 0.2, then more simulations are required to accurately determine whether model fit is adequate.

69 Sort simulation results

70 Determine P(obs)

71

72 Estimating ĉ
For the approach based on deviance:
  ĉ = observed deviance / expected deviance
where the observed deviance is the original deviance calculated for the model, and the expected deviance is the mean of the bootstrapped values. This is a measure of the amount of over-dispersion in the original data.

73 Estimating ĉ
Alternatively, for the approach based on ĉ:
  ĉ = observed ĉ / expected ĉ
where the observed ĉ is the value calculated for the model, and the expected ĉ is the mean of the bootstrapped values.
Be conservative: use the higher of the two values (assume worse fit) to calculate QAICc.
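The bootstrap calculations can be sketched in Python: the proportion of simulated values larger than the observed one, the observed-over-expected ratio for each measure, and the conservative choice between the two. Function names and all numbers are hypothetical; a real run would use the simulation output from MARK.

```python
def exceedance_proportion(observed, simulated):
    """Proportion of simulated values larger than the observed value."""
    return sum(value > observed for value in simulated) / len(simulated)


def observed_over_expected(observed, simulated):
    """Observed value divided by the mean of the simulated values."""
    return observed / (sum(simulated) / len(simulated))


# Hypothetical numbers, for illustration only.
observed_deviance = 90.0
simulated_deviances = [60.0, 70.0, 80.0, 100.0]
observed_c_hat = 1.5
simulated_c_hats = [1.0, 1.1, 1.3, 1.6]

p_obs = exceedance_proportion(observed_deviance, simulated_deviances)
c_hat_deviance = observed_over_expected(observed_deviance, simulated_deviances)
c_hat_ratio = observed_over_expected(observed_c_hat, simulated_c_hats)
c_hat = max(c_hat_deviance, c_hat_ratio)  # conservative: assume the worse fit

print(p_obs, round(c_hat_deviance, 3), round(c_hat_ratio, 3), round(c_hat, 3))
```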

74 Median ĉ - PREFERRED METHOD
Also based on simulating data
Also a 2-stage process
Estimates fit over a range of deviance ĉ values
Logistic regression is then used to estimate the median of the distribution
Inputs:
  Range of values to simulate (lower and upper bounds)
  Total number of points (values of ĉ in the simulation, based on these bounds)
  Number of simulations for each of the specified values

75 Median ĉ - PREFERRED METHOD
Stage 1 (to obtain an initial estimate): Use a small set of values over a wide range to generate deviances. MARK uses logistic regression analysis to estimate the median

76 Median ĉ - PREFERRED METHOD
Stage 2 (to improve precision): Repeat for a smaller range of ĉ values Result is estimated value and SE

77 Median ĉ
Median ĉ is biased high.
Smaller standard deviation than the ĉ estimated by RELEASE.
Median ĉ is closer to truth than the RELEASE ĉ.

78 When is ĉ too large?
If the model fits perfectly, then ĉ = 1.
What if ĉ = 2, or ĉ = 10?
"Rule of thumb": ĉ < 3.0 is OK (but see Lebreton et al. 1992).

79 What to do?
Large ĉ (> 2) warrants careful examination of the model structure.
If the problem is due to structural problems in the general model:
  Examine TEST 2 and TEST 3 results.
  Consider a different general model.
If it is not a structural problem:
  OK to adjust for lack of fit simply by using ĉ.
  Make the parameter estimates as robust and valid as possible.
If the model structure is correct and ĉ is >> 3, the data may not be adequate for analysis.

