Download presentation
Presentation is loading. Please wait.
1
Wildlife Population Analysis
Lecture 03 - Kaplan-Meier Procedure and Known Fate Analyses
2
Motivation Describe the Kaplan Meier procedure for estimating survival rates Describe an analogous method based on the likelihood for examining factors related to survival
3
Kaplan-Meier or Product Limit Procedure
Kaplan-Meier procedure developed in the late 50's for use in medicine and engineering First suggested for wildlife survival studies in mid- to late-80's by Pollock (1984) and Pollock et. al (1989). Early approaches assumed that survival of every animal was independent (i.e., constant survival probability over all animals and periods)
4
Kaplan-Meier or Product Limit Procedure
Used when relocation is certain or unrelated to mortality Radio-tagged animals Plants or sessile creatures It is a nonparametric approach that does not assume any distribution; e.g., the exponential or weibull
5
Terms and definitions Interval survival rates – the proportion of animals surviving some discrete period of time Product-limit – refers to the fact that survival rate during any interval can be calculated as the product of the survival rates during shorter intermediate intervals. Thus the survival rate for the interval can never be greater than the lowest estimate of survival during any intermediate period. Survival function – equation describing the probability of an animal in a population surviving t units of time from the beginning of a study.
6
Terms and definitions At-risk – having some probability of detectable mortality. Thus, in radio-telemetry studies, transmitter failure or emigration beyond the study area boundaries is grounds for removal Right censor – to remove an animal from the at-risk group for some reason unrelated to mortality Staggered Entry — the addition of additional subjects to the at-risk group during the course of the study.
7
Study design Individuals marked or uniquely identifiable radio tags sessile individuals Subjects can be relocated without failure All individuals relocated at the end of each survival interval (e.g., at each aj)
8
Identification of time origin is crucial
Study design Identification of time origin is crucial in medical and engineering studies – treatment time in wildlife studies for example studies in winter may have a different survival rate that studies in summer; survival rates may vary greatly after the start of a hunting season or other event
9
Assumptions Animals in the population of interest have been sampled randomly – sex, age, location, etc. Survival times are independent for different animals Time of death is known exactly Capturing, marking, and observing does not influence survival (but may condition to eliminate potential effects) Censoring is random (i.e., unrelated to fate – either survival or mortality) May be useful to assume that marker failure is zero (e.g. studies of emigration time) Survival probabilities are same within groups
10
Staggered entry Assumes that newly tagged animals have the same survival function as those previously tagged Alternatively condition data Model the assumption
11
Basic K-M calculations
a1, a2, … ag – discrete time points when deaths (failures) occur (alternatively use regular sampling periods e.g. days, weeks, etc.) r1, r2, … rg – number of animals at risk at these time points d1, d2, … dg – number of deaths at these time points
12
Basic K-M calculations
the probability of surviving the interval from ai-1 to ai and Proportion dying (mortality) the probability of surviving the interval from ai-1 to ai+1
13
Basic K-M calculations
the probability of surviving the interval from ai-1 to ai and Read “product”
14
Censoring and staggered entry
When individuals are right censored, the number in the at risk group ri is reduced at time ai. Likewise when individuals are entered into the at risk sample the value ri is increased at time ai. Alternatively, newly marked individuals can be "conditioned" to allow for marking or handling effects by delaying their addition to the at risk group for some predetermined time period.
15
Example K-M calculations
Week No. risk No. deaths Si Survival No. censored No. added 1 7 1.0000 2 6 5 3 11 0.9091 4 10 16 0.9375 0.8523 15 0.9333 0.7955 8 14 9 0.7857 0.6250 Total
16
Example K-M calculations
Table 3-2 from Pollock et al. 1989: Bobwhite survival in fall of 1986 Week No. risk No. deaths Survival No. censored No. added Var(S) LCL UCL 1 7 1.0000 2 6 5 3 11 0.9091 0.0068 0.8957 0.9225 4 10 0.0075 0.8944 0.9238 16 0.8523 0.0067 0.8391 0.8654 15 0.0072 0.8383 0.8663 0.7955 0.0086 0.7785 0.8124 8 14 0.0092 0.7773 0.8136 9 0.6250 0.0105 0.6045 0.6455 Total
17
Basic K-M calculations
Estimating standard deviation and SE 95% Confidence intervals or
18
K-M Example
19
Known fate models
20
Any good statistical text on logistic regression
Known Fate Analysis Resources: White et al.: Known Fate models Link function Any good statistical text on logistic regression Hosmer, D.W. and Lemeshow, S Applied logistic regression. 2nd Ed. John Wiley & Sons, Inc. New York.
21
Known Fate Models Based on binomial probability models i.e., survived or died Subjects can be relocated and fate determined without failure. Same assumptions as K-M survival estimator Much greater flexibility Allows covariates for modeling heterogeity via a link function Based on MLE model selection
22
Survival rate (probability)
Kaplan-Meier estimate: where di was the number dying and ri was the number at risk during i th interval of some longer time period t. Using slightly different symbology: where yi is the number surviving and ni is the number at risk
23
Survival rate Likelihood for this equation is: is the survival model for the m periods, ni is the number of individuals at risk during each period, yi is the number surviving each period Si is the MLE of survival during each period.
24
Survival rate Product of the probability of the individual observations: or the sum of the log-likelihoods where Sij is the probability of individual i surviving the jth interval.
25
Covariates Assumption is that heterogeneity can be modeled Survival probabilities are same within groups Frequently more interested in the differences among survival rates or the effects of environment, treatment, or condition of individuals. Factors that vary with parameters of interest (e.g., survival probability)
26
Covariates Can be discrete classes or continuous measurements Can be used to model group effects at the individual level Sex, location, time, length, color, cohort Can vary over time Temperature, weight, age, date
27
Link functions “Link” the covariates, the data (X), with the response variable (i.e., fate of the marked individual)
28
Link functions Recall that the likelihood for the n individuals over the m weeks in our known fate analysis was In order to estimate the survival rate as a function of some variable (e.g., age or weight) MARK uses a link function. Usually, this is done by way of the logit link: where x1i is the first covariate (data) in the ith interval.
29
Link functions Since the data in the link can be coded to represent time periods or other discrete groups (more on how we do this later), we can drop the product of the likelihoods and substituting the logit link into the likelihood: (ignoring the binomial coefficient), thus the log-likelihood is: where n is the number of individuals in the sample.
30
Link function More than one covariate can be included Extend the logit (linear equation). bs are the estimated parameters (effects); estimated for each period or group constrained to be equal using the data (xij).
31
Link function The survival rates or real parameters of interest are calculated from the s as in this equation. HUGE concept and applicable to EVERY estimator we examine. Survival probabilities are replaced by the link function submodel of the covariate(s). Conceivably every animal has a different survival probability that is related to the value of the covariates xij .
32
Example Fictitious critter All are captured and radio-marked at the same time, Mass determined at the time of capture Classified as belonging to two discrete groups (e.g., gender) At the end of the first time period (0), #5 is discovered to have died. By the end of the second period mouse #4 also is dead.
33
Mass at capture (g) (x3i)
Known fate example Covariates Obs Individual (i) Fate (Survived) (yi) Time (x1i) Group (x1i) Mass at capture (g) (x3i) 1 26 2 28 3 25 4 22 5 20 6 7 8 9
34
Model 1: all survival probabilities equal (null).
0 estimates the survival rate of all individuals.
35
Initial values In the table below the column labeled 0 is the logit for this model, S is from equation 1, and ln(L) is from equation 2. The ln(L) for the observations based on initial values for i = 0.5. Obs Individual Fate Time Group b0 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 0.5 1.25 -0.25 2 3 4 5 -1.50 6 7 8 9
36
Solution ln(L) is maximized when 0 = 1.25.
Note that because of the model specification the are equal within groups and time periods. Obs Individual Fate Time Group b0 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 1.25 0.778 -0.25 2 3 4 5 -1.50 6 7 8 9 S -4.77
37
Model 2: Survival rates vary between times
0 - survival rate of individuals in group 0 during time 0; 1 - difference in survival between time periods 0 and 1. Obs Individual (i) Fate (Survived) (yi) Time (x1i) 1 2 3 4 5 6 7 8 9 Thus, for individual 1 in time period 1: And, for individual 1 in time period 2:
38
Initial values 0+x1i1 is the linear equation for this model, S is from equation 1, and ln(L) is from equation 2. The ln(L) for the observations is based on initial values for i = 0.5, 2 = 0.5. Obs Individual Fate Time Group b0+x1ib1 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 0.50 0.622 -0.47 2 3 4 5 -0.97 6 1.00 0.731 -0.31 7 8 9 -1.31 S b1 0.5 b2
39
Solution MLE for 0= -1.39 and 1 = 0.29.
Note that because of the model specification the survival probabilities are equal for all individuals in each time period. Obs Individual Fate Time Group b0+x1ib1 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 1.39 0.800 -0.22 2 3 4 5 -1.61 6 1.10 0.750 -0.29 7 8 9 -1.39 S -4.75
40
Model 3: Survival varies by group and time.
0 equal the survival rate of individuals in group 0 during time 0 ignoring the effect of mass on survival. 1 equal the difference in survival between time periods 0 and 1; 2 equal the difference in survival between groups 0 and 1. Thus, it follows from Equation 1 that for group 1 at time 1:
41
Initial values In the table below the column labeled 0+x1i1+x2i2 is the linear equation for this model, is from equation 1, and ln(L) is from equation 2. The ln(L) for the observations is based on initial values i = 0.5. Obs Individual Fate Time Group b0+x1ib1+x2ib2 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 0.50 0.622 -0.47 2 3 1.00 0.731 -0.31 4 5 -1.31 6 7 8 1.50 0.818 -0.20 9 -1.70 S -5.42
42
Solution ln(L) is maximized when 0 = , 1 = , and 2 = Note that because of the model specification they are equal within groups and time periods. Obs Individual Fate Time Group b0+x1ib1+x2ib2 ln(L) (i) (Survived) (x1i) (x2i) (yi) 1 16.01 1.000 0.00 2 3 0.69 0.667 -0.41 4 5 0.333 -1.10 6 15.32 7 8 0.500 -0.69 9 S -3.30
43
Wildlife Population Analysis
Lecture Other link functions
44
Default in MARK with PIMs Constrained 0-1
Sine Default in MARK with PIMs Constrained 0-1
45
Default in MARK w/Design Matrix Constrained 0-1 Robust
Logit Default in MARK w/Design Matrix Constrained 0-1 Robust
46
May optimize better as S0
LogLog Constrained 0-1 May optimize better as S0
47
May optimize better as S1
Complimentary LogLog CLogLog in MARK Constrained 0-1 May optimize better as S1
48
Log NOT Constrained Covariates additive
49
Don’t confuse with identity matrix NOT Constrained
50
Remember Logit, sin, loglog, and complimentary loglog Constrain estimates to the interval [0,1]. The default in MARK is the sine function. Log and identity Do not constrain to the interval [0, 1], Useful when estimating things other than rates
51
Other More complex link functions are beyond the scope of this class see the program MARK documentation.
52
End of Lecture Lab 03 30-45 min – Introduction to Program Mark Kaplan-Meier & Known Fate Analyses
53
End of Lecture Material
54
Introduction to Program Mark
Download MARK and documentation: Cooch and White: Windows-based program Estimation of parameters based on re-encounters of marked individuals. Patch occupancy analysis Virtual population analysis MLE – by numerical techniques Model selection by information theoretic methods (AIC)
55
Model Specification
56
PIMs
57
Design Matrix Rows correspond to “real” parameters
Columns correspond to βs “estimated parameters”
58
Matrix terminology Rank – dimensions
Vector – matrix with either single column or single row For multiplication: do not have to be of the same rank; must have the same “inner” dimension; dimensions are specified as rows x columns. Thus a 4x3 matrix can be multiplied by a 3x1 matrix, But a 1x3 matrix cannot be multiplied by a 4x3 matrix The product of matrices has the dimension of “outer” dimensions of the multiplicands Thus a 4x3 x 3x1 yields a 4x1
59
The identity matrix. For any square matrix, the identity is a diagonal matrix of equal rank with all of the diagonal elements = 1.
60
Matrix multiplication
D – Design matrix Elements are data (covariates) B – Betas from logit Estimated parameters DxB = logits for all survival estimates in model
61
Matrix multiplication
Logits
62
Design Matrix
63
Link Functions
64
Link function
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.