1
STT520-420: Biostatistics Analysis Dr. Cuixian Chen
Chapter 1: Introduction to Survival Analysis
2
Survival Data
Survival time examples:
- time a cancer patient is in remission
- time til a disease-free person has a heart attack
- time til death of a healthy mouse
- time til a computer component fails
- time til a paroled prisoner gets rearrested
- time til death of a liver transplant patient
- time til a cell phone customer switches carrier
- time til recovery after surgery
All are "time til some event occurs"; longer times are better in all but the last…
3
Survival and hazard functions
Now define a survival r.v. Y as a continuous r.v. taking its values in the interval from 0 to infinity; i.e., its values are thought of as the lifetime or survival time: the time til death (or time til failure if we’re considering an inanimate object). So Y is a positive-valued r.v. with pdf f(y) and cdf F(y) = P(Y ≤ y).
4
Survival distribution
Now define the survival (or reliability) function S(y) as S(y) = 1 - F(y) = P(Y > y). In terms of the pdf f(y), we have
$$S(y) = \int_y^\infty f(t)\,dt.$$
Note the following important properties of the survival function:
- S(0) = 1
- S(∞) = 0 (as a limit)
- S(b) ≥ S(a) for 0 ≤ b < a
So the survival function is a monotone decreasing (non-increasing) function on the interval from 0 to infinity (see Fig 1.1, p. 4).
5
Survival function
Note: the survival function is a monotone decreasing function on the interval from 0 to infinity (see Fig 1.1, p. 4).
6
Summary of the survival function
7
Three goals of survival analysis
1. Estimate the survival function, together with its standard deviation (SD).
2. Compare survival functions (e.g., across levels of a categorical variable: treatment vs. placebo).
3. Understand the relationship of the survival function to explanatory variables (e.g., is survival time different for various values of an explanatory variable?).
8
Empirical survival function
The survival function S(y) = P(Y > y) can be estimated by the empirical survival function (ESF), which is essentially the relative frequency of the Y’s that exceed y. Let Y1, …, Yn be i.i.d. (independent and identically distributed) survival variables. Then the empirical survival function is given by
$$S_n(y) = \frac{1}{n}\sum_{i=1}^{n} I(Y_i > y),$$
where I is the indicator function.
Q: Find the ESF for the data: 1, 3, 5, 8, 10.
Q: Moreover, what is the variance of Sn(y)?
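As a minimal sketch of this definition, the ESF can be computed in R as the fraction of observations that exceed y. The helper name `esf` and the variable `obs` are illustrative choices, not names from the course notes.

```r
# Hypothetical helper (not from the notes): returns S_n as a function of y
esf <- function(obs) {
  function(y) sapply(y, function(y0) mean(obs > y0))
}

obs <- c(1, 3, 5, 8, 10)      # the five data points from the question
Sn <- esf(obs)
Sn(c(0, 1, 3, 5, 8, 10))      # S_n evaluated at 0 and at each observation
```

Between observations the function is flat; it drops by 1/n at each data point (by more when values are tied).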
9
Review of Bernoulli & Binomial RVs:
Z ~ Bernoulli(p): a single trial with outcome in {success, failure} and P(success) = p. Equivalently, Z ~ Bernoulli(p) = Binomial(1, p), i.e., n = 1.
E(Z) = p (i.e., P(Z = 1) = p).
V(Z) = p*q = p*(1-p).
Recall: the sum of n iid (independent and identically distributed) Bernoulli(p) r.v.s is a Binomial r.v. with parameters n and p. We show on the next slide that the empirical survivor function Sn(y) is an unbiased estimator of S(y).
10
Expectation, Variance and confidence interval of Empirical survival function
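Note that
$$n S_n(y) = \sum_{i=1}^{n} I(Y_i > y),$$
where each I(Y_i > y) is a Bernoulli(p) r.v., and as such nSn(y) has a B(n, p) distribution, where p = P(Y > y) = S(y). Note that for a fixed y*,
$$E[S_n(y^*)] = \frac{1}{n}\,E[n S_n(y^*)] = \frac{1}{n}\,n\,S(y^*) = S(y^*),$$
so Sn is unbiased as an estimator of S.
What is Var(Sn)? What is the confidence interval for S(y) based on Sn(y)? (see 1.6 and on p. 6, where the confidence interval is computed…)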
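A sketch of the variance and interval, using only the binomial fact that nSn(y) ~ B(n, p) with p = S(y). The multiplier 2 (a rounding of 1.96) matches the R code on the confidence-band slide; whether the textbook's equation 1.6 has exactly this form is an assumption.

```latex
\begin{align*}
\operatorname{Var}\big(S_n(y)\big)
  &= \frac{1}{n^2}\operatorname{Var}\big(n S_n(y)\big)
   = \frac{n\,S(y)\,\big(1 - S(y)\big)}{n^2}
   = \frac{S(y)\,\big(1 - S(y)\big)}{n}, \\
\text{approx. 95\% CI:}\quad
  & S_n(y) \pm 2\sqrt{\frac{S_n(y)\,\big(1 - S_n(y)\big)}{n}} .
\end{align*}
```

The interval plugs Sn(y) in for the unknown S(y), so it is a large-sample approximation and its limits can fall outside [0, 1] when Sn(y) is near 0 or 1.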
11
Empirical survival function
Example 1.3, page 6
Placebo group: steroid-induced remission times (weeks): 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23.
Q: find (a) ; (b) find the 95% confidence interval for
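The evaluation point asked for in this question did not survive in the transcript, so the following is only a sketch at an assumed point `y0 = 10` weeks, chosen for illustration. It computes Sn(y0) for the placebo data and an approximate 95% interval based on the binomial standard error, with 2 standing in for 1.96.

```r
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
n <- length(placebo)

y0 <- 10                                   # ASSUMED evaluation point (illustrative only)
s_hat <- mean(placebo > y0)                # S_n(y0): fraction of remission times > y0
se <- sqrt(s_hat * (1 - s_hat) / n)        # binomial standard error of S_n(y0)
ci <- c(lower = s_hat - 2 * se, upper = s_hat + 2 * se)

s_hat
ci
```

Changing `y0` gives the estimate and interval at any other time point.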
12
Plot an empirical survivor function in R (Section 4.1, pages 55-56):
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
placebo <- sort(placebo)
a <- rle(placebo)
values <- a$values; values    ## distinct values from the observations
reps <- a$lengths; reps       ## replication for each distinct value
f <- table(placebo); f
# We need the fractions to plot the curve, so get the sample size first in n
n <- length(placebo); n
# we want S(0)=1
surv1 <- 1 - cumsum(f)/n
surv2 <- c(1, surv1); surv2
# now let's plot this curve… use type="s" to get a step function
t <- c(0, values); t          # t is the vector of x's and surv2 is the vector of y's
plot(t, surv2, type="s", xlab="Remission Times", ylab="Relative Frequencies", col="blue", lwd=3)
points(t, surv2, col="red", pch=18)
13
To create the confidence bands for Example 1.3
# approximate 95% limits: estimate -/+ 2 standard errors (2 is a rounding of 1.96)
low <- surv2 - 2*sqrt(surv2*(1-surv2)/n)
upp <- surv2 + 2*sqrt(surv2*(1-surv2)/n)
points(t, low, col="orange", lty=2)
points(t, upp, col="orange", lty=3)
lines(t, low, col="orange", lty=2, lwd=3)
lines(t, upp, col="orange", lty=3, lwd=3)
## To print out the confidence intervals
(cbind(t, surv2, low, upp))
14
Confidence bands for Example 1.3
15
Plot an empirical survivor function in R (Section 4.1, pages 55-56):
Assume we have sorted data: starting at Sn(0)=1;
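As an alternative sketch (not the method used in the slides), base R's `ecdf` gives the empirical cdf directly, and the ESF is one minus it. The names `Fn`, `Sn`, and `k` are illustrative.

```r
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)

Fn <- ecdf(placebo)                 # empirical cdf: fraction of observations <= y
Sn <- function(y) 1 - Fn(y)         # empirical survival function: fraction > y

k <- knots(Fn)                      # sorted distinct observation times
Sn(0)                               # starts at 1, since every time exceeds 0
plot(c(0, k), Sn(c(0, k)), type = "s",
     xlab = "Remission Times", ylab = "Relative Frequencies")
```

Because `ecdf` handles ties internally, no separate `rle` or `table` step is needed here.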
16
How to compare survival functions
Example 1.4 on page 8 shows that it is sometimes difficult to compare survival curves since they can cross each other… (what makes one survival curve “better” than another?): S1(y)=exp(-y/2) and S2(y)=exp(-y^2/4)…
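Setting the exponents equal, -y/2 = -y^2/4, shows the two curves of Example 1.4 meet at y = 2 (besides y = 0). A small numerical sketch of the same check follows; the function names `S1`, `S2` and the search interval are illustrative choices.

```r
S1 <- function(y) exp(-y / 2)
S2 <- function(y) exp(-y^2 / 4)

# S1 - S2 changes sign on [1, 5], so uniroot can locate the crossing point
crossing <- uniroot(function(y) S1(y) - S2(y), interval = c(1, 5), tol = 1e-9)$root
crossing

S1(1) < S2(1)   # before the crossing, S2 is the higher curve
S1(3) > S2(3)   # after the crossing, S1 is the higher curve
```

Neither curve dominates the other over the whole time axis, which is why a single summary such as the MTTF is useful for comparison.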
17
How to compare survival functions
One way of comparing two survival curves is by comparing their MTTF (mean time til failure) values. Let’s try to use R to draw the two curves given in Ex. 1.4: S1(y)=exp(-y/2) and S2(y)=exp(-y^2/4)…
# For Example 1.4 on page 8, we plot two survival functions in R.
# First we need a grid of x values at which to evaluate the survival functions
x <- seq(0, 10, by=0.001)
y1 <- exp(-x/2)
y2 <- exp(-x^2/4)
plot(x, y1, col="blue")
points(x, y2, col="red")
title("comparing two survival functions")
18
Mean time to failure (MTTF)
Note that the MTTF of a survival r.v. Y is just its expected value E(Y). We can also show (Theorem 1.2) that
$$E(Y) = \int_0^\infty S(y)\,dy.$$
(Math & Stat majors: Show this is true using integration by parts and l’Hôpital’s rule…!)
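One possible outline of the integration-by-parts argument, written as a sketch. It assumes E(Y) is finite, so that the boundary term y S(y) vanishes as y grows; justifying that limit is where l’Hôpital’s rule comes in.

```latex
\begin{align*}
E(Y) &= \int_0^\infty y\, f(y)\, dy
      && \text{(take } u = y,\ dv = f(y)\,dy,\ \text{so } v = -S(y)\text{)} \\
     &= \Big[-y\, S(y)\Big]_0^\infty + \int_0^\infty S(y)\, dy \\
     &= \int_0^\infty S(y)\, dy
      && \text{(assuming } \lim_{y\to\infty} y\,S(y) = 0\text{)} .
\end{align*}
```

Here v = -S(y) works because the derivative of -S(y) = F(y) - 1 is f(y).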
19
Review: Exponential distribution
Def 4.11: Y ~ Exp(β). Eg: What is the cdf of Exp(β)? Eg: What is Pr(Y > a) if Y ~ Exp(β)?
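Assuming Def 4.11 uses the mean parameterization, f(y) = (1/β) exp(-y/β) for y > 0 (consistent with the later slide that passes 1/β to R as the rate), the cdf is F(y) = 1 - exp(-y/β) and so Pr(Y > a) = exp(-a/β). The sketch below checks this against `pexp`; the values of `beta` and `a` are made up for illustration.

```r
beta <- 5    # ASSUMED mean, for illustration only
a <- 3       # ASSUMED threshold, for illustration only

p_closed <- exp(-a / beta)                              # closed form for P(Y > a)
p_pexp <- pexp(a, rate = 1 / beta, lower.tail = FALSE)  # same probability from R

c(closed_form = p_closed, pexp = p_pexp)
```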
20
Mean time to failure (MTTF)
Note that the MTTF of a survival r.v. Y is just its expected value E(Y). We can also show (Theorem 1.2) that
$$E(Y) = \int_0^\infty S(y)\,dy.$$
(Math & Stat majors: Show this is true using integration by parts and l’Hôpital’s rule…!)
So suppose we have an exponential survival function:
$$S(y) = e^{-y/\beta}, \quad y \ge 0.$$
Q: Show that the MTTF for this variable is β. (Can you show this satisfies the properties of a survival function?)
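A numerical sanity check of the MTTF as the area under the survival curve, using base R's `integrate`. This is a sketch rather than a proof; `beta = 2` is an assumed value, chosen so that the first curve coincides with S1 from Example 1.4, and the second curve is S2 from the same example.

```r
beta <- 2                                   # ASSUMED value, for illustration only
S_exp <- function(y) exp(-y / beta)         # exponential survival function
S_alt <- function(y) exp(-y^2 / 4)          # S2 from Example 1.4

mttf_exp <- integrate(S_exp, lower = 0, upper = Inf)$value
mttf_alt <- integrate(S_alt, lower = 0, upper = Inf)$value

mttf_exp    # numerically close to beta
mttf_alt    # numerically close to sqrt(pi)
```

By this measure the exponential curve with β = 2 has the larger MTTF, even though the two curves cross.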
21
Mean time to failure (MTTF)
For any two such survival functions, S1(y)=exp(-y/β1) and S2(y)=exp(-y/β2), one is “better” than the other if the corresponding beta is “better”…
HW, EX1: (Use R) Plot on the same axes at least two such survival functions, S1(y)=exp(-y/β1) and S2(y)=exp(-y/β2), with different values of beta (e.g., β1 = 10; β2 = 5), and show this result.
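One way such a plot could be started, as a sketch using the example values β1 = 10 and β2 = 5. The plotting grid and the variable names are illustrative assumptions.

```r
beta1 <- 10; beta2 <- 5                  # example values of beta
y <- seq(0, 40, by = 0.1)                # ASSUMED plotting grid

s1 <- exp(-y / beta1)                    # survival curve with the larger beta
s2 <- exp(-y / beta2)                    # survival curve with the smaller beta

plot(y, s1, type = "l", col = "blue", ylim = c(0, 1), xlab = "y", ylab = "S(y)")
lines(y, s2, col = "red", lty = 2)
legend("topright", legend = c("beta = 10", "beta = 5"),
       col = c("blue", "red"), lty = c(1, 2))

all(s1 >= s2)                            # the larger-beta curve is never below the other
```

Unlike the curves in Example 1.4, two exponential survival curves do not cross for y > 0, so ordering them by beta is unambiguous.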
22
STT420-520 HW
HW, EX2: For Example 1.1, page 1:
- Calculate the empirical survival function and corresponding confidence intervals (use R);
- Plot both the empirical survival function and its confidence intervals in the same figure… (use R and see example code on the back)
- Estimate the probability of failure beyond 10 weeks.
HW, EX3: See page 13, Exercise 1.3.
23
Review: Exponential distribution
In R: dexp(x, 1/β); pexp(x, 1/β); qexp(per, 1/β); rexp(N, 1/β).
## Note that in R, the exponential distribution is defined in a different way (by the rate 1/β)
## than we used in the STT315 class.
set.seed(100)
y <- rexp(10000, 0.1)
mean(y)    ## close to beta = 10, not 0.1!