STT : Biostatistics Analysis Dr. Cuixian Chen

STT520-420: Biostatistics Analysis Dr. Cuixian Chen
Chapter 1: Introduction to Survival Analysis

Survival Data
Survival time examples:
time a cancer patient is in remission
time until a disease-free person has a heart attack
time until death of a healthy mouse
time until a computer component fails
time until a paroled prisoner gets rearrested
time until death of a liver transplant patient
time until a cell phone customer switches carrier
time until recovery after surgery
All are "time until some event occurs"; longer times are better in all but the last…

Survival and hazard functions
Now define a survival r.v. Y as a continuous r.v. taking its values in the interval from 0 to infinity; its values are thought of as the lifetime or survival time, i.e., the time until death (or the time until failure if we’re considering an inanimate object). So Y is a positive-valued r.v. with pdf f(y) and cdf F(y) = P(Y ≤ y).

Survival distribution
Now define the survival (or reliability) function S(y) as S(y) = 1 - F(y) = P(Y > y). In terms of the pdf f(y), we have $S(y) = \int_y^{\infty} f(t)\,dt$.
Note the following important properties of the survival function:
S(0) = 1
S(∞) = 0 (as a limit)
S(b) ≥ S(a) for 0 ≤ b < a
So the survival function is a monotone decreasing function on the interval from 0 to infinity (see Fig 1.1 p. 4).

Survival function
Note: the survival function is a monotone decreasing function on the interval from 0 to infinity (see Fig 1.1 p. 4).

Summary of the survival function

Three goals of survival analysis
1. Estimate the survival function, together with its SD.
2. Compare survival functions (e.g., across levels of a categorical variable: treatment vs. placebo).
3. Understand the relationship of the survival function to explanatory variables (e.g., is survival time different for various values of an explanatory variable?).

Empirical survival function
The survival function S(y) = P(Y > y) can be estimated by the empirical survival function (ESF), which is the relative frequency of the Y’s that exceed y. Let Y1, …, Yn be i.i.d. (independent and identically distributed) survival variables. Then the empirical survival function is given by $S_n(y) = \frac{1}{n}\sum_{i=1}^{n} I(Y_i > y)$, where I is the indicator function.
Q: Find the ESF for the data 1, 3, 5, 8, 10.
Q: Moreover, what is the variance of Sn(y)?
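A minimal R sketch of the ESF for the small data set in the first question. The helper name `esf` and the evaluation grid are illustrative assumptions, not taken from the textbook; it simply applies the indicator-function definition.

```r
# Hypothetical helper: S_n(y) = (1/n) * sum of I(Y_i > y), evaluated at each y
esf <- function(y, data) {
  sapply(y, function(yy) mean(data > yy))
}

obs  <- c(1, 3, 5, 8, 10)          # data from the question
grid <- c(0, 1, 3, 5, 8, 10)       # evaluate at 0 and at each observation
sn   <- esf(grid, obs)

# S_n starts at 1 and drops by 1/5 at each observed value
print(cbind(y = grid, Sn = sn))
```

Between observed values the ESF is constant, so evaluating it at 0 and at the data points describes the whole step function.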

Review of Bernoulli & Binomial RVs:
Z ~ Bernoulli(p): in a single trial, outcome = {success, failure}, with P(success) = P(Z=1) = p. Equivalently, Z ~ Bernoulli(p) = Binomial(1, p), i.e., n = 1.
E(Z) = p.
V(Z) = p*q = p*(1-p).
Recall: the sum of n iid (independent and identically distributed) Bernoulli(p) r.v.s is a Binomial r.v. with parameters n and p. We show on the next slide that the empirical survivor function Sn(y) is an unbiased estimator of S(y).

Expectation, Variance and confidence interval of Empirical survival function
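Note that $nS_n(y) = \sum_{i=1}^{n} I(Y_i > y)$ is a sum of n iid Bernoulli(p) r.v.s, and as such nSn(y) has a B(n, p) distribution, where p = P(Y > y) = S(y). Note that for a fixed y*, $E[S_n(y^*)] = \frac{1}{n}E[nS_n(y^*)] = \frac{1}{n}\,n\,S(y^*) = S(y^*)$, so Sn is unbiased as an estimator of S.
What is Var(Sn)?
What is a confidence interval for S(y) based on Sn? (see 1.6 and on p.6 where the confidence interval is computed…)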
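A sketch of the standard argument that answers the variance and confidence-interval questions from the binomial fact. The factor 2 is used here as a rounded normal quantile (1.96), matching the R code for Example 1.3 in this deck; the textbook's own equations on p. 6 may be written differently.

```latex
\begin{align*}
\operatorname{Var}[S_n(y)]
  &= \frac{1}{n^2}\operatorname{Var}[nS_n(y)]
   = \frac{np(1-p)}{n^2}
   = \frac{S(y)\,[1-S(y)]}{n},\\[4pt]
\text{approx. 95\% CI for } S(y):\quad
  & S_n(y) \pm 2\sqrt{\frac{S_n(y)\,[1-S_n(y)]}{n}}.
\end{align*}
```

The interval replaces the unknown S(y) in the variance by its estimate Sn(y) and relies on the normal approximation to the binomial, so it is rough when n is small or Sn(y) is near 0 or 1.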

Empirical survival function
Example 1.3, page 6. Placebo group: Steroid induced remission times (weeks): 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23.
Q: find (a) …; (b) the 95% confidence interval for …
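A sketch of the point estimate and an approximate 95% interval at a single time point for the placebo data. The time point y0 = 10 weeks is an assumption chosen only for illustration, and the interval uses the binomial variance with the rounded factor 2.

```r
# Placebo remission times (weeks) from Example 1.3
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
n <- length(placebo)

y0 <- 10                          # illustrative time point (an assumption)
sn <- mean(placebo > y0)          # S_n(y0): fraction of times exceeding y0
se <- sqrt(sn * (1 - sn) / n)     # estimated SD from the binomial variance
ci <- c(lower = sn - 2 * se, upper = sn + 2 * se)

print(sn)   # 8 of the 21 times exceed 10 weeks
print(ci)
```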

Plot an empirical survivor function in R (Section 4.1, pages 55-56):
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
placebo <- sort(placebo)
a <- rle(placebo)
values <- a$values; values    ## distinct values from the observations
reps <- a$lengths; reps       ## number of replications of each distinct value
f <- table(placebo); f
# We need the fractions to plot the curve, so get the sample size n first
n <- length(placebo); n
# we want S(0) = 1
surv1 <- 1 - cumsum(f)/n
surv2 <- c(1, surv1); surv2
# now plot this curve; type="s" gives a step function
t <- c(0, values); t          # t is the vector of x's and surv2 is the vector of y's
plot(t, surv2, type="s", xlab="Remission Times", ylab="Relative Frequencies", col="blue", lwd=3)
points(t, surv2, col="red", pch=18)

To create the confidence bands for Example 1.3
## Uses t, surv2 and n from the plotting code for the empirical survivor function
low <- surv2 - 2*sqrt(surv2*(1-surv2)/n)
upp <- surv2 + 2*sqrt(surv2*(1-surv2)/n)
points(t, low, col="orange", lty=2)
points(t, upp, col="orange", lty=3)
lines(t, low, col="orange", lty=2, lwd=3)
lines(t, upp, col="orange", lty=3, lwd=3)
## To print out the confidence intervals
(cbind(t, surv2, low, upp))

Confidence bands for Example 1.3
(Figure: empirical survival function with lower and upper confidence bands.)

Plot an empirical survivor function in R (Section 4.1, pages 55-56):
Assume we have sorted data: Starting at Sn(0)=1;

How to compare survival functions
Example 1.4 on page 8 shows that it is sometimes difficult to compare survival curves since they can cross each other (what makes one survival curve “better” than another?): S1(y) = exp(-y/2) and S2(y) = exp(-y^2/4).

How to compare survival functions
One way of comparing two survival curves is by comparing their MTTF (mean time to failure) values. Let’s use R to draw the two curves given in Ex. 1.4: S1(y) = exp(-y/2) and S2(y) = exp(-y^2/4).
# For Example 1.4 on page 8, we plot the two survival functions in R.
# First evaluate both survival functions on a grid of x values
x <- seq(0, 10, by=0.001)
y1 <- exp(-x/2)
y2 <- exp(-x^2/4)
plot(x, y1, col="blue")
points(x, y2, col="red")
title("comparing two survival functions")

Mean time to failure (MTTF)
Note that the MTTF of a survival r.v. Y is just its expected value E(Y). We can also show (Theorem 1.2) that $E(Y) = \int_0^{\infty} S(y)\,dy$. (Math & Stat majors: Show this is true using integration by parts and l’Hôpital’s rule!)
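As a numerical illustration, the MTTF of the two curves from Example 1.4 can be approximated by integrating each survival function over (0, ∞), using the standard identity E(Y) = ∫ S(y) dy for a nonnegative r.v. This is a sketch using base R's `integrate` and `uniroot`; the variable names and the root-search interval are assumptions.

```r
# Survival functions from Example 1.4
S1 <- function(y) exp(-y / 2)
S2 <- function(y) exp(-y^2 / 4)

# MTTF = integral of S(y) over (0, Inf), evaluated numerically
mttf1 <- integrate(S1, lower = 0, upper = Inf)$value   # exact value: 2
mttf2 <- integrate(S2, lower = 0, upper = Inf)$value   # exact value: sqrt(pi)

# Crossing point: the curves meet where y/2 = y^2/4, i.e. at y = 2
cross <- uniroot(function(y) S1(y) - S2(y), interval = c(0.5, 5))$root

print(c(mttf1 = mttf1, mttf2 = mttf2, cross = cross))
```

By this criterion S1 has the larger MTTF, even though S2 lies above S1 for y below the crossing point.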

Review: Exponential distribution
Def 4.11: Y ~ Exp(β).
Eg: What is the cdf of Exp(β)?
Eg: What is Pr(Y > a), if Y ~ Exp(β)?
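For reference, a worked sketch of both answers, assuming the mean parameterization (Y ~ Exp(β) has mean β), which is what the R calls of the form `dexp(x, 1/β)` in this deck imply; Def 4.11 in the course text may use different notation.

```latex
\begin{align*}
f(y) &= \frac{1}{\beta}\,e^{-y/\beta}, \quad y \ge 0,\\
F(y) &= \int_0^{y} \frac{1}{\beta}\,e^{-t/\beta}\,dt = 1 - e^{-y/\beta},\\
P(Y > a) &= 1 - F(a) = e^{-a/\beta}, \quad a \ge 0.
\end{align*}
```

The last line is exactly the survival function of the exponential distribution evaluated at a.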

Recall that the MTTF of a survival r.v. Y is its expected value E(Y), and that by Theorem 1.2, $E(Y) = \int_0^{\infty} S(y)\,dy$. Now suppose we have an exponential survival function: $S(y) = e^{-y/\beta}$, y ≥ 0.
Q: Show that the MTTF for this variable is β. (Can you show this satisfies the properties of a survival function?)

For any two such survival functions, S1(y) = exp(-y/β1) and S2(y) = exp(-y/β2), one is “better” than the other if the corresponding beta is “better”.
HW, EX1: (Use R) Plot on the same axes at least two such survival functions, S1(y) = exp(-y/β1) and S2(y) = exp(-y/β2), with different values of beta (e.g., β1 = 10; β2 = 5), and show this result.

STT420-520 HW
HW, EX2: For Example 1.1, page 1:
Calculate the empirical survival function and the corresponding confidence intervals (use R).
Plot both the empirical survival function and its confidence intervals in the same figure (use R and see the example code on the back).
Estimate the probability of failure beyond 10 weeks.
HW, EX3: See page 13, Exercise 1.3.

Review: Exponential distribution
In R: dexp(x, 1/β); pexp(x, 1/β); qexp(per, 1/β); rexp(N, 1/β).
## Note that R parameterizes the exponential distribution by the rate 1/β,
## which differs from the definition used in the STT315 class.
set.seed(100)
y <- rexp(10000, 0.1)
mean(y)   ## close to beta = 1/0.1 = 10, not 0.1!
