STT : Biostatistics Analysis Dr. Cuixian Chen

1 STT520-420: Biostatistics Analysis Dr. Cuixian Chen
Chapter 1: Introduction to Survival Analysis

2 Survival Data
Survival time examples:
time a cancer patient is in remission
time until a disease-free person has a heart attack
time until death of a healthy mouse
time until a computer component fails
time until a paroled prisoner gets rearrested
time until death of a liver transplant patient
time until a cell phone customer switches carrier
time until recovery after surgery
All are "time until some event occurs"; longer times are better in all but the last…

3 Survival and hazard functions
Now define a survival r.v. Y as a continuous r.v. taking its values in the interval from 0 to infinity; its values are thought of as the lifetime or survival time, i.e., the time until death (or the time until failure if we're considering an inanimate object). So Y is a positive-valued r.v. with pdf f(y) and cdf F(y) = P(Y ≤ y).

4 Survival distribution
Now define the survival (or reliability) function S(y) as S(y) = 1 - F(y) = P(Y > y). In terms of the pdf f(y), we have $S(y) = \int_y^\infty f(u)\,du$.
Note the following important properties of the survival function:
S(0) = 1
S(∞) = 0
S(b) ≥ S(a) for 0 < b < a
So the survival function is a monotone non-increasing function on the interval from 0 to infinity (see Fig 1.1 p. 4)

5 Survival function
Note: the survival function is a monotone non-increasing function on the interval from 0 to infinity (see Fig 1.1 p. 4)

6 Summary of the survival function

7 Three goals of survival analysis
Estimate the survival function, together with its standard deviation (SD).
Compare survival functions (e.g., across levels of a categorical variable: treatment vs. placebo).
Understand the relationship of the survival function to explanatory variables (e.g., is survival time different for various values of an explanatory variable?).

8 Empirical survival function
The survival function S(y) = P(Y > y) can be estimated by the empirical survival function (ESF), which is essentially the relative frequency of the Y's that exceed y. Let Y1, …, Yn be i.i.d. (independent and identically distributed) survival variables. Then the empirical survival function is given by $S_n(y) = \frac{1}{n}\sum_{i=1}^{n} I(Y_i > y)$, where I is the indicator function.
Q: Find the ESF for the data: 1, 3, 5, 8, 10.
Q: Moreover, what is the variance of Sn(y)?
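The ESF definition above can be sketched in a few lines of R. This is only an illustration: the helper name `esf` and the evaluation grid `y_grid` are assumptions introduced here, not part of the slides.

```r
# Empirical survival function: the fraction of observations strictly greater than y.
# 'esf' is a hypothetical helper name used only for this illustration.
esf <- function(y, obs) {
  sapply(y, function(v) mean(obs > v))
}

obs <- c(1, 3, 5, 8, 10)          # data from the question on this slide
y_grid <- c(0, 1, 3, 5, 8, 10)    # evaluate at 0 and at each observed value
sn <- esf(y_grid, obs)

# Sn drops by 1/5 at each observation: 1, 0.8, 0.6, 0.4, 0.2, 0
print(cbind(y = y_grid, Sn = sn))
```

Because the comparison is strict (`obs > v`), Sn is right-continuous and reaches 0 at the largest observation.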

9 Review of Bernoulli & Binomial RVs:
Z ~ Bernoulli(p): a single trial with outcome in {success, failure} and P(success) = p. Equivalently, Bernoulli(p) = Binomial(1, p), i.e., n = 1.
E(Z) = p (since P(Z = 1) = p).
V(Z) = pq = p(1 - p).
Recall: the sum of n i.i.d. (independent and identically distributed) Bernoulli(p) r.v.s is a Binomial r.v. with parameters n and p. The next slide uses this to show that the empirical survivor function Sn(y) is an unbiased estimator of S(y).

10 Expectation, Variance and confidence interval of Empirical survival function
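Note that $nS_n(y) = \sum_{i=1}^{n} I(Y_i > y)$ is a sum of n i.i.d. Bernoulli(p) r.v.s, and as such nSn(y) has a B(n, p) distribution, where p = P(Y > y) = S(y). Note that for a fixed y*, $E[S_n(y^*)] = \frac{1}{n}E[nS_n(y^*)] = \frac{np}{n} = S(y^*)$, so Sn is unbiased as an estimator of S. What is Var(Sn)? What is the confidence interval for S(y)? (see 1.6 and on p. 6, where the confidence interval is computed…)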
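One way to answer the variance question is to reuse the binomial fact directly. The derivation below is a sketch under the normal approximation; the multiplier 2 (rather than 1.96) is taken from the R code for the confidence bands in Example 1.3, and the textbook's own formula may be written slightly differently.

```latex
\begin{align*}
n S_n(y) &\sim \mathrm{B}(n, p), \qquad p = S(y), \\
\operatorname{Var}\big(S_n(y)\big)
  &= \frac{1}{n^2}\operatorname{Var}\big(n S_n(y)\big)
   = \frac{n p (1 - p)}{n^2}
   = \frac{S(y)\,[1 - S(y)]}{n}, \\
\text{approx. 95\% CI for } S(y):
  &\quad S_n(y) \pm 2 \sqrt{\frac{S_n(y)\,[1 - S_n(y)]}{n}} .
\end{align*}
```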

11 Empirical survival function
Example 1.3, page 6. Placebo group: steroid-induced remission times (weeks):
1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23.
Q: find (a) ; (b) find the 95% confidence interval for
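A minimal sketch of how a point estimate and an approximate 95% interval could be computed for the placebo data. The time point `y0 = 10` is an assumption chosen purely for illustration (the transcript does not show which value the question uses), and the multiplier 2 follows the normal-approximation bands used in the later R code.

```r
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
n <- length(placebo)

y0 <- 10                        # illustrative time point (an assumption, not from the slides)
sn <- mean(placebo > y0)        # empirical survival function at y0: 8 of 21 exceed 10
se <- sqrt(sn * (1 - sn) / n)   # estimated standard deviation of Sn(y0)

# approximate 95% confidence interval: estimate plus/minus 2 standard deviations
ci <- c(lower = sn - 2 * se, upper = sn + 2 * se)
print(sn)
print(ci)
```

Changing `y0` gives the estimate and interval at any other time point.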

12 Plot an empirical survivor function in R (Section 4.1, pages 55-56):
placebo <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
placebo <- sort(placebo)
a <- rle(placebo)
values <- a$values; values    ## distinct values from the observations
counts <- a$lengths; counts   ## number of replications of each distinct value
f <- table(placebo); f
# We need the fractions to plot the curve, so get the sample size first in n
n <- length(placebo); n
# we want S(0)=1
surv1 <- 1 - cumsum(f)/n
surv2 <- c(1, surv1); surv2
# now let's plot this curve; use type="s" to get a step function
t <- c(0, values); t          # t is the vector of x's and surv2 is the vector of y's
plot(t, surv2, type="s", xlab="Remission Times", ylab="Relative Frequencies", col="blue", lwd=3)
points(t, surv2, col="red", pch=18)

13 To create the confidence bands for Example 1.3
# uses t, surv2 and n from the empirical survivor function code for Example 1.3
low <- surv2 - 2*sqrt(surv2*(1-surv2)/n)
upp <- surv2 + 2*sqrt(surv2*(1-surv2)/n)
points(t, low, col="orange", lty=2)
points(t, upp, col="orange", lty=3)
lines(t, low, col="orange", lty=2, lwd=3)
lines(t, upp, col="orange", lty=3, lwd=3)
## To print out the confidence intervals
(cbind(t, surv2, low, upp))

14 Confidence bands for Example 1.3

15 Plot an empirical survivor function in R (Section 4.1, pages 55-56):
Assume we have sorted data: starting at Sn(0) = 1;

16 How to compare survival functions
Example 1.4 on page 8 shows that it is sometimes difficult to compare survival curves, since they can cross each other (what makes one survival curve "better" than another?): S1(y) = exp(-y/2) and S2(y) = exp(-y^2/4)…

17 How to compare survival functions
One way of comparing two survival curves is by comparing their MTTF (mean time to failure) values. Let's try to use R to draw the two curves given in Ex. 1.4: S1(y) = exp(-y/2) and S2(y) = exp(-y^2/4)…
# For Example 1.4 on page 8. We plot the two survival functions in R:
# First we need to evaluate both survival functions at these values of x
x = seq(0, 10, by=0.001)
y1 = exp(-x/2)
y2 = exp(-x^2/4)
plot(x, y1, col="blue")
points(x, y2, col="red")
title("comparing two survival functions")
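To see where the two curves of Ex. 1.4 cross, the point where S1(y) = S2(y) can be located numerically. The sketch below uses base R's `uniroot`; the search interval and tolerance are assumptions chosen so that the trivial solution y = 0 (where both curves equal 1) is excluded.

```r
S1 <- function(y) exp(-y / 2)
S2 <- function(y) exp(-y^2 / 4)

# The difference S1 - S2 is negative just after 0 and positive for large y,
# so it changes sign exactly once on (0.5, 10).
cross <- uniroot(function(y) S1(y) - S2(y),
                 interval = c(0.5, 10), tol = 1e-9)$root
print(cross)   # analytically, y/2 = y^2/4 gives y = 2
```

Before the crossing S2 lies above S1, and after it S1 lies above S2, which is why neither curve is uniformly "better".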

18 Mean time to failure (MTTF)
Note that the MTTF of a survival r.v. Y is just its expected value E(Y). We can also show (Theorem 1.2) that $E(Y) = \int_0^\infty S(y)\,dy$. (Math & Stat majors: show this is true using integration by parts and l'Hospital's rule!)
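The fact that the MTTF is the area under the survival curve can be checked numerically for the two curves of Ex. 1.4. This is a sketch using base R's `integrate`; the variable names `mttf1` and `mttf2` are introduced here only for illustration.

```r
S1 <- function(y) exp(-y / 2)
S2 <- function(y) exp(-y^2 / 4)

# MTTF = area under the survival curve from 0 to infinity
mttf1 <- integrate(S1, lower = 0, upper = Inf)$value
mttf2 <- integrate(S2, lower = 0, upper = Inf)$value

print(mttf1)   # exact value is 2
print(mttf2)   # exact value is sqrt(pi), about 1.772
```

By this criterion S1 has the larger MTTF, even though S2 lies above S1 for small y.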

19 Review: Exponential distribution
Def 4.11: Y ~ Exp(β). E.g.: What is the cdf of Exp(β)? E.g.: What is Pr(Y > a) if Y ~ Exp(β)?

20 Mean time to failure (MTTF)
Note that the MTTF of a survival r.v. Y is just its expected value E(Y). We can also show (Theorem 1.2) that $E(Y) = \int_0^\infty S(y)\,dy$. (Math & Stat majors: show this is true using integration by parts and l'Hospital's rule!) So suppose we have an exponential survival function: $S(y) = e^{-y/\beta}$ for y ≥ 0. Q: Show that the MTTF for this variable is β. (Can you show this satisfies the properties of a survival function?)
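A short sketch of the computation, assuming the exponential survival function has the form S(y) = exp(-y/β) with β > 0, as in the comparison of S1 and S2 on the following slide:

```latex
\begin{align*}
\mathrm{MTTF} = E(Y) = \int_0^\infty S(y)\,dy
  = \int_0^\infty e^{-y/\beta}\,dy
  = \Big[-\beta\, e^{-y/\beta}\Big]_0^\infty
  = \beta .
\end{align*}
```

The survival-function properties also hold: S(0) = 1, S(y) tends to 0 as y tends to infinity, and S is decreasing because its derivative, $-\frac{1}{\beta}e^{-y/\beta}$, is negative.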

21 Mean time to failure (MTTF)
For any two such survival functions, S1(y) = exp(-y/β1) and S2(y) = exp(-y/β2), one is "better" than the other if the corresponding β is "better"…
HW, EX1: (Use R) Plot on the same axes at least two such survival functions, S1(y) = exp(-y/β1) and S2(y) = exp(-y/β2), with different values of β (e.g., β1 = 10; β2 = 5), and show this result.

22 STT420-520 HW
HW, EX2: For Example 1.1, page 1:
Calculate the empirical survival function and the corresponding confidence intervals (use R).
Plot both the empirical survival function and its confidence intervals in the same figure… (use R and see the example code on the back).
Estimate the probability of failure beyond 10 weeks.
HW, EX3: See page 13, Exercise 1.3.

23 Review: Exponential distribution
In R: dexp(x, 1/β); pexp(x, 1/β); qexp(per, 1/β); rexp(N, 1/β).
## Note that in R, the exponential distribution is parameterized by the rate 1/β, which is
## different from the definition we used in the STT315 class.
set.seed(100)
y = rexp(10000, 0.1)
mean(y)   ## beta = 1/0.1 = 10, not 0.1!
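The rate parameterization can be checked directly: with rate = 1/β, R's upper-tail probability should match exp(-a/β). The values `beta = 10` and `a = 3` below are arbitrary choices for illustration.

```r
beta <- 10   # mean of the exponential distribution (illustrative value)
a <- 3       # arbitrary time point

# P(Y > a) from R, using rate = 1/beta, versus the closed form exp(-a/beta)
p_r <- pexp(a, rate = 1 / beta, lower.tail = FALSE)
p_formula <- exp(-a / beta)
print(c(p_r, p_formula))

# The sample mean of simulated values should be close to beta, not to the rate
set.seed(100)
y <- rexp(10000, rate = 1 / beta)
print(mean(y))
```

Passing `beta` itself as the second argument would instead produce a distribution with mean 1/β.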

