Lecture 10: Hypothesis Testing II Weight Functions Trend Tests.


1 Lecture 10: Hypothesis Testing II Weight Functions Trend Tests

2 Testing >2 Samples in R
> ###2-sample testing using toy example
> library(survival)
> time<-c(3,6,9,9,11,16,8,9,10,12,19,23)
> cens<-c(1,0,1,1,0,1,1,1,0,0,1,0)
> grp<-c(1,1,1,1,1,1,2,2,2,2,2,2)
> grp<-as.factor(grp)
>
> sdat<-Surv(time, cens)
> survdiff(sdat~grp)
Call: survdiff(formula = sdat ~ grp)
      N Observed Expected (O-E)^2/E (O-E)^2/V
grp=1 6        4     2.57     0.800      1.62
grp=2 6        3     4.43     0.463      1.62
Chisq= 1.6 on 1 degrees of freedom, p= 0.203

3 Testing >2 Samples in R
> survdiff(sdat~grp, rho=1)
Call: survdiff(formula = sdat ~ grp, rho = 1)
      N Observed Expected (O-E)^2/E (O-E)^2/V
grp=1 6     3.20     2.15     0.513      1.23
grp=2 6     2.11     3.16     0.349      1.23
Chisq= 1.2 on 1 degrees of freedom, p= 0.268

4 Revisit ‘Linear Dependence’ of Z_j(τ)
How are they linearly dependent?
Two-sample case:

5 Revisit ‘Linear Dependence’ of Z_j(τ)
K-sample case:
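
The equations for these two slides did not survive in the transcript. The standard result they refer to can be sketched as follows, assuming the same weight function W(t_i) is used for every group. Because the group-level event counts and risk sets add up to the pooled ones, the Z_j(τ) sum to zero, so only K − 1 of them are free. This is why the K-sample test has K − 1 degrees of freedom.

```latex
% Two-sample case
Z_1(\tau) + Z_2(\tau) = 0
\quad\Longrightarrow\quad Z_2(\tau) = -Z_1(\tau)

% K-sample case: uses \sum_j d_{ij} = d_i and \sum_j Y_{ij} = Y_i
\sum_{j=1}^{K} Z_j(\tau)
  = \sum_{i} W(t_i)\Big[\sum_{j=1}^{K} d_{ij}
      - \frac{d_i}{Y_i}\sum_{j=1}^{K} Y_{ij}\Big]
  = \sum_{i} W(t_i)\,[\,d_i - d_i\,] = 0
```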

6 Beyond Log-Rank
Log-rank has optimum power to detect H_a when the hazard rates of our K groups are proportional
What if they’re not…
We’ve mentioned using other weight functions
Depending on the choice of weight function, we can place emphasis on different regions of the survival curve.
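
As a reminder of where the weight function enters, here is the general weighted statistic in the notation used by the R code later in the lecture (d_ij, Y_ij for group j; d_i, Y_i pooled). The symbol D for the number of distinct event times is introduced here for illustration only.

```latex
Z_j(\tau) = \sum_{i=1}^{D} W(t_i)\left[ d_{ij} - Y_{ij}\,\frac{d_i}{Y_i} \right],
\qquad j = 1, \dots, K

\hat{\sigma}_{jj} = \sum_{i=1}^{D} W(t_i)^2\,
  \frac{Y_{ij}}{Y_i}\left(1 - \frac{Y_{ij}}{Y_i}\right)
  \left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i
```

Setting W(t_i) = 1 gives the log-rank test. Weights that decrease in t (for example W(t_i) = Y_i) emphasize early differences, while weights that increase in t emphasize late differences.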

7 Example: Kidney Infection
Data on 119 kidney dialysis patients
Comparing time to kidney infection between two groups
– Catheters placed percutaneously (n = 76)
– Catheters placed surgically (n = 43)

9 Log-Rank Test
t_i   Y_i1  d_i1  Y_i2  d_i2  Y_i  d_i  Y_i1(d_i/Y_i)  d_i1-Y_i1(d_i/Y_i)  Variance
 0.5   43    0     76    6    119   6      2.168           -2.168           1.326
 1.5   43    1     60    0    103   1      0.417            0.583           0.243
 2.5   42    0     56    2     98   2      0.857           -0.857           0.485
 3.5   40    1     49    1     89   2      0.899            0.101           0.489
 4.5   36    2     43    0     79   2      0.911            1.089           0.490
 5.5   33    1     40    0     73   1      0.452            0.548           0.248
 6.5   31    0     35    1     66   1      0.470           -0.470           0.249
 8.5   25    2     30    0     55   2      0.909            1.091           0.487
 9.5   22    1     27    0     49   1      0.449            0.551           0.247
10.5   20    1     25    0     45   1      0.444            0.556           0.247
11.5   18    1     22    0     40   1      0.450            0.550           0.248
15.5   11    1     14    1     25   2      0.880            0.120           0.472
16.5   10    1     13    0     23   1      0.435            0.565           0.246
18.5    9    1     11    0     20   1      0.450            0.550           0.248
23.5    4    1      5    0      9   1      0.444            0.556           0.247
26.5    2    1      3    0      5   1      0.400            0.600           0.240
Sum                                                         3.964           6.211

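10 Comparisons
Test                              W(t_i)                            Z_1(τ)   σ²_11     χ²     p-value
Log-Rank                          1                                   3.96      6.21   2.53   0.112
Gehan                             Y_i                                -9     38862      0.002  0.964
Tarone-Ware                       sqrt(Y_i)                          13.24    432.83   0.4    0.526
Peto-Peto                         S~(t_i)                             2.47      4.36   1.4    0.237
Modified Peto-Peto                S~(t_i) Y_i/(Y_i+1)                 2.31      4.2    1.28   0.259
Fleming-Harrington p=0; q=1       S(t_(i-1))^p (1-S(t_(i-1)))^q       1.41      0.21   9.67   0.002
Fleming-Harrington p=1; q=0       S(t_(i-1))^p (1-S(t_(i-1)))^q       2.55      4.69   1.39   0.239
Fleming-Harrington p=1; q=1       S(t_(i-1))^p (1-S(t_(i-1)))^q       1.02      0.11   9.83   0.002
Fleming-Harrington p=0.5; q=0.5   S(t_(i-1))^p (1-S(t_(i-1)))^q       2.47      0.66   9.28   0.002
Fleming-Harrington p=0.5; q=2     S(t_(i-1))^p (1-S(t_(i-1)))^q       0.32      0.01   8.18   0.004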

11 Notice the Differences!
Situation of varying inference
Need to be sure you are testing what you think you are testing
Check
– Look at the hazards
– Do they cross?
Problem
– Estimating hazards is imprecise (as we’ve discussed)

12 Cumulative Hazards

13 Hazard Rate (smoothing spline)

18 Misconception
Whether the survival curves cross tells us whether the log-rank test is appropriate
Not true
– Whether survival curves cross depends on censoring and study duration
– What if they would cross, but we don’t look far enough out?
Consider
– Survival curves cross → hazards cross
– Hazards cross → survival curves may or may not cross
Solution?
– Test regions of t
– Test before and after the crossing point, identified by looking at the hazards
– Some tests allow for crossing (Yang and Prentice 2005)
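
A short sketch of why the first implication holds and the second does not, using the usual relation between the survival function and the cumulative hazard. The crossing time t* is notation introduced here for illustration.

```latex
S_j(t) = \exp\{-H_j(t)\}, \qquad H_j(t) = \int_0^t h_j(u)\,du

% If the survival curves meet at some t^* > 0:
S_1(t^*) = S_2(t^*)
\;\Longrightarrow\;
\int_0^{t^*} \big[h_1(u) - h_2(u)\big]\,du = 0
```

The integrand cannot be strictly positive (or strictly negative) over all of (0, t*), so unless the hazards coincide, h_1 − h_2 must change sign before t*. The converse fails: the hazards can cross while H_1(t) < H_2(t) still holds at every t within follow-up, in which case the survival curves never meet.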

19 Take-home
Choice of weight function can be critical
K&M recommend applying log-rank and Gehan
Cox regression (simple) is akin to log-rank
Think carefully about the distribution of weights and about possible crossing of hazards

20 What About Weights…
We know that R offers a limited selection of weights.
SAS doesn’t seem to allow us to specify any weights (at least not in proc lifetest)
So of course we can write our own function…

21 R Function for Different Weights What information will we need to construct the different weights? Can we get this information from R?

22 Building Our R Function
> times<-kidney$Time
> cens<-kidney$d
> grp<-kidney$cath
> fit<-survfit(Surv(times, cens)~1)
> tm<-summary(fit)$time
> Yi<-fit$n.risk[which(fit$time%in%tm)]
> di<-fit$n.event[which(fit$time%in%tm)]
> Yi
 [1] 119 103 98 89 79 73 66 55 49 45 40 25 23 20 9 5
> di
 [1] 6 1 2 2 2 1 1 2 1 1 1 2 1 1 1 1

23
> st<-Surv(times, cens)
> fit<-survfit(st~kidney$cath)
> summary(fit)
Call: survfit(formula = st ~ kidney$cath)
                kidney$cath=1
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
  1.5     43       1    0.977  0.0230       0.9327        1.000
  3.5     40       1    0.952  0.0329       0.8899        1.000
  4.5     36       2    0.899  0.0478       0.8104        0.998
  5.5     33       1    0.872  0.0536       0.7732        0.984
  8.5     25       2    0.802  0.0683       0.6790        0.948
  9.5     22       1    0.766  0.0743       0.6332        0.926
 10.5     20       1    0.728  0.0799       0.5868        0.902
 11.5     18       1    0.687  0.0851       0.5392        0.876
 15.5     11       1    0.625  0.0976       0.4599        0.849
 16.5     10       1    0.562  0.1060       0.3886        0.813
 18.5      9       1    0.500  0.1111       0.3233        0.773
 23.5      4       1    0.375  0.1366       0.1835        0.766
 26.5      2       1    0.187  0.1491       0.0394        0.891
                kidney$cath=2
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
  0.5     76       6    0.921  0.0309        0.862        0.984
  2.5     56       2    0.888  0.0376        0.817        0.965
  3.5     49       1    0.870  0.0409        0.793        0.954
  6.5     35       1    0.845  0.0467        0.758        0.942
 15.5     14       1    0.785  0.0726        0.655        0.941
> names(fit)
 [1] "n" "time" "n.risk" "n.event" "n.censor" "surv" "type" "strata" "std.err" "upper"
[11] "lower" "conf.type" "conf.int" "call"

24 Building Our R Function
> names(fit)
 [1] "n" "time" "n.risk" "n.event" "n.censor" "surv" "type" "strata" "std.err" "upper"
[11] "lower" "conf.type" "conf.int" "call"
> fit$n.risk
 [1] 43 42 40 36 33 31 29 25 22 20 18 16 14 13 11 10 9 8 6 4 3 2 1 76 60 56 49 43 40 35 33 30 27
[34] 25 22 20 16 14 13 11 10 7 6 5 4 3 1
> fit$n.event
 [1] 1 0 1 2 1 0 0 2 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 6 0 2 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
> names(summary(fit))
 [1] "n" "time" "n.risk" "n.event" "n.censor" "surv" "type" "strata" "std.err" "upper"
[11] "lower" "conf.type" "conf.int" "call" "table"
> summary(fit)$n.risk
 [1] 43 40 36 33 25 22 20 18 11 10 9 4 2 76 56 49 35 14
> summary(fit)$n.event
 [1] 1 1 2 1 2 1 1 1 1 1 1 1 1 6 2 1 1 1

25 We still need to think about how to estimate Y_i1 and d_i1 for all times where ≥ 1 event occurs in the pooled sample
– including times where group 1 has no event of its own (only censoring, or nothing at all)
We can certainly construct a risk set using what we get out of R.
– Recall how we find the risk set…
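
To make the risk-set recall concrete, here is a minimal base-R sketch using the toy data from the two-sample slide. The helper name `risk_and_events` is made up for this illustration. The risk set for group g at time t_i is everyone in that group whose observed time is at least t_i, whether or not group g has an event at t_i.

```r
# Toy data from the two-sample example (1 = event, 0 = censored)
time <- c(3, 6, 9, 9, 11, 16, 8, 9, 10, 12, 19, 23)
cens <- c(1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0)
grp  <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)

# Distinct event times in the pooled sample
tm <- sort(unique(time[cens == 1]))

# Hypothetical helper: risk-set size Y and event count d for group g at time tk
risk_and_events <- function(tk, g) {
  c(Y = sum(time >= tk & grp == g),
    d = sum(time == tk & cens == 1 & grp == g))
}

# One row per pooled event time, for group 1
tab1 <- t(sapply(tm, risk_and_events, g = 1))
rownames(tab1) <- tm
tab1
```

Group 1 has no event at t = 8 or t = 19, but it still gets a row at those times because group 2 does.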

26 Building Our R Function
> dat<-cbind(times, cens)[which(grp==1),]
> yij<-dij<-c()
> for (i in 1:length(tm)) {
    tmi<-tm[i]
    yij<-append(yij, length(which(dat[,1]>=tmi)))
    dij<-append(dij, sum(dat[which(dat[,1]==tmi),2]))
  }
> yij
 [1] 43 43 42 40 36 33 31 25 22 20 18 11 10 9 4 2
> dij
 [1] 0 1 0 1 2 1 0 2 1 1 1 1 1 1 1 1
-We need to estimate the weights so we can construct the weighted versions

27 Test Statistic
-We have all the parts we need to construct the “constant” portion of our test statistic.
> OmEi<-dij-yij*(di/Yi)
> vi<-(yij/Yi)*(1-yij/Yi)*((Yi-di)/(Yi-1))*di
> round(OmEi, 3)
 [1] -2.168 0.583 -0.857 0.101 1.089 0.548 -0.470 1.091 0.551
[10] 0.556 0.550 0.120 0.565 0.550 0.556 0.600
> round(vi, 3)
 [1] 1.326 0.243 0.485 0.489 0.490 0.248 0.249 0.487 0.247 0.247 0.248 0.472 0.246 0.248 0.247 0.240
-Now we need to estimate the weights so we can construct the weighted versions…

28 Different Weight Functions
#generating weights
Sim1<-c(1,fit$surv[which(fit$time%in%tm)][1:(length(tm)-1)])
if (wt=="lr") Wti<-rep(1, length(tm))
if (wt=="geh") Wti<-Yi
if (wt=="tw") Wti<-sqrt(Yi)
if (wt=="pp") Wti<-cumprod(1-di/(Yi+1))
if (wt=="mpp") Wti<-cumprod(1-di/(Yi+1))*Yi/(Yi+1)
if (wt=="fh") {
  if(missing(p) | missing(q)) stop("Use of Fleming-Harrington Weights requires values for p and q")
  else Wti<-Sim1^p*(1-Sim1)^q
}
#Example Using the Gehan weight
> wt="geh"
> if (wt=="geh") Wti<-Yi
> Wti
 [1] 119 103 98 89 79 73 66 55 49 45 40 25 23 20 9 5

29 Final Calculations
#Apply the chosen weight to our test statistic and its variance
> OmE<-as.numeric(t(Wti)%*%OmEi)
> v<-as.numeric(t(Wti^2)%*%vi)
> tstat<-OmE^2/v
> pval<-pchisq(tstat, df=1, lower.tail=F)
> OmE
[1] -9
> v
[1] 38861.81
> tstat
[1] 0.002084309
> pval
[1] 0.9635858
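
The calculation above can be checked without the kidney data file. The sketch below is base R only: it retypes the Yi, di, yij and dij vectors printed on the earlier slides and applies two of the weights. The helper name `weighted_test` is made up for this illustration and is not part of the lecture's function.

```r
# Per-event-time quantities copied from the kidney example output
Yi  <- c(119, 103, 98, 89, 79, 73, 66, 55, 49, 45, 40, 25, 23, 20, 9, 5)
di  <- c(6, 1, 2, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1)
yij <- c(43, 43, 42, 40, 36, 33, 31, 25, 22, 20, 18, 11, 10, 9, 4, 2)
dij <- c(0, 1, 0, 1, 2, 1, 0, 2, 1, 1, 1, 1, 1, 1, 1, 1)

# Unweighted observed-minus-expected and variance terms at each event time
OmEi <- dij - yij * (di / Yi)
vi   <- (yij / Yi) * (1 - yij / Yi) * ((Yi - di) / (Yi - 1)) * di

# Hypothetical helper: apply a weight vector and return the test summary
weighted_test <- function(Wti) {
  OmE   <- sum(Wti * OmEi)
  v     <- sum(Wti^2 * vi)
  chisq <- OmE^2 / v
  c(Z = OmE, var = v, chisq = chisq,
    p = pchisq(chisq, df = 1, lower.tail = FALSE))
}

res <- rbind(logrank = weighted_test(rep(1, length(Yi))),
             gehan   = weighted_test(Yi))
round(res, 3)
```

The log-rank row should agree with the sums in the log-rank table (about 3.964 and 6.211), and the Gehan row with the values printed above (−9 and about 38861.81).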

30
library(survival)
survdiff_wts<-function(times, cens, grp, wt, p, q) {
  #pooled fit: event times, risk sets and event counts
  fit<-survfit(Surv(times, cens)~1)
  tm<-summary(fit)$time
  Yi<-fit$n.risk[which(fit$time%in%tm)]
  di<-fit$n.event[which(fit$time%in%tm)]
  #risk sets and event counts for group 1 at each pooled event time
  dat<-cbind(times, cens)[which(grp==1),]
  yij<-dij<-c()
  for (i in 1:length(tm)) {
    tmi<-tm[i]
    yij<-append(yij, length(which(dat[,1]>=tmi)))
    dij<-append(dij, sum(dat[which(dat[,1]==tmi),2]))
  }
  #unweighted observed-minus-expected and variance terms
  OmEi<-dij-yij*(di/Yi)
  vi<-(yij/Yi)*(1-yij/Yi)*((Yi-di)/(Yi-1))*di
  #generating weights
  Sim1<-c(1,fit$surv[which(fit$time%in%tm)][1:(length(tm)-1)])
  if (wt=="lr") Wti<-rep(1, length(tm))
  if (wt=="geh") Wti<-Yi
  if (wt=="tw") Wti<-sqrt(Yi)
  if (wt=="pp") Wti<-cumprod(1-di/(Yi+1))
  if (wt=="mpp") Wti<-cumprod(1-di/(Yi+1))*Yi/(Yi+1)
  if (wt=="fh") {
    if(missing(p) | missing(q)) stop("Use of Fleming-Harrington Weights requires values for p and q")
    else Wti<-Sim1^p*(1-Sim1)^q
  }
  #weighted test statistic, its variance and the chi-square test
  OmE<-as.numeric(t(Wti)%*%OmEi)
  v<-as.numeric(t(Wti^2)%*%vi)
  tstat<-OmE^2/v
  pval<-pchisq(tstat, df=1, lower.tail=F)
  ans<-list(weights=Wti, Z_tau=OmE, sig_11=v, chisq=tstat, pval=pval)
  names(ans)<-c("Weights", "Z_tau","sig_11","chisq value","pvalue")
  return(ans)
}

31 Larynx Cancer
90 patients diagnosed with larynx cancer (1970’s)
Patients classified according to disease stage
– Stages I-IV
We are interested in survival BUT we want to compare the four stages

32 Kaplan-Meier curves

33 R: survdiff
>lar<-read.csv("H:public.html\\BMTRY_722_Summer2015\\Date\\larynx.csv")
>time<-lar$time; death<-lar$death; stage<-lar$stage
>st<-Surv(time, death)
> test0<-survdiff(st~stage)
> test0
Call: survdiff(formula = st ~ stage)
         N Observed Expected (O-E)^2/E (O-E)^2/V
stage=1 33       15    22.57     2.537     4.741
stage=2 17        7    10.01     0.906     1.152
stage=3 27       17    14.08     0.603     0.856
stage=4 13       11     3.34    17.590    19.827
Chisq= 22.8 on 3 degrees of freedom, p= 4.53e-05

34 R: survdiff
> test1<-survdiff(st~stage, rho=1)
> test1
Call: survdiff(formula = st ~ stage, rho=1)
…
Chisq= 23.1 on 3 degrees of freedom, p= 3.85e-05
> test2<-survdiff(st~stage, rho=3)
> test2
Call: survdiff(formula = st ~ stage, rho=3)
…
Chisq= 21.8 on 3 degrees of freedom, p= 7.03e-05
Recall: W(t_i) = Y(t_i) S(t_(i-1))^p (1 - S(t_(i-1)))^q
In survdiff, rho plays the role of p with q = 0: rho=0 is the log-rank test and rho=1 is the Peto-Peto modification of the Gehan-Wilcoxon test.

35 What about our hazards

36 R: survdiff
#Pairwise comparisons of stages
> test3<-survdiff(st[stage<3]~stage[stage<3])            #stage 1 vs 2
Chisq= 0 on 1 degrees of freedom, p= 0.866
> test4<-survdiff(st~factor(stage, exclude=c(2,4)))      #stage 1 vs 3
Chisq= 3.1 on 1 degrees of freedom, p= 0.0801
> test5<-survdiff(st~factor(stage, exclude=c(2,3)))      #stage 1 vs 4
Chisq= 23.4 on 1 degrees of freedom, p= 1.32e-06
> test6<-survdiff(st~factor(stage, exclude=c(1,4)))      #stage 2 vs 3
Chisq= 1.5 on 1 degrees of freedom, p= 0.266
> test7<-survdiff(st~factor(stage, exclude=c(1,3)))      #stage 2 vs 4
Chisq= 11.5 on 1 degrees of freedom, p= 0.000679
> test8<-survdiff(st[stage>2]~stage[stage>2])            #stage 3 vs 4
Chisq= 0.5 on 1 degrees of freedom, p= 0.769

37 What about the differences
Not much evidence of hazards crossing
If there isn’t overlap, then tests will be somewhat consistent
Log-rank: most appropriate when hazards are proportional

38 Test For Trends
We generally perform tests of trends for ordinal variables
– Dose level
– PSA categories (prostate cancer)
– Cancer stage
Different than treating the variable as continuous, although that is one ‘accepted’ approach
For continuous covariates, we need a regression model (we will get there shortly)

39 Formally Tests for trends
Our hypothesis is
Any weight function discussed previously can be used
Test statistic:

40 Formally Tests for trends
a_j: weights, often chosen as a_j = j but can be user specified
σ_jg: (j, g)th element of the variance-covariance matrix of Z_j(τ)
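
The hypothesis and test statistic for the two slides above were lost from the transcript. The standard form, written to match the a_j and σ_jg defined on slide 40 and the R function on slide 42, is sketched below. Treat it as a reconstruction rather than the slide's exact typesetting.

```latex
H_0: h_1(t) = h_2(t) = \dots = h_K(t) \quad \text{for all } t \le \tau
\\
H_A: h_1(t) \le h_2(t) \le \dots \le h_K(t) \quad
  \text{for all } t \le \tau, \text{ with at least one strict inequality}

Z = \frac{\sum_{j=1}^{K} a_j\, Z_j(\tau)}
         {\sqrt{\sum_{j=1}^{K}\sum_{g=1}^{K} a_j\, a_g\, \hat{\sigma}_{jg}}}
```

Under H_0, Z is approximately standard normal for large samples. The R function on slide 42 reports a two-sided p-value.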

41 Stage: Ordinal Categories

42 Trend Test in R
#Test Trend in R
#test: a fitted survdiff object; wt: the scores a_j, one per group
surv.trendtest<-function(test, wt) {
  aj<-wt
  zj<-test$obs-test$exp   #Z_j(tau): observed minus expected for each group
  zv<-test$var            #variance-covariance matrix of the Z_j(tau)
  num<-sum(aj*zj)
  den<-0
  for (i in 1:length(aj)) {
    for (g in 1:length(aj)) {
      den<-den+aj[i]*aj[g]*zv[i,g]
    }
  }
  den<-sqrt(den)
  zz<-num/den
  pval<-2*(1-pnorm(abs(zz)))
  return(list(Z=zz, pvalue=pval))
}

43 Trend Test in R
> test.t0<-surv.trendtest(test=test0, wt=1:4)
> test.t0
$Z
[1] 3.718959
$pvalue
[1] 0.0002000459
> test.t1<-surv.trendtest(test=test1, wt=1:4)
> test.t1
$Z
[1] 4.120055
$pvalue
[1] 3.787827e-05
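
Since the larynx file is not included here, the following is a self-contained base-R sketch of the same trend calculation on the toy data from the two-sample slide. It builds the Z_j(τ) vector and its variance-covariance matrix by hand with log-rank weights. The variable names are illustrative, not from the lecture. With K = 2 and scores a_j = 1, 2, the squared trend statistic should equal the log-rank chi-square, which makes a convenient sanity check.

```r
# Toy data from the two-sample example (1 = event, 0 = censored)
time <- c(3, 6, 9, 9, 11, 16, 8, 9, 10, 12, 19, 23)
cens <- c(1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0)
grp  <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)

groups <- sort(unique(grp))
K  <- length(groups)
tm <- sort(unique(time[cens == 1]))   # distinct pooled event times

zj <- numeric(K)        # Z_j(tau) with log-rank weights W(t_i) = 1
zv <- matrix(0, K, K)   # variance-covariance matrix of the Z_j(tau)
for (tk in tm) {
  Yi  <- sum(time >= tk)
  di  <- sum(time == tk & cens == 1)
  Yij <- sapply(groups, function(g) sum(time >= tk & grp == g))
  dij <- sapply(groups, function(g) sum(time == tk & cens == 1 & grp == g))
  zj  <- zj + dij - Yij * di / Yi
  if (Yi > 1) {
    pj <- Yij / Yi
    # diagonal: pj(1 - pj); off-diagonal: -pj * pg
    zv <- zv + (diag(pj, K) - outer(pj, pj)) * di * (Yi - di) / (Yi - 1)
  }
}

aj   <- seq_len(K)                                  # scores a_j = j
Z    <- sum(aj * zj) / sqrt(drop(aj %*% zv %*% aj)) # trend statistic
pval <- 2 * pnorm(-abs(Z))                          # two-sided p-value
chisq_lr <- zj[1]^2 / zv[1, 1]                      # two-sample log-rank
c(Z = Z, Z_squared = Z^2, logrank_chisq = chisq_lr, pvalue = pval)
```

The Z_j(τ) sum to zero here, as the linear-dependence slides say they must, and chisq_lr matches the 1.62 reported by survdiff for these data.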

44 Next Time…
Stratified tests
Other K-sample tests

