Detecting Spatial Clustering in Matched Case-Control Studies Andrea Cook, MS Collaboration with: Dr. Yi Li November 4, 2004.


1 Detecting Spatial Clustering in Matched Case-Control Studies Andrea Cook, MS Collaboration with: Dr. Yi Li November 4, 2004

2 Outline 1. Motivation: petrochemical exposure in relation to childhood brain cancer and leukemia 2. Cumulative Geographic Residuals: unconditional; conditional 3. Simulation Results: type I error; power calculations 4. Application: childhood leukemia; childhood brain cancer 5. Software 6. Discussion: limitations; future research

3 Taiwan Petrochemical Study Matched case-control study: 3 controls per case, matched on age and gender. Participants resided in one of 26 of the 38 administrative districts of Kaohsiung County, Taiwan. Controls were selected using national identity numbers (not dependent on location).

4 Study Population Due to dropout, approximately 50% of matched sets have 3-to-1 matching, 40% have 2-to-1 matching, and 10% have 1-to-1 matching. Leukemia: 121 cases, 287 controls. Brain cancer: 111 cases, 259 controls.

5 Map of Kaohsiung

6 Cumulative Residuals Unconditional (independence): model definition using logistic regression; extension to cluster detection. Conditional (matched design): model definition using conditional logistic regression; extension to cluster detection.

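7 Logistic Model Assume the logistic model $g(\mu_i) = \beta^\top X_i$, where $\mu_i = E(Y_i \mid X_i) = P(Y_i = 1 \mid X_i)$ and the link function is $g(\mu) = \log\{\mu / (1 - \mu)\}$. Therefore the likelihood score function for $\beta$ is $U(\beta) = \sum_{i=1}^{n} X_i \{Y_i - \mu_i(\beta)\}$, with information matrix $I(\beta) = \sum_{i=1}^{n} \mu_i(\beta)\{1 - \mu_i(\beta)\} X_i X_i^\top$.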

8 Residual Formulation Then define a residual as $e_i = Y_i - \hat\mu_i$, where $\hat\mu_i$ is the fitted probability $P(Y_i = 1 \mid X_i)$ evaluated at $\hat\beta$, the solution to the likelihood score equation. If the model is correctly specified, there should be no pattern in the residuals, so the residuals can be used to test for misspecification. (Cumulative Residuals for Model Checking; Lin, Wei, Ying 2002)
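
A minimal sketch of the residual computation, assuming an ordinary logistic model fitted by Newton-Raphson and residuals defined as observed outcome minus fitted probability. The function names, the simulated data, and the coefficient values are illustrative assumptions, not part of the talk.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson for logistic regression (X should include an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))            # fitted probabilities
        score = X.T @ (y - mu)                          # likelihood score
        info = (X * (mu * (1.0 - mu))[:, None]).T @ X   # information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def logistic_residuals(X, y, beta):
    """Residual e_i = Y_i - fitted probability."""
    return y - 1.0 / (1.0 + np.exp(-X @ beta))

# Illustrative data (assumed, not from the study)
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic(X, y)
e = logistic_residuals(X, y, beta_hat)
```

Because the fitted coefficients solve the score equation, the residuals are orthogonal to every column of the design matrix; any remaining structure has to show up in variables that were not modeled, such as location.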

9 Hypothesis Test Hypothesis of interest: geographic location $(r_i, t_i)$ is independent of outcome $Y_i \mid X_i$. Under this hypothesis the Cumulative Geographic Residual Moving Block Process is patternless.

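10 Unconditional Cluster Detection Define the Cumulative Geographic Residual Moving Block Process as the cumulative sum of the residuals over the subjects whose locations $(r_i, t_i)$ fall inside a block that moves across the study region.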
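
The formula on this slide is not reproduced in the transcript. As a hedged sketch, one form consistent with the name is $W(r, t) = n^{-1/2} \sum_i I(|r_i - r| \le b,\ |t_i - t| \le b)\, e_i$, the scaled sum of residuals inside a square block of half-width $b$ centered at $(r, t)$. The square block shape, the half-width `b`, and the $n^{-1/2}$ scaling are assumptions here, as is the function name.

```python
import numpy as np

def moving_block_process(r, t, e, centers, b):
    """Scaled sum of residuals inside a square block around each center.

    r, t    : subject coordinates (length n)
    e       : residuals (length n)
    centers : array of shape (K, 2) with block centers
    b       : block half-width (assumed block shape: axis-aligned square)
    """
    r, t, e = (np.asarray(a, dtype=float) for a in (r, t, e))
    centers = np.asarray(centers, dtype=float)
    inside = (
        (np.abs(r[None, :] - centers[:, [0]]) <= b)
        & (np.abs(t[None, :] - centers[:, [1]]) <= b)
    ).astype(float)                      # K x n block-membership matrix
    return inside @ e / np.sqrt(len(e))

# Toy example: two subjects near (1, 1.5), one near (9, 9)
W = moving_block_process(
    r=[1.0, 1.0, 9.0],
    t=[1.0, 2.0, 9.0],
    e=[0.5, 0.25, -0.75],
    centers=[[1.0, 1.5], [9.0, 9.0]],
    b=1.0,
)
```

Large positive values of the process flag blocks with more cases than the model predicts; large negative values flag blocks with fewer.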

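11 Asymptotic Distribution The asymptotic distribution of the moving block process is difficult to simulate directly. However, it has been shown to be equivalent to the distribution, conditional on the observed data, of a perturbed version of the process in which the only randomness comes from simulated variables $G_i$ that are independent of the observed data.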

12 Significance Test Testing the null: simulate N realizations of the perturbed process by repeatedly simulating $G_i$ while fixing the data at their observed values, then calculate the p-value.
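
A sketch of how such a resampling test might be coded for the unconditional model. Several pieces are assumptions rather than details from the talk: the square blocks of half-width `b`, standard normal $G_i$, the supremum of the absolute process as the test statistic, and the correction term for estimating $\beta$, which is obtained here from a first-order Taylor expansion of the residuals around the fitted coefficients.

```python
import numpy as np

def cgr_test(X, y, r, t, centers, b, n_sim=1000, seed=0):
    """Perturbation test for spatial pattern in logistic residuals (illustrative)."""
    rng = np.random.default_rng(seed)
    r, t = np.asarray(r, dtype=float), np.asarray(t, dtype=float)
    centers = np.asarray(centers, dtype=float)
    n = len(y)

    # Fit the logistic model by Newton-Raphson
    beta = np.zeros(X.shape[1])
    for _ in range(50):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        v = mu * (1.0 - mu)
        step = np.linalg.solve((X * v[:, None]).T @ X, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    v = mu * (1.0 - mu)
    e = y - mu
    info = (X * v[:, None]).T @ X

    # K x n block-membership matrix
    B = ((np.abs(r[None, :] - centers[:, [0]]) <= b)
         & (np.abs(t[None, :] - centers[:, [1]]) <= b)).astype(float)

    # Observed process and its supremum
    stat = np.max(np.abs(B @ e)) / np.sqrt(n)

    # Weights that account for beta being estimated (first-order expansion)
    A = B - (B * v) @ X @ np.linalg.solve(info, X.T)

    # Perturb the residuals with N(0, 1) multipliers, data held fixed
    G = rng.standard_normal((n_sim, n))
    sims = (G * e) @ A.T / np.sqrt(n)            # n_sim x K
    sim_stats = np.max(np.abs(sims), axis=1)
    p_value = float(np.mean(sim_stats >= stat))
    return float(stat), p_value

# Illustrative data with no spatial effect (assumed values)
rng = np.random.default_rng(42)
n = 300
r = rng.uniform(0, 10, n)
t = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ np.array([-0.5, 0.7]))))).astype(float)
grid = np.arange(1.0, 10.0, 2.0)
centers = np.array([(a, c) for a in grid for c in grid])

stat, p_value = cgr_test(X, y, r, t, centers, b=1.5, n_sim=500, seed=1)
```

The p-value is the proportion of simulated suprema that are at least as large as the observed one. Only the multipliers are redrawn between realizations, so the model is fitted once.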

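13 Conditional Logistic Model Type of matching: 1 case to $M_s$ controls. Data structure: $(Y_{is}, X_{is}, r_{is}, t_{is})$ for subject $i$ in stratum $s$. Assume that, conditional on $\alpha_s$, an unobserved stratum-specific intercept, and given the logit link, $\operatorname{logit} P(Y_{is} = 1 \mid X_{is}, \alpha_s) = \alpha_s + \beta^\top X_{is}$. The conditional likelihood, conditioning on $\sum_i Y_{is} = 1$ in each stratum, is $L_c(\beta) = \prod_{s} \prod_{i} \{\exp(\beta^\top X_{is}) / \sum_{j} \exp(\beta^\top X_{js})\}^{Y_{is}}$.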

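14 Score and Information Denote the conditional likelihood score as $U_c(\beta) = \sum_{s} \sum_{i} X_{is} \{Y_{is} - p_{is}(\beta)\}$, where $p_{is}(\beta) = \exp(\beta^\top X_{is}) / \sum_{j} \exp(\beta^\top X_{js})$, with information matrix $I_c(\beta) = \sum_{s} \{\sum_{i} p_{is}(\beta) X_{is} X_{is}^\top - \bar X_s(\beta) \bar X_s(\beta)^\top\}$, where $\bar X_s(\beta) = \sum_{i} p_{is}(\beta) X_{is}$.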

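15 Conditional Residual Then define a residual as $e_{is} = Y_{is} - \hat p_{is}$, where $\hat p_{is}$ is the fitted conditional probability that subject $i$ is the case in stratum $s$, evaluated at $\hat\beta$, the solution to the conditional score equation. These residuals are correlated within each stratum; use them to test for patterns based on location.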
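
A minimal sketch of conditional logistic residuals for 1-to-M matched sets, assuming the residual is the case indicator minus the fitted within-stratum probability of being the case. The function names, the Newton-Raphson fit, and the simulated data are illustrative assumptions; strata are assumed to be coded as integers 0, 1, ..., S-1.

```python
import numpy as np

def conditional_probs(X, strata, beta):
    """p_is = exp(beta'X_is) / sum_j exp(beta'X_js), within each stratum."""
    eta = X @ beta
    w = np.exp(eta - eta.max())          # common shift cancels in the ratio
    denom = np.zeros(strata.max() + 1)
    np.add.at(denom, strata, w)          # per-stratum sums
    return w / denom[strata]

def fit_conditional_logistic(X, y, strata, n_iter=50, tol=1e-10):
    """Newton-Raphson on the conditional likelihood (X has no intercept column)."""
    n_strata, dim = strata.max() + 1, X.shape[1]
    beta = np.zeros(dim)
    for _ in range(n_iter):
        p = conditional_probs(X, strata, beta)
        score = X.T @ (y - p)
        xbar = np.zeros((n_strata, dim))
        np.add.at(xbar, strata, p[:, None] * X)     # weighted stratum means
        info = (X * p[:, None]).T @ X - xbar.T @ xbar
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Illustrative matched data: 200 strata, 1 case and 3 controls each (assumed)
rng = np.random.default_rng(1)
S, m = 200, 4
strata = np.repeat(np.arange(S), m)
X = rng.normal(size=(S * m, 1))
p_true = conditional_probs(X, strata, np.array([0.8]))
y = np.zeros(S * m)
for s in range(S):
    idx = np.flatnonzero(strata == s)
    y[rng.choice(idx, p=p_true[idx])] = 1.0   # pick the case within the stratum

beta_hat = fit_conditional_logistic(X, y, strata)
e = y - conditional_probs(X, strata, beta_hat)
```

Each stratum has exactly one case and its fitted probabilities sum to one, so the residuals sum to zero within every stratum. That constraint is the source of the within-stratum correlation.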

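16 Conditional Cumulative Residual Define the Conditional Cumulative Residual Moving Block Process analogously, by accumulating the conditional residuals over the subjects located inside a moving block. This process has been shown to be asymptotically equivalent to a perturbed process built from simulated variables $G_{is}$ that are independent of the observed data.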

17 Significance Test Testing the null: simulate N realizations of the perturbed conditional process by repeatedly simulating $G_{is}$ while fixing the data at their observed values, then calculate the p-value.

18 Simulation Choice of $G_i$ or $G_{is}$: unconditional (normal, discrete); conditional (normal, discrete), with 1-to-1, 2-to-1, and 3-to-1 matching. Outcomes assessed: type I error and power.

19 Type I error Unconditional: generate N coordinates $x_i$ and $y_i$ from Unif(0,10); type I error is the percentage of simulated data sets in which a significant cluster is found. Conditional: generate N coordinates $x_{is}$ and $y_{is}$ from Unif(0,10); type I error is computed in the same way.
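
A sketch of how the unconditional type I error simulation might be set up: locations are drawn from Unif(0,10), outcomes are generated independently of location, and the rejection rate of a resampling test is recorded. The test used below (supremum over square blocks, normal multipliers, first-order correction for estimating the coefficients), the sample sizes, the block grid, and the covariate model are all assumptions for illustration; the conditional case is not covered.

```python
import numpy as np

def sup_test_pvalue(X, y, B, rng, n_sim=200):
    """P-value of a sup-type perturbation test on logistic residuals (illustrative)."""
    n = len(y)
    beta = np.zeros(X.shape[1])
    for _ in range(50):                               # Newton-Raphson fit
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        v = mu * (1.0 - mu)
        step = np.linalg.solve((X * v[:, None]).T @ X, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    v = mu * (1.0 - mu)
    e = y - mu
    info = (X * v[:, None]).T @ X
    A = B - (B * v) @ X @ np.linalg.solve(info, X.T)  # accounts for estimated beta
    stat = np.max(np.abs(B @ e)) / np.sqrt(n)
    sims = (rng.standard_normal((n_sim, n)) * e) @ A.T / np.sqrt(n)
    return np.mean(np.max(np.abs(sims), axis=1) >= stat)

def type_one_error(n=100, n_rep=200, alpha=0.05, b=1.5, seed=0):
    """Fraction of null data sets in which the test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    grid = np.arange(1.0, 10.0, 2.0)
    centers = np.array([(a, c) for a in grid for c in grid])
    rejections = 0
    for _ in range(n_rep):
        r = rng.uniform(0, 10, n)                     # locations, Unif(0,10)
        t = rng.uniform(0, 10, n)
        x = rng.normal(size=n)                        # covariate, unrelated to location
        X = np.column_stack([np.ones(n), x])
        prob = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x)))
        y = rng.binomial(1, prob).astype(float)       # outcome independent of location
        B = ((np.abs(r[None, :] - centers[:, [0]]) <= b)
             & (np.abs(t[None, :] - centers[:, [1]]) <= b)).astype(float)
        rejections += int(sup_test_pvalue(X, y, B, rng) < alpha)
    return rejections / n_rep

rate = type_one_error()
```

With a well-calibrated test, `rate` should land near `alpha`, up to Monte Carlo error from the finite number of replicates and resamples.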

20 Type I error Panels: Unconditional; Conditional.

21 Power Calculations Two power calculations, on a 4 x 4 grid of cells numbered 1 to 16 (rows from top to bottom): 13 14 15 16 / 9 10 11 12 / 5 6 7 8 / 1 2 3 4

22 Power Calculations Single hotspot. Grid cells (rows from top to bottom): 13 14 15 16 / 9 10 11 12 / 5 6 7 8 / 1 2 3 4

23 Power Calculations Multiple hotspots. Grid cells (rows from top to bottom): 13 14 15 16 / 9 10 11 12 / 5 6 7 8 / 1 2 3 4
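
A sketch of how hotspot data for a power simulation might be generated on this grid. It assumes the 4 x 4 grid partitions the Unif(0,10) square into 2.5 x 2.5 cells numbered from the bottom-left. Which cells are hotspots and how strong the effect is are not recoverable from the transcript, so `hotspot_cells`, `base_logit`, and `log_or` below are hypothetical.

```python
import numpy as np

def cell_number(x, y, side=10.0, k=4):
    """Map coordinates in [0, side]^2 to cell numbers 1..k*k (1 = bottom-left)."""
    width = side / k
    col = np.minimum((np.asarray(x) // width).astype(int), k - 1)
    row = np.minimum((np.asarray(y) // width).astype(int), k - 1)
    return row * k + col + 1

def simulate_hotspot(n, hotspot_cells, base_logit=-1.0, log_or=1.0, seed=0):
    """Draw locations uniformly and raise the log-odds of being a case in hotspot cells."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, 10, n)
    y = rng.uniform(0, 10, n)
    cells = cell_number(x, y)
    in_hotspot = np.isin(cells, hotspot_cells)
    prob = 1.0 / (1.0 + np.exp(-(base_logit + log_or * in_hotspot)))
    case = rng.binomial(1, prob)
    return x, y, cells, case

# Hypothetical single hotspot in cell 6 and multiple hotspots in cells 6 and 11
x1, y1, cells1, case1 = simulate_hotspot(400, hotspot_cells=[6], seed=1)
x2, y2, cells2, case2 = simulate_hotspot(400, hotspot_cells=[6, 11], seed=2)
```

Power would then be estimated as the fraction of such data sets in which the cluster test rejects, in the same way as the type I error but with the hotspot effect switched on.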

24 Power Calculations Panels: Unconditional; Conditional.

25 Application Study: Kaohsiung, Taiwan matched case-control study. Method: Conditional Cumulative Geographic Residual Test (normal and mixed discrete).

26 Results Table: odds ratios (p-values). Marginally significant clustering for both outcomes without adjusting for smoking history.

27 Childhood Leukemia

28 Childhood Brain Cancer

29 Software R macro to handle both unconditional and conditional data. Dataset: X and Y coordinates of each participant; case/control variable; covariate matrix; stratum variable for conditional data. Takes just a few minutes to run!

30 Discussion Cumulative Geographic Residuals: unconditional and conditional methods for binary outcomes; can find multiple significant hotspots while holding type I error at appropriate levels; not computationally intensive compared to other cluster detection methods. Taiwan study: found a possible relationship between childhood leukemia and petrochemical exposure, but not for childhood brain cancer.

31 Discussion Future research: failure time data; recurrent events; relocation of study participants; surveillance.

