Study Design in Molecular Epidemiology of Cancer Epi243 Zuo-Feng Zhang, MD, PhD.

Objectives of Molecular Epidemiology
To gain knowledge about the distribution and determinants of disease occurrence and outcome that may be applied to reduce the frequency and impact of disease in human populations.

Cross-Sectional Studies
Relate an exposure (from a questionnaire or measured exposure markers) to a biomarker end-point in a healthy population, e.g., environmental tobacco smoke (ETS) exposure and serum nicotine metabolites (cotinine levels).
Biological markers are sometimes used to validate questionnaire data.

Case-Control Studies
The disease end-point is the major interest.
Studies may be clinic (hospital)-based or population-based.
Both questionnaire data and biological specimens are collected.
Biological markers can be measured and compared between cases and controls, while other variables are treated as either confounding factors or effect modifiers.

Case-Control Studies
In population-based studies, collecting biological material for such markers is feasible but logistically more complex.
For early biological markers, collecting material (e.g., pre-cancerous lesions) is logistically feasible in a hospital setting but becomes more difficult in a population setting.

Case-Cohort Study Design
Specimens are collected at baseline for the entire cohort, and then from cases as they occur.
The biomarker is measured in the newly collected case specimens, with the baseline cohort specimens serving as controls.
Because case and control specimens are taken at different times, bias will be introduced if sample degradation or laboratory drift occurs over time.

Case-Control Study
For genetic susceptibility markers, the case-control design is highly appropriate.
Clinic-based case-control studies are particularly suitable for studies of intermediate end-points, as these end-points can be measured systematically.
Clinic-based case-control studies are excellent for studying the etiology of precancerous lesions (e.g., cervical intraepithelial neoplasia, CIN).

Case-Control Study
Biomarkers of internal dose (e.g., carrier status for infectious agents, such as HBsAg) or effective dose (e.g., PAH-DNA adducts) are appropriate when they are stable over a long period of time or when exposure has been constant over the exposure period.
However, it is essential that they are not affected by the disease process, diagnosis, or treatment.

Prospective Cohort Studies
Exposure is measured before the outcome.
The source population is defined.
The participation rate is high if specimens are available for all subjects and follow-up is complete.

Prospective Cohort Studies
The usually small number of cases for each of the many types of cancer.
The lack of specimen if the biomarker requires large amounts of specimen or unusual specimens.
Degradation of the biomarker during long-term storage.
The lack of detail on other potentially confounding or interacting exposures.

Prospective Cohort Studies
The major concern in cohort studies of short duration (as in case-control studies) is the possibility that the disease process has influenced the biomarker level among cases diagnosed within 1 to 2 years of specimen collection.

Prospective Cohort Studies
In prospective studies of longer duration, there may be considerable misclassification of the etiologically relevant exposure if specimens have been collected only at baseline.
This misclassification occurs because an individual's exposure level may change systematically over time and because there may be intra-individual variation in biomarker level.

Prospective Cohort Studies
Intra-individual misclassification may be reduced by taking multiple samples, but this will generally increase the expense of sample collection and storage and the burden on study subjects.
Similar considerations apply to taking samples at several points in time in an attempt to estimate time-integrated exposure or exposure change.

Prospective Cohort Studies
An alternative approach is to estimate the extent of intra-individual variation, and the misclassification involved in taking single specimens, by taking multiple specimens in a sample of the cohort.
This information can be used to correct for the bias toward the null introduced when the misclassification is non-differential, and therefore to de-attenuate the observed relative risks.
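One way such a correction is often carried out is sketched below. It assumes a classical, non-differential measurement error model, estimates a reliability (intraclass correlation) coefficient from repeat specimens in a subsample, and divides the observed log relative risk by it. The function names and the example measurements are hypothetical and are not taken from the lecture.

```python
import math
from statistics import mean


def intraclass_correlation(repeats):
    """One-way ANOVA ICC; every subject must have the same number (>= 2) of repeats."""
    n = len(repeats)        # subjects in the reliability subsample
    k = len(repeats[0])     # repeat specimens per subject
    grand = mean(x for row in repeats for x in row)
    subject_means = [mean(row) for row in repeats]
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(repeats, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def deattenuate_rr(observed_rr, icc):
    """Divide the observed log relative risk by the reliability coefficient."""
    return math.exp(math.log(observed_rr) / icc)


# Hypothetical biomarker levels: two specimens for each of four subjects.
repeat_specimens = [[1.0, 1.2], [2.0, 1.8], [3.1, 2.9], [4.0, 4.2]]
icc = intraclass_correlation(repeat_specimens)
print(round(icc, 3), round(deattenuate_rr(1.5, icc), 3))
```

The lower the reliability coefficient, the larger the correction: under this model an observed relative risk of 1.5 with a coefficient of 0.5 would correspond to a corrected relative risk of 2.25.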

Prospective Cohort Studies
Repeated contact with subjects.
Informing cohort members of their biomarker level is problematic if the biomarker is not considered sufficiently predictive of disease and if there are no preventive steps cohort members can take to reduce their risk of the disease.

Nested Case-Control Study
The biomarker can be measured in specimens matched on storage duration.
Each case-control set can be analyzed in the same laboratory batch, reducing the potential for bias introduced by sample degradation and laboratory drift.

The Case-Case Design: Applications in Tumor Markers and Genetic Polymorphisms Studies

Case-Case Study Design
To identify etiological heterogeneity.
To evaluate gene-environment interaction.

Case-Case Study Design
Also called case-only or case-series studies.
Studies of cases without controls.
Can be used to evaluate etiological heterogeneity when studying tumor markers and exposure.
May be used to assess statistical gene-environment or gene-gene interactions.

Interaction Assessment using Case-Control Study
Stratum               Odds ratio
Genotype abnormal     OR1
Genotype normal       OR2
Interaction measure   OR1/OR2
where OR2 = OR01 and OR1 = OR11/OR10, so that OR interaction = OR11/(OR10 x OR01)

Comparison of Case-Control and Case-Case Study Designs
Parameter          Case-control                   Case-case
Beta(01)           OR01                           Not measured
Beta(10)           OR10                           Not measured
Beta interaction   ORint = OR11/(OR01 x OR10)     Measured
Beta(11)           OR11 = OR01 x OR10 x ORint     Not measured

Assumptions for Case-Case Study Design
Exposure and genotype occur independently in the population.
The risk of disease is small (i.e., the disease is rare) at all levels of the study variables.

From Rothman & Greenland, p. 615
Smoking and TGF-alpha Polymorphism
Smoking   TGF-alpha   Cases      Controls    Adjusted OR
Never     Normal      36 (A00)   167 (B00)   1.0 (OR00)
Never     Positive     7 (A01)    34 (B01)   1.0 (OR01)
Yes       Normal      13 (A10)    69 (B10)   0.9 (OR10)
Yes       Positive    13 (A11)    11 (B11)   5.5 (OR11)

OR int = OR11/(OR01 x OR10) = 5.5/(1.0 x 0.9) = 6.1
OR CA = (A11 x A00)/(A10 x A01) = (13 x 36)/(13 x 7) = 5.1

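OR int = OR CA/OR CO = OR11/(OR01 x OR10)
where OR CA is the exposure-genotype odds ratio among cases and OR CO = (B11 x B00)/(B10 x B01) is the corresponding odds ratio among controls
OR11 = (A11 x B00)/(A00 x B11)
OR CA = [OR11/(OR01 x OR10)] x OR CO
Assumption: if OR CO = 1, then OR int = OR CA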
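The relation OR int = OR CA/OR CO can be checked numerically with the cell counts from the smoking and TGF-alpha table. The sketch below uses crude (unadjusted) odds ratios computed directly from the counts, so its values differ somewhat from the adjusted odds ratios quoted in the table; the variable and function names are illustrative only.

```python
# Cell counts from the smoking / TGF-alpha table.
# Key = smoking (0 = never, 1 = yes) followed by genotype (0 = normal, 1 = positive).
cases = {"00": 36, "01": 7, "10": 13, "11": 13}
controls = {"00": 167, "01": 34, "10": 69, "11": 11}


def stratum_or(key):
    """Crude odds ratio for one exposure/genotype cell versus the 00 reference cell."""
    return (cases[key] * controls["00"]) / (cases["00"] * controls[key])


def cross_product_or(counts):
    """Exposure-genotype odds ratio within a single group (cases only or controls only)."""
    return (counts["11"] * counts["00"]) / (counts["10"] * counts["01"])


or01, or10, or11 = stratum_or("01"), stratum_or("10"), stratum_or("11")
or_int = or11 / (or01 * or10)          # interaction OR from the case-control data
or_ca = cross_product_or(cases)        # case-only odds ratio
or_co = cross_product_or(controls)     # control-only odds ratio

print(f"crude OR01={or01:.2f} OR10={or10:.2f} OR11={or11:.2f}")
print(f"OR int={or_int:.2f}  OR CA={or_ca:.2f}  OR CO={or_co:.2f}")
print(f"OR CA / OR CO={or_ca / or_co:.2f}")
```

The case-only odds ratio equals the interaction odds ratio only to the extent that OR CO is close to 1, which is the independence assumption stated for the case-case design.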

Sample Size (RR = 2.0)
Design         Main effect               Interaction
Case-control   150 cases, 150 controls   600 cases, 600 controls
Case-case                                300 cases

Strengths of Case-Case Study Design
The case-case design offers greater precision for estimating gene-environment interaction than the case-control design.
The power for detecting gene-environment interaction in a case-case study is comparable to the power for assessing a main effect in a classic case-control study, which leads to a reduced sample size for interaction assessment.
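The precision gain can be illustrated with the cell counts from the smoking and TGF-alpha example. This is a rough sketch that assumes the usual large-sample (Woolf) variance for a log odds ratio, i.e., the sum of the reciprocals of the contributing cell counts: the case-control interaction estimate draws on all eight cells, while the case-only estimate draws on the four case cells alone. The gain holds only if exposure and genotype are independent in the population.

```python
import math

# Cell counts from the smoking / TGF-alpha example (order: 00, 01, 10, 11).
case_cells = [36, 7, 13, 13]
control_cells = [167, 34, 69, 11]


def se_log_or(cells):
    """Large-sample standard error of a log odds ratio built from these cells."""
    return math.sqrt(sum(1.0 / c for c in cells))


se_case_control = se_log_or(case_cells + control_cells)  # eight cells contribute
se_case_only = se_log_or(case_cells)                     # four cells contribute

print(f"SE of log OR int, case-control: {se_case_control:.3f}")
print(f"SE of log OR CA, case-only:     {se_case_only:.3f}")
```

Because the control cells add variance without adding information when independence holds, the case-only standard error is always the smaller of the two under this approximation.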

Strengths of Case-Case Study Design
Only cases are needed, which avoids the difficult and often unsatisfactory selection of appropriate controls (and thus selection bias in controls).

Limitations of Case-Case Study Design
The main effects of the susceptible genotype (G) and of the environmental exposure (E) cannot be estimated.
The case-case design assesses only departures from multiplicative effects, so it will miss gene-environment interactions that are departures from additivity.

Intervention Studies
In studies of smoking cessation interventions, we can measure serum cotinine or protein or DNA adducts (exposure markers), or p53 mutation, dysplasia, and cell proliferation (intermediate markers of disease).
Biomarkers can also measure compliance with the intervention, e.g., by assaying serum β-carotene in a randomized trial of β-carotene.

Intervention Studies
Susceptibility markers (e.g., GSTM1) can also be used to determine whether randomization was successful (i.e., the intervention and control arms are comparable).

Family Studies
Does familial aggregation exist for a specific disease or characteristic?
Is the aggregation due to genetic factors, environmental factors, or both?
If a genetic component exists, how many genes are involved and what is their mode of inheritance?
What is the physical location of these genes and what is their function?

Issues in Study Design and Analysis
Relating a particular disease (or marker of early effect) to a particular exposure while minimizing bias, controlling for confounding, assessing and minimizing random error, and assessing interactions.

Sample Size and Power Consideration EPI243: Molecular Epidemiology of Cancer

Sample Size and Power
False positive (alpha-level, or Type I error). The alpha-levels traditionally used and accepted are 0.01 and 0.05. The smaller the alpha-level, the larger the required sample size.

Sample Size and Power
False negative (beta-level, or Type II error). (1 - beta) is called the power of the study. Investigators like to have a power of around 0.80 or 0.95 when planning a study, which means that they have an 80% or 95% chance of finding a statistically significant difference between the study and control groups if a true difference of the specified size exists.

Sample Size and Power
The difference between the study and control groups (delta). Two factors need to be considered here: what difference is clinically important, and what difference has been reported by previous studies.

Sample Size and Power
Variability. The greater the variability of the data, the larger the required sample size.

Power or Sample Size Estimate for Case-Control Studies
Alpha-level (false positive): 0.05
Beta-level (false negative; 1 - beta = power): 0.20
Delta-level: proportion exposed among controls and among cases, or the expected odds ratio
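These inputs can be combined as in the sketch below. It assumes an unmatched case-control study with equal numbers of cases and controls, a two-sided test, and the standard normal-approximation formula for comparing two proportions; the function names and the 30% control exposure prevalence are illustrative assumptions, not values from the lecture.

```python
import math
from statistics import NormalDist


def exposure_in_cases(p0, odds_ratio):
    """Exposure proportion among cases implied by control exposure p0 and the odds ratio."""
    return odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))


def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80):
    """Cases required (with an equal number of controls), normal approximation."""
    p1 = exposure_in_cases(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - beta
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Hypothetical planning scenario: 30% of controls exposed, expected OR of 2.0.
print(cases_needed(p0=0.30, odds_ratio=2.0))
# A smaller expected OR, or a stricter alpha, requires more subjects.
print(cases_needed(p0=0.30, odds_ratio=1.5))
print(cases_needed(p0=0.30, odds_ratio=2.0, alpha=0.01))
```

Dedicated software may apply continuity corrections or handle matching and unequal control-to-case ratios, so its results can differ from this approximation.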

Power Estimate

Sample Size Estimate

Estimate Minimum Detectable Odds Ratios
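A minimal sketch of how power and a minimum detectable odds ratio could be obtained for a fixed study size. It assumes an unmatched design with equal numbers of cases and controls, a two-sided test, and a normal approximation, and it searches by bisection on the assumption that power increases with the odds ratio above 1. The function names and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def power_for(n_per_group, p0, odds_ratio, alpha=0.05):
    """Approximate power with n cases and n controls and control exposure p0."""
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z = (
        abs(p1 - p0) * math.sqrt(n_per_group)
        - z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    ) / math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    return NormalDist().cdf(z)


def minimum_detectable_or(n_per_group, p0, alpha=0.05, power=0.80, upper=100.0):
    """Smallest OR above 1 detectable with the requested power (bisection search)."""
    if power_for(n_per_group, p0, upper, alpha) < power:
        raise ValueError("requested power is not reached below the upper bound")
    low, high = 1.0, upper
    for _ in range(60):                     # far more halvings than precision requires
        mid = (low + high) / 2
        if power_for(n_per_group, p0, mid, alpha) >= power:
            high = mid
        else:
            low = mid
    return high


# Hypothetical study: 141 cases, 141 controls, 30% of controls exposed.
print(round(power_for(141, 0.30, 2.0), 3))
print(round(minimum_detectable_or(141, 0.30), 2))
```

Only odds ratios above 1 are searched here; a protective effect would need the mirror-image search between 0 and 1.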

Gene-Environment (Gene-Gene) Interaction EPI242: Molecular Epidemiology Zuo-Feng Zhang, MD, PhD

Definition for Interaction
Interaction (effect modification) occurs when the estimate of the effect of an exposure depends on the level of another factor in the study base.
Interaction is distinct from confounding (and from selection or information bias): it is a real difference in the effect of exposure across subgroups, and may be of considerable interest.

Interaction Assessment
                   Factor A absent   Factor A present
Factor B absent    RR00              RR01
Factor B present   RR10              RR11

Interaction Assessment
RR00: relative risk when both factors are absent
RR01: relative risk when only factor A is present
RR10: relative risk when only factor B is present
RR11: relative risk when both factors A and B are present

Interaction Assessment
Combined RR = RR11
RR11 > RR01 x RR10 indicates more than multiplicative interaction
Equivalently, RR11/RR10 > or < RR01/RR00, or RR11/(RR01 x RR10) > or < 1
Interaction RR = RR11/(RR01 x RR10)

Odds Ratios for Two Factors: Interaction?
                   Factor B absent   Factor B present
Factor A absent    1.0               2.5
Factor A present   4.0               10.0

No More than Multiplicative Interaction
ORs for factor B: 2.5 when factor A is absent; 2.5 (10.0/4.0) when factor A is present.
ORs for factor A: 4.0 when factor B is absent; 4.0 (10.0/2.5) when factor B is present.

More than Multiplicative Interaction, Positive Quantitative Interaction
ORs for factor B: 2.5 when factor A is absent; 5.0 (20.0/4.0) when factor A is present.
ORs for factor A: 4.0 when factor B is absent; 8.0 (20.0/2.5) when factor B is present.

Less than Multiplicative Interaction, Negative Quantitative Interaction
Both factors increase the risk regardless of the value of the other factor, but the combined effect is less than the product of the two, although greater than that of either factor alone, giving a negative quantitative interaction.

Odds Ratios for Two Factors: Interaction?
                   Factor B absent   Factor B present
Factor A absent    1.0               2.5
Factor A present   4.0

Less than Multiplicative Interaction, Negative Quantitative Interaction
Both factors increase the risk.
When factor A is present, there is no additional effect of factor B.
Adding factor A to factor B only increases the risk to the degree found for factor A alone (4.0), leading to a negative quantitative interaction.
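The three worked examples can be reproduced with the interaction ratio, joint OR / (OR for A alone x OR for B alone). The sketch below uses the odds ratios from the examples (4.0 for factor A alone, 2.5 for factor B alone, and joint odds ratios of 10.0, 20.0, and 4.0); the function names and the wording of the labels are illustrative.

```python
def interaction_or(or_a_only, or_b_only, or_both):
    """Ratio of the joint odds ratio to the product of the two single-factor odds ratios."""
    return or_both / (or_a_only * or_b_only)


def describe(ratio, tol=1e-9):
    """Label the departure from a purely multiplicative joint effect."""
    if abs(ratio - 1.0) <= tol:
        return "no more than multiplicative"
    if ratio > 1.0:
        return "more than multiplicative (positive)"
    return "less than multiplicative (negative)"


OR_A_ONLY, OR_B_ONLY = 4.0, 2.5

for or_both in (10.0, 20.0, 4.0):
    ratio = interaction_or(OR_A_ONLY, OR_B_ONLY, or_both)
    print(f"joint OR {or_both:>4}: interaction OR = {ratio:.1f}, {describe(ratio)}")
```

An interaction odds ratio of 1.0 corresponds to the purely multiplicative table, 2.0 to the positive example, and 0.4 to the negative example in which factor B adds nothing once factor A is present.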

Sample Size Consideration for Interaction Assessment
Evaluating interaction requires a substantial increase in study size. For example, in a case-control study it involves comparing the sizes of the odds ratios (relating exposure and disease) in different strata of the effect modifier, rather than merely testing whether the overall odds ratio differs from the null value of 1.0.

Sample Size Consideration
The power to test interaction depends on the number of cases and controls in each stratum (of the effect modifier) rather than on the overall numbers of cases and controls.
When considering possible interactions, the study needs to be at least four times larger than when interaction is not considered (Smith and Day).
