Daniel O. Stram Mark Huberman Anna Wu


IS RESIDUAL CONFOUNDING A REASONABLE EXPLANATION FOR THE APPARENT PROTECTIVE EFFECTS OF BETA-CAROTENE FOUND IN EPIDEMIOLOGICAL STUDIES OF LUNG CANCER IN SMOKERS? Daniel O. Stram, Mark Huberman, Anna Wu

Background: A large number of epidemiological studies (case-control and cohort studies) have reported an inverse (protective) relationship between beta-carotene intake and lung cancer risk among smokers and, more generally, between fruit and vegetable intake and lung cancer risk.

Review (Ziegler et al. 1996): Of 25 retrospective studies of intake, 16 showed a protective effect of carotenoids; 6 were strongly significant; only 4 studies showed an opposite trend, and only 1 reported a marginally significant increase in risk. Of 6 prospective studies of blood micronutrient levels, 5 showed protective effects of beta-carotene (4 strongly significant), and none showed increases in risk. Many of these studies show approximately a doubling of risk in the “low” vs. “high” beta-carotene group.

Intervention Trials: Three randomized studies (CARET, ATBC, PHS) have failed to find reduced lung cancer risk in smokers given beta-carotene. Two of these studies actually noted an increase in risk in the beta-carotene group.

What is the likely cause of the differences between observational and intervention studies? Albanes et al. (ATBC investigator) list 2 distinct explanations: (1) serum levels and “usual” intake represent long-term exposure to beta-carotene, with different effects on risk than supplementation; (2) high beta-carotene intake is associated with other dietary or lifestyle practices that are protective.

What about tobacco itself? Tobacco use is an extraordinarily important risk factor for lung cancer. A number of studies have noted that intake or serum levels of beta-carotene are reduced in current smokers compared to ex-smokers, and in ex-smokers compared to never smokers. Studies that fail to adequately control for smoking would be subject to biases due to this inverse association. All the observational studies did, however, include smoking history as a variable in their analyses.

Is controlling for self-reports of smoking history sufficient to address confounding? This depends upon: (1) the nature and magnitude of the errors in self-reports of smoking in assessing true exposure; (2) the strength of the association between beta-carotene intake or serum levels and true smoking exposure; (3) the shape and strength of the relationship between true smoking and lung cancer risk. Large errors + strong relationships between true dose and beta-carotene and between true dose and risk = high probability of “residual confounding”.

Goals of the paper: (1) Develop a model for residual confounding that is consistent with the literature regarding self-reports of smoking, true lung dose, beta-carotene, and lung cancer risk. (2) Use this model to compute risk differences between smokers with “high vs. low” beta-carotene intake levels due solely to differences in “true lung dose”. (3) Suggest future research for observational studies.

Simplifications: Concentrate solely upon a model for current smokers, even though most observational studies included ex-smokers.

What should the model look like? Model for errors in smoking reports (z): Allow for both classical and Berkson components of error in the distribution of z and true lung dose (x). Reports of the number of cigarettes per day may be “symmetrically” distributed around the truth (classical error model). Conditional on the number of cigarettes, true lung dose may show additional random variation from person to person due to inhalation differences etc. (Berkson error model). Multiplicative errors seem more reasonable than additive errors.

Conditional on Z = log(z), the log X = log(x) of true exposure x is assumed to be normally distributed (so that x is lognormal), with b the slope of X on Z. Taking b = 1 gives the Berkson model, while b = Var(X)/Var(Z) gives the classical error model. The mean of Z and Var(Z) are set so that 95 percent of smokers report between 5 and 60 cigarettes per day, with median = 20. We worked with 3 models (purely classical, purely Berkson, and “mixed”).
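The two limiting values of b can be illustrated with a small simulation. This is only a sketch under assumed values: the function name `slope_of_X_on_Z` and the standard deviations 0.5 and 0.4 are hypothetical choices, not parameters from the presentation. It regresses X = log(x) on Z = log(z) under a purely classical and a purely Berkson error model.

```python
import numpy as np

def slope_of_X_on_Z(model, sd_main=0.5, sd_err=0.4, n=400_000, seed=1):
    """Return (slope of X on Z, Var(X), Var(Z)) on the log scale.

    sd_main and sd_err are illustrative values, not taken from the slides.
    """
    rng = np.random.default_rng(seed)
    if model == "classical":
        # Report scatters around the truth: Z = X + error
        X = rng.normal(np.log(20), sd_main, n)
        Z = X + rng.normal(0.0, sd_err, n)
    elif model == "berkson":
        # Truth scatters around the report: X = Z + error
        Z = rng.normal(np.log(20), sd_main, n)
        X = Z + rng.normal(0.0, sd_err, n)
    else:
        raise ValueError("model must be 'classical' or 'berkson'")
    cov = np.cov(X, Z)  # 2x2 matrix: [[Var(X), Cov], [Cov, Var(Z)]]
    return cov[0, 1] / cov[1, 1], cov[0, 0], cov[1, 1]

if __name__ == "__main__":
    for model in ("classical", "berkson"):
        b, var_x, var_z = slope_of_X_on_Z(model)
        print(f"{model}: b = {b:.3f}, Var(X) = {var_x:.3f}, Var(Z) = {var_z:.3f}")
```

Under these assumptions the classical slope comes out near Var(X)/Var(Z), which is below 1, while the Berkson slope comes out near 1, and Var(X) is smaller than Var(Z) only in the classical case.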

On the arithmetic scale, the correlation R_{z,x} between x and z is given by an expression (the equation is not preserved in this transcript) in terms of the Berkson and classical error variances in a “mixed” Berkson and classical model.

How large is R_{z,x}? High error model: As a lower bound we chose R_{z,x} = 0.55, which is a commonly reported value for the correlation between cotinine measurements and self-reports of smoking. This assumes that cotinine is a “nearly perfect” biomarker of true lung dose. Low error model: We chose R_{z,x} = 0.85 as an upper bound (which seems high for a self-reported exposure of anything).

Model for beta-carotene and true lung dose: We chose a semi-lognormal model so that log beta-carotene, B, is linear in true lung dose with a negative slope; this is similar to the models that Stryker fit to measurements of serum beta-carotene. This model is parameterized by the correlation R_{B,x}.

Why a negative R_{B,x}? Smokers have poor diets. There probably is some kind of direct action of nicotine or another tobacco constituent on taste. Anecdotally, I have heard ex-smokers say that their taste for sweet foods is much stronger after quitting. There may also be a direct action of smoking on serum levels of beta-carotene, conditional on intake. This has been reported in several papers and partly served as a rationale for believing that replacing “lost” beta-carotene should be protective.

How large is R_{B,x}? It must be larger in magnitude than R_{B,z}; in fact we have R_{B,z} = R_{B,x} × R_{z,x}. We assume that R_{B,x} = -0.25. Under the high error model (R_{z,x} = 0.55) this gives an observed correlation of R_{B,z} = -0.14. This correlation seems to be consistent with the (very few) direct examinations of R_{B,z} that are given in the literature.
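The attenuation from -0.25 to roughly -0.14 can be checked with a short simulation. This is a sketch only: it assumes standardized normal variables and that beta-carotene and reported smoking are related solely through true dose, which is a simplification of the lognormal model described in the slides. The function name `observed_correlation` is hypothetical.

```python
import numpy as np

def observed_correlation(r_bx=-0.25, r_zx=0.55, n=500_000, seed=0):
    """Simulate Corr(B, z) when B and z each depend on x but not on each other."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # standardized true dose
    # B has correlation r_bx with x; z has correlation r_zx with x
    b = r_bx * x + np.sqrt(1 - r_bx**2) * rng.normal(size=n)
    z = r_zx * x + np.sqrt(1 - r_zx**2) * rng.normal(size=n)
    return np.corrcoef(b, z)[0, 1]

if __name__ == "__main__":
    print(f"simulated Corr(B, z): {observed_correlation():.3f}")
    print(f"product r_bx * r_zx:  {-0.25 * 0.55:.4f}")
```

Under these assumptions the simulated correlation should land close to the product of the two input correlations.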

Model for lung cancer risk among current smokers: Doll and Peto (J Epidemiol and Community Health 1978) describe the British Doctors data using RR = (1 + (1/6) × cigarettes/day)^2. We also considered a model of the form RR = 1 + (13/9) × cigarettes/day, which agrees with the quadratic model at the values 0 and 40 cigarettes/day.
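The agreement of the two dose-response forms at 0 and 40 cigarettes per day is easy to verify numerically. The sketch below simply evaluates both formulas as stated; the function names `rr_quadratic` and `rr_linear` are illustrative.

```python
def rr_quadratic(cigs_per_day):
    """Doll and Peto form: RR = (1 + cigarettes/day / 6)^2."""
    return (1 + cigs_per_day / 6) ** 2

def rr_linear(cigs_per_day):
    """Linear alternative: RR = 1 + (13/9) * cigarettes/day."""
    return 1 + (13 / 9) * cigs_per_day

if __name__ == "__main__":
    for c in (0, 10, 20, 30, 40):
        print(f"{c:>2} cigs/day: quadratic {rr_quadratic(c):6.2f}, linear {rr_linear(c):6.2f}")
```

Both forms give RR = 1 at 0 and about 58.8 at 40 cigarettes per day; in between, the linear form lies above the quadratic one because the quadratic curve is convex.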

Note that this model is in terms of self-reported smoking. Under the lognormal measurement error model, the observed relationship is attenuated relative to the true model. In particular, if an observed model for risk involves a term z^n, then our lognormal model implies that a term with exponent n/b appears in the model using true dose.

Putting it all together: What do we want to compute? RR = RR(“low intake”, z) / RR(“high intake”, z) for specific values of reported smoking, z. How do we compute it? By (numerical) integration over the distributions of x given both z and B.
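The quantity described above can be approximated with a Monte Carlo sketch in place of the numerical integration the authors used. Everything specific here is an assumption made for illustration: a purely Berkson error with log-scale standard deviation 0.5, a correlation of -0.25 between B and x imposed conditional on z (the slides define R_{B,x} marginally), “low” and “high” intake defined as the bottom and top quartiles of B, and the quadratic risk form applied directly to true dose. The function name `residual_confounding_rr` is hypothetical.

```python
import numpy as np

def residual_confounding_rr(z=20.0, sd_berkson=0.5, r_bx=-0.25, n=200_000, seed=0):
    """Ratio of mean risk in low-B vs high-B smokers who all report z cigs/day."""
    rng = np.random.default_rng(seed)
    # Berkson error on the log scale: true dose scatters around the report
    x = np.exp(np.log(z) + rng.normal(0.0, sd_berkson, size=n))
    # Log beta-carotene B is linear in x with a negative slope plus noise
    x_std = (x - x.mean()) / x.std()
    b = r_bx * x_std + np.sqrt(1 - r_bx**2) * rng.normal(size=n)
    # Risk depends on true dose only; B has no causal effect in this sketch
    rr = (1 + x / 6) ** 2
    low = b < np.quantile(b, 0.25)
    high = b > np.quantile(b, 0.75)
    return rr[low].mean() / rr[high].mean()

if __name__ == "__main__":
    print(f"apparent RR, low vs high B: {residual_confounding_rr():.2f}")
    print(f"with no B-x association:    {residual_confounding_rr(r_bx=0.0):.2f}")
```

Although beta-carotene has no effect on risk in this simulation, the low-B group shows higher risk at the same reported z because it carries a higher true dose. Setting `r_bx=0.0` should bring the ratio back to about 1.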

Results

Conclusions: Under the high measurement error model and the linear-quadratic dose-response model, “all” of the effect of beta-carotene may be due to confounding with unmeasured tobacco exposure; that is, relative risks of close to 2 are evident.

The linear model reduces the strength of the residual confounding, but the bias is still important for the high error model. The results are somewhat dependent upon whether a Berkson, classical, or mixed measurement error model is assumed.

Comparison of classical and Berkson error models (with the same value of R_{z,x}): The classical model implies attenuation and greater nonlinearity in the risk function on the true dose scale. The classical model also implies that Var(X) is smaller than Var(Z); the opposite is true for the Berkson model. Conditioning on “high” and “low” values of B produces a smaller difference in E{X|B_low} vs. E{X|B_high} for a classical than for a Berkson model.

The Berkson model implies much less change in the risk function, but the larger Var(X) yields larger variation in E{X|B}. These two effects tend to “cancel”. The model yielding the largest residual confounding is actually the “mixed” model.

Is the high error model reasonable? Correlations between cotinine and self-reports of smoking range from approximately 0.4 to 0.7 in various publications. This allows little room for the “high error” model unless cotinine is a “perfect” estimate of true exposure: Corr(Cotinine, z) = Corr(Cotinine, x) × Corr(z, x), implying that Corr(z, x) > Corr(Cotinine, z). However, these reported correlations are all from small, carefully focused studies; it is likely that smoking reporting errors in larger, less focused studies are larger than in these special studies.
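The bound follows in one line from the product relation, assuming cotinine and self-report are related only through true dose x. The value 0.8 used in the numerical example is hypothetical and chosen purely for illustration.

```latex
\[
\operatorname{Corr}(z,x)
  = \frac{\operatorname{Corr}(\text{Cotinine},z)}{\operatorname{Corr}(\text{Cotinine},x)}
  \ge \operatorname{Corr}(\text{Cotinine},z),
  \qquad \text{since } 0 < \operatorname{Corr}(\text{Cotinine},x) \le 1 .
\]
% Illustration with a hypothetical biomarker quality of 0.8:
\[
\operatorname{Corr}(\text{Cotinine},z) = 0.55,\quad
\operatorname{Corr}(\text{Cotinine},x) = 0.8
\;\Longrightarrow\;
\operatorname{Corr}(z,x) = \frac{0.55}{0.8} \approx 0.69 .
\]
```

Equality holds only when cotinine is a perfect marker of true dose, which is the case the high error model relies on.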

Many epidemiological studies find that reported smoking duration is a much better predictor of lung cancer risk than reported smoking amount. To the extent that this is true (and careful analysis is required because of the relation between smoking duration and age), I take this as evidence that self-reports of smoking are rather poor in these studies.

Is R_{B,x} = -0.25 reasonable? This is a very strong negative correlation, but the high error model implies a much weaker observed correlation of R_{B,z} = -0.14. One paper (Stryker) reported a correlation R_{B,z} = -0.26 for males; this, however, included nonsmokers in the analysis (the correlation would be smaller if only smokers were considered).

It has been under-appreciated that a direct action of tobacco smoke on serum beta-carotene (conditional on intake), which has been reported in the literature, increases the potential for residual confounding if tobacco intake is measured poorly. Remember that the cohort studies that measured serum beta-carotene found stronger effects than the other observational studies.

Suggestions for research: The correlation between serum beta-carotene levels and reported smoking should be better reported in studies. Levels of beta-carotene in smokers < ex-smokers < never smokers have been commonly reported, but only rarely have correlations for current smokers been calculated. (I found R_{B,z} = -0.10 for the MEC.) The correlation of serum beta-carotene levels and serum cotinine levels should be reported where possible.

Cohort studies with stored blood samples should use cotinine as well as reported smoking in joint analyses of lung cancer risk and beta-carotene. This is planned for the Multi-Ethnic Cohort Study of Diet and Cancer (Kolonel et al. 2000, Am J of Epidemiol).