Sources and effects of bias in investigating links between adverse health outcomes and environmental hazards Frank Dunstan University of Wales College.

Presentation transcript:

Sources and effects of bias in investigating links between adverse health outcomes and environmental hazards
Frank Dunstan, University of Wales College of Medicine
The results are only as good as the data!

Outline of talk
Why do so many spatial studies fail to find evidence of the effect of risk factors?
Is it because exposure is usually measured inadequately?
Consider a point source of risk and the effects of:
– distance as a surrogate for exposure
– migration
More generally, when looking at the association between the spatial variation of disease incidence and risk factors, what is the effect of measurement error?

How do we measure exposure?
Distance from the focus is often used as a surrogate.
The algorithms of Stone, Bithell, Tango and others use different models under the alternative hypothesis, but the usual assumption is that risk decreases monotonically with distance.
It is implicit that the risk is the same in all directions.
The circles approach is similar.
Models of transmission of risk make this implausible.

Circles method
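
As a rough illustration of the circles approach, the sketch below classifies each subject as inside or outside a circle of fixed radius around the source and forms the odds ratio from the resulting 2x2 table. The function name, the coordinates and the 2 km radius are hypothetical, and treating non-case births as the comparison group is an assumption rather than a detail taken from the talk.

```python
import math

def circles_odds_ratio(cases, controls, source, radius):
    """Odds ratio for living within `radius` of `source`, cases vs controls."""
    def split(points):
        inside = sum(1 for (x, y) in points
                     if math.hypot(x - source[0], y - source[1]) <= radius)
        return inside, len(points) - inside

    a, b = split(cases)      # cases inside / outside the circle
    c, d = split(controls)   # controls inside / outside the circle
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: empty cell in the 2x2 table")
    return (a * d) / (b * c)

# Hypothetical coordinates in km, with the source at the origin.
cases = [(0.5, 0.0), (1.0, 1.0), (3.0, 0.0), (0.0, 4.0)]
controls = [(0.2, 0.1), (2.5, 2.5), (3.0, 3.0), (5.0, 0.0), (0.0, 6.0), (4.0, 4.0)]
odds_ratio = circles_odds_ratio(cases, controls, source=(0.0, 0.0), radius=2.0)
print(odds_ratio)
```

Because only distance enters the classification, two subjects at the same distance are treated identically whatever their direction from the source.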

Example on congenital anomalies
Data on all births in Wales in a 15-year period, linked to records of congenital anomalies
Locations obtained using a GIS
Data on landfill sites which changed significantly in the period (24 in all)
Individual data on maternal age and birthweight
Census data on deprivation
Is the opening of a site associated with an increased risk of anomalies?

Modelling
Rate varied significantly between hospitals and by year of birth, so adjustment for these is needed.
Risk modelled as a function of:
– age of mother
– hospital
– gender
– year of birth
– deprivation
Calculate observed and expected counts for each square of side 250 m (for example), then smooth the standardised differences using kernel smoothing.
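
The grid-and-smooth step might look roughly like the following. The slide does not say how the differences were standardised or which kernel was used, so the Pearson-type residual (O - E) / sqrt(E), the Gaussian kernel and the 250 m bandwidth are all assumptions, and the counts are made up.

```python
import math

def standardised_difference(observed, expected):
    # Pearson-type residual; an assumed choice of standardisation.
    return (observed - expected) / math.sqrt(expected)

def kernel_smooth(cells, bandwidth):
    """cells: list of (x, y, observed, expected) per grid square.

    Returns the Gaussian-kernel weighted average of the standardised
    differences, evaluated at each square centre.
    """
    z = [standardised_difference(o, e) for (_, _, o, e) in cells]
    smoothed = []
    for (xi, yi, _, _) in cells:
        num = den = 0.0
        for (xj, yj, _, _), zj in zip(cells, z):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            w = math.exp(-d2 / (2.0 * bandwidth ** 2))
            num += w * zj
            den += w
        smoothed.append(num / den)
    return smoothed

# Hypothetical 3 x 3 block of 250 m squares with an excess in the centre square.
cells = [(x * 250.0, y * 250.0, 4, 4.0) for x in range(3) for y in range(3)]
cells[4] = (250.0, 250.0, 10, 4.0)
smoothed = kernel_smooth(cells, bandwidth=250.0)
```

The expected counts would come from the adjusted risk model; here they are simply supplied as inputs.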

Smoothed risks around 2 landfill sites, before and after opening
[Figure panels: Standard; Astbury Quarry]

[Figure panels: Trecatti; Nantygwyddon]

Interpretation
Need to consider the change in risk pattern, comparing before and after opening.
Different sites have different risk patterns.
Pooled results across sites must be interpreted carefully.
The differences between sites are possibly due to geography.
Risk does not seem isotropic; it is possibly affected by wind, water flow and the topography of the site.
What is the effect on tests and estimates of risk?

Simulation exercise
Assume that a certain amount of pollutant is spread from a point source.
Consider different patterns of spread:
– isotropic
– concentrated in the direction of the prevailing wind
– non-monotonic
Based on the births scenario to provide detailed data, but the interest is in relative magnitudes of power etc. rather than absolute values.
Does geography matter?
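
The three spread patterns could be written as exposure functions of distance d and bearing theta from the source, as in the sketch below. The functional forms (exponential decay, a von Mises-type angular weight, a Gaussian ring) and every parameter value are illustrative assumptions, not the ones used in the simulations.

```python
import math

def isotropic(d, theta, scale=2.0):
    # Decays with distance only; theta is ignored.
    return math.exp(-d / scale)

def wind_directed(d, theta, scale=2.0, wind=0.0, kappa=2.0):
    # Same radial decay, down-weighted away from the downwind bearing `wind`.
    return math.exp(-d / scale) * math.exp(kappa * (math.cos(theta - wind) - 1.0))

def non_monotonic(d, theta, peak=1.5, width=0.75):
    # Exposure peaks on a ring at distance `peak`, not at the source itself.
    return math.exp(-((d - peak) ** 2) / (2.0 * width ** 2))

# Two hypothetical residences 1 km from the source, downwind and upwind.
downwind = wind_directed(1.0, 0.0)
upwind = wind_directed(1.0, math.pi)
```

Under the second and third patterns, ranking residences by distance alone misorders their exposure, which is the situation the simulations probe.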

Results
Show power, using Stone's method for simplicity (the patterns are the same for the others)
Mean estimated odds ratio if using circles with the correct threshold
These vary between sites because of the distribution of the population
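
For readers unfamiliar with Stone's method, here is a minimal sketch of one common formulation: areas are ordered by distance from the source and the statistic is the largest cumulative observed/expected ratio, with a Monte Carlo p-value obtained by reallocating the observed total across areas in proportion to the expected counts. This may differ in detail from the variant used for the results shown, and the counts are invented.

```python
import random
from itertools import accumulate

def stone_statistic(observed, expected):
    """Largest cumulative O/E ratio; areas must be sorted by distance."""
    return max(o / e for o, e in zip(accumulate(observed), accumulate(expected)))

def stone_test(observed, expected, n_sim=999, seed=1):
    rng = random.Random(seed)
    total = sum(observed)
    scale = total / sum(expected)
    exp_scaled = [e * scale for e in expected]   # expected counts sum to the total
    t_obs = stone_statistic(observed, exp_scaled)
    areas = range(len(observed))
    exceed = 0
    for _ in range(n_sim):
        # Null hypothesis: each case falls in an area with probability
        # proportional to that area's expected count.
        counts = [0] * len(observed)
        for a in rng.choices(areas, weights=expected, k=total):
            counts[a] += 1
        if stone_statistic(counts, exp_scaled) >= t_obs:
            exceed += 1
    return t_obs, (exceed + 1) / (n_sim + 1)

# Hypothetical counts for six areas, nearest to the source first.
observed = [6, 5, 4, 3, 3, 2]
expected = [2, 3, 4, 4, 5, 5]
t_obs, p_value = stone_test(observed, expected)
```

Power under a given spread pattern can then be estimated as the proportion of simulated data sets for which the p-value falls below the chosen significance level.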

Results on power, 3 sites
[Figure panels: Isotropic; Not isotropic; Not monotonic]

Results on odds ratio, 3 sites
[Figure panels: Isotropic; Not isotropic; Not monotonic]

Migration
Large numbers of people move house each year.
Many diseases associated with environmental risks are believed to be due to long-term exposure.
Taking place of residence at diagnosis as representing exposure is potentially misleading, since exposure may have arisen at previous locations.
The effect will be to weaken the apparent risk.
We planned to use the NHSAR to identify appropriate models, but the data are not yet available.

Migration model
Based on the population around a site, divided into census EDs (between 150 and 200, depending on the site)
Assume a fixed probability of moving each year
Probability of a move to a given destination decreases with distance
Assume the background rate varies across EDs according to a log-normal distribution
Assume a monotonically decreasing risk from the source

Monitor the total number of exposure-years for each individual
Assume a logistic model for the risk of a case as a function of exposure-years
Use the circles method for simplicity to assess the effect
Estimate the effect on odds ratio and power
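
A heavily simplified version of this migration model can be sketched as follows. It uses ten hypothetical EDs along a line instead of 150 to 200 real ones, omits the log-normal background variation, and compares mean modelled risks inside and outside a 2 km circle rather than sampling cases; all parameter values are assumptions chosen for illustration.

```python
import math
import random

def simulate_or(p_move, n_people=5000, years=20, seed=0):
    """Circles-method odds ratio based on residence at the end of follow-up."""
    rng = random.Random(seed)
    dists = [0.5 + k for k in range(10)]           # ED distances from source (km)
    yearly = [math.exp(-d / 2.0) for d in dists]   # exposure falls with distance
    # Destination weights: moves to nearer EDs are more likely.
    move_w = [[0.0 if i == j else math.exp(-abs(dists[i] - dists[j]) / 2.0)
               for j in range(10)] for i in range(10)]
    inside, outside = [], []
    for _ in range(n_people):
        ed = rng.randrange(10)
        exposure_years = 0.0
        for _ in range(years):
            exposure_years += yearly[ed]
            if rng.random() < p_move:              # fixed annual move probability
                ed = rng.choices(range(10), weights=move_w[ed])[0]
        # Logistic model for risk as a function of accumulated exposure-years.
        risk = 1.0 / (1.0 + math.exp(-(-5.0 + 0.1 * exposure_years)))
        (inside if dists[ed] <= 2.0 else outside).append(risk)
    p_in = sum(inside) / len(inside)
    p_out = sum(outside) / len(outside)
    return (p_in / (1.0 - p_in)) / (p_out / (1.0 - p_out))

or_static = simulate_or(p_move=0.0)
or_moving = simulate_or(p_move=0.1)
```

With no migration, final residence determines exposure exactly; once people move, residence at the end of follow-up is only loosely related to accumulated exposure and the odds ratio is pulled toward 1.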

Typical results from a site
Odds ratio and power decrease markedly as migration increases.
Absolute values depend on the parameters; the pattern seems to be preserved, but local geography matters.

Errors in variables
Ecological studies by administrative area
Take area-based disease rates (mortality, incident cancer cases, etc.)
Risk factors also defined at area level:
– often from census data
– also from irregularly measured factors
So these are unlikely to be reported at correct levels

Typical problem – leukaemia incidence against ionising radiation

Simulation
Poisson regression and spatial models; only Poisson results shown for brevity
Measure of deprivation used as covariate
Spatial correlation induced
Classical measurement error model
Interested in bias and in the estimate of the SD of the regression coefficient
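
A stripped-down simulation of this kind is sketched below: a single covariate, independent areas (no induced spatial correlation) and a hand-written Newton-Raphson fit for the Poisson log-linear model. The sample size, coefficients and error SD are hypothetical, and the helper names are illustrative only.

```python
import math
import random

def poisson(rng, mean):
    # Knuth's multiplication method; adequate for the small means used here.
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fit_poisson(w, y, iters=25):
    """Newton-Raphson fit of log E[y] = b0 + b1 * w; returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, yi in zip(w, y):
            mu = math.exp(b0 + b1 * wi)
            g0 += yi - mu              # score for the intercept
            g1 += (yi - mu) * wi       # score for the slope
            h00 += mu                  # information matrix entries
            h01 += mu * wi
            h11 += mu * wi * wi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(42)
n, beta0, beta1, error_sd = 2000, 1.0, 0.3, 1.0
x = [rng.gauss(0.0, 1.0) for _ in range(n)]         # true covariate
w = [xi + rng.gauss(0.0, error_sd) for xi in x]     # classical error: W = X + U
y = [poisson(rng, math.exp(beta0 + beta1 * xi)) for xi in x]

slope_true = fit_poisson(x, y)[1]    # fitted on the true covariate
slope_naive = fit_poisson(w, y)[1]   # fitted on the error-prone covariate
```

With equal covariate and error variances, as here, the naive slope would be expected to come out at roughly half the true value.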

Typical simulation results: effect on bias
Based on all-Wales (908 wards) and a sub-region (111 wards)
Bias increases with error SD, as in other contexts
Effect of correlation is more on the estimated SD
Spatial model (BYM) gives similar parameter estimates but with a better estimate of the SE
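
Why the bias grows with the error SD can be seen in a simplified setting. Assume, purely for illustration, a single normally distributed covariate X_i with mean \mu_x and variance \sigma_x^2, independent classical error U_i, expected counts E_i, and no spatial correlation; none of these symbols appear in the slides.

```latex
\begin{align*}
Y_i \mid X_i &\sim \mathrm{Poisson}\!\left(E_i \exp(\beta_0 + \beta_1 X_i)\right), \qquad
W_i = X_i + U_i, \quad U_i \sim N(0, \sigma_u^2), \\
X_i \mid W_i &\sim N\!\left(\mu_x + \lambda (W_i - \mu_x),\; (1-\lambda)\,\sigma_x^2\right), \qquad
\lambda = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}, \\
\mathbb{E}[Y_i \mid W_i] &= E_i \exp\!\left(\beta_0 + \beta_1 (1-\lambda)\mu_x
  + \tfrac{1}{2}\beta_1^2 (1-\lambda)\sigma_x^2 + \lambda \beta_1 W_i\right).
\end{align*}
```

Regressing on the observed W_i therefore targets the slope \lambda\beta_1 rather than \beta_1, and \lambda shrinks toward zero as \sigma_u increases. Conditional on W_i the counts are also overdispersed relative to a Poisson distribution, which is one reason a naive standard error can mislead.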

Conclusion
In investigating the risk around a source we need a proper measure of exposure; distance is not enough.
Methods which assume that the risk decreases monotonically with distance lack power.
Effects will vary with geographical location, and account must be taken of local conditions.
Migration can have a considerable effect on the extent of exposure. This is particularly important when distance is used as a surrogate for exposure.
More work is needed on better models.
A proper investigation requires detailed studies at individual level, of locations and people, to assess exposure accurately.