
Applied Epidemiologic Analysis, Fall 2002
Patricia Cohen, Ph.D.
Henian Chen, M.D., Ph.D.
Teaching Assistants: Julie Kranick, Sylvia Taylor, Chelsea Morroni, Judith Weissman

Lecture 11: Standardization, sampling fractions, and multilevel analysis
Goals:
– To review the motivation for methods of standardization of study data
– To understand the connection between standardizing study data to match the population and weighted statistical analysis
– To review the connection between clustered sampling and violation of the assumption of independence of observations
– To see how clustered data can be informative about group/area effects and differences in the impact of exposures
– To see how multilevel clustered data relate to ecologic analysis

Why standardize measures of incidence or exposure effects?
Measures of incidence and exposure effects estimated in a study are inevitably influenced by how the population strata are represented in the study sample. Unless these measures are constant across strata, we can expect different estimates from different study designs and reference populations. Any comparison across populations (e.g. state or country estimates), and any projection to future or past populations, must take these differences in population composition into account.

[Table: a sample with 100 subjects in each age group, which is not the same distribution as in the population. Starred entry (*): standardized incidence proportion.]

Equivalent methods of standardization
Adjusting incidence or prevalence proportions to the population distribution of strata requires multiplying each stratum’s estimate by its relative sampling fraction, summing, and dividing by the sum of the strata sampling fractions.
Alternatively, each observation in each stratum may be weighted by the same relative sampling fraction, the weighted observations summed, and the total divided by the sum of the relative sampling fractions (= N).
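The equivalence of the two methods can be checked numerically. The following is a minimal sketch, not the lecture's own computation: the strata, counts, and function names are hypothetical, and "relative sampling fraction" is interpreted here as a stratum's population share divided by its sample share, scaled so that the weights over all sampled subjects sum to N.

```python
# Hypothetical strata: the sample has 100 subjects per age group,
# but the population is distributed differently.
strata = {
    # age group: (population size, sample size, cases in sample)
    "20-39": (5000, 100, 5),
    "40-59": (3000, 100, 10),
    "60+": (2000, 100, 20),
}

def relative_weights(strata):
    """Per-subject weight: population share divided by sample share."""
    pop_total = sum(pop for pop, _, _ in strata.values())
    n_total = sum(n for _, n, _ in strata.values())
    return {
        name: (pop / pop_total) / (n / n_total)
        for name, (pop, n, _) in strata.items()
    }

def standardized_by_strata(strata):
    """Method 1: weight each stratum's estimate, sum, divide by summed weights."""
    w = relative_weights(strata)
    num = sum(w[k] * n * (cases / n) for k, (_, n, cases) in strata.items())
    den = sum(w[k] * n for k, (_, n, _) in strata.items())
    return num / den

def standardized_by_observation(strata):
    """Method 2: weight every individual observation, divide by summed weights."""
    w = relative_weights(strata)
    weighted_cases = 0.0
    weight_sum = 0.0
    for k, (_, n, cases) in strata.items():
        outcomes = [1] * cases + [0] * (n - cases)  # one 0/1 outcome per subject
        for y in outcomes:
            weighted_cases += w[k] * y
            weight_sum += w[k]
    return weighted_cases / weight_sum

# Crude (unweighted) incidence proportion for comparison.
crude = sum(c for _, _, c in strata.values()) / sum(n for _, n, _ in strata.values())
```

With these made-up numbers the crude proportion is 35/300, while both standardized versions give 0.095, because the oldest (highest-risk) stratum is over-represented in the sample relative to the population.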

Representing the population in multivariate analyses; weighting data
[Table not recovered. Starred entry (*): standardized incidence proportion.]

Standardizing risk and rate ratios
Both are standardized by comparable means: multiplying individual observations or strata estimates by relative sampling ratios will standardize risk ratios.
Special considerations for rate ratios:
– They require strata-specific values of person-time to determine the appropriate relative sampling fraction
– These values are used to standardize both the numerator and the denominator of rate ratios
– Doing so assumes that exposure does not affect the strata person-time substantially (e.g. rare disease assumption)
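A standardized ratio applies the same stratum weights to the exposed and unexposed estimates before dividing. The sketch below uses hypothetical counts and weights; the `denom` fields can hold persons (giving a risk ratio) or person-time (giving a rate ratio), and the function names are illustrative only.

```python
# Each stratum: (weight in the standard population,
#                exposed cases, exposed denom,
#                unexposed cases, unexposed denom)
# denom is persons for risks, or person-time for rates. All values are made up.
strata = [
    (0.6, 10, 100, 5, 100),
    (0.4, 30, 100, 20, 100),
]

def standardized_measure(strata, exposed):
    """Weighted average of stratum-specific risks (or rates) for one exposure group."""
    total_w = sum(s[0] for s in strata)
    if exposed:
        return sum(w * a / d1 for w, a, d1, _, _ in strata) / total_w
    return sum(w * b / d0 for w, _, _, b, d0 in strata) / total_w

def standardized_ratio(strata):
    """Ratio of standardized measures: same weights in numerator and denominator."""
    return standardized_measure(strata, True) / standardized_measure(strata, False)

ratio = standardized_ratio(strata)
```

Here the standardized risks are 0.18 (exposed) and 0.11 (unexposed), so the standardized ratio is about 1.64.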

Standardized measures in epidemiology differ from standardized scores/coefficients in statistics
– “Standard” or z scores
– A correlation coefficient is a measure of association that has substituted standard deviation units for the original units of measurement
– A regression coefficient (or an odds ratio) expresses a relationship as a change in dependent variable units per unit change in the independent variable
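The relation between these quantities can be illustrated with a small example: the correlation coefficient equals the regression slope obtained after both variables are converted to z scores. This is a sketch with made-up data and hypothetical helper names, using only the Python standard library.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw values to standard deviation units."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up independent variable
y = [2.0, 4.0, 5.0, 4.0, 5.0]   # made-up dependent variable

slope_raw = ols_slope(x, y)                 # in y units per x unit
r = ols_slope(z_scores(x), z_scores(y))     # unit-free: the correlation
```

The raw slope is 0.6 y-units per x-unit, while the slope on z scores is about 0.775 and no longer depends on the units of either variable.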

In what sense are epidemiological effect measures standardized measures?
– Rate ratios are “unit free” because disease per person-time unit appears in both numerator and denominator
– Risk ratios are similarly unit free because disease per person appears in both numerator and denominator
– An advantage is that rate ratios and risk ratios can be compared for different outcomes, different exposures, or even both

Rate and risk differences and OLS regression coefficients
In contrast to rate and risk ratios, rate and risk differences retain the original units. It is possible that they will be more constant across populations because they are less likely to change with a change in the distribution of other risks for the disease. In that sense, they are much like regression coefficients in OLS (literally, they are regression coefficients if OLS is applied to a disease outcome), which are also expected to be more stable across populations.

Advantages and disadvantages of standardized coefficients
Risk and rate ratios and correlation coefficients share advantages and disadvantages:
– They are unit-free and thus comparable across changes in variables (e.g. disease and/or exposure)
– Even when the variables are the same, they are likely to vary across populations with different distributions of the same or other relevant variables

Other circumstances when differential sampling fractions may be employed
In many study designs certain subpopulations may be deliberately over-represented in the sampling. This is particularly likely when these subpopulations are to be the subject of special study and representation at the overall sampling fraction rate would yield too small a sample for adequate statistical power.

Over-sampling of particular strata
Often the goal is to estimate the incidence proportion, or measures of effect, for the entire population. In these circumstances, using the relative sampling fraction as a weight for individual observations accomplishes this goal. Virtually all statistical programs provide for the use of such weights, and there is no constraint on the complexity of the model that may be estimated.

Clustering
Ordinary statistical analyses assume that study participants are individually randomly sampled from the population, so that observations are independent.
Whenever observations are clustered on predictors of the dependent variable (DV) that are NOT included in the model, the standard errors of estimates will be too small.
Such clustering occurs when study participants are obtained from “group settings” such as different diagnostic or treatment sources, whole classrooms or other groups, or neighborhoods that may differ on relevant variables.
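One common way to gauge how much too small a naive standard error is uses Kish's design effect, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. This formula is not given in the slides; it is a standard approximation for estimating a mean or proportion from equal-sized clusters, and the numbers below are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Kish's approximate variance inflation for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc

def corrected_se(naive_se, cluster_size, icc):
    """Inflate a standard error that was computed assuming independence."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))

# Hypothetical values: 20 participants per cluster, ICC of 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
se = corrected_se(naive_se=0.010, cluster_size=20, icc=0.05)
```

Even a small intraclass correlation roughly doubles the variance in this example, so the naive standard error understates the uncertainty by a factor of about 1.4.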

Clusters at the sampling stage
In the past these clusters were often ignored.
The availability of statistical programs that analyze them properly has produced both more appropriate statistical estimates (especially of confidence limits) and a new awareness of important substantive questions.
When there are few (e.g. <10) clusters, it may be more efficient and informative to include “membership” as a categorical variable in the analysis.

Variables measured at the cluster level
Differences between clusters on variables measured on the study individuals, such as average age, average education, ethnic background, or mean score on a test
Differences between clusters on variables that either:
– Represent cluster properties per se: weather, urbanization, location, pollution, teacher’s training, doctors per patient
– Could be measured at the individual level but are not study variables: % voting, mean persons per room, % of babies born with low birth weight

Goals of multilevel analysis statistical programs
Produce proper standard errors.
Ask new kinds of questions:
– Questions at the aggregate or cluster level
– Questions about effects of aggregate characteristics on individual participants (at the individual level)
– Questions about different individual level effects that depend on the aggregate or cluster in which they appear

Goals of multilevel analysis statistical programs, continued
Questions at the aggregate or cluster level
– For persons with a given disease, are there differences in mortality rates (or disease markers) associated with the medical systems from which they were recruited?
Questions about effects of aggregate characteristics on individual participants (at the individual level)
– Does an individual’s risk for disease depend on the average risk of those around them (assuming a non-infectious disease, where this question is particularly interesting)?
Questions about different individual level effects depending on the aggregate or cluster in which they appear
– Are there differences in the relationship between use of a given therapy and outcome for individuals in different treatment settings?

Multilevel analyses as “random coefficient” methods
Multilevel analyses are sometimes referred to as “random coefficient” methods because they can answer these new types of questions:
– As in traditional analyses, the individual level outcome variable is viewed as a “random” (to be predicted) variable.
– Additionally, the variation in relationships, e.g. of exposures to diseases, as reflected in different cluster estimates of effects (coefficients), may also be viewed as a “random” (to be predicted) variable.
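A two-level random coefficient model can be written out as follows. This is a sketch in a commonly used multilevel notation, not notation taken from the slides: i indexes individuals, j indexes clusters, x is an individual level exposure, z is a cluster level characteristic, and the gamma, u, and e symbols are assumptions introduced here for illustration.

```latex
\begin{align}
  y_{ij}     &= \beta_{0j} + \beta_{1j}\, x_{ij} + e_{ij}
             && \text{(individual level)} \\
  \beta_{0j} &= \gamma_{00} + \gamma_{01}\, z_j + u_{0j}
             && \text{(cluster intercepts, to be predicted)} \\
  \beta_{1j} &= \gamma_{10} + \gamma_{11}\, z_j + u_{1j}
             && \text{(cluster exposure effects, to be predicted)}
\end{align}
```

The coefficients of the individual level equation are themselves outcomes of cluster level equations, which is the sense in which they are “random.” Substituting the second and third lines into the first yields the cross-level interaction term \gamma_{11} z_j x_{ij}. For a binary disease outcome, the left-hand side of the first line would typically be replaced by the logit of the disease probability.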

Multilevel regression analysis
– Uses maximum likelihood techniques (including “empirical Bayes” estimation)
– Examines the possibility that there may be different effects of exposures in different contexts
– Analyses are carried out at the cluster level AND the individual participant level

Multilevel regression analysis, continued
First stage: Estimate the intraclass correlation (ICC), which reflects the fraction of disease variance associated with cluster differences.
Second stage: Add individual or cluster level “fixed” independent variables to the prediction.
Third stage: Add other variables that may characterize clusters.
Fourth stage: Add potential interactions between individual and aggregate level variables.
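The first-stage quantity can be illustrated with a simple estimator. The sketch below uses the one-way ANOVA estimator of the ICC for equal-sized clusters and a continuous outcome; multilevel programs estimate the same quantity by maximum likelihood, and a binary disease outcome would need additional care. The function name and data are hypothetical.

```python
from statistics import mean

def icc_oneway(clusters):
    """One-way ANOVA estimate of the ICC for equal-sized clusters."""
    k = len(clusters)        # number of clusters
    m = len(clusters[0])     # observations per cluster
    if any(len(c) != m for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    grand = mean(v for c in clusters for v in c)
    # Between-cluster and within-cluster mean squares.
    msb = m * sum((mean(c) - grand) ** 2 for c in clusters) / (k - 1)
    msw = sum((v - mean(c)) ** 2 for c in clusters for v in c) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Made-up outcome values for three clusters of three participants each.
clusters = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
icc = icc_oneway(clusters)
```

In this example the cluster means differ far more than values within a cluster do, so the estimated ICC is high (26/29, about 0.90); an ICC near zero would indicate that cluster membership explains little of the outcome variance.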

Neighborhood socioeconomic status and all-cause mortality
Hans Bosma, H. Dike van de Mheen, Gerard J. J. M. Borsboom, and Johan P. Mackenbach. American Journal of Epidemiology, 2001, 153.
– ...,506 participants in a survey of quality of life and contextual factors in Eindhoven, the Netherlands
– 86 neighborhood clusters
– All-cause mortality from municipal registers matched to individuals
– Four SES indicators:
  – % with primary schooling only
  – % unskilled laborers
  – % unemployed or disabled
  – % with severe financial problems

FIGURE 1. Percent deceased during follow-up by individual and neighborhood educational level. Estimated for men aged 49 years without baseline diseases (n = 6,506 deaths). Legend (individual education): High, Intermediately high, Intermediately low, Low. (Recolored from original.)

An example of a policy-effect analysis in public health
Averett SL, Rees DI, Argys LM. The impact of government policies and neighborhood characteristics on teenage sexual activity and contraceptive use. American Journal of Public Health, 92.
In this study of teenage sexual activity and contraceptive use, the predictors included individual level variables such as religious background and parental education, the section of the US from which the subsample was drawn (and associated differences in family planning service availability), and neighborhood characteristics such as median income and racial composition.

Ecologic studies
– Individual level data are not available.
– All data, including the dependent variable, are measured only at the aggregate or cluster level.
– These studies still attempt to draw some conclusions about individuals, since causal impacts on disease operate at the individual level.
– They can be thought of as multilevel studies without the individual level data.

Such analyses may address public health and epidemiological issues otherwise neglected.
See Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies in multilevel analysis. American Journal of Public Health, 88.

Ecologic studies, continued
Robinson (1950) showed that conclusions about individuals based on ecologic data are not necessarily appropriate: the “ecologic fallacy.”
Ecologic studies are still used because of:
– Low cost and convenience
– Measurement limitations of individual-level studies
– Design limitations of individual-level studies
– Interest in ecologic effects per se
– Simplicity of analysis and presentation
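The ecologic fallacy can be shown with a small constructed example in which the correlation among cluster means has the opposite sign from the correlation among individuals within every cluster. The data and function names below are made up for illustration and are not from Robinson's paper.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Three hypothetical areas; each entry is (exposure values, outcome values).
areas = [
    ([1, 2, 3], [3, 2, 1]),
    ([4, 5, 6], [6, 5, 4]),
    ([7, 8, 9], [9, 8, 7]),
]

# Ecologic analysis: only the area means are available.
ecologic_r = pearson([mean(x) for x, _ in areas], [mean(y) for _, y in areas])

# Individual level analysis within each area.
within_r = [pearson(x, y) for x, y in areas]
```

Across areas the means are perfectly positively correlated, yet within every area the individual level relationship is perfectly negative, so an inference from the ecologic correlation to individuals would have the wrong sign here.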

Karpati A, Galea S, Awerbuch T, Levins R. Variability and vulnerability at the ecological level: Implications for understanding the social determinants of health. American Journal of Public Health, 92.
An examination of variations in disease and mortality rates across US counties. Note that the regions with the greatest variability in county disease and mortality were those with the greatest variability in county SES indicators.