USE OF COMPOSITE VARIABLES: AN EPIDEMIOLOGICAL PERSPECTIVE

Slides:



Advertisements
Similar presentations
Sampling Design, Spatial Allocation, and Proposed Analyses Don Stevens Department of Statistics Oregon State University.
Advertisements

Chapter 8 Flashcards.
Cross Cultural Research
Design of Experiments Lecture I
Chapter 10 Decision Making © 2013 by Nelson Education.
Copyright © 2011 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 12 Measures of Association.
17.1 INTRODUCTION Explanatory variable CHAPTER 17 FURTHER DATA ANALYSIS 2.
Correlation and Regression
Correlation and regression
What is a sample? Epidemiology matters: a new introduction to methodological foundations Chapter 4.
Quantitative Data Analysis or Introduction to Statistics Need to know: What you want to know -- Relationship? Prediction? Causation? Description? Everything?!
Slide 1 Statistics Workshop Tutorial 4 Probability Probability Distributions.
Chapter 7 Correlational Research Gay, Mills, and Airasian
Correlation and Regression Analysis
Introduction to Regression Analysis, Chapter 13,
Simple Linear Regression Analysis
Chapter 11 Simple Regression
1 G Lect 1a Lecture 1a Perspectives on Statistics in Psychology Applications of statistical arguments Describing central tendency and variability.
Evidence-Based Practice Current knowledge and practice must be based on evidence of efficacy rather than intuition, tradition, or past practice. The importance.
Analyzing Reliability and Validity in Outcomes Assessment (Part 1) Robert W. Lingard and Deborah K. van Alphen California State University, Northridge.
Ms. Khatijahhusna Abd Rani School of Electrical System Engineering Sem II 2014/2015.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
Statistical analysis Prepared and gathered by Alireza Yousefy(Ph.D)
Random Regressors and Moment Based Estimation Prepared by Vera Tabakova, East Carolina University.
Various topics Petter Mostad Overview Epidemiology Study types / data types Econometrics Time series data More about sampling –Estimation.
Appraisal and Its Application to Counseling COUN 550 Saint Joseph College For Class # 3 Copyright © 2005 by R. Halstead. All rights reserved.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Correlation Assume you have two measurements, x and y, on a set of objects, and would like to know if x and y are related. If they are directly related,
Experimentation in Computer Science (Part 2). Experimentation in Software Engineering --- Outline  Empirical Strategies  Measurement  Experiment Process.
Correlation & Regression Analysis
Module III Multivariate Analysis Techniques- Framework, Factor Analysis, Cluster Analysis and Conjoint Analysis Research Report.
Evidence-Based Practice Evidence-Based Practice Current knowledge and practice must be based on evidence of efficacy rather than intuition, tradition,
Matching Analyses to Decisions: Can we Ever Make Economic Evaluations Generalisable Across Jurisdictions? Mark Sculpher Mike Drummond Centre for Health.
Linear Correlation (12.5) In the regression analysis that we have considered so far, we assume that x is a controlled independent variable and Y is an.
GENERALIZING RESULTS: the role of external validity.
Exploring Social Psychology by David G. Myers 7th Edition
Chapter 8 Relationships Among Variables. Outline What correlational research investigates Understanding the nature of correlation What the coefficient.
1 Study Design Imre Janszky Faculty of Medicine, ISM NTNU.
Advanced Higher STATISTICS Spearman’s Rank (Spearman’s rank correlation coefficient) Lesson Objectives 1. Explain why it is used. 2. List the advantages.
Methods of multivariate analysis Ing. Jozef Palkovič, PhD.
Multivariate Analysis - Introduction. What is Multivariate Analysis? The expression multivariate analysis is used to describe analyses of data that have.
Stats Methods at IC Lecture 3: Regression.
Statistical Forecasting
INTRODUCTION AND DEFINITIONS
Principles of Quantitative Research
Statistical tests for quantitative variables
CHAPTER 4 Research in Psychology: Methods & Design
Formulate the Research Problem
OCD Research in the Lab: A Research Pathway for Psychological Models
Understanding Results
Correlation and Regression
Organizational Effectiveness
Understanding Research Results: Description and Correlation
Making Causal Inferences and Ruling out Rival Explanations
Correlation and Regression
Analyzing Reliability and Validity in Outcomes Assessment Part 1
Developing and Evaluating Theories of Behavior
Cross Sectional Designs
Techniques for Data Analysis Event Study
Introduction.
Experimental Design: The Basic Building Blocks
Seminar in Economics Econ. 470
Product moment correlation
An Introduction to Correlational Research
15.1 The Role of Statistics in the Research Process
Linear Panel Data Models
Analyzing Reliability and Validity in Outcomes Assessment
Multivariate Analysis - Introduction
DRQ #10 AGEC pts October 17, 2013 (1 pt) 1. Calculate the median of the following sample of observations for a variable labeled.
Chapter Ten: Designing, Conducting, Analyzing, and Interpreting Experiments with Two Groups The Psychologist as Detective, 4e by Smith/Davis.
Presentation transcript:

USE OF COMPOSITE VARIABLES: AN EPIDEMIOLOGICAL PERSPECTIVE David Coggon 1

EPIDEMIOLOGY Investigates the distribution of health and its determinants Descriptive studies describe the distribution of health-related variables in defined populations at defined times Analytical studies explore the interrelationship of health-related variables (causation or prediction) 2

SPECIFICATION AND MODELLING OF VARIABLES Both types of study require specification of the variables to be analysed May be measured directly (e.g. weight, peak expiratory flow, blood lead) Often are composites of two or measures (e.g. many case definitions, incidence, BMI, forced expiratory ratio) Analytical studies may also entail statistical modelling 3

STATISTICAL MODELLING Makes assumptions about nature of relationships between variables Assumed relationships should have conceptual validity Validity can be explored empirically In some cases, the use of composite variables can be considered a form of modelling 4

REASONS FOR USING COMPOSITE VARIABLES Simple scaling (e.g. incidence, prevalence) Best practicable proxy for a gold standard (e.g. case definition for myocardial infarction) No gold standard available, but composite has conceptual and possibly predictive validity (e.g. scales for depression, somatising tendency, ERI, job strain) 5

PROS AND CONS OF COMPOSITE VARIABLES Composite variables can simplify analysis, and are of proven utility But combining primary measures in composite variables may lose useful information (e.g. BMI, ERI) Use of ratios has been a particular concern 6

PROBLEMS WITH RATIOS Pearson (1897) referred to “spurious correlation” that can occur between ratios even if all of the component variables of the ratios are uncorrelated Neyman (1952) constructed fictitious example in which number of storks per 10,000 women and number of babies per 10,000 women was highly correlated across a sample of counties, although within counties having same numbers of women number of storks was unrelated to number of babies (counties with most women tended to have fewer storks/10,000 women and fewer babies/10,000 women) 7

RECOMMENDATIONS OF KRONMAL (1993) “The use of ratios in regression analyses should be avoided” “Ratios should only be used in the context of a full linear model in which the variables that make up that ratio are included and the intercept term is also present” 8

SOME THOUGHTS ON KRONMAL Problems identified reflect sub-optimal specification of models and are not limited to ratios. They could apply to products (X*Y = X/[1/Y]), and to compound variables more generally Any variable can be expressed as a ratio of two other variables (X = [X/Y] / [1/Y]) Including a term for each variable and intercept may still not be optimal (e.g. power functions might be relevant) Potential disadvantages of over-elaborate models, especially if data-driven 9

DECISIONS IN DESIGN AND ANALYSIS Data to be collected Specification of variables from those data (may be directly measured or a function of what is measured) Specification of statistical models 10

HOW SHOULD DECISIONS BE MADE? No universal rules, only guiding principles Prime consideration is study question and conceptual understanding Collection of data should be practicable and efficient Variables analysed should have validity (conceptual, repeatability, predictive, against a gold standard) Consistency with other research Statistical models should be conceptually valid but parsimonious Where optimal choices are uncertain, can explore alternatives, but data-driven models may in part reflect random sampling variation 11

Ultimately, what matters is practical utility 12

FACTORS FAVOURING USE OF COMPLEX VARIABLES Strong conceptual grounding, ideally with empirical evidence of validity Parsimony without undue loss of validity Compatiblility with other research (meta-analysis) 13