Epidemiologic Methods- Fall 2002. Course Administration Format –Lectures: Tuesdays 8:15 am, except for Dec. 10 at 1:30 pm –Small Group Sections: Tuesdays.

Presentation transcript:

Epidemiologic Methods- Fall 2002

Course Administration Format –Lectures: Tuesdays 8:15 am, except for Dec. 10 at 1:30 pm –Small Group Sections: Tuesdays 1:00 pm except for last Section, Dec. 3, from 10:30 to 11:30. Begin next week Content: Overview and discussion of lectures, and review of assignments. Textbooks –Epidemiology: Beyond the Basics by Szklo and Nieto (S & N). –Multivariable Analysis: A Practical Guide for Clinicians by M. Katz Grading –Based on points achieved on homework (~80%) & final (~20%). –Late assignments are not accepted. Missed sessions –All material distributed in class is posted on website.

Definitions of Epidemiology The study of the distribution and determinants (causes) of disease –e.g. cardiovascular epidemiology The method used to conduct human subject research –the methodologic foundation of any research where individual humans or groups of humans are the unit of observation

Understanding Measurement: Aspects of Reproducibility and Validity Review Measurement Scales Reproducibility vs Validity Reproducibility –importance –sources of measurement variability –methods of assessment by variable type: interval vs categorical

Clinical Research Sample Measure (Intervene) Analyze Infer

A study can only be as good as the data... -Martin Bland

Measurement Scales

Reproducibility vs Validity Reproducibility –the degree to which a measurement provides the same result each time it is performed on a given subject or specimen Validity –from the Latin validus - strong –the degree to which a measurement truly measures (represents) what it purports to measure (represent)

Reproducibility vs Validity Reproducibility –aka: reliability, repeatability, precision, variability, dependability, consistency, stability Validity –aka: accuracy

Relationship Between Reproducibility and Validity Good Reproducibility Poor Validity Poor Reproducibility Good Validity

Relationship Between Reproducibility and Validity Good Reproducibility Good Validity Poor Reproducibility Poor Validity

Why Care About Reproducibility? Impact on Validity Mathematically, the upper limit of a measurement’s validity is a function of its reproducibility Consider a study to measure height in the community: –Assume the measurement has imperfect reproducibility: if we measure height twice on a given person, we get two different values; at least 1 of the 2 values must be wrong (imperfect validity) –If the study measures everyone only once, errors, despite being random, will lead to biased inferences when using these measurements (i.e. the measurements lack validity)

Impact of Reproducibility on Statistical Precision Classical Measurement Theory: –observed value (O) = true value (T) + measurement error (E) –If we assume E is random and normally distributed: $E \sim N(0, \sigma^2_E)$ [Figure: distribution of the error term]

Impact of Reproducibility on Statistical Precision –observed value (O) = true value (T) + measurement error (E) –E is random and $\sim N(0, \sigma^2_E)$ When measuring a group of subjects, the variability of observed values is a combination of: the variability in their true values and the variability in the measurement error $\sigma^2_O = \sigma^2_T + \sigma^2_E$
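The variance decomposition above can be checked with a small simulation. This is only a sketch: the mean height of 170, the standard deviations of 10 and 5, the sample size and the function name `simulate_observed` are illustrative assumptions, not values from the lecture.

```python
import random
import statistics


def simulate_observed(n_subjects, sd_true, sd_error, mean_true=170.0, seed=0):
    """Simulate O = T + E with independent, normally distributed T and E."""
    rng = random.Random(seed)
    true_values = [rng.gauss(mean_true, sd_true) for _ in range(n_subjects)]
    errors = [rng.gauss(0.0, sd_error) for _ in range(n_subjects)]
    observed = [t + e for t, e in zip(true_values, errors)]
    return true_values, errors, observed


# Hypothetical heights: true SD of 10, measurement-error SD of 5.
true_values, errors, observed = simulate_observed(20000, sd_true=10.0, sd_error=5.0)

var_t = statistics.pvariance(true_values)
var_e = statistics.pvariance(errors)
var_o = statistics.pvariance(observed)

# The observed variance should be close to the sum of the two components.
print("var(T) + var(E):", round(var_t + var_e, 1))
print("var(O):         ", round(var_o, 1))
```

The two printed values agree only approximately, because in a finite sample the true values and the errors are never exactly uncorrelated.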

Why Care About Reproducibility? $\sigma^2_O = \sigma^2_T + \sigma^2_E$ More measurement error means more variability in observed measurements –e.g. measure height in a group of subjects. [Figure: distributions of observed height with no measurement error and with measurement error]

Why Care About Reproducibility? $\sigma^2_O = \sigma^2_T + \sigma^2_E$ More variability of observed measurements has profound influences on statistical precision/power: – Descriptive studies: wider confidence intervals – RCTs: power to detect a treatment difference is reduced – Observational studies: power to detect an influence of a particular risk factor upon a given disease is reduced.

Mathematical Definition of Reproducibility Reproducibility $= \sigma^2_T / \sigma^2_O = \sigma^2_T / (\sigma^2_T + \sigma^2_E)$ Varies from 0 (poor) to 1 (optimal) As $\sigma^2_E$ approaches 0 (no error), reproducibility approaches 1
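A minimal sketch of this definition, assuming the usual form in which reproducibility is the share of observed variance that is due to true between-subject variance. The function name `reproducibility` and the example variances are hypothetical.

```python
def reproducibility(var_true, var_error):
    """Return var_true / (var_true + var_error), a value between 0 and 1."""
    if var_true < 0 or var_error < 0:
        raise ValueError("variances must be non-negative")
    total = var_true + var_error
    if total == 0:
        raise ValueError("at least one variance must be positive")
    return var_true / total


# Hypothetical variances: as the error variance shrinks, the ratio approaches 1.
for var_error in (100.0, 25.0, 1.0, 0.0):
    print(var_error, round(reproducibility(100.0, var_error), 3))
```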

[Figure: Power (Phillips and Smith, J Clin Epi 1993)]

Sources of Measurement Error Observer within-observer (intrarater) between-observer (interrater) Instrument within-instrument between-instrument

Sources of Measurement Error e.g. plasma HIV viral load –observer: measurement to measurement differences in tube filling, time before processing –instrument: run to run differences in reagent concentration, PCR cycle times, enzymatic efficiency

Within-Subject Variability Although not the fault of the measurement process, moment-to-moment biological variability can have the same effect as errors in the measurement process Recall that: –observed value (O) = true value (T) + measurement error (E) –T = the average of measurements taken over time –E is always in reference to T –Therefore, lots of moment-to-moment within-subject biologic variability will serve to increase the variability in the error term and thus increase overall variability because $\sigma^2_O = \sigma^2_T + \sigma^2_E$

Assessing Reproducibility Depends on measurement scale Interval Scale –within-subject standard deviation –coefficient of variation Categorical Scale –Cohen’s Kappa

Reproducibility of an Interval Scale Measurement: Peak Flow Assessment requires >1 measurement per subject Peak Flow Rate in 17 adults (Bland & Altman)

Assessment by Simple Correlation

Pearson Product-Moment Correlation Coefficient –r (rho) ranges from -1 to +1 –r describes the strength of linear association –$r^2$ = proportion of variance (variability) of one variable accounted for by the other variable

[Figure: scatterplots illustrating r = -1.0, r = 0.0, r = 0.8 and r = 1.0]

Correlation Coefficient for Peak Flow Data r ( meas.1, meas. 2) = 0.98

Limitations of Simple Correlation for Assessment of Reproducibility Depends upon range of data –e.g. Peak Flow r (full range of data) = 0.98 r (peak flow <450) = 0.97 r (peak flow >450) = 0.94

Limitations of Simple Correlation for Assessment of Reproducibility Depends upon ordering of data Measures linear association only

[Figure: scatterplot of Meas. 2 against Meas. 1]

Limitations of Simple Correlation for Assessment of Reproducibility Gives no meaningful parameter using the same scale as the original measurement
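The point that correlation measures linear association rather than agreement can be sketched as follows. The readings are made up for illustration and `pearson_r` is a hand-written helper, not code from the lecture.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical first readings (l/min).
meas1 = [200.0, 300.0, 400.0, 500.0, 600.0]
# Second readings that never agree with the first ones:
meas2_shifted = [v + 50.0 for v in meas1]  # constant offset of 50 l/min
meas2_scaled = [1.2 * v for v in meas1]    # 20% too high at every level

print(pearson_r(meas1, meas2_shifted))
print(pearson_r(meas1, meas2_scaled))
```

Both correlations are essentially 1 even though no pair of readings agrees, and r carries no units that could be compared with the measurement itself.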

Within-Subject Standard Deviation Mean within-subject standard deviation ($s_w$) = 15.3 l/min

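Computationally easier with ANOVA table: the mean within-subject standard deviation ($s_w$) is the square root of the within-subject (residual) mean square from a one-way ANOVA with subject as the factor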
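One way to compute the within-subject standard deviation by hand is sketched below, assuming every subject has the same number of replicates (in that case the mean of the per-subject variances equals the ANOVA residual mean square). The function name and the three pairs of readings are hypothetical; they are not the Peak Flow data.

```python
import math
import statistics


def within_subject_sd(replicates):
    """Square root of the mean within-subject variance.

    `replicates` holds one list per subject with at least 2 measurements each.
    Assumes a balanced design (same number of replicates for every subject).
    """
    variances = [statistics.variance(subject) for subject in replicates]
    return math.sqrt(sum(variances) / len(variances))


# Hypothetical duplicate readings (l/min) for three subjects.
data = [[490.0, 500.0], [400.0, 420.0], [610.0, 600.0]]
print(within_subject_sd(data))
```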

$s_w$: Further Interpretation If we assume that replicate results: – are normally distributed – have a mean that estimates the true value then 95% of replicates are within $(1.96)(s_w)$ of the true value [Figure: normal curve centered at $\bar{x} \approx$ true value, with $s_w$ and $(1.96)(s_w)$ marked]

$s_w$: Peak Flow Data If we assume that replicate results: – are normally distributed – have a mean that estimates the true value then 95% of replicates are within (1.96)(15.3) = 30 l/min of the true value [Figure: normal curve centered at $\bar{x} \approx$ true value, with $s_w$ = 15.3 l/min and $(1.96)(s_w)$ = (1.96)(15.3) = 30 marked]

$s_w$: Further Interpretation Difference between any 2 replicates for the same person = diff = meas$_1$ - meas$_2$ Because var(diff) = var(meas$_1$) + var(meas$_2$) when the two errors are independent, $s^2_{diff} = s_w^2 + s_w^2 = 2s_w^2$ and therefore $s_{diff} = \sqrt{2}\,s_w$

$s_w$: Difference Between Two Replicates If we assume that differences: – are normally distributed with a mean of 0 – have a standard deviation estimated by $s_{diff}$ then the difference between 2 measurements for the same subject is expected to be less than $(1.96)(s_{diff}) = (1.96)(1.41)s_w = 2.77s_w$ for 95% of all pairs of measurements [Figure: normal curve centered at $\bar{x}_{diff} \approx 0$, with $s_{diff}$ and $(1.96)(s_{diff})$ marked]

$s_w$: Further Interpretation For Peak Flow data: The difference between 2 measurements for the same subject is expected to be less than $2.77s_w$ = (2.77)(15.3) = 42.4 l/min for 95% of all pairs Bland-Altman refer to this as the “repeatability” of the measurement
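The repeatability arithmetic can be reproduced in a few lines. This sketch assumes independent, normally distributed errors with a common within-subject standard deviation; the function name `repeatability` is an illustrative choice.

```python
import math


def repeatability(s_w, z=1.96):
    """Bound expected to contain 95% of absolute differences between two
    replicates on the same subject: z * sqrt(2) * s_w (about 2.77 * s_w)."""
    return z * math.sqrt(2.0) * s_w


# Within-subject SD quoted for the Peak Flow data (l/min).
print(round(repeatability(15.3), 1))  # 42.4
```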

One Common Underlying $s_w$ Appropriate only if there is one $s_w$, i.e., $s_w$ does not vary with the true underlying value [Figure: within-subject standard deviation vs. subject mean peak flow] Kendall’s correlation coefficient = 0.17, p = 0.36

Another Interval Scale Example Salivary cotinine in children (Bland-Altman) n = 20 participants measured twice

Cotinine: Absolute Difference vs. Mean [Figure: subject absolute difference vs. subject mean cotinine] Kendall’s tau = 0.62, p = 0.001

Logarithmic Transformation

Log Transformed: Absolute Difference vs. Mean [Figure: subject absolute log difference vs. subject mean log cotinine] Kendall’s tau = 0.07, p = 0.7

$s_w$ for log-transformed cotinine data: $s_w$ = 0.175 Back-transforming to native scale: antilog($s_w$) = antilog(0.175) = $10^{0.175}$ = 1.49

Coefficient of Variation On the natural scale, there is not one common within-subject standard deviation for the cotinine data Therefore, there is not one absolute number that can represent the difference any replicate is expected to be from the true value or from another replicate Instead, within-subject standard deviation varies with the level of the measurement and it is reasonable to depict the within-subject standard deviation as a % of the level = coefficient of variation

Cotinine Data Coefficient of variation = antilog($s_w$) - 1 = 1.49 - 1 = 0.49 At any level of cotinine, the within-subject standard deviation of repeated measures is 49% of the level
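The back-transformation can be sketched as below. It assumes the logarithms are base 10, which is the base that reproduces the quoted antilog of 0.175; the function name is hypothetical.

```python
def cv_from_log10_sd(s_w_log10):
    """Coefficient of variation implied by a within-subject SD that was
    estimated on log10-transformed data: antilog(s_w) - 1."""
    return 10.0 ** s_w_log10 - 1.0


cv = cv_from_log10_sd(0.175)
print(cv)
```

Unrounded, the result is about 0.496; the slide's 0.49 comes from using the antilog as 1.49.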

Coefficient of Variation for Peak Flow Data By definition, when the within-subject standard deviation is not proportional to the mean value, as in the Peak Flow data, there is no constant ratio between the within-subject standard deviation and the mean. Therefore, there is not one common coefficient of variation, and estimating the “average” coefficient of variation is not very meaningful

Peak Flow Data: Use of Coefficient of Variation when $s_w$ is Constant

Reproducibility of Categorical Measurements: Kappa Statistic Kappa measures agreement above that expected by chance –(observed agreement - chance agreement) is the amount of agreement above chance –If the maximum amount of agreement is 1.0, then (1 - chance agreement) is the maximum amount of agreement above chance that is possible –Therefore, kappa is the ratio of “agreement beyond chance” to “maximal possible agreement beyond chance”: $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ is the chance agreement
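A short sketch of Cohen's kappa computed from a cross-tabulation of two ratings, with chance agreement taken from the row and column totals. The 2x2 counts and the function name are made up for illustration.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts.

    Rows are the categories assigned by rating 1, columns those of rating 2.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    if chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - chance) / (1.0 - chance)


# Hypothetical counts: two readers classifying 100 films as abnormal/normal.
table = [[40, 10],
         [5, 45]]
print(cohens_kappa(table))
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.35 / 0.50 = 0.7.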

Sources of Measurement Variability: Which to Assess? Observer within-observer (intrarater) between-observer (interrater) Instrument within-instrument between-instrument Subject within-subject Which to assess depends upon the use of the measurement and how/when the measurement will be made: –For clinical use: all of the above are needed –For research: depends upon logistics of study (e.g., within-observer and within-instrument only are needed if just one person/instrument used throughout study)

Assessing Validity Measures can be assessed for validity in 3 ways: –Content validity Face Sampling –Construct validity –Empirical validity (aka criterion) Concurrent (i.e. when gold standards are present) –Interval scale measurement: 95% limits of agreement –Categorical scale measurement: sensitivity & specificity Predictive
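For a categorical measurement compared against a gold standard, sensitivity and specificity follow directly from the four cells of the 2x2 table. The counts below are hypothetical and the function name is an illustrative choice.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical: 100 subjects with disease and 100 without, per the gold standard.
sens, spec = sensitivity_specificity(true_pos=90, false_neg=10,
                                     true_neg=80, false_pos=20)
print(sens, spec)
```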

Conclusions Measurement reproducibility plays a key role in determining validity and statistical precision in all different study designs When assessing reproducibility, for interval scale measurements: avoid correlation coefficients use within-subject standard deviation if constant or coefficient of variation if within-subject sd is proportional to the magnitude of measurement For categorical scale measurements, use Kappa What is acceptable reproducibility depends upon desired use Assessment of validity depends upon whether or not gold standards are present, and can be a challenge when they are absent

Assessing Validity - With Gold Standards A new and simpler device to measure peak flow becomes available (Bland-Altman)

Plot of Difference vs. Gold Standard [Figure: difference plotted against gold standard]

Examine the Differences [Figure: difference plotted against gold standard, with example differences $d_1$ = -81, $d_2$ = 7, $d_3$ = -35]

Are the Differences Normally Distributed?

The mean difference ($\bar{d}$) describes any systematic difference between the gold standard and the new device. The standard deviation of the differences is $s_d$ = 38.8 l/min. 95% of differences will lie within $\bar{d} \pm (1.96)(38.8)$, or from -78 to 74 l/min (limits that imply a mean difference of about -2 l/min). These are the 95% limits of agreement
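The 95% limits of agreement can be sketched as follows. The function names and the sign convention (new device minus gold standard) are assumptions, and the mean difference of -2 l/min is simply the midpoint of the quoted limits rather than a value stated on the slide.

```python
import statistics


def limits_from_summary(mean_diff, sd_diff, z=1.96):
    """95% limits of agreement: mean difference +/- z * SD of differences."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def limits_of_agreement(new, gold, z=1.96):
    """Limits of agreement from paired readings (difference = new - gold)."""
    diffs = [n - g for n, g in zip(new, gold)]
    return limits_from_summary(statistics.mean(diffs), statistics.stdev(diffs), z)


# Assumed mean difference of -2 l/min with the quoted SD of 38.8 l/min.
lower, upper = limits_from_summary(-2.0, 38.8)
print(round(lower), round(upper))  # -78 74
```

With raw paired readings, `limits_of_agreement(new, gold)` gives the same kind of interval directly from the data.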