Epidemiologic Methods. Definitions of Epidemiology The study of the distribution and determinants (causes) of disease –e.g. cardiovascular epidemiology.


1 Epidemiologic Methods

2 Definitions of Epidemiology The study of the distribution and determinants (causes) of disease –e.g. cardiovascular epidemiology The method used to conduct human subject research –the methodologic foundation of any research where individual humans or groups of humans are the unit of observation

3 Understanding Measurement: Aspects of Reproducibility and Validity Review Measurement Scales Reproducibility –importance –methods of assessment by variable type: interval vs categorical intra- vs. inter-observer comparison Validity –methods of assessment gold standards present no gold standard available

4 Clinical Research Sample Measure Analyze Infer

5 A study can only be as good as the data... -Martin Bland

6 Measurement Scales

7 Reproducibility vs Validity Reproducibility –the degree to which a measurement provides the same result each time it is performed on a given subject or specimen Validity –from the Latin validus - strong –the degree to which a measurement truly measures (represents) what it purports to measure (represent)

8 Reproducibility vs Validity Reproducibility –aka: reliability, repeatability, precision, variability, dependability, consistency, stability Validity –aka: accuracy

9 Relationship Between Reproducibility and Validity Good Reproducibility Poor Validity Poor Reproducibility Good Validity

10 Relationship Between Reproducibility and Validity Good Reproducibility Good Validity Poor Reproducibility Poor Validity

11 Why Care About Reproducibility? Impact on Validity Mathematically, the upper limit of a measurement’s validity is a function of its reproducibility Consider a study to measure height in the community: –if we measure height twice on a given person and get two different values, then one of the two values must be wrong (invalid) –if study measures everyone only once, errors, despite being random, may not balance out –final inferences are likely to be wrong (invalid)

12 Why Care About Reproducibility? Impact on Statistical Precision Classical Measurement Theory: observed value (O) = true value (T) + measurement error (E) E is random and $E \sim N(0, \sigma_E^2)$ Therefore, when measuring a group of subjects, the variability of observed values is a combination of the variability in their true values and measurement error: $\sigma_O^2 = \sigma_T^2 + \sigma_E^2$
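The variance decomposition above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the mean, the two standard deviations, the sample size and the function name `simulate_observed` are all hypothetical, and T and E are assumed independent and normally distributed.

```python
import random
import statistics


def simulate_observed(n, true_mean, sd_true, sd_error, seed=0):
    """Draw n subjects under O = T + E with independent normal T and E."""
    rng = random.Random(seed)
    true_values = [rng.gauss(true_mean, sd_true) for _ in range(n)]
    errors = [rng.gauss(0.0, sd_error) for _ in range(n)]
    observed = [t + e for t, e in zip(true_values, errors)]
    return true_values, errors, observed


# Hypothetical height-like data: sigma_T = 10, sigma_E = 5 (assumed values).
true_values, errors, observed = simulate_observed(
    n=20000, true_mean=170.0, sd_true=10.0, sd_error=5.0
)
var_t = statistics.variance(true_values)
var_e = statistics.variance(errors)
var_o = statistics.variance(observed)

print(f"var(T) = {var_t:.1f}, var(E) = {var_e:.1f}, var(O) = {var_o:.1f}")
print(f"var(T) + var(E) = {var_t + var_e:.1f}")
```

The observed variance should land close to the sum of the two components; the small gap that remains is the sample covariance between T and E, which shrinks as n grows.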

13 Why Care About Reproducibility? $\sigma_O^2 = \sigma_T^2 + \sigma_E^2$ More measurement error means more variability in observed measurements More variability of observed measurements has profound influences on statistical precision/power: – Descriptive study: less precise estimates of given traits – RCTs: power to detect a treatment difference is reduced – Observational studies: power to detect an influence of a particular exposure upon a given outcome is reduced.

14 Conceptual Definition of Reproducibility Reproducibility varies from 0 (poor) to 1 (optimal) As $\sigma_E^2$ approaches 0 (no error), reproducibility approaches 1
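One form consistent with the two properties stated above is the share of observed variance that comes from true values. It is written out here as an assumption, using the symbols of classical measurement theory from the earlier slides, since the transcript itself does not spell the ratio out:

```latex
\text{Reproducibility}
  = \frac{\sigma_T^2}{\sigma_O^2}
  = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

Under this form, $\sigma_E^2 \to 0$ gives a value of 1, and a measurement dominated by error ($\sigma_E^2 \gg \sigma_T^2$) gives a value near 0.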

15 Phillips and Smith, J Clin Epi 1993

16 Sources of Measurement Variability Observer within-observer (intrarater) between-observer (interrater) Instrument within-instrument between-instrument Subject within-subject

17

18 Sources of Measurement Variability e.g. plasma HIV viral load –observer: measurement to measurement differences in tube filling, time before processing –instrument: run to run differences in reagent concentration, PCR cycle times, enzymatic efficiency –subject: biologic variation in viral load

19 Assessing Reproducibility Depends on measurement scale Interval Scale –within-subject standard deviation –coefficient of variation Categorical Scale –Cohen’s Kappa

20 Reproducibility of an Interval Scale Measurement: Peak Flow Assessment requires >1 measurement per subject Peak Flow Rate in 17 adults (Bland & Altman)

21 Assessment by Simple Correlation

22 Pearson Product-Moment Correlation Coefficient r (the sample estimate of rho) ranges from -1 to +1 r describes the strength of the linear association $r^2$ = proportion of variance (variability) of one variable accounted for by the other variable

23 [Figure: example scatterplots for r = -1.0, r = 0.0, r = 0.8 and r = 1.0]

24 Correlation Coefficient for Peak Flow Data r ( meas.1, meas. 2) = 0.98

25 Limitations of Simple Correlation for Assessment of Reproducibility Depends upon range of data –e.g. Peak Flow r (full range of data) = 0.98 r (peak flow <450) = 0.97 r (peak flow >450) = 0.94

26

27 Limitations of Simple Correlation for Assessment of Reproducibility Depends upon ordering of data Measures linear association only
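The point that correlation measures linear association rather than agreement can be illustrated with a short sketch. The readings below are hypothetical (not the Bland & Altman peak flow data), and `pearson_r` is a hand-written helper rather than a library call.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical peak flow readings (l/min). The second reading is an exact
# linear function of the first, yet it never agrees with it.
meas1 = [200.0, 300.0, 400.0, 500.0, 600.0]
meas2 = [1.2 * v + 50.0 for v in meas1]

r = pearson_r(meas1, meas2)
largest_gap = max(abs(a - b) for a, b in zip(meas1, meas2))

print(f"r = {r:.3f}, largest disagreement = {largest_gap:.0f} l/min")
```

Here r is essentially 1 even though every pair of readings differs by at least 90 l/min, which is why a high correlation alone says little about reproducibility.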

28 [Figure: scatterplot of Meas. 2 vs. Meas. 1, both axes running from 100 to 1700]

29 Limitations of Simple Correlation for Assessment of Reproducibility Gives no meaningful parameter for the issue

30 Within-Subject Standard Deviation Mean within-subject standard deviation ($s_w$) = 15.3 l/min

31 Computationally easier with ANOVA table: Mean within-subject standard deviation ($s_w$) = $\sqrt{\text{within-subject (residual) mean square}}$
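A minimal sketch of the calculation, assuming every subject has the same number of replicates (in which case the mean of the per-subject variances equals the within-subject mean square of a one-way ANOVA). The three subjects are hypothetical values, not the peak flow data, and `within_subject_sd` is an illustrative name.

```python
import math
import statistics


def within_subject_sd(replicates):
    """Common within-subject SD from per-subject replicate lists.

    Assumes each subject has at least 2 replicates and that all subjects
    have the same number of replicates.
    """
    variances = [statistics.variance(subject) for subject in replicates]
    return math.sqrt(statistics.mean(variances))


# Hypothetical duplicate peak flow readings (l/min) for three subjects.
replicates = [
    [490.0, 500.0],
    [400.0, 390.0],
    [520.0, 540.0],
]
s_w = within_subject_sd(replicates)
print(f"s_w = {s_w:.1f} l/min")
```

Averaging the variances, rather than the standard deviations, before taking the square root is what keeps the result in line with the ANOVA table.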

32 $s_w$: Further Interpretation If we assume that replicate results are: – normally distributed – mean of replicates estimates true value – standard deviation estimated by $s_w$ Then 95% of replicates will be within $(1.96)(s_w)$ of the true value For Peak Flow data: –95% of replicates will be within (1.96)(15.3) = 30.0 l/min of the true value

33 $s_w$: Further Interpretation Difference between any 2 replicates for same person = diff = meas 1 - meas 2 Because var(diff) = var(meas 1) + var(meas 2) when the errors in the two replicates are independent, $s_{diff}^2 = s_w^2 + s_w^2 = 2s_w^2$, so $s_{diff} = \sqrt{2}\,s_w \approx 1.41\,s_w$ If we assume the distribution of the differences between pairs is $N(0, \sigma_{diff}^2)$, then: –The absolute difference between 2 measurements for the same subject is expected to be less than $(1.96)(s_{diff}) = (1.96)(1.41)s_w = 2.77s_w$ for 95% of all pairs of measurements

34 $s_w$: Further Interpretation For Peak Flow data: The difference between 2 measurements for the same subject is expected to be less than $2.77s_w$ = (2.77)(15.3) = 42.4 l/min for 95% of all pairs Bland-Altman refer to this as the “repeatability” of the measurement
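The two 95% bounds quoted for the peak flow data follow directly from the within-subject standard deviation. A minimal sketch of the arithmetic, where the function name `replicate_bounds` is illustrative and 15.3 l/min is the value given in the slides:

```python
import math


def replicate_bounds(s_w, z=1.96):
    """Return (bound on replicate vs. true value, bound on replicate vs. replicate)."""
    within_true = z * s_w                      # 1.96 * s_w
    repeatability = z * math.sqrt(2.0) * s_w   # 1.96 * sqrt(2) * s_w, about 2.77 * s_w
    return within_true, repeatability


within_true, repeatability = replicate_bounds(15.3)
print(f"95% of replicates within {within_true:.1f} l/min of the true value")
print(f"95% of replicate pairs differ by less than {repeatability:.1f} l/min")
```

The second bound is wider by a factor of $\sqrt{2}$ because a difference between two replicates carries the error of both.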

35 Interpreting $s_w$ Appropriate only if there is one $s_w$, i.e. if $s_w$ does not vary with the true underlying value [Figure: within-subject standard deviation vs. subject mean peak flow] Kendall’s correlation coefficient = 0.17, p = 0.36

36 Another Interval Scale Example Salivary cotinine in children (Bland-Altman) n = 20 participants measured twice

37 Simple Correlation of Two Trials [Figure: scatterplot of trial 2 vs. trial 1, both axes running from 0 to 6]

38 Correlation of Cotinine Replicates

39 Cotinine: Absolute Difference vs. Mean [Figure: subject absolute difference vs. subject mean cotinine] Kendall’s tau = 0.62, p = 0.001
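The check behind this plot, whether the size of the replicate difference rises with the subject mean, can be sketched as follows. This is an illustration under stated assumptions: the replicate pairs are hypothetical rather than the cotinine data, the hand-written `kendall_tau_a` is the simple tau-a variant (ties count as neither concordant nor discordant), and no p-value is computed.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def abs_diff_vs_mean(pairs):
    """Per-subject mean and absolute difference for duplicate measurements."""
    means = [(a + b) / 2 for a, b in pairs]
    abs_diffs = [abs(a - b) for a, b in pairs]
    return means, abs_diffs


# Hypothetical duplicate measurements whose error grows with the level.
pairs = [(1.0, 1.1), (2.0, 2.4), (4.0, 4.9), (6.0, 7.5)]
means, abs_diffs = abs_diff_vs_mean(pairs)
tau = kendall_tau_a(means, abs_diffs)
print(f"Kendall's tau-a between mean and |difference| = {tau:.2f}")
```

A tau well above zero, as in this made-up example, is the signal that a single within-subject standard deviation does not describe the whole range of the measurement.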

40 Logarithmic Transformation

41 Log Transformed: Absolute Difference vs. Mean [Figure: subject absolute log difference vs. subject mean log cotinine] Kendall’s tau = 0.07, p = 0.7

42 $s_w$ for log-transformed cotinine data Back-transforming $s_w$ to original units: antilog($s_w$) = antilog(0.175) = 1.49

43 Coefficient of Variation On the natural scale, there is not one common within-subject standard deviation for the cotinine data Therefore, there is not one absolute number that can represent the difference any replicate is expected to be from the true value or from another replicate Instead, the ratio (within-subject standard deviation) / (mean) = coefficient of variation

44 Cotinine Data Coefficient of variation = 1.49 - 1 = 0.49 At any level of cotinine, the within-subject standard deviation of repeated measures is 49% of the level
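The back-transformation from the log scale to a coefficient of variation can be sketched as below. The slides do not state the base of the logarithm; base 10 is assumed here because it reproduces an antilog close to 1.49, and the function name `cv_from_log_sd` is illustrative.

```python
def cv_from_log_sd(s_w_log, base=10.0):
    """Coefficient of variation implied by a within-subject SD on the log scale.

    base=10.0 is an assumption; the source does not say which log was used.
    """
    return base ** s_w_log - 1.0


ratio = 10.0 ** 0.175          # antilog of the log-scale s_w
cv = cv_from_log_sd(0.175)     # antilog minus 1
print(f"antilog = {ratio:.3f}, coefficient of variation = {cv:.3f}")
```

With the rounded input 0.175 this gives roughly 0.50 rather than the 0.49 on the slide; the small gap is presumably due to rounding of $s_w$ before it was reported.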

45 Coefficient of Variation for Peak Flow Data By definition, when the within-subject standard deviation is not proportional to the mean value, as in the Peak Flow data, then there is not a constant ratio between the within-subject standard deviation and the mean. Therefore, there is not one common coefficient of variation Estimating the coefficient of variation by taking the common within-subject standard deviation and dividing by the overall mean of the subjects is not very meaningful

46 Intraclass Correlation Coefficient, $r_I$ Averages correlation across all possible orderings of replicates Varies from 0 (poor) to 1 (optimal) As $\sigma_E^2$ approaches 0 (no error), $r_I$ approaches 1 Advantages: not dependent upon ordering of replicates; does not mistake linear association for agreement; allows >2 replicates Disadvantages: still dependent upon range of data in sample; still does not give a meaningful parameter on the actual scale of measurement in question

47 Intraclass Correlation Coefficient, $r_I$ $r_I$ is computed from the ANOVA sums of squares, where: – m = no. of replicates per person – $SS_b$ = sum of squares between subjects – $SS_t$ = total sum of squares $r_I$ (peak flow) = 0.98 $r_I$ (cotinine) = 0.69
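A minimal sketch of the calculation, assuming the one-way ANOVA form $r_I = (m\,SS_b - SS_t) / ((m-1)\,SS_t)$ given by Bland and Altman, with the same number of replicates m for every subject. The data are hypothetical and `intraclass_correlation` is an illustrative name, not a library function.

```python
def intraclass_correlation(replicates):
    """Intraclass correlation from per-subject replicate lists.

    Assumes the form (m*SS_b - SS_t) / ((m - 1)*SS_t) and equal m per subject.
    """
    m = len(replicates[0])
    if m < 2 or any(len(subject) != m for subject in replicates):
        raise ValueError("each subject needs the same number (>= 2) of replicates")

    all_values = [value for subject in replicates for value in subject]
    grand_mean = sum(all_values) / len(all_values)

    # Total sum of squares around the grand mean.
    ss_total = sum((value - grand_mean) ** 2 for value in all_values)
    # Between-subject sum of squares: m times squared deviation of each subject mean.
    ss_between = sum(m * (sum(subject) / m - grand_mean) ** 2 for subject in replicates)

    return (m * ss_between - ss_total) / ((m - 1) * ss_total)


# Hypothetical duplicate peak flow readings (l/min) for three subjects.
replicates = [
    [490.0, 500.0],
    [400.0, 390.0],
    [520.0, 540.0],
]
r_i = intraclass_correlation(replicates)
print(f"r_I = {r_i:.2f}")
```

Because the result depends on the between-subject sum of squares, the same measurement error yields a lower $r_I$ in a sample with a narrower range of true values.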

48 Reproducibility of a Categorical Measurement: Chest X-Rays On 2 different occasions, a radiologist is given the same 100 CXR’s from a group of high-risk smokers to evaluate for masses How should reproducibility in reading be assessed?

49

50

51 Kappa Agreement above that expected by chance (observed agreement - chance agreement) is the amount of agreement above chance If maximum amount of agreement is 1.0, then (1 - chance agreement) is the maximum amount of agreement above chance that is possible Therefore, kappa is the ratio of “agreement beyond chance” to “maximal possible agreement beyond chance”

52 Determining agreement expected by chance
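The kappa calculation described above can be sketched from a cross-tabulation of the two readings. The 2x2 counts below are hypothetical (they are not the radiologist's results from the slides), and chance agreement is taken, as is usual for Cohen's kappa, from the product of the marginal proportions.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square table.

    table[i][j] = number of items placed in category i on the first reading
    and category j on the second reading.
    """
    k = len(table)
    total = sum(sum(row) for row in table)

    observed = sum(table[i][i] for i in range(k)) / total

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance: sum over categories of the product of
    # the two marginal proportions.
    chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2

    return (observed - chance) / (1.0 - chance)


# Hypothetical readings of 100 chest X-rays (rows: reading 1, columns: reading 2;
# first category = mass present, second = no mass).
table = [
    [40, 10],
    [5, 45],
]
kappa = cohens_kappa(table)
print(f"kappa = {kappa:.2f}")
```

In this made-up table the readings agree on 85% of films, chance alone would give 50%, and kappa is the 35 points of agreement beyond chance divided by the 50 points available.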

53 Suggested interpretations for kappa

54 Kappa: problematic at the extremes of prevalence

55 Sources of Measurement Variability: Which to Assess? Observer within-observer (intrarater) between-observer (interrater) Instrument within-instrument between-instrument Subject within-subject Which to assess depends upon the use of the measurement and how it will be made. –For clinical use: all of the above are needed –For research: depends upon logistics of study (i.e. intrarater and within-instrument only if just one person/instrument used throughout study)

56 Improving Reproducibility See Hulley text Make more than one measurement! –But know where the source of your variation exists!

57 Assessing Validity - With Gold Standards A new and simpler device to measure peak flow becomes available (Bland-Altman)

58 Plot of Difference vs. Gold Standard Difference Gold standard 02004006008001000 -200 -100 0 100 200

59

60 The mean difference describes any systematic difference between the gold standard and the new device: mean difference = -2.3 l/min The standard deviation of the differences: s = 38.8 l/min 95% of differences will lie within -2.3 ± (1.96)(38.8), or from -78 to 74 l/min. These are the 95% limits of agreement
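The limits of agreement are the mean difference plus and minus 1.96 standard deviations of the differences. A minimal sketch, where both function names are illustrative, the summary values (-2.3 and 38.8 l/min) come from the slide, and the raw differences in the second call are hypothetical:

```python
import statistics


def limits_from_summary(mean_diff, sd_diff, z=1.96):
    """95% limits of agreement from a mean difference and its SD."""
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


def limits_of_agreement(differences, z=1.96):
    """95% limits of agreement from raw (new device - gold standard) differences."""
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    return limits_from_summary(mean_diff, sd_diff, z)


lower, upper = limits_from_summary(-2.3, 38.8)
print(f"limits of agreement: {lower:.0f} to {upper:.0f} l/min")

# Hypothetical raw differences, to show the second entry point.
example_lower, example_upper = limits_of_agreement([-10.0, 0.0, 10.0])
```

Whether limits this wide are acceptable depends on the intended use of the new device, not on the statistics alone.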

61 Assessing Validity of Categorical Measures Dichotomous More than 2 levels –Collapse or –Kappa

62 Assessing Validity - Without Gold Standards When gold standards are not present, measures can be assessed for validity in 3 ways: –Content validity Face Sampling –Construct validity –Empirical validity (aka criterion) Concurrent Predictive

63 Conclusions Measurement reproducibility plays a key role in determining validity and statistical precision in all study designs –When assessing reproducibility, avoid correlation coefficients; use within-subject standard deviation if it is constant, or coefficient of variation if within-subject sd is proportional to the magnitude of the measurement Acceptable reproducibility depends upon desired use For validity, plot difference vs mean and determine “limits of agreement”, or determine sensitivity/specificity –Be aware of how your measurements have been validated!

