Presentation transcript: "Measuring Agreement"

1 Measuring Agreement

2 Introduction
Different types of agreement
 Diagnosis by different methods: do both methods give the same results? (Disease absent or disease present)
 Staging of carcinomas: will different methods lead to the same results? Will different raters lead to the same results?
 Measurements of blood pressure: how consistent are measurements made
  using different devices?
  with different observers?
  at different times?

3 Investigating agreement
Need to consider
 Data type: categorical or continuous
 How are the data repeated? Measuring instrument(s), rater(s), time(s)
 The goal: are ratings consistent? Estimate the magnitude of differences between measurements; investigate factors that affect ratings
 Number of raters

4 Data type
Categorical
 Binary: disease absent, disease present
 Nominal: hepatitis (viral A, B, C, D, E or autoimmune)
 Ordinal: severity of disease (mild, moderate, severe)
Continuous
 Size of tumour
 Blood pressure

5 How are data repeated?
Same person, same measuring instrument
 Different observers: inter-rater reliability
 Same observer at different times: intra-rater reliability (repeatability)
 Internal consistency: do the items of a test measure the same attribute?

6 Measures of agreement
Categorical
 Kappa: weighted, Fleiss'
Continuous
 Limits of agreement
 Coefficient of variation (CV)
 Intraclass correlation (ICC)
Cronbach's α
 Internal consistency

7 Number of raters
 Two
 Three or more

8 Categorical data: two raters
Kappa
 Magnitude quoted: ≥0.75 excellent; 0.40 to 0.75 fair to good; <0.40 poor
 Alternative scale: 0 to 0.20 slight; >0.20 to 0.40 fair; >0.40 to 0.60 moderate; >0.60 to 0.80 substantial; >0.80 almost perfect
Degree of disagreement can be included: weighted kappa
 Values close together do not count towards disagreement as much as those further apart
 Linear / quadratic weightings
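The unweighted and weighted kappa described above can be sketched in a few lines. This is a minimal illustration, not the calculator used in the examples; the function name and table layout are my own choices. It takes a square confusion table (rows: rater A's category, columns: rater B's) and treats unweighted kappa as the special case where every disagreement has weight 1.

```python
from itertools import product

def cohens_kappa(table, weights=None):
    """Cohen's kappa from a square confusion table (rows: rater A, cols: rater B).

    weights=None gives unweighted kappa; "linear" or "quadratic" give weighted
    kappa, where categories close together count less towards disagreement
    than those further apart.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):  # disagreement weight for cell (i, j)
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Observed vs chance-expected (weighted) disagreement
    obs = sum(w(i, j) * table[i][j] / n for i, j in product(range(k), repeat=2))
    exp = sum(w(i, j) * row_tot[i] * col_tot[j] / n ** 2
              for i, j in product(range(k), repeat=2))
    return 1.0 - obs / exp
```

Perfect agreement (all counts on the diagonal) gives kappa = 1; a table whose agreement is exactly what chance predicts gives kappa = 0.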

9 Categorical data: more than two raters
Different tests for
 Binomial data
 Data with more than two categories
Online calculators
 http://www.vassarstats.net/kappa.html
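For more than two raters, Fleiss' kappa (named on the measures slide above) is one standard choice. A sketch, assuming the usual input layout where counts[i][j] is the number of raters who assigned subject i to category j, with the same number of raters per subject; this is an illustration, not the online calculator's code:

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for multiple raters.

    counts[i][j] = number of raters placing subject i in category j;
    every row must sum to the same number of raters n.
    """
    N = len(counts)              # subjects
    n = sum(counts[0])           # raters per subject
    m = len(counts[0])           # categories

    # Overall proportion of ratings in each category
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(m)]
    # Per-subject agreement: proportion of agreeing rater pairs
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N
    P_e = sum(p * p for p in p_j)  # chance agreement
    return (P_bar - P_e) / (1 - P_e)
```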

10 Example 1
Two raters, scores 1 to 5
 Unweighted kappa: 0.79, 95% CI (0.62 to 0.96)
 Linear weighting: 0.84, 95% CI (0.70 to 0.98)
 Quadratic weighting: 0.90, 95% CI (0.77 to 1.00)

11 Example 2
Binomial data
 Three raters, two ratings each
 Inter-rater agreement
 Intra-rater agreement

12 Example 2 ctd.
Inter-rater agreement
 Kappa 1,2 = 0.865 (P<0.001)
 Kappa 1,3 = 0.054 (P=0.765)
 Kappa 2,3 = -0.071 (P=0.696)
Intra-rater agreement
 Kappa 1 = 0.800 (P<0.001)
 Kappa 2 = 0.790 (P<0.001)
 Kappa 3 = 0.000 (P=1.000)

13 Continuous data
 Test for bias
 Check differences are not related to magnitude
 Calculate mean and SD of differences
 Limits of agreement
 Coefficient of variation
 ICC

14 Test for bias
 Student's paired t (mean)
 Wilcoxon matched pairs (median)
If there is bias, agreement cannot be investigated further
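The paired t statistic used here is just the mean difference divided by its standard error. A minimal sketch (function name and return shape are my own; in practice a statistics package would also return the P value):

```python
import math

def paired_t(x, y):
    """Paired t statistic for bias between two sets of measurements
    on the same subjects.  Returns (t, degrees of freedom).
    t near 0 suggests no systematic difference (no bias)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in d) / (n - 1))  # sample SD
    return mean / (s / math.sqrt(n)), n - 1
```

Differences that average exactly zero give t = 0, i.e. no evidence of bias.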

15 Example 3: Test for bias
Paired t test: P=0.362, so no evidence of bias

16 Check differences unrelated to magnitude
[Scatter plot of differences against magnitude omitted] Clearly no relationship

17 Calculate mean and SD of differences
            N     Mean     Std. Deviation
Difference  17    4.9412   21.72404
Valid N (listwise): 17
Here mean = 4.9412 and s = 21.72404.

18 Limits of agreement
Lower limit of agreement (LLA) = mean - 1.96×s = -37.6
Upper limit of agreement (ULA) = mean + 1.96×s = 47.5
95% of differences between a pair of measurements for an individual lie in (-37.6, 47.5)
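The limits above follow directly from the mean and SD on the previous slide. A one-function sketch, plugging in the slide's values (the function name is my own):

```python
def limits_of_agreement(mean_diff, sd_diff):
    """95% (Bland-Altman) limits of agreement from the mean and
    sample SD of the paired differences."""
    half_width = 1.96 * sd_diff
    return mean_diff - half_width, mean_diff + half_width

# Values from the slide's descriptives table: mean = 4.9412, s = 21.72404
lla, ula = limits_of_agreement(4.9412, 21.72404)
```

Rounded to one decimal place this reproduces the slide's (-37.6, 47.5).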

19 Coefficient of variation
Measure of variability of differences, expressed as a proportion of the average measured value
Suitable when error (the differences between pairs) increases with the measured values
 Other measures require this not to be the case
CV = 100 × s ÷ mean of the measurements = 100 × 21.72 ÷ 447.88 = 4.85%
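The CV arithmetic above as a one-liner, with the slide's numbers plugged in (the function name is my own):

```python
def coefficient_of_variation(sd_diff, mean_measurement):
    """CV: SD of the differences as a percentage of the average measured value."""
    return 100.0 * sd_diff / mean_measurement

# Slide's values: s = 21.72, mean of the measurements = 447.88
cv = coefficient_of_variation(21.72, 447.88)
```

This reproduces the slide's 4.85%.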

20 Intraclass correlation
Continuous data; two or more sets of measurements
Measure of correlation that adjusts for differences in scale
Several models
 Absolute agreement or consistency
 Raters chosen randomly or same raters throughout
 Single or average measures
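To make the model choices concrete, here is a sketch of just one of them: the one-way random-effects, absolute-agreement, single-measures ICC, often written ICC(1,1). It is computed from the between-subject and within-subject mean squares of a one-way ANOVA; other models use different mean-square decompositions. Function name and input layout are my own choices.

```python
def icc_oneway(ratings):
    """ICC(1,1): one-way random effects, absolute agreement, single measures.

    ratings: one row of k ratings per subject (n rows in total).
    ICC = (MSB - MSW) / (MSB + (k-1)*MSW)
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]

    # Between-subjects and within-subjects mean squares
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - means[i]) ** 2
              for i, row in enumerate(ratings) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Raters who agree exactly on every subject give MSW = 0 and hence ICC = 1.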

21 Intraclass correlation
 ≥0.75 excellent
 0.4 to 0.75 fair to good
 <0.4 poor

22 Cronbach's α
Internal consistency
 Total scores with several components
α ≥ 0.8 good; α ≥ 0.7 adequate
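Cronbach's α compares the variance of the component items with the variance of the total score: α = k/(k-1) × (1 - Σ item variances / variance of total). A minimal sketch under that standard formula (function name and input layout are my own):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    items: one row of k item scores per subject.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    n, k = len(items), len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in items]) for j in range(k)]
    total_var = var([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Items that move in perfect lockstep across subjects give α = 1; α ≥ 0.8 would be read as good on the slide's scale.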

23 Investigating agreement
Data type
 Categorical: kappa
 Continuous: limits of agreement, coefficient of variation, intraclass correlation
How are the data repeated? Measuring instrument(s), rater(s), time(s)
Number of raters
 Two: straightforward
 Three or more: help!

