
1 On Some Statistical Aspects of Agreement Among Measurements – Part II Bikas Sinha [ISI, Kolkata]

2 Key References: Part II
Lin, L. I. (2000). Total Deviation Index for Measuring Individual Agreement: With Application in Lab Performance and Bioequivalence. Statistics in Medicine, 19: 255-270.
Lin, L. I., Hedayat, A. S., Sinha, Bikas & Yang, Min (2002). Statistical Methods in Assessing Agreement: Models, Issues, and Tools. Jour. Amer. Statist. Assoc., 97(457): 257-270.

3 Follow the instruction… Pass on directly to Slide 19.

4 Nature of Agreement Problems… Assessment & recording of responses; two assessors for evaluation and recording. The raters examine each "unit" independently of one another and report separately: "+" for "Affected" or "−" for "OK" (discrete type). Summary statistics:

UNIT Assessment Table

Assessor I \ Assessor II      +       −
+                            40%      3%
−                             3%     54%

Q. What is the extent of agreement of the two assessors?

5 Nature of Data…

Assessor I \ Assessor II      +       −
+                            93%      2%
−                             4%      1%

Assessor I \ Assessor II      +       −
+                             3%     40%
−                            44%     13%

Same question: extent of agreement / disagreement?

6 Measurements: provided by experts / observers / raters. Could be two or more systems, assessors, chemists, psychologists, radiologists, clinicians, nurses, rating systems or raters, diagnoses or treatments, instruments or methods, processes or techniques or formulae…

7 Diverse Application Areas Cross-checking of data; acceptability of a new or generic drug, or of test instruments against standard instruments, or of a new method against the gold-standard method; statistical process control, …

8 Cohen’s Kappa: Nominal Scales Cohen (1960) proposed the kappa statistic for measuring agreement when the responses are nominal.

9 Cohen’s Kappa Rater I vs Rater II: 2 × 2 case. Categories + & −, proportions π(i,j). π(+,+) & π(−,−): agreement proportions; π(+,−) & π(−,+): disagreement proportions.

κ = [π₀ − π_e] / [1 − π_e]

where π₀ = π(+,+) + π(−,−) = P[agreement] and π_e = π(+,.)π(.,+) + π(−,.)π(.,−) = P[chance agreement].

10 Cohen’s Kappa κ = 1 iff π₀ = 1 [perfect agreement]. κ = 0 iff π₀ = π_e, i.e., iff π(+,+) = π(+,.)π(.,+) [chance agreement]. κ = −1 iff π(+,+) = π(−,−) = 0 & π(+,−) = π(−,+) = 0.5. So κ = −1 demands restrictive raters’ behavior: not just disagreement, but disagreement in a very strict sense!

11 Kappa: Modification

κ* = [π₀ − π_e] / [π(+,.)π(−,.) + π(.,+)π(.,−)]

κ* = 1 iff π(+,−) = π(−,+) = 0; κ* = −1 iff π(+,+) = π(−,−) = 0; κ* >, =, < 0 according as π(+,+)·π(−,−) >, =, < π(+,−)·π(−,+): a measure of association. κ = κ* iff π(+,.) = π(.,+), i.e., π(+,−) = π(−,+).

12 KAPPA COMPUTATIONS… Raters I vs II

Table [π(+,+) π(+,−) / π(−,+) π(−,−)]     κ^         κ*^
0.40 0.03 / 0.03 0.54                      0.8776     0.8776
0.93 0.02 / 0.04 0.01                      0.2208     0.2219
0.03 0.40 / 0.54 0.03                     -0.8439    -0.8776
0.02 0.93 / 0.01 0.04                     -0.0184    -0.2219
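For concreteness, a minimal Python sketch (not part of the original deck) that reproduces the four κ^ / κ*^ pairs above directly from the 2 × 2 proportion tables:

```python
# Cohen's kappa and the modified kappa* for a 2x2 table of proportions
# p = [[p(+,+), p(+,-)], [p(-,+), p(-,-)]].

def kappas(p):
    p0 = p[0][0] + p[1][1]                        # observed agreement
    row = [p[0][0] + p[0][1], p[1][0] + p[1][1]]  # pi(+,.), pi(-,.)
    col = [p[0][0] + p[1][0], p[0][1] + p[1][1]]  # pi(.,+), pi(.,-)
    pe = row[0] * col[0] + row[1] * col[1]        # chance agreement
    kappa = (p0 - pe) / (1 - pe)
    kappa_star = (p0 - pe) / (row[0] * row[1] + col[0] * col[1])
    return kappa, kappa_star

for p in ([[0.40, 0.03], [0.03, 0.54]], [[0.93, 0.02], [0.04, 0.01]],
          [[0.03, 0.40], [0.54, 0.03]], [[0.02, 0.93], [0.01, 0.04]]):
    print("kappa = %7.4f, kappa* = %7.4f" % kappas(p))
# -> 0.8776/0.8776, 0.2208/0.2219, -0.8439/-0.8776, -0.0184/-0.2219
```

With raw counts n(i,j), dividing each by n first gives exactly the plug-in estimator of the next slide.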

13 Back to Kappa Sample analogue? Plug-in estimator. 2 raters, n applicants / subjects / units. n(+,+) & n(−,−): # for agreement; n(+,−) & n(−,+): # for disagreement.

κ^ = [π₀^ − π_e^] / [1 − π_e^]

where π₀^ = [n(+,+) + n(−,−)] / n and π_e^ = [n(+,.)·n(.,+) + n(−,.)·n(.,−)] / n².

14 Cantor’s Observation Cantor (1996): Psychological Methods. The parameters in the 2 × 2 array, along with π₀ & π_e, are all derivable from π(+,.), π(.,+) & κ. Var(κ^): large-sample estimate [delta method]. Sample size (n) determination for pre-assigned precision in κ^.
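A hedged sketch of the sample-size idea. Note the assumption: it uses the simplified large-sample variance Var(κ^) ≈ π₀(1 − π₀) / [n(1 − π_e)²], which ignores the sampling variability of π_e^; Cantor (1996) works with the full delta-method expression.

```python
from math import ceil

# Simplified sketch: smallest n giving a target standard error for kappa^,
# under the *approximate* variance p0(1 - p0) / (n (1 - pe)^2).
def n_for_precision(p0, pe, target_se):
    var_unit = p0 * (1 - p0) / (1 - pe) ** 2   # variance when n = 1
    return ceil(var_unit / target_se ** 2)

# Planning values p0 = 0.94, pe = 0.51 (close to the first table on
# slide 12), aiming at SE(kappa^) = 0.05:
print(n_for_precision(0.94, 0.51, 0.05))       # -> 94 units
```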

15 Evaluation of Status… contd. Multiple categories of evaluation: Type ++, Type +, Type −, Type −−. How to study the nature & the extent of agreement from a given data set?

16 Weighted Kappa R × R classification, two raters: π(i,i)’s & π(i,j)’s. Cohen (1960) & Landis & Koch (Biometrics, 1977).

π₀ = Σ_i π(i,i) = P[agreement]; π_e = Σ_i π(i,.)π(.,i) = P[chance agreement]; κ = [π₀ − π_e] / [1 − π_e]

Unweighted kappa, a criticism: nearest-neighbor effects?

17 Weighted Kappa Nearest-neighbor effects… weights W(i,i)’s & W(i,j)’s:

π₀ = Σ_i π(i,i)  →  π₀w = Σ_i Σ_j π(i,j)·W(i,j)
π_e = Σ_i π(i,.)π(.,i)  →  π_ew = Σ_i Σ_j π(i,.)·π(.,j)·W(i,j)
κ(W) = [π₀w − π_ew] / [1 − π_ew]

Choice of weights: W(i,i) = 1 & symmetric. Fleiss & Cohen (1973): W(i,j) = 1 − [(i − j)² / (R − 1)²]. Cicchetti & Allison (1971): W(i,j) = 1 − [|i − j| / (R − 1)].
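A short sketch of κ(W) with the two weight families just named (assumptions: the table is already in proportions and categories are indexed 0, …, R − 1):

```python
# Weighted kappa for an R x R table of proportions p[i][j].
def weighted_kappa(p, w):
    R = len(p)
    row = [sum(p[i]) for i in range(R)]                       # pi(i,.)
    col = [sum(p[i][j] for i in range(R)) for j in range(R)]  # pi(.,j)
    p0w = sum(p[i][j] * w(i, j) for i in range(R) for j in range(R))
    pew = sum(row[i] * col[j] * w(i, j) for i in range(R) for j in range(R))
    return (p0w - pew) / (1 - pew)

R = 4  # must match the table size
def fleiss_cohen(i, j):      return 1 - (i - j) ** 2 / (R - 1) ** 2
def cicchetti_allison(i, j): return 1 - abs(i - j) / (R - 1)
# usage: weighted_kappa(p, fleiss_cohen) for a 4 x 4 proportion table p
```

With W(i,j) = 1 for i = j and 0 otherwise, κ(W) reduces to the unweighted κ of the previous slide.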

18 Beyond Kappa "A Review of Inter-rater Agreement Measures", Banerjee et al. (Canadian Journal of Statistics, 1999: 3-23). Modelling patterns of agreement: log-linear models, latent class models.

19 Continuous Measurements Evaluation of agreement when the data are measured on a continuous scale…… Pearson correlation coefficient, regression analysis, paired t-tests, least squares analysis for slope and intercept, within-subject coefficient of variation, and intra-class correlation coefficient…..

20 Continuous Measurements Two raters, n units for measurement. Data: [{x_i, y_i}; 1 ≤ i ≤ n]. Scatter plot: visual checking. Product-moment corr. coeff. high +ve: what does it mean? Squared deviation: D² = (X − Y)².

MSD: E[D²] = (m₁ − m₂)² + (σ₁² + σ₂² − 2ρσ₁σ₂)
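The moment identity for the MSD is easy to check numerically; the readings below are made-up illustrative values, not study data:

```python
import numpy as np

x = np.array([4.2, 5.1, 6.0, 4.8, 5.5])   # hypothetical rater X readings
y = np.array([4.6, 5.0, 6.4, 5.1, 5.9])   # hypothetical rater Y readings

msd_direct = np.mean((x - y) ** 2)          # E[D^2] computed directly

m1, m2 = x.mean(), y.mean()
s1, s2 = x.std(), y.std()                   # 1/n (population) moments
rho = np.corrcoef(x, y)[0, 1]
msd_moments = (m1 - m2) ** 2 + s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2

print(msd_direct, msd_moments)              # equal up to rounding
```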

21 Carotid Stenosis Screening Study Emory Univ., 1994-1996. Gold standard: invasive intra-arterial angiogram [IA] method. Non-invasive: magnetic resonance angiography [MRA] method. Two measurements under MRA: 2D & 3D time-of-flight. Three technicians, each reading the left & right arteries of 55 patients by IA & MRA [2D & 3D]: 3 × 3 × 2 = 18 observations / patient.

22 Data Structure… Between technicians: no difference. Left vs right: difference. 2D vs 3D: difference. Q. Agreement between IA & 2D? IA & 3D? Barnhart & Williamson (2001, 2002), Biometrics papers: … no indication of any strong agreement.

23 Scatter Plot : IA-2D-3D

24 Right vs Left Arteries [IA]

25 Descriptive Statistics: Carotid Stenosis Screening Study Sample means, methods IA, MRA-2D & MRA-3D, by side:

Method    N    Left Artery    Right Artery
IA        55       4.99           4.71
MRA-2D    55       5.36           5.73
MRA-3D    55       5.80           5.52

26 Descriptive Statistics (contd.) Sample variance-covariance matrix (upper triangle):

        IA-L    IA-R    2D-L    2D-R    3D-L    3D-R
IA-L    11.86    1.40    8.18    1.18    6.80    1.08
IA-R            10.61    2.67    7.53    1.78    7.17
2D-L                    10.98    2.70    8.69    1.74
2D-R                             8.95    2.19    7.69
3D-L                                    11.02    2.65
3D-R                                            10.24

27 Data Analysis Recall MSD = E[(X − Y)²]: normed? No! Lin (1989) converted the MSD to a correlation coefficient, the concordance correlation coefficient [CCC]:

CCC = 1 − [MSD / MSD_Ind] = 2ρσ₁σ₂ / [(m₁ − m₂)² + σ₁² + σ₂²]

Properties: perfect agreement [CCC = 1]; perfect disagreement [CCC = −1]; no agreement [CCC = 0].
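A plug-in sample version of the CCC, sketched from the formula above with 1/n moments (the array below is a placeholder for paired readings):

```python
import numpy as np

def ccc(x, y):
    # Lin's concordance correlation coefficient, plug-in form.
    x, y = np.asarray(x, float), np.asarray(y, float)
    m1, m2 = x.mean(), y.mean()
    v1, v2 = x.var(), y.var()                 # 1/n variances
    cov = np.mean((x - m1) * (y - m2))        # 1/n covariance
    return 2 * cov / (v1 + v2 + (m1 - m2) ** 2)

x = np.array([4.2, 5.1, 6.0, 4.8, 5.5])
print(ccc(x, x))                # 1.0 : perfect agreement with itself
print(ccc(x, 2 * x.mean() - x)) # -1.0 : perfect disagreement (mirrored)
```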

28 CCC… CCC = 2ρσ₁σ₂ / [(m₁ − m₂)² + σ₁² + σ₂²] = ρ·ρ_a, with ρ = accuracy coefficient and ρ_a = precision coefficient [≤ 1]:

ρ_a = 2 / [ν + 1/ν + u²], where ν = σ₁/σ₂ and u² = (m₁ − m₂)² / (σ₁σ₂)

CCC = 1 iff ρ = 1 & ρ_a = 1; ρ_a = 1 iff [m₁ = m₂] & [σ₁ = σ₂] hold simultaneously!!
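The factorization CCC = ρ·ρ_a can be verified numerically; the five moments below are invented for illustration:

```python
# Check CCC = rho * rho_a for arbitrary moments.
m1, m2 = 5.0, 5.4        # means (illustrative values)
s1, s2 = 3.2, 3.5        # standard deviations
rho = 0.80               # correlation

ccc = 2 * rho * s1 * s2 / ((m1 - m2) ** 2 + s1 ** 2 + s2 ** 2)

nu = s1 / s2
u2 = (m1 - m2) ** 2 / (s1 * s2)
rho_a = 2 / (nu + 1 / nu + u2)

print(ccc, rho * rho_a)  # the two expressions coincide
```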

29 Study of CCC… Identity of marginals: maximum precision. High value of ρ: high accuracy. BOTH needed for agreement. Simultaneous inference on H₀: ρ ≥ ρ₀, [m₁ = m₂] & [σ₁ = σ₂]. LRT & other tests based on CCC: Pornpis / Montip / Bimal Sinha (2006). More tests: Dutta & Bikas Sinha (2012).

30 Total Deviation Index Lin (2000) & Lin et al. (JASA, 2002). Assume a BVN distribution for (X, Y). With D = Y − X,

π = P[|Y − X| < k] = P[D² < k²] = χ²[k², 1, m_D²/σ_D²] … the non-central χ² CDF.

TDI = the value of k for a given π. Inference is based on the TDI. Choice of π: 90% or more.
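A sketch of both directions of this computation (coverage from k, and the TDI from a target π) using SciPy's noncentral chi-square; m_D and σ_D are hypothetical moments, and D²/σ_D² is χ² with 1 df and noncentrality m_D²/σ_D²:

```python
from scipy.stats import ncx2

m_D, s_D = 0.5, 2.0              # hypothetical moments of D = Y - X
nc = (m_D / s_D) ** 2            # noncentrality parameter

def coverage(k):                 # pi = P[|Y - X| < k]
    return ncx2.cdf((k / s_D) ** 2, df=1, nc=nc)

def tdi(pi):                     # smallest k with coverage pi
    return s_D * ncx2.ppf(pi, df=1, nc=nc) ** 0.5

k90 = tdi(0.90)
print(k90, coverage(k90))        # coverage(k90) returns ~0.90
```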

31 Coverage Probability Lin et al. (JASA, 2002), BVN distribution:

CP(d) = P[|Y − X| < d] = Φ[(d − m_D)/σ_D] − Φ[(−d − m_D)/σ_D]

Emphasis is on a given d and a high CP. CP^: plug-in estimator using sample means, variances & corr. coeff. Var[CP^]: large-sample approximation. V^[CP^]: plug-in estimator.
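As a sketch, plugging the IA-L vs 2D-L moments from slides 25-26 (means 4.99 / 5.36, variances 11.86 / 10.98, covariance 8.18) and the tolerance d = 2 chosen on slide 33 into this formula approximately reproduces the CP_12(L)^ = 0.56 reported on slide 36:

```python
from math import sqrt
from scipy.stats import norm

def cp_hat(d, m_D, s_D):         # plug-in CP(d) under the BVN model
    return norm.cdf((d - m_D) / s_D) - norm.cdf((-d - m_D) / s_D)

m_D = 5.36 - 4.99                       # difference of sample means
s_D = sqrt(11.86 + 10.98 - 2 * 8.18)    # sd of the difference
print(cp_hat(2.0, m_D, s_D))            # ~0.56 = CP_12(L)^ on slide 36
```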

32 Graphical Display …

33 Back to Data Analysis… Carotid Stenosis Screening Study, Emory Univ., 1994-1996. GS: method IA. Competitors: the 2D & 3D methods. Left & right arteries: different. Range of readings: 0-100%. Choice of d: 2%.

34 Doctoral Theses… Robieson, W. Z. (1999): On the Weighted Kappa and Concordance Correlation Coefficient. Ph.D. Thesis, University of Illinois at Chicago, USA. Lou, Congrong (2006): Assessment of Agreement: Multi-Rater Case. Ph.D. Thesis, University of Illinois at Chicago, USA.

35 Data Analysis… Lou (2006) derived expressions for CP(d)^, V^(CP^(d)) and Cov^(…,…), where CP_ij = P[|X_i − X_j| < d].

36 Data Analysis: CP_12, CP_13 & CP_23 Estimated coverage probabilities [CP] & estimated variances & covariances for the screening study:

Side     Pairwise CP^         V^(CP^)    Cov^
Left     CP_12(L)^ = 0.56     0.0019     0.0009
Left     CP_13(L)^ = 0.47     0.0015
Right    CP_12(R)^ = 0.60     0.0021     0.0010
Right    CP_13(R)^ = 0.54     0.0019
Left     CP_23(L)^ = 0.64     0.0021
Right    CP_23(R)^ = 0.69     0.0021

37 95% Lower Confidence Limits
Left side: CP_12(L)^ = 0.56 → 95% lower CL = 0.48; CP_13(L)^ = 0.47 → 95% lower CL = 0.40.
Right side: CP_12(R)^ = 0.60 → 95% lower CL = 0.51; CP_13(R)^ = 0.54 → 95% lower CL = 0.46.
Conclusion: poor agreement in all cases.
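These limits follow the usual one-sided normal form L = CP^ − 1.645·√V^; recomputing from the rounded slide-36 values lands within about 0.01 of the figures above (the slides presumably used unrounded estimates):

```python
from math import sqrt

# One-sided 95% lower confidence limits from the slide-36 estimates.
for label, cp, v in [("CP_12(L)", 0.56, 0.0019), ("CP_13(L)", 0.47, 0.0015),
                     ("CP_12(R)", 0.60, 0.0021), ("CP_13(R)", 0.54, 0.0019)]:
    print(label, round(cp - 1.645 * sqrt(v), 2))
# -> 0.49, 0.41, 0.52, 0.47 vs the reported 0.48, 0.40, 0.51, 0.46
```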

38 Data Analysis (contd.) Testing hypotheses [against two-sided alternatives]:

Hypothesis                       Statistic    p-value
H_0L: CP_12(L) = CP_13(L)        Z-score      0.0366
H_0R: CP_12(R) = CP_13(R)        Z-score      0.1393

Conclusions: for the "left side", the CPs for [IA vs 2D] & [IA vs 3D] are likely to be different, while for the "right side" they are likely to be equal.
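A sketch of the Z-test behind these p-values, fed with the rounded slide-36 estimates; the slide's p-values evidently come from unrounded internal quantities, so the recomputed values differ somewhat while telling the same story:

```python
from math import sqrt
from scipy.stats import norm

def z_test(cp1, cp2, v1, v2, cov):
    z = (cp1 - cp2) / sqrt(v1 + v2 - 2 * cov)   # correlated estimates
    return 2 * norm.sf(abs(z))                  # two-sided p-value

print(z_test(0.56, 0.47, 0.0019, 0.0015, 0.0009))  # Left:  p ~ 0.02
print(z_test(0.60, 0.54, 0.0021, 0.0019, 0.0010))  # Right: p ~ 0.18
```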

39 Testing Multiple Hypotheses For "K" alternatives [1, 2, …, K] to the gold standard [0], interest lies in

H_0L: CP_01(L) = CP_02(L) = … = CP_0K(L)
H_0R: CP_01(R) = CP_02(R) = … = CP_0K(R)

This is accomplished by performing a large-sample chi-square test [Rao (1973)]. Set, for the "left side", V_L = (CP_01(L)^, CP_02(L)^, …, CP_0K(L)^).

40 Chi-Square Test… Chi-square test statistic:

V_L' W^-1 V_L − [V_L' W^-1 1]² / [1' W^-1 1]

where W_tt = Var(CP_0t(L)^), t = 1, 2, …, and W_st = Cov(CP_0s(L)^, CP_0t(L)^), s ≠ t. Asymptotically chi-square with K − 1 df. Similarly for the "right side" hypothesis.
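A numerical sketch of this statistic; using the left-side K = 2 values from slide 36, it reduces to the square of the previous slide's Z-score:

```python
import numpy as np
from scipy.stats import chi2

V = np.array([0.56, 0.47])            # CP_01(L)^, CP_02(L)^ (slide 36)
W = np.array([[0.0019, 0.0009],
              [0.0009, 0.0015]])      # estimated Var-Cov matrix of V
one = np.ones(len(V))

Wi = np.linalg.inv(W)
T = V @ Wi @ V - (V @ Wi @ one) ** 2 / (one @ Wi @ one)
print(T, chi2.sf(T, df=len(V) - 1))   # T ~ 5.06 = 2.25^2; p ~ 0.02
```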

41 Simultaneous Lower Confidence Limits Pr[CP_01 ≥ L_1, CP_02 ≥ L_2, …, CP_0K ≥ L_K] ≥ 95%. Set Z_t = [CP_0t^ − CP_0t] / √Var^(CP_0t^). Assume the Z_t's jointly follow a multivariate normal distribution. Work out the estimated correlation matrix as usual. Solve for "z" such that Pr[Z_1 ≤ z, Z_2 ≤ z, …, Z_K ≤ z] ≥ 95%. Then L_t = CP_0t^ − z·√Var^(CP_0t^), t = 1, 2, …, K. Stat package: available with Lou (2006).
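A sketch of this recipe with illustrative inputs (the CP^ values and variances echo slide 36; the correlation of the Z_t's is a made-up placeholder, since the real one comes from Lou's expressions):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

cp  = np.array([0.56, 0.47])          # CP_0t^ estimates
var = np.array([0.0019, 0.0015])      # Var^(CP_0t^)
R   = np.array([[1.0, 0.5],
                [0.5, 1.0]])          # illustrative corr. of the Z_t

mvn = multivariate_normal(mean=np.zeros(len(cp)), cov=R)

def excess(z):                        # Pr[all Z_t <= z] - 0.95
    return mvn.cdf(np.full(len(cp), z)) - 0.95

z = brentq(excess, norm.ppf(0.95), 5.0)   # z is above the 1-D 1.645
L = cp - z * np.sqrt(var)
print(z, L)                           # simultaneous 95% lower limits
```

The common cut-off z exceeds the one-at-a-time value 1.645, so each simultaneous limit is slightly lower than its individual counterpart on slide 37.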

