On Some Statistical Aspects of Agreement Among Measurements – Part II
Bikas Sinha [ISI, Kolkata]
Key References : Part II
Lin, L. I. (2000). Total Deviation Index for Measuring Individual Agreement: With Application in Lab Performance and Bioequivalence. Statistics in Medicine, 19.
Lin, L. I., Hedayat, A. S., Sinha, Bikas & Yang, Min (2002). Statistical Methods in Assessing Agreement: Models, Issues, and Tools. Journal of the American Statistical Association, 97(457).
Nature of Agreement Problems…
Assessment & recording of responses: two assessors for evaluation and recording. The raters examine each “unit” independently of one another and report separately: “+” for “Affected” or “−” for “OK” (discrete type).
Summary statistics: Unit Assessment Table (cell proportions; the four cells total 100%):

Assessor I \ Assessor II      +       −
  +                          40%      3%
  −                           3%     54%

Q. What is the extent of agreement of the two assessors?
Nature of Data …
Assessor I \ Assessor II      +       −
  +                          93%      2%
  −                           4%      1%

Assessor I \ Assessor II      +       −
  +                           3%     40%
  −                          44%     13%

Same question: extent of agreement / disagreement?
Measurements
Provided by experts / observers / raters. These could be two or more systems, assessors, chemists, psychologists, radiologists, clinicians, nurses, rating systems, diagnoses or treatments, instruments or methods, processes, techniques or formulae…
Diverse Application Areas
Cross-checking of data; acceptability of a new or generic drug, of test instruments against standard instruments, or of a new method against a gold-standard method; statistical process control; …
Cohen’s Kappa : Nominal Scales
Cohen (1960) proposed the kappa statistic for measuring agreement when the responses are nominal.
Cohen’s Kappa
Rater I vs Rater II: 2 × 2 case, categories + and −, cell proportions π(i,j).
π(+,+) & π(−,−): agreement proportions; π(+,−) & π(−,+): disagreement proportions.
κ = [π₀ − πₑ] / [1 − πₑ], where
π₀ = π(+,+) + π(−,−) = P[agreement]
πₑ = π(+,·)π(·,+) + π(−,·)π(·,−) = P[chance agreement]
Cohen’s Kappa
κ = 1 iff π₀ = 1 [perfect agreement]
κ = 0 iff π₀ = πₑ, i.e., π(+,+) = π(+,·)π(·,+) [chance agreement]
κ = −1 iff π(+,+) = π(−,−) = 0 & π(+,−) = π(−,+) = 0.5
Restrictive raters’ behavior is needed for κ = −1: not just any disagreement, but disagreement in this very strict sense.
Kappa : Modification
κ* = [π₀ − πₑ] / [π(+,·)π(−,·) + π(·,+)π(·,−)]
κ* = 1 iff π(+,−) = π(−,+) = 0
κ* = −1 iff π(+,+) = π(−,−) = 0
κ* ⋛ 0 iff π(+,+)·π(−,−) ⋛ π(+,−)·π(−,+) : a measure of association.
κ = κ* iff π(+,·) = π(·,+), i.e., π(+,−) = π(−,+).
Kappa Computations… Raters I vs II
Table 1: κ̂ = 0.88, κ̂* = 0.88
Table 2: κ̂ = 0.22, κ̂* = 0.22
Table 3: κ̂ = −0.69, κ̂* = −0.70
(Plug-in estimates computed from the cell proportions above; see the sketch after the next slide.)
Back to Kappa : Sample Analogue? Plug-in Estimator
2 raters, n applicants / subjects / units.
n(+,+) & n(−,−): counts of agreement; n(+,−) & n(−,+): counts of disagreement.
κ̂ = [π̂₀ − π̂ₑ] / [1 − π̂ₑ], where
π̂₀ = [n(+,+) + n(−,−)] / n
π̂ₑ = [n(+,·)·n(·,+) + n(−,·)·n(·,−)] / n²
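Below is a minimal Python sketch (not from the slides) of these plug-in computations, applied to the three assessment tables shown earlier; the function name kappa_stats is illustrative, and the first cell of each table is the value implied by the proportions totalling 100%.

```python
# Plug-in estimates of kappa and kappa* for 2x2 tables of proportions.
import numpy as np

def kappa_stats(p):
    """p: 2x2 array of cell proportions pi(i,j); rows = Assessor I (+,-),
    columns = Assessor II (+,-). Returns (kappa, kappa_star)."""
    p = np.asarray(p, dtype=float)
    row = p.sum(axis=1)                      # pi(+,.), pi(-,.)
    col = p.sum(axis=0)                      # pi(.,+), pi(.,-)
    p0 = p[0, 0] + p[1, 1]                   # P[agreement]
    pe = row[0] * col[0] + row[1] * col[1]   # P[chance agreement]
    kappa = (p0 - pe) / (1 - pe)
    kappa_star = (p0 - pe) / (row[0] * row[1] + col[0] * col[1])
    return kappa, kappa_star

# The three assessment tables from the earlier slides:
tables = {
    "Table 1": [[0.40, 0.03], [0.03, 0.54]],
    "Table 2": [[0.93, 0.02], [0.04, 0.01]],
    "Table 3": [[0.03, 0.40], [0.44, 0.13]],
}
for name, p in tables.items():
    k, ks = kappa_stats(p)
    print(f"{name}: kappa = {k:.3f}, kappa* = {ks:.3f}")
```

Running this reproduces the values quoted on the computations slide: high agreement for Table 1, weak agreement for Table 2 despite 94% raw agreement, and strong disagreement for Table 3.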
Cantor’s Observation
Cantor (1996), Psychological Methods: the parameters in the 2 × 2 array, π₀ and πₑ, are all derivable from π(+,·), π(·,+) and κ.
Var(κ̂): large-sample estimate [delta method].
Sample size (n) determination for pre-assigned precision in κ̂.
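The slide does not reproduce the delta-method variance expression, so as a hedged substitute the sketch below approximates Var(κ̂) by simulating multinomial samples from assumed cell proportions (those of Table 1) and scanning n; reading off the n at which the simulated SE falls below a pre-assigned precision is one way to carry out Cantor-style sample-size determination.

```python
# Simulation-based stand-in for the delta-method Var(kappa-hat).
import numpy as np

rng = np.random.default_rng(1)
p = np.array([0.40, 0.03, 0.03, 0.54])   # pi(+,+), pi(+,-), pi(-,+), pi(-,-)

def kappa_hat(counts, n):
    n_pp, n_pm, n_mp, n_mm = counts
    p0 = (n_pp + n_mm) / n
    pe = ((n_pp + n_pm) * (n_pp + n_mp) +
          (n_mp + n_mm) * (n_pm + n_mm)) / n**2
    return (p0 - pe) / (1 - pe)

def se_kappa(n, reps=4000):
    """Simulated standard error of kappa-hat at sample size n."""
    sims = rng.multinomial(n, p, size=reps)
    return np.std([kappa_hat(c, n) for c in sims])

for n in (50, 100, 200, 400, 800):
    print(f"n = {n:4d}: simulated SE(kappa-hat) ~ {se_kappa(n):.4f}")
```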
Evaluation of Status (contd.)
Multiple categories of evaluation: Type ++, Type +, Type −, Type −−.
How to study the nature and the extent of agreement from a given data set?
Weighted Kappa
R × R classification, two raters: π(i,i)’s & π(i,j)’s. Cohen (1960); Landis & Koch (Biometrics, 1977).
π₀ = Σᵢ π(i,i) = P[agreement]
πₑ = Σᵢ π(i,·)π(·,i) = P[chance agreement]
κ = [π₀ − πₑ] / [1 − πₑ]
Unweighted kappa, a criticism: what about nearest-neighbor effects?
Weighted Kappa
Nearest-neighbor effects: weights W(i,i)’s & W(i,j)’s.
π₀ = Σᵢ π(i,i) → π₀w = Σᵢⱼ π(i,j)·W(i,j)
πₑ = Σᵢ π(i,·)π(·,i) → πₑw = Σᵢⱼ π(i,·)π(·,j)·W(i,j)
κ(W) = [π₀w − πₑw] / [1 − πₑw]
Choice of weights: W(i,i) = 1, symmetric.
Fleiss & Cohen (1973): W(i,j) = 1 − (i−j)²/(R−1)²
Cicchetti & Allison (1971): W(i,j) = 1 − |i−j|/(R−1)
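A minimal sketch of weighted kappa with the two weight choices named above; the 4 × 4 table of proportions is an illustrative assumption, not data from the slides.

```python
# Weighted kappa for an R x R table of proportions.
import numpy as np

def weighted_kappa(p, W):
    p = np.asarray(p, float)
    row, col = p.sum(axis=1), p.sum(axis=0)
    p0w = np.sum(W * p)                       # weighted agreement
    pew = np.sum(W * np.outer(row, col))      # weighted chance agreement
    return (p0w - pew) / (1 - pew)

def fleiss_cohen_weights(R):
    i, j = np.indices((R, R))
    return 1 - (i - j) ** 2 / (R - 1) ** 2    # quadratic weights

def cicchetti_allison_weights(R):
    i, j = np.indices((R, R))
    return 1 - np.abs(i - j) / (R - 1)        # linear weights

# Illustrative 4x4 table (Type ++, +, -, --), proportions summing to 1:
p = np.array([[0.20, 0.05, 0.01, 0.00],
              [0.04, 0.15, 0.04, 0.01],
              [0.01, 0.05, 0.18, 0.05],
              [0.00, 0.01, 0.04, 0.16]])
for name, W in [("Fleiss-Cohen", fleiss_cohen_weights(4)),
                ("Cicchetti-Allison", cicchetti_allison_weights(4))]:
    print(f"{name}: kappa_W = {weighted_kappa(p, W):.3f}")
```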
Beyond Kappa : A Review of Inter-rater Agreement Measures
Banerjee et al. (Canadian Journal of Statistics, 1999, 3-23).
Modelling patterns of agreement: log-linear models, latent class models.
Continuous Measurements
Evaluation of agreement when the data are measured on a continuous scale: Pearson correlation coefficient, regression analysis, paired t-tests, least-squares analysis for slope and intercept, within-subject coefficient of variation, and intra-class correlation coefficient.
Continuous Measurements
Two raters, n units for measurement. Data: [{xᵢ, yᵢ}; 1 ≤ i ≤ n].
Scatter plot: visual checking.
Product-moment correlation coefficient high and positive: what does it mean for agreement?
Squared deviation: D² = (X − Y)²
MSD: E[D²] = (μ₁ − μ₂)² + σ₁² + σ₂² − 2σ₁₂
Carotid Stenosis Screening Study (Emory Univ)
Gold standard: invasive intra-arterial angiogram [IA] method.
Non-invasive: magnetic resonance angiography [MRA] method, with two measurements under MRA: 2D & 3D time-of-flight.
Three technicians, each reading left & right arteries of 55 patients by IA & MRA [2D & 3D]: 3 × 3 × 2 = 18 observations / patient.
Data Structure…
Between technicians: no difference. Left vs right: difference. 2D vs 3D: difference.
Q. Agreement between IA & 2D? IA & 3D?
Barnhart & Williamson (2001, 2002), Biometrics papers: no indication of any strong agreement.
[Figure: scatter plots of IA vs MRA-2D vs MRA-3D readings]
[Figure: right vs left arteries under IA]
Descriptive Statistics : Carotid Stenosis Screening Study
[Table: sample means of methods IA, MRA-2D & MRA-3D by side (left artery / right artery), with N per method.]
Descriptive Statistics (contd.)
[Table: 6 × 6 sample variance-covariance matrix of the readings IA-L, IA-R, 2D-L, 2D-R, 3D-L, 3D-R.]
Data Analysis
Recall MSD = E[(X − Y)²]. Normed? No!
Lin (1989) converted MSD to a correlation coefficient: the Concordance Correlation Coefficient [CCC].
CCC = 1 − [MSD / MSD_indep] = 2ρσ₁σ₂ / [(μ₁ − μ₂)² + σ₁² + σ₂²]
Properties: perfect agreement [CCC = 1]; perfect disagreement [CCC = −1]; no agreement [CCC = 0].
CCC…
CCC = 2ρσ₁σ₂ / [(μ₁ − μ₂)² + σ₁² + σ₂²] = ρ · χₐ
χₐ = accuracy coefficient [χₐ ≤ 1], ρ = precision coefficient.
χₐ = 2 / [ω + 1/ω + ν²], where ω = σ₁/σ₂ and ν² = (μ₁ − μ₂)² / (σ₁σ₂)
CCC = 1 iff ρ = 1 & χₐ = 1
χₐ = 1 iff [μ₁ = μ₂] & [σ₁ = σ₂] hold simultaneously!
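A minimal sketch of the CCC and its precision/accuracy factorization, computed from paired readings; the synthetic data for the two raters are illustrative assumptions.

```python
# Lin's CCC and its decomposition CCC = rho * chi_a.
import numpy as np

def ccc_decomposition(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    m1, m2 = x.mean(), y.mean()
    s1, s2 = x.std(), y.std()                  # population-style SDs
    s12 = np.mean((x - m1) * (y - m2))
    rho = s12 / (s1 * s2)                      # precision coefficient
    omega, nu2 = s1 / s2, (m1 - m2) ** 2 / (s1 * s2)
    chi_a = 2 / (omega + 1 / omega + nu2)      # accuracy coefficient
    ccc = 2 * s12 / ((m1 - m2) ** 2 + s1**2 + s2**2)
    return ccc, rho, chi_a

rng = np.random.default_rng(7)
x = rng.normal(50, 10, 200)
y = 0.9 * x + 6 + rng.normal(0, 4, 200)        # a second, biased rater
ccc, rho, chi_a = ccc_decomposition(x, y)
print(f"CCC = {ccc:.3f} = rho ({rho:.3f}) x chi_a ({chi_a:.3f})")
```

Note the design of the factorization: ρ penalizes scatter about the best-fitting line, while χₐ penalizes location and scale shifts between the two raters.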
Study of CCC…
Identity of marginals: maximum accuracy. High value of ρ: high precision. BOTH are needed for agreement.
Simultaneous inference on H₀: ρ = ρ₀, [μ₁ = μ₂] & [σ₁ = σ₂].
LRT & other tests based on CCC: Pornpis / Montip / Bimal Sinha (2006). More tests: Dutta & Bikas Sinha (2012).
Total Deviation Index
Lin (2000) & Lin et al. (JASA, 2002). Assume a BVN distribution for (X, Y); set D = Y − X.
π = P[|Y − X| < k] = P[D² < k²], and D²/σ_D² ~ χ²[1, μ_D²/σ_D²], a non-central χ².
TDI_π = value of k for given π. Inference is based on the TDI. Choice of π: 90% or more.
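A minimal sketch of the TDI under the BVN assumption, using scipy's non-central chi-square quantile; the values of μ_D, σ_D and π below are illustrative.

```python
# TDI: D = Y - X is normal, so D^2 / sigma_D^2 is non-central chi-square
# with 1 df and non-centrality (mu_D / sigma_D)^2.
from scipy.stats import ncx2

def tdi(mu_d, sigma_d, pi=0.90):
    """Smallest k with P(|Y - X| < k) = pi."""
    nc = (mu_d / sigma_d) ** 2
    return sigma_d * ncx2.ppf(pi, df=1, nc=nc) ** 0.5

print(f"TDI_0.90 = {tdi(mu_d=1.0, sigma_d=2.0):.3f}")   # approx. 3.7
```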
Coverage Probability
Lin et al. (JASA, 2002), BVN distribution:
CP(d) = P[|Y − X| < d] = Φ[(d − μ_D)/σ_D] − Φ[(−d − μ_D)/σ_D]
Emphasis is on a given d and a high CP.
ĈP: plug-in estimator using sample means, variances & correlation coefficient.
Var[ĈP]: large-sample approximation; V̂[ĈP]: plug-in estimator.
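A minimal sketch of CP(d) and its plug-in estimate; the synthetic paired readings and the tolerance d = 2 are illustrative assumptions.

```python
# Coverage probability CP(d) under normal differences, with a
# plug-in estimate from paired data.
import numpy as np
from scipy.stats import norm

def coverage_probability(mu_d, sigma_d, d):
    return norm.cdf((d - mu_d) / sigma_d) - norm.cdf((-d - mu_d) / sigma_d)

# Plug-in estimate from paired readings x (gold standard) and y:
rng = np.random.default_rng(3)
x = rng.normal(50, 20, 55)
y = x + rng.normal(1.0, 3.0, 55)     # competitor with bias and noise
diff = y - x
cp_hat = coverage_probability(diff.mean(), diff.std(ddof=1), d=2.0)
print(f"CP-hat(2) = {cp_hat:.3f}")
```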
[Figure: graphical display of the coverage probability]
Back to Data Analysis… Carotid Stenosis Screening Study (Emory Univ)
Gold standard: method IA. Competitors: 2D & 3D methods. Left & right arteries: different.
Range of readings: 0-100%. Choice of d: 2%.
Doctoral Theses…
Robieson, W. Z. (1999). On the Weighted Kappa and Concordance Correlation Coefficient. Ph.D. Thesis, University of Illinois at Chicago, USA.
Lou, Congrong (2006). Assessment of Agreement: Multi-Rater Case. Ph.D. Thesis, University of Illinois at Chicago, USA.
Data Analysis…
Lou (2006) derived expressions for ĈP(d), V̂(ĈP(d)) and the covariances Ĉov(·,·), where CPᵢⱼ = P[|Xᵢ − Xⱼ| < d].
Data Analysis : CP₁₂, CP₁₃ & CP₂₃
Estimated coverage probability [CP] with estimated variances & covariances, screening study:

Side    Pairwise ĈP          V̂(ĈP)    Ĉov
Left    ĈP₁₂(L) = 0.56
Left    ĈP₁₃(L) = 0.47
Right   ĈP₁₂(R) = 0.60
Right   ĈP₁₃(R) = 0.54
Left    ĈP₂₃(L) =
Right   ĈP₂₃(R) =
95% Lower Confidence Limits
Left side: ĈP₁₂(L) = 0.56, 95% lower CL = 0.48; ĈP₁₃(L) = 0.47, 95% lower CL = 0.40.
Right side: ĈP₁₂(R) = 0.60, 95% lower CL = 0.51; ĈP₁₃(R) = 0.54, 95% lower CL = 0.46.
Conclusion: poor agreement in all cases.
Data Analysis (contd.)
Testing H₀L: CP₁₂(L) = CP₁₃(L) and H₀R: CP₁₂(R) = CP₁₃(R), each by a Z-score statistic against both-sided alternatives, with the corresponding p-values.
Conclusions: for the “left side”, the CPs for [IA vs 2D] & [IA vs 3D] are likely to be different, while for the “right side” they are likely to be equal.
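A minimal sketch of the Z-score test for equality of two coverage probabilities on one side; the CP estimates are those reported above, while the variance and covariance values are illustrative assumptions (the slide does not report them).

```python
# Z-score test for H0: CP12 = CP13 on one side, allowing for the
# covariance between the two CP estimates.
import numpy as np
from scipy.stats import norm

cp12, cp13 = 0.56, 0.47                  # left-side estimates from above
v12, v13, cov = 0.0016, 0.0013, 0.0006   # assumed V-hat and Cov-hat values
z = (cp12 - cp13) / np.sqrt(v12 + v13 - 2 * cov)
p_value = 2 * norm.sf(abs(z))            # both-sided alternative
print(f"Z = {z:.2f}, p = {p_value:.4f}")
```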
Testing Multiple Hypotheses
For “K” alternatives [1, 2, …, K] to the gold standard [0], interest lies in
H₀L: CP₀₁(L) = CP₀₂(L) = … = CP₀K(L)
H₀R: CP₀₁(R) = CP₀₂(R) = … = CP₀K(R)
This is accomplished by a large-sample chi-square test [Rao (1973)]. Set, for the “left side”,
L = (ĈP₀₁(L), ĈP₀₂(L), …, ĈP₀K(L))′.
Chi-Square Test…
Test statistic: L′Ŵ⁻¹L − [L′Ŵ⁻¹1]² / [1′Ŵ⁻¹1], where
W_tt = Var(ĈP₀t(L)), t = 1, 2, …
W_st = Cov(ĈP₀s(L), ĈP₀t(L)), s ≠ t.
Asymptotically chi-square with K − 1 df. Similarly for the “right side” hypothesis.
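A minimal sketch of this chi-square statistic; the vector L reuses the left-side CP estimates reported earlier, while the matrix Ŵ of variances and covariances is an illustrative assumption.

```python
# Large-sample chi-square test for equality of K coverage probabilities.
import numpy as np
from scipy.stats import chi2

def cp_equality_test(L, W):
    """T = L' W^{-1} L - (L' W^{-1} 1)^2 / (1' W^{-1} 1) ~ chi2(K-1)."""
    L, W = np.asarray(L, float), np.asarray(W, float)
    Wi = np.linalg.inv(W)
    one = np.ones_like(L)
    T = L @ Wi @ L - (L @ Wi @ one) ** 2 / (one @ Wi @ one)
    return T, chi2.sf(T, df=len(L) - 1)

L = np.array([0.56, 0.47])          # CP12(L)-hat, CP13(L)-hat from above
W = np.array([[0.0016, 0.0006],     # assumed Var/Cov values
              [0.0006, 0.0013]])
T, p = cp_equality_test(L, W)
print(f"chi-square = {T:.2f} on {len(L) - 1} df, p = {p:.4f}")
```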
Simultaneous Lower Confidence Limits
Want P[CP₀₁ ≥ L₁, CP₀₂ ≥ L₂, …, CP₀K ≥ L_K] ≥ 95%.
Set Z_t = [ĈP₀t − CP₀t] / √V̂ar(ĈP₀t). Assume the Z_t’s jointly follow a multivariate normal distribution; work out the estimated correlation matrix as usual.
Solve for “z” such that P[Z₁ ≤ z, Z₂ ≤ z, …, Z_K ≤ z] = 95%. Then
L_t = ĈP₀t − z·√V̂ar(ĈP₀t), t = 1, 2, …, K.
Stat package: available with Lou (2006).
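A minimal sketch of the simultaneous limits: solve for the equicoordinate point z under the estimated correlation of the Z_t's, then subtract z standard errors; the CP estimates, variances and correlation used here are illustrative assumptions.

```python
# Simultaneous 95% lower confidence limits for K coverage probabilities.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal

cp_hat = np.array([0.56, 0.47])          # CP01-hat, CP02-hat (left side)
var_hat = np.array([0.0016, 0.0013])     # assumed Var(CP-hat) values
corr = np.array([[1.0, 0.4],             # assumed correlation of Z_t's
                 [0.4, 1.0]])

# Solve P(Z_1 <= z, ..., Z_K <= z) = 0.95 under the estimated correlation:
mvn = multivariate_normal(mean=np.zeros(len(cp_hat)), cov=corr)
z = brentq(lambda q: mvn.cdf(np.full(len(cp_hat), q)) - 0.95, 1.0, 4.0)
lcl = cp_hat - z * np.sqrt(var_hat)
print(f"z = {z:.3f}, simultaneous 95% LCLs = {np.round(lcl, 3)}")
```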