1
On Some Statistical Aspects of Agreement Among Measurements – Part II
Bikas Sinha [ISI, Kolkata]
2
Key References: Part II
Lin, L. I. (2000). Total Deviation Index for Measuring Individual Agreement: With Application in Lab Performance and Bioequivalence. Statistics in Medicine, 19: 255-270.
Lin, L. I., Hedayat, A. S., Sinha, Bikas & Yang, Min (2002). Statistical Methods in Assessing Agreement: Models, Issues, and Tools. Jour. Amer. Statist. Assoc., 97(457): 257-270.
3
Follow the instruction: pass on directly to Page 19.
4
Nature of Agreement Problems
Assessment & recording of responses: two assessors for evaluation and recording. The raters examine each "unit" independently of one another and report separately: "+" for "Affected" or "-" for "OK" (discrete type).

Summary statistics: Unit Assessment Table

Assessor I \ Assessor II     +      -
          +                 40%     3%
          -                  3%    54%

Q. What is the extent of agreement of the two assessors?
5
Nature of Data

Assessor I \ Assessor II     +      -
          +                 93%     2%
          -                  4%     1%

Assessor I \ Assessor II     +      -
          +                  3%    40%
          -                 44%    13%

Same question: extent of agreement / disagreement?
6
Measurements: provided by experts / observers / raters. These could be two or more systems, assessors, chemists, psychologists, radiologists, clinicians, nurses, rating systems or raters, diagnoses or treatments, instruments or methods, processes or techniques or formulae…
7
Diverse Application Areas: cross-checking of data; acceptability of a new or generic drug, of test instruments against standard instruments, or of a new method against a gold-standard method; statistical process control; …
8
Cohen's Kappa: Nominal Scales. Cohen (1960) proposed the kappa statistic for measuring agreement when the responses are nominal.
9
Cohen's Kappa: Rater I vs Rater II, the 2 x 2 case
Categories + and -, cell proportions π(i,j):
π(+,+) & π(-,-): agreement proportions; π(+,-) & π(-,+): disagreement proportions.

κ = [π_0 − π_e] / [1 − π_e], where
π_0 = π(+,+) + π(-,-) = P[agreement]
π_e = π(+,·)π(·,+) + π(-,·)π(·,-)  [chance agreement]
10
Cohen's Kappa
κ = 1 iff π_0 = 1  [perfect agreement]
κ = 0 iff π_0 = π_e, i.e., iff π(+,+) = π(+,·)π(·,+)  [chance agreement]
κ = -1 iff π(+,+) = π(-,-) = 0 & π(+,-) = π(-,+) = 0.5
Very restrictive raters' behaviour is needed for κ = -1: not just disagreement, but disagreement in a very strict sense.
11
Kappa: Modification
κ* = [π_0 − π_e] / [π(+,·)π(-,·) + π(·,+)π(·,-)]
κ* = 1 iff π(+,-) = π(-,+) = 0
κ* = -1 iff π(+,+) = π(-,-) = 0
κ* >/=/< 0 according as π(+,+)·π(-,-) >/=/< π(+,-)·π(-,+) (a measure of association)
κ = κ* iff π(+,·) = π(·,+), i.e., π(+,-) = π(-,+).
12
Kappa Computations: Raters I vs II
(each row lists the cell proportions π(+,+), π(+,-), π(-,+), π(-,-))

0.40  0.03  0.03  0.54 :  κ̂ =  0.8776,  κ̂* =  0.8776
0.93  0.02  0.04  0.01 :  κ̂ =  0.2208,  κ̂* =  0.2219
0.03  0.40  0.54  0.03 :  κ̂ = -0.8439,  κ̂* = -0.8776
0.02  0.93  0.01  0.04 :  κ̂ = -0.0184,  κ̂* = -0.2219
13
Back to Kappa: Sample Analogue? The Plug-in Estimator
2 raters, n applicants / subjects / units.
n(+,+) & n(-,-): counts of agreement; n(+,-) & n(-,+): counts of disagreement.

κ̂ = [π̂_0 − π̂_e] / [1 − π̂_e], where
π̂_0 = [n(+,+) + n(-,-)] / n
π̂_e = [n(+,·)·n(·,+) + n(-,·)·n(·,-)] / n²
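A minimal sketch (not from the slides, assuming NumPy is available; function name is illustrative) of the plug-in estimators κ̂ and κ̂* for a 2 x 2 table of proportions; run on the four tables above it reproduces the kappa values shown on the "Kappa Computations" slide.

```python
import numpy as np

def kappa_estimates(p):
    """p: 2x2 array of cell proportions pi(i,j); rows = Rater I, columns = Rater II."""
    p = np.asarray(p, dtype=float)
    row = p.sum(axis=1)            # pi(+,.), pi(-,.)
    col = p.sum(axis=0)            # pi(.,+), pi(.,-)
    p0 = np.trace(p)               # observed agreement pi(+,+) + pi(-,-)
    pe = np.dot(row, col)          # chance agreement pi(+,.)pi(.,+) + pi(-,.)pi(.,-)
    kappa = (p0 - pe) / (1.0 - pe)
    # modified kappa: same numerator, denominator pi(+,.)pi(-,.) + pi(.,+)pi(.,-)
    kappa_star = (p0 - pe) / (row[0] * row[1] + col[0] * col[1])
    return kappa, kappa_star

# The four tables from the "Kappa Computations" slide:
for table in ([[0.40, 0.03], [0.03, 0.54]],
              [[0.93, 0.02], [0.04, 0.01]],
              [[0.03, 0.40], [0.54, 0.03]],
              [[0.02, 0.93], [0.01, 0.04]]):
    k, ks = kappa_estimates(table)
    print(f"kappa = {k:.4f}, kappa* = {ks:.4f}")
```

With observed counts n(i,j), passing n(i,j)/n as the proportions gives the plug-in estimates κ̂ and κ̂* of this slide.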
14
Cantor's Observation
Cantor (1996), Psychological Methods: the parameters of the 2 x 2 array, π_0 and π_e, are all derivable from π(+,·), π(·,+) and κ. Var(κ̂) by large-sample estimation [delta method]. Sample size (n) determination for pre-assigned precision in κ̂.
15
Evaluation of Status (contd.)
Multiple categories of evaluation: Type ++, Type +, Type -, Type --.
How to study the nature & the extent of agreement from a given data set?
16
Weighted Kappa: R x R classification, two raters
Cells π(i,i) and π(i,j): Cohen (1960) & Landis & Koch (Biometrics, 1977)
π_0 = Σ_i π(i,i) = P[agreement]
π_e = Σ_i π(i,·)π(·,i) = P[chance agreement]
κ = [π_0 − π_e] / [1 − π_e]
Criticism of unweighted kappa: nearest-neighbor effects?
17
Weighted Kappa: nearest-neighbor effects
Weights W(i,i) and W(i,j):
π_0 = Σ_i π(i,i)  →  π_0w = Σ_{i,j} π(i,j) W(i,j)
π_e = Σ_i π(i,·)π(·,i)  →  π_ew = Σ_{i,j} π(i,·)π(·,j) W(i,j)
κ(W) = [π_0w − π_ew] / [1 − π_ew]
Choice of weights: W(i,i) = 1 and symmetric.
Fleiss & Cohen (1973): W(i,j) = 1 − (i−j)² / (R−1)²
Cicchetti & Allison (1971): W(i,j) = 1 − |i−j| / (R−1)
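A sketch (not the authors' code; function name and the 3 x 3 table in the usage line are illustrative) of weighted kappa for an R x R table of proportions, with the two weight choices named on this slide and the standard row-by-column product for the weighted chance term.

```python
import numpy as np

def weighted_kappa(p, weights="fleiss-cohen"):
    """p: R x R array of cell proportions; weights: 'fleiss-cohen' or 'cicchetti-allison'."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    i, j = np.indices((R, R))
    if weights == "fleiss-cohen":                # W(i,j) = 1 - (i-j)^2 / (R-1)^2
        W = 1.0 - (i - j) ** 2 / (R - 1) ** 2
    else:                                        # Cicchetti-Allison: 1 - |i-j| / (R-1)
        W = 1.0 - np.abs(i - j) / (R - 1)
    row, col = p.sum(axis=1), p.sum(axis=0)
    p0w = np.sum(W * p)                          # weighted observed agreement
    pew = np.sum(W * np.outer(row, col))         # weighted chance agreement
    return (p0w - pew) / (1.0 - pew)

# Hypothetical 3 x 3 table, just to show the call:
p = np.array([[0.20, 0.05, 0.00],
              [0.05, 0.30, 0.05],
              [0.00, 0.05, 0.30]])
print(weighted_kappa(p), weighted_kappa(p, weights="cicchetti-allison"))
```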
18
Beyond Kappa: A Review of Inter-rater Agreement Measures
Banerjee et al. (Canadian Journal of Statistics, 1999, 3-23).
Modelling patterns of agreement: log-linear models, latent class models.
19
Continuous Measurements Evaluation of agreement when the data are measured on a continuous scale…… Pearson correlation coefficient, regression analysis, paired t-tests, least squares analysis for slope and intercept, within-subject coefficient of variation, and intra-class correlation coefficient…..
20
Continuous Measurements
Two raters, n units for measurement. Data: {(x_i, y_i); 1 ≤ i ≤ n}.
Scatter plot: visual checking.
Product-moment correlation coefficient high and positive: what does it mean?
Squared deviation: D² = (X − Y)²
MSD: E[D²] = (m_1 − m_2)² + (σ_1² + σ_2² − 2ρσ_1σ_2)
21
Carotid Stenosis Screening Study, Emory University, 1994-1996
Gold standard: invasive intra-arterial angiogram [IA] method.
Non-invasive: magnetic resonance angiography [MRA] method, with two measurements under MRA: 2D & 3D time of flight.
Three technicians, each on left & right arteries, for 55 patients, by IA & MRA [2D & 3D]: 3 x 3 x 2 = 18 observations per patient.
22
Data Structure
Between technicians: no difference. Left vs right: difference. 2D vs 3D: difference.
Q. Agreement between IA & 2D? IA & 3D?
Barnhart & Williamson (2001, 2002), Biometrics papers: no indication of any strong agreement.
23
Scatter Plot : IA-2D-3D
24
Right vs Left Arteries [IA]
25
Descriptive Statistics: Carotid Stenosis Screening Study
Sample means for methods IA, MRA-2D & MRA-3D, by side

Method    N    Left Artery    Right Artery
IA        55   4.99           4.71
MRA-2D    55   5.36           5.73
MRA-3D    55   5.80           5.52
26
Descriptive Statistics (contd.)
Sample variance-covariance matrix (upper triangle)

        IA-L    IA-R    2D-L    2D-R    3D-L    3D-R
IA-L    11.86   1.40    8.18    1.18    6.80    1.08
IA-R            10.61   2.67    7.53    1.78    7.17
2D-L                    10.98   2.70    8.69    1.74
2D-R                            8.95    2.19    7.69
3D-L                                    11.02   2.65
3D-R                                            10.24
27
Data Analysis
Recall MSD = E[(X − Y)²]. Normed? No!
Lin (1989) converted MSD into a correlation-type coefficient, the Concordance Correlation Coefficient [CCC]:
CCC = 1 − [MSD / MSD_independence] = 2ρσ_1σ_2 / [(m_1 − m_2)² + σ_1² + σ_2²]
Properties: perfect agreement [CCC = 1]; perfect disagreement [CCC = -1]; no agreement [CCC = 0].
28
CCC (contd.)
CCC = 2ρσ_1σ_2 / [(m_1 − m_2)² + σ_1² + σ_2²] = ρ · χ_a
ρ = precision coefficient; χ_a = accuracy coefficient (≤ 1)
χ_a = 2 / [ν + 1/ν + ω²], where ν = σ_1/σ_2 and ω² = (m_1 − m_2)² / (σ_1σ_2)
CCC = 1 iff ρ = 1 & χ_a = 1
χ_a = 1 iff [m_1 = m_2] & [σ_1 = σ_2] hold simultaneously!
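A sketch (not the authors' code; the function name is illustrative and NumPy is assumed) of the sample CCC and its ρ · χ_a decomposition from paired readings; 1/n moments are used so that the identity CCC = ρ · χ_a holds exactly.

```python
import numpy as np

def ccc_decomposition(x, y):
    """Sample CCC, precision (rho) and accuracy (chi_a) for paired readings x, y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m1, m2 = x.mean(), y.mean()
    s1, s2 = x.std(), y.std()              # 1/n standard deviations
    s12 = np.mean((x - m1) * (y - m2))     # 1/n covariance
    rho = s12 / (s1 * s2)                  # precision component
    ccc = 2 * s12 / ((m1 - m2) ** 2 + s1 ** 2 + s2 ** 2)
    nu = s1 / s2
    omega2 = (m1 - m2) ** 2 / (s1 * s2)
    chi_a = 2.0 / (nu + 1.0 / nu + omega2)  # accuracy component, <= 1
    return ccc, rho, chi_a
```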
29
Study of CCC
Identity of the marginals [m_1 = m_2, σ_1 = σ_2]: maximum accuracy. High value of ρ: high precision. BOTH are needed for agreement.
Simultaneous inference on H_0: ρ = ρ_0, [m_1 = m_2] & [σ_1 = σ_2].
LRT & other tests based on the CCC: Pornpis/Montip/Bimal Sinha (2006); more tests in Dutta & Bikas Sinha (2012).
30
Total Deviation Index
Lin (2000) & Lin et al. (JASA, 2002). Assume a bivariate normal (BVN) distribution for (X, Y).
π = P[|Y − X| < k] = P[D² < k²], where D = Y − X
  = P[χ²(1, m_D²/σ_D²) ≤ k²/σ_D²]   (non-central χ², 1 df)
TDI_π = the value of k for a given π. Inference is based on the TDI. Choice of π: 90% or more.
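A sketch (assuming SciPy is available; the function name is illustrative) of TDI_π under the slide's normality assumption: D²/σ_D² follows a non-central χ² with 1 df, so the index is obtained by inverting that cdf at the chosen coverage.

```python
import numpy as np
from scipy.stats import ncx2

def tdi(m_d, s_d, pi0=0.90):
    """m_d, s_d: mean and s.d. of D = Y - X; pi0: target coverage, e.g. 0.90."""
    lam = (m_d / s_d) ** 2                      # non-centrality parameter m_D^2 / sigma_D^2
    return s_d * np.sqrt(ncx2.ppf(pi0, df=1, nc=lam))

# e.g. with the sample mean and s.d. of the paired differences plugged in:
# k90 = tdi(np.mean(y - x), np.std(y - x, ddof=1), 0.90)
```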
31
Coverage Probability
Lin et al. (JASA, 2002), under the BVN distribution:
CP(d) = P[|Y − X| < d] = Φ[(d − m_D)/σ_D] − Φ[(−d − m_D)/σ_D]
The emphasis is on a given d and a high CP.
CP^: plug-in estimator using the sample means, variances & correlation coefficient.
Var[CP^]: large-sample approximation; V^[CP^]: plug-in estimator.
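A sketch of the plug-in estimate CP^(d) from the sample moments of D = Y − X (function name illustrative, SciPy assumed); the delta-method variance is as given in Lin et al. (2002) and is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def coverage_probability(x, y, d):
    """Plug-in CP^(d) = P(|Y - X| < d) under normality of the differences."""
    diff = np.asarray(y, float) - np.asarray(x, float)
    m_d, s_d = diff.mean(), diff.std(ddof=1)
    return norm.cdf((d - m_d) / s_d) - norm.cdf((-d - m_d) / s_d)
```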
32
Graphical Display …
33
Back to Data Analysis: Carotid Stenosis Screening Study, Emory University, 1994-1996
Gold standard: method IA. Competitors: 2D & 3D methods. Left & right arteries: different.
Range of readings: 0 – 100%. Choice of d: 2%.
34
Doctoral Theses
Robieson, W. Z. (1999). On the Weighted Kappa and Concordance Correlation Coefficient. Ph.D. Thesis, University of Illinois at Chicago, USA.
Lou, Congrong (2006). Assessment of Agreement: Multi-Rater Case. Ph.D. Thesis, University of Illinois at Chicago, USA.
35
Data Analysis
Lou (2006) derived expressions for CP(d)^, V^(CP(d)^) and Cov^(·,·), where CP_ij = P[|X_i − X_j| < d].
36
Data Analysis: CP_12, CP_13 & CP_23
Estimated coverage probabilities [CP] with estimated variances & covariances, screening study
(Cov^ = estimated covariance between the CP_12 and CP_13 estimates on that side)

Side     Pairwise CP^          V^(CP^)    Cov^
Left     CP_12(L)^ = 0.56      0.0019     0.0009
Left     CP_13(L)^ = 0.47      0.0015
Right    CP_12(R)^ = 0.60      0.0021     0.0010
Right    CP_13(R)^ = 0.54      0.0019
Left     CP_23(L)^ = 0.64      0.0021
Right    CP_23(R)^ = 0.69      0.0021
37
95% Lower Confidence Limits
Left side:  CP_12(L)^ = 0.56, 95% lower CL = 0.48;  CP_13(L)^ = 0.47, 95% lower CL = 0.40
Right side: CP_12(R)^ = 0.60, 95% lower CL = 0.51;  CP_13(R)^ = 0.54, 95% lower CL = 0.46
Conclusion: poor agreement in all cases.
38
Data Analysis (contd.)

Hypothesis                      Statistic   p-value
H_0L: CP_12(L) = CP_13(L)       Z-score     0.0366
H_0R: CP_12(R) = CP_13(R)       Z-score     0.1393
[against two-sided alternatives]

Conclusions: for the left side, the CPs for [IA vs 2D] and [IA vs 3D] are likely to be different, while for the right side these are likely to be equal.
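The slides do not show the form of the Z statistic; a plausible sketch is the standard Wald comparison of two correlated estimates, using the variances and covariance from the table two slides back. With those rounded inputs the p-values come out close to, but not exactly equal to, the 0.0366 and 0.1393 reported above.

```python
from math import sqrt
from scipy.stats import norm

def z_test(cp_a, cp_b, var_a, var_b, cov_ab):
    """Z test of H0: CP_a = CP_b for correlated estimates; returns (z, two-sided p)."""
    z = (cp_a - cp_b) / sqrt(var_a + var_b - 2 * cov_ab)
    return z, 2 * (1 - norm.cdf(abs(z)))

print(z_test(0.56, 0.47, 0.0019, 0.0015, 0.0009))   # left side:  z about 2.25
print(z_test(0.60, 0.54, 0.0021, 0.0019, 0.0010))   # right side: z about 1.34
```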
39
Testing Multiple Hypotheses
For K alternatives [1, 2, …, K] to the gold standard [0], interest lies in
H_0L: CP_01(L) = CP_02(L) = … = CP_0K(L)
H_0R: CP_01(R) = CP_02(R) = … = CP_0K(R)
This is accomplished by a large-sample chi-square test [Rao (1973)]. Set, for the left side, L = (CP_01(L)^, CP_02(L)^, …, CP_0K(L)^).
40
Chi-Square Test
Test statistic: L′ W^(-1) L − [L′ W^(-1) 1]² / [1′ W^(-1) 1], where
W_tt = Var(CP_0t(L)^), t = 1, 2, …, K
W_st = Cov(CP_0s(L)^, CP_0t(L)^), s ≠ t
Asymptotically chi-square with K − 1 df. Similarly for the right-side hypothesis.
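A sketch of the chi-square statistic above, computed with NumPy/SciPy (function and argument names are illustrative): pass the vector of CP estimates and its estimated covariance matrix.

```python
import numpy as np
from scipy.stats import chi2

def equal_cp_chisq(cp_hat, W):
    """Chi-square statistic for H0: CP_01 = ... = CP_0K and its asymptotic p-value."""
    L = np.asarray(cp_hat, float)
    Winv = np.linalg.inv(np.asarray(W, float))
    one = np.ones_like(L)
    stat = L @ Winv @ L - (L @ Winv @ one) ** 2 / (one @ Winv @ one)
    df = len(L) - 1                      # K - 1 degrees of freedom
    return stat, chi2.sf(stat, df)
```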
41
Simultaneous Lower Confidence Limits
We want limits with Pr[CP_01 ≥ L_1, CP_02 ≥ L_2, …, CP_0K ≥ L_K] ≥ 95%.
Set Z_t = [CP_0t^ − CP_0t] / √Var^(CP_0t^).
Assume the Z_t's jointly follow a multivariate normal distribution; work out the estimated correlation matrix as usual.
Solve for z such that Pr[Z_1 ≤ z, Z_2 ≤ z, …, Z_K ≤ z] = 95%.
Then L_t = CP_0t^ − z·√Var^(CP_0t^), t = 1, 2, …, K.
Stat package: available with Lou (2006).
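Lou's (2006) package is not reproduced here; below is a sketch of the same recipe under stated assumptions (SciPy's multivariate normal cdf and a simple root search; function name illustrative): solve the equicoordinate equation for z, then subtract z standard errors from each estimate.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.optimize import brentq

def simultaneous_lower_limits(cp_hat, cov_hat, level=0.95):
    """Simultaneous lower confidence limits L_t = CP_0t^ - z * SE_t."""
    cp_hat = np.asarray(cp_hat, float)
    cov_hat = np.asarray(cov_hat, float)
    se = np.sqrt(np.diag(cov_hat))
    corr = cov_hat / np.outer(se, se)            # estimated correlation matrix
    K = len(cp_hat)
    joint = lambda z: multivariate_normal.cdf(np.full(K, z), mean=np.zeros(K), cov=corr)
    z = brentq(lambda z: joint(z) - level, 0.0, 6.0)   # equicoordinate 95% cutoff
    return cp_hat - z * se
```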