Reading Report: A unified approach for assessing agreement for continuous and categorical data
Yingdong Feng
Introduction
In Lin's paper, the authors propose a series of indices for assessing agreement, precision, and accuracy, and additionally propose the CP and TDI for normal data. All five indices are expressed as functions of variance components. Lin obtains the estimates and performs inference for all of these functions of variance components through the GEE (generalized estimating equations) method. In their model, they measure the agreement among k raters, with each rater having multiple (m) readings from each of the n subjects, for both continuous and categorical data.
Introduction
The approach of this paper integrates the approaches of Barnhart et al. (2005) and Carrasco and Jover (2003). For example, Barnhart et al. (2005) proposed a series of indices (intra-rater CCC, inter-rater CCC, and total CCC) and estimated those indices and their inferences by the GEE method. The definitions of these three indices are listed below. In this paper, the authors introduce a unified approach that can be used for continuous, binary, and ordinal data. They provide simulation results assessing the performance of the unified approach in section 4 and give two examples illustrating its use in section 5.

Index: Definition
Intra-rater: agreement among multiple readings from the same rater
Inter-rater: agreement among different raters, based on the average of multiple readings
Total-rater: agreement among different raters, based on individual readings
Method
In this paper, the model used for measuring agreement is

y_ijl = μ + α_i + β_j + γ_ij + e_ijl,

where y_ijl is the lth reading on subject i given by rater j, with i = 1, 2, …, n, j = 1, 2, …, k, and l = 1, 2, …, m. Here μ is the overall mean, α_i is the random subject effect, β_j is the rater effect, γ_ij is the random subject-by-rater interaction effect, and e_ijl is the random error. The variance among all raters is denoted σ²_β. Based on this model, they propose a series of indices to measure agreement, precision, and accuracy.
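To make the model concrete, here is a minimal simulation sketch of data generated under this two-way mixed model. This is my illustration, not code from the paper; the variance-component values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (made-up) settings: n subjects, k raters, m replicate readings
n, k, m = 20, 2, 3
mu = 10.0
beta = np.array([-0.5, 0.5])                 # fixed rater effects
sd_alpha, sd_gamma, sd_e = 2.0, 0.5, 1.0     # standard deviations of the random effects

alpha = rng.normal(0, sd_alpha, size=n)           # random subject effects
gamma = rng.normal(0, sd_gamma, size=(n, k))      # subject-by-rater interaction
e = rng.normal(0, sd_e, size=(n, k, m))           # replicate-level error

# y[i, j, l] = mu + alpha_i + beta_j + gamma_ij + e_ijl
y = mu + alpha[:, None, None] + beta[None, :, None] + gamma[:, :, None] + e
print(y.shape)  # (20, 2, 3): n x k x m
```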
Method
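The formula slides (slides 5 through 7) did not survive in this transcript. For reference, the variance-components forms commonly given for these indices in this literature (e.g., Carrasco and Jover, 2003) are sketched below; the paper's exact expressions may differ in details, such as how the rater variance σ²_β is defined:

$$\mathrm{CCC}_{\mathrm{intra}} = \frac{\sigma_\alpha^2 + \sigma_\gamma^2}{\sigma_\alpha^2 + \sigma_\gamma^2 + \sigma_e^2}, \qquad \mathrm{CCC}_{\mathrm{inter}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\beta^2 + \sigma_\gamma^2 + \sigma_e^2/m}, \qquad \mathrm{CCC}_{\mathrm{total}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\beta^2 + \sigma_\gamma^2 + \sigma_e^2},$$

with the total mean squared deviation $\mathrm{MSD}_{\mathrm{total}} = 2(\sigma_\beta^2 + \sigma_\gamma^2 + \sigma_e^2)$. Note that only $\mathrm{CCC}_{\mathrm{inter}}$ involves m, which is consistent with the next slide's remark that the total indices do not depend on the number of replications.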
In the total agreement part, the authors give CCC_total, precision_total, accuracy_total, MSD, TDI, and CP. Since total agreement measures agreement based on any individual reading from each rater, these indices do not depend on the number of replications, unlike the inter-rater indices. For estimation and inference, we first need to estimate the mean for each rater and all of the variance components; the paper proposes a system of estimating equations for this.
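The paper estimates these components through its GEE system of equations. As a rough stand-in for balanced data, here is a classical ANOVA (method-of-moments) sketch that recovers the variance components from the simulated array `y` above and plugs them into the hedged CCC forms given earlier; it is an illustration, not the paper's estimator.

```python
import numpy as np

def moment_components(y):
    """ANOVA method-of-moments variance components for a balanced y[i, j, l] array."""
    n, k, m = y.shape
    cell = y.mean(axis=2)        # subject-by-rater cell means
    subj = cell.mean(axis=1)     # subject means
    rater = cell.mean(axis=0)    # rater means
    grand = y.mean()

    ms_e = ((y - cell[:, :, None]) ** 2).sum() / (n * k * (m - 1))   # within-cell error
    ms_g = m * ((cell - subj[:, None] - rater[None, :] + grand) ** 2).sum() / ((n - 1) * (k - 1))
    ms_a = k * m * ((subj - grand) ** 2).sum() / (n - 1)

    s2_e = ms_e
    s2_g = max((ms_g - ms_e) / m, 0.0)          # interaction component
    s2_a = max((ms_a - ms_g) / (k * m), 0.0)    # subject component
    # Naive plug-in for the rater variance; ignores sampling noise in the rater means
    s2_b = ((rater - grand) ** 2).sum() / (k - 1)
    return s2_a, s2_b, s2_g, s2_e

# Using the y simulated in the earlier sketch:
s2_a, s2_b, s2_g, s2_e = moment_components(y)
ccc_intra = (s2_a + s2_g) / (s2_a + s2_g + s2_e)
ccc_total = s2_a / (s2_a + s2_b + s2_g + s2_e)
print(round(ccc_intra, 3), round(ccc_total, 3))
```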
Method
The delta method is then used to obtain the estimates and their inferences for all indices. The following table shows which transformation is used for each set of indices.

Indices: Transformation method
CCC indices and precision indices: Z-transformation
Accuracy and CP indices: Logit transformation
TDIs: Natural log transformation
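A minimal sketch of how such transformed confidence intervals are typically constructed with the delta method; the function names and the 1.96 normal quantile are mine, not the paper's.

```python
import numpy as np

def ci_fisher_z(est, se, z=1.96):
    """CI for a correlation-type index (CCC, precision) via Fisher's Z-transformation."""
    t, se_t = np.arctanh(est), se / (1.0 - est ** 2)   # delta method on the Z scale
    return np.tanh(t - z * se_t), np.tanh(t + z * se_t)

def ci_logit(est, se, z=1.96):
    """CI for a probability-type index (accuracy, CP) via the logit transformation."""
    t, se_t = np.log(est / (1.0 - est)), se / (est * (1.0 - est))
    lo, hi = t - z * se_t, t + z * se_t
    return 1.0 / (1.0 + np.exp(-lo)), 1.0 / (1.0 + np.exp(-hi))

def ci_log(est, se, z=1.96):
    """CI for a positive index (TDI) via the natural log transformation."""
    t, se_t = np.log(est), se / est
    return np.exp(t - z * se_t), np.exp(t + z * se_t)

print(ci_fisher_z(0.9, 0.05))  # example: back-transformed interval stays inside (-1, 1)
```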
Simulation
In section 3, this paper gives simulations based on binary data, ordinal data, and normal data, in order to evaluate the performance of the proposed indices and to compare them against other existing methods. The results are shown in Tables 1 through 5. Tables 1 and 2 are both results from the binary-data simulation, but Table 1 uses the transformations from the table above. Similarly, Tables 3 and 4 are both results from the ordinal-data simulation, with Table 3 using the transformations. Table 5 gives the normal-data results with transformation. All five tables use the same three cases: case one is k = 2 and m = 1, case two is k = 4 and m = 1, and case three is k = 2 and m = 3. For each case, they generate 1000 random samples of size 20.
Simulation
There are five columns in each table; the definition of each column is listed below.

Theoretical: the theoretical value for this case
Mean: the mean of the 1000 estimated indices from the 1000 random samples
Std (Est): the standard deviation of the 1000 estimated indices from the 1000 random samples
Mean (Std): the mean of the 1000 estimated standard errors
Sig: the proportion of estimates which fall outside the 95% confidence interval
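A generic sketch of the Monte Carlo loop that would produce these columns; `generate_sample` and `estimate_index` are hypothetical placeholders, not functions from the paper.

```python
import numpy as np

def simulate_table_row(generate_sample, estimate_index, theta_true, reps=1000, z=1.96):
    """Summarize one index over repeated samples: Mean, Std (Est), Mean (Std), Sig."""
    ests, ses = np.empty(reps), np.empty(reps)
    for r in range(reps):
        sample = generate_sample()                 # e.g., a random sample of size 20
        ests[r], ses[r] = estimate_index(sample)   # point estimate and standard error
    # 'Sig' read as: proportion of nominal 95% intervals that miss the theoretical value
    miss = np.mean((theta_true < ests - z * ses) | (theta_true > ests + z * ses))
    return {"Theoretical": theta_true, "Mean": ests.mean(),
            "Std (Est)": ests.std(ddof=1), "Mean (Std)": ses.mean(), "Sig": miss}
```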
Simulation
Why can they have theoretical values? Because the data-generating settings are fixed in advance. For example, for binary data with k = 2 and m = 1, they set the correlation equal to 0.6, the margin for the first variable to (0.3, 0.7), and the margin for the second variable to (0.5, 0.5). For binary data with k = 4 and m = 1, they set the mean vector μ = (0.55, 0.6, 0.65, 0.8) and ρ_12 = 0.75, ρ_13 = 0.7, ρ_14 = 0.5, ρ_23 = 0.8, ρ_24 = 0.6, and ρ_34 = 0.6. For binary data with k = 2 and m = 3, they set the mean vector μ = (0.7, 0.7, 0.7, 0.6, 0.6, 0.6); the correlation between any two of the first three variables is 0.8, the correlation between any two of the last three variables is also 0.8, and the correlation between any one of the first three variables and any one of the last three variables is 0.7. With these settings, the theoretical values can be calculated in advance.
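As an illustration of how a theoretical value follows from these settings, here is the k = 2, m = 1 binary case worked through the standard moment definition of the CCC. The calculation is mine, and it assumes the stated margins correspond to Bernoulli success probabilities of 0.7 and 0.5.

```python
import numpy as np

# Settings for the k = 2, m = 1 binary case (from the slide)
p1, p2, rho = 0.7, 0.5, 0.6

var1, var2 = p1 * (1 - p1), p2 * (1 - p2)   # Bernoulli variances: 0.21, 0.25
cov12 = rho * np.sqrt(var1 * var2)          # covariance implied by the correlation

# Standard moment form: CCC = 2*cov / (var1 + var2 + (mu1 - mu2)^2)
ccc = 2 * cov12 / (var1 + var2 + (p1 - p2) ** 2)
print(round(ccc, 4))  # ~0.55
```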
From the results of Tables 1 and 2, all three cases perform very well. It is straightforward to see that the numbers in the first and second columns are very close, and that the numbers in the third and fourth columns are very close, which means the estimates are very close to the corresponding theoretical values and the means of the estimated standard errors are very close to the corresponding standard deviations of the estimates. We can conclude that the proposed indices work well for binary data.
Tables 3 and 4 show the results for the ordinal-data simulation; similarly, the correlations and margins are set in advance to obtain the theoretical values. For both tables the results are also similar to the binary-data case: the numbers in the first column are close to those in the second, as are the third and fourth columns. We can say these indices work well for ordinal data.
The last table in the simulation part gives the results for normal data with transformation. In order to get the theoretical values, the authors have to set the precision, accuracy, within-rater precision, and between-rater precision in advance. Notice that most of the means of the estimated standard errors are close to the corresponding standard deviations of the estimates, except for CP_inter. Unlike the authors' conclusion, I would say Carrasco's method performs better than the CCC here when m = 1. Also notice that in the case of k = 2 and m = 3, the inter-rater agreement calculated from Barnhart's method is slightly larger than the paper's; the reason is that Barnhart's method assumes m is infinite. Thus, from the simulation results, we can conclude that the indices proposed in this paper work fairly well, in both the estimates and the corresponding inferences, for binary, ordinal, and normal data.
Example
This paper gives two examples to illustrate the use of the unified approach. The first is diaspirin cross-linked hemoglobin (DCLHb) and the second is assay validation. In this reading report we discuss the results from the second example. They consider the Hemagglutinin Inhibition (HAI) assay for antibody to Influenza A (H3N2) in rabbit serum samples from two different labs. Serum samples from 64 rabbits are measured twice by each method. The antibody level is classified as negative, positive, or highly positive.
Example
In the paper, Tables 7 through 10 show the frequency tables for the within-lab and between-lab readings. From Tables 9 and 10 we can see that lab two tends to report higher values than lab one, but Tables 7 and 8 suggest that the within-lab agreement is good.
Example
Since it is an imprecise assay, the authors allowed looser agreement criteria: agreement was defined as a within-sample total deviation of no more than 50% of the total deviation if observations are from the same method, and a within-sample total deviation of no more than 75% of the total deviation if observations are from different methods. Since these indices relate to the relative deviation d through index = 1 − d², this gives a least acceptable CCC_intra of 1 − 0.5² = 0.75 and a least acceptable CCC_inter of 1 − 0.75² = 0.4375.
Example
Table 11 shows the results; the item "97.5% confidence limit" is the one-sided 97.5% lower confidence limit of the corresponding agreement statistic. Now let's look at the data in this table. For example, precision_intra was estimated to be 0.88361, which means that for observations from the same method, the within-sample deviation is about 34.1% of the total deviation (√(1 − 0.88361) ≈ 0.341). CCC_inter is estimated to be 0.37225, which means that for the average observations from different methods, the within-sample deviation is about 79.2% of the total deviation (√(1 − 0.37225) ≈ 0.792).
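The back-calculation from an index to a relative deviation, and the comparison against the acceptance limit from the previous slide, can be sketched as follows; this is my illustration, using the relation index = 1 − d² implied by the slide's numbers.

```python
import math

def relative_deviation(index):
    """Within-sample deviation as a fraction of total deviation, assuming index = 1 - d^2."""
    return math.sqrt(1.0 - index)

print(relative_deviation(0.88361))   # ~0.341: about 34.1% of the total deviation
print(relative_deviation(0.37225))   # ~0.792: about 79.2% of the total deviation

# CCC_inter against the least acceptable value from the previous slide
print(0.37225 >= 0.4375)  # False: the between-lab criterion is not met
```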
Conclusion
Measuring agreement between different methods or different raters has received a great deal of attention recently. In this paper, the authors propose several indices, including the CCC, precision, accuracy, CP, and TDI, and use them to measure the intra-, inter-, and total agreement among all raters. From the simulation part, we have seen that these indices work fairly well for binary, ordinal, and normal data. In the HAI assay example, the indices show that the agreement between the two labs' readings falls short of the acceptance criterion, and the authors suggest that kappa or weighted kappa could also be applied to assess the agreement within each lab.
Further Research
We could consider including link functions such as the log or logit in the GEE method, in order to make the approach more robust to different types of data. Also, the variance-component functions in this paper are based on balanced data; to handle missing data, we may need to modify these functions or develop new ones.