Published by Cecilia Laurel Carr. Modified over 8 years ago.
1
5. Evaluation of measuring tools: reliability
Psychometrics. 2011/12. Group A (English)
2
Once we have evaluated the quality of the test items and eliminated those that are not considered adequate, we must evaluate the overall quality of the test. In this chapter we discuss the reliability and accuracy of the measure, trying to answer the question: “To what extent are the scores obtained by subjects on the test affected by measurement error?”
3
The problem of measurement error
4
Measurement error is the difference between the empirical score obtained by a subject on a test and his/her true score.
Objective:
– To develop tests that lead to the minimum possible measurement error.
– To ensure that the obtained score gives the greatest possible degree of real information about the characteristic under study.
There are other errors, random ones, which are studied through the analysis of reliability.
5
Types of measurement errors
6
Measurement error: the difference between the empirical score of a subject and their true score.
– It gives an individual measure of the accuracy of the test.
– The standard error of measurement: the standard deviation of the measurement errors. It is a group measure because it is calculated over all subjects in the sample.
Estimation error of the true score: the difference between the true score of the subject and the true score predicted by the regression model.
– The standard error of estimation of the true score: the standard deviation of the estimation errors.
7
Substitution error: the difference between the score obtained by a subject on a test and the score obtained on a parallel test. It is the error made when the scores on test X1 are replaced by those from a parallel test X2.
– The standard error of substitution: the standard deviation of the substitution errors.
Prediction error: the difference between the score obtained by a subject on a test (X1) and the score predicted for the same test (X1') from a parallel test X2.
– The standard error of prediction: the standard deviation of the prediction errors.
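The four standard errors defined on these slides have well-known closed forms in classical test theory. The sketch below assumes those textbook expressions, written in terms of the standard deviation of the empirical scores and the reliability coefficient. The function name, the dictionary keys and the illustrative values (s_x = 10, r_xx = 0.75) are not taken from the slides.

```python
import math


def standard_errors(s_x: float, r_xx: float) -> dict:
    """Textbook classical-test-theory standard errors (assumed formulas)."""
    return {
        # Standard error of measurement: S_x * sqrt(1 - r_xx)
        "measurement": s_x * math.sqrt(1 - r_xx),
        # Standard error of estimation of the true score: S_x * sqrt(r_xx * (1 - r_xx))
        "estimation": s_x * math.sqrt(r_xx * (1 - r_xx)),
        # Standard error of substitution: S_x * sqrt(2 * (1 - r_xx))
        "substitution": s_x * math.sqrt(2 * (1 - r_xx)),
        # Standard error of prediction: S_x * sqrt(1 - r_xx ** 2)
        "prediction": s_x * math.sqrt(1 - r_xx ** 2),
    }


# Illustrative values only; they do not come from the presentation.
errors = standard_errors(s_x=10.0, r_xx=0.75)
for name, value in errors.items():
    print(f"{name}: {value:.4f}")
```

Under these formulas the estimation error is the smallest and the substitution error the largest for any reliability strictly between 0 and 1.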
8
The linear model of Spearman
9
This model helps us estimate the amount of error affecting the empirical scores and the true level of subjects on the characteristic under study. X (empirical score) = V (true level) + E (measurement error)
10
A) E = X – V
B) E(e) = 0
C)
D) Cov(V, E) = 0
E)
F) Cov(X, V) =
G)
H)
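A few standard consequences of the linear model can be derived from the assumptions that survive on this slide. They are given here as textbook results of classical test theory, not as a reconstruction of the lettered items; the variance symbols S_X^2, S_V^2 and S_E^2 are assumed notation.

```latex
\begin{align*}
X &= V + E, \qquad \mathrm{E}(E) = 0, \qquad \operatorname{Cov}(V, E) = 0 \\
\mathrm{E}(X) &= \mathrm{E}(V) + \mathrm{E}(E) = \mathrm{E}(V) \\
\operatorname{Cov}(X, V) &= \operatorname{Cov}(V + E,\, V) = S_V^2 + \operatorname{Cov}(E, V) = S_V^2 \\
S_X^2 &= S_V^2 + S_E^2 + 2\operatorname{Cov}(V, E) = S_V^2 + S_E^2
\end{align*}
```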
11
Interpretation of the reliability coefficient
12
The reliability coefficient can be interpreted as:
– The correlation between the empirical scores obtained by a sample of subjects on two parallel forms of the test.
– The ratio between the variance of the true scores and the variance of the empirical scores. As this ratio increases, the measurement error decreases.
Reliability index:
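The two interpretations can be checked against each other with a small simulation. This is a sketch under assumed conditions: normally distributed true scores and errors, a fixed random seed, and arbitrary parameter values, none of which come from the slides. It also computes the correlation between empirical and true scores, which is how the reliability index is usually defined in classical test theory; it equals the square root of the reliability coefficient.

```python
import math
import random
from statistics import pvariance


def pearson(a, b):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)


rng = random.Random(0)  # fixed seed so the run is reproducible
n_subjects = 20000

# Assumed population: true-score SD = 8, error SD = 4,
# so the theoretical reliability is 64 / (64 + 16) = 0.8.
true_scores = [rng.gauss(50, 8) for _ in range(n_subjects)]
form_1 = [v + rng.gauss(0, 4) for v in true_scores]  # X = V + E
form_2 = [v + rng.gauss(0, 4) for v in true_scores]  # parallel form

r_parallel = pearson(form_1, form_2)
variance_ratio = pvariance(true_scores) / pvariance(form_1)
reliability_index = pearson(form_1, true_scores)

print(f"correlation between parallel forms: {r_parallel:.3f}")
print(f"true variance / empirical variance: {variance_ratio:.3f}")
print(f"reliability index:                  {reliability_index:.3f}")
```

With a sample this large, both estimates of the reliability coefficient should land close to the theoretical 0.8.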
13
Factors that affect reliability
14
TEST LENGTH
– If we increase the length of the test (by adding parallel items), we obtain more information about the attribute under study and a lower error when estimating the true score of a subject.
– So, reliability will increase.
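The effect of test length is usually quantified with the general Spearman-Brown formula, which is not shown on this slide. A minimal sketch, assuming that formula and that every added item is parallel to the existing ones; the function name and the example values are illustrative:

```python
def spearman_brown(r_xx: float, n: float) -> float:
    """Predicted reliability when the test length is multiplied by n.

    Assumes the general Spearman-Brown formula:
    R = n * r_xx / (1 + (n - 1) * r_xx)
    """
    return n * r_xx / (1 + (n - 1) * r_xx)


doubled = spearman_brown(r_xx=0.60, n=2)   # test made twice as long
halved = spearman_brown(r_xx=0.60, n=0.5)  # test cut to half its length
print(f"doubled: {doubled:.3f}, halved: {halved:.3f}")
```

Doubling a test with reliability 0.60 gives a predicted reliability of 0.75, while halving it lowers the prediction to about 0.43.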
15
SAMPLE VARIABILITY
– The reliability coefficient can vary depending on the homogeneity of the group.
– The more homogeneous the group, the lower the reliability coefficient.
– We assume that the standard error of measurement of a test remains constant regardless of the variability of the group to which it is applied.
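If the standard error of measurement is assumed constant across groups, as stated on the slide, then S1²(1 − r1) = S2²(1 − r2), which lets us predict the reliability in a second group from its variance. The sketch below assumes that relation; the function name and the numbers are illustrative, not from the slides.

```python
def reliability_in_new_group(r_1: float, var_1: float, var_2: float) -> float:
    """Reliability expected in a group with score variance var_2.

    Assumes a constant standard error of measurement, so that
    var_1 * (1 - r_1) == var_2 * (1 - r_2).
    """
    return 1 - (var_1 / var_2) * (1 - r_1)


# Reference group: reliability 0.80 and score variance 25 (illustrative).
more_homogeneous = reliability_in_new_group(r_1=0.80, var_1=25, var_2=16)
more_heterogeneous = reliability_in_new_group(r_1=0.80, var_1=25, var_2=36)
print(f"variance 16: {more_homogeneous:.4f}")
print(f"variance 36: {more_heterogeneous:.4f}")
```

The more homogeneous group (smaller variance) yields a lower coefficient, and the more heterogeneous group a higher one.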
16
Reliability as equivalence and stability of measures Coefficient of reliability or equivalence
17
A test must meet two requirements:
– It should measure the characteristic that it is really intended to measure (be valid).
– The empirical scores obtained by applying the test should be accurate (free of error) and stable: when we evaluate a trait or characteristic with the same test at different times and under conditions as similar as possible, if the trait has not changed, we should obtain similar results (reliability of the test).
18
a) Parallel forms method
1. Develop two parallel forms of the test, X and X’.
2. Apply the two forms to a sample of subjects representative of the population targeted by the test.
3. Calculate Pearson’s correlation between X1 and X2, the scores obtained by subjects on each form of the test.
Advantage: if both forms are applied at the same time, there is greater control over the conditions of application.
Disadvantage: it is difficult to develop two parallel forms.
19
b) Test-retest method
1. Apply the same test on two separate occasions to the same sample of subjects.
2. Calculate the correlation between X1 and X2, the scores obtained by subjects in each application of the test.
Advantage: it does not require different forms of the same test.
Disadvantage: the results may be influenced by memory, by the time interval between one application and the other, and by the attitude of the subject.
20
Reliability as internal consistency
21
Methods to estimate the reliability of a test that only require one application:
– A) Based on the division of the test into two parts: Spearman-Brown, Rulon, Guttman-Flanagan.
– B) Based on the covariation of items: Cronbach's alpha coefficient.
22
a) Methods based on the division of the test into two parts
The estimation of reliability is not affected by the factors discussed above, and these methods save time and effort.
1. Apply the test to a sample of subjects.
2. Once the scores have been obtained, divide the test into two parts, calculate the correlation between the scores obtained by subjects on both parts, and apply a correction formula.
The parts should be similar in difficulty and content.
23
Spearman-Brown
The two parts must be parallel, so we should check the assumptions of parallelism (the true scores of the subjects are the same in both tests, and the variance of the measurement errors is the same in both tests).
24
Spearman-Brown. Example
We have applied a 20-item numerical aptitude test to a sample of 6 subjects. The table shows the scores obtained on the even items (X1) and the odd items (X2). Calculate the reliability coefficient assuming that the two parts of the test are parallel.
Subjects   X1   X2   X1²   X2²   X1X2
1           8    4    64    16     32
2           7    7    49    49     49
3           8    6    64    36     48
4           5    4    25    16     20
5           8    7    64    49     56
6           6    6    36    36     36
Total      42   34   302   202    241
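The solution slide for this example is not part of the transcript. The sketch below works it through under the usual procedure: Pearson's correlation between the two halves computed from the column totals, followed by the Spearman-Brown correction for a test of double length, 2r / (1 + r). The variable names are illustrative.

```python
import math

# Scores on the even items (X1) and odd items (X2) for the 6 subjects.
x1 = [8, 7, 8, 5, 8, 6]
x2 = [4, 7, 6, 4, 7, 6]
n = len(x1)

# Column totals, matching the "Total" row of the table.
sum_x1, sum_x2 = sum(x1), sum(x2)
sum_x1_sq = sum(a * a for a in x1)
sum_x2_sq = sum(b * b for b in x2)
sum_x1x2 = sum(a * b for a, b in zip(x1, x2))

# Pearson correlation between the two halves (raw-score formula).
r_halves = (n * sum_x1x2 - sum_x1 * sum_x2) / math.sqrt(
    (n * sum_x1_sq - sum_x1 ** 2) * (n * sum_x2_sq - sum_x2 ** 2)
)

# Spearman-Brown correction for the full-length test.
r_xx = 2 * r_halves / (1 + r_halves)

print(f"correlation between halves: {r_halves:.4f}")
print(f"reliability of the full test: {r_xx:.4f}")
```

With these data the correlation between halves is about 0.35 and the corrected reliability coefficient is about 0.52.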
26
Rulon and Guttman-Flanagan
These methods are applied when the two parts, despite not being strictly parallel, can be considered tau-equivalent (the true scores of the subjects in the sample are the same in both forms, but the error variances are not necessarily equal) or essentially tau-equivalent (the true score of each subject in one form is equal to that in the other plus a constant).
27
Rulon and Guttman-Flanagan
Rulon:
Guttman-Flanagan:
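The formulas themselves did not survive on this slide. The sketch below assumes the usual textbook forms: Rulon as 1 − S_d² / S_x², where d is the difference between the two halves, and Guttman-Flanagan as 2(1 − (S_1² + S_2²) / S_x²). It reuses the even/odd scores from the Spearman-Brown example purely for illustration.

```python
from statistics import pvariance

# Even-item (X1) and odd-item (X2) scores from the earlier example.
x1 = [8, 7, 8, 5, 8, 6]
x2 = [4, 7, 6, 4, 7, 6]

total = [a + b for a, b in zip(x1, x2)]  # full-test score X = X1 + X2
diff = [a - b for a, b in zip(x1, x2)]   # difference between halves

# Rulon (assumed form): 1 - variance of differences / variance of totals.
rulon = 1 - pvariance(diff) / pvariance(total)

# Guttman-Flanagan (assumed form): 2 * (1 - (var(X1) + var(X2)) / var(X)).
guttman_flanagan = 2 * (1 - (pvariance(x1) + pvariance(x2)) / pvariance(total))

print(f"Rulon:            {rulon:.4f}")
print(f"Guttman-Flanagan: {guttman_flanagan:.4f}")
```

The two expressions are algebraically equivalent, so they return the same value here (about 0.514), slightly below the Spearman-Brown estimate for the same data.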
28
b) Method based on the covariation of items
Cronbach's alpha coefficient.
– It requires the analysis of the variance and covariance of the subjects' responses to the items.
– It is an estimate of the internal consistency of the test items.
– It is based on the mean correlation among all the test items.
29
Cronbach's alpha. Example
We have applied a test of visual perception to 6 subjects. The table shows the scores of the subjects on each of the five test items. Calculate the value of the reliability coefficient of the test.
Subjects   1   2   3   4   5
A          3   4   3   3   4
B          2   3   2   4   4
C          4   2   2   3   3
D          2   1   1   2   1
E          1   1   1   2   1
F          0   0   1   1   1
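The solution slide for this example is missing from the transcript. A minimal sketch of the calculation, assuming the standard variance form of the coefficient, alpha = k / (k − 1) · (1 − Σ item variances / variance of total scores); the variable names are illustrative.

```python
from statistics import pvariance

# Item scores (items 1 to 5) for subjects A to F.
scores = {
    "A": [3, 4, 3, 3, 4],
    "B": [2, 3, 2, 4, 4],
    "C": [4, 2, 2, 3, 3],
    "D": [2, 1, 1, 2, 1],
    "E": [1, 1, 1, 2, 1],
    "F": [0, 0, 1, 1, 1],
}

rows = list(scores.values())
k = len(rows[0])  # number of items

# Variance of each item across subjects (columns of the table).
item_variances = [pvariance(column) for column in zip(*rows)]

# Variance of the subjects' total scores.
total_variance = pvariance([sum(row) for row in rows])

alpha = k / (k - 1) * (1 - sum(item_variances) / total_variance)
print(f"Cronbach's alpha: {alpha:.4f}")
```

With these data alpha is about 0.94. Using sample variances instead of population variances throughout would give the same result, because the ratio is unchanged.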
31
Estimation of the true score of the subjects in the attribute of interest
32
We make estimations about the value of a subject's true score on a test and about the error that affects the empirical scores obtained on the same test. We cannot calculate the exact value of the subject's true score, but we can establish a confidence interval within which the true score will lie with a given confidence level.
33
a) Method based on Chebychev’s inequality
It is applied when no assumption is made about the distribution of the empirical scores or of the errors. The true score will lie between two values, the lower limit and the upper limit.
34
Example
We have administered a numerical reasoning test to 200 subjects and obtained: Mean = 52, S_X = 7, r_XX = 0.73. Estimate the true score of a subject who obtained an empirical score of 65 points on the test, with a confidence level of 95%.
The resulting interval is too wide, which implies a vague estimation. This may be due to a low reliability coefficient or to the fact that this method does not take into account the type of distribution of the empirical scores.
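The worked solution is not in the transcript. The sketch below assumes the usual application of the inequality, P(|X − V| ≤ K · S_e) ≥ 1 − 1/K², with the standard error of measurement taken as S_e = S_X · sqrt(1 − r_XX). The function name is illustrative.

```python
import math


def chebyshev_interval(x, s_x, r_xx, confidence):
    """Distribution-free confidence interval for the true score.

    Assumes 1 - 1/K**2 = confidence, S_e = S_x * sqrt(1 - r_xx),
    and limits x -/+ K * S_e.
    """
    k = math.sqrt(1 / (1 - confidence))
    s_e = s_x * math.sqrt(1 - r_xx)  # standard error of measurement
    e_max = k * s_e                  # maximum error admitted
    return x - e_max, x + e_max


# Data from the example: S_X = 7, r_XX = 0.73, empirical score 65, 95% level.
lower, upper = chebyshev_interval(x=65, s_x=7, r_xx=0.73, confidence=0.95)
print(f"{lower:.2f} <= V <= {upper:.2f}")
```

Under these assumptions the interval runs from about 48.7 to about 81.3, a width of more than 32 points, which is consistent with the remark that the estimate is vague.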
35
b) Estimation based on the normal distribution of errors
It assumes a normal distribution of both the measurement errors and the empirical scores.
1. Find the Z value in the normal distribution table for the desired confidence level.
2. Calculate the standard error of measurement.
3. Calculate the maximum measurement error that we are willing to admit.
4. Calculate the confidence interval in which we will find the true score.
36
Example
The same data as in the previous example. The confidence interval is considerably narrower than the one obtained with Chebychev's inequality.
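The four steps can be sketched as follows for the same data (S_X = 7, r_XX = 0.73, empirical score 65, 95% confidence). The sketch assumes S_e = S_X · sqrt(1 − r_XX) and takes the Z value from Python's `statistics.NormalDist` rather than from a printed table, so it uses 1.95996 instead of the rounded 1.96. The function name is illustrative.

```python
import math
from statistics import NormalDist


def normal_interval(x, s_x, r_xx, confidence):
    """Confidence interval for the true score assuming normal errors."""
    # Step 1: two-sided critical value for the confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Step 2: standard error of measurement.
    s_e = s_x * math.sqrt(1 - r_xx)
    # Step 3: maximum measurement error.
    e_max = z * s_e
    # Step 4: interval around the empirical score.
    return x - e_max, x + e_max


lower, upper = normal_interval(x=65, s_x=7, r_xx=0.73, confidence=0.95)
print(f"{lower:.2f} <= V <= {upper:.2f}")
```

Under these assumptions the interval is roughly 57.9 to 72.1, less than half the width of the one given by Chebychev's inequality.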
37
c) Estimation based on the regression method
It is preferable to build the confidence interval not from the empirical scores (which are biased by measurement errors) but from the estimated true score.
1. Obtain the regression equation of V on X. The regression equation is useful to:
– Describe concisely the relationship between the variables.
– Predict the values of one variable from the other.
38
2. Find the Z value in the normal distribution table for a given confidence level.
3. Calculate the standard error of estimation, S_vx.
4. Calculate the maximum error of estimation.
5. Calculate the confidence interval in which we will find the true score.
39
Example
The same data as in the previous examples. Estimate the true score (in raw, differential and typical scores) of a subject who scored 65 points.
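The worked solution is not in the transcript. The sketch below assumes the usual classical-test-theory regression of V on X: V' = r_XX · (X − mean) + mean in raw scores, v' = r_XX · (X − mean) in differential (deviation) scores, Z_v' = sqrt(r_XX) · Z_x in typical (standard) scores, and S_vx = S_X · sqrt(r_XX · (1 − r_XX)). The function name and dictionary keys are illustrative.

```python
import math
from statistics import NormalDist


def regression_estimate(x, mean_x, s_x, r_xx, confidence):
    """Point and interval estimates of the true score by regression of V on X."""
    deviation = x - mean_x
    raw = r_xx * deviation + mean_x             # estimated true score, raw scale
    differential = r_xx * deviation             # deviation-score scale
    typical = math.sqrt(r_xx) * deviation / s_x  # standard-score scale
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    s_vx = s_x * math.sqrt(r_xx * (1 - r_xx))   # standard error of estimation
    e_max = z * s_vx                            # maximum error of estimation
    return {
        "raw": raw,
        "differential": differential,
        "typical": typical,
        "lower": raw - e_max,
        "upper": raw + e_max,
    }


# Data from the example: mean 52, S_X = 7, r_XX = 0.73, empirical score 65.
estimate = regression_estimate(x=65, mean_x=52, s_x=7, r_xx=0.73, confidence=0.95)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the estimated true score is 61.49 (9.49 in differential scores, about 1.59 in typical scores), with a 95% interval of roughly 55.4 to 67.6. The interval is centred on the estimated true score, not on the empirical score of 65.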