Published by Dayna Lang. Modified over 9 years ago.
1
2. Main Test Theories: The Classical Test Theory (CTT) Psychometrics. 2011/12. Group A (English)
2
Development of test theories
As tests grew in popularity, it became necessary to develop a theoretical framework and a procedure that:
– Serves as a basis for the scores that subjects obtain when a test is applied to them.
– Allows us to analyze the accuracy of the measurements obtained, that is, the extent to which the scores subjects obtain on a test are equivalent to their true scores, and how much measurement error affects them (reliability of the scores).
– Allows us to analyze the validity of the inferences or conclusions that can be drawn from those scores (validity).
3
In response to these problems, Test Theory was developed. It establishes a functional relationship between observable variables (the empirical scores that subjects obtain on tests or on their items) and unobservable variables (the true scores, or the subjects' skill level in the construct being measured).
4
To make inferences from subjects' test scores, the relationship between the construct we want to measure and the empirical scores obtained must be established through a model. Each model represents a kind of functional relationship and, through a series of assumptions, must specify the factors that influence the scores subjects obtain on tests. To the extent that the assumptions hold, the inferences made from the model will describe the properties of the test scores correctly.
5
Each model could lead to its own test theory, but the most influential have been Classical Test Theory (CTT), Item Response Theory (IRT) and Generalizability Theory (GT).
6
Classical Test Theory (CTT)
Spearman (1904, 1907, 1910, 1913)
Functional relationship between empirical or observed scores (X), true scores (V) and error scores (E).
Linear model: X = V + E
7
A subject's responses to a test at a particular time are affected by many factors that are difficult to control. As a result, the obtained (empirical) score does not coincide with the subject's true score, which must be estimated from the assumptions of the model.
8
To measure in psychology we should use reliable tools, i.e., tools that are as free of measurement error as possible. The error term includes all the random errors that affect empirical scores. They can come from several sources:
– The subject (emotional state, fatigue, stress, etc.).
– The test (its items and type of format).
– Characteristics of the test administrators.
– Environmental conditions.
– Instructions.
– Etc.
We should try to control them through the study of reliability.
9
Assumptions
1. The true score (V) is the mathematical expectation of the empirical score (X).
– If we administered the same test to a subject an infinite number of times (assuming the administrations are independent of each other, so that the subject's score in one administration does not influence the others), the mean of all the observed scores (X) would be the subject's true score.
V = E(X), where E(·) denotes the mathematical expectation, not the error score E.
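Assumption 1 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true score (25), the normally distributed error with standard deviation 3, and the function name simulate_administrations are all hypothetical choices made for the example, not part of the slides.

```python
import random
import statistics


def simulate_administrations(true_score, error_sd, n_administrations, seed=0):
    """Return observed scores X = V + E for repeated, independent administrations.

    Assumption for this sketch: the error E is drawn from a normal
    distribution with mean 0, which the slides do not specify.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]


# Hypothetical subject with true score V = 25 and error SD = 3.
scores = simulate_administrations(true_score=25.0, error_sd=3.0, n_administrations=100_000)

# With many independent administrations, the mean of X approaches V.
mean_observed = statistics.fmean(scores)
print(f"mean of observed scores: {mean_observed:.2f}")
```

Any single observed score can be several points away from 25, but the mean over many administrations settles close to the true score, which is what V = E(X) states.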
10
2. The correlation between the true scores of 'n' subjects on a test and their measurement errors is equal to 0.
– There is no relationship between measurement errors and true scores.
r_ve = 0
11
3. The correlation between the measurement errors (r_e1e2) that affect subjects' scores on two different tests (X1 and X2) is equal to 0.
– If the tests are administered correctly, there is no reason to assume that the measurement errors made in one test will influence, positively or negatively, those of the other test.
r_e1e2 = 0
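The two correlation assumptions (true scores with errors, and errors across two tests) can be checked numerically on simulated data. The sketch below assumes normally distributed true scores and errors that are generated independently, so it illustrates what the assumptions mean rather than proving them; the sample size, means, standard deviations, and the helper pearson are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


rng = random.Random(42)
n_subjects = 20_000

# Hypothetical population: true scores V and independent random errors
# for two tests (E1 for test X1, E2 for test X2).
true_scores = [rng.gauss(50.0, 10.0) for _ in range(n_subjects)]
errors_test1 = [rng.gauss(0.0, 4.0) for _ in range(n_subjects)]
errors_test2 = [rng.gauss(0.0, 4.0) for _ in range(n_subjects)]

r_ve = pearson(true_scores, errors_test1)     # assumption 2: close to 0
r_e1e2 = pearson(errors_test1, errors_test2)  # assumption 3: close to 0
print(f"r_ve = {r_ve:.3f}, r_e1e2 = {r_e1e2:.3f}")
```

In a finite sample the coefficients are not exactly 0; they only fluctuate around it, and the fluctuation shrinks as the number of subjects grows.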
12
The deductions of this model and its implications will be explored in a theoretical and practical way in chapter 3.
13
Alternative models
– Item Response Theory (IRT)
– Generalizability Theory (GT)
14
Item Response Theory (IRT)
Lord (1952, 1953)
The probability that a person gives a specific response to an item depends on the person's skill level in the construct and on the item's characteristics (difficulty, discrimination, pseudo-guessing). IRT provides a number of models that assume a functional relationship between the values of the variable that the items measure (the subjects' skill level in the measured construct) and the probability that subjects answer each item correctly.
15
This function is called the Item Characteristic Curve. For a given item, the probability that a subject answers it correctly depends on the subject's level in the variable that the item measures.
More information: Muñiz (1997). Introducción a la Teoría de Respuesta al Ítem. Madrid: Pirámide.
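One common way to write an Item Characteristic Curve is the three-parameter logistic model, which uses the three item characteristics named above (discrimination a, difficulty b, pseudo-guessing c). The slides do not say which IRT model they have in mind, so the choice of this model, the function name icc_3pl, and the parameter values below are assumptions made for illustration.

```python
import math


def icc_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: skill level of the subject in the measured construct
    a: item discrimination, b: item difficulty, c: pseudo-guessing parameter
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderate discrimination, average difficulty,
# and a 20% chance of answering correctly by guessing.
for theta in (-2.0, 0.0, 2.0):
    probability = icc_3pl(theta, a=1.2, b=0.0, c=0.2)
    print(f"theta = {theta:+.1f} -> P(correct) = {probability:.3f}")
```

The curve rises with the skill level, never drops below c, and equals c + (1 - c) / 2 when the skill level matches the item difficulty.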
16
Generalizability Theory (GT)
Cronbach, Gleser, Nanda & Rajaratnam (1972)
It is an attempt to systematize and classify measurement error according to the possible sources that cause it. It takes into account all possible sources of error (individual factors, situational factors, characteristics of the evaluator, and instrumental variables) and tries to separate them by applying classical analysis of variance (ANOVA) procedures.
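As a rough illustration of separating error sources with ANOVA, the sketch below estimates variance components for the simplest case: every person answers every item (a crossed persons × items design with one score per cell). That design, the function name variance_components, and the data are assumptions chosen for the example; the slides do not describe a specific design.

```python
def variance_components(scores):
    """Estimate variance components for a crossed persons x items design.

    scores[p][i] is the score of person p on item i (one score per cell).
    Uses the expected mean squares of a two-way ANOVA without replication.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(scores[p][i] for p in range(n_p)) / n_p for i in range(n_i)]

    ss_person = n_i * sum((m - grand) ** 2 for m in person_means)
    ss_item = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_resid = sum(
        (scores[p][i] - person_means[p] - item_means[i] + grand) ** 2
        for p in range(n_p)
        for i in range(n_i)
    )

    ms_person = ss_person / (n_p - 1)
    ms_item = ss_item / (n_i - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_i - 1))

    # Negative estimates can occur by chance; here they are set to zero,
    # which is a common convention and an assumption of this sketch.
    return {
        "person": max(0.0, (ms_person - ms_resid) / n_i),
        "item": max(0.0, (ms_item - ms_resid) / n_p),
        "residual": ms_resid,
    }


# Hypothetical data: 4 persons (rows) by 3 items (columns).
data = [
    [4.0, 5.0, 3.0],
    [2.0, 4.0, 2.0],
    [5.0, 6.0, 5.0],
    [3.0, 3.0, 1.0],
]
print(variance_components(data))
```

The "person" component reflects real differences between subjects, while the "item" and "residual" components are sources of error in this design; designs with more sources (evaluators, occasions) add further components in the same way.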