Classical Test Theory Psych 818 - DeShon


Big Picture
- To make good decisions, you must know how much error is in the data upon which the decisions are based
- Classical test theory is a model that can be used to estimate the magnitude of error present in the data (reliability)
- Based on strong assumptions

Sources of Error
- Psychologically interesting variables that matter, but were not included in the model
- Idiosyncratic Error
  - Mood, fatigue, boredom, language difficulties, attention
- Generic Error
  - Poor instructions, confusing categories or anchors, setting, variations in administration
- Additive Error
  - Acquiescent, leniency, severity response sets

Sources of Error
- Systematic errors
  - Social desirability
  - Demand characteristics
  - Experimenter expectancies
  - Halo error
  - Interviewer bias
  - Rater dispersion bias
  - Midpoint response sets

Classical Test Theory
- Single Indicator Model
  - Most common measurement model
- Two assumptions
  - Observed score is a linear combination of true score and error
  - Error is a normally distributed random variable

Classical Test Theory
- Recall the variance of a linear composite is:
- If you assume that error and true scores are independent, then
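The two results this slide refers to are standard. Written out for the single-indicator model, with X the observed score, T the true score, and E the error (the variance and covariance symbols are notation assumed here, not taken from the slide), they would read:

```latex
\begin{align}
X &= T + E \\
\sigma^2_X &= \sigma^2_T + \sigma^2_E + 2\sigma_{TE} \\
\sigma^2_X &= \sigma^2_T + \sigma^2_E \qquad \text{when } \sigma_{TE} = 0
\end{align}
```

Independence of true score and error implies a zero covariance, which is what removes the cross term in the last line.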

Classical Test Theory
- Signal to noise ratio:
- Reliability
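In the usual CTT notation (assumed here, since the slide's formulas are not in the transcript), the two ratios can be written as follows. The second equality for reliability relies on true score and error being uncorrelated, so that observed variance is the sum of the two components.

```latex
\text{signal-to-noise} = \frac{\sigma^2_T}{\sigma^2_E},
\qquad
\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X}
           = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}
```

Reliability is bounded between 0 and 1, whereas the signal-to-noise ratio is unbounded above.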

CTT – Estimating Reliability
- Can't see true score so can't form the reliability ratio directly
- Must find some way to estimate it
- Parallel Forms!!
  - Original approach to reliability estimation
  - Assume you have two exactly equivalent measures of the same latent variable (X & X')

CTT – Parallel Forms
- If two measures are parallel, then:
  - Same true scores
  - So, same true score variance
  - Same error variance
  - Therefore, same observed score variance

CTT – Parallel Forms
[Path diagram: true score t, observed scores X1 and X2, errors e1 and e2; X_i = t + e_i]

CTT – Parallel Forms
- Given these relations, the correlation between two parallel forms is an estimate of the reliability
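A small simulation can make this claim concrete. The sketch below assumes normally distributed true scores and errors and uses hypothetical variances (4.0 for true scores, 1.0 for errors); none of these numbers come from the slides.

```python
import numpy as np

def simulate_parallel_forms(n=200_000, var_true=4.0, var_error=1.0, seed=0):
    """Draw two parallel forms: same true score, independent errors of equal variance."""
    rng = np.random.default_rng(seed)
    t = rng.normal(0.0, np.sqrt(var_true), n)         # shared true scores
    x1 = t + rng.normal(0.0, np.sqrt(var_error), n)   # form X
    x2 = t + rng.normal(0.0, np.sqrt(var_error), n)   # form X'
    return x1, x2

x1, x2 = simulate_parallel_forms()

# Population reliability implied by the assumed variances: 4 / (4 + 1)
reliability = 4.0 / (4.0 + 1.0)

# Sample correlation between the two parallel forms
r_parallel = np.corrcoef(x1, x2)[0, 1]

print(f"reliability = {reliability:.3f}, r(X, X') = {r_parallel:.3f}")
```

With a large sample the correlation lands very close to the variance ratio, which is the sense in which the parallel-forms correlation estimates reliability.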

CTT – Standard Error of Measurement
- Reliability is a metric-free measure of the amount of error in a measurement system
- Standard error of measurement translates the reliability estimate into the metric of the measure

CTT – Standard Error of Measurement
- Used to place a confidence interval around a particular observed score
- Standard error of measurement and cut scores
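As an illustration, the standard error of measurement is conventionally computed as the observed-score standard deviation times the square root of one minus the reliability. The sketch below uses that textbook form with hypothetical numbers (a scale with SD 15, reliability .90, observed score 100); the function names are made up for this example.

```python
import math

def standard_error_of_measurement(sd_observed, reliability):
    """SEM = SD_X * sqrt(1 - reliability), in the metric of the measure."""
    return sd_observed * math.sqrt(1.0 - reliability)

def score_interval(observed, sd_observed, reliability, z=1.96):
    """Approximate interval around one observed score (z = 1.96 for about 95%)."""
    sem = standard_error_of_measurement(sd_observed, reliability)
    return observed - z * sem, observed + z * sem

# Hypothetical values chosen only for illustration
sem = standard_error_of_measurement(sd_observed=15.0, reliability=0.90)
low, high = score_interval(observed=100.0, sd_observed=15.0, reliability=0.90)

print(f"SEM = {sem:.2f}, interval = ({low:.1f}, {high:.1f})")
```

For cut scores, the same interval shows how far an observed score must sit from the cut before the decision is reasonably safe from measurement error.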

CTT – Multiple Indicator Model
- Also called common factor model
  - Spearman (1906)
- CTT defines a person's true score as the mean response to an infinite set of observations
  - Operational definition
- Therefore, the best way to get close to the person's true score (less error!) is to obtain responses to a large # of equivalent measures or indicators

CTT – Multiple Indicator Model
- Now, reliability of a linear composite of indicators instead of a correlation between two indicators
- X_j = an examinee's test score on the jth indicator
- F = the examinee's standardized true score (t)
- e_j = the examinee's random error on the jth indicator
- μ_j = the item mean (e.g., difficulty)
- λ_j = factor loading, item sensitivity, item discrimination
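The symbols defined on this slide fit the standard single common factor equation. Assuming that is the model intended, it can be written as:

```latex
X_j = \mu_j + \lambda_j F + e_j , \qquad j = 1, \dots, m
```

Here m is the number of indicators, and F has mean 0 and variance 1 because it is standardized.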

A very simple measurement model...
[Path diagram: unobserved true score t; observed indicators X1, X2, X3, ..., Xk; unobserved errors e1, e2, e3, ..., ek]
Key assumptions
- r_{t,e} = 0
- r_{e_i e_j} = 0
- r_{x_i x_j . t} = 0

CTT – Multiple Indicators
- This represents a set of simple regressions
  - One for each indicator
  - Regresses the observed scores onto the latent scores
- Only difficulty is that the latent scores aren't observable
- BUT! We can observe the effects of the latent scores (assuming a homogeneous model)

CTT – Multiple Indicators
- Some neat consequences...
- Covariance of scores on any 2 indicators is:
- Variance of the jth indicator
  - Variance due to true and error
- So, if we know the parameters we can compute the variances and covariances
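Under a single common factor with unit variance and mutually uncorrelated errors, the implied moments take the standard form below. The symbol ψ²_j for the error variance of indicator j is an assumption of this sketch; the slide's own notation is not in the transcript.

```latex
\begin{align}
\sigma_{jk} &= \lambda_j \lambda_k \qquad (j \neq k) \\
\sigma_{jj} &= \lambda_j^2 + \psi_j^2
\end{align}
```

The first term of the variance is the part due to the true score and the second is the part due to error.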

CTT – Multiple Indicators
- Can also reverse this process
- If we know variances and covariances, we can compute the parameter estimates
- Factor loadings may be obtained from the covariance of any three items

CTT – Multiple Indicators
- Can also reverse this process
- Error variances (aka uniquenesses) may be obtained via subtraction as:
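For a one-factor model with a standardized factor and uncorrelated errors, the reverse step follows from the fact that the covariance of two different items is the product of their loadings. A sketch in assumed notation (σ for observed variances and covariances, ψ²_j for the error variance of item j), using any three distinct items j, k, l:

```latex
\begin{align}
\lambda_j^2 &= \frac{\sigma_{jk}\,\sigma_{jl}}{\sigma_{kl}}
  = \frac{(\lambda_j\lambda_k)(\lambda_j\lambda_l)}{\lambda_k\lambda_l} \\
\psi_j^2 &= \sigma_{jj} - \lambda_j^2
\end{align}
```

The sign of the loading is not determined by the squared value; it is usually fixed by choosing the direction of the factor.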

CTT – Multiple Indicators
- Take 5 items with the following parameters

CTT – Multiple Indicators
- Imply the following covariance matrix

CTT – Multiple Indicators
- You can compute the parameters from the variances and covariances
- You can compute the variance and covariances from the parameters
- Ex: covariance of items 3 and 5
- Ex: factor loading for item 4
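The round trip between parameters and covariances can be sketched numerically. The slide's actual parameter values are not in the transcript, so the loadings and error variances below are hypothetical, chosen only so that each item has unit variance.

```python
import numpy as np

# Hypothetical one-factor parameters for 5 items (NOT the slide's values)
loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
uniquenesses = np.array([0.19, 0.36, 0.51, 0.64, 0.75])  # error variances

# Parameters -> covariance matrix: lambda * lambda' off the diagonal,
# lambda^2 + error variance on the diagonal
sigma = np.outer(loadings, loadings) + np.diag(uniquenesses)

# Covariance of items 3 and 5 (zero-based indices 2 and 4)
cov_3_5 = sigma[2, 4]

# Covariances -> parameters: loading of item 4 from its covariances
# with items 1 and 2, divided by the covariance of items 1 and 2
loading_4 = np.sqrt(sigma[3, 0] * sigma[3, 1] / sigma[0, 1])

# Error variance of item 4 by subtraction
uniqueness_4 = sigma[3, 3] - loading_4 ** 2

print(cov_3_5, loading_4, uniqueness_4)
```

Any other pair of items could replace items 1 and 2 in the loading calculation; with exact population values every choice returns the same loading.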

CTT – Multiple Indicators
- This is an ideal example
  - Model fits perfectly
  - No sampling error – we know the parameters
- In reality, all the various ways of estimating the parameters yield slightly different estimates
  - Need a way to cope with this (discrepancy function)
- Provides a conceptual intro to model identification
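The point about competing estimates can be seen in a simulation. The sketch below draws data from a one-factor model with hypothetical loadings and estimates the first item's loading from every available pair of other items; the sample size and seed are arbitrary choices for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population loadings; items are scaled to unit variance
loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
n = 20_000

factor = rng.normal(size=n)                                    # standardized latent scores
errors = rng.normal(size=(n, 5)) * np.sqrt(1.0 - loadings**2)  # uncorrelated errors
items = factor[:, None] * loadings + errors

s = np.cov(items, rowvar=False)  # sample covariance matrix

# Each pair (k, l) of the other four items yields one estimate of item 1's loading
estimates = [
    np.sqrt(s[0, k] * s[0, l] / s[k, l])
    for k, l in itertools.combinations(range(1, 5), 2)
]

for (k, l), est in zip(itertools.combinations(range(1, 5), 2), estimates):
    print(f"items {k + 1} and {l + 1}: loading estimate = {est:.3f}")
```

The six estimates cluster around the population value of 0.8 but do not agree exactly. A discrepancy function resolves this by choosing the single set of parameters whose implied covariance matrix is closest to the sample matrix overall.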

CTT – Multiple Indicators
- Parallel Indicators
  - Equal lambdas (loadings)
  - Equal error variances
- Tau-Equivalent Indicators
  - Equal lambdas
- Congeneric Indicators
  - Nothing needs to be equal
- Can compare model fit to determine

CTT – Multiple Indicators
- Reliability of a homogeneous set of indicators
- Reliability is the ratio of true score variance to total variance
- Can be estimated directly from the common factor model

CTT – Multiple Indicators
- Omega is:
  - Ratio of variance due to the common attribute to the total variance in X
  - Square of the correlation between X (observed score) and the common factor
  - Correlation between two parallel test scores
  - Square of the correlation between the total score of m indicators and the total score of an infinite set of indicators
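For a unit-weighted total score X = X_1 + ... + X_m under a one-factor model with a standardized factor and uncorrelated errors, the first interpretation corresponds to the usual expression for coefficient omega. The symbol ψ²_j for error variance is assumed notation:

```latex
\omega = \frac{\left(\sum_{j=1}^{m} \lambda_j\right)^{2}}
              {\left(\sum_{j=1}^{m} \lambda_j\right)^{2} + \sum_{j=1}^{m} \psi_j^{2}}
```

The numerator is the variance in X due to the common factor and the denominator is the total variance of X.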

CTT – Multiple Indicators
- Coefficient alpha
- If the tau-equivalent model holds!
- This quantity can be estimated directly from the variance covariance matrix
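The familiar form of coefficient alpha uses only the item variances and the variance of the total score, which is why it can be read off the variance-covariance matrix. In assumed notation, with m items, item variances σ_jj, and total-score variance σ²_X (the sum of every element of the matrix):

```latex
\alpha = \frac{m}{m-1}\left(1 - \frac{\sum_{j=1}^{m} \sigma_{jj}}{\sigma^2_X}\right)
```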

CTT – Multiple Indicators
- If the tau-equivalent model holds...
- If not tau-equivalent, alpha will be less than omega
- Notice m! – leads to the Spearman-Brown
  - The number of indicators
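These relations can be checked on population covariance matrices built from hypothetical parameters. The sketch below uses the standard formulas for omega, alpha, and the Spearman-Brown projection; the loadings and error variances are invented for illustration and the function names are not from the slides.

```python
import numpy as np

def implied_cov(loadings, uniquenesses):
    """One-factor implied covariance matrix (standardized factor, uncorrelated errors)."""
    lam = np.asarray(loadings, dtype=float)
    return np.outer(lam, lam) + np.diag(uniquenesses)

def omega(loadings, uniquenesses):
    """Common-factor variance of the total score over its total variance."""
    common = np.sum(loadings) ** 2
    return common / (common + np.sum(uniquenesses))

def alpha(cov):
    """Coefficient alpha from a variance-covariance matrix."""
    m = cov.shape[0]
    return m / (m - 1) * (1.0 - np.trace(cov) / cov.sum())

def spearman_brown(rel_one, m):
    """Projected reliability of m parallel indicators from the reliability of one."""
    return m * rel_one / (1.0 + (m - 1) * rel_one)

psi = [0.3, 0.4, 0.5, 0.6]          # hypothetical error variances

tau_eq = [0.7, 0.7, 0.7, 0.7]       # equal loadings: tau-equivalent
alpha_tau = alpha(implied_cov(tau_eq, psi))
omega_tau = omega(tau_eq, psi)

congeneric = [0.9, 0.7, 0.5, 0.3]   # unequal loadings: congeneric
alpha_con = alpha(implied_cov(congeneric, psi))
omega_con = omega(congeneric, psi)

# Parallel items (equal loadings and error variances): Spearman-Brown matches omega
rel_one = 0.49 / (0.49 + 0.51)
sb_four = spearman_brown(rel_one, 4)
omega_parallel = omega([0.7] * 4, [0.51] * 4)

print(alpha_tau, omega_tau, alpha_con, omega_con, sb_four, omega_parallel)
```

Alpha equals omega when the loadings are equal and falls below it when they are not. The Spearman-Brown value shows how reliability grows with the number of indicators m.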

CTT – Multiple Indicators
- Standards for alpha
  - .7 has become the standard
  - Nunnally was the origin
    - .7 for research
    - .9 for application
- Why? The standard error of measurement (SEM) is huge for .7
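To see why the standard error of measurement matters here, it helps to express it in standard deviation units, where it reduces to the square root of one minus the reliability. The sketch below compares the two thresholds; the 1.96 multiplier for an approximate 95% interval is an assumption of this example.

```python
import math

# SEM in standard deviation units, and the half-width of an approximate 95% interval
sem_by_reliability = {}
for reliability in (0.7, 0.9):
    sem_sd = math.sqrt(1.0 - reliability)
    sem_by_reliability[reliability] = sem_sd
    print(
        f"reliability {reliability:.1f}: SEM = {sem_sd:.2f} SD, "
        f"95% half-width = {1.96 * sem_sd:.2f} SD"
    )
```

At a reliability of .7 the interval around an individual score spans more than one standard deviation in each direction, which is hard to defend for decisions about individuals. At .9 the half-width drops to roughly 0.6 standard deviations.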