Reliability & Validity
Reliability and validity limit all inferences that can be drawn from later tests. If the scale is reliable and valid, we can have confidence in the findings. If the scale is unreliable or invalid, we need to be very cautious.
[Diagram: Item 1, Item 2, and Item 3 measuring a CONSTRUCT, shown alongside related measures & outcomes and unrelated measures & outcomes]
Correlation: captures how the value of one variable changes when the value of the other changes. Ranges from -1 to +1. A Pearson correlation is based on continuous variables. It is important to remember that this is a relationship for a group, not for each person or item. The squared correlation, \(r^2\), reflects the proportion of variability shared by the two variables.
Correlations (Pearson Correlation; Sig. (2-tailed) = .000 and N = 105 in each row)

          test 1    test 2    test 3
test 1    1         .555**    .364**
test 2    .555**    1         .613**
test 3    .364**    .613**    1

**. Correlation is significant at the 0.01 level (2-tailed).
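\[
r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}
\]
where \(r_{xy}\) = correlation coefficient between X and Y, \(n\) = size of sample, \(X\) = score on the X variable, \(Y\) = score on the Y variable.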
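As a rough illustration of the raw-score formula, the sketch below computes each sum separately and then combines them. The function name `pearson_r` and the two score lists are hypothetical, chosen only for illustration; they are not data from these slides.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from the raw-score (computational) formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two scores")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))  # sum of cross-products
    sum_x2 = sum(a * a for a in x)             # sum of squared X scores
    sum_y2 = sum(b * b for b in y)             # sum of squared Y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return numerator / denominator


# Hypothetical scores for five people on two tests (illustrative only).
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
r = pearson_r(test_a, test_b)
print(round(r, 3))
```

Statistical packages report the same quantity; writing the sums out by hand mainly helps show where each term of the formula comes from.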
Strength of a correlation (absolute value):
.80 to 1.0   very strong
.60 to .80   strong
.40 to .60   moderate
.20 to .40   weak
.00 to .20   weak/none
Relationships of .70 or stronger are generally considered acceptable in reliability analyses.
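The bands can be applied mechanically, as in the sketch below. The function names are hypothetical, and because the bands share their endpoints (.80 appears in two of them), the code assumes that a boundary value belongs to the stronger band. The sign is ignored, since the labels describe strength rather than direction.

```python
def describe_strength(r):
    """Label the size of a correlation using the bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    size = abs(r)  # strength ignores the direction of the relationship
    # Assumption: a value on a shared boundary goes to the stronger band.
    if size >= 0.80:
        return "very strong"
    if size >= 0.60:
        return "strong"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "weak/none"


def acceptable_for_reliability(r):
    """Apply the .70 rule of thumb for reliability analyses."""
    return r >= 0.70


for value in (0.555, 0.364, 0.613):
    print(value, describe_strength(value), acceptable_for_reliability(value))
```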
Reliability: the extent to which a scale measures a construct consistently. Any measurement is an observed score, made up of a true score plus error. Reliability = true score / (true score + error). Less error means the observed score is closer to the true score (more reliable). We never know the “true score”.
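A small numeric sketch of the ratio may help. It assumes the classical test theory reading in which the ratio is taken over variances (true-score variance divided by true-score plus error variance). The numbers are purely hypothetical, since the true score is never observed in practice, and the function name is illustrative.

```python
def reliability(true_variance, error_variance):
    """Share of observed-score variance that is true-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("observed variance must be greater than zero")
    return true_variance / observed_variance


# Hypothetical values: same true-score variance, different amounts of error.
low_error = reliability(true_variance=8.0, error_variance=2.0)
high_error = reliability(true_variance=8.0, error_variance=8.0)
print(low_error, high_error)
```

With the true-score variance held fixed, the scale with less error yields the higher ratio, which is the point made above.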
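Test-retest reliability: extent to which a test gives consistent scores over time. Correlate each person's scores from the two time points.
◦ Items should relate positively across the two time points
*Sometimes you expect the scores to be different, such as when the construct itself changes over time.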
Parallel-forms reliability: extent to which two forms of a test are equivalent. Calculate the correlation between scores on the two forms of the test.
Internal consistency: extent to which items are consistent with one another and represent one dimension. Assessed by correlating individual item scores with the total score, and by estimating the correlations among the items. It is important that all items use the same response scale and are scored in the same direction. Summarized by Cronbach’s alpha (α).
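\[
\alpha = \left(\frac{k}{k-1}\right)\left(\frac{s_y^2 - \sum s_i^2}{s_y^2}\right)
\]
where \(k\) = number of items, \(s_y^2\) = variance associated with the observed (total) score, \(\sum s_i^2\) = sum of the variances of the individual items.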
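The alpha formula can be sketched in a few lines. The function name and the item scores below are hypothetical, and the sketch assumes every item has already been scored in the same direction. It uses the sample variance from Python's `statistics` module; because the variances enter the formula only as a ratio, using the population variance throughout would give the same alpha.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    `items` is a list of k lists, one per item, each holding the scores
    of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("every item needs scores from the same 2+ people")
    # Observed total score for each respondent (sum across the k items).
    totals = [sum(person_scores) for person_scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    sum_item_variances = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (total_variance - sum_item_variances) / total_variance


# Hypothetical responses from five people to three items (illustrative only).
item_1 = [4, 3, 5, 2, 4]
item_2 = [5, 3, 4, 2, 4]
item_3 = [4, 2, 5, 1, 3]
alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 2))
```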
[SPSS output: a Reliability Statistics table (Cronbach's Alpha, Cronbach's Alpha Based on Standardized Items, N of Items) and an Inter-Item Correlation Matrix for the three items below; the numeric values were not preserved.]
- In uncertain times, I usually expect the best.
- I’m always optimistic about my future.
- If something can go wrong for me, it will.
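The third item is worded in the opposite direction from the first two, so under the requirement that all items run in the same direction it would normally be reverse-scored before alpha is computed. The sketch below shows one common way to do that. The 1 to 5 response scale and the function name are assumptions made for illustration; the slides do not state the response format.

```python
def reverse_score(score, scale_min=1, scale_max=5):
    """Flip a response so that high and low ends of the scale swap places."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the response scale")
    return scale_min + scale_max - score


# Hypothetical responses to the negatively worded item (illustrative only).
raw_responses = [1, 2, 4, 5, 3]
recoded = [reverse_score(response) for response in raw_responses]
print(recoded)
```

The midpoint of the scale is unchanged by the recoding, and applying the function twice returns the original response.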
Inter-rater reliability: agreement between two raters.
\[
ir = \frac{\text{number of agreements}}{\text{number of possible agreements}}
\]
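A minimal sketch of the agreement ratio follows, assuming each rater codes the same set of observations so that the number of possible agreements equals the number of observations. The function name and the codes are hypothetical.

```python
def interrater_agreement(rater_a, rater_b):
    """Proportion of observations on which two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    possible_agreements = len(rater_a)
    return agreements / possible_agreements


# Hypothetical codes from two raters for ten observations (illustrative only).
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes", "yes"]
ir = interrater_agreement(rater_a, rater_b)
print(ir)
```

This simple ratio does not correct for agreement that would occur by chance.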
Validity: the extent to which the scale measures what it is intended to measure. A scale can be reliable without being valid.
Content validity: the items sample the universe of items for a construct. Can ask an expert (or several) whether the items seem representative.
Criterion validity: the scale relates to other measures or behaviors in ways that would be expected.
◦ Concurrent: relates to measures taken at the same time
◦ Predictive: predicts later scores
Construct validity: the scale measures the underlying construct as intended, as shown by its relation to the behaviors that the construct represents.