Validity
Does the test measure what it says it does?
Is the test useful?
Can a test be reliable but not valid?
Can a test be valid but not reliable?
Types of validity
Face validity
–Important only insofar as it doesn’t interfere with an examinee’s willingness to cooperate.
Content validity
–How well does the test cover the areas of content that it should?
–How adequately does it sample the universe of behavior it was designed to assess?
Content validity (cont.)
Panel of “experts”
–Is the item/content essential?
–Lawshe (1975): >50% of experts see the skill as essential
Important for:
–Achievement/classroom tests
–Training program exams
–Professional exams
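A minimal sketch of how the Lawshe (1975) panel judgment can be scored, assuming the content validity ratio CVR = (n_e - N/2) / (N/2), where n_e is the number of experts who rate an item essential and N is the panel size. The function name and the panel data are hypothetical. A CVR above 0 corresponds to more than half of the panel rating the item essential.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: ranges from -1 (no expert says essential) to +1 (all do)."""
    if n_experts <= 0:
        raise ValueError("n_experts must be positive")
    if not 0 <= n_essential <= n_experts:
        raise ValueError("n_essential must be between 0 and n_experts")
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts; values are counts of "essential" ratings.
essential_counts = {"item_a": 9, "item_b": 5, "item_c": 3}
cvr = {item: content_validity_ratio(n, 10) for item, n in essential_counts.items()}

# Items with CVR > 0 were rated essential by more than half of the panel.
retained = [item for item, value in cvr.items() if value > 0]
```

Lawshe also published minimum CVR values that depend on panel size; those are not reproduced here.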
Criterion-Related Validity
How well does a test score relate to another score/variable of interest?
–Correlate test with criterion (the standard against which the test is evaluated)
Concurrent
Predictive
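Correlating the test with the criterion can be sketched as a Pearson correlation, often called the validity coefficient. The scores below are made up for illustration, and the function is a plain hand-written version rather than any particular library's API.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between paired test scores and criterion scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: five examinees' test scores and criterion ratings.
test_scores = [1, 2, 3, 4, 5]
criterion = [2, 4, 5, 4, 5]
validity_coefficient = pearson_r(test_scores, criterion)
```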
Criterion-Related Validity (cont.)
Criterion should be
–Reliable: reliability limits validity; a measure can’t be valid if it is not reliable.
–Relevant
–Valid
–Uncontaminated: contamination occurs when the criterion measure has been based in part on the predictor measure
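The idea that reliability limits validity can be made concrete with the classical test theory ceiling: the correlation between test and criterion cannot exceed the square root of the product of their reliabilities. A small sketch, with hypothetical reliability values and a hypothetical function name:

```python
from math import sqrt


def max_validity(rel_test: float, rel_criterion: float) -> float:
    """Upper bound on the test-criterion correlation given both reliabilities."""
    for rel in (rel_test, rel_criterion):
        if not 0 <= rel <= 1:
            raise ValueError("reliabilities must lie between 0 and 1")
    return sqrt(rel_test * rel_criterion)


# Hypothetical reliabilities: test = .81, criterion = .64.
ceiling = max_validity(0.81, 0.64)

# An unreliable criterion caps validity no matter how good the test is.
ceiling_poor_criterion = max_validity(0.81, 0.25)
```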
Criterion-Related Validity (cont.)
Concurrent validity
–Criterion immediately available
–Present standing on a criterion (e.g., diagnosis, score on another test)
–Used to predict the performance of new test takers or of people for whom the criterion isn’t available.
Criterion-Related Validity (cont.)
Predictive validity
–Test given, criterion measured later
–Ex.: ACT & college GPA; employment test & job performance
Incremental validity
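Incremental validity is commonly read as how much a new predictor improves prediction beyond what an existing predictor already provides. A minimal sketch under that reading, using the standard two-predictor formula for R² written in terms of correlations. The correlation values and function names are hypothetical.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 for a criterion regressed on two predictors, from their correlations."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_validity(r_y1: float, r_y2: float, r_12: float) -> float:
    """Gain in R^2 from adding predictor 2 to a model that already has predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2


# Hypothetical correlations: existing predictor with criterion = .50,
# new test with criterion = .40, and the two predictors with each other = .30.
gain = incremental_validity(0.5, 0.4, 0.3)

# If the new test is uncorrelated with the existing predictor, it adds its full r^2.
gain_uncorrelated = incremental_validity(0.5, 0.4, 0.0)
```

The more the new test overlaps with the predictor already in use, the less it adds, even when its own correlation with the criterion is respectable.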
Base Rate & Decision Theory
Base rate: proportion of a population who possess a certain trait, characteristic, or attribute
–% of EIU undergrads who graduate
–% of African Americans with sickle cell anemia
Base rate affects usefulness of tests
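One way to see why base rate affects usefulness is a sketch using Bayes' rule. It assumes the test's accuracy is summarized by sensitivity (proportion of people with the attribute who test positive) and specificity (proportion without it who test negative). Those two terms and all numbers below are assumptions for illustration.

```python
def positive_predictive_value(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Probability that a person who tests positive actually has the attribute."""
    for p in (base_rate, sensitivity, specificity):
        if not 0 <= p <= 1:
            raise ValueError("all inputs must be proportions between 0 and 1")
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Same hypothetical test (90% sensitivity, 90% specificity), two base rates.
ppv_common = positive_predictive_value(0.50, 0.90, 0.90)
ppv_rare = positive_predictive_value(0.01, 0.90, 0.90)
```

With a 50% base rate, 90% of positives are correct. With a 1% base rate, the same test is right for only about 8% of positives.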
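Decision Theory
4 outcomes
–Valid acceptances/positives: accepted by the test and successful on the criterion
–False acceptances/positives: accepted by the test but unsuccessful on the criterion
–Valid rejections/negatives: rejected by the test and would have been unsuccessful on the criterion
–False rejections/negatives: rejected by the test but would have been successful on the criterion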
Cut scores & Hit rates
[Figure: four quadrants labeled False rejections/negatives, Valid acceptances/positives, Valid rejections/negatives, False acceptances/positives]
Cut scores & Hit rates (cont.)
Trade-off between # of false rejections and # of false acceptances: moving the cut score to reduce one increases the other
Which is more acceptable: to limit the number accepted who shouldn’t be, or to minimize the # rejected who could be successful?
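The trade-off can be sketched by tallying the four decision outcomes at two different cut scores. The scores, the success flags, and the function names are hypothetical. "Hit rate" is taken here to mean the proportion of correct decisions (valid acceptances plus valid rejections).

```python
def decision_outcomes(scores, succeeded, cut):
    """Count the four decision outcomes when examinees scoring >= cut are accepted."""
    counts = {
        "valid_acceptance": 0,
        "false_acceptance": 0,
        "valid_rejection": 0,
        "false_rejection": 0,
    }
    for score, success in zip(scores, succeeded):
        accepted = score >= cut
        if accepted and success:
            counts["valid_acceptance"] += 1
        elif accepted and not success:
            counts["false_acceptance"] += 1
        elif not accepted and not success:
            counts["valid_rejection"] += 1
        else:
            counts["false_rejection"] += 1
    return counts


def hit_rate(counts):
    """Proportion of decisions that were correct."""
    total = sum(counts.values())
    return (counts["valid_acceptance"] + counts["valid_rejection"]) / total


# Hypothetical test scores and whether each person later succeeded on the criterion.
scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
succeeded = [False, False, True, False, False, True, True, False, True, True]

low_cut = decision_outcomes(scores, succeeded, cut=50)
high_cut = decision_outcomes(scores, succeeded, cut=70)
```

In this made-up sample, raising the cut from 50 to 70 drops false acceptances from 3 to 1 but raises false rejections from 0 to 2, while the overall hit rate stays at 70%. Which cut is preferable depends on which error is costlier.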
Construct Validity
Construct:
–Scientific idea hypothesized to explain behavior
–Postulated attribute of people, assumed to be reflected in test scores
–Ex.: intelligence, self-esteem, motivation
Construct validity: Does the test measure the construct?
–Gives theoretical meaning to scores
–Subsumes all other types of validity
Construct Validity (cont.)
Convergent evidence/validity
Divergent/discriminant evidence
Factor analysis
–Data reduction/simplification of complex correlational matrices … to reveal major dimensions that underlie a set of items
–A factor is considered to be the construct that best represents relationships among variables
Factor Analysis (cont.)
Methods of factor analysis
–Exploratory
1. Correlation matrix
2. Factor matrix with loadings
3. Label factors
Used to develop or eliminate items or scales from composite scores
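The first two exploratory steps can be sketched with a simplified stand-in: principal-component extraction from a correlation matrix by power iteration. This is not a full common-factor analysis; rotation and communality estimation are omitted. The correlation matrix and function names are hypothetical.

```python
from math import sqrt


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    vector = [float(i + 1) for i in range(len(matrix))]  # asymmetric start vector
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = sqrt(sum(x * x for x in product))
        if norm == 0:
            raise ValueError("start vector collapsed to zero; matrix may be all zeros")
        vector = [x / norm for x in product]
    eigenvalue = sum(a * b for a, b in zip(vector, mat_vec(matrix, vector)))
    if vector[0] < 0:  # fix the arbitrary sign so the first loading is positive
        vector = [-x for x in vector]
    return eigenvalue, vector


def principal_loadings(corr, n_factors):
    """Step 2: unrotated loading matrix (one row per variable, one column per factor)."""
    n = len(corr)
    residual = [row[:] for row in corr]
    columns = []
    for _ in range(n_factors):
        eigenvalue, vector = top_eigenpair(residual)
        scale = sqrt(max(eigenvalue, 0.0))
        columns.append([x * scale for x in vector])
        # Deflate: remove the extracted component before finding the next one.
        residual = [
            [residual[i][j] - eigenvalue * vector[i] * vector[j] for j in range(n)]
            for i in range(n)
        ]
    return [list(row) for row in zip(*columns)]


# Step 1: hypothetical correlation matrix; items 1-2 and items 3-4 form two clusters.
corr = [
    [1.0, 0.6, 0.2, 0.2],
    [0.6, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
]
loadings = principal_loadings(corr, n_factors=2)
```

In this example every item loads about .71 on the first component, and the second component separates items 1-2 from items 3-4 with loadings of about .55 in opposite directions. Step 3, labeling the factors, is a judgment about item content that the code cannot supply.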
Factor Analysis (cont.)
Confirmatory factor analysis
–Goodness of fit
–After test has been developed
Validity & Bias
Bias: a factor inherent within a test that systematically prevents accurate, impartial measurement
–Bias implies systematic, not random, variation
Can you make equally valid predictions for different groups?
Bias in Predictions
Questions of regression
–Slope
–Intercept
–Error of estimate
Slope Bias
Intercept Bias
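A minimal sketch of how slope and intercept bias can be inspected: fit a separate regression line of criterion on test score for each group and compare the slope, the intercept, and the standard error of estimate. The three groups and their data are invented so that the differences are easy to see. A formal analysis would also test whether the differences are statistically significant.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus standard error of estimate."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("test scores must vary")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, sqrt(sse / (n - 2))


# Hypothetical test scores (same for every group) and criterion scores per group.
test_scores = [1, 2, 3, 4]
criterion_by_group = {
    "reference": [2.5, 3.5, 5.5, 8.5],
    "group_b": [3.5, 4.5, 6.5, 9.5],  # same slope, higher intercept
    "group_c": [1.5, 1.5, 2.5, 4.5],  # same intercept, flatter slope
}
fits = {name: fit_line(test_scores, y) for name, y in criterion_by_group.items()}
```

Here group_b differs from the reference group only in intercept, so one common line would under-predict its criterion scores at every test score. group_c differs only in slope, so the size of the prediction error would change with the test score.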
Rating error
–Leniency error
–Severity error
–Central tendency error
–Halo effect
Test Fairness
Is the test used in an impartial, just, and equitable manner?
Good tests discriminate among individuals
–Are group differences due to inadequate tests?
–Is the test being used fairly?