Chapter 9 Correlation, Validity and Reliability
Nature of Correlation
– Association: an attempt to describe or understand
– Not causal
  – However, many people will use terms such as “predictor”
Correlation
– In its simplest form, an association between two variables: Variable X and Variable Y
– Often X = predictor variable and Y = criterion variable
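As an illustration of the association between a predictor X and a criterion Y, here is a minimal sketch of the Pearson correlation coefficient. The function name `pearson_r` and the data are hypothetical and not taken from the slides; the formula is the standard sum of cross-products divided by the square root of the product of the sums of squares.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable has no variability
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x = predictor scores, y = criterion scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

Swapping the two arguments gives the same value, which matches the point that the predictor and criterion labels describe how the variables are used, not which one causes the other.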
Predictor/Criterion
– Height and shoe size: r = .73
– Height = predictor
– Shoe size = criterion
– Could very well be reversed: the labels are explanatory
Predictor/Criterion
– Predictor = high school GPA; Criterion = college GPA
– Predictor = belief about fixed intelligence; Criterion = amount of study time per week
– Predictor = amount of time reading at home; Criterion = grades in Literacy in 8th grade
Coefficient of Determination
– Indicated by r² (r-squared)
– Indicates the amount of variance explained, or accounted for, by the relationship between the variables
– Quick and dirty method of understanding the strength of the relationship
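A small worked example of the coefficient of determination, assuming the .73 quoted for height and shoe size is a correlation coefficient. The function name is hypothetical.

```python
def coefficient_of_determination(r):
    """Square a correlation coefficient to get the share of variance explained."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r ** 2

r = 0.73  # assumed to be the height / shoe size correlation
r_squared = coefficient_of_determination(r)
print(f"r = {r:.2f}, r-squared = {r_squared:.2f}")
```

Under that assumption, roughly 53% of the variance in one variable is accounted for by its relationship with the other, and about 47% is left unexplained.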
Common Uses in Education
– Validity (e.g., criterion-related: predictive and concurrent)
– Reliability of instruments
– Inter-rater reliability
Validity
– How well can you defend the measure?
  – Face validity
  – Content validity
  – Criterion-related validity
  – Construct validity
Face Validity
– Does the instrument look valid?
  – On a survey or questionnaire, the questions seem to be relevant
  – On a checklist, the behaviors seem relevant
  – For a performance test, the task seems to be appropriate
Content Validity
– The content of the test (the measure) is relevant to the behavior or construct being measured
– An expert or a panel of experts judges the content
Criterion-Related Validity
– Using another, independent measure to validate a test
  – Typically by computing a correlation
– Two types
  – Predictive validity
  – Concurrent validity
Criterion-Related
– Predictive: ACT achievement test correlated with college GPA
– Concurrent: Coopersmith Self-esteem Scale correlated with teacher’s ratings of self-esteem
Construct Validity
– Construct: an attempt to describe or name an intangible variable
– Use many different measures to validate a measure
– Self-esteem = construct
  – Instrument = measure
Construct Validity
– Self-esteem = construct
  – Instrument = measure, e.g., Coopersmith
  – Correlate it with:
    – Behavioral checklist
    – Teacher’s comments
    – Another accepted instrument for self-esteem
    – A measure of confidence
    – Locus of control measure
Reliability
– For an instrument: consistency of scores from use to use
– Types of reliability coefficients
  – Test-retest
  – Equivalent forms
  – Internal consistency
    – Split-half
    – Alpha coefficient (Cronbach’s alpha)
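To make the internal-consistency idea concrete, here is a minimal sketch of Cronbach’s alpha using the standard formula: k / (k − 1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical, and sample variances are assumed throughout.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(resp) for resp in item_scores])
    # Raises ZeroDivisionError if every respondent has the same total
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents, 3 items on a 1-5 scale
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
```

Test-retest and equivalent-forms reliability are usually reported differently: as the correlation between two sets of scores from the same people.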
Reliability Coefficient
– Value ranges from 0 to 1.00
– … considered the minimal acceptable
– .90 is very good
– .60 is sometimes acceptable but is really not very good
– Lower than .60 is definitely unacceptable
Reliable but is it Valid? Valid but is it Reliable?
– Invalid and Unreliable
  – No confidence you’ll get near the target; you have no idea where it’s going to shoot.
Reliable but is it Valid? Valid but is it Reliable?
– Invalid but Reliable
  – No confidence you’ll get near the target, but you know where it’s going to shoot (just not at the target!).
Reliable but is it Valid? Valid but is it Reliable?
– Valid but Unreliable
  – Confidence that when you hit something, it’s what you want, but you can’t depend upon consistency.
Reliable but is it Valid? Valid but is it Reliable?
– Valid and Reliable
  – Confident that when you hit a target, it’s what you want, and you can depend upon consistent shots.
Inter-rater Reliability
– Example: two teachers reading the same essays and scoring them in a similar manner, consistently
– Using the same checklist to make observations
– Can be expressed as a coefficient
– Often expressed as a percentage of agreement
– A function of training, objectivity, and the rubric or checklist, i.e., the operational definition!
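The percentage of agreement can be sketched as the share of essays on which two raters gave exactly the same score. The function name and the ratings are hypothetical, and exact-match agreement is only one possible definition (some rubrics also count adjacent scores as agreement).

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of cases on which two raters gave identical scores."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return 100.0 * matches / len(ratings_a)

# Hypothetical rubric scores (1-5) from two teachers on the same 10 essays
teacher_1 = [4, 3, 5, 2, 4, 3, 5, 1, 2, 4]
teacher_2 = [4, 3, 4, 2, 4, 3, 5, 2, 2, 4]
agreement = percent_agreement(teacher_1, teacher_2)
print(f"Agreement: {agreement:.0f}%")
```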
Norm-referenced tests
– Comparison of an individual score to others’ scores
– Intelligence tests
– ISAT, Iowa Basic Skills Test
– SAT aptitude test
– Personality tests
– Percentiles: derived scores
– Grading on a curve
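A percentile is a derived score because it comes from comparing a raw score with a norm group. The sketch below assumes one common definition, the percentage of the norm group scoring strictly below the given score; other definitions treat ties differently. The function name and the norm-group data are hypothetical.

```python
def percentile_rank(score, norm_group):
    """Percentage of norm-group scores that fall strictly below the given score."""
    if not norm_group:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)

# Hypothetical norm group of 10 raw scores
norm_group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(85, norm_group)
print(f"A raw score of 85 is at percentile rank {rank:.0f}")
```

The same raw score would earn a different percentile rank against a different norm group, which is what distinguishes this from comparing the score with a fixed benchmark.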
Criterion-referenced tests
– Individual score is compared to a benchmark (a criterion)
– If the raw score is used (no conversion): C-R test
– Mastery of material
– Earning a grade in my class
– Disadvantage is a potential lack of variability
Measures of Optimum Performance
– Aptitude tests: predict future performance
– Achievement tests: measure current knowledge
– Performance tests: measure current ability to complete tasks
Measures of Typical Performance
– Often impacted by “social desirability”
  – Wanting to hide undesirable traits or characteristics
– One way to work around social desirability is to use projective tests
  – Rorschach Inkblot Test
  – Thematic Apperception Test
Paper/pencil measures of attitudes using Likert-type scales
– Strongly Agree to Strongly Disagree
– Reverse scoring to prevent or identify “response bias”
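Reverse scoring can be sketched as follows, assuming a 5-point scale coded 1 (Strongly Disagree) to 5 (Strongly Agree). The function names, the coding, and the choice of which items are negatively worded are all hypothetical.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a Likert response so that, on a 1-5 scale, 1 becomes 5 and 5 becomes 1."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the scale")
    return scale_min + scale_max - response

def score_survey(responses, reversed_items):
    """Reverse only the items whose zero-based positions are in reversed_items."""
    return [
        reverse_score(r) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]

# Hypothetical respondent: items at positions 1 and 3 are negatively worded
responses = [5, 1, 4, 2]
reversed_items = {1, 3}
scored = score_survey(responses, reversed_items)
print(scored)
```

A respondent who marks Strongly Agree on every item, including the negatively worded ones, ends up with contradictory scores after reversal, which is one way response bias can be identified.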