Part II Sigma Freud & Descriptive Statistics Chapter 6 Just the Truth: An Introduction to Understanding Reliability and Validity
Why Measurement?
  What is measurement?
  You need to know that the data you are collecting represent what you want to know about.
  How do you know that the instrument you are using to collect data works every time (reliability) and measures what it is supposed to measure (validity)?
Scales of Measurement
  Measurement is the assignment of values to outcomes following a set of rules.
  There are four types of measurement scales:
    Nominal
    Ordinal
    Interval
    Ratio
Nominal Level of Measurement
  Characteristics of an outcome that fit one and only one category
  Mutually exclusive categories, such as:
    Male or Female
    Democrat, Republican, or Independent
  Categories cannot be ordered meaningfully
  Least precise level of measurement
Ordinal Level of Measurement
  Characteristics being measured are ordered
  Rankings such as #1, #2, #3
  You know that a higher rank is better, but not by how much
Interval Level of Measurement
  Test or tool is based on an underlying continuum that allows you to talk about how much higher one score is than another
  Intervals along the scale are equal to one another
  Example: “Rate your restaurant experience on a scale of 1-7, with 1 = unsatisfactory and 7 = excellent”
Ratio Level of Measurement
  Characterized by the presence of an absolute zero on the scale
  Zero means an absence of any of the trait being measured
  Examples:
    How many kids do you have? (can have 0)
    Scores on a test (0 is possible!)
Things to Remember
  Any outcome can be assigned one of four scales of measurement
  The scales of measurement have an order (nominal, ordinal, interval, ratio)
  The “higher” up the scale of measurement, the more precise (and useful) the data are
  Use the scale most appropriate for the research task at hand
Classical Test Theory: Os = Ts + E
  Observed score (Os): the actual score on a test, scale, or measure
  True score (Ts): theoretical reflection of the actual amount of a trait or characteristic an individual possesses
  Error score (E): the part of the score that is random, or the difference between the observed and true scores
  Reliability = True Score / (True Score + Error)
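The relationship Os = Ts + E can be illustrated with a small worked example. This is only a sketch under stated assumptions: the four true scores and error values are made up, the errors are chosen to average zero and to be uncorrelated with the true scores, and the reliability ratio is read in terms of variances (the usual convention, though the slide states it only in words).

```python
from statistics import pvariance

# Hypothetical data for four test takers (not from the chapter).
# The errors average zero and are uncorrelated with the true scores.
true_scores = [40.0, 50.0, 60.0, 70.0]
errors = [2.0, -2.0, -2.0, 2.0]

# Os = Ts + E for each person
observed_scores = [t + e for t, e in zip(true_scores, errors)]

true_var = pvariance(true_scores)
error_var = pvariance(errors)
observed_var = pvariance(observed_scores)

# Reliability = True Score / (True Score + Error), read as variances
reliability = true_var / (true_var + error_var)

print(observed_scores)
print(round(reliability, 3))
```

Because the errors here are uncorrelated with the true scores, the observed-score variance equals the true-score variance plus the error variance, so the ratio is the share of observed variation that reflects the trait itself.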
Types of Reliability
  Test-Retest: measure of stability
  Parallel Forms: measure of equivalence
  Internal Consistency: measure of consistency
    Cronbach’s Alpha (coefficient alpha)
  Inter-Rater: measure of agreement
Using the Computer SPSS and Cronbach’s Alpha
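Outside SPSS, coefficient alpha can also be computed directly from its standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is illustrative only: the function name `cronbach_alpha` and the five-respondent, three-item data set are assumptions, not material from the chapter.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) across respondents
    item_var_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's total score
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents x 3 items
responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when the items vary together, because the variance of the total scores then grows faster than the sum of the individual item variances.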
How Big Is Good Enough?
  Reliability coefficients should be positive, ranging from 0.0 to 1.0
  General rules of thumb:
    Test-Retest = .60 to 1.0
    Inter-Rater = 85% agreement or better
    Internal Consistency (alpha) = .70 to 1.0
  High reliability alone DOES NOT mean you are testing or measuring the right thing!
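As a rough illustration of these rules of thumb, the sketch below computes inter-rater percent agreement and compares a coefficient against the thresholds listed on this slide. The names `percent_agreement`, `meets_rule_of_thumb`, and `RULES_OF_THUMB`, and the two raters' codes, are hypothetical.

```python
# Lower bounds taken from the rules of thumb on this slide
RULES_OF_THUMB = {
    "test_retest": 0.60,           # correlation
    "internal_consistency": 0.70,  # coefficient alpha
    "inter_rater_percent": 85.0,   # percent agreement
}

def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of observations")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)

def meets_rule_of_thumb(kind, value):
    """True if the value reaches the lower bound for that kind of reliability."""
    return value >= RULES_OF_THUMB[kind]

# Hypothetical codes from two raters watching the same 10 observations
rater_a = ["on", "off", "on", "on", "off", "on", "on", "off", "on", "on"]
rater_b = ["on", "off", "on", "off", "off", "on", "on", "off", "on", "on"]

agreement = percent_agreement(rater_a, rater_b)
print(agreement, meets_rule_of_thumb("inter_rater_percent", agreement))
```

Passing a threshold only says the measure is consistent enough by convention; it says nothing about whether the measure captures the right thing.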
Establishing Reliability
  Make sure instructions are standardized across all settings
  Increase the number of items or observations
  Delete unclear items
  Moderate the easiness or difficulty of tests (“middle-of-the-road” strategy)
  Minimize the effect of external events
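One way to see why adding items tends to raise reliability is the Spearman-Brown formula, which is not part of this slide and is brought in here only as an illustration. It assumes the added items are comparable in quality to the existing ones. The function name `spearman_brown` and the starting reliability of .60 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    Assumes the new items behave like the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical test with reliability .60, then doubled in length
doubled = spearman_brown(0.60, 2)
print(round(doubled, 2))
```

Under that assumption, a longer test averages out more random error, so the predicted reliability goes up, with smaller gains each time the test is lengthened further.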
What Is the Truth? Validity
  The extent to which inferences made from a test are:
    Appropriate
    Meaningful
    Useful
  (American Psychological Association & the National Council on Measurement in Education)
  Does the test measure what it is supposed to measure?
Types of Validity
  Three types of validity:
    Content Validity
    Criterion Validity
      Predictive criterion validity
      Concurrent criterion validity
    Construct Validity
Content Validity
  Property of a test such that the test items sample the universe of items for which the test is designed
  How to establish:
    Consult a content expert
    Do the items represent all possible items?
    How well does the number of items reflect what was taught?
Criterion Validity
  Assesses whether a test reflects a set of abilities in a current (concurrent) or future (predictive) setting, as measured by some other test
  Concurrent Validity
    How well does my test correlate with the outcomes of a similar test right now?
  Predictive Validity
    How well does my test predict performance on a similar measure in the future?
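Both kinds of criterion validity are commonly summarized as a correlation between scores on the test and scores on the criterion measure. The sketch below computes a Pearson correlation by hand. The function name `pearson_r` and both score lists are hypothetical: for concurrent validity the criterion would be collected at about the same time as the test, and for predictive validity it would be collected later.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 scores each")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five people
my_test = [10, 12, 14, 16, 18]     # scores on the new test
criterion = [20, 25, 24, 30, 31]   # scores on the criterion measure

validity_coefficient = pearson_r(my_test, criterion)
print(round(validity_coefficient, 2))
```

The same calculation applies to test-retest reliability, where the two lists would be the same test given at two points in time.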
Construct Validity
  Most difficult source of validity to establish
  Construct = group of interrelated variables, such as:
    Aggression
    Hope
    Intelligence (Verbal, Quantitative, Emotional)
  You want your construct to correlate with related behaviors and not to correlate with unrelated behaviors
All About Validity
Validity & Reliability: The “Kissing Cousins”
  A test can be reliable but not valid
  A test cannot be valid unless it is reliable, because:
  “A test cannot do what it is supposed to do (validity) until it does what it is supposed to do consistently (reliability).”