Outline: Variables – definition; Physical dimensions; Abstract dimensions; Systematic vs. random variables; Scales of measurement; Reliability of measurement; Validity of measurement
Variables - definition Variables are things whose values (a) can be measured, and (b) vary from one occasion to another e.g., academic progress can be measured and will vary from student to student
Variables – physical dimensions The dimensions of interest may be physical e.g., length, weight, volume, velocity today, physicists are good at measuring these things – but this reflects thousands of years of progress
Variables – abstract dimensions The dimensions of interest may be abstract e.g., intelligence, value, mood these are more challenging to measure we have only been working on this for about 100 years
Abstract dimensions One problem with an abstract or “underlying” dimension is that we can never measure the dimension itself. We can only measure proxy variables – for example, by creating a questionnaire or other instrument.
Abstract dimensions The relation of scores on the instrument to scores on the underlying dimension is very often unknown While the observed scores may help us to test theories, we should be cautious when using these scores to make decisions that will affect someone’s life.
[Figure: Marital Satisfaction, with a neutral point and the positions of Person A and Person B, as shown on Scale X and Scale Z.] Source: Blanton, H. & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1),
Variables vary from one occasion to another Things that stay the same are not of much interest (e.g., the dimensions of this room). Things that vary are of interest (e.g., G.P.A.s of incoming students this fall: how do they compare to previous classes?).
Problem: not all variation is meaningful If we measure a person’s I.Q. as 115 today and 117 tomorrow, did they get smarter overnight? Probably not. Such variations are random. Random variability is not meaningful.
Systematic vs. random variability Some variability is produced by random effects e.g., variation in blood caffeine level, or fatigue, or motivation
Systematic vs. random variability Some variability is produced by systematic effects e.g. performance quality varies systematically with hours of practice. this is meaningful variability
Systematic vs. random variability We are interested in the systematic effects because those are the ones that can be explained by a theoretical model. Predicted effects are always systematic
Systematic vs. random variability The problem is that random variability is always present Sometimes systematic variability is also present Our task is to “detect the signal in the noise”
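A small simulation can make "detecting the signal in the noise" concrete. The sketch below is illustrative only: the baseline score of 50, the slope of 2 points per practice hour, and the noise level are made-up assumptions, not values from the lecture. Each simulated score is a systematic part (driven by hours of practice) plus a random part.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_scores(hours, slope, noise_sd=5.0, seed=0):
    """Score = assumed baseline + systematic effect of practice + random noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [50.0 + slope * h + rng.gauss(0.0, noise_sd) for h in hours]

# 210 hypothetical people: ten at each practice level from 0 to 20 hours.
hours = [h for h in range(0, 21) for _ in range(10)]

with_signal = simulate_scores(hours, slope=2.0)  # systematic + random
noise_only = simulate_scores(hours, slope=0.0)   # random only

r_signal = pearson_r(hours, with_signal)
r_noise = pearson_r(hours, noise_only)
print(f"practice matters: r = {r_signal:.2f}")
print(f"noise only:       r = {r_noise:.2f}")
```

Both sets of scores vary from person to person, but only the first varies in a way that hours of practice can account for.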
Scales of measurement There are four kinds of scales available to science: nominal, ordinal, interval, and ratio. These types of scales differ in their usefulness. We generally work with interval or ratio scales.
Scales of measurement Nominal: simply assigns names to cases, e.g., sweaters organized by color in a closet
Scales of measurement Ordinal: ranks cases on some dimension, e.g., chicken wings offered in mild, medium, hot, and volcano
Scales of measurement Interval: yields distances between cases on the measurement dimension, e.g., altitude of a given location (above or below sea level)
Scales of measurement Ratio: intervals between cases plus a true zero, e.g., elapsed time for runners in a race
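The practical difference between interval and ratio scales shows up in which comparisons survive a change of reference point. The sketch below uses made-up altitudes and race times, and the alternative datum 50 m below sea level is a hypothetical chosen only for illustration.

```python
# Hypothetical altitudes in metres above sea level. This is an interval scale:
# sea level is a conventional zero, not "no altitude at all".
altitude_a = 200.0
altitude_b = 100.0

# Re-express both altitudes against a different, equally arbitrary zero point
# (a hypothetical datum 50 m below sea level).
shift = 50.0
shifted_a = altitude_a + shift
shifted_b = altitude_b + shift

# Differences are unchanged by the new zero point...
diff_original = altitude_a - altitude_b
diff_shifted = shifted_a - shifted_b

# ...but ratios are not, so "A is twice as high as B" is not a stable claim.
ratio_original = altitude_a / altitude_b
ratio_shifted = shifted_a / shifted_b

# Elapsed race times in seconds. This is a ratio scale: zero means no time elapsed.
time_fast, time_slow = 600.0, 1200.0

# Changing units (seconds to minutes) rescales the numbers but keeps the true
# zero, so "took twice as long" holds in either unit.
ratio_seconds = time_slow / time_fast
ratio_minutes = (time_slow / 60.0) / (time_fast / 60.0)

print(diff_original, diff_shifted)
print(ratio_original, ratio_shifted)
print(ratio_seconds, ratio_minutes)
```

On the interval scale only the differences agree across the two reference points; on the ratio scale the ratio itself is preserved.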
What kind of scale? Rated preference for desserts, using a Likert scale. Interval
What kind of scale? Listing of children by kind of cake they like best. Nominal
What kind of scale? Amount of money in your bank account. Ratio
What kind of scale? Listing of movies showing this week in order you would like to see them. Ordinal
Reliability a measure is reliable if it gives the same information every time it is used. reliability is assessed by a number – typically a correlation between two sets of scores
Split-half Reliability correlation between grades on odd-numbered & even-numbered questions on an exam. if most people get very similar scores on each half-test, the exam score is reliable.
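As a minimal sketch of the odd/even split, the code below scores each half of an exam separately and correlates the two half-scores. The exam data (five students, six questions, 1 = correct) are invented for illustration, and `split_half_reliability` is a hypothetical helper name.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(item_scores):
    """item_scores: one list per person, one entry per exam question."""
    # Index 0 is question 1, so [0::2] picks the odd-numbered questions.
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    return pearson_r(odd_totals, even_totals)

# Made-up results: each row is one student, each column one question.
exam = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_split_half = split_half_reliability(exam)
print(f"split-half correlation: {r_split_half:.2f}")
```

For these invented data the two half-test scores correlate at about 0.91, which would count as evidence that the exam score is reliable.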
Test-retest Reliability correlation between grades on the same test given on two occasions, or on two different but comparable tests covering the same material (strictly, the second case is alternate-forms reliability). if most people get very similar scores on the two tests, the test scores are reliable. (requires people to be similar to themselves, not other people)
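The parenthetical point, that people need to be similar to themselves rather than to each other, can be illustrated with two invented scenarios. In the first, every student scores exactly 5 points higher on the second test; in the second, the same set of grades is reassigned so that students change places. All grades here are hypothetical.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical grades for the same six students on the first test.
test_1 = [55.0, 62.0, 70.0, 78.0, 85.0, 93.0]

# Scenario A: everyone gains exactly 5 points, so each student keeps the
# same standing relative to the others.
test_2_shifted = [grade + 5.0 for grade in test_1]

# Scenario B: the same six grades appear, but attached to different students.
test_2_scrambled = [85.0, 55.0, 93.0, 62.0, 78.0, 70.0]

r_shifted = pearson_r(test_1, test_2_shifted)
r_scrambled = pearson_r(test_1, test_2_scrambled)
print(f"consistent standing: r = {r_shifted:.2f}")
print(f"scrambled standing:  r = {r_scrambled:.2f}")
```

The class average and spread are identical in Scenario B, yet the correlation collapses, because the correlation tracks whether each person's standing is preserved from one test to the other.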
Validity We distinguish between the validity of a measure of some psychological process or state and the validity of a conclusion. Here, we focus on validity of measures. A subsequent lecture will consider the validity of conclusions.
Validity a measure is valid if it measures what you think it measures. we traditionally distinguish between four types of validity: face, content, construct, and criterion.
Four types of validity Face: the test appears to measure what it is supposed to measure; not formally recognized as a type of validity
Four types of validity Construct: the measure captures the theoretical construct it is supposed to measure
Four types of validity Content: the measure samples the range of behavior covered by the construct.
Four types of validity Criterion: results relate closely to those produced by other measures of the same construct, and do not relate to those produced by measures of other constructs
Quick Review We’re not really interested in things that don’t change. We’re interested in variation. But only systematic variation, not random variation: systematic variation can be explained; random variation can’t.
Quick Review Some variation in performance is random and some is systematic The scientist’s tasks are to separate the systematic variation from the random, and then to build models of that systematic variation.
Quick Review We choose a measurement scale. We prefer either ratio or interval scales, when we can get them. We try to maximize both the reliability and the validity of our measurements using that scale.
Review questions Which would you expect to be easier to assess – reliability or validity? Why do we have tools and machines to measure some things for us (such as rulers, scales, and money)? What are some analogues for rulers and scales, used when we measure psychological constructs?