1
Psychology 3450W: Experimental Psychology
Fall, 2017 Professor Delamater
2
Measurement Issues
1. Measurement Scales
2. Evaluating Measurements
3. Statistics Primer (Descriptive, Inferential)
Let's look at each of these in more detail…
3
Measurement Issues Measurement Scales
What quantitative information is conveyed by our measuring instrument about the psychological construct? Our instrument is a readout of the underlying construct.
4
Measurement Issues 4 Measurement Scales
1. Nominal
2. Ordinal
3. Interval
4. Ratio
5
Measurement Issues 4 Measurement Scales
1. Nominal Scale – Numbers are used to signify categories (e.g., numbers on a uniform). Thus, they contain no numerical information at all.
7
Measurement Issues 4 Measurement Scales
2. Ordinal Scale – Numbers can convey information about relative magnitude, i.e., >, <, or = relations, but no more. For example, Turnpike exit numbers tell you that exit 9 is further away from you than exit 7 (if you are at 1), but you cannot infer that the difference between 7 and 9 is equal to the difference between 4 and 6.
9
Measurement Issues 4 Measurement Scales
2. Ordinal Scale – Numbers can convey information about relative magnitude, i.e., >, <, or = relations, but no more. For example, Turnpike exit numbers tell you that exit 9 is further away from you than exit 7 (if you are at 1), but you cannot infer that the difference between 7 and 9 is equal to the difference between 4 and 6. Measure Anxiety with Heart Rate (HR) in the presence of 3 different stimuli. Your results are as follows: Stimulus 1: 60 beats per min Stimulus 2: 80 beats per min Stimulus 3: 100 beats per min Stim 3 > Stim 2 > Stim 1. It seems that Stim 3 is more anxiety-provoking than Stim 2 which is more than Stim 1.
10
Measurement Issues 4 Measurement Scales
2. Ordinal Scale – Numbers can convey information about relative magnitude, i.e., >, <, or = relations, but no more. For example, Turnpike exit numbers tell you that exit 9 is further away from you than exit 7 (if you are at 1), but you cannot infer that the difference between 7 and 9 is equal to the difference between 4 and 6. Measure Anxiety with Heart Rate (HR) in the presence of 3 different stimuli. Your results are as follows: Stimulus 1: 60 beats per min Stimulus 2: 80 beats per min Stimulus 3: 100 beats per min If the anxiety differences between 3 and 2 and 2 and 1 are not equal, then HR is only an ordinal scale measure of anxiety.
11
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, IQ and intelligence.
12
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, IQ and intelligence. Measure IQ from different people. The scores are as follows: Person 1: score of 60 Person 2: score of 90 Person 3: score of 120 Person 3 > Person 2 > Person 1. Person 3 has more intelligence than Person 2 and Person 2 has more intelligence than Person 1. That’s an ordinal scale statement.
13
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, IQ and intelligence. Measure IQ from different people. The scores are as follows: Person 1: score of 60 Person 2: score of 90 Person 3: score of 120 Person 3 > Person 2 > Person 1. If we can say that: Person 3 has as much more intelligence than Person 2 as Person 2 has over Person 1, that would be an interval scale statement.
14
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, IQ and intelligence. Measure IQ from different people. The scores are as follows: Person 1: score of 60 Person 2: score of 90 Person 3: score of 120 However, many psychologists would argue that IQ cannot be taken as a true interval scale measure of “intelligence.”
15
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, RT and “stimulus processing” vs “expectation.” Measure RT in a Posner cueing task: Valid Trials: RT = 325 msec (dot target) Invalid Trials: RT = 350 msec (dot target) In this task spatial cues speed up processing of the dot target.
16
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, RT and “stimulus processing” vs “expectation.” Measure RT in a Posner cueing task: Valid Trials: RT = 325 msec (dot target) Invalid Trials: RT = 350 msec (dot target) Valid Trials: RT = 675 msec (word target) Invalid Trials: RT = 700 msec (word target) Suppose we also looked at word targets…
17
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, RT and “stimulus processing” vs “expectation.” Measure RT in a Posner cueing task: Valid Trials: RT = 325 msec (dot target) Invalid Trials: RT = 350 msec (dot target) Valid Trials: RT = 675 msec (word target) Invalid Trials: RT = 700 msec (word target) The same “processing” advantage occurs with spatially predicted dots as with words. RT is an interval scale measure of “stimulus processing,” i.e., the added efficiency in processing produced by spatial cues.
18
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, RT and “stimulus processing” vs “expectation.” Measure RT in a Posner cueing task: Valid Trials: RT = 325 msec (dot target) Invalid Trials: RT = 350 msec (dot target) Valid Trials: RT = 675 msec (word target) Invalid Trials: RT = 700 msec (word target) However, if we were to use this RT difference as a measure of the psychological construct of “expectation,” then we’d likely only assert ordinal scale properties to RT.
19
Measurement Issues 4 Measurement Scales
3. Interval Scale – Numbers convey the same information about relative magnitude as in the ordinal scale, but, in addition, equal differences are psychologically equal. For example, RT and “stimulus processing” vs “expectation.” Measure RT in a Posner cueing task: Valid Trials: RT = 325 msec (dot target) Invalid Trials: RT = 350 msec (dot target) Valid Trials: RT = 675 msec (word target) Invalid Trials: RT = 700 msec (word target) This is because the same amount of “expectation” may translate into differences in RT at different points along the RT scale. Thus, an equal numeric difference may not be psychologically equivalent.
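The arithmetic behind the Posner example can be sketched in a few lines. The RT values are the ones on the slide. The dictionary layout, the function names, and the "proportional" view of the cueing effect are assumptions added here for illustration. The proportional view is just one hypothetical way in which equal msec differences might fail to be psychologically equal.

```python
# RTs (msec) copied from the slide; the data layout is illustrative.
rt = {
    "dot": {"valid": 325, "invalid": 350},
    "word": {"valid": 675, "invalid": 700},
}


def cueing_effect(target):
    """Invalid minus valid RT: the absolute benefit of a valid cue, in msec."""
    return rt[target]["invalid"] - rt[target]["valid"]


def proportional_effect(target):
    """The same benefit expressed as a fraction of the invalid-trial RT."""
    return cueing_effect(target) / rt[target]["invalid"]


for target in rt:
    print(target, cueing_effect(target), round(proportional_effect(target), 3))
```

The absolute cueing effect is 25 msec for both targets, which is the equality the interval-scale reading relies on. Relative to baseline RT the effect is about twice as large for dots as for words, so a construct that tracked proportional change would not treat the two 25 msec differences as equal.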
20
Measurement Issues 4 Measurement Scales
4. Ratio Scale – Numbers convey the same information as in the interval scale, but, in addition, ratio statements can be made because a true 0 point can be identified. For example, scale for height and weight. You can say when someone is twice as tall or heavy as someone else.
21
Measurement Issues 4 Measurement Scales
4. Ratio Scale – Numbers convey the same information as in the interval scale, but, in addition, ratio statements can be made because a true 0 point can be identified. For example, temperature scales… Compare Fahrenheit, Celsius, and Kelvin Scales of temperature. Are these interval or ratio scales? Why?
22
Measurement Issues 4 Measurement Scales
4. Ratio Scale – Numbers convey the same information as in the interval scale, but, in addition, ratio statements can be made because a true 0 point can be identified. For example, temperature scales... Kelvin Scale: Absolute 0 is theoretically meaningful – zero heat energy.
23
Measurement Issues 4 Measurement Scales
4. Ratio Scale – Numbers convey the same information as in the interval scale, but, in addition, ratio statements can be made because a true 0 point can be identified. For example, temperature scales... Kelvin Scale: Absolute 0 is theoretically meaningful – zero heat energy. In Psychological research we’d need to know how the total absence of a construct translates into a numeric value. Very Difficult.
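A short sketch of the temperature question may help. It uses the standard conversion formulas; the two example temperatures (20 and 10 degrees Celsius) are arbitrary choices made here, not values from the lecture.

```python
def celsius_to_kelvin(c):
    """Kelvin shares Celsius's unit size but starts at absolute zero."""
    return c + 273.15


def celsius_to_fahrenheit(c):
    """Fahrenheit differs from Celsius in both unit size and zero point."""
    return c * 9 / 5 + 32


warm_c, cool_c = 20.0, 10.0  # arbitrary example temperatures

# Ratios of the same two temperatures on each scale.
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Differences survive the change of scale (up to unit size).
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

print(ratio_c, round(ratio_f, 2), round(ratio_k, 4))
print(diff_c, round(diff_k, 2))
```

The "twice as hot" ratio of 2.0 appears only on the Celsius scale and changes when the arbitrary zero point changes, so Celsius and Fahrenheit support interval statements but not ratio statements. Only the Kelvin ratio (about 1.04) is anchored to a true zero.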
24
Measurement Issues Measurement Scales in Psychology
Generally speaking, measuring instruments in psychology experiments assess psychological constructs at either the ordinal or interval scale of measurement. The problems are (a) that we can rarely, if ever, identify a true 0 point for some psychological construct, and (b) it is sometimes difficult to determine if equal differences at different points on the measuring scale are psychologically equivalent.
25
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Reliability – How consistent is the measuring instrument at measuring whatever it measures?
26
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Reliability – How consistent is the measuring instrument at measuring whatever it measures? Suppose we used this type of scale to measure intelligence. Is it consistent? If so, then it is reliable… BUT, is it valid?
27
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Validity – How accurate is the measuring instrument at measuring a particular psychological construct? Suppose we used this type of scale to measure intelligence. Is it consistent? If so, then it is reliable… BUT, is it valid? Well, NO, this is not a valid measure of intelligence.
28
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Validity – How accurate is the measuring instrument at measuring a particular psychological construct? Three types of validity: Face, Criterion (predictive), Construct
29
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Face Validity – How intuitive is it that a measuring instrument measures the psychological construct? e.g., RTs in a line length discrimination task as a measure of intelligence Press Button 1 if same Press Button 2 if different
30
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Face Validity – How intuitive is it that a measuring instrument measures the psychological construct? e.g., RTs in a line length discrimination task as a measure of intelligence Press Button 1 if same Press Button 2 if different Etc… Is RT in this task a “good” measure of intelligence? On the face of it, we’d say NO.
31
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Face Validity – How intuitive is it that a measuring instrument measures the psychological construct? e.g., RTs in a line length discrimination task as a measure of intelligence Press Button 1 if same Press Button 2 if different Etc… But, in fact, RT in this task negatively correlates with IQ. Thus, it has low Face Validity, but that doesn’t mean it isn’t useful.
32
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Criterion Validity – Can the measuring instrument be used to predict future performance related to the construct being measured? e.g., Can SAT or GRE scores be used to predict academic success? SAT scores correlate positively with first-year GPA, but the correlation declines thereafter. Thus, these measures have some criterion (predictive) validity, but it seems limited.
33
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Criterion Validity – Can the measuring instrument be used to predict future performance related to the construct being measured? e.g., Can a measure of depression predict whether people are likely to attempt suicide or to actually commit suicide? The Beck Depression Inventory (BDI) is one popular measuring scale for depression. Green et al. (2015, J Clin Psychiatry) demonstrated a clear predictive relationship between BDI scores and suicide deaths and attempts. This shows criterion validity.
34
Measurement Issues Evaluating Measurements
Aside from quantitative considerations, we can ask how Reliable and Valid are our measuring devices. Construct Validity – Is the measuring instrument really assessing the underlying psychological construct of interest? This is the most difficult type of validity to establish because it concerns the extent to which we are actually measuring the construct of interest. The way to establish construct validity is through predictable research outcomes. These increase our confidence in the validity of both the measuring instrument and the construct itself. This is related to the converging operations idea: we must be on the right track if we can run multiple studies with predictable outcomes.
35
Measurement Issues Evaluating Measurements
Notice that Reliability and Validity are dissociable concepts. It is possible for a measure to be highly reliable, but not valid. However, a measure that is valid must also be reliable (because it is actually measuring the thing it is supposed to measure).
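The dissociation between reliability and validity can be illustrated with a toy calculation. Everything in this sketch is hypothetical: it assumes the "scale" used to measure intelligence is a bathroom scale, and the weights and IQ scores are made-up numbers chosen only to show the pattern. Reliability is estimated as a test-retest correlation, and validity as the correlation between the measure and an IQ criterion.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for six people (invented for illustration only).
weight_day_1 = [150, 180, 200, 130, 165, 175]  # first weighing, pounds
weight_day_2 = [151, 179, 201, 131, 164, 176]  # second weighing, pounds
iq_scores = [95, 105, 105, 105, 95, 95]        # criterion measure

reliability = pearson_r(weight_day_1, weight_day_2)  # test-retest consistency
validity = pearson_r(weight_day_1, iq_scores)        # relation to the construct

print(round(reliability, 3), round(validity, 3))
```

With these invented numbers the two weighings agree almost perfectly, while weight is only weakly related to IQ: a highly reliable instrument that is not a valid measure of intelligence.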
36
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Inferential Statistics
37
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Central Tendencies
38
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Central Tendencies (mean, median, mode)
39
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Central Tendencies (mean, median, mode) Spread
40
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Central Tendencies (mean, median, mode) Spread (range, variance, standard deviation, SEM)
42
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Descriptive Statistics Central Tendencies (mean, median, mode) Spread (range, variance, standard deviation, SEM) Standard Deviation = √(variance); SEM = Standard Deviation / √N
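As a worked example of these descriptive statistics, the sketch below uses Python's standard `statistics` module on a small made-up data set. The scores are invented for illustration, and the use of the sample variance (dividing by N - 1) is an assumption, since the slide does not say which form is intended.

```python
import statistics
from math import sqrt

scores = [2, 4, 4, 5, 7, 8]  # made-up scores for illustration

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Spread
score_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (divides by N - 1)
std_dev = sqrt(variance)                # Standard Deviation = sqrt(variance)
sem = std_dev / sqrt(len(scores))       # SEM = Standard Deviation / sqrt(N)

print(mean, median, mode, score_range)
print(round(variance, 3), round(std_dev, 3), round(sem, 3))
```

For these scores the mean is 5, the median 4.5, the mode 4, the range 6, and the sample variance 4.8, giving a standard deviation of about 2.19 and an SEM of about 0.89.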
43
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Inferential Statistics – A tool for assessing the status of an experimental hypothesis. We can ask about the likelihood of observing some difference by chance. If the observation is unlikely to have occurred by chance, then it is likely that there is a “real” difference between our groups.
44
Measurement Issues Statistical Review – Once we collect data, how do we know whether we have “real” differences? Inferential Statistics – A tool for assessing the status of an experimental hypothesis. We can ask about the likelihood of observing some difference by chance. If the observation is unlikely to have occurred by chance, then it is likely that there is a “real” difference between our groups or experimental conditions. But what does a “real” difference mean?
45
Measurement Issues Null vs Alternative Hypotheses
Some variable is normally distributed in the population. This means that the distribution of scores has a mean (μ) and a standard deviation (σ).
46
Measurement Issues Null vs Alternative Hypotheses
Some variable is normally distributed in the population. This means that the distribution of scores has a mean (μ) and a standard deviation (σ). For example, body weights of males are normally distributed with a mean close to 200 pounds.
47
Measurement Issues Null vs Alternative Hypotheses
Now, we wish to determine if in our experiment we are collecting data that comes from the same underlying distribution or from different ones.
48
Measurement Issues Null vs Alternative Hypotheses
[Figure: Control Group and Experimental Group distributions] Null Hypothesis: Both of our groups come from the same underlying population. H0: μ1 – μ2 = 0
49
Measurement Issues Null vs Alternative Hypotheses
[Figure: Control Group and Experimental Group distributions] Alternative Hypothesis: Our two groups come from different underlying populations. Halt: μ1 – μ2 ≠ 0
50
Measurement Issues Null vs Alternative Hypotheses
So, how can we tell when we collect some data which of these two scenarios is likely to be true?
51
Measurement Issues Null vs Alternative Hypotheses
So, how can we tell when we collect some data which of these two scenarios is likely to be true? Answer: Compute some inferential statistical test (e.g., t test, F test (or ANOVA), etc.). Test statistic = systematic variance / error variance
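One common instance of such a test is the independent-samples t test, in which the systematic part is the difference between the two group means and the error part is the standard error of that difference. The sketch below computes it by hand with a pooled variance estimate. The function name and the two small data sets are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups.

    Numerator: difference between group means (systematic part).
    Denominator: standard error of that difference (error part).
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_b) - mean(group_a)) / standard_error


control = [10, 12, 11, 13, 14]       # made-up scores, mean = 12
experimental = [14, 15, 13, 16, 17]  # made-up scores, mean = 15

t = independent_t(control, experimental)
degrees_of_freedom = len(control) + len(experimental) - 2

print(round(t, 3), degrees_of_freedom)
```

Here the means differ by 3 and the standard error is 1, so t = 3 with 8 degrees of freedom. A larger mean difference or a smaller error term would both push t further from 0.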
52
Measurement Issues Null vs Alternative Hypotheses
So, if the null hypothesis is true, then what will the distribution of t scores look like? Answer:
53
Measurement Issues Null vs Alternative Hypotheses
So, if the null hypothesis is true, then what will the distribution of t scores look like? Answer: It will also be normally distributed. But what will be its mean?
55
Measurement Issues Null vs Alternative Hypotheses
When will we be inclined to decide to reject the null hypothesis? Or, how unlikely does the specific t score have to be for us to say that it probably did NOT occur by chance? Answer: We choose a critical value (or criterion), tcrit. Any t score that exceeds that critical value will cause us to reject the null hypothesis.
56
Measurement Issues Null vs Alternative Hypotheses
When will we be inclined to decide to reject the null hypothesis? Or, how unlikely does the specific t score have to be for us to say that it probably did NOT occur by chance? Answer: We choose a critical value (or criterion), tcrit. Any t score that exceeds that critical value will cause us to reject the null hypothesis. BUT, we will be wrong some of the time. By choosing a critical value where 5% of the time the score will exceed it by chance, we can limit the % of time we make these mistakes.
57
Measurement Issues Null vs Alternative Hypotheses
When will we be inclined to decide to reject the null hypothesis? Or, how unlikely does the specific t score have to be for us to say that it probably did NOT occur by chance? Answer: We choose a critical value (or criterion), tcrit. Any t score that exceeds that critical value will cause us to reject the null hypothesis. BUT, we will be wrong some of the time. By choosing a critical value where 5% of the time the score will exceed it by chance, we can limit the % of time we make these mistakes. Rejecting a true null hypothesis is called a Type I error; its probability is typically set to 5%, or α = 0.05.
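The 5% figure can be checked with a small simulation. This is a sketch under stated assumptions, not part of the lecture: both groups are drawn from the same standard normal population (so the null hypothesis is true by construction), each group has 10 scores, the test is one-tailed, and the critical value 1.734 is the standard table value for α = 0.05 with 18 degrees of freedom. The function names, the seed, and the number of simulated experiments are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, variance

T_CRIT = 1.734  # one-tailed critical t, alpha = 0.05, df = 18 (table value)


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


def false_positive_rate(n_experiments=4000, n_per_group=10, seed=1):
    """Fraction of null-true experiments whose t score exceeds T_CRIT."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the SAME population: H0 is true.
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        experimental = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if independent_t(control, experimental) > T_CRIT:
            rejections += 1
    return rejections / n_experiments


rate = false_positive_rate()
print(rate)
```

Because no real difference exists in any of these simulated experiments, every rejection is a Type I error, and the printed rate should land close to 0.05.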
58
Measurement Issues Null vs Alternative Hypotheses
Now, suppose that the alternative hypothesis was actually true. What happens to the resulting distribution of t scores? Will it be centered around a mean = 0? Answer:
59
Measurement Issues Null vs Alternative Hypotheses
Now, suppose that the alternative hypothesis was actually true. What happens to the resulting distribution of t scores? Will it be centered around a mean = 0? Answer: No, it will be shifted one way or the other (e.g., to the right). Area to the right of tcrit in the Halt distribution = Power (or the prob of correctly rejecting a false H0). Area to the left of tcrit in the Halt distribution = β, the Type II error rate (or the prob of failing to reject a false H0). Area to the right of tcrit in the H0 distribution = α. [Figure: H0 and Halt distributions with β, α, and tcrit marked]
60
Measurement Issues How to Increase our Statistical Power
When conducting experiments we wish to increase our ability to detect a real effect when it exists. How can we accomplish this? Answer: Increase our statistical power. But how?
61
Measurement Issues How to Increase our Statistical Power
When conducting experiments we wish to increase our ability to detect a real effect when it exists. How can we accomplish this? Answer: Increase our statistical power by (1) maximizing the hypothesized effect size, and (2) increasing our N to shrink the error variance. Increasing effect size means the Halt distribution is shifted further to the right. This will increase Power. [Figure labels: α, tcrit]
62
Measurement Issues How to Increase our Statistical Power
When conducting experiments we wish to increase our ability to detect a real effect when it exists. How can we accomplish this? Answer: Increase our statistical power by (1) maximizing the hypothesized effect size, and (2) increasing our N to shrink the error variance. Increasing N will decrease the variance of both H0 and Halt distributions, and this will also increase Power. Power is the shaded areas within the Halt distributions.
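Both routes to higher power can be illustrated numerically. The sketch below swaps in a normal (z test) approximation for the t distributions on the slides, which is a simplifying assumption: it treats the population standard deviation as known, uses a one-tailed test with two equal-sized groups, and expresses effect size in standard deviation units. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def power(effect_size, n_per_group, alpha=0.05):
    """Approximate one-tailed power for comparing two group means.

    effect_size: mean difference in standard deviation units.
    Under Halt the test statistic is centered on
    effect_size * sqrt(n_per_group / 2) instead of 0.
    """
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha)     # criterion under H0
    shift = effect_size * sqrt(n_per_group / 2)     # center of Halt distribution
    return 1 - STANDARD_NORMAL.cdf(z_crit - shift)  # area beyond the criterion


for d, n in [(0.2, 10), (0.8, 10), (0.8, 40)]:
    print(d, n, round(power(d, n), 3))
```

Holding N at 10 per group, moving from a small to a large effect size raises power substantially, and holding the effect size fixed, moving from 10 to 40 per group raises it again. With an effect size of 0 the function returns α itself, because the Halt distribution then coincides with the H0 distribution.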