Experiment Basics: Control

Psych 231: Research Methods in Psychology

Announcements
Quiz 7 is due Friday.
Exam 2 is two weeks from today.
One week from today (next Monday) I will be out of town, so I will post an online video of the lecture.
In labs: remember to turn in your group project ratings.

Many kinds of Variables
Independent variables (explanatory) Dependent variables (response) Scales of measurement Errors in measurement Reliability & Validity Sampling error Extraneous variables Control variables Random variables Confound variables Many kinds of Variables

Colors and words Divide into two groups:
men women Instructions: Read aloud the COLOR that the words are presented in. When done raise your hand. Women first. Men please close your eyes. Okay ready? Colors and words

Blue Green Red Purple Yellow List 1

Okay, now it is the men’s turn.
Remember the instructions: Read aloud the COLOR that the words are presented in. When done raise your hand. Okay ready?

Blue Green Red Purple Yellow List 2

Our results
So why the difference between the results for men versus women?
Is this support for a theory that proposes “Women are good color identifiers, men are not”? Why or why not?
Let’s look at the two lists.

List 2 Men List 1 Women Blue Green Red Purple Yellow Blue Green Red
Matched Mis-Matched

What resulted in the performance difference?
Our manipulated independent variable (men vs. women)?
The other variable (match/mis-match)?
Because the two variables are perfectly correlated, we can’t tell.
This is the problem with confounds.
[Figure: the IV → DV relationship is our question of interest; the confound co-varies with the IV, so its path to the DV is one that we can’t rule out]

What DIDN’T result in the performance difference?
Extraneous variables
Control variables:
  # of words on the list
  The actual words that were printed
Random variables:
  Age of the men and women in the groups
  Majors, class level, seating in classroom, …
These are not confounds, because they don’t co-vary with the IV.

Experimental Control
Our goal: To test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV.
Control is used to:
  Minimize excessive variability
  Reduce the potential of confounds (systematic variability not part of the research design)

Experimental Control
Our goal: To test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV.
Total variability: T = NRexp + NRother + R
Nonrandom (NR) variability
  NRexp: manipulated independent variables (IV)
    Our hypothesis: the IV will result in changes in the DV
  NRother: extraneous variables (EV) which covary with the IV
    Confounds
Random (R) variability
  Imprecision in measurement (DV)
  Randomly varying extraneous variables (EV)

Experimental Control: Weight analogy
Variability in a simple experiment: T = NRexp + NRother + R
Bigger the weight = more variability from a source
Control group: absence of the treatment (NRexp = 0)
“Perfect experiment”: no confounds (NRother = 0)
[Figure: a scale with the treatment group on one side and the control group on the other, loaded with R, NRexp, and NRother weights]

Variability in a simple experiment: T = NRexp + NRother + R Control group Treatment group We can’t “see” what’s on the scales NR exp R R Difference Detector Our experiment is a “difference detector” This is the only part we “see” Experimental Control: Weight analogy

If there is an effect of the treatment then NRexp will ≠ 0 Control group Treatment group R NR exp R Difference Detector Our experiment can detect the effect of the treatment Experimental Control: Weight analogy

Variability in a simple experiment: Try it out at home: using coins as weights, with your eyes closed, can you tell different combinations apart? Treatment group Control group Difference Detector Bigger the weight = more variability from a source Experimental Control: Weight analogy

Things making detection difficult
Potential Problems Excessive random variability Confounding Treatment group Control group Difference Detector Things making detection difficult

Potential Problems Excessive random variability
If experimental control procedures are not applied Then R component of data will be excessively large, and may make NRexp undetectable Potential Problems

Excessive random variability
If R is large relative to NRexp, then detecting a difference may be difficult.
[Figure: difference detector with large R weights and a small NRexp weight]
The experiment can’t detect the effect of the treatment.

Reduced random variability
But if we reduce the size of NRother and R relative to NRexp, then detecting the effect gets easier.
So try to minimize this variability by using good measures of the DV, good manipulations of the IV, etc.
[Figure: difference detector with small R weights and an NRexp weight]
Our experiment can detect the effect of the treatment.
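
The weight analogy can be put into numbers. The sketch below is a hypothetical illustration, not part of the lecture: both comparisons have the same 10-point treatment effect (NRexp), but the scores are spread out by a small or a large amount of random variability (R). The function names and all of the scores are made up for the example.

```python
import statistics

def effect_relative_to_noise(control, treatment):
    """Difference between group means divided by the pooled standard deviation."""
    diff = statistics.mean(treatment) - statistics.mean(control)
    # Pool the two sample variances (equal group sizes assumed in this sketch).
    pooled_var = (statistics.variance(control) + statistics.variance(treatment)) / 2
    return diff / pooled_var ** 0.5

def make_groups(spread):
    """Build two hypothetical groups whose means differ by 10 points (NRexp)."""
    offsets = [-2, -1, 0, 1, 2]          # stand-in for random variability (R)
    control = [50 + spread * o for o in offsets]
    treatment = [60 + spread * o for o in offsets]
    return control, treatment

small_r = effect_relative_to_noise(*make_groups(spread=1))
large_r = effect_relative_to_noise(*make_groups(spread=10))
print(f"small R: {small_r:.2f}, large R: {large_r:.2f}")
```

The treatment effect is identical in both cases; only the size of R changes how easy it is to see.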

Potential Problems Confound
If an EV co-varies with IV, then NRother component of data will be present, and may lead to misattribution of effect to IV This relationship may or may not exist IV DV Co-vary together EV IV = independent var DV = dependent var EV = extraneous var Potential Problems

Confounding
Confound: hard to detect the effect of NRexp, because the effect looks like it could be from NRexp but could be due to NRother.
[Figure: difference detector with R, NRexp, and NRother weights]
The experiment can detect an effect, but can’t tell where it is from.

Confound: hard to detect the effect of NRexp, because the effect looks like it could be from NRexp but could be due to NRother.
These two situations look the same:
  There is an effect of the IV (NRexp and NRother are both on the scale)
  There is not an effect of the IV (only NRother is on the scale)
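
A small numeric sketch of why the two situations look the same to the difference detector. The function `observed_difference` and the weights passed to it are hypothetical, and R is assumed to weigh the same on both sides of the scale so that it cancels out.

```python
def observed_difference(nr_exp, nr_other):
    """What the 'difference detector' reports: the treatment-minus-control gap.

    R is assumed to weigh the same on both sides, so it cancels out here.
    """
    control_side = 0.0                    # no treatment, no confound
    treatment_side = nr_exp + nr_other    # treatment effect plus confound
    return treatment_side - control_side

real_effect = observed_difference(nr_exp=10.0, nr_other=5.0)
no_effect = observed_difference(nr_exp=0.0, nr_other=15.0)
print(real_effect, no_effect)  # both print 15.0
```

The detector only ever sees the sum, so the same reading is compatible with an IV that works and an IV that does nothing.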

Removing Confounding Confound
Hard to detect the effect of NRexp because the effect looks like it could be from NRexp but could be due to the NRother Use experimental control to spread the variability equally across conditions Use experimental control to eliminate the variability from the confound R NR other R NR exp Difference Detector R NR exp Difference Detector Removing Confounding

Controlling Variability
How do we introduce control? Methods of Experimental Control Constancy/Randomization Comparison Production Controlling Variability

Methods of Controlling Variability
Constancy/Randomization If there is a variable that may be related to the DV that you can’t (or don’t want to) manipulate Control variable: hold it constant (so there isn’t any variability from that variable, no R weight from that variable) Random variable: let it vary randomly across all of the experimental conditions (so the R weight from that variable is the same for all conditions) Methods of Controlling Variability

Comparison
An experiment always makes a comparison, so it must have at least two groups (the 2 sides of our scale in the weight analogy).
Sometimes there are control groups.
  This is often the absence of the treatment (e.g., Training group vs. No training (Control) group).
Without control groups it is harder to see what is really happening in the experiment.
  It is easier to be swayed by plausibility or inappropriate comparisons (see diet crystal example).
Control groups are useful for eliminating potential confounds (think about our list of threats to internal validity).

Comparison An experiment always makes a comparison, so it must have at least two groups Sometimes there are control groups This is often the absence of the treatment Sometimes there are a range of values of the IV 1 week of Training group 2 weeks of Training group 3 weeks of Training group Methods of Controlling Variability

Production The experimenter selects the specific values of the Independent Variables 1 week of Training group 2 weeks of Training group 3 weeks of Training group selects the specific values variability 1 weeks 2 weeks 3 weeks Duration taking the training program Methods of Controlling Variability

Production The experimenter selects the specific values of the Independent Variables 1 week of Training group 2 weeks of Training group 3 weeks of Training group Need to do this carefully Suppose that you don’t find a difference in the DV across your different groups Is this because the IV and DV aren’t related? Or is it because your levels of IV weren’t different enough Methods of Controlling Variability

So far we’ve covered a lot of the general details of experiments
Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs (and potential fixes) Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors Experimental designs

Poorly designed experiments
Bad design example 1: Does standing close to somebody cause them to move? (theory of personal space)
“hmm… that’s an empirical question. Let’s see what happens if …”
So you stand close to people and see how long it takes before they move.
Problem: no control group to establish the comparison group (this design is sometimes called a “one-shot case study design”).
Fix: introduce a (or some) comparison group(s), e.g., Very Close (.2 m), Close (.5 m), Not Close (1.0 m).

Bad design example 2: Does a relaxation program decrease the urge to smoke? 2 groups relaxation training group no relaxation training group The participants choose which group to be in Training group No training (Control) group Poorly designed experiments

Bad design example 2: Non-equivalent control groups
participants → Self Assignment → Training group or No training (Control) group (independent variable) → Measure (dependent variable)
Problem: selection bias for the two groups.
Fix: need to do random assignment to groups.
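
Random assignment is easy to do mechanically. A minimal sketch, assuming participants are identified by made-up ID strings; the function name `randomly_assign` and the seed value are hypothetical choices for the illustration, not a prescribed procedure.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

people = [f"P{n:02d}" for n in range(1, 21)]   # 20 hypothetical participant IDs
groups = randomly_assign(people, ["training", "control"], seed=231)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Because chance, not the participant, decides group membership, pre-existing differences (such as motivation to quit) are not systematically tied to either group.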

Bad design example 3: Does a relaxation program decrease the urge to smoke? Pre-test desire level Give relaxation training program Post-test desire to smoke Poorly designed experiments

Bad design example 3: One group pretest-posttest design
participants → Pre-test (dependent variable) → Training group (independent variable) → Post-test Measure (dependent variable)
Problems include: history, maturation, testing, and more.
Fix: add another factor (a No Training group that also gets the Pre-test and the Post-test Measure), so that Pre vs. Post can be compared across groups.

Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors Experimental designs

1 factor - 2 levels Good design example
How does anxiety level affect test performance? Two groups take the same test Grp1(low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough Grp2 (moderate anxiety group): 5 min lecture on the importance of good grades for success What are our IV and DV? 1 Factor (Independent variable), two levels Basically you want to compare two treatments (conditions) The statistics are pretty easy, a t-test 1 factor - 2 levels

How does anxiety level affect test performance? participants Low Moderate Test Random Assignment Anxiety Dependent Variable 1 factor - 2 levels

How does anxiety level affect test performance?
One factor (anxiety), two levels (low, moderate)
Test performance: low anxiety = 60, moderate anxiety = 80
Use a t-test to see if these points are statistically different:
t = (observed difference between conditions) / (difference expected by chance)
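
The ratio behind the t-test can be computed by hand. The sketch below assumes an independent-groups t-test with pooled variance; the individual scores are made up so that the group means match the 60 and 80 used in the example, and the function name `independent_t` is hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    observed_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    # Standard error: the size of difference expected from chance alone.
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return observed_diff / standard_error

moderate_anxiety = [78, 80, 82]   # made-up scores, mean 80
low_anxiety = [58, 60, 62]        # made-up scores, mean 60
t_value = independent_t(moderate_anxiety, low_anxiety)
print(f"t = {t_value:.2f} with {len(moderate_anxiety) + len(low_anxiety) - 2} df")
```

A large t means the observed difference is big compared with what chance alone would produce; whether it counts as statistically significant still depends on comparing it with the critical value for its degrees of freedom.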

1 factor - 2 levels Advantages:
Simple, relatively easy to interpret the results Is the independent variable worth studying? If no effect, then usually don’t bother with a more complex design Sometimes two levels is all you need One theory predicts one pattern and another predicts a different pattern 1 factor - 2 levels

1 factor - 2 levels Interpolation Disadvantages:
“True” shape of the function is hard to see
Interpolation and extrapolation are not a good idea
Interpolation: what happens within the range that you test (between low and moderate anxiety)?

1 factor - 2 levels Extrapolation Disadvantages:
“True” shape of the function is hard to see Interpolation and Extrapolation are not a good idea Extrapolation low moderate test performance anxiety What happens outside of the ranges that you test? high 1 factor - 2 levels

Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors Experimental designs

1 Factor - multilevel experiments
For more complex theories you will typically need more complex designs (more than two levels of one IV) 1 factor - more than two levels Basically you want to compare more than two conditions The statistics are a little more difficult, an ANOVA (Analysis of Variance) 1 Factor - multilevel experiments

Good design example (similar to earlier ex.) How does anxiety level affect test performance? Groups take the same test Grp1(low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough Grp2 (moderate anxiety group): 5 min lecture on the importance of good grades for success Grp3 (high anxiety group): 5 min lecture on how the students must pass this test to pass the course 1 Factor - multilevel experiments

1 factor - 3 levels participants Low Moderate Test Random Assignment
Anxiety Dependent Variable High 1 factor - 3 levels

low mod test performance anxiety anxiety low mod high high 80 60 60 1 Factor - multilevel experiments

Advantages Gives a better picture of the relationship (functions other than just straight lines) Generally, the more levels you have, the less you have to worry about your range of the independent variable low moderate test performance anxiety 2 levels high low mod test performance anxiety 3 levels 1 Factor - multilevel experiments

Disadvantages Needs more resources (participants and/or stimuli) Requires more complex statistical analysis (ANOVA [Analysis of Variance] & follow-up pair-wise comparisons) 1 Factor - multilevel experiments

Pair-wise comparisons
The ANOVA just tells you that not all of the groups are equal. If this is your conclusion (you get a “significant ANOVA”) then you should do further tests to see where the differences are High vs. Low High vs. Moderate Low vs. Moderate Pair-wise comparisons
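
A minimal sketch of both steps, assuming a one-way between-groups design. The scores are made up, the function name `one_way_f` is hypothetical, and the follow-up step here only lists which pairs of levels to compare; it does not run the pair-wise tests or any correction for multiple comparisons.

```python
import itertools
import statistics

def one_way_f(groups):
    """F ratio for a one-way between-groups ANOVA.

    `groups` maps a level name to its list of scores.
    """
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = statistics.mean(all_scores)
    ss_between = sum(
        len(scores) * (statistics.mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    ss_within = sum(
        (score - statistics.mean(scores)) ** 2
        for scores in groups.values()
        for score in scores
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

test_scores = {                 # made-up test scores
    "low": [58, 60, 62],
    "moderate": [78, 80, 82],
    "high": [58, 60, 62],
}
f_ratio = one_way_f(test_scores)
pairs = list(itertools.combinations(test_scores, 2))
print(f"F = {f_ratio:.1f}")
print(pairs)
```

The F ratio alone says only that the three means are not all equal; the list of pairs is where the follow-up comparisons would go.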

So far we’ve covered a lot of the details about experiments generally
Now let’s consider some specific experimental designs. Some bad (but common) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors Experimental designs

Factorial experiments
Two or more factors
Some vocabulary
  Factors - independent variables
  Levels - the levels of your independent variables
  A 2 x 4 design means two independent variables, one with 2 levels (A1, A2) and one with 4 levels (B1, B2, B3, B4)
  The number of “conditions” or “groups” is calculated by multiplying the levels, so a 2 x 4 design has 8 different conditions
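
Crossing the levels can be sketched in a few lines. The helper name `list_conditions` is hypothetical; the level labels follow the A1, A2 and B1 to B4 labels on the slide.

```python
import itertools

def list_conditions(*factors):
    """Cross every level of every factor to enumerate the conditions."""
    return list(itertools.product(*factors))

factor_a = ["A1", "A2"]
factor_b = ["B1", "B2", "B3", "B4"]
conditions = list_conditions(factor_a, factor_b)
print(len(conditions))   # 8
print(conditions[:3])    # [('A1', 'B1'), ('A1', 'B2'), ('A1', 'B3')]
```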

Factorial experiments
Two or more factors
Main effects - the effects of your independent variables ignoring (collapsed across) the other independent variables
Interaction effects - how the effect of one independent variable on the dependent variable depends on the level of another independent variable
Example: 2x2 design, factors A and B
  Interaction: At A1, B1 is bigger than B2; at A2, B1 and B2 don’t differ
Everyday interaction = “it depends on …”

Interaction effects
Rate how much you would want to see a new movie (1 no interest, 5 high interest):
Hail, Caesar! (new Coen Brothers movie in 2016, Feb. 5)
Ask men and women, looking for an effect of gender
Not much of a difference: no effect of gender

Maybe the gender effect depends on whether you know who is in the movie. So you add another factor:
Suppose that George Clooney or Scarlett Johansson might star. You rate the preference if he were to star and if he were not to star. Effect of gender depends on whether George or Scarlett stars in the movie or not This is an interaction Interaction effects A video lecture from ThePsychFiles.com podcast

Results of a 2x2 factorial design
The complexity & number of outcomes increases:
A = main effect of factor A
B = main effect of factor B
AB = interaction of A and B
With 2 factors there are 8 basic possible patterns of results:
1) No effects at all
2) A only
3) B only
4) AB only
5) A & B
6) A & AB
7) B & AB
8) A & B & AB
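
The count of 8 comes from each of the three effects (A, B, AB) being either present or absent. A short sketch that enumerates them; note that it lists the patterns in a different order than the slide does.

```python
import itertools

effects = ["A", "B", "AB"]
patterns = []
# Each effect is either absent (False) or present (True): 2 * 2 * 2 = 8 patterns.
for present in itertools.product([False, True], repeat=len(effects)):
    active = [name for name, flag in zip(effects, present) if flag]
    patterns.append(" & ".join(active) if active else "No effects at all")

for number, pattern in enumerate(patterns, start=1):
    print(f"{number}) {pattern}")
```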

2 x 2 factorial design Interaction of AB A1 A2 B2 B1 Marginal means
What’s the effect of A at B1? What’s the effect of A at B2? Condition mean A1B1 Condition mean A2B1 Marginal means B1 mean B2 mean A1 mean A2 mean Main effect of B Condition mean A1B2 Condition mean A2B2 Main effect of A 2 x 2 factorial design

Examples of outcomes Main effect of A ✓ Main effect of B
Dependent Variable B1 B2 30 60 45 60 45 30 30 60 Main Effect of A Main effect of A ✓ Main effect of B X Interaction of A x B X Examples of outcomes

Examples of outcomes Main effect of A Main effect of B ✓
Dependent Variable B1 B2 60 60 60 30 30 30 45 45 Main Effect of A Main effect of A X Main effect of B ✓ Interaction of A x B X Examples of outcomes

Examples of outcomes Main effect of A Main effect of B
Dependent Variable B1 B2 60 30 45 60 45 30 45 45 Main Effect of A Main effect of A X Main effect of B X Interaction of A x B ✓ Examples of outcomes

Examples of outcomes Main effect of A ✓ Main effect of B ✓
Dependent Variable B1 B2 30 60 45 30 30 30 30 45 Main Effect of A Main effect of A ✓ Main effect of B ✓ Interaction of A x B ✓ Examples of outcomes
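
The marginal means and the three effects in these examples can be computed directly from the four condition means. The sketch below is descriptive only (it does not test significance). The function name `summarize_2x2` is hypothetical, and the assignment of the 30s and 60s to particular cells is one reading of the slide figures, chosen to reproduce the "A only" and "AB only" patterns.

```python
def summarize_2x2(cells):
    """Marginal means and effect sizes from the four cell means of a 2x2 design.

    `cells` maps (level of A, level of B) to that condition's mean.
    """
    a1 = (cells["A1", "B1"] + cells["A1", "B2"]) / 2   # marginal mean of A1
    a2 = (cells["A2", "B1"] + cells["A2", "B2"]) / 2   # marginal mean of A2
    b1 = (cells["A1", "B1"] + cells["A2", "B1"]) / 2   # marginal mean of B1
    b2 = (cells["A1", "B2"] + cells["A2", "B2"]) / 2   # marginal mean of B2
    effect_of_a_at_b1 = cells["A2", "B1"] - cells["A1", "B1"]
    effect_of_a_at_b2 = cells["A2", "B2"] - cells["A1", "B2"]
    return {
        "main_effect_A": a2 - a1,
        "main_effect_B": b2 - b1,
        # Nonzero when the effect of A changes across the levels of B.
        "interaction_AB": effect_of_a_at_b1 - effect_of_a_at_b2,
    }

a_only = {("A1", "B1"): 30, ("A2", "B1"): 60, ("A1", "B2"): 30, ("A2", "B2"): 60}
ab_only = {("A1", "B1"): 60, ("A2", "B1"): 30, ("A1", "B2"): 30, ("A2", "B2"): 60}
print(summarize_2x2(a_only))
print(summarize_2x2(ab_only))
```

In the second pattern every marginal mean is 45, so both main effects are zero even though the effect of A reverses between B1 and B2.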

Anxiety and Test Performance
Let’s add another variable: test difficulty. anxiety low mod high 80 35 50 70 80 main effect of difficulty test performance high low mod anxiety easy easy medium hard Test difficulty 80 80 80 medium 65 80 hard 65 80 60 main effect of anxiety Yes: effect of anxiety depends on level of test difficulty Interaction ? Anxiety and Test Performance

Factorial Designs Advantages Interaction effects
Consider the interaction effects before trying to interpret the main effects Adding factors decreases the variability Because you’re controlling more of the variables that influence the dependent variable This increases the statistical Power of the statistical tests Increases generalizability of the results Because you have a situation closer to the real world (where all sorts of variables are interacting) Factorial Designs

Factorial Designs Disadvantages
Experiments become very large and unwieldy
The statistical analyses get much more complex
Interpretation of the results can get hard
  In particular for higher-order interactions (interactions among more than two factors, e.g., ABC)

Factorial designs Consider the results of our class experiment
Main effect of word type Main effect of depth of processing No Interaction between word type and depth of processing Dr. Kahn's reporting stats page Factorial designs

So far we’ve covered a lot of the details about experiments generally
Now let’s consider some specific experimental designs. Some bad (but common) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors Experimental designs

What is the effect of presenting words in color on memory for those words?
Clock Chair Cab So you present lists of words for recall either in color or in black-and-white. Two different designs to examine this question Example

Between-Groups Factor
2-levels Each of the participants is in only one level of the IV levels Clock Chair Cab Colored words participants Test Clock Chair Cab BW words

Within-Groups Factor levels participants Colored words BW Test Clock
Sometimes called “repeated measures” design 2-levels, All of the participants are in both levels of the IV levels participants Colored words BW Test Clock Chair Cab Clock Chair Cab

Between vs. Within Subjects Designs
Within-subjects designs
  All participants participate in all of the conditions of the experiment.
Between-subjects designs
  Each participant participates in one and only one condition of the experiment.

Between subjects designs
Advantages: Independence of groups (levels of the IV) Harder to guess what the experiment is about without experiencing the other levels of IV Exposure to different levels of the independent variable(s) cannot “contaminate” the dependent variable Sometimes this is a ‘must,’ because you can’t reverse the effects of prior exposure to other levels of the IV No order effects to worry about Counterbalancing is not required participants Colored words BW Test Clock Chair Cab Between subjects designs

Between subjects designs
participants Colored words BW Test Clock Chair Cab Disadvantages Individual differences between the people in the groups Excessive variability Non-Equivalent groups Between subjects designs

Individual differences
The groups are composed of different individuals participants Colored words BW Test Individual differences

The groups are composed of different individuals participants Colored words BW Test Excessive variability due to individual differences Harder to detect the effect of the IV if there is one R NR Individual differences

The groups are composed of different individuals participants Colored words BW Test Non-Equivalent groups (possible confound) The groups may differ not only because of the IV, but also because the groups are composed of different individuals Individual differences

Dealing with Individual Differences
Strive for equivalent groups
  Created equally - use the same process to create both groups
  Treated equally - keep the experience as similar as possible for the two groups
  Composed of equivalent individuals
    Random assignment to groups - eliminate bias
    Matching groups - match each individual in one group to an individual in the other group on relevant characteristics

Matching groups
Trying to create equivalent groups
Also trying to reduce some of the overall variability
  Eliminating variability from the variables that you matched people on
Example (matched on Color, Height, Age): each person in Group A has a match in Group B
  Red, short, 21 yrs
  Blue, tall, 23 yrs
  Green, average, 22 yrs
  Brown, tall, 22 yrs
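
One way to build matched groups is to sort people on the matching variable, pair up neighbors, and let chance decide which member of each pair goes to which group. This is a sketch under that assumption, matching on age only; the function name `matched_assignment`, the participant IDs, and the ages are all hypothetical.

```python
import random

def matched_assignment(participants, key, seed=None):
    """Pair up participants with adjacent values of `key`, then split each pair at random."""
    rng = random.Random(seed)
    ordered = sorted(participants, key=key)
    group_a, group_b = [], []
    # Walk through the sorted list two people at a time.
    for first, second in zip(ordered[0::2], ordered[1::2]):
        pair = [first, second]
        rng.shuffle(pair)          # a coin flip decides who goes where
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical participants: (ID, age in years).
people = [("P1", 21), ("P2", 23), ("P3", 22), ("P4", 21),
          ("P5", 23), ("P6", 22), ("P7", 22), ("P8", 22)]
group_a, group_b = matched_assignment(people, key=lambda p: p[1], seed=231)
print(sorted(age for _, age in group_a))
print(sorted(age for _, age in group_b))
```

Both groups end up with the same set of ages, so age cannot account for a difference between them.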

Between vs. Within Subjects Designs
Between-subjects designs Each participant participates in one and only one condition of the experiment. Within-subjects designs All participants participate in all of the conditions of the experiment. participants Colored words BW Test participants Colored words BW Test Between vs. Within Subjects Designs

Within subjects designs
Advantages: Don’t have to worry about individual differences Same people in all the conditions Variability between conditions is smaller (statistical advantage) Fewer participants are required Within subjects designs

Disadvantages Range effects Order effects: Carry-over effects Progressive error Counterbalancing is probably necessary to address these order effects Within subjects designs

Range effects – (context effects) can cause a problem The range of values for your levels may impact performance (typically best performance in middle of range). Since all the participants get the full range of possible values, they may “adapt” their performance (the DV) to this range. Within subjects designs

Order effects Carry-over effects
Transfer between conditions is possible Effects may persist from one condition into another e.g. Alcohol vs no alcohol experiment on the effects on hand-eye coordination. Hard to know how long the effects of alcohol may persist. test Condition 2 Condition 1 How long do we wait for the effects to wear off? Order effects

Order effects Progressive error
Practice effects – improvement due to repeated practice Fatigue effects – performance deteriorates as participants get bored, tired, distracted Order effects

Dealing with order effects
Counterbalancing is probably necessary
This is used to control for “order effects”
Ideally, use every possible order (n!; e.g., AB = 2! = 2 orders, ABC = 3! = 6 orders, ABCD = 4! = 24 orders, etc.)
All counterbalancing assumes symmetrical transfer
  The assumption that AB and BA have reverse effects and thus cancel out in a counterbalanced design
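
Listing every possible order shows how quickly full counterbalancing grows. A minimal sketch; the helper name `all_orders` is hypothetical.

```python
import itertools
import math

def all_orders(conditions):
    """Every possible ordering of the conditions (full counterbalancing)."""
    return ["".join(order) for order in itertools.permutations(conditions)]

print(all_orders("AB"))           # ['AB', 'BA']
print(len(all_orders("ABC")))     # 6
print(len(all_orders("ABCD")))    # 24
assert len(all_orders("ABCD")) == math.factorial(4)
```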

Counterbalancing Simple case Two conditions A & B
Two counterbalanced orders: AB BA participants Colored words BW Test Counterbalancing

Often it is not practical to use every possible ordering
Partial counterbalancing
Latin square designs: a form of partial counterbalancing in which each group of trials occurs in each position an equal number of times

Partial counterbalancing
Example: consider four conditions Recall: ABCD = 4! = 24 possible orders 1) Unbalanced Latin square: each condition appears in each position (4 orders) D C B A Order 1 Order 2 Order 3 Order 4 A D C B B A D C C B A D Partial counterbalancing
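
One simple way to get such a square is to rotate the list of conditions one step per order. This is a sketch of that construction, not necessarily how the square on the slide was built, so the specific orders it prints may differ from the slide's; the function name `rotation_latin_square` is hypothetical.

```python
def rotation_latin_square(conditions):
    """Build one order per condition by rotating the list one step at a time."""
    n = len(conditions)
    return [
        [conditions[(start + position) % n] for position in range(n)]
        for start in range(n)
    ]

orders = rotation_latin_square(["A", "B", "C", "D"])
for number, order in enumerate(orders, start=1):
    print(f"Order {number}:", " ".join(order))

# Check the defining property: every condition shows up once in every position.
for position in range(4):
    assert sorted(order[position] for order in orders) == ["A", "B", "C", "D"]
```

In this square B immediately follows A in every order where A is not last, which is the kind of sequence imbalance that the "unbalanced" label refers to.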

Partial counterbalancing
Example: consider four conditions Recall: ABCD = 4! = 24 possible orders 2) Balanced Latin square: each condition appears before and after all others (8 orders) A B C D A B D C Partial counterbalancing

Mixed factorial designs
Treat some factors as within-subjects (participants get all levels of that factor) and others as between-subjects (each level of this factor gets a different group of participants). This only works with factorial (multi-factor) designs Mixed factorial designs
