Experiment Basics: Control Psych 231: Research Methods in Psychology
Announcements Due this week in labs - Group project: Methods sections IRB worksheet (including a consent form) Recommended/required: Questionnaires/examples of stimuli, etc. – things that you want to have ready for pilot week (week 10) Group Project ratings sheet Exam 2 two weeks from today
Experimental Control Mythbusters examine: Yawning (4 mins). What sort of sampling method? Why the control group? Should they have confirmed? Probably not: if you do the stats, with this sample size the 4% difference isn’t big enough to reject the null hypothesis. What the stats do: quantify how much random variability (error) there is compared to the observed variability, and help you decide whether the observed variability is likely due to error or to the manipulated variability.
Experimental Control Our goal: To test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV. Control is used to: Minimize excessive variability To reduce the potential of confounds (systematic variability not part of the research design)
Experimental Control Our goal: To test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV. Our hypothesis: the IV will result in changes in the DV. T = NR exp + NR other + R. Nonrandom (NR) Variability: NR exp: Manipulated independent variables (IV); NR other: extraneous variables (EV) which covary with IV (Confounds). Random (R) Variability: Imprecision in measurement (DV); Randomly varying extraneous variables (EV).
Experimental Control: Weight analogy Variability in a simple experiment: T = NR exp + NR other + R. Treatment group: R + NR exp + NR other. Control group: R + NR other; absence of the treatment (NR exp = 0). “Perfect experiment”: no confounds (NR other = 0).
Experimental Control: Weight analogy Variability in a simple experiment: R NR exp R Treatment group Control group T = NR exp + NR other + R Difference Detector Our experiment is a “difference detector”
Experimental Control: Weight analogy If there is an effect of the treatment then NR exp will ≠ 0. Treatment group: R + NR exp. Control group: R. Difference Detector: Our experiment can detect the effect of the treatment.
Things making detection difficult Potential Problems Confounding Excessive random variability Difference Detector
Potential Problems Confound If an EV co-varies with the IV, then the NR other component of the data will be present, and may lead to misattribution of the effect to the IV. [Figure: IV and DV, with the EV and IV co-varying together.]
Confounding One side of the scale: R + NR exp + NR other. Other side: R. Difference Detector: Experiment can detect an effect, but can’t tell where it is from. Confound: Hard to detect the effect of NR exp because the effect looks like it could be from NR exp but could be due to the NR other.
Confounding Confound: Hard to detect the effect of NR exp because the effect looks like it could be from NR exp but could be due to the NR other. Situation 1 (R + NR other vs. R): There is not an effect of the IV. Situation 2 (R + NR exp + NR other vs. R): There is an effect of the IV. To the Difference Detector, these two situations look the same.
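A minimal sketch of why the two situations look the same, assuming the weights simply add as in T = NR exp + NR other + R. The function name and the numbers are hypothetical and chosen only for illustration; the random component is set to zero on both sides to keep the arithmetic visible.

```python
def observed_difference(nr_exp, nr_other, r_treatment=0.0, r_control=0.0):
    """Difference between the two sides of the scale under T = NR_exp + NR_other + R.

    Hypothetical helper: the treatment side carries the manipulation and any
    confound, the control side carries only its own random component.
    """
    treatment_total = nr_exp + nr_other + r_treatment
    control_total = r_control
    return treatment_total - control_total


# Illustrative (made-up) weights: a real effect with no confound...
real_effect = observed_difference(nr_exp=5.0, nr_other=0.0)
# ...versus no effect of the IV at all, but a confound of the same size.
confound_only = observed_difference(nr_exp=0.0, nr_other=5.0)

print(real_effect, confound_only)
```

Both calls produce the same difference, so the detector alone cannot say whether the IV or the confound is responsible.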
Potential Problems Excessive random variability If experimental control procedures are not applied Then R component of data will be excessively large, and may make NR exp undetectable
Excessive random variability Treatment group: R + NR exp. Control group: R. Difference Detector: Experiment can’t detect the effect of the treatment. If R is large relative to NR exp then detecting a difference may be difficult.
Reduced random variability But if we reduce the size of NR other and R relative to NR exp then detecting gets easier. Difference Detector: Our experiment can detect the effect of the treatment. So try to minimize NR other and R by using good measures of the DV, good manipulations of the IV, etc.
Controlling Variability How do we introduce control? Methods of Experimental Control Constancy/Randomization Comparison Production
Methods of Controlling Variability Constancy/Randomization If there is a variable that may be related to the DV that you can’t (or don’t want to) manipulate Control variable: hold it constant (so there isn’t any variability from that variable, no R weight from that variable) Random variable: let it vary randomly across all of the experimental conditions (so the R weight from that variable is the same for all conditions)
Methods of Controlling Variability Comparison An experiment always makes a comparison, so it must have at least two groups (2 sides of our scale in the weight analogy). Sometimes there are control groups. This is often the absence of the treatment: Training group vs. No training (Control) group. Without control groups it is harder to see what is really happening in the experiment. It is easier to be swayed by plausibility or inappropriate comparisons (see diet crystal example). Useful for eliminating potential confounds (think about our list of threats to internal validity).
Methods of Controlling Variability Comparison An experiment always makes a comparison, so it must have at least two groups. Sometimes there are control groups. This is often the absence of the treatment. Sometimes there is a range of values of the IV: 1 week of Training group, 2 weeks of Training group, 3 weeks of Training group.
Methods of Controlling Variability Production The experimenter selects the specific values of the Independent Variables. Example: duration of the training program: 1 week of Training group, 2 weeks of Training group, 3 weeks of Training group.
Methods of Controlling Variability Production The experimenter selects the specific values of the Independent Variables: 1 week of Training group, 2 weeks of Training group, 3 weeks of Training group. Need to do this carefully. Suppose that you don’t find a difference in the DV across your different groups. Is this because the IV and DV aren’t related? Or is it because your levels of the IV weren’t different enough?
Experimental designs So far we’ve covered a lot of the general details of experiments Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs (and potential fixes) Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors
Poorly designed experiments Bad design example 1: Does standing close to somebody cause them to move? “hmm… that’s an empirical question. Let’s see what happens if …” So you stand close to people and see how long before they move. Problem: no control group to establish the comparison group (this design is sometimes called “one-shot case study design”). Fix: introduce a (or some) comparison group(s): Very Close (.2 m), Close (.5 m), Not Close (1.0 m).
Poorly designed experiments Bad design example 2: Does a relaxation program decrease the urge to smoke? 2 groups relaxation training group no relaxation training group The participants choose which group to be in Training group No training (Control) group
Poorly designed experiments Bad design example 2: Non-equivalent control groups. Design: participants, Self Assignment, Training group vs. No training (Control) group (Independent Variable), Measure (Dependent Variable). Problem: selection bias for the two groups. Fix: need to do Random Assignment to groups.
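Random assignment, the fix for bad design example 2, can be sketched in a few lines. This is one simple way to do it, not the course's prescribed procedure: the function name, participant IDs, and seed are all hypothetical, and the seed is only there so the example is repeatable.

```python
import random


def randomly_assign(participants, group_names, seed=None):
    """Shuffle participants, then deal them out to the groups in turn.

    Hypothetical helper: chance, not the participants themselves, decides
    who ends up in which group.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for index, person in enumerate(shuffled):
        groups[group_names[index % len(group_names)]].append(person)
    return groups


# Made-up participant IDs for illustration.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = randomly_assign(participants, ["training", "control"], seed=231)

for name, members in groups.items():
    print(name, sorted(members))
```

Dealing the shuffled list out in turn keeps the group sizes equal, which a coin flip per participant would not guarantee.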
Poorly designed experiments Bad design example 3: Does a relaxation program decrease the urge to smoke? Pre-test desire level Give relaxation training program Post-test desire to smoke
Poorly designed experiments Bad design example 3: One group pretest-posttest design. Design: participants, Pre-test Measure, Training group, Post-test Measure. Independent Variable: Pre vs. Post. Dependent Variable: Measure. Problems include: history, maturation, testing, and more. Fix: Add another factor (a No Training group that also gets the Pre-test Measure and Post-test Measure).
Experimental designs So far we’ve covered a lot of the general details of experiments Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors
1 factor - 2 levels Good design example How does anxiety level affect test performance? Two groups take the same test Grp1(low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough Grp2 (moderate anxiety group): 5 min lecture on the importance of good grades for success 1 Factor (Independent variable), two levels Basically you want to compare two treatments (conditions) The statistics are pretty easy, a t-test What are our IV and DV?
1 factor - 2 levels Good design example How does anxiety level affect test performance? Design: participants, Random Assignment, Anxiety (Low vs. Moderate), Test (Dependent Variable).
1 factor - 2 levels Good design example How does anxiety level affect test performance? One factor (anxiety), two levels (low, moderate). [Figure: test performance by anxiety (low, moderate); values 80 and 60 shown.] Use a t-test to see if these points are statistically different. t = (Observed difference between conditions) / (Difference expected by chance)
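A rough sketch of the t ratio described above, assuming an independent-samples t-test with a pooled variance estimate (the slide does not say which version of the t-test is meant). The function name and the scores are invented for illustration and are not data from the course.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """t = observed difference between conditions / difference expected by chance.

    Hypothetical helper using the pooled-variance form for two independent groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    observed_difference = mean(group_a) - mean(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    # Standard error: how big a difference we would expect from chance alone.
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return observed_difference / standard_error


# Made-up test scores for two groups of five participants.
group_1_scores = [82, 78, 85, 80, 75]
group_2_scores = [62, 58, 65, 60, 55]

t_value = independent_t(group_1_scores, group_2_scores)
print(round(t_value, 2))
```

The larger the observed difference is relative to the chance-level difference in the denominator, the larger t becomes.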
Advantages: Simple, relatively easy to interpret the results Is the independent variable worth studying? If no effect, then usually don’t bother with a more complex design Sometimes two levels is all you need One theory predicts one pattern and another predicts a different pattern 1 factor - 2 levels
1 factor - 2 levels Disadvantages: “True” shape of the function is hard to see. Interpolation and Extrapolation are not a good idea. Interpolation: What happens within the ranges that you test? [Figure: test performance by anxiety (low, moderate).]
1 factor - 2 levels Disadvantages: “True” shape of the function is hard to see. Interpolation and Extrapolation are not a good idea. Extrapolation: What happens outside of the ranges that you test? [Figure: test performance by anxiety (low, moderate, high).]
Experimental designs So far we’ve covered a lot of the general details of experiments Now let’s consider some specific experimental designs. Some bad (but not uncommon) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors
1 Factor - multilevel experiments For more complex theories you will typically need more complex designs (more than two levels of one IV) 1 factor - more than two levels Basically you want to compare more than two conditions The statistics are a little more difficult, an ANOVA (Analysis of Variance)
1 Factor - multilevel experiments Good design example (similar to earlier ex.) How does anxiety level affect test performance? Groups take the same test. Grp1 (low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough. Grp2 (moderate anxiety group): 5 min lecture on the importance of good grades for success. Grp3 (high anxiety group): 5 min lecture on how the students must pass this test to pass the course.
1 factor - 3 levels Design: participants, Random Assignment, Anxiety (Low, Moderate, High), Test (Dependent Variable).
1 Factor - multilevel experiments [Figure: test performance by anxiety (low, mod, high); values 80 and 60 shown.]
1 Factor - multilevel experiments Advantages: Gives a better picture of the relationship (functions other than just straight lines). Generally, the more levels you have, the less you have to worry about your range of the independent variable. [Figures: test performance by anxiety with 2 levels (low, moderate) and with 3 levels (low, mod, high).]
1 Factor - multilevel experiments Disadvantages Needs more resources (participants and/or stimuli) Requires more complex statistical analysis (ANOVA [Analysis of Variance] & follow-up pair-wise comparisons)
Pair-wise comparisons The ANOVA just tells you that not all of the groups are equal. If this is your conclusion (you get a “significant ANOVA”) then you should do further tests to see where the differences are: High vs. Low, High vs. Moderate, Low vs. Moderate.
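To see how the list of follow-up comparisons grows with the number of levels, here is a small sketch that only enumerates the pairs; it does not run any statistical test. The function name is hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(levels):
    """Return every unordered pair of levels that a follow-up test would compare."""
    return list(combinations(levels, 2))


anxiety_levels = ["Low", "Moderate", "High"]
pairs = pairwise_comparisons(anxiety_levels)

for first, second in pairs:
    print(f"{first} vs. {second}")
```

With k levels there are k(k-1)/2 pairs, so three levels give three comparisons and four levels would give six.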
Experimental designs So far we’ve covered a lot of the general details of experiments Now let’s consider some specific experimental designs. Some bad (but common) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors
Factorial experiments Two or more factors. Some vocabulary: Factors - independent variables. Levels - the levels of your independent variables. 2 x 4 design means two independent variables, one with 2 levels and one with 4 levels. The number of “conditions” or “groups” is calculated by multiplying the levels, so a 2x4 design has 8 different conditions (A1, A2 crossed with B1, B2, B3, B4).
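The "multiply the levels" rule can be checked by listing the conditions of a 2 x 4 design directly. This is only an illustrative sketch; the function name is hypothetical and the level labels follow the A1/A2 and B1 to B4 naming used above.

```python
from itertools import product


def conditions(*factors):
    """Cross every level of each factor with every level of the others."""
    return list(product(*factors))


factor_a = ["A1", "A2"]
factor_b = ["B1", "B2", "B3", "B4"]
design = conditions(factor_a, factor_b)

print(len(design))  # number of conditions = 2 * 4
for condition in design:
    print(condition)
```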
Factorial experiments Two or more factors. Main effects - the effects of your independent variables ignoring (collapsed across) the other independent variables. Interaction effects - how the effect of one independent variable depends on the level of another independent variable. Everyday interaction = “it depends on …” Example: 2x2 design, factors A and B. Interaction: At A1, B1 is bigger than B2; at A2, B1 and B2 don’t differ. [Figure: Dependent Variable by A (A1, A2), with separate lines for B1 and B2.]
Interaction effects Rate how much you would want to see a new movie (1 no interest, 5 high interest) Ask men and women – looking for an effect of gender Not much of a difference
Interaction effects Maybe the gender effect depends on whether you know who is in the movie. So you add another factor: Suppose that George Clooney might star. You rate the preference if he were to star and if he were not to star. Effect of gender depends on whether George stars in the movie or not This is an interaction
Results of a 2x2 factorial design The complexity & number of outcomes increases: A = main effect of factor A B = main effect of factor B AB = interaction of A and B With 2 factors there are 8 basic possible patterns of results: 1) No effects at all 2) A only 3) B only 4) AB only 5) A & B 6) A & AB 7) B & AB 8) A & B & AB
2 x 2 factorial design
                  A1                     A2                     Marginal means
B1                Condition mean A1B1    Condition mean A2B1    B1 mean
B2                Condition mean A1B2    Condition mean A2B2    B2 mean
Marginal means    A1 mean                A2 mean
Main effect of A: compare the A1 mean and A2 mean. Main effect of B: compare the B1 mean and B2 mean. Interaction of AB: What’s the effect of A at B1? What’s the effect of A at B2?
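The marginal means and the two "effect of A at ..." questions can be worked through numerically. This is a descriptive sketch only: the function name and cell means are made up, and it computes differences between means rather than testing whether they are statistically reliable.

```python
from statistics import mean


def summarize_2x2(cells):
    """Marginal means, main effects, and interaction contrast for a 2 x 2 design.

    `cells` maps (A level, B level) to that condition's mean.
    """
    a1_mean = mean([cells["A1", "B1"], cells["A1", "B2"]])
    a2_mean = mean([cells["A2", "B1"], cells["A2", "B2"]])
    b1_mean = mean([cells["A1", "B1"], cells["A2", "B1"]])
    b2_mean = mean([cells["A1", "B2"], cells["A2", "B2"]])
    effect_of_a_at_b1 = cells["A2", "B1"] - cells["A1", "B1"]
    effect_of_a_at_b2 = cells["A2", "B2"] - cells["A1", "B2"]
    return {
        "main_effect_a": a2_mean - a1_mean,
        "main_effect_b": b2_mean - b1_mean,
        # An interaction shows up when the effect of A differs across levels of B.
        "interaction": effect_of_a_at_b1 - effect_of_a_at_b2,
    }


# Made-up condition means for illustration.
cell_means = {
    ("A1", "B1"): 10.0, ("A2", "B1"): 20.0,
    ("A1", "B2"): 10.0, ("A2", "B2"): 10.0,
}
summary = summarize_2x2(cell_means)
print(summary)
```

In this made-up case A matters at B1 but not at B2, so the interaction contrast is nonzero even though both marginal comparisons also differ.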
Main effect of A Main effect of B Interaction of A x B A B A1 A2 B1 B2 Main Effect of A Main Effect of B A A1 A2 Dependent Variable B1 B2 ✓ X X Examples of outcomes
Main effect of A Main effect of B Interaction of A x B A B A1 A2 B1 B2 Main Effect of A Main Effect of B A A1 A2 Dependent Variable B1 B2 ✓ X X Examples of outcomes
Main effect of A Main effect of B Interaction of A x B A B A1 A2 B1 B2 Main Effect of A Main Effect of B A A1 A2 Dependent Variable B1 B2 ✓ X X Examples of outcomes
Main effect of A Main effect of B Interaction of A x B A B A1 A2 B1 B2 Main Effect of A Main Effect of B A A1 A2 Dependent Variable B1 B2 ✓ ✓ ✓ Examples of outcomes
Anxiety and Test Performance Let’s add another variable: test difficulty (easy, medium, hard). [Figure: test performance by anxiety (low, mod, high), with separate lines for easy, medium, and hard tests; the margins show the main effect of difficulty and the main effect of anxiety.] Interaction? Yes: the effect of anxiety depends on the level of test difficulty.
Factorial Designs Advantages: Interaction effects: consider the interaction effects before trying to interpret the main effects. Adding factors decreases the variability, because you’re controlling more of the variables that influence the dependent variable; this increases the statistical Power of the statistical tests. Increases generalizability of the results, because you have a situation closer to the real world (where all sorts of variables are interacting).
Factorial Designs Disadvantages: Experiments become very large and unwieldy. The statistical analyses get much more complex. Interpretation of the results can get hard, in particular for higher-order interactions (interactions among more than two factors, e.g., ABC).
Factorial designs Consider the results of our class experiment Main effect of word type Main effect of depth of processing No Interaction between word type and depth of processing Dr. Kahn's reporting stats page
Experimental designs So far we’ve covered a lot of the general details of experiments Now let’s consider some specific experimental designs. Some bad (but common) designs Some good designs 1 Factor, two levels 1 Factor, multi-levels Factorial (more than 1 factor) Between & within factors
Example What is the effect of presenting words in color on memory for those words? So you present lists of words (e.g., Clock, Chair, Cab) for recall either in color or in black-and-white. Two different designs to examine this question.
Between-Groups Factor 2 levels (Colored words, BW words). Each of the participants is in only one level of the IV. Design: participants, Colored words or BW words, Test.
Within-Groups Factor 2 levels (Colored words, BW words). All of the participants are in both levels of the IV. Sometimes called “repeated measures” design. Design: participants, Colored words and BW words, Test.
Between vs. Within Subjects Designs Within-subjects designs: All participants participate in all of the conditions of the experiment. Between-subjects designs: Each participant participates in one and only one condition of the experiment.
Between subjects designs Advantages: Independence of groups (levels of the IV). Harder to guess what the experiment is about without experiencing the other levels of the IV. Exposure to different levels of the independent variable(s) cannot “contaminate” the dependent variable. Sometimes this is a ‘must,’ because you can’t reverse the effects of prior exposure to other levels of the IV. No order effects to worry about. Counterbalancing is not required.
Between subjects designs Disadvantages: Individual differences between the people in the groups: Excessive variability; Non-Equivalent groups.
Individual differences The groups are composed of different individuals participants Colored words BW words Test
Individual differences The groups are composed of different individuals. Excessive variability due to individual differences: harder to detect the effect of the IV if there is one.
Individual differences The groups are composed of different individuals. Non-Equivalent groups (possible confound): the groups may differ not only because of the IV, but also because the groups are composed of different individuals.
Dealing with Individual Differences Strive for Equivalent groups. Created equally - use the same process to create both groups. Treated equally - keep the experience as similar as possible for the two groups. Composed of equivalent individuals: Random assignment to groups - eliminate bias. Matching groups - match each individual in one group to an individual in the other group on relevant characteristics.
Matching groups Matched groups: Trying to create equivalent groups. Also trying to reduce some of the overall variability: eliminating variability from the variables that you matched people on.
Matched on: Color, Height, Age
Group A                   Group B
Red, short, 21 yrs        matched    Red, short, 21 yrs
Blue, tall, 23 yrs        matched    Blue, tall, 23 yrs
Green, average, 22 yrs    matched    Green, average, 22 yrs
Brown, tall, 22 yrs       matched    Brown, tall, 22 yrs
Between vs. Within Subjects Designs Within-subjects designs: All participants participate in all of the conditions of the experiment. Between-subjects designs: Each participant participates in one and only one condition of the experiment.
Within subjects designs Advantages: Don’t have to worry about individual differences Same people in all the conditions Variability between conditions is smaller (statistical advantage) Fewer participants are required
Within subjects designs Disadvantages Range effects Order effects: Carry-over effects Progressive error Counterbalancing is probably necessary to address these order effects
Within subjects designs Range effects – (context effects) can cause a problem The range of values for your levels may impact performance (typically best performance in middle of range). Since all the participants get the full range of possible values, they may “adapt” their performance (the DV) to this range.
Order effects Carry-over effects: Transfer between conditions is possible (Condition 1, test, Condition 2, test). Effects may persist from one condition into another, e.g., an alcohol vs. no alcohol experiment on the effects on hand-eye coordination. Hard to know how long the effects of alcohol may persist. How long do we wait for the effects to wear off?
Order effects Progressive error Practice effects – improvement due to repeated practice Fatigue effects – performance deteriorates as participants get bored, tired, distracted
Dealing with order effects Counterbalancing is probably necessary This is used to control for “order effects” Ideally, use every possible order (n!, e.g., AB = 2! = 2 orders; ABC = 3! = 6 orders, ABCD = 4! = 24 orders, etc ). All counterbalancing assumes Symmetrical Transfer The assumption that AB and BA have reverse effects and thus cancel out in a counterbalanced design
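The n! growth of full counterbalancing is easy to see by listing the orders. In this sketch the function names and participant IDs are hypothetical, and cycling through the orders is just one simple way to use each order about equally often.

```python
from itertools import permutations
from math import factorial


def all_orders(conditions):
    """Every possible ordering of the conditions (n! of them)."""
    return ["".join(order) for order in permutations(conditions)]


def assign_orders(participant_ids, orders):
    """Cycle through the orders so each one is used about equally often."""
    return {pid: orders[i % len(orders)] for i, pid in enumerate(participant_ids)}


for conditions in ("AB", "ABC", "ABCD"):
    print(conditions, len(all_orders(conditions)), factorial(len(conditions)))

# Made-up participant IDs: six people spread over the six orders of A, B, C.
schedule = assign_orders([f"P{n}" for n in range(1, 7)], all_orders("ABC"))
print(schedule)
```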
Counterbalancing Simple case: two conditions, A & B. Two counterbalanced orders: AB and BA. Design: some participants get Colored words, Test, then BW words, Test; the others get BW words, Test, then Colored words, Test.
Counterbalancing Often it is not practical to use every possible ordering. Partial counterbalancing: Latin square designs – a form of partial counterbalancing, so that each group of trials occurs in each position an equal number of times.
Partial counterbalancing Example: consider four conditions. Recall: ABCD = 4! = 24 possible orders. 1) Unbalanced Latin square: each condition appears in each position (4 orders). Order 1 to Order 4: DCBA, ADCB, BADC, CBAD.
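The defining property of the unbalanced Latin square (every condition once in every position) can be checked mechanically. In the sketch below, the function names are hypothetical, and the cyclic-shift generator is one common construction, not necessarily how the square above was built.

```python
def is_latin_square(orders):
    """True if each condition appears exactly once in every order and every position."""
    conditions = set(orders[0])
    if len(orders) != len(conditions):
        return False
    rows_ok = all(
        len(order) == len(conditions) and set(order) == conditions for order in orders
    )
    columns_ok = all(set(column) == conditions for column in zip(*orders))
    return rows_ok and columns_ok


def cyclic_latin_square(conditions):
    """Build a Latin square by shifting the starting condition one step per order."""
    n = len(conditions)
    return [
        "".join(conditions[(start + offset) % n] for offset in range(n))
        for start in range(n)
    ]


slide_orders = ["DCBA", "ADCB", "BADC", "CBAD"]
print(is_latin_square(slide_orders))
print(cyclic_latin_square("ABCD"))
```

This check covers position only; it says nothing about which conditions come immediately before or after one another, which is the extra concern of a balanced square.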
Partial counterbalancing Example: consider four conditions. Recall: ABCD = 4! = 24 possible orders. 2) Balanced Latin square: each condition appears before and after all others (8 orders): ABDC, BCAD, CDBA, DACB, ABCD, BCDA, CDAB, DABC.
Mixed factorial designs Treat some factors as within-subjects (participants get all levels of that factor) and others as between-subjects (each level of this factor gets a different group of participants). This only works with factorial (multi-factor) designs