Experiments & Threats to Validity


1 Experiments & Threats to Validity

2 Overview
Experimental research: definition and terms, characteristics, types
Definition of internal & external validity
Threats to internal validity
Threats to external validity
Threats to both

3 Validity of a Study

4 Validity (of a study)
Internal validity
- Is there only one unambiguous explanation? Have you eliminated confounds?
- Can the changes in the DV be attributed to the manipulation of the IV?
External validity
- Do the results generalize?
- Can the results apply to populations, situations, and measurement procedures that were not part of the original study?

Internal validity deals with the degree to which we can provide clear explanations for the relations between our variables. Is there only one unambiguous explanation? Have we eliminated potential confounds? If so, then we have high internal validity. External validity deals with how well our results generalize. Is the sample similar to the population to which the researcher wishes to generalize? If I sampled students at KSU, are my results valid for students at a different institution, or for all college students? We ask to what degree our results represent a wider population of interest. People differ in personality, region of upbringing, learning style, and so on, so the results may not generalize to people who differ substantially on those variables. We must also ask whether our results would generalize to other settings: say we found a difference in kids' reading comprehension in the lab; would that difference hold in the classroom or at home? Finally, consider how our measurement procedures affect the likelihood of results generalizing. For ethical reasons, many studies measure aggression with noise blasts: participants are given an opportunity to behave aggressively by delivering blasts of noise to someone (sometimes a confederate, sometimes not even a real person). Does this measure of aggression generalize to other ways in which aggression is measured?

5 Threats to Internal Validity
Confound variables
1. Threats for all studies: environmental variables
- E.g., time of day, room size, room temperature, noise
2. Threats when comparing groups
- Assignment bias (e.g., a non-equivalent control group)
- Groups are not equal (e.g., in personality characteristics or experiences)

There are many threats to internal validity. These are confound variables: variables that serve as alternative explanations. Some threats are a concern in all studies; these are environmental variables. They could be extraneous or confounding variables, and we always have to be on the lookout for them. Previous research is helpful for judging whether a given variable is a strong or weak threat. Other threats arise when comparing groups. One is assignment bias: we want to compare two groups, but the groups are not equal to begin with. Both groups should be equal on all sorts of variables at the start: introversion, IQ, SES, aggression, and so on (see the simulation sketched below).
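As a rough illustration of why unequal groups are a problem, here is a minimal Python sketch. It is not from the original slides; the "sign-up order" scenario and all numbers are invented. It compares a biased assignment with a random one on a simulated baseline trait.

```python
# Minimal, hypothetical simulation of assignment bias (illustration only;
# the sign-up scenario and all numbers are invented for this sketch).
import random

random.seed(1)

# Pretend early sign-ups happen to score higher on some baseline trait.
early = [random.gauss(55, 10) for _ in range(100)]
late = [random.gauss(45, 10) for _ in range(100)]
everyone = early + late

def mean(xs):
    return sum(xs) / len(xs)

# Biased assignment: the first 100 sign-ups become the treatment group.
biased_gap = mean(everyone[:100]) - mean(everyone[100:])

# Random assignment: shuffle before splitting into groups.
shuffled = everyone[:]
random.shuffle(shuffled)
random_gap = mean(shuffled[:100]) - mean(shuffled[100:])

print(f"Baseline gap with biased assignment: {biased_gap:.1f}")  # large
print(f"Baseline gap with random assignment: {random_gap:.1f}")  # near zero
```

With the biased split, any post-treatment difference is confounded with the pre-existing gap; with the random split, the groups start out roughly equal.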

6 Threats to Internal Validity
3. Threats over time
- History: outside events occur
- Maturation: systematic changes in people (e.g., age)
- Instrumentation: changes in measurement, scales, or criteria
- Diffusion of treatment: communication across groups can interfere with the manipulation of the IV

There are also threats over time, in studies conducted over weeks, months, or years. History: a specific, uncontrolled event other than the treatment occurs between observations. Imagine we were conducting research on anxiety and then Katrina or 9/11 happens. How valid would those results be? Might participants be distracted? Are your findings a function of your study or of the environment your participants came from? Or imagine an experiment to heighten racial awareness conducted during February: this is Black History Month, so the results might be affected by events during that month rather than by the treatment. The concern is that one group is affected differently, so the groups are no longer equal, or that the entire sample has been influenced in a way that makes it difficult to isolate the causal relationship between variables.
Maturation: performance changes due to age or fatigue confound the effect of the treatment; there is some intrinsic change in participants. A college develops a new housing plan to promote open-mindedness and acceptance of others. Students are tested when they enter college and when they graduate, and the results show they are now more open-minded and tolerant. Did the housing plan work, or did students simply mature and grow as a result of the college experience?
Instrumentation: unobserved changes in observer criteria or instrument calibration confound the effect of the treatment. Essentially, these are changes in the measuring instruments or in how researchers use them. Imagine a computer that measures reaction time but runs low on memory, so the reaction-time test itself gets slower. "Instruments" can also refer to judges who rate the participants: judges might become better raters of student concentration over time, so the confounding is due not to instrument deterioration but to instrument improvement.
Diffusion of treatment: communication across groups interferes with the manipulation of the independent variable. Suppose there are two groups in a study, a treatment group and a control group, and the treatment group is given information about nutrition and helpful tips for weight loss. If participants in the treatment condition pass this information along to friends who happen to be in the control condition, the control condition has been contaminated, and differences between the two groups can no longer be attributed with confidence to the independent variable.

7 Threats to Internal Validity
3. Threats over time (continued)
- Testing effects (can happen outside within-subjects designs too): changes due to testing; carry-over effects such as fatigue, practice, and context effects
- Statistical regression: regression toward the mean, so extreme scores become less extreme
- Experimental mortality (i.e., attrition): differential dropout rates between groups

Testing effects: testing itself changes how participants respond. Participants asked to complete several long surveys, or a lengthy IQ test, may experience fatigue. Practice effects mean participants get better over time simply through repeated testing. With pre- and post-tests, people frequently do better on the second test just because they know the testing style. I gave a test to my study-skills group on Monday, presented some new concepts on Tuesday, then gave the posttest on Wednesday, and the grades were significantly higher on the posttest. It is possible the grades were higher because the students learned new information, or perhaps they still remembered the questions from the pretest. Carry-over effects occur when a study has several procedures and earlier procedures could have influenced performance on later ones.
Statistical regression: subjects selected for treatment on the basis of their extreme scores tend to move closer to the mean on retesting. After extreme performance, people tend to return to baseline; it is due to random error (see the simulation sketched below). After you do well, maybe your parents praise you; then you do poorly on the next assignment, so they punish you, and you do better. Suddenly they conclude praise doesn't work but punishment does. This is also known as the Sports Illustrated curse: show up on the cover and you will probably lose your next game, because you had to do something fairly unusual to get on the cover in the first place. An experimenter selected students for a reading program based on their low test scores; at the end of the treatment, the test scores had improved. Extreme scores naturally move toward the mean on subsequent tests.
Experimental mortality: differential loss of subjects from the groups of a study results in nonequivalent groups (not always death, just people dropping out). If one group has to do something challenging or painful while another group gets an easy control condition, guess who is more likely to drop out? If all the people who find the condition difficult drop out, you have lost the effect of your manipulation. I taught a new experimental seminar on study skills; about half of the class stopped coming before the semester was over, and the students who remained improved their study skills. So my course was effective? Probably not: the half that stopped coming might not have gained anything, which is why they stopped attending.
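To make the "it's due to random error" point concrete, here is a minimal simulation sketch. It is not from the original slides; the score model and the 10% cutoff are assumptions for illustration. Each observed score is a stable true ability plus random error, and the group selected for its extreme first scores drifts back toward the mean on the second test with no treatment at all.

```python
# Minimal, hypothetical simulation of regression toward the mean
# (score model and numbers are assumptions for illustration only).
import random

random.seed(2)

true_ability = [random.gauss(100, 10) for _ in range(10_000)]
test1 = [a + random.gauss(0, 10) for a in true_ability]  # ability + random error
test2 = [a + random.gauss(0, 10) for a in true_ability]  # fresh error, no treatment

# "Select for the reading program": the lowest-scoring 10% on test 1.
cutoff = sorted(test1)[len(test1) // 10]
selected = [i for i, s in enumerate(test1) if s <= cutoff]

mean_t1 = sum(test1[i] for i in selected) / len(selected)
mean_t2 = sum(test2[i] for i in selected) / len(selected)

print(f"Selected group, test 1 mean: {mean_t1:.1f}")  # well below 100
print(f"Selected group, test 2 mean: {mean_t2:.1f}")  # noticeably closer to 100
```

The selected group "improves" on retest even though nothing was done to them, which is exactly the pattern a reading-program evaluation based on extreme scorers could mistake for a treatment effect.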

8 Threats to External Validity
1. Generalizing across participants
- Selection bias: bias in the sampling procedure (e.g., college students); volunteer bias
- Participant characteristics
- Cross-species generalization

Generalizing across participants asks whether the results would still hold if the research had been conducted with a different population. Selection bias means there is bias in the sampling procedure. Usually we would like to apply our results to all people, or to a subset such as toddlers or depressed individuals; however, because of logistical, financial, or ethical constraints, we may end up sampling in a biased way. Many psychological studies are conducted with college students because they are a fairly accessible population for academics doing research, but college students can differ from the general public in meaningful ways. Volunteer bias arises because, for ethical reasons, psychological research must be voluntary, and volunteers and nonvolunteers may differ in important ways.
Maximum confidence:
- Volunteers tend to be more highly educated than nonvolunteers
- Volunteers tend to come from a higher social class than nonvolunteers
- Volunteers are of higher intelligence in general, but not when volunteering for atypical research (such as hypnosis or sex research)
- Volunteers have a higher need for approval than nonvolunteers
- Volunteers are more social than nonvolunteers
Considerable confidence:
- Volunteers are more "arousal seeking" than nonvolunteers (especially when the research involves stress)
- Individuals who volunteer for sex research are more unconventional than nonvolunteers
- Females are more likely to volunteer than males, except when the research involves physical or emotional stress
- Volunteers are less authoritarian than nonvolunteers
- Jews are more likely to volunteer than Protestants; however, Protestants are more likely to volunteer than Catholics
- Volunteers tend to be less conforming than nonvolunteers, except when the volunteers are female and the research is clinically oriented
Source: adapted from Rosenthal & Rosnow, 1975.
Participant characteristics include gender, age, race, ethnicity, and SES. Cross-species generalization: rats, gorillas, humans.

9 Threats to External Validity
2. Generalizing across features of the study
- Novelty effects: do participants respond differently in the lab?
- Multiple treatment effects: did participation in earlier treatments affect outcomes?
- Experimenter characteristics: would we get the same results with a different experimenter?

Novelty effects: participants respond differently in the lab. They may interpret things or behave differently from how they normally would in the real world because they know they are in a study. Multiple treatment effects are similar to the testing (carry-over) effects discussed as a threat to internal validity; carry-over effects also threaten external validity. That is, can the results from the treatments be generalized to individuals who have not received the earlier treatments? Experimenter characteristics: would we get the same results with a different experimenter? If the experimenter's personality characteristics matter, that limits our ability to generalize the results of the study.

10 Threats to External Validity
3. Generalizing across features of the measures
- Sensitization: assessment influences participants, making them more aware of their thoughts, feelings, or behavior
- Time of measurement: would we get the same result measuring 1 week after the intervention versus 5 weeks after?

Sensitization: the assessment itself influences participants by making them more aware of their thoughts, feelings, or behavior (self-monitoring). Sometimes we don't realize how we are feeling or which behaviors we are engaging in. Suppose we want to see the effect of a new diet: we ask participants to report the types of food they have been eating, give them the intervention, and then measure food types again. It is possible that the very act of reporting what they eat changes participants' behavior. Time of measurement: measuring 1 week after the intervention may give different results than measuring 5 weeks after.

11 Balancing Internal & External Validity
We want both, but there is usually a trade-off
- Higher internal validity is associated with greater control
- Greater control is associated with lower external validity
We have to decide what's most important; it depends on the purpose of the study
Another reason why the accumulation of information across multiple studies is what matters!

12 Threats to Internal & External Validity
Experimenter bias (i.e., expectancy effect)
- Experimenter expectations influence results
- Not necessarily intentional

Expectancy effects: preconceived ideas or expectations about participants influence how the experimenter treats them; basically a self-fulfilling prophecy. Differential treatment means the experimenter treats the experimental groups differently in ways that produce results consistent with the hypothesis. This can be intentional, but often it is not; the influence may come through tone of voice, body language, enthusiasm, or a variety of other factors. Whenever a researcher knows the predicted outcome of a study and has some level of contact with the participants, the potential for experimenter bias and differential treatment exists. Rosenthal's maze-bright/maze-dull rats: graduate students were asked to train rats to run a maze and were told that some rats were "maze bright" (bred to be especially good at running mazes) while others were ordinary "maze dull" rats. Sure enough, the maze-bright rats learned the maze faster than the maze-dull rats. The kicker: there was really no difference between the rats; the researchers had essentially lied to the grad students. What caused the difference? The grad students' treatment: those with the supposedly maze-bright rats held them longer, were nicer to them, and perhaps loved them a bit more than the supposedly maze-dull rats.

13 Threats to Internal & External Validity
Reactivity
- Participants change their behavior because they know they are participating in a study
Demand characteristics
- Cues make the hypothesis obvious and influence participants' responses and behavior (i.e., subject effects)
- The 'good' participant
- The apprehensive participant
- The 'negative' participant

One thing to remember when conducting psychological research is that people are not passive; they have their own experiences and motivations, and they are active and responsive. You're probably saying, "well, no duh, isn't that what we want?" And you're right, but at the same time people's actions and responses can distort the measurements; your participants can add error to your measurement. By being aware of the ways participants may increase error, we as researchers can attempt to reduce it. Demand characteristics are cues provided by the researcher or the situation that communicate the purpose of the study (i.e., information that tells participants how they should or should not behave). It is the participant's perception of the hypothesis that matters here. Your text describes a situation in which a participant comes to the lab and the experimenter says to just ignore the noise because a new air conditioner is being installed. The participant believes the experimenter is lying and assumes the experiment is obviously about the ability to complete a task under distraction, even though the experiment may have nothing to do with that. A similar thing happened when I was conducting a study at Iowa State: the ventilation system in our lab was very poor and it was easily over 80 degrees. Participants were immediately suspicious; many of them thought we had rigged the environment to see what effect it would have on them. We hadn't, but this probably introduced error into our measurement. There are several ways participants can respond to these cues. One is to become the 'good participant': they want to help or please the experimenter, and if they think they have identified the hypothesis, they will do everything in their power to give you what they believe you want. These participants sincerely want to help researchers confirm hypotheses about human nature, never realizing that such cooperation is a bias that undermines the search for true cause-and-effect relationships (e.g., Orne's boring task: 2,000 sheets of paper, add the numbers, tear up the sheet, add the numbers, and so on). The apprehensive participant is fearful about the process itself: they may have heard negative things about experiments, believe psychologists routinely lie to research participants (heightened suspicion), or worry that the experimenter might discover something about their private life. They may not be intentionally trying to mislead anyone, but this still introduces error into our measurement. The 'negative' or noncompliant participant has a negative impression of psychological research, or wants to assert their individuality, and actively tries to disconfirm what they believe the research is intended to demonstrate: "Oh, so you think people will act this way? Well, I'm going to do exactly the opposite." It is almost like reactance. I am tempted to add another category, the worthless participant (e.g., participants who show up high or drunk, or are simply unmotivated).

14 Reducing bias, reactivity & other issues
Standardize or automate the experiment
Blinding
- Single- or double-blind studies
Conduct a preliminary (pilot) study

There are several ways to assess or reduce reactivity and experimenter bias. Standardize or automate the experiment: for example, a researcher could read from a prepared script to ensure that all participants receive exactly the same instructions, or the instructions could be presented on a printed handout, by audiotape, or on video. In each case, the goal is to limit personal contact between the experimenter and the participant or to reduce the likelihood of the experimenter acting differently. Blinding is another strategy for reducing experimenter bias: if the study is run by an experimenter or assistant who does not know the expected results, the experimenter should not be able to influence participants. In a single-blind study, participants are not aware of their condition, or even that there are other conditions (levels of the IV). In a double-blind study, neither the researcher running the sessions nor the participants know who is in which condition (compare the placebo effect); a minimal sketch of such an assignment procedure is shown below. Conducting a pilot study beforehand helps clarify instructions and uncover problems with the DV (range effects), reliability and validity issues, and procedural problems. It can also help to debrief participants thoroughly afterwards to identify potential reactivity issues (but beware of hindsight effects: "oh, I knew that was fake all along").
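One way to picture the "automate" and "blind" ideas together is the following minimal sketch. It is not from the original slides; the file name, code format, and 40-participant setup are assumptions for illustration. Conditions are assigned at random and recorded only under opaque codes, so the person running sessions never sees which code means which condition.

```python
# Minimal, hypothetical sketch of automating a double-blind assignment.
# File name, code format, and sample size are invented for illustration.
import csv
import random

random.seed(3)

participants = [f"P{i:03d}" for i in range(1, 41)]
conditions = ["treatment", "placebo"] * (len(participants) // 2)
random.shuffle(conditions)
codes = random.sample(range(1000, 10_000), len(participants))  # opaque labels

# The key linking codes to conditions is held by a third party and opened
# only after data collection; the experimenter works from the codes alone.
with open("condition_key.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["participant", "code", "condition"])
    for pid, code, cond in zip(participants, codes, conditions):
        writer.writerow([pid, code, cond])

# Run sheet for the experimenter: participant IDs and codes only.
for pid, code in zip(participants, codes):
    print(pid, code)
```

Because the run sheet contains no condition labels, the experimenter cannot unintentionally treat the groups differently, and participants cannot be tipped off about which condition they are in.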

15 Reducing bias, reactivity, & other issues
Cover stories
Add other, unrelated DVs
Use involving activities
Habituation
Indirect measures
Using placebos

Use effective cover stories: a compelling explanation for the project's purpose will often satisfy participant curiosity. This is a way to reduce demand characteristics; participants will search for meaning during the study (they usually try to figure out what is going on), so give them a story that makes sense. Administer dependent measures outside the experimental context: tell participants they are participating in two separate experiments so they will not associate the independent variable in one with the dependent variable in the other. Example: a stereotype-threat study added inkblots to throw off participants; an alibi-corroboration study seated participants around a table (we just wanted them to look at one another) and then asked questions like "what was supper like at your family table?" even though we didn't care what supper was like. Use involving activities: if the procedure is sufficiently engaging, participants will have no time to be distracted by demand characteristics. Habituation: observers introduce themselves into situations on many occasions until participants cease to react to their presence, so that the observer is no longer novel or distracting. Example: on MTV's Real World, people get used to the camera being there. You would be surprised how quickly people forget about the camera; in the ISU lab we have some nice videos of students who apparently forgot they were being videotaped and started picking their noses. Indirect measures: find ways to be less obtrusive. For example, you could examine people's recycling behavior by going through their trash. For radio listening preferences, rather than conducting an obtrusive survey or interview about favorite stations, researchers went to local auto dealers and garages and checked which station each car's radio was currently tuned to.

