Experimentation and Validity Slides Prepared by Alison L. O’Malley Passer Chapter 10
Categories of Inference Inferences about constructs Statistical inferences Causal inferences Inferences about generalizability (Shadish, Cook, & Campbell, 2002)
Categories of Inference and Validity Type
Inferences about constructs: construct validity
Statistical inferences: statistical conclusion validity
Causal inferences: internal validity
Inferences about generalizability: external validity
(Shadish, Cook, & Campbell, 2002)
Construct Validity Are we measuring what we intend to measure? Are our operational definitions true to the underlying construct? How do we know whether our measures are valid?
Statistical Conclusion Validity Did researchers perform appropriate statistical analyses? Do data meet assumptions surrounding particular tests?
External Validity Can findings be generalized beyond the present study to other populations, settings, and species? Ecological validity addresses generalizability to “real-life” settings
External Validity Mundane realism addresses similarity between experimental environment and real-world settings Psychological realism addresses degree to which experimental setting encourages participants to behave naturally Which is more important? Why?
External Validity Replication is the process by which studies are repeated to see whether original findings are upheld Evidence for or against external validity gradually accumulates through replication In 2013 the Association for Psychological Science launched a major replication initiative
Internal Validity Ability to make causal inferences Is X (the IV) driving the change in Y (the DV)? Many potential threats to internal validity 7 sources of threat
Internal Validity Threats: History Events that occur during a study that are unrelated to the experimental manipulation E.g., Riots break out on campus during data collection
Internal Validity Threats: Maturation People naturally change over time irrespective of what happens to them during a study E.g., Children acquire object permanence
Internal Validity Threats: Testing Measuring participants’ responses affects how they respond on subsequent measures Similar to maturation but change is caused by the testing procedure itself
Internal Validity Threats: Instrumentation Changes occur in a measuring instrument during data collection E.g., Testing apparatus is recalibrated over the course of a study
Internal Validity Threats: Regression to the Mean Statistical phenomenon wherein participants who receive extreme scores tend to have less extreme scores when retested even in the absence of any treatment effects Attributable to measurement error
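Regression to the mean can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides. It assumes each observed score is a stable true score plus independent measurement error. The function name, the score distribution (mean 100, SD 10), and the error SD are hypothetical choices for illustration.

```python
import random
import statistics


def simulate_retest(n=10_000, true_sd=10.0, error_sd=10.0, top_fraction=0.1, seed=0):
    """Return (test 1 mean, test 2 mean) for the top scorers on test 1.

    Assumption: observed score = true score + random measurement error.
    No treatment is applied between the two tests.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
    # Each testing occasion adds fresh, independent measurement error.
    test1 = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0.0, error_sd) for t in true_scores]

    # Select participants with extreme (high) scores on the first test.
    cutoff = sorted(test1)[int(n * (1 - top_fraction))]
    extreme = [i for i in range(n) if test1[i] >= cutoff]

    mean_test1 = statistics.fmean([test1[i] for i in extreme])
    mean_test2 = statistics.fmean([test2[i] for i in extreme])
    return mean_test1, mean_test2


if __name__ == "__main__":
    first, second = simulate_retest()
    print(f"Top scorers, test 1 mean: {first:.1f}")
    print(f"Same people, test 2 mean: {second:.1f}")
```

Under these assumptions the selected group's retest mean falls back toward 100 even though nothing was done to the participants. A pretest-posttest study of extreme scorers with no control group could mistake this drop for a treatment effect.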
Internal Validity Threats: Attrition Participants drop out of a study Particularly threatening when participants who dropped out differ meaningfully from those who stayed in the study (i.e., differential attrition)
Internal Validity Threats: Selection At the start of a study, groups are non-equivalent, and this difference affects the results E.g., Group 1 Mean IQ = 100; Group 2 Mean IQ = 120
Solutions to Internal Validity Threats
Internal Validity Threat: Solutions
History: Block randomization
Maturation: Random assignment
Testing: Avoid pretesting or ensure that all participants complete a pretest
Instrumentation: Random assignment and counterbalancing
Regression to the mean: Random assignment; exclude participants with extreme scores
Attrition: Establish why participants drop out; establish whether participants who remain in the study differ from participants who left
Selection: Random assignment
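Block randomization, listed above as a solution to history threats, can be sketched in a few lines. This is an illustrative sketch, not a procedure from the slides. It assumes participants are assigned in order of arrival and that each block contains every condition exactly once in a freshly shuffled order. The function name and condition labels are hypothetical.

```python
import random


def block_randomize(participant_ids, conditions, seed=None):
    """Assign participants to conditions in randomized blocks.

    Each consecutive block of len(conditions) participants receives every
    condition exactly once, in a random order. All conditions therefore run
    throughout the data-collection period, so an outside event (a history
    threat) affects the conditions about equally.
    """
    rng = random.Random(seed)
    block_size = len(conditions)
    assignment = {}
    for start in range(0, len(participant_ids), block_size):
        block = list(conditions)
        rng.shuffle(block)  # new random order for every block
        for pid, condition in zip(participant_ids[start:start + block_size], block):
            assignment[pid] = condition
    return assignment


if __name__ == "__main__":
    ids = [f"P{i:02d}" for i in range(1, 13)]
    for pid, condition in block_randomize(ids, ["treatment", "placebo", "control"], seed=1).items():
        print(pid, condition)
```

With 12 participants and 3 conditions, each condition ends up with 4 participants, and no condition is concentrated at the start or end of data collection.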
Demand Characteristics Are participants good subjects or defiant subjects? Demand characteristics shape participants’ beliefs about the hypothesis and how they are expected to behave Is participants’ standing on the DV due to the IV? Demand characteristics? Both?
Demand Characteristics: Solutions Suspicion probes can be incorporated into debriefing to explore participants’ beliefs about the hypothesis E.g., “What was the purpose of this study?” Increase psychological realism Pilot the experiment Use unobtrusive dependent measures
Demand Characteristics: Solutions Avoid within-subjects designs Separate out participants who claim to know the hypothesis—do their results differ? Manipulate participants’ knowledge of the hypothesis Apply the “red herring” technique Why might within-subjects designs be problematic?
Experimenter Expectancy Effects Researchers may unintentionally influence their participants to respond in line with the hypothesis Combat this by training experimenters to follow a research protocol Automate as much as possible Keep experimenters unaware of the hypothesis and/or experimental condition being run through masking (aka blinding)
Placebo Effects Participants’ expectations surrounding treatment effects influence their responses to treatment To combat, include a placebo control group wherein participants do not receive the experimental treatment but are led to believe they received it Typically discussed in context of drug trials, but much broader application
Double Blind Procedures Use masking along with placebo control Neither participants nor the experimenters know who is receiving what
Yoked Control Groups Control group members linked (i.e., yoked) to experimental group members Experimental participants’ behavior dictates how control participant is treated Yoking occurs through random assignment or matching Under what circumstances might researchers use yoked control groups?
Ceiling and Floor Effects In order to make a causal inference, there must be variability in the DV (i.e., no range restriction) Ceiling effects occur when scores bunch up at the maximum DV level E.g., All employees receive “outstanding” performance ratings Floor effects occur when scores bunch up at the minimum DV level E.g., All employees receive “unsatisfactory” performance ratings
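One way to screen pilot data for ceiling or floor effects is to check how many scores sit at the extremes of the scale. The sketch below is a rough illustration only. The function name and the 15% flagging threshold are assumptions, not values given in the slides, and a reasonable threshold depends on the measure.

```python
def ceiling_floor_report(scores, scale_min, scale_max, threshold=0.15):
    """Report the proportion of scores at the scale minimum and maximum.

    `threshold` is a hypothetical cutoff: if more than this proportion of
    scores sits at one end of the scale, that end is flagged.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    prop_at_max = sum(1 for s in scores if s == scale_max) / n
    prop_at_min = sum(1 for s in scores if s == scale_min) / n
    return {
        "prop_at_max": prop_at_max,
        "prop_at_min": prop_at_min,
        "possible_ceiling": prop_at_max > threshold,
        "possible_floor": prop_at_min > threshold,
    }


if __name__ == "__main__":
    # Hypothetical performance ratings on a 1-5 scale (5 = "outstanding").
    ratings = [5, 5, 5, 5, 4, 5, 5, 3, 5, 5]
    print(ceiling_floor_report(ratings, scale_min=1, scale_max=5))
```

In this made-up example most ratings sit at the top of the scale, so the DV has little room to show an effect of the manipulation.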
Research Design Tips Combat ceiling and floor effects by using highly sensitive measures and strong manipulations Pilot studies can identify problems here and elsewhere Incorporate manipulation checks to assess validity of IV manipulation Debrief participants thoroughly to gain insight into their experience
Replication Complete (full) replications precisely mirror the original study Partial replications include some aspects of original study
Replication In direct (exact) replication, researchers mimic original procedures Conceptual replications examine the same research question but operationalize constructs differently Replication-and-extension studies add a new design element to the original study Factorial designs are ideal!