On Magic, Power, Extra Sensory Perception, Decline, and Death: The Ironic Effect of Multiple-Study Articles on Scientific Progress Ulrich Schimmack University.

Presentation transcript:

On Magic, Power, Extra Sensory Perception, Decline, and Death: The Ironic Effect of Multiple-Study Articles on Scientific Progress Ulrich Schimmack University of Toronto Mississauga

Recent Controversy in JPSP:ASC (2011) - Bem (2011) “Feeling the Future” - 9-study article - Conclusion: “extrasensory perception for subliminal stimuli” (p < .00001) - Wagenmakers et al. (2011) - results are inconclusive - Article demonstrates weaknesses of current scientific practices in experimental social psychology (ESP) research.

Bem’s Article May Elicit Feelings of Dissonance Unbalanced Triad (Heider) - I don’t believe in extrasensory perception. - I believe JPSP articles. - JPSP shows that extrasensory perception is possible. Dissonance motivates actions that reduce dissonance. - Start believing in extrasensory perception. - Stop believing ESP articles.

But now all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology.

For many scientists, the effect is especially troubling because of what it exposes about the scientific process. If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved? Which results should we believe? What is more trustworthy: a finding in a single-study article or a finding in a multiple-study article?

The Problem: Too Many Significant Results! Standard statistical theory - two causal factors - true effect - random factors - Type I error: infer true effect, when random factors produced observed effect. - Type II error: Infer no true effect, when random factors mask true effect. - Ignores a third factor that influences results in scientific articles (bias).

The Problem: Too Many Significant Results How many significant results should be found? - Power (Cohen, 1992) - bigger effect size increases chances - bigger sample size increases chances. ESP studies tend to have ~60% power (Sedlmeier & Gigerenzer, 1989; Rossi, 1990). ESP journals publish 97% significant results (Sterling et al., 1995).

Magic produces 37% of significant effects in ESP journals (97% observed minus ~60% expected from power). We are all magicians, but some more than others.

Power in multiple studies is a power function. It is hard to show significance once, but it is harder to do it again, and again, and again… Thus, it requires more magic for significant results in multiple study articles.
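The "power function" point can be illustrated with a minimal sketch. It assumes the studies are independent and share the same power, and it uses the ~60% figure quoted earlier in the slides; the function name `prob_all_significant` is hypothetical, chosen for illustration only.

```python
def prob_all_significant(power: float, k: int) -> float:
    """Probability that all k independent studies reach significance,
    assuming every study has the same power (an illustrative assumption)."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must be between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return power ** k


if __name__ == "__main__":
    # With ~60% power per study, the chance of an unbroken run of
    # significant results shrinks quickly as studies are added.
    for k in (1, 2, 5, 9):
        print(f"{k} studies: {prob_all_significant(0.60, k):.4f}")
```

Under these assumptions, an article reporting nine significant results out of nine studies at 60% power would be expected only about 1% of the time without bias.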

“Daryl Bem is a Cornell University psychologist who says he's been doing magic as a hobby since he was 17. Now he has managed what some scientists may call his greatest trick: he's written a paper attempting to prove the power of ESP — extrasensory perception — and had it accepted for publication in a major scientific journal... Did Bem really find evidence of extrasensory perception, or will his paper turn out to be an embarrassment? Already, there are doubts in the scientific world.


Main Effect for ESP Study 7 used supraliminal stimuli. Effect (d =.09) not significant with N = 200 (POW = 97% for d =.25). Bem does not explain what this means. - failed replication - moderator effect of condition “I now wish I had simply continued to use subliminal exposures” Ignore Study 7 and focus on 8 ESP studies with subliminal stimuli and 9 significant effects.

Conclusion M-index for main effect: .10. M-index for moderator effect: .01. M-index total: .001 (.10 × .01). Only 1 in 1,000 studies can produce this (or a better) pattern of results if the hypotheses are true. More evidence needed? r(N – ES) = -.90

Future Replication Studies Already one failed replication of Study 8 (N = 112, power = .80) (Galak & Nelson, 2010). If Power = .80, the probability of getting three non-significant results is only p = .008. If Power = .90, the probability of getting two non-significant results is p = .01. Thus, it takes only a few failed replications to provide more evidence that Bem is a magician.
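The two probabilities on this slide follow from multiplying the Type II error rate across replications. A minimal sketch, assuming independent replications with identical power and a true effect; the function name `prob_all_nonsignificant` is hypothetical.

```python
def prob_all_nonsignificant(power: float, k: int) -> float:
    """Probability that k independent replications all fail to reach
    significance when the effect is real and each study has the given power."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must be between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return (1.0 - power) ** k


if __name__ == "__main__":
    # The two scenarios mentioned on the slide.
    print(f"power .80, three failures: {prob_all_nonsignificant(0.80, 3):.3f}")
    print(f"power .90, two failures:   {prob_all_nonsignificant(0.90, 2):.3f}")
```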

Gailliot, Baumeister et al. (2007) “Sugar High” Dvorak & Simons (2000, PSPB) failed to replicate studies 3-6, r = .15, n.s. Kurzban (2010) showed that studies 3-6 failed to replicate Study 1.

Schooler’s Decline Effect

Decline Effect II: Terror Management is Dying Greenberg et al. (1994) original study - think about death - word completion as dependent variable (coff _ _ ) - N = 25, d = 1.7, POW = .98. Meta-analysis by Hayes (2010, Psych. Bull.) lists 28 significant studies (0 failed replications), average d = 1.08, average N = 60, POW = .97, M-index = .43. However, strong decline effect, r = -.60. First published non-significant effect by Niemiec (2010, JPSP), d = .30, N = 57, n.s.

Decline Effect III: Malleability of Race-IAT Dasgupta & Greenwald (2001), JPSP N = 32, 16 per cell d =.7, POW =.50 [d =.2, POW =.09] Joy-Gaba & Nosek (2010), Social Psychology N = 4,628, d =.08, POW =.77
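The power values on this slide can be approximated with a short sketch. It assumes a two-group comparison with equal cell sizes, a two-sided test at alpha = .05, and a normal approximation to the t distribution; none of these choices is stated on the slide, and the function name `approx_power` is hypothetical. An exact noncentral-t calculation would give slightly lower values for small cells.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d: float, n_per_cell: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test for a standardized
    mean difference d with n_per_cell participants in each group.
    Uses the normal approximation (an assumption, not the slide's stated method)."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    ncp = d * sqrt(n_per_cell / 2)         # expected value of the test statistic
    # Probability of landing in either rejection region.
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


if __name__ == "__main__":
    # Original study: 16 per cell, reported d = .7, and a smaller d = .2.
    print(f"d = .70, n = 16/cell:   {approx_power(0.70, 16):.2f}")
    print(f"d = .20, n = 16/cell:   {approx_power(0.20, 16):.2f}")
    # Large replication: N = 4,628 split into two equal cells, d = .08.
    print(f"d = .08, n = 2314/cell: {approx_power(0.08, 2314):.2f}")
```

Under these assumptions the results land close to the slide's figures (.50, .09, and .77), which suggests the comparison is between a small, marginally powered original study and a much larger replication of a much smaller effect.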

Conclusion “Less is more except of course for sample size” (Cohen, 1990, p. 1304) Implications - Request power analysis in method section - Get rid of null-hypothesis testing, report confidence intervals