Zacharias Maniadis, Fabio Tufano and John A List MAER-Net 2015 Prague Colloquium.


• The ‘credibility crisis in science’ raises the question of where economics stands as a science
• How credible are our experimental results?
1. We first show that much more research is needed to answer this question. This defines a promising research agenda
2. Experimental economics: is there enough replication to make us feel safe?

• Experiments play an increasingly important role in economics: increasing representation in economics journals (Card et al., JEP, 2011)
• Also in policy analysis and development
• Experiments are viewed as prima facie more credible (Duflo, 2006; Angrist and Pischke, 2010)

Source: Card, Della Vigna and Malmendier (JEP, 2011)

Article by Jonah Lehrer, The New Yorker, 13 Dec. 2010

• In many disciplines, several widely accepted findings cannot be replicated
• The size of treatment effects seems to shrink with successive replications
• Examples:
1. Biomedical sciences (Ioannidis, PLoS Med., 2005)
2. Psychology (Open Science Collaboration, 2015)
3. Ecology (Jennions and Moller, Proc. Royal Soc., 2001)

• Using a Bayesian model, we isolate the variables that need to be measured in order to answer this question
• Need to use meta-research. Examples of such research abound in psychology and related disciplines

• n = number of associations being studied
• π = fraction of the n associations that are actually true
• α = typical significance level
• (1-β) = typical study power
• The Post-Study Probability (PSP) that the research finding is true:
PSP = (1-β)π / [(1-β)π + α(1-π)]   (1)
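The PSP can be evaluated numerically from the quantities defined on this slide. The sketch below is illustrative only: it assumes the usual Bayes-rule form, PSP = (1-β)π / [(1-β)π + α(1-π)], with no research bias or competition terms, and the function name and the example values are assumptions rather than anything taken from the talk. Note that n cancels out of the ratio, so it does not appear as an argument.

```python
def post_study_probability(prior: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Probability that a statistically significant finding is true.

    prior -- pi, the fraction of studied associations that are actually true
    alpha -- typical significance level
    power -- (1 - beta), typical study power
    Assumes no research bias and a single study per association.
    """
    for name, value in (("prior", prior), ("alpha", alpha), ("power", power)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    true_positives = power * prior            # true associations that reach significance
    false_positives = alpha * (1.0 - prior)   # false associations that reach significance
    total_positives = true_positives + false_positives
    if total_positives == 0.0:
        raise ValueError("no significant results are possible with these inputs")
    return true_positives / total_positives


if __name__ == "__main__":
    # Hypothetical priors and power levels, chosen only for illustration.
    for prior in (0.01, 0.10, 0.50):
        for power in (0.35, 0.80):
            psp = post_study_probability(prior, alpha=0.05, power=power)
            print(f"prior={prior:.2f} power={power:.2f} PSP={psp:.3f}")
```

Under these assumptions, a low prior or low power pulls the PSP down sharply even at a conventional 5% significance level.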

• Rigorous theory testing/high priors
• Power/sample size
• Researchers’ competition/publication bias
• Research bias, with three components:
◦ 1) Degrees of freedom
◦ 2) Publication pressure
◦ 3) ‘Positive Results’ premium
• Frequency of replication

• We argue that there is a serious lack of evidence
• Juxtaposing economics with other behavioral disciplines such as psychology, we see where research needs to be directed

• Priors: Delong and Lang (1992): econ tends to study true hypotheses. Card and Dellavigna (2011): 68% of field experiments lack theory
• Power: Ortmann and Le (2013); Doucouliagos, Ioannidis and Stanley (2015) calculate low power
• Publication bias: Doucouliagos and Stanley (2013), Brodeur, Le and Sangnier (2012) and many more
• Replication: Duvendack, Palmer-Jones and Reed (2015) show low success rates

• Retrospective power analysis in psychology:
◦ Cohen (1962) found median power of 0.48
◦ Sedlmeier and Gigerenzer (1989) review ten studies from the 1970s and 1980s in several disciplines following Cohen’s approach
◦ Bakker, van Dijk, and Wicherts (2012) report a general power estimate of 0.35

• We may not know much about the Post-Study Probability that we should assign to a positive result
• But at least if frequent replications occur, we can be reassured that the PSP converges to the truth fast (Maniadis, Tufano and List 2014)
• But do they?
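The claim that replication moves the PSP quickly can be illustrated with repeated Bayesian updating. This is a simplified sketch, not the calculation in Maniadis, Tufano and List (2014): it assumes independent, unbiased studies with identical α and power, and it assumes every study in the sequence reports a significant result. The function names and example values are hypothetical.

```python
def update_after_significant_result(belief: float, alpha: float, power: float) -> float:
    """One Bayes-rule update of the belief that a finding is true,
    given one more independent significant result."""
    return power * belief / (power * belief + alpha * (1.0 - belief))


def psp_after_significant_studies(prior: float, studies: int,
                                  alpha: float = 0.05, power: float = 0.8) -> float:
    """Belief after `studies` independent significant results
    (the original study plus any successful replications)."""
    if studies < 0:
        raise ValueError("studies must be non-negative")
    belief = prior
    for _ in range(studies):
        belief = update_after_significant_result(belief, alpha, power)
    return belief


if __name__ == "__main__":
    # Hypothetical low prior of 10%; each line adds one more significant study.
    for k in range(4):
        print(k, round(psp_after_significant_studies(0.10, k), 3))
```

With these illustrative inputs, the belief rises from 0.10 to 0.64 after the original significant result and exceeds 0.96 after one successful replication, which is the sense in which frequent replication would be reassuring.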

• What fraction of experimental economics papers over the last 40 years are replications?
• Do enough “tacit” replications exist to make us feel safe?
• Which factors affect the ‘success rate’?

• Duvendack, Palmer-Jones and Reed (2015) do not calculate the fraction of papers that contain replications
• They also do not examine the factors that affect the ‘replication success’ rate
• Finally, they have a very small number of experimental studies in their replication sample (11 studies)

• We looked at the English-language economics literature in the period
• We used Web of Knowledge (WoK) and searched for the root experiment*
• We randomly sampled 2001 papers and examined which were actual experiments
• Among the experimental ones, we checked in detail and determined the fraction of replications

• We focused on the top 150 journals in economics
• We examined all replications in detail to code:
◦ The type of replication (exact/conceptual/mixed)
◦ The success/failure of the replication
◦ Authorship overlap with the original
◦ Similar or different subject pool from the original
◦ Same or different language as the original
◦ Same or different journal as the original
◦ Similar or different methodology (paper-based vs. computerized, etc.) from the original
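The coding scheme above maps naturally onto one record per replication. The following is a hypothetical sketch of such a record, not the authors' actual coding sheet: the class names, field names, and the three outcome labels (taken from the result tables later in the deck) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicationType(Enum):
    EXACT = "exact"
    CONCEPTUAL = "conceptual"
    MIXED = "mixed"


class Outcome(Enum):
    FAILED = "failed"
    MIXED = "mixed"
    SUCCESSFUL = "successful"


@dataclass(frozen=True)
class CodedReplication:
    """One coded replication study; every field is relative to the original study."""
    replication_type: ReplicationType
    outcome: Outcome
    author_overlap: bool      # shares at least one author with the original
    same_subject_pool: bool
    same_language: bool
    same_journal: bool
    same_methodology: bool    # e.g. paper-based vs. computerized


def success_rate(records: list[CodedReplication]) -> float:
    """Share of coded replications whose outcome is SUCCESSFUL."""
    if not records:
        raise ValueError("no records to summarize")
    successes = sum(r.outcome is Outcome.SUCCESSFUL for r in records)
    return successes / len(records)
```

Keeping each attribute as its own field makes it straightforward to split the success rate by any one of them, for example by authorship overlap or by journal.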

• Among 7754 papers with the root experiment* (but not replicat*), about half were experiments
• Only 1038/2001 sampled papers were actual experiments
• 655/1159 of the studies with the terms “experiment*” and “replicat*” contained actual experiments
• Among those 655, 100 turned out to be actual replications

• Perhaps researchers conduct replications but do not wish to declare them as such
• So, we thoroughly went through 500 papers which were actual experiments and did not have the root replicat*
• Only 13 were found to be replications
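The raw counts reported for the two search samples can be turned into percentages with a few lines of arithmetic. This sketch only converts the counts quoted in the slides into sample shares. The labels are paraphrases, and it does not attempt to reproduce the headline percentages, whose weighting is not given in the transcript.

```python
# (numerator, denominator) pairs as reported in the slides.
reported_counts = {
    "experiments among sampled experiment* papers": (1038, 2001),
    "experiments among experiment* and replicat* papers": (655, 1159),
    "replications among those 655 experiments": (100, 655),
    "undeclared replications among 500 checked experiments": (13, 500),
}


def percentage(numerator: int, denominator: int) -> float:
    """Share expressed in percent."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


shares = {label: percentage(*pair) for label, pair in reported_counts.items()}

if __name__ == "__main__":
    for label, value in shares.items():
        print(f"{label}: {value:.1f}%")
```

The shares come out at roughly 51.9%, 56.5%, 15.3% and 2.6%, so papers that mention replicat* are far more likely to contain a replication than experimental papers that do not.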

• Fraction of total papers in economics that contain new experimental data: 2.3%
• Fraction of replication studies over the total number of experimental studies: 2.56%
• Overall success rate: 32%

Replication rates in the top 150 journals in economics according to the Eigenfactor Score

Replication type     Overall
All (N=76)                       16%     84%
Failed                 11%        0%     13%
Mixed                  47%       67%     44%
Successful             42%       33%     44%

Replication type     Overall
Conceptual (N=35)                23%     77%
Failed                 11%        0%     15%
Mixed                  51%       50%     52%
Successful             37%       50%     33%

Replication type     Overall
Direct (N=41)                    10%     90%
Failed                 10%        0%     11%
Mixed                  44%      100%     38%
Successful             46%        0%     51%

Replication type         Overall
By same authors (N=13)               31%     69%
Failed                      8%        0%     11%
Mixed                      46%       75%     33%
Successful                 46%       25%     56%
By same journal (N=10)               40%     60%
Failed                     10%        0%     17%
Mixed                      40%       75%     17%
Successful                 50%       25%     67%

• Much more research using meta-research methods is needed in economics
• We conducted a study to see how prevalent replication is in experimental economics. We found that about 2.6% of experimental studies are replications
• Success rate (37%) similar to Open Science Collaboration (36-39%) and Duvendack, Palmer-Jones and Reed (2015) (22%)
• Makel et al. (2012) found 67%