PHILOSOPHY OF SCIENCE: Bayesian inference. Zoltán Dienes, Philosophy of Psychology. Thomas Bayes (1702-1761).


Subjective probability: personal conviction in an opinion, to which a number is assigned that obeys the axioms of probability. Probabilities reside in the mind of the individual, not in the external world; there are no true or objective probabilities. You cannot be criticized for your subjective probability regarding any uncertain proposition, but you must revise it in the light of data in ways consistent with the axioms of probability.

Personal probabilities: which of the following alternatives would you prefer? If your choice turns out to be true I will pay you 10 pounds:
1. It will rain tomorrow in Brighton.
2. I have a bag with one blue chip and one red chip. I randomly pull out one chip and it is red.

Personal probabilities: which of the following alternatives would you prefer? If your choice turns out to be true I will pay you 10 pounds:
1. It will rain tomorrow in Brighton.
2. I have a bag with one blue chip and one red chip. I randomly pull out one chip and it is red.
If you chose 2, your p('it will rain') is less than 0.5.
If you chose 1, your p('it will rain') is greater than 0.5.

Assume you think it more likely than not that it will rain. Now which do you choose? I will give you 10 pounds if your chosen statement turns out correct.
1. It will rain tomorrow in Brighton.
2. I have a bag with 3 red chips and 1 blue. I randomly pick a chip and it is red.

Assume you think it more likely than not that it will rain. Now which do you choose? I will give you 10 pounds if your chosen statement turns out correct.
1. It will rain tomorrow in Brighton.
2. I have a bag with 3 red chips and 1 blue. I randomly pick a chip and it is red.
If you chose 2, your p('it will rain') is less than 0.75 (but more than 0.5).
If you chose 1, your p('it will rain') is more than 0.75.

By imagining a bag with differing numbers of red and blue chips, you can make your personal probability as precise as you like. For example, if, in order to get 10 pounds, you prefer gambling on 'It will rain tomorrow in Brighton' rather than on 'Selecting one red chip out of a bag composed of 6 reds and 4 blues', then your personal probability that it will rain is greater than 0.6.
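The calibration procedure above amounts to a binary search over bag compositions: each preference question halves the interval in which your personal probability must lie. A minimal sketch, where `prefers_rain` is a hypothetical stand-in for the person's answers:

```python
def elicit_probability(prefers_rain, tolerance=0.01):
    """Binary search for a personal probability.

    prefers_rain(q) should return True if the person would rather
    gamble on 'rain tomorrow in Brighton' than on drawing a red chip
    from a bag in which a proportion q of the chips are red.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        q = (lo + hi) / 2
        if prefers_rain(q):
            lo = q  # p(rain) exceeds q: raise the lower bound
        else:
            hi = q  # p(rain) is at most q: lower the upper bound
    return (lo + hi) / 2

# Simulate a person whose personal probability of rain is 0.7
print(elicit_probability(lambda q: 0.7 > q))  # close to 0.7
```

With a tolerance of 0.01 the answer is pinned down after seven questions, which matches the slide's point that the probability can be made as precise as you like.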

This is a notion of probability that applies to the truth of theories (Remember objective probability does not apply to theories) So that means we can answer questions about p(H) – the probability of a hypothesis being true – and also p(H|D) – the probability of a hypothesis given data (which we cannot do on the Neyman-Pearson approach).

Consider a theory you might be testing in your project. What is your personal probability that the theory is true?

Odds in favour of a theory = P(theory is true) / P(theory is false). For example, for the theory you may be testing, 'extroverts have high cortical arousal', you might think (it's completely up to you) that P(theory is true) = 0.5. It follows that P(theory is false) = 1 - P(theory is true) = 0.5, so the odds in favour of the theory are 0.5/0.5 = 1: even odds.

If you think P(theory is true) = 0.8 Then you must think P(theory is false) = 0.2 So your odds in favour of the theory are 0.8/0.2 = 4 (4 to 1) What are your odds in favour of the theory you are testing in your project?

Odds before you have collected your data are called your prior odds. Experimental results tell you by how much to change your odds: multiply them by the Bayes factor, B. Odds after collecting your data are called your posterior odds. Posterior odds = B * prior odds. For example, a Bayesian analysis might lead to a B of 5. What would your posterior odds for your project hypothesis be?

You can convert back to probabilities with the formula: P(theory is true) = odds/(odds + 1). So if your prior odds had been 1 and the Bayes factor 5, then your posterior odds = 5, and your posterior probability of your theory being true = 5/6 = 0.83. Not a black-and-white decision as in significance testing (conclude one thing if p = .048 and another if p = .052).
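The arithmetic on this slide can be written out in a few lines (the numbers are the slide's own illustration):

```python
def update(prior_prob, bayes_factor):
    """Turn a prior probability into a posterior probability via odds:
    posterior odds = B * prior odds, then p = odds / (odds + 1)."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (posterior_odds + 1)

# Prior odds of 1 (i.e. p = 0.5) and a Bayes factor of 5:
print(update(0.5, 5))  # 5/6, about 0.83
```

Note that the output is a graded probability, not an accept/reject decision.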

If B is greater than 1, the data supported your experimental hypothesis over the null. If B is less than 1, the data supported the null hypothesis over the experimental one. If B is about 1, the experiment was not sensitive. (You automatically get a notion of sensitivity; contrast just relying on p values in significance testing.)

For each experiment you run, just keep updating your odds/personal probability by multiplying by each new Bayes factor No need for p values at all! No need for power calculations! No need for critical values of t-tests! And as we will see: No need for post hoc tests! No need to plan number of subjects in advance!
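Updating across a series of experiments is just repeated multiplication of Bayes factors. A minimal sketch with made-up Bayes factors (the values 12, 1 and 0.5 are purely illustrative):

```python
from functools import reduce

def combine(bayes_factors):
    """Bayes factors from independent experiments multiply together."""
    return reduce(lambda a, b: a * b, bayes_factors, 1.0)

def posterior_odds(prior_odds, bayes_factors):
    """Running total of the evidence: prior odds times every B so far."""
    return prior_odds * combine(bayes_factors)

# Three experiments: strong support, an insensitive result, mild support for the null
print(posterior_odds(1.0, [12.0, 1.0, 0.5]))  # 6.0
```

No p values, power calculations or critical values appear anywhere: the running odds are the whole analysis.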

EXAMPLE WITH REAL DATA: Sheldrake’s (1981) theory of morphic resonance

EXAMPLE WITH REAL DATA: Sheldrake's (1981) theory of morphic resonance
- Any system, by virtue of assuming a particular form, becomes associated with a "morphic field"
- The morphic field then plays a causal role in the development and maintenance of future systems, acting perhaps instantaneously through space and without decay through time
- The field guides future systems to take similar forms
- The effect is stronger the more similar the future system is to the system that generated the field
- The effect is stronger the more times a form has been assumed by previous similar systems
- The effect occurs at all levels of organization

Nature editorial by John Maddox 1981: The “book is the best candidate for burning there has been in many years... Sheldrake’s argument is pseudo-science... Hypotheses can be dignified as theories only if all aspects of them can be tested.” Wolpert, 1984: “... It is possible to hold absurd theories which are testable, but that does not make them science. Consider the hypothesis that the poetic Muse resides in tiny particles contained in meat. This could be tested by seeing if eating more hamburgers improved one’s poetry”

Repetition priming: subjects identify a stimulus more quickly or accurately with repeated presentation of the stimulus. Lexical decision: subjects decide whether a presented letter string makes a meaningful English word or not (in the order actually presented). Two aspects of repetition priming are consistent with an explanation that involves morphic resonance: durability and stimulus specificity. Unique prediction of morphic resonance: you should get repetition priming between separate subjects! (ESP)

Design:
Stimuli:       shared+unique   shared     shared+unique   ...
Subject no:    ...
Subject type:  resonator       boosters   resonator       ...

Design:
Stimuli:       shared+unique   shared     shared+unique   ...
Subject no:    ...
Subject type:  resonator       boosters   resonator       ...
- There were 10 resonators in total, with nine boosters between each. Resonators were assigned randomly in advance to their position in the sequence.
- The shared stimuli received morphic resonance at ten times the rate of the unique stimuli.
- There was a distinctive experimental context (white noise, essential oil of ylang ylang, stimuli seen through a chequerboard pattern).

Design:
Stimuli:       shared+unique   shared     shared+unique   ...
Subject no:    ...
Subject type:  resonator       boosters   resonator       ...
Prediction of the theory of morphic resonance: the resonators should become progressively faster on the shared as compared to the unique stimuli.

Data for words: slope (ms/resonator) = -5.0, SE = 1.5. Neyman-Pearson: significant.

With some plausible assumptions, the Bayes factor = 12, i.e. whatever your prior odds in favour of morphic resonance, you should multiply them by 12 in the light of the data. Contrast Neyman-Pearson: the result was significant, so you should categorically reject the null hypothesis. With the Bayesian approach, if beforehand you had very low odds in favour of morphic resonance, they can still be very low afterwards.
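The slide does not spell out the "plausible assumptions". As an illustration only, a Bayes factor of this kind can be computed numerically with a normal likelihood for the observed slope and a half-normal prior over the effect sizes the theory predicts; the prior scale of 5 ms/resonator below is invented, so this sketch will not reproduce the value of 12:

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def bayes_factor(obtained, se, prior_sd):
    """B = p(data | H1) / p(data | H0), where H0 says the slope is 0
    and H1 puts a half-normal prior on negative slopes (the predicted
    direction: resonators speed up on shared stimuli)."""
    theta = np.linspace(-10 * prior_sd, 0.0, 20001)  # grid of negative slopes
    d_theta = theta[1] - theta[0]
    prior = 2 * normal_pdf(theta, 0.0, prior_sd)     # half-normal prior density
    likelihood = normal_pdf(obtained, theta, se)     # normal approx. to the data
    p_h1 = np.sum(prior * likelihood) * d_theta      # marginal likelihood under H1
    p_h0 = normal_pdf(obtained, 0.0, se)             # likelihood at the null value
    return p_h1 / p_h0

# Observed slope -5.0 ms/resonator with SE 1.5 (from the previous slide);
# prior_sd = 5.0 is a made-up prior scale for illustration.
print(bayes_factor(-5.0, 1.5, 5.0))
```

The resulting B depends on the prior scale chosen, which is exactly why the slide hedges with "some plausible assumptions": vaguer priors that spread mass over large effects yield smaller Bayes factors.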

NB: I ran three further studies with the same paradigm, including one in which boosters were run at Sussex and resonators were run in Göttingen: could they show which word set was being boosted in Sussex? All results were flat as a pancake. The combined Bayes factor for the non-word data was about 1; combining with the word data, about 1/2. This does not rule out morphic resonance; it just changes our odds.

Summary: a Bayes factor tells you how much to multiply your prior odds in the light of data. Advantage: insensitive experiments show up as having Bayes factors near 1, so you are not tempted to accept the null hypothesis just because the experiment was insensitive.

Contrasts with Neyman-Pearson: 1. A significant result (i.e. accepting the experimental hypothesis on Neyman-Pearson) can give rise to a very small Bayes factor! See handout for an example. (This arises for vague theories: vague theories are punished by Bayes.)

2. On Neyman-Pearson, it matters whether you formulated your hypothesis before or after looking at the data: post hoc vs planned comparisons. Predictions made in advance of, rather than after, looking at the data are treated differently. Bayesian inference: it does not matter what day of the week you thought of your theory. The evidence for your theory is just as strong regardless of its timing.

3. On Neyman-Pearson you should standardly plan in advance how many subjects you will run. If you just miss out on a significant result, you cannot simply run 10 more subjects and test again; you cannot run until you get a significant result. Bayes: it does not matter when you decide to stop running subjects. You can always run more subjects if you think it will help.

4. On Neyman-Pearson you must correct for how many tests you conduct in total. For example, if you ran 100 correlations and 4 were just significant, researchers would not try to interpret those significant results. On Bayes, it does not matter how many other statistical hypotheses you investigated. All that matters is the data relevant to each hypothesis under investigation.

The strengths of Bayesian analysis are also its weaknesses: 1. Are our subjective convictions really susceptible to the assignment of precise numbers, and are they really the sorts of things that do or should follow the axioms of probability? Should papers worry about the strength of our convictions in their results sections, or just the objective reasons why someone might change their opinions? But one can just report the Bayes factor.

2. Bayesian procedures, because they are not concerned with long-run frequencies, are not guaranteed to control error probabilities (Type I, Type II). Which is more important to you: to use a procedure with known long-run error rates, or to know the amount by which you should change your conviction in a hypothesis?