MSc Methods XX: YY Dr. Mathias (Mat) Disney UCL Geography Office: 113, Pearson Building Tel: 7670 0592


Induction and deduction

Gauch (2006): “Seven pillars of Science”
1. Realism: the physical world is real;
2. Presuppositions: the world is orderly and comprehensible;
3. Evidence: science demands evidence;
4. Logic: science uses standard, settled logic to connect evidence and assumptions with conclusions;
5. Limits: many matters cannot usefully be examined by science;
6. Universality: science is public and inclusive;
7. Worldview: science must contribute to a meaningful worldview.

What’s this got to do with methods?
Fundamental laws of probability can be derived from statements of logic, BUT there are different ways to apply them
Two key ways:
– Frequentist
– Bayesian, after Rev. Thomas Bayes

Aside: sound argument v fallacy
Fallacies can be hard to spot in longer, more detailed arguments:
– Fallacies of composition; ambiguity; false dilemmas; circular reasoning; genetic fallacies (ad hominem)
Gauch (2003) notes:
– For an argument to be accepted by any audience as proof, the audience MUST accept both its premises and its validity
– That is, part of the responsibility for rational dialogue falls to the audience
– If the audience’s data are lacking and/or its logic is weak, then a valid argument may be incorrectly rejected (or vice versa)

Aside: sound argument v fallacy
If plants lack nitrogen (N), they become yellowish (p → q). Four arguments built on this premise:
– Affirming the antecedent: p → q; p, ∴ q ✓ (the plants lack N, so they become yellowish)
– Denying the consequent: p → q; ~q, ∴ ~p ✓ (the plants are not yellowish, so they do not lack N)
– Affirming the consequent: p → q; q, ∴ p ✗ (the plants are yellowish, therefore they lack N)
– Denying the antecedent: p → q; ~p, ∴ ~q ✗ (the plants do not lack N, so they do not become yellowish)
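The four argument forms can be checked mechanically with a truth table. The following is a minimal sketch, not part of the original slides: the helper names `implies` and `is_valid` are illustrative, and "valid" is taken to mean that the conclusion is true in every row where all premises are true.

```python
from itertools import product

def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q

def is_valid(premises, conclusion):
    """Valid if the conclusion holds in every truth assignment
    in which all of the premises hold."""
    for p, q in product([True, False], repeat=2):
        if all(prem(p, q) for prem in premises) and not conclusion(p, q):
            return False  # counterexample found
    return True

# Each form: (list of premises, conclusion), all as functions of (p, q)
forms = {
    "affirming the antecedent": ([implies, lambda p, q: p], lambda p, q: q),
    "denying the consequent": ([implies, lambda p, q: not q], lambda p, q: not p),
    "affirming the consequent": ([implies, lambda p, q: q], lambda p, q: p),
    "denying the antecedent": ([implies, lambda p, q: not p], lambda p, q: not q),
}

results = {name: is_valid(prem, concl) for name, (prem, concl) in forms.items()}
for name, ok in results.items():
    print(f"{name}: {'valid' if ok else 'fallacy'}")
```

The counterexample for both fallacies is the same row: p false, q true (plants that are yellowish for some reason other than lack of N).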

Bayesian reasoning
Bayesian view is directly related to how we do science
Frequentist view of hypothesis testing is fundamentally flawed (see e.g. Jaynes, ch. 17):
– To test H, do it indirectly: invent a null hypothesis Ho that denies H, then argue against Ho
– But in practice, Ho is not (usually) a direct denial of H
– H is usually a disjunction of many different hypotheses, where Ho denies all of them while assuming things (e.g. normal distribution of errors) which H neither assumes nor denies
Jeffreys (1939, p316): “…an hypothesis that may be true is rejected because it has failed to predict observable results that have not occurred. This seems remarkable…on the face of it, the evidence might more reasonably be taken as evidence for the hypothesis, not against it. The same applies to all the current significance tests based on P-values.”

Bayes: see Gauch (2003) ch 5
Prior knowledge?
– What is known beyond the particular experiment at hand, which may be substantial or negligible
We all have priors: assumptions, experience, other pieces of evidence
Bayes approach explicitly requires you to assign a probability to your prior (somehow)
Bayesian view: probability as degree of belief rather than a frequency of occurrence (in the long run…)

A more complex example: mean of Gaussian
For N data: given data {x_k}, what is the best estimate of μ and error, σ?
– Likelihood?
– Simple uniform prior?
– Log(Posterior), L
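One way to write out the likelihood, uniform prior and log-posterior for this example is given below. This is a sketch under the usual textbook assumptions, not necessarily the lecturer's exact notation: each datum carries independent Gaussian noise of known width σ, and the prior on μ is flat between assumed limits μ_min and μ_max.

```latex
\begin{align}
P(\{x_k\} \mid \mu, \sigma) &= \prod_{k=1}^{N} \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left[-\frac{(x_k-\mu)^2}{2\sigma^2}\right] \\
P(\mu \mid \sigma) &=
    \begin{cases}
      \dfrac{1}{\mu_{\max}-\mu_{\min}} & \mu_{\min} \le \mu \le \mu_{\max} \\
      0 & \text{otherwise}
    \end{cases} \\
L = \ln P(\mu \mid \{x_k\}, \sigma) &= \text{const} - \sum_{k=1}^{N} \frac{(x_k-\mu)^2}{2\sigma^2}
    \qquad (\mu_{\min} \le \mu \le \mu_{\max})
\end{align}
```

Because the prior is constant inside its range, it only contributes to the constant term: within the range, the shape of the posterior is set entirely by the likelihood.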

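A more complex example: mean of Gaussian
For best estimate μ_o: $dL/d\mu = 0$ at $\mu = \mu_o$
So $\sum_k (x_k - \mu_o)/\sigma^2 = 0$, and best estimate is simple mean i.e. $\mu_o = \frac{1}{N}\sum_k x_k$
Confidence depends on σ i.e. $d^2L/d\mu^2 = -N/\sigma^2$
And so $\mu = \mu_o \pm \sigma/\sqrt{N}$
Here μ_min = -2, μ_max = 15
If we make larger?
Weighting of error for each point?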
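For a Gaussian likelihood with known σ and a flat prior, maximising the log-posterior L gives the sample mean, and the curvature of L at the maximum gives an error bar of σ/√N. The sketch below checks both numerically. The data are synthetic, and `true_mu`, `sigma` and the sample size are hypothetical values chosen for illustration; only the prior range (-2 to 15) comes from the slide.

```python
import math
import random

rng = random.Random(0)        # fixed seed so the sketch is repeatable
sigma = 1.0                   # assumed known noise level (hypothetical)
true_mu = 5.0                 # hypothetical true mean used to generate data
data = [rng.gauss(true_mu, sigma) for _ in range(50)]
n = len(data)

mu_min, mu_max = -2.0, 15.0   # uniform prior range quoted on the slide

def log_posterior(mu):
    """Log posterior up to a constant: flat prior plus Gaussian likelihood."""
    if not (mu_min <= mu <= mu_max):
        return -math.inf      # zero prior probability outside the range
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

# Brute-force maximum on a grid over the prior range
step = 0.001
n_steps = int(round((mu_max - mu_min) / step))
grid = [mu_min + i * step for i in range(n_steps + 1)]
mu_grid = max(grid, key=log_posterior)

# Analytic results
mu_mean = sum(data) / n
err = sigma / math.sqrt(n)

# Error bar from the curvature of L at the maximum (central finite difference)
h = 1e-3
curv = (log_posterior(mu_mean + h) - 2 * log_posterior(mu_mean)
        + log_posterior(mu_mean - h)) / h ** 2
err_numeric = 1.0 / math.sqrt(-curv)

print(f"grid maximum {mu_grid:.3f}, sample mean {mu_mean:.3f}")
print(f"error from curvature {err_numeric:.4f}, sigma/sqrt(N) {err:.4f}")
```

As long as the sample mean falls inside the prior range, widening the range changes only the normalisation of the posterior, not the best estimate or its error bar.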

Common errors: ignored prior
After Stirzaker (1994) and Gauch (2003):
Blood test for a rare disease occurring by chance in 1:100,000. Test is quite accurate:
– Will tell if you have the disease 95% of the time, i.e. p = 0.95
– BUT also gives a false positive 0.5% of the time, i.e. p = 0.005
Q: if the test says you have the disease, what is the probability this diagnosis is correct?
– 80% of health experts questioned gave the wrong answer (Gauch, 2003: 211)
– Use 2-hypothesis form of Bayes’ Theorem

Common errors: ignored prior
Back to our disease test
Correct diagnosis only about 1 time in 500; the rest are false +ve!
For a disease as rare as this, the false positive rate (1:200) makes the test essentially useless
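The disease-test numbers can be run through the two-hypothesis form of Bayes' Theorem directly. This is a minimal sketch using the figures quoted for the test (prevalence 1:100,000, detection rate 0.95, false-positive rate 0.005); the function name `posterior` is illustrative.

```python
def posterior(prior_h, like_h, like_not_h):
    """Two-hypothesis Bayes' Theorem:
    P(H|D) = P(D|H) P(H) / [P(D|H) P(H) + P(D|~H) P(~H)]."""
    numerator = like_h * prior_h
    return numerator / (numerator + like_not_h * (1.0 - prior_h))

prior_disease = 1 / 100_000      # prevalence of the disease
p_pos_given_disease = 0.95       # test detects the disease 95% of the time
p_pos_given_healthy = 0.005      # false-positive rate of 0.5%

p_disease_given_pos = posterior(prior_disease,
                                p_pos_given_disease,
                                p_pos_given_healthy)

print(f"P(disease | +ve test) = {p_disease_given_pos:.5f}")
print(f"i.e. about 1 in {1 / p_disease_given_pos:.0f}")
```

The answer is a little under 0.2%, far from the 95% that ignoring the prior would suggest.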

What went wrong?
Knowledge of general population gives prior odds diseased:healthy of 1:100,000
Knowledge of +ve test gives likelihood odds 0.95:0.005, i.e. 190:1
Mistake is to base conclusion on likelihood odds alone
Prior odds completely dominate
– 0.005 x ~1 >> 0.95 x ~1x10^-5
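The same result falls out of the odds form of Bayes' Theorem, where posterior odds = likelihood ratio x prior odds. The sketch below assumes the figures quoted earlier for the test (prevalence 1:100,000, detection rate 0.95, false-positive rate 0.005); the variable names are illustrative.

```python
# Prior odds diseased:healthy (1 diseased for every 99,999 healthy)
prior_odds = 1 / 99_999

# Likelihood ratio: P(+ve | diseased) / P(+ve | healthy)
likelihood_ratio = 0.95 / 0.005

# Odds form of Bayes' Theorem
posterior_odds = likelihood_ratio * prior_odds

# Convert odds back to a probability
p_disease_given_pos = posterior_odds / (1.0 + posterior_odds)

print(f"likelihood ratio   = {likelihood_ratio:.0f}:1")
print(f"posterior odds     = 1:{1 / posterior_odds:.0f}")
print(f"P(disease | +ve)   = {p_disease_given_pos:.5f}")
```

A likelihood ratio of 190:1 in favour of disease is strong evidence, but it is multiplied by prior odds of roughly 1:100,000 against, so the posterior odds remain heavily against disease.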

The tragic case of Sally Clark
Two cot-deaths (SIDS), 1 year apart, aged 11 weeks and 8 weeks. Mother Sally Clark charged with double murder, tried and convicted in 1999
– Statistical evidence was misunderstood, “expert” testimony was wrong, and a fundamental logical fallacy was introduced
What happened? We can use Bayes’ Theorem to decide between 2 hypotheses
– H1 = Sally Clark committed double murder
– H2 = Two children DID die of SIDS

The tragic case of Sally Clark
Data? We observe there are 2 dead children
We need to decide which of H1 or H2 is more plausible, given D (and prior expectations)
i.e. want ratio P(H1|D) / P(H2|D), the odds of H1 being true compared to H2, GIVEN data and prior
The terms involved:
– Posterior: prob. of H1 or H2 given data D
– Likelihoods: prob. of getting data D IF H1 is true, or if H2 is true
– Very important: PRIOR probability, i.e. previous best guess
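The ratio described here is the odds form of Bayes' Theorem. Written out as a sketch in standard notation (H1, H2 and D as defined for this case), the normalising term P(D) cancels between numerator and denominator:

```latex
\[
\underbrace{\frac{P(H_1 \mid D)}{P(H_2 \mid D)}}_{\text{posterior odds}}
  \;=\;
\underbrace{\frac{P(D \mid H_1)}{P(D \mid H_2)}}_{\text{likelihood ratio}}
  \;\times\;
\underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
\]
```

Here the data D (two dead children) are certain under either hypothesis, so the likelihood ratio is 1 and the comparison rests entirely on the prior odds.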

The tragic case of Sally Clark
ERROR 1: events NOT independent
P(1 child dying of SIDS)? ~ 1:1300, but for an affluent, non-smoking mother > 26 yrs ~ 1:8500.
Prof. Sir Roy Meadow (expert witness):
– P(2 deaths)? 1:8500*8500 ~ 1:73 million.
– This was KEY to her conviction & is demonstrably wrong
– ~650,000 births a year in UK, so at 1:73M a double cot death is a 1 in 100 year event. BUT 1 or 2 occur every year. How come?? No one checked …
– NOT independent: P(2nd death | 1st death) is 5-10 times higher, i.e. 1:100 to 200, so P(H2) is actually (1/1300) x (5/1300) ~ 1:300000
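The arithmetic behind Error 1 is easy to reproduce. This is a minimal sketch using the rates quoted above; the figure of 650,000 births a year is an assumption, inferred from the "~30 cases a year ~ 1:21700" murder rate quoted later in the lecture.

```python
births_per_year = 650_000          # ASSUMPTION: inferred, see lead-in

# Naive calculation: treat the two deaths as independent
naive_one_in = 8500 * 8500         # 1 in 72,250,000 (quoted as ~1:73 million)
years_between_naive = naive_one_in / births_per_year

# Allowing for dependence: second SIDS death ~5x more likely after a first
p_first = 1 / 1300
p_second_given_first = 5 / 1300
p_double_sids = p_first * p_second_given_first
double_sids_per_year = births_per_year * p_double_sids

print(f"naive: 1 in {naive_one_in:,}, "
      f"one double cot death every {years_between_naive:.0f} years")
print(f"dependent: 1 in {1 / p_double_sids:,.0f}, "
      f"about {double_sids_per_year:.1f} double cot deaths per year")
```

The naive figure predicts roughly one double cot death a century, while the dependent figure predicts around two a year, which is much closer to what is actually observed.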

The tragic case of Sally Clark
ERROR 2: “Prosecutor’s Fallacy”
– 1:300000 is still VERY rare, so she’s unlikely to be innocent, right??
Meadow’s “Law”: ‘one cot death is a tragedy, two cot deaths is suspicious and, until the contrary is proved, three cot deaths is murder’
– WRONG: it is a fallacy to mistake the chance of a rare event for the chance that the defendant is innocent
In large samples, even rare events occur quite frequently: someone wins the lottery (1:14M) nearly every week
– With ~650,000 births a year, expect 2-3 double cot deaths…
AND we are ignoring rarity of double murder (H1)

The tragic case of Sally Clark
ERROR 3: ignoring odds of alternative (also very rare)
– Single child murder v. rare (~30 cases a year) BUT generally significant family/social problems, i.e. NOT like the Clarks.
– P(1 murder) ~ 30:650,000 i.e. 1:21700
– Double MUCH rarer, BUT a 2nd murder is ~200 x more likely given a first, so P(H1) ~ (1/21700 x 200/21700) ~ 1:2.4M
So, two very rare events, but double murder ~10 x rarer than double SIDS
So P(H1|D) / P(H2|D)?
– P(murder) : P(cot death) ~ 1:10, i.e. 10 x more likely to be double SIDS
– Says nothing about guilt & innocence, just relative probability
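The comparison of the two hypotheses can be reproduced from the quoted rates. This is a minimal sketch: it takes the likelihood of the data (two dead children) to be 1 under both hypotheses, so the posterior odds equal the prior odds, and it treats H1 and H2 as the only alternatives, which is a simplifying assumption.

```python
# H2: double SIDS (second death ~5x more likely after a first)
p_double_sids = (1 / 1300) * (5 / 1300)

# H1: double murder (second murder ~200x more likely after a first)
p_double_murder = (1 / 21700) * (200 / 21700)

# Odds of double SIDS relative to double murder
odds_sids_vs_murder = p_double_sids / p_double_murder

# If H1 and H2 were the only possibilities (ASSUMPTION), the posterior is:
p_sids_given_data = odds_sids_vs_murder / (1.0 + odds_sids_vs_murder)

print(f"P(double SIDS)   ~ 1 in {1 / p_double_sids:,.0f}")
print(f"P(double murder) ~ 1 in {1 / p_double_murder:,.0f}")
print(f"double SIDS is ~{odds_sids_vs_murder:.0f}x more likely")
print(f"P(H2 | D) ~ {p_sids_given_data:.2f}")
```

With unrounded inputs the ratio comes out nearer 7:1 than 10:1; the rounder figure follows from using 1:300,000 and 1:2.4M. Either way, the conclusion is the same: double SIDS is several times more probable than double murder.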

The tragic case of Sally Clark
Sally Clark acquitted in 2003 after 2nd appeal (but not on statistical fallacies), after 3 yrs in prison; died of alcohol poisoning in 2007
– Meadow’s “Law” redux: triple murder v triple SIDS?
In fact, P(triple murder | 2 previous) : P(triple SIDS | 2 previous) ~ ((21700 x 123) x 10) / ((1300 x 228) x 50) = 1.8:1
So P(triple murder) > P(triple SIDS), but not by much
Meadow’s ‘Law’ should be:
– ‘when three sudden deaths have occurred in the same family, statistics give no strong indication one way or the other as to whether the deaths are more or less likely to be SIDS than homicides’
From: Hill, R. (2004) Multiple sudden infant deaths – coincidence or beyond coincidence, Paediatric and Perinatal Epidemiology, 18,

Common errors: reversed conditional
After Stewart (1996) & Gauch (2003: 212):
– Boy? Girl? Assume P(B) = P(G) = 0.5 and independent
– For a family with 2 children, what is P that the other is a girl, given that one is a girl?
4 possible combinations, each with P = 0.25: BB, BG, GB, GG
Can’t be BB, and only 1 of the 3 remaining is GG
So P(B):P(G) for the other child is now 2:1
– Using Bayes’ Theorem: X = at least 1 G, Y = GG
– P(X) = ¾, P(Y) = ¼ and P(X|Y) = 1, and so P(Y|X) = P(X|Y) P(Y) / P(X) = 1/3
Stewart, I. (1996) The Interrogator’s Fallacy, Sci. Am., 275(3),
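The two-children puzzle is small enough to check by brute-force enumeration and by Bayes' Theorem side by side. This is a minimal sketch assuming, as stated, that boys and girls are equally likely and births are independent.

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families: BB, BG, GB, GG
families = list(product("BG", repeat=2))

# Condition on X: at least one girl
at_least_one_girl = [f for f in families if "G" in f]

# Among those, Y: both are girls
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]

p_enumerated = Fraction(len(both_girls), len(at_least_one_girl))

# Same answer via Bayes' Theorem: P(Y|X) = P(X|Y) P(Y) / P(X)
p_x = Fraction(len(at_least_one_girl), len(families))   # 3/4
p_y = Fraction(1, len(families))                        # 1/4
p_x_given_y = Fraction(1)                               # GG always has a girl
p_bayes = p_x_given_y * p_y / p_x

print(f"P(other is a girl | at least one girl) = {p_enumerated}")
print(f"via Bayes' Theorem                     = {p_bayes}")
```

The intuitive answer of 1/2 is the answer to a different question, namely P(second child is a girl | first child is a girl), where the conditioning event picks out a specific child.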

Common errors: reversed conditional
Easy to forget that order does matter with conditional Ps
– As P(X AND Y) = P(Y AND X) and P(X OR Y) = P(Y OR X), but P(X|Y) ≠ P(Y|X) in general, as this is cause & effect
– Gauch (2003) notes use of “when” in incorrectly phrasing Q: For a family with 2 children, what is P that the other is a girl, when one is a girl?
– P(X when Y) is not defined
– It is not P(X|Y), nor is it P(Y|X) or even P(X AND Y)

Common errors: reversed conditional
Relates to Prosecutor’s Fallacy again
– Stewart uses DNA match example
– What is P(match|innocent), i.e. prob. suspect’s DNA sample matches that from crime scene, given they are innocent?
– But this is the wrong question. SHOULD ask:
– What is P(innocent|match), i.e. prob. suspect is innocent, given a DNA match?
Note Bayesian approach: we can’t calculate likelihood of innocence (1st case), but we can estimate likelihood of DNA match, given priors
Evidence: DNA match of all markers P(match|innocent) = 1: BUT question jury must answer is P(innocent|match). Priors?
– Genetic history and structure of population of possible perpetrators
– Typically means evidence about as strong as you get from match using half genetic markers, but ignoring population structure
– Evidence combines multiplicatively, so strength goes up as ~ (no. markers)^(1/2)
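The gap between P(match|innocent) and P(innocent|match) can be illustrated with a toy calculation. Everything numerical below is hypothetical and chosen only for illustration: a random-match probability of 1 in a million, and a pool of possible perpetrators of either one million or one thousand people, exactly one of whom is guilty and all equally likely a priori. The function name is also illustrative.

```python
def p_innocent_given_match(p_match_given_innocent, n_possible_perpetrators):
    """Bayes' Theorem for a DNA match.

    ASSUMPTIONS (illustrative only): exactly one guilty person among
    n_possible_perpetrators, a uniform prior over them, and the guilty
    person's DNA always matches.
    """
    prior_guilty = 1.0 / n_possible_perpetrators
    prior_innocent = 1.0 - prior_guilty
    p_match = 1.0 * prior_guilty + p_match_given_innocent * prior_innocent
    return p_match_given_innocent * prior_innocent / p_match

random_match = 1e-6   # HYPOTHETICAL P(match | innocent)

wide_pool = p_innocent_given_match(random_match, 1_000_000)
narrow_pool = p_innocent_given_match(random_match, 1_000)

print(f"pool of 1,000,000: P(innocent | match) = {wide_pool:.3f}")
print(f"pool of 1,000:     P(innocent | match) = {narrow_pool:.5f}")
```

The same match probability gives very different answers depending on the prior, which is why the size and genetic structure of the population of possible perpetrators matter so much.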

Common errors: reversed conditional
If P(innocent|match) ~ 1: then P(match|innocent) ~ 1:1000
Other priors? Strong local ethnic identity? Many common ancestors within yrs (isolated rural areas maybe)?
P(match|innocent) >> 1:1000, maybe 1:100
Says nothing about innocence, but a jury must consider whether the DNA evidence establishes guilt beyond reasonable doubt

Aside: the problem with P values
Significance testing and P-values are widespread BUT they tell you nothing about the effect you’re interested in (see e.g. Siegfried (2010)).
All a P value of < 0.05 can say is:
– There is a less than 5% chance of obtaining the observed (or more extreme) result even if no real effect exists, i.e. if the null hypothesis is correct
Two possible conclusions remain:
– i) there is a real effect
– ii) the result is an improbable (1 in 20) fluke
BUT the P value cannot tell you which is which
If P > 0.05 then also two conclusions:
– i) there is no real effect
– ii) the test is not capable of discriminating a weak effect
Figure caption: A P value is the probability of an observed (or more extreme) result arising only from chance. Credit: S. Goodman, adapted by A. Nandy
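The "1 in 20 fluke" can be seen directly by simulating many experiments in which the null hypothesis is true by construction. This is a minimal sketch, not from the original slides: it uses a two-sided z-test on the mean of Gaussian data with known σ = 1, and the sample size and number of experiments are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(42)          # fixed seed so the sketch is repeatable
std_normal = NormalDist()        # standard normal, used for the z-test

def two_sided_p(sample, sigma=1.0):
    """Two-sided P value for H0: mean = 0, with known sigma (z-test)."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))

n_experiments = 4000             # arbitrary
n_per_experiment = 20            # arbitrary

# Every experiment draws from N(0, 1): there is NO real effect
p_values = [
    two_sided_p([rng.gauss(0.0, 1.0) for _ in range(n_per_experiment)])
    for _ in range(n_experiments)
]

frac_significant = sum(p < 0.05 for p in p_values) / n_experiments
print(f"fraction of null experiments with P < 0.05: {frac_significant:.3f}")
```

Roughly 5% of these no-effect experiments come out "significant". The P value alone cannot say whether a given significant result is one of those flukes or a real effect; that depends on the prior plausibility of the effect.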