Significance testing and confidence intervals
Col Naila Azam

RECAP OF BIOSTATISTICS
Types of data, variables, census, research, and distributions of data (normal, skewed)
Description of data: measures of central tendency and dispersion, and their implications for describing various types of data
Concept of standard deviation vs. standard error, and confidence limits / confidence intervals

Learning objectives
To understand the concept of hypothesis formulation
To delineate the implications of the different types of hypotheses (one-tailed / two-tailed) on research results in epidemiology
To grasp the concept of the p-value in result interpretation
To clarify the concept of the critical value in a data distribution and the possible errors

The idea of statistical inference
[diagram: conclusions based on the sample are generalised to the population by means of hypotheses]

Inferential statistics
Uses patterns in the sample data to draw inferences about the population represented, accounting for randomness.
Two basic approaches: hypothesis testing and estimation.
Common goal: conclude on the effect of an independent variable (exposure) on a dependent variable (outcome).

The aim of a statistical test
To reach a scientific decision ("yes" or "no") about a difference (or effect), on a probabilistic basis, from observed data.

Why significance testing?
Gastroenteritis outbreak in Karachi: "The risk of illness was higher among diners who ate home-preserved pickles (RR = 3.6)."
Is the association due to chance?

The two hypotheses!
Null hypothesis (H0): there is NO difference between the two groups (= no effect), e.g. RR = 1.
Alternative hypothesis (H1): there is a difference between the two groups (= there is an effect), e.g. RR = 3.6.
When you perform a test of statistical significance, you either reject or do not reject the null hypothesis (H0).

Gastroenteritis outbreak in Karachi
Null hypothesis (H0): "There is no association between consumption of green pickles and gastroenteritis."
Alternative hypothesis (H1): "There is an association between consumption of green pickles and gastroenteritis."
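
A minimal sketch of how such a hypothesis could be tested in practice. The 2×2 counts below are invented for illustration (they are not the actual outbreak data) and are chosen so that the risk ratio works out to 3.6; a chi-square test then gives the p-value used to decide between H0 and H1.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table (illustrative counts only, not the actual outbreak data),
# chosen so the risk ratio comes out at 3.6. Rows: exposed / unexposed; columns: ill / not ill.
table = np.array([[36, 64],    # ate pickles:         36 ill, 64 not ill
                  [10, 90]])   # did not eat pickles: 10 ill, 90 not ill

# Risk ratio = risk of illness in the exposed / risk in the unexposed
risk_exposed = table[0, 0] / table[0].sum()      # 36/100 = 0.36
risk_unexposed = table[1, 0] / table[1].sum()    # 10/100 = 0.10
rr = risk_exposed / risk_unexposed               # 3.6

# Chi-square test of H0 ("no association") against H1 ("association")
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"RR = {rr:.1f}, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```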

Hypothesis testing and the null hypothesis
Tests of statistical significance:
Data not consistent with H0: H0 can be rejected in favour of some alternative hypothesis H1 (the objective of our study).
Data consistent with H0: H0 cannot be rejected.
You cannot say that H0 is true. You can only decide to reject it or not reject it.

How to decide when to reject the null hypothesis?
H0 is rejected using the reported p-value.
p-value = probability that our result (e.g. a difference between proportions, or an RR) or more extreme values could be observed under the null hypothesis.

p-values – practicalities
Small p-values = low degree of compatibility between H0 and the observed data: you reject H0, the test is significant.
Large p-values = high degree of compatibility between H0 and the observed data: you do not reject H0, the test is not significant.
We can never reduce to zero the probability that our result was observed by chance alone.

Levels of significance – practicalities
We need a cut-off!
p-value > 0.05 = H0 not rejected (not significant)
p-value ≤ 0.05 = H0 rejected (significant)
BUT: always give the exact p-value rather than just "significant" vs. "non-significant".

Examples from the literature
"The limit for statistical significance was set at p=0.05."
"There was a strong relationship (p<0.001)."
"…, but it did not reach statistical significance (ns)."
"The relationship was statistically significant (p=0.0361)."
p = 0.05 is an agreed convention, not an absolute truth.
"Surely, God loves the 0.06 nearly as much as the 0.05" (Rosnow and Rosenthal, 1991)

p = 0.05 and its errors
The level of significance, usually p = 0.05, is the p-value used for decision making.
But there are still 2 possible errors:
H0 should not have been rejected, but it was rejected: Type I or alpha error.
H0 should have been rejected, but it was not rejected: Type II or beta error.

Keep in mind that committing a Type I error OR a Type II error can be VERY bad, depending on the problem.

Types of errors (decision based on the p-value vs. the truth)
H0 is "true" but rejected: Type I or α error.
H0 is "false" but not rejected: Type II or β error.
[table: decision (reject / do not reject H0) vs. truth (no difference / difference)]

More on errors
Probability of Type I error:
– the value of α is determined in advance of the test;
– the significance level is the level of α error that we would accept (usually 0.05).
Probability of Type II error:
– the value of β depends on the size of the effect (e.g. RR, OR) and on the sample size;
– 1 − β is the statistical power of a study to detect an effect of a specified size (e.g. 0.80);
– fix β in advance: choose an appropriate sample size (a minimal power sketch follows below).
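
To make the link between α, β and sample size concrete, here is a minimal sketch of a normal-approximation power calculation for a two-sided one-sample z-test; the true difference, σ and n used below are invented, illustrative values, not taken from the lecture.

```python
from scipy.stats import norm

# Normal-approximation power of a two-sided one-sample z-test.
# true_diff, sigma and n are illustrative assumptions.
def power_one_sample_z(true_diff, sigma, n, alpha=0.05):
    se = sigma / n ** 0.5                  # standard error of the sample mean
    z_crit = norm.ppf(1 - alpha / 2)       # e.g. 1.96 for alpha = 0.05
    # Probability that the test statistic falls in the rejection region
    # when the true difference from the null value is true_diff.
    return norm.cdf(-z_crit + abs(true_diff) / se) + norm.cdf(-z_crit - abs(true_diff) / se)

print(power_one_sample_z(true_diff=20, sigma=75, n=25))   # about 0.27
print(power_one_sample_z(true_diff=20, sigma=75, n=100))  # larger n -> higher power
```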

Even more on errors
[figure: sampling distributions of the test statistic T under H0 and H1, with the α and β error regions]

Principles of significance testing
Formulate the H0.
Test your sample data against H0.
The p-value tells you whether your data are consistent with H0, i.e. whether your sample data are consistent with a chance finding (large p-value), or whether there is reason to believe that there is a true difference (association) between the groups you tested.
You can only reject H0, or fail to reject it!

Interpreting the p-value…
The smaller the p-value, the stronger the evidence against H0:
overwhelming evidence (highly significant) – strong evidence (significant) – weak evidence (not significant) – no evidence (not significant).
Example: p = 0.0069.

Conclusions of a test of hypothesis…
If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
If we fail to reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true. This does not mean that we have proven that the null hypothesis is true!

FORMULAE FOR ESTIMATION OF THE STANDARD ERROR (SE) OF A SAMPLE
1. SE of a sample mean = SD / √n
2. SE of a sample proportion (p) = √(pq / n), where q = 1 − p
3. SE of the difference between two means, SE(d) = √(SD1²/n1 + SD2²/n2)
4. SE of the difference between two proportions = √(p1q1/n1 + p2q2/n2)
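
These formulas translate directly into code; the sketch below simply restates them as small Python helpers (the example numbers at the end are illustrative).

```python
import math

# Standard error (SE) formulas from the slide, as small helper functions.
def se_mean(sd, n):
    """SE of a sample mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """SE of a sample proportion: sqrt(p * q / n), where q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(sd1, n1, sd2, n2):
    """SE of the difference between two means: sqrt(SD1^2/n1 + SD2^2/n2)."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """SE of the difference between two proportions: sqrt(p1*q1/n1 + p2*q2/n2)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Example (illustrative numbers): SE of a mean with SD = 75 and n = 25
print(se_mean(75, 25))   # 15.0
```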

Concepts of hypothesis testing (1)…
The two hypotheses are called the null hypothesis and the alternative (or research) hypothesis. The usual notation is:
H0 – the 'null' hypothesis (pronounced "H nought")
H1 – the 'alternative' or 'research' hypothesis
The null hypothesis (H0) will always state that the parameter equals the value specified in the alternative hypothesis (H1).

Concepts of hypothesis testing…
Consider mean demand for computers during assembly lead time. Rather than estimate the mean demand, our operations manager wants to know whether the mean is different from 350 units. In other words, someone is claiming that the mean is 350 units and we want to check this claim to see if it appears reasonable.
Recall that the standard deviation [σ] was assumed to be 75, the sample size [n] was 25, and the sample mean [x̄] was calculated to be 370.16.

We rephrase the hypotheses as:
Null hypothesis (the mean is 350): H0: μ = 350
Research hypothesis: H1: μ ≠ 350
with σ = 75, sample size n = 25, and sample mean x̄ calculated to be 370.16.

Concepts of hypothesis testing…
For example, if we're trying to decide whether the mean is not equal to 350, a large value of x̄ (say, 600) would provide enough evidence. If x̄ is close to 350 (say, 355) we could not say that this provides a great deal of evidence to infer that the population mean is different from 350.

Concepts of hypothesis testing (4)…
The two possible decisions that can be made:
Conclude that there is enough evidence to support the alternative hypothesis (also stated as: reject the null hypothesis in favor of the alternative).
Conclude that there is not enough evidence to support the alternative hypothesis (also stated as: failing to reject the null hypothesis in favor of the alternative).
NOTE: we do not say that we accept the null hypothesis if a statistician is around…

Concepts of hypothesis testing (2)…
The testing procedure begins with the assumption that the null hypothesis is true. Thus, until we have further statistical evidence, we will assume:
H0: μ = 350 (assumed to be TRUE)
The next step is to determine the sampling distribution of the sample mean x̄ assuming the true mean is 350: x̄ is normal with mean 350 and standard error σ/√n = 75/√25 = 15.

Is the sample mean in the "guts" of the sampling distribution?

Three ways to determine this – first way
1. Unstandardized test statistic: is x̄ in the guts of the sampling distribution? Depends on what you define as the "guts" of the sampling distribution.
If we define the guts as the center 95% of the distribution [this means α = 0.05], then the critical values that define the guts will be 1.96 standard deviations of x̄ on either side of the mean of the sampling distribution [350]:
UCV = 350 + 1.96×15 = 350 + 29.4 = 379.4
LCV = 350 − 1.96×15 = 350 − 29.4 = 320.6
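
A minimal sketch of this critical-value check in code, using the example's numbers (σ = 75, n = 25, hypothesised mean 350, observed sample mean 370.16):

```python
from scipy.stats import norm

# Critical-value ("guts") check for the computer-demand example.
mu0, sigma, n, x_bar, alpha = 350, 75, 25, 370.16, 0.05
se = sigma / n ** 0.5                    # 15.0
z_crit = norm.ppf(1 - alpha / 2)         # about 1.96

lcv = mu0 - z_crit * se                  # about 320.6
ucv = mu0 + z_crit * se                  # about 379.4
print(f"LCV = {lcv:.1f}, UCV = {ucv:.1f}, reject H0: {not (lcv < x_bar < ucv)}")
# -> LCV = 320.6, UCV = 379.4, reject H0: False
```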

Unstandardized Test Statistic Approach

Three ways to determine this – second way
2. Standardized test statistic: since we defined the "guts" of the sampling distribution to be the center 95% [α = 0.05]:
If the z-score for the sample mean is greater than 1.96, we know that x̄ will be in the reject region on the right side; if the z-score for the sample mean is less than −1.96, we know that x̄ will be in the reject region on the left side.
Z = (x̄ − μ)/(σ/√n) = (370.16 − 350)/15 = 1.344
Is this z-score in the guts of the sampling distribution?
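
The same decision expressed with the standardized statistic (a small sketch, numbers as above):

```python
# Standardized test statistic for the same example.
mu0, se, x_bar, z_crit = 350, 15.0, 370.16, 1.96
z = (x_bar - mu0) / se                               # (370.16 - 350) / 15 = 1.344
print(f"z = {z:.3f}, reject H0: {abs(z) > z_crit}")  # z = 1.344, reject H0: False
```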

Standardized Test Statistic Approach

Three ways to determine this – third way
3. The p-value approach (which is generally used with a computer and statistical software): increase the "rejection region" until it "captures" the sample mean.
For this example, since x̄ is to the right of the mean, calculate P(x̄ > 370.16) = P(Z > 1.344) = 0.0901. Since this is a two-tailed test, you must double this area for the p-value: p-value = 2 × 0.0901 = 0.1802.
Since we defined the guts as the center 95% [α = 0.05], the reject region is the other 5%. Since our sample mean, 370.16, is in the 18.02% region, it cannot be in our 5% rejection region [α = 0.05].
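
And the two-tailed p-value itself (a sketch with the same assumed numbers):

```python
from scipy.stats import norm

# Two-tailed p-value for the same example.
mu0, se, x_bar = 350, 15.0, 370.16
z = (x_bar - mu0) / se                   # 1.344
p_value = 2 * norm.sf(abs(z))            # 2 * P(Z > 1.344), about 0.18
print(f"p = {p_value:.4f}, reject H0 at alpha = 0.05: {p_value <= 0.05}")
```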

p-value approach

Statistical conclusions:
Unstandardized test statistic: since LCV (320.6) < x̄ (370.16) < UCV (379.4), we fail to reject the null hypothesis at a 5% level of significance.
Standardized test statistic: since −Zα/2 (−1.96) < Z (1.344) < Zα/2 (1.96), we fail to reject the null hypothesis at a 5% level of significance.
p-value: since the p-value (0.1802) > 0.05 [α], we fail to reject the null hypothesis at a 5% level of significance.

Criticism of significance testing
"Epidemiological applications need more than a decision as to whether chance alone could have produced association." (Rothman et al. 2008)
→ Estimation of an effect measure (e.g. RR, OR) rather than significance testing.

The epidemiologist needs measurements rather than probabilities
χ² is a test of association.
OR and RR are measures of association on a continuous scale → an infinite number of possible values.
The best estimate = the point estimate.
A range of values allowing for random variability: the confidence interval → precision of the point estimate.

Confidence interval (CI)
A range of values, on the basis of the sample data, in which the population value (or true value) may lie.
Frequently used formulation: "If the data collection and analysis could be replicated many times, the CI should include the true value of the measure 95% of the time."

Confidence interval (CI)
Indicates the amount of random error in the estimate.
Can be calculated for any "test statistic", e.g. means, proportions, ORs, RRs.
E.g. the 95% CI for a mean: 95% CI = x̄ − 1.96 SE up to x̄ + 1.96 SE (α = 5%).
[figure: normal curve with the central 1 − α area between the lower and upper limits of the 95% CI, and α/2 in each tail]
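
A minimal sketch of this CI formula applied to the earlier computer-demand numbers (x̄ = 370.16, σ = 75, n = 25, values assumed from that example):

```python
from scipy.stats import norm

# 95% CI for a mean: x_bar +/- 1.96 * SE.
x_bar, sigma, n, alpha = 370.16, 75, 25, 0.05
se = sigma / n ** 0.5                              # 15.0
z = norm.ppf(1 - alpha / 2)                        # about 1.96
lower, upper = x_bar - z * se, x_bar + z * se
print(f"95% CI: {lower:.1f} to {upper:.1f}")       # about 340.8 to 399.6
```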

CI terminology
RR = 1.45 (0.99 – 2.1): the point estimate is 1.45; the confidence interval is 0.99 – 2.1; the lower confidence limit is 0.99 and the upper confidence limit is 2.1.

Width of the confidence interval depends on …
the amount of variability in the data;
the size of the sample;
the arbitrary level of confidence you desire for your study (usually 90%, 95%, 99%).
A common way to use the CI regarding an OR/RR is: if 1.0 is included in the CI → not significant; if 1.0 is not included in the CI → significant.
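
A short sketch of this rule applied to a risk ratio: it computes an approximate Wald CI for the RR on the log scale and checks whether 1.0 falls inside. The counts are the same hypothetical ones used in the earlier pickles sketch, not real outbreak data.

```python
import math

# Approximate 95% CI for a risk ratio (Wald CI on the log scale),
# using hypothetical 2x2 counts for illustration.
a, b = 36, 64    # exposed:   ill, not ill
c, d = 10, 90    # unexposed: ill, not ill

rr = (a / (a + b)) / (c / (c + d))                        # 3.6
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))  # SE of ln(RR)
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

significant = not (lower <= 1.0 <= upper)                 # 1.0 outside the CI -> "significant"
print(f"RR = {rr:.1f}, 95% CI {lower:.2f} to {upper:.2f}, significant: {significant}")
```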

What we have to evaluate the study
χ² – a test of association; it depends on sample size.
p-value – probability that equal (or more extreme) results can be observed by chance alone.
OR, RR – direction and strength of association: if > 1, risk factor; if < 1, protective factor (independently of sample size).
CI – magnitude and precision of the effect.

Comments on p-values and CIs
Presence of significance does not prove the clinical or biological relevance of an effect.
A lack of significance is not necessarily a lack of an effect: "Absence of evidence is not evidence of absence".

Comments on p-values and CIs
A huge effect in a small sample or a small effect in a large sample can result in identical p-values.
A statistical test will always give a significant result if the sample is big enough.
p-values and CIs do not provide any information on the possibility that the observed association is due to bias or confounding.

THANK YOU FOR APPRECIATING THE LOGIC OF BIOSTATISTICS