Thomas Songer, PhD with acknowledgment to several slides provided by M Rahbar and Moataza Mahmoud Abdel Wahab Introduction to Research Methods In the Internet.

Presentation transcript:

Thomas Songer, PhD, with acknowledgment to several slides provided by M Rahbar and Moataza Mahmoud Abdel Wahab
Introduction to Research Methods In the Internet Era
Inferential Statistics: Hypothesis Testing
Introduction to Biostatistics

Key Lecture Concepts
- Assess the role of random error (chance) as an influence on the validity of a statistical association
- Identify the role of the p-value in statistical assessments
- Identify the role of the confidence interval in statistical assessments
- Briefly introduce the tests to undertake

Research Process (Polgar, Thomas)
1. Research question
2. Hypothesis
3. Identify research design
4. Data collection
5. Presentation of data
6. Data analysis
7. Interpretation of data

Interpreting Results
When evaluating an association between disease and exposure, we need guidelines to help determine whether there is a true difference in the frequency of disease between the two exposure groups, or only random variation arising from the study sample.

Random Error (Chance)
1. Rarely can we study an entire population, so inference is attempted from a sample of the population.
2. There will always be random variation from sample to sample.
3. In general, smaller samples have less precision, reliability, and statistical power (more sampling variability).

Hypothesis Testing
The process of deciding statistically whether the findings of an investigation reflect chance or real effects at a given level of probability.

Elements of Testing a Hypothesis
- Null hypothesis
- Alternative hypothesis
- Identify level of significance
- Test statistic
- Identify p-value / confidence interval
- Conclusion

Hypothesis Testing
H0: There is no association between the exposure and the disease of interest.
H1: There is an association between the exposure and the disease of interest.
Note: With prudent skepticism, the null hypothesis is given the benefit of the doubt until the data convince us otherwise.

Hypothesis Testing
Because of statistical uncertainty regarding inferences about population parameters based upon sample data, we cannot prove or disprove either the null or the alternative hypothesis as directly representing the population effect. Thus, we make a decision based on probability and accept a probability of making an incorrect decision. (Chernick)

Associations
Two types of pitfalls can occur that affect the association between exposure and disease:
- Type I error: observing a difference when in truth there is none
- Type II error: failing to observe a difference where there is one

Interpreting Epidemiologic Results
Four possible outcomes of any epidemiologic study:

                                     REALITY
YOUR DECISION                        H0 true (no assoc.)     H1 true (yes assoc.)
Do not reject H0 (not stat. sig.)    Correct decision        Type II (beta error)
Reject H0 (stat. sig.)               Type I (alpha error)    Correct decision

Four possible outcomes of any epidemiologic study, in plain terms:

                                     REALITY
YOUR DECISION                        H0 true (no assoc.)                       H1 true (yes assoc.)
Do not reject H0 (not stat. sig.)    Correct decision                          Failing to find a difference when one exists
Reject H0 (stat. sig.)               Finding a difference when there is none   Correct decision

Type I and Type II Errors
α is the probability of committing a type I error.
β is the probability of committing a type II error.

“Conventional” Guidelines
Set the fixed alpha level (Type I error) to 0.05. This means that, if the null hypothesis is true, the probability of incorrectly rejecting it is 5% or less.
[Table: decision (do not reject H0 / reject H0) against study result (H0 true / H1 true), highlighting the Type I (alpha error) cell, where H0 is rejected although it is true]
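The meaning of a fixed alpha level can be checked with a small simulation. The sketch below is an illustration only, not part of the original slides: it assumes a simple z-test on samples drawn from a population where the null hypothesis is true (mean 0, known standard deviation 1), and the function names, sample sizes, and seed are made up for the example. If alpha is 0.05, roughly 5% of such studies should reject H0 purely by chance.

```python
import math
import random

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard Normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_type_i_rate(n_studies=2000, n_per_study=30, alpha=0.05, seed=1):
    """Fraction of simulated studies that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        # Hypothetical study: sample from a population with true mean 0, SD 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        mean = sum(sample) / n_per_study
        z = mean / (1.0 / math.sqrt(n_per_study))  # z-test with known SD = 1
        if two_sided_p_from_z(z) < alpha:
            rejections += 1
    return rejections / n_studies

type_i_rate = simulate_type_i_rate()
print(f"Share of true-null studies rejected at alpha = 0.05: {type_i_rate:.3f}")
```

The printed share should sit near 0.05; every one of those rejections is a type I error, because no real effect exists in the simulated population.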

Empirical Rule
For a Normal distribution, approximately:
a) 68% of the measurements fall within one standard deviation of the mean
b) 95% of the measurements fall within two standard deviations of the mean
c) 99.7% of the measurements fall within three standard deviations of the mean
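The three percentages in the empirical rule can be reproduced from the Normal distribution itself. This is a minimal sketch using only Python's standard library; the helper name `share_within` is chosen for the example.

```python
import math

def share_within(k):
    """Share of a Normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.4f}")

# Area in one tail beyond 2 standard deviations.
upper_tail_beyond_2sd = (1 - share_within(2)) / 2
print(f"one tail beyond 2 SD: {upper_tail_beyond_2sd:.4f}")
```

The shares come out close to 68%, 95%, and 99.7%, and the single tail beyond two standard deviations is about 2.3%, which is why two standard deviations is a common shorthand for a 5% two-sided alpha.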

[Figure: Normal distribution curve. Area labels: 50% on each side of the mean; 34.13%, 13.59%, and 2.28% in successive standard-deviation bands (α usually set at 5%)]

Random Error (Chance)
4. A test statistic is computed to assess “statistical significance”, that is, the degree to which the data are compatible with the null hypothesis of no association.
5. Given a test statistic and an observed value, you can compute the probability of observing a value as extreme or more extreme than the observed value under the null hypothesis of no association. This probability is called the “p-value”.

Random Error (Chance)
6. By convention, if p < 0.05, then the association between the exposure and disease is considered to be “statistically significant” (i.e., we reject the null hypothesis (H0) and accept the alternative hypothesis (H1)).

p-value (Sever)
- The probability that an effect at least as extreme as that observed could have occurred by chance alone, given that there is truly no relationship between exposure and disease (H0)
- Informally, the probability that the observed results occurred by chance; strictly, it is calculated assuming H0 is true and is not the probability that H0 is true
- The probability that the sample estimates of association differ this much only because of sampling variability

Random Error (Chance)
What does p < 0.05 mean?
- Indirectly, it means that we suspect that the magnitude of effect observed (e.g. odds ratio) is not due to chance alone (in the absence of biased data collection or analysis).
- Directly, p = 0.05 means that, if there were truly no association, one test result out of twenty would be expected to be at least this extreme due to chance (random error) alone.

Example:

        D+    D-    Total
E+      15    85    100
E-      10    90    100

I(E+) = 15 / (15 + 85) = 0.15
I(E-) = 10 / (10 + 90) = 0.10
RR = I(E+) / I(E-) = 1.5, p = 0.30

Although it appears that the incidence of disease may be higher in the exposed than in the non-exposed (RR = 1.5), the p-value of 0.30 exceeds the fixed alpha level of 0.05. This means that the observed data are relatively compatible with the null hypothesis. Thus, we do not reject H0 in favor of H1 (alternative hypothesis).
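The arithmetic in this example can be sketched in a few lines. The code assumes the counts implied by the stated incidences (15 of 100 exposed and 10 of 100 unexposed developing disease) and uses an uncorrected Pearson chi-square test; the slide does not say which test produced p = 0.30, so the choice of test and the function names are assumptions made for illustration.

```python
import math

def risk_ratio(a, b, c, d):
    """RR for a 2x2 table: a, b = exposed with/without disease; c, d = unexposed."""
    return (a / (a + b)) / (c / (c + d))

def chi_square_test(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and p-value (1 degree of freedom)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

rr = risk_ratio(15, 85, 10, 90)
chi2, p_value = chi_square_test(15, 85, 10, 90)
print(f"RR = {rr:.2f}, chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value lands a little below 0.30, close to the figure on the slide and well above 0.05, so the conclusion (do not reject H0) is the same.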

Random Error (Chance)
Take note:
- The p-value reflects both the magnitude of the difference between the study groups AND the sample size.
- The size of the p-value does not indicate the importance of the results.
- Results may be statistically significant but clinically unimportant.
- Results that are not statistically significant may still be important.
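The dependence of the p-value on sample size can be shown by holding the effect fixed and changing only the number of subjects. In this sketch both hypothetical studies have incidences of 15% and 10% (risk ratio 1.5), but one has 100 subjects per group and the other 1,000. The counts and the use of an uncorrected chi-square test are assumptions for illustration.

```python
import math

def chi_square_p(a, b, c, d):
    """p-value from an uncorrected chi-square test on a 2x2 table (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))

# Same incidences (0.15 vs 0.10), different sample sizes.
p_small = chi_square_p(15, 85, 10, 90)        # 100 per group
p_large = chi_square_p(150, 850, 100, 900)    # 1,000 per group

print(f"100 per group:   p = {p_small:.3f}")
print(f"1,000 per group: p = {p_large:.5f}")
```

The risk ratio is identical in both studies, yet only the larger one falls below 0.05, which is why a p-value by itself says little about the size or importance of an effect.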

Sometimes we are more concerned with estimating the true difference than with deciding whether the difference between samples is statistically significant.

Random Error (Chance)
A related, but more informative, measure known as the confidence interval (CI) can also be calculated.
CI = a range of values within which the true population value falls, with a certain degree of assurance (probability).

Confidence Interval - Definition (Sever)
- A range of values for a variable, constructed so that this range has a specified probability of including the true value of the variable
- A measure of the study’s precision
[Figure: interval marked with lower limit, point estimate, and upper limit]

Statistical Measures of Chance (Sever)
Confidence interval
- A 95% C.I. means that the true value of the effect (mean, risk, rate) lies within about 2 standard errors of the sample estimate 95 times out of 100.
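As a rough illustration of the "2 standard errors" rule, the sketch below builds a 95% confidence interval for a mean from summary statistics. The numbers (mean 120, standard deviation 15, n = 100) are hypothetical, and the multiplier 1.96 is the large-sample Normal approximation; a small sample would normally call for a t-distribution multiplier instead.

```python
import math

def mean_confidence_interval(mean, sd, n, z=1.96):
    """Approximate 95% CI for a mean: estimate +/- z standard errors."""
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean - z * se, mean + z * se

# Hypothetical summary statistics, chosen only for illustration.
lower, upper = mean_confidence_interval(mean=120.0, sd=15.0, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Here the standard error is 1.5, so the interval runs from 117.06 to 122.94.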

Interpreting Results
- Confidence Interval: Range of values for a point estimate that has a specified probability of including the true value of the parameter.
- Confidence Level: (1.0 - α), usually expressed as a percentage (e.g. 95%).
- Confidence Limits: The upper and lower end points of the confidence interval.

Hypothetical Example of 95% Confidence Interval
Exposure: Caffeine intake (high versus low)
Outcome: Incidence of breast cancer
Risk Ratio: 1.32 (point estimate)
p-value: 0.14 (not statistically significant)
95% C.I.: 0.87 to 1.98
[Figure: number line showing the 95% confidence interval spanning the null value of 1.0]

Random Error (Chance)
INTERPRETATION: Our best estimate is that women with high caffeine intake are 1.32 times as likely (32% more likely) to develop breast cancer as women with low caffeine intake. However, we are 95% confident that the true value (risk ratio) in the population lies between 0.87 and 1.98 (assuming an unbiased study).
[Figure: number line showing the 95% confidence interval spanning the null value]

Random Error (Chance)
Interpretation:
- If the 95% confidence interval does NOT include the null value of 1.0 (p < 0.05), then we declare a “statistically significant” association.
- If the 95% confidence interval includes the null value of 1.0, then the test result is “not statistically significant.”
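The link between a confidence interval and the null value can be sketched for a risk ratio. This example assumes a hypothetical cohort with 15 cases among 100 exposed and 10 cases among 100 unexposed, and it uses the common log-based (Wald) interval for a risk ratio; the slides do not specify a method, so both the data and the formula choice are assumptions.

```python
import math

def risk_ratio_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log_rr = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

rr, lower, upper = risk_ratio_ci(15, 100, 10, 100)
includes_null = lower <= 1.0 <= upper   # null value for a ratio measure is 1.0

print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("not statistically significant" if includes_null else "statistically significant")
```

The interval runs from roughly 0.7 to 3.2, so it includes 1.0 and the result is classed as not statistically significant at the 5% level.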

Interpreting Results
Interpretation of C.I. for OR and RR:
- The C.I. provides an idea of the likely magnitude of the effect and the random variability of the point estimate.
- On the other hand, the p-value alone does not show the magnitude of the effect or the random variability of the point estimate.
- In general, smaller sample sizes have wider C.I.s due to uncertainty (lack of precision) in the point estimate.
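How sample size drives the width of a confidence interval can be seen with a simple proportion. The sketch below assumes a hypothetical observed risk of 0.15 and the Normal-approximation interval for a proportion; the function name and sample sizes are chosen for the example.

```python
import math

def ci_width(p, n, z=1.96):
    """Width of an approximate 95% CI for a proportion p observed in n subjects."""
    return 2 * z * math.sqrt(p * (1 - p) / n)

# Same observed risk, increasing sample sizes.
widths = {n: ci_width(0.15, n) for n in (100, 400, 1600)}
for n, width in widths.items():
    print(f"n = {n:5d}: CI width = {width:.3f}")
```

Each fourfold increase in sample size halves the width of the interval, because the standard error shrinks with the square root of n.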

Selection of Tests of Significance

Scale of Data
1. Nominal: Data do not represent an amount or quantity (e.g., marital status, sex).
2. Ordinal: Data represent an ordered series of relationships (e.g., level of education).
3. Interval: Data are measured on an interval scale having equal units but an arbitrary zero point (e.g., temperature in Fahrenheit).
4. Ratio: Data have equal units and a true zero, so values can be compared meaningfully as ratios (e.g., for weight, 100 kg is twice 50 kg).

Which Test to Use?

Scale of Data                              Test
Nominal                                    Chi-square test
Ordinal                                    Mann-Whitney U test
Interval (continuous), 2 groups            t-test
Interval (continuous), 3 or more groups    ANOVA
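The selection table can be written as a small lookup. This is only a sketch of the table as given: the function name `suggest_test` is hypothetical, treating ratio data the same as interval data is an assumption, and real test selection also depends on study design and distributional assumptions that the table does not cover.

```python
def suggest_test(scale, groups=2):
    """Return the test named in the slide's table for a given scale of data."""
    scale = scale.strip().lower()
    if groups < 2:
        raise ValueError("at least two groups are needed for a comparison")
    if scale == "nominal":
        return "Chi-square test"
    if scale == "ordinal":
        return "Mann-Whitney U test"
    if scale in ("interval", "ratio"):   # 'ratio' handled like interval: an assumption
        return "t-test" if groups == 2 else "ANOVA"
    raise ValueError(f"unknown scale of data: {scale!r}")

print(suggest_test("nominal"))
print(suggest_test("interval", groups=2))
print(suggest_test("interval", groups=4))
```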

Protection against Random Error
- Test statistics provide protection from type I error due to random chance.
- Test statistics do not guarantee protection against type I errors due to bias or confounding.
- Statistics demonstrate association, but not causation.