Hypothesis Testing – Part I


Recall:
1. We learned how to describe data. We made no assumptions about where the data came from, nor about the method of sampling.
2. We focused on methods of sampling: probability samples. We learned how to calculate probabilities, with a focus on specific probability distributions.

3. We learned how to estimate unknown population parameters. Goal: to try to understand the characteristics (parameters) of the population that gave rise to our sample data.
4. Now, we’ll learn how to evaluate alternative explanations for the data we have observed. The purpose is to test a research hypothesis.

Examples of Research Hypotheses
1. Early treatment, compared to later treatment, of patients with an evolving MI will result in better heart function (ejection fraction) at 24 hours.
2. The implementation of policy “A” will result in a reduction in the inappropriate use of a particular drug.
3. The delivery of an educational intervention to high school students will result in greater use of “safer” sex practices.
4. The average cost of a particular procedure is $X.

To answer such questions, we collect data, then analyze the data to see if they are compatible with the research hypothesis being true. We reason by use of “proof by contradiction.” Proof by example won’t work: the critic can always claim a counterexample must exist.

The Logic of Statistical Hypothesis Testing
The investigator starts by presuming the NULL explanation, e.g.:
The treatment has NO benefit.
The new cost is the same as the old (there is NO difference between the cost of new and old).
Data are then collected and evaluated for consistency with the NULL explanation.

If the data are NOT consistent with the null explanation, then abandon the null explanation in favor of an alternative. Typically, it is the alternative explanation that the investigator would like to advance.

If the null hypothesis is not true, then some alternative hypothesis must be true. This suggests some guidelines: We’ll let “Ho” represent the null hypothesis and “Ha” represent the alternative hypothesis (called “H-naught” and “H-a”).

The null and alternative hypotheses are typically specified so that:
The null is the one we hope to contradict.
The null and alternative are mutually exclusive and collectively exhaustive.
Both should be specified in advance, before the data are collected.

The Research Hypothesis → Ho and Ha
Examples: Early treatment post MI results in better function of the heart at 24 hours (better = higher).
Define study with:
Group 1 = “Early”, μ1 = true mean at 24 hours
Group 2 = “Late”, μ2 = true mean at 24 hours
Research hypothesis says μ1 > μ2 → This is the alternative hypothesis, since it is the explanation the investigator seeks to advance.

Thus, with the alternative defined, the null is defined to be anything other than the alternative. That is, μ1 ≤ μ2. Thus we have:
Ho: μ1 ≤ μ2
Ha: μ1 > μ2
Note that we can rewrite the hypotheses to compare the difference between group means to zero:
Ho: μ1 − μ2 ≤ 0
Ha: μ1 − μ2 > 0

If our research hypothesis is that “early treatment leads to different heart function at 24 hours,” then our alternative is that the means are not equal: μ1 ≠ μ2

The first case is called a one-sided alternative: we are interested in a change in only one direction, μ1 > μ2.
The second case is called a two-sided alternative: we consider change in either direction away from equality, μ1 ≠ μ2.

Example 2: The average cost of a particular procedure is $X
Suppose a medical insurance company wants to pay no more than $500 for a particular surgical procedure. Let μ = true average cost. The research hypothesis says μ < 500 → specify this as the alternative hypothesis. Thus,
Ho: μ ≥ 500
Ha: μ < 500

Hypothesis test as “proof by contradiction”
Assume the null hypothesis is true.
Determine a “rejection region” corresponding to values unlikely to occur under this assumption (Ho true).
Either: “Reject” Ho if the observed data are in the rejection region,
OR: “Fail to Reject” Ho if the data are not in the rejection region, that is, if they are consistent with the assumption.

For example, we will reject Ho: μ ≥ 500 when x̄ is low enough that we believe the true mean must be less than $500.
[Figure: axis for x̄ with 500 marked and the rejection region in the lower tail]
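As a rough sketch of how such a rejection region could be computed, the Python snippet below uses a normal approximation for the sample mean. The slides do not give a standard deviation or sample size for the cost example, so `sigma = 80` and `n = 64` are hypothetical values chosen only for illustration, as is the helper name `reject_h0`.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 500.0    # boundary value of Ho: mu >= 500 (from the slides)
sigma = 80.0   # ASSUMED population standard deviation (hypothetical)
n = 64         # ASSUMED sample size (hypothetical)
alpha = 0.05   # significance level

se = sigma / sqrt(n)                   # standard error of the sample mean
z_alpha = NormalDist().inv_cdf(alpha)  # lower-tail critical z (a negative number)
cutoff = mu0 + z_alpha * se            # rejection region: x-bar below this value

def reject_h0(xbar):
    """Return True when the sample mean falls in the lower-tail rejection region."""
    return xbar < cutoff

print(f"Reject Ho when x-bar < {cutoff:.2f}")
```

With these assumed inputs, a sample mean only slightly below 500 would not be enough to reject Ho; it has to fall below the cutoff.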

Steps in Constructing a Statistical Hypothesis Test
1. Identify the research question
2. State the assumptions necessary for computing probabilities
3. Specify Ho and Ha and the α-level (usually α = 0.05)
4. Specify the test statistic
5. Specify a decision rule
6. Compute the test statistic and the achieved significance (or P-value) from sample data
7. Come to a “Statistical” Decision
8. Reach a Conclusion
9. Report a confidence interval

Example: 1. Identify the research question
Suppose the mean birth weight for 1998 of all US hospital births is known to be μ = 3400 gm with σ = 710 gm, based upon national birth certificate data. How do births at Hospital A compare? We are asking: Is the mean birth weight at Hospital A different from the national mean?

Experiment: Collect birth weights of 100 consecutive births at Hospital A and compute our sample mean of x̄ = 3250 gm.
2. What assumptions must we make about our data to compute probabilities? Assume: a random sample of births from a population with known σ = 710 gm (known national standard deviation). Thus, by the central limit theorem, x̄ is approximately N(μ, σ²/n) = N(μ, 710²/100), with standard error σ/√n = 71 gm.

3. Specify null and alternative hypotheses:
Ho: The true mean birth weight at Hospital A is the same as the national mean.
Ha: The true mean birth weight at Hospital A is different from the national mean.
Or
Ho: μ = 3400
Ha: μ ≠ 3400

4. Compute the Test Statistic
This is where the proof by contradiction thinking comes in. We want to know: If it is true that μ = 3400 gm (Ho), what are the chances of observing a sample mean as far away from μ = 3400 as x̄ = 3250?

Since Ha is two-sided (greater OR less than the value for Ho), we want to know the probability represented by the shaded area in the following figure:
[Figure: sampling distribution of x̄ centered at 3400, with both tails shaded beyond 3250 and 3550, each 150 away from the center]
“What are the chances of observing x̄ as far away from the pop. mean μ = 3400 as the one we have (3250), in either direction?”

We want to compute: P(x̄ ≤ 3250) + P(x̄ ≥ 3550), given μ = 3400.
We know how to do this! We can transform the probability calculation into an equivalent one for a standard normal:
P(Z ≤ (3250 − 3400)/(710/√100)) + P(Z ≥ (3550 − 3400)/(710/√100))

When we use μ = 3400, as presumed by Ho, the resulting quantity is called a Test Statistic. More generally, if we let μo represent the value of μ specified by Ho, we have
Test Statistic: z = (x̄ − μo) / (σ/√n)

5. Specify Decision Rule
“What is the probability of observing a sample mean as far away from μo as the mean we have observed?” This probability is known as the achieved significance, the significance of the data, or the p-value.

Our decision rule might be: Reject Ho if the achieved significance is less than 0.05. This is equivalent to saying: if the probability of observing a sample mean, x̄, this far or farther from μo is less than 5%, then we will reject Ho in favor of Ha.

6. Compute the test statistic from the sample data:
z = (3250 − 3400) / (710/√100) = −150/71 = −2.11

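The achieved significance (or P-value) is:
P(Z ≤ −2.11) + P(Z ≥ 2.11) = 2(0.0174) = 0.0348
[Figure: standard normal curve with both tails shaded beyond −2.11 and 2.11]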

7. Statistical Decision: Since 0.0348 is less than 0.05, we will reject Ho. We are saying that x̄ = 3250 is sufficiently different from μo = 3400 that it suggests that Ho is not true and should be abandoned. That is, if Ho is true, the probability of a sample mean this far away is only .0348 or 3.5%, an unlikely outcome, so reject Ho.
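The calculation for the birth weight example can be checked with a minimal Python sketch, assuming the known-σ z-test described in these slides. The function name `z_test_two_sided` is illustrative and not from the slides. The code keeps z unrounded, so the p-value comes out near 0.0346 rather than the table-based 0.0348 obtained after rounding z to −2.11.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(xbar, mu0, sigma, n):
    """Two-sided one-sample z-test with known sigma; returns (z, p-value)."""
    se = sigma / sqrt(n)               # standard error of the mean
    z = (xbar - mu0) / se              # test statistic under Ho
    p = 2 * NormalDist().cdf(-abs(z))  # area in both tails
    return z, p

# Values from the Hospital A example
z, p = z_test_two_sided(xbar=3250, mu0=3400, sigma=710, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}")
print("Reject Ho" if p < 0.05 else "Fail to reject Ho")
```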

8. Conclusion: Hospital A has babies of significantly different birth weight than the US average. In fact, the mean birth weight at Hospital A appears to be lower than the US mean.

9. Compute a Confidence Interval Estimate of the true mean birth weight for babies at Hospital A
We have all the ingredients to compute a confidence interval estimate: x̄ = 3250, σ = 710, n = 100. Since the true standard deviation is known, we use z.975 = 1.96 for a 95% confidence interval:
x̄ ± z(σ/√n) = 3250 ± (1.96)(710/10) = 3250 ± 139.2
95% CI: (3110.8, 3389.2)

Interpretation: The hypothesized mean μo = 3400 falls outside (above) the 95% confidence interval. It therefore seems likely that the mean birth weight at Hospital A is less than the overall US mean. Your confidence interval should give a consistent result with your hypothesis test. If it doesn’t, check your work!
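A short Python sketch of the same interval, assuming known σ and a normal sampling distribution as in the slides. The helper name `z_confidence_interval` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z_crit * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the Hospital A example
lower, upper = z_confidence_interval(xbar=3250, sigma=710, n=100)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")

# Consistency check against the hypothesis test: is mu0 inside the interval?
mu0 = 3400
print("mu0 inside CI" if lower <= mu0 <= upper else "mu0 outside CI")
```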

Comparing CI estimates and Hypothesis Testing
When conducting a hypothesis test with an α = .05 decision rule, we are centering an interval around the hypothesized mean (μo):
μo − z.975(σ/√n) to μo + z.975(σ/√n)
When our observed sample mean (x̄) falls outside this interval, we interpret this as indicating, with 0.05 likelihood of error, that our sample comes from a distribution with a different mean.
[Figure: interval centered at μo, from μo − z.975(σ/√n) to μo + z.975(σ/√n), with the position of x̄ in question]
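To make the correspondence concrete, the sketch below builds both intervals for the Hospital A numbers: the interval centered at μo used by the test, and the confidence interval centered at x̄. Because both use the same half-width, x̄ falls outside the first exactly when μo falls outside the second. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n = 3400, 3250, 710, 100   # Hospital A example values
half_width = NormalDist().inv_cdf(0.975) * sigma / sqrt(n)

# Interval centered at the hypothesized mean (used by the alpha = .05 test)
test_low, test_high = mu0 - half_width, mu0 + half_width
# Confidence interval centered at the observed sample mean
ci_low, ci_high = xbar - half_width, xbar + half_width

xbar_outside_test_interval = not (test_low <= xbar <= test_high)
mu0_outside_ci = not (ci_low <= mu0 <= ci_high)

print(f"Test interval around mu0: ({test_low:.1f}, {test_high:.1f})")
print(f"95% CI around x-bar:      ({ci_low:.1f}, {ci_high:.1f})")
print("Same verdict from both views:", xbar_outside_test_interval == mu0_outside_ci)
```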

COMMENTS:
1. The “.05 rule” alone is very uninformative: it leads to a “reject” or “do not reject” with no information about the data. A better approach is to report both the achieved significance and a confidence interval estimate. You can then interpret these, while also leaving room for your reader to interpret.

2. Don’t forget the conclusion step! Too often, only a p-value is reported or, worse still, only a “reject” or “do not reject” is reported. 3. Statistical significance alone gives no clues about biology. Keep in mind that a standard error is a function of sample size n. This means that by increasing n, the SE can be made smaller and smaller.

Eventually, any observed difference, however small, can achieve statistical significance regardless of its biological relevance. For example, is a statistically significant change in blood pressure of 1 mm Hg very useful? If we have a very large n, say n = 1000, we might find such a difference of 1 mm Hg statistically significant, but it may not be a biologically meaningful distinction.
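The effect of sample size can be illustrated with a small Python sketch. The slides give no standard deviation for the blood pressure example, so `sigma = 10` mm Hg below is purely an assumed value for illustration. The observed difference is held fixed at 1 mm Hg while n grows.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(diff, sigma, n):
    """Two-sided p-value for an observed mean difference `diff` from the null value."""
    z = diff / (sigma / sqrt(n))   # the standard error shrinks as n grows
    return 2 * NormalDist().cdf(-abs(z))

sigma = 10.0   # ASSUMED standard deviation in mm Hg (hypothetical)
diff = 1.0     # observed difference of 1 mm Hg, as in the slide

for n in (25, 100, 1000):
    print(f"n = {n:4d}: p-value = {two_sided_p(diff, sigma, n):.4f}")
```

Under this assumed σ, the same 1 mm Hg difference is far from significant with 25 subjects and clearly significant with 1000, even though its clinical meaning has not changed.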

4. A statistical hypothesis test uses probabilities based only on the null hypothesis (Ho) model! The proof by contradiction thinking asks us to: presume that Ho is true then examine the plausibility of our data in light of this assumption. We either reject it, or we fail to do so. We do not prove that Ho is correct.

5. We can summarize the results of statistical hypothesis testing as follows:
                     NULL HYPOTHESIS
DECISION             Actually True        Actually False
Fail to Reject       Correct              Type II error (β)
Reject               Type I error (α)     Correct

IF Ho is true and we (incorrectly) reject Ho, we have a type I error. We can calculate Pr[type I error] = α.
IF Ha is true and we (incorrectly) fail to reject Ho, we have a type II error. We must have a specific Ha model before we can calculate Pr[type II error] = β.
IF Ha is true and we (correctly) reject Ho, this occurs with probability (1 − β), which we call the “POWER” of a test.
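As a sketch of why a specific Ha model is needed, the Python function below computes the power of the two-sided z-test from the birth weight example for a chosen true mean. The choice of 3250 gm as the specific alternative is an assumption made here for illustration; the slides do not compute power. The function name `power_two_sided` is likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(mu_true, mu0, sigma, n, alpha=0.05):
    """Probability of rejecting Ho: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se   # mean of the z statistic under the true model
    # Reject when z < -z_crit or z > z_crit, where z ~ N(shift, 1)
    return std_normal.cdf(-z_crit - shift) + 1 - std_normal.cdf(z_crit - shift)

# When Ho is true, the rejection probability is the type I error rate alpha
alpha_check = power_two_sided(mu_true=3400, mu0=3400, sigma=710, n=100)

# ASSUMED specific alternative: true mean of 3250 gm (hypothetical choice)
power = power_two_sided(mu_true=3250, mu0=3400, sigma=710, n=100)
beta = 1 - power                   # Pr[type II error] under that alternative

print(f"Pr[type I error] = {alpha_check:.3f}")
print(f"Power = {power:.3f}, Pr[type II error] = {beta:.3f}")
```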

Example 2
Does a new treatment for cancer increase the survival time from diagnosis significantly beyond 38.3 months? A sample of 100 subjects given the new treatment had a mean survival time of 46.9 months. Assume the data are a random sample of survival times from a N(μ, σ²) with σ = 43.3 months (e.g., we may know the distribution of survival times from prior studies).

SOLUTION.
2. Assumptions: We have a random sample of n = 100 survival times from a population with σ = 43.3. Thus, x̄ is N(μ, σ²/n) = N(μ, 43.3²/100), with standard error 4.33.
3. Specify Ho and Ha: Research hypothesis suggests an increase in survival.
Ho: μ ≤ 38.3
Ha: μ > 38.3 (one-sided!)

4. Specify Test Statistic: Since σ = 43.3 is known, we’ll use z = (x̄ − μo) / (σ/√n).
5. Decision Rule: We’ll calculate z using the observed data, compute the achieved significance (p-value), and compare this to 0.05. If it is less than 0.05 we will reject Ho; otherwise we will “fail to reject” Ho.

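6. Calculations: Achieved significance
z = (46.9 − 38.3) / (43.3/√100) = 8.6/4.33 = 1.986
Be careful! For a one-sided test, we are concerned with a probability in only one direction from μo!
p-value = P(Z ≥ 1.986) = .0233
[Figure: sampling distribution centered at 38.3 with the upper tail shaded beyond x̄ = 46.9 (z = 1.986); shaded area .0233]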

7. Statistical Decision: .023 < .05 → Reject Ho
8. Conclusion: It is unlikely that the improvements in survival time are due to chance. The new treatment appears to significantly improve survival.
Confidence Interval on True Mean survival using new treatment: z.975 = 1.96 for a 95% confidence interval, known σ:
x̄ ± z(σ/√n) = 46.9 ± (1.96)(43.3/10) = 46.9 ± 8.49
95% CI: (38.41, 55.39)
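A minimal Python sketch of the one-sided calculation for this example, assuming known σ as stated in the slides. The function name `z_test_upper` is illustrative. Only the upper tail is counted, because Ha is μ > 38.3.

```python
from math import sqrt
from statistics import NormalDist

def z_test_upper(xbar, mu0, sigma, n):
    """One-sided (upper-tail) z-test with known sigma; returns (z, p-value)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = 1 - NormalDist().cdf(z)   # area in the upper tail only
    return z, p

# Values from the survival time example
z, p = z_test_upper(xbar=46.9, mu0=38.3, sigma=43.3, n=100)
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
print("Reject Ho" if p < 0.05 else "Fail to reject Ho")
```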

A note on One-sided hypothesis tests: Quite often, we are interested in a change in only one direction: Does a new drug increase the proportion of patients cured? Does a new policy decrease the hospital length of stay (LOS)? A test that looks at a change in only one direction seems to make sense. However in practice this is rarely done.

If it is possible for the change to occur in either direction then a test should look for the change in either direction. For example, the new drug could actually decrease the proportion of patients cured, or the new policy could potentially result in increased LOS due to unexpected side effects. Standard practice is to use a two-sided test!
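As a rough check of what the two-sided convention would mean for the survival time example above, the sketch below computes both p-values from the same z statistic. This comparison is an added illustration and is not worked in the slides.

```python
from math import sqrt
from statistics import NormalDist

# Values from the survival time example (known sigma)
xbar, mu0, sigma, n = 46.9, 38.3, 43.3, 100
z = (xbar - mu0) / (sigma / sqrt(n))

p_one_sided = 1 - NormalDist().cdf(z)        # change in one direction only
p_two_sided = 2 * NormalDist().cdf(-abs(z))  # change in either direction

print(f"one-sided p-value = {p_one_sided:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
```

The two-sided p-value is twice the one-sided value, so a two-sided test demands stronger evidence before rejecting Ho at the same α.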

Recap of Significance Testing So Far The Basic Idea Compute the “probability of the data” (achieved significance) presuming Ho to be true. Large Probabilities are consistent with Ho -- do not reject Small Probabilities are NOT consistent with Ho -- reject

“Probability of the Data”
We want to know the probability of a sample statistic as extreme or more extreme than the one observed.
[Figures: One Sided Alternative and Two Sided Alternative panels; the distribution is determined by Ho, and t or z is the observed sample statistic]

Next we will consider a couple of examples that parallel the situations we have discussed so far for confidence interval estimation. We will also focus on computer analysis for conducting hypothesis tests

Application 1: One Population, σ² Known, Test of hypothesis on mean, μ
1. Research Question: Serum enzyme A levels are measured in 10 patients with a sample mean of 22. If it is known that the population variance is 45 and if normality is assumed, are the data consistent with a population mean of 25? That is, we have n = 10, σ² = 45, x̄ = 22, μo = 25.

2. Assumptions: Random sample of serum enzyme A levels from a Normal distribution with σ² = 45. Thus, x̄ is N(μ, σ²/n) = N(μ, 45/10).
Why must we assume normality of the data for this example? Is n particularly large for the Central Limit Theorem to hold?

3. Specify Ho and Ha: The wording “are the data consistent with” suggests a two-sided alternative:
Ho: μ = 25
Ha: μ ≠ 25
4. Test Statistic: σ² known suggests use of the Normal or z-transformation: z = (x̄ − μo) / (σ/√n)

5. Decision Rule: We’ll calculate the achieved significance (p-value) and compare to α = .05. Reject Ho for p < .05, else fail to reject.
6. Calculations
Test Statistic: z = (22 − 25) / √(45/10) = −3/2.12 = −1.41

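Achieved Significance: For a 2-sided test, the total area in both tails is the achieved level of significance or the p-value:
P(Z ≤ −1.41) + P(Z ≥ 1.41) = 2(0.0793) = 0.1586
[Figure: standard normal curve with both tails shaded beyond −1.41 and 1.41]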

7. Statistical Decision: 0.1586 represents the probability of a sample mean at least as far away from 25 as 22, if in fact the true mean is 25. 0.1586, or 15.86%, is reasonably high. This suggests the data are consistent with the null hypothesis → Do NOT reject Ho, since .1586 > .05 (or .1586 > α).

8. Conclusion: The data are consistent with a hypothesized mean serum A level of μ = 25. Note we have not proven Ho is true, merely that with the evidence of our sample, μ = 25 is a reasonable possibility and we cannot reject it.

95% Confidence Interval
x̄ ± z.975(σ/√n) = 22 ± 1.96(2.12) = 22 ± 4.16 = (17.84, 26.16)
Note that μo = 25 is within the interval.
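The numbers for Application 1 can be reproduced with a short Python sketch, assuming the known-variance z procedure used in the slides. Variable names are illustrative. The code does not round z before computing the tail area, so the p-value is about 0.157 rather than the table-based 0.1586, and the 95% interval works out to roughly 22 ± 4.16.

```python
from math import sqrt
from statistics import NormalDist

n, variance, xbar, mu0 = 10, 45.0, 22.0, 25.0   # values from Application 1

se = sqrt(variance / n)                   # sigma / sqrt(n), about 2.12
z = (xbar - mu0) / se                     # test statistic
p = 2 * NormalDist().cdf(-abs(z))         # two-sided p-value

z_crit = NormalDist().inv_cdf(0.975)      # about 1.96
lower, upper = xbar - z_crit * se, xbar + z_crit * se

print(f"SE = {se:.2f}, z = {z:.2f}, p-value = {p:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print("mu0 inside CI" if lower <= mu0 <= upper else "mu0 outside CI")
```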

Using Minitab: Stat → Basic Stats → 1-Sample Z
Select variable to test
Test mean: Enter μo
Sigma: Enter σ

Z-Test
Test of mu = 25.00 vs mu not = 25.00
The assumed sigma = 6.71
Variable     N    Mean   StDev   SE Mean       Z      P
SerA        10   22.00    6.29      2.12   -1.41   0.16

Application 2: One Normal Population, σ² Unknown, Test of μ
1. Research Question: A drug company claims a certain capsule contains 2.5 milligrams of a drug X. An independent laboratory obtained a random sample of 20 capsules and measured the amount of the drug in each. The measurements were as follows:
3.31 1.30 0.61 2.42 1.94 2.23 2.35 0.96 2.97 2.91
1.70 2.05 3.15 2.54 1.84 2.23 1.94 0.88 0.83 1.92
Is the drug company claim correct?

2. Assumptions: We have data from a random sample from a normal distribution, σ² unknown.
3. Specify Ho and Ha:
Ho: μ = 2.5 mg
Ha: μ ≠ 2.5 mg (two-sided)
4. Test Statistic: σ² unknown suggests use of the t-transformation: t = (x̄ − μo) / (s/√n), with n − 1 = 19 degrees of freedom.
Why must we assume normality of the data? So the t-distribution is applicable!

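5. Decision Rule: We’ll calculate the achieved significance level (two-sided) and compare this to a type I error of α = 0.05.
6. Calculations (x̄ = 2.004 ≈ 2.00, s = .787)
Test Statistic: t = (2.004 − 2.5) / (.787/√20) = −.496/.176 = −2.82
Achieved significance: p-value = P(|t| ≥ 2.82) with 19 degrees of freedom = 0.0108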

7. Statistical Decision: 0.0108 < 0.05, that is, p-value < type I error (α). Therefore, REJECT Ho.
8. Conclusion: The mean amount of drug in the capsules appears to be significantly different from 2.5 mg. In fact, the mean is significantly less than 2.5 mg.

Confidence Interval Estimate
x̄ ± t19; .975 (se) = 2.00 ± 2.093(.176) = 2.00 ± 0.37 = (1.63, 2.37)
Note that the confidence interval does not include the hypothesized mean amount of 2.50.
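The one-sample t calculations can be reproduced from the 20 capsule measurements with the Python sketch below. Python's standard library has no t-distribution, so the tail area is approximated here by numerically integrating the t density with Simpson's rule; that integration routine is a choice made for this sketch, not the method used in the slides. The critical value 2.093 is taken from the slides as a constant.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

data = [3.31, 1.30, 0.61, 2.42, 1.94, 2.23, 2.35, 0.96, 2.97, 2.91,
        1.70, 2.05, 3.15, 2.54, 1.84, 2.23, 1.94, 0.88, 0.83, 1.92]
mu0 = 2.5   # hypothesized mean (the company's claim)

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps                  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_b = total * h / 3
    return 1 - 2 * area_zero_to_b  # the density is symmetric about 0

n = len(data)
xbar = mean(data)
s = stdev(data)                    # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)
t_stat = (xbar - mu0) / se
p_value = t_two_sided_p(t_stat, n - 1)

T_CRIT = 2.093                     # t(19; .975), as quoted in the slides
ci = (xbar - T_CRIT * se, xbar + T_CRIT * se)

print(f"mean = {xbar:.3f}, s = {s:.3f}, SE = {se:.3f}")
print(f"t = {t_stat:.2f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The mean, standard deviation, standard error, and t statistic should agree with the Minitab output for this example to the precision shown there.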

The particular test just conducted is known as a ONE-SAMPLE t-TEST. We have a sample from a single population, and we are comparing our observed sample mean to some hypothesized value for the mean.

Using Minitab: Stat → Basic Stats → 1-Sample t
Select variable to test
Enter μo

T-Test of the Mean
Test of mu = 2.500 vs mu not = 2.500
Variable     N    Mean   StDev   SE Mean       T       P
drugx       20   2.004   0.787     0.176   -2.82   0.011
Note that the one-sample t-test provides you with estimates of the mean, standard deviation and standard error. By checking the Confidence Interval option, you can get a confidence interval rather than a hypothesis test.