Hypothesis Testing One-sample means and proportions Lecture 4.


Background Research projects often start out with hypotheses such as: The average IQ for UMDNJ public health students is higher than the national average Are boy birth rates higher than the national average when both parents have high PCB levels? In the conclusions of such studies, it is natural to either reject or accept the initial hypotheses.

IQ for Public Health Students How unlikely would it be for the average IQ score of 10 public health students to be 10 or more points above the national average (given that the true average IQ scores are the same)? Sampling distribution is normal. Z-score = Approx. Probability = from Normal table, 3rd column. You have calculated your first one-sided p-value! If the probability is too small, i.e., the event is too unlikely given the hypothesized value, then we reject the hypothesized value.

Reading the Normal Table (A5.2, pp )
1st column: z-score, # standard errors (deviations)
2nd column: probability of being within z standard errors (deviations) of the mean
3rd column: probability of being z standard errors (dev.’s) above the mean
4th column: probability of being z standard errors (dev.’s) below the mean

Experiment to Test Hypothesis about Public Health Students’ IQ
State Hypothesis: “Public Health students’ IQ is higher than national average.”
Translate Hypothesis into statistical terminology: a null hypothesis, H0: μ = 100, and an alternative hypothesis, HA: μ > 100.
Plan sample selection (how and how many?)
Collect data
Conduct hypothesis test

Hypothesis Tests – Basic Concepts The alternative hypothesis is generally the hypothesis that the investigator has an interest in proving. The null hypothesis is the hypothesis that is assumed true if we fail to prove the alternative. The five steps to a hypothesis test revolve around assuming the null hypothesis to be true and then examining whether the data are consistent with that null hypothesis.

General Steps to a Hypothesis Test
1. State the necessary assumptions for the test, assessing whether they are true for the collected data.
2. State the null and alternative hypotheses.
3. Calculate the test statistic from the data, assuming the null hypothesis to be true.
4. Calculate a p-value (based on the form of the alternative hypothesis).
5. State the conclusion in lay-person’s terms.

P-Values A P-value is the probability, computed assuming the null hypothesis is true, of observing an event at least as extreme as what was observed (in the direction of the alternative hypothesis).

Hypothesis Test for the Population Mean
1. Assumptions: Independent, random sample. Large sample OR population distribution is approximately normal.
2. Hypotheses (for one-sided/one-tailed and two-sided tests): H0: μ = μ0 vs. HA: μ > μ0 or μ < μ0 or μ ≠ μ0
3. Test Statistic: z = (sample mean - μ0) / (s/sqrt(n)), where s is the sample standard deviation
4. P-value based on a normal OR t-distribution with the appropriate degrees of freedom (n-1).
5. Conclusion in lay-person’s terms.

Distribution of the Test Statistic (z-score) The test statistic is given as z = (sample mean - μ0) / (s/sqrt(n)). When the sample size is small and the population distribution is normal, this follows a t-distribution with n-1 degrees of freedom. When the sample size is large, we can approximate the distribution as normal, no matter what the population distribution. In both cases, if the null hypothesis is true, we would expect the z-score to be close to zero.

Matching p-values with the Direction of the Alternative Hypothesis
If HA: μ ≠ μ0, p-value = probability that in any experiment the test statistic would be at least as far away from zero as in the current study, given that μ0 is the true population mean.
If HA: μ > μ0, p-value = probability that in any experiment the test statistic would be at least as great as in the current study, given that μ0 is the true population mean.
If HA: μ < μ0, p-value = probability that in any experiment the test statistic would be at least as small as in the current study, given that μ0 is the true population mean.
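The three cases above can be sketched in a few lines of Python. This is only an illustration under the large-sample (normal) approximation; the function name `z_p_value` and the labels "greater", "less", and "two-sided" are choices made here, not notation from the lecture.

```python
from statistics import NormalDist


def z_p_value(z, alternative):
    """P-value for an observed z-score, given the form of the alternative."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    lower_tail = std_normal.cdf(z)       # P(Z <= z) under the null
    upper_tail = 1.0 - lower_tail        # P(Z >= z) under the null
    if alternative == "greater":         # H_A: mu > mu_0
        return upper_tail
    if alternative == "less":            # H_A: mu < mu_0
        return lower_tail
    if alternative == "two-sided":       # H_A: mu != mu_0
        return 2.0 * min(lower_tail, upper_tail)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_p_value(1.96, alt), 4))
```

The same observed z-score therefore gives three different p-values, which is why the alternative has to be fixed before looking at the data.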

More on p-values for Means If HA: μ ≠ μ0, p-value = probability that in any experiment the sample mean is at least as many standard errors away from the population mean (in the null hypothesis) as in the current study, given that μ0 is the true population mean.

Hypothesis concerning Public Health Students’ IQ: Step One: 1. Assumptions: Independent, random sample (depends on sampling design). Sample size is small. Population distribution is approximately normal (check by looking at the sample distribution via a histogram or box plot).

Hypothesis concerning Public Health Students’ IQ: Step Two: 2. Hypotheses: We are interested in showing that Public Health students’ IQs are higher than the national average, thus H0: μ = 100 (convention says we use “=”) vs. HA: μ > 100

Hypothesis concerning Public Health Students’ IQ: Step Three: 3. Test Statistic: Suppose that n = 10, sample mean = 110, and standard deviation = 10. The z-score with an estimated standard deviation: t = (110 - 100) / (10/sqrt(10)) = 3.16

Hypothesis concerning Public Health Students’ IQ: Step Four: 4. P-Value: The alternative hypothesis is that the true mean is larger than the national average. Therefore, we calculate, using the t-table with n - 1 = 9 degrees of freedom, Prob(t > observed given that μ = 100), which is between .007 and .006 since the observed t is between 3.1 and 3.2.
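As a cross-check on the table lookup, the IQ example (n = 10, sample mean 110, standard deviation 10, null mean 100) can be worked in Python. The sketch below avoids third-party packages and integrates the t density numerically with Simpson's rule; the helper names `t_density` and `t_upper_tail` are made up for this illustration, and a statistics package would normally be used instead.

```python
import math


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t             # the density is symmetric about 0


n, sample_mean, sample_sd, mu_0 = 10, 110.0, 10.0, 100.0
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error
p_value = t_upper_tail(t_stat, df=n - 1)   # one-sided, H_A: mu > 100
print(f"t = {t_stat:.2f}, one-sided p-value = {p_value:.4f}")
```

The computed p-value should fall well under 1%, in line with the conclusion drawn from the table.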

Hypothesis concerning Public Health Students’ IQ: Step Five: 5. Conclusion: Assuming that the average IQ of Public Health students is the same as the national average, one would only observe a sample as “extreme” as the one observed less than 1% of the time. Thus, we may conclude that it is unlikely that the national average is the true mean for public health students. There is strong evidence to reject the null hypothesis in favor of the alternative, which says that the average IQ for public health students is greater than the national average.

Exercise Capacity: Large Sample A study was conducted of 90 male patients following a new treatment for congestive heart failure. One of the variables measured was the increase in exercise capacity (measured in minutes) over a 4-week treatment period. The previous treatment regime had produced an average increase of 2 minutes. The researchers wanted to evaluate whether the new treatment had increased the exercise capacity even more than the previous treatment.

Exercise Capacity: Large Sample, cont. The sample mean and standard deviation were calculated to be 2.17 and 1.05, respectively. State the necessary assumptions for the test, assessing whether they are true for the collected data.
1. Assumptions: Independent, random sample; large sample (n = 90)
2. Hypotheses: H0: μ = 2 vs. HA: μ > 2

Exercise Capacity: Large Sample, cont.
3. Test Statistic: z = (2.17 - 2.00) divided by 1.05/sqrt(90) = 1.54.
4. P-Value: Prob(at least 1.54 st. errors above) = 0.062.
5. Conclusion: If the true increase with this treatment were 2.00 minutes, then just by chance we would expect the mean for a single sample of 90 subjects to be at least 2.17 minutes about 6% of the time.

Exercise Capacity: Large Sample,cont. Typically, since this is greater than 5%, we would say that this increase is not significantly bigger than the increase for the previous treatment.

Exercise Capacity: A Larger Sample Suppose in an identical study, the sample was twice as big (180 men) with the same calculated sample mean and standard deviation. The standard error would now be SEM = 1.05/sqrt(180) = 0.078, and the z-score would equal about 2.17 with a p-value of about 0.015. Now, since the p-value is so much smaller, we might reject the null hypothesis and conclude that the increase is significantly bigger than the increase for the previous treatment! The only difference between the experiments is the sample size!
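The effect of doubling the sample size can be reproduced directly. The sketch below reruns the one-sided large-sample test for the exercise capacity numbers (sample mean 2.17, standard deviation 1.05, null mean 2.00) with n = 90 and n = 180; the function name `one_sided_z_test` is an illustrative choice.

```python
import math
from statistics import NormalDist


def one_sided_z_test(sample_mean, sample_sd, n, mu_0):
    """Return (standard error, z, upper-tail p-value) for H_A: mu > mu_0."""
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - mu_0) / standard_error
    p_value = 1.0 - NormalDist().cdf(z)   # large-sample normal approximation
    return standard_error, z, p_value


for n in (90, 180):
    se, z, p = one_sided_z_test(2.17, 1.05, n, mu_0=2.00)
    print(f"n = {n}: SE = {se:.4f}, z = {z:.2f}, p-value = {p:.4f}")
```

Only the standard error changes between the two runs: it shrinks by a factor of sqrt(2), which inflates the z-score by the same factor.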

Why we “fail to reject the null” If our sample size is not big enough, we may not be able to detect a small difference, even if such a difference truly exists. Alternatively, if the true difference is very small, we may not be able to detect it even with a large sample size. Therefore, if our p-value isn’t small enough to reject the null, we typically say that we don’t have enough evidence to reject the null hypothesis.

Rejecting the null hypothesis What p-value is small enough to reject the null hypothesis? Depends. The cut-off chosen for a pilot study may differ from the one chosen for a huge study on which major changes in policy may result from rejecting the null hypothesis. Using a popular cut-off, most investigators reject the null hypothesis if the p-value is less than 0.05. ALWAYS report the p-value.

P-values and Significance Levels The threshold with respect to the p-value at which we reject the null hypothesis is called the significance level (α) for the test. Once the significance level is set, it equals the probability of a Type I error, that is, the probability of rejecting the null hypothesis when it is in fact true. The threshold with respect to the test statistic at which we reject the null hypothesis is called the critical value.

More on p-values and α In general, the smaller the significance level, the larger the critical value will be. If we reject the null hypothesis at a specified significance level, we say that the population mean is significantly different (or larger or smaller) than the null value.
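The relationship between the significance level and the critical value can be seen by inverting the standard normal distribution. This is a small sketch for z-tests only; the function name `critical_value` and the particular α values are illustrative choices.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=False):
    """Critical z-value: reject H0 when the z-score (or |z|) exceeds it."""
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    one = critical_value(alpha)
    two = critical_value(alpha, two_sided=True)
    print(f"alpha = {alpha}: one-sided {one:.3f}, two-sided {two:.3f}")
```

As α decreases, both critical values grow, so a more extreme test statistic is needed before the null hypothesis is rejected.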

Hypothesis Tests for Proportions Same idea as for means! Still use the z-score, but with a different standard error. Compare z-score to a standard normal distribution.

Proportions Example
1. Assumptions: Independent, random sample; large sample (same criteria as for CI)
2. Hypotheses: H0: π = π0 vs. HA: π > π0 or π < π0 or π ≠ π0 (π0 = .5 used when testing for a majority or minority)
3. Test Statistic: z = (p - π0)/SE(p)
4. P-value using normal tables
5. Conclusion referring to applied hypothesis

SE(p) under H0 For confidence intervals, we had no preconceived notion of the specific value of p. The standard error was estimated as the square root of p(1-p)/n. Under H0, we claim that π = π0. Therefore, p is estimating π0. Hence under H0, the true standard error for p is the square root of π0(1-π0)/n.

Example on Sleep Disorders N=1,010 adults polled in the Fall of 2001 Statement: “Seventy-four percent of respondents in the study reported experiencing at least one symptom of a sleep disorder a few nights a week or more. That number was up significantly from 62 percent in 1999 and 2000, and from 69 percent last year.” Presumably, when the researchers collected the data they didn’t have any pre-suppositions about whether the change, if any, would be larger or smaller. Hence we’ll use a “two-sided” hypothesis test: H A : sleep disorders changed significantly from previous year.

Sleep Disorders Hypothesis Test
1. Assumptions: Independent, random sample; large sample (n=1,010)
2. Hypotheses (with π0 = .69): H0: π = .69 vs. HA: π ≠ .69 in this example
3. Test Statistic: SE(p) = sqrt[(.69)(1-.69)/(1010)] = .0146; z = (p - π0)/SE(p) = (.74 - .69)/.0146 = 3.42
4. P-value using normal tables: the two-sided p-value is about .0006.
5. Conclusion: With a p-value this small, there is overwhelming evidence that the proportion of the population polled with sleep disorders has changed from the previous year.
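The sleep disorders test can be checked with a short script. The sketch below uses the null-hypothesis standard error, sqrt[π0(1-π0)/n], and a two-sided normal p-value; the function name `one_proportion_z_test` is an illustrative choice. Because the script does not round the standard error, its z-score may differ in the second decimal from a hand calculation.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(p_hat, pi_0, n):
    """Two-sided z-test for a proportion, with the standard error taken under H0."""
    se_null = math.sqrt(pi_0 * (1 - pi_0) / n)
    z = (p_hat - pi_0) / se_null
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se_null, z, p_value


# 74% of 1,010 respondents this year, tested against last year's 69%.
se, z, p = one_proportion_z_test(p_hat=0.74, pi_0=0.69, n=1010)
print(f"SE = {se:.4f}, z = {z:.2f}, two-sided p-value = {p:.5f}")
```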

Type 1 and Type 2 errors For a fixed sample size, the larger the difference between the true parameter and the hypothesized parameter, the more likely we are to detect that difference. Recall: Type 1 error (α) is the probability of rejecting the null hypothesis when it is true. Type 2 error is the probability of failing to reject the null hypothesis when the alternative hypothesis is true.

Power The power of a test is one minus the probability of type 2 error or, equivalently, the probability of rejecting the null hypothesis when the alternative hypothesis is true. As we suggested previously, there are a number of things which can lead to a “high” power.

Power, cont. Type 1 and Type 2 error must be balanced. For fixed sample size and fixed parameter values, decreasing Type 1 error results in increasing Type 2 error or, equivalently, results in decreasing power. In planning a study, one plans a sample size using guesses at the true parameter values and by fixing the levels of Type 1 and Type 2 errors. The smaller we want both errors, the larger the required sample size. Sample size calculations in the next lecture.
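The trade-off can be made concrete with the standard power formula for a one-sided z-test, power = P(Z > z_crit - (μ_true - μ0)/(σ/sqrt(n))). The sketch below is an illustration only: it borrows the exercise capacity numbers, treats the sample standard deviation 1.05 as a known σ, and supposes the true mean is 2.17, all of which are assumptions made here rather than statements from the lecture.

```python
import math
from statistics import NormalDist


def power_one_sided_z(mu_0, mu_true, sigma, n, alpha=0.05):
    """Power of the one-sided z-test of H0: mu = mu_0 vs. H_A: mu > mu_0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)            # rejection threshold
    shift = (mu_true - mu_0) / (sigma / math.sqrt(n))   # true effect in SE units
    return 1.0 - std_normal.cdf(z_crit - shift)


for n in (90, 180):
    for alpha in (0.05, 0.01):
        pw = power_one_sided_z(2.00, 2.17, 1.05, n, alpha)
        print(f"n = {n}, alpha = {alpha}: power = {pw:.3f}")
```

For a fixed n, moving α from 0.05 to 0.01 lowers the power (raises the Type 2 error), while a larger n raises the power at either α.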

Two-sided Hypothesis Tests and Confidence Intervals These are in some sense “equivalent” If we fail to reject a two-sided hypothesis test (say that the proportion equals ½ or the mean equals 0) at the 0.05 level, then a 95% confidence interval would include the null value (½ or 0), and vice versa. If we reject a two-sided hypothesis test (say that the proportion equals ½ or the mean equals 0) at the 0.05 level, then a 95% confidence interval would not include the null value (½ or 0), and vice versa.
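A quick way to see this equivalence for a mean is to compute both the two-sided test and the interval from the same standard error. The sketch below reuses the exercise capacity numbers with a two-sided alternative purely for illustration (the lecture tested them one-sided); the function name is hypothetical. For proportions the match is only approximate, since the test uses the standard error under H0 while the interval uses the estimated one.

```python
import math
from statistics import NormalDist


def two_sided_test_and_ci(sample_mean, sample_sd, n, mu_0, alpha=0.05):
    """Return (reject H0?, is mu_0 inside the CI?, CI) using the normal approximation."""
    std_normal = NormalDist()
    se = sample_sd / math.sqrt(n)
    z = (sample_mean - mu_0) / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2)
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    reject = p_value < alpha
    null_in_ci = ci[0] <= mu_0 <= ci[1]
    return reject, null_in_ci, ci


for n in (90, 180):
    reject, inside, ci = two_sided_test_and_ci(2.17, 1.05, n, mu_0=2.00)
    print(f"n = {n}: reject = {reject}, null inside CI = {inside}, "
          f"CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In each run exactly one of "reject" and "null inside CI" is true.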

Small Sample, Non-normal Population If the sample is large, the Central Limit Theorem is applicable for testing hypotheses about the mean. If the population is normal, the sampling distribution of the mean is exactly a normal distribution to start with. If the sample is small and the population non-normal, what do we do? Nonparametric statistics is a sub-field of statistics that creates inferences concerning populations that cannot be assumed to follow any particular distribution.

Example Suppose that a nurse has been instructed to perform a procedure in a new way. Researchers recorded the change in the number of minutes it took the nurse to perform the procedure. The data are 0.6, -0.5, 1.1, 2.4, 3.5, -0.4, 2.0, 1.0, 2.1, -0.6, -0.2. We would be hard pressed to say that these data even approximately follow a normal distribution.

Assumption of normality for small sample example There are only 11 observations and we might be uncomfortable claiming that this distribution looks normal. Instead, it looks more uniform.

The Sign Test – 5 Steps Assumptions: Random, independent sample Hypotheses: Null hypothesis: Median equals zero Alternative hypothesis: Median does not equal zero Test statistic: p=7/11, interested in comparing proportion that are greater than zero with one-half.

The Sign Test – 5 Steps, cont. P-value: Need exact calculation since CLT doesn’t apply with small samples. 95% CI for p with small samples: (0.308, 0.891) Conclusion: Since 0.5 is included in the 95% confidence interval, we can’t say that the median is significantly different than zero at the 0.05 level. (We fail to reject the null hypothesis.)
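The exact calculation for the sign test needs only binomial probabilities. The sketch below computes the exact two-sided p-value for 7 positive values out of 11 and an exact interval for the proportion by bisection on the binomial tails. It assumes the interval on the slide is of the Clopper-Pearson type, which the lecture does not state; all function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def sign_test_p_value(k, n):
    """Exact two-sided p-value for H0: P(observation > 0) = 1/2."""
    tail = min(upper_tail(k, n, 0.5), lower_tail(k, n, 0.5))
    return min(1.0, 2.0 * tail)


def solve(f, target, lo=0.0, hi=1.0, iterations=60):
    """Find p in [lo, hi] with f(p) = target by bisection (f must be monotone)."""
    increasing = f(hi) > f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson style) confidence interval for a proportion."""
    lower = 0.0 if k == 0 else solve(lambda p: upper_tail(k, n, p), alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: lower_tail(k, n, p), alpha / 2)
    return lower, upper


p_value = sign_test_p_value(7, 11)
ci_low, ci_high = exact_ci(7, 11)
print(f"p-value = {p_value:.4f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Both routes lead to the same decision: the p-value is far above 0.05 and the interval contains 0.5.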

The Signed Rank Test – 5 steps Assumptions: The measurement is continuous Independent, random sample from the population Distribution is symmetric Hypotheses: H 0 : Median of the distribution is 0 H A : Median of distribution is non-zero Test Statistic: Minimum of the rank sums P-value: from the computer! For this example, p= Conclusion: As per usual.

Calculation of Signed Rank Test Statistic Order observations from smallest to largest in absolute value: |Y|(1) ≤ |Y|(2) ≤ … ≤ |Y|(n). So from the example, |-0.2| < |-0.4| < |-0.5| < |-0.6| = 0.6 < 1.0 < 1.1 < 2.0 < 2.1 < 2.4 < 3.5. Assign ranks 1, 2, …, n to these absolute values, giving tied values the average of their ranks. In the example: 1, 2, 3, 4.5, 4.5, 6, 7, …, 11.

Signed Rank Test Statistic, cont… Arrange the ranks into two groups: those with actual values that are smaller and those that are larger than zero. Sum the ranks for both the negative and positive valued observations, separately. Here, for negative values, sum of ranks = 1 + 2 + 3 + 4.5 = 10.5. For positive values, sum of ranks = 4.5 + 6 + 7 + 8 + 9 + 10 + 11 = 55.5. Test Statistic = smallest rank sum = 10.5.
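The rank sums can be verified mechanically. The sketch below ranks the eleven observations from the ordered list in the example by absolute value, gives tied values the average of their ranks, and sums the ranks by sign; the function names are illustrative. The two sums must add up to 11 × 12 / 2 = 66.

```python
def average_ranks(values):
    """Ranks of |values| (1 = smallest); ties share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next absolute value ties with this one.
        while (end + 1 < len(order)
               and abs(values[order[end + 1]]) == abs(values[order[start]])):
            end += 1
        shared = (start + end) / 2 + 1      # average of the 1-based positions
        for j in range(start, end + 1):
            ranks[order[j]] = shared
        start = end + 1
    return ranks


def signed_rank_sums(values):
    """Return (sum of ranks of negative values, sum of ranks of positive values)."""
    ranks = average_ranks(values)
    negative = sum(r for v, r in zip(values, ranks) if v < 0)
    positive = sum(r for v, r in zip(values, ranks) if v > 0)
    return negative, positive


data = [-0.2, -0.4, -0.5, -0.6, 0.6, 1.0, 1.1, 2.0, 2.1, 2.4, 3.5]
neg_sum, pos_sum = signed_rank_sums(data)
test_statistic = min(neg_sum, pos_sum)
print(f"negative: {neg_sum}, positive: {pos_sum}, statistic: {test_statistic}")
```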

P-values for signed rank test For critical values and p-values, look at tables/computer generated p-values. This procedure is unavailable in the Student version of SPSS. It is available in SAS and the regular version of SPSS.

Comments on Signed Rank Test More “powerful” than the Sign Test, but requires more assumptions One-sided tests are possible Robust to outliers Some books/programs use the sum of the ranks of the positive values as the test statistic – p-values are always the same Nonparametric confidence intervals are also available from some software programs. For tied observations, use average rank for each tied observation.

Homework To be posted, not graded Solutions will be posted on Monday Read Chapters 10, 11 and 12