IE241: Introduction to Hypothesis Testing

We said before that estimation of parameters was one of the two major areas of statistics. Now let’s turn to the second major area of statistics, hypothesis testing. What is a statistical hypothesis? A statistical hypothesis is an assumption about f(X) if X is continuous or p(X) if X is discrete. A test of a statistical hypothesis is a procedure for deciding whether or not to reject the hypothesis.

Let’s look at an example. A buyer of light bulbs bought 50 bulbs of each of two brands. When he tested them, Brand A had an average life of 1208 hours with a standard deviation of 94 hours. Brand B had a mean life of 1282 hours with a standard deviation of 80 hours. Are brands A and B really different in quality?

We set up two hypotheses. The first, called the null hypothesis Ho, is the hypothesis of no difference: Ho: μ_A = μ_B. The second, called the alternative hypothesis Ha, is the hypothesis that there is a difference: Ha: μ_A ≠ μ_B.

On the basis of the sample of 50 from each of the two populations of light bulbs, we shall either reject or not reject the hypothesis of no difference. In statistics, we always test the null hypothesis. The alternative hypothesis is the default winner if the null hypothesis is rejected.

We never really accept the null hypothesis; we simply fail to reject it on the basis of the evidence in hand. Now we need a procedure to test the null hypothesis. A test of a statistical hypothesis is a procedure for deciding whether or not to reject the null hypothesis. There are two possible decisions, reject or not reject. This means there are also two kinds of error we could make.

The two types of error are shown in the table below.

| Decision | True state: Ho true | True state: Ho false |
|---|---|---|
| Reject Ho | Type 1 error (α) | Correct decision |
| Do not reject Ho | Correct decision | Type 2 error (β) |

If we reject Ho when Ho is in fact true, then we make a type 1 error. The probability of a type 1 error is α. If we do not reject Ho when Ho is really false, then we make a type 2 error. The probability of a type 2 error is β.

Now we need a decision rule that will make the probability of the two types of error very small. The problem is that the rule cannot make both of them small simultaneously. Because in science we have to take the conservative route and never claim that we have found a new result unless we are really convinced that it is true, we choose a very small α, the probability of type 1 error.

Then among all possible decision rules given α, we choose the one that makes β as small as possible. The decision rule consists of a test statistic and a critical region where the test statistic may fall. For means from normal populations, the test statistic is

$$t = \frac{\bar{x}_B - \bar{x}_A}{\sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}}$$

where the denominator is the standard deviation of the difference between two independent means.

The critical region lies in the tails of the distribution of the test statistic. If the test statistic falls in the critical region, Ho is rejected. Now, how much of the tails should be in the critical region? That depends on just how small you want α to be. The usual choice is α = .05, but in some very critical cases, α is set at .01. Here we have just a non-critical choice of light bulbs, so we’ll choose α = .05. Because Ha is two-sided, this means that the critical region has probability .025 in each tail of the t distribution.
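The link between α and the cutoff can be sketched in a few lines of Python. This is only an illustration under the large-sample assumption that the t distribution is replaced by the standard normal; the function name `two_sided_critical_value` is hypothetical and not part of the lecture.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha: float) -> float:
    """Return z such that P(|Z| > z) = alpha for a standard normal Z."""
    # Half of alpha goes in each tail, so we need the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    cutoff = two_sided_critical_value(alpha)
    print(f"alpha = {alpha}: reject Ho when |t| > {cutoff:.2f}")
```

A smaller α pushes the cutoff further into the tails, which is why the stricter level demands a more extreme test statistic before Ho is rejected.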

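For a t distribution with .025 in each tail, the critical value of t is approximately 1.96, the same as z, because the sample sizes are greater than 30. The critical region then is |t| > 1.96. In our light bulb example, the test statistic is

$$t = \frac{1282 - 1208}{\sqrt{\dfrac{94^2}{50} + \dfrac{80^2}{50}}} = \frac{74}{17.5} \approx 4.23$$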

Now 4.23 is much greater than 1.96, so we reject the null hypothesis of no difference and declare that the average life of the B bulbs is longer than that of the A bulbs. Because α = .05, we have 95% confidence in the decision we made.
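The light bulb calculation can be reproduced with a short Python sketch. The helper name `two_sample_statistic` is an assumption made for illustration; the numbers are the ones quoted in the example. Carried out without intermediate rounding, the statistic comes to about 4.24; the 4.23 quoted above presumably reflects rounding the denominator to 17.5.

```python
import math

def two_sample_statistic(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference of two sample means divided by the estimated
    standard deviation of that difference (independent samples)."""
    standard_error = math.sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return (mean_1 - mean_2) / standard_error

# Brand B minus Brand A, using the sample figures from the example.
t = two_sample_statistic(1282, 80, 50, 1208, 94, 50)
reject = abs(t) > 1.96  # two-sided critical region at alpha = .05
print(f"t = {t:.2f}, reject Ho: {reject}")
```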

We cannot say that there is a 95% probability that we are right because we are either right or wrong and we don’t know which. But there is such a small probability that t will land in the critical region if Ho is true that if it does get there, we choose to believe that Ho is not true. If we had chosen α = .01, the critical value of t would be 2.58 and because 4.23 is greater than 2.58, we would still reject Ho. This time it would be with 99% confidence.
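The claim that t rarely lands in the critical region when Ho is true can be checked by simulation. The sketch below is illustrative only: the common population (mean 1200 hours, standard deviation 90 hours), the number of trials, and the function names are all assumptions, not values from the lecture.

```python
import math
import random
import statistics

def rejects_null(sample_a, sample_b, critical_value=1.96):
    """Apply the two-sided decision rule to two independent samples."""
    se = math.sqrt(statistics.variance(sample_a) / len(sample_a)
                   + statistics.variance(sample_b) / len(sample_b))
    t = (statistics.mean(sample_b) - statistics.mean(sample_a)) / se
    return abs(t) > critical_value

def estimate_type_1_error(trials=2000, n=50, seed=0):
    """Fraction of trials that reject Ho when Ho is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both brands are drawn from the same hypothetical population,
        # so every rejection here is a Type 1 error.
        a = [rng.gauss(1200, 90) for _ in range(n)]
        b = [rng.gauss(1200, 90) for _ in range(n)]
        rejections += rejects_null(a, b)
    return rejections / trials

print(f"estimated alpha: {estimate_type_1_error():.3f}")
```

The estimate should land near .05, which is the long-run rate of false rejections that the choice of critical value was designed to give.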

How do we know that the test we used is the best test possible? We have controlled the probability of Type 1 error. But what is the probability of Type 2 error in this test? Does this test minimize it subject to the value of α?

To answer this question, we need to consider the concept of test power. The power of a statistical test is the probability of rejecting Ho when Ho is really false. Thus power = 1-β. Clearly if the test maximizes power, it minimizes the probability of Type 2 error β. If a test maximizes power for given α, it is called an admissible testing strategy.
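To make power concrete, here is a minimal sketch of the power of the two-sided test as a function of the true difference in means. It assumes a normal approximation with the standard error treated as known (17.46 hours, the value implied by the light bulb samples); the function name and the list of differences are illustrative choices.

```python
from statistics import NormalDist

def two_sided_power(true_difference, standard_error, alpha=0.05):
    """Probability that the statistic falls in the two-sided critical
    region when the true mean difference is `true_difference`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = true_difference / standard_error
    upper_tail = 1 - z.cdf(critical - shift)   # statistic above +critical
    lower_tail = z.cdf(-critical - shift)      # statistic below -critical
    return upper_tail + lower_tail

se = 17.46  # assumed known for this sketch
for diff in (0, 20, 40, 60, 80):
    print(f"true difference {diff:>2} hours: power = {two_sided_power(diff, se):.3f}")
```

At a true difference of zero the power equals α, and it rises toward 1 as the true difference grows, so β = 1 − power shrinks accordingly.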

Before going further, we need to distinguish between two types of hypotheses. A simple hypothesis is one where the value of the parameter under Ho is a specified constant and the value of the parameter under Ha is a different specified constant. For example, if you test Ho: μ = 0 vs Ha: μ = 10 then you have a simple hypothesis test. Here you have a particular value for Ho and a different particular value for Ha.

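For testing one simple hypothesis Ha against the simple hypothesis Ho, a ground-breaking result called the Neyman-Pearson lemma provides the most powerful test. The test is based on the likelihood ratio

$$\lambda = \frac{L(\theta_a)}{L(\theta_0)} = \frac{\prod_{i=1}^{n} f(x_i;\theta_a)}{\prod_{i=1}^{n} f(x_i;\theta_0)}$$

with the likelihood at the Ha parameter value in the numerator and the likelihood at the Ho parameter value in the denominator. Ho is rejected when λ exceeds a constant chosen to give the desired α. Clearly, any value of λ > 1 would favor the alternative hypothesis, while values less than 1 would favor the null hypothesis.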

Consider the following example of a test of two simple hypotheses. A coin is either fair or has P(H) = 2/3. Under Ho, P(H) = 1/2 and under Ha, P(H) = 2/3. The coin will be tossed 3 times and a decision will be made between the two hypotheses. Thus X = number of heads = 0, 1, 2, or 3. Now let’s look at how the decision will be made.

First, let’s look at the probability of Type 1 error α. In the table below, Ho ⇒ P(H) = 1/2 and Ha ⇒ P(H) = 2/3. Now what should the critical region be?

| X | P(X\|Ho) | P(X\|Ha) |
|---|---|---|
| 0 | 1/8 | 1/27 |
| 1 | 3/8 | 6/27 |
| 2 | 3/8 | 12/27 |
| 3 | 1/8 | 8/27 |
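The probabilities in the table are binomial probabilities for 3 tosses, and they can be regenerated exactly with a short sketch. The extra column, the ratio P(X|Ha) / P(X|Ho), is the likelihood ratio for each outcome; the helper name `binomial_pmf` is an illustrative assumption. Python's `Fraction` prints values in lowest terms, so 6/27 appears as 2/9 and 12/27 as 4/9.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p); exact when p is a Fraction."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 3
p_null, p_alt = Fraction(1, 2), Fraction(2, 3)

print("X  P(X|Ho)  P(X|Ha)  ratio")
for x in range(n + 1):
    under_null = binomial_pmf(x, n, p_null)
    under_alt = binomial_pmf(x, n, p_alt)
    ratio = under_alt / under_null
    print(f"{x}  {under_null!s:>7}  {under_alt!s:>7}  {ratio!s:>5}")
```

The ratio increases with X, so large numbers of heads are the outcomes that most favor Ha.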

Under Ho, if X = 0, α = 1/8. Under Ho, if X = 3, α = 1/8. So if either of these two values is chosen as the critical region, the probability of Type 1 error would be the same. Now what if Ha is true? If X = 0 is chosen as the critical region, the value of β = 26/27 because that is the probability that X ≠ 0. On the other hand, if X = 3 is chosen as the critical region, the value of β = 19/27 because that is the probability that X ≠ 3. Clearly, the better choice for the critical region is X = 3 because that is the region that minimizes β for fixed α. So this critical region provides the more powerful test.
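The comparison of the two candidate critical regions can be verified with exact arithmetic. The sketch below assumes the same 3-toss setup; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def pmf(x, n, p):
    """Binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def error_probabilities(critical_region, n=3,
                        p_null=Fraction(1, 2), p_alt=Fraction(2, 3)):
    """Return (alpha, beta) for a critical region given as a set of X values."""
    # alpha: probability of landing in the region when Ho is true.
    alpha = sum(pmf(x, n, p_null) for x in critical_region)
    # beta: probability of missing the region when Ha is true.
    beta = 1 - sum(pmf(x, n, p_alt) for x in critical_region)
    return alpha, beta

for region in ({0}, {3}):
    alpha, beta = error_probabilities(region)
    print(f"critical region X in {region}: alpha = {alpha}, beta = {beta}")
```

Both regions have the same α, but the region X = 3 has the smaller β, which is what makes it the more powerful choice.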

In discrete variable problems like this, it may not be possible to choose a critical region of the desired α. In this illustration, you simply cannot find a critical region where α = .05 or .01. This is seldom a problem in real-life experimentation because n is usually sufficiently large so that there is a wide variety of choices for critical regions.
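To see why no critical region gives α = .05 or .01 here, one can list the size of every possible critical region for 3 tosses of a fair coin. This is a small illustrative enumeration; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

n = 3
# P(X = x | Ho) for a fair coin tossed n times.
null_probs = [Fraction(comb(n, x), 2**n) for x in range(n + 1)]

# Every critical region is a subset of {0, 1, 2, 3}; collect each one's size.
achievable = set()
for k in range(n + 2):
    for region in combinations(range(n + 1), k):
        achievable.add(sum(null_probs[x] for x in region))

sizes = sorted(achievable)
print([str(s) for s in sizes])
print("smallest nonzero alpha:", min(s for s in sizes if s > 0))
```

The only attainable values are multiples of 1/8, so the smallest nonzero α is .125, well above .05.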

This problem, which illustrates the general method for selecting the best test, was easy to discuss because there was only a single alternative to Ho. Most problems involve more than a single alternative. Hypotheses that allow more than one parameter value are called composite hypotheses.

Examples of composite hypotheses: Ho: μ = 0 vs Ha: μ ≠ 0, which is a two-sided Ha. A one-sided Ha can be written as Ho: μ = 0 vs Ha: μ > 0, or as Ho: μ = 0 vs Ha: μ < 0. All of these hypotheses are composite because they include more than one value for Ha. And unfortunately, the size of β here depends on the particular alternative value of μ being considered.

In the composite case, it is necessary to compare Type 2 errors for all possible alternative values under Ha. So now the size of Type 2 error is a function of the alternative parameter value θ. So β(θ) is the probability that the sample point will fall in the noncritical region when θ is the true value of the parameter.

Because it is more convenient to work with the critical region, the power function 1-β(θ) is usually used. The power function is the probability that the sample point will fall in the critical region when θ is the true value of the parameter. As an illustration of these points, consider the following continuous example.

Let X = the time that elapses between two successive trippings of a Geiger counter in studying cosmic radiation. It is assumed that the density function is f(x; θ) = θe^{−θx}, where θ is a parameter which depends on experimental conditions. Under Ho, θ = 2. Now a physicist believes that θ < 2. So under Ha, θ < 2.

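Now one choice for the critical region is X ≥ 1, for which

$$\alpha = \int_1^\infty 2e^{-2x}\,dx = e^{-2} = .135$$

Another choice is the left tail, X ≤ .07, for which α is nearly the same. That is,

$$\alpha = \int_0^{.07} 2e^{-2x}\,dx = 1 - e^{-.14} \approx .13$$

Now let’s examine the power functions for the two competing critical regions.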
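The size of each critical region follows from the exponential tail probability P(X ≥ c) = e^{−θc}, evaluated at the Ho value θ = 2. The following sketch, with an illustrative helper name, computes both sizes; the left-tail region comes out at about .131, close to but not exactly .135.

```python
import math

theta_null = 2.0

def right_tail_probability(cutoff, theta):
    """P(X >= cutoff) for the density theta * exp(-theta * x), x >= 0."""
    return math.exp(-theta * cutoff)

alpha_right = right_tail_probability(1.0, theta_null)       # region X >= 1
alpha_left = 1 - right_tail_probability(0.07, theta_null)   # region X <= .07
print(f"alpha for X >= 1:   {alpha_right:.3f}")
print(f"alpha for X <= .07: {alpha_left:.3f}")
```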

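For the critical region X > 1,

$$1 - \beta(\theta) = P(X > 1 \mid \theta) = \int_1^\infty \theta e^{-\theta x}\,dx = e^{-\theta}$$

and for the critical region X < .07,

$$1 - \beta(\theta) = P(X < .07 \mid \theta) = \int_0^{.07} \theta e^{-\theta x}\,dx = 1 - e^{-.07\theta}$$

The graphs of these two functions are called the power curves for the two critical regions.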

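[Figure: power curves of the two critical regions, plotted against θ]

Note that the power function for the X > 1 region is always higher than the power function for the X < .07 region over the alternatives θ < 2, so the critical region X > 1 is superior.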
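The comparison of the two critical regions can be tabulated numerically. The sketch below uses the exponential density of the Geiger counter example, for which the power of X > 1 is e^{−θ} and the power of X < .07 is 1 − e^{−.07θ}; the function names and the grid of θ values are illustrative assumptions.

```python
import math

def power_right_tail(theta, cutoff=1.0):
    """Power of the region X > cutoff: P(X > cutoff | theta)."""
    return math.exp(-theta * cutoff)

def power_left_tail(theta, cutoff=0.07):
    """Power of the region X < cutoff: P(X < cutoff | theta)."""
    return 1 - math.exp(-theta * cutoff)

print("theta  X > 1   X < .07")
for theta in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(f"{theta:<5}  {power_right_tail(theta):.3f}   {power_left_tail(theta):.3f}")
```

As θ moves below 2 the right-tail region gains power quickly while the left-tail region loses it, which matches the conclusion drawn from the power curves.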