11.1 – Significance Tests: The basics

Inference: Using the evidence provided by a sample to draw conclusions about the population.

Hypothesis: A claim about a population parameter. Sample data are gathered to assess how strong the evidence is against the claim.

Null Hypothesis: The statement being tested. We assume it is true until the data give evidence against it. NOTATION: Ho: μ = μo

Alternative Hypothesis: The statement we hope or suspect is true instead of the null hypothesis. NOTATION: Ha: μ > μo or Ha: μ < μo (One-Tailed), Ha: μ ≠ μo (Two-Tailed)

Test Statistic: A value computed from the sample data that measures how far the sample result is from what the null hypothesis predicts. It helps us decide whether we have enough evidence to reject the null hypothesis. For a mean with known σ: test statistic = z = (x̄ – μo) / (σ/√n)

p-value: The probability, computed assuming the null hypothesis is true, of getting a sample result at least as extreme as the one observed. It measures how much evidence you have against the null hypothesis: a small p-value means the observed outcome would be unlikely if the null hypothesis were true, which is strong evidence against the null hypothesis.

Statistically Significant: A result that is unlikely to occur by chance alone if the null hypothesis is true. If your p-value is smaller than the significance level, the result is statistically significant.

Significance Level: The decisive p-value we fix in advance. It is called alpha, α. It states when the null hypothesis should be rejected, and it is compared to the p-value. Common α levels are α = 0.10, α = 0.05, and α = 0.01. If p < α, reject the null. If p ≥ α, fail to reject the null (this does not prove the null is true).

Conditions: SRS, Normality, Independence

Example #1 – State the notation for the null and alternative hypotheses. a. Suppose we work in the quality control department of Ruffles Potato Chips. The quality control manager wants us to verify that the filling machine is calibrated properly. We wish to determine if the mean amount of chips in a bag is different from the advertised 12.5 ounces. The company is concerned if there are too many or too few chips in the bag. Ho: μ = 12.5, Ha: μ ≠ 12.5

Example #1 – State the notation for the null and alternative hypotheses. b. According to the US Department of Agriculture, the mean farm rent in Indiana was $89 per acre in 1995. A researcher for the USDA claims that the mean rent has decreased since then. He randomly selected 50 farms from Indiana and determined the mean farm rent to be $67. Ho: μ = 89, Ha: μ < 89

Example #1 – State the notation for the null and alternative hypotheses. c. Researchers claim to have found a brain protein that blocks the craving for fatty food and therefore increases the loss of body fat. To test this theory, 100 people are treated with the protein and the reduction in body fat is measured. With μ = mean reduction in body fat: Ho: μ = 0, Ha: μ > 0

Example #2 At the bakery where you work, loaves of bread are supposed to weigh 1 pound. From experience, the weights of loaves produced at the bakery follow a Normal distribution with standard deviation σ = 0.13 pounds. You believe that new personnel are producing loaves that are heavier than 1 pound. As supervisor of Quality Control, you want to test your claim at the 5% significance level. You weigh 20 loaves and obtain a mean weight of 1.05 pounds. a. Identify the parameter of interest. State your null and alternative hypotheses. μ = True mean weight of loaves of bread. Ho: μ = 1, Ha: μ > 1

Example #2 b. Verify the conditions are met. SRS (must assume). Normality (yes, the population is approximately Normal, so the sampling distribution of the sample mean is too). Independence (safe to assume the bakery produces more than 10(20) = 200 loaves of bread).

Example #2 c. Calculate the test statistic and the P-value. Illustrate using the graph provided.

z = (x̄ – μo) / (σ/√n) = (1.05 – 1) / (0.13/√20) ≈ 1.72

P(Z > 1.72) = 1 – P(Z < 1.72) = 1 – 0.9573 = 0.0427
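The arithmetic in part c can be checked with a few lines of Python. This is only a sketch: the helper name `upper_tail_p` is made up for illustration, and the numbers are the ones given in the bakery example.

```python
import math

def upper_tail_p(z: float) -> float:
    """Return P(Z > z) for a standard Normal Z, using the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Values taken from the bakery example
mu0 = 1.0      # hypothesized mean weight under Ho (pounds)
sigma = 0.13   # known population standard deviation (pounds)
n = 20         # number of loaves weighed
xbar = 1.05    # observed sample mean (pounds)

# Test statistic: how many standard errors the sample mean lies above mu0
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Upper-tail P-value, because Ha is "mean > 1"
p_value = upper_tail_p(z)

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The printed values should agree with the table-based answer above to the rounding shown on the slide.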

Example #2 d. State your conclusions clearly in complete sentences. Because the P-value of 0.0427 is less than 0.05, I would reject the null hypothesis at the 0.05 level. There is enough evidence to conclude that the new personnel are producing loaves with a mean weight greater than 1 pound.

11.2 – Carrying Out Significance Tests

Steps to Hypothesis Testing: PHANTOMS
P: Parameter of interest
H: Hypotheses
A: Assumptions
N: Name of Test
T: Test Statistic
O: Obtain P-Value
M: Make a Statistical Decision
S: Summary in context of problem

One-Sample Z-Test: Testing the mean when σ is known.

Calculator Tip (Z-Test): STAT – TESTS – Z-Test
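For readers without a graphing calculator, the same one-sample z-test can be sketched in Python with the standard library. The function name `one_sample_z_test`, the `alternative` labels, and the demo numbers below are assumptions made for this illustration; they are not from the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a one-sample z-test with known sigma.

    alternative is "less", "greater", or "two-sided" (labels chosen for this sketch).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
    cdf = NormalDist().cdf(z)              # P(Z < z) for a standard Normal Z
    if alternative == "less":
        p_value = cdf
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)    # double the smaller tail
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value

# Made-up inputs, for illustration only
z_demo, p_demo = one_sample_z_test(xbar=52, mu0=50, sigma=10, n=100)
print(f"z = {z_demo:.2f}, two-sided P-value = {p_demo:.4f}")
```

The P and A steps of PHANTOMS (stating the parameter and checking SRS, Normality, and Independence) still have to be done by hand; the function only covers T and O.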

Example #1 An energy official claims that the oil output per well in the US has declined from the 1998 level of 11.1 barrels per day. He randomly samples 50 wells throughout the US and determines the mean output to be 10.7 barrels per day. Assume σ = 1.3 barrels. Test the official's claim at the α = 0.05 level. P: μ = Mean oil output per well in the US. H: Ho: μ = 11.1, Ha: μ < 11.1

A: SRS (says so). Normality (n ≥ 30, so by the CLT the sampling distribution of the sample mean is approximately Normal). Independence (safe to assume there are more than 500 wells in the US). N: Z-Test

T: z = (x̄ – μo) / (σ/√n) = (10.7 – 11.1) / (1.3/√50) ≈ -2.176

O: P(Z < -2.176) ≈ 0.0148

M: 0.0148 < 0.05, so reject the null

S: There is enough evidence to reject the claim that the average oil output per well in the US is 11.1 barrels per day. The data support the official's claim that output has declined.

Example #2 The average daily volume of Dell computer stock in 2000 was 31.8 million shares with a standard deviation of 14.8 million shares according to Yahoo! A stock analyst claims that the stock volume in 2001 is different from the 2000 level. Based on a random sample of 35 trading days in 2001, he finds the sample mean to be 37.2 million shares. Test the analyst’s claim at the α = 0.01 level. P: μ = Mean daily volume of Dell computer stock in 2001. H: Ho: μ = 31.8, Ha: μ ≠ 31.8

A: SRS (says so). Normality (n ≥ 30, so by the CLT the sampling distribution of the sample mean is approximately Normal). Independence (safe to assume there are more than 350 trading days). N: Z-Test

T: z = (x̄ – μo) / (σ/√n) = (37.2 – 31.8) / (14.8/√35) ≈ 2.16

O: 2[ P(Z > 2.16)] = 2[ P(Z < -2.16)] = 2[ 0.0154] = 0.0308

M: 0.0308 > 0.01, so fail to reject the null

S: There is not enough evidence to claim that the average daily volume of Dell stock is different from 31.8 million shares.

Duality of Confidence Intervals and Hypothesis Testing: If the (1 – α) confidence interval does not contain μo, we have evidence that supports the alternative hypothesis, so we reject the null hypothesis at the α level. If the interval does contain μo, we fail to reject the null hypothesis. Note: The confidence interval matches the two-tailed test only!

Example #3 The average daily volume of Dell computer stock in 2000 was 31.8 million shares with a standard deviation of 14.8 million shares according to Yahoo! A stock analyst claims that the stock volume in 2001 is different from the 2000 level. Based on a random sample of 35 trading days in 2001, he finds the sample mean to be 37.2 million shares. Test the analyst’s claim at the α = 0.01 level. a. What was your conclusion from this hypothesis test in Example #2? We failed to reject the null hypothesis that the mean is 31.8 million shares per day.

b. Construct a 99% confidence interval for the true average daily volume of Dell Computer stock in 2001. Note: We already did P and A. N: Z-Interval

I: x̄ ± z*(σ/√n) = 37.2 ± 2.576(14.8/√35) ≈ 37.2 ± 6.444 = (30.756, 43.644)

I am 99% confident the true mean daily volume of Dell stock is between 30.756 and 43.644 million shares. c. Does this interval reaffirm your statistical decision from the hypothesis test? Explain. Yes, 31.8 is in the interval, so we cannot conclude that the 2001 mean is different.
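The interval and the duality check for the Dell example can be reproduced with a short Python sketch. The function name `z_interval` is hypothetical; the inputs are the values stated in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (low, high) for a z confidence interval for a mean with known sigma."""
    # Critical value z* that leaves (1 - confidence)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Values from the Dell stock example (millions of shares)
low, high = z_interval(xbar=37.2, sigma=14.8, n=35, confidence=0.99)

# Duality: reject Ho in the two-tailed test at alpha = 0.01
# only if the 99% interval misses the hypothesized mean
mu0 = 31.8
reject_null = not (low <= mu0 <= high)

print(f"99% interval: ({low:.3f}, {high:.3f}); reject Ho: {reject_null}")
```

Because 31.8 falls inside the interval, `reject_null` is False, matching the decision from the two-tailed test at the 0.01 level.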

Example #4 Does marijuana use affect anger expression? Assume for all non-users, the mean score on an anger expression scale is 41.5 with a standard deviation of 6.05. For a random sample of 47 frequent marijuana users, the mean score was 44. a. Test the claim that marijuana affects the expression of anger at the α = 0.05 level. P: μ = True mean anger expression for marijuana users. H: Ho: μ = 41.5, Ha: μ ≠ 41.5

A: SRS (says so). Normality (n ≥ 30, so by the CLT the sampling distribution of the sample mean is approximately Normal). Independence (safe to assume there are more than 470 marijuana users). N: Z-Test

T: z = (x̄ – μo) / (σ/√n) = (44 – 41.5) / (6.05/√47) ≈ 2.83

O: 2[ P(Z > 2.83)] = 2[ P(Z < -2.83)] = 2[ 0.0023] = 0.0046

M: 0.0046 < 0.05, so reject the null

S: There is enough evidence to conclude that the mean anger expression score for frequent marijuana users is different from 41.5. Does marijuana use affect anger expression? Yes, anger expression is different for marijuana users.

b. Calculate a 95% confidence interval for the mean anger expression of frequent marijuana users. Does this interval reaffirm your statistical decision in part a? N: Z-Interval

I: x̄ ± z*(σ/√n) = 44 ± 1.96(6.05/√47) ≈ 44 ± 1.73 = (42.27, 45.73)

I am 95% confident the true mean anger expression for marijuana users is between 42.27 and 45.73. Does this interval reaffirm your statistical decision in part a? Yes, 41.5 is not in the interval, so we cannot assume the mean is the same as for non-users.

11.3 – Use and Abuse of Tests 11.4 – Using Inference to Make Decisions

What α level to use?
How plausible is Ho? If it represents an assumption that the people you must convince have believed for years, strong evidence (small α) will be needed.
What are the consequences of rejecting Ho? Rejecting it may mean making major changes in order to accept Ha.
Consider the sample: do you need to increase the sample size or look for outliers? Is the sample a true representation of the population?
Remember that a certain percent of the time you will reject a true null hypothesis just by chance (ex. 5% of the time at α = 0.05). Repeating the test on new data helps to check this.
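At α = 0.05, a true null hypothesis is still rejected about 5% of the time. The simulation below is a rough sketch of that idea: it draws many samples from a population where Ho really is true and counts how often a two-tailed z-test rejects. The population values (mean 100, standard deviation 15), the sample size, and the function name are all made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_one_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often a two-tailed z-test rejects Ho when Ho is actually true."""
    rng = random.Random(seed)                       # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose true mean really is mu0
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:                         # same decision as p < alpha
            rejections += 1
    return rejections / trials

# Hypothetical population: mean 100, standard deviation 15
rate = type_one_error_rate(mu0=100, sigma=15, n=25, alpha=0.05, trials=10_000)
print(f"Share of true nulls rejected: {rate:.3f}")
```

The estimated rate should land close to α, which is why a single significant result deserves a second look before major changes are made.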

What α level to use? Typically it is OK to use 0.05. Beware of p-values of 0.049 and 0.051! They represent essentially the same strength of evidence, even though they fall on opposite sides of α = 0.05.

Errors in Hypothesis Testing: Because a statistician must make inferences (or conclusions) based on random data that is subject to sampling error, we can make mistakes in hypothesis testing. In fact, there are two types of errors that can be made.

                  | Ho True               | Ho False
Reject Ho         | Type I Error, p = α   | Correct decision (power of the test), p = 1 – β
Do not Reject Ho  | Correct decision      | Type II Error, p = β

Note: You will never have to calculate β

To reduce Type II error and increase the power of the test:
Increase the sample size.
Increase the significance level α. (Be careful: if we choose an α so small that it almost guarantees we never make a Type I error, then the Type II error will be large, because it would be hard to reject the null under any circumstance.)

Example #1: In a criminal trial, the defendant is held to be innocent until shown to be guilty beyond a reasonable doubt. If we consider the hypotheses Ho: defendant is innocent and Ha: defendant is guilty, we can reject Ho only if the evidence strongly favors Ha.

1. Make a diagram that shows the truth about the defendant and the possible verdicts, and that identifies the two types of error. Which type of error is more serious?

                  | Ho True (innocent)                   | Ho False (guilty)
Reject Ho         | Type I: Innocent and found guilty    | Guilty and found guilty
Do not Reject Ho  | Innocent and found innocent          | Type II: Guilty and found innocent

Type I error is more serious.

2. Is this goal better served by a test with α = 0.20 or a test with α = 0.01? Explain your answer. α = 0.01, because the probability of a Type I error (finding an innocent defendant guilty) would then be smaller.

3. Explain what is meant by the power of the test in this setting. The power is the probability of finding a person guilty who is in fact guilty.
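Although β never has to be calculated by hand in this course, a small sketch can show how sample size and α each affect power. It reuses the bakery setting from 11.1 (μo = 1, σ = 0.13, upper-tailed test). The true mean of 1.05 pounds and the function name `power_upper_tail` are assumptions made only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(mu0, mu_true, sigma, n, alpha):
    """Power of an upper-tailed one-sample z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)         # reject Ho when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # distance of the truth from Ho, in standard errors
    return 1 - std_normal.cdf(z_crit - shift)      # P(reject Ho | true mean is mu_true)

# Assumed true mean of 1.05 pounds (illustrative only)
base = power_upper_tail(mu0=1.0, mu_true=1.05, sigma=0.13, n=20, alpha=0.05)
larger_n = power_upper_tail(mu0=1.0, mu_true=1.05, sigma=0.13, n=40, alpha=0.05)
larger_alpha = power_upper_tail(mu0=1.0, mu_true=1.05, sigma=0.13, n=20, alpha=0.10)

print(f"n=20, alpha=0.05: power = {base:.3f}")
print(f"n=40, alpha=0.05: power = {larger_n:.3f}")
print(f"n=20, alpha=0.10: power = {larger_alpha:.3f}")
```

Both changes raise the power, but raising α does so at the cost of a larger Type I error probability, while a larger sample does not.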

Example #2: For each of the following situations, state the null and alternative hypotheses and identify when a Type I and a Type II Error would occur.

a. A company specializing in parachute assembly claims that its competitor’s main parachute failure rate is more than 1%. You perform a hypothesis test to determine whether the company’s claim is true. Which error is more serious?
Ho: The main parachute failure rate is 1%
Ha: The main parachute failure rate is more than 1%

                  | Ho True                                                    | Ho False
Reject Ho         | Type I: Failure rate is 1% and we think it is more than 1% | Failure rate is more than 1% and we think it is more than 1%
Do not Reject Ho  | Failure rate is 1% and we think it is 1%                   | Type II: Failure rate is more than 1% and we think it is 1%

Type II error is more serious.

b. A company that produces snack foods uses a machine to package 454 gram bags of pretzels. If it is working properly, the bags will be exactly 454 grams. You perform a hypothesis test to determine whether the company is packaging the right amount of grams per bag.
Ho: The mean weight of pretzels in a bag is 454 grams
Ha: The mean weight of pretzels in a bag is not 454 grams

                  | Ho True                                   | Ho False
Reject Ho         | Type I: 454g in bag and don’t think 454g  | Not 454g in bag and don’t think 454g
Do not Reject Ho  | 454g in bag and think 454g                | Type II: Not 454g in bag and think 454g