Chapter 8 Statistical inference: Significance Tests About Hypotheses

Chapter 8 Statistical inference: Significance Tests About Hypotheses Learn …. To use an inferential method called a Significance Test To analyze evidence that data provide To make decisions based on data

Two Major Methods for Making Statistical Inferences about a Population Confidence Interval Significance Test

Questions that Significance Tests Attempt to Answer Does a proposed diet truly result in weight loss, on the average? Is there evidence of discrimination against women in promotion decisions? Does one advertising method result in better sales, on the average, than another advertising method?

Section 8.1 What Are the Steps for Performing a Significance Test?

Hypothesis A hypothesis is a statement about a population, usually of the form that a certain parameter takes a particular numerical value or falls in a certain range of values The main goal in many research studies is to check whether the data support certain hypotheses

Significance Test A significance test is a method of using data to summarize the evidence about a hypothesis A significance test about a hypothesis has five steps

Step 1: Assumptions A (significance) test assumes that the data production used randomization Other assumptions may include: Assumptions about the sample size Assumptions about the shape of the population distribution

Step 2: Hypotheses Each significance test has two hypotheses: The null hypothesis is a statement that the parameter takes a particular value The alternative hypothesis states that the parameter falls in some alternative range of values

Null and Alternative Hypotheses The value in the null hypothesis usually represents no effect The symbol Ho denotes null hypothesis The value in the alternative hypothesis usually represents an effect of some type The symbol Ha denotes alternative hypothesis

Null and Alternative Hypotheses A null hypothesis has a single parameter value, such as Ho: p = 1/3 An alternative hypothesis has a range of values that are alternatives to the one in Ho such as Ha: p ≠ 1/3 or Ha: p > 1/3 or Ha: p < 1/3

Step 3: Test Statistic The parameter to which the hypotheses refer has a point estimate: the sample statistic A test statistic describes how far that estimate (the sample statistic) falls from the parameter value given in the null hypothesis

Step 4: P-value To interpret a test statistic value, we use a probability summary of the evidence against the null hypothesis, Ho First, we presume that Ho is true Next, we consider the sampling distribution from which the test statistic comes We summarize how far out in the tail of this sampling distribution the test statistic falls

Step 4: P-value We summarize how far out in the tail the test statistic falls by the tail probability of that value and values even more extreme This probability is called a P-value The smaller the P-value, the stronger the evidence is against Ho

Step 4: P-value The P-value is the probability that the test statistic equals the observed value or a value even more extreme. It is calculated by presuming that the null hypothesis H0 is true

Step 5: Conclusion The conclusion of a significance test reports the P-value and interprets what it says about the question that motivated the test

Summary: The Five Steps of a Significance Test Assumptions Hypotheses Test Statistic P-value Conclusion

Is the Statement a Null Hypothesis or an Alternative Hypothesis? In Canada, the proportion of adults who favor legalized gambling is 0.50. Null Hypothesis Alternative Hypothesis

Is the Statement a Null Hypothesis or an Alternative Hypothesis? The proportion of all Canadian college students who are regular smokers is less than 0.24, the value it was ten years ago. Null Hypothesis Alternative Hypothesis

Section 8.2 Significance Tests About Proportions

Example: Are Astrologers’ Predictions Better Than Guessing? Scientific “test of astrology” experiment: For each of 116 adult volunteers, an astrologer prepared a horoscope based on the positions of the planets and the moon at the moment of the person’s birth Each adult subject also filled out a California Personality Index Survey

Example: Are Astrologers’ Predictions Better Than Guessing? For a given adult, his or her birth data and horoscope were shown to an astrologer together with the results of the personality survey for that adult and for two other adults randomly selected from the group The astrologer was asked which personality chart of the 3 subjects was the correct one for that adult, based on his or her horoscope

Example: Are Astrologers’ Predictions Better Than Guessing? 28 astrologers were randomly chosen to take part in the experiment The National Council for Geocosmic Research claimed that the probability of a correct guess on any given trial in the experiment was larger than 1/3, the value for random guessing

Example: Are Astrologers’ Predictions Better Than Guessing? Put this investigation in the context of a significance test by stating null and alternative hypotheses

Example: Are Astrologers’ Predictions Better Than Guessing? With random guessing, p = 1/3 The astrologers’ claim: p > 1/3 The hypotheses for this test: Ho: p = 1/3 Ha: p > 1/3

What Are the Steps of a Significance Test about a Population Proportion? Step 1: Assumptions The variable is categorical The data are obtained using randomization The sample size is sufficiently large that the sampling distribution of the sample proportion is approximately normal: np ≥ 15 and n(1-p) ≥ 15

What Are the Steps of a Significance Test about a Population Proportion? Step 2: Hypotheses The null hypothesis has the form: Ho: p = po The alternative hypothesis has the form: Ha: p > po (one-sided test) or Ha: p < po (one-sided test) or Ha: p ≠ po (two-sided test)

What Are the Steps of a Significance Test about a Population Proportion? Step 3: Test Statistic The test statistic measures how far the sample proportion falls from the null hypothesis value, p0, relative to what we’d expect if H0 were true. The test statistic is: z = (p̂ − p0) / √(p0(1 − p0)/n), where p̂ is the sample proportion and n is the sample size

What Are the Steps of a Significance Test about a Population Proportion? Step 4: P-value The P-value summarizes the evidence. It describes how unusual the data would be if H0 were true

What Are the Steps of a Significance Test about a Population Proportion? Step 5: Conclusion We summarize the test by reporting and interpreting the P-value

Example: Are Astrologers’ Predictions Better Than Guessing? Step 1: Assumptions The data are categorical: each prediction falls in the category “correct” or “incorrect”. Each subject was identified by a random number. Subjects were randomly selected for each experiment. np = 116(1/3) > 15 and n(1-p) = 116(2/3) > 15

Example: Are Astrologers’ Predictions Better Than Guessing? Step 2: Hypotheses H0: p = 1/3 Ha: p > 1/3

Example: Are Astrologers’ Predictions Better Than Guessing? Step 3: Test Statistic In the actual experiment, the astrologers were correct with 40 of their 116 predictions (a success rate of 0.345). The test statistic is z = (0.345 − 1/3) / √((1/3)(2/3)/116) ≈ 0.26

Example: Are Astrologers’ Predictions Better Than Guessing? Step 4: P-value The P-value is 0.40

Example: Are Astrologers’ Predictions Better Than Guessing? Step 5: Conclusion The P-value of 0.40 is not especially small It does not provide strong evidence against H0: p = 1/3 There is not strong evidence that astrologers have special predictive powers
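
The five steps for the astrology example can be checked numerically. The sketch below is only an illustration, not part of the original slides: the function name one_proportion_z_test and its alternative argument are made up for this example, and it uses Python's standard-library NormalDist for the normal tail probability.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Large-sample z test of H0: p = p0. Returns (z, p_value).

    Hypothetical helper written for this example.
    """
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
    z = (p_hat - p0) / se0
    std_normal = NormalDist()          # standard normal: mean 0, sd 1
    if alternative == "greater":       # Ha: p > p0, right-tail probability
        p_value = std_normal.cdf(-z)
    elif alternative == "less":        # Ha: p < p0, left-tail probability
        p_value = std_normal.cdf(z)
    else:                              # Ha: p != p0, two-tail probability
        p_value = 2 * std_normal.cdf(-abs(z))
    return z, p_value

# Astrology experiment: 40 correct predictions out of 116, H0: p = 1/3
z, p_value = one_proportion_z_test(40, 116, 1 / 3, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.2f}")
```

The printed values round to z = 0.26 and a P-value of 0.40, in line with the P-value reported on the slide.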

How Do We Interpret the P-value? A significance test analyzes the strength of the evidence against the null hypothesis We start by presuming that H0 is true The burden of proof is on Ha

How Do We Interpret the P-value? The approach used in hypothesis testing is called a proof by contradiction. To convince ourselves that Ha is true, we must show that the data contradict H0. If the P-value is small, the data contradict H0 and support Ha

Two-Sided Significance Tests A two-sided alternative hypothesis has the form Ha: p ≠ p0 The P-value is the two-tail probability under the standard normal curve We calculate this by finding the tail probability in a single tail and then doubling it

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Study: investigate whether dogs can be trained to distinguish a patient with bladder cancer by smelling compounds released in the patient’s urine

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Experiment: Each of 6 dogs was tested with 9 trials. In each trial, one urine sample from a bladder cancer patient was randomly placed among 6 control urine samples

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Results: In a total of 54 trials with the six dogs, the dogs made the correct selection 22 times (a success rate of 0.407)

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Does this study provide strong evidence that the dogs’ predictions were better or worse than with random guessing?

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Step 1: Check the sample size requirement: Is the sample size sufficiently large to use the hypothesis test for a population proportion? Is np0 ≥ 15 and n(1-p0) ≥ 15? 54(1/7) = 7.7 and 54(6/7) = 46.3. The first, np0, is not large enough. We will see that the two-sided test is robust when this assumption is not satisfied

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Step 2: Hypotheses H0: p = 1/7 Ha: p ≠ 1/7

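Example: Dr Dog: Can Dogs Detect Cancer by Smell? Step 3: Test Statistic z = (0.407 − 1/7) / √((1/7)(6/7)/54) ≈ 5.6

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Step 4: P-value The two-tail probability beyond z ≈ 5.6 under the standard normal curve is essentially 0 (less than 0.0001)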

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Step 5: Conclusion Since the P-value is very small and the sample proportion is greater than 1/7, the evidence strongly suggests that the dogs’ selections are better than random guessing
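
As a rough check of Steps 3 and 4 for the dog experiment, the test statistic and two-sided P-value can be computed directly from the counts on the slides (22 correct out of 54 trials, H0: p = 1/7). This is a sketch using only the standard library, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, correct, p0 = 54, 22, 1 / 7           # counts and null value from the slides
p_hat = correct / n                       # sample proportion, about 0.407
se0 = sqrt(p0 * (1 - p0) / n)             # standard error under H0
z = (p_hat - p0) / se0

# Two-sided P-value: single-tail probability, doubled
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, two-sided P-value = {p_two_sided:.1e}")
```

The test statistic is about 5.6, so the two-tail probability is far below any common significance level. Because np0 is below 15 here, the normal approximation is being used outside its stated sample-size guideline.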

Example: Dr Dog: Can Dogs Detect Cancer by Smell? Insight: In this study, the subjects were a convenience sample rather than a random sample from some population Also, the dogs were not randomly selected Any inferential predictions are highly tentative The predictions become more conclusive if similar results occur in other studies

Summary of P-values for Different Alternative Hypotheses
Alternative Hypothesis    P-value
Ha: p > p0                Right-tail probability
Ha: p < p0                Left-tail probability
Ha: p ≠ p0                Two-tail probability

The Significance Level Tells Us How Strong the Evidence Must Be Sometimes we need to make a decision about whether the data provide sufficient evidence to reject H0 Before seeing the data, we decide how small the P-value would need to be to reject H0 This cutoff point is called the significance level

Significance Level The significance level is a number such that we reject H0 if the P-value is less than or equal to that number In practice, the most common significance level is 0.05 When we reject H0 we say the results are statistically significant

Possible Decisions in a Test with Significance Level = 0.05
P-value    Decision about H0
≤ 0.05     Reject H0
> 0.05     Fail to reject H0

Report the P-value Learning the actual P-value is more informative than learning only whether the test is “statistically significant at the 0.05 level” The P-values of 0.01 and 0.049 are both statistically significant in this sense, but the first P-value provides much stronger evidence against H0 than the second

“Do Not Reject H0” Is Not the Same as Saying “Accept H0” Analogy: Legal trial Null Hypothesis: Defendant is Innocent Alternative Hypothesis: Defendant is Guilty If the jury acquits the defendant, this does not mean that it accepts the defendant’s claim of innocence Innocence is plausible, because guilt has not been established beyond a reasonable doubt

One-Sided vs Two-Sided Tests Things to consider in deciding on the alternative hypothesis: The context of the real problem In most research articles, significance tests use two-sided P-values Confidence intervals are two-sided

The Binomial Test for Small Samples The test about a proportion assumes a normal sampling distribution for the sample proportion p̂ and for the z test statistic. It is a large-sample test that requires the expected numbers of successes and failures to be at least 15. In practice, the large-sample z test still performs quite well for two-sided alternatives even for small samples. Warning: For one-sided tests, when p0 differs from 0.50, the large-sample test does not work well for small samples
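
When the expected counts are too small for the z test, a tail probability can be computed from the binomial distribution itself. The sketch below is an assumption about how such a calculation might look, since the slides do not show one. The helper binomial_upper_tail is hypothetical, and the dog-experiment counts (22 of 54, p0 = 1/7) are reused purely as an illustration of a right-tail calculation.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """Exact P(X >= k) when X ~ Binomial(n, p0), i.e. presuming H0: p = p0.

    Hypothetical helper for illustration; it gives the right-tail
    probability that would serve as the P-value for Ha: p > p0.
    """
    return sum(comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(k, n + 1))

# Illustration with the dog-experiment counts: 22 correct in 54 trials
exact_right_tail = binomial_upper_tail(22, 54, 1 / 7)
print(f"Exact right-tail probability: {exact_right_tail:.2e}")
```

The exact calculation needs no sample-size condition, which is why it is the safer choice for one-sided tests with small n and p0 far from 0.50.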

For a test of H0: p = 0.50, the z test statistic is 1.04. Find the P-value for Ha: p > 0.50. Choices: 0.15, 0.20, 0.175, 0.222

For a test of H0: p = 0.50, the z test statistic is 1.04. Find the P-value for Ha: p ≠ 0.50. Choices: 0.15, 0.22, 0.30, 0.175

For a test of H0: p = 0.50, the z test statistic is 1.04. Does the P-value for Ha: p ≠ 0.50 give strong evidence against H0? Choices: yes, no

For a test of H0: p = 0.50, the z test statistic is 2.50. Find the P-value for Ha: p > 0.50. Choices: 0.05, 0.10, 0.0062, 0.0124

For a test of H0: p = 0.50, the z test statistic is 2.50. Find the P-value for Ha: p ≠ 0.50. Choices: 0.05, 0.10, 0.0062, 0.0124

For a test of H0: p = 0.50, the z test statistic is 2.50. Does the P-value for Ha: p ≠ 0.50 give strong evidence against H0? Choices: yes, no
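
The practice questions above all reduce to looking up tail probabilities under the standard normal curve. A minimal sketch of that lookup, with a hypothetical helper z_p_values, could be:

```python
from statistics import NormalDist

def z_p_values(z):
    """Return (right-tail P-value, two-sided P-value) for a z statistic.

    Hypothetical helper: the right tail is for Ha: p > p0, and the
    two-sided value doubles the single-tail probability.
    """
    std_normal = NormalDist()
    right_tail = std_normal.cdf(-z)            # P(Z >= z) by symmetry
    two_sided = 2 * std_normal.cdf(-abs(z))
    return right_tail, two_sided

for z in (1.04, 2.50):
    right_tail, two_sided = z_p_values(z)
    print(f"z = {z:.2f}: right-tail P = {right_tail:.4f}, two-sided P = {two_sided:.4f}")
```

Comparing each printed value with 0.05 shows which of the two test statistics would be called statistically significant at that level.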

Section 8.3 Significance Tests about Means

What Are the Steps of a Significance Test about a Population Mean? Step 1: Assumptions The variable is quantitative The data are obtained using randomization The population distribution is approximately normal. This is most crucial when n is small and Ha is one-sided.

What Are the Steps of a Significance Test about a Population Mean? Step 2: Hypotheses: The null hypothesis has the form: H0: µ = µ0 The alternative hypothesis has the form: Ha: µ > µ0 (one-sided test) or Ha: µ < µ0 (one-sided test) or Ha: µ ≠ µ0 (two-sided test)

What Are the Steps of a Significance Test about a Population Mean? Step 3: Test Statistic The test statistic measures how far the sample mean falls from the null hypothesis value µ0 relative to what we’d expect if H0 were true. The test statistic is: t = (x̄ − µ0) / (s/√n), where x̄ is the sample mean, s is the sample standard deviation, and n is the sample size

What Are the Steps of a Significance Test about a Population Mean? Step 4: P-value The P-value summarizes the evidence It describes how unusual the data would be if H0 were true

What Are the Steps of a Significance Test about a Population Mean? Step 5: Conclusion We summarize the test by reporting and interpreting the P-value

Summary of P-values for Different Alternative Hypotheses
Alternative Hypothesis    P-value
Ha: µ > µ0                Right-tail probability
Ha: µ < µ0                Left-tail probability
Ha: µ ≠ µ0                Two-tail probability

Example: Mean Weight Change in Anorexic Girls A study compared different psychological therapies for teenage girls suffering from anorexia The variable of interest was each girl’s weight change: ‘weight at the end of the study’ – ‘weight at the beginning of the study’

Example: Mean Weight Change in Anorexic Girls One of the therapies was cognitive therapy In this study, 29 girls received the therapeutic treatment The weight changes for the 29 girls had a sample mean of 3.00 pounds and standard deviation of 7.32 pounds

Example: Mean Weight Change in Anorexic Girls How can we frame this investigation in the context of a significance test that can detect a positive or negative effect of the therapy? Null hypothesis: “no effect” Alternative hypothesis: therapy has “some effect”

Example: Mean Weight Change in Anorexic Girls Step 1: Assumptions The variable (weight change) is quantitative The subjects were a convenience sample, rather than a random sample. The question is whether these girls are a good representation of all girls with anorexia. The population distribution is approximately normal

Example: Mean Weight Change in Anorexic Girls Step 2: Hypotheses H0: µ = 0 Ha: µ ≠ 0

Example: Mean Weight Change in Anorexic Girls Step 3: Test Statistic t = (3.00 − 0) / (7.32/√29) = 3.00/1.36 ≈ 2.21

Example: Mean Weight Change in Anorexic Girls Step 4: P-value Minitab Output
Test of mu = 0 vs not = 0
Variable   N    Mean    StDev    SE Mean   95% CI                T      P
wt_chg     29   3.000   7.3204   1.3594    (0.21546, 5.78454)    2.21   0.036

Example: Mean Weight Change in Anorexic Girls Step 5: Conclusion The small P-value of 0.036 provides considerable evidence against the null hypothesis (the hypothesis that the therapy had no effect)
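
The Minitab figures can be reproduced approximately from the summary statistics on the slides (n = 29, mean 3.00, standard deviation 7.32). The sketch below stays within the standard library, so the t tail probability is obtained by numerically integrating the t density with Simpson's rule. That integration approach and the helper names t_pdf and t_upper_tail are choices made for this example, not the method Minitab uses.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3            # half the area lies above 0

n, x_bar, s, mu0 = 29, 3.00, 7.32, 0      # summary statistics from the slides
se = s / sqrt(n)                          # standard error of the sample mean
t_stat = (x_bar - mu0) / se
p_two_sided = 2 * t_upper_tail(abs(t_stat), df=n - 1)
print(f"se = {se:.3f}, t = {t_stat:.2f}, two-sided P-value = {p_two_sided:.3f}")
```

The results should agree with the Minitab output (T = 2.21, P = 0.036) up to rounding of the standard deviation.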

Example: Mean Weight Change in Anorexic Girls “The therapy had a statistically significant positive effect on weight (mean change = 3 pounds, n = 29, t = 2.21, P-value = 0.04)” The effect, however, may be small in practical terms. 95% CI for µ: (0.2, 5.8) pounds

Results of Two-Sided Tests and Results of Confidence Intervals Agree Conclusions about means using two-sided significance tests are consistent with conclusions using confidence intervals If P-value ≤ 0.05 in a two-sided test, a 95% confidence interval does not contain the H0 value If P-value > 0.05 in a two-sided test, a 95% confidence interval does contain the H0 value
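
The agreement between the two-sided test and the confidence interval can be illustrated with the anorexia data. In this sketch the critical value 2.048 is taken from a standard t table for 95% confidence with 28 degrees of freedom. It is hard-coded as an assumption rather than computed.

```python
from math import sqrt

n, x_bar, s = 29, 3.00, 7.32     # summary statistics from the anorexia example
mu0 = 0                          # value of the mean under H0
t_crit = 2.048                   # assumed t-table value: 95% confidence, df = 28

se = s / sqrt(n)
lower = x_bar - t_crit * se
upper = x_bar + t_crit * se
h0_value_in_ci = lower <= mu0 <= upper

print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
print("H0 value inside the interval:", h0_value_in_ci)
```

The interval excludes 0, which matches the two-sided P-value of 0.036 being below 0.05.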

What If the Population Does Not Satisfy the Normality Assumption For large samples (roughly about 30 or more) this assumption is usually not important. The sampling distribution of the sample mean x̄ is approximately normal regardless of the population distribution

What If the Population Does Not Satisfy the Normality Assumption In the case of small samples, we cannot assume that the sampling distribution of the sample mean x̄ is approximately normal. Two-sided inferences using the t distribution are robust against violations of the normal population assumption. They still usually work well if the actual population distribution is not normal

Regardless of Robustness, Look at the Data Whether n is small or large, you should look at the data to check for severe skew or for severe outliers In these cases, the sample mean could be a misleading measure

A study has a random sample of 20 subjects. The test statistic for testing H0: µ = 100 is t = 2.40. Find the approximate P-value for the alternative, Ha: µ > 100. Choices: between .100 and .050, between .050 and .025, between .025 and .010, between .010 and .005

A study has a random sample of 20 subjects. The test statistic for testing H0: µ = 100 is t = 2.40. Find the approximate P-value for the alternative, Ha: µ ≠ 100. Choices: between .100 and .050, between .050 and .020, between .025 and .010, between .020 and .010

Section 8.4 Decisions and Types of Errors in Significance Tests

Type I and Type II Errors When H0 is true, a Type I Error occurs when H0 is rejected When H0 is false, a Type II Error occurs when H0 is not rejected

Significance Test Results

An Analogy: Decision Errors in a Legal Trial

P(Type I Error) = Significance Level α Suppose H0 is true. The probability of rejecting H0, thereby committing a Type I error, equals the significance level, α, for the test.
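
A small simulation can make the link between the significance level and the Type I error rate concrete. The sketch below is illustrative only: it reuses the astrology setting (n = 116, H0: p = 1/3, one-sided test at level 0.05), generates data with H0 true, and counts how often H0 is rejected. The function name rejection_rate, the seed, and the number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(p_true, p0, n, alpha, reps, seed=0):
    """Fraction of simulated samples in which H0: p = p0 is rejected
    in favor of Ha: p > p0 by the large-sample z test."""
    rng = random.Random(seed)                    # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha)     # about 1.645 when alpha = 0.05
    se0 = sqrt(p0 * (1 - p0) / n)
    rejections = 0
    for _ in range(reps):
        successes = sum(rng.random() < p_true for _ in range(n))
        z = (successes / n - p0) / se0
        if z >= z_crit:
            rejections += 1
    return rejections / reps

# H0 is true here (p_true equals p0), so every rejection is a Type I error
type_i_rate = rejection_rate(p_true=1 / 3, p0=1 / 3, n=116, alpha=0.05, reps=5000)
print(f"Simulated Type I error rate: {type_i_rate:.3f}")
```

The simulated rate lands near 0.05 but not exactly on it, because the number of successes is discrete and the normal distribution is only an approximation.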

P(Type I Error) We can control the probability of a Type I error by our choice of the significance level The more serious the consequences of a Type I error, the smaller α should be

Type I and Type II Errors As P(Type I Error) goes Down, P(Type II Error) goes Up The two probabilities are inversely related

A significance test about a proportion is conducted using a significance level of 0.05. The test statistic is 2.58. The P-value is 0.01. If Ho is true, for what probability of a Type I error was the test designed? .01 .05 2.58 .02

A significance test about a proportion is conducted using a significance level of 0.05. The test statistic is 2.58. The P-value is 0.01. If this test resulted in a decision error, what type of error was it? Type I Type II

Section 8.5 Limitations of Significance Tests

Statistical Significance Does Not Mean Practical Significance When we conduct a significance test, its main relevance is studying whether the true parameter value is: Above, or below, the value in H0 and Sufficiently different from the value in H0 to be of practical importance

What the Significance Test Tells Us The test gives us information about whether the parameter differs from the H0 value and its direction from that value

What the Significance Test Does Not Tell Us It does not tell us about the practical importance of the results

Statistical Significance vs. Practical Significance A small P-value, such as 0.001, is highly statistically significant, but it does not imply an important finding in any practical sense In particular, whenever the sample size is large, small P-values can occur when the point estimate is near the parameter value in H0
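
To see how sample size alone can drive the P-value, consider a made-up case where the sample proportion is 0.51 and H0: p = 0.50. These numbers are hypothetical and chosen only to illustrate the point. The sketch computes the two-sided P-value for the same estimate at several sample sizes.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(p_hat, p0, n):
    """Two-sided P-value of the large-sample z test for a proportion."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * NormalDist().cdf(-abs(z))

p0, p_hat = 0.50, 0.51           # hypothetical values, not from the slides
for n in (100, 10_000, 100_000):
    print(f"n = {n:>7}: two-sided P-value = {two_sided_p(p_hat, p0, n):.2e}")
```

The estimate differs from the H0 value by the same 0.01 in every row, yet the P-value shrinks from large to tiny as n grows. A small P-value therefore says nothing by itself about the size of the effect.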

Significance Tests Are Less Useful Than Confidence Intervals A significance test merely indicates whether the particular parameter value in H0 is plausible When a P-value is small, the significance test indicates that the hypothesized value is not plausible, but it tells us little about which potential parameter values are plausible

Significance Tests are Less Useful than Confidence Intervals A Confidence Interval is more informative, because it displays the entire set of believable values

Misinterpretations of Results of Significance Tests “Do Not Reject H0” does not mean “Accept H0” A P-value above 0.05 when the significance level is 0.05, does not mean that H0 is correct A test merely indicates whether a particular parameter value is plausible

Misinterpretations of Results of Significance Tests Statistical significance does not mean practical significance A small P-value does not tell us whether the parameter value differs by much in practical terms from the value in H0

Misinterpretations of Results of Significance Tests The P-value cannot be interpreted as the probability that H0 is true

Misinterpretations of Results of Significance Tests It is misleading to report results only if they are “statistically significant”

Misinterpretations of Results of Significance Tests Some tests may be statistically significant just by chance

Misinterpretations of Results of Significance Tests True effects may not be as large as initial estimates reported by the media

Section 8.6 How Likely is a Type II Error?

Type II Error A Type II error occurs in a hypothesis test when we fail to reject H0 even though it is actually false

Calculating the Probability of a Type II Error To calculate the probability of a Type II error, we must do a separate calculation for various values of the parameter of interest

Example: Reconsider the Experiment to test Astrologers’ Predictions Scientific “test of astrology” experiment: For each of 116 adult volunteers, an astrologer prepared a horoscope based on the positions of the planets and the moon at the moment of the person’s birth Each adult subject also filled out a California Personality Index Survey

Example: Reconsider the Experiment to test Astrologers’ Predictions For a given adult, his or her birth data and horoscope were shown to an astrologer together with the results of the personality survey for that adult and for two other adults randomly selected from the group The astrologer was asked which personality chart of the 3 subjects was the correct one for that adult, based on his or her horoscope

Example: Reconsider the Experiment to test Astrologers’ Predictions 28 astrologers were randomly chosen to take part in the experiment The National Council for Geocosmic Research claimed that the probability of a correct guess on any given trial in the experiment was larger than 1/3, the value for random guessing

Example: Reconsider the Experiment to test Astrologers’ Predictions With random guessing, p = 1/3 The astrologers’ claim: p > 1/3 The hypotheses for this test: Ho: p = 1/3 Ha: p > 1/3 The significance level used for the test is 0.05

Example: Reconsider the Experiment to test Astrologers’ Predictions For what values of the sample proportion can we reject H0? A test statistic of z = 1.645 has a P-value of 0.05. So, we reject H0 for z ≥ 1.645 and we fail to reject H0 for z <1.645.

Example: Reconsider the Experiment to test Astrologers’ Predictions Find the value of the sample proportion that would give us a z of 1.645: p̂ = 1/3 + 1.645 × √((1/3)(2/3)/116) = 0.333 + 1.645(0.0438) ≈ 0.405

Example: Reconsider the Experiment to test Astrologers’ Predictions So, we fail to reject H0 if the sample proportion is less than 0.405. Suppose that in reality astrologers can make the correct prediction 50% of the time (that is, p = 0.50). In this case (p = 0.50), we can now calculate the probability of a Type II error

Example: Reconsider the Experiment to test Astrologers’ Predictions We calculate the probability of a sample proportion < 0.405 assuming that the true proportion is 0.50: z = (0.405 - 0.50) / √(0.50(0.50)/116) = -0.095/0.0464 ≈ -2.04

Example: Reconsider the Experiment to test Astrologers’ Predictions The area to the left of -2.04 in the standard normal table is 0.02 The probability of making a Type II error and failing to reject H0: p = 1/3 is only 0.02 in the case in which the true proportion is 0.50 This is only a small chance of making a Type II error

Power of a Test Power = 1 – P(Type II error) The higher the power, the better In practice, it is ideal for studies to have high power while using a relatively small significance level

Example: Reconsider the Experiment to test Astrologers’ Predictions In this example, the power of the test at p = 0.50 is 1 – 0.02 = 0.98. Since higher power is better, a test power of 0.98 is quite good
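
The Type II error and power calculation for the astrology experiment can be sketched in a few lines. The steps follow the slides: find the rejection cutoff for the sample proportion under H0, then find the probability of falling below that cutoff when the true proportion is 0.50. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, alpha = 116, 1 / 3, 0.05
p_true = 0.50                                   # assumed true proportion

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)          # about 1.645
cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)  # reject H0 when p_hat >= cutoff

# Under p = p_true, the standard error of p_hat uses p_true rather than p0
se_true = sqrt(p_true * (1 - p_true) / n)
z_at_cutoff = (cutoff - p_true) / se_true
type_ii_prob = std_normal.cdf(z_at_cutoff)      # P(fail to reject H0 | p = p_true)
power = 1 - type_ii_prob

print(f"cutoff = {cutoff:.3f}, z = {z_at_cutoff:.2f}")
print(f"P(Type II error) = {type_ii_prob:.2f}, power = {power:.2f}")
```

Changing p_true to a value closer to 1/3 shows how the probability of a Type II error grows as the true proportion approaches the H0 value, which is why the calculation has to be repeated for each parameter value of interest.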