Inferential Statistics Part 2: Hypothesis Testing Chapter 9 p. 280 - 306.


Introduction. Hypothesis testing is closely related to estimation (i.e., what we studied last week). The difference is that now we are posing a hypothesis that we want to test. For example, rather than just estimating a population parameter using a sample, we may hypothesize that a sample is different from the population in some way. Based on a sample statistic, we can either accept or reject the hypothesis.

Steps in Classical Hypothesis Testing
1: Formulate a hypothesis
2: Specify the sampling statistic and its distribution
3: Select a level of significance
4: Construct a decision rule
5: Compute a value of the test statistic
6: Decide to accept or reject the hypothesis

Formulate a hypothesis. Null Hypothesis (H0): the sample statistic follows the population parameter (e.g., characteristics from a sample more or less match those from the population). Alternative Hypothesis (HA): the sample statistic does not follow the population parameter. Possible statements: H0: Θ = Θ0 vs. HA: Θ ≠ Θ0; H0: Θ ≥ Θ0 vs. HA: Θ < Θ0; H0: Θ ≤ Θ0 vs. HA: Θ > Θ0.

Formulate a hypothesis Which type of hypothesis (null or alternative) are we typically concerned with? How do “tails” of a distribution fit the statements? What does it mean to say these hypotheses are mutually exclusive & exhaustive?

Formulate a hypothesis. Remember that the hypotheses are being tested using sample data that may contain sampling error. This is why hypothesis testing falls under the category of inferential statistics: we have to infer results based on a sample. We can't be completely certain of the results, so there is a degree of uncertainty associated with our answers. To estimate this uncertainty we rely upon probability.

Types of error. Type 1 Error: when we falsely reject a null hypothesis; the probability of doing so is labeled α (i.e., α = P(type 1 error)). Type 2 Error: when we falsely accept a null hypothesis; the probability of doing so is labeled β (i.e., β = P(type 2 error)).
             H0 is true      H0 is false
Reject H0    Type 1 Error    No Error
Accept H0    No Error        Type 2 Error
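The meaning of α can be checked with a small simulation. The sketch below is illustrative only: the function name, the population values (mean 50, σ = 3, n = 30), the number of trials, and the random seed are all assumptions chosen for the demonstration. It draws many samples from a population where H0 is true and counts how often a one-tailed z-test rejects anyway.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type1_error_rate(mu0=50.0, sigma=3.0, n=30, alpha=0.05, trials=5000, seed=0):
    """Estimate how often a one-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)               # fixed seed so the run is repeatable
    se = sigma / sqrt(n)                    # standard deviation of the sample mean
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every sample really comes from a population with mean mu0
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if mean(sample) > critical:
            rejections += 1                 # a false rejection, i.e. a type 1 error
    return rejections / trials

rate = type1_error_rate()
print(f"simulated type 1 error rate: {rate:.3f}")
```

The simulated rate should land close to the chosen α, which is what "α = P(type 1 error)" means in practice.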

Specify the sampling statistic and its distribution. What sampling statistic should you choose for μ, σ, and π, respectively? What distributions will the sampling statistics have, and how do we know? FYI, when used to test a hypothesis, sampling statistics are also called test statistics.

Select a level of significance. In classical hypothesis testing we are only concerned with type 1 error (α), for example an alpha of 0.1, 0.05, or 0.01. The value for alpha is called the significance level. This means that if we reject H0 we will be very confident that it is false; how confident depends on the significance level. The flip side of this approach is that we are more likely to not reject a null hypothesis that is false.

Select a level of significance. How does this fit with the idea that we are typically concerned with HA rather than H0? Answer: since the significance is tied to rejecting H0, it is also linked with accepting HA. This means that the hypotheses we make should be structured so that we are testing HA (i.e., rejecting H0 should be scientifically interesting). To make this clearer, think about the opposite case: if we were really interested in accepting H0, we would have no idea about the significance because we are ignoring type 2 error.

Select a level of significance. Whenever we report a decision about the null hypothesis (to reject it or not), we also report the statistical significance. Example: "The null hypothesis is rejected at the 0.05 significance level."

Select a level of significance. Which significance level we actually choose depends on the application. When might we want a very small α? In geography, 0.1, 0.05, and 0.01 are pretty typical. It is also common to see results reported for multiple alphas.

Construct a decision rule. For this step we take the hypothesis we've defined and the significance level we've selected and determine the critical region and the critical values. In other words, we take our values and determine the thresholds for accepting or rejecting H0.

Construct a decision rule. Critical regions: if the sample statistic falls within these area(s) we will reject H0. Critical values: the thresholds that divide the critical region(s) from the non-critical region.

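Construct a decision rule. For a test statistic with a normal distribution (e.g., x̄ and p) we make our decision rule using: critical value = hypothesized value ± z × (standard deviation of the test statistic). For p, the standard deviation is √(π(1 − π)/n). For x̄, the standard deviation is σ/√n. Key things to remember: how to calculate σ, and the number of tails.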
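As a rough sketch of this step, the helper below computes critical values for a normally distributed test statistic. The function names and the "upper"/"lower"/"two" labels are hypothetical, introduced only for illustration; the z-values come from Python's standard-library normal distribution rather than a printed z-table.

```python
from math import sqrt
from statistics import NormalDist

def std_dev_of_mean(sigma, n):
    """Standard deviation of the sample mean."""
    return sigma / sqrt(n)

def std_dev_of_proportion(pi, n):
    """Standard deviation of the sample proportion under the hypothesized pi."""
    return sqrt(pi * (1 - pi) / n)

def critical_values(hypothesized, std_dev, alpha, tails="upper"):
    """Return (lower, upper) critical values; None marks a side that is not used."""
    z = NormalDist().inv_cdf
    if tails == "upper":    # HA uses >, so all of alpha sits in the upper tail
        return (None, hypothesized + z(1 - alpha) * std_dev)
    if tails == "lower":    # HA uses <, so all of alpha sits in the lower tail
        return (hypothesized - z(1 - alpha) * std_dev, None)
    if tails == "two":      # HA uses "not equal", so alpha is split across both tails
        half_width = z(1 - alpha / 2) * std_dev
        return (hypothesized - half_width, hypothesized + half_width)
    raise ValueError("tails must be 'upper', 'lower', or 'two'")

# A standard normal statistic with alpha = 0.05 and two tails gives about +/- 1.96
print(critical_values(0.0, 1.0, 0.05, tails="two"))
```

The number of tails decides whether α or α/2 is used, which is why it is passed explicitly.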

Compute a value of the test statistic. Here we just compute the values using equations we're familiar with (e.g., for x̄ and p). Note that constructing a decision rule and computing the value of a test statistic can also be done using z-values for the critical values and for the test statistic (see p. 289 for details).

Decide to accept or reject the hypothesis. Now we just compare the test statistic with the critical values and make our decision to reject H0 or not.

Classical Hypothesis Testing Example Has the mean temperature of Charlotte increased over the last 30 years? This is an example for μ

Example Data. Suppose Charlotte's annual mean temperature for the last 150 years is 50 °F, and for the last 30 years is 53 °F. Suppose the population variance, σ², for these 150 years is 9 (so σ = 3). Assumptions: each year is independent of other years; the last 30 years act as a sample of the population of years since greenhouse gases have been emitted into the Earth's atmosphere (these 30 are all we have access to); these 30 years come from the same distribution.

Steps in Classical Hypothesis Testing
1: Formulate a hypothesis
2: Specify the sampling statistic and its distribution
3: Select a level of significance
4: Construct a decision rule
5: Compute a value of the test statistic
6: Decide to accept or reject the hypothesis

Step 1: Formulate a hypothesis. Scientifically, we say our hypothesis is: the mean temperature of Charlotte has increased over the last 30 years. Statistically, we develop the null hypothesis H0: Θ ≤ Θ0 and the alternative hypothesis HA: Θ > Θ0. When we apply the data, the null hypothesis is H0: μ ≤ 50 °F and the alternative hypothesis is HA: μ > 50 °F. This is a 1-sided test.

Step 2: Specify the sampling statistic and its distribution. What sampling statistic should we use? What distribution will it have? Answers: the sample mean (in this case 53 °F), with a normal distribution. Our sample size is 30, which is just large enough to use the z rather than the t distribution. This is an application of the central limit theorem.

The sample statistic & the hypothesis. If x̄ is below or near 50, we do not reject the null hypothesis, H0: μ ≤ 50 °F. If x̄ is far greater than 50, we reject the null hypothesis in favor of the alternative hypothesis, HA: μ > 50 °F. Why isn't this simple comparison sufficient? Answer: because x̄ comes from just a sample and may contain sampling error. We set a cutoff point for x̄, above which we reject our null hypothesis. This cutoff is set at a point where, if the null hypothesis were true, a value of x̄ this large or larger would be very unlikely (due to sampling variation alone).

Step 3: Select a level of significance. This step is always somewhat arbitrary, but we'll just use 0.05. This means that we're willing to accept a 5% chance of making a type 1 error (i.e., rejecting H0 when we should not).

Step 4: Construct a decision rule

So we say that we will reject H0 if x̄ is greater than the critical value, 50 + 1.645 × (3/√30) ≈ 50.9 °F, at a significance level of 0.05.

Steps 5 & 6. Step 5: Compute a value of the test statistic. In this case we already have the test statistic (x̄ = 53). Step 6: Decide to reject the null hypothesis (or not). Now we just compare our test statistic with the critical value. Since 53 is greater than the critical value, we reject the null hypothesis and accept the alternative hypothesis.
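The Charlotte example can be run end to end as a quick check. This is a minimal sketch using only the numbers given in the example data (50 °F, 53 °F, σ = 3, n = 30, α = 0.05); the variable names are illustrative, and the z-value is taken from Python's standard library instead of a z-table.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50.0      # hypothesized mean: the 150-year mean temperature (°F)
sigma = 3.0     # population standard deviation (variance of 9)
n = 30          # sample size: the last 30 years
x_bar = 53.0    # sample mean for the last 30 years (°F)
alpha = 0.05    # significance level, 1-sided test

se = sigma / sqrt(n)                          # standard deviation of the sample mean
z_alpha = NormalDist().inv_cdf(1 - alpha)     # about 1.645
critical_value = mu0 + z_alpha * se           # upper critical value
reject_h0 = x_bar > critical_value

print(f"critical value = {critical_value:.2f} °F, reject H0: {reject_h0}")
```

Because the test is 1-sided with HA using ">", only the upper critical value matters.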

Shortcomings of the classical approach. The decision to reject the null hypothesis is binary. No detail is given for how far the test statistic is from the critical value (e.g., is it just above it, or way above it?). Different α values might lead to different decisions.

The PROB-VALUE approach. This approach fixes the shortcomings of the classical approach. Basically it involves using the same equations, but flipping them around so that we solve for α. In other words: At what level is the test statistic significant? What is the α (i.e., the probability of making a type 1 error)? If we reject H0, how likely are we to be wrong?

The PROB-VALUE approach. This is based on the equation: z = (test statistic − hypothesized value) / (standard deviation of the test statistic). The difference from the classical approach is that now we look up the z-value to tell us the alpha (α).

PROB-VALUE example. Charlotte example: z = (53 − 50) / (3/√30) ≈ 5.48. Using a z-table, what alpha is associated with this z? Answer: α ≈ 0.0000000216. This value is actually from Excel; the z-table in the book does not go up to 5.48. In other words, there is a 2.16 in 100 million chance of the null being falsely rejected.
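The same tail probability can be reproduced without Excel. The sketch below uses the Charlotte numbers from the example data and the complementary error function from Python's math module, which stays accurate far out in the tail; the variable names are illustrative.

```python
from math import erfc, sqrt

mu0, sigma, n, x_bar = 50.0, 3.0, 30, 53.0

# z-value of the observed sample mean under H0
z = (x_bar - mu0) / (sigma / sqrt(n))

# Upper-tail area of the standard normal beyond z: P(Z > z) = 0.5 * erfc(z / sqrt(2))
prob_value = 0.5 * erfc(z / sqrt(2))

print(f"z = {z:.3f}, PROB-VALUE = {prob_value:.3e}")
```

The result is on the order of 2 in 100 million, in line with the figure quoted on the slide.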

PROB-VALUE & alpha. Remember that the PROB-VALUE is equivalent to finding the alpha associated with a z-value. Therefore we can also use the PROB-VALUE to reject H0 (or not). Example: if our selected significance level is 0.05 and our PROB-VALUE is smaller than that, we'd reject the null hypothesis, since PROB-VALUE < 0.05.

Additional things to consider. As with confidence intervals, when conducting a hypothesis test using μ we should use t instead of z when n < 30 or when we have s instead of σ (with n > 30 either is ok). As with confidence intervals, when conducting a hypothesis test using π we should use the binomial distribution instead of z when n < 100. Example 9-4 in the book solves such a problem.
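These rules of thumb can be written down as a small lookup. The function below is a hypothetical helper, not something from the book; it simply encodes the thresholds stated above, and it picks t whenever only s is available (the slide allows either once n > 30).

```python
def choose_distribution(parameter, n, sigma_known=True):
    """Pick a distribution using the rules of thumb stated on the slide."""
    if parameter == "mean":
        # t for small samples, or whenever s stands in for sigma
        if n < 30 or not sigma_known:
            return "t"
        return "z"
    if parameter == "proportion":
        # binomial for small samples, z otherwise
        return "binomial" if n < 100 else "z"
    raise ValueError("parameter must be 'mean' or 'proportion'")

print(choose_distribution("mean", 30))                      # z
print(choose_distribution("mean", 16, sigma_known=False))   # t
print(choose_distribution("proportion", 160))               # z
```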

Sample Problems Galore! We’re going to go through several examples that are reminiscent of problems on your homework and what will be on the exam

Key questions to ask before starting. What is the test statistic? x̄ and p have slightly different equations, particularly for their standard deviations. How many tails does the test have? This determines whether we use α or α/2, and whether we multiply the PROB-VALUE by 2. If we are doing a 1-tailed test, which critical value are we concerned about? HA with <: the lower critical value. HA with >: the upper critical value. What distribution should we use (t, z, or binomial)?

Sample Problem #1 A census of UNC students found that students had, on average, 3.4 pets each while growing up with a standard deviation of 1.9 pets. A single dorm with 220 students had an average of 3.65 pets growing up. Assuming the students are assigned to the dorm at random (i.e., they are statistically independent), does this dorm have a higher than normal “pet history” with a 0.01 significance level?

Sample Problem #1 What is the test statistic? How many tails does the test have? Which critical value are we concerned about? Putting these together - what are H 0 and H A ?

Sample Problem #1. What are n, σ, and α? n = 220, σ = 1.9, α = 0.01. What distribution should we use and why? The z-distribution, since n > 30. What is the z-value associated with α? z_0.01 = 2.33. What is the standard deviation of x̄?

Sample Problem #1 Critical Value Should we reject the null hypothesis?

Sample Problem #1 What would happen to the critical value if we changed the significance level to 0.05? Does this make us more or less likely to reject the null hypothesis?
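One way to work through these questions is to compute the upper critical value at both significance levels. This is a sketch under the problem's stated numbers (μ = 3.4, σ = 1.9, n = 220, x̄ = 3.65); the helper name is illustrative, and the z-values come from the standard library, so they differ slightly from the rounded table value of 2.33.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 3.4, 1.9, 220, 3.65
se = sigma / sqrt(n)    # standard deviation of the sample mean

def upper_critical_value(alpha):
    """Upper critical value for a 1-tailed test (HA: dorm mean > 3.4)."""
    return mu0 + NormalDist().inv_cdf(1 - alpha) * se

# Compare the sample mean with the critical value at each significance level
decisions = {alpha: x_bar > upper_critical_value(alpha) for alpha in (0.01, 0.05)}

for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: critical value = {upper_critical_value(alpha):.3f}, reject H0: {reject}")
```

Under these assumptions, the dorm mean of 3.65 falls short of the critical value at α = 0.01 but exceeds the lower critical value at α = 0.05, so a larger α makes rejection more likely.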

Sample Problem #1 PROB-VALUE. What values go in this equation? What do we do with the resulting z-value? What is the PROB-VALUE?
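A minimal sketch of the PROB-VALUE calculation for this problem, assuming the same numbers (μ = 3.4, σ = 1.9, n = 220, x̄ = 3.65, α = 0.01) and a 1-tailed test in the upper tail. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 3.4, 1.9, 220, 3.65, 0.01

se = sigma / sqrt(n)                       # standard deviation of the sample mean
z = (x_bar - mu0) / se                     # z-value of the observed sample mean
prob_value = 1 - NormalDist().cdf(z)       # upper-tail area beyond z
reject_h0 = prob_value < alpha

print(f"z = {z:.2f}, PROB-VALUE = {prob_value:.4f}, reject H0: {reject_h0}")
```

The PROB-VALUE is compared directly with α, so no critical value is needed.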

Sample Problem #2 A census of UNC students found that students had, on average, a 12 minute commute (walking, bicycling, bus, car, etc.) to their first class of the day. 16 randomly sampled students living off campus had an average commute of 17 minutes with a sample standard deviation of 4.5 minutes. Do students living off campus have a longer commute with a 0.05 significance level?

Sample Problem #2 What is the test statistic? How many tails does the test have? Which critical value are we concerned about? Putting these together - what are H 0 and H A ?

Sample Problem #2. What are n, s, and α? n = 16, s = 4.5, α = 0.05. What distribution should we use and why? The t-distribution, since n < 30 and we have s instead of σ. What is the t-value associated with α? t_(0.05, 15) = 1.75. What is the standard deviation of x̄?

Sample Problem #2 Critical Value Should we reject the null hypothesis?
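The critical value for this problem can be sketched as follows, assuming the problem's numbers (μ = 12, s = 4.5, n = 16, x̄ = 17) and the table value t = 1.75 for α = 0.05 with 15 degrees of freedom quoted on the earlier slide. Variable names are illustrative.

```python
from math import sqrt

mu0, s, n, x_bar = 12.0, 4.5, 16, 17.0
t_table = 1.75                      # t-value for alpha = 0.05, 15 df, as quoted from the table

se = s / sqrt(n)                    # estimated standard deviation of the sample mean
critical_value = mu0 + t_table * se # upper critical value (HA: off-campus mean > 12)
reject_h0 = x_bar > critical_value

print(f"se = {se:.3f}, critical value = {critical_value:.2f}, reject H0: {reject_h0}")
```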

Sample Problem #2 PROB-VALUE. What values go in this equation? What do we do with the resulting t-value? What is the PROB-VALUE?
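Python's standard library has no t-distribution, so the sketch below approximates the upper-tail area by integrating the t density numerically with Simpson's rule. The function names and step count are assumptions made for this illustration; a statistics package or a t-table would normally be used instead. The inputs are the problem's numbers (μ = 12, s = 4.5, n = 16, x̄ = 17).

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0 or steps % 2:
        raise ValueError("t must be non-negative and steps must be even")
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the area lies above zero; subtract the part between 0 and t
    return 0.5 - total * h / 3

mu0, s, n, x_bar, alpha = 12.0, 4.5, 16, 17.0, 0.05
t_value = (x_bar - mu0) / (s / sqrt(n))
prob_value = t_upper_tail(t_value, df=n - 1)

print(f"t = {t_value:.3f}, PROB-VALUE = {prob_value:.5f}, reject H0: {prob_value < alpha}")
```

As a sanity check, `t_upper_tail(1.753, 15)` should come out very close to 0.05, matching the table value used for the critical-value approach.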

Sample Problem #3. A botanical index states that the average weight of a northern red oak acorn is 6 grams. A random sample of 101 acorns was collected from the red oaks in the quad, and the acorns had an average weight of 5.6 grams and a sample standard deviation of 1.3 grams. Are the oak trees in the quad atypical compared with normal trees, at a significance level of 0.05?

Sample Problem #3 What is the test statistic? How many tails does the test have? Which critical value are we concerned about? Putting these together - what are H 0 and H A ?

Sample Problem #3. What are n, s, and α? n = 101, s = 1.3, α = 0.05. What distribution should we use and why? Either one would be ok, but since we're using s we'll go with t. What is the t-value associated with α/2? t_(0.025, 100) = 1.98. Note how close this is to z = 1.96. What is the standard deviation of x̄?

Sample Problem #3 Critical Value Should we reject the null hypothesis?
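Because this test has two tails, there are two critical values. The sketch below assumes the problem's numbers (μ = 6, s = 1.3, n = 101, x̄ = 5.6) and the table value t = 1.98 for α/2 = 0.025 with 100 degrees of freedom quoted on the earlier slide. Variable names are illustrative.

```python
from math import sqrt

mu0, s, n, x_bar = 6.0, 1.3, 101, 5.6
t_table = 1.98                       # t-value for alpha/2 = 0.025, 100 df, as quoted from the table

se = s / sqrt(n)                     # estimated standard deviation of the sample mean
lower = mu0 - t_table * se           # lower critical value
upper = mu0 + t_table * se           # upper critical value
reject_h0 = x_bar < lower or x_bar > upper

print(f"critical values = ({lower:.3f}, {upper:.3f}), reject H0: {reject_h0}")
```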

Sample Problem #3 PROB-VALUE. What values go in this equation? What do we do with the resulting t-value? What is the PROB-VALUE?
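A rough sketch of the two-tailed PROB-VALUE for this problem. It assumes the normal distribution as a stand-in for t, which the slides describe as acceptable with n = 101; the t-distribution with 100 degrees of freedom has slightly heavier tails, so its PROB-VALUE would be somewhat larger. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, s, n, x_bar, alpha = 6.0, 1.3, 101, 5.6, 0.05

se = s / sqrt(n)
z = (x_bar - mu0) / se                     # negative: the sample mean is below 6 grams

# Two-tailed test: take the area in one tail beyond |z| and multiply it by 2
one_tail = NormalDist().cdf(-abs(z))
prob_value = 2 * one_tail
reject_h0 = prob_value < alpha

print(f"z = {z:.2f}, PROB-VALUE = {prob_value:.4f}, reject H0: {reject_h0}")
```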

Sample Problem #4. Suppose a census of UNC students found that 8 percent of students bike to class regularly. A random sample of 160 business majors found that 7 biked regularly. It would seem that business majors bike less than other students; what significance level does this statement have?

Sample Problem #4 What is the test statistic? How many tails does the test have? Which critical value are we concerned about? Putting these together - what are H 0 and H A ?

Sample Problem #4. What are n, π, and p? n = 160, π = 0.08, p = 7/160 = 0.04375. What distribution should we use and why? The z-distribution, since we have probabilities and a large n. What is the standard deviation of p?

Sample Problem #4 PROB-VALUE. What values go in this equation? What do we do with the resulting z-value? What is the PROB-VALUE?
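A minimal sketch of the PROB-VALUE for this proportion problem, assuming the problem's numbers (π = 0.08, n = 160, 7 bikers) and a 1-tailed test in the lower tail, since the claim is that business majors bike less. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi0, n, bikers = 0.08, 160, 7

p_hat = bikers / n                         # sample proportion
sd_p = sqrt(pi0 * (1 - pi0) / n)           # standard deviation of p under H0
z = (p_hat - pi0) / sd_p                   # negative: the sample proportion is below 0.08
prob_value = NormalDist().cdf(z)           # lower-tail area below z

print(f"p = {p_hat:.5f}, sd = {sd_p:.5f}, z = {z:.2f}, PROB-VALUE = {prob_value:.4f}")
```

The PROB-VALUE is the smallest significance level at which the statement "business majors bike less" would lead to rejecting H0.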

Sample Problem #4. What does the resulting PROB-VALUE indicate about our statement?

Statistical Significance vs. Practical Significance. What are all these tests really telling us? They tell us about the presence of a difference (<, >, =), which can be scientifically uninteresting. Two approaches for managing this situation: test only important hypotheses, or use confidence intervals rather than hypothesis tests.