1 Hypothesis Testing In this section I want to review a few things and then introduce hypothesis testing.

Presentation transcript:

1 Hypothesis Testing In this section I want to review a few things and then introduce hypothesis testing

2 Normal distribution [Figure: normal curve with the mean value marked at the center.] As a start we can think about the normal distribution. Along the horizontal axis we measure the variable we think has a normal distribution. The variable might be age, income or whatever. Note the mean value is in the center of the distribution.

3 Normal distribution [Figure: normal curve with the mean value marked at the center.] The curve above the axis tells us the probability that the variable falls in a given range of values. As an example, the probability of having a value above the mean is 50%: that is the area under the curve to the right of the mean. The z table helps us find the probability of other ranges of values.
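As a rough sketch of what the z table does for us, the areas can also be computed directly. The mean of 30 and standard deviation of 3 below are assumed values chosen only for illustration, and `NormalDist` comes from Python's standard library.

```python
from statistics import NormalDist

# Assumed illustration values: a variable with mean 30 and standard deviation 3.
age = NormalDist(mu=30, sigma=3)

# Area under the curve to the right of the mean.
p_above_mean = 1 - age.cdf(30)

# Area between two values: difference of the cumulative areas.
p_between_27_and_33 = age.cdf(33) - age.cdf(27)

print(p_above_mean)
print(round(p_between_27_and_33, 4))
```

The first number is the 50% mentioned above; the second is the kind of "other range" a z table would be used for.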

4 Example We could imagine that the people in a typical classroom represent a population. The population would be the people who meet in the class on a regular basis. As we think of this population, we might want to know about characteristics of the population such as age, income, or educational attainment. If we looked at the whole population, we would call the mean and standard deviation of a variable (say, age) parameters of the population.

5 example When we look at the people in the class we could find out the population mean by asking everyone to give their age and then we could calculate the mean. But in many statistical studies we do not collect information from everyone. We only take a sample. The sample will have a mean and standard deviation as well. Since a sample does not include everyone in the population, the sample mean (and sample standard deviation) will have a value that depends on which people made it into the sample.

6 example Let's take a sample of 5 people in the class and determine the average age. We have … for an average of …. If we took a different sample of 5 we would have … for an average of …. So in principle we could look at every possible sample of size 5 and calculate the mean for each sample. The means of all the samples of size five could then be looked at as a distribution.
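To make the idea of "every possible sample of size 5" concrete, here is a small sketch. The list of ages is made up for illustration (a hypothetical class of 10 people); it is not the data from the slide.

```python
from itertools import combinations
from statistics import mean

# Hypothetical ages for a class of 10 people (made up for illustration).
ages = [19, 20, 20, 21, 22, 23, 25, 27, 30, 34]
population_mean = mean(ages)

# Every possible sample of size 5, and the mean of each one.
sample_means = [mean(sample) for sample in combinations(ages, 5)]

print(len(sample_means))             # number of possible samples of size 5
print(min(sample_means), max(sample_means))
print(population_mean, mean(sample_means))
```

Different samples give different means, but the average of all the sample means matches the population mean.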

7 sampling distribution When we think about repeated sampling, statistics like the mean from the sample can be thought of as making up a sampling distribution. Due to the central limit theorem, we know a great deal about the sampling distribution of the sample mean. The nice thing about the central limit theorem is that it holds whether or not we know how the variable is distributed in the population.

8 central limit theorem The basic idea of the central limit theorem is that if you consider samples of a given size from a population, the sampling distribution of sample means 1) is approximately normal when the sample size is large (and exactly normal when the population itself is normal), 2) has mean value equal to the mean of the population, and 3) has standard deviation, called in this context the standard error, equal to the standard deviation of the population divided by the square root of the sample size. The standard error is just the standard deviation of the sampling distribution; it is simply given this special name.

9 central limit theorem So we see the variable in the population can have a normal distribution and the sample mean can have a normal distribution. Example: suppose that in the population age ~ N(30, 3), where ~ means "is distributed as" and N(30, 3) means normal with mean 30 and standard deviation 3. Then the means of samples of size, say, 9 have x̄ ~ N(30, 1). (The x with a line over it is called x bar and refers to the sample mean.) How did I get this? Do you get it?
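One way to see where the 1 comes from: the standard error is the population standard deviation divided by the square root of the sample size, so 3 divided by the square root of 9. The sketch below does that arithmetic and then checks it with a simulation. The seed and the number of simulated samples are arbitrary choices of mine.

```python
import random
from math import sqrt
from statistics import mean, stdev

population_mean = 30
population_sd = 3
n = 9  # sample size

# Standard error = population standard deviation / sqrt(sample size).
standard_error = population_sd / sqrt(n)
print(standard_error)

# Simulation check: draw many samples of size 9 and look at their means.
random.seed(0)  # arbitrary seed so the run is repeatable
simulated_means = [
    sum(random.gauss(population_mean, population_sd) for _ in range(n)) / n
    for _ in range(20000)
]
print(round(mean(simulated_means), 2), round(stdev(simulated_means), 2))
```

The simulated sample means should center near 30 with a spread near 1, which is what N(30, 1) says.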

10 68-95-99.7 rule For a normal distribution it is known that 1) approximately 68% of the values are within 1 standard deviation of the mean, 2) approximately 95% of the values are within 2 standard deviations of the mean, and 3) approximately 99.7% of the values are within 3 standard deviations of the mean. So from our example of age before, in the population about 68% of the people are between 27 and 33, but about 68% of the sample means (for samples of size 9) would fall between 29 and 31.
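The three percentages in the rule can be checked against the standard normal distribution. This is only a sketch; the age example (mean 30, standard deviation 3, standard error 1 for samples of size 9) is taken from the earlier slide.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(k, round(share, 4))

# One standard deviation either side of the mean, for people and for sample means.
population_interval = (30 - 3, 30 + 3)   # individual ages
sample_mean_interval = (30 - 1, 30 + 1)  # means of samples of size 9
print(population_interval, sample_mean_interval)
```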

11 rule in a graph [Figure: distributions labeled "population age" and "mean age".]

12 statistical inference Up to this point we have operated as if we knew the population mean. (What we have done will act as a model for what we are about to do.) But most of the time we don’t - that is why we have statistics. We will take a sample and try to infer what the population mean is from the sample we draw. The two methods of inference are 1) confidence intervals and 2) hypothesis tests.

13 confidence interval When we take a sample and calculate the mean of the sample we could use this sample mean as our estimate of the population mean. But remember that the mean of the sample would vary depending on the sample. Instead of just a point estimate of the mean of the population we use an interval or range of values for our estimate of where the population mean might be. To account for sampling variability, we use an interval.

14 confidence interval [Figure: distribution of sample means centered at the true mean, with lines marking the middle 95%.] We just don't know the true mean. The lines in the figure tell us where 95% of the sample means should fall. The distance from the center is 1.96σ/(square root of sample size), where σ is the population standard deviation, which we will assume is known.

15 confidence interval [Figure: distribution of sample means with an interval drawn around the sample mean x̄.] Now when we get the sample mean we use the same distance, 1.96σ/(square root of sample size), around the sample mean. We are then 95% confident that our interval will contain the true unknown mean.
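Here is a minimal sketch of the interval calculation, assuming σ is known. The function name `confidence_interval_95` and the sample mean of 31.2 are my own illustration, not values from the slides; σ = 3 and a sample size of 9 follow the earlier age example.

```python
from math import sqrt

def confidence_interval_95(sample_mean, sigma, n):
    """95% confidence interval for the population mean when sigma is known."""
    margin = 1.96 * sigma / sqrt(n)  # distance on each side of the sample mean
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: 9 people with an average age of 31.2.
low, high = confidence_interval_95(sample_mean=31.2, sigma=3, n=9)
print(round(low, 2), round(high, 2))
```

A different sample would give a different sample mean and so a different interval, but the width stays the same because it depends only on σ and the sample size.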

16 Where did I get the 1.96 on the previous page? Before we said approximately 95% of the sample means are within 2 standard deviations of the mean. To be more precise, 95% of the sample means are within 1.96 standard errors of the mean. If you look at the standard normal table in the book you see associated with Z = 1.96 the value .975. So .025 is in the upper tail and, due to symmetry, .025 is in the lower tail of a normal distribution. So to be precise we use 1.96 in the formulas when we refer to the middle 95%.
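The table lookup can be reproduced in a couple of lines. This is just a check of the numbers quoted above, using the standard library's `NormalDist`.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

# Table lookup: cumulative area to the left of Z = 1.96.
area_below = z.cdf(1.96)
upper_tail = 1 - area_below

# Going the other way: which Z leaves .025 in the upper tail?
critical_value = z.inv_cdf(0.975)

print(round(area_below, 4), round(upper_tail, 4), round(critical_value, 2))
```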

17 Story about hypothesis tests. Not really stats, but an idea to consider. Say I have two decks of cards. One deck is a regular deck: spades, hearts, diamonds and clubs. The other deck is special: 4 sets of hearts. Now, I take out one of the decks, but you do not know which one. In the language of statistics the null hypothesis will be that I took out the regular deck. You will accept the null hypothesis unless an event occurs that has a really low probability if the null hypothesis is true. If such a low probability event occurs you will reject the null hypothesis and go with the alternative hypothesis. So, I take out a deck and deal you five cards: a royal flush in hearts! You would reject the null hypothesis of a regular deck and go with the alternative that the deck I pulled out is the special one, because a royal flush in hearts has a very low probability in a regular deck.
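To put a number on "really low probability", here is a sketch of the count. For the special deck I am assuming 52 cards made of four complete sets of hearts, so there are four copies of each heart; that reading of "4 sets of hearts" is my assumption.

```python
from math import comb

# Number of different 5-card hands from a 52-card deck.
total_hands = comb(52, 5)
print(total_hands)  # 2598960

# Regular deck: exactly one hand is the royal flush in hearts (10, J, Q, K, A).
p_regular = 1 / total_hands

# Special deck (assumed: four copies of each heart): choose one of the
# four copies for each of the five ranks.
p_special = 4**5 / total_hands

print(p_regular, p_special, round(p_special / p_regular))
```

Under the null hypothesis of a regular deck the hand is roughly a 1 in 2.6 million event, which is why seeing it leads us to reject the regular deck.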

18 hypothesis test In a hypothesis test we don't know the population mean, but we have a value in mind (the hypothesized value), say from other research or the like. What we then do is treat the hypothesized value as if it were the true value and see how likely our sample mean would be if it came from a population centered at the hypothesized value. A low probability of occurrence (usually defined as 5%, or .05, or less) would have us reject our hypothesized value as the true mean.

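19 hypothesis test [Figure: distribution of sample means centered at the hypothesized value; the sample mean x̄ is marked and the tail area beyond it is shaded as the p-value.] With the hypothesized value as the center, we look at the probability of getting the observed sample mean or a more extreme value. This probability is the shaded area, the p-value. If the shaded area is .05 or less (for a one tail test) we reject the hypothesized value as the true value.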

20 hypothesis test [Figure: distribution of sample means centered at the hypothesized value, with the sample mean x̄ marked and the tail area shaded.] When this shaded area is .05 or less we are saying that, based on the hypothesized value as the center, the probability of getting a sample mean at least as extreme as the one we obtained is so small that we will reject our hypothesized value and conclude the center value must be something else.
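The shaded-area logic can be sketched as a one tail test with σ known. The function name `upper_tail_p_value`, the hypothesized mean of 30 and the sample mean of 31.8 are illustration values I made up; the test is written for the upper tail only, which is an assumption about the direction of the alternative.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(sample_mean, hypothesized_mean, sigma, n):
    """Area to the right of the sample mean when the sampling distribution
    is centered at the hypothesized mean (sigma assumed known)."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    return 1 - NormalDist().cdf(z)

# Hypothetical numbers: hypothesized mean age 30, sigma 3, sample of 9 with mean 31.8.
p_value = upper_tail_p_value(sample_mean=31.8, hypothesized_mean=30, sigma=3, n=9)
reject = p_value <= 0.05

print(round(p_value, 4), reject)
```

With these made-up numbers the shaded area is below .05, so the hypothesized value of 30 would be rejected; a sample mean closer to 30 would give a larger area and no rejection.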