Announcements Exam 1 key posted on web HW 7 (posted on web) Due Oct. 24 Bonus E Due Oct. 24 Office Hours –this week: –Wed. 8-11, 3-4 –Fri. 8-11 For Bonus.

Presentation transcript:

Announcements
- Exam 1 key posted on web
- HW 7 (posted on web): due Oct. 24
- Bonus E: due Oct. 24
- Office hours this week: Wed. 8-11, 3-4; Fri. 8-11
- For Bonus E, look for sample size, margin of error, sampling method, etc.
- Last time in class, we discussed summary statistics and graphs
- Today we will cover the sampling distribution of the mean

Bonus E: Election Coverage
- Give a statistical critique of election coverage of next week’s debate
- If you can’t watch the debate, you may use a magazine or newspaper (include a copy)
- Clarity: 2 points
- Validity: 2 points
- Brevity: 2 points
- Typed on paper; due Oct. 24

Sample & Population Symbols
- Sample mean = x̄ (x-bar)
- Sample SD = s
- Sample size = n
- Population mean = μ_pop
- Population SD = σ_pop
- Standard letters represent sample values (which are known) and Greek letters represent population values (which are unknown and “Greek to me”).
- The sample values will be used to estimate the population values.

Sampling Distribution of the Mean
- The sampling distribution characterizes all possible sample means and their likelihoods
- The sampling distribution will be used in hypothesis testing and confidence intervals for the population mean

It’s normal!
- For large enough samples (at least 30 observations), the sampling distribution of the mean is approximately normal
- If the population is normal, the sampling distribution is normal for any sample size
- The mean of the normal curve is μ_pop
- The SD of the normal curve is σ_pop/sqrt(n) (n = no. of observations)
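The claims on this slide can be checked with a small simulation. The sketch below is illustrative only: the exponential population, its mean of 10, the sample size of 30, and the 5000 repetitions are all assumptions chosen for the demonstration, not values from the lecture. It draws many samples from a skewed population and compares the spread of the sample means with σ_pop/sqrt(n).

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples, pop_mean=10.0, seed=1):
    """Draw num_samples samples of size n from a skewed (exponential)
    population and return the list of their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0 / pop_mean) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: exponential, so its SD equals its mean.
pop_mean = 10.0
pop_sd = 10.0
n = 30

means = simulate_sample_means(n, num_samples=5000, pop_mean=pop_mean)
observed_center = statistics.fmean(means)   # should sit near pop_mean
observed_sd = statistics.stdev(means)       # should sit near pop_sd / sqrt(n)
predicted_sd = pop_sd / math.sqrt(n)

print("center of sample means:", round(observed_center, 2))
print("SD of sample means:    ", round(observed_sd, 2))
print("sigma_pop / sqrt(n):   ", round(predicted_sd, 2))
```

Even though the population here is far from normal, the sample means cluster around the population mean with a spread close to the predicted value.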

Benefits of Normality
- The horizontal axis corresponds to possible sample values for the mean
- The height of the curve represents how many samples have a sample mean of that value
- The most likely sample means are also those closest to the true value
- The width of the curve narrows with larger samples, showing that larger samples get closer to the truth!

Review so far
- We’re dealing with numerical data
- Wish to summarize with mean and SD
- Want to generalize to population
- Look at all possible samples and their corresponding means
- Resulting distribution of sample means is normal
- Think of the distribution as a histogram of the possible sample means

Where to go from here
- We want to use this fact to generalize our results to the population through hypothesis tests and confidence intervals
- We know that the normal curve is centered at the right place: the population mean
- If we can figure out the width of the normal curve, then we know how close the sample mean should be to the population mean
- We need to relate the resulting normal curve to the standard normal to find areas

Converting it to a “Z”
- Recall that to convert our curve to the standard normal, we use Z = (X - μ)/σ
- For the sampling distribution of the mean, μ = μ_pop and σ = σ_pop/sqrt(n)
- Then we can find the areas and the probabilities of certain samples
- There is a minor obstacle: we don’t know σ_pop
- Let’s estimate σ_pop with s, the sample SD
- This will modify the right side of the Z formula
- The left side must be modified to balance this change
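Before σ_pop is replaced by s, the Z conversion can be sketched for the case where σ_pop is known. All numbers below (a sample mean of 244, μ_pop = 240, σ_pop = 12, n = 36) are hypothetical values chosen for illustration, and `z_for_sample_mean` is a made-up helper name.

```python
import math
from statistics import NormalDist


def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """Standardize a sample mean: Z = (x_bar - mu_pop) / (sigma_pop / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))


# Hypothetical inputs, assuming sigma_pop is known.
z = z_for_sample_mean(sample_mean=244.0, pop_mean=240.0, pop_sd=12.0, n=36)

# Area under the standard normal curve to the right of z:
# the probability of a sample mean at least this large.
upper_tail = 1.0 - NormalDist().cdf(z)

print("z =", z)
print("P(sample mean >= 244) =", round(upper_tail, 4))
```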

How to modify “Z”
- The modification must account for the fact that a sample value was used to replace a true population value
- The sample standard deviation has its own sampling distribution
- With larger samples, the sample standard deviation will be closer to the true population standard deviation
- By accounting for these considerations, the sampling distribution for the modified “Z” was found

The t-distribution
- The correct sampling distribution for the value t = (x̄ - μ)/(s/sqrt(n)) is the t-distribution
- The t-distribution is shaped like the standard normal, but a little shorter with heavier tails
- The heavier tails account for the fact that the sample standard deviation has its own sampling variability
- The t-distribution has a total area of 1

The t-distribution
- Symmetric about zero
- “Spread” determined by the degrees of freedom, df = n-1
- Higher df means that the sample standard deviation is a better approximation of the population standard deviation
- Therefore, higher df makes the t more like the z
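The heavier tails, and the way they shrink as df grows, can be seen numerically. The sketch below is one possible approach, not the lecture’s method: it integrates the Student’s t density with Simpson’s rule (the helper `t_cdf` is written here for illustration) and compares the area beyond 2 with the same area under the standard normal curve. The cutoff 2 and the df values are arbitrary choices.

```python
import math
from statistics import NormalDist


def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t with df degrees of freedom,
    computed by Simpson's rule on the density (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3            # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


# Area to the right of 2 under the standard normal curve ...
normal_tail = 1.0 - NormalDist().cdf(2.0)

# ... and under t curves with increasing degrees of freedom.
t_tails = {df: 1.0 - t_cdf(2.0, df) for df in (2, 10, 30, 100)}

for df, tail in t_tails.items():
    print(f"df = {df:>3}: P(T > 2) = {tail:.4f}")
print(f"normal:   P(Z > 2) = {normal_tail:.4f}")
```

The tail area falls toward the normal value as df increases, which is the sense in which higher df makes the t more like the z.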

History of the t-distribution
- Actually it’s called the “Student’s t-distribution”
- Guinness Brewery in Ireland was trying to use sampling to monitor the quality of its products and hired a mathematician
- This mathematician developed the t-distribution to address the challenge
- Published under the name “Student” to protect company secrets in the early 1900s

Example: Frozen Dinners
- Each dinner is slightly different
- Stated no. of calories per meal is 240
- Wish to test if this is true (H0: μ = 240)
- HA: μ ≠ 240, α = 0.05
- Take a simple random sample of 12 dinners
- Calorie counts: 255, 244, 239, 242, 265, 245, 259, 248, 225, 226, 251
- x̄ = , s = 12.38
- t = (x̄ - 240)/(12.38/sqrt(12)) = 1.21
- df = 11
- p-value =
- 95% CI = (236, 253)

Example: Frozen Dinners
- The t-curve to the right represents all possible sample values of t if the true population is normal with mean 240
- Our sample is fairly reasonable under this null hypothesis
- Fail to reject H0
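The decision for the frozen dinners can be reproduced from the reported summary values alone (t = 1.21, df = 11, s = 12.38, n = 12, α = 0.05). The sketch below computes a two-sided p-value by numerically integrating the t density; `t_cdf` and `two_sided_p` are illustrative helpers, and the p-value it prints is computed by this sketch, not taken from the slides.

```python
import math


def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, by Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def two_sided_p(t, df):
    """Area in both tails beyond |t|, for a two-sided alternative."""
    return 2.0 * (1.0 - t_cdf(abs(t), df))


# Values reported on the slides.
t_reported = 1.21
df = 11
alpha = 0.05
standard_error = 12.38 / math.sqrt(12)   # s / sqrt(n)

p_value = two_sided_p(t_reported, df)
reject = p_value < alpha

print("standard error:", round(standard_error, 3))
print("two-sided p-value:", round(p_value, 3))
print("reject H0?", reject)
```

A p-value well above α matches the slide’s conclusion: fail to reject H0.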

Example: Pennies
- Wish to determine the average age of pennies in circulation
- Test H0: μ_pop <= 8
- Set α = 0.05
- Sample 67 pennies (is it simple random?)
- Sample mean = , sample SD = 8.50
- Sample t = 3.28
- df = 66
- p-value =
- Reject the null
- If the true population mean age was 8, we would observe a sample with this high of a mean (or higher) only of the time

Example: Pennies
- If the true population has mean 8, then all possible sample means and SDs (and thus sample t’s) are described by this t-distribution
- The real sample has t = 3.28, quite rare if the null is true
- Reject the null! The real sample doesn’t match the proposed population.

Example: Pennies
- Well, if 8 is not a reasonable value for the population mean, then what is?
- I start by proposing population values until my observed sample falls in the middle. This results in the confidence interval.
- The 95% CI in this case is (9.32, 13.48)
- Note: The interval does not include 8!
- The test and CI agree: 8 is not a reasonable population value.
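The pennies test and interval can be sketched from the summary values. One input is an assumption: the sample mean did not survive in this transcript, so the sketch takes the midpoint of the reported 95% CI, (9.32 + 13.48)/2 = 11.40, as the sample mean (a t interval is symmetric about x̄). The helpers `t_cdf` and `t_quantile` are illustrative and use numerical integration and bisection rather than a table.

```python
import math


def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, by Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Value q with P(T <= q) = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Reported on the slides.
n, s, mu0 = 67, 8.50, 8.0
ci_reported = (9.32, 13.48)

# ASSUMPTION: sample mean taken as the midpoint of the reported CI.
x_bar = sum(ci_reported) / 2

se = s / math.sqrt(n)
t_stat = (x_bar - mu0) / se
p_right = 1.0 - t_cdf(t_stat, n - 1)          # right-tailed alternative

t_crit = t_quantile(0.975, n - 1)             # 95% confidence
ci = (x_bar - t_crit * se, x_bar + t_crit * se)

print("t =", round(t_stat, 2))
print("right-tail p-value =", round(p_right, 5))
print("95% CI =", tuple(round(v, 2) for v in ci))
```

Under that assumption, the computed t and interval land close to the values on the slides, and the interval excludes 8.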

The Formulas
Hypothesis Testing
- t = (x̄ - μ_H0)/(s/sqrt(n))
- df = n-1
- Use table to find p-value
Confidence Interval
- x̄ ± t_(α/2, n-1) · s/sqrt(n)
StataQuest can perform the calculations
Not required to memorize, but note:
- Increasing n narrows CIs
- Decreasing α widens CIs
- Increasing n reduces the probability of a Type II error
- Increasing α reduces the probability of a Type II error
- Type I error is fully controlled by α = probability of a Type I error
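The first two notes about interval width follow directly from the confidence interval formula. A minimal sketch, assuming a fixed sample SD of 8.50 (borrowed from the pennies example) and arbitrary choices of n and α; `ci_half_width` is an illustrative helper that returns t_(α/2, n-1) · s/sqrt(n).

```python
import math


def t_cdf(x, df, steps=1000):
    """P(T <= x) for Student's t, by Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Value q with P(T <= q) = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ci_half_width(s, n, alpha):
    """Margin of error of the one-sample t interval."""
    return t_quantile(1.0 - alpha / 2, n - 1) * s / math.sqrt(n)


s = 8.50  # illustrative sample SD

# Same alpha, growing n: the interval narrows.
by_n = [ci_half_width(s, n, 0.05) for n in (10, 67, 200)]

# Same n, shrinking alpha: the interval widens.
by_alpha = [ci_half_width(s, 67, a) for a in (0.10, 0.05, 0.01)]

print("half-widths for n = 10, 67, 200:", [round(w, 2) for w in by_n])
print("half-widths for alpha = 0.10, 0.05, 0.01:", [round(w, 2) for w in by_alpha])
```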

Review and Preview
- We’re dealing with numerical data
- Want to make statements about the population mean
- Sampling distribution of the mean is normal
- SD of the resulting normal is unknown; it depends on the population SD
- Estimating the population SD with the sample SD results in the t-distribution
- The t-distribution is used to make statements about a population mean through hypothesis tests and confidence intervals

Review and Preview
- Both examples done today used the one-sample t-test
- The one-sample t-test is used to make statements about a single population mean
- Next time we will discuss the paired t-test
- The paired t-test is used to test the mean change in a population (before-after studies, like the Quaker Oats commercial)

Using StataQuest: One Sample t procedures
- Click Editor. Enter data. Click Close.
- Go to Statistics: Parametric Tests: 1-sample t test.
- Select the variable of interest and set the confidence level
- For the frozen dinners, this gives:
  Variable | Obs Mean Std. Dev
  calories |
  Ho: mean = 240
  t = 1.21 with 11 d.f.
  Pr > |t| =
  95% CI = ( , )
- Since our test is two-tailed and SQ gives two-tailed p-values, this is the p-value for our test as well.

Using StataQuest: One sample t procedures
- Following the steps on the previous page for the pennies we get:
  Variable | Obs Mean Std. Dev
  age |
  Ho: mean = 8
  t = 3.28 with 66 d.f.
  Pr > |t| =
  95% CI = ( , )
- We want a right-tail p-value since our alternative is right-sided
- Since t > 0, we take the two-tailed p-value given by SQ and divide it by two
- If the alternative had been left-sided, we would take half of SQ’s p-value and subtract it from one
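The rule for turning a two-tailed p-value into a one-tailed one can be written as a small function. This is a sketch of the rule as described on the slide, extended symmetrically to the case t < 0; the function name and the example p-value of 0.002 are hypothetical, since the actual StataQuest p-value is not preserved in this transcript.

```python
def one_tailed_p(two_tailed_p, t, alternative):
    """Convert a two-tailed p-value to a one-tailed p-value.

    alternative is "right" (HA: mean > hypothesized value)
    or "left" (HA: mean < hypothesized value).
    """
    half = two_tailed_p / 2
    if alternative == "right":
        # t in the direction of the alternative: keep half; otherwise 1 - half.
        return half if t > 0 else 1 - half
    if alternative == "left":
        return half if t < 0 else 1 - half
    raise ValueError("alternative must be 'right' or 'left'")


# Hypothetical two-tailed p-value, paired with the reported t = 3.28.
p_right = one_tailed_p(0.002, t=3.28, alternative="right")
p_left = one_tailed_p(0.002, t=3.28, alternative="left")

print("right-sided p-value:", p_right)
print("left-sided p-value: ", p_left)
```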

Using StataQuest: One Sample t procedures
- Use this method when you have summary stats: x̄, s, n
- Go to Calculator: 1-sample t test
- Enter the requested info (hypothesized mean is the mean stated in the null hypothesis)
- For the pennies:
  - No. obs = 67
  - Sample mean =
  - Sample SD = 8.50
  - Hyp. mean = 8
  - Conf. level = 95
- This gives:
  Variable | Obs Mean Std. Dev
  x |
  Ho: mean = 8
  t = 3.27 with 66 d.f.
  Pr > |t| =
  95% CI = ( , )