Week111 The t distribution Suppose that a SRS of size n is drawn from a N(μ, σ) population. Then the one sample t statistic has a t distribution with n.

Presentation transcript:

The t distribution
Suppose that an SRS of size n is drawn from an N(μ, σ) population. Then the one-sample t statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

has a t distribution with n − 1 degrees of freedom. The t distribution has mean 0 and is symmetric. There is a different t distribution for each sample size. A particular t distribution is specified by its degrees of freedom, which come from the sample standard deviation s.

Tests for the population mean μ when σ is unknown
Suppose that an SRS of size n is drawn from a population having unknown mean μ and unknown standard deviation σ. To test the hypothesis H0: μ = μ0, we first estimate σ by s, the sample standard deviation, then compute the one-sample t statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

In terms of a random variable T having the t(n − 1) distribution, the P-value for the test of H0 against
Ha: μ > μ0 is P(T ≥ t)
Ha: μ < μ0 is P(T ≤ t)
Ha: μ ≠ μ0 is 2·P(T ≥ |t|)

Example
In a metropolitan area, the concentration of cadmium (Cd) in leaf lettuce was measured in 6 representative gardens where sewage sludge was used as fertilizer. The measurements (in mg/kg of dry weight) were summarized by their descriptive statistics (N, mean, median, trimmed mean, standard deviation, SE of the mean). Is there strong evidence that the mean concentration of Cd is higher than 12?
The hypotheses to be tested are H0: μ = 12 vs Ha: μ > 12.

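The test statistic is

$$t = \frac{\bar{x} - 12}{s/\sqrt{6}} = 1.38$$

The degrees of freedom are df = 6 − 1 = 5. Since t = 1.38 < 2.015, the upper 5% critical value of the t(5) distribution, we cannot reject H0 at the 5% level, so there is no strong evidence that the mean Cd concentration is higher than 12. The P-value satisfies 0.10 < P(T(5) ≥ 1.38) < 0.15, which is greater than 0.05, indicating a non-significant result.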
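The table bounds on the P-value can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `t_pdf` and `t_upper_tail` are illustrative, and the tail area is approximated by Simpson's rule rather than taken from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000):
    """Approximate P(T >= t) for t >= 0.

    Integrates the density over [0, t] with Simpson's rule (steps must be even)
    and uses the symmetry of the t distribution about 0.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values from the cadmium example: t = 1.38 with df = 5, one-sided alternative.
p_value = t_upper_tail(1.38, 5)
print(f"P(T(5) >= 1.38) is approximately {p_value:.3f}")
```

The printed value should fall between 0.10 and 0.15, in agreement with the bounds read from the t table.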

CIs for the population mean μ when σ is unknown
Suppose that an SRS of size n is drawn from a population having unknown mean μ. A level C CI for μ when σ is unknown is an interval of the form

$$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$$

where t* is the value for the t(n − 1) density curve with area C between −t* and t*.
Example: Give a 95% CI for the mean Cd concentration.

MINITAB commands: Stat > Basic Statistics > 1-Sample t
MINITAB output for the above problem:

T-Test of the Mean
Test of mu = 12 vs mu > 12
Variable   T
Cd         1.38

T Confidence Intervals
Variable   95.0 % CI
Cd         (6.79, 29.21)
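As a consistency check between the reported interval and the earlier test, the sample mean and standard error can be recovered from the interval itself. This is a sketch under stated assumptions: t* = 2.571 is the usual table value for 95% confidence with 5 degrees of freedom, and the recovered mean and standard error are back-computed here, not read from the MINITAB output.

```python
# Reported 95% interval for the mean Cd concentration.
ci_low, ci_high = 6.79, 29.21

# Assumption: two-sided 95% critical value of t(5) from a standard t table.
t_star = 2.571
mu_0 = 12.0  # hypothesized mean from H0

x_bar = (ci_low + ci_high) / 2        # the interval is centered at the sample mean
margin = (ci_high - ci_low) / 2       # margin of error = t* . s / sqrt(n)
se_mean = margin / t_star             # implied standard error of the mean
t_stat = (x_bar - mu_0) / se_mean     # one-sample t statistic

print(f"implied mean = {x_bar:.2f}, implied SE = {se_mean:.2f}, t = {t_stat:.2f}")
```

The recovered t statistic rounds to 1.38, matching the value used in the test of H0: μ = 12.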

Question 3, Final exam Dec 2000
In order to test H0: μ = 60 vs Ha: μ ≠ 60, a random sample of 9 observations (normally distributed) is obtained, yielding and s = 5. What is the p-value of the test for this sample?
a) greater than
b) between 0.05 and
c) between and
d) between 0.01 and
e) less than 0.01.

Question
A manufacturing company claims that its new floodlight will last 1000 hours. After collecting a simple random sample of size ten, you determine that a 95% confidence interval for the true mean number of hours that the floodlights will last, μ, is (970, 995). Which of the following are true? (Assume all tests are two-sided.)
I) At any α < .05, we can reject the null hypothesis that the true mean is
II) If a 99% confidence interval for the mean were determined here, the numerical value 972 would certainly lie in this interval.
III) If we wished to test the null hypothesis H0: μ = 988, we could say that the p-value must be < 0.05.

Questions
1. Alpha (the level of significance α) is
a) the probability of rejecting H0 when H0 is true.
b) the probability of supporting H0 when H0 is false.
c) supporting H0 when H0 is true.
d) rejecting H0 when H0 is false.
2. Confidence intervals can be used to do hypothesis tests for
a) left-tailed tests.
b) right-tailed tests.
c) two-tailed tests.
3. The Type II error is supporting a null hypothesis that is false. T/F

Robustness of the t procedures
Robust procedures: a statistical inference procedure is called robust if the probability calculations required are insensitive to violations of the assumptions made.
The t procedures are quite robust against non-normality of the population, except in the case of outliers or strong skewness.

Simulation study
Let's generate 100 samples of size 10 from a moderately skewed distribution (chi-square distribution with 5 df) and calculate the 95% t-intervals to see how many of them contain the true mean μ = 5. First let's have a look at the histogram of the 1000 values generated from this distribution.
[Histogram and descriptive statistics (N, Mean, Median, TrMean, StDev) of the 1000 generated values]

T Confidence Intervals
95.0 % CI (* marks an interval that does not contain μ = 5)
( 2.43, 7.99)
...
( 3.309, 5.589)
( 2.31, 8.36)
( 1.612, 4.921)*
( 2.844, 7.118)
( 2.638, 4.812)*
( 2.819, 6.155)
...
( 3.324, 5.977)
( 1.425, 4.520)*
( 3.072, 6.297)
( 3.459, 7.728)
( 1.982, 4.955)*
( 2.84, 8.34)
...
( 3.462, 7.916)
( 2.479, 4.970)*
( 2.843, 5.930)
...
( 4.55, 9.47)
( 1.661, 4.902)*
( 2.49, 7.06)
...
( 3.49, 9.56)
( 2.042, 5.186)
The number of intervals not capturing the true mean (μ = 5) is 6/100.

Example
100 samples of size 15 were drawn from a very skewed distribution (chi-square distribution with 1 df).
[Descriptive statistics (N, Mean, Median, TrMean, StDev) of the generated values]
The 95% CIs (t-intervals) for these 100 samples are given below.

T Confidence Intervals
95.0 % CI (* marks an interval that does not contain μ = 1)
( 0.253, 1.293)
( 0.268, 1.919)
( 0.146, 0.960)*
( , 0.792)*
( 0.051, 2.427)
...
( 0.148, 0.834)*
( , 1.184)
( 0.184, 0.915)*
( 0.208, 1.060)
( 0.216, 0.800)*
...
( 0.406, 1.837)
( 0.151, 0.887)*
( 0.543, 2.789)
...
( , 2.480)
( 0.353, 0.935)*
( 0.466, 1.709)

T Confidence Intervals (continuation)
...
( 0.379, 1.411)
( , 0.816)*
( 0.488, 1.587)
( 0.173, 1.732)
( , )*
( 0.130, 2.345)
...
( 0.442, 1.400)
( 0.018, 1.609)
The number of intervals not capturing the true mean (μ = 1) is 9/100.
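The two simulation studies above can be reproduced in outline with a short script. This is a sketch, not the original MINITAB session: the function name `coverage`, the seed, and the hard-coded critical values (standard t-table values for 9 and 14 degrees of freedom) are assumptions, and the counts will differ from run to run and from the 6/100 and 9/100 reported on the slides.

```python
import math
import random
import statistics

# Assumption: two-sided 95% critical values t* taken from a standard t table.
T_STAR_95 = {9: 2.262, 14: 2.145}

def coverage(df_chi2, n, n_samples, rng):
    """Fraction of 95% t-intervals that contain the true mean
    when sampling from a chi-square distribution with df_chi2 df."""
    true_mean = df_chi2  # the mean of a chi-square distribution equals its df
    t_star = T_STAR_95[n - 1]
    hits = 0
    for _ in range(n_samples):
        # chi-square(k) is a gamma distribution with shape k/2 and scale 2
        sample = [rng.gammavariate(df_chi2 / 2, 2.0) for _ in range(n)]
        x_bar = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if x_bar - t_star * se <= true_mean <= x_bar + t_star * se:
            hits += 1
    return hits / n_samples

rng = random.Random(11)  # arbitrary seed, for repeatability only
print("chi-square(5), n = 10:", coverage(5, 10, 100, rng))
print("chi-square(1), n = 15:", coverage(1, 15, 100, rng))
```

Raising `n_samples` well above 100 gives a steadier estimate of the actual coverage, which makes the effect of strong skewness on the nominal 95% level easier to see.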

Matched pairs t-test
In a matched pairs study, subjects are matched in pairs and the outcomes are compared within each matched pair. The experimenter can toss a coin to assign the two treatments to the two subjects in each pair. Matched pairs are also common when randomization is not possible. One situation calling for matched pairs is when observations are taken on the same subjects under different conditions.
A matched pairs analysis is needed when there are two measurements or observations on each individual and we want to examine the difference. For each individual (pair), we find the difference d between the measurements from that pair. Then we treat the d_i as one sample and use the one-sample t statistic to test for no difference between the treatment effects.
Example: similar to exercise 7.41 on page 446 in IPS.

Data Display
[Table with columns: Row, Student, Pretest, Posttest, improvement]

One-sample t-test for the improvement

T-Test of the Mean
Test of mu = 0 vs mu > 0 (variable: improvem)

MINITAB commands for the paired t-test: Stat > Basic Statistics > Paired t

Paired T-Test and Confidence Interval
Paired T for Posttest – Pretest
95% CI for mean difference: (-0.049, 2.949)
T-Test of mean difference = 0 (vs > 0): T-Value = 2.02, P-Value = 0.029
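The reduction of a paired comparison to a one-sample t statistic on the differences can be sketched as follows. The scores below are hypothetical values invented for illustration (the transcript does not preserve the students' data), and `paired_t` is an illustrative helper, not a MINITAB function.

```python
import math
import statistics

def paired_t(before, after, mu0=0.0):
    """One-sample t statistic computed on the within-pair differences after - before.

    Returns the t statistic and its degrees of freedom (number of pairs - 1).
    """
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    t = (d_bar - mu0) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical pretest and posttest scores for six students (not the slide's data).
pretest = [30, 28, 31, 26, 20, 30]
posttest = [29, 30, 32, 30, 22, 31]

t_stat, df = paired_t(pretest, posttest)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

Working on the differences rather than on the two columns separately removes the student-to-student variation, which is the reason the paired analysis is preferred when the same subjects are measured twice.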

[Character stem-and-leaf display of improvement, N = 20]

Two-sample problems
The goal of inference is to compare the responses in two groups. Each group is considered to be a sample from a distinct population. The responses in each group are independent of those in the other group. A two-sample problem can arise from a randomized comparative experiment or from comparing random samples separately selected from two populations.
Example: A medical researcher is interested in the effect of added calcium in our diet on blood pressure. She conducted a randomized comparative experiment in which one group of subjects receives a calcium supplement and a control group gets a placebo.

Comparing two means (with two independent samples)
Here we will look at the problem of comparing two population means when the population variances are known or the sample sizes are large. Suppose that an SRS of size n1 is drawn from an N(μ1, σ1) population and that an independent SRS of size n2 is drawn from an N(μ2, σ2) population. Then the two-sample z statistic for testing the null hypothesis H0: μ1 = μ2 is given by

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

and has the standard normal N(0, 1) sampling distribution. Using the standard normal tables, the P-value for the test of H0 against
Ha: μ1 > μ2 is P(Z ≥ z)
Ha: μ1 < μ2 is P(Z ≤ z)
Ha: μ1 ≠ μ2 is 2·P(Z ≥ |z|)

Example
A regional IRS auditor runs a test on a sample of returns filed by March 15 to determine whether the average return this year is larger than last year. The sample data are shown here for a random sample of returns from each year. Assume that the standard deviation of returns is known to be about 100 for both years. Test whether the average return is larger this year than last year.

              Last Year   This Year
Mean
Sample size   100         120

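Solution
The hypotheses to be tested are H0: μ1 = μ2 vs Ha: μ1 < μ2, where population 1 is last year and population 2 is this year. The test statistic is

$$z = \frac{\bar{x}_1 - \bar{x}_2}{100\sqrt{\dfrac{1}{100} + \dfrac{1}{120}}} = -2.22$$

The P-value = P(Z < −2.22) = 0.0132 < 0.05, so we can reject H0 and conclude, at the 5% significance level, that the average return is larger this year than last year. A 95% CI for the difference μ1 − μ2 is given by

$$(\bar{x}_1 - \bar{x}_2) \pm 1.96 \cdot 100\sqrt{\dfrac{1}{100} + \dfrac{1}{120}}$$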
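The arithmetic behind this solution can be sketched with the standard library's `statistics.NormalDist`. Only σ = 100, the two sample sizes, and z = −2.22 are taken from the example; the variable names are illustrative, and the implied difference in sample means is back-computed from the rounded z value because the transcript does not preserve the two means.

```python
import math
from statistics import NormalDist

sigma = 100.0                 # common standard deviation assumed known in the example
n_last, n_this = 100, 120     # sample sizes for last year and this year
z = -2.22                     # reported two-sample z statistic

# Standard error of the difference in sample means.
se_diff = sigma * math.sqrt(1 / n_last + 1 / n_this)

# One-sided P-value for Ha: mu1 < mu2, i.e. P(Z <= z).
p_value = NormalDist().cdf(z)

# Difference in sample means implied by the rounded z (approximate).
implied_diff = z * se_diff

# Margin of error for a 95% CI for mu1 - mu2.
margin_95 = NormalDist().inv_cdf(0.975) * se_diff

print(f"SE = {se_diff:.2f}, P-value = {p_value:.4f}")
print(f"implied difference = {implied_diff:.1f}, 95% margin = {margin_95:.1f}")
```

Because the P-value is below 0.05, the script reaches the same decision as the slide: reject H0 at the 5% level.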

Comparing two population means (unknown std. deviations)
Suppose that an SRS of size n1 is drawn from a normal population with unknown mean μ1 and that an independent SRS of size n2 is drawn from another normal population with unknown mean μ2. To test the null hypothesis H0: μ1 = μ2, we compute the two-sample t statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

This statistic has a t distribution with df approximately equal to the smaller of n1 − 1 and n2 − 1. We can use this distribution to compute the P-value.

Example
The weight gains for n1 = n2 = 8 rats tested on diets 1 and 2 are summarized here. Test whether diet 2 has the greater mean weight gain. Use the 5% significance level.

           Diet 1   Diet 2
n          8        8
Std dev
Mean       3.1      3.2

The hypotheses to be tested are H0: μ1 = μ2 vs Ha: μ1 < μ2. The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{8} + \dfrac{s_2^2}{8}}} = -3.65$$

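The P-value is P(T(7) ≤ −3.65) = P(T(7) ≥ 3.65). From Table D the P-value is less than 0.01, so we reject H0 and conclude that the mean weight gain from diet 2 is significantly greater than that from diet 1 (at both the 5% and 1% significance levels). A level C CI for the difference between the two means is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t^{*}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$

where t* comes from the t distribution with df equal to the smaller of n1 − 1 and n2 − 1. For this example the 95% CI for μ2 − μ1 is (0.0353, 0.165).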
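The conservative two-sample t procedure can be sketched as a small function. The means 3.1 and 3.2 and the sample sizes come from the example, but the two standard deviations below are hypothetical placeholders (the transcript does not preserve them), so the result will not reproduce t = −3.65 exactly. The function name `two_sample_t` and the hard-coded critical value are likewise assumptions made for illustration.

```python
import math

def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic with the conservative degrees of freedom,
    the smaller of n1 - 1 and n2 - 1."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (mean1 - mean2) / se
    df = min(n1 - 1, n2 - 1)
    return t, df

# Means and sample sizes from the diet example; the standard deviations
# 0.05 and 0.06 are HYPOTHETICAL stand-ins for the values lost in the transcript.
t_stat, df = two_sample_t(3.1, 0.05, 8, 3.2, 0.06, 8)

# Assumption: one-sided 5% critical value of t(7) from a standard t table.
t_crit_05 = 1.895
reject_h0 = t_stat < -t_crit_05  # lower-tailed test of Ha: mu1 < mu2

print(f"t = {t_stat:.2f}, df = {df}, reject H0 at 5%: {reject_h0}")
```

Using the smaller of n1 − 1 and n2 − 1 keeps the procedure conservative: the true P-value is no larger than the one obtained from this t distribution.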