Published by Merryl Allen. Modified over 9 years ago.
1
week11: The t distribution
Suppose that a SRS of size n is drawn from a N(μ, σ) population. Then the one-sample t statistic
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
has a t distribution with n - 1 degrees of freedom. The t distribution is symmetric with mean 0. There is a different t distribution for each sample size. A particular t distribution is specified by its degrees of freedom, which come from the sample standard deviation s.
2
Tests for the population mean when σ is unknown
Suppose that a SRS of size n is drawn from a population having unknown mean μ and unknown standard deviation σ. To test the hypothesis H0: μ = μ0, we first estimate σ by s, the sample standard deviation, then compute the one-sample t statistic given by
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
In terms of a random variable T having the t(n - 1) distribution, the P-value for the test of H0 against
Ha: μ > μ0 is P(T ≥ t)
Ha: μ < μ0 is P(T ≤ t)
Ha: μ ≠ μ0 is 2·P(T ≥ |t|)
3
Example
In a metropolitan area, the concentration of cadmium (Cd) in leaf lettuce was measured in 6 representative gardens where sewage sludge was used as fertilizer. The following measurements (in mg/kg of dry weight) were obtained.
Cd: 21 38 12 15 14 8
Is there strong evidence that the mean concentration of Cd is higher than 12?
Descriptive Statistics
Variable  N   Mean  Median  TrMean  StDev  SE Mean
Cd        6  18.00   14.50   18.00  10.68     4.36
The hypotheses to be tested are: H0: μ = 12 vs Ha: μ > 12.
4
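The test statistic is:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{18 - 12}{10.68/\sqrt{6}} = 1.38$
The degrees of freedom are df = 6 - 1 = 5. Since t = 1.38 < 2.015 (the upper 5% critical value of the t(5) distribution), we cannot reject H0 at the 5% level, and so there is no strong evidence that the mean Cd concentration exceeds 12. The P-value satisfies 0.10 < P(T(5) ≥ 1.38) < 0.15, so it is greater than 0.05, indicating a non-significant result.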
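The test can be reproduced with a short Python sketch that uses only the standard library. The helper names `t_pdf` and `t_upper_tail` are illustrative and do not come from the slides; the tail probability is approximated numerically with Simpson's rule instead of being read from a t table.

```python
import math
import statistics

cd = [21, 38, 12, 15, 14, 8]   # Cd measurements (mg/kg dry weight) from the example
mu0 = 12                        # hypothesized mean under H0

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3      # symmetry: P(T >= 0) = 0.5
    return upper if t >= 0 else 1 - upper

n = len(cd)
xbar = statistics.mean(cd)
s = statistics.stdev(cd)             # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)                # standard error of the mean
t_stat = (xbar - mu0) / se
p_value = t_upper_tail(t_stat, n - 1)   # one-sided, because Ha is mu > mu0

print(f"mean={xbar:.2f} s={s:.2f} SE={se:.2f} t={t_stat:.2f} P={p_value:.2f}")
```

The printed summary should agree with the descriptive statistics above up to rounding, and the P-value should fall inside the bracket 0.10 to 0.15 read from the table.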
5
CIs for the population mean when σ is unknown
Suppose that a SRS of size n is drawn from a population having unknown mean μ. A level C CI for μ when σ is unknown is an interval of the form
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
where t* is the value for the t(n - 1) density curve with area C between -t* and t*.
Example: Give a 95% CI for the mean Cd concentration.
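As a rough sketch of the interval for the Cd data, the following Python snippet applies the formula directly. The critical value 2.571 is an assumption taken from a standard t table (upper 2.5% point for 5 degrees of freedom), not computed here.

```python
import math
import statistics

cd = [21, 38, 12, 15, 14, 8]   # Cd measurements from the example
t_star = 2.571                  # assumed table value: upper 2.5% point of t(5)

n = len(cd)
xbar = statistics.mean(cd)
se = statistics.stdev(cd) / math.sqrt(n)   # standard error s / sqrt(n)
margin = t_star * se                       # margin of error t* x SE
ci = (xbar - margin, xbar + margin)

print(f"95% CI for the mean Cd concentration: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because 12 lies inside this interval, the two-sided test of H0: μ = 12 at the 5% level would not reject either.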
6
MINITAB commands: Stat > Basic Statistics > 1-Sample t
MINITAB outputs for the above problem:
T-Test of the Mean
Test of mu = 12.00 vs mu > 12.00
Variable  N   Mean  StDev  SE Mean     T     P
Cd        6  18.00  10.68     4.36  1.38  0.11
T Confidence Intervals
Variable  N   Mean  StDev  SE Mean      95.0 % CI
Cd        6  18.00  10.68     4.36  (6.79, 29.21)
7
Question 3, Final exam Dec 2000
In order to test H0: μ = 60 vs Ha: μ ≠ 60, a random sample of 9 observations (normally distributed) is obtained, yielding $\bar{x}$ = … and s = 5. What is the P-value of the test for this sample?
a) greater than 0.10.
b) between 0.05 and 0.10.
c) between 0.025 and 0.05.
d) between 0.01 and 0.025.
e) less than 0.01.
8
Question
A manufacturing company claims that its new floodlight will last 1000 hours. After collecting a simple random sample of size ten, you determine that a 95% confidence interval for the true mean number of hours that the floodlights will last, μ, is (970, 995). Which of the following are true? (Assume all tests are two-sided.)
I) At any α < 0.05, we can reject the null hypothesis that the true mean is 1000.
II) If a 99% confidence interval for the mean were determined here, the numerical value 972 would certainly lie in this interval.
III) If we wished to test the null hypothesis H0: μ = 988, we could say that the P-value must be < 0.05.
9
Questions
1. Alpha (the level of significance, α) is
a) the probability of rejecting H0 when H0 is true.
b) the probability of supporting H0 when H0 is false.
c) supporting H0 when H0 is true.
d) rejecting H0 when H0 is false.
2. Confidence intervals can be used to do hypothesis tests for
a) left-tailed tests.
b) right-tailed tests.
c) two-tailed tests.
3. The Type II error is supporting a null hypothesis that is false. T/F
10
Robustness of the t procedures
Robust procedures
A statistical inference procedure is called robust if the probability calculations required are insensitive to violations of the assumptions made. The t procedures are quite robust against non-normality of the population, except in the case of outliers or strong skewness.
11
Simulation study
Let’s generate 100 samples of size 10 from a moderately skewed distribution (chi-square distribution with 5 df) and calculate the 95% t-intervals to see how many of them contain the true mean μ = 5. First let’s have a look at the histogram of the 1000 values generated from this distribution.
Variable     N    Mean  Median  TrMean   StDev
C1        1000  4.9758  4.2788  4.7329  3.1618
12
T Confidence Intervals
Variable   N   Mean  StDev  SE Mean        95.0 % CI
C1        10   5.21   3.89     1.23  ( 2.43,  7.99)
...
C4        10  4.449  1.593    0.504  ( 3.309, 5.589)
C5        10   5.33   4.23     1.34  ( 2.31,  8.36)
C6        10  3.267  2.312    0.731  ( 1.612, 4.921)*
C7        10  4.981  2.988    0.945  ( 2.844, 7.118)
C8        10  3.725  1.520    0.481  ( 2.638, 4.812)*
C9        10  4.487  2.332    0.738  ( 2.819, 6.155)
...
C14       10  4.650  1.854    0.586  ( 3.324, 5.977)
C15       10  2.973  2.163    0.684  ( 1.425, 4.520)*
C16       10  4.685  2.254    0.713  ( 3.072, 6.297)
C26       10  5.594  2.984    0.944  ( 3.459, 7.728)
C27       10  3.468  2.078    0.657  ( 1.982, 4.955)*
C28       10   5.59   3.84     1.22  ( 2.84,  8.34)
...
C62       10  5.689  3.113    0.984  ( 3.462, 7.916)
C63       10  3.724  1.741    0.551  ( 2.479, 4.970)*
C64       10  4.387  2.157    0.682  ( 2.843, 5.930)
...
C87       10   7.01   3.44     1.09  ( 4.55,  9.47)
C88       10  3.281  2.265    0.716  ( 1.661, 4.902)*
C89       10   4.78   3.20     1.01  ( 2.49,  7.06)
...
C99       10   6.52   4.24     1.34  ( 3.49,  9.56)
C100      10  3.614  2.198    0.695  ( 2.042, 5.186)
Intervals marked * do not contain μ = 5. The number of intervals not capturing the true mean (μ = 5) is 6/100.
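A simulation of this kind can be sketched in Python with the standard library alone. The function name `t_interval_coverage`, the seed, and the number of repetitions (2000 here instead of the 100 used on the slides) are illustrative choices. The critical value 2.262 is an assumed table value for 9 degrees of freedom, and the chi-square samples are drawn through the gamma distribution, since a chi-square with k degrees of freedom is a gamma with shape k/2 and scale 2.

```python
import math
import random
import statistics

def t_interval_coverage(df_chi2, n, t_star, reps=2000, seed=1):
    """Fraction of 95% t-intervals that capture the mean of a chi-square(df_chi2)."""
    rng = random.Random(seed)
    true_mean = df_chi2               # the mean of a chi-square equals its df
    hits = 0
    for _ in range(reps):
        # chi-square(k) is Gamma(shape = k/2, scale = 2)
        sample = [rng.gammavariate(df_chi2 / 2, 2) for _ in range(n)]
        xbar = statistics.mean(sample)
        margin = t_star * statistics.stdev(sample) / math.sqrt(n)
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / reps

# Samples of size 10 from chi-square(5); t* = 2.262 is the table value for 9 df.
coverage = t_interval_coverage(df_chi2=5, n=10, t_star=2.262)
print(f"estimated coverage of the 95% t-interval: {coverage:.3f}")
```

Calling the same function with `df_chi2=1, n=15, t_star=2.145` (the table value for 14 df) gives a comparable estimate for a much more strongly skewed population.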
13
Example
100 samples of size 15 were drawn from a very skewed distribution (chi-square distribution with 1 df).
Variable     N    Mean  Median  TrMean   StDev
C1        1500  0.9947  0.4766  0.8059  1.3647
The 95% CIs (t-intervals) for these 100 samples are given below.
14
T Confidence Intervals
Variable   N   Mean  StDev  SE Mean         95.0 % CI
C1        15  0.773  0.939    0.242  (  0.253, 1.293)
C2        15  1.093  1.491    0.385  (  0.268, 1.919)
C3        15  0.553  0.735    0.190  (  0.146, 0.960)*
C4        15  0.387  0.732    0.189  ( -0.019, 0.792)*
C5        15  1.239  2.146    0.554  (  0.051, 2.427)
...
C23       15  0.491  0.619    0.160  (  0.148, 0.834)*
C24       15  0.582  1.088    0.281  ( -0.020, 1.184)
C25       15  0.550  0.660    0.170  (  0.184, 0.915)*
C26       15  0.634  0.769    0.199  (  0.208, 1.060)
C27       15  0.508  0.528    0.136  (  0.216, 0.800)*
...
C51       15  1.122  1.292    0.334  (  0.406, 1.837)
C52       15  0.519  0.664    0.171  (  0.151, 0.887)*
C53       15  1.666  2.028    0.524  (  0.543, 2.789)
...
C59       15  1.208  2.297    0.593  ( -0.065, 2.480)
C60       15  0.644  0.525    0.136  (  0.353, 0.935)*
C61       15  1.088  1.122    0.290  (  0.466, 1.709)
15
T Confidence Intervals (continuation)
...
C79       15   0.895   0.931    0.240  (  0.379,  1.411)
C80       15   0.391   0.767    0.198  ( -0.034,  0.816)*
C81       15   1.038   0.992    0.256  (  0.488,  1.587)
C82       15   0.952   1.407    0.363  (  0.173,  1.732)
C83       15  0.2763  0.2999   0.0774  (  0.1102, 0.4424)*
C84       15   1.237   1.999    0.516  (  0.130,  2.345)
...
C99       15   0.921   0.865    0.223  (  0.442,  1.400)
C100      15   0.813   1.437    0.371  (  0.018,  1.609)
Intervals marked * do not contain μ = 1. The number of intervals not capturing the true mean (μ = 1) is 9/100.
16
Matched pairs t-test
In a matched pairs study, subjects are matched in pairs and the outcomes are compared within each matched pair. The experimenter can toss a coin to assign the two treatments to the two subjects in each pair. Matched pairs are also common when randomization is not possible. One situation calling for matched pairs is when observations are taken on the same subjects under different conditions.
A matched pairs analysis is needed when there are two measurements or observations on each individual and we want to examine the difference. For each individual (pair), we find the difference d between the measurements from that pair. Then we treat the d_i as one sample and use the one-sample t statistic to test for no difference between the treatment effects.
Example: similar to exercise 7.41 on page 446 in IPS.
17
Data Display
Row  Student  Pretest  Posttest  improvement
  1        1       30        29           -1
  2        2       28        30            2
  3        3       31        32            1
  4        4       26        30            4
  5        5       20        16           -4
  6        6       30        25           -5
  7        7       34        31           -3
  8        8       15        18            3
  9        9       28        33            5
 10       10       20        25            5
 11       11       30        32            2
 12       12       29        28           -1
 13       13       31        34            3
 14       14       29        32            3
 15       15       34        32           -2
 16       16       20        27            7
 17       17       26        28            2
 18       18       25        29            4
 19       19       31        32            1
 20       20       29        32            3
18
One-sample t-test for the improvement
T-Test of the Mean
Test of mu = 0.000 vs mu > 0.000
Variable   N   Mean  StDev  SE Mean     T      P
improvem  20  1.450  3.203    0.716  2.02  0.029
MINITAB commands for the paired t-test: Stat > Basic Statistics > Paired t
Paired T-Test and Confidence Interval
Paired T for Posttest - Pretest
             N   Mean  StDev  SE Mean
Posttest    20  28.75   4.74     1.06
Pretest     20  27.30   5.04     1.13
Difference  20  1.450  3.203    0.716
95% CI for mean difference: (-0.049, 2.949)
T-Test of mean difference = 0 (vs > 0): T-Value = 2.02  P-Value = 0.029
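The paired analysis reduces to a one-sample calculation on the differences, which can be sketched in Python as follows. The scores are copied from the data display; the critical value 2.093 is an assumed table value (upper 2.5% point of the t distribution with 19 df) and is used only for the confidence interval.

```python
import math
import statistics

pretest = [30, 28, 31, 26, 20, 30, 34, 15, 28, 20,
           30, 29, 31, 29, 34, 20, 26, 25, 31, 29]
posttest = [29, 30, 32, 30, 16, 25, 31, 18, 33, 25,
            32, 28, 34, 32, 32, 27, 28, 29, 32, 32]

# Within-pair differences (Posttest - Pretest), treated as a single sample
d = [post - pre for pre, post in zip(pretest, posttest)]

n = len(d)
dbar = statistics.mean(d)
sd = statistics.stdev(d)        # sample standard deviation of the differences
se = sd / math.sqrt(n)
t_stat = dbar / se              # tests H0: mean difference = 0

t_star = 2.093                  # assumed table value for t(19), 95% confidence
ci = (dbar - t_star * se, dbar + t_star * se)

print(f"mean diff={dbar:.3f} sd={sd:.3f} SE={se:.3f} t={t_stat:.2f}")
print(f"95% CI for the mean difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Note that the two-sided 95% interval contains 0 even though the one-sided test is significant at the 5% level; the two results answer different questions and do not contradict each other.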
19
Character Stem-and-Leaf Display
Stem-and-leaf of improvement  N = 20
Leaf Unit = 1.0
  2   -0  54
  4   -0  32
  6   -0  11
  8    0  11
 (7)   0  2223333
  5    0  4455
  1    0  7
20
Two-sample problems
The goal of inference is to compare the responses in two groups. Each group is considered to be a sample from a distinct population. The responses in each group are independent of those in the other group.
A two-sample problem can arise from a randomized comparative experiment or from comparing random samples separately selected from two populations.
Example: A medical researcher is interested in the effect of added calcium in our diet on blood pressure. She conducted a randomized comparative experiment in which one group of subjects receives a calcium supplement and a control group gets a placebo.
21
Comparing two means (with two independent samples)
Here we will look at the problem of comparing two population means when the population variances are known or the sample sizes are large. Suppose that a SRS of size n1 is drawn from an N(μ1, σ1) population and that an independent SRS of size n2 is drawn from an N(μ2, σ2) population. Then the two-sample z statistic for testing the null hypothesis H0: μ1 = μ2 is given by
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
and has the standard normal N(0, 1) sampling distribution. Using the standard normal tables, the P-value for the test of H0 against
Ha: μ1 > μ2 is P(Z ≥ z)
Ha: μ1 < μ2 is P(Z ≤ z)
Ha: μ1 ≠ μ2 is 2·P(Z ≥ |z|)
22
Example
A regional IRS auditor runs a test on a sample of returns filed by March 15 to determine whether the average return this year is larger than last year. The sample data are shown here for a random sample of returns from each year. Assume that the standard deviation of returns is known to be about 100 for both years. Test whether the average return is larger this year than last year.
             Last Year  This Year
Mean               380        410
Sample size        100        120
23
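Solution
The hypotheses to be tested are: H0: μ1 = μ2 vs Ha: μ1 < μ2, where μ1 is last year's mean and μ2 is this year's mean. The test statistic is:
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} = \frac{380 - 410}{\sqrt{100^2/100 + 100^2/120}} = -2.22$
The P-value = P(Z < -2.22) = 0.0132 < 0.05; therefore we can reject H0 and conclude, at the 5% significance level, that the average return is larger this year than last year. A 95% CI for the difference μ1 - μ2 is given by:
$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} = -30 \pm 1.96(13.54) = (-56.54, -3.46)$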
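The numbers in this solution can be checked with a small Python sketch built on `statistics.NormalDist` from the standard library. The variable names are illustrative; the common standard deviation of 100 is the value assumed in the problem statement.

```python
import math
from statistics import NormalDist

xbar1, n1 = 380, 100    # last year: sample mean and sample size
xbar2, n2 = 410, 120    # this year: sample mean and sample size
sigma = 100             # standard deviation assumed known for both years

# Standard error of the difference xbar1 - xbar2
se = math.sqrt(sigma**2 / n1 + sigma**2 / n2)
z = (xbar1 - xbar2) / se

std_normal = NormalDist()              # N(0, 1)
p_value = std_normal.cdf(z)            # lower tail, because Ha is mu1 < mu2

z_star = std_normal.inv_cdf(0.975)     # about 1.96 for 95% confidence
diff = xbar1 - xbar2
ci = (diff - z_star * se, diff + z_star * se)

print(f"z={z:.2f} P={p_value:.4f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The P-value computed from the unrounded z can differ slightly in the fourth decimal place from a table lookup at z = -2.22.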
24
Comparing two population means (unknown standard deviations)
Suppose that a SRS of size n1 is drawn from a normal population with unknown mean μ1 and that an independent SRS of size n2 is drawn from another normal population with unknown mean μ2. To test the null hypothesis H0: μ1 = μ2, we compute the two-sample t statistic
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
This statistic has approximately a t distribution, with df taken to be the smaller of n1 - 1 and n2 - 1. We can use this distribution to compute the P-value.
25
Example
The weight gains for n1 = n2 = 8 rats tested on diets 1 and 2 are summarized here. Test whether diet 2 has greater mean weight gain. Use the 5% significance level.
          Diet 1  Diet 2
n              8       8
Std dev    0.033   0.070
Mean         3.1     3.2
The hypotheses to be tested are: H0: μ1 = μ2 vs Ha: μ1 < μ2. The test statistic is
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{3.1 - 3.2}{\sqrt{0.033^2/8 + 0.070^2/8}} = -3.65$
26
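The P-value is P(T(7) ≤ -3.65) = P(T(7) ≥ 3.65). From Table D, 3.65 lies between the critical values 3.499 (upper tail 0.005) and 4.029 (upper tail 0.0025) for 7 df, so 0.0025 < P-value < 0.005. We therefore reject H0 and conclude that the mean weight gain from diet 2 is significantly greater than that from diet 1 (at both the 5% and 1% significance levels).
A level C CI for the difference between the two means is given by
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$
where t* is the critical value of the t distribution with df equal to the smaller of n1 - 1 and n2 - 1. For this example the 95% CI for μ2 - μ1 is (3.2 - 3.1) ± 2.365(0.0274) = (0.0353, 0.165).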
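The two-sample t statistic and the interval for the diet example can be sketched in Python from the summary statistics alone. The critical value 2.365 is an assumed table value (upper 2.5% point of the t distribution with 7 df), and the degrees of freedom follow the conservative rule used on the slides, the smaller of n1 - 1 and n2 - 1.

```python
import math

n1, xbar1, s1 = 8, 3.1, 0.033   # diet 1: size, mean, standard deviation
n2, xbar2, s2 = 8, 3.2, 0.070   # diet 2: size, mean, standard deviation

# Standard error of the difference between the two sample means
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = (xbar1 - xbar2) / se   # tests H0: mu1 = mu2 against Ha: mu1 < mu2

df = min(n1, n2) - 1            # conservative degrees of freedom
t_star = 2.365                  # assumed table value for t(7), 95% confidence

diff = xbar2 - xbar1            # interval reported for mu2 - mu1
ci = (diff - t_star * se, diff + t_star * se)

print(f"SE={se:.4f} t={t_stat:.2f} df={df}")
print(f"95% CI for mu2 - mu1: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval lies entirely above 0, which is consistent with rejecting H0 in favor of a larger mean weight gain on diet 2.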