Presentation is loading. Please wait.

Presentation is loading. Please wait.

Inference about the mean of a population of measurements (  ) is based on the standardized value of the sample mean (Xbar). The standardization involves.

Similar presentations


Presentation on theme: "Inference about the mean of a population of measurements (  ) is based on the standardized value of the sample mean (Xbar). The standardization involves."— Presentation transcript:

1 Inference about the mean of a population of measurements (  ) is based on the standardized value of the sample mean (Xbar). The standardization involves subtracting the mean of Xbar and dividing by the standard deviation of Xbar – recall that –Mean of Xbar is  ; and –Standard deviation of Xbar is  /sqrt(n) Thus we have (Xbar -  )/(  /sqrt(n)) which has a Z distribution if: –Population is normal and  is known ; or if –n is large so CLT takes over…

2 But what if  is unknown?? Then this standardized Xbar doesn’t have a Z distribution anymore, but a so-called t-distribution with n-1 degrees of freedom… Since  is unknown, the standard deviation of Xbar,  /sqrt(n), is unknown. We estimate it by the so-called standard error of Xbar, s/sqrt(n), where s=the sample standard deviation. There is a t-distribution for every value of the sample size; we’ll use t(k) to stand for the particular t-distribution with k degrees of freedom. There are some properties of these t- distributions that we should note…

3 Every t-distribution looks like a N(0,1) distribution; i.e., it is centered and symmetric around 0 and has the same characteristic “bell” shape… however, the standard deviation of t(k) {sqrt(k/(k-2))} is greater than 1, the s.d. of Z so the t-distribution density curve is more spread out than Z. Probabilities involving r.v.s that have the t(k) distributions are given by areas under the t(k) density curve … the pt function in R gives us the probabilities we need… pt(q, k) = Prob(t(k)<= q)

4 The good news is that everything we’ve already learned about constructing confidence intervals and testing hypotheses about  carries through under the assumption of unknown  … So e.g., a 95% confidence interval for  based on a SRS from a population with unknown  is Xbar +/- t * (s.e.(Xbar)) Recall that s.e.(Xbar) = s/sqrt(n). Here t * is the appropriate quantile from the t-distribution so that the area between –t * and +t * is.95 As we did before, if we change the level of confidence then the value of t * must change appropriately… e.g., for 95% confidence with df=12, qt(.975,12) gives the correct t * ….

5 Similarly, we may test hypotheses using this t- distributed standardized Xbar… e.g., to test the H 0 :    against H a :    we use (Xbar -    /(s/sqrt(n)) which has a t- distribution with n-1 df, assuming the null hypothesis is true. See the last page of these notes for a summary of hypothesis testing in the case of “the one-sample t-test” … HW: Read the online Chapter 10 on Hypothesis Testing with Standard Errors (start with the first 3 sections… the third deals with the t-distribution). Work on the second problem set handout…

6 Note: a statistic is robust if it is insensitive to violations of the assumptions made when the statistic is used. For example, the t-statistic requires normality of the population… how sensitive is the t-statistic to violations of normality?? Consider these practical guidelines for inference on a single mean: –If the sample size is < 15, use the t procedures if the data are close to normal. –If the sample size is >= 15 then unless there is strong non-normality or outliers, t procedures are OK –If the sample size is large (say n >= 40) then even if the distribution is skewed, t procedures are OK

7


Download ppt "Inference about the mean of a population of measurements (  ) is based on the standardized value of the sample mean (Xbar). The standardization involves."

Similar presentations


Ads by Google