Presentation is loading. Please wait.

Presentation is loading. Please wait.

Statistics: Unlocking the Power of Data Lock 5 Section 6.4 Distribution of a Sample Mean.

Similar presentations


Presentation on theme: "Statistics: Unlocking the Power of Data Lock 5 Section 6.4 Distribution of a Sample Mean."— Presentation transcript:

1 Statistics: Unlocking the Power of Data Lock 5 Section 6.4 Distribution of a Sample Mean

2 Statistics: Unlocking the Power of Data Lock 5 Outline Standard error for a sample mean CLT for sample means t-distribution Distribution of sample mean

3 Statistics: Unlocking the Power of Data Lock 5 The larger the sample size, the smaller the SE

4 Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the population is Standard Deviation

5 Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the sample is Standard Deviation

6 Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the sample mean is Standard Deviation The standard error is the standard deviation of the statistic.

7 Statistics: Unlocking the Power of Data Lock 5 Olympic Marathon Times

8 Statistics: Unlocking the Power of Data Lock 5 Olympic Marathon Times

9 Statistics: Unlocking the Power of Data Lock 5 CLT for a Mean Population Distribution of Sample Data Distribution of Sample Means n = 10 n = 30 n = 50

10 Statistics: Unlocking the Power of Data Lock 5 A normal distribution is usually a good approximation as long as n ≥ 30 Smaller sample sizes may be sufficient for symmetric distributions, and 30 may not be sufficient for very skewed distributions or distributions with high outliers

11 Statistics: Unlocking the Power of Data Lock 5 Math SAT Scores

12 Statistics: Unlocking the Power of Data Lock 5 Usually, we don’t know the population standard deviation , so estimate it with the sample standard deviation, s Standard Error

13 Statistics: Unlocking the Power of Data Lock 5 Replacing  with s changes the distribution of the standardized test statistic from normal to t The t distribution is very similar to the standard normal, but with slightly fatter tails to reflect this added uncertainty t-distribution

14 Statistics: Unlocking the Power of Data Lock 5 The t-distribution is characterized by its degrees of freedom (df) Degrees of freedom are calculated based on the sample size The higher the degrees of freedom, the closer the t-distribution is to the standard normal Degrees of Freedom

15 Statistics: Unlocking the Power of Data Lock 5 t-distribution

16 Statistics: Unlocking the Power of Data Lock 5 Aside: William Sealy Gosset

17 Statistics: Unlocking the Power of Data Lock 5 t-distribution If a population with mean µ 0 is approximately normal or if n is large (n ≥ 30), the standardized statistic for a mean using the sample s follows a t-distribution with n – 1 degrees of freedom:

18 Statistics: Unlocking the Power of Data Lock 5 t-distribution To use the t-distribution, either n has to be large or the population has to be normally distributed. If these conditions are met, then t has a t-distribution when the null hypothesis is true.

19 Statistics: Unlocking the Power of Data Lock 5 Normality Assumption Using the t-distribution requires that the data comes from a normal distribution Note: this assumption is about the population data, not the distribution of the statistic For large sample sizes we do not need to worry about this, because s will be a very good estimate of , and t will be very close to N(0,1) For small sample sizes (n < 30), we can only use the t-distribution if the distribution of the data is approximately normal

20 Statistics: Unlocking the Power of Data Lock 5 Normality Assumption One small problem: for small sample sizes, it is very hard to tell if the data actually comes from a normal distribution! Population Sample Data, n = 10

21 Statistics: Unlocking the Power of Data Lock 5 Small Samples If sample sizes are small, only use the t- distribution if the data look reasonably symmetric and do not have any extreme outliers. Even then, remember that it is just an approximation! In practice/life, if sample sizes are small, you should just use simulation methods (bootstrapping and randomization)


Download ppt "Statistics: Unlocking the Power of Data Lock 5 Section 6.4 Distribution of a Sample Mean."

Similar presentations


Ads by Google