Lecture #28 Thursday, December 1, 2016 Textbook: 16.1


1 Lecture #28 Thursday, December 1, 2016 Textbook: 16.1
Statistics 200
Lecture #28
Thursday, December 1, 2016
Textbook: 16.1
Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.

2 Example of a stemplot (easy to make by hand!)
The decimal point is 1 digit(s) to the right of the |
   0 |
   1 | 00
   2 |
   3 | 0
   4 | 77
   5 | 2
   6 | 1348
   7 | 0677
   8 |
   9 |
  10 |
  11 |
These are the lab grades (out of 120 possible) for one of the 4 lab sections.

3 Motivating Example: Do students change their study habits as they go through college?
In this example, the response is _________ and the explanatory variable is _________ (use quantitative or categorical).
Answer: the response is quantitative and the explanatory variable is categorical.

4 Quantitative response, categorical predictor
We have seen this in the past. For example, compare male GPA with female GPA. In such cases, we have used difference of two means. D.O.T.M. works because the categorical variable only has two levels. But what about the current example?

5 Quantitative response, categorical predictor
In the current example, our categorical predictor has four levels: Freshman, Sophomore, Junior, Senior. There’s no such thing as “difference of four means”! (We could try doing all of the pairwise comparisons, but this would be super-tedious AND there is a statistical problem with this approach.)
Fr vs. So
Fr vs. Ju
Fr vs. Se
So vs. Ju
So vs. Se
Ju vs. Se
Try googling “multiple comparisons problem”
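To see why the pairwise approach runs into trouble, here is a minimal sketch that counts the comparisons and estimates the chance of at least one false positive. The function name and the 0.05 significance level are assumptions for illustration, and the calculation treats the tests as independent, which pairwise comparisons sharing a group are not, so read the result as a rough guide only.

```python
from math import comb

def familywise_error_rate(num_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes every null hypothesis is true and the tests are independent.
    Independence is a simplification: comparisons that share a group are
    correlated, so the true rate differs somewhat.
    """
    num_tests = comb(num_groups, 2)           # ways to pick 2 groups out of num_groups
    return num_tests, 1 - (1 - alpha) ** num_tests

num_tests, fwer = familywise_error_rate(4)
print(num_tests, round(fwer, 3))  # 6 0.265
```

With four class years there are six tests, and under these assumptions the chance of at least one spurious "significant" result is roughly 26%, far above the nominal 5%.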

6 Quantitative response, categorical predictor
Variable  CollYear   Count  Mean  SE Mean  StDev
StudHrWk  Freshman
          Sophomore
          Junior
          Senior
We have all the data we might need to compare the four means. We even know what the null hypothesis should be:
H0: µ1 = µ2 = µ3 = µ4

7 Quantitative response, categorical predictor
Variable  CollYear   Count  Mean  SE Mean  StDev
StudHrWk  Freshman
          Sophomore
          Junior
          Senior
It may sound strange, but we’re going to test equality of the means by a procedure called analysis of variance, or ANOVA.
H0: µ1 = µ2 = µ3 = µ4

8 How does ANOVA work to compare means?
Answer: If the means are very different from one another, the variance of the sample means will be large. See the four red X’s in the plot? Those are the sample means. Does their variability seem large? What will you compare it to?

9 How does ANOVA work to compare means?
ANOVA works by comparing the variation between the group means to the variation within the groups. Focus on the horizontal variation (the vertical is meaningless here).

10 Another graphical summary of the data:
ANOVA works by comparing the variation between the group means to the variation within the groups. Looking at the four CIs, one for each mean, does it look like we’ll reject H0?

11 How well do you understand this plot?
Which of the four groups has the largest sample size?
• Freshmen
• Juniors
• Seniors
• Sophomores
Each MOE equals: multiplier times s/sqrt(n). And the same s is used for each. A key piece of information is here.
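The MOE formula on this slide can be sketched in a few lines. The group labels, sample sizes, multiplier, and pooled standard deviation below are all made up for illustration; they are not the values from the class survey.

```python
from math import sqrt

def margin_of_error(multiplier, pooled_sd, n):
    """MOE = multiplier * s / sqrt(n), with the same s used for every group."""
    return multiplier * pooled_sd / sqrt(n)

# Hypothetical inputs (not the lecture's data).
multiplier = 2.0
pooled_sd = 10.0
sample_sizes = {"A": 25, "B": 100, "C": 50, "D": 16}

moes = {group: margin_of_error(multiplier, pooled_sd, n)
        for group, n in sample_sizes.items()}
narrowest = min(moes, key=moes.get)   # group with the smallest MOE
print(moes)
print(narrowest)
```

Because s and the multiplier are shared, the only thing that changes the width of an interval is n: the group with the narrowest interval is the one with the largest sample size.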

12 Hypotheses for ANOVA:
Remember: There are groups within the population, as defined by their values of the categorical variable.
H0: Population means are the same for each group.
Ha: Not all population group means are the same.
H0: µ1 = µ2 = … = µk
Ha: Not all µ’s are the same.
In our particular situation…
H0: Each class (F, So, J, Se) has the same mean study time.
Ha: The mean study times are not the same for each class.

13 There are some assumptions made in ANOVA:
An excerpt from page 638:
• Each population group has a normal distribution.
• Each population group has the same standard deviation.
In STAT 200, you will not be asked to check assumptions. However, you must know these two!

14 We may use Minitab to perform ANOVA:
“Factor” is another name for a categorical variable, used often in an ANOVA context.

15 ANOVA output is summarized in a single table:
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
The MS, or mean square, is the estimated variance. Note: It equals SS / DF. The top row gives the “between” variation.

16 ANOVA output is summarized in a single table:
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
The MS, or mean square, is the estimated variance. Note: It equals SS / DF. The second row gives the “within” variation.

17 ANOVA output is summarized in a single table:
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
The F statistic is merely the ratio of the MS between to the MS within. It is the test statistic we use for ANOVA! The second row gives the “within” variation.
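The pieces of the table can be computed by hand for a small data set. The sketch below uses made-up numbers (not the class survey data), and the function name and dictionary keys are illustrative choices, not Minitab output.

```python
def one_way_anova(groups):
    """Compute the entries of a one-way ANOVA table from lists of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # "Between" row: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # "Within" (Error) row: how far each observation is from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between   # MS = SS / DF
    ms_within = ss_within / df_within
    return {
        "df_between": df_between, "ss_between": ss_between, "ms_between": ms_between,
        "df_within": df_within, "ss_within": ss_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data: four groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
table = one_way_anova(groups)
print(table["ms_between"], table["ms_within"], table["F"])  # 5.0 1.0 5.0
```

Here the group means (2, 3, 4, 5) vary five times as much as we would expect from the spread inside the groups, so F = 5.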

18 ANOVA output is summarized in a single table:
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
The p-value is based on the F statistic and the two DF values for between and within.
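Software gets the p-value from the F distribution with those two DF values. As a rough illustration only, the sketch below approximates that tail area by simulation rather than by the exact calculation Minitab performs. The function name, number of draws, and seed are assumptions for this example, and the inputs (F = 5 with 3 and 8 degrees of freedom) are hypothetical.

```python
import random

def simulated_f_p_value(f_observed, df_between, df_within, draws=100_000, seed=0):
    """Approximate P(F >= f_observed) when H0 is true, by Monte Carlo.

    An F random variable is a ratio of two independent chi-square variables,
    each divided by its degrees of freedom. A chi-square with d degrees of
    freedom is a gamma distribution with shape d/2 and scale 2.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(draws):
        numerator = rng.gammavariate(df_between / 2, 2) / df_between
        denominator = rng.gammavariate(df_within / 2, 2) / df_within
        if numerator / denominator >= f_observed:
            exceed += 1
    return exceed / draws

# Hypothetical inputs: F = 5 with 3 "between" DF and 8 "within" DF.
p_value = simulated_f_p_value(5.0, 3, 8)
print(p_value)
```

Changing either DF value changes the simulated distribution, which is why the same F statistic can give different p-values in different studies.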

19 ANOVA output is summarized in a single table:
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
One more ANOVA table fact: The MS error, or MS within, is also called the pooled sample variance. You can take its square root to get the pooled standard deviation. The pooled stdev was used in the interval plot seen earlier.
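A short sketch of the pooled standard deviation, using made-up data with unequal group sizes (the function name is an illustrative choice). It builds the MS error from the individual sample variances, each weighted by its degrees of freedom, and then takes the square root.

```python
from math import sqrt
from statistics import variance

def pooled_sd(groups):
    """Square root of MS error: a DF-weighted average of the group variances."""
    df_within = sum(len(g) - 1 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return sqrt(ss_within / df_within)

# Hypothetical data: sample variances are 1 (n = 3) and about 6.67 (n = 4).
groups = [[1, 2, 3], [2, 4, 6, 8]]
print(round(pooled_sd(groups), 3))  # 2.098
```

The pooled variance here is (2·1 + 3·6.67) / 5 = 4.4, which sits closer to the variance of the larger group because that group contributes more degrees of freedom.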

20 Use ANOVA table to write a conclusion
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
We do not reject the null hypothesis (p-value = 0.552), which means that the data do not provide evidence of any difference among the mean study hours per week for freshmen, sophomores, juniors, and seniors.

21 What about mean GPA goal?

22 What about mean GPA goal?
Analysis of Variance
Source    DF  Adj SS  Adj MS  F-Value  P-Value
CollYear
Error
Total
We reject the null hypothesis (p-value < 0.0005), which means that there is strong evidence that the mean goal GPA is not the same among the four groups of freshmen, sophomores, juniors, and seniors.
You may wonder whether we may then follow up to find out where the differences lie. Yes, but not in this class…

23 If you understand today’s lecture…
16.1, 16.3, 16.4, 16.8, 16.13
Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.

