Lecture #28 Thursday, December 1, 2016 Textbook: 16.1


Statistics 200
Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.

Example of a stemplot (easy to make by hand!)

The decimal point is 1 digit(s) to the right of the |

   0 |
   1 | 00
   2 |
   3 | 0
   4 | 77
   5 | 2
   6 | 1348
   7 | 0677
   8 | 014558899
   9 | 4445666888899
  10 | 0002234445556788899
  11 | 0000011222223445599

These are the lab grades (out of 120 possible) for one of the 4 lab sections.

Motivating Example: Do students change their study habits as they go through college?
In this example, the response is _________ and the explanatory variable is _________ (use quantitative or categorical).
Answer: the response is quantitative and the explanatory variable is categorical.

Quantitative response, categorical predictor We have seen this in the past. For example, compare male GPA with female GPA. In such cases, we have used difference of two means. D.O.T.M. works because the categorical variable only has two levels. But what about the current example?

Quantitative response, categorical predictor
In the current example, our categorical predictor has four levels: Freshman, Sophomore, Junior, Senior. There’s no such thing as “difference of four means”! (We could try doing all of the pairwise comparisons, but this would be super-tedious AND there is a statistical problem with this approach.)
The six pairwise comparisons: Fr vs. So, Fr vs. Ju, Fr vs. Se, So vs. Ju, So vs. Se, Ju vs. Se.
Try googling “multiple comparisons problem”.
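To make the "statistical problem" concrete, here is a minimal sketch that counts the pairwise comparisons and estimates the chance of at least one false rejection. The 0.05 significance level and the independence of the six tests are assumptions made only for illustration; neither is stated in the lecture, and the real tests share data, so the figure is a rough guide.

```python
from itertools import combinations

levels = ["Freshman", "Sophomore", "Junior", "Senior"]

# Every unordered pair of class years is one two-sample comparison.
pairs = list(combinations(levels, 2))

alpha = 0.05  # ASSUMPTION: per-test significance level (not given in the lecture)

# ASSUMPTION: tests treated as independent, which is only an approximation.
# Probability of at least one false rejection when every null hypothesis is true.
familywise_error = 1 - (1 - alpha) ** len(pairs)

for a, b in pairs:
    print(f"{a} vs. {b}")
print("number of comparisons:", len(pairs))
print("approximate familywise error rate:", round(familywise_error, 3))
```

Under these assumptions, six tests at the 5% level give roughly a one-in-four chance of a spurious "significant" difference, which is why a single overall test is attractive.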

Quantitative response, categorical predictor

Variable  CollYear   Count  Mean    SE Mean  StDev
StudHrWk  Freshman      47  14.00   1.48     10.17
          Sophomore    178  13.531  0.704    9.397
          Junior        51  14.12   1.32     9.42
          Senior        11  9.73    1.54     5.12

We have all the data we might need to compare the four means. We even know what the null hypothesis should be:
H0: µ1 = µ2 = µ3 = µ4

Quantitative response, categorical predictor
It may sound strange, but we’re going to test equality of the means by a procedure called analysis of variance, or ANOVA.
H0: µ1 = µ2 = µ3 = µ4

How does ANOVA work to compare means? Answer: If the means are very different from one another, the variance of the sample means will be large. See the four red X’s in the plot? Those are the sample means. Does their variability seem large? What will you compare it to?

How does ANOVA work to compare means? ANOVA works by comparing the variation between the group means to the variation within the groups. Focus on the horizontal variation (the vertical is meaningless here).
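The between/within comparison can be sketched directly from the summary statistics in the Minitab output (count, mean, and standard deviation per class year). The function name and dictionary layout below are illustrative choices, not part of the lecture; the formulas are the standard one-way ANOVA sums of squares.

```python
# Summary statistics from the lecture's Minitab output: (count, mean, st. dev.)
groups = {
    "Freshman": (47, 14.00, 10.17),
    "Sophomore": (178, 13.531, 9.397),
    "Junior": (51, 14.12, 9.42),
    "Senior": (11, 9.73, 5.12),
}

def one_way_anova_from_summary(groups):
    """Return one-way ANOVA quantities computed from per-group summaries."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups.values())
    # Grand mean is the count-weighted average of the group means.
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # "Between" variation: how far each group mean sits from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # "Within" variation: spread of observations around their own group mean.
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "df_between": k - 1,
        "df_within": n_total - k,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

result = one_way_anova_from_summary(groups)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because the inputs are rounded, the results agree with the Minitab ANOVA table in this lecture only approximately (sums of squares near 186.7 and 25087.6, F near 0.70).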

Another graphical summary of the data: ANOVA works by comparing the variation between the group means to the variation within the groups. Looking at the four CIs, one for each mean, does it look like we’ll reject H0?

How well do you understand this plot?
Which of the four groups has the largest sample size? Freshmen, Juniors, Seniors, or Sophomores?
Each MOE equals: multiplier times s/sqrt(n). And the same s is used for each. A key piece of information is here.
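A short sketch of the margin-of-error formula shows why interval width reveals sample size. The pooled s is taken as the square root of the MS Error value (88.65) from this lecture's ANOVA table; the multiplier 1.97 is an assumed approximation to the 95% t multiplier with 283 degrees of freedom, not a value given on the slide.

```python
from math import sqrt

counts = {"Freshman": 47, "Sophomore": 178, "Junior": 51, "Senior": 11}

pooled_sd = sqrt(88.65)  # square root of MS Error from the lecture's ANOVA table
multiplier = 1.97        # ASSUMPTION: approximate 95% t multiplier for 283 df

# Same multiplier and same s for every group, so only sqrt(n) differs.
moe = {group: multiplier * pooled_sd / sqrt(n) for group, n in counts.items()}

for group, value in moe.items():
    print(f"{group}: +/- {value:.2f}")

narrowest = min(moe, key=moe.get)
print("narrowest interval (largest sample):", narrowest)
```

Since the multiplier and s are shared, the group with the narrowest interval must be the one with the largest n.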

Hypotheses for ANOVA:
Remember: There are groups within the population, as defined by their values of the categorical variable.
H0: Population means are the same for each group.
Ha: Not all population group means are the same.
In symbols:
H0: µ1 = µ2 = … = µk
Ha: Not all µ’s are the same.
In our particular situation…
H0: Each class (F, So, J, Se) has the same mean study time.
Ha: The mean study times are not the same for each class.

There are some assumptions made in ANOVA:
An excerpt from page 638:
• Each population group has a normal distribution.
• Each population group has the same standard deviation.
In STAT 200, you will not be asked to check assumptions. However, you must know these two!

We may use Minitab to perform ANOVA: “Factor” is another name for a categorical variable, used often in an ANOVA context.

ANOVA output is summarized in a single table:

Analysis of Variance
Source     DF   Adj SS   Adj MS  F-Value  P-Value
CollYear    3    186.7    62.23     0.70    0.552
Error     283  25087.6    88.65
Total     286  25274.2

The MS, or mean square, is the estimated variance. Note: It equals SS / DF.
The top row gives the “between” variation.

The second row (Error) gives the “within” variation. Its MS also equals SS / DF.

The F statistic is merely the ratio of the MS between to the MS within. It is the test statistic we use for ANOVA!
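The arithmetic behind the table can be checked in a few lines. This sketch uses only the DF and Adj SS values printed in the Minitab output; the variable names are illustrative.

```python
# Values copied from the lecture's ANOVA table for study hours per week.
ss = {"CollYear": 186.7, "Error": 25087.6}
df = {"CollYear": 3, "Error": 283}

# Mean square = sum of squares divided by its degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# F = MS between (CollYear row) divided by MS within (Error row).
f_value = ms["CollYear"] / ms["Error"]

print("MS between:", round(ms["CollYear"], 2))
print("MS within: ", round(ms["Error"], 2))
print("F-Value:   ", round(f_value, 2))
```

The rounded results match the Adj MS and F-Value columns of the table (62.23, 88.65, and 0.70).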

The p-value is based on the F statistic and the two DF values for between and within.
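Written out, the p-value is the upper-tail probability of an F distribution whose two degrees of freedom come from the table. Here k = 4 groups and N = 287 students, read off the DF column; the notation F_obs for the observed statistic is introduced only for this sketch.

```latex
p\text{-value}
  = P\!\left(F_{k-1,\;N-k} \ge F_{\text{obs}}\right)
  = P\!\left(F_{3,\;283} \ge 0.70\right)
  \approx 0.552
```

A large p-value like this one says that an F ratio of 0.70 or more is quite ordinary when all four population means are equal.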

One more ANOVA table fact: The MS error, or MS within, is also called the pooled sample variance. You can take its square root to get the pooled standard deviation. The pooled stdev was used in the interval plot seen earlier.

Use the ANOVA table to write a conclusion:
We do not reject the null hypothesis (p-value = 0.552), which means that there is no evidence of any difference among the mean study hours per week for freshmen, sophomores, juniors, and seniors.

What about mean GPA goal?

Analysis of Variance
Source     DF  Adj SS   Adj MS  F-Value  P-Value
CollYear    3   3.294  1.09800    18.14    0.000
Error     283  17.127  0.06052
Total     286  20.421

We reject the null hypothesis (p-value < 0.0005), which means that there is strong evidence that the mean goal GPA is not the same among the four groups of freshmen, sophomores, juniors, and seniors.
You may wonder whether we may then follow up to find out where the differences lie. Yes, but not in this class…

If you understand today’s lecture… 16.1, 16.3, 16.4, 16.8, 16.13
Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.