Topic 9 - ANOVA

Presentation transcript:

1 Topic 9 - ANOVA Background ANOVA

2 Comparing several means (some situations) Does the average number of words per sentence in advertisements differ across magazine types? Does the expected survival time vary for different types of cancer among patients treated with a specific drug? Is the mean response time not the same for three different types of circuits? Is there a difference in average carry distance for baseballs stored at a variety of humidity levels? Is there a statistical difference between the home run hitting ability of, say, Babe Ruth vs. Roger Maris vs. the modern-day Mark McGwire or Sammy Sosa?

3 Comparing several means Suppose that instead of comparing two means we want to test for the equality of several means. H0: μ_1 = μ_2 = … = μ_I. HA: at least two μ_i’s are different. Each of the groups we are comparing is called a treatment (a level of the factor). We make our decision based on samples from each of the I treatment groups. Let X_ij represent the j-th observation from the i-th treatment group, with j = 1, …, n_i. We assume each sample comes from a Normal population with a common variance.
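The setup on this slide can be written as a single model equation. This is a sketch in standard one-way ANOVA notation; the error term ε_ij, the symbol σ², and the independence of the errors are assumptions added here and are not stated on the slide.

```latex
X_{ij} = \mu_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independently}, \qquad
i = 1, \dots, I, \quad j = 1, \dots, n_i
```

Under this model, H0 says that all I group means μ_i share one value, while the variance σ² is the same in every group whether or not H0 holds.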

4 ANOVA – Analysis of Variance We partition the total variability of the data into a treatment component (in our control: good) and an error component (out of our control: bad), so that SStot = SStrt + SSerr. What you really want here is for SStrt to equal SStot. That means that you have no random error (SSerr = 0), and 100% of the variation in the data is explained by the treatments. While this would be a perfect result, it is rarely, if ever, the case.
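The partition of total variability can be checked numerically. The following is a minimal sketch in plain Python; the three groups and their values are invented for illustration and are not the data from the slides.

```python
# Hypothetical data: three treatment groups of unequal size (invented values).
groups = {
    "A": [4, 5, 6],
    "B": [7, 8, 9, 8],
    "C": [1, 2, 3],
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


all_values = [x for obs in groups.values() for x in obs]
grand_mean = mean(all_values)

# Treatment sum of squares: squared distance of each group mean from the
# grand mean, weighted by the number of observations in that group.
ss_trt = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values())

# Error sum of squares: squared distance of each observation from its own group mean.
ss_err = sum((x - mean(obs)) ** 2 for obs in groups.values() for x in obs)

# Total sum of squares: squared distance of each observation from the grand mean.
ss_tot = sum((x - grand_mean) ** 2 for x in all_values)

print(f"SStrt = {ss_trt:.2f}, SSerr = {ss_err:.2f}, SStot = {ss_tot:.2f}")
```

With these invented values most of the total variability sits in SStrt, which is the situation the slide describes as desirable.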

5 ANOVA - Mean squares MStrt = SStrt/DFtrt, MSerr = SSerr/DFerr, F = MStrt/MSerr. If H0 is true (all the means are the same, or really close to being the same), then F should be close to 1, because MStrt and MSerr then both estimate the same common variance. –The distribution means should be visually close and there should be a lot of “commonality” (overlap) among the distributions, meaning that it would be quite difficult to tell visually whether a specific value of X came from distribution 1, 2, 3, or 4. If H0 is false (at least two of the means are different), then F should be much larger than 1. –The distribution means should be separated and there should be minimal overlap or “commonality” among the distributions, so it should be relatively easy to tell whether a specific value of X came from distribution 1, 2, 3, or 4. The less the distributions overlap, the higher the F value and the more persuasive your result.
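The mean squares and the F statistic follow directly from the sums of squares. The sketch below uses invented data and assumes the standard one-way ANOVA degrees of freedom, DFtrt = I - 1 and DFerr = N - I, which the slide does not spell out.

```python
# Hypothetical data: three treatment groups (invented values).
groups = [
    [4, 5, 6],
    [7, 8, 9, 8],
    [1, 2, 3],
]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


n_total = sum(len(g) for g in groups)
grand_mean = mean([x for g in groups for x in g])

ss_trt = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_err = sum((x - mean(g)) ** 2 for g in groups for x in g)

# Degrees of freedom (standard one-way ANOVA definitions, assumed here).
df_trt = len(groups) - 1      # I - 1
df_err = n_total - len(groups)  # N - I

ms_trt = ss_trt / df_trt
ms_err = ss_err / df_err
f_stat = ms_trt / ms_err

print(f"MStrt = {ms_trt:.3f}, MSerr = {ms_err:.3f}, F = {f_stat:.3f}")
```

Because the invented group means are far apart relative to the spread within each group, the resulting F is much larger than 1.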

6 ANOVA – Decision rule Reject H0 if F > F(α; DFtrt, DFerr), the upper-α critical value of the F distribution with DFtrt and DFerr degrees of freedom. Demonstration of F calculator. Note: Since your F test statistic is the ratio of MStrt to MSerr, the higher that value the better. Larger values of the F test statistic are similar to larger (absolute) test statistics for Z or T, inasmuch as they give stronger evidence against H0, that is, a smaller p-value.
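For the special case of three treatments (DFtrt = 2), as in the circuit and golf ball examples, the F critical value and p-value have simple closed forms, so the decision rule can be sketched without a statistics library. The function names, the F value, and the error degrees of freedom below are hypothetical. For other values of DFtrt these formulas do not apply, and a general routine such as `scipy.stats.f` would be needed instead.

```python
def f_critical_three_groups(alpha, df_err):
    """Upper-alpha critical value of the F(2, df_err) distribution.

    Closed form valid only when the treatment degrees of freedom equal 2.
    """
    return (df_err / 2) * (alpha ** (-2 / df_err) - 1)


def f_pvalue_three_groups(f_stat, df_err):
    """P(F > f_stat) for the F(2, df_err) distribution (treatment df = 2 only)."""
    return (df_err / (df_err + 2 * f_stat)) ** (df_err / 2)


alpha = 0.05
df_err = 7        # hypothetical: N - I with N = 10 observations, I = 3 groups
f_stat = 36.225   # hypothetical F statistic

critical = f_critical_three_groups(alpha, df_err)
p_value = f_pvalue_three_groups(f_stat, df_err)
reject_h0 = f_stat > critical

print(f"critical value = {critical:.3f}, p-value = {p_value:.6f}, reject H0: {reject_h0}")
```

Comparing F with the critical value and comparing the p-value with alpha always lead to the same decision.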

7 Example Calc of SStrt+SSerr=SStot (1) Table columns: TRT, OBS1, OBS2, OBS3, OBS4, OBS5, AVG. Grand mean is the average of all values in the dataset = SStrt is the summation of the squared differences between the treatment means and the grand mean, weighted by the number of observations for each treatment. SStrt = (4( )^2)+(5( )^2)+(5( )^2)+(5( )^2)+(5( )^2)= SSerr is the summation of the squared differences between the individual observations and their respective treatment means. SSerr=(10-11)^2+(11-11)^2+(11-11)^2+(12-11)^2+( )^2+…+(10-9.6)^2=27.2 SStot=( )^2+( )^2+…+( )^2+( )^2=

8 Example Calc of SStrt+SSerr=SStot (2) SStrt = (4( )^2)+(5( )^2)+(5( )^2)+(5( )^2)+(5( )^2)= SSerr=(10-11)^2+(11-11)^2+(11-11)^2+(12-11)^2+( )^2+…+(10-9.6)^2=27.2 SStot=( )^2+( )^2+…+( )^2+( )^2= Analysis of Variance results: Data stored in separate columns. Column means table, columns: Column, n, Mean, Std. Error; rows: Trt, Trt, Trt, Trt, Trt. ANOVA table, columns: Source, df, SS, MS, F-Stat, P-value; rows: Treatments, Error, Total.

9 ANOVA table Columns: Source, df, SS, MS, F-Stat, P-value. Rows: Treatments (P-value < ), Error, Total.

10 Magazine ads example 30 magazines were grouped by educational level: –Group 1 – High educational level –Group 2 – Medium educational level –Group 3 – Low educational level 3 magazines randomly selected from each group: –Group 1: 1. Scientific American, 2. Fortune, 3. The New Yorker –Group 2: 4. Sports Illustrated, 5. Newsweek, 6. People –Group 3: 7. National Enquirer, 8. Grit, 9. True Confessions 6 ads randomly selected from each of the 9 magazines and the variables below recorded: –WDS - number of words in advertisement copy –SEN - number of sentences in advertising copy –3SYL - number of 3+ syllable words in advertising copy –MAG - magazine (1 through 9 as above) –GROUP - educational level

11 Magazine Ads in StatCrunch Is the average number of words per sentence the same across magazine groups? – WDS/SEN – Compare boxplots & QQ plots What are the null and alternative hypotheses? Note: Remember to hold down the Ctrl key in StatCrunch when you want to add several ANOVA treatments.

12 Circuit example Response times in milliseconds were recorded for three different types of circuits used in a shutoff mechanism. Do the data suggest at level 0.05 that the three circuits differ in mean response time? H0: The mean response times are all the same. HA: At least two of the mean response times are different.

13 Golf Ball Data I play a lot of golf and I’m always looking for equipment to help me shoot lower scores. The problem is that I’m cheap. One of the main factors in golf is driving the ball as far as possible (assuming that you don’t create additional dispersion in the process), so if you can find a “longer ball”, it could be beneficial. The linked data show sample driving distances for three types of balls under consideration (Trispeed, E6 and B330). Test to see if there’s a difference in mean driving distance (discuss method here). H0: The mean driving distance of all balls is the same. HA: At least two of the balls have different mean driving distances.

14 Multiple comparisons If we reject H0 in favor of the alternative HA, then we are only concluding that at least two of the means are different. If we want to drill down to see which means are actually different, we might be tempted to do two-sample t tests for all pairs of means. The problem is that the overall level of significance is then much higher than the level of significance for each pairwise test. With 3 groups there are 3 pairwise comparisons, and running each at 5% alpha gives an overall alpha well above the 5% we wanted. The usual calculation of that overall alpha is itself only approximate (it errs on the conservative side), because the 3 pairwise comparisons are not actually independent. To do these multiple comparisons, we use Tukey’s method to maintain an overall level of significance.
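The inflation of the overall alpha can be illustrated with a short calculation. This sketch assumes the pairwise tests are independent, which the slide notes is not actually true, so the result is an approximation and not the exact overall error rate. The function name is hypothetical.

```python
from math import comb


def familywise_alpha(n_groups, alpha_per_test):
    """Approximate overall alpha for all pairwise tests among n_groups.

    Assumes the pairwise tests are independent (an approximation).
    Returns the number of pairwise comparisons and the overall alpha.
    """
    n_pairs = comb(n_groups, 2)  # number of distinct pairs of groups
    overall = 1 - (1 - alpha_per_test) ** n_pairs
    return n_pairs, overall


n_pairs, overall = familywise_alpha(3, 0.05)
print(f"{n_pairs} comparisons, approximate overall alpha = {overall:.4f}")
```

The number of pairs grows quickly with the number of groups (for example, 5 groups give 10 pairs), which is why a method that controls the overall level is used.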

15 Tukey’s interpretation of Golf Ball Data Shows simultaneous confidence intervals for the differences in means at overall alpha = .05. If “0” is inside a confidence interval, the two listed population means are not significantly different. If it’s not, the two population means are statistically different. Here, both Trispeed and E6 are different from B330, but not from each other.
The additional file for Topic 9 contains information and examples on aspects of ANOVA.
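The rule for reading Tukey's simultaneous confidence intervals can be sketched as a simple check of whether 0 lies inside each interval. The group labels and interval endpoints below are invented for illustration and are not the golf ball results.

```python
# Hypothetical simultaneous confidence intervals for differences in means,
# keyed by the pair of groups compared (labels and endpoints are invented).
intervals = {
    ("A", "B"): (-3.1, 4.2),
    ("A", "C"): (1.5, 9.0),
    ("B", "C"): (0.8, 8.3),
}


def means_differ(lower, upper):
    """True when 0 lies outside the interval, i.e. the means differ significantly."""
    return not (lower <= 0 <= upper)


results = {pair: means_differ(lo, hi) for pair, (lo, hi) in intervals.items()}

for (first, second), differ in results.items():
    verdict = "different" if differ else "not significantly different"
    print(f"{first} vs {second}: {verdict}")
```

In this invented example, A and B are not distinguishable from each other, while both differ from C.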