

Unequal Variance and ANOVA ©2005 Dr. B. C. Paul

ANOVA Assumptions
ANOVA assumes the populations sampled in each class are normally distributed.
It also assumes that the variance of those distributions is the same.
- How can we check the homogeneity of variance assumption?

Our Story
Quincy’s company has been outsourcing American jobs, and he wishes to know whether the number of rejected widgets differs by where his factories are located and which shift the workers are on.
- Quincy gathers data on the rejection rate at his three factories, using 20 weekly rates from each, taken at random from the record.

Quincy’s Data Set
Quincy’s favorite plant is in Yucatan, Mexico (where he is forced to spend much of his time on visits at the beach). He labels this plant #1.
- His day shift he calls 1
- His night shift he calls 2
Quincy’s second plant is in Peoria, Illinois. The plant has sentimental value to him since he used to go there with his grandfather, but the workers there are uppity and want things like fair wages and safe working conditions. He labels this plant #2.
Quincy’s third plant is in Malaysia. Quincy does not wish to discuss what he does during visits to this plant.

Quincy Enters His Data Set
[Screenshot of the data sheet. Columns: rejection rate per 1000 widgets; day or night shift; which plant.]

Quincy Scratches His Head
What kind of test should he do? There are too many plants and shifts here to try to do flocks of t tests.
Then he remembers ANOVA.
- His variables shift and plant are really categories. The numerical values are arbitrary and not even ordered.
- Only his # of rejects is ordered.
- With two classes to divide by, shift and plant, Quincy decides to do a two-way ANOVA.

Reaching for His Trusty SPSS Program, Quincy Begins
Quincy clicks Analyze to pull down the menu. He highlights General Linear Model to bring up the pop-out menu. He then highlights and clicks Multivariate.

Quincy picks # of rejects as his dependent variable, and shift and plant as his fixed factors.
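For readers without SPSS, the arithmetic behind a two-way ANOVA with two fixed factors can be sketched in plain Python. This is only an illustration, not what SPSS runs internally: the function name `two_way_anova` is hypothetical, the numbers are made up (they are not Quincy's data), and the sketch assumes a balanced design with the same number of observations in every plant and shift combination.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way fixed-effects ANOVA (illustrative sketch).

    cells maps (a_level, b_level) -> list of observations, with the same
    number of observations (at least 2) in every cell.
    Returns {term: (SS, df, F)}; F is None for the error term.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(cells[(a_levels[0], b_levels[0])])
    if n < 2 or any(len(cells[(a, b)]) != n for a in a_levels for b in b_levels):
        raise ValueError("this sketch needs a balanced design with n >= 2 per cell")

    grand = mean(y for obs in cells.values() for y in obs)
    a_mean = {a: mean(y for b in b_levels for y in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(y for a in a_levels for y in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    # Sums of squares for each main effect, the interaction, and the error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((y - cell_mean[key]) ** 2 for key, obs in cells.items() for y in obs)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err  # the single pooled error variance ANOVA relies on

    return {
        "A": (ss_a, df_a, ss_a / df_a / ms_err),
        "B": (ss_b, df_b, ss_b / df_b / ms_err),
        "AxB": (ss_ab, df_ab, ss_ab / df_ab / ms_err),
        "error": (ss_err, df_err, None),
    }


# Made-up rejection rates: key = (plant, shift), two weeks per cell.
example = {
    (1, 1): [10, 12], (1, 2): [14, 16],
    (2, 1): [4, 6], (2, 2): [5, 7],
    (3, 1): [8, 10], (3, 2): [13, 15],
}
table = two_way_anova(example)
```

Each F ratio divides an effect's mean square by the one pooled error mean square, which is why the equal-variance assumption matters for every line of the table.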

This Time Quincy Looks at Options
Quincy wants to see his means displayed. But most important, he wants to use Levene’s test to check that the variance is the same for all the plants and shifts.

Quincy clicks Continue and OK, and the program is off to the races.
It starts with the killjoy special of the day: Levene’s test says there is no way the variance is the same for all the categories.
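To make the idea behind Levene's test concrete, here is a minimal sketch of the statistic in its mean-centered form: replace each observation with its absolute deviation from its own group mean, then run a one-way ANOVA on those deviations. The function name `levene_w` and the data are made up for illustration, and the sketch assumes the mean-centered variant; some packages center on the median instead. It returns only the statistic and its degrees of freedom, since turning W into a p-value needs an F distribution, which the Python standard library does not provide.

```python
from statistics import mean


def levene_w(groups):
    """Mean-centered Levene statistic W with its (df1, df2)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own group mean.
    z = []
    for g in groups:
        centre = mean(g)
        z.append([abs(y - centre) for y in g])

    z_bar = [mean(zi) for zi in z]              # mean deviation per group
    z_grand = mean(v for zi in z for v in zi)   # mean deviation overall

    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_bar))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_bar) for v in zi)
    if within == 0:
        raise ValueError("deviations do not vary within groups; W is undefined")

    w = (n_total - k) / (k - 1) * between / within
    return w, k - 1, n_total - k


# Made-up example: same mean, very different spread.
tight = [9, 10, 11, 10]
wide = [2, 10, 18, 10]
w, df1, df2 = levene_w([tight, wide])

# Made-up example: different means, identical spread, so W is zero.
w_equal, _, _ = levene_w([[1, 2, 3], [11, 12, 13]])
```

A large W relative to the F distribution with (df1, df2) degrees of freedom is the "no way the variance is the same" verdict; shifting a group's mean without changing its spread leaves W untouched.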

Then it gives a result it has already called into question.
The most powerful effect was which plant (oh, nuts: outsourcing does seem to change reject rate). Shift was the next most important. But the response to shift work also varies by where the plant is located.

Importance of the Controlling Variables
About 71% of the variation can be traced to which plant and which shift is working.
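A figure like this is presumably the model's R squared: the share of the total sum of squares that the fitted terms account for. As a sketch, and assuming a balanced design so that the effect sums of squares add up (the first equality does not hold in general for unbalanced data):

```latex
R^2 = \frac{SS_{\text{plant}} + SS_{\text{shift}} + SS_{\text{plant}\times\text{shift}}}{SS_{\text{total}}}
    = 1 - \frac{SS_{\text{error}}}{SS_{\text{total}}}
    \approx 0.71
```

The remaining 29% or so is week-to-week variation within each plant and shift combination.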

Shift Effects
The 95% confidence intervals for the means of day shift and night shift do not come close to overlapping. The values suggest night shift messes up more than day shift.

This Really Sucks
For Quincy’s foreign plants, the 95% confidence intervals overlap, so Quincy may not be able to be certain the foreign plants are different from each other. However, his U.S. plant has a distinctly lower rejection rate than the foreign plants. Quincy’s outsourcing has impacted rejection rate.

Moving On to Plant and Shift Interactions
The U.S. plant on either day shift or night shift has a much lower rejection rate than any of the foreign plants. We also cannot be sure that there is an effect of shift in the U.S. operations.

Continued Inspection
On night shift the screw-up rate is similar for Malaysia and Mexico. On day shift, however, Malaysia shows better performance than Mexico.

Usefulness of Means Displays
Plots of the individual classes with confidence intervals give you an at-a-glance view of which specific differences are driving the test results.
The bad news is that those confidence intervals used a common standard error, which Levene’s test showed was wrong.
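The effect of a common standard error can be seen in a small sketch. It compares the standard error of each group mean computed from the pooled error variance (as ANOVA does) with the one computed from that group's own variance. The function name `pooled_and_separate_se` and the numbers are made up for illustration; they are not Quincy's data.

```python
from math import sqrt
from statistics import variance


def pooled_and_separate_se(groups):
    """Standard error of each group mean, pooled vs. group-specific."""
    df_error = sum(len(g) - 1 for g in groups)
    # Pooled (mean square error) variance: one number shared by all groups.
    ms_error = sum((len(g) - 1) * variance(g) for g in groups) / df_error
    pooled = [sqrt(ms_error / len(g)) for g in groups]
    separate = [sqrt(variance(g) / len(g)) for g in groups]
    return pooled, separate


tight = [9, 10, 11, 10]   # small spread
wide = [2, 10, 18, 10]    # large spread
pooled, separate = pooled_and_separate_se([tight, wide])
```

With unequal variances the pooled value lands between the two group-specific ones, so the interval drawn for the steady group is too wide and the interval for the erratic group is too narrow.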

What Happens When Assumptions Fail?
We have a very clear-cut result from our data, but unfortunately its legitimacy has been called into question.
The more good information we have, the closer we can peg our answers.
- We saw that adding samples improved the certainty of our conclusions.
- We saw with the Behrens-Fisher t test that when we lost homogeneity of variance we were less sure of our values than before.
- We know we have lost something here also.
General models and methods don’t help us know exactly how much, and don’t tell us what to do about it. We have been warned of an error and we know the direction that the error will push our answers.
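The loss of certainty in the two-sample case can be made concrete with Welch's approximation, a common practical answer to the Behrens-Fisher problem. Whether this is the exact form used earlier in the course is an assumption; the function name `welch_t` and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va_n = variance(a) / len(a)   # squared standard error of mean(a)
    vb_n = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va_n + vb_n)
    df = (va_n + vb_n) ** 2 / (
        va_n ** 2 / (len(a) - 1) + vb_n ** 2 / (len(b) - 1)
    )
    return t, df


steady = [9, 10, 11, 10]
erratic = [6, 14, 22, 14]
t, df = welch_t(steady, erratic)
pooled_df = len(steady) + len(erratic) - 2   # what the equal-variance test would use
```

Here the approximate degrees of freedom come out near 3 instead of the 6 an equal-variance t test would claim. Fewer degrees of freedom mean a wider reference distribution, which is the "less sure of our values" effect.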

So Are the U.S. Plants Better or Not?
With the high certainty levels here and fairly good sample size, we probably can still be sure about our U.S. plant.
- If things were close, we simply would not know for sure with the analysis and methods we have so far.

What About the Homogeneous Error Variance in Regression?
In many regression analyses the largest source of poorly distributed error variance is lack of fit of the model.
- We already know how to bring in other variables and look for non-linear effects.
If variance still changes, it is often because larger numbers have greater variance.
- Some sort of data normalization can help with this.
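One common form of such normalization is a log transform, which helps when the spread grows in proportion to the size of the numbers. The slides do not say which transform is meant, so treat this as one possible sketch; the helper name `variance_ratio` and the data are made up for illustration.

```python
from math import log
from statistics import variance


def variance_ratio(groups):
    """Largest group variance divided by the smallest."""
    vs = [variance(g) for g in groups]
    return max(vs) / min(vs)


# Made-up data: the second group is the first scaled by 10,
# so its spread is 10 times larger on the raw scale.
low = [8.0, 10.0, 12.5]
high = [80.0, 100.0, 125.0]

raw_ratio = variance_ratio([low, high])
log_ratio = variance_ratio([[log(y) for y in low], [log(y) for y in high]])
```

On the raw scale the variances differ by a factor of 100; after taking logs the two groups have essentially the same variance, because a multiplicative difference becomes a constant shift.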