Analysis of Variance (ANOVA)

Presentation transcript:

Analysis of Variance (ANOVA)

Examples of problems: We compare nitrogen concentrations in the leaves of five related plant species. We compare the number of seeds produced by plants grown (each grown independently!) under five different light regimes. Generally, we compare more than two groups.

Why not test in pairs with a series of t-tests? (Figure: pairwise comparisons among Species A, Species B, and Species C.)

If we have k groups (and we compare k means), we need k(k-1)/2 pairwise tests. The probability of a Type I error is α in each of them, so the chance of making at least one Type I error increases with the number of means compared (see the sketch below).
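As a quick illustration (a minimal sketch using only the quantities stated above, with the simplifying assumption that the pairwise tests are independent), the familywise error rate grows rapidly with k:

```python
# Familywise Type I error rate for all pairwise t-tests among k means,
# assuming (as an approximation) that the m = k*(k-1)/2 tests are independent.
alpha = 0.05

for k in (3, 5, 10):
    m = k * (k - 1) // 2                      # number of pairwise tests
    familywise = 1 - (1 - alpha) ** m         # P(at least one Type I error)
    print(f"k = {k:2d}, tests = {m:2d}, P(at least one Type I error) = {familywise:.2f}")
```

For k = 3 this already gives about 0.14, and for k = 10 about 0.90.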

The probability that we make at least one Type I error when using multiple t-tests to search among all the pairs in a group of k means rises quickly with k: "statistical fishing".

Thus we test just one hypothesis, "all groups are the same", or more precisely H0: μ1 = μ2 = μ3 = ... = μk, assuming homogeneity of variances (and normality). HA then says: it is not true that all the means are equal (at least one of them differs from the rest).

Analysis of variance = ANOVA (ANalysis Of VAriance). In the simplest case: single-factor ANOVA, also called one-way ANOVA.

Model: Xij = μ + αi + εij, where μ is the general (grand) mean, αi is the "shift" of group i from the general mean, and εij is the "error" (random) variability, distributed N(0, σ²) and independent of α. The null hypothesis can then be written as αi = 0 for all i (in other words, there is no shift among groups, just error variability).
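Restated in standard notation (the same model, nothing added):

```latex
X_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2), \qquad
H_0\colon \alpha_i = 0 \ \text{for all } i .
```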

Data, 3 groups (figure: group means and the overall/grand mean). The question is: what is the probability of obtaining means this variable, or more variable, if all the samples come from one population? The variability we can expect under H0 can be estimated in two ways, shown on the next slides; both use the overall/grand mean.

Data, 3 groups: variability is the sum of squared deviations from the respective mean (figure: group means and the overall/grand mean). Within groups: deviations of the observations from their own group means give an estimate of the general variance (in the case that H0 is true) based on the variability inside the groups.

Data, 3 groups: variability is the sum of squared deviations from the mean (figure: group means and the overall/grand mean). Among groups: deviations of the group means from the overall/grand mean, each multiplied by the group size, give a second estimate of the general variance (in the case that H0 is true), based on the variability among groups.

Data, 3 groups: variability is the sum of squared deviations from the mean (figure: group means and the overall/grand mean). The general (total) variability uses the deviations of all observations from the overall/grand mean. Even here it holds that MS_TOT = SS_TOT / DF_TOT (it is not of much use, though).

It holds that SS_TOT = SS_G + SS_e (and likewise DF_TOT = DF_G + DF_e). Thus ANALYSIS OF VARIANCE: we decompose the total variability into its components.
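A minimal numerical sketch of this decomposition (the three groups and their values are invented for illustration; only NumPy is assumed):

```python
import numpy as np

# Hypothetical data: three groups, not taken from the lecture
groups = [np.array([4.1, 5.0, 4.6, 5.2]),
          np.array([6.3, 5.9, 6.8, 6.1]),
          np.array([5.1, 4.8, 5.5, 5.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)            # inside groups
ss_among  = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # among groups
ss_total  = ((all_obs - grand_mean) ** 2).sum()                         # total

print(ss_total, ss_among + ss_within)   # the two numbers agree: SS_TOT = SS_G + SS_e
```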

We have two estimates of variance (MS_G and MS_e). If the null hypothesis is true, they are estimates of the same value, and the ratio of two variance estimates (for normally distributed variables) follows the F-distribution. If the groups come from populations differing in their means, then the variability among groups is bigger than the variability inside groups.
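With k groups and N observations in total, the mean squares and the test statistic take the standard one-way ANOVA form (consistent with the notation above):

```latex
MS_G = \frac{SS_G}{k-1}, \qquad
MS_e = \frac{SS_e}{N-k}, \qquad
F = \frac{MS_G}{MS_e} \sim F_{k-1,\,N-k} \ \text{under } H_0 .
```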

The variability among groups can only be tested against the variability inside groups!

The testing procedure is the classic one. Note that we again have two degrees of freedom (numerator and denominator). The resulting P-value is the probability that the variability among means is this big or bigger (if H0 is true).
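Continuing the hypothetical three-group data from the sketch above, the F statistic and its P-value can be obtained either in one call (scipy.stats.f_oneway) or directly from the F distribution with k-1 and N-k degrees of freedom:

```python
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.0, 4.6, 5.2]),
          np.array([6.3, 5.9, 6.8, 6.1]),
          np.array([5.1, 4.8, 5.5, 5.0])]

# One-way ANOVA in one call
f_stat, p_value = stats.f_oneway(*groups)

# The same P-value as a tail probability of the F distribution: P(F >= f_stat)
k = len(groups)
N = sum(len(g) for g in groups)
p_manual = stats.f.sf(f_stat, k - 1, N - k)

print(f"F = {f_stat:.2f}, P = {p_value:.4f} (check: {p_manual:.4f})")
```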

Nowadays, statistical software prints the exact P-value, e.g. P = 0.026.

Statistica also prints a row labelled Intercept; it is a test of the null hypothesis that the grand mean is 0. In most cases such a null hypothesis is clearly absurd, so there is no point in mentioning it in publications.

If I have two groups (k = 2), should I use ANOVA or a t-test? It does not matter, as the P-value is exactly the same in both cases (F is the square of the corresponding t).
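A quick check of this equivalence on two hypothetical samples (the values are arbitrary):

```python
import numpy as np
from scipy import stats

a = np.array([4.1, 5.0, 4.6, 5.2, 4.8])
b = np.array([6.3, 5.9, 6.8, 6.1, 5.7])

t_stat, p_t = stats.ttest_ind(a, b)    # classic two-sample t-test (pooled variances)
f_stat, p_f = stats.f_oneway(a, b)     # one-way ANOVA with k = 2

print(f"t^2 = {t_stat**2:.3f}, F = {f_stat:.3f}")          # identical
print(f"P (t-test) = {p_t:.4f}, P (ANOVA) = {p_f:.4f}")    # identical
```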

Power of the test: it increases with the deviation from H0 (which we cannot influence, though), increases with the number of observations per group, increases with the balance of the group sizes, and decreases with the number of groups (do not try to compare many groups with small numbers of replications within groups!).

Violation of assumptions and robustness: robustness to violation of normality increases with the number of observations per group; robustness to violation of homogeneity of variances decreases rapidly with unbalanced group sizes.

Factors with fixed and random effects. Fixed effect: I want to find out which element in the food is limiting, so rabbits are fed normal food and food enriched with magnesium, calcium or iron; I am interested in which one is the best, if any. Random effect: I have 10 randomly chosen plants from a meadow and I am interested in whether their offspring differ according to the parent plant; it does not matter whether the better offspring originates from my plant no. 1 or my plant no. 3.

Fortunately, one-way ANOVA is computed in the same way for a fixed and a random factor.

For factors with a fixed effect, it is not enough to know that the groups are not all the same; I want to know which groups differ from which. This question has no single good solution (and thus it has many). Key distinction: experiment-wise vs. comparison-wise Type I error rate; the Bonferroni correction (see the sketch below).
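A minimal sketch of the Bonferroni idea (the P-values are hypothetical): either each comparison is tested at α/m, or equivalently the P-values are adjusted, as statsmodels does.

```python
from statsmodels.stats.multitest import multipletests

p_values = [0.010, 0.030, 0.040, 0.300]   # hypothetical comparison-wise P-values
alpha = 0.05
m = len(p_values)

# By hand: a comparison is significant if p < alpha / m
print([p < alpha / m for p in p_values])

# Equivalent via statsmodels: rejection flags and Bonferroni-adjusted P-values
reject, p_adj, _, _ = multipletests(p_values, alpha=alpha, method='bonferroni')
print(reject, p_adj)
```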

Multiple comparison tests. Tukey's test is the "classic" one: it keeps the probability of a Type I error in at least one test below α, i.e. it controls the experiment-wise level of significance, usually 5%. It is an analogue of multiple t-tests, but its critical values depend on k; for large k the test is very weak (a lot of partial tests are done). The SE is estimated on the basis of all groups, not only the two being compared (to make the denominator DF bigger, and thus the power of the test as well); note, however, that this brings considerable sensitivity to violation of homogeneity of variances.
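A Tukey HSD sketch on the hypothetical three-group data used earlier (the group labels A, B, C are invented; statsmodels' pairwise_tukeyhsd is assumed to be available):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical data, three groups labelled A, B, C
values = np.array([4.1, 5.0, 4.6, 5.2,
                   6.3, 5.9, 6.8, 6.1,
                   5.1, 4.8, 5.5, 5.0])
labels = np.array(['A'] * 4 + ['B'] * 4 + ['C'] * 4)

result = pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05)
print(result)   # table of pairwise differences, confidence intervals and reject flags
```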

Typical results

In graphic form: there is something strange here; we probably committed a Type II error [but we usually pretend that this is all right]. As you can see, Tukey's test is not ideal (nothing is ideal in multiple comparisons), but at least no one will criticise you too much for using it.

What are the other possibilities? Dunnett's test: each "treatment" group is tested against a single control, so there are fewer tests (their number increases just linearly with the number of groups), which gives a more powerful test; one-tailed tests can be used too. Contrasts: testing "groups of groups", usually logically planned in advance (planned comparisons). A sketch of Dunnett's test follows.
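A Dunnett sketch on hypothetical control and treatment groups (the data are invented; recent SciPy versions, 1.11 or later, are assumed to provide scipy.stats.dunnett):

```python
import numpy as np
from scipy import stats   # scipy.stats.dunnett requires SciPy >= 1.11

# Hypothetical data: a control group and two treatment groups
control    = np.array([5.1, 4.8, 5.5, 5.0, 4.9])
treatment1 = np.array([5.9, 6.3, 6.1, 5.8, 6.0])
treatment2 = np.array([5.2, 5.0, 5.4, 5.1, 5.3])

# Each treatment is compared only against the control (2 tests instead of 3)
res = stats.dunnett(treatment1, treatment2, control=control)
print(res.pvalue)   # one P-value per treatment-vs-control comparison
```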

Non-parametric possibilities in lieu of ANOVA. Permutation tests: observations are randomly divided into groups of the same sizes as in the experiment; this generates our own distribution of the test statistic under the null hypothesis. Kruskal-Wallis test: based on ranks. Both tests test the H0 that the samples come from one population; if they are formulated as location tests, the assumption is that the distribution shape is the same in all groups. Median test: compare the number of observations above and below the common median in each group. A sketch of the first two approaches is given below.
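A sketch of both routes on the hypothetical three-group data from earlier (scipy.stats.kruskal for the rank-based test; a simple label-reshuffling loop for the permutation test, with an arbitrary number of permutations):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

groups = [np.array([4.1, 5.0, 4.6, 5.2]),
          np.array([6.3, 5.9, 6.8, 6.1]),
          np.array([5.1, 4.8, 5.5, 5.0])]

# Kruskal-Wallis test (rank-based)
h_stat, p_kw = stats.kruskal(*groups)
print(f"Kruskal-Wallis H = {h_stat:.2f}, P = {p_kw:.4f}")

# Permutation test: reshuffle the pooled observations into groups of the
# original sizes and compare the observed F with its permutation distribution.
sizes = [len(g) for g in groups]
pooled = np.concatenate(groups)
f_obs = stats.f_oneway(*groups).statistic

n_perm = 9999
count = 0
for _ in range(n_perm):
    shuffled = rng.permutation(pooled)
    resampled = np.split(shuffled, np.cumsum(sizes)[:-1])
    if stats.f_oneway(*resampled).statistic >= f_obs:
        count += 1
p_perm = (count + 1) / (n_perm + 1)
print(f"Permutation P = {p_perm:.4f}")
```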

Kruskal-Wallis: Ri is the sum of ranks in group i, ni is the number of observations in group i, and N is the total number of observations. (The standard form of the H statistic with these symbols is given below.)
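With the symbols defined above, the usual form of the Kruskal-Wallis statistic (the slide did not reproduce the formula; this is the standard version without the tie correction):

```latex
H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} \; - \; 3(N+1)
```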