Published by Adolfo González Cano. Modified over 6 years ago.
1
Always be mindful of the kindness and not the faults of others.
2
One-way Anova: Inferences about More than Two Population Means
What is ANOVA?
One-way ANOVA; F tests
Pairwise comparisons: Bonferroni procedure
3
Analysis of Variance & One Factor Designs
Y = DEPENDENT VARIABLE (“yield”) (“response variable”) (“quality indicator”)
X = INDEPENDENT VARIABLE (a possibly influential FACTOR)
4
OBJECTIVE: To determine the impact of X on Y
Mathematical Model: $Y = f(X, \varepsilon)$, where $\varepsilon$ = (impact of) all factors other than X
Ex: Y = Forced expiratory volume in one second (liters)
X = Medical center (Johns Hopkins, Rancho Los Amigos, St. Louis)
$\varepsilon$ = Many other factors (possibly, some we’re unaware of)
5
Statistical Model
(The factor, here Center, is of course represented as “categorical”.)
Data layout: columns are the “LEVELS” of Center, rows are replications.

            “LEVEL” OF Center
         1      2     ...     C
   1    Y11    Y12    ...    Y1C
   2    Y21     .             .
   .     .      .     Yij     .
   n    Yn1     .     ...    YnC

$Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$, $i = 1, \dots, n_j$; $j = 1, \dots, C$
6
Where
$\mu$ = OVERALL AVERAGE
j = index for FACTOR (center) LEVEL
i = index for “replication”
$\tau_j$ = Differential effect (response) associated with jth level of X
$\varepsilon_{ij}$ = “noise” or “error” associated with the (particular) (i,j)th data value.
Let $\mu_j$ = AVERAGE associated with jth level of X; then $\tau_j = \mu_j - \mu$, and $\mu$ = AVERAGE of the $\mu_j$.
7
Data layout with Column Means

         1      2      3     ...     C
   1    Y11    Y12     .     ...    Y1C
   2    Y21     .      .             .
   .     .      .      .             .
   R    YR1     .      .     ...    YRC

Column means: $\bar{Y}_{\cdot 1}$, $\bar{Y}_{\cdot 2}$, ..., ($\bar{Y}_{\cdot j}$), ..., $\bar{Y}_{\cdot C}$
8
“GRAND MEAN”
$\bar{Y}_{\cdot\cdot} = \sum_{j=1}^{C} \bar{Y}_{\cdot j} / C$ = “GRAND MEAN” (assuming same # data points in each column; otherwise, $\bar{Y}_{\cdot\cdot}$ = mean of all the data)
9
MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$
The grand mean $\bar{Y}_{\cdot\cdot}$ estimates $\mu$
$\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}$ (column mean minus grand mean) estimates $\tau_j$ (= $\mu_j - \mu$) (for all j)
These estimates are based on Gauss’ (1796) PRINCIPLE OF LEAST SQUARES and on COMMON SENSE
10
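MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$
If you insert the estimates into the MODEL,
(1) $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + \hat{\varepsilon}_{ij}$,
it follows that our estimate of $\varepsilon_{ij}$ is
(2) $\hat{\varepsilon}_{ij} = Y_{ij} - \bar{Y}_{\cdot j}$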
11
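Then, $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$
or,
(3) $(Y_{ij} - \bar{Y}_{\cdot\cdot}) = (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$
TOTAL VARIABILITY in Y = Variability in Y associated with X + Variability in Y associated with all other factors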
12
SUM OF SQUARES BETWEEN COLUMNS, SUM OF SQUARES WITHIN COLUMNS
If you square both sides of (3), and double sum both sides (over i and j), you get [after some unpleasant algebra, but lots of terms which “cancel”]
$\sum_{j=1}^{C} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{j=1}^{C} n_j (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_{j=1}^{C} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_{\cdot j})^2$
TSS (TOTAL SUM OF SQUARES) = SSB (SUM OF SQUARES BETWEEN COLUMNS) + SSW, also written SSE (SUM OF SQUARES WITHIN COLUMNS)
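The decomposition TSS = SSB + SSW can be checked numerically. The sketch below uses made-up readings for three centers (hypothetical values, not the data behind these slides) and a hypothetical helper name, `sums_of_squares`; the grand mean is taken as the mean of all the data, so unequal column sizes are handled.

```python
# Hypothetical readings for three centers (illustrative only, NOT the slide data).
groups = {
    1: [3.1, 3.4, 1.9, 2.5],
    2: [2.8, 3.2, 2.3, 3.0, 3.1],
    3: [2.9, 2.4, 3.2, 3.5],
}


def sums_of_squares(groups):
    """Return (TSS, SSB, SSW) for a one-way layout given as {level: [values]}."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)  # mean of all the data

    # Total sum of squares: every observation against the grand mean.
    tss = sum((y - grand_mean) ** 2 for y in all_values)

    ssb = 0.0  # between columns
    ssw = 0.0  # within columns
    for ys in groups.values():
        col_mean = sum(ys) / len(ys)
        ssb += len(ys) * (col_mean - grand_mean) ** 2
        ssw += sum((y - col_mean) ** 2 for y in ys)
    return tss, ssb, ssw


tss, ssb, ssw = sums_of_squares(groups)
print(f"TSS = {tss:.4f}, SSB + SSW = {ssb + ssw:.4f}")
```

The two printed numbers agree up to floating-point rounding, whatever values are placed in `groups`.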
13
ANOVA TABLE
SOURCE OF VARIABILITY                    SSQ    DF       Mean square (M.S.)
Between Columns (due to center)          SSB    C - 1    MSB = SSB / (C - 1)
Within Columns (due to other factors)    SSW    N - C    MSW = SSW / (N - C)
TOTAL                                    TSS    N - 1
14
ANOVA TABLE
Source of Variability    SSQ       df             M.S. (= SSQ/df)
CENTER                   1.583     2              0.791
ERROR                    14.480    57             0.254
TOTAL                              59 = 60 - 1
15
We can show:
$E(MSB) = \sigma^2 + V_{COL}$
$E(MSW) = \sigma^2$
This suggests that
if MSB/MSW > 1, there’s some evidence of non-zero $V_{COL}$, or “level of X affects Y”;
if MSB/MSW < 1, there’s no evidence that $V_{COL} > 0$, or that “level of X affects Y”.
16
With H0: Level of X has no impact on Y
H1: Level of X does have impact on Y,
we need MSB/MSW >> 1 to reject H0.
17
More Formally,
H0: $\tau_1 = \tau_2 = \dots = \tau_C = 0$
H1: not all $\tau_j = 0$
OR (all column means are equal)
H0: $\mu_1 = \mu_2 = \dots = \mu_C$
H1: not all $\mu_j$ are EQUAL
18
The distribution of MSB/MSW = “Fcalc” is the F-distribution with (dfB, dfW) degrees of freedom, assuming H0 true.
$C_\alpha$ = Table Value
19
In our problem: ANOVA TABLE
Source of Variability    SSQ       df             M.S.     Fcalc
CENTER                   1.583     2              0.791    3.12 = 0.791/0.254 (from unrounded mean squares)
ERROR                    14.480    57             0.254
TOTAL                              59 = 60 - 1
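The mean squares and Fcalc in this table follow directly from the sums of squares and degrees of freedom. A minimal sketch of that arithmetic, using the SSQ values shown above and taking N = 60 observations and C = 3 centers (the variable names are illustrative):

```python
# Sums of squares as given in the ANOVA table above.
ssb = 1.583    # between columns (CENTER)
ssw = 14.480   # within columns (ERROR)

n_total = 60   # N, from "60 - 1" total degrees of freedom
n_levels = 3   # C, three medical centers

df_between = n_levels - 1        # C - 1
df_within = n_total - n_levels   # N - C

msb = ssb / df_between           # mean square between
msw = ssw / df_within            # mean square within
f_calc = msb / msw

print(f"df = ({df_between}, {df_within})")
print(f"MSB = {msb:.4f}, MSW = {msw:.4f}, Fcalc = {f_calc:.2f}")
```

Dividing the rounded mean squares (0.791/0.254) gives about 3.11; carrying the unrounded values through gives the 3.12 reported in the table.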
20
F table: Table A-5, $\alpha$ = .05: $C_{.05}$ = 3.15; Fcalc = 3.12 (2, 57 DF)
21
Hence, at $\alpha$ = .05, Do Not Reject H0, i.e., conclude that centers don’t differ significantly on FEV1 at the 5% level. The P-value is .052, so it is significant at the 6% level.
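The P-value can be reproduced without tables in this particular case. When the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form P(F > f) = (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. The sketch below relies on that special case only (it is not a general F routine), and the function names are hypothetical.

```python
def f_upper_tail_df1_2(f, df_within):
    """P(F > f) for an F distribution with (2, df_within) degrees of freedom.

    Closed form valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f / df_within) ** (-df_within / 2.0)


def f_critical_df1_2(alpha, df_within):
    """Value c with P(F > c) = alpha, again only for numerator df = 2."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


# Fcalc recomputed from the ANOVA table (unrounded mean squares).
f_calc = (1.583 / 2) / (14.480 / 57)

p_value = f_upper_tail_df1_2(f_calc, 57)
crit_05 = f_critical_df1_2(0.05, 57)

print(f"Fcalc = {f_calc:.2f}, P-value = {p_value:.3f}, 5% critical value = {crit_05:.2f}")
print("Reject H0 at 5%" if f_calc > crit_05 else "Do not reject H0 at 5%")
```

The exact 5% critical value for (2, 57) degrees of freedom comes out slightly above the tabulated 3.15, which appears to correspond to the nearest table row, (2, 60). The conclusion is the same either way.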
22
Multiple Comparison Procedures
Once we reject H0: $\mu_1 = \mu_2 = \dots = \mu_C$ in favor of H1: NOT all $\mu$’s are equal, we don’t yet know the way in which they’re not all equal, but simply that they’re not all the same. If there are 4 columns, are all 4 $\mu$’s different? Are 3 the same and one different? If so, which one? etc.
23
Overall Type I Error Rate
We set up “$\alpha$” as the significance level for a hypothesis test. Suppose we test 3 independent hypotheses, each at $\alpha$ = .05; each test has type I error (rej H0 when it’s true) of .05. However, P(at least one type I error in the 3 tests) = 1 - P(accept all 3, given true) = 1 - (.95)^3 ≈ .14
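The same calculation for any number of independent tests, as a short sketch (the function name `familywise_error` is illustrative):

```python
def familywise_error(alpha, n_tests):
    """P(at least one type I error) across n_tests independent tests,
    each run at significance level alpha, when every H0 is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (1, 3, 6, 10):
    print(k, round(familywise_error(0.05, k), 4))
```

With 3 tests at .05 this gives about .14, as on the slide; the overall rate keeps climbing as more tests are added.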
24
Pairwise Comparisons
Bonferroni Correction: Do a series of pairwise t-tests, each with the specified $\alpha$ value divided by the # of comparisons.
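For the three-center example, the Bonferroni arithmetic can be sketched as follows. The helper name `bonferroni_alpha` is hypothetical, and the number of comparisons is assumed to be all pairs of levels.

```python
from math import comb


def bonferroni_alpha(alpha, n_levels):
    """Return (per-comparison alpha, number of pairwise comparisons)."""
    n_comparisons = comb(n_levels, 2)  # all pairs of factor levels
    return alpha / n_comparisons, n_comparisons


per_test_alpha, n_comparisons = bonferroni_alpha(0.05, 3)
individual_confidence = 100.0 * (1.0 - per_test_alpha)

print(f"{n_comparisons} comparisons, each at alpha = {per_test_alpha:.4f}")
print(f"individual confidence level = {individual_confidence:.1f}%")
```

Three centers give 3 pairwise comparisons, so each t-test is run at .05/3 ≈ .0167. That corresponds to the 98.3% individual confidence level that appears in the Minitab output further on.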
25
MINITAB INPUT
center   fev1
1        3.23
1        3.47
1        1.86
1        2.47
.        .
.        .
3        2.85
3        2.43
3        3.20
3        3.53
26
ONE FACTOR ANOVA (MINITAB)
MINITAB: STAT>>ANOVA>>ONE-WAY Click for comparisons
28
Minitab Outputs
Fisher 98.3% Individual Confidence Intervals
All Pairwise Comparisons among Levels of center
Simultaneous confidence level = 95.58%
center = 1 subtracted from: centers 2 and 3 (columns: center, Lower, Center, Upper; interval plots not reproduced)
center = 2 subtracted from: center 3 (columns: center, Lower, Center, Upper; interval plot not reproduced)