doc. Ing. Zlata Sojková, CSc.: Analysis of Variance


- In practice it is often necessary to compare several ($m > 2$) independent random samples in terms of their level (mean). We test the hypothesis
  $H_0: \mu_1 = \mu_2 = \dots = \mu_m$ against $H_1$: $\mu_i$ differs from the others for at least one $i$ ($i = 1, 2, \dots, m$),
  where $\mu_i$, $i = 1, 2, \dots, m$, are the mean values of normally distributed populations with equal variance $\sigma^2$, i.e. $N(\mu_i, \sigma^2)$.
- This hypothesis is verified with an important statistical method called analysis of variance, abbreviated ANOVA (or AV).

- In practice, AV is used to examine the impact of one or more factors (treatments) on a statistical attribute.
- Factors are labeled A, B, …; in AV they are regarded as qualitative attributes with different variants, called levels of the factor.
- The result is a quantitative statistical attribute, denoted Y.
- AV is frequently used in the evaluation of biological experiments.
- The simplest case is AV with a single factor, called one-factor analysis of variance.

- A level of the factor refers to:
  - a certain amount of a quantitative factor, e.g. the amount of pure nutrients in manure, or different income groups of households;
  - a certain kind of a qualitative factor, e.g. different varieties of the same crop, or methods of placing products in stores.
- AV is a generalization of Student's t-test for independent samples.
- AV also examines the impact of qualitative factors on a quantitative attribute, i.e. it analyzes the relationships between attributes.

Scheme of a single-factor experiment ("balanced experiment")

Rows are the levels of factor A, columns are the repetitions.

A     1          2          …   j          …   n          Row sum    Row average
1     $y_{11}$   $y_{12}$   …   $y_{1j}$   …   $y_{1n}$   $Y_{1.}$   $\bar{y}_{1.}$
2     $y_{21}$   $y_{22}$   …   $y_{2j}$   …   $y_{2n}$   $Y_{2.}$   $\bar{y}_{2.}$
…
i     $y_{i1}$   $y_{i2}$   …   $y_{ij}$   …   $y_{in}$   $Y_{i.}$   $\bar{y}_{i.}$
…
m     $y_{m1}$   $y_{m2}$   …   $y_{mj}$   …   $y_{mn}$   $Y_{m.}$   $\bar{y}_{m.}$

Total sum: $Y_{..}$; overall average: $\bar{y}_{..}$

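Row sum: $Y_{i.} = \sum_{j=1}^{n} y_{ij}$
Total sum: $Y_{..} = \sum_{i=1}^{m} \sum_{j=1}^{n} y_{ij}$
Row average: $\bar{y}_{i.} = Y_{i.} / n$
Overall average: $\bar{y}_{..} = Y_{..} / (m \cdot n)$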

Model for the observed value:

$y_{ij} = \mu + \alpha_i + e_{ij}$

where
- $\mu$ is the expected value common to all levels of the factor and all observed values,
- $\alpha_i$ is the impact of the i-th level of factor A,
- $e_{ij}$ is the random error (every measurement is subject to error, i.e. to the impact of random factors),
- $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$.

We can then formulate the null hypothesis

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_i = \dots = \alpha_m = 0$

(the effects of all levels of factor A are zero, i.e. insignificant) against the alternative hypothesis

$H_1: \alpha_i \neq 0$ for at least one $i$ ($i = 1, 2, \dots, m$)

(the effect $\alpha_i$ of at least one level of the factor is significant, i.e. significantly different from zero). Equivalently, in terms of the level means $\mu_i = \mu + \alpha_i$, $H_0$ states that $\mu_1 = \mu_2 = \dots = \mu_m$.

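Estimates of the parameters are the sample characteristics:

$\hat{\mu} = \bar{y}_{..}$, $\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$, $\hat{e}_{ij} = y_{ij} - \bar{y}_{i.}$

so each observation can be rewritten as:

$y_{ij} = \bar{y}_{..} + (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})$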

[Figure: comparison of two experiments with three levels of the factor]

Principle of ANOVA

The essence of analysis of variance lies in the decomposition of the total variability of the investigated attribute:

$S_c = S_1 + S_r$

- $S_c$: total variability
- $S_1$: variability between levels of the factor, caused by the action of factor A ("variability between groups")
- $S_r$: random (residual) variability ("variability within groups")
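The decomposition can be checked directly for the balanced design. The sketch below assumes the standard definitions of the three sums of squares (the formulas on the slides did not survive extraction, so these are an assumption), with $\bar{y}_{i.}$ the row average and $\bar{y}_{..}$ the overall average:

```latex
\begin{align*}
S_c &= \sum_{i=1}^{m}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2
     = \sum_{i=1}^{m}\sum_{j=1}^{n}
       \bigl[(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})\bigr]^2 \\
    &= \underbrace{n \sum_{i=1}^{m} (\bar{y}_{i.} - \bar{y}_{..})^2}_{S_1}
     + \underbrace{\sum_{i=1}^{m}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2}_{S_r}
     + 2 \sum_{i=1}^{m} (\bar{y}_{i.} - \bar{y}_{..})
       \underbrace{\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})}_{=\,0}
\end{align*}
```

The cross term vanishes because the deviations from a row average sum to zero within each row, which leaves $S_c = S_1 + S_r$.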

ANOVA table (balanced design)

Variability                    Sum of squares (SS) [1]   Degrees of freedom [2]   Mean square (MS) [1/2]          F test criterion
Variability between groups     $S_1$                     $m - 1$                  $s_1^2 = S_1 / (m - 1)$         $F = s_1^2 / s_r^2$
Variability within groups      $S_r$                     $m \cdot n - m$          $s_r^2 = S_r / (m \cdot n - m)$
Total variability              $S_c$                     $N - 1 = m \cdot n - 1$

The test statistic for one-factor ANOVA can be written as:

$F = s_1^2 / s_r^2$

The F value is compared with the appropriate tabulated value of the F-distribution, $F_\alpha$, with $(m - 1)$ and $(m \cdot n - m)$ degrees of freedom.

Decision about the test result:
- If $F_{calc} > F_\alpha(m - 1, N - m)$, we reject $H_0$. In that case the effect of at least one level of the factor is significant, so the average level of the indicator at that level differs significantly from the others; that is, at least one effect $\alpha_i$ is statistically significantly different from zero.
- If $F_{calc} \le F_\alpha$, we do not reject $H_0$.

[Figure: F-distribution with the acceptance region and the rejection region of $H_0$, separated by $F_\alpha$]
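The one-factor procedure (sums of squares, mean squares, F statistic, decision) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `one_way_anova`, the data, and the use of the standard sums-of-squares formulas are assumptions, and the critical value $F_{0.05}(2, 6) = 5.14$ is taken from a standard F table rather than computed.

```python
def one_way_anova(groups):
    """One-factor ANOVA for a balanced design: m groups, n observations each."""
    m = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("balanced design expected: all groups need the same size")

    row_means = [sum(g) / n for g in groups]
    overall_mean = sum(sum(g) for g in groups) / (m * n)

    # S1: variability between groups, Sr: within groups, Sc: total.
    s1 = n * sum((mean - overall_mean) ** 2 for mean in row_means)
    sr = sum((y - mean) ** 2 for g, mean in zip(groups, row_means) for y in g)
    sc = sum((y - overall_mean) ** 2 for g in groups for y in g)

    df1, dfr = m - 1, m * n - m
    f_calc = (s1 / df1) / (sr / dfr)  # ratio of the two mean squares
    return {"S1": s1, "Sr": sr, "Sc": sc, "df": (df1, dfr), "F": f_calc}


# Hypothetical data: m = 3 levels of factor A, n = 3 repetitions each.
data = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
result = one_way_anova(data)

# Assumed tabulated critical value F_0.05(2, 6) from an F table.
F_CRIT = 5.14
decision = "reject H0" if result["F"] > F_CRIT else "do not reject H0"
print(result, decision)
```

For these made-up numbers the between-group sum of squares is 24 and the within-group sum is 6, so the mean squares are 12 and 1 and the calculated F exceeds the tabulated value.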

If the null hypothesis is rejected:
- We have found only that the effect of the factor on the examined attribute is significant.
- It is also necessary to identify which levels of the factor differ significantly; tests of contrasts are used for this purpose.
- Tests of contrasts: Duncan test, Scheffé test, Tukey test and others.

Conditions for using AV:
- The samples come from normally distributed populations; a moderate violation of this assumption usually has no significant effect on the results of AV.
- Statistical independence of the random errors $e_{ij}$.
- Identical residual variances, $\sigma_1^2 = \sigma_2^2 = \dots = \sigma^2$, i.e. $D(e_{ij}) = \sigma^2$ for all $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$. This assumption is more serious; it can be verified by the Cochran or Bartlett test.

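Scheme of a single-factor experiment ("unbalanced experiment")

Rows are the levels of factor A; the levels have different numbers of repetitions $n_i$.

A     1          2          …   j          …   Repetitions   Row sum    Row average
1     $y_{11}$   $y_{12}$   …   $y_{1j}$   …   $n_1$         $Y_{1.}$   $\bar{y}_{1.}$
2     $y_{21}$   $y_{22}$   …   $y_{2j}$   …   $n_2$         $Y_{2.}$   $\bar{y}_{2.}$
…
i     $y_{i1}$   $y_{i2}$   …   $y_{ij}$   …   $n_i$         $Y_{i.}$   $\bar{y}_{i.}$
…
m     $y_{m1}$   $y_{m2}$   …   $y_{mj}$   …   $n_m$         $Y_{m.}$   $\bar{y}_{m.}$

Total sum: $Y_{..}$; overall average: $\bar{y}_{..}$, where $N = \sum_{i=1}^{m} n_i$ is the total number of observations.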

ANOVA table (unbalanced design)

Variability                    Sum of squares (SS) [1]   Degrees of freedom [2]   Mean square (MS) [1/2]      F test criterion
Variability between groups     $S_1$                     $m - 1$                  $s_1^2 = S_1 / (m - 1)$     $F = s_1^2 / s_r^2$
Variability within groups      $S_r$                     $N - m$                  $s_r^2 = S_r / (N - m)$
Total variability              $S_c$                     $N - 1$
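For the unbalanced case only the weights and the degrees of freedom change: each level contributes to the between-group sum of squares in proportion to its own number of repetitions $n_i$, and the residual degrees of freedom become $N - m$. A minimal sketch, assuming the standard formulas; the function name and the data are hypothetical:

```python
def one_way_anova_unbalanced(groups):
    """One-factor ANOVA where level i has its own number of repetitions n_i."""
    sizes = [len(g) for g in groups]
    m, total_n = len(groups), sum(sizes)

    row_means = [sum(g) / len(g) for g in groups]
    overall_mean = sum(sum(g) for g in groups) / total_n

    # Between-group variability: each level is weighted by its size n_i.
    s1 = sum(n_i * (mean - overall_mean) ** 2
             for n_i, mean in zip(sizes, row_means))
    # Within-group (residual) variability.
    sr = sum((y - mean) ** 2 for g, mean in zip(groups, row_means) for y in g)

    df1, dfr = m - 1, total_n - m
    f_calc = (s1 / df1) / (sr / dfr)
    return {"S1": s1, "Sr": sr, "Sc": s1 + sr, "df": (df1, dfr), "F": f_calc}


# Hypothetical data: three levels with 2, 3 and 4 repetitions (N = 9).
unbalanced = [[2, 4], [5, 6, 7], [9, 10, 11, 12]]
res = one_way_anova_unbalanced(unbalanced)
print(res)
```

With equal group sizes the function gives the same values as the balanced formulas, so it can be used for both schemes.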

Two-factor analysis of variance with one observation in each subclass (TAV)
- Consider the effect of factor A, which we investigate at m levels, $i = 1, 2, \dots, m$.
- Then consider the effect of factor B, which is observed at n levels, $j = 1, 2, \dots, n$.
- For every i-th level of factor A and j-th level of factor B we have only one observation (repetition) $y_{ij}$.
- => We are verifying two null hypotheses.

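Scheme of a two-factor experiment with one observation in each subclass (TAV)

Rows are the m levels of factor A, columns are the n levels of factor B.

A \ B            1                2                …   j                …   n                Row sum    Row average
1                $y_{11}$         $y_{12}$         …   $y_{1j}$         …   $y_{1n}$         $Y_{1.}$   $\bar{y}_{1.}$
2                $y_{21}$         $y_{22}$         …   $y_{2j}$         …   $y_{2n}$         $Y_{2.}$   $\bar{y}_{2.}$
…
i                $y_{i1}$         $y_{i2}$         …   $y_{ij}$         …   $y_{in}$         $Y_{i.}$   $\bar{y}_{i.}$
…
m                $y_{m1}$         $y_{m2}$         …   $y_{mj}$         …   $y_{mn}$         $Y_{m.}$   $\bar{y}_{m.}$
Column sum       $Y_{.1}$         $Y_{.2}$         …   $Y_{.j}$         …   $Y_{.n}$         $Y_{..}$
Column average   $\bar{y}_{.1}$   $\bar{y}_{.2}$   …   $\bar{y}_{.j}$   …   $\bar{y}_{.n}$              $\bar{y}_{..}$

$Y_{..}$ is the total sum and $\bar{y}_{..}$ the overall average.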

We are verifying the validity of two null hypotheses.

Hypothesis for factor A:

$H_0^{1}: \alpha_1 = \alpha_2 = \dots = \alpha_i = \dots = \alpha_m = 0$

i.e. all effects of the levels of factor A are equal to zero, thus insignificant, against the alternative hypothesis

$H_1^{1}: \alpha_i \neq 0$ for at least one $i$ ($i = 1, 2, \dots, m$)

i.e. the effect $\alpha_i$ of at least one level of factor A is significant (significantly different from zero).

The model for the examined attribute can be written as follows:

$y_{ij} = \mu + \alpha_i + \beta_j + e_{ij}$

Hypothesis for factor B:

$H_0^{2}: \beta_1 = \beta_2 = \dots = \beta_j = \dots = \beta_n = 0$

i.e. all effects of the levels of factor B are equal to zero, thus insignificant, against the alternative hypothesis

$H_1^{2}: \beta_j \neq 0$ for at least one $j$ ($j = 1, 2, \dots, n$)

i.e. the effect $\beta_j$ of at least one level of factor B is significant (significantly different from zero).

ANOVA table (TAV)

Variability                    Sum of squares (SS) [1]   Degrees of freedom [2]   Mean square (MS) [1/2]              F test criterion
Variability between rows       $S_1$                     $m - 1$                  $s_1^2 = S_1 / (m - 1)$             $F_1 = s_1^2 / s_r^2$
Variability between columns    $S_2$                     $n - 1$                  $s_2^2 = S_2 / (n - 1)$             $F_2 = s_2^2 / s_r^2$
Residual variability           $S_r$                     $(m - 1)(n - 1)$         $s_r^2 = S_r / ((m - 1)(n - 1))$
Total variability              $S_c$                     $m \cdot n - 1$

Decomposition of the total variability

$S_c = S_1 + S_2 + S_r$

- $S_c$: total variability
- $S_1$: variability between rows, effect of factor A
- $S_2$: variability between columns, effect of factor B
- $S_r$: residual variability
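The two-factor decomposition can be illustrated the same way. The sketch below assumes the standard sums of squares for a design with one observation per cell and obtains the residual part by subtraction, $S_r = S_c - S_1 - S_2$; the function name and the data are hypothetical.

```python
def two_way_anova_no_replication(table):
    """Two-factor ANOVA with one observation per cell.

    table[i][j] is the observation for level i of factor A (rows)
    and level j of factor B (columns).
    """
    m, n = len(table), len(table[0])
    overall_mean = sum(sum(row) for row in table) / (m * n)
    row_means = [sum(row) / n for row in table]
    col_means = [sum(table[i][j] for i in range(m)) / m for j in range(n)]

    s1 = n * sum((r - overall_mean) ** 2 for r in row_means)  # factor A
    s2 = m * sum((c - overall_mean) ** 2 for c in col_means)  # factor B
    sc = sum((y - overall_mean) ** 2 for row in table for y in row)
    sr = sc - s1 - s2  # residual variability from the decomposition

    ms_r = sr / ((m - 1) * (n - 1))
    return {
        "S1": s1, "S2": s2, "Sr": sr, "Sc": sc,
        "F_A": (s1 / (m - 1)) / ms_r,
        "F_B": (s2 / (n - 1)) / ms_r,
    }


# Hypothetical 3 x 3 experiment (m = 3 levels of A, n = 3 levels of B).
observations = [[4, 6, 8], [6, 8, 10], [9, 10, 11]]
tav = two_way_anova_no_replication(observations)
print(tav)
```

Each of the two F values is compared with its own tabulated value: $F_\alpha$ with $(m-1)$ and $(m-1)(n-1)$ degrees of freedom for factor A, and with $(n-1)$ and $(m-1)(n-1)$ degrees of freedom for factor B.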

Investigating the relationships between statistical attributes
- Investigating the relationship between qualitative attributes, e.g. A and B, is called measurement of association.
- Investigating the relationship between quantitative attributes is the subject of regression and correlation analysis.

Investigating the association
- It is based on association (contingency) tables.
- To test for a significant relationship between qualitative attributes we use the $\chi^2$ test of independence:
  $H_0$: the attributes A and B are independent
  $H_1$: the attributes A and B are dependent
- Attribute A has m levels (variants); attribute B has k levels (variants).

Formulation of hypotheses
- Dependence between the attributes shows up as differences in the frequencies.
- Example: we examine whether the choice of package size is affected by the size of the family.
- $H_0$: the choice of package size does not depend on the number of family members.
- $H_1$: the choice of package size is affected by the size of the family.
- The procedure consists in comparing empirical and theoretical frequencies (the frequencies we would expect if the attributes A and B were independent).

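Joint frequencies (frequencies of the second order) $(a_i b_j)$

Rows are package sizes, columns are family sizes. The family-size column labels and the observed counts did not survive extraction.

Package size         $(b_1)$       $(b_2)$       $(b_3)$       Total
up to 100 g $(a_1)$  $(a_1 b_1)$   $(a_1 b_2)$   $(a_1 b_3)$   $(a_1)$
… g $(a_2)$          $(a_2 b_1)$   $(a_2 b_2)$   $(a_2 b_3)$   $(a_2)$
250 g < $(a_3)$      $(a_3 b_1)$   $(a_3 b_2)$   $(a_3 b_3)$   $(a_3)$
Total                $(b_1)$       $(b_2)$       $(b_3)$       $n$

$(a_i)$ and $(b_j)$ are the marginal frequencies; $n$ is the total count of respondents.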

Determination of theoretical frequencies

Based on the theorem on the independence of random events A and B: $P(A \cap B) = P(A) \cdot P(B)$. Thus, if the attributes A and B are independent, then:

$P(a_i b_j) = P(a_i) \cdot P(b_j)$

Estimating the probabilities by relative frequencies gives the theoretical frequencies $(a_i b_j)_o$:

$\frac{(a_i b_j)_o}{n} = \frac{(a_i)}{n} \cdot \frac{(b_j)}{n} \;\Rightarrow\; (a_i b_j)_o = \frac{(a_i) \cdot (b_j)}{n}$

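Calculation of theoretical frequencies

$(a_1 b_1)_o = 70 \cdot 40 / 300 = 9.33$

The table of theoretical frequencies has the same layout as the table of joint frequencies (package size $(a_1)$ to $(a_3)$ in rows, family size $(b_1)$ to $(b_3)$ in columns, total count of respondents $n$); apart from the value 9.33, its entries did not survive extraction.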
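The same calculation can be applied to every cell of a contingency table. In the sketch below only the marginal values 70 and 40 and the total 300 come from the slide; the observed counts themselves are hypothetical and were chosen so that the first row total, the first column total and the overall total match those three numbers.

```python
def theoretical_frequencies(observed):
    """Expected counts under independence: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts: rows are package sizes a1..a3,
# columns are family sizes b1..b3. Only (a1) = 70, (b1) = 40 and
# n = 300 are taken from the slide.
observed = [
    [20, 30, 20],
    [15, 50, 65],
    [5, 40, 55],
]
expected = theoretical_frequencies(observed)
print(round(expected[0][0], 2))  # 70 * 40 / 300
```

The theoretical frequencies keep the marginal totals of the observed table, which is a quick sanity check on the calculation.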

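Calculation of the test criterion and decision

The test criterion compares the empirical and theoretical frequencies:

$\chi^2_{calc} = \sum_{i=1}^{m} \sum_{j=1}^{k} \frac{\bigl((a_i b_j) - (a_i b_j)_o\bigr)^2}{(a_i b_j)_o}$

If $\chi^2_{calc} > \chi^2_\alpha$ for significance level $\alpha$ and $(m - 1)(k - 1)$ degrees of freedom, $H_0$ is rejected => the attributes A and B are dependent.

In our case this means that the number of family members significantly affects the choice of package size. Further, we should measure the strength of the dependence.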
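A minimal sketch of the whole test, assuming the usual Pearson $\chi^2$ criterion (the sum over all cells of the squared difference between empirical and theoretical frequency, divided by the theoretical frequency). The observed counts are hypothetical, and the critical value $\chi^2_{0.05}(4) = 9.488$ is taken from a standard table rather than computed.

```python
def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for an m x k table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            theoretical = row_totals[i] * col_totals[j] / n
            chi2 += (count - theoretical) ** 2 / theoretical

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts: package size (rows) by family size (columns).
counts = [
    [20, 30, 20],
    [15, 50, 65],
    [5, 40, 55],
]
chi2_calc, dof = chi_square_independence(counts)

# Assumed tabulated critical value for alpha = 0.05 and 4 degrees of freedom.
CHI2_CRIT = 9.488
verdict = "reject H0" if chi2_calc > CHI2_CRIT else "do not reject H0"
print(round(chi2_calc, 2), dof, verdict)
```

For a table whose rows are exactly proportional to each other the statistic is zero, which corresponds to perfect agreement between empirical and theoretical frequencies.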