SADC Course in Statistics Analysis of Variance for comparing means (Session 11)


Learning Objectives

At the end of this session, you will be able to:
- understand and interpret the components of an anova table for comparing means
- present the results following an anova in terms of an appropriate summary table
- make simple comparisons across pairs of levels of an explanatory categorical variable

Comparing two groups

Recall from Module H2 that the means of two population sub-groups, with respect to a quantitative measurement of interest, can be compared using a t-test.

For example, we could compare the mean poverty levels, or the mean land area owned by households, across urban and rural areas. Or we could compare household size, or the household dependency ratio, across male-headed and female-headed households.

Comparing more than two groups

In the above examples, just two groups were being compared. Can we extend these ideas to a comparison across more than two groups?

This is possible through use of an analysis of variance (anova). We have met an anova already, but the objective and hypothesis are different here!

Objectives addressed

Some examples of questions to be answered:
- Is the average income of households in Malawi the same across its three regions?
- Is the average length of the rainy season the same across different districts?
- Have interventions to control the incidence of malaria (mean number of cases per 1000 of population) been equally effective across three areas in Zambia where controls were put in place?

An example: Paddy data again!

Suppose farmers want to know which variety of rice to grow in order to maximise their yields. There are three varieties to choose from: new improved, old improved and traditional.

The null hypothesis to be tested is
H0: the mean yields are the same across all varieties,
versus the alternative hypothesis
H1: not all variety means are equal, i.e. there is some difference somewhere.
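Written symbolically, the two hypotheses might be stated as follows. The symbols $\mu_N$, $\mu_O$ and $\mu_T$ (population mean yields for the new improved, old improved and traditional varieties) are notation assumed here for illustration; they do not appear in the slides.

```latex
H_0 : \mu_N = \mu_O = \mu_T
\qquad \text{versus} \qquad
H_1 : \mu_i \neq \mu_j \ \text{for at least one pair } (i, j)
```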

Using anova to compare means

As in a simple linear regression, the anova splits the overall variation in y (here y = rice yields) into two components:
- variation due to differences in the variety means
- residual variation, i.e. variation not due to possible variety differences.

H0 is tested by comparing the two components of variation above. A large variance ratio is evidence against H0.
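The split of variation described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own software: the function name `one_way_anova` is hypothetical, and the yields are invented for the example (they are not the Paddy data).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_residual, df_between, df_residual, f_ratio)."""
    all_values = [y for g in groups.values() for y in g]
    grand_mean = mean(all_values)
    # Variation due to differences between the group means
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Residual variation: spread of observations around their own group mean
    ss_residual = sum((y - mean(g)) ** 2 for g in groups.values() for y in g)
    df_between = len(groups) - 1
    df_residual = len(all_values) - len(groups)
    # Variance ratio: between-groups mean square over residual mean square
    f_ratio = (ss_between / df_between) / (ss_residual / df_residual)
    return ss_between, ss_residual, df_between, df_residual, f_ratio


# Hypothetical yields (tonnes per hectare), invented for illustration only
yields = {
    "new improved": [5.8, 6.2, 6.0],
    "old improved": [4.4, 4.6, 4.8],
    "traditional": [2.9, 3.0, 3.1],
}

ss_b, ss_r, df_b, df_r, f = one_way_anova(yields)
print(f"Between: SS={ss_b:.2f} on {df_b} d.f.")
print(f"Residual: SS={ss_r:.2f} on {df_r} d.f.")
print(f"F-ratio = {f:.1f}")
```

The two sums of squares always add up to the total sum of squares about the grand mean, which is why the residual line can be described as the balance.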

Anova table - interpretation

Source      d.f.   S.S.   M.S.   F   Prob.
Variety       2
Residual     33
Total        35

There are 2 d.f. for variety (3 - 1), since it reflects variation between 3 varieties.

The residual M.S. is the balance, or unexplained component, of variation in yields. It represents variation between farmers within varieties.

Anova table - results

Source      d.f.   S.S.   M.S.   F      Prob.
Variety       2                  40.8   0.000
Residual     33
Total        35

The F-ratio of 40.8 on (2, 33) d.f. is highly significant (p-value = 0.000). This indicates strong evidence to reject H0.
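A p-value printed as 0.000 has been rounded to three decimals, so it should be read as p < 0.001 rather than exactly zero. The sketch below checks this for the F-ratio quoted above. It relies on a closed form for the upper tail of the F distribution that holds only when the numerator d.f. equals 2, as it does here; the function name is hypothetical.

```python
def f_upper_tail_2df(f_ratio, df_residual):
    """P(F > f_ratio) for an F distribution with (2, df_residual) d.f.

    Closed form valid ONLY when the numerator d.f. equals 2.
    """
    return (1.0 + 2.0 * f_ratio / df_residual) ** (-df_residual / 2.0)


# F-ratio and d.f. quoted in the anova results
p_value = f_upper_tail_2df(40.8, 33)
print(f"p-value = {p_value:.2e}")
print("Smaller than 0.001:", p_value < 0.001)
```

For other numerator d.f. a statistical package or F-tables would be needed instead.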

Presentation of results

Results are presented in terms of the variety means and their standard errors.

Variety        Mean   Std. error   95% C.I.
New improved                       (5.29, 6.63)
Old improved                       (4.22, 4.87)
Traditional                        (2.65, 3.35)
Overall                            (3.84, 4.28)
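A confidence interval of this kind is typically formed as mean ± t × standard error. The sketch below assumes that each standard error is computed from the anova residual mean square, sqrt(s²/n), and that the 5% two-sided t-value for 33 d.f. (about 2.035, taken from standard t-tables) is used; the slides do not state this explicitly. The inputs in the example are invented, not the Paddy results.

```python
from math import sqrt

# Two-sided 5% point of the t-distribution with 33 d.f. (from standard tables)
T_CRIT_33_DF = 2.035


def mean_ci(group_mean, s2, n, t_crit=T_CRIT_33_DF):
    """Return (std_error, (lower, upper)) for a group mean.

    s2 is the residual mean square from the anova (an assumption of this
    sketch), and n is the number of observations in the group.
    """
    se = sqrt(s2 / n)
    return se, (group_mean - t_crit * se, group_mean + t_crit * se)


# Hypothetical inputs, for illustration only
se, (lower, upper) = mean_ci(group_mean=5.0, s2=0.36, n=9)
print(f"s.e. = {se:.3f}, 95% C.I. = ({lower:.2f}, {upper:.2f})")
```

Groups with fewer observations get wider intervals, which is consistent with the new improved variety having the widest interval in the table.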

Conclusions

The anova results indicate clear evidence that varieties differ in terms of their yields.

The best one is the new improved variety, giving a mean yield of about 6 tonnes per hectare (95% confidence limits 5.3 to 6.6).

The traditional variety does poorly in comparison with the other two varieties, yielding only about 3 tonnes per hectare.

Further comparisons

The anova is only a first step in the analysis. If the F-ratio is significant, proceed further to see where the actual differences occur. Do this using t-tests.

For example, to compare the mean yields for the new and old improved varieties, first calculate:

Difference in means = 1.416

Standard error of the difference, given by
$\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{s^2 \left[ (1/4) + (1/17) \right]} = 0.365$

where $n_i$ is the number of observations for variety $i$, and $s^2$ is the residual mean square from the anova.

t-test for comparing means

Then find the t-statistic, given by
t = (difference in means) / (std. error of difference)

Here, t = 1.416/0.365 = 3.88

Compare this with t-tables with 33 d.f. (since the std. error of the difference is based on the anova residual mean square).

The result is significant at the 0.1% significance level.

Conclude: strong evidence of a difference.
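The two steps of this comparison can be sketched as follows. The difference (1.416), its standard error (0.365) and the group sizes (4 and 17) are the values quoted in the slides; the residual mean square of 0.43 used to demonstrate the standard-error formula is an illustrative assumption, not a figure from the source. Function names are hypothetical.

```python
from math import sqrt


def se_of_difference(s2, n1, n2):
    """Standard error of a difference between two group means.

    s2 is the residual mean square from the anova; n1 and n2 are the
    numbers of observations in the two groups.
    """
    return sqrt(s2 * (1.0 / n1 + 1.0 / n2))


def t_statistic(difference, se_diff):
    """t = difference in means / standard error of the difference."""
    return difference / se_diff


# Values quoted in the slides
t = t_statistic(1.416, 0.365)
print(round(t, 2))  # 3.88

# Demonstrating the standard-error formula; s2 = 0.43 is an assumed value
se_demo = se_of_difference(s2=0.43, n1=4, n2=17)
print(f"s.e. of difference with assumed s2: {se_demo:.3f}")
```

The resulting t-value is then compared with t-tables on the residual d.f. (33 here), not on the d.f. of a two-sample t-test using only the two groups involved.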

Other comparisons

There may be other comparisons of interest, for example comparing the mean of the traditional variety with the means of the improved varieties.

A t-test may again be used, by computing the t-statistic as the value of the comparison divided by its standard error, and comparing the result with a t-distribution with d.f. = residual d.f.
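One plausible way to write such a comparison is sketched below. It assumes the two improved varieties are given equal weight, which may differ from the form used in the original slide. The symbols $\bar{y}_N$, $\bar{y}_O$, $\bar{y}_T$ (sample means) and $n_N$, $n_O$, $n_T$ (numbers of observations) are notation introduced here for illustration; $s^2$ is the residual mean square from the anova.

```latex
L = \bar{y}_T - \tfrac{1}{2}\left( \bar{y}_N + \bar{y}_O \right)

\mathrm{s.e.}(L) = \sqrt{ s^2 \left( \frac{1}{n_T} + \frac{1}{4 n_N} + \frac{1}{4 n_O} \right) }

t = \frac{L}{\mathrm{s.e.}(L)}
```

The factors of 1/4 arise because each improved-variety mean enters the comparison with weight 1/2, and the variance of a weighted mean scales with the square of its weight.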

Practical work follows to ensure learning objectives are achieved…