Chapter 3 Comparison of groups.

Slides:



Advertisements
Similar presentations
Copyright 2004 David J. Lilja1 Comparing Two Alternatives Use confidence intervals for Before-and-after comparisons Noncorresponding measurements.
Advertisements

BPS - 5th Ed. Chapter 241 One-Way Analysis of Variance: Comparing Several Means.
1 Chapter 10 Comparisons Involving Means  1 =  2 ? ANOVA Estimation of the Difference between the Means of Two Populations: Independent Samples Hypothesis.
ANOVA: Analysis of Variation
Comparing k Populations Means – One way Analysis of Variance (ANOVA)
1 1 Slide © 2009, Econ-2030 Applied Statistics-Dr Tadesse Chapter 10: Comparisons Involving Means n Introduction to Analysis of Variance n Analysis of.
Chapter 9 - Lecture 2 Computing the analysis of variance for simple experiments (single factor, unrelated groups experiments).
Psy B07 Chapter 1Slide 1 ANALYSIS OF VARIANCE. Psy B07 Chapter 1Slide 2 t-test refresher  In chapter 7 we talked about analyses that could be conducted.
HAWKES LEARNING SYSTEMS math courseware specialists Copyright © 2010 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Chapter 14 Analysis.
1 1 Slide © 2006 Thomson/South-Western Slides Prepared by JOHN S. LOUCKS St. Edward’s University Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
1 1 Slide © 2005 Thomson/South-Western Chapter 13, Part A Analysis of Variance and Experimental Design n Introduction to Analysis of Variance n Analysis.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 13 Experimental Design and Analysis of Variance nIntroduction to Experimental Design.
1 1 Slide Analysis of Variance Chapter 13 BA 303.
T tests comparing two means t tests comparing two means.
Analysis of Variance ( ANOVA )
ANOVA (Analysis of Variance) by Aziza Munir
1 Chapter 13 Analysis of Variance. 2 Chapter Outline  An introduction to experimental design and analysis of variance  Analysis of Variance and the.
Testing Hypotheses about Differences among Several Means.
CHAPTER 11 SECTION 2 Inference for Relationships.
Analysis of Variance 1 Dr. Mohammed Alahmed Ph.D. in BioStatistics (011)
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 14 Comparing Groups: Analysis of Variance Methods Section 14.1 One-Way ANOVA: Comparing.
Chapter Seventeen. Figure 17.1 Relationship of Hypothesis Testing Related to Differences to the Previous Chapter and the Marketing Research Process Focus.
Comparing k Populations Means – One way Analysis of Variance (ANOVA)
Copyright © Cengage Learning. All rights reserved. 12 Analysis of Variance.
Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal
Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd.. 1 Slide Slide Slides Prepared by Juei-Chao Chen Fu Jen Catholic University Slides Prepared.
The p-value approach to Hypothesis Testing
1/54 Statistics Analysis of Variance. 2/54 Statistics in practice Introduction to Analysis of Variance Analysis of Variance: Testing for the Equality.
Chapter 15 Analysis of Variance. The article “Could Mean Platelet Volume be a Predictive Marker for Acute Myocardial Infarction?” (Medical Science Monitor,
Comparing k Populations Means – One way Analysis of Variance (ANOVA)
ANOVA: Analysis of Variation
Chapter 6 Hypothesis tests.
Comparing Multiple Groups:
ANOVA: Analysis of Variation
Virtual University of Pakistan
ANOVA: Analysis of Variation
Dependent-Samples t-Test
Chapter 12 Chi-Square Tests and Nonparametric Tests
Statistical models, estimation, and confidence intervals
Chapter 3 Comparison of groups.
DTC Quantitative Methods Bivariate Analysis: t-tests and Analysis of Variance (ANOVA) Thursday 20th February 2014  
Chapter 10 Two-Sample Tests and One-Way ANOVA.
ANOVA: Analysis of Variation
Model validation and prediction
Lecture Slides Elementary Statistics Twelfth Edition
i) Two way ANOVA without replication
Applied Business Statistics, 7th ed. by Ken Black
Comparing Three or More Means
Basic Practice of Statistics - 5th Edition
Statistical Data Analysis - Lecture10 26/03/03
Statistics Analysis of Variance.
Statistics for Business and Economics (13e)
Comparing Multiple Groups: Analysis of Variance ANOVA (1-way)
Comparing k Populations
Econ 3790: Business and Economic Statistics
Chapter 6 Hypothesis tests.
Comparing Three or More Means
Chapter 11 Analysis of Variance
Chapter 9 Hypothesis Testing.
Lesson Comparing Two Means.
Comparing Populations
One-Way Analysis of Variance
Hypothesis Testing: The Difference Between Two Population Means
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Analysing Means I: (Extending) Analysis.
Chapter 10 – Part II Analysis of Variance
ANalysis Of VAriance Lecture 1 Sections: 12.1 – 12.2
STATISTICS INFORMED DECISIONS USING DATA
F test for Lack of Fit The lack of fit test..
Presentation transcript:

Chapter 3 Comparison of groups

Comparison of groups The purpose of an analysis is often to compare different groups of data. Suppose, for example, that a meat scientist wants to examine the effect of three different storage conditions on the tenderness of meat. For that purpose 24 pieces of meat have been collected and allocated into three storage (or treatment) groups, each of size eight. In each group all eight pieces of meat are stored under the same conditions, and after some time the tenderness of each piece of meat is measured. The main question is whether the different storage conditions affect the tenderness: are the observed differences between the groups due to a real effect, or due to random variation?

Example: Parasite counts for salmons An experiment with two difference salmon stocks, from River Conon in Scotland and from River Ätran in Sweden, was carried out as follows. Thirteen fish from each stock were infected and after four weeks the number of a certain type of parasites was counted for each of the 26 fish with the following results: The purpose of the study was to investigate if the number of parasites during an infection is the same for the two salmon stocks.

Example: Parasite counts for salmons

Example: Parasite counts for salmons The mean and sample standard deviations are computed to The summary statistics and the boxplots tell the same story: The observed parasite counts are generally higher for the Ätran group compared to the Conon group, indicating that Ätran salmons are more susceptible to parasites. The purpose of the statistical analysis is to clarify whether the observed difference is caused by an actual difference between the stocks or by random variation.

Example: Dung decomposition An experiment with dung from heifers was carried out in order to explore the influence of antibiotics on the decomposition of dung organic material. As part of the experiment, 36 heifers were divided into six groups. All heifers were fed a standard feed, and antibiotics of different types (alpha-Cypermethrin, Enrofloxacin, Fenbendazole, Ivermectin, Spiramycin) were added to the feed for heifers in five of the groups. No antibiotics were added for heifers in the remaining group (the control group). For each heifer, a bag of dung was dug into the soil, and after eight weeks the amount of organic material was measured for each bag.

Example: Dung decomposition

Example: Dung decomposition The observations together with group means (solid lines) and the total mean (dashed line) are shown on the left, and parallel boxplots are shown on the right panel. The amount of organic material appears to be lower for the control group compared to any of the five types of antibiotics, suggesting that decomposition is generally inhibited by antibiotics. However, there is variation from group to group (between-group variation) as well as a relatively large variation within each group (within-group variation). The within-group variation seems to be roughly the same for all types, except perhaps for spiramycin, but that is hard to evaluate because there are fewer observations in that group.

Example: Dung decomposition The sample means and the sample standard deviations are computed for each group separately. We find the same indications as we did in the boxplots. On average the amount of organic material is lower for the control group than for the antibiotics groups, and except for the spiramycin group the standard deviations are roughly the same in all groups.

Group means and SD’s Consider the situation with n observations split into k groups. Label the groups 1 through k. Let g(i) denote the group for observation i. Then g(i) has one of the values 1,…,k. The sample mean and sample standard deviation in group j are given by

Residual variance The residual variance s2 can also be computed as a weighted average of the group variance estimates, sj2, as follows Note that the group variance sj2 is assigned the weight nj−1, the denominator in (3.1). The summation of (3.5) is called the residual sum of squares.

Within-group variation Within-group variation refers to the variation in each of the groups. It is illustrated by the vertical deviations between the observations and their corresponding group means. The residual sum of squares is given by SSe describes the within-group variation since it measures squared deviations between the observations and the group means. The residual degrees of freedom is dfe = n−k, so the residual mean squares is

Between-group variation Between-group variation refers to differences between the groups; for example, deviation between the different treatments in the antibiotics example. It is illustrated by the vertical differences between the group means (horizontal line segments) and the overall mean (dashed line): When we examine the between-group variation, the k group means essentially act as our “observations”; hence, dfgrp = k−1, and the “average” squared difference MSgrp per group becomes the between-group variation.

Analysis of variance If there is no difference between any of the groups, then the group averages will be of similar size and be similar to the overall mean. Hence, MSgrp will be “small”. On the other hand, if groups 1 and 2, say, are different, then the group averarages will be somewhat different; hence, MSgrp will be “large”. “Small” and “large” should be measured relative to the within-group variation, and MSgrp is thus standardized with MSe. We use Large values of Fobs are critical; that is, not in agreement with the assumption (hypothesis) that there is no different between any of the groups.

Analysis of variance This disagreement is equivalent to Fobs being larger, and the corresponding p-value are often inserted in an analysis of variance table. The p-value of being smaller than 0.05 indicates significance evidence toward the disagreement between groups.

Example: Dung decomposition We conclude that there is strong evidence of group differences. Subsequently, we need to quantify the conclusion further: Which groups are different and how large are the differences?

Paired samples and dependence Paired samples occur, for example, if two measurements are collected for each subject in the sample under different circumstances (treatments), or if measurements are taken on pairs of related observational units such as twins. In dietary studies with two diets under investigation, for example, it is common that the subjects try one diet in one period and the other diet in another period; thus, they are “dependent.” As a consequence, the between-group variation is confused with the within-group variation, making the analysis of variance inappropriate for paired data.

Independent samples It is important to distinguish paired samples from unpaired—or independent—samples, because different methods of analysis are appropriate. For unpaired samples like the dung decomposition data, we impose an assumption of independence between all observations. This means that the observations do not share information. This setup with independent samples corresponds to a one-way analysis of variance, or one-way ANOVA. It is called “analysis of variance” because different sources of variation are compared and “one-way” because only one factor—the treatment or grouping—is varied in the experiment.

Summary of grouped data Two independent samples where the samples correspond to two different groups or treatments and can be assumed to be independent. Independent samples where the samples correspond to k different groups or treatments and can be assumed to be independent. Paired samples where the observations consist of pairs of measurements, with the observations in a pair corresponding to two different groups or treatments. The first case is a special case of the second, but we emphasize it anyway, for two reasons. First, it is very important to distinguish two independent samples from paired samples because different analysis methods are appropriate.

Example: Word count The attach() command makes it possible to use the variables GENDER, GROUP, COUNT with reference to the data frame. > attach(Data) The following command produces boxplots grouped by GENDER. > boxplot(COUNT ~ GENDER, col="green", ylab="Word counts per day")

Example: Word count COUNT on the left-hand side of ~ in the call to aov() is modeled as grouped data indicated by GENDER on the right-hand side. The output will then list the analysis of variance table, and the group means. > outcome <- aov(COUNT ~ GENDER) > summary(outcome) > Means <- model.tables(outcome, "means") > Means