1 HYPOTHESIS TESTING: ABOUT MORE THAN TWO (K) INDEPENDENT POPULATIONS.

Presentation transcript:

1 HYPOTHESIS TESTING: ABOUT MORE THAN TWO (K) INDEPENDENT POPULATIONS

2 ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
Analysis of variance is used for two different purposes:
1. To estimate and test hypotheses about population variances
2. To estimate and test hypotheses about population means
We are concerned here with the latter use.

3 $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: Not all the $\mu_i$ are equal.
Assumptions:
We have k independent samples, one from each of k populations.
Each population has a normal distribution with unknown mean $\mu_i$.
All of the populations have the same (unknown) standard deviation $\sigma$.

4
Treatment   1      2      3      ...   k
            x_11   x_12   x_13   ...   x_1k
            x_21   x_22   x_23   ...   x_2k
            x_31   x_32   x_33   ...   x_3k
Total       T.1    T.2    T.3    ...   T.k    T..
Mean        x̄.1    x̄.2    x̄.3    ...   x̄.k    x̄..

5 The Total Sum of Squares (SST)
The Within Groups Sum of Squares (SSW)
The Among Groups Sum of Squares (SSA)
SST = SSA + SSW
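The identity SST = SSA + SSW can be checked numerically. The sketch below assumes the usual definitions, since the slide's own formulas are not preserved in the transcript: SST sums squared deviations of every observation from the grand mean, SSW sums squared deviations from each group's own mean, and SSA sums n_j times the squared deviation of each group mean from the grand mean. The three small groups are invented for illustration.

```python
# Hypothetical data: three treatment groups of unequal size (not from the slides).
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0],
]


def sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a one-way layout."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Total: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Within: every observation against its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Among: group means against the grand mean, weighted by group size.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    return sst, ssa, ssw


sst, ssa, ssw = sums_of_squares(groups)
print(f"SST={sst:.3f} SSA={ssa:.3f} SSW={ssw:.3f}")
```

Computing SSW directly rather than by subtraction makes the identity a genuine check instead of a definition.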

6 Within groups mean square: $MSW = SSW / (N - k)$
Among groups mean square: $MSA = SSA / (k - 1)$
Variance Ratio (F): $F = MSA / MSW$

7 ANOVA TABLE
Source           SS    df    MS    F (VR)
Among samples    SSA   k-1   MSA   MSA/MSW
Within samples   SSW   N-k   MSW
Total            SST   N-1

8 Testing for Significant Differences Between Individual Pairs of Means
Whenever the analysis of variance leads to a rejection of the null hypothesis of no difference among population means, the question naturally arises of just which pairs of means are different. Over the years several procedures for making individual comparisons have been suggested. The oldest procedure, and perhaps the one most widely used in the past, is the Least Significant Difference (LSD) procedure.
Procedures: LSD (Least Significant Difference), Tukey, Bonferroni, Sidak, Dunnett's C, Dunnett's T3

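9 Least Significant Difference (LSD)
When sample sizes are equal ($n_1 = n_2 = n_3 = \dots = n_k = n$):
$LSD = t_{1-\alpha/2,\,N-k} \sqrt{2\,MSW / n}$
When sample sizes are not equal ($n_1 \neq n_2 \neq n_3 \neq \dots \neq n_k$):
$LSD = t_{1-\alpha/2,\,N-k} \sqrt{MSW\,(1/n_i + 1/n_j)}$
Two means differ (p<0.05) when $|\bar{x}_i - \bar{x}_j| \geq LSD$.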
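A minimal sketch of the LSD threshold, assuming the standard form LSD = t * sqrt(MSW * (1/n_i + 1/n_j)), which reduces to t * sqrt(2 * MSW / n) when the sizes are equal. The function name `lsd` is hypothetical. MSW = 1.532 on 27 degrees of freedom is taken from the worked example in this transcript, 2.052 is the tabulated two-sided 5% point of t with 27 degrees of freedom, and the group sizes are assumptions made for illustration.

```python
import math


def lsd(t_crit, msw, n_i, n_j):
    """Least significant difference for comparing the means of groups i and j."""
    return t_crit * math.sqrt(msw * (1.0 / n_i + 1.0 / n_j))


# Unequal sizes (5 and 6 are assumed, not stated in the slides).
threshold = lsd(t_crit=2.052, msw=1.532, n_i=5, n_j=6)
print(round(threshold, 3))

# Equal sizes: the same function gives t * sqrt(2 * MSW / n).
equal = lsd(t_crit=2.052, msw=1.532, n_i=6, n_j=6)
print(round(equal, 3))
```

Passing the critical value in as a parameter keeps the sketch free of any statistics library; in practice it would come from a t table or a t-distribution quantile function.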

10 Example
In a study of the effect of glucose on insulin release, specimens of pancreatic tissue from experimental animals were randomly assigned to be treated with one of five different stimulants. Later, the amount of insulin released was determined. The experimenters wished to know whether they could conclude that there is a difference among the five treatments with respect to the mean amount of insulin released. The resulting measurements of the amount of insulin released following treatment are displayed in the table. The five sets of observed data constitute five independent samples from the respective populations. Each of the populations from which the samples come is normally distributed with mean $\mu_i$ and variance $\sigma_i^2$. Each population has the same variance.

11 [Table: amount of insulin released, by stimulant, with column totals and means]

12 $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_a$: Not all the $\mu_i$ are equal.
$SSA = SST - SSW = 162.543 - 41.357 \approx 121.19$

13 ANOVA TABLE
Source           Sum of Squares   df   Mean Square   F        Sig.
Between Groups   121.185          4    30.296        19.779   .000
Within Groups    41.357           27   1.532
Total            162.543          31
MSW = SSW/27 = 41.357/27 = 1.532
MSA = SSA/(5-1) = 121.185/4 = 30.296
F = MSA/MSW = 30.296/1.532 ≈ 19.78 (19.779 in the table, from unrounded mean squares)
We conclude that not all population means are equal.
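The mean squares and the variance ratio can be recomputed from the sums of squares and degrees of freedom reported in the table. This is only an arithmetic check of the reported figures; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom as reported in the ANOVA table.
ssa, df_among = 121.185, 4     # between (among) groups, k - 1 = 5 - 1
ssw, df_within = 41.357, 27    # within groups, N - k

msa = ssa / df_among           # among groups mean square
msw = ssw / df_within          # within groups mean square
f_ratio = msa / msw            # variance ratio

print(f"MSA={msa:.3f} MSW={msw:.3f} F={f_ratio:.3f}")
```

Dividing the unrounded mean squares gives the F value shown in the table; dividing the rounded values 30.296 and 1.532 gives a slightly smaller figure.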

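14 Since the sample sizes are not equal ($n_1 \neq n_2 \neq n_3 \neq n_4 \neq n_5$), reject $H_0$ if the absolute difference between the two sample means is at least the LSD for that pair.
Hypothesis        Difference vs. LSD   Statistical Decision
H0: μ1 = μ[...]   [...] < 1.538        accept H0
H0: μ1 = μ[...]   [...] ≥ 1.538        reject H0
H0: μ4 = μ[...]   [...] < 1.314        accept H0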

17 KRUSKAL-WALLIS ONE-WAY ANOVA
When the assumptions underlying one-way ANOVA are not met, that is, when the populations from which the samples are drawn are not normally distributed with equal variances, or when the data for analysis consist only of ranks, a nonparametric alternative to the one-way analysis of variance may be used to test the hypothesis of equal location parameters.

18 The application of the test involves the following steps:
1. The $n_1, n_2, \dots, n_k$ observations from the k groups are combined into a single series of size n and arranged in order of magnitude from smallest to largest. The observations are then replaced by ranks.
2. The ranks assigned to observations in each of the k groups are added separately to give k rank sums.
3. The test statistic is computed:
$KW = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$
where k is the number of groups, $n_j$ is the number of observations in the jth group, and $R_j$ is the sum of ranks in the jth group.

19 4. When there are three groups and five or fewer observations in each group, the significance of the computed KW is determined by using special tables. When there are more than five observations in one or more of the groups, KW is compared with the tabulated values of $\chi^2$ with k-1 df.
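Steps 1 to 3 can be sketched in plain Python, assuming the standard statistic KW = 12 / (n (n + 1)) * sum(R_j^2 / n_j) - 3 (n + 1). The function name `kruskal_wallis` is hypothetical, tied observations receive the average of the ranks they span, no tie correction is applied to KW, and the data are invented rather than taken from the slides.

```python
def kruskal_wallis(groups):
    """Return (rank_sums, KW) using average ranks for tied observations."""
    # Step 1: pool all observations, remembering which group each came from.
    pooled = sorted((x, g_idx) for g_idx, g in enumerate(groups) for x in g)
    n = len(pooled)
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Find the run of tied values beginning at position `start`.
        end = start
        while end + 1 < n and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + 1 + end + 1) / 2.0  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[pos] = avg_rank
        start = end + 1
    # Step 2: add the ranks within each group.
    rank_sums = [0.0] * len(groups)
    for (_, g_idx), r in zip(pooled, ranks):
        rank_sums[g_idx] += r
    # Step 3: compute the test statistic.
    kw = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    return rank_sums, kw


# Hypothetical reaction times for three groups (not the values from the slides).
groups = [[1.1, 2.3, 3.0], [4.2, 5.5, 6.1], [7.4, 8.8, 9.0]]
rank_sums, kw = kruskal_wallis(groups)
print(rank_sums, round(kw, 3))
```

Whether the resulting KW is referred to special tables or to the chi-square distribution with k-1 df depends on the group sizes, as described in step 4.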

20 Determining which groups are significantly different
Like the one-way ANOVA, the Kruskal-Wallis test is an overall test: a significant result does not indicate where the differences among the groups lie. To determine which groups are significantly different from one another, it is necessary to undertake multiple comparisons.
p<0.05

21 Example
The effects of two drugs on reaction time to a certain stimulus were studied in three groups of experimental animals. Group III served as a control, while the animals in group I were treated with drug A and those in group II were treated with drug B prior to the application of the stimulus. The table shows the reaction times in seconds of 13 animals. Can we conclude that the three populations represented by the three samples differ with respect to reaction time?
$H_0$: The population distributions are all identical.
$H_a$: At least one of the populations tends to exhibit larger values than at least one of the other populations.

22 [Table: reaction times and ranks for Groups I, II and III, with rank sums $R_i$]
$KW_{(5,4,4;\,0.05)} = 5.617 < KW_{cal}$, so p<0.05; reject $H_0$.

23 Multiple Comparisons Table
Groups   Statistical Decision
[...]    p<[...]
[...]    p<[...]
[...]    p<0.05

24 r×c Chi-Square Test
We can use the chi-square test to compare frequencies or proportions in two or more groups. The classification of a set of entities according to two criteria can be shown by a table in which the r rows represent the various levels of one criterion of classification and the c columns represent the various levels of the second criterion. Such a table is generally called a contingency table. We will be interested in testing the null hypothesis that, in the population, the two criteria of classification are independent (that is, not associated).

25
                   Second Criterion
First Criterion    1      2      ...   c      Total
1                  O_11   O_12   ...   O_1c   O_1.
2                  O_21   O_22   ...   O_2c   O_2.
...
r                  O_r1   O_r2   ...   O_rc   O_r.
Total              O_.1   O_.2   ...   O_.c   N

26 No more than 20% of the cells should have expected frequencies of less than 5. df = (r-1)(c-1)
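A minimal sketch of the r×c test, assuming the usual Pearson statistic: the sum over all cells of (O_ij - E_ij)^2 / E_ij, with E_ij = (row total × column total) / N. The function name `chi_square` and the 2 x 3 table of counts are hypothetical. The sketch also reports the share of cells with an expected count below 5, which is the quantity the 20% rule refers to.

```python
def chi_square(observed):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: (row total * column total) / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2 x 3 table of counts (not from the slides).
observed = [[20, 30, 50], [30, 30, 40]]
chi2, df, expected = chi_square(observed)

# Share of cells whose expected count is below 5.
n_cells = len(expected) * len(expected[0])
share_small = sum(e < 5 for row in expected for e in row) / n_cells
print(round(chi2, 3), df, share_small)
```

The computed statistic is then compared with the tabulated chi-square value for (r-1)(c-1) degrees of freedom.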

27 Example
A research team studying the relationship between blood type and severity of a certain condition in a population collected data on 1500 subjects, as displayed in the contingency table below. The researchers wished to know whether these data were compatible with the hypothesis that severity of condition and blood type are independent.
[Table: Severity of Condition (Absent, Mild, Severe, Total) by Blood Type (A, B, AB, O, Total)]

28 Severity of condition by Blood Type
% within severity
Severity   A          B       AB     O       Total
Absent     [...].1%   16.0%   6.8%   36.1%   100.0%
Mild       [...].9%   21.0%   7.6%   29.5%   100.0%
Severe     37.3%      12.0%   9.3%   41.3%   100.0%
Total      [...].0%   16.1%   7.0%   35.9%   100.0%
Expected Count
Severity   A       B       AB      O       Total
Absent     541.2   213.0   92.4    473.4   1320.0
Mild       43.1    16.9    7.4     37.7    105.0
Severe     30.8    12.1    5.3     26.9    [...]
Total      615.0   242.0   105.0   538.0   1500.0
0 cells (.0%) have expected count less than 5. The minimum expected count is 5.25.
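The expected counts can be rebuilt from the marginal totals alone, which gives a check on the stated minimum of 5.25. In the sketch below the Absent and Mild row totals, the four column totals and N = 1500 are read from the table; the Severe row total is an inference (1500 minus the other two rows), not a value preserved in the transcript.

```python
n = 1500
row_totals = {"Absent": 1320, "Mild": 105}
# Assumption: the Severe total is whatever remains of the 1500 subjects.
row_totals["Severe"] = n - sum(row_totals.values())
col_totals = {"A": 615, "B": 242, "AB": 105, "O": 538}

# Expected count under independence: (row total * column total) / N.
expected = {
    (severity, blood): rt * ct / n
    for severity, rt in row_totals.items()
    for blood, ct in col_totals.items()
}

smallest = min(expected.values())
n_small = sum(e < 5 for e in expected.values())
print(f"minimum expected count = {smallest:.2f}, cells below 5 = {n_small}")
```

The smallest cell is Severe by AB, the combination of the rarest severity level and the rarest blood type.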

29 $H_0$: severity of condition and blood type are independent.
$H_a$: severity of condition and blood type are not independent.
$\chi^2_{(6,\,0.05)} = 12.592 > \chi^2_{calculated}$, so p>0.05; accept $H_0$.
We conclude that these data are compatible with the hypothesis that severity of the condition and blood type are independent.

30 When the sample size is small and the assumption about expected frequencies is not met:
Assumption is violated.
We decide to merge two conditions.

31 After combining mild and severe groups in one group, no more than 20% of the cells have expected frequencies less than 5.

32
                   Value       df   Asymp. Sig. (2-sided)
Chi-Square         12.375(a)   3    [...]
N of Valid Cases   [...]
a. 1 cells (12.5%) have expected count less than 5. The minimum expected count is 3.30.
Reject $H_0$.
If the null hypothesis is rejected, how can we find the group which is different? Which blood type(s) differ from the others?
$\chi^2$ = 5.[...], $\chi^2$ = 0.[...], $\chi^2$ = 0.[...], $\chi^2$ = 7.[...]
Exclude type O from the analysis.

33 p>0.05
Except for blood type O, the distribution of thromboembolism is similar across the other blood types.