Treatment comparisons

Presentation transcript:

Treatment comparisons
ANOVA can determine if there are differences among the treatments, but what is the nature of those differences?
- Are the treatments measured on a continuous scale?
  - Look at response surfaces (linear regression, polynomials)
- Is there an underlying structure to the treatments?
  - Compare groups of treatments using orthogonal contrasts or a limited number of preplanned mean comparison tests
- Are the treatments unstructured?
  - Use appropriate mean comparison tests

Comparison of Means
- Pairwise Comparisons
  - Least Significant Difference (LSD)
- Simultaneous Confidence Intervals
  - Dunnett Test (making all comparisons to a control)
  - Bonferroni Inequality
- Other Multiple Comparisons - "Data Snooping"
  - Fisher's Protected LSD (FPLSD)
  - Student-Newman-Keuls test (SNK)
  - Tukey's honestly significant difference (HSD)
  - Waller and Duncan's Bayes LSD (BLSD)
  - False Discovery Rate Procedure
- Often misused - intended to be used only for data from experiments with unstructured treatments

Multiple Comparison Tests
- Fixed Range Tests: a constant value is used for all comparisons
  - Application
    - Hypothesis Tests
    - Confidence Intervals
- Multiple Range Tests: values used for comparison vary across a range of means

Variety Trials
- In a breeding program, you need to examine large numbers of selections and then narrow to the best
- In the early stages, selection is based on single plants or single rows of related plants. Seed and space are limited, so it is difficult to have replication
- When numbers have been reduced and there is sufficient seed, you can conduct replicated yield trials, and you want to be able to "pick the winner"

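Least Significant Difference
- Calculate a t for testing the difference between two means: $t = (\bar{Y}_1 - \bar{Y}_2) / s_{\bar{d}}$, where $s_{\bar{d}}$ is the standard error of the difference
- Any difference for which $t > t_\alpha$ would be declared significant
- Further, $t_\alpha s_{\bar{d}}$ is the smallest difference for which significance would be declared; therefore $LSD = t_\alpha s_{\bar{d}}$
- Or, with equal replication, where r is the number of observations forming each mean: $LSD = t_\alpha \sqrt{2\,MSE / r}$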

Do's and Don'ts of using LSD
- LSD is a valid test when
  - making comparisons planned in advance of seeing the data (this includes the comparison of each treatment with the control)
  - comparing adjacent ranked means
- The LSD should not (unless F for treatments is significant) be used for
  - making all possible pairwise comparisons
  - making more comparisons than df for treatments

Pick the Winner
- A plant breeder wanted to measure resistance to stem rust for six wheat varieties
  - planted 5 seeds of each variety in each of four pots
  - placed the 24 pots randomly on a greenhouse bench
  - inoculated with stem rust
  - measured seed yield per pot at maturity

Ranked Mean Yields (g/pot)

Variety   Rank   Mean yield (Yi)   Difference (Yi-1 - Yi)
F         1      95.3
D         2      94.0              1.3
E         3      75.0              19.0
B         4      69.0              6.0
A         5      50.3              18.7
C         6      24.0              26.3

ANOVA

Source    df   MS         F
Variety   5    2,976.44   24.80
Error     18   120.00

Compute LSD at 5% and 1%, using $LSD = t_\alpha \sqrt{2\,MSE / r}$:
- $LSD_{0.05} = 2.101 \sqrt{(2 \times 120) / 4} = 16.27$
- $LSD_{0.01} = 2.878 \sqrt{(2 \times 120) / 4} = 22.29$
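As a quick check of the arithmetic on this slide, here is a minimal sketch in Python. The function name `lsd` is only illustrative, and the t critical values (2.101 and 2.878 for 18 error df) are taken from the slide rather than computed.

```python
import math

def lsd(t_crit, mse, r):
    """Least significant difference for two means of r observations each."""
    return t_crit * math.sqrt(2 * mse / r)

# Mean squares and replication from the ANOVA table on the slide.
ms_variety, mse, r = 2976.44, 120.00, 4
f_ratio = ms_variety / mse

# t critical values for 18 error df, as quoted on the slide.
lsd_05 = lsd(2.101, mse, r)
lsd_01 = lsd(2.878, mse, r)

print(f"F = {f_ratio:.2f}")
print(f"LSD(0.05) = {lsd_05:.2f}, LSD(0.01) = {lsd_01:.2f}")
```

The printed values should agree with the F ratio and the two LSD values shown in the table.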

Back to the data...

$LSD_{\alpha=0.05} = 16.27$
$LSD_{\alpha=0.01} = 22.29$

Variety   Rank   Mean yield (Yi)   Difference (Yi-1 - Yi)
F         1      95.3
D         2      94.0              1.3
E         3      75.0              19.0*
B         4      69.0              6.0
A         5      50.3              18.7*
C         6      24.0              26.3**
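The starring of adjacent differences can be reproduced with a short sketch. The variable and function names are illustrative; the means and LSD values are those given on the slides.

```python
# Ranked variety means (g/pot) and LSD thresholds from the slides.
means = [("F", 95.3), ("D", 94.0), ("E", 75.0), ("B", 69.0), ("A", 50.3), ("C", 24.0)]
LSD_05, LSD_01 = 16.27, 22.29

def flag(diff):
    """Return '**' if significant at 1%, '*' at 5%, '' otherwise."""
    if diff > LSD_01:
        return "**"
    if diff > LSD_05:
        return "*"
    return ""

flags = []
# Compare each mean with the next lower ranked mean.
for (hi_name, hi_mean), (lo_name, lo_mean) in zip(means, means[1:]):
    diff = round(hi_mean - lo_mean, 1)
    flags.append(flag(diff))
    print(f"{hi_name} - {lo_name}: {diff:5.1f}{flags[-1]}")
```

Only adjacent ranked means are compared here, which is one of the uses the slides describe as valid for the LSD.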

Pairwise Comparisons
- If you have 10 varieties and want to look at all possible pairwise comparisons
  - that would be t(t-1)/2 or 10(9)/2 = 45
  - that's a few more than t-1 df = 9
- LSD would only allow 9 comparisons
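The count of pairwise comparisons can be confirmed by enumeration. This is a small sketch; the helper name `n_pairwise` is an assumption made for illustration.

```python
from itertools import combinations

def n_pairwise(t):
    """Number of distinct pairs among t treatments: t(t-1)/2."""
    return t * (t - 1) // 2

t = 10
# Enumerate every unordered pair of variety indices 1..t.
pairs = list(combinations(range(1, t + 1), 2))

print(n_pairwise(t), len(pairs), t - 1)  # pairs by formula, by enumeration, treatment df
```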

Type I vs Type II Errors
- Type I error: saying something is different when it is really the same (Paranoia)
  - the rate at which this type of error is made is the significance level
- Type II error: saying something is the same when it is really different (Sloth)
  - the probability of committing this type of error is designated $\beta$
  - the probability that a comparison procedure will pick up a real difference is called the power of the test and is equal to $1 - \beta$
- Type I and Type II error rates are inversely related to each other
- For a given Type I error rate, the rate of Type II error depends on
  - sample size
  - variance
  - true differences among means

Nobody likes to be wrong...
- Protection against Type I is choosing a significance level
- Protection against Type II is a little harder because
  - it depends on the true magnitude of the difference, which is unknown
  - you must choose a test with sufficiently high power
- Reasons for not using LSD for more than t-1 comparisons
  - the chance for a Type I error increases dramatically as the number of treatments increases
  - for example, with only 20 means, you could make a Type I error 95% of the time (in 95/100 experiments)

Comparisonwise vs Experimentwise Error
- Comparisonwise error rate ($\alpha = \alpha_C$)
  - measures the proportion of all differences that are expected to be declared real when they are not
- Experimentwise error rate ($\alpha_E$)
  - the risk of making at least one Type I error among the set (family) of comparisons in the experiment
  - measures the proportion of experiments in which one or more differences are falsely declared to be significant
  - the probability of being wrong increases as the number of means being compared increases

Comparisonwise vs Experimentwise Error
- Experimentwise error rate ($\alpha_E$)
  - Probability of no Type I errors $= (1 - \alpha_C)^x$, where x = number of pairwise comparisons
  - Max x = t(t-1)/2, where t = number of treatments
  - Therefore, probability of at least one Type I error: $\alpha_E = 1 - (1 - \alpha_C)^x$
- Comparisonwise error rate: $\alpha_C = 1 - (1 - \alpha_E)^{1/x}$
- If t = 10, then max x = 45 and, for $\alpha_C = 0.05$, $\alpha_E = 90\%$
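The two formulas on this slide can be sketched as a pair of functions. The names are illustrative, and the comparisonwise rate of 0.05 used for the t = 10 example is an assumption that reproduces the 90% figure. Note that the formula treats the x comparisons as independent.

```python
def experimentwise(alpha_c, x):
    """Probability of at least one Type I error in x independent comparisons."""
    return 1 - (1 - alpha_c) ** x

def comparisonwise(alpha_e, x):
    """Per-comparison rate that yields experimentwise rate alpha_e over x comparisons."""
    return 1 - (1 - alpha_e) ** (1 / x)

t = 10
x = t * (t - 1) // 2          # maximum number of pairwise comparisons
alpha_e = experimentwise(0.05, x)

print(f"x = {x}, experimentwise rate = {alpha_e:.3f}")
print(f"comparisonwise rate for a 5% experimentwise rate = {comparisonwise(0.05, x):.5f}")
```

The second function simply inverts the first, so feeding its result back into `experimentwise` returns the target rate.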

Fisher's protected LSD (FPLSD)
- Uses comparisonwise error rate
- Computed just like LSD, but you don't use it unless the F for treatments tests significant: $LSD = t_\alpha \sqrt{2\,MSE / r}$
- So in our example data, any difference between means that is greater than 16.27 is declared to be significant

Waller-Duncan Bayes LSD (BLSD)
- Do ANOVA and compute F (MST/MSE) with q and f df (corresponds to table nomenclature)
- Choose error weight ratio, k
  - k=100 corresponds to 5% significance level
  - k=500 for a 1% test
- Obtain $t_b$ from table (A7 in Petersen)
  - depends on k, F, q (treatment df) and f (error df)
- Compute $BLSD = t_b \sqrt{2\,MSE / r}$
- Any difference greater than BLSD is significant
- Does not provide complete control of experimentwise Type I error
- Reduces Type II error

Duncan's New Multiple-Range Test
- Alpha varies depending on the number of means involved in the test

Alpha                      0.05
Error Degrees of Freedom   6
Error Mean Square          113.0833

Number of Means   2       3       4       5       6
Critical Range    26.02   26.97   27.44   27.67   27.78

Means with the same letter are not significantly different.

Duncan Grouping   Mean    N   variety
A                 95.30   2   6
A                 94.00   2   4
B A               75.00   2   5
B A               69.00   2   2
B                 50.30   2   1
C                 22.50   2   3

Student-Newman-Keuls Test (SNK)
- Rank the means from high to low
- Compute t-1 significant differences, $SNK_j$, using the HSD: $SNK_j = Q_{\alpha,k,v} \sqrt{MSE / r}$, where j = 1, 2, ..., t-1, k = 2, 3, ..., t, and v is the error df
- Compare the highest and lowest
  - if less than SNK, no differences are significant
  - if greater than SNK, compare next highest mean with next lowest using next SNK
- Uses an experimentwise error rate for the extremes and a comparisonwise error rate for adjacent means

Using SNK with example data:

k     2       3       4       5       6
Q     2.97    3.61    4.00    4.28    4.49
SNK   16.27   19.77   21.91   23.44   24.59

Variety   Rank   Mean yield (Yi)
F         1      95.3
D         2      94.0
E         3      75.0
B         4      69.0
A         5      50.3
C         6      24.0

- 5 + 4 + 3 + 2 + 1 = 15 comparisons
- 18 df for error
- se = SQRT(MSE/r) = SQRT(120/4) = 5.477
- SNK = Q * se
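The SNK values above, and the step-down comparisons they feed, can be sketched as follows. The Q values are copied from the slide (assumed here to correspond to k = 2 through 6 means at 18 error df). The rule that a range is tested only when every wider range containing it was significant is the usual SNK convention and is an assumption about how the slide's procedure is applied.

```python
import math

# Studentized range values from the slide, keyed by the number of means spanned.
q_table = {2: 2.97, 3: 3.61, 4: 4.00, 5: 4.28, 6: 4.49}
mse, r = 120.0, 4
se = math.sqrt(mse / r)                      # standard error of a mean
snk = {k: q * se for k, q in q_table.items()}

ranked = [("F", 95.3), ("D", 94.0), ("E", 75.0), ("B", 69.0), ("A", 50.3), ("C", 24.0)]
n = len(ranked)
significant = {}

# Work from the widest range (highest vs lowest) down to adjacent means.
for span in range(n, 1, -1):
    for i in range(n - span + 1):
        j = i + span - 1
        # Ranges one step wider that contain this range.
        parents = [(a, b) for a, b in ((i - 1, j), (i, j + 1)) if a >= 0 and b < n]
        testable = all(significant[p] for p in parents)
        diff = ranked[i][1] - ranked[j][1]
        significant[(i, j)] = testable and diff > snk[span]

not_sig = [(ranked[i][0], ranked[j][0]) for (i, j), s in significant.items() if not s]
print({k: round(v, 2) for k, v in snk.items()})
print("Not significantly different:", not_sig)
```

With these inputs the 15 comparisons are each checked against the critical value for the number of means they span, rather than against a single fixed value.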

Tukey's honestly significant difference (HSD)
- From a table of studentized range values, select a value of $Q_\alpha$, which depends on k (the number of means) and v (error df) (Appendix Table VII in Kuehl)
- Compute HSD as $HSD = Q_\alpha \sqrt{MSE / r}$
- For any pair of means, if the difference is greater than HSD, it is significant
- Uses an experimentwise error rate
- Dunnett's test is a special case where all treatments are compared to a control
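Applied to the wheat example, the HSD can be sketched as below. The value Q = 4.49 is an assumption: it is the studentized range value listed for 6 means and 18 error df on the SNK slide, reused here because HSD applies the value for the full set of means to every pair.

```python
import math
from itertools import combinations

q_alpha = 4.49            # assumed: Q for k = 6 means, 18 error df, from the SNK slide
mse, r = 120.0, 4
hsd = q_alpha * math.sqrt(mse / r)

ranked = [("F", 95.3), ("D", 94.0), ("E", 75.0), ("B", 69.0), ("A", 50.3), ("C", 24.0)]

# Every pair of means is compared against the same fixed value.
significant = [
    (hi[0], lo[0])
    for hi, lo in combinations(ranked, 2)
    if hi[1] - lo[1] > hsd
]

print(f"HSD = {hsd:.2f}")
print("Significant pairs:", significant)
```

Because one value is used for all pairs, HSD is a fixed range test, in contrast to the varying critical values of SNK.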

Bonferroni Inequality
- $\alpha_E \le x \cdot \alpha_C$, where x = number of pairwise comparisons
- $\alpha_C = \alpha_E / x$, where $\alpha_E$ = maximum desired experimentwise error rate
- Advantages
  - simple
  - strict control of Type I error
- Disadvantage
  - very conservative, low power to detect differences
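A minimal sketch of the Bonferroni adjustment follows. The inputs (45 comparisons, as for 10 treatments, and a 5% experimentwise target) are illustrative assumptions, and the function name is hypothetical.

```python
def bonferroni_alpha(alpha_e, x):
    """Per-comparison significance level so that x comparisons keep the total at or below alpha_e."""
    return alpha_e / x

x = 45                     # all pairwise comparisons among 10 treatments
alpha_e_target = 0.05
alpha_c = bonferroni_alpha(alpha_e_target, x)

# Experimentwise rate actually attained if the comparisons were independent.
attained = 1 - (1 - alpha_c) ** x

print(f"alpha_C = {alpha_c:.5f}")
print(f"attained experimentwise rate = {attained:.4f} (target {alpha_e_target})")
```

The attained rate falls slightly below the target, which illustrates why the method is described as conservative.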

False Discovery Rate Reject H0

Most Popular
- FPLSD test is widely used, and widely abused
- BLSD is preferred by some because
  - it is a single value and therefore easy to use
  - it is larger when F indicates that the means are homogeneous and smaller when means appear to be heterogeneous
- The False Discovery Rate has nice features, but is it widely accepted in the literature?
- Tukey's HSD test
  - widely accepted and often recommended by statisticians
  - may be too conservative if Type II error has more serious consequences than Type I error