Statistical Data Analysis - Lecture 12 - 01/04/03

How LSD plots work

Recall: when we use an LSD plot to make a pairwise comparison we assume two things:
(1) The numbers of observations per group are fairly similar, i.e. balanced or “near balanced” designs are better.
(2) The standard errors of the groups are approximately equal.

It is the second assumption that explains the factor of $\sqrt{2}/2$ in the LSD interval formula.

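Assume $se(\bar{y}_1) = se(\bar{y}_2) = se$; then $se(\bar{y}_1 - \bar{y}_2) = \sqrt{se^2 + se^2} = \sqrt{2}\,se$.

Now suppose $\bar{y}_1 > \bar{y}_2$; then the difference is significant if
$$\bar{y}_1 - \bar{y}_2 > t\,\sqrt{2}\,se,$$
where $t = t_{df}(0.025)$.

Now assuming that the t-value used for the LSD interval is approximately the same as $t$, the two arrows will not overlap if
$$\bar{y}_1 - \tfrac{\sqrt{2}}{2}\,t\,se > \bar{y}_2 + \tfrac{\sqrt{2}}{2}\,t\,se$$
or
$$\bar{y}_1 - \bar{y}_2 > \sqrt{2}\,t\,se.$$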
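
The equivalence between "the arrows do not overlap" and "the difference is significant" can be checked numerically. The following is a minimal sketch, assuming a common standard error for both groups and the same t-value in both rules; the function names and the numbers are illustrative and do not come from the lecture.

```python
import math

def lsd_interval(mean, se, t_crit):
    """Ends of an LSD arrow: mean +/- (sqrt(2)/2) * t * se."""
    half_width = math.sqrt(2) / 2 * t_crit * se
    return mean - half_width, mean + half_width

def arrows_overlap(mean1, mean2, se, t_crit):
    """True when the two LSD arrows share at least one point."""
    lo1, hi1 = lsd_interval(mean1, se, t_crit)
    lo2, hi2 = lsd_interval(mean2, se, t_crit)
    return lo1 <= hi2 and lo2 <= hi1

def t_test_significant(mean1, mean2, se, t_crit):
    """Pairwise test with a common standard error: se(diff) = sqrt(2) * se."""
    return abs(mean1 - mean2) > t_crit * math.sqrt(2) * se

# Illustrative values only (assumed, not taken from the lecture).
se, t_crit = 0.5, 2.0
for m1, m2 in [(3.0, 1.0), (2.0, 1.0), (1.2, 1.0)]:
    print(m1, m2,
          "overlap:", arrows_overlap(m1, m2, se, t_crit),
          "significant:", t_test_significant(m1, m2, se, t_crit))
```

In each case the arrows overlap exactly when the pairwise test is not significant, which is the point of the $\sqrt{2}/2$ factor.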

Comparisons

When you perform a one-way ANOVA, you have the choice of performing some comparisons. The choices are Tukey’s HSD, Fisher’s Protected LSD, Dunnett’s Multiple Range Test and Hsu’s MCB. We will consider the first two (Dunnett’s is considered unreliable, and Hsu’s MCB is virtually never used).

Fisher’s comparisons are carried out using the LSD procedure we have discussed (although a different df may have been used).

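Tukey’s HSD / Tukey-Kramer intervals

Test $H_0: \mu_i = \mu_j$ against $H_1: \mu_i \neq \mu_j$. Reject $H_0$ if
$$|\bar{y}_i - \bar{y}_j| > q_{\alpha}(k, N-k)\,\sqrt{\frac{WGMS}{n_h}},$$
where $q_{\alpha}(k, N-k)$ is the critical value from the Studentised Range Distribution and $n_h = 2/(1/n_i + 1/n_j)$ is the harmonic mean of the two group sizes.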

Example

In this experiment there are three (k = 3) groups with 50 observations per group. ANOVA gives us

    Analysis of Variance Table
    Response: Y
               Df   Sum Sq  Mean Sq  F value     Pr(>F)
    treatment   2  164.616   82.308    82.29  < 2.2e-16 ***
    Residuals 147  147.033    1.000

The P-value << 0.05, so the result is definitely significant. The group means are

         Lo      Med       Hi
    -0.0222   0.1230   2.2700

Given the critical value from the Studentised Range Distribution, which differences are significant?
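
The question on this slide can be worked through with a short script. This is a sketch rather than the lecturer's own calculation: it assumes a 5% level, uses the residual mean square from the table as the pooled variance, and relies on `scipy.stats.studentized_range` (available in SciPy 1.7 and later). With 50 observations in every group the harmonic mean of the group sizes is simply 50.

```python
import math
from itertools import combinations
from scipy.stats import studentized_range

# Values read from the ANOVA output and the table of group means.
means = {"Lo": -0.0222, "Med": 0.1230, "Hi": 2.2700}
k, n, df_resid, mse = 3, 50, 147, 1.000
alpha = 0.05  # assumed significance level

# Critical value of the Studentised Range Distribution.
q_crit = float(studentized_range.ppf(1 - alpha, k, df_resid))

# Smallest difference in means that Tukey's HSD declares significant.
hsd = q_crit * math.sqrt(mse / n)

significant = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    significant[(a, b)] = diff > hsd
    print(f"{a} vs {b}: |diff| = {diff:.4f}, HSD = {hsd:.4f}, "
          f"significant = {significant[(a, b)]}")
```

Under these inputs the Lo and Med groups cannot be separated, while Hi differs from both.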

Tukey’s HSD / Tukey-Kramer intervals

Bonferroni intervals are very conservative for large numbers of groups (large k). By conservative we mean the intervals are wide.

Fisher’s LSD is at the other end of the scale: the intervals are quite narrow, so the chance of a type I error is higher than with a Bonferroni interval.

Tukey intervals are somewhere in between. For small k, they behave more like Bonferroni intervals; for large k, like LSD intervals – “the porridge was neither too hot nor too cold – it was just right!”
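
One way to see the ordering of the three methods is to compare the multiplier each applies to the standard error of a difference. The sketch below assumes all pairwise comparisons among k groups, a 5% level, and 50 observations per group (an arbitrary choice for illustration); `half_width_multipliers` is a hypothetical helper name.

```python
import math
from scipy.stats import t, studentized_range

def half_width_multipliers(k, df, alpha=0.05):
    """Multipliers of se(difference) for LSD, Tukey and Bonferroni intervals."""
    m = k * (k - 1) // 2                      # number of pairwise comparisons
    lsd = float(t.ppf(1 - alpha / 2, df))     # no adjustment for multiplicity
    tukey = float(studentized_range.ppf(1 - alpha, k, df)) / math.sqrt(2)
    bonferroni = float(t.ppf(1 - alpha / (2 * m), df))
    return lsd, tukey, bonferroni

for k in (3, 5, 10):
    df = 50 * k - k  # N - k with an assumed 50 observations per group
    lsd, tukey, bonferroni = half_width_multipliers(k, df)
    print(f"k = {k:2d}: LSD = {lsd:.3f}, Tukey = {tukey:.3f}, "
          f"Bonferroni = {bonferroni:.3f}")
```

A larger multiplier means a wider interval, so the printout shows where the Tukey interval sits between the other two for each k.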

Linear Contrasts

It is relatively simple to see from our results that the two Michael Crichton books have shorter sentences on average. Therefore, it may be instructive, and useful, to be able to compare the two authors rather than to compare two books. We can do this by generalising the concept of confidence intervals for the difference of two means.

All possible pairwise differences between the mean sentence length of the ith book, $\mu_i$, and the mean sentence length of the jth book, $\mu_j$, are linear combinations of the general form
$$\theta = \sum_{i=1}^{k} c_i \mu_i.$$
If the $c_i$’s are specified constants subject to the constraint
$$\sum_{i=1}^{k} c_i = 0,$$
then this is called a linear contrast.

Linear Contrasts

For example, $\mu_1 - \mu_2$ is a linear contrast that examines the difference between book 1 (Eye of the Dragon) and book 2 (The Tommy Knockers). The coefficients are
$$c_1 = 1,\quad c_2 = -1,\quad c_3 = \dots = c_k = 0.$$
This contrast satisfies our constraint that the coefficients sum to zero. It is then easy to see how we can construct contrasts that test groups against each other.
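
The sum-to-zero constraint is easy to check in code. In this small sketch, k = 8 comes from the lecture's N - k = 400 - 8; the group contrast that averages six books against two is a hypothetical illustration, and `is_contrast` is an assumed helper name.

```python
from fractions import Fraction

def is_contrast(coeffs):
    """Coefficients define a linear contrast when they sum to exactly zero."""
    return sum(coeffs) == 0

k = 8  # number of books

# Pairwise contrast mu_1 - mu_2: book 1 against book 2, other books ignored.
pairwise = [1, -1] + [0] * (k - 2)

# Hypothetical group contrast: the average of the first six books against
# the average of the last two (the 6/2 split is an assumption).
group = [Fraction(1, 6)] * 6 + [Fraction(-1, 2)] * 2

# Not a contrast: these coefficients sum to 2, so they measure a total.
not_a_contrast = [1, 1] + [0] * (k - 2)

for name, coeffs in [("pairwise", pairwise), ("group", group),
                     ("not_a_contrast", not_a_contrast)]:
    print(name, is_contrast(coeffs))
```

Exact fractions are used so that the sum-to-zero check is not affected by floating-point rounding.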

A linear contrast for authors

E.g. we wish to look at the difference between the Stephen King books and the Michael Crichton books, so our contrast takes the form of the average of the Stephen King book means minus the average of the Michael Crichton book means.

To place any confidence intervals on our contrasts we need estimates of the contrast itself and the standard error of the estimate.

Estimating a linear contrast

The estimate of the contrast is easily obtained by replacing the population means with the sample means, i.e. if $\bar{y}_i$ is the mean of the ith group, then an estimate of the contrast is
$$\hat{\theta} = \sum_{i=1}^{k} c_i \bar{y}_i.$$
We’ve seen that the WGMS is an estimate of the variance of each of the groups (remember we assume each group has the same variance), so the square root of the WGMS is an estimate of the standard deviation,
$$\hat{\sigma} = \sqrt{WGMS}.$$

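Standard error of a linear contrast

An estimate of the standard deviation (the standard error) of the contrast is then given by
$$se(\hat{\theta}) = \sqrt{WGMS \sum_{i=1}^{k} \frac{c_i^2}{n_i}},$$
where $n_i$ is the number of observations in the ith group. Therefore a $100(1-\alpha)\%$ confidence interval for the contrast is given by
$$\hat{\theta} \pm t_{N-k}(\alpha/2)\, se(\hat{\theta}),$$
where $N = \sum_{i=1}^{k} n_i$ is the total number of observations.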
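
The estimate, standard error and confidence interval can be put together in a few lines. This is a sketch under the equal-variance assumption of one-way ANOVA; `contrast_ci` is a hypothetical helper and the three-group data are made up for illustration.

```python
import math
from scipy.stats import t

def contrast_ci(coeffs, means, ns, wgms, alpha=0.05):
    """Estimate, standard error and 100(1-alpha)% CI for a linear contrast."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = math.sqrt(wgms * sum(c ** 2 / n for c, n in zip(coeffs, ns)))
    df = sum(ns) - len(ns)                    # N - k
    t_crit = float(t.ppf(1 - alpha / 2, df))
    return estimate, se, (estimate - t_crit * se, estimate + t_crit * se)

# Made-up example: three groups of ten, comparing group 1 with group 2.
estimate, se, (lower, upper) = contrast_ci(
    coeffs=[1, -1, 0], means=[1.0, 2.0, 4.0], ns=[10, 10, 10], wgms=1.0)
print(f"estimate = {estimate:.3f}, se = {se:.3f}, "
      f"95% CI = ({lower:.3f}, {upper:.3f})")
```

For a pairwise contrast with equal group sizes the standard error reduces to the familiar $\sqrt{2\,WGMS/n}$.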

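A hypothesis test for linear contrasts

Given that we have an estimate of the statistic and an estimate of its standard deviation (its standard error), it is relatively simple to go from our confidence interval to a hypothesis test. Our null and alternative hypotheses are
$$H_0: \theta = 0 \quad \text{vs.} \quad H_1: \theta \neq 0,$$
where $\theta = \sum c_i \mu_i$ is the contrast, so our hypothesised difference $\theta_0$ is zero. Our test statistic is
$$t_0 = \frac{\hat{\theta} - \theta_0}{se(\hat{\theta})},$$
where
$$\hat{\theta} = \sum_{i=1}^{k} c_i \bar{y}_i \quad \text{and} \quad se(\hat{\theta}) = \sqrt{WGMS \sum_{i=1}^{k} \frac{c_i^2}{n_i}}.$$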

so
$$t_0 = \frac{\hat{\theta}}{se(\hat{\theta})}.$$
Now we find the P-value using
$$P = 2\Pr(T \geq |t_0|),$$
where T has a Student’s t distribution with N – k degrees of freedom.
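
The test can be sketched in the same way as the confidence interval. The helper name `contrast_test` and the three-group data below are assumptions for illustration; the two-sided P-value uses the t distribution with N - k degrees of freedom as described above.

```python
import math
from scipy.stats import t

def contrast_test(coeffs, means, ns, wgms, theta0=0.0):
    """Test statistic and two-sided P-value for H0: contrast = theta0."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = math.sqrt(wgms * sum(c ** 2 / n for c, n in zip(coeffs, ns)))
    df = sum(ns) - len(ns)                    # N - k
    t0 = (estimate - theta0) / se
    p_value = 2 * float(t.sf(abs(t0), df))    # 2 * Pr(T >= |t0|)
    return t0, p_value

# Made-up example: three groups of ten, testing group 1 against group 2.
t0, p_value = contrast_test(
    coeffs=[1, -1, 0], means=[1.0, 2.0, 4.0], ns=[10, 10, 10], wgms=1.0)
print(f"t0 = {t0:.3f}, P-value = {p_value:.4f}")
```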

E.g. let’s test the logged sentence lengths of the Stephen King books against the logged sentence lengths of the Michael Crichton books. Our estimate of the contrast is
$$\hat{\theta} = \sum_{i} c_i \bar{y}_i,$$
where $\bar{y}_i$ is the mean of the logged sentence lengths of the ith book (i = “The Eye of the Dragon”, ..., “Disclosure”). Working this out gives 0.75. The WGMS = 0.507, so our standard error is
$$se(\hat{\theta}) = \sqrt{0.507 \sum_{i} \frac{c_i^2}{n_i}}.$$

N – k = 400 – 8 = 392, so $t_{392}(0.025) \approx z_{0.025} = 1.96$ and our test statistic is
$$t_0 = \frac{0.75}{se(\hat{\theta})}.$$
As $t_0 \gg 1.96$ we can say that there is very strong evidence against the null hypothesis that the two authors are the same (on the basis of sentence length). A 95% confidence interval for our contrast is
$$0.75 \pm 1.96\, se(\hat{\theta}).$$
Transforming back to the original scale, this tells us that the sentences in Stephen King’s books are on average approximately 1.8 to 2.5 times longer than those in Michael Crichton’s books.
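
The arithmetic of this example can be retraced with a short script. The estimate (0.75), the WGMS (0.507) and the critical value (1.96) are taken from the slides. Everything else is an assumption made for this sketch: that the eight books split into six Stephen King and two Michael Crichton titles, that the contrast averages within each author, and that each book contributes 50 sentences (400 / 8).

```python
import math

# Values stated in the lecture.
estimate = 0.75    # contrast estimate on the log scale
wgms = 0.507       # within-group mean square
t_crit = 1.96      # t_392(0.025), approximated by z_0.025

# Assumptions for this sketch (not stated explicitly in the slides).
n_per_book = 50                        # balanced design: 400 / 8
coeffs = [1 / 6] * 6 + [-1 / 2] * 2    # six King books, two Crichton books

# Standard error of the contrast: sqrt(WGMS * sum(c_i^2 / n_i)).
se = math.sqrt(wgms * sum(c ** 2 / n_per_book for c in coeffs))

# Test statistic for H0: contrast = 0.
t0 = estimate / se

# 95% confidence interval on the log scale, then back-transformed to a ratio.
lower, upper = estimate - t_crit * se, estimate + t_crit * se
ratio_lower, ratio_upper = math.exp(lower), math.exp(upper)

print(f"se = {se:.4f}, t0 = {t0:.2f}")
print(f"log-scale CI = ({lower:.3f}, {upper:.3f})")
print(f"ratio CI     = ({ratio_lower:.2f}, {ratio_upper:.2f})")
```

Under these assumptions the back-transformed interval rounds to the "approximately 1.8 to 2.5 times longer" quoted on the slide, which suggests the assumed coefficients are at least consistent with the lecture's result.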