ANOVA: Analysis of Variance Xuhua Xia



Review of t-test
Parametric
–Paired-sample t-test: t.test(x1, x2, paired=TRUE)
–Unpaired two-sample t-test assuming equal variance: t.test(x1, x2, var.equal=TRUE)
–Unpaired two-sample t-test when the two variances are not equal: t.test(x1, x2) (always do a non-parametric test as well and use the results of the more sensitive test)
–Consequence of violating the assumption
Nonparametric
–Mann-Whitney-Wilcoxon test (ensure that x is a 'factor'): wilcox.test(y~x, data=myDat, paired=FALSE), or paired=TRUE for paired samples
–Alternative: rank the variables and perform a regular t-test
Test equality of variance
–var.test(x1, x2)
–p <- 2*pf(Var_small/Var_large, DF_small, DF_large)
Equivalent methods in EXCEL
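The calls reviewed on this slide can be tried together in a short R session. The sketch below uses made-up vectors x1 and x2 (they are not from the slides) and checks that the manual two-sided p-value for the variance ratio agrees with var.test.

```r
# Hypothetical measurements, invented for illustration only
x1 <- c(10, 11, 12, 13, 14)
x2 <- c(4, 9, 16, 21, 25)

# Parametric tests
t_paired <- t.test(x1, x2, paired = TRUE)      # paired-sample t-test
t_pooled <- t.test(x1, x2, var.equal = TRUE)   # assumes equal variances
t_welch  <- t.test(x1, x2)                     # default: unequal variances

# Test of equal variances, by function and by hand
f_test <- var.test(x1, x2)
v  <- c(var(x1), var(x2))
df <- c(length(x1), length(x2)) - 1
small <- which.min(v)
large <- which.max(v)
p_manual <- 2 * pf(v[small] / v[large], df[small], df[large])

# Nonparametric test on long-format data; x must be a factor
myDat <- data.frame(y = c(x1, x2),
                    x = factor(rep(c("g1", "g2"), each = 5)))
w_test <- wilcox.test(y ~ x, data = myDat)

# Alternative: rank the pooled values, then run a regular t-test
t_rank <- t.test(rank(y) ~ x, data = myDat, var.equal = TRUE)

print(c(manual = p_manual, var.test = f_test$p.value))
```

Dividing the smaller variance by the larger one keeps the ratio below 1, so the lower tail of the F distribution is the relevant one and doubling it gives the two-sided p-value.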

Review of Standard Error (SE)

ANOVA was mainly developed by Ronald A. Fisher. The F statistic was named after him.
Head of the Statistics Division at the Rothamsted Experimental Station in Hertfordshire.
One of the three founders of theoretical population genetics.
Developer of statistical methods, especially the likelihood methods.
Published The Genetical Theory of Natural Selection in 1930, in which he proposed the fundamental theorem of natural selection.
“To call in a statistician after the experiment is done may be no more than asking him to perform a postmortem examination; he may be able to say what the experiment died of.” Ronald A. Fisher

One-way ANOVA Model
$x_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ vs. $x_{ij} = \mu + \varepsilon_{ij}$
Is this effect ($\alpha_i$) zero?
This is the same model as for the t-test, except that the subscript i is 1 and 2 in the t-test, but 1, 2, ..., n in one-way ANOVA.

ANOVA Rationale
The essence of ANOVA is to partition the total variation into its components. Suppose we have three groups (e.g., control plus two treatments), each with $N_1 = N_2 = N_3 = 200$ test animals. Given the null hypothesis that the three groups do not differ from each other, i.e., they all represent random samples from the same underlying population, we can estimate the population variance in three ways:
–From all 600 animals: $Var = SS_{Total}/DF$
–From individual groups: $SS_1/DF_1$, $SS_2/DF_2$, $SS_3/DF_3$, pooled as $Var_{withinGroup} = (SS_1 + SS_2 + SS_3)/(DF_1 + DF_2 + DF_3)$
–From the three group means $M_1$, $M_2$, $M_3$ and the grand mean $M$: $SE = \sqrt{[(M_1 - M)^2 + (M_2 - M)^2 + (M_3 - M)^2]/2}$, so that $Var_{betweenGroup} = SE^2 \times 200 = [N_1 (M_1 - M)^2 + N_2 (M_2 - M)^2 + N_3 (M_3 - M)^2]/2$
Given the null hypothesis, $Var_{withinGroup}$ and $Var_{betweenGroup}$ estimate the same population variance, so ANOVA is an F-test of the two variances. In ANOVA terminology, $Var_{withinGroup}$ is MS Error and $Var_{betweenGroup}$ is MS Model.
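The three variance estimates can be computed directly to see how the partition works. The following is a minimal sketch that simulates three groups of 200 animals under the null hypothesis. The mean of 50, the standard deviation of 10, the seed and the group labels are all assumptions made for illustration.

```r
set.seed(1)   # arbitrary seed so the run is reproducible
n <- 200      # animals per group, as on the slide

# Hypothetical data: all three groups drawn from ONE population (null is true)
dat <- data.frame(
  y = rnorm(3 * n, mean = 50, sd = 10),
  g = factor(rep(c("Control", "Trt1", "Trt2"), each = n))
)

grand_mean  <- mean(dat$y)
group_means <- tapply(dat$y, dat$g, mean)
group_ss    <- tapply(dat$y, dat$g, function(v) sum((v - mean(v))^2))

# Three estimates of the same population variance
var_total   <- sum((dat$y - grand_mean)^2) / (3 * n - 1)   # all 600 animals
var_within  <- sum(group_ss) / (3 * (n - 1))               # MS Error
var_between <- sum(n * (group_means - grand_mean)^2) / 2   # MS Model

# F-test of the two variances
f_ratio <- var_between / var_within
p_value <- pf(f_ratio, 2, 3 * (n - 1), lower.tail = FALSE)

# The same quantities from R's built-in ANOVA
tab <- anova(aov(y ~ g, data = dat))
print(tab)
print(c(total = var_total, within = var_within, between = var_between))
```

Because the simulated groups really do come from one population, the three estimates should all be in the neighbourhood of 100 (the squared standard deviation), with the between-group estimate the noisiest since it rests on only 2 degrees of freedom.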

One-way experimental design
             Low-fat food   Medium-fat food   High-fat food
Weight gain       0                4                8
                  2                6               10

Numerical Illustration of One-Way ANOVA
Assignment: Repeat the ANOVA computation after replacing 10 in the High-fat food group with two values, 9 and 20. Submit this slide with all updated values.
Name: ID:

ANOVA Table
Dependent variable: Weight Gain
Source   DF   SS     MS   F   p
Model
Error
Total    5    70.0
The null hypothesis H0: X1 = X2 = X3 (equal group means) is rejected. The three kinds of food differ significantly in their effect on weight gain of rabbits. In particular, Medium-fat and High-fat foods are significantly better than Low-fat food. However, Medium-fat and High-fat foods do not differ in their effect on rabbit weight gain.
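The entries of an ANOVA table for the rabbit data can be recomputed by hand from the six weight gains in the one-way design (0 and 2 for low-fat, 4 and 6 for medium-fat, 8 and 10 for high-fat). The sketch below does the arithmetic step by step and then compares it with aov. The variable names are chosen here for illustration.

```r
# Weight gains from the one-way experimental design
gain <- c(0, 2, 4, 6, 8, 10)
food <- factor(rep(c("Low", "Medium", "High"), each = 2),
               levels = c("Low", "Medium", "High"))

grand_mean  <- mean(gain)                   # 5
group_means <- tapply(gain, food, mean)     # 1, 5, 9
n_per_group <- tapply(gain, food, length)   # 2 in each group

# Partition of the total sum of squares
ss_total <- sum((gain - grand_mean)^2)                        # 70
ss_model <- sum(n_per_group * (group_means - grand_mean)^2)   # 64
ss_error <- ss_total - ss_model                               # 6

# Degrees of freedom, mean squares and the F-test
df_model <- nlevels(food) - 1              # 2
df_error <- length(gain) - nlevels(food)   # 3
ms_model <- ss_model / df_model            # 32
ms_error <- ss_error / df_error            # 2
f_stat   <- ms_model / ms_error            # 16
p_value  <- pf(f_stat, df_model, df_error, lower.tail = FALSE)  # about 0.025

# Cross-check against R's built-in one-way ANOVA
tab <- anova(aov(gain ~ food))
print(tab)
```

The total sum of squares of 70 with 5 degrees of freedom matches the Total row of the table, which suggests the table was computed from these same six values.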

ANOVA and t-test
Parametric:
–aov(DV~IV1+IV2+…)
–aov(DV~IV1+IV2+IV1:IV2) or aov(DV~IV1*IV2)
–Contrast ANOVA and t-test by using Mercury2Gr_A.txt and Mercury2Gr_B.txt (same data in two different formats, one for t.test and one for aov); also DarwinPlantBreeding_A.txt and DarwinPlantBreeding_B.txt (ensure that the variable Species is a factor)
Nonparametric:
–One-way ANOVA: kruskal.test(DV~IV)
–Randomized block design: friedman.test(y~A|B), where A is the treatment and B is the block
Others:
–summary(fit)
–print(model.tables(fit,"means"),digits=3)
–boxplot(DV~IV)
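The contrast between t.test and aov can be sketched without the course data files. The example below uses invented two-group data (not the contents of Mercury2Gr or DarwinPlantBreeding) in both layouts: two vectors for t.test and a long-format data frame for aov. With two groups and equal variances assumed, F equals the square of t and the p-values coincide.

```r
# Hypothetical two-group data, invented for illustration
x1 <- c(4.1, 5.3, 6.0, 5.5, 4.8)
x2 <- c(6.2, 7.1, 6.8, 7.9, 6.5)

# Layout A: two separate vectors, suitable for t.test
tt <- t.test(x1, x2, var.equal = TRUE)

# Layout B: one response column plus a grouping factor, suitable for aov
d <- data.frame(DV = c(x1, x2),
                IV = factor(rep(c("A", "B"), each = 5)))
fit <- aov(DV ~ IV, data = d)
tab <- anova(fit)

t_squared <- unname(tt$statistic)^2
f_stat    <- tab$`F value`[1]
p_aov     <- tab$`Pr(>F)`[1]
print(c(t_squared = t_squared, F = f_stat))
print(c(p_t = tt$p.value, p_aov = p_aov))

# Group means and the nonparametric counterpart of one-way ANOVA
print(model.tables(fit, "means"), digits = 3)
kw <- kruskal.test(DV ~ IV, data = d)
print(kw$p.value)
```

The equivalence holds only for the pooled-variance t-test. The default t.test(x1, x2) does not assume equal variances and generally gives a slightly different p-value.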

Randomized complete blocks
Which of the six strains of clover has the highest protein content? The experimenter divided his field into 5 relatively homogeneous blocks, each with 6 plots, and randomly assigned his 6 strains to the 6 plots within each block. After harvesting, he determined the nitrogen content for each strain in each plot.

Randomized complete blocks
[Data table: one row per block (5 blocks), one column per variety: 3dok1, 3dok13, 3dok4, 3dok5, 3dok7, compos]
Recode the data into three columns (variables): Yield, Variety and Block, and save it to a text file such as RandCompleteBlock.txt for data analysis in R, e.g.,
Yield   Variety   Block
33      3dok1     B
        3dok1     B2
……
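The recoding step can be sketched in R as follows. The wide table here is hypothetical: three blocks, two varieties named V1 and V2, and made-up yields stand in for the real clover data. The file is written to a temporary directory rather than the working directory.

```r
# Hypothetical wide table: one row per block, one column per variety
wide <- data.frame(
  Block = c("B1", "B2", "B3"),
  V1    = c(30.1, 28.4, 31.0),
  V2    = c(25.2, 24.9, 27.3),
  stringsAsFactors = FALSE
)
varieties <- setdiff(names(wide), "Block")

# Stack the variety columns into three variables: Yield, Variety, Block
long <- data.frame(
  Yield   = unlist(wide[varieties], use.names = FALSE),
  Variety = rep(varieties, each = nrow(wide)),
  Block   = rep(wide$Block, times = length(varieties))
)

# Save as a text file and read it back the way the analysis would
out_file <- file.path(tempdir(), "RandCompleteBlock.txt")
write.table(long, out_file, row.names = FALSE, quote = FALSE)
md <- read.table(out_file, header = TRUE, stringsAsFactors = TRUE)
print(md)
```

Each row of the long table is one plot, so a design with 5 blocks and 6 varieties would give 30 rows.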

R functions
# Read Variety and Block as factors (needed by TukeyHSD)
md <- read.table("RandCompleteBlock.txt", header=TRUE, stringsAsFactors=TRUE)
fit <- aov(Yield~Block+Variety, data=md)
summary(fit)
anova(fit)
TukeyHSD(fit)
[TukeyHSD output: a $Block table and a $Variety table, each with one row per pair of levels and the columns diff, lwr, upr and p adj]

Example
A researcher needs to assess the effect of 3 drugs on reducing appetite. Appetite reduction is measured by inter-meal interval (in minutes). The half-life of the drugs is about 3 days. Seven human subjects differ in age, gender, appetite, degree of obesity and potentially in many other ways. If the researcher randomly allocated these seven subjects to three groups, some groups might contain younger subjects or more males than others, so that any group differences would be confounded by potentially many other factors. He decided to use a randomized complete block design and administer the drugs on Monday in three consecutive weeks. For each subject, he randomized the three drugs over the three Mondays (top right), took an index of appetite, and obtained the data table (bottom right).
Using test subjects as blocks is also called repeated measures ANOVA or within-subject ANOVA.
Assignment A: analyze the data and report the effect size and the result of the significance test (in short, what you want to include in a manuscript).
[Data table (bottom right): columns Subject, Drug 1, Drug 2, Drug 3]
Drug schedule (top right):
Subject   Week 1   Week 2   Week 3
1         Drug2    Drug1    Drug3
2         Drug1    Drug3    Drug2
3         …        Drug3    Drug1
4         Drug3    Drug1    Drug2
5         Drug1    Drug2    Drug3
6         …        Drug2    Drug1
7         Drug2    Drug1    Drug3
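One way such an analysis might look in R is sketched below. The inter-meal intervals are invented for illustration and are not the data from the slide. Subjects are treated as blocks, and partial eta squared is used as one possible effect size (an assumption, since the slide does not name a measure).

```r
# Hypothetical inter-meal intervals in minutes; NOT the data from the slide
d <- data.frame(
  Subject  = factor(rep(1:7, times = 3)),
  Drug     = factor(rep(c("Drug1", "Drug2", "Drug3"), each = 7)),
  Interval = c(210, 190, 250, 230, 205, 240, 220,   # Drug1, subjects 1-7
               235, 200, 270, 260, 215, 255, 245,   # Drug2, subjects 1-7
               225, 195, 255, 250, 210, 250, 230)   # Drug3, subjects 1-7
)

# Randomized complete block ANOVA with subjects as blocks
fit <- aov(Interval ~ Subject + Drug, data = d)
tab <- anova(fit)
print(tab)

# Effect size: drug means and partial eta squared for Drug
drug_means <- tapply(d$Interval, d$Drug, mean)
ss <- setNames(tab$`Sum Sq`, rownames(tab))
partial_eta_sq <- ss[["Drug"]] / (ss[["Drug"]] + ss[["Residuals"]])
print(round(drug_means, 1))
print(partial_eta_sq)

# Pairwise drug comparisons and the nonparametric alternative
print(TukeyHSD(fit, "Drug"))
fr <- friedman.test(Interval ~ Drug | Subject, data = d)
print(fr)
```

Putting Subject in the model removes between-subject variation from the error term, which is the point of using each subject as his or her own block.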