Lecture 2: Replication and pseudoreplication

This lecture will cover: experimental units (replicates), pseudoreplication, and degrees of freedom.

Experimental unit: the scale at which independent applications of the same treatment occur. Also called a “replicate”; the number of experimental units is represented by “n” in statistics.

Experimental unit. Example: effect of fertilization on caterpillar growth.

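Experimental unit? [Figure: two fertilized (+F) and two unfertilized (-F) units] n = 2

Experimental unit? [Figure: one fertilized (+F) and one unfertilized (-F) unit] n = 1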

Pseudoreplication: misidentifying the scale of the experimental unit, i.e. assuming there are more experimental units (replicates, “n”) than there actually are.

When is this a pseudoreplicated design? [Figure: one fertilized (+F) and one unfertilized (-F) unit]

Example 1. Hypothesis: Insect abundance is higher in shallow lakes

Example 1. Experiment: Sample insect abundance every 100 m along the shoreline of a shallow and a deep lake

Example 1. What’s the problem? Spatial autocorrelation: samples taken along the shoreline of the same lake are not independent of each other.

Example 2. Hypothesis: Two species of plants have different growth rates

Example 2. Experiment: Mark 10 individuals of sp. A and 10 of sp. B in a field. Follow growth rate over time. If the researcher declares n=10, could this still be pseudoreplicated?

Example 2. [Figure: measurements plotted against time]

Temporal pseudoreplication: multiple measurements on the SAME individual, treated as independent data points.

Spotting pseudoreplication: 1. Inspect the spatial (temporal) layout of the experiment. 2. Examine the degrees of freedom in the analysis.

Degrees of freedom (df): the number of independent terms used to estimate the parameter = total number of data points - number of parameters estimated from the data.

Example: Variance. If we have 3 data points with a mean value of 10, what’s the df for the variance estimate?
Independent term method:
Can the first data point be any number? Yes, say 8.
Can the second data point be any number? Yes, say 12.
Can the third data point be any number? No, as the mean is fixed!
Therefore 2 independent terms (df = 2).
Variance is $\sum (y - \text{mean})^2 / (n - 1)$.
Subtraction method:
Total number of data points? 3
Number of estimates from the data? 1
df = 3 - 1 = 2
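Both counting methods above can be checked with a few lines of Python. This is a minimal sketch; the helper names `third_point` and `sample_variance` are made up for illustration and are not from the lecture.

```python
from statistics import mean

def third_point(target_mean, free_points, n=3):
    """Value the last point is forced to take once the mean and the other points are fixed."""
    return n * target_mean - sum(free_points)

def sample_variance(values):
    """Return (variance, df), dividing the sum of squares by df = n - 1."""
    n = len(values)
    m = mean(values)   # one parameter (the mean) is estimated from the data
    df = n - 1         # subtraction method: data points minus estimates
    ss = sum((y - m) ** 2 for y in values)
    return ss / df, df

free = [8, 12]                         # the two freely chosen points from the slide
data = free + [third_point(10, free)]  # the third point is not free
s2, df = sample_variance(data)
print(data, df, s2)                    # prints [8, 12, 10] 2 4.0
```

Only two of the three values could be chosen freely, which is why the sum of squares is divided by 2 rather than 3.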

Example: Linear regression. Y = mx + b. Therefore 2 parameters (m and b) are estimated simultaneously (df = n - 2).
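The same subtraction logic can be sketched for a fitted line. The data below are hypothetical and the function names are illustrative; the fit uses the ordinary least-squares formulas for slope and intercept.

```python
def fit_line(x, y):
    """Ordinary least-squares slope m and intercept b for Y = mx + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b

def residual_mean_square(x, y):
    """Residual sum of squares divided by df = n - 2 (two parameters estimated)."""
    m, b = fit_line(x, y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    df = len(x) - 2
    return ss_resid / df, df

# Hypothetical data, not from the lecture
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
ms_resid, df_resid = residual_mean_square(x, y)
print(df_resid)  # 5 data points - 2 estimated parameters
```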

Example: Analysis of variance (ANOVA)
  A    B    C
  a1   b1   c1
  a2   b2   c2
  a3   b3   c3
  a4   b4   c4
What is n for each level? n = 4
How many df for each variance estimate? df = 3 for each of A, B and C
What’s the within-treatment df for an ANOVA? Within-treatment df = 3 + 3 + 3 = 9
If an ANOVA has k levels and n data points per level, what’s a simple formula for within-treatment df? df = k(n-1)
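As a rough check of the formula, the within-treatment df can be counted level by level and compared with k(n-1). The measurements below are hypothetical placeholders for a1...c4, and the function names are illustrative.

```python
# Hypothetical measurements standing in for a1..a4, b1..b4, c1..c4
groups = {
    "A": [4.1, 3.8, 4.4, 4.0],
    "B": [5.2, 5.0, 4.7, 5.5],
    "C": [3.3, 3.9, 3.6, 3.1],
}

def within_treatment_df(groups):
    """Sum of (n - 1) over levels: each level loses one df to its own mean."""
    return sum(len(values) - 1 for values in groups.values())

def within_treatment_ms(groups):
    """Within-treatment mean square: pooled sum of squares divided by within df."""
    ss = 0.0
    for values in groups.values():
        level_mean = sum(values) / len(values)
        ss += sum((y - level_mean) ** 2 for y in values)
    return ss / within_treatment_df(groups)

k = len(groups)   # number of levels
n = 4             # data points per level
df_within = within_treatment_df(groups)
print(df_within, k * (n - 1))  # prints 9 9
```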

Spotting pseudoreplication. An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA (within-treatment MS).
Is there pseudoreplication? Yes! As k=2 and n=10, df = 2(10-1) = 18.
What mistake did the researcher make? Assumed n=50 (plants rather than plots): 2(50-1) = 98.
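The df check from this example can be written as a small helper. This is only a sketch: `check_reported_df` and its arguments are hypothetical names, and it assumes a balanced design in which the plot is the experimental unit.

```python
def within_df(levels, units_per_level):
    """Within-treatment df = k(n - 1)."""
    return levels * (units_per_level - 1)

def check_reported_df(reported_df, treatments, plots_per_treatment, plants_per_plot):
    """Compare a reported df with the plot-level and plant-level counts."""
    correct = within_df(treatments, plots_per_treatment)
    inflated = within_df(treatments, plots_per_treatment * plants_per_plot)
    if reported_df == correct:
        verdict = "consistent with plots as experimental units"
    elif reported_df == inflated:
        verdict = "pseudoreplicated: plants were counted as replicates"
    else:
        verdict = "matches neither count; inspect the design"
    return correct, inflated, verdict

correct_df, inflated_df, verdict = check_reported_df(98, 2, 10, 5)
print(correct_df, inflated_df, verdict)
# prints 18 98 pseudoreplicated: plants were counted as replicates
```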

Why is pseudoreplication a problem? Hint: think about what we use df for!
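One way to explore the hint is a small simulation, sketched here under stated assumptions. It generates the 10 + 10 plot design with NO true fertilizer effect, then tests it twice: once on plot means (df = 18) and once treating every plant as a replicate (df = 98). The plot and plant standard deviations are arbitrary illustrative values, and the two critical t values are approximate two-sided 5% table values.

```python
import random
from statistics import fmean

T_CRIT_DF18 = 2.101  # approximate two-sided 5% critical t, 18 df
T_CRIT_DF98 = 1.984  # approximate two-sided 5% critical t, 98 df

def sample_var(xs):
    """Sample variance with n - 1 in the denominator."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def two_sample_t(a, b):
    """Pooled-variance t statistic for two samples of equal size."""
    n = len(a)
    pooled = (sample_var(a) + sample_var(b)) / 2
    return (fmean(a) - fmean(b)) / (2 * pooled / n) ** 0.5

def simulate_once(rng, plots=10, plants=5, plot_sd=1.0, plant_sd=1.0):
    """One experiment with no treatment effect; plants share their plot's effect."""
    treatments = []
    for _ in range(2):
        plot_data = []
        for _ in range(plots):
            plot_effect = rng.gauss(0, plot_sd)
            plot_data.append([plot_effect + rng.gauss(0, plant_sd) for _ in range(plants)])
        treatments.append(plot_data)
    plot_means = [[fmean(p) for p in t] for t in treatments]        # n = 10 per treatment
    all_plants = [[y for p in t for y in p] for t in treatments]    # "n" = 50 per treatment
    reject_correct = abs(two_sample_t(*plot_means)) > T_CRIT_DF18
    reject_pseudo = abs(two_sample_t(*all_plants)) > T_CRIT_DF98
    return reject_correct, reject_pseudo

def false_positive_rates(runs=1000, seed=1):
    """Fraction of null experiments declared significant under each analysis."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(runs)]
    return (sum(r[0] for r in results) / runs,
            sum(r[1] for r in results) / runs)

correct_rate, pseudo_rate = false_positive_rates()
print(f"plots as replicates: {correct_rate:.3f}, plants as replicates: {pseudo_rate:.3f}")
```

Under these assumptions the plot-level analysis should reject close to 5% of the time, while the plant-level analysis rejects far more often. The inflated df lowers the critical value, and the standard error is too small because plants in the same plot are not independent.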

How prevalent? Hurlbert (1984): 48% of papers. Heffner et al. (1996): 12 to 14% of papers.