Principles in the design of multiphase experiments with a later laboratory phase: orthogonal designs Chris Brien 1, Bronwyn Harch 2, Ray Correll 2 & Rosemary.

Slides:



Advertisements
Similar presentations
Analysis by design Statistics is involved in the analysis of data generated from an experiment. It is essential to spend time and effort in advance to.
Advertisements

Robust microarrary experiments by design: a multiphase framework Chris Brien Phenomics & Bioinformatics Research Centre, University of South Australia.
Multiphase experiments in the biological sciences Chris Brien Phenomics and Bioinformatics Research Centre, University of South Australia Joint work with:
Principles in the design of multiphase experiments with a later laboratory phase: orthogonal designs Chris Brien 1, Bronwyn Harch 2, Ray Correll 2 & Rosemary.
Robust microarray experiments by design: a multiphase framework Chris Brien Phenomics & Bioinformatics Research Centre, University of South Australia
Factor-allocation in gene- expression microarray experiments Chris Brien Phenomics and Bioinformatics Research Centre University of South Australia
Statistical Techniques I EXST7005 Multiple Regression.
Multiple Comparisons in Factorial Experiments
SPH 247 Statistical Analysis of Laboratory Data 1April 2, 2013SPH 247 Statistical Analysis of Laboratory Data.
ANCOVA Workings of ANOVA & ANCOVA ANCOVA, Semi-Partial correlations, statistical control Using model plotting to think about ANCOVA & Statistical control.
1 Chapter 4 Experiments with Blocking Factors The Randomized Complete Block Design Nuisance factor: a design factor that probably has an effect.
Chapter 4 Randomized Blocks, Latin Squares, and Related Designs
M. Kathleen Kerr “Design Considerations for Efficient and Effective Microarray Studies” Biometrics 59, ; December 2003 Biostatistics Article Oncology.
Experimental Design, Response Surface Analysis, and Optimization
1 SSS II Lecture 1: Correlation and Regression Graduate School 2008/2009 Social Science Statistics II Gwilym Pryce
Selection of Research Participants: Sampling Procedures
Longitudinal Experiments Larry V. Hedges Northwestern University Prepared for the IES Summer Research Training Institute July 28, 2010.
Chapter 4 Multiple Regression.
Chapter 3 Analysis of Variance
EXPERIMENTS AND OBSERVATIONAL STUDIES Chance Hofmann and Nick Quigley
Nested and Split Plot Designs. Nested and Split-Plot Designs These are multifactor experiments that address common economic and practical constraints.
Formulating mixed models for experiments, including longitudinal experiments [JABES (2009) 14, ] Chris Brien 1 & Clarice Demétrio 2 1 University.
Formulating mixed models for experiments, including longitudinal experiments [JABES (2009) 14, ] Chris Brien 1 & Clarice Demétrio 2 1 University.
Biostatistics-Lecture 9 Experimental designs Ruibin Xi Peking University School of Mathematical Sciences.
Association vs. Causation
Introduction to Multilevel Modeling Using SPSS
ANCOVA Lecture 9 Andrew Ainsworth. What is ANCOVA?
Design of Experiments Dr.... Mary Whiteside. Experiments l Clinical trials in medicine l Taguchi experiments in manufacturing l Advertising trials in.
Fundamentals of Data Analysis Lecture 7 ANOVA. Program for today F Analysis of variance; F One factor design; F Many factors design; F Latin square scheme.
Magister of Electrical Engineering Udayana University September 2011
1 Chapter 1: Introduction to Design of Experiments 1.1 Review of Basic Statistical Concepts (Optional) 1.2 Introduction to Experimental Design 1.3 Completely.
The Randomized Block Design. Suppose a researcher is interested in how several treatments affect a continuous response variable (Y). The treatments may.
Statistical Techniques I EXST7005 Conceptual Intro to ANOVA.
Fixed vs. Random Effects
BIOL 582 Lecture Set 10 Nested Designs Random Effects.
Slide 13-1 Copyright © 2004 Pearson Education, Inc.
Factorial Experiments Analysis of Variance (ANOVA) Experimental Design.
Chapter 7 Experimental Design: Independent Groups Design.
So far... We have been estimating differences caused by application of various treatments, and determining the probability that an observed difference.
Experimental Design All experiments have independent variables, dependent variables, and experimental units. Independent variable. An independent.
Part III Gathering Data.
Experimental Design If a process is in statistical control but has poor capability it will often be necessary to reduce variability. Experimental design.
1 Chapter 13 Analysis of Variance. 2 Chapter Outline  An introduction to experimental design and analysis of variance  Analysis of Variance and the.
Testing Hypotheses about Differences among Several Means.
Repeated Measurements Analysis. Repeated Measures Analysis of Variance Situations in which biologists would make repeated measurements on same individual.
1 Chapter 1: Introduction to Design of Experiments 1.1 Review of Basic Statistical Concepts (Optional) 1.2 Introduction to Experimental Design 1.3 Completely.
Chapter coverage Part A Part A –1: Practical tools –2: Consulting –3: Design Principles Part B (4-6) One-way ANOVA Part B (4-6) One-way ANOVA Part C (7-9)
Chris Brien University of South Australia Bronwyn Harch Ray Correll CSIRO Mathematical and Information Sciences Design and analysis.
VI. Regression Analysis A. Simple Linear Regression 1. Scatter Plots Regression analysis is best taught via an example. Pencil lead is a ceramic material.
Supplementary PPT File for More detail explanation on SPSS Anova Results PY Cheng Nov., 2015.
Statistics for Differential Expression Naomi Altman Oct. 06.
1 Overview of Experimental Design. 2 3 Examples of Experimental Designs.
Intermediate Applied Statistics STAT 460 Lecture 18, 11/10/2004 Instructor: Aleksandra (Seša) Slavković TA: Wang Yu
Inferential Statistics Introduction. If both variables are categorical, build tables... Convention: Each value of the independent (causal) variable has.
1 Module One: Measurements and Uncertainties No measurement can perfectly determine the value of the quantity being measured. The uncertainty of a measurement.
October 16, 2002Experimental Design1 Principles of Experimental Design October 16, 2002 Mark Conaway.
The Mixed Effects Model - Introduction In many situations, one of the factors of interest will have its levels chosen because they are of specific interest.
HYPOTHESIS TESTING FOR DIFFERENCES BETWEEN MEANS AND BETWEEN PROPORTIONS.
THE SCIENTIFIC METHOD: It’s the method you use to study a question scientifically.
Designs for Experiments with More Than One Factor When the experimenter is interested in the effect of multiple factors on a response a factorial design.
WELCOME TO BIOSTATISTICS! WELCOME TO BIOSTATISTICS! Course content.
Copyright ©2011 Brooks/Cole, Cengage Learning Gathering Useful Data for Examining Relationships Observation VS Experiment Chapter 6 1.
CHAPTER 4 Designing Studies
Randomizing and checking standard and multiphase designs using the R package dae Chris Brien Phenomics and Bioinformatics Research Centre, University.
i) Two way ANOVA without replication
Topics Randomized complete block design (RCBD) Latin square designs
CHAPTER 4 Designing Studies
Chapter 10 – Part II Analysis of Variance
Multiple Regression Berlin Chen
DOE Terminologies IE-432.
Presentation transcript:

Principles in the design of multiphase experiments with a later laboratory phase: orthogonal designs Chris Brien 1, Bronwyn Harch 2, Ray Correll 2 & Rosemary Bailey 3 1 University of South Australia, 2 CSIRO Mathematics, Informatics & Statistics, 3 Queen Mary University of London

Outline 1.Primary experimental design principles. 2.Factor-allocation description for standard designs. 3.Single set description. 4.Principles for simple multiphase experiments. 5.Principles leading to complications, even with orthogonality. 6.Summary. 2

1)Primary experimental design principles Principle 1 (Evaluate designs with skeleton ANOVA tables)  Use whether or not data to be analyzed by ANOVA. Principle 2 (Fundamentals): Use randomization, replication and blocking (local control). Principle 3 (Minimize variance): Block entities to form new entities, within new entities being more homogeneous; assign treatments to least variable entity-type. Principle 4 (Split units): confound some treatment sources with more variable sources if some treatment factors: i. require larger units than others, ii. are expected to have a larger effect, or iii. are of less interest than others. 3

A standard athlete training example 9 training conditions — combinations of 3 surfaces and 3 intensities of training — to be investigated. Assume the prime interest is in surface differences  intensities are only included to observe the surfaces over a range of intensities. Testing is to be conducted over 4 Months:  In each month, 3 endurance athletes are to be recruited.  Each athlete will undergo 3 tests, separated by 7 days, under 3 different training conditions. On completion of each test, the heart rate of the athlete will be measured. Randomize 3 intensities to 3 athletes in a month and 3 surfaces to 3 tests in an athlete.  A split-unit design, employing Principles 2, 3 and 4(iii). 4 Peeling et al. (2009)

2) Factor-allocation description for standard designs Standard designs involve a single allocation in which a set of treatments is assigned to a set of units:  treatments are whatever are allocated;  units are what treatments are allocated to;  treatments and units each referred to as a set of objects; Often do by randomization using a permutation of the units.  More generally treatments are allocated to units e.g. using a spatial design or systematically Each set of objects is indexed by a set of factors:  Unit or unallocated factors (indexing units);  Treatment or allocated factors (indexing treatments). Represent the allocation using factor-allocation diagrams that have a panel for each set of objects with:  a list of the factors; their numbers of levels; their nesting relationships. 5 (Nelder, 1965; Brien, 1983; Brien & Bailey, 2006)

Factor-allocation diagram for the standard athlete training experiment 6 One allocation (randomization):  a set of training conditions to a set of tests. 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests The set of factors belonging to a set of objects forms a tier:  they have the same status in the allocation (randomization):  {Intensities, Surfaces} or {Months, Athletes, Tests}  Textbook experiments are two-tiered. A crucial feature is that diagram automatically shows EU and restrictions on randomization/allocation.

Some derived items Sets of generalized factors (terms in the mixed model):  Months, Months  Athletes, Months  Athletes  Tests;  Intensities, Surfaces, Intensities  Surfaces. Corresponding types of entities (groupings of objects):  month, athlete, test (last two are Eus);  intensity, surface, training condition (intensity-surface combination). Corresponding sources (in an ANOVA):  Months, Athletes[M], Tests[M  A];  Intensities, Surfaces, Intensities#Surfaces. 7 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests

Skeleton ANOVA Intensities is confounded with the more-variable Athletes[M] & Surfaces with Tests[M^A]. 8 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests

Mixed model Take generalized factors derived from factor-allocation diagram and assign to either fixed or random model: Intensities + Surfaces + Intensities  Surfaces | Months + Months  Athletes + Months  Athletes  Tests Corresponds to the mixed model: Y = X I  I + X S  S + X I  S  I  S + Z M u M + Z M  A u M  A + Z M  A  T u M  A  T. where the Xs and Zs are indicator variable matrices for the generalized factors in its subscript, and  s and us are fixed and random parameters, respectively, with 9 This is an ANOVA model, equivalent to the randomization model, and is also written: Y = X I  I + X S  S + X I  S  I  S + Z M u M + Z M  A u M  A + e. To fit in SAS must set the DDFM option of MODEL to KENWARDROGER. Brien & Demétrio (2009).

3)Single-set description Single set of factors that uniquely indexes observations:  {Months, Intensities, Surfaces} (Athletes and Tests omitted). What are the EUs in the single-set approach?  A set of units that are indexed by Months-Intensities combinations and another set by the Months-Intensities-Surfaces combinations.  Of course, Months-Intensities(-Surfaces) are not actual EUs, as Intensities (Surfaces) are not randomized to those combinations.  They act as a proxy for the unnamed units. Mixed model is: I + S + I  S | M + M  I + M  I  S.  Previous model: I + S + I  S | M + M  A + M  A  T.  Former more economical as A and T not needed.  In SAS, default DDFM option of MODEL ( CONTAIN ) works. However, M  A and M  I are different sources of variability:  inherent variability vs block-treatment interaction. This "trick" is confusing, false economy and not always possible. 10 e.g. Searle, Casella & McCulloch (1992); Littel et al. (2006).

Single-set description ANOVA Single set of factors that uniquely indexes observations:  {Months, Intensities, Surfaces} Use factors to derive skeleton ANOVA 11 Confounding not exhibited, and need E[MSq] to see that M#I is denominator for Intensities.

Summary of factor-allocation versus single-set description Factor-allocation description is based on the tiers and so is multi-set. It has a specific factor for the EUs and so their identity not obscured. Single-set description factors are a subset of those identified in the factor-allocation description and so more economical. Skeleton ANOVA from factor-allocation description shows:  The confounding resulting from the allocation;  The origin of the sources of variation more accurately (Athletes[Months] versus Months#Intensities). 12

4)Principles for simple multiphase experiments Suppose in the athlete training experiment:  in addition to heart rate taken immediately upon completion of a test,  the free haemoglobin is to be measured using blood specimens taken from the athletes after each test, and  the specimens are transported to the laboratory for analysis. The experiment is two phase: testing and laboratory phases.  The outcome of the testing phase is heart rate and a blood specimen.  The outcome of the laboratory phase is the free haemoglobin. How to process the specimens from the first phase in the laboratory phase? 13

Some principles Principle 5 (Simplicity desirable): assign first-phase units to laboratory units so that each first-phase source is confounded with a single laboratory source.  Use composed randomizations with an orthogonal design. Principle 6 (Preplan all): if possible. Principle 7 (Allocate all and randomize in laboratory): always allocate all treatment and unit factors and randomize first-phase units and lab treatments. Principle 8 (Big with big): Confound big first-phase sources with big laboratory sources, provided no confounding of treatment with first-phase sources. 14

A simple two-phase athlete training experiment Simplest is to randomize specimens from a test to locations (in time or space) during the laboratory phase. 15 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests 36Locations 36 locations Composed randomizations

A simple two-phase athlete training experiment (cont’d) No. tests = no. locations = 36 and so tests sources exhaust the locations source. Cannot separately estimate locations and tests variability, but can estimate their sum. But do not want to hold blood specimens for 4 months. 16

A simple two-phase athlete training experiment (cont’d) Simplest is to align lab-phase and first-phase blocking. 17 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests 4Batches 9Locations in B 36 locations Note Months confounded with Batches (i.e. Big with Big). Composed randomizations

A simple two-phase athlete training experiment – mixed model Form generalized factors and assign to fixed or random:  I + S + I  S | M + M  A + M  A  T + B + B  L. ANOVA shows us i) there will be aliasing and ii) model without lab terms will fit and be sufficient – a “model of convenience”. 18 3Intensities 3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M, A 36 tests 4Batches 9Locations in B 36 locations

The multiphase law DF for sources from a previous phase can never be increased as a result of the laboratory-phase design. However, it is possible that first-phase sources are split into two or more sources, each with fewer degrees of freedom than the original source. 19 DF for first phase sources unaffected.

Factor-allocation in multiphase experiments While multiphase experiments will often involve multiple allocations, not always:  A two-phase experiment will not if the first phase involves a survey i.e. no allocation e.g. tissues sampled from animals of different sexes.  A two-phase experiment may include more than two allocations: e.g. a grazing trial in the first phase that involves two composed randomizations. Factor-allocation description is particularly helpful in understanding multiphase experiments with multiple allocations. 20

5)Principles leading to complications, even with orthogonality Principle 9 (Use pseudofactors):  An elegant way to split sources (as opposed to introducing grouping factors unconnected to real sources of variability). Principle 10 (Compensating across phases):  Sometimes, if something is confounded with more variable first- phase source, can confound with less variable lab source. Principle 11 (Laboratory replication):  Replicate laboratory analysis of first-phase units if lab variability much greater than 1 st -phase variation;  Often involves splitting product from the first phase into portions (e.g. batches of harvested crop, wines, blood specimens into aliquots, drops, lots, samples and fractions). Principle 12 (Laboratory treatments):  Sometimes treatments are introduced in the laboratory phase and this involves extra randomization. 21

A replicated two-phase athlete training experiment Suppose duplicates of free haemoglobin to be done:  2 fractions taken from each specimen;  One fraction taken from 9 specimens for the month and analyzed;  Then, the other fraction from 9 specimens analyzed. 22 3Intensities 3Surfaces 9 training conditions 4Months 2Fractions in M, A, T 3Athletes in M 3Tests in M, A 72 fractions 4Batches 2Rounds in B 9Locations in B, R 72 locations 2F 1 in M Problem: 18 fractions in a month to assign to 2 rounds in a batch:  Use F 1 to group the 9 fractions to be analyzed in the same round.  An alternative is to introduce the grouping factor FGroups,  but, while in the analysis, not an anticipated variability source.

A replicated two-phase athlete training experiment (cont’d) 23 3Intensities 3Surfaces 9 training conditions 4Months 2Fractions in M, A, T 3Athletes in M 3Tests in M, A 72 fractions 2F 1 in M Note split source 4Batches 2Rounds in B 9Locations in B, R 72 locations

A replicated two-phase athlete training experiment (cont’d) 24 Last line measures laboratory variation. Again, no. fractions = no. locations = 72 and so fractions sources exhaust locations sources.  Consequently, not all terms could be included in a mixed model.  Pseudofactors not needed in mixed model (cf. grouping factors).

A replicated two-phase athlete training experiment – mixed model 25 Form generalized factors and assign to fixed or random:  I + S + I  S | M + M  A + M  A  T + M  A  T  F + B + B  R + B  R  L. Removing aliased terms gives one “model of convenience”:  I + S + I  S | M  A + M  A  T + B + B  R + B  R  L (M, M  A  T  F omitted).  Another model would result from removing B and B  R  L (need B  R).

Single-set description for example Single set of factors that uniquely indexes observations:  {4 Months x 2 Rounds x 3 Intensities x 3 Surfaces}. 26 3Intensities 3Surfaces 9 training conditions 4Months 2Fractions in M, A, T 3Athletes in M 3Tests in M, A 72 fractions 2F 1 in M 4Batches 2Rounds in B 9Locations in B, R 72 locations How do:  the locations sources;  Fractions; affect the response? Similar to model of convenience

27 6)Summary Factor-allocation, rather than single-set, description used, being more informative and particularly helpful in multiphase experiments. Multiphase experiments usually have multiple allocations. Have provided 4 standard principles and 8 principles specific to orthogonal, multiphase designs. In practice, will be important to have some idea of likely sources of laboratory variation. Are laboratory treatments to be incorporated? Will laboratory replicates be necessary?

28 References Brien, C. J. (1983). Analysis of variance tables based on experimental structure. Biometrics, 39, Brien, C.J., and Bailey, R.A. (2006) Multiple randomizations (with discussion). J. Roy. Statist. Soc., Ser. B, 68, 571–609. Brien, C.J. and Demétrio, C.G.B. (2009) Formulating mixed models for experiments, including longitudinal experiments. J. Agr. Biol. Env. Stat., 14, Brien, C.J., Harch, B.D., Correll, R.L. and Bailey, R.A. (2010) Multiphase experiments with laboratory phases subsequent to the initial phase. I. Orthogonal designs. Journal of Agricultural, Biological and Environmental Statistics, submitted. Littell, R. C., G. A. Milliken, et al. (2006). SAS for Mixed Models. Cary, N.C., SAS Press. Nelder, J. A. (1965). The analysis of randomized experiments with orthogonal block structure. Proceedings of the Royal Society of London, Series A, 283(1393), , Peeling, P., B. Dawson, et al. (2009). Training Surface and Intensity: Inflammation, Hemolysis, and Hepcidin Expression. Medicine & Science in Sports & Exercise, 41, Searle, S. R., G. Casella, et al. (1992). Variance components. New York, Wiley.

29 Web address for link to Multitiered experiments site