Introduction to the design of cDNA microarray experiments Statistics 246, Spring 2002 Week 9, Lecture 1 Yee Hwa Yang.



Some aspects of design
Layout of the array
– Which cDNA sequences to print? Library; controls
– Spatial position
Allocation of samples to the slides
– Different design layouts: A vs B (treatment vs control); multiple treatments; time series; factorial
– Replication: number of hybridizations; use of dye swap in replication; different types of replicates (e.g. pooled vs unpooled material (samples))
– Other considerations: physical limitations (the number of slides and the amount of material); extensibility (linking)

Issues that affect design of array experiments
Scientific
– Aim of the experiment: specific questions and priorities between them.
– How will the experiments answer the questions posed?
Practical (logistic)
– Types of mRNA samples: reference, control, treatment 1, etc.
– Amount of material. Count the amount of mRNA involved in one channel of a hybridization as one unit.
– Number of slides available for the experiment.
Other information
– The experimental process prior to hybridization: sample isolation, mRNA extraction, amplification, labelling.
– Controls planned: positive, negative, ratio, etc.
– Verification method: Northern, RT-PCR, in situ hybridization, etc.

Graphical representation

Natural design choice
Case 1: Meaningful biological control (C)
– Samples: liver tissue from four mice treated with cholesterol-modifying drugs.
– Question 1: Which genes respond differently between treatment (T) and control (C)?
– Question 2: Which genes respond similarly across two or more treatments relative to control?
– [Diagram: T1, T2, T3, T4 and C]
Case 2: Use of a universal reference
– Samples: different tumor samples.
– Question: To discover tumor subtypes.
– [Diagram: T1, T2, ..., Tn-1, Tn and Ref]

Direct vs indirect
Two samples, e.g. KO vs. WT or mutant vs. WT
– Direct (T hybridized against C): average(log(T/C)), variance σ²/2
– Indirect (T and C each hybridized against Ref): log(T/Ref) – log(C/Ref), variance 2σ²
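The two variances in the direct vs indirect comparison (σ²/2 for the direct estimate, 2σ² for the indirect one) can be checked numerically. The sketch below is an illustration, not part of the lecture: it assumes two slides per design, independent slides with a common log-ratio variance σ² = 1, and a hypothetical helper `contrast_variance` that evaluates c'(X'X)⁺c.

```python
import numpy as np

def contrast_variance(X, c):
    """Variance of the least-squares estimate of c'beta, in units of sigma^2.

    Assumes independent slides with equal variance; the pseudo-inverse is used
    because only differences between samples are estimable.
    """
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(c @ np.linalg.pinv(X.T @ X) @ c)

# Columns: log-abundance of T, C, Ref. Each row is one slide (one channel minus the other).
direct = [[1, -1, 0],
          [1, -1, 0]]      # two slides of T vs C
indirect = [[1, 0, -1],
            [0, 1, -1]]    # one slide of T vs Ref, one slide of C vs Ref
t_minus_c = [1, -1, 0]     # the contrast of interest, log(T/C)

var_direct = contrast_variance(direct, t_minus_c)
var_indirect = contrast_variance(indirect, t_minus_c)
print(var_direct, var_indirect)
```

With the same two slides, the direct design is four times as precise for log(T/C) under these assumptions.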

One-way layout: one factor, k levels
All pair-wise comparisons are of equal importance.

                    I) Common reference   II) Common reference   III) Direct comparison
Number of slides    N = 3                 N = 6                  N = 3
Ave. variance       2                                            0.67
Units of material   A = B = C = 1         A = B = C = 2
Ave. variance                             1                      0.67

[Design diagrams: I) and II) A, B, C with reference O; III) A, B, C]
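The average variances for the three one-way designs (2, 1 and 0.67) can be reproduced from the design matrices. This is a minimal sketch under assumptions not stated on the slide: independent slides with variance σ² = 1, design II taken to be design I plus dye-swapped repeats, and design III taken to be the loop A vs B, B vs C, C vs A. The helper names are hypothetical.

```python
import numpy as np

def contrast_variance(X, c):
    """Variance of the least-squares estimate of c'beta, in units of sigma^2."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(c @ np.linalg.pinv(X.T @ X) @ c)

def average_pairwise_variance(X, n_samples):
    """Average variance over all pairwise contrasts among the first n_samples columns."""
    X = np.asarray(X, dtype=float)
    variances = []
    for i in range(n_samples):
        for j in range(i + 1, n_samples):
            c = np.zeros(X.shape[1])
            c[i], c[j] = 1.0, -1.0
            variances.append(contrast_variance(X, c))
    return sum(variances) / len(variances)

# Columns: A, B, C, O (O is the common reference). Each row is one slide.
design_I = [[1, 0, 0, -1], [0, 1, 0, -1], [0, 0, 1, -1]]
# Assumed layout for design II: the same three slides plus dye-swapped repeats.
design_II = design_I + [[-1, 0, 0, 1], [0, -1, 0, 1], [0, 0, -1, 1]]
# Assumed layout for design III: a loop of direct comparisons.
design_III = [[1, -1, 0, 0], [0, 1, -1, 0], [-1, 0, 1, 0]]

avg_I = average_pairwise_variance(design_I, 3)
avg_II = average_pairwise_variance(design_II, 3)
avg_III = average_pairwise_variance(design_III, 3)
print(round(avg_I, 2), round(avg_II, 2), round(avg_III, 2))
```

The sign of a row (which sample gets which dye) does not change these variances, so the dye-swap assumption only matters for bias.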

Dye-swap
[Diagrams: Design B1 and Design B2, each over A, B and C]
– Designs B1 and B2 have the same average variance.
– The direction of the arrows potentially affects the bias of the estimate but not the variance.
– For k = 3, efficiency ratio (Design A1 / Design B) = 3.
– In general, efficiency ratio = 2k / (k – 1).
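One reading of the efficiency ratio 2k / (k – 1), offered here as an assumption rather than as the lecturer's derivation, is that it compares a common-reference design with an all-pairs direct design using the same number of slides, k(k – 1)/2. The sketch below checks that reading numerically for odd k (so that the reference design can be replicated a whole number of times); all function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def contrast_variance(X, c):
    """Variance of the least-squares estimate of c'beta, in units of sigma^2."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(c @ np.linalg.pinv(X.T @ X) @ c)

def all_pairs_design(k):
    """One slide for every pair of the k samples (last column is an unused reference)."""
    rows = []
    for i, j in combinations(range(k), 2):
        row = [0] * (k + 1)
        row[i], row[j] = 1, -1
        rows.append(row)
    return rows

def reference_design(k, reps):
    """Each of the k samples hybridized against the reference (last column) reps times."""
    rows = []
    for _ in range(reps):
        for i in range(k):
            row = [0] * (k + 1)
            row[i], row[k] = 1, -1
            rows.append(row)
    return rows

def efficiency_ratio(k):
    """Variance ratio (reference / direct) for one pairwise contrast, equal slide counts, odd k."""
    reps = (k - 1) // 2                    # k * reps == k * (k - 1) / 2 slides
    c = [1, -1] + [0] * (k - 1)            # sample 1 minus sample 2
    var_ref = contrast_variance(reference_design(k, reps), c)
    var_direct = contrast_variance(all_pairs_design(k), c)
    return var_ref / var_direct

ratio_3 = efficiency_ratio(3)
ratio_5 = efficiency_ratio(5)
print(ratio_3, 2 * 3 / (3 - 1))
print(ratio_5, 2 * 5 / (5 - 1))
```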

Design: how we sliced up the bulb
[Diagram: bulb regions labelled A, P, D, V, M, L]

Multiple direct comparisons between different samples (no common reference)
Different ways of estimating the same contrast, e.g. A compared to P:
– Direct = log(A/P)
– Indirect = log(A/M) + log(M/P), or log(A/D) + log(D/P), or log(A/L) – log(P/L)
How do we combine these?
[Diagram: hybridizations among L, P, V, D, M, A]

Linear model analysis
Define a matrix X so that E(Y) = Xb, where
a = log(A), p = log(P), d = log(D), v = log(V), m = log(M), l = log(L)
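A least-squares fit of E(Y) = Xb is one way to combine the direct and indirect estimates of a contrast such as a – p. The sketch below is illustrative only: the list of hybridizations and the "true" log-intensities are hypothetical (the actual slide layout is not given in this transcript), and the data are generated without noise so the result can be checked by hand.

```python
import numpy as np

regions = ["A", "P", "D", "V", "M", "L"]
index = {name: i for i, name in enumerate(regions)}

# Hypothetical hybridizations (first sample minus second sample on the log scale).
slides = [("A", "P"), ("A", "M"), ("M", "P"), ("A", "D"), ("D", "P"),
          ("A", "L"), ("P", "L"), ("V", "D"), ("L", "V")]

# Design matrix X: one row per slide, +1 for the first sample, -1 for the second.
X = np.zeros((len(slides), len(regions)))
for row, (first, second) in enumerate(slides):
    X[row, index[first]] = 1.0
    X[row, index[second]] = -1.0

# Hypothetical log-intensities for a single gene, used to build noise-free log-ratios Y.
true_b = np.array([2.0, 0.5, 1.0, 1.5, 0.0, -1.0])
y = X @ true_b

# Only differences are estimable, so X has rank 5; lstsq returns the minimum-norm solution.
b_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
a_vs_p = b_hat[index["A"]] - b_hat[index["P"]]
print(rank, a_vs_p)
```

With noisy data the same call weights the direct slide and each indirect path according to the design, which is what answers "How do we combine these?" under an equal-variance assumption.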

Time series
Possible designs:
1) All samples vs a common pooled reference
2) All samples vs time 0
3) Direct hybridization between times
[Diagram: time points T1 to T7 and a pooled reference (Ref); comparisons labelled "compare to T1", "t vs t+1", "t vs t+2", "t vs t+3"]

Design choices in time series
Table columns: t vs t+1 (T1T2, T2T3, T3T4); t vs t+2 (T1T3, T2T4); T1T4; Ave
N = 3
A) T1 as common reference
B) Direct hybridization
N = 4
C) Common reference
D) T1 as common ref + more
E) Direct hybridization choice
F) Direct hybridization choice
[Table: one row per design, one column per comparison; design diagrams over T1 to T4 and Ref]
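For the two three-slide time-series designs, the variance of each pairwise comparison can be worked out from the design matrix. The sketch below is a hedged illustration: it assumes independent slides with variance σ² = 1, takes design A to be T2, T3, T4 each against T1 and design B to be the chain T1-T2, T2-T3, T3-T4, and uses hypothetical helper names. The numbers are computed here and are not copied from the slide's table.

```python
import numpy as np

def contrast_variance(X, c):
    """Variance of the least-squares estimate of c'beta, in units of sigma^2."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(c @ np.linalg.pinv(X.T @ X) @ c)

def pairwise_variances(X):
    """Variance of every comparison Tj - Ti (i < j) for a design over T1..T4."""
    result = {}
    for i in range(4):
        for j in range(i + 1, 4):
            c = np.zeros(4)
            c[j], c[i] = 1.0, -1.0
            result[f"T{i + 1}T{j + 1}"] = contrast_variance(X, c)
    return result

# Columns: T1, T2, T3, T4. Each row is one slide.
design_A = [[-1, 1, 0, 0], [-1, 0, 1, 0], [-1, 0, 0, 1]]   # T1 as common reference
design_B = [[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]]   # direct hybridization, consecutive times

var_A = pairwise_variances(design_A)
var_B = pairwise_variances(design_B)
avg_A = sum(var_A.values()) / len(var_A)
avg_B = sum(var_B.values()) / len(var_B)
print(var_A, avg_A)
print(var_B, avg_B)
```

Under these assumptions the chain design favours consecutive-time comparisons (t vs t+1), while the T1-reference design favours comparisons with T1.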

2 by 2 factorial: two factors, each with two levels
Example 1: Suppose we wish to study the joint effect of two drugs, A and B. 4 possible treatment combinations:
– C: no treatment
– A: drug A only
– B: drug B only
– A.B: both drug A and drug B
Example 2: We are interested in comparing two strains of mice (mutant and wild-type) at two different times, postnatal and adult. 4 possible samples:
– C: WT at postnatal
– A: WT at adult (effect of time only)
– B: MT at postnatal (effect of the mutation only)
– A.B: MT at adult (effect of both time and the mutation)

Factorial design
[Diagram: C = μ, A = μ + a, B = μ + b, AB = μ + a + b + ab]
Different ways of estimating parameters, e.g. the B effect:
1 = (μ + b) – (μ) = b
2 = ((μ + a) – (μ)) – ((μ + a) – (μ + b)) = (a) – (a – b) = b

Factorial design
[Diagram: C = μ, A = μ + a, B = μ + b, AB = μ + a + b + ab]

2 x 2 factorial
Designs I), II), III), IV), grouped as "Indirect" and "A balance of direct and indirect"
# Slides: N = 6
Table rows: Main effect A (one entry NA), Main effect B, Interaction A.B
Table entry: variance
[Design diagrams over C, A, B and A.B]

Linear model analysis
Define a matrix X so that E(Y) = Xb
Use least squares estimates for a, b, ab

Common reference approach
y1 = log(A / C) = a
y2 = log(B / C) = b
y3 = log(AB / C) = a + b + ab
Estimate (ab) with y3 – y2 – y1
[Diagram: A, B and A.B each hybridized against C, giving y1, y2, y3]
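The estimate y3 – y2 – y1 for the interaction is what least squares gives for this three-slide design, and its variance follows from the weights. A minimal sketch, assuming independent slides with variance σ² = 1 (the variable names are illustrative):

```python
import numpy as np

# Rows: y1 = a, y2 = b, y3 = a + b + ab. Columns: a, b, ab.
X = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 1, 1]], dtype=float)

# X is square and invertible, so the least-squares estimate is inv(X) @ y.
# The third row of inv(X) holds the weights on (y1, y2, y3) that estimate ab.
weights = np.linalg.inv(X)[2]

# With independent slides of unit variance, Var(ab_hat) is the sum of squared weights.
var_ab = float(weights @ weights)
print(weights, var_ab)
```

Under these assumptions the interaction is estimated with variance 3σ², three times that of either main effect, because it needs all three slides.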


More general n by m factorial experiment
2 factors, one with n levels and the other with m levels
– OE experiment (2 by 2): interested in differences between zones, between ages, and also in the zone.age interaction.
– Further experiment (2 by 3): only interested in genes where the difference between treatment and controls changes with time.
[Diagram: treatment and control samples]

WT.P1: μ
WT.P11: μ + a1
WT.P21: μ + a1 + a2
MT.P1: μ + b
MT.P11: μ + a1 + b + a1.b
MT.P21: μ + (a1 + a2) + b + (a1 + a2).b

Replication
Why replicate slides:
– Provides a better estimate of the log-ratios
– Essential to estimate the variance of log-ratios
Different types of replicates:
– Technical replicates: within slide vs between slides
– Biological replicates

Sample size Apo A1 Data Set

Technical replication - labelling
3 sets of self-self hybridizations (cerebellum vs cerebellum):
– Data 1 and Data 2 were labeled together and hybridized on two slides separately.
– Data 3 was labeled separately.
[Figure panels: Data 1, Data 2, Data 3]

Technical replication - amplification
Olfactory bulb experiment: 3 sets of Anterior vs Dorsal hybridizations performed on different days
– #10 and #12 were from the same RNA isolation and amplification.
– #12 and #18 were from different dissections and amplifications.
– All 3 data sets were labeled separately before hybridization.

Amplification
[Diagram: original samples T1 and T2, amplified samples T1 and T2; panels "Replicate Design" and "Replicate Design 2"]

Common reference approach
M1 = Lc.WT.P1: μ
M2 = Lc.WT.P11: μ + α1
M3 = Lc.WT.P21: μ + (α1 + α2)
M4 = Lc.MT.P1: μ + β
M5 = Lc.MT.P11: μ + α1 + β + α1*β
M6 = Lc.MT.P21: μ + (α1 + α2) + β + (α1 + α2)*β
Estimate (α1.β) with M5 – M4 – M2 + M1
Estimate (α1 + α2).β with M6 – M4 – M3 + M1
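The two interaction contrasts in the 2 by 3 common-reference layout can be checked by writing each M as a row of coefficients on the model parameters. The sketch below assumes the parametrization M1 = μ (WT.P1), M2 = μ + α1, M3 = μ + α1 + α2, M4 = μ + β, M5 = μ + α1 + β + α1.β, M6 = μ + α1 + α2 + β + (α1 + α2).β, treats the two interaction terms as separate parameters, and assumes the six M values are independent with variance σ² = 1. The Greek letter names are a reconstruction and the variable names are hypothetical.

```python
import numpy as np

params = ["mu", "alpha1", "alpha2", "beta", "alpha1.beta", "(alpha1+alpha2).beta"]

# Expected value of each M in terms of the parameters above (rows M1..M6).
M = np.array([
    [1, 0, 0, 0, 0, 0],   # M1 = Lc.WT.P1
    [1, 1, 0, 0, 0, 0],   # M2 = Lc.WT.P11
    [1, 1, 1, 0, 0, 0],   # M3 = Lc.WT.P21
    [1, 0, 0, 1, 0, 0],   # M4 = Lc.MT.P1
    [1, 1, 0, 1, 1, 0],   # M5 = Lc.MT.P11
    [1, 1, 1, 1, 0, 1],   # M6 = Lc.MT.P21
], dtype=float)

# Weights on (M1, ..., M6) for the two contrasts.
w_p11 = np.array([1, -1, 0, -1, 1, 0], dtype=float)   # M5 - M4 - M2 + M1
w_p21 = np.array([1, 0, -1, -1, 0, 1], dtype=float)   # M6 - M4 - M3 + M1

# Which parameter combination each contrast isolates.
picked_p11 = w_p11 @ M
picked_p21 = w_p21 @ M

# Variance of each contrast if the six M values are independent with unit variance.
var_p11 = float(w_p11 @ w_p11)
var_p21 = float(w_p21 @ w_p21)

print(dict(zip(params, picked_p11)), var_p11)
print(dict(zip(params, picked_p21)), var_p21)
```

Each contrast picks out exactly one interaction term, at a cost of four units of variance under the independence assumption.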