Design and Analysis of Microarray Experiments at CSIRO Livestock Industries Toni Reverter Bioinformatics Group CSIRO Livestock Industries Queensland Bioscience.


1 Design and Analysis of Microarray Experiments at CSIRO Livestock Industries Toni Reverter Bioinformatics Group CSIRO Livestock Industries Queensland Bioscience Precinct 306 Carmody Rd., St. Lucia, QLD 4067, Australia SSAI – QLD Branch – 6 Apr. 2004

2 CONTENTS (number of slides, minutes)
1. Introduction: 4 slides, 6 minutes
2. Technical Concerns: 2 slides, 7 minutes
3. Designs: 21 slides, 15 minutes
4. Analysis: 14 slides, 16 minutes
5. Coverage and Sensitivity: 5 slides, 7 minutes
6. Summary: 2 slides, 4 minutes

3 1. Introduction: 1.a – The Material
This is a Cow. This is a Sheep. This is a Pig (female). This is a Chicken.

4 1. Introduction: 1.b - The Method
Tissue Samples (Treat A, Treat B) -> mRNA Extraction & Amplification -> cDNA “A” (Cy5), cDNA “B” (Cy3) -> Hybridization -> Laser 1, Laser 2 -> Optical Scanner + Image Capture -> Analysis

5 1. Introduction: 1.c - The Challenge
Time Dependent (Chronology, Logical): 1800s – DATA; 30-60s – METHODS; 50-70s – SOFTWARE; 1980s – COMPUTER. cDNA ⇐
Human Dependent (Skill, Integration): Quantitative (Computer Sci., Statisticians, Mathematicians, …) and Non-Q (Biochemists, Physiologists, Pathologists, …). BANANA, EGG: “banana omelette”
Data Dependent (Paradigm, Distribution): Source, Size
Historical. Excitement. Balance. Interdisciplinary.

6 1. Introduction: 1.c – Human-Dependent Challenge
JOKE: The Biologist and the Statistician are being executed. They are both granted one last request. The Statistician asks that he/she be allowed to give one final lecture on his/her Grand Theory of Statistics. The Biologist asks that he/she be executed first.
CLAIM: “The majority of microarray papers are analysed with substandard methods” C Tilstone (citing D Allison), Nature 2003, 424:610
REASONS (P Value):
1. Biologists don’t care: 10
2. Statisticians are bad: 20
3. Unrealistic expectations: 70

7 2. Technical Concerns
1. Biochemist Level:
a. Preparation (Printing) of the Chip
b. RNA Extraction, Amplification and Hybridisation
c. Optical Scanner (Reading)
2. Quantitative Level:
a. Design
b. Image (data) Quality
c. Data Analysis
d. Data Storage
Replication: 1. Animal 2. Sample 3. Array 4. Spot
Note: Randomisation intentionally neglected.

8 2. Technical Concerns
2.a – Data Quality: GP3xCLI
2.b – Storage: GEXEX

9 3. Experimental Designs
Key Issues:
a. Identify/Prioritise Questions
b. N of Available Samples
c. N of Available Arrays
d. Consider Dye Bias
Put more arrays on key questions.
Pooling? Dye-Swap, Dye-Balancing, Self-Self.
Evaluation of Designs (samples O, A, B, AB): Reference, Loop, All-Pairs.
Variance of Estimated Effects (Relative to the All-Pairs), for Main effect of A, Main effect of B, Interaction AB, Contrast A-B:
Reference: 1 3 2
Loop: 4/3 1 8/3 1
All-Pairs: 1 2 1

10 3. Experimental Designs
Glonek & Solomon, Factorial and Time Course Designs for cDNA Microarray Experiments.
Definition: A design with a total of n slides and design matrix X is said to be admissible if there exists no other design with n slides and design matrix X* such that $c_i^* \le c_i$ for all i, with strict inequality for at least one i, where $c_i^*$ and $c_i$ are respectively the diagonal elements of $(X^{*\prime}X^*)^{-1}$ and $(X^{\prime}X)^{-1}$.
Samples vs Slides vs Configurations:
Samples (S): 3, 4, 12
Arrays, from (S-1): 2, 3, 11
Arrays, to S(S-1): 6, 12, 132
N of Configurations?
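The admissibility definition can be checked numerically for small designs. The following is a minimal sketch, not the authors' procedure: it assumes each array contributes one log-ratio (Cy5 sample minus Cy3 sample), that sample 0 is the baseline whose column is dropped from X, and that variances are in units of the error variance. The function names (`design_matrix`, `effect_variances`, `dominates`) are hypothetical.

```python
from fractions import Fraction


def design_matrix(arrays, n_samples):
    """One row per array: +1 for the Cy5 sample, -1 for the Cy3 sample.

    Sample 0 is treated as the baseline, so its column is dropped.
    """
    rows = []
    for red, green in arrays:
        row = [Fraction(0)] * n_samples
        row[red] += 1
        row[green] -= 1
        rows.append(row[1:])
    return rows


def inverse(m):
    """Gauss-Jordan inverse with exact fractions; ValueError if singular."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("X'X is singular: the design is not connected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def effect_variances(arrays, n_samples):
    """Diagonal elements c_i of (X'X)^-1 for the given list of arrays."""
    x = design_matrix(arrays, n_samples)
    p = n_samples - 1
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    inv = inverse(xtx)
    return [inv[i][i] for i in range(p)]


def dominates(c_star, c):
    """True if c_star <= c everywhere and strictly smaller at least once."""
    return (all(a <= b for a, b in zip(c_star, c))
            and any(a < b for a, b in zip(c_star, c)))


O, A, B = 0, 1, 2
loop = [(A, O), (B, A), (O, B)]       # three arrays closing a loop
star_dup = [(A, O), (B, O), (A, O)]   # both vs O, A-O replicated
path_dup = [(A, O), (B, A), (A, O)]   # B only reaches O through A
```

With three samples and three slides, `path_dup` is dominated by `star_dup`, so it is not admissible under this parameterisation, while `loop` and `star_dup` do not dominate each other.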

11 3. Experimental Designs
N of Configurations? $S^{A-1}$

12 3. Experimental Designs
Pie-Bald black Non-Pie-Bald black Normal White Recessive
N of Configurations? $S^{A-1} = 5^3 = 125$

13 3. Experimental Designs
x5

14 3. Experimental Designs
0 hr, 24 hr
N of Configurations? $S^{A-1} = 10^9$ = 1 Billion!

15 3. Experimental Designs
Opt 1: 10 Slides; Opt 2: 10 Slides; Opt 3: 11 Slides; Opt 4: 9 Slides; Opt 5: 9 Slides
Transitivity (Townsend, 2003) & Extendability (Kerr, 2003)

16 3. Experimental Designs
0 hr, 24 hr
N of Configurations? $S^{A-1} = 12^{10}$ = 62 Billion!
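The counts quoted on these slides can be reproduced with a few lines of arithmetic. This is a sketch that takes the slides' formulas at face value: S-1 to S(S-1) arrays for S samples, and S^(A-1) configurations for A arrays. The function names are hypothetical, and the values of A used in the examples are inferred from the exponents shown on the slides.

```python
def array_bounds(samples):
    """Return the (from, to) number of arrays for S samples: S-1 and S(S-1)."""
    return samples - 1, samples * (samples - 1)


def n_configurations(samples, arrays):
    """Number of configurations as written on the slides: S ** (A - 1)."""
    return samples ** (arrays - 1)


if __name__ == "__main__":
    # (S, A) pairs inferred from the exponents 5^3, 10^9 and 12^10.
    for s, a in [(5, 4), (10, 10), (12, 11)]:
        low, high = array_bounds(s)
        print(f"S={s}, A={a}: arrays {low} to {high}, "
              f"configurations {n_configurations(s, a):,}")
```

The exact value of 12^10 is 61,917,364,224, which the slide rounds to 62 billion.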

17 3. Experimental Designs
0 hr, 24 hr (diagram: R and G labels on each array)
N of Configurations?

18 3. Experimental Designs
Handling Constraints (Samples & Arrays):
Pavlidis et al. (2003) The effect of replication on gene expression microarray experiments. Bioinformatics 19:1620. >= 5 Replicates; 10-15 Replicates.
Peng et al. (2003) Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics 4:26. Power: n9c9 ≈ 95%, n3c3 ≈ 50%, n9c3 ≈ 90%; n25c5 ≈ n20c20

19 3. Experimental Designs
N of Arrays? 24: 23 To 552. With pooling, 14: 13 To 182.
(Diagram: samples labelled F or M and HS or TM, with R and G labels on each array.)

20 3. Experimental Designs
Reference Design
Columns: RES SUS 0 3 24 M F HS TM
RES: 8-810-1.7661.766-3.8663.866
SUS: 8011.766-1.7663.866-3.866
0: 8-4 -1.3351.3350.666-0.666
3: 10-6-1.0331.033-0.4680.468
24: 102.368-2.368-0.1980.198
M: 6.247-6.2470.493-0.493
F: 6.247-0.4930.493
HS: 3.798-3.798
TM: 3.798
Sum(ABS): 29.3 22.023.027.121.7 17.6
Sum(ABS): 26.8 26.8 39.1 23.1 17.3 7.1 7.1 14.3 14.3

21 3. Experimental Designs
Another (NEW?) Constraint:
A: M avium slope, 18 days: 3 (3-3-3)
B: M avium broth, 18 days: 10 (1-2-2-1-2-1-2-1-2-1)
C: M para broth, 10 weeks: 5 (1-2-2-1-1)
D: M para broth, 12 weeks: 6 (1-1-4-5-2-1)
E: M para in-vivo: 3 (1-1-1)

22 3. Experimental Designs
Another (NEW?) Constraint:
Pairwise comparisons: AB, AC, AD, AE, BC, BD, BE, CD, CE, DE (importance marks per comparison shown on the slide).
Importance due to Transitivity of AB with BC and BD.
Procedure: Five configurations will be proposed and the statistical optimality of each evaluated.

23 (Diagram: sample groups with counts 3-3-3; 1-2-2-1-2-1-2-1-2-1; 1-2-2-1-1; 1-1-4-5-2-1; 1-1-1)

24 Configuration 1 (diagram)

25 Configuration 2 (diagram)

26 Configuration 3 (diagram)

27 Configuration 4 (diagram)

28 Configuration 5 (diagram)

29 3. Experimental Designs

Pair         Imp Weight   Configuration 1 2 3 4 5   Squared Error 1 2 3 4 5
AB           4            6 5 6 6 5                 4 1 4 4 1
AC           2            0 2 1 0 0                 4 0 1 4 4
AD           2            3 2 2 3 4                 1 0 0 1 4
AE           1            0 0 0 0 0                 1 1 1 1 1
BC           3            5 5 4 4 5                 4 4 1 1 4
BD           4            4 5 5 5 5                 0 1 1 1 1
BE           1            0 0 0 0 0                 1 1 1 1 1
CD           2            2 0 2 3 2                 0 4 0 1 0
CE           1            0 0 0 0 0                 1 1 1 1 1
DE           4            3 3 3 3 3                 1 1 1 1 1
DD (Noise)   0            1 2 1 0 0
SSE                                                 17 14 11 16 18
MSE                                                 .74 .64 .48 .66 .75

Conclusion: Configuration 3
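The scoring behind this conclusion can be reproduced from the numbers on the slide. The sketch below rests on an interpretation rather than on anything the slide states: each squared error is read as (arrays allocated minus importance weight) squared, and MSE as SSE divided by the number of arrays allocated to the ten comparisons, excluding the D-D noise arrays. That reading reproduces the slide's SSE row exactly and its MSE row to within rounding. All identifiers are hypothetical.

```python
# Importance weight per pairwise comparison (values from the slide).
IMPORTANCE = {
    "AB": 4, "AC": 2, "AD": 2, "AE": 1, "BC": 3,
    "BD": 4, "BE": 1, "CD": 2, "CE": 1, "DE": 4,
}

# Arrays allocated to each comparison under Configurations 1 to 5.
ARRAYS = {
    "AB": (6, 5, 6, 6, 5),
    "AC": (0, 2, 1, 0, 0),
    "AD": (3, 2, 2, 3, 4),
    "AE": (0, 0, 0, 0, 0),
    "BC": (5, 5, 4, 4, 5),
    "BD": (4, 5, 5, 5, 5),
    "BE": (0, 0, 0, 0, 0),
    "CD": (2, 0, 2, 3, 2),
    "CE": (0, 0, 0, 0, 0),
    "DE": (3, 3, 3, 3, 3),
}


def score_configurations(importance, arrays):
    """Return (SSE, MSE) lists with one entry per configuration.

    Assumption: squared error = (arrays - weight) ** 2, and the MSE
    divisor is the number of arrays spent on these comparisons.
    """
    n_cfg = len(next(iter(arrays.values())))
    sse = [0] * n_cfg
    used = [0] * n_cfg
    for pair, weight in importance.items():
        for k, n in enumerate(arrays[pair]):
            sse[k] += (n - weight) ** 2
            used[k] += n
    mse = [s / u for s, u in zip(sse, used)]
    return sse, mse


SSE, MSE = score_configurations(IMPORTANCE, ARRAYS)
# Configuration numbers are 1-based on the slide.
BEST = min(range(len(MSE)), key=MSE.__getitem__) + 1
```

Under this reading the lowest MSE belongs to Configuration 3, in agreement with the slide.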

30 4. Data Analysis
My (EDUCATED?) View:
1. Relaxed data acquisition criteria
a. Signal to Noise > 1.00 (more relaxed thresholds exist)
b. Mean to Median > 0.85 (Tran et al. 2002)
2. Moving away from
a. Ratios
b. “heavy-duty” normalisation techniques
3. Mixed-Model Equations
a. Check residuals
b. Check REML estimates of Variance Components
c. Proportion of Total V due to Gene x Variety
4. Process results Gene x Treatment
a. Mixtures of Distributions
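The two acquisition criteria in item 1 amount to a simple per-spot filter applied before log2 intensities (rather than ratios) are analysed. The following is an illustrative sketch only: the `Spot` record and its field names are hypothetical, and the signal-to-noise and mean-to-median values are assumed to have been computed already by the image-analysis software.

```python
import math
from dataclasses import dataclass


@dataclass
class Spot:
    gene_id: str
    intensity: float         # foreground signal for one channel (hypothetical field)
    signal_to_noise: float   # S2N as reported by the scanner software
    mean_to_median: float    # M2M as reported by the scanner software


def passes_quality(spot, min_s2n=1.00, min_m2m=0.85):
    """Apply the relaxed criteria from the slide: S2N > 1.00 and M2M > 0.85."""
    return spot.signal_to_noise > min_s2n and spot.mean_to_median > min_m2m


def valid_log2_intensities(spots):
    """Keep spots that pass both criteria and return (gene_id, log2 intensity)."""
    return [
        (s.gene_id, math.log2(s.intensity))
        for s in spots
        if passes_quality(s) and s.intensity > 0
    ]


spots = [
    Spot("g1", 1024.0, 2.5, 0.95),   # passes both criteria
    Spot("g2", 512.0, 0.8, 0.99),    # fails signal to noise
    Spot("g3", 256.0, 3.0, 0.80),    # fails mean to median
]
records = valid_log2_intensities(spots)
```

Both thresholds are keyword arguments so that stricter or more relaxed values can be tried without changing the filter itself.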

31 4. Data Analysis
Mixed-Model Equations
Log2 Intensities:
Comparison Group: Array|Block|Dye (FIXED)
Main Gene Effect (RANDOM)
Gene x Array|Block (RANDOM)
Gene x Dye (RANDOM)
Gene x Variety (RANDOM)
Residual (RANDOM)
DE Genes
Note: missing but (generally) unimportant.

32 4. Data Analysis
Mixed-Model Equations
Log2(Int.) = CG + Gene + G × Dye + G × Array + G × Variety + Error
CLAIM: The proportion of the Total Variation accounted for by the G x Variety Interaction anticipates the proportion of DE Genes.
Control of FDR
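Written out with explicit indices, the model on this slide might look as follows. This is a sketch under assumptions: the index letters, the normality and independence of the random effects, and the exact set of components entering the total variance are not given on the slide.

```latex
% c: comparison group, g: gene, d: dye, a: array, v: variety, r: replicate record
\log_2(y_{cgdavr}) = CG_c + G_g + (GD)_{gd} + (GA)_{ga} + (GV)_{gv} + e_{cgdavr}

% CG fixed; all other terms random, e.g. G_g \sim N(0, \sigma^2_G), e \sim N(0, \sigma^2_e)

% Proportion of total variation attributed to Gene x Variety
p_{GV} = \frac{\sigma^2_{GV}}
              {\sigma^2_{G} + \sigma^2_{GD} + \sigma^2_{GA} + \sigma^2_{GV} + \sigma^2_{e}}
```

Under the slide's claim, $p_{GV}$ estimated from the REML variance components would serve as a rough prior expectation for the fraction of genes that are differentially expressed.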

33 4. Data Analysis
Observations (N, Mean, SD, Min, Max) and Comparison Groups (Levels; Observations per level: Mean, Min, Max):

       N         Mean    SD     Min    Max     Levels   Mean    Min   Max
Y11    197,802   9.33    1.99   5.17   15.99   768      257.5   139   343
Y12    74,030    10.82   1.91   4.95   15.99   576      128.5   22    243
Y21    110,308   9.99    2.07   4.25   15.99   576      191.5   27    319
Y22    116,409   9.89    2.09   5.17   15.99   576      202.1   19    318
Y23    117,687   10.38   2.04   4.91   15.99   576      204.3   36    320
Y31    106,591   10.11   1.77   6.60   15.99   672      158.6   37    278
Y32    236,671   9.44    2.11   5.36   15.99   1,440    164.3   57    269

34 4. Data Analysis
54 Array Slides
959,498 Valid Intensity Records (S2N>1, M2M>0.85)
7,638 Elements (genes)
752,476 Equations
56 (Co)Variance Components (REML)
BAYESMIX (Bayesian Mixtures of distributions)

35 4. Data Analysis
56 (Co)Variance Components

36 4. Data Analysis
% Total Variance Due to:
Error: 3.0 – 3.6; 5.1 – 6.7; 3.0 – 3.7
Gene: 83.6 – 90.4; 78.3 – 81.9; 47.5 – 83.9
Gene x Array: 3.5 – 9.8; 10.4 – 12.6; 10.6 – 43.5
Gene x Variety: 2.4 – 3.7; 2.1 – 2.6; 2.5 – 5.4
Genetic Correlations: Moderate (EXP3) to Strong
Gene × Variety Corr: Strong (EXP1) to Moderate (EXP2)

37 4. Data Analysis
Measures of (Possible) Differential Expression
i = 1, …, 7,638 genes; j = 1, …, 7 variables; t = 0, …, 5 time points (EXP3 only)
Other measure definitions could also be valid

38 4. Data Analysis
Mixtures of Distributions

39 4. Data Analysis
Mixtures of Distributions

40 4. Data Analysis
Differentially Expressed Genes

                  Exp1          Exp2          Exp3
                  Up     Down   Up     Down   Up     Down
High-Low   Up     409    0      26     13     36     11
           Down          41     3      0      5      0
HOL-JBL    Up                   68     0      0      8
           Down                        319    10     6
TSS-UTS    Up                                 252    0
           Down                                      109

10 DE Elements across the 3 Exp (2 UP/DOWN/UP; 8 UP/UP/DOWN)

41 4. Data Analysis
Residuals Plots

42 4. Data Analysis
Allocation of 238 DE Genes (diagram: Bovine and Ovine; Up-Regulated and Down-Regulated; Homologs, Orthologs, Paralogs; 178 @ Day 82, 139 @ Day 120, 114 @ Day 105, 171 @ Inguinal)

43 4. Data Analysis
The “Real” Target: Molecular Interaction Maps
Adapted from Aladjem et al. 2004, Science’s STKE

44 5. Coverage and Sensitivity
Sources: MPSS Paper (PNAS 03, 100:4702); MPSS Test Data, samples S1 and S2 (No Tags = 25,503); cDNA Noise Paper (PNAS 02, 99:14031)

tpm (log10)      MPSS Paper N Tags   MPSS Paper %   Test S1 %   Test S2 %   cDNA Noise %
> 1 (0.0)        27,965              100.00         100.00      100.00      100.00
5 (0.7)          15,145              54.16          57.14       49.87       56.19
10 (1.0)         10,519              37.61          36.11       33.66       36.79
50 (1.7)         3,261               11.66          10.89       10.74       11.76
100 (2.0)        1,719               6.15           5.73        5.67        6.95
500 (2.7)        298                 1.07           1.21        1.13        1.94
1,000 (3.0)      154                 0.55           0.57        0.55        1.11
5,000 (3.7)      26                  0.09           0.15        0.11        0.29
10,000 (4.0)     7                   0.02           0.05        0.05        0.16
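The percentage column for the MPSS paper can be recomputed from the tag counts. The sketch below assumes that each row counts tags above the given tpm threshold, that the percentage is taken relative to the count at the lowest threshold, and that the bracketed figure is log10 of the threshold; the names are hypothetical. Recomputed values can differ from the slide in the last digit because of rounding.

```python
import math

# Tags above each tpm threshold, from the MPSS paper columns of the slide.
TAG_COUNTS = {
    1: 27965, 5: 15145, 10: 10519, 50: 3261, 100: 1719,
    500: 298, 1000: 154, 5000: 26, 10000: 7,
}


def coverage_rows(tag_counts):
    """Return (tpm, log10 tpm, n_tags, percent of all tags) per threshold."""
    total = tag_counts[min(tag_counts)]  # count at the lowest threshold
    rows = []
    for tpm in sorted(tag_counts):
        n = tag_counts[tpm]
        rows.append((tpm, round(math.log10(tpm), 1), n,
                     round(100.0 * n / total, 2)))
    return rows


if __name__ == "__main__":
    for tpm, log_tpm, n, pct in coverage_rows(TAG_COUNTS):
        print(f"{tpm:>6} ({log_tpm:.1f}) {n:>7,} {pct:>7.2f}")
```

The same function could be applied to the two test samples or to the cDNA data if their underlying counts were available; the slide gives only percentages for those.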

45 5. Coverage and Sensitivity

46 5. Coverage and Sensitivity
Let $N_T$ = N of “Total” Genes and $N_D$ = N of “Differentially Expressed” Genes ($N_D \le N_T$).
1. The relevance of $f(x_i)$ is limited to the Concentration to Signal mapping.
2. At equilibrium the probability of an error either way is equal.
Flat line (except Upper Bound)
(Plot axes: %, x)

47 5. Coverage and Sensitivity

48 5. Coverage and Sensitivity
α < β: Not many DE genes, High Confidence, Few False +ve
α = β
α > β: Lots of DE genes, High Power, Few False -ve

49 6. Summary
General (i.e. not only CSIRO LI):
1. Still in its infancy (…possibly even embryonic stage)
2. Many decisions have a heuristic rather than a theoretical foundation
3. Prone to misconceptions:
a. Amount of Expression = Amount of Response
b. Same cut-off point to judge all genes
c. Over-emphasis on normalization (hence, despise “Boutique Arrays”)
d. Over-emphasis on variance stabilization
e. Over-emphasis on controlling false-positives
f. Over-emphasis on biological replicates (DANGER)
4. No hope for a “One size fits all” software (even method)
5. Safer to aim towards “Tailor to individual’s needs”
6. Integration of interdisciplinary skills is a must

50 6. Summary
Livestock Species:
1. Tailing humans (…at the moment)
a. Andersson & Georges (2004) Domestic-animal genomics: deciphering the genetics of complex traits. Nature Reviews Genetics, March 2004, Vol 5:202-212
2. Several key advantages
a. More relaxed ethical issues (…relative to R&D in humans)
b. Very strong similarities at the genome level with humans
c. The genome is (being) sequenced for several species
3. Strong background knowledge of genetics accumulated
a. Quantitative genetics
b. Mixed-Model equations
c. Computing expertise
4. Journals will soon be inundated
5. We have the opportunity to participate

