Two-Color Microarrays: Reference Designs and Reference RNAs. Kathleen Kerr Department of Biostatistics University of Washington Collaborators: Kyle Serikawa,

Presentation transcript:

Two-Color Microarrays: Reference Designs and Reference RNAs. Kathleen Kerr Department of Biostatistics University of Washington Collaborators: Kyle Serikawa, Mette Peters, Caimiao Wei, Roger Bumgarner

“Reference Design” “Loop Design”

Advantages: Reference Design
Simple; easy to execute
(Relatively) easy to analyze
–If a “tonsil” RNA is used as the reference RNA in a reference design, then measurements on other RNAs can be considered to be measured in “tonsil” units

What goes here?

Some previous work on reference RNAs
Gorreta et al, Biotechniques; He et al, Biotechniques; Novoradovskaya et al, BMC Genomics.
All assert that a “good” reference RNA gives strong signal for all the genes on the array: most genes are expressed “above background” in the reference.

This assertion is based on the conventional wisdom that signals “near background” are unreliable. (“Unreliable” may overstate the case.)
Consider: the popular methods of data normalization all assume that co-hybridized RNAs are “not too different”.
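To make the "not too different" assumption concrete, here is a minimal sketch of the simplest normalization that relies on it: median centering of log-ratios. This is a stand-in chosen for illustration, not the intensity-based normalization used in the talk; the function name `median_center` and the example values are hypothetical.

```python
import statistics

def median_center(log_ratios):
    """Subtract the array-wide median so the typical gene has log-ratio 0.

    This only makes sense if most genes are NOT differentially expressed
    between the two co-hybridized RNAs (the "not too different" assumption).
    """
    shift = statistics.median(log_ratios)
    return [m - shift for m in log_ratios]

# Hypothetical log2 ratios from one array with a dye bias of about +0.5
raw = [0.4, 0.5, 0.6, 0.5, 2.5, -1.5, 0.5]
normalized = median_center(raw)
```

If the two RNAs on an array differ for most genes, the median no longer estimates the technical bias, and centering on it would remove real biology along with the artifact.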

Method Validation
Ideally, methods should be evaluated in terms of how well they answer a scientific question of interest.
–Analogous to using clinically relevant endpoints in clinical trials rather than surrogates.
The “proportion of spots above background” does not satisfy this ideal.

How to validate reference RNAs?
A better (though still not ideal) criterion is to evaluate whether a comparison of RNAs made through a reference matches the comparison that would have been achieved through direct comparison.
If the estimates agree well, then the results are not “reference-specific”.
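The criterion above can be sketched for a single gene as follows. The intensities are hypothetical and noise-free, so the indirect and direct log-ratios agree exactly; with real data the interest is in how large the discrepancy is across genes. The function names are illustrative, not from the talk.

```python
import math

def log_ratio(x, y):
    """log2 ratio of two (hypothetical) spot intensities."""
    return math.log2(x / y)

def indirect_log_ratio(a_vs_ref, b_vs_ref):
    """Compare A to B through a shared reference: the reference term cancels."""
    return a_vs_ref - b_vs_ref

# Hypothetical intensities for one gene in test RNAs A and B and in the reference
a, b, ref = 800.0, 200.0, 400.0

direct = log_ratio(a, b)                      # A vs B on the same array
indirect = indirect_log_ratio(log_ratio(a, ref),
                              log_ratio(b, ref))  # A vs B via two reference arrays
discrepancy = indirect - direct               # near 0 means not "reference-specific"
```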

Experimental Design

3 “Test” RNA pairs:
1. Placenta
–Assumed to be most similar to placenta reference
2. Kidney
–A component of commercial reference
3. Lung
–Not a component of commercial reference

3 “Test” RNA pairs:
1. Placenta
2. Kidney
3. Lung
Predictions
1. Placenta reference will work best for the placenta test pair
2. Commercial reference will work better for the kidney test pair than the lung test pair
3. Pool reference will work well overall

How did we do on our predictions?
3 “Test” RNA pairs:
1. Placenta
2. Kidney
3. Lung
3 Reference RNAs:
1. Placenta
2. Commercial
3. Pool
Predictions
1. Placenta reference will work best for the placenta test pair
2. Commercial reference will work better for the kidney test pair than the lung test pair
3. Pool reference will work well overall
Predictions 1 and 3 were borne out; prediction 2 was not. However, the main result was that the choice of reference RNA did not matter as much as we thought.

Compare: data with background subtraction

Compare: data without intensity-based normalization

Compare: data with background subtraction, no intensity-based normalization

Concordance for Low-intensity Genes

-Indirect and direct log-ratios are in reasonable agreement for the vast majority of these genes
-Some low-intensity genes are reproducibly measured as differentially expressed
-Most “highly discrepant genes” are NOT picked out by flagging low-intensity genes
Conclusion: discarding data from low-intensity spots is a very crude filter

*Moving average for low-intensity genes only
*Moving average for all genes
Conclusion: Measurements on low-intensity genes are less reliable but not unreliable
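One way such a moving-average curve could be computed is sketched below. The transcript does not say exactly what was averaged, so this assumes the quantity is the absolute discrepancy between indirect and direct log-ratios, with genes ordered by intensity; the function name, window size, and data are all hypothetical.

```python
def moving_average_by_intensity(intensity, discrepancy, window):
    """Sort genes by intensity, then average |discrepancy| over a sliding window.

    Returns (intensity at window centre, mean absolute discrepancy) pairs.
    """
    order = sorted(range(len(intensity)), key=lambda i: intensity[i])
    sorted_intensity = [intensity[i] for i in order]
    abs_disc = [abs(discrepancy[i]) for i in order]
    curve = []
    for start in range(len(order) - window + 1):
        chunk = abs_disc[start:start + window]
        centre = sorted_intensity[start + window // 2]
        curve.append((centre, sum(chunk) / window))
    return curve

# Hypothetical per-gene log-intensities and indirect-minus-direct discrepancies
intensity = [9.0, 6.0, 12.0, 7.0, 10.0, 8.0]
discrepancy = [0.2, -0.9, 0.1, 0.6, -0.1, 0.3]
curve = moving_average_by_intensity(intensity, discrepancy, window=3)
```

In this toy example the curve falls as intensity rises, which is the pattern summarized in the slide's conclusion: less reliable at low intensity, but not unreliable.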

Is this a reasonable way to evaluate reference RNAs?
Concordance between reference-based log-ratios and direct log-ratios means that the comparison through the reference is not “reference-specific”

The evaluation is based on a kind of reproducibility, which is not the same as accuracy. However:
–It is a much stronger kind of reproducibility than just reproducibility among technical replicates.
–Though not sufficient, good reproducibility is necessary for low error.
–The best kind of “accuracy” with microarray measurements is an open issue.
Opinion: For most intents and purposes, a low-variance, biased estimate of the log-ratio is preferable to a high-variance, unbiased estimate.
The results here suggest that, on average, microarrays give estimates of log-ratios that are proportional to the true log-ratios.
–Shi et al, BMC Bioinformatics, 2005
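The trade-off behind this opinion can be written with the standard mean-squared-error decomposition. The symbols are assumptions introduced here for illustration: \theta is the true log-ratio, \hat{\theta} its estimate, and c a hypothetical proportionality constant.

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  = \mathrm{Var}(\hat{\theta}) + \big(\mathbb{E}[\hat{\theta}] - \theta\big)^2 .
% If, on average, the estimate is proportional to the truth,
% \mathbb{E}[\hat{\theta}] = c\,\theta, the bias term becomes
\big(\mathbb{E}[\hat{\theta}] - \theta\big)^2 = (c - 1)^2\,\theta^2 .
```

Under this reading, a proportional bias shrinks or stretches every log-ratio by the same factor, so it preserves the sign and ranking of log-ratios, whereas high variance does not.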

Summary/Conclusions
To date, evaluations of reference RNAs
–Have used criteria distant from scientific objectives
–Have not accounted for the assumptions invoked in the popular methods of data normalization
We found results to be robust to choice of reference
Is there such a thing as a “universal” reference?
The concordance between our indirect and direct log-ratios is encouraging, but the “linearity” issue needs more attention

Kerr KF, Serikawa, Wei, Peters, Bumgarner (2007). What is the best reference RNA? And other questions regarding the design and analysis of two-color microarray experiments. OMICS 11: (Pre-print at )
Many Thanks to the Conference Organizers!