A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data A.L. Tarca, J.E.K. Cooke and J. MacKay


A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data A.L. Tarca, J.E.K. Cooke and J. MacKay Presented by Dana Mohamed

Microarrays

Importance of Microarrays (and that the data is correct)
Assumption that microarray data linearly reflects the amount of mRNA present in the cell
– In turn, reflects gene expression levels
If the data is incorrect,
– So is our interpretation of gene expression
And therefore all the science built on that interpretation is also incorrect

Where the error is
Intensity of fluorescence
– Overall imbalance of dye intensity
– 2 dyes: Cy5 (R) and Cy3 (G)
– If R and G are expressed at equal levels, R/G = 1
Space
– Intensities vary with spot coordinates
– Can be “dirty” on the sides of the microarray

Previous Methods
Many address intensity bias
Few address spatial bias
Most rely on M* = M - m
– M* is the normalized value
– M is the raw log-ratio, M = log2(R/G)
– m is the estimate of the bias

Important Variables
M = log2(R/G)
– The log-ratio converts multiplicative error to additive error
A = (1/2) log2(RG)
– Average of the log-intensities of the two channels
Minus-add (MA) plots
– M vs. A
– Useful for assessing systematic bias
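
As a minimal sketch of the two variables above, the following Python function computes M = log2(R/G) and A = (1/2) log2(RG) for a single spot. The function name and the check for positive intensities are illustrative assumptions, not part of the presentation.

```python
import math


def ma_values(r, g):
    """Return (M, A) for one spot from its red (Cy5) and green (Cy3) intensities."""
    if r <= 0 or g <= 0:
        # Logarithms are undefined for non-positive intensities.
        raise ValueError("intensities must be positive")
    m = math.log2(r / g)        # log-ratio: multiplicative error becomes additive
    a = 0.5 * math.log2(r * g)  # average of the two log-intensities
    return m, a


print(ma_values(1024.0, 256.0))  # prints (2.0, 9.0)
```

A spot with equal red and green intensities gives M = 0, which is the value a well-normalized, non-regulated gene should sit near.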

Calculating m in other methods
gMed: global median normalization
– m = median(M_i)
– M_i are all the values of M
pLo: print tip loess
– m = c_i(A), the loess fit of M on A inside the ith print tip
pLoGS
– Found in GeneSight (biodiscovery.com)
– Local group median (3x3 square regions) + print tip loess
cPLo2D: print tip loess + pure 2D normalization
– BioConductor (bioconductor.org)
– m = α c_i(A) + β c_i(SpotRow, SpotCol)
– c_i(SpotRow, SpotCol) is the loess estimate of M using spot row and column coordinates inside the ith print tip
gLoMedF
– Global loess normalization + spatial median filter
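
The simplest of these estimates, gMed, can be sketched in a few lines of Python: m is the median of all log-ratios on the slide, and the normalized values follow from M* = M - m. The function name is hypothetical; the loess-based methods are not shown because they depend on a smoother that the slides do not specify.

```python
from statistics import median


def global_median_normalize(m_values):
    """Apply gMed: subtract the slide-wide median log-ratio from every M."""
    m_hat = median(m_values)  # single bias estimate for the whole slide
    return [m - m_hat for m in m_values]


print(global_median_normalize([0.5, 1.0, 1.5, 3.0, -1.0]))
# prints [-0.5, 0.0, 0.5, 2.0, -2.0]
```

Because m is one constant per slide, gMed shifts the whole distribution of M but cannot remove bias that varies with A or with spot position, which is what the other methods in the list try to address.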

Robust Neural Networks Technique
pNN2DA: print tip robust neural nets 2D and A
– Attempts to find the best fit of M using A and the 2-D space coordinates of the spots: m = c_i(A, X, Y)
– X and Y are not individual spot positions but the coordinates of 3x3 “bins” of spots within each print tip
– Accounts for spatial bias
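
One way the 3x3 binning could work is sketched below in Python. The 1-based indexing, the mapping of X to spot column and Y to spot row, and the function and parameter names are all assumptions made for illustration; the slides only state that 3x3 bins are used.

```python
def bin_coordinates(spot_row, spot_col, bin_height=3, bin_width=3):
    """Map a spot's 1-based row/column inside a print tip to bin coordinates (X, Y).

    All spots in the same bin_height x bin_width block share one (X, Y) pair.
    """
    if spot_row < 1 or spot_col < 1:
        raise ValueError("spot_row and spot_col are 1-based")
    x = (spot_col - 1) // bin_width + 1   # assumed: X follows the column direction
    y = (spot_row - 1) // bin_height + 1  # assumed: Y follows the row direction
    return x, y


print(bin_coordinates(1, 1))   # prints (1, 1)
print(bin_coordinates(3, 3))   # prints (1, 1), same bin as the spot above
print(bin_coordinates(7, 10))  # prints (4, 3)
```

Coarsening the coordinates this way gives the network fewer distinct spatial positions to fit, which plausibly limits its ability to chase individual spots.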

Neural Nets Terminology
Uses a multi-layer feedforward network
Sigmoid function

Neural Networks
Uses a multi-layer feedforward network
x is the input vector (X, Y, A, 1), so I = 3
w are the weights
σ(1),j for j = 1, ..., J are the hidden neurons, which are sigmoid functions
σ(2) is the single neuron in the output layer, which is also sigmoid
σ(1),J+1 accounts for the second-layer bias
J is the number of neurons in the hidden layer of the network
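
The slide's formula did not survive in the transcript. The expression below is a reconstruction from the verbal description only, so it should be read as a sketch of the network's form rather than the authors' exact equation; the weight symbols w_j and w_{ij} and the hat on m are notational assumptions.

```latex
\hat{m}(\mathbf{x}) = \sigma_{(2)}\!\left( \sum_{j=1}^{J+1} w_j \,
    \sigma_{(1),j}\!\left( \sum_{i=1}^{I+1} w_{ij}\, x_i \right) \right),
\qquad \mathbf{x} = (X, Y, A, 1), \quad I = 3,
\qquad \sigma_{(1),J+1} \equiv 1,
\qquad \sigma(z) = \frac{1}{1 + e^{-z}} .
```

The constant fourth input supplies the hidden-layer bias, and fixing the (J+1)th hidden output to 1 supplies the second-layer bias, matching the roles given in the list above.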

Multi-layered Feedforward
Usually J = 3: enough hidden neurons to fit the bias, while limiting the influence of outliers and avoiding over-fitting
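
A forward pass through such a network with J = 3 hidden neurons can be sketched in Python as follows. This is only an illustration of the architecture described on the slides: the function names and the weight layout are assumptions, and the robust fitting of the weights is not shown. Since the output neuron is a sigmoid, the prediction lies in (0, 1), so the log-ratios would presumably have to be rescaled to that range before fitting.

```python
import math


def sigmoid(z):
    """Logistic sigmoid used by both the hidden and the output neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_bias(x, y, a, hidden_weights, output_weights):
    """Evaluate the network for one spot.

    hidden_weights: J rows of I + 1 = 4 weights, for the inputs (X, Y, A, 1).
    output_weights: J + 1 weights; the last one is the second-layer bias.
    """
    if len(output_weights) != len(hidden_weights) + 1:
        raise ValueError("output_weights needs one weight per hidden neuron plus a bias")
    inputs = (x, y, a, 1.0)  # the constant 1 carries the hidden-layer bias
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs)))
              for row in hidden_weights]
    hidden.append(1.0)       # constant unit carrying the second-layer bias
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


# With all weights at zero the output is sigmoid(0).
print(predict_bias(1.0, 2.0, 9.0, [[0.0] * 4 for _ in range(3)], [0.0] * 4))  # prints 0.5
```

With I = 3 and J = 3 the model has 3 x 4 + 4 = 16 weights per print tip, which gives a sense of how small the network is.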

Criteria & Datasets
Criteria:
a) Reduce the variability of log-ratios between replicated slides and within slides
b) Ability to distinguish truly regulated genes from the other genes
Datasets (with the criteria each was used for):
1) Apo AI: a, b
2) Swirl Zebra Fish: a
3) Poplar experiment: a
4) Perturbed Apo AI: b

Classic Neural Nets vs. Robust NNets

Criteria refresher
The ability to reduce the variability of log-ratios between replicated slides and within slides
The ability to distinguish truly regulated genes from the other genes
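
The slides do not say which statistic was used to measure variability. As one possible illustration of the first criterion, the Python sketch below summarizes spread with the sample standard deviation, both within a slide and per gene across replicated slides. The choice of statistic and the function names are assumptions.

```python
from statistics import mean, stdev


def within_slide_spread(m_values):
    """Spread of the normalized log-ratios on a single slide."""
    return stdev(m_values)


def between_slide_spread(replicates):
    """Average per-gene spread across replicated slides.

    replicates: one list of normalized M values per slide, with genes in the same order.
    """
    per_gene = zip(*replicates)  # regroup the values gene by gene
    return mean(stdev(values) for values in per_gene)


print(within_slide_spread([1.0, 1.0, 1.0]))  # prints 0.0
```

Under this reading, a better normalization method yields smaller values of both quantities on data where most genes are not differentially expressed.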

Impact on Variability

Cont. – 3 Data Sets

Downregulated Gene Sorting – Apo AI set

DRGS – Perturbed Apo AI set

Spatial Uniformity of M values distribution

Results Table

Strengths/Weaknesses
Seems promising
Uses multiple tests to determine efficacy
Doesn’t use enough datasets
Uses a patterned perturbed dataset
– But no “real” perturbed dataset

Future Work
More datasets
When should this normalization technique be used over other techniques?
Should this technique be combined with elements of other techniques to further improve it?

References
Tarca, A.L., J.E.K. Cooke, and J. MacKay. “A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data.” Bioinformatics Jun 2005; 21:
Haykin, Simon. Neural Networks: A Comprehensive Foundation. New Jersey: Prentice Hall,
Mount, David W. Bioinformatics: Sequence and Genome Analysis. New York: Cold Spring Harbor Laboratory Press, 2001.