(1) Normalization of cDNA microarray data Methods, Vol. 31, no. 4, December 2003 Gordon K. Smyth and Terry Speed.

Slides:



Advertisements
Similar presentations
Limma: Linear Models for Microarray Data R user group 21 June 2005 Judith Boer.
Advertisements

Microarray Quality Assessment Issues in High-Throughput Data Analysis BIOS Spring 2010 Dr Mark Reimers.
Pre-processing in DNA microarray experiments Sandrine Dudoit PH 296, Section 33 13/09/2001.
MicroArray Image Analysis
MicroArray Image Analysis Robin Liechti
Microarray Normalization
Filtering and Normalization of Microarray Gene Expression Data Waclaw Kusnierczyk Norwegian University of Science and Technology Trondheim, Norway.
December 5, 2013Computer Vision Lecture 20: Hidden Markov Models/Depth 1 Stereo Vision Due to the limited resolution of images, increasing the baseline.
1 MicroArray -- Data Analysis Cecilia Hansen & Dirk Repsilber Bioinformatics - 10p, October 2001.
Mathematical Statistics, Centre for Mathematical Sciences
Microarray technology and analysis of gene expression data Hillevi Lindroos.
Image Quantitation in Microarray Analysis More tomorrow...
Normalization of Microarray Data - how to do it! Henrik Bengtsson Terry Speed
TIGR Spotfinder: a tool for microarray image processing
Sandrine Dudoit1 Microarray Experimental Design and Analysis Sandrine Dudoit jointly with Yee Hwa Yang Division of Biostatistics, UC Berkeley
Getting the numbers comparable
Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001.
DNA Microarray Bioinformatics - #27612 Normalization and Statistical Analysis.
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
Preprocessing Methods for Two-Color Microarray Data
Normalization Class web site: Statistics for Microarrays.
Low-Level Analysis and QC Regional Biases Mark Reimers, NCI.
Figure 1: (A) A microarray may contain thousands of ‘spots’. Each spot contains many copies of the same DNA sequence that uniquely represents a gene from.
Low Level Statistics and Quality Control Javier Cabrera.
Scan - Print Do repeated scans and prints to show image degradation. HW0202.
Normalization of 2 color arrays Alex Sánchez. Dept. Estadística Universitat de Barcelona.
Microarray Data Analysis Data quality assessment and normalization for affymetrix chips.
A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data A.L. Tarca, J.E.K. Cooke and J. MacKay Presented.
Corrections and Normalization in microarrays data analysis
Scanning and image analysis Scanning -Dyes -Confocal scanner -CCD scanner Image File Formats Image analysis -Locating the spots -Segmentation -Evaluating.
Analysis of microarray data
B IOINFORMATICS Dr. Aladdin HamwiehKhalid Al-shamaa Abdulqader Jighly Lecture 8 Analyzing Microarray Data Aleppo University Faculty of technical.
Filtering and Normalization of Microarray Gene Expression Data Waclaw Kusnierczyk Norwegian University of Science and Technology Trondheim, Norway.
1 Normalization Methods for Two-Color Microarray Data 1/13/2009 Copyright © 2009 Dan Nettleton.
(4) Within-Array Normalization PNAS, vol. 101, no. 5, Feb Jianqing Fan, Paul Tam, George Vande Woude, and Yi Ren.
Preprocessing of cDNA microarray data Lecture 19, Statistics 246, April 1, 2004.
Image Quantitation in Microarray Analysis More tomorrow...
CDNA Microarrays Neil Lawrence. Schedule Today: Introduction and Background 18 th AprilIntroduction and Background 25 th AprilcDNA Mircoarrays 2 nd MayNo.
The following slides have been adapted from to be presented at the Follow-up course on Microarray Data Analysis.
CDNA Microarrays MB206.
Panu Somervuo, March 19, cDNA microarrays.
Bioconductor Packages for Pre-processing DNA Microarray Data affy and marray Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor.
Randomization issues Two-sample t-test vs paired t-test I made a mistake in creating the dataset, so previous analyses will not be comparable.
WORKSHOP SPOTTED 2-channel ARRAYS DATA PROCESSING AND QUALITY CONTROL Eugenia Migliavacca and Mauro Delorenzi, ISREC, December 11, 2003.
Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies.
Analysis of Microarray Data Analysis of images Preprocessing of gene expression data Normalization of data –Subtraction of Background Noise –Global/local.
Scenario 6 Distinguishing different types of leukemia to target treatment.
infinity-project.org Engineering education for today’s classroom 2 Outline How Can We Use Digital Images? A Digital Image is a Matrix Manipulating Images.
The Analysis of Microarray data using Mixed Models David Baird Peter Johnstone & Theresa Wilson AgResearch.
1 Pre-processing - Normalization Databases Statistics for Microarray Data Analysis – Lecture 2 The Fields Institute for Research in Mathematical Sciences.
What Is Microarray A new powerful technology for biological exploration Parallel High-throughput Large-scale Genomic scale.
December 9, 2014Computer Vision Lecture 23: Motion Analysis 1 Now we will talk about… Motion Analysis.
Introduction to Statistical Analysis of Gene Expression Data Feng Hong Beespace meeting April 20, 2005.
Statistical Methods for Identifying Differentially Expressed Genes in Replicated cDNA Microarray Experiments Presented by Nan Lin 13 October 2002.
Pre-processing in DNA microarray experiments Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor short course Summer 2002.
ABC D EF GH I JKL. Supplementary figure S1: Exemplary overview of the quality assessment plots generated by Robin. All plots have been generated using.
Henrik Bengtsson Mathematical Statistics Centre for Mathematical Sciences Lund University, Sweden Plate Effects in cDNA Microarray Data.
Microarray hybridization Usually comparative – Ratio between two samples Examples – Tumor vs. normal tissue – Drug treatment vs. no treatment – Embryo.
Pre-processing DNA Microarray Data Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor Short Course Winter 2002 © Copyright.
Henrik Bengtsson Mathematical Statistics Centre for Mathematical Sciences Lund University Plate Effects in cDNA Microarray Data.
Distinguishing active from non active genes: Main principle: DNA hybridization -DNA hybridizes due to base pairing using H-bonds -A/T and C/G and A/U possible.
Statistical Analysis for Expression Experiments Heather Adams BeeSpace Doctoral Forum Thursday May 21, 2009.
Composition & Elements of Art and Principles of Design A artists toolbox.
Lecture 2 – Pre-processing and Normalization José Luis Mosquera Computational Lab on Microarrays Data Analysis Special Topics in Computer Science Institute.
Normalization Methods for Two-Color Microarray Data
The Basics of Microarray Image Processing
Getting the numbers comparable
Optimal gene expression analysis by microarrays
Normalization for cDNA Microarray Data
DIGITAL IMAGE PROCESSING Elective 3 (5th Sem.)
Presentation transcript:

(1) Normalization of cDNA microarray data Methods, Vol. 31, no. 4, December 2003 Gordon K. Smyth and Terry Speed

Normalization for two-color cDNA microarray data The purpose of normalization is to adjust for effects which arise from variation in the microarray techno- logy rather than from biological differences between the RNA samples or between the printed probes. Variation in the microarray technology 1. within array: The dye-bias generally vary with intensity and spatial position on the slide. 2. between array: Differences may arise from differences in print quality, ambient conditions when the plates were processed or simply from changes in the scanner settings.

M-A plot Write R and G for the background-corrected red and green intensities for each spot. The log-ratios of expression, M = log 2 R - log 2 G. The log-intensity of each spot, A = (log 2 R + log 2 G)/2, a measure of the overall brightness of the spot.

Diagnostic plots  dye-bias and i ntensity : MA-plot (Fig. 1)  Spatial variation : Spatial image plot (Fig. 2) Boxplots of the M-values for each print tip group (Fig. 3)  Combining Spatial and i ntensity trends: Individual loess curves for each print tip groups (Fig. 4)

Fig. 1. MA-plot for an array showing three different trend lines. Blue : shows the median of the M-values. Orange : shows the overall trend line as estimated by loess regression. Yellow : shows the loess curve through a set of control spots known to be not differentially expressed.

Fig. 2. Spatial plot of green (cy3) back- ground values. The array was printed using a 12 x 4 pattern of print tips. The background tends to be more intense: around the edges in the four corners in tip rows 8 and 9 and columns 3 and 4

Fig. 3. Boxplots of the M-values for each print tip group. The M-values are higher : in the middle of each sequence of four and in the middle of the overall sequence (tip rows 7, 8, and 9).

Fig. 4. Individual loess curves for the 48 print tip groups. For this array, the slope and shape of the curves is broadly consistent over the print-tip groups although the height varies. The height of the curves varies between tip groups in a similar way to the height of the boxplots in Fig. 3.

Global loess

Print-tip loess normalization We recommend this method as a routine normalization method for cDNA arrays. It corrects the M-values both for sub-array spatial variation and for intensity-based trends. Print-tip loess

Two-dimensional loess We do not use this method as a routine normalization strategy because of  concern that imperfections on the array may present sudden rather than smooth changes and  concern that the two-dimensional loess curve may confuse local clusters of differential expression on the array with the spatial trend to be removed. Another way to model spatial variation is to use a two-dimensional loess curve. This can be combined with intensity-based loess normalization to give the two- dimensional normalization strategy,

Composite normalization may be used when a suitable set of control spots is available which are known to be not differentially expressed. Control spots: 1. To be of most use in loess normalization, the control spots should span as wide a range of intensities as possible. 2. A satisfactory set of controls for this purpose is a specially designed microarray sample pool (MSP) titration series in which the entire clone library is pooled and then titrated at a series of different concentrations. Composite loess normalization

Yang et al. (2002) propose the composite normalization

Correcting for other trends Print-order effect ( Fig. 5 ) the numerical order in which the spots were laid down during the printing of the array

Fig. 5. Plate or print-order effects for the first slide in the ApoAI knock-out experiment report- ed by Callow et al (2000). This array was printed with a 4 x 4 arrangement of print- tips and with 19 rows and 21 columns in each tip group. This means that the print-order index goes from 1 to 19 x 21 = 399 and that 4 x 4 =16 spots share each print-order index. The plot shows that a series of plates starting around print- order 169 have higher median M-values than the rest of the array. Indeed, it turns out that spots with print orders between 169 and 252 were printed with DNA from a different library to the other spots. median M-value

One can normalize for this print-order effect by subtracting from the M-values the medians shown in the plot. One would then proceed on to print-tip loess normalization. Print-order normalization

Weighting for spot quality Most image analysis programs routinely record a variety of descriptive information about each spot. If this information is used to construct a numeric quality measure for each spot, then lower quality spots can be down-weighted in the normalization process. Spot quality 1. spot size 2. roundness of the spot 3. background intensity 4. SNR 5. foreground or background regions 6. spot location

A more comprehensive measure of spot quality: weighting spots according to spot area (the number of pixels in the segmented foreground region of the spot) Inspection of the TIFF images of arrays used in the examples suggests that the area in pixels of an ideal circular spot on these arrays is about 165 pixels (=2*165) The values from the weight function are used as relative weights in all the loess regre- ssions used in the normal- izations.

This weight function is a simple function of only one of the morphological characteristics of the spot and more complex quality measures can easily be imagined. Other measures of spot quality computed from the image analysis output could be used in the same way to provide weights in the normalization.