Henrik Bengtsson Mathematical Statistics Centre for Mathematical Sciences Lund University, Sweden Plate Effects in cDNA Microarray Data.


1 Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences Lund University, Sweden Plate Effects in cDNA Microarray Data

2 2 of 21 Outline: Data; Known systematic variation / artifacts; New way of plotting microarray data; Print order / Plate effects; Normalization of plate effects; Normalization strategies; Finding the best strategy: Measure of Reproducibility; Results; Discussion

3 3 of 21 Data Matt Callow’s apoAI experiment (2000): –(8 apoAI-KO mice vs. pool of 8 control mice), 8 control mice vs. pool of 8 control mice, i.e. eight hybridized slides. –5357 ESTs/genes (6 triplicates, 175 duplicates, 4989 single spotted) & 840 blanks => 6384 spots in all. –Labeled using Cy3-dUTP and Cy5-dUTP. –Signals extracted from the images by Spot.

4 4 of 21 Intensity dependent effects The log-ratio, M, depends on the intensity of the spot, A.

5 5 of 21 Print-tip/spatial intensity effects The log-ratio (and its variance) varies with print-tip group. But, how are the spots printed…?

6 6 of 21 6384 spots are printed onto N slides in a total of 399 print turns using 4x4 print-tips: 4·4·399 = 6384

7 7 of 21 Print order plot The spots are ordered according to when they were spotted/dipped onto the glass slide(s). Note that it takes hours/days to print all spots on all slides.

8 8 of 21 Print dip plot Median values of the 16 log-ratios at each dip from each of the 399 print turns.
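The print dip summary described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the slides: the function name `dip_medians` is hypothetical, and it assumes the log-ratios are already sorted in print order with the 16 spots of each print turn stored consecutively.

```python
from statistics import median

N_TIPS = 16    # 4x4 print-tips (from the slides)
N_TURNS = 399  # print turns (from the slides)

def dip_medians(log_ratios, n_tips=N_TIPS):
    """Median log-ratio for each print turn (dip).

    Assumption: log_ratios is in print order and the n_tips spots
    printed in the same turn are stored next to each other.
    """
    if len(log_ratios) % n_tips != 0:
        raise ValueError("number of spots must be a multiple of n_tips")
    return [median(log_ratios[start:start + n_tips])
            for start in range(0, len(log_ratios), n_tips)]
```

For one slide this reduces the 6384 log-ratios to 399 values, one per dip, which is what gets plotted against print order.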

9 9 of 21 Sources of artifacts [Flow diagram. Production: cDNA clones → PCR product amplification → purification → printing. Hybridization: test sample RNA → cDNA and reference sample RNA → cDNA, hybridized to the slide. Scanning: excitation (red laser, green laser) → emission → overlay images → data (R fg, G fg, R bg, G bg, ...). Annotated sources of artifacts: plate effects (clone sets, ..., ?); print order effects (climate, print-tips, ...); intensity effects (labeling efficiency); intensity effects (quenching).]

10 10 of 21 Plate effects The log-ratios depend on the plate the spotted clone comes from. (384-well plates from 6 different labs were used)

11 11 of 21 Normalizing plate by plate Assumption: The genes from one plate are on average non-differentially expressed. Correctness? Are clones on the plates selected randomly? Spots on plates are less random than, for instance, spots in print-tip groups. Recall that in the current setup we do a comparison between 8 control mice and the pool of them.

12 12 of 21 Removing (constant) plate biases Will remove some of the intensity dependent effects... and some of the spatial artifacts.
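A constant platewise normalization could look roughly like the sketch below. The slides do not say which location estimate is subtracted; using the plate median is an assumption made here, and `normalize_plate_constant` is a hypothetical name.

```python
from collections import defaultdict
from statistics import median

def normalize_plate_constant(M, plate):
    """Subtract one constant per plate from the log-ratios M.

    Assumption: the per-plate bias is estimated by the median log-ratio
    of the spots from that plate (the slides do not specify the estimator).
    M and plate are equal-length sequences; plate[i] labels spot i.
    """
    by_plate = defaultdict(list)
    for m, p in zip(M, plate):
        by_plate[p].append(m)
    bias = {p: median(values) for p, values in by_plate.items()}
    return [m - bias[p] for m, p in zip(M, plate)]
```

After this step every plate has median log-ratio zero, which is exactly the "on average non-differentially expressed" assumption applied plate by plate.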

13 13 of 21 ...and then an intensity normalization? Intensity normalization => reintroduced plate biases! Why? Because the intensities of the spots, A, also show plate effects. Should we normalize A for plate effects? No! Less DNA hybridized to the blanks and to the “brain” spots than to the rest (“liver” clones).

14 14 of 21 Intensity dep. normalization plate by plate Removes the plate effects... ...and most of the spatial artifacts. ...plus a print-tip normalization?
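An intensity dependent platewise normalization fits, within each plate, a curve of M as a function of A and subtracts it. The slides do not name the smoother; a lowess-type curve is a common choice. To stay dependency-free, the sketch below swaps in a crude running median over neighbouring spots in A-rank, so it is a stand-in for the idea, not the method used for the results. All function names and the `window` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import median

def running_median_fit(A, M, window=5):
    """Trend of M versus A: median of M over the `window` spots
    closest in A-rank (a simple stand-in for a lowess curve)."""
    order = sorted(range(len(A)), key=lambda i: A[i])
    half = window // 2
    fit = [0.0] * len(A)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        fit[i] = median(M[j] for j in order[lo:hi])
    return fit

def normalize_plate_intensity(A, M, plate, window=5):
    """Subtract the fitted intensity trend from M separately for each plate."""
    by_plate = defaultdict(list)
    for i, p in enumerate(plate):
        by_plate[p].append(i)
    out = list(M)
    for idx in by_plate.values():
        fit = running_median_fit([A[i] for i in idx],
                                 [M[i] for i in idx], window)
        for i, f in zip(idx, fit):
            out[i] = M[i] - f
    return out
```

Only M is adjusted; A is left untouched, in line with the argument against normalizing A for plate effects.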

15 15 of 21 Multiple ways to normalize Component-wise normalization methods: Ex: print-tip normalization + constant plate normalization. Ex: plate intensity normalization + print-tip normalization. ... will work in the general case. Simultaneous normalization methods (not covered here): Ex: print-tip & plate intensity normalization (two dimensions). ... requires a model and will not be applicable to the general case. Need a way to compare the different outcomes...

16 16 of 21 Measure of Reproducibility Median absolute deviation (MAD) for gene $i$ with replicates $j=1,2,\ldots,J$: $d_i = 1.4826 \cdot \operatorname{median}_j |r_{ij}|$, where $r_{ij} = M_{ij} - \operatorname{median}_j M_{ij}$ is residual $j$ for gene $i$. The measure of reproducibility (small is good) is a scalar defined as the mean of all genewise MADs: $\text{M.O.R.} = \frac{1}{N} \sum_{i=1}^{N} d_i$, where $N$ is the number of genes. Ex: two different genes: $d_a < d_b$
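The definition above translates directly into code. The sketch below assumes the replicated log-ratios are available as a mapping from gene identifier to a list of replicate values; that data layout and the function names are illustrative assumptions, not part of the original analysis.

```python
from statistics import mean, median

MAD_SCALE = 1.4826  # constant from the slide

def genewise_mad(replicates):
    """d_i = 1.4826 * median_j |M_ij - median_j M_ij| for one gene."""
    center = median(replicates)
    return MAD_SCALE * median([abs(m - center) for m in replicates])

def measure_of_reproducibility(M_by_gene):
    """Mean of the genewise MADs over all genes (small is good).

    M_by_gene: dict mapping a gene id to its list of replicate log-ratios
    (hypothetical layout chosen for this example).
    """
    return mean([genewise_mad(reps) for reps in M_by_gene.values()])
```

Because each d_i is built from medians, a single outlying replicate changes a gene's contribution very little, which makes the measure suitable for comparing normalization strategies on the same data.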

17 17 of 21 Results 21 different normalization strategies were performed on both background-subtracted and non-background-subtracted data, i.e. 42 runs in total. Legend: Pl – Constant platewise normalization, Pl(A) – Intensity dependent platewise normalization, Sl(A) – Intensity dependent slidewise normalization, Pr(A) – Intensity dependent print-tip-wise normalization, sPr(A) – Scaled intensity dependent print-tip-wise normalization, bg – background corrected data.

18 18 of 21 Results Doing platewise intensity dependent normalization lowers the gene variability by another ~10% compared with print-tip normalization alone. In all cases it is better not to do background correction. Using the measure of reproducibility is helpful in deciding what to do. Legend: Pl – Constant platewise norm., Pl(A) – Intensity dep. platewise norm., Sl(A) – Intensity dep. slidewise norm., Pr(A) – Intensity dep. print-tip-wise norm., sPr(A) – Scaled intensity dep. print-tip-wise norm., bg – background corrected data.

19 19 of 21 Visual comparison Scaled print-tip intensity normalization: (M.O.R.=0.123; 46%) Scaled print-tip followed by plate intensity normalization: (M.O.R.=0.110; 41%) No normalization: (M.O.R.=0.270; 100%)
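The percentages quoted with the M.O.R. values are relative to the unnormalized data. The short check below recomputes them from the three M.O.R. values on the slide; the dictionary keys reuse the legend abbreviations and the helper name is hypothetical.

```python
# M.O.R. values as quoted on the slide
MOR = {
    "none": 0.270,          # no normalization
    "sPr(A)": 0.123,        # scaled print-tip intensity normalization
    "sPr(A)+Pl(A)": 0.110,  # followed by plate intensity normalization
}

def percent_of(value, baseline):
    """Express an M.O.R. value as a percentage of a baseline M.O.R."""
    return 100.0 * value / baseline

relative = {name: percent_of(v, MOR["none"]) for name, v in MOR.items()}

# Further reduction gained by adding the platewise step, in percent
extra_gain = 100.0 - percent_of(MOR["sPr(A)+Pl(A)"], MOR["sPr(A)"])
```

Rounded, `relative` gives 100%, 46% and 41%, and `extra_gain` is a little over 10%, consistent with the roughly 10% additional reduction reported in the results.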

20 20 of 21 Discussion What are the reasons for seeing plate effects and where do they actually occur? i) in clone setup, ii) on the plates, iii) during printing, iv) at hybridization or where? Look at the behavior of the variance in addition to the bias. Are there any reasons for doing platewise normalization of variances too? How general is the result that not doing background subtraction performs better than doing it?

21 21 of 21 Acknowledgements Statistics Dept, UC Berkeley: Sandrine Dudoit, Terry Speed, Yee Hwa Yang. Lawrence Berkeley National Laboratory: Matt Callow. Mathematical Statistics, Lund University: Ola Hössjer, Jan Holst. Ernest Gallo Research Center, UCSF: Karen Berger. com.braju.sma – object-oriented extension to sma (free): http://www.braju.com/R/ [R] Software (free): http://www.r-project.org/ The Statistical Microarray Analysis (sma) library (free): http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html

22 22 of 21 Extra slides

23 23 of 21 Data Transformation “Observed” data $\{(R,G)\}_{n=1..5184}$: $R$ = red channel signal, $G$ = green channel signal (background corrected or not). Transformed data $\{(M,A)\}_{n=1..5184}$: $M = \log_2(R/G)$ (log-ratio), $A = \log_2 (R \cdot G)^{1/2} = \tfrac{1}{2} \log_2(R \cdot G)$ (intensity signal) $\Leftrightarrow$ $R = (2^{2A+M})^{1/2}$, $G = (2^{2A-M})^{1/2}$
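The transformation and its inverse are easy to verify numerically. The following sketch implements both directions for a single spot; the function names `to_ma` and `to_rg` are made up for this example, and both signals are assumed to be strictly positive.

```python
import math

def to_ma(R, G):
    """(R, G) -> (M, A): M = log2(R/G), A = 1/2 * log2(R*G).

    Assumes R > 0 and G > 0.
    """
    M = math.log2(R / G)
    A = 0.5 * math.log2(R * G)
    return M, A

def to_rg(M, A):
    """(M, A) -> (R, G): R = (2^(2A+M))^(1/2), G = (2^(2A-M))^(1/2)."""
    R = 2.0 ** ((2.0 * A + M) / 2.0)
    G = 2.0 ** ((2.0 * A - M) / 2.0)
    return R, G
```

For example, R = 8 and G = 2 give M = 2 and A = 2, and applying `to_rg(2, 2)` recovers the original pair.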

24 24 of 21 Normalization Biased towards the green channel & Intensity dependent artifacts

25 25 of 21 Blanks / empty spots blanks 99%

