1-11-2005: We calculated a t-test for 30,000 genes at once. How do we handle results, present data and results? Normalization of the data as a means of removing...


1 1-11-2005 Genome-wide analysis
We calculated a t-test for 30,000 genes at once.
How do we handle results, and how do we present data and results?
Normalization of the data as a means of removing biases and reducing experimental variability.
Two basic questions in the normalization process:
  Are we attenuating the signal?
  Are we compromising the independence of our measurements?
Outliers: part of quality control, if we can identify physical reasons for excluding an observation (e.g. a scratch on the slide).
Such physical problems are usually "flagged" in the process of quantifying fluorescence intensities.
The question of excluding a whole array from the analysis is particularly tricky; we will discuss it further later.

2 1-11-2005 Randomization Issue
The Problem: Identify genes whose expression in a target organ (Lung) of a model organism (Rat) is affected by an environmental toxicant (W).
Population: All model organisms of this type (Rats).
Sample: 12 randomly selected rats from the population of all rats. (Randomly means that all rats in the population have an equal chance of being selected.)
Randomization: Randomly select 6 rats to be treated by the toxicant. "Randomly" is the key word here: it is what allows us to ascribe observed changes to the treatment alone.
Prepare samples and extract RNA from all 12 rats.
Randomly assign labeled RNA to different microarrays.
Process microarrays in a random order.
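
The three randomization steps on this slide could be sketched as follows. This is only an illustration: the rat labels, the function name `randomize_experiment` and the seed are hypothetical and do not come from the slides.

```python
import random


def randomize_experiment(n_rats=12, n_treated=6, seed=None):
    """Return a randomized plan: treatment groups, array assignment, run order."""
    rng = random.Random(seed)
    rats = [f"rat{i + 1}" for i in range(n_rats)]  # hypothetical labels

    # Step 1: randomly select the rats to be treated by the toxicant (W);
    # the remaining rats are controls (C).
    treated = set(rng.sample(rats, n_treated))
    group = {rat: ("W" if rat in treated else "C") for rat in rats}

    # Step 2: randomly assign each labeled RNA sample to a microarray.
    arrays = list(range(1, n_rats + 1))
    rng.shuffle(arrays)
    array = dict(zip(rats, arrays))

    # Step 3: process the microarrays in a random order.
    order = list(range(1, n_rats + 1))
    rng.shuffle(order)

    return {"group": group, "array": array, "order": order}


plan = randomize_experiment(seed=2005)  # seed chosen only for reproducibility
for rat in plan["group"]:
    print(rat, plan["group"][rat], "array", plan["array"][rat])
print("processing order:", plan["order"])
```

Fixing the seed makes the plan reproducible, which is convenient for record keeping; any seed gives a valid randomization.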

3 1-11-2005 Single Channel Microarrays - Each Sample Assigned to a Different Microarray
12 microarrays, 12 samples (C1,...,C6,W1,...,W6).
Randomly assign samples to different microarrays.
In terms of a single gene, 12 different "spots".
Assignment shown on the slide: W3 W5 W6 W1 W2 W4 C5 C1 C2 C4 C6 C3
Proceed with a two-sample t-test as we did so far.
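
For a single gene, the two-sample t-test mentioned here can be sketched as below. The pooled-variance form with 2n-2 degrees of freedom is assumed, and the intensity values are made up for illustration; they are not data from the slides.

```python
import math
from statistics import mean, variance


def two_sample_t(control, treated):
    """Pooled-variance two-sample t-statistic for one gene.

    Returns (t statistic, standard error of the difference, degrees of freedom).
    """
    n_c, n_w = len(control), len(treated)
    df = n_c + n_w - 2
    pooled_var = ((n_c - 1) * variance(control) + (n_w - 1) * variance(treated)) / df
    std_err = math.sqrt(pooled_var * (1 / n_c + 1 / n_w))
    t_stat = (mean(treated) - mean(control)) / std_err
    return t_stat, std_err, df


# Hypothetical log-intensities for one gene on 12 single-channel arrays.
control = [7.1, 6.8, 7.4, 7.0, 6.9, 7.2]   # C1..C6
treated = [7.9, 8.3, 7.7, 8.1, 8.0, 8.4]   # W1..W6

t_stat, std_err, df = two_sample_t(control, treated)
print(f"t = {t_stat:.2f}, denominator = {std_err:.3f}, df = {df}")
```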

4 1-11-2005 Two-Channel Microarrays - One C and One W Sample Assigned to Each Microarray
6 microarrays, 12 samples (C1,...,C6,W1,...,W6).
Randomly select pairs and assign them to different microarrays.
In terms of a single gene, 6 different "spots".
Pairing shown on the slide: W3-C5, W6-C1, W2-C2, W5-C6, W4-C4, W1-C3
Individual samples are no longer "free" to be assigned to any microarray: this is a restriction on the randomization process.
Measurements are "blocked" within a microarray (terminology).
We could still randomly assign samples and not have the treatment and the control on each microarray, but this would be unreasonable (arguments to come).
Need to use a paired t-test.
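
The restricted randomization described here (one C and one W sample per array) might be sketched like this. The function name `pair_samples` is hypothetical, and the sketch covers only the pairing, not which sample goes to which channel.

```python
import random


def pair_samples(n_pairs=6, seed=None):
    """Randomly pair each control sample with one treated sample, one pair per array."""
    rng = random.Random(seed)
    controls = [f"C{i}" for i in range(1, n_pairs + 1)]
    treated = [f"W{i}" for i in range(1, n_pairs + 1)]
    # Shuffling both lists independently makes every C-W pairing equally likely.
    rng.shuffle(controls)
    rng.shuffle(treated)
    return {array: pair for array, pair in enumerate(zip(treated, controls), start=1)}


layout = pair_samples(seed=2005)  # seed chosen only for reproducibility
for array, (w, c) in layout.items():
    print(f"array {array}: {w} with {c}")
```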

5 1-11-2005 Paired t-test
For a specific gene: $r_i = x_{iw} - x_{ic}$ = i-th difference, $i = 1, \ldots, 6$
Differential expression: mean difference $\neq 0$
Statistical model of observed data; estimating parameters; calculating the t-statistic [equations not preserved in this transcript]
"Null Distribution" is the t-distribution with $n-1$ degrees of freedom [figure: rejection regions beyond $-t^*$ and $t^*$]
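
The formulas on this slide did not survive extraction. The standard textbook form of the paired t-test, written with the slide's $r_i$, is given below as a sketch. The symbols $\delta$, $\sigma$, $\bar{r}$ and $s_r$ are notation assumed here and may differ from what the slide used.

```latex
% Differences within each array (n = 6 arrays in this experiment)
r_i = x_{iw} - x_{ic}, \qquad i = 1, \ldots, n

% Assumed statistical model: independent, normally distributed differences
r_i = \delta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)

% Differential expression corresponds to
\delta \neq 0

% Estimating parameters
\hat{\delta} = \bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i, \qquad
s_r^2 = \frac{1}{n-1} \sum_{i=1}^{n} (r_i - \bar{r})^2

% Calculating the t-statistic
t^* = \frac{\bar{r}}{s_r / \sqrt{n}}

% Two-sided p-value from the null distribution t_{n-1}
p = P\left( |T_{n-1}| \geq |t^*| \right)
```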

6 1-11-2005 Two-sample t-test vs paired t-test
                         Two-sample t   Paired t
Denominator              1.51           0.04
p-value                  0.870          0.002
Reference distribution   t_{2n-2}       t_{n-1}

7 1-11-2005 Two-sample t-test vs paired t-test

8 1-11-2005 Two-sample t-test vs paired t-test

9 1-11-2005 Two-sample t-test vs paired t-test

10 1-11-2005 Two-sample t-test vs paired t-test
Small advantage for the two-sample t-test, purely due to degrees of freedom (2n-2 vs n-1).
Bigger possible advantage for the paired t-test, due to the smaller denominator (standard error).
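
The degrees-of-freedom effect can be illustrated by evaluating the same t-statistic against both reference distributions. The sketch below integrates the Student t density numerically with Simpson's rule, so it needs only the standard library. The value t* = 2.5 is hypothetical, and n = 6 matches the experiment described earlier.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, via Simpson's rule on [0, |t|].

    `steps` must be even; 2000 is ample for the moderate |t| values used here.
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    central_area = total * h / 3          # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


n = 6          # arrays (pairs) per group, as in the experiment above
t_star = 2.5   # hypothetical t-statistic, identical for both tests

p_two_sample = t_two_sided_p(t_star, 2 * n - 2)   # reference: t with 2n-2 df
p_paired = t_two_sided_p(t_star, n - 1)           # reference: t with n-1 df
print(f"df = {2 * n - 2}: p = {p_two_sample:.4f}")
print(f"df = {n - 1}: p = {p_paired:.4f}")
```

With the statistic held fixed, the larger number of degrees of freedom gives the smaller p-value. In practice this gain is usually outweighed by the change in the denominator.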

11 1-11-2005 When is the t-test "better" than the paired t-test?
              Two-sample t   Paired t
Denominator   0.56           0.64
p-value       0.0008         0.0097
Q: Can we use the two-sample t-test in this case since it gives us a smaller p-value?
A: NO! Randomization and non-independence issues remain.

12 1-11-2005 Multiple Factor Experiments - Incomplete Block Design
[Figure: array layouts with Cy3 and Cy5 channels. Labels: Control, Treatment; Control, Treatment 1, Treatment 2.]

13 1-11-2005 Multiple Factor Experiments - Incomplete Block Design
[Figure: candidate designs. Labels in slide order: "No color effect, Homogeneous variance, Optimal"; "No color effect, Homogeneous variance, Sub-Optimal"; "Homogeneous color effect, Homogeneous variance".]

14 1-11-2005 Multiple Factor Experiments - Incomplete Block Design
[Figure: two design diagrams, each over the conditions C, T1, T2, T1 & T2. Label: Homogeneous Variance.]

15 1-11-2005 limma
limma is a package for the analysis of microarray data, especially the use of linear models for analyzing designed experiments and the assessment of differential expression.
Specially constructed data objects to represent various aspects of microarray data.
Specially constructed "object methods" for importing, normalizing, displaying and analyzing microarray data.
Unique in the implementation of the empirical Bayes procedure for identifying differentially expressed genes by "borrowing" information from different genes (everything so far has been gene by gene).
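
The idea of "borrowing" information can be illustrated with the variance shrinkage behind a moderated t-statistic: each gene's variance is pulled toward a prior value shared across genes. limma is an R package and estimates that prior from all genes; the Python sketch below does not call limma. The prior variance and prior degrees of freedom are fixed, hypothetical values, and the function name `moderated_t` is ours.

```python
import math
from statistics import mean, variance


def moderated_t(diffs, s2_prior, d_prior):
    """Paired t-statistic with the gene's variance shrunk toward a prior variance.

    s2_prior and d_prior (prior variance and prior degrees of freedom) are
    ASSUMED inputs here; in practice they are estimated from all genes.
    Returns (moderated t, shrunken variance, total degrees of freedom).
    """
    n = len(diffs)
    d_gene = n - 1
    s2_gene = variance(diffs)
    # Weighted average of prior and gene-specific variance.
    s2_post = (d_prior * s2_prior + d_gene * s2_gene) / (d_prior + d_gene)
    t_mod = mean(diffs) / math.sqrt(s2_post / n)
    return t_mod, s2_post, d_prior + d_gene


# Hypothetical within-array differences for one gene with a very small variance.
diffs = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48]
ordinary_t = mean(diffs) / math.sqrt(variance(diffs) / len(diffs))
t_mod, s2_post, df_total = moderated_t(diffs, s2_prior=0.05, d_prior=4)

print(f"ordinary t  = {ordinary_t:.1f} on {len(diffs) - 1} df")
print(f"moderated t = {t_mod:.1f} on {df_total} df (variance {s2_post:.4f})")
```

A gene whose variance happens to be tiny by chance no longer produces an extreme statistic, and the degrees of freedom increase by the prior degrees of freedom.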

