Scenario 6 Distinguishing different types of leukemia to target treatment.


Acute Myeloid Leukemia (AML) vs Acute Lymphoblastic Leukemia (ALL) Golub, T. R., et al. 1999. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7.


Spotted Microarray Process (figure: control (CTRL) and test (TEST) samples)

Microarray Platforms
1. Spotted arrays
–Inserts from cDNA libraries, PCR products, or oligonucleotides
–Probed with labeled RNA or cDNA from 2 samples
2. Affymetrix GeneChip arrays
–25-mer oligonucleotides synthesized on a glass wafer
–Probed with labeled RNA or cDNA from a single sample

Affymetrix Synthesis of Ordered Oligonucleotide Arrays (figure: light passing through a mask deprotects selected positions on the substrate, a nucleotide (T, then C, and so on) is coupled at the deprotected positions, and the cycle is repeated)

Affymetrix GeneChip ® Probe Array

Affymetrix GeneChip® Probe Arrays
–GeneChip probe array: 1.28 cm
–Over 250,000 different probes complementary to genetic information of interest
–Each probe cell or feature (24 µm) contains millions of copies of a specific oligonucleotide probe
–Hybridized probe cell: single-stranded, fluorescently labeled DNA target bound to the oligonucleotide probe
–Image of hybridized probe array

Affymetrix Probe Tiling Strategy
The presence or absence of each gene is determined by a panel of 20 perfect-match and 20 mismatch (control) oligonucleotides (25-mers).

Sample output:

Data Analysis (figure: Sample 1 and Sample 2 signals, in light units, plotted against each other)

Golub et al. 1999. The paper, data tables, and supplemental figures are available at http://www-genome.wi.mit.edu/cgi-bin/cancer/datasets.cgi, near the bottom of the page under “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.”

Golub et al. 1999.
–Measured the expression of 6817 human genes using Affymetrix arrays.
–Initially examined 27 ALL and 11 AML samples.
–Each ALL or AML specimen was used to prepare labeled RNA that was apparently hybridized with a single chip.
–“Samples were subjected to a priori quality control standards regarding the amount of labeled RNA and the quality of the scanned microarray image.” Eight of 80 leukemia samples were discarded.

Golub et al. 1999. The signal strength from each chip was apparently normalized to that of the other chips by multiplying every value on the chip by the multiplication factor listed in the “Rescaling factors” table on the web.
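The rescaling step described above can be sketched as follows. This is only an illustration of multiplying every value on a chip by a per-chip factor; the sample names, expression values, and factors are hypothetical and are not taken from the “Rescaling factors” table.

```python
def rescale_chip(values, factor):
    """Multiply every expression value on one chip by that chip's rescaling factor."""
    return [v * factor for v in values]

# Hypothetical raw values and factors, for illustration only
chips = {
    "sample_A": [100.0, 250.0, 40.0],
    "sample_B": [80.0, 200.0, 32.0],
}
factors = {"sample_A": 1.0, "sample_B": 1.25}

# After rescaling, values from different chips are on a comparable scale
rescaled = {name: rescale_chip(vals, factors[name]) for name, vals in chips.items()}
```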

Experimental Goals
1. “Class Prediction”: Determine whether an unknown sample belongs to a predefined class.
–Find a set of genes whose expression is high in AML and low in ALL or vice versa.
–Measure the expression of these genes in unknown samples and use the measurements as a class predictor.

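P(g,c) = correlation coefficient measuring the degree to which expression of a given gene g in the set of samples correlates with assignment to either class c (AML or ALL):

$$P(g,c) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)} \quad\text{or}\quad \frac{\mu_2(g) - \mu_1(g)}{\sigma_1(g) + \sigma_2(g)}$$

where $\mu_k(g)$ and $\sigma_k(g)$ are the mean and standard deviation of the expression of gene g in the samples of class k.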
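A minimal sketch of this measure for a single gene, assuming the numerator is the difference of class means and the denominator the sum of class standard deviations. The use of the sample standard deviation (`statistics.stdev`) is an assumption, and the expression values are hypothetical.

```python
from statistics import mean, stdev

def class_correlation(expr_class1, expr_class2):
    """P(g,c) for one gene: (mean1 - mean2) / (sd1 + sd2).

    Assumption: sample standard deviation is used; each class needs
    at least two samples and some within-class variation.
    """
    mu1, mu2 = mean(expr_class1), mean(expr_class2)
    sd1, sd2 = stdev(expr_class1), stdev(expr_class2)
    return (mu1 - mu2) / (sd1 + sd2)

# Hypothetical expression values of one gene in ALL and AML samples
all_values = [10.0, 12.0, 14.0]
aml_values = [2.0, 4.0, 6.0]

p_all_high = class_correlation(all_values, aml_values)  # positive: higher in ALL
p_aml_high = class_correlation(aml_values, all_values)  # same magnitude, opposite sign
```

A large positive or negative value indicates a gene whose expression separates the two classes well relative to its within-class spread.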

Figure 2. Neighborhood analysis: ALL vs AML. For the 38 leukemia samples in the initial dataset, the plot shows the number of genes within various 'neighborhoods' of the ALL/AML class distinction together with curves showing the 5% and 1% significance levels for the number of genes within corresponding neighborhoods of the randomly permuted class distinctions (see notes 16,17 in the paper). Genes more highly expressed in ALL compared to AML are shown in the left panel; those more highly expressed in AML compared to ALL are shown in the right panel. Note the large number of genes highly correlated with the class distinction. In the left panel (higher in ALL), the number of genes with correlation P(g,c) > 0.30 was 709 for the AML-ALL distinction, but had a median of 173 genes for random class distinctions. Note that P(g,c) = 0.30 is the point where the observed data intersect the 1% significance level, meaning that 1% of random neighborhoods contain as many points as the observed neighborhood around the AML-ALL distinction. Similarly, in the right panel (higher in AML), 711 genes with P(g,c) > 0.28 were observed, whereas a median of 136 genes is expected for random class distinctions.
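The comparison against randomly permuted class distinctions can be sketched roughly as below. This is an illustrative permutation test, not the paper's implementation: the helper names, the toy expression data, the number of permutations, and the random seed are all hypothetical, and the sample standard deviation is an assumption.

```python
import random
from statistics import mean, stdev, median

def p_gc(values, labels):
    """(mean_ALL - mean_AML) / (sd_ALL + sd_AML) for one gene."""
    c1 = [v for v, lab in zip(values, labels) if lab == "ALL"]
    c2 = [v for v, lab in zip(values, labels) if lab == "AML"]
    denom = stdev(c1) + stdev(c2)
    # Sketch-level guard: avoid division by zero when neither class varies
    return (mean(c1) - mean(c2)) / denom if denom else 0.0

def neighborhood_size(genes, labels, threshold):
    """Number of genes whose P(g,c) exceeds the threshold."""
    return sum(1 for values in genes if p_gc(values, labels) > threshold)

def permuted_sizes(genes, labels, threshold, n_perm=200, seed=0):
    """Neighborhood sizes for randomly permuted class labels."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        sizes.append(neighborhood_size(genes, shuffled, threshold))
    return sizes

# Hypothetical data: 2 genes measured in 4 ALL and 4 AML samples
labels = ["ALL"] * 4 + ["AML"] * 4
genes = [
    [9.0, 10.0, 11.0, 10.0, 1.0, 2.0, 1.0, 2.0],  # higher in ALL
    [5.0, 6.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0],     # unrelated to class
]

observed = neighborhood_size(genes, labels, threshold=0.3)
sizes = permuted_sizes(genes, labels, threshold=0.3)
expected_median = median(sizes)
```

Comparing `observed` with the distribution in `sizes` (for example its median or upper percentiles) mirrors the comparison of 709 observed genes with a median of 173 for random class distinctions.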

Votes are cast in favor of either AML or ALL for each informative gene. The magnitude of each vote is given by $w_i v_i$, where

$$v_i = x_i - \frac{\mu_{AML} + \mu_{ALL}}{2}$$

($x_i$ = expression of gene $i$) and $w_i$ = a weighting factor that reflects how well the gene is correlated with the class distinction. The class with the most votes wins (either ALL or AML).

$$\text{Prediction Strength (PS)} = \frac{V_{win} - V_{lose}}{V_{win} + V_{lose}}$$

and must be > 0.3.
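The voting scheme can be sketched as follows. The sign convention is an assumption: the weight w_i is taken to be positive for genes higher in AML, so a positive vote counts for AML and a negative one for ALL. The gene names, weights, class means, and the "uncertain" label for low-PS calls are hypothetical.

```python
def predict_class(sample, genes, ps_threshold=0.3):
    """Weighted voting over informative genes.

    sample: {gene: expression x_i}
    genes:  {gene: (w_i, mu_AML, mu_ALL)}
    Assumption: w_i > 0 means the gene is higher in AML.
    Returns (call, prediction_strength).
    """
    v_aml = v_all = 0.0
    for gene, (w, mu_aml, mu_all) in genes.items():
        v = sample[gene] - (mu_aml + mu_all) / 2  # distance from the class midpoint
        vote = w * v
        if vote > 0:
            v_aml += vote
        else:
            v_all += -vote
    v_win, v_lose = max(v_aml, v_all), min(v_aml, v_all)
    total = v_win + v_lose
    ps = (v_win - v_lose) / total if total else 0.0
    if ps <= ps_threshold:
        return "uncertain", ps
    return ("AML" if v_aml > v_all else "ALL"), ps

# Hypothetical informative genes: (weight, mean in AML, mean in ALL)
genes = {"g1": (2.0, 10.0, 2.0), "g2": (-1.5, 3.0, 9.0)}

call_a, ps_a = predict_class({"g1": 9.0, "g2": 4.0}, genes)  # both genes vote AML
call_b, ps_b = predict_class({"g1": 6.5, "g2": 6.5}, genes)  # votes nearly cancel
```

Requiring PS to exceed 0.3 means that a sample whose votes are closely split is left without a class call rather than being forced into one.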

Figure 3b. Genes distinguishing ALL from AML. The 50 genes most highly correlated with the ALL/AML class distinction are shown. Each row corresponds to a gene, with the columns corresponding to expression levels in different samples. Expression levels for each gene are normalized across the samples such that the mean is 0 and the standard deviation is 1. Expression levels greater than the mean are shaded in red, and those below the mean are shaded in blue. The scale indicates standard deviations above or below the mean. The top panel shows genes highly expressed in ALL, the bottom panel shows genes more highly expressed in AML. Note that while these genes as a group appear correlated with class, no single gene is uniformly expressed across the class, illustrating the value of a multi-gene prediction method.
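The per-gene normalization mentioned in the caption (mean 0, standard deviation 1 across samples) can be sketched as below. Using the population standard deviation is an assumption, and the expression values are hypothetical.

```python
from statistics import mean, pstdev

def zscore_row(values):
    """Normalize one gene's expression across samples to mean 0 and SD 1.

    Assumption: population standard deviation; a gene with no variation
    is mapped to all zeros.
    """
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

# Hypothetical expression of one gene across three samples
z = zscore_row([2.0, 4.0, 6.0])

# Values above the gene's mean would be shaded red, those below it blue
colors = ["red" if value > 0 else "blue" if value < 0 else "neutral" for value in z]
```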

Supplementary fig. 2. Expression levels of predictive genes in independent dataset. The expression levels of the 50 genes most highly correlated with the ALL-AML distinction in the initial dataset were determined in the independent dataset. Each row corresponds to a gene, with the columns corresponding to expression levels in different samples. The expression level of each gene in the independent dataset is shown relative to the mean of expression levels for that gene in the initial dataset. Expression levels greater than the mean are shaded in red, and those below the mean are shaded in blue. The scale indicates standard deviations above or below the mean. The top panel shows genes highly expressed in ALL, the bottom panel shows genes more highly expressed in AML.

Experimental Goals
2. “Class Discovery”: Determine whether a group of samples can be divided into two or more classes based only on measurement of their gene expression.
–Employs “self-organizing maps.”
–Must address two requirements: construction of algorithms to cluster the samples by gene expression, and determining whether the class assignments produced by the algorithm are meaningful.
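As a rough illustration of the clustering requirement, the sketch below trains a very small one-dimensional self-organizing map with two nodes. It is not the paper's implementation: the learning-rate schedule, neighbor influence, seed, and toy expression profiles are all hypothetical choices.

```python
import random

def nearest_node(sample, nodes):
    """Index of the node closest to the sample (squared Euclidean distance)."""
    dists = [sum((x - w) ** 2 for x, w in zip(sample, node)) for node in nodes]
    return dists.index(min(dists))

def train_som(samples, n_nodes=2, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM; nodes are initialized from randomly chosen samples."""
    rng = random.Random(seed)
    nodes = [list(s) for s in rng.sample(samples, n_nodes)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)  # learning rate decays linearly
        for s in samples:
            winner = nearest_node(s, nodes)
            for j, node in enumerate(nodes):
                if j == winner:
                    step = lr
                elif abs(j - winner) == 1:
                    step = 0.1 * lr  # grid neighbors move less
                else:
                    step = 0.0
                for k in range(len(node)):
                    node[k] += step * (s[k] - node[k])
    return nodes

# Hypothetical expression profiles (2 genes) for 6 unlabeled samples
samples = [
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
    [5.0, 5.1], [4.9, 5.0], [5.1, 4.9],
]
nodes = train_som(samples)
clusters = [nearest_node(s, nodes) for s in samples]
```

Whether the resulting clusters are meaningful is the second requirement; one way to check, in the spirit of the class prediction goal, is to see whether a predictor built from the discovered classes performs well on independent samples.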
