Scenario 6 Distinguishing different types of leukemia to target treatment.


Acute Myeloid Leukemia (AML) vs. Acute Lymphoblastic Leukemia (ALL)
Golub, T. R., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7.

[Figure slides: AML vs. ALL]

Spotted Microarray Process
[Figure: control (CTRL) and test (TEST) samples]

Microarray Platforms
Spotted arrays
– Inserts from cDNA libraries, PCR products, or oligonucleotides
– Probed with labeled RNA or cDNA from 2 samples
Affymetrix GeneChip arrays
– 25-mer oligonucleotides synthesized on a glass wafer
– Probed with labeled RNA or cDNA from a single sample

Affymetrix Synthesis of Ordered Oligonucleotide Arrays
[Figure: light-directed synthesis on a substrate. Light passes through a mask (deprotection) and T is coupled; light passes through a second mask and C is coupled; REPEAT.]

Affymetrix GeneChip® Probe Array

Affymetrix GeneChip® Probe Arrays
– GeneChip Probe Array: 1.28 cm
– Each probe cell or feature (24 µm) contains millions of copies of a specific oligonucleotide probe
– Over 250,000 different probes complementary to genetic information of interest
– Single-stranded, fluorescently labeled DNA target; oligonucleotide probe
– Hybridized Probe Cell
– Image of Hybridized Probe Array

Affymetrix Probe Tiling Strategy
The presence or absence of each gene is determined by a panel of 20 perfect-match and 20 mismatch (control) 25-mer oligonucleotides.

Sample output:

Data Analysis
[Figure: scatter plot of Sample 2 (light units) against Sample 1 (light units)]

Golub, T. R., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7.
Near the bottom of the page: “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.”
– Paper, data tables, supplemental figures

Golub, T. R., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7.
– Measured the expression of 6817 human genes using Affymetrix arrays.
– Initially examined 27 ALL and 11 AML samples.
– Each ALL or AML specimen was used to prepare labeled RNA that was apparently hybridized with a single chip.
– “Samples were subjected to a priori quality control standards regarding the amount of labeled RNA and the quality of the scanned microarray image.”
– Eight of 80 leukemia samples were discarded.

Golub, T. R., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7.
The signal strength from each chip was apparently normalized to that of the other chips by multiplying every value in the chip by the multiplication factor listed in the “Rescaling factors” table on the web.
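
The rescaling step described above can be sketched as follows. This is a minimal illustration, assuming the expression data are held as a genes x chips array with one factor per chip; the function name `rescale_chips` and the numbers are hypothetical and are not taken from the published “Rescaling factors” table.

```python
import numpy as np

def rescale_chips(expression, factors):
    """Multiply every value on each chip (column) by that chip's factor.

    expression: array of shape (genes, chips)
    factors:    one multiplication factor per chip
    """
    expression = np.asarray(expression, dtype=float)
    factors = np.asarray(factors, dtype=float)
    if expression.shape[1] != factors.shape[0]:
        raise ValueError("need exactly one rescaling factor per chip")
    # Broadcasting applies factors[j] to every gene measured on chip j.
    return expression * factors

# Hypothetical data: 3 genes measured on 2 chips; chip 2 is half as bright.
raw = np.array([[100.0, 50.0],
                [200.0, 100.0],
                [10.0, 5.0]])
scaled = rescale_chips(raw, [1.0, 2.0])
```

After rescaling, the two hypothetical chips are on a common intensity scale, so per-gene values can be compared across samples.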

Experimental Goals
1. “Class Prediction”: Determine whether an unknown sample belongs to a predefined class.
– Find a set of genes whose expression is high in AML and low in ALL or vice versa.
– Measure the expression of these genes in unknown samples and use the measurements as a class predictor.

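P(g,c) = correlation coefficient measuring the degree to which expression of a given gene g in the set of samples correlates with assignment to either class (AML or ALL):

$$P(g,c) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)} \quad \text{or} \quad \frac{\mu_2(g) - \mu_1(g)}{\sigma_1(g) + \sigma_2(g)}$$

where $\mu_k(g)$ and $\sigma_k(g)$ are the mean and standard deviation of the expression of gene g in the samples of class k.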
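
As a rough illustration of the correlation measure above, the sketch below computes P(g,c) for every gene at once. The function name `class_correlation`, the genes x samples layout, and the use of the sample standard deviation (`ddof=1`) are assumptions made for this example; a gene that is constant within both classes would give a zero denominator and is not handled here.

```python
import numpy as np

def class_correlation(expression, is_class1):
    """Return P(g,c) = (mu1 - mu2) / (sigma1 + sigma2) for each gene.

    expression: array of shape (genes, samples)
    is_class1:  boolean array, True where the sample belongs to class 1
    """
    x = np.asarray(expression, dtype=float)
    mask = np.asarray(is_class1, dtype=bool)
    class1, class2 = x[:, mask], x[:, ~mask]
    mu1, mu2 = class1.mean(axis=1), class2.mean(axis=1)
    # ddof=1 (sample standard deviation) is an assumption of this sketch.
    sigma1 = class1.std(axis=1, ddof=1)
    sigma2 = class2.std(axis=1, ddof=1)
    return (mu1 - mu2) / (sigma1 + sigma2)

# Hypothetical data: gene 0 is higher in class 1, gene 1 is higher in class 2.
expr = np.array([[9.0, 11.0, 1.0, 3.0],
                 [1.0, 3.0, 9.0, 11.0]])
p = class_correlation(expr, [True, True, False, False])
```

A large positive value marks a gene that is high in class 1 and low in class 2; a large negative value marks the reverse, which corresponds to the second form of the formula.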

Figure 2. Neighborhood analysis: ALL vs AML. For the 38 leukemia samples in the initial dataset, the plot shows the number of genes within various 'neighborhoods' of the ALL/AML class distinction together with curves showing the 5% and 1% significance levels for the number of genes within corresponding neighborhoods of the randomly permuted class distinctions (see notes 16,17 in the paper). Genes more highly expressed in ALL compared to AML are shown in the left panel; those more highly expressed in AML compared to ALL are shown in the right panel. Note the large number of genes highly correlated with the class distinction. In the left panel (higher in ALL), the number of genes with correlation P(g,c) > 0.30 was 709 for the AML-ALL distinction, but had a median of 173 genes for random class distinctions. Note that P(g,c) = 0.30 is the point where the observed data intersect the 1% significance level, meaning that 1% of random neighborhoods contain as many points as the observed neighborhood around the AML-ALL distinction. Similarly, in the right panel (higher in AML), 711 genes with P(g,c) > 0.28 were observed, whereas a median of 136 genes is expected for random class distinctions.
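
The idea behind the neighborhood analysis can be sketched as a permutation test: count the genes whose P(g,c) exceeds a threshold for the real class labels, then repeat the count for randomly shuffled labels. This is a simplified sketch under stated assumptions, not the procedure from the paper's notes 16 and 17; the function names, the number of permutations, and the use of `ddof=1` are all choices made for this example.

```python
import numpy as np

def class_correlation(x, mask):
    """P(g,c) = (mu1 - mu2) / (sigma1 + sigma2) for each gene (row)."""
    c1, c2 = x[:, mask], x[:, ~mask]
    return (c1.mean(axis=1) - c2.mean(axis=1)) / (
        c1.std(axis=1, ddof=1) + c2.std(axis=1, ddof=1))

def neighborhood_analysis(expression, is_class1, threshold,
                          n_perm=200, seed=0):
    """Compare the observed gene count with counts for permuted labels.

    Returns (observed count, median permuted count,
             99th percentile of permuted counts).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(expression, dtype=float)
    mask = np.asarray(is_class1, dtype=bool)
    observed = int((class_correlation(x, mask) > threshold).sum())
    perm_counts = np.empty(n_perm, dtype=int)
    for i in range(n_perm):
        # Shuffling the labels keeps the class sizes but breaks any real
        # association between expression and class.
        shuffled = rng.permutation(mask)
        perm_counts[i] = (class_correlation(x, shuffled) > threshold).sum()
    return (observed,
            float(np.median(perm_counts)),
            float(np.percentile(perm_counts, 99)))
```

If the observed count lies well above the 99th percentile of the permuted counts, more genes track the class distinction than would be expected by chance, which is the pattern the figure reports for the ALL/AML distinction.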

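Votes are cast in favor of either AML or ALL for each informative gene. The magnitude of each vote is given by $w_i v_i$, where

$$v_i = x_i - \frac{\mu_{\mathrm{AML}} + \mu_{\mathrm{ALL}}}{2}$$

($x_i$ = expression of gene $i$; $\mu_{\mathrm{AML}}$ and $\mu_{\mathrm{ALL}}$ = mean expression of gene $i$ in the AML and ALL samples) and $w_i$ = a weighting factor that reflects how well the gene is correlated with the class distinction. The class with the most votes wins (either ALL or AML). The prediction strength is

$$\mathrm{PS} = \frac{V_{\mathrm{win}} - V_{\mathrm{lose}}}{V_{\mathrm{win}} + V_{\mathrm{lose}}}$$

and must be > 0.3.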
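
The voting scheme above can be sketched as below. The sign convention is an assumption of this example: a weight w_i is taken to be positive when the gene is more highly expressed in AML, so a positive product w_i v_i counts as a vote for AML and a negative one as a vote for ALL. The function name `weighted_vote` and the "uncertain" label for samples below the PS threshold are likewise hypothetical.

```python
import numpy as np

def weighted_vote(x_new, mu_aml, mu_all, weights, ps_threshold=0.3):
    """Classify one sample by weighted voting over informative genes.

    x_new:   expression of each informative gene in the new sample
    mu_aml:  mean expression of each gene in the AML training samples
    mu_all:  mean expression of each gene in the ALL training samples
    weights: w_i, assumed positive for genes that are higher in AML
    Returns (label, prediction strength).
    """
    x_new, mu_aml, mu_all, weights = (
        np.asarray(a, dtype=float) for a in (x_new, mu_aml, mu_all, weights))
    v = x_new - (mu_aml + mu_all) / 2.0   # distance from the class midpoint
    votes = weights * v                   # > 0 favors AML, < 0 favors ALL
    v_aml = votes[votes > 0].sum()
    v_all = -votes[votes < 0].sum()
    v_win, v_lose = max(v_aml, v_all), min(v_aml, v_all)
    total = v_win + v_lose
    ps = (v_win - v_lose) / total if total > 0 else 0.0
    winner = "AML" if v_aml > v_all else "ALL"
    # Below the PS threshold no call is made.
    return (winner if ps > ps_threshold else "uncertain"), float(ps)

# Hypothetical two-gene predictor: gene 0 is high in AML, gene 1 is high in ALL.
label, ps = weighted_vote([9.0, 2.0], [10.0, 1.0], [2.0, 9.0], [2.0, -2.0])
```

In the hypothetical call, both genes sit on the AML side of their midpoints, so every vote goes to AML. A sample whose genes disagree with roughly equal weight would fall below the 0.3 threshold and be left unclassified.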

Figure 3b. Genes distinguishing ALL from AML. The 50 genes most highly correlated with the ALL/AML class distinction are shown. Each row corresponds to a gene, with the columns corresponding to expression levels in different samples. Expression levels for each gene are normalized across the samples such that the mean is 0 and the standard deviation is 1. Expression levels greater than the mean are shaded in red, and those below the mean are shaded in blue. The scale indicates standard deviations above or below the mean. The top panel shows genes more highly expressed in ALL; the bottom panel shows genes more highly expressed in AML. Note that while these genes as a group appear correlated with class, no single gene is uniformly expressed across the class, illustrating the value of a multi-gene prediction method.
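
The per-gene normalization described in the caption (mean 0, standard deviation 1 across samples) can be sketched as follows. The function name `standardize_genes` and the use of the population standard deviation (NumPy's default, `ddof=0`) are assumptions of this example; genes with no variation across samples would need separate handling.

```python
import numpy as np

def standardize_genes(expression):
    """Scale each gene (row) to mean 0 and standard deviation 1 across samples."""
    x = np.asarray(expression, dtype=float)
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)   # population SD; an assumption
    return (x - mean) / std

# Hypothetical data: 2 genes x 4 samples on very different raw scales.
expr = np.array([[100.0, 200.0, 300.0, 400.0],
                 [1.0, 1.5, 0.5, 1.0]])
z = standardize_genes(expr)
```

After this step, positive values are above that gene's mean (shaded red in the figure) and negative values are below it (shaded blue), regardless of the gene's raw intensity.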

Supplementary fig. 2. Expression levels of predictive genes in independent dataset. The expression levels of the 50 genes most highly correlated with the ALL-AML distinction in the initial dataset were determined in the independent dataset. Each row corresponds to a gene, with the columns corresponding to expression levels in different samples. The expression level of each gene in the independent dataset is shown relative to the mean of expression levels for that gene in the initial dataset. Expression levels greater than the mean are shaded in red, and those below the mean are shaded in blue. The scale indicates standard deviations above or below the mean. The top panel shows genes more highly expressed in ALL; the bottom panel shows genes more highly expressed in AML.

Experimental Goals
2. “Class Discovery”: Determine whether a group of samples can be divided into two or more classes based only on measurement of their gene expression.
– Employs “self-organizing maps.”
– Must address two requirements: constructing algorithms to cluster the samples by gene expression, and determining whether the class assignments produced by the algorithm are meaningful.
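
To make the clustering requirement concrete, here is a minimal self-organizing map sketch with two nodes on a one-dimensional grid. It is a generic textbook-style illustration, not the software or settings used in the paper; the function names, the linear decay of the learning rate and neighborhood width, and all parameter values are assumptions of this example.

```python
import numpy as np

def train_som(samples, n_nodes=2, n_iter=500, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small 1-D self-organizing map; returns the node vectors.

    samples: array of shape (samples, genes)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(samples, dtype=float)
    # Start each node at a randomly chosen sample.
    nodes = x[rng.choice(len(x), size=n_nodes, replace=False)].copy()
    grid = np.arange(n_nodes)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
        sigma = sigma0 * (1.0 - frac) + 1e-3     # neighborhood shrinks
        sample = x[rng.integers(len(x))]
        # Best-matching unit: the node closest to the sample.
        bmu = np.argmin(((nodes - sample) ** 2).sum(axis=1))
        # Nodes near the BMU on the grid are pulled toward the sample too.
        influence = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        nodes += lr * influence[:, None] * (sample - nodes)
    return nodes

def assign_clusters(samples, nodes):
    """Assign each sample to its nearest node."""
    x = np.asarray(samples, dtype=float)
    dist = ((x[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)
```

Calling `train_som` on a samples x genes matrix and then `assign_clusters` yields one cluster index per sample. Whether those clusters are meaningful is the second requirement listed above and is not addressed by this sketch.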