1
BASIC METHODOLOGIES OF ANALYSIS
SUPERVISED ANALYSIS: HYPOTHESIS TESTING USING CLINICAL INFORMATION (MLL VS NO TRANS.); IDENTIFY DIFFERENTIATING GENES. SUPERVISED METHODS CAN ONLY VALIDATE OR REJECT HYPOTHESES; THEY CANNOT LEAD TO DISCOVERY OF UNEXPECTED PARTITIONS.
UNSUPERVISED: EXPLORATORY ANALYSIS. NO PRIOR KNOWLEDGE IS USED. EXPLORE STRUCTURE OF DATA ON THE BASIS OF CORRELATIONS AND SIMILARITIES.
2
Advantages of SPC
Scans all resolutions (T)
Robust against noise and initialization: calculates collective correlations
Identifies “natural” and stable clusters (ΔT)
No need to pre-specify number of clusters
Clusters can be any shape
Can use distance matrix as input (vs coordinates)
3
Stability: larger ΔT means a tighter, more stable cluster
4
P53: p53 IS A CENTRAL PLAYER IN APOPTOSIS AND IN CELL CYCLE CONTROL. IT IS A TRANSCRIPTION FACTOR.
5
PRIMARY TARGETS OF P53
TEMPERATURE-SENSITIVE MUTANT P53, ACTIVATED AT 32 °C (t=0).
MEASURE EXPRESSION AT t=0,2,6,12,24 h (USE t=0 AS CONTROL).
REPEAT IN PRESENCE OF CYCLOHEXIMIDE (CHX) AT t=0,2,4,6,9,12 h (CHX INHIBITS PROTEIN SYNTHESIS).
IDENTIFY UPREGULATED GENES USING FILTER: AT LEAST 2.5-FOLD INCREASE AT 3 OR MORE TIME POINTS (SEPARATELY IN EACH OF THE TWO EXPERIMENTS, -CHX AND +CHX).
38 CANDIDATE PRIMARIES. EFFECT OF FILTERING???
RELEASE FILTER FROM +CHX. CLUSTERING: 38 47 (31)
K. Kannan, D. Givol, G. Rechavi,... G. Getz, I. Kela, Oncogene 2001
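The fold-increase filter on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `fold_filter`, the toy numbers, and the choice to call a gene a candidate when it passes in both experiments are assumptions, not details taken from the Oncogene paper.

```python
import numpy as np

def fold_filter(expr, control_col=0, min_fold=2.5, min_points=3):
    """Boolean mask of genes passing the fold-increase filter.

    expr: array (n_genes, n_timepoints) of positive expression levels;
    column `control_col` holds the t=0 control measurement.
    """
    expr = np.asarray(expr, dtype=float)
    control = expr[:, [control_col]]               # keep 2-D for broadcasting
    later = np.delete(expr, control_col, axis=1)   # all time points except t=0
    fold = later / control                         # fold change relative to t=0
    # a gene passes if enough time points reach the required fold increase
    return (fold >= min_fold).sum(axis=1) >= min_points

# Toy data (hypothetical): 3 genes, time points as on the slide.
minus_chx = np.array([
    [1.0, 3.0, 3.0, 3.0, 1.0],   # 3 points at 3-fold: passes
    [1.0, 3.0, 3.0, 1.0, 1.0],   # only 2 points: fails
    [2.0, 5.0, 6.0, 7.0, 8.0],   # 2.5, 3, 3.5, 4-fold: passes
])
plus_chx = np.array([
    [1.0, 3.0, 3.0, 3.0, 3.0, 3.0],   # passes
    [1.0, 3.0, 3.0, 3.0, 3.0, 3.0],   # passes
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],   # fails
])

# Assumption: a candidate primary target passes the filter in both experiments.
candidates = fold_filter(minus_chx) & fold_filter(plus_chx)
```

Releasing the filter from the +CHX experiment, as the slide proposes, would correspond to using `fold_filter(minus_chx)` alone and relying on clustering to recover the remaining structure.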
6
REDUCE EFFECT OF FILTERING BY CLUSTERING
[Figure: X marks the 38 candidate primary targets; axis: % candidate primary targets]
K. Kannan et al, Oncogene
7
INHIBITION OF P53-INDUCED APOPTOSIS BY IL-6
Lotem…Rechavi, D. Givol, L. Sachs PNAS 2003
BY REDUCING TEMPERATURE TO 32 DEGREES, P53 ASSUMES WILD-TYPE CONFORMATION, IS ACTIVATED AND INDUCES APOPTOSIS.
ADDING THE CYTOKINE IL-6 INHIBITS THE APOPTOTIC PROCESS.
QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM?
8
QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM?
[Diagram labels: Activated p53; Transactivation; p21/Waf1; Growth arrest; Bax, IGF-BP3, Fas, killer/DR5, Noxa, PIG3, p53AIP1, PIDD, Puma; Apoptosis; Other genes etc.; Other activities (C terminal = TFIIH binding?) (N terminal = SH3 binding?)]
9
QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM?
[Diagram labels: Activated p53; Transactivation; p21/Waf1; Growth arrest; Bax, IGF-BP3, Fas, killer/DR5, Noxa, PIG3, p53AIP1, PIDD, Puma; Caspase cascade; Apoptosis; Other genes etc.; Other activities (C terminal = TFIIH binding?) (N terminal = SH3 binding?); IL-6 ?? marked at candidate points of interference]
10
333 GENES UPREGULATED BY P53: NOT AFFECTED BY IL-6
309 GENES DOWNREGULATED BY P53: ALSO NOT AFFECTED
11
QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM?
ANSWER: AT BOTTOM!!
[Diagram labels: Activated p53; Transactivation; p21/Waf1; Growth arrest; Bax, IGF-BP3, Fas, killer/DR5, Noxa, PIG3, p53AIP1, PIDD, Puma; Caspase cascade; Apoptosis; Other genes etc.; Other activities (C terminal = TFIIH binding?) (N terminal = SH3 binding?); IL-6 ?? marked at candidate points of interference]
12
COLON CANCER DATA
Alon, Barkai, Notterman, Gish, Ybarra, Mack, Levine: PNAS 96, 6745 (1999)
AFFYMETRIX; 40 TUMOR, 22 NORMAL TISSUES
2000 (OUT OF 6500) GENES OF HIGHEST INTENSITY
A_ij = EXPRESSION LEVEL OF GENE i IN TISSUE j
13
COLON CANCER DATA
14
TWO-WAY CLUSTERING: S1(G1), G1(S1)
15
TWO-WAY CLUSTERING (ORDERED): S1(G1), G1(S1)
16
TWO-WAY CLUSTERING – TISSUES
1. IDENTIFY TISSUE CLASSES (TUMOR/NORMAL)
17
TWO-WAY CLUSTERING – GENES – G1(S1)
2. FIND DIFFERENTIATING AND CORRELATED GENES
EACH GENE = POINT IN 62-DIMENSIONAL SPACE
[Figure labels: Ribosomal proteins; Cytochrome C; HLA2; metabolism; Erel]
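Treating each gene as a point in tissue space, and noting that SPC can take a distance matrix as input, one possible gene-gene distance is one minus the Pearson correlation across tissues. The sketch below is a hedged illustration: the slides do not state which distance was used, and `gene_distance_matrix` and the toy matrix are hypothetical.

```python
import numpy as np

def gene_distance_matrix(A):
    """Return the genes x genes matrix D = 1 - Pearson correlation.

    A: array (n_genes, n_tissues), A[i, j] = expression of gene i in tissue j.
    Rows must not be constant, otherwise the correlation is undefined (NaN).
    """
    A = np.asarray(A, dtype=float)
    return 1.0 - np.corrcoef(A)   # np.corrcoef correlates the rows of A

# Toy matrix (hypothetical): 3 genes measured in 4 tissues.
A_toy = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with gene 0
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with gene 0
])
D = gene_distance_matrix(A_toy)
```

With this choice, co-regulated genes sit at distance near 0 regardless of their absolute intensity, which is one reason correlation-based distances are common for expression data.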
18
TWO-WAY CLUSTERING: CAN ONE IMPROVE?
19
football
20
COUPLED TWO-WAY CLUSTERING
MOTIVATION:
ONLY A SMALL SUBSET OF GENES PLAYS A ROLE IN A PARTICULAR BIOLOGICAL PROCESS; THE OTHER GENES INTRODUCE NOISE, WHICH MAY MASK THE SIGNAL OF THE IMPORTANT PLAYERS.
ONLY A SUBSET OF SAMPLES EXHIBITS THE EXPRESSION PATTERNS OF INTEREST.
ONE SHOULD USE A SUBSET OF GENES TO STUDY A SUBSET OF THE SAMPLES (AND VICE VERSA).
PROBLEM: ENORMOUS NUMBER OF SUBMATRICES
21
COUPLED TWO-WAY CLUSTERING
PICK ONE STABLE GENE CLUSTER. REPRESENT TISSUES BY THE EXPRESSION LEVELS OF THESE GENES ONLY.
ANALYZE ALL TISSUE CLUSTERS BY USING ALL GENE CLUSTERS, ONE AT A TIME. LOOK FOR INTERNAL STRUCTURE, SUB-CLUSTERS.
USE ALL STABLE TISSUE CLUSTERS TO CLASSIFY GENES; IDENTIFY GENE CLUSTERS THAT GOVERN BIOLOGICAL PROCESSES.
ITERATE THE PROCEDURE UNTIL NO NEW STABLE CLUSTERS EMERGE.
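The iteration described on this slide can be sketched as a loop over (gene cluster, sample cluster) pairs. The sketch is illustrative only: it swaps in average-linkage hierarchical clustering from SciPy with a minimum-size rule as a stand-in for SPC and its stability criterion, and the names `stable_clusters`, `ctwc`, and `min_size` are assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def stable_clusters(points, ids, min_size=3):
    """Stand-in for SPC (NOT the SPC stability test): split the rows of
    `points` in two by average linkage and keep proper sub-clusters that
    have at least `min_size` members."""
    if len(ids) < 2:
        return []
    labels = fcluster(linkage(points, method="average"), t=2, criterion="maxclust")
    parts = [frozenset(i for i, lab in zip(ids, labels) if lab == k) for k in (1, 2)]
    return [p for p in parts if min_size <= len(p) < len(ids)]

def ctwc(A, min_size=3):
    """Coupled two-way clustering loop on A (genes x samples)."""
    n_genes, n_samples = A.shape
    gene_clusters = [frozenset(range(n_genes))]       # G1: all genes
    sample_clusters = [frozenset(range(n_samples))]   # S1: all samples
    visited = set()
    while True:
        pairs = [(g, s) for g in gene_clusters for s in sample_clusters
                 if (g, s) not in visited]
        if not pairs:                                  # no new clusters emerged
            return gene_clusters, sample_clusters
        for g, s in pairs:
            visited.add((g, s))
            gi, si = sorted(g), sorted(s)
            sub = A[np.ix_(gi, si)]                    # submatrix for this pair
            # cluster samples of s using only the genes of g
            for c in stable_clusters(sub.T, si, min_size):
                if c not in sample_clusters:
                    sample_clusters.append(c)
            # cluster genes of g using only the samples of s
            for c in stable_clusters(sub, gi, min_size):
                if c not in gene_clusters:
                    gene_clusters.append(c)

# Toy block-structured matrix (hypothetical): 6 genes x 6 samples.
A_blocks = np.zeros((6, 6))
A_blocks[:3, :3] = 5.0
A_blocks[3:, 3:] = 5.0
genes_found, samples_found = ctwc(A_blocks)
```

Restricting the search to pairs of already-identified clusters is what keeps the procedure tractable, since the number of arbitrary submatrices is enormous.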
22
COUPLED TWO-WAY CLUSTERING OF COLON CANCER: TISSUES [Figure labels: G4, G12, S1(G4), S1(G12)]
23
COUPLED TWO-WAY CLUSTERING OF COLON CANCER: TISSUES [Figure labels: S1(G4), S1(G12), S17]
24
GENES [Figure labels: S17, G1(S17)]
25
COUPLED TWO-WAY CLUSTERING OF COLON CANCER: GENES
USING ONLY THE TUMOR TISSUES TO CLUSTER GENES REVEALS CORRELATION BETWEEN TWO GENE CLUSTERS: CELL GROWTH AND EPITHELIAL.
COLON CANCER IS ASSOCIATED WITH EPITHELIAL CELLS.
[Figure labels: G1(S17), G1(S1)]
26
GLIOBLASTOMA: CLONTECH ARRAYS; 1185 Genes, 36 Samples
17 Primary GlioBlastoMa (PrGBM); 12 Astrocytoma(II); 4 Secondary GBM (ScGBM); 3 Cell Lines
174 genes separate (at FDR of 5%) PrGBM from LGA + ScGBM
S Godard, G Getz, H Kobayashi, P Farmer, M Delorenzi, M Nozaki, A-C Diserens, M-F Hamou, P-Y Dietrich, J-G Villemure, R C. Janzer, P Bucher, R Stupp, N de Tribolet, E Domany, M E. Hegi
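Selecting genes "at FDR of 5%" is commonly done with the Benjamini-Hochberg step-up procedure. The slide does not say which FDR procedure was used, so the sketch below is an assumption; the p-values are made up for illustration.

```python
import numpy as np

def benjamini_hochberg(pvalues, fdr=0.05):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg
    step-up procedure at the given false discovery rate."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of sorted p-values
    thresholds = fdr * np.arange(1, m + 1) / m     # k/m * fdr for rank k
    below = p[order] <= thresholds
    selected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()             # largest rank meeting its threshold
        selected[order[:k + 1]] = True             # reject everything up to that rank
    return selected

# Hypothetical per-gene p-values (e.g. from a two-group test).
pvals = [0.5, 0.03, 0.02, 0.024]
significant = benjamini_hochberg(pvals, fdr=0.05)
```

Note the step-up behavior: the smallest p-value here (0.02) exceeds its own threshold (0.0125) but is still selected because a higher-ranked p-value meets its threshold.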
27
GLIOBLASTOMA: Coupled Two-Way Clustering (CTWC) of 358 Genes and 36 Samples
FILTERING: 358 HIGHLY VARYING GENES
[Figure labels: samples: Astrocytoma(II), Secondary GBM, Primary GlioBlastoMa, Cell Lines; S1(G1), S2, S3; genes: G1(S1), G5, G12; T]
28
Super-Paramagnetic Clustering of All Samples Using Stable Gene Cluster G5: S1(G5) (Fig. 2B)
[Figure labels: S10, S11, S12, S13, S14]
29
G5Ver validation
30
THE GENES OF G5:
AB004904 STAT-induced STAT inhibitor 3
M32977 VEGF
M35410 IGFBP2
X51602 VEGFR1
M96322 gravin
AB004903 STAT-induced STAT inhibitor 2
X52946 PTN
J04111 c-jun
X79067 TIS11B
VEGF AND ITS RECEPTORS: INSTRUMENTAL IN ANGIOGENESIS; INDUCED GROWTH OF BLOOD VESSELS, ESSENTIAL FOR GROWTH BEYOND A CRITICAL SIZE.
THE COEXPRESSION OF IGFBP2 WAS INDEPENDENTLY VERIFIED; 1ST EVIDENCE FOR POSSIBLE ROLE IN ANGIOGENESIS.
31
Fig 6
32
Analysis of cervical cancer data
C. Rosty, F. Radvanyi, N. Stransky …M. Sheffer, D. Tsafrir, I. Tsafrir …X. Sastre, Oncogene (2005)
MAIN AIM: PREDICT OUTCOME AT DISCOVERY
Total of 45 samples/chips: 5 cell lines, 5 normal samples, 35 tumor samples (5 of which are repeats).
10 adenocarcinoma tumors: 4 HPV-16 and 6 HPV-18.
20 epidermal carcinoma: 12 HPV-16, 6 HPV-18, 1 HPV-33 and 1 HPV-99.
Sample naming, e.g. "S 02 - 1 e g":
‘S’ - sample, ‘C’ - cell line
Sample number
Batch #1,2,3
‘a’ - adeno, ‘e’ - epidermal, ‘n’ - normal
‘g’ - good, ‘b’ - bad, ‘o’ - other
33
AIM: IDENTIFY GENES WHOSE EXPRESSION LEVEL, MEASURED AT THE TIME OF DISCOVERY OF THE MALIGNANCY, IS INDICATIVE OF OUTCOME
34
WE USED STANDARD STATISTICAL TESTS LOOKING FOR GENES WHOSE EXPRESSION LEVELS SEPARATE PATIENTS WITH GOOD OUTCOME FROM PATIENTS WITH BAD OUTCOME.
NO SUCH GENES WERE FOUND.
PERHAPS TRY UNSUPERVISED METHODS (E.G. CLUSTERING) ???
35
Gene Expression Matrix: 45 samples, 17,300 probes
For the first cluster analysis we removed the cell lines and the repeats: 35 samples left.
Filtered the genes: 5000 left.
[Figure axes: PCA 1, PCA 2, PCA 3]
36
Two-way Clustering of cervical data
Two clustering operations:
35 samples based on the expression of 5000 probes: S1(G1)
5000 probes in 35-dimensional space: G1(S1)
37
Coupled Two-Way Clustering of Cervix Cancer
35 SAMPLES (REMOVE CELL LINES AND REPLICATES)
5000 GENES (PASSED VARIANCE FILTER)
FOCUS ON G3: CLUSTER OF 148 GENES (163 probe sets)
[Figure labels: G3, G7, G10, S1(G3), S1(G7), S1(G10)]
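A variance filter of the kind mentioned here can be sketched as keeping the probes whose expression varies most across samples. The exact criterion used in the study is not given on the slide; ranking by plain variance and keeping a fixed number of probes (`n_keep`) is an assumption, and the toy matrix is hypothetical.

```python
import numpy as np

def variance_filter(expr, n_keep):
    """Return the sorted row indices of the `n_keep` probes with the
    highest variance across samples.

    expr: array (n_probes, n_samples) of expression levels.
    """
    expr = np.asarray(expr, dtype=float)
    variances = expr.var(axis=1)                 # per-probe variance over samples
    top = np.argsort(variances)[::-1][:n_keep]   # indices of the largest variances
    return np.sort(top)

# Toy matrix (hypothetical): 4 probes x 4 samples.
expr_toy = np.array([
    [1.0, 1.0, 1.0, 1.0],     # variance 0
    [0.0, 10.0, 0.0, 10.0],   # variance 25
    [1.0, 2.0, 1.0, 2.0],     # variance 0.25
    [5.0, 5.0, 6.0, 5.0],     # variance 0.1875
])
kept = variance_filter(expr_toy, n_keep=2)
filtered = expr_toy[kept]                        # reduced matrix passed to clustering
```

On the real data this would correspond to reducing the 17,300 probes to the 5000 that enter the clustering step.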
38
Coupled Two-Way Clustering of Cervical Cancer (Getz et al PNAS 2000)
FOCUS ON G3 (PROLIFERATION CLUSTER, GO):
1. CLUSTER SAMPLES USING 163 PROBE SETS
2. SORT (USING SPIN)
[Figure labels: G3, G7, G10, S1(G3), S1(G7), S1(G10); sample groups: normal, “good”, cell lines]
39
‘Good outcome’ sample cluster (AACR 2004)
Low expression level of the “Proliferation Cluster” (163 probes) indicates good outcome. High expression: no prediction.
Validated by RT-PCR of 20 genes over 70 samples.
[Figure: ordered expression matrix; marked groups: Normal samples, Good outcome, Cell-line samples]
Sample order: S19-1noo S28-1noo S07-1noo S35-1noo S02-1noo S29-3a6g S26-2a8+ S20-2e8g S03-1e8g S34-2e8g S23-1a8g S13-1a8b S31-3a6g S08-1e6g S23-2a8g S10-1e6b S18-1e8b S04-1a8b S12-1e8b S05-1a8b S11-1e3b S25-3a6g S22-1e6g S27-1e6b S32-2e6g S17-3a6o S33-1e6b S15-2e6g S09-1e6b S18-2e8b S06-1e6+ S14-1e6b S33-2e6b S15-1e6g S21-2a8o S24-1e6b S01-1e6g S14-3e6b S30-2e8b C01-3c8o S16-1e9o C06-3c8o C07-3c6o C03-3c6o C05-3c8o
40
Ordered Expression Matrix of 20 proliferation Genes (HPV16/HPV18)
E7 DNA: Corr=0.34, 0.55
E7 RNA: Corr=0.54, 0.62
P53 and Rb control (restrain) proliferation (inactivating E2F).
Activity of P53 and Rb is controlled by E6/E7 viral protein content.
E6/E7 protein concentration is controlled by E6/E7 RNA expression level.
E6/E7 RNA level is controlled by E6/E7 DNA copy number.
Use TF binding site sequence information to derive network.
41
AIM: IDENTIFY GENES WHOSE EXPRESSION LEVEL, MEASURED AT THE TIME OF DISCOVERY OF THE MALIGNANCY, IS INDICATIVE OF OUTCOME
FINDING: A CLUSTER OF 150 GENES, ASSOCIATED WITH CELL PROLIFERATION, HAS RELATIVELY LOW EXPRESSION LEVELS IN A SUBSET OF THE “GOOD OUTCOME” PATIENTS. VALIDATION (PCR)
FINDING: CELL PROLIFERATION EXPRESSION LEVEL IS CONTROLLED BY THE AMOUNT OF VIRAL PROTEINS E6, E7, WHICH IS GOVERNED BY THE NUMBER OF DNA COPIES THAT WERE INSERTED BY THE VIRUS
Rosty et al, Oncogene 2005
42
Signature algorithm
J. Ihmels, G. Friedlander, S. Bergmann, O. Sarig, Y. Ziv, N. Barkai
43
Recurrence
Yeast genome: 6400 genes, 1000 “conditions” (chips)
(a) N_core = 37, 73, 145 genes for ribosomal proteins; 132 genes for biosynthesis. Each, used as input G_I^ref, returns (nearly the same) gene signature S^ref. Add N_rand randomly picked genes: the input set G_I of N_core + N_rand genes returns gene signatures S_I. Recurrence of S^ref is measured by Overlap = fraction of genes shared by S^ref and S_I.
(b) Use as G_I^ref sets of genes with shared regulatory sequences. Only the truly coregulated ones are returned in S^ref; recurrent.
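The overlap measure used to quantify recurrence can be sketched with plain set operations. The slide only says "fraction of shared genes", so normalizing by the size of the reference signature is an assumption (other normalizations are possible), and the gene labels below are placeholders, not real data.

```python
def overlap(s_ref, s_i):
    """Fraction of the reference signature s_ref that is also in s_i.

    Assumption: the shared-gene count is normalized by len(s_ref).
    """
    s_ref, s_i = set(s_ref), set(s_i)
    if not s_ref:
        raise ValueError("reference signature must not be empty")
    return len(s_ref & s_i) / len(s_ref)

# Placeholder gene labels (hypothetical).
S_ref = {"g1", "g2", "g3", "g4"}   # signature returned for the core input set
S_I = {"g1", "g2", "g3", "g9"}     # signature returned for core + random genes
recurrence = overlap(S_ref, S_I)
```

A recurrence near 1 as N_rand grows would indicate that the signature is insensitive to the random genes added to the input set.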
44
Pathways
(a) Tricarboxylic acid (TCA) cycle: take the known genes in E. coli and find (34) homologues in yeast, used as G_I; this produces S_I, which excludes the wrong genes and misses only a few correct ones.
(b,c) Identify two autonomous subparts of the cycle.