Finding Consistent Subnetworks Across Microarray Datasets. Fan Qi, GS5002 Journal Club.


1 Finding Consistent Subnetworks Across Microarray Datasets
Fan Qi, GS5002 Journal Club

2 Outline
- Introduction
- Methodology
- Results & Discussions
- Conclusions

3 Introduction
Identify differential gene expression: find the genes that are significant with respect to a phenotype.
Importance:
- Testing the effectiveness of a treatment
- Biological insights into diseases
- Developing new treatments
- Disease prophylaxis
- Any others?

4 Current Methods
Individual genes
- Search for individual differentially expressed genes
- Fold-change, t-test, SAM
Gene pathway detection
- Looks at a set of genes instead of individual genes
- Bayesian learning and Boolean network learning
Gene classes
- Adds existing biological insights
- Over-representation analysis (ORA), Functional Class Scoring (FCS), GSEA, NEA, ErmineJ
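As an illustration of the simplest individual-gene method listed on this slide, here is a minimal Python sketch of fold-change screening. The gene names, expression values, and the 2-fold threshold (`threshold=1.0` on the log2 scale) are hypothetical choices for the example and do not come from the talk or from [1] or [2].

```python
import math
from statistics import mean

def log2_fold_change(case, control):
    """log2 ratio of mean expression; assumes strictly positive intensities."""
    return math.log2(mean(case) / mean(control))

def differentially_expressed(expression, threshold=1.0):
    """Genes whose |log2 fold change| meets the threshold, largest change first."""
    changes = {g: log2_fold_change(c, k) for g, (c, k) in expression.items()}
    hits = [g for g, lfc in changes.items() if abs(lfc) >= threshold]
    return sorted(hits, key=lambda g: abs(changes[g]), reverse=True)

# Hypothetical data: gene -> (case samples, control samples)
expression = {
    "GENE_A": ([8.0, 9.0, 10.0], [2.0, 2.5, 1.5]),  # up roughly 4.5-fold
    "GENE_B": ([5.0, 5.5, 4.5], [5.0, 4.8, 5.2]),   # essentially unchanged
    "GENE_C": ([1.0, 1.2, 0.8], [4.0, 4.2, 3.8]),   # down roughly 4-fold
}

selected = differentially_expressed(expression)
```

Each gene is judged on its own here, which is the property the gene-set and subnetwork methods on the later slides try to improve on.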

5 Challenge
Different results from different datasets of the SAME disease!
Zhang M [1] demonstrated inconsistency in SAM (reconstructed from Table 1 in [1]):

Dataset          DEGs     POG / nPOG
Prostate cancer  Top 10   0.3
Prostate cancer  Top 50   0.14
Prostate cancer  Top 100  0.15
Lung cancer      Top 10   0.00
Lung cancer      Top 50   0.20 / 0.19
Lung cancer      Top 100  0.31 / 0.30
DMD              Top 10   0.20
DMD              Top 50   0.42
DMD              Top 100  0.54

Inconsistency among datasets
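To make the overlap figures on this slide concrete, the sketch below computes the fraction of top-k genes shared by two ranked lists. It is a simplification: it assumes the overlap score is the shared fraction only, and ignores any regulation-direction check or normalization that [1] may apply for POG and nPOG. The gene lists are hypothetical.

```python
def overlap_fraction(list_a, list_b):
    """Fraction of the genes in list_a that also appear in list_b."""
    genes_a, genes_b = set(list_a), set(list_b)
    if not genes_a:
        return 0.0
    return len(genes_a & genes_b) / len(genes_a)

def top_k_overlap(ranked_a, ranked_b, k):
    """Overlap between the top-k genes of two ranked lists."""
    return overlap_fraction(ranked_a[:k], ranked_b[:k])

# Hypothetical ranked gene lists from two datasets of the same disease
dataset_1 = ["G1", "G2", "G3", "G4", "G5"]
dataset_2 = ["G3", "G9", "G1", "G7", "G2"]

top3 = top_k_overlap(dataset_1, dataset_2, 3)  # shares G1 and G3 out of 3
top5 = top_k_overlap(dataset_1, dataset_2, 5)  # shares G1, G2 and G3 out of 5
```

A value near 0 for small k, as in several rows of the table, means the two datasets barely agree on their most significant genes.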

6 New Approach
SNet [2]
- Proposed in 2011
- Utilizes gene-gene relationships in the analysis
Gene-gene relationship
- Activates vs. inhibits
Gene subnetwork
- A gene is a vertex, a relationship is an edge
[Figure: from Fig 1 in [2]]
[Figure: example subnetwork containing RHOA, VAV, PIK3R2, ARHGEF1, RAC1 and IQGAP1; partially adapted from Fig 2 in [2]]

7 Methodology
Input:
- Genes labeled with phenotype, obtained from microarray experiments
Third-party info:
- Gene pathway info
- Gene reaction info
Attributes of a subnetwork:
- Size, score
Output:
- A set of significant subnetworks
Steps:
- Subnetwork extraction
- Subnetwork scoring
- Subnetwork significance

8 Methodology - Step 1
[Figure: each patient's ranked gene list, grouped by phenotype (P1, P2, P3, ...)]

9 Methodology - Step 1
[Figure: phenotype group P1]
Repeat for every phenotype group.

10 Methodology - Step 1
[Figure: P1 (d), P1, ...]

11 Methodology - Step 1
[Figure]
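The Step 1 slides are mostly figures, so the following Python sketch shows only one plausible reading of subnetwork extraction: within one phenotype group, keep the genes that rank highly in enough patients, then split the pathway graph restricted to those genes into connected components. The parameters `alpha`, `beta`, and the minimum size of 2, along with the gene names, edges, and expression values, are hypothetical and are not taken from [2].

```python
from collections import defaultdict

def frequent_top_genes(patients, alpha=0.5, beta=0.5):
    """Genes ranked in the top `alpha` fraction for at least `beta` of the patients.

    patients: list of dicts (gene -> expression), all from one phenotype group.
    """
    counts = defaultdict(int)
    for profile in patients:
        ranked = sorted(profile, key=profile.get, reverse=True)
        k = max(1, int(len(ranked) * alpha))
        for gene in ranked[:k]:
            counts[gene] += 1
    needed = beta * len(patients)
    return {g for g, c in counts.items() if c >= needed}

def connected_components(genes, edges):
    """Connected components of the pathway graph restricted to `genes`."""
    adj = defaultdict(set)
    for a, b in edges:
        if a in genes and b in genes:
            adj[a].add(b)
            adj[b].add(a)
    seen, components = set(), []
    for start in sorted(genes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components

# Hypothetical phenotype group P1 and hypothetical pathway edges
patients_p1 = [
    {"G1": 9, "G2": 8, "G3": 7, "G4": 1, "G5": 2, "G6": 3},
    {"G1": 8, "G2": 9, "G3": 1, "G4": 2, "G5": 7, "G6": 3},
    {"G1": 7, "G2": 9, "G3": 8, "G4": 3, "G5": 1, "G6": 2},
]
pathway_edges = [("G1", "G2"), ("G2", "G4"), ("G4", "G3"), ("G5", "G6")]

selected = frequent_top_genes(patients_p1)
subnetworks = [c for c in connected_components(selected, pathway_edges) if len(c) >= 2]
```

In this toy example G3 is selected but ends up isolated, because its only link to G1 and G2 runs through the unselected G4, so it is dropped by the size filter. Per the slide, the same procedure would be repeated for every phenotype group.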

12 Methodology - Step 2
t-test: assigned to each subnetwork
[Figure: P1 (d)]
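The slide says only that a t-test is assigned to each subnetwork. A minimal sketch of what that could look like follows, assuming each patient gets one score per subnetwork (here simply the mean expression of the member genes, a stand-in rather than the scoring function of [2]) and that the two phenotype groups are compared with Welch's t-statistic. All data and names are hypothetical.

```python
import math
from statistics import mean, variance

def subnetwork_score(profile, subnetwork):
    """Per-patient score: mean expression of the subnetwork's genes (assumed)."""
    return mean(profile[g] for g in subnetwork)

def welch_t(xs, ys):
    """Welch's t-statistic for two independent samples with unequal variances."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (mean(xs) - mean(ys)) / se

def score_subnetwork(subnetwork, group_d, group_other):
    """t-statistic comparing per-patient subnetwork scores between two phenotypes."""
    d_scores = [subnetwork_score(p, subnetwork) for p in group_d]
    o_scores = [subnetwork_score(p, subnetwork) for p in group_other]
    return welch_t(d_scores, o_scores)

# Hypothetical subnetwork and patient profiles for phenotype d and the rest
subnetwork = {"G1", "G2"}
group_d = [{"G1": 9.0, "G2": 8.0}, {"G1": 8.0, "G2": 9.0}, {"G1": 7.0, "G2": 9.0}]
group_other = [{"G1": 2.0, "G2": 3.0}, {"G1": 3.0, "G2": 2.0}, {"G1": 2.0, "G2": 4.0}]

t_stat = score_subnetwork(subnetwork, group_d, group_other)
```

A large positive `t_stat` indicates that the subnetwork as a whole is more highly expressed in phenotype d; whether that is significant is the question Step 3 addresses.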

13 Methodology - Step 3
[Figure: Fig 5 in the original paper]

14 Results and Discussions
Datasets:
- Leukemia: Golub vs. Armstrong
- ALL: Ross vs. Yeoh
- DMD: Haslett vs. Pescatori
- Lung: Bhattacharjee vs. Garber
Performance comparison:
- Subnetwork overlap (with GSEA)
- Gene overlap (GSEA, SAM, t-test)
Other comparisons:
- Network size, gene validity with t-test

15 Results and Discussions
Subnetwork overlap (synthesized from Tables 1 and 2 in [2]; higher is better)

Disease   Dataset 1      Dataset 2   SNet    GSEA   SNet  GSEA
Leukemia  Golub          Armstrong   83.33%  0%     20    0
ALL       Ross           Yeoh        47.63%  23.1%  10    6
DMD       Haslett        Pescatori   58.33%  55.6%  7     10
Lung      Bhattacharjee  Garber      90.90%  0%     9     0

16 Results and Discussions
Gene overlap (synthesized from Tables 3, 4 and 5 in [2]; higher is better)

Disease   SNet    GSEA   t-test (p < 0.05)  t-test (top)  SAM (p < 0.05)  SAM (top)
Leukemia  91.30%  2.38%  73.01%             14.29%        49.96%          22.62%
ALL       93.01%  4.0%   60.20%             57.33%        81.25%          49.33%
DMD       69.23%  28.9%  49.60%             20.00%        76.98%          42.22%
Lung      51.18%  4.0%   65.61%             26.16%        65.61%          24.62%

17 Results and Discussions
Size of subnetworks (reconstructed from Table 6 in [2])
Column groups: t-test, SNet
Size of network: 2 3 4 5 5 6 7 >8
Leukemia: 8481002321
Subtype: 7551111016
DMD: 4531001005
Lung: 6532105301

18 Results and Discussions
Validity
- Compare the genes in EACH subnetwork with those found by the t-test
- Around 70%-100% of the genes in each subnetwork also appear in the t-test results
Selected results (the full set is too large to present; selected from Tables 7, 8, 9 and 10 in [2]):

Subnetwork name        Percentage
Leukaemia_B Cell-VAV1  81.82%
SNET_CTNNB1            100%
Leukaemia_UBC          100%
SNET_TNFSF10           60%
Leukaemia_RAC1         57.15%
SNET_PYGM              60%
DMD_RHOA               75%
DMD_ACTB               83.33%
DMD_SDC3               88.89%
Leukaemia_POU2F2       75.00%
MLLBCR_ACAA1           28.67%
BCR_T_RASA1            44.44%
MLLBCR_BLNK            72.73%
BCR_ABL1               75.00%
SNET_NOTCH3            100%
DMD_CALM1              80%

19 Conclusions
- Traditional methods have an inconsistency problem across different datasets of the same disease
- SNet utilizes biological insights to mitigate the gap
  - Gene-to-gene relationships
  - Gene pathway knowledge
- SNet shows better results than established algorithms
  - More consistent

20 References
[1] Zhang M, Zhang L, Zou J, Yao C, Xiao H, Liu Q, Wang J, Wang D, Wang C, Guo Z. Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes.
[2] Soh D, Dong D, Guo Y, Wong L. Finding consistent disease subnetworks across microarray datasets.
