Subspace Differential Coexpression Analysis for the Discovery of Disease-related Dysregulations Gang Fang, Rui Kuang, Gaurav Pandey, Michael Steinbach,


1 Subspace Differential Coexpression Analysis for the Discovery of Disease-related Dysregulations
Gang Fang, Rui Kuang, Gaurav Pandey, Michael Steinbach, Chad L. Myers and Vipin Kumar
gangfang@cs.umn.edu
http://www-users.cs.umn.edu/~kumar/dmbio/
Department of Computer Science and Engineering
RECOMB Systems Biology 12/05/2009

2 Differential Expression (DE)
–Traditional analysis targets changes in expression level between controls and cases [Golub et al., 1999], [Pan, 2002], [Cui and Churchill, 2003], etc.
[Figure: expression level over samples in controls and cases; genes by samples (controls, cases), from [Kostka & Spang, 2005]]

3 Differential Coexpression (DC)
–Targets changes in the coherence of expression between controls and cases [Silva et al., 1995], [Li, 2002], [Kostka & Spang, 2005], [Rosemary et al., 2008], [Cho et al., 2009], etc.
Question: Is this gene interesting, i.e. associated with the phenotype?
Answer: No, in terms of differential expression (DE). However, what if there are two other genes ...? Yes!
Biological interpretations of DC: dysregulation of pathways, mutation of transcription factors, etc.
[Figure: matrix of expression values; expression over samples in controls and cases; genes by samples (controls, cases), from [Kostka & Spang, 2005]]

4 Existing work on differential coexpression (DC)
–Pairs of genes with differential coexpression [Silva et al., 1995], [Li, 2002], [Li et al., 2003], [Lai et al., 2004]
–Clustering-based differential coexpression analysis [Ihmels et al., 2005], [Watson, 2006]
–Network-based analysis of differential coexpression [Zhang and Horvath, 2005], [Choi et al., 2005], [Gargalovic et al., 2006], [Oldham et al., 2006], [Fuller et al., 2007], [Xu et al., 2008]
–Beyond pair-wise (size-k) differential coexpression [Kostka and Spang, 2004], [Prieto et al., 2006]
–Gene-pathway differential coexpression [Rosemary et al., 2008]
–Pathway-pathway differential coexpression [Cho et al., 2009]

5 Full-space differential coexpression
Existing DC work is “full-space”: coexpression is measured over all samples of a class
–Full-space measures: e.g. correlation difference
May have limitations due to the heterogeneity of
–Causes of a disease (e.g. genetic difference)
–Populations affected (e.g. demographic difference)
Motivation: patterns that hold only on a subset of the samples (subspace patterns) may be missed by full-space models
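The full-space measure named on this slide, correlation difference, can be sketched for a single gene pair as follows. This is a minimal illustration, not the authors' code: the function names `pearson` and `correlation_difference` are hypothetical, and Pearson correlation is assumed as the coexpression measure (the slide does not say which correlation is used).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_difference(g1_a, g2_a, g1_b, g2_b):
    """Hypothetical full-space DC score for one gene pair.

    g1_a, g2_a: the two genes' expression over ALL samples of class A.
    g1_b, g2_b: the same two genes over ALL samples of class B.
    """
    return abs(pearson(g1_a, g2_a) - pearson(g1_b, g2_b))
```

Because each correlation is computed over every sample in a class, a pair that is coexpressed in only a fraction of the class A samples yields a diluted correlation in class A, which is the limitation the slide points to.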

6 Extension to Subspace Differential Coexpression
Definition of Subspace Differential Coexpression Pattern
–A set of k genes = $\{g_1, g_2, \ldots, g_k\}$
–Fraction of samples in class A on which the k genes are coexpressed
–Fraction of samples in class B on which the k genes are coexpressed
–SDC as a measure of subspace differential coexpression
Details in [Fang, Kuang, Pandey, Steinbach, Myers and Kumar, PSB 2010]
Problem: given n genes, find all the subsets of genes s.t. $\mathrm{SDC} \ge d$
Given n genes, there are $2^n$ candidate SDC patterns. How can the combinatorial search space be handled effectively?
Similar motivation and challenges to biclustering, but here the task is differential biclustering
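The two fractions in the definition can be computed as below. This is a sketch under stated assumptions, since the transcript does not preserve the slide's symbols or formula: expression is assumed to be already binarized so that each sample is the set of genes that are "on" in it, a gene set counts as coexpressed on a sample when all its genes are on, and the SDC score is assumed to be the absolute difference of the two fractions. The names `support_fraction` and `sdc` are hypothetical; the PSB 2010 paper has the actual definition.

```python
def support_fraction(samples, genes):
    """Fraction of samples on which every gene in `genes` is 'on'.

    `samples` is a list of sets of gene names (assumed binarized data).
    """
    if not samples:
        return 0.0
    genes = frozenset(genes)
    hits = sum(1 for sample in samples if genes <= sample)
    return hits / len(samples)


def sdc(class_a, class_b, genes):
    """Assumed SDC score: |fraction in class A - fraction in class B|."""
    return abs(support_fraction(class_a, genes)
               - support_fraction(class_b, genes))
```

For example, a gene set that is on in 3 of 5 class A samples and 1 of 10 class B samples scores about 0.5 under this assumed definition.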

7 Direct Mining of Differential Patterns [Fang, Pandey, Gupta, Steinbach and Kumar, TR 09-011, CS@UMN]
Refined SDC measure: “direct”
A measure M is antimonotonic if $\forall A, B:\ A \subseteq B \Rightarrow M(A) \ge M(B)$
Details in [Fang, Kuang, Pandey, Steinbach, Myers and Kumar, PSB 2010]

8 An Association-analysis Approach
Refined SDC measure
A measure M is antimonotonic if $\forall A, B:\ A \subseteq B \Rightarrow M(A) \ge M(B)$
Systematic and efficient combinatorial search [Agrawal et al., 1994]
–Once a gene set is disqualified, prune all of its supersets
Advantages: 1) Systematic & direct 2) Completeness 3) Efficiency
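The pruning idea on this slide can be sketched as a level-wise (Apriori-style) search. This is an illustration only: the refined SDC measure is not given in the transcript, so plain support in one class is used as a stand-in antimonotonic measure, and the names `support` and `levelwise_search` are hypothetical. Samples are assumed to be sets of genes that are "on", as in association analysis.

```python
from itertools import combinations


def support(samples, genes):
    """Stand-in antimonotonic measure: fraction of samples containing `genes`."""
    return sum(1 for s in samples if genes <= s) / len(samples)


def levelwise_search(samples, threshold, measure=support):
    """Return all gene sets whose measure is >= threshold.

    Relies on antimonotonicity: if a set is disqualified, so is every
    superset, so candidates of size k are built only from qualified
    sets of size k - 1.
    """
    genes = sorted({g for s in samples for g in s})
    current = [frozenset([g]) for g in genes
               if measure(samples, frozenset([g])) >= threshold]
    qualified = list(current)
    k = 2
    while current:
        previous = set(current)
        # Join qualified (k-1)-sets into size-k candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune candidates that have any disqualified (k-1)-subset.
        candidates = [c for c in candidates
                      if all(frozenset(sub) in previous
                             for sub in combinations(c, k - 1))]
        current = [c for c in candidates if measure(samples, c) >= threshold]
        qualified.extend(current)
        k += 1
    return qualified
```

The search is complete for any antimonotonic measure passed as `measure`, because no qualifying set can have a disqualified subset; it never has to enumerate all $2^n$ subsets unless they all qualify.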

9 Validation
Three lung cancer datasets
–[Bhattacharjee et al., 2001], [Stearman et al., 2005], [Su et al., 2007]
–All are from Affymetrix microarrays (first two: HG-U95A; third: HG-U133A)
–Lung cancer samples & normal samples
Combined dataset
–More samples
–Proper normalizations before combining: RMA [Irizarry et al., 2003], DWD [Benito et al., 2004], XPN [Shabalin et al., 2008]
–Lung cancer samples (102)
–Normal samples (67)

10 Statistical Significance
Phenotype permutation test (n=1000)
[Figure: panels A, B, C]
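A phenotype permutation test of the kind named on this slide can be sketched as follows. The details are assumptions, since the transcript keeps only the slide title: class labels are shuffled across samples, the score is recomputed each time, and the p-value is the share of permutations scoring at least as high as the observed labels, with a +1 correction so it is never exactly zero. The names `sdc_statistic` and `permutation_p_value`, the "A"/"B" label strings, and the absolute-difference score are all hypothetical.

```python
import random


def support_fraction(samples, genes):
    """Fraction of samples (sets of 'on' genes) that contain all `genes`."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if genes <= s) / len(samples)


def sdc_statistic(genes):
    """Build a statistic(samples, labels) for one gene set (assumed score)."""
    genes = frozenset(genes)

    def statistic(samples, labels):
        class_a = [s for s, lab in zip(samples, labels) if lab == "A"]
        class_b = [s for s, lab in zip(samples, labels) if lab == "B"]
        return abs(support_fraction(class_a, genes)
                   - support_fraction(class_b, genes))

    return statistic


def permutation_p_value(samples, labels, statistic, n_perm=1000, seed=0):
    """Empirical p-value of the observed statistic under label shuffling."""
    rng = random.Random(seed)
    observed = statistic(samples, labels)
    shuffled = list(labels)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the sample-to-phenotype association
        if statistic(samples, shuffled) >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_perm + 1)
```

Shuffling labels rather than expression values keeps the gene-gene structure of each sample intact, so the null distribution reflects only the loss of the phenotype association.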

11 Could Subspace DC patterns have been discovered in full-space?
[Figure: full-space DC measures and subspace DC measures for the patterns, with a phenotype-permutation-based significance cutoff for the full-space measure]
88 statistically significant size-3 patterns (stars)
–Some can also be found in full-space
–Others can NOT be found in full-space
DC (Differential Coexpression)

12 A 10-gene Subspace DC Pattern
[Figure annotations: ≈ 60%, ≈ 10%]
www.ingenuity.com: enriched Ingenuity subnetwork
Enriched with the TNF-α/NFkB signaling pathway (6/10 overlap with the pathway, P-value: $1.4 \times 10^{-5}$)
Suggests that the dysregulation of the TNF-α/NFkB pathway may be related to lung cancer
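An overlap such as "6/10 genes in the pathway" is commonly scored with a hypergeometric upper-tail probability. The sketch below shows that calculation; the slide does not state which test produced its P-value, so this is an assumption rather than a description of the Ingenuity analysis. The function name `hypergeom_upper_tail` is hypothetical, and the background and pathway sizes are not given in the transcript, so they must be supplied by the caller.

```python
from math import comb


def hypergeom_upper_tail(population, pathway_size, pattern_size, overlap):
    """P(X >= overlap) when `pattern_size` genes are drawn without
    replacement from `population` genes, of which `pathway_size`
    belong to the pathway.
    """
    total = comb(population, pattern_size)
    max_overlap = min(pathway_size, pattern_size)
    favorable = sum(
        comb(pathway_size, i)
        * comb(population - pathway_size, pattern_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / total
```

With `pattern_size=10` and `overlap=6` as on the slide, the result depends strongly on the assumed `population` and `pathway_size`, which is why no value is quoted here.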

13 Biological Interpretations
Specific interpretation
–Enriched cancer-related signaling pathways: TNF-α/NFkB, WNT
–Target gene sets of cancer-related microRNA & TFs
–microRNA: miR-101 ({PIK3C2B, TSC22D1} + AKAP12)
–Transcription factor (TF): ATF2 ({ETV4, PTHLH} + CBX5)
miR-101 is shown to be down-regulated in cancer [Friedman et al., 2009]
Mutations of ATF2 are shown to be related to cancer [Woo et al., 2002]

14 Summary & Future Directions
Summary
–Proposed the problem definition & a systematic approach for subspace DC
–Subspace DC analysis can identify many statistically significant & biologically relevant patterns that would have been missed in full-space
Potential biomedical utility
–Study the demographic and genetic differences within each class
–Phenotype classification with subspace DC patterns
–Combine DE and subspace DC patterns
–Compare
DE (Differential Expression); DC (Differential Coexpression)

15 Acknowledgement
Co-authors at Dept. Computer Science, Univ. of Minnesota
–Rui Kuang, Gaurav Pandey, Michael Steinbach, Chad Myers, Vipin Kumar
–Data Mining for Biomedical Informatics Group; Comp. Bio. Group; Comp. Bio. & Func. Genomic Group
Conference organizers
NSF grants #CRI-0551551, #IIS-0308264, #ITR-0325949
UMR-IBM-Mayo BICB Fellowship

16 Paper
–Gang Fang, Rui Kuang, Gaurav Pandey, Michael Steinbach, Chad L. Myers and Vipin Kumar, Subspace Differential Coexpression Analysis: Problem Definition and a General Approach, Proceedings of the 15th Pacific Symposium on Biocomputing, 2010
Source codes: http://vk.cs.umn.edu/SDC
Questions:
–Gang Fang: gangfang@cs.umn.edu
Thanks!

