Published by Della Charles. Modified over 9 years ago.
1
HUMAN-MOUSE CONSERVED COEXPRESSION NETWORKS PREDICT CANDIDATE DISEASE GENES
Ala U., Piro R., Grassi E., Damasco C., Silengo L., Brunner H., Provero P. and Di Cunto F.
ugo.ala@unito.it
Molecular Biotechnology Center, University of Torino
2
Introduction
Massive repositories of gene expression data obtained with microarray technology are an extremely rich source of biological information.
Since genes involved in the same functions tend to show very similar expression profiles, co-expression analysis of these datasets could be a very powerful approach for inferring functional relationships among genes and for predicting the involvement of specific sequences in human genetic diseases.
However, gene co-expression has so far not proved to be a particularly useful criterion for disease gene identification.
3
Reasons
1. Microarray data are noisy.
2. Many genes showing very similar expression profiles are not functionally related (Spellman et al, 2002).
As a result, the large majority of functional relationships inferred from co-expression in a single species are false positive predictions.
4
A powerful help: phylogenetic conservation
Since gene regulatory regions evolve faster than coding regions, two genes whose co-expression is evolutionarily conserved are much more likely to be functionally related.
The confidence level increases with the phylogenetic distance between the species compared.
[Figure: a gene co-expression network constructed with expression data from distant species (H. sapiens, C. elegans, D. melanogaster, S. cerevisiae) (Stuart et al, 2003)]
5
A powerful help: phylogenetic conservation
Human-mouse conserved co-expression represents an excellent compromise between sensitivity and specificity for predicting functional relationships among mammalian genes (Pellegrino et al, 2004).
6
Construction of human-mouse conserved coexpression networks for disease gene prediction
Step one: single species networks
For each species (Homo sapiens, Mus musculus):
1. Start from single-species datasets of microarray experiments, based on probes which can be linked to Entrez Gene IDs.
2. Evaluate the correlation of gene expression profiles among all the probes by Pearson's coefficient.
3. Link every probe with the probes which are in the first percentile of its ranked list.
4. Merge links between probes by Entrez Gene identifiers.
Output: human gene co-expression networks (H-GCN) and mouse gene co-expression networks (M-GCN).
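The single-species steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy profiles and the choice to treat links as undirected and to drop links between probes of the same gene are all hypothetical.

```python
from math import ceil, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def single_species_network(profiles, probe_to_gene, top_fraction=0.01):
    """Build a gene-level co-expression network from probe profiles.

    profiles:      {probe ID: [expression values across experiments]}
    probe_to_gene: {probe ID: Entrez Gene ID}
    Returns a set of frozenset({gene_a, gene_b}) edges.
    """
    probes = list(profiles)
    # Number of top-ranked partners kept per probe (at least one).
    k = max(1, ceil(top_fraction * (len(probes) - 1)))
    edges = set()
    for p in probes:
        # Rank all other probes by correlation with p, highest first.
        ranked = sorted(
            (q for q in probes if q != p),
            key=lambda q: pearson(profiles[p], profiles[q]),
            reverse=True,
        )
        for q in ranked[:k]:
            ga, gb = probe_to_gene[p], probe_to_gene[q]
            if ga != gb:  # assumption: ignore links within one gene
                edges.add(frozenset((ga, gb)))  # merge probes by gene ID
    return edges


# Toy data (hypothetical): two pairs of co-expressed probes.
profiles = {
    "p1": [1, 2, 3, 4],
    "p2": [2, 4, 6, 8.5],
    "p3": [4, 3, 2, 1],
    "p4": [8, 6.5, 4, 2],
}
probe_to_gene = {"p1": "G1", "p2": "G2", "p3": "G3", "p4": "G4"}
edges = single_species_network(profiles, probe_to_gene)
```

With only four probes the first percentile is rounded up to one partner per probe; on real datasets with tens of thousands of probes the same cutoff keeps hundreds of partners per probe.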
7
Construction of human-mouse conserved coexpression networks for disease gene prediction
Step two: human-mouse networks
Select the links found in both the human (H-GCN) and mouse (M-GCN) gene co-expression networks, matching genes according to Homologene.
Output: human-mouse co-expression network.
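A minimal sketch of this intersection step, assuming each network is stored as a set of undirected gene pairs and that the Homologene groups have already been reduced to a one-to-one mouse-to-human dictionary. The function name, the data layout and the gene IDs are hypothetical.

```python
def conserved_network(human_edges, mouse_edges, mouse_to_human):
    """Keep the human links whose orthologous mouse link also exists.

    human_edges, mouse_edges: sets of frozenset({gene_a, gene_b})
    mouse_to_human:           {mouse gene ID: human gene ID}
    """
    mapped = set()
    for edge in mouse_edges:
        a, b = tuple(edge)
        # Skip mouse links where either gene lacks a human ortholog.
        if a in mouse_to_human and b in mouse_to_human:
            ha, hb = mouse_to_human[a], mouse_to_human[b]
            if ha != hb:
                mapped.add(frozenset((ha, hb)))
    # A link is conserved only if it is present in both species.
    return human_edges & mapped


# Toy data (hypothetical gene IDs).
human_edges = {frozenset(("H1", "H2")), frozenset(("H2", "H3"))}
mouse_edges = {frozenset(("M1", "M2")), frozenset(("M3", "M9"))}
mouse_to_human = {"M1": "H1", "M2": "H2", "M3": "H3"}
conserved = conserved_network(human_edges, mouse_edges, mouse_to_human)
```

Only the H1-H2 link survives here: H2-H3 has no mouse counterpart, and the M3-M9 link is discarded because M9 has no human ortholog in the map.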
8
Conserved co-expression networks
Data retrieval
Experiments based on cDNA platforms and performed mostly on tumor cell lines:
4129 experiments for 102296 EST probes for human
467 experiments for 80595 EST probes for mouse
Experiments based on Affymetrix platforms and performed on normal tissues:
353 experiments for 46241 probesets for human (Roth et al, 2006)
122 experiments for 19692 probesets for mouse (Su et al, 2004)
9
Conserved co-expression networks
Results
8512 nodes (genes); 56397 edges
12766 nodes (genes); 155403 edges
We concentrate our network analysis on CCs (co-expression clusters), defined as the nearest neighbors of each node of the networks, thus obtaining a CC for each gene.
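The CC definition amounts to reading off each gene's direct neighbours. A small sketch, assuming the network is a set of undirected gene pairs and that the centre gene is not counted as a member of its own cluster (the slide does not say either way); the function name and gene IDs are hypothetical.

```python
from collections import defaultdict


def coexpression_clusters(edges):
    """Map each gene to its CC: the set of its nearest neighbours."""
    clusters = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        clusters[a].add(b)
        clusters[b].add(a)
    return dict(clusters)


# Toy network (hypothetical gene IDs).
network = {frozenset(("A", "B")), frozenset(("A", "C")), frozenset(("C", "D"))}
ccs = coexpression_clusters(network)
```

Every gene with at least one link gets exactly one cluster, so the number of CCs equals the number of nodes in the network.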
10
Conserved co-expression networks
Comparison with other networks
[Figure labels: in vivo, in vitro, yeast-two-hybrid; A-GCN, S-GCN, Random]
Both networks exhibit a highly significant overlap with the protein-protein interactions reported in the Human Protein Reference Database.
Good predictors of protein-protein interactions
11
Conserved co-expression networks
GO Analysis
[Figure labels: A-CCN, S-CCN, Random]
A-CCN and S-CCN show a strong enrichment for functional annotation, compared with random permutations.
Good criterion to identify functionally related genes
12
Predicting human disease genes
MimMiner (Van Driel et al, 2006), a text-mining database of phenotype similarity relationships, provides a very useful way to merge co-expression data with disease information.
[Figure labels: A-CCN, S-CCN, Random]
A-CCN and S-CCN also show a strong enrichment for OMIM IDs characterizing disease phenotypes.
13
How the algorithm works (1)
[Figure labels: OMIM locus (phenotype description); CCs, Conserved Co-expression clusters]
14
How the algorithm works (2)
[Figure labels: OMIM locus (phenotype description); CCs, Conserved Co-expression clusters; DRCCs, Disease Related Co-expression Clusters]
15
How the algorithm works (3)
[Figure labels: OMIM locus (phenotype description); DRCCs, Disease Related Co-expression Clusters]
These genes become our candidate disease genes.
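The slides show this selection only as diagrams, so the sketch below is one possible reading and not the authors' algorithm. It assumes that a CC counts as disease-related (a DRCC) when it contains at least `min_hits` genes already known to cause phenotypes similar to the query phenotype, and that the candidates are the DRCC members lying inside the OMIM locus. The rule, the `min_hits` threshold, the function name and the inputs are all hypothetical.

```python
def candidate_genes(clusters, locus_genes, similar_disease_genes, min_hits=2):
    """Pick candidate disease genes for one OMIM locus.

    clusters:              {centre gene: set of neighbour genes} (the CCs)
    locus_genes:           genes mapped to the OMIM locus
    similar_disease_genes: genes known to cause phenotypes similar to the
                           query phenotype (e.g. selected via MimMiner)
    min_hits:              assumed threshold for calling a CC a DRCC
    """
    candidates = set()
    for centre, neighbours in clusters.items():
        members = neighbours | {centre}
        # Assumed DRCC rule: enough members are known genes for
        # phenotypes similar to the one described at the locus.
        if len(members & similar_disease_genes) >= min_hits:
            candidates |= members & locus_genes
    return candidates


# Toy data (hypothetical gene IDs).
clusters = {"A": {"B", "C", "X"}, "D": {"E", "Y"}}
locus_genes = {"X", "Y"}
similar_disease_genes = {"A", "B", "E"}
found = candidate_genes(clusters, locus_genes, similar_disease_genes)
```

In this toy case only the cluster centred on A reaches the threshold, so X is proposed as a candidate while Y is not.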
16
Leave-one-out
Leave-one-out cross-validation tests over all known disease genes showed good performance.
17
Predicting human disease genes
Results
We applied our procedure to 850 OMIM phenotype entries with unknown molecular basis (but mapped to one or more genetic loci).
This yielded 321 candidates, covering a set of 81 loci (65 from A-CCN, 6 from S-CCN and 10 from both networks).
18
Examples and discussion of some candidates
19
Conclusions
Our approach, based on conserved co-expression analysis, has proved particularly successful in providing reliable predictions of potential disease-causing genes because of two main factors:
1. the phylogenetic filter
2. the integration with quantitative phenotype correlation data
In conclusion, we propose that our method and our list of candidates will provide useful support for the identification of new disease-causing genes.
20
Our real network …
Ala U., Piro R., Silengo L., Damasco C., Grassi E., Provero P., Di Cunto F., Brunner H.