Genes and proteins of living organisms carry out their functions through a complex series of interactions. These relationships can be more or less direct, and can be inferred from different types of experimental evidence. The most obvious relation is a direct molecular interaction, which can be shown both with biochemical methods and with molecular biology techniques such as the yeast two-hybrid system. Nevertheless, very close functional relationships are possible even in the absence of direct molecular binding. Because genes involved in the same functions tend to show very similar expression patterns, and given the availability of massive gene expression data repositories, co-expression analysis is one of the most powerful tools for exploring the complexity of functional relationships among genes. In particular, phylogenetic conservation of co-expression relationships has been proposed as a very strong criterion to identify functionally relevant links among genes. Here we present two global networks of co-expression relationships conserved between human and mouse, one based on the analysis of a cDNA microarray database and the other on Affymetrix microarray datasets, and their comparison. Moreover, we evaluate the overlap of these networks with the literature- and two-hybrid-based human interactome. Our preliminary results strongly suggest that co-expression relationships conserved between human and mouse are very relevant for exploring the function of mammalian genes, and that integration with other information (such as phenotype similarity) can provide reliable predictions of potential disease-causing genes.

HOW TO:
1. Start from single-species datasets of microarray experiments, based on probes which can be linked to Entrez Gene IDs.
2. Evaluate the gene expression profile correlation among all the probes by Pearson's coefficient.
3. Link every probe with the probes which are in the first percentile of its ranked list.
4. Merge links between probes by Entrez Gene identifiers, giving a single-species coexpression network (for human, H-GCN).
5. Select the links found in both coexpression networks, according to Homologene, giving the human-mouse coexpression network (HM-GCN).
6. Evaluate GO enrichment in the HM-GCN.

NETWORKS FEATURES
The first HM-GCN (S-GCN) was generated from SMD data, the second (A-GCN) from Affymetrix data ([2] and [3]). Both HM-GCNs exhibit topological properties similar to those of other biological networks (gene coexpression, protein-protein interaction or metabolic), such as a tendency to contain highly connected nodes (hubs), although their degree distribution is closer to an exponential one (Fig. 1). S-GCN is composed of 8.5 × 10^3 nodes (genes) and nearly 6 × 10^4 edges; its average connectivity is 13.2 edges per node. A-GCN is composed of 12.8 × 10^3 genes and 1.5 × 10^5 links, with an average connectivity of 24.3 edges per node. Each contains a large connected component (of 2305 and 4122 genes, respectively) together with some small connected components containing only a few nodes.

We performed a statistical analysis to evaluate the enrichment in Gene Ontology (GO) terms for all the nodes (genes) in the network. We limited this analysis to the first neighbours of each gene (its co-expression cluster, CC); the result is that more than 36% (for A-GCN) and 28% (for S-GCN) of genes are enriched for at least one GO keyword. This result is statistically significant when compared with the GO enrichment obtained for random permutations, as shown in the figure. This confirms that human-mouse conserved co-expression is a valuable criterion to identify functionally related genes.
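The network-construction steps (Pearson correlation among probes, links to the first percentile of each ranked list, merging by Entrez Gene ID, and selection of links conserved according to Homologene) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based data layout, the ranking by signed correlation (highest first), the treatment of links as undirected, and the one-to-one `mouse_to_human` homolog map are all hypothetical choices made for the example.

```python
from math import ceil, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def single_species_network(profiles, probe_to_gene, top_fraction=0.01):
    """Build a gene-level coexpression network for one species.

    profiles: dict mapping probe ID -> list of expression values.
    probe_to_gene: dict mapping probe ID -> Entrez Gene ID.
    Returns a set of undirected edges, each a frozenset of two gene IDs.
    """
    probes = list(profiles)
    edges = set()
    for p in probes:
        # Rank all other probes by correlation with p, highest first.
        ranked = sorted(
            (q for q in probes if q != p),
            key=lambda q: pearson(profiles[p], profiles[q]),
            reverse=True,
        )
        # Keep the top fraction of the ranked list (at least one probe).
        k = max(1, ceil(top_fraction * len(ranked)))
        for q in ranked[:k]:
            # Merge probe-level links into gene-level links.
            ga, gb = probe_to_gene.get(p), probe_to_gene.get(q)
            if ga and gb and ga != gb:
                edges.add(frozenset((ga, gb)))
    return edges


def conserved_network(human_edges, mouse_edges, mouse_to_human):
    """Keep only the human links whose homologous mouse link also exists."""
    mapped = set()
    for edge in mouse_edges:
        a, b = tuple(edge)
        ha, hb = mouse_to_human.get(a), mouse_to_human.get(b)
        if ha and hb and ha != hb:
            mapped.add(frozenset((ha, hb)))
    return human_edges & mapped
```

Calling `single_species_network` once per species and passing the two edge sets to `conserved_network` yields the conserved edge set. On real datasets the all-against-all correlation is quadratic in the number of probes, so a vectorized correlation matrix would be the more practical choice.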
HM-GCNS COMPARISON AND ANALYSIS

Overlap of S-GCN and A-GCN
Since many genes, such as those involved in basic cellular functions, should be co-expressed regardless of the particular experimental situation, we would expect the S-GCN and the A-GCN to have many links in common. Indeed, S-GCN and A-GCN share 2305 edges among their 7332 common nodes, which represents a striking overlap (the randomized A-GCN had on average 87.5 edges in common with the S-GCN, with standard deviation 6.2). On the other hand, the large number of links specific to each network indicates that the two networks provide highly complementary information.

Figure 1. Cumulative degree distribution for HM-GCNs. The horizontal axis is the vertex degree k (i.e. the number of links per node), and the vertical axis is the cumulative probability distribution, i.e. the fraction of vertices that have degree greater than or equal to k.

Figure 2. Comparison between HM-GCNs and HPRD (in vivo, in vitro, yeast-two-hybrid). The reported Z-scores measure the significance of this overlap. The Z-score is defined as the difference between the number of common links and the average number of common links obtained from the randomized networks, divided by the corresponding standard deviation: Z = (n_common - mean_random) / sd_random.

Figure 3. Comparison between S-GCN and A-GCN (brown and green columns) with their GO and OMIM enrichment. Z-scores are reported to show the statistical significance of these analyses.

Figure and flowchart labels: A-GCN, S-GCN, random permutation; Homo sapiens, Mus musculus; Mouse coexpression network (M-GCN).

GENERATION AND ANALYSIS OF HUMAN-MOUSE CONSERVED COEXPRESSION NETWORKS
Ala U., Piro R., Grassi E., Damasco C., Silengo L., Brunner H., Provero P. and Di Cunto F.

BIBLIOGRAPHY:
[1] CLOE: Identification of putative functional relationships among genes by comparison of expression profiles between two species. M. Pellegrino, P. Provero, L. Silengo and F. Di Cunto. BMC Bioinformatics 2004, 5:179
[2] Gene expression analyses reveal molecular relationships among 20 regions of the human CNS. Roth RB, Hevezi P, Lee J, Willhite D, Lechner SM, Foster AC, Zlotnik A (2006) Neurogenetics 7:
[3] A gene atlas of the mouse and human protein-encoding transcriptomes. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB (2004) Proc Natl Acad Sci U S A 101:
[4] http://
[5] A text-mining analysis of the human phenome. M. A. van Driel, J. Bruggeman, G. Vriend, H. G. Brunner and J. A. M. Leunissen. European Journal of Human Genetics (2006) 14, 535–542

ANALYSIS OF OMIM TERMS IN THE NETWORK
To explore the possible predictive value of the co-expression links in our networks in terms of human phenotypes, we focused on the OMIM terms used in MimMiner [5]. In both networks, the prevalence of links between genes associated with highly related phenotype descriptions shows a strong enrichment compared with the average obtained from the randomized networks. This result strongly suggests that HM-GCNs are valuable resources to help dissect the molecular bases of many genetic diseases whose genes have not yet been identified. In particular, integration with other information (such as phenotype similarity) can provide reliable predictions of potential disease-causing genes for orphan disease loci.

Table 1. Some predicted candidates for orphan disease loci.
The first column shows in which network the candidate has been found (S for S-GCN and G for A-GCN); the second column indicates the available information:
1: no mutations known with similar phenotypes
2: mutation with similar phenotype (score > 0.4)
3: mutation with similar phenotype (score < 0.4)
#: mutation found in patients
The remaining columns report the HUGO name of the predicted candidate, the OMIM ID of the disease (with a brief description), and the P-value representing the probability of finding a predicted candidate for a particular disease in a particular network.

COMPARISON BETWEEN HM-GCNS AND HPRD
To evaluate our results (S-GCN and A-GCN), we compared them with the Human Protein Reference Database [4], composed of interactions detected by in vivo, in vitro and/or yeast-two-hybrid experiments. Both networks show a statistically significant overlap with this recent network of protein-protein interactions.
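The Z-score used in the overlap analyses (observed number of common links, minus the mean for randomized networks, divided by their standard deviation) can be made concrete with a short Python sketch. The poster does not state how the randomized networks were generated, so the node-label shuffling below is an assumption, as are the function names and the use of the sample standard deviation.

```python
import random
from statistics import mean, stdev


def z_score(observed, random_mean, random_sd):
    """Z = (observed - mean of randomized values) / standard deviation."""
    return (observed - random_mean) / random_sd


def shuffle_labels(edges, rng):
    """Randomize a network by permuting its node labels.

    Assumption: this keeps the topology and only reassigns gene names;
    the randomization actually used for the poster is not specified.
    """
    nodes = sorted({node for edge in edges for node in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(nodes, shuffled))
    return {frozenset(relabel[node] for node in edge) for edge in edges}


def overlap_z(net_a, net_b, n_random=100, seed=0):
    """Z-score of the number of edges shared by two networks.

    net_a, net_b: sets of frozenset({gene1, gene2}) edges.
    net_a is randomized n_random times and compared with net_b.
    """
    rng = random.Random(seed)
    observed = len(net_a & net_b)
    randomized = [
        len(shuffle_labels(net_a, rng) & net_b) for _ in range(n_random)
    ]
    sd = stdev(randomized)
    if sd == 0:
        raise ValueError("randomized overlaps have zero variance")
    return z_score(observed, mean(randomized), sd)
```

With the values reported for the S-GCN and A-GCN overlap (2305 shared edges against an average of 87.5 with standard deviation 6.2), `z_score(2305, 87.5, 6.2)` evaluates to approximately 357.7.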