CSCE555 Bioinformatics Lecture 18 Network Biology: Comparison of Networks Across Species Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu.


CSCE555 Bioinformatics Lecture 18 Network Biology: Comparison of Networks Across Species Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu Course page: University of South Carolina Department of Computer Science and Engineering

In the beginning there was DNA…
Liolios K, Tavernarakis N, Hugenholtz P, Kyrpides NC. The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide. NAR 34, D

…then came protein interactions
[Figures: Arabidopsis PPI network; E. coli PPI network; Yeast PPI network]

Comparative Genomics to Comparative Interactomics
Evolutionary conservation implies functional relevance
◦ Sequence conservation implies functional conservation
◦ Network conservation implies functional conservation too!
What new insights might we gain from network comparisons? (Why should we care?)

Network comparisons allow us to:
◦ Identify conserved functional modules
◦ Query for a module, à la BLAST
◦ Predict functions of a module
◦ Predict protein functions
◦ Validate protein interactions
◦ Predict protein interactions
Legend: some of these tasks are only possible with network comparisons; others are possible with existing techniques, but improved with network comparisons.

What is a Protein Interaction Network?
◦ Proteins are nodes
◦ Interactions are edges
◦ Edges may have weights
[Figure: Yeast PPI network. H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41 (2001)]
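The node/edge/weight description above maps directly onto a weighted undirected graph. The following is a minimal sketch of one possible representation; the class name `PPINetwork`, its methods, and the protein identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class PPINetwork:
    """Undirected protein interaction graph with optional edge weights."""

    def __init__(self):
        # adj[u][v] holds the weight of the interaction between u and v.
        self.adj = defaultdict(dict)

    def add_interaction(self, u, v, weight=1.0):
        # Store the edge in both directions because interactions are symmetric.
        self.adj[u][v] = weight
        self.adj[v][u] = weight

    def degree(self, u):
        # Number of interaction partners; unknown proteins have degree 0.
        return len(self.adj.get(u, {}))


# Made-up proteins and weights, for illustration only.
net = PPINetwork()
net.add_interaction("P1", "P2", weight=0.9)
net.add_interaction("P1", "P3")
```

A dictionary of dictionaries keeps neighbor lookups and weight lookups cheap, which is what the alignment steps later in the lecture mostly need.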

The Network Alignment Problem
Given k protein interaction networks belonging to different species, we wish to find sub-networks that are conserved across these networks.
Conservation is measured in terms of protein sequence similarity (node similarity) and interaction similarity (network topology similarity).

Example Network Alignment
[Figure from Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, 2006]

General Framework For Network Alignment Algorithms
◦ Network construction
◦ Scoring function
◦ Alignment algorithm
[Figure from Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, 2006]

Building Co-expression Networks
[Figure: microarray expression data (genes × arrays) for Gene A, Gene B and Gene C; genes are linked according to the Pearson correlation of their expression profiles. Credit: Balaji S. Srinivasan]
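To make the construction concrete, here is a small sketch that links genes whose expression profiles are strongly correlated. The expression values, the 0.8 threshold, and the use of the absolute correlation are assumptions for illustration; the slide does not state which cutoff was used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.8):
    """Return {(gene1, gene2): r} for gene pairs with |r| >= threshold."""
    genes = sorted(expression)
    edges = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expression[g1], expression[g2])
            if abs(r) >= threshold:
                edges[(g1, g2)] = r
    return edges


# Toy expression matrix: rows are genes, columns are arrays (made-up values).
expression = {
    "GeneA": [1.0, 2.0, 3.0, 4.0],
    "GeneB": [2.0, 4.1, 5.9, 8.0],
    "GeneC": [4.0, 1.0, 3.5, 2.0],
}
edges = coexpression_edges(expression)
```

With these toy values only Gene A and Gene B end up connected, since Gene C's profile does not track either of the others.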

Two Algorithms
◦ NetworkBLAST (covered today): Sharan et al. Conserved patterns of protein interaction in multiple species. PNAS 102(6), 2005.
◦ Græmlin: Flannick et al. Græmlin: General and robust alignment of multiple large interaction networks. Genome Res 16, 2006.

Overview of NetworkBLAST
Sharan et al. Conserved patterns of protein interaction in multiple species. PNAS 102(6), 2005.

Estimation of Interaction Probabilities
In the preprocessing step, each edge in the network is given a reliability score by a logistic regression model based on three features:
1. Number of times the interaction was observed
2. Pearson correlation coefficient between the proteins' expression profiles
3. The proteins' small-world clustering coefficient
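The shape of such a model can be sketched as follows. The function name and the weight values are placeholders invented for illustration; in practice the coefficients would be fitted to training data, and the fitted values are not given on the slide.

```python
import math


def interaction_reliability(n_observations, expr_correlation, clustering_coeff,
                            weights=(-2.0, 1.0, 2.0, 1.5)):
    """Logistic model: estimated probability that an observed interaction is real.

    weights = (intercept, w_observations, w_correlation, w_clustering).
    The default values are made up for illustration, not fitted.
    """
    b0, b1, b2, b3 = weights
    z = b0 + b1 * n_observations + b2 * expr_correlation + b3 * clustering_coeff
    # The logistic function maps the linear score z to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# An interaction seen once with no supporting evidence versus one seen
# three times between co-expressed, well-clustered proteins.
weak = interaction_reliability(1, 0.0, 0.0)
strong = interaction_reliability(3, 0.9, 0.5)
```

The resulting scores can then serve as edge weights in the interaction network.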

Network Alignment Graphs
Construct a network alignment graph to represent the alignment.
Nodes contain groups of sequence-similar proteins from the k organisms.
Edges represent conserved interactions. An edge between two nodes is present if any of the following holds:
1. One pair of proteins directly interacts, and the rest are at distance at most 2
2. All protein pairs are at distance exactly 2
3. At least max(2, k - 1) protein pairs directly interact
This definition tries to account for interaction deletions.
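The three edge rules can be written as a single predicate. This is a sketch under stated assumptions: the function name `conserved_edge` is hypothetical, and the input is assumed to be one precomputed shortest-path distance per species, with 1 meaning a direct interaction and `None` meaning the two proteins are not connected.

```python
def conserved_edge(distances):
    """Decide whether two alignment-graph nodes are joined by an edge.

    distances[i] is the shortest-path distance, in species i's network,
    between the two proteins that species contributes to the two nodes
    (1 = direct interaction, None = not connected).
    """
    k = len(distances)
    direct = sum(1 for d in distances if d == 1)
    within_two = all(d is not None and d <= 2 for d in distances)
    # Rule 1: one pair interacts directly, the rest are at distance <= 2.
    rule1 = direct >= 1 and within_two
    # Rule 2: every pair is at distance exactly 2.
    rule2 = all(d == 2 for d in distances)
    # Rule 3: at least max(2, k - 1) pairs interact directly.
    rule3 = direct >= max(2, k - 1)
    return rule1 or rule2 or rule3
```

For three species, `conserved_edge([1, 2, 2])` and `conserved_edge([1, 1, None])` both hold (rules 1 and 3 respectively), while `conserved_edge([1, 3, 2])` does not.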

Example Network Alignment Graph
[Figure: individual species' PPI networks for Species X, Species Y and Species Z, and the resulting network alignment graph; node labels a, b, c; a', b', c'; a'', b'', c'']

Scoring Function
Sharan et al. devise a scoring scheme based on a likelihood model for the fit of a single sub-network to the given structure.
High-scoring subgraphs correspond to structured sub-networks (cliques or pathways).
Only network topology is scored; node similarity is not.

Log Likelihood Ratio Model
Measures the likelihood that a subgraph occurs if it is a conserved network versus the likelihood that it occurs in a randomly constructed network.
The randomly constructed network preserves the degree distribution of the nodes.
$$\log \frac{\Pr(\text{Subgraph occurs} \mid \text{Conserved Network})}{\Pr(\text{Subgraph occurs} \mid \text{Random Network})}$$

Log Likelihood Ratio Model
The two models assume that (i) in a real subnetwork, each interaction should be present independently with high probability, and (ii) in a random subnetwork, the probability of an interaction between any two proteins depends on their total number of connections in the network.

Likelihood Ratio Scoring of a Protein Complex in a Single Species
◦ U: a subset of vertices (proteins) in the PPI graph
◦ O_U: collection of all observations on vertex pairs in U
◦ O_uv: interaction between proteins u, v observed
◦ M_s: conserved network model
◦ M_n: random network (null) model
◦ T_uv: proteins u, v interact
◦ F_uv: proteins u, v do not interact
◦ β: probability that proteins u, v interact in the conserved model
◦ p_uv: probability that edge u, v exists in the random model
Two quantities are compared: the probability of the complex being observed in the conserved network model, and the probability of the subgraph being observed in the random network model.
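The slide's equations did not survive in this transcript. One reconstruction consistent with the definitions above, assuming that observations on different vertex pairs are independent and applying the law of total probability over T_uv and F_uv, is:

```latex
\Pr(O_U \mid M_s) = \prod_{(u,v) \in U \times U}
  \left[ \beta \, \Pr(O_{uv} \mid T_{uv}) + (1-\beta) \, \Pr(O_{uv} \mid F_{uv}) \right]

\Pr(O_U \mid M_n) = \prod_{(u,v) \in U \times U}
  \left[ p_{uv} \, \Pr(O_{uv} \mid T_{uv}) + (1-p_{uv}) \, \Pr(O_{uv} \mid F_{uv}) \right]
```

The two expressions differ only in the prior probability of an interaction: β for every pair under M_s, and the degree-dependent p_uv under M_n.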

Likelihood Ratio Scoring of a Protein Complex in a Single Species
Hence, the log likelihood ratio for a complex occurring in a single species is the logarithm of the ratio of these two probabilities.
For multiple complexes across different species, the score is the sum of the per-species log likelihoods:
L(A, B, C) = L(A) + L(B) + L(C)
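A numeric sketch of this score is given below. It assumes each protein pair contributes the log of a β-weighted mixture of Pr(O_uv | T_uv) and Pr(O_uv | F_uv) divided by the corresponding p_uv-weighted mixture, with pairs treated as independent. The function names, the default β = 0.8, and all probabilities in the example are illustrative assumptions.

```python
import math


def complex_score(pairs, beta=0.8):
    """Log likelihood ratio of one candidate complex in one species.

    pairs: one (obs_if_true, obs_if_false, p_uv) tuple per protein pair, where
      obs_if_true  ~ Pr(O_uv | T_uv), the observation given the pair interacts
      obs_if_false ~ Pr(O_uv | F_uv), the observation given it does not
      p_uv         ~ interaction probability under the random model
    """
    score = 0.0
    for obs_if_true, obs_if_false, p_uv in pairs:
        conserved = beta * obs_if_true + (1 - beta) * obs_if_false
        random_model = p_uv * obs_if_true + (1 - p_uv) * obs_if_false
        score += math.log(conserved / random_model)
    return score


def alignment_score(per_species_pairs, beta=0.8):
    """Sum the per-species scores, as in L(A, B, C) = L(A) + L(B) + L(C)."""
    return sum(complex_score(pairs, beta) for pairs in per_species_pairs)


# Made-up evidence: three well-supported pairs versus three poorly supported ones.
well_supported = [(0.9, 0.1, 0.05)] * 3
poorly_supported = [(0.1, 0.9, 0.05)] * 3
```

Well-supported pairs give a positive score and poorly supported ones a negative score, so dense, reliable subgraphs rise to the top.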

Example of Complex Scoring
[Figure: conserved complex A in the network alignment graph, alongside the individual species' PPI networks: Complex X1 in Species X, Complex Y1 in Species Y, Complex Z1 in Species Z; node labels a, b, c; a', b', c'; a'', b'', c'']
L(A) = L(X1) + L(Y1) + L(Z1)

Alignment algorithm
The problem of identifying conserved sub-networks reduces to finding high-scoring subgraphs
NP-complete problem
Heuristic solution:
◦ Greedy extension of high-scoring seeds
◦ (Does this sound familiar? BLAST?)
◦ Common to both papers discussed

Alignment algorithm
1. Find seeds for each node v in the alignment graph
a. Find high-scoring paths of 4 nodes by exhaustive search
b. Greedily add 3 other nodes one by one, each chosen to maximally increase the score of the seed

Alignment algorithm
2. Iteratively add or remove nodes to increase the overall score of the subgraph
◦ Original seeds are preserved
◦ The size of discovered subgraphs is limited to 15 nodes
◦ Up to 4 highest-scoring subgraphs discovered around each node are recorded
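The refinement step can be sketched as a greedy local search. This is an illustration rather than the published implementation: the scoring function is a simplified stand-in (a sum of pairwise scores with a fixed penalty for unscored pairs), and the names `subgraph_score`, `refine`, `pair_score` and `neighbors` are hypothetical.

```python
from itertools import combinations


def subgraph_score(nodes, pair_score, missing=-1.0):
    """Sum of pairwise scores over all node pairs; unseen pairs get `missing`."""
    return sum(pair_score.get(frozenset(p), missing)
               for p in combinations(sorted(nodes), 2))


def refine(seed, neighbors, pair_score, max_size=15):
    """Greedy local search: add or remove one node at a time while it helps."""
    seed = frozenset(seed)
    current = set(seed)
    best = subgraph_score(current, pair_score)
    while True:
        candidates = []
        if len(current) < max_size:
            # Nodes adjacent to the current subgraph are candidates for addition.
            frontier = set().union(*(neighbors[n] for n in current)) - current
            candidates += [current | {n} for n in frontier]
        # Original seed nodes are never removed.
        candidates += [current - {n} for n in current - seed]
        if not candidates:
            return current, best
        top = max(candidates, key=lambda c: subgraph_score(c, pair_score))
        top_score = subgraph_score(top, pair_score)
        if top_score <= best:
            return current, best
        current, best = set(top), top_score


# Toy alignment graph with made-up pair scores.
neighbors = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
pair_score = {frozenset("ab"): 2.0, frozenset("ac"): 1.5,
              frozenset("bc"): 1.5, frozenset("cd"): 0.2}
nodes, score = refine({"a", "b"}, neighbors, pair_score)
```

Starting from the seed {a, b}, the search adds c (score 5.0) and then stops, because adding d would pull in two unscored pairs and lower the total.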

Alignment algorithm
3. Filter subgraphs with a high degree of overlap
◦ Iteratively pick the highest-scoring remaining subgraph and remove all remaining subgraphs that overlap it heavily
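A sketch of this filtering step follows. The overlap measure (shared nodes divided by the size of the smaller subgraph) and the 0.8 cutoff are assumptions made for illustration; the slide does not specify either.

```python
def filter_overlapping(subgraphs, max_overlap=0.8):
    """Keep high scorers; drop any subgraph that overlaps a kept one too much.

    subgraphs: list of (score, set_of_nodes) pairs.
    Overlap is measured as |A & B| / min(|A|, |B|).
    """
    kept = []
    # Visiting subgraphs in descending score order means each one is only
    # compared against higher-scoring subgraphs that were already kept.
    for score, nodes in sorted(subgraphs, key=lambda s: s[0], reverse=True):
        if all(len(nodes & k) / min(len(nodes), len(k)) <= max_overlap
               for _, k in kept):
            kept.append((score, nodes))
    return kept


# Made-up candidates: the second is contained in the first and gets dropped.
candidates = [
    (4.0, {"a", "b", "c", "d"}),
    (5.0, {"a", "b", "c", "d", "e"}),
    (3.0, {"x", "y", "z"}),
]
kept = filter_overlapping(candidates)
```

Processing in score order is equivalent to repeatedly taking the best remaining subgraph and discarding its heavy overlaps.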

Results Conserved network regions within yeast (orange ovals), fly (green rectangles) and worm (blue hexagons) PPI networks.

Results
Prediction of protein function
◦ 'Guilt by association'
◦ If a conserved cluster or path is significantly enriched in a functional annotation, that annotation is predicted for the other proteins in it
Prediction of protein interactions
◦ Predictions are based on 2 strategies:
1. Evidence that proteins with similar sequences interact
2. Co-occurrence of proteins in the same conserved cluster or path
◦ Experimental verification of yeast interactions using Y2H yielded a 40-62% success rate
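One common way to test whether a cluster is "significantly enriched" in an annotation is a hypergeometric tail probability. The slide does not say which test was used, so the choice of test, the function name, and the numbers below are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, annotated_in_cluster):
    """P(X >= annotated_in_cluster) under the hypergeometric distribution.

    population: number of proteins in the network
    annotated: proteins in the network carrying the annotation
    cluster_size: proteins in the conserved cluster
    annotated_in_cluster: cluster proteins carrying the annotation
    """
    upper = min(annotated, cluster_size)
    tail = sum(comb(annotated, i) * comb(population - annotated, cluster_size - i)
               for i in range(annotated_in_cluster, upper + 1))
    return tail / comb(population, cluster_size)


# Made-up example: 8 of 10 cluster proteins share an annotation that only
# 50 of 1000 proteins in the whole network carry.
p = enrichment_p_value(1000, 50, 10, 8)
```

A small p-value suggests the annotation is over-represented in the cluster, which is the condition under which it would be transferred to the cluster's unannotated proteins.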

Overview of Græmlin
Fast, scalable network alignment
◦ Scales linearly in the number of networks compared
◦ NetworkBLAST scales exponentially
Supports efficient querying of modules
Speed-sensitivity control via a user-defined parameter
◦ Not supported in NetworkBLAST

Input to the Algorithm
Weighted protein interaction graphs
◦ Weights represent the probability that proteins interact
◦ Constructed via a network integration algorithm
A phylogenetic tree relating the species in the desired alignment
◦ Used for progressive alignment

Key Ideas of Græmlin
◦ Generating an initial alignment from the seed
◦ Greedy seed extension phase
◦ Progressive alignment technique using the phylogenetic tree
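The progressive strategy can be illustrated with a post-order walk over the phylogenetic tree. This sketch shows only the traversal order, not Græmlin's actual alignment step: `align_pair` is a hypothetical callback standing in for a pairwise alignment routine, and the tree is assumed to be binary and encoded as nested tuples.

```python
def progressive_align(tree, align_pair):
    """Post-order traversal: align the children first, then merge their results.

    tree: a species name (leaf) or a (left, right) tuple (internal node).
    align_pair: hypothetical callback that merges two networks or alignments.
    """
    if isinstance(tree, str):
        # A leaf stands for that species' own network.
        return tree
    left, right = tree
    return align_pair(progressive_align(left, align_pair),
                      progressive_align(right, align_pair))


# Record the order of pairwise alignments for a toy three-species tree.
calls = []


def record(a, b):
    calls.append((a, b))
    return f"({a}+{b})"


result = progressive_align((("worm", "fly"), "yeast"), record)
```

A binary tree with k leaves has k - 1 internal nodes, so only k - 1 pairwise alignments are performed, in line with the linear scaling claimed for the method.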

Results Functional module identification using network alignment

Results Multiple alignment of 10 networks showing possible cell division module Functional annotation using network alignment

The Future of Network Comparison
Græmlin?
[Figure from Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, 2006]

Summary
◦ The problem: network comparison / comparative interactomics
◦ The NetworkBLAST algorithm
◦ Brief introduction to Græmlin
◦ The analogy between sequence comparison and network comparison

Reference & Acknowledgements
◦ Chuan Sheng Foo
◦ Sharan et al. Conserved patterns of protein interaction in multiple species. PNAS, February 8, 2005, vol. 102, no. 6.