Clustering Metabolic Networks Using Minimum Cut Trees

Ryan Kellogg (1), Allison Heath (2), Lydia Kavraki (2,3)
(1) Carnegie Mellon University, Department of Electrical & Computer Engineering
(2) Rice University, Department of Computer Science
(3) Rice University, Department of Bioengineering

Problem

Finding clusters in metabolic networks is important for several reasons:
- Clusters may correspond to groups of reactions that perform a common function.
- Complex metabolic networks can be simplified based on their cluster composition.
- Clusters can give insight into large-scale organization and evolutionary history [3].

Motivation

Our approach is interesting because:
- The size and number of clusters can be changed by adjusting a single parameter.
- The algorithm is elegant and mathematically robust.
- Execution is efficient and based on network flow computations.

Overview

This project is about the discovery and analysis of clusters in metabolic networks. We implement an algorithm for cluster detection based on minimum cut trees, apply the algorithm to metabolic network data, analyze the identified clusters, and discuss the biological implications.

Minimum Cut Trees

The algorithm for detecting clusters is based on a structure called a minimum cut tree [2]. The minimum cut tree T of a graph G has the property that the lowest edge weight along the path between two nodes in T equals the capacity of the minimum cut between the same two nodes in G. Consider the following example graph and its corresponding minimum cut tree:

[Figure: example graph and its minimum cut tree]

Explanation: Suppose we are interested in the minimum cut between nodes A and F. The dashed red line indicates this cut, which has capacity 17. Consequently, in the minimum cut tree, the lowest edge weight along the path between nodes A and F is 17.
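The path-minimum property described above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a small made-up graph (`toy`, unrelated to the poster's figure), uses a plain Edmonds-Karp max-flow, and builds the tree with Gusfield's simplified construction, which yields a tree that reproduces all pairwise minimum cut values. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected capacity map: cap[u][v] == cap[v][u] == weight."""
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap


def min_cut(cap, s, t):
    """Edmonds-Karp max flow; returns (cut value, nodes on the s side)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:  # no path left: reached nodes form the s side
            return flow, set(parent)
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:  # push flow along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck


def flow_equivalent_tree(cap):
    """Gusfield's simplified construction: n - 1 max-flow computations.
    Returns parent pointers and the weight of each node's edge to its parent."""
    nodes = list(cap)
    parent = {v: nodes[0] for v in nodes[1:]}
    weight = {}
    for i, s in enumerate(nodes[1:], start=1):
        t = parent[s]
        weight[s], side = min_cut(cap, s, t)
        for v in nodes[i + 1:]:  # re-hang later nodes that fall on s's side
            if v in side and parent[v] == t:
                parent[v] = s
    return parent, weight


def tree_min_cut(parent, weight, u, v):
    """Lowest edge weight on the tree path between u and v."""
    def ancestors(x):
        path = [x]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        return path

    path_u, path_v = ancestors(u), ancestors(v)
    shared = set(path_u) & set(path_v)
    best = float("inf")
    for path in (path_u, path_v):
        for x in path:
            if x in shared:  # reached the lowest common ancestor
                break
            best = min(best, weight[x])
    return best


# Hypothetical example graph with illustrative weights.
toy = build_graph([("A", "B", 3), ("A", "C", 2), ("B", "C", 4), ("B", "D", 2),
                   ("C", "E", 3), ("D", "E", 5), ("D", "F", 1), ("E", "F", 4)])
parent, weight = flow_equivalent_tree(toy)
```

For any pair of nodes, `tree_min_cut(parent, weight, u, v)` should agree with a direct `min_cut(toy, u, v)` computation, even though the tree was built from only n − 1 flow computations.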
Method

Metabolic Networks as Graphs

We model metabolic networks using a directed, bipartite graph:
- One set of nodes represents compounds.
- One set of nodes represents reactions.
- Edges associate compounds with reactions.

Metabolic networks are very complex, and this model is a first-order approximation. It captures the topological information necessary for cluster identification.

Clustering Algorithm

The minimum cut tree clustering (MCTC) algorithm proceeds as follows [1]:
1. Begin with an undirected, weighted graph G.
2. Attach an artificial sink to every node in G with an edge of weight α. Call the result the "expanded graph".
3. Compute the minimum cut tree of the expanded graph.
4. Remove the artificial sink from the minimum cut tree. The resulting connected components are the clusters of G.

Tuning Alpha

We seek an objective way to select α in our analysis:
- Choose the value whose clusters best fit the known metabolic pathway structure.
- To calculate it, find the intersection of the average pathways per cluster (PPC) and the average clusters per pathway (CPP).

[Figure: best-fit α values for the four organisms in our study]

Results

Data

Our data comes from the Kyoto Encyclopedia of Genes and Genomes (KEGG). We study the full metabolism of four organisms:
- Saccharomyces cerevisiae
- Arabidopsis thaliana
- Escherichia coli
- Homo sapiens

[Table: numbers of edges and nodes in the total network and in the LCC for S. cerevisiae, A. thaliana, E. coli, and H. sapiens]

We note that the KEGG data is disconnected, so we focus on the largest connected component (LCC) of the metabolic network.

Cluster Statistics

Interesting observations:
- The number of clusters changes with α in a step-like fashion.
- Moderately sized clusters appear only for a small range of α.
- Overall behavior is as expected.

Biological Analysis

We obtain optimal clusterings for each of the four organisms and compare them with known metabolic pathways. Matches fall roughly into four categories:
- Full match: A cluster coincides exactly with a pathway.
- Partial match: A cluster is contained in a pathway but does not fill it.
- Multi-match: A single cluster spans multiple pathways.
- No match: There is little discernible clustering in a pathway.

We present an example of each type:
- Full match: E. coli, fatty acid biosynthesis
- Partial match: S. cerevisiae, nucleotide sugars metabolism
- Multi-match: H. sapiens, methionine metabolism / cysteine metabolism
- No match: A. thaliana, reductive carboxylate cycle

Conclusion and Future Work

This is an ongoing project. More analysis is necessary to determine the extent to which the MCTC algorithm is useful for understanding metabolic networks. Current progress is encouraging; the algorithm seems to produce biologically meaningful clusters with reasonable efficiency. Future work will explore cluster detection when pathway structure is unknown, simplified network representations based on cluster composition, and applications in other types of biological networks, such as motif identification in regulatory networks.

References

[1] G.W. Flake, R.E. Tarjan, K. Tsioutsiouliklis. "Graph Clustering and Minimum Cut Trees." Internet Mathematics; 1:
[2] R.E. Gomory and T.C. Hu. "Multi-terminal Network Flows." J. Soc. Indust. Appl. Math; 9:
[3] P. Holme and M. Huss. "Discovery and Analysis of Biochemical Network Hierarchies." Bioinformatics; 19:

For questions or comments: Allison Heath
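The clustering procedure described above (attach an artificial sink with edges of weight α, then split the expanded graph along minimum cuts) can be sketched as follows. This is a simplified variant under stated assumptions, not the authors' code: instead of building the full minimum cut tree, it computes for every node the source side of its minimum cut against the sink and keeps the maximal such sets. When minimum cuts are unique this gives the same clusters as removing the sink from the tree; with tied cuts it may give a finer partition. The graph `two_triangles` and all function names are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Undirected capacity map: cap[u][v] == cap[v][u] == weight."""
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap


def min_cut_side(cap, s, t):
    """Edmonds-Karp max flow; returns the nodes on the s side of a minimum cut."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:  # no path left: reached nodes form the s side
            return set(parent)
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:  # push flow along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u


def mctc_clusters(graph, alpha, sink="__sink__"):
    """Cluster `graph` for a given alpha (sketch of the MCTC idea)."""
    # Expanded graph: every node gets an edge of weight alpha to the sink.
    expanded = {u: dict(nbrs) for u, nbrs in graph.items()}
    expanded[sink] = {}
    for u in graph:
        expanded[u][sink] = alpha
        expanded[sink][u] = alpha
    # Source side of each node's minimum cut against the sink.
    sides = {frozenset(min_cut_side(expanded, u, sink)) for u in graph}
    # These sets are nested or disjoint; the maximal ones are the clusters.
    maximal = [c for c in sides if not any(c < d for d in sides)]
    return sorted(sorted(c) for c in maximal)


# Hypothetical graph: two tight triangles joined by one weak edge.
two_triangles = build_graph([
    ("a", "b", 5), ("b", "c", 5), ("a", "c", 5),
    ("d", "e", 5), ("e", "f", 5), ("d", "f", 5),
    ("c", "d", 1),
])

for alpha in (0.25, 1, 8):
    print(alpha, mctc_clusters(two_triangles, alpha))
```

On this toy graph the sweep yields one cluster for α = 0.25, the two triangles for α = 1, and six singletons for α = 8, which illustrates how a single parameter controls the size and number of clusters. For real networks, building the minimum cut tree once per α is likely preferable to one flow computation per node.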
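The α selection rule described above (intersection of average pathways per cluster and average clusters per pathway) can be illustrated with a short sketch. The poster does not spell out the exact definitions, so this assumes that a cluster and a pathway "meet" whenever they share at least one node, and it picks the α on a grid where the two averages are closest. The clusterings, pathway names, and node labels below are made-up toy data, and the function names are hypothetical.

```python
def ppc_cpp(clusters, pathways):
    """Average pathways per cluster (PPC) and clusters per pathway (CPP).

    Assumed definition: a cluster and a pathway are counted together
    whenever they share at least one node.
    """
    overlaps = sum(1 for c in clusters for p in pathways.values() if c & p)
    return overlaps / len(clusters), overlaps / len(pathways)


def best_fit_alpha(clusterings, pathways):
    """Pick the alpha whose PPC and CPP are closest (approximate crossing)."""
    def gap(alpha):
        ppc, cpp = ppc_cpp(clusterings[alpha], pathways)
        return abs(ppc - cpp)
    return min(clusterings, key=gap)


# Toy data (illustrative only): two pathways and three clusterings.
pathways = {"P1": {"a", "b", "c"}, "P2": {"d", "e", "f"}}
clusterings = {
    0.25: [{"a", "b", "c", "d", "e", "f"}],
    1: [{"a", "b", "c"}, {"d", "e", "f"}],
    8: [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}, {"f"}],
}

chosen = best_fit_alpha(clusterings, pathways)
```

Under this particular definition both averages share the same numerator, so they cross exactly where the number of clusters equals the number of pathways; the definitions used in the study may differ, for example by weighting overlaps by size.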