School of Information University of Michigan SI 614 Network subgraphs (motifs) Biological networks Lecture 11 Instructor: Lada Adamic.



Outline
- motifs
- motif detection (software & Pajek)
- review of network characteristics used to compare a model with a real-world network
  - one more: degree assortativity
- biological networks
  - types
  - characteristics
  - hierarchical modularity model

Schematic view of network motif detection

Motifs can overlap in the network
[Figure labels: graph; motif to be found; motif matches in the target graph]

Examples of network motifs (3 nodes)
- Feed forward loop
  - found in neural networks
  - seems to be used to neutralize "biological noise"
- Single-Input Module
  - e.g. gene control networks

All 3 node motifs

Examples of network motifs (4 nodes)
- Parallel paths
  - found in neural networks
  - found in food webs
[Figure: nodes W, X, Y, Z]

4 node subgraphs (computational expense increases with the size of the graph!)

Network motif detection
- Some motifs occur more often in real-world networks than in random networks
- Technique:
  - construct many random graphs with the same number of nodes and edges (same node degree distribution?)
  - count the number of motifs in those graphs
  - calculate the Z score: how many standard deviations the motif count in the real-world network lies from the mean count in the random graphs, which indicates how likely that count is to have occurred by chance
- Software available:
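The technique above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular motif-detection package: the motif is fixed to the undirected triangle, the random graphs keep only the number of nodes and edges (not the degree distribution), and the function names `count_triangles`, `random_graph` and `motif_z_score` are made up for this example.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def count_triangles(edges):
    """Count undirected triangles; each edge must be listed exactly once."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Every common neighbour of an edge's end points closes one triangle.
    closed = sum(len(adj[u] & adj[v]) for u, v in edges)
    return closed // 3  # each triangle was seen once per edge


def random_graph(n, m, rng):
    """Random simple graph with n nodes and m edges (degrees NOT preserved)."""
    return rng.sample(list(combinations(range(n), 2)), m)


def motif_z_score(edges, n, samples=200, seed=0):
    """Z score of the triangle count against an ensemble of random graphs."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    counts = [count_triangles(random_graph(n, len(edges), rng))
              for _ in range(samples)]
    mu, sigma = mean(counts), pstdev(counts)
    return (observed - mu) / sigma if sigma > 0 else float("nan")


# Two 4-cliques joined by a single edge: 8 triangles on 8 nodes.
example = (list(combinations(range(4), 2))
           + list(combinations(range(4, 8), 2))
           + [(3, 4)])
print(count_triangles(example), motif_z_score(example, n=8))
```

Swapping `random_graph` for a degree-preserving randomization answers the parenthetical question on the slide: it removes the part of the signal that is explained by the degree sequence alone.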

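What the Z score means
  Z_x = (x - μ_x) / σ_x
- x: number of times the motif appeared in the real-world graph
- μ_x: mean number of times the motif appeared in the random graphs
- σ_x: standard deviation of the number of times the motif appeared in the random graphs
- a Z score maps to a tail probability of the standard normal distribution (the slide's example is Z = 2)
In the context of motifs:
- Z > 0: the motif occurs more often than in random graphs
- Z < 0: the motif occurs less often than in random graphs
- |Z| > 1.65: only about a 5% chance of random occurrence (one-tailed, assuming the counts are approximately normally distributed)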
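As a quick check on the thresholds above, a Z score can be turned into a one-tailed probability with the complementary error function. This sketch assumes the motif counts in the random ensemble are approximately normal; the function name `upper_tail_probability` is illustrative.

```python
import math


def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (1.65, 2.0):
    print(z, upper_tail_probability(z))
```

For small ensembles or rare motifs the normal approximation can be poor, in which case the fraction of random graphs with at least as many motifs as the real network is a more direct estimate.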

Finding classes of graphs based on their motif “profiles”

Finding motifs (cliques and subgraphs) in Pajek
Create a second network that is the subgraph you are looking for, e.g. an undirected triad:
    *Vertices 3
    1 "v1"
    2 "v2"
    3 "v3"
    *Arcs
    *Edges
    1 2
    2 3
    1 3

finding motifs with Pajek
- Use the two drop down menus in the ‘networks’ list to specify the two networks
- Then run Nets > Fragment (1 in 2) > Find
- Under Nets > Fragment (1 in 2) > Options you can select ‘induced’
- The result is a subnetwork containing only the overlapping fragments

finding motifs with Pajek (cont’d)
- Now we have just the triads
- Pajek also creates a hierarchy object with the membership of each triad listed

Comparing network models with the real thing
Check for structural similarity between the artificial network (the model) and the real-world network:
- degree distribution
- assortativity: do high degree nodes connect to other high degree nodes?
- average shortest path
  - dependence on size of network
- clustering coefficient
  - compare to a randomized version conserving node degree
  - dependence on node degree
  - dependence on size of network
- motif profile

How can we randomize a network while preserving the degree distribution?
Stub reconnection algorithm (M. E. Newman, et al, 2001, also known in mathematical literature since 1960s)
- Break every edge into two “edge stubs”: A → B becomes A → and → B
- Randomly reconnect stubs
Problems:
- Leads to multiple edges
- Cannot be modified to preserve additional topological properties
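A minimal sketch of stub reconnection for a directed edge list follows. It keeps every edge's source stub in place and shuffles the target stubs, which preserves each node's in- and out-degree. The function name `stub_reconnection` is illustrative, and the sketch deliberately does nothing about the self-loops and multiple edges listed as a problem above.

```python
import random


def stub_reconnection(edges, seed=None):
    """Randomly re-pair outgoing and incoming stubs of a directed edge list."""
    rng = random.Random(seed)
    sources = [u for u, _ in edges]   # outgoing stubs "A ->"
    targets = [v for _, v in edges]   # incoming stubs "-> B"
    rng.shuffle(targets)
    # May contain self-loops (u, u) and repeated pairs.
    return list(zip(sources, targets))


original = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
print(stub_reconnection(original, seed=42))
```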

Local rewiring algorithm (Maslov, Sneppen, 2002, also known in mathematical literature since 1960s)
- Randomly select two edges and rewire them by exchanging their end points (A → B, C → D becomes A → D, C → B)
- Repeat many times
- Preserves both the number of upstream and downstream neighbors of each node
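The local rewiring step can be sketched as below for a directed edge list without self-loops or duplicate edges. Rejecting swaps that would create a self-loop or a multiple edge is one common way to keep the graph simple; that rule, the attempt cap, and the name `local_rewiring` are choices made for this example rather than details taken from the slide.

```python
import random


def local_rewiring(edges, n_swaps, seed=None):
    """Degree-preserving randomization by repeated pairwise edge swaps."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: A->B, C->D becomes A->D, C->B.
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # would create a multiple edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


ring_with_chords = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
print(local_rewiring(ring_with_chords, n_swaps=20, seed=1))
```

Because every accepted swap leaves each node with the same number of outgoing and incoming edges, the in- and out-degree sequences are unchanged no matter how many swaps succeed.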

Conserving additional low-level topological properties
In addition to the node degrees k_i one may also conserve:
- the exact numbers of loops or other motifs
- the size and number of components (Internet: all nodes have to be connected to each other)
Metropolis algorithm: two edges are rewired based on the energy
  E = (N_actual - N_desired)^2 / N_desired
- if ΔE ≤ 0, the rewiring step is always accepted
- if ΔE > 0, the rewiring step is accepted with probability p = exp(-ΔE / T)
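The acceptance rule above translates directly into code. In this sketch `energy` and `accept_rewiring` are illustrative names, and counting the motifs before and after a proposed rewiring (to obtain `n_actual`) is left to the caller.

```python
import math
import random


def energy(n_actual, n_desired):
    """E = (N_actual - N_desired)^2 / N_desired."""
    return (n_actual - n_desired) ** 2 / n_desired


def accept_rewiring(e_old, e_new, temperature, rng=random):
    """Metropolis rule: always accept downhill, sometimes accept uphill."""
    delta = e_new - e_old
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


# A step that moves the motif count from 14 to 12 when 10 are desired
# lowers the energy, so it is always accepted.
print(accept_rewiring(energy(14, 10), energy(12, 10), temperature=0.5))
```

A high temperature T accepts most uphill steps and explores freely; lowering T makes the rewiring settle on graphs whose motif count is close to the desired value.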

Assortativity
- Social networks are assortative:
  - the gregarious people associate with other gregarious people
  - the loners associate with other loners
- The Internet is disassortative
[Figure panels: Assortative: hubs connect to hubs; Random; Disassortative: hubs are in the periphery]

Correlation profile of a network
- Detects preferences in linking of nodes to each other based on their connectivity
- Measure N(k_0, k_1): the number of edges between nodes with connectivities k_0 and k_1
- Compare it to N_r(k_0, k_1): the same property in a properly randomized network
- Very noise-tolerant with respect to both false positives and false negatives
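A small sketch of the measurement for an undirected edge list is shown below. Comparing N(k_0, k_1) to N_r(k_0, k_1) by taking their ratio is an assumption of this example (the slide only says "compare"), and the function names are illustrative. The randomized edge lists are expected to come from a degree-preserving randomization supplied by the caller.

```python
from collections import Counter


def degree_pair_counts(edges):
    """N(k0, k1): number of edges joining nodes of degree k0 and k1 (k0 <= k1)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter()
    for u, v in edges:
        k0, k1 = sorted((degree[u], degree[v]))
        counts[(k0, k1)] += 1
    return counts


def correlation_profile(real_edges, randomized_edge_lists):
    """Ratio N / <N_r> for every degree pair seen in the real network."""
    real = degree_pair_counts(real_edges)
    total = Counter()
    for edges in randomized_edge_lists:
        total.update(degree_pair_counts(edges))
    n = len(randomized_edge_lists)
    return {pair: real[pair] / (total[pair] / n)
            for pair in real if total[pair] > 0}


path = [(0, 1), (1, 2), (2, 3)]
print(degree_pair_counts(path))
```

Ratios above 1 mark degree pairs that are linked more often than expected from the degree sequence alone; ratios below 1 mark suppressed pairs.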

Correlation profiles give complex networks unique identities
[Figure panels: Internet; protein interactions. 2D picture, slide by Sergei Maslov]

Correlation profiles give complex networks unique identities
[Figure panels: Internet; protein interactions. Sergei Maslov: 2D histogram]

Correlation profiles - cont’d
Pastor-Satorras and Vespignani: 2D plot
[Figure axes: degree of node (x); average degree of the node’s neighbors (y)]
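The quantity on that plot, the average degree of a node's neighbors as a function of the node's own degree, can be computed as sketched below for an undirected edge list. The function name is illustrative.

```python
from collections import defaultdict


def average_neighbor_degree_by_degree(edges):
    """Map each degree k to the mean neighbor degree of nodes with degree k."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, neighbors in adj.items():
        k = len(neighbors)
        sums[k] += sum(len(adj[w]) for w in neighbors) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


star = [(0, 1), (0, 2), (0, 3)]
print(average_neighbor_degree_by_degree(star))
```

A curve that rises with k indicates assortative mixing, and a falling curve indicates disassortative mixing, as in the star example where the hub's neighbors all have degree 1.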

Correlation profiles - cont’d
Newman: single number
- degree correlation coefficient (example: the Internet)
- the Pearson correlation coefficient of the degrees of the nodes at either end of an edge
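A minimal sketch of this single-number summary for an undirected edge list: each edge contributes its two end-point degrees in both orders, and the Pearson correlation is taken over those pairs. The function name `degree_assortativity` is illustrative.

```python
from collections import Counter


def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    xs, ys = [], []
    for u, v in edges:
        # Use both orientations so the measure is symmetric.
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_k = sum(xs) / n  # xs and ys have the same mean by symmetry
    cov = sum((x - mean_k) * (y - mean_k) for x, y in zip(xs, ys)) / n
    var = sum((x - mean_k) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")


star = [(0, 1), (0, 2), (0, 3)]
print(degree_assortativity(star))
```

Positive values mean hubs tend to link to hubs (assortative); negative values mean hubs tend to link to low-degree nodes (disassortative). The star is the extreme disassortative case.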

Other examples of assortative mixing
- Assortativity is not limited to degree-degree correlations; other attributes:
  - social networks: race, income, gender, age
  - food webs: herbivores, carnivores
  - internet: high level connectivity providers, ISPs, consumers
- Tendency of like individuals to associate: ‘homophily’
- Scott Feld paper

Biological networks
In biological systems nodes and edges can represent different things:
- nodes: protein, gene, chemical
- edges: mass transfer, regulation
Can construct bipartite or tripartite networks, e.g. genes and proteins

[Figure layers: GENOME; PROTEOME; METABOLISM. Interactions shown: protein-gene interactions; protein-protein interactions; bio-chemical reactions]
slide after Reka Albert

Cellular processes form networks on many levels
metabolic reaction networks (tri-partite)
- Node types:
  - metabolites (substrates or products): open rectangles
  - metabolite-enzyme complexes: black rectangles
  - enzymes: open ovals
- Edges:
  - substrate to complex or complex to product
  - symmetrical edges
slide after Reka Albert

regulatory networks
- nodes: genes, proteins
- edges: translation; regulation (activating or inhibiting)
slide after Reka Albert

the yeast two-hybrid method
- Activation and binding domains are separated and each attached to a different protein
- If the proteins interact, the two domains will be brought together and activate the transcription of a reporter gene
- Can do simultaneous genome-wide experiments
slide after Reka Albert

Resulting interaction network slide after Reka Albert

Properties and problems of resulting networks
Properties:
- giant component exists
- power law degree distribution with an exponential cutoff
- longer path length than randomized
- higher incidence of short loops than randomized
Problems:
- false positives
- false negatives
- only 20% overlap between different studies

Implications
Robustness:
- resilient to random breakdowns
- mutations in hubs can be deadly
Evolution:
- most connected hubs are conserved across organisms (important)
- gene duplication hypothesis:
  - the new gene still has the same output protein, but there is no selection pressure on it because the original gene is still present
  - so some interactions can be added or dropped
  - leads to scale free topology

Metabolic networks: how to represent them Can consider the one-mode projection of substrate interactions (undirected) slide after Reka Albert
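A one-mode projection of this kind can be sketched as follows: two substrates are linked whenever they take part in the same reaction. The input format (a dict from reaction name to its substrates), the reaction and substrate names, and the function name `substrate_projection` are all hypothetical.

```python
from itertools import combinations


def substrate_projection(reactions):
    """Undirected substrate-substrate edges from a reaction -> substrates map."""
    edges = set()
    for substrates in reactions.values():
        # Link every pair of substrates that share this reaction.
        for a, b in combinations(sorted(set(substrates)), 2):
            edges.add((a, b))
    return edges


# Hypothetical toy data: two reactions sharing substrate "C".
toy_reactions = {"r1": ["A", "B", "C"], "r2": ["C", "D"]}
print(sorted(substrate_projection(toy_reactions)))
```

The projection discards direction and the identity of the reaction, so a reaction with many substrates becomes a clique, which inflates clustering compared with the bipartite picture.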

Metabolic networks are scale-free
In the bi-partite graph, the probability that a given substrate participates in k reactions is P(k) ~ k^(-γ)
- indegree: γ = 2.2
- outdegree: γ = 2.2
[Figure panels: (a) A. fulgidus (Archaea); (b) E. coli (Bacterium); (c) C. elegans (Eukaryote); (d) averaged over 43 organisms]

Modularity
[Figure panels: No modularity; Modularity; Hierarchical modularity]
E. Ravasz et al., Science 297, (2002) (Pajek!)

How do we know that metabolic networks are modular?
- clustering decreases with degree as C(k) ~ k^(-1)
- randomized networks (which preserve the power law degree distribution) have a clustering coefficient independent of degree
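To check a C(k) ~ k^(-1) trend on data, one needs the average local clustering coefficient of nodes grouped by degree. A minimal sketch for an undirected edge list follows; nodes with fewer than two neighbors are skipped because their clustering coefficient is undefined, and the function name is illustrative.

```python
from collections import defaultdict
from itertools import combinations


def clustering_by_degree(edges):
    """Map each degree k >= 2 to the mean local clustering of degree-k nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    by_degree = defaultdict(list)
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue
        # Count links among the neighbors of this node.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        by_degree[k].append(2 * links / (k * (k - 1)))
    return {k: sum(vals) / len(vals) for k, vals in by_degree.items()}


# Triangle 0-1-2 with a pendant node 3 attached to node 0.
print(clustering_by_degree([(0, 1), (1, 2), (0, 2), (0, 3)]))
```

Plotting the result on log-log axes for the real network and for a degree-preserving randomization makes the contrast described above visible: a slope near -1 for the former, a flat curve for the latter.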

How do we know that metabolic networks are modular?
- the clustering coefficient is the same across metabolic networks in different species with the same substrate
- in the corresponding randomized scale free network, C(N) decreases with N (simulation, no analytical result)
[Figure legend: bacteria; archaea (extreme-environment single cell organisms); eukaryotes (plants, animals, fungi, protists); scale free network of the same size]

review: what would the clustering coefficient of a random network be?
- assume the average degree of a node is k
- the probability of one neighbor linking to another is ~ k/N
- so the clustering coefficient scales as N^(-1)
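The argument can be written out for the standard random graph in which every pair of the N nodes is linked independently with probability p (an assumption about which random model is meant); the angle brackets denote the average degree called k above.

```latex
\begin{align*}
\langle k \rangle &= p\,(N-1)
  && \text{each node has } N-1 \text{ potential neighbors} \\
C &= \Pr(\text{two neighbors of a node are linked}) = p
  && \text{edges are independent} \\
C &= \frac{\langle k \rangle}{N-1} \approx \frac{\langle k \rangle}{N} \;\propto\; N^{-1}
  && \text{at fixed } \langle k \rangle
\end{align*}
```

So a random network with the same average degree as a large real network has a vanishing clustering coefficient, which is why a size-independent C in real networks is informative.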

Constructing a hierarchically modular network
RSMOB model
- Start from a fully connected cluster of 5 nodes
- Create 4 identical replicas of the cluster, linking the outside nodes of the replicas to the center node of the original (N = 25 nodes)
- This process can be repeated indefinitely (the initial number of nodes can be different from 5)
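The construction can be sketched as below. Two points are assumptions of this example rather than statements from the slide: node 0 plays the role of the center, and at every later iteration the "outside nodes" of a replica are taken to be the outside nodes inherited from its own replicas. The function name `hierarchical_network` and its parameters are illustrative.

```python
from itertools import combinations


def hierarchical_network(levels, cluster_size=5, replicas=4):
    """Return (number of nodes, set of undirected edges) after `levels` iterations."""
    # Level 0: a fully connected cluster; node 0 is the center.
    edges = set(combinations(range(cluster_size), 2))
    n = cluster_size
    outside = list(range(1, cluster_size))
    for _ in range(levels):
        new_edges = set(edges)
        new_outside = []
        for r in range(1, replicas + 1):
            offset = r * n
            # Copy the current module with shifted node labels.
            new_edges |= {(u + offset, v + offset) for u, v in edges}
            # Link the replica's outside nodes to the original center.
            for node in outside:
                new_edges.add((0, node + offset))
                new_outside.append(node + offset)
        edges, outside = new_edges, new_outside
        n *= replicas + 1
    return n, edges


for level in range(3):
    n_nodes, e = hierarchical_network(level)
    print(level, n_nodes, len(e))
```

Each iteration multiplies the node count by five, and the center accumulates links from every new layer of outside nodes, which is what produces a hub hierarchy alongside the dense small modules.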

Properties of the hierarchically modular model
RSMOB model
- Power law exponent γ = 2.26 (in agreement with real world metabolic networks)
- C ≈ 0.6, independent of network size (also comparable with observed real-world values)
- C(k) ~ k^(-1), as in real world networks
How to test for hierarchically arranged modules in real world networks:
- perform hierarchical clustering on the topological overlap map (we’ll cover hierarchical clustering in a few weeks…)
- can be done with Pajek

Topological overlap
[Figure panels: A: network consisting of nested modules; B: topological overlap matrix with hierarchical clustering]

Hubs may act within a module, or connect modules
- Party hub: simultaneous interactions; tends to be within the same module
- Date hub: sequential interactions; connects different modules
Han et al, Nature 430, 88 (2004)
slide after Reka Albert

some matching motifs frequently overlap (e.g. feed forward loop) Zhang et al, J. Biol 4, 6 (2005)