Dynamic Networks: How Networks Change with Time? Vahid Mirjalili CSE 891
Overview Introduction Methodology – DHAC: clustering in a single snapshot – MATCH-EM: cluster matching across time frames Results Discussion Further improvement
Motivation To infer the dynamic state of a cell in response to physiological changes Two algorithms used: DHAC: Dynamic Hierarchical Agglomerative Clustering, for clustering time-evolving networks MATCH-EM: for matching corresponding clusters across time-points
Background Current biological networks are static Experimental methods: Protein abundance (mass spec.) (mainly available for highly abundant proteins) Transcript abundance (more readily available) Previous work: combining transcript abundance and interaction networks to create a moving picture of the cell
Dynamic Networks Probabilistic framework The number of proteins can increase or decrease at each time-point Proteins can switch interacting partners Complexes can grow or shrink Reveals temporal regulation of the cell's protein state
HAC: Hierarchical Agglomerative Clustering Agglomerative = “bottom up” approach Divisive = “top down” approach
HAC Features Maximizes the likelihood of a hierarchical stochastic block model Automatic selection of model size Multi-scale networks Outperforms other methods in link prediction Extending HAC to dynamic networks: How do complexes inferred at one time point correspond to those at other time points? Transitions of a protein require dynamic coupling between network snapshots
DHAC: Convert likelihood modularity from maximum likelihood to fully Bayesian statistics Kernelize likelihood modularity with an adaptive bandwidth to couple network clusters at different time points
Dynamic Network Clustering {G(t) = (V(t), E(t)), t = 1..T} V: proteins E: (undirected, unweighted) protein-protein interactions Goal: find the stochastic block models {M(t), t = 1..T} M(t): network generative model for G(t) Introducing coupling between time points improves dynamic network clustering
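The snapshot notation above can be made concrete with a small sketch. It assumes each G(t) is stored as a pair (V, E) with undirected edges held as frozensets; the helper name `make_snapshot` and the toy proteins A to D are hypothetical, not from the slides.

```python
def make_snapshot(edge_list):
    """Return (V, E) for one time point from a list of (u, v) pairs.

    Edges are undirected and unweighted, so each is stored as a frozenset
    and duplicates such as (A, B) and (B, A) collapse into one edge.
    """
    edges = {frozenset((u, v)) for u, v in edge_list if u != v}
    vertices = {p for e in edges for p in e}
    return vertices, edges


# Hypothetical toy data: three snapshots over proteins A-D.
snapshots = [
    make_snapshot([("A", "B"), ("B", "C")]),
    make_snapshot([("A", "B"), ("B", "A"), ("C", "D")]),
    make_snapshot([("C", "D")]),
]

# (|V(t)|, |E(t)|) per time point: the vertex set itself changes over time.
sizes = [(len(V), len(E)) for V, E in snapshots]
```

Keeping V(t) separate per snapshot reflects the point that the number of proteins can change between time points.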
DHAC: notation Probability of a structure model M Probability that a vertex is in cluster k
Merging Clusters To merge clusters 1 and 2 into 1’: Maximum likelihood Bayesian
Kernelization Kernel reweighting: to couple nearby snapshots
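A minimal sketch of kernel reweighting, assuming a Gaussian kernel with a fixed bandwidth (the slide does not state the kernel form, and DHAC uses an adaptive bandwidth). The function names are hypothetical; the idea shown is only that statistics from nearby snapshots contribute more than distant ones.

```python
import math


def kernel_weights(t, T, bandwidth):
    """Normalized Gaussian weights of snapshots 0..T-1 as seen from time t.

    The Gaussian form is an assumption made for illustration.
    """
    raw = [math.exp(-0.5 * ((s - t) / bandwidth) ** 2) for s in range(T)]
    total = sum(raw)
    return [w / total for w in raw]


def smoothed_counts(counts, bandwidth):
    """Kernel-weighted average of a per-snapshot count at every time point."""
    T = len(counts)
    return [
        sum(w * c for w, c in zip(kernel_weights(t, T, bandwidth), counts))
        for t in range(T)
    ]


w = kernel_weights(2, 5, bandwidth=1.0)
smooth = smoothed_counts([0, 0, 10, 0, 0], bandwidth=1.0)
```

A small bandwidth keeps snapshots nearly independent; a large one pools them toward a single static network.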
DHAC Algorithm
for t = 1:T do
  Set each vertex to be a single cluster
  Initialize the cumulative model comparison score
  Compute merging scores of pairs having an edge or a shared neighbor
  repeat
    Pick the pair i, j with the maximum merging score
    Update scores of affected pairs after merging i, j
    Merge i, j into i'
    Compute merging scores of i', j for all j that share an edge or a neighbor with i'
    Update the cumulative score
  until no pairs are left
  Output the model at which the cumulative score was maximum
end for
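The greedy loop for a single snapshot can be sketched as follows. This is not the DHAC score: Newman's modularity gain is swapped in as a stand-in for the Bayesian model comparison score, only edge-connected pairs are candidates (shared-neighbor pairs are omitted), and all scores are recomputed each round instead of being updated incrementally. The names `greedy_agglomerate` and `modularity_gain` are hypothetical.

```python
from itertools import combinations


def modularity_gain(a, b, edges):
    """Newman modularity gain of merging clusters a and b (stand-in score)."""
    m = len(edges)
    links = sum(1 for e in edges if len(e & a) == 1 and len(e & b) == 1)
    deg_a = sum(len(e & a) for e in edges)  # total degree of cluster a
    deg_b = sum(len(e & b) for e in edges)  # total degree of cluster b
    return links / m - deg_a * deg_b / (2 * m * m)


def greedy_agglomerate(vertices, edges, merge_score):
    """Merge greedily; return the partition where the cumulative score peaked."""
    clusters = [frozenset([v]) for v in vertices]
    cumulative, best_score, best = 0.0, 0.0, list(clusters)
    while len(clusters) > 1:
        # Candidate pairs: clusters joined by at least one edge.
        candidates = [
            (merge_score(a, b, edges), a, b)
            for a, b in combinations(clusters, 2)
            if any(frozenset((u, v)) in edges for u in a for v in b)
        ]
        if not candidates:
            break
        score, a, b = max(candidates, key=lambda c: c[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        cumulative += score
        if cumulative > best_score:
            best_score, best = cumulative, list(clusters)
    return best


# Toy graph: two triangles joined by the single bridge edge 3-4.
toy_edges = {frozenset(p) for p in
             [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]}
partition = greedy_agglomerate(range(1, 7), toy_edges, modularity_gain)
```

Merging continues to a single cluster, but the returned partition is the one at the score maximum, which is how the model size is selected automatically.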
Cluster Matching Algorithm Searching through time-frames to see how complexes evolve Goal: to find the most probable matching of cluster i to a global index k
Results ● Drosophila development (gene expression data available) DHAC-local: variable bandwidth DHAC-const: constant bandwidth
Yeast Metabolic Cycle
Yeast Results Yeast results identify protein complexes with asynchronous gene expression 31 dynamic protein complexes were recovered Many of the complexes have cluster-specific gene-ontology terms with P-value < 0.05 Some of the complexes disappear and then reappear across time-points
Discussion DHAC scales as O(EJ ln(V)) Networks with 2000 vertices take up to 5 min. A full genome network (10000 to vertices) can be analyzed in a day or a week This method permits proteins to switch between complexes over time A natural multi-scale view: complexes, sub-complexes and proteins
Further improvement Information from pathway to complex to sub-complex to finer structures could be used The approach lacks a method to match the dynamically evolving hierarchical structures across snapshots The authors focused only on the bottom-level complexes, rather than the hierarchical structure
MATCH-EM Goal: Match similar groups across time-points Find the mapping of each cluster to a global index There is one and only one global index for cluster i The probability that vertex u is in global index k The assignment matrix
The matching probability under consistent indexing Number of shared vertices between cluster i at time t, and cluster j at time t+1 Probability that a vertex can make a transition from k to k’ between two consecutive snapshots
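The shared-vertex counts between consecutive snapshots are straightforward to compute. The sketch below also row-normalizes them into empirical transition fractions; this is only a simple illustration for one pair of snapshots, not the EM estimate under consistent global indexing that MATCH-EM performs. Function names and toy clusters are hypothetical.

```python
def shared_vertex_counts(clusters_t, clusters_next):
    """counts[i][j] = vertices in both cluster i at t and cluster j at t+1."""
    return [[len(ci & cj) for cj in clusters_next] for ci in clusters_t]


def transition_fractions(counts):
    """Row-normalize counts: fraction of cluster i's vertices moving to j."""
    rows = []
    for row in counts:
        total = sum(row)
        rows.append([c / total if total else 0.0 for c in row])
    return rows


# Hypothetical clusters: protein C moves from the first complex to the second.
clusters_t = [{"A", "B", "C"}, {"D", "E"}]
clusters_next = [{"A", "B"}, {"C", "D", "E"}]

counts = shared_vertex_counts(clusters_t, clusters_next)
fractions = transition_fractions(counts)
```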
Update:
Experimental Data Combining gene expression time series with static protein interaction networks The presence of a protein is assumed to be related to the abundance of the corresponding transcript at a nearby time N x T matrix: transcription levels of N genes across T time points The dynamics of the networks are generated from the transcription matrix, under the assumption that proteins in a complex have correlated gene expression profiles
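One simple way to picture the construction, as a sketch only: treat a protein as present at time t when its transcript level reaches a threshold, and keep a static interaction when both partners are present. The hard threshold is an assumption made for illustration; the slides describe a softer relation based on nearby times and correlated expression profiles. The function name and toy values are hypothetical.

```python
def dynamic_snapshots(expression, static_edges, threshold):
    """Build one (V, E) snapshot per time point.

    expression: dict mapping protein -> list of T transcript levels.
    static_edges: set of frozenset({u, v}) from the static interaction network.
    threshold: assumed presence cutoff (not taken from the slides).
    """
    T = len(next(iter(expression.values())))
    snapshots = []
    for t in range(T):
        present = {p for p, levels in expression.items() if levels[t] >= threshold}
        # Keep an interaction only if both partners are present at time t.
        edges = {e for e in static_edges if e <= present}
        snapshots.append((present, edges))
    return snapshots


expression = {"A": [5.0, 5.0, 0.0], "B": [5.0, 0.0, 0.0], "C": [0.0, 5.0, 5.0]}
static_edges = {frozenset(("A", "B")), frozenset(("A", "C"))}
toy = dynamic_snapshots(expression, static_edges, threshold=1.0)
```

In the toy data, protein A switches partner from B to C between the first two time points.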
Results: Held-out link prediction Randomly select two connected vertices u and v, and remove the edge between them After clustering, vertex u is assigned to cluster i, and vertex v to cluster j The maximum-likelihood probability that u and v were connected:
AUPRC: area under the precision-recall curve AUROC: area under the receiver operating characteristic curve (true-positive rate versus false-positive rate)
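For a plain Bernoulli stochastic block model, the maximum-likelihood link probability between two clusters is the fraction of possible pairs that are actually linked. The sketch below assumes that standard form, since the formula on the slide did not survive extraction; the function name and toy data are hypothetical.

```python
def ml_link_probability(group_of, edges, u, v):
    """Linked pairs divided by possible pairs between the clusters of u and v."""
    gi, gj = group_of[u], group_of[v]
    n_i = sum(1 for g in group_of.values() if g == gi)
    n_j = sum(1 for g in group_of.values() if g == gj)
    # Within one cluster count unordered pairs; across clusters count n_i * n_j.
    possible = n_i * (n_i - 1) // 2 if gi == gj else n_i * n_j
    linked = sum(
        1 for e in edges
        if len(e) == 2 and {group_of[x] for x in e} == {gi, gj}
    )
    return linked / possible if possible else 0.0


# Toy clustering; the true edge B-C has been held out of `edges`.
group_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
edges = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("C", "D")]}

p_held_out = ml_link_probability(group_of, edges, "B", "C")
p_non_edge = ml_link_probability(group_of, edges, "A", "E")
```

A good clustering gives held-out true edges a higher probability than random non-edges, which is what the ranking metrics then measure.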
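Both metrics can be computed directly from scored vertex pairs. The sketch below uses the pairwise-ranking form of AUROC and average precision as one common estimator of AUPRC; it assumes at least one positive and one negative label and does no special handling of tied scores in the precision-recall part.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda sy: -sy[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical link scores; label 1 marks a held-out true edge.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
roc = auroc(scores, labels)
ap = average_precision(scores, labels)
```

Because true edges are rare among all vertex pairs, AUPRC is usually the more informative of the two for link prediction.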
Yeast Metabolic Cycle Three dominant metabolic states: 1. Reductive Building (RB): 977 genes 2. Reductive Charging (RC): 1510 genes 3. Oxidative (OX): 1023 genes 36 snapshots Preprocessing: iterative degree cutoff, reducing the number of proteins from 1380 to 480±14
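The slide does not define the iterative degree cutoff. One plausible reading, sketched here as an assumption, is k-core style pruning: repeatedly remove vertices whose degree falls below a cutoff until none remain. The function name and the cutoff value are hypothetical.

```python
def iterative_degree_cutoff(vertices, edges, min_degree):
    """Drop vertices with degree < min_degree until the network is stable.

    edges: set of frozenset({u, v}) with both endpoints in `vertices`.
    """
    vertices, edges = set(vertices), set(edges)
    while True:
        degree = {v: 0 for v in vertices}
        for e in edges:
            for v in e:
                degree[v] += 1
        low = {v for v, d in degree.items() if d < min_degree}
        if not low:
            return vertices, edges
        vertices -= low
        # Removing a vertex also removes its edges, which may expose new
        # low-degree vertices in the next pass.
        edges = {e for e in edges if not (e & low)}


# Toy network: triangle A-B-C with a dangling chain C-D-E.
toy_edges = {frozenset(p) for p in
             [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]}
kept_vertices, kept_edges = iterative_degree_cutoff("ABCDE", toy_edges, min_degree=2)
```

The chain is stripped in two passes (E first, then D), which is why the cutoff has to be applied iteratively.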
Macro-view of YMC RB phase OX phase RC phase
Micro-views of YMC dynamics Cluster #7: mitochondrial ribosome complex 1. RSMs: ribosomal small subunits of mitochondria 2. MRPs: mitochondrial ribosomal proteins RSM22 is active at t=9, 20 & 32, while other proteins are not transcribed Methylation of the 3’-end of the rRNA of the small mitochondrial subunit is required for the assembly and stability of the mitochondrial ribosome Deleting RSM22 yields a viable cell with non-functional mitochondria Hypothesis: Early expression of RSM22 provides the methylation activity required for the assembly of the small subunits of the mitochondrial ribosome
Cluster #7: mitochondrial ribosomal complex Average expression levels during the three main phases
Micro-views of YMC dynamics Cluster #16: nuclear pore Active at t=9, 20 & 32 Most genes are OX-responsive Combines with subunits of other complexes The co-expressed cores: – Nuclear pore complex (NPC) – Karyopherin proteins (KAP)
Cluster #16: nuclear pore complex During the OX phase, SRP1 and SXM1 are additionally recruited
What did we learn from YMC? RRP4 and RRP42 are part of the exosome, which edits RNA molecules; they transition between the nuclear pore and other complexes RNA processing is tightly coupled to transport through the nuclear pore to the cytoplasm Dynamic reorganization of the nuclear pore occurs during the metabolic cycle