BCB 570 Spring Protein-Protein Interaction Networks & Methods. Julie Dickerson, Electrical and Computer Engineering
BCB 570 Spring Outline: data for protein-protein interaction networks; brief review of network concepts for network analysis; effect of different data sets; biological network comparison.
BCB 570 Spring Two-hybrid system: A protein of interest, referred to as the "bait," is fused to a DNA-binding domain (DBD). A separate protein, called the "prey," is fused to a transcriptional activation domain (AD). If these two proteins (the bait and prey) interact, a reporter gene is transcribed. In general, the method is used for initial identification of interacting proteins, not for detailed characterization of the interaction.
BCB 570 Spring Domain Belief. Assumptions: A domain is a discrete functional and structural unit: it folds as a unit and carries out a particular function. Proteins consist of a number of these domains, laid out in a linear array along the polypeptide chain. The properties of a domain are basically the same when the unit is put into a different context (such as a hybrid protein, for instance in the two-hybrid system). Limitations: Not all proteins have a domain structure. In many proteins, domains exist but include portions of the polypeptide from different parts of the chain; for example, a domain might be composed of two separate stretches of residues. Properties of a domain may change when it is taken out of the context of the intact protein; e.g., some proteins contain "autoinhibitory" regions.
BCB 570 Spring Co-Immunoprecipitation (co-IP): To find out what binds a protein, the protein itself is used as an affinity reagent to isolate its binding partners. Compared with two-hybrid and chip-based approaches, this strategy has the advantage that the fully processed and modified protein serves as bait.
BCB 570 Spring Proteome Mass Spectrometry
BCB 570 Spring Problems: noisy data; many weak associations; self-activators; contaminants; molecules are highly connected.
BCB 570 Spring Approach: get more evidence from physical interactions, synthetic lethality, co-citation, co-expression, and the literature.
BCB 570 Spring MIPS Database GDA1p
BCB 570 Spring PIR Database
BCB 570 Spring DIP [figure: interaction graph with nodes GDA1p, YEL017W, YBR161W, YJL152W, ALD5p, Ssp120p, HPA2p]
BCB 570 Spring Biogrid.org
BCB 570 Spring Analyzing P-P interaction networks: (1) create networks; (2) find structure in networks, searching for modules or motifs; (3) analyze results using known databases, functional enrichment, expression data, organelle information, etc.
BCB 570 Spring Giot et al., "A protein interaction map of Drosophila melanogaster," Science 2003 Dec 5;302(5651). Epub 2003 Nov 6.
BCB 570 Spring Jonsson, P. F. et al., Bioinformatics, doi: /bioinformatics/btl390. A description of the protein communities identified by k-clique cluster analysis (k = 6).
BCB 570 Spring Find structure: use cliques or highly connected regions in a network. Clique Percolation Method (CPM; see Derényi et al., 2005): locates the k-clique percolation clusters of the network. MCL (Markov Cluster Algorithm): based on simulation of (stochastic) flow in graphs; see Enright A.J., Van Dongen S., Ouzounis C.A., "An efficient algorithm for large-scale detection of protein families," Nucleic Acids Research 30(7) (2002).
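As a rough illustration of the k-clique percolation idea, the sketch below enumerates all k-cliques of a toy graph by brute force and merges cliques that share k-1 nodes. The function name `k_clique_communities` and the toy edge list are hypothetical, and brute-force enumeration is only practical for very small graphs; published CPM implementations use far more efficient clique finding.

```python
from itertools import combinations

def k_clique_communities(edges, k):
    """Return k-clique percolation communities as a list of frozensets of nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Brute-force enumeration of k-cliques (toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: two k-cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all nodes in one connected group of cliques.
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return [frozenset(g) for g in groups.values()]

# Hypothetical toy network: triangles A-B-C and B-C-D overlap on two nodes,
# triangle E-F-G is attached only through the single edge D-E.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
communities = k_clique_communities(toy_edges, k=3)
```

With k = 3, the two overlapping triangles percolate into one community, while the triangle reached only through a single edge stays separate.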
BCB 570 Spring Method: MCL. Cluster definition: natural clusters in a graph are characterised by the presence of many edges between the members of the cluster, so one expects the number of 'higher-length' (longer) paths between two arbitrary nodes in the cluster to be high, and random walks on the graph rarely go from one natural cluster to another. The MCL algorithm finds cluster structure in graphs by deterministically computing (the probabilities of) random walks through the similarity graph, using two operators that transform one set of probabilities into another. It uses the language of stochastic matrices (also called Markov matrices) to capture the mathematical concept of random walks on a graph. Expansion corresponds to taking a power of the stochastic matrix using the normal matrix product, which gives the probabilities of longer random walks between nodes. Inflation corresponds to taking the Hadamard (entrywise) power of the matrix and then rescaling each column so that it again sums to 1: $(\Gamma_r M)_{pq} = (M_{pq})^r / \sum_i (M_{iq})^r$.
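The alternation of expansion and inflation can be sketched in a few lines of plain Python. This is a minimal illustration, not the reference MCL implementation: the added self-loops, the inflation value of 2, the fixed iteration count, and the `eps` cutoff used to read off clusters are all assumptions chosen for the toy example.

```python
def normalize_columns(m):
    """Rescale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def expand(m):
    """Expansion: square the matrix with the ordinary matrix product."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation: entrywise (Hadamard) power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])

def mcl(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    n = len(adjacency)
    # Adding self-loops is a common convention (an assumption of this sketch).
    m = normalize_columns([[adjacency[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Rows that keep non-negligible mass act as attractors; the columns
    # they attract form one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > eps)
        if members:
            clusters.add(members)
    return clusters

# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adjacency[u][v] = adjacency[v][u] = 1.0
clusters = mcl(adjacency)
```

Raising the inflation parameter tends to give finer clusters, and lowering it tends to give coarser ones.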
BCB 570 Spring Example
BCB 570 Spring Adding in Transcriptional Interactions: ChIP-chip with whole-genome microarrays determines the range of in vivo DNA binding sites for any given protein. Map protein complexes (interacting proteins and their ...). Map co-regulated complexes within and across species.
BCB 570 Spring Approach: Cross Species. Roded Sharan & Trey Ideker, "Modeling cellular machinery through biological network comparison," Nature Biotechnology 24 (2006).
BCB 570 Spring Network Alignment Why is this hard?
BCB 570 Spring PathBLAST: Identifies pairs of interaction paths, drawn from the networks of different species or from different processes within a species. Proteins at equivalent path positions must share strong sequence homology. The score sums the sequence-alignment terms for the aligned proteins and the probability terms for the interactions, each ideally compared against a null model.
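One way to read the scoring rule above is as a sum of log-odds terms, one per aligned protein pair and one per aligned interaction. The sketch below assumes that form; the function name, the parameter names, and all probability values are hypothetical placeholders, not values taken from PathBLAST itself.

```python
import math

def path_alignment_score(homology_probs, interaction_probs,
                         p_background, q_background):
    """Log-odds score of an aligned path pair (illustrative form only).

    homology_probs:    probability of true homology at each aligned position
    interaction_probs: probability that each aligned interaction is real
    p_background, q_background: the same quantities expected under the null
    """
    node_term = sum(math.log10(p / p_background) for p in homology_probs)
    edge_term = sum(math.log10(q / q_background) for q in interaction_probs)
    return node_term + edge_term

# Hypothetical aligned path with three positions and two interactions.
score = path_alignment_score(
    homology_probs=[0.9, 0.8, 0.95],
    interaction_probs=[0.7, 0.6],
    p_background=0.1,
    q_background=0.2,
)
```

A positive score means the aligned path is better supported than the null model would predict; comparing each term to a background keeps long paths from scoring well merely by being long.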
BCB 570 Spring Algorithms for Network Alignment Scoring: measure similarity of each subnetwork to a predefined structure of interest and the level of conservation of the subnetwork across networks being compared. Search procedures: find conserved subnetworks of interest.
BCB 570 Spring Edit-Distance Methods (evolution-based). M: the set of matches, determined by orthology relationships between pairs of proteins. N: the set of mismatched interactions, i.e., matched protein pairs where only one of the pairs interacts. D: the union of the sets of duplicated protein pairs within each network.
BCB 570 Spring Fit to a desired structure (maximum likelihood): Compute a log-likelihood ratio that measures fit to an ideal structure vs. the chance that the subnetwork is observed at random (null hypothesis). The ratios are summed over aligned subnetworks to give the overall score.
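BCB 570 Spring Model of Protein Complex: Each pair of proteins in the complex interacts with high probability (written $\beta$ here), independently of other protein pairs. Null: every two proteins interact with a probability that depends on their node degrees, $p(u,v)$. The likelihood ratio that a set of proteins, $C$, with interactions $E(C)$ forms a complex is: $L(C) = \prod_{(u,v) \in E(C)} \frac{\beta}{p(u,v)} \prod_{(u,v) \notin E(C)} \frac{1-\beta}{1-p(u,v)}$, where the products run over pairs of proteins in $C$.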
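A small sketch of how such a complex score might be computed in log form, assuming the complex model gives every protein pair the same interaction probability (called `beta` here, a name chosen for this example) and the null model supplies a degree-dependent probability p(u, v). The particular null used below, d_u * d_v / (2 * total_edges) capped at 0.99, and the degree values are illustrative assumptions only.

```python
import math
from itertools import combinations

def complex_log_likelihood_ratio(proteins, interactions, null_prob, beta=0.9):
    """Sum of log ratios (complex model vs. null) over all protein pairs.

    proteins:     iterable of protein names (the candidate complex C)
    interactions: set of frozenset({u, v}) pairs observed to interact, E(C)
    null_prob:    function (u, v) -> interaction probability under the null
    beta:         assumed interaction probability inside a true complex
    """
    score = 0.0
    for u, v in combinations(sorted(proteins), 2):
        p = null_prob(u, v)
        if frozenset((u, v)) in interactions:
            score += math.log(beta / p)
        else:
            score += math.log((1.0 - beta) / (1.0 - p))
    return score

# Hypothetical degrees and network size for a degree-based null model.
degree = {"A": 3, "B": 3, "C": 2}
total_edges = 10

def degree_null(u, v):
    return min(0.99, degree[u] * degree[v] / (2.0 * total_edges))

observed = {frozenset(("A", "B")), frozenset(("A", "C"))}
llr = complex_log_likelihood_ratio(["A", "B", "C"], observed, degree_null)
```

Observed interactions between low-degree proteins contribute most to the score, because the null model considers them unlikely; each missing interaction is penalized.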
BCB 570 Spring Network Queries
BCB 570 Spring Searching. Greedy search: start from a promising seed network, then refine it by local search using an editing approach (adding or deleting a protein). Works well for defined graph structures such as paths or trees.
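The seed-and-refine idea can be sketched as a simple hill climb: from the current protein set, try every single addition of a neighboring protein and every single deletion, keep the best-scoring edit, and stop when no edit improves the score. The scoring function `toy_score` (+1 per interacting pair, -1 per non-interacting pair) and the toy network are hypothetical stand-ins for a real likelihood-based score.

```python
from itertools import combinations

def toy_score(nodes, adj):
    """Toy density score: +1 for each interacting pair, -1 otherwise."""
    return sum(1 if v in adj[u] else -1
               for u, v in combinations(sorted(nodes), 2))

def greedy_refine(seed, adj, score=toy_score, min_size=2):
    """Greedy local search by single-protein additions and deletions."""
    current = set(seed)
    best = score(current, adj)
    while True:
        # Candidate additions: proteins adjacent to the current set.
        neighbors = set().union(*(adj[u] for u in current)) - current
        candidates = [current | {v} for v in sorted(neighbors)]
        # Candidate deletions, keeping at least min_size proteins.
        if len(current) > min_size:
            candidates += [current - {u} for u in sorted(current)]
        scored = [(score(c, adj), c) for c in candidates]
        if not scored:
            return current, best
        top_score, top = max(scored, key=lambda t: t[0])
        if top_score <= best:
            return current, best  # local optimum reached
        current, best = top, top_score

# Hypothetical network: clique A-B-C-D with a pendant protein E attached to D.
pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("D", "E")]
adj = {}
for u, v in pairs:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

refined, refined_score = greedy_refine({"A", "B"}, adj)
```

Because the score must strictly increase at every step, the loop always terminates, though only at a local optimum; this is why the choice of seed matters.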
BCB 570 Spring Network Evolution