BCB 570 Spring 2008. Protein-Protein Interaction Networks & Methods. Julie Dickerson, Electrical and Computer Engineering.

Presentation transcript:

Outline. Data for protein-protein interaction networks. Brief review of network concepts for network analysis. Effect of different data sets. Biological network comparison.

Two-hybrid system. A protein of interest, referred to as the "bait," is fused to a DNA-binding domain (DBD). A separate protein, called the "prey," is fused to a transcriptional activation domain. If the bait and prey interact, a reporter gene is transcribed. In general, the method is used for initial identification of interacting proteins, not for detailed characterization of the interaction.

Domain belief.
Assumptions: A domain is a discrete functional and structural unit: it folds as a unit and carries out a particular function. Proteins consist of a number of these domains, laid out in a linear array along the polypeptide chain. The properties of a domain are basically the same when the unit is put into a different context (such as a hybrid protein, for instance in the two-hybrid system).
Limitations: Not all proteins have a domain structure. In many proteins, domains exist but include portions of the polypeptide from different parts of the chain; for example, a domain might be composed of two non-adjacent stretches of residues. The properties of a domain may change when it is taken out of the context of the intact protein; for example, some proteins contain "autoinhibitory" regions.

Co-immunoprecipitation (co-IP). To find out what binds a protein, the protein itself is used as an affinity reagent to isolate its binding partners. Compared with two-hybrid and chip-based approaches, this strategy has the advantage that the fully processed and modified protein serves as bait.

Proteome Mass Spectrometry

Problems: noisy data; many weak associations; self-activators; contaminants; molecules are highly connected.

Approach: get more evidence. Sources: physical interactions, synthetic lethality, co-citation, co-expression, literature.

MIPS Database: GDA1p

PIR Database

DIP (figure labels: GDA1p, YEL017W, YBR161W, YJL152W, ALD5p, Ssp120p, HPA2p)

Biogrid.org

Analyzing P-P interaction networks. Create networks. Find structure in networks: search for modules or motifs. Analyze results using known databases, functional enrichment, expression data, organelle information, etc.
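The first step, creating a network, can be sketched in a few lines of Python. This is only an illustration: the function names `build_network` and `degrees` and the interaction list are made up for the example and do not come from any of the databases named in these slides.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # this sketch ignores self-interactions
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degrees(adjacency):
    """Number of distinct interaction partners for each protein."""
    return {protein: len(partners) for protein, partners in adjacency.items()}


# Hypothetical interaction list; ("B", "A") repeats ("A", "B") on purpose.
example_pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "A")]
network = build_network(example_pairs)

# Rank proteins by degree as a first look at which nodes are highly connected.
hubs = sorted(degrees(network).items(), key=lambda kv: (-kv[1], kv[0]))
```

Storing neighbours in sets makes duplicate reports of the same interaction harmless, which matters when several noisy data sets are merged.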

Giot et al. A protein interaction map of Drosophila melanogaster. Science 2003 Dec 5;302(5651). Epub 2003 Nov 6.

Jonsson, P. F. et al. Bioinformatics; doi: /bioinformatics/btl390. A description of the protein communities identified by k-clique cluster analysis (k = 6).

Find structure. Use cliques or highly connected regions in a network. The Clique Percolation Method (CPM; see Derényi et al., 2005) locates the k-clique percolation clusters of the network. MCL, the Markov Cluster Algorithm, is based on simulation of (stochastic) flow in graphs: Enright A.J., Van Dongen S., Ouzounis C.A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research 30(7) (2002).
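A minimal sketch of the clique percolation idea, assuming the usual definition that two k-cliques are adjacent when they share k-1 nodes and that a community is the union of a chain of adjacent k-cliques. It enumerates cliques by brute force, so it only suits toy graphs; the helper names and the example edges are hypothetical.

```python
from itertools import combinations


def to_adjacency(edges):
    """Undirected adjacency map from a list of (u, v) edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def k_clique_communities(adjacency, k):
    """Return the k-clique percolation communities as a list of node sets."""
    nodes = sorted(adjacency)
    # Brute-force enumeration of all k-cliques (small graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(v in adjacency[u] for u, v in combinations(c, 2))
    ]
    # Union-find over cliques: merge cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)


# Hypothetical graph: two triangles sharing edge B-C, a bridge D-E,
# and a separate triangle E-F-G.
example_edges = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G"),
]
communities = k_clique_communities(to_adjacency(example_edges), 3)
```

With k = 3 the bridge D-E belongs to no triangle, so it does not join the two communities. Production use would need a proper maximal-clique enumeration rather than testing every k-subset.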

Method: MCL. Cluster definition: natural clusters in a graph are characterised by the presence of many edges between the members of the cluster, and one expects the number of 'higher-length' (longer) paths between two arbitrary nodes in the cluster to be high. Random walks on the graph rarely go from one natural cluster to another. The MCL algorithm finds cluster structure in graphs by deterministically computing (the probabilities of) random walks through the similarity graph, using two operators that transform one set of probabilities into another. It uses the language of stochastic matrices (also called Markov matrices) to capture the mathematical concept of random walks on a graph. Expansion coincides with taking the power of a stochastic matrix using the normal matrix product, which gives the probabilities of longer random walks between nodes. Inflation corresponds to taking the Hadamard (entrywise) power of the matrix, followed by rescaling so that each column again sums to 1.

Example

Adding in transcriptional interactions. ChIP-chip with whole-genome microarrays determines the range of in vivo DNA binding sites for any given protein. Map protein complexes (interacting proteins and their ...). Map co-regulated complexes within and across species.

Approach: cross-species comparison. Roded Sharan & Trey Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24 (2006).

Network alignment: why is this hard?

PathBLAST. Identifies pairs of interaction paths, drawn from the networks of different species or from different processes within a species. Proteins at equivalent path positions must share strong sequence homology. The score is a sum of the alignment scores plus the probabilities of the interactions, ideally compared to a null set.

Algorithms for network alignment. Scoring: measure the similarity of each subnetwork to a predefined structure of interest and the level of conservation of the subnetwork across the networks being compared. Search procedures: find conserved subnetworks of interest.


Edit-distance methods (evolution-based). M: the set of matches, determined by orthology relationships between pairs of proteins. N: the set of mismatched interactions, that is, matched protein pairs where only one of the pairs interacts. D: the union of the sets of duplicated protein pairs within each network.

Fit to a desired structure (maximum likelihood). Compute a log-likelihood ratio that measures the fit to an ideal structure against the chance that the subnetwork is observed at random (the null hypothesis). The ratios are summed over the aligned subnetworks to give an overall score.

Model of a protein complex. Within a complex, each pair of proteins interacts with a high probability β, independently of other protein pairs. Null model: every two proteins u and v interact with a probability p(u,v) that depends on their node degrees. A set of proteins C with interactions E(C) is scored by the likelihood that it forms a complex under this model.
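One common way to turn this model into a score, assumed here rather than taken from the slide, is a likelihood ratio over all protein pairs in C. Writing β for the within-complex interaction probability, each interacting pair contributes β / p(u,v) and each non-interacting pair contributes (1 - β) / (1 - p(u,v)); taking logs turns the product into a sum. The sketch below computes that sum. The function names, the value β = 0.9, and the degree-based null `degree_null` (a simple product-of-degrees approximation) are all illustrative assumptions.

```python
from itertools import combinations
from math import log


def degree_null(degree, total_edges):
    """Hypothetical null model: p(u, v) grows with the product of degrees."""
    def p(u, v):
        return min(0.99, degree[u] * degree[v] / (2.0 * total_edges))
    return p


def complex_log_likelihood_ratio(proteins, interactions, null_prob, beta=0.9):
    """Sum of per-pair log-likelihood ratios: complex model vs. null model."""
    score = 0.0
    for u, v in combinations(sorted(proteins), 2):
        p = null_prob(u, v)
        if frozenset((u, v)) in interactions:
            score += log(beta / p)  # observed interaction
        else:
            score += log((1.0 - beta) / (1.0 - p))  # missing interaction
    return score


# Hypothetical candidate complex with two of its three possible interactions.
degree = {"A": 2, "B": 4, "C": 2}
null_prob = degree_null(degree, total_edges=20)
interactions = {frozenset(("A", "B")), frozenset(("B", "C"))}
score = complex_log_likelihood_ratio({"A", "B", "C"}, interactions, null_prob)
```

A positive score means the observed interactions fit the dense complex model better than the degree-based null; interactions between two hub proteins earn less credit because the null already expects them.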

Network Queries

Searching. Greedy search: start from a promising seed network and refine it by local search using an editing approach (adding or deleting a protein). Works well for defined graph structures such as paths or trees.
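The seed-and-refine loop can be sketched as follows. The scoring function is a stand-in chosen for the example (+1 for every interacting pair inside the set, -1 for every non-interacting pair), not a score from the lecture; in practice a likelihood-based score would take its place. The function names and the example graph are hypothetical.

```python
from itertools import combinations


def pair_score(nodes, adjacency):
    """Stand-in score: +1 per interacting pair, -1 per non-interacting pair."""
    return sum(
        1 if v in adjacency[u] else -1
        for u, v in combinations(sorted(nodes), 2)
    )


def greedy_search(seed, adjacency, max_steps=100):
    """Refine a seed set by the best single add or delete until no gain."""
    current = set(seed)
    best = pair_score(current, adjacency)
    for _ in range(max_steps):
        candidates = []
        # Additions: any neighbour of the current set.
        frontier = set().union(*(adjacency[u] for u in current)) - current
        for node in sorted(frontier):
            candidates.append(current | {node})
        # Deletions: any member, as long as the set stays non-empty.
        if len(current) > 1:
            for node in sorted(current):
                candidates.append(current - {node})
        if not candidates:
            break
        top = max(candidates, key=lambda s: pair_score(s, adjacency))
        top_score = pair_score(top, adjacency)
        if top_score <= best:
            break  # local optimum: no single edit improves the score
        current, best = top, top_score
    return current, best


# Hypothetical graph: a 4-clique A-B-C-D with a tail D-E-F.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
result = greedy_search({"A"}, adjacency)
```

Greedy refinement only finds a local optimum, so the outcome depends on the seed; running it from several seeds and keeping the best-scoring result is a common workaround.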

Network Evolution