Discovering functional interaction patterns in Protein-Protein Interaction Networks
Authors: Mehmet E Turanalp, Tolga Can
Presented By: Sandeep Kumar

Background
o Availability of genome-scale protein networks
o Understanding topological organization
o Identification of conserved subnetworks across different species
o Discovery of modules of interaction
o Prediction of functions of uncharacterized proteins
o Improvement of the accuracy of currently available networks

Aim of study
o Use the available functional annotations of proteins in a PPI network to look for overrepresented patterns of interaction in the network
o Present a new frequent pattern identification technique, PPISpan

Yeast as a model
o Why yeast genomics? A model eukaryotic organism …
o Well-known PPI network: Saccharomyces cerevisiae

PPI Network
o A protein-protein interaction is shown as an edge between two proteins, indicating a physical association in the form of modification, transport, or complex formation
o Interesting interaction patterns are conserved among species
o Patterns correspond to specific biological processes

Frequent sub-graphs
A graph (subgraph) is frequent if its support (occurrence frequency) in a given dataset is no less than a minimum support threshold.

Example: Frequent Subgraphs
[Figure: a graph dataset of three graphs (A), (B), (C) and the frequent patterns (1), (2) found with a minimum support of 2]
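The support definition can be made concrete with a small sketch. The Python below is an illustration only, restricted to single-edge patterns, and is not taken from the slides: the toy dataset, the vertex labels, and the helper names `edge` and `frequent_edges` are all hypothetical. Each graph is a set of undirected edges written as pairs of vertex labels, and the support of a pattern is the number of graphs that contain it.

```python
from collections import Counter


def edge(a, b):
    """Canonical form of an undirected labeled edge (hypothetical helper)."""
    return tuple(sorted((a, b)))


# Hypothetical toy dataset: three graphs, each a set of labeled edges.
graph_dataset = [
    {edge("A", "B"), edge("B", "C")},
    {edge("A", "B"), edge("A", "C")},
    {edge("B", "C"), edge("C", "C")},
]


def frequent_edges(graphs, min_support):
    """Return the 1-edge patterns whose support is at least min_support."""
    support = Counter()
    for graph in graphs:
        # Each graph is a set, so a pattern is counted once per graph.
        for pattern in graph:
            support[pattern] += 1
    return {p: s for p, s in support.items() if s >= min_support}


frequent = frequent_edges(graph_dataset, min_support=2)
print(frequent)
```

Larger patterns need subgraph isomorphism tests to count support, which is the part that gSpan-style algorithms organize efficiently.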

The Algorithm - PPISpan
o Based on gSpan
o Modified to adapt to PPI networks
o Candidate generation
o Frequency counting

Algorithm: PPISpan (G, L, minSup)
1. Set the vertex labels in G with GO terms from the desired GO level L
2. S <- all frequent 1-edge graphs in G in frequency-based lexicographical order
3. for each edge e in S (in ascending order of frequency) do
4.   SubGraphs (e, minSup, e)
5.   Remove e from G

Algorithm: SubGraphs (s, minSup, ext)
1. if feasible (s, ext)
2.   if the DFS code of s != its minimum DFS code
3.     return
4.   C <- generate all children of s (by growing an edge, ext)
5.   maximal <- true
6.   for each c in C (in DFS lexicographical order) do
7.     if support (c) >= minSup
8.       SubGraphs (c, minSup, c.ext)
9.       maximal <- false
10.   if maximal
11.     output s
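Steps 1 and 2 of the PPISpan listing can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' C++ implementation: the protein names, the label assignments, and the function `frequent_one_edge_patterns` are hypothetical, each protein carries a single label, and the support of a 1-edge pattern is approximated by the raw number of matching edges (overlap between embeddings is not handled here).

```python
from collections import Counter


def frequent_one_edge_patterns(edges, go_label, min_sup):
    """Label both endpoints of every edge, count the label pairs, and
    return the frequent ones in ascending order of frequency."""
    counts = Counter()
    for u, v in edges:
        # Skip interactions that involve an unannotated protein.
        if u in go_label and v in go_label:
            counts[tuple(sorted((go_label[u], go_label[v])))] += 1
    frequent = [(p, c) for p, c in counts.items() if c >= min_sup]
    # Ties in frequency are broken lexicographically on the label pair.
    return sorted(frequent, key=lambda pc: (pc[1], pc[0]))


# Hypothetical toy network and GO slim labels.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P1", "P5")]
go_label = {
    "P1": "protein binding",
    "P2": "hydrolase activity",
    "P3": "protein binding",
    "P4": "hydrolase activity",
    "P5": "transferase activity",
}

patterns = frequent_one_edge_patterns(edges, go_label, min_sup=1)
for pattern, count in patterns:
    print(pattern, count)
```

In the listing, each of these 1-edge patterns would then seed a call to SubGraphs, which grows it one edge at a time.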

Datasets used
1. Database of Interacting Proteins (DIP): data constructed from high-throughput experiments
2. STRING database: confidence-weighted predicted data
3. WI-PHI: weighted yeast interactome enriched for direct physical interactions

Gene Ontology annotations
o Used to assign functional category labels to the proteins in the PPI network
o A collaborative effort to address the need for consistent descriptions of gene products in different databases
o Provides descriptions of biological processes, cellular components, and molecular functions

GO slim terms
Provides a broad overview of the functional categories in GO

GO Slim Molecular Function Terms for S. cerevisiae
Term ID     Definition
GO:3674     molecular function unknown
GO:16787    hydrolase activity
GO:16740    transferase activity
GO:5515     protein binding
…
Total of 22 broad functional categories

Research Steps
o Label the nodes with functional categories from GO annotations
o Consider the molecular function hierarchy
o Focus on functional interaction patterns in arbitrary topologies
o Find non-overlapping embeddings using PPISpan

Problems faced
o Noise in PPI network
o False positives
o False negatives
o Accuracy and specificity of annotations of proteins

Supporting embedding
o A specific instance of the functional pattern, realized by certain proteins in the PPI network
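Counting support from embeddings can be illustrated with a short Python sketch. It assumes that "non-overlapping" means that no two counted embeddings share a protein, which the slides do not state explicitly, and it uses a simple greedy pass rather than whatever selection PPISpan actually performs. The function name and the protein identifiers are hypothetical.

```python
def non_overlapping_embeddings(embeddings):
    """Greedily keep embeddings that share no protein with one already kept.

    A greedy pass is not guaranteed to find the largest possible
    non-overlapping set; it is used here only for illustration.
    """
    used = set()
    kept = []
    for embedding in embeddings:
        proteins = set(embedding)
        if used.isdisjoint(proteins):
            kept.append(embedding)
            used |= proteins
    return kept


# Hypothetical embeddings of one 1-edge pattern (each is a pair of proteins).
embeddings = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

kept = non_overlapping_embeddings(embeddings)
support = len(kept)
print(kept, support)
```

Here the middle embedding is dropped because it shares P2 with the first one, so the pattern has a support of 2 rather than 3.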

Experiment details
o Implemented in C++
o Searched for frequent interaction patterns with support >= 15

Pattern frequency in different datasets
[Table: number of patterns found]

Observation
o Most of the patterns are trees
o Star topology is the most abundant
o Cycles are rare

Comparison with known molecular complexes and pathways
o Ignore topology and treat patterns as sets of proteins for comparison
o Molecular complexes from the MIPS (Munich Information Center for Protein Sequences) complex catalogue database
o Signaling, transport, and regulatory pathways from the KEGG database
o Use high-quality complexes

cpcount
o Average number of different complexes or pathways that the embeddings of a frequent interaction pattern overlap with
o Used to speculate on the location of interaction patterns
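A minimal Python sketch of cpcount as described on this slide follows. It makes two assumptions that the slide does not settle: an embedding "overlaps with" a complex or pathway as soon as they share one protein, and embeddings that overlap with nothing still count in the average. The module names, protein identifiers, and function signature are hypothetical.

```python
def cpcount(embeddings, modules):
    """Mean, over the embeddings of one pattern, of the number of distinct
    complexes or pathways that share at least one protein with the embedding."""
    per_embedding = [
        sum(1 for members in modules.values() if members & embedding)
        for embedding in embeddings
    ]
    return sum(per_embedding) / len(per_embedding)


# Hypothetical known complexes and pathways (name -> member proteins).
modules = {
    "complex_1": {"P1", "P2", "P3"},
    "complex_2": {"P4", "P5"},
    "pathway_1": {"P3", "P6"},
}

# Hypothetical embeddings of one pattern, each treated as a set of proteins.
embeddings = [
    {"P1", "P2", "P4"},  # touches complex_1 and complex_2
    {"P5", "P7", "P8"},  # touches complex_2 only
    {"P7", "P8", "P9"},  # touches nothing
]

score = cpcount(embeddings, modules)
print(score)
```

A value near 1 suggests that embeddings tend to sit inside a single module, while larger values suggest patterns that bridge several modules.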

cpoverlap
o Quantifies the overlap between proteins in an embedding and known complexes and pathways
o Ratio of proteins in an embedding that are members of known functional modules
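The ratio on this slide can be sketched in Python as follows. This is a direct reading of the slide, assuming that membership in any known complex or pathway counts; the original study may define the measure more finely. The module names, protein identifiers, and function signature are hypothetical.

```python
def cpoverlap(embedding, modules):
    """Fraction of the proteins in one embedding that belong to at least
    one known complex or pathway."""
    known = set().union(*modules.values())
    return len(embedding & known) / len(embedding)


# Hypothetical known complexes and pathways (name -> member proteins).
modules = {
    "complex_1": {"P1", "P2", "P3"},
    "complex_2": {"P4", "P5"},
    "pathway_1": {"P3", "P6"},
}

# Hypothetical embedding: two of its four proteins are in known modules.
embedding = {"P1", "P2", "P7", "P8"}

ratio = cpoverlap(embedding, modules)
print(ratio)
```

Because topology is ignored and an embedding is treated as a plain set of proteins, the measure reduces to a set intersection.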

Observations from comparison
o For some of the observed patterns, topology is more important than the underlying functional annotations
o Comparison of all the patterns with random patterns in terms of overlap with MIPS complexes
o Comparison of all the patterns with random patterns in terms of overlap with transport and signaling pathways

Analysis of patterns with MIPS complexes
o Selected patterns from DIP and WI-PHI networks
o Selected patterns from the STRING network
o cpoverlap of selected patterns with respect to MIPS complexes
o cpcount of selected patterns with respect to MIPS complexes

Analysis of patterns with KEGG pathways
o Selected patterns from DIP, STRING and WI-PHI networks
o cpoverlap of selected patterns with respect to transport and signaling pathways
o cpcount of selected patterns with respect to transport and signaling pathways

Some interesting functional interaction patterns
o A frequent functional interaction pattern in the DIP network
o A frequent functional interaction pattern in the WI-PHI network
o A functional interaction pattern related to the MAPK signaling pathway
o A functional interaction pattern related to the SNARE interactions in vesicular transport

Conclusions
o Proposed a new frequent pattern identification technique, PPISpan
o Utilized molecular function Gene Ontology annotations to assign non-unique labels to the proteins of a PPI network
o Identified significantly frequent functional interaction patterns
o Frequent patterns offer a new perspective into the modular organization of protein-protein interaction networks

QUESTIONS?

THANK YOU