CSCE555 Bioinformatics Lecture 18 Network Biology Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu Course page:


CSCE555 Bioinformatics Lecture 18 Network Biology Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu Course page: University of South Carolina Department of Computer Science and Engineering

Outline
Biological Networks & Databases
Background of graphs and networks
Three types of bio-network analysis
◦ Network statistics
◦ Network based functional annotation
◦ Bio-network reconstruction/inference
Summary

Why network analysis: Building models from parts lists
Systems Biology view

BIOLOGICAL NETWORKS
Networks are found in biological systems of varying scales:
1. Evolutionary tree of life
2. Ecological networks
3. Expression networks
4. Regulatory networks - genetic control networks of organisms
5. The protein interaction network in cells
6. The metabolic network in cells
… more biological networks

Examples of Biological Networks
Metabolic Networks
Signaling Networks
Transcription Regulatory Networks
Protein-Protein Interaction Networks

Signaling & Metabolic Pathway Network
A pathway can be defined as a modular unit of interacting molecules that fulfills a cellular function.
Signaling Pathway Networks
◦ In biology, a signal or biopotential is an electric quantity (voltage, current, or field strength) caused by chemical reactions of charged ions.
◦ Signaling refers to any process by which a cell converts one kind of signal or stimulus into another.
◦ The term is also used to describe the transfer of information between and within cells, as in signal transduction.
Metabolic Pathway Networks
◦ A series of chemical reactions occurring within a cell, catalyzed by enzymes, resulting in either the formation of a metabolic product to be used or stored by the cell, or the initiation of another metabolic pathway.

A Signaling Pathway Example

A Metabolic Pathway Example

Regulatory Network

Expression Network
• A network representation of genomic data.
• Inferred from genomic data, e.g. microarray.
Gene co-expression network: each node is a gene; each edge is a co-expression relationship.

Example of a PPI Network
Yeast PPI network
Nodes – proteins
Edges – interactions
The color of a node indicates the phenotypic effect of removing the corresponding protein (red = lethal, green = non-lethal, orange = slow growth, yellow = unknown).

How do we know that proteins interact? (PPI Identification methods)
Data
◦ Yeast two-hybrid assay
◦ Mass spectrometry
◦ Correlated mRNA expression
◦ Genetic interactions
Analysis
◦ Phylogenetic analysis
◦ Gene neighbors
◦ Co-evolution
◦ Gene clusters
Also see: von Mering et al., Comparative assessment of large-scale data sets of protein-protein interactions

Protein Interaction Databases
• Species-specific
◦ FlyNets - Gene networks in the fruit fly
◦ MIPS - Yeast Genome Database
◦ RegulonDB - A DataBase On Transcriptional Regulation in E. coli
◦ SoyBase
◦ PIMdb - Drosophila Protein Interaction Map database
• Function-specific
◦ Biocatalysis/Biodegradation Database
◦ BRITE - Biomolecular Relations in Information Transmission and Expression
◦ COPE - Cytokines Online Pathfinder Encyclopaedia
◦ Dynamic Signaling Maps
◦ EMP - The Enzymology Database
◦ FIMM - A Database of Functional Molecular Immunology
◦ CSNDB - Cell Signaling Networks Database

Protein Interaction Databases (continued)
• Interaction type-specific
◦ DIP - Database of Interacting Proteins
◦ DPInteract - DNA-protein interactions
◦ Inter-Chain Beta-Sheets (ICBS) - A database of protein-protein interactions mediated by interchain beta-sheet formation
◦ Interact - A Protein-Protein Interaction database
◦ GeneNet (Gene networks)
• General
◦ BIND - Biomolecular Interaction Network Database
◦ BindingDB - The Binding Database
◦ MINT - a database of Molecular INTeractions
◦ PATIKA - Pathway Analysis Tool for Integration and Knowledge Acquisition
◦ PFBP - Protein Function and Biochemical Pathways Project
◦ PIM (Protein Interaction Map)

Pathway Databases
• KEGG (Kyoto Encyclopedia of Genes and Genomes)
◦ Institute for Chemical Research, Kyoto University
• PathDB
◦ National Center for Genomic Resources
• SPAD: Signaling PAthway Database
◦ Graduate School of Genetic Resources Technology, Kyushu University
• Cytokine Signaling Pathway DB
◦ Dept. of Biochemistry, Kumamoto Univ.
• EcoCyc and MetaCyc
◦ Stanford Research Institute
• BIND (Biomolecular Interaction Network Database)
◦ UBC, Univ. of Toronto

KEGG
Pathway Database: Computerize current knowledge of molecular and cellular biology in terms of the pathway of interacting molecules or genes.
Genes Database: Maintain gene catalogs of all sequenced organisms and link each gene product to a pathway component.
Ligand Database: Organize a database of all chemical compounds in living cells and link each compound to a pathway component.
Pathway Tools: Develop new bioinformatics technologies for functional genomics, such as pathway comparison, pathway reconstruction, and pathway design.

Network Properties

Properties of networks
Small world effect
Transitivity/Clustering
Scale Free Effect
Maximum degree
Network Resilience and robustness
Mixing patterns and assortativity
Community structure
Evolutionary origin
Betweenness centrality of vertices

Biological Networks Properties
Power law degree distribution: Rich get richer
Small World: A small average path length
◦ Mean shortest node-to-node path
Robustness: Resilient to random failures, but vulnerable to targeted attacks
Hierarchical Modularity: A large clustering coefficient
◦ How many of a node's neighbors are connected to each other

Graph Terminology
Node
Edge
Directed/Undirected
Degree
Shortest Path/Geodesic distance
Neighborhood
Subgraph
Complete Graph
Clique
Degree Distribution
Hubs

Graphs
Graph G=(V,E) is a set of vertices V and edges E
A subgraph G' of G is induced by some V' ⊆ V and E' ⊆ E
Graph properties:
◦ Connectivity (node degree, paths)
◦ Cyclic vs. acyclic
◦ Directed vs. undirected
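
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the adjacency-set representation, the function names `build_graph` and `induced_subgraph`, and the toy edge list with placeholder protein names are all assumptions made for the example. The subgraph shown is the vertex-induced kind (keep a vertex subset and every edge between kept vertices).

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a dict mapping each vertex to its neighbor set."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def induced_subgraph(adj, keep):
    """Return the subgraph induced by the vertex subset `keep`."""
    keep = set(keep)
    return {u: adj[u] & keep for u in adj if u in keep}


# Hypothetical toy interaction list; the names are placeholders, not real proteins.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
graph = build_graph(edges)

# Node degree = number of neighbors.
degree = {u: len(nbrs) for u, nbrs in graph.items()}

# Keep only A, B, C: the edge C-D disappears with D.
sub = induced_subgraph(graph, ["A", "B", "C"])
```

The same dict-of-sets layout is convenient for the other network measures in this lecture, since neighbor lookups and edge tests are both constant time on average.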

Network Measures
Degree k_i
Degree distribution P(k)
Mean path length
Network Diameter
Clustering Coefficient

Network Analysis
Paths: metabolic, signaling pathways
Cliques: protein complexes
Hubs: regulatory modules
Subgraphs: maximally weighted

Sparse vs Dense Graphs
G(V, E) where |V| = n and |E| = m are the numbers of vertices and edges
Graph is sparse if m ~ n
Graph is dense if m ~ n^2
Complete graph (simple, undirected) when m = n(n-1)/2, which grows as n^2
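
One way to make "sparse" and "dense" concrete is to compare m with the largest number of edges a simple undirected graph on n vertices can have, n(n-1)/2. The helper below is a sketch under that assumption; the name `density` is illustrative and not from the slides. The 69-vertex, 71-edge figures are the ones quoted on the connected-components slide of this lecture.

```python
def density(n, m):
    """Fraction of possible undirected edges that are present: m / (n(n-1)/2)."""
    if n < 2:
        return 0.0
    return 2 * m / (n * (n - 1))


# A graph with 69 vertices and 71 edges has m ~ n, so it is sparse.
sparse_example = density(69, 71)

# A complete graph on 4 vertices has 4*3/2 = 6 edges, so its density is 1.
complete_example = density(4, 6)
```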

Connected Components G(V,E) |V| = 69 |E| = 71

Connected Components G(V,E) |V| = 69 |E| = 71 6 connected components
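
Connected components can be found with a breadth-first search started from every vertex not yet visited. The sketch below assumes the dict-of-neighbor-sets representation and uses a small made-up graph, not the 69-vertex network from the slide.

```python
from collections import deque


def connected_components(adj):
    """Return a list of components, each a list of vertices, using BFS."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(component)
    return components


# Hypothetical toy graph: a path 1-2-3, an edge 4-5, and an isolated vertex 6.
toy = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}
components = connected_components(toy)
```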

Paths
A path is a sequence {x_1, x_2, …, x_n} such that (x_1, x_2), (x_2, x_3), …, (x_{n-1}, x_n) are edges of the graph.
A closed path (x_n = x_1) on a graph is called a graph cycle or circuit.

Shortest-Path between nodes

Longest Shortest-Path
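
In an unweighted graph, shortest-path (geodesic) distances from one node can be computed by breadth-first search, and the longest shortest path over all reachable pairs gives the network diameter. The following is a minimal sketch on a made-up graph; function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distance (number of edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path over all pairs of mutually reachable nodes."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Hypothetical toy graph: path 1-2-3-4 with an extra branch 2-5.
toy = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2}}
dist_from_1 = bfs_distances(toy, 1)
toy_diameter = diameter(toy)
```

Running one BFS per node costs O(n(n + m)) overall, which is workable for sparse networks of a few thousand proteins.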

Network Measures: Degree

Degree Distribution
P(k) is the probability of each degree k, i.e. the fraction of nodes having that degree.
For random (Erdős–Rényi) networks, P(k) is peaked around the mean degree (binomial, approximately Poisson).
For real networks the distribution is often a power law: P(k) ~ k^{-γ}
Such networks are said to be scale-free.
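
The empirical P(k) is just a normalized histogram of node degrees. A small sketch, assuming a dict-of-neighbor-sets graph; the star graph is a made-up example chosen because it has a single hub.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: fraction of nodes with degree k}."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical star graph: hub 0 linked to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
p_k = degree_distribution(star)
```

For a real network one would typically plot P(k) against k on log-log axes and look for an approximately straight line as a sign of a power law.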

Clustering Coefficient
k: number of neighbors of node I
n_I: number of edges between node I's neighbors
C_I = 2 n_I / (k (k - 1)): the density of the network surrounding node I, i.e. the number of triangles through I relative to the maximum possible. Related to network modularity.
Example: the center node has 8 (grey) neighbors and there are 4 edges between the neighbors, so C = 2*4 / (8*(8-1)) = 8/56 = 1/7.
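
The worked example on this slide (8 neighbors, 4 edges among them) can be checked numerically. The sketch below assumes the local clustering coefficient is 2 n_I / (k (k - 1)), as the arithmetic on the slide implies; which four neighbor pairs are linked is not stated in the transcript, so the pairs used here are an arbitrary choice.

```python
from fractions import Fraction
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local clustering coefficient: 2 * (edges among neighbors) / (k * (k - 1))."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return Fraction(0)
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return Fraction(2 * links, k * (k - 1))


# Center node 0 with 8 neighbors (1..8) and 4 edges among them (assumed pairs).
adj = {i: set() for i in range(9)}
spokes = [(0, i) for i in range(1, 9)]
neighbor_links = [(1, 2), (3, 4), (5, 6), (7, 8)]
for u, v in spokes + neighbor_links:
    adj[u].add(v)
    adj[v].add(u)

c_center = clustering_coefficient(adj, 0)  # Fraction(1, 7)
```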

Interesting Properties of Network Types

Small-world Network
Every node can be reached from every other by a small number of hops or steps
High clustering coefficient and low mean shortest path length
◦ Random graphs don't necessarily have high clustering coefficients
Social networks, the Internet, and biological networks all exhibit small-world network characteristics

Small world effect
• Most pairs of vertices in the network seem to be connected by a short path
l is the mean geodesic distance
d_ij is the geodesic distance between vertex i and vertex j
l ~ log(N)
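
A rough way to see the small-world effect is to compute the mean geodesic distance before and after adding a long-range shortcut. The sketch below averages d_ij over ordered pairs of distinct, mutually reachable vertices; that normalization is an assumption, since the formula on the original slide did not survive in this transcript. The ring graph and the helper names are made up for the example.

```python
from collections import deque


def mean_geodesic_distance(adj):
    """Average BFS distance over ordered pairs of distinct reachable vertices."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def ring(n):
    """Ring lattice on n >= 3 vertices: each vertex linked to its two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


plain = ring(6)
l_plain = mean_geodesic_distance(plain)  # distances 1,1,2,2,3 from each node

# Add one shortcut across the ring and recompute.
shortcut = {u: set(nbrs) for u, nbrs in plain.items()}
shortcut[0].add(3)
shortcut[3].add(0)
l_shortcut = mean_geodesic_distance(shortcut)
```

Even a single shortcut lowers the average; in larger lattices a handful of random shortcuts is enough to bring l down towards the log(N) regime.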

Scale-Free Networks are Robust
Complex systems (cell, internet, social networks) are resilient to component failure
Network topology plays an important role in this robustness
◦ Even if ~80% of randomly selected nodes fail, the remaining ~20% still maintain network connectivity
Attack vulnerability if hubs are selectively targeted
In yeast, only ~20% of proteins are lethal when deleted, and these essential proteins are 5 times more likely to have degree k>15 than k<5.
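
The contrast between random failure and targeted attack can be illustrated by deleting a node and measuring the size of the largest connected component that remains. This is only a toy sketch on a hand-built two-hub graph (an assumption for illustration, not yeast data): removing a low-degree node barely matters, while removing a hub fragments the network.

```python
def largest_component_size(adj, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    removed = set(removed)
    seen = set(removed)  # treat removed nodes as already visited so they are skipped
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


# Hypothetical network: hub 0 with leaves 1..5, hub 6 with leaves 7..10, hubs linked.
toy = {i: set() for i in range(11)}
links = [(0, i) for i in range(1, 6)] + [(6, i) for i in range(7, 11)] + [(0, 6)]
for u, v in links:
    toy[u].add(v)
    toy[v].add(u)

intact = largest_component_size(toy)                 # all 11 nodes connected
after_leaf = largest_component_size(toy, [1])        # lose one leaf: 10 remain connected
after_hub = largest_component_size(toy, [0])         # lose hub 0: largest piece has 5 nodes
```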

Hierarchical Networks

Detecting Hierarchical Organization

Other Interesting Features
Cellular networks are disassortative: hubs tend not to interact directly with other hubs.
Hubs tend to be "older" proteins (so far claimed for protein-protein interaction networks only).
Hubs also seem to be under more evolutionary pressure: their protein sequences are more conserved than average between species (shown in yeast vs. worm).
Experimentally determined protein complexes tend to contain solely essential or solely non-essential proteins, which is further evidence for modularity.

Network Models
Random Network
Scale free Network
Hierarchical Network

Random Network I
• The Erdős–Rényi (ER) model of a random network starts with N nodes and connects each pair of nodes with probability p, which creates a graph with approximately pN(N-1)/2 randomly placed links
• The node degrees follow a Poisson distribution
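
The ER construction described above translates almost directly into code: visit every pair of nodes once and link it with probability p. The sketch below uses only the standard library; the function names and the `seed` parameter are illustrative choices, not part of the slides.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): each of the n(n-1)/2 node pairs is linked with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_count(adj):
    """Number of undirected edges (each edge is stored at both endpoints)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


n, p = 200, 0.05
g = erdos_renyi(n, p, seed=1)
expected_edges = p * n * (n - 1) / 2  # 995.0 for these values
observed_edges = edge_count(g)        # fluctuates around the expectation
```

The observed edge count varies from sample to sample, with a standard deviation of roughly 31 for these parameters, so it should land close to, but rarely exactly on, 995.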

Random Network II
• Mean shortest path l ~ log N, which indicates that it is characterized by the small-world property.
• Random graphs have served as idealized models of certain gene networks, ecosystems and the spread of infectious diseases and computer viruses.


Scale Free Networks I
• Power-law degree distribution: P(k) ~ k^{-γ}, where γ is the degree exponent, usually between 2 and 3
The network's properties are determined by hubs
The network is often generated by a growth process called the Barabási–Albert model

Scale Free Networks II
• Degree exponents 2<γ<3 are observed in most biological and non-biological scale-free networks, such as the Internet backbone, the World Wide Web, metabolic reaction networks and telephone call graphs.
• For 2<γ<3 the network is ultra-small: the mean shortest path length grows as log(log(N)). At γ = 3 (the Barabási–Albert case) it is proportional to log(N)/log(log(N)).

How are scale-free networks formed?
• Growth with PREFERENTIAL ATTACHMENT: the probability that a new vertex will be connected to vertex i depends on the connectivity k_i of that vertex: Π(k_i) = k_i / Σ_j k_j
In biology, many such networks arise through gene duplication!
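
Growth with preferential attachment can be simulated with a simple trick: keep a list containing every node once per edge endpoint, so that a uniform draw from the list picks node i with probability proportional to its degree. The sketch below is a simplified Barabási–Albert style generator under stated assumptions: it starts from a small complete graph on m + 1 nodes, and it redraws duplicate targets so each new node gets m distinct links, which deviates slightly from strictly proportional sampling. Names and parameters are illustrative.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed graph: complete graph on m + 1 nodes (an assumption of this sketch).
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # One list entry per edge endpoint, so node i appears degree(i) times.
    endpoints = [u for u in adj for _ in adj[u]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))  # degree-proportional draw
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj


g = barabasi_albert(500, 2, seed=7)
degrees = sorted((len(nbrs) for nbrs in g.values()), reverse=True)
n_edges = sum(degrees) // 2  # 3 seed edges + 2 per added node
```

In a typical run the first few entries of `degrees` are far larger than the median, which is the "rich get richer" signature; gene duplication in real proteomes is thought to produce a similar bias, because highly connected proteins are more likely to have a duplicated partner.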

Hierarchical Networks I
• To account for the coexistence of modularity, local clustering and scale-free topology in many real systems, it has to be assumed that clusters combine in an iterative manner, generating a hierarchical network
The hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure by generating a network that has a power-law degree distribution with degree exponent γ = 1 + ln4/ln3 = 2.26

Hierarchical Networks II
• It has a large, system-size independent average clustering coefficient ~ 0.6. The most important signature of hierarchical modularity is the scaling of the clustering coefficient, which follows C(k) ~ k^{-1}, a straight line of slope -1 on a log-log plot
• A hierarchical architecture implies that sparsely connected nodes are part of highly clustered areas, with communication between the different highly clustered neighborhoods being maintained by a few hubs
Some examples of hierarchical scale free networks.
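
To test for the C(k) ~ k^{-1} signature in practice, one can average the clustering coefficient over nodes of each degree and fit a straight line in log-log space. The sketch below shows only the fitting step, using an ordinary least-squares slope; the data points are synthetic values that follow 1/k exactly, not measurements from any real network.

```python
import math


def loglog_slope(ks, cs):
    """Least-squares slope of log(C) against log(k)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(c) for c in cs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic C(k) values that follow 1/k exactly (illustrative only).
ks = [2, 4, 8, 16, 32]
cs = [1 / k for k in ks]
slope = loglog_slope(ks, cs)  # close to -1 for this synthetic input
```

On real data the points scatter, so a slope near -1 is suggestive of hierarchical modularity rather than proof of it.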

Problems of Network Biology
• Network Inference
◦ Micro Array, Protein Chips, other high throughput assay methods
• Function prediction
◦ The function of 40-50% of the new proteins is unknown
◦ Understanding biological function is important for: study of fundamental biological processes, drug design, genetic engineering
• Functional module detection
◦ Cluster analysis
• Topological Analysis
◦ Descriptive and Structural
◦ Locality Analysis
◦ Essential Component Analysis
• Dynamics Analysis
◦ Signal Flow Analysis
◦ Metabolic Flux Analysis
◦ Steady State, Response, Fluctuation Analysis
• Evolution Analysis
• Biological Networks are very rich networks with very limited, noisy, and incomplete information.
• Discovering underlying principles is very challenging.

Summary
The problem: identify differentially expressed genes from microarray data
How to identify: t-test and rank product
How to evaluate significance of identified genes

Reference & Acknowledgements
Barabási, A.-L. and Oltvai, Z. N.
◦ Network Biology: understanding the cell's functional organization
Han, J.-D. J. et al.
◦ Evidence for dynamically organized modularity in the yeast protein–protein interaction network
Woochang Hwang