Biological Network Analysis

Biological Network Analysis Kimberly Glass BIO508 April 9, 2014

Outline
Network models: network properties, network paths, network motifs, information flow, graph clustering
Biological networks: relational networks, correlative networks, causative/regulatory networks
Applications: biological data integration, function prediction
Resources and tools

The Internet colored by IP address http://www.jeffkennedyassociates.com:16080/connections/concept/image.html

Co-authorship of scientific articles http://www.jeffkennedyassociates.com:16080/connections/concept/image.html

Networks in Molecular Biology Protein-Protein interactions Protein-DNA interactions Genetic interactions Metabolic reactions Co-expression interactions Text mining interactions Association Networks Etc. Barabasi & Oltvai, Nature Reviews, 2004

Graphs. A graph G = (V, E) is a set of vertices V and edges E. Example: V = {v1, v2, v3, v4, v5}, E = {(v1, v2), (v1, v3), (v2, v4), (v2, v5), (v3, v5)}. A subgraph G’ of G is induced by some V’ ⊆ V and E’ ⊆ E. For example, V’ = {v1, v2, v3} and E’ = {(v1, v2), (v1, v3)}. Graph properties: directed vs. undirected; weighted vs. unweighted; cyclic vs. acyclic; connectivity (node degree, paths).
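
The example graph above can be written down directly as sets. The following is a minimal sketch using plain Python; the helper names `adjacency` and `induced_subgraph` are illustrative and not part of the original slides.

```python
# Vertex and edge sets from the slide (treated as undirected and unweighted).
V = {"v1", "v2", "v3", "v4", "v5"}
E = {("v1", "v2"), ("v1", "v3"), ("v2", "v4"), ("v2", "v5"), ("v3", "v5")}

def adjacency(vertices, edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def induced_subgraph(edges, keep):
    """Edges of G whose two endpoints both lie in the vertex subset `keep`."""
    return {(a, b) for a, b in edges if a in keep and b in keep}

adj = adjacency(V, E)
# Node degree = number of neighbors.
degree = {v: len(nbrs) for v, nbrs in adj.items()}
# Subgraph induced by V' = {v1, v2, v3}.
sub_edges = induced_subgraph(E, {"v1", "v2", "v3"})
```

Here `sub_edges` reproduces the E’ given on the slide, and `degree` gives the connectivity of each node.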

Networks and Graphs: Terminology. Formally, a network is a graph G = (V, E), an ordered tuple of two sets: V = {v1, …, vn}, a set of unique nodes, and E = {(vi, vj), …}, a set of ordered or unordered node pairs. Graph types illustrated: undirected, directed, weighted (e.g. edge weights 0.5, 1.2, 6, -2), bipartite, cyclic, acyclic (DAG), multigraph, loops (self-connections).

Sparse vs Dense. G(V, E) where |V| = n and |E| = m are the numbers of vertices and edges. The graph is sparse if m ~ n and dense if m ~ n^2. A simple undirected graph is complete when every pair of nodes is connected, i.e. m = n(n-1)/2.
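
One way to quantify sparse versus dense is the edge density, the fraction of possible edges that are present. This is a small sketch assuming a simple undirected graph (no self-loops, no repeated edges); the function names are illustrative.

```python
def max_edges(n):
    """Number of edges in a complete simple undirected graph on n nodes."""
    return n * (n - 1) // 2

def density(n, m):
    """Fraction of possible edges that are present (0 = empty, 1 = complete)."""
    return m / max_edges(n) if n > 1 else 0.0

# A graph with 69 nodes and 71 edges has m ~ n, so it is sparse.
d_sparse = density(69, 71)
# A complete graph on 5 nodes has all 10 possible edges.
d_complete = density(5, max_edges(5))
```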

Connected Components G(V,E) |V| = 69 |E| = 71 6 connected components
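
Connected components can be found with a breadth-first search from every node not yet visited. The sketch below uses a small made-up graph (`toy`), since the 69-node network in the figure is not recoverable from the transcript.

```python
from collections import deque

def connected_components(adj):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        comp = set()
        while queue:
            node = queue.popleft()
            comp.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(comp)
    return components

# Hypothetical undirected graph: a path a-b-c, an edge d-e, and an isolated node f.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"},
       "d": {"e"}, "e": {"d"}, "f": set()}
components = connected_components(toy)
```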

Paths. A path is a sequence {x_1, x_2, …, x_n} such that (x_1, x_2), (x_2, x_3), …, (x_{n-1}, x_n) are edges of the graph. A closed path (x_n = x_1) on a graph is called a graph cycle or circuit.

Shortest-Path between nodes

Longest Shortest-Path

Network paths and diameter Shortest path: Connect two nodes by as few edges as possible Network diameter: The longest shortest path in the network The network diameter is often very short: ‘Small world network’
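
In an unweighted network, shortest path lengths come from breadth-first search, and the diameter is the largest of them. The sketch below reuses the five-node example graph from the Graphs slide and assumes the graph is connected; function names are illustrative.

```python
from collections import deque

# Undirected adjacency map for the five-node example graph.
adj = {
    "v1": {"v2", "v3"},
    "v2": {"v1", "v4", "v5"},
    "v3": {"v1", "v5"},
    "v4": {"v2"},
    "v5": {"v2", "v3"},
}

def bfs_distances(adj, source):
    """Shortest path length (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

d_v4_v3 = bfs_distances(adj, "v4")["v3"]  # path v4 - v2 - v1 (or v5) - v3
diam = diameter(adj)
```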

Network Motifs: Simple Building Blocks of Complex Networks. Milo, Alon, et al. Science. 2002 Oct 25;298(5594):824-7

Network Motifs. Motifs shown: feedback (positive auto-regulation, negative auto-regulation), coherent feed-forward, incoherent feed-forward, bi-fan. Functional labels on the slide: memory, delay, speed + stability, filter, pulse, whole genome duplication and evolvability.

Network Motifs: Simple Building Blocks of Complex Networks. Shen-Orr, Alon et al. Nature Genetics, 2002 May;31(1):64-8.
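
To make the idea of a motif concrete, the sketch below enumerates feed-forward loops (X regulates Y, and both X and Y regulate Z) and auto-regulatory self-loops in a directed edge list. The edge list is made up for illustration. Note that this only counts occurrences; motif analyses such as Milo et al. additionally compare the counts against randomized networks, which is not done here.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with edges x->y, y->z and x->z."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    loops = []
    for x, x_targets in succ.items():
        for y in x_targets:
            if y == x:
                continue  # skip self-loops
            for z in succ.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)

def self_loops(edges):
    """Nodes that regulate themselves (auto-regulation)."""
    return sorted({a for a, b in edges if a == b})

# Hypothetical regulatory edges (regulator, target).
edges = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "W"), ("X", "X")]
ffls = feed_forward_loops(edges)
autoregulated = self_loops(edges)
```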

Degree or connectivity

Random vs scale-free networks. P(k) is the probability of each degree k, i.e. the fraction of nodes having that degree. For random (Erdős–Rényi) networks, P(k) is binomial (approximately Poisson), peaked around the mean degree. For real networks the distribution is often a power law: P(k) ~ k^(-γ). Such networks are said to be scale-free.
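
The degree distribution P(k) is simply a normalized histogram of node degrees. A minimal sketch, applied to the five-node example graph from the Graphs slide (far too small to show a power law, but enough to show the computation):

```python
from collections import Counter

# Undirected adjacency map for the five-node example graph.
adj = {
    "v1": {"v2", "v3"},
    "v2": {"v1", "v4", "v5"},
    "v3": {"v1", "v5"},
    "v4": {"v2"},
    "v5": {"v2", "v3"},
}

def degree_distribution(adj):
    """Map each degree k to the fraction of nodes with that degree."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

P = degree_distribution(adj)
```

For a real network one would typically plot P(k) against k on log-log axes; an approximately straight line is consistent with a power law.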

Knock-out lethality and connectivity

Clustering coefficient. The density of the network surrounding node I, characterized by the number of triangles through I. Related to network modularity. C_I = 2 n_I / (k (k-1)), where k is the number of neighbors of I and n_I is the number of edges between node I’s neighbors. Example: the center node has 8 (grey) neighbors and there are 4 edges between the neighbors, so C = 2*4 / (8*(8-1)) = 8/56 = 1/7.
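
The worked example can be checked in code. The sketch below builds a center node with 8 neighbors and 4 edges among them; which neighbors are linked is an assumption, since the figure is not preserved, but the coefficient depends only on the counts.

```python
from itertools import combinations

def local_clustering(adj, node):
    """C = 2 * (edges among neighbors) / (k * (k - 1)) for an undirected graph."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))

# Center node "c" with neighbors n0..n7.
neighbors = [f"n{i}" for i in range(8)]
adj = {"c": set(neighbors)}
for n in neighbors:
    adj[n] = {"c"}
# Four edges between neighbors (hypothetical placement).
for a, b in [("n0", "n1"), ("n2", "n3"), ("n4", "n5"), ("n6", "n7")]:
    adj[a].add(b)
    adj[b].add(a)

C_center = local_clustering(adj, "c")
```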

Mixing Properties of Networks. Assortative network: nodes tend to connect to other nodes of similar degree. Disassortative network: nodes tend to connect to other nodes of dissimilar degree. http://en.wikipedia.org/wiki/Assortativity
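
Degree mixing is commonly summarized as the Pearson correlation between the degrees at the two ends of each edge: positive for assortative networks, negative for disassortative ones. The following is a sketch of that calculation on two made-up graphs; the function name and the choice to return 0.0 when the correlation is undefined are assumptions of this example.

```python
from math import sqrt

def degree_assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for a, b in edges:
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined when all degrees are equal; reported as 0.0 here
    return cov / sqrt(var_x * var_y)

# A star is maximally disassortative: the hub only touches low-degree leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
# A 4-clique plus a separate edge: every edge joins nodes of equal degree.
like_with_like = [("p", "q"), ("p", "r"), ("p", "s"), ("q", "r"),
                  ("q", "s"), ("r", "s"), ("x", "y")]
r_star = degree_assortativity(star)
r_like = degree_assortativity(like_with_like)
```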

Network Structure: Hubs, Bottlenecks, and Information Flow

Network Structure: Cliques and Clusters. Clique: fully connected subgraph. Quasi-clique: near-miss (almost fully connected). k-clique: clique of size exactly k. Maximal clique: a clique that cannot be extended by adding another node. Maximum clique: the largest clique in the graph. http://science.cancerresearchuk.org/sci/lightmicro/images/116771 http://scienceblogs.com/goodmath/upload/2007/07/maximal-cliques.jpg http://en.wikipedia.org/wiki/Community_structure
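
The clique definitions translate directly into membership checks. This is a brute-force sketch on a small made-up graph, intended only to illustrate the difference between a clique that is maximal (cannot be extended) and one that is not; it is not an efficient clique-finding algorithm.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of nodes in `nodes` is connected."""
    return all(b in adj[a] for a, b in combinations(nodes, 2))

def is_maximal_clique(adj, nodes):
    """True if `nodes` is a clique and no outside node is adjacent to all of it."""
    nodes = set(nodes)
    if not is_clique(adj, nodes):
        return False
    return not any(nodes <= adj[v] for v in adj if v not in nodes)

# Hypothetical graph: triangle a-b-c with a pendant edge c-d.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

pair_is_maximal = is_maximal_clique(adj, {"a", "b"})          # c extends it
triangle_is_maximal = is_maximal_clique(adj, {"a", "b", "c"})
pendant_is_maximal = is_maximal_clique(adj, {"c", "d"})       # maximal, but not the largest
```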

How is biological data represented in networks? Data sources: gene expression, physical PPIs, genetic interactions, colocalization, sequence, protein domains, regulatory binding sites, … [Figure: data sources combined into a network; edge color scale from high to low correlation]

Building and Interpreting Biological Networks. How we build a biological network depends on what data we have AND what we want the edges in the network to represent. The meaning of the edges in a biological network depends on the method used to generate those edges, which in turn influences how we interpret the interactions in the network. Node: an object in the network (e.g. a gene). Edge: indicates a relationship between two nodes.

Interpreting the “edges” in Biological Networks
Relational network: generally undirected (non-causal relationships); nodes all of the same “type”; generally no “signs” on edges. Example: Protein A is a dimerization partner of protein B.
Correlation network: undirected (non-causal relationships); nodes all of the same “type”; edges can have “signs”. Example: when the expression of Gene A changes, so does the expression of Gene B. Correlation is not causation.
Regulatory network: directed (causal relationships); can have “types” of nodes; edges can have “signs”. Example: TF A regulates Gene B.

Types of Protein Interactions. Physical protein interactions: edge between proteins if they physically interact. Synthetic lethality: edge between proteins if mutating both causes lethality (cell death), while the wild type and each single mutant remain viable.

Functional Associations Between Processes. Edges: associations between processes (legend: very strong to moderately strong). Gene Ontology: structured as a directed acyclic graph (DAG). Ashburner et al. Gene Ontology: tool for the unification of biology. Nature Genetics 2000.

Functional Associations Between Genes Level of shared function between genes Edge between two genes if they are involved in many of the same biological processes

Network inference from expression data. Margolin and Califano, Ann. N.Y. Acad. Sci. 1115: 51–72 (2007). Approaches: differential equations, Boolean networks, linear regression, Bayesian networks, information-theoretic models, latent variable networks. Input: a genes × conditions expression matrix. Focusing on gene expression is a simplification, but it gives us something concrete to work with.

Correlation is the simplest metric for co-expression. [Figure: a genes × conditions expression matrix converted into a genes × genes correlation matrix]
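
A co-expression network can be sketched by computing the Pearson correlation for every gene pair and keeping pairs whose absolute correlation exceeds a cutoff. The expression values, gene names and the 0.9 threshold below are all made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical expression matrix: gene -> values across five conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],   # rises with geneA
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
    "geneD": [2.0, 5.0, 1.0, 4.0, 3.0],   # essentially unrelated
}

def coexpression_edges(expr, threshold=0.9):
    """Signed, undirected edges for gene pairs with |r| >= threshold."""
    genes = sorted(expr)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if abs(r) >= threshold:
                edges[(g, h)] = r
    return edges

edges = coexpression_edges(expression)
```

The sign of each retained correlation becomes the sign of the edge, which is why correlation networks can carry signed edges even though they are undirected.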

Mutual Information is a Measure of Non-linear Correlation Pearson correlation value Source: http://en.wikipedia.org/wiki/Correlation_and_dependence

Mutual Information (MI). Definition (formula shown on the slide). Properties: measures how much knowing one of these variables reduces uncertainty about the other; non-negative and symmetric; invariant under invertible (including nonlinear) transformations of either variable. Network reconstruction algorithms that use MI: ARACNe, CLR.
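
For discrete variables, the standard definition is I(X;Y) = Σ p(x,y) log [ p(x,y) / (p(x) p(y)) ], summed over all observed value pairs. The sketch below estimates it from counts and assumes the values are already discretized; real expression data would first need binning or a kernel estimator, which is not shown. The example contrasts a purely nonlinear dependence (y = x^2) with two independent variables.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # (c/n) / ((px/n) * (py/n)) simplifies to c * n / (px * py).
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# y = x^2: the Pearson correlation is zero, yet y is fully determined by x.
x_vals = [-2, -1, 0, 1, 2]
y_vals = [v * v for v in x_vals]
mi_quadratic = mutual_information(x_vals, y_vals)

# Independent variables: every combination occurs equally often.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```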

ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks). Margolin, Califano et al. BMC Bioinformatics. 2006 Mar 20;7 Suppl 1:S7. Key Idea: Remove indirect relationships.
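
ARACNe removes indirect relationships using the data processing inequality: in any fully connected triplet of genes, the edge with the lowest MI is treated as indirect. The sketch below shows only that pruning step on made-up MI values; it omits the MI estimation, significance thresholding and tolerance parameter of the published method, and the function name is illustrative.

```python
from itertools import combinations

def dpi_prune(mi):
    """Drop the weakest edge of every triangle.

    `mi` maps frozenset({gene_a, gene_b}) -> MI value and is assumed to
    contain only pairs that already passed a significance threshold.
    """
    nodes = sorted({g for pair in mi for g in pair})
    to_remove = set()
    for a, b, c in combinations(nodes, 3):
        triangle = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(edge in mi for edge in triangle):
            to_remove.add(min(triangle, key=mi.get))
    # Remove only after all triangles were examined on the full network.
    return {edge: w for edge, w in mi.items() if edge not in to_remove}

# Hypothetical MI values: TF drives G1 and G2, so G1-G2 is an indirect link.
mi = {
    frozenset(("TF", "G1")): 0.9,
    frozenset(("TF", "G2")): 0.8,
    frozenset(("G1", "G2")): 0.3,
    frozenset(("G2", "G3")): 0.5,   # not part of any triangle, so kept
}
pruned = dpi_prune(mi)
```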

CLR (Context Likelihood of Relatedness). Faith, Gardner et al. PLoS Biol. 2007 Jan;5(1):e8. Key Idea: Normalize the MI for each gene pair against its corresponding background.
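
The background normalization can be sketched as follows: each MI value is converted to a z-score against the distribution of MI values for each of its two genes, negative z-scores are clipped to zero, and the two are combined as sqrt(z_i^2 + z_j^2). The MI matrix below is made up, and excluding the diagonal from the background is a simplification chosen for this example.

```python
from math import sqrt
from statistics import mean, pstdev

def clr_scores(mi):
    """CLR-style scores from a symmetric MI matrix (list of lists)."""
    n = len(mi)
    # Background mean and standard deviation for each gene's MI values.
    stats = []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]
        stats.append((mean(row), pstdev(row)))

    def z(i, j):
        mu, sd = stats[i]
        return max(0.0, (mi[i][j] - mu) / sd) if sd > 0 else 0.0

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = sqrt(z(i, j) ** 2 + z(j, i) ** 2)
            scores[i][j] = scores[j][i] = s
    return scores

# Hypothetical symmetric MI matrix for four genes.
mi = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.2, 0.3],
    [0.2, 0.2, 0.0, 0.1],
    [0.1, 0.3, 0.1, 0.0],
]
scores = clr_scores(mi)
```

In this toy matrix the pair (1, 3) has a modest raw MI of 0.3 but still scores well, because 0.3 is high relative to gene 3's otherwise weak background.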

Thinking of Gene Regulation As a Network. Nodes are genes; edges indicate causal relationships between genes (“TF A regulates gene B”). Networks are directed, from transcription factors to target genes (some of which are also transcription factors). Edges in gene regulatory networks can have signs corresponding to target gene activation (increased transcription) and gene repression (prevention of transcription). Note that edge signs are hard to measure in practice. [Figure: transcription factor and target gene, once with TF A activating gene B and once with TF A repressing gene B]

How Can We Model GRNs in Human Systems? Two main ways to produce TF-gene regulation data:
Experimentally. Technique: ChIP-chip. Strengths: high quality, environment-specific. Limitations: very expensive, limited number of ChIP antibodies.
Computationally. Technique: DNA sequence scan for TF binding sites. Strengths: cheap. Limitations: recognition sequences are known for only 10-20% of TFs, prone to false positives, not environment-specific.
[Figure: example TF-gene regulatory network with TF1-TF4 and G1-G6]

Incorporating Epigenetic Information With TF Sequence-motif Data. All potential interactions: motif found within the gene’s promoter. Interactions with epigenetic evidence: motif found in the gene’s promoter and located in a region of open chromatin (DNase hypersensitivity site). [Figure: TF1 motif hits in the promoters of Gene1-Gene4, filtered by open-chromatin data]
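
The filtering step amounts to an interval overlap test: keep a motif hit only if its position falls inside an open-chromatin region of the same promoter. All coordinates and gene names below are made up, and positions are simplified to single promoter-relative coordinates.

```python
# Hypothetical motif hits: (TF, gene, position of the motif in the promoter).
motif_hits = [
    ("TF1", "Gene1", 120),
    ("TF1", "Gene2", 340),
    ("TF1", "Gene3", 75),
    ("TF1", "Gene4", 510),
]

# Hypothetical open-chromatin intervals per gene promoter: (start, end).
open_chromatin = {
    "Gene1": [(100, 200)],
    "Gene3": [(300, 400)],
    "Gene4": [(500, 600)],
}

def filter_by_open_chromatin(hits, open_regions):
    """Keep TF-gene pairs whose motif lies inside an open-chromatin interval."""
    kept = []
    for tf, gene, pos in hits:
        regions = open_regions.get(gene, [])
        if any(start <= pos <= end for start, end in regions):
            kept.append((tf, gene))
    return kept

supported = filter_by_open_chromatin(motif_hits, open_chromatin)
```

Gene2 is dropped because it has no open region, and Gene3 because its motif lies outside the open interval.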

Relationship between Expression Information and Gene Regulation
Experimental (ChIP-chip): limited antibodies (sparse); environment specific; non-functional targets. “Good quality, sparse, expensive”
Computational (motif): quality of PWM; not environment specific; non-functional sequences. “Poor quality, dense, cheap”
Gene expression: large amount of data; environment specific; correlation is not causation.
Regulatory network: combination of the three.

Relationship between Expression Information and Gene Regulation. Gene expression: large amount of data; environment specific; correlation is not causation. Correlation of expression might occur when one gene regulates another, or when two genes are regulated by the same TF (the TF is expressed and, some time later, its target genes are expressed). Correlation in two genes’ expression patterns is actually more often a measure of co-regulation.

Relationship between Expression Information and Gene Regulation. Example: G2. The expression of G1 and G2 is highly correlated. Since TF1 targets G1, there is a higher possibility that TF1 also regulates G2. [Figure: TF1 regulates G1; G1 and G2 have correlated expression; the TF1 to G2 edge is marked “?”]

Protein Interaction Is Related to Regulation Some transcription factors don’t bind a particular DNA sequence. TFs can regulate a gene: Through direct interaction with the control (promoter) region of that gene. By forming a complex with other TFs which directly interact with the promoter region of that gene. We can model protein interactions as a network.

Relationship between Protein Interaction Information and Gene Regulation. [Figure: two networks side by side. Protein-protein interaction data among TF1-TF5, and TF-gene regulation data linking TF1-TF4 to G1-G5. TFs with a known recognition sequence are highlighted.]

Relationship between Protein Interaction Information and Gene Regulation: Integrated Network. Example: G3. TF1 and TF2 are potential regulators. Since TF5 interacts with both TF1 and TF2, there is a higher possibility that TF5 is also involved in the regulation of G3. [Figure: integrated network of TF-gene regulation and protein-protein interaction edges among TF1-TF5 and G1-G5]
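
The G3 example can be expressed as a simple rule: a TF that physically interacts with several known regulators of a gene becomes a candidate co-regulator. The sketch below encodes that rule; the `min_partners` cutoff, the function name and the extra TF4-TF1 interaction are assumptions added for illustration.

```python
# TF-gene regulation data: target gene -> set of potential regulators.
regulation = {"G3": {"TF1", "TF2"}}

# Protein-protein interaction data as undirected pairs.
ppi = {
    frozenset(("TF5", "TF1")),
    frozenset(("TF5", "TF2")),
    frozenset(("TF4", "TF1")),   # hypothetical extra interaction
}

def candidate_coregulators(target, regulation, ppi, min_partners=2):
    """TFs that interact with at least `min_partners` known regulators of `target`."""
    known = regulation.get(target, set())
    counts = {}
    for pair in ppi:
        a, b = tuple(pair)
        for tf, partner in ((a, b), (b, a)):
            if partner in known and tf not in known:
                counts[tf] = counts.get(tf, 0) + 1
    return {tf for tf, c in counts.items() if c >= min_partners}

candidates = candidate_coregulators("G3", regulation, ppi)
```

With these inputs only TF5 qualifies, matching the reasoning on the slide; TF4 interacts with just one of G3's regulators and is not proposed.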

Functional mapping: mining biological networks. Predicted relationships between genes (edge confidence from high to low). The strength of these relationships indicates how cohesive a process is. Example: cell cycle genes.

Functional mapping: mining biological networks. Predicted relationships between genes (edge confidence from high to low). The strength of these relationships indicates how associated two processes are. Example: cell cycle genes and DNA replication genes.

Predicting gene function. Predicted relationships between genes (edge confidence from high to low). These edges provide a measure of how likely a gene is to specifically participate in the process of interest. Example: cell cycle genes.
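
One simple way to turn edge confidences into a function prediction is to average a gene's edge confidence to the known members of a process (a guilt-by-association score). This is a sketch of that idea, not necessarily the scoring used in the lecture; gene names and confidence values are made up.

```python
def process_score(gene, process_genes, confidence):
    """Mean edge confidence between `gene` and the known process members."""
    others = [g for g in process_genes if g != gene]
    if not others:
        return 0.0
    total = sum(confidence.get(frozenset((gene, g)), 0.0) for g in others)
    return total / len(others)

# Hypothetical set of annotated cell cycle genes.
cell_cycle = {"cc1", "cc2", "cc3"}

# Hypothetical edge confidences; missing pairs count as 0.
confidence = {
    frozenset(("geneX", "cc1")): 0.9,
    frozenset(("geneX", "cc2")): 0.8,
    frozenset(("geneX", "cc3")): 0.7,
    frozenset(("geneY", "cc1")): 0.1,
}

score_x = process_score("geneX", cell_cycle, confidence)
score_y = process_score("geneY", cell_cycle, confidence)
```

Here geneX would be ranked as a much stronger cell cycle candidate than geneY.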

Known Gene Regulatory Network: E. coli. E. coli is a single-celled organism with a circular DNA structure encoding approximately 4000 genes (about 2500 “operons”). It probably has the most complete experimentally constructed gene regulatory network and was used for many early investigations into GRN structure. http://regulondb.ccg.unam.mx/

Human Regulatory Information: ENCODE https://genome.ucsc.edu/ENCODE/

Protein Interaction Information: StringDB http://string-db.org/

Pathway Information http://www.biocarta.com/ http://www.genome.jp/kegg/ http://www.geneontology.org/

Network Analysis and Visualization http://www.cytoscape.org/ http://igraph.sourceforge.net/ http://www.graphviz.org/