Introduction to biological molecular networks Colin Dewey BMI/CS 576 www.biostat.wisc.edu/bmi576 colin.dewey@wisc.edu Fall 2015
Omics data provide comprehensive description of nearly all components of the cell Joyce & Palsson, Nature Mol cell biol. 2006
Understanding a cell as a system Measure: identify the parts of a system Parts: different types of bio-molecules genes, proteins, metabolites High-throughput assays to measure these molecules Model: how these parts are put together Clustering Network inference and analysis
Key concepts in networks What are molecular networks? Different types of networks Graph-theoretic representation Computational problems in network biology Network structure and parameter learning Analysis of network properties Inference on networks: for data integration, interpretation and hypothesis generation Classes of methods for expression-based network inference Probabilistic graphical models for networks Bayesian networks Evaluation of inferred networks
Representing molecular networks Molecular networks typically represented by graphs Vertices/Nodes = a molecular part Edges = connections or interactions between parts Edges can have directionality and/or weight Vertex/Node A E F D Edge B C
Different types of molecular networks Depends on what the vertices represent the edges represent whether edges directed or undirected
Transcriptional regulatory networks Nodes: regulatory protein like a TF, or target gene Edges: TF A regulates C Transcription factors (TF) A B E. coli: 153 TFs and 1319 target genes Gene C A B C Directed, weighted S. cerevisiae: 157 TFs and 4410 target genes Vargas and Santillan, 2008
Part of the E. coli Regulatory Network Figure from Wei et al., Biochemical Journal 2004
Gene Regulation Example: the lac Operon this protein regulates the transcription of LacZ, LacY, LacA these proteins metabolize lactose
Gene Regulation Example: the lac Operon lactose is absent ⇒ the protein encoded by lacI represses transcription of the lac operon
Gene Regulation Example: the lac Operon lactose is present ⇒ it binds to the protein encoded by lacI changing its shape; in this state, the protein doesn’t bind upstream from the lac operon; therefore the lac operon can be transcribed
Gene Regulation Example: the lac Operon this example provides a simple illustration of how a cell can regulate (turn on/off) certain genes in response to the state of its environment an operon is a sequence of genes transcribed as a unit the lac operon is involved in metabolizing lactose it is “turned on” when lactose is present in the cell the lac operon is regulated at the transcription level the depiction here is incomplete; for example, the level of glucose in the cell also influences transcription of the lac operon
Detecting protein-DNA interactions ChIP-chip and ChIP-seq binding profiles for transcription factors Determine the (approximate) locations in the genome where a protein binds Peter Park, Nature Reviews Genetics, 2009
Protein-protein interaction networks Vertices: proteins Edges: Protein U physically interacts with protein X Protein complex U X Y Z U X Y Z Yeast protein interaction network Undirected Barabasi et al. 2004
Metabolic networks Vertices: enzymes Edges: Enzyme M and N share a metabolite Proteins (enzymes) metabolites M Enzymes a b N Metabolites c O d M N O Undirected, weighted Figure from KEGG database
Overview of the E. coli Metabolic Pathway Map image from the KEGG database
The Metabolic Pathway for Synthesizing the Amino Acid Alanine reactions metabolites enzymes (proteins that catalyze reactions) genes encoding the enzymes image from the Ecocyc database www.biocyc.org
Signaling networks Vertices: Enzymes and other proteins Edges: Enzyme P modifies protein Q Receptors P Q TF A P Q A Directed Sachs et al., 2005, Science
Genetic interaction networks Genetic interaction: If the phenotype of double mutant is significantly different than each mutant alone Vertices: Genes Edges: Genetic interaction between query (Q) and gene G Q G Undirected Dixon et al., 2009, Annu. Rev. Genet
Yeast genetic interaction network Costanzo et al, 2011, Science
Summary of different types of Molecular networks Physical networks Transcriptional regulatory networks: interactions between regulatory proteins (transcription factors) and genes Protein-protein: interactions among proteins Signaling networks: interactions between protein and small molecules, and among proteins that relay signals from outside the cell to the nucleus Functional networks metabolic: describe reactions through which enzymes convert substrates to products genetic: describe interactions among genes which when genetically perturbed together produce a significant phenotype than individually
Computational problems in networks Network reconstruction Infer the structure and parameters of networks We will examine this problem in the context of “expression-based network inference” Analysis of network properties Degree distributions Network motifs Highly connected nodes and relationship to lethality Network applications Interpretation of gene sets Using networks to infer function of a gene
Network reconstruction Given A set of attributes associated with network nodes Typically attributes are mRNA levels Do Infer which nodes interact Algorithms for network reconstruction can vary based on their meaning of interaction Similarity Mutual information Predictive ability
Computational methods to infer networks We will focus on transcriptional regulatory networks These networks are typically inferred from gene expression data Many methods to do network inference We will focus on probabilistic graphical models
Modeling a regulatory network Hot1 Sko1 HSP12 Sko1 Structure HSP12 Hot1 Who are the regulators? ψ(X1,X2) Function X1 X2 Y BOOLEAN LINEAR DIFF. EQNS PROBABILISTIC …. How they determine expression levels? Hot1 regulates HSP12 HSP12 is a target of Hot1
Mathematical representations of networks X1 X2 Input expression of neighbors f Models differ in the function that maps input system state to output state X3 Output expression of node Boolean Networks Differential equations Probabilistic graphical models Input Output Rate equations Probability distributions X1 X2 X1 X2 1 X3 1 X3