Representation, Learning and Inference in Models of Cellular Networks

Representation, Learning and Inference in Models of Cellular Networks BMI/CS 576 www.biostat.wisc.edu/bmi576/ Colin Dewey cdewey@biostat.wisc.edu Fall 2010

Various Subnetworks within Cells
- metabolic: describe reactions through which enzymes convert substrates to products
- regulatory (genetic): describe interactions that control expression of particular genes
- signaling: describe interactions among proteins and (sometimes) small molecules that relay signals from outside the cell to the nucleus
- note: these networks are linked together and the boundaries among them are not crisp

[Figure from KEGG database; legend: gene products, other molecules]

Part of the E. coli Regulatory Network Caption: This figure illustrates the core of the GRN in E. coli, where TFs regulate other TFs [74]. Short horizontal lines from which bent arrows extend represent cis-regulatory elements responsible for the expression of the genes named below the line. When more than one TF regulates a gene, the order of their binding sites is as given in the figure. An arrowhead indicates activation and a horizontal bar indicates repression when the position of the binding site is known. If only the nature of TF regulation is known, without binding site information, ‘+’ and ‘−’ symbols indicate activation and repression respectively. These examples may be indirect rather than direct regulation. The circles with the different colours as given in the key represent the different families of DNA binding domains. The names of dominant regulators are in bold. FIS, factor for inversion stimulation; IHF, integration host factor. Modified with permission, from Madan Babu, M. and Teichmann, S. A., (2003), Nucleic Acids Res. 31, 1234–1244. © Oxford University Press. Figure from Wei et al., Biochemical Journal 2004

A Signaling Network
Figure from Sachs et al., Science 2005. Caption: Classic signaling network and points of intervention. This is a graphical illustration of the conventionally accepted signaling molecule interactions, the events measured, and the points of intervention by small-molecule inhibitors. Signaling nodes in color were measured directly. Signaling nodes in gray were not measured, but are presented to place the signaling nodes that were measured within contextual cellular pathways. The interventions classified as activators are colored green and inhibitors are colored red. Intervention site of action is indicated in the figure. Arcs are used to illustrate connections between signaling molecules; in some cases, the connections may be indirect and may involve specific phosphorylation sites of the signaling molecules (see Table 3 for details of these connections). This figure contains a synopsis of signaling in mammalian cells and is not representative of all cell types, with inositol signaling corelationships being particularly complex.

Two Key Tasks
- learning: given background knowledge and high-throughput data, try to infer the (partial) structure/parameters of a network
- inference: given a (partial) network model, use it to predict an outcome of biological interest (e.g. will the cells grow faster in medium x or medium y?)
- both of these are challenging tasks because typically
  - data are noisy
  - data are incomplete: they characterize a limited range of conditions
  - important aspects of the system are not measured: some unknown structure and/or parameters

Transcriptional Regulation Example: the lac Operon in E. coli E. coli can use lactose as an energy source, but it prefers glucose. How does it switch on its lactose-metabolizing genes?

The lac Operon: Repression by LacI lactose absent → protein encoded by lacI represses transcription of the lac operon

The lac Operon: Induction by LacI lactose present → protein encoded by lacI won't bind to the operator (O) region

The lac Operon: Activation by Glucose glucose absent → CAP protein promotes binding by RNA polymerase; increases transcription

Network Model Representations
- directed graphs
- Boolean networks
- differential equations
- Bayesian networks and related graphical models
- etc.

Probabilistic Model of lac Operon
- suppose we represent the system by the following discrete variables
  - L (lactose): present, absent
  - G (glucose): present, absent
  - I (lacI): present, absent
  - C (CAP): present, absent
  - lacI-unbound: true, false
  - CAP-bound: true, false
  - Z (lacZ): high, low, absent
- suppose (realistically) the system is not completely deterministic
- the joint distribution of the variables could be specified by 2^6 × 3 - 1 = 191 parameters
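The parameter count for the unfactored joint distribution can be checked with a few lines of Python. This is a minimal sketch; the dictionary and function names are illustrative and do not come from the slides.

```python
from math import prod

# Number of values each variable can take, as listed on the slide.
cardinalities = {
    "L": 2, "G": 2, "I": 2, "C": 2,
    "lacI-unbound": 2, "CAP-bound": 2,
    "Z": 3,
}

def full_joint_parameters(cards):
    """Free parameters of an unfactored joint table.

    There is one probability per joint configuration, minus one because
    the probabilities must sum to 1.
    """
    return prod(cards.values()) - 1

n_joint = full_joint_parameters(cardinalities)
print(n_joint)
```

Six binary variables and one three-valued variable give 2^6 × 3 = 192 configurations, hence 191 free parameters.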

Motivation for Bayesian Networks
- Explicitly state (conditional) independencies between random variables
- Provide a more compact model (fewer parameters)
- Use directed graphs to specify model
- Take advantage of graph algorithms/theory
- Provide intuitive visualizations of models

A Bayesian Network for the lac System
[Figure: directed graph over L, G, I, C, lacI-unbound, CAP-bound and Z, shown with example CPD tables for Pr( L ), Pr( lacI-unbound | L, I ) and Pr( Z | lacI-unbound, CAP-bound ). Only some table entries survive extraction; the Pr( L ) table appears to read absent 0.9, present 0.1.]

Bayesian Networks
- Also known as Directed Graphical Models
- a BN is a Directed Acyclic Graph (DAG) in which the nodes denote random variables
- each node X has a conditional probability distribution (CPD) representing P( X | Parents(X) )
- the intuitive meaning of an arc from X to Y is that X directly influences Y
- formally: each variable X is independent of its non-descendants given its parents

Bayesian Networks
- a BN provides a factored representation of the joint probability distribution
- [Figure: the lac network over L, G, I, C, lacI-unbound, CAP-bound and Z, annotated with the number of parameters per node]
- 1 + 1 + 1 + 1 + 4 + 4 + 4 × 2 = 20
- this representation of the joint distribution can be specified with 20 parameters (vs. 191 for the unfactored representation)
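The count of 20 follows from the graph structure: each node needs (number of its values - 1) free parameters for every configuration of its parents. The sketch below recomputes it. The parent sets of lacI-unbound and Z are taken from the CPDs named in the slides; the parents of CAP-bound (G and C) are an assumption that is consistent with the slide's count of 4 for that node.

```python
from math import prod

cardinalities = {
    "L": 2, "G": 2, "I": 2, "C": 2,
    "lacI-unbound": 2, "CAP-bound": 2,
    "Z": 3,
}

# Parent sets of each node. CAP-bound's parents are assumed (see lead-in).
parents = {
    "L": [], "G": [], "I": [], "C": [],
    "lacI-unbound": ["L", "I"],
    "CAP-bound": ["G", "C"],
    "Z": ["lacI-unbound", "CAP-bound"],
}

def cpd_parameters(node):
    """Free parameters in the CPD of one node."""
    # prod() of an empty sequence is 1, so root nodes have one parent "row".
    rows = prod(cardinalities[p] for p in parents[node])
    return rows * (cardinalities[node] - 1)

per_node = {node: cpd_parameters(node) for node in parents}
n_factored = sum(per_node.values())
print(per_node, n_factored)
```

The four root nodes contribute 1 each, the two binding variables contribute 4 each, and Z contributes 4 × 2 = 8.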

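Representing CPDs for Discrete Variables
- CPDs can be represented using tables or trees
- consider the following case with Boolean variables A, B, C, D
[Figure: Pr( D | A, B, C ) drawn as a decision tree that tests A, B and C, with leaves Pr(D = T) = 0.9, Pr(D = T) = 0.5 and Pr(D = T) = 0.8, next to the equivalent table over A, B, C with columns for D = T and D = F (surviving entries: 0.9/0.1, 0.8/0.2, 0.5). The full layout is not recoverable.]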
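To make the table-versus-tree distinction concrete, here is a minimal sketch of a tree-structured CPD and its expansion into a full table. The leaf values 0.9, 0.5 and 0.8 appear on the slide, but the branch layout and the fourth leaf are assumptions chosen for illustration, since the original figure is not recoverable.

```python
from itertools import product

# Internal node: (variable, subtree_if_true, subtree_if_false).
# Leaf: Pr(D = T).
# Hypothetical layout: test A first, then B, then C.
TREE = ("A", 0.9, ("B", 0.5, ("C", 0.8, 0.5)))

def pr_d_true(tree, assignment):
    """Walk the tree using the parent values and return Pr(D = T)."""
    node = tree
    while isinstance(node, tuple):
        variable, if_true, if_false = node
        node = if_true if assignment[variable] else if_false
    return node

def tree_to_table(tree, variables=("A", "B", "C")):
    """Expand the tree into one table row per parent configuration."""
    table = {}
    for values in product([True, False], repeat=len(variables)):
        table[values] = pr_d_true(tree, dict(zip(variables, values)))
    return table

TABLE = tree_to_table(TREE)
for row, p in TABLE.items():
    print(row, p, round(1 - p, 2))
```

The table needs eight rows, while this tree needs only four leaves, because the tree shares one parameter across all parent configurations that reach the same leaf.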

Representing CPDs for Continuous Variables
- we can also model the distribution of continuous variables in Bayesian networks
- one approach: linear Gaussian models
- X is normally distributed around a mean that depends linearly on the values u_i of its parents
[Figure: parent nodes U1, U2, …, Uk, each with an arc to X]
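A linear Gaussian CPD can be sketched as follows. The weights, bias and standard deviation are hypothetical values chosen for illustration; the slide does not give any.

```python
import random

def linear_mean(parent_values, weights, bias):
    """Mean of X: a linear function of the parent values u_1..u_k."""
    return bias + sum(w * u for w, u in zip(weights, parent_values))

def sample_x(parent_values, weights, bias, sigma, rng):
    """Draw X from a normal distribution centred on the linear mean."""
    return rng.gauss(linear_mean(parent_values, weights, bias), sigma)

# Hypothetical parameters for a node X with two parents.
rng = random.Random(0)
weights, bias, sigma = [0.5, -1.0], 0.2, 1.0
parent_values = [1.0, 2.0]

samples = [sample_x(parent_values, weights, bias, sigma, rng)
           for _ in range(20000)]
sample_average = sum(samples) / len(samples)
print(linear_mean(parent_values, weights, bias), sample_average)
```

With many samples, the sample average should approach the linear mean, which is what "normally distributed around a mean that depends linearly on its parents" implies.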

The Inference Task in Bayesian Networks
- Given: values for some variables in the network (evidence), and a set of query variables
- Do: compute the posterior distribution over the query variables
- variables that are neither evidence variables nor query variables are hidden variables
- the BN representation is flexible enough that any set can be the evidence variables and any set can be the query variables
[Figure: the lac network (L, G, I, C, lacI-unbound, CAP-bound, Z) with an example assignment in which some variables are observed (values "present" and "low"), one is the query ("?") and the rest are hidden]
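One simple way to carry out this task on a small network is inference by enumeration: sum the factored joint over all hidden variables and normalize. The sketch below does this for the lac network. All CPD numbers are hypothetical, the parents of CAP-bound (G and C) are assumed, and the short names U (lacI-unbound) and B (CAP-bound) are introduced only for this example. The slides do not prescribe this algorithm; it is used here because it is the most direct one.

```python
from itertools import product

# Pr(variable = present) for the root nodes (hypothetical values).
ROOT_PRIOR = {"L": 0.1, "G": 0.5, "I": 0.9, "C": 0.9}

def pr_unbound(L, I):
    """Pr(lacI-unbound = true | L, I), hypothetical."""
    return 0.1 if (I and not L) else 0.9

def pr_cap_bound(G, C):
    """Pr(CAP-bound = true | G, C), hypothetical; parents are assumed."""
    return 0.9 if (C and not G) else 0.1

# Pr(Z | lacI-unbound, CAP-bound), hypothetical.
Z_CPD = {
    (True, True): {"high": 0.8, "low": 0.1, "absent": 0.1},
    (True, False): {"high": 0.1, "low": 0.8, "absent": 0.1},
    (False, True): {"high": 0.1, "low": 0.1, "absent": 0.8},
    (False, False): {"high": 0.05, "low": 0.05, "absent": 0.9},
}

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(L, G, I, C, U, B, Z):
    """Factored joint: the product of every node's CPD entry."""
    p = 1.0
    for name, value in zip("LGIC", (L, G, I, C)):
        p *= bern(ROOT_PRIOR[name], value)
    p *= bern(pr_unbound(L, I), U)
    p *= bern(pr_cap_bound(G, C), B)
    return p * Z_CPD[(U, B)][Z]

def posterior_L(evidence):
    """Pr(L | evidence), summing out everything not in the evidence."""
    names = ["L", "G", "I", "C", "U", "B"]
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        for z in ("high", "low", "absent"):
            world["Z"] = z
            if all(world[k] == v for k, v in evidence.items()):
                totals[world["L"]] += joint(**world)
    norm = totals[True] + totals[False]
    return {value: total / norm for value, total in totals.items()}

print(posterior_L({}))             # prior over L
print(posterior_L({"Z": "high"}))  # posterior after observing high lacZ
```

Enumeration is exponential in the number of hidden variables, so it is only practical for networks of this size; it is shown here because it maps directly onto the definition of the posterior.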

The Parameter Learning Task
- Given: a set of training instances, the graph structure of a BN
- Do: infer the parameters of the CPDs
- this is straightforward when there are no missing values or hidden variables
[Figure: the lac network next to a table of training instances with columns L, G, I, C, lacI-unbound, CAP-bound, Z and example values such as present, absent, true, false, low, high]
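With complete data, each CPD entry can be estimated by counting: the number of instances with a given child value and parent configuration, divided by the number of instances with that parent configuration. The sketch below shows this for Pr( Z | lacI-unbound, CAP-bound ). The training instances are made up for illustration, and the optional pseudocount is an addition not mentioned in the slides.

```python
from collections import Counter

def estimate_cpd(instances, child, parents, child_values, pseudocount=0.0):
    """Estimate Pr(child | parents) by relative frequency."""
    family_counts = Counter()
    parent_counts = Counter()
    for row in instances:
        key = tuple(row[p] for p in parents)
        family_counts[(key, row[child])] += 1
        parent_counts[key] += 1
    cpd = {}
    for key, n in parent_counts.items():
        denom = n + pseudocount * len(child_values)
        cpd[key] = {
            value: (family_counts[(key, value)] + pseudocount) / denom
            for value in child_values
        }
    return cpd

# Hypothetical, fully observed training instances.
DATA = (
    [{"lacI-unbound": True, "CAP-bound": True, "Z": "high"}] * 3
    + [{"lacI-unbound": True, "CAP-bound": True, "Z": "low"}] * 1
    + [{"lacI-unbound": True, "CAP-bound": False, "Z": "low"}] * 2
    + [{"lacI-unbound": False, "CAP-bound": False, "Z": "absent"}] * 2
)

Z_VALUES = ("high", "low", "absent")
cpd_z = estimate_cpd(DATA, "Z", ["lacI-unbound", "CAP-bound"], Z_VALUES)
smoothed = estimate_cpd(DATA, "Z", ["lacI-unbound", "CAP-bound"],
                        Z_VALUES, pseudocount=1.0)
print(cpd_z[(True, True)])
print(smoothed[(True, True)])
```

Parent configurations that never occur in the data get no row at all, and unseen child values get probability zero unless a pseudocount is used. Missing values or hidden variables break this simple counting scheme, which is why the slide singles out the complete-data case.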

The Structure Learning Task
- Given: a set of training instances
- Do: infer the graph structure (and perhaps the parameters of the CPDs too)
[Figure: a table of training instances with columns L, G, I, C, lacI-unbound, CAP-bound, Z and example values such as present, absent, true, false, low, high]
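The slides do not say how the structure is to be inferred. One common family of approaches scores candidate structures against the data and keeps the best one. The sketch below illustrates that idea on a single node, comparing candidate parent sets for Z with a BIC score (log-likelihood minus a penalty on the number of parameters). The data, the short names U and B, and the choice of BIC are all assumptions made for illustration.

```python
import math
from collections import Counter

# U stands for lacI-unbound, B for CAP-bound (names used only here).
CARDINALITY = {"U": 2, "B": 2, "Z": 3}

def bic_score(instances, child, parents):
    """Maximized log-likelihood of child given parents, minus BIC penalty."""
    n = len(instances)
    family = Counter(
        (tuple(r[p] for p in parents), r[child]) for r in instances
    )
    parent_config = Counter(tuple(r[p] for p in parents) for r in instances)
    log_likelihood = sum(
        count * math.log(count / parent_config[key])
        for (key, _), count in family.items()
    )
    n_params = (CARDINALITY[child] - 1) * math.prod(
        CARDINALITY[p] for p in parents
    )
    return log_likelihood - 0.5 * math.log(n) * n_params

# Hypothetical data in which Z depends on U only; B is irrelevant.
DATA = [
    {"U": u, "B": b, "Z": "high" if u else "low"}
    for u in (True, False)
    for b in (True, False)
    for _ in range(10)
]

CANDIDATES = [(), ("U",), ("B",), ("U", "B")]
scores = {c: bic_score(DATA, "Z", c) for c in CANDIDATES}
best = max(scores, key=scores.get)
print(scores)
print(best)
```

Adding B as a second parent does not improve the fit but doubles the parameter count, so the penalty term makes the smaller parent set win. A full structure learner would also have to search over whole graphs and enforce acyclicity, which this sketch omits.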