Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001


Computational Biology Lecture #11: Inferring Regulatory Networks from Gene Expression Data
Bud Mishra, Professor of Computer Science and Mathematics
12 ¦ 3 ¦ 2001
9/16/2018 ©Bud Mishra, 2001

Regulatory Networks
- All cells in an organism have the same genomic data, but the proteins synthesized in each vary according to cell type, time, and environmental factors.
- There are networks of interactions among the various biochemical entities in a cell (DNA, RNA, proteins, small molecules).
- Can we infer the networks of interactions among genes?

Gene Regulation
[Figure: gene regulation pathway. Labels: DNA; transcription; mRNA; transport to cytosol; nonphosphorylated protein; post-translational modifications; transport to nucleus.]

Regulatory Networks
Many regulatory interactions occur after transcription, but we will focus on transcriptional regulation:
- It plays a major role in the regulation of protein synthesis.
- We can measure mRNA levels relatively easily.

Transcriptional Regulation: Example: The lac Operon
[Figure: the lac operon. Labels: regulatory regions (P, O); regions coding for proteins (lacI, lacZ, lacY, lacA); diffusible regulatory proteins; RNA polymerase; mRNA + ribosomes; protein products I, Z, Y, A.]

Transcriptional Regulation: Example: The lac Operon
[Figure: the lac operon with lactose absent. Labels: regulatory regions (P, O); regions coding for proteins (lacI, lacZ, lacY, lacA); diffusible regulatory proteins (I); RNA polymerase, which binds but cannot move to transcribe; mRNA + ribosomes; no mRNA.]
When lactose is absent, the protein encoded by lacI represses transcription of the lac operon.

Transcriptional Regulation: Example: The lac Operon
[Figure: the lac operon with lactose. Labels: regulatory regions (P, O); regions coding for proteins (lacI, lacZ, lacY, lacA); diffusible regulatory proteins; RNA polymerase; mRNA + ribosomes; protein products I, Z, Y, A; lactose; conformational change; blocked.]

Inferring Regulatory Networks
Given: temporal expression data for a set of genes.
Infer: the network of regulatory relationships among the genes.

Regulatory Network Models
- Boolean networks: Kauffman ’93; Liang, Fuhrman & Somogyi ’98
- Differential equations: Chen, He & Church ’99
- Bayesian networks: Friedman et al. ’99
- Weight matrices: Weaver, Workman & Stormo ’99

Inferring Regulatory Networks with Weight Matrices
Overview:
- Assume discrete time steps.
- $u(t)$ is a vector representing the expression levels of $n$ genes at time $t$.
- Build a model for predicting $u(t+1)$ given $u(0), u(1), \ldots, u(t)$.

Overview of the model
- $u(t)$: input expression levels at time $t$
- $r(t)$: determine the net regulation of each gene at time $t$
- $x(t)$: determine the response of each gene at time $t$
- $u(t+1)$: predict the input expression levels at time $t+1$
[Figure: pipeline $u(t) \to r(t) \to x(t) \to u(t+1)$]

Determining the Net Regulation of Each Gene
Model the regulative interactions among genes with a weight matrix:
$r_i(t) = \sum_j w_{ij} u_j(t)$
- $r_i(t)$ = regulatory input to gene $i$
- $w_{ij}$ = regulatory influence of gene $j$ on gene $i$
- $u_j(t)$ = expression level of gene $j$

Determining the Response of Each Gene
[Figure: $r(t) \to x(t+1)$]
The regulatory input to each gene determines its response through a sigmoid-like (“squashing”) function:
$x_i(t+1) = \left[1 + \exp(-r_i(t) - b_i)\right]^{-1}$

Determining the Response of Each Gene
The $b_i$ parameter represents the predisposition of gene $i$ in the absence of any regulative input (its basal rate). We can represent it as just another weight, connected to a “gene” that is always completely on:
$x_i(t+1) = \left[1 + \exp\{-(\sum_j w_{ij} u_j(t) + b_i)\}\right]^{-1}$
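
The remark that the basal term can be treated as one more weight can be checked numerically. The following is a minimal sketch, assuming plain Python lists for the weight matrix; the function names and the example numbers are hypothetical and not taken from the lecture.

```python
def net_input_with_bias(W, b, u):
    """Compute sum_j w_ij * u_j + b_i for every gene i, with b kept separate."""
    return [sum(w * x for w, x in zip(row, u)) + b_i for row, b_i in zip(W, b)]

def net_input_augmented(W_aug, u):
    """Same quantity, with b_i stored as an extra weight column."""
    u_aug = list(u) + [1.0]  # the extra "gene" that is always completely on
    return [sum(w * x for w, x in zip(row, u_aug)) for row in W_aug]

# Hypothetical two-gene example.
W = [[0.5, -1.0],
     [2.0, 0.0]]
b = [0.25, -0.5]
u = [1.0, 2.0]

W_aug = [row + [b_i] for row, b_i in zip(W, b)]  # append b_i as the last column
explicit = net_input_with_bias(W, b, u)
folded = net_input_augmented(W_aug, u)
```

Both forms give the same argument to the squashing function, so a fitting procedure only has to estimate one augmented weight matrix.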

Predicting the Expression Level of Each Gene at Time t+1
- The response of each gene is a value in [0, 1].
- Convert this relative level into a real unit of expression.
- Allow different levels of maximal expression for each gene.

Predicting the Expression Level of Each Gene at Time t+1
$u_i(t+1) = m_i x_i(t+1)$
- $u_i$ = expression level of gene $i$
- $m_i$ = maximal expression level for gene $i$
- $x_i$ = response of gene $i$

Putting it Together
$u_i(t+1) = \frac{m_i}{1 + \exp\{-(\sum_j w_{ij} u_j(t) + b_i)\}}$
- $u_i(t+1)$ = expression level of gene $i$ at time $t+1$
- $m_i$ = maximal expression level for gene $i$
- $\sum_j w_{ij} u_j(t)$ = regulatory input to gene $i$
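
The combined update can be written as a short function. This is a sketch under stated assumptions: the name `predict_next`, the list-of-lists layout for the weight matrix, and the two-gene example values are all illustrative choices, not part of the original slides.

```python
import math

def predict_next(u, W, b, m):
    """One step of the weight-matrix model: map u(t) to u(t+1)."""
    n = len(u)
    u_next = []
    for i in range(n):
        r_i = sum(W[i][j] * u[j] for j in range(n))    # net regulatory input r_i(t)
        x_i = 1.0 / (1.0 + math.exp(-(r_i + b[i])))    # squashed response x_i(t+1)
        u_next.append(m[i] * x_i)                      # scale by maximal expression m_i
    return u_next

# Hypothetical two-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
W = [[0.0, -1.0],
     [1.0, 0.0]]
b = [0.0, 0.0]
m = [2.0, 4.0]

u1 = predict_next([0.0, 0.0], W, b, m)  # zero input and zero bias give half-maximal output
u2 = predict_next(u1, W, b, m)
```

With no regulatory input and zero basal term, each gene sits at half of its maximal level, which is a quick sanity check on the formula.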

Including Environmental Variables
One can represent environmental variables (e.g., the concentration of lactose) as follows:
- Extend the input vector to include $n$ genes and $p$ environmental variables.
- Extend the weight matrix so that each gene is connected to the $p$ environmental variables.
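
A sketch of the extension described above, assuming the weight matrix becomes $n \times (n+p)$ with the environmental columns placed last. The function name, the column ordering, and the single-gene lactose example are hypothetical.

```python
import math

def step_with_environment(u, env, W_ext, b, m):
    """Predict u(t+1) from n gene levels plus p environmental values.

    W_ext has one row per gene and n + p columns (genes first, then environment).
    """
    inputs = list(u) + list(env)
    u_next = []
    for row, b_i, m_i in zip(W_ext, b, m):
        r_i = sum(w * v for w, v in zip(row, inputs))
        u_next.append(m_i / (1.0 + math.exp(-(r_i + b_i))))
    return u_next

# Hypothetical case: one gene, one environmental variable (lactose level)
# that activates the gene with weight 3.0.
W_ext = [[0.0, 3.0]]
without_lactose = step_with_environment([1.0], [0.0], W_ext, [0.0], [10.0])
with_lactose = step_with_environment([1.0], [1.0], W_ext, [0.0], [10.0])
```

Environmental variables appear only as inputs here; the model does not predict their next values.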

Learning the Parameters of the Model
Given: a time series of expression measurements $u(0), \ldots, u(t), u(t+1)$, taken as pairs $\langle u(t), u(t+1) \rangle$.
Find: the $w_{ij}$ parameters so that the data are closely modeled.
The model can be fit with the “back-propagation” algorithm, as in a feed-forward neural network.
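
For a model with a single layer of sigmoid units, back-propagation reduces to gradient descent on the output error. The sketch below fits one gene at a time on squared error; the learning rate, epoch count, function names, and synthetic data are assumptions made for illustration, and the lecture does not specify them.

```python
import math

def predict(w, b, m, u):
    """Model output m / (1 + exp(-(w . u + b))) for a single gene."""
    r = sum(w_j * u_j for w_j, u_j in zip(w, u))
    return m / (1.0 + math.exp(-(r + b)))

def squared_error(pairs, w, b, m):
    return sum((predict(w, b, m, u) - target) ** 2 for u, target in pairs)

def fit_gene(pairs, m, n_inputs, lr=0.05, epochs=2000):
    """Stochastic gradient descent; pairs is a list of (u(t), u_i(t+1))."""
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for u, target in pairs:
            pred = predict(w, b, m, u)
            x = pred / m                                 # response in (0, 1)
            delta = (pred - target) * m * x * (1.0 - x)  # d(error)/d(r + b), up to a factor 2
            w = [w_j - lr * delta * u_j for w_j, u_j in zip(w, u)]
            b -= lr * delta
    return w, b

# Synthetic pairs from hypothetical "true" parameters.
true_w, true_b, m_i = [1.5, -2.0], 0.5, 1.0
inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [1.0, 0.5]]
pairs = [(u, predict(true_w, true_b, m_i, u)) for u in inputs]

initial_error = squared_error(pairs, [0.0, 0.0], 0.0, m_i)
w_hat, b_hat = fit_gene(pairs, m_i, n_inputs=2)
fitted_error = squared_error(pairs, w_hat, b_hat, m_i)
```

Because each gene's parameters enter only its own output, the genes can be fitted independently in this way.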

Learning the Parameters: Linear Algebra Approach
Weaver et al.: an example of a linear algebraic approach.
- The model for each gene is independent, so one can determine the best weights for gene $i$, then the best weights for gene $j$, and so on.
- Set up a linear problem for determining the weights for each gene $i$.

Overview of the model
- $u(t)$: input expression levels at time $t$
- $r(t)$: determine the net regulation of each gene at time $t$
- $x(t)$: determine the response of each gene at time $t$
- $u(t+1)$: predict the input expression levels at time $t+1$
[Figure: pipeline $u(t) \to r(t) \to x(t) \to u(t+1)$]

Linear Algebra
Learning the parameters, alternatively:
$\begin{bmatrix} u_1(0) & \cdots & u_n(0) \\ \vdots & \ddots & \vdots \\ u_1(t) & \cdots & u_n(t) \end{bmatrix} \begin{bmatrix} w_{i1} \\ \vdots \\ w_{in} \end{bmatrix} = \begin{bmatrix} r_i(0) \\ \vdots \\ r_i(t) \end{bmatrix}$
$U w_i = r_i \Rightarrow w_i = U^{-1} r_i$
Use singular value decomposition to calculate the inverse of $U$.
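
A sketch of the linear solve with NumPy. Several points are assumptions rather than statements from the slides: the regulatory inputs are recovered by inverting the sigmoid on the observed $u_i(t+1)$, the maximal level $m_i$ is taken as known, the basal term is estimated through an added column of ones, and `numpy.linalg.pinv` (an SVD-based pseudo-inverse) stands in for "the inverse of $U$", since $U$ is generally not square. The rows of `U` are random here for illustration instead of being consecutive time points.

```python
import numpy as np

def solve_weights(U, u_next_i, m_i):
    """Estimate (w_i, b_i) for one gene.

    U:        (T, n) array, row t holds the expression levels u(t)
    u_next_i: (T,) array, observed level of gene i at time t+1
    m_i:      maximal expression level of gene i (assumed known)
    """
    x = u_next_i / m_i
    r_plus_b = np.log(x / (1.0 - x))                    # invert the squashing function
    U_aug = np.hstack([U, np.ones((U.shape[0], 1))])    # ones column carries b_i
    solution = np.linalg.pinv(U_aug) @ r_plus_b         # least-squares solution via SVD
    return solution[:-1], solution[-1]

# Noise-free synthetic check with hypothetical parameters.
rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, size=(20, 3))
w_true = np.array([1.0, -2.0, 0.5])
b_true, m_i = 0.3, 5.0
u_next_i = m_i / (1.0 + np.exp(-(U @ w_true + b_true)))

w_hat, b_hat = solve_weights(U, u_next_i, m_i)
```

With noisy measurements the same call returns the least-squares fit, not an exact recovery.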

Experimental Methodology
- Generate random weight matrix models.
- Use each model to generate data: $\langle u(t), u(t+1) \rangle$ pairs.
- See how well the method recovers the “correct” model.

Experimental Methodology
Generate random regulatory networks:
- The number of genes ($n$) ranged from 10 to 200.
- Each gene had a set maximal expression level.
- Several parameters controlled the distribution of weights:
  - average % of non-zero weights in a row
  - max and min for the absolute value of the weights
- Normally distributed noise was introduced into the inputs.
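
One way such random networks could be generated is sketched below. The details are assumptions: every row gets the same number of non-zero weights (the slides only give an average), signs are chosen with equal probability, magnitudes are uniform between the given bounds, and the function names are hypothetical.

```python
import random

def random_weight_matrix(n, frac_nonzero, w_min, w_max, rng):
    """Sparse n x n weight matrix with |w| in [w_min, w_max] for non-zero entries."""
    k = max(1, round(frac_nonzero * n))        # non-zero weights per row
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in rng.sample(range(n), k):      # which regulators act on gene i
            sign = rng.choice([-1.0, 1.0])     # repressor or activator
            W[i][j] = sign * rng.uniform(w_min, w_max)
    return W

def add_noise(u, sigma, rng):
    """Add normally distributed noise to a vector of input expression levels."""
    return [v + rng.gauss(0.0, sigma) for v in u]

rng = random.Random(0)
W_example = random_weight_matrix(10, 0.2, 0.5, 2.0, rng)
noisy_u = add_noise([1.0] * 10, 0.05, rng)
```

Seeding the generator makes each synthetic network reproducible, which helps when comparing recovery across settings.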

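Experimental Methodology
The method was evaluated according to how well it identified non-zero weights (i.e., correctly identified gene interactions). Specifically, consider:
Sensitivity = TP/(TP+FP)
- TP = true positives = number of correctly predicted non-zero weights
- FP = false positives = number of incorrectly predicted non-zero weights
The ratio TP/(TP+FP) is more commonly called precision (positive predictive value); sensitivity usually denotes TP/(TP+FN), where FN counts the non-zero weights that were missed.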
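
The counts behind this measure can be computed by comparing the true and the recovered weight matrices entry by entry. This is a minimal sketch; the tolerance `tol` used to decide whether a recovered weight counts as non-zero and the function name are assumptions.

```python
def score_nonzero_weights(W_true, W_pred, tol=1e-6):
    """Return (TP, FP, FN) for the prediction of which weights are non-zero."""
    tp = fp = fn = 0
    for row_true, row_pred in zip(W_true, W_pred):
        for w_true, w_pred in zip(row_true, row_pred):
            predicted = abs(w_pred) > tol
            actual = w_true != 0.0
            if predicted and actual:
                tp += 1          # interaction correctly predicted
            elif predicted:
                fp += 1          # predicted interaction that does not exist
            elif actual:
                fn += 1          # real interaction that was missed
    return tp, fp, fn

# Hypothetical 2 x 2 example.
W_true = [[1.0, 0.0],
          [0.0, -2.0]]
W_pred = [[0.9, 0.1],
          [0.0, 0.0]]
tp, fp, fn = score_nonzero_weights(W_true, W_pred)
ratio = tp / (tp + fp)   # the TP/(TP+FP) measure defined on the slide
```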

Results
- More training data ⇒ more accurate models.
- Sparse networks ⇒ more accurate models.
- False positive (non-zero) weights are about 10 times smaller than true positive…
- Sensitivity > 90%.

Limitations of Approach
- Assumption that all gene interactions are independent of one another.
- Assumption about regular discrete time evolution.
- Assumption that a gene’s maximal expression level is known or can be estimated.
- The model accounts only for transcriptional regulation.

Bayesian Networks
Friedman, Linial, Nachman & Pe’er ’00
- Learned Bayesian network models from Stanford yeast cell-cycle data.
- 76 measurements of 6177 genes.
- Focused on 800 genes whose expression varied over the cell-cycle stages.

Bayesian Networks
[Figure: a network over the nodes E, A, B, C, D. Nodes represent gene activities; edges represent dependencies.]
Conditional probability table for B given E and A:
E  A  Pr[B|E,A]  Pr[¬B|E,A]
0  0  0.3        0.7
0  1  0.4        0.6
1  0  0.7        0.3
1  1  0.1        0.9
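
The conditional probability table can be stored as a lookup keyed by the parent values. The sketch below also marginalizes over E and A; the priors of 0.5 and the assumption that E and A are independent are hypothetical, since the slide gives only the table for B.

```python
# Pr[B = 1 | E, A], keyed by (E, A); Pr[B = 0 | E, A] is the complement.
cpt_b = {
    (0, 0): 0.3,
    (0, 1): 0.4,
    (1, 0): 0.7,
    (1, 1): 0.1,
}

def prob_b(b_value, e, a):
    """Pr[B = b_value | E = e, A = a]."""
    p_true = cpt_b[(e, a)]
    return p_true if b_value == 1 else 1.0 - p_true

def marginal_b(p_e, p_a):
    """Pr[B = 1], assuming E and A are independent with Pr[E=1]=p_e, Pr[A=1]=p_a."""
    total = 0.0
    for e in (0, 1):
        for a in (0, 1):
            weight = (p_e if e else 1.0 - p_e) * (p_a if a else 1.0 - p_a)
            total += weight * cpt_b[(e, a)]
    return total

p_b = marginal_b(0.5, 0.5)  # hypothetical priors
```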

Representing Partial Models
Since there is little data and many variables, focus on finding “features” common to lots of models that could explain the data:
- Markov relations: is Y in the Markov blanket of X? (X, given its Markov blanket, is independent of the other variables in the network.)
- Order relations: is X an ancestor of Y?
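
Both kinds of feature can be read off a network once its structure is fixed. The sketch below represents a directed acyclic graph as a mapping from each node to its parents; the particular edges are hypothetical, and the Markov blanket is computed in the usual way as parents, children, and the children's other parents.

```python
def children(parents, x):
    """Nodes that list x among their parents."""
    return {child for child, ps in parents.items() if x in ps}

def markov_blanket(parents, x):
    """Parents of x, children of x, and the other parents of those children."""
    blanket = set(parents[x])
    for child in children(parents, x):
        blanket.add(child)
        blanket |= set(parents[child])
    blanket.discard(x)
    return blanket

def is_ancestor(parents, x, y):
    """True if there is a directed path from x to y."""
    stack, seen = list(parents[y]), set()
    while stack:
        node = stack.pop()
        if node == x:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

# Hypothetical structure: E -> B, A -> B, B -> C, A -> D.
parents = {"E": [], "A": [], "B": ["E", "A"], "C": ["B"], "D": ["A"]}

blanket_of_e = markov_blanket(parents, "E")
blanket_of_b = markov_blanket(parents, "B")
e_before_c = is_ancestor(parents, "E", "C")
```

Note that A lands in the Markov blanket of E only because both are parents of B, which is why Markov relations are not the same as direct edges.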

Estimating Confidence in Features
Bootstrap method:
For i = 1 to m:
  - Sample (with replacement) expression experiments.
  - Learn a Bayesian network from this sample.
The confidence in a feature is the fraction of the m models in which it was represented.
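
The bootstrap loop itself is independent of how each network is learned, so it can be sketched with the learner passed in as a function. Everything named here is an assumption: `toy_learner` is a stand-in that is not a Bayesian network learner, and the experiment records are made up to keep the example small.

```python
import random
from collections import Counter

def bootstrap_confidence(experiments, learn_features, m, rng):
    """Fraction of m bootstrap models in which each feature appears."""
    counts = Counter()
    for _ in range(m):
        resample = rng.choices(experiments, k=len(experiments))  # sample with replacement
        counts.update(learn_features(resample))                  # features of this model
    return {feature: count / m for feature, count in counts.items()}

def toy_learner(experiments):
    """Stand-in learner: report the pair (g1, g2) when their mean product is positive."""
    mean_product = sum(e["g1"] * e["g2"] for e in experiments) / len(experiments)
    return {("g1", "g2")} if mean_product > 0 else set()

# Hypothetical expression experiments for two genes.
experiments = [
    {"g1": 1.0, "g2": 2.0},
    {"g1": -1.0, "g2": -0.5},
    {"g1": 0.5, "g2": 0.1},
]
confidence = bootstrap_confidence(experiments, toy_learner, m=50, rng=random.Random(0))
```

In a real analysis `learn_features` would fit a network to the resampled experiments and return its Markov and order relations.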

Biological Analysis
Using confidence in order relations, the approach identified “dominant genes”:
- Several of these are known to be involved in cell-cycle control.
- Several have non-viable null mutants.
- Many encode proteins involved in replication, sporulation, and budding.
Assessing confident Markov relations:
- Most pairs are functionally related.