Presentation is loading. Please wait.

Presentation is loading. Please wait.

Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001

Similar presentations


Presentation on theme: "Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001"— Presentation transcript:

1 Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001
Computational Biology Lecture #11: Inferring Regulatory Networks from Gene Expression Data Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001 9/16/2018 ©Bud Mishra, 2001

2 Regulatory Networks ©Bud Mishra, 2001
All cells in an organism have the same genomic data, but the proteins synthesized in each vary according to cell type, time and environmental factors There are network of interactions among various biochemical entities in a cell (DNA RNA, protein, small moleules) Can we infer the networks of interactions among genes? 9/16/2018 ©Bud Mishra, 2001

3 Gene Regulation ©Bud Mishra, 2001 DNA Transport to cytosol
transcription mRNA Nonphosphorylated protein Transport to nucleus Nonphosphorylated protein Post-translational modifications Nonphosphorylated protein 9/16/2018 ©Bud Mishra, 2001

4 Regulatory Networks ©Bud Mishra, 2001
There are lots of regulatory interactions that occur after transcription. But we will focus on transcriptional regulation: It plays a major role in the regulation of protein synthesis We can measure mRNA levels relatively easily 9/16/2018 ©Bud Mishra, 2001

5 Transcriptional Regulation: Example: The lac Operon
Regions coding for proteins Regulatory Regions Diffusable regulatory proteins RNA polymerase P O lacZ lacI lacY lacA I Z Y A mRNA + ribosomes 9/16/2018 ©Bud Mishra, 2001

6 Transcriptional Regulation: Example: The lac Operon
Binds but cannot move to transcribe Regions coding for proteins Regulatory Regions Diffusable regulatory proteins RNA polymerase I lacI P P O lacZ lacY lacA mRNA + ribosomes No mRNA I When lactose is absent, the protein encoded by lacI represses transcription of the lac operon 9/16/2018 ©Bud Mishra, 2001

7 Transcriptional Regulation: Example: The lac Operon
Regions coding for proteins Regulatory Regions Diffusable regulatory proteins RNA polymerase P O lacZ lacI lacY lacA I Z Y A mRNA + ribosomes Lactose Confirmational change Blocked 9/16/2018 ©Bud Mishra, 2001

8 Inferring Regulatory Network
Given: Temporal expression data for a set of genes Infer: The network of regulatory relationship among the genes 9/16/2018 ©Bud Mishra, 2001

9 Regulatory Network Models
Boolean Networks Kaufmann ’93, Liang, Fuhrman & Somogyi ’98 Differential Equations Chen, He & Church ’99 Bayesian Networks Friedman et al. ’99 Weight Matrices Weaver, Workman & Stormo ‘99 9/16/2018 ©Bud Mishra, 2001

10 Inferring Regulatory Networks with Weight Matrices
Overview: Assume discrete time steps u(t) is a vector representing the expression level of n genes at time t Build a model for predicting u(t+1) given u(0), u(1),…, u(t) 9/16/2018 ©Bud Mishra, 2001

11 Overview of the model ©Bud Mishra, 2001
u(t) Input expression levels at time t r(t) Determine net regulation of each gene at time t x(t) Determine response of each gene at time t u(t+1) predict input expression levels at time t+1 u(t) r(t) x(t) u(t+1) 9/16/2018 ©Bud Mishra, 2001

12 Determining the Net Regulation of Each Gene
Model regulative interactions among genes with a weight matrix ri(t) = åj wij uj(t) ri(t) = Regulatory input to i wij = Regulatory influence of j on i uj(t) = Expression level of j 9/16/2018 ©Bud Mishra, 2001

13 Determining the Response of Each Gene
r(t) x(t+1) The regulatory input to each gene determines its response through a sigmoid-like (“squashing”) function. xi(t+1) = [1+ exp(-ri(t) – bi)]-1 9/16/2018 ©Bud Mishra, 2001

14 Determining the Response of Each Gene
The bi parameter represents the predisposition of the gene in the absence of any regulative input (its basal rate) We can represent it as just another weight connected to a “gene” that is always completely on. xi(t+1) = [1 +exp{ -(åj wij uj(t) + bi)}]-1 9/16/2018 ©Bud Mishra, 2001

15 Predicting the Expression Level of Each Gene at Time t+1
The response of each gene is a value in [0,1]. Convert this relative level into a real unit of expression Allow different levels of maximal expression for each gene 9/16/2018 ©Bud Mishra, 2001

16 Predicting the Expression Level of Each Gene at Time t+1
ui(t+1) = mi xi(t+1) ui = Expression level of i mi = Maximal expression level for i xi = Response of i 9/16/2018 ©Bud Mishra, 2001

17 Maximal expression level for i Expression level of gene i at time t+1
Putting it Together ui(t+1) = mi/ [1 +exp{ -(åj wij uj(t) + bi)}] Maximal expression level for i Regulatory input to i Expression level of gene i at time t+1 9/16/2018 ©Bud Mishra, 2001

18 Including Environmental Variables
One can represent environmental variables (e.g., the concentration of lactose) as follows: Extend input vector to include n genes and p environmental variables Extend weight matrix so that each gene is connected to p environmental variables 9/16/2018 ©Bud Mishra, 2001

19 Learning the Parameters of the Model
Given A time series of expression measurements u(0), …, u(t), u(t+1): Pairs h u(t), u(t+1) i Find The wij parameters so that the data are closely modeled. This model can be solved with “back-propagation” algorithm as in a feed-forward neural network 9/16/2018 ©Bud Mishra, 2001

20 Learning the Parameters: Linear Algebra Approach
Weaver et al: Example of a linear algebraic approach The model for each gene is independent So one can determine the best weights for gene i, Then the best weight for gene j etc… Set up a linear problem or determining the weights for each gene i 9/16/2018 ©Bud Mishra, 2001

21 Overview of the model ©Bud Mishra, 2001
u(t) Input expression levels at time t r(t) Determine net regulation of each gene at time t x(t) Determine response of each gene at time t u(t+1) predict input expression levels at time t+1 u(t) r(t) x(t) u(t+1) 9/16/2018 ©Bud Mishra, 2001

22 [ ][ ]=[ ] Linear Algebra ©Bud Mishra, 2001 Learning the parameters:
Alternatively: U wi = ri ) wi = U-1ri Use singular value decomposition to calculate the inverse of U. u1(0) L un(0) wi ri(0) M O M M M u1(t) L un(t) win ri(t) [ ][ ]=[ ] 9/16/2018 ©Bud Mishra, 2001

23 Experimental Methodology
Generate random weight matrix models Use model to generate data h u(t), u(t+1) i pairs See how well the method recovers the “correct” model 9/16/2018 ©Bud Mishra, 2001

24 Experimental Methodology
Generate random regulatory networks # Genes (n) ranged from 10 to 200 Each had a set maximal expression level Several parameters to control the distribution of weights Average % of non-zero weights in a row Max and min for absolute value of weights Normally distributed noise is introduced into inputs. 9/16/2018 ©Bud Mishra, 2001

25 Experimental Methodology
Evaluated method according to how well it identified non-zero weights (I.e., correctly identified gene interactions) Specifically, consider: Sensitivity = TP/(TP+FP) TP= True Positive =#correctly predicted non zero weights FP=False Positives =#incorrectly predicted non zero weights 9/16/2018 ©Bud Mishra, 2001

26 Results ©Bud Mishra, 2001 More training data ) More accurate models
Sparse Networks ) More accurate models False positive (non-zero) weights about 10 times smaller than true positive… Sensitivity > 90% 9/16/2018 ©Bud Mishra, 2001

27 Limitations of Approach
Assumption that all gene interactions are independent of one another Assumption about regular discrete time evolution Assumption that a gene’s maximal expression level is known or can be estimated The model accounts only for transcriptional regulation 9/16/2018 ©Bud Mishra, 2001

28 Bayesian Networks ©Bud Mishra, 2001
Friedman, Linial, Nachman & Pe’er ‘2000 Learned Bayesian network models from Stanford yeast cell-cycle data 76 measurements of 6177 genes Focused on 800 genes whose expression varied over the cell-cycle stages 9/16/2018 ©Bud Mishra, 2001

29 Bayesian Networks E A D B C ©Bud Mishra, 2001
Edges represent dependencies Nodes represent gene activities E A E A Pr[B|E,A] Pr[: B|E,A] D B C 9/16/2018 ©Bud Mishra, 2001

30 Representing Partial Models
Since there is little data and many variables, focus on finding “features” common to lots of models that could explain the data Markov relations: Is Y in the Markov blanket of X? X, given its Markov blanket is independent of other variables in network Order relations: Is X an ancestor of Y? 9/16/2018 ©Bud Mishra, 2001

31 Estimating Confidence in Features
Bootstrap Method: For I = 1 to m Sample (with replacement) expression experiments Learn a Bayesian network from this sample The confidence in a feature is the fraction of the m models in which it was represented… 9/16/2018 ©Bud Mishra, 2001

32 Biological Analysis ©Bud Mishra, 2001
Using confidence in order relations, the approach identified “dominant genes” Several of these are known to be involved in cell-cycle control Several have non-viable null mutants Many encode proteins involved in replication, sporulation, budding Assessing confident Markov relations Most pairs are functionally related 9/16/2018 ©Bud Mishra, 2001


Download ppt "Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001"

Similar presentations


Ads by Google