Reverse engineering of regulatory networks Dirk Husmeier & Adriano Werhli.



Systems biology
Learning signalling pathways and regulatory networks from postgenomic data

Reverse Engineering of Regulatory Networks
Can we learn the network structure from the postgenomic data themselves?
Statistical methods to distinguish between
– Direct correlations
– Indirect correlations
Challenge: Distinguish between
– Correlations
– Causal interactions
Breaking symmetries with active interventions:
– Gene knockouts (VIGS, RNAi)
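The distinction between direct and indirect correlations can be illustrated with a first-order partial correlation. This is a minimal sketch, not taken from the slides; the function name and the example values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing the linear effect of Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical chain X -> Z -> Y: X and Y are correlated only through Z,
# so r_xy equals r_xz * r_yz and the partial correlation vanishes.
indirect = partial_correlation(r_xy=0.64, r_xz=0.8, r_yz=0.8)

# Hypothetical direct link between X and Y in addition to the path via Z.
direct = partial_correlation(r_xy=0.9, r_xz=0.8, r_yz=0.8)
```

A partial correlation near zero suggests that the marginal correlation is explained by the third variable; a value well away from zero suggests a direct dependence.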

Shrinkage estimation and the lemma of Ledoit-Wolf
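The formulas on this slide did not survive in the transcript. As a reminder, and as an assumption about what was shown, the usual shrinkage estimator combines the sample covariance matrix S with a structured target T, and the graphical Gaussian model reads edges off the partial correlations obtained from the inverse of the shrunk estimate. The symbols below are generic notation, not the authors' own.

```latex
\begin{align}
  S^{*} &= \lambda\, T + (1 - \lambda)\, S, \qquad 0 \le \lambda \le 1, \\
  \Omega &= \left(S^{*}\right)^{-1}, \\
  \rho_{ij} &= -\frac{\omega_{ij}}{\sqrt{\omega_{ii}\,\omega_{jj}}}, \qquad i \ne j .
\end{align}
```

The Ledoit-Wolf result supplies an analytic choice of the shrinkage intensity λ, so that no cross-validation is needed.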

Bayesian networks versus graphical Gaussian models
Directed versus undirected graphs
Score-based versus constraint-based inference

Evaluation
– On real experimental data, using the gold-standard network from the literature
– On synthetic data simulated from the gold-standard network

Evaluation: Raf signalling pathway
Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells
Deregulation → carcinogenesis
Extensively studied in the literature → gold-standard network

Data
Laboratory data from cytometry experiments
Down-sampled to 100 measurements
Sample size indicative of microarray experiments

Two types of experiments

Comparison with simulated data 1

Raf pathway

Comparison with simulated data 2

Steady-state approximation

Evaluation 1: AUC scores

Evaluation 2: TP scores
We set the threshold such that we obtained 5 spurious edges (5 FPs) and counted the corresponding number of true edges (TP count).
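The two evaluation criteria can be sketched as follows, given a score for every candidate edge and a gold-standard label. This is an illustration only: the function names are hypothetical, the AUC is computed through its pairwise-ranking form, and the TP count assumes that the threshold is placed just above the sixth false positive (the slides do not say how ties or the exact cut-off were handled).

```python
def auc_score(scores, is_true_edge):
    """Area under the ROC curve: the fraction of (true, false) edge pairs
    in which the true edge gets the higher score (ties count one half)."""
    pos = [s for s, t in zip(scores, is_true_edge) if t]
    neg = [s for s, t in zip(scores, is_true_edge) if not t]
    if not pos or not neg:
        raise ValueError("need at least one true and one false edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tp_count_at_fp(scores, is_true_edge, max_fp=5):
    """Number of true edges recovered when the threshold admits max_fp
    false positives. Assumes distinct scores."""
    ranked = sorted(zip(scores, is_true_edge), key=lambda pair: pair[0], reverse=True)
    tp = 0
    fp = 0
    for _, true_edge in ranked:
        if true_edge:
            tp += 1
        else:
            if fp == max_fp:
                break  # the next false positive would exceed the budget
            fp += 1
    return tp


# Hypothetical edge scores and gold-standard labels.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
example_labels = [True, False, True, False, True, False]
example_auc = auc_score(example_scores, example_labels)
example_tp = tp_count_at_fp(example_scores, example_labels, max_fp=1)
```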

AUC scores

TP scores

Raf pathway

Conclusions 1
– BNs and GGMs outperform RNs, most notably on Gaussian data.
– No significant difference between BNs and GGMs on observational data.
– For interventional data, BNs clearly outperform GGMs and RNs, especially when taking the edge direction (DGE score) rather than just the skeleton (UGE score) into account.

Conclusions 2
Performance on synthetic data is better than on real data:
– Real data are more complex
– Real interventions are not ideal
– Errors in the gold-standard network

Reconstructing gene regulatory networks with Bayesian networks by combining microarray data with biological prior knowledge

MOTIVATION

Use TF binding motifs in promoter sequences

Use prior knowledge from KEGG

Prior knowledge

Biological prior knowledge matrix
Biological prior knowledge: each entry indicates some knowledge about the relationship between genes i and j
Define the energy of a graph G

Prior distribution over networks
Energy of a network
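The equations for the energy and the prior are not preserved in the transcript. A common formulation, assumed here for illustration, measures the energy as the total absolute deviation between the prior knowledge matrix B (entries in [0, 1]) and the adjacency matrix G (entries 0 or 1), and lets the prior decay exponentially with that energy at a rate set by the hyperparameter β. The function names are hypothetical.

```python
def energy(B, G):
    """Assumed energy E(G) = sum over i != j of |B[i][j] - G[i][j]|."""
    n = len(B)
    return sum(abs(B[i][j] - G[i][j]) for i in range(n) for j in range(n) if i != j)


def log_prior_unnormalised(B, G, beta):
    """Assumed Gibbs form: log P(G | beta) = -beta * E(G) - log Z(beta).
    The normalising constant log Z(beta) is left out here."""
    return -beta * energy(B, G)


# Hypothetical prior knowledge: strong support for edge 0 -> 1, weak for 1 -> 0,
# and 0.5 (no knowledge) everywhere else.
B = [[0.0, 0.9, 0.5],
     [0.1, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
G_agrees = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]     # contains 0 -> 1
G_conflicts = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]  # contains 1 -> 0 instead
```

Under these assumptions a graph that agrees with the prior knowledge has lower energy and therefore higher prior probability, and β = 0 switches the prior off.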

Rewriting the energy
Energy of a network

Approximation of the partition function
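One way to approximate the partition function, sketched here as an assumption rather than as the slide's exact formula, is to drop the acyclicity constraint. With an energy that decomposes into per-node terms, the sum over graphs then factorises into a product over nodes of sums over parent sets. The function names and the optional fan-in limit are hypothetical.

```python
import itertools
import math


def local_energy(B, node, parents):
    """Contribution of one node: (1 - B[i][node]) for each parent i,
    B[i][node] for each non-parent i."""
    n = len(B)
    return sum((1.0 - B[i][node]) if i in parents else B[i][node]
               for i in range(n) if i != node)


def approx_log_partition(B, beta, max_fan_in=None):
    """log Z(beta), summing over all parent-set configurations and
    ignoring the requirement that the graph be acyclic."""
    n = len(B)
    log_z = 0.0
    for node in range(n):
        others = [i for i in range(n) if i != node]
        limit = len(others) if max_fan_in is None else min(max_fan_in, len(others))
        node_sum = 0.0
        for size in range(limit + 1):
            for parents in itertools.combinations(others, size):
                node_sum += math.exp(-beta * local_energy(B, node, set(parents)))
        log_z += math.log(node_sum)
    return log_z


# Hypothetical uninformative prior knowledge for three genes.
B_flat = [[0.5] * 3 for _ in range(3)]
log_z_flat = approx_log_partition(B_flat, beta=2.0)
```

The cost is a sum over parent sets per node instead of a sum over all graphs, which is why a fan-in limit is often imposed for larger networks.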

Multiple sources of prior knowledge

Rewriting the energy
Energy of a network

Approximation of the partition function

MCMC sampling scheme

Sample networks and hyperparameters from the posterior distribution
Metropolis-Hastings scheme
Proposal probabilities

Metropolis-Hastings scheme

MCMC with one prior
Sample graph and the parameter β.
BGe, BDe
Separate into two sampling steps to improve the acceptance:
1. Sample graph with β fixed.
2. Sample β with graph fixed.
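The second step, sampling β with the graph held fixed, can be sketched as a Metropolis-Hastings update. Everything specific here is an assumption not stated in the transcript: the prior has the form exp(−β E(G)) / Z(β) with E(G) the summed absolute deviation between B and G, Z is approximated by summing over all directed graphs without self-loops, β has a uniform hyperprior on [0, beta_max], and the proposal is a symmetric uniform window. All names and default values are hypothetical. The first step (graph moves scored with BGe or BDe) is not shown.

```python
import math
import random


def energy(B, G):
    """Assumed energy: sum over i != j of |B[i][j] - G[i][j]|."""
    n = len(B)
    return sum(abs(B[i][j] - G[i][j]) for i in range(n) for j in range(n) if i != j)


def log_partition(B, beta):
    """Approximate log Z(beta), ignoring acyclicity. Each ordered pair (i, j)
    contributes independently: edge absent costs B[i][j], present 1 - B[i][j]."""
    n = len(B)
    return sum(math.log(math.exp(-beta * B[i][j]) + math.exp(-beta * (1.0 - B[i][j])))
               for i in range(n) for j in range(n) if i != j)


def log_accept_ratio_beta(B, G, beta_old, beta_new):
    """log of P(G | beta_new) / P(G | beta_old); the likelihood cancels
    because the graph is unchanged."""
    return (-(beta_new - beta_old) * energy(B, G)
            + log_partition(B, beta_old) - log_partition(B, beta_new))


def sample_beta(B, G, beta, rng, beta_max=30.0, width=3.0):
    """One Metropolis-Hastings update of beta with the graph fixed."""
    proposal = beta + rng.uniform(-width, width)
    if not 0.0 <= proposal <= beta_max:
        return beta  # outside the support of the uniform hyperprior
    log_a = log_accept_ratio_beta(B, G, beta, proposal)
    if rng.random() < math.exp(min(0.0, log_a)):
        return proposal
    return beta


# Hypothetical three-gene example in which B agrees with the graph G.
G = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
B_good = [[0.9 if G[i][j] else 0.1 for j in range(3)] for i in range(3)]
rng = random.Random(0)
beta = 1.0
trace = []
for _ in range(200):
    beta = sample_beta(B_good, G, beta, rng)
    trace.append(beta)
```

When the prior knowledge agrees with the current graph, moves towards larger β tend to be accepted; when B is 0.5 everywhere the ratio is exactly one and β simply wanders over its allowed range.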

Approximation of the partition function

MCMC with two priors
Sample graph and the parameters β1 and β2.
Separate into three sampling steps to improve the acceptance:
1. Sample graph with β1 and β2 fixed.
2. Sample β1 with graph and β2 fixed.
3. Sample β2 with graph and β1 fixed.
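The slide's formula for the two-prior case is not preserved. A natural extension of the single-prior Gibbs form, stated here as an assumption, gives each source of prior knowledge its own energy and its own hyperparameter:

```latex
\begin{align}
  P(G \mid \beta_1, \beta_2)
    &= \frac{\exp\!\bigl(-\left[\beta_1 E_1(G) + \beta_2 E_2(G)\right]\bigr)}{Z(\beta_1, \beta_2)}, \\
  Z(\beta_1, \beta_2)
    &= \sum_{G} \exp\!\bigl(-\left[\beta_1 E_1(G) + \beta_2 E_2(G)\right]\bigr).
\end{align}
```

Under this form, each of the three steps changes only one of G, β1 and β2 while conditioning on the other two.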

Application to real data

Flow cytometry data and KEGG

Data available:
– Intracellular multicolour flow cytometry.
– Measured protein concentrations.
– 1200 data points.
We sample 5 data sets with 100 data points each.
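Drawing the 5 data sets could look like the sketch below. Whether the original subsets were disjoint is not stated; this version draws each set independently, without replacement within a set. The function name and the seed are hypothetical.

```python
import random


def draw_subsets(n_points=1200, n_sets=5, set_size=100, seed=0):
    """Return n_sets lists of row indices, each of length set_size."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_points), set_size)) for _ in range(n_sets)]


subsets = draw_subsets()
```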

Flow cytometry data and KEGG
KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.

Flow cytometry data and KEGG

Prior distribution

Sampled values of the hyperparameters

Idealized network population

Sampled values of the hyperparameters

Performance evaluation: AUC scores

Flow cytometry data and KEGG

Comparison: AUC for fixed and sampled hyperparameters on real data

Comparison: AUC for fixed and sampled hyperparameters on synthetic data

Future work

Thank you