Evaluation of inferred networks

Slides:



Advertisements
Similar presentations
Yinyin Yuan and Chang-Tsun Li Computer Science Department
Advertisements

DREAM4 Puzzle – inferring network structure from microarray data Qiong Cheng.
. Inferring Subnetworks from Perturbed Expression Profiles D. Pe’er A. Regev G. Elidan N. Friedman.
GENIE – GEne Network Inference with Ensemble of trees Van Anh Huynh-Thu Department of Electrical Engineering and Computer Science, Systems and Modeling,
Regulatory Network (Part II) 11/05/07. Methods Linear –PCA (Raychaudhuri et al. 2000) –NIR (Gardner et al. 2003) Nonlinear –Bayesian network (Friedman.
Supervised classification performance (prediction) assessment Dr. Huiru Zheng Dr. Franscisco Azuaje School of Computing and Mathematics Faculty of Engineering.
Goal: Reconstruct Cellular Networks Biocarta. Conditions Genes.
Probabilistic Graphical models for molecular networks Sushmita Roy BMI/CS 576 Nov 11 th, 2014.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Cristina Manfredotti D.I.S.Co. Università di Milano - Bicocca An Introduction to the Use of Bayesian Network to Analyze Gene Expression Data Cristina Manfredotti.
Inferring subnetworks from perturbed expression profiles Dana Pe’er, Aviv Regev, Gal Elidan and Nir Friedman Bioinformatics, Vol.17 Suppl
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks Dirk Husmeier Adriano V. Werhli.
Dependency networks Sushmita Roy BMI/CS 576 Nov 26 th, 2013.
Reconstruction of regulatory modules based on heterogeneous data sources Karen Lemmens PhD Defence September 29th 2008.
Using Bayesian Networks to Analyze Expression Data N. Friedman, M. Linial, I. Nachman, D. Hebrew University.
Genetic Regulatory Network Inference Russell Schwartz Department of Biological Sciences Carnegie Mellon University.
Learning Structure in Bayes Nets (Typically also learn CPTs here) Given the set of random variables (features), the space of all possible networks.
Using Bayesian Networks to Analyze Expression Data By Friedman Nir, Linial Michal, Nachman Iftach, Pe'er Dana (2000) Presented by Nikolaos Aravanis Lysimachos.
Gene Regulatory Network Inference. Progress in Disease Treatment  Personalized medicine is becoming more prevalent for several kinds of cancer treatment.
Data Analysis with Bayesian Networks: A Bootstrap Approach Nir Friedman, Moises Goldszmidt, and Abraham Wyner, UAI99.
HUMAN-MOUSE CONSERVED COEXPRESSION NETWORKS PREDICT CANDIDATE DISEASE GENES Ala U., Piro R., Grassi E., Damasco C., Silengo L., Brunner H., Provero P.
Using Bayesian Networks to Analyze Whole-Genome Expression Data Nir Friedman Iftach Nachman Dana Pe’er Institute of Computer Science, The Hebrew University.
Unraveling condition specific gene transcriptional regulatory networks in Saccharomyces cerevisiae Speaker: Chunhui Cai.
Bioinformatics Expression profiling and functional genomics Part II: Differential expression Ad 27/11/2006.
CS5263 Bioinformatics Lecture 20 Practical issues in motif finding Final project.
Speaker: Bin-Shenq Ho Dec. 19, 2011
Module networks Sushmita Roy BMI/CS 576 Nov 18 th & 20th, 2014.
Learning With Bayesian Networks Markus Kalisch ETH Zürich.
Metabolic Network Inference from Multiple Types of Genomic Data Yoshihiro Yamanishi Centre de Bio-informatique, Ecole des Mines de Paris.
Understanding Network Concepts in Modules Dong J, Horvath S (2007) BMC Systems Biology 2007, 1:24.
Computational methods to inferring cellular networks Stat 877 Apr 15 th 2014 Sushmita Roy.
IMPROVED RECONSTRUCTION OF IN SILICO GENE REGULATORY NETWORKS BY INTEGRATING KNOCKOUT AND PERTURBATION DATA Yip, K. Y., Alexander, R. P., Yan, K. K., &
Dependency networks Sushmita Roy BMI/CS 576 Nov 25 th, 2014.
Evaluation of gene-expression clustering via mutual information distance measure Ido Priness, Oded Maimon and Irad Ben-Gal BMC Bioinformatics, 2007.
Module Networks BMI/CS 576 Mark Craven December 2007.
Nonlinear differential equation model for quantification of transcriptional regulation applied to microarray data of Saccharomyces cerevisiae Vu, T. T.,
Computational methods for inferring cellular networks II Stat 877 Apr 17 th, 2014 Sushmita Roy.
Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation Rendong Yang and Zhen Su Division of Bioinformatics,
Identifying submodules of cellular regulatory networks Guido Sanguinetti Joint work with N.D. Lawrence and M. Rattray.
Inferring Regulatory Networks from Gene Expression Data BMI/CS 776 Mark Craven April 2002.
Graph clustering to detect network modules
David Amar, Tom Hait, and Ron Shamir
Estimation of Gene-Specific Variance
Yiming Kang, Hien-haw Liow, Ezekiel Maier, & Michael Brent
Constrained graph structure learning by integrating diverse data types
Hierarchical Agglomerative Clustering on graphs
Spectral methods for Global Network Alignment
Dependency Networks: GENIE3
Incorporating graph priors in Bayesian networks
Multi-task learning approaches to modeling context-specific networks
Probabilistic models for interpreting perturbations in networks: Nested effect models Sushmita Roy Computational Network Biology.
Hyunghoon Cho, Bonnie Berger, Jian Peng  Cell Systems 
Dynamic Bayesian Networks
Inferring Models of cis-Regulatory Modules using Information Theory
Bud Mishra Professor of Computer Science and Mathematics 12 ¦ 3 ¦ 2001
1 Department of Engineering, 2 Department of Mathematics,
Finding modules on graphs
1 Department of Engineering, 2 Department of Mathematics,
1 Department of Engineering, 2 Department of Mathematics,
CSCI2950-C Lecture 13 Network Motifs; Network Integration
Schedule for the Afternoon
Computational Network Biology Biostatistics & Medical Informatics 826
Analyzing Time Series Gene Expression Data
Spectral methods for Global Network Alignment
Incorporating graph priors in Bayesian networks
Input Output HMMs for modeling network dynamics
Single Sample Expression-Anchored Mechanisms Predict Survival in Head and Neck Cancer Yang et al Presented by Yves A. Lussier MD PhD The University.
Deep Learning in Bioinformatics
Hyunghoon Cho, Bonnie Berger, Jian Peng  Cell Systems 
Presentation transcript:

Evaluation of inferred networks Sushmita Roy sroy@biostat.wisc.edu Computational Network Biology Biostatistics & Medical Informatics 826 https://compnetbiocourse.discovery.wisc.edu Sep 27th 2018

Evaluating the network Assessing confidence Area under the precision recall curve Do modules or target sets of genes participate in coherent function? Can the network predict expression in a new condition?

Assessing confidence in the learned network Typically the number of training samples is not sufficient to reliably determine the “right” network One can however estimate the confidence of specific features of the network Graph features f(G) Examples of f(G) An edge between two random variables Order relations: Is X, Y’s ancestor?

How to assess confidence in graph features? What we want is P(f(G)|D), which is But it is not feasible to compute this sum Instead we will use a “bootstrap” procedure

Bootstrap to assess graph feature confidence For i=1 to m Construct dataset Di by sampling with replacement N samples from dataset D, where N is the size of the original D Learn a graphical model {Gi,Θi} For each feature of interest f, calculate confidence

Bootstrap/stability selection Final inferred network Expression data Output Input 6

Does the bootstrap confidence represent real relationships? Compare the confidence distribution to that obtained from randomized data Shuffle the columns of each row (gene) separately Repeat the bootstrap procedure Experimental conditions genes randomize each row independently Slide credit Prof. Mark Craven

Bootstrap-based confidence differs between real and actual data Random f Friedman et al 2000

Example of a high confidence sub-network Bootstrapped confidence Bayesian network: highlights a subnetwork associated with yeast mating pathway. Colors indicate genes with known functions. One learned Bayesian network Nir Friedman, Science 2004

Area under the precision recall curve (AUPR) Assume we know what the “right” network is One can use Precision-Recall curves to evaluate the predicted network Area under the PR curve (AUPR) curve quantifies performance Precision= Recall= # of correct edges # of correct edges # of predicted edges # of true edges

Edge based comparison (AUPR) Both networks Inferred network True network True positive Slide credit: Alireza Fotuhi Siahpirani

Experimental datasets to assess network structure for gene regulatory networks Sequence specific motifs ChIP-chip and ChIP-seq Factor knockout followed by whole-transcriptome profiling

AUPR based performance comparison

DREAM: Dialogue for reverse engineeting assessments and methods Community effort to assess regulatory network inference DREAM 5 challenge Previous challenges: 2006, 2007, 2008, 2009, 2010 Marbach et al. 2012, Nature Methods

Where do different methods rank? GENIE3 was in the OTHER category Method 1. Random Community Marbach et al., 2012 These approaches were mostly per-gene

Methods tend to cluster together These approaches were mostly per-gene Marbach et al., 2012

Comparing per-module (LeMoNe) and per-gene (CLR) methods CLR: more range of factors. Module-based methods: works better for TFs that have larger targets. Marchal & De Smet, Nature Reviews Microbiology, 2010

Some comments about expression-based network inference methods We have seen multiple types of algorithms to learn these networks Per-gene methods (learn regulators for individual genes) Sparse candidate, GENIE3, ARACNE, CLR Per-module methods Module networks: learn regulators for sets of genes/modules Other implementations of module networks exist LIRNET: Learning a Prior on Regulatory Potential from eQTL Data (Su In Lee et al, Plos genetics 2009, http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000358) LeMoNe: Learning Module Networks (Michoel et al 2007, http://www.biomedcentral.com/1471-2105/8/S2/S5) Methods that combine per-gene and per-module (MERLIN) Methods differ in how they quantify dependence between genes Higher-order or pairwise Focus on structure or structure & parameters Expression alone is not enough to infer the structure of the network Integrative approaches that combine expression with other types of data are likely more successful (next lectures)

References Markowetz, Florian and Rainer Spang. "Inferring cellular networks-a review.." BMC bioinformatics 8 Suppl 6 (2007): S5+. N. Friedman, M. Linial, I. Nachman, and D. Pe'er, "Using bayesian networks to analyze expression data," Journal of Computational Biology, vol. 7, no. 3-4, pp. 601-620, Aug. 2000. [Online]. Available: http://dx.doi.org/10.1089/106652700750050961 Dependency Networks for Inference, Collaborative Filtering and Data visualization Heckerman, Chickering, Meek, Rounthwaite, Kadie 2000 Inferring Regulatory Networks from Expression Data Using Tree-Based Methods Van Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel, Pierre Geurts, Plos One 2010 D. Marbach et al., "Wisdom of crowds for robust gene network inference," Nature Methods, vol. 9, no. 8, pp. 796-804, Jul. 2012. [Online]. Available: http://dx.doi.org/10.1038/nmeth. R. De Smet and K. Marchal, "Advantages and limitations of current network inference methods." Nature reviews. Microbiology, vol. 8, no. 10, pp. 717-729, Oct. 2010. [Online]. Available: http://dx.doi.org/10.1038/nrmicro2419