GENIE – GEne Network Inference with Ensemble of trees Van Anh Huynh-Thu Department of Electrical Engineering and Computer Science, Systems and Modeling, University of Liege, Belgium

Inference of GRNs
• Gene regulatory networks (GRNs) are the behind-the-scenes players in gene expression
• How do we determine the regulators of each gene?
• Input:
  - Gene expression data in different conditions/time points
  - A subset of the genes that contains all the regulators (without this subset, GENIE's accuracy plummets)

Underlying Model
• Every reverse-engineering tool assumes an underlying model
• GENIE assumes that the GRN is a Boolean network
• Therefore, the regulation of each gene is a Boolean function

GENIE Strategy Outline
• Make no strong assumptions about the possible regulatory interactions (linearity is an example of a strong assumption)
• Treat time series as static experiments
• Solve the problem for each gene separately, then combine the results
• The final output is a ranking of potential interactions in descending order of confidence
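
The "solve per gene, then combine" step can be sketched as follows. This is a minimal illustration, not the GENIE implementation: the function name `rank_interactions`, the nested-dictionary input, and the example scores are all hypothetical, and it assumes that per-target importance scores are directly comparable across targets.

```python
def rank_interactions(scores_by_target):
    """Merge per-target regulator scores into one ranked edge list.

    scores_by_target: {target_gene: {regulator_gene: importance}}
    Returns [(regulator, target, importance), ...], most confident first.
    """
    edges = [
        (regulator, target, score)
        for target, regulator_scores in scores_by_target.items()
        for regulator, score in regulator_scores.items()
        if regulator != target  # skip self-regulation edges
    ]
    return sorted(edges, key=lambda edge: edge[2], reverse=True)


# Hypothetical importance scores for two target genes.
example = {
    "g1": {"g2": 0.7, "g3": 0.1},
    "g2": {"g1": 0.4, "g3": 0.9},
}
ranking = rank_interactions(example)
```

Each tuple in `ranking` reads as "regulator, target, confidence", so the first entry is the most confident predicted interaction.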

GENIE workflow

Tree-based Ensemble Methods
• A regulation function is a binary tree: each node performs a binary test on a different regulator
• The prediction is at the leaf
• For each target gene, draw several random sets of samples and grow one tree from each set (the root tests the single gene that best splits the target's values over K random conditions, and so on down the tree)
• Rank the regulators according to their importance in the trees

Ranking of regulators
• #S is the number of samples that reach the node N
• #S_t (#S_f) is the number of samples with output true (false)
• Var() is the variance of the output
• To avoid bias towards highly variable genes, the expression values are first normalized to unit variance
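
The slide's formula did not survive extraction. The definitions match the usual variance-reduction importance of a split, I(N) = #S·Var(S) - #S_t·Var(S_t) - #S_f·Var(S_f), and the sketch below assumes that form. It also assumes that S_t and S_f are the samples sent down the true and false branches of the node's test. The function name and the example values are hypothetical.

```python
from statistics import pvariance


def node_importance(outputs, passes_test):
    """Variance reduction achieved by one binary split (assumed formula).

    outputs: target-gene expression values of the samples reaching the node
    passes_test: parallel booleans, True if the sample goes down the true branch
    """
    s_true = [y for y, t in zip(outputs, passes_test) if t]
    s_false = [y for y, t in zip(outputs, passes_test) if not t]

    def weighted_var(values):
        # #S * Var(S), taken as 0 for an empty subset
        return len(values) * pvariance(values) if values else 0.0

    return weighted_var(outputs) - weighted_var(s_true) - weighted_var(s_false)


# A split that separates low from high values removes all the variance.
clean_split = node_importance([0.0, 0.0, 1.0, 1.0], [False, False, True, True])
# A split that mixes low and high values in both branches removes none.
useless_split = node_importance([0.0, 1.0, 0.0, 1.0], [True, True, False, False])
```

Under this reading, a regulator's overall importance would be obtained by summing the values of the nodes that test it and averaging over the trees.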

Best performer in DREAM5 network inference

The Genetic Landscape of the Cell Charles Boone University of Toronto, Donnelly Center

Synthetic Genetic Arrays
• A single-mutant strain (query gene) is crossed with all other single mutants
• Double mutants are selected
• Currently done for budding yeast, E. coli and S. pombe
[Figure label: No growth]

Genetic Interactions
• Positive interaction: the double knockout is more viable than expected from the separate contributions of the single knockouts
• Negative interaction: the double knockout is less viable than expected from the separate contributions of the single knockouts
• They crossed ~1,700 yeast single mutants with ~3,800 single mutants; after filtering failures they obtained ~5.4 million double mutants
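
The slide does not say how the "expected" viability is computed. A common choice is the multiplicative model, in which the expected double-mutant fitness is the product of the two single-mutant fitnesses. The sketch below assumes that model; the function names, the fitness values, and the 0.08 cutoff are illustrative only.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the multiplicative expectation."""
    return fitness_ab - fitness_a * fitness_b


def classify(score, threshold=0.08):
    """Label a score; the threshold is an illustrative assumption."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "none"


# Single mutants with fitness 0.8 and 0.5 give an expected double fitness of 0.4.
sick_double = classify(interaction_score(0.8, 0.5, 0.1))     # worse than expected
healthy_double = classify(interaction_score(0.8, 0.5, 0.7))  # better than expected
as_expected = classify(interaction_score(0.8, 0.5, 0.4))     # matches expectation
```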

Yeast Interaction Map
• Edges are interactions that pass the cutoff threshold (170,000 interactions)
• Proximity in the layout reflects similarity in interaction profiles
• Colored sets = GO enrichment
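
"Similarity in interaction profiles" can be illustrated with a correlation between two genes' vectors of interaction scores. The slide does not name the similarity measure; Pearson correlation is assumed here, and the function name and inputs are hypothetical.

```python
from math import sqrt


def profile_similarity(profile_a, profile_b):
    """Pearson correlation between two equal-length interaction profiles."""
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(profile_a, profile_b))
    spread_a = sqrt(sum((a - mean_a) ** 2 for a in profile_a))
    spread_b = sqrt(sum((b - mean_b) ** 2 for b in profile_b))
    if spread_a == 0 or spread_b == 0:
        return 0.0  # a flat profile carries no similarity information
    return covariance / (spread_a * spread_b)
```

Genes whose profiles correlate strongly would then be placed close together in the layout.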

Proximity between clusters and related functions
• Proximate clusters: both require cytoskeleton genes

Zoom in on pathway
• Red: negative interactions; green: positive interactions
• Interactions between pathways and complexes were often monochromatic
[Figure labels: Budding; Required for polarization and growth; Cell division; Translation]

Positive vs. negative interactions
• Negative interactions are about twice as common as positive ones
[Figure label: No interaction]

Degree distribution
• Severe fitness defects in single mutants correlate with degree
• Hubs are less numerous

Gene duplicates interact less

Correlation between degree and gene properties
[Figure labels: Black = PPI; number of morphological phenotypes; number of chemical perturbations; unstable structure]

Genetic interactions between cellular processes
• Cell cycle is more buffered?

Hubs in the chemical interaction network match hubs in the GI network
• Single mutant + chemical = chemical interaction
• Hydroxyurea blocks DNA synthesis
• Erodoxin (new) is similar to a protein folding-related gene
[Figure label: DNA repair]

Discovering Master Regulators of Alcohol Addiction William Shin Center for Computational Biology and Bioinformatics Columbia University

Rat Model of Alcohol Addiction
[Diagram labels: Control; Alcohol Self Administration; Alcohol Vapor Treatment (chronic alcohol addiction); Non Dependent; No Alcohol Vapor]

Rat model of alcohol addiction
[Figure labels: Alcohol self-administration (lever pressing); Alcohol intake during early withdrawal; Dependent (exposed to alcohol vapor); Non-dependent (exposed to air); Baseline; Alcohol responding (0.5 hr); Induction of alcohol dependence]

Identification of TF-target interactions
• Rat brain regions were sliced and used as microarray samples
  - 92 samples from Dependent, Non-Dependent and Control rats across 8 regions that are known sites of action for addictive drugs
• Applied ARACNE to this data
  - Information-theory based (mutual information, MI)
  - Tests triplets of genes for indirect interactions
• 130,000 TF-target interactions in total
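
The triplet test can be sketched with the data processing inequality: in a fully connected triplet of genes, the edge with the lowest MI is treated as indirect and dropped. This is a simplified illustration rather than the ARACNE code. The function name, the dictionary representation, the form of the `tolerance` parameter, and the MI values are assumptions, and MI estimation itself is not shown.

```python
from itertools import combinations


def dpi_prune(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    mi: {frozenset({gene_a, gene_b}): mutual information}
    tolerance: fraction by which the weakest edge must fall below the
               next-weakest before it is removed (illustrative parameter)
    """
    genes = sorted({gene for pair in mi for gene in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in edges):
            continue  # only closed triangles are tested
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [mi[edge] for edge in edges if edge != weakest]
        if mi[weakest] < min(others) * (1 - tolerance):
            to_remove.add(weakest)
    # Edges are removed only after all triplets have been examined.
    return {edge: value for edge, value in mi.items() if edge not in to_remove}


# Hypothetical MI values: A-C is explained by the path A-B-C.
mi_values = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
pruned = dpi_prune(mi_values)
```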

Screening of false positives
• TF1 shadows TF2: TF2 appears enriched only because it shares common targets with TF1
[Diagram labels: TF1; TF2; Targets of TF1; Targets of TF2]

Master regulators in the Accumbens shell

Activity profile at different brain regions

siRNA validation has a 50-75% success rate
• Not all targets have been tested yet