Regulatory network inference: use of whole brain vs. brain region-specific gene expression data in the mouse

Presentation transcript:

Regulatory network inference: use of whole brain vs. brain region-specific gene expression data in the mouse

Ronald C. Taylor, Computational Biology & Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA
George Acquaah-Mensah, Massachusetts College of Pharmacy and Health Sciences, Worcester, Mass

13th Annual Rocky Mountain Bioinformatics Conference, Snowmass, CO, December 10-12, 2015

Background

Regulatory relationships among genes impact disease processes, so inferred transcriptional regulatory networks can be a valuable source of biological insight. However, mammalian brain expression data is expensive, and large data sets localized to specific brain regions are difficult to obtain.

QUESTIONS: What can be inferred from (cheaper) whole-brain data compared with region-specific data? How much of the region-specific pattern is diluted out by random noise?

DATA SOURCE: in situ hybridization (ISH) data from the Allen Brain Atlas, normalized and localized to 208 regions.

METHOD: comparison of the edges inferred by the CLR algorithm from data for five hippocampal regions vs. data for the entire brain.
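The METHOD line names the CLR algorithm without showing it. The following is a minimal sketch of the general CLR idea (pairwise mutual information, then a per-gene background z-score combined across both genes), not the authors' pipeline. The histogram MI estimator, the bin count, the function names, and the synthetic demo matrix are all assumptions made for illustration, and the sketch assumes the expression matrix has no missing values.

```python
import numpy as np


def mutual_information(x, y, bins=10):
    """Histogram-based mutual information estimate (in nats) for two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def clr_scores(expr, bins=10):
    """expr: genes x samples array (no NaNs). Returns a symmetric CLR score matrix."""
    n = expr.shape[0]
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j], bins)
    # Background distribution for each gene: its MI values with all other genes.
    off_diag = ~np.eye(n, dtype=bool)
    rows = mi[off_diag].reshape(n, n - 1)
    mu = rows.mean(axis=1)
    sd = rows.std(axis=1)
    sd[sd == 0] = 1.0
    # Negative z-scores are clipped to zero, then both genes' views are combined.
    z = np.maximum(0.0, (mi - mu[:, None]) / sd[:, None])
    scores = np.sqrt(z ** 2 + z.T ** 2)
    np.fill_diagonal(scores, 0.0)
    return scores


def high_confidence_edges(scores, n_sd=3.0):
    """Keep gene pairs whose score is at least n_sd standard deviations above the mean."""
    iu = np.triu_indices_from(scores, k=1)
    vals = scores[iu]
    cutoff = vals.mean() + n_sd * vals.std()
    keep = vals >= cutoff
    return {(int(i), int(j)) for i, j in zip(iu[0][keep], iu[1][keep])}


# Synthetic demo (not Allen Brain Atlas data): gene 1 closely tracks gene 0.
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 500))
expr[1] = expr[0] + 0.1 * rng.normal(size=500)
scores = clr_scores(expr)
edges = high_confidence_edges(scores)
```

In this toy matrix the only planted dependency is between genes 0 and 1, so that pair should stand out against each gene's background of near-zero mutual information.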

Comparison Workflow

Input: in situ hybridization Allen Brain Atlas data, processed in two parallel branches.
  Branch A: five hippocampal regions
  Branch B: all regions

Selection of genes and voxels from the initial set of 1011 genes, restricted to genes having 5% or less missing values
  Five hippocampal regions: 662 genes, 1296 values per gene
  All regions: 492 genes, 23,035 values per gene

Use of the CLR network inference algorithm to find gene-to-gene regulatory connection scores
  Five hippocampal regions: 218,791 pair-wise scores
  All regions: 120,786 pair-wise scores

Filtering of gene-to-gene regulatory connections at >= 3 standard deviations above the mean CLR score
  Five hippocampal regions: 4616 edges
  All regions: 2407 edges

Network Comparison
  Intersection: 333 edges (13.8% of the all-regions edges, 7.2% of the five-regions edges)
  Unique to the five regions: 4283/4616 = 92.8%

Notes: The common-sense assumption is that the region-specific data will be more accurate than data diluted across the entire brain, given that all data are generated in the same ISH manner. So the question is: how many of the hippocampal-region edges are found in the entire-brain data? Answer: not many. Only 333 edges, that is, 7.2%.
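The counts on this slide can be checked with a few lines of arithmetic: the pair-wise score counts are n(n-1)/2 for each gene set, and the percentages follow from the edge counts. The sketch below also shows one way to intersect two undirected edge lists; the helper names and the toy edge sets (including the placeholder gene names GeneA to GeneF) are hypothetical, and the slides do not say whether the real comparison was restricted to genes present in both gene sets.

```python
def n_pairs(n_genes):
    """Number of unordered gene pairs among n_genes genes."""
    return n_genes * (n_genes - 1) // 2


def canonical(edges):
    """Treat edges as undirected by sorting each pair of gene names."""
    return {tuple(sorted(e)) for e in edges}


def overlap_report(edges_a, edges_b):
    """Shared-edge count and its share of each network."""
    a, b = canonical(edges_a), canonical(edges_b)
    shared = a & b
    return {
        "shared": len(shared),
        "pct_of_a": 100 * len(shared) / len(a),
        "pct_of_b": 100 * len(shared) / len(b),
        "unique_to_a": len(a - b),
    }


# Numbers taken from the slide.
pairs_hippo = n_pairs(662)                          # five hippocampal regions
pairs_whole = n_pairs(492)                          # all regions
pct_of_whole = round(100 * 333 / 2407, 1)           # shared edges as % of all-regions edges
pct_of_hippo = round(100 * 333 / 4616, 1)           # shared edges as % of five-regions edges
pct_unique = round(100 * (4616 - 333) / 4616, 1)    # % of edges unique to the five regions

# Toy edge sets, hypothetical, only to show the comparison step.
toy_hippo = {("Ep300", "GeneA"), ("GeneB", "Ep300"), ("GeneC", "GeneD")}
toy_whole = {("GeneA", "Ep300"), ("GeneE", "GeneF")}
report = overlap_report(toy_hippo, toy_whole)
```

The slide's figures are internally consistent: 662 and 492 genes give exactly 218,791 and 120,786 pairs, and 333 shared edges give 13.8% and 7.2%.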

All Regions vs Five Hippocampal Regions: Ep300 Connections

[Figure: Cytoscape network view. Panel label: Five Regions]

Legend:
  purple: held in common
  white: increased expression
  blue-green: suppressed expression
  pink: ???

Notes: Cytoscape figure showing an Ep300 TF use case, where many targets are found with the hippocampal data but only a very small set with the whole-brain data. Node colors: the PURPLE nodes represent common targets; the WHITE nodes represent genes with increased expression in AD hippocampus (CA1 pyramidal neurons); the BLUE (BLUE-GREEN) nodes represent genes with suppressed expression. What about all the PINK nodes? There is a huge number of them.

The entire brain data has > 13,000 quite accurate data points. That is a huge number forming an integrated data set from a single lab, using the same platform. Possibly unique. But that does not help, and it probably hurts through the addition of noise.

4 of 10 targets in the entire-brain network are held in common with the five-hippocampal-region network (shown in PURPLE).

All Regions vs Five Hippocampal Regions: Nfe2l1 Connections

[Figure: Cytoscape network views. Panel labels: Five Regions, All Regions]

Legend:
  purple: held in common
  white: increased expression
  blue-green: suppressed expression
  pink: ???

Notes: Cytoscape figure showing an Nfe2l1 TF use case, where many targets are found with the hippocampal data but only a very small set with the whole-brain data. Node colors: the PURPLE nodes represent common targets; the WHITE nodes represent genes with increased expression in AD hippocampus (CA1 pyramidal neurons); the BLUE (BLUE-GREEN) nodes represent genes with suppressed expression. What about all the PINK nodes? There is a huge number of them.

The entire brain data has > 13,000 quite accurate data points. That is a huge number forming an integrated data set from a single lab, using the same platform. Possibly unique. But that does not help, and it probably hurts through the addition of noise.

7 of 13 targets in the entire-brain network are held in common with the five-hippocampal-region network (shown in PURPLE).

Conclusions

The set of regulatory relationships inferred with high confidence is larger when the smaller, more specific brain volume is used (4616 edges vs. 2407).

Gene expression is strongly brain region specific. There is no getting around that fact.

Hypothesis: noise is introduced even when we have a huge amount of accurate expression data across the brain. The proposed cause is that a gene is regulated by a given TF only in certain regions and is uncorrelated with it elsewhere.

None of this is surprising, but we are validating it on possibly the single best mouse brain data set now available.

Is the relatively small set of edges found with the entire-brain data set useful in any way? Perhaps it is useful as a guide to targets used by master TFs active throughout the brain.
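The dilution hypothesis can be illustrated with a small simulation. This is a sketch on synthetic data, not a reanalysis: a hypothetical TF and target are tightly coupled inside one region and independent everywhere else, and Pearson correlation is used as a simple stand-in for the CLR score. The voxel counts are borrowed from the workflow slide (1296 and 23,035 values per gene); the coupling strength and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

n_total, n_region = 23_035, 1_296        # values per gene, from the workflow slide

# Synthetic expression values: independent everywhere by default.
tf = rng.normal(size=n_total)
target = rng.normal(size=n_total)

# Inside the region, the target follows the TF (assumed coupling plus noise).
in_region = np.zeros(n_total, dtype=bool)
in_region[:n_region] = True
target[in_region] = tf[in_region] + 0.3 * rng.normal(size=n_region)

# Correlation computed on region-only voxels vs. all voxels.
r_region = np.corrcoef(tf[in_region], target[in_region])[0, 1]
r_whole = np.corrcoef(tf, target)[0, 1]

print(f"region-only r = {r_region:.2f}, whole-brain r = {r_whole:.2f}")
```

Under these assumptions the relationship is strong within the region but nearly vanishes over all voxels, because the coupled voxels make up only about 6% of the total. This matches the direction of the effect described in the conclusions, although real expression data would be far less clean.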