Published by Dina Walton. Modified over 6 years ago.
1
Regulatory network inference: use of whole brain vs. brain region-specific gene expression data in the mouse
Ronald C. Taylor, Computational Biology & Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA
George Acquaah-Mensah, Massachusetts College of Pharmacy and Health Sciences, Worcester, MA
13th Annual Rocky Mountain Bioinformatics Conference, Snowmass, CO, December 10-12, 2015
2
Background
Regulatory relationships among genes impact disease processes, so inferred transcriptional regulatory networks can be a valuable source of biological insight. However, mammalian brain expression data is expensive, and large data sets localized to specific brain regions are difficult to obtain.
QUESTIONS: What can be inferred from (cheaper) whole-brain data compared with region-specific data? How much of the region-specific pattern is diluted out by random noise?
DATA SOURCE: in situ hybridization (ISH) data from the Allen Brain Atlas, normalized and localized to 208 regions.
METHOD: comparison of edges inferred by the CLR (context likelihood of relatedness) algorithm, using data from five hippocampal regions vs. the entire brain.
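For readers unfamiliar with CLR, the scoring step can be sketched as follows. This is a minimal illustration of the scoring as it is commonly described (each pair's mutual information is z-scored against the background of both genes), not the authors' code. It assumes a symmetric mutual-information matrix has already been estimated elsewhere; the function name clr_scores and the toy matrix are hypothetical.

```python
import math
from statistics import mean, pstdev


def clr_scores(mi):
    """Return CLR-style scores for a symmetric mutual-information matrix.

    For each pair (i, j) the MI value is z-scored against gene i's and
    gene j's own background of MI values, negative z-scores are clipped
    to zero, and the two are combined as sqrt(z_i**2 + z_j**2).
    """
    n = len(mi)
    mu, sd = [], []
    for i in range(n):
        # Background for gene i: its MI with every other gene.
        row = [mi[i][j] for j in range(n) if j != i]
        mu.append(mean(row))
        sd.append(pstdev(row))

    scores = {}
    for i in range(n):
        for j in range(i + 1, n):
            # A gene with a flat background (sd == 0) contributes nothing.
            zi = max(0.0, (mi[i][j] - mu[i]) / sd[i]) if sd[i] > 0 else 0.0
            zj = max(0.0, (mi[i][j] - mu[j]) / sd[j]) if sd[j] > 0 else 0.0
            scores[(i, j)] = math.sqrt(zi * zi + zj * zj)
    return scores


# Toy 4-gene MI matrix (made-up numbers, for illustration only).
toy_mi = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.2],
    [0.1, 0.1, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
toy_scores = clr_scores(toy_mi)
best_pair = max(toy_scores, key=toy_scores.get)
print("highest-scoring pair:", best_pair)
```

Because each MI value is judged relative to the backgrounds of both genes, a pair scores highly only when its MI stands out for at least one of them; that background is what changes when voxels from other brain regions are added.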
3
Comparison Workflow
Input: in situ hybridization (ISH) Allen Brain Atlas data, five hippocampal regions vs. all regions.
Selection of genes and voxels from an initial set of 1011 genes, restricted to genes having 5% or less missing values:
  Five hippocampal regions: 662 genes, 1296 values per gene
  All regions: 492 genes, 23,035 values per gene
Use of the CLR network inference algorithm to find gene-to-gene regulatory connection scores:
  Five hippocampal regions: 218,791 pair-wise scores
  All regions: 120,786 pair-wise scores
Filtering of gene-to-gene regulatory connections at >= 3 SD above the mean CLR score:
  Five hippocampal regions: 4616 edges
  All regions: 2407 edges
Network Comparison
  Intersection: 333 edges (13.8% of the all-regions edges, 7.2% of the five-regions edges)
  Unique to five regions: 4283/4616 = 92.8%
The common-sense assumption is that the region-specific data will be more accurate than data diluted across the entire brain, given that all data are generated in the same ISH manner. So the question is: how many of the hippocampal-region edges are found in the entire-brain data? Answer: not many. Only 333 edges, that is, 7.2%.
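The filtering and comparison steps of the workflow can be sketched as below. This is an illustrative sketch under stated assumptions, not the code used for the talk: scores are assumed to arrive as a dict keyed by gene pairs, edges are treated as undirected, the standard deviation is taken over all pair-wise scores, and the helper names (n_pairs, high_confidence_edges, compare_networks) are hypothetical.

```python
from statistics import mean, pstdev


def n_pairs(n_genes):
    """Number of unordered gene pairs among n_genes genes."""
    return n_genes * (n_genes - 1) // 2


def high_confidence_edges(scores, n_sd=3.0):
    """Keep pairs whose score is at least n_sd SDs above the mean score.

    `scores` maps (gene_a, gene_b) tuples to a score; edges are returned
    as frozensets so that (a, b) and (b, a) count as the same edge.
    """
    values = list(scores.values())
    cutoff = mean(values) + n_sd * pstdev(values)
    return {frozenset(pair) for pair, s in scores.items() if s >= cutoff}


def compare_networks(edges_a, edges_b):
    """Summarise the overlap between two edge sets."""
    shared = edges_a & edges_b
    return {
        "shared": len(shared),
        "pct_of_a": 100.0 * len(shared) / len(edges_a),
        "pct_of_b": 100.0 * len(shared) / len(edges_b),
        "unique_to_a": len(edges_a - edges_b),
    }


# Arithmetic check against the counts reported on the slide.
print(n_pairs(662), n_pairs(492))
print(round(100 * 333 / 2407, 1), round(100 * 333 / 4616, 1))
```

The pair counts on the slide are exactly the number of unordered pairs for 662 and 492 genes, which is how the two columns of the workflow can be told apart.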
4
All Regions vs Five Hippocampal Regions: Ep300 Connections
Five Regions
Cytoscape figure showing an Ep300 TF use case in which many targets are found with the hippocampal data but only a very small set with the whole-brain data.
Node colors: PURPLE nodes represent common targets; WHITE nodes represent genes with increased expression in AD hippocampus (CA1 pyramidal neurons); BLUE-GREEN nodes represent genes with suppressed expression. What about all the PINK nodes, a huge number of them?
The entire-brain data has > 13,000 quite accurate data points, a huge number forming an integrated data set from a single lab using the same platform. Possibly unique. But that does not help, and it probably hurts through the addition of noise.
4 of 10 targets in the entire-brain network are held in common with the five-hippocampal-region network (PURPLE).
Legend: purple - held in common; white - increased expression; blue-green - suppressed expression; pink - ???
5
All Regions vs Five Hippocampal Regions: Nfe2l1 Connections
Panels: Five Regions | All Regions
Cytoscape figure showing an Nfe2l1 TF use case in which many targets are found with the hippocampal data but only a very small set with the whole-brain data.
Node colors: PURPLE nodes represent common targets; WHITE nodes represent genes with increased expression in AD hippocampus (CA1 pyramidal neurons); BLUE-GREEN nodes represent genes with suppressed expression. What about all the PINK nodes, a huge number of them?
The entire-brain data has > 13,000 quite accurate data points, a huge number forming an integrated data set from a single lab using the same platform. Possibly unique. But that does not help, and it probably hurts through the addition of noise.
7 of 13 targets in the entire-brain network are held in common with the five-hippocampal-region network (PURPLE).
Legend: purple - held in common; white - increased expression; blue-green - suppressed expression; pink - ???
6
Conclusions
The set of regulatory relationships inferred with high confidence is larger when using the smaller, more specific brain volume.
Gene expression is strongly brain-region specific. There is no getting round that fact.
Hypothesis: noise is introduced even when we have a huge amount of accurate expression data across the brain. This is because genes are regulated by TFs only in certain regions and are uncorrelated with them elsewhere.
None of this is surprising, but we are validating it on possibly the single best mouse brain data set now available.
Is the relatively small set of edges found with the entire-brain data set useful in any way? Perhaps as a guide to targets used by master TFs active throughout the brain?
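The dilution hypothesis can be illustrated with a toy simulation. This is not a reanalysis of the Allen Brain Atlas data: the values are synthetic, Pearson correlation stands in for the mutual information that CLR uses, and the function names and the noise level are assumptions chosen for illustration. The default sample sizes simply reuse the per-gene value counts from the workflow slide.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_dilution(n_region=1296, n_total=23035, noise=0.5, seed=0):
    """Simulate a TF that drives its target only inside one region.

    The first n_region samples are "inside" the region, where the target
    follows the TF plus noise; in the remaining samples the target is
    drawn independently of the TF. Returns (r_region, r_whole).
    """
    rng = random.Random(seed)
    tf = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
    target = []
    for k, level in enumerate(tf):
        if k < n_region:
            target.append(level + rng.gauss(0.0, noise))  # regulated
        else:
            target.append(rng.gauss(0.0, 1.0))  # unrelated to the TF
    r_region = pearson(tf[:n_region], target[:n_region])
    r_whole = pearson(tf, target)
    return r_region, r_whole


r_region, r_whole = simulate_dilution()
print("region only: r = %.3f" % r_region)
print("whole brain: r = %.3f" % r_whole)
```

In this toy setting the association is strong within the region and close to zero once the unregulated samples are pooled in, even though every individual measurement is accurate.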