Returning Back …. A Big Thanks Again Prof. Jason Bohland Quantitative Neuroscience Laboratory Boston University Prof. Matt Hibbs Jackson Labs.

Presentation transcript:

Returning Back …

A Big Thanks Again Prof. Jason Bohland Quantitative Neuroscience Laboratory Boston University Prof. Matt Hibbs Jackson Labs

Large-scale Correlation
- Quality control → set of 3041 genes
- Combine gene volumes into a large matrix
- Decompose the voxel × gene matrix M using singular value decomposition (SVD)
- M ≈ U S Vᵀ: the columns of U (voxels × modes) are spatial patterns, the singular values in S are the "weights", and the rows of Vᵀ (modes × genes) are gene patterns
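
The decomposition on this slide can be sketched with NumPy. This is a minimal illustration on random toy data, not the authors' pipeline: the function name `truncated_svd`, the toy matrix size, and the choice of 5 modes are assumptions made for the example.

```python
import numpy as np

def truncated_svd(M, n_modes):
    """Return the leading n_modes of the SVD of a voxels x genes matrix M."""
    # full_matrices=False gives the compact SVD: U is voxels x r, Vt is r x genes
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :n_modes], s[:n_modes], Vt[:n_modes, :]

# Toy stand-in for the voxel x gene matrix (the real one has 3041 gene columns)
rng = np.random.default_rng(0)
M = rng.random((50, 20))

U, s, Vt = truncated_svd(M, n_modes=5)
# Rank-5 approximation: spatial patterns (U) * weights (s) * gene patterns (Vt)
M_approx = (U * s) @ Vt
```

Each column of `U` is a spatial pattern over voxels, each row of `Vt` is the matching gene pattern, and `s` holds the weights in decreasing order.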

Principal modes (SVD)
- N = 271 modes are needed to reach 90% of the variance
- N = 67 modes are needed to reach 80% of the variance
- All LH brain voxels plotted as projections on the first 3 modes
- Legend: Cerebral cortex, Olfactory areas, Hippocampus, Retrohippocampal, Striatum, Pallidum, Thalamus, Hypothalamus, Midbrain, Pons, Medulla, Cerebellum
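
The mode counts quoted above come from a cumulative-variance curve. A minimal sketch of that calculation follows, assuming that the variance carried by each mode is proportional to its squared singular value (which holds when the matrix has been mean-centered; the slides do not say how the data were centered). The function name `modes_for_variance` is hypothetical.

```python
import numpy as np

def modes_for_variance(singular_values, fraction):
    """Smallest number of leading modes whose cumulative variance reaches `fraction`."""
    var = np.asarray(singular_values, dtype=float) ** 2  # variance per mode (assumption)
    cum = np.cumsum(var) / var.sum()                     # cumulative fraction explained
    # searchsorted returns the first index where cum >= fraction; +1 converts to a count
    return int(np.searchsorted(cum, fraction) + 1)

# Toy example: squared values 9, 4, 1 give cumulative fractions of about 0.64, 0.93, 1.0
n_60 = modes_for_variance([3.0, 2.0, 1.0], 0.60)
n_90 = modes_for_variance([3.0, 2.0, 1.0], 0.90)
```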

Interpreting gene modes
- Spatial modes are easily visualized.
- Attempt to annotate eigenmodes using Gene Ontology (GO) annotations.
- Each GO term partitions the gene list into two subsets:
  - IN genes: genes annotated by that GO term
  - OUT genes: genes not annotated by that GO term
- Each singular vector associates each of these subsets with a set of amplitudes.
- Compare these amplitudes, asking whether IN genes have larger magnitudes than OUT genes.
- Use the K-S test to check whether the two amplitude distributions are different.
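
The IN/OUT comparison can be illustrated with a hand-rolled two-sample Kolmogorov-Smirnov statistic. This is a sketch under assumptions: the function names and the toy gene mode are invented for the example, and only the statistic (the largest gap between the two empirical CDFs) is computed. A p-value could be obtained from a library routine such as `scipy.stats.ks_2samp`.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample K-S statistic: maximum gap between the empirical CDFs of a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # the gap can only change at observed values
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def go_term_ks(gene_mode, in_mask):
    """Compare amplitude magnitudes of IN genes vs OUT genes for one GO term."""
    amp = np.abs(np.asarray(gene_mode, dtype=float))
    in_mask = np.asarray(in_mask, dtype=bool)
    return ks_statistic(amp[in_mask], amp[~in_mask])

# Toy gene mode: the two IN genes carry much larger magnitudes than the OUT genes
stat = go_term_ks([-5.0, 4.0, 0.1, -0.2], [True, True, False, False])
```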

In this low dimensional space
- Cerebellum and striatum separated - GABAergic interneurons and glutamatergic projection neurons in adult mouse forebrain
- Other regions are clustered in greatly reduced space, but with considerable overlap
- Anatomical regions do not in general correspond directly to individual SVD modes
- Clustering of gene expression profiles in a very low dimensional subspace groups voxels drawn from the same brain regions

Component Annotations
- Distinctly high amplitude in the dentate gyrus of the hippocampus.
- Enhanced specificity for the cerebellum.
- Particularly prominent in the cerebellum and the striatum.
- Decomposition extracts correlated structure in expression profiles that corresponds to anatomical subdivision.

Once again

Gene clustering?
- Genes are somewhat less separable, and less categorical
- Build gene-gene similarity graph, partition, color code each point…

K-Means Segmentation
- What does gene expression tell us about regional brain organization?
- Use simple cluster analysis. K-means clustering:
  - Dimensionality reduced (to 271) by truncating the SVD
  - Assign one of K labels to each voxel
  - All voxels assigned the same label have more similar expression profiles than voxels with different labels
  - Similarity defined by Euclidean distance
- Data-driven parcellation of mouse brain anatomy (level of granularity determined by K)
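
The clustering step can be sketched as plain Lloyd-style K-means with Euclidean distance. This is an illustrative toy version, not the code used for the results in the slides: the function names, the random initialization, and the iteration cap are assumptions, and the all-pairs distance array would be too large for a full brain volume without batching.

```python
import numpy as np

def assign_labels(X, centers):
    """Label each row of X with the index of its nearest center (squared Euclidean)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct rows (voxels) chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign_labels(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign_labels(X, centers), centers
```

In the setting described above, each row of `X` would be one voxel's coordinates in the truncated SVD space, and the returned labels, mapped back to voxel positions, give the parcellation for a chosen K.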

K-means clustering results

Spatially Contiguous Clusters
- K = 2: clusters separate cerebral cortex and hippocampus (gray) from other areas (white)
- K = 8: cerebellum and striatum clearly segmented; cortex is subdivided into distinct layers
- K = 16: thalamus has its own cluster; cortical layers further differentiated; midbrain separated from hindbrain
- Large K: more anatomical details observed; separation of the caudoputamen from the nucleus accumbens; laminar and areal patterns appear in cortex

Clustering in Cerebral Cortex
- Panels: K = 40 (masked); ARA area masks
- Divides aud/vis areas from somatosensory areas
- Laminar clusters broken into distinct groups along the anterior-posterior direction (bottom) at the border between auditory & somatosensory areas
Validation

Relevant Questions
- Determine, for a given structure, at what value of K it emerges as its own cluster
- Relative prioritization of anatomical areas based on expression pattern similarity
- Dominant clustering of gene expression along cortical layers, consistent with the results of Ng et al.

Compare with Reference Atlas
- Reference atlases here are "flat" parcellations with 12 or 94 regions
- Similarity index (S) ranges from 0 to 1
- Overlap saturates at K > 30
- Clusters for large K are subdivisions of those for low K

Compare with Reference Atlas (K = 12)
- Clusters 1, 2, 3, and 4 together correspond to the cerebral cortex
- Cluster 11 largely corresponds to the thalamus
- Cluster 9 is wholly contained in the cerebellum
- Cluster 10 is wholly contained in the striatum

Classification of Region Membership
- Supervised learning using a linear discriminant (25% test set, 10-fold cross-validation)
- 94.5% correct overall

What Next?
- Size of voxels is large relative to individual cell bodies
- Voxels will contain a mixture of several cell types
- Unique expression signature for discrete brain locations with different combinations of cell types
- Spatial co-expression is an indicator of functionally related or interacting genes

Localization of expression
- Kullback-Leibler (KL) divergence from (spatial) uniformity
- Figure: normalized expression energy across voxels, for a non-localized expression pattern and a well-localized expression pattern
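
A localization score of this kind can be sketched as the KL divergence between a normalized expression profile and the uniform distribution. The details below are assumptions, since the slides do not give them: the function name `kl_from_uniform`, the base-2 logarithm, and the requirement that expression values are non-negative with a positive sum along the chosen axis.

```python
import numpy as np

def kl_from_uniform(expr, axis=-1, base=2.0):
    """KL divergence D(p || uniform) of normalized expression along `axis`.

    expr must be non-negative with a positive sum along `axis` (assumption).
    """
    expr = np.asarray(expr, dtype=float)
    p = expr / expr.sum(axis=axis, keepdims=True)  # normalize to a distribution
    n = expr.shape[axis]
    # D(p || u) = sum_i p_i * log(p_i * n); terms with p_i = 0 contribute 0
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p * n), 0.0)
    return terms.sum(axis=axis) / np.log(base)

# Rows are genes, columns are voxels: a flat profile vs a single-voxel profile
profiles = np.array([[1.0, 1.0, 1.0, 1.0],
                     [0.0, 0.0, 8.0, 0.0]])
scores = kl_from_uniform(profiles, axis=1)
```

A flat profile scores 0 and a profile concentrated in one of n voxels scores log(n) in the chosen base, so larger values mean stronger localization. Applying the same function along the gene axis gives a per-voxel measure of uniformity across genes.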

Gene Localization
- Select the most localized genes (KL > ~1.56) to further analyze
- Threshold voxels based on the intensity histogram of summed expressions
- Remaining LH mask (6102 voxels) essentially excludes cerebral cortex
- Panels: summed; thresholded

Voxel Uniformity in Gene Space
- Measure KL divergence from uniform density across gene space at each voxel
- Brighter color indicates lower KL divergence (more uniform expression across genes)
- Note that cortex is generally more uniform than subcortical areas
- Middle cortical layers are notably more uniform than the superficial and deepest layers

"Expression diversity"
- Average KL divergence across all voxels in a particular anatomical region
- Panels: expression diversity across gross structures; expression diversity across cortical layers and areas

Biclustering Genes & Voxels
- Can we group genes that are each highly localized to common brain regions (sets of voxels)?
- Construct a bipartite graph with N (200) genes in vertex set V1 and M (~6000) mask voxels in vertex set V2
  - Edges are expression levels of each gene at each voxel
- Apply graph partitioning methods to cut the graph into connected components
  - Components contain both voxels and genes
  - Here we used the isoperimetric algorithm (Grady and Schwartz, 2006).
- Figure labels: GENES (V1), VOXELS (V2)

What is Biclustering?
- Finding submatrices in an n × m matrix that follow a desired pattern
- Row/column order need not be consistent between different biclusters

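Bicluster properties
- For any submatrix C_IJ, where I and J are subsets of genes and conditions, the mean squared residue (MSR) score is $H(I,J) = \frac{1}{|I||J|} \sum_{i \in I, j \in J} (c_{ij} - c_{iJ} - c_{Ij} + c_{IJ})^2$, where $c_{iJ}$ is the mean of row $i$, $c_{Ij}$ is the mean of column $j$, and $c_{IJ}$ is the mean of the whole submatrix.
- A bicluster is a submatrix C_IJ that has a low mean squared residue score.
- Reference: "Biclustering of Expression Data", Cheng and Church, RECOMB 2001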
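
The mean squared residue score is short enough to compute directly. The sketch below follows the standard Cheng and Church definition (each entry minus its row mean and column mean, plus the submatrix mean, squared and averaged). The function name and the toy matrices are assumptions made for the example.

```python
import numpy as np

def mean_squared_residue(C, rows, cols):
    """MSR of the submatrix of C selected by index lists `rows` and `cols`."""
    sub = np.asarray(C, dtype=float)[np.ix_(rows, cols)]
    row_mean = sub.mean(axis=1, keepdims=True)  # mean of each selected row
    col_mean = sub.mean(axis=0, keepdims=True)  # mean of each selected column
    all_mean = sub.mean()                       # mean of the whole submatrix
    residue = sub - row_mean - col_mean + all_mean
    return float((residue ** 2).mean())

# A purely additive pattern (row effect + column effect) has zero residue
additive = np.add.outer([0.0, 1.0, 2.0], [0.0, 10.0, 20.0, 30.0])
msr_additive = mean_squared_residue(additive, [0, 1, 2], [0, 1, 2, 3])

# A checkerboard-like block is not additive, so its residue is positive
msr_checker = mean_squared_residue([[1.0, 0.0], [0.0, 1.0]], [0, 1], [0, 1])
```

A score of zero means the rows of the submatrix rise and fall together up to a constant shift, which is the coherent pattern a low-MSR bicluster is meant to capture.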

Cheng and Church Greedy Approach
- Finds a submatrix that minimizes MSR
- Biclusters (a) and (b) fit the definition of MSR

Biclustering Localized Genes
- Resulting voxel clusters correspond well to individual anatomical regions, with functionally relevant gene lists
- Panel labels: 40 genes; 29 genes
- 97% of energy in the cerebellum
- Highly localized to ventricle system

Biclustering Localized Genes
- Results shown are for 13 biclusters
- Panel labels: 30 genes; 11 genes
- 69% of energy in dentate gyrus, 20% in Ammon's horn
- 99% of energy in thalamus

Cell-type expression model
- Hypothesis: do genes emerging from these biclusters represent preferential "markers" of cell types localized to the corresponding regions?
- Cell-type specific microarray data are available (Okaty et al., 2009; 2011) to help answer this question
- Compare microarray profiles of these cell types with voxel-based transcriptomic data from ABA
  - 2131 overlapping genes (with high quality ABA data)

Cell-type based expression
- Spatial patterns reflect organization within brain regions
- Panels: (A) Granule cells, (B) Purkinje cells, (C) Stellate cells, (D) Mature oligodendrocytes

Biclusters and Cell Types
- Highly localized genes emerging from biclusters (usually) show selective expression in local cell types
- Panels: CP bicluster; Cb bicluster

Heritable "Disease Networks"
- Online Mendelian Inheritance in Man (OMIM)
  - Contains records of the genetic basis for ~4000 disorders
  - Manually curated 94 unique entities that are of neurological / neuropsychiatric interest and intersect our gene set
1. For each disorder, calculate the mean expression pattern across orthologs of implicated genes (MGI orthology)
2. Calculate a distance matrix between disorders by computing the pairwise cosine distance between expression profiles
3. Cluster disorders using hierarchical cluster analysis
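
Steps 1 and 2 above can be sketched in NumPy as follows. Everything named here is hypothetical (the function names, the `gene_sets` dictionary of row indices, the toy data), and the sketch assumes no disorder has an all-zero mean profile. For step 3, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"` could be applied to the resulting distances.

```python
import numpy as np

def disorder_profiles(expr, gene_sets):
    """Step 1: mean expression pattern over the genes implicated in each disorder.

    expr: genes x voxels array; gene_sets: dict mapping disorder name -> row indices.
    """
    names = sorted(gene_sets)
    profiles = np.vstack([expr[gene_sets[name]].mean(axis=0) for name in names])
    return names, profiles

def cosine_distance_matrix(P):
    """Step 2: pairwise cosine distance (1 - cosine similarity) between rows of P."""
    P = np.asarray(P, dtype=float)
    unit = P / np.linalg.norm(P, axis=1, keepdims=True)  # assumes no all-zero rows
    D = 1.0 - unit @ unit.T
    np.fill_diagonal(D, 0.0)       # remove rounding noise on the diagonal
    return np.clip(D, 0.0, 2.0)    # cosine distance lies in [0, 2]

# Toy data: 4 genes x 3 voxels, and two made-up disorders
expr = np.array([[1.0, 0.0, 0.0],
                 [3.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0],
                 [0.0, 4.0, 0.0]])
names, profiles = disorder_profiles(expr, {"disorder_a": [0, 1], "disorder_b": [2, 3]})
D = cosine_distance_matrix(profiles)
```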

OMIM Disease Clusters Complete linkage clustering

Autism Candidate
- For a given gene list, embed expression similarity in 2D space
- Ex: ASD candidate genes from Wigler lab (CSHL) (16 genes in high quality coronal data set)
- Calculate cosine distance matrix, and apply metric MDS
- Provide sub-groupings based on expression locus
- Figure labels: Lhx1, Fgd3, Cb, MapT, Doc2a, Ctx, Ptpdc1
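
The embedding step can be illustrated with classical (Torgerson) MDS, swapped in here as the simplest closed-form member of the metric MDS family. The slides do not say which MDS variant or software was used, so treat this as a sketch under that assumption. The function name is hypothetical.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed points in n_components dimensions from a symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # double-centered squared distances
    evals, evecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    lam = np.clip(evals[order], 0.0, None)  # drop negative eigenvalues (non-Euclidean D)
    return evecs[:, order] * np.sqrt(lam)

# Toy check with three points in the plane: the embedding reproduces the distances
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
D_true = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
Y = classical_mds(D_true, n_components=2)
D_embedded = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)
```

For a cosine distance matrix the input is not exactly Euclidean, so the 2D layout is an approximation. Genes that land close together have similar spatial expression patterns.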

Next?
- Figure labels: fMRI; spatial components; component; time (TR)