Published by Roxanne Arnold. Modified over 9 years ago.
1
2
A Big Thanks Again
Prof. Jason Bohland, Quantitative Neuroscience Laboratory, Boston University
3
http://www.brain-map.org
4
Spatial Transcriptome
Expression energy for each gene (M = 4,376) and for each voxel (N = 51,533)
5
Clusters of Correlated Gene Expression
6
Gene Finder
The user navigates to a voxel of interest in the reference atlas volume, and a fixed-threshold AGEA correlation map appears.
A gene list from the ABA is returned.
7
Large-scale data analysis
How much structure is present across space and across genes?
How would the brain segment using ISH gene expression patterns?
Is there structure in the patterns of expression of localized genes?
What do we learn from the expression of genes in disorders?
8
Large-scale Correlation
Combine gene volumes into a large matrix.
Decompose the voxel × gene matrix M using the singular value decomposition (SVD).
Any m × n matrix M can be decomposed:
$$M(x, g) = \sum_n \sigma_n \, u_n(x) \, v_n(g)$$
where $u_n(x)$ are the spatial modes, $v_n(g)$ are the gene modes, and $\sigma_n^2$ is the energy in the $n$th mode.
9
Large-scale Correlation
Quality control → set of 3041 genes.
Combine gene volumes into a large matrix.
Decompose the voxel × gene matrix using singular value decomposition (SVD).
[Figure: M (voxels × genes) ≈ spatial patterns (voxels × modes) × singular values (“weights”) × gene patterns (modes × genes)]
10
SVD
$A = U S V^T$
A (m by n) is any rectangular matrix (m rows and n columns).
U (m by n) is an “orthogonal” matrix.
S (n by n) is a diagonal matrix.
V (n by n) is another orthogonal matrix.
Such a decomposition always exists.
All matrices are real; m ≥ n.
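The shapes listed on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random matrix as a stand-in for real data; `full_matrices=False` requests the thin form in which U is m by n.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4                        # toy sizes with m >= n (illustrative only)
A = rng.standard_normal((m, n))

# Thin SVD: U is m x n, s holds the n singular values, Vt is V transposed (n x n).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)

print(U.shape, S.shape, Vt.shape)
print(np.allclose(A, U @ S @ Vt))          # A = U S V^T
print(np.allclose(U.T @ U, np.eye(n)))     # columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(n)))   # V is orthogonal
```

Note that NumPy returns the singular values as a 1-D array and returns $V^T$ rather than $V$.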
11
Why SVD?
Product of two orthogonal matrices and a diagonal matrix.
Each voxel is a vector in 3041-dimensional space.
Looking for directions in vector space with the most variability.
Each “mode” has 3 items:
– spatial pattern: voxels that are correlated
– gene pattern: weighting of how much each gene expresses this mode
– a weight: how much of the overall variability in the data this mode accounts for
SVD modes are ordered: each successive mode accounts for as much of the variability remaining in the data as is possible.
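One way to turn the per-mode weights into a "how many modes do we need" count is to accumulate the squared singular values. This is a sketch under the assumption that "variability" is measured by squared singular values (which matches variance when the data are centered); the helper name `modes_needed` and the toy singular values are hypothetical.

```python
import numpy as np

def modes_needed(singular_values, fraction):
    """Smallest number of leading modes whose squared singular values
    sum to at least `fraction` of the total (hypothetical helper)."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # First index where the cumulative share reaches the requested fraction.
    index = int(np.searchsorted(cumulative, fraction))
    return min(index + 1, energy.size)

# Toy singular values, already sorted from largest to smallest.
s = [3.0, 2.0, 1.0]
print(modes_needed(s, 0.5))
print(modes_needed(s, 0.9))
print(modes_needed(s, 0.95))
```

With the real matrix, `singular_values` would be the array returned by the SVD of the voxel × gene matrix.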
12
Principal modes (SVD)
N=67 before we get to 80% of the variance.
N=271 before we get to 90% of the variance.
[Figure: all LH brain voxels plotted as projections on the first 3 modes. Legend: Cerebral cortex, Olfactory areas, Hippocampus, Retrohippocampal, Striatum, Pallidum, Thalamus, Hypothalamus, Midbrain, Pons, Medulla, Cerebellum]
13
Singular Value Decomposition
14
SVD for microarray data (Alter et al., PNAS 2000)
15
$A = U S V^T$
A is any rectangular matrix (m ≥ n).
Row space: vector subspace generated by the row vectors of A.
Column space: vector subspace generated by the column vectors of A.
The dimension of the row and column spaces is the rank of the matrix A: r (≤ n).
A is a linear transformation that maps a vector x in the row space into the vector Ax in the column space.
16
$A = U S V^T$
U is an “orthogonal” matrix (m ≥ n).
Column vectors of U form an orthonormal basis for the column space of A: $U^T U = I$.
$u_1, \dots, u_n$ in U are eigenvectors of $AA^T$:
– $AA^T = U S V^T V S U^T = U S^2 U^T$
– “Left singular vectors”
17
$A = U S V^T$
V is an orthogonal matrix (n by n).
Column vectors of V form an orthonormal basis for the row space of A: $V^T V = V V^T = I$.
$v_1, \dots, v_n$ in V are eigenvectors of $A^T A$:
– $A^T A = V S U^T U S V^T = V S^2 V^T$
– “Right singular vectors”
18
$A = U S V^T$
S is a diagonal matrix of non-negative singular values, sorted from largest to smallest.
Singular values are the non-negative square roots of the corresponding eigenvalues of $A^T A$ and $AA^T$.
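The relation between singular values and eigenvalues can be verified on a small example. A minimal sketch, assuming NumPy and a random full-rank matrix as illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))    # toy matrix with m >= n (illustrative only)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
# so reverse them to match the descending order of the singular values.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]

print(np.allclose(s, np.sqrt(eigvals)))                   # s_i = sqrt(eigenvalue_i)
print(np.allclose(A.T @ A, Vt.T @ np.diag(s**2) @ Vt))    # A^T A = V S^2 V^T
print(np.allclose(A @ A.T, U @ np.diag(s**2) @ U.T))      # A A^T = U S^2 U^T
```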
19
$AV = US$
Means each $A v_i = s_i u_i$.
Note that A is a linear map from row space to column space.
A maps the orthonormal basis $\{v_i\}$ in row space into the orthonormal basis $\{u_i\}$ in column space, scaled by $s_i$.
Each component of $s_i u_i$ is the projection of a row of A onto the vector $v_i$.
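The identity on this slide can be checked column by column. A minimal sketch, assuming NumPy and a random toy matrix; the last lines illustrate that the projections of the rows of A onto $v_1$ give the first left singular vector scaled by $s_1$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))    # toy matrix (illustrative only)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# Check A v_i = s_i u_i for every mode i.
for i in range(len(s)):
    assert np.allclose(A @ V[:, i], s[i] * U[:, i])

# Component j of s_1 * u_1 is the dot product of row j of A with v_1
# (index 0 in zero-based NumPy indexing).
projections = np.array([row @ V[:, 0] for row in A])
print(np.allclose(projections, s[0] * U[:, 0]))
```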
20
SVD of A (m by n): recap
$A = U S V^T$ = (big-“orthogonal”)(diagonal)(sq-orthogonal)
$u_1, \dots, u_n$ in U are eigenvectors of $AA^T$.
$v_1, \dots, v_n$ in V are eigenvectors of $A^T A$.
$s_1, \dots, s_n$ in S are the non-negative singular values of A.
$AV = US$ means each $A v_i = s_i u_i$.
“Every A is diagonalized by 2 orthogonal matrices”
21
Recap
22
Eigengenes
23
Genes Ranking – Correlation of Top 2 Eigengenes
Alter, Orly et al. (2000) Proc. Natl. Acad. Sci. USA 97, 10101-10106. Copyright ©2000 by the National Academy of Sciences.
24
Normalized elutriation expression in the subspace associated with the cell cycle
Alter, Orly et al. (2000) Proc. Natl. Acad. Sci. USA 97, 10101-10106.
25
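SVD in Row Space
[Figure: the data set A plotted in the x-y plane, with the first right singular vector $v_1$ drawn through the origin]
This line segment that goes through the origin approximates the original data set.
The projected data set $s_1 u_1 v_1^T$ approximates the original data set A.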
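The rank-1 picture described on this slide can be reproduced with a small synthetic cloud. This is a sketch assuming NumPy; the 2-D data set is hypothetical and simply scattered near a line through the origin.

```python
import numpy as np

rng = np.random.default_rng(3)
t = rng.standard_normal(50)
# Hypothetical 2-D data set: 50 points scattered near a line through the origin.
A = np.column_stack([t, 0.5 * t]) + 0.05 * rng.standard_normal((50, 2))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
v1 = Vt[0]                               # direction of the approximating line
A1 = s[0] * np.outer(U[:, 0], v1)        # rank-1 term s_1 u_1 v_1^T

# The rank-1 term equals the projection of every row of A onto v1.
print(np.allclose(A1, np.outer(A @ v1, v1)))
# For a two-column matrix, the Frobenius error is the discarded singular value.
print(np.isclose(np.linalg.norm(A - A1), s[1]))
```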
26
SVD = PCA?
Centered data
[Figure: centered data shown with axes x, y and x’, y’]
27
SVD = PCA?
[Figure: uncentered data shown with axes x, y, x’, y’ and x’’, y’’; one axis pair is labeled PCA and the other SVD]
Translation is not a linear operation, as it moves the origin!
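The effect of centering can be seen by comparing the leading direction from PCA with the leading right singular vector before and after subtracting the mean. A minimal sketch, assuming NumPy; the shifted point cloud and the helper `first_right_singular_vector` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
t = rng.standard_normal(200)
noise = 0.05 * rng.standard_normal(200)
# Hypothetical data: an elongated cloud translated away from the origin.
X = np.column_stack([t, 0.3 * t + noise]) + np.array([0.0, 5.0])

def first_right_singular_vector(M):
    """Leading right singular vector of M (hypothetical helper)."""
    return np.linalg.svd(M, full_matrices=False)[2][0]

# PCA direction: eigenvector of the covariance matrix with the largest eigenvalue
# (eigh sorts eigenvalues in ascending order, so take the last column).
cov = np.cov(X, rowvar=False)
pca_dir = np.linalg.eigh(cov)[1][:, -1]

svd_raw = first_right_singular_vector(X)
svd_centered = first_right_singular_vector(X - X.mean(axis=0))

# Compare directions up to sign using the absolute cosine.
print(abs(pca_dir @ svd_centered))   # agrees with PCA once the data are centered
print(abs(pca_dir @ svd_raw))        # differs for this translated cloud
```

In this toy case the uncentered SVD direction points mostly toward the cloud's mean rather than along its elongation.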
28
Returning Back …
29
Large-scale Correlation
Quality control → set of 3041 genes.
Combine gene volumes into a large matrix.
Decompose the voxel × gene matrix using singular value decomposition (SVD).
[Figure: M (voxels × genes) ≈ spatial patterns (voxels × modes) × singular values (“weights”) × gene patterns (modes × genes)]
30
Interpreting gene modes
Spatial modes are easily visualized.
Attempt to annotate eigenmodes using Gene Ontology (GO) annotations.
Each GO term partitions the gene list into two subsets:
– IN genes: genes annotated by that GO term
– OUT genes: genes not annotated by that GO term
Each singular vector associates each subset above with a set of amplitudes.
Compare these amplitudes, asking whether ‘IN’ genes have larger magnitudes than ‘OUT’ genes.
Use the K-S test to test whether the amplitude distributions are different.
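The IN versus OUT comparison can be sketched as a two-sample Kolmogorov-Smirnov statistic on the amplitude magnitudes. This is an illustration under stated assumptions, not the presenters' pipeline: the gene mode is random noise, the GO membership mask is made up, and `ks_statistic` is a hypothetical helper that computes only the statistic. A p-value could be obtained from a library routine such as `scipy.stats.ks_2samp`.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample K-S statistic: the largest gap between the empirical
    CDFs of samples a and b (hypothetical helper)."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(5)
n_genes = 3041                                # gene count after quality control
gene_mode = rng.standard_normal(n_genes)      # stand-in for one gene singular vector
in_term = np.zeros(n_genes, dtype=bool)       # made-up GO-term membership mask
in_term[rng.choice(n_genes, size=100, replace=False)] = True

amplitude = np.abs(gene_mode)                 # compare magnitudes, as on the slide
D = ks_statistic(amplitude[in_term], amplitude[~in_term])
print(D)
```

Repeating this for every GO term and every mode would give one statistic per (term, mode) pair; correcting for that many comparisons is left out of this sketch.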
31
Component annotations