Published by James York. Modified over 9 years ago.
1
James Lindsay (1), Ion Mandoiu (1), Craig Nelson (2)
Towards Whole-Transcriptome Deconvolution with Single-cell Data
University of Connecticut
(1) Department of Computer Science and Engineering
(2) Department of Molecular and Cell Biology
2
Mouse Embryo
[Figure: mouse embryo, posterior (tail) to anterior (head), with labels: somites, node, neural tube, primitive streak]
3
Unknown Mesoderm Progenitor
What is the expression profile of the progenitor cell type?
NSB = node-streak border; PSM = presomitic mesoderm; S = somite; NT = neural tube/neurectoderm; EN = endoderm
4
Characterizing Cell-types
Goal: Whole-transcriptome expression profiles of individual cell types
   Measuring whole-transcriptome expression from single cells is technically challenging
Approach: Computational deconvolution of cell mixtures
   Assisted by single-cell qPCR expression data for a small number of genes
5
Modeling Cell Mixtures
Mixtures (X) are a linear combination of the signature matrix (S) and the concentration matrix (C): $X = SC$
X: genes × mixtures; S: genes × cell types; C: cell types × mixtures
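The linear mixing model can be illustrated with a small numerical sketch. The matrix sizes and the random values below are made up for illustration and are not taken from the talk. The only thing the sketch relies on is the slide's statement that X is the product of S and C.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_types, n_mixtures = 5, 3, 4   # illustrative sizes only

# Hypothetical signature matrix S: one expression profile (column) per cell type.
S = rng.uniform(1.0, 10.0, size=(n_genes, n_types))

# Hypothetical concentration matrix C: each column holds the cell-type
# proportions of one mixture and is scaled to sum to 1.
C = rng.uniform(0.0, 1.0, size=(n_types, n_mixtures))
C /= C.sum(axis=0, keepdims=True)

# Linear mixing model: each mixture (column of X) is a weighted sum of signatures.
X = S @ C
print(X.shape)
```

Each column of X is the concentration-weighted sum of the cell-type signatures, which is the relationship the later steps invert.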
6
Previous Work
1. Coupled Deconvolution
   Given: X. Infer: S, C
   NMF: Repsilber, BMC Bioinformatics, 2010
   Minimum polytope: Schwartz, BMC Bioinformatics, 2010
2. Estimation of Mixing Proportions
   Given: X, S. Infer: C
   Quadratic programming: Gong, PLoS One, 2012
   LDA: Qiao, PLoS Comp Bio, 2012
3. Estimation of Expression Signatures
   Given: X, C. Infer: S
   csSAM: Shen-Orr, Nature Brief Com, 2010
7
Single-cell Assisted Deconvolution
Given: X and single-cell qPCR data
Infer: S, C
Approach:
1. Identify cell types and estimate the reduced signature matrix using single-cell qPCR data
   Outlier removal
   K-means clustering followed by averaging
2. Estimate mixing proportions C using quadratic programming, 1 mixture at a time
3. Estimate the full expression signature matrix S using C
   Quadratic programming, 1 gene at a time
8
Step 1: Outlier Removal + Clustering
[Figure panels: unfiltered, filtered]
Remove cells whose maximum Pearson correlation to any other cell is below 0.95
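The filtering rule can be sketched in a few lines. This is a minimal sketch, assuming the single-cell data is a (cells × genes) array. The function name and the toy profiles are hypothetical, and only the 0.95 threshold comes from the slide.

```python
import numpy as np

def filter_outlier_cells(cells, threshold=0.95):
    """Keep cells whose best Pearson correlation with any other cell is >= threshold."""
    corr = np.corrcoef(cells)        # rows are cells, so this is a cell-by-cell matrix
    np.fill_diagonal(corr, -np.inf)  # ignore each cell's correlation with itself
    keep = corr.max(axis=1) >= threshold
    return cells[keep], keep

# Toy data: the first two profiles are nearly proportional, the third is unrelated.
cells = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 1.0, 4.0, 2.0, 3.0],
])
filtered, keep = filter_outlier_cells(cells)
print(keep)
```

In the toy data the third cell is dropped because it correlates poorly with both of the others.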
9
Step 1: PCA of Clustering
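The clustering-and-averaging part of step 1 might look like the sketch below. It is a plain Lloyd's k-means written with NumPy, not the authors' code. The farthest-point initialisation is a choice made here for determinism, and the function name and toy data are hypothetical. The cluster centroids, transposed to genes × cell types, play the role of the reduced signature matrix.

```python
import numpy as np

def kmeans_centroids(cells, k, n_iter=100):
    """Lloyd's k-means on a (cells x genes) array; returns (centroids, labels)."""
    # Farthest-point initialisation (an assumption of this sketch): start from
    # the first cell, then repeatedly add the cell farthest from chosen centroids.
    centroids = [cells[0]]
    for _ in range(1, k):
        d = np.min([((cells - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(cells[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        # Assign each cell to its nearest centroid (squared Euclidean distance).
        dists = ((cells[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Average the cells of each cluster; keep the old centroid if a cluster is empty.
        new = np.array([cells[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Toy data: two well-separated groups of 10 cells measured on 4 genes.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 0.5, size=(10, 4))
group_b = rng.normal(100.0, 0.5, size=(10, 4))
cells = np.vstack([group_a, group_b])

centroids, labels = kmeans_centroids(cells, k=2)
S_reduced = centroids.T   # genes x cell types
print(S_reduced.shape)
```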
10
Step 2: Estimate Mixture Proportions
For a given mixture i: [equation not preserved in this transcript]
Reduced signature matrix: centroids of the k-means clusters
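The slide's equation is not available in this transcript, so the sketch below assumes a common formulation: for mixture i, minimise $\|S_r c_i - x_i\|^2$ subject to $c_i \ge 0$ and $\sum_j c_{ij} = 1$, where $S_r$ is the reduced signature matrix. That objective and its constraints are an assumption, not a confirmed detail of the talk. A simple projected-gradient loop stands in for a dedicated quadratic programming solver so that the example needs only NumPy. All function names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def estimate_proportions(S_reduced, x, n_iter=5000):
    """Minimise ||S_reduced @ c - x||^2 over the simplex by projected gradient."""
    n_types = S_reduced.shape[1]
    c = np.full(n_types, 1.0 / n_types)   # start from equal proportions
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(S_reduced, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * S_reduced.T @ (S_reduced @ c - x)
        c = project_to_simplex(c - step * grad)
    return c

# Toy check: recover known proportions from a noiseless mixture of 3 cell types.
rng = np.random.default_rng(0)
S_reduced = rng.uniform(0.0, 10.0, size=(31, 3))   # 31 genes, as in the qPCR panel
c_true = np.array([0.2, 0.5, 0.3])
x = S_reduced @ c_true
c_hat = estimate_proportions(S_reduced, x)
print(np.round(c_hat, 3))
```

Running this once per column of X gives the full concentration matrix, matching the "1 mixture at a time" description.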
11
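Step 3: Estimating Full Expression Signatures
[Figure: X (genes × mixtures) = S (genes × cell types) · C (cell types × mixtures), extended by one row for a new gene]
Now solve: [equation not preserved in this transcript]
s: signatures of the new gene to estimate
C: known from step 2
x: observed signals from the new gene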
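With C fixed, the mixing model gives one small problem per gene. The sketch below assumes the problem is to minimise $\|C^\top s - x\|^2$ subject to $s \ge 0$. The non-negativity constraint is an assumption, since the slide's equation is not available in this transcript. As a stand-in for a quadratic programming solver it uses projected gradient descent with NumPy only. The function name and the toy data are hypothetical.

```python
import numpy as np

def estimate_signature(C, x, n_iter=5000):
    """Minimise ||C.T @ s - x||^2 subject to s >= 0 by projected gradient."""
    A = C.T                                   # mixtures x cell types
    s = np.zeros(A.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ s - x)
        s = np.maximum(s - step * grad, 0.0)  # project onto s >= 0
    return s

# Toy check: 3 cell types, 50 mixtures, one new gene with known signature.
rng = np.random.default_rng(0)
C = rng.uniform(0.0, 1.0, size=(3, 50))
C /= C.sum(axis=0, keepdims=True)
s_true = np.array([4.0, 0.5, 9.0])
x = C.T @ s_true                              # observed signal of the new gene
s_hat = estimate_signature(C, x)
print(np.round(s_hat, 3))
```

Repeating the call for every gene measured in the mixtures fills in the full signature matrix S row by row.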
12
Experimental Design
Simulated Concentrations
   Sample uniformly at random from [0,1]
   Scale each column to sum to 1
Simulated Mixtures
   Choose single cells randomly with replacement from each cluster
   Sum to generate the mixture
Single Cell Profiles
   92 profiles
   31 genes
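The simulation procedure can be sketched as follows. The slide does not say how many cells go into each mixture, so `cells_per_mixture` is a hypothetical parameter. Drawing a number of cells from each cluster in proportion to its concentration is likewise an assumption of this sketch. The sketch also assumes every cluster label from 0 to k-1 has at least one cell.

```python
import numpy as np

def simulate_mixtures(cells, labels, n_mixtures, cells_per_mixture=1000, seed=0):
    """Return (X, C): simulated mixtures (genes x mixtures) and their concentrations."""
    rng = np.random.default_rng(seed)
    k = int(labels.max()) + 1
    # Concentrations: uniform on [0, 1], then each column scaled to sum to 1.
    C = rng.uniform(0.0, 1.0, size=(k, n_mixtures))
    C /= C.sum(axis=0, keepdims=True)
    X = np.zeros((cells.shape[1], n_mixtures))
    for i in range(n_mixtures):
        for j in range(k):
            members = np.flatnonzero(labels == j)
            n_draw = int(round(C[j, i] * cells_per_mixture))
            # Sample cells of cluster j with replacement and add up their profiles.
            picked = rng.choice(members, size=n_draw, replace=True)
            X[:, i] += cells[picked].sum(axis=0)
    return X, C

# Toy data: 6 cells, 3 genes, 3 clusters with identical cells inside each cluster.
cells = np.array([[1, 0, 0], [1, 0, 0], [0, 2, 0],
                  [0, 2, 0], [0, 0, 3], [0, 0, 3]], dtype=float)
labels = np.array([0, 0, 1, 1, 2, 2])
X, C = simulate_mixtures(cells, labels, n_mixtures=5)
print(X.shape, C.sum(axis=0))
```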
13
Data: RT-qPCR
CT values are the cycle in which the gene was detected
Relative normalization to housekeeping genes
   Housekeeping genes: gapdh, bactin1
   Geometric mean (Vandesompele, 2002)
dCT(x) = geometric mean - CT(x)
expression(x) = 2^dCT(x)
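A small worked example of this normalization follows. It assumes the geometric mean is taken over the CT values of the two housekeeping genes in the same cell. The CT values and the gene names `geneA` and `geneB` are invented for illustration.

```python
import math

def relative_expression(ct, housekeeping=("gapdh", "bactin1")):
    """Convert CT values of one cell into 2^dCT relative expression."""
    # Geometric mean of the housekeeping CT values (the reference level).
    ref = math.exp(sum(math.log(ct[g]) for g in housekeeping) / len(housekeeping))
    # dCT(x) = reference - CT(x); expression(x) = 2 ** dCT(x)
    return {gene: 2.0 ** (ref - value)
            for gene, value in ct.items() if gene not in housekeeping}

# Invented CT values: the geometric mean of 16 and 25 is 20.
ct = {"gapdh": 16.0, "bactin1": 25.0, "geneA": 18.0, "geneB": 22.0}
expr = relative_expression(ct)
print(expr)
```

A gene detected two cycles earlier than the reference gets a relative expression of about 4, and one detected two cycles later gets about 0.25.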
14
Accuracy of Inferred Mixing Proportions
15
Concentration Matrix: Concordance
16
Concentration by # Genes: Random
17
Concentration by # Genes: Ranked
18
Leave-one-out: Concentration, 50 mixtures
[Plot labels: RMSE, 2^dCT, Missing Gene]
19
Leave-one-out: Signature, 10 mixtures
[Plot labels: RMSE, 2^dCT, Missing Gene]
20
Leave-one-out: Signature, 50 mixtures
[Plot labels: RMSE, 2^dCT, Missing Gene]
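The leave-one-out slides report RMSE. As a reminder of the metric, here is a minimal sketch. The function name and the example values are hypothetical, and the transcript does not say exactly which estimated and true quantities were compared.

```python
import numpy as np

def rmse(estimated, truth):
    """Root-mean-square error between two arrays of the same shape."""
    estimated = np.asarray(estimated, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((estimated - truth) ** 2)))

# Invented example: estimated vs. true values for one left-out gene.
error = rmse([0.0, 0.0], [3.0, 4.0])
print(error)
```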
21
Future Work
Bootstrapping to report a confidence interval (CI) for each estimated concentration and signature
   Show correlation between large CI and poor accuracy
Mixing of heterogeneous technologies
   qPCR for single cells, RNA-seq for mixtures
   Normalization (needs to be linear)
Whole-genome scale
   Number of genes to estimate: 10,000+ signatures
Data!
22
Conclusion
Special thanks to: Ion Mandoiu, Craig Nelson, Caroline Jakuba, Mathew Gajdosik
James.Lindsay@engr.uconn.edu