Scalable Algorithms for Next-Generation Sequencing Data Analysis Ion Mandoiu UTC Associate Professor in Engineering Innovation Department of Computer Science.


1 Scalable Algorithms for Next-Generation Sequencing Data Analysis Ion Mandoiu UTC Associate Professor in Engineering Innovation Department of Computer Science & Engineering

2 Next Generation Sequencing: Roche/454, Illumina HiSeq, SOLiD 5500, Ion Proton, PacBio RS, Oxford Nanopore

3 Ongoing Projects
Transcriptome analysis
- Transcriptome quantification and differential expression analysis
- Computational deconvolution of heterogeneous samples
- Transcriptome and meta-transcriptome assembly
Viral quasispecies
- Quasispecies reconstruction from NGS reads
- IBV evolution and vaccine optimization
- Transmission graphs
Immunoinformatics
- Genomics-guided immunotherapy
- Deep panning for early cancer detection
Sequencing error correction, genome assembly and scaffolding, metabolomics, biomarker selection, …
- More info & software at http://dna.engr.uconn.edu

4 Transcriptome Quantification
RNA-PhASE pipeline for allele-specific isoform expression
[Figure labels: ABC, AC]
IsoEM algorithm for isoform expression estimation
- Incorporates fragment length distribution, hexamer bias correction, …
Ion Torrent MAQC datasets
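To make the idea of EM-based isoform expression estimation concrete, here is a minimal sketch of a generic expectation-maximization loop. It is not IsoEM itself: it leaves out the fragment length distribution and hexamer bias correction named on the slide. The function names, the read-class input format, and the isoform names and lengths in the usage note are all assumptions made for illustration.

```python
def em_read_fractions(read_classes, lengths, iterations=200):
    """Estimate the fraction of reads emitted by each isoform.

    read_classes: list of (compatible_isoforms, read_count) pairs
                  (hypothetical input format).
    lengths:      dict mapping isoform name -> length in bases.
    """
    isoforms = sorted(lengths)
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}
    for _ in range(iterations):
        expected = {iso: 0.0 for iso in isoforms}
        # E-step: split each read class among its compatible isoforms,
        # weighting isoform i by theta[i] / lengths[i].
        for compatible, count in read_classes:
            weights = {iso: theta[iso] / lengths[iso] for iso in compatible}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # class with no usable isoform; skip it
            for iso, weight in weights.items():
                expected[iso] += count * weight / norm
        # M-step: renormalize the expected read counts.
        assigned = sum(expected.values())
        theta = {iso: expected[iso] / assigned for iso in isoforms}
    return theta


def isoform_frequencies(theta, lengths):
    """Convert read fractions to length-normalized isoform frequencies."""
    per_base = {iso: theta[iso] / lengths[iso] for iso in theta}
    total = sum(per_base.values())
    return {iso: value / total for iso, value in per_base.items()}
```

For example, with two hypothetical isoforms of lengths `{"ABC": 300, "AC": 200}`, calling `em_read_fractions([(("ABC", "AC"), 60), (("ABC",), 20), (("AC",), 20)], lengths)` splits the 60 ambiguous reads between the isoforms in proportion to the current estimates, and `isoform_frequencies` then corrects for the longer isoform producing more reads per molecule.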

5 Differential Expression
Fast estimation enables the use of accurate bootstrapping-based methods
MAQC 454 datasets: UHRR (SRX002934) vs HBRR (SRX002935)
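The slide does not spell out the bootstrapping procedure, so the following is only a rough sketch of how a bootstrap-based differential expression call could look. It assumes each read has already been assigned to a gene, and it uses a simple read-fraction estimate with a pseudocount as a stand-in for a real expression estimator. The function names, the fold-change threshold, and the support cutoff are all hypothetical.

```python
import random


def bootstrap_fold_changes(reads_a, reads_b, gene, n_boot=200, seed=0):
    """Bootstrap the expression fold change of `gene` between two samples.

    reads_a, reads_b: lists of gene labels, one per read (an assumed,
    simplified input; a real pipeline would rerun expression estimation
    on each resampled read set).
    """
    rng = random.Random(seed)

    def resampled_fraction(reads):
        # Resample the reads with replacement and re-estimate expression.
        sample = rng.choices(reads, k=len(reads))
        return (sample.count(gene) + 1) / (len(sample) + 1)  # pseudocount

    return [resampled_fraction(reads_a) / resampled_fraction(reads_b)
            for _ in range(n_boot)]


def is_differentially_expressed(fold_changes, min_fold=2.0, support=0.95):
    """Call a gene DE if enough bootstrap replicates exceed the fold threshold."""
    hits = sum(1 for fc in fold_changes
               if fc >= min_fold or fc <= 1.0 / min_fold)
    return hits / len(fold_changes) >= support
```

Because every bootstrap replicate repeats the expression estimate, the cost grows linearly with `n_boot`, which is why a fast estimator matters for this kind of method.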

6 Computational Deconvolution of Heterogeneous Samples
Goal: characterize expression of mesoderm progenitor cells
- Whole-transcriptome expression data for NSB cell mixtures + single-cell qPCR data for a few genes
Three-step approach
- Cluster single-cell qPCR data and infer “reduced” cell type signatures
- Infer mixing proportions based on reduced signatures using quadratic programming
- Infer full expression signatures based on mixing proportions, solving one quadratic program per gene
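Steps two and three of the approach above are both constrained least-squares problems, which is a special case of quadratic programming. The sketch below solves them with projected gradient descent, swapped in here for a general-purpose QP solver so that the example needs no external libraries. The slide does not state the exact objective or constraints, so the formulation (proportions non-negative and summing to one, signatures non-negative) and all function names are assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection onto {p : p >= 0, sum(p) = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        shift = (cumulative - 1.0) / k
        if u_k - shift > 0:
            theta = shift
    return [max(x - theta, 0.0) for x in v]


def project_nonnegative(v):
    """Euclidean projection onto {s : s >= 0}."""
    return [max(x, 0.0) for x in v]


def projected_least_squares(A, b, project, steps=2000):
    """Minimize ||A x - b||^2 over the set defined by `project`."""
    n_rows, n_cols = len(A), len(A[0])
    # 1 / ||A||_F^2 is a safe step size (assumes A is not all zeros).
    step = 1.0 / sum(a * a for row in A for a in row)
    x = project([1.0 / n_cols] * n_cols)
    for _ in range(steps):
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        x = project([x[j] - step * grad[j] for j in range(n_cols)])
    return x


def mixing_proportions(signatures, mixture):
    """Step 2: signatures[g][c] is gene g in cell type c; mixture[g] is observed."""
    return projected_least_squares(signatures, mixture, project_to_simplex)


def full_signature(proportions, gene_mixtures):
    """Step 3, one gene: proportions[s][c] per sample; gene_mixtures[s] observed."""
    return projected_least_squares(proportions, gene_mixtures, project_nonnegative)
```

In this sketch `mixing_proportions` is called once per mixture sample using only the genes in the reduced signatures, and `full_signature` is then called once per gene across all mixture samples, mirroring the "one quadratic program per gene" step.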

7 Reference-Guided Transcriptome Reconstruction
[Figure: candidate transcripts t1 to t4 shown as exon paths; exon-path labels as extracted: 1742365, 174365, 174235, 17435]

8 TRIP: Transcriptome Reconstruction using Integer Programming
Select the smallest set of putative transcripts that yields a good statistical fit between the fragment length distribution
- empirically determined during library preparation
- implied by “mapping” read pairs
[Figure labels: 13, 123; 500, 300, 200; Mean: 500; Std. dev.: 50]
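As a toy illustration of the selection principle, the sketch below picks the smallest set of candidate transcripts such that every read pair maps to at least one selected transcript with an implied fragment length close to the library mean. This is a deliberately simplified stand-in: it uses exhaustive search instead of an integer program, and a per-read-pair tolerance instead of a fit to the whole distribution. The function name, the input format, the tolerance `max_dev`, and the transcript names are assumptions.

```python
from itertools import combinations


def smallest_explaining_set(implied_lengths, mean, std, max_dev=2.0):
    """Find a smallest transcript set that explains every read pair.

    implied_lengths: one dict per read pair, mapping transcript name ->
                     fragment length implied by mapping the pair onto it.
    A transcript "explains" a pair if that length is within
    max_dev * std of the library mean.
    Returns a set of transcript names, or None if no set works.
    """
    transcripts = sorted({t for lengths in implied_lengths for t in lengths})

    def explains(transcript, lengths):
        return (transcript in lengths
                and abs(lengths[transcript] - mean) <= max_dev * std)

    # Try subsets in order of increasing size, so the first hit is smallest.
    for size in range(len(transcripts) + 1):
        for subset in combinations(transcripts, size):
            if all(any(explains(t, lengths) for t in subset)
                   for lengths in implied_lengths):
                return set(subset)
    return None
```

With a library mean of 500 and standard deviation of 50, a read pair whose implied fragment length is 500 on a hypothetical transcript `t123` but only 300 on `t13` (because `t13` skips a 200-base exon) counts as evidence for `t123`. Exhaustive search is exponential in the number of candidates, so it only suits tiny examples like this one.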

9 De Novo (Meta)Transcriptome Assembly of Bugula neritina and its Symbiont
Uncultured bacterial symbiont produces bryostatins
- Symbiont absent in Northern Atlantic populations

10 De Novo (Meta)Transcriptome Assembly of Bugula neritina and its Symbiont
Developing a scalable multi-sample metatranscriptome assembly pipeline based on differential-coverage clustering of reads

11 Acknowledgements: Sahar Al Seesi, Abdul Banday, Amir Bayegan, Gabriel Ilie, Caroline Jakuba, James Lindsay, Rahul Kanadia, Craig Nelson, Marius Nicolae, Adrian Caciula, Nicole Lopanik, Serghei Mangul, Yvette Temate Tiagueu, Alex Zelikovsky

