Scalable Algorithms for Next-Generation Sequencing Data Analysis Ion Mandoiu UTC Associate Professor in Engineering Innovation Department of Computer Science.

Presentation transcript:

Scalable Algorithms for Next-Generation Sequencing Data Analysis Ion Mandoiu UTC Associate Professor in Engineering Innovation Department of Computer Science & Engineering

Next Generation Sequencing
Roche/454, Illumina HiSeq, SOLiD 5500, Ion Proton, PacBio RS, Oxford Nanopore

Ongoing Projects
Transcriptome Analysis
- Transcriptome quantification and differential expression analysis
- Computational deconvolution of heterogeneous samples
- Transcriptome and meta-transcriptome assembly
Viral quasispecies
- Quasispecies reconstruction from NGS reads
- IBV evolution and vaccine optimization
- Transmission graphs
Immunoinformatics
- Genomics-guided immunotherapy
- Deep panning for early cancer detection
Sequencing error correction, genome assembly and scaffolding, metabolomics, biomarker selection, …
- More info & software at

Transcriptome Quantification
- RNA-PhASE pipeline for allele-specific isoform expression
- IsoEM algorithm for isoform expression estimation
  - Incorporates fragment length distribution, hexamer bias correction, …
- Ion Torrent MAQC datasets
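The expectation-maximization idea behind isoform expression estimation can be illustrated with a minimal sketch. This is not the IsoEM implementation: it ignores the fragment length distribution and hexamer bias correction named on the slide, and it assumes each read arrives as the set of isoforms it is compatible with and that a read is equally likely to start anywhere along an isoform. The function name, input layout, and iteration count are all hypothetical.

```python
def em_isoform_fractions(reads, lengths, iterations=200):
    """Estimate the fraction of reads originating from each isoform.

    reads   -- list of sets; each set holds the isoforms a read is compatible with
    lengths -- dict mapping isoform name to its length (assumed input layout)
    """
    isoforms = sorted(lengths)
    # Start from a uniform guess.
    theta = {i: 1.0 / len(isoforms) for i in isoforms}
    usable = [r for r in reads if r]  # skip reads with no compatible isoform
    for _ in range(iterations):
        expected = {i: 0.0 for i in isoforms}
        # E-step: split each read among its compatible isoforms,
        # in proportion to theta[i] / lengths[i].
        for compatible in usable:
            weights = {i: theta[i] / lengths[i] for i in compatible}
            total = sum(weights.values())
            for i, w in weights.items():
                expected[i] += w / total
        # M-step: the new fraction is the expected share of reads.
        theta = {i: expected[i] / len(usable) for i in isoforms}
    return theta


if __name__ == "__main__":
    toy_reads = [{"A"}] * 20 + [{"B"}] * 20 + [{"A", "B"}] * 60
    print(em_isoform_fractions(toy_reads, {"A": 1000, "B": 1000}))
```

In the toy input, the 60 ambiguous reads are divided according to the evidence from the 40 uniquely assigned reads, which is the core of the approach.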

Differential Expression
- Fast estimation enables the use of accurate bootstrapping-based methods
- MAQC 454 datasets: UHRR SRX vs HBRR SRX002935
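To make the bootstrapping idea concrete, here is a minimal sketch that resamples reads with replacement in each condition and reports how often the resampled estimates differ by at least a chosen fold change. It is an illustration under simplifying assumptions, not the method used in the talk: each read is assumed to carry a single gene label, expression is a plain read fraction with a pseudocount, and the function names, the fold threshold of 2, and the replicate count are hypothetical.

```python
import random


def bootstrap_estimates(read_labels, gene, replicates, rng):
    """Re-estimate the read fraction of `gene` on bootstrap resamples of the reads."""
    n = len(read_labels)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(read_labels, k=n)  # resample with replacement
        count = sum(1 for label in sample if label == gene)
        estimates.append((count + 1) / n)  # pseudocount avoids division by zero
    return estimates


def bootstrap_support(reads_a, reads_b, gene, min_fold=2.0, replicates=200, seed=0):
    """Fraction of bootstrap replicate pairs whose fold change is at least min_fold."""
    rng = random.Random(seed)
    est_a = bootstrap_estimates(reads_a, gene, replicates, rng)
    est_b = bootstrap_estimates(reads_b, gene, replicates, rng)
    hits = sum(1 for a, b in zip(est_a, est_b) if max(a / b, b / a) >= min_fold)
    return hits / replicates


if __name__ == "__main__":
    condition_a = ["g1"] * 500 + ["g2"] * 500
    condition_b = ["g1"] * 50 + ["g2"] * 950
    print(bootstrap_support(condition_a, condition_b, "g1"))
    print(bootstrap_support(condition_a, condition_b, "g2"))
```

A gene would be called differentially expressed when its support exceeds a user-chosen level. Because every replicate requires a full re-estimation, this kind of procedure is practical only when the underlying estimator is fast, which is the point the slide makes.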

Computational Deconvolution of Heterogeneous Samples
Goal: characterize expression of mesoderm progenitor cells
- Whole-transcriptome expression data for NSB cell mixtures + single-cell qPCR data for a few genes
Three-step approach
- Cluster single-cell qPCR data and infer “reduced” cell type signatures
- Infer mixing proportions based on reduced signatures using quadratic programming
- Infer full expression signatures based on mixing proportions, solving one quadratic program per gene
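The second step, inferring mixing proportions from reduced signatures, can be written as a small quadratic program: minimize the squared error between the observed mixture and the signature-weighted sum, subject to the proportions being non-negative and summing to one. The sketch below solves it with projected gradient descent in plain Python rather than a dedicated QP solver. The function names, the input layout (one row per gene, one column per cell type), and the toy numbers are assumptions made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for j, value in enumerate(u, start=1):
        cumulative += value
        t = (cumulative - 1.0) / j
        if value - t > 0:
            tau = t  # keep the threshold from the last index that qualifies
    return [max(x - tau, 0.0) for x in v]


def estimate_mixing_proportions(signatures, mixture, iterations=5000):
    """Minimize ||S p - m||^2 over the simplex by projected gradient descent.

    signatures -- S, list of rows (genes), each a list of per-cell-type expression
    mixture    -- m, observed expression of each gene in the mixed sample
    """
    n_types = len(signatures[0])
    # The gradient's Lipschitz constant is at most 2 * ||S||_F^2, so this step is safe.
    frob_sq = sum(x * x for row in signatures for x in row)
    step = 1.0 / (2.0 * frob_sq)
    p = [1.0 / n_types] * n_types
    for _ in range(iterations):
        residual = [sum(s * q for s, q in zip(row, p)) - m
                    for row, m in zip(signatures, mixture)]
        gradient = [2.0 * sum(row[k] * r for row, r in zip(signatures, residual))
                    for k in range(n_types)]
        p = project_to_simplex([q - step * g for q, g in zip(p, gradient)])
    return p


if __name__ == "__main__":
    S = [[10, 0, 1], [0, 8, 1], [2, 1, 9], [5, 5, 0]]
    m = [2.3, 4.3, 3.6, 3.5]  # generated from proportions 0.2, 0.5, 0.3
    print(estimate_mixing_proportions(S, m))
```

The third step has the same shape with the roles swapped: the proportions are held fixed and each gene's signature across cell types is the unknown, giving one small quadratic program per gene.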

Reference-Guided Transcriptome Reconstruction
[Figure: transcripts t1, t2, t3, t4]

TRIP: Transcriptome Reconstruction using Integer Programming
Select the smallest set of putative transcripts that yields a good statistical fit between the fragment length distribution
- empirically determined during library preparation
- implied by “mapping” read pairs
Mean: 500; Std. dev.: 50
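The selection problem can be illustrated on a toy instance. The sketch below uses a deliberately simplified stand-in for the "good statistical fit" criterion: it assumes a read pair is explained by a transcript when the fragment length implied by mapping the pair to that transcript lies within one standard deviation of the mean (500 and 50, as on the slide), and it looks for the smallest transcript set that explains every pair. It enumerates subsets by brute force instead of calling an integer programming solver, so it only suits tiny examples. The function name and the input layout are hypothetical.

```python
from itertools import combinations


def smallest_consistent_set(implied_lengths, mean=500.0, std=50.0):
    """Return a smallest transcript set explaining every read pair, or None.

    implied_lengths -- {read_pair: {transcript: implied fragment length}}
                       (assumed layout; lengths come from mapping the pair)
    """
    candidates = sorted({t for lens in implied_lengths.values() for t in lens})
    # For each read pair, keep the transcripts whose implied fragment length
    # is within one standard deviation of the expected mean.
    supported = {
        pair: {t for t, length in lens.items() if abs(length - mean) <= std}
        for pair, lens in implied_lengths.items()
    }
    # Try subsets in order of increasing size, so the first hit is a smallest one.
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            chosen = set(subset)
            if all(chosen & options for options in supported.values()):
                return chosen
    return None  # some read pair has no transcript with a plausible length


if __name__ == "__main__":
    toy = {
        "p1": {"t1": 510, "t2": 700},
        "p2": {"t1": 480, "t3": 520},
        "p3": {"t1": 900, "t2": 505, "t3": 495},
    }
    print(smallest_consistent_set(toy))
```

In an integer programming formulation, each candidate transcript would get a 0/1 selection variable and the objective would minimize their sum. The brute-force loop here plays that role only for illustration.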

De Novo (Meta)Transcriptome Assembly of Bugula neritina and its Symbiont
- Uncultured bacterial symbiont produces bryostatins
- Symbiont absent in Northern Atlantic populations

De Novo (Meta)Transcriptome Assembly of Bugula neritina and its Symbiont
- Developing a scalable multi-sample metatranscriptome assembly pipeline based on differential-coverage clustering of reads

Acknowledgements
Sahar Al Seesi, Abdul Banday, Amir Bayegan, Gabriel Ilie, Caroline Jakuba, James Lindsay, Rahul Kanadia, Craig Nelson, Marius Nicolae, Adrian Caciula, Nicole Lopanik, Serghei Mangul, Yvette Temate Tiagueu, Alex Zelikovsky