
Ultra-large alignments using Ensembles of HMMs. Nam-phuong Nguyen, Institute for Genomic Biology, University of Illinois at Urbana-Champaign.


1 Ultra-large alignments using Ensembles of HMMs. Nam-phuong Nguyen, Institute for Genomic Biology, University of Illinois at Urbana-Champaign

2 UPP: Ultra-large alignments using Phylogeny-aware Profiles. Objective: Estimate accurate alignments on large datasets, which may be evolutionarily divergent and contain fragmentary sequences. Nguyen N., Mirarab S., Kumar K., and Warnow T. RECOMB 2015.

3 UPP Algorithmic Strategy

4 RNASim: alignment error. Note: All methods were given 24 hours on a 12-core machine. MAFFT fails to complete on 200K sequences, and Clustal-Omega completes only on the 10K dataset. On the 1-million-sequence RNASim dataset, UPP(Fast) generated an alignment in 12 days, compared to 15 days for PASTA. UPP(Fast) produced a better alignment (5.7% lower error), but PASTA produced a better tree (1.5% lower error).
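
The slide does not say how alignment error is measured. One common choice, assumed here purely for illustration, is to compare the sets of aligned residue pairs in a reference and an estimated alignment and report the fraction of reference pairs that are missing (SP-FN), the fraction of estimated pairs that are spurious (SP-FP), and their mean. The function names and the toy sequences below are hypothetical and are not taken from the talk.

```python
from itertools import combinations


def homologous_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    `alignment` maps a sequence name to its gapped row ('-' marks a gap).
    Each residue is identified by (sequence name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    next_index = {name: 0 for name in names}
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sp_error(reference, estimated):
    """Return (SP-FN, SP-FP, mean of the two) for an estimated alignment."""
    ref = homologous_pairs(reference)
    est = homologous_pairs(estimated)
    sp_fn = len(ref - est) / len(ref)  # reference pairs the estimate missed
    sp_fp = len(est - ref) / len(est)  # estimated pairs absent from the reference
    return sp_fn, sp_fp, (sp_fn + sp_fp) / 2


# Toy example: the estimate misplaces a single gap in s1.
reference = {"s1": "AC-GU", "s2": "ACGGU"}
estimated = {"s1": "ACG-U", "s2": "ACGGU"}
print(sp_error(reference, estimated))
```

Under this definition, a statement such as "5.7% lower error" would correspond to a difference of 0.057 in the mean of the two rates, assuming that is the metric behind the slide.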

5 Running Time: Wall-clock time used (in hours) given 12 processors

6 Ensemble of HMMs: Use a collection of HMMs instead of a single HMM to represent a backbone alignment. This improves alignment accuracy, which can lead to better downstream analyses: phylogenetic placement (SEPP; PSB 2012) and taxonomic identification (TIPP; Bioinformatics 2014).
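
The ensemble idea can be sketched in a few lines. This is a deliberately simplified illustration, not the UPP implementation: it replaces profile HMMs with gap-free position-specific scoring profiles (no insert or delete states), and it uses hand-picked subsets of a toy backbone alignment rather than subsets derived from a tree. All names and sequences are hypothetical. The point is only that a query is scored against every model in the collection and assigned to the one that fits it best, instead of being forced through a single model built from the whole backbone.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def build_profile(aligned_rows, pseudocount=1.0):
    """Build per-column log-probabilities from equal-length aligned rows."""
    profile = []
    for column in zip(*aligned_rows):
        counts = Counter(c for c in column if c in ALPHABET)  # gaps are ignored
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append(
            {a: math.log((counts[a] + pseudocount) / total) for a in ALPHABET}
        )
    return profile


def score(profile, query):
    """Log-likelihood of an ungapped query with the same length as the profile."""
    return sum(column[residue] for column, residue in zip(profile, query))


def best_model(ensemble, query):
    """Score the query against every model and return the best one with all scores."""
    scores = {name: score(profile, query) for name, profile in ensemble.items()}
    return max(scores, key=scores.get), scores


# Toy backbone alignment, split by hand into two subsets (an assumption).
backbone = {
    "clade_a": ["ACGUAC", "ACGUAC", "ACGAAC"],
    "clade_b": ["UGCAUG", "UGCAUG", "UGCUUG"],
}

# The ensemble holds one model per subset plus one for the full backbone.
ensemble = {name: build_profile(rows) for name, rows in backbone.items()}
ensemble["all"] = build_profile([row for rows in backbone.values() for row in rows])

best, scores = best_model(ensemble, "ACGUAC")
print(best, scores)
```

In this toy case the subset-specific model fits the query better than the single model built from the whole backbone, which is the intuition behind representing a divergent backbone with several models.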

