Integration of Clustering and Multidimensional Scaling to Determine Phylogenetic Trees as Spherical Phylograms Visualized in 3 Dimensions


Introduction

Phylogenetic analysis is commonly used to analyze genetic sequence data from fungal communities, while ordination and clustering techniques are commonly used to analyze sequence data from bacterial communities. However, few studies have attempted to link these two independent approaches. We propose a method, which we call the spherical phylogram (SP), to display the phylogenetic tree within the clustering and visualization result produced by a pipeline called DACIDR. Unlike traditional tree display methods, the SP lets the correlations between the tree and the clustering be observed directly. In addition, we propose an algorithm called interpolative joining (IJ) to construct and visualize the SP in 3D space.

Figure 1. Screenshots of the visualization result after data clustering. Panels: Mega Region Visualization of full data (446k Fungi Data); Cluster Visualization from Mega Region 0; Visualization of all the clusters found by Recursive Clustering.

Figure 2. Maximum likelihood phylogenetic tree built from the reference sequences and the representative sequences found in each cluster, collapsed into clades at the genus level as denoted by colored triangles at the ends of the branches. Branch lengths denote levels of sequence divergence between genera, and nodes are labeled with bootstrap confidence values. Representative sequences from spores that are not part of another clade are denoted with the label ‘454 sequence from spore’. This figure was generated with FigTree.

Spherical phylogram visualized using the phylogenetic tree generated by RAxML from the representative sequences and reference sequences. The color scheme is the same as in Figure 2.

Figure 3. Flowchart of Large Scale Data Clustering and Visualization. The pipeline is based on the MPI and MapReduce parallel computing frameworks. Flowchart labels, by group:
DACIDR: Input Sequences, Pairwise Sequence Alignment, Dissimilarity Matrix, Pairwise Clustering, Multidimensional Scaling, Sample Clustering Result, Interpolation, Mega Region Result, Visualization.
Recursive Clustering: Mega Region 0, Mega Region 1, …, Mega Region N (each processed by DACIDR), Final Clustering Result, Visualization.
3D Phylogenetic Tree Visualization: Find Cluster Centers, Representative Sequences, Reference Sequences, RAxML, Interpolative Joining, Spherical Phylogram, Visualization.

Contacts: Yang Ruan (yangruan@indiana.edu), Saliya Ekanayake (sekayana@indiana.edu), Geoffrey Fox (gcf@indiana.edu)
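The step from a dissimilarity matrix to 3D coordinates, which the Figure 3 flowchart names as Multidimensional Scaling, can be illustrated with a small sketch. This is not the DACIDR implementation: it swaps in classical (Torgerson) MDS because that fits in a few lines, and the function name `classical_mds`, the toy points, and the choice of NumPy are all assumptions made for illustration. In a real run the matrix would come from pairwise sequence alignment rather than from known coordinates.

```python
import numpy as np

def classical_mds(dissimilarity, dims=3):
    """Embed items in `dims` dimensions from a symmetric dissimilarity matrix.

    Illustrative classical MDS only; not the MDS variant used by DACIDR.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # Keep the largest eigenpairs; clip small negative eigenvalues to zero.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical toy input: Euclidean distances between four known 3D points,
# standing in for alignment-derived dissimilarities.
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
toy = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(toy, dims=3)
# Distances between the embedded points, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the toy dissimilarities are exact Euclidean distances in 3D, the embedding reproduces them up to rotation and translation. Alignment-based dissimilarities are generally not Euclidean, so an embedding of real sequence data would only approximate them.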
