Algorithms research. Tandy Warnow, UT-Austin.

“Algorithms group”
UT-Austin: Warnow, Hunt
UCB: Rao, Karp, Papadimitriou, Russell, Myers
UCSD: Huelsenbeck
UNM: Moret, Bader, Williams
External participants: Mossel (UCB), Huson (Germany), Steel (NZ), and others

Main research foci
Solving maximum parsimony and maximum likelihood more effectively
“Fast converging methods”
Gene order and content phylogeny
Reticulate evolution
Multiple sequence alignment at the genomic level

GRAPPA (Genome Rearrangement Analysis under Parsimony and other Phylogenetic Algorithms)
Heuristics for NP-hard optimization problems
Fast polynomial-time distance-based methods
Contributors: U. New Mexico, U. Texas at Austin, Università di Bologna (Italy)
Poster: Jijun Tang

Maximum Parsimony on Rearranged Genomes (MPRG)
The leaves are rearranged genomes.
Find the tree that minimizes the total number of rearrangement events.
[Figure: example tree with nodes labeled A to F; total length = 18]
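To make "total length" concrete, here is a minimal sketch of how a tree could be scored once every node carries a gene order. It is not GRAPPA's implementation: it assumes circular genomes written as signed integer lists over the same gene set, and it uses breakpoint distance as a simple stand-in for the number of rearrangement events. The function names and the edge-list tree representation are hypothetical.

```python
def adjacencies(genome):
    """Set of adjacencies of a circular signed gene order, in canonical form."""
    n = len(genome)
    adj = set()
    for i in range(n):
        a, b = genome[i], genome[(i + 1) % n]
        # Reading (a, b) on one strand is the same as (-b, -a) on the other,
        # so store whichever of the two tuples sorts first.
        adj.add(min((a, b), (-b, -a)))
    return adj


def breakpoint_distance(g1, g2):
    """Number of adjacencies of g1 that do not occur in g2 (same gene set assumed)."""
    return len(adjacencies(g1) - adjacencies(g2))


def tree_length(edges, genomes):
    """Sum of breakpoint distances over all edges of a tree.

    edges   -- list of (node, node) pairs
    genomes -- dict mapping every node name (leaf or internal) to a gene order
    """
    return sum(breakpoint_distance(genomes[u], genomes[v]) for u, v in edges)
```

In the MPRG setting only the leaf genomes are given, so a search would also have to choose the tree and the internal gene orders that minimize this sum; the sketch covers only the scoring step.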

Benchmark gene order dataset: Campanulaceae
12 genomes + 1 outgroup (Tobacco), 105 gene segments
NP-hard optimization problems: breakpoint and inversion phylogenies
1997: BPAnalysis (Blanchette and Sankoff): 200 years (est.)
2000: GRAPPA v1.1 on the 512-processor Los Lobos Supercluster machine: 2 minutes (200,000-fold speedup per processor)
2003: latest version of GRAPPA: 2 minutes on a single processor (1-billion-fold speedup per processor)

Reticulate Evolution
Group leader: Randy Linder
Software: (1) producing random networks, (2) simulating sequences down networks, (3) performance evaluation of methods, (4) inferring reticulate networks
Current reconstruction methods are limited to one reticulation event
Poster: Luay Nakhleh

20-taxon 1-hybrid network. 0.1 scaling factor.

MP/ML heuristics
Disk-Covering Methods (DCMs): divide-and-conquer strategies that boost the performance of base methods for MP/ML (Warnow)
MrBayes (Huelsenbeck)
New I-DCM3 technique improves upon the Ratchet and TBR
Poster: Usman Roshan (DCM-MP)

Gutell dataset: 854 rRNA sequences
Iterative-DCM3 trials find trees of MP score in 30 hours, whereas ratchet500 trials take 45 hours to find trees of same score
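For reference, the MP score that these heuristics compete on can be computed for any fixed tree with Fitch's algorithm. The sketch below is illustrative only and is not taken from the DCM or Ratchet software: it assumes a rooted binary tree written as nested tuples of leaf names and an alignment given as a dict of equal-length strings, and both function names are hypothetical.

```python
def fitch_site_score(tree, leaf_states):
    """Minimum number of state changes for one site on a rooted binary tree."""
    def visit(node):
        if isinstance(node, str):
            # Leaf: its state set is the single observed character.
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # Children can agree on a state, so no change is needed here.
            return common, left_cost + right_cost
        # Children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def mp_score(tree, alignment):
    """Parsimony score of a tree: Fitch scores summed over all alignment sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )
```

Scoring one tree this way takes time linear in the number of taxons and sites; the hard part the heuristics address is searching the space of trees for one with a low score.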

Other planned projects (partial list)
Multiple Sequence Alignment (Myers and Williams)
Steiner Tree algorithms: error bounds and new heuristics (Rao)
MCMC methods (Russell and Huelsenbeck)
Symbolic representation of data (Hunt)
Parallel algorithms (Bader and Williams)

Questions for group
How should we measure performance?
How should we use simulated data?
How should we use real datasets?
How can we study criteria (MP, ML, etc.) as opposed to methods?
Should we sponsor DIMACS-style challenges?
Others?
(Please bring questions, comments, and answers to the break-out session.)