BNFO 602 Phylogenetics – maximum likelihood


BNFO 602 Phylogenetics – maximum likelihood Usman Roshan

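Maximum Likelihood
D = data, M = model
Bayes rule: P(M|D) = P(D|M) P(M) / P(D)
P(M|D) is the posterior probability
P(D|M) is the likelihood
P(M) is the prior probability of the model
Rewriting P(D) as a sum over all models M' gives
P(M|D) = P(D|M) P(M) / ∑_M' P(D|M') P(M')
which implies that P(M|D) is proportional to P(D|M) P(M).
With uniform priors over k models, P(M) = 1/k, so
P(M|D) = P(D|M) (1/k) / ∑_M' P(D|M') (1/k) = P(D|M) / ∑_M' P(D|M')
Under uniform priors, the model with the highest likelihood is therefore also the model with the highest posterior probability.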
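As a small numerical illustration of the Bayes rule above, the sketch below normalizes a list of likelihoods into posteriors. The function name `posterior` and the three likelihood values are made up for the example; they are not from the slides.

```python
def posterior(likelihoods, priors=None):
    """Return P(M|D) for each model, given P(D|M) and optional priors P(M)."""
    k = len(likelihoods)
    if priors is None:
        priors = [1.0 / k] * k  # uniform prior: P(M) = 1/k
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(D), the sum over all models
    return [j / evidence for j in joint]


# Hypothetical likelihoods P(D|M) for three candidate models.
likelihoods = [0.02, 0.05, 0.01]
post = posterior(likelihoods)

best_by_posterior = max(range(len(post)), key=post.__getitem__)
best_by_likelihood = max(range(len(likelihoods)), key=likelihoods.__getitem__)
```

With a uniform prior the factor 1/k cancels, so `best_by_posterior` and `best_by_likelihood` pick the same model.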

Maximum Likelihood
Data (input) is the alignment.
Model consists of:
  the tree, with branch lengths and with leaves labeled by the DNA sequences in the data (input)
  a DNA sequence evolution model (such as Jukes-Cantor)
How do we compute the likelihood P(D|T) of the tree below?
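To make the evolution model concrete, here is a minimal sketch of the Jukes-Cantor substitution probabilities and the likelihood of the simplest possible tree, two leaves joined by one branch. It assumes branch length t is measured in expected substitutions per site and that sites evolve independently; the helper names and the two toy sequences are illustrative only.

```python
import math

STATES = "ACGT"


def jc_prob(x, y, t):
    """Jukes-Cantor probability that base x becomes base y over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e


def pair_log_likelihood(seq1, seq2, t):
    """Log-likelihood of two aligned sequences separated by total branch length t > 0."""
    total = 0.0
    for x, y in zip(seq1, seq2):
        # 0.25 is the stationary probability of x; sites are multiplied (logs added).
        total += math.log(0.25 * jc_prob(x, y, t))
    return total


# Toy alignment: the sequences differ at 1 of 8 sites.
seq1 = "ACGTACGT"
seq2 = "ACGTACGA"
p = 1 / 8
t_hat = -0.75 * math.log(1 - 4 * p / 3)  # closed-form Jukes-Cantor distance
```

For two sequences the branch length that maximizes the likelihood has this closed form; for larger trees the branch lengths have to be optimized numerically.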

Which of the two trees below has the higher likelihood?

Maximum Likelihood
ML problem: under a fixed model, find the tree, with branch lengths and internal node sequences, that has the highest likelihood.
  Very large search space
  NP-hard
Sub-problems:
  What is the likelihood of a tree when branch lengths and internal nodes are given? Linear-time solution.
  What if no internal nodes are given? Felsenstein's algorithm gives a linear-time solution.
  What if no branch lengths are given? We use gradient descent (on the negative log-likelihood) to optimize them.
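The two sub-problems without internal nodes or branch lengths can be sketched as follows, assuming the Jukes-Cantor model, independent sites, and a rooted tree written as nested tuples of (child, branch_length) pairs. The tree encoding, the function names, and the toy sequences are assumptions made for this example. `conditional` is Felsenstein's pruning recursion; `optimize_branch` is a simple finite-difference gradient ascent on the log-likelihood (equivalently, descent on its negative) for a single branch, not a production optimizer.

```python
import math

STATES = "ACGT"


def jc_prob(x, y, t):
    """Jukes-Cantor probability that base x becomes base y over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e


def conditional(node, site, seqs):
    """Return {state: P(leaf data below node | node has that state)} for one site."""
    if isinstance(node, str):  # leaf: probability 1 for the observed base
        observed = seqs[node][site]
        return {s: 1.0 if s == observed else 0.0 for s in STATES}
    result = {s: 1.0 for s in STATES}
    for child, t in node:  # internal node: combine children bottom-up
        child_cond = conditional(child, site, seqs)
        for s in STATES:
            result[s] *= sum(jc_prob(s, c, t) * child_cond[c] for c in STATES)
    return result


def log_likelihood(tree, seqs):
    """Sum over sites of the log of the root likelihood (uniform root distribution)."""
    n_sites = len(next(iter(seqs.values())))
    total = 0.0
    for site in range(n_sites):
        cond = conditional(tree, site, seqs)
        total += math.log(sum(0.25 * cond[s] for s in STATES))
    return total


def optimize_branch(build_tree, t, seqs, steps=100, rate=0.05, h=1e-6, t_min=1e-4):
    """Gradient ascent on one branch length; halve the step when it does not improve."""
    best = log_likelihood(build_tree(t), seqs)
    for _ in range(steps):
        up = log_likelihood(build_tree(t + h), seqs)
        down = log_likelihood(build_tree(t - h), seqs)
        grad = (up - down) / (2 * h)  # finite-difference derivative
        candidate = max(t_min, t + rate * grad)
        value = log_likelihood(build_tree(candidate), seqs)
        if value > best:
            t, best = candidate, value
        else:
            rate /= 2
    return t, best


# Toy data: three taxa, four sites (made up for illustration).
seqs = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
inner = (("A", 0.1), ("B", 0.2))


def build_tree(t_c):
    """Tree ((A:0.1, B:0.2):0.05, C:t_c) with only the branch to C left free."""
    return ((inner, 0.05), ("C", t_c))


ll_start = log_likelihood(build_tree(0.3), seqs)
t_opt, ll_opt = optimize_branch(build_tree, 0.3, seqs)
```

Each site visits every node once, so the likelihood computation is linear in the number of taxa per site. Optimizing all branch lengths at once would repeat the same idea over every branch, usually with a better optimizer than this fixed-step sketch.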

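Maximum Likelihood
Comparison to MP:
  Both are NP-hard.
  For a fixed tree, the parsimony score can be found in polynomial time.
  For a fixed tree without branch lengths, finding the maximum likelihood score is hard: the branch lengths must be optimized numerically, and no polynomial-time algorithm is known. (With branch lengths given, Felsenstein's algorithm computes the likelihood in linear time.)
  ML uses local search heuristics similar to those used for MP.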
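For comparison, the polynomial-time parsimony score of a fixed tree can be computed with Fitch's algorithm. The sketch below assumes a rooted binary tree written as nested pairs of leaf names and unit cost for every substitution; the names and the toy alignment are made up for the example.

```python
def fitch(node, site, seqs):
    """Return (candidate state set, number of changes) for one site below node."""
    if isinstance(node, str):  # leaf: the observed base, no changes yet
        return {seqs[node][site]}, 0
    left, right = node
    left_set, left_cost = fitch(left, site, seqs)
    right_set, right_cost = fitch(right, site, seqs)
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Total number of changes over all sites for a fixed tree."""
    n_sites = len(next(iter(seqs.values())))
    return sum(fitch(tree, site, seqs)[1] for site in range(n_sites))


# Toy alignment of four taxa and two candidate topologies.
seqs = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGA"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

score_1 = parsimony_score(tree_1, seqs)  # 2 changes
score_2 = parsimony_score(tree_2, seqs)  # 3 changes
```

Scoring a tree is cheap under both criteria once everything is fixed; the hard part in both cases is the search over tree topologies.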