Ortholog vs. paralog? 1. Collect Sequence Data Good Dataset

Presentation transcript:

1. Collect Sequence Data: ortholog vs. paralog?
Gene copies A and B are each present in species 1 to 4: A1, B1, A2, B2, A3, B3, A4, B4.
Good dataset: [A1, A2, A3, A4] (the same gene copy from every species)
Bad dataset: [A1, B2, A3, A4] (copy B from species 2 mixed in with copy A)
Draw the resulting trees.
OEB 192 – 11.09.14
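The good/bad dataset distinction above can be sketched as a simple label check. This is only an illustration: it assumes the slide's naming scheme (a gene-copy letter followed by a species number), and the function name is hypothetical. Real orthology assignment requires a phylogenetic or similarity-based analysis, not label parsing.

```python
def is_single_copy_ortholog_set(labels):
    """Return True if all labels share one gene copy and no species repeats.

    Assumes labels look like 'A1': gene-copy letter, then species number.
    """
    gene_copies = {label[0] for label in labels}   # e.g. {'A'} or {'A', 'B'}
    species = [label[1:] for label in labels]      # e.g. ['1', '2', '3', '4']
    return len(gene_copies) == 1 and len(set(species)) == len(species)


good = ["A1", "A2", "A3", "A4"]
bad = ["A1", "B2", "A3", "A4"]

print(is_single_copy_ortholog_set(good))  # True
print(is_single_copy_ortholog_set(bad))   # False
```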

2. Sequence Alignment

Unaligned:
taxa1  CGGATAAAC
taxa2  CGGATAGAC
taxa3  CGCTGATAAAC
taxa4  CGGATAC

Aligned:
taxa1  CG--GATAAAC
taxa2  CG--GATAGAC
taxa3  CGCTGATAAAC
taxa4  CG--GAT--AC
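A minimal sketch of two sanity checks one might run on the alignment above: that all aligned rows have equal length and reduce to the original sequences once gaps are removed, and which columns are fully conserved. The function names are hypothetical, and the code does not perform the alignment itself.

```python
raw = {
    "taxa1": "CGGATAAAC",
    "taxa2": "CGGATAGAC",
    "taxa3": "CGCTGATAAAC",
    "taxa4": "CGGATAC",
}
aligned = {
    "taxa1": "CG--GATAAAC",
    "taxa2": "CG--GATAGAC",
    "taxa3": "CGCTGATAAAC",
    "taxa4": "CG--GAT--AC",
}


def alignment_is_consistent(raw, aligned):
    """Check equal row lengths and that removing gaps recovers each raw sequence."""
    same_length = len({len(row) for row in aligned.values()}) == 1
    degaps_ok = all(aligned[t].replace("-", "") == raw[t] for t in raw)
    return same_length and degaps_ok


def conserved_columns(aligned):
    """Return 0-based indices of gap-free columns where all taxa share one base."""
    rows = list(aligned.values())
    return [
        i for i, column in enumerate(zip(*rows))
        if "-" not in column and len(set(column)) == 1
    ]


print(alignment_is_consistent(raw, aligned))  # True
print(conserved_columns(aligned))             # [0, 1, 4, 5, 6, 9, 10]
```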

Choose methods: distance-based
Example: Neighbor Joining (NJ)

Taxa       Characters
Species A  ATGGCTATTCTTATAGTACG
Species B  ATCGCTAGTCTTATATTACA
Species C  TTCACTAGACCTGTGGTCCA
Species D  TTGACCAGACCTGTGGTCCG
Species E  TTGACCAGTTCTCTAGTTCG

Pairwise distances d (number of differing sites):
              A      B      C      D      E
Species A   ----     4     10      9      8
Species B          ----     8     11     10
Species C                 ----     3      8
Species D                        ----     5
Species E                               ----

M(AB) = d(AB) - [(r(A) + r(B)) / (N - 2)]
where r(X) is the sum of the distances from X to all other taxa and N is the number of taxa.

M values:
              A      B      C      D      E
Species A   ----
Species B  -17.3   ----
Species C  -10.0  -12.7   ----
Species D  -10.7   -9.3  -16.0   ----
Species E  -12.7  -11.3  -12.0  -14.7   ----

Insert a node (U1) connecting the pair that is relatively closest to each other (lowest M; here A and B), then repeat for U1, C, D, E.
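The first Neighbor Joining step can be sketched in a few lines. The code below assumes the distances are simple counts of differing sites between the five sequences on the slide, and applies the slide's formula M(ij) = d(ij) - (r(i) + r(j)) / (N - 2). Function and variable names are illustrative, and only the selection of the first pair is shown, not the full algorithm.

```python
from itertools import combinations

seqs = {
    "A": "ATGGCTATTCTTATAGTACG",
    "B": "ATCGCTAGTCTTATATTACA",
    "C": "TTCACTAGACCTGTGGTCCA",
    "D": "TTGACCAGACCTGTGGTCCG",
    "E": "TTGACCAGTTCTCTAGTTCG",
}


def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Symmetric dict of pairwise Hamming distances, keyed by (taxon, taxon)."""
    d = {}
    for i, j in combinations(seqs, 2):
        d[(i, j)] = d[(j, i)] = hamming(seqs[i], seqs[j])
    return d


def nj_m_values(d, taxa):
    """M(ij) = d(ij) - (r(i) + r(j)) / (N - 2), with r(i) the row sum of d."""
    n = len(taxa)
    r = {i: sum(d[(i, j)] for j in taxa if j != i) for i in taxa}
    return {
        (i, j): d[(i, j)] - (r[i] + r[j]) / (n - 2)
        for i, j in combinations(taxa, 2)
    }


taxa = list(seqs)
d = distance_matrix(seqs)
m = nj_m_values(d, taxa)
first_pair = min(m, key=m.get)  # pair with the lowest M is joined first

for pair, value in sorted(m.items()):
    print(pair, round(value, 1))
print("join first:", first_pair)  # join first: ('A', 'B')
```

After joining, a full implementation would replace the pair with a new node (U1), recompute distances from U1 to the remaining taxa, and repeat until the tree is resolved.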

4. Choose Methods: discrete character methods

Maximum Parsimony (MP)
  Model: evolution goes through the least number of changes.
  Adjust the internal nodes to minimize the sum of changes over all characters.

Maximum Likelihood (ML)
  L(data | model)
  Scores trees by how likely the data are, given the tree and the model parameters.

Bayesian Inference
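One common way to count the minimum number of changes on a fixed tree is Fitch's small-parsimony algorithm. The sketch below swaps in that specific algorithm as an illustration; the slide itself does not name it. It assumes a rooted binary tree written as nested tuples of taxon names, and all names are hypothetical.

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for one character.

    node   -- a taxon name (leaf) or a 2-tuple of child nodes
    states -- dict mapping taxon name to its observed character state
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch cost over every column of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


toy = {"A": "AC", "B": "CC", "C": "AG", "D": "AG"}  # made-up two-site data
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, toy))  # 2
print(parsimony_score(tree_2, toy))  # 3
```

Under maximum parsimony, tree_1 would be preferred for this toy data because it requires fewer changes.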

3. Choose Models

Ancestral sequences (?) -> Model (?) -> Observed sequences

Substitution models (arrows between all nucleotides): 12 rates, 4 equilibrium frequencies, indels
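As a concrete instance of a substitution model, the sketch below uses Jukes-Cantor (JC69), the simplest special case in which all 12 rates are equal and all 4 equilibrium frequencies are 1/4. JC69 is an assumption chosen for illustration; the slide does not single it out. The function names are hypothetical.

```python
import math


def jc69_distance(p):
    """Expected substitutions per site, given the observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 distance is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc69_prob_same(d):
    """Probability that a site shows the same nucleotide after distance d."""
    return 0.25 + 0.75 * math.exp(-4 * d / 3)


p_observed = 4 / 20          # e.g. 4 differences over 20 sites
d_corrected = jc69_distance(p_observed)
print(round(d_corrected, 4))

# Round trip: the corrected distance predicts the observed proportion of differences.
print(round(1 - jc69_prob_same(d_corrected), 4))
```

The corrected distance is larger than the raw proportion because the model accounts for multiple substitutions at the same site.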

5. Assess Reliability

Re-sampling to produce pseudo-datasets (random weighting of sites):

       123456789
taxa1  CGATCGTTA
taxa2  CAATGATAG
taxa3  CGCTGATAA
taxa4  CGCTGATCG

(Support values shown on the example tree: 100, 73)

I. Bootstrap: sampling with replacement
II. Jackknife: random deletion of a sub-dataset
III. Permutation test: randomize the dataset to build a null likelihood distribution
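The resampling step can be sketched as follows. The code builds one bootstrap pseudo-dataset (columns drawn with replacement) and one jackknife pseudo-dataset (a random subset of columns kept) from the four sequences on the slide. The function names and the `keep_fraction` parameter are assumptions for illustration; tree building on each replicate is not shown.

```python
import random

alignment = {
    "taxa1": "CGATCGTTA",
    "taxa2": "CAATGATAG",
    "taxa3": "CGCTGATAA",
    "taxa4": "CGCTGATCG",
}


def bootstrap_alignment(alignment, rng):
    """Draw as many columns as the original has, with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}


def jackknife_alignment(alignment, rng, keep_fraction=0.5):
    """Keep a random subset of columns (without replacement), in original order."""
    n_sites = len(next(iter(alignment.values())))
    n_keep = max(1, int(n_sites * keep_fraction))
    cols = sorted(rng.sample(range(n_sites), n_keep))
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
boot = bootstrap_alignment(alignment, rng)
jack = jackknife_alignment(alignment, rng)
for taxon in alignment:
    print(taxon, boot[taxon], jack[taxon])
```

In practice one would generate many replicates, infer a tree from each, and report the percentage of replicate trees that contain each clade.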

Utility of phylogeny: molecular clock

How to assess a timeline? Make a tree (ML) and regress branch length against year.
Ancestor of the M clade: 1931 (95% c.i. from bootstraps: 1915-1941). Do you believe it?
There is a recovered Zaire sequence, which they predicted would date to 1957 (actually 1959). (Korber et al., 2000; Hillis, 2000)
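The regression idea can be sketched with ordinary least squares: fit branch length (distance from the root) against year, then extrapolate back to the year at which the fitted distance is zero. The data below are synthetic values made up for illustration, not the values from Korber et al. (2000), and the function name is hypothetical.

```python
def fit_clock(years, distances):
    """Least-squares line through (year, distance); return (rate, root_year).

    root_year is the x-intercept, i.e. the year at which distance = 0.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    return rate, -intercept / rate


# Synthetic example: a perfect clock ticking at 0.002 per year from 1930.
years = [1983, 1986, 1990, 1994, 1998]
distances = [0.002 * (year - 1930) for year in years]

rate, root_year = fit_clock(years, distances)
print(round(rate, 4), round(root_year, 1))
```

A confidence interval like the one on the slide could be approximated by repeating the fit on bootstrap replicates and taking percentiles of the resulting root years.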

Utility of phylogeny: infer past environment?

Was the ancestor of bacteria a thermophile?
Reconstructed EF-Tu at key nodes.
All ancestral types have high Topt. (Gaucher et al., 2003)

Molecular signs of selection

Several measures exist; dN/dS is the most common.
dN/dS = 1: neutrality; < 1: purifying selection; > 1: positive selection (particularly sustained).
Example: extreme values show up for genes interacting with the immune system. (Sawyer & Malik, 2006)
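The interpretation rule above can be written as a small helper. This is a sketch only: it assumes dN and dS have already been estimated elsewhere, and the tolerance around 1 is an arbitrary illustrative choice, not a statistical test.

```python
def classify_selection(dn, ds, tol=0.05):
    """Return (omega, label) where omega = dN/dS.

    tol is a hypothetical tolerance for calling omega 'equal' to 1.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if abs(omega - 1) <= tol:
        return omega, "neutral"
    return omega, "purifying" if omega < 1 else "positive"


print(classify_selection(0.1, 0.5))  # (0.2, 'purifying')
print(classify_selection(0.5, 0.5))  # (1.0, 'neutral')
print(classify_selection(1.5, 0.5))  # (3.0, 'positive')
```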

Genetic exchange in bacteria/archaea

Transformation
  Used by Griffith, and later Avery, to show that DNA is the genetic material
  Some bacteria are naturally competent
Transduction
  Generalized & specialized
Conjugation
  High-frequency recombinants (Hfr)

Rare, partial and unidirectional
Can be novel genes or new alleles (homologous rec.)

Two “types” of HGT…

Comparison of HGT to “normal” sex

Monday (9/19): Horizontal gene transfer