Applied Bioinformatics Week 8 Jens Allmer. Practice I.

Presentation transcript:

Applied Bioinformatics Week 8 Jens Allmer

Practice I

Topic
Multiple Sequence Alignment Review
  – Building an MSA
  – Editing an MSA
Dendrograms
Phylogenetic Trees

Choosing Sequences
How many?
  – 10 – 15 (fewer than 50 would be good)
Sequences should be >30% and <90% identical
Prefer sequences of similar length
Prefer sequences without internal repeats, or extract the repeats
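The identity window can be checked programmatically. A minimal sketch, assuming the two sequences are already aligned to the same length and that identity is counted only over columns where neither sequence has a gap (definitions of percent identity vary; the function names are hypothetical):

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are ignored
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_useful_range(a: str, b: str, low: float = 30.0, high: float = 90.0) -> bool:
    """True if the pair is more than `low` and less than `high` percent identical."""
    return low < percent_identity(a, b) < high
```

Pairs that fall outside the window are candidates for removal before building the MSA.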

Choosing Sequences
While choosing your sequences, give them good names
Some sequences should be well annotated

Create an MSA
This time use 20 – 50 sequences
  – From different species
Use ClustalW for alignment
Most ClustalW servers display a dendrogram
Confirm this by using a few of them

Gathering Sequences
Download the sequences as a FASTA file as well
Most programs will support this format

Output Formats
Many different formats
  – FASTA: widely supported
  – Pdf: only for printing/storing/sharing
  – Pir: similar to FASTA
  – Msf: common MSA format
  – Aln: ClustalW's native alignment format

Converting Formats
s/fmtseq.html
Names (>…) no longer than 15 characters
Different formats maintain different data
Converting introduces the risk of losing data
Make sure to have a master copy
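The name limit can be applied before converting. A rough sketch, assuming plain FASTA text as input; `read_fasta` and `shorten_names` are hypothetical helper names, and the returned mapping acts as the master copy of the original labels:

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def shorten_names(records, max_len=15):
    """Truncate names to max_len characters and keep them unique.

    Returns the renamed records and a dict short name -> original name.
    """
    seen = set()
    mapping = {}
    renamed = []
    for name, seq in records:
        base = (name.split() or ["seq"])[0]  # drop the description after the first space
        short = base[:max_len]
        counter = 1
        while short in seen:  # resolve collisions with a numeric suffix
            suffix = f"_{counter}"
            short = base[: max_len - len(suffix)] + suffix
            counter += 1
        seen.add(short)
        mapping[short] = name
        renamed.append((short, seq))
    return renamed, mapping
```

Keeping the mapping next to the converted file makes it possible to restore the full names later.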

Editing Alignments
Start the program
Choose File – Input Alignment – from Textbox
Copy and paste the ClustalW alignment

Dendrogram
Jalview also allows you to view different types of dendrograms based on different similarity measures
Use Jalview and compare the trees that are constructed based on the different measures

End Practice I 15 min break

Theory I

Phylogeny
Sources
  – Sequences
  – Clades
  – Organisms
Why
  – Understand evolution
  – Strain diversity
  – Epidemiology
  – Gene prediction

Dendrogram

Phylogenetic Tree

Tree Terminology
All circled elements (e.g. a) are called nodes
The connections between them are called edges or branches
The first node that forms the tree is called the root (here abcdef)
Terminal nodes that have only one connection are called leaves (e.g. a)
Unrooted trees (remove the red root)
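The terminology maps directly onto a small data structure. A sketch, assuming a rooted tree stored as nested tuples with strings as leaves; the tree shape below is an assumption chosen to match the labels on the slide (root abcdef, leaf a), not a copy of the figure:

```python
def leaves(tree):
    """Return the leaf labels of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result


def count_edges(tree):
    """Number of edges (branches): one per child of every internal node."""
    if isinstance(tree, str):
        return 0
    return sum(1 + count_edges(child) for child in tree)


def to_newick(tree):
    """Write the topology in Newick notation (no branch lengths)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Hypothetical tree: the root joins leaf a with the internal node bcdef.
tree = ("a", (("b", "c"), (("d", "e"), "f")))
print(leaves(tree))            # ['a', 'b', 'c', 'd', 'e', 'f']
print(count_edges(tree))       # 10
print(to_newick(tree) + ";")   # (a,((b,c),((d,e),f)));
```

A rooted binary tree with n leaves has 2n - 2 edges, which is where the 10 comes from for six leaves.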

Branch Length
Arbitrary
Similarity
Evolutionary Time

Tree types
A dendrogram is a broad term for the diagrammatic representation of a phylogenetic tree.
A cladogram is a tree formed using cladistic methods. This type of tree only represents a branching pattern, i.e., its branch lengths do not represent time or the amount of character change.
A phylogram is a phylogenetic tree that explicitly represents the number of character changes through its branch lengths.
A chronogram is a phylogenetic tree that explicitly represents evolutionary time through its branch lengths.

Sequences
DNA
  – Sensitive but quite divergent at longer distances
  – Use for very closely related organisms
cDNA
  – Still sensitive but less divergent (e.g. no introns)
  – Use for closely related families
Protein
  – Least sensitive but most useful for more distant relationships
  – Use for distantly related species
16S rRNA
  – Exists in all prokaryotes (eukaryotes have the homologous 18S rRNA)
  – Highly conserved

Overall Process
Get sequences
Construct MSA
Compute pairwise distances (for some methods)
Build tree
  – Topology
  – Branch lengths
Estimate accuracy, reliability
  – Build several different trees for that
Visualize the tree

Computational Tree Formation
Distance Methods
  – Neighbor-Joining
  – Least-Squares
  – UPGMA
Parsimony
  – Least number of evolutionary steps
Maximum Likelihood
  – The tree under which the observed sequences are most probable, given a model of evolution, is constructed

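Neighbor Joining
Bottom-up clustering method
1. Create distance matrix
2. Join the closest nodes, where "closest" means the pair with the smallest distance after correcting for each node's total distance to all other nodes
3. Repeat (1-2) until fully joined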
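One joining step can be sketched as follows. This assumes the standard neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of distances from i to all other nodes; the five-taxon distance matrix is made-up example data and the function names are hypothetical:

```python
from itertools import combinations


def row_sums(names, d):
    """r(i): total distance from node i to every other node."""
    return {i: sum(d[i][j] for j in names if j != i) for i in names}


def nj_pick_pair(names, d):
    """Return the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j)."""
    n = len(names)
    r = row_sums(names, d)

    def q(pair):
        i, j = pair
        return (n - 2) * d[i][j] - r[i] - r[j]

    best = min(combinations(names, 2), key=q)
    return best, q(best)


def nj_branch_lengths(names, d, i, j):
    """Branch lengths from i and j to their new parent node (needs n > 2)."""
    n = len(names)
    r = row_sums(names, d)
    to_i = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
    return to_i, d[i][j] - to_i


# Made-up example distances between five taxa.
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
d = {x: dict(zip(names, row)) for x, row in zip(names, matrix)}

pair, q_value = nj_pick_pair(names, d)
print(pair, q_value)                         # ('a', 'b') -50
print(nj_branch_lengths(names, d, *pair))    # (2.0, 3.0)
```

In this example the smallest raw distance is between d and e (3), yet a and b are joined first, because the criterion also accounts for how far each node is from everything else.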

Least Squares
Standard approximation approach
  – Minimizes the sum of the squared errors
Example PGLS
  – Phylogenetic Generalized Least Squares
  – Needs additional data (traits)
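For distance-based tree building, the quantity being minimized is usually written as below. The symbols are assumptions introduced here: D_ij is the observed distance between sequences i and j, d_ij is the path length between them in the candidate tree, and w_ij is a weight.

```latex
S = \sum_{i<j} w_{ij}\,\bigl(D_{ij} - d_{ij}\bigr)^{2}
```

With w_ij = 1 this is ordinary least squares; the Fitch-Margoliash variant uses w_ij = 1 / D_ij^2, so that large distances, which tend to be less reliable, count less.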

UPGMA
Unweighted Pair Group Method with Arithmetic Mean
  – Agglomerative hierarchical clustering method
  – Assumes constant rate of evolution
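A compact sketch of the clustering loop, assuming distances are given as a dict keyed by `frozenset({a, b})` and the tree is returned as nested tuples; both choices are illustrative, not taken from the slides:

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster leaves with UPGMA.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) to the distance between leaves a and b.
    Returns (tree, height): a nested tuple of labels and the height of the root,
    which is half the distance at the final merge.
    """
    size = {name: 1 for name in names}   # cluster -> number of leaves it contains
    d = dict(dist)                       # working copy of the distances
    height = 0.0
    while len(size) > 1:
        # Join the two clusters with the smallest distance.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        height = d[frozenset((a, b))] / 2
        # Distance to the new cluster: average weighted by cluster sizes.
        for c in size:
            if c == a or c == b:
                continue
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        del size[a], size[b]
    (tree,) = size
    return tree, height


# Made-up example distances.
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 4.0,
    frozenset(("B", "C")): 8.0,
}
tree, height = upgma(["A", "B", "C"], dist)
print(tree, height)   # ('C', ('A', 'B')) 3.0
```

The example also shows the cost of the constant-rate assumption: A and B end up equally far from C in the tree (6.0 each), although the input distances were 4.0 and 8.0.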

Similarity Measures
Sequence
  – Number of different positions
  – Weighted differences
Substitution Matrices
  – Pairwise alignments (Needleman-Wunsch, Smith-Waterman, ...)
Additional measurements or knowledge
  – Traits
Parsimony
  – Number of changes for tree paths

Tree Accuracy
Bootstrapping
  – Resample
  – Recompute
  – Do many times
  – Compare results
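The resampling step works on alignment columns. A minimal sketch, assuming the MSA is a dict of equally long aligned sequences; `statistic` is a hypothetical placeholder for whatever is recomputed on each replicate (in practice, a tree or the set of clades it contains):

```python
import random


def bootstrap_replicates(seqs, n, seed=0):
    """Yield n alignments built by sampling columns with replacement."""
    rng = random.Random(seed)
    length = len(next(iter(seqs.values())))
    for _ in range(n):
        cols = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


def support(seqs, statistic, n=100, seed=0):
    """Fraction of replicates whose statistic equals that of the original alignment."""
    reference = statistic(seqs)
    hits = sum(statistic(rep) == reference for rep in bootstrap_replicates(seqs, n, seed))
    return hits / n
```

Every replicate has the same sequences and the same length as the original, but some columns appear several times and others not at all. A grouping that shows up in most replicates is considered well supported.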

End Theory I Mindmap Break

Practice II

Where to get Trees
Most servers that allow for MSA will also provide at least the guide tree which was used to construct the alignment
If that’s all you are interested in, you don’t need to go any further

Edit your MSA
Remove blocks consisting of mostly gaps (using Jalview)
Remove N- and C-termini if not well conserved
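The slides do this by hand in Jalview; the same idea can be sketched in code. This assumes the MSA is a dict of equally long sequences with "-" as the gap character, and the 50% threshold is an arbitrary illustrative choice:

```python
def drop_gappy_columns(seqs, max_gap_fraction=0.5):
    """Remove alignment columns in which more than max_gap_fraction of rows are gaps."""
    names = list(seqs)
    rows = [seqs[name] for name in names]
    kept = [
        col for col in zip(*rows)
        if col.count("-") / len(col) <= max_gap_fraction
    ]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


msa = {"s1": "AC--GT", "s2": "A---GT", "s3": "ACT-GT"}
print(drop_gappy_columns(msa))   # {'s1': 'ACGT', 's2': 'A-GT', 's3': 'ACGT'}
```

Columns 3 and 4 are dropped because at least two of the three sequences have a gap there; column 2 stays because only one does.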

Easy Tree
Paste your alignment
Select a tree type
Other options need to be set (see right)
Press run
Make a screenshot
You can paste it where needed

Phylip (More elaborate tree)
phylip-uk.html
Choose protdist from the page
Paste the MSA
Bootstrapping e.g.:

Phylip
Run the query
Click further analysis

Click Run Select full screen view There is your tree

Ugly Tree
Let’s face it, the tree is quite ugly
iubio.bio.indiana.edu/treeapp/treeprint-form.html
Select the consense.outtree from the previous website and paste it into the box
Select submit to create the tree
Play around with the formats and settings

Tree Topologies

Other Resources enetics_software